
Processing XML Part II Parser Operations with DOM and SAX overview XML Validation with examples Processing XML with SAX (locally and on the internet)


Page 1

Processing XML Part II

• Parser Operations with DOM and SAX overview
• XML Validation with examples
• Processing XML with SAX (locally and on the internet)

Page 2

FixedFloatSwap.xml

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd">
<FixedFloatSwap>
  <Notional>100</Notional>
  <Fixed_Rate>5</Fixed_Rate>
  <NumYears>3</NumYears>
  <NumPayments>6</NumPayments>
</FixedFloatSwap>

Page 3

FixedFloatSwap.dtd

<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments ) >
<!ELEMENT Notional (#PCDATA) >
<!ELEMENT Fixed_Rate (#PCDATA) >
<!ELEMENT NumYears (#PCDATA) >
<!ELEMENT NumPayments (#PCDATA) >

Page 4

Operation of a Tree-based Parser

[Diagram. Labels: XML Document, XML DTD, Valid, Tree-Based Parser, Document Tree, Application Logic]

Page 5

Tree Benefits

• Some data preparation tasks require early access to data that is further along in the document (e.g. we wish to extract titles to build a table of contents)
• New tree construction is easier (e.g. XSLT works from a tree to convert FpML to WML)
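The table-of-contents case can be sketched with the JAXP DOM API. This is only an illustration: the sample document and its element names (report, section, title) are invented here, and the class name TitleIndex is hypothetical. Because the whole tree is in memory, every title can be collected before any other processing starts.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class TitleIndex {
    // Hypothetical sample document; not taken from the slides.
    static final String SAMPLE =
        "<report>"
      + "<section><title>Swaps</title><body>first</body></section>"
      + "<section><title>Options</title><body>second</body></section>"
      + "</report>";

    /** Builds a DOM tree and returns the text of every title element, in document order. */
    static List<String> titles(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName("title");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        for (String title : titles(SAMPLE)) {
            System.out.println(title);
        }
    }
}
```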

Page 6

Operation of an Event Based Parser

[Diagram. Labels: XML Document, XML DTD, Valid, Event-Based Parser, Application Logic]

Page 7

Operation of an Event Based Parser

[Diagram. Labels: XML Document, XML DTD, Valid, Event-Based Parser, Application Logic]

public void startDocument ()
public void endDocument ()
public void startElement (String name, AttributeList attrs)
public void endElement (String name)
public void characters (char buf [], int offset, int len)

public void error(SAXParseException e) throws SAXException {
    System.out.println("\n\n--Invalid document ---" + e);
}

Page 8

Event-Driven Benefits

• We do not need the memory required for trees

• Parsing can be done faster with no tree construction going on
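As a rough illustration of processing without a tree, the sketch below counts elements while the parser streams through the input. It uses the SAX 2 DefaultHandler rather than the SAX 1 HandlerBase shown elsewhere in these slides; the class name ElementCounter is made up for this example. The only state kept is a single counter, regardless of document size.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class ElementCounter extends DefaultHandler {
    private int count = 0;

    // Called once per start tag; nothing about the element is retained.
    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        count++;
    }

    /** Streams through the XML and returns the number of elements seen. */
    static int countElements(String xml) {
        ElementCounter handler = new ElementCounter();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        return handler.count;
    }

    public static void main(String[] args) {
        System.out.println(countElements("<a><b/><b/><c>text</c></a>"));
    }
}
```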

Page 9

XML Validation

A batch validating process involves comparing the DTD against a complete document instance and producing a report containing any errors or warnings.

Software developers should consider batch validation to be analogous to program compilation, with similar errors detected.

Interactive validation involves constant comparison of the DTD against a document as it is being created.
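Batch validation can be sketched as a parse that collects every warning and error into a report instead of stopping at the first one. The sketch assumes the DTD is placed in the document's internal subset so that no external files are needed; the class name BatchReport and the GOOD and BAD sample documents are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class BatchReport extends DefaultHandler {
    // The swap DTD, moved into an internal subset for a self-contained example.
    static final String DTD =
        "<!DOCTYPE FixedFloatSwap [\n"
      + "<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments)>\n"
      + "<!ELEMENT Notional (#PCDATA)>\n"
      + "<!ELEMENT Fixed_Rate (#PCDATA)>\n"
      + "<!ELEMENT NumYears (#PCDATA)>\n"
      + "<!ELEMENT NumPayments (#PCDATA)>\n"
      + "]>\n";
    static final String GOOD = DTD
      + "<FixedFloatSwap><Notional>100</Notional><Fixed_Rate>5</Fixed_Rate>"
      + "<NumYears>3</NumYears><NumPayments>6</NumPayments></FixedFloatSwap>";
    // Missing three required children, so it is well-formed but not valid.
    static final String BAD = DTD
      + "<FixedFloatSwap><Notional>100</Notional></FixedFloatSwap>";

    private final List<String> report = new ArrayList<>();

    @Override
    public void warning(SAXParseException e) {
        report.add("warning, line " + e.getLineNumber() + ": " + e.getMessage());
    }

    @Override
    public void error(SAXParseException e) {
        report.add("error, line " + e.getLineNumber() + ": " + e.getMessage());
    }

    /** Validates the whole document and returns the collected report (empty if valid). */
    static List<String> validate(String xml) {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true);
        BatchReport handler = new BatchReport();
        try {
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        } catch (SAXException e) {
            // Fatal (well-formedness) errors abort the parse; record them too.
            handler.report.add("fatal: " + e.getMessage());
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return handler.report;
    }

    public static void main(String[] args) {
        System.out.println("good: " + validate(GOOD));
        System.out.println("bad:  " + validate(BAD));
    }
}
```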

Page 10

XML Validation

The benefits of validating documents against a DTD include:

• Programmers can write extraction and manipulation filters knowing that the element and attribute structure of the input conforms to the DTD, so their software need not guard against unexpected structure.

• Using an XML-aware word processor, authors and editors can be guided and constrained to produce conforming documents.

Page 11

XML Validation Examples

XML elements may contain further, embedded elements, and the entire document must be enclosed by a single document element.

The degree to which an element’s content is organized into child elements is often termed its granularity.

Some hierarchical structures may be recursive.

The Document Type Definition (DTD) contains rules for each element allowed within a specific class of documents.

Pages 12-13

We’ll run this program against several XML files with DTDs.

// Validate.java
// Note: HandlerBase is the SAX 1 base class; SAX 2 deprecates it in favor of DefaultHandler.

import java.io.*;
import org.xml.sax.*;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;

public class Validate extends HandlerBase
{
    public static boolean valid = true;

    public static void main (String argv [])
    {
        if (argv.length != 1) {
            System.err.println ("Usage: java Validate filename.xml");
            System.exit (1);
        }

        // Ask for a validating parser so that DTD violations are reported
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true);

        try {
            SAXParser saxParser = factory.newSAXParser();
            saxParser.parse( new File(argv [0]), new Validate());
        } catch (Throwable t) {
            // Fatal (well-formedness) errors and I/O failures end up here,
            // so the document must not be reported as valid
            t.printStackTrace ();
            valid = false;
        }
        System.out.println("Valid document is " + valid);
        System.exit (0);
    }

    // Called by the parser for each validity error
    public void error(SAXParseException e) throws SAXException
    {
        System.out.println(e.toString());
        valid = false;
    }
}

Page 14

XML Document

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd">
<FixedFloatSwap>
  <Notional>100</Notional>
  <Fixed_Rate>5</Fixed_Rate>
  <NumYears>3</NumYears>
  <NumPayments>6</NumPayments>
</FixedFloatSwap>

DTD

<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments ) >
<!ELEMENT Notional (#PCDATA) >
<!ELEMENT Fixed_Rate (#PCDATA) >
<!ELEMENT NumYears (#PCDATA) >
<!ELEMENT NumPayments (#PCDATA) >

Valid document is true

Page 15

XML Document

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd">
<FixedFloatSwap>
  <Notional>100</Notional>
  <Fixed_Rate>5</Fixed_Rate>
  <NumYears>3</NumYears>
  <NumPayments>6</NumPayments>
</FixedFloatSwap>

DTD

<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumPayments ) >
<!ELEMENT Notional (#PCDATA) >
<!ELEMENT Fixed_Rate (#PCDATA) >
<!ELEMENT NumPayments (#PCDATA) >

Valid document is false (this DTD does not declare NumYears, but the document contains it)

Page 16

XML Document

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Swaps SYSTEM "FixedFloatSwap.dtd">
<Swaps>
  <FixedFloatSwap>
    <Notional>100</Notional>
    <Fixed_Rate>5</Fixed_Rate>
    <NumYears>3</NumYears>
    <NumPayments>6</NumPayments>
  </FixedFloatSwap>
  <FixedFloatSwap>
    <Notional>100</Notional>
    <Fixed_Rate>5</Fixed_Rate>
    <NumYears>3</NumYears>
    <NumPayments>6</NumPayments>
  </FixedFloatSwap>
</Swaps>

Page 17

DTD

<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT Swaps (FixedFloatSwap+) >
<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments ) >
<!ELEMENT Notional (#PCDATA) >
<!ELEMENT Fixed_Rate (#PCDATA) >
<!ELEMENT NumYears (#PCDATA) >
<!ELEMENT NumPayments (#PCDATA) >

Quantity Indicators
  ?  0 or 1 time
  +  1 or more times
  *  0 or more times

C:\McCarthy\www\46-928\examples\sax>java Validate FixedFloatSwap.xml
Valid document is true
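The three quantity indicators can be tried out with a small content model. The sketch below is an illustration only: the element names (order, note, item, tag) and the class name QuantityDemo are invented, and the DTD sits in an internal subset so the example is self-contained. The model (note?, item+, tag*) allows at most one note, requires at least one item, and permits any number of tags.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class QuantityDemo {
    // Hypothetical DTD exercising ?, + and * in one content model.
    static final String DTD =
        "<!DOCTYPE order [\n"
      + "<!ELEMENT order (note?, item+, tag*)>\n"
      + "<!ELEMENT note (#PCDATA)>\n"
      + "<!ELEMENT item (#PCDATA)>\n"
      + "<!ELEMENT tag (#PCDATA)>\n"
      + "]>\n";

    /** Returns true when the body validates against the DTD above. */
    static boolean isValid(String body) {
        final boolean[] valid = { true };
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true);
        try {
            factory.newSAXParser().parse(
                new InputSource(new StringReader(DTD + body)),
                new DefaultHandler() {
                    @Override
                    public void error(SAXParseException e) {
                        valid[0] = false;   // validity error reported by the parser
                    }
                });
        } catch (SAXException e) {
            return false;                   // not even well-formed
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        return valid[0];
    }

    public static void main(String[] args) {
        System.out.println(isValid("<order><item>a</item></order>"));
        System.out.println(isValid("<order><note>n</note></order>"));
    }
}
```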

Page 18

The locations where document text data is allowed are indicated by the keyword ‘PCDATA’ (Parsed Character Data).

XML Document

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd">
<FixedFloatSwap>
  <Notional>100</Notional>
  <Fixed_Rate>5</Fixed_Rate>
  <NumYears>
    <StartYear>2000</StartYear>
    <EndYear>2002</EndYear>
  </NumYears>
  <NumPayments>6</NumPayments>
</FixedFloatSwap>

Page 19

DTD

<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments ) >
<!ELEMENT Notional (#PCDATA) >
<!ELEMENT Fixed_Rate (#PCDATA) >
<!ELEMENT NumYears (#PCDATA) >
<!ELEMENT NumPayments (#PCDATA) >

Output of program after being modified to display the error.

C:\McCarthy\www\46-928\examples\sax>java Validate FixedFloatSwap.xml
org.xml.sax.SAXParseException: Element "NumYears" does not allow "StartYear" -- (#PCDATA)
org.xml.sax.SAXParseException: Element type "StartYear" is not declared.
org.xml.sax.SAXParseException: Element "NumYears" does not allow "EndYear" -- (#PCDATA)
org.xml.sax.SAXParseException: Element type "EndYear" is not declared.
Valid document is false

Page 20

There are strict rules which must be applied when an element is allowed to contain both text and child elements.

The PCDATA keyword must be the first token in the group, and the group must be a choice group (using “|” not “,”).

The group must be optional and repeatable.

This is known as a mixed content model.
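To see how an application receives mixed content, the sketch below records the SAX events for a small element that interleaves text and child elements. It is a hedged illustration: the class name MixedTrace is invented, the DTD is placed in an internal subset, and SAX 2 DefaultHandler is used. Consecutive character chunks are merged because a parser may deliver one run of text in several characters() calls.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class MixedTrace extends DefaultHandler {
    // Mixed content model: #PCDATA first, choice group, optional and repeatable.
    static final String DOC =
        "<!DOCTYPE emph [\n"
      + "<!ELEMENT emph (#PCDATA | sub | super)*>\n"
      + "<!ELEMENT sub (#PCDATA)>\n"
      + "<!ELEMENT super (#PCDATA)>\n"
      + "]>\n"
      + "<emph>H<sub>2</sub>O is water.</emph>";

    private final List<String> events = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();

    // Emit any buffered text as a single event before the next tag event.
    private void flushText() {
        if (text.length() > 0) {
            events.add("text:" + text);
            text.setLength(0);
        }
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        flushText();
        events.add("start:" + qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        flushText();
        events.add("end:" + qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void error(SAXParseException e) {
        events.add("invalid");
    }

    /** Parses with validation on and returns the ordered list of events. */
    static List<String> trace(String xml) {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true);
        MixedTrace handler = new MixedTrace();
        try {
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return handler.events;
    }

    public static void main(String[] args) {
        for (String event : trace(DOC)) {
            System.out.println(event);
        }
    }
}
```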

Page 21

DTD

<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT Mixed (emph) >
<!ELEMENT emph (#PCDATA | sub | super)* >
<!ELEMENT sub (#PCDATA)>
<!ELEMENT super (#PCDATA)>

XML Document

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Mixed SYSTEM "Mixed.dtd">
<Mixed>
  <emph>H<sub>2</sub>O is water.</emph>
</Mixed>

Valid document is true

Page 22

Attributes

An attribute is associated with a particular element by the DTD and is assigned an attribute type.

The attribute type can restrict the range of values it can hold.

Example attribute types include:

  CDATA indicates a simple string of characters
  NMTOKEN indicates a word or token
  A named token group such as (left | center | right)
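A short sketch of how attribute types constrain values: the DTD below declares an enumerated (named token group) attribute that is required and a CDATA attribute that is optional. The class name AttributeCheck is hypothetical, and placing both attributes on a single Notional element is a simplification made for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class AttributeCheck {
    // Internal-subset DTD: currency is an enumeration, note is free-form text.
    static final String DTD =
        "<!DOCTYPE Notional [\n"
      + "<!ELEMENT Notional (#PCDATA)>\n"
      + "<!ATTLIST Notional currency (Dollars | Pounds) #REQUIRED\n"
      + "                   note CDATA #IMPLIED>\n"
      + "]>\n";

    /** Returns the number of validity errors the parser reports for the body. */
    static int countErrors(String body) {
        final int[] errors = { 0 };
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true);
        try {
            factory.newSAXParser().parse(
                new InputSource(new StringReader(DTD + body)),
                new DefaultHandler() {
                    @Override
                    public void error(SAXParseException e) {
                        errors[0]++;
                    }
                });
        } catch (SAXException | IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        return errors[0];
    }

    public static void main(String[] args) {
        // Allowed enumeration value
        System.out.println(countErrors("<Notional currency=\"Pounds\">100</Notional>"));
        // Value outside the enumeration
        System.out.println(countErrors("<Notional currency=\"Euros\">100</Notional>"));
        // Required attribute missing
        System.out.println(countErrors("<Notional>100</Notional>"));
    }
}
```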

Page 23

DTD

<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments ) >
<!ELEMENT Notional (#PCDATA) >
<!ELEMENT Fixed_Rate (#PCDATA) >
<!ELEMENT NumYears (#PCDATA) >
<!ELEMENT NumPayments (#PCDATA) >
<!ATTLIST Notional currency (Dollars | Pounds) #REQUIRED>

XML Document

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd">
<FixedFloatSwap>
  <Notional>100</Notional>
  <Fixed_Rate>5</Fixed_Rate>
  <NumYears>3</NumYears>
  <NumPayments>6</NumPayments>
</FixedFloatSwap>

C:\McCarthy\www\46-928\examples\sax>java Validate FixedFloatSwap.xml
org.xml.sax.SAXParseException: Attribute value for "currency" is #REQUIRED.
Valid document is false

Page 24

DTD

<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments ) >
<!ELEMENT Notional (#PCDATA) >
<!ELEMENT Fixed_Rate (#PCDATA) >
<!ELEMENT NumYears (#PCDATA) >
<!ELEMENT NumPayments (#PCDATA) >
<!ATTLIST Notional currency (Dollars | Pounds) #REQUIRED>

XML Document

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd">
<FixedFloatSwap>
  <Notional currency = "Pounds">100</Notional>
  <Fixed_Rate>5</Fixed_Rate>
  <NumYears>3</NumYears>
  <NumPayments>6</NumPayments>
</FixedFloatSwap>

Valid document is true

Page 25

DTD

<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments ) >
<!ELEMENT Notional (#PCDATA) >
<!ELEMENT Fixed_Rate (#PCDATA) >
<!ELEMENT NumYears (#PCDATA) >
<!ELEMENT NumPayments (#PCDATA) >
<!ATTLIST Notional currency (Dollars | Pounds) #REQUIRED>
<!ATTLIST FixedFloatSwap note CDATA #IMPLIED>

#IMPLIED means optional

XML Document

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd">
<FixedFloatSwap>
  <Notional currency = "Pounds">100</Notional>
  <Fixed_Rate>5</Fixed_Rate>
  <NumYears>3</NumYears>
  <NumPayments>6</NumPayments>
</FixedFloatSwap>

Valid document is true

Page 26

DTD

<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments ) >
<!ELEMENT Notional (#PCDATA) >
<!ELEMENT Fixed_Rate (#PCDATA) >
<!ELEMENT NumYears (#PCDATA) >
<!ELEMENT NumPayments (#PCDATA) >
<!ATTLIST Notional currency (Dollars | Pounds) #REQUIRED>
<!ATTLIST FixedFloatSwap note CDATA #IMPLIED>

XML Document

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd">
<FixedFloatSwap note = "For your eyes only">
  <Notional currency = "Pounds">100</Notional>
  <Fixed_Rate>5</Fixed_Rate>
  <NumYears>3</NumYears>
  <NumPayments>6</NumPayments>
</FixedFloatSwap>

Valid document is true

Page 27

Document using a General Entity

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd" [
  <!ENTITY bankname "Mellon National Bank and Trust" >
]>
<FixedFloatSwap>
  <Bank>&bankname;</Bank>
  <Notional>100</Notional>
  <Fixed_Rate>5</Fixed_Rate>
  <NumYears>3</NumYears>
  <NumPayments>6</NumPayments>
</FixedFloatSwap>

DTD

<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Bank,Notional, Fixed_Rate, NumYears, NumPayments ) >
<!ELEMENT Bank (#PCDATA) >
<!ELEMENT Notional (#PCDATA) >
<!ELEMENT Fixed_Rate (#PCDATA) >
<!ELEMENT NumYears (#PCDATA) >
<!ELEMENT NumPayments (#PCDATA) >

Valid document is true
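The effect of a general entity can be checked with a few lines of DOM code: by the time the application reads the element's text, the parser has replaced the reference with the entity's value. This is a minimal sketch; the class name EntityDemo is hypothetical and the document is reduced to a single Bank element with its declarations in the internal subset.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class EntityDemo {
    // Reduced version of the swap document: one element, one general entity.
    static final String DOC =
        "<!DOCTYPE Bank [\n"
      + "<!ELEMENT Bank (#PCDATA)>\n"
      + "<!ENTITY bankname \"Mellon National Bank and Trust\">\n"
      + "]>\n"
      + "<Bank>&bankname;</Bank>";

    /** Returns the text of the root element after entity expansion. */
    static String rootText(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(rootText(DOC));
    }
}
```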

Page 28

XSLT Program

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:template match = "Bank">
    <WML>
      <CARD>
        <xsl:apply-templates/>
      </CARD>
    </WML>
  </xsl:template>

  <xsl:template match = "Notional | Fixed_Rate | NumYears | NumPayments">
  </xsl:template>

</xsl:stylesheet>

Page 29

XSLT Output

C:\McCarthy\www\46-928\examples\sax>java -Dcom.jclark.xsl.sax.parser=com.jclark.xml.sax.CommentDriver com.jclark.xsl.sax.Driver FixedFloatSwap.xml FixedFloatSwap.xsl FixedFloatSwap.wml

C:\McCarthy\www\46-928\examples\sax>type FixedFloatSwap.wml

<?xml version="1.0" encoding="utf-8"?>
<WML><CARD>Mellon National Bank and Trust</CARD></WML>

Page 30

An external text entity

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd" [
  <!ENTITY bankname SYSTEM "JustAFile.dat" >
]>
<FixedFloatSwap>
  <Bank>&bankname;</Bank>
  <Notional>100</Notional>
  <Fixed_Rate>5</Fixed_Rate>
  <NumYears>3</NumYears>
  <NumPayments>6</NumPayments>
</FixedFloatSwap>

Page 31

JustAFile.dat

Mellon Bank And Trust Corporation
When you need a friend!

XSLT Output

<?xml version="1.0" encoding="utf-8"?>
<WML><CARD>Mellon Bank And Trust Corporation
When you need a friend!</CARD></WML>

Page 32

Internal Parameter Entities

XML Document

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd">
<FixedFloatSwap>
  <Notional>100</Notional>
  <Fixed_Rate>5</Fixed_Rate>
  <NumYears>3</NumYears>
  <NumPayments>6</NumPayments>
</FixedFloatSwap>

DTD

<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments ) >
<!ENTITY % parsedCharacterData "(#PCDATA)">
<!ELEMENT Notional %parsedCharacterData; >
<!ELEMENT Fixed_Rate (#PCDATA) >
<!ELEMENT NumYears (#PCDATA) >
<!ELEMENT NumPayments (#PCDATA) >

Page 33

General Entity defined in the DTD

XML Document

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd">
<FixedFloatSwap>
  <Bank> &bankname; </Bank>
  <Notional>100</Notional>
  <Fixed_Rate>5</Fixed_Rate>
  <NumYears>3</NumYears>
  <NumPayments>6</NumPayments>
</FixedFloatSwap>

DTD

<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Bank, Notional, Fixed_Rate, NumYears, NumPayments ) >
<!ENTITY bankname "Mellon National Bank and Trust Corporation" >
<!ELEMENT Bank (#PCDATA)>
<!ELEMENT Notional (#PCDATA)>
<!ELEMENT Fixed_Rate (#PCDATA) >
<!ELEMENT NumYears (#PCDATA) >
<!ELEMENT NumPayments (#PCDATA) >

Page 34

CDATA Section

XML Document

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd">
<FixedFloatSwap>
  <Notional>100</Notional>
  <Fixed_Rate>5</Fixed_Rate>
  <NumYears>3</NumYears>
  <NumPayments>6</NumPayments>
  <Note>
    <![CDATA[This is text that <b>will not be parsed for markup]]>
  </Note>
</FixedFloatSwap>

DTD

<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap ( Notional, Fixed_Rate, NumYears, NumPayments, Note ) >
<!ELEMENT Notional (#PCDATA)>
<!ELEMENT Fixed_Rate (#PCDATA) >
<!ELEMENT NumYears (#PCDATA) >
<!ELEMENT NumPayments (#PCDATA) >
<!ELEMENT Note (#PCDATA) >
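What "not parsed for markup" means can be shown with a small DOM check: the unmatched <b> inside the CDATA section reaches the application as literal text, and no b element is created. This is a sketch under simplifying assumptions: the class name CdataDemo is made up, and the document is reduced to the Note element without a DTD.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class CdataDemo {
    // Without the CDATA wrapper this would not be well-formed (unclosed <b>).
    static final String DOC =
        "<Note><![CDATA[This is text that <b>will not be parsed for markup]]></Note>";

    private static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Text of the root element, with CDATA content included verbatim. */
    static String noteText(String xml) {
        return parse(xml).getDocumentElement().getTextContent();
    }

    /** Number of b elements the parser created (none are expected for DOC). */
    static int boldElements(String xml) {
        return parse(xml).getElementsByTagName("b").getLength();
    }

    public static void main(String[] args) {
        System.out.println(noteText(DOC));
        System.out.println(boldElements(DOC));
    }
}
```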

Page 35

XSLT Program

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:template match = "Note">
    <WML>
      <CARD>
        <xsl:apply-templates/>
      </CARD>
    </WML>
  </xsl:template>

  <xsl:template match = "Notional | Fixed_Rate | NumYears | NumPayments">
  </xsl:template>

</xsl:stylesheet>

Page 36

XSLT Output

<?xml version="1.0" encoding="utf-8"?>
<WML><CARD>

This is text that &lt;b&gt;will not be parsed for markup

</CARD></WML>

Page 37

DTD Components

Order.xml

<?xml version="1.0" encoding = "UTF-8"?>
<!DOCTYPE ORDER SYSTEM "order.dtd">
<!-- example order form -->
<ORDER SOURCE ="web" CUSTOMERTYPE="consumer" CURRENCY="USD">
  <addresses>
    <address ADDTYPE="billship">
      <firstname>Kevin</firstname>
      <lastname>Dick</lastname>
      <street ORDER="1">123 Anywhere Lane</street>
      <street ORDER="2">Apt 1b</street>
      <city>Palo Alto</city>
      <state>CA</state>
      <postal>94303</postal>
      <country>USA</country>
    </address>

Page 38

An order may have more than one address.

    <address ADDTYPE="bill">
      <firstname>Kevin</firstname>
      <lastname>Dick</lastname>
      <street ORDER="1">123 Not The Same Lane</street>
      <street ORDER="2">Work Place</street>
      <city>Palo Alto</city>
      <state>CA</state>
      <postal>94300</postal>
      <country>USA</country>
    </address>
  </addresses>

Page 39

Several products may be purchased.

  <lineitems>
    <lineitem ID="line1">
      <product CAT="MBoard">440BX Motherboard</product>
      <quantity>1</quantity>
      <unitprice>200</unitprice>
    </lineitem>
    <lineitem ID="line2">
      <product CAT = "RAM">128 MB PC-100 DIMM</product>
      <quantity>2</quantity>
      <unitprice>175</unitprice>
    </lineitem>
    <lineitem ID="line3">
      <product CAT="CDROM">40x CD-ROM</product>
      <quantity>1</quantity>
      <unitprice>50</unitprice>
    </lineitem>
  </lineitems>

Page 40

The payment is with a Visa card.

  <payment>
    <card CARDTYPE="VISA">
      <cardholder>Kevin S. Dick</cardholder>
      <cardnumber>11111-22222-33333</cardnumber>
      <expiration>01/01</expiration>
    </card>
  </payment>
</ORDER>

Valid document is true

Page 41

order.dtd

Define an order based on other elements.

<?xml version="1.0" encoding="UTF-8"?>

<!-- Example Order form DTD adapted from XML: A Manager's Guide -->

<!-- Define an ORDER element -->

<!ELEMENT ORDER (addresses, lineitems, payment)>
<!ATTLIST ORDER
    SOURCE (web | phone | retail) #REQUIRED
    CUSTOMERTYPE (consumer | business) "consumer"
    CURRENCY CDATA "USD">

Page 42

External parameter entities

The other elements are in their own DTD files.

<!ENTITY % anAddress SYSTEM "address.dtd" >
%anAddress;

<!-- Collection of Addresses -->
<!ELEMENT addresses (address+)>

<!ENTITY % aLineItem SYSTEM "lineitem.dtd" >
%aLineItem;

<!-- Collection of LineItems -->
<!ELEMENT lineitems (lineitem+)>

<!ENTITY % aPayment SYSTEM "payment.dtd" >
%aPayment;

Page 43

address.dtd

<!-- Address Structure -->
<!ELEMENT address (firstname, middlename?, lastname, street+, city, state,postal,country)>

<!ELEMENT firstname (#PCDATA)>
<!ELEMENT middlename (#PCDATA)>
<!ELEMENT lastname (#PCDATA)>
<!ELEMENT street (#PCDATA)>
<!ELEMENT city (#PCDATA)>
<!ELEMENT state (#PCDATA)>
<!ELEMENT postal (#PCDATA)>
<!ELEMENT country (#PCDATA)>
<!ATTLIST address ADDTYPE (bill | ship | billship) "billship">
<!ATTLIST street ORDER CDATA #IMPLIED>

Page 44

lineitem.dtd

<!ELEMENT lineitem (product,quantity,unitprice)>
<!ATTLIST lineitem ID ID #REQUIRED>

<!ELEMENT product (#PCDATA)>
<!ATTLIST product CAT (CDROM|MBoard|RAM) #REQUIRED>

<!ELEMENT quantity (#PCDATA)>
<!ELEMENT unitprice (#PCDATA)>

Page 45

payment.dtd

<!ELEMENT payment (card | PO)>
<!ELEMENT card (cardholder, cardnumber, expiration)>
<!ELEMENT cardholder (#PCDATA)>
<!ELEMENT cardnumber (#PCDATA)>
<!ELEMENT expiration (#PCDATA)>
<!ELEMENT PO (number,authorization*)>
<!ELEMENT number (#PCDATA)>
<!ELEMENT authorization (#PCDATA)>

<!ATTLIST card CARDTYPE (VISA|MasterCard|Amex) #REQUIRED>

Page 46

Processing XML with SAX

• Important interfaces and classes are found in the org.xml.sax package
• We will look at the following interfaces and then study an example

  interface DocumentHandler -- reports on document events
  interface ErrorHandler -- reports on validity errors
  class HandlerBase -- implements both of the above plus two others

Page 47

public interface DocumentHandler

Receive notification of general document events.

This is the main interface that most SAX applications implement: if the application needs to be informed of basic parsing events, it implements this interface and registers an instance with the SAX parser.

The parser uses the instance to report basic document-related events like the start and end of elements and character data.

Page 48

Some methods from the DocumentHandler Interface

void characters(char[] ch, int start, int length)
    Receive notification of character data.
void endDocument()
    Receive notification of the end of a document.
void endElement(java.lang.String name)
    Receive notification of the end of an element.
void startDocument()
    Receive notification of the beginning of a document.
void startElement(java.lang.String name, AttributeList atts)
    Receive notification of the beginning of an element.

Page 49

public interface ErrorHandler

Basic interface for SAX error handlers.

If a SAX application needs to implement customized error handling, it must implement this interface and then register an instance with the SAX parser. The parser will then report all errors and warnings through this interface.

Some methods are:

void error(SAXParseException exception)
    Receive notification of a recoverable error.
void fatalError(SAXParseException exception)
    Receive notification of a non-recoverable error.
void warning(SAXParseException exception)
    Receive notification of a warning.
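The difference between the three callbacks can be illustrated by a handler that remembers the most severe problem it was told about. This is a sketch, not the course's code: the class name SeverityHandler and the sample documents are invented, and the handler is registered through the SAX 2 XMLReader interface. Validity problems arrive through error(), while a document that is not well-formed triggers fatalError().

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;

class SeverityHandler implements ErrorHandler {
    // Hypothetical one-element DTD in the internal subset.
    static final String DTD = "<!DOCTYPE a [<!ELEMENT a (#PCDATA)>]>";

    private String worst = "ok";

    @Override
    public void warning(SAXParseException e) {
        if (worst.equals("ok")) worst = "warning";
    }

    @Override
    public void error(SAXParseException e) {
        if (!worst.equals("fatal")) worst = "error";
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXException {
        worst = "fatal";
        throw e;   // parsing cannot meaningfully continue
    }

    /** Returns "ok", "warning", "error" or "fatal" for the given document. */
    static String classify(String xml) {
        SeverityHandler handler = new SeverityHandler();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setErrorHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            handler.worst = "fatal";   // parse was aborted
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        return handler.worst;
    }

    public static void main(String[] args) {
        System.out.println(classify(DTD + "<a>x</a>"));      // valid
        System.out.println(classify(DTD + "<a><b/></a>"));   // well-formed, not valid
        System.out.println(classify(DTD + "<a>"));           // not well-formed
    }
}
```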

Page 50

public class HandlerBase
extends java.lang.Object
implements EntityResolver, DTDHandler, DocumentHandler, ErrorHandler

Default base class for handlers.

This class implements the default behaviour for four SAX interfaces: EntityResolver, DTDHandler, DocumentHandler, and ErrorHandler.

Page 51

Input

FixedFloatSwap.dtd

<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap ( Bank, Notional, Fixed_Rate, NumYears, NumPayments ) >
<!ELEMENT Bank (#PCDATA)>
<!ELEMENT Notional (#PCDATA)>
<!ATTLIST Notional currency (dollars | pounds) #REQUIRED>
<!ELEMENT Fixed_Rate (#PCDATA) >
<!ELEMENT NumYears (#PCDATA) >
<!ELEMENT NumPayments (#PCDATA) >

Page 52

Input

FixedFloatSwap.xml

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd" [
  <!ENTITY bankname "Pittsburgh National Corporation">
]>
<FixedFloatSwap>
  <Bank>&bankname;</Bank>
  <Notional currency = "pounds">100</Notional>
  <Fixed_Rate>5</Fixed_Rate>
  <NumYears>3</NumYears>
  <NumPayments>6</NumPayments>
</FixedFloatSwap>

Page 53

Processing

Java event-driven processing

// NotifyStr.java
// Adapted from XML and Java by Maruyama, Tamura and Uramoto
// IBM Tokyo Research, Addison-Wesley

import java.io.*;
import org.xml.sax.*;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;

Page 54

public class NotifyStr extends HandlerBase
{
    public static void main (String argv [])
    {
        if (argv.length != 1) {
            System.err.println ("Usage: java NotifyStr filename.xml");
            System.exit (1);
        }
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true);
        NotifyStr myHandler = new NotifyStr();
        try {
            SAXParser saxParser = factory.newSAXParser();
            saxParser.parse( new File(argv [0]), myHandler);
        } catch (Throwable t) {
            t.printStackTrace ();
        }
        System.exit (0);
    }

Page 55

    public NotifyStr() {}

    public void startDocument() throws SAXException
    {
        System.out.println("startDocument called:");
    }

    public void endDocument() throws SAXException
    {
        System.out.println("endDocument called:");
    }

Page 56

    public void startElement(String Name, AttributeList aMap) throws SAXException
    {
        System.out.println("startElement called: element name =" + Name);

        // examine the attributes
        for(int i = 0; i < aMap.getLength(); i++) {
            String attName = aMap.getName(i);
            String type = aMap.getType(i);
            String value = aMap.getValue(i);
            System.out.println(" attribute name = " + attName +
                               " type = " + type + " value = " + value);
        }
    }

Page 57

    public void endElement(String name) throws SAXException
    {
        System.out.println("endElement is called:" + name);
    }

    public void characters(char[] ch, int start, int length) throws SAXException
    {
        // build String from char array
        String dataFound = new String(ch,start,length);
        System.out.println("characters called:" + dataFound);
    }

Page 58

    public void error(SAXParseException e) throws SAXException
    {
        System.out.println("Parsing error");
        System.out.println(e.toString());
    }
}


Output

C:\McCarthy\www\46-928\examples\sax>java NotifyStr FixedFloatSwap.xml
startDocument called:
startElement called: element name =FixedFloatSwap
startElement called: element name =Bank
characters called:Pittsburgh National Corporation
endElement is called:Bank
startElement called: element name =Notional
 attribute name = currency type = ENUMERATION value = pounds
characters called:100
endElement is called:Notional
startElement called: element name =Fixed_Rate
characters called:5
endElement is called:Fixed_Rate
startElement called: element name =NumYears
characters called:3
endElement is called:NumYears
startElement called: element name =NumPayments
characters called:6
endElement is called:NumPayments
endElement is called:FixedFloatSwap
endDocument called:
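HandlerBase and AttributeList belong to SAX1 and are deprecated in current JDKs. The following is a minimal sketch of the same tracing handler written against the SAX2 DefaultHandler and Attributes types. The class name NotifyStr2 and the trace helper are hypothetical names chosen for illustration; they are not part of the original slides. The callbacks are collected in a list instead of being printed directly so that the result can be inspected.

```java
import java.io.File;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical SAX2 counterpart of NotifyStr; records each callback as a line of text.
class NotifyStr2 extends DefaultHandler {
    final List<String> events = new ArrayList<>();

    @Override
    public void startDocument() {
        events.add("startDocument called:");
    }

    @Override
    public void endDocument() {
        events.add("endDocument called:");
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        events.add("startElement called: element name =" + qName);
        // examine the attributes
        for (int i = 0; i < atts.getLength(); i++) {
            events.add(" attribute name = " + atts.getQName(i)
                    + " type = " + atts.getType(i)
                    + " value = " + atts.getValue(i));
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("endElement is called:" + qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        events.add("characters called:" + new String(ch, start, length));
    }

    @Override
    public void error(SAXParseException e) {
        events.add("Parsing error " + e.getMessage());
    }

    // Parses any InputSource and returns the recorded callbacks.
    static List<String> trace(InputSource in, boolean validating) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(validating);
        SAXParser parser = factory.newSAXParser();
        NotifyStr2 handler = new NotifyStr2();
        parser.parse(in, handler);
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        // With one argument, parse that file; otherwise parse a small in-memory sample.
        InputSource in = args.length == 1
                ? new InputSource(new File(args[0]).toURI().toString())
                : new InputSource(new StringReader(
                        "<Notional currency=\"pounds\">100</Notional>"));
        for (String event : trace(in, args.length == 1)) {
            System.out.println(event);
        }
    }
}
```

Without a DTD the attribute type is reported as CDATA; the ENUMERATION type in the output above comes from the attribute declaration in the DTD.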


Accessing the swap from Jigsaw

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap [
  <!ENTITY bankname "Pittsburgh National Corporation">
]>
<FixedFloatSwap>
  <Bank>&bankname;</Bank>
  <Notional currency = "pounds">100</Notional>
  <Fixed_Rate>5</Fixed_Rate>
  <NumYears>3</NumYears>
  <NumPayments>6</NumPayments>
</FixedFloatSwap>

Saved under Www/fpml/ServerSwap.xml


Servlet Code

// This servlet file is stored in WWW/Jigsaw/servlet/GetXML.java
// This servlet reads a user-selected XML file from
// the Www/fpml directory and returns it to the client.

import java.io.*;
import java.util.*;
import javax.servlet.*;
import javax.servlet.http.*;

public class GetXML extends HttpServlet {

    public void doGet(HttpServletRequest req, HttpServletResponse res)
            throws ServletException, IOException {

        String theData = "";
        // the extra path info names the requested file; drop the leading "/"
        String extraPath = req.getPathInfo();
        extraPath = extraPath.substring(1);


        // read the file and write it to the client
        try {
            // open file and create a BufferedReader
            FileInputStream theFile = new FileInputStream(
                    "c:\\Jigsaw\\Jigsaw\\Jigsaw\\Www\\fpml\\" + extraPath);
            //DataInputStream dis = new DataInputStream(theFile);
            InputStreamReader is = new InputStreamReader(theFile);
            BufferedReader br = new BufferedReader(is);

            // read the file into the string theData
            String thisLine;
            while ((thisLine = br.readLine()) != null) {
                theData += thisLine + "\n";
            }
        } catch (Exception e) {
            System.err.println("Error " + e);
        }


        PrintWriter out = res.getWriter();
        out.write(theData);
        System.out.println("Wrote document to client");
        // write data to console
        System.out.println(theData);
        out.close();
    }
}
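The servlet builds a file name directly from the request path, never closes its reader, and fails with a NullPointerException when no extra path is supplied. The sketch below shows one way the file-reading step could be isolated and hardened with java.nio. The class name XmlFileReader and the method readUnder are hypothetical, and reading the file as UTF-8 is an assumption based on the encoding declared in ServerSwap.xml; the original servlet uses the platform default encoding.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: reads a requested file, but only from inside baseDir.
class XmlFileReader {

    // pathInfo has the form "/ServerSwap.xml", as returned by getPathInfo().
    static String readUnder(Path baseDir, String pathInfo) throws IOException {
        if (pathInfo == null || pathInfo.length() < 2) {
            throw new IOException("No file name in request path");
        }
        Path base = baseDir.toAbsolutePath().normalize();
        // drop the leading "/" and resolve the remainder against the base directory
        Path target = base.resolve(pathInfo.substring(1)).normalize();
        if (!target.startsWith(base)) {
            // rejects requests such as "/../secret.txt"
            throw new IOException("Path escapes the document directory: " + pathInfo);
        }
        // readAllBytes opens and closes the file itself
        return new String(Files.readAllBytes(target), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Demonstration with a temporary directory standing in for Www/fpml.
        Path dir = Files.createTempDirectory("fpml");
        Files.write(dir.resolve("ServerSwap.xml"),
                "<FixedFloatSwap/>".getBytes(StandardCharsets.UTF_8));
        System.out.println(readUnder(dir, "/ServerSwap.xml"));
    }
}
```

In a servlet, doGet could call such a helper and write the returned string to the response, sending an error status when an IOException is thrown.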


// Sax Client
import java.io.*;
import org.xml.sax.*;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;

public class JigsawNotifyStr extends HandlerBase {

    public static void main(String argv[]) {
        if (argv.length != 1) {
            System.err.println("Usage: java JigsawNotifyStr filename.xml");
            System.exit(1);
        }

        String serverString = "http://localhost:8001/servlet/getXML/";
        String fileName = argv[0];


        InputSource is = new InputSource(serverString + fileName);
        System.out.println("Got the input source");

        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true);

        JigsawNotifyStr myHandler = new JigsawNotifyStr();

        try {
            SAXParser saxParser = factory.newSAXParser();
            saxParser.parse(is, myHandler);
        } catch (Throwable t) {
            System.out.println("Big error");
            t.printStackTrace();
        }
        System.exit(0);
    }


    public JigsawNotifyStr() {}

    public void startDocument() throws SAXException {
        System.out.println("startDocument called:");
    }

    public void endDocument() throws SAXException {
        System.out.println("endDocument called:");
    }

    // Same as before
    //
    public void error(SAXParseException e) throws SAXException {
        // describe each error and show each error method
        System.out.println("Parsing error");
        System.out.println(e.toString());
    }
}


Being served by the servlet

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap [
  <!ENTITY bankname "Pittsburgh National Corporation">
]>
<FixedFloatSwap>
  <Bank>&bankname;</Bank>
  <Notional currency = "pounds">100</Notional>
  <Fixed_Rate>5</Fixed_Rate>
  <NumYears>3</NumYears>
  <NumPayments>6</NumPayments>
</FixedFloatSwap>


Got the input source
startDocument called:
Parsing error
org.xml.sax.SAXParseException: Element type "FixedFloatSwap" is not declared.
startElement called: element name =FixedFloatSwap
characters called:
Parsing error
org.xml.sax.SAXParseException: Element type "Bank" is not declared.
startElement called: element name =Bank
characters called:Pittsburgh National Corporation
endElement is called:Bank
characters called:
Parsing error
org.xml.sax.SAXParseException: Element type "Notional" is not declared.
Parsing error
org.xml.sax.SAXParseException: Attribute "currency" is not declared for element "Notional".
startElement called: element name =Notional
 attribute name = currency type = CDATA value = pounds
characters called:100
endElement is called:Notional
characters called:

We have some parsing errors.

Do you see why?


Parsing error
org.xml.sax.SAXParseException: Element type "Fixed_Rate" is not declared.
startElement called: element name =Fixed_Rate
characters called:5
endElement is called:Fixed_Rate
characters called:
Parsing error
org.xml.sax.SAXParseException: Element type "NumYears" is not declared.
startElement called: element name =NumYears
characters called:3
endElement is called:NumYears
characters called:
Parsing error
org.xml.sax.SAXParseException: Element type "NumPayments" is not declared.
startElement called: element name =NumPayments
characters called:6
endElement is called:NumPayments
characters called:
endElement is called:FixedFloatSwap
endDocument called:
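The errors follow from two facts visible above: the client calls setValidating(true), and the internal DTD subset of the served document declares only the bankname entity, so no element or attribute has a declaration. The sketch below compares that document with a variant whose internal subset declares everything. The class name ValidationErrorCount is hypothetical, and the attribute declaration (dollars|pounds) is an assumption, since the slides never show the DTD that declares currency.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo: counts validation errors for two versions of the swap document.
class ValidationErrorCount {

    static final String BODY =
          "<FixedFloatSwap><Bank>&bankname;</Bank>"
        + "<Notional currency=\"pounds\">100</Notional>"
        + "<Fixed_Rate>5</Fixed_Rate><NumYears>3</NumYears>"
        + "<NumPayments>6</NumPayments></FixedFloatSwap>";

    // Internal subset as served by the servlet: an entity, but no element declarations.
    static final String ENTITY_ONLY =
          "<!DOCTYPE FixedFloatSwap [\n"
        + "<!ENTITY bankname \"Pittsburgh National Corporation\">\n"
        + "]>\n";

    // Assumed complete internal subset; the currency values are a guess.
    static final String FULL =
          "<!DOCTYPE FixedFloatSwap [\n"
        + "<!ENTITY bankname \"Pittsburgh National Corporation\">\n"
        + "<!ELEMENT FixedFloatSwap (Bank, Notional, Fixed_Rate, NumYears, NumPayments)>\n"
        + "<!ELEMENT Bank (#PCDATA)>\n"
        + "<!ELEMENT Notional (#PCDATA)>\n"
        + "<!ATTLIST Notional currency (dollars|pounds) #REQUIRED>\n"
        + "<!ELEMENT Fixed_Rate (#PCDATA)>\n"
        + "<!ELEMENT NumYears (#PCDATA)>\n"
        + "<!ELEMENT NumPayments (#PCDATA)>\n"
        + "]>\n";

    // Parses with validation switched on and returns how often error() was called.
    static int countErrors(String xml) throws Exception {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void error(SAXParseException e) {
                count[0]++;
            }
        };
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true);
        SAXParser parser = factory.newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println("entity-only DTD errors: " + countErrors(ENTITY_ONLY + BODY));
        System.out.println("full DTD errors: " + countErrors(FULL + BODY));
    }
}
```

Validation errors are recoverable, which is why the parser in the output above keeps delivering startElement and characters events after each error.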


The InputSource Class

The SAX and DOM parsers need XML input. The “output” produced by these parsers amounts to a series of method calls (SAX) or an application programmer interface to the tree (DOM).

An InputSource object can be used to provide input to the parser.

[Diagram: InputSource -> SAX or DOM parser -> Tree (DOM) or Events (SAX) -> application]

So, how do we build an InputSource object?


Some InputSource constructors:

InputSource(String systemId);    // a URI, e.g. a file: or http: URL
InputSource(InputStream byteStream);
InputSource(Reader characterStream);

For example:

    String text = "<a>some xml</a>";
    StringReader sr = new StringReader(text);
    InputSource is = new InputSource(sr);
    :
    myParser.parse(is);
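The three constructors can be exercised side by side. The following is a minimal sketch, assuming a JAXP SAXParser and a DefaultHandler; the class name InputSourceDemo and the helper rootName are hypothetical. Each call parses the same document from a different kind of source and reports the root element name.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo of the three ways to build an InputSource.
class InputSourceDemo {

    // Parses the source and returns the name of the root element.
    static String rootName(InputSource in) throws Exception {
        final String[] root = new String[1];
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if (root[0] == null) {
                    root[0] = qName; // the first element seen is the root
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        return root[0];
    }

    public static void main(String[] args) throws Exception {
        String text = "<a>some xml</a>";
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);

        // 1. Character stream
        System.out.println(rootName(new InputSource(new StringReader(text))));

        // 2. Byte stream
        System.out.println(rootName(new InputSource(new ByteArrayInputStream(bytes))));

        // 3. System identifier: a URI naming a temporary file
        File tmp = File.createTempFile("demo", ".xml");
        tmp.deleteOnExit();
        Files.write(tmp.toPath(), bytes);
        System.out.println(rootName(new InputSource(tmp.toURI().toString())));
    }
}
```

The String constructor expects a system identifier (a URI), so a local file is best passed as a file: URL, as in the third case.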


But what about the DTD?

public interface EntityResolver

Basic interface for resolving entities.

If a SAX application needs to implement customized handling for external entities, it must implement this interface and register an instance with the SAX parser using the parser's setEntityResolver method.

The parser will then allow the application to intercept any external entities (including the external DTD subset and external parameter entities, if any) before including them.


EntityResolver

public InputSource resolveEntity(String publicId, String systemId) {
    // Add this method to the client above. The systemId String
    // holds the path to the dtd as specified in the xml document.
    // We may now access the dtd from a servlet and return an
    // InputSource or return null and let the parser resolve the
    // external entity.
    System.out.println("Attempting to resolve "
            + "Public id: " + publicId + " System id: " + systemId);
    return null;
}
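As an illustration of returning an InputSource rather than null, the sketch below supplies the DTD from a string whenever the parser asks for a system identifier ending in a.dtd. The class name DtdResolverDemo, the file name a.dtd, and the one-line DTD are all hypothetical. In a real client the DTD text could instead be fetched from the servlet. Passing a DefaultHandler to SAXParser.parse registers it as the entity resolver, so no separate setEntityResolver call is needed here.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo: the handler intercepts the request for the external DTD subset.
class DtdResolverDemo extends DefaultHandler {
    static final String DTD = "<!ELEMENT a (#PCDATA)>"; // assumed DTD content

    final List<String> resolved = new ArrayList<>(); // system ids the parser asked for
    int errors = 0;                                  // number of validation errors

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        resolved.add(systemId);
        if (systemId != null && systemId.endsWith("a.dtd")) {
            // hand the parser our own copy of the DTD
            return new InputSource(new StringReader(DTD));
        }
        return null; // let the parser resolve any other entity itself
    }

    @Override
    public void error(SAXParseException e) {
        errors++;
    }

    // Parses xml with validation on and returns the handler for inspection.
    static DtdResolverDemo parse(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true);
        DtdResolverDemo handler = new DtdResolverDemo();
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        return handler;
    }

    public static void main(String[] args) throws Exception {
        DtdResolverDemo h = parse("<!DOCTYPE a SYSTEM \"a.dtd\"><a>some xml</a>");
        System.out.println("resolved: " + h.resolved);
        System.out.println("validation errors: " + h.errors);
    }
}
```

The parser passes the fully expanded system identifier, which is why the sketch matches on the end of the string rather than on the literal a.dtd.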