
3.3 JAXP: Java API for XML Processing


Page 1: 3.3 JAXP: Java API for XML Processing

SDPL 2007 3.2: (XML APIs) JAXP 1

3.3 JAXP: Java API for XML Processing

How can applications use XML processors?
– A Java-based answer: through JAXP
– An overview of the JAXP interface
  » What does it specify?
  » What can be done with it?
  » How do the JAXP components fit together?

[Partly based on the tutorial “An Overview of the APIs” at http://java.sun.com/xml/jaxp/dist/1.1/docs/tutorial/overview/3_apis.html, from which some graphics are also borrowed]

Page 2: 3.3 JAXP: Java API for XML Processing

JAXP Versions: 1.1

JAXP 1.1 included in Java JDK 1.4
An interface for “plugging in” and using XML processors in Java applications
– includes packages
  » org.xml.sax: SAX 2.0
  » org.w3c.dom: DOM Level 2
  » javax.xml.parsers: initialization and use of parsers
  » javax.xml.transform: initialization and use of Transformers (XSLT processors)

Page 3: 3.3 JAXP: Java API for XML Processing

Later Versions: 1.2

JAXP 1.2 (2002) added property strings for setting the language and source of a schema used for (non-DTD-based) validation
– http://java.sun.com/xml/jaxp/properties/schemaLanguage
– http://java.sun.com/xml/jaxp/properties/schemaSource
– set either on a SAXParser or on a (DOM) DocumentBuilderFactory (to be discussed)
  » in JAXP 1.3 with the setSchema(Schema) method of the Factory classes
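A minimal sketch of the JAXP 1.2 property strings in use on a SAXParser. The class name, the one-element schema and the sample documents are made up for illustration; only the two property URIs come from the slide. The schema is passed as an in-memory stream so that the example needs no files.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class SchemaPropertyDemo {
    static final String SCHEMA_LANGUAGE =
        "http://java.sun.com/xml/jaxp/properties/schemaLanguage";
    static final String SCHEMA_SOURCE =
        "http://java.sun.com/xml/jaxp/properties/schemaSource";
    static final String W3C_XML_SCHEMA = "http://www.w3.org/2001/XMLSchema";

    // Hypothetical schema for illustration: <count> must contain an integer.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='count' type='xs:int'/></xs:schema>";

    /** Parses xml with schema validation switched on; returns the validity errors. */
    static List<String> validationErrors(String xml) {
        final List<String> errors = new ArrayList<>();
        try {
            SAXParserFactory spf = SAXParserFactory.newInstance();
            spf.setNamespaceAware(true);   // XML Schema validation relies on namespaces
            spf.setValidating(true);
            SAXParser parser = spf.newSAXParser();
            // Set the schema language first, then the schema source.
            parser.setProperty(SCHEMA_LANGUAGE, W3C_XML_SCHEMA);
            parser.setProperty(SCHEMA_SOURCE, bytes(XSD));
            parser.parse(bytes(xml), new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    errors.add(e.getMessage());   // collect instead of aborting
                }
            });
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return errors;
    }

    private static ByteArrayInputStream bytes(String s) {
        return new ByteArrayInputStream(s.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        System.out.println(validationErrors("<count>42</count>"));
        System.out.println(validationErrors("<count>many</count>"));
    }
}
```

On a DocumentBuilderFactory the same two strings would be passed to setAttribute instead of setProperty.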

Page 4: 3.3 JAXP: Java API for XML Processing

Later Versions: 1.3

JAXP 1.3 included in Java JDK 1.5 (2004)
– more flexible validation (decoupled from parsing)
– DOM Level 3 Core, and Load and Save
– API for applying XPath to documents
– mapping between XML Schema and Java data types

We'll restrict ourselves to the basic ideas introduced in JAXP 1.1
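Although the rest of this section stays with JAXP 1.1, the XPath API added in 1.3 is small enough to sketch here. The class name and the sample document are hypothetical; the calls are those of the standard javax.xml.xpath package.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class XPathDemo {
    /** Parses xml into a DOM Document and evaluates expr against it as a string. */
    static String evaluate(String xml, String expr) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expr, doc);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<regs><reg id='a'>Ann</reg><reg id='b'>Bob</reg></regs>";
        System.out.println(evaluate(xml, "/regs/reg[@id='b']"));
        System.out.println(evaluate(xml, "count(/regs/reg)"));
    }
}
```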

Page 5: 3.3 JAXP: Java API for XML Processing

JAXP: XML processor plugin (1)

Vendor-independent method for selecting the processor implementation at run time
– principally through system properties
    javax.xml.parsers.SAXParserFactory
    javax.xml.parsers.DocumentBuilderFactory
    javax.xml.transform.TransformerFactory
– Set on the command line (for example, to use Apache Xerces as the DOM implementation):
    java -Djavax.xml.parsers.DocumentBuilderFactory=org.apache.xerces.jaxp.DocumentBuilderFactoryImpl
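Which implementation the three factories actually resolve to can be checked at run time. The following sketch (class and method names are hypothetical) prints, for each system property named above, its current value and the class that newInstance() returned. The output depends on the JDK and on the class path, so none is shown.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;

class PluggabilityDemo {
    /** Maps each JAXP factory property name to the implementation class in use. */
    static Map<String, String> loadedImplementations() {
        Map<String, String> impls = new LinkedHashMap<>();
        impls.put("javax.xml.parsers.SAXParserFactory",
                  SAXParserFactory.newInstance().getClass().getName());
        impls.put("javax.xml.parsers.DocumentBuilderFactory",
                  DocumentBuilderFactory.newInstance().getClass().getName());
        impls.put("javax.xml.transform.TransformerFactory",
                  TransformerFactory.newInstance().getClass().getName());
        return impls;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> e : loadedImplementations().entrySet()) {
            // The property is null unless it was set with -D or System.setProperty.
            System.out.println(e.getKey()
                + "\n  system property: " + System.getProperty(e.getKey())
                + "\n  loaded class:    " + e.getValue());
        }
    }
}
```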

Page 6: 3.3 JAXP: Java API for XML Processing

JAXP: XML processor plugin (2)

– Set during execution (Saxon as the XSLT implementation):
    System.setProperty("javax.xml.transform.TransformerFactory",
                       "com.icl.saxon.TransformerFactoryImpl");

By default, the reference implementations are used
– Apache Crimson/Xerces as the XML parser
– Apache Xalan/XSLTC as the XSLT processor

Supported by a few compliant processors:
– Parsers: Apache Crimson and Xerces, Aelfred, Oracle XML Parser for Java, libxml2 (via GNU JAXP libxmlj)
– Transformers: Apache Xalan, Saxon, GNU XSL transformer

[Slide callout: "highly experimental"]

Page 7: 3.3 JAXP: Java API for XML Processing

JAXP: Functionality

Parsing using SAX 2.0 or DOM Level 2
Transformation using XSLT
– (more about XSLT later)
Adds functionality missing from SAX 2.0 and DOM Level 2:
– controlling validation and handling of parse errors
  » error handling can be controlled in SAX, by implementing ErrorHandler methods
– loading and saving of DOM Document objects

Page 8: 3.3 JAXP: Java API for XML Processing

JAXP Parsing API

Included in JAXP package javax.xml.parsers
Used for invoking and using SAX …
    SAXParserFactory spf = SAXParserFactory.newInstance();
and DOM parser implementations:
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();

Page 9: 3.3 JAXP: Java API for XML Processing

JAXP: Using a SAX parser (1)

[Figure: the call chain .newSAXParser(), .getXMLReader(), .parse("f.xml"), applied to the XML file f.xml]

Page 10: 3.3 JAXP: Java API for XML Processing

JAXP: Using a SAX parser (2)

We have already seen this:
    SAXParserFactory spf = SAXParserFactory.newInstance();
    try {
        SAXParser saxParser = spf.newSAXParser();
        XMLReader xmlReader = saxParser.getXMLReader();
        // myHdler: the application's own ContentHandler implementation
        ContentHandler handler = new myHdler();
        xmlReader.setContentHandler(handler);
        xmlReader.parse(systemIdOrInputSrc);
    } catch (Exception e) {
        System.err.println(e.getMessage());
        System.exit(1);
    }
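For a version of the fragment above that can be compiled and run as is, here is a minimal sketch with a concrete handler. The class name and the element-counting handler are assumptions standing in for myHdler, and the input is read from a string instead of a file.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class SaxCountDemo {
    /** Counts the start tags that the SAX parser reports for the given document. */
    static int countElements(String xml) {
        final int[] count = {0};
        try {
            SAXParserFactory spf = SAXParserFactory.newInstance();
            SAXParser saxParser = spf.newSAXParser();
            XMLReader xmlReader = saxParser.getXMLReader();
            // DefaultHandler supplies empty callbacks; only startElement is overridden.
            xmlReader.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts) {
                    count[0]++;
                }
            });
            xmlReader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        System.out.println(countElements("<a><b/><b/></a>"));
    }
}
```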

Page 11: 3.3 JAXP: Java API for XML Processing

JAXP: Using a DOM parser (1)

[Figure: the call .newDocumentBuilder(), followed by either .parse("f.xml") on the XML file f.xml or .newDocument()]

Page 12: 3.3 JAXP: Java API for XML Processing

JAXP: Using a DOM parser (2)

Parsing a file into a DOM Document:
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    try {
        // to get a new DocumentBuilder:
        DocumentBuilder builder = dbf.newDocumentBuilder();
        Document domDoc = builder.parse(fileNameOrURIetc);
    } catch (ParserConfigurationException | SAXException | IOException e) {
        // parse() can also fail with a SAXException or an IOException
        e.printStackTrace();
        System.exit(1);
    }
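The same steps as a self-contained sketch. The class name and the sample document are hypothetical, and the input comes from a string wrapped in an InputSource rather than from a file name or URI.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class DomParseDemo {
    /** Parses the given XML text into a DOM Document. */
    static Document parse(String xml) {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        try {
            DocumentBuilder builder = dbf.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<regs><reg/><reg/></regs>");
        System.out.println(doc.getDocumentElement().getTagName());
        System.out.println(doc.getElementsByTagName("reg").getLength());
    }
}
```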

Page 13: 3.3 JAXP: Java API for XML Processing

DOM building in JAXP

[Figure: XML input → XMLReader (SAX parser), with ErrorHandler, DTDHandler and EntityResolver → DocumentBuilder (ContentHandler) → DOM Document]

DOM on top of SAX - So what?

Page 14: 3.3 JAXP: Java API for XML Processing

JAXP: Controlling parsing (1)

Errors of DOM parsing can be handled
– by creating a SAX ErrorHandler
  » to implement the error, fatalError and warning methods
  and passing it to the DocumentBuilder:
    builder.setErrorHandler(new myErrHandler());
    domDoc = builder.parse(fileName);

Parser properties can be configured:
– for both SAXParserFactories and DocumentBuilderFactories (before parser/builder creation):
    factory.setValidating(true/false)
    factory.setNamespaceAware(true/false)
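A sketch that combines both points: a validating, namespace-aware DocumentBuilder with a SAX ErrorHandler attached. The handler class, the method name and the tiny DTD are made up for illustration. Warnings and validity errors are collected, while fatal (well-formedness) errors are rethrown so that parsing stops.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class ValidatingDomDemo {
    /** Hypothetical stand-in for myErrHandler: records problems instead of printing. */
    static class CollectingErrorHandler implements ErrorHandler {
        final List<String> problems = new ArrayList<>();

        @Override public void warning(SAXParseException e) {
            problems.add("warning: " + e.getMessage());
        }
        @Override public void error(SAXParseException e) {
            problems.add("error: " + e.getMessage());
        }
        @Override public void fatalError(SAXParseException e) throws SAXException {
            throw e;   // not well-formed: give up
        }
    }

    static List<String> problemsIn(String xml) {
        CollectingErrorHandler handler = new CollectingErrorHandler();
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setValidating(true);       // must be set before the builder is created
            dbf.setNamespaceAware(true);
            DocumentBuilder builder = dbf.newDocumentBuilder();
            builder.setErrorHandler(handler);
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return handler.problems;
    }

    public static void main(String[] args) {
        String dtd = "<!DOCTYPE a [<!ELEMENT a (b)><!ELEMENT b EMPTY>]>";
        System.out.println(problemsIn(dtd + "<a><b/></a>"));
        System.out.println(problemsIn(dtd + "<a><c/></a>"));
    }
}
```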

Page 15: 3.3 JAXP: Java API for XML Processing

JAXP: Controlling parsing (2)

Further DocumentBuilderFactory configuration methods to control the form of the resulting DOM Document:
    dbf.setIgnoringComments(true/false)
    dbf.setIgnoringElementContentWhitespace(true/false)
    dbf.setCoalescing(true/false)
      • combine CDATA sections with surrounding text?
    dbf.setExpandEntityReferences(true/false)
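The effect of two of these switches can be seen by counting the children of an element that mixes a comment, text and a CDATA section. This is a sketch with hypothetical class and method names; the sample document is made up.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class DomShapeDemo {
    static final String XML = "<a><!-- note -->x<![CDATA[y]]>z</a>";

    /** Parses XML with comments ignored and CDATA coalesced only when tidy is true. */
    static Element rootElement(boolean tidy) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setIgnoringComments(tidy);   // drop Comment nodes
            dbf.setCoalescing(tidy);         // merge CDATA sections into adjacent text
            Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (boolean tidy : new boolean[] {false, true}) {
            Element a = rootElement(tidy);
            System.out.println("tidy=" + tidy
                + " children=" + a.getChildNodes().getLength()
                + " text=" + a.getTextContent());
        }
    }
}
```

With both switches off, the element keeps separate Comment, Text and CDATASection children; with both on, a single Text node remains. The text content is the same either way.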

Page 16: 3.3 JAXP: Java API for XML Processing

JAXP Transformation API

earlier known as TrAX
Allows an application to apply a Transformer to a Source document to get a Result document
A Transformer can be created
– from XSLT transformation instructions (to be discussed later)
– without instructions
  » gives an identity transformation, which simply copies the Source to the Result

Page 17: 3.3 JAXP: Java API for XML Processing

JAXP: Using Transformers (1)

[Figure: .newTransformer(…) with optional XSLT instructions, then .transform(.,.) applied to a Source]

Page 18: 3.3 JAXP: Java API for XML Processing

JAXP Transformation APIs

javax.xml.transform:
– Classes TransformerFactory and Transformer; initialization similar to parser factories and parsers
Transformation Source object can be
– a DOM tree, a SAX XMLReader or an input stream
Transformation Result object can be
– a DOM tree, a SAX ContentHandler or an output stream

Page 19: 3.3 JAXP: Java API for XML Processing

Source-Result combinations

[Figure: a Transformer takes a Source (XMLReader (SAX parser), DOM, or input stream) and produces a Result (ContentHandler, DOM, or output stream)]
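Two of these combinations, sketched with an identity Transformer: character stream to DOM, and DOM to a SAX ContentHandler. The class name, method names and the element-counting handler are assumptions for illustration; the Source and Result classes are the standard ones from the javax.xml.transform subpackages.

```java
import java.io.StringReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class SourceResultDemo {
    /** Stream in, DOM out: the identity transformation acts as a parser. */
    static Document streamToDom(String xml) {
        try {
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            DOMResult result = new DOMResult();   // the transformer creates the Document
            identity.transform(new StreamSource(new StringReader(xml)), result);
            return (Document) result.getNode();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** DOM in, SAX events out: the handler sees the tree as a stream of callbacks. */
    static int domToSaxElementCount(Document doc) {
        final int[] count = {0};
        try {
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.transform(new DOMSource(doc), new SAXResult(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts) {
                    count[0]++;
                }
            }));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        Document doc = streamToDom("<a><b/><b/></a>");
        System.out.println(doc.getDocumentElement().getTagName());
        System.out.println(domToSaxElementCount(doc));
    }
}
```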

Page 20: 3.3 JAXP: Java API for XML Processing

JAXP Transformation Packages (2)

Classes to create Source and Result objects from DOM, SAX and I/O streams are defined in packages
– javax.xml.transform.dom, javax.xml.transform.sax, and javax.xml.transform.stream
Identity transformation to an output stream is a vendor-neutral way to serialize DOM documents (and the only option in JAXP)
– “I would recommend using the JAXP interfaces until the DOM’s own load/save module becomes available”
  » Joe Kesselman, IBM & W3C DOM WG

Page 21: 3.3 JAXP: Java API for XML Processing

Serializing a DOM Document as XML text

By an identity transformation to an output stream:
    TransformerFactory tFactory = TransformerFactory.newInstance();
    // Create an identity transformer:
    Transformer transformer = tFactory.newTransformer();
    DOMSource source = new DOMSource(myDOMdoc);
    StreamResult result = new StreamResult(System.out);
    transformer.transform(source, result);
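A self-contained variant of the serialization above, as a sketch. It builds a small Document with newDocument() (element names are made up), writes to a StringWriter instead of System.out, and sets one output property to suppress the XML declaration. The class and method names are hypothetical.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class DomSerializeDemo {
    /** Builds <reglist><reg>Ann</reg></reglist> in memory. */
    static Document sample() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
            Element root = doc.createElement("reglist");
            Element reg = doc.createElement("reg");
            reg.setTextContent("Ann");
            root.appendChild(reg);
            doc.appendChild(root);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Serializes the Document through an identity transformation. */
    static String serialize(Document doc) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serialize(sample()));
    }
}
```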

Page 22: 3.3 JAXP: Java API for XML Processing

Controlling the form of the result?

We could specify the requested form of the result by an XSLT script, say, in file saveSpec.xslt:
    <xsl:transform version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output encoding="ISO-8859-1" indent="yes"
          doctype-system="reglist.dtd" />
      <xsl:template match="/">
        <!-- copy whole document: -->
        <xsl:copy-of select="." />
      </xsl:template>
    </xsl:transform>

Page 23: 3.3 JAXP: Java API for XML Processing

Creating an XSLT Transformer

Then create a tailored transformer:
    StreamSource saveSpecSrc =
        new StreamSource(new File("saveSpec.xslt"));
    Transformer transformer =
        tFactory.newTransformer(saveSpecSrc);
    // and use it to transform a Source to a Result,
    // as before

The Source of transformation instructions could also be given as a DOMSource, URL, or character reader
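A sketch of the character-reader variant: the saveSpec stylesheet is held in a string and handed to the factory through a StreamSource over a StringReader, so no saveSpec.xslt file is needed. The class name, method name and sample input are hypothetical; the stylesheet is the one from the previous slide.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltSaveDemo {
    static final String SAVE_SPEC =
        "<xsl:transform version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output encoding='ISO-8859-1' indent='yes'"
        + " doctype-system='reglist.dtd'/>"
        + "<xsl:template match='/'><xsl:copy-of select='.'/></xsl:template>"
        + "</xsl:transform>";

    /** Copies xml to a string in the form requested by the xsl:output element. */
    static String save(String xml) {
        try {
            TransformerFactory tFactory = TransformerFactory.newInstance();
            Transformer transformer =
                tFactory.newTransformer(new StreamSource(new StringReader(SAVE_SPEC)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(save("<reglist><reg>Ann</reg></reglist>"));
    }
}
```

The result should carry a document type declaration that refers to reglist.dtd; exact indentation and line breaks depend on the XSLT processor in use.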

Page 24: 3.3 JAXP: Java API for XML Processing

DOM vs. Other Java/XML APIs

JDOM (www.jdom.org), DOM4J (www.dom4j.org), JAXB (java.sun.com/xml/jaxb)
The others may be more convenient to use, but …
– “The DOM offers not only the ability to move between languages with minimal relearning, but to move between multiple implementations in a single language – which a specific set of classes such as JDOM can’t support”
  » J. Kesselman, IBM & W3C DOM WG

Page 25: 3.3 JAXP: Java API for XML Processing

JAXP: Summary

An interface for using XML Processors
– SAX/DOM parsers, XSLT transformers
– schema-based validators (in JAXP 1.3)
Supports pluggability of XML processors
Defines means to control parsing, and handling of parse errors (through SAX ErrorHandlers)
Defines means to create and save DOM Documents