Create by ChungLD faculty XML by Example / Bachkhoa – Aptech Computer Education 1/31 Session 5 DOM and SAX, XML DOM and SAX Objects

Session 5



Page 1: Session 5

Session 5: DOM and SAX, XML DOM and SAX Objects

Page 2: Session 5

Objectives
- Representing data
- Parsers
- DOM
- Working with DOM
- SAX
- Microsoft XML DOM objects

Page 3: Session 5

Representing Data
- Model: a way of representing data or information.
- The data within XML documents can be represented using various models.
- The document models available are:
  - Linear model
  - Tree model
  - Object model

Page 4: Session 5

Linear Model
- The linear model can be applied to a static document object, such as a book.
- To go to a particular topic in a book, the book name, the page number, and the line number on that page are the only information required.

Page 5: Session 5

Tree Model
- XML documents can have a hierarchical structure, which can be interpreted as a tree structure known as an XML tree.
- A tree consists of nodes, and every node in the tree holds a character string.

Example tree (figure):
Inventory
  Drink
    Fitzy (Quantity, Price)
    Tipsy (Quantity, Price)
  Snacks
    Popcorn (Quantity, Price)
    Wafers (Quantity, Price)
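
The Inventory tree above can be sketched in code. This is a minimal illustration using Python's standard xml.dom.minidom, not part of the original slides. The quantities and prices are made-up values, and placing Fitzy and Tipsy under Drink is an assumption read off the figure.

```python
from xml.dom import minidom

# Hypothetical inventory document; quantities and prices are invented.
INVENTORY_XML = """
<Inventory>
  <Drink>
    <Fitzy><Quantity>10</Quantity><Price>2.50</Price></Fitzy>
    <Tipsy><Quantity>4</Quantity><Price>1.75</Price></Tipsy>
  </Drink>
  <Snacks>
    <Popcorn><Quantity>7</Quantity><Price>0.99</Price></Popcorn>
    <Wafers><Quantity>3</Quantity><Price>1.20</Price></Wafers>
  </Snacks>
</Inventory>
"""


def outline(node, depth=0):
    """Return one indented line per element, in document order."""
    lines = []
    if node.nodeType == node.ELEMENT_NODE:
        lines.append("  " * depth + node.tagName)
        for child in node.childNodes:
            # Text nodes (including whitespace) are skipped by the check above.
            lines.extend(outline(child, depth + 1))
    return lines


document = minidom.parseString(INVENTORY_XML.strip())
tree_lines = outline(document.documentElement)
print("\n".join(tree_lines))
```

Each level of indentation in the printed outline corresponds to one level of the XML tree.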

Page 6: Session 5

Object Model
- The XML object model is a collection of objects that are used to access and manipulate the data stored in an XML document.
- The XML document is modeled as a tree, in which each element is considered a node.
- Objects with various properties and methods are used to traverse the tree and its nodes.
- Each node contains data from the document.

Page 7: Session 5

Parsers
- An XML parser is a software package, library, or module that reads XML documents.
- The parser then makes the information in the XML file available to applications and programming languages.

Page 8: Session 5

Parsing Techniques
- It is important to parse XML data efficiently, especially in applications that handle large volumes of data.
- Improper parsing results in excessive memory usage and processing time, which hampers scalability.

Parsing (figure):
- Event-driven parsing
- Object-based parsing

Page 9: Session 5

Types of parsers
The different types of parsers, covered on the following slides, are:
- Simple API for XML (SAX)
- Document Object Model (DOM)
- Streaming API for XML (StAX)

Page 10: Session 5

Simple API for XML (SAX)
- SAX takes an event-based approach to XML parsing.
- In event-based parsing, when the parser encounters an element it reports the element, its attributes, and its content to the application.
- Event-based parsing provides a data-centric view of XML.
- Events include encountering an XML tag, detecting an error, and so on.

Advantages:
- Low memory consumption, as the entire XML document is not loaded into memory.

Disadvantages:
- No built-in document navigation support.
- No support for random access to the XML document.
- No namespace support in the original SAX 1.0 (SAX2 added it), and no support for modifying the XML document in place.
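
The event-based approach can be sketched as follows. This is an illustration only, using Python's standard xml.sax module rather than any parser named on the slides. The class name EventLogger and the sample document are hypothetical.

```python
import xml.sax


class EventLogger(xml.sax.ContentHandler):
    """Record each SAX event in the order the parser reports it."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        # The parser hands over the element name and its attributes.
        self.events.append(("start", name, dict(attrs.items())))

    def characters(self, content):
        # Ignore whitespace-only text between tags.
        if content.strip():
            self.events.append(("text", content.strip()))

    def endElement(self, name):
        self.events.append(("end", name))


# Hypothetical sample document.
SAMPLE = b'<Drink><Fitzy price="2"><Quantity>12</Quantity></Fitzy></Drink>'

handler = EventLogger()
xml.sax.parseString(SAMPLE, handler)
for event in handler.events:
    print(event)
```

The handler never sees the whole document at once, which is why memory use stays low and why there is no way to navigate backwards.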

Page 11: Session 5

Document Object Model (DOM)
- DOM is a mature standard from the W3C.
- A DOM parser builds a hierarchical model of the parsed document.
- Each important location in the document, such as the element and attribute containers and other characteristics of the model, is represented as a node.

Advantages:
- Easy to use.
- Easy navigation by using the APIs.
- Random access to the XML document, as the tree is loaded into memory.

Disadvantages:
- The whole XML document is parsed in one go before any part of it can be accessed.
- High memory consumption, and therefore expensive, as the entire tree structure is loaded into memory.
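
Random access over an in-memory tree might look like this. It is a small sketch with Python's xml.dom.minidom as a stand-in DOM implementation, and the sample document is hypothetical.

```python
from xml.dom import minidom

# Hypothetical sample document.
XML = "<Snacks><Popcorn>7</Popcorn><Wafers>3</Wafers></Snacks>"

doc = minidom.parseString(XML)        # the whole tree is built here
items = doc.documentElement.childNodes

last = items[items.length - 1]        # jump straight to the last child
first = last.previousSibling          # step backwards
parent = last.parentNode              # step upwards

visited = [last.tagName, first.tagName, parent.tagName]
print(visited)
```

Moving backwards and upwards like this is possible only because every node stays in memory after parsing.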

Page 12: Session 5

Streaming API for XML (StAX)
- StAX is a newer parsing model than SAX and DOM.
- Like SAX, it is an event-driven model.
- StAX uses a pull model for event processing: the parser returns events only when the application requests them, and the events can also be provided as objects.

Advantages:
- Flexible, as it supports two parsing styles (a cursor API and an event-iterator API).
- Parsing is controlled by the application.

Disadvantages:
- No built-in document navigation support.
- No support for random access to the XML document.
- No support for modifying the XML document.
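
The pull model can be illustrated with Python's xml.dom.pulldom. This is not StAX itself (StAX is a Java API); it is only a sketch of the same idea, where the application asks for the next event and can stop whenever it has what it needs. The function name first_child_name and the sample document are hypothetical.

```python
from xml.dom import pulldom

# Hypothetical sample document.
XML = "<Snacks><Popcorn/><Wafers/></Snacks>"


def first_child_name(xml_text, parent):
    """Pull events until the first element inside `parent` is seen."""
    inside = False
    for event, node in pulldom.parseString(xml_text):
        if event == pulldom.START_ELEMENT:
            if inside:
                return node.tagName   # stop pulling; the rest is never read
            if node.tagName == parent:
                inside = True
    return None


print(first_child_name(XML, "Snacks"))
```

With a push parser such as SAX, the handler would keep receiving events for the rest of the document unless it raised an exception to stop.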

Page 13: Session 5

Introduction to DOM
- A standard object model for XML
- A standard programming interface for XML
- Platform- and language-independent
- A W3C standard

Page 14: Session 5

DOM objects
- All elements in the XML document, including their contents, are accessed through the DOM tree.
- In the DOM tree, content can be added, modified, and deleted.

The characteristics of a node tree are:
- The top node represents the root.
- Every node except the root has exactly one parent node.
- A node can have many children.
- A node with no child nodes is known as a leaf node.
- Nodes that have the same parent node are known as siblings.
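
The parent, child, leaf, and sibling relationships can be checked directly on a parsed tree. This is a sketch using Python's xml.dom.minidom with a hypothetical document.

```python
from xml.dom import minidom

# Hypothetical document, written without whitespace so that no
# whitespace-only text nodes appear between the elements.
doc = minidom.parseString(
    "<Inventory><Drink><Fitzy/><Tipsy/></Drink><Snacks/></Inventory>"
)

root = doc.documentElement      # the root element, Inventory
drink = root.firstChild         # first child of the root
snacks = drink.nextSibling      # Drink and Snacks share a parent

print(drink.parentNode.tagName)   # parent of Drink
print(len(drink.childNodes))      # number of children of Drink
print(snacks.hasChildNodes())     # False for a leaf node
```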

Page 15: Session 5

Creation of XML Document Object
- Create the DOM document object
- Read the whole XML document into it
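
Both steps can be sketched with Python's xml.dom.minidom. The slides most likely demonstrated this with the Microsoft parser, so treat the code below as an analogous illustration with hypothetical element names.

```python
from xml.dom import minidom

# Step 1: create an empty document object with a root element.
impl = minidom.getDOMImplementation()
doc = impl.createDocument(None, "Inventory", None)

item = doc.createElement("Popcorn")
item.appendChild(doc.createTextNode("12"))
doc.documentElement.appendChild(item)

xml_text = doc.documentElement.toxml()
print(xml_text)

# Step 2: read a whole XML document into a new document object.
loaded = minidom.parseString(xml_text)
print(loaded.documentElement.tagName)
```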

Page 16: Session 5

Traversing a DOM Tree using the Element Object
- An Element is a Node object and hence inherits the properties and methods of the Node interface.
- The method getElementsByTagName() returns a NodeList of the matching elements, and the length property of that list can be used to loop through it.
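
A loop over the result of getElementsByTagName() might look like this. It is a sketch in Python's xml.dom.minidom, and the document and its prices are made up.

```python
from xml.dom import minidom

# Hypothetical document; the prices are invented.
doc = minidom.parseString(
    "<Snacks>"
    "<Popcorn><Price>0.99</Price></Popcorn>"
    "<Wafers><Price>1.20</Price></Wafers>"
    "</Snacks>"
)

root = doc.documentElement
prices = root.getElementsByTagName("Price")   # a NodeList

values = []
for i in range(prices.length):                # length is a property
    price_element = prices.item(i)
    values.append(price_element.firstChild.data)

print(values)
```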

Page 17: Session 5

Traversing a DOM Tree using the Node Object
- The Node object represents a single node in the document tree.
- It is the basic data type of the DOM.
- Not all node types can have children.

The different node types include:
- NODE_ELEMENT
- NODE_ATTRIBUTE
- NODE_TEXT
- NODE_PROCESSING_INSTRUCTION
- NODE_COMMENT
- NODE_DOCUMENT
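
The node types can be inspected through the nodeType property. The sketch below uses Python's xml.dom, where the constants are spelled ELEMENT_NODE and so on. The mapping back to the NODE_* names used on this slide is added here for illustration, and the sample document is hypothetical.

```python
from xml.dom import Node, minidom

# Map Python's W3C-style constants to the names used on the slide.
TYPE_NAMES = {
    Node.ELEMENT_NODE: "NODE_ELEMENT",
    Node.ATTRIBUTE_NODE: "NODE_ATTRIBUTE",
    Node.TEXT_NODE: "NODE_TEXT",
    Node.PROCESSING_INSTRUCTION_NODE: "NODE_PROCESSING_INSTRUCTION",
    Node.COMMENT_NODE: "NODE_COMMENT",
    Node.DOCUMENT_NODE: "NODE_DOCUMENT",
}

# Hypothetical document containing one node of each kind.
doc = minidom.parseString(
    '<?display mode="list"?><!-- stock --><Item id="1">text</Item>'
)
item = doc.documentElement

top_level = [TYPE_NAMES[n.nodeType] for n in doc.childNodes]
text_kind = TYPE_NAMES[item.firstChild.nodeType]
attr_kind = TYPE_NAMES[item.getAttributeNode("id").nodeType]
doc_kind = TYPE_NAMES[doc.nodeType]

print(top_level, text_kind, attr_kind, doc_kind)
# A text node is one of the node types that cannot have children.
print(item.firstChild.hasChildNodes())
```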

Page 18: Session 5

Traversing a DOM Tree using NodeList
- The NodeList object represents a collection of Node objects.
- Any alteration to the nodes is reflected in the list.
- Individual nodes can be accessed by index, and the collection can also be iterated.
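
Index access, iteration, and the way changes show up in the list can be sketched as follows. The example uses the childNodes list of Python's xml.dom.minidom, and the claim about changes being reflected is demonstrated only for that list. Element names are hypothetical.

```python
from xml.dom import minidom

# Hypothetical document.
doc = minidom.parseString("<Snacks><Popcorn/><Wafers/></Snacks>")
root = doc.documentElement

children = root.childNodes           # a NodeList
before = children.length

# Changing the tree is reflected in the list obtained earlier.
root.appendChild(doc.createElement("Chips"))
after = children.length

by_index = children.item(0).tagName                # access by index
names = [child.tagName for child in children]      # iterate the list
print(before, after, by_index, names)
```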

Page 19: Session 5

Traversing a DOM Tree using the NamedNodeMap Object
- The NamedNodeMap object represents a collection of nodes that are accessed by name.
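
The attributes of an element are the usual place a NamedNodeMap appears. This is a sketch in Python's xml.dom.minidom with made-up attribute values.

```python
from xml.dom import minidom

# Hypothetical element with two attributes.
doc = minidom.parseString('<Popcorn quantity="7" price="0.99"/>')
attrs = doc.documentElement.attributes     # a NamedNodeMap

price = attrs.getNamedItem("price").value  # access by name
count = attrs.length                       # number of nodes in the map
names = sorted(attrs.item(i).name for i in range(attrs.length))

print(price, count, names)
```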

Page 20: Session 5

Traversing a DOM Tree using the Attribute Object
- An attribute of an Element object is represented by the Attr interface; the allowable values for an attribute are typically defined in a DTD.
- The Attr object is a node and inherits the properties and methods of the Node object.
- An attribute is a property of an element, not a child node of it.
- An attribute does not have a parent node.
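
These points can be verified on a parsed element. This is a sketch with Python's xml.dom.minidom and a hypothetical element.

```python
from xml.dom import Node, minidom

# Hypothetical element with a single attribute and no content.
doc = minidom.parseString('<Item id="42"/>')
item = doc.documentElement
attr = item.getAttributeNode("id")      # an Attr object

print(attr.nodeType == Node.ATTRIBUTE_NODE)  # Attr is a kind of Node
print(attr.name, attr.value)
print(attr.parentNode)                  # None: no parent node
print(attr.ownerElement.tagName)        # the element it belongs to
print(item.childNodes.length)           # 0: the attribute is not a child
```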

Page 21: Session 5

XMLDOMParseError
- The XMLDOMParseError object is an extension to the W3C specification.
- It can be used to get detailed information on the last error that occurred while either loading or parsing a document.

Page 22: Session 5

Properties of XMLDOMParseError
- errorCode
- filepos
- line
- linepos
- reason
- srcText
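
XMLDOMParseError is specific to the Microsoft parser. As a rough analogue only, Python's parser raises an ExpatError that carries similar information. The dictionary keys below borrow the names from this slide for comparison, and the helper name describe_parse_error is hypothetical.

```python
from xml.dom import minidom
from xml.parsers.expat import ErrorString, ExpatError


def describe_parse_error(xml_text):
    """Return error details for malformed XML, or None if it parses."""
    try:
        minidom.parseString(xml_text)
    except ExpatError as err:
        return {
            "errorCode": err.code,             # numeric error code
            "line": err.lineno,                # line where parsing failed
            "linepos": err.offset,             # column within that line
            "reason": ErrorString(err.code),   # human-readable message
        }
    return None


# The closing tag does not match the open <b> element.
info = describe_parse_error("<a><b></a>")
print(info)
```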

Page 23: Session 5

SAX
- Simple API for XML (SAX) is an API for XML parsers. A SAX parser reads through an XML document and extracts information from it.

Page 24: Session 5

XML Parsing using SAX
- SAX is not a W3C standard, but it has well-defined APIs for parsing XML documents.
- SAX generates events as it reads through the document. Event handlers handle these events and provide access to the content of the XML document.

Page 25: Session 5

Some common classes
- The ContentHandler class is used for accessing the contents of the XML document.

Page 26: Session 5

The DefaultHandler class
- This is the default base class for SAX2 event handlers. It provides default implementations for all of the callbacks in the four core SAX2 handler classes: EntityResolver, DTDHandler, ContentHandler, and ErrorHandler.
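
Python's xml.sax module has the four handler classes but no DefaultHandler. The idea of one convenience base class can still be sketched by combining them. The names DefaultHandlerSketch, TagCounter, and count_tags are hypothetical and are not part of any library.

```python
import io
import xml.sax
from xml.sax.handler import (
    ContentHandler,
    DTDHandler,
    EntityResolver,
    ErrorHandler,
)


class DefaultHandlerSketch(ContentHandler, DTDHandler,
                           EntityResolver, ErrorHandler):
    """Hypothetical base class: inherits every default callback."""


class TagCounter(DefaultHandlerSketch):
    """Override only the one callback this application cares about."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        self.count += 1


def count_tags(xml_bytes):
    handler = TagCounter()
    reader = xml.sax.make_parser()
    # One object can serve in all four handler roles.
    reader.setContentHandler(handler)
    reader.setErrorHandler(handler)
    reader.setDTDHandler(handler)
    reader.setEntityResolver(handler)
    reader.parse(io.BytesIO(xml_bytes))
    return handler.count


print(count_tags(b"<a><b/><c/></a>"))
```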

Page 27: Session 5

The XMLReader Interface
- This interface allows an application to set and query features and properties in the parser, to register event handlers for document processing, and to begin parsing a document.
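
The three responsibilities (features, handler registration, starting the parse) can be sketched with Python's xml.sax, whose make_parser() returns an object modelled on the SAX2 XMLReader interface. The class name NameCollector and the sample document are hypothetical.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class NameCollector(ContentHandler):
    """Collect element names in document order."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


reader = xml.sax.make_parser()

# 1. Set and query a feature.
reader.setFeature(feature_namespaces, False)
namespaces_on = reader.getFeature(feature_namespaces)

# 2. Register an event handler.
collector = NameCollector()
reader.setContentHandler(collector)

# 3. Begin parsing a document.
reader.parse(io.BytesIO(b"<Snacks><Popcorn/><Wafers/></Snacks>"))
print(collector.names, bool(namespaces_on))
```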

Page 28: Session 5

Microsoft XML DOM Objects
- The Microsoft implementation of the XML Document Object Model (DOM) provides a set of classes and interfaces that map to the W3C DOM.

Page 29: Session 5

Properties
- async: indicates whether asynchronous download is permitted. When set to true, the load() method returns control to the caller before the download is finished.
- doctype: returns the node representing the document type declaration (DTD) of the XML document. The root element itself is available through documentElement.
- implementation: returns the implementation object that handles this document.
- nodeName: a read-only property that returns a string containing the node's name, which depends on the type of node.
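
Three of these properties also exist in the W3C DOM, so they can be tried out in Python's xml.dom.minidom. This is a sketch only; async is specific to the Microsoft parser and has no counterpart here. The document is hypothetical. Note that doctype refers to the document type declaration, while documentElement is the root element.

```python
from xml.dom import minidom

# Hypothetical document with a small internal DTD.
doc = minidom.parseString(
    "<!DOCTYPE Inventory [<!ELEMENT Inventory ANY>]>"
    "<Inventory>stock</Inventory>"
)

info = {
    "doctype name": doc.doctype.name,
    "root nodeName": doc.documentElement.nodeName,
    "document nodeName": doc.nodeName,
    "text nodeName": doc.documentElement.firstChild.nodeName,
    "supports core 2.0": doc.implementation.hasFeature("core", "2.0"),
}
for key, value in info.items():
    print(key, "=", value)
```

The nodeName value depends on the node type: an element reports its tag name, while the document and text nodes report fixed names beginning with "#".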

Page 30: Session 5

Events
- ondataavailable: raised when new data is available.
- ontransformnode: raised before a stylesheet is applied to a node.

Page 31: Session 5


Method


Page 32: Session 5

Properties and methods of an XMLDOMNode object
Properties:
- namespaceURI
- parsed
- xml
- nodeName
- nodeType

Methods:
- hasChildNodes()
- insertBefore(newChild, refChild)
- replaceChild(newChild, oldChild)
- removeChild(child)
- appendChild(newChild)
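
The node methods listed here are part of the W3C Node interface as well, so their effect can be sketched in Python's xml.dom.minidom. Element names are hypothetical, and the comments track the children of the root after each call.

```python
from xml.dom import minidom

doc = minidom.parseString("<Snacks><Popcorn/></Snacks>")
root = doc.documentElement
popcorn = root.firstChild

wafers = doc.createElement("Wafers")
root.appendChild(wafers)              # Popcorn, Wafers

chips = doc.createElement("Chips")
root.insertBefore(chips, wafers)      # Popcorn, Chips, Wafers

nuts = doc.createElement("Nuts")
root.replaceChild(nuts, popcorn)      # Nuts, Chips, Wafers

root.removeChild(chips)               # Nuts, Wafers

names = [child.nodeName for child in root.childNodes]
print(names, root.hasChildNodes(), nuts.hasChildNodes())
```

In both insertBefore() and replaceChild() the new node comes first and the existing reference node second.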

Page 33: Session 5

Properties and methods of an XMLDOMNodeList
Properties:
- length

Methods:
- item
- nextNode
- reset

Page 34: Session 5

XMLDOMNamedNodeMap
Properties:
- length: returns the number of nodes in the map

Methods

Page 35: Session 5


Summary and workshop