Programming Mobile Devices XML Parsing University of Innsbruck WS 2009/2010 [email protected]

Lecture Programming Mobile Devices, Thomas Strang, WS 2009/2010


You have all seen typical XML files, e.g. biblio.xml:

<?xml version="1.0" encoding="ISO-8859-1" ?>
<bibliography>
  <phdthesis author="Thomas Strang">
    <title>Service-Interoperability in Ubiquitous Computing Environments</title>
    <isbn>3-8007-2823-0</isbn>
  </phdthesis>
  <!-- … several more … -->
</bibliography>


Well-Formed vs. Valid

An XML document is well-formed if it fulfills the following criteria:

There is exactly one root element, which contains all other elements

Every element has an opening and a closing tag (or is written as an empty-element tag such as <br/>)

Elements are properly nested; tags never overlap

Every attribute has a value enclosed in double (") or single (') quotes

The character encoding is declared in the XML declaration unless it is the default (UTF-8 or UTF-16)

An XML document is valid if it has an associated document type declaration and if the document complies with the constraints expressed in it. [ http://www.w3.org/TR/REC-xml/#dt-valid ]
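The well-formedness criteria can be checked with any non-validating parser. A minimal J2SE sketch, assuming the standard JAXP SAX parser; the class name WellFormedCheck and the sample strings are made up for illustration:

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class WellFormedCheck {

    /** Returns true if the given text parses as well-formed XML. */
    static boolean isWellFormed(String xml) {
        try {
            // A non-validating parser checks well-formedness only.
            SAXParserFactory.newInstance()
                    .newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false; // fatal error: a well-formedness rule is violated
        } catch (Exception e) {
            throw new IllegalStateException("parser setup or I/O failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<a><b>text</b></a>")); // one root, proper nesting
        System.out.println(isWellFormed("<a><b></a></b>"));     // broken nesting
        System.out.println(isWellFormed("<a/><b/>"));           // two root elements
        System.out.println(isWellFormed("<a x=1/>"));           // unquoted attribute value
    }
}
```

Validity is a separate, stricter check: it additionally requires a DTD (or schema) to compare the document against.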


XML Schema

XML Schemas express shared vocabularies and allow machines to carry out rules made by people. They provide mechanisms for defining the structure, content and [to some extent] semantics of XML documents.

Idea: Specify syntax & grammar of instance documents using XML itself, e.g. similar to

<schema>
  <element name="name" type="string" />
  <element name="qualification" type="string" />
  <element name="born" type="date" />
  <element name="dead" type="date" />
  <element name="isbn" type="string" />
  <element name="id" type="ID" />
  <element name="available" type="boolean" />
  <element name="lang" type="language" />
</schema>


Instance vs. Schema

Instance:

<title lang="en">Being a dog is a full-time job</title>

• Simple Content
• Complex Type (Attribute!)

Schema:

<xs:element name="title">
  <xs:complexType>
    <xs:simpleContent>
      <xs:extension base="xs:string">
        <xs:attribute ref="lang" />
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:element>

"the element named title has a complex type which is a simple content obtained by extending the predefined datatype xs:string by adding the attribute defined in this schema and having the name lang"


Validation

Validation of

Schema itself against XML Schema Spec

Instance against Schema

Validators

http://www.w3.org/2001/03/webdata/xsv

http://www.stg.brown.edu/service/xmlvalid
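Both validation steps can also be run locally with the javax.xml.validation API of J2SE. The sketch below wraps the title element declaration from the "Instance vs. Schema" slide into a complete schema; the global lang attribute of type xs:language and the class name TitleSchemaValidation are assumptions of this sketch:

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class TitleSchemaValidation {

    // Complete schema around the title declaration; the global "lang"
    // attribute declaration is an assumption made for this example.
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:attribute name='lang' type='xs:language'/>"
        + "<xs:element name='title'><xs:complexType><xs:simpleContent>"
        + "<xs:extension base='xs:string'><xs:attribute ref='lang'/></xs:extension>"
        + "</xs:simpleContent></xs:complexType></xs:element>"
        + "</xs:schema>";

    /** Returns true if the instance document is valid against XSD. */
    static boolean isValid(String instance) {
        try {
            // Step 1: compiling the schema checks the schema itself.
            Schema schema = SchemaFactory
                    .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)));
            // Step 2: validate the instance against the compiled schema.
            schema.newValidator().validate(new StreamSource(new StringReader(instance)));
            return true;
        } catch (SAXException e) {
            return false; // schema or instance violates a constraint
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<title lang=\"en\">Being a dog is a full-time job</title>"));
        System.out.println(isValid("<title language=\"en\">Being a dog</title>")); // undeclared attribute
    }
}
```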


Validation with XSV


Parser

Parsers are used to read XML documents into a programming language specific data structure

Parsers may be

validating or non-validating

processing model: event-based or document-based

Big differences between the processing models w.r.t.

memory requirements

access to the elements


SAX Parser

Simple API for XML (SAX) – event-based; the parser reports the document as a sequence of events

<bibliography>
  <phdthesis>
    <title>Service…</title>
  </phdthesis>
  …
</bibliography>

Event sequence:

startDocument()
startElement("bibliography")
startElement("phdthesis")
startElement("title")
characters("Service…")
endElement("title")
endElement("phdthesis")
…
endElement("bibliography")
endDocument()
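The event sequence can be reproduced with the SAX API shipped with J2SE. A minimal sketch; the handler class SaxEventLog is a made-up name, and text is buffered because a SAX parser is allowed to deliver one text run in several characters() calls:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxEventLog extends DefaultHandler {
    final List<String> events = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();

    // Emit buffered text as a single characters(...) entry.
    private void flushText() {
        if (text.length() > 0) {
            events.add("characters(\"" + text + "\")");
            text.setLength(0);
        }
    }

    @Override public void startDocument() { events.add("startDocument()"); }

    @Override public void endDocument() { events.add("endDocument()"); }

    @Override public void startElement(String uri, String local, String qName, Attributes atts) {
        flushText();
        events.add("startElement(\"" + qName + "\")");
    }

    @Override public void endElement(String uri, String local, String qName) {
        flushText();
        events.add("endElement(\"" + qName + "\")");
    }

    @Override public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    /** Parses the document and returns the logged events in order. */
    static List<String> log(String xml) {
        try {
            SaxEventLog handler = new SaxEventLog();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.events;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        log("<bibliography><phdthesis><title>Service</title></phdthesis></bibliography>")
                .forEach(System.out::println);
    }
}
```

Note that the handler never sees the document as a whole: whatever it needs later it has to store itself, which is why SAX needs little memory but offers no random access to elements.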


DOM Parser

Document Object Model (DOM) – document-based

defined by interfaces (no classes)

Document tree of biblio.xml:

bibliography
  phdthesis (author = "Thomas Strang")
    title: "Service-Interoperability in …"
    isbn: "3-8007-2823-0"

API to create, insert, update, delete, navigate
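The five API operations can be tried on the biblio.xml tree with the DOM implementation that ships with J2SE. A minimal sketch; the class name DomEditDemo, the inserted note element and the changed attribute value are made-up illustration data:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class DomEditDemo {
    static final String BIBLIO =
          "<bibliography><phdthesis author=\"Thomas Strang\">"
        + "<title>Service-Interoperability in Ubiquitous Computing Environments</title>"
        + "<isbn>3-8007-2823-0</isbn></phdthesis></bibliography>";

    /** Reads the whole document into an in-memory tree. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Applies one insert, one update and one delete to the first phdthesis. */
    static Element edit(Document doc) {
        // navigate: root element -> first phdthesis
        Element thesis = (Element) doc.getDocumentElement()
                .getElementsByTagName("phdthesis").item(0);
        // create + insert (hypothetical element)
        Element note = doc.createElement("note");
        note.setTextContent("on loan");
        thesis.appendChild(note);
        // update
        thesis.setAttribute("author", "T. Strang");
        // delete
        thesis.removeChild(thesis.getElementsByTagName("isbn").item(0));
        return thesis;
    }

    public static void main(String[] args) {
        Element thesis = edit(parse(BIBLIO));
        System.out.println(thesis.getAttribute("author"));
        System.out.println(thesis.getChildNodes().getLength());
    }
}
```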


Using DOM (J2SE)

import java.io.IOException;
import org.apache.xerces.parsers.DOMParser;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

Document doc = null;
try {
    DOMParser p = new DOMParser();  // instantiate parser
    // activate validation
    p.setFeature("http://xml.org/sax/features/validation", true);
    p.parse("http://www.deri.at/teaching.xml");  // parse file
    doc = p.getDocument();  // get document from parser
} catch (IOException io) {
    // e.g. file error ...
} catch (SAXException s) {
    // e.g. XML that cannot be parsed ...
}


J2SE Example: JDOM

open source, Java-specific object model API for XML documents

"Natural" way for Java programmers to access XML

uses collection classes, overloading, reflection

defined as classes and interfaces

Cooperative, not competitive, with SAX and DOM (image from http://jdom.net)


Using JDOM (J2SE)

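import java.io.IOException;
import org.jdom.Document;
import org.jdom.Element;
import org.jdom.JDOMException;
import org.jdom.input.SAXBuilder;

Document doc = null;
try {
    SAXBuilder b = new SAXBuilder(true /* validating */);
    doc = b.build("http://www.deri.at/teaching.xml");
    // ...
    // a document has exactly one root, so new content goes below the root element
    doc.getRootElement().addContent(
        new Element("pmd-lecture")
            .setAttribute("MSc", "true")
            .addContent(new Element("unit").setText("XML parsing")));
} catch (JDOMException j) {
    // e.g. XML that cannot be parsed ...
} catch (IOException io) {
    // e.g. file error ...
}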


Memory Requirements

[Bar chart: memory footprint in KByte (axis 0 to 700) for XML-file, JDOM, JDOM w/ Index and DOM]

[Bielert, 2001]


Access Speed: Improvements with dynamic content

[Bar chart: access time in ms (axis 0 to 14000), WWW vs. local, for XML files of 529, 2111, 2336, 192 140 and 271 414 Bytes]

(given a set of diverse XML files from different WWW servers)

[Bielert, 2001]


J2ME: Smaller, smaller, smaller - kXML

kXML is a small, event-based XML parser, designed specifically for constrained environments such as applets, PersonalJava or MIDP devices

kXML 2 implements pull-based XML parsing (see http://xmlpull.org), which combines some of the advantages of SAX (push) and DOM

The source is available on SourceForge at http://kxml.sourceforge.net


Pull Parsing

In contrast to push parsing, pull parsing lets the programmer "pull" the next event from the parser.

In push parsing, the listener has to keep track of which part of the data is currently being parsed: based on the events passed to it, it must save new state variables and restore previous ones whenever parsing moves to a different state.

Pull parsing makes state changes easier to handle, because the parser can be passed to different functions, each of which maintains its own state variables.
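The "pass the parser around" idea can be tried on J2SE without kXML. The sketch below uses the JDK's StAX XMLStreamReader as a stand-in pull parser (not the kXML API); the class name PullDemo and the address document with street and city children are made-up illustration data:

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class PullDemo {

    /** Top-level loop: pulls events and delegates the first address element. */
    static Map<String, String> firstAddress(String xml) {
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // deliver each text run as a single CHARACTERS event
            factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && reader.getLocalName().equals("address")) {
                    return parseAddress(reader); // hand the parser over
                }
            }
            return new LinkedHashMap<>();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reads the children of address; all state lives in local variables. */
    static Map<String, String> parseAddress(XMLStreamReader reader) throws XMLStreamException {
        Map<String, String> fields = new LinkedHashMap<>();
        String current = null; // name of the child element being read
        while (reader.hasNext()) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                current = reader.getLocalName();
            } else if (event == XMLStreamConstants.CHARACTERS && current != null) {
                fields.merge(current, reader.getText(), String::concat);
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                if (reader.getLocalName().equals("address")) {
                    break; // end of the element this function is responsible for
                }
                current = null;
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(firstAddress(
            "<contacts><address><street>Main Street 1</street>"
          + "<city>Innsbruck</city></address></contacts>"));
    }
}
```

With a push parser the variable current would have to be a field of the listener, saved and restored across callbacks; here it simply goes out of scope when parseAddress returns.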


kXML 1: Push Parsing (like SAX)

private void kXML1traverse(XmlParser parser) throws IOException {
    // avoid recursion!!
    boolean leave = false;
    do {
        ParseEvent event = parser.read();
        switch (event.getType()) {
            // reads out the start-tags and attributes of the xml-stream
            case Xml.START_TAG:
                //... prepare data structures dependent on type of element
                break;
            // reads out the text-data
            case Xml.TEXT:
                //... fill current data structure
                break;
            // reads out the end-tag
            case Xml.END_TAG:
                //... evaluate current data structure
                break;
            // reads out the end-document-tag
            case Xml.END_DOCUMENT:
                leave = true;
                break;
            default:
                // do nothing
                break;
        }
    } while (!leave);
}


kXML 2: Pull Parsing

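XmlParser parser = new XmlParser(
    new InputStreamReader(this.getClass().getResourceAsStream("file.xml")));
//...
ParseEvent event;
while ((event = parser.read()).getType() != Xml.END_DOCUMENT) {
    // delegate only on the opening <address> tag
    if (event.getType() == Xml.START_TAG && "address".equals(event.getName()))
        parseAddressTag(parser);
    //...
}

void parseAddressTag(XmlParser parser) throws IOException {
    String name = null;  // name of the most recent start tag
    ParseEvent event;
    while ((event = parser.peek()).getType() != Xml.END_DOCUMENT) {
        // leave the closing </address> tag for the caller
        if (event.getType() == Xml.END_TAG && "address".equals(event.getName()))
            return;
        ParseEvent next = parser.read();
        if (next.getType() == Xml.START_TAG) name = next.getName();
        if (next.getType() != Xml.TEXT) continue;
        System.err.println(name + ": " + next.getText());
        //...
    }
}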


JSR-172

Alternative to kXML, if supported by the phone: JSR-172

XML parser following the SAX model

part of the J2ME Web Services APIs (WSA)



Complex Data Example: CityWalk Dataflow

[Dataflow diagram, source: www.heywow.com. Labels shown: HIT Nucleus Engine, web I/f, Lotus client, FDI Lotus Notes, XML Gateway & content adaptor, DB; HTTP-Post and HTTP-Get (images, media) over a VPN/LSP/GPRS backbone network; MIDP-Java client with UI and Control, Database of Entries / Media cache, Record Store, bootstrap.xml, IDP CC; web-based MIDP Suite Creation (http://demo.heywow.com) producing a MIDlet Suite (JAR, JAD), installed via OTA download or local install]


Content Adaptation Example: CityWalk

Adaptation of XML data

Text and References

Commands

Actions (insert, update, delete etc.)

group management

Multimedia Resources (Images, Audio, …)

Maps

Negotiation based on User-Agent HTTP header:

User-Agent: SonyEricssonP800/R101 Profile/MIDP-1.0 Configuration/CLDC-1.0 DisplayCaps/208x203x12 HeywowClient/Highlander-0.9
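On the server side, the negotiation starts by taking the User-Agent value apart. A minimal sketch, assuming every token has the form Key/Value and that DisplayCaps means width x height x colour depth; both readings are interpretations of the example header, and the class name UserAgentCaps is made up:

```java
import java.util.LinkedHashMap;
import java.util.Map;

class UserAgentCaps {

    /** Splits the header value into Key/Value tokens. */
    static Map<String, String> parse(String userAgent) {
        Map<String, String> caps = new LinkedHashMap<>();
        for (String token : userAgent.trim().split("\\s+")) {
            int slash = token.indexOf('/');
            if (slash > 0) {
                caps.put(token.substring(0, slash), token.substring(slash + 1));
            }
        }
        return caps;
    }

    /** Returns {width, height, depth}; the meaning of the three numbers is assumed. */
    static int[] display(Map<String, String> caps) {
        String[] parts = caps.getOrDefault("DisplayCaps", "0x0x0").split("x");
        return new int[] {
            Integer.parseInt(parts[0]),
            Integer.parseInt(parts[1]),
            Integer.parseInt(parts[2])
        };
    }

    public static void main(String[] args) {
        Map<String, String> caps = parse(
            "SonyEricssonP800/R101 Profile/MIDP-1.0 Configuration/CLDC-1.0 "
          + "DisplayCaps/208x203x12 HeywowClient/Highlander-0.9");
        System.out.println(caps);
        System.out.println(display(caps)[0] + " pixels wide");
    }
}
```

A content adaptor could then, for example, scale images and maps to the reported display width before sending them to the client.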


Parsing Complex Data

CityWalk Schema Example

…
<xsd:complexType name="ElementBaseInfoType">
  <xsd:sequence>
    <xsd:element name="city" type="xsd:string" minOccurs="0" />
    <xsd:element name="name" type="citywalk:LanguageStringType" minOccurs="0" />
    <xsd:element name="description" type="citywalk:LanguageStringType" minOccurs="0" />
    <xsd:element name="mapref" type="citywalk:MapRefType" maxOccurs="unbounded" />
  </xsd:sequence>
</xsd:complexType>

<xsd:complexType name="ElementInfoType">
  <xsd:complexContent>
    <xsd:extension base="citywalk:ElementBaseInfoType">
      <xsd:sequence>
        <xsd:element name="imageResource" type="citywalk:ImageResourceType" minOccurs="0" maxOccurs="unbounded" />
        <xsd:element name="note" type="citywalk:LanguageStringType" minOccurs="0" maxOccurs="unbounded" />
      </xsd:sequence>
    </xsd:extension>
  </xsd:complexContent>
</xsd:complexType>
…


Parsing Complex Data

CityWalk Instance Example

…
<CTarget Id="LANDSBERG_20030717151816_47B13FF4157213BFC1256D660046FB0F"
         packagePath="dlr.tourGuide.tourGuideContent.">
  <elementInfo>
    <city>Landsberg</city>
    <name>Schmalzturm</name>
    <description xml:lang="DE">Auch "Schöner Turm" genannt, gehört zum 1. Mauerring aus dem 13. Jahrhundert</description>
    <mapref mapId="LandsbergMap1-5" fromLeft="138" fromTop="183" />
    <mapref mapId="LandsbergMap1-0" fromLeft="338" fromTop="383" />
    <imageResource xml:lang="DE" ref="schmalzturm.jpg">Schmalzturm</imageResource>
  </elementInfo>
</CTarget>
…
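A SAX-style handler (the model JSR-172 offers on the phone) could map such an instance to a plain Java object. A J2SE sketch under assumptions: the class CTargetReader and its fields are made up, and only Id, city, name and the mapref positions are read:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class CTargetReader extends DefaultHandler {
    String id, city, name;
    final List<int[]> mapPositions = new ArrayList<>(); // each entry: {fromLeft, fromTop}
    private final StringBuilder text = new StringBuilder();

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        text.setLength(0); // start collecting the text of this element
        if (qName.equals("CTarget")) {
            id = atts.getValue("Id");
        } else if (qName.equals("mapref")) {
            mapPositions.add(new int[] {
                Integer.parseInt(atts.getValue("fromLeft")),
                Integer.parseInt(atts.getValue("fromTop")) });
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        if (qName.equals("city")) {
            city = text.toString().trim();
        } else if (qName.equals("name")) {
            name = text.toString().trim();
        }
    }

    /** Parses one CTarget document and returns the filled handler. */
    static CTargetReader read(String xml) {
        try {
            CTargetReader handler = new CTargetReader();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling CTargetReader.read(...) on the instance above would fill city, name and two map positions; the remaining elements are simply skipped.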


XML as serialization format

Trick: consider XML instances as serialized objects

If the schema is specified carefully (self-describing!), objects (in MIDP Java) can be de-serialized from XML

Recap: The idea of self-describing serialization is to encode information about the class of which the object is an instance, versioning information, the length of the byte array representation, etc.
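One way to read the CityWalk instance as self-describing: the packagePath attribute plus the element name spell out a class name. This reading is an interpretation of the instance example, not a documented CityWalk rule. The J2SE sketch below uses DOM and a factory registry for brevity; the names SelfDescribingXml, XmlLoadable and the CTarget stand-in class are made up:

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class SelfDescribingXml {

    /** Objects that can restore their fields from their XML element. */
    interface XmlLoadable {
        void load(Element e);
    }

    /** Stand-in for the real CityWalk class; its single field is an assumption. */
    static class CTarget implements XmlLoadable {
        String id;

        @Override
        public void load(Element e) {
            id = e.getAttribute("Id");
        }
    }

    // Maps the class name encoded in the XML to a factory. On a device,
    // loading the class by name could take the role of this registry.
    static final Map<String, Supplier<XmlLoadable>> REGISTRY = new HashMap<>();
    static {
        REGISTRY.put("dlr.tourGuide.tourGuideContent.CTarget", CTarget::new);
    }

    /** Creates the object named by the document and lets it load itself. */
    static XmlLoadable deserialize(String xml) {
        Element root;
        try {
            root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        // packagePath + element name identify the class of the serialized object
        String className = root.getAttribute("packagePath") + root.getTagName();
        Supplier<XmlLoadable> factory = REGISTRY.get(className);
        if (factory == null) {
            throw new IllegalArgumentException("unknown class " + className);
        }
        XmlLoadable object = factory.get();
        object.load(root);
        return object;
    }
}
```

The deserializer itself knows nothing about CTarget: the document says which class to create, and the object decides how to read its own fields.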