Extensible Markup Language (XML) CS422 Dick Steflik


Page 1: Extensible Markup Language (XML)

Extensible Markup Language (XML)

CS422
Dick Steflik

Page 2: Extensible Markup Language (XML)

What is XML

• A markup language for giving a text document contextual structure
  – parentage is Standard Generalized Markup Language (SGML; ISO 8879)
    • specifies a document's structure and attributes, not processing
    • should be declarative
  – a set of rules for encoding documents that is both human and machine readable

Page 3: Extensible Markup Language (XML)

Things to note in the example

• Every tag is paired with an ending tag
  – end tags have the same name preceded with "/"
  – a tag pair and its content constitute an XML element
• Tags are in lower case by convention (XML is case sensitive, so start and end tag names must match exactly)
• Documents must be "well formed"
  – tags may be nested one inside of another (never cross matched)
  – every opening tag must have a closing tag
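The well-formedness rules above can be checked mechanically. The following is a minimal sketch in Python, assuming the standard-library xml.etree.ElementTree parser; the helper name is_well_formed and the sample strings are illustrative and do not come from the slides.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text parses as well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


nested = "<a><b>ok</b></a>"        # tags nested one inside another
crossed = "<a><b>bad</a></b>"      # cross-matched tags
unclosed = "<a><b>text</b>"        # <a> never gets its closing tag
wrong_case = "<a>text</A>"         # XML is case sensitive: <a> is not closed by </A>

results = {
    "nested": is_well_formed(nested),
    "crossed": is_well_formed(crossed),
    "unclosed": is_well_formed(unclosed),
    "wrong_case": is_well_formed(wrong_case),
}
print(results)
```

Only the first string is accepted; the parser rejects the other three before any application code sees the data.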

Page 4: Extensible Markup Language (XML)

Tag Attributes

• Every tag may have a set of attributes
  – specified as part of the start tag as name/value pairs (ex. id="abc")
    • attribute values must be quoted
    • bare keywords (ex. noform) are allowed in SGML and HTML but not in XML, where every attribute needs a value
  – attributes specify additional information about the tag
  – attributes are separated by one or more spaces
    • commas will generate errors

Page 5: Extensible Markup Language (XML)

Attribute example

<message to="[email protected]" from="[email protected]">
  <subject>Another XML Example</subject>
  <text> This is the message body. </text>
</message>
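As a rough illustration of how an application might read attributes like these, here is a short Python sketch using the standard-library xml.etree.ElementTree module. The e-mail addresses are placeholders chosen for the example, not values from the slides.

```python
import xml.etree.ElementTree as ET

# Placeholder addresses; any quoted attribute values would work the same way.
doc = """<message to="alice@example.com" from="bob@example.com">
  <subject>Another XML Example</subject>
  <text>This is the message body.</text>
</message>"""

message = ET.fromstring(doc)

recipient = message.get("to")          # attribute lookup, None if absent
sender = message.attrib["from"]        # attributes are also exposed as a dict
subject = message.findtext("subject")  # text of the first <subject> child
cc = message.get("cc", "none")         # default for an attribute that is not present

print(recipient, sender, subject, cc)
```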

Page 6: Extensible Markup Language (XML)

XML Prolog

• XML files should start out with a prolog line (the XML declaration)
  – <?xml version="1.0"?>
  – other attributes:
    • encoding – identifies the character set used to encode the data
    • standalone – identifies that the document stands alone, i.e. doesn't require any external references

Page 7: Extensible Markup Language (XML)

Example

<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
<message>
  <to>[email protected]</to>
  <from>[email protected]</from>
  <subject>Another XML Example</subject>
  <text> This is the message body. </text>
</message>
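To show what the encoding attribute does in practice, here is a small Python sketch, assuming the standard-library xml.etree.ElementTree parser. The sample text "café" is made up for the illustration: its last character is a single byte in ISO-8859-1, and that byte is not valid UTF-8 on its own.

```python
import xml.etree.ElementTree as ET

# The same element, encoded as ISO-8859-1 bytes, with and without a prolog.
with_prolog = (
    '<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>'
    "<text>caf\u00e9</text>"
).encode("iso-8859-1")
without_prolog = "<text>caf\u00e9</text>".encode("iso-8859-1")

# The declared encoding tells the parser how to decode the bytes.
body = ET.fromstring(with_prolog).text

# Without a declaration the parser assumes UTF-8 and rejects the stray byte.
try:
    ET.fromstring(without_prolog)
    undeclared_ok = True
except ET.ParseError:
    undeclared_ok = False

print(body, undeclared_ok)
```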

Page 8: Extensible Markup Language (XML)

Comments in XML files

• <!-- comment text -->
  – the comment text itself may not contain "--"

Page 9: Extensible Markup Language (XML)

Processing Instructions

• Since XML is a portable document format, the same document may be processed by a number of applications; this processing can be specified in the file
• Each instruction should be of the form:
  – <?target instructions ?>
    • target – the name of the processing application
    • instructions – a string of characters that specifies the processing commands or parameters
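The target/instructions split can be seen by parsing a document and listing its processing instructions. This is a sketch in Python using the standard-library xml.dom.minidom module. The xml-stylesheet target is a real, widely used instruction, while the file name style.xsl is a placeholder.

```python
from xml.dom import Node, minidom

doc = minidom.parseString(
    '<?xml version="1.0"?>'
    '<?xml-stylesheet type="text/xsl" href="style.xsl"?>'
    "<message/>"
)

# Collect (target, instructions) for every processing instruction at the top
# level of the document. The XML declaration is not reported as one.
instructions = [
    (node.target, node.data)
    for node in doc.childNodes
    if node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
]

print(instructions)
```

The parser does not interpret the instruction string; it hands it over unchanged for the target application to act on.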

Page 10: Extensible Markup Language (XML)

Why is XML Important?

• Not a binary format
• can be transported across a network easily
• easy to create manually or programmatically
• makes debugging easier
• can describe very complex objects
• easy to store in a database
• more scalable than binary
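As one illustration of the "easy to create programmatically" point, the message document used elsewhere in these slides can be built in a few lines. This is a sketch using Python's standard-library xml.etree.ElementTree; the addresses are placeholders.

```python
import xml.etree.ElementTree as ET

# Build the element tree in memory (placeholder addresses).
message = ET.Element(
    "message", {"to": "alice@example.com", "from": "bob@example.com"}
)
ET.SubElement(message, "subject").text = "Another XML Example"
ET.SubElement(message, "text").text = "This is the message body."

# Serialize to plain text, ready to send over a network or store.
xml_text = ET.tostring(message, encoding="unicode")

# Because the result is text, it can be parsed straight back.
round_trip = ET.fromstring(xml_text)

print(xml_text)
```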

Page 11: Extensible Markup Language (XML)

Data Identification

• Since the tags describe the structure of the data, it makes the same data more usable by multiple applications
  – looking at the previous example:
    • it is easily searchable by a search program
    • easily displayable by a viewer
    • easy to store in a database
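A brief sketch of the "easily searchable" point: because the tags and attributes name the data, a program can select messages by recipient without knowing anything else about the layout. The example uses Python's standard-library xml.etree.ElementTree, which supports a limited subset of XPath. The inbox wrapper element and its contents are invented for the illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical collection of messages in the style of the earlier example.
inbox = ET.fromstring("""<inbox>
  <message to="alice@example.com"><subject>Lunch</subject></message>
  <message to="bob@example.com"><subject>XML homework</subject></message>
  <message to="alice@example.com"><subject>Another XML Example</subject></message>
</inbox>""")

# Select every <message> child whose "to" attribute matches, then read its subject.
subjects_for_alice = [
    m.findtext("subject")
    for m in inbox.findall("message[@to='alice@example.com']")
]

print(subjects_for_alice)
```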

Page 12: Extensible Markup Language (XML)

Stylability

• For applications where rendering is important (word processors, browsers, publishing) use Extensible Stylesheet Language (XSL)
  – XSLT
  – XSL-FO
  – XPath

Page 13: Extensible Markup Language (XML)

Inline reusability

• Unlike HTML, XML documents can include other inline documents
  – this allows the construction of very complex objects from:
    • other simpler objects
    • other hosts
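The slides do not say which inclusion mechanism is meant. One possibility is XInclude, which is an assumption made for this sketch. It uses Python's standard-library xml.etree.ElementInclude with a hypothetical in-memory loader, so signature.xml is a dictionary key here rather than a real file or host.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Stand-in for other files or hosts: a name mapped to XML text (hypothetical).
PARTS = {"signature.xml": "<signature>Sent from CS422</signature>"}


def loader(href, parse, encoding=None):
    """Resolve an include reference from PARTS instead of the file system."""
    if parse == "xml":
        return ET.fromstring(PARTS[href])
    return PARTS[href]


doc = ET.fromstring(
    '<message xmlns:xi="http://www.w3.org/2001/XInclude">'
    "<text>This is the message body.</text>"
    '<xi:include href="signature.xml"/>'
    "</message>"
)

# Replace each xi:include element, in place, with the document it points to.
ElementInclude.include(doc, loader=loader)

tags = [child.tag for child in doc]
signature = doc.findtext("signature")
print(tags, signature)
```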

Page 14: Extensible Markup Language (XML)

XML Parsers

• To make the data from an XML document useful it must be parsed out of the document. This can be done in two common ways
  – SAX (Simple API for XML)
    • event-driven API, originally for Java, that parses XML and delivers the data as the tags are encountered
  – DOM (Document Object Model)
    • as an XML or XHTML document is loaded into the browser it is parsed into a document tree and then made available to JavaScript for processing

• More on DOM and SAX later in the course
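The difference between the two parsing styles can be sketched with Python's standard library, which ships ports of both APIs (xml.sax and xml.dom.minidom). The slides describe the Java and browser versions, so the Python modules and the TagLogger class name are assumptions made for this illustration.

```python
import xml.sax
from xml.dom import minidom

DOC = (
    "<message><subject>Another XML Example</subject>"
    "<text>This is the message body.</text></message>"
)


class TagLogger(xml.sax.ContentHandler):
    """SAX style: the parser calls back as each tag is encountered."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))


handler = TagLogger()
xml.sax.parseString(DOC.encode("utf-8"), handler)

# DOM style: the whole document is parsed into a tree first, then queried.
dom = minidom.parseString(DOC)
subject = dom.getElementsByTagName("subject")[0].firstChild.data

print(handler.events)
print(subject)
```

SAX never holds the whole document in memory, which suits large inputs. DOM keeps the full tree, which suits random access and modification.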