XML – Extensible Markup Language Sivakumar Kuttuva & Janusz Zalewski



What is XML?

Extensible Markup Language (XML) is a universal standard for electronic data exchange.

It provides a method of creating and using tags to identify the structure and contents of a document, independent of its formatting.

What XML looks like

<?xml version="1.0"?>
<Course>                              <!-- root element -->
  <Name>Java Programming</Name>       <!-- element: course name -->
  <Department>EECS</Department>       <!-- element: department -->
  <Teacher>
    <Name>Paul</Name>
  </Teacher>
  <Student>
    <Name>Ron</Name>
  </Student>
  <Student>
    <Name>Uma</Name>
  </Student>
  <Student>
    <Name>Lindsay</Name>
  </Student>
</Course>
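To make the structure of the Course document concrete, here is a minimal sketch that reads it with Python's standard `xml.etree.ElementTree` module. The function name `summarize_course` and the dictionary keys are illustrative choices, not part of the original slides.

```python
import xml.etree.ElementTree as ET

# The Course document from the slide, kept as a string for the example.
COURSE_XML = """<?xml version="1.0"?>
<Course>
  <Name>Java Programming</Name>
  <Department>EECS</Department>
  <Teacher><Name>Paul</Name></Teacher>
  <Student><Name>Ron</Name></Student>
  <Student><Name>Uma</Name></Student>
  <Student><Name>Lindsay</Name></Student>
</Course>"""


def summarize_course(xml_text):
    """Return the course data as a plain dictionary (hypothetical helper)."""
    root = ET.fromstring(xml_text)  # root is the <Course> element
    return {
        "course": root.findtext("Name"),            # direct child <Name>
        "department": root.findtext("Department"),
        "teacher": root.findtext("Teacher/Name"),   # <Name> nested in <Teacher>
        "students": [s.findtext("Name") for s in root.findall("Student")],
    }


summary = summarize_course(COURSE_XML)
print(summary)
```

The same tag name, Name, appears at three places in the tree; its meaning comes from the element that contains it.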

Why did XML come into existence? (1)

• Make it easier to provide metadata, that is, data about information:

  <Department>EECS</Department>
  <Teacher>
    <Name>Paul Thompson</Name>
  </Teacher>

  Here the tag names Name and Department are metadata.

• Large-scale electronic publishing requires dynamic documents without changing document formats.

• Internationalized media-independent electronic publishing.

Why did XML come into existence? (2)

• Allow industries to define platform-independent protocols for the exchange of data, especially the data of electronic commerce.

• Make it easy for people to process data using inexpensive software.

Two Types of Syntax Standards

• XML documents must meet one of two syntax standards:

– Well-formed (the basic standard): the document must meet minimum, standard criteria.

– Valid: the document must be well-formed and adhere to a DTD (Document Type Definition).

Well-Formed XML

– Well-formed criteria include:

• All elements have a start and end tag with matching capitalization.

– <B></B>

• Proper element nesting.

– <B><I></I></B>

– not <B><I></B></I>

• Attribute values are in single or double quotes.

– <book call_no="3456-34567890-3456">

• Empty elements need either an end tag or a self-closing start tag.

– <IMG></IMG> or <IMG />
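The criteria above can be tried out with any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`, which raises `ParseError` on input that is not well-formed; the helper name `is_well_formed` and the sample strings are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False on a ParseError."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# One sample per criterion from the list above.
samples = [
    "<B></B>",                   # matching start and end tag
    "<B></b>",                   # capitalization differs
    "<B><I></I></B>",            # proper nesting
    "<B><I></B></I>",            # overlapping elements
    '<book call_no="3456"/>',    # quoted attribute value
    "<book call_no=3456/>",      # unquoted attribute value
    "<IMG />",                   # self-closing empty element
    "<IMG>",                     # empty element never closed
]

results = {text: is_well_formed(text) for text in samples}
for text, ok in results.items():
    print(f"{text!r}: {'well-formed' if ok else 'NOT well-formed'}")
```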

Why Well-Formed Matters

• Guarantees the document’s syntax before sending it to an application.

• A clean syntax guarantee means less ambiguity, which results in faster processing.

• A well-formedness violation is a fatal error: the parser must stop normal processing.

Valid XML

• To be valid, a document must be well-formed and adhere to a DTD.

• A DTD example is shown below:

<!ELEMENT BOOKCATALOG (BOOK)+>
<!ELEMENT BOOK (TITLE, AUTHOR+, PUBLISHER?, PRICE?)>
<!ATTLIST BOOK ISBN CDATA #REQUIRED>
<!ATTLIST BOOK BOOKTYPE (Fiction|SciFi|Fantasy) #IMPLIED>
<!ELEMENT TITLE (#PCDATA)>
<!ELEMENT AUTHOR (LASTNAME)>
<!ELEMENT LASTNAME (#PCDATA)>
<!ELEMENT PUBLISHER (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>

Valid XML

• A DTD (Document Type Definition) specifies:

– Elements in the document.

• Author, Publisher

– Their attributes.

• For Book, ISBN and BookType are attributes

– Whether they are mandatory or optional

• A DTD effectively specifies the document’s grammatical rules.

A sample entry in the XML file adhering to the given DTD

<BOOKCATALOG>
  <BOOK ISBN="3456-34567890-3456">
    <TITLE>C++ Primer</TITLE>
    <AUTHOR>
      <LASTNAME>Tendulkar</LASTNAME>
    </AUTHOR>
    <PUBLISHER>McGraw Hill</PUBLISHER>
    <PRICE>41.99</PRICE>
  </BOOK>
</BOOKCATALOG>
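A DTD content model such as (TITLE, AUTHOR+, PUBLISHER?, PRICE?) behaves like a regular expression over the sequence of child tag names. The sketch below checks a catalog against the rules of the BOOKCATALOG DTD by hand, using only Python's standard library. It is not a DTD validator (the standard library does not include one); the function names, the `GOOD` and `BAD` inputs, and the wording of the messages are all assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

# (TITLE, AUTHOR+, PUBLISHER?, PRICE?) written as a regex over "TAG " tokens.
BOOK_MODEL = re.compile(r"TITLE (AUTHOR )+(PUBLISHER )?(PRICE )?")
BOOK_TYPES = {"Fiction", "SciFi", "Fantasy"}


def book_problems(book):
    """List the ways one <BOOK> element breaks the DTD rules."""
    problems = []
    if "ISBN" not in book.attrib:                      # ISBN is #REQUIRED
        problems.append("missing required attribute ISBN")
    booktype = book.get("BOOKTYPE")                    # BOOKTYPE is #IMPLIED
    if booktype is not None and booktype not in BOOK_TYPES:
        problems.append("BOOKTYPE must be Fiction, SciFi or Fantasy")
    child_tags = "".join(child.tag + " " for child in book)
    if not BOOK_MODEL.fullmatch(child_tags):
        problems.append("children do not match (TITLE, AUTHOR+, PUBLISHER?, PRICE?)")
    for author in book.findall("AUTHOR"):              # AUTHOR is (LASTNAME)
        if [child.tag for child in author] != ["LASTNAME"]:
            problems.append("AUTHOR must contain exactly one LASTNAME")
    return problems


def catalog_problems(xml_text):
    """Check the root element, then every BOOK inside it."""
    root = ET.fromstring(xml_text)
    if root.tag != "BOOKCATALOG":
        return ["root element must be BOOKCATALOG"]
    books = list(root)
    if not books or any(b.tag != "BOOK" for b in books):   # (BOOK)+
        return ["BOOKCATALOG must contain one or more BOOK elements only"]
    problems = []
    for book in books:
        problems.extend(book_problems(book))
    return problems


GOOD = """<BOOKCATALOG>
  <BOOK ISBN="3456-34567890-3456">
    <TITLE>C++ Primer</TITLE>
    <AUTHOR><LASTNAME>Tendulkar</LASTNAME></AUTHOR>
    <PUBLISHER>McGraw Hill</PUBLISHER>
    <PRICE>41.99</PRICE>
  </BOOK>
</BOOKCATALOG>"""

# Hypothetical invalid entry: no ISBN, PUBLISHER before TITLE, no AUTHOR.
BAD = """<BOOKCATALOG>
  <BOOK>
    <PUBLISHER>McGraw Hill</PUBLISHER>
    <TITLE>C++ Primer</TITLE>
  </BOOK>
</BOOKCATALOG>"""

print(catalog_problems(GOOD))
print(catalog_problems(BAD))
```

Both inputs are well-formed; only the first follows the grammar that the DTD defines, which is the difference between well-formed and valid.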

Why use a DTD?

• Well-formed means the document meets a minimum standard set of rules.

• A DTD lets users define their own rules and markup languages, such as WML and MAML, provided the XML content still adheres to the syntax standards.

The Components – Line 1

• <!ELEMENT BOOKCATALOG (BOOK)+>

• BOOKCATALOG is the root element.

• BOOKCATALOG can have one or more (indicated by the +) BOOK elements.

The Components – Line 2

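• <!ELEMENT BOOK (TITLE, AUTHOR+, PUBLISHER?, PRICE?)>

• Each BOOK element contains, in this order:

• One TITLE, one or more AUTHOR elements (indicated by the +), an optional PUBLISHER and an optional PRICE (indicated by the ?)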

The Components – Line 4

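• <!ATTLIST BOOK BOOKTYPE (Fiction|SciFi|Fantasy) #IMPLIED>

• Each BOOK element may have an attribute BOOKTYPE with three options (indicated by |): Fiction, SciFi and Fantasy. #IMPLIED makes the attribute optional with no default; a quoted value such as "Fiction" in its place would declare a default instead.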

The Components – Lines 5-9

• The remaining elements TITLE, LASTNAME, PUBLISHER and PRICE are #PCDATA (AUTHOR instead contains a LASTNAME element)

– Parsed character data that the processor will check for entities and markup characters

– Any < or & in data specified as PCDATA must be represented by &lt; or &amp;; > is conventionally written as &gt; as well.
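The escaping rule for PCDATA can be sketched with Python's standard `xml.sax.saxutils.escape`, which replaces &, < and > with their entity references. The title string used here is a made-up value chosen because it contains all three characters.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

# Hypothetical title containing all three markup characters.
raw_title = "C++ & Java <Primer>"

# escape() turns & into &amp;, < into &lt; and > into &gt;.
escaped_title = escape(raw_title)
print(escaped_title)

# Placing the raw text inside the element would break well-formedness.
try:
    ET.fromstring("<TITLE>" + raw_title + "</TITLE>")
    raw_is_accepted = True
except ET.ParseError:
    raw_is_accepted = False

# The escaped text parses, and the parser hands back the original characters.
parsed_title = ET.fromstring("<TITLE>" + escaped_title + "</TITLE>").text
print(raw_is_accepted, parsed_title == raw_title)
```

Libraries that build XML from a tree, such as `ElementTree` itself, perform this escaping when they serialize, so manual escaping is mainly needed when XML is assembled from strings.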

Schemas

• The next step beyond DTDs

• Come from the database world

• More powerful and extensible than DTDs, which come from the SGML world

• Schemas are XML documents, so they:

– Are extensible

– Use XML syntax unlike DTDs

– Support data types like dates, times, currencies, important in eCommerce
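Because a schema is itself an XML document, an ordinary XML parser can read it. The sketch below embeds a small W3C XML Schema fragment in a Python string, lists the declared element types with `xml.etree.ElementTree`, and then converts a sample BOOK to typed Python values. The PUBLISHED element, the sample date, and the `CONVERTERS` table are assumptions made for illustration, and the conversion only imitates what a real schema validator does (the standard library does not include one).

```python
import xml.etree.ElementTree as ET
from datetime import date
from decimal import Decimal

XS = "{http://www.w3.org/2001/XMLSchema}"  # namespace prefix used by ElementTree

# A schema fragment: unlike a DTD, it is written in XML and names data types.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="BOOK">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="TITLE" type="xs:string"/>
        <xs:element name="PUBLISHED" type="xs:date"/>
        <xs:element name="PRICE" type="xs:decimal"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

# Read the schema like any other XML document.
schema_root = ET.fromstring(SCHEMA)
declared = {
    el.get("name"): el.get("type")
    for el in schema_root.iter(XS + "element")
    if el.get("type") is not None
}
print(declared)

# Hypothetical mapping from schema types to Python conversions.
CONVERTERS = {"xs:string": str, "xs:date": date.fromisoformat, "xs:decimal": Decimal}

BOOK_XML = ("<BOOK><TITLE>C++ Primer</TITLE>"
            "<PUBLISHED>2005-02-14</PUBLISHED><PRICE>41.99</PRICE></BOOK>")

typed = {
    child.tag: CONVERTERS[declared[child.tag]](child.text)
    for child in ET.fromstring(BOOK_XML)
}
print(typed)
```

With a DTD, all three values would be plain #PCDATA; the declared types are what make a malformed date or price detectable.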

DTDs vs Schemas

• Why use schemas?

– More powerful than DTDs

– Better suited for eCommerce.

• Why use DTDs?

– Wider tool support.

– More examples available for use and reference.

• HTML, XHTML, CALS, etc.

– Greater depth of experience in the industry

– Wider pool of developers

CSS and XML

• CSS was designed for HTML but works fine under XML as well.

• Rather than create an XSL style sheet, you can create a simpler CSS and attach it to an XML document via a processing instruction like:

– <?xml-stylesheet href="mycss.css" type="text/css"?>
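The xml-stylesheet line is a processing instruction that sits before the root element. As a small sketch, Python's standard `xml.dom.minidom` exposes it as a node of the document; the catalog content and the helper name `stylesheet_instructions` are assumptions made for illustration.

```python
from xml.dom import minidom

# A document that attaches mycss.css, as in the slide above.
DOC = (
    '<?xml version="1.0"?>\n'
    '<?xml-stylesheet href="mycss.css" type="text/css"?>\n'
    "<BOOKCATALOG><BOOK><TITLE>C++ Primer</TITLE></BOOK></BOOKCATALOG>"
)


def stylesheet_instructions(xml_text):
    """Return the data of every xml-stylesheet processing instruction."""
    doc = minidom.parseString(xml_text)
    return [
        node.data
        for node in doc.childNodes  # top-level nodes: instructions and the root
        if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
        and node.target == "xml-stylesheet"
    ]


found = stylesheet_instructions(DOC)
print(found)
```

A browser reads the same instruction to decide which style sheet to fetch and apply.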

CSS and XSL

• XML uses custom tags that a browser does not know how to display

• So XML documents may display like this:

<BOOKCATALOG>
  <BOOK>
    <ISBN>3456-34567890-3456</ISBN>
    <TITLE>C++ Primer</TITLE>
    <AUTHOR_LASTNAME>Tendulkar</AUTHOR_LASTNAME>
    <PUBLISHER>McGraw Hill</PUBLISHER>
    <PRICE>41.99</PRICE>
  </BOOK>
</BOOKCATALOG>

• Legibility requires applying styles:

– CSS

– XSL

XSL (Extensible Stylesheet Language)

• XSL comes from DSSSL (Document Style Semantics and Specification Language), the SGML style language, which is based on Scheme, a LISP dialect.

Benefits of XSL

• An XSL style sheet is well-formed XML.

• Supports a style sheet DTD for validation.

• Far greater processing ability than CSS.

• XSL Transformations (XSLT) take part of an XML document and transform it into another form, such as XML to HTML.

– This is why XML appears to be the route to single-sourcing.
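Python's standard library has no XSLT processor, so the sketch below is only a stand-in for the idea of a transformation: it selects part of an XML catalog and emits an HTML table, written by hand with `xml.etree.ElementTree`. It is not XSLT, and the function name `catalog_to_html` and the choice of columns are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

CATALOG = """<BOOKCATALOG>
  <BOOK>
    <TITLE>C++ Primer</TITLE>
    <PUBLISHER>McGraw Hill</PUBLISHER>
    <PRICE>41.99</PRICE>
  </BOOK>
</BOOKCATALOG>"""


def catalog_to_html(xml_text):
    """Build one HTML table row per BOOK, one cell per selected element."""
    catalog = ET.fromstring(xml_text)
    table = ET.Element("table")
    for book in catalog.findall("BOOK"):
        row = ET.SubElement(table, "tr")
        for tag in ("TITLE", "PUBLISHER", "PRICE"):   # the part being transformed
            cell = ET.SubElement(row, "td")
            cell.text = book.findtext(tag, default="")
    return ET.tostring(table, encoding="unicode")


html = catalog_to_html(CATALOG)
print(html)
```

An XSLT style sheet expresses the same mapping declaratively, as templates matched against the source tree, so one XML source can feed several output formats.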

Advanced Features of XML

• XLink

• XPointer

• Parsing XML with DOM (Document Object Model)

• XPath
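Two of these features can be tried with Python's standard library: `xml.dom.minidom` offers a DOM interface, and `xml.etree.ElementTree` accepts a limited subset of XPath in `findall`. The sketch below reuses a Course document as input; XLink and XPointer are not covered, and the variable names are illustrative.

```python
from xml.dom import minidom
import xml.etree.ElementTree as ET

COURSE_XML = """<Course>
  <Name>Java Programming</Name>
  <Teacher><Name>Paul</Name></Teacher>
  <Student><Name>Ron</Name></Student>
  <Student><Name>Uma</Name></Student>
  <Student><Name>Lindsay</Name></Student>
</Course>"""

# DOM: the whole document becomes a tree of node objects.
dom = minidom.parseString(COURSE_XML)
all_names = [node.firstChild.data for node in dom.getElementsByTagName("Name")]
print(all_names)  # every <Name>, in document order

# XPath (subset): a path expression selects nodes by their position in the tree.
root = ET.fromstring(COURSE_XML)
student_names = [el.text for el in root.findall("./Student/Name")]
print(student_names)  # only the <Name> elements inside <Student>

# A predicate narrows the selection to students whose Name is "Uma".
uma = root.findall("./Student[Name='Uma']")
print(len(uma))
```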

XML Applications

• Applications that require the Web client to mediate between two or more heterogeneous databases, such as an information tracking system for a home health care agency.

• Applications that attempt to distribute a significant proportion of the processing load from the Web server to the Web client, such as a technical data delivery system for a wide range of products.

• Applications that require the Web client to present different views of the same data to different users.

• Applications in which intelligent Web agents attempt to tailor information discovery to the needs of individual users.

Future Demands of XML

• Intelligent Web agents will create demand for structured data.

• User preferences must be represented in a standard way to mass media providers.