Foundation Course on XML - Ravindra Godbole

Page 1: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Foundation Course on XML

- Ravindra Godbole

Page 2: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

XML

Extensible Markup Language

Page 3: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language
Page 4: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Any comments?

Everyday Impact of XML ...

Page 5: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Jack and Jill

Jack and Jill
Went up the hill
To fetch a pail of water

Jack fell down and broke his crown
Jill came tumbling after

Anonymous

Page 6: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

What was that?

• It is a poem.
• It has a title.
• It has stanzas.
• It is written by somebody (anonymous).

This interpretation, though possible for humans, is not possible for computers.

Page 7: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Poem using Markup Language

<poem>
  <title>Jack and Jill</title>
  <stanza>
    Jack and Jill
    Went up the hill
    To fetch a pail of water
  </stanza>
  <stanza>
    Jack fell down and broke his crown
    Jill came tumbling after
  </stanza>
  <author>Anonymous</author>
</poem>
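Once the poem is marked up, a program can pick out the parts that a human reader recognises at a glance. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the function name describe_poem is made up for this illustration and is not part of the course material.

```python
import xml.etree.ElementTree as ET

# The poem from the slide, marked up so a program can tell its parts apart.
POEM_XML = """
<poem>
  <title>Jack and Jill</title>
  <stanza>Jack and Jill
Went up the hill
To fetch a pail of water</stanza>
  <stanza>Jack fell down and broke his crown
Jill came tumbling after</stanza>
  <author>Anonymous</author>
</poem>
"""


def describe_poem(xml_text):
    """Return (title, number of stanzas, author) read from the markup."""
    poem = ET.fromstring(xml_text.strip())
    title = poem.findtext("title").strip()
    stanzas = poem.findall("stanza")
    author = poem.findtext("author").strip()
    return title, len(stanzas), author


title, stanza_count, author = describe_poem(POEM_XML)
print(title, stanza_count, author)
```

The program never guesses which line is the title or who the author is; it simply asks for the elements by name.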

Page 8: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

After this presentation you will ...

• Know XML concepts
• Be able to write XML using a simple editor
• Be able to write a DTD for an XML document
• Learn the use of XSL
• Know what the Document Object Model is
• Enjoy learning more about XML technologies

Page 9: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Pre-requisites

Knowledge of the following will be helpful, but it is not mandatory.

• OOP
• HTML
• Java

Page 10: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

We will cover ...

• Introduction to XML
• Use of XML
• XML Syntax
• DTD
• XSL
• Document Object Model

Page 11: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

HTML and XML

HTML only addresses the presentation of data.

XML takes this one step further by addressing the context or meaning of the data.

Example

Using XML, the word “bill” can be tagged as a name, a charge, a paper currency, a proposed law, or the mouth of a bird.

Any other examples?

Page 12: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

XML is a simple data format that balances the needs of people to read/write data with the needs of machines to read/write data.

Page 13: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Introduction to XML

What is XML ?

XML is a text-based meta-language. It is used to define other markup languages.

Examples of markup languages defined with XML

• MathML
• WML

Page 14: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

XML is not ..

A Programming Language

Page 15: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Introduction to XML….

XML is a method for putting structured data in a text file.

Here is an example of an XML file, book.xml:

<?xml version="1.0"?>
<booklist>
  <book>
    <name>XML Specification</name>
    <author>Ian Grahm</author>
  </book>
  <book>
    <name>A Mythical Man Month</name>
    <author>Fredrick Brooks Jr.</author>
  </book>
  <book>
    <name>More Effective C++</name>
    <author>Scott Mayers</author>
  </book>
</booklist>
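To see what "structured data in a text file" buys you, here is a small sketch that reads the book list back out with Python's standard xml.etree.ElementTree module. The helper name list_books is an assumption made for this example; the XML is the book.xml content from the slide, held in a string so the sketch is self-contained.

```python
import xml.etree.ElementTree as ET

# Same content as book.xml on the slide, kept in a string for the example.
BOOK_XML = """<?xml version="1.0"?>
<booklist>
  <book>
    <name>XML Specification</name>
    <author>Ian Grahm</author>
  </book>
  <book>
    <name>A Mythical Man Month</name>
    <author>Fredrick Brooks Jr.</author>
  </book>
  <book>
    <name>More Effective C++</name>
    <author>Scott Mayers</author>
  </book>
</booklist>
"""


def list_books(xml_text):
    """Return a list of (name, author) pairs, one per <book> element."""
    root = ET.fromstring(xml_text)
    return [(book.findtext("name"), book.findtext("author"))
            for book in root.findall("book")]


for name, author in list_books(BOOK_XML):
    print(name, "by", author)
```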

Page 16: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Introduction to XML….

XML looks a bit like HTML but isn't HTML

XML uses tags (words delimited by < and >) and attributes, just like HTML. But HTML uses these tags for displaying information in the browser.

XML uses these tags only to delimit pieces of data, and leaves the interpretation of the data completely to the application that reads it.

So XML is not HTML.

Page 17: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Introduction to XML….

XML is text, but isn't meant to be read

• XML files are text files, but they should be edited by hand only in emergencies.

• A forgotten tag, or an attribute without quotes makes the XML file unusable.

Page 18: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Introduction to XML….

XML is verbose, but that is not a problem

Because XML is a text format that uses tags to delimit the data, XML files are nearly always larger than comparable binary formats.

This is not an issue nowadays, as disk space is not expensive. Also, lots of compression utilities are widely available, and modern protocols like HTTP/1.1 can compress data on the fly.

Page 19: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Introduction to XML….

XML is license-free, platform-independent and well-supported

Many new tools and technologies are available to support XML activity.

You can build your software around it without paying anybody anything.

You are not tied to a single vendor.

Page 20: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

XML Users

• Microsoft
• IBM
• Netscape
• Sun Microsystems
• Adobe
• Corel
• Hewlett-Packard

Page 21: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Introduction to XML….

XML Resources on the web

• http://xml.com/
• http://www.ibm.com/xml/
• http://www.sun.com/xml/
• http://www.w3.org/

Page 22: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Use of XML

Use of XML

Page 23: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Use of XML

XML-Enabled Technologies

• Internet Search Engines
• Electronic Commerce
• Electronic Data Interchange (EDI)
• Data Repurposing
• Content Personalization

and many more ...

Page 24: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Use of XML

Internet Search Engines

With XML, data can be tagged properly. This enables search engines to retrieve exact information.

Searching for information about the Java programming language would no longer yield links to coffee sites or the island of Java. This is because searching for the term “Java” is narrowed down to those fields tagged as a “programming language”.

When searching for information on a subject that is contained in a single chapter or even a single page within a book, XML enables you to retrieve only that chapter or page, while HTML currently gives you the entire book.

Page 25: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Use of XML

Electronic Data Interchange

By leveraging XML, applications can easily broker information between themselves. Mapping data from one company’s purchasing system to another company’s inventory is just a matter of understanding the XML tags on the data.

Page 26: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Data Exchange (diagram)

Existing: A, B, C, D

Using XML: A, B, C, D, XML

Page 27: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Use of XML

Electronic Commerce

With XML repository technology, on-line stores can present product information in a standard, structured format, independent of page design.

By reducing the time needed to locate a product, a price, or any other relevant information on the Internet, XML repositories will play an important role in making on-line shopping more efficient and enjoyable.

Page 28: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Use of XML

Data Repurposing

By breaking documents into discrete elements, it becomes very easy for individuals to extract the truly relevant information from several sources.

They can reassemble it into any format (e.g. web page, document, presentation, whatever).

Page 29: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

XML Document A, XML Document B

New Document

Page 30: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Use of XML

Content Personalization

Using XML, you could create a very sophisticated personal news filter that spans multiple sites or the entire Internet. The XML repository would provide the date stamp, enabling agents or search engines to filter the information to extract only the “new” information.

Page 31: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Multiple formatting of XML

XML Document → Formatter → Online Help
XML Document → Formatter → HTML
XML Document → Formatter → Braille
XML Document → Formatter → Plain Text

Page 32: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Building XML Documents

Building XML Documents

Page 33: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Logical and Physical Views of XML document

Logical View: booklist; publisher, name; book, book, book, book; author; location

Physical View: Book.xml; Book1.xml, Book2.xml; Name.xml, Author.xml

Page 34: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Building XML Documents

An XML file should always start with a prolog.

The minimum prolog contains a declaration that identifies the document as an XML document, like this:

<?xml version="1.0"?>

• <?xml ... ?> is the declaration.
• version identifies the version of the XML markup language used. This attribute is mandatory.

The declaration might contain additional information, which we will study later.

Page 35: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

XML Keywords

• ELEMENT
• DOCTYPE
• ATTLIST
• NOTATION
• ID
• IDREF

Page 36: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

XML (topic map)

Document
• Well Formed XML
• Valid XML
• Character Data and Markup
• Comments
• Processing Instructions
• CDATA Sections
• White Space Handling

Logical Structure
• Start, End, Empty Tags
• Element Type Declarations
• Attribute List Declaration
• Conditional Sections

Physical Structure
• Character References
• Entity References
• Entity Declarations
• Parsed Entities
• Processing Instructions
• Notation Declaration

Page 37: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Type of XML Document

1. Well-formed

Well-formed documents conform to XML syntax. They contain text and XML tags, and everything is entered correctly. They need not, however, refer to a DTD.

2. Valid

Valid documents not only conform to XML syntax but are also error-checked against a Document Type Definition (DTD).

Page 38: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Well Formed XML Document

• Has one or more elements
• Has exactly one element called the root or document element
• Meets all the well-formedness requirements of the XML specification

Page 39: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Building XML Documents - Example

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<booklist>
  <book>
    <name type="spec">XML Specification</name>
    <author>Ian Grahm</author>
  </book>
  <book>
    <name>A Mythical Man Month</name>
    <author>Fredrick Brooks Jr.</author>
  </book>
  <listbreak/>
  <book>
    <name>More Effective C++</name>
    <author>Scott Mayers</author>
  </book>
</booklist>

This is a well-formed XML document.

Page 40: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Building XML Documents

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<booklist>
  <book>
    <name type="spec">XML Specification<author>
    </name>Ian Grahm</author>
  </book>
  <book>
    <name>A Mythical Man Month</Name>
    <author>Fredrick Brooks Jr.</author>
  </book>
  <listbreak/>
  <book>
    <name>More Effective C++</name>
    <author>Scott Mayers</author>
  </book>
</booklist>

This is NOT a well-formed XML document. The tags do not nest properly (name and author overlap in the first book). Also, the case does not match (<name> is closed with </Name> in the second book).
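A parser reports both kinds of mistake as errors. The sketch below, assuming Python's standard xml.etree.ElementTree module, wraps the parse call in a helper (the name is_well_formed is invented for this example) and tries it on a correct fragment, an overlapping fragment and a case-mismatched fragment taken from the slide.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Tags match and nest properly.
GOOD = "<book><name>More Effective C++</name><author>Scott Mayers</author></book>"

# <author> opens inside <name> but closes outside it, so the tags overlap.
OVERLAPPING = "<book><name>XML Specification<author></name>Ian Grahm</author></book>"

# XML is case sensitive, so </Name> does not close <name>.
CASE_MISMATCH = "<book><name>A Mythical Man Month</Name></book>"

for label, text in [("good", GOOD),
                    ("overlapping", OVERLAPPING),
                    ("case mismatch", CASE_MISMATCH)]:
    print(label, is_well_formed(text))
```

Note that this only checks well-formedness. ElementTree does not validate a document against a DTD.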

Page 41: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Valid XML Document

• Is well formed
• Meets all the requirements specified in the Document Type Definition (DTD)

(Diagram: Valid XML shown inside Well Formed XML.)

Page 42: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Character data

• All text that is not markup constitutes the character data of the document.
• In the content of elements, character data is any string of characters which does not contain the start-delimiter of any markup.
• In a CDATA section, character data is any string of characters not including the CDATA-section-close delimiter, "]]>".

Page 43: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Markup

Markup takes the form of

• start tags
• end tags
• empty element tags
• entity references
• character references
• comments
• CDATA sections
• processing instructions

Page 44: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Building XML Documents - Example

<?xml version="1.0"?>
<booklist>
  <book>
    <name>XML Specification</name>
    <author>Ian Grahm</author>
  </book>
  <book>
    <name>A Mythical Man Month</name>
    <author>Fredrick Brooks Jr.</author>
  </book>
  <book>
    <name>More Effective C++</name>
    <author>Scott Mayers</author>
  </book>
</booklist>

A start tag denotes the beginning of an element.

Page 45: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Building XML Documents - Example

<?xml version="1.0"?>
<booklist>
  <book>
    <name>XML Specification</name>
    <author>Ian Grahm</author>
  </book>
  <book>
    <name>A Mythical Man Month</name>
    <author>Fredrick Brooks Jr.</author>
  </book>
  <book>
    <name>More Effective C++</name>
    <author>Scott Mayers</author>
  </book>
</booklist>

An end tag denotes the end of an element.

Page 46: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Building XML Documents - Example

<?xml version="1.0"?>
<booklist>
  <book>
    <name>XML Specification</name>
    <author>Ian Grahm</author>
  </book>
  <book>
    <name>A Mythical Man Month</name>
    <author>Fredrick Brooks Jr.</author>
  </book>
  <book>
    <name>More Effective C++</name>
    <author>Scott Mayers</author>
  </book>
</booklist>

Start and end tags must match and nest properly; they cannot overlap.

Page 47: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Building XML Documents - Example

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<booklist>
  <book>
    <name type="spec">XML Specification</name>
    <author>Ian Grahm</author>
  </book>
  <book>
    <name>A Mythical Man Month</name>
    <author>Fredrick Brooks Jr.</author>
  </book>
  <listbreak/>
  <book>
    <name>More Effective C++</name>
    <author>Scott Mayers</author>
  </book>
</booklist>

An empty-element tag such as <listbreak/> cannot contain any other markup or text.

Page 48: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Comments

• Comments may appear anywhere in the document outside other markup.
• They are not part of the document’s character data.

Page 49: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Building XML Documents - Example

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<booklist>
  <book>
    <name type="spec">XML Specification</name>
    <author>Ian Grahm</author>
  </book>
  <book>
    <name>A Mythical Man Month</name>
    <author>Fredrick Brooks Jr.</author>
  </book>
  <listbreak/>
  <!-- This book is good for C++ programmers -->
  <book>
    <name>More Effective C++</name>
    <author>Scott Mayers</author>
  </book>
</booklist>

XML comments are entered like this: a comment starts with <!-- and ends with -->.

Page 50: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Processing Instructions

• Processing instructions allow a document to contain instructions for applications.
• The target names "XML", "xml", and so on are reserved for standardization.

<?target instructions?>

• target is the name of the application that is expected to do the processing.
• instructions is a string of characters that embodies the information or commands for the application to process.

Page 51: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Writing XML - Processing Instructions

Processing Instructions Example

<?xml-stylesheet type="text/xsl" href="cd_catalog.xsl"?>
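A processing instruction reaches the application as a target plus an uninterpreted string. The following sketch uses Python's standard xml.dom.minidom module to show the two parts; the root element name catalog and the helper name processing_instructions are assumptions made for the example.

```python
from xml.dom import Node, minidom

# The stylesheet instruction from the slide, followed by a hypothetical
# empty root element so that the text is a complete document.
DOC = '<?xml-stylesheet type="text/xsl" href="cd_catalog.xsl"?><catalog/>'


def processing_instructions(xml_text):
    """Return (target, instructions) for each top-level processing instruction."""
    document = minidom.parseString(xml_text)
    return [(node.target, node.data)
            for node in document.childNodes
            if node.nodeType == Node.PROCESSING_INSTRUCTION_NODE]


for target, instructions in processing_instructions(DOC):
    print("target:", target)
    print("instructions:", instructions)
```

The parser does not split the instructions into attributes. Interpreting the string is left to the application named by the target.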

Page 52: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Writing XML - CDATA

CDATA sections are used to display markup without the XML processor trying to interpret that markup. They are particularly useful when you want to display sections of XML code.

<![CDATA[<greeting>Hello, world!</greeting>]]>
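The effect of a CDATA section is easy to check from a program. In this sketch, which assumes Python's standard xml.etree.ElementTree module and a hypothetical wrapper element named example, the greeting markup arrives as plain text and no child element is created.

```python
import xml.etree.ElementTree as ET

# The CDATA section from the slide, wrapped in a hypothetical <example> element.
DOC = "<example><![CDATA[<greeting>Hello, world!</greeting>]]></example>"

example = ET.fromstring(DOC)

# The markup inside the CDATA section is delivered as character data.
text = example.text

# No <greeting> element was created, so <example> has no children.
child_count = len(example)

print(text)
print(child_count)
```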

Page 53: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

White Space Handling

• White space is sometimes redundant.
• It may be needed in content such as poetry or source code.
• A special attribute, xml:space, indicates the intention of the data regarding white space.

<!ATTLIST poem xml:space (default|preserve) 'preserve'>

This applies to all elements within the content of the element, unless overridden by another instance of the xml:space attribute.

Page 54: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Building XML Documents

DTD - Document Type Definition

The DTD defines the elements, attributes, and relationships between elements for an XML document.

A DTD is a way to check that the document is structured correctly, but the presence of a DTD in a document is optional.

Here, the file address.dtd contains all the rules and is referenced from address.xml through a document type declaration, as follows:

<!DOCTYPE addressbook SYSTEM "address.dtd">

Page 55: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

XML and DTD

XML + DTD → XML Parser → Valid XML (yes) or Invalid XML (no)

XML Parsers

• MSXML
• AlphaWorks
• XP

Page 56: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

XML - Writing DTD

A document is compared against its associated DTD to check its correctness.

This process is called validation and is performed by a tool called a parser.

Page 57: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

XML - Writing DTD

Need for DTD

• All documents in a group follow the same set of rules.
• Ensure that all the data required is present.
• Need to match industry-specific standards.
• Error-check the document for accuracy of tag usage.

Page 58: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

XML - Writing DTD

Deciding on DTDs

• Share a DTD
• Create your own DTD
• Make an Internal DTD

Page 59: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

XML - Writing DTD

Each statement in a DTD uses the <!XML DTD> syntax. This syntax begins each instruction with a left angle bracket and an exclamation point, and ends it with a right angle bracket.

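Our outermost tag is booklist. Element names are case sensitive, so the declaration uses the same lowercase names as the document.

<!ELEMENT booklist (book)+>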

Page 61: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Building XML Documents

Tag

You use a tag to identify a piece of data by element name.

Tags usually appear in pairs, surrounding the data. The opening tag contains the element name. The closing tag contains a slash and the element's name, like this:

<name>Effective C++</name>

Page 62: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Writing XML - Element Type Declaration

Element Type Declaration

Element type declarations set the rules for the type and number of elements that may appear in an XML document, what elements may appear inside each other, and what order they must appear in.

Page 63: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Writing XML - Element Type Declaration

<!ELEMENT parent_name (child_name)>

<!ELEMENT child_name allowable content>

<?xml version="1.0"?>
<!DOCTYPE student [
  <!--'student' must have one child element type 'id'-->
  <!ELEMENT student (id)>
  <!--'id' may only contain text that is not markup in its content-->
  <!ELEMENT id (#PCDATA)>
]>
<student>
  <id>9216735</id>
</student>

Page 64: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Writing XML - Element Type Declaration

Mixed content

Mixed content is used to declare elements that contain a mixture of child elements and text.

<?xml version="1.0"?>
<!DOCTYPE student [
  <!ELEMENT student (#PCDATA|id)*>
  <!ELEMENT id (#PCDATA)>
]>
<student>
  Here's a bit of text mixed up with the child element.
  <id>9216735</id>
  You can put text anywhere, before or after the child element.
  You don't even have to include the 'id' element.
</student>

Page 66: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Building XML Documents

Attribute

Attributes are like adjectives, in that they further describe elements. Each attribute has a name and a value.

Attributes are entered as part of the tag, like this:

<name number="1874">XML Specification</name>

Here, number is an attribute of name.
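An application reads an attribute by name from the element that carries it. This is a minimal sketch using Python's standard xml.etree.ElementTree module with the element from the slide; the variable names are chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# The element from the slide: "number" is an attribute of "name".
name = ET.fromstring('<name number="1874">XML Specification</name>')

# Attribute values are always strings, even when they look like numbers.
number = name.get("number")

# All attributes of the element are available as a dictionary.
attributes = dict(name.attrib)

# The element content is separate from its attributes.
content = name.text

print(number, attributes, content)
```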

Page 67: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Building XML Documents

Types of Attribute

• STRING
• ENUMERATED
• ID
• IDREF/IDREFS
• ENTITY/ENTITIES
• NMTOKEN
• NOTATION

Page 68: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Building XML Documents - Attribute types

String type

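Indicated by the keyword CDATA.

Any string of valid XML characters is allowed, with these restrictions:

• It cannot contain the quotation mark used to delimit the value.
• It cannot contain a literal < or & (use &lt; and &amp; instead).

<!ATTLIST menu
  date CDATA #REQUIRED
  time CDATA #IMPLIED>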

Page 69: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Building XML Documents - Attribute types

Enumerated-List type

List of possible values which attribute can take.

<!ATTLIST item type (appetizer | entrée | dessert) "entrée">

These definitions can also use the #FIXED, #IMPLIED or #REQUIRED default declarations.

Page 70: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

NOTATION

Notations identify by name the format of unparsed entities, the format of elements which bear a NOTATION attribute, or the application to which a processing instruction is addressed.

Page 71: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Building XML Documents - Attribute types

Notation Attributes

These are enumerated attributes whose allowed values must each be one of the declared notations.

<!ATTLIST frogs
  imgtype NOTATION (gif|jpeg|pict) "gif"
  encoding NOTATION (uuencode|base64) "base64">

<frogs imgtype="gif" encoding="base64">

Page 72: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Building XML Documents - Attribute types

ID Attribute Type

Designed for labeling and referencing elements in XML.

<!ATTLIST item ref ID #REQUIRED>

<item ref="newItem">abcd….</item>

The value newItem must be unique within the document.

Page 73: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Building XML Documents - Attribute types

IDREF and IDREFS Attribute type

Used to reference elements labeled by ID attributes.

<!ATTLIST itemsize
  ref-to IDREF #IMPLIED
  topref IDREF #REQUIRED>

<itemsize ref-to="xs-089" topref="dds-ss">this is an item</itemsize>

Every referenced ID must be present within the document.

Page 74: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Building XML Documents - Attribute types

IDREFS is like IDREF, except that it can take more than one reference.

<!ATTLIST itemizers topref IDREFS #FIXED "ref1 ref2 ref43">

IDREF(S) can be used to link one section of a document with another, such as a footnote or glossary entry.

Page 75: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Conditional Sections

Conditional sections are portions of the document type declaration external subset which are included in, or excluded from, the logical structure of the DTD based on the keyword which governs them.

Page 76: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Example of Conditional Sections

<!ENTITY % draft 'INCLUDE' >
<!ENTITY % final 'IGNORE' >

<![%draft;[
<!ELEMENT book (comments*, title, body, supplements?)>
]]>
<![%final;[
<!ELEMENT book (title, body, supplements?)>
]]>

Page 77: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

XML Document - Example

<?xml version="1.0"?>
<booklist>
  <book>
    <name>XML Specification</name>
    <author>Ian Grahm</author>
  </book>
  <book>
    <name>A Mythical Man Month</name>
    <author>Fredrick Brooks Jr.</author>
  </book>
  <book>
    <name>More Effective C++</name>
    <author>Scott Mayers</author>
  </book>
</booklist>

Page 78: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Can you write a simple XML document?

1. Building
2. Employee
3. Computer
4. Car

Page 80: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Physical Structure

• An XML document may contain one or more storage units called entities.
• Entities may be either parsed or unparsed.

Page 81: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

XML - Writing DTD

Entity Declarations

Entities reference data that acts as an abbreviation or can be found at an external location.

• Entities reduce the entry of repetitive information.
• Entities make editing easier.

Types of Entities

• General or Parameter
• Parsed or Unparsed
• Internal or External

Page 82: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

XML - Writing DTD - Entity Declaration

General Entities - Internal Parsed Entities

These entities refer to data which the XML processor has to parse.

<!ENTITY name "entity_value">

<?xml version="1.0" standalone="yes" ?>
<!DOCTYPE author [
  <!ELEMENT author (#PCDATA)>
  <!ENTITY js "Jo Smith">
]>
<author>&js;</author>
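The replacement happens inside the parser, so the application only ever sees the expanded text. The sketch below assumes Python's standard xml.etree.ElementTree module, which expands general entities declared in the internal subset; the document is the one from the slide.

```python
import xml.etree.ElementTree as ET

# The document from the slide: js is an internal parsed general entity.
DOC = """<?xml version="1.0" standalone="yes" ?>
<!DOCTYPE author [
  <!ELEMENT author (#PCDATA)>
  <!ENTITY js "Jo Smith">
]>
<author>&js;</author>"""

author = ET.fromstring(DOC)

# The parser has already replaced &js; with the entity value.
author_text = author.text

print(author_text)
```

Because entity expansion is performed automatically, documents from untrusted sources deserve some care; this sketch assumes trusted input.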

Page 83: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

XML - Writing DTD - Entity Declaration

External Entities

External entities are useful for creating a common reference that can be shared between multiple documents.

<!ENTITY name SYSTEM "URI">

<?xml version="1.0" standalone="no" ?>
<!DOCTYPE copyright [
  <!ELEMENT copyright (#PCDATA)>
  <!ENTITY c SYSTEM "http://www.xmlwriter.net/copyright.xml">
]>
<copyright>&c;</copyright>

Page 84: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

XML - Writing DTD - Entity Declaration

External UnParsed Entities

External unparsed entities generally reference non-XML data. More precisely, they refer to data that an XML processor does not have to parse.

<!ENTITY name SYSTEM "URI" NDATA name>

<?xml version="1.0" standalone="no" ?>
<!DOCTYPE img [
  <!ELEMENT img EMPTY>
  <!ATTLIST img src ENTITY #REQUIRED>
  <!ENTITY logo SYSTEM "http://www.xmlwriter.net/logo.gif" NDATA gif>
  <!NOTATION gif PUBLIC "gif viewer">
]>
<img src="logo"/>

Page 85: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

XML - Writing DTD - Entity Declaration

Using Entities within Entities

<?xml version="1.0"?>
<!DOCTYPE author [
  <!ELEMENT author (#PCDATA)>
  <!ENTITY email "[email protected]">
  <!--the following use of a general entity is legal if it is used in the XML document-->
  <!ENTITY js "Jo Smith &email;">
]>
<author>&js;</author>

Page 86: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

XML - Writing DTD - Entity Declaration

Predefined Entities

Predefined entity    How to declare it in a DTD

&lt;                 <!ENTITY lt "&#38;#60;">
&gt;                 <!ENTITY gt "&#62;">
&amp;                <!ENTITY amp "&#38;#38;">
&apos;               <!ENTITY apos "&#39;">
&quot;               <!ENTITY quot "&#34;">
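In practice the five predefined entities work without any declaration. As a small sketch, assuming Python's standard library, the parser turns the references back into characters on the way in, and xml.sax.saxutils.escape produces references for <, > and & on the way out. The element name note and the sample strings are invented for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Reading: all five predefined entity references are resolved by the parser.
note = ET.fromstring("<note>&lt;b&gt; &amp; &apos;x&apos; &quot;y&quot;</note>")
decoded = note.text

# Writing: escape() replaces &, < and > so the text is safe as element content.
escaped = escape("1 < 2 & 3 > 2")

print(decoded)
print(escaped)
```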

Page 87: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

XML - Writing DTD - Entity Declaration

Parameter Entities

• Internal (Parsed)
• External (Parsed)

Internal parameter entity references are used to declare entities existing only in the DTD.

<!ENTITY % name "entity_value">

<!--external DTD example-->
<!ENTITY % p "(#PCDATA)">
<!ELEMENT student (id,surname,firstname,dob,(subject)*)>
<!ELEMENT id %p;>
<!ELEMENT surname %p;>
<!ELEMENT firstname %p;>
<!ELEMENT dob %p;>
<!ELEMENT subject %p;>

Page 88: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

XML - Writing DTD - Entity Declaration

External Parameter Entities

External parameter entity references are used to link external DTDs.

<?xml version="1.0" standalone="no"?>
<!DOCTYPE student [
  <!ENTITY % student SYSTEM "http://www.university.com/student.dtd">
  %student;
]>

Page 89: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

XML revisited...

Page 90: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Building XML Documents

<?xml version="1.0"?>
<booklist>
  <book>
    <name>XML Specification</name>
    <author>Ian Grahm</author>
  </book>
  <book>
    <name>A Mythical Man Month</name>
    <author>Fredrick Brooks Jr.</author>
  </book>
  <book>
    <name>More Effective C++</name>
    <author>Scott Mayers</author>
  </book>
</booklist>

Page 91: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Building XML Documents - Example

<?xml version="1.0"?>
<booklist>
  <book>
    <name>XML Specification</name>
    <author>Ian Grahm</author>
  </book>
  <book>
    <name>A Mythical Man Month</name>
    <author>Fredrick Brooks Jr.</author>
  </book>
  <book>
    <name>More Effective C++</name>
    <author>Scott Mayers</author>
  </book>
</booklist>

The XML declaration says that the data that follows is an XML document.

Page 92: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Building XML Documents - Example

<?xml version="1.0"?>
<booklist>
  <book>
    <name>XML Specification</name>
    <author>Ian Grahm</author>
  </book>
  <book>
    <name>A Mythical Man Month</name>
    <author>Fredrick Brooks Jr.</author>
  </book>
  <book>
    <name>More Effective C++</name>
    <author>Scott Mayers</author>
  </book>
</booklist>

The top-level element, booklist, is called the root element. It contains all other elements. There is only one root element per document.

Page 93: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Building XML Documents - Example

<?xml version="1.0"?>
<booklist>
  <book>
    <name>XML Specification</name>
    <author>Ian Grahm</author>
  </book>
  <book>
    <name>A Mythical Man Month</name>
    <author>Fredrick Brooks Jr.</author>
  </book>
  <book>
    <name>More Effective C++</name>
    <author>Scott Mayers</author>
  </book>
</booklist>

The book, name and author elements are inside the root element.

Page 94: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Building XML Documents - Example

<?xml version="1.0"?>
<booklist>
  <book>
    <name>XML Specification</name>
    <author>Ian Grahm</author>
  </book>
  <book>
    <name>A Mythical Man Month</name>
    <author>Fredrick Brooks Jr.</author>
  </book>
  <book>
    <name>More Effective C++</name>
    <author>Scott Mayers</author>
  </book>
</booklist>

The document complies with version 1.0 of the XML Specification.

Page 95: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Building XML Documents - Example

booklist
  book
    name
    author
  book
    name
    author

Page 96: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Building XML Documents - Example

<?xml version="1.0"?>
<booklist>
  <book>
    <name>XML Specification</name>
    <author>Ian Graham</author>
  </book>
  <book>
    <name>The Mythical Man-Month</name>
    <author>Frederick P. Brooks Jr.</author>
  </book>
  <book>
    <name>More Effective C++</name>
    <author>Scott Meyers</author>
  </book>
</booklist>

A start tag denotes the beginning of an element.

Page 97: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Building XML Documents - Example

<?xml version="1.0"?>
<booklist>
  <book>
    <name>XML Specification</name>
    <author>Ian Graham</author>
  </book>
  <book>
    <name>The Mythical Man-Month</name>
    <author>Frederick P. Brooks Jr.</author>
  </book>
  <book>
    <name>More Effective C++</name>
    <author>Scott Meyers</author>
  </book>
</booklist>

An end tag denotes the end of an element.

Page 98: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Building XML Documents - Example

<?xml version="1.0"?>
<booklist>
  <book>
    <name>XML Specification</name>
    <author>Ian Graham</author>
  </book>
  <book>
    <name>The Mythical Man-Month</name>
    <author>Frederick P. Brooks Jr.</author>
  </book>
  <book>
    <name>More Effective C++</name>
    <author>Scott Meyers</author>
  </book>
</booklist>

Start and end tags must match. Elements must nest properly and cannot overlap.
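
The matching and nesting rule can be checked mechanically by handing the text to any XML parser. The following is a minimal sketch, assuming the JAXP parser bundled with the JDK; the class name WellFormedCheck and the two sample strings are made up for illustration and are not part of the course material.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, used here only for illustration.
class WellFormedCheck {

    // Returns true if the parser accepts the text as well-formed XML.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors and stops the parser
            // from printing its own messages.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // a well-formedness error was reported
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Properly nested: <name> closes before <book> does.
        String nested = "<book><name>XML Specification</name></book>";
        // Overlapping: <book> is closed while <name> is still open.
        String overlapping = "<book><name>XML Specification</book></name>";
        System.out.println(isWellFormed(nested));      // true
        System.out.println(isWellFormed(overlapping)); // false
    }
}
```

Because XML names are case-sensitive, a pair such as <Book> ... </book> is rejected in the same way.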

Page 99: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Building XML Documents - Example

<?xml version="1.0"?>
<booklist>
  <book>
    <name>XML Specification</name>
    <author>Ian Graham</author>
  </book>
  <book>
    <name>The Mythical Man-Month</name>
    <author>Frederick P. Brooks Jr.</author>
  </book>
  <book>
    <name>More Effective C++</name>
    <author>Scott Meyers</author>
  </book>
</booklist>

Data is enclosed by the start and end tags.

Page 100: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Building XML Documents - Example

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<booklist>
  <book>
    <name type="spec">XML Specification</name>
    <author>Ian Graham</author>
  </book>
  <book>
    <name>The Mythical Man-Month</name>
    <author>Frederick P. Brooks Jr.</author>
  </book>
  <listbreak/>
  <book>
    <name>More Effective C++</name>
    <author>Scott Meyers</author>
  </book>
</booklist>

Data in this file is encoded using the UTF-8 encoding.

Page 101: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Building XML Documents - Example

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<booklist>
  <book>
    <name type="spec">XML Specification</name>
    <author>Ian Graham</author>
  </book>
  <book>
    <name>The Mythical Man-Month</name>
    <author>Frederick P. Brooks Jr.</author>
  </book>
  <listbreak/>
  <book>
    <name>More Effective C++</name>
    <author>Scott Meyers</author>
  </book>
</booklist>

standalone="yes": this document can be read and processed independently of any external entities, such as an external DTD.

Page 102: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Building XML Documents - Example

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<booklist>
  <book>
    <name type="spec">XML Specification</name>
    <author>Ian Graham</author>
  </book>
  <book>
    <name>The Mythical Man-Month</name>
    <author>Frederick P. Brooks Jr.</author>
  </book>
  <listbreak/>
  <book>
    <name>More Effective C++</name>
    <author>Scott Meyers</author>
  </book>
</booklist>

Attributes are simply named quantities that define properties of a specific instance of an element.

Page 103: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Building XML Documents - Example

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<booklist>
  <book>
    <name type="spec">XML Specification</name>
    <author>Ian Graham</author>
  </book>
  <book>
    <name>The Mythical Man-Month</name>
    <author>Frederick P. Brooks Jr.</author>
  </book>
  <listbreak/>
  <!-- This book is good for C++ programmers -->
  <book>
    <name>More Effective C++</name>
    <author>Scott Meyers</author>
  </book>
</booklist>

XML comments are entered like this. A comment starts with <!-- and ends with -->.

Page 104: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Building XML Documents - Example

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<booklist>
  <book>
    <name type="spec">XML Specification</name>
    <author>Ian Graham</author>
  </book>
  <book>
    <name>The Mythical Man-Month</name>
    <author>Frederick P. Brooks Jr.</author>
  </book>
  <listbreak/>
  <book>
    <name>More Effective C++</name>
    <author>Scott Meyers</author>
  </book>
</booklist>

An empty-element tag such as <listbreak/> cannot contain any other markup or text.

Page 105: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Rules to remember

• Remember the XML declaration.
• Do what the DTD instructs.
• Watch your capitalization.
• Quote attribute values.
• Close all tags.
• Close empty tags too.
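
"Do what the DTD instructs" is something a validating parser can check for you. The sketch below is only an illustration: it assumes the JAXP parser bundled with the JDK, and the class name DtdValidation and the small internal DTD for the booklist example are invented here rather than taken from the course.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, used here only for illustration.
class DtdValidation {

    // Assumed DTD for the booklist example (not from the slides).
    static final String DTD =
          "<?xml version=\"1.0\"?>"
        + "<!DOCTYPE booklist ["
        + "<!ELEMENT booklist (book+)>"
        + "<!ELEMENT book (name, author)>"
        + "<!ELEMENT name (#PCDATA)>"
        + "<!ELEMENT author (#PCDATA)>"
        + "]>";

    // Returns true if the document is well-formed and valid against its DTD.
    static boolean isValid(String xml) {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // turn on DTD validation
        try {
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) throws SAXException {
                    throw e; // validity errors are recoverable by default; make them fail
                }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String complete = DTD + "<booklist><book><name>More Effective C++</name>"
            + "<author>Scott Meyers</author></book></booklist>";
        String missingAuthor = DTD + "<booklist><book>"
            + "<name>More Effective C++</name></book></booklist>";
        System.out.println(isValid(complete));      // true
        System.out.println(isValid(missingAuthor)); // false: book needs name, then author
    }
}
```

The second document is still well-formed; it fails only because the assumed DTD requires an author inside every book.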

Page 106: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Case Study 1
• Identify Elements
• Identify Attributes
• Write DTD
• Write XML

Page 107: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Case Study 2
• Identify Elements
• Identify Attributes
• Write DTD
• Write XML

Page 108: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

So far we covered...
• XML Concepts
• Use of XML
• Writing an XML document
• Writing a DTD

Page 109: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Standards built on XML
• MathML
• WML

Page 110: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

MathML

MathML is about encoding the structure of mathematical expressions so that they can be displayed, manipulated and shared over the World Wide Web.

A carefully encoded MathML expression can be evaluated in a computer algebra system, rendered in a Web browser, edited in your word processor, and printed on your laser printer.

MathML defines about 100 markup elements.

Page 111: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

MathML Example

(a + b)^2

is represented in MathML as

<msup>
  <mfenced>
    <mrow>
      <mi>a</mi>
      <mo>+</mo>
      <mi>b</mi>
    </mrow>
  </mfenced>
  <mn>2</mn>
</msup>

Page 112: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

WML (Wireless Markup Language)

• Markup language mainly used by WAP-aware browsers
• Must conform to the DTD http://www.wapforum.org/DTD/wml_1.1.xml

Page 113: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

WML Example

<?xml version="1.0"?>
<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.1//EN"
  "http://www.wapforum.org/DTD/wml_1.1.xml">
<wml>
  <card id="Card1" title="Wap-UK.com">
    <p>
      <!-- Hello World example -->
      Hello World
    </p>
  </card>
</wml>


Page 114: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

XML Examples
• xml 1
• xml 2
• xml 3
• xml 4
• xml 5
• xml 6
• xml 7

Page 115: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

DTD Examples

• Play
• Song
• Genbank

Page 116: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

XML Test Bed

Now it's time for some hands-on XML.

Click here to invoke the XMLTestbed Applet

Page 117: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

XML data usage…
• Using CSS (Cascading Style Sheets)
• XSL

We need Internet Explorer 5 to view these catalog.xml files.

1. XML viewing using the style sheet catalog.css
2. XML viewing with XSL, using cd_catalog.xsl
3. Ordered list of CDs using XSL
4. Filtering XML data using XSL

Page 118: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Web-based XML App.

Page 119: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language
Page 120: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Use of XML at DrKB, London

XML as an alternative to dynamically changing specifications.

Page 121: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

XML Namespaces

An XML namespace is a collection of names that can be used as element or attribute names in an XML document.

Page 122: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

XML Namespace...

Identified by
• URI (Uniform Resource Identifier)
• URL (Uniform Resource Locator)
• URN (Uniform Resource Name)

Namespace Declarations
• Explicit
• Default

Page 123: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Explicit Namespace declaration

• Defines a shorthand, or prefix, to substitute for the full name of the namespace.
• Is useful when a node contains elements from different namespaces.

<BOOKS>
  <bk:BOOK xmlns:bk="http://booklovers.org"
           xmlns:money="http://finance.com">
    <bk:TITLE>A Suitable Boy</bk:TITLE>
    <bk:PRICE money:currency="US Dollar">22.95</bk:PRICE>
  </bk:BOOK>
</BOOKS>

Page 124: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Declaring default namespace

The following example declares that the "BOOK" element and all unprefixed elements within it ("TITLE", "PRICE") are from the namespace "http://book.info.com". A default namespace does not apply to attributes: the unprefixed "currency" attribute is not in any namespace.

<BOOK xmlns="http://book.info.com">
  <TITLE>A Suitable Boy</TITLE>
  <PRICE currency="US Dollar">22.95</PRICE>
</BOOK>
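
One way to see what a default namespace declaration covers is to parse the example with a namespace-aware parser and ask each node for its namespace URI. This is a rough sketch using the JAXP DOM parser from the JDK; the class and method names are invented for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class, used here only for illustration.
class DefaultNamespaceDemo {

    static final String XML =
          "<BOOK xmlns=\"http://book.info.com\">"
        + "<TITLE>A Suitable Boy</TITLE>"
        + "<PRICE currency=\"US Dollar\">22.95</PRICE>"
        + "</BOOK>";

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // off by default in JAXP
            return factory.newDocumentBuilder()
                          .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Namespace URI of the first element with the given local name.
    static String elementNamespace(String localName) {
        return parse(XML).getElementsByTagNameNS("*", localName)
                         .item(0).getNamespaceURI();
    }

    // Namespace URI of the unprefixed "currency" attribute on PRICE.
    static String currencyAttributeNamespace() {
        Element price = (Element) parse(XML)
            .getElementsByTagNameNS("*", "PRICE").item(0);
        return price.getAttributeNode("currency").getNamespaceURI();
    }

    public static void main(String[] args) {
        System.out.println(elementNamespace("BOOK"));     // http://book.info.com
        System.out.println(elementNamespace("PRICE"));    // http://book.info.com
        System.out.println(currencyAttributeNamespace()); // null
    }
}
```

To place an attribute in a namespace, it needs an explicit prefix, as money:currency has in the explicit declaration example.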

Page 125: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

<?xml version="1.0"?>
<!-- initially, the default namespace is "books" -->
<book xmlns='urn:loc.gov:books'
      xmlns:isbn='urn:ISBN:0-395-36341-6'>
  <title>Cheaper by the Dozen</title>
  <isbn:number>1568491379</isbn:number>
  <notes>
    <!-- make HTML the default namespace for some commentary -->
    <p xmlns='urn:w3-org-ns:HTML'>
      This is a <i>funny</i> book!
    </p>
  </notes>
</book>
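
The scoping in this example can be inspected in the same way: a default namespace declared on an inner element overrides the outer one for that element and its descendants only. The sketch below assumes the JAXP DOM parser from the JDK; NamespaceScopeDemo and namespaceOf are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class, used here only for illustration.
class NamespaceScopeDemo {

    static final String XML =
          "<book xmlns='urn:loc.gov:books' xmlns:isbn='urn:ISBN:0-395-36341-6'>"
        + "<title>Cheaper by the Dozen</title>"
        + "<isbn:number>1568491379</isbn:number>"
        + "<notes>"
        + "<p xmlns='urn:w3-org-ns:HTML'>This is a <i>funny</i> book!</p>"
        + "</notes>"
        + "</book>";

    // Namespace URI of the first element with the given local name.
    static String namespaceOf(String localName) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                                  .parse(new InputSource(new StringReader(XML)));
            return doc.getElementsByTagNameNS("*", localName)
                      .item(0).getNamespaceURI();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(namespaceOf("title"));  // urn:loc.gov:books
        System.out.println(namespaceOf("number")); // urn:ISBN:0-395-36341-6
        System.out.println(namespaceOf("notes"));  // urn:loc.gov:books
        System.out.println(namespaceOf("p"));      // urn:w3-org-ns:HTML
        System.out.println(namespaceOf("i"));      // urn:w3-org-ns:HTML
    }
}
```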

Page 126: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Let’s see Document Object Model ...

Page 127: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

DOM

Document Object Model

Page 128: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Document Object Model

• What is DOM?
• DOM - XML Parser
• DOM Usage

Page 129: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Document Object Model

DOM - Document Object Model is a set of interfaces defined by W3C.

Page 130: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Document Object Model

The DOM interfaces are platform and language independent.

These interfaces represent the structure and content of XML documents.

Page 131: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Document Object Model

Page 132: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

DOM Object Types

An XML document is represented as a collection of Node objects. Node types are as follows.

• Document Node
• Element Node
• Attribute Node
• Text Node
• Comment Node
• Processing Instruction Node

Page 133: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language
Page 134: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Document Object Model

Most commonly used Nodes

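Node Type                 Example

• Document Type           <!DOCTYPE book SYSTEM "book.dtd">
• Processing Instruction  <?xml-stylesheet type="text/css" href="catalog.css"?>
• Element                 <name type="design">The Mythical Man-Month</name>
• Attribute               type="design"
• Text                    The Mythical Man-Month

Note: the XML declaration <?xml version="1.0"?> looks like a processing instruction, but DOM does not expose it as a Processing Instruction node.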

Page 135: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Using Java to access DOM

• The IBM alphaWorks DOM implementation is used here.
• Let's walk through the Java code.

Page 136: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

public static void main (String args[]) {

    // File reading syntax…..

    Parser parser = new Parser(filename);

    //*** Note how we refer only to DOM Interface references.
    Document doc = parser.readStream(is);

    //*** Use DOM Interface references.
    Element root = (Element)doc.getDocumentElement();

    traverse(root);
}

Page 137: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

public static void traverse(Node node) {

    // Check the current Node
    System.out.println( node.getNodeName() + " - " +
                        node.getNodeType() + " - " +
                        node.getNodeValue() );

    // Print more information for child Nodes
    if (node.hasChildNodes()) {
        NodeList nl = node.getChildNodes();
        int size = nl.getLength();
        for (int i = 0; i < size; i++) {
            traverse(nl.item(i));
        }
    }
}
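
The walkthrough above depends on the IBM alphaWorks parser classes. As a rough, self-contained equivalent, the sketch below swaps in the JAXP DocumentBuilder that ships with the JDK (a substitution, not the parser used in the slides) and keeps the same recursive traversal over the standard org.w3c.dom interfaces. The class name DomTraverse and the inline sample document are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class, used here only for illustration.
class DomTraverse {

    // Records "name - type - value" for the node, then recurses into children.
    static void traverse(Node node, List<String> lines) {
        lines.add(node.getNodeName() + " - " + node.getNodeType()
                  + " - " + node.getNodeValue());
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            traverse(children.item(i), lines);
        }
    }

    // Parses the XML text and traverses it from the root element.
    static List<String> describe(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            List<String> lines = new ArrayList<>();
            traverse(doc.getDocumentElement(), lines);
            return lines;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (String line : describe("<book><name>XML Specification</name></book>")) {
            System.out.println(line);
        }
        // book - 1 - null
        // name - 1 - null
        // #text - 3 - XML Specification
    }
}
```

Element nodes report type 1 and a null value; the text itself lives in a child node of type 3 named #text.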

Page 138: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

XSL

eXtensible Stylesheet Language

Page 139: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Purpose of XSL

To provide a powerful yet easy-to-use style sheet syntax for rendering XML documents.

(Figure: an XML document and an XSL stylesheet are fed to an XSL processor, which produces RTF, TeX, or HTML output.)

Page 140: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

XSL Architecture

• An XSL style sheet has a set of rules known as construction rules.
• It converts data into a set of objects known as flow objects.
• Common flow objects are Page, Paragraph, Table, etc.
• Flow objects have characteristics.
• XSL is itself an XML document and is based on its own DTD.
• It has only a handful of element types.

Page 141: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

XSL Transformations

• Uses the style sheet language XSL to transform XML.
• The transformed XML can be another XML document, which may be used for viewing or for some other purpose.
• Transformations in XSLT describe rules for transforming a source tree into a result tree.
• Transformation is achieved by associating patterns with templates.
• The structure of the result tree can be completely different from the source tree.

Page 142: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

XSL ….. Stylesheet

• Contains a set of template rules.
• Each template rule contains a pattern and the template to apply when the pattern is matched.
• While searching the elements, XSLT makes use of the expression language defined by XPath.

Page 143: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

XSL ….. Stylesheet Structure

A stylesheet has the following structure:

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="...">
    ...
  </xsl:template>

  <xsl:template name="...">
    ...
  </xsl:template>

</xsl:stylesheet>

Page 144: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

XSL …..

Example

<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html>
      <body>
        <xsl:value-of select="."/>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>

This will print the text content of the whole XML document.

Page 145: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

XSL …..

Computing Value of Node

Example

<xsl:template match="title">
  <xsl:value-of select="."/>
  <br/>
</xsl:template>

Page 146: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

XSL ….. Processing Multiple Elements

Example

<xsl:template match="bookstore">
  <body>
    <xsl:for-each select="book">
      Title:<xsl:value-of select="title"/> <br/>
      Price:<xsl:value-of select="price"/> <br/>
    </xsl:for-each>
  </body>
</xsl:template>
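
To try the for-each template outside a browser, it can be run through the XSLT 1.0 processor that ships with the JDK (javax.xml.transform). The sketch below is illustrative only: the class name ForEachDemo is made up, the template is wrapped in a minimal xsl:stylesheet element, and the bookstore data is sample input invented for this example (the second price in particular is arbitrary).

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class, used here only for illustration.
class ForEachDemo {

    // Made-up sample input.
    static final String XML =
          "<bookstore>"
        + "<book><title>A Suitable Boy</title><price>22.95</price></book>"
        + "<book><title>Cheaper by the Dozen</title><price>9.99</price></book>"
        + "</bookstore>";

    // The template from the slide, wrapped in a stylesheet element.
    static final String XSL =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:template match='bookstore'>"
        + "<body>"
        + "<xsl:for-each select='book'>"
        + "Title:<xsl:value-of select='title'/><br/>"
        + "Price:<xsl:value-of select='price'/><br/>"
        + "</xsl:for-each>"
        + "</body>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the stylesheet to the XML text and returns the serialized result.
    static String transform(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Prints a <body> element with one Title/Price pair per book.
        System.out.println(transform(XML, XSL));
    }
}
```

The body of xsl:for-each is instantiated once per selected book, with that book as the context node, so select="title" and select="price" are resolved relative to each book in turn.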

Page 147: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

XSL ….. XML Patterns

Selecting nodes starting from the root through the XML hierarchy
    /bookstore/book/title

Wildcard used for unknown elements
    /bookstore/*/title

Having a specified child element
    /bookstore/book/author[first-name]

Selecting an attribute of an element
    /bookstore/book/@genre

Selecting elements whose attribute has a given value
    /bookstore/book[@genre="autobiography"]
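
These patterns can be tried out with the XPath 1.0 engine in the JDK (javax.xml.xpath). The following is a small sketch; the class name PatternDemo, the select helper, and the bookstore data (titles T1, T2, M and the magazine element) are all invented for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class, used here only for illustration.
class PatternDemo {

    // Made-up sample input.
    static final String XML =
          "<bookstore>"
        + "<book genre='autobiography'><title>T1</title>"
        + "<author><first-name>A</first-name></author></book>"
        + "<magazine><title>M</title></magazine>"
        + "<book genre='novel'><title>T2</title>"
        + "<author><name>B</name></author></book>"
        + "</bookstore>";

    // Evaluates the expression and returns the text of each selected node.
    static List<String> select(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expression,
                new InputSource(new StringReader(XML)),
                XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(select("/bookstore/book/title"));              // [T1, T2]
        System.out.println(select("/bookstore/*/title"));                 // [T1, M, T2]
        System.out.println(select("/bookstore/book/author[first-name]")); // [A]
        System.out.println(select("/bookstore/book/@genre"));             // [autobiography, novel]
        System.out.println(select("/bookstore/book[@genre='autobiography']/title")); // [T1]
    }
}
```

The predicate in square brackets filters the elements it follows; it does not change which node is returned, which is why the last expression still yields title elements.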

Page 148: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

XSL …..

XSL Transformation Specification

Page 149: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Simple XSL Examples
• cd_catalog
• cd_catalog_filter
• cd_catalog_order

Page 150: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Case Study

Write XML for an employee of a company and use the same file for various departments such as personnel, travel, finance, projects, etc.

Use existing XSL / CSS to display the data.

Page 151: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Quiz !

Some points to think about …..

Page 152: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Quiz
• Where did XML get its name?
• What does it do?
• What is a markup language?
• Why should markup languages be based on XML?
• What is a structured document?
• How do you check the structure of XML for validity?
• Where can XML be used?
• Give examples of markup languages based on XML.

Page 153: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

<? Quiz ?>

Is the following XML declaration correct?

<Elem type="type1" Type="input">

<!ELEMENT myMoney ( #PCDATA | currancy | currancy | visa )* >

<!ELEMENT name (first, last)>
<!ELEMENT name (first, middle, last)>

Can we constrain the content of an element to be an integer?

Page 154: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

<? Quiz ?>

Can we use PCDATA as an element name in XML?

XML FAQs are useful for understanding some subtle points.

Page 155: Foundation Course on XML - Ravindra Godbole XML Extensible Markup Language

Thank You