Introduction to XML

This presentation covers introductory features of XML.

What XML is and what it is not?

What does it do?

Put different related technologies in the right perspective.

Some XML and document object model (DOM) details.

What XML is and what it is not?

XML syntax is very similar to that of HTML, however:

HTML deals with the format of a document; XML deals with the content of the document.

XML highlights what is to be displayed, whereas HTML highlights how to display it. XML complements HTML.

An HTML document contains both the data and the display instructions. XML contains only data.

Example:

What XML is and what it is not?

Both XML and HTML use tags. However, the tags in HTML have fixed semantics and cannot mean anything different, whereas the tags in XML can be assigned different meanings and additional tags can be created.

XML makes the contents of a document readable not only by humans but also by machines.

XML documents are text files and are therefore platform independent.

HTML normally displays the entire contents of a document. With XML it is possible to select the information that is to be displayed.

XML does not Do anything!

It is just a way of structuring, storing and sending information.

It provides a format that can be understood by other applications.

It facilitates exchange of data between incompatible systems (does not exchange data in itself).

Developed by the W3C between 1996 and 1998 to provide a universal format for describing structured documents and data.

XML describes a class of data objects called XML documents.

XML does not Do anything!

It allows the creation of a markup language from scratch.

Different industries and professions can develop custom languages that accurately handle their industry-specific data.

Examples: Wireless Markup Language, Chemical Markup Language, Speech Synthesis Markup Language, Gene Expression Markup Language, etc.

XML will provide greater flexibility in transferring data between different applications on different platforms and machines, and greatly increase the accuracy of web searches.

Its reliance on Unicode makes it international.

Related Technologies

XML Tags are created by programmers.

How to tell a browser to display information inside a set of tags created by us?

How do browsers display information contained in HTML tags?

Standard predefined tags are implemented by the browser software.

We must therefore have some standard way to describe the tags created by us. We can then refer to this description and write programs to interpret / display the contents of the tags.

Related Technologies

How to describe the tags and their properties?

One of the techniques is called Document Type Definition (DTD). It does not use XML syntax. Details later.

Another, more recent technique is XML Schema. It uses XML syntax.

How to display / interpret XML document?

CSS (Cascading Style Sheets)

XSL (Extensible Stylesheet Language)

XSLT (XSL Transformations)

DOM (Document object model) Details later.

SAX (Simple API for XML).

Related Technologies

[Diagram: an XML document, with a DTD or XML Schema acting as its template, and CSS, XSL, XSLT or DOM used to display / interpret it.]

Document Type Definition (DTD)

A DTD is a set of rules that will be used by a parser that parses an XML document.

It defines parts of a document and outlines how they can be used, including their order and contents.

It generally has:

Processing instructions

Entities

Elements, including their start and end tags

Attributes

Comments

Character Data

Parts of DTD

Processing Instructions.

The most commonly used is the XML declaration (strictly a declaration rather than a processing instruction, although it uses the same syntax):

<?xml version="1.0" encoding="UTF-8" standalone="no"?>

Entities

Variables used to define common text.

Entity references are references to entities.

Entities are expanded when a document is parsed by an XML parser.

<!ENTITY COPYRIGHT "Copyrighted 2001">

Reference to above: &COPYRIGHT;

Predefined entities: lt, gt, amp, quot, apos

Referenced as: &lt; (<), &gt; (>), &amp; (&), &quot; ("), &apos; (').
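
Entity expansion can be tried out with a short sketch. It assumes Python's standard xml.etree.ElementTree parser, which the slides do not prescribe, and a made-up element name, footer. The document declares the COPYRIGHT entity in an internal DTD and also uses three predefined entities.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# A tiny document with an internal DTD that declares one entity.
# The element name "footer" is hypothetical, chosen only for this sketch.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE footer [
  <!ENTITY COPYRIGHT "Copyrighted 2001">
]>
<footer>&COPYRIGHT; 1 &lt; 2 &amp; 3 &gt; 2</footer>"""

# The parser replaces each entity reference with its text while parsing.
root = ET.fromstring(DOCUMENT)
expanded_text = root.text
print(expanded_text)

# Going the other way: escape() turns &, < and > back into entity references.
escaped = escape("a < b & c")
print(escaped)
```

Note that every reference ends with a semicolon; without it the parser reports an error.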

Elements

The main building blocks of an XML document are tagged elements like: <SUBJECT> ……….</SUBJECT>

An element name surrounded by angle brackets is called a 'Tag'. Contents of an element go between a start-tag and an end-tag.

Syntax: <!ELEMENT name content>

Contents of an Element can be other Elements (parent-child, sibling relationships) or #PCDATA / EMPTY / ANY. (CDATA is an attribute type, not an element content model.)

Example:

<!ELEMENT DOC (SUBJECT, DATE, ADDRESS, MEMO)>

<!ELEMENT SUBJECT (#PCDATA)>

<!ELEMENT DATE (#PCDATA)>

<!ELEMENT ADDRESS (#PCDATA)>

<!ELEMENT MEMO (#PCDATA)>

Elements

Use of the elements declared in the previous slide:

<DOC>

<SUBJECT>Today’s Memo</SUBJECT>

<DATE>Nov. 6, 2001</DATE>

<ADDRESS>McMaster University</ADDRESS>

<MEMO>I hope you like XML</MEMO>

</DOC>

“DOC” is the parent of other elements that are siblings to each other.
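
The parent and sibling relationships can be inspected programmatically. The following is a minimal sketch using Python's standard xml.etree.ElementTree module (an assumption; any XML parser would do) on the DOC example above.

```python
import xml.etree.ElementTree as ET

# The memo document from the slide, using plain ASCII quotes.
MEMO = """<DOC>
<SUBJECT>Today's Memo</SUBJECT>
<DATE>Nov. 6, 2001</DATE>
<ADDRESS>McMaster University</ADDRESS>
<MEMO>I hope you like XML</MEMO>
</DOC>"""

doc = ET.fromstring(MEMO)

# The root element is the parent; iterating over it yields its children,
# which are siblings of one another, in document order.
parent_name = doc.tag
sibling_names = [child.tag for child in doc]

for child in doc:
    print(f"{parent_name} -> {child.tag}: {child.text}")
```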

Elements

More Examples:

<!ELEMENT BR EMPTY> Usage: <BR />

<!ELEMENT Note ANY> Usage:

<Note> any type of contents </Note>

<!ELEMENT Doc (Page+)> One or more elements

<!ELEMENT Doc (Page*)> Zero or more elements

<!ELEMENT Doc (Page?)> Zero or one elements

Mixed Contents:

<!ELEMENT Note (#PCDATA|To|From|Message)*>

Attributes & Comments

Attributes provide extra information about elements, e.g. in HTML:

<img src="mypicture.gif" />

XML attributes are declared as follows:

<!ATTLIST elementname attributename type default_usage>

<!ELEMENT ARTICLE (HEADLINE, BYLINE, STORY)>

<!ATTLIST ARTICLE AUTHORS CDATA #REQUIRED
                  EDITORS CDATA #IMPLIED>

Comments: Same syntax as that in HTML: <!-- comments -->
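
As a rough illustration of how a program reads attributes, the sketch below parses a hypothetical ARTICLE instance with Python's standard xml.etree.ElementTree module. The attribute value and element texts are invented for the example. The standard-library parser does not validate against a DTD, so it would not itself enforce #REQUIRED.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance of the ARTICLE element declared above.
# AUTHORS is #REQUIRED, so it is present; EDITORS is #IMPLIED (optional)
# and is deliberately left out here.
ARTICLE = """<ARTICLE AUTHORS="A. Writer">
<HEADLINE>Sample headline</HEADLINE>
<BYLINE>Sample byline</BYLINE>
<STORY>Sample story text.</STORY>
</ARTICLE>"""

article = ET.fromstring(ARTICLE)

# get() returns the attribute value, or None when the attribute is absent.
authors = article.get("AUTHORS")
editors = article.get("EDITORS")

print("Authors:", authors)
print("Editors:", editors if editors is not None else "(not specified)")
```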

Syntax of XML

XML syntax rules are simple and self-describing but strict.

All XML documents must have a root element, called the 'document element', and all children must be properly nested.

<root>
<Child_Element>
<Sub_child> …. </Sub_child>
</Child_Element>
</root>

All XML tags are case sensitive.

All elements must be properly nested.

All elements must have a closing tag.
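
These rules can be checked by handing small documents to a parser. The sketch below assumes Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the sample documents are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as an XML document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# One sample per syntax rule; only the first follows all of them.
samples = {
    "properly nested": "<root><Child_Element><Sub_child>x</Sub_child></Child_Element></root>",
    "overlapping tags": "<root><a><b></a></b></root>",
    "missing closing tag": "<root><a>text</root>",
    "case mismatch": "<root><Child></child></root>",
    "two root elements": "<root/><root/>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'well formed' if ok else 'rejected'}")
```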

Syntax of XML

Attribute values must be quoted.

Element names should follow these rules:

They can contain letters, numbers and other characters.

They must not start with a number or a punctuation character.

They must not start with the letters xml in any case (XML / Xml / xml).

They must not contain spaces.

Avoid using a hyphen or a period in a name.
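
Some of the naming rules can be probed by asking a parser whether it accepts an empty element with a given name. This is only a sketch, assuming Python's standard xml.etree.ElementTree module and a hypothetical helper, is_legal_element_name. It covers what a parser rejects; the advice about the xml prefix and about hyphens or periods concerns reserved or discouraged names and is not checked here.

```python
import xml.etree.ElementTree as ET


def is_legal_element_name(name: str) -> bool:
    """Return True if the parser accepts <name/> as one empty element."""
    try:
        root = ET.fromstring(f"<{name}/>")
    except ET.ParseError:
        return False
    # Guard against inputs that parse but are more than a single name.
    return root.tag == name


for candidate in ["Child_Element", "1st", "-note", "my note", "first-name"]:
    verdict = "accepted" if is_legal_element_name(candidate) else "rejected"
    print(f"{candidate!r}: {verdict}")
```

A name such as first-name is accepted by the parser, so avoiding hyphens is a style recommendation rather than a syntax rule.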

“Well Formed” and “Valid” XML documents

A Well Formed document is one that conforms to XML syntax

A Valid document is a Well Formed document that also conforms to a DTD.

Is this a Valid document?

Is it Well Formed?

How to refer to a DTD?

Use the <!DOCTYPE> declaration.

External DTD and Internal DTD

A DTD may be a part of an XML document – Internal

Normally it is stored separately (External) and can be referred to as:

<!DOCTYPE memo SYSTEM "memo.dtd">

<!DOCTYPE memo SYSTEM "http://site/file path">

<!DOCTYPE purchase PUBLIC "-//Companyxyz//DTD purchase//EN" "http://site/path">

We use:

<!DOCTYPE COURSE SYSTEM "http://www.cas.mcmaster.ca/~asghar/k600/course.dtd">

Viewing XML Documents

Internet Explorer 5+ can be used to view XML documents.

The XML source document is shown. Why?

How can we view a formatted document?

Using Cascading Style Sheets (CSS) is one way. They specify how each tag in the XML document being viewed must be formatted.

Extensible Stylesheet Language (XSL) is another. Not discussed here.

CSS shows the entire document, just like HTML. What if we want to display only parts of a document?

Document Object Model (XMLDOM)

An XML document can be understood by other applications.

It must be described in a way that lets other programming languages manipulate its contents (add, delete, change).

DOM is a programming interface for XML documents. It exposes them as tree structures in memory and provides an easy-to-use environment for the programmer.

Look at earlier example:

It can be shown as:

XML Tree Structure

memo
    to
    from
    Subject
    CONTENT
        alert
        paragraph
        closing

XMLDOM

DOM is a W3C recommendation. It specifies a language-independent API that can be used with languages like Java, C++, Perl, Visual Basic, JavaScript and others.

There are different implementations of DOM.

We use Microsoft's implementation. Microsoft provides a parser in the form of a COM component in IE 5 and later.

We use JavaScript to access it and to make different API calls.

DOM uses three objects to access the XML file: Document, Node and Node List. Each has properties and methods.

Common Node types are: Document Type, Processing Instruction, Element, Attribute, Text etc.
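
The three objects can be seen in action in a small sketch. The slides use Microsoft's parser from JavaScript. The code below substitutes Python's standard xml.dom.minidom module, a different DOM implementation, so that it runs without a browser, and it works on the memo document from Example #3. The replacement text for the closing element is made up.

```python
from xml.dom import Node, minidom

MEMO = """<?xml version="1.0"?>
<memo>
<to>K 600 Class</to>
<from>Asghar Bokhari</from>
<Subject>XML Lecture</Subject>
<CONTENT>
<alert>Please listen carefully</alert>
<paragraph>Please read this memo</paragraph>
<closing>Thank you very much </closing>
</CONTENT>
</memo>"""

document = minidom.parseString(MEMO)   # Document object: the whole tree
root = document.documentElement        # Node: the <memo> element

# childNodes also holds whitespace Text nodes, so filter on the node type.
element_children = [
    node.nodeName for node in root.childNodes
    if node.nodeType == Node.ELEMENT_NODE
]

# getElementsByTagName returns a NodeList that can be indexed and measured.
paragraphs = document.getElementsByTagName("paragraph")
paragraph_text = paragraphs[0].firstChild.data

# Change the contents of one node in memory (hypothetical new text).
closing = document.getElementsByTagName("closing")[0]
closing.firstChild.data = "See you next week"

print(element_children)
print(paragraphs.length, paragraph_text)
print(closing.toxml())
```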

Example #1

<html>
<head>
<title> Example1</title>
</head>
<body>
<ul>
<li> Asghar Bokhari</li>
<li> 9026568 </li>
<li> A+ </li>
</ul>
</body>
</html>

Example #2

<Student>

<FirstName>Asghar</FirstName>

<LastName>Bokhari</LastName>

<ID>9026568</ID>

<Assignment>28</Assignment>

<MidTerm>29</MidTerm>

<Final>38</Final>

<LetterGrade>A+</LetterGrade>

</Student>
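
Because each value in Example #2 is labelled by its tag, a program can pick out exactly the fields it needs. A minimal sketch, assuming Python's standard xml.etree.ElementTree module; adding the three marks into a total is an assumption made for illustration, not something the slides specify.

```python
import xml.etree.ElementTree as ET

STUDENT = """<Student>
<FirstName>Asghar</FirstName>
<LastName>Bokhari</LastName>
<ID>9026568</ID>
<Assignment>28</Assignment>
<MidTerm>29</MidTerm>
<Final>38</Final>
<LetterGrade>A+</LetterGrade>
</Student>"""

student = ET.fromstring(STUDENT)

# findtext() returns the text of the first child element with that tag.
full_name = f"{student.findtext('FirstName')} {student.findtext('LastName')}"

# Hypothetical use of the data: add up the three marks.
total = sum(int(student.findtext(tag)) for tag in ("Assignment", "MidTerm", "Final"))

print(full_name, total, student.findtext("LetterGrade"))
```

The HTML list in Example #1 carries the same name, ID and grade, but nothing in its markup says which item is which.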

Example #3

<?xml version="1.0" ?>

<memo>

<to>K 600 Class</to>

<from>Asghar Bokhari</from>

<Subject>XML Lecture</Subject>

<CONTENT>

<alert>Please listen carefully</alert>

<paragraph>Please read this memo</paragraph>

<closing>Thank you very much </closing>

</CONTENT>

</memo>

Example #4

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE memo [
<!ELEMENT memo (to, from, Subject, CONTENT)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT Subject (#PCDATA)>
<!ELEMENT CONTENT (alert, paragraph, closing)>
<!ELEMENT alert (#PCDATA)>
<!ELEMENT paragraph (#PCDATA)>
<!ELEMENT closing (#PCDATA)>
]>
<memo>
<to>K 600 Class</to>
<from>Asghar Bokhari</from>
<Subject>XML Lecture</Subject>
<CONTENT>
<alert>Please listen carefully</alert>
<paragraph>Please read this memo</paragraph>
<closing>Thank you very much </closing>
</CONTENT>
</memo>
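
A document with an internal DTD parses like any other. The sketch below uses Python's standard xml.dom.minidom module (an assumption; the slides use a different parser), which reads the DOCTYPE but does not validate against it. The order check at the end is therefore a hand-written stand-in for one rule of the DTD, not real validation.

```python
from xml.dom import Node, minidom

DOCUMENT = """<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE memo [
<!ELEMENT memo (to, from, Subject, CONTENT)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT Subject (#PCDATA)>
<!ELEMENT CONTENT (alert, paragraph, closing)>
<!ELEMENT alert (#PCDATA)>
<!ELEMENT paragraph (#PCDATA)>
<!ELEMENT closing (#PCDATA)>
]>
<memo>
<to>K 600 Class</to>
<from>Asghar Bokhari</from>
<Subject>XML Lecture</Subject>
<CONTENT>
<alert>Please listen carefully</alert>
<paragraph>Please read this memo</paragraph>
<closing>Thank you very much </closing>
</CONTENT>
</memo>"""

document = minidom.parseString(DOCUMENT)

# The DOCTYPE is exposed as a DocumentType node on the Document.
doctype_name = document.doctype.name

# Manual check of one DTD rule: memo contains to, from, Subject, CONTENT
# in that order.
children = [
    node.nodeName for node in document.documentElement.childNodes
    if node.nodeType == Node.ELEMENT_NODE
]
follows_declared_order = children == ["to", "from", "Subject", "CONTENT"]

print(doctype_name, follows_declared_order)
```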