
XML

eXtensible Markup Language

XML vs. HTML

HTML is a HyperText Markup Language
  Designed for a specific application, namely, presenting and linking hypertext documents
XML describes structure and content (“semantics”)
  The presentation is defined separately from the structure and the content

An Address Book as an XML document

<addresses>
  <person>
    <name> Donald Duck </name>
    <tel> 414-222-1234 </tel>
    <email> [email protected] </email>
  </person>
  <person>
    <name> Miki Mouse </name>
    <tel> 123-456-7890 </tel>
    <email> [email protected] </email>
  </person>
</addresses>
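As a rough sketch of how an application might read such an address book, the snippet below parses it with Python's standard `xml.etree.ElementTree` module. The e-mail addresses are placeholders at `example.org` (an assumption, since the originals are not legible here), and `read_people` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Address book from the slide; e-mail values are placeholders (assumption).
ADDRESS_BOOK = """
<addresses>
  <person>
    <name> Donald Duck </name>
    <tel> 414-222-1234 </tel>
    <email> donald@example.org </email>
  </person>
  <person>
    <name> Miki Mouse </name>
    <tel> 123-456-7890 </tel>
    <email> miki@example.org </email>
  </person>
</addresses>
"""

def read_people(xml_text):
    """Return one dict per <person>, mapping each child tag to its trimmed text."""
    root = ET.fromstring(xml_text)
    people = []
    for person in root.findall("person"):
        people.append({child.tag: child.text.strip() for child in person})
    return people

people = read_people(ADDRESS_BOOK)
for p in people:
    print(p["name"], p["tel"], p["email"])
```

The tag names carry the meaning here: the code asks for `name`, `tel` and `email` by name and never has to guess from layout.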

Main Features of XML

No fixed set of tags
  New tags can be added for new applications
An agreed-upon set of tags can be used in many applications
  Namespaces facilitate uniform and coherent descriptions of data
  For example, a namespace for address books determines whether to use <tel> or <phone>
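To make the namespace idea concrete, here is a minimal sketch using Python's `xml.etree.ElementTree`. The namespace URI `http://example.org/addressbook` and the prefix `ab` are invented for illustration; a real address-book vocabulary would publish its own URI.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI, used only for this illustration.
AB_NS = "http://example.org/addressbook"

DOC = f"""
<ab:addresses xmlns:ab="{AB_NS}">
  <ab:person>
    <ab:name>Donald Duck</ab:name>
    <ab:tel>414-222-1234</ab:tel>
  </ab:person>
</ab:addresses>
"""

root = ET.fromstring(DOC)

# ElementTree stores qualified names as "{namespace-uri}local-name".
print(root.tag)

# Lookups must name the namespace; the prefix map is local to this call.
ns = {"ab": AB_NS}
tels = [t.text for t in root.findall("ab:person/ab:tel", ns)]

# The same path without a namespace matches nothing.
unqualified = root.findall("person/tel")
print(tels, unqualified)
```

Two applications that agree on the URI agree on what `tel` means, regardless of which prefix each document happens to use.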

Main Features of XML (cont’d)

XML has the concept of a schema
  DTD and the more expressive XML Schema
XML is a data model
  Similar to the semistructured data model
XML supports internationalization (Unicode) and platform independence (an XML file is just a character file)

XML is the Standard for Data Exchange

Web services (e.g., e-commerce) require exchanging data between various applications that run on different platforms
XML (augmented with namespaces) is the preferred syntax for data exchange on the Web

XML is not Alone

XML Schemas strengthen the data-modeling capabilities of XML (in comparison to XML with only DTDs)
XPath is a language for accessing parts of XML documents
XLink and XPointer support cross-references
XSLT is a language for transforming XML documents into other XML documents (including XHTML, for displaying XML files)
  Limited styling of XML can be done with CSS alone
XQuery is a language for querying XML documents
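To give a feel for XPath-style access, the sketch below uses the limited XPath subset that Python's `xml.etree.ElementTree` supports (a full XPath engine offers much more). The first book is the one used elsewhere in these slides; the second book is placeholder data added only so that the filter has something to exclude.

```python
import xml.etree.ElementTree as ET

BIB = """
<bib>
  <book year="1994">
    <title>TCP/IP Illustrated</title>
    <author><last>Stevens</last><first>W.</first></author>
    <publisher>Addison-Wesley</publisher>
    <price>65.95</price>
  </book>
  <book year="2000">
    <title>Placeholder Title</title>
    <author><last>Doe</last><first>J.</first></author>
    <publisher>Example Press</publisher>
    <price>10.00</price>
  </book>
</bib>
"""

root = ET.fromstring(BIB)

# Attribute predicate: titles of books whose year attribute is "1994".
titles_1994 = [t.text for t in root.findall("./book[@year='1994']/title")]

# Descendant axis: every <last> element anywhere below the root.
last_names = [e.text for e in root.findall(".//last")]

print(titles_1994, last_names)
```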

The Two Facets of XML

Some XML files are just text documents with tags that denote their structure and include some metadata (e.g., an attribute that gives the name of the person who did the proofreading)
  See an example on the next slide
  XML is a subset of SGML (Standard Generalized Markup Language)
Other XML documents are similar to database files (e.g., an address book)

XML can Describe the Structure of a Document

<book year="1994">
  <title>TCP/IP Illustrated</title>
  <author>
    <last>Stevens</last>
    <first>W.</first>
  </author>
  <publisher>Addison-Wesley</publisher>
  <price>65.95</price>
</book>

XML Syntax

W3Schools Resources on XML Syntax

The Structure of XML

XML consists of tags and text
Tags come in pairs: <date> ... </date>
They must be properly nested
  good: <date> ... <day> ... </day> ... </date>
  bad: <date> ... <day> ... </date> ... </day>
  (You can’t do <i> ... <b> ... </i> ... </b> in HTML)
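A quick way to see the nesting rule in action is to hand both variants to a parser. The sketch below uses Python's `xml.etree.ElementTree`; `nesting_error` is a hypothetical helper that returns the parser's message, or `None` when the text parses.

```python
import xml.etree.ElementTree as ET

def nesting_error(xml_text):
    """Return None if xml_text parses, otherwise the parser's error message."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return str(err)
    return None

good = "<date> ... <day> ... </day> ... </date>"
bad = "<date> ... <day> ... </date> ... </day>"

good_result = nesting_error(good)
bad_result = nesting_error(bad)

print("good:", good_result)
print("bad: ", bad_result)
```

An XML parser rejects the second string outright instead of trying to repair it.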

A Useful Abbreviation

Abbreviating elements with empty contents:
  <br/> for <br></br>
  <hr width="10"/> for <hr width="10"></hr>
For example:

<family>
  <person id="lisa">
    <name> Lisa Simpson </name>
    <mother idref="marge"/>
    <father idref="homer"/>
  </person>
  ...
</family>

Note that a tag may have a set of attributes, each consisting of a name and a value
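The following sketch, using Python's `xml.etree.ElementTree`, suggests that the abbreviated and the long form of an empty element are indistinguishable after parsing, and shows attributes as name/value pairs. The `<father>` element is deliberately written in the long form for comparison.

```python
import xml.etree.ElementTree as ET

FAMILY = """
<family>
  <person id="lisa">
    <name> Lisa Simpson </name>
    <mother idref="marge"/>
    <father idref="homer"></father>
  </person>
</family>
"""

root = ET.fromstring(FAMILY)
lisa = root.find("person")
mother = lisa.find("mother")   # written as <mother .../>
father = lisa.find("father")   # written as <father ...></father>

# Both elements are empty: no text and no children.
print(mother.text, len(mother), father.text, len(father))

# Attributes are exposed as a mapping from name to value.
print(lisa.attrib, mother.get("idref"), father.get("idref"))
```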

XML Text

XML has only one “basic” type: text
It is bounded by tags, e.g.,
  <title> The Big Sleep </title>
  <year> 1935 </year> (1935 is still text)
XML text is called PCDATA (for parsed character data)
Characters are Unicode; any character can be written as a numeric character reference, e.g., &#x05DE; for the Hebrew letter Mem
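The sketch below illustrates both points with Python's standard library: element content always arrives as a string, so numbers must be converted by the application, and a numeric character reference is resolved by the parser. The `<movie>` and `<letter>` wrapper elements are invented for the example, and U+05DE is used because it is the Unicode code point named HEBREW LETTER MEM.

```python
import unicodedata
import xml.etree.ElementTree as ET

movie = ET.fromstring(
    "<movie><title> The Big Sleep </title><year> 1935 </year></movie>"
)

# The parser hands back text, including the surrounding whitespace.
year_text = movie.findtext("year")
print(repr(year_text))

# Turning it into a number is the application's job.
year = int(year_text)

# A numeric character reference is replaced by the character it names.
letter = ET.fromstring("<letter>&#x05DE;</letter>").text
letter_name = unicodedata.name(letter)
print(year, letter_name)
```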

XML Structure

Nesting tags can be used to express various structures, e.g., a tuple (record):

<person>
  <name> Lisa Simpson </name>
  <tel> 02-828-1234 </tel>
  <tel> 054-470-777 </tel>
  <email> [email protected] </email>
</person>
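One possible way to map such a record onto a typed structure is sketched below in Python. The `Person` dataclass and `person_from_xml` helper are hypothetical names, and the e-mail address is a placeholder. Note that the repeated `<tel>` tag naturally becomes a list field.

```python
from dataclasses import dataclass, field
import xml.etree.ElementTree as ET

@dataclass
class Person:
    name: str
    email: str
    tels: list = field(default_factory=list)

def person_from_xml(xml_text):
    """Build a Person from a <person> element given as text."""
    elem = ET.fromstring(xml_text)
    return Person(
        name=elem.findtext("name", default="").strip(),
        email=elem.findtext("email", default="").strip(),
        tels=[t.text.strip() for t in elem.findall("tel")],
    )

# Record from the slide; the e-mail value is a placeholder (assumption).
RECORD = """
<person>
  <name> Lisa Simpson </name>
  <tel> 02-828-1234 </tel>
  <tel> 054-470-777 </tel>
  <email> lisa@example.org </email>
</person>
"""

lisa = person_from_xml(RECORD)
print(lisa)
```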

XML Structure (cont’d)

We can represent a list by using the same tag repeatedly:

<addresses>
  <person> … </person>
  <person> … </person>
  <person> … </person>
  <person> … </person>
  …
</addresses>

XML Structure (cont’d)

<addresses>
  <person>
    <name> Donald Duck </name>
    <tel> 04-828-1345 </tel>
    <email> [email protected] </email>
  </person>
  <person>
    <name> Miki Mouse </name>
    <tel> 03-426-1142 </tel>
    <email> [email protected] </email>
  </person>
</addresses>

Terminology

The segment of an XML document between an opening and a corresponding closing tag is called an element

<person>
  <name> Bart Simpson </name>
  <tel> 02 – 444 7777 </tel>
  <tel> 051 – 011 022 </tel>
  <email> [email protected] </email>
</person>

[Figure callouts on the markup above: "element"; "element, a sub-element of"; "not an element"]

An XML Document is a Tree

person
  name: Bart Simpson
  tel: 02 – 444 7777
  tel: 051 – 011 022
  email: [email protected]

Leaves are either empty or contain PCDATA
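The tree view can be reproduced with a short recursive walk. This is only a sketch using Python's `xml.etree.ElementTree`; `outline` is a hypothetical helper, the e-mail address is a placeholder, and plain hyphens are used in the phone numbers.

```python
import xml.etree.ElementTree as ET

def outline(elem, depth=0):
    """Return one indented line per element, with its PCDATA if it has any."""
    text = (elem.text or "").strip()
    label = elem.tag if not text else f"{elem.tag}: {text}"
    lines = ["  " * depth + label]
    for child in elem:
        lines.extend(outline(child, depth + 1))
    return lines

# Placeholder e-mail value (assumption).
PERSON = (
    "<person>"
    "<name> Bart Simpson </name>"
    "<tel> 02 - 444 7777 </tel>"
    "<tel> 051 - 011 022 </tel>"
    "<email> bart@example.org </email>"
    "</person>"
)

tree_lines = outline(ET.fromstring(PERSON))
print("\n".join(tree_lines))
```

Inner nodes correspond to elements with sub-elements; the indented lines at the deepest level are the leaves holding PCDATA.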

Mixed Content

An element may contain a mixture of sub-elements and PCDATA

<airline>
  <name> British Airways </name>
  <motto> World’s <dubious> favorite </dubious> airline </motto>
</airline>
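Mixed content shows up in an API as text fragments interleaved with child elements. In Python's `xml.etree.ElementTree`, the text before the first child is the parent's `text`, and the text following a child is that child's `tail`. The sketch below uses a plain ASCII apostrophe in the motto.

```python
import xml.etree.ElementTree as ET

AIRLINE = """
<airline>
  <name> British Airways </name>
  <motto> World's <dubious> favorite </dubious> airline </motto>
</airline>
"""

motto = ET.fromstring(AIRLINE).find("motto")
dubious = motto.find("dubious")

before = motto.text.strip()    # PCDATA before the sub-element
inside = dubious.text.strip()  # PCDATA inside the sub-element
after = dubious.tail.strip()   # PCDATA after the sub-element

# itertext() walks all text fragments in document order.
full = " ".join("".join(motto.itertext()).split())
print(before, "|", inside, "|", after)
print(full)
```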

The Header Tag

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
The pseudo-attributes must appear in this order: version, encoding, standalone; standalone is either "yes" or "no"
standalone="no" means that the document may rely on an external DTD
You can leave out the encoding attribute; without it (and without a byte-order mark) the processor assumes UTF-8
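The effect of the encoding attribute can be seen by parsing raw bytes. In this sketch (Python's `xml.etree.ElementTree`, with an invented `<word>` element), the same word is stored once as ISO-8859-1 with an explicit declaration and once as UTF-8 with no declaration at all.

```python
import xml.etree.ElementTree as ET

# "café" encoded as ISO-8859-1: the byte 0xE9 is only valid because
# the header declares that encoding.
latin1_bytes = b'<?xml version="1.0" encoding="ISO-8859-1"?><word>caf\xe9</word>'

# The same word encoded as UTF-8, with no header: UTF-8 is assumed.
utf8_bytes = "<word>caf\u00e9</word>".encode("utf-8")

from_latin1 = ET.fromstring(latin1_bytes).text
from_utf8 = ET.fromstring(utf8_bytes).text
print(from_latin1, from_utf8)
```

Dropping the declaration from the first document would make it unreadable, since 0xE9 on its own is not a valid UTF-8 sequence.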

Processing Instructions

<?xml version="1.0"?>
<?xml-stylesheet href="doc.xsl" type="text/xsl"?>
<!DOCTYPE doc SYSTEM "doc.dtd">
<doc>Hello, world!<!-- Comment 1 --></doc>
<?pi-without-data?>
<!-- Comment 2 -->
<!-- Comment 3 -->

Using CDATA

<HEAD1>
  Entering a Kennel Club Member
</HEAD1>
<DESCRIPTION>
  Enter the member by the name on his or her papers. Use the NAME tag. The NAME tag has two attributes. Common (all in lowercase, please!) is the dog's call name. Breed (also in all lowercase) is the dog's breed. Please see the breed reference guide for acceptable breeds. Your entry should look something like this:
</DESCRIPTION>
<EXAMPLE>
  <![CDATA[<NAME common="freddy" breed="springer-spaniel">Sir Fredrick of Ledyard's End</NAME>]]>
</EXAMPLE>

We want to see the text as is, even though it includes tags
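A small experiment with Python's `xml.etree.ElementTree` shows what the parser does with a CDATA section: the markup inside it arrives as ordinary text and no `NAME` element is created. The attribute syntax inside the CDATA section is written in its well-formed form here.

```python
import xml.etree.ElementTree as ET

DOC = (
    "<EXAMPLE><![CDATA["
    '<NAME common="freddy" breed="springer-spaniel">'
    "Sir Fredrick of Ledyard's End</NAME>"
    "]]></EXAMPLE>"
)

example = ET.fromstring(DOC)

# The tags inside CDATA are plain characters, not child elements.
print(len(example))
print(example.text)

# When serialized again, the same characters are escaped instead.
serialized = ET.tostring(example, encoding="unicode")
print(serialized)
```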

A Complete XML Document

http://www.mscs.mu.edu/~praveen/Teaching/fa05/AdvDb/Lectures/bib.xml

Well-Formed XML Documents

An XML document (with or without a DTD) is well-formed if
  Tags are syntactically correct
  Every start tag has a matching end tag (or is written as an empty-element tag)
  Tags are properly nested
  There is a single root element
  A start tag does not have two occurrences of the same attribute

An XML document must be well formed
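These conditions can be checked mechanically, because a conforming parser refuses anything that is not well formed. The sketch below uses Python's `xml.etree.ElementTree`; `is_well_formed` and the tiny test documents are illustrative names and data, not part of the original slides.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """True if the parser accepts xml_text, False on any well-formedness error."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True

CASES = {
    "ok": "<a><b/></a>",
    "missing end tag": "<a><b></a>",
    "bad nesting": "<a><b></a></b>",
    "two roots": "<a/><b/>",
    "duplicate attribute": '<a x="1" x="2"/>',
}

results = {name: is_well_formed(doc) for name, doc in CASES.items()}
for name, ok in results.items():
    print(f"{name}: {ok}")
```

Well-formedness is independent of any DTD: none of these checks needs to know which tags or attributes are allowed.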

Representing relational databases

A relational database for school:

student:
  id   name  gpa
  001  Joe   3.0
  002  Mary  4.0
  …    …     …

course:
  cno  title  credit
  331  DB     3.0
  350  Web    3.0
  …    …      …

enroll:
  id   cno
  001  331
  001  350
  002  331
  …    …

XML representation

<school>
  <student id="001">
    <name> Joe </name>
    <gpa> 3.0 </gpa>
  </student>
  <student id="002">
    <name> Mary </name>
    <gpa> 4.0 </gpa>
  </student>
  <course cno="331">
    <title> DB </title>
    <credit> 3.0 </credit>
  </course>
  <course cno="350">
    <title> Web </title>
    <credit> 3.0 </credit>
  </course>

XML representation

  <enroll>
    <id> 001 </id>
    <cno> 331 </cno>
  </enroll>
  <enroll>
    <id> 001 </id>
    <cno> 350 </cno>
  </enroll>
  <enroll>
    <id> 002 </id>
    <cno> 331 </cno>
  </enroll>
</school>
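The mapping from tables to this XML layout is mechanical, which the following Python sketch tries to show. The rows are the ones listed in the school database; `build_school` is a hypothetical helper, and keys are emitted as attributes for `student` and `course` and as sub-elements for `enroll`, mirroring the representation above.

```python
import xml.etree.ElementTree as ET

students = [("001", "Joe", "3.0"), ("002", "Mary", "4.0")]
courses = [("331", "DB", "3.0"), ("350", "Web", "3.0")]
enrollments = [("001", "331"), ("001", "350"), ("002", "331")]

def build_school(students, courses, enrollments):
    """Emit one element per row, all under a single <school> root."""
    school = ET.Element("school")
    for sid, name, gpa in students:
        s = ET.SubElement(school, "student", id=sid)
        ET.SubElement(s, "name").text = name
        ET.SubElement(s, "gpa").text = gpa
    for cno, title, credit in courses:
        c = ET.SubElement(school, "course", cno=cno)
        ET.SubElement(c, "title").text = title
        ET.SubElement(c, "credit").text = credit
    for sid, cno in enrollments:
        e = ET.SubElement(school, "enroll")
        ET.SubElement(e, "id").text = sid
        ET.SubElement(e, "cno").text = cno
    return school

school = build_school(students, courses, enrollments)
if hasattr(ET, "indent"):  # pretty-printing helper, Python 3.9+
    ET.indent(school)
print(ET.tostring(school, encoding="unicode"))
```

Nothing in the XML itself enforces that an `enroll` row refers to an existing student or course; that kind of constraint is what a DTD or XML Schema would add.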