XML Documents and Schema in greater depth

In one sense XML is …

  • A language-neutral way of representing structured data
  • The analogy to a serialized object is the easiest one to understand in this context
  • A great intermediate data format for applications to talk cross-platform, cross-language, etc.

Equivalently, XML is

  • A flexible format for describing any kind of data (document).
      • Like HTML, but you can define whatever tags you want for your application.
      • Actually more like SGML: HTML is really a Document Type in SGML (Standard Generalized Markup Language).
  • A self-describing format:
      • An XML document gives complete information about which fields the values are associated with.
      • An application doesn't have to infer the field names from the order.
  • It just describes a document.
      • It doesn't say what it means.
      • It doesn't tell how to display it.
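As a small illustration of the self-describing point, here is one record written as XML. The element names are made up for this sketch (loosely modeled on the rooms example later in these slides); the comment shows the flat form a reader would otherwise have to decode by position.

```xml
<?xml version="1.0"?>
<!-- The same data as the flat record "Red,10,Projector", where the
     reader must already know which position holds which field. -->
<room>
  <name>Red</name>
  <capacity>10</capacity>
  <equipment>Projector</equipment>
</room>
```

Nothing here says what a room means or how to display it; the document only names the fields.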

Some sample XML documents

Article example

<Article>
  <Headline>Direct Marketer Offended by Term 'Junk Mail'</Headline>
  <authors>
    <author>Joe Garden</author>
    <author>Tim Harrod</author>
  </authors>
  <abstract>Dan Spengler, CEO of the direct-mail-marketing firm Mailbox of Savings, took umbrage Monday at the use of the term "junk mail."</abstract>
  <body type="url">http://www.theonion.com/archive/3-11-01.html</body>
</Article>

Order / Whitespace

Note that element order is important, but whitespace between tags generally is not. As far as the XML parser is concerned, this is the same Article document with a different layout:

<Article><Headline>Direct Marketer Offended by Term 'Junk Mail'</Headline>
<authors>
<author>Joe Garden</author><author>Tim Harrod</author>
</authors>
<abstract>Dan Spengler, CEO of the direct-mail-marketing firm Mailbox of Savings, took umbrage Monday at the use of the term "junk mail."</abstract>
<body type="url">http://www.theonion.com/archive/3-11-01.html</body>
</Article>

Molecule Example

<?xml version="1.0"?>
<CML>
  <MOL TITLE="Water">
    <ATOMS>
      <ARRAY BUILTIN="ELSYM">H O H</ARRAY>
    </ATOMS>
    <BONDS>
      <ARRAY BUILTIN="ATID1">1 2</ARRAY>
      <ARRAY BUILTIN="ATID2">2 3</ARRAY>
      <ARRAY BUILTIN="ORDER">1 1</ARRAY>
    </BONDS>
  </MOL>
</CML>

Rooms example

<?xml version="1.0"?>
<rooms>
  <room name="Red">
    <capacity>10</capacity>
    <equipmentList>
      <equipment>Projector</equipment>
    </equipmentList>
  </room>
  <room name="Green">
    <capacity>5</capacity>
    <equipmentList />
    <features>
      <feature>No Roof</feature>
    </features>
  </room>
</rooms>

Suggestion

  • Try building each of those documents in XMLSpy.
  • Note: you do not need to create a schema to do this. Just create a new XML document and start building.

Dissecting an XML Document

Things that can appear in an XML document

  • ELEMENTS: simple, complex, empty, or mixed content; attributes.
  • The XML declaration
  • Processing instructions (PIs): <? … ?>
      • Most common is <?xml-stylesheet … ?>
      • <?xml-stylesheet type="text/css" href="mys.css"?>
  • Comments: <!-- comment text -->
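To see these pieces together, here is a minimal sketch of one document that contains each of them. The note vocabulary (note, to, body, attachment, priority) is invented for illustration; only the declaration, the xml-stylesheet processing instruction, and the comment syntax come from the list above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- The line above is the XML declaration; this line is a comment. -->
<?xml-stylesheet type="text/css" href="mys.css"?>
<note priority="high">
  <!-- simple content: text only -->
  <to>Class</to>
  <!-- mixed content: text plus a child element -->
  <body>Please read the <em>first</em> chapter.</body>
  <!-- empty content -->
  <attachment/>
</note>
```

The root element note has complex content (it holds other elements) and carries one attribute, priority.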

Parts of an XML document

<?xml version="1.0"?>
<CML>
  <MOL TITLE="Water">
    <ATOMS>
      <ARRAY BUILTIN="ELSYM">H O H</ARRAY>
    </ATOMS>
    <BONDS>
      <ARRAY BUILTIN="ATID1">1 2</ARRAY>
      <ARRAY BUILTIN="ATID2">2 3</ARRAY>
      <ARRAY BUILTIN="ORDER">1 1</ARRAY>
    </BONDS>
  </MOL>
</CML>

Callouts on the slide label these parts of the document:

  • Declaration: <?xml version="1.0"?>
  • Tags: Begin Tags such as <ATOMS> and End Tags such as </ATOMS>
  • Attributes: for example BUILTIN="ELSYM"
  • Attribute Values: for example "ELSYM"

An XML element is everything from (including) the element's start tag to (including) the element's end tag.

XML and Trees

  • Tags give the structure of a document. They divide the document up into Elements, starting at the topmost element, the root element.
  • The stuff inside an element is its content. Content can include other elements along with 'character data'.

CML  (root element)
  MOL
    ATOMS
      ARRAY: H O H
    BONDS
      ARRAY: 1 2
      ARRAY: 2 3
      ARRAY: 1 1

The leaf values are the character data (the slide's callout reads "CDATA sections").

XML and Trees

<?xml version="1.0"?>
<CML>
  <MOL TITLE="Water">
    <ATOMS>
      <ARRAY BUILTIN="ELSYM">H O H</ARRAY>
    </ATOMS>
    <BONDS>
      <ARRAY BUILTIN="ATID1">1 2</ARRAY>
      <ARRAY BUILTIN="ATID2">2 3</ARRAY>
      <ARRAY BUILTIN="ORDER">1 1</ARRAY>
    </BONDS>
  </MOL>
</CML>

[The slide repeats the tree from the previous slide next to this document: CML is the root element, MOL is its child, ATOMS and BONDS are children of MOL, and the text inside each ARRAY forms the data sections.]

XML and Trees

rooms
  room
    capacity: 10
    equipmentList
      equipment: Projector
  room
    capacity: 5
    equipmentList
    features
      feature: No Roof

More detail on elements

Element relationships

<book>
  <title>My First XML</title>
  <prod id="33-657" media="paper"></prod>
  <chapter>Introduction to XML
    <para>What is HTML</para>
    <para>What is XML</para>
  </chapter>
  <chapter>XML Syntax
    <para>Elements must have a closing tag</para>
    <para>Elements must be properly nested</para>
  </chapter>
</book>

  • Book is the root element.
  • Title, prod, and chapter are child elements of book.
  • Book is the parent element of title, prod, and chapter.
  • Title, prod, and chapter are siblings (or sister elements) because they have the same parent.

Element content

Elements can have different content types.

An XML element is everything from (including) the element's start tag to (including) the element's end tag.

An element can have element content, mixed content, simple content, or empty content. An element can also have attributes.

In the previous example, book has element content, because it contains other elements. Chapter has mixed content because it contains both text and other elements. Para has simple content (or text content) because it contains only text. Prod has empty content, because there is nothing between its start and end tags (what it carries is in its id and media attributes).

Element naming

XML elements must follow these naming rules:

  • Names can contain letters, numbers, and other characters
  • Names must not start with a number or punctuation character (an underscore is allowed)
  • Names must not start with the letters xml (or XML or Xml ..)
  • Names cannot contain spaces

Take care when you "invent" element names and follow these simple rules:

  • Any name can be used, no words are reserved, but the idea is to make names descriptive. Names with an underscore separator are nice.

Examples: <first_name>, <last_name>.
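A quick sketch of the naming rules in practice. All names and values here are hypothetical; the illegal names are kept inside a comment because a document that actually used them would not parse.

```xml
<?xml version="1.0"?>
<person>
  <first_name>Jane</first_name>
  <last_name>Doe</last_name>
  <!-- a leading underscore and a trailing digit are both legal -->
  <_nickname>JD</_nickname>
  <address2>Second address line</address2>
  <!-- Not legal as element names, so shown only in this comment:
       <2nd_address>   starts with a digit
       <first name>    contains a space
       <.hidden>       starts with a punctuation character -->
</person>
```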

Element naming, cont.

  • Avoid "-" and "." in names. For example, if you name something "first-name," it could be a mess if your software tries to subtract name from first. Or if you name something "first.name," your software may think that "name" is a property of the object "first."
  • Element names can be as long as you like, but don't exaggerate. Names should be short and simple, like this: <book_title> not like this: <the_title_of_the_book>.
  • XML documents often have a corresponding database, in which fields exist corresponding to elements in the XML document. A good practice is to use the naming rules of your database for the elements in the XML documents.
  • Non-English letters like éòá are perfectly legal in XML element names, but watch out for problems if your software vendor doesn't support them.
  • The ":" should not be used in element names because it is reserved for namespaces (more later).

Well formed XML

Well-formed vs Valid

  • Recall that an XML document is said to be well-formed if it obeys basic semantic and syntactic constraints.
  • This is different from a valid XML document, which (as we will see in more depth) properly matches a schema.
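To make the distinction concrete, here is a sketch of a document that is well-formed but would not be valid against the Rooms schema shown later in these slides (which requires a name attribute on room and declares capacity as xs:decimal). The content is invented for illustration.

```xml
<?xml version="1.0"?>
<!-- Well-formed: one root element, balanced and properly nested tags.
     Not valid against the Rooms schema: room lacks the required name
     attribute, and the capacity text is not a decimal number. -->
<rooms>
  <room>
    <capacity>lots</capacity>
    <equipmentList/>
  </room>
</rooms>
```

Any XML parser should accept this document; only a validating step with the schema would reject it.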

Rules for Well-Formed XML

An XML document is considered well-formed if it obeys the following rules:

  • There must be one element that contains all others (root element)
  • All tags must be balanced
      • <BOOK>...</BOOK>
      • <BOOK />
  • Tags must be nested properly:
      • <BOOK> <LINE> This is OK </LINE> </BOOK>
      • <LINE> <BOOK> This is </LINE> definitely NOT </BOOK> OK
  • Tag names are case-sensitive, so
      • <P>This is not ok, even though we do it all the time in HTML!</p>

More Rules for Well-Formed XML

  • Attribute values in a tag must be in quotes
      • <ITEM CATEGORY="Home and Garden" Name="hoe-matic t500">
  • Comments are allowed
      • <!-- They are done just as in HTML… -->
  • Should begin with the XML declaration (strictly optional in XML 1.0, but recommended)
      • <?xml version='1.0' ?>
  • Special characters must be escaped: the most common are < " ' > &
      • <formula> x &lt; y+2x </formula>
      • <cd title="&quot; mmusic">

An XML document that obeys these rules is Well-Formed
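The escaping rule above can be sketched with the five predefined entities (&lt; &gt; &amp; &quot; &apos;). The element names and text below are made up for illustration; the CDATA section at the end is an alternative way to include markup-like text without escaping each character.

```xml
<?xml version="1.0"?>
<examples>
  <formula>x &lt; y + 2x</formula>
  <company>Smith &amp; Sons</company>
  <cd title="&quot;Music&quot; for a while"/>
  <quote speaker='O&apos;Brien'>a &gt; b</quote>
  <!-- Inside a CDATA section, < and & are taken literally -->
  <code><![CDATA[if (a < b && b > c) { ... }]]></code>
</examples>
```

In practice only < and & always need escaping in element text; quotes need it only inside an attribute value delimited by the same kind of quote.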

Creating XML

  • There are many XML editors.
      • Xeena
      • XMLSpy
      • Xeena on the CSPP machines
  • As with HTML, text editors are frequently the only thing available or the only thing that produces what you want
      • Test in IE6 or Netscape 7.0

Next Step

XML Schema

XML Schema

  • XML allows any sort of tag you want.
  • In a given application, you want to fix a vocabulary: what tags make sense.
  • Use a Schema to define an XML dialect
      • MusicXML, VoiceXML, ADXML, etc.
  • Restrict documents to those tags.
  • Anyone who has your Schema can validate their document to see if it obeys the rules of the dialect.

Schema determine …

  • What sort of elements can appear in the document.
  • What elements MUST appear
  • Which elements can appear as part of another element
  • What attributes can appear or must appear
  • What kind of values can/must be in an attribute.
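Each of the points above maps onto a schema construct. The following is a minimal sketch using an invented library/book vocabulary (none of these names come from the slides); the comments tie each declaration back to the list.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="library">
    <xs:complexType>
      <xs:sequence>
        <!-- which element can appear inside library, one or more times -->
        <xs:element name="book" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <!-- MUST appear: minOccurs defaults to 1 -->
              <xs:element name="title" type="xs:string"/>
              <!-- may appear: minOccurs is 0 -->
              <xs:element name="subtitle" type="xs:string" minOccurs="0"/>
            </xs:sequence>
            <!-- attribute that must appear -->
            <xs:attribute name="isbn" type="xs:string" use="required"/>
            <!-- optional attribute whose value must be a positive integer -->
            <xs:attribute name="pages" type="xs:positiveInteger"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```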

Rooms XML Schema

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" attributeFormDefault="unqualified">
  <xs:element name="rooms">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="room" minOccurs="0" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="capacity" type="xs:decimal"/>
              <xs:element name="equipmentList"/>
              <xs:element name="features" minOccurs="0">
                <xs:complexType>
                  <xs:sequence>
                    <xs:element name="feature" type="xs:string" maxOccurs="unbounded"/>
                  </xs:sequence>
                </xs:complexType>
              </xs:element>
            </xs:sequence>
            <xs:attribute name="name" type="xs:string" use="required"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

Bookings XML Schema

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" attributeFormDefault="unqualified">
  <xs:element name="bookings">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="lastUpdated" maxOccurs="1" minOccurs="0"/>
        <xs:element ref="meetingDate" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="year" type="xs:integer"/>
  <xs:element name="month" type="xs:string"/>
  <xs:element name="day" type="xs:integer"/>

Note that there are four global element declarations in this part of the schema (bookings, year, month, and day)! The schema continues on the following slides.

Bookings, cont.

  <xs:element name="meetingDate">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="year"/>
        <xs:element ref="month"/>
        <xs:element ref="day"/>
        <xs:element ref="meeting" maxOccurs="unbounded" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <xs:element name="lastUpdated">
    <xs:complexType>
      <xs:attribute name="date" type="xs:string"/>
      <xs:attribute name="time" type="xs:string"/>
    </xs:complexType>
  </xs:element>

Bookings, cont.

  <xs:element name="meeting">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="meetingName" maxOccurs="1" minOccurs="1" type="xs:string"/>
        <xs:element name="roomName" maxOccurs="1" minOccurs="1" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
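The Bookings schema mixes two styles: elements declared globally (direct children of xs:schema) and pulled in with ref, and elements declared locally inside a complex type (meetingName, roomName). A minimal sketch of the two side by side, using a made-up event element that is not part of the Bookings schema:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Global declaration: a direct child of xs:schema, reusable via ref -->
  <xs:element name="year" type="xs:integer"/>

  <xs:element name="event">
    <xs:complexType>
      <xs:sequence>
        <!-- Reference to the global year element declared above -->
        <xs:element ref="year"/>
        <!-- Local declaration: exists only inside event -->
        <xs:element name="title" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

One consequence worth knowing: any global element can legally serve as the root of an instance document, while local elements cannot.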

An Example Bookings Document

<?xml version="1.0" encoding="UTF-8"?>
<bookings xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="../schemas/Bookings.xsd">
  <meetingDate>
    <year>2003</year>
    <month>April</month>
    <day>1</day>
    <meeting>
      <meetingName>Democratic Party</meetingName>
      <roomName>Green Room</roomName>
    </meeting>
    <meeting>
      <meetingName>Republican Party</meetingName>
      <roomName>Red Room</roomName>
    </meeting>
  </meetingDate>
</bookings>

XML Schema (Document Type Definition)

  • A Schema (or the older DTD) is a specification: it specifies the language that you speak.
  • Check the DTDs for musicxml, adxml, etc. that are available off the course webpage
      • These give you the basic structure of each of these applications.
      • Not many schemas available, but much better
  • As we said before, like a user-defined type in a programming language. Also somewhat analogous to a database schema
      • says what are the components that can appear
      • gives default values and restrictions.

Dissecting Schema

What's in a Schema?

  • A Schema is an XML document (a DTD is not)
  • Because it is an XML document, it must have a root element
      • The root element is <schema>
  • Within the root element, there can be
      • Any number and combination of
          • Inclusions
          • Imports
          • Re-definitions
          • Annotations
      • Followed by any number and combination of
          • Simple and complex data type definitions
          • Element and attribute definitions
          • Model group definitions
          • Annotations

Structure of a Schema

<schema>
  <!-- any number of the following -->
  <include .../>
  <import> ... </import>
  <redefine> ... </redefine>
  <annotation> ... </annotation>

  <!-- any number of following definitions -->
  <simpleType> ... </simpleType>
  <complexType> ... </complexType>
  <element> ... </element>
  <attribute/>
  <attributeGroup> ... </attributeGroup>
  <group> ... </group>
  <annotation> ... </annotation>
</schema>

Simple Types

Elements

  • What is a Simple Element?
      • A simple element is an XML element that can contain only text. It cannot contain any other elements or attributes.
  • You can also add restrictions (facets) to a data type in order to limit its content, and you can require the data to match a defined pattern.
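As a sketch of the pattern facet mentioned above, the following declares a simple element whose text must be exactly five digits. The element name zip and the pattern are assumptions chosen for illustration, not something from the course material.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Simple element: text only, restricted by a pattern facet -->
  <xs:element name="zip">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:pattern value="[0-9]{5}"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>
```

Under this declaration <zip>60637</zip> would validate, while <zip>6063</zip> and <zip>abcde</zip> would not.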

Example Simple Element

The syntax for defining a simple element is:

<xs:element name="xxx" type="yyy"/>

where xxx is the name of the element and yyy is the data type of the element. Here are some XML elements:

<lastname>Refsnes</lastname>
<age>34</age>
<dateborn>1968-03-27</dateborn>

And here are the corresponding simple element definitions:

<xs:element name="lastname" type="xs:string"/>
<xs:element name="age" type="xs:integer"/>
<xs:element name="dateborn" type="xs:date"/>

Common XML Schema Data Types

XML Schema has a lot of built-in data types. Here is a list of the most common types:

  • xs:string
  • xs:decimal
  • xs:integer
  • xs:boolean
  • xs:date
  • xs:time
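For a feel of what values each built-in type accepts, here is a sketch of an instance fragment with one sample value per type. The element names are hypothetical; the comment above each line names the type such an element could be declared with.

```xml
<?xml version="1.0"?>
<samples>
  <!-- xs:string -->
  <text>any characters</text>
  <!-- xs:decimal -->
  <price>19.99</price>
  <!-- xs:integer -->
  <count>-42</count>
  <!-- xs:boolean: true, false, 1, or 0 -->
  <inStock>true</inStock>
  <!-- xs:date: YYYY-MM-DD -->
  <shipDate>2003-04-01</shipDate>
  <!-- xs:time: hh:mm:ss -->
  <shipTime>14:30:00</shipTime>
</samples>
```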

Declare Default and Fixed Values for Simple Elements

  • Simple elements can have a default value OR a fixed value set.
  • A default value is automatically assigned to the element when no other value is specified. In the following example the default value is "red":
      <xs:element name="color" type="xs:string" default="red"/>
  • A fixed value is also automatically assigned to the element. You cannot specify another value. In the following example the fixed value is "red":
      <xs:element name="color" type="xs:string" fixed="red"/>
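A sketch of how these declarations play out in an instance document. The palette wrapper is invented for illustration, and how (or whether) a defaulted value is surfaced to your code depends on the validating parser in use. Note that for elements, the default is applied when the element is present but empty.

```xml
<?xml version="1.0"?>
<palette>
  <!-- With default="red": an empty element is treated as containing "red" -->
  <color/>
  <!-- With default="red": an explicit value overrides the default -->
  <color>blue</color>
  <!-- With fixed="red" instead: an empty element or <color>red</color>
       would validate, but <color>blue</color> would be rejected -->
</palette>
```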

Attributes (Another simple type)

  • All attributes are declared as simple types.
  • Only complex elements can have attributes!

What is an Attribute?

  • Simple elements cannot have attributes.
      • If an element has attributes, it is considered to be of complex type.
      • But the attribute itself is always declared as a simple type.
  • This means that an element with attributes always has a complex type definition.

How to Define an Attribute

  • The syntax for defining an attribute is:
      <xs:attribute name="xxx" type="yyy"/>
    where xxx is the name of the attribute and yyy is the data type of the attribute.
  • Here is an XML element with an attribute:
      <lastname lang="EN">Smith</lastname>
  • And here is a corresponding simple attribute definition:
      <xs:attribute name="lang" type="xs:string"/>
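The attribute definition on its own does not say which element it belongs to. One common way to attach it, sketched here for the lastname example, is a complex type with simple content: the element keeps its text and gains the attribute. This is an illustrative arrangement, not necessarily the one used in the course schemas.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="lastname">
    <xs:complexType>
      <xs:simpleContent>
        <!-- The text content is an xs:string ... -->
        <xs:extension base="xs:string">
          <!-- ... extended with the lang attribute -->
          <xs:attribute name="lang" type="xs:string"/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

With this schema, <lastname lang="EN">Smith</lastname> validates, which matches the point that an element with attributes always has a complex type definition.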

Declare Default and Fixed Values for Attributes

  • Attributes can have a default value OR a fixed value specified.
  • A default value is automatically assigned to the attribute when no other value is specified. In the following example the default value is "EN":
      <xs:attribute name="lang" type="xs:string" default="EN"/>
  • A fixed value is also automatically assigned to the attribute. You cannot specify another value. In the following example the fixed value is "EN":
      <xs:attribute name="lang" type="xs:string" fixed="EN"/>

Creating Optional and Required Attributes

All attributes are optional by default. To explicitly specify that the attribute is optional, use the "use" attribute:

  <xs:attribute name="lang" type="xs:string" use="optional"/>

To make an attribute required:

  <xs:attribute name="lang" type="xs:string" use="required"/>
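An attribute declaration has to live inside a type. As a hedged sketch, this is one way the required "lang" attribute could be attached to the lastname element from the earlier example; the choice of a text-only element extended with an attribute is an assumption made for illustration.

```xml
<xs:element name="lastname">
  <xs:complexType>
    <xs:simpleContent>
      <!-- text content of type xs:string, plus one attribute -->
      <xs:extension base="xs:string">
        <xs:attribute name="lang" type="xs:string" use="required"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:element>
```

With this declaration, `<lastname lang="EN">Smith</lastname>` would validate, while `<lastname>Smith</lastname>` would be rejected because the attribute is missing.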


Restrictions

As we will see later, simple types can have ranges put on their values.

These are known as restrictions.


Complex Types


Complex Elements

A complex element is an XML element that contains other elements and/or attributes.

There are four kinds of complex elements:
  empty elements
  elements that contain only other elements
  elements that contain only text
  elements that contain both other elements and text

Note: Each of these elements may contain attributes as well!


Examples of Complex XML Elements

A complex XML element, "product", which is empty:

  <product pid="1345"/>

A complex XML element, "employee", which contains only other elements:

  <employee>
    <firstname>John</firstname>
    <lastname>Smith</lastname>
  </employee>

A complex XML element, "food", which contains only text:

  <food type="dessert">Ice cream</food>


Examples, cont.

A complex XML element, "description", which contains both elements and text:

  <description>
    It happened on <date lang="norwegian">03.03.99</date> ....
  </description>
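For reference, a schema declaration that would accept the "description" element above might look like the following sketch. The mixed="true" setting is what permits text between child elements; typing "date" as xs:string (because "03.03.99" is not in the xs:date lexical form) is an assumption made for this illustration.

```xml
<xs:element name="description">
  <!-- mixed="true" allows character data around the child elements -->
  <xs:complexType mixed="true">
    <xs:sequence>
      <xs:element name="date">
        <xs:complexType>
          <xs:simpleContent>
            <xs:extension base="xs:string">
              <xs:attribute name="lang" type="xs:string"/>
            </xs:extension>
          </xs:simpleContent>
        </xs:complexType>
      </xs:element>
    </xs:sequence>
  </xs:complexType>
</xs:element>
```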


An Example XML Schema

  <?xml version="1.0" encoding="UTF-8"?>
  <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
             elementFormDefault="qualified" attributeFormDefault="unqualified">
    <xs:element name="rooms">
      <xs:complexType>
        <xs:sequence>
          <xs:element name="room" minOccurs="0" maxOccurs="unbounded">
            <xs:complexType>
              <xs:sequence>
                <xs:element name="capacity" type="xs:decimal"/>
                <xs:element name="equiptmentList"/>
                <xs:element name="features" minOccurs="0">
                  <xs:complexType>
                    <xs:sequence>
                      <xs:element name="feature" type="xs:string" maxOccurs="unbounded"/>
                    </xs:sequence>
                  </xs:complexType>
                </xs:element>
              </xs:sequence>
              <xs:attribute name="name" type="xs:string" use="required"/>
            </xs:complexType>
          </xs:element>
        </xs:sequence>
      </xs:complexType>
    </xs:element>
  </xs:schema>
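A document that should validate against this schema is sketched below. The room names, capacities, feature values, and the file name rooms.xsd are invented for illustration; only the element and attribute names come from the schema.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<rooms xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:noNamespaceSchemaLocation="rooms.xsd">
  <room name="Seminar A">
    <capacity>30</capacity>
    <equiptmentList/>
    <!-- "features" is optional; when present it needs at least one "feature" -->
    <features>
      <feature>projector</feature>
      <feature>whiteboard</feature>
    </features>
  </room>
  <room name="Lab 2">
    <capacity>12</capacity>
    <equiptmentList/>
  </room>
</rooms>
```

Because the schema has no targetNamespace, the instance elements are in no namespace and the schema is referenced with xsi:noNamespaceSchemaLocation.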


Referencing XML Schema in XML documents


Sample Schema header

The <schema> element may contain some attributes. A schema declaration often looks something like this:

  <?xml version="1.0"?>
  <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
             targetNamespace="http://www.w3schools.com"
             xmlns="http://www.w3schools.com"
             elementFormDefault="qualified">
    ...
  </xs:schema>


Schema headers, cont.

The following fragment:

  xmlns:xs="http://www.w3.org/2001/XMLSchema"

indicates that the elements and data types used in the schema (schema, element, complexType, sequence, string, boolean, etc.) come from the "http://www.w3.org/2001/XMLSchema" namespace.

It also specifies that the elements and data types that come from the "http://www.w3.org/2001/XMLSchema" namespace should be prefixed with xs:


Schema header, cont.

This fragment:

  targetNamespace="http://www.w3schools.com"

indicates that the elements defined by this schema (note, to, from, heading, body) come from the "http://www.w3schools.com" namespace.

This fragment:

  xmlns="http://www.w3schools.com"

indicates that the default namespace is "http://www.w3schools.com".

This fragment:

  elementFormDefault="qualified"

indicates that any elements used by the XML instance document which were declared in this schema must be namespace qualified.
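The effect of elementFormDefault is easiest to see with an explicit prefix. The two instance documents below are a sketch that assumes "note" is declared globally and "to" is declared locally inside it; the prefix n is arbitrary.

```xml
<!-- Document 1: schema uses elementFormDefault="qualified".
     Every element, local or global, is in the target namespace. -->
<n:note xmlns:n="http://www.w3schools.com">
  <n:to>Tove</n:to>
  <!-- from, heading, body omitted for brevity -->
</n:note>

<!-- Document 2: schema uses elementFormDefault="unqualified" (the default).
     Only the globally declared root is qualified; local children are not. -->
<n:note xmlns:n="http://www.w3schools.com">
  <to>Tove</to>
  <!-- from, heading, body omitted for brevity -->
</n:note>
```

Using "qualified" together with a default namespace declaration in the instance, as in the header above, lets the document avoid prefixes altogether.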


Referencing schema in XML

This XML document has a reference to an XML Schema:

  <?xml version="1.0"?>
  <note xmlns="http://www.w3schools.com"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.w3schools.com note.xsd">
    <to>Tove</to>
    <from>Jani</from>
    <heading>Reminder</heading>
    <body>Don't forget me this weekend!</body>
  </note>


Referencing schema in XML, cont.

The following fragment:

  xmlns="http://www.w3schools.com"

is the default namespace declaration.

This declaration tells the schema validator that all the unprefixed elements used in this XML document belong to the "http://www.w3schools.com" namespace.


…

Once you have the XML Schema Instance namespace available:

  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

you can use the schemaLocation attribute. This attribute holds two values separated by whitespace. The first value is the namespace to use. The second value is the location of the XML schema to use for that namespace:

  xsi:schemaLocation="http://www.w3schools.com note.xsd"
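The same attribute can carry more than one namespace/location pair when a document mixes vocabularies. The sketch below assumes a second, purely hypothetical namespace http://example.com/ext with a schema file ext.xsd.

```xml
<note xmlns="http://www.w3schools.com"
      xmlns:ext="http://example.com/ext"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.w3schools.com note.xsd
                          http://example.com/ext   ext.xsd">
  <!-- content omitted -->
</note>
```

The values are read in pairs: namespace first, schema location second. The locations are hints, so a validator may be configured to find the schemas some other way.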


Using References


Using References

You don't have to have the content of an element defined in the nested fashion as just shown:

  <xs:element name="rooms">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="room">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="capacity" type="xs:decimal"/>

You can define the element elsewhere and use a reference to it instead:

  <xs:element name="rooms">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="room"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <xs:element name="room"> … </xs:element>


Rooms Schema using References

  <?xml version="1.0" encoding="UTF-8"?>
  <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
             elementFormDefault="qualified" attributeFormDefault="unqualified">
    <xs:element name="rooms">
      <xs:complexType>
        <xs:sequence>
          <xs:element ref="room" maxOccurs="unbounded"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>

    <xs:element name="room">
      <xs:complexType>
        <xs:sequence>
          <xs:element name="capacity" type="xs:decimal"/>
          <xs:element name="equiptmentList"/>
          <xs:element ref="features" minOccurs="0" maxOccurs="1"/>
        </xs:sequence>
        <xs:attribute name="name" type="xs:string" use="required"/>
      </xs:complexType>
    </xs:element>


A Rooms Schema using References (Cont.)

    <xs:element name="features">
      <xs:complexType>
        <xs:sequence>
          <xs:element name="feature" type="xs:string" maxOccurs="unbounded"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>
  </xs:schema>


A Rooms Schema using References (Graphical)

[Three slides of schema diagrams; the images are not reproduced here.]


Types

Both elements and attributes have types, which are defined in the Schema.

One can reuse types by giving them names.

  <xsd:element name="Robot">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="Sensor_List" minOccurs="0"/>
        <xsd:element ref="Specification_List" minOccurs="0"/>
        <xsd:element ref="Note" minOccurs="0"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

OR

  <xsd:element name="Robot" type="RoboType"/>
  <xsd:complexType name="RoboType">
    <xsd:sequence>
      <xsd:element ref="Sensor_List" minOccurs="0"/>
      <xsd:element ref="Specification_List" minOccurs="0"/>
      <xsd:element ref="Note" minOccurs="0"/>
    </xsd:sequence>
  </xsd:complexType>


Other XML Schema Features

Foreign key facility (uses XPath)

Rich datatype facility
  Build up datatypes by inheritance
  Don't need to list all of the attributes (can say "these attributes plus others").
  Restrict strings using regular expressions

Namespace aware.
  Can restrict the location of an element based on its namespace
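As a sketch of the foreign key facility, the fragment below uses xs:key and xs:keyref with XPath selectors. The "building" and "booking" elements and the constraint names are hypothetical, and the schema is assumed to have no target namespace so the XPath expressions need no prefixes.

```xml
<xs:element name="building">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="room" maxOccurs="unbounded">
        <xs:complexType>
          <xs:attribute name="name" type="xs:string" use="required"/>
        </xs:complexType>
      </xs:element>
      <xs:element name="booking" minOccurs="0" maxOccurs="unbounded">
        <xs:complexType>
          <xs:attribute name="room" type="xs:string" use="required"/>
        </xs:complexType>
      </xs:element>
    </xs:sequence>
  </xs:complexType>

  <!-- Primary key: each room name must be present and unique within a building -->
  <xs:key name="roomKey">
    <xs:selector xpath="room"/>
    <xs:field xpath="@name"/>
  </xs:key>

  <!-- Foreign key: each booking must refer to an existing room name -->
  <xs:keyref name="bookingRoomRef" refer="roomKey">
    <xs:selector xpath="booking"/>
    <xs:field xpath="@room"/>
  </xs:keyref>
</xs:element>
```

The identity constraints are placed after the element's complexType, and their scope is the element that contains them, here each "building".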


Restrictions


Datatype Restrictions

A DTD can only say that an element such as zip can be any non-markup text. Translated to Schema, that looks like this:

  <xsd:element name="zip" type="xsd:string"/>

But in Schema you can do better:

  <xsd:element name="zip" type="xsd:decimal"/>

Or even make your own restrictions:

  <xsd:simpleType name="ZipPlus4">
    <xsd:restriction base="xsd:string">
      <xsd:length value="10"/>
      <xsd:pattern value="\d{5}-\d{4}"/>
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:element name="zip" type="ZipPlus4"/>


Restriction Ranges

The restrictions must be "derived" from a base type, so it's object based:

  <xs:element name="LifeUniverseAndEverything">
    <xs:simpleType>
      <xs:restriction base="xs:integer">
        <xs:minInclusive value="42"/>
        <xs:maxInclusive value="42"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>

The preceding type is "derived" from "integer" and has 2 restrictions (called "facets"):
  The first says that the value must be greater than or equal to 42 (that is, greater than 41)
  The second says that the value must be less than or equal to 42 (that is, less than 43)

The only valid content for the XML file is "42":

  <LifeUniverseAndEverything
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="LifeUniverseEverything.xsd">42</LifeUniverseAndEverything>
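The "greater than 41, less than 43" reading corresponds directly to the exclusive facets. As a sketch, the following declaration should accept exactly the same values as the inclusive version for an integer base type:

```xml
<xs:element name="LifeUniverseAndEverything">
  <xs:simpleType>
    <xs:restriction base="xs:integer">
      <!-- strictly greater than 41 and strictly less than 43 leaves only 42 -->
      <xs:minExclusive value="41"/>
      <xs:maxExclusive value="43"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
```

For a non-integer base such as xs:decimal the two versions would differ, since values like 41.5 satisfy the exclusive bounds but not the inclusive ones.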


Facet            Description

enumeration      Defines a list of acceptable values
fractionDigits   The maximum number of decimal places allowed. Must be >= 0
length           The exact number of characters or list items allowed. Must be >= 0
maxExclusive     The upper bound for numeric values (the value must be less than the value specified)
maxInclusive     The upper bound for numeric values (the value must be less than or equal to the value specified)
maxLength        The maximum number of characters or list items allowed. Must be >= 0
minExclusive     The lower bound for numeric values (the value must be greater than the value specified)
minInclusive     The lower bound for numeric values (the value must be greater than or equal to the value specified)
minLength        The minimum number of characters or list items allowed. Must be >= 0
pattern          The sequence of acceptable characters based on a regular expression
totalDigits      The maximum number of digits allowed. Must be > 0
whiteSpace       Specifies how white space (line feeds, tabs, spaces, and carriage returns) is handled
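Several facets can be combined in one restriction. The following sketch defines a hypothetical PriceType (the name and the limits are assumptions chosen for illustration) that allows non-negative decimals with at most six digits in total and at most two after the decimal point:

```xml
<xs:simpleType name="PriceType">
  <xs:restriction base="xs:decimal">
    <xs:totalDigits value="6"/>      <!-- at most 6 digits overall -->
    <xs:fractionDigits value="2"/>   <!-- at most 2 digits after the point -->
    <xs:minInclusive value="0"/>     <!-- no negative prices -->
  </xs:restriction>
</xs:simpleType>
```

Under this type, 1234.56 and 0.5 would be accepted, while 1234.567 (too many decimal places), 12345678 (too many digits), and -1 (below the minimum) would be rejected.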


Enumeration Facet

  <xs:element name="FavoriteColor">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:enumeration value="red"/>
        <xs:enumeration value="no blue"/>
        <xs:enumeration value="aarrrrggghh!!"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>


Patterns (Regular Expressions)

One interesting facet is the pattern, which allows restrictions based on a regular expression.

This regular expression specifies a word of one or more ASCII letters:

  <xs:element name="Word">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:pattern value="[a-zA-Z]+"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>


Patterns (Regular Expressions)

Individual characters may be repeated a specific number of times in the regular expression.

The following regular expression restricts the string to exactly 8 alpha-numeric characters:

  <xs:element name="password">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:pattern value="[a-zA-Z0-9]{8}"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>


Whitespace facet

The "whiteSpace" facet controls how whitespace in the element will be processed.

There are three possible values for the whiteSpace facet:
  "preserve" causes the processor to keep all whitespace as-is
  "replace" causes the processor to replace all whitespace characters (tabs, carriage returns, line feeds, spaces) with space characters
  "collapse" causes the processor to replace all strings of whitespace characters (tabs, carriage returns, line feeds, spaces) with a single space character, and to strip leading and trailing whitespace

  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:whiteSpace value="replace"/>
    </xs:restriction>
  </xs:simpleType>
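For comparison, here is a sketch of a named type that uses "collapse" instead. The type name CollapsedString and the sample content are illustrative assumptions.

```xml
<xs:simpleType name="CollapsedString">
  <xs:restriction base="xs:string">
    <xs:whiteSpace value="collapse"/>
  </xs:restriction>
</xs:simpleType>

<!-- For an element of this type, the content
       "  Round
          Table  "
     is normalized to "Round Table" before the other facets are checked. -->
```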


Types

Both elements and attributes have types, which are defined in the Schema.

One can reuse types by giving them names.

Addr.xsd:

  <xsd:element name="Address">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="Street" type="xsd:string"/>
        <xsd:element name="Apartment" type="xsd:string"/>
        <xsd:element name="Zip" type="xsd:string"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>

OR

  <xsd:complexType name="AddrType">
    <xsd:sequence>
      <xsd:element name="Street" type="xsd:string"/>
      <xsd:element name="Apartment" type="xsd:string"/>
      <xsd:element name="Zip" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:element name="ShipAddress" type="AddrType"/>
  <xsd:element name="BillAddress" type="AddrType"/>


Types

The usage in the XML file is identical:

  <?xml version="1.0" encoding="UTF-8"?>
  <PurchaseOrder xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                 xsi:noNamespaceSchemaLocation="Address-WithTypeName.xsd">
    <BillAddress>
      <Street>1108 E. 58th St.</Street>
      <Apartment>Ryerson 155</Apartment>
      <Zip>60637</Zip>
    </BillAddress>
    <ShipAddress>
      <Street>1108 E. 58th St.</Street>
      <Apartment>Ryerson 155</Apartment>
      <Zip>60637</Zip>
    </ShipAddress>
  </PurchaseOrder>


Type Extensions

A third way of creating a complex type is to extend another complex type (like OO inheritance):

  <xs:element name="Employee" type="PersonInfoType"/>

  <xs:complexType name="PersonNameType">
    <xs:sequence>
      <xs:element name="FirstName" type="xs:string"/>
      <xs:element name="LastName" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <xs:complexType name="PersonInfoType">
    <xs:complexContent>
      <xs:extension base="PersonNameType">
        <xs:sequence>
          <xs:element name="Address" type="xs:string"/>
          <xs:element name="City" type="xs:string"/>
          <xs:element name="Country" type="xs:string"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>


Type Extensions (use)

An element whose type is an extension of another type is used as though everything were defined in a single type. The inherited elements come first, followed by the added ones:

  <Employee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:noNamespaceSchemaLocation="TypeExtension.xsd">
    <FirstName>King</FirstName>
    <LastName>Arthur</LastName>
    <Address>Round Table</Address>
    <City>Camelot</City>
    <Country>England</Country>
  </Employee>


Simple Content in Complex Type

If a type contains only simple content (text and attributes), a <simpleContent> element can be put inside the <complexType>.

<simpleContent> must have either an <extension> or a <restriction>.

This example is from the (Bridge of Death) Episode Dialog:

  <xs:element name="dialog">
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base="xs:string">
          <xs:attribute name="speaker" type="xs:string" use="required"/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>
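An instance of this element carries text plus the required attribute and no child elements. The speaker names and lines below are illustrative placeholders, not taken from the slides.

```xml
<!-- valid: text content plus the required "speaker" attribute -->
<dialog speaker="Bridgekeeper">What is your name?</dialog>
<dialog speaker="Arthur">It is Arthur, King of the Britons.</dialog>

<!-- invalid: the required "speaker" attribute is missing -->
<dialog>What is your quest?</dialog>
```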


Model Groups

Model Groups are used to define an element that has
- mixed content (elements and text mixed)
- element content

Model Groups can be
- all: the elements specified must all be there, but in any order
- choice: exactly one of the elements specified appears (for each occurrence of the group)
- sequence: all of the elements specified must appear in the specified order
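
The sequence and choice groups can be sketched side by side. The element names below (Book, Title, Author, Contact, Phone, Email) are hypothetical and chosen only for illustration.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- sequence: Title must come first, then Author -->
  <xs:element name="Book">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Title" type="xs:string"/>
        <xs:element name="Author" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <!-- choice: a Contact holds one Phone or one Email, not both -->
  <xs:element name="Contact">
    <xs:complexType>
      <xs:choice>
        <xs:element name="Phone" type="xs:string"/>
        <xs:element name="Email" type="xs:string"/>
      </xs:choice>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

Adding minOccurs and maxOccurs to the group itself (for example on xs:choice) controls how many times the whole group may repeat.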

"All" Model Group

The following schema specifies 3 elements and mixed content:

    <xs:element name="BookCover">
      <xs:complexType mixed="true">
        <xs:all minOccurs="0" maxOccurs="1">
          <xs:element name="BookTitle" type="xs:string"/>
          <xs:element name="Author" type="xs:string"/>
          <xs:element name="Publisher" type="xs:string"/>
        </xs:all>
      </xs:complexType>
    </xs:element>

The following XML file is valid in the above schema:

    <BookCover xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xsi:noNamespaceSchemaLocation="AllModelGroup.xsd">
      Title: <BookTitle>The Holy Grail</BookTitle>
      Published: <Publisher>Moose</Publisher>
      Author: <Author>Monty Python</Author>
    </BookCover>


Attributes

    <xs:element name="dialog">
      <xs:complexType>
        <xs:simpleContent>
          <xs:extension base="xs:string">
            <xs:attribute name="speaker" type="xs:string" use="required"/>
          </xs:extension>
        </xs:simpleContent>
      </xs:complexType>
    </xs:element>
    …

The attribute declaration is part of the type of the element.


Attributes

    <xsd:element name="cartoon">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element ref="character" minOccurs="0" maxOccurs="unbounded"/>
        </xsd:sequence>
        <xsd:attribute name="name" type="xsd:string" use="required"/>
        <xsd:attribute name="genre" type="xsd:string" use="required"/>
        <xsd:attribute name="syndicated" use="required">
          <xsd:simpleType>
            <xsd:restriction base="xsd:NMTOKEN">
              <xsd:enumeration value="yes"/>
              <xsd:enumeration value="no"/>
            </xsd:restriction>
          </xsd:simpleType>
        </xsd:attribute>
      </xsd:complexType>
    </xsd:element>

If an attribute type is more complicated than a basic type, then we spell out the type in a type declaration.
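
For illustration, an instance that should satisfy the cartoon declaration might look like this. The attribute values are made up, and the element is left empty because the character element is declared elsewhere and is optional here (minOccurs="0").

```xml
<!-- All three attributes are required; syndicated must be exactly "yes" or "no". -->
<cartoon name="Dilbert" genre="office humor" syndicated="yes"/>

<!-- syndicated="maybe" would fail validation: it is not one of the
     enumerated values in the anonymous simpleType. -->
```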


Optional and Required Attributes

All attributes are optional by default. To explicitly specify that the attribute is optional, use the "use" attribute:

    <xs:attribute name="speaker" type="xs:string" use="optional"/>

To make an attribute required:

    <xs:attribute name="speaker" type="xs:string" use="required"/>


Other XML Schema Features

- Foreign key facility (uses XPath)
- Rich datatype facility
  - Build up datatypes by inheritance
  - Don’t need to list all of the attributes (can say “these attributes plus others”)
  - Restrict strings using regular expressions
- Namespace aware
  - Can restrict location of an element based on a namespace
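
Several of these features can be sketched in one small schema. Every name below (ZipType, PersonType, EmployeeType, Staff, Employee, Manages, id, who) is hypothetical; the sketch shows a regular-expression restriction, type inheritance by extension, an attribute wildcard for "these attributes plus others", and a key/keyref pair selected with XPath.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Restrict a string with a regular expression: exactly five digits -->
  <xs:simpleType name="ZipType">
    <xs:restriction base="xs:string">
      <xs:pattern value="[0-9]{5}"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Base type; anyAttribute admits attributes beyond the declared id -->
  <xs:complexType name="PersonType">
    <xs:sequence>
      <xs:element name="Name" type="xs:string"/>
    </xs:sequence>
    <xs:attribute name="id" type="xs:string" use="required"/>
    <xs:anyAttribute processContents="lax"/>
  </xs:complexType>

  <!-- Inheritance: EmployeeType is PersonType plus a Zip element -->
  <xs:complexType name="EmployeeType">
    <xs:complexContent>
      <xs:extension base="PersonType">
        <xs:sequence>
          <xs:element name="Zip" type="ZipType"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <xs:element name="Staff">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Employee" type="EmployeeType" maxOccurs="unbounded"/>
        <xs:element name="Manages" minOccurs="0" maxOccurs="unbounded">
          <xs:complexType>
            <xs:attribute name="who" type="xs:string" use="required"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
    <!-- Foreign key: every Manages/@who must match some Employee/@id -->
    <xs:key name="EmployeeKey">
      <xs:selector xpath="Employee"/>
      <xs:field xpath="@id"/>
    </xs:key>
    <xs:keyref name="ManagesRef" refer="EmployeeKey">
      <xs:selector xpath="Manages"/>
      <xs:field xpath="@who"/>
    </xs:keyref>
  </xs:element>

</xs:schema>
```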


XML Schema Status

- Became a W3C recommendation Spring 2001
- World domination expected imminently.
- Supported in Xalan.
- Supported in XMLSpy and other editor/validators.

On the other hand:
- More complex than DTDs.
- Ultra verbose.


Validating a Schema

- By using Xeena or XMLSpy or XML Notepad.
  - When publishing hand-written XML docs, this is the way to go.
- By using a Java program that performs validation.
  - When validating on-the-fly, must do it this way.


Some guidelines for Schema design


Designing a Schema

- Analogous to database schema design: look for intuitive names
- Can start with an E-R diagram, and then convert
  - Attributes to Attributes
  - Subobjects to Subelements
  - Relationships to IDREFS
- Normalization? Still makes sense to avoid repetition whenever possible
  - If you have an Enrolment document, only list Ids of students, not their names.
  - Store names in a separate document
  - Leave it to tools to connect them
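
The normalization advice can be sketched as two small documents. All element and attribute names here (Students, Student, Name, Enrolment, Course, StudentRef, id, code) are assumptions made for illustration. The names are stored once:

```xml
<!-- students.xml: each student's name appears only here -->
<Students>
  <Student id="s1001"><Name>Arthur</Name></Student>
  <Student id="s1002"><Name>Robin</Name></Student>
</Students>
```

The enrolment document then refers to students by Id only, leaving it to tools to join the two:

```xml
<!-- enrolment.xml: lists Ids, never names -->
<Enrolment>
  <Course code="XML101">
    <StudentRef id="s1001"/>
    <StudentRef id="s1002"/>
  </Course>
</Enrolment>
```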


Designing a Schema (cont.)

Difficulties:
- Many more degrees of freedom than with database schemas: e.g. one can associate information with something by including it as an attribute or a subelement.

    <ADDRESS NAME="Martin Sheen" Street="1222 Alameda Drive" City="Carmel" State="CA" ZIP="40145"/>

    <ADDRESS>
      <NAME> Martin Sheen </NAME>
      …
      <ZIP> 40145 </ZIP>
    </ADDRESS>

- ELEMENTS are more extensible: use when there is a possibility that more substructure will be added.
- ATTRIBUTES are easier to search on.


“Rules” for Designing a Schema

Never leave structure out. The following is definitely a bad idea:

    <ADDRESS> Martin Sheen 1222 Alameda Drive, Carmel, CA 40145 </ADDRESS>

Better would be:

    <ADDRESS firstName="Martin" lastName="Sheen" streetNum="1222" streetName="Alameda Drive" city="Carmel" state="CA" zip="40145" />

Or:

    <ADDRESS>
      <name>
        <first>Martin</first><last>Sheen</last>
      </name>
      <street>
        <num>1222</num><name>Alameda Drive</name>
      </street>
      <city>Carmel</city>
      <state>CA</state><zip>40145</zip>
    </ADDRESS>


More “Rules” for Designing a Schema

When to use Elements (instead of attributes)

- Do not put large text blocks inside an attribute (Bad Idea):

    <book type="memoir" content="Bravely bold Sir Robin rode forth from Camelot.
    He was not afraid to die, O brave Sir Robin.
    He was not at all afraid to be killed in nasty ways,
    Brave, brave, brave, brave Sir Robin!
    He was not in the least bit scared to be mashed into a pulp,
    Or to have his eyes gouged out and his elbows broken,
    To have his kneecaps split and his body burned away
    And his limbs all hacked and mangled, brave Sir Robin!
    His head smashed in and his heart cut out
    And his liver removed and his bowels unplugged…">

- Elements are more flexible, so use an Element if you think you might have to add more substructure later on.


More “Rules” for Designing Schemas

More on when to use Elements (instead of Attributes)

- Use an embedded element when the information you are recording is a constituent part of the parent element
  - one's head and one's height are both inherent to a human being,
  - you can't be a conventionally structured human being without having a head and having a height
  - One's head is a constituent part and one's height isn't: you can cut off my head, but not my height
- use embedded elements for complex structure validation (obvious)
- use embedded elements when you need to show order (attributes are not ordered)


More “Rules” for Designing Schemas

When to use Attributes instead of Elements

- use an attribute when the information is inherent to the parent but not a constituent part (height instead of head)
- use attributes to stress the one-to-one relationship among pieces of information
  - to stress that the element represents a tuple of information
  - dangerous rule, though
    - Leads to the extreme formulation that a <chapter> element can have a TITLE= attribute
    - And then to the conclusion that it really ought to have a CONTENT= attribute too
    - Then you find yourself writing the entire document as an empty element with an attribute value as long as the Quest for the Holy Grail
- use attributes for simple datatype validation (obviously)
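
The head-versus-height rule of thumb can be sketched as follows. The element and attribute names (person, head, eyes, height) and the values are purely illustrative.

```xml
<!-- height is inherent to a person but is not a part of one: an attribute.
     head is a constituent part with its own substructure: a child element. -->
<person height="180cm">
  <head>
    <eyes>2</eyes>
  </head>
</person>
```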


Schema Notes

- Fully supported (now) in XMLSpy
- Unknown if supported in Xalan, but probably
- Fully supported in Xerces DOM & SAX
  - validation works
- Unknown if supported in JAXM/JAXP
- Java JDK 1.4 does not support schemas
- Nice set of schemas at http://www.griphyn.org/working_groups/VDS/