Advanced Database Topics
Copyright © Ellis Cohen 2002-2005
Introduction to XML
These slides are licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License.
For more information on how you may use them, please see http://www.openlineconsult.com/db
Overview of XML / Database Topics
Modeling XML Data (DTD's & XML Schema)
Querying XML Data (XPath & XQuery)
XML Persistent Storage
Client Access to Local & Persistent XML Data
Modifying Persistent XML Data
Using XML Models with RDBs and OODBs
Integrated Access to Relational, OO and XML Data Sources
Integrating XML into RDBs
Topics for This Lecture
Introduction to XML
DTD's: Document Type Definitions
XML Schema
Introduction to XML
XML Data Representation
RDBs are about representing and storing data as relations
OODBs are about representing and storing data as networks of objects
XML is about hierarchical data representations – data represented as trees
Such data can not only be stored in an XML DB, but also has a standard textual format -- XML
XML as a Textual Representation
eXtensible Markup Language
Mechanism for tagging text to describe the meaning of the information
Human Readable (not necessarily easily…)
Looks similar to HTML tags: This is <b>very</b> important
But
• Tags define the meaning of the information, not its presentation
• Arbitrary rather than fixed set of tags
• Documents may optionally be typed, which defines the set of tags allowed in a document and how and where they can be used
• Everything in XML is case-sensitive (content, tags, and attributes)
HTML Example
<h1>Books for CS779</h1>
<p><b>Database Design, Implementation & Management, 5th Edition</b><br>
Rob & Coronel<br>
<i>Course Technology</i>
<p><b>Professional XML Databases</b><br>
Williams<br>
<i>Wrox Press</i>
XML Example
Prolog:
<?xml version="1.0"?>
<!DOCTYPE CourseBooks SYSTEM "http://…/cbooks.dtd">
Body:
<CourseBooks>
  <Course>CS779</Course>
  <Book>
    <Title>Database Design, Implementation &amp; Management, 5th Edition</Title>
    <Author>Rob &amp; Coronel</Author>
    <Publisher>Course Technology</Publisher>
  </Book>
  <Book>
    <Title>Professional XML Databases</Title>
    <Author>Williams</Author>
    <Publisher>Wrox Press</Publisher>
  </Book>
</CourseBooks>
<!-- That's all folks -->
XML Example Tree
root
  CourseBooks (element node)
    Course (element node)
      "CS779" (text node)
    Book (element node)
      Title: "…"
      Author: "Rob & Coronel"
      Publisher: "…"
    Book
      …
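The tree view can be reproduced with a few lines of Python. This is only a sketch using the standard library's xml.etree.ElementTree; the helper name outline is made up for this illustration, and the ampersands are written as &amp; so the body is well-formed.

```python
import xml.etree.ElementTree as ET

# Body of the CourseBooks example (ampersands written as &amp;).
doc = """<CourseBooks>
  <Course>CS779</Course>
  <Book>
    <Title>Database Design, Implementation &amp; Management, 5th Edition</Title>
    <Author>Rob &amp; Coronel</Author>
    <Publisher>Course Technology</Publisher>
  </Book>
  <Book>
    <Title>Professional XML Databases</Title>
    <Author>Williams</Author>
    <Publisher>Wrox Press</Publisher>
  </Book>
</CourseBooks>"""


def outline(elem, depth=0):
    """Return one indented line per element node, with its text (if any)."""
    text = (elem.text or "").strip()
    lines = ["  " * depth + elem.tag + (f': "{text}"' if text else "")]
    for child in elem:
        lines.extend(outline(child, depth + 1))
    return lines


tree_lines = outline(ET.fromstring(doc))
print("\n".join(tree_lines))
```

Each element node becomes one line, and the character data inside a leaf element is the text node hanging below it.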
Tags Provide Meaning
Tags
– describe the meaning of the content
– make the document self-defining
Compare to untagged, delimited text
Database Design, Implementation & Management, 5th Edition|Rob & Coronel|Course Technology
Professional XML Databases|Williams|Wrox Press
Unusual Character Data
What to do if your character data has special characters in it -- e.g. angle brackets?
CDATA
<Title><![CDATA[Why X < Y]]></Title>
Entity References
<Title>Why X &lt; Y</Title>
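A quick check that both spellings carry the same character data. This is a sketch with Python's standard xml.etree.ElementTree; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# Same title, once protected by a CDATA section, once by an entity reference.
cdata = ET.fromstring("<Title><![CDATA[Why X < Y]]></Title>")
entity = ET.fromstring("<Title>Why X &lt; Y</Title>")

# After parsing, the application sees identical text in both cases.
print(cdata.text == entity.text)
```

The choice between the two is therefore about readability of the source text, not about what the application receives.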
Attributes
<Book>
  <Title>Professional XML Databases</Title>
  <Author>Williams</Author>
  <Publisher>Wrox Press</Publisher>
</Book>
vs
<Book title="Professional XML Databases" author="Williams" publisher="Wrox Press"/>
Maximal Use of Attributes
Prolog:
<?xml version="1.0"?>
<!DOCTYPE CourseBooks SYSTEM "http://…/cbooks.dtd">
Body:
<CourseBooks course="CS779">
  <Book title="Database Design, Implementation &amp; Management, 5th Edition" author="Rob &amp; Coronel" publisher="Course Technology"/>
  <Book title="Professional XML Databases" author="Williams" publisher="Wrox Press"/>
</CourseBooks>
An attribute can generally only be used to represent a single value.
If you want to represent structured data or a list of values or data items, you should use an element
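The difference between the two styles shows up in how a program reads the data. A minimal sketch with Python's standard xml.etree.ElementTree (variable names are illustrative):

```python
import xml.etree.ElementTree as ET

element_style = ET.fromstring(
    "<Book><Title>Professional XML Databases</Title>"
    "<Author>Williams</Author><Publisher>Wrox Press</Publisher></Book>"
)
attribute_style = ET.fromstring(
    '<Book title="Professional XML Databases" author="Williams" '
    'publisher="Wrox Press"/>'
)

# Child elements are looked up by tag; attributes are looked up by name.
title_from_element = element_style.findtext("Title")
title_from_attribute = attribute_style.get("title")

# An attribute value is a single string: it cannot hold nested structure,
# and an attribute name can appear only once per element.
child_count = len(attribute_style)  # number of child elements
print(title_from_element, "|", title_from_attribute, "|", child_count)
```

The attribute form is more compact, but a second author would need either a repeated child element or a list packed into one string.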
XML with Attributes
root
  CourseBooks (element node)
    course (attribute node)
    Book (element node)
      title, author, publisher (attribute nodes)
    Book
      …
XML can be UnNormalized
root
  CourseBooks
    Book
      title
      Author
        name, address, dob
      publisher
    …
Suppose someone authored multiple books
XML can be Normalized
root
  BookDB
    Booklist
      Book
        title
        publisher
        AuthorRef (Note this is not an attribute!)
      …
    Authlist
      Author
        authid, name, address, dob
      …
Normalized XML
<?xml version="1.0"?>
<!DOCTYPE BookDB SYSTEM "http://…/bookdb.dtd">
<BookDB>
  <Booklist>
    <Book title="Furniture Design, Implementation &amp; Management, 5th Edition" publisher="Course Technology">
      <AuthorRef>Wil4421</AuthorRef>
    </Book>
    <Book title="Cool ideas" publisher="Mumbo Jumbo Books">
      <AuthorRef>Wil4421</AuthorRef>
    </Book>
    <Book title="Professional Peach Databases" publisher="Wrox Press">
      <AuthorRef>Wil4421</AuthorRef>
      <AuthorRef>Bor601</AuthorRef>
    </Book>
    …
  </Booklist>
  <Authlist>
    <Author authid="Wil4421" name="Juan Williams" address="…" dob="11-04-42"/>
    <Author authid="Bor601" name="Gonzo Borscht" address="…" dob="3-17-88"/>
    …
  </Authlist>
</BookDB>
No redundancy, often similar to the way information would be stored in an RDB
A book may have multiple authors
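With normalized XML the application has to follow each AuthorRef to its Author, much like a join. A sketch of that lookup with Python's standard xml.etree.ElementTree, on a shortened copy of the BookDB body (address and dob omitted for brevity):

```python
import xml.etree.ElementTree as ET

doc = """<BookDB>
  <Booklist>
    <Book title="Cool ideas" publisher="Mumbo Jumbo Books">
      <AuthorRef>Wil4421</AuthorRef>
    </Book>
    <Book title="Professional Peach Databases" publisher="Wrox Press">
      <AuthorRef>Wil4421</AuthorRef>
      <AuthorRef>Bor601</AuthorRef>
    </Book>
  </Booklist>
  <Authlist>
    <Author authid="Wil4421" name="Juan Williams"/>
    <Author authid="Bor601" name="Gonzo Borscht"/>
  </Authlist>
</BookDB>"""

root = ET.fromstring(doc)

# Index the authors by authid (the "primary key").
authors = {a.get("authid"): a.get("name") for a in root.iter("Author")}

# Resolve each book's AuthorRef elements (the "foreign keys") to names.
books = {
    b.get("title"): [authors[ref.text] for ref in b.findall("AuthorRef")]
    for b in root.iter("Book")
}
print(books)
```

Each author's details are stored once, no matter how many books refer to them.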
Storing Information in an RDB
BookList
title | authrefs | publisher
Furniture Design, Implementation & Management, 5th Edition | Wil4421 | Course Technology
Cool Ideas | Wil4421 | Mumbo Jumbo Books
Professional Peach Databases | Wil4421 Bor601 | Wrox Press
… | … | …
Not 1NF (authrefs can hold a list of values)
AuthList
authid | name | address | dob
Wil4421 | Juan Williams | … | 11-04-42
Bor601 | Gonzo Borscht | … | 3-17-88
… | … | … | …
Structure of XML
Tags: Book, Title, Author, Publisher
<Author>Rob &amp; Coronel</Author>
  element = start tag (<Author>) + content (Rob &amp; Coronel) + end tag (</Author>)
<Book title="Let's have fun"/>
  empty element (start & end tag are combined; no content, only attributes)
  the trailing /> is the empty element indicator
XML for Structured Text
XML represents structured text if the content of an element is either
– Character data (i.e. untagged text)
<Author>Williams</Author>
– A sequence of one or more elements
<Book>
  <Title>Database Design, Implementation &amp; Management, 5th Edition</Title>
  <Author>Rob &amp; Coronel</Author>
  <Publisher>Course Technology</Publisher>
</Book>
XML for Semi-Structured Text
In semi-structured text, the content of an element may contain a sequence of untagged text and elements, which are intermixed.
<Description><Author>Williams</Author> is the author of <Title>Professional XML Databases</Title>, which is published by <Publisher>Wrox Press</Publisher>
</Description>
More like HTML markup, but with tags used to identify useful information.
Description Example Tree
Description (element node)
  Author (element node): "Williams"
  " is the author of " (text node)
  Title (element node): "Professional XML Databases"
  ", which is published by " (text node)
  Publisher (element node): "Wrox Press"
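The interleaving of text nodes and element nodes can be listed programmatically. This is a sketch with Python's standard xml.etree.ElementTree, which has no separate text-node objects: text that follows a child element is stored in that child's tail attribute.

```python
import xml.etree.ElementTree as ET

desc = ET.fromstring(
    "<Description><Author>Williams</Author> is the author of "
    "<Title>Professional XML Databases</Title>, which is published by "
    "<Publisher>Wrox Press</Publisher></Description>"
)

# Rebuild the child sequence of Description in document order.
nodes = []
if desc.text:                      # text before the first child, if any
    nodes.append(("text", desc.text))
for child in desc:
    nodes.append((child.tag, child.text))
    if child.tail:                 # text between this child and the next
        nodes.append(("text", child.tail))

# Dropping the tags gives back the plain sentence.
full_text = "".join(desc.itertext())
print(nodes)
print(full_text)
```

The tagged elements can be queried like structured data, while the untagged text keeps the sentence readable.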
Vocabularies
XML Tags to use in specific domains
e.g. Business, Legal, Music, Robotics, Math, Chemical, Genetic
Some Business Vocabularies:
Commerce XML (cXML) developed by Ariba and Microsoft
BizTalk
RosettaNet
OFE: Open Financial Exchange
Go to http://www.xml.org/xml/registry.jsp for a detailed list of XML specs
Namespaces
Allows libraries of tags to be combined without collisions
<CourseBooks xmlns:isbn="http://www.isbn.org" course="CS779">
  <Book>
    <Title>Professional XML Databases</Title>
    <Author>Williams</Author>
    <Publisher>Wrox Press</Publisher>
    <isbn:Number>1861003587</isbn:Number>
  </Book>
</CourseBooks>
xmlns:isbn="http://www.isbn.org" declares the local namespace prefix (isbn) and the namespace identification (the URI)
isbn:Number uses the namespace prefix to (potentially) disambiguate the tag
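A parser resolves the prefix to the namespace name, so isbn:Number and an unprefixed Number are different tags. A sketch with Python's standard xml.etree.ElementTree, which writes qualified names as {uri}localname:

```python
import xml.etree.ElementTree as ET

doc = """<CourseBooks xmlns:isbn="http://www.isbn.org" course="CS779">
  <Book>
    <Title>Professional XML Databases</Title>
    <isbn:Number>1861003587</isbn:Number>
  </Book>
</CourseBooks>"""

book = ET.fromstring(doc).find("Book")

# The prefix itself is not significant; the namespace name is.
tags = [child.tag for child in book]

# A query supplies its own prefix-to-namespace mapping.
number = book.findtext("isbn:Number", namespaces={"isbn": "http://www.isbn.org"})
print(tags, number)
```

Two documents may bind different prefixes to the same namespace name and still refer to the same tag.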
Uses of XML
• Document Storage and Retrieval
XML databases, with mechanisms for storage, retrieval (querying) and updates, both of structured & semi-structured text
• Communication (e.g. SOAP)
Messages
Remote procedure calls
• Specifying Configurations & Metadata
Resource files, initialization files, description files, etc. (e.g. WSDL)
• Procedural and Descriptive Languages
JSP, XML Schema, XQueryX, XSL
When is it advantageous to persistently store information in an XML tree representation rather than an RDB?
Advantages of XML over RDB
• When you want to store and query significant amounts of semi-structured text
• When there are many tags and attributes which are optional and are used relatively infrequently
JSP
<c:forEach var="customer" items="${customers}">
  <c:if test="${book.price <= customer.limit}">
    <c:out value="${book.title}"/> fits <c:out value="${customer.name}"/>'s budget!<br>
  </c:if>
</c:forEach>
Some XML-based Languages
XML Schema
A language for defining the type of other XML documents (including specifying the allowable tags and how they can be used)
XQueryX
An XML representation of the XQuery language for locating and retrieving parts of XML documents
XSL
A language for specifying the style of another XML document.
Defines how the XML document should be converted to HTML or some other presentation format.
More generally used for transforming an XML document to a different format (which may or may not be XML-based)
XML Type Definitions
We want to define the content of elements and attributes in XML just as we define the types of fields in a relational DB
There are 2 ways to define the type of XML documents
• DTD (Document Type Definition)
Not XML-Based
Used originally for defining SGML (which predates the web)
(XML is a simplification of SGML)
• XML Schema
Schema definition is XML-Based
Focus on reusability
Finer grain constraints
XML and Database Evolution
Hierarchical DB
Network DB
XML DB
OO/OR DB
Relational DB
An XML DB is used for storage and retrieval of hierarchically organized information
RDB, ORDB, OODB, XMLDB
 | RDB | ORDB | OODB | XMLDB
Define | SQL DDL | | |
Query | SQL DML | | |
Modify | SQL DML | | |
Fill-in the rest …
RDB, ORDB, OODB, XMLDB
 | RDB | ORDB | OODB | XMLDB
Define | SQL DDL | SQL DDL | ODL | DTD, XSchema
Query | SQL DML | SQL DML | OQL (or OPath or other variants) | XPath, XQuery, XSLT
Modify | SQL DML | SQL DML | OOPLs | DOM API's, XUpdate, SQL (if XRDB)
DTD's: Document Type Definitions
Document Type Definition
Formal grammar to specify structure and permissible values
Valid XML is well-formed syntactically
Valid XML conforms to the rules of the vocabulary
DTD provides means of validating XML Documents
DTD also provides documentation of the vocabulary
DOCTYPE declaration to specify a DTD
<!DOCTYPE Books SYSTEM "http://…/books.dtd">
Example DTD
<!ELEMENT Title (#PCDATA)>
<!ELEMENT Author (#PCDATA)>
<!ELEMENT Publisher (#PCDATA)>
<!ELEMENT Course (#PCDATA)>
– Title, Author, Publisher & Course elements have strictly textual content
<!ELEMENT Book (Title, Author, Publisher?)>
– The content of a Book consists of a Title element, followed by an Author element, optionally (?) followed by a Publisher element. Order matters!
<!ELEMENT CourseBooks (Course, Book+)>
– The content of a CourseBooks element consists of a Course element, followed by one or more (+) Book elements
Example CourseBooks XML
Prolog:
<?xml version="1.0"?>
<!DOCTYPE CourseBooks SYSTEM "http://…/cbooks.dtd">
Body:
<CourseBooks>
  <Course>CS779</Course>
  <Book>
    <Title>Database Design, Implementation &amp; Management, 5th Edition</Title>
    <Author>Rob &amp; Coronel</Author>
    <Publisher>Course Technology</Publisher>
  </Book>
  <Book>
    <Title>Professional XML Databases</Title>
    <Author>Williams</Author>
    <!-- No publisher, it's optional! -->
  </Book>
</CourseBooks>
<!-- That's all folks -->
Content Model
<!ELEMENT name content>
Content can be
– ANY: Anything
<!ELEMENT Description ANY>
– EMPTY: Must be an EMPTY element
<!ELEMENT Details EMPTY>
– regexp: a parenthesized regular expression
<!ELEMENT CourseBooks (Course, Book+)>
Regular Expressions
• Sequence
( exp1, …, expn )
where each exp can be an element name, #PCDATA (character data), a sequence, an alternative, or a repetition
Sequence of first regexp, then second, etc.
e.g. (#PCDATA)
e.g. (Course, Book+)
e.g. (Title, Author, Publisher)
• Alternative
( exp1 | … | expn )
where each exp can be an element name, #PCDATA (character data), a sequence, an alternative, or a repetition
Any one of the expressions listed
e.g. ( Publisher | isbn:Number )
• Repetition
– exp+ -- 1 or more
– exp* -- 0 or more
– exp? -- optional
where each exp can be an element name, a sequence, or an alternative
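Because a content model is a regular expression over child element names, it can be checked with an ordinary regex engine. The sketch below is not a DTD validator: it assumes element-only models (no #PCDATA, ANY or EMPTY), and the helper names model_to_regex and children_match are invented for this illustration.

```python
import re
import xml.etree.ElementTree as ET


def model_to_regex(model):
    """Translate an element-only DTD content model into a Python regex.

    The regex is meant to be matched against the child tags written as
    "Tag,Tag,...," (every tag followed by a comma).
    """
    def translate(match):
        token = match.group()
        if token == ",":            # DTD sequence operator: plain juxtaposition
            return ""
        return f"(?:{re.escape(token)},)"   # element name: one tagged child

    # Parentheses, |, ?, * and + mean the same in both notations.
    return re.sub(r",|[A-Za-z_][\w.:-]*", translate, model.replace(" ", ""))


def children_match(elem, model):
    """True if the sequence of elem's child tags matches the content model."""
    child_tags = "".join(child.tag + "," for child in elem)
    return re.fullmatch(model_to_regex(model), child_tags) is not None


book = ET.fromstring("<Book><Title/><Author/></Book>")
swapped = ET.fromstring("<Book><Author/><Title/></Book>")
print(children_match(book, "(Title, Author, Publisher?)"))
print(children_match(swapped, "(Title, Author, Publisher?)"))
```

The second call fails only because of the order of the children, which is the constraint the sequence operator imposes.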
Element-ary Problem
<!ELEMENT Note (#PCDATA)>
<!ELEMENT Relevance (#PCDATA)>
<!ELEMENT Info (#PCDATA | (Note, Relevance?))*>
What are some examples of XML bodies (no prolog) whose root element is Info that would be valid for this DTD?
Note: XML 1.0 itself only accepts mixed content of the form (#PCDATA | name1 | … | namen)*, so read this declaration as an exercise in the regular expression notation.
Some Element-ary Solutions
<Info></Info>
<Info>Hello</Info>
<Info><Note>ok</Note></Info>
<Info> <Note>this is a note</Note> hello <Note>ok</Note><Relevance>13</Relevance> <Note>another note</Note></Info>
Alternative Nesting Styles
There are three alternative DTDs for BillingData below. Give an example for each one that involves 3 payments & 3 charges.
Under which circumstances would you recommend each of the styles?
1) <!ELEMENT BillingData (Payment | Charge)*>
2) <!ELEMENT BillingData (Payment*, Charge*)>
3) <!ELEMENT BillingData (Payments, Charges)>
<!ELEMENT Payments (Payment*)>
<!ELEMENT Charges (Charge*)>
Interleaved vs Grouped
1) Intermixed charges & payments
<BillingData>
  <Charge> … </Charge>
  <Payment> … </Payment>
  <Charge> … </Charge>
  <Charge> … </Charge>
  <Payment> … </Payment>
</BillingData>
2) All payments followed by all charges
<BillingData>
  <Payment> … </Payment>
  <Payment> … </Payment>
  <Charge> … </Charge>
  <Charge> … </Charge>
  <Charge> … </Charge>
</BillingData>
Separated vs SubGrouped
2) All payments followed by all charges
<BillingData>
  <Payment> … </Payment>
  <Payment> … </Payment>
  <Charge> … </Charge>
  <Charge> … </Charge>
  <Charge> … </Charge>
</BillingData>
3) All payments & all charges respectively placed within distinct parent elements
<BillingData>
  <Payments>
    <Payment> … </Payment>
    <Payment> … </Payment>
  </Payments>
  <Charges>
    <Charge> … </Charge>
    <Charge> … </Charge>
    <Charge> … </Charge>
  </Charges>
</BillingData>
Ordering Problem
Suppose a book consists of a title and an author, but they could appear in either order.
What's the corresponding DTD definition for Book?
Ordering Solution
<!ELEMENT Book ((Title, Author) | (Author, Title) )>
Suppose a book consists of a single title, a single author, and a single publisher, but they could appear in any order.
What's the corresponding DTD definition for Book?
Triple Ordering Solution
<!ELEMENT Book ( (Author, Title, Publisher) | (Author, Publisher, Title) | (Title, Author, Publisher) | (Title, Publisher, Author) | (Publisher, Author, Title) | (Publisher, Title, Author) )>
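Writing every ordering by hand is error-prone, and the number of alternatives grows as n! (6 for three elements, 24 for four). A small sketch that generates the alternatives with Python's standard itertools; the function name any_order_model is made up for this illustration.

```python
from itertools import permutations


def any_order_model(names):
    """Build a DTD content model that allows each name once, in any order."""
    alternatives = ["(" + ", ".join(p) + ")" for p in permutations(names)]
    return "( " + " | ".join(alternatives) + " )"


model = any_order_model(["Author", "Title", "Publisher"])
declaration = f"<!ELEMENT Book {model}>"
print(declaration)
```

This blow-up is one reason the any-order case is awkward to express in a DTD.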
Suppose a book can contain a title and at least one author in any order. What's the DTD?
How about a book which contains at least one author and any number of co-authors in any order?
Mixed Order with Repetition
<!ELEMENT Book ((Author+, Title, Author*) | (Author*, Title, Author+))>
<!ELEMENT Book (Coauthor*, Author, (Coauthor*, Author*)*)>
Limitations of Regular Expressions
Impose unwanted constraints on order
<!ELEMENT Book (Title, Author, Publisher)>
Can be too vague
<!ELEMENT Book (Title | Author | Publisher)*>
Note: could be helped by adding the interleaved combination operator & (not supported in DTD standard)
(Title & Author & Publisher)
(Title & Author+)
(Author+ & Coauthor*)
Attributes
<!ATTLIST Book
  title CDATA #REQUIRED
  author CDATA #REQUIRED
  publisher CDATA #IMPLIED
  isbn:number CDATA #IMPLIED>
A Book is required to have a title and an author attribute, and optionally may have a publisher and isbn:number attribute.
The attributes may appear in any order!
Elements & Attributes
ELEMENT descriptions:
Define an element's sub-elements (and the order in which they must appear!)
ATTLIST descriptions:
Define an element's attributes (which can appear in any order)
The designer determines which content should be represented by elements & which by attributes.
Changing this requires changing the DTD
Revised CourseBooks DTD
<!ELEMENT Book EMPTY>
<!ATTLIST Book
  title CDATA #REQUIRED
  author CDATA #REQUIRED
  publisher CDATA #IMPLIED
  isbn:number CDATA #IMPLIED>
– A Book has no sub-elements (EMPTY), but must have a title and author attribute, and optionally has a publisher and isbn number. These can appear in any order.
<!ELEMENT CourseBooks (Book+)>
<!ATTLIST CourseBooks
  course ID #REQUIRED>
– A CourseBooks element has one or more Books, and also has a course attribute, which must be unique
XML Example for Revised DTD
Prolog:
<?xml version="1.0"?>
<!DOCTYPE CourseBooks SYSTEM "http://…/cbooks.dtd">
Body:
<CourseBooks xmlns:isbn="http://www.isbn.org" course="CS779">
  <Book title="Database Design, Implementation &amp; Management, 5th Edition" author="Rob &amp; Coronel"/>
  <Book author="Williams" title="Professional XML Databases" publisher="Wrox Press" isbn:number="304-22-15678"/>
</CourseBooks>
An attribute can generally only be used to represent a single value.
If you want to represent structured data or a list of values or data items, you should use an element
Attribute Kinds in DTDs
#REQUIRED | must be present
#IMPLIED | optional
value | default value
#FIXED value | the only allowed value
Attribute Types in DTDs
CDATA | character data
ID | key value definition
IDREF | reference to an id'd element
IDREFS | list of id references (blank-separated)
NMTOKEN | must be a valid XML name token
NMTOKENS | list of valid XML name tokens
ENTITY | non-text content (e.g. gif)
Enumerations | e.g. (Monday | Wednesday | Friday)
Note: attributes could look like XML, but would just be strings with angle brackets, and no substructure
ID and IDREF
<!ATTLIST Book
  title CDATA #REQUIRED
  id ID #REQUIRED
  pubref IDREF #IMPLIED
  isbn:number CDATA #IMPLIED
  sources IDREFS #IMPLIED>
<CourseBooks course="CS779">
  <Book id="ddim5" title="Database …" …/>
  <Book id="pxd" title="Professional XML Databases" pubref="Wrox" />
  <Book id="gdb" title="Great Database Book" sources="ddim5 pxd" … />
</CourseBooks>
IDREFS are allowed, but not widely used
IDREF & IDREFS support referential integrity.
Each id reference in an IDREF or IDREFS attribute must match the value of some ID attribute
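The check a validating parser performs can be sketched by hand. Python's standard xml.etree.ElementTree does not read DTDs, so the sets naming the ID, IDREF and IDREFS attributes below are assumptions copied from the ATTLIST above; the elided attributes of the slide's example are left out.

```python
import xml.etree.ElementTree as ET

doc = """<CourseBooks course="CS779">
  <Book id="ddim5" title="Database …"/>
  <Book id="pxd" title="Professional XML Databases" pubref="Wrox"/>
  <Book id="gdb" title="Great Database Book" sources="ddim5 pxd"/>
</CourseBooks>"""

# Attribute types as declared in the ATTLIST (hard-coded for this sketch).
ID_ATTRS = {"id"}
REF_ATTRS = {"pubref", "sources"}   # IDREF and IDREFS

root = ET.fromstring(doc)

# Collect every ID value in the document.
ids = {e.get(a) for e in root.iter() for a in ID_ATTRS if e.get(a) is not None}

# An IDREFS value is a blank-separated list; an IDREF is a list of one.
dangling = [
    (e.get("id"), attr, ref)
    for e in root.iter()
    for attr in REF_ATTRS
    if e.get(attr) is not None
    for ref in e.get(attr).split()
    if ref not in ids
]
print(dangling)
```

In this fragment pubref="Wrox" is reported because the element carrying the ID Wrox is not part of the excerpt; the sources references both resolve.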
Stylistic Consistency
The designer of a DTD has many stylistic choices
– The order of elements
– When to use elements and when to use attributes
– Whether lists of ids or names should be represented as a single whitespace-separated attribute or as repeated elements
– Whether repeated elements should be nested inside a collection element
These are aesthetic choices that should be made consistently.
Alternative Representations for Repeated ID's
<Book title="Professional Peach Databases">
  <AuthorRef>Wil4421</AuthorRef>
  <AuthorRef>Bor601</AuthorRef>
</Book>
<Book title="Professional Peach Databases">
  <AuthorRef authref="Wil4421"/>
  <AuthorRef authref="Bor601"/>
</Book>
<Book title="Professional Peach Databases" authrefs="Wil4421 Bor601"/>
For each case, what are the corresponding DTD's?
Alternative DTDs
<!ELEMENT Book (AuthorRef*)>
<!ATTLIST Book
  title CDATA #REQUIRED>
<!ELEMENT AuthorRef (#PCDATA)>
<!ELEMENT Book (AuthorRef*)>
<!ATTLIST Book
  title CDATA #REQUIRED>
<!ELEMENT AuthorRef EMPTY>
<!ATTLIST AuthorRef
  authref IDREF #REQUIRED>
<!ELEMENT Book EMPTY>
<!ATTLIST Book
  title CDATA #REQUIRED
  authrefs IDREFS #REQUIRED>
Limitations of DTDs
• Limited support for reusability and namespaces
• No interleaved combination operator for regular expressions
• Type specifications (e.g. ID, IDREF) only allowed for attributes, not text contents
– Whitespace-separated lists only supported for attributes, and only for NMTOKENS and IDREFS
• No way to constrain values
• Very primitive referential integrity
XML Schema
XML Schema Example
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Title" type="xs:string"/>
  <xs:element name="Author" type="xs:string"/>
  <xs:element name="Publisher" type="xs:string"/>
  <xs:element name="Course" type="xs:NMTOKEN"/>
  <xs:element name="Book">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="Title"/>
        <xs:element ref="Author"/>
        <xs:element ref="Publisher" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="CourseBooks">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="Course"/>
        <xs:element ref="Book" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
Regular Expressions in XML Schema (with equivalent DTD)
<xs: …X… minOccurs="0" maxOccurs="unbounded"/> is equivalent to X*
<xs: …X… minOccurs="1" maxOccurs="unbounded"/> is equivalent to X+
<xs: …X… minOccurs="0" maxOccurs="1"/> is equivalent to X?
<xs:sequence> A B C </xs:sequence> is equivalent to (A, B, C)
<xs:choice> A B C </...> is equivalent to (A | B | C)
<xs:all> A B C </...> is equivalent to (A & B & C)
• xs:all can only appear as single child of a complexType
• children of xs:all can only be elements with maxOccurs=1
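The occurrence mapping can be written as a small lookup. This is a sketch; the function name dtd_suffix is made up, and it relies on both minOccurs and maxOccurs defaulting to 1 in XML Schema.

```python
def dtd_suffix(min_occurs=1, max_occurs=1):
    """Return the DTD repetition suffix for an XML Schema occurrence range.

    Raises ValueError for ranges a single DTD suffix cannot express.
    """
    table = {
        (1, 1): "",              # exactly once (the XML Schema default)
        (0, 1): "?",             # optional
        (0, "unbounded"): "*",   # zero or more
        (1, "unbounded"): "+",   # one or more
    }
    key = (min_occurs, max_occurs)
    if key not in table:
        raise ValueError(f"no single DTD suffix for occurrence range {key}")
    return table[key]


print("Book" + dtd_suffix(1, "unbounded"))
print("Publisher" + dtd_suffix(0, 1))
```

A range such as minOccurs="2" maxOccurs="5" has no single suffix, which is one place where XML Schema is more expressive than a DTD.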
Attributes in XML Schema
<xs:element name="CourseBooks">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="Book" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="course" type="xs:NMTOKEN" use="required"/>
  </xs:complexType>
</xs:element>
One or more attributes can be associated with any complexType
Mixed Content & Any Type
<xs:complexType mixed="true">
Pro: Allows separation of type constraints from decision to allow mixed text
Con: Not possible to constrain more exactly where mixed text is allowed (though there rarely is a need to constrain it)
<xs:element name="anything" type="xs:anyType"/>
Means anything is permitted there
Simple Types
string, CDATA, token, NMTOKEN, NMTOKENS, ID, IDREF, IDREFS, ENTITY, ENTITIES, …
boolean
decimal, integer
long, int, short, byte
unsignedLong, unsignedInt, unsignedShort, unsignedByte
nonPositiveInteger, negativeInteger, nonNegativeInteger, positiveInteger
float
duration, dateTime, time, date, …
hexBinary, …
anyURI
…
Facets of Simple Types
• Facets are additional properties restricting a simple type
• 15 facets defined by XML Schema
length, minLength, maxLength, pattern, enumeration, whiteSpace
maxInclusive, maxExclusive, minInclusive, minExclusive, totalDigits, fractionDigits
Simple Type Definitions
<xs:simpleType name="integer">
  <xs:restriction base="xs:decimal">
    <xs:fractionDigits value="0" fixed="true"/>
  </xs:restriction>
</xs:simpleType>
An integer is a decimal with no fractionDigits
<xs:simpleType name="nonPositiveInteger">
  <xs:restriction base="xs:integer">
    <xs:maxInclusive value="0" />
  </xs:restriction>
</xs:simpleType>
A nonPositiveInteger is an integer whose largest value is 0
Primitive List Types
<xs:simpleType name="NMTOKENS">
  <xs:restriction>
    <xs:simpleType>
      <xs:list itemType="xs:NMTOKEN"/>
    </xs:simpleType>
    <xs:minLength value="1"/>
  </xs:restriction>
</xs:simpleType>
NMTOKENS is a whitespace-separated list of NMTOKENs
Patterns, Enumerations & Unions
<xs:simpleType name="isbnType">
  <xs:union>
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:pattern value="[0-9]{10}"/>
      </xs:restriction>
    </xs:simpleType>
    <xs:simpleType>
      <xs:restriction base="xs:NMTOKEN">
        <xs:enumeration value="TBD"/>
        <xs:enumeration value="NA"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:union>
</xs:simpleType>
An isbnType is either a string consisting of 10 digits or is one of the strings TBD or NA
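The rule expressed by isbnType can be mimicked in a few lines. This is a sketch, not a schema validator: the function name is_isbn_type is made up, and re.fullmatch is used because an xs:pattern must match the whole value.

```python
import re


def is_isbn_type(value):
    """True if value is accepted by either member type of the union."""
    ten_digits = re.fullmatch(r"[0-9]{10}", value) is not None  # xs:pattern
    placeholder = value in {"TBD", "NA"}                        # xs:enumeration
    return ten_digits or placeholder


for candidate in ["1861003587", "TBD", "304-22-15678"]:
    print(candidate, is_isbn_type(candidate))
```

A value is valid for a union type as soon as one of its member types accepts it.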
Extending ComplexTypes
<xs:complexType name="BookType">
  <xs:sequence>
    <xs:element name="title" type="xs:string"/>
    <xs:element name="author" type="xs:string"/>
    <xs:element name="publisher" type="xs:string" minOccurs="0"/>
  </xs:sequence>
</xs:complexType>
<xs:complexType name="IsbnBookType">
  <xs:complexContent>
    <xs:extension base="BookType">
      <xs:sequence>
        <xs:element name="isbn:number" type="xs:string" minOccurs="0"/>
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>
IsbnBookType extends BookType with isbn:number
Groups
<xs:group name="reviewElements">
  <xs:sequence>
    <xs:element name="review" type="xs:string"/>
    <xs:element name="reviewDate" type="xs:date"/>
  </xs:sequence>
</xs:group>
<xs:complexType name="ReviewedBookType">
  <xs:complexContent>
    <xs:extension base="BookType">
      <xs:sequence>
        <xs:group ref="reviewElements"/>
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>
ReviewedBookType extends BookType with review and reviewDate
Groups of attributes can be defined as well
Inclusion & Namespaces
Use xs:include to include definitions from another schema definition file
Use xs:redefine to include definitions and selectively redefine some of them
Use the targetNamespace attribute within xs:schema to define the namespace of the declarations
Use xs:import to import definitions from the schema definition file associated with another namespace
Uniqueness, Keys & References
xs:unique can be used to require that each specified combination of fields (i.e. named attributes and/or contents of named elements) within a specified set of elements must be unique
xs:key is like xs:unique but also requires that the values are not nil
xs:keyref provides referential integrity by requiring that each specified combination of fields within a specified set of elements must correspond to some existing combination associated with an xs:key or xs:unique definition
XML Key Reference Example
root
  BookDB
    Booklist
      Book
        title
        publisher
        Authref
      …
    Authlist
      Author
        authid, name, address, dob
      …
Key and Keyref Example
<xs:key name="authkeys">
  <xs:selector xpath="//Author"/>
  <xs:field xpath="@authid"/>
</xs:key>
Every author's authid is unique and non-nil
<xs:keyref name="authrefs" refer="authkeys">
  <xs:selector xpath="//Book"/>
  <xs:field xpath="Authref"/>
</xs:keyref>
Each book's Authref refers to a legal authid
The contents of a book's Authref element must correspond to some author's authid attribute
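What the two constraints demand can be spelled out by hand. The sketch below uses Python's standard library rather than a schema validator, and the book "Lost book" with its reference Nobody1 is a hypothetical entry added to show a violation.

```python
import xml.etree.ElementTree as ET
from collections import Counter

doc = """<BookDB>
  <Booklist>
    <Book title="Cool ideas"><Authref>Wil4421</Authref></Book>
    <Book title="Lost book"><Authref>Nobody1</Authref></Book>
  </Booklist>
  <Authlist>
    <Author authid="Wil4421" name="Juan Williams"/>
    <Author authid="Bor601" name="Gonzo Borscht"/>
  </Authlist>
</BookDB>"""

root = ET.fromstring(doc)

# xs:key "authkeys": every Author has an authid, and no value repeats.
key_values = [a.get("authid") for a in root.iter("Author")]
missing_keys = key_values.count(None)
duplicate_keys = [k for k, n in Counter(key_values).items() if n > 1]

# xs:keyref "authrefs": every Book's Authref must equal some key value.
known = set(key_values)
broken_refs = [
    b.get("title") for b in root.iter("Book")
    if b.findtext("Authref") not in known
]
print(missing_keys, duplicate_keys, broken_refs)
```

A schema processor enforces both rules at validation time, so the application does not have to repeat this check.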