47
1 Web Data Management XML Schema

1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

Embed Size (px)

Citation preview

Page 1: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

1

Web Data Management

XML Schema

Page 2: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

2

In this lecture

• XML Schemas• Elements v. Types

• Regular expressions

• Expressive power

ResourcesW3C Draft: www.w3.org/TR/2001/REC-xmlschema-1-20010502

Page 3: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

3

XML Schemas

• http://www.w3.org/TR/xmlschema-1/10/2000• generalizes DTDs• uses XML syntax• two documents: structure and datatypes

– http://www.w3.org/TR/xmlschema-1– http://www.w3.org/TR/xmlschema-2

• XML-Schema is very complex– often criticized– some alternative proposals

Page 4: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

4

BookCatalogue.dtd

<!ELEMENT BookCatalogue (Book)*><!ELEMENT Book (Title, Author, Date, ISBN, Publisher)><!ELEMENT Title (#PCDATA)><!ELEMENT Author (#PCDATA)><!ELEMENT Date (#PCDATA)><!ELEMENT ISBN (#PCDATA)><!ELEMENT Publisher (#PCDATA)>

Page 5: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

5

<?xml version="1.0"?><xsd:schema xmlns:xsd="http://www.w3.org/2000/10/XMLSchema" targetNamespace="http://www.publishing.org" xmlns="http://www.publishing.org" elementFormDefault="qualified"> <xsd:element name="BookCatalogue"> <xsd:complexType> <xsd:sequence> <xsd:element ref="Book" minOccurs="0" maxOccurs="unbounded"/> </xsd:sequence> </xsd:complexType> </xsd:element> <xsd:element name="Book"> <xsd:complexType> <xsd:sequence> <xsd:element ref="Title" minOccurs="1" maxOccurs="1"/> <xsd:element ref="Author" minOccurs="1" maxOccurs="1"/> <xsd:element ref="Date" minOccurs="1" maxOccurs="1"/> <xsd:element ref="ISBN" minOccurs="1" maxOccurs="1"/> <xsd:element ref="Publisher" minOccurs="1" maxOccurs="1"/> </xsd:sequence> </xsd:complexType> </xsd:element> <xsd:element name="Title" type="xsd:string"/> <xsd:element name="Author" type="xsd:string"/> <xsd:element name="Date" type="xsd:string"/> <xsd:element name="ISBN" type="xsd:string"/> <xsd:element name="Publisher" type="xsd:string"/></xsd:schema>

BookCatalogue.xsd xsd = Xml-Schema Definition

(explanations onsucceeding pages)

Page 6: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

6

<?xml version="1.0"?><xsd:schema xmlns:xsd="http://www.w3.org/2000/10/XMLSchema" targetNamespace="http://www.publishing.org" xmlns="http://www.publishing.org" elementFormDefault="qualified"> <xsd:element name="BookCatalogue"> <xsd:complexType> <xsd:sequence> <xsd:element ref="Book" minOccurs="0" maxOccurs="unbounded"/> </xsd:sequence> </xsd:complexType> </xsd:element> <xsd:element name="Book"> <xsd:complexType> <xsd:sequence> <xsd:element ref="Title" minOccurs="1" maxOccurs="1"/> <xsd:element ref="Author" minOccurs="1" maxOccurs="1"/> <xsd:element ref="Date" minOccurs="1" maxOccurs="1"/> <xsd:element ref="ISBN" minOccurs="1" maxOccurs="1"/> <xsd:element ref="Publisher" minOccurs="1" maxOccurs="1"/> </xsd:sequence> </xsd:complexType> </xsd:element> <xsd:element name="Title" type="xsd:string"/> <xsd:element name="Author" type="xsd:string"/> <xsd:element name="Date" type="xsd:string"/> <xsd:element name="ISBN" type="xsd:string"/> <xsd:element name="Publisher" type="xsd:string"/></xsd:schema>

<!ELEMENT Title (#PCDATA)><!ELEMENT Author (#PCDATA)><!ELEMENT Date (#PCDATA)><!ELEMENT ISBN (#PCDATA)><!ELEMENT Publisher (#PCDATA)>

<!ELEMENT Book (Title, Author, Date, ISBN, Publisher)>

<!ELEMENT BookCatalogue (Book)*>

All XML Schemas have"schema" as the rootelement.

Page 7: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

7

The elements thatare used to createan XML Schemacome from the XMLSchemanamespace

<?xml version="1.0"?><xsd:schema xmlns:xsd="http://www.w3.org/2000/10/XMLSchema" targetNamespace="http://www.publishing.org" xmlns="http://www.publishing.org" elementFormDefault="qualified"> <xsd:element name="BookCatalogue"> <xsd:complexType> <xsd:sequence> <xsd:element ref="Book" minOccurs="0" maxOccurs="unbounded"/> </xsd:sequence> </xsd:complexType> </xsd:element> <xsd:element name="Book"> <xsd:complexType> <xsd:sequence> <xsd:element ref="Title" minOccurs="1" maxOccurs="1"/> <xsd:element ref="Author" minOccurs="1" maxOccurs="1"/> <xsd:element ref="Date" minOccurs="1" maxOccurs="1"/> <xsd:element ref="ISBN" minOccurs="1" maxOccurs="1"/> <xsd:element ref="Publisher" minOccurs="1" maxOccurs="1"/> </xsd:sequence> </xsd:complexType> </xsd:element> <xsd:element name="Title" type="xsd:string"/> <xsd:element name="Author" type="xsd:string"/> <xsd:element name="Date" type="xsd:string"/> <xsd:element name="ISBN" type="xsd:string"/> <xsd:element name="Publisher" type="xsd:string"/></xsd:schema>

Page 8: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

8

elementcomplexType

schema

sequence

http://www.w3.org/2000/10/XMLSchema

XMLSchema Namespace

string

integer

boolean

Page 9: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

9

Says that theelements declaredin this schema(BookCatalogue,Book, Title, Author, Date, ISBN, Publisher)are to go in thisnamespace

<?xml version="1.0"?><xsd:schema xmlns:xsd="http://www.w3.org/2000/10/XMLSchema" targetNamespace="http://www.publishing.org" xmlns="http://www.publishing.org" elementFormDefault="qualified"> <xsd:element name="BookCatalogue"> <xsd:complexType> <xsd:sequence> <xsd:element ref="Book" minOccurs="0" maxOccurs="unbounded"/> </xsd:sequence> </xsd:complexType> </xsd:element> <xsd:element name="Book"> <xsd:complexType> <xsd:sequence> <xsd:element ref="Title" minOccurs="1" maxOccurs="1"/> <xsd:element ref="Author" minOccurs="1" maxOccurs="1"/> <xsd:element ref="Date" minOccurs="1" maxOccurs="1"/> <xsd:element ref="ISBN" minOccurs="1" maxOccurs="1"/> <xsd:element ref="Publisher" minOccurs="1" maxOccurs="1"/> </xsd:sequence> </xsd:complexType> </xsd:element> <xsd:element name="Title" type="xsd:string"/> <xsd:element name="Author" type="xsd:string"/> <xsd:element name="Date" type="xsd:string"/> <xsd:element name="ISBN" type="xsd:string"/> <xsd:element name="Publisher" type="xsd:string"/></xsd:schema>

Page 10: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

10

BookCatalogue

BookTitle

Author

Date

ISBNPublisher

http://www.publishing.org (targetNamespace)

Publishing Namespace (targetNamespace)

Page 11: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

11

This is referencing a Book element declaration.The Book in whatnamespace? Since thereis no namespace qualifierit is referencing the Bookelement in the defaultnamespace, which is thetargetNamespace!

<?xml version="1.0"?><xsd:schema xmlns:xsd="http://www.w3.org/2000/10/XMLSchema" targetNamespace="http://www.publishing.org" xmlns="http://www.publishing.org" elementFormDefault="qualified"> <xsd:element name="BookCatalogue"> <xsd:complexType> <xsd:sequence> <xsd:element ref="Book" minOccurs="0" maxOccurs="unbounded"/> </xsd:sequence> </xsd:complexType> </xsd:element> <xsd:element name="Book"> <xsd:complexType> <xsd:sequence> <xsd:element ref="Title" minOccurs="1" maxOccurs="1"/> <xsd:element ref="Author" minOccurs="1" maxOccurs="1"/> <xsd:element ref="Date" minOccurs="1" maxOccurs="1"/> <xsd:element ref="ISBN" minOccurs="1" maxOccurs="1"/> <xsd:element ref="Publisher" minOccurs="1" maxOccurs="1"/> </xsd:sequence> </xsd:complexType> </xsd:element> <xsd:element name="Title" type="xsd:string"/> <xsd:element name="Author" type="xsd:string"/> <xsd:element name="Date" type="xsd:string"/> <xsd:element name="ISBN" type="xsd:string"/> <xsd:element name="Publisher" type="xsd:string"/></xsd:schema>

The default namespace ishttp://www.publishing.orgwhich is the targetNamespace!

Page 12: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

12

This is a directive to instance documents whichuse this schema: Any elements used by the instance document whichwere declared by this schema must be namespacequalified by the namespacespecified by targetNamespace.

<?xml version="1.0"?><xsd:schema xmlns:xsd="http://www.w3.org/2000/10/XMLSchema" targetNamespace="http://www.publishing.org" xmlns="http://www.publishing.org" elementFormDefault="qualified"> <xsd:element name="BookCatalogue"> <xsd:complexType> <xsd:sequence> <xsd:element ref="Book" minOccurs="0" maxOccurs="unbounded"/> </xsd:sequence> </xsd:complexType> </xsd:element> <xsd:element name="Book"> <xsd:complexType> <xsd:sequence> <xsd:element ref="Title" minOccurs="1" maxOccurs="1"/> <xsd:element ref="Author" minOccurs="1" maxOccurs="1"/> <xsd:element ref="Date" minOccurs="1" maxOccurs="1"/> <xsd:element ref="ISBN" minOccurs="1" maxOccurs="1"/> <xsd:element ref="Publisher" minOccurs="1" maxOccurs="1"/> </xsd:sequence> </xsd:complexType> </xsd:element> <xsd:element name="Title" type="xsd:string"/> <xsd:element name="Author" type="xsd:string"/> <xsd:element name="Date" type="xsd:string"/> <xsd:element name="ISBN" type="xsd:string"/> <xsd:element name="Publisher" type="xsd:string"/></xsd:schema>

Page 13: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

13

Referencing a schema in an XML instance document

<?xml version="1.0"?><BookCatalogue xmlns ="http://www.publishing.org" xmlns:xsi="http://www.w3.org/2000/10/XMLSchema-instance" xsi:schemaLocation="http://www.publishing.org BookCatalogue.xsd"> <Book> <Title>My Life and Times</Title> <Author>Paul McCartney</Author> <Date>July, 1998</Date> <ISBN>94303-12021-43892</ISBN> <Publisher>McMillin Publishing</Publisher> </Book> ...</BookCatalogue>

1. First, using a default namespace declaration, tell the schema-validator that all of the elementsused in this instance document come from the publishing namespace.

2. Second, with schemaLocation tell the schema-validator that the http://www.publishing.org namespace is defined in BookCatalogue.xsd.

3. Third, tell the schema-validator that schemaLocation attribute we are using is the one in the schema instance namespace.

Page 14: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

14

Referencing a schema in an XML instance document

BookCatalogue.xml BookCatalogue.xsd

targetNamespace="A"schemaLocation="A BookCatalogue.xsd"

- defines elements in namespace A

- uses elements from namespace A

Page 15: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

15

Note multiple levels of checking

BookCatalogue.xml BookCatalogue.xsd XMLSchema.xsd(schema-for-schemas)

Validate that the xml documentconforms to the rules describedin BookCatalogue.xsd

Validate that BookCatalogue.xsd is a validschema document, i.e., it conformsto the rules described in theschema-for-schemas

Page 16: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

16

Default Value for minOccurs and maxOccurs

• The default value for minOccurs is "1"

• The default value for maxOccurs is "1"

<element ref="Title" minOccurs="1" maxOccurs="1"/>

<element ref="Title"/>

Equivalent!

Page 17: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

17

Qualify XMLSchema, Default targetNamespace

• In the last example, we explicitly qualified all elements from the XML Schema namespace. The targetNamespace was the default namespace.

BookCatalogue

BookTitle

Author

Date

ISBNPublisher

http://www.publishing.org

element

annotationdocumentation

complexType

schema

sequence

http://www.w3.org/2000/10/XMLSchema

Page 18: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

18

Default XMLSchema, Qualify targetNamespace

• Alternatively (equivalently), we can design our schema so that XMLSchema is the default namespace.

BookCatalogue

BookTitle

Author

Date

ISBNPublisher

http://www.publishing.org

element

annotationdocumentation

complexType

schema

sequence

http://www.w3.org/2000/10/XMLSchema

Page 19: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

19

<?xml version="1.0"?><schema xmlns="http://www.w3.org/2000/10/XMLSchema" targetNamespace="http://www.publishing.org" xmlns:pub="http://www.publishing.org" elementFormDefault="qualified"> <element name="BookCatalogue"> <complexType> <sequence> <element ref="pub:Book" minOccurs="0" maxOccurs="unbounded"/> </sequence> </complexType> </element> <element name="Book"> <complexType> <sequence> <element ref="pub:Title"/> <element ref="pub:Author"/> <element ref="pub:Date"/> <element ref="pub:ISBN"/> <element ref="pub:Publisher"/> </sequence> </complexType> </element> <element name="Title" type="string"/> <element name="Author" type="string"/> <element name="Date" type="string"/> <element name="ISBN" type="string"/> <element name="Publisher" type="string"/></schema>

Here we arereferencing aBook element.Where is thatBook elementdefined? In what namespace?The pub: prefixindicates whatnamespace thiselement is in. pub:has been set tobe the same as thetargetNamespace.

Page 20: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

20

"pub" References the targetNamespace

BookCatalogue

BookTitle

Author

Date

ISBNPublisher

http://www.publishing.org (targetNamespace)

element

annotationdocumentation

complexType

schema

sequence

http://www.w3.org/2000/10/XMLSchema

pub

Page 21: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

21

Alternate Schema

• In the previous examples we declared an element and then we ref’ed that element declaration. Instead, we can inline the element declarations.

• On the following slide is an alternate (equivalent) way of representing the schema shown previously, using inlined element declarations.

Page 22: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

22

<?xml version="1.0"?><xsd:schema xmlns:xsd="http://www.w3.org/2000/10/XMLSchema" targetNamespace="http://www.publishing.org" xmlns="http://www.publishing.org" elementFormDefault="qualified"> <xsd:element name="BookCatalogue"> <xsd:complexType> <xsd:sequence> <xsd:element name="Book" maxOccurs="unbounded"> <xsd:complexType> <xsd:sequence> <xsd:element name="Title" type="xsd:string"/> <xsd:element name="Author" type="xsd:string"/> <xsd:element name="Date" type="xsd:string"/> <xsd:element name="ISBN" type="xsd:string"/> <xsd:element name="Publisher" type="xsd:string"/> </xsd:sequence> </xsd:complexType> </xsd:element> </xsd:sequence> </xsd:complexType> </xsd:element></xsd:schema>

Elements Declared Inline

Page 23: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

23

Anonymous types (no name)

<?xml version="1.0"?><xsd:schema xmlns:xsd="http://www.w3.org/2000/10/XMLSchema" targetNamespace="http://www.publishing.org" xmlns="http://www.publishing.org" elementFormDefault="qualified"> <xsd:element name="BookCatalogue"> <xsd:complexType> <xsd:sequence> <xsd:element name="Book" maxOccurs="unbounded"> <xsd:complexType> <xsd:sequence> <xsd:element name="Title" type="xsd:string"/> <xsd:element name="Author" type="xsd:string"/> <xsd:element name="Date" type="xsd:string"/> <xsd:element name="ISBN" type="xsd:string"/> <xsd:element name="Publisher" type="xsd:string"/> </xsd:sequence> </xsd:complexType> </xsd:element> </xsd:sequence> </xsd:complexType> </xsd:element></xsd:schema>

Page 24: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

24

Named Types

• The following slide shows an alternate (equivalent) schema which uses a named type.

Page 25: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

25

Named type

<?xml version="1.0"?><xsd:schema xmlns:xsd="http://www.w3.org/2000/10/XMLSchema" targetNamespace="http://www.publishing.org" xmlns="http://www.publishing.org" elementFormDefault="qualified"> <xsd:element name="BookCatalogue"> <xsd:complexType> <xsd:sequence> <xsd:element name="Book" type="CardCatalogueEntry" maxOccurs="unbounded"/> </xsd:sequence> </xsd:complexType> </xsd:element> <xsd:complexType name="CardCatalogueEntry"> <xsd:sequence> <xsd:element name="Title" type="xsd:string"/> <xsd:element name="Author" type="xsd:string"/> <xsd:element name="Date" type="xsd:string"/> <xsd:element name="ISBN" type="xsd:string"/> <xsd:element name="Publisher" type="xsd:string"/> </xsd:sequence> </xsd:complexType></xsd:schema>

Named Types

Page 26: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

26

Please note that

<xsd:element name="A" type="foo"/><xsd:complexType name="foo"> <xsd:sequence> <xsd:element name="B" …/> <xsd:element name="C" …/> </xsd:sequence></xsd:complexType>

is equivalent to

<xsd:element name="A"> <xsd:complexType> <xsd:sequence> <xsd:element name="B" …/> <xsd:element name="C" …/> </xsd:sequence> </xsd:complexType></xsd:element>

Element A references thecomplexType foo.

Element A has the complexType definitioninlined in the elementdeclaration.

Page 27: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

27

type Attribute or complexType Child Element, but not Both!

• An element declaration can have a type attribute, or a complexType child element, but it cannot have both a type attribute and a complexType child element.

<xsd:element name="A" type="foo"> <xsd:complexType> … </xsd:complexType></xsd:element>

Page 28: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

28

Summary of Declaring Elements (two ways to do it)

<xsd:element name="name" type="type" minOccurs="int" maxOccurs="int"/>

A simple type(e.g., xsd:string)or the name ofa complexType

<xsd:element name="name" minOccurs="int" maxOccurs="int"> <xsd:complexType> … </xsd:complexType></xsd:element>

1

2

A nonnegativeinteger

A nonnegativeinteger or "unbounded"

Note: minOccurs and maxOccurs can only be usedin nested (local) element declarations.

Page 29: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

29

XML Schemas

<xsd:element name=“paper” type=“papertype”/>

<xsd:complexType name=“papertype”>

<xsd:sequence>

<xsd:element name=“title” type=“xsd:string”/>

<xsd:element name=“author” minOccurs=“0”/>

<xsd:element name=“year”/>

<xsd: choice> < xsd:element name=“journal”/>

<xsd:element name=“conference”/>

</xsd:choice>

</xsd:sequence>

</xsd:element>

<xsd:element name=“paper” type=“papertype”/>

<xsd:complexType name=“papertype”>

<xsd:sequence>

<xsd:element name=“title” type=“xsd:string”/>

<xsd:element name=“author” minOccurs=“0”/>

<xsd:element name=“year”/>

<xsd: choice> < xsd:element name=“journal”/>

<xsd:element name=“conference”/>

</xsd:choice>

</xsd:sequence>

</xsd:element>

DTD: <!ELEMENT paper (title,author*,year, (journal|conference))>

Page 30: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

30

Elements v.s. Types in XML Schema

<xsd:element name=“person”> <xsd:complexType> <xsd:sequence> <xsd:element name=“name” type=“xsd:string”/> <xsd:element name=“address” type=“xsd:string”/> </xsd:sequence> </xsd:complexType></xsd:element>

<xsd:element name=“person”> <xsd:complexType> <xsd:sequence> <xsd:element name=“name” type=“xsd:string”/> <xsd:element name=“address” type=“xsd:string”/> </xsd:sequence> </xsd:complexType></xsd:element>

<xsd:element name=“person” type=“ttt”/><xsd:complexType name=“ttt”> <xsd:sequence> <xsd:element name=“name” type=“xsd:string”/> <xsd:element name=“address” type=“xsd:string”/> </xsd:sequence></xsd:complexType>

<xsd:element name=“person” type=“ttt”/><xsd:complexType name=“ttt”> <xsd:sequence> <xsd:element name=“name” type=“xsd:string”/> <xsd:element name=“address” type=“xsd:string”/> </xsd:sequence></xsd:complexType>

DTD: <!ELEMENT person (name,address)>

Page 31: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

31

• Types:– Simple types (integers, strings, ...)

– Complex types (regular expressions, like in DTDs)

• Element-type-element alternation:– Root element has a complex type

– That type is a regular expression of elements

– Those elements have their complex types...

– ...

– On the leaves we have simple types

Elements v.s. Types in XML Schema

Page 32: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

32

Local and Global Types in XML Schema

• Local type: <xsd:element name=“person”>

[define locally the person’s type] </xsd:element>

• Global type: <xsd:element name=“person” type=“ttt”/>

<xsd:complexType name=“ttt”> [define here the type ttt] </xsd:complexType>

Global types: can be reused in other elements

Page 33: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

33

Local v.s. Global Elements inXML Schema

• Local element: <xsd:complexType name=“ttt”>

<xsd:sequence> <xsd:element name=“address” type=“...”/>... </xsd:sequence> </xsd:complexType>

• Global element: <xsd:element name=“address” type=“...”/>

<xsd:complexType name=“ttt”> <xsd:sequence> <xsd:element ref=“address”/> ... </xsd:sequence> </xsd:complexType>

Global elements: like in DTDs

Page 34: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

34

Regular Expressions in XML Schema

Recall the element-type-element alternation: <xsd:complexType name=“....”>

[regular expression on elements] </xsd:complexType>

Regular expressions:• <xsd:sequence> A B C </...> = A B C

• <xsd:choice> A B C </...> = A | B | C

• <xsd:group> A B C </...> = (A B C)

• <xsd:... minOccurs=“0” maxOccurs=“unbounded”> ..</...> = (...)*

• <xsd:... minOccurs=“0” maxOccurs=“1”> ..</...> = (...)?

Page 35: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

35

Local Names in XML-Schema<xsd:element name=“person”> <xsd:complexType> . . . . . <xsd:element name=“name”> <xsd:complexType> <xsd:sequence> <xsd:element name=“firstname” type=“xsd:string”/> <xsd:element name=“lastname” type=“xsd:string”/> </xsd:sequence> </xsd:element> . . . . </xsd:complexType></xsd:element>

<xsd:element name=“product”> <xsd:complexType> . . . . . <xsd:element name=“name” type=“xsd:string”/>

</xsd:complexType></xsd:element>

<xsd:element name=“person”> <xsd:complexType> . . . . . <xsd:element name=“name”> <xsd:complexType> <xsd:sequence> <xsd:element name=“firstname” type=“xsd:string”/> <xsd:element name=“lastname” type=“xsd:string”/> </xsd:sequence> </xsd:element> . . . . </xsd:complexType></xsd:element>

<xsd:element name=“product”> <xsd:complexType> . . . . . <xsd:element name=“name” type=“xsd:string”/>

</xsd:complexType></xsd:element>

name hasdifferent meaningsin person andin product

Page 36: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

36

Subtle Use of Local Names<xsd:element name=“A” type=“oneB”/>

<xsd:complexType name=“onlyAs”> <xsd:choice> <xsd:sequence> <xsd:element name=“A” type=“onlyAs”/> <xsd:element name=“A” type=“onlyAs”/> </xsd:sequence> <xsd:element name=“A” type=“xsd:string”/> </xsd:choice></xsd:complexType>

<xsd:element name=“A” type=“oneB”/>

<xsd:complexType name=“onlyAs”> <xsd:choice> <xsd:sequence> <xsd:element name=“A” type=“onlyAs”/> <xsd:element name=“A” type=“onlyAs”/> </xsd:sequence> <xsd:element name=“A” type=“xsd:string”/> </xsd:choice></xsd:complexType>

<xsd:complexType name=“oneB”> <xsd:choice> <xsd:element name=“B” type=“xsd:string”/> <xsd:sequence> <xsd:element name=“A” type=“onlyAs”/> <xsd:element name=“A” type=“oneB”/> </xsd:sequence> <xsd:sequence> <xsd:element name=“A” type=“oneB”/> <xsd:element name=“A” type=“onlyAs”/> </xsd:sequence> </xsd:choice></xsd:complexType>

<xsd:complexType name=“oneB”> <xsd:choice> <xsd:element name=“B” type=“xsd:string”/> <xsd:sequence> <xsd:element name=“A” type=“onlyAs”/> <xsd:element name=“A” type=“oneB”/> </xsd:sequence> <xsd:sequence> <xsd:element name=“A” type=“oneB”/> <xsd:element name=“A” type=“onlyAs”/> </xsd:sequence> </xsd:choice></xsd:complexType>

Arbitrary deep binary tree with A elements, and a single B element

Page 37: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

37

Attributes in XML Schema

<xsd:element name=“paper” type=“papertype”/>

<xsd:complexType name=“papertype”>

<xsd:sequence>

<xsd:element name=“title” type=“xsd:string”/>

. . . . . .

</xsd:sequence>

<xsd:attribute name=“language" type="xsd:NMTOKEN" fixed=“English"/>

</xsd:complexType>

<xsd:element name=“paper” type=“papertype”/>

<xsd:complexType name=“papertype”>

<xsd:sequence>

<xsd:element name=“title” type=“xsd:string”/>

. . . . . .

</xsd:sequence>

<xsd:attribute name=“language" type="xsd:NMTOKEN" fixed=“English"/>

</xsd:complexType>

• Attributes are associated to the type, not to the element• Only to complex types; more trouble if we want to add

attributes to simple types.

Page 38: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

38

“Mixed” Content, “Any” Type

• Better than in DTDs: can still enforce the type, but now may have text between any elements

• Means anything is permitted there

<xsd:complexType mixed="true"> . . . .

<xsd:complexType mixed="true"> . . . .

<xsd:element name="anything" type="xsd:anyType"/> . . . .

<xsd:element name="anything" type="xsd:anyType"/> . . . .

Page 39: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

39

“All” Group

• A restricted form of & in SGML• Restrictions:

– Only at top level– Has only elements– Each element occurs at most once

• E.g. “comment” occurs 0 or 1 times

<xsd:complexType name="PurchaseOrderType">

<xsd:all> <xsd:element name="shipTo" type="USAddress"/>

<xsd:element name="billTo" type="USAddress"/>

<xsd:element ref="comment" minOccurs="0"/>

<xsd:element name="items" type="Items"/>

</xsd:all>

<xsd:attribute name="orderDate" type="xsd:date"/>

</xsd:complexType>

<xsd:complexType name="PurchaseOrderType">

<xsd:all> <xsd:element name="shipTo" type="USAddress"/>

<xsd:element name="billTo" type="USAddress"/>

<xsd:element ref="comment" minOccurs="0"/>

<xsd:element name="items" type="Items"/>

</xsd:all>

<xsd:attribute name="orderDate" type="xsd:date"/>

</xsd:complexType>

Page 40: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

40

Derived Types by Extensions <complexType name="Address">

<sequence> <element name="street" type="string"/>

<element name="city" type="string"/>

</sequence>

</complexType>

<complexType name="USAddress">

<complexContent>

<extension base="ipo:Address">

<sequence>

<element name="state" type="ipo:USState"/>

<element name="zip" type="positiveInteger"/>

</sequence>

</extension>

</complexContent>

</complexType>

<complexType name="Address">

<sequence> <element name="street" type="string"/>

<element name="city" type="string"/>

</sequence>

</complexType>

<complexType name="USAddress">

<complexContent>

<extension base="ipo:Address">

<sequence>

<element name="state" type="ipo:USState"/>

<element name="zip" type="positiveInteger"/>

</sequence>

</extension>

</complexContent>

</complexType>

Corresponds to inheritance

Page 41: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

41

Derived Types by Restrictions

• (*): may restrict cardinalities, e.g. (0,infty) to (1,1); may restrict choices; other restrictions…

<complexContent> <restriction base="ipo:Items“> … [rewrite the entire content, with restrictions]... </restriction> </complexContent>

<complexContent> <restriction base="ipo:Items“> … [rewrite the entire content, with restrictions]... </restriction> </complexContent>

Corresponds to set inclusion

Page 42: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

42

Simple Types

• String

• Token

• Byte

• unsignedByte

• Integer

• positiveInteger

• Int (larger than integer)

• unsignedInt

• Long

• Short

• ...

• Time

• dateTime

• Duration

• Date

• ID

• IDREF

• IDREFS

Page 43: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

43

Facets of Simple Types

Examples

• length

• minLength

• maxLength

• pattern

• enumeration

• whiteSpace

• maxInclusive

• maxExclusive

• minInclusive

• minExclusive

• totalDigits

• fractionDigits

•Facets = additional properties restricting a simple type

•15 facets defined by XML Schema

Page 44: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

44

Facets of Simple Types

• Can further restrict a simple type by changing some facets

• Restriction = subset

Page 45: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

45

Not so Simple Types

• List types:

• Union types

• Restriction types

<xsd:simpleType name="listOfMyIntType">

<xsd:list itemType="myInteger"/>

</xsd:simpleType>

<xsd:simpleType name="listOfMyIntType">

<xsd:list itemType="myInteger"/>

</xsd:simpleType>

<listOfMyInt>20003 15037 95977 95945</listOfMyInt><listOfMyInt>20003 15037 95977 95945</listOfMyInt>

Page 46: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

46

Summary of XML Schema

• Formal Expressive Power:– Can express precisely the regular tree

languages (over unranked trees)

• Lots of other stuff– Some form of inheritance– A “null” value– Large collection of data types

Page 47: 1 Web Data Management XML Schema. 2 In this lecture XML Schemas Elements v. Types Regular expressions Expressive power Resources W3C Draft:

47

Summary of Schemas

• in SS data: – graph theoretic– data and schema are decoupled– used in data processing

• in XML– from grammar to object-oriented– schema wired with the data– emphasis on semantics for exchange