Transcript
Page 1: chema The Agenda umentlaser.cs.umass.edu/courses/cs520-620.Spring15/lectures/xml_schema_4pages.pdf1 e X M arkup L anguage aou r erso-u XML 2 Agenda : The rd Goals ? e Well-ument ema

1

eXtensible Markup Language
Reda Bendraou
Reda.Bendraou@lip6.fr
http://pagesperso-systeme.lip6.fr/Reda.Bendraou
XML

2

Agenda

Part I: The XML Standard
Goals
Why XML?
XML document structure
Well-formed XML document
Part II: XML: DTD and Schema
DTD
XML Schema

3

What is XML?
XML stands for EXtensible Markup Language
XML is a markup language much like HTML
XML tags are not predefined. You must define/invent your own tags
XML was designed to describe data
XML is a W3C Recommendation

4

The Main Difference Between XML and HTML
XML was designed to store, carry, and exchange data. XML was not designed to display data.
XML is not a replacement for HTML. XML and HTML were designed with different goals:
XML was designed to describe data and to focus on what data is.
HTML was designed to display data and to focus on how data looks.
HTML is about displaying information, while XML is about describing information.


5

How can XML be Used?
To describe and to structure data
To separate data from HTML
You need to parse your XML document to display data
To parse: CSS, XSLT, DOM API, etc.
To exchange data
To store data

6

XML Goals
A need for representing data:
In a standardized way
Easily readable:
- By humans
- By machines
That fits WEB technology (to be easily integrated within WEB servers)
With a clear separation between:
- presentation aspects (format, colors, etc.)
- Data (semantics)

7

XML: Origins
Some well-known formats (tag-based languages):
HTML = HyperText Markup Language (only presentation)
SGML = Standard Generalized Markup Language (to structure the document)
Other notations:
ASN.1 = Abstract Syntax Notation One (ITU-T)
CDR, XDR = Common/eXternal Data Representation, etc.

8

HTML Drawbacks
Simple, readable!
WEB compatibility!
BUT
Not extensible! (a fixed set of standardized tags and attributes)
Mixing of form and content! (i.e. a presentation tag wrapping data: <H1> Intensive Care </H1>)
Browser / version incompatibilities
No way to check the document:
- structure (tag order),
- data (type, value),
- semantics


9

SGML Drawbacks
Powerful, extensible, standard (ISO 8879:1986)!
Meta-language for documenting huge and complex specifications (e.g. automotive, avionics, etc.)
…Nevertheless
Too complex! -> Too hard to implement, too hard to use!
Not necessarily WEB compatible!

10

A Definition:
XML:
An extensible and configurable language
A hierarchical representation of data
http://www.w3.org/XML/
XML
- An HTML variant! (WEB compatibility, readable, HTML-like syntax)
- A subset of SGML! (flexibility, rigor)

11

XML document structure
header:
Equivalent to HTML <HEAD>
Meta-data:
- Processing instructions
- comments (not interpreted by the parser)
body:
Equivalent to HTML <BODY>
Structured data (a tree structure):
- Enclosing tags
- Attributes (e.g. <gangster name='Ocean'>)
- Data within tags (e.g. <title> Ocean's 12 </title>)

12

XML Example: A letter
<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
<letter>
  <location> Somewhere in space </location>
  <sender> ObiWan Kenobi </sender>
  <receiver> Luke Skywalker </receiver>
  <introduction> Dear padawan, </introduction>
  <body_Letter> …May the force be with you </body_Letter>
  <signature/>
</letter>
Annotations:
- <?xml … ?>: processing instruction (header); XML document
- standalone="yes": a standalone document
- encoding="ISO-8859-1": character set used (Latin)
- <letter> … </letter>: body
- <location>: start tag; </location>: end tag; the text in between: data
- <signature/>: a single tag (empty, no data)


13

XML Prologue (Example)
<?xml version="1.0" encoding="ISO-8859-1" standalone="no" ?>
<!DOCTYPE CD_list SYSTEM "CDs.dtd">
Annotations:
- version="1.0": XML 1.0 document
- standalone="no": this is a non-standalone XML document (uses an external document)
- <!DOCTYPE: a keyword! (defines the document type)
- SYSTEM "CDs.dtd": in conformity with an external definition (specified in "CDs.dtd")

14

XML document body (Example)
<CD_list>
  <CD>
    <artist type="individual">Frank Sinatra</artist>
    <title no_pistes="4">In The Wee Small Hours</title>
    <tracks>
      <track>In The Wee Small Hours</track>
      <track>Mood Indigo</track>
    </tracks>
    <price money="euro" payment="CB">12.99</price>
    <to_buy/>
  </CD>
  <CD> ….. </CD>
  …..
</CD_list>
(Callouts 1 to 9 on the slide label parts of this example; they are explained on slide 16.)

15

XML document body (a tree view of the example)
CD_list (document root)
  CD
    artist: Frank Sinatra
    title: In The Wee Small Hours
    tracks
      track: In The Wee Small Hours
      track: Mood Indigo
    price: 12.99
    to_buy
  CD

16

XML document body (some explanations)
A tree-like structure (see slide 12)
The body's root is unique (1)(2).
Tags are either:
- in pairs: Start (1) and End (2),
- unique (4).
Some tags may have attributes (5)(8).
The content between a pair of tags (3) is either:
- A simple value: a string (6), a real (7), etc.,
- A tree structure of other tags (9),
- A mix of both (not shown in the example).
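The tree-like structure can be made visible with a few lines of code. The sketch below is in Python and assumes the standard-library xml.etree.ElementTree module; the helper name outline is hypothetical, and the document is a trimmed copy of the CD example with a single CD.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the CD example (one CD only).
CD_XML = """<CD_list>
  <CD>
    <artist type="individual">Frank Sinatra</artist>
    <title no_pistes="4">In The Wee Small Hours</title>
    <tracks>
      <track>In The Wee Small Hours</track>
      <track>Mood Indigo</track>
    </tracks>
    <price money="euro" payment="CB">12.99</price>
    <to_buy/>
  </CD>
</CD_list>"""


def outline(element, depth=0):
    """Return one indented line per element, depth-first (hypothetical helper)."""
    text = (element.text or "").strip()
    label = element.tag if not text else f"{element.tag}: {text}"
    lines = ["  " * depth + label]
    for child in element:  # children are visited in document order
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(CD_XML)
tree_lines = outline(root)
print("\n".join(tree_lines))
```

Elements that only hold other elements (CD_list, CD, tracks) print their tag alone, while leaf elements print their tag and simple value.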


17

Element Naming
Names should be short and simple (e.g. <Cd_Title>)
XML elements must follow these naming rules:
Names can contain letters, numbers, and other characters
Names must not start with a number or a punctuation character
Avoid (':', '-', '.', '!', etc)
Names must not start with the letters xml (or XML, or Xml, etc)
Names cannot contain spaces

18

XML Attributes
XML elements can have attributes in the start tag, just like HTML
Attributes are used to provide additional information about elements.
No order!
Syntax: name='value' or name="value"
Forbidden characters in values: < and & (write &lt; and &amp; instead)
Example:
<book tongue="FR" date="09/2000" id="ISBN-123"/>
Predefined attributes:
xml:lang="fr"
xml:id="a_unique_tag_identifier"
xml:idref="a_reference_towards_a_tag"
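The "No order!" point and the quoting rules can be checked with a parser. This is a minimal sketch in Python using the standard-library xml.etree.ElementTree module; the second book element and the title attribute are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Same attributes, different order and different quote styles.
first = ET.fromstring('<book tongue="FR" date="09/2000" id="ISBN-123"/>')
second = ET.fromstring("<book id='ISBN-123' tongue='FR' date='09/2000'/>")

# Attributes are exposed as a name -> value mapping, so order is irrelevant.
same_attributes = first.attrib == second.attrib
print(same_attributes)

# '&' must be written as '&amp;' inside a value; the parser restores it.
escaped = ET.fromstring('<book title="Tom &amp; Jerry"/>')
print(escaped.get("title"))
```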

19

Well Formed XML Document
XML with correct syntax is Well Formed:
XML documents must have a root element
XML elements must have a closing tag
XML tags are case sensitive
XML attribute values must always be quoted
XML elements must be properly nested

20

Well Formed XML Document
A Well Formed XML document:
A link with a style sheet is rendered possible
Can be reused by a syntactic parser/analyzer (i.e. browse the XML tree and transform it)
Candidate to be a valid XML document
Counterexample:
<?xml version="1.0" standalone="yes"?>
<!-- not well-formed XML document! -->
<artist>
  <name> Picasso
  <surname>
  </name> Pablo </surname>
</artist>
Improperly nested tags
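A parser makes the well-formedness rules concrete: it either accepts the text or reports an error. The following is a minimal sketch in Python using the standard-library xml.etree.ElementTree module; the helper name is_well_formed is hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text (hypothetical helper)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


GOOD = "<artist><name>Picasso</name><surname>Pablo</surname></artist>"
# Counterexample from the slide: <name> is closed while <surname> is still open.
BAD = "<artist><name>Picasso<surname></name>Pablo</surname></artist>"

print(is_well_formed(GOOD))
print(is_well_formed(BAD))
```

The same helper also rejects an unquoted attribute value or a closing tag whose case differs from its opening tag.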


21

A Valid XML Document
Definition:
A "Valid" XML document is a "Well Formed" XML document which also conforms to the rules of a Document Type Definition (DTD) or an XML Schema (.xsd)
Conditions:
Document => Well Formed (syntactically correct),
Document structure conforms to the definition given in its DTD (cf. DTD),
- Internal: within the XML document (using the DOCTYPE tag); not recommended
- External: a reference to a file containing the definition within the DOCTYPE tag; allows reuse and exchange
References on document elements can be resolved
Then
The XML document can be exchanged! (standardized format)

22

Exercise
Give a Well Formed XML document which represents a set of bibliographic references
If you want to process these bibliographic references, what would you need?
Your first feedback on XML?

23

First reflections on XML (1)
The document does not specify:
Tags:
- Tag names
- constraints upon:
  - The order,
  - Multiplicity (no. of occurrences),
  - Composition (the hierarchy).
Attributes:
- Attribute names (for each tag)
- Attribute types (i.e. String, enumeration, etc.)
- Attribute values (i.e. Range, format etc.)
Tag contents:
- Data Types (i.e. String, enumeration, etc.)

24

First reflections on XML (2)
Questions:
When do we have to use tags and when do we have to use attributes?
- Tags: entities
- Attributes: properties
How do we indicate what should be displayed/printed and how?
- style: CSS
- transformations: XSLT, DOM, XPath, …
Does attribute order have any importance? No


25

XML (Part II)
XML Document Definitions: DTD, XML Schema

26

A well formed and Valid XML Document
Well Formed Document
Elements properly nested, syntactically correct, etc.
Does not necessarily conform to a DTD or XML Schema
A Valid Document
Well Formed + conforms to a DTD (or a Schema)

27

DTD (Document Type Definition)
A DTD defines the legal elements of an XML document
Defines the «vocabulary» and structure of the document
A grammar whose phrases (instances) are XML documents
May be internal to the document or external (referenced within an XML document)

28

Why use a DTD?
With a DTD, each of your XML files can carry a description of its own format with it.
With a DTD, independent groups of people can agree to use a common DTD for interchanging data.
Your application can use a standard DTD to verify that the data you receive from the outside world is valid.
You can also use a DTD to verify your own data.


29

DTD Contents: Element and Attribute
Elements are the main building blocks of XML documents.
Syntax:
<!ELEMENT tag (content)>
Or
<!ELEMENT element-name category> (i.e. EMPTY, ANY, #PCDATA)
Defines a tag
E.g.: <!ELEMENT book (author, editor)>
Attributes provide extra information about elements.
Syntax:
<!ATTLIST element-name [attribute-name attribute-type #mode [default value]]*>
Defines the list of attributes for a given tag
E.g.:
<!ATTLIST author gender CDATA #REQUIRED
                 city CDATA #IMPLIED>
<!ATTLIST editor city CDATA #FIXED "Paris">

30

Elements Ordering / Structuring
Element contents structuring:
(a, b)   sequence                        e.g. (name, surname, street, city)
(a | b)  either / or                     e.g. (yes | no)
a?       optional element [0,1]          e.g. (name, surname?, street, city)
a*       zero or more occurrences [0,N]  e.g. (product*, customer)
a+       one or more occurrences [1,N]   e.g. (product*, seller+)
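These operators behave like the operators of a regular expression applied to the sequence of child tags. The sketch below illustrates that reading in Python with the standard-library re and xml.etree.ElementTree modules. It is not how a validating parser is implemented; the person element, the helper children_match, and the hand translation of (name, surname?, street, city) are assumptions made for the example.

```python
import re
import xml.etree.ElementTree as ET

# The DTD content model (name, surname?, street, city) rewritten by hand as a
# regular expression over child tag names, each followed by one space.
CONTENT_MODEL = re.compile(r"name (surname )?street city ")


def children_match(element, model):
    """Check the sequence of child tags against the model (hypothetical helper)."""
    sequence = "".join(child.tag + " " for child in element)
    return model.fullmatch(sequence) is not None


with_surname = ET.fromstring(
    "<person><name/><surname/><street/><city/></person>")
without_surname = ET.fromstring("<person><name/><street/><city/></person>")
wrong_order = ET.fromstring("<person><street/><name/><city/></person>")

print(children_match(with_surname, CONTENT_MODEL))
print(children_match(without_surname, CONTENT_MODEL))
print(children_match(wrong_order, CONTENT_MODEL))
```

The optional surname may be present or absent, but swapping name and street breaks the sequence.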

31

Data Types
CDATA (Character Data)
Text inside a CDATA section is not parsed for markup by the parser.
PCDATA (Parsed Character Data)
Text found between the start tag and the end tag of an XML element; it is text that will be parsed by a parser.
An element with no descendants, no attributes, only text
Enumeration
A list of values separated by « | »
ID and IDREF
Key and reference for attributes
ANY
Any combination of parsable data
EMPTY
To declare empty elements

32

Example of an External DTD (message.dtd)
<?xml version="1.0"?>
<!DOCTYPE message SYSTEM "message.dtd">
<message>
  <to>Dave</to>
  <from>Susan</from>
  <subject>Reminder</subject>
  <text>Don't forget to buy milk on the way home.</text>
</message>
In a separate file, "message.dtd", you define your DTD as follows:
<!ELEMENT message (to, from, subject, text)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT subject (#PCDATA)>
<!ELEMENT text (#PCDATA)>


33

Example of an Internal DTD
<?xml version="1.0"?>
<!DOCTYPE message [
  <!ELEMENT message (to,from,subject,text)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT subject (#PCDATA)>
  <!ELEMENT text (#PCDATA)>
]>
<message>
  <to>Dave</to>
  <from>Susan</from>
  <subject>Reminder</subject>
  <text>Don't forget to buy milk on the way home.</text>
</message>

34

Why encourage the use of an external DTD?
To promote reusability
To share tags and structures
The definition may be local or distant:
<!DOCTYPE doc SYSTEM "doc.dtd">
<!DOCTYPE doc PUBLIC "www.e-xmlmedia.com/doc.dtd">
A clear separation between a definition and its instances

35

DTD Entity Declaration
Definition: Entities are variables that represent other values. The value of the entity is substituted for the entity when the XML document is parsed.
Entities can be defined internally or externally to your DTD.
Syntax
Internal declaration: <!ENTITY entity-name "entity-value">
External declaration: <!ENTITY entity-name SYSTEM "entity-URL">
To use it: &entity-name;
Example (internal): <!ENTITY website "http://www.TheScarms.com">
Example (external): <!ENTITY website SYSTEM "http://www.TheScarms.com/entity.xml">
The above entity makes this line of XML valid.
XML line: <url>&website;</url>
Evaluates to: <url>http://www.TheScarms.com</url>
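Entity substitution can be observed by parsing a small document that declares the entity in an internal DTD subset. This is a sketch in Python using the standard-library xml.etree.ElementTree module, whose underlying parser expands internal entities; the wrapping DOCTYPE is an assumption added so the example is self-contained. A reference is written with a leading & and a closing ;.

```python
import xml.etree.ElementTree as ET

# Internal DTD subset declaring the entity, followed by a document using it.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE url [
  <!ENTITY website "http://www.TheScarms.com">
]>
<url>&website;</url>"""

root = ET.fromstring(DOCUMENT)
expanded = root.text
print(expanded)

# Without the closing ';' the reference is not well-formed and parsing fails.
try:
    ET.fromstring(DOCUMENT.replace("&website;", "&website"))
    missing_semicolon_accepted = True
except ET.ParseError:
    missing_semicolon_accepted = False
print(missing_semicolon_accepted)
```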

36

DTD: Summary
Defines the document structure with a list of legal elements
DTD building blocks: ELEMENT, ATTLIST, ENTITY, PCDATA, CDATA
ELEMENT
Simple:
- empty (EMPTY)
- no constraints on type (ANY)
- text (#PCDATA)
• Composition:
- Sequence of elements, ordered list: (a, b, c)
- Alternatives, choices: (a | b | c)
- Mix: (a, (b | c), d)
• Occurrence multiplicity:
? (zero or one), * (zero or more), + (one or more)


37

Exercise
Propose a DTD that allows the definition of XML documents representing a Catalog of Products (products, description, price, etc).
Give an example of an XML document that conforms to the DTD you propose
Could you give some advantages and drawbacks of using DTDs?

38

The Catalog DTD
<!DOCTYPE CATALOG [
<!ELEMENT CATALOG (PRODUCT+)>
<!ELEMENT PRODUCT (SPECIFICATIONS+, OPTIONS?, PRICE+, NOTES?)>
<!ELEMENT SPECIFICATIONS (#PCDATA)>
<!ELEMENT OPTIONS (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
<!ELEMENT NOTES (#PCDATA)>
<!ATTLIST PRODUCT
  NAME CDATA #IMPLIED
  CATEGORY (HandTool | Table | Shop-Professional) "HandTool"
  PARTNUM CDATA #IMPLIED
  PLANT (Pittsburgh | Milwaukee | Chicago) "Chicago"
  INVENTORY (InStock | Backordered | Discontinued) "InStock">
<!ATTLIST SPECIFICATIONS
  WEIGHT CDATA #IMPLIED
  POWER CDATA #IMPLIED>
<!ATTLIST OPTIONS
  FINISH (Metal | Polished | Matte) "Matte"
  ADAPTER (Included | Optional | NotApplicable) "Included"
  CASE (HardShell | Soft | NotApplicable) "HardShell">
<!ATTLIST PRICE
  MSRP CDATA #IMPLIED
  WHOLESALE CDATA #IMPLIED
  STREET CDATA #IMPLIED
  SHIPPING CDATA #IMPLIED>
<!ENTITY AUTHOR "John Doe">
<!ENTITY COMPANY "JD Power Tools, Inc.">
<!ENTITY EMAIL "jd@jd-tools.com">
]>

39

DTD: Drawbacks
Weak data typing
Basically one data type: Text (String)
Not object oriented (no inheritance)
Specific language: not XML-based
Hard to parse/interpret
Proposition to overcome these limitations:
W3C's XML Schema

40

XML Schema
XML Schema is an XML-based alternative to DTD.
The XML Schema language is also referred to as XML Schema Definition (XSD).
An XML Schema describes the structure of an XML document:
Document's possible elements
Element attributes and their type
XML Schemas use XML syntax
No new language to learn, unlike the DTD
XML Schemas support many data types
More advantages
Data structures and rich data typing
Extensibility thanks to inheritance
Analyzable by standard XML parsers


41

The root element of an XML Schema
The <schema> element is the root element of every XML Schema:
<?xml version="1.0"?>
<xs:schema>
  //schema body...
  //...
</xs:schema>
The <schema> element may contain some attributes. A schema declaration often looks something like this:
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  //...
  //...
</xs:schema>

42

XSD, How to?
Reference a Schema in an XML Document:
<?xml version="1.0"?>
<note xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="path_to_your_file.xsd">
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
<note> is your document's root tag

43

XSD, How to?
Define a Simple Element
An XML element that can contain only text (typed data).
It cannot contain any other elements or attributes
Syntax: <xs:element name="xxx" type="yyy"/>
E.g.
An XML document (a chunk):
<lastname>Refsnes</lastname>
<age>36</age>
<dateborn>1970-03-27</dateborn>
The corresponding simple element definitions in the XML Schema:
<xs:element name="lastname" type="xs:string"/>
<xs:element name="age" type="xs:integer"/>
<xs:element name="dateborn" type="xs:date"/>
Also:
<xs:element name="color" type="xs:string" default="red"/> (if no value)
Or
<xs:element name="color" type="xs:string" fixed="red"/> (can't be modified)
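Typed simple elements mean that the text of each element can be read as a value of a known type. The sketch below mimics this in Python with the standard library only; the person wrapper element and the CONVERTERS table are assumptions made for the example, and the Python converters only approximate the xs:string, xs:integer and xs:date types.

```python
import datetime
import xml.etree.ElementTree as ET

CHUNK = """<person>
  <lastname>Refsnes</lastname>
  <age>36</age>
  <dateborn>1970-03-27</dateborn>
</person>"""

# Hand-written mapping from element name to a Python converter, standing in
# for the xs:string, xs:integer and xs:date declarations.
CONVERTERS = {
    "lastname": str,
    "age": int,
    "dateborn": datetime.date.fromisoformat,
}

root = ET.fromstring(CHUNK)
typed = {child.tag: CONVERTERS[child.tag](child.text) for child in root}
print(typed)
```

A value that does not fit its declared type, such as a non-numeric age, makes the converter raise an error, which is the kind of check a schema validator performs.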

44

XSD, How to?
Define XSD Attributes
All attributes are declared as simple types
If an element has attributes, it is considered a complex type
Syntax: <xs:attribute name="xxx" type="yyy"/>
E.g. <xs:attribute name="language" type="xs:string"/>
Also:
<xs:attribute name="lang" type="xs:string" default="EN"/> (if no value, use default)
<xs:attribute name="lang" type="xs:string" fixed="EN"/> (can't be modified)
Attributes are optional by default. To specify that the attribute is required, use the "use" property:
<xs:attribute name="lang" type="xs:string" use="required"/>


45

Basic types (1)
Type        Description
string      represents a string
boolean     a Boolean value: true or false
decimal     represents a decimal
float       represents a float
double      represents a double
duration    represents a duration
dateTime    represents a date/time value
time        represents a time (format: hh:mm:ss.sss)
date        represents a date (format: CCYY-MM-DD)
gYearMonth  represents a Gregorian year and month (format: CCYY-MM)

46

Basic types (2)
Type          Description
gYear         represents a year (format: CCYY)
gMonthDay     represents a day of a month (format: --MM-DD)
gDay          represents a day (format: ---DD)
gMonth        represents a month (format: --MM)
hexBinary     represents hexadecimal-encoded binary content
base64Binary  represents base64-encoded binary content
anyURI        represents a URI address (ex.: http://www.site.com)
NOTATION      represents a qualified name

47

Basic types (3)
Type                Description
token               represents a string without line feeds, carriage returns, tabs, leading and trailing spaces, and multiple spaces
language            represents a string that contains a valid language id
NMTOKEN             a string that represents the NMTOKEN attribute in XML (only used with schema attributes)
ID                  a string that represents the ID attribute in XML (only used with schema attributes)
IDREF, IDREFS       represent the attribute types IDREF, IDREFS
ENTITY, ENTITIES    represent the types ENTITY, ENTITIES
integer             represents an integer value (signed, arbitrary length)
nonPositiveInteger  an integer containing only non-positive values (.., -2, -1, 0)
negativeInteger     an integer containing only negative values (.., -2, -1)

48

Basic types (4)
Type                Description
long                a signed 64-bit integer {-9223372036854775808 - 9223372036854775807}
int                 a signed 32-bit integer {-2147483648 - 2147483647}
short               a signed 16-bit integer {-32768 - 32767}
byte                a signed 8-bit integer {-128 - 127}
nonNegativeInteger  an integer containing only non-negative values (0, 1, 2, ..)
unsignedLong        an unsigned 64-bit integer {0 - 18446744073709551615}
unsignedInt         an unsigned 32-bit integer {0 - 4294967295}
unsignedShort       an unsigned 16-bit integer {0 - 65535}
unsignedByte        an unsigned 8-bit integer {0 - 255}
positiveInteger     an integer containing only positive values (1, 2, ..)
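The bounds of the fixed-width integer types follow directly from their bit widths, so they can be recomputed rather than memorized. A small sketch in Python (the helper names signed_range and unsigned_range are hypothetical):

```python
def signed_range(bits):
    """Inclusive (min, max) of a two's-complement integer with `bits` bits."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def unsigned_range(bits):
    """Inclusive (min, max) of an unsigned integer with `bits` bits."""
    return 0, 2 ** bits - 1


RANGES = {
    "long": signed_range(64),
    "int": signed_range(32),
    "short": signed_range(16),
    "byte": signed_range(8),
    "unsignedLong": unsigned_range(64),
    "unsignedInt": unsigned_range(32),
    "unsignedShort": unsigned_range(16),
    "unsignedByte": unsigned_range(8),
}

for name, (low, high) in RANGES.items():
    print(f"{name}: {low} .. {high}")
```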


49

XSD Complex Elements
A complex element contains other elements and/or attributes
Four kinds of complex elements:
empty elements
elements that contain only other elements
elements that contain attributes and only text
elements that contain both other elements and text
Note: Each of these elements may contain attributes as well!

50

XSD, How to?
Define a Complex Element in XML Schema
Two different ways:
E.g.: somewhere in an XML document you have for instance:
<employee>   // a complex element that contains only other elements
  <firstname>John</firstname>
  <lastname>Smith</lastname>
</employee>
First possible XML Schema:
<xs:element name="employee">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="firstname" type="xs:string"/>
      <xs:element name="lastname" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

51

XSD, How to?
Define a Complex Element in XML Schema
E.g.: somewhere in an XML document you have for instance:
<employee>
  <firstname>John</firstname>
  <lastname>Smith</lastname>
</employee>
Second possible XML Schema:
<xs:element name="employee" type="personinfo"/>
//…
//……
<xs:complexType name="personinfo">
  <xs:sequence>
    <xs:element name="firstname" type="xs:string"/>
    <xs:element name="lastname" type="xs:string"/>
  </xs:sequence>
</xs:complexType>
If you use the second way (above), several elements can refer to the same complex type as their type
E.g.
<xs:element name="employee" type="personinfo"/>
<xs:element name="student" type="personinfo"/>
<xs:element name="member" type="personinfo"/>

52

XSD, How to?
Define Complex Text-Only Elements
A complex text-only element contains only simple content (text and attributes); we add a simpleContent element around the content.
E.g.
<shoesize country="france">35</shoesize>
The corresponding XML Schema:
<xs:element name="shoesize">
  <xs:complexType>
    <xs:simpleContent>                  // to indicate that it does not contain other elements
      <xs:extension base="xs:integer">  // to indicate the text type
        <xs:attribute name="country" type="xs:string" />  // the attribute type
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:element>


53

XSD, How to?
Define Complex Types with Mixed Content
A mixed complex type element can contain attributes, elements, and text.
E.g.
<letter>
  Dear Mr. <name>John Smith</name>.
  Your order <orderid>1032</orderid>
  will be shipped on <shipdate>2001-07-13</shipdate>
</letter>
The corresponding XML Schema:
<xs:element name="letter">
  <xs:complexType mixed="true">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="orderid" type="xs:positiveInteger"/>
      <xs:element name="shipdate" type="xs:date"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
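Mixed content is easiest to understand by looking at how a parser exposes the text that sits between child elements. The sketch below uses Python's standard-library xml.etree.ElementTree module on the letter example, written on one line so that the whitespace is predictable; this is only one possible API, other parsers expose the same text differently.

```python
import xml.etree.ElementTree as ET

LETTER = (
    "<letter>Dear Mr. <name>John Smith</name>. "
    "Your order <orderid>1032</orderid> "
    "will be shipped on <shipdate>2001-07-13</shipdate></letter>"
)

root = ET.fromstring(LETTER)

# Text before the first child lives in .text; text after a child lives in
# that child's .tail.
pieces = [("text", root.text)]
for child in root:
    pieces.append((child.tag, child.text))
    pieces.append(("text", child.tail))

for kind, value in pieces:
    print(kind, repr(value))

# itertext() yields every text fragment in document order.
full_text = "".join(root.itertext())
print(full_text)
```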

54

XSD Indicators
Indicators control HOW elements have to be used in documents
Order indicators:
All
specifies that the child elements can appear in any order
each child element must occur only once
E.g.:
<xs:element name="person">
  <xs:complexType>
    <xs:all>
      <xs:element name="firstname" type="xs:string"/>
      <xs:element name="lastname" type="xs:string"/>
    </xs:all>
  </xs:complexType>
</xs:element>

55

XSD Indicators
Order indicators:
Sequence
specifies that the child elements must appear in a specific order
E.g.:
<xs:element name="person">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="firstname" type="xs:string"/>
      <xs:element name="lastname" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

56

XSD Indicators
Order indicators:
Choice
specifies that either one child element or another can occur
E.g.:
<xs:element name="person">
  <xs:complexType>
    <xs:choice>
      <xs:element name="employee" type="employee"/>
      <xs:element name="member" type="member"/>
    </xs:choice>
  </xs:complexType>
</xs:element>


57

XSD Indicators
Occurrence indicators:
Occurrence indicators are used to define how often an element can occur
maxOccurs: the maximum number of times an element can occur
minOccurs: the minimum number of times an element can occur
E.g.:
<xs:element name="person">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="full_name" type="xs:string"/>
      <xs:element name="child_name" type="xs:string" maxOccurs="10" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
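What the occurrence indicators demand of a document can be sketched as a simple count check. The Python below uses the standard-library xml.etree.ElementTree module; the helper occurs_within and the sample names are made up for illustration, and a real schema validator checks much more than counts. When minOccurs and maxOccurs are omitted, both default to 1.

```python
import xml.etree.ElementTree as ET


def occurs_within(parent, tag, min_occurs, max_occurs):
    """Check that `tag` appears between min and max times (hypothetical helper)."""
    count = len(parent.findall(tag))
    return min_occurs <= count <= max_occurs


person = ET.fromstring(
    "<person>"
    "<full_name>Tove Refsnes</full_name>"
    "<child_name>Hege</child_name>"
    "<child_name>Stale</child_name>"
    "</person>"
)

# full_name uses the defaults minOccurs=1, maxOccurs=1.
print(occurs_within(person, "full_name", 1, 1))
# child_name is declared with minOccurs="0" and maxOccurs="10".
print(occurs_within(person, "child_name", 0, 10))
```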

58

XML Schema extension Element
The extension element extends an existing simpleType or complexType element.
Parent enclosing elements: simpleContent or complexContent
<xs:complexType name="personinfo">
  <xs:sequence>
    <xs:element name="firstname" type="xs:string"/>
    <xs:element name="lastname" type="xs:string"/>
  </xs:sequence>
</xs:complexType>
<xs:complexType name="fullpersoninfo">
  <xs:complexContent>
    <xs:extension base="personinfo">
      <xs:sequence>
        <xs:element name="address" type="xs:string"/>
        <xs:element name="city" type="xs:string"/>
        <xs:element name="country" type="xs:string"/>
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>
</xs:schema>

59

XML Schema Restrictions
Restrictions are used to define acceptable values for XML elements or attributes.
Parent enclosing elements: simpleType
E.g.:
<xs:element name="car">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:enumeration value="Audi"/>
      <xs:enumeration value="Golf"/>
      <xs:enumeration value="BMW"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>

60

XML Schema Restrictions
Restrictions for data types
Constraint      Description
enumeration     Defines a list of acceptable values
fractionDigits  Specifies the maximum number of decimal places allowed. Must be equal to or greater than zero
length          Specifies the exact number of characters or list items allowed. Must be equal to or greater than zero
maxExclusive    Specifies the upper bounds for numeric values (the value must be less than this value)
maxInclusive    Specifies the upper bounds for numeric values (the value must be less than or equal to this value)
maxLength       Specifies the maximum number of characters or list items allowed. Must be equal to or greater than zero
minExclusive    Specifies the lower bounds for numeric values (the value must be greater than this value)
minInclusive    Specifies the lower bounds for numeric values (the value must be greater than or equal to this value)
minLength       Specifies the minimum number of characters or list items allowed. Must be equal to or greater than zero
pattern         Defines the exact sequence of characters that are acceptable
totalDigits     Specifies the maximum number of digits allowed. Must be greater than zero
whiteSpace      Specifies how white space (line feeds, tabs, spaces, and carriage returns) is handled


61

XML Schema Patterns
Constraints on values of basic types
With use of Regular Expressions
Example
<xs:element name="letter">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:pattern value="([a-z])*"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
/* The acceptable value is zero or more occurrences of lowercase letters from a to z */
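The effect of this pattern can be tried out with an ordinary regular-expression engine. The sketch below uses Python's standard-library re module; for this simple pattern the syntax is the same, although XSD regular expressions and Python regular expressions differ in general. The helper name matches_letter_pattern is hypothetical.

```python
import re

# XSD patterns must match the whole value, which re.fullmatch mirrors.
LETTER_PATTERN = re.compile(r"([a-z])*")


def matches_letter_pattern(value):
    """True if the value fits the pattern from the slide (hypothetical helper)."""
    return LETTER_PATTERN.fullmatch(value) is not None


for candidate in ["abc", "", "aBc", "abc1"]:
    print(repr(candidate), matches_letter_pattern(candidate))
```

The empty string is accepted because * allows zero occurrences; an uppercase letter or a digit anywhere in the value causes a rejection.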

62

XML Schema: example (1)
<xsd:schema xmlns:xsd="http://www.w3.org/1999/XMLSchema">
  <xsd:element name="purchaseOrder" type="PurchaseOrderType"/>
  <xsd:element name="comment" type="xsd:string"/>
  <xsd:complexType name="PurchaseOrderType">
    <xsd:sequence>
      <xsd:element name="shipTo" type="USAddress"/>
      <xsd:element name="billTo" type="USAddress"/>
      <xsd:element ref="comment" minOccurs="0"/>
      <xsd:element name="items" type="Items"/>
    </xsd:sequence>
    <xsd:attribute name="orderDate" type="xsd:date"/>
  </xsd:complexType>

63

XML Schema: example (2)
  <xsd:complexType name="Items">
    <xsd:sequence>
      <xsd:element name="item" minOccurs="0" maxOccurs="unbounded">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element name="productName" type="xsd:string"/>
            <xsd:element name="quantity">
              <xsd:simpleType>
                <xsd:restriction base="xsd:positiveInteger">
                  <xsd:maxExclusive value="100"/>
                </xsd:restriction>
              </xsd:simpleType>
            </xsd:element>
            <xsd:element name="USPrice" type="xsd:decimal"/>
            <xsd:element ref="comment" minOccurs="0"/>
            <xsd:element name="shipDate" type="xsd:date" minOccurs="0"/>
          </xsd:sequence>
          <xsd:attribute name="partNum" type="SKU" use="required"/>
        </xsd:complexType>
      </xsd:element>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>

64

Some XML Tools
Editor                 Tool                                      Support
Tibco (Extensibility)  XML Authority 2.0                         DTD, Schema
Altova                 XML Spy (2007 available: 1 month trial)   DTD, Schema
Dasan                  Tagfree 2000 DTD Editor                   DTD
Data Junction          XML Junction                              Schema
Insight Soft.          XMLMate 2.0                               DTD, Schema
Microstar Soft.        Near & Far Designer                       DTD


65

Exercise
Give the XML schema of what could be an address book
Give an example of an XML document that contains a list of contacts and that conforms to the schema you proposed
Your feedback?

66

XML Schema, summary
Expressive and extensible
Rich data typing
More and more used
Model exchange: XMI 2.0
Web Services: SOAP, WSDL
A bit too complex!

67

XML: Wrap-up (Part I, II)
Meta-language! Infinite number of:
- tags and,
- their associated attributes
A tree-like structure!
Separation:
- Data (.xml)
- Logical syntax (.dtd or .xsd)
Formalization:
- Well-formed documents (correct XML syntax)
- valid (well formed and conforms to a DTD or XSD)
A standard and a bunch of tools available for XML developers

