ITR3 lecture 3: Namespaces, XML Schema & XSL
Thomas Krichel
2002-09-10
Gee….
• Bird's-eye view only; have a look at what these things do.
• If there is interest, I can teach some more in a separate course.
• Structure
– Some XML related standards
– Namespaces
– XML Schema
– XSL
Literature
• Castro, Elizabeth (2001). XML for the World Wide Web: Visual QuickStart Guide. Peachpit Press.
• Duckett, Jon et al. (2001). Professional XML Schemas. Wrox Press (recommended)
• Kay, Michael (2001). XSLT (2nd ed.). Wrox Press.
XHTML
• This is HTML redefined so that it becomes well-formed XML
• Examples
– Case-sensitive element names
– <p> replaced by <p/>
• Verdict: pain without gain
Resource Description Framework (RDF)
• A standard issued by the W3C. A framework to encode meaning to make it computer processable.
• Uses the approach of a directed graph.
• Generalizes an object / property / value approach
– Value may be another object.
– Objects are identified by a URI.
– Properties may be identified by a URI.
• A paper on RDF is available at http://openlib.org/home/krichel/papers/anhalter.letter.pdf
• RDF XML syntax is defined but currently being reworked.
• Verdict: very costly to implement.
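The object / property / value idea can be tried out without any RDF software. The sketch below is only an illustration in plain Python: the URIs, property names and the title string are all made up, and a list of tuples stands in for the directed graph.

```python
# Hypothetical statements; every URI and literal here is invented for illustration.
PAPER = "http://example.org/papers/1"
AUTHOR = "http://example.org/people/krichel"

triples = [
    (PAPER, "http://example.org/terms/title", "A paper on RDF"),
    (PAPER, "http://example.org/terms/creator", AUTHOR),  # the value is another object
    (AUTHOR, "http://example.org/terms/name", "Thomas Krichel"),
]

def values(subject, prop):
    """Follow the edges labelled `prop` that leave `subject`."""
    return [v for s, p, v in triples if s == subject and p == prop]

# Walk two edges of the graph: paper -> creator -> name.
creator = values(PAPER, "http://example.org/terms/creator")[0]
creator_name = values(creator, "http://example.org/terms/name")[0]
print(creator_name)
```

Because a value can itself be an object, the second lookup starts where the first one ended.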
Cascading style sheets (CSS)
• a non-XML way of writing stylesheets that can be applied to both XML and HTML. Widely supported by browsers.
• Written as a sequence of rules. Example
compositionyear, recordingyear {
color: red;
font-family: sans-serif }
• Verdict: not flexible
XPath and XPointer
• are non-XML syntaxes for referring to parts of an XML document, specifically
– ranges
– points
– sets of nodes of an XML document.
• They are used in other XML-related standards, in particular in XSL, and will be covered as part of XSL.
• Verdict: useful
XLinks
• is an XML syntax to link XML documents.
• They go way beyond the conventional linking capabilities of HTML, but there is no obvious way for the browser to represent them.
• Verdict: nonsense
Document Object Model (DOM)
• “a platform- and language-neutral interface that will allow programs and scripts to dynamically access and update the content, structure and style of documents. The Document Object Model provides a standard set of objects for representing HTML and XML documents, a standard model of how these objects can be combined, and a standard interface for accessing and manipulating them.”
• Now at "Level 3".
• Works by building a tree out of a document.
• Verdict: extremely complicated
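To get a feel for the tree-based approach, here is a minimal sketch using xml.dom.minidom from the Python standard library, which implements a small subset of the DOM interfaces. The sample element is borrowed from the information set slide; the replacement values are made up.

```python
from xml.dom.minidom import parseString

# Parsing builds the whole tree in memory before we can look at anything.
doc = parseString('<person sex="female">Margarete Krichel</person>')
person = doc.documentElement   # the document element
text = person.firstChild       # its only child, a text node

print(person.tagName, person.getAttribute("sex"), text.data)

# Update content and attributes through the tree, then serialize again.
person.setAttribute("sex", "f")
text.data = "M. Krichel"
updated = person.toxml()
print(updated)
```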
Simple API for XML (SAX)
• SAX is an event-based parsing model. It reports parsing events (such as the start and end of elements) directly to the application through callbacks.
• Does not usually build an internal tree.
• A lot less resource-intensive,
– when the document is large
– when the task is simple.
• Verdict: thumbs up!
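A minimal sketch of the callback style, using the xml.sax module from the Python standard library. The handler class name and the tiny input document are made up for illustration; the handler only records events and never builds a tree.

```python
import xml.sax

class EventRecorder(xml.sax.ContentHandler):
    """Collects start and end events in the order the parser reports them."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))

handler = EventRecorder()
xml.sax.parseString(b"<a><b/><b/></a>", handler)
print(handler.events)
```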
XML Information Sets
• best understood through an example. Consider two XML snippets.
• Snippet 1 <person sex="female"> Margarete Krichel</person>
• Snippet 2 <person sex='female'>Margarete Krichel </person>
• Are they the same?
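One way to explore the question is to parse both snippets and compare what the parser reports. The sketch below uses xml.etree.ElementTree, which is only a stand-in for a formal information set: the quote style around the attribute value disappears, while whitespace inside the element content is kept as character data.

```python
import xml.etree.ElementTree as ET

one = ET.fromstring('<person sex="female"> Margarete Krichel</person>')
two = ET.fromstring("<person sex='female'>Margarete Krichel </person>")

same_attributes = one.attrib == two.attrib             # quote style is not reported
same_text = one.text == two.text                       # leading vs trailing space
same_text_trimmed = one.text.strip() == two.text.strip()
print(same_attributes, same_text, same_text_trimmed)
```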
XML Namespaces
• Allow XML element names and attribute names to be made globally unique by associating them with a particular URI, usually a URL.
• The globally unique name is called the qualified name or qname, for short.
• The name without the namespace URI is called the local name.
• This is done through a namespace declaration and a prefix. The namespace declaration associates a short string, called a prefix, with the namespace.
• The qualified name can then be written as prefix:localname
Namespace syntax
• <element xmlns[:prefix]="URI"> … </element>
• element is the element name
• prefix is the prefix
• URI is a URI, often a URL, actually.
• [ ] indicate that it is optional. If the prefix is missing, all elements that have no namespace prefix belong, by default, to the declared namespace.
• The namespace declaration remains local to element and its children.
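A small sketch of how a namespace-aware parser resolves these declarations. It uses xml.etree.ElementTree, which writes each resolved name as {URI}localname; the example.org URIs and element names are made up.

```python
import xml.etree.ElementTree as ET

# One default namespace declaration and one declaration with a prefix.
doc = ET.fromstring(
    '<catalog xmlns="http://example.org/catalog" '
    'xmlns:p="http://example.org/people">'
    '<item><p:owner>Thomas</p:owner></item>'
    '</catalog>'
)

# Unprefixed elements pick up the default namespace; p:owner picks up the other.
names = [el.tag for el in doc.iter()]
print(names)
```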
Avoiding cerebral indigestion related to namespaces
• Expect nothing if you retrieve the namespace URI, when it is a URL.
• Prefixes can be any short string. Some prefixes are customary, like xsi for http://www.w3.org/2001/XMLSchema-instance
• Default namespaces only apply to elements, not attributes. An attribute without a prefix is interpreted in the context of its element; it is placed in a namespace of its own only if it has an explicit prefix.
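The points about prefixes and attributes can be checked with a namespace-aware parser. In this sketch (xml.etree.ElementTree, made-up example.org URI), the default namespace and the prefix b point to the same URI, so the prefix string itself carries no meaning, and the unprefixed attribute id is reported without any namespace.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    '<doc xmlns="http://example.org/ns" xmlns:b="http://example.org/ns" '
    'b:lang="en" id="d1"/>'
)

element_name = doc.tag
attribute_names = set(doc.attrib)
print(element_name, sorted(attribute_names))
```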
XML Schemas
http://www.w3.org/TR/xmlschema-0/ (Primer)
http://www.w3.org/TR/xmlschema-1/ (Structures)
http://www.w3.org/TR/xmlschema-2/ (Datatypes)
What is XML Schema?
• XML Schema is a vocabulary for expressing constraints on the validity of an XML document.
• A piece of XML is valid if it satisfies the constraints expressed in another XML file, the schema file.
• The idea is to check if the XML file is fit for a certain purpose.
Example
<location>
  <latitude>32.904237</latitude>
  <longitude>73.620290</longitude>
  <uncertainty units="meters">2</uncertainty>
</location>
To be valid, this XML snippet must meet all the following constraints:
1. The location must be comprised of a latitude, followed by a longitude, followed by an indication of the uncertainty of the lat/lon measurements.
2. The latitude must be a decimal with a value between -90 and +90.
3. The longitude must be a decimal with a value between -180 and +180.
4. For both latitude and longitude the number of digits to the right of the decimal point must be exactly six.
5. The value of uncertainty must be a non-negative integer.
6. The uncertainty units must be either meters or feet.
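To see what a validator has to do, the constraints above can be written out by hand. The sketch below is plain Python with the standard library, not XML Schema validation; the function name and error messages are made up. A schema lets you declare the same rules instead of coding them.

```python
import re
import xml.etree.ElementTree as ET
from decimal import Decimal

SIX_DIGITS = re.compile(r"-?\d+\.\d{6}")  # decimal with exactly six fraction digits

def check_location(xml_text):
    """Return a list of constraint violations (empty if there are none)."""
    loc = ET.fromstring(xml_text)
    if [child.tag for child in loc] != ["latitude", "longitude", "uncertainty"]:
        return ["expected latitude, longitude, uncertainty in that order"]
    lat, lon, unc = loc
    errors = []
    for el, bound in ((lat, 90), (lon, 180)):
        if not SIX_DIGITS.fullmatch(el.text or ""):
            errors.append(f"{el.tag}: need a decimal with six fraction digits")
        elif abs(Decimal(el.text)) > bound:
            errors.append(f"{el.tag}: out of range")
    if not re.fullmatch(r"\d+", unc.text or ""):
        errors.append("uncertainty: need a non-negative integer")
    if unc.get("units") not in ("meters", "feet"):
        errors.append("uncertainty units: need meters or feet")
    return errors

good = ("<location><latitude>32.904237</latitude>"
        "<longitude>73.620290</longitude>"
        '<uncertainty units="meters">2</uncertainty></location>')
bad = good.replace("32.904237", "132.9")
print(check_location(good))
print(check_location(bad))
```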
Validating your data
<location>
  <latitude>32.904237</latitude>
  <longitude>73.620290</longitude>
  <uncertainty units="meters">2</uncertainty>
</location>
- check that the latitude is between -90 and +90
- check that the longitude is between -180 and +180
- check that the number of fraction digits is 6
- etc.
[Diagram: the XML instance and the XML Schema file are both fed to XML Schema validator software, which reports "Data is ok!"]
History of Schema
• Once upon a time, there was SGML
• SGML has a “schema” language called a DTD.
• It is crap
– Different syntax than SGML
– Main focus on presence and absence of elements
– Very limited capabilities to check contents of elements (datatypes)
XML Schemas can constrain
• the structure of instance documents
– "this element contains these elements, which contain these other elements", etc.
• the datatype of each element/attribute
– "this element shall hold an integer within the range 0 to 12,000"
Highlights of XML Schemas
• 44 built-in datatypes
• Can create your own datatypes by extending or restricting existing datatypes
• Written in the same syntax as instance documents
• Can express sets, i.e., can define the child elements to occur in any order
• Can specify element content as being unique (keys on content) and uniqueness within a region
• Can define multiple elements with the same name but different content
• Can define elements with nil content
• Can define substitutable elements
important schema concepts
• simple types: types that cannot have child elements
– elements that only have text contents and no attributes
– attributes
• complex type: type of anything that can have child elements or attributes
important schema concepts
• global declarations are direct children of the root schema element. They are visible everywhere.
• all other declarations are local and are limited in scope to the element that they appear within
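The difference between global and local declarations can be read straight off a schema document. The sketch below parses a small, made-up schema (a variant of the book store idea with a locally declared Title) with xml.etree.ElementTree and separates the direct children of the schema element from declarations nested deeper. It only inspects the document; it does not validate anything.

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema: Book and Publisher are global, Title is local to Book.
schema = ET.fromstring("""
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="Book">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="Title" type="xsd:string"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="Publisher" type="xsd:string"/>
</xsd:schema>
""")

# findall() looks at direct children only; iter() walks the whole tree.
global_names = [el.get("name") for el in schema.findall(XSD + "element")]
all_names = [el.get("name") for el in schema.iter(XSD + "element")]
local_names = [name for name in all_names if name not in global_names]
print(global_names, local_names)
```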
important schema concepts
• Value space. The range of values that the type can take
• Lexical space. The range of literals that represent the values
• Set of facets. The defining properties of a type.
– Fundamental facets include equality, order, bounds, cardinality, numeric/non-numeric
– Constraining facets include ranges for numbers, string lengths, or regular expressions
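The split between lexical space and value space is easy to see with numbers. In this sketch Python's Decimal stands in for a decimal datatype, which is an assumption made for illustration: four different literals all denote the same value.

```python
from decimal import Decimal

literals = ["2", "2.0", "+2.00", "02"]        # four items of the lexical space
values = {Decimal(lit) for lit in literals}   # the values they represent
print(len(literals), len(values))
```

A constraining facet such as a range applies to the value, so all four literals pass or fail together.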
Namespaces
• An XML Schema file mixes vocabulary from the XML Schema language with the new vocabulary being created.
• It has to keep both separate using namespaces.
• Namespaces associate a URI with names.
[Diagram: two namespaces. The namespace http://www.w3.org/2001/XMLSchema contains element, complexType, schema, sequence, string, integer and boolean. This is the vocabulary that XML Schemas provide to define your new vocabulary. The namespace http://www.books.org (targetNamespace) contains BookStore, Book, Title, Author, Date, ISBN and Publisher. This is the vocabulary for our book store XML description.]
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://www.books.org"
            xmlns="http://www.books.org"
            elementFormDefault="qualified">
  <xsd:element name="BookStore">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="Book" minOccurs="1" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="Book">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="Title" minOccurs="1" maxOccurs="1"/>
        <xsd:element ref="Author" minOccurs="1" maxOccurs="1"/>
        <xsd:element ref="Date" minOccurs="1" maxOccurs="1"/>
        <xsd:element ref="ISBN" minOccurs="1" maxOccurs="1"/>
        <xsd:element ref="Publisher" minOccurs="1" maxOccurs="1"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="Title" type="xsd:string"/>
  <xsd:element name="Author" type="xsd:string"/>
  <xsd:element name="Date" type="xsd:string"/>
  <xsd:element name="ISBN" type="xsd:string"/>
  <xsd:element name="Publisher" type="xsd:string"/>
</xsd:schema>
BookStore.xsd (see example01). xsd = XML Schema Definition
(explanations on succeeding pages)
All XML Schemas have "schema" as the root element.
The elements and datatypes that are used to construct schemas
- schema
- element
- complexType
- sequence
- string
come from the http://…/XMLSchema namespace
The targetNamespace attribute says that the elements defined by this schema
- BookStore
- Book
- Title
- Author
- Date
- ISBN
- Publisher
are to go in this namespace: http://www.books.org, the Book namespace (targetNamespace)
ref="Book" is referencing a Book element declaration. The Book in what namespace?
The default namespace is http://www.books.org, which is the targetNamespace!
elementFormDefault="qualified" is a directive to any instance documents which conform to this schema: any elements that are defined in this schema must be namespace-qualified when used in instance documents.
Referencing a schema in an XML instance document
<?xml version="1.0"?>
<BookStore xmlns="http://www.books.org"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.books.org BookStore.xsd">
  <Book>
    <Title>My Life and Times</Title>
    <Author>Paul McCartney</Author>
    <Date>July, 1998</Date>
    <ISBN>94303-12021-43892</ISBN>
    <Publisher>McMillin Publishing</Publisher>
  </Book>
  ...
</BookStore>
1. First, using a default namespace declaration, tell the schema-validator that all of the elements used in this instance document come from the http://www.books.org namespace.
2. Second, with schemaLocation tell the schema-validator that the http://www.books.org namespace is defined by BookStore.xsd (i.e., schemaLocation contains a pair of values).
3. Third, tell the schema-validator that the schemaLocation attribute we are using is the one in the XML Schema-instance namespace.
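The three steps can be retraced with a namespace-aware parser. This sketch uses xml.etree.ElementTree on a cut-down copy of the instance document; it only reads the hint that a validator would use and does not validate. The variable names are made up.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

doc = ET.fromstring(
    '<BookStore xmlns="http://www.books.org" '
    'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    'xsi:schemaLocation="http://www.books.org BookStore.xsd"/>'
)

# schemaLocation is a whitespace-separated list of (namespace, location) pairs.
tokens = doc.get(f"{{{XSI}}}schemaLocation").split()
pairs = dict(zip(tokens[0::2], tokens[1::2]))

# The parser writes element names as {namespace}localname.
root_namespace = doc.tag[1:].split("}")[0]
schema_file = pairs[root_namespace]
print(root_namespace, schema_file)
```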
[Diagram: the XMLSchema-instance namespace, http://www.w3.org/2001/XMLSchema-instance, containing schemaLocation, type, noNamespaceSchemaLocation and nil]
Referencing a schema in an XML instance document
BookStore.xml
schemaLocation="http://www.books.org BookStore.xsd"
- uses elements from namespace http://www.books.org
BookStore.xsd
targetNamespace="http://www.books.org"
- defines elements in namespace http://www.books.org
A schema defines a new vocabulary. Instance documents use that new vocabulary.
Note multiple levels of checking
BookStore.xml against BookStore.xsd: validate that the XML document conforms to the rules described in BookStore.xsd
BookStore.xsd against XMLSchema.xsd (schema-for-schemas): validate that BookStore.xsd is a valid schema document, i.e., it conforms to the rules described in the schema-for-schemas
Using XSLT and XPath
XSL transforms XML
• XSL may be used to generate either HTML, XML, or text
[Diagram: XML and XSL go into the XSL Processor, which produces HTML (or XML or text)]
Doing it using Internet Explorer
• First, download the latest version of Internet Explorer (at this time it is 6.0)
• Write an XSL stylesheet stylish.xsl
• Write an XML file, and refer to the XSL stylesheet with a processing instruction
<?xml-stylesheet type="text/xsl" href="stylish.xsl"?>
Note: this does not work with other browsers!
XML tree
• XSL has a model of XML as a tree.
• XSL tree model is similar to the DOM model.
• As the processor does its job it looks at elements of the input tree and transforms them to the output tree.
• The processor only writes the tree to the file at the end.
• End points in the tree are called “nodes”.
in the general section
• we examine how XSL looks at an XML document. In fact it builds a tree.
• and then we look at a very simple way to look at what the stylesheet does. After that we have Roger showing us the details.
Seven types of nodes
• root node: contains all the elements in the document. Not to be confused with the document element of XML.
• element node: contains an element
• text node: contains an as-large-as-possible area of text.
• attribute node: contains attribute name and value
• comment node: contains a comment
• processing instruction (p-i) node
• namespace node: each element node has one namespace node for every namespace declaration
properties of nodes: name
• This is empty for the root, text and comment nodes.
• for element and attribute nodes, it is the name as it appears in the XML file, expanded by namespace declarations.
• for p-i nodes, it is the target
• for a namespace node, it is the prefix
properties of nodes: string value
• for text nodes: the text
• for comment nodes: the text of the comment
• for p-i nodes: the data part of the p-i.
• for an attribute node: the value of the attribute
• for a root node: the concatenation of all the string values of all element and text children.
• for a namespace node: the URI of the namespace
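The concatenation rule is easy to reproduce for an element. The sketch below uses xml.etree.ElementTree rather than an XSL processor, so it only approximates the model: itertext() yields the text below an element in document order, and joining the pieces gives the element's string value. Attribute values are not part of it.

```python
import xml.etree.ElementTree as ET

member = ET.fromstring(
    '<Member level="platinum"><Name>Jeff</Name>'
    '<Phone type="home">555-1234</Phone></Member>'
)

# Concatenate all text found below the element, in document order.
string_value = "".join(member.itertext())
print(string_value)
```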
properties of nodes: base URI
• for all nodes: the URI of the XML source document where the node has been found
• Only of interest for elements and p-i nodes
• for the root node: the URI of the document
• for attribute, text and comment nodes: the base URI of its parent node
properties of nodes: children
• for element nodes: all the element nodes, text nodes, p-i nodes and comment nodes between its start and end tags.
• for root nodes: all the element nodes, text nodes, p-i nodes and comment nodes that are not children of some other node.
parent node
• for all nodes except root nodes: the parent of the node.
• attribute nodes and namespace nodes have an element node as parent node, but are not considered to be its child.
property of nodes: attribute
• element: the attributes that the element has, if any
• other nodes: empty
Now we look at what XSL does
Different formats…
• <xsl:output method="xml"/> is the default
• <xsl:output method="html"/>
• <xsl:output method="text"/> used for everything else. Final formatting may be up to formatting objects, anyway.
• Your stylesheet processor may have more formats, but they will be vendor-specific.
templates set rules
<xsl:template match="expression">
do some stuff
</xsl:template>
This is a rule that says: if you find a node that matches the expression expression, then go ahead and do some stuff. It is called a template. The fact that a rule is written down does not imply that it is applied.
applying templates
• <xsl:apply-templates/>
says: for each child node of the current node, find the template rule that matches it and apply it.
Default, built-in rules for the nodes
• root: <xsl:apply-templates/> on all children
• element: <xsl:apply-templates/> on all children of the current node
• attribute: copy the value as text to the output
• text: copy the text to the output
• comment, p-i, namespace: do nothing
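The interplay of written rules and built-in rules can be imitated in a few lines. This is a toy model in Python, not an XSL processor: rules are looked up by element name in a dictionary instead of by match expression, and the function names are made up. With no rules at all, only the text comes out, which is what the built-in rules do.

```python
import xml.etree.ElementTree as ET

def transform(node, rules):
    """Use the rule registered for this element name, else the built-in default."""
    rule = rules.get(node.tag)
    if rule is not None:
        return rule(node, rules)
    return apply_templates(node, rules)

def apply_templates(node, rules):
    """Built-in behaviour: copy text to the output and process each child."""
    out = [node.text or ""]
    for child in node:
        out.append(transform(child, rules))
        out.append(child.tail or "")
    return "".join(out)

doc = ET.fromstring("<Member><Name>Jeff</Name><Phone>555-1234</Phone></Member>")

# No rules written down: the defaults copy all text.
no_rules = transform(doc, {})

# One rule for Name; Phone still falls back to the default.
bold_name = {"Name": lambda n, r: "<b>" + apply_templates(n, r) + "</b>"}
with_rule = transform(doc, bold_name)
print(no_rules)
print(with_rule)
```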
HTML Generation
• We will first use XSL to generate HTML documents
• When generating HTML, XSL should be viewed as a tool to enhance HTML documents.
– That is, the HTML documents may be enhanced by extracting data out of XML documents
– XSL provides elements (tags) for extracting the XML data, thus allowing us to enhance HTML documents with data from an XML document
Enhancing HTML Documents with XML Data
[Diagram: the XSL Processor reads the XML document and the HTML document (with embedded XSL elements); each XSL element is replaced by XML data]
Enhancing HTML Documents with the Following XML Data
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="FitnessCenter.xsl"?>
<FitnessCenter>
  <Member level="platinum">
    <Name>Jeff</Name>
    <Phone type="home">555-1234</Phone>
    <Phone type="work">555-4321</Phone>
    <FavoriteColor>lightgrey</FavoriteColor>
  </Member>
</FitnessCenter>
FitnessCenter.xml
Embed HTML Document in an XSL Template
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
  <xsl:output method="html"/>
  <xsl:template match="/">
    <HTML>
      <HEAD>
        <TITLE>Welcome</TITLE>
      </HEAD>
      <BODY>
        Welcome!
      </BODY>
    </HTML>
  </xsl:template>
</xsl:stylesheet>
FitnessCenter.xsl (see html-example01)
Note
• The HTML is embedded within an XSL template, which is an XML document. The HTML must be well formed.
• We are able to add XSL elements to the HTML, allowing us to extract data out of XML documents.
• Let's customize the HTML welcome page by putting in the member's name. This is achieved by extracting the name from the XML document. We use an XSL element to do this.
Extracting the Member Name
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
  <xsl:output method="html"/>
  <xsl:template match="/">
    <HTML>
      <HEAD>
        <TITLE>Welcome</TITLE>
      </HEAD>
      <BODY>
        Welcome <xsl:value-of select="/FitnessCenter/Member/Name"/>!
      </BODY>
    </HTML>
  </xsl:template>
</xsl:stylesheet>
(see html-example02)
Extracting a Value from & Navigating the XML Document
• Extracting values:
– use the <xsl:value-of select="…"/> XSL element
• Navigating:
– The slash ("/") indicates parent/child relationship
– A slash at the beginning of the path indicates that it is an absolute path, starting from the top of the XML document
/FitnessCenter/Member/Name
"Start from the top of the XML document, go to the FitnessCenter element, from there go to the Member element, and from there go to the Name element."
Document /
  PI <?xml version="1.0"?>
  Element FitnessCenter
    Element Member
      Element Name
        Text Jeff
      Element Phone
        Text 555-1234
      Element Phone
        Text 555-4321
      Element FavoriteColor
        Text lightgrey
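The path /FitnessCenter/Member/Name can be tried out without an XSL processor. The sketch below uses xml.etree.ElementTree, which supports only a limited subset of XPath and works relative to an element, so the path is written relative to the FitnessCenter document element instead of starting with a slash. The greeting line mimics what the stylesheet writes into the BODY.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    '<FitnessCenter><Member level="platinum"><Name>Jeff</Name>'
    '<Phone type="home">555-1234</Phone>'
    '<Phone type="work">555-4321</Phone>'
    '<FavoriteColor>lightgrey</FavoriteColor></Member></FitnessCenter>'
)

# root is the FitnessCenter element; each "/" steps from parent to child.
name = root.find("Member/Name").text
work_phone = root.find("Member/Phone[@type='work']").text
greeting = f"Welcome {name}!"
print(greeting, work_phone)
```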
http://openlib.org/home/krichel
Thank you for your attention!