XML
2Microsoft
What is XML?
• The Extensible Markup Language (XML) is a general-purpose markup language.
• It is classified as an extensible language because it allows its users to define their own tags.
• XML is a markup language for documents containing structured information.
• Its primary purpose is to facilitate the sharing of structured data across different information systems, particularly via the Internet.
• XML is recommended by the World Wide Web Consortium (W3C). It is a fee-free open standard.
So XML is Just Like HTML?
• No. In HTML, both the tag semantics and the tag set are fixed.
• An <h1> is always a first level heading and the tag <mscit> is meaningless.
• XML specifies neither semantics nor a tag set.
Why Is XML Important?
• Plain Text
– Since XML is not a binary format, you can create and edit files with anything from a standard text editor to a visual development environment.
– That makes it easy to debug your programs, and makes it useful for storing small amounts of data.
– XML scales from small configuration files to a company-wide data repository.
• Data Identification
– XML tells you what kind of data you have, not how to display it.
Why Is XML Important?
• XML can Separate Data from HTML
– With XML, your data is stored outside your HTML.
• XML is Used to Exchange Data
– With XML, data can be exchanged between incompatible systems.
• XML and B2B
– With XML, financial information can be exchanged over the Internet.
Why Is XML Important?
• XML Can be Used to Share Data
– With XML, plain text files can be used to share data.
– Since XML data is stored in plain text format, XML provides a software- and hardware-independent way of sharing data.
– In the real world, computer systems and databases contain data in incompatible formats. One of the most time-consuming challenges for developers has been to exchange data between such systems over the Internet.
– Converting the data to XML can greatly reduce this complexity and create data that can be read by many different types of applications.
Why Is XML Important?
• XML Can Be Used to Create Specialized Vocabularies
– XML is an extensible standard.
– By using XML as a base, you can create your own vocabularies. The Wireless Markup Language (WML), used with the Wireless Application Protocol (WAP), and the Simple Object Access Protocol (SOAP) are some examples of specialized XML vocabularies.
Other Applications of XML
• Molecular sciences with CML
– Peter Murray-Rust’s Chemical Markup Language (CML)
• Science and math with MathML
– The Mathematical Markup Language (MathML) is an XML application for mathematical equations.
• Webcasting with CDF
– Microsoft’s Channel Definition Format (CDF) is an XML application for defining channels. Web sites use channels to push information to readers who subscribe to the site rather than waiting for them to come and get it. This is alternatively called Webcasting or push.
Other Applications of XML
• Software updates through OSD
– The Open Software Description (OSD) format is an XML application co-developed by Marimba and Microsoft for updating software automatically. OSD defines XML tags that describe software components. The description of a component includes the version of the component, its underlying structure, and its relationships to and dependencies on other components.
• Vector graphics with both PGML and VML
– Vector graphics scale without loss of quality, unlike bitmap formats such as GIF and JPEG.
– The Precision Graphics Markup Language (PGML) from IBM, Adobe, Netscape, and Sun.
– The Vector Markup Language (VML) from Microsoft, Macromedia, Autodesk, Hewlett-Packard, and Visio.
Other Applications of XML
• Financial data with OFX
– The Open Financial Exchange Format (OFX) is an XML application used to describe financial data of the type you’re likely to store in a personal finance product like Money or Quicken. Any program that understands OFX can read OFX data. And since OFX is fully documented and non-proprietary (unlike the binary formats of Money, Quicken, and other programs), it’s easy for programmers to write the code to understand OFX.
Other Applications of XML
• Automated voice responses with VoxML
• Financial data with OFX
• Legally binding forms with XFDL
• Human resources job information with HRML
• Meta-data through RDF
• Internal use of XML by various companies, including Microsoft, Federal Express, and Netscape
XML Syntax
• The syntax rules of XML are very simple and very strict.
An Example XML Document
XML documents use a self-describing and simple syntax.
<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
  <to>Students</to>
  <from>Faculty</from>
  <heading>Reminder</heading>
  <body>Learn XML</body>
</note>
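As a rough illustration of how a program might read the example document, the sketch below parses it with Python's standard xml.etree.ElementTree module. The variable names NOTE_XML and fields are illustrative and do not come from the slides.

```python
import xml.etree.ElementTree as ET

# The note document from the slide, given as bytes so that the parser
# applies the encoding named in the XML declaration.
NOTE_XML = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
  <to>Students</to>
  <from>Faculty</from>
  <heading>Reminder</heading>
  <body>Learn XML</body>
</note>"""

root = ET.fromstring(NOTE_XML)

# Map each child element's tag name to its text content.
fields = {child.tag: child.text for child in root}

print(root.tag)          # note
print(fields["to"])      # Students
print(fields["body"])    # Learn XML
```

Because the tags describe the data, the program can look values up by name instead of by position.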
XML Syntax
<?xml version="1.0" encoding="ISO-8859-1"?>
• The first line defines the XML version and the character encoding used in the document.
• In this case the document conforms to the 1.0 specification of XML and uses the ISO-8859-1 (Latin-1/West European) character set.
<note>
• The next line describes the root element of the document.
XML Syntax
The next 4 lines describe 4 child elements of the root (to, from, heading, and body):
  <to>Students</to>
  <from>Faculty</from>
  <heading>Reminder</heading>
  <body>Learn XML</body>
</note>
The last line defines the end of the root element.
XML Syntax
• All XML Elements Must Have a Closing Tag
• With XML, it is illegal to omit the closing tag.
• In HTML some elements do not have to have a closing tag.
The following code is legal in HTML:
<p>This is a paragraph
<p>This is another paragraph
In XML all elements must have a closing tag, like this:
<p>This is a paragraph</p>
<p>This is another paragraph</p>
XML Syntax
• XML Tags are Case Sensitive
<Message>This is incorrect</message>
<message>This is correct</message>
XML Syntax
• XML Elements Must be Properly Nested
• Improper nesting of tags makes no sense to XML. The following is improperly nested because </b> appears before </i>:
<b><i>This text is bold and italic</b></i>
In XML all elements must be properly nested within each other like this:
<b><i>This text is bold and italic</i></b>
XML Syntax
• XML Documents Must Have a Root Element
• All XML documents must contain a single tag pair to define a root element.
• All other elements must be within this root element.
• All elements can have sub elements (child elements). Sub elements must be correctly nested within their parent element:
<root>
  <child>
    <subchild>.....</subchild>
  </child>
</root>
XML Syntax
• XML Attribute Values Must be Quoted
• With XML, it is illegal to omit quotation marks around attribute values.
• XML elements can have attributes in name/value pairs just like in HTML.
• In XML the attribute value must always be quoted.
Incorrect (the value of the date attribute is not quoted):
<?xml version="1.0" encoding="ISO-8859-1"?>
<note date=12/11/2002>
  <to>Students</to>
</note>
Correct:
<?xml version="1.0" encoding="ISO-8859-1"?>
<note date="12/11/2002">
  <to>Students</to>
</note>
XML Syntax
• With XML, White Space is Preserved
• With XML, the white space in your document is not truncated.
Comments in XML
• The syntax for writing comments in XML is similar to that of HTML.
<!-- This is a comment -->
XML Sibling Elements
<book>
  <title>My First XML</title>
  <prod id="33-657" media="paper"></prod>
  <chapter>Introduction to XML
    <para>What is HTML</para>
    <para>What is XML</para>
  </chapter>
  <chapter>XML Syntax
    <para>Elements must have a closing tag</para>
    <para>Elements must be properly nested</para>
  </chapter>
</book>
Sibling Elements: title, prod, and chapter are siblings (or sister elements) because they have the same parent.
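The sibling relationship can be checked in code. The sketch below, using Python's standard xml.etree.ElementTree module, lists the direct children of book. It assumes the second attribute of prod is meant to be named media, and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# The book document from the slide ("media" is an assumed attribute name).
BOOK_XML = """<book>
  <title>My First XML</title>
  <prod id="33-657" media="paper"></prod>
  <chapter>Introduction to XML
    <para>What is HTML</para>
    <para>What is XML</para>
  </chapter>
  <chapter>XML Syntax
    <para>Elements must have a closing tag</para>
    <para>Elements must be properly nested</para>
  </chapter>
</book>"""

book = ET.fromstring(BOOK_XML)

# Direct children of <book> share the same parent, so they are siblings.
sibling_tags = [child.tag for child in book]

# The attributes of an element are exposed as a dictionary.
prod_attributes = book.find("prod").attrib

print(sibling_tags)      # ['title', 'prod', 'chapter', 'chapter']
print(prod_attributes)   # {'id': '33-657', 'media': 'paper'}
```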
Use of Elements vs. Attributes
• Data can be stored in child elements or in attributes.
<person gender="female">
<firstname>ABC</firstname>
<lastname>XYZ</lastname>
</person>
<person>
<gender>female</gender>
<firstname>ABC</firstname>
<lastname>XYZ</lastname>
</person>
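The two ways of storing the same value are also read differently by a program. A minimal sketch, assuming Python's standard xml.etree.ElementTree module and illustrative variable names:

```python
import xml.etree.ElementTree as ET

# Variant 1: gender stored as an attribute of <person>.
as_attribute = ET.fromstring(
    '<person gender="female"><firstname>ABC</firstname>'
    '<lastname>XYZ</lastname></person>'
)

# Variant 2: gender stored as a child element of <person>.
as_element = ET.fromstring(
    '<person><gender>female</gender><firstname>ABC</firstname>'
    '<lastname>XYZ</lastname></person>'
)

# An attribute is read from the element itself...
gender_from_attribute = as_attribute.get("gender")

# ...while a child element is looked up by its tag name.
gender_from_element = as_element.findtext("gender")

print(gender_from_attribute, gender_from_element)  # female female
```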
Use of Elements vs. Attributes
Some of the problems with using attributes are:
• attributes cannot contain multiple values (child elements can)
• attributes are not easily expandable (for future changes)
• attributes cannot describe structures (child elements can)
• attributes are more difficult to manipulate by program code
• attribute values are not easy to test against a Document Type Definition (DTD), which is used to define the legal elements of an XML document
Well Formed XML Document
• A "Well Formed" XML document has correct XML syntax.
• A "Well Formed" XML document is a document that conforms to the following:
– XML documents must have a root element
– XML elements must have a closing tag
– XML tags are case sensitive
– XML elements must be properly nested
– XML attribute values must always be quoted
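One way to see these rules in action is to hand a parser one small document per rule. The sketch below uses Python's standard xml.etree.ElementTree module, which raises ParseError on input that is not well formed. The helper is_well_formed and the sample strings are illustrative, not part of the slides.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True

# One sample per rule listed above, plus one document that obeys them all.
SAMPLES = {
    "two root elements": "<a></a><b></b>",
    "missing closing tag": "<note><to>Students</note>",
    "mismatched case": "<Message>text</message>",
    "improper nesting": "<p><b><i>text</b></i></p>",
    "unquoted attribute": "<note date=12/11/2002></note>",
    "well formed": '<note date="12/11/2002"><to>Students</to></note>',
}

results = {name: is_well_formed(text) for name, text in SAMPLES.items()}

for name, ok in results.items():
    print(f"{name}: {ok}")
```

Only the last sample is accepted; each of the others breaks exactly one of the rules.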
Why use a DTD?
• With a DTD, each of your XML files can carry a description of its own format.
• With a DTD, independent groups of people can agree to use a standard DTD for interchanging data.
• Your application can use a standard DTD to verify that the data you receive from the outside world is valid.
• You can also use a DTD to verify your own data.
Valid XML
• Valid XML Documents
– A "Valid" XML document also conforms to a DTD.
– A "Valid" XML document is a "Well Formed" XML document, which also conforms to the rules of a Document Type Definition (DTD).
Valid XML
• XML DTD
– This is a statement of rules for an XML document.
– The purpose of a DTD is to define the legal building blocks of an XML document.
– It helps ensure the accuracy of the information you collect.
– It helps ensure that the information gathered is in the most usable format for your business needs.
DTD
• A DTD can be declared inline inside an XML document, or as an external reference.
• Internal DTD Declaration
• If the DTD is declared inside the XML file, it should be wrapped in a DOCTYPE definition with the following syntax:
<!DOCTYPE root-element [element-declarations]>
DTD Terms
Inline DTD
<?xml version="1.0"?>
<!DOCTYPE note [
  <!ELEMENT note (to,from,heading,body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT heading (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<note>
  <to>Students</to>
  <from>Faculty</from>
  <heading>Reminder</heading>
  <body>Don't forget to learn XML !!!</body>
</note>
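Python's standard xml.etree.ElementTree module reads a document with an inline DTD but does not validate against it. To make the meaning of the content model (to,from,heading,body) concrete, the sketch below checks the same rule by hand. The function matches_note_model is a hypothetical stand-in for what a validating parser derives from the DTD, not a real DTD validator.

```python
import xml.etree.ElementTree as ET

NOTE_WITH_DTD = """<?xml version="1.0"?>
<!DOCTYPE note [
  <!ELEMENT note (to,from,heading,body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT heading (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<note>
  <to>Students</to>
  <from>Faculty</from>
  <heading>Reminder</heading>
  <body>Don't forget to learn XML !!!</body>
</note>"""

# Hand-written equivalent of the content model (to,from,heading,body).
EXPECTED_CHILDREN = ["to", "from", "heading", "body"]

def matches_note_model(root):
    """Check the root name, the order of children, and text-only children."""
    if root.tag != "note":
        return False
    if [child.tag for child in root] != EXPECTED_CHILDREN:
        return False
    # (#PCDATA) children may not contain elements of their own.
    return all(len(child) == 0 for child in root)

note = ET.fromstring(NOTE_WITH_DTD)
wrong_order = ET.fromstring("<note><from>Faculty</from><to>Students</to></note>")

print(matches_note_model(note))         # True
print(matches_note_model(wrong_order))  # False
```

The second document is well formed but would not be valid, because its children do not follow the declared sequence.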
External DTD Declaration
• If the DTD is declared in an external file, it should be wrapped in a DOCTYPE definition with the following syntax:
– <!DOCTYPE root-element SYSTEM "filename">
<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "note.dtd">
<note>
  <to>Students</to>
  <from>Faculty</from>
  <heading>Reminder</heading>
  <body>Don't forget to learn XML !!!</body>
</note>
File "note.dtd", which contains the DTD:
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
DTD Explanation
• !DOCTYPE note defines that the root element of this document is note
• !ELEMENT note defines that the note element contains four elements: "to,from,heading,body"
• !ELEMENT to defines the to element to be of type "#PCDATA"
• !ELEMENT from defines the from element to be of type "#PCDATA"
• !ELEMENT heading defines the heading element to be of type "#PCDATA"
• !ELEMENT body defines the body element to be of type "#PCDATA"
DTD Elements
• DTD elements can be defined as follows:
• <!ELEMENT Name EMPTY>
– Sometimes, you may want an element type to remain empty, with no content of its own.
• <!ELEMENT Name ANY>
– If you want your element to serve as a catch-all box that you can put anything in, you may want to use another type of content specification: ANY.
– If you declare an element to contain ANY content, you allow that element type to hold any element or character data.
Element Attribute
• <!ATTLIST element-name attribute-name datatype defaultvalue>
– The attribute-list declaration begins with the !ATTLIST string, followed by white space.
– Next is the element name, the associated attribute’s name, its type, and a default value.
– For example:
<!ATTLIST customer custType CDATA #REQUIRED>
<!ATTLIST price priceType (Retail | Wholesale) #REQUIRED>
Element Attribute
• #REQUIRED means you must always include the attribute when the element is used.
• No specific default value for the attribute can be included in this case, so you must include a value for it in your document.
– <!ATTLIST element-name attribute-name CDATA #REQUIRED>
• #IMPLIED means the attribute is optional. The attribute may be used in an element, but no default value is provided if the attribute isn’t used.
– <!ATTLIST element-name attribute-name CDATA #IMPLIED>
Element Attribute
• #FIXED means the attribute is optional, but if used, the attribute must always take on the default value assigned in the DTD.
– <!ATTLIST element-name attribute-name CDATA #FIXED "value">
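The three keywords #REQUIRED, #IMPLIED and #FIXED can be read as three small rules about one attribute. The sketch below expresses them in plain Python on top of the standard xml.etree.ElementTree module. The function attribute_ok, the sample elements and the value "retail" are hypothetical; a validating parser would apply these rules from the DTD itself.

```python
import xml.etree.ElementTree as ET

def attribute_ok(element, name, mode, fixed_value=None):
    """Apply one DTD-style default rule to one attribute of an element.

    mode is "#REQUIRED", "#IMPLIED" or "#FIXED".
    """
    value = element.get(name)  # None when the attribute is absent
    if mode == "#REQUIRED":
        # The attribute must always be present.
        return value is not None
    if mode == "#IMPLIED":
        # The attribute is optional and has no default.
        return True
    if mode == "#FIXED":
        # The attribute is optional, but if present it must equal the fixed value.
        return value is None or value == fixed_value
    raise ValueError(f"unknown mode: {mode}")

customer = ET.fromstring('<customer custType="retail"/>')
anonymous = ET.fromstring("<customer/>")

print(attribute_ok(customer, "custType", "#REQUIRED"))            # True
print(attribute_ok(anonymous, "custType", "#REQUIRED"))           # False
print(attribute_ok(customer, "custType", "#FIXED", "wholesale"))  # False
```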
Element and its attribute example
DTD PCDATA
• PCDATA means parsed character data.
• Think of character data as the text found between the start tag and the end tag of an XML element.
• PCDATA is text that WILL be parsed by a parser.
DTD CDATA
• CDATA means character data.
• CDATA is text that will NOT be parsed by a parser.
• Tags inside the text will NOT be treated as markup and entities will not be expanded.
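The difference between parsed and unparsed text can be observed with a CDATA section in a document. A minimal sketch using Python's standard xml.etree.ElementTree module; the element name m is arbitrary.

```python
import xml.etree.ElementTree as ET

# Parsed character data: the entity reference &lt; is expanded to "<".
parsed = ET.fromstring("<m>1 &lt; 2</m>")

# CDATA section: the content is taken literally, so "&lt;" and "<b>"
# stay as plain text and are not treated as an entity or a tag.
literal = ET.fromstring("<m><![CDATA[1 &lt; 2 <b>]]></m>")

print(parsed.text)   # 1 < 2
print(literal.text)  # 1 &lt; 2 <b>
```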
XML Schema
• XML Schema is an XML-based alternative to DTDs.
• An XML Schema describes the structure of an XML document.
• The XML Schema language is also referred to as XML Schema Definition (XSD).
An XML Schema
The purpose of an XML Schema is to define the legal building blocks of an XML document, just like a DTD. An XML Schema:
• defines elements that can appear in a document
• defines attributes that can appear in a document
• defines which elements are child elements
• defines the order of child elements
• defines the number of child elements
• defines whether an element is empty or can include text
• defines data types for elements and attributes
• defines default and fixed values for elements and attributes
XML Schema
• XML Schemas use XML Syntax
• You don't have to learn a new language
• You can use your XML editor to edit your Schema files
• You can use your XML parser to parse your Schema files
• You can manipulate your Schema with the XML DOM
XML Schema Example
• The following example is an XML Schema file called "note.xsd" that defines the elements of the XML document ("note.xml")
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="from" type="xs:string"/>
        <xs:element name="heading" type="xs:string"/>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
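Because a schema is itself an XML document, an ordinary XML parser can read it. The sketch below loads the note schema with Python's standard xml.etree.ElementTree module and lists the child elements it declares. It assumes the element declaration sits inside an xs:schema root bound to the standard XML Schema namespace. It only reads the schema and does not validate anything against it.

```python
import xml.etree.ElementTree as ET

# The note schema, wrapped in an xs:schema root element.
NOTE_XSD = """<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="from" type="xs:string"/>
        <xs:element name="heading" type="xs:string"/>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

# Prefix-to-namespace mapping used in the search paths below.
NS = {"xs": "http://www.w3.org/2001/XMLSchema"}

schema = ET.fromstring(NOTE_XSD)
note_declaration = schema.find("xs:element", NS)
children = note_declaration.findall("xs:complexType/xs:sequence/xs:element", NS)

# (name, type) pairs in the order the schema declares them.
declared = [(el.get("name"), el.get("type")) for el in children]

for name, xsd_type in declared:
    print(name, xsd_type)
```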
XML Schema
• The note element is a complex type because it contains other elements.
• The other elements (to, from, heading, body) are simple types because they do not contain other elements.
XML Schema
• A simple element is an XML element that contains only text. It cannot contain any other elements or attributes.
• The text can be of many different types. It can be one of the types included in the XML Schema definition (boolean, string, date, etc.), or it can be a custom type that you can define yourself.
• You can also add restrictions (facets) to a data type in order to limit its content, or you can require the data to match a specific pattern.
XML Schema
• The syntax for defining a simple element is:
<xs:element name="abc" type="xyz"/>
• Where abc is the name of the element and xyz is the data type of the element.
• XML Schema has a lot of built-in data types. The most common types are:
– xs:string
– xs:decimal
– xs:integer
– xs:boolean
– xs:date
– xs:time
XML Schema
• Here are some simple XML elements:
<lastname>Raj</lastname>
<age>36</age>
<dateborn>1987-03-27</dateborn>
• Here are the corresponding simple element definitions:
<xs:element name="lastname" type="xs:string"/>
<xs:element name="age" type="xs:integer"/>
<xs:element name="dateborn" type="xs:date"/>
XML Schema
• Simple elements may have a default value OR a fixed value specified.
• A default value is automatically assigned to the element when no other value is specified. In the following example the default value is "red":
<xs:element name="color" type="xs:string" default="red"/>
• A fixed value is also automatically assigned to the element, and you cannot specify another value. In the following example the fixed value is "red":
<xs:element name="color" type="xs:string" fixed="red"/>
XML Schema
• The syntax for defining an attribute is:
<xs:attribute name="abc" type="xyz"/>
– Where abc is the name of the attribute and xyz specifies the data type of the attribute.
– Simple elements can’t have attributes!
XML Schema
• When an XML element or attribute has a data type defined, it puts restrictions on the element's or attribute's content.
• If an XML element is of type "xs:date" and contains a string like "Hello World", the element will not validate.
• With XML Schemas, you can also add your own restrictions to your XML elements and attributes.
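To give a feel for what a data type check does, the sketch below approximates three built-in types with plain Python conversions. This is only an approximation: Python's int and date.fromisoformat accept some spellings that the XML Schema types do not. The names CONVERTERS and conforms are illustrative.

```python
from datetime import date

# Rough Python stand-ins for three built-in XML Schema types.
CONVERTERS = {
    "xs:string": str,
    "xs:integer": int,
    "xs:date": date.fromisoformat,
}

def conforms(text, xsd_type):
    """Return True if the text can be converted to the given type."""
    try:
        CONVERTERS[xsd_type](text)
    except ValueError:
        return False
    return True

print(conforms("Raj", "xs:string"))         # True
print(conforms("36", "xs:integer"))         # True
print(conforms("1987-03-27", "xs:date"))    # True
print(conforms("Hello World", "xs:date"))   # False
```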
XML Schema
• The following example defines an element called "age" with a restriction. The value of age cannot be lower than 0 or greater than 120:
<xs:element name="age">
  <xs:simpleType>
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="0"/>
      <xs:maxInclusive value="120"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
XML Schema
• The example below defines an element called "car" with a restriction. The only acceptable values are: Audi, Ford, BMW:
<xs:element name="car" type="carType"/>
<xs:simpleType name="carType">
  <xs:restriction base="xs:string">
    <xs:enumeration value="Audi"/>
    <xs:enumeration value="Ford"/>
    <xs:enumeration value="BMW"/>
  </xs:restriction>
</xs:simpleType>
• Note: In this case the type "carType" can be used by other elements because it is not a part of the "car" element.
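The age and car restrictions can be sketched as two small checks in plain Python. This mirrors what a schema validator would enforce for these facets but is not a validator. The names valid_age, valid_car and CAR_TYPE are illustrative.

```python
def valid_age(text):
    """xs:integer restricted by minInclusive=0 and maxInclusive=120."""
    try:
        value = int(text)
    except ValueError:
        return False
    return 0 <= value <= 120

# The enumeration facet lists every acceptable value of carType.
CAR_TYPE = {"Audi", "Ford", "BMW"}

def valid_car(text):
    """xs:string restricted to the enumerated values (case sensitive)."""
    return text in CAR_TYPE

print(valid_age("36"), valid_age("121"))      # True False
print(valid_car("Ford"), valid_car("audi"))   # True False
```

Keeping CAR_TYPE separate from valid_car mirrors the named type in the schema: other checks can reuse the same set of values.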
An XML Document