XML

What is XML? XML stands for EXtensible Markup Language. XML is a markup language much like HTML. XML was designed to carry data, not to display data.

What is XML?

• XML stands for EXtensible Markup Language.
• XML is a markup language much like HTML.
• XML was designed to carry data, not to display data.
• XML is a bridge for data exchange on the Web.
• XML tags are not predefined. You must define your own tags.
• XML is designed to be self-descriptive.
• XML is a W3C Recommendation.

What is XML?

• It is a protocol for containing and managing data.
• It is a family of technologies, covering everything from formatting documents to filtering data.
• It is a philosophy for handling information.
• XML derives from SGML (Standard Generalized Markup Language), as defined by ISO 8879.
• SGML itself is not well suited for serving documents over the Web.

What is XML?

• XML is not a language itself. Rather, it's a metalanguage used for writing other languages, called XML vocabularies.
• XHTML is one of those vocabularies.
• XML is one of the most common tools for data transmission between all sorts of applications.

Understanding XML Syntax

• XML languages use tags to mark up text:

    <p>Here is an introduction to XML.</p>

• The above line is XHTML, but it's also XML.
• XML allows you to construct your own tags, so you could rewrite the previous markup as:

    <intro>Here is an introduction to XML.</intro>

• In this example, the <intro> tag tells you the purpose of the text that it marks up.
• One big advantage of XML is that tags can describe their content, which is why XML languages are often called self-describing.
• XML is flexible enough to allow for the creation of many different types of languages to describe data.
• The only constraint on XML vocabularies is that they be well-formed.
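A minimal sketch of the point above, assuming Python's standard xml.etree.ElementTree module as the parser (the slides do not name a tool). It parses the same sentence marked up with the XHTML tag and with the custom tag, and shows that only the tag name changes:

```python
import xml.etree.ElementTree as ET

# The same sentence marked up with an XHTML tag and with a custom tag.
xhtml_markup = "<p>Here is an introduction to XML.</p>"
custom_markup = "<intro>Here is an introduction to XML.</intro>"

p = ET.fromstring(xhtml_markup)
intro = ET.fromstring(custom_markup)

# The parser treats both the same way; only the tag name differs.
print(p.tag, "->", p.text)
print(intro.tag, "->", intro.text)
```

The parser has no built-in knowledge of either tag. The meaning of intro comes from the vocabulary you define, not from XML itself.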

Well-Formed Documents

XML documents are well-formed if they meet the following criteria:

• The document contains one or more elements.
• The document contains a single document element, which may contain other elements.
• Each element closes correctly.
• Elements are case-sensitive.
• Attribute values are enclosed in quotation marks, and every attribute must have a value.
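The criteria above can be tried out with a short sketch. It assumes Python's standard xml.etree.ElementTree parser, and is_well_formed is a hypothetical helper name introduced here for illustration. Note that an empty quoted value (id="") is accepted by the parser, while an attribute with no value at all is rejected:

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts text as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# One sample per criterion; only the first and last are well-formed.
samples = {
    "single root, all closed": "<library><DVD id='1'/></library>",
    "two document elements": "<library/><library/>",
    "element not closed": "<library><DVD></library>",
    "case mismatch in tags": "<DVD></dvd>",
    "attribute value not quoted": "<DVD id=1/>",
    "attribute without a value": "<DVD id/>",
    "empty quoted attribute value": '<DVD id=""/>',
}

for label, text in samples.items():
    print(label, "->", is_well_formed(text))
```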

XML Document:

<?xml version="1.0"?>
<!-- This XML document describes a DVD library -->
<library>
  <DVD id="1">
    <title>Breakfast at Tiffany's</title>
    <format>Movie</format>
    <genre>Classic</genre>
  </DVD>
  <DVD id="2">
    <title>Contact</title>
    <format>Movie</format>
    <genre>Science fiction</genre>
  </DVD>
  <DVD id="3">
    <title>Little Britain</title>
    <format>TV Series</format>
    <genre>Comedy</genre>
  </DVD>
</library>

XML Document:

• The document starts with an XML declaration:

    <?xml version="1.0"?>

• This declaration is optional and can contain a number of attributes, such as version, encoding, and standalone.
• This XML document also includes a comment describing its purpose:

    <!-- This XML document describes a DVD library -->

• The document or root element is called <library>.
• The document element contains a number of <DVD> elements, and each <DVD> element contains <title>, <format>, and <genre> elements. The <DVD> element also contains an id attribute.
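The structure described above can be walked programmatically. The following is a sketch, assuming Python's standard xml.etree.ElementTree module; the variable names are illustrative only. It reads the root element, then each <DVD> element with its id attribute and child elements:

```python
import xml.etree.ElementTree as ET

LIBRARY_XML = """<?xml version="1.0"?>
<!-- This XML document describes a DVD library -->
<library>
  <DVD id="1">
    <title>Breakfast at Tiffany's</title>
    <format>Movie</format>
    <genre>Classic</genre>
  </DVD>
  <DVD id="2">
    <title>Contact</title>
    <format>Movie</format>
    <genre>Science fiction</genre>
  </DVD>
  <DVD id="3">
    <title>Little Britain</title>
    <format>TV Series</format>
    <genre>Comedy</genre>
  </DVD>
</library>"""

root = ET.fromstring(LIBRARY_XML)  # root is the <library> document element

# Collect (id attribute, title, format, genre) for every <DVD> element.
dvds = [
    (dvd.get("id"), dvd.findtext("title"), dvd.findtext("format"), dvd.findtext("genre"))
    for dvd in root.iter("DVD")
]

for dvd in dvds:
    print(dvd)
```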

Encoding

• Encoding is the process of converting Unicode characters into their equivalent binary representation.
• When the XML processor reads an XML document, it decodes the document according to its encoding. Hence, we need to specify the encoding in the XML declaration.

Encoding Types

• There are mainly two types of encoding: UTF-8 and UTF-16.
• UTF stands for UCS Transformation Format, and UCS itself means Universal Character Set.
• The number 8 or 16 refers to the size in bits of one code unit: 8 (one byte) or 16 (two bytes). A single character may need more than one code unit: one to four bytes in UTF-8, and two or four bytes in UTF-16.
• For documents without encoding information, UTF-8 is assumed by default.
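To make the byte counts concrete, here is a small sketch in Python (an assumption; the slides do not name a language). It measures how many bytes a few sample characters occupy in each encoding, then parses one document that declares ISO-8859-1 and one that declares nothing and is therefore read as UTF-8:

```python
import xml.etree.ElementTree as ET

# Bytes needed per character in each encoding (big-endian UTF-16, no byte order mark).
sizes = {
    ch: (len(ch.encode("utf-8")), len(ch.encode("utf-16-be")))
    for ch in ["A", "é", "€", "😀"]
}
for ch, (utf8_len, utf16_len) in sizes.items():
    print(ch, "UTF-8 bytes:", utf8_len, "UTF-16 bytes:", utf16_len)

# A document whose declaration names its encoding is decoded accordingly.
latin1_doc = '<?xml version="1.0" encoding="ISO-8859-1"?><title>Café</title>'.encode("iso-8859-1")
latin1_title = ET.fromstring(latin1_doc).text

# Without any encoding information, the bytes are interpreted as UTF-8.
utf8_doc = "<title>Café</title>".encode("utf-8")
utf8_title = ET.fromstring(utf8_doc).text

print(latin1_title, utf8_title)
```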

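Naming Rules in XML:

• XML names cannot start with a number or punctuation (the underscore is the exception and is allowed as a first character).
• XML names cannot include spaces.
• Don't include a colon in a name; the colon is reserved for namespace prefixes.
• XML names are case-sensitive.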
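The naming rules above can be approximated with a regular expression. This is a deliberately simplified, ASCII-only sketch: the real XML name rules also admit many non-ASCII letters, and is_safe_xml_name is a hypothetical helper introduced here, not part of any library. The colon is rejected to follow the advice above.

```python
import re

# ASCII-only simplification: a letter or underscore first, then letters,
# digits, underscores, periods, or hyphens. Colons and spaces are excluded.
NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")

def is_safe_xml_name(name):
    """Return True if name follows the simplified naming rules."""
    return NAME_RE.fullmatch(name) is not None

for candidate in ["title", "_title", "2title", "-title", "DVD title", "h:table"]:
    print(candidate, "->", is_safe_xml_name(candidate))

# Case sensitivity: these are two different names as far as XML is concerned.
print("DVD" == "dvd")
```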

Structure of an XML Document:

Each XML document is divided into two parts:

• Prolog: processing instructions, comments, DTD/Schema
• Document: elements, attributes, text, CDATA, entities, comments

Valid XML Documents:

A "Valid" XML document is a "Well Formed" XML document that also conforms to the rules of a Document Type Definition (DTD).

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE note SYSTEM "Note.dtd">
<note>
  <to>Students</to>
  <from>faculty</from>
  <heading>Good Morning</heading>
  <body>Hello Dear how are you</body>
</note>

Note: The DOCTYPE declaration in the example above is a reference to an external DTD file.

Document Type Definition (DTD)

• The Document Type Definition (DTD) describes a model of the structure of the content of an XML document. This model says what elements must be present, which ones are optional, what their attributes are, and how they can be structured in relation to each other.
• The DOCTYPE statement is the document type declaration. The DOCTYPE statement, together with the statements between its opening bracket and closing bracket, comprises the document type definition and defines the rules of the document.

Internal DTD

<?xml version="1.0" standalone="yes"?>
<!DOCTYPE CATALOGS [
  <!ELEMENT CATALOGS (BOOK)*>
  <!ELEMENT BOOK (BOOKNAME,AUTHORNAME,PUBLISHER,PAGES?,PRICE)>
  <!ELEMENT BOOKNAME (#PCDATA)>
  <!ELEMENT AUTHORNAME (#PCDATA)>
  <!ELEMENT PUBLISHER (#PCDATA)>
  <!ELEMENT PAGES (#PCDATA)>
  <!ELEMENT PRICE (#PCDATA)>
  <!ATTLIST PRICE CURRENCY CDATA #REQUIRED>
]>
<CATALOGS>
  <BOOK>
    <BOOKNAME>WEB TECHNOLOGY</BOOKNAME>
    ………………….
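For a complete, parseable picture of an internal DTD, here is a hedged sketch using Python's standard xml.etree.ElementTree module. The author, publisher, and price values are invented purely for illustration; they are not from the slides. ElementTree reads the internal DTD but does not validate the document against it, so this only shows that the document is well-formed and how its data is accessed.

```python
import xml.etree.ElementTree as ET

# Hypothetical complete catalog; all values except BOOKNAME are made up.
CATALOG_XML = """<?xml version="1.0" standalone="yes"?>
<!DOCTYPE CATALOGS [
  <!ELEMENT CATALOGS (BOOK)*>
  <!ELEMENT BOOK (BOOKNAME,AUTHORNAME,PUBLISHER,PAGES?,PRICE)>
  <!ELEMENT BOOKNAME (#PCDATA)>
  <!ELEMENT AUTHORNAME (#PCDATA)>
  <!ELEMENT PUBLISHER (#PCDATA)>
  <!ELEMENT PAGES (#PCDATA)>
  <!ELEMENT PRICE (#PCDATA)>
  <!ATTLIST PRICE CURRENCY CDATA #REQUIRED>
]>
<CATALOGS>
  <BOOK>
    <BOOKNAME>WEB TECHNOLOGY</BOOKNAME>
    <AUTHORNAME>A. Author</AUTHORNAME>
    <PUBLISHER>Example Press</PUBLISHER>
    <PRICE CURRENCY="USD">25</PRICE>
  </BOOK>
</CATALOGS>"""

root = ET.fromstring(CATALOG_XML)
book = root.find("BOOK")
price = book.find("PRICE")

print(book.findtext("BOOKNAME"))
print(price.text, price.get("CURRENCY"))  # CURRENCY is the required attribute
print(book.find("PAGES"))                 # PAGES? is optional and absent here
```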

External DTD

<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT CATALOGS (BOOK)*>
<!ELEMENT BOOK (BOOKNAME,AUTHORNAME,PUBLISHER,PAGES?,PRICE)>
<!ELEMENT BOOKNAME (#PCDATA)>
<!ELEMENT AUTHORNAME (#PCDATA)>
<!ELEMENT PUBLISHER (#PCDATA)>
<!ELEMENT PAGES (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
<!ATTLIST PRICE CURRENCY CDATA #REQUIRED>

Save the above code as catdtd.dtd.

The XML document then refers to the external DTD file:

<?xml version="1.0"?>
<!DOCTYPE CATALOGS SYSTEM "catdtd.dtd">
<CATALOGS>
  <BOOK>
    <BOOKNAME>…………. ….
  <BOOK> ………

XML DTD Example:

<!DOCTYPE note [
  <!ELEMENT note (to,from,heading,body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT heading (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>

The DTD above is interpreted like this:

• !DOCTYPE note defines that the root element of this document is note.
• !ELEMENT note defines that the note element contains four elements: "to,from,heading,body".
• !ELEMENT to defines the to element to be of type "#PCDATA".
• !ELEMENT from defines the from element to be of type "#PCDATA".
• !ELEMENT heading defines the heading element to be of type "#PCDATA".
• !ELEMENT body defines the body element to be of type "#PCDATA".

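The interpretation of the note DTD can be reproduced in code. The sketch below assumes Python's standard xml.parsers.expat module, whose ElementDeclHandler callback reports each <!ELEMENT ...> declaration as a nested tuple (type, quantifier, name, children). The helper names declared_children and matches_sequence are hypothetical, and the order check handles only a plain sequence on the root element, so it is far from a real DTD validator.

```python
import xml.etree.ElementTree as ET
import xml.parsers.expat as expat

DOC = """<!DOCTYPE note [
  <!ELEMENT note (to,from,heading,body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT heading (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<note><to>Students</to><from>faculty</from><heading>Good Morning</heading><body>Hello Dear how are you</body></note>"""

def declared_children(doc):
    """Map each declared element name to the child names in its content model."""
    declared = {}

    def on_element_decl(name, model):
        # model is (type, quantifier, name, children); children are models too.
        declared[name] = [child[2] for child in model[3]]

    parser = expat.ParserCreate()
    parser.ElementDeclHandler = on_element_decl
    parser.Parse(doc, True)
    return declared

def matches_sequence(doc):
    """Check only that the root's children appear in the declared order."""
    declared = declared_children(doc)
    root = ET.fromstring(doc)
    return [child.tag for child in root] == declared[root.tag]

declared = declared_children(DOC)
print(declared["note"])       # child names declared for the note element
print(matches_sequence(DOC))
```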
XML Parser

• An XML parser (also called an XML processor) determines the content and structure of an XML document by combining the XML document and its DTD (if one is present).

    XML Document + XML DTD -> XML Parser -> XML application

• An XML parser is a software library or package that provides an interface for client applications to work with XML documents. It checks that the XML document is well-formed and may also validate it.

XML Parser:

• All modern browsers have a built-in XML parser.
• An XML parser converts an XML document into an XML DOM object.
• A DOM (Document Object Model) defines a standard way for accessing and manipulating documents.

The XML DOM

• The XML DOM defines a standard way for accessing and manipulating XML documents.
• The XML DOM views an XML document as a tree structure.
• All elements can be accessed through the DOM tree.
• Their content (text and attributes) can be modified or deleted, and new elements can be created.
• The elements, their text, and their attributes are all known as nodes.
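The DOM operations listed above (access, modify, create, delete) can be sketched outside the browser as well. This example assumes Python's standard xml.dom.minidom module, a small DOM implementation; the document content is adapted from the DVD library example.

```python
from xml.dom import minidom

doc = minidom.parseString(
    '<library><DVD id="1"><title>Contact</title><genre>Science fiction</genre></DVD></library>'
)
library = doc.documentElement

# Access: elements, their text, and their attributes are all nodes.
genre = doc.getElementsByTagName("genre")[0]
first_dvd = doc.getElementsByTagName("DVD")[0]
node_types = (
    genre.nodeType == genre.ELEMENT_NODE,
    genre.firstChild.nodeType == genre.TEXT_NODE,
    first_dvd.getAttributeNode("id").nodeType == genre.ATTRIBUTE_NODE,
)

# Modify: change the text content of an existing element.
genre.firstChild.data = "Sci-fi"
modified_genre = genre.firstChild.data

# Create: build a new <DVD> element and attach it to the tree.
new_dvd = doc.createElement("DVD")
new_dvd.setAttribute("id", "2")
new_title = doc.createElement("title")
new_title.appendChild(doc.createTextNode("Little Britain"))
new_dvd.appendChild(new_title)
library.appendChild(new_dvd)

# Delete: remove the first <DVD> element from the tree.
library.removeChild(first_dvd)

result = library.toxml()
print(result)
```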

XML Namespaces

• XML Namespaces provide a method to avoid element name conflicts.
• When using prefixes in XML, a so-called namespace for the prefix must be defined.
• The namespace is defined by the xmlns attribute in the start tag of an element.
• The namespace declaration has the following syntax: xmlns:prefix="URL".
• The namespace declaration starts with the keyword xmlns.
• The word prefix is the namespace prefix.
• The URL (more generally, a URI) is the namespace identifier.

Namespaces can be declared in the elements where they are used or in the XML root element:

<root xmlns:h="http://www......">
  <h:table>
    <h:tr>
      <h:td>Apples</h:td>
      <h:td>Bananas</h:td>
    </h:tr>
  </h:table>
  ……..
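A sketch of how a parser sees prefixed elements, assuming Python's standard xml.etree.ElementTree module. The namespace URI http://example.com/html-table is a placeholder chosen for this example, since the URL in the slide is elided. The parser replaces each prefix with the namespace identifier, so the prefix itself is only a local shorthand.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace identifier; the slide's own URL is elided.
TABLE_NS = "http://example.com/html-table"

XML_TEXT = """<root xmlns:h="http://example.com/html-table">
  <h:table>
    <h:tr>
      <h:td>Apples</h:td>
      <h:td>Bananas</h:td>
    </h:tr>
  </h:table>
</root>"""

root = ET.fromstring(XML_TEXT)

# Map a prefix to the namespace for lookups; the prefix used here need not
# match the one in the document, only the URI matters.
ns = {"h": TABLE_NS}
table = root.find("h:table", ns)
cells = [td.text for td in root.findall("h:table/h:tr/h:td", ns)]

print(table.tag)  # the tag is stored as {namespace}localname
print(cells)
```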

XML Schema

• XML Schema is commonly known as XML Schema Definition (XSD). It is used to describe and validate the structure and the content of XML data.
• An XML schema defines the elements, attributes, and data types. The schema element supports namespaces. It is similar to a database schema that describes the data in a database.
• You need to declare a schema in your schema document as follows:

    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

• Attributes in XSD provide extra information within an element. Attributes have name and type properties, as shown below:

    <xs:attribute name="x" type="y"/>

You can define XML schema elements in the following ways:

• Simple Type - A simple type element contains only text; it cannot contain child elements or attributes. Some of the predefined simple types are xs:integer, xs:boolean, xs:string, and xs:date.

    <xs:element name="phone_number" type="xs:int" />

• Complex Type - A complex type is a container for other element definitions. This allows you to specify which child elements an element can contain and to provide some structure within your XML documents.

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="contact">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string" />
        <xs:element name="company" type="xs:string" />
        <xs:element name="phone" type="xs:int" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
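Because a schema is itself an XML document, it can be read with an ordinary parser. The sketch below assumes Python's standard xml.etree.ElementTree module and the standard W3C schema namespace http://www.w3.org/2001/XMLSchema. It lists the child elements declared for contact and then checks the order of children in an instance document. The helper follows_sequence is hypothetical and checks names and order only; real XSD validation (data types, occurrence counts) needs a dedicated schema validator, which the standard library does not provide.

```python
import xml.etree.ElementTree as ET

XS = {"xs": "http://www.w3.org/2001/XMLSchema"}

SCHEMA = """<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="contact">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string" />
        <xs:element name="company" type="xs:string" />
        <xs:element name="phone" type="xs:int" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

schema = ET.fromstring(SCHEMA)
contact_decl = schema.find("xs:element[@name='contact']", XS)

# (name, type) pairs declared inside the complex type's sequence.
declared = [
    (el.get("name"), el.get("type"))
    for el in contact_decl.findall("xs:complexType/xs:sequence/xs:element", XS)
]

def follows_sequence(instance_xml):
    """Check only that the instance's children match the declared names in order."""
    root = ET.fromstring(instance_xml)
    expected = [name for name, _ in declared]
    return root.tag == "contact" and [child.tag for child in root] == expected

print(declared)
```

A usage example with made-up instance data: follows_sequence("<contact><name>Ann</name><company>Acme</company><phone>5551234</phone></contact>") accepts the document, while the same children in a different order are rejected.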