1 Dr Alexiei Dingli XML Technologies XML Advanced

1

Dr Alexiei Dingli

XML Technologies

XML Advanced

2

• A Uniform Resource Identifier (URI) is a string of characters which identifies an Internet Resource

• The most common URI is the Uniform Resource Locator (URL) which identifies an Internet domain address

• Another, not so common type of URI is the Universal Resource Name (URN)– urn:isbn:0451450523

URIs

3

• Since XML element names can be defined by anyone, this often results in a conflict

• Such as …

– How would you add XML related to furniture (tables and chairs) to an HTML document?

Conflicts !!!

4

More conflicts !!!

HTML

<table>

<tr>

<td>Apples</td> <td>Bananas</td>

</tr>

</table>

XML

<table>

<name>Coffee Table</name> <width>8</width> <length>10</length>

</table>

5

• If these XML fragments were added together, there would be a name conflict

• Both contain a <table> element, but the elements have different content and meaning

• An XML parser will not know how to handle these differences

Even more conflicts !!!

6

• Name conflicts in XML can easily be avoided using a name prefix

• This XML carries information about an HTML table, and a piece of furniture:

<h:table>

<h:tr>

<h:td>Apples</h:td>

<h:td>Bananas</h:td>

</h:tr>

</h:table>

<f:table>

<f:name>Coffee Table</f:name>

<f:width>8</f:width>

<f:length>10</f:length>

</f:table>

• In the example above, there will be no conflict because the two <table> elements have different names.

Solving conflicts ...

7

• The namespace is defined by the xmlns attribute in the start tag of an element

• The namespace declaration has the following syntax

xmlns:prefix="URI"

XML Namespaces

8

<root>

<h:table xmlns:h="http://www.w3.org/TR/html4/">

<h:tr>

<h:td>Apples</h:td>


</h:tr>

</h:table>

<f:table xmlns:f="http://www.w3schools.com/furniture"> <f:name>African Coffee Table</f:name>



</f:table>

</root>

Namespaces Ex 1

9

<root xmlns:h="http://www.w3.org/TR/html4/"

xmlns:f="http://www.w3schools.com/furniture">

<h:table>

<h:tr>

<h:td>Apples</h:td>


</h:tr>

</h:table>

<f:table>

<f:name>African Coffee Table</f:name>



</f:table>

</root>

Namespaces Ex 2

10

• The xmlns attribute in the <table> tag gave the h: and f: prefixes a qualified namespace

• When a namespace is defined for an element, all child elements with the same prefix are associated with the same namespace

• Namespaces can be declared in the elements where they are used or in the XML root element

• The namespace URI is not used by the parser to look up information

• The purpose is to give the namespace a unique name, however, often companies use the namespace as a pointer to a web page containing namespace information

More on Namespaces

11

• Defining a default namespace for an element saves us from using prefixes in all the child elements

• <table xmlns="http://www.w3schools.com/furniture"> – <name>African Coffee Table</name> – <width>80</width> – <length>120</length>

• </table>

Default Namespaces

12

Namespaces: XSLT Case Study

13

• XML parsers parse all text in a document• When an XML element is parsed, the text between the XML tags is

also parsed:– <message>This text is also parsed</message>

• The parser does this because XML elements can contain other elements:

– <name><first>Bill</first><last>Gates</last></name>

• The parser will break it up into sub-elements like this:– <name>

• <first>Bill</first> • <last>Gates</last>

– </name>

• Parsed Character Data (PCDATA) is a term used about text data that will be parsed by the XML parser.

Parsed Character Data

14

• The term CDATA is used about text data that should not be parsed by the XML parser

• Characters like "<" and "&" are illegal in XML elements

• "<" will generate an error because the parser interprets it as the start of a new element

• "&" will generate an error because the parser interprets it as the start of an character entity

• Some text, like JavaScript code, contains a lot of "<" or "&" characters. To avoid errors script code can be defined as CDATA

• Everything inside a CDATA section is ignored by the parser

• A CDATA section starts with "<![CDATA[" and ends with "]]>"

Un-Parsed Character Data (1)

15


16

• A CDATA section cannot contain the string "]]>"

• Nested CDATA sections are not allowed

• The "]]>" that marks the end of the CDATA section cannot contain spaces or line breaks


17

• XML documents can contain non ASCII characters, like French ê è é.

• To avoid errors, specify the XML encoding

• Two encoding errors:– An invalid character was found in text content.– Switch from current encoding to specified encoding not

supported.

XML Encoding

18

<?xml version="1.0" encoding="windows-1252"?>

<?xml version="1.0" encoding="ISO-8859-1"?>

<?xml version="1.0" encoding="UTF-8"?>

<?xml version="1.0" encoding="UTF-16"?>

Different encodings ...

19

• DOM (Document Object Model) defines a standard way for accessing and manipulating documents

• Consider XML documents as a tree-structure

• All elements can be accessed through the DOM tree

• Their content (text and attributes) can be modified or deleted, and new elements can be created

• The elements, their text, and their attributes are all known as nodes

DOM

20

document.getElementById("to").innerHTML=""

• document - the HTML document• getElementById("to") - the HTML element

where id="to"• innerHTML - the inner text of the HTML element

HTML DOM example

21

xmlDoc.getElementsByTagName("to")[0].childNodes[0].nodeValue

• xmlDoc - the XML document created by the parser.

• getElementsByTagName("to")[0] - the first <to> element

• childNodes[0] - the first child of the <to> element (the text node)

• nodeValue - the value of the node (the text itself)

XML DOM example

22

x=xmlDoc.getElementsByTagName("title")[0].childNodes[0];

Retrieves the text value of the first <title> element

txt=xmlDoc.getElementsByTagName("title")[0].getAttribute("lang");

retrieves the text value of the "lang" attribute of the first <title> element

Advanced DOM examples

23

x=xmlDoc.getElementsByTagName("book");

for(i=0;i<x.length;i++) {

x[i].setAttribute("edition","first");

}

Change value of attributes

24

newel=xmlDoc.createElement("edition");

newtext=xmlDoc.createTextNode("First");

newel.appendChild(newtext);

x=xmlDoc.getElementsByTagName("book");

x[0].appendChild(newel);

Show using trees ...

What is happening here ... ?

25

x=xmlDoc.getElementsByTagName("book")[0];

x.removeChild(x.childNodes[0]);

And what about here?

26

• The CD Catalogue (XML)

• The CD Catalogue (Code)

• The CD Catalogue (HTML)

XML to HTML

27

• We check the browser, and load the XML using the correct parser

• We create an HTML table with <table border="1">

• We use getElementsByTagName() to get all XML CD nodes

• For each CD node, we display data from ARTIST and TITLE as table data.

• We end the table with </table>

XML to HTML explanation

28

• Provides a way to communicate with a server after a web page has loaded

• Allows website to – Update a web page with new data without reloading the page– Request data from a server after the page has loaded – Receive data from a server after the page has loaded– Send data to a server in the background

XMLHttpRequest Object

29

function loadXMLDoc(url) {

xmlhttp=new XMLHttpRequest();

xmlhttp.onreadystatechange=state_Change;

xmlhttp.open("GET",url,true);

xmlhttp.send();

}

function state_Change() {

...

x=xmlhttp.responseXML.documentElement.getElementsByTagName("CD");

...

}

<body>

<button onclick="loadXMLDoc('cd_catalog.xml')">Get CD info</button>

</body>

Using XMLHttpRequest

30

xmlhttp.open("GET",url,true);

• Our examples use "true" in the third parameter of open()• This parameter specifies whether the request should be handled

asynchronously• True means that the script continues to run after the send() method,

without waiting for a response from the server• The onreadystatechange event complicates the code. But it is the

safest way if you want to prevent the code from stopping if you don't get a response from the server.

• By setting the parameter to "false", your can avoid the extra onreadystatechange code. Use this if it's not important to execute the rest of the code if the request fails.

XMLHttpRequest Async

31

Questions?

Documents

1 Dr Alexiei Dingli XML Technologies XML Advanced