
Alphabet Soup: XSL Stylesheetsocean.otr.usm.edu/~w300778/is-doctor/pubpdf/xmlxsl.pdfAlphabet Soup: XSL Stylesheets Overview The first tutorial in this series introduced the core Extensible


Alphabet Soup: XSL Stylesheets

Overview

The first tutorial in this series introduced the core Extensible Markup Language (XML) technologies. The second tutorial described the construction of a well-formed XML document. The third tutorial discussed the role of the XML schema, the primary elements of a schema, and the relationship between an XML document and an XML schema. In this tutorial, we describe the roles of XSL stylesheets and introduce the mark-up and transformation languages used in stylesheets.

Presentation and Transformation

We started this series by discussing the primary role of XML. XML is a set of technologies that simplify and enhance the distribution of data and information (Figure 1). XML does this by structuring data hierarchically using tags to convey the meaning of each unit of data (called an element in XML terminology).

Figure 1: Information Sharing Using XML

The core repository for data in an XML-based system is the XML document. An XML document is a text document that contains elements delimited using tags (Figure 2).


Figure 2: BankAcct3.xml
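As a rough illustration of such a document, the sketch below shows what a single-account file along the lines of BankAcct3.xml might contain. The BankAccount and Balance element names and the 232.34 balance appear later in this tutorial; the remaining element names are borrowed from the multi-account stylesheet shown at the end of the tutorial, and the account id and holder values are invented placeholders, so the actual sample file may differ.

```xml
<?xml version="1.0"?>
<!-- Hypothetical single-account document; ids, names and tax ids are placeholders -->
<BankAccount>
  <AccountID>0001</AccountID>
  <Balance>232.34</Balance>
  <AccountHolders>
    <AccountHolder>
      <HolderName>Pat Example</HolderName>
      <HolderTaxID>000-00-0001</HolderTaxID>
    </AccountHolder>
    <AccountHolder>
      <HolderName>Chris Example</HolderName>
      <HolderTaxID>000-00-0002</HolderTaxID>
    </AccountHolder>
  </AccountHolders>
</BankAccount>
```

Each opening tag has a matching closing tag, and the nesting of the tags expresses the hierarchy: one account, its balance, and a list of account holders.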

Presentation: Presentation of information on the web is one common application of XML. By default, web browsers capable of processing XML display the XML document in a hierarchical, text format (Figure 3).

Figure 3: BankAcct3.xml in Internet Explorer


Although this format may be acceptable for simple XML documents, it is not appropriate for complex XML documents. XSL provides a means of formatting the XML data for easier use. With XSL, the web browser displays a more meaningful view of the data (Figure 4).

Figure 4: BankAcct3.xml Rendered Using an XSL Stylesheet

Data Transformation: In addition to presenting information on the web, the XSL stylesheet provides a means to transform data as needed so that different applications can examine the same data in whatever way they need the data. One XML document may have many XSL stylesheets, one stylesheet for each user (Figure 5). The user can be a person or an application.


Figure 5: Multiple XSL Stylesheets for One XML Document

As a simple illustration, a person interested in information about his or her bank account may prefer the view in Figure 4; however, a computer program would have a great deal of difficulty understanding this format. Instead, the computer program might prefer the data in a comma-separated format like Figure 6.

Figure 6: Comma-separated BankAcct3.xml Data
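One way such comma-separated output could be produced is with a second stylesheet that emits plain text instead of HTML. The sketch below is an assumption about how that might look, not the stylesheet behind Figure 6: the element names other than BankAccount and Balance are borrowed from the stylesheet at the end of this tutorial, and the column order is invented. The constructs it uses are introduced in the next section.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- Emit plain text rather than markup -->
  <xsl:output method="text"/>

  <xsl:template match="/BankAccount">
    <!-- One output line per account holder: account id, balance, name, tax id -->
    <xsl:for-each select="AccountHolders/AccountHolder">
      <!-- ../.. climbs from AccountHolder back up to BankAccount -->
      <xsl:value-of select="../../AccountID"/>
      <xsl:text>,</xsl:text>
      <xsl:value-of select="../../Balance"/>
      <xsl:text>,</xsl:text>
      <xsl:value-of select="HolderName"/>
      <xsl:text>,</xsl:text>
      <xsl:value-of select="HolderTaxID"/>
      <!-- &#10; is a line feed, ending the record -->
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

This sketch does not quote values, so a holder name containing a comma would need extra handling before a program could parse the output reliably.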

A Simple XSL Stylesheet

XSL uses HyperText Markup Language (HTML) to format data for display by web browsers and other applications that understand HTML, XPath for navigating through the XML hierarchy, and XSL Transformations (XSLT) for manipulating the data. Figure 7 is an example of a basic XSL stylesheet.

Figure 7: A Simple XSL Stylesheet (BankAcct3.xsl)

To understand XSL, we will dissect this basic XSL stylesheet.

Line 1 declares the contents of the file as an XSL stylesheet based on the 1999 XSL standard.

Line 3 establishes the initial point in the XML document for processing nodes. The slash at the start of match="/BankAccount" establishes the root node as the starting point for processing the XML document. The remainder of the match property indicates this template manipulates the top user-defined node in the XML document tree (bank account). Generally, the match property for the initial template used to process the XML document starts with a reference to the root node (/) followed by the name of the top user-defined node.

Lines 5 and 6 integrate HTML and XSLT to display the values of the account id element and the balance element, respectively. Both lines start with standard HTML to display labels for the data values. Each “&#160;” inserts a non-breaking space in the stylesheet’s output; this is needed when forcing space characters outside of HTML elements. The value-of transformation instruction displays the value of the element identified by the select property. The value-of instruction self-terminates to ensure a well-formed stylesheet. The <br/> at the end of each line is “XML” notation for an HTML line break; in general, single-tag HTML elements require a trailing slash to self-terminate.

Line 8 starts repetitively processing each account holder node. Line 9 specifies the sort order for account holder nodes, in this case sorting by the account holder’s name. Lines 11 through 17 display the account holder information using HTML and XSLT. Line 19 terminates this repetitive processing.

Line 21 closes the template XPath pointer established by line 3.

Line 23 terminates the XSL stylesheet.
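The walkthrough above can be pictured with a sketch like the following. It is a reconstruction under assumptions, not a copy of Figure 7: the element names other than BankAccount and Balance are taken from the stylesheet at the end of this tutorial, and the line positions here will not necessarily match the line numbers cited above.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<!-- Start at the root (/) and handle the top user-defined node -->
<xsl:template match="/BankAccount">

  <!-- HTML label, two non-breaking spaces, then the element's value -->
  <b>Account ID:</b>&#160;&#160;<xsl:value-of select="AccountID"/><br/>
  <b>Balance:</b>&#160;&#160;<xsl:value-of select="Balance"/><br/>

  <!-- Repeat for every account holder, sorted by the holder's name -->
  <xsl:for-each select="AccountHolders/AccountHolder">
    <xsl:sort select="HolderName"/>

    <b>Name:</b>&#160;&#160;
    <xsl:value-of select="HolderName"/>
    &#160;&#160;&#160;&#160;
    <b>Tax ID:</b>&#160;&#160;
    <xsl:value-of select="HolderTaxID"/><br/>

  </xsl:for-each>

</xsl:template>

</xsl:stylesheet>
```

The three languages are visible side by side: the b and br tags are HTML, the values of the match and select properties are XPath, and every element with the xsl: prefix is XSLT.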

Open MyAccount.html in Internet Explorer. MyAccount.html processes the BankAcct3.xml document using the BankAcct3.xsl stylesheet to produce data formatted using HTML (Figure 8). The resulting page looks similar to Figure 4.

Figure 8: XML Document/XSL Stylesheet Relationship

Modifications: Now we will modify the stylesheet to display the information in different ways. Open BankAcct3.xsl in Crimson Editor. Line 6 currently displays the value of the balance element without any formatting. Modify line 6 as follows:


<b>Balance:</b>&#160;&#160;<xsl:value-of select="format-number(Balance, '###,###,##0.00; (###,###,##0.00)')"/><br/>

This change uses the format-number XSLT function to format the balance using commas, enclosing negative numbers in parentheses, and displaying at least one digit to the left of the decimal point and two digits to the right. Save the change.

Open BankAcct3.xml in Crimson Editor and change the balance from 232.34 to 1000232.34. Save the change. Switch to Internet Explorer and refresh the page (MyAccount.html). Verify the balance displays properly given the format specified (1,000,232.34).

Switch to BankAcct3.xml in Crimson Editor. Add a minus sign to the start of the balance. Save the change. Switch to Internet Explorer and refresh the page. The balance should display inside parentheses.

Switch to BankAcct3.xsl in Crimson Editor. Change the sort order of the account holders to use the holder’s tax id instead of the name. Save the change. Switch to Internet Explorer and refresh the page. The account holders should sort based on the tax id.

Default XSL Stylesheet: Currently, we are using a web page to transform the contents of the XML document using the XSL stylesheet. You may want to specify a default XSL stylesheet for the XML document. Switch to BankAcct3.xml in Crimson Editor. Add line 2 as shown in Figure 9. This provides an automatic link to the BankAcct3.xsl stylesheet.

Figure 9: Default Stylesheet Specification
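A default stylesheet is normally attached with an xml-stylesheet processing instruction placed before the top element. The sketch below assumes that line 1 of BankAcct3.xml is the XML declaration, so the added instruction lands on line 2; the body of the document is abbreviated to a comment.

```xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="BankAcct3.xsl"?>
<BankAccount>
  <!-- existing account elements stay unchanged -->
</BankAccount>
```

The href value is resolved relative to the XML document, so this form assumes the stylesheet sits in the same folder as BankAcct3.xml.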

Close Internet Explorer. Open BankAcct3.xml in Internet Explorer (Figure 10).


Figure 10: XML Document Displayed Using a Default XSL Stylesheet

Overriding the Default XSL Stylesheet: After setting a default XSL stylesheet, if you need to associate other XSL stylesheets with the same XML document, use an HTML document (web page) to process the desired XSL stylesheet just as MyAccount.html does.

Figure 11: Process BankAcct3.xml Using the BankAcct3.xsl Stylesheet (MyAccount.html)


Figure 11 is a JavaScript function embedded in a web page that uses the BankAcct3.xsl stylesheet to process BankAcct3.xml and display the results. Line 12 defines the XML document. Line 16 defines the XSL stylesheet.

The sample files include a web page called Template.html. You can copy and modify Template.html to create additional web pages to process XML documents using XSL stylesheets; just change lines 12 and 16 as needed.

Make a copy of Template.html and rename the copy AcctInfo.html. Modify AcctInfo.html in Crimson Editor to process the BankAcct3.xml document using the AcctInfo.xsl stylesheet. Open AcctInfo.html in Internet Explorer. The web page should look similar to Figure 12. The XSL stylesheet designated in the web page overrides the default stylesheet specified in the XML document.

Figure 12: AcctInfo.html Output

Close Internet Explorer. Open BankAcct3.xml in Internet Explorer. Notice how the page displayed does not look exactly like Figure 12. Why not?

Close all files open in Crimson Editor and Internet Explorer.


Extending XSL Stylesheets

Up to this point, we have used an XML document with data about a single bank account. To demonstrate the full potential of XSL, we are going to extend this XML document to hold data about several bank accounts.

Open AllAccts.xml in Internet Explorer. This XML document contains information about multiple accounts. Figure 13 shows this XML document formatted using a default XSL stylesheet.

Figure 13: AllAccts.xml


Open AllAccts.xml in Crimson Editor. Scroll through the XML document paying attention to the structure of the document and comparing it to the document as displayed in Figure 13. Notice that this XML document contains multiple account nodes. This is the most significant difference between this XML document and the one we have been using. The top node (bank accounts) has multiple child nodes, one for each account. Any closed account has a close date element; however, accounts currently open do not have the close date element.
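The structure described above might look like the sketch below. The BankAccounts, Account, AccountID, Balance, AccountHolders, AccountHolder, HolderName and HolderTaxID names come from the stylesheet at the end of this tutorial; the CloseDate element name, its date format, and all values are assumptions made for illustration.

```xml
<?xml version="1.0"?>
<!-- Hypothetical multi-account document; all values are placeholders -->
<BankAccounts>

  <!-- An open account: no close date element -->
  <Account>
    <AccountID>0001</AccountID>
    <Balance>232.34</Balance>
    <AccountHolders>
      <AccountHolder>
        <HolderName>Pat Example</HolderName>
        <HolderTaxID>000-00-0001</HolderTaxID>
      </AccountHolder>
    </AccountHolders>
  </Account>

  <!-- A closed account: carries a close date (element name assumed) -->
  <Account>
    <AccountID>0002</AccountID>
    <Balance>0.00</Balance>
    <CloseDate>2004-01-31</CloseDate>
    <AccountHolders>
      <AccountHolder>
        <HolderName>Chris Example</HolderName>
        <HolderTaxID>000-00-0002</HolderTaxID>
      </AccountHolder>
    </AccountHolders>
  </Account>

</BankAccounts>
```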

Figure 14 is the default XSL stylesheet for the AllAccts.xml document.

Figure 14: Initial AllAccts.xsl Stylesheet

We will examine some key parts of this XSL stylesheet.

Templates: Since the bank accounts node contains a child node for each account, we have to process each account node separately to produce account information. The XSLT template element on line 3 defines the starting point for processing the XML document. In this case, the XSL stylesheet processes each child node in the bank accounts node.

This is a good point to talk about how templates work in XSL stylesheets. The XML parser uses the match property of the XSLT template element to determine whether a template applies to the current node. If the currently selected node matches the pattern given in the match property, the XML parser uses the template. Initially the XML parser starts with the root node and tries to find a template that matches this node. If it finds one, the XML parser processes the node using this template. If not, the XML parser moves to the next lower node in the hierarchy and searches for a matching template. This continues until the XML parser finds a match or reaches the end of the hierarchy. If multiple templates match a node with the same precedence and priority, the XSLT 1.0 specification treats this as an error that a processor may recover from by choosing the template that occurs last in the stylesheet; the XML parser implemented by Microsoft currently uses the last matching template in the XSL stylesheet. As a note, there is a way to assign priorities to templates within an XSL stylesheet; however, this is beyond the scope of this tutorial.

Since there are multiple account nodes, the XSLT apply-templates element on line 5 causes the XML parser to process each account node separately, in order based on the account id. The XML parser looks for a template that manipulates account nodes. The template selected is the one that starts on line 11. This template is almost identical to the BankAcct3.xsl stylesheet discussed earlier.
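The arrangement described above can be sketched as follows. This is an approximation written from the description rather than a copy of Figure 14, so formatting details and line positions are assumptions; the element names are those used in the final stylesheet at the end of this tutorial.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- Starting template: hand each account node to whichever template
       matches it, in ascending order of account id -->
  <xsl:template match="/BankAccounts">
    <xsl:apply-templates select="Account">
      <xsl:sort select="AccountID"/>
    </xsl:apply-templates>
  </xsl:template>

  <!-- Selected once for every account node; at this stage the holders
       are still listed with a for-each loop -->
  <xsl:template match="Account">
    <b>Account ID:</b>&#160;&#160;<xsl:value-of select="AccountID"/><br/>
    <b>Balance:</b>&#160;&#160;<xsl:value-of select="Balance"/><br/>
    <xsl:for-each select="AccountHolders/AccountHolder">
      <xsl:sort select="HolderName"/>
      <b>Name:</b>&#160;&#160;<xsl:value-of select="HolderName"/>
      &#160;&#160;&#160;&#160;
      <b>Tax ID:</b>&#160;&#160;<xsl:value-of select="HolderTaxID"/><br/>
    </xsl:for-each>
    <hr/>
  </xsl:template>

</xsl:stylesheet>
```

The first template never names the second one directly. The apply-templates element only selects nodes, and the match property of each template decides which template handles them.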

Close Internet Explorer and Crimson Editor.

Make a copy of AllAccts.xsl and name the copy AllAccts2.xsl.

Open AllAccts.xml in Crimson Editor. Change the default XSL stylesheet to reference AllAccts2.xsl (href="allaccts2.xsl"). Save the change and close Crimson Editor.

Open AllAccts2.xsl in Crimson Editor. Remove the <i> and </i> tags on lines 19 and 24. Save the changes.


Open AllAccts.xml in Internet Explorer. Notice that the AllAccts2.xsl stylesheet does not italicize the Name and Tax ID labels.

We are going to modify the XSL stylesheet to make better use of templates. Switch to AllAccts2.xsl in Crimson Editor. Cut lines 18 through 26 to the clipboard (select the lines and then Edit → Cut). Move the cursor to line 23 (immediately before </xsl:stylesheet>). Paste the lines cut earlier (Edit → Paste).

Modify the XSL stylesheet so the text looks like the code in Figure 15. We have marked locations requiring changes. Save the changes.

Figure 15: AllAccts2.xsl Modifications

These modifications create a template to process account holder nodes and set up the template that processes account nodes to use the new template. This accomplishes the same outcome as the XSLT for-each element previously used; however, the XML parser appears to process XSL stylesheets faster using this method, which can matter when using large XML documents. This also makes reusing XSL code easier since the XSL stylesheet organizes code for processing nodes in discrete blocks, which you can copy to new stylesheets more easily.

Switch to AllAccts.xml in Internet Explorer and refresh the page. The page should look the same. If not, check for typographical errors in your code.

To verify that the new structure actually does something, switch back to AllAccts2.xsl in Crimson Editor. Change the sort order for account holders (line 17 in Figure 15) to use the account holder’s tax id. Save the changes. Switch to Internet Explorer and refresh the page. Verify the sort order of account holders is the tax id.

Sorting Nodes: Switch to AllAccts2.xsl in Crimson Editor. Scroll to the top of the XSL stylesheet. The XSL stylesheet currently sorts account nodes in ascending order by the account id (Figure 16). We want to sort the account nodes in descending order by balance. Change the sort element to use the balance. Change the sort order to descending. Save the changes.

Figure 16: Initial Account Node Sort Order

Switch to Internet Explorer and refresh the page. Notice that the XML parser sorts the balances as text, comparing them character by character from the left, rather than by numeric value. This is because the XML parser treats the data as text by default. We must tell the parser that the balance is a number.

Switch to AllAccts2.xsl in Crimson Editor. Modify the sort line to look like the following:

<xsl:sort select="Balance" order="descending" data-type="number"/>


The data-type property tells the XML parser the balance is a number. Save the changes. Switch to Internet Explorer and refresh the page. The sort should be correct now.

Filtering Nodes: Switch to AllAccts2.xsl in Crimson Editor. Assume we only want to see accounts with a balance less than $100,000. XSL provides a filtering mechanism when applying templates. Modify the XSL stylesheet as shown in Figure 17. Save the changes. Switch to Internet Explorer and refresh the page. You should get an error similar to the one shown in Figure 18.

Figure 17: Filtering - First Attempt

Figure 18: Filter - First Attempt Error


The reason for this error is that the less-than sign (<) is a special character in XML; it is the starting character for any tag. Use &lt; instead of the less-than sign to correct this problem. Make this change to AllAccts2.xsl in Crimson Editor. Save the changes. Switch to Internet Explorer and refresh the page. The page should display correctly. As a note, use &gt; instead of the greater-than sign (>); the greater-than sign is the ending character for any tag.
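The same escaping applies when a filter combines several conditions. The sketch below is a hypothetical variation, not one of the tutorial's figures: it keeps accounts with a balance under 100,000 and no more than 1,000,000 below zero, and it additionally skips closed accounts, assuming the close date element is named CloseDate (that name is a guess).

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:template match="/BankAccounts">
    <!-- Writing a raw less-than sign inside the select value would make
         the stylesheet ill-formed, so both comparisons use the escaped
         forms. not(CloseDate) is true for accounts that have no
         CloseDate child (hypothetical element name). -->
    <xsl:apply-templates
        select="Account[Balance &lt; 100000 and Balance &gt; -1000000 and not(CloseDate)]">
      <xsl:sort select="Balance" order="descending" data-type="number"/>
    </xsl:apply-templates>
  </xsl:template>

  <!-- Minimal account template so the sketch stands on its own -->
  <xsl:template match="Account">
    <xsl:value-of select="AccountID"/>
    <xsl:text>: </xsl:text>
    <xsl:value-of select="Balance"/>
    <br/>
  </xsl:template>

</xsl:stylesheet>
```

The XML parser converts &lt; and &gt; back into the comparison operators before the XPath expression is evaluated, so the filter behaves exactly as if the raw characters had been written.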

Figure 19 shows the completed AllAccts2.xsl stylesheet.

Figure 19: Final AllAccts2.xsl Stylesheet

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:template match="/BankAccounts">
    <xsl:apply-templates select="Account[Balance&lt;100000]">
      <xsl:sort select="Balance" order="descending" data-type="number"/>
    </xsl:apply-templates>
  </xsl:template>

  <xsl:template match="Account">
    <b>Account ID:</b>&#160;&#160;<xsl:value-of select="AccountID" /><br/>
    <b>Balance:</b>&#160;&#160;<xsl:value-of select="format-number(Balance, '###,###,##0.00; (###,###,##0.00)')" /><br/>
    <xsl:apply-templates select="AccountHolders/AccountHolder">
      <xsl:sort select="HolderTaxID"/>
    </xsl:apply-templates>
    <hr/>
  </xsl:template>

  <xsl:template match="AccountHolder">
    <b>Name:</b>&#160;&#160;
    <xsl:value-of select="HolderName" />
    &#160;&#160;&#160;&#160;
    <b>Tax ID:</b>&#160;&#160;
    <xsl:value-of select="HolderTaxID" /><br/>
  </xsl:template>

</xsl:stylesheet>


Summary

This tutorial described the roles and parts of an XSL stylesheet, as well as a web page to process XML documents and XSL stylesheets. This tutorial does not provide comprehensive coverage of XSL stylesheets. For additional information about XSL stylesheets, consult the XSL specification developed by the W3C and the Microsoft XML 4.0 Parser software development kit (SDK) documentation available from Microsoft.

XML Resources

Crimson Editor, www.crimsoneditor.com.

Microsoft Internet Explorer, www.microsoft.com.

Microsoft XML 4.0 Parser Software Development Kit (SDK), www.microsoft.com.

Topologi P/L Schematron Validator, www.topologi.com.

World Wide Web Consortium, www.w3.org.