Alphabet Soup: XSL Stylesheets
Overview
The first tutorial in this series introduced the core Extensible Markup Language (XML)
technologies. The second tutorial described the construction of a well-formed XML document.
The third tutorial discussed the role of the XML schema, the primary elements of a schema, and
the relationship between an XML document and an XML schema. In this tutorial, we describe
the roles of XSL stylesheets and introduce the mark-up and transformation languages used in
stylesheets.
Presentation and Transformation
We started this series by discussing the primary role of XML. XML is a set of technologies that
simplify and enhance the distribution of data and information (Figure 1). XML does this by
structuring data hierarchically using tags to convey the meaning of each unit of data (called an
element in XML terminology).
Figure 1: Information Sharing Using XML
The core repository for data in an XML-based system is the XML document. An XML
document is a text document that contains elements delimited using tags (Figure 2).
Figure 2: BankAcct3.xml
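As a rough guide to what such a document contains, the following is a minimal sketch of a bank account document. It assumes the element names used by the stylesheets later in this tutorial (BankAccount, AccountID, Balance, AccountHolders, AccountHolder, HolderName, HolderTaxID). The account id, names, and tax ids are invented for illustration and are not the contents of the actual BankAcct3.xml file.

```xml
<?xml version="1.0"?>
<!-- Hypothetical sample data: only the element names and the 232.34 balance come from the tutorial text -->
<BankAccount>
  <AccountID>1001</AccountID>
  <Balance>232.34</Balance>
  <AccountHolders>
    <AccountHolder>
      <HolderName>Pat Example</HolderName>
      <HolderTaxID>000-00-0001</HolderTaxID>
    </AccountHolder>
    <AccountHolder>
      <HolderName>Alex Example</HolderName>
      <HolderTaxID>000-00-0002</HolderTaxID>
    </AccountHolder>
  </AccountHolders>
</BankAccount>
```

Each pair of opening and closing tags delimits one element, and the nesting of the tags expresses the hierarchy.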
Presentation: Presentation of information on the web is one common application of XML. By
default, web browsers capable of processing XML display the XML document in a hierarchical,
text format (Figure 3).
Figure 3: BankAcct3.xml in Internet Explorer
Although this format may be acceptable for simple XML documents, this format is not
appropriate for complex XML documents. XSL provides a means of formatting the XML data
for easier use. With XSL, the web browser displays a more meaningful view of the data (Figure
4).
Figure 4: BankAcct3.xml Rendered Using an XSL Stylesheet
Data Transformation: In addition to presenting information on the web, the XSL stylesheet
provides a means to transform data as needed so that different applications can examine the same
data in whatever way they need the data. One XML document may have many XSL stylesheets,
one stylesheet for each user (Figure 5). The user can be a person or an application.
Figure 5: Multiple XSL Stylesheets for One XML Document
As a simple illustration, a person interested in information about his or her bank account may
prefer the view in Figure 4; however, a computer program would have a great deal of difficulty
understanding this format. Instead, the computer program might prefer the data in a comma-
separated format like Figure 6.
Figure 6: Comma-separated BankAcct3.xml Data
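One way to produce output along these lines is a stylesheet that emits plain text instead of HTML. The following is a minimal sketch, not the stylesheet behind Figure 6. It assumes the element names used elsewhere in this tutorial, and the column order is our own choice.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <!-- Emit plain text rather than XML or HTML -->
  <xsl:output method="text"/>

  <xsl:template match="/BankAccount">
    <!-- One output line per account holder: account id, balance, name, tax id -->
    <xsl:for-each select="AccountHolders/AccountHolder">
      <xsl:value-of select="../../AccountID"/>
      <xsl:text>,</xsl:text>
      <xsl:value-of select="../../Balance"/>
      <xsl:text>,</xsl:text>
      <xsl:value-of select="HolderName"/>
      <xsl:text>,</xsl:text>
      <xsl:value-of select="HolderTaxID"/>
      <!-- &#10; is a line feed, ending the current output line -->
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

The sketch does not quote values, so a name that itself contains a comma would need extra handling.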
A Simple XSL Stylesheet
XSL uses HyperText Markup Language (HTML) to format data for display by web browsers and
other applications that understand HTML, XPath for navigating through the XML hierarchy, and
XSL Transformations (XSLT) for manipulating the data. Figure 7 is an example of a basic XSL
stylesheet.
Figure 7: A Simple XSL Stylesheet (BankAcct3.xsl)
To understand XSL, we will dissect this basic XSL stylesheet.
Line 1 declares the contents of the file as an XSL stylesheet based on the 1999 XSL standard.
Line 3 establishes the initial point in the XML document for processing nodes. The slash at the
start of match="/BankAccount" establishes the root node as the starting point for processing the
XML document. The remainder of the match property indicates this template manipulates the
top user-defined node in the XML document tree (bank account). Generally, the match property
for the initial template used to process the XML document starts with a reference to the root
node (/) followed by the name of the top user-defined node.
Lines 5 and 6 integrate HTML and XSLT to display the values of the account id element and the
balance element, respectively. Both lines start with standard HTML to display labels for the data
values. Each non-breaking space character reference (written, for example, as &#160;) inserts a
blank space in the stylesheet’s output; this is needed when forcing space characters outside of
HTML elements. The value-of transformation instruction displays the value of the element
identified by the select property. The value-of instruction self-terminates to ensure a
well-formed stylesheet. The <br/> at the end of each line is “XML” notation for an HTML line
break; in general, single-tag HTML elements require a trailing slash to self-terminate.
Line 8 starts repetitively processing each account holder node. Line 9 specifies the sort order for
account holder nodes, in this case sorting by the account holder’s name. Lines 11 through 17
display the account holder information using HTML and XSLT. Line 19 terminates this
repetitive processing.
Line 21 closes the template opened on line 3.
Line 23 terminates the XSL stylesheet.
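Putting the pieces described above together, a stylesheet of this kind might look roughly like the sketch below. It is a reconstruction under assumptions rather than a copy of Figure 7: the element names are borrowed from the stylesheets later in this tutorial, and the line layout does not necessarily match the line numbers cited above.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- Start at the root (/) and match the top user-defined node -->
  <xsl:template match="/BankAccount">

    <!-- HTML label, a non-breaking space, then the element's value -->
    <b>Account ID:</b>&#160;<xsl:value-of select="AccountID"/><br/>
    <b>Balance:</b>&#160;<xsl:value-of select="Balance"/><br/>

    <!-- Repeat the enclosed markup once per account holder, sorted by name -->
    <xsl:for-each select="AccountHolders/AccountHolder">
      <xsl:sort select="HolderName"/>
      <b>Name:</b>&#160;<xsl:value-of select="HolderName"/>&#160;&#160;<b>Tax ID:</b>&#160;<xsl:value-of select="HolderTaxID"/><br/>
    </xsl:for-each>

  </xsl:template>

</xsl:stylesheet>
```

The sort element must come first inside the for-each element, before any markup that produces output.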
Open MyAccount.html in Internet Explorer. MyAccount.html processes the BankAcct3.xml
document using the BankAcct3.xsl stylesheet to produce data formatted using HTML (Figure 8).
The resulting page looks similar to Figure 4.
Figure 8: XML Document/XSL Stylesheet Relationship
Modifications: Now we will modify the stylesheet to display the information in different ways.
Open BankAcct3.xsl in Crimson Editor. Line 6 currently displays the value of the balance
element without any formatting. Modify line 6 as follows:
<b>Balance:</b>&#160;&#160;<xsl:value-of select="format-number(Balance, '###,###,##0.00; (###,###,##0.00)')"/><br/>
This change uses the format-number XSLT function to format the balance using commas,
enclosing negative numbers in parentheses, and displaying at least one digit to the left of the
decimal point and exactly two digits to the right. Save the change.
Open BankAcct3.xml in Crimson Editor and change the balance from 232.34 to 1000232.34.
Save the change. Switch to Internet Explorer and refresh the page (MyAccount.html). Verify
the balance displays properly given the format specified (1,000,232.34).
Switch to BankAcct3.xml in Crimson Editor. Add a minus sign to the start of the balance. Save
the change. Switch to Internet Explorer and refresh the page. The balance should display inside
parentheses.
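To experiment with the format pattern apart from the bank account data, a small stand-alone stylesheet can apply the same pattern to a few literal numbers. This is a sketch for experimentation only. The sample values other than 1000232.34 are our own, and the variable name pattern is hypothetical. Because the stylesheet ignores its input, it can be run against any XML document.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <!-- Same pattern as the modified line 6: positive format, semicolon, negative format -->
  <xsl:variable name="pattern" select="'###,###,##0.00; (###,###,##0.00)'"/>

  <xsl:template match="/">
    <!-- A large positive value exercises the comma grouping -->
    <xsl:value-of select="format-number(1000232.34, $pattern)"/>
    <xsl:text>&#10;</xsl:text>
    <!-- A negative value exercises the second half of the pattern -->
    <xsl:value-of select="format-number(-1000232.34, $pattern)"/>
    <xsl:text>&#10;</xsl:text>
    <!-- A value below one exercises the required digit before the decimal point -->
    <xsl:value-of select="format-number(0.5, $pattern)"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

The first value should appear as 1,000,232.34, matching the result described above. The other two lines show how the same pattern treats a negative number and a number smaller than one.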
Switch to BankAcct3.xsl in Crimson Editor. Change the sort order of the account holders to use
the holder’s tax id instead of the name. Save the change. Switch to Internet Explorer and refresh
the page. The account holders should sort based on the tax id.
Default XSL Stylesheet: Currently, we are using a web page to transform the contents of the
XML document using the XSL stylesheet. You may want to specify a default XSL stylesheet for
the XML document. Switch to BankAcct3.xml in Crimson Editor. Add line 2 as shown in
Figure 9. This provides an automatic link to the BankAcct3.xsl stylesheet.
Figure 9: Default Stylesheet Specification
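A default stylesheet is normally declared with an xml-stylesheet processing instruction placed after the XML declaration. The following is a minimal sketch of that arrangement, assuming the stylesheet sits in the same folder as the XML document. The account data is elided here and the exact contents of Figure 9 may differ.

```xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="BankAcct3.xsl"?>
<BankAccount>
  <!-- Account data as before; omitted in this sketch -->
</BankAccount>
```

The href value is resolved relative to the location of the XML document, so moving either file means updating this line.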
Close Internet Explorer. Open BankAcct3.xml in Internet Explorer (Figure 10).
Figure 10: XML Document Displayed Using a Default XSL Stylesheet
Overriding the Default XSL Stylesheet: After setting a default XSL stylesheet, if you need to
associate other XSL stylesheets with the same XML document, use an HTML document (web
page) to process the desired XSL stylesheet just as MyAccount.html does.
Figure 11: Process BankAcct3.xml Using the BankAcct3.xsl Stylesheet (MyAccount.html)
Figure 11 is a JavaScript function embedded in a web page that uses the BankAcct3.xsl
stylesheet to process BankAcct3.xml and display the results. Line 12 defines the XML
document. Line 16 defines the XSL stylesheet.
The sample files include a web page called Template.html. You can copy and modify
Template.html to create additional web pages to process XML documents using XSL stylesheets;
just change lines 12 and 16 as needed.
Make a copy of Template.html and rename the copy AcctInfo.html. Modify AcctInfo.html in
Crimson Editor to process the BankAcct3.xml document using the AcctInfo.xsl stylesheet. Open
AcctInfo.html in Internet Explorer. The web page should look similar to Figure 12. The XSL
stylesheet designated in the web page overrides the default stylesheet specified in the XML
document.
Figure 12: AcctInfo.html Output
Close Internet Explorer. Open BankAcct3.xml in Internet Explorer. Notice how the page
displayed does not look exactly like Figure 12. Why not?
Close all files open in Crimson Editor and Internet Explorer.
Extending XSL Stylesheets
Up to this point, we have used an XML document with data about a single bank account. To
demonstrate the full potential of XSL, we are going to extend this XML document to hold data
about several bank accounts.
Open AllAccts.xml in Internet Explorer. This XML document contains information about
multiple accounts. Figure 13 shows this XML document formatted using a default XSL
stylesheet.
Figure 13: AllAccts.xml
Open AllAccts.xml in Crimson Editor. Scroll through the XML document paying attention to
the structure of the document and comparing it to the document as displayed in Figure 13.
Notice that this XML document contains multiple account nodes. This is the most significant
difference between this XML document and the one we have been using. The top node (bank
accounts) has multiple child nodes, one for each account. Any closed account has a close date
element; however, accounts currently open do not have the close date element.
Figure 14 is the default XSL stylesheet for the AllAccts.xml document.
Figure 14: Initial AllAccts.xsl Stylesheet
We will examine some key parts of this XSL stylesheet.
Templates: Since the bank accounts node contains a child node for each account, we have to
process each account node separately to produce account information. The XSLT template
element on line 3 defines the starting point for processing the XML document. In this case, the
XSL stylesheet processes each child node in the bank accounts node.
This is a good point to talk about how templates work in XSL stylesheets. The XML parser
(more precisely, its XSLT processor) uses the match property of the XSLT template element to
determine if a template applies to the current node. If the currently selected node matches the
pattern defined by the match property, the XML parser uses the template. Initially the XML
parser starts with the root node and tries to find a template that matches this node. If so, the
XML parser processes the node using this template. If not, the XML parser moves to the next
lower node in the hierarchy and searches for a matching template. This continues until the XML
parser finds a match or reaches the end of the hierarchy. In general, if multiple templates match
a node and none takes precedence, the XML parser implemented by Microsoft currently uses the
last matching template in the XSL stylesheet. As a note, there is a way to assign priorities to
templates within an XSL stylesheet; however, this is beyond the scope of this tutorial.
Since there are multiple account nodes, the XSLT apply-templates element on line 5 causes
the XML parser to process each account node separately, in order based on the account id. The
XML parser looks for a template that manipulates account nodes. The template selected is the
one that starts on line 11. This template is almost identical to the BankAcct3.xsl stylesheet
discussed earlier.
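The structure just described can be sketched as follows. This is a reconstruction under assumptions, not a copy of Figure 14: the element names come from the completed stylesheet shown at the end of this tutorial, the italic labels are inferred from the exercise that later removes them, and the line layout does not necessarily match the line numbers cited above.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- Starting template: matches the top user-defined node -->
  <xsl:template match="/BankAccounts">
    <!-- Hand each Account child to the template that matches it, ordered by account id -->
    <xsl:apply-templates select="Account">
      <xsl:sort select="AccountID"/>
    </xsl:apply-templates>
  </xsl:template>

  <!-- Applied once for every Account node selected above -->
  <xsl:template match="Account">
    <b>Account ID:</b>&#160;<xsl:value-of select="AccountID"/><br/>
    <b>Balance:</b>&#160;<xsl:value-of select="Balance"/><br/>
    <xsl:for-each select="AccountHolders/AccountHolder">
      <xsl:sort select="HolderName"/>
      <!-- Italic labels are an assumption; the exact markup in Figure 14 is not known -->
      <i><b>Name:</b></i>&#160;<xsl:value-of select="HolderName"/>&#160;&#160;<i><b>Tax ID:</b></i>&#160;<xsl:value-of select="HolderTaxID"/><br/>
    </xsl:for-each>
    <hr/>
  </xsl:template>

</xsl:stylesheet>
```

The first template does no formatting of its own. It only selects and orders the account nodes, and the second template decides how each one is displayed.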
Close Internet Explorer and Crimson Editor.
Make a copy of AllAccts.xsl and name the copy AllAccts2.xsl.
Open AllAccts.xml in Crimson Editor. Change the default XSL stylesheet to reference
AllAccts2.xsl (href="allaccts2.xsl"). Save the change and close Crimson Editor.
Open AllAccts2.xsl in Crimson Editor. Remove the <i> and </i> tags on lines 19 and 24. Save
the changes.
Open AllAccts.xml in Internet Explorer. Notice that the AllAccts2.xsl stylesheet does not
italicize the Name and Tax ID labels.
We are going to modify the XSL stylesheet to make better use of templates. Switch to
AllAccts2.xsl in Crimson Editor. Cut lines 18 through 26 to the clipboard (select the lines and
then Edit → Cut). Move the cursor to line 23 (immediately before </xsl:stylesheet>). Paste
the lines cut earlier (Edit → Paste).
Modify the XSL stylesheet so the text looks like the code in Figure 15. We have marked
locations requiring changes. Save the changes.
Figure 15: AllAccts2.xsl Modifications
These modifications create a template to process account holder nodes and set up the template
that processes account nodes to use the new template. This accomplishes the same outcome as
the XSLT for-each element previously used; however, the XML parser appears to process XSL
stylesheets faster using this method. This is a major issue when using large XML documents.
This also makes reusing XSL code easier since the XSL stylesheet organizes code for processing
nodes in discrete blocks, which you can copy to new stylesheets more easily.
Switch to AllAccts.xml in Internet Explorer and refresh the page. The page should look the
same. If not, check for typographical errors in your code.
To verify that the new structure actually does something, switch back to AllAccts2.xsl in
Crimson Editor. Change the sort order for account holders (line 17 in Figure 15) to use the
account holder’s tax id. Save the changes. Switch to Internet Explorer and refresh the page.
Verify the sort order of account holders is the tax id.
Sorting Nodes: Switch to AllAccts2.xsl in Crimson Editor. Scroll to the top of the XSL
stylesheet. The XSL stylesheet currently sorts account nodes in ascending order by the account
id (Figure 16). We want to sort the account nodes in descending order by balance. Change the
sort element to use the balance. Change the sort order to descending. Save the changes.
Figure 16: Initial Account Node Sort Order
Switch to Internet Explorer and refresh the page. Notice that the XML parser sorts the accounts
by the first character of the balance, not the numeric value. This is because the XML parser
treats the data as text. We must tell the parser that the balance is a number.
Switch to AllAccts2.xsl in Crimson Editor. Modify the sort line to look like the following:
<xsl:sort select="Balance" order="descending" data-type="number"/>
The data-type property tells the XML parser the balance is a number. Save the changes.
Switch to Internet Explorer and refresh the page. The sort should be correct now.
Filtering Nodes: Switch to AllAccts2.xsl in Crimson Editor. Assume we only want to see
accounts with a balance less than $100,000. XSL provides a filtering mechanism when applying
templates. Modify the XSL stylesheet as shown in Figure 17. Save the changes. Switch to
Internet Explorer and refresh the page. You should get an error similar to the one shown in
Figure 18.
Figure 17: Filtering - First Attempt
Figure 18: Filter - First Attempt Error
The reason for this error is that the less than sign (<) is a special character in XML; it is the
starting character for any tag. Use &lt; instead of the less than sign to correct this problem.
Make this change to AllAccts2.xsl in Crimson Editor. Save the changes. Switch to Internet
Explorer and refresh the page. The page should display correctly. As a note, use &gt; instead of
the greater than sign (>); the greater than sign is the ending character for any tag.
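The corrected filter can be tried in isolation with a small stylesheet such as the sketch below. It is an illustration under assumptions rather than the contents of Figure 17: it uses the element names from the completed stylesheet and writes plain text so the filtered accounts are easy to check.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <xsl:template match="/BankAccounts">
    <!-- The less than sign is written as &lt; because a raw one is not allowed in an attribute value -->
    <!-- An equivalent test that avoids the less than sign: Account[100000 > Balance] -->
    <xsl:apply-templates select="Account[Balance &lt; 100000]"/>
  </xsl:template>

  <!-- Print one line per account that passed the filter -->
  <xsl:template match="Account">
    <xsl:value-of select="AccountID"/>
    <xsl:text>: </xsl:text>
    <xsl:value-of select="Balance"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

The condition in square brackets is evaluated for each account node, and only the nodes for which it is true are passed to the matching template.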
Figure 19 shows the completed AllAccts2.xsl stylesheet.
Figure 19: Final AllAccts2.xsl Stylesheet
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:template match="/BankAccounts">
    <xsl:apply-templates select="Account[Balance&lt;100000]">
      <xsl:sort select="Balance" order="descending" data-type="number"/>
    </xsl:apply-templates>
  </xsl:template>
  <xsl:template match="Account">
    <b>Account ID:</b>&#160;&#160;<xsl:value-of select="AccountID" /><br/>
    <b>Balance:</b>&#160;&#160;<xsl:value-of select="format-number(Balance, '###,###,##0.00; (###,###,##0.00)')" /><br/>
    <xsl:apply-templates select="AccountHolders/AccountHolder">
      <xsl:sort select="HolderTaxID"/>
    </xsl:apply-templates>
    <hr/>
  </xsl:template>
  <xsl:template match="AccountHolder">
    <b>Name:</b>&#160;&#160;&#160;<xsl:value-of select="HolderName" />&#160;&#160;&#160;&#160;&#160;&#160;<b>Tax ID:</b>&#160;&#160;&#160;<xsl:value-of select="HolderTaxID" /><br/>
  </xsl:template>
</xsl:stylesheet>
Summary
This tutorial described the roles and parts of an XSL stylesheet, as well as a web page to process
XML documents and XSL stylesheets. This tutorial does not provide comprehensive coverage
of XSL stylesheets. For additional information about XSL stylesheets, consult the XSL
specification developed by the W3C and the Microsoft XML 4.0 Parser software development kit
(SDK) documentation available from Microsoft.
XML Resources
Crimson Editor, www.crimsoneditor.com.
Microsoft Internet Explorer, www.microsoft.com.
Microsoft XML 4.0 Parser Software Development Kit (SDK), www.microsoft.com.
Topologi P/L Schematron Validator, www.topologi.com.
World Wide Web Consortium, www.w3.org.