This is a slide presentation I gave at XML 2004 in Washington, DC. It covers the basics of XSLT.
Learning XSLT
A Tutorial
Mike Fitzgerald
Wy’east Communications
15 November 2004 Slide 2
Quick Start: What Is XSLT?
- Extensible Stylesheet Language Transformations, or XSLT
- XML Path Language, or XPath, is a companion specification to XSLT
- Both recommendations were published by the World Wide Web Consortium (W3C) in November 1999
Quick Start: Templates
- XSLT stylesheets are driven by templates
- Documents are processed on a template model
- Templates can have (1) innate (default) priority or (2) assigned priority (rare)
- Templates can have modes
- You can name templates and pass them parameters
- Built-in templates
- Attribute value templates
Quick Start: XPath
- XPath is a non-XML syntax for addressing nodes with location paths
- The XPath data model describes XML documents with seven node types
- The types are root (document in 2.0), element, attribute, text, namespace, comment, and processing instruction
Quick Start: More XPath
- Expressions in attribute values
- Patterns, a subset of expressions
- Predicates, a method for filtering nodes
- Axes, forward and reverse
- Arithmetic in expressions
- Functions in expressions (XSLT adds functions to the XPath set of functions)
Quick Start: Output
- Output methods: text, XML, or HTML (XHTML in 2.0)
- Literal result elements
- Instruction elements
- Control over things like indentation, XML declarations, document type declarations, media types, and more
Quick Start: Processing
- Several methods of copying nodes across
- Variables and parameters
- Can pass parameters in from outside a stylesheet, and from outside a template
- Can process nodes conditionally
- Sorting and numbering of lists
Quick Start: Keys
- Keys are mainly a performance boost
- You can use more than one key
- You can use parameters with keys
- You can cross-reference with keys
Quick Start: Multiple Documents
- Multiple stylesheets through inclusion
- Multiple stylesheets through importation
- Multiple source documents
- Multiple output documents (through extension in 1.0, standard in 2.0)
Quick Start: Alternative Stylesheets
- Normal stylesheets
- Literal result stylesheets
- Embedding stylesheets
- Namespace aliasing
- Namespace exclusion
Quick Start: Extensions
- XSLT is designed for extension
- Can use processor-specific extensions, such as those from Xalan and Saxon, or the community-defined EXSLT extensions
- Can extend XSLT yourself
- XSLT 2.0 provides a native way to write extension functions in XML syntax
Main Presentation
- A brief introduction to XSLT and XPath
- Focus on version 1.0, with asides to 2.0
- Most features covered, though not all
- Basic understanding of XML assumed
- Example files available on CD (in DOS format, so use conv -U for Unix line endings)
W3C Specifications
- XSLT 1.0 specification: http://www.w3.org/TR/xslt
- XPath 1.0 specification: http://www.w3.org/TR/xpath
- Working drafts of version 2.0 editions: http://www.w3.org/TR/xslt20 and http://www.w3.org/TR/xpath20
What Is XSL-FO?
- Originally, XSLT was included in Extensible Stylesheet Language, or XSL
- XSL deals with appearance and formatting
- XSL is commonly called XSL-FO
- Specs parted company in April 1999
What Does XSLT Do?
- A language in XML syntax that defines and describes transformations
- Transforms XML to:
  - XML
  - HTML
  - XHTML (2.0)
  - plain text
From Source to Result
- Source XML document becomes source tree
- Source document could be a file or a stream
- Result document could be a file or a stream
- A result tree can be serialized as an XML, XHTML, HTML, or text document
- Text documents could be C++, Java, Python, SQL, or what have you
Process Flow
[Diagram: the source document and the stylesheet are the processor inputs; the result is the processor output]
Templates
- The template is the heart of XSLT
- Templates match patterns in source trees
- A pattern is generally a location in the structure of the source tree
- A template is invoked to search for a pattern
- A template that finds a pattern can be instantiated to create a result tree for output
Stylesheet Basics
- The document element of an XSLT stylesheet can be stylesheet or transform
- version is a required attribute
- A namespace declaration is required: http://www.w3.org/1999/XSL/Transform
- The conventional prefix is xsl
ch01/msg.xml & ch01/msg.xsl
- ch01/msg.xml is just an empty document element
- In ch01/msg.xsl, the document element is stylesheet, with a version attribute and a namespace declaration
- Top-level element output, with a method attribute whose value is text
- Top-level element template matches the pattern msg using the match attribute
- Literal result text: Found it!
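The bullets above describe a stylesheet along the following lines. This is a sketch reconstructed from that description, not the actual file from the CD, which may differ in detail.

```xml
<!-- Assumed source document (msg.xml): <msg/> -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Top-level element: serialize the result as plain text -->
  <xsl:output method="text"/>
  <!-- When a msg element is found, emit the literal text -->
  <xsl:template match="msg">Found it!</xsl:template>
</xsl:stylesheet>
```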
Some XSLT Processors
- Instant Saxon (http://saxon.sourceforge.net):
    saxon msg.xml msg.xsl
- Saxon (http://saxon.sourceforge.net):
    java -jar saxon7.jar msg.xml msg.xsl
- Xalan C++ (http://xml.apache.org):
    xalan msg.xml msg.xsl
- XRay (http://www.architag.com)
- xmlspy (http://www.xmlspy.com)
A Simple Transformation
- Transform the document with the stylesheet:
    xalan msg.xml msg.xsl
- Yields this result:
    Found it!
- How did that happen?
ch01/msg-pi.xml
- An XML stylesheet processing instruction is similar to the HTML element LINK
- Points to a stylesheet:
    <?xml-stylesheet href="msg.xsl" type="text/xsl"?>
- Some processors accept text/xslt
- A W3C specification: http://www.w3.org/TR/xml-stylesheet
XSLT & Browsers
- Several browsers support client-side XSLT transformations, such as IE 6.0, Netscape 7.1, Mozilla 1.7, and Firefox 0.9
- Load ch01/msg-pi.xml in a browser
- See the results in that browser
ch01/message.xml
- XML declaration with version information
- An XML declaration is not a processing instruction
- Document element message
- Attribute priority with value low
- Text content (#PCDATA in DTD terms)
value-of Instruction Element
- Children of top-level elements are usually instruction elements
- value-of has a required select attribute
- Returns a string value
ch01/message.xsl
- stylesheet element with version and namespace
- output element, method is text
- template matches message
- value-of selects text content of message
- text() is a node test for text nodes
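Put together, the pieces listed above would look roughly like this. The source document in the comment is an assumption pieced together from the slides (document element message, attribute priority with value low, and the text shown in the result); the files on the CD may differ.

```xml
<!-- Assumed source document (message.xml):
     <?xml version="1.0"?>
     <message priority="low">Hey, XSLT isn’t so hard after all!</message>
-->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="message">
    <!-- text() selects the text node child; value-of returns its string value -->
    <xsl:value-of select="text()"/>
  </xsl:template>
</xsl:stylesheet>
```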
Another Transformation
- Transform the document with the stylesheet:
    xalan message.xml message.xsl
- Yields this result:
    Hey, XSLT isn’t so hard after all!
- What happened?
Applying Templates
- Templates process the children of the matched node
- The apply-templates element finds other templates that match those children
- The select attribute of apply-templates narrows processing to specific nodes, such as one particular child, and the template that matches each selected node is applied
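A minimal sketch of apply-templates with a select attribute. The source document and its element names (subject, body) are invented for illustration.

```xml
<!-- Hypothetical source document:
     <message><subject>Hi</subject><body>Lunch at noon?</body></message>
-->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Only the body child is selected, so subject is never processed -->
  <xsl:template match="message">
    <xsl:apply-templates select="body"/>
  </xsl:template>

  <xsl:template match="body">
    <xsl:text>Body: </xsl:text>
    <xsl:value-of select="."/>
  </xsl:template>
</xsl:stylesheet>
```

Without the select attribute, apply-templates would process every child of message, and the text of subject would reach the output through the built-in templates.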
XPath 1.0 Data Model
XPath sees XML documents as a set of one or more nodes of seven types:
1. Root (or document in 2.0)
2. Element
3. Attribute
4. Text
5. Namespace
6. Comment
7. Processing instruction
ch04/nodes.xml via ch04/tree-view.xsl
Root Node
- The root node is the root of the XML document
- Called the document node in XPath 2.0
- Has at most one element node child, that is, the root or document element
- Location path: /
Element Nodes
- Elements are the building blocks of structured XML documents
- A location path: message/body/line
- All elements: *
- All elements of a given namespace (name test): rng:*
Attribute Nodes
- Attributes effectively modify elements
- Location path: message/@priority
- All attributes: @*
- xmlns, with or without a prefix (as in xmlns:rng), is not treated as a normal attribute
Other Nodes
- Match text nodes with the text() node test
- Address namespace nodes with the namespace axis (namespace::*) or the namespace-uri() function
- Match comments with the comment() node test
- Match processing instructions with the processing-instruction() node test
Built-in Template Rules
- Hidden magic that is sometimes confusing
- Automatically kick in when an explicit matching template is not found, but a node matching the built-in template is present
- In the absence of appropriate templates, built-in templates are used
Built-in Templates...
- Match patterns
- Match the root node and all element nodes, in every mode
- Match text and attributes
- No-ops on comments, processing instructions, namespace nodes
- Try ch10/built-in.xsl
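Written out as explicit templates, the built-in rules of XSLT 1.0 behave like the stylesheet below. This is a sketch for the default mode only (the specification defines the first rule for every mode), and it is not the content of ch10/built-in.xsl, which is not reproduced in these slides.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Root node and elements: keep walking down the tree -->
  <xsl:template match="*|/">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Text and attribute nodes: copy the string value to the result -->
  <xsl:template match="text()|@*">
    <xsl:value-of select="."/>
  </xsl:template>

  <!-- Comments and processing instructions: do nothing -->
  <xsl:template match="processing-instruction()|comment()"/>
</xsl:stylesheet>
```

This is why a stylesheet with no templates at all still outputs the text content of the source document.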
Template Rules
- Templates have priority when more than one template matches the same node
- The template element has a priority attribute, which takes a real number
- Stylesheets can be included and imported
- Imported templates are governed by import precedence rules
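A small sketch of how priority settles a conflict between templates. The source document and the output strings are made up for illustration; the default priorities in the comments are those given by the XSLT 1.0 specification.

```xml
<!-- Hypothetical source document: <list><item>a</item><item>b</item></list> -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- A plain name test has a default priority of 0 -->
  <xsl:template match="item">
    <xsl:text>plain </xsl:text>
  </xsl:template>

  <!-- A multi-step pattern has a default priority of 0.5 -->
  <xsl:template match="list/item">
    <xsl:text>in-list </xsl:text>
  </xsl:template>

  <!-- An assigned priority of 1 beats both defaults above -->
  <xsl:template match="item" priority="1">
    <xsl:text>chosen </xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

All three templates match each item, and the third one is used. If it were removed, list/item would win on default priority alone.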
Literal Result Elements
- You can literally add text, elements, and attributes to a result tree
- Can use attribute value templates ({}) in the attribute values of literal result elements
- Namespaces apply to literal result elements
- Markup in stylesheet subject to well-formedness constraints
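A minimal sketch of a literal result element with an attribute value template. The source document is hypothetical, and the choice of an HTML p element is only for illustration.

```xml
<!-- Hypothetical source document: <message priority="low">Hello</message> -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <xsl:template match="message">
    <!-- p is a literal result element; {@priority} is an attribute value
         template, evaluated against the current node -->
    <p class="{@priority}">
      <xsl:value-of select="."/>
    </p>
  </xsl:template>
</xsl:stylesheet>
```

For the source shown in the comment, the class attribute in the result takes the value low.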
Literal Examples
- Examples from ch02...
- Literal text:
    xalan text.xml txt.xsl
- Literal result element:
    xalan literal.xml literal.xsl
- Literal HTML markup:
    xalan literal.xml html.xsl
- Literal XHTML markup (with 1.0):
    xalan doc.xml doc.xsl
Instruction Elements
- element creates an element
- attribute creates an attribute
- attribute-set creates a reusable set of attributes (top-level element)
- text creates text
- comment creates comments
- processing-instruction creates a PI
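The instruction elements listed above can be combined as in the following sketch. All names and values (note, common, style.css, and so on) are invented for illustration and are not taken from the ch02 examples.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- A reusable, named set of attributes (top-level element) -->
  <xsl:attribute-set name="common">
    <xsl:attribute name="lang">en</xsl:attribute>
  </xsl:attribute-set>

  <xsl:template match="/">
    <!-- A processing instruction and a comment, placed before the document element -->
    <xsl:processing-instruction name="xml-stylesheet">href="style.css" type="text/css"</xsl:processing-instruction>
    <xsl:comment> generated by a stylesheet </xsl:comment>
    <!-- An element that pulls in the attribute set and adds one more attribute -->
    <xsl:element name="note" use-attribute-sets="common">
      <xsl:attribute name="priority">low</xsl:attribute>
      <xsl:text>Remember the milk</xsl:text>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
```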
ch02/final.xsl
- attribute-set is a top-level element that can contain zero or more attribute elements
- processing-instruction creates an XML stylesheet PI for a CSS document
- comment adds a comment to the document prolog before the document element
More on ch02/final.xsl
- Literal result elements
- element creates elements
- attribute value is derived from the source using an absolute location path
- text element adds some text to the result without adding unintended whitespace
output Element
- Top-level element (zero or more)
- Controls output methods: xml, html, text, or a QName (qualified name)
- xhtml method available in XSLT 2.0
- Structural indentation with the indent attribute (yes or no)
- encoding attribute has values such as US-ASCII, UTF-8, or ISO-8859-1
XML Declaration & output
- omit-xml-declaration attribute (yes or no)
- version attribute sets version information
- standalone attribute sets standalone declaration (yes or no)
More on output
- media-type attribute sets the media type, as in text/xml or application/xml
- cdata-section-elements names the result tree elements whose text is output in CDATA sections
- doctype-system and doctype-public output document type declarations using SYSTEM or PUBLIC identifiers
- See examples in ch03
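Several of the output attributes covered on the last three slides can be seen together in one sketch. The element names (doc, code) and the DTD file name are hypothetical.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml"
      indent="yes"
      encoding="UTF-8"
      omit-xml-declaration="no"
      standalone="no"
      media-type="application/xml"
      cdata-section-elements="code"
      doctype-system="doc.dtd"/>

  <xsl:template match="/">
    <!-- The text of code is serialized inside a CDATA section -->
    <doc><code>if (a &lt; b) return;</code></doc>
  </xsl:template>
</xsl:stylesheet>
```

The serialized result carries an XML declaration, a document type declaration with a SYSTEM identifier, and the text of the code element wrapped in a CDATA section.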
Expressions
- Defined by XPath
- Expressions in select attributes
- Expressions can contain:
  - patterns (next slide)
  - boolean and relational logic (and, or, !=, >)
  - arithmetic (+, -, *, div, mod)
  - functions (name(), last(), count())
  - variable references ($var)
Patterns
- Defined by XSLT as a subset of expressions
- Patterns most often appear in match attributes
- Patterns often contain:
  - location steps (date/year)
  - / (root) or // (descendant) or | (union)
  - Node test (*, rng:*, comment(), text())
  - id() or key() functions
  - child:: or attribute:: axes
  - Predicate (state/name[.='Oregon'])
Axes
- Abbreviated and unabbreviated syntax
- Double colon separator (attribute::priority)
- Forward or reverse axes:
  - Forward: attribute, child, descendant, descendant-or-self, following, following-sibling, namespace, parent, self
  - Reverse: ancestor, ancestor-or-self, preceding, and preceding-sibling
- See examples in ch04
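The following sketch contrasts abbreviated and unabbreviated syntax and shows two reverse axes. The source document is hypothetical; it is not one of the ch04 files.

```xml
<!-- Hypothetical source document:
     <message priority="low"><body><line>one</line><line>two</line></body></message>
-->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <!-- Abbreviated syntax: selects the priority attribute -->
    <xsl:value-of select="message/@priority"/>
    <xsl:text>&#10;</xsl:text>
    <!-- The same path in unabbreviated syntax -->
    <xsl:value-of select="child::message/attribute::priority"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Reverse axis: the sibling just before the second line -->
    <xsl:value-of select="message/body/line[2]/preceding-sibling::line[1]"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Reverse axis: the name of the nearest ancestor element of a line -->
    <xsl:value-of select="name(message/body/line[1]/ancestor::*[1])"/>
  </xsl:template>
</xsl:stylesheet>
```

On a reverse axis, positions count backward from the context node, which is why ancestor::*[1] is the nearest ancestor (body) rather than the document element.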
Functions
- Part of expressions, so always called in attribute values that contain expressions (usually select)
- Node-set, number, boolean, string functions
- XPath function examples: position(), concat(), not(), sum()
- XSLT function examples: document(), current(), key(), format-number()
- See examples in ch05
Copying Nodes
- Shallow copy versus deep copy
- copy element performs a shallow copy: element nodes and namespace nodes
- copy-of element performs a deep copy: element and child nodes, attribute nodes, namespace nodes
- See examples in ch06
Variables & Parameters
- variable and param elements, both top-level or instruction elements
- Referenceable value ($var) with scope
- Can’t change a variable’s value
- Result tree fragment, or temporary tree in XSLT 2.0 (in 1.0, treating one as a node-set requires extensions)
- See examples in ch07
More on Variables & Parameters
- A param value can change and can have a default
- Can pass parameters to a stylesheet
- Can pass parameters to a template using with-param
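A sketch of variables and parameters working together. The source document, the parameter names (currency, label), and the values are all hypothetical. How a stylesheet parameter is supplied from outside depends on the processor.

```xml
<!-- Hypothetical source document:
     <order><item price="4"/><item price="6"/></order>
-->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Top-level param: a default that can be overridden from outside -->
  <xsl:param name="currency" select="'USD'"/>

  <!-- Top-level variable: bound once and never reassigned -->
  <xsl:variable name="total" select="sum(/order/item/@price)"/>

  <xsl:template match="order">
    <xsl:apply-templates select="item">
      <!-- Pass a value to the template that handles each item -->
      <xsl:with-param name="label" select="'Item'"/>
    </xsl:apply-templates>
    <xsl:value-of select="concat('Total: ', $total, ' ', $currency)"/>
  </xsl:template>

  <xsl:template match="item">
    <!-- Template param with a default, used when no with-param is passed -->
    <xsl:param name="label" select="'Line'"/>
    <xsl:value-of select="concat($label, ': ', @price, '&#10;')"/>
  </xsl:template>
</xsl:stylesheet>
```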
Sorting Nodes
- The sort element sorts nodes
- Child of apply-templates or for-each
- Ascending or descending
- Alphabetical or numerical
- If alphabetical, uppercase first or lowercase first
- See examples in ch08
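A minimal sketch of sort inside for-each, using a hypothetical list of states.

```xml
<!-- Hypothetical source document:
     <states><state>Oregon</state><state>Idaho</state><state>Utah</state></states>
-->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="states">
    <xsl:for-each select="state">
      <!-- sort must come first inside for-each; data-type="number"
           would sort numerically instead -->
      <xsl:sort select="." order="descending" data-type="text"
          case-order="upper-first"/>
      <xsl:value-of select="."/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```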
Numbering Nodes
- The number element inserts formatted numbers into the result tree, as for a list
- Numerical, alphabetical, Roman numerals
- Used with format-number() function
- Also used with decimal-format top-level element
- See examples in ch09
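A short sketch of the number element producing a numbered list, again with a hypothetical list of states.

```xml
<!-- Hypothetical source document:
     <states><state>Oregon</state><state>Idaho</state><state>Utah</state></states>
-->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="states">
    <xsl:apply-templates select="state"/>
  </xsl:template>

  <xsl:template match="state">
    <!-- Numbers each state by its position among its state siblings;
         format="a. " or format="i. " would give letters or Roman numerals -->
    <xsl:number format="1. "/>
    <xsl:value-of select="."/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```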
Named Templates
- A template element can have a name attribute
- Can have match and name attributes together, but if no match, must have name
- Can call templates by name with the call-template element, with parameters (with-param)
- Displaced in XSLT 2.0 by the function element
- See examples in ch10
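A minimal sketch of a named template called with a parameter. The template name bracket and the parameter name text are invented for illustration.

```xml
<!-- Hypothetical source document:
     <states><state>Oregon</state><state>Idaho</state></states>
-->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="state">
    <xsl:call-template name="bracket">
      <xsl:with-param name="text" select="."/>
    </xsl:call-template>
  </xsl:template>

  <!-- No match attribute, so this template runs only when called by name -->
  <xsl:template name="bracket">
    <xsl:param name="text"/>
    <xsl:value-of select="concat('[', $text, ']')"/>
  </xsl:template>
</xsl:stylesheet>
```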
Modes
- Normally, a node that matches a template can’t be processed in more than one way
- Overcome by modes
- Matching mode attribute on template and apply-templates
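A sketch of modes letting the same nodes be processed twice, once for a table of contents and once for the body. The source document and the mode name toc are hypothetical.

```xml
<!-- Hypothetical source document:
     <doc><section>Intro</section><section>Usage</section></doc>
-->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="doc">
    <xsl:text>Contents:&#10;</xsl:text>
    <!-- First pass over the sections, in toc mode -->
    <xsl:apply-templates select="section" mode="toc"/>
    <xsl:text>Body:&#10;</xsl:text>
    <!-- Second pass over the same sections, in the default mode -->
    <xsl:apply-templates select="section"/>
  </xsl:template>

  <xsl:template match="section" mode="toc">
    <xsl:value-of select="concat('  ', ., '&#10;')"/>
  </xsl:template>

  <xsl:template match="section">
    <xsl:value-of select="concat('Section: ', ., '&#10;')"/>
  </xsl:template>
</xsl:stylesheet>
```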
Keys
- Can match nodes based on a key value
- Define keys at the top level using the key element
- Later called using the key() function in patterns or expressions
- See examples in ch11
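A minimal sketch of defining a key and looking a node up through it, with the lookup value supplied as a parameter. The source document, the key name by-code, and the parameter name code are hypothetical.

```xml
<!-- Hypothetical source document:
     <states>
       <state code="OR">Oregon</state>
       <state code="ID">Idaho</state>
     </states>
-->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Lookup value, with a default -->
  <xsl:param name="code" select="'OR'"/>

  <!-- Index every state element by its code attribute -->
  <xsl:key name="by-code" match="state" use="@code"/>

  <xsl:template match="/">
    <!-- Retrieve the state whose code equals the parameter -->
    <xsl:value-of select="key('by-code', $code)"/>
  </xsl:template>
</xsl:stylesheet>
```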
Conditional Processing
- Test a single condition with an expression using an if instruction element
- Test multiple conditions with expressions using when instruction elements in choose (if-then statement)
- choose can contain an optional otherwise element (if-then-else statement)
- See examples in ch12
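A sketch of if and choose side by side. The source document and the output strings are made up for illustration.

```xml
<!-- Hypothetical source document: <message priority="low">Hello</message> -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="message">
    <!-- if: a single test, with no else branch -->
    <xsl:if test="@priority">
      <xsl:text>Priority: </xsl:text>
    </xsl:if>
    <!-- choose: the first when whose test is true wins;
         otherwise is the fallback -->
    <xsl:choose>
      <xsl:when test="@priority = 'high'">urgent</xsl:when>
      <xsl:when test="@priority = 'low'">whenever</xsl:when>
      <xsl:otherwise>unspecified</xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```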
Including Stylesheets
- Can include external stylesheet(s) with include element
- Include effectively merges stylesheets
- See examples in ch13
Importing Stylesheets
- Can also import external stylesheet(s) with import element
- Import builds import tree with precedence based on sequence of imports
- apply-imports can trump normal precedence
- See examples in ch13
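A sketch of import and apply-imports. It assumes a second stylesheet named base.xsl in the same directory that also contains a template matching message. Both the file name and that template are hypothetical, so this fragment does not run on its own.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- import elements must come before all other top-level elements -->
  <xsl:import href="base.xsl"/>
  <xsl:output method="text"/>

  <!-- This template has higher import precedence than the one in base.xsl -->
  <xsl:template match="message">
    <xsl:text>[</xsl:text>
    <!-- Invoke the overridden template from base.xsl for the current node -->
    <xsl:apply-imports/>
    <xsl:text>]</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Replacing import with include would merge the two stylesheets at the same precedence, and apply-imports would then have no imported rule to call.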
Alternative Stylesheets
- Literal result element stylesheet uses a literal result element as the document element
- Embedded stylesheets allow you to combine source tree and stylesheet into one document by using a fragment identifier with an id attribute
- See examples in ch14
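A minimal sketch of a literal result element stylesheet. The document element is html rather than stylesheet, and the version is given with the xsl:version attribute. The source document (a message element) is an assumption.

```xml
<!-- Hypothetical source document: <message>Hello</message> -->
<html xsl:version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <body>
    <!-- The whole document acts as a single template matching the root node -->
    <p><xsl:value-of select="/message"/></p>
  </body>
</html>
```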
Extensions
- XSLT 1.0 can’t satisfy everyone, so extensions are possible
- Can write namespace-qualified extension elements, functions, and attributes
- Can define your own extensions or use extensions already defined
- Can define fallback behavior using fallback and message elements
- See examples in ch15
XSLT 2.0 Highlights
- xhtml output method
- Multiple result trees (result-document)
- Regular expressions (analyze-string)
- Define stylesheet functions (function)
- Validation with XML Schema
- Format dates (date-format, format-date())
- Grouping (for-each-group)
- Character maps (character-map, output-character)
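As one example of these additions, grouping with for-each-group might look like the sketch below. It needs an XSLT 2.0 processor, and the source document is hypothetical.

```xml
<!-- Hypothetical source document:
     <cities>
       <city state="OR">Portland</city>
       <city state="ID">Boise</city>
       <city state="OR">Salem</city>
     </cities>
-->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="cities">
    <!-- One iteration per distinct value of the state attribute -->
    <xsl:for-each-group select="city" group-by="@state">
      <xsl:value-of select="current-grouping-key()"/>
      <xsl:text>: </xsl:text>
      <!-- All cities in the current group, joined by the separator -->
      <xsl:value-of select="current-group()" separator=", "/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each-group>
  </xsl:template>
</xsl:stylesheet>
```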
XPath 2.0 Highlights
- Many new functions (177 versus 27)
- Strongly typed, relying on XML Schema datatypes
- Kind tests (document-node(), element(), attribute())
- Casting (cast, castable)
- for and if expressions
- See examples in ch16
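A short sketch of for, if, and castable in XPath 2.0 expressions, embedded in an XSLT 2.0 stylesheet. The values are arbitrary, and an XSLT 2.0 processor is required.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <!-- for iterates over a sequence; if chooses a value for each item -->
    <xsl:value-of
        select="for $n in (1, 2, 3)
                return if ($n mod 2 = 0) then 'even' else 'odd'"
        separator=" "/>
    <xsl:text>&#10;</xsl:text>
    <!-- castable tests whether a cast would succeed before it is attempted -->
    <xsl:value-of
        select="if ('42' castable as xs:integer)
                then xs:integer('42') + 1
                else 0"/>
  </xsl:template>
</xsl:stylesheet>
```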
Ox Documentation Tool
- Syntax-related XSLT documentation on the command line
- XML vocabulary items, too
- Example:
    java -jar ox.jar xsl:text
- See: http://www.wyeast.net/ox.html
So Long and Thanks!
Mike Fitzgerald
Wy’east Communications