XPath and XSLT
CSE3201/CSE4500
Information Retrieval Systems
2
Manipulating XML Documents
[Diagram: parser → data → Applications]
3
What is XSL
• Extensible Stylesheet Language
• Developed by W3C XSL Working Group
• Motivation: to handle the manipulation and presentation of XML documents
• Consists of: XSLT and XSL-FO
4
XSL
[Diagram: transformation process. The stylesheet processor takes an XML document and an XSL document and produces a presentation document.]
5
Transformation Tools
• XPath
• XSL (Extensible Stylesheet Language)
  – XSLT (XSL Transformations)
  – XSL-FO (XSL Formatting Objects)
6
Transformation Process
7
XSLT Processing
• Types of processing:
  – Change of vocabulary
  – Reorder data elements
  – Combine data elements
  – Filter and exclude data elements
• Output
  – Other XML vocabularies or fragments
  – Non-XML formats
• Uses
  – Display and printing
  – Transformation of data
8
XPath
• A locator for items in an XML document.
• An XPath expression gives the direction of navigation in an XML document.
• Treats an XML document as a “tree”.
• Any part of a document, e.g. an element or an attribute, is considered a “node”.
• Version covered here: XPath 1.0 (the version used by XSLT 1.0).
9
XPath
• Syntax (full form): axis::node-test[predicate]
• Axis
  – describes the relationship between nodes, e.g. child, parent, etc.
• Node test
  – condition for selecting nodes along the axis.
• Predicate
  – further refines the set of nodes resulting from the node test.
10
XPath Axes
[Diagram: axes relative to the context node: ancestor, parent, sibling, child, descendant, attribute]
11
Node Test
• A node test identifies the nodes in the document that meet the criteria of the test.
• The simplest node test matches nodes by element name.
• Examples:
  – child::book => selects every child element named “book”.
  – child::author => selects every child element named “author”.
12
Predicate
• A predicate further refines or filters the node-set produced by the node test.
• Examples:
  – Find the third book in the list
    • child::book[position()=3]
  – Find all the books that have an <isbn> element
    • child::book[isbn]
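The two predicates above can be tried out with a short sketch in Python's standard xml.etree.ElementTree module. This is only an approximation: ElementTree implements a small subset of the abbreviated XPath syntax, so the full forms child::book[position()=3] and child::book[isbn] are written here as book[3] and book[isbn]. The catalog document is hypothetical sample data, not taken from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, not part of the original slides.
catalog = ET.fromstring("""
<catalog>
  <book><title>A</title><isbn>111</isbn></book>
  <book><title>B</title></book>
  <book><title>C</title><isbn>333</isbn></book>
</catalog>
""")

# book[3] is the abbreviated form of child::book[position()=3].
third = catalog.find("book[3]")
third_title = third.find("title").text

# book[isbn] keeps only the <book> children that have an <isbn> child.
with_isbn = [b.find("title").text for b in catalog.findall("book[isbn]")]

print(third_title)  # C
print(with_isbn)    # ['A', 'C']
```

Both predicates are evaluated against the node-set already produced by the node test book, which is why the second book (no <isbn>) drops out of the second result.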
13
Abbreviations
Formal                       Short      Description
child::book                  book       Select all <book> element children of the context node.
child::*                     *          Select all element children of the context node.
self::node()                 .          Select the context node.
parent::node()               ..         Select the parent of the context node.
child::book[position()=1]    book[1]    Select the first <book> child element of the context node.
attribute::*                 @*         Select all the attributes of the context node.
attribute::number            @number    Select the number attribute of the context node.
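As a rough illustration of the short forms, the sketch below uses Python's standard xml.etree.ElementTree, which accepts only a subset of the abbreviated syntax. The bookshop document is hypothetical. ElementTree has no path step that returns attribute nodes, so the attribute is read with get() instead of @number; [@number] is still available as a predicate.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, not part of the original slides.
shop = ET.fromstring("""
<bookshop>
  <book number="1"><title>First</title></book>
  <book number="2"><title>Second</title></book>
  <magazine><title>Third</title></magazine>
</bookshop>
""")

# "*" (child::*): every element child of the context node.
children = [e.tag for e in shop.findall("*")]

# "book[1]" (child::book[position()=1]): the first <book> child.
first = shop.find("book[1]")

# "." (self::node()): the context node itself.
same = shop.find(".")

# ".." (parent::node()): here, the parent of each <title> inside a <book>.
parents = [p.tag for p in shop.findall("book/title/..")]

# Attribute access: ElementTree reads attributes with get(), not "@number".
first_number = first.get("number")

print(children)      # ['book', 'book', 'magazine']
print(same is shop)  # True
print(parents)       # ['book', 'book']
print(first_number)  # 1
```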
14
Location Path
• Uses “/” to build a path, e.g. /name/first
[Diagram: document root → <name> → <first> “John”, <middle> “Little”, <last> “Howard”]
15
Relative vs Absolute Path
• Absolute Path
  – the full path needs to be included, starting from the root node.
    • e.g. /name/first
• Relative Path
  – the path is declared starting from the current context node.
    • e.g. assume our current context is “name”; the XPath expression for the node first => first
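The effect of the context node on a relative path can be sketched with Python's standard xml.etree.ElementTree, which evaluates every path relative to the element it is called on. The document below reuses the <name> example from the location path slide; only the relative form is shown.

```python
import xml.etree.ElementTree as ET

name = ET.fromstring(
    "<name><first>John</first><middle>Little</middle><last>Howard</last></name>"
)

# Context node is <name>: the relative path "first" reaches its <first> child.
first = name.find("first")

# Change the context node to <first>: the same relative path now finds
# nothing, because <first> has no <first> child of its own.
nothing = first.find("first")

print(first.text)  # John
print(nothing)     # None
```

The same expression gives different results from different context nodes, which is the difference from an absolute path such as /name/first.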
16
Recursive Descent Operator
• Locates nodes based on their names, regardless of their position in the document.
• Uses “//”
• Example: //first
  – Selects any <first> element in the document (regardless of how far down the tree).
• Can decrease the performance of the stylesheet.
  – The entire document must be searched by the XSLT processor.
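A small sketch of the difference between a plain child path and recursive descent, using Python's standard xml.etree.ElementTree. ElementTree requires the descent to start from an element, so //first is written as .//first. The people document is hypothetical sample data.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: one <name> at the top level, one nested deeper.
doc = ET.fromstring("""
<people>
  <name><first>John</first><last>Howard</last></name>
  <group>
    <name><first>Jane</first><last>Doe</last></name>
  </group>
</people>
""")

# A plain path only follows the steps it names: people/name/first.
direct = [e.text for e in doc.findall("name/first")]

# Recursive descent visits every level below the context node.
anywhere = [e.text for e in doc.findall(".//first")]

print(direct)    # ['John']
print(anywhere)  # ['John', 'Jane']
```

The second query has to walk the whole subtree to be sure no <first> is missed, which is where the performance cost mentioned above comes from.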
17
Filtering Nodes
• Filtering is done using XPath’s predicate.
  – the “[ ]” symbol.
• Using an element as a filter:
  – book[price] matches any <book> element that has a <price> child element.
• Using an attribute as a filter:
  – book[@id] matches any <book> element that has an id attribute.
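Both filters are within the XPath subset supported by Python's standard xml.etree.ElementTree, so they can be checked directly. The bookshop document and its id values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, not part of the original slides.
shop = ET.fromstring("""
<bookshop>
  <book id="b1"><title>Priced</title><price>16.95</price></book>
  <book><title>Unpriced</title></book>
  <book id="b3"><title>Tagged</title></book>
</bookshop>
""")

# Element filter: only books that contain a <price> child.
priced = [b.find("title").text for b in shop.findall("book[price]")]

# Attribute filter: only books that carry an id attribute.
with_id = [b.find("title").text for b in shop.findall("book[@id]")]

print(priced)   # ['Priced']
print(with_id)  # ['Priced', 'Tagged']
```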
18
XPath Expression
• Some possible operators to build an XPath expression:
  and      Logical AND
  or       Logical OR
  not()    Logical negation
  =        Equal
  !=       Not equal
  <        Less than
  <=       Less than or equal
  >        Greater than
  >=       Greater than or equal
  |        Union
19
XPath Expression - Examples
• <xsl:template match="/">
• <xsl:if test="not(position()=last())">
20
Usage of XPath in XSLT
• XSLT uses XPath expressions to:
  – Match node sets in order to execute templates.
  – Evaluate node sets to control execution of conditional XSLT elements.
  – Select node sets to change the current context and direct the flow of execution through the source document.
  – Select node sets to obtain an output value.
(Professional XML, page 379)
21
XPath Function
• XPath functions can be used to:
  – manipulate node sets
    • e.g. count, last, name, position
  – manipulate strings
    • e.g. concat, substring, contains
  – test boolean values
    • e.g. lang, false, true
  – perform numeric operations
    • e.g. ceiling, floor, number, round, sum
  – perform XSLT-specific manipulation
    • e.g. current
22
XPath Function - Examples
• <xsl:if test="not(position()=last())">
• substring('abcde',2,3) => returns 'bcd'
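The substring() result follows from XPath counting characters from 1, with the third argument being a length rather than an end position. The Python helper below, xpath_substring, is a hypothetical name used only to show that arithmetic. It assumes an integer start of at least 1 and a non-negative length, and does not model XPath's rounding or NaN rules.

```python
def xpath_substring(text: str, start: int, length: int) -> str:
    """Mimic XPath substring() for integer start >= 1 and length >= 0."""
    # XPath counts characters from 1, Python from 0, so shift the start by one.
    begin = start - 1
    # The third argument is a length, so the slice ends at begin + length.
    return text[begin:begin + length]


result = xpath_substring("abcde", 2, 3)
print(result)  # bcd
```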
23
Structure of Stylesheet
• An XSLT stylesheet is an XML document.
• The root element is the stylesheet element:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
…
</xsl:stylesheet>
• Consists of a set of rules.
• Rules are made up of patterns and templates.
24
Attaching an XSL to an XML doc
<?xml-stylesheet type="text/xsl" href="books.xsl"?>
• href refers to the filename of the XSL document.
25
Example of a Stylesheet
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="html"/>
  <xsl:template match="/">
    <html>
      <body>
        <h1>Book</h1>
        <xsl:value-of select="bookshop/book/title"/>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>

<bookshop>
  <book>
    <title> Harry Potter and the Sorcerer stone </title>
    <author>
      <initials>J.K</initials>
      <surname> Rowling</surname>
    </author>
    <price value="$16.95"></price>
  </book>
  …
</bookshop>
26
Selecting Output Type
• Possible outputs:– XML, HTML, Text
• Syntax:
<xsl:output method="xml"/>
<xsl:output method="text"/>
<xsl:output method="html"/>
27
Templates
• To create a template, we need:
  – To declare the location in the source tree where the template will be applied.
  – To give the rules to be applied when a node matches.
    • these can invoke another template
• The location is declared using an XPath expression.
28
Using Templates
• Templates are called using <xsl:apply-templates>.
• <xsl:apply-templates select="node-set-expression"> </xsl:apply-templates>
• The “select” attribute is optional.
• Without the “select” attribute, the XSL processor applies the templates to all the child nodes of the current context node.
29
Template Examples
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="html"/>
  <xsl:template match="/bookshop/book">
    <html>
      <body>
        <h1>Book</h1>
        <xsl:apply-templates/>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
30
Selecting Templates
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="html"/>
<xsl:template match="/">
  <html>
    <body>
      <h1>Monash Bookshop</h1>
      <xsl:apply-templates select="bookshop/book"/>
    </body>
  </html>
</xsl:template>
31
Selecting Templates- cont’d
<xsl:template match="book" >
<xsl:apply-templates select="author"/>
</xsl:template>
<xsl:template match="author">
<h2>Author</h2>
<xsl:value-of select="."/>
</xsl:template>
</xsl:stylesheet>
32
Getting the Value of a Node
<xsl:value-of select="XPath expression"/>
Example:
<xsl:template match="bookshop/book">
  <p><xsl:value-of select="title"/></p>
</xsl:template>
33
Conditional Test
• xsl:if
  – There is no “else” statement.
  – Takes one attribute, test, which is an XPath expression.
  – If it evaluates to true, the body of the element is executed.
• Example:
  – <xsl:if test="@id"> … </xsl:if>
34
Iteration
• <xsl:for-each>
<xsl:template match="/">
  <html>
    <body>
      <h1>Book</h1>
      <xsl:for-each select="/bookshop/book">
        <p><xsl:value-of select="title"/></p>
      </xsl:for-each>
    </body>
  </html>
</xsl:template>

Output:
<html>
  <body>
    <h1>Book</h1>
    <p> Harry Potter and the Sorcerer Stone</p>
    <p> Harry Potter</p>
  </body>
</html>
35
Making Copies
• xsl:copy
  – Copies the context node only; it does not copy any child nodes or attributes that the context node may have.
• xsl:copy-of
  – Copies the selected nodes in full, including their attributes and descendants.

<xsl:template match="/bookshop">
  <html>
    <body>
      <h1>Book</h1>
      <xsl:copy/>
    </body>
  </html>
</xsl:template>

Output:
<html>
  <body>
    <h1>Book</h1>
    <bookshop></bookshop>
  </body>
</html>
36
Copy-of
Output:
<?xml version="1.0" encoding="utf-8"?>
<Author_List>
  <author>
    <initials>JK</initials>
    <surname> Rowling</surname>
  </author>
  <author>
    <initials>J</initials>
    <surname> Rowling</surname>
  </author>
</Author_List>

Stylesheet:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml"/>
  <xsl:template match="/">
    <xsl:element name="Author_List">
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>
  <xsl:template match="bookshop/book">
    <xsl:copy-of select="author"/>
  </xsl:template>
</xsl:stylesheet>