XPath




Why XPath?

• Common syntax and semantics for [XSLT] and [XPointer]

• Used to address parts of an XML document

• Provides basic facilities for manipulation of strings, numbers and booleans

• Uses a compact, non-XML syntax

• Gets its name from its use of a path notation, as in URLs, for navigating through the hierarchical structure of an XML document


XPath Syntax

• XPath has a natural subset used for matching
  – Testing whether or not a node matches a pattern
  – This use of XPath is described in XSLT

• XPath models an XML document as a tree of nodes

• XPath defines a way to compute a string-value for each type of node

• The primary syntactic construct in XPath is the expression (see the Expr production rule)



XPath Syntax

• An XPath expression is evaluated to yield an object

• The four basic types of objects:
  – node-set (an unordered collection of nodes without duplicates)
  – boolean (true or false)
  – number (a floating-point number)
  – string (a sequence of characters)

• Expression evaluation occurs with respect to a context


Location Paths

• One important kind of expression is a location path

• Selects a set of nodes relative to the context node

• The result of evaluating a location path is the node-set containing the nodes selected by the location path


Location Paths

• 2 kinds of location path:
  – Relative location paths
  – Absolute location paths



Relative Location Paths

• A relative location path consists of a sequence of one or more location steps separated by /

• Each step in turn selects a set of nodes relative to a context node

• An initial sequence of steps is composed together with a following step as follows

• The initial sequence of steps selects a set of nodes relative to a context node


Relative Location Paths

• Each node in that set is used as a context node for the following step; the sets of nodes identified by that step are unioned together

• For example, child::div/child::para selects the para element children of the div element children of the context node
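The composition rule above can be sketched in a few lines of Python. This is only an illustration, not how any XPath engine is implemented: the helper name child_step and the sample document are made up for the example, and Python's standard xml.etree.ElementTree is used only as a convenient element tree.

```python
import xml.etree.ElementTree as ET

def child_step(context_nodes, name):
    """Apply the step child::name to every context node and union the results."""
    selected = []
    for node in context_nodes:
        for child in node:  # iterating an Element yields its element children
            if child.tag == name and child not in selected:
                selected.append(child)
    return selected

# Hypothetical sample document
doc = ET.fromstring(
    "<root><div><para>a</para><para>b</para></div>"
    "<div><para>c</para><note>n</note></div><para>d</para></root>"
)

# child::div/child::para, with the root element as the context node
divs = child_step([doc], "div")     # initial sequence of steps
paras = child_step(divs, "para")    # each div is a context node for the next step
print([p.text for p in paras])      # ['a', 'b', 'c']
```

The para element with text d is not selected because it is a child of the context node itself, not of one of its div children.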


Absolute Location Paths

• Consists of / optionally followed by a relative location path

• / by itself selects the root node of the document containing the context node

• If / is followed by a relative location path, the location path selects the set of nodes that would be selected by the relative location path relative to the root node of the document containing the context node


Location Step

• A location step has three parts:
  – An axis, specifying the tree relationship between the nodes selected by the location step and the context node
  – A node test, specifying the node type and name of the nodes selected by the location step
  – Zero or more predicates, which use arbitrary expressions to further refine the set of nodes selected by the location step


Location Step - Syntax

• Syntax:
  – Axis name and node test separated by a double colon, followed by zero or more expressions, each in square brackets

• Example: child::para[position()=1]
  – child = name of the axis
  – para = node test
  – [position()=1] = predicate


Location Step - Syntax

• The node-set selected by the location step is the node-set that results from generating an initial node-set from the axis and node test, and then filtering that node-set by each of the predicates in turn
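As a rough sketch of that two-phase evaluation, the Python below generates the initial node-set from an axis and a node test and then applies each predicate in turn. The names evaluate_step and child_axis and the sample document are assumptions made for this example, predicates are modelled as plain Python callables, and only a forward axis is covered.

```python
import xml.etree.ElementTree as ET

def child_axis(node):
    """child axis: the element children of the context node, in document order."""
    return list(node)

def evaluate_step(context_node, axis, node_test, predicates=()):
    # 1. Generate the initial node-set from the axis and the node test.
    nodes = [n for n in axis(context_node) if node_test(n)]
    # 2. Filter by each predicate in turn; positions are renumbered each time.
    for predicate in predicates:
        size = len(nodes)
        nodes = [n for pos, n in enumerate(nodes, start=1) if predicate(n, pos, size)]
    return nodes

# Hypothetical sample document
doc = ET.fromstring("<doc><para>one</para><note/><para>two</para><para>three</para></doc>")

def is_para(node):
    return node.tag == "para"

# child::para[position()=1]
first = evaluate_step(doc, child_axis, is_para, [lambda n, pos, size: pos == 1])
# child::para[position()=last()]
last = evaluate_step(doc, child_axis, is_para, [lambda n, pos, size: pos == size])
print([n.text for n in first], [n.text for n in last])  # ['one'] ['three']
```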


Axes

• An axis specifies the tree relationship between the nodes selected by the location step and the context node

Different types of Axes:

• child axis contains the children of the context node

• descendant axis contains the descendants of the context node
  – descendant axis never contains attribute or namespace nodes


Axes

Different types of Axes:

• parent axis contains the parent of the context node, if there is one

• ancestor axis contains the ancestors of the context node
  – ancestor axis will always include the root node, unless the context node is the root node


Axes

Different types of Axes:

• following-sibling axis contains all the following siblings of the context node
  – If the context node is an attribute node or namespace node, the following-sibling axis is empty

• preceding-sibling axis contains all the preceding siblings of the context node
  – If the context node is an attribute node or namespace node, the preceding-sibling axis is empty


Axes

Different types of Axes:

• following axis contains all nodes in the same document as the context node that are after the context node in document order
  – Excludes any descendants, attribute nodes and namespace nodes

• preceding axis contains all nodes in the same document as the context node that are before the context node in document order
  – Excludes any ancestors, attribute nodes and namespace nodes


Axes

Different types of Axes:

• attribute axis contains the attributes of the context node
  – Axis will be empty unless the context node is an element

• namespace axis contains the namespace nodes of the context node
  – Axis will be empty unless the context node is an element

• self axis contains just the context node itself


Axes

Different types of Axes:

• descendant-or-self axis contains the context node and the descendants of the context node

• ancestor-or-self axis contains the context node and the ancestors of the context node
  – ancestor-or-self axis will always include the root node


Axes

• The ancestor, descendant, following, preceding and self axes partition a document (ignoring attribute and namespace nodes): they do not overlap and together they contain all the nodes in the document
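The partition property can be checked with a small Python sketch. It is a simplification under stated assumptions: the function name five_axes and the sample document are invented for the example, and because xml.etree.ElementTree exposes only element nodes, the root node, text, comment and processing instruction nodes are left out.

```python
import xml.etree.ElementTree as ET

def five_axes(document_element, context):
    """Compute the ancestor, descendant, following, preceding and self axes (elements only)."""
    order = list(document_element.iter())            # elements in document order
    parent = {child: p for p in order for child in p}
    ancestors = []
    node = context
    while node in parent:
        node = parent[node]
        ancestors.append(node)
    descendants = list(context.iter())[1:]           # iter() yields the context itself first
    index = order.index(context)
    preceding = [n for n in order[:index] if n not in ancestors]
    following = [n for n in order[index + 1:] if n not in descendants]
    return {
        "self": [context],
        "ancestor": ancestors,
        "descendant": descendants,
        "preceding": preceding,
        "following": following,
    }

# Hypothetical sample document; c is the context node
doc = ET.fromstring("<doc><a><b/><c><d/></c></a><e><f/></e></doc>")
axes = five_axes(doc, doc.find(".//c"))
for name, nodes in axes.items():
    print(name, [n.tag for n in nodes])

# Every element appears in exactly one of the five axes.
covered = [n.tag for nodes in axes.values() for n in nodes]
print(sorted(covered) == sorted(n.tag for n in doc.iter()))
```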


Node Tests

• Every axis has a principal node type

• If an axis can contain elements, then the principal node type is element

• For the attribute axis, the principal node type is attribute

• For the namespace axis, the principal node type is namespace

• For other axes, the principal node type is element


Node Tests

• A node test that is a name is true if and only if the type of the node is the principal node type and the node has that name
  – For example, child::para selects the para element children of the context node

• node test * is true for any node of the principal node type
  – For example, child::* will select all element children of the context node

• node test text() is true for any text node
  – For example, child::text() will select the text node children of the context node


Node Tests

• node test node() is true for any node of any type whatsoever
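The node tests can be mimicked on the child axis with Python's standard xml.dom.minidom, which keeps text and comment nodes as separate child nodes. This is an illustrative sketch only: the helper names and the sample markup are assumptions for the example, and each node test is modelled as a Python function.

```python
from xml.dom import minidom, Node

def children(node, node_test):
    """child axis filtered by a node test."""
    return [child for child in node.childNodes if node_test(child)]

def name_test(name):
    # child::name: principal node type (element) with a matching name
    return lambda n: n.nodeType == Node.ELEMENT_NODE and n.tagName == name

def any_element(n):
    # child::* on the child axis, whose principal node type is element
    return n.nodeType == Node.ELEMENT_NODE

def text_test(n):
    # child::text()
    return n.nodeType == Node.TEXT_NODE

def any_node(n):
    # child::node()
    return True

# Hypothetical sample markup
dom = minidom.parseString("<div>intro<para>p1</para><!-- note --><para>p2</para>tail</div>")
div = dom.documentElement

print(len(children(div, name_test("para"))))          # 2
print(len(children(div, any_element)))                # 2
print([t.data for t in children(div, text_test)])     # ['intro', 'tail']
print(len(children(div, any_node)))                   # 5
```

The comment is selected only by node(), since it is neither an element nor a text node.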


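Predicates

• A predicate filters a node-set with respect to an axis to produce a new node-set

• For each node in the node-set to be filtered, the predicate is evaluated with that node as the context node, with the number of nodes in the node-set as the context size, and with the proximity position of the node in the node-set with respect to the axis as the context position

• The predicate is evaluated by evaluating its expression and converting the result to a boolean; if the result is a number, it is true if and only if the number equals the context position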
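A consequence of re-evaluating context position and size for every predicate is that the order of predicates matters. The sketch below illustrates this with para[@type="warning"][5] versus para[5][@type="warning"]. It is an assumption-laden illustration: apply_predicates and the sample data are made up, predicates are Python callables, a numeric result is treated as a test against the context position (as XPath 1.0 does), and only a forward axis is modelled.

```python
import xml.etree.ElementTree as ET

def apply_predicates(nodes, predicates):
    for predicate in predicates:
        size = len(nodes)                                   # context size
        kept = []
        for position, node in enumerate(nodes, start=1):   # context position
            result = predicate(node, position, size)
            if isinstance(result, bool):
                keep = result
            elif isinstance(result, (int, float)):
                keep = (result == position)                 # [5] means [position()=5]
            else:
                keep = bool(result)
            if keep:
                kept.append(node)
        nodes = kept
    return nodes

# Hypothetical data: seven para children with ids 1..7
types = ["warning", "info", "warning", "warning", "warning", "info", "warning"]
doc = ET.Element("doc")
for i, t in enumerate(types, start=1):
    ET.SubElement(doc, "para", {"id": str(i), "type": t})
paras = doc.findall("para")

def is_warning(node, position, size):
    return node.get("type") == "warning"

def fifth(node, position, size):
    return 5

a = apply_predicates(paras, [is_warning, fifth])   # para[@type="warning"][5]
b = apply_predicates(paras, [fifth, is_warning])   # para[5][@type="warning"]
print([n.get("id") for n in a])  # ['7']
print([n.get("id") for n in b])  # ['5']
```

The first form picks the fifth of the warning paras; the second picks the fifth para child and keeps it only if it is a warning.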


Abbreviated Syntax

• child:: is the default axis and can be omitted from a location step

• attribute:: can be abbreviated to @

• // is short for /descendant-or-self::node()/

• . is short for self::node()

• .. is short for parent::node()


Location Paths - Exercises

Explanations of each of these?

• child::para, child::*, child::text(), child::node()

• attribute::name, descendant::para, ancestor::div

• ancestor-or-self::div

• descendant-or-self::para

• child::chapter/descendant::para

• /descendant::olist/child::item

• child::para[position()=last()]

• following-sibling::chapter[position()=1]

• /child::doc/child::chapter[position()=5]/child::section[position()=2]


Location Paths – Exercises (abbreviated syntax)

Syntax of each of these?

• Select the para element children of the context node

• Select all element children of the context node

• Select all text node children of the context node

• Select the name attribute of the context node

• Select all the attributes of the context node

• Select the last para child of the context node

• Select all para grandchildren of the context node


Location Paths – Exercise Solutions (abbreviated syntax)

• para

• *

• text()

• @name

• @*

• para[last()]

• */para
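Several of these abbreviated paths can be tried directly in Python, whose standard xml.etree.ElementTree accepts a limited subset of XPath. The sketch below is a hedged illustration on a made-up document: ElementTree returns elements only, so the text() and attribute solutions are approximated with the element API instead of a path.

```python
import xml.etree.ElementTree as ET

# Hypothetical context node
ctx = ET.fromstring(
    '<chapter name="intro">lead text'
    "<para>p1</para><note><para>n1</para></note><para>p2</para>"
    "</chapter>"
)

paras = ctx.findall("para")               # para
elements = ctx.findall("*")               # *
last_para = ctx.findall("para[last()]")   # para[last()]
grandchildren = ctx.findall("*/para")     # */para

# Not expressible as ElementTree paths; approximated with the element API:
name = ctx.get("name")                    # @name
attributes = dict(ctx.attrib)             # @*
lead_text = ctx.text                      # only the text before the first child element

print([p.text for p in paras])            # ['p1', 'p2']
print([e.tag for e in elements])          # ['para', 'note', 'para']
print([p.text for p in last_para])        # ['p2']
print([p.text for p in grandchildren])    # ['n1']
print(name, attributes, lead_text)
```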


Location Paths – Exercises (abbreviated syntax)

Syntax of each of the following?

• Select the second section of the fifth chapter of the doc

• Select the para element descendants of the chapter element children of the context node

• Select all the para descendants of the document root and thus selects all para elements in the same document as the context node

• Select the context node


Location Paths – Exercise Solutions (abbreviated syntax)

Explanations of each of these?

• /doc/chapter[5]/section[2]

• chapter//para

• //para

• .


Location Paths – Exercises (abbreviated syntax)

Explanations of each of these?

• .//para

• ..

• //para

• .


Location Paths – Exercise Solutions (abbreviated syntax)

• .//para selects the para element descendants of the context node

• .. selects the parent of the context node

• //para selects all the para descendants of the document root and thus selects all para elements in the same document as the context node

• . selects the context node
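These four paths can be approximated with Python's xml.etree.ElementTree, as a sketch on a made-up document. Two caveats are assumptions of this illustration: ElementTree has no root node above the document element, so //para is written as .//para from the document element, and .. only works for elements below the element the search starts from.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document
doc = ET.fromstring(
    "<doc><chapter><para>c1</para><section><para>s1</para></section></chapter>"
    "<para>top</para></doc>"
)
chapter = doc.find("chapter")

below_chapter = chapter.findall(".//para")    # .//para with chapter as the context node
everywhere = doc.findall(".//para")           # //para, relative to the document element
itself = chapter.findall(".")                 # .
section_parents = doc.findall(".//section/..")  # .. applied to each section

print([p.text for p in below_chapter])        # ['c1', 's1']
print([p.text for p in everywhere])           # ['c1', 's1', 'top']
print(itself == [chapter])                    # True
print([e.tag for e in section_parents])       # ['chapter']
```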


Location Paths – Exercises (abbreviated syntax)

Syntax of the following:

• Select the lang attribute of the parent of the context node

• Select the fifth para child of the context node that has a type attribute with value warning

• Select all the employee children of the context node that have both a secretary attribute and an assistant attribute


Locn Paths – Exer. Soln. (abbr.)

• ../@lang

• para[@type="warning"][5]

• employee[@secretary and @assistant]
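For readers who want to experiment, the first and last of these solutions can be approximated with Python's xml.etree.ElementTree. This is a sketch on invented data: ElementTree's path subset has no "and" operator and cannot return attribute nodes, so the example uses chained attribute predicates and a parent map as stand-ins.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document
staff = ET.fromstring(
    '<staff lang="en">'
    '<employee name="ann" secretary="s1" assistant="a1"/>'
    '<employee name="bob" secretary="s2"/>'
    '<employee name="cyd" assistant="a3"/>'
    "</staff>"
)

# employee[@secretary and @assistant], written as two chained predicates,
# which select the same nodes
both = staff.findall("employee[@secretary][@assistant]")

# ../@lang with the first matching employee as the context node:
# look up the parent in a parent map, then read its lang attribute
parent_of = {child: parent for parent in staff.iter() for child in parent}
context = both[0]
lang = parent_of[context].get("lang")

print([e.get("name") for e in both])  # ['ann']
print(lang)                           # en
```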


Expressions

• Refer to expressions syntax tree

• Refer to XPath Expressions Handbook


XPath Function Library

• Refer to XPath Function Library Handbook


Data Model

• XPath operates on an XML document as a tree

• The data model is conceptual and does not mandate any particular implementation

• 7 types of nodes:
  – Root Nodes
  – Element Nodes
  – Text Nodes
  – Attribute Nodes
  – Namespace Nodes
  – Processing Instruction Nodes
  – Comment Nodes


Data Model: Document Order

• The root node will be the first node

• Element nodes occur before their children

• Document order orders element nodes in order of the occurrence of their start-tag

• Attribute and namespace nodes of an element occur before the children of the element



Data Model: Document Order

• Namespace nodes are defined to occur before the attribute nodes

• The relative order of namespace nodes is implementation-dependent, as is the relative order of attribute nodes

• Root nodes and element nodes have an ordered list of child nodes

• Every node other than the root node has exactly one parent, which is either an element node or the root node
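For element nodes, document order is the order of the start-tags, which is what a pre-order walk of the tree produces. The Python sketch below illustrates this with xml.etree.ElementTree, whose iter() method walks elements in document order. The sample document is made up, and attribute, namespace and text nodes are not modelled.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document
doc = ET.fromstring("<a><b><c/></b><d/><e><f/><g/></e></a>")

# Elements in document order: parents before their children,
# siblings in order of their start-tags.
order = [el.tag for el in doc.iter()]
print(order)  # ['a', 'b', 'c', 'd', 'e', 'f', 'g']

# A node-set is unordered, but it can always be put into document order.
position = {el: i for i, el in enumerate(doc.iter())}
picked = [doc.find("e/g"), doc.find("b"), doc.find("d")]
in_document_order = sorted(picked, key=position.__getitem__)
print([el.tag for el in in_document_order])  # ['b', 'd', 'g']
```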


Data Model: Root Node

• Root node is the root of the tree

• Element node for the document element is a child of the root node

• Root node also has as children processing instruction and comment nodes for processing instructions and comments that occur in the prolog and after the end of the document element

• String-value of the root node is the concatenation of the string-values of all text node descendants of the root node in document order


Data Model: Element Node

• There is an element node for every element in the document

• Children of an element node are:
  – Element nodes
  – Comment nodes
  – Processing Instruction nodes
  – Text nodes

• String-value of an element node is the concatenation of the string-values of all text node descendants of the element node in document order
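The string-value rule for elements maps closely onto itertext() in Python's xml.etree.ElementTree, which yields the text inside an element and its subelements in document order. The function name string_value and the sample markup below are assumptions made for this short sketch.

```python
import xml.etree.ElementTree as ET

def string_value(element):
    """Concatenate all descendant text of an element, in document order."""
    return "".join(element.itertext())

# Hypothetical sample markup
para = ET.fromstring(
    "<para>XPath <em>models</em> a document as a <b>tree of <i>nodes</i></b>.</para>"
)

print(string_value(para))            # XPath models a document as a tree of nodes.
print(string_value(para.find("b")))  # tree of nodes
```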


Data Model: Attribute

• Each element node has an associated set of attribute nodes

• Element is the parent of each of these attribute nodes

• An element has attribute nodes only for attributes that were explicitly specified in the start-tag or empty-element tag of that element or that were explicitly declared in the DTD with a default value


Data Model: Attribute

• Some attributes, such as xml:lang and xml:space, apply to all elements that are descendants of the element bearing the attribute, unless overridden with an instance of the same attribute on another descendant element

• An attribute node has a string-value


Data Model: Namespace Nodes

• Each element has an associated set of namespace nodes, one for each distinct namespace prefix that is in scope for the element

• Element is the parent of each of these namespace nodes


Data Model: Namespace Nodes

An element will have a namespace node:

• For every attribute on the element whose name starts with xmlns:;

• For every attribute on an ancestor element whose name starts xmlns: unless the element itself or a nearer ancestor redeclares the prefix;

• For an xmlns attribute, if the element or some ancestor has an xmlns attribute, and the value of the xmlns attribute for the nearest such element is non-empty


Data Model: Namespace Nodes

• The string-value of a namespace node is the namespace URI that is being bound to the namespace prefix
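The scoping rules for namespace nodes can be sketched by tracking namespace declarations while parsing. The Python below uses the start-ns and end-ns events of xml.etree.ElementTree.iterparse. The function name in_scope_namespaces and the sample document are assumptions for the example, and the implicitly declared xml prefix is left out.

```python
import io
import xml.etree.ElementTree as ET

def in_scope_namespaces(xml_text):
    """Return, for each element in start-tag order, its in-scope {prefix: uri} bindings."""
    open_declarations = []   # (prefix, uri) pairs currently in scope, innermost last
    per_element = []
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=("start-ns", "end-ns", "start")):
        if event == "start-ns":
            open_declarations.append(payload)
        elif event == "end-ns":
            open_declarations.pop()
        else:
            bindings = {}
            for prefix, uri in open_declarations:
                bindings[prefix] = uri           # a nearer declaration overrides
            # an empty URI (xmlns="") leaves no default namespace in scope
            per_element.append({p: u for p, u in bindings.items() if u})
    return per_element

# Hypothetical sample document; the default namespace uses the empty prefix ''
SAMPLE = (
    '<doc xmlns:a="urn:a" xmlns="urn:default">'
    '<a:item xmlns:a="urn:a2"><leaf/></a:item><other/></doc>'
)
scopes = in_scope_namespaces(SAMPLE)
for bindings in scopes:
    print(bindings)
```

Each key in a printed dictionary corresponds to one namespace node of that element, and the value is the string-value of that node.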


Data Model: Processing Instruction

• There is a processing instruction node for every processing instruction

• No such node for any processing instruction that occurs within the document type declaration


Data Model: Comment

• There is a comment node for every comment

• No such node for any comment that occurs within the document type declaration

• String-value of comment is the content of the comment not including the opening <!-- or the closing -->


Data Model: Text Node

• Character data is grouped into text nodes

• As much character data as possible is grouped into each text node

• A text node never has an immediately following or preceding sibling that is a text node

• String-value of a text node is the character data

• A text node always has at least one character of data


Data Model: Text Node

• Characters inside comments, processing instructions and attribute values do not produce text nodes
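These text node rules can be observed with Python's standard xml.dom.minidom, used here only as a convenient DOM that happens to group parsed character data in a similar way. The sample markup is invented for the illustration.

```python
from xml.dom import minidom, Node

# Characters in the attribute value, the comment and the processing
# instruction do not end up in text nodes.
p = minidom.parseString(
    '<p note="attribute text">one<!-- comment text -->two<?target pi text?>three</p>'
).documentElement
kinds = [child.nodeType for child in p.childNodes]
texts = [child.data for child in p.childNodes if child.nodeType == Node.TEXT_NODE]
print(texts)  # ['one', 'two', 'three']

# As much character data as possible is grouped into one text node,
# including characters written as entity references.
merged = minidom.parseString("<p>a &amp; b &lt; c</p>").documentElement
print(len(merged.childNodes), merged.firstChild.data)  # 1 a & b < c
```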
