Introduction to XPath Bun Yue Professor, CS/CIS UHCL


Introduction to XPath

Bun Yue, Professor, CS/CIS, UHCL

Resources

XPath 1.0: http://www.w3.org/TR/xpath
XPath 2.0: http://www.w3.org/TR/xpath20/
EditiX (free edition): http://free.editix.com/
XPath 1.0 testbed by whitebeam: http://www.whitebeam.org/library/guide/TechNotes/xpathtestbed.rhtm

Introduction to XPath 1.0

XPath is used to address parts of an XML document.
XPath is a W3C recommendation. Version 2.0 is largely backward compatible with 1.0.
XPath is used by XPointer, XSLT and XQuery.
XPath is designed to select existing elements, not to create new ones.
It is designed to be embedded in a host language, such as XSLT or XQuery.


Location Path

XPath uses path expressions, called location paths, to address parts of a document.

A location path is composed of a sequence of location steps, separated by a '/'.

Location Path

A location path can be absolute or relative.
An absolute location path starts with '/', which denotes the document root.
A relative location path does not start with '/'. It is evaluated relative to a context node.
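The role of the context node can be sketched in Python. This is only an illustration: it uses the standard library's xml.etree.ElementTree, which implements a limited subset of XPath and always evaluates paths relative to an element. The sample document and the helper select_text are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, modelled on the stocks example in these slides.
XML = """<stocks>
  <stock symbol="IBM"><lastprice>120.50</lastprice></stock>
  <stock symbol="MSFT"><lastprice>30.10</lastprice></stock>
</stocks>"""

root = ET.fromstring(XML)         # the <stocks> element
first_stock = root.find("stock")  # the first <stock> child


def select_text(context, relative_path):
    """Evaluate a relative path against a context node; return the text of each match."""
    return [node.text for node in context.findall(relative_path)]


# The same relative path gives different results for different context nodes.
from_stocks = select_text(root, "lastprice")        # <stocks> has no lastprice children
from_stock = select_text(first_stock, "lastprice")  # the first <stock> has one
print(from_stocks, from_stock)
```

The relative path lastprice selects nothing from the stocks element but selects one node from a stock element, because only the context node differs.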

XPath 1.0 Results

The result of an XPath 1.0 expression is one of the following four types:
Number
String
Boolean
Node-set: a set of nodes

As a set, a node-set contains no duplicate nodes.
A node-set is not the same as a document fragment.
Node-sets are replaced by sequences in XPath 2.0.


Example

/stocks/stock

matches all element nodes stock that are children of the root element stocks.
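A rough Python emulation of this path, assuming a hypothetical sample document. ElementTree cannot evaluate an absolute path on an element, so the helper stocks_stock (an illustrative name) checks the root element by hand and then takes its stock children.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the nested <stock> is there to show what is NOT matched.
XML = """<stocks>
  <stock symbol="IBM"/>
  <stock symbol="MSFT"/>
  <archive><stock symbol="OLD"/></archive>
</stocks>"""

root = ET.fromstring(XML)


def stocks_stock(root_element):
    """Emulate /stocks/stock: the root element must be <stocks>; return its <stock> children."""
    if root_element.tag != "stocks":
        return []
    return root_element.findall("stock")  # direct children only


symbols = [stock.get("symbol") for stock in stocks_stock(root)]
print(symbols)
```

The stock element inside archive is not selected, because it is a grandchild of stocks rather than a child.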

EditiX

In EditiX, use "View > Windows > XPath View" to execute XPath expressions.

Either XPath 1.0 or 2.0 may be selected.

Location Step

A location step is composed of three parts:
an axis (required; the child axis is assumed when none is written): describes the direction of navigation.
a node test (required): specifies the node type or name to match.
zero or more predicates (optional): specify additional conditions for inclusion.

Example

//stocks/child::stock[@symbol="IBM"]/lastprice

Consider the location step child::stock[@symbol="IBM"]:
axis: child
node test: stock
predicate: [@symbol="IBM"]
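The three parts of this step can be tried out in Python. This is a sketch under assumptions: the sample document is hypothetical, and ElementTree only accepts the abbreviated form (no child:: prefix), evaluated here relative to the stocks element.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
XML = """<stocks>
  <stock symbol="IBM"><lastprice>120.50</lastprice></stock>
  <stock symbol="MSFT"><lastprice>30.10</lastprice></stock>
</stocks>"""

root = ET.fromstring(XML)  # the <stocks> element is the context node

# Abbreviated form of child::stock[@symbol="IBM"]/child::lastprice:
#   axis: child (implicit), node test: stock, predicate: [@symbol='IBM']
matches = root.findall("stock[@symbol='IBM']/lastprice")
ibm_prices = [node.text for node in matches]
print(ibm_prices)
```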

Axis

An axis is the first part of a location step. It is followed by :: and then the node test and any predicates.
There are 13 axes in XPath 1.0.
The default axis is the child axis.
The symbol @ can be used as shorthand for the attribute axis.

Axes in XPath 1.0

child: the children of the context node (not including attribute nodes).
descendant: the descendants of the context node.
parent: the parent of the context node, if there is one.
ancestor: the ancestors of the context node, including the root node if the context node is not the root node.
following-sibling: all the following siblings of the context node.
preceding-sibling: all the preceding siblings of the context node.

Axes in XPath 1.0

following: all nodes in the same document as the context node that are after the context node in document order, excluding any descendants and excluding attribute nodes and namespace nodes.
preceding: all nodes in the same document as the context node that are before the context node in document order, excluding any ancestors and excluding attribute nodes and namespace nodes.
attribute: the attributes of the context node; the axis will be empty unless the context node is an element.

Axes in XPath 1.0

namespace: the namespace nodes of the context node; the axis will be empty unless the context node is an element.
self: just the context node itself.
descendant-or-self: the context node and the descendants of the context node.
ancestor-or-self: the context node and the ancestors of the context node; thus, the ancestor-or-self axis will always include the root node.
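A few of these axes can be emulated by hand in Python. This is a sketch, not an XPath engine: ElementTree does not model the root (document) node or keep parent links, so the helpers below (all hypothetical names) work on elements only and build their own parent map.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: a contains b and f; b contains c, d and e.
root = ET.fromstring("<a><b><c/><d/><e/></b><f/></a>")

# ElementTree elements do not store a parent link, so build a child -> parent map.
parent_of = {child: parent for parent in root.iter() for child in parent}


def ancestors(node):
    """ancestor axis, nearest ancestor first (element ancestors only)."""
    result = []
    while node in parent_of:
        node = parent_of[node]
        result.append(node)
    return result


def following_siblings(node):
    """following-sibling axis, in document order."""
    parent = parent_of.get(node)
    if parent is None:
        return []
    siblings = list(parent)
    return siblings[siblings.index(node) + 1:]


def preceding_siblings(node):
    """preceding-sibling axis, listed here in document order."""
    parent = parent_of.get(node)
    if parent is None:
        return []
    siblings = list(parent)
    return siblings[:siblings.index(node)]


def descendant_or_self(node):
    """descendant-or-self axis, in document order (elements only)."""
    return list(node.iter())


d = root.find("b/d")  # the context node
b = root.find("b")
ancestor_tags = [n.tag for n in ancestors(d)]
following_tags = [n.tag for n in following_siblings(d)]
preceding_tags = [n.tag for n in preceding_siblings(d)]
subtree_tags = [n.tag for n in descendant_or_self(b)]
print(ancestor_tags, following_tags, preceding_tags, subtree_tags)
```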

Shorthand

. is the shorthand for self::node()
.. is the shorthand for parent::node()
// is the shorthand for /descendant-or-self::node()/
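The three abbreviations can be tried in Python, since ElementTree's XPath subset accepts them in relative paths. The sample document is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
XML = """<stocks>
  <stock symbol="IBM"><lastprice>120.50</lastprice></stock>
  <stock symbol="MSFT"><lastprice>30.10</lastprice></stock>
</stocks>"""

root = ET.fromstring(XML)

# "." selects the context node itself.
self_nodes = root.findall(".")

# ".//lastprice" selects lastprice elements at any depth below the context node.
prices = [node.text for node in root.findall(".//lastprice")]

# ".." steps from each lastprice element back up to its parent.
parent_symbols = [node.get("symbol") for node in root.findall(".//lastprice/..")]

print(self_nodes[0].tag, prices, parent_symbols)
```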

Node tests in XPath 1.0

The node test is the second part of a location step. It is required.
There are three kinds of node tests:
NameTest: the name of the node. * is a wildcard matching any name; it is a name test.
NodeType test:
  node(): matches a node of any type. On the child axis this includes text, comment and processing-instruction nodes, but not attributes or the root node, which are never children.
  text()
  comment()
  processing-instruction()
processing-instruction('pi-name'): matches processing instructions with the given target name.
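The node tests can be mimicked in Python with xml.dom.minidom, which, unlike ElementTree, keeps text, comment and processing-instruction nodes as separate children. This is an emulation with hypothetical helper names, applied along the child axis only; it is not how an XPath engine is implemented.

```python
from xml.dom import Node, minidom

# Hypothetical document with one child of each kind: comment, PI, text, element.
XML = "<note><!-- draft --><?render fast?>Hello<to>Bun</to></note>"
note = minidom.parseString(XML).documentElement


def child_axis(context, node_test):
    """Return the children of the context node that pass the node test."""
    return [child for child in context.childNodes if node_test(child)]


def any_node(node):                      # node()
    return True


def text(node):                          # text()
    return node.nodeType == Node.TEXT_NODE


def comment(node):                       # comment()
    return node.nodeType == Node.COMMENT_NODE


def processing_instruction(target=None):  # processing-instruction('pi-name')
    def test(node):
        return (node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
                and (target is None or node.target == target))
    return test


def name_test(name):                     # a name, or * for any element name
    def test(node):
        return (node.nodeType == Node.ELEMENT_NODE
                and (name == "*" or node.nodeName == name))
    return test


all_children = child_axis(note, any_node)
text_values = [n.data for n in child_axis(note, text)]
comment_count = len(child_axis(note, comment))
pi_targets = [n.target for n in child_axis(note, processing_instruction("render"))]
element_names = [n.nodeName for n in child_axis(note, name_test("*"))]
print(len(all_children), text_values, comment_count, pi_targets, element_names)
```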

Predicate tests

Predicate tests are the last part of a location step.
They are enclosed in [] and are optional.
A location step may have more than one predicate test.
XPath built-in functions can be used to construct the predicate (boolean) expression that serves as the added condition for inclusion.
Boolean operators: and, or.

Example

//text()
matches all text nodes.

//@p[.='1']
selects all attributes with the name p and the value 1.

//person[first][last]
selects all person elements that have both a first child element and a last child element.
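These three expressions can be approximated in Python. This is a sketch on a hypothetical document: ElementTree cannot select text or attribute nodes through a path, so the first two are emulated with ordinary iteration; only the third uses ElementTree's path syntax directly.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
XML = """<people>
  <person p="1"><first>Boris</first><last>Lee</last></person>
  <person p="2"><first>Ann</first></person>
</people>"""

root = ET.fromstring(XML)

# //text(): emulated with itertext(). Whitespace-only text nodes, which XPath
# would also return, are dropped here to keep the output readable.
texts = [t for t in root.itertext() if t.strip()]

# //@p[.='1']: emulated by checking the attribute p on every element.
p_values = [e.get("p") for e in root.iter() if e.get("p") == "1"]

# //person[first][last]: two predicates, both of which must hold.
# Here "last" is an element name, not the last() function.
complete = [e.get("p") for e in root.findall(".//person[first][last]")]

print(texts, p_values, complete)
```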


XPath Functions

There are many XPath 1.0 functions for testing and other purposes.

Many of them are obvious. The non-obvious ones are explained below.

XPath 1.0 Functions

boolean(arg): converts arg to the boolean data type.
false(): always returns false.
lang(arg): returns true iff the language of the context node, as given by xml:lang, is the same as or a sublanguage of the language specified by the argument string arg.
not(arg): negation of arg.
true(): always returns true.
count(arg): number of nodes in the node-set argument arg.

XPath Functions

id(arg): selects elements by their unique ID, given by arg.
last(): returns the context size of the expression evaluation context.
local-name(arg): returns the local name of the first node in the node-set argument arg; returns the local name of the context node if arg is missing.
name()
namespace-uri()
position(): returns the proximity position (starting from one) of the context node within the axis.
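Positions and context size can be illustrated in Python. This is a sketch on a hypothetical document: ElementTree supports the predicates [n] and [last()] (positions start at one), while count() is emulated with len() because ElementTree does not evaluate XPath functions in general.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document with three stock children.
XML = """<stocks>
  <stock symbol="IBM"/>
  <stock symbol="MSFT"/>
  <stock symbol="ORCL"/>
</stocks>"""

root = ET.fromstring(XML)

# stock[1] is the abbreviated form of stock[position() = 1].
first = [s.get("symbol") for s in root.findall("stock[1]")]

# stock[last()] selects the stock whose position equals the context size.
last = [s.get("symbol") for s in root.findall("stock[last()]")]

# count(stock), emulated by counting the selected nodes.
stock_count = len(root.findall("stock"))

print(first, last, stock_count)
```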

XPath 1.0 Functions

ceiling(arg): ceiling of the number argument arg.
floor(arg)
number(arg): converts arg to a number.
round(arg): rounds arg to the nearest integer.
sum(arg): sum of the values of the nodes in the node-set argument arg.
concat(): string concatenation of the arguments.
contains(arg1, arg2): true iff arg1 contains arg2.

XPath 1.0 Functions

normalize-space(arg): returns the string argument arg with leading and trailing white space stripped and each internal run of white space replaced by a single space.
starts-with(arg1, arg2): whether arg1 starts with arg2.
string(): converts to a string.
string-length(arg): the number of characters in the string arg.
substring(arg1, arg2, arg3): returns the substring of arg1 that starts at position arg2 (counting from one) and has length arg3.

XPath 1.0 Functions

substring-after(arg1, arg2): the substring of arg1 after the first occurrence of arg2.
substring-before(arg1, arg2): the substring of arg1 before the first occurrence of arg2.
translate(arg1, arg2, arg3): returns arg1 with each character that occurs in arg2 replaced by the character at the corresponding position in arg3.
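The behaviour of the less obvious string functions can be sketched as plain Python functions. These are emulations written for illustration, not part of any XPath library. They assume integer arguments for substring, and normalize_space treats every kind of Unicode white space alike, which is broader than XPath's definition.

```python
def normalize_space(s):
    """Strip both ends and collapse internal white space (approximation, see lead-in)."""
    return " ".join(s.split())


def substring(s, start, length):
    """substring(arg1, arg2, arg3) for integer arguments; positions count from one."""
    begin = max(start - 1, 0)
    end = max(start - 1 + length, 0)
    return s[begin:end]


def substring_before(s, sep):
    """Part of s before the first occurrence of sep; empty if sep does not occur."""
    if not sep:
        return ""
    head, found, _ = s.partition(sep)
    return head if found else ""


def substring_after(s, sep):
    """Part of s after the first occurrence of sep; empty if sep does not occur."""
    if not sep:
        return s
    _, found, tail = s.partition(sep)
    return tail if found else ""


def translate(s, source, target):
    """Replace each character of source by its counterpart in target.

    Characters of source with no counterpart in target are removed.
    """
    table = {}
    for index, char in enumerate(source):
        if ord(char) in table:
            continue  # the first occurrence in source wins
        table[ord(char)] = target[index] if index < len(target) else None
    return s.translate(table)


print(substring("12345", 2, 3))
print(substring_before("1999/04/01", "/"), substring_after("1999/04/01", "/"))
print(translate("bar", "abc", "ABC"))
```

Note that substring counts positions from one, unlike Python slices, which count from zero.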


XPath 1.0 Classwork

To be handed in during class. Use Familytree.xml

XPath 2.0

W3C related specifications:
XQuery 1.0 and XPath 2.0 Data Model
XQuery 1.0 and XPath 2.0 Functions and Operators
XQuery 1.0 and XPath 2.0 Formal Semantics
XML Path Language (XPath) 2.0
XSL Transformations (XSLT) Version 2.0
XSLT 2.0 and XQuery 1.0 Serialization
XQuery 1.0: An XML Query Language

Major Changes in XPath 2.0

Sequences replace node-sets as the main data model.
XML Schema data types
Variable binding
A rich set of functions
Richer expressions
New comment styles
…

Sequences and items

A sequence is an ordered heterogeneous collection of items.

An item can be:
A node
An atomic value

Sequences

Example:

(1, 5 to 8, "Bun Yue", 2.1)
(1+2, 5)
(1 to 50)[. mod 3 = 1]
/* | //person
(1, 2, (3, (4, 5))) is (1, 2, 3, 4, 5)

Sequences

Items within a sequence:
Can be in any arbitrary order.
Can be heterogeneous.
Can be repeating.

Sequences are not nested.
XPath 2.0 results are sequences.
Atomic values are considered to be sequences with a single item.
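The flattening rule can be imitated in Python. This is an analogy only: Python tuples stand in for XPath 2.0 sequences, and the helpers seq and to are hypothetical names that mimic the comma operator and the range expression.

```python
def seq(*items):
    """Build a flat sequence: any nested sequence (tuple) is spliced into the result."""
    flat = []
    for item in items:
        if isinstance(item, tuple):
            flat.extend(seq(*item))
        else:
            flat.append(item)
    return tuple(flat)


def to(start, end):
    """Mimic the range expression 'start to end' (both ends inclusive)."""
    return tuple(range(start, end + 1))


mixed = seq(1, to(5, 8), "Bun Yue", 2.1)               # (1, 5 to 8, "Bun Yue", 2.1)
nested = seq(1, 2, seq(3, seq(4, 5)))                  # (1, 2, (3, (4, 5)))
filtered = tuple(n for n in to(1, 50) if n % 3 == 1)   # (1 to 50)[. mod 3 = 1]

print(mixed)
print(nested)
print(filtered[:3], len(filtered))
```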

For expression & variable binding

for $varname in (expression) return (expression)

Example:

for $person in //person return count($person/email)
for $person in //person return fn:count($person/email)
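A Python list comprehension is a close analogue of the for expression. This is a sketch on a hypothetical document: ElementTree has no XPath 2.0 support, so the variable binding and count() are written in Python rather than evaluated as XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
XML = """<people>
  <person><first>Boris</first><email>b1@example.org</email><email>b2@example.org</email></person>
  <person><first>Ann</first></person>
  <person><first>Carl</first><email>c@example.org</email></person>
</people>"""

root = ET.fromstring(XML)

# for $person in //person return count($person/email)
email_counts = [len(person.findall("email")) for person in root.iter("person")]
print(email_counts)
```

As in XPath 2.0, the result is one item per person, in document order.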


If expression

Example:

if (//person[first/text()='Boris']) then 'found Boris' else 'no Boris'
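The same test can be written in Python as a conditional expression. This is a sketch on a hypothetical document; ElementTree's predicate [first='Boris'] compares the text content of the first child and stands in for first/text()='Boris'.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
XML = """<people>
  <person><first>Boris</first></person>
  <person><first>Ann</first></person>
</people>"""

root = ET.fromstring(XML)

# A non-empty selection counts as true, as a non-empty node sequence does in XPath.
boris = root.findall(".//person[first='Boris']")
message = "found Boris" if boris else "no Boris"
print(message)
```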

XPath 2.0 Functions

Many new functions: http://www.w3schools.com/XPath/xpath_functions.asp

Some categories:
Sequences
Aggregate functions
Nodes
Numeric
String, with regular expressions

Quantified Expressions

Applied to a sequence:
some
every

Format:
some $v in sequence satisfies condition
every $v in sequence satisfies condition

Example

if (every $person in //person satisfies $person/email)
then "everyone has email address"
else "oh oh"

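Python's all() and any() behave like every and some. The sketch below mirrors the quantified example on a hypothetical document; it is an analogy, since ElementTree does not evaluate XPath 2.0.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: the second person has no email.
XML = """<people>
  <person><first>Boris</first><email>b@example.org</email></person>
  <person><first>Ann</first></person>
</people>"""

root = ET.fromstring(XML)
people = list(root.iter("person"))

# every $person in //person satisfies $person/email
every_has_email = all(person.find("email") is not None for person in people)

# some $person in //person satisfies $person/email
some_has_email = any(person.find("email") is not None for person in people)

message = "everyone has email address" if every_has_email else "oh oh"
print(every_has_email, some_has_email, message)
```

As with every in XPath 2.0, all() is true for an empty sequence, so a document with no person elements would report that everyone has an email address.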

Classwork

To be handed in during class. Use Familytree.xml


Questions