Page 1: 5 Querying XML

SDPL 2011 5. Querying XML with XQuery 1

5 Querying XML

How to access various XML data sources?
XQuery, XML Query Lang, W3C Rec, Jan '07
– joint work by XML Query and XSL WGs
  » with XPath 2.0 and XSLT 2.0
  » Started ~1999; 2nd Ed. in Dec 2010
– influenced by many research groups and query languages
  » Quilt, XPath, XQL, XML-QL, SQL, OQL, Lorel, ...
– A query language for any XML-represented data: both documents and databases

Outline of this section

Quick overview of XQuery
Review of XPath, emphasizing XPath 2.0 vs 1.0
– items, types, and sequences
– tree model, location path expressions; comparison operators
Central features of XQuery (over those of XPath 2.0)
– element constructors, FLWOR expressions
– use cases, user-defined functions, querying relational data
Comparison of XQuery and XSLT 1.0
XQuery for problem solving
– examples of application to puzzles
Summary

Capabilities of XQuery (1)

XQuery can select, reorganize and transform XML data
– respecting document content, structure, hierarchy, and order
Selection, filtering, and search
Combine and join
– data from different parts of a document, or from multiple documents
Sort, group, and aggregate
Transform, restructure and create XML data
Operate on numbers and dates
Manipulate content strings

Capabilities of XQuery (2)

Closure property:
– Results of XML queries are also XML (well-formed document fragments)
– => queries can be combined, without limit
Extensibility:
– supports user-defined functions on all data types of the data model
In-place update of XML data not supported
– specified in “XQuery Update Facility 1.0”, W3C Rec. March 2011

XQuery in a Nutshell

Functional expression language (Finnish: lausekekieli)
Strongly-typed: (optional) type-checking of expressions, and validation of results (we’ll concentrate on processing)
– predeclared prefix for type names: xs="http://www.w3.org/2001/XMLSchema"
Extends XPath 2.0
– XQuery 1.0 and XPath 2.0 Functions and Operators, Rec. Jan. 2007
  » over 100; for numbers, strings, dates and times, Booleans, documents & URIs, nodes, and sequences
XQuery ≈ XPath 2.0 + XSLT' + SQL' (roughly)

Example Query

  xquery version "1.0"; (: optional declaration :)

  <cheapBooks>
    <Title>Cheap Books</Title>
    { for $b in doc("bib.xml")//book[@price < 50]
      order by $b/title
      return $b }
  </cheapBooks>

XML-based syntax (XQueryX) has also been specified
– easier for applications, harder for humans
Syntax "concise and easily understood"

A possible result

  <?xml version="1.0" encoding="UTF-8"?>
  <cheapBooks>
    <Title>Cheap Books</Title>
    <book price="26.50">
      <title>Computing with Logic</title>
      <author>David Maier</author>
      <publisher>Benjamin Cummings</publisher>
      <year>1999</year>
    </book>
    <book price="40.00">
      <title>Designing Internet applications</title>
      <author>Michael Leventhal</author>
      <publisher>Prentice Hall</publisher>
      <year>1998</year>
    </book>
  </cheapBooks>

XQuery and XPath

XQuery is an extension of XPath (2.0)
– Common data model, 108 functions and 68 operators
– => review some XPath first
XPath used in several other contexts, too:
– For pattern matching and selection in XSLT
– in validation rules of Schematron
– For uniqueness constraints in XML Schema
– For addressing in XLink and XPointer

XPath in a Nutshell

XPath 1.0 (W3C Rec. 11/99)
– a compact non-XML syntax for addressing parts of XML documents (as node-sets)
– also operations on strings, numbers and truth values
XPath 2.0 (W3C Rec. 1/07) extends and generalizes:
– data manipulated as sequences of items
  » Item = a node or an atomic value of a simple XML Schema datatype

Literal Atomic Values and Their Types

Examples:
  "-12" instance of xs:string
  -12 instance of xs:integer
  1.2 instance of xs:decimal
  1.2E3 instance of xs:double
  string(1.2E3) instance of xs:string
  number("+12") instance of xs:double
  xs:date("2009-05-11") instance of xs:date
  true() instance of xs:boolean

XPath 2.0/XQuery Type Hierarchy

[Figure: type hierarchy diagram, spanning two slides; not included in this transcript]

XQuery/XPath 2.0 Sequences

Expressions operate on, and return, sequences of
– atomic values (of simple XML Schema types) and
– nodes
– an item is the same as a singleton sequence
– sequences are flat: no sequences as items
  » (1, (2, 3), (), 1) = (1, 2, 3, 1)
– sequences are ordered, and can contain duplicates
Unlimited combination of expressions, often with automatic type conversions (e.g. for arithmetic)

Sequence Expressions

Constant sequences constructed by listing values
– comma (,) is a catenation operator
  » (1, (2, 3), (), 1) = (1, 2, 3, 1)
Range expressions for integer sequences:
– 1 to 4 → (1, 2, 3, 4)
– 4 to 1 → ()
– reverse(1 to 4) → (4, 3, 2, 1)
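The flattening and range rules above can be checked with a short query. This is a minimal sketch for an XQuery 1.0 processor; the variable names are illustrative and not from the slides.

```xquery
let $nested := (1, (2, 3), (), 1)       (: nesting and empty sequences disappear :)
let $up     := 1 to 4                   (: (1, 2, 3, 4) :)
let $down   := 4 to 1                   (: empty: a range never counts downwards :)
return (
  count($nested),                       (: 4 :)
  deep-equal($nested, (1, 2, 3, 1)),    (: true :)
  count($down),                         (: 0 :)
  reverse($up)                          (: 4, 3, 2, 1 :)
)
```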

Accessing Documents

XQuery operates on nodes accessible by input functions
– fn:doc("URI")
  » document-node of the XML document available at URI
  » roughly same as document("URI") in XSLT 1.0
– fn:collection("URI")
  » sequence of nodes from URI
  » association defined by implementation
– predeclared prefix for the default function namespace: fn="http://www.w3.org/2005/xpath-functions"
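A small sketch of fn:doc in use. It assumes that a file bib.xml, as in the example query of these slides, can be resolved by the processor; if it cannot, doc() raises an error.

```xquery
(: Assumption: bib.xml is available relative to the static base URI. :)
let $bib := doc("bib.xml")
return (
  $bib instance of document-node(),   (: true: doc() returns the document node :)
  name($bib/*),                       (: name of the document element :)
  count($bib//book)                   (: number of book elements in the document :)
)
```

The fn: prefix can be omitted because the functions namespace is the default function namespace.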

XQuery/XPath 2.0 Data Model

Documents are viewed as trees made of six types of nodes:
– document (additional root above document element)
– element nodes
– attribute nodes
– text nodes
– comments and processing instructions
Obs 1: No entity nodes, and no CDATA sections
Obs 2: Namespace nodes have been deprecated

Document Trees

Defined in Sect. 5 of XPath 1.0 spec
– for XSLT/XPath 2.0 & XQuery in their joint Data Model
Element nodes have elements, text nodes, comments and processing instructions of their (direct) content as their children
– NB: attribute nodes are not children (but have a parent)
– => they have no siblings either
– the string value of a document/element is the concatenation of all its text-node descendants

Document Order

Document order of nodes:
– = their left-to-right pre-order
– Document root first
– Other nodes in the order of the first character of their XML markup in the document text
– => an element precedes its attribute nodes, which precede any content nodes of the element
– Order between nodes belonging to different trees is implementation dependent, but consistent and stable
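The ordering rules can be observed with the node comparison operator <<, which is true when its left operand precedes the right one in document order. A minimal sketch; the chap element is made-up sample data.

```xquery
let $e := <chap id="c1"><title>Intro</title>Some text</chap>
return (
  $e << $e/@id,            (: true: an element precedes its attributes :)
  $e/@id << $e/title,      (: true: attributes precede the content nodes :)
  $e/title << $e/text()    (: true: the title child comes before the text child :)
)
```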

Location Paths

XPath can select any parts of a document tree using …
Location paths
– evaluated with respect to a context item (.)
  » assigned on path steps, after the first one
  » Path expression typically starts with $x or doc(…)
– Result: sequence of nodes, in document order, without duplicates

Path Expressions

Similar to XPath 1.0: [/ [/]]Expr/… /Expr
– but steps more liberal:
– arbitrary expressions OK, but steps before the last one must produce node sequences
– 6 (of 13 XPath) axes required: child, descendant, attribute, self, descendant-or-self, parent
  » others (except namespace) optional, available if the Full Axis Feature is supported
  » with document-order operators (<<, >>) sufficient for expressing queries (→ Exercises)

Location paths

Consist of location steps separated by '/'
– each step produces a sequence of items
– steps evaluated left-to-right, each item in turn as the context item
Complete location step: AxisName::NodeTest ([PredicateExpr])*
– axis specifies the tree relationship between the context node and the selected nodes
– node test restricts the type and name of nodes
– filtered further by 0 or more predicates

Location steps: Axes

In total 12 axes (~ directions in tree)
– for staying at the context node: self
– for going downwards:
  » child, descendant, descendant-or-self
– for going upwards:
  » parent, ancestor, ancestor-or-self
– for moving towards start/end of the document:
  » preceding-sibling, following-sibling, preceding, following
– “Special” axes
  » attribute (namespace deprecated in XPath 2.0)
– (Axes required in XQuery implementations, underlined on the original slide: self, child, descendant, descendant-or-self, parent, attribute)

Notes on Location Paths (1)

XPath 2.0 allows unrestricted expressions as steps
– but intermediate steps must produce nodes only
Numeric predicates support array-style access: $rows[$i]
Predicates are evaluated one step at a time. This sometimes causes confusion with shorthand notations:
– doc("doc.xml")//title[3]
  third title child of each parent (likely none!). Why?
– = doc("doc.xml")/descendant-or-self::node()/child::title[3]
– To get the third title in the doc use
  (doc("doc.xml")//title)[3]
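The difference between the two forms can be tried on a small in-memory document. A minimal sketch; the book/chap/title structure is made-up sample data standing in for doc.xml.

```xquery
let $doc := document {
  <book>
    <chap><title>A</title></chap>
    <chap><title>B</title></chap>
    <chap><title>C</title></chap>
  </book>
}
return (
  count($doc//title[3]),       (: 0: no parent has a third title child :)
  string(($doc//title)[3])     (: "C": the third title in document order :)
)
```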

Notes on Location Paths (2)

References to attributes and subelements easy to use as predicates
– Get divisions that are of class C or have a head:
  doc("doc.xml")//div[@class="C" or head]
– Values are coerced to Booleans on demand
  » string/sequence: true iff non-empty
  » number: false if and only if zero or NaN
  (but a single number as a predicate tests for equality with position())
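The coercion rules can be made visible with fn:boolean, which computes the same effective boolean value that a predicate uses. A minimal sketch:

```xquery
(
  boolean(""),               (: false: empty string :)
  boolean("0"),              (: true: any non-empty string, even "0" :)
  boolean(()),               (: false: empty sequence :)
  boolean(0),                (: false: zero :)
  boolean(number("abc")),    (: false: number("abc") is NaN :)
  (10, 20, 30)[2]            (: 20: a numeric predicate means position() = 2 :)
)
```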

Filter Expressions

Location steps can be filtered by predicates:
  doc("foo.xml")/body/(chap | app)[last()]/title
  the title of the last chapter or appendix, whichever is last
  ((chap | app) is an XPath 2.0 extended step)
Other sequences, too:
  (1 to 20)[. mod 5 eq 0] → (5, 10, 15, 20)
– ('.' generalized from XPath 1.0 shorthand for self::node() into the context item)

Path Steps as a Map operator

XPath 2.0 path exprs provide a kind of map facility, to compute a new sequence by evaluating an expression for each item of the input sequence
Example: Get all salaries incremented by 20%:
  doc("emps.xml")//emp/@salary/(. * 1.2)
Useful tricks, like providing defaults for missing attributes:
  $emp/(@salary, text{0.0})[1]/(. * 1.2)
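Both expressions can be tried without an emps.xml file by binding a constructed element. A minimal sketch; the emps element and its attribute values are made-up sample data.

```xquery
let $emps :=
  <emps>
    <emp name="Ann" salary="1000"/>
    <emp name="Bob"/>
  </emps>
return (
  (: one result only: Bob has no salary attribute, so his step yields nothing :)
  $emps/emp/@salary/(. * 1.2),
  (: two results: the text node acts as a default when @salary is missing :)
  for $emp in $emps/emp
  return $emp/(@salary, text { "0.0" })[1]/(. * 1.2)
)
```

The untyped attribute and text values are atomized and cast to xs:double before the multiplication.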

Path Steps as a Map (2)

Limitation: steps are applicable to node sequences only. Example: an invalid attempt to square numbers 1, 2, ..., 10:
  (1 to 10)/(. * .)
Work-around: translate items first to text nodes:
  (for $i in 1 to 10 return text{ $i })/(. * .)
or simply:
  for $i in 1 to 10 return $i * $i
Function calls can also be used as steps:
  myFun:toTextNodes(1 to 10)/myFun:square(.)

Set Operations on Node Sequences

Assume variable bindings (drawn as a figure on the original slide): $s1 = (a, b, c) and $s2 = (c, d, e), where a, b, c, d, e are nodes in document order
Then:
  $s1 union $s2 = (a, b, c, d, e)
  $s1 intersect $s2 = (c)
  $s1 except $s2 = (a, b)
– based on node identity (node1 is node2)
– results are without duplicates, in document order
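The three operators can be tried on a constructed tree. A minimal sketch; the element r and its children are made-up sample data chosen to match the bindings $s1 = (a, b, c) and $s2 = (c, d, e).

```xquery
let $doc := <r><a/><b/><c/><d/><e/></r>
let $s1 := $doc/(a | b | c)
let $s2 := $doc/(c | d | e)
return (
  ($s1 union $s2)/name(),        (: a, b, c, d, e :)
  ($s1 intersect $s2)/name(),    (: c :)
  ($s1 except $s2)/name()        (: a, b :)
)
```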

Node Comparisons

To compare single nodes,
– for identity: is
  $book//chap[@id="ch1"] is ($book//chap)[1]
  true iff the chapter with id="ch1" is the first chap
– for document order: << and >>
  $book//chap[@id="ch2"] >> $book//title[. eq "Intro"]
  true iff the chapter with id="ch2" appears after <title>Intro</title>
– if either operand is empty, then result is empty (~ false)
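The comparisons above, including the empty-operand case, can be run against a small constructed book. A minimal sketch; the chapter ids and titles are made-up sample data.

```xquery
let $book :=
  <book>
    <chap id="ch1"><title>Intro</title></chap>
    <chap id="ch2"><title>Details</title></chap>
  </book>
return (
  $book//chap[@id = "ch1"] is ($book//chap)[1],              (: true :)
  $book//chap[@id = "ch2"] >> $book//title[. eq "Intro"],    (: true :)
  empty($book//chap[@id = "ch9"] is ($book//chap)[1])        (: true: an empty operand gives an empty result :)
)
```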

Comparing values of sequences and items

General comparisons between sequences:
– =, !=, <, <=, >, >=
– existential semantics: true iff some pair of values from operand sequences satisfies the condition
  » (1,2) = (2,3); (2,3) = (3,4); (1,2) != (3,4)
  » Same as in XPath 1.0:
    //book[author = "Aho"] → books where some author is Aho
– "Is (some) author of $book Ann or Bob?":
  $book/author = ("Ann", "Bob")
– Slice of $seq from pos $s to $e: (expression not preserved in this transcript)
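The existential semantics can be checked directly. The last line is one common way to write a slice with a general comparison; it is an assumption about what the slide showed, not a recovered original.

```xquery
let $seq := ("a", "b", "c", "d", "e")
let $s := 2
let $e := 4
return (
  (1, 2) = (2, 3),                  (: true: 2 = 2 :)
  (2, 3) = (3, 4),                  (: true: 3 = 3 :)
  (1, 2) = (3, 4),                  (: false: no pair is equal, so = is not transitive :)
  (1, 2) != (1, 2),                 (: true: 1 != 2, so = and != can hold together :)
  $seq[position() = ($s to $e)]     (: "b", "c", "d" :)
)
```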

Set operations for sequences of atomic items

General comparison as a predicate yields set operations:
Union of $A and $B: (expression not preserved in this transcript)
Intersection of $A and $B: (expression not preserved in this transcript)
Difference of $A and $B: (expression not preserved in this transcript)
Above comparisons require items of compatible types
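One possible formulation of the three operations, using general comparison as a predicate together with fn:distinct-values. This is a sketch under the assumption that set semantics (no duplicates) is wanted; it is not necessarily the wording of the original slide.

```xquery
let $A := (1, 2, 3, 3)
let $B := (3, 4)
return (
  distinct-values(($A, $B)),           (: union: the values 1, 2, 3, 4 :)
  distinct-values($A[. = $B]),         (: intersection: the value 3 :)
  distinct-values($A[not(. = $B)])     (: difference: the values 1, 2 :)
)
```

The order in which fn:distinct-values returns its values is implementation dependent, so the results should be read as sets.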

Value Comparisons

For comparing single values:
– eq, ne, lt, le, gt, ge
  » 1 eq 3 - 2; 10 lt 20
  » $books[@price le 100]
– the last assumes that a numeric type has been assigned to @price by validation
  » otherwise it has type xs:untypedAtomic, which is cast to xs:string (=> TYPE ERROR)
=> general comparisons more convenient with unvalidated elements & attributes

Working with Untyped Values

Text values may receive a specific type in a schema-validated element or attribute; otherwise their type is xs:untypedAtomic
Automatic atomization and casting make dealing with them easy. Example:
  let $elem := <e>2.718281828</e>
  return ( "Value of",
           concat(substring($elem, 1, 6), ".. is about"),
           round-half-to-even($elem, 2) )
  -> Value of 2.7182.. is about 2.72

General vs Value Comparisons wrt Types

Comparisons atomize operands: nodes → typed values
Assume that $E := <E>007</E>
General comparisons try to cast xs:untypedAtomic operands to compatible types:
  $E < 6     (: false: xs:untypedAtomic(007) -> xs:double(007) = 7 :),
  $E < "6"   (: true: xs:untypedAtomic(007) -> "007" :),
Value comparisons cast xs:untypedAtomic values to strings:
  $E lt "6"  (: true: xs:string("007") lt "6" :),
  $E lt 6    => TYPE ERROR: cannot compare xs:untypedAtomic to xs:integer

What does XQuery add to XPath 2.0?

A query is an expression (Finnish: lauseke)
– any XPath expression is a query
XQuery adds to XPath expressions
– Element constructors (≈ XSLT templates)
– FLWOR expressions ("flower"; for-let-where-order by-return)

Central XQuery Expressions

Also in XPath 2.0:
– Path expressions
– Sequence expressions
– Comparison operators
– Conditionals: if (..) then .. else ..
– Quantified expressions (some/every $var in … satisfies …)
Element constructors (≈ XSLT templates)
FLWOR expressions ("flower"; for-let-where-order by-return)
– XPath 2.0 has a simpler for-return expression

Example: Quantified Expression

Find book elements which have at least 10 sections in each of their chapters:
  doc("Books.xml")//book[every $c in .//chapter
                         satisfies count($c//section) ge 10]
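The same pattern can be tried on a small constructed input. In this sketch the threshold is lowered from 10 to 1 to keep the made-up sample data short, and a "some" query is added for contrast.

```xquery
let $books :=
  <books>
    <book id="b1"><chapter><section/><section/></chapter></book>
    <book id="b2"><chapter><section/></chapter><chapter/></book>
    <book id="b3"/>
  </books>
return (
  (: "b1", "b3": every is vacuously true for b3, which has no chapters :)
  $books/book[every $c in .//chapter satisfies count($c//section) ge 1]/string(@id),
  (: "b2": the only book with a chapter that has no sections :)
  $books/book[some $c in .//chapter satisfies empty($c//section)]/string(@id)
)
```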

Element Constructors

Direct element constructors ~ XSLT templates:
– start and end tag enclosing the content
– literal fragments written directly, expressions enclosed in braces { and } (≈ XSLT 1.0 attribute value templates)
often used inside another expression that binds variables used in the element constructor
– (There is no 'current node' in XQuery)
– See next

Example

An emp element with an empid attribute and child elements name and job, from values in variables $id, $n, and $j:
  <emp empid="{$id}">
    <name>{$n}</name>
    <job>{$j}</job>
  </emp>
Also computed constructors:
  element {"emp"} {
    attribute {"empid"} {$id},
    <name>{$n}</name>,
    <job>{$j}</job> }

Identity of Component Nodes

Each node has node identity, and at most one parent. Existing nodes are copied before they get a new parent.
Example:
  let $x := <e>Hi</e>,
      $y := <p>{$x}</p>
  return not($x is $y/e) and deep-equal($x, $y/e)
  -> true

FLWOR ("flower") Expressions

Constructed from for, let, where, order by and return clauses (~SQL select-from-where)
Syntax: (ForClause | LetClause)+ WhereClause? OrderByClause? "return" Expr
FLWOR binds variables to values, and uses these bindings to construct a result (an ordered sequence of items)
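A minimal sketch that uses all five clauses in the order given by the syntax. The two cheap books reuse titles and prices from the example result earlier in these slides; the third book and the cheap element name are made up for illustration.

```xquery
let $bib :=
  <bib>
    <book price="40.00"><title>Designing Internet applications</title></book>
    <book price="26.50"><title>Computing with Logic</title></book>
    <book price="95.00"><title>An Expensive Book</title></book>
  </bib>
for $b in $bib/book                  (: for: one tuple per book :)
let $p := xs:decimal($b/@price)      (: let: bind the price as a number :)
where $p lt 50                       (: where: keep the cheap books only :)
order by $b/title                    (: order by: sort the remaining tuples by title :)
return <cheap price="{$p}">{ string($b/title) }</cheap>
```

The result consists of two cheap elements, "Computing with Logic" first; the prices appear as 26.5 and 40 because xs:decimal values are serialized without trailing zeros.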

Flow of data in a FLWOR expression

[Figure: data flow through the clauses of a FLWOR expression; surviving labels: "tuple" (Finnish: monikko/rivi, i.e. tuple/row) and "sequence of items"]

for clauses

for $V1 in Exp1 (, $V2 in Exp2, …)
– associates each variable Vi with expression Expi (e.g. a path expression)
Result: list of tuples, each containing a binding for each of the variables
can be thought of as loops iterating over the items returned by respective expressions

Example: for clause

  for $i in (1,2), $j in (1 to $i)
  return <tuple>
           <i>{$i}</i> <j>{$j}</j>
         </tuple>
Result:
  <tuple><i>1</i><j>1</j></tuple>
  <tuple><i>2</i><j>1</j></tuple>
  <tuple><i>2</i><j>2</j></tuple>

let clauses

let also binds variables to expressions
– each variable gets the entire sequence as its value (without iterating over the items of the sequence)
– results in binding a single sequence for each variable
Compare:
– for $b in doc("bib.xml")//book
  → many bindings (to single books)
– let $bl := doc("bib.xml")//book
  → a single binding (to sequence of books)

Page 46: 5 Querying XML

Example: let clauses

    let $s := (<one/>, <two/>, <three/>)
    return <out> {$s} </out>

Result:

    <out> <one/> <two/> <three/> </out>

    for $s in (<one/>,<two/>,<three/>)
    return <out> {$s} </out>

--> <out><one/></out> <out><two/></out> <out><three/></out>
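The difference between the two bindings can be mimicked with plain strings. This Python sketch is an illustration only; let_binding and for_binding are hypothetical names, and elements are modelled as strings rather than XML nodes.

```python
def let_binding(seq):
    # let: a single binding, so the whole sequence lands in one <out>
    return ["<out>" + "".join(seq) + "</out>"]


def for_binding(seq):
    # for: one binding per item, so the return clause runs once per item
    return ["<out>" + item + "</out>" for item in seq]


items = ["<one/>", "<two/>", "<three/>"]
print(let_binding(items))
print(for_binding(items))
```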

Page 47: 5 Querying XML

for/let clauses

A FLWOR expr may contain several fors and lets
– each may refer to variables bound in previous clauses

The result of the for/let sequence:
– an ordered list of tuples (Finnish: monikko) of bound variables
– number of tuples = product of the cardinalities of the sequences returned by the for expressions (when these do not depend on earlier variables)

Page 48: 5 Querying XML

where clause

Binding tuples generated by for and let clauses are filtered by an optional where clause
– tuples with a true condition are used to instantiate the return clause

The where clause may contain several predicates connected by and, or, and fn:not()
– usually refer to the bound variables
– sequences as Booleans (similarly to node-sets in XPath 1.0): empty ~ false; non-empty ~ true

Page 49: 5 Querying XML

where clause

for binds variables to single items: value comparisons, e.g. $color eq "red"

let binds variables to whole sequences: general comparisons, e.g. $colors = "red"
(~ some $c in $colors satisfies $c eq "red")

– a number of aggregation functions available: avg(), sum(), count(), max(), min()
  (of these, sum() and count() are also in XPath 1.0)
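The existential reading of general comparisons can be sketched in Python. This only models string equality and is not a full account of XQuery comparison rules; value_eq and general_eq are names invented for the sketch.

```python
def value_eq(item, literal):
    # models: $color eq "red"   (a single item on each side)
    return item == literal


def general_eq(items, literal):
    # models: $colors = "red"
    # i.e.    some $c in $colors satisfies $c eq "red"
    return any(value_eq(c, literal) for c in items)


print(general_eq(["blue", "red"], "red"))
print(general_eq([], "red"))
```

Note that an empty sequence never satisfies a general comparison, since there is no item to make the "some" condition true.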

Page 50: 5 Querying XML

return clause

The return clause generates the output of the FLWOR expression
– instantiated once for each binding tuple
– often contains element constructors, references to bound variables, and nested sub-expressions

Page 51: 5 Querying XML

Example: for + return

    for $i in (1,2), $j in (1 to $i)
    return <tuple> <i>{$i}</i> <j>{$j}</j> </tuple>

Result:

    <tuple><i>1</i><j>1</j></tuple>
    <tuple><i>2</i><j>1</j></tuple>
    <tuple><i>2</i><j>2</j></tuple>

Page 52: 5 Querying XML

Example: Prime numbers

Generate prime numbers up to $N, i.e., integers in {2, 3, …, $N} which are not divisible by any other number in the set:

    declare variable $N external;

    let $cands := 2 to $N
    for $cand in $cands
    where count($cands[$cand mod . eq 0]) eq 1
    return $cand

Page 53: 5 Querying XML

Positional variables

For items, can also get their position in the seq:

    for $char at $i in ("a", "b", "c")
    return concat($i, ".", $char, ";")

--> 1.a; 2.b; 3.c;

Could pair items by their position:

    let $boys := doc("kids.xml")//boy,
        $girls := doc("kids.xml")//girl
    for $b at $i in $boys
    where $i le count($girls)
    return <pair>{ $b, $girls[$i] }</pair>
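Positional variables behave much like a 1-based loop counter. The Python sketch below mirrors both examples above with plain lists; the function names and sample data are hypothetical.

```python
def numbered(chars):
    # models: for $char at $i in (...) return concat($i, ".", $char, ";")
    return [f"{i}.{c};" for i, c in enumerate(chars, start=1)]


def pair_by_position(boys, girls):
    # models: for $b at $i in $boys where $i le count($girls)
    #         return <pair>{ $b, $girls[$i] }</pair>
    return [(b, girls[i - 1])
            for i, b in enumerate(boys, start=1)
            if i <= len(girls)]


print(numbered(["a", "b", "c"]))
print(pair_by_position(["Al", "Bo", "Cy"], ["Di", "Eve"]))
```

Boys beyond the number of girls are filtered out by the where condition, so the third boy stays unpaired here.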

Page 54: 5 Querying XML

Prime numbers more efficiently

Only smaller numbers can be divisors; useless to test against others:

    let $cands := 2 to $N
    for $cand at $pos in $cands
    where empty( $cands[position() lt $pos][$cand mod . eq 0] )
    return $cand
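The two prime-number queries can be compared side by side in Python. This is a sketch of the same filtering logic under the assumption that lists stand in for XQuery sequences; it says nothing about how Saxon evaluates the queries, and the function names are invented.

```python
def primes_naive(n):
    cands = range(2, n + 1)
    # keep a candidate when exactly one candidate (itself) divides it
    return [c for c in cands
            if sum(1 for d in cands if c % d == 0) == 1]


def primes_smaller_only(n):
    cands = list(range(2, n + 1))
    # test only candidates at earlier positions; keep c when none divides it
    return [c for pos, c in enumerate(cands)
            if not any(c % d == 0 for d in cands[:pos])]


print(primes_naive(30))
print(primes_smaller_only(30))
```

Both variants return the same primes; the second one simply inspects fewer divisor candidates per number.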

Page 55: 5 Querying XML

Effect of the Optimization (with Saxon-HE 9.3)

Quite positive on both time and space:

[Two plots comparing primes.xq and primes-opt.xq for inputs from 10000 to 30000: running time (axis up to 40 s) and memory use (axis up to 350 MB)]

– can be optimized much more (see later)

Page 56: 5 Querying XML

Examples (adapted from "XML Query Use Cases")

Assume: a document named "bib.xml" containing a list of books:

<book>+ with children <title>, <author>+, <publisher>, <year>, <price>

Page 57: 5 Querying XML

List Morgan Kaufmann book titles since 1998

    <recent-MK-books> {
      for $b in doc("bib.xml")//book
      where $b/publisher = "Morgan Kaufmann"
        and $b/year >= 1998
      return <book year="{$b/year}">
               {$b/title}
             </book>
    } </recent-MK-books>

Page 58: 5 Querying XML

Result could be...

    <recent-MK-books>
      <book year="1999">
        <title>TCP/IP Illustrated</title>
      </book>
      <book year="2000">
        <title>Advanced Programming in the Unix environment</title>
      </book>
    </recent-MK-books>

Page 59: 5 Querying XML

Publishers with avg price of their books:

    for $p in fn:distinct-values( doc("bib.xml")//publisher )
    let $a := avg( doc("bib.xml")//book[publisher = $p]/price )
    return
      <publisher>
        <name> {$p} </name>
        <avgprice> {$a} </avgprice>
      </publisher>

Note: fn:distinct-values returns atomic values, without duplicates

Page 60: 5 Querying XML

Invert the book-list structure

    <author_list>{ (: group books by authors :)
      for $a in distinct-values( doc("bib.xml")//author )
      return
        <author> {
          <name> {$a} </name>,
          for $b in doc("bib.xml")//book[author = $a]
          return $b/title
        } </author>
    } </author_list>

Page 61: 5 Querying XML

List of publishers sorted alphabetically, and their books in descending order of price

    for $p in distinct-values( doc("bib.xml")//publisher )
    order by $p
    return
      <publisher>
        <name>{$p}</name>
        { for $b in doc("bib.xml")//book[publisher = $p]
          order by number($b/price) descending
          return <book> {$b/title, $b/price} </book> }
      </publisher>

Note on number(): treat untyped values as numbers, instead of the xs:string default
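Why the conversion to numbers matters for sorting can be seen with a few made-up prices. The Python sketch below only contrasts string ordering with numeric ordering; the price values are invented for illustration.

```python
prices = ["9.50", "65.95", "129.00"]

# string comparison works character by character, so "9.50" sorts highest
by_string = sorted(prices, reverse=True)

# converting to numbers first models: order by number($b/price) descending
by_number = sorted(prices, key=float, reverse=True)

print(by_string)
print(by_number)
```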

Page 62: 5 Querying XML

Queries on Document Order

$x << $y = true iff node $x precedes node $y in document order; ($y >> $x similarly)

Page 63: 5 Querying XML

Example Query on Document Order

Consider a surgical report with procedure elements that contain incision sub-elements

Return a "critical sequence" of contents between the first and the second incisions of the first procedure

Page 64: 5 Querying XML

Computing a "critical sequence"

    <critical_sequence> {
      let $p := (doc("report.xml")//procedure)[1]
      for $n in $p/node()
      where $n >> ($p//incision)[1] and
            $n << ($p//incision)[2]
      return $n
    } </critical_sequence>

NB: if incisions are not children of the procedure, then an ancestor of the second incision gets to the result; How to avoid this?

Page 65: 5 Querying XML

User-defined functions: Example

    declare function local:precedes-not-anc($a as node()?,
                                            $b as node()?) as xs:boolean
    { $a << $b and (: $a is no ancestor of $b: :)
      empty($a/descendant::node() intersect $b) };

local: is predeclared prefix for the namespace of local function names
– Alternatively:

    declare namespace my = "http://my.namespace.org";
    declare function my:precedes(... (as above)
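The test "precedes but is not an ancestor" can also be pictured with preorder and postorder ranks of the nodes. The Python sketch below uses that numbering as a swapped-in technique; it is not how the XQuery function above works, it covers element nodes only, and all names in it are hypothetical.

```python
import xml.etree.ElementTree as ET


def number_nodes(root):
    """Assign each element its (preorder, postorder) rank in a depth-first walk."""
    ranks = {}
    counters = {"pre": 0, "post": 0}

    def walk(node):
        pre = counters["pre"]
        counters["pre"] += 1
        for child in node:
            walk(child)
        ranks[node] = (pre, counters["post"])
        counters["post"] += 1

    walk(root)
    return ranks


def precedes_not_anc(ranks, a, b):
    pre_a, post_a = ranks[a]
    pre_b, post_b = ranks[b]
    precedes = pre_a < pre_b                        # a << b in document order
    is_ancestor = pre_a < pre_b and post_a > post_b
    return precedes and not is_ancestor


report = ET.fromstring("<procedure><step><incision/></step><note/></procedure>")
ranks = number_nodes(report)
step, note = report[0], report[1]
incision = step[0]
print(precedes_not_anc(ranks, step, incision))   # step is an ancestor of incision
print(precedes_not_anc(ranks, incision, note))
```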

Page 66: 5 Querying XML

User-defined functions: Example

Now, "critical sequence" without ancestors of incision:

    <critical_sequence> {
      let $p := (doc("report.xml")//procedure)[1]
      for $n in $p/node()
      where $n >> ($p//incision)[1] and
            local:precedes-not-anc($n, ($p//incision)[2])
      return $n
    } </critical_sequence>

Page 67: 5 Querying XML

Prime numbers with a function

Method: (a variant of) the "Sieve of Eratosthenes"
– Initialize a list of numbers 2, 3, 4, …, n
– Repeat:
  (1) Move first of remaining numbers, p, to Primes
  (2) Cross out multiples of p (sufficient to start at p*p)

Page 68: 5 Querying XML

The sieve() function

Invocation: pr:sieve(2 to $N)

    declare namespace pr="http://www.uef.fi/cs/XQTesting/primes";

    declare function pr:sieve($cands as xs:integer*) as xs:integer* {
      (: Pre: $cands are ascending and contain no
         multiples of primes < $cands[1] :)
      if ($cands[1] * $cands[1] gt $cands[last()])
      then $cands (: all of $cands are primes :)
      else ( $cands[1],
             pr:sieve($cands[. mod $cands[1] ne 0]) ) };

NB if-then-else
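The recursion of the sieve function carries over to other languages almost unchanged. The Python sketch below follows the same structure; the guard for an empty candidate list is an addition of this sketch and does not appear in the XQuery function.

```python
def sieve(cands):
    # Pre: cands are ascending and contain no multiples of primes < cands[0]
    if not cands or cands[0] * cands[0] > cands[-1]:
        return list(cands)            # all remaining candidates are primes
    p = cands[0]
    # move p to the result, cross out its multiples, recurse on the rest
    return [p] + sieve([c for c in cands if c % p != 0])


print(sieve(list(range(2, 31))))
```

The recursion depth is bounded by the number of primes up to the square root of the largest candidate, so it stays small even for large inputs.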

Page 69: 5 Querying XML

Efficiency of pr:sieve() (with Saxon-HE 9.3)

[Two plots for inputs up to 2e+06: pr:sieve() time (axis up to 14 s) and pr:sieve() RAM (axis up to 600 MB)]

– vs 23 s (!) for $N=100,000 with previous optimized FLWOR expression

Page 70: 5 Querying XML

Recursive Transformations

Example: "Table-of-contents" for nested sections
– Exclude anything but titles, and tags of sect elements

Page 71: 5 Querying XML

The TOC function

    declare namespace my="http://my.own-ns.org";
    declare function my:toc( $n as element() ) as element()* {
      if (name($n)="sect")
      then <sect> {
             for $c in $n/* return my:toc($c)
           } </sect>
      else if (name($n)="title")
      then $n
      else (: do child elems, if any: :)
           for $c in $n/* return my:toc($c) };

    my:toc(doc("mydoc.xml")/*)
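The same recursive transformation can be tried with Python's standard xml.etree.ElementTree module. This is a sketch that assumes a small in-memory document instead of mydoc.xml; the function name toc and the sample document are made up.

```python
import copy
import xml.etree.ElementTree as ET


def toc(n):
    """Return the table-of-contents elements for element n, as a list."""
    if n.tag == "sect":
        out = ET.Element("sect")          # keep the sect tag, rebuild its content
        for c in n:
            out.extend(toc(c))
        return [out]
    if n.tag == "title":
        t = copy.deepcopy(n)              # titles are copied as such
        t.tail = None                     # drop text that followed the title
        return [t]
    # any other element: process child elements, if any
    return [e for c in n for e in toc(c)]


doc = ET.fromstring(
    "<doc><title>T</title><sect><title>A</title><p>x</p>"
    "<sect><title>B</title></sect></sect></doc>"
)
result = "".join(ET.tostring(e, encoding="unicode") for e in toc(doc))
print(result)
```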

Page 72: 5 Querying XML

Querying relational data

Lots of data is stored in relational databases
Should be able to access them, too
Example: Tables for Parts and Suppliers
– P (pno, descrip) : part numbers and descriptions
– S (sno, sname) : supplier numbers and names
– SP (sno, pno, price) : who supplies which parts and for what price?

Page 73: 5 Querying XML

Possible XML representation of relations

[Figure: XML representations of the three tables; * marks repeatable elements]

Page 74: 5 Querying XML

Selecting in SQL vs. XQuery

SQL:

    SELECT pno
    FROM p
    WHERE descrip LIKE 'Gear%'
    ORDER BY pno;

XQuery:

    for $p in doc("p.xml")//p_tuple
    where starts-with($p/descrip, "Gear")
    order by $p/pno
    return $p/pno

Page 75: 5 Querying XML

Grouping

Many queries involve grouping data and applying an aggregation function like count or avg to each group
In SQL: GROUP BY and HAVING clauses
Example: Find the part number and average price for parts with at least 3 suppliers

Page 76: 5 Querying XML

Grouping: SQL

    SELECT pno, avg(price) AS avgprice
    FROM sp
    GROUP BY pno
    HAVING count(*) >= 3
    ORDER BY pno;

Page 77: 5 Querying XML

Grouping: XQuery

    for $pn in distinct-values(doc("sp.xml")//pno)
    let $grp := doc("sp.xml")//sp_tuple[pno=$pn]
    where count($grp) >= 3
    order by $pn
    return
      <well_supplied_item> {
        <pno>{$pn}</pno>,
        <avgprice> {avg($grp/price)} </avgprice>
      } </well_supplied_item>
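The grouping pattern (distinct keys, one group per key, a filter on group size, then an aggregate) can be sketched in Python. The tuples below are invented sample data standing in for sp.xml, and well_supplied is a hypothetical name.

```python
from collections import defaultdict


def well_supplied(sp_tuples, min_suppliers=3):
    groups = defaultdict(list)
    for sno, pno, price in sp_tuples:        # let $grp := ...[pno=$pn]
        groups[pno].append(price)
    return [(pno, sum(prices) / len(prices))  # avg($grp/price)
            for pno, prices in sorted(groups.items())   # order by $pn
            if len(prices) >= min_suppliers]            # where count($grp) >= 3


sp = [(1, "P1", 10.0), (2, "P1", 20.0), (3, "P1", 30.0), (1, "P2", 5.0)]
print(well_supplied(sp))
```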

Page 78: 5 Querying XML

Joins

Example: Return a "flat" list of supplier names and their part descriptions, in alphabetic order

    for $sp in doc("sp.xml")//sp_tuple,
        $p in doc("p.xml")//p_tuple[pno = $sp/pno],
        $s in doc("s.xml")//s_tuple[sno = $sp/sno]
    order by $p/descrip, $s/sname
    return
      <sp_pair>{
        $s/sname,
        $p/descrip
      }</sp_pair>
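The three-way join reads as nested iteration with matching conditions. In this Python sketch the tables are small invented lists of tuples, and supplier_part_pairs is a made-up name; it mirrors the query's logic only.

```python
def supplier_part_pairs(p, s, sp):
    pairs = [(sname, descrip)
             for sp_sno, sp_pno, _price in sp          # for $sp in ...sp_tuple
             for pno, descrip in p if pno == sp_pno    # p_tuple[pno = $sp/pno]
             for sno, sname in s if sno == sp_sno]     # s_tuple[sno = $sp/sno]
    # order by $p/descrip, $s/sname
    return sorted(pairs, key=lambda pair: (pair[1], pair[0]))


parts = [("P1", "Gear"), ("P2", "Axle")]
suppliers = [("S1", "Smith"), ("S2", "Jones")]
supplies = [("S1", "P1", 10), ("S2", "P1", 12), ("S1", "P2", 7)]
print(supplier_part_pairs(parts, suppliers, supplies))
```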

Page 79: 5 Querying XML

XQuery vs. XSLT 1.0

Could we express XQuery queries with XSLT?
– In principle yes, always (but could be tedious)

Partial XSLT simulation of FLWOR expressions:
– XQuery: for $x in Expr …rest of query
– can be expressed in XSLT as:

    <xsl:for-each select="Expr">
      <xsl:variable name="x" select="." />
      … translation of the rest of the query
    </xsl:for-each>

Page 80: 5 Querying XML

XQuery vs. XSLT 1.0

– XQuery: let $y := Expr … corresponds directly to:

    <xsl:variable name="y" select="Expr" />

– and where Condition … rest to:

    <xsl:if test="Condition">
      … translation of the rest
    </xsl:if>

Page 81: 5 Querying XML

XQuery vs. XSLT 1.0

– XQuery: return ElemConstructor can be simulated with a corresponding XSLT template:
  • static fragments as such
  • enclosed expressions in element content, e.g. {$s/sname}, become

    <xsl:copy-of select="$s/sname" />

Page 82: 5 Querying XML

XQuery vs. XSLT 1.0: Example

XQuery:

    for $b in doc("bib.xml")//book
    where $b/publ = "MK" and $b/year > 1998
    return <book year="{$b/year}">
             {$b/title} </book>

XSLT 1.0:

    <xsl:template match="/">
      <xsl:for-each select="document('bib.xml')//book">
        <xsl:variable name="b" select="." />
        <xsl:if test="$b/publ='MK' and $b/year > 1998">
          <book year="{$b/year}">
            <xsl:copy-of select="$b/title" /> </book>
        </xsl:if>
      </xsl:for-each>
    </xsl:template>

Page 83: 5 Querying XML

XSLT for FLWOR Expressions

The sketched simulation is not complete:
– Only two things, roughly, can be done with XSLT 1.0 result tree fragments produced by templates:
  » insertion in result tree
    - with <xsl:copy-of select="$X" />
  » and conversion to a string
    - with <xsl:value-of select="$X" />
– Not possible to apply other operations to results (like, e.g., sorting in XQuery):

    for $y in (<a><key>{$x/code}</key></a>, ...)
    order by $y/key

Page 84: 5 Querying XML

Using XQuery for Problem Solving

XQuery has features which make it a potential tool for experimental problem solving
– Compositionality, flexible combining of expressions
– Easy manipulation of sequences
– FLWOR expressions for non-deterministic search
– General comparisons
– XML/XPath trees and recursion for generic data representation and repetition

P. Kilpeläinen, Manuscript 2011

Page 85: 5 Querying XML

Filtering Integer Sequences

XQuery yields simple solutions to some arithmetic problems
Example: "Add numbers below one thousand that are multiples of 3 or 5" [from http://projecteuler.net]

    sum( (1 to 999)[. mod 3 eq 0 or . mod 5 eq 0] )

Generation of Prime Numbers considered earlier
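For comparison, the same filter-then-sum idea written as a Python generator expression (an illustration only; the variable name is arbitrary):

```python
# models: sum( (1 to 999)[. mod 3 eq 0 or . mod 5 eq 0] )
multiples_sum = sum(n for n in range(1, 1000) if n % 3 == 0 or n % 5 == 0)
print(multiples_sum)  # prints 233168
```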

Page 86: 5 Querying XML

Solving "Tricky Triangles"

A puzzle of 9 cards with animal figures:

™ Dan Gilbert Art Group

Page 87: 5 Querying XML

Modeling the Puzzle

Matching parts represented as opposite numbers:

    Dolphin       Head   Front   Rear   Tail
    Dark blue     1      -1      2      -2
    Light blue    3      -3      4      -4
    Big green     NA     5       -5     NA
    Small green   6      -6      NA     NA
    White         7      -7      8      -8

Cards represented by empty elements, and figures on their sides by attributes (in clock-wise order):

Page 88: 5 Querying XML

Representing Puzzle Cards

Introduce card elements:

    declare variable $cards := (
      <card side1="1" side2="7" side3="6" />,
      <card side1="7" side2="-2" side3="6" />,
      <card side1="-4" side2="-6" side3="5" />,
      <card side1="-1" side2="-5" side3="2" />,
      <card side1="-2" side2="-6" side3="5" />,
      <card side1="-1" side2="-8" side3="2" />,
      <card side1="4" side2="-3" side3="6" />,
      <card side1="8" side2="3" side3="-7" />,
      <card side1="8" side2="-5" side3="-7" /> );

Page 89: 5 Querying XML

Generating Rotations of Cards

Three turns of a $card created by a function:

    declare function tr:rotations($card as element(card))
        as element(card) {
      <card>
        <turn side1="{$card/@side1}" side2="{$card/@side2}"
              side3="{$card/@side3}"/>
        <turn side1="{$card/@side3}" side2="{$card/@side1}"
              side3="{$card/@side2}"/>
        <turn side1="{$card/@side2}" side2="{$card/@side3}"
              side3="{$card/@side1}"/>
      </card> };
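The three turns are cyclic shifts of the side values. A Python sketch with a card modelled as a plain 3-tuple (an assumption of this sketch, not the XML representation used above):

```python
def rotations(card):
    # card = (side1, side2, side3), listed in clock-wise order
    s1, s2, s3 = card
    return [(s1, s2, s3),   # as given
            (s3, s1, s2),   # rotated by one side
            (s2, s3, s1)]   # rotated by two sides


print(rotations((1, 7, 6)))
```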

Page 90: 5 Querying XML

Invoking the solution function

Card rotations first generated, then passed to a solution function:

    let $cardsRotated := $cards/tr:rotations(.)
    return tr:solutions($cardsRotated)

The function uses 9 variables for card slots:

Page 91: 5 Querying XML

Solution function for the puzzle

Possible turns selected from remaining cards, in a careful order of slots:

    declare function tr:solutions($cards as element(card)+)
        as element(solution)* {
      (: Start from the top card, $c1: :)
      for $c1 in $cards/turn
      let $cards := $cards except $c1/parent::card

      (: Try matching turns for the mid-card of the 2nd row: :)
      for $c3 in $cards/turn[@side1 = -$c1/@side2]
      let $cards := $cards except $c3/parent::card

NB let clauses for simulating assignment statements

Page 92: 5 Querying XML

Solution function (cont)

Cards selected in a similar manner, until …

      (: … VARIABLES UP TO THE LAST TWO EXCLUDED … :)

      for $c8 in $cards/turn[@side1 = -$c4/@side2 and
                             @side3 = -$c7/@side1]
      let $cards := $cards except $c8/parent::card

      for $c9 in $cards/turn[@side3 = -$c8/@side2]
      return <solution>{
        $c1, $c2, $c3, $c4, $c5, $c6, $c7, $c8, $c9
      }</solution>
    };

Page 93: 5 Querying XML

Solutions

The program finds two unique solutions:

    <solution>
      <turn side1="6" side2="1" side3="7" />
      <turn side1="-2" side2="-6" side3="5" />
      <turn side1="-1" side2="-8" side3="2" />
      <turn side1="-5" side2="-7" side3="8" />
      <turn side1="3" side2="-7" side3="8" />
      <turn side1="6" side2="4" side3="-3" />
      <turn side1="-6" side2="5" side3="-4" />
      <turn side1="7" side2="-2" side3="6" />
      <turn side1="-1" side2="-5" side3="2" />
    </solution>

    <solution>
      <turn side1="-1" side2="-8" side3="2" />
      <turn side1="7" side2="6" side3="1" />
      <turn side1="8" side2="-5" side3="-7" />
      <turn side1="-2" side2="-6" side3="5" />
      <turn side1="4" side2="-3" side3="6" />
      <turn side1="-6" side2="5" side3="-4" />
      <turn side1="2" side2="-1" side3="-5" />
      <turn side1="6" side2="7" side3="-2" />
      <turn side1="8" side2="3" side3="-7" />
    </solution>

Page 94: 5 Querying XML

Solutions

Corresponding arrangements:

Well, actually more …

Page 95: 5 Querying XML

Avoiding multiple solutions

Each solution is reported three times, as a rotation of the entire puzzle

No obvious solution to avoid them
– would be easy, had the puzzle a slot at the center of symmetry

Page 96: 5 Querying XML

SDPL 2011 5. Querying XML with XQuery 96

Efficiency

Solutions computed in 0.8 sec and 30 MB (with Saxon-HE)

Careful order of card selections relevant for restricting the number of combinations

The number of brute-force combinations is high:
– 9 x 3 = 27 for the 1st card, 27 x (8 x 3) = 648 for the first two, etc.

Number of ways of choosing the first i cards:

Strategy      1st  2nd  3rd     4th      5th     6th     7th     8th     9th
Brute-force   27   648  13,608  244,944  4x10^6  4x10^7  4x10^8  2x10^9  7x10^9
Careful       27   48   81      129      138     135     123     12      6
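
The brute-force row can be reproduced with a short recursive function. This is a minimal sketch and not part of the original solver: the name local:bruteForce is an assumption, and the function only counts placements without checking any constraints (the k-th card is one of the 10 - k remaining cards, in any of 3 rotations).

```xquery
xquery version "1.0";

(: Hypothetical helper, not from the original slides:
   number of ways to choose and rotate the first $i cards
   when no matching constraints are checked. :)
declare function local:bruteForce( $i as xs:integer ) as xs:integer {
  if ($i eq 0) then 1
  else local:bruteForce($i - 1) * (10 - $i) * 3
};

for $i in 1 to 9
return local:bruteForce($i)
```

The query should yield 27, 648, 13608, 244944, 3674160, 44089920, 396809280, 2380855680, 7142567040, which agrees with the rounded entries of the brute-force row.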

Page 97: 5 Querying XML

Solving Sudoku Puzzles

Discussion of an XQuery Sudoku solver

Example: Inkala's "AI Escargot" Sudoku and its input representation:

Internal representation of cells:

Start by loading the cells of a board:

let $cells := sudo:preprocess( doc($SudoDoc)//row )
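
Judging from the function sudo:preprocess, each row element appears to hold nine comma-separated numbers, with 0 marking an empty cell, and each internal cell carries rowNum, colNum, val and box attributes. The following is a sketch under those assumptions; the sample row content and the fixed row number are illustrative only.

```xquery
xquery version "1.0";

(: Illustrative row in the assumed input format:
   nine comma-separated values, 0 = empty cell :)
let $row := <row>1, 0, 0, 0, 0, 7, 0, 9, 0</row>
let $rowNum := 1
for $val at $colNum in tokenize(string($row), ",\s*")
return
  <cell rowNum="{$rowNum}" colNum="{$colNum}" val="{xs:integer($val)}"
        box="{(($rowNum - 1) idiv 3)*3 + ($colNum - 1) idiv 3 + 1}"/>
```

This produces nine cell elements; the sixth, for example, would be a cell with rowNum 1, colNum 6, val 7 and box 2.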

Page 98: 5 Querying XML

Loading Sudoku Boards

declare namespace sudo="http://www.uef.fi/cs/XQueryTesting/sudoku";
declare variable $SudoDoc external;

declare function sudo:preprocess( $rows as element(row)+ )
    as element(cell)+ {
  (: Return cells with numbers for row, col and box :)
  for $row at $rowNum in $rows
  let $colContents := tokenize(string($row), ",\s*")
  for $colCont at $colNum in $colContents
  return <cell rowNum="{$rowNum}" colNum="{$colNum}"
               val="{xs:integer($colCont)}"
               box="{sudo:boxNum($rowNum, $colNum)}" /> };

(Generation of box numbers; see next)

Page 99: 5 Querying XML

Generating box numbers for Sudoku cells

declare function sudo:boxNum( $rowNum as xs:integer,
                              $colNum as xs:integer )
    as xs:integer { (: box number in 1,2,...,9 :)
  (($rowNum - 1) idiv 3)*3 + ($colNum - 1) idiv 3 + 1
};
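
As a quick check of the formula, the sketch below evaluates it for one representative cell from each box. The function is a local copy under the assumed name local:boxNum, so that the query runs on its own.

```xquery
xquery version "1.0";

(: Local copy of the box-number formula, for a standalone check :)
declare function local:boxNum( $rowNum as xs:integer,
                               $colNum as xs:integer ) as xs:integer {
  (($rowNum - 1) idiv 3)*3 + ($colNum - 1) idiv 3 + 1
};

(: one representative cell from each of the nine boxes :)
for $r in (1, 5, 9), $c in (1, 5, 9)
return local:boxNum($r, $c)
```

The result should be 1, 2, ..., 9: boxes are numbered row by row, from left to right and top to bottom.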

Page 100: 5 Querying XML

Invoking the solution function

After preprocessing, pass free cells and fixed cells to a solution function, and display its solutions:

let $freeCells := $cells[@val = 0],
    $fixedCells := $cells except $freeCells
for $solution in sudo:solution($freeCells, $fixedCells)
return sudo:displayCells($solution/*)

Function solution() returns board elements, whose cells are displayed by displayCells():

Page 101: 5 Querying XML

The Sudoku solution function

declare function sudo:solution( $freeCells as element(cell)*,
                                $fixedCells as element(cell)+ )
    as element(board)* {
  if ( empty($freeCells) ) then (: the board is complete :)
    <board>{ $fixedCells }</board>
  else
    let $cell := $freeCells[1] (: pick any unfilled cell :)
    let $thisRow := $fixedCells[@rowNum eq $cell/@rowNum],
        $thisCol := $fixedCells[@colNum eq $cell/@colNum],
        $thisBox := $fixedCells[@box eq $cell/@box]
    let $forbiddenVals := ($thisRow | $thisCol | $thisBox)/@val
    for $val in (1 to 9)[not(. = $forbiddenVals)]
    let $fixedCells2 := ( $fixedCells,
        <cell val="{$val}">{ $cell/@*[name() ne "val"] }</cell> )
    return sudo:solution($freeCells except $cell, $fixedCells2) };
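
The candidate values of a cell are selected with a general comparison inside a predicate. A minimal sketch of that step in isolation, assuming a sample sequence of integers in place of the @val attributes collected by the solver:

```xquery
xquery version "1.0";

(: Assumed sample: values already fixed in the row, column and box of a cell :)
let $forbiddenVals := (1, 7, 9, 3, 6, 1)
(: "=" is a general comparison: true if . equals ANY item of the sequence :)
return (1 to 9)[not(. = $forbiddenVals)]
```

Here the result should be 2, 4, 5, 8. When no candidate remains, the filtered sequence is empty, the for clause iterates zero times, and that branch of the recursion returns the empty sequence; the search backtracks without any explicit control code.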

Page 102: 5 Querying XML

Displaying a solved board

declare variable $line-btw-boxes := "---------------------&#10;";

declare function sudo:displayCells( $cells as element(cell)+ )
    as xs:string+ {
  for $rowNum in (1 to 9)
  return ( sudo:displayRow($cells[@rowNum = $rowNum]),
           if ($rowNum eq 3 or $rowNum eq 6) then $line-btw-boxes
           else () ) };

declare function sudo:displayRow( $cells as element(cell)+ )
    as xs:string+ {
  for $colNum in (1 to 9)
  return ( $cells[@colNum = $colNum]/@val, (: + bar btw boxes: :)
           if ($colNum eq 3 or $colNum eq 6) then "|" else () ),
  "&#10;" };

Page 103: 5 Querying XML

Heuristic Optimization

Intuitively useful to fill the most constrained cells first

For this, compute the number of constraints for a $cell:

declare function sudo:numOfConstraints(
    $cell as element(cell), $fixedCells as element(cell)+ )
    as xs:integer {
  let $neighbors := $fixedCells[@rowNum eq $cell/@rowNum or
                                @colNum eq $cell/@colNum or
                                @box eq $cell/@box]
  return count( distinct-values($neighbors/@val) ) };

Use this function to choose maximally constrained free cells:

Page 104: 5 Querying XML

Heuristic Optimization (2)

declare function sudo:mostConstrainedFreeCells(
    $freeCells as element(cell)+, $fixedCells as element(cell)+ )
    as element(cell)* { (: Maximally constrained free cells: :)
  let $maxNumOfConstraints :=
      max( for $cell in $freeCells
           return sudo:numOfConstraints($cell, $fixedCells) )
  for $cell in $freeCells
  where sudo:numOfConstraints($cell, $fixedCells) eq $maxNumOfConstraints
  return $cell };

Use in function solution():

let $cell := (: pick a maximally constrained cell :)
    sudo:mostConstrainedFreeCells($freeCells, $fixedCells)[1]
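
The selection is a two-pass "argmax": first compute the maximum, then keep the items that reach it, and let [1] break ties. A small standalone sketch of the pattern, using hypothetical cell elements whose constraint count is already stored in an assumed attribute n:

```xquery
xquery version "1.0";

(: Hypothetical data: free cells with a precomputed constraint count n :)
let $cells := ( <cell id="a" n="3"/>, <cell id="b" n="7"/>,
                <cell id="c" n="7"/>, <cell id="d" n="5"/> )
(: pass 1: the maximum count :)
let $max := max( for $c in $cells return xs:integer($c/@n) )
(: pass 2: all cells reaching the maximum; [1] picks the first of them :)
return string( $cells[xs:integer(@n) eq $max][1]/@id )
```

Cells b and c both reach the maximum of 7, and the first in sequence order, b, is picked. In the solver the count is recomputed on every call, since it depends on the current fixed cells.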

Page 105: 5 Querying XML

XQuery Sudoku Efficiency

Efficiency, and the effect of the optimization, varies by puzzle instance:

              unoptimized         with heuristic
Puzzle        Time      Memory    Time    Memory
Easy004       1.1 s     36 MB     +20%    +33%
Hard004       2.9 s     146 MB    -45%    -44%
AI Escargot   4.0 s     270 MB    +20%    +25%
Fiendish 2    6.9 s     360 MB    -40%    -5%
Inkala '10    7.1 s     360 MB    -40%    -3%
Minimal 17    145.3 s   400 MB    -40%    -5%
Double 16     158.5 s   410 MB    -85%    -15%

Page 106: 5 Querying XML

XQuery: Summary

– A recent W3C XML query language, also capable of general XML processing

– Vendor support??
» http://www.w3.org/XML/Query mentions 50+ prototypes or products (2004: ~30, 2005: ~40; free, commercial, ... Oracle, IBM DB2, MS SQL Server; native XML databases, ...)

– Future??
» Promising confluence of document and database research
» High potential for XML-based data integration

