48
1 Conditional XPath, the first order complete XPath dialect Maarten Marx Presented by: Einav Bar- Ner

Conditional XPath, the first order complete XPath dialect

  • Upload
    halen

  • View
    78

  • Download
    3

Embed Size (px)

DESCRIPTION

Conditional XPath, the first order complete XPath dialect. Maarten Marx Presented by: Einav Bar-Ner. Introduction. XPath 1.0 is a variable free language used for selecting nodes from XML documents. What is missing? - PowerPoint PPT Presentation

Citation preview

Page 1: Conditional XPath, the first order complete XPath dialect

1

Conditional XPath, the first order complete XPath dialect

Maarten MarxPresented by: Einav Bar-Ner

Page 2: Conditional XPath, the first order complete XPath dialect

2

Introduction

XPath 1.0 is a variable free language used for selecting nodes from XML documents.

What is missing?Core XPath (xpath query which interpreted

on XML document tree models) cannot express queries with conditional paths as exemplified by “do a child step, while test is true at the resulting node.”

Page 3: Conditional XPath, the first order complete XPath dialect

3

Introduction – Cont.

Solution:We add conditional axis relations to Core XPath

and show that the resulting language, called conditional XPath, is equally expressive as first-order logic when interpreted on ordered trees.

Specifically, conditional XPath extends Core XPath with the until operator.

This paper introduces and motivates this addition.

Page 4: Conditional XPath, the first order complete XPath dialect

4

XPath 1.0 VS. XPath 2.0

XPath 2.0 contains variables which are used in if–then–else, for, and quantified expressions. The available axis relations are the same in both versions.

Page 5: Conditional XPath, the first order complete XPath dialect

5

XPath 1.0 VS. XPath 2.0

As XPath 2.0 adds variables and

first order quantifiers it is natural to ask whether XPath 1.0 is already expressively complete with respect to first order logic.

Page 6: Conditional XPath, the first order complete XPath dialect

6

XPath VS. First Order Logic

More precisely, is every first order

query Φ(x) which selects a set of nodes from an XML document model equivalent to an XPath expression?

As we will see, it is not, but a simple and natural addition is sufficient.

Page 7: Conditional XPath, the first order complete XPath dialect

7

Lecture’s Topics

1. Introduction + motivation

2. A brief introduction to XPath, CXPath

3. Proving the expressively completeness of CXPath

Page 8: Conditional XPath, the first order complete XPath dialect

8

“Until–like” query

The answer set of “until-like” query consists of all nodes n’ satisfying the statement:

“n’ is a descendant of n, the label of n’ is A, and for all z, if z is a descendant of n and n’ is a descendant of z, then the label of z is A.”

This query can be expressed by (child :: A)+, which we call a conditional axis.

Page 9: Conditional XPath, the first order complete XPath dialect

9

Conditional XPath (CXPath) vs. Core XPath

While CXPath is more expressive than Core XPath, in all other aspects it is very conservative.

CXPath has the same syntax as XPath 1.0 and Core XPath and the same (standard W3C) semantics.

Page 10: Conditional XPath, the first order complete XPath dialect

10

A Brief Introduction to XPath – Cont.

The primitive axis of XPath are

and those of CXPath are

,,,

fexprfexprfexprfexpr ,,,,,,,

Page 11: Conditional XPath, the first order complete XPath dialect

11

A Brief Introduction to XPathDefinition 1. The syntax of the XPath languages XCore and

CXPath is defined by the grammarLocpath ::= axis ‘::’ntst | axis ‘::’ ntst ‘[’fexpr‘]’

| ‘/’locpath | locpath ‘/’ locpath |

locpath ‘ | ’ locpath fexpr ::= locpath | not fexpr | fexpr and

fexpr | fexpr or fexpraxis ::= self | primitive_axis |

primitive_axis+ | primitive_axis*.

Page 12: Conditional XPath, the first order complete XPath dialect

12

A Brief Introduction to XPath – Cont. “locpath” is the start production “axis” – denotes axis relations e.g.

descendant, ancestor. “ntst” denotes tags labeling document nodes

or the star ‘*’ that matches all tags (these are called node tests).

“fexpr” will be called filter expressions after their use as filters in location paths.

With an XPath expression we always mean a “locpath”.

Page 13: Conditional XPath, the first order complete XPath dialect

13

A Brief Introduction to XPath – Cont.

Interpretation of the axis relations:

- Assigns a boolean value: “true” if

the node n satisfies fexpr, else false.

Given a tree T and an expression A, the meaning of A in T is written as [[A]]T.

Page 14: Conditional XPath, the first order complete XPath dialect

14

The Semantics of XCore and CXPath

Page 15: Conditional XPath, the first order complete XPath dialect

15

Example

Consider the following information need: give elements whose next element in document order has tag A. This can be expressed in first order logic by

in which << abbreviates descendant or ancestor_or_self/following_sibling/desc-endant_or_self.

)()( yzxzyAyxy

Page 16: Conditional XPath, the first order complete XPath dialect

16

Example – Cont.

In the following case, answer ser will not contain x.

x

z

y A

Page 17: Conditional XPath, the first order complete XPath dialect

17

Example – Cont.To “program” this information in navigational

XPath we seem to need to express the “next in document order” relation. To do this we use a few macros: first, last and leaf abbreviate

, and We also use the converses of the conditional

axis, written as Its meaning is the transitive closure of the relation

Now we can write the “next in document order” relation in conditional XPath by the following case distinction:

last

]::[:: notself ]::[:: notself ]::[:: notself

::/][:: lastself

::/::|::/][::|][:: lastleafselffirst

Page 18: Conditional XPath, the first order complete XPath dialect

18

What kind of queries does XPath express?

The answer set of locpath evaluated on a tree R (notation: answerR(locpath)) is the set

{n Є R| there exists an m, (m,n) Є

[locpath]R}

Thus for each expression A, answerR(A) equals

./*::/ * AanswerR

Page 19: Conditional XPath, the first order complete XPath dialect

19

Motivation

We will show that for every first order query Φ(x) there exists an absolute CXPath expression A which is equivalent in the following strong sense: for each tree R, for each node n, R|=Φ(n)

if and only if n Є answerR(A).

Page 20: Conditional XPath, the first order complete XPath dialect

20

Conditional XPath and First order logic

Let be the first order language with two binary relation symbols < and and countably many unary predicates P, Q, . . . L

is interpreted on node labeled sibling ordered trees in the obvious manner: < is interpreted as the descendant relation , as the strict total order on the siblings, and the unary predicates P as the sets of nodes labeled with P.

R

R

Page 21: Conditional XPath, the first order complete XPath dialect

21

Conditional XPath and First order logic – Cont.

Theorem 4. Every formula in one free variable is, on ordered trees, equivalent to an CXPath filter expression.

Meaning: For each filter expression fexpr

εR(n, fexpr) is true if and only if n Є

Note that we do not make a restriction to finite trees. The result holds for the class of all trees.

])fexpr*[::(/ *Ranswer

Page 22: Conditional XPath, the first order complete XPath dialect

22

Defining Xuntil

Define Xuntil as the propositional modal language with four binary until–like modal operators

The syntax is given by the grammar

The pi are propositional variables and

,,,

Page 23: Conditional XPath, the first order complete XPath dialect

23

Defining Xuntil – Cont.

Formally, a model R is a structure (T, h), with T a tree and h an assignment function from the set of propositional variables to the powerset of the set of tree nodes.

Truth of a formula is defined relative to a model R and a node n in that model via the following recursive definition:

Page 24: Conditional XPath, the first order complete XPath dialect

24

Example

h(p) = {n1’’, n2’’, n3’’}, h(q) = {n’} Φ = q, ψ = p, n |= (q, p)

n n1’’ n2’’ n3’’ n’

p p p q

Page 25: Conditional XPath, the first order complete XPath dialect

25

Xuntil and CXPath

Proposition 6. Every Xuntil formula is, on ordered

trees,equivalent to an CXPath filter

expression.

Page 26: Conditional XPath, the first order complete XPath dialect

26

proof of Theorem 4

Instead of proving that Theorem directly we show expressive completeness of Xuntil , which is sufficient by Proposition 6.

We say that Xuntil is expressively completeif for every formula Φ(x) there exists an

Xunti formula σ such that for every tree model R, for all nodes n in R,

R|=Φ(n) if and only if R,n |= Φ.The key idea of the proof is the brilliant

notion of separation.

Page 27: Conditional XPath, the first order complete XPath dialect

27

SeparationLet T be a tree and a node t є T. Define the following

partition on T:

Now let h, h’ be two assignments. We say that h, h’ agree on the future of t iff for any atom q and any s є future(t), s є h(q) iff s є h’(q).

We similarly define this notion for the present, past, left and right.

Page 28: Conditional XPath, the first order complete XPath dialect

28

Example

h and h’ agree on the future of t:h: h’:

t

s1 s2

t

s1 s2

q1

q2 q3

q2

q2 q3

Page 29: Conditional XPath, the first order complete XPath dialect

29

Example – Cont.

h and h’ do not agree on the future of t:

h: h’:

t

s1 s2

t

s1 s2

q1

q2 q3

q2

q2 q3, q4

Page 30: Conditional XPath, the first order complete XPath dialect

30

Separation – Cont.

We say that a wff A is a pure future wff iff for each tree T, for all t є T, for all assignments h, h’, if h, h’ agree on the future of t, then t, h |= A iff t, h’ |= A.

Similarly, we define pure present, past, left and

right wffs.

We say that a wff A is separable iff there exists a wff which is a boolean combination of pure present, future, past, left and right wffs and is equivalent to A everywhere on any tree.

Page 31: Conditional XPath, the first order complete XPath dialect

31

Example – Not a Pure Future wff

h: h’:

s1q1

q2 q3s2

t

s4

s3 q4

q5

s2

t

s4

s3 q4

q6

q2q1

s1 s2s1 q3

s5 s5q4 q4

( q4, q5)

Page 32: Conditional XPath, the first order complete XPath dialect

32

Example –Pure Future wff

h: h’:

s1q1

q2 q3s2

t

s4

s3 q4

q5

s2

t

s4

s3 q4

q6

q2q1

s1 s2s1 q3

s5 s5q4 q4

( q2, q4)

Page 33: Conditional XPath, the first order complete XPath dialect

33

Separation – Cont.

Theorem 7. If every Xuntil wff is separable over trees,then Xuntil is expressively complete.

Theorem 8. Each Xuntil wff is, over trees, separable.

Corollary:Xuntil is expressively complete.

Page 34: Conditional XPath, the first order complete XPath dialect

34

Separation – Cont.

CXPath Xuntil Separable

Expressively Complete

Page 35: Conditional XPath, the first order complete XPath dialect

35

Separation – Cont.

The next lemma describes a syntactic criterion for pure wffs.

Theorem 8 is shown by rewriting each wff into a boolean combination of wffs satisfying these syntactic criteria.

Page 36: Conditional XPath, the first order complete XPath dialect

36

Separation – Cont.Lemma 9.

1. Each boolean combination of atoms is a pure present wff.

2. Each boolean combination of wffs in whose scope occur only atoms, , and wffs is a pure future wff.

3. Each boolean combination of wffs in whose scope occur only atoms, wffs and pure future wffs is a pure right wff.

4. Each boolean combination of wffs in whose scope occur only atoms, wffs and pure future wffs is a pure left wff.

5. Each boolean combination of wffs in whose scope occur only atoms, wffs and pure left and right wffs is a pure past wff.

Page 37: Conditional XPath, the first order complete XPath dialect

37

Separating Formulas

We shall describe a syntactic procedure for separating each wff into a boolean combination of wffs of the form described in Lemma 9. Then that Lemma yields the theorem.

Page 38: Conditional XPath, the first order complete XPath dialect

38

Separating Formulas – Cont.

what do we need to do?1. pull out wffs from under the

scope of wffs, and conversely;

2. pull out wffs from under the scope of , and wffs;

3. pull out wffs from under the scope of wffs.

Page 39: Conditional XPath, the first order complete XPath dialect

39

Separating Formulas – Cont.

Lemma 10. The following are valid on all structures, for any of the four orientations

Page 40: Conditional XPath, the first order complete XPath dialect

40

Creating Macros for the Unary Connectives.

For one of the four arrows: abbreviates abbreviates T) abbreviates corresponds to the “next” operator. corresponds to its transitive

closure. expresses that everywhere below

A holds.

),( A

,(A

A

A

A

A

A

Page 41: Conditional XPath, the first order complete XPath dialect

41

Separating Formulas – Cont.

Call a wff horizontal if it does not contain and wffs.

Theorem 12 (Gabbay). Each horizontal wff is equivalent to a wff in

which no wff occurs in the scope of a wff and conversely.

The next lemma allows us to bring wffs out of the scope of horizontal wffs.

Page 42: Conditional XPath, the first order complete XPath dialect

42

Separating Formulas – Cont.

Lemma 13. Let a, q, A, B be arbitrary wffs. The following are valid over trees. They are also valid if is replaced everywhere by

Page 43: Conditional XPath, the first order complete XPath dialect

43

Separating Formulas – Cont.

Proof of Lemma 13. The equivalences follow from theobservation that if then for any if and only if

),( BA,yxR

),(| BAx ),(| BAy

Page 44: Conditional XPath, the first order complete XPath dialect

44

Separating Formulas – Cont.

The only cases left are wffs in the scope of wffs and conversely.

Lemma 14. Let a, q, A and B be atoms. Consider the followings wffs:

Page 45: Conditional XPath, the first order complete XPath dialect

45

Separating Formulas – Cont.

Each of the above wffs is equivalent, over ordered trees, to another wff in which the only appearances of the connective are as + (A, B) and if an appearance of that wff is in the scope of , then (A, B) is in the scope of a or a wff (which itself is in the scope of the ).

Page 46: Conditional XPath, the first order complete XPath dialect

46

The final step.

We now know the basic steps of the separation proof. We simply keep pulling out ’s from under the scope of ’s etcetera until there are no more. Given a wff A, this process will eventually lead to a syntactically separated wff.

Page 47: Conditional XPath, the first order complete XPath dialect

47

COMPLEXITY

Query evaluation for Core XPath can be done in time O( |D| · |Q| ), with |D| the size of the data and |Q| the size of the query.

CXPath is more expressive, but the upper bound remains.

Page 48: Conditional XPath, the first order complete XPath dialect

48

CONCLUSION

We defined an easy to use, variable free XPath dialect which is expressively complete with respect to first order logic when interpreted on ordered trees.

According to Marx, the lack of variables is one of the reasons for the success of XPath, so it is nice to know that they are not needed for expressivity reasons.

Besides that, query evaluation for CXPath can still be done in linear time.