The Complexity of XPath Evaluation. Paper by: Georg Gottlob, Christoph Koch, Reinhard Pichler. Presented by: Royi Ronen

The Complexity of XPath Evaluation

Paper By: Georg Gottlob, Christoph Koch, Reinhard Pichler

Presented By: Royi Ronen

Introduction

• All major XPath evaluation algorithms run in exponential time in the worst case.

• Paper’s main goals:
  – Prove that the “XPath problem” is P-complete.
  – Prove that other related problems are LOGCFL-complete.

XPath – Quick Reminder

• XPath is a query language for XML documents.

• Navigating through a document:

/descendant::a/child::b selects the nodes named “b” whose parent is a node named “a”.

• Testing nodes:

/descendant::a/child::b[@c=3] additionally requires that the attribute c of b equals 3.
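The two queries above can be tried out with a minimal sketch in Python. It assumes the standard library module xml.etree.ElementTree, which supports only a limited XPath subset: the axis names are written in abbreviated form (.//a/b instead of /descendant::a/child::b) and the attribute test compares strings ('3') rather than numbers. The sample document is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; only the names a, b and c come from the slide.
DOC = """
<doc>
  <a><b c="3">first</b><b c="4">second</b></a>
  <x><b c="3">not under a</b></x>
  <a><y><b c="3">grandchild, not child</b></y></a>
</doc>
"""

root = ET.fromstring(DOC)

# Abbreviated form of /descendant::a/child::b
all_b_under_a = [b.text for b in root.findall(".//a/b")]

# Abbreviated form of /descendant::a/child::b[@c=3] (string comparison here)
b_with_c3 = [b.text for b in root.findall(".//a/b[@c='3']")]

print(all_b_under_a)
print(b_with_c3)
```

Only the b elements that are direct children of an a element are selected; the b below x and the b below y are skipped.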

Sketch: How P-Completeness is proven

• In order to prove P-Completeness of a problem, we have to prove:
  – Membership in P;
  – P-Hardness.

[Figure: Venn diagram in which P-Complete is the intersection of P and P-Hard.]

XPath is P-Complete

• Sketch:
  1. Membership of XPath in P is already proven (by the same authors).
  2. P-Hardness of XPath will be proven by reduction from the monotone circuit problem (which is known to be P-Complete) to Core XPath (a subset of XPath with its main features). Why is it enough? Because every Core XPath query is also an XPath query, so hardness of the subset carries over to full XPath.

Monotone Boolean Circuit Problem

• A Monotone Boolean circuit is a circuit with many inputs and one output that uses the following Boolean gates only:
  – AND
  – OR
  – DUMMY

• Given a circuit and values for its inputs, the problem is to determine the value of the output.

• The problem is P-Complete.
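The circuit value problem described above can be sketched in a few lines of Python. The encoding of a circuit as a list of (kind, input_numbers) pairs is an assumption made for this sketch, as is treating a DUMMY gate as a gate that passes its single input through.

```python
def evaluate_monotone_circuit(inputs, gates):
    """Return the value of the last gate of a monotone circuit.

    inputs: list of booleans, the values of gates G1..GM.
    gates:  list of (kind, input_numbers) for gates G(M+1)..G(M+N), where kind
            is "and", "or" or "dummy" and input_numbers are 1-based numbers of
            earlier gates (hypothetical encoding chosen for this sketch).
    """
    values = list(inputs)
    for kind, input_numbers in gates:
        operands = [values[i - 1] for i in input_numbers]
        if kind == "and":
            values.append(all(operands))
        elif kind in ("or", "dummy"):
            # A dummy gate has a single input, so any() just copies it.
            values.append(any(operands))
        else:
            raise ValueError(f"not a monotone gate: {kind}")
    return values[-1]


# G5 = G1 and G2, G6 = G3 or G4, G7 = dummy(G6), G8 = G5 and G7
example_gates = [("and", [1, 2]), ("or", [3, 4]), ("dummy", [6]), ("and", [5, 7])]
print(evaluate_monotone_circuit([True, True, False, True], example_gates))
```

The gates are processed in index order, which works because data flows only from lower to higher gate numbers.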

A Monotone Boolean Circuit

• Item 3 in the handout:

Core XPath - Definition

XPath has many features and is inconvenient for theoretical treatment. Therefore Core XPath, a subset of XPath with its main features, is defined by the following grammar (Item 1 in the handout):

locpath ::= '/' locpath | locpath '/' locpath | locpath '|' locpath | locstep.

locstep ::= axis '::' ntst '[' bexpr ']' . . . '[' bexpr ']'.

bexpr ::= bexpr 'and' bexpr | bexpr 'or' bexpr | 'not(' bexpr ')' | locpath.

axis ::= 'self' | 'child' | 'parent' | 'descendant' | 'descendant-or-self' |
         'ancestor' | 'ancestor-or-self' | 'following' | 'following-sibling' |
         'preceding' | 'preceding-sibling'.

The Corresponding Languages

• The paper shows direct reductions between the problems.

• We will show the same reduction, but between the corresponding languages, since it is the methodology used in the Technion Computability course.

• The proofs are equivalent.

The Corresponding Languages

• L-Core XPath:

{(Q,D) | Q is a Core XPath query, D is a valid document and Q yields a non-empty result when run on D}

• L-Monotone Circuit:

{(C,I) | C is a monotone circuit, I is a set of inputs to C and C evaluates to 1 when run on I}

The Reduction

• Reduction is our tool to prove that one language is at least as hard as another.

• Here we will show that L-Monotone Circuit is reducible to L-Core XPath. This proves that L-Core XPath is at least as hard as L-Monotone Circuit, and is therefore P-Hard.

• Given (C,I), we have to build (Q,D) such that Q yields a non-empty result on D iff C evaluates to 1 on I.
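Written as a formula, the requirement on the reduction reads as follows. The name f for the reduction function is hypothetical (it does not appear on the slides); for P-Hardness, f also has to be computable in logarithmic space.

```latex
\[
  f(C, I) = (Q, D), \qquad
  (C, I) \in L_{\text{Monotone Circuit}}
  \iff
  (Q, D) \in L_{\text{Core XPath}}
\]
```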

The circuit layered

• We use an equivalent monotone circuit in which only one non-dummy gate exists in every layer (Item 4 in the handout).

• The gates are ordered: data can flow only from lower-indexed to higher-indexed gates.

Q and D

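• D is built as follows:

  – The circuit has M inputs (here M=4) and N non-input gates (here N=5).

  – D has a total of 2(M+N)+1 nodes: the root V_0, its children V_1,…,V_{M+N}, and one child V'_i below each V_i.

  – Nodes are tagged from the alphabet {0, 1, R, G, I_i, O_i}, where i is from {1,2,…,N}.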

Tagging Rules

• V_1,…,V_M are each tagged with their input value, i.e. 0 or 1.

• V_{M+N} is tagged R, and every V_i is tagged G (including V_{M+N}).

• If gate G_i is an input to gate G_{M+k} (i < M+k), I_k is added to V_i and O_k is added to V_{M+k}.

• V'_1,…,V'_M are tagged I_i and O_i for every i in {1,…,N}.

• V'_{M+i} is tagged I_k and O_k for every k in {i,…,N}.

These tags will be used by the query.

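A Simple Example

[Figure: a circuit C with a single non-input gate, inputs 1 and 0 and output O1, next to the corresponding document tree D. D has the root V0, the children V1, V2, V3 and the grandchildren V'1, V'2, V'3. V1, V2 and V3 are tagged G; V1 and V2 also carry I1 and their input values 1 and 0; V3 also carries O1 and R; each of V'1, V'2, V'3 is tagged I1 and O1.]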

The Query

• The query in the output of the reduction is:

Q := /descendant-or-self::*[T(R) and φ_N]

φ_k := descendant-or-self::*[T(O_k) and parent::*[ψ_k]]

ψ_k := not(child::*[T(I_k) and not(π_k)])   if G_{M+k} is an AND gate

ψ_k := child::*[T(I_k) and π_k]   if G_{M+k} is an OR/DUMMY gate

π_k := ancestor-or-self::*[T(G) and φ_{k-1}]

φ_0 := T(1)   (end of recursion)

• ψ_k evaluates the gate: it selects V_0 iff all (one of) the gate's inputs are (is) 1 and the gate is “AND” (“OR”).

• π_k pushes results down.

• The reduction can be achieved in logarithmic space.

Sub-queries Meaning

π_k := ancestor-or-self::*[T(G) and φ_{k-1}]
Returns the nodes selected in the previous iteration and their tagged children, i.e. pushes results “down” by including the children.

ψ_k := not(child::*[T(I_k) and not(π_k)])
Returns the root iff all the inputs to gate k are true, in an AND gate.

ψ_k := child::*[T(I_k) and π_k]
Returns the root iff at least one of the inputs to gate k is true, in an OR gate. In both cases, also returns the nodes that represent gates that were previously evaluated to true.

φ_k := descendant-or-self::*[T(O_k) and parent::*[ψ_k]]
Includes the node of gate k iff the root was returned by the previous sub-query.

Q := /descendant-or-self::*[T(R) and φ_N]
Returns the rightmost node iff the output gate evaluates to true. (No other gate is tagged R.)
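The construction can be checked on small circuits with the following Python sketch. It is not the authors' code: it builds the tagged tree according to the tagging rules and evaluates the recursive sub-queries directly over that tree instead of calling an XPath engine. The function names phi, psi and pi stand for the three mutually recursive sub-queries, and the circuit encoding (a list of (kind, input_numbers) pairs with 1-based gate numbers) is an assumption of the sketch.

```python
from functools import lru_cache


def build_document(inputs, gates):
    """Build the tagged tree D: root v0, children v1..v(M+N), one v'i below each vi."""
    m, n = len(inputs), len(gates)
    parent, children, labels = {}, {"v0": []}, {"v0": set()}
    for i in range(1, m + n + 1):
        v, v_prime = f"v{i}", f"v'{i}"
        parent[v], parent[v_prime] = "v0", v
        children["v0"].append(v)
        children[v], children[v_prime] = [v_prime], []
        labels[v] = {"G"}
        # v'1..v'M carry I1..IN, O1..ON; v'(M+i) carries Ik, Ok for k = i..N.
        first_k = 1 if i <= m else i - m
        labels[v_prime] = {f"{t}{k}" for k in range(first_k, n + 1) for t in "IO"}
    for i, value in enumerate(inputs, start=1):
        labels[f"v{i}"].add("1" if value else "0")
    labels[f"v{m + n}"].add("R")
    for k, (_kind, input_numbers) in enumerate(gates, start=1):
        labels[f"v{m + k}"].add(f"O{k}")
        for i in input_numbers:
            labels[f"v{i}"].add(f"I{k}")
    return parent, children, labels


def run_query(inputs, gates):
    """Return the nodes selected by /descendant-or-self::*[T(R) and phi_N]."""
    parent, children, labels = build_document(inputs, gates)

    def descendant_or_self(x):
        yield x
        for child in children[x]:
            yield from descendant_or_self(child)

    def ancestor_or_self(x):
        while x is not None:
            yield x
            x = parent.get(x)

    @lru_cache(maxsize=None)
    def phi(k, x):
        if k == 0:
            return "1" in labels[x]  # end of recursion: T(1)
        return any(f"O{k}" in labels[y] and y in parent and psi(k, parent[y])
                   for y in descendant_or_self(x))

    @lru_cache(maxsize=None)
    def psi(k, x):
        tagged = [c for c in children[x] if f"I{k}" in labels[c]]
        if gates[k - 1][0] == "and":
            return all(pi(k, c) for c in tagged)  # not(child[... and not(pi)])
        return any(pi(k, c) for c in tagged)      # OR and DUMMY gates

    @lru_cache(maxsize=None)
    def pi(k, x):
        return any("G" in labels[y] and phi(k - 1, y) for y in ancestor_or_self(x))

    return [x for x in descendant_or_self("v0")
            if "R" in labels[x] and phi(len(gates), x)]


print(run_query([True, False], [("and", [1, 2])]))
print(run_query([True, False], [("or", [1, 2])]))
```

With inputs 1 and 0 the AND circuit gives an empty result and the OR circuit selects the node tagged R, matching the value of the output gate in each case. Memoizing the sub-queries keeps the evaluation polynomial in the size of the tree and the query.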

The Query - Example

[Figure: the document tree D of the simple example (root V0, nodes V1, V2, V3 tagged G, nodes V'1, V'2, V'3 tagged I1 and O1, inputs 1 and 0, V3 tagged O1 and R).]

Q := /descendant-or-self::*[T(R) and φ_1]

φ_1 := descendant-or-self::*[T(O_1) and parent::*[ψ_1]]

ψ_1 := not(child::*[T(I_1) and not(π_1)])

π_1 := ancestor-or-self::*[T(G) and T(1)]

Discussion

It is enough to show that:

V_i ∈ [[φ_k]] iff G_i evaluates to true

where [[φ_k]] is the set of nodes selected by the k-th sub-query φ_k.

Reason: T(R) is true for the rightmost node only.

If the last gate evaluates to 1, then the result of the query consists of that node, and (Q,D) is in L-Core XPath.

Otherwise, the result is empty, and (Q,D) is not in L-Core XPath.

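Tagged Tree Example

[Figure: the tagged document tree for the circuit C in the handout (four AND gates and one OR gate). Every V_i is tagged G, the input nodes carry their values 0 or 1, the last node is tagged R, and the I_k and O_k tags follow the tagging rules.]

Tags of the V' nodes in the figure:

V'_1 to V'_5: I1-I5, O1-O5
V'_6: I2-I5, O2-O5
V'_7: I3-I5, O3-O5
V'_8: I4-I5, O4-O5
V'_9: I5, O5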

Discussion

• [[φ_k]] consists of the values of the k nodes in layer k of the circuit.

• It can also be viewed as the situation at the k-th tick of a clock in a synchronous system.

• Proof:

V_i ∈ [[φ_k]] iff G_i evaluates to true

Despite P-Completeness

• Problems that are P-Complete are considered inherently sequential: unless P = NC, they cannot benefit significantly from parallelization.

• However, for real-world use, it may be very useful to find subsets of the problem and classify them into lower complexity classes (easier problems).

• Does anyone recall a well known problem that can benefit from such manipulation?

• The paper continues by looking for restricted versions of the problem that fall into lower complexity classes.

First Modification Trial

• Only the axes child, parent and descendant-or-self are allowed.

• The modification doesn’t yield lower complexity. The same reduction will work after changing:

ancestor-or-self::*

to

descendant-or-self::*/parent::*
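One way to see why the replacement is harmless in this reduction (a reading that the slide does not spell out) is that the ancestor-or-self step is only evaluated at non-root nodes and is followed by the test T(G). On the three-level tree of the reduction, both expressions then reach the same G-tagged nodes. The Python sketch below checks this on a small tree; the tree shape and the helper names are assumptions of the sketch.

```python
def build_reduction_tree(n):
    """Root v0, children v1..vn, and one child v'i below each vi."""
    parent, children = {}, {"v0": []}
    for i in range(1, n + 1):
        v, v_prime = f"v{i}", f"v'{i}"
        parent[v], parent[v_prime] = "v0", v
        children["v0"].append(v)
        children[v], children[v_prime] = [v_prime], []
    return parent, children


def ancestor_or_self(node, parent):
    result = set()
    while node is not None:
        result.add(node)
        node = parent.get(node)
    return result


def descendant_or_self(node, children):
    result = {node}
    for child in children[node]:
        result |= descendant_or_self(child, children)
    return result


def descendant_or_self_then_parent(node, parent, children):
    # descendant-or-self::*/parent::*
    return {parent[d] for d in descendant_or_self(node, children) if d in parent}


def tagged_g(nodes, n):
    # In the reduction only v1..vn carry the tag G.
    return nodes & {f"v{i}" for i in range(1, n + 1)}


parent, children = build_reduction_tree(3)
agree_on_non_root = all(
    tagged_g(ancestor_or_self(x, parent), 3)
    == tagged_g(descendant_or_self_then_parent(x, parent, children), 3)
    for x in parent  # every node except the root has a parent entry
)
print(agree_on_non_root)
```

At the root the two expressions differ, so the equivalence is specific to where the sub-query is evaluated, not a general XPath identity.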

Second Modification Trial

• Let Positive Core-XPath be Core-XPath without the queries that use negation.

• This problem is a member of LOGCFL.

• LOGCFL is the class of problems that can be reduced in logarithmic space to a context-free language.

• Being context free embodies the ability to be parallelized: segments do not depend on each other.

• The hardness reduction is very similar. It reduces from the problem of evaluating semi-unbounded circuits.

WF and Positive WF

• WF is a subset of XPath that allows Core-XPath, arithmetic operations, and conditions using position(), last() and constants.

• Where is WF?

• Positive WF is LOGCFL-Complete. The proof of hardness resembles the proof we have just seen.

The Global Picture

BACKUP

PF is NL-Complete

• PF is the problem of navigating through an XML document, with no conditions allowed.

• NL is the class of problems solved by a non-deterministic Turing machine that uses logarithmic space.

• Proof: PF is NL-Complete.
  – Membership in NL (by non-deterministic guessing)
  – NL-Hardness