Answering Queries and Hypertree Decompositions

Conjunctive Queries

The problem BCQ:

Instance: <DB, Q>

Question: Does Q have a nonempty result over DB?

Combined Complexity (Vardi ’82)
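To make the BCQ problem concrete, here is a minimal Python sketch of a naive backtracking evaluator. The representation (a query as a list of atoms, a database as a dictionary of tuple sets) and the name `bcq_nonempty` are assumptions made for illustration, not something taken from the slides. In the worst case it tries every combination of one tuple per atom.

```python
def bcq_nonempty(query, db):
    """Return True if the Boolean conjunctive query has a nonempty result.

    query: list of (relation_name, tuple_of_variable_names)  (assumed encoding)
    db:    dict mapping relation_name -> set of tuples of constants
    """
    def extend(i, binding):
        # All atoms matched: the current binding is a witness.
        if i == len(query):
            return True
        rel, args = query[i]
        for row in db.get(rel, ()):
            if len(row) != len(args):
                continue
            new = dict(binding)
            consistent = True
            for var, val in zip(args, row):
                # setdefault binds a fresh variable or returns its old value.
                if new.setdefault(var, val) != val:
                    consistent = False
                    break
            if consistent and extend(i + 1, new):
                return True
        return False

    return extend(0, {})


db = {"r": {(1, 2), (2, 3)}, "s": {(2, 5)}}
path = [("r", ("X", "Y")), ("s", ("Y", "Z"))]
triangle = [("r", ("X", "Y")), ("r", ("Y", "Z")), ("r", ("Z", "X"))]

print(bcq_nonempty(path, db))      # True: r(1,2), s(2,5)
print(bcq_nonempty(triangle, db))  # False: no directed triangle in r
```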

Problems Equivalent to BCQ

Conjunctive Query Containment: $Q_1 \subseteq Q_2$, i.e., $\forall db:\ Q_1(db) \subseteq Q_2(db)$

Query of Tuple Problem: $t \in Q(db)$?

Constraint Satisfaction in AI

Clause Subsumption in Theorem Proving: $\exists \theta$ s.t. $C\theta \subseteq D$?

BCQ $\equiv$ CSP

Example of CSP: Crossword Puzzle

Complexity of BCQ

NP-complete in the general case (Chandra and Merlin ’77)

NP-hard even for fixed database

Polynomial if Q has an acyclic hypergraph (Yannakakis ’81)

LOGCFL-complete (in $\mathrm{NC}^2$) (G.L.S. ’98)

Interest in larger tractable classes of CQs
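The polynomial bound for acyclic queries rests on semijoin processing along a join tree. The following Python sketch shows only the bottom-up pass needed for the Boolean case. It assumes that a join tree is already given and that no atom repeats a variable. The names `semijoin` and `acyclic_bcq` and the data layout are illustrative choices.

```python
def semijoin(left, right):
    """Keep the rows of `left` that agree with some row of `right`
    on the variables the two relations share."""
    lvars, lrows = left
    rvars, rrows = right
    shared = [v for v in lvars if v in rvars]
    li = [lvars.index(v) for v in shared]
    ri = [rvars.index(v) for v in shared]
    keys = {tuple(row[i] for i in ri) for row in rrows}
    return lvars, {row for row in lrows if tuple(row[i] for i in li) in keys}


def acyclic_bcq(root, children, relations):
    """Bottom-up semijoin pass over a join tree; True iff the query is nonempty.

    children:  dict node -> list of child nodes (join tree, assumed given)
    relations: dict node -> (variables, rows) for the atom at that node
    """
    def reduce_up(node):
        rel = relations[node]
        for child in children.get(node, []):
            rel = semijoin(rel, reduce_up(child))
        return rel

    return len(reduce_up(root)[1]) > 0


# Hypothetical acyclic query r(X,Y), s(Y,Z), t(Z,W) with join tree r - s - t.
relations = {
    "r": (("X", "Y"), {(1, 2), (3, 4)}),
    "s": (("Y", "Z"), {(2, 5), (4, 6)}),
    "t": (("Z", "W"), {(6, 7)}),
}
children = {"s": ["r", "t"]}

print(acyclic_bcq("s", children, relations))  # True: (3,4), (4,6), (6,7)
```

Each relation is scanned a constant number of times per tree edge, which is where the polynomial behaviour comes from. Computing the full answer would need further top-down and join passes that are omitted here.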

Is this query hard?

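$$
\begin{aligned}
ans \leftarrow\; & a(S,X,X',C,F) \wedge b(S,Y,Y',C',F') \wedge c(C,C',Z) \wedge d(X,Z) \\
& \wedge\; e(Y,Z) \wedge f(F,F',Z') \wedge g(X',Z') \wedge h(Y',Z') \\
& \wedge\; j(J,X,Y,X',Y') \wedge p(B,X',F) \wedge q(B',X',F)
\end{aligned}
$$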

n: size of the database

m: number of atoms in the query

• Classical methods have worst-case complexity $O(n^m)$; here m = 11.

• Despite its appearance, this query is nearly acyclic: it can be evaluated in $O(m \cdot n^2 \cdot \log n)$.

Nearly Acyclic Queries

Bounded Treewidth (tw): a measure of the cyclicity of graphs. For queries: tw(Q) = tw(G(Q)).

For fixed k, checking $tw(Q) \le k$ and computing a tree decomposition take linear time (Bodlaender ’96).

Answering BCQs of treewidth k: $O(n^k \log n)$ (Chekuri & Rajaraman ’97); LOGCFL-complete (G.L.S. ’98).

Beyond treewidth

Bounded Degree of Cyclicity (Gyssens & Paredaens ’84)

Bounded Query width (Chekuri & Rajaraman ’97)

Group together query atoms (hyperedges) instead of variables

Hypertree Decomposition

[Figure: a hypertree decomposition whose nodes are labelled with sets of atoms: {p(X,Y,Z), q(U,V,Z)}; {a(X,U,W), b(Y,V,W)}; {p(X,Y,_), c(T,W)}; {d(X,T)}; {c(Y,T)}]

We group atoms: p(X,Y,Z), q(U,V,Z)

We use p(X,Y,Z) partially: p(X,Y,_), c(T,W)

[Figure: a hypertree decomposition of the example query, with node labels:]

• a(S,X,X’,C,F), b(S,Y,Y’,C’,F’)
• j(J,X,Y,X’,Y’)
• j(_,X,Y,_,_), c(C,C’,Z)
• j(_,_,_,X’,Y’), f(F,F’,Z’)
• d(X,Z)
• e(Y,Z)
• h(Y’,Z’)
• g(X’,Z’), f(F,_,Z’)
• p(B,X’,F)
• q(B’,X’,F)

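$$
\begin{aligned}
ans \leftarrow\; & a(S,X,X',C,F) \wedge b(S,Y,Y',C',F') \wedge c(C,C',Z) \wedge d(X,Z) \\
& \wedge\; e(Y,Z) \wedge f(F,F',Z') \wedge g(X',Z') \wedge h(Y',Z') \\
& \wedge\; j(J,X,Y,X',Y') \wedge p(B,X',F) \wedge q(B',X',F)
\end{aligned}
$$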

Connectedness Condition

[Figure: the hypertree decomposition of the example query, shown again to illustrate the connectedness condition]
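The slide names the connectedness condition without spelling it out. In its usual formulation, for every variable, the decomposition nodes whose variable set contains that variable must form a connected subtree. The Python sketch below checks this on a tree given as an edge list. The function name, the input layout, and the tiny example trees are all hypothetical and are not taken from the figure.

```python
from collections import deque


def satisfies_connectedness(chi, edges):
    """chi:   dict node -> set of variables attached to that node
    edges: list of (u, v) pairs, the undirected edges of the tree"""
    adj = {node: set() for node in chi}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    variables = set().union(*chi.values())
    for var in variables:
        holders = {n for n, vs in chi.items() if var in vs}
        # Breadth-first search restricted to the nodes that contain `var`.
        start = next(iter(holders))
        seen = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in adj[node] & holders:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        if seen != holders:
            return False
    return True


# Hypothetical two-node tree: X and Z occur in both adjacent nodes.
good = {1: {"X", "Y", "Z"}, 2: {"Z", "X"}}
# Hypothetical path 1 - 2 - 3 where X skips the middle node.
bad = {1: {"X"}, 2: {"Y"}, 3: {"X"}}

print(satisfies_connectedness(good, [(1, 2)]))         # True
print(satisfies_connectedness(bad, [(1, 2), (2, 3)]))  # False
```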

Evaluating queries having bounded hypertree width (k fixed)

Given: a database db, a query Q over db such that $hw(Q) \le k$, and a width-k hypertree decomposition of Q.

Deciding whether Q(db) is not empty is in $O(n^{k+1} \log n)$ and complete for LOGCFL.

Computing Q(db) is feasible in output-polynomial time
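One way to picture the evaluation is as a two-step sketch: join the (at most k) atoms attached to each decomposition node and project onto that node's variables, then run a bottom-up semijoin pass over the resulting tree as for an acyclic query. The Python code below follows that outline under stated assumptions: the decomposition is given and complete (every atom is fully covered by some node), no atom repeats a variable, and the names `lam`, `chi`, and `hd_bcq` are illustrative. It is not tuned to meet the bound stated above.

```python
def join(left, right):
    """Natural join of two relations given as (variables, rows)."""
    lvars, lrows = left
    rvars, rrows = right
    shared = [v for v in lvars if v in rvars]
    extra = [v for v in rvars if v not in lvars]
    li = [lvars.index(v) for v in shared]
    ri = [rvars.index(v) for v in shared]
    ei = [rvars.index(v) for v in extra]
    index = {}
    for row in rrows:
        index.setdefault(tuple(row[i] for i in ri), []).append(row)
    out = set()
    for row in lrows:
        for match in index.get(tuple(row[i] for i in li), []):
            out.add(row + tuple(match[i] for i in ei))
    return tuple(lvars) + tuple(extra), out


def project(rel, keep):
    """Project a relation onto the variables in `keep`."""
    vars_, rows = rel
    idx = [vars_.index(v) for v in keep]
    return tuple(keep), {tuple(row[i] for i in idx) for row in rows}


def semijoin(left, right):
    """Keep rows of `left` that match some row of `right` on shared variables."""
    lvars, lrows = left
    rvars, rrows = right
    shared = [v for v in lvars if v in rvars]
    li = [lvars.index(v) for v in shared]
    ri = [rvars.index(v) for v in shared]
    keys = {tuple(row[i] for i in ri) for row in rrows}
    return lvars, {row for row in lrows if tuple(row[i] for i in li) in keys}


def hd_bcq(root, children, lam, chi, atoms):
    """True iff the query is nonempty, given a hypertree decomposition.

    lam:   dict node -> list of atom names attached to the node
    chi:   dict node -> tuple of variables attached to the node
    atoms: dict atom name -> (variables, rows)
    """
    def node_relation(node):
        rel = atoms[lam[node][0]]
        for name in lam[node][1:]:
            rel = join(rel, atoms[name])
        return project(rel, chi[node])

    def reduce_up(node):
        rel = node_relation(node)
        for child in children.get(node, []):
            rel = semijoin(rel, reduce_up(child))
        return rel

    return len(reduce_up(root)[1]) > 0


# Hypothetical cyclic query r(X,Y), s(Y,Z), t(Z,X) with a width-2 decomposition.
atoms = {
    "r": (("X", "Y"), {(1, 2)}),
    "s": (("Y", "Z"), {(2, 3)}),
    "t": (("Z", "X"), {(3, 1)}),
}
lam = {"n1": ["r", "s"], "n2": ["t"]}
chi = {"n1": ("X", "Y", "Z"), "n2": ("Z", "X")}
children = {"n1": ["n2"]}

print(hd_bcq("n1", children, lam, chi, atoms))  # True: the triangle 1, 2, 3
```

Each node relation has at most n^k rows for a width-k decomposition, so the whole procedure stays polynomial when k is fixed.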

Comparison results

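[Figure: hierarchy of the decomposition methods compared, with Hypertree Decomposition at the top. Methods shown: Hypertree Decomposition; Hinge Decomposition + Tree Clustering; Cycle Hypercutset; Tree Clustering, w*, treewidth; Cycle Cutset; Hinge Decomposition; Biconnected Components]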

Characterizations of Hypertree width

Logical characterization: Loosely guarded logic

Game characterization: The robber and marshals game

Work in progress

Answering queries and hypertree decompositions:

• A query planner based on hypertree decompositions

• Choosing the best query plan (i.e., the best decomposition) by exploiting data on tables, attribute selectivity, indices, etc.

Further possible applications:

• Answering queries using views
