Upload
collin-shields
View
217
Download
0
Embed Size (px)
Citation preview
CSE 636Data Integration
Answering Queries Using Views
Bucket Algorithm
Fall 2006
2
• Each subgoal g of Q must be “covered” by some view
• Make a list of candidates (buckets) per query subgoal
• Consider combinations of candidates from different buckets
• Not all combos are “compatible”• Keep the compatible ones and minimize them• Discard the ones contained in another• Take their union
The Bucket Algorithm
3
The Bucket Algorithm
q(X,Y,R) :- ForSale(X,Y,C,”auto”), Review(X,R,”auto”), Y > 1985
Step 1: For each subgoal, put the relevant sources into a bucket:
V1(name, year) :- ForSale(name, year, “France”, “auto”), year > 1990 would be relevantV3(name, year) :- ForSale(name, year, “France”, “cheese”) would be irrelevant
Step 2: Take the Cartesian product of the bucketsAlgorithm produces maximally contained rewritingIgnores interactions between subgoals in Step 1
4
The Bucket Algorithm: Example
V1(Std,Crs,Qtr,Title) :- reg(Std,Crs,Qtr), course(Crs,Title), Crs ≥ 500, Qtr ≥ Aut98V2(Std,Prof,Crs,Qtr) :- reg(Std,Crs,Qtr), teaches(Prof,Crs,Qtr)V3(Std,Crs) :- reg(Std,Crs,Qtr), Qtr ≤ Aut94V4(Prof,Crs,Title,Qtr) :- reg(Std,Crs,Qtr), course(Crs,Title), teaches(Prof,Crs,Qtr), Qtr ≤ Aut97
q(S,C,P) :- teaches(P,C,Q), reg(S,C,Q), course(C,T), C ≥ 300, Q ≥ Aut95
Step 1: For each query subgoal, put the relevant sources into a bucket
5
The Bucket Algorithm: Example
V1(Std,Crs,Qtr,Title) :- reg(Std,Crs,Qtr), course(Crs,Title), Crs ≥ 500, Qtr ≥ Aut98V2(Std,Prof,Crs,Qtr) :- reg(Std,Crs,Qtr), teaches(Prof,Crs,Qtr)V3(Std,Crs) :- reg(Std,Crs,Qtr), Qtr ≤ Aut94V4(Prof,Crs,Title,Qtr) :- reg(Std,Crs,Qtr), course(Crs,Title), teaches(Prof,Crs,Qtr), Qtr ≤ Aut97
q(S,C,P) :- teaches(P,C,Q), reg(S,C,Q), course(C,T), C ≥ 300, Q ≥ Aut95
PProf, CCrs, QQtr
Note: Arithmetic predicates don’t pose a problem
V2
Buckets
V4
teaches reg course
6
The Bucket Algorithm: Example
V1(Std,Crs,Qtr,Title) :- reg(Std,Crs,Qtr), course(Crs,Title), Crs ≥ 500, Qtr ≥ Aut98V2(Std,Prof,Crs,Qtr) :- reg(Std,Crs,Qtr), teaches(Prof,Crs,Qtr)V3(Std,Crs) :- reg(Std,Crs,Qtr), Qtr ≤ Aut94V4(Prof,Crs,Title,Qtr) :- reg(Std,Crs,Qtr), course(Crs,Title), teaches(Prof,Crs,Qtr), Qtr ≤ Aut97
q(S,C,P) :- teaches(P,C,Q), reg(S,C,Q), course(C,T), C ≥ 300, Q ≥ Aut95
SStd, CCrs, QQtr
Note: V3 doesn’t work: arithmetic predicates not consistentV4 doesn’t work: S not in the output of V4
V2
Buckets
V4
teaches reg course
V1V2
7
The Bucket Algorithm: Example
V1(Std,Crs,Qtr,Title) :- reg(Std,Crs,Qtr), course(Crs,Title), Crs ≥ 500, Qtr ≥ Aut98V2(Std,Prof,Crs,Qtr) :- reg(Std,Crs,Qtr), teaches(Prof,Crs,Qtr)V3(Std,Crs) :- reg(Std,Crs,Qtr), Qtr ≤ Aut94V4(Prof,Crs,Title,Qtr) :- reg(Std,Crs,Qtr), course(Crs,Title), teaches(Prof,Crs,Qtr), Qtr ≤ Aut97
q(S,C,P) :- teaches(P,C,Q), reg(S,C,Q), course(C,T), C ≥ 300, Q ≥ Aut95
CCrs, TTitle V2
Buckets
V4
teaches reg course
V1V2
V1V4
8
The Bucket Algorithm: Example
Step 2:• Try all combos of views, one each from a bucket• Test satisfaction of arithmetic predicates in each case
– e.g., two views may not overlap, i.e., they may be inconsistent
• Desired rewriting = union of surviving ones
Query rewriting 1:
q1(S,C,P) :- V2(S’,P,C,Q), V1(S,C,Q,T’), V1(S”,C,Q’,T)– no problem from arithmetic predicates (none in V2)– May or may not be minimal (why?)
V2V4
teaches reg course
V1V2
V1V4
9
The Bucket Algorithm: Example
Unfolding of rewriting 1:q1’(S,C,P) :- r(S’,C,Q), t(P,C,Q), r(S,C,Q), c(C,T’), r(S”,C,Q’), c(C,T), C ≥ 500, Q ≥ Aut98, C ≥ 500, Q’ ≥ Aut98
• Black r’s can be mapped to green r:S’S, S”S, Q’Q
• Black c can be mapped to green c:just extend above mapping to TT’
Minimized unfolding of rewriting 1:q1m’(S,C,P) :- t(P,C,Q), r(S,C,Q), c(C,T’), C ≥ 500, Q ≥ Aut98Minimized rewriting 1:q1m(S,C,P) :- V2(S’,P,C,Q), V1(S,C,Q,T’)
10
The Bucket Algorithm: Example
Query Rewriting 2:
q2(S,C,P) :- V2(S’,P,C,Q), V1(S,C,Q,T’), V4(P’,C,T,Q’)q2’(S,C,P) :- r(S’,C,Q), t(P,C,Q), r(S,C,Q), r(S,C,Q), c(C,T’), C ≥ 500, Q ≥ Aut98, r(S”,C,Q’), c(C,T), t(P’,C,Q’), Q’ ≤ Aut97• This combo is infeasible: consider the conjunction of
arithmetic predicates in V1 and V4
Query rewriting 3:
q3(S,C,P) :- V2(S’,P,C,Q), V2(S,P’,C,Q), V4(P”,C,T,Q’)
V2V4
teaches reg course
V1V2
V1V4
V2V4
teaches reg course
V1V2
V1V4
11
The Bucket Algorithm: Example
Unfolding of rewriting 3:q3’(S,C,P) :- r(S’,C,Q), t(P,C,Q), r(S,C,Q), t(P’,C,Q), r(S”,C,Q’), c(C,T), t(P”,C,Q’), Q’ ≤ Aut97• The green subgoals can cover the black ones under the
mapping: S’S, S”S, P’P, P”P, Q’Q
Minimized rewriting 3:q3m(S,C,P) :- V2(S,P,C,Q), V4(P,C,T,Q)
Verify that there are only two rewritings that are not covered by others
Maximally Contained Rewriting:q’ = q1m q3m
12
The Bucket Algorithm: Example 2
Query:q(X) :- cites(X,Y), cites(Y,X), sameTopic(X,Y)
Views:V4(A) :- cites(A,B), cites(B,A)V5(C,D) :- sameTopic(C,D)V6(F,H) :- cites(F,G), cites(G,H), sameTopic(F,G)
Note: Should we list V4(X) twice in the buckets?
V4
Buckets
V6
cites cites sameTopic
V4
V6
V5
V6
13
The Bucket Algorithm: Example 2
• Consider all combos & check for containment of the unfolded rewriting in Q
• V4(X) cannot be combined with anything (why?) Try q1(X) :- V4(X), V4(X), V5(X,Y)Try q2(X) :- V4(X), V6(X,Y), V5(X,Y)
• Does any of these work?• When can we discard a view from consideration?
14
The Bucket Algorithm: Example 2
Here is a successful rewriting:q3(X) :- V6(X,Y), V6(X,Y), V6(X,Y)
• By itself is not contained in Q• But, with subgoal X=Y added, it is!
By minimizing the rewriting, we get:q3m(X,Y) :- V6(X,X)
15
The Bucket Algorithm: Example 2
Remarks: • V4 didn’t contribute to any rewrite, but the
bucket algorithm doesn’t recognize it ahead• Consider:
q2(X,Y) :- cites(X,Y), cites(Y,X)• Then both cites predicates can be folded into V4
– Not recognized by the bucket algorithm
16
The State of Affairs
• Bucket algorithm:– deals well with predicates, Cartesian product can be
large (containment check required for every candidate rewriting)
• Inverse rules:– modular (extensible to binding patterns, FD’s)– no treatment of predicates– resulting rewritings need significant further
optimization
Neither scales up• The MINICON algorithm:
– change perspective: look at query variables
17
References
• Querying Heterogeneous Information Sources Using Source Descriptors– By Alon Y. Levy, Anand Rajaraman and Joann J. Ordille– VLDB, 1996
• Laks VS Lakshmanan– Lecture Slides
• Alon Halevy– Answering Queries Using Views: A Survey– VLDB Journal, 2000– http://citeseer.ist.psu.edu/halevy00answering.html