Upload
maili
View
37
Download
4
Embed Size (px)
DESCRIPTION
Αποτίμηση Ερωτημάτων σε Συλλογές Διαδρομών (Evaluating Queries on Route Collections). Παναγιώτης Μπούρος Ενδιάμεση Κρίση Επιβλέπων: Ι. Βασιλείου. Outline. Introduction Route collections, queries & frequent updates Existing work Proposed framework Experiments Conclusions. - PowerPoint PPT Presentation
Citation preview
Αποτίμηση Ερωτημάτων σε Συλλογές Διαδρομών
(Evaluating Queries on Route Collections)
Παναγιώτης ΜπούροςΕνδιάμεση Κρίση
Επιβλέπων: Ι. Βασιλείου
Outline
• Introduction– Route collections, queries & frequent updates
• Existing work• Proposed framework• Experiments• Conclusions
Examples of route collections
• People visiting Athens– Use e-devices to track their
sightseeing– Create routes through
interesting places– Upload/propose routes to Web
sites like ShareMyRoutes.com• Query such route collection:
– Find a sequence of interesting places from Omonia Square to Cathedral
– Find a sequence of interesting places from Academy to Museum of Acropolis that passes through Zappeion
• Answers may involve nodes from more than one route
Examples of route collections
• People visiting Athens– Use e-devices to track their
sightseeing– Create routes through
interesting places– Upload/propose routes to Web
sites like ShareMyRoutes.com• Querying such route collection:
– Find a sequence of interesting places from Omonia Square to Cathedral
– Find a sequence of interesting places from Academy to Museum of Acropolis that passes through Zappeion
• Answers may involve nodes from more than one route
Examples of route collections
• People visiting Athens– Use e-devices to track their
sightseeing– Create routes through
interesting places– Upload/propose routes to Web
sites like ShareMyRoutes.com• Querying such route collection:
– Find a sequence of interesting places from Omonia Square to Cathedral
– Find a sequence of interesting places from Academy to Museum of Acropolis that passes through Zappeion
• Answers may involve nodes from more than one routes
Examples of route collections
• Courier company offering same day pickup and delivery services
• Route collection created from previous day
– Each route performed by a vehicle– Each node in routes is a point for picking-
up, delivering or just waiting– Each node in a route has time interval I
• Querying such collection:– Ad-hoc customer requests
• Pick-up parcel from Academy within IS• Deliver parcel at Zappeion within IT
• Serve ad-hoc request– Not add new route– Exploit one or more existing routes, pass
parcel among vehicles
Examples of route collections
• Courier company offering same day pickup and delivery services
• Route collection created from previous day
– Each route performed by a vehicle– Each node in routes is a point for picking-
up, delivering or just waiting– Each node in a route has time interval I
• Querying such collection:– Ad-hoc customer requests
• Pick-up parcel from Academy within IS
• Deliver parcel at Zappeion within IT
• Serve ad-hoc request– Not add new route– Exploit one or more existing routes, pass
parcel among vehicles
Examples of route collections
• Courier company offering same day pickup and delivery services
• Route collection created from previous day
– Each route performed by a vehicle– Each node in routes is a point for picking-
up, delivering or just waiting– Each node in a route has time interval I
• Querying such collection:– Ad-hoc customer requests
• Pick-up parcel from Academy within IS
• Deliver parcel at Zappeion within IT
• Serve ad-hoc request– Not add new route– Exploit one or more existing routes, pass
parcel among vehicles
Examples of route collections
• Courier company offering same day pickup and delivery services
• Route collection created from previous day
– Each route performed by a vehicle– Each node in routes is a point for picking-
up, delivering or just waiting– Each node in a route has time interval I
• Querying such collection:– Ad-hoc customer requests
• Pick-up parcel from Academy within IS
• Deliver parcel at Zappeion within IT
• Serve ad-hoc request– Not add new route– Exploit one or more existing routes, pass
parcel among vehicles
Passing parcelpossible
Examples of route collections
• Courier company offering same day pickup and delivery services
• Route collection created from previous day
– Each route performed by a vehicle– Each node in routes is a point for picking-
up, delivering or just waiting– Each node in a route has time interval I
• Querying such collection:– Ad-hoc customer requests
• Pick-up parcel from Academy within IS
• Deliver parcel at Zappeion within IT
• Serve ad-hoc request– Not add new route– Exploit one or more existing routes, pass
parcel among vehicles• Related to dynamic pickup and delivery
problem
Passing parcelimpossible
What’s all about…
• Consider route collections– Very large, stored in secondary storage– Frequently updated with new routes
• E.g. new routes are proposed or new vehicle routes are included• Evaluate queries that identify paths
– Sequences of distinct nodes from one or more routes– In the latter we use links, i.e., shared nodes– PATH(S,T)
• Find a sequence of interesting places from Omonia Square to Cathedral– FLSP(S,IS,T,IT)
• Moving cost m inside routes• Changing cost c between routes via links• Find the sequence of points to pick-up parcel from Cathedral and deliver it to
Zappeion that:– Abides with the temporal constraints imposed by time intervals– Minimizes primarily the changing cost and secondarily the moving cost
What’s all about…
• Consider route collections– Very large, stored in secondary storage– Frequently updated with new routes
• E.g. new routes are proposed or new vehicle routes are included• Evaluate queries that identify paths
– Sequences of distinct nodes from one or more routes– In the latter we use links, i.e., shared nodes– PATH(S,T)
• Find a sequence of interesting places from Omonia Square to Cathedral– FLSP(S,IS,T,IT)
• Moving cost m inside routes• Changing cost c between routes via links• Find the sequence of points to pick-up parcel from Cathedral and deliver it to
Zappeion that:– Abides with the temporal constraints imposed by time intervals– Minimizes primarily the changing cost and secondarily the moving cost
What’s all about…
• Consider route collections– Very large, stored in secondary storage– Frequently updated with new routes
• E.g. new routes are proposed or new vehicle routes are included• Evaluate queries that identify paths
– Sequences of distinct nodes from one or more routes– In the latter we use links, i.e., shared nodes– PATH(S,T)
• Find a sequence of interesting places from Omonia Square to Cathedral– FLSP(S,IS,T,IT)
• Moving cost m inside routes• Changing cost c between routes via links• Find the sequence of points to pick-up parcel from Cathedral and deliver it to
Zappeion that:– Abides with the temporal constraints imposed by time intervals– Minimizes primarily the changing cost and secondarily the moving cost
What’s all about…
• Consider route collections– Very large, stored in secondary storage– Frequently updated with new routes
• E.g. new routes are proposed or new vehicle routes are included• Evaluate queries that identify paths
– Sequences of distinct nodes from one or more routes– In the latter we use links, i.e., shared nodes– PATH(S,T)
• Find a sequence of interesting places from Omonia Square to Cathedral– FJP(S,IS,T,IT)
• Moving cost m inside routes• Changing cost c between routes via links• Find the sequence of points to pick-up parcel from Cathedral and deliver it to
Zappeion that:– Abides with the temporal constraints imposed by time intervals– Minimizes primarily the changing cost and secondarily the moving cost
Outline
• Introduction• Existing work
– Graph-based solutions for• PATH queries• FJP queries
• Proposed framework• Experiments• Conclusions
Existing work
• Convert route collection to a graph– GPATH for PATH queries– GFJP for FJP queries
• Answer queries on graph following one of the following paradigms:– Direct evaluation
• Low storage and maintenance cost• Slow query evaluation
– Preprocessing• Fast query evaluation• High maintenance cost
Evaluating PATH queries
• Strongly related with REACH query
– Exists path between two nodes?• Direct evaluation
– Search algorithms• Depth-first or breadth-first
search• Preprocessing GPATH
– Directly exploiting transitive closure is not feasible
– Labeling and compression schemes
• Mainly focus on REACH• Some transform graph to DAG• 2-hop, HOPI, 3-hop, Dual
Labeling, Interval labeling, Path Cover, GRIPP etc
r1 (A,B,C,D,J)
r2 (A,F,D,Z,B,T)
r3 (Z,L,M)
r4 (D,Z,B)
r5 (A,F,K)
Evaluating FJP queries
• Shortest path on GFJP
• Direct evaluation– Dijkstra, A*
• Preprocessing– Compression scheme
• 2-hop, construction cost
– Graph embedding + A*• Assumes that graph
structure does not change
r1(A,B,C,D,J)
r2(A,F,D,Z,B,T)
r3(Z,L,M)
r4(D,Z,B)
r5(A,F,K)
Outline
• Introduction• Existing work• Proposed framework
– PATH & FJP queries– Indices, algorithms & methods for handling
updates
• Experiments• Conclusions
Proposed framework
• Evaluate PATH and FJP queries directly on route collections– Route are precomputed answer to queries
• Our framework involves the following:– A search algorithm
• Depth-first search for PATH queries• Uniform-Cost Search for FJP queries
– Indices on route collection for efficiently answering queries
• Reduce iterations of the search algorithm• P-Index, H-Index, L-Index
– Methods for handling frequent updates
Indexing route collections P-Index
• Associates each node of the collection with the routes containing it
• List routes[n]– <ri, oi>
– Sorted by route identifier
r1 (A,B,C)
r2 (F,N,B,L)
r3 (T,B,N,M)
node routes list
A <r1:1>
B <r1:2>, <r2:3>, <r3:2>
C <r1:3>
F <r2:1>
L <r2:4>
M <r3:4>
N <r2:2>, <r3:3>
T <r3:1>
Indexing route collections P-Index
• Associates each node of the collection with the routes containing it
• List routes[n]– <ri, oi>
– Sorted by route identifier
r1 (A,B,C)
r2 (F,N,B,L)
r3 (T,B,N,M)
node routes list
A <r1:1>
B <r1:2>, <r2:3>, <r3:2>
C <r1:3>
F <r2:1>
L <r2:4>
M <r3:4>
N <r2:2>, <r3:3>
T <r3:1>
Indexing route collections H-Index
• Captures all possible transitions between routes via links– Links are shared nodes
• List edges[r]– <ri, n:o:oi>– Sorted by primarily by
route identifier ri and then by position o
– Each node in r before n can reach every node after n in ri
r1 (A,B,C)
r2 (F,N,B,L)
r3 (T,B,N,M)
node edges list
r1 <r2,B:2:3>, <r3,B:2:2>
r2 <r1,B:3:2>, <r3,B:3:2>, <r3,N:3:2>
r3 <r1,B:2:2>, <r2,B:2:3>, <r2,N:3:2>
Indexing route collections H-Index
• Captures all possible transitions between routes via links– Links are shared nodes
• List edges[r]– <ri, n:o:oi>– Sorted by primarily by
route identifier ri and then by position o
– Each node in r before n can reach every node after n in ri
r1 (A,B,C)
r2 (F,N,B,L)
r3 (T,B,N,M)
node edges list
r1 <r2,B:2:3>, <r3,B:2:2>
r2 <r1,B:3:2>, <r3,B:3:2>, <r3,N:3:2>
r3 <r1,B:2:2>, <r2,B:2:3>, <r2,N:3:2>
Indexing route collections H-Index
• Captures all possible transitions between routes via links– Links are shared nodes
• List edges[r]– <ri, n:o:oi>– Sorted by primarily by
route identifier ri and then by position o
– Each node in r before n can reach every node after n in ri
r1 (A,B,C)
r2 (F,N,B,L)
r3 (T,B,N,M)
node edges list
r1 <r2,B:2:3>, <r3,B:2:2>
r2 <r1,B:3:2>, <r3,B:3:2>, <r3,N:3:2>
r3 <r1,B:2:2>, <r2,B:2:3>, <r2,N:3:2>
Indexing route collections H-Index
• Captures all possible transitions between routes via links– Links are shared nodes
• List edges[r]– <ri, n:o:oi>– Sorted by primarily by
route identifier ri and then by position o
– Each node in r before n can reach every node after n in ri
node edges list
r1 <r2,B:2:3>, <r3,B:2:2>
r2 <r1,B:3:2>, <r3,B:3:2>, <r3,N:3:2>
r3 <r1,B:2:2>, <r2,B:2:3>, <r2,N:3:2>
r1 (A,B,C)
r2 (F,N,B,L)
r3 (T,B,N,M)
Indexing route collections L-Index
• Disadvantage of H-Index– Large edges lists– High storage and
maintenance cost– Notice: each edges list
involves a few distinct links
• Store links contained in each route
• List links[r]– <n:o>– Sorted by link identifier n
r1 (A,B,C)
r2 (F,N,B,L)
r3 (T,B,N,M)
node links list
r1 <B:2>
r2 <B:3>, <N:2>
r3 <B:2>, <N:3>
Indexing route collections L-Index
• Disadvantage of H-Index– Large edges lists– High storage and
maintenance cost– Notice: each edges list
involves a few distinct links
• Store links contained in each route
• List links[r]– <n:o>– Sorted by link identifier n
r1 (A,B,C)
r2 (F,N,B,L)
r3 (T,B,N,M)
node links list
r1 <B:2>
r2 <B:3>, <N:2>
r3 <B:2>, <N:3>
Evaluating PATH queries
• Basic idea– Traverse nodes similar to depth-first search– Exploit indices on route collection to terminate search
• P-Index => method pfsP• H-Index => method pfsH• L-Index => method pfsL
– At each iteration:• Pop current node n• Check termination condition• Access routes containing n, routes[n] from P-Index
– For each route, push nodes after n in search stack
Evaluating PATH queries cont.
• pfsP terminates search– When: current node n is contained in a route before target
t– How: joins routes[n] with routes[t] & stops after finding a
common route entry
SS TT
Evaluating PATH queries cont.
• pfsP terminates search– When: current node n is contained in a route before target
t– How: joins routes[n] with routes[t] & stops after finding a
common route entry
SS TTnn…
Evaluating PATH queries cont.
• pfsP terminates search– When: current node n is contained in a route before target
t– How: joins routes[n] with routes[t] & stops after finding a
common route entry
SS TTnn
r1
r2
r3
routes[n] = {<r1:o1>,<r2:o2>,<r3:o3>}
…
Evaluating PATH queries cont.
• pfsP terminates search– When: current node n is contained in a route before target
t– How: joins routes[n] with routes[t] & stops after finding a
common route entry
SS TTnn
r1
r2
r3
routes[n] = {<r1:o1>,<r2:o2>,<r3:o3>} JOIN routes[T] = {…,<r2:oT>,…}
…
Evaluating PATH queries cont.
• pfsP terminates search– When: current node n is contained in a route before target
t– How: joins routes[n] with routes[t] & stops after finding a
common route entry
SS TTnn
r1
r2
r3
…
o2 < oT
routes[n] = {<r1:o1>,<r2:o2>,<r3:o3>} JOIN routes[T] = {…,<r2:oT>,…}
Evaluating PATH queries cont.
• pfsP terminates search– When: current node n is contained in a route before target
t– How: joins routes[n] with routes[t] & stops after finding a
common route entry
SS TTnn…
r1
r2
rk
…
Answer: path from S to n => part of r2 from n to T
Evaluating PATH queries cont.
• pfsH terminates search– When: current node n is contained in a route that is
connected with a route containing target t– How: for each route r containing n, joins edges[r] with
routes[t] & stops after finding a common route entry
SS TT
Evaluating PATH queries cont.
• pfsH terminates search– When: current node n is contained in a route that is
connected with a route containing target t– How: for each route r containing n, joins edges[r] with
routes[t] & stops after finding a common route entry
SS TTnn…
Evaluating PATH queries cont.
• pfsH terminates search– When: current node n is contained in a route that is
connected with a route containing target t– How: for each route r containing n, joins edges[r] with
routes[t] & stops after finding a common route entry
SS TTnn
r1
r2
r3
routes[n] = {<r1:o1>,<r2:o2>,<r3:o3>}
…
Evaluating PATH queries cont.
• pfsH terminates search– When: current node n is contained in a route that is
connected with a route containing target t– How: for each route r containing n, joins edges[r] with
routes[t] & stops after finding a common route entry
SS TTnn
r1
r2
r3
…
rX
rY
XX YY
routes[n] = {<r1:o1>,<r2:o2>,<r3:o3>}, edges[r2] = {<rX,X:oX2:oX>,<rY,Y:oY2:oY>}
Evaluating PATH queries cont.
• pfsH terminates search– When: current node n is contained in a route that is
connected with a route containing target t– How: for each route r containing n, joins edges[r] with
routes[t] & stops after finding a common route entry
SS nn
r1
r2
r3
…
rX
rY
XX YY
edges[r2] = {<rX,X:oX2:oX>,<rY,Y:oY2:oY>} JOIN routes[T] = {…,<rY:oT>,…}
TT
Evaluating PATH queries cont.
• pfsH terminates search– When: current node n is contained in a route that is
connected with a route containing target t– How: for each route r containing n, joins edges[r] with
routes[t] & stops after finding a common route entry
SS nn
r1
r2
r3
…
rX
rY
XX YY
TTo2 < oY2 AND oY < oT
edges[r2] = {<rX,X:oX2:oX>,<rY,Y:oY2:oY>} JOIN routes[T] = {…,<rY:oT>,…}
Evaluating PATH queries cont.
• pfsH terminates search– When: current node n is contained in a route that is
connected with a route containing target t– How: for each route r containing n, joins edges[r] with
routes[t] & stops after finding a common route entry
SS nn
r1
r2
r3
…
rX
rY
XX YY
TT
Answer: path from S to n => part of r2 from n to Y => part of rY from Y to T
Evaluating PATH queries cont.
• pfsL– Visits only the links in a route– Constructs list T of the links before target– Terminates search
• When: current node n is contained in a route before a link of T• How: for each route r containing n, joins links[r] with list T & stops after
finding a common link entry
SS
TT
Evaluating PATH queries cont.
SS
rY
YY
T = {<Y,(rY:oY:oT)>}
TT
• pfsL– Visits only the links in a route– Constructs list T of the links before target– Terminates search
• When: current node n is contained in a route before a link of T• How: for each route r containing n, joins links[r] with list T & stops after
finding a common link entry
Evaluating PATH queries cont.
SS nn…
rY
YY
TT
T = {<Y,(rY:oY:oT)>}
• pfsL– Visits only the links in a route– Constructs list T of the links before target– Terminates search
• When: current node n is contained in a route before a link of T• How: for each route r containing n, joins links[r] with list T & stops after
finding a common link entry
Evaluating PATH queries cont.
SS nn…
pY
YY
routes[n] = {<r1:o1>,<r2:o2>,<r3:o3>}, T = {<Y,(rY:oY:oT)>}
TT
r1
r2
rk
• pfsL– Visits only the links in a route– Constructs list T of the links before target– Terminates search
• When: current node n is contained in a route before a link of T• How: for each route r containing n, joins links[r] with list T & stops after
finding a common link entry
Evaluating PATH queries cont.
SS nn…
pY
YY
links[r2] = {…<X:oX>,<Y:oY2>} JOIN T = {<Y,(rY:oY:oT)>}
TT
r1
r2
r3
• pfsL– Visits only the links in a route– Constructs list T of the links before target– Terminates search
• When: current node n is contained in a route before a link of T• How: for each route r containing n, joins links[r] with list T & stops after
finding a common link entry
Evaluating PATH queries cont.
SS nn
r1
r2
r3
…
rY
XX YY
TTo2 < oY2
links[r2] = {…<X:oX>,<Y:oY2>} JOIN T = {<Y,(rY:oY:oT)>}
• pfsL– Visits only the links in a route– Constructs list T of the links before target– Terminates search
• When: current node n is contained in a route before a link of T• How: for each route r containing n, joins links[r] with list T & stops after
finding a common link entry
Evaluating PATH queries cont.
SS nn
r1
r2
rk
…
rY
XX YY…
TT
Answer: path from S to n => part of r2 from n to Y => part of rY from Y to T
• pfsL– Visits only the links in a route– Constructs list T of the links before target– Terminates search
• When: current node n is contained in a route before a link of T• How: for each route r containing n, joins links[r] with list T & stops after
finding a common link entry
Evaluating FJP queries
• Basic idea– Traverse nodes similar to uniform-cost search– Compute lower bound of the answer, a candidate answer
• Using P-Index => method MP, L-Index => method ML
• Triggers early termination condition• Prunes search space
– At each iteration:• Pop current node n• Check termination condition against candidate answer• Update candidate answer through n • Access routes containing n, routes[n] from P-Index
– For each route, push nodes after n in search queue iff expanding them would give answer better than candidate
Handling frequent updates
• P-Index, H-Index, L-Index as inverted files on disk– Updates -> adding new routes– Not consider each new route separately– Batch updates, consider set of new routes
• Basic idea:– Build memory resident P-Index, H-Index, L-Index for
new routes– Merge disk-based indices with memory resident ones
Outline
• Introduction• Existing work• Proposed framework• Experiments
– Index creation– Query evaluation– Updates
• Conclusions
Setup
• Synthetic route collections– Number of routes |R| = {50k, 100k, 500k}– Average route length lavg = {5, 10, 50}– Number of distinct nodes |N| = {50k, 100k, 500k}– Number of links |L| = {10k, 30k, 50k, 80k, 100k}– Update factor U (% |R|) = {1%, 5%, 10%}
• Comparison– PATH queries
• dfs @ GPATH Vs pfsP, pfsH, pfsL– FJP queries
• Dijkstra @ GFJP Vs MP, ML
Index construction
Vary |R| Vary |N|
Evaluating PATH queries
Vary |R|
Evaluating FJP queries
Vary |R|
Updates
Outline
• Introduction• Existing work• Proposed framework• Experiments• Conclusions
– Summary– Future work
To sum up…
• Focus of my thesis:– Evaluating queries that find paths on large route
collections– Route collections are stored on disk and they are
frequently updated• Current work:
– A framework for evaluating PATH / FJP queries involving:• Depth-first / uniform-cost search• P-Index, H-Index, L-Index• Methods for handling frequent updates
Further work
• Three directions:① Address other kind of updates
– Remove routes– Insert nodes in routes – as dynamic pickup and delivery problem is
solved– Update time intervals of nodes
② Evaluate other types of queries– Trip planning or optimal sequence like queries
• Each node is an instance of a class in set C, e.g. {Museum,Stadium,Restaurant}
• Find a path passing through a Museum, then a Stadium and finally a Restaurant
③ Combine query evaluation with keyword search– Starting and ending node given as set of keywords– Find a path passing through a Restaurant relevant to “sea food,
lobster”
Thank you
• Evaluating queries on route collections:– P. Bouros, Fewer-Jumps Path Queries under Temporal Constraints, TR– P. Bouros, Evaluting Path Queries over Frequently Updated Routes, TR– P. Bouros, Evaluating Path Queries over Route Collections, ICDE’10 PhD
Workshop.– P. Bouros, S. Skiadopoulos, T. Dalamagas, D. Sacharidis and T. Sellis,
Evaluating reachability queries over path collections, SSDBM’09.– P. Bouros, T. Dalamagas, S. Skiadopoulos and T. Sellis, Evaluating “Find a
Path” Reachability Queries, ECAI’08 Workshop.• Other work:
– D. Sacharidis, P. Bouros, and T. Sellis, Caching Dynamic Skyline Queries, SSDBM’08.
– T. Dalamagas, P. Bouros, T. Galanis, M. Eirinaki and T. Sellis, Mining User Navigation Patterns for Personalizing Topic Directories, CIKM’07 Workshop
Outline
• Introduction• Existing work• Proposed framework• Experiments• Conclusions
Thank you