
2006-09-15 VLDB '2006 Haibo Hu (Hong Kong Baptist University, Hong Kong) Dik Lun Lee (Hong Kong University of Science and Technology, Hong Kong) Victor


Distance Indexing on Road Networks
VLDB 2006
Haibo Hu (Hong Kong Baptist University, Hong Kong) Dik Lun Lee (Hong Kong University of Science and Technology, Hong Kong) Victor Lee (City University of Hong Kong, Hong Kong)
*
objects
Queries:
large-degree nodes
By Dijkstra's single-source shortest path algorithm:
Maintain nodes whose distances are not finalized
Pick the node with the shortest distance and finalize it
Relax all not-yet-finalized distances
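The three steps above can be sketched in a few lines. This is a minimal illustration, assuming the road network is an adjacency dictionary of non-negative edge weights; the names `dijkstra` and `ROAD` are hypothetical and not from the talk.

```python
import heapq

def dijkstra(graph, source):
    """Single-source shortest distances.

    graph: dict mapping node -> list of (neighbor, weight), weights >= 0.
    """
    dist = {source: 0.0}          # tentative distances
    finalized = set()             # nodes whose distance is final
    heap = [(0.0, source)]        # nodes whose distances are not finalized yet
    while heap:
        d, u = heapq.heappop(heap)        # pick the node with the shortest distance
        if u in finalized:
            continue                      # stale heap entry
        finalized.add(u)                  # ... and finalize it
        for v, w in graph.get(u, []):     # relax all not-yet-finalized neighbors
            nd = d + w
            if v not in finalized and nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Hypothetical four-node road network (undirected, so edges are listed twice).
ROAD = {
    "a": [("b", 2), ("c", 5)],
    "b": [("a", 2), ("c", 1), ("d", 4)],
    "c": [("a", 5), ("b", 1), ("d", 1)],
    "d": [("b", 4), ("c", 1)],
}
```

Calling `dijkstra(ROAD, "a")` expands the whole network from `a`, which is the cost the limitations below refer to.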
Limitations:
Running time O(NlgV)
Return k nearest neighbor
Precomputed indexes are costly to store and update
*
Distance signature: the first general-purpose index on road networks that
Categorizes the distances of a node to all objects
Supports both rough and exact distance computation
Accelerates processing of common query types
Reduces the storage and maintenance cost
Is orthogonal to other query optimization techniques
*
Construction and Maintenance
Distance Signature
Basic Idea:
Precomputing distances is a good trade-off between having no indexing and solution space indexing
Maintain the approximate distance between objects and nodes
How rough is the approximation?
Apply rough approximation to faraway objects
Queries are always interested in local objects
Faraway objects outnumber local objects
We use an exponential sequence of categories
In the form of [0, T), [T, cT), [cT, c^2 T), [c^2 T, c^3 T), ...
T and c are constant parameters
E.g., T = 3, c = 2, then [0, 3), [3,6), [6,12), [12,24), ...
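As a small worked example of the exponential categories, the sketch below maps a distance to its category index and back to its range. The function names `category` and `category_range` are assumptions made for illustration; the defaults reproduce the T = 3, c = 2 example.

```python
def category(dist, T=3.0, c=2.0):
    """Index of the category containing dist: 0 for [0, T), 1 for [T, cT), ..."""
    if dist < 0:
        raise ValueError("distance must be non-negative")
    k, upper = 0, T
    while dist >= upper:      # move to the next, c-times wider category
        upper *= c
        k += 1
    return k

def category_range(k, T=3.0, c=2.0):
    """Half-open range [lo, hi) covered by category k."""
    if k == 0:
        return (0.0, T)
    return (T * c ** (k - 1), T * c ** k)
```

With the defaults, a distance of 7 falls in category 2, whose range is [6, 12).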
Distance Signature (Cont'd)
For each node n, signature component S(n)[i] denotes the category of dist(n,i)
S(n)[i].link denotes the next node from n in the shortest path to i
*
Distance Operations on Signatures
Principle: trace back the link until the distance range is accurate enough
Exact
Approximate
Trace back through the link from node to object
Terminate once the distance range does not partially overlap with input
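The retrieval principle can be illustrated on a toy signature. This is a sketch under stated assumptions: the signature is a nested dictionary `SIG[node][object] = (category, link)`, edge weights live in `WEIGHT`, and T = 3, c = 2. All names and the four-node line network are hypothetical.

```python
T, C = 3.0, 2.0

def category_range(k):
    """Half-open range [lo, hi) of category k."""
    return (0.0, T) if k == 0 else (T * C ** (k - 1), T * C ** k)

# Line network a -4- b -5- c -2- d, with object "p" located at node d.
WEIGHT = {frozenset("ab"): 4.0, frozenset("bc"): 5.0, frozenset("cd"): 2.0}
SIG = {
    "a": {"p": (2, "b")},    # dist 11, category [6, 12)
    "b": {"p": (2, "c")},    # dist 7,  category [6, 12)
    "c": {"p": (0, "d")},    # dist 2,  category [0, 3)
    "d": {"p": (0, None)},   # the object itself
}

def exact_distance(node, obj):
    """Follow the links all the way to the object, summing edge weights."""
    total = 0.0
    while True:
        _, nxt = SIG[node][obj]
        if nxt is None:
            return total
        total += WEIGHT[frozenset((node, nxt))]
        node = nxt

def approximate_distance(node, obj, lo, hi):
    """Return a range for dist(node, obj) that is either fully inside
    or fully outside the input range [lo, hi)."""
    travelled = 0.0
    while True:
        cat, nxt = SIG[node][obj]
        if nxt is None:
            return (travelled, travelled)          # reached the object: exact
        r_lo, r_hi = category_range(cat)
        a, b = travelled + r_lo, travelled + r_hi
        inside = lo <= a and b <= hi
        disjoint = b <= lo or a >= hi
        if inside or disjoint:                     # no partial overlap: stop
            return (a, b)
        travelled += WEIGHT[frozenset((node, nxt))]
        node = nxt
```

For the input range [10, 20), the category [6, 12) stored at `a` overlaps only partially, so one link is followed and the range tightens to [10, 16), which settles the question without reaching the object.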
Comparison (distances from node n to objects a and b)
Trace back until the two distance ranges don’t overlap
Sorting
First apply approximate sorting, then apply bubble sort using exact comparison
Quick sort using approximate comparison
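Comparison and the first sorting variant can be sketched as follows. This is an illustration rather than the authors' implementation: it assumes the same nested-dictionary signature layout `SIG[node][object] = (category, link)` with T = 3, c = 2, refines both ranges one link at a time, and uses a hypothetical five-node network.

```python
T, C = 3.0, 2.0

def category_range(k):
    return (0.0, T) if k == 0 else (T * C ** (k - 1), T * C ** k)

# Hypothetical network: n -2- m, m -5- x, m -8- z, n -1- y.
# Objects: "a" at x, "b" at z, "c" at y. Only the entries needed here are listed.
WEIGHT = {frozenset(("n", "m")): 2.0, frozenset(("m", "x")): 5.0,
          frozenset(("m", "z")): 8.0, frozenset(("n", "y")): 1.0}
SIG = {
    "n": {"a": (2, "m"), "b": (2, "m"), "c": (0, "y")},
    "m": {"a": (1, "x"), "b": (2, "z")},
    "x": {"a": (0, None)},
    "z": {"b": (0, None)},
    "y": {"c": (0, None)},
}

def ranges(node, obj):
    """Yield successively tighter (lo, hi) ranges for dist(node, obj)."""
    travelled = 0.0
    while True:
        cat, nxt = SIG[node][obj]
        if nxt is None:
            yield (travelled, travelled)    # exact distance
            return
        lo, hi = category_range(cat)
        yield (travelled + lo, travelled + hi)
        travelled += WEIGHT[frozenset((node, nxt))]
        node = nxt

def surely_less(r1, r2):
    """True if every value in r1 is smaller than every value in r2."""
    (lo1, hi1), (lo2, _) = r1, r2
    return hi1 < lo2 or (hi1 == lo2 and lo1 < hi1)

def compare(node, a, b):
    """-1 if a is closer to node than b, 1 if farther, 0 if equidistant."""
    it_a, it_b = ranges(node, a), ranges(node, b)
    ra, rb = next(it_a), next(it_b)
    while True:
        if surely_less(ra, rb):
            return -1
        if surely_less(rb, ra):
            return 1
        na, nb = next(it_a, None), next(it_b, None)
        if na is None and nb is None:
            return 0                        # both exact and equal
        ra = na if na is not None else ra
        rb = nb if nb is not None else rb

def sort_objects(node, objs):
    """Approximate sort by category, then bubble sort with exact comparison."""
    order = sorted(objs, key=lambda o: SIG[node][o][0])
    swapped = True
    while swapped:
        swapped = False
        for i in range(len(order) - 1):
            if compare(node, order[i], order[i + 1]) > 0:
                order[i], order[i + 1] = order[i + 1], order[i]
                swapped = True
    return order
```

At node `n`, objects `a` and `b` share category [6, 12), so one link is followed: the ranges become [5, 8) and [8, 14), which no longer overlap, and `a` is reported closer. Because the category sort leaves the list nearly ordered, the bubble-sort pass does little work.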
*
Compare the distances of two objects based on one signature
Avoid accessing the signatures of other nodes
Used to get a rough result of distance sorting
How?
Select an observer n3
n3 tells if n2 or n6 is closer to n4
If n4 is on the perpendicular bisector, is it possible for n3 to find n4 within distance range s(n4)[n3]?
Categories tell the approximate distances between q and other objects
Get k closest objects according to their category values
If no need to know the distances or order, return objects based on category ranges
To find the ordering:
To find exact distances:
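The category-based selection of k closest objects described above can be sketched without touching any other node's signature. This is a minimal illustration; `knn_by_category` and the sample signature are hypothetical, and the query node's signature is reduced to a mapping from object to category index.

```python
from collections import defaultdict

def knn_by_category(sig_q, k):
    """Split candidates for a kNN query at node q.

    sig_q: dict object -> category index at q.
    Returns (certain, boundary): objects that are surely among the k nearest,
    and the objects of the first category that does not fit entirely, from
    which the remaining results must be chosen by comparing distances.
    """
    buckets = defaultdict(list)
    for obj, cat in sig_q.items():
        buckets[cat].append(obj)
    certain, boundary = [], []
    for cat in sorted(buckets):                 # nearest categories first
        group = buckets[cat]
        if len(certain) + len(group) <= k:
            certain.extend(group)               # the whole category fits
            if len(certain) == k:
                break
        else:
            boundary = group                    # needs finer comparison
            break
    return certain, boundary

SIG_Q = {"p1": 0, "p2": 1, "p3": 1, "p4": 2, "p5": 3}
```

When `boundary` is empty, the answer set is known from categories alone; otherwise only the boundary objects need distance comparison or exact retrieval.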
*
Construction and Maintenance
Exponential categories [0, T), [T, cT), [cT, c^2 T), ...
How to determine c and T?
Factors:
Storage availability
Spreading is uniformly distributed
Build shortest path spanning tree for each object (Dijkstra)
Fill in s(n)[i] when the tree of object i is spanned to node n
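The two construction steps above can be sketched as one Dijkstra run per object. This is an illustration under assumptions: the network is undirected (so the parent in the object's spanning tree is also the next node on the shortest path toward the object), T = 3 and c = 2 by default, and the names `build_signatures`, `LINE` and the tuple layout `(category, link)` are hypothetical.

```python
import heapq
from itertools import count

def category(dist, T=3.0, c=2.0):
    k, upper = 0, T
    while dist >= upper:
        upper *= c
        k += 1
    return k

def build_signatures(graph, objects, T=3.0, c=2.0):
    """graph: node -> list of (neighbor, weight); objects: object id -> node.

    Returns sig with sig[n][i] = (category of dist(n, i), next node toward i).
    """
    sig = {node: {} for node in graph}
    for obj, home in objects.items():
        tie = count()                               # tie-breaker for the heap
        heap = [(0.0, next(tie), home, None)]
        done = set()
        while heap:
            d, _, u, parent = heapq.heappop(heap)
            if u in done:
                continue
            done.add(u)
            # The tree of this object has just been spanned to node u.
            sig[u][obj] = (category(d, T, c), parent)
            for v, w in graph[u]:
                if v not in done:
                    heapq.heappush(heap, (d + w, next(tie), v, u))
    return sig

# Line network a -4- b -5- c -2- d with one object "p" at node d.
LINE = {
    "a": [("b", 4)],
    "b": [("a", 4), ("c", 5)],
    "c": [("b", 5), ("d", 2)],
    "d": [("c", 2)],
}
```

`build_signatures(LINE, {"p": "d"})` stores category 2 with link `b` at node `a` (distance 11) and category 0 with link `d` at node `c` (distance 2).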
Variable length encoding
The number of objects in each category is uneven
On a grid, the number of objects 1, 2, 3, ... units away is 4, 8, 12, ...
Use fewer bits for the categories that hold more objects
*
Under assumptions "exponential partition", "grid topology", "uniform distance range of queries", and c>1.5, this coding scheme is optimal
[0, T) [T, cT) [cT, c^2 T) [c^2 T, c^3 T) [c^3 T, ∞)
Average code length is approximately :
1
01
001
0001
0000
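A prefix-free code set of this shape (1, 01, 001, 0001, 0000) can be generated, assigned and decoded as sketched below. Which category receives which code is an assumption here: the sketch simply gives shorter codes to categories that hold more objects, and the object counts in the usage example are made up for illustration.

```python
def unary_codes(m):
    """m prefix-free codes 1, 01, 001, ...; the last two have equal length."""
    if m < 2:
        raise ValueError("need at least two categories")
    codes = ["0" * i + "1" for i in range(m - 1)]
    codes.append("0" * (m - 1))
    return codes

def assign_codes(counts):
    """counts: category -> number of objects. More objects, shorter code."""
    ranked = sorted(counts, key=lambda cat: -counts[cat])
    return dict(zip(ranked, unary_codes(len(ranked))))

def encode(categories, table):
    """Concatenate the codes of a signature's category values."""
    return "".join(table[cat] for cat in categories)

def decode(bits, table):
    """Greedy decoding works because no code is a prefix of another."""
    inverse = {code: cat for cat, code in table.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return out

# Made-up object counts per category index, growing with distance.
TABLE = assign_codes({0: 1, 1: 2, 2: 4, 3: 8, 4: 16})
```

With these counts, the signature `[4, 4, 3, 0]` encodes to 8 bits, compared with 12 bits under a fixed 3-bit code for five categories.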
Fixed coding
u
v
n
1-bit flag
not compressed
in memory
The shortest path spanning trees of all objects
A reverse index that maps each edge to the spanning trees containing it
This limits the number of trees to examine when the edge changes
How (suppose edge (a,b) is updated) :
Find those affected spanning trees
For each affected tree of object c, check s(a)[c] or s(b)[c] (whichever is smaller)
Propagate to adjacent nodes until no more updates
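The first update step, finding the affected spanning trees, can be sketched with a reverse index derived from the signature links. This is an illustration only: it assumes the nested-dictionary layout `sig[node][object] = (category, link)`, treats edges as undirected, and uses a hypothetical four-node ring; the propagation step itself is not shown.

```python
from collections import defaultdict

def build_reverse_index(sig):
    """Map each edge to the set of objects whose spanning tree uses it.

    An edge {n, link} belongs to the tree of object i exactly when
    sig[n][i] links n to that neighbor.
    """
    index = defaultdict(set)
    for node, components in sig.items():
        for obj, (_category, link) in components.items():
            if link is not None:
                index[frozenset((node, link))].add(obj)
    return index

def affected_objects(index, a, b):
    """Objects whose trees must be checked when edge (a, b) is updated."""
    return index.get(frozenset((a, b)), set())

# Ring a - b - c - d - a with unit weights; object "p" at a, object "q" at c.
RING_SIG = {
    "a": {"p": (0, None), "q": (0, "b")},
    "b": {"p": (0, "a"), "q": (0, "c")},
    "c": {"p": (0, "b"), "q": (0, None)},
    "d": {"p": (0, "a"), "q": (0, "c")},
}
INDEX = build_reverse_index(RING_SIG)
```

An update to edge (a, d) touches only the tree of `p`, so the signatures of `q` need no checking.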
*
Page size: 4K bytes
Network Voronoi Diagram (NVD) from VN3
Tuning parameters
Speed up general query processing
Optimal choice of distance categories and category encoding
Future work
The signatures of nearby nodes are similar
Derivation of optimal distance categories for a wider range of network topologies and object distributions
[Figure labels: Highway, Gas station, Query point]