Hypersphere Dominance: An Optimal Approach
Cheng Long, Raymond Chi-Wing Wong, Bin Zhang, Min Xie
The Hong Kong University of Science and Technology
Prepared by Cheng Long
Presented by Cheng Long
24 June, 2014
Hyperspheres
A hypersphere in a d-dimensional space is specified by (center, radius): the set of all points whose distances from the center are bounded by the radius
2D: a disk
3D: a ball
Hyperspheres are commonly used
Uncertain databases: the location of an uncertain object
Spatial databases: SS-tree, SS+-tree, M-tree, VP-tree and SR-tree
SS-tree: similar to R-tree with hyperrectangles replaced by hyperspheres
[Figures: layout of 8 objects A-H; SS-tree based on A-H]
Motivating example
Scenario
Ada has her location uncertain, but constrained in a disk Sa.
Bob has his location uncertain, but constrained in a disk Sb.
Connie has her location uncertain, but constrained in a disk Sq.
Question
Is Ada always closer to Connie than Bob?
[Figure: two configurations of the disks Sa (Ada), Sb (Bob) and Sq (Connie)]
No
For this specification of the locations, Ada is closer to Connie than Bob
In fact, for all specifications of the locations, Ada is closer to Connie than Bob
Yes
Hypersphere dominance: definition
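Definition 1: Hypersphere dominance
Given three hyperspheres $S_a$, $S_b$ and $S_q$, it decides whether
Dominance condition: $\forall a \in S_a, \forall b \in S_b, \forall q \in S_q: \mathrm{Dist}(a, q) < \mathrm{Dist}(b, q)$
Yes: $\mathrm{Dom}(S_a, S_b, S_q) = true$
No: $\mathrm{Dom}(S_a, S_b, S_q) = false$
Basic operator used in many queries
Probabilistic RkNN query [Lian and Chen, VLDBJ’09]
AkNN query [Emrich et al., SSDBM’10]
kNN query [Long et al., SIGMOD’14]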
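The dominance condition quantifies over every point of three hyperspheres, so it cannot be checked by enumeration. As a rough illustration only, the sketch below samples random points and looks for a counterexample. It can refute dominance but never prove it, and it is not one of the methods discussed in the talk. The (center, radius) tuple representation and the function names are assumptions made for this example.

```python
import math
import random

def sample_in_ball(center, radius, rng):
    """Draw a point uniformly from the closed ball (center, radius)."""
    d = len(center)
    # A Gaussian vector gives a uniformly random direction.
    direction = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(v * v for v in direction))
    # Radius scaled by U^(1/d) makes the sample uniform in volume.
    scale = radius * rng.random() ** (1.0 / d) / norm
    return tuple(c + scale * v for c, v in zip(center, direction))

def find_counterexample(sa, sb, sq, trials=10000, seed=0):
    """Return (a, b, q) with Dist(a, q) >= Dist(b, q), or None if none is found."""
    rng = random.Random(seed)
    for _ in range(trials):
        a = sample_in_ball(*sa, rng)
        b = sample_in_ball(*sb, rng)
        q = sample_in_ball(*sq, rng)
        if math.dist(a, q) >= math.dist(b, q):
            return a, b, q
    return None

# Sb is far away: Dist(a, q) <= 2 and Dist(b, q) >= 8, so no counterexample exists.
far = find_counterexample(((0.0, 0.0), 1.0), ((10.0, 0.0), 1.0), ((0.0, 0.0), 1.0))
# Sq sits midway between Sa and Sb: counterexamples are easy to find.
near = find_counterexample(((0.0, 0.0), 1.0), ((3.0, 0.0), 1.0), ((1.5, 0.0), 1.0))
```

A `None` result only means that no violation was sampled, which is why an exact test is needed.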
Hypersphere dominance: existing solutions—overview
MinMax [Roussopoulos et al., SIGMOD Record’95; Hjaltason and Samet, TODS’99]
MBR [Emrich et al., SIGMOD’10]
GP [Lian and Chen, VLDBJ’09]
Trigonometric [Emrich et al., SSDBM’10]
Hypersphere dominance: existing solutions—MinMax (1)
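Definition: the maximum distance between a point in $S_a$ and a point in $S_b$
$\mathrm{MaxDist}(S_a, S_b) = \mathrm{Dist}(c_a, c_b) + r_a + r_b$
Definition: the minimum distance between a point in $S_a$ and a point in $S_b$
$\mathrm{MinDist}(S_a, S_b) = 0$ ($S_a$ and $S_b$ overlap)
$\mathrm{MinDist}(S_a, S_b) = \mathrm{Dist}(c_a, c_b) - r_a - r_b$ ($S_a$ and $S_b$ do not overlap)
[Figure: $S_a$ and $S_b$ with centers $c_a$, $c_b$ and radii $r_a$, $r_b$, showing MaxDist, MinDist in the non-overlapping case, and MinDist = 0 in the overlapping case]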
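The two distance definitions translate directly into code. A minimal sketch, assuming each hypersphere is a (center, radius) tuple; the function names are chosen for this example.

```python
import math

def max_dist(sa, sb):
    """Maximum distance between a point in sa and a point in sb."""
    (ca, ra), (cb, rb) = sa, sb
    return math.dist(ca, cb) + ra + rb

def min_dist(sa, sb):
    """Minimum distance between a point in sa and a point in sb (0 if they overlap)."""
    (ca, ra), (cb, rb) = sa, sb
    return max(0.0, math.dist(ca, cb) - ra - rb)

apart = (((0.0, 0.0), 1.0), ((5.0, 0.0), 2.0))        # centers 5 apart, radii 1 and 2
overlapping = (((0.0, 0.0), 1.0), ((2.0, 0.0), 2.0))  # centers 2 apart, radii 1 and 2
```

Both functions cost one distance computation, which is O(d).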
Hypersphere dominance: existing solutions—MinMax (2)
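MinMax
Compute $\mathrm{MaxDist}(S_a, S_q)$
Compute $\mathrm{MinDist}(S_b, S_q)$
If $\mathrm{MaxDist}(S_a, S_q) < \mathrm{MinDist}(S_b, S_q)$
Return true
Else
Return false
[Figure: two examples, with the bisector drawn]
Example 1: $\mathrm{MaxDist}(S_a, S_q) < \mathrm{MinDist}(S_b, S_q)$ and $\mathrm{Dom}(S_a, S_b, S_q) = true$. MinMax returns true: correct
Example 2: $\mathrm{MaxDist}(S_a, S_q) > \mathrm{MinDist}(S_b, S_q)$ but $\mathrm{Dom}(S_a, S_b, S_q) = true$. MinMax returns false: “false negative”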
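The MinMax test and its false negative can be reproduced with a small sketch, assuming (center, radius) tuples. The function name and both example configurations are made up for illustration.

```python
import math

def minmax_dominates(sa, sb, sq):
    """MinMax test: true only if MaxDist(Sa, Sq) < MinDist(Sb, Sq)."""
    (ca, ra), (cb, rb), (cq, rq) = sa, sb, sq
    max_a = math.dist(ca, cq) + ra + rq
    min_b = max(0.0, math.dist(cb, cq) - rb - rq)
    return max_a < min_b

sa = ((0.0, 0.0), 0.0)    # the point (0, 0)
sb = ((10.0, 0.0), 0.0)   # the point (10, 0)

# Small query disk around Sa: MaxDist = 1 < MinDist = 9.
easy = minmax_dominates(sa, sb, ((0.0, 0.0), 1.0))

# Large query disk centered at (-3, 0) with radius 7. Every point of it has
# first coordinate <= 4, so it lies on Sa's side of the bisector x = 5 and
# dominance holds, yet MaxDist = 10 is not below MinDist = 6.
hard = minmax_dominates(sa, sb, ((-3.0, 0.0), 7.0))
```

In the second configuration MinMax answers false although the dominance condition is true, which is the false negative described above.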
Hypersphere dominance: existing solutions--Insufficiency
Methods                    Correct?   Sound?   Efficient?
MinMax                     Yes        No       Yes
MBR                        Yes        No       Yes
GP                         Yes        No       Yes
Trigonometric              No         Yes      Yes
Our approach (Hyperbola)   Yes        Yes      Yes
Criteria of a method:
1. Correctness: No false positive
2. Soundness: No false negative
3. Efficiency: runs in O(d) where d is the number of dimensions
Our approach is the only one which is correct, sound and efficient!
Our approach: major idea
Step 1: pre-checking
For cases where it is easy to decide whether the dominance condition is true
Make the decision directly
Step 2: dominance checking
For cases where it is difficult to decide whether the dominance condition is true directly
Derive an equivalent condition of the dominance condition which is easier to decide
Make the decision
Our approach: pre-checking
Step 1: Pre-checking:
If $S_a$ and $S_b$ overlap
Return false
If $S_b$ and $S_q$ overlap
Return false
[Figure: two examples. $S_a$ and $S_b$ overlap: $\mathrm{Dom}(S_a, S_b, S_q) = false$. $S_b$ and $S_q$ overlap: $\mathrm{Dom}(S_a, S_b, S_q) = false$]
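A minimal sketch of the pre-checking step, assuming (center, radius) tuples. Treating two hyperspheres that merely touch as overlapping is an assumption of this example: a shared point gives equal distances, which already violates a strict dominance condition.

```python
import math

def overlap(s1, s2):
    """True if the two hyperspheres share at least one point."""
    (c1, r1), (c2, r2) = s1, s2
    return math.dist(c1, c2) <= r1 + r2

def pre_check(sa, sb, sq):
    """Return False when pre-checking already decides Dom = false, else None."""
    if overlap(sa, sb):
        return False
    if overlap(sb, sq):
        return False
    return None  # undecided: continue with dominance checking

decided = pre_check(((0.0, 0.0), 2.0), ((3.0, 0.0), 2.0), ((9.0, 0.0), 1.0))
undecided = pre_check(((0.0, 0.0), 1.0), ((10.0, 0.0), 1.0), ((0.0, 0.0), 1.0))
```

Returning `None` rather than `True` keeps the two outcomes apart: pre-checking can only rule dominance out, never confirm it.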
Our approach: dominance checking (1)
Step 2: Dominance checking: Derive an equivalent condition of the dominance condition and check whether the derived condition is true
Dominance condition: $\forall a \in S_a, \forall b \in S_b, \forall q \in S_q: \mathrm{Dist}(a, q) < \mathrm{Dist}(b, q)$
Equivalent condition (1):
Proof of the equivalence between Condition (1) and Condition (2):
“=>”: By contradiction
“<=”:
Our approach: dominance checking (2)
Equivalent condition (2):
Equivalent condition (3):
$\mathrm{MaxDist}(q, S_a) = \mathrm{Dist}(q, c_a) + r_a + 0 = \mathrm{Dist}(q, c_a) + r_a$
$\mathrm{MinDist}(q, S_b) = \mathrm{Dist}(q, c_b) - r_b - 0 = \mathrm{Dist}(q, c_b) - r_b$
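For a fixed point $q$, the two identities above presumably chain together as follows. This is a sketch of the reasoning; which numbered condition each line corresponds to on the slides is an assumption. It treats $q$ as a hypersphere of radius 0 and relies on pre-checking having excluded overlap between $S_b$ and $S_q$.

```latex
\begin{align*}
\forall a \in S_a,\ \forall b \in S_b:\ \mathrm{Dist}(a, q) < \mathrm{Dist}(b, q)
  &\iff \mathrm{MaxDist}(q, S_a) < \mathrm{MinDist}(q, S_b) \\
  &\iff \mathrm{Dist}(q, c_a) + r_a < \mathrm{Dist}(q, c_b) - r_b \\
  &\iff \mathrm{Dist}(q, c_b) - \mathrm{Dist}(q, c_a) > r_a + r_b
\end{align*}
```

The last line compares a difference of distances to two fixed points with a constant, which is the defining form of a hyperbola branch.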
Our approach: dominance checking (3)
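Equivalent condition (3):
Space partitioning:
Boundary $P$: $\{x : \mathrm{Dist}(x, c_b) - \mathrm{Dist}(x, c_a) = r_a + r_b\}$
Region $R_a$: $\{x : \mathrm{Dist}(x, c_b) - \mathrm{Dist}(x, c_a) > r_a + r_b\}$
Region $R_b$: $\{x : \mathrm{Dist}(x, c_b) - \mathrm{Dist}(x, c_a) < r_a + r_b\}$
Equivalent condition (4): $S_q$ is in Region $R_a$ ($\forall q \in S_q$, $q$ is in Region $R_a$)
[Figure: $S_a$, $S_b$ and $S_q$ with centers $c_a$, $c_b$, $c_q$; Boundary $P$ separates Region $R_a$ from Region $R_b$]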
Our approach: dominance checking (4)
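Equivalent condition (4): $S_q$ is in Region $R_a$
Equivalent condition (5): $c_q$ is in Region $R_a$ and $\min_{x \in P} \mathrm{Dist}(c_q, x) > r_q$
[Figure: $S_a$, $S_b$ with centers $c_a$, $c_b$; Boundary $P$ between Region $R_a$ and Region $R_b$; $S_q$ with center $c_q$ and radius $r_q$ in Region $R_a$, with $\min_{x \in P} \mathrm{Dist}(c_q, x)$ marked]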
Our approach (2)
Compute $\min_{x \in P} \mathrm{Dist}(c_q, x)$
constraint: $x \in P$
objective: minimize $\mathrm{Dist}(c_q, x)$
We use the Lagrange Multiplier (LM) method. Details can be found in the paper
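The Lagrange Multiplier solution is not given on the slides, so the sketch below is not the authors' method. It approximates the minimum by sampling the boundary in 2D only, assuming the boundary is the hyperbola branch $\mathrm{Dist}(x, c_b) - \mathrm{Dist}(x, c_a) = r_a + r_b$ with foci $c_a$ and $c_b$. Because sampling can only overestimate the minimum, the result may be wrong for configurations very close to the boundary. All names and parameters are hypothetical.

```python
import math

def approx_min_dist_to_boundary(sa, sb, cq, t_max=10.0, steps=20001):
    """Approximate the minimum distance from cq to the boundary P (2D only)."""
    (ca, ra), (cb, rb) = sa, sb
    c = math.dist(ca, cb) / 2.0          # half the distance between the foci
    a = (ra + rb) / 2.0                  # semi-major axis of the hyperbola
    b = math.sqrt(c * c - a * a)         # real because pre-checking gives a < c
    mx, my = (ca[0] + cb[0]) / 2.0, (ca[1] + cb[1]) / 2.0          # midpoint
    ux, uy = (ca[0] - cb[0]) / (2.0 * c), (ca[1] - cb[1]) / (2.0 * c)  # unit cb -> ca
    vx, vy = -uy, ux                     # unit vector perpendicular to u
    best = math.inf
    for i in range(steps):
        t = -t_max + 2.0 * t_max * i / (steps - 1)
        # Point on the branch nearer ca, in the local (u, v) frame.
        lx, ly = a * math.cosh(t), b * math.sinh(t)
        x = (mx + lx * ux + ly * vx, my + lx * uy + ly * vy)
        best = min(best, math.dist(cq, x))
    return best

def hyperbola_dominates_2d(sa, sb, sq):
    """Pre-checking, then: cq in Region Ra and min distance to P above rq."""
    (ca, ra), (cb, rb), (cq, rq) = sa, sb, sq
    if math.dist(ca, cb) <= ra + rb or math.dist(cb, cq) <= rb + rq:
        return False                     # pre-checking: overlap means no dominance
    if math.dist(cq, cb) - math.dist(cq, ca) <= ra + rb:
        return False                     # cq is not in Region Ra
    return approx_min_dist_to_boundary(sa, sb, cq) > rq

sa = ((0.0, 0.0), 0.0)
sb = ((10.0, 0.0), 0.0)
inside = hyperbola_dominates_2d(sa, sb, ((-3.0, 0.0), 7.0))    # disk reaches x = 4
crossing = hyperbola_dominates_2d(sa, sb, ((-3.0, 0.0), 8.5))  # disk reaches x = 5.5
```

With two point hyperspheres the boundary degenerates to the bisector x = 5, at distance 8 from the query center. The radius-7 disk stays in Region Ra, while the radius-8.5 disk crosses the boundary.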
correct, sound: The condition (3) is equivalent to the dominance condition
efficient: Each condition transformation takes O(d) time and the cost of LM is also O(d)
Equivalent condition (5): $c_q$ is in Region $R_a$ and $\min_{x \in P} \mathrm{Dist}(c_q, x) > r_q$
Space partitioning: Boundary $P$, Region $R_a$, Region $R_b$
Empirical study: set-up
Datasets:
Real datasets: NBA, Color, Texture, and Forest
Synthetic datasets
Algorithms: MinMax, MBR, GP, Trigonometric, Hyperbola (our method)
Measures:
precision = TP/(TP+FP)
recall = TP/(TP+FN)
running time
A correct method has the precision always equal to 1
A sound method has the recall always equal to 1
Criteria of a method:
1. Correctness: No false positive (FP)
2. Soundness: No false negative (FN)
3. Efficiency: runs in O(d) where d is the number of dimensions
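The two measures can be computed from a method's answers and the true dominance results. A minimal sketch follows. The function name, the sample answers, and the convention of returning 1 when a denominator is zero are assumptions of this example.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for boolean dominance answers."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(not p and a for p, a in zip(predicted, actual))
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return precision, recall

# A method that never answers true wrongly but misses one true case:
# correct (precision 1) but not sound (recall below 1).
actual = [True, True, False, True]
predicted = [True, False, False, True]
precision, recall = precision_recall(predicted, actual)
```

Here precision is 1 and recall is 2/3, the pattern expected of a correct but unsound method.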
Empirical study: results (precision, NBA)
All algorithms except Trigonometric have precision = 1.
Empirical study: results (recall, NBA)
Only our approach (Hyperbola) and Trigonometric have recall = 1.
Empirical study: results (running time, NBA)
MinMax < GP < Hyperbola (our method) < MBR < Trigonometric
Conclusion
First solution for the hypersphere dominance problem, which is correct, sound and efficient for any dimension
An application study: kNN
Experiments
Q & A
The following slides are for backup use only
Hyperspheres in uncertain databases
Song and Roussopoulos [SSTD’01]
Cheng et al. [TKDE’04]
Chen and Cheng [ICDE’07]
Beskales et al. [PVLDB’08]
Our approach (1)
Dominance condition:
Equivalent condition (1): :
Major idea: Derive an equivalent condition of the dominance condition and check whether the derived condition is true
Equivalent condition (2):
Equivalent condition (3): and :
An application study: kNN query
kNN query: Given a set D of hyperspheres, a query hypersphere $S_q$, and an integer $k$, the query finds the set of hyperspheres in D each of which is not dominated, wrt $S_q$, by the hypersphere in D with the k-th smallest maximum distance from $S_q$.
Solution:
A best-first search algorithm based on SS-tree
Some pruning strategies
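Illustration 1: 2D space, $S_a$ and $S_b$ are two points (i.e., $r_a$ = 0, $r_b$ = 0)
[Figure: Boundary $P$ separating Region $R_a$ and Region $R_b$, with the points $S_a$ and $S_b$, and $S_q$ with center $c_q$]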