The University of Hong Kong

Capacity Constrained Assignment in Spatial Databases

Authors: Leong Hou U (University of Hong Kong), Man Lung Yiu (Aalborg University), Kyriakos Mouratidis (Singapore Management University), Nikos Mamoulis (University of Hong Kong)




Outline

Motivation
Related Work
Assignment Problems
Solutions
Approximate Solutions
Conclusion


Motivation

Assume that:
Our system has a set of service providers (Q) which serve a set of customers (P)
Each service provider (q) can serve at most k customers simultaneously
For every provider-customer pair (q,p), our central server knows the cost of assigning p to q

Our aim is to maximize our service utilization:

1. Maximize the number of served customers
2. Minimize the total sum of weights (assignment costs)


Case Study I

Consider the case of wireless routers and laptops:
Each router can serve at most 3 users concurrently
The signal strength is measured by the Euclidean distance (a longer distance means a weaker signal)

Can it be solved by Nearest Neighbor Queries?

3-Nearest Neighbor Queries


Case Study I

Can it be solved by Reverse Nearest Neighbor Queries?

Reverse Nearest Neighbor Queries


Case Study I

Can it be solved by Closest Pair Queries?

6-Closest Pairs (2 routers * 3 capacities)


Case Study I

Can it be solved by Spatial Matching (Exclusive Closest Pair)?

ECP matching
Router's capacity is 3

Find ECP between set {A} and {B}

1. Find the closest pair (a,b) from (A,B)

2. (a,b) is an exclusive closest pair; set a.k=a.k-1, b.k=b.k-1 (k is the capacity value)

3. {A}={A}-a if a.k=0, {B}={B}-b if b.k=0; go to step 1 until {A} or {B} is empty
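The three ECP steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `ecp_matching`, the dictionary layout and the sample coordinates are hypothetical, every pair is matched at most once, and a brute-force sort over all pairs stands in for a closest-pair query on a spatial index.

```python
from math import dist

def ecp_matching(A, B):
    """Greedy exclusive-closest-pair matching (illustrative sketch).

    A, B: dicts mapping a name to ((x, y), capacity).
    Returns a list of (a, b, distance) in the order the pairs are matched.
    """
    cap_a = {a: k for a, (_, k) in A.items()}
    cap_b = {b: k for b, (_, k) in B.items()}
    # Brute force: enumerate all pairs and sort them by distance.
    pairs = sorted((dist(A[a][0], B[b][0]), a, b) for a in A for b in B)
    matched = []
    for d, a, b in pairs:
        # Steps 2 and 3: accept the closest remaining pair whose endpoints
        # both still have capacity, then decrement both capacities.
        if cap_a[a] > 0 and cap_b[b] > 0:
            matched.append((a, b, d))
            cap_a[a] -= 1
            cap_b[b] -= 1
    return matched

# Hypothetical instance: two routers and two users, all with capacity 1.
routers = {"r1": ((0, 0), 1), "r2": ((4, 0), 1)}
users = {"u1": ((1, 0), 1), "u2": ((-10, 0), 1)}
ecp = ecp_matching(routers, users)
ecp_cost = sum(d for _, _, d in ecp)
```

On this instance the greedy choice pairs r1 with u1 first and is then forced to pair r2 with u2, for a total distance of 15, whereas assigning r1 to u2 and r2 to u1 costs 13. This is the kind of gap that motivates computing an optimal assignment instead.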


Case Study I

Can it be solved by optimal assignment?

Optimal assignment

Optimal assignment tries to serve as many users as possible while also minimizing the total cost (distance)


Related Work

Optimal assignment computes the maximum-size matching with minimum assignment cost

Two popular algorithms:
Hungarian Algorithm
Successive Shortest Path Algorithm (SSPA)

The time complexity of both algorithms is O(n^3) in the worst case, where n is the number of service providers or customers


Successive Shortest Path Algorithm (SSPA)

1. Find the shortest path (SP) from source to destination

2. Reverse the edge direction on the SP

3. Repeat steps 1~2 until no more paths can be found

[Figure: SSPA example on a flow graph from source S through providers q1, q2 and customers p1, p2 to destination D, with edge weights 0, 1, 1 and 3; edges on an augmented SP are reversed and their weights negated]


Successive Shortest Path Algorithm (SSPA)

SSPA is easy to adapt to capacity constraints

Assume that data set A is our routers, each with capacity 2, and data set B is our users

[Figure: SSPA iterations on the flow graph S, q1, q2, p1, p2, D when each router has capacity 2]
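The SSPA loop with capacities can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `sspa` and the graph encoding are assumptions, and Bellman-Ford is swapped in for the shortest-path search because it tolerates the negative weights of reversed edges directly (the slides use Dijkstra with potentials instead).

```python
def sspa(cost, capacity):
    """Successive shortest paths on a bipartite flow graph (sketch).

    cost[q][p]: cost of assigning customer p to provider q.
    capacity[q]: maximum number of customers provider q can serve.
    Returns (sorted list of (q, p) assignments, total cost).
    """
    nq, n_p = len(cost), len(cost[0])
    S, D = 0, nq + n_p + 1          # providers: 1..nq, customers: nq+1..nq+n_p
    n = D + 1
    edges = []                      # [u, v, residual capacity, weight]

    def add_edge(u, v, cap, w):
        edges.append([u, v, cap, w])    # forward edge at an even index i
        edges.append([v, u, 0, -w])     # its reverse at index i ^ 1

    for q in range(nq):
        add_edge(S, 1 + q, capacity[q], 0)
        for p in range(n_p):
            add_edge(1 + q, 1 + nq + p, 1, cost[q][p])
    for p in range(n_p):
        add_edge(1 + nq + p, D, 1, 0)   # each customer is served at most once

    total = 0
    while True:
        # Step 1: shortest path from S over edges with residual capacity.
        dist = [float("inf")] * n
        prev = [-1] * n
        dist[S] = 0
        for _ in range(n - 1):
            changed = False
            for i, (u, v, cap, w) in enumerate(edges):
                if cap > 0 and dist[u] + w < dist[v]:
                    dist[v], prev[v], changed = dist[u] + w, i, True
            if not changed:
                break
        if dist[D] == float("inf"):
            break                       # Step 3: no more augmenting paths
        # Step 2: push one unit along the path, i.e. reverse its edges.
        v = D
        while v != S:
            i = prev[v]
            edges[i][2] -= 1
            edges[i ^ 1][2] += 1
            v = edges[i][0]
        total += dist[D]

    assignment = sorted(
        (u - 1, v - 1 - nq)
        for i, (u, v, cap, w) in enumerate(edges)
        if i % 2 == 0 and 1 <= u <= nq and nq < v < D and cap == 0
    )
    return assignment, total

# Weights 0, 1, 1, 3 as in the figure; the edge-to-weight mapping is assumed.
pairs_k1, cost_k1 = sspa([[0, 1], [1, 3]], [1, 1])
pairs_k2, cost_k2 = sspa([[0, 1], [1, 3]], [2, 2])
```

With capacity 1 per provider the second augmenting path has weight 1-0+1=2 and reroutes the first assignment; with capacity 2 both customers go to the first provider for a total cost of 1. Only the capacity on the source edges changes between the two runs.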


Preliminary Solution

In our problem setting, we have a set of service providers (Q), each with capacity k, and a set of customers (P) indexed by an R-tree

Let us analyze SSPA performance in detail:
Consider the case |Q|=|P| and k=1
For every q in Q, we need to find a SP (N searches, where N=|Q|)
Finding a SP in the bipartite graph between Q and P takes time |Eall|, where Eall is the set of all edges between Q and P

So the time complexity is N*|Eall|

The algorithm should do better if the bipartite graph is smaller

N*|Esub| << N*|Eall|, if |Esub| << |Eall|
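As a quick sanity check of this estimate, assume the bipartite graph is complete, so that every provider is connected to every customer (this completeness is an assumption of the sketch). With |Q|=|P|=N the edge count and the total cost then work out as:

```latex
|E_{all}| = |Q| \cdot |P| = N^2
\quad\Longrightarrow\quad
N \cdot |E_{all}| = N^3
```

This matches the cubic worst-case bound quoted earlier for SSPA, and it shows why shrinking the edge set is where the savings come from.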


Preliminary Solution

A SP can be determined by a sub-graph, if the sub-graph is built in order

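[Figure: flow graph over S, q1 to q3, p1 to p3 and D, shown with all edges and restricted to the sub-graph]

Only add edges with weight ≤ 1 into our graph

Full cost matrix:

      p1   p2   p3
q1     0    1    5
q2     1   12   14
q3     4    3    8

Costs known after adding the edges with weight ≤ 1:

      p1   p2   p3
q1     0    1   >1
q2     1   >1   >1
q3    >1   >1   >1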


Solution - RIA

The Range Incremental Algorithm (RIA) builds the bipartite graph incrementally, based on the last observation

Lemma 1: If all the edges with weight ≤ T have been added to the sub-graph (Esub), then a SP in Esub with weight ≤ T must also be a SP in EQxP

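[Figure: flow graph over S, q1 to q3, p1 to p3 and D, restricted to the edges in Esub]

T=1: only add those edges with weight ≤ T into our graph

The weight of the SP is 2, which exceeds T

Increase the threshold, T=T+1 => T=2; this does not add any edge to the graph

PROBLEM: the range expansion adds no new edges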

Costs known at T=1:

      p1   p2   p3
q1     0    1   >1
q2     1   >1   >1
q3    >1   >1   >1

Costs known at T=2:

      p1   p2   p3
q1     0    1   >2
q2     1   >2   >2
q3    >2   >2   >2
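The threshold behaviour in this example can be reproduced with a small sketch. It uses the 3x3 cost matrix from the Preliminary Solution slide; the name `edges_within` is hypothetical, and a brute-force scan stands in for the range search that RIA would issue against the R-tree.

```python
# Cost matrix from the slides (providers q1..q3, customers p1..p3).
COST = {
    "q1": {"p1": 0, "p2": 1, "p3": 5},
    "q2": {"p1": 1, "p2": 12, "p3": 14},
    "q3": {"p1": 4, "p2": 3, "p3": 8},
}

def edges_within(cost, T):
    """Return the set of edges (q, p, w) whose weight w is at most T."""
    return {(q, p, w) for q, row in cost.items() for p, w in row.items() if w <= T}

e_sub_t1 = edges_within(COST, 1)
e_sub_t2 = edges_within(COST, 2)
e_sub_t3 = edges_within(COST, 3)

# Edges gained by each threshold increase.
gained_1_to_2 = e_sub_t2 - e_sub_t1   # empty: the wasted expansion
gained_2_to_3 = e_sub_t3 - e_sub_t2   # the edge (q3, p2, 3) finally appears
```

Raising T from 1 to 2 costs a round of range searches but leaves Esub unchanged, which illustrates why a fixed threshold step is hard to choose well.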


Solution - NIA

The Nearest Neighbor Incremental Algorithm (NIA) grows Esub one nearest-neighbor edge at a time

[Figure: flow graph over S, q1 to q3, p1 to p3 and D, restricted to the edges in Esub]

Lemma 2: If the weight of the SP ≤ H.top(), then it is also a SP in Eall

Otherwise, add a new edge from H to Esub

Successive heap states:

Heap H={(q1,p1,0), (q2,p1,1), (q3,p2,3)}

Heap H={(q1,p2,1), (q2,p1,1), (q3,p2,3)}

Heap H={(q2,p1,1), (q3,p2,3), (q1,p3,5)}

Costs known after adding (q1,p1,0):

      p1   p2   p3
q1     0   ≥0   ≥0
q2    ≥0   ≥0   ≥0
q3    ≥0   ≥0   ≥0

Costs known after adding (q1,p2,1):

      p1   p2   p3
q1     0    1   ≥1
q2    ≥1   ≥1   ≥1
q3    ≥1   ≥1   ≥1
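The heap states listed above can be reproduced with a short sketch of the heap maintenance alone. It covers neither the shortest-path search nor the Lemma 2 test; the names `nn_stream` and `pop_and_refill` are hypothetical, and sorting each provider's row of the cost matrix stands in for an incremental nearest-neighbor search on the R-tree.

```python
import heapq

# Cost matrix from the slides (providers q1..q3, customers p1..p3).
COST = {
    "q1": {"p1": 0, "p2": 1, "p3": 5},
    "q2": {"p1": 1, "p2": 12, "p3": 14},
    "q3": {"p1": 4, "p2": 3, "p3": 8},
}

def nn_stream(q):
    """Yield (weight, q, p) for provider q in ascending weight order."""
    for p, w in sorted(COST[q].items(), key=lambda kv: kv[1]):
        yield (w, q, p)

def pop_and_refill(heap, streams):
    """Move the cheapest pending edge out of H and fetch that provider's next NN."""
    w, q, p = heapq.heappop(heap)
    nxt = next(streams[q], None)
    if nxt is not None:
        heapq.heappush(heap, nxt)
    return (q, p, w)            # the edge that would be added to Esub

streams = {q: nn_stream(q) for q in COST}
heap = [next(streams[q]) for q in COST]   # one nearest neighbor per provider
heapq.heapify(heap)

state0 = sorted(heap)
edge1 = pop_and_refill(heap, streams)
state1 = sorted(heap)
edge2 = pop_and_refill(heap, streams)
state2 = sorted(heap)
```

Entries are stored as (weight, q, p) so that the heap orders them by weight; apart from that reordering, `state0`, `state1` and `state2` correspond to the three heap states on the slide.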


Solution - IDA

Lemma 3: If a service provider in Q is not reached from the source S, then it is not necessary to add its nearest neighbor to Esub

We develop a novel algorithm, the Incremental On-Demand Algorithm (IDA), which is based on this lemma

[Figure: flow graph over S, q1 to q3, p1 to p3 and D, restricted to the edges in Esub]

Heap H={(q1,p2,1), (q2,p1,1), (q3,p2,3)}

It is not necessary to add this edge in the current state, since it cannot help us find any new SP

Heap H={(q2,p1,1), (q3,p2,3), (q1,p3,5)}

      p1   p2   p3
q1     0    1   ≥1
q2    ≥1   ≥1   ≥1
q3    ≥1   ≥1   ≥1


Solution - IDA

[Figure: flow graph over S, q1 to q3, p1 to p3 and D, restricted to the edges in Esub]

Heap H={(q1,p2,1), (q2,p1,1), (q3,p2,3)}

Heap H={(q2,p1,1), (q3,p2,3)}

Heap H={(q3,p2,3), (q2,p2,12)}

Heap H={(q1,p2,1), (q3,p2,3), (q2,p2,12)}

Heap H={(q3,p2,3), (q1,p3,5), (q2,p2,12)}

IDA only expands the graph when it is necessary
It is expected to have a smaller sub-graph (smaller Esub) when executing SP searches

Weight of SP is 1-0+1=2

      p1   p2   p3
q1     0   ≥1   ≥1
q2    ≥1   ≥1   ≥1
q3    ≥1   ≥1   ≥1

      p1   p2   p3
q1     0   ≥3   ≥3
q2     1   ≥3   ≥3
q3    ≥3   ≥3   ≥3

      p1   p2   p3
q1     0    1   ≥3
q2     1   ≥3   ≥3
q3    ≥3   ≥3   ≥3


Experiments

Number of Service Providers |Q| (in thousands): 0.25, 0.5, 1, 2.5, 5

Number of Customers |P| (in thousands): 25, 50, 100, 150, 200

Capacity k: 20, 40, 80, 160, 320

Both datasets were generated on the road map of San Francisco

Implementation: C++, on a Pentium D 3.0 GHz machine running Ubuntu 7.10


Experiments

First, we test the performance on a small dataset for different capacities k (|Q|=0.25, |P|=25, in thousands)

[Figure: CPU time (s), log scale from 0.1 to 10000, versus capacity k (20, 40, 80, 160, 320) for SSPA, RIA, NIA and IDA]


Approximate Solution

Time-critical applications could favor fast answers over exact matching

Our approximate solutions provide a tunable trade-off between result accuracy and response time, with theoretical guarantees for the assignment cost

Three phases of our general method:
Partitioning phase
Concise matching phase
Refinement phase

[Figure: a group of points and the centroid of the group]


Service provider Approximation (SA)

Service providers are sorted by Hilbert value and are grouped in this order

Each point q is inserted into an existing group G so that the diagonal of G's MBR does not exceed δ

If no such group is found, then a new group is formed to contain q

The centroid of a group G is its capacity-weighted geometric centroid; e.g., for the x-coordinate,

sum(q.x*q.k) / sum(q.k), where q in G

The theoretical error bound is 2 * (number of assignments) * δ
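The partitioning rule and the centroid formula can be sketched as below. Several details are assumptions rather than the authors' method: the input list is taken to be already sorted by Hilbert value, a point joins the first existing group that still satisfies the δ constraint, the group representative carries the summed capacity, and the names `mbr_diagonal`, `group_providers` and `weighted_centroid` are hypothetical.

```python
from math import hypot

def mbr_diagonal(points):
    """Diagonal length of the minimum bounding rectangle of (x, y, k) points."""
    xs = [x for x, _, _ in points]
    ys = [y for _, y, _ in points]
    return hypot(max(xs) - min(xs), max(ys) - min(ys))

def group_providers(providers, delta):
    """Greedy grouping: providers is a list of (x, y, k) in Hilbert order."""
    groups = []
    for q in providers:
        for g in groups:
            # Join the first group whose MBR diagonal stays within delta.
            if mbr_diagonal(g + [q]) <= delta:
                g.append(q)
                break
        else:
            groups.append([q])      # no group fits: open a new one
    return groups

def weighted_centroid(group):
    """Capacity-weighted centroid, with the summed capacity of the group."""
    total_k = sum(k for _, _, k in group)
    cx = sum(x * k for x, _, k in group) / total_k
    cy = sum(y * k for _, y, k in group) / total_k
    return (cx, cy, total_k)

# Hypothetical providers (x, y, capacity).
providers = [(0, 0, 1), (0, 2, 3), (10, 10, 2)]
groups = group_providers(providers, delta=3)
representatives = [weighted_centroid(g) for g in groups]
```

Here the first two providers share a group (MBR diagonal 2) and its centroid is pulled towards the provider with capacity 3, giving a y-coordinate of 1.5 rather than the unweighted midpoint 1.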


Customer Approximation (CA)

Unlike SA, CA can do the grouping in the R-tree
The theoretical error bound is (number of assignments) * δ

[Figure: R-tree with entries e1 to e7 and the grouping threshold δ]


Refinement Phase

In the refinement phase, SA and CA only need to solve some smaller assignment problems

We could run the exact algorithm for each of these smaller problems. This, however, is expensive

Therefore, two heuristic methods are proposed:
NN-based refinement: find the NN customer for each service provider in round-robin fashion
Exclusive Closest Pair refinement: use ECP to make the assignment
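The NN-based refinement can be read as the following round-robin loop. This is one plausible interpretation sketched under assumptions: the function name `nn_round_robin`, the data layout and the coordinates are hypothetical, and a linear scan stands in for a nearest-neighbor query.

```python
from math import dist

def nn_round_robin(providers, customers):
    """Round-robin NN refinement (illustrative sketch).

    providers: dict name -> ((x, y), capacity)
    customers: dict name -> (x, y)
    Returns the list of (provider, customer) pairs in assignment order.
    """
    remaining = dict(customers)
    cap = {q: k for q, (_, k) in providers.items()}
    assignment = []
    progress = True
    while remaining and progress:
        progress = False
        for q, (loc, _) in providers.items():
            if cap[q] == 0 or not remaining:
                continue
            # Each provider in turn takes its nearest unassigned customer.
            p = min(remaining, key=lambda c: dist(loc, remaining[c]))
            assignment.append((q, p))
            del remaining[p]
            cap[q] -= 1
            progress = True
    return assignment

# Hypothetical sub-problem produced by the concise matching phase.
providers = {"q1": ((0, 0), 2), "q2": ((10, 0), 1)}
customers = {"p1": (1, 0), "p2": (2, 0), "p3": (9, 0)}
refined = nn_round_robin(providers, customers)
```

Taking turns keeps one provider from claiming all of its nearest customers before the others have chosen, at the price of giving up any optimality guarantee.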


Experiments

Quality = (sum of approximate cost) / (sum of optimal cost)

[Figure: quality of SA and CA]


Conclusion

We proposed three algorithms which solve the CCA problem efficiently

All our methods try to:
Minimize I/O accesses
Minimize CPU time

We also proposed two approximate solutions which achieve a good trade-off between execution time and matching quality

Our next step is to investigate:
Incremental updates to the CCA solution
Continuous monitoring of CCA
Other types of matching (assignment) problems


Thank you!

Any questions?


Hungarian Algorithm

1. Find the smallest value in each row, and subtract it from every element in that row

2. Find the smallest value in each column, and subtract it from every element in that column

3. Find the minimum number of lines needed to cover all zeros

4. Find the smallest value among all uncovered elements, and subtract it from every uncovered element (also, add it to each cell at the intersection of two covering lines)

5. Repeat steps 3~4 until the number of lines is equal to |A| or |B|

Original cost matrix (row reductions: a1 -0, a2 -1, a3 -7):

      b1   b2   b3
a1     0    1    9
a2     1    5    7
a3     8    7    8

After step 1 (column reductions: b1 -0, b2 -0, b3 -1):

      b1   b2   b3
a1     0    1    9
a2     0    4    6
a3     1    0    1

After step 2:

      b1   b2   b3
a1     0    1    8
a2     0    4    5
a3     1    0    0

After step 4:

      b1   b2   b3
a1     0    0    7
a2     0    3    4
a3     2    0    0
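The reductions in this example can be checked with a short sketch. It covers steps 1, 2 and 4 only: step 3 (the minimum line cover) is not implemented, and the covering lines are supplied by hand as column b1 and row a3, which is an assumption read off the example. The function names are hypothetical.

```python
def reduce_rows(m):
    """Step 1: subtract each row's minimum from that row."""
    return [[v - min(row) for v in row] for row in m]

def reduce_cols(m):
    """Step 2: subtract each column's minimum from that column."""
    col_min = [min(col) for col in zip(*m)]
    return [[v - col_min[j] for j, v in enumerate(row)] for row in m]

def adjust(m, covered_rows, covered_cols):
    """Step 4: shift by the smallest uncovered value, given the covering lines."""
    delta = min(
        v
        for i, row in enumerate(m)
        for j, v in enumerate(row)
        if i not in covered_rows and j not in covered_cols
    )
    out = []
    for i, row in enumerate(m):
        new_row = []
        for j, v in enumerate(row):
            if i not in covered_rows and j not in covered_cols:
                v -= delta          # uncovered cell
            elif i in covered_rows and j in covered_cols:
                v += delta          # intersection of two covering lines
            new_row.append(v)
        out.append(new_row)
    return out

# Rows a1..a3, columns b1..b3, as in the slide.
cost = [[0, 1, 9], [1, 5, 7], [8, 7, 8]]
after_rows = reduce_rows(cost)
after_cols = reduce_cols(after_rows)
# Zeros are covered by two lines: row a3 (index 2) and column b1 (index 0).
after_adjust = adjust(after_cols, covered_rows={2}, covered_cols={0})
```

After the adjustment three lines are needed to cover all zeros, so the loop in step 5 stops, and the zeros at (a1,b2), (a2,b1) and (a3,b3) give an assignment of total cost 1+1+8=10 in the original matrix.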


Hungarian Algorithm

The Hungarian algorithm does not handle capacity constraints efficiently; duplicating rows/columns is not a good solution

The memory usage of Hungarian is very high: Sum(a.k) x Sum(b.k), where a in A, b in B

Step 3 of Hungarian (find the minimum number of lines to cover all zeros) is not easy to optimize further

Cost matrix with each row duplicated:

      b1   b2   b3
a1     0    1    9
a1     0    1    9
a2     1    5    7
a2     1    5    7
a3     8    7    8
a3     8    7    8

Reduced matrix:

      b1   b2   b3
a1     0    1    8
a2     0    4    5
a3     1    0    0


Optimization – Reducing Dijkstra Execution

Some optimizations to Dijkstra can be done:
Dijkstra stops the search when the weight of a potential SP is higher than the top value in heap H
Once a new edge is added to Esub, it only affects one vertex and its subsequent vertices
Notice that Dijkstra cannot run with negative edge weights, but potential values can be used to solve this problem
Each node has a potential value, which is changed when updating the graph
The potential weight of an edge is calculated from the edge weight and the potential values of its two vertices, and is never negative

[Figure: vertices potentially affected by the newly added edge versus unaffected vertices]
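The slide states the potential-based reweighting only in words. In the standard formulation of this technique (an assumption here, since the exact sign convention used by the authors is not shown), each vertex v carries a potential π(v), and an edge (u, v) with weight w(u, v) gets the reweighted cost:

```latex
w_\pi(u, v) = w(u, v) + \pi(u) - \pi(v) \;\ge\; 0
```

Along any path from S to D the potentials telescope, so every such path changes length by the same constant π(S) - π(D). The shortest path is therefore unchanged, and Dijkstra can be run on the non-negative reweighted costs.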


Optimization – Incremental All Nearest Neighbor

All three proposed algorithms issue numerous range/NN searches around the service providers against the R-tree that indexes the customers

To reduce the I/O cost, we employ an incremental all-nearest-neighbor technique