SCUBA: Scalable Cluster-Based Algorithm for Evaluating Spatio-Temporal Queries On Moving Objects 1 Rimma V. Nehme Department of Computer Sciences, Purdue University, W.Lafayette, IN 47906 USA [email protected] Elke A. Rundensteiner Department of Computer Sciences, Worcester Polytechnic Institute, Worcester, MA 01609 USA [email protected] SCUBA: Scalable Cluster-Based Algorithm for Evaluating Continuous Spatio-Temporal Queries on Moving Objects


SCUBA: Scalable Cluster-Based Algorithm for Evaluating Spatio-Temporal Queries On Moving Objects

1

Rimma V. Nehme

Department of Computer Sciences, Purdue University,

W.Lafayette, IN 47906 [email protected]

Elke A. Rundensteiner

Department of Computer Sciences, Worcester Polytechnic Institute,

Worcester, MA 01609 [email protected]

SCUBA: Scalable Cluster-Based Algorithm for Evaluating Continuous Spatio-Temporal Queries on Moving Objects


EDBT 2006 SCUBA2

Outline

- Motivation
- Related Work
- Our Approach: SCUBA
- Experimental Study
- Conclusion


Challenges for Continuous Query Processing on Spatio-Temporal Data Streams

Scalability
- Large number of objects
- Large number of queries

Limited Resources
- Memory
- CPU

Real-time Response Requirement

The challenge is to provide fast query response in update-intensive environments

Legend: moving objects, dynamic range query, dynamic kNN query

Novel Idea: Exploit the fact that objects naturally move in groups (i.e., clusters) to optimize query evaluation


Motivation

Monitor the traffic in the red areas

Continuously return the area covered by the herd during the migration


Big Picture

Traditional Execution: SR [SR01], DQ [LPM02], CNN [TPS], TPR [SJL00]

Shared Execution: SINA [MXA04], SEA-CNN [XMA05], Q-Index [PXK+02], PSoup [CF03], NiagaraCQ [CDT00]

Our work (SCUBA): Shared Cluster-Based Execution

Novel Idea: We use clustering as a means to improve execution of spatio-temporal queries on moving objects

Example queries:
- Q1: Return all cars in region R1
- Q2: Return all police cars in region R2

[Figure: three query plans for Q1 and Q2 over the Moving Objects and Spatio-temporal Queries inputs]

(A) Individual query plans for spatio-temporal queries: a separate Scan for each of Q1 and Q2

(B) Shared query plan for spatio-temporal queries: Scan, Scan, Spatial Join, Q1 and Q2

(C) Shared cluster-based query plan for spatio-temporal queries: Scan, Scan over Moving Objects and Moving Clusters, Join Between, Join Within, Q1 and Q2


Our Idea: Moving Clusters

Main Idea: Abstract individual entities into a cluster based on common attributes

- Direction
- Speed
- Spatial Position

With cluster abstractions, we want to minimize the number of unnecessary individual object/query joins, thus optimizing query evaluation

Example query: Continuously retrieve the closest police car next to me

Scalable Cluster-Based Algorithm for Evaluating Continuous Spatio-Temporal Queries on Moving Objects (SCUBA)


Advantage of Moving Clusters

When clusters don't overlap, we avoid many joins of individual objects within those clusters

[Figure: two non-overlapping moving clusters m1 and m2. Legend: moving object, spatio-temporal range query]

No need to join objects/queries in m1 with queries/objects in m2

If two abstractions do not overlap, then we can discard negative candidates and avoid individual joins.

We present SCUBA in the context of continuous spatio-temporal range queries


Advantage of Moving Clusters

Objects/Queries continuously move, but grid cells are static: if we put them in a grid, we continuously have to move them into and out of grid cells.

Instead, we want to make "flexible cells", i.e., moving clusters


Architecture Overview

- SCUBA-enabled motion operator execution
- Answers produced periodically (every ∆)

[Figure: the SCUBA Motion Operator inside the CAPE Engine. Inputs: Moving Objects Data Stream and Moving Queries Data Stream. Output: Results Data Stream. Inside the operator: Moving Clusters and, when the time interval expires, a Grid-based Join Between/Within Clusters. CAPE components shown: Stream Generator, Query Plan Generator, Execution Engine, Execution Scheduler, Statistics Gatherer, Stream Receiver, Raindrop Workhorse. An End User submits a User Query over the Internet. Legend: range query, moving object, Control Flow, Data Flow]


Network Constrained Movement

Movement is constrained within a road network:
- Roads = edges
- Intersections = connection nodes

[Figure: New York City road map with a Connection Node (CNLoc) marked]

SCUBA supports both constrained and unconstrained movement.


Moving Cluster Representation in SCUBA

[Figure: a moving cluster annotated with the following labels]

- Centroid
- Actual Cluster Size
- ΘD
- Max Cluster Size
- Direction Vector
- Speed
- Cluster members: moving objects
- Cluster members: moving queries

Cluster Member Representation Inside Cluster: [Figure: a cluster member (moving object) shown relative to the Centroid]

Moving clusters expire after some time
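The attributes named on this slide can be gathered into a small record. The following is a minimal sketch, not the CAPE implementation: the class name `MovingCluster`, its field names, and the `expires_at` timestamp are assumptions made for illustration.

```python
import math
from dataclasses import dataclass, field
from typing import Set, Tuple


@dataclass
class MovingCluster:
    """Hypothetical container for the cluster attributes listed on the slide."""
    centroid: Tuple[float, float]      # centroid position
    direction: Tuple[float, float]     # direction vector
    speed: float                       # speed shared by the members
    radius: float = 0.0                # actual cluster size
    max_radius: float = 100.0          # max cluster size (assumed default)
    expires_at: float = math.inf       # moving clusters expire after some time
    objects: Set[int] = field(default_factory=set)   # member moving objects
    queries: Set[int] = field(default_factory=set)   # member moving queries

    def member_count(self) -> int:
        return len(self.objects) + len(self.queries)

    def is_expired(self, now: float) -> bool:
        return now >= self.expires_at


cluster = MovingCluster(centroid=(10.0, 20.0), direction=(1.0, 0.0),
                        speed=5.0, expires_at=30.0)
cluster.objects.update({1, 2})
cluster.queries.add(7)
```

Keeping objects and queries in the same record reflects the slide's point that both kinds of members travel together inside one cluster.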


SCUBA Execution

SCUBA produces results periodically (every ∆ time units)

SCUBA has three phases

Phase I: Cluster Pre-Join Maintenance
- Formation of new clusters
- Dissolving "empty" clusters
- Expanding existing clusters

Phase II: Cluster-Based Joining
- Joining clusters
- Joining objects and queries inside clusters

Phase III: Cluster Post-Join Maintenance
- Dissolving "expiring" clusters
- Relocating "non-expiring" clusters based on velocity vector

[Figure: execution flow with the labels Objects & Queries, In-memory clustering, Timeout, Cluster Pre-Join Maintenance, Cluster-Based Joining, Cluster Post-Join Maintenance, Clusters Position Update, DONE, Send Results]


Phase I: Cluster Pre-Join Maintenance

- Clustering is done incrementally (upon the arrival of updates)
- Location update format: (ID, Loc_t, t, Speed, CNLoc, ...)
- Use 2 thresholds + destination
  - ΘD: distance threshold
  - ΘS: speed threshold
  - Destination: Connection Node (CNLoc)

The clustering algorithm is based on the Leader-Follower Clustering Algorithm (J.A. Hartigan. Clustering Algorithms, John Wiley and Sons, 1975)

Clustering New Object Example

(1) New moving object arrives

(2) Hash object into grid

(3) Add object to cluster and update cluster attributes

(4) If cluster has expanded, check for overlap with neighboring cells (make new entries if necessary)

(5) If the object left an existing cluster for a new cluster and the old cluster is "empty", dissolve the old cluster.

Cluster attributes: centroid position, radius, average speed, member count

[Figure labels: clusters M1, M2, M3; Parent Cluster]
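The incremental clustering steps can be sketched as a leader-follower loop. This is a simplified illustration under stated assumptions: it scans all clusters linearly instead of hashing into a grid, it handles only newly arriving objects (not moves between clusters), and the dictionary layout and function names (`fits`, `absorb`, `cluster_update`) are hypothetical. The threshold values are the defaults from the experimental settings.

```python
import math

THETA_D = 100.0  # distance threshold (spatial units)
THETA_S = 10.0   # speed threshold (spatial units / time unit)


def fits(cluster, update):
    """An update joins a cluster if it is near, similarly fast, and has the same destination."""
    near = math.dist(cluster["centroid"], update["loc"]) <= THETA_D
    similar_speed = abs(cluster["speed"] - update["speed"]) <= THETA_S
    return near and similar_speed and cluster["cn_loc"] == update["cn_loc"]


def absorb(cluster, update):
    """Add the member and refresh centroid position, average speed, and radius."""
    members = cluster["members"]
    members[update["id"]] = update
    n = len(members)
    cx = sum(m["loc"][0] for m in members.values()) / n
    cy = sum(m["loc"][1] for m in members.values()) / n
    cluster["centroid"] = (cx, cy)
    cluster["speed"] = sum(m["speed"] for m in members.values()) / n
    cluster["radius"] = max(math.dist((cx, cy), m["loc"]) for m in members.values())


def cluster_update(clusters, update):
    """Follow the first matching cluster; otherwise the update leads a new cluster."""
    for cluster in clusters:
        if fits(cluster, update):
            absorb(cluster, update)
            return cluster
    leader = {"centroid": update["loc"], "speed": update["speed"],
              "cn_loc": update["cn_loc"], "radius": 0.0,
              "members": {update["id"]: update}}
    clusters.append(leader)
    return leader


clusters = []
for u in [
    {"id": 1, "loc": (0.0, 0.0), "speed": 50.0, "cn_loc": "A"},
    {"id": 2, "loc": (60.0, 0.0), "speed": 55.0, "cn_loc": "A"},   # joins cluster of 1
    {"id": 3, "loc": (30.0, 0.0), "speed": 90.0, "cn_loc": "A"},   # too fast: new cluster
    {"id": 4, "loc": (30.0, 40.0), "speed": 52.0, "cn_loc": "B"},  # other destination: new cluster
]:
    cluster_update(clusters, u)
```

In this run the four updates produce three clusters: objects 1 and 2 share one, while objects 3 and 4 each lead their own.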


Phase II: Cluster-Based Joining

[Figure: timeline. Phase I: Incremental Clustering as location updates arrive. When ∆ expires, Phase II: Cluster-Based Join (1. Join-Between, 2. Join-Within)]


Phase II: Cluster-Based Joining

Join-Between
- Between two clusters

Join-Within
- For each cluster (joining objects and queries inside)
- For two overlapping clusters (cross-join between objects and queries from the two clusters)

[Figure: Join-Between detects cluster overlap and non-overlapping clusters are ignored; Join-Within produces the query results]
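The two-step join can be sketched as follows. This is an illustration under assumptions, not the authors' code: clusters are treated as circles, range queries as axis-aligned rectangles, and clusters are compared pairwise rather than through the grid. All names (`clusters_overlap`, `join_within`, `cluster_join`) are hypothetical.

```python
import math
from itertools import combinations


def clusters_overlap(a, b):
    """Join-Between filter: two circular clusters overlap if their centroids are close enough."""
    return math.dist(a["centroid"], b["centroid"]) <= a["radius"] + b["radius"]


def in_range(obj, query):
    x, y = obj["loc"]
    x1, y1, x2, y2 = query["rect"]
    return x1 <= x <= x2 and y1 <= y <= y2


def join_within(objects, queries):
    """Join-Within: individual object/query comparisons."""
    return {(q["id"], o["id"]) for q in queries for o in objects if in_range(o, q)}


def cluster_join(clusters):
    results = set()
    for c in clusters:                        # members of the same cluster
        results |= join_within(c["objects"], c["queries"])
    for a, b in combinations(clusters, 2):    # only overlapping cluster pairs
        if clusters_overlap(a, b):
            results |= join_within(a["objects"], b["queries"])
            results |= join_within(b["objects"], a["queries"])
    return results


m1 = {"centroid": (0, 0), "radius": 10,
      "objects": [{"id": "o1", "loc": (1, 1)}],
      "queries": [{"id": "q1", "rect": (0, 0, 5, 5)}]}
m2 = {"centroid": (15, 0), "radius": 10,
      "objects": [{"id": "o2", "loc": (5, 0)}],
      "queries": []}
m3 = {"centroid": (100, 100), "radius": 5,
      "objects": [{"id": "o3", "loc": (100, 100)}],
      "queries": [{"id": "q3", "rect": (98, 98, 102, 102)}]}

results = cluster_join([m1, m2, m3])
```

Here m1 and m2 overlap, so their members are cross-joined, while m3 is far from both and its members are only joined with each other.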


Phase III: Cluster Post-Join Maintenance

[Figure labels: Connection Node, Dissolved, New Cluster, Position Updated, Clear the grid, Insert into the grid]

- Dissolve "expiring" clusters
- Relocate "non-expiring" clusters, based on velocity vector, back into the grid
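A minimal sketch of the post-join step, under several assumptions: each cluster carries a hypothetical `expires_at` timestamp and a `velocity` pair, the grid is a dictionary keyed by cell coordinates, and a cluster is registered only in the cell containing its centroid (a fuller version would register every cell the cluster overlaps). `CELL` and `post_join_maintenance` are illustrative names.

```python
from collections import defaultdict

CELL = 10.0  # hypothetical grid cell width


def post_join_maintenance(clusters, now, delta):
    """Dissolve expiring clusters, move the rest along their velocity, rebuild the grid."""
    grid = defaultdict(list)   # "clear the grid": start from an empty one
    survivors = []
    for c in clusters:
        if c["expires_at"] <= now:
            continue           # dissolve the expiring cluster
        x, y = c["centroid"]
        vx, vy = c["velocity"]
        c["centroid"] = (x + vx * delta, y + vy * delta)   # relocate
        cell = (int(c["centroid"][0] // CELL), int(c["centroid"][1] // CELL))
        grid[cell].append(c["id"])                         # insert into the grid
        survivors.append(c)
    return survivors, grid


clusters = [
    {"id": "m1", "centroid": (5.0, 5.0), "velocity": (2.0, 0.0), "expires_at": 100.0},
    {"id": "m2", "centroid": (50.0, 50.0), "velocity": (0.0, 1.0), "expires_at": 10.0},
]
survivors, grid = post_join_maintenance(clusters, now=10.0, delta=5.0)
```

In this run m2 has expired and is dropped, while m1 advances by ten units along x and lands in the neighboring cell.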


Moving Cluster-Based Load Shedding

[Figure: a moving cluster with threshold ΘD and a Velocity Vector; members labeled O1(r1,1), O2(r2,2), O3(r3,3), Q4(r4,4), Q5(r5,5)]

Load Shedding: the process of dropping excess load from the system when the demand on resources is above the system capacity [TCZ03].

Load shedding reduces resource requirements by dropping data, thereby sacrificing the accuracy of the query answers.

The main goal is to minimize the degradation in accuracy.

Focus: Discarding data inside moving clusters
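One way to picture discarding data inside a cluster is sketched below. The policy is an assumption for illustration only and is not stated on the slide: members closer to the centroid than a threshold `theta_n` have their individual positions dropped, on the premise that the centroid approximates them well. The function name `shed_load` and the data layout are hypothetical. With `theta_n = 0`, nothing is dropped, which matches the "ΘN = 0 (no load shedding)" default in the experimental settings.

```python
import math


def shed_load(cluster, theta_n):
    """Split members into those kept individually and those dropped near the centroid."""
    kept, dropped = {}, []
    for member_id, loc in cluster["members"].items():
        if math.dist(loc, cluster["centroid"]) < theta_n:
            dropped.append(member_id)      # position discarded (assumed policy)
        else:
            kept[member_id] = loc
    return kept, sorted(dropped)


cluster = {"centroid": (0.0, 0.0),
           "members": {1: (1.0, 0.0), 2: (0.0, 3.0), 3: (8.0, 6.0)}}

kept, dropped = shed_load(cluster, theta_n=5.0)
kept_all, dropped_none = shed_load(cluster, theta_n=0.0)
```

A larger threshold discards more positions and saves more memory, at the cost of less accurate answers for queries that cut through the middle of a cluster.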


Experimental Settings

- Implemented inside the Java-based CAPE streaming system [RDZ05]
- Used the Network-based Generator of Moving Objects [BR02] to generate a set of moving objects and moving queries in Worcester County (Tiger Line files)
- Unless mentioned otherwise, the following parameters are used:
  - 10,000 moving objects and 10,000 moving queries
  - Clustering thresholds: ΘD = 100 (spatial units), ΘS = 10 (spatial units/time units), ΘN = 0 (no load shedding)
  - Grid: 100x100


Experimental Results

From Dissimilar to Similar Motion

- Higher skew factor means more dense objects and queries (i.e., more clusterable)
- Compare against regular grid-based execution (termed REGULAR)


Experimental Results

Incremental vs. Non-incremental:

[Figure: bar chart, "Offline Clustering Time". Y-axis: Time (in secs), 0 to 25. Bars show Non-Increm. Clustering Time and Join Time for Increm. and for Non-Inc. with iter = 1, 3, 5, 10]

- Join time slightly improves with non-incremental clustering
- But clustering wait time outweighs the advantage of a faster join


Experimental Results

Varying Grid Cell Size:

[Figure: REGULAR vs. SCUBA for Grid Cell Count 50x50, 75x75, 100x100, 125x125, 150x150. (a) Join Time, y-axis: Time (in secs); (b) Memory Consumption, y-axis: Memory (in MB)]

- Performance of regular grid-based execution improves with finer granularity of grid cells (but memory requirements increase as well)


Experimental Results

Cluster Maintenance:

- Cluster maintenance time is cheap relative to the join time


Conclusions

1. Designed SCUBA, a novel cluster-based algorithm for continuously evaluating spatio-temporal queries.

2. Scalability in SCUBA is achieved through shared cluster-based execution.

3. Implemented SCUBA in the CAPE streaming database.

4. Experimental results show that SCUBA outperforms a regular grid-based indexing scheme when executing on densely moving objects.

5. Clustering significantly improves performance when processing densely moving objects.

6. The overhead of maintaining clusters is very small.


Future Work

- Non-circular clusters
- Extend to other types of spatio-temporal queries
  - CKNN
  - Aggregate
- Hierarchical clustering (merge and break-down clusters)


Thank you.

Mass Pike in Boston Satellite Image, Google Maps 2006


Additional Slides…


References

[BR02] Brinkhoff, T. A Framework for Generating Network-Based Moving Objects. GeoInformatica, Vol. 6, No. 2, Kluwer, 2002, 153-180.

[SDK02] D. Stojanović and S. Djordjević-Kajan. Location-based Web services for tracking and visual route analysis of mobile objects. In: Proceedings of Yu INFO Conference, Kopaonik, 2002, CD ROM (Serbian).

[GL04] Gedik, B., Liu, L. MobiEyes: Distributed Processing of Continuously Moving Queries on Moving Objects in a Mobile System. EDBT, 2004.

[MXA04] Mokbel, M., Xiong, X., Aref, W. SINA: Scalable Incremental Processing of Continuous Queries in Spatio-temporal Databases. SIGMOD, 2004.

[PXK+02] Prabhakar, S., Xia, Y., Kalashnikov, D., Aref, W., Hambrusch, S. Query Indexing and Velocity Constrained Indexing: Scalable Techniques for Continuous Queries on Moving Objects. IEEE Transactions on Computers, 51(10): 1124-1140, 2002.

[XMA05] Xiong, X., Mokbel, M., Aref, W. SEA-CNN: Scalable Processing of Continuous K-Nearest Neighbor Queries in Spatio-temporal Databases. ICDE, 2005.

[WCL02] Ouri Wolfson, Hu Cao, Hai Lin, Goce Trajcevski, Fengli Zhang, Naphtali Rishe. Management of Dynamic Location Information in DOMINO. EDBT 2002: 769-771.

[BBH04] L. Becker, H. Blunck, K. Hinrichs, J. Vahrenhold. A Framework for Representing Moving Objects. Proceedings of the 14th International Conference on Database and Expert Systems Applications (DEXA 2004), Berlin, 2004, 854-863.

[AG04] V. T. Almeida and R. H. Guting. Indexing the trajectories of moving objects in networks. Technical Report 309, Fernuniversität Hagen, Fachbereich Informatik, 2004.

[PJT00] D. Pfoser, C. S. Jensen, and Y. Theodoridis. Novel approaches to the indexing of moving object trajectories. In Proceedings of the 26th International Conference on Very Large Databases, pages 395-406, 2000.

[TPS02] Yufei Tao, Dimitris Papadias, and Qiongmao Shen. Continuous Nearest Neighbor Search. In VLDB, 2002.

[TPS] Yufei Tao, Dimitris Papadias, and Qiongmao Shen. Continuous Nearest Neighbor Search. In VLDB, 2002.

[LPM02] Iosif Lazaridis, Kriengkrai Porkaew, and Sharad Mehrotra. Dynamic Queries over Mobile Objects. In EDBT, 2002.

[SR01] Zhexuan Song and Nick Roussopoulos. K-Nearest Neighbor Search for Moving Query Point. In SSTD, 2001.

[SJL00] Simonas Saltenis, Christian S. Jensen, Scott T. Leutenegger, and Mario A. Lopez. Indexing the Positions of Continuously Moving Objects. In SIGMOD, 2000.

[RDZ05] Elke A. Rundensteiner, Luping Ding, Yali Zhu, Timothy Sutherland, and Bradford Pielech. CAPE: A Constraint-Aware Adaptive Stream Processing Engine. Invited Book Chapter, in Stream Data Management (Advances in Database Systems Series), 2005, chapter 5, Springer Verlag, pp. 83-111.


Data Structures

- Objects Table
- Queries Table
- ClusterHome Table
- ClusterStorage Table
- ClusterGrid