On the Effect of Trajectory Compression in Spatio-temporal Querying
Elias Frentzos and Yannis Theodoridis
Data Management Group, University of Piraeus
http://isl.cs.unipi.gr/db
ADBIS, October 2, 2007


Talk Outline

Problem Statement
Background
  Compressing Trajectories
  Related work on Error Estimation
Estimating the Effect of Compression in ST Querying
Evaluating the Effect of Compression in ST Querying
Experimental Results
  On the performance
  On the quality
Conclusions and Future Work


Problem Statement (1)

A trajectory is the data obtained from a moving point object; it can be seen as a string in 3D space (two spatial dimensions plus time).

Trajectory compression is a promising field, since moving objects that record their position over time produce large amounts of frequently redundant data.

Existing work on trajectory compression is mainly driven by research advances in line generalization and time series compression.

Our interest is in lossy compression techniques, which eliminate repeated or unnecessary information under well-defined error bounds.


Problem Statement (2)

The objectives of trajectory compression are:

  To obtain a lasting reduction in data size;

  To obtain a data series that still allows various computations at acceptable (low) complexity;

  To obtain a data series with known, small margins of error, which are preferably parametrically adjustable.

Our goal is to calculate the mean error introduced in query results over compressed trajectory data, which is by no means a trivial task.

We argue that this mean error can be used to decide whether the compressed data are suitable for the user's needs.

We restrict our discussion to a special type of spatiotemporal query, the timeslice query.


Compressing Trajectories: SED

Methods that use line simplification algorithms to compress a trajectory are based on the so-called Synchronous Euclidean Distance (SED).

SED is the distance between the sampled point Pi(xi, yi, ti) under examination and the point Pi’(xi’, yi’, ti) on the line (Ps, Pe) where the moving object would lie at time instance ti if it were moving along that line.

[Figure: segment from Ps(xs, ys, ts) to Pe(xe, ye, te), the sampled point Pi(xi, yi, ti), its time-synchronized position Pi’(xi’, yi’, ti) on the segment, and the distance SED(P, P’) between them]
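The SED definition can be illustrated with a minimal Python sketch. It assumes points are `(x, y, t)` tuples and that the object moves at constant speed along the segment, so the synchronized position is found by linear interpolation in time; the function name `sed` is illustrative and not taken from the talk.

```python
import math

def sed(ps, pe, pi):
    """Synchronous Euclidean Distance of sample pi from the segment (ps, pe).

    Each point is an (x, y, t) tuple. Assumes constant-speed motion from
    ps to pe, i.e. linear interpolation in time.
    """
    xs, ys, ts = ps
    xe, ye, te = pe
    xi, yi, ti = pi
    if te == ts:
        # Degenerate segment: fall back to the distance from the start point.
        return math.hypot(xi - xs, yi - ys)
    r = (ti - ts) / (te - ts)      # fraction of the segment's time span elapsed
    xp = xs + r * (xe - xs)        # synchronized position Pi' on the segment
    yp = ys + r * (ye - ys)
    return math.hypot(xi - xp, yi - yp)

# A sample that lies on the line spatially but is "late" in time still has
# a non-zero SED, although its perpendicular distance would be 0.
print(sed((0, 0, 0), (10, 0, 10), (2, 0, 5)))
```

Unlike the perpendicular distance, this measure penalizes samples that are on the line but at the wrong moment, which is why it suits moving-object data.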


Compressing Trajectories: TD-TR algorithm

The TD-TR algorithm (Meratnia and de By, EDBT 2004) is a spatiotemporal extension of the well-known top-down Douglas-Peucker algorithm, which was originally used in cartography.

The algorithm preserves directional trends in the approximated line using a distance threshold.

TD-TR uses SED instead of the perpendicular distance.

It is a batch algorithm, since it requires the full line at its start.

[Figure: example line with labeled points A and B]
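The top-down idea can be sketched in Python as follows. This is a hedged illustration of a Douglas-Peucker style recursion with SED as the distance, not the authors' implementation; the names `sed` and `td_tr` and the `(x, y, t)` tuple layout are assumptions.

```python
import math

def sed(ps, pe, pi):
    """Synchronous Euclidean Distance of pi from segment (ps, pe); points are (x, y, t)."""
    xs, ys, ts = ps
    xe, ye, te = pe
    xi, yi, ti = pi
    if te == ts:
        return math.hypot(xi - xs, yi - ys)
    r = (ti - ts) / (te - ts)
    return math.hypot(xi - (xs + r * (xe - xs)), yi - (ys + r * (ye - ys)))

def td_tr(points, threshold):
    """Top-down compression: split at the sample with the largest SED
    until every dropped sample is within `threshold` of its segment."""
    if len(points) <= 2:
        return list(points)
    ps, pe = points[0], points[-1]
    worst_idx, worst = 0, -1.0
    for i in range(1, len(points) - 1):
        d = sed(ps, pe, points[i])
        if d > worst:
            worst_idx, worst = i, d
    if worst <= threshold:
        return [ps, pe]                      # one segment approximates the whole run
    left = td_tr(points[:worst_idx + 1], threshold)
    right = td_tr(points[worst_idx:], threshold)
    return left[:-1] + right                 # the split point appears only once

# Constant-speed straight motion collapses to its end points ...
print(td_tr([(0, 0, 0), (1, 0, 1), (2, 0, 2), (3, 0, 3)], 0.1))
# ... while a stop followed by fast motion keeps the intermediate sample.
print(td_tr([(0, 0, 0), (0, 0, 5), (10, 0, 10)], 1.0))
```

The recursion needs the whole trajectory up front, which matches the batch nature noted on the slide.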


Compressing Trajectories: OPW-TR algorithm

Opening window (OW) algorithms anchor the start point of a potential segment and then attempt to approximate the subsequent data series with increasingly longer segments.

Like TD-TR, the algorithm preserves directional trends in the approximated line using a distance threshold.

The OPW-TR algorithm (Meratnia and de By, EDBT 2004) also uses SED instead of the perpendicular distance.

It can be used as an online algorithm.

[Figure: example line with labeled points A, B and C]
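A minimal opening-window sketch in Python is given below. It assumes one particular break strategy: when the threshold is exceeded, the segment is closed at the sample just before the current window end. Other opening-window variants break at the sample with the largest error instead. The names `sed` and `opw_tr` are illustrative only.

```python
import math

def sed(ps, pe, pi):
    """Synchronous Euclidean Distance of pi from segment (ps, pe); points are (x, y, t)."""
    xs, ys, ts = ps
    xe, ye, te = pe
    xi, yi, ti = pi
    if te == ts:
        return math.hypot(xi - xs, yi - ys)
    r = (ti - ts) / (te - ts)
    return math.hypot(xi - (xs + r * (xe - xs)), yi - (ys + r * (ye - ys)))

def opw_tr(points, threshold):
    """Opening-window compression with SED (assumed break rule: close the
    segment at the sample preceding the one that caused the violation)."""
    if len(points) <= 2:
        return list(points)
    kept = [points[0]]
    anchor, end = 0, 2
    while end < len(points):
        violated = any(
            sed(points[anchor], points[end], points[i]) > threshold
            for i in range(anchor + 1, end)
        )
        if violated:
            anchor = end - 1             # close the segment one sample earlier
            kept.append(points[anchor])
            end = anchor + 2             # reopen the window from the new anchor
        else:
            end += 1                     # grow the window by one sample
    kept.append(points[-1])
    return kept

print(opw_tr([(0, 0, 0), (0, 0, 5), (10, 0, 10)], 1.0))
```

Because each decision only looks at samples between the anchor and the current window end, the loop can be driven by samples as they arrive, which is what makes the approach usable online.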


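Related work on Error Estimation

The only related work estimates the average value of the Synchronous Euclidean Distance (SED), also termed the synchronous error, between an original trajectory p and its approximation q:

$$AvgE(p,q) = \sum_{k=1}^{n-1} \int_{t_k}^{t_{k+1}} E_{p,q}(t)\,dt$$

With $E_{p,q}(t) = \sqrt{at^2+bt+c}$ on each interval $[t_k, t_{k+1}]$, the integral has the closed form

$$\int_{t_k}^{t_{k+1}} E_{p,q}(t)\,dt = \left[ \frac{2at+b}{4a}\sqrt{at^2+bt+c} - \frac{b^2-4ac}{8a\sqrt{a}}\,\operatorname{arcsinh}\frac{2at+b}{\sqrt{4ac-b^2}} \right]_{t_k}^{t_{k+1}}$$

There is no obvious way to use this measure to determine the error introduced in query results.

[Figure: original trajectory p and its approximation q between t1 and tn, drawn over the x and t axes]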
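As a sanity check of the closed-form antiderivative of a square-rooted quadratic, the following Python sketch evaluates it and compares it with a plain midpoint-rule approximation. It assumes a > 0 and 4ac - b^2 > 0 (the two objects are never at the same place on the interval); the function names are illustrative.

```python
import math

def sqrt_quadratic_integral(a, b, c, t0, t1):
    """Integral of sqrt(a*t^2 + b*t + c) over [t0, t1].

    Assumes a > 0 and 4ac - b^2 > 0, so the arcsinh form applies.
    """
    disc = 4 * a * c - b * b
    if a <= 0 or disc <= 0:
        raise ValueError("closed form used here requires a > 0 and 4ac - b^2 > 0")

    def antiderivative(t):
        root = math.sqrt(a * t * t + b * t + c)
        return ((2 * a * t + b) / (4 * a)) * root + (
            disc / (8 * a * math.sqrt(a))
        ) * math.asinh((2 * a * t + b) / math.sqrt(disc))

    return antiderivative(t1) - antiderivative(t0)

def midpoint_integral(f, t0, t1, steps=10000):
    """Numerical reference value via the midpoint rule."""
    h = (t1 - t0) / steps
    return h * sum(f(t0 + (k + 0.5) * h) for k in range(steps))

closed = sqrt_quadratic_integral(1.0, 0.0, 1.0, 0.0, 1.0)
numeric = midpoint_integral(lambda t: math.sqrt(t * t + 1.0), 0.0, 1.0)
print(closed, numeric)
```

The term `disc / (8 * a * sqrt(a))` is the same as `-(b^2 - 4ac) / (8a*sqrt(a))` in the slide's notation.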


Estimating the Effect of Compression in ST Querying: Preliminaries

Our goal is to provide closed-form formulas that estimate the number of false hits introduced in query results over compressed trajectory datasets.

Among the query types executed against trajectory datasets, we focus on a special type of range query, the so-called timeslice query.

[Figure: trajectories 1 to 4 in (x, y, t) space with query windows Q1 to Q5 and timestamps t1, t2, t3, t4, t6]

Two types of errors are introduced in query results when a timeslice query is executed over a compressed trajectory dataset:

False negatives are the trajectories that originally qualified for the query but whose compressed counterparts were not retrieved.

False positives are the compressed trajectories retrieved by the query while their original counterparts do not qualify for it.

[Figure: timeslice query window Q1 and trajectory 1 in the (x, t) plane]
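The two error types can be made concrete with a small Python sketch that evaluates one timeslice query against original and compressed trajectories and counts the disagreements. It assumes trajectories are time-sorted lists of `(x, y, t)` tuples, positions between samples are linearly interpolated, and a window is given as `(xmin, ymin, xmax, ymax)`; all names are hypothetical.

```python
def position_at(traj, t):
    """Interpolated (x, y) of a trajectory at time t, or None outside its lifespan."""
    if t < traj[0][2] or t > traj[-1][2]:
        return None
    for (x0, y0, t0), (x1, y1, t1) in zip(traj, traj[1:]):
        if t0 <= t <= t1:
            if t1 == t0:
                return (x0, y0)
            r = (t - t0) / (t1 - t0)
            return (x0 + r * (x1 - x0), y0 + r * (y1 - y0))
    return None

def in_window(pos, window):
    """True if the position lies inside the axis-aligned query window."""
    if pos is None:
        return False
    xmin, ymin, xmax, ymax = window
    return xmin <= pos[0] <= xmax and ymin <= pos[1] <= ymax

def false_hits(originals, compressed, window, t):
    """Return (false_positives, false_negatives) for one timeslice query."""
    fp = fn = 0
    for orig, comp in zip(originals, compressed):
        hit_orig = in_window(position_at(orig, t), window)
        hit_comp = in_window(position_at(comp, t), window)
        if hit_comp and not hit_orig:
            fp += 1          # retrieved only because of compression
        elif hit_orig and not hit_comp:
            fn += 1          # lost because of compression
    return fp, fn

orig = [(0, 0, 0), (0, 0, 5), (10, 0, 10)]
comp = [(0, 0, 0), (10, 0, 10)]            # middle sample dropped
print(false_hits([orig], [comp], (4, -1, 6, 1), 5))
print(false_hits([orig], [comp], (-1, -1, 1, 1), 5))
```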


Estimating the Effect of Compression in ST Querying: Analysis (1)

We first calculate AvgPi,P / AvgPi,N, the average probability that a single compressed trajectory is retrieved as a false positive / negative, over all possible timeslice query windows with sides a × b.

We then sum these average probabilities over all dataset trajectories to produce the global average:

$$E_P(R_{a\times b}) = \sum_{i=1}^{n} AvgP_{i,P}(R_{a\times b}), \qquad E_N(R_{a\times b}) = \sum_{i=1}^{n} AvgP_{i,N}(R_{a\times b})$$

The error introduced in the position of a trajectory can be calculated as a function of time; between consecutive timestamps t_{i,k} and t_{i,k+1}:

$$\delta x_i(t) = \delta x_{i,k} + \frac{\delta x_{i,k+1} - \delta x_{i,k}}{t_{i,k+1} - t_{i,k}}\,(t - t_{i,k})$$

$$\delta y_i(t) = \delta y_{i,k} + \frac{\delta y_{i,k+1} - \delta y_{i,k}}{t_{i,k+1} - t_{i,k}}\,(t - t_{i,k})$$

[Figure: query window Wj with sides a and b at timeslice t = tj, showing the positions p2,j and p’2,j and the displacements δx1(tj), δy1(tj), δx2(tj), δy2(tj)]


Estimating the Effect of Compression in ST Querying: Analysis (2)

We calculate the average probability that a compressed trajectory Ti is retrieved as a false positive / negative by a timeslice query window at timestamp tj.

The set of timeslice query windows that may retrieve a compressed trajectory as a false positive / negative at timestamp tj can be determined geometrically; its area is

$$A_{i,j} = a \cdot b - (a - |\delta x_{i,j}|)\,(b - |\delta y_{i,j}|)$$

We distinguish among 4 cases, regarding the signs of the δx and δy values.

Finally, by integrating the area Ai,j over all the timestamps inside the unit space, we obtain AvgPi,P / AvgPi,N.

[Figure: the area Ai,j within the space W of query windows over [0,1]×[0,1] at timestamp tj, for the case δxi,j < 0, δyi,j > 0]
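The geometric argument can be checked numerically with the sketch below. It assumes a window is identified by its center, that a window of sides a × b covers a point when the center lies within a/2 and b/2 of it, and it reads the area as a·b minus the overlap of the two a × b rectangles of centers around the original and the compressed position. The clamping with `max(0, ...)` for displacements larger than the window is an added assumption, and the function names are hypothetical.

```python
def false_hit_area(a, b, dx, dy):
    """Area of window centers that cover the compressed position but not the
    original one, for a displacement (dx, dy) between the two positions."""
    overlap = max(0.0, a - abs(dx)) * max(0.0, b - abs(dy))
    return a * b - overlap

def grid_estimate(a, b, dx, dy, n=200):
    """Brute-force check: the original is at (0, 0), the compressed position at
    (dx, dy). Sample centers covering the compressed position and count those
    that do not also cover the original."""
    x_lo, y_lo = dx - a / 2, dy - b / 2
    misses = 0
    for ix in range(n):
        cx = x_lo + (ix + 0.5) * a / n
        for iy in range(n):
            cy = y_lo + (iy + 0.5) * b / n
            covers_original = abs(cx) <= a / 2 and abs(cy) <= b / 2
            if not covers_original:
                misses += 1
    return misses * (a / n) * (b / n)

print(false_hit_area(0.2, 0.1, 0.05, -0.02))
print(grid_estimate(0.2, 0.1, 0.05, -0.02))
```

By symmetry the same area applies to windows that cover the original but not the compressed position, and the result depends only on the absolute values of the displacements.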


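Estimating the Effect of Compression in ST Querying: Analysis (3)

Summing up the average probabilities of all trajectories and performing the necessary calculations, we obtain:

$$E_N(R_{a\times b}) = E_P(R_{a\times b}) = \frac{1}{(1+a)(1+b)} \sum_{i=1}^{n} \sum_{k=1}^{m_i-1} (t_{i,k+1} - t_{i,k}) \left( \frac{b\,(|\delta x_{i,k}| + |\delta x_{i,k+1}|)}{2} + \frac{a\,(|\delta y_{i,k}| + |\delta y_{i,k+1}|)}{2} - \frac{e_{i,k}}{6} \right)$$

where

$$e_{i,k} = 2\,|\delta x_{i,k}|\,|\delta y_{i,k}| + 2\,|\delta x_{i,k+1}|\,|\delta y_{i,k+1}| + |\delta x_{i,k}|\,|\delta y_{i,k+1}| + |\delta x_{i,k+1}|\,|\delta y_{i,k}|$$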
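A direct evaluation of the summation can be sketched in Python as below. The sketch assumes each trajectory is supplied as a time-sorted list of `(t, dx, dy)` samples with time normalized to [0, 1], where `dx` and `dy` are the displacements between the original and the compressed position at that timestamp, and that displacements stay below the window sides. The function name and the input layout are hypothetical.

```python
def expected_false_hits(trajectories, a, b):
    """Expected number of false positives (equally, false negatives) for
    timeslice windows of sides a x b, following the summation over
    consecutive timestamps of every trajectory."""
    total = 0.0
    for samples in trajectories:
        for (t0, dx0, dy0), (t1, dx1, dy1) in zip(samples, samples[1:]):
            x0, y0, x1, y1 = abs(dx0), abs(dy0), abs(dx1), abs(dy1)
            # e_{i,k}: six times the integral of |dx|*|dy| under linear interpolation
            e = 2 * x0 * y0 + 2 * x1 * y1 + x0 * y1 + x1 * y0
            total += (t1 - t0) * (b * (x0 + x1) / 2 + a * (y0 + y1) / 2 - e / 6)
    return total / ((1 + a) * (1 + b))

# One trajectory whose displacement peaks halfway through its lifespan.
deltas = [(0.0, 0.0, 0.0), (0.5, 0.02, 0.01), (1.0, 0.0, 0.0)]
print(expected_false_hits([deltas], a=0.1, b=0.1))
```

The bracketed term is the time average of b·|δx| + a·|δy| − |δx|·|δy| over one segment when the absolute displacements are interpolated linearly between its end points.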


Evaluating the Effect of Compression in ST Querying

Evaluating this formula is a costly operation, O(nm), since it requires processing the entire original dataset along with its compressed counterpart:

$$E_N(R_{a\times b}) = E_P(R_{a\times b}) = \frac{1}{(1+a)(1+b)} \sum_{i=1}^{n} \sum_{k=1}^{m_i-1} (t_{i,k+1} - t_{i,k}) \left( \frac{b\,(|\delta x_{i,k}| + |\delta x_{i,k+1}|)}{2} + \frac{a\,(|\delta y_{i,k}| + |\delta y_{i,k+1}|)}{2} - \frac{e_{i,k}}{6} \right)$$

with e_{i,k} as defined in Analysis (3).

However, any compression algorithm that evaluates SED also needs to calculate δxi,k and δyi,k at every timestamp:

$$SED_i(t) = \sqrt{\delta x_i(t)^2 + \delta y_i(t)^2}$$

As a consequence, the evaluation of the average error in the query results can be integrated into the compression algorithm, introducing only a small overhead on its execution.
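The idea of reusing the displacements that SED already requires can be sketched as a single pass over the original samples. This is an illustration under stated assumptions, not the authors' code: points are `(x, y, t)` tuples with strictly increasing `t` normalized to [0, 1], `kept_idx` lists the indices of the samples retained by the compressor (including the first and last), and displacements are small relative to the window sides a and b. The name `compress_stats` is hypothetical.

```python
import math

def compress_stats(points, kept_idx, a, b):
    """Return (max_sed, expected_false_hits) for one trajectory, computing the
    displacements dx, dy once and using them for both quantities."""
    max_sed, total = 0.0, 0.0
    for s, e in zip(kept_idx, kept_idx[1:]):
        xs, ys, ts = points[s]
        xe, ye, te = points[e]
        prev_t, prev_dx, prev_dy = ts, 0.0, 0.0   # no displacement at a retained sample
        for i in range(s + 1, e + 1):
            xi, yi, ti = points[i]
            r = (ti - ts) / (te - ts)
            dx = abs(xi - (xs + r * (xe - xs)))
            dy = abs(yi - (ys + r * (ye - ys)))
            # The SED the compressor evaluates anyway.
            max_sed = max(max_sed, math.hypot(dx, dy))
            # Extra work for the error model: a few multiplications per sample.
            e_k = 2 * prev_dx * prev_dy + 2 * dx * dy + prev_dx * dy + dx * prev_dy
            total += (ti - prev_t) * (
                b * (prev_dx + dx) / 2 + a * (prev_dy + dy) / 2 - e_k / 6
            )
            prev_t, prev_dx, prev_dy = ti, dx, dy
    return max_sed, total / ((1 + a) * (1 + b))

points = [(0.0, 0.0, 0.0), (0.45, 0.0, 0.5), (1.0, 0.0, 1.0)]
print(compress_stats(points, [0, 2], a=0.1, b=0.1))
```

The model term adds only constant work per sample on top of the SED computation, which is consistent with the small overhead claimed on the slide.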


Experimental Study: Settings

Datasets
  One real trajectory dataset of a fleet of trucks (273 trajectories, 112K entries)
  A synthetic dataset of 2000 trajectories generated using a network-based data generator and the San Joaquin road network

Implementation
  We implemented the TD-TR algorithm and compressed the real and synthetic datasets, varying its threshold

Experiments
  Average overhead introduced in the TD-TR algorithm
  Average number of false positives and false negatives in 10000 randomly distributed timeslice queries


Experimental Study: On the performance

Scaling the value of the TD-TR threshold:

The algorithm’s execution time decreases as the value of the TD-TR threshold increases.

The overhead introduced in the algorithm’s execution is typically small (below 7%).

In absolute terms, the overhead introduced never exceeds 0.2 milliseconds per trajectory.

[Figures (Trucks dataset, Synthetic dataset): execution time (msec) against the TD-TR threshold (0.001 to 0.02), with model calculations included and excluded]


Experimental Study: On the quality (1)

Scaling the value of the TD-TR threshold:

The average number of false hits (negatives and positives) grows linearly with the value of the TD-TR compression threshold.

The average error in the estimation for the synthetic dataset is around 6%, varying between 0.2% and 14%.

In the trucks dataset the average error rises to around 10.6%, mainly due to the error introduced at small values of the TD-TR threshold.

[Figures (Trucks dataset, Synthetic dataset): average false hits against the TD-TR threshold (0.001 to 0.02); series: False Negatives, False Positives, Estimation]


Experimental Study: On the quality (2)

Scaling the query size:

The average number of false hits (negatives and positives) is sub-linear in the size of the query.

The average error in the estimation for the synthetic dataset is around 2.9%, varying between 0.2% and 8.7%.

In the trucks dataset the average error rises to around 7.5%.

[Figures (Trucks dataset, Synthetic dataset): average false hits against the query size (a = b, 0.05 to 0.3); series: False Negatives, False Positives, Estimation]


Summary and Future Work

We provided a closed-form formula for the average number of false negatives and false positives, covering the case of uniformly distributed query windows and arbitrarily distributed trajectory data.

Through an experimental study we demonstrated the efficiency of the proposed model:

  We illustrated the applicability of our model under real-life requirements: estimating the model parameters introduces only a small overhead in the trajectory compression algorithm.

  We presented the accuracy of our estimations, with an average error of around 6%.

Future work:

  Extension of our model to nearest neighbor and general range queries

  Applicability of our model in the case of spatiotemporal warehouses


Acknowledgements

Research partially supported by the GEOPKDD (“Geographic Privacy-aware Knowledge Discovery and Delivery”) project, funded by the European Community under contract FP6-014915.


Thank you!
