Time-Decaying Sketches for Sensor Data Aggregation
Graham Cormode (AT&T Labs, Research); Srikanta Tirthapura and Bojian Xu (Dept. of Electrical and Computer Engineering, Iowa State University)

Page 1: Time-Decaying Sketches for Sensor Data Aggregation

Time-Decaying Sketches for Sensor Data Aggregation

Graham Cormode
AT&T Labs, Research

Srikanta Tirthapura
Dept. of Electrical and Computer Engineering, Iowa State University

Bojian Xu
Dept. of Electrical and Computer Engineering, Iowa State University

Page 2: Time-Decaying Sketches for Sensor Data Aggregation

Mean of the Temperatures in the Last 30 Minutes

[Figure: sensor nodes in a network, each holding timestamped temperature readings such as 76F at 11:45, 73F at 11:40, 79F at 11:30, 70F at 11:22 and 76F at 11:15.]

Page 3: Time-Decaying Sketches for Sensor Data Aggregation

Sketch

[Figure: the same timestamped temperature readings, with each sensor's readings summarized by a sketch.]

Page 4: Time-Decaying Sketches for Sensor Data Aggregation

Sketch Merging

Answer

Page 5: Time-Decaying Sketches for Sensor Data Aggregation

General Time Decay

• General decay function: f(x), where x is the age of an element

• Time-decayed value at time c of an element with value v and timestamp t is: v · f(c − t)
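As a rough illustration of the decay model, the sketch below computes a decayed value as v · f(c − t). The three decay functions (sliding window, exponential, polynomial) are common textbook choices assumed here for illustration; they and all function names are hypothetical and not taken from the slides.

```python
import math

def sliding_window(width):
    """Assumed example: weight 1 while age < width, 0 afterwards."""
    return lambda age: 1 if age < width else 0

def exponential(rate):
    """Assumed example: weight exp(-rate * age)."""
    return lambda age: math.exp(-rate * age)

def polynomial(power):
    """Assumed example: weight (age + 1) ** -power."""
    return lambda age: (age + 1) ** (-power)

def decayed_value(v, t, c, f):
    """Decayed value at time c of an element with value v and timestamp t."""
    age = c - t
    if age < 0:
        raise ValueError("element is timestamped after the current time")
    return v * f(age)

# A reading of 76 taken at minute 15, queried at minute 40 with a
# 30-minute window, still counts in full.
print(decayed_value(76, 15, 40, sliding_window(30)))
```

A sliding window corresponds to the "last 30 minutes" query on the opening slide; the other two functions weight older readings less without dropping them abruptly.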

Page 6: Time-Decaying Sketches for Sensor Data Aggregation

Formal Model of the Data (on One Sensor)

• Data stream: e0=(v0,t0,id0), e1=(v1,t1,id1), …
  – v: value
  – t: timestamp of creation
  – id: a unique id of the observation

• User-defined time decay function: f()

• Asynchronous arrival: ti > tj is possible even when i < j

• Duplicates: idi = idj is possible
  – Assume: if idi = idj, then vi = vj and ti = tj

Page 7: Time-Decaying Sketches for Sensor Data Aggregation

Contribution

First mergeable sketch that combines all of the following:

• Space logarithmic in the universe size
• Guaranteed accuracy
• Data aggregation under any multi-path routing protocol

Properties: any time decay model, asynchronous arrival, duplicate insensitive

Aggregates: sum, quantile, frequent elements

Page 8: Time-Decaying Sketches for Sensor Data Aggregation

Related Work

Columns: any time decay model | asynchronous arrival | duplicate insensitive | sum | quantile | frequent elements

1: √ √
2: √ √
3: √ √
4: √ √
Our work: √ √ √ √ √ √

1. S. Nath, P. B. Gibbons, S. Seshan and Z. R. Anderson, “Synopsis diffusion for robust aggregation in sensor networks”, SenSys 2004

2. J. Considine, F. Li, G. Kollios and J. Byers, “Approximate Aggregation Techniques for Sensor Databases”, ICDE 2004

3. E. Cohen and M. Strauss, “Maintaining time-decaying stream aggregates”, PODS 2003; Journal of Algorithms 2006

4. S. Tirthapura, B. Xu and C. Busch, “Sketching Asynchronous Streams Over Sliding Windows”, PODC 2006

Page 9: Time-Decaying Sketches for Sensor Data Aggregation

Outline

• Problem: Time decayed sum of distinct elements over an asynchronous stream.

• Focus on the integral decay model: the decayed value is always an integer

Page 10: Time-Decaying Sketches for Sensor Data Aggregation

Estimate of the Sum (on One Sensor)

• Given:

– Stream: R = (v0,t0,id0),…, (vn,tn,idn), …

– User defined decay function: f()

• Maintain:

– c: current time
– D: set of distinct elements in R
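For reference, the quantity being estimated can be computed exactly if one is willing to store every distinct element. The sketch below is such a linear-space baseline, assuming the decayed sum is the sum of v · f(c − t) over the distinct elements D; the function name is hypothetical and the stream reuses the (v, t, id) triples from the worked example later in the deck.

```python
def exact_decayed_sum(stream, c, f):
    """Exact decayed sum over distinct ids; uses space linear in |D|."""
    distinct = {}
    for v, t, obs_id in stream:
        # Duplicates carry the same (v, t), so overwriting is harmless,
        # and arrival order does not matter.
        distinct[obs_id] = (v, t)
    return sum(v * f(c - t) for v, t in distinct.values() if t <= c)

stream = [
    (22, 16, 6),
    (32, 17, 9),
    (7, 16, 11),
    (21, 18, 8),
    (32, 17, 9),   # duplicate of the second observation
]
print(exact_decayed_sum(stream, 20, lambda age: 1))
```

This baseline is what the sketch approximates in much less space; it is useful mainly for checking an implementation on small inputs.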

Page 11: Time-Decaying Sketches for Sensor Data Aggregation

Estimate of the Sum (cont’d)

• Linear space lower bound on duplicate-insensitive sum (Alon, Matias and Szegedy, STOC 1996) applies to:
  – Deterministic approximate algorithms
  – Randomized algorithms giving an exact result

• Goal: Continuously maintain an (ε, δ)-estimate of the time-decayed sum
  – User inputs: ε, δ
  – D: set of distinct elements in R

An (ε, δ)-estimate for X is a random variable Y such that Pr[|Y − X| > εX] < δ.

Page 12: Time-Decaying Sketches for Sensor Data Aggregation

Algorithm for Sum (High Level Picture)

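• Random sampling: treat each value as that many integers (e.g. v1=4, v2=8) and select each integer with SampleRate = p

• Count the number of selected integers

• Multiply by 1/p

[Figure: Sum = v1 + v2 is estimated from the Count of selected (√) integers.]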
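The high-level picture can be sketched in a few lines: expand each value into unit integers, keep each unit with probability p, then scale the count by 1/p. This is a minimal illustration using independent coin flips, so it is not yet duplicate-insensitive; the function name and the seeded generator are assumptions made for the example.

```python
import random

def sampled_sum(values, p, rng):
    """Estimate sum(values) by sampling unit integers at rate p."""
    selected = 0
    for v in values:
        # A value v contributes v unit integers; each is kept with prob. p.
        selected += sum(1 for _ in range(v) if rng.random() < p)
    return selected / p

rng = random.Random(0)
print(sampled_sum([4, 8], 1.0, rng))        # p = 1 keeps every unit
print(sampled_sum([4, 8] * 100, 0.25, rng)) # noisy estimate of 1200
```

With p = 1 the estimate is exact; smaller p trades accuracy for a smaller sample.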

Page 13: Time-Decaying Sketches for Sensor Data Aggregation

Duplicate Detection

• Random sampling is driven by a hash function, so Copy 1 and Copy 2 of the same item x get the same selection decision (√)
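One way to make the sampling duplicate-insensitive is to replace coin flips with a hash of the unit's identity, so every copy of an observation makes the same choices. The sketch below assumes SHA-256 as a stand-in for the hash function (the slides do not name one) and hypothetical helper names.

```python
import hashlib

def unit_hash(obs_id, k):
    """64-bit hash of the k-th unit integer of observation obs_id."""
    data = f"{obs_id}:{k}".encode()
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

def selected_units(v, obs_id, p):
    """Units of a value-v observation that the hash selects at rate p."""
    threshold = int(p * 2**64)
    return {(obs_id, k) for k in range(v) if unit_hash(obs_id, k) < threshold}

def duplicate_insensitive_sum(stream, p):
    """Estimate the sum of values over distinct ids from (v, id) pairs."""
    sample = set()
    for v, obs_id in stream:
        sample |= selected_units(v, obs_id, p)  # a copy adds nothing new
    return len(sample) / p

stream = [(4, "a"), (8, "b")]
with_copies = stream + [(8, "b"), (4, "a")]
print(duplicate_insensitive_sum(stream, 0.5),
      duplicate_insensitive_sum(with_copies, 0.5))
```

Both calls print the same estimate, because a repeated observation hashes to exactly the units already in the sample.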

Page 14: Time-Decaying Sketches for Sensor Data Aggregation

Intuition - I

Sample

sample rate

By Chebyshev inequality, for an ε-approximation of the count with constant probability:

(v,t,id)
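The formula on this slide did not survive extraction. As a hedged reconstruction of the standard argument, assume N integers, each selected with probability p (pairwise independently), let X be the number selected and Y = X/p the estimate. These symbols are assumptions for this derivation, not notation from the slide.

```latex
% Assumed setup: N integers, sample rate p, X = number selected, Y = X / p.
\begin{align*}
\mathbb{E}[Y] &= \frac{\mathbb{E}[X]}{p} = \frac{pN}{p} = N, \\
\operatorname{Var}[Y] &= \frac{\operatorname{Var}[X]}{p^{2}}
   = \frac{N p (1-p)}{p^{2}} \le \frac{N}{p}, \\
\Pr\bigl[\,|Y - N| \ge \varepsilon N\,\bigr]
   &\le \frac{\operatorname{Var}[Y]}{\varepsilon^{2} N^{2}}
   \le \frac{1}{\varepsilon^{2}\, pN}.
\end{align*}
```

Under these assumptions, an expected sample size of pN ≥ 3/ε² keeps the failure probability at or below 1/3, which is why the right sample rate depends on the unknown count N.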

Page 15: Time-Decaying Sketches for Sensor Data Aggregation

Intuition - II

• t

• t+

• Sample rate ?

Page 16: Time-Decaying Sketches for Sensor Data Aggregation

Maintain Multiple Samples

SIZE ??

SampleRate pj at level j:
p0 = 1
p1 = 1/2
p2 = 1/4
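Since the right sample rate depends on a size that is not known in advance, one option is to keep a sample at every rate pj = 2^-j and use the first level that fits in memory. The sketch below illustrates this for counting distinct keys; the SHA-256 hash, the capacity parameter and all names are assumptions made for the example.

```python
import hashlib

HASH_BITS = 64

def h64(key):
    """64-bit hash of a key (SHA-256 is an assumed stand-in)."""
    digest = hashlib.sha256(repr(key).encode()).digest()
    return int.from_bytes(digest[:8], "big")

def selected(key, level):
    """True if key is in the level-`level` sample (rate 2**-level)."""
    return h64(key) < (1 << (HASH_BITS - level))

def estimate_distinct(keys, capacity, max_level=20):
    """Return (level used, estimated number of distinct keys)."""
    samples = [set() for _ in range(max_level + 1)]
    overflowed = [False] * (max_level + 1)
    for key in keys:
        for j in range(max_level + 1):
            if not selected(key, j):
                break              # samples are nested across levels
            if overflowed[j]:
                continue
            samples[j].add(key)
            if len(samples[j]) > capacity:
                overflowed[j] = True   # too dense: give up on this level
                samples[j].clear()
    for j in range(max_level + 1):
        if not overflowed[j]:
            return j, len(samples[j]) * 2 ** j
    raise RuntimeError("every level overflowed; increase max_level")

print(estimate_distinct(range(5000), capacity=64))
```

Each level holds at most `capacity` keys, so total space grows with the number of levels rather than with the stream length.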

Page 17: Time-Decaying Sketches for Sensor Data Aggregation

Faster Sampling

• RangeSample (Pavan & Tirthapura, SICOMP 2007)
  – Efficiently compute the number of selected integers

[Figure: the selected (√) integers of a range of unknown SIZE at each level, with SampleRate pj: p0 = 1, p1 = 1/2, p2 = 1/4.]

Page 18: Time-Decaying Sketches for Sensor Data Aggregation

Expiry Time

• Element e = (v, t, id)

• [Figure: the integers of e that are selected (√) at time t and at a later time; the expiry time marks when none remain selected.]

• Find the expiry time by binary search over [t, tmax] using RangeSample
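The binary search can be sketched as follows, under stated assumptions: the expiry time at a level is taken to be the first time at which none of the element's remaining unit integers is selected at that level, the decay function returns integers and never increases with age, and a brute-force count stands in for RangeSample. The hash, the `linear_decay` function and all names are hypothetical.

```python
import hashlib

HASH_BITS = 64

def selected(obs_id, k, level):
    """True if unit k of observation obs_id is sampled at rate 2**-level."""
    data = f"{obs_id}:{k}".encode()
    h = int.from_bytes(hashlib.sha256(data).digest()[:8], "big")
    return h < (1 << (HASH_BITS - level))

def alive(v, t, obs_id, level, f, c):
    """Brute-force stand-in for RangeSample: any selected unit left at time c?"""
    weight = v * f(c - t)          # integral decayed value at time c
    return any(selected(obs_id, k, level) for k in range(weight))

def expiry_time(v, t, obs_id, level, f, t_max):
    """Smallest c in [t, t_max + 1] at which the element is no longer alive."""
    lo, hi = t, t_max + 1
    while lo < hi:
        mid = (lo + hi) // 2
        if alive(v, t, obs_id, level, f, mid):
            lo = mid + 1           # still alive: expiry is later
        else:
            hi = mid               # already expired: expiry is here or earlier
    return lo

def linear_decay(age):
    """Hypothetical integral decay: weight 5, 4, ..., 1, then 0."""
    return max(0, 5 - age)

print([expiry_time(2, 10, "e", lvl, linear_decay, 30) for lvl in range(4)])
```

The search is valid because the decayed weight only shrinks over time, so once an element has no selected units it never regains any. Higher levels sample fewer units, so expiry times can only get earlier as the level increases.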

Page 19: Time-Decaying Sketches for Sensor Data Aggregation

Sketch Structure

• The sketch keeps one sample per level: Level 0 (p=1), Level 1 (p=1/2), Level 2 (p=1/4), then 1/8, …

• Level j also stores tj (t0, t1, t2, …): the largest expiry time of all the elements discarded from that sample

Page 20: Time-Decaying Sketches for Sensor Data Aggregation

Element  (v, t, id)    Current time  Expiry0  Expiry1  Expiry2
e1       (22, 16, 6)   17            22       19       17

Sketch, as (element, expiry time) pairs:
Level 0 (p=1):   (e1,22)
Level 1 (p=1/2): (e1,19)
Level 2 (p=1/4): (empty)

Page 21: Time-Decaying Sketches for Sensor Data Aggregation

Element  (v, t, id)    Current time  Expiry0  Expiry1  Expiry2
e1       (22, 16, 6)   17            22       19       17
e2       (32, 17, 9)   18            23       21       18
e3       (7, 16, 11)   18            21       16       16

Sketch, as (element, expiry time) pairs:
Level 0 (p=1):   (e3,21)(e2,23)(e1,22)
Level 1 (p=1/2): (e2,21)(e1,19)
Level 2 (p=1/4): (empty)

Page 22: Time-Decaying Sketches for Sensor Data Aggregation

Element  (v, t, id)    Current time  Expiry0  Expiry1  Expiry2
e1       (22, 16, 6)   17            22       19       17
e2       (32, 17, 9)   18            23       21       18
e3       (7, 16, 11)   18            21       16       16
e4       (21, 18, 8)   20            23       21       20

Sketch, as (element, expiry time) pairs:
Level 0 (p=1):   (e4,23)(e2,23)(e1,22); (e3,21) is discarded
Level 1 (p=1/2): (e4,21)(e2,21)(e1,19)
Level 2 (p=1/4): (empty)

Discard the element with smallest expiry time
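The per-level bookkeeping in this example can be sketched as a small class. It assumes a capacity of 3 (matching the example), skips elements whose expiry time is not later than the current time, ignores repeated ids, and records the largest discarded expiry time. The class and method names are hypothetical; expiry times are taken as given rather than computed.

```python
import math

class LevelSample:
    """Bounded sample for one level: id -> expiry time."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = {}
        self.t = -math.inf   # largest expiry time discarded so far

    def insert(self, obs_id, expiry, now):
        if expiry <= now:
            return           # already expired at this level
        if obs_id in self.items:
            return           # duplicate: same id, same expiry
        self.items[obs_id] = expiry
        if len(self.items) > self.capacity:
            victim = min(self.items, key=self.items.get)
            self.t = max(self.t, self.items.pop(victim))

# Level 0 of the example: (id, Expiry0, current time) for e1..e4.
level0 = LevelSample(capacity=3)
for obs_id, expiry, now in [(6, 22, 17), (9, 23, 18), (11, 21, 18), (8, 23, 20)]:
    level0.insert(obs_id, expiry, now)
print(level0.items, level0.t)
```

After the fourth insertion the sample holds ids 6, 9 and 8, and the recorded discard time is 21, which is the expiry time of the evicted element e3.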

Page 23: Time-Decaying Sketches for Sensor Data Aggregation

Element  (v, t, id)    Current time  Expiry0  Expiry1  Expiry2
e1       (22, 16, 6)   17            22       19       17
e2       (32, 17, 9)   18            23       21       18
e3       (7, 16, 11)   18            21       16       16
e4       (21, 18, 8)   20            23       21       20

Sketch, as (element, expiry time) pairs:
Level 0 (p=1):   (e4,23)(e2,23)(e1,22), t0 = 21
Level 1 (p=1/2): (e4,21)(e2,21)(e1,19)
Level 2 (p=1/4): (empty)

Page 24: Time-Decaying Sketches for Sensor Data Aggregation

Element  (v, t, id)    Current time  Expiry0  Expiry1  Expiry2
e1       (22, 16, 6)   17            22       19       17
e2       (32, 17, 9)   18            23       21       18
e3       (7, 16, 11)   18            21       16       16
e4       (21, 18, 8)   20            23       21       20
e5       (32, 17, 9)   20            23       21       18    Duplicate

Sketch, as (element, expiry time) pairs (unchanged by the duplicate):
Level 0 (p=1):   (e4,23)(e2,23)(e1,22), t0 = 21
Level 1 (p=1/2): (e4,21)(e2,21)(e1,19)
Level 2 (p=1/4): (empty)

Page 25: Time-Decaying Sketches for Sensor Data Aggregation

Answer a Query for the Decayed Sum

Current time = 20

Level 0 (p=1):   (e4,23)(e2,23)(e1,22), t0 = 21
Level 1 (p=1/2): (e4,21)(e2,21)(e1,19)
Level 2 (p=1/4): (empty)

Level used to answer the query: Level 1, with elements e2 and e4
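The level choice in this example can be reproduced with a short sketch. It assumes a level is usable at time c only if it has not discarded any element that is still unexpired, that is tj ≤ c, and that the lowest usable level is preferred; this rule is inferred from the example rather than stated on the slide. The final scaling of the selected-integer count by 1/p is omitted, and all names are hypothetical.

```python
import math

def choose_level(levels, c):
    """Lowest level whose discarded elements have all expired by time c."""
    for j, level in enumerate(levels):
        if level["t"] <= c:
            return j
    raise LookupError("no level is complete at time c")

def live_elements(level, c):
    """Elements of a level whose expiry time is still in the future."""
    return sorted(e for e, expiry in level["items"].items() if expiry > c)

# State of the sketch from the worked example.
levels = [
    {"p": 1.0,  "t": 21,        "items": {"e4": 23, "e2": 23, "e1": 22}},
    {"p": 0.5,  "t": -math.inf, "items": {"e4": 21, "e2": 21, "e1": 19}},
    {"p": 0.25, "t": -math.inf, "items": {}},
]
j = choose_level(levels, 20)
print(j, live_elements(levels[j], 20))
```

At time 20, Level 0 is skipped because it discarded an element that expires at 21, so Level 1 answers the query using e2 and e4.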

Page 26: Time-Decaying Sketches for Sensor Data Aggregation

Over the Whole Sensor Network

Each sample keeps 3 distinct items with largest expiry time.

Sketch 1:                      (e3,13)(e2,9)(e1,6)
Sketch 2:                      (e3,13)(e5,10)(e4,6)
Result of merging sketch 1&2:  (e3,13)(e5,10)(e2,9)

The merge takes the union of the two samples at each level.
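The merge step can be sketched per level as a union followed by truncation to the items with the largest expiry times. The sketch assumes both sketches use the same hash function, so a shared id carries the same expiry time in both, and that the merged discard time is the maximum over both inputs and any newly dropped items. Function names are hypothetical.

```python
import math

def merge_level(a_items, a_t, b_items, b_t, capacity):
    """Merge two samples of one level; returns (items, discard time)."""
    union = dict(a_items)
    union.update(b_items)    # shared ids have identical expiry times
    ranked = sorted(union.items(), key=lambda kv: kv[1], reverse=True)
    kept = dict(ranked[:capacity])
    dropped = [expiry for _, expiry in ranked[capacity:]]
    return kept, max([a_t, b_t] + dropped)

def merge_sketches(a, b, capacity):
    """Merge level by level; each sketch is a list of (items, t) pairs."""
    return [merge_level(ai, at, bi, bt, capacity)
            for (ai, at), (bi, bt) in zip(a, b)]

sketch1 = [({"e3": 13, "e2": 9, "e1": 6}, -math.inf)]
sketch2 = [({"e3": 13, "e5": 10, "e4": 6}, -math.inf)]
print(merge_sketches(sketch1, sketch2, capacity=3))
```

Because the result depends only on the set of distinct items seen, merging is insensitive to the order and multiplicity with which sketches arrive, which is what makes multi-path routing safe.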

Page 27: Time-Decaying Sketches for Sensor Data Aggregation

Algorithm Complexity

• Space complexity:

• Time complexity
  – Expected time for processing one item
  – Time for answering a query
  – Time for merging two sketches

Page 28: Time-Decaying Sketches for Sensor Data Aggregation

Conclusion

First sketch that combines all of the following:

• Space logarithmic in the universe size
• Guaranteed accuracy
• Data aggregation under any multi-path routing protocol

Properties: any time decay model, asynchronous arrival, duplicate insensitive

Aggregates: sum, quantile, frequent elements

Page 29: Time-Decaying Sketches for Sensor Data Aggregation

Ongoing and Future Work

• Implementation
  – Observed results better than theoretical predictions

• Better duplicate insensitive sketches for specific decay models?

• Other aggregates, such as Variance, clustering?

Page 30: Time-Decaying Sketches for Sensor Data Aggregation

THANKS