Probabilistic Similarity Search for Uncertain Time Series
Presented by CAO Chen
21st Feb, 2011


Outline
• Introduction
• Background
  • Time Series
  • Similarity Search
• Motivation & Contribution
• Uncertain Time Series Query
• Uncertainty Approximation
• Step-wise Refinement
• Evaluation
• Related Literature Review
• Q & A

CAO Chen, DB Group, CSE, HKUST, 21/2/2011

Background – Time Series


Background – Time Series (cont’d)
• Source of Time Series Data
  • Traffic measurements
  • Uncorrelated
  • Location tracking of moving objects
  • Measuring environmental parameters (e.g., temperature)
  • Correlated


Background – Similarity Search
• Similarity Search
  • Pattern Matching
  • Shape Matching


Background – Similarity Search (cont’d)
• Range Query
  • Returns all tuples that fall between a lower and an upper boundary
  • The number of results is not known in advance
  • Slower than top-k because there is no upper bound to prune with
• Sequence Matching
  • Whole matching: sequences of the same length
  • Subsequence matching
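For whole matching, a range query reduces to a distance test against every stored series. A minimal sketch, assuming the boundary is a distance threshold eps around the query and that similarity is measured with an L_p norm (the function names are illustrative and do not come from the slides):

```python
def lp_distance(x, y, p=2):
    """L_p distance between two equal-length sequences (whole matching)."""
    if len(x) != len(y):
        raise ValueError("whole matching requires sequences of the same length")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def range_query(query, database, eps, p=2):
    """Return every stored series whose distance to the query is at most eps.

    The size of the result is not known in advance: it depends on eps.
    """
    return [series for series in database if lp_distance(query, series, p) <= eps]
```

Because every series has to be tested against eps, nothing can be pruned by a running "k-th best" bound the way a top-k search can.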


Motivation & Contribution
• Uncertainty
  • Moving objects
  • Object identification
  • Sensor network monitoring


Motivation & Contribution (cont’d)
• Contribution
  • First to formalize the notion of uncertain time series
  • Two novel types of probabilistic range queries over uncertain time series
  • Pruning strategy based on an approximate representation of uncertainty
  • Explicit evaluation of the refinement (processing) time cost


Probabilistic Queries Over Uncertain TS
• Definition of Uncertain Time Series


Probabilistic Queries Over Uncertain TS (cont’d)
• Definition of Uncertain Lp-Distance
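The formulas on these definition slides are not preserved in the text. The sketch below assumes a sample-based model: an uncertain time series stores a list of sample observations for each time slot, and the uncertain Lp-distance between two such series is the collection of Lp-distances over all pairs of their materializations (one sample picked per slot). The names `materializations` and `uncertain_lp_distances` are hypothetical.

```python
from itertools import product


def lp(a, b, p=2):
    """Ordinary L_p distance between two certain, equal-length series."""
    return sum(abs(u - v) ** p for u, v in zip(a, b)) ** (1.0 / p)


def materializations(uts):
    """All certain series obtained by picking one sample per time slot.

    `uts` is a list of time slots; each slot is a non-empty list of samples.
    """
    return [list(choice) for choice in product(*uts)]


def uncertain_lp_distances(x, y, p=2):
    """Assumed uncertain distance: one L_p value per pair of materializations."""
    return [lp(a, b, p) for a in materializations(x) for b in materializations(y)]
```

Under this assumption the distance is no longer a single number but a collection of distance observations, which is what the probabilistic queries operate on.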


Probabilistic Queries Over Uncertain TS (cont’d)
• Definition of Probabilistic Range Queries
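The query definitions themselves are not preserved in this text. As a rough illustration only, the sketch below assumes that the probability of a series matching the query is the fraction of distance observations that are at most eps, that a bounded query (PBRQ) keeps series whose probability reaches a threshold tau, and that a ranked query (PRRQ) orders series with non-zero probability from most to least likely. All function and parameter names are hypothetical.

```python
from itertools import product


def lp(a, b, p=2):
    """Ordinary L_p distance between two certain, equal-length series."""
    return sum(abs(u - v) ** p for u, v in zip(a, b)) ** (1.0 / p)


def hit_probability(x, query, eps, p=2):
    """Fraction of materialization pairs whose distance is at most eps."""
    dists = [lp(a, b, p) for a in product(*x) for b in product(*query)]
    return sum(d <= eps for d in dists) / len(dists)


def pbrq(query, database, eps, tau):
    """Assumed bounded query: names of series with probability >= tau."""
    return [name for name, x in database.items()
            if hit_probability(x, query, eps) >= tau]


def prrq(query, database, eps):
    """Assumed ranked query: (probability, name) pairs, most likely first."""
    scored = [(hit_probability(x, query, eps), name) for name, x in database.items()]
    return sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
```

This is the naïve evaluation: every distance observation is computed before any decision is made.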


Challenge in Processing Range Queries with Uncertainty
• Naïve Solution
  • Computing all distance observations
  • CPU-bound vs. I/O-bound
  • Long time series and high sample rates (large n)
• Naïve Solution
  • Number of distance computations
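The cost formula from this slide is not preserved in the text. A back-of-the-envelope sketch, assuming both series are uncertain, each has `samples_per_slot` samples in every one of `length` time slots, and one sample is chosen independently per slot (these assumptions and the function name are not from the slides):

```python
def naive_distance_count(samples_per_slot, length):
    """Distance computations needed to compare two uncertain series naively.

    Each series has samples_per_slot ** length materializations, and the
    naive approach compares every materialization of one series with every
    materialization of the other.
    """
    per_series = samples_per_slot ** length
    return per_series * per_series
```

Under these assumptions, 5 samples per slot and a length of only 10 already give 5**20 distance computations, roughly 9.5 × 10^13, for a single pair of series. This is why the problem is CPU-bound rather than I/O-bound.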


Approximate Representation


Approximate Representation (cont’d)
• Two Levels of Approximate Representation
  • They differ in whether multiple (K) groups of sample observations exist in one time slot


• By K-means clustering
• Only one group at each time slot
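The two levels can be illustrated with intervals that enclose the samples of a time slot. The sketch below assumes minimal bounding intervals as the approximation, which the text does not state explicitly. For the K-group level it uses a simple largest-gap split as a stand-in for K-means, not K-means itself. All names are hypothetical.

```python
def bounding_intervals(uts):
    """Level with one group per time slot: a single (min, max) interval per slot."""
    return [(min(samples), max(samples)) for samples in uts]


def grouped_intervals(samples, k):
    """Level with up to k groups in one time slot.

    Stand-in for K-means: sort the (non-empty) samples and cut at the
    k - 1 largest gaps between neighbouring values, then bound each group.
    """
    ordered = sorted(samples)
    gap_positions = sorted(
        range(1, len(ordered)),
        key=lambda i: ordered[i] - ordered[i - 1],
        reverse=True,
    )[: k - 1]
    cuts = [0] + sorted(gap_positions) + [len(ordered)]
    return [(ordered[start], ordered[end - 1]) for start, end in zip(cuts, cuts[1:])]
```

More groups give tighter intervals and so tighter distance bounds, at the price of more interval pairs to compare.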

Distance Approximations


Distance Approximations (cont’d)


Distance Approximations (cont’d)
• Lemma 1

• Lemma 2
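The lemma statements are not preserved in this text. One common way to bound distances from interval approximations is sketched below. Treating it as the content of the two lemmas is an assumption. Each series is given as one (low, high) interval per time slot, and the function names are hypothetical.

```python
def interval_gap_bounds(a, b):
    """Smallest and largest possible |u - v| for u in interval a, v in interval b."""
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    lower = max(0.0, a_lo - b_hi, b_lo - a_hi)  # 0 if the intervals overlap
    upper = max(a_hi - b_lo, b_hi - a_lo)
    return lower, upper


def lp_distance_bounds(x_intervals, y_intervals, p=2):
    """Lower and upper bound on every L_p distance between materializations.

    Expects two non-empty, equal-length lists of (low, high) intervals.
    """
    lows, highs = zip(*(interval_gap_bounds(a, b)
                        for a, b in zip(x_intervals, y_intervals)))
    lower = sum(l ** p for l in lows) ** (1.0 / p)
    upper = sum(h ** p for h in highs) ** (1.0 / p)
    return lower, upper
```

Every distance observation between the two series lies between the two returned values, so the bounds can be compared with the query range without computing any individual distance.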


Probabilistic Bounded Range Queries (PBRQ)


True Hit

True Drop
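The True Hit and True Drop labels suggest a filter step based on distance bounds. A minimal sketch, assuming lower and upper bounds on all distance observations are available and the probability threshold is greater than zero (the function name and the "candidate" label are illustrative):

```python
def classify(lower, upper, eps):
    """Decide a bounded range query from distance bounds alone, if possible."""
    if upper <= eps:
        # every distance observation is within eps: probability 1
        return "true hit"
    if lower > eps:
        # no distance observation is within eps: probability 0
        return "true drop"
    # the bounds straddle eps: the series needs refinement
    return "candidate"
```

Only the candidates have to go through the more expensive refinement step.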


Step-Wise Refinement
• When to refine?
  • Time series that could not be filtered or decided simply by comparing the interval between the lower and upper bounds
• Refinement Goal
  • To identify an uncertain time series as a true hit or a true drop
• Condition to increase the lower bound
• Increase in the number of qualified distances
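The idea of stopping as soon as a series is decided can be sketched with a simplified stand-in: distances are examined one at a time, and a lower and an upper bound on the hit probability are updated after each one. This is not the refinement procedure from the slides, which works on the approximations. It only illustrates how a growing count of qualified distances raises the lower bound. The names are hypothetical.

```python
def refine(distance_iter, total, eps, tau):
    """Examine distances lazily until the series is a true hit or a true drop.

    distance_iter yields the `total` (> 0) distance observations one by one.
    Returns the decision and how many distances were actually examined.
    """
    hits = 0
    seen = 0
    for d in distance_iter:
        seen += 1
        if d <= eps:
            hits += 1  # one more qualified distance raises the lower bound
        lower = hits / total                   # unseen distances all miss
        upper = (hits + total - seen) / total  # unseen distances all hit
        if lower >= tau:
            return "true hit", seen
        if upper < tau:
            return "true drop", seen
    return ("true hit" if hits / total >= tau else "true drop"), seen
```

For example, with four distances, eps = 2 and tau = 0.5, two qualifying distances at the start are enough to report a true hit without looking at the rest.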


Step-Wise Refinement (cont’d)

• Refinement heuristics


Evaluation
• Benchmark
  • UCI Time Series Data Mining Archive
  • CBF, GUN/POINT, CONTROL CHART, OSU LEAF
• Uncertainty
  • Generating samples uniformly distributed around the given exact values
• Evaluation
  • Overall Speed-Up
  • Refinement Speed-Up
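The uncertainty step can be sketched as follows. The number of samples per time slot, the half-width `radius` of the uniform interval and the fixed seed are assumptions made for illustration. The slides do not give the values used in the experiments.

```python
import random


def make_uncertain(series, samples_per_slot=5, radius=0.1, seed=0):
    """Turn an exact series into an uncertain one.

    For each exact value v, draw samples_per_slot samples uniformly from
    [v - radius, v + radius].
    """
    rng = random.Random(seed)  # seeded so the generated data is reproducible
    return [
        [rng.uniform(v - radius, v + radius) for _ in range(samples_per_slot)]
        for v in series
    ]
```

A larger radius or more samples per slot makes the bounds looser and the naïve evaluation more expensive, which is what the speed-up experiments vary.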


Evaluation (cont’d)
• Speed-up for Probabilistic Bounded Range Query (PBRQ)


Evaluation (cont’d)
• Speed-up for Probabilistic Rank Range Query (PRRQ)


Evaluation (cont’d)
• Speed-up w.r.t. scalability


Evaluation (cont’d)
• Refinement
  • S-S: using the proposed strategy
  • R-R: random processing for both steps
• Logarithm of the number of required calculations


Q & A


• Thank You
