Incremental Frequent Route Based Trajectory Prediction

Karlsruhe Institute of Technology
European Centre for Soft Computing
KTH – Royal Institute of Technology

Anja Bachmann

Christian Borgelt

Gyözö Gidofalvi

Outline

Introduction
Related work
IncCCFR
  Trajectory representation
  Stream processing model
  Incremental mining of Closed Contiguous Frequent Routes (CCFR)
  CCFR-based trajectory prediction
Empirical evaluations

2013-11-05 IWCTS 2013, Orlando, FL 2

Introduction

Congestion is a serious problem:
  Economic losses and quality-of-life degradation that result from increased and unpredictable travel times
  Increased carbon footprint that idling vehicles leave behind
  Increased number of traffic accidents that are direct results of the stress and fatigue of drivers who are stuck in congestion

Road network expansion is not a sustainable solution

Instead: monitor, understand, and control movement and congestion

Modern Traffic Prediction and Management System (TPMS)

Motivated by:
  Widespread adoption of online GPS-based on-board navigation systems and location-aware mobile devices
  Movement of an individual contains a high degree of regularity

Use vehicle movement data as follows:
  Vehicles periodically send their location (and speed) to the TPMS
  The TPMS extracts traffic / mobility patterns from the submitted information
  The TPMS uses traffic / mobility patterns + current / recent historical locations (and speeds) of the vehicles for:
    Short-term traffic prediction and management:
      Predict near-future locations of vehicles and near-future traffic conditions
      Inform the relevant vehicles in case of an (actual / predicted) event
      Suggest how and which vehicles to re-route in case of an event
    Long-term traffic and transport planning

Remaining Challenges

Sequential pattern based trajectory prediction is difficult to adapt to capture temporal and periodic variations.

Trajectory prediction systems model and provide knowledge about the movement of the objects at a fixed level of detail, while different applications (real-time management vs. long-term planning) need different levels of detail.

Predictions tend to be based on either historical or current information while both types of information are relevant.

No end-to-end system for management, incremental mining and accurate prediction of continuously evolving trajectories of moving objects.

Related Work: Frequent Pattern Mining

20 years of research
Frequent pattern types: itemsets, sequences, graphs

Exponential search space is pruned based on the anti-monotonicity of the pattern support measure, given a minimum support threshold min_sup

Pattern constraints:
  Maximal (lossy): pattern X is maximal if X is frequent and there does not exist another frequent pattern Y that is a proper superset of X.
  Closed (lossless): pattern X is closed if X is frequent and there does not exist another pattern Y that is a proper superset of X and has the same support as X.

Processing models: batch, online / stream, incremental
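
The difference between the maximal and the closed constraint can be seen on a toy itemset database. The following is a minimal sketch, not part of the original slides: the function names and the brute-force enumeration are illustrative assumptions that only make sense for tiny inputs.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_sup):
    """Brute-force enumeration of all frequent itemsets (toy data only)."""
    items = sorted(set().union(*transactions))
    result = {}
    for k in range(1, len(items) + 1):
        for cand in combinations(items, k):
            # Support = number of transactions that contain the candidate.
            sup = sum(1 for t in transactions if set(cand) <= t)
            if sup >= min_sup:
                result[frozenset(cand)] = sup
    return result

def closed_and_maximal(freq):
    """Split frequent itemsets into the closed and the maximal ones."""
    closed, maximal = set(), set()
    for x, sup in freq.items():
        supersets = [y for y in freq if x < y]  # frequent proper supersets
        if not supersets:
            maximal.add(x)   # no frequent proper superset at all
        if all(freq[y] < sup for y in supersets):
            closed.add(x)    # no proper superset with the same support
    return closed, maximal

transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"a"}]
freq = frequent_itemsets(transactions, min_sup=2)
closed, maximal = closed_and_maximal(freq)
```

With this data the closed sets are {a}, {a, b} and {a, c}, while only {a, b} and {a, c} are maximal. The maximal set alone no longer tells us that {a} has support 4, which is why it is called lossy.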

Related Work: Trajectory Prediction

Prediction model:
  Markov model
  Sequential rule / trajectory pattern

Model basis / generality:
  General model for all objects
  Type-based model for similar (types of) objects
  Specific model for each individual object

Definition of Regions Of Interest (ROI) for prediction:
  Application-specific ROIs (road segments, network cells, sensors, etc.)
  Density-based ROIs
  Grid-based ROIs

Prediction provision:
  Sequential spatial prediction (location of next ROI)
  Spatio-temporal prediction

Additional movement assumptions or models: YES / NO

Trajectory Representation

Grid G with side length glen uniformly partitions the 2D space
  The representation imposes no limitations and scales easily to different levels of detail

Grid-based trajectory:
  start time
  temporally annotated sequence: sequence of traversed grid cells and associated traversal times

Modeling the stopping of objects: append a pseudo grid cell (‘stop’) after the last (real) grid cell of each completed trip trajectory
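
The representation above can be sketched as follows. This is an illustrative assumption rather than the authors' code: the input format (time in seconds, planar coordinates in meters), the function name, and the zero traversal time of the pseudo cell are all hypothetical, and the sketch assumes the samples are dense enough that consecutive cells are adjacent (no interpolation is done here).

```python
STOP = "stop"  # pseudo grid cell that marks the end of a completed trip

def to_grid_trajectory(points, glen):
    """Convert time-sorted (t, x, y) samples into a grid-based trajectory.

    Returns (start_time, [(cell, traversal_time), ...]); the sequence ends
    with the pseudo cell STOP.
    """
    if not points:
        raise ValueError("empty trip")
    start_time = points[0][0]
    entered = []  # (cell, time at which the cell was entered)
    for t, x, y in points:
        cell = (int(x // glen), int(y // glen))
        if not entered or entered[-1][0] != cell:
            entered.append((cell, t))
    end_time = points[-1][0]
    seq = []
    for i, (cell, t_in) in enumerate(entered):
        # A cell is left when the next cell is entered (or the trip ends).
        t_out = entered[i + 1][1] if i + 1 < len(entered) else end_time
        seq.append((cell, t_out - t_in))
    seq.append((STOP, 0))
    return start_time, seq

samples = [(0, 10, 10), (10, 60, 10), (20, 120, 10), (30, 180, 10), (40, 250, 10)]
start, trajectory = to_grid_trajectory(samples, glen=100)
```

Changing glen is all that is needed to obtain a coarser or finer level of detail from the same samples.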

Stream Processing Model

Temporal sliding window model: window size and window stride

[Figure: temporal sliding window over completed trips and partial trips; labels: stride, size]
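
A temporal sliding window with a size and a stride can be sketched as below. The half-open window convention and the rule used to separate completed from partial trips (a trip is completed if it ends inside the window, partial if it is still ongoing at the window end) are assumptions made for illustration; the slides do not spell them out.

```python
def sliding_windows(t_start, t_end, wsize, wstride):
    """Yield half-open windows [begin, begin + wsize) that fit into [t_start, t_end]."""
    begin = t_start
    while begin + wsize <= t_end:
        yield (begin, begin + wsize)
        begin += wstride

def classify_trips(trips, window):
    """Split (trip_start, trip_end) pairs into completed and partial trips."""
    begin, end = window
    completed = [tr for tr in trips if begin <= tr[1] < end]
    partial = [tr for tr in trips if tr[0] < end <= tr[1]]
    return completed, partial

# 60-minute windows that advance by 5 minutes, over 90 minutes of data.
windows = list(sliding_windows(0, 90 * 60, wsize=60 * 60, wstride=5 * 60))
trips = [(100, 900), (3000, 4000), (3700, 3800)]
completed, partial = classify_trips(trips, windows[0])
```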

Mining of Closed Contiguous Frequent Routes

Grow CCFRs (or patterns) in a depth-first fashion:
  Start with single grid cells
  Recursively extend by adding one grid cell in each recursion

Data structure:
  A simple flat array representation of the trajectories is used
  References are kept to the current ends of the pattern occurrences in order to quickly find and group possible extensions

Simple and fast closedness checking of contiguous patterns: direct check of possible superpatterns and their support by generating and testing all possible extensions of a given pattern

Without limitations, annotate CCFRs with global traversal times of grid cells
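
The mining scheme described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the authors' implementation: the function name is hypothetical, trajectories are plain lists of cell labels without traversal times, support counts occurrences, and recursion is used as-is (very long routes would need an explicit stack).

```python
from collections import defaultdict

def mine_ccfr(trajectories, min_sup):
    """Return {pattern tuple: support} of closed contiguous frequent routes."""
    # Flat array of all cells; None separates the trajectories.
    flat = []
    for tr in trajectories:
        flat.extend(tr)
        flat.append(None)

    closed = {}

    def grow(pattern, ends):
        # ends: for each occurrence, the index right after its last cell.
        n, sup = len(pattern), len(ends)
        right = defaultdict(list)  # next cell -> ends of the extended occurrences
        left = defaultdict(int)    # previous cell -> number of occurrences
        for e in ends:
            if flat[e] is not None:
                right[flat[e]].append(e + 1)
            start = e - n
            if start > 0 and flat[start - 1] is not None:
                left[flat[start - 1]] += 1
        # Closed if no one-cell extension (left or right) has the same support.
        if all(len(v) < sup for v in right.values()) and \
           all(c < sup for c in left.values()):
            closed[pattern] = sup
        # Depth-first growth to the right, one grid cell per recursion.
        for cell, new_ends in right.items():
            if len(new_ends) >= min_sup:
                grow(pattern + (cell,), new_ends)

    first = defaultdict(list)
    for i, cell in enumerate(flat):
        if cell is not None:
            first[cell].append(i + 1)
    for cell, ends in first.items():
        if len(ends) >= min_sup:
            grow((cell,), ends)
    return closed

routes = mine_ccfr([list("ABC"), list("ABDBC")], min_sup=2)
```

Checking only one-cell extensions is sufficient for contiguous patterns, because any longer superpattern with the same support implies a one-cell extension with the same support.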

Incremental CCFR Mining

General idea from Bifet et al. for incremental closed subgraph mining:
  Weight closed patterns by their "relative support" and mine the weighted patterns to reproduce the original pattern set, i.e., the combined operation of weighting and mining is idempotent: f(x) = f(f(x))
  The idempotent pattern weight (ipw) of a pattern is its support minus the support of all of its super-patterns in the pattern set

Incremental mining: combine and mine the patterns of pattern sets from non-overlapping windows to reproduce an approximation of the results

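[Figure: the windows w_i, w_(i-1), w_(i-2), one stride apart, are mined into CCFR_i, CCFR_(i-1), CCFR_(i-2); their weighted pattern sets ipw_i, ipw_(i-1), ipw_(i-2) are combined (+) and mined again to give Approx. CCFR(i-2..i), an approximation of CCFR(i-2..i)]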
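
The combination step can be sketched as follows, assuming each window has already been reduced to a dictionary that maps a closed pattern to its ipw. The function names and the brute-force candidate enumeration are illustrative assumptions; a real implementation would reuse the miner on the weighted patterns.

```python
from collections import defaultdict

def occurrences(pattern, sequence):
    """Number of contiguous occurrences of pattern in sequence (both tuples)."""
    n = len(pattern)
    return sum(1 for i in range(len(sequence) - n + 1)
               if sequence[i:i + n] == pattern)

def combine_and_mine(weighted_sets, min_sup):
    """Combine per-window {pattern: ipw} sets and mine the weighted patterns."""
    combined = defaultdict(int)
    for ws in weighted_sets:
        for pattern, weight in ws.items():
            combined[pattern] += weight      # the '+' in the figure
    # Candidates: every contiguous sub-pattern of a stored pattern.
    candidates = {p[i:j] for p in combined
                  for i in range(len(p)) for j in range(i + 1, len(p) + 1)}
    support = {c: sum(occurrences(c, p) * w for p, w in combined.items())
               for c in candidates}
    frequent = {c: s for c, s in support.items() if s >= min_sup}

    def is_closed(c):
        # No one-cell extension with the same (weighted) support.
        return not any(len(d) == len(c) + 1 and occurrences(c, d) > 0
                       and s == frequent[c] for d, s in frequent.items())

    return {c: s for c, s in frequent.items() if is_closed(c)}

window_1 = {("A", "B"): 2, ("B", "C"): 2, ("B",): -1}  # from ABC and ABDBC
window_2 = {("A", "B"): 3}                              # e.g. from AB, AB, AB
approx = combine_and_mine([window_1, window_2], min_sup=2)
```

In this small case the combined result is exact (B occurs 6 times, AB 5 times, BC twice). In general the result is only an approximation; one likely reason is that routes that are infrequent in every single window never enter the per-window pattern sets and therefore cannot be recovered.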

Capture Temporal and Periodic Variations

Use the same pattern weighting methodology to combine patterns from temporally relevant historical windows

Temporal domain projections to capture periodic variations at different levels

[Figure: the pattern sets CCFR_Monday@9am, CCFR_Tuesday@9am, …, CCFR_Friday@9am are weighted (ipw_Monday@9am, ipw_Tuesday@9am, …, ipw_Friday@9am), combined (+), and mined to give Approx. CCFR_weekdays@9am]

Faulty Support Definition and the Fix

Example database of two sequences: ABC and ABDBC, min_sup = 2

Original support definition: number of sequences that contain the pattern
  Closed patterns and their support: AB:2 and BC:2 (NOTE: A, B, or C alone are not closed!)
  ipw of patterns: ipw(AB) = 2 and ipw(BC) = 2
  Mining after ipw-weighting yields patterns: AB:2, BC:2 and B:4, which cannot be!

New support definition: number of times the pattern occurs in the sequences
  Closed patterns and their support: B:3, AB:2 and BC:2
  ipw of patterns: ipw(B) = 3-2-2 = -1, ipw(AB) = 2 and ipw(BC) = 2
  Mining after ipw-weighting yields patterns: AB:2, BC:2 and B:3 (idempotency)

Fix only works for directed sequences and contiguous patterns!
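
The numbers of this example can be reproduced with a short sketch. One assumption goes beyond the slide: the weights are computed from the longest pattern down, subtracting the already computed weights of the super-patterns once per occurrence. On this example that coincides with "support minus the support of the super-patterns"; the function names are hypothetical.

```python
def count_occurrences(pattern, sequence):
    """Number of contiguous occurrences of pattern in sequence (both tuples)."""
    n = len(pattern)
    return sum(1 for i in range(len(sequence) - n + 1)
               if sequence[i:i + n] == pattern)

def idempotent_pattern_weights(closed):
    """closed: {pattern tuple: support}; returns {pattern tuple: ipw}."""
    ipw = {}
    for x in sorted(closed, key=len, reverse=True):  # longest patterns first
        weight = closed[x]
        for y, wy in ipw.items():
            if len(y) > len(x):
                weight -= count_occurrences(x, y) * wy
        ipw[x] = weight
    return ipw

def weighted_support(pattern, weighted_patterns):
    """Support of pattern when the weighted patterns are mined as a database."""
    return sum(count_occurrences(pattern, y) * w
               for y, w in weighted_patterns.items())

# Original definition (number of containing sequences): B is not closed.
faulty = idempotent_pattern_weights({("A", "B"): 2, ("B", "C"): 2})
# New definition (number of occurrences): B is closed with support 3.
fixed = idempotent_pattern_weights({("A", "B"): 2, ("B", "C"): 2, ("B",): 3})
```

Here weighted_support(("B",), faulty) evaluates to 4, which exceeds the two sequences in the database, whereas weighted_support(("B",), fixed) gives back the original 3.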

CCFR Based Prediction

Given a set of CCFRs R, iteratively extend the query vector q (partial trajectory) that ends in an anchor a as follows:

1. Find the set of best matching patterns R* that contain the longest contiguous suffix s of q starting from a

2. Calculate the successor probability of the grid cells that occur in the patterns in R* directly after an occurrence of s

3. Retrieve the neighboring cell probability of every grid cell that occurs in the trips after the anchor a

4. Complete the successor probability distribution over the neighbors of a using the neighboring cell probabilities

5. Extend q with the most likely successor grid cell c* and reduce the prediction horizon by the global average of the traversal time of c*

6. Stop and return c* if the remaining prediction horizon <= 0; otherwise go to step 1.
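
The loop above can be sketched as follows. Several simplifications are assumptions of this sketch and not taken from the slides: the "best matching" patterns are those in which the longest suffix of q is followed by at least one cell, successor probabilities are support-weighted, step 4 is reduced to a plain fallback to the neighboring cell probabilities when no pattern matches, and all traversal times are assumed to be positive. All names are hypothetical.

```python
def predict(q, patterns, neighbor_prob, avg_time, horizon):
    """Predict the cell reached after `horizon` seconds.

    q: list of cells (partial trajectory), the last one is the anchor.
    patterns: {tuple of cells: support} (the CCFRs).
    neighbor_prob: {cell: {next cell: probability}}.
    avg_time: {cell: global average traversal time in seconds}.
    """
    q = list(q)
    while True:
        succ = {}
        # Steps 1-2: longest suffix of q that has a successor in some pattern.
        for k in range(len(q), 0, -1):
            s = tuple(q[-k:])
            counts = {}
            for p, sup in patterns.items():
                for i in range(len(p) - k):
                    if p[i:i + k] == s:
                        counts[p[i + k]] = counts.get(p[i + k], 0) + sup
            if counts:
                total = sum(counts.values())
                succ = {c: n / total for c, n in counts.items()}
                break
        # Steps 3-4 (simplified): fall back to neighboring cell probabilities.
        if not succ:
            succ = dict(neighbor_prob.get(q[-1], {}))
        if not succ:
            return q[-1]          # nothing known about the anchor
        # Step 5: extend q and reduce the prediction horizon.
        best = max(succ, key=succ.get)
        q.append(best)
        horizon -= avg_time.get(best, 1.0)
        # Step 6: stop when the horizon is used up (or the trip ends).
        if horizon <= 0 or best == "stop":
            return best

patterns = {("a", "b", "c"): 3, ("x", "b", "d"): 5, ("b", "d"): 1}
neighbor_prob = {"b": {"d": 0.6, "c": 0.4}, "c": {"e": 1.0}}
avg_time = {"c": 30, "d": 30, "e": 30}
next_cell = predict(["a", "b"], patterns, neighbor_prob, avg_time, horizon=30)
```

In this toy setup the neighboring cell probabilities alone would continue from b to d, while the pattern that matches the longer suffix (a, b) leads to c.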

Illustrative Example: Trajectories and Mining

Illustrative Example: Prediction

When Patterns Make a Difference

Neighboring cell probabilities predict (4.1) with confidence 57%, but the patterns predict (5.2) with confidence 100%.

When Neighboring Probabilities Fail: Avoid cycles and u-turns!

Cases when predictions with patterns differ from predictions with neighboring cell probabilities

Explicitly rule out u-turns (as well as cycles) in the prediction

Empirical Evaluation

Hardware: 64bit Ubuntu 12.10 on Intel Core 2 Quad Q8400 2.66GHz processor and 4GB memory

Data set: 6 day sample of 11K taxis in Wuhan, China (85M records)

Outlier removal
Sampling gaps of more than 120 seconds delimit trips
Linear interpolation of trips between samples using 100-meter grid cells
Eliminate short trips (less than 300 seconds or 10 grid cells)

[Figure: raw sample vs. interpolated trips]

2 million trips that have an average length of 1390 seconds and 94 grid cells and refer to 2 billion grid cells
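
The trip delimitation and the duration filter can be sketched as below. The record format, the function names, and the restriction to the duration threshold are assumptions of this sketch; the 10-grid-cell threshold would be applied after interpolation, which is not shown.

```python
def split_into_trips(samples, max_gap=120):
    """Split time-sorted (t, x, y) samples of one vehicle at sampling gaps > max_gap seconds."""
    trips, current = [], []
    for sample in samples:
        if current and sample[0] - current[-1][0] > max_gap:
            trips.append(current)   # the gap closes the current trip
            current = []
        current.append(sample)
    if current:
        trips.append(current)
    return trips

def long_enough(trip, min_duration=300):
    """Keep only trips that last at least min_duration seconds."""
    return trip[-1][0] - trip[0][0] >= min_duration

samples = [(0, 0, 0), (100, 50, 0), (200, 100, 0), (300, 150, 0), (400, 200, 0),
           (600, 300, 0), (650, 320, 0)]
trips = split_into_trips(samples)
kept = [trip for trip in trips if long_enough(trip)]
```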

Evaluation Measure

Prediction Tests

Sliding window model: t_wsize = 60 minutes, t_wstride = 5 minutes
Prediction horizon: up to 5 minutes
Methods:
  global: neighboring probabilities only, based on all trips (even future ones!)
  g ¬o: global + cycle prevention
  g ¬ou: global + cycle and u-turn prevention
  g best: best prediction of global
  local: neighboring probabilities only, based on completed trips in the window
  l ¬o: local + cycle prevention
  l ¬ou: local + cycle and u-turn prevention
  l best: best prediction of local
  60: patterns with min_sup=60 + neighboring probabilities, based on completed trips in the window
  60, 6d: same as 60 but with hour-of-day projection
  60, 4d: same as 60 but with hour-of-day and weekday-weekend projections

Absolute Prediction Error

Absolute prediction error (i.e., average grid cell distance to the predicted and to ‘best’ grid cell) of different methods.

Relative Prediction Error

Relative prediction error (i.e., percentage improvement) of different methods w.r.t. the baseline predictor ‘global’.

Effects of Incremental Mining

Using 20-minute subwindows, the average prediction errors are virtually unchanged compared to method ‘60’.

[Figure panels: trips during 1 hour; directly mined CCFRs; incrementally mined CCFRs]

Conclusions and Future Work

IncCCFR: a novel, incremental approach for managing, mining, and predicting the incrementally evolving trajectories of moving objects
  Essentially a varying-order, deterministic Markov model that is based on closed contiguous frequent routes and neighboring cell probabilities

Advantages:
  Reduced mining and storage costs
  Ability to combine multiple temporally relevant mining results from the past to capture temporal and periodic regularities in movement

Future work:
  Use the pattern combination approach to parallelize mining
  Use current speed + historical CCFRs to be able to react to rare, unpredictable, sudden changes

Thank you for your attention!

Q/A?
