1
Learning Transportation Mode from Raw GPS Data for Geographic Applications on the Web
Yu Zheng, Like Liu, Xing Xie (Microsoft Research Asia), WWW’08
Advisor: Chia-Hui Chang
Presenter: Teng-Kai Fan
Date: 2010-03-19
2
Outline
• Introduction
• Framework
• Methodology
• Experiment
• Conclusion & future work
3
• Background: percentage of GPS-enabled handsets among mobile phones (Gartner Dataquest, Forecast: GPS-enabled device 2004-2011)
[Figure: percentage of total sales (y-axis) by year, 2004 to 2011 (x-axis); series: Africa, Asia/Pacific, Eastern Europe, Japan, Latin America, Middle East, North America, Western Europe, Total]
4
Introduction
• What we do: infer transportation modes from users’ GPS logs
[Diagram labels: Users, GPS log, Infer model]
5
Introduction
– Motivation
  • Differentiate GPS trajectories of different transportation modes
  • Learn knowledge from raw GPS data
    – Enable people to absorb more knowledge from others’ life experience
    – Trigger people’s memories of their past
    – Understand people’s life patterns
  • Understand user behavior
    – Context-aware computing
    – Modeling traffic conditions
    – Discovering social patterns
    – …
– Difficulty
  • A trajectory may contain more than one kind of transportation mode
  • A purely velocity-based method may suffer from congestion
6
Introduction
[Figure: Distribution of mean velocity (m/s) of different transportation modes; x-axis: 0 to 20 m/s; y-axis: 0% to 60%; series: Walk, Car, Bus, Bike]
[Figure: Distribution of maximum velocity (m/s) of different transportation modes; x-axis: 0 to 30 m/s; y-axis: 0% to 40%; series: Walk, Car, Bus, Bike]
7
Introduction
• Contributions
– We propose
  • A change point-based segmentation method
  • An inference model based on supervised learning
  • A post-processing algorithm based on conditional probability
– Significance
  • A step toward mining knowledge from raw GPS data for geographic applications on the Web
  • A step toward understanding user behavior based on GPS data
– Evaluation results
  • Large-scale data collected by 45 people over a period of 6 months
  • Almost 70 percent accuracy
8
Framework
• Preliminary
– GPS log: a sequence of points, each with latitude, longitude, and time
  P1: Lat1, Long1, T1
  P2: Lat2, Long2, T2
  …
  Pn: Latn, Longn, Tn
– Change point: a place where people change their transportation modes
[Diagram: a trajectory P1, P2, P3, …, Pn-1, Pn divided at a change point into a Walk Segment and a non-Walk Segment (Car)]
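As an illustration of the GPS log described on this slide, the sketch below stores each point as (latitude, longitude, time) and derives a velocity for each pair of consecutive points. The names `GpsPoint`, `haversine_m`, and `point_velocities` are hypothetical, and using the haversine great-circle distance is an assumption of this sketch; the slides do not say how distance is computed.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; an assumption of this sketch


@dataclass(frozen=True)
class GpsPoint:
    lat: float  # latitude in degrees
    lon: float  # longitude in degrees
    t: float    # timestamp in seconds


def haversine_m(a: GpsPoint, b: GpsPoint) -> float:
    """Great-circle distance between two points, in meters."""
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dphi = phi2 - phi1
    dlmb = math.radians(b.lon - a.lon)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def point_velocities(points):
    """Velocity (m/s) for each consecutive pair of points; 0.0 if time does not advance."""
    out = []
    for p, q in zip(points, points[1:]):
        dt = q.t - p.t
        out.append(haversine_m(p, q) / dt if dt > 0 else 0.0)
    return out


# Two points 0.001 degrees of latitude apart, sampled 10 s apart.
log = [GpsPoint(39.900, 116.400, 0.0), GpsPoint(39.901, 116.400, 10.0)]
velocities = point_velocities(log)
```

Accelerations can be derived the same way, from differences of consecutive velocities.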
9
Framework
• Inference strategy
– Divide the GPS track into trips, then partition each trip into segments by change points
[Diagram: GPS Log → Segmentation → Feature Extraction → Features → Inference Model → Trans. Modes → Post Process → Final Results; CRF appears as an alternative inference path]
10
Framework
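• Post-Processing
– Example: Segment[i-1]: Car, Segment[i]: Walk, Segment[i+1]: Bike
– Inferred probabilities per segment:
  Segment[i-1]: P(Car): 75%, P(Bus): 10%, P(Bike): 8%, P(Walk): 7%
  Segment[i]: P(Bike): 62%, P(Walk): 24%, P(Bus): 8%, P(Car): 6%
  Segment[i+1]: P(Bike): 40%, P(Walk): 30%, P(Bus): 20%, P(Car): 10%
– Update using the mode of the previous segment:
  Segment[i].P(Bike) = Segment[i].P(Bike) * P(Bike|Car)
  Segment[i].P(Walk) = Segment[i].P(Walk) * P(Walk|Car)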
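The update rule on this slide can be worked through numerically. The sketch below assumes that the second probability list on the slide belongs to Segment[i] and takes P(mode | Car) from the Car row of the transition-matrix slide. The function name `rescore` is hypothetical, and dropping modes with no listed transition (the "/" diagonal) is a choice of this sketch, not something the slides specify.

```python
# P(mode | previous mode = Car), copied from the Car row of the transition matrix.
P_GIVEN_CAR = {"Walk": 0.954, "Bus": 0.028, "Bike": 0.018}


def rescore(segment_probs, transition_given_prev):
    """Multiply each mode's probability by P(mode | previous mode).

    Modes without a listed transition probability are omitted from the result.
    """
    return {
        mode: p * transition_given_prev[mode]
        for mode, p in segment_probs.items()
        if mode in transition_given_prev
    }


# Inference output assumed for Segment[i]; the previous segment was inferred as Car.
segment_i = {"Bike": 0.62, "Walk": 0.24, "Bus": 0.08, "Car": 0.06}
rescored = rescore(segment_i, P_GIVEN_CAR)
best = max(rescored, key=rescored.get)
```

With these numbers, Bike drops to 0.62 * 0.018 = 0.01116 while Walk becomes 0.24 * 0.954 = 0.22896, so the segment is relabeled as Walk.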
11
Framework
• CRF-Based Inference
– Graphical model of a trip
– States Mi-1, Mi, Mi+1: transportation modes
– Observations Xi-1, Xi, Xi+1: features from segments
[Diagram: example trip with Walk, Bus, and Car states]
Methodology
• Commonsense knowledge from the real world
– Typically, people need to walk before transferring transportation modes
– Typically, people need to stop and then go when transferring modes
– Walk should be a transition between different transportation modes

Transition matrix of transportation modes
        Walk     Car      Bus      Bike
Walk    /        53.4%    32.8%    13.8%
Car     95.4%    /        2.8%     1.8%
Bus     95.2%    3.2%     /        1.6%
Bike    98.3%    1.7%     0%       /
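A transition matrix like the one on this slide could be estimated by counting mode changes in labeled trips and normalizing each row. The sketch below is one way to do that; the slides do not describe the estimation procedure, so the function name `transition_matrix`, the toy trips, and the reading of rows as the previous mode are all assumptions.

```python
from collections import Counter, defaultdict


def transition_matrix(trips):
    """Row-normalized counts of mode changes between consecutive segments.

    `trips` is a list of mode sequences, one label per segment. Same-mode
    pairs are skipped, matching the '/' diagonal of the slide's table.
    """
    counts = defaultdict(Counter)
    for modes in trips:
        for prev, nxt in zip(modes, modes[1:]):
            if prev != nxt:
                counts[prev][nxt] += 1
    return {
        prev: {nxt: c / sum(row.values()) for nxt, c in row.items()}
        for prev, row in counts.items()
    }


# Toy labeled trips (illustrative only, not data from the study).
trips = [
    ["Car", "Walk", "Bus"],
    ["Walk", "Bus", "Walk", "Bike"],
    ["Car", "Walk"],
]
matrix = transition_matrix(trips)
```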
13
Methodology
• Change point-based segmentation algorithm
– Step 1: Use loose upper bounds on velocity (Vt) and acceleration (at) to separate all possible Walk Points from non-Walk Points.
– Step 2: Merge any short segment (length below a threshold) composed of consecutive Walk Points or non-Walk Points.
– Step 3: Merge consecutive Uncertain Segments (shorter than 50 meters) into a non-Walk Segment.
– Step 4: The end points of each Walk Segment are potential change points.
Legend:
– Non-Walk Point: P.V > Vt or P.a > at
– Possible Walk Point: P.V < Vt and P.a < at
[Diagram, panels (a) to (c): example trajectory with Car, Walk, and Bus portions, showing Certain Segments, 3 Uncertain Segments, and backward/forward merging]
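Steps 1, 2, and 4 of the segmentation algorithm on this slide can be sketched as follows; Step 3 (uncertain segments) is left out for brevity. All function names are hypothetical. The thresholds (Vt = 2.5 m/s, at = 1.5 m/s², 20 m minimum length) are illustrative values, and merging a short run into the run before it is an assumption of this sketch.

```python
from itertools import groupby


def label_points(velocities, accels, v_t, a_t):
    """Step 1: True marks a possible Walk Point (P.V < Vt and P.a < at)."""
    return [v < v_t and a < a_t for v, a in zip(velocities, accels)]


def to_runs(labels, dists):
    """Group consecutive equal labels into runs: [is_walk, start, end, length_m]."""
    runs, i = [], 0
    for is_walk, grp in groupby(labels):
        n = len(list(grp))
        start, end = i, i + n - 1
        runs.append([is_walk, start, end, sum(dists[start + 1:end + 1])])
        i += n
    return runs


def merge_short(runs, dists, min_len):
    """Step 2: fold a run into the previous one if it is short or has the same label."""
    merged = []
    for run in runs:
        if merged and (run[3] < min_len or run[0] == merged[-1][0]):
            prev = merged[-1]
            prev[2] = run[2]
            prev[3] = sum(dists[prev[1] + 1:prev[2] + 1])
        else:
            merged.append(list(run))
    return merged


def change_points(runs):
    """Step 4: both end points of every Walk run are potential change points."""
    cps = set()
    for is_walk, start, end, _ in runs:
        if is_walk:
            cps.update((start, end))
    return sorted(cps)


def segment_trip(velocities, accels, dists, v_t=2.5, a_t=1.5, min_len=20.0):
    labels = label_points(velocities, accels, v_t, a_t)
    runs = merge_short(to_runs(labels, dists), dists, min_len)
    return runs, change_points(runs)


# Toy trip sampled once per second: car (with one slow glitch), walk, then bus.
velocities = [12.0] * 10 + [1.0] + [12.0] * 5 + [1.2] * 30 + [8.0] * 10
accels = [0.0] * len(velocities)
dists = [0.0] + velocities[1:]  # meters moved to reach each point at 1 s sampling
runs, cps = segment_trip(velocities, accels, dists)  # cps == [16, 45]
```

The single slow point inside the car portion forms a zero-length run, so Step 2 absorbs it and only the real walk portion yields change points.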
14
Experiments
• Framework of experiment
– GPS log data
– Segmentation: Change Point Based, Uniform Duration Based, Uniform Length Based
– Feature extraction from each segment
– Inference: Bayesian Net, SVM, Decision Tree, CRF
• Feature Extraction
– Length
– Mean velocity
– Expectation of velocity
– Variance of velocity
– Top three velocities
– Top three accelerations
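The feature list on this slide can be computed per segment roughly as below. The slide does not define how "mean velocity" differs from "expectation of velocity"; this sketch assumes the former is segment length divided by duration and the latter is the average of the point velocities. The function name `segment_features` and the dictionary keys are hypothetical.

```python
from statistics import fmean, pvariance


def segment_features(length_m, duration_s, velocities, accels):
    """Feature dictionary for one segment, following the slide's feature list."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return {
        "length": length_m,
        # Assumption: mean velocity is total length over total duration.
        "mean_velocity": length_m / duration_s,
        # Assumption: expectation of velocity is the average point velocity.
        "expectation_of_velocity": fmean(velocities),
        "variance_of_velocity": pvariance(velocities),
        "top3_velocities": sorted(velocities, reverse=True)[:3],
        "top3_accelerations": sorted(accels, reverse=True)[:3],
    }


features = segment_features(100.0, 20.0, [4.0, 5.0, 6.0, 5.0], [1.0, 0.5, -1.0])
```

The resulting dictionary can be flattened into a fixed-length vector before being passed to a classifier such as a decision tree.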
15
Experiment
• Devices
• Data
16
Experiment
• Evaluation method
– Precision of inferring a segment
  • Accuracy by Length
    $$A_L = \frac{\sum_{j=0}^{m} \mathrm{CorrectSegment}[j].\mathrm{Length}}{\sum_{i=0}^{N} \mathrm{Segment}[i].\mathrm{Length}}$$
  • Accuracy by Duration
    $$A_D = \frac{\sum_{j=0}^{m} \mathrm{CorrectSegment}[j].\mathrm{Duration}}{\sum_{i=0}^{N} \mathrm{Segment}[i].\mathrm{Duration}}$$
  N: the total number of segments produced by a segmentation method
  m: the number of segments our approach correctly predicted
– Change point
  • Precision of change point
  • Recall of change point
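Accuracy by Length and Accuracy by Duration weight each correctly predicted segment by its length or duration instead of counting segments. A minimal sketch, with a hypothetical function name `accuracy_by` and toy segments that are not data from the study:

```python
def accuracy_by(segments, key):
    """Share of total `key` (e.g. length or duration) covered by correct segments."""
    total = sum(s[key] for s in segments)
    correct = sum(s[key] for s in segments if s["predicted"] == s["truth"])
    return correct / total if total else 0.0


# Toy segments (illustrative only).
segments = [
    {"truth": "Car", "predicted": "Car", "length": 1000.0, "duration": 120.0},
    {"truth": "Walk", "predicted": "Bike", "length": 200.0, "duration": 180.0},
    {"truth": "Bus", "predicted": "Bus", "length": 800.0, "duration": 300.0},
]
a_l = accuracy_by(segments, "length")    # 1800 / 2000
a_d = accuracy_by(segments, "duration")  # 420 / 600
```

The misclassified walk segment is short but slow, so it costs little in Accuracy by Length and much more in Accuracy by Duration.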
17
Experiment: Result
• Inference performance
[Figure: Inferring accuracy of transportation mode over the change point-based segmentation method; x-axis: inference model (Decision Tree, SVM, Bayes net, CRF); y-axis: accuracy, 0 to 0.9; series: Accuracy by Length, Accuracy by Duration]
18
Experiment
• Inference performance of change point
[Figure: Recall of change point using the change point-based segmentation method; x-axis: distance (m), 50 to 300; y-axis: recall, 0.10 to 1.00; series: Decision Tree, SVM, Bayes net, CRF]
[Figure: Precision of change point using the change point-based segmentation method; x-axis: distance (m), 50 to 300; y-axis: precision, 0.10 to 0.55; series: Decision Tree, SVM, Bayes net, CRF]
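The plots on this slide report change point recall and precision as a function of a distance tolerance. One way to compute such scores is sketched below. It assumes that a predicted change point counts as correct when it lies within the tolerance of some true change point, and that positions are given as distance along the trajectory in meters. The slides do not spell out the matching rule, and the function name `change_point_scores` is hypothetical.

```python
def change_point_scores(predicted, truth, tolerance_m):
    """Return (precision, recall) of predicted change points at a distance tolerance."""
    def near(x, others):
        return any(abs(x - y) <= tolerance_m for y in others)

    hit_pred = sum(near(p, truth) for p in predicted)
    hit_true = sum(near(t, predicted) for t in truth)
    precision = hit_pred / len(predicted) if predicted else 0.0
    recall = hit_true / len(truth) if truth else 0.0
    return precision, recall


# Toy positions in meters along a trajectory (illustrative only).
predicted = [100.0, 480.0, 900.0, 1500.0]
truth = [120.0, 1000.0]
tight = change_point_scores(predicted, truth, 50.0)   # (0.25, 0.5)
loose = change_point_scores(predicted, truth, 150.0)  # (0.5, 1.0)
```

Widening the tolerance can only keep or raise both scores, which is consistent with curves plotted against distance.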
19
Experiment: Result
Comparison of different segmentation methods using Decision Tree
                          change point    uniform duration (120 s)    uniform length (100 m)
Accuracy by Length        0.685           0.647                       0.399
Accuracy by Duration      0.753           0.701                       0.674
Recall/change point       0.887           0.867                       0.867
Precision/change point    0.406           0.197                       0.148
20
Experiment: Result
Comparison of inference results of CRF over different segmentation methods
                          change point    uniform duration (90 s)    uniform length (150 m)
Accuracy by Length        0.528           0.524                      0.617
Accuracy by Duration      0.358           0.413                      0.525
Recall/change point       0.281           0.121                      0.656
Precision/change point    0.286           0.070                      0.159
21
Conclusion
[Diagram: segmentation methods (Change Point based, Uniform Duration based, Uniform Length based) and inference methods (SVM, Bayesian Net, Decision Tree, CRF)]
22
Future work
• Identify more valuable features
• Location-constrained conditional probability
• Improve the prediction performance of the CRF-based approach