Computational Transportation Science
Ouri Wolfson
Computer Science
Vision
• Take advantage of advances in
  – Wireless communication (communicate)
  – Mobile/static sensor technologies (integrate)
  – Geospatial-temporal information management (analyze)
• To address transportation problems
  – Congestion
  – Safety
  – Mobility
  – Energy
  – Environmental
IGERT Ph.D. program in Computational Transportation Science
• Funded by the National Science Foundation ($3M+)
• Train about 20 scientists
  – Will develop novel classes of applications
• Colleges: engineering, business, urban planning
• $30K/year stipend, international internships
[Figure labels: Transportation, Information Technology]
Outline
• Abstraction of concepts from sensor data: extracting semantic locations from GPS traces.
• Coping with imprecision and uncertainty: map matching.
• Mixed environments: information in vehicular and other peer-to-peer networks.
• Managing spatial-temporal data: compression.
• Software tools: databases with
  – spatial, temporal, and uncertainty capabilities
  – for tracking, analysis, and routing
Introduction – location information
• Location information
  – Physical location
    • Provided by positioning systems, e.g. GPS: (122.39, 239.11, 11:20am)
    • Unreadable by users
  – Semantic location
    • Not directly provided by positioning systems
    • Examples: Dominick's grocery store, 1340 S. Canal St.; dermatologist's office; home
    • Useful to users
Introduction – problem statement
• Physical location -> semantic location
• Devices
  – Outdoor positioning systems
  – Internet access
• Application examples:
  – Context awareness of mobile devices (autocomplete)
  – Reminder applications
  – "Total Recall" by Gordon Bell
Main Input and Output
• Input: trajectory T = (x1, y1, t1), (x2, y2, t2), …, (xn, yn, tn)
• Output 1: Semantic location
  – Location name (BestBuy)
  – Semantic category: business type (electronics store), office, home
  – Street address
• Output 2: Semantic location log file
  – (date, begin_time, end_time, semantic location)
Online and offline versions
• Online: determine the current location
  – On mobile device
  – Based on incomplete trip trajectory
• Offline: determine multiple past locations
  – Based on complete trip trajectory
Auxiliary inputs
• Profile
  – Calendar: (event date, semantic location)
  – Address Book: (phone number, semantic location)
  – Phone Call List: (calling date, semantic location)
  – Web Page List: (visiting date, semantic location)
  – Destination List: (searching date, address)
  – User's Feedback
    • Confirmed list
    • Denied list
Algorithm
[Figure: algorithm flowchart. Other boxes in the chart: GPS data, Map, Yellow pages, Profile, User confirmation, Semantic log file.]
• Step 1: Extract stays
• Step 2: Get street address candidates
• Step 3: Get semantic location candidates
• Step 4.1: Calculate SC utility
• Step 4.2: Calculate SA utility
• Step 4.3: Calculate profile utility
• Step 5: Decide the semantic location
Juhong Liu, Ouri Wolfson, Huabei Yin, UIC
11 04/21/23
Step 1 - Stay extraction
• Stay
  – Loss of GPS signal
  – Spending at least min_time in an area with diameter no larger than d
• (stay_position, date, stay_start, stay_end)
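The time-and-diameter criterion can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `extract_stays` and the greedy window growth are assumptions, coordinates are assumed planar and in the same unit as d, timestamps are assumed to be in seconds (and to carry the date), and the loss-of-GPS-signal criterion is not handled.

```python
from math import hypot

def extract_stays(points, min_time, d):
    """Return stays as (stay_position, stay_start, stay_end).

    points: list of (x, y, t) sorted by t (hypothetical input layout).
    A stay is a maximal run of points whose pairwise distances are all <= d
    and whose duration is at least min_time.
    """
    stays = []
    i, n = 0, len(points)
    while i < n:
        j = i + 1
        # Grow the window while the new point is within d of every point in it,
        # which keeps the window diameter no larger than d.
        while j < n and all(
            hypot(points[j][0] - px, points[j][1] - py) <= d
            for px, py, _ in points[i:j]
        ):
            j += 1
        start, end = points[i][2], points[j - 1][2]
        if end - start >= min_time:
            window = points[i:j]
            cx = sum(p[0] for p in window) / len(window)
            cy = sum(p[1] for p in window) / len(window)
            stays.append(((cx, cy), start, end))  # centroid as stay_position
            i = j
        else:
            i += 1
    return stays
```

Using the centroid as stay_position is a design choice made here for simplicity; the slides do not say how stay_position is derived.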
Step2 – Street address candidates
• Reverse geocoding
  – Physical location (stay_position) -> street address
• Traditional geocoding method
  – Nearest street address
  – Incorrect result
• Street address candidates: the street addresses within k meters (graph distance) from stay_position.
[Figure: map example labeled "Building B" and "850 S. Halsted St"]
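One way to collect the addresses within k meters of graph distance is a distance-bounded Dijkstra search. The sketch below assumes that stay_position has already been attached to a node of a road/walkway graph; the graph layout, the `addresses` mapping, and the function name are hypothetical and are not taken from the slides.

```python
import heapq

def address_candidates(graph, addresses, source, k):
    """Street addresses within graph distance k of the source node.

    graph: {node: [(neighbor, meters), ...]}  (hypothetical layout)
    addresses: {node: street_address} for nodes that carry an address
    Returns a list of (distance, street_address) sorted by distance.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, []):
            nd = d + w
            # Only expand nodes that stay within the k-meter budget.
            if nd <= k and nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return sorted((d, addresses[n]) for n, d in dist.items() if n in addresses)
```

Unlike nearest-address lookup, this returns every address reachable within the budget, so a later step can choose among them.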
Step 3 - Semantic location candidates
• Street address candidates -> semantic location candidates
  – Yellow pages, such as switchboard.com
  – Profile: Calendar, Address Book, Phone Call List, Web Page List, Destination List, User's Feedback
At end of step 3: A set of Semantic Location candidates
• Semantic location
  – Location name (BestBuy)
  – Semantic category: business type (electronics store; theater), office, home
  – Street address
Step 4 - Three utilities calculation
• For each semantic location SL in the set of candidates, compute:
  – Semantic category (SC) utility: likelihood of the semantic category, given the semantic log (history)
  – Street address (SA) utility: likelihood of the street address, given the stay location
  – Profile (P) utility: likelihood of SL, given profile P
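The slides do not say how the three utilities are estimated or combined. Purely as an illustration, the sketch below assumes that the SC utility is the relative frequency of the category in the semantic log, that the SA and P utilities are already available as numbers in [0, 1], and that the final decision takes the candidate with the largest product. All names and the product rule are assumptions.

```python
from collections import Counter

def sc_utility(category, log_categories):
    """Assumed estimator: share of past log entries with this category."""
    if not log_categories:
        return 0.0
    return Counter(log_categories)[category] / len(log_categories)

def decide(candidates, log_categories):
    """Pick the candidate with the highest combined score.

    candidates: list of dicts with keys "name", "category",
    "sa_utility", "p_utility" (hypothetical layout).
    The product of the three utilities is an assumed combination rule.
    """
    def score(c):
        return sc_utility(c["category"], log_categories) * c["sa_utility"] * c["p_utility"]
    return max(candidates, key=score)
```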
Problem
• Most information systems are client/server
• Nearby mobile devices are inaccessible
  – Parking slot info
  – Video of road construction
  – Malfunctioning brakelight
  – Taxi cab
  – Ride-share opportunity
Environment
• A central server does not necessarily exist
• "Floating database"
  – Resources of interest in a limited geographic area, possibly for a short time duration
  – Applications coexist
• PDAs, cell phones, sensors, hotspots, vehicles, with short-range wireless
• Short-range wireless networks
  – Wi-Fi (100-200 meters)
  – Bluetooth (2-10 meters, popular)
  – ZigBee
• Unlicensed spectrum (free)
• High bandwidth
• Bandwidth-power/search tradeoff
[Figure: devices, each with a local database and a local query, holding resources 1-5 and 8 and resource-queries A, B, and C]
Mobile Local Search: applications
• Social networking (wearable website)
  – Personal profile of interest at a convention
  – Singles matchmaking
  – Games
  – Reminder
• Mobile advertising (coupons, RFID-tag info)
  – Sale on an item of interest at mall
  – Music-file exchange
• Transportation
• Emergency response
  – Search for victims in rubble
• Military
  – Sighting of insurgent in downtown Mosul in last hour
• Asset management and tracking
  – Sensors on containers exchange security information => remote checkpoints
• Mobile collaborative work
• Tourist and location-based services
  – Closest ATM
How to enable Mobile P2P applications?
• Develop a platform for building them
Problems in data management
• Query processing
• Dissemination analysis
• Participation incentives
Floating (Probe) car data
[Figure: a segment of the road network]
Periodically the ITA on a vehicle generates a velocity report:
• Vehicle id: IL391645
• Average speed: 45 mph
• Time: 3:49:45pm
• Location: (12345.25, 4321.52)
• Travel direction: east
P2P method
[Figure: two panels, (a) and (b), showing vehicles A, B, and C with numbered labels]
Each vehicle communicates reports to other vehicles using short-range (e.g. 300 meters), unlicensed wireless spectrum, e.g. 802.11
Travel-time map
Multimedia info: view/hear traffic conditions 1 mile ahead by a click on your smartphone.
Query Processing Strategies
WiMaC paradigm: WiFi-disseminate, Match, WiFi/cellular-respond
[Figure: three panels showing the M-producer, the Q-producer, and node Z]
(a) Media and Q are initially disseminated. They collocate at Z.
(b) Z sends Q to M-producer via cellular.
(c) M-producer sends media to Q-producer via cellular.
WiMaC Design Space
Evaluation criteria:
• Throughput
• Response time
• Wi-Fi communication volume
• Cellular communication volume
Comparison Results
[Figure: dominance analysis. Graph over the WiFi-only strategies 1 (media), 2a (meta), 3a (query), 4a (media,meta), 5a (media,query), 6a (meta,query), 7a (media,meta,query) and the WiFi-cellular strategies 2b through 7b. Edges indicate that strategy X dominates, or weakly dominates, strategy Y. Additional labels: push-media, pull, hy-MuM-cell, hy-meta-cell; 1 (media), 3a (query)-WiFi, 6b (meta,query)-cell, 7b (media,meta,query)-cell.]
[Figure: simulations. Answer throughput (0 to 10) versus penetration ratio (1%, 12.5%, 25%, 37.5%, 50%) for strategies 1, 3a (query)-WiFi, and 7b (media,meta,query)-cell.]
Data Compression -- Motivation
– Tracking the movements of all vehicles in the USA needs approximately 4TB/day (GPS receivers sample a point every two seconds).
Trajectory Lossy-Compression
• approximate a trajectory by another which is not farther than ε.
Desiderata for Trajectory Compression
• bounded error when answering queries on compressed trajectories.
Relational-Oriented Queries
• Point queries:
  – Where (T,t): where is the moving object with trajectory T at time t
  – When (T,x,y): when is the moving object with trajectory T at location (x,y)
• Range queries (R,t1,t2,O): retrieve the moving objects (i.e. trajectories) of O that are in region R between times t1 and t2.
• Nearest neighbor (t,T,O): retrieve the object of O that is closest to trajectory T at time t
• Join queries (O,d): retrieve the pairs of objects of O that are within distance d.
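The two point queries can be sketched for a trajectory stored as a list of (x, y, t) samples, assuming linear motion between consecutive samples and strictly increasing timestamps. The function names and the tolerance parameter `tol` are illustrative assumptions.

```python
from bisect import bisect_right
from math import hypot

def where_at(traj, t):
    """Where(T, t): interpolated (x, y) at time t, or None outside T's time span."""
    times = [p[2] for p in traj]
    if not times[0] <= t <= times[-1]:
        return None
    i = min(bisect_right(times, t), len(traj) - 1)
    (x0, y0, t0), (x1, y1, t1) = traj[i - 1], traj[i]
    r = (t - t0) / (t1 - t0)
    return (x0 + r * (x1 - x0), y0 + r * (y1 - y0))

def when_at(traj, x, y, tol=1e-9):
    """When(T, x, y): times at which the object passes within tol of (x, y).

    Returns one time per segment that comes within tol, so a point on a
    shared vertex is reported once for each adjacent segment.
    """
    hits = []
    for (x0, y0, t0), (x1, y1, t1) in zip(traj, traj[1:]):
        dx, dy = x1 - x0, y1 - y0
        seg2 = dx * dx + dy * dy
        # Fraction along the segment of the point closest to (x, y).
        r = 0.0 if seg2 == 0 else max(0.0, min(1.0, ((x - x0) * dx + (y - y0) * dy) / seg2))
        if hypot(x - (x0 + r * dx), y - (y0 + r * dy)) <= tol:
            hits.append(t0 + r * (t1 - t0))
    return hits
```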
Distance Functions
• The distance functions considered are:
  – E3: 3D Euclidean distance.
  – E2: Euclidean distance on the 2D projection of a trajectory.
  – Eu: the Euclidean distance between two trajectory points with the same time.
  – Et: the time distance between two trajectory points with the same location or closest Euclidean distance.
• #(T'2) ≤ #(T'3) ≤ #(T'u), which is also verified by experimental saving comparison.
Soundness of Distance Functions
• Soundness: bound on the error when answering spatio-temporal queries on compressed trajectories.
• The appropriate distance function depends on the type of queries expected on the database of compressed trajectories. – If all spatio-temporal queries are expected, then Eu and Et should be used.
– If only where_at, intersect, and nearest_neighbor queries are expected, then the Eu distance should be used.
      Where_at  When_at  Intersect  Nearest_Neighbor  Spatial Join
E2    No        No       No         No                Sound when (a) the distance function D of the join is a metric and (b) E is weaker than D.
E3    No        No       No         No
Eu    Yes       No       Yes        Yes
Et    No        Yes      No         No
Aging of Trajectories
• Increase the tolerance ε as time progresses
• Aging friendliness property: if ε1 ≤ ε2 then
T' = Comp(Comp(T, ε1), ε2) = Comp(T, ε2)
(associative)
Theorem: The DP algorithm is aging-friendly, whereas the optimal algorithm is not.
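The aging-friendliness property can be checked on a small example. The sketch below assumes that DP refers to Douglas-Peucker line simplification and uses the time-synchronized distance Eu as the error measure; the function names and the sample trajectory are illustrative, and timestamps are assumed strictly increasing.

```python
from math import hypot

def sync_distance(p, a, b):
    """Eu: distance between p and the position on segment a-b at p's timestamp."""
    (xa, ya, ta), (xb, yb, tb), (x, y, t) = a, b, p
    r = (t - ta) / (tb - ta)
    return hypot(x - (xa + r * (xb - xa)), y - (ya + r * (yb - ya)))

def dp_compress(traj, eps):
    """Douglas-Peucker: keep the endpoints, split at the farthest point if it exceeds eps."""
    if len(traj) <= 2:
        return list(traj)
    a, b = traj[0], traj[-1]
    dists = [sync_distance(p, a, b) for p in traj[1:-1]]
    worst = max(range(len(dists)), key=dists.__getitem__)  # first maximum on ties
    if dists[worst] <= eps:
        return [a, b]
    split = worst + 1
    left = dp_compress(traj[:split + 1], eps)
    right = dp_compress(traj[split:], eps)
    return left[:-1] + right  # drop the duplicated split point

# Hypothetical trajectory of (x, y, t) points.
T = [(0, 0, 0), (1, 0.5, 1), (2, 3, 2), (3, 0.4, 3), (4, 0, 4)]
aged = dp_compress(dp_compress(T, 1.05), 2.0)   # compress, then age
direct = dp_compress(T, 2.0)                    # compress once with the larger tolerance
print(aged == direct)
```

Here the two-stage result and the direct result coincide, which is what the property requires for ε1 = 1.05 and ε2 = 2.0.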
Matching Methods ---- Straightforward Snapping
[Figure: two snapping examples with road segments A, B and GPS points a, b]
• A, B: road segments
• a, b: GPS points
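Straightforward snapping can be sketched as assigning each GPS point to its closest road segment. The sketch assumes planar coordinates and straight segments; the function names and the segment dictionary layout are hypothetical.

```python
from math import hypot

def point_segment_distance(p, a, b):
    """Distance from point p to the straight segment a-b (all 2D tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg2 = dx * dx + dy * dy
    # Clamp the projection so the closest point stays on the segment.
    r = 0.0 if seg2 == 0 else max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg2))
    return hypot(px - (ax + r * dx), py - (ay + r * dy))

def snap(points, segments):
    """Map each GPS point to the id of its nearest segment.

    segments: {segment_id: ((x1, y1), (x2, y2))}
    """
    return [
        min(segments, key=lambda s: point_segment_distance(p, *segments[s]))
        for p in points
    ]
```

Because every point is matched independently, a noisy point near an intersection can be assigned to the wrong block, which is the weakness that route-level matching tries to address.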
Weight-based Matching
• Compute the weight of each road segment (block)
• Compute the minimum-weight path between the start and the end GPS points; this path is taken as the route of the moving object
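[Figure: in (x, y, t) space, a trajectory through (p1, t1), ..., (p8, t8) and an arc through (b1, t'1), ..., (b5, t'5), with times t2 and t6 marked; labels: trajectory, arc, polyline]
$$W = \frac{\int_{t_i}^{t_j} \left( g_{traj}(t) - g_{arc}(t) \right)\,dt}{|t_j - t_i|}$$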
Matching Variants
• Offline
  – Find the overall route of a vehicle after the trip is over
• Online snapping
  – Real time, e.g. every 2 minutes (online frequency)
  – Determine the road segment on which the vehicle is currently located
Experiments ---- Offline
• Evaluation method
  – Edit distance (ed): the smallest number of insertions, deletions, and substitutions required to change the snapped route to the correct route
  – Correct matching percentage (OFFcorrect): OFFcorrect = 100(1 – ed/n)
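The evaluation measure can be sketched as a standard edit distance over sequences of road-segment ids. The slides do not define n; the sketch assumes it is the number of segments in the correct route, and the function names are illustrative.

```python
def edit_distance(a, b):
    """Smallest number of insertions, deletions, and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,             # delete x
                cur[j - 1] + 1,          # insert y
                prev[j - 1] + (x != y),  # substitute, or keep if equal
            ))
        prev = cur
    return prev[-1]

def off_correct(matched_route, correct_route):
    """OFFcorrect = 100(1 - ed/n), with n assumed to be len(correct_route)."""
    n = len(correct_route)
    return 100 * (1 - edit_distance(matched_route, correct_route) / n)
```

For instance, a matched route that gets one block wrong out of four yields ed = 1 and OFFcorrect = 75.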
Results
– On average, the weight-based algorithm is correct up to 94% of the time, depending on the GPS sampling interval.
– It is always superior to the straightforward closest-block snapping.
– Correct matching decreases significantly when GPS sampling intervals are larger than 120 seconds.
Basic element of a moving objects database: a trajectory
[Figure: a 3d-TRAJECTORY in (X, Y, Time) space and the corresponding 2d-ROUTE, with the present time marked. Future trajectory: motion plan. Past trajectory: GPS trace.]
Why are traditional databases inappropriate to manage trajectories?
SELECT o
FROM MOVING-OBJECTS
WHERE Sometime/Always(10,11)
inside (o, R)
Retrieve the objects that are in R sometime/always between 10 and 11am
[Figure: region R and trajectories between times 10 and 11, illustrating the sometime and always cases]
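Because a trajectory is continuous, the sometime/always predicates cannot be answered reliably by testing only the stored samples. A minimal sketch, assuming a piecewise-linear trajectory of (x, y, t) points with strictly increasing timestamps and an axis-aligned rectangle R given as (xmin, ymin, xmax, ymax), computes the exact time intervals spent inside R. The function names are illustrative.

```python
def inside_intervals(traj, rect):
    """Time intervals, in time order, during which the object is inside rect."""
    xmin, ymin, xmax, ymax = rect
    out = []
    for (x0, y0, t0), (x1, y1, t1) in zip(traj, traj[1:]):
        lo, hi = 0.0, 1.0  # fraction of this segment that lies inside rect
        for p0, p1, pmin, pmax in ((x0, x1, xmin, xmax), (y0, y1, ymin, ymax)):
            d = p1 - p0
            if d == 0:
                if not pmin <= p0 <= pmax:
                    lo, hi = 1.0, 0.0  # constant coordinate outside the range
            else:
                a, b = (pmin - p0) / d, (pmax - p0) / d
                lo, hi = max(lo, min(a, b)), min(hi, max(a, b))
        if lo <= hi:
            out.append((t0 + lo * (t1 - t0), t0 + hi * (t1 - t0)))
    return out

def sometime_inside(traj, rect, t1, t2):
    """True if the object is inside rect at some time in [t1, t2]."""
    return any(s <= t2 and e >= t1 for s, e in inside_intervals(traj, rect))

def always_inside(traj, rect, t1, t2):
    """True if the object is inside rect throughout [t1, t2]."""
    covered = None
    for s, e in inside_intervals(traj, rect):
        if covered is None:
            if s <= t1 <= e:
                covered = e          # first interval that contains t1
        elif s <= covered:
            covered = max(covered, e)  # contiguous with what is covered so far
        else:
            break                    # gap: the object left rect
    return covered is not None and covered >= t2
```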
Why are traditional databases inappropriate to manage trajectories?
• Discrete vs. Continuous data
• Operators of the language that are natural in the domain
• Uncertainty
Uncertainty operators in spatial range queries
Possibly and definitely semantics based on branching time
SELECT o
FROM MOVING-OBJECTS
WHERE Possibly/Definitely Inside (o, R)
[Figure: region R and uncertainty intervals, illustrating the possibly and definitely cases]
Uncertain trajectory model
Possible Motion Curve (PMC) and Trajectory Volume (TV)
• PMC is a continuous function from Time to 2D
• TV is the boundary of the set of all the PMCs (resembles a slanted cylinder)
Predicates in spatial range queries
Possibly – there exists a possible motion curve
Definitely -- for all possible motion curves
• possibly-sometime = sometime-possibly
• possibly-always
• always-possibly
• definitely-always = always-definitely
• definitely-sometime
• sometime-definitely
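For a single time instant, the two quantifiers can be sketched with a simple uncertainty model. The sketch assumes that the object's possible positions at that instant form a disk of a given radius around its database location and that R is an axis-aligned rectangle (xmin, ymin, xmax, ymax); the disk model and the function names are assumptions, not the trajectory-volume model itself.

```python
from math import hypot

def possibly_inside(center, radius, rect):
    """There exists a possible position inside rect: the disk intersects rect."""
    cx, cy = center
    xmin, ymin, xmax, ymax = rect
    # Closest point of the rectangle to the disk center.
    nx = min(max(cx, xmin), xmax)
    ny = min(max(cy, ymin), ymax)
    return hypot(cx - nx, cy - ny) <= radius

def definitely_inside(center, radius, rect):
    """All possible positions are inside rect: the disk is contained in rect."""
    cx, cy = center
    xmin, ymin, xmax, ymax = rect
    return (xmin <= cx - radius and cx + radius <= xmax
            and ymin <= cy - radius and cy + radius <= ymax)
```

Definitely implies possibly under this model, since a disk contained in R also intersects it.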
Uncertainty in Language - Quantitative Approach
[Figure: probability density function over the uncertainty interval around the database location]
Probabilistic Range Queries
SELECT o
FROM MOVING-OBJECTS
WHERE Inside(o, R)
[Figure: region R]
Answer: (RWW850, 0.58), (ACW930, 0.75)
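A probabilistic range query returns each qualifying object together with the probability that it is inside R. The sketch below assumes a one-dimensional setting along a route, a uniform density over each object's uncertainty interval, and a region given as an interval on the same route; the uniform density, the object ids in any example, and the function names are assumptions.

```python
def prob_inside(interval, region):
    """P(object in region) under a uniform pdf over its uncertainty interval."""
    lo, hi = interval
    a, b = region
    overlap = max(0.0, min(hi, b) - max(lo, a))
    return overlap / (hi - lo)

def probabilistic_range_query(objects, region):
    """Return (object_id, probability) pairs with non-zero probability.

    objects: {object_id: (lo, hi)} uncertainty intervals along the route.
    """
    answer = []
    for oid, interval in objects.items():
        p = prob_inside(interval, region)
        if p > 0:
            answer.append((oid, p))
    return answer
```

With a non-uniform density, `prob_inside` would instead integrate the density over the overlap.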
Outline
• Databases with spatial, temporal, and uncertainty capabilities for tracking, analysis, and routing;
• compression of spatial-temporal data;
• query and dissemination of (possibly multimedia) information in vehicular and other peer-to-peer networks;
• extracting semantic locations and activity knowledge from GPS traces;
• map matching.
Adapt Uncertainty to Update frequency
• Tradeoff: precision vs. resource consumption
• Cost-based approach (1 update = 2 units of imprecision)
• Dynamic cost minimization
Information-Cost of a trip
Components:
• Cost-of-location-update
• Cost-of-imprecision
  – Cost-of-deviation
  – Cost-of-uncertainty
  – Each is proportional to the length of the period of time for which it persists
[Figure: current location = 15 ± 5. Database location 15, uncertainty interval from 10 to 20 (uncertainty = 10), actual location 14 (deviation = 1).]
Outline
• Databases with spatial, temporal, and uncertainty capabilities for tracking, analysis, and routing;
• compression of spatial-temporal data;
• databases in vehicular and other peer-to-peer networks;
• extracting semantic locations from GPS traces;
• map matching.
Example queries
• Find a multimodal route that will get me home by 7pm with 90% certainty.
• Find a route that will get me home by 7pm with 90% certainty, and lets me stop at a grocery store for 30 minutes
Example Graph
ALL_TRIPS
ALL_TRIPS( origin-vertex, destination-vertex)
Returns a non-materialized relation of all trips (sequences of vertices) between the origin and destination
General Query Structure
SELECT *
FROM ALL_TRIPS(origin, destination)
WHERE
<WITH STOP VERTICES> (florist, grocery)
<WITH MODES> (Bus, boat)
<WITH CERTAINTY> (0.8)
<OPTIMIZE> (time, distance, cost, #transfers, …)
Example Query
SELECT *
FROM ALL_TRIPS(work, home) AS t
WITH STOP_VERTICES v1, v2
WITH CERTAINTY .75
WHERE "pharmacy" IN v1.facilities
  AND "florist" IN v2.facilities
  AND DURATION(v1) > 10min
  AND DURATION(v2) > 10min
  AND MODES(t) contained-in {pedestrian, rail, bus}
MINIMIZE number-of-transfers
With a certainty greater than or equal to .75, find a trip home from work that uses public transportation and visits a pharmacy and then a florist (spending at least 10 minutes at each) and has minimum number of transfers
Query Semantics
From the set of trips that satisfy:
– the non-temporal constraints, and
– the temporal constraints with the required certainty (remember probabilistic travel times)
Select the optimal (according to a single criterion)
Semantics
Select *
From All_Trips (work, home) as t
WITH STOP-VERTICES v1
WHERE pharmacy in v1.facilities,
  and modes(t) contained-in {train, bus},
  and begin(t) > 8pm,
  and arrive(t) < 10pm,
  and duration(v1) > 10mins
WITH CERTAINTY 0.9
MINIMIZE NUMBER-OF-TRANSFERS
For each trip from work to home create a mapping from v1 to vertices of t:
t1…. (t1,map1) map1: v1 -> UnionStation
t1…. (t1,map2) map2: v1 -> CentralStation
t2…. (t2,map1) map1: …....
For each (ti, mapj) evaluate WHERE condition and if satisfied with CERTAINTY > 0.9 put pair in RESULT.
From RESULT return the pair that MINIMIZES the number of transfers.
Evaluation of WHERE condition W on (ti,mapj)
• Evaluate non-temporal conditions and if W = ‘true’ or ‘false’ , then done.
• Otherwise split trip into legs: L1, v1, L2
• L1 has departure y1 and duration z1
• L2 has departure y2 and duration z2
• y1 > 8pm, y2+z2 < 10pm, y2-y1-z1 > 10mins defines a region S in R^4.
• Assume that we know the joint density function f(y1,z1,y2,z2).
• Then we compute the probability of W as the integral ∫S f(y1,z1,y2,z2)dy1dz1dy2dz2
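When the integral over S has no closed form, it can be approximated by sampling. The sketch below is a Monte Carlo estimate, assuming times are expressed in minutes since midnight (8pm = 1200, 10pm = 1320) and that a caller-supplied `sample` function draws (y1, z1, y2, z2) from the joint density f. The sampler, its distributions, and the function names are hypothetical.

```python
import random

def prob_where(sample, n=10_000):
    """Estimate P((y1, z1, y2, z2) in S) as the fraction of samples inside S."""
    hits = 0
    for _ in range(n):
        y1, z1, y2, z2 = sample()
        # Region S from the slide: leave after 8pm, arrive before 10pm,
        # and spend more than 10 minutes at the stop vertex.
        if y1 > 1200 and y2 + z2 < 1320 and y2 - y1 - z1 > 10:
            hits += 1
    return hits / n

def make_sampler(seed=0):
    """Hypothetical joint density: fixed departure, Gaussian leg durations."""
    rng = random.Random(seed)
    def sample():
        y1 = 1205.0               # leave work at 8:05pm
        z1 = rng.gauss(30, 5)     # duration of leg L1
        y2 = y1 + z1 + 15         # 15-minute stop at v1
        z2 = rng.gauss(40, 10)    # duration of leg L2
        return y1, z1, y2, z2
    return sample

p = prob_where(make_sampler())
satisfied = p >= 0.9  # compare against the required certainty
```

The estimate's accuracy improves with the number of samples n; the comparison against 0.9 mirrors the WITH CERTAINTY clause.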
Plug-and-play Query Processing
• Based on a framework– Algorithms are chosen based on the structure of the
query
SELECT *
FROM ALL_TRIPS(source, dest) AS t
WITH STOP VERTICES is empty
WHERE number-of-transfers (t) < k
OPTIMIZE is the minimization of the sum of some numeric edge attribute (e.g., length, duration)
Can be solved with
A. Lozano and G. Storchi. Shortest viable path algorithm in multimodal networks. In Transportation Research Part A: Policy and Practice, volume 35, pages 225–241, March 2001.
Conclusion
• Abstraction of concepts from sensor data: extracting semantic locations from GPS traces.
• Coping with imprecision and uncertainty: map matching.
• Mixed environments: information in vehicular and other peer-to-peer networks.
• Managing spatial-temporal data: compression.
• Software tools: databases with
  – spatial, temporal, and uncertainty capabilities
  – for tracking, analysis, and routing
Ongoing work
• Autonomous driving
  – Grand Cooperative-Driving Challenge
  – High precision maps
• Database platform for IntelliDrive applications (NSF grant)
• Competitive routing