Toward Community Sensing Andreas Krause Carnegie Mellon University Joint work with Eric Horvitz, Aman Kansal, Feng Zhao Microsoft Research Information Processing in Sensor Networks | April 24, 2008


Toward Community Sensing

Andreas Krause Carnegie Mellon University

Joint work with Eric Horvitz, Aman Kansal, Feng Zhao
Microsoft Research

Information Processing in Sensor Networks | April 24, 2008

Page 2

Motivation: Traffic monitoring

Deployed sensors, high-accuracy speed data:
Detector loops
Traffic cameras

What about 148th Ave?

How can we get accurate road speed estimates everywhere?

Page 3

Cars as traffic sensors

Many cars have Personal Navigation Devices (PNDs)
Know exact location and speed!
Fuse GPS, map information, engine speed, …

Modern PNDs have a network connection
Can use cars as speed sensors!

Example: Dash Express (GPS + GPRS/WiFi)

Page 4

Community Sensing Vision

Realize the full potential of population-owned sensors
Must respect privacy and preferences about sharing!

[Diagram labels: Privately held sensors; Contribute sensor data; SenseWeb; Request data; Common goal; Estimate spatial phenomenon (traffic, weather, …); Construct 3D cities; News coverage]

Page 5

Privacy concern of GPS traces

Dense GPS traces make it possible to identify people's locations, activities, intents, etc.
Even anonymization or strong obfuscation doesn't help.
Key idea: Avoid dense sampling!
Need to predict from sparse samples

Images courtesy of John Krumm

Page 6

Phenomenon modeling

[Figure: road network with segments s1 to s12; s1, s3, s9 and s12 highlighted]

(Normalized) speeds as random variables
Joint distribution allows modeling correlations
Can predict unmonitored speeds from monitored speeds using $P(S_5 \mid S_1, S_9)$

Which segments should we monitor?
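As one concrete instance of such a prediction, suppose the speeds are jointly Gaussian (the model the talk adopts later for the traffic data). The mean vector $\mu$ and covariance matrix $\Sigma$ below are notation introduced here for illustration, not symbols from the slides. With monitored set $A = \{1, 9\}$ and observed values $s_A$, conditioning gives:

```latex
\begin{aligned}
\mathbb{E}[S_5 \mid S_A = s_A] &= \mu_5 + \Sigma_{5A}\,\Sigma_{AA}^{-1}\,(s_A - \mu_A),\\
\mathrm{Var}(S_5 \mid S_A = s_A) &= \Sigma_{55} - \Sigma_{5A}\,\Sigma_{AA}^{-1}\,\Sigma_{A5}.
\end{aligned}
```

Under this Gaussian assumption the conditional variance does not depend on the observed values $s_A$, only on which segments are in $A$.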

Page 7

Minimizing uncertainty

[Figure: density P(S5 | sA) on [0, 1] for A = {S1, S2, S3, S4}, shown for two sets of observed values, (s1=.9, s2=1, s3=1, s4=1) and (s1=.5, s2=.6, s3=.8, s4=.6), with annotated variances Var(S5 | sA) = .01 and Var(S5 | sA) = .1]

Can estimate prediction error at segment Si:
$\mathrm{Var}(S_i \mid S_A = s_A)$

Expected error at segment Si: $\mathrm{Var}(S_i \mid S_A)$
Example: Var(S5 | SA) = .08, Var(S6 | SA) = .1, Var(S7 | SA) = .3

Expected mean squared error:
$\mathrm{EMSE}(A) = \sum_i \mathrm{Var}(S_i \mid S_A)$ (in the example, .08 + .1 + .3)

$A^* = \arg\min_{|A| \le k} \mathrm{EMSE}(A)$

Does not take "importance" of Si into account (frequently travelled vs. less travelled segments)

Page 8

Taking demand into account

Model demand Di as random variables (e.g., Poisson)
E.g., Di = #cars on segment Si

Demand-weighted MSE: $\mathrm{DMSE}(A) = \sum_i \mathbb{E}[D_i]\,\mathrm{Var}(S_i \mid S_A)$

Error reduction: $R(A) = \mathrm{DMSE}(\emptyset) - \mathrm{DMSE}(A)$

Want: $A^* = \arg\max_{|A| \le k} R(A)$

NP-hard optimization problem

[Figure: road network s1 to s7 with Var(S5 | SA) = .08, Var(S6 | SA) = .1, Var(S7 | SA) = .3 and demands D5 = 50, D6 = 10, D7 = 200, so DMSE(A) = 50 · .08 + 10 · .1 + 200 · .3]
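The slide's example can be worked through numerically. The sketch below uses only the variances and demands shown on the slide; the dictionary and function names are illustrative, and pairing each demand D_i with the variance of the same segment S_i is assumed from the figure layout.

```python
# Numbers read off the slide's example; names are illustrative only.
expected_variance = {"S5": 0.08, "S6": 0.1, "S7": 0.3}   # Var(S_i | S_A)
expected_demand = {"S5": 50, "S6": 10, "S7": 200}        # E[D_i]


def emse(variance):
    """Unweighted expected mean squared error: sum_i Var(S_i | S_A)."""
    return sum(variance.values())


def dmse(variance, demand):
    """Demand-weighted MSE: sum_i E[D_i] * Var(S_i | S_A)."""
    return sum(demand[s] * variance[s] for s in variance)


print(f"EMSE = {emse(expected_variance):.2f}")                    # EMSE = 0.48
print(f"DMSE = {dmse(expected_variance, expected_demand):.1f}")   # DMSE = 65.0
```

With unit weights S7 accounts for .3 of .48; with demand weights it accounts for 60 of 65, which is why demand changes which segments are worth monitoring.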

Page 9

Selecting informative locations

Greedy algorithm:
  A ← ∅
  For i = 1:k do
    s* = argmax_s R(A ∪ {s})
    A ← A ∪ {s*}

How well does this heuristic do?

[Figure: road network s1 to s12; s2, s11, s7 and s10 highlighted]
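The greedy loop above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: `toy_utility` is a hypothetical stand-in for R(A) (a demand-weighted coverage function), and the `COVERS` and `DEMAND` tables are made up.

```python
from typing import Callable, FrozenSet, List, Sequence


def greedy_select(candidates: Sequence[str],
                  utility: Callable[[FrozenSet[str]], float],
                  k: int) -> List[str]:
    """Pick up to k candidates, each time adding the s that maximizes R(A ∪ {s})."""
    selected: List[str] = []
    remaining = list(candidates)
    for _ in range(min(k, len(remaining))):
        current = frozenset(selected)
        best = max(remaining, key=lambda s: utility(current | {s}))
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical data: which segments each observation "explains", and demand.
COVERS = {"s1": {"s1", "s2"}, "s2": {"s1", "s2", "s3"},
          "s3": {"s2", "s3", "s4"}, "s4": {"s3", "s4"}}
DEMAND = {"s1": 1.0, "s2": 5.0, "s3": 2.0, "s4": 10.0}


def toy_utility(A: FrozenSet[str]) -> float:
    """Stand-in for R(A): total demand on segments covered by the selection."""
    covered = set()
    for s in A:
        covered |= COVERS[s]
    return sum(DEMAND[s] for s in covered)


picks = greedy_select(list(COVERS), toy_utility, k=2)
print(picks, toy_utility(frozenset(picks)))
```

Each round evaluates the utility once per remaining candidate, so k rounds over n candidates cost about k·n utility evaluations.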

Page 10

Diminishing returns

[Figure: two copies of the road network s1 to s11. Selection A: observing a new location s' gives a large improvement. Selection B, with A ⊆ B: observing s' gives a small improvement.]

Selection A: adding s' helps a lot!
Selection B: adding s' doesn't help much

Submodularity:
For $A \subseteq B$: $F(A \cup \{s'\}) - F(A) \ge F(B \cup \{s'\}) - F(B)$

Utility R(A) is submodular*!

*See store for details
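The diminishing-returns inequality can be checked by brute force on a small ground set. The sketch below is illustrative only: `coverage` is a hypothetical set function known to be submodular, and `squared_size` is a deliberate counterexample whose marginal gains grow instead of shrink.

```python
from itertools import combinations


def is_submodular(ground, F, tol=1e-12):
    """Check F(A ∪ {s}) - F(A) >= F(B ∪ {s}) - F(B) for all A ⊆ B, s ∉ B."""
    items = list(ground)
    subsets = [frozenset(c) for r in range(len(items) + 1)
               for c in combinations(items, r)]
    for A in subsets:
        for B in subsets:
            if not A <= B:
                continue
            for s in items:
                if s in B:
                    continue
                if F(A | {s}) - F(A) < F(B | {s}) - F(B) - tol:
                    return False
    return True


# Hypothetical example: each location "covers" some numbered road pieces.
COVERS = {"s1": {1, 2}, "s2": {2, 3}, "s3": {3, 4}}


def coverage(A):
    covered = set()
    for s in A:
        covered |= COVERS[s]
    return float(len(covered))


def squared_size(A):
    return float(len(A) ** 2)


print(is_submodular(COVERS, coverage))      # True
print(is_submodular(COVERS, squared_size))  # False
```

The check enumerates all pairs of subsets, so it is only practical for a handful of elements; it is a sanity test, not a proof technique.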

Page 11

Why is submodularity useful?

Theorem [Nemhauser et al. '78]: The greedy algorithm gives a constant-factor approximation
$F(A_{\text{greedy}}) \ge (1 - 1/e)\, F(A_{\text{opt}})$, where $1 - 1/e \approx 63\%$

Greedy algorithm gives a near-optimal set of locations to observe

We have no control over where the sensors (cars, cell phones) are going to be!
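To make the bound concrete, the sketch below compares greedy selection against exhaustive search on a tiny, made-up coverage instance where greedy is not optimal. The sets `X1` to `X3` and the function names are hypothetical; coverage is used because it is a standard monotone submodular function.

```python
import math
from itertools import combinations

# Hypothetical instance: three candidate locations and the items each covers.
SETS = {"X1": {1, 2, 3, 4}, "X2": {1, 2, 5}, "X3": {3, 4, 6}}


def coverage(chosen):
    """Number of distinct items covered by the chosen locations."""
    covered = set()
    for name in chosen:
        covered |= SETS[name]
    return len(covered)


def greedy(k):
    chosen = []
    for _ in range(k):
        rest = [n for n in SETS if n not in chosen]
        chosen.append(max(rest, key=lambda n: coverage(chosen + [n])))
    return chosen


def brute_force(k):
    """Exhaustive search over all size-k selections (tiny instances only)."""
    return max(combinations(SETS, k), key=coverage)


k = 2
greedy_value = coverage(greedy(k))          # greedy takes X1 first, ends at 5
optimal_value = coverage(brute_force(k))    # X2 and X3 together cover 6
bound = (1 - 1 / math.e) * optimal_value
print(greedy_value, optimal_value, round(bound, 2))  # 5 6 3.79
```

Greedy reaches 5 of the optimal 6 here, comfortably above the guaranteed 63%; the theorem is a worst-case statement.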

Page 12

Querying a roving sensor

[Figure: road network s1 to s7. Query! Response: "I'm at S2, going 55 mph" (s2 = .9). Query! No response (no data).]

How can we cope with uncertain sensor availability?

Page 13

Modeling sensor availability

Set W of observations (cars) we can select from
If we select car Cj, we observe Si with probability P(i | Cj)

Road segments: $V = \{S_1, \dots, S_n\}$
Observations: $W = \{C_1, \dots, C_m\}$
Pick $B \subseteq W$
Random $A \subseteq V$ from $P(A \mid B)$
Utility R(A)

[Figure: road network s1 to s7 with cars C1, C2, C3; s1 and s7 highlighted]

Goal: Maximize expected utility:
$B^* = \arg\max_{|B| \le k} \sum_A P(A \mid B)\, R(A)$
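For a very small instance the expected utility can be computed exactly by enumerating outcomes. The sketch below is illustrative and rests on assumptions not stated on the slide: cars respond independently, each car reports at most one segment, and the probability tables `P_SEG` and per-segment values `VALUE` are made up.

```python
from itertools import product
from typing import Dict, FrozenSet, Sequence

# Hypothetical availability model: P_SEG[car][segment] = P(segment | car).
# Any leftover probability mass means the car gives no response.
P_SEG: Dict[str, Dict[str, float]] = {
    "C1": {"s1": 0.5, "s2": 0.3},
    "C2": {"s2": 0.6},
    "C3": {"s7": 0.9},
}
# Hypothetical utility: each distinct observed segment contributes a fixed value.
VALUE = {"s1": 4.0, "s2": 1.0, "s7": 2.0}


def utility(observed: FrozenSet[str]) -> float:
    return sum(VALUE[s] for s in observed)


def expected_utility(cars: Sequence[str]) -> float:
    """F(B) = sum_A P(A | B) R(A), assuming cars respond independently."""
    per_car = []
    for car in cars:
        outcomes = list(P_SEG[car].items())
        outcomes.append((None, 1.0 - sum(P_SEG[car].values())))  # no response
        per_car.append(outcomes)
    total = 0.0
    for combo in product(*per_car):      # one outcome per selected car
        prob = 1.0
        observed = set()
        for segment, p in combo:
            prob *= p
            if segment is not None:
                observed.add(segment)
        total += prob * utility(frozenset(observed))
    return total


print(round(expected_utility(["C1"]), 2))        # 2.3
print(round(expected_utility(["C1", "C2"]), 2))  # 2.72
```

Adding C2 to C1 raises the expected utility by only 0.42, less than the 0.6 C2 would contribute alone, because both cars may report s2. The enumeration grows exponentially with the number of selected cars, so it is only usable on toy inputs.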

Page 14

Optimizing community sensing

Lemma: Whenever R(A) is submodular, the function
$F(B) = \sum_{A:\,|A| \le k} P(A \mid B)\, R(A)$
is submodular

Can use the greedy algorithm to optimize the selection
F(B) is a sum over exponentially many terms

Theorem: For any $\varepsilon > 0$ and $\delta > 0$, we can find a set B' such that
$F(B') \ge (1 - 1/e) \max_{|B| \le k} F(B) - \varepsilon$
with probability $1 - \delta$, using independent samples of R(A)
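The sampling idea can be sketched as follows: estimate F(B) by averaging R(A) over random draws of A, and run greedy on those estimates. This is a minimal illustration under stated assumptions, not the procedure analyzed in the theorem: cars are assumed independent, each reports at most one segment, the tables `P_SEG` and `VALUE` are made up, and the sample counts are arbitrary.

```python
import random

# Hypothetical availability model and per-segment values (illustration only).
P_SEG = {
    "C1": {"s1": 0.5, "s2": 0.3},
    "C2": {"s2": 0.6},
    "C3": {"s7": 0.9},
}
VALUE = {"s1": 4.0, "s2": 1.0, "s7": 2.0}


def utility(observed):
    return sum(VALUE[s] for s in observed)


def sample_observed(cars, rng):
    """Draw one random set A of observed segments for the selected cars."""
    observed = set()
    for car in cars:
        u = rng.random()
        cumulative = 0.0
        for segment, p in P_SEG[car].items():
            cumulative += p
            if u < cumulative:
                observed.add(segment)
                break                     # falling through means "no response"
    return frozenset(observed)


def estimate_F(cars, n_samples, rng):
    """Monte Carlo estimate of F(B) = E[R(A) | B]."""
    total = 0.0
    for _ in range(n_samples):
        total += utility(sample_observed(cars, rng))
    return total / n_samples


def greedy_on_samples(k, n_samples=2000, seed=0):
    """Greedy selection driven by sampled estimates of F."""
    rng = random.Random(seed)
    chosen = []
    for _ in range(k):
        rest = [c for c in P_SEG if c not in chosen]
        chosen.append(max(rest, key=lambda c: estimate_F(chosen + [c], n_samples, rng)))
    return chosen


print(greedy_on_samples(2))
```

In this toy model the exact values are F({C1}) = 2.3 and F({C1, C3}) = 4.1, so the estimates only need to be accurate enough to rank the candidates correctly.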

Page 15

Handling user preferences

Need to respect user preferences:
"Sample my speed at most once per day"
"Don't measure my speed for the next hour"
"Never sample close to my home"
"Wait at least 10 minutes between samples"

Can accommodate preferences using constrained optimization:
$B^* = \arg\max_B F(B)$ subject to $C(B) \le L$
where C(B) is a complex cost function and L is the sensing budget

Can still get near-optimal solutions (details in paper)
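One simple way to respect a budget is a cost-benefit greedy rule: repeatedly add the affordable item with the best gain per unit cost. The sketch below is a generic heuristic for illustration, not necessarily the algorithm from the paper, and by itself it carries no approximation guarantee. It assumes an additive, strictly positive cost per car, which is simpler than the "complex cost function" the slide mentions; all data are hypothetical.

```python
def budgeted_greedy(items, utility, cost, budget):
    """Add the affordable item with the largest gain/cost ratio until none helps."""
    chosen, spent = [], 0.0
    remaining = list(items)
    while True:
        base = utility(frozenset(chosen))
        best, best_ratio = None, 0.0
        for item in remaining:
            if spent + cost[item] > budget:
                continue                      # would exceed the sensing budget L
            gain = utility(frozenset(chosen) | {item}) - base
            ratio = gain / cost[item]         # assumes cost[item] > 0
            if ratio > best_ratio:
                best, best_ratio = item, ratio
        if best is None:
            return chosen
        chosen.append(best)
        spent += cost[best]
        remaining.remove(best)


# Hypothetical data: segments each car is expected to cover, and query costs.
COVER = {"C1": {"s1", "s2"}, "C2": {"s2", "s3"}, "C3": {"s4"}}
COST = {"C1": 2.0, "C2": 1.0, "C3": 1.0}


def coverage(B):
    covered = set()
    for car in B:
        covered |= COVER[car]
    return float(len(covered))


print(budgeted_greedy(list(COVER), coverage, COST, budget=2.0))  # ['C2', 'C3']
```

A preference such as "at most once per day" could be expressed in this setup by making a car's cost exceed the remaining budget once it has been queried; that encoding is an assumption made here, not something the slides specify.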

Page 16

Community Sensing Summary

Optimize value of probing roving sensors

Utility (expected error reduction)
Demand (usage: "utilitarian" impact)

Sensor availability
  Predict location based on history

Preferences
  Abide by preferences
  E.g., frequency / number of probes, min. inter-probe interval
  Other constraints: e.g., "Not near my home!"

[Diagram labels: Phenomenon; Demand; Availability & Preferences]

Page 17

Phenomenon modeling

3 months of data from 534 segments across 7 highways and interstates near Seattle, WA
Samples at 15-minute intervals
Use a Gaussian Process to model road speeds (covariance function based on road network topology)
Can compute utility R(A) in closed form!
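The closed form follows from Gaussian conditioning: the posterior variance of S_i given the monitored set A is Σ_ii − Σ_iA Σ_AA⁻¹ Σ_Ai, independent of the observed values. The sketch below computes the demand-weighted error reduction from a covariance matrix. It is an illustration under assumptions: observations are treated as noise-free, the 3×3 covariance and the demands are made up, and a small pure-Python solver stands in for a linear-algebra library.

```python
def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def posterior_variance(cov, i, A):
    """Var(S_i | S_A) for a joint Gaussian with covariance matrix cov."""
    if not A:
        return cov[i][i]
    K_AA = [[cov[a][b] for b in A] for a in A]
    k_iA = [cov[i][a] for a in A]
    w = solve(K_AA, k_iA)                      # w = K_AA^{-1} k_Ai
    return cov[i][i] - sum(k * wi for k, wi in zip(k_iA, w))


def error_reduction(cov, demand, A):
    """R(A) = DMSE(empty set) - DMSE(A), with demand[i] playing the role of E[D_i]."""
    return sum(demand[i] * (cov[i][i] - posterior_variance(cov, i, A))
               for i in range(len(cov)))


# Hypothetical covariance over three segments and hypothetical demands.
COV = [[1.0, 0.8, 0.2],
       [0.8, 1.0, 0.5],
       [0.2, 0.5, 1.0]]
DEMAND = [50.0, 10.0, 200.0]

print(round(error_reduction(COV, DEMAND, [1]), 2))  # 92.0
print(round(error_reduction(COV, DEMAND, [2]), 2))  # 204.5
```

Monitoring the high-demand segment (index 2) yields more than twice the reduction of monitoring the better-correlated middle segment (index 1), which is the effect demand weighting is meant to capture.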

Page 18

Demand modeling

Demand = #cars on road segment
Estimate demand based on 3166 ClearFlow route requests

[Figure: expected demand (rush hour)]

Page 19

Evaluating model accuracy

[Figure: demand-weighted RMS error (lower is better) vs. number of locations, 0 to 50; curves: predicted error, test set error]

Accurate estimation of prediction error!

Page 20

Demand driven querying

[Figure: demand-weighted variance (lower is better) vs. number of observations, 0 to 50; curves: Random, Unit-weights, Demand-weights]

65% error reduction using only 10 (of 534) observations!
Optimized sensing requires 10x fewer samples!

Page 21

Availability modeling

Microsoft Multiperson Location Survey (MSMLS) [Krumm '06]
GPS traces from 85 drivers, 6+ days each
Associate GPS readings with road segments ("map matching")

Two models of sensor availability:
Spatial obfuscation
Sparse querying

[Photo: GPS used in MSMLS]

Page 22

Spatial obfuscation

Motivation: Privacy through enforcing uncertainty about sensor location

[Diagram: Community Sensing Service and population of sensors, with cell X marked]
Request road speed at some location in area X
Anonymized response from random car in cell X (if available)
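The request/response exchange can be sketched as a small function: the service names only a cell, and the reply comes from a randomly chosen car in that cell with no car identifier attached. Everything here is hypothetical (the `CARS` snapshot, the cell names, the response format); it only illustrates the protocol shape described on the slide.

```python
import random
from typing import Optional, Tuple

# Hypothetical snapshot: car id -> (current cell, road segment, normalized speed).
CARS = {
    "car-a": ("X", "s2", 0.9),
    "car-b": ("X", "s5", 0.4),
    "car-c": ("Y", "s7", 0.7),
}


def query_cell(cell: str, rng: random.Random) -> Optional[Tuple[str, float]]:
    """Return (segment, speed) from a random car in the cell, or None if empty.

    The car id is deliberately left out of the response.
    """
    in_cell = [(segment, speed)
               for (car_cell, segment, speed) in CARS.values()
               if car_cell == cell]
    if not in_cell:
        return None                  # no car available in this cell
    return rng.choice(in_cell)


rng = random.Random(0)
print(query_cell("Y", rng))  # ('s7', 0.7)
print(query_cell("Z", rng))  # None
```

The service learns which segment was measured only after the reply arrives, so larger cells mean more uncertainty about where a given car is.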

Page 23

Spatial obfuscation

[Figure: demand-weighted variance (lower is better) vs. number of observations, 0 to 50; curves: Random road, 13 cells, 53 cells, 146 cells, 449 cells, Optimize road]

Discretization ≈ Utility / Privacy knob
High accuracy even with coarse discretization

Page 24

Obfuscation by sparse querying

Associate roving sensors with anonymous ID
Learn availability model for each sensor from data

[Diagram: Community Sensing Service and population of sensors]
Request road speed and location from car Ci
Response from car Ci (if network connection available)

Page 25

Obfuscation by sparse monitoring

[Figure: demand-weighted variance (lower is better) vs. number of observations, 0 to 50; curves: Random road, Optimized road, Random user, Optimized user]

Biggest difference in "important" part of the curve
50% error reduction over mean if querying 10 "cars"

Page 26

Mobile vs. fixed sensors

When does it "pay off" to use mobile vs. fixed sensors?
Experiment: cost C(B) = #fixed(B) + #mobile(B)
Fixed budget L: $\max F(B)$ s.t. $C(B) \le L$

Mobile sensors pay off if fixed sensors are 4x as expensive

Page 27

Extensions / Future work

Spatio-temporal models (see paper)
How to quickly learn good models (see paper)

Other applications:
Population fitness?
News coverage?
Reconstruction of 3D cities?

Formal privacy guarantees?

Page 28

Related work

Travel time estimation using cell phones [Wunnava et al. '07]
Privacy-aware querying of cars with GPS & cell phones [Bayen et al. '08, forthcoming]
Spatial monitoring, experimental design etc. (see paper)

Page 29

Conclusions

Presented integrated approach to community sensing
Theoretical analysis: near-optimal sensing policies
Extensive empirical evaluation on traffic monitoring case study

[Diagram labels: Phenomenon; Demand; Availability & Preferences]