Modeling Count Data over Time Using Dynamic Bayesian Networks Jonathan Hutchins Advisors: Professor Ihler and Professor Smyth

Page 1

Modeling Count Data over Time Using Dynamic Bayesian Networks

Jonathan Hutchins

Advisors: Professor Ihler and Professor Smyth

Page 2

Optical People Counter at a Building Entrance

Loop Sensors on Southern California Freeways

Sensor Measurements Reflect Dynamic Human Activity

Page 3

Outline

• Introduction, problem description

• Probabilistic model

• Single sensor results

• Multiple sensor modeling

• Future Work

Page 4

Modeling Count Data

In a Poisson distribution: mean = variance = λ

[Figure: Poisson probability p(count | λ) plotted against count]
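The mean = variance property can be checked numerically. The sketch below is illustrative only: the function names `poisson_pmf` and `poisson_moments` and the rate λ = 10 are assumptions, not taken from the slides. It sums the Poisson probabilities over a truncated range of counts and compares the resulting mean and variance.

```python
import math


def poisson_pmf(count, lam):
    """p(count | lam) for a Poisson distribution, computed in log space."""
    if count < 0:
        return 0.0
    return math.exp(count * math.log(lam) - lam - math.lgamma(count + 1))


def poisson_moments(lam, max_count=200):
    """Mean and variance obtained by summing the pmf over 0..max_count."""
    probs = [poisson_pmf(c, lam) for c in range(max_count + 1)]
    mean = sum(c * p for c, p in enumerate(probs))
    var = sum((c - mean) ** 2 * p for c, p in enumerate(probs))
    return mean, var


# Hypothetical rate chosen for illustration.
mean, var = poisson_moments(10.0)
print(mean, var)  # both values should be very close to lam
```

Truncating at `max_count=200` is safe for small rates because the omitted tail probability is negligible; a larger rate would need a larger cutoff.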

Page 5

Simulated Data

[Figure: variance versus mean people count for simulated data, with the mean = variance line and 2 × std dev lines]

15 weeks, 336 time slots
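A rough sketch of how such a simulated Poisson test could be produced: draw 15 weekly counts for each of 336 time slots and compute the sample mean and variance per slot. The rate profile (1 to 20 per slot), the sampler `sample_poisson`, and the function name `mean_variance_per_slot` are assumptions made for illustration; the slides do not state how the simulated data were generated.

```python
import math
import random


def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for small rates."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def mean_variance_per_slot(num_weeks=15, num_slots=336, seed=0):
    """Return one (sample mean, sample variance) pair per time slot."""
    rng = random.Random(seed)
    # Hypothetical rate profile: rates spread evenly between 1 and 20.
    rates = [1.0 + 19.0 * s / (num_slots - 1) for s in range(num_slots)]
    points = []
    for lam in rates:
        counts = [sample_poisson(lam, rng) for _ in range(num_weeks)]
        m = sum(counts) / num_weeks
        v = sum((c - m) ** 2 for c in counts) / (num_weeks - 1)
        points.append((m, v))
    return points


points = mean_variance_per_slot()
ratio = sum(v for _, v in points) / sum(m for m, _ in points)
print(len(points), ratio)  # for Poisson data the pooled ratio should be near 1
```

With only 15 weeks per slot the individual variance estimates are noisy, which is why the plot needs the 2 × std dev lines around the mean = variance line.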

Page 6

Building Data

[Figure: "Poisson Test: All Building Data", variance versus mean people count for building data, with the mean = variance line and 2 × std dev lines]

Page 7

Freeway Data

[Figure: "Poisson Test: All Freeway Data", variance versus mean count for freeway data, with the mean = variance line and 2 × std dev lines]

Page 8

One Week of Freeway Observations

Page 9

[Figure, four panels of variance versus mean, each with the mean = variance line and 2 × std dev lines:
 Poisson Test: Freeway Data - Events Removed
 Poisson Test: Building Data - Events Removed
 Poisson Test: All Freeway Data
 Poisson Test: All Building Data]

Page 10

One Week of Freeway Data

[Figure: counts over one week, SUN through SAT, with BASEBALL GAME EVENTS marked]

Page 11

Detecting Unusual Events: Baseline Method

[Figure, two panels of car count versus time (6:00am to 6:00pm): "Ideal model" and "Baseline model"]

Unsupervised learning faces a "chicken and egg" dilemma
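The dilemma can be illustrated with a simple threshold detector. This is a minimal sketch under stated assumptions, not necessarily the baseline used in the talk: the normal rate for each time slot is taken as the average of past counts, and a count is flagged when its Poisson upper-tail probability falls below a threshold. The names `poisson_upper_tail` and `baseline_flags`, the 0.01 threshold, and the toy data are all hypothetical.

```python
import math


def poisson_upper_tail(count, lam):
    """P(X >= count) for X ~ Poisson(lam)."""
    below = sum(
        math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))
        for k in range(count)
    )
    return max(0.0, 1.0 - below)


def baseline_flags(history, today, tail_prob=0.01):
    """Flag slots of `today` that look unusually high given `history`.

    history: list of past days, each a list of counts per time slot.
    """
    flags = []
    for slot, observed in enumerate(today):
        # Rate estimate uses ALL past days, including any that held events.
        lam = max(sum(day[slot] for day in history) / len(history), 1e-9)
        flags.append(poisson_upper_tail(observed, lam) < tail_prob)
    return flags


clean = [[10, 12, 11], [9, 13, 10], [11, 11, 12]]
contaminated = clean + [[10, 60, 11]]  # one past day had an event in slot 1
today = [10, 24, 11]

print(baseline_flags(clean, today))         # [False, True, False]
print(baseline_flags(contaminated, today))  # [False, False, False]
```

With clean history the burst in slot 1 is flagged. Once a single past event inflates the rate estimate, the same burst goes undetected: estimating normal behavior requires knowing which data are events, and detecting events requires knowing normal behavior.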

Page 12

Persistent Events

[Figure: car count over time (May 17 to May 18), with event probability p(E) between 0 and 1 and marked events]

Page 13

Quantifying Event Popularity

Ideal model

Baseline model

Page 14

My contribution

Adaptive event detection with time-varying Poisson processes. A. Ihler, J. Hutchins, and P. Smyth. Proceedings of the 12th ACM SIGKDD Conference (KDD-06), August 2006.
• Baseline method, data sets, ran experiments
• Validation

Learning to detect events with Markov-modulated Poisson processes. A. Ihler, J. Hutchins, and P. Smyth. ACM Transactions on Knowledge Discovery from Data, Dec 2007.
• Extended the model to include a second event type (low activity)
• Poisson assumption testing

Modeling Count Data From Multiple Sensors: A Building Occupancy Model. J. Hutchins, A. Ihler, and P. Smyth. IEEE CAMSAP 2007, Computational Advances in Multi-Sensor Adaptive Processing, December 2007.

Page 15

"Graphical models are a marriage between probability theory and graph theory. They provide a natural tool for dealing with two problems that occur throughout applied mathematics and engineering -- uncertainty and complexity." Michael Jordan, 1998

Graphical Models

Page 16

Directed Graphical Models

• Nodes represent variables

[Diagram: an observed node "Observed Count" and a hidden node "Event Rate Parameter"]

Page 17

Directed Graphical Models

• Nodes represent variables
• Edges represent direct dependencies

[Diagram: nodes A, B, and C]

$$p(A, B, C) = p(C \mid A, B)\, p(A)\, p(B)$$
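As a small worked instance of this factorization, the sketch below builds the joint distribution over three binary variables from p(A), p(B), and p(C | A, B). All probability values are made up for illustration.

```python
from itertools import product

# Hypothetical probability tables for binary A, B, C.
p_a = {0: 0.7, 1: 0.3}
p_b = {0: 0.6, 1: 0.4}
p_c1_given_ab = {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.6, (1, 1): 0.9}  # p(C=1 | A, B)


def joint(a, b, c):
    """p(A=a, B=b, C=c) = p(c | a, b) * p(a) * p(b)."""
    p_c1 = p_c1_given_ab[(a, b)]
    p_c = p_c1 if c == 1 else 1.0 - p_c1
    return p_c * p_a[a] * p_b[b]


total = sum(joint(a, b, c) for a, b, c in product((0, 1), repeat=3))
marginal_c1 = sum(joint(a, b, 1) for a, b in product((0, 1), repeat=2))
print(total)        # the joint sums to 1 (up to rounding)
print(marginal_c1)  # p(C=1), approximately 0.398 for these tables
```

The factored form needs 1 + 1 + 4 = 6 numbers here instead of the 7 free parameters of an unrestricted joint over three binary variables; the savings grow quickly as more variables are added.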

Page 18

Graphical Models: Modularity

[Diagram: observed nodes ObservedCount at times t-2, t-1, t, t+1, t+2]

Page 19

Graphical Models: Modularity

[Diagram: nodes Poisson Rate λ(t); Day, Time; NormalCount; ObservedCount at times t-1, t, t+1. Legend: hidden, observed.]

$$p(\text{NormalCount}_t \mid \text{Day}_t, \text{time}_t, \lambda) = \text{Poisson}\big(\text{NormalCount}_t;\ \lambda(\text{Day}_t, \text{time}_t)\big)$$

Page 20

Graphical Models: Modularity

[Diagram: nodes Poisson Rate λ(t); Day, Time; NormalCount; ObservedCount at times t-1, t, t+1. Legend: hidden, observed.]

[Figure: car count versus time, 6:00am to 6:00pm]

Page 21

Graphical Models: Modularity

[Diagram: the previous model with added nodes Event at times t-1, t, t+1. Legend: hidden, observed.]

$$p(\text{Event}_t \mid \text{Event}_{t-1}, \text{Event}_{t-2}, \ldots, \text{Event}_1) = p(\text{Event}_t \mid \text{Event}_{t-1})$$

Page 22

Graphical Models: Modularity

[Diagram: the previous model with an added node Event State Transition Matrix. Legend: hidden, observed.]

Event State Transition Matrix:

$$\begin{pmatrix}
p(Tr_{0 \to 0}) & p(Tr_{0 \to 1}) & p(Tr_{0 \to 2}) \\
p(Tr_{1 \to 0}) & p(Tr_{1 \to 1}) & p(Tr_{1 \to 2}) \\
p(Tr_{2 \to 0}) & p(Tr_{2 \to 1}) & p(Tr_{2 \to 2})
\end{pmatrix}$$
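To make the role of the transition matrix concrete, the following sketch samples a three-state event chain from a row-stochastic matrix. The numerical entries are hypothetical (the slides give no values), and the reading of state 0 as "no event" with states 1 and 2 as the two event types is an assumption based on the surrounding slides.

```python
import random

# Hypothetical transition probabilities; row i holds p(Tr i -> j).
TRANSITION = [
    [0.98, 0.01, 0.01],  # from state 0 (assumed: no event)
    [0.10, 0.90, 0.00],  # from state 1
    [0.10, 0.00, 0.90],  # from state 2
]


def sample_event_chain(length, transition, seed=0):
    """Sample a first-order Markov chain of event states, starting in state 0."""
    rng = random.Random(seed)
    states = range(len(transition))
    state, chain = 0, []
    for _ in range(length):
        chain.append(state)
        state = rng.choices(states, weights=transition[state])[0]
    return chain


chain = sample_event_chain(1000, TRANSITION)
print(chain[:20])
```

Large diagonal entries in the event rows make events persist over several consecutive time slots, which matches the "Persistent Events" behavior shown earlier in the talk.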

Page 23

[Diagram: nodes Event; Event State Transition Matrix; EventCount; ObservedCount; NormalCount; Poisson Rate λ(t); Day, Time at times t-1, t, t+1. Legend: hidden, observed.]

$$\text{EventCount}_t \sim \begin{cases} 0 & \text{for state} = 0 \\ \text{Poisson}(N;\ \cdot) & \text{for states } 1, 2 \end{cases}$$

Page 24

[Diagram: the same model (Event; Event State Transition Matrix; EventCount; ObservedCount; NormalCount; Poisson Rate λ(t); Day, Time at times t-1, t, t+1) with added nodes β, α, and three η nodes. Legend: hidden, observed.]

Page 25

[Diagram: nodes Event; Event State Transition Matrix; EventCount; ObservedCount; NormalCount; Poisson Rate λ(t); Day, Time at times t-1, t, t+1. Legend: hidden, observed.]

Markov Modulated Poisson Process (MMPP) model; e.g., see Heffes and Lucantoni (1994), Scott (1998)
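A minimal generative sketch of this kind of model follows, with several simplifying assumptions that are not in the slides: only two event states (0 = no event, 1 = event) rather than three, a fixed event rate, and an observed count equal to the normal count plus the event count. The function names and parameter values are hypothetical.

```python
import math
import random


def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for small rates."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate_mmpp(rates, p_start=0.05, p_stay=0.8, event_rate=15.0, seed=0):
    """Return (event, normal_count, event_count, observed_count) per slot.

    rates: normal Poisson rate for each time slot.
    p_start / p_stay: assumed probabilities of entering / remaining in an event.
    """
    rng = random.Random(seed)
    event, rows = 0, []
    for lam in rates:
        # Markov step for the hidden event state.
        event = 1 if rng.random() < (p_stay if event else p_start) else 0
        normal = sample_poisson(lam, rng)
        extra = sample_poisson(event_rate, rng) if event else 0
        rows.append((event, normal, extra, normal + extra))
    return rows


rows = simulate_mmpp([10.0] * 200)
print(sum(r[0] for r in rows), "event slots out of", len(rows))
```

Only the last element of each row would be visible to a sensor; the event state and the split into normal and event counts are the hidden quantities that inference has to recover.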

Page 26

Approximate Inference

$$p(a, b, c, d, e \mid D)$$

Page 27

Gibbs Sampling

Target: $p(a, b, c, d, e \mid D)$

Sample each variable in turn from its conditional:

$$p(a \mid b, c, d, e, D)$$
$$p(b \mid a, c, d, e, D)$$
$$p(c \mid a, b, d, e, D)$$
$$p(d \mid a, b, c, e, D)$$
$$p(e \mid a, b, c, d, D)$$

[Figure: scatter of samples]
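The cycle of conditional draws can be sketched on a toy target. The example below is not the model from the talk: it uses a standard bivariate normal with correlation ρ, chosen because both conditionals are known in closed form (x given y is normal with mean ρy and variance 1 − ρ²). The function names and parameter values are illustrative.

```python
import math
import random


def gibbs_bivariate_normal(rho, num_samples, burn_in=500, seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho."""
    rng = random.Random(seed)
    sd = math.sqrt(1.0 - rho * rho)
    x, y, samples = 0.0, 0.0, []
    for step in range(burn_in + num_samples):
        x = rng.gauss(rho * y, sd)  # draw x from p(x | y)
        y = rng.gauss(rho * x, sd)  # draw y from p(y | x), using the new x
        if step >= burn_in:
            samples.append((x, y))
    return samples


def empirical_correlation(samples):
    """Sample correlation coefficient of a list of (x, y) pairs."""
    n = len(samples)
    mx = sum(x for x, _ in samples) / n
    my = sum(y for _, y in samples) / n
    sxy = sum((x - mx) * (y - my) for x, y in samples)
    sxx = sum((x - mx) ** 2 for x, _ in samples)
    syy = sum((y - my) ** 2 for _, y in samples)
    return sxy / math.sqrt(sxx * syy)


samples = gibbs_bivariate_normal(rho=0.8, num_samples=5000)
print(empirical_correlation(samples))  # should be close to 0.8
```

As ρ approaches 1 each conditional becomes very narrow and successive samples barely move, which is the slow-mixing behavior that motivates sampling strongly coupled variables as a block.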

Page 28

Gibbs Sampling

[Figure: samples plotted in the (x, y) plane]

Page 29

Block Sampling

$p(No_t^{i} \mid Ne_t^{i-1}, O_t, \ldots)$: no mixing

$p(E_t^{i} \mid No_t^{i-1}, Ne_t^{i-1}, O_t, \ldots)$: slow mixing

$p(E_t^{i} \mid E_{t-1}^{i}, E_{t+1}^{i-1}, \ldots)$: slow mixing

Legend: No = Normal count, Ne = Event count, O = Observed count, E = Event State

Page 30

Gibbs Sampling

[Diagram: nodes Event; Event State Transition Matrix; EventCount; ObservedCount; NormalCount; Poisson Rate λ(t); Day, Time at times t-1, t, t+1]

Page 31

Gibbs Sampling

[Diagram: nodes Event; EventCount; ObservedCount; NormalCount; Day, Time at times t-1, t, t+1, with the Poisson Rate λ(t) and Event State Transition Matrix nodes repeated for each time slice]

For the ternary valued event variable with chain length of 64,000:

Brute force complexity ~ $3^{64{,}000}$

Page 32

Gibbs Sampling

[Diagram: Event nodes at times t-1, t, t+1 with three nodes labeled A; each time slice groups Poisson Rate λ(t); Day, Time; ObservedCount; NormalCount; EventCount]

Forward pass: $p(E_t^{i} \mid O_{1:t}, \theta^{i-1})$

$$p(E_t^{i} \mid O_{1:t}, \theta^{i-1}) \propto \sum_{E_{t-1}^{i}} \sum_{No_t^{i}} \sum_{Ne_t^{i}} p(E_{t-1}^{i} \mid O_{1:t-1}, \theta^{i-1})\, p(E_t^{i} \mid E_{t-1}^{i}, \theta^{i-1})\, p(O_t, No_t^{i}, Ne_t^{i} \mid E_t^{i}, \theta^{i-1})$$

Legend: No = Normal count, Ne = Event count, O = Observed count, E = Event State, θ = Model Parameters
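The forward recursion is what avoids enumerating every joint configuration of the event chain: its cost grows as T × K² for a chain of length T with K states, rather than as K^T. The sketch below is a generic forward filter for a hidden Markov chain with Poisson emissions, written under simplifying assumptions that are not from the slides: each event state has its own fixed total rate, the normal and event counts are already summed out, and all numerical values are hypothetical. It covers only the filtering step, not the backward sampling that a full block sampler would also need.

```python
import math


def poisson_logpmf(count, lam):
    """log p(count | lam) for a Poisson distribution."""
    return count * math.log(lam) - lam - math.lgamma(count + 1)


def forward_filter(counts, state_rates, transition, initial):
    """Return p(E_t | O_{1:t}) for every t; cost is O(T * K^2)."""
    num_states = len(initial)
    filtered, prev = [], list(initial)
    for t, count in enumerate(counts):
        if t == 0:
            predicted = list(initial)
        else:
            # One-step prediction: sum over the previous event state.
            predicted = [
                sum(prev[j] * transition[j][k] for j in range(num_states))
                for k in range(num_states)
            ]
        loglik = [poisson_logpmf(count, state_rates[k]) for k in range(num_states)]
        top = max(loglik)  # subtract the maximum to avoid underflow
        weighted = [predicted[k] * math.exp(loglik[k] - top) for k in range(num_states)]
        total = sum(weighted)
        prev = [w / total for w in weighted]
        filtered.append(prev)
    return filtered


# Hypothetical setup: state 0 normal, state 1 high activity, state 2 low activity.
transition = [[0.98, 0.01, 0.01], [0.10, 0.90, 0.00], [0.10, 0.00, 0.90]]
filtered = forward_filter(
    counts=[10, 11, 30, 28, 9],
    state_rates=[10.0, 25.0, 4.0],
    transition=transition,
    initial=[0.9, 0.05, 0.05],
)
for row in filtered:
    print([round(p, 3) for p in row])
```

For the five counts above, the filtered distribution should favor state 0 at the start and shift toward state 1 when the count jumps to 30.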

Page 33

Page 34

Chicken/Egg Dilemma

[Figure: car count and people count versus time (6:00am to 6:00pm), with event probability p(E) between 0 and 1 and marked events]

Page 35

Event Popularity

[Figure: people count and car count versus time (6:00am to 6:00pm), with event probability p(E) between 0 and 1 and marked events]

Page 36

Persistent Event

[Figure: car count over time (May 17 to May 18), with event probability p(E) between 0 and 1 and marked events]

Page 37

Persistent Event

[Figure: car count over time (May 17 to May 18), with event probability p(E) between 0 and 1 and marked events]

Page 38

Detecting Real Events: Baseball Games

Total number of      Graphical model:       Baseline model:
predicted events     detection of the       detection of the
                     76 known events        76 known events
203                  100.0%                 86.8%
186                  100.0%                 81.6%
134                  100.0%                 72.4%
98                   98.7%                  60.5%

Remember: the model training is completely unsupervised; no ground truth is given to the model
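The percentages can be read back as counts of detected games by multiplying by the 76 known events and rounding. The short sketch below does this arithmetic for the four rows; the variable and function names are illustrative.

```python
KNOWN_EVENTS = 76

# (total predicted events, graphical model %, baseline model %), from the slide
ROWS = [
    (203, 100.0, 86.8),
    (186, 100.0, 81.6),
    (134, 100.0, 72.4),
    (98, 98.7, 60.5),
]


def detected_count(percent, total=KNOWN_EVENTS):
    """Number of known events detected, recovered from a rounded percentage."""
    return round(percent / 100.0 * total)


for predicted, graphical, baseline in ROWS:
    print(predicted, detected_count(graphical), detected_count(baseline))
# 203 76 66
# 186 76 62
# 134 76 55
# 98 75 46
```

At the strictest setting (98 predicted events) the graphical model misses one of the 76 games while the baseline misses 30.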

Page 39

Multi-sensor Occupancy Model

Modeling Count Data From Multiple Sensors: A Building Occupancy Model. J. Hutchins, A. Ihler, and P. Smyth. IEEE CAMSAP 2007, Computational Advances in Multi-Sensor Adaptive Processing, December 2007

Page 40

Where are the People?

Building Level City Level

Page 41

Optical People Counter at a Building Entrance

Loop Sensors on Southern California Freeways

Sensor Measurements Reflect Dynamic Human Activity

Page 42

Application: Multi-sensor Occupancy Model

CalIt2 Building, UC Irvine campus

Page 43

Building Occupancy, Raw Measurements

$$\text{Occ}_t = \text{Occ}_{t-1} + \text{inCounts}_{t-1,t} - \text{outCounts}_{t-1,t}$$

[Figure: raw occupancy versus time over week1 to week5, drifting from about 100 down to about -500]
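The recursion is a running sum, so any systematic counting error accumulates instead of averaging out. The sketch below applies it to a toy day with made-up counts, once with exact entrance counts and once with a hypothetical entrance sensor that misses one person in five.

```python
def raw_occupancy(in_counts, out_counts, start=0):
    """Occ_t = Occ_{t-1} + inCounts - outCounts, applied slot by slot."""
    occ, series = start, []
    for n_in, n_out in zip(in_counts, out_counts):
        occ = occ + n_in - n_out
        series.append(occ)
    return series


# Made-up counts for six time slots.
true_in = [5, 20, 10, 0, 0, 0]
true_out = [0, 0, 5, 10, 15, 5]
measured_in = [4, 16, 8, 0, 0, 0]  # hypothetical under-counting entrance sensor

print(raw_occupancy(true_in, true_out))      # [5, 25, 30, 20, 5, 0]
print(raw_occupancy(measured_in, true_out))  # [4, 20, 23, 13, -2, -7]
```

With exact counts the building empties to zero at the end of the day. With the under-counting sensor the estimate ends at -7 after a single day, and the error carries into the next day, which is consistent with the steadily negative raw occupancy over several weeks.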

Page 44

Building Occupancy: Raw Measurements

Noisy sensors make raw measurements of little value

• Over-counting
• Under-counting

[Figure: raw occupancy versus time over week1 to week5, drifting from about 100 down to about -500]

Page 45

Adding Noise Model

[Diagram: nodes Event; Event State Transition Matrix; EventCount; Poisson Rate λ(t); NormalCount; Day, Time; TrueCount; ObservedCount at times t-1 and t]

Page 46

Probabilistic Occupancy Model

[Diagram: two time slices (Time t and Time t+1), each with In (Entrance) Sensors, Out (Exit) Sensors, and Occupancy nodes; Constraint and Time nodes also appear]

Page 47

24 hour constraint

[Diagram: nodes Constraint and Occupancy]

[Figure: Geometric Distribution, p=0.5, plotted over Building Occupancy values 0 to 10; probabilities range from 0 to 0.5]
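The plotted distribution can be reproduced with a few lines. The sketch assumes the form of the geometric distribution whose support starts at 0, which is consistent with the axis range in the figure (maximum probability 0.5 at occupancy 0); the function name is illustrative.

```python
def geometric_pmf(k, p=0.5):
    """P(K = k) = (1 - p)**k * p for k = 0, 1, 2, ..."""
    if k < 0:
        return 0.0
    return (1.0 - p) ** k * p


probs = [geometric_pmf(k) for k in range(11)]
for k, prob in enumerate(probs):
    print(k, prob)
```

Under this distribution an occupancy of 0 has probability 0.5, 1 has 0.25, and each further person halves the probability. Placed on the occupancy at the constraint time, it appears to serve as a soft statement that the building is almost empty once every 24 hours.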

Page 48

Learning and Inference

Gibbs Sampling | Forward-Backward | Complexity

[Diagram: two time slices, each with In (Entrance) Sensors, Out (Exit) Sensors, and Occupancy nodes]

Page 49

Typical Days

[Figure: Building Occupancy versus time for Thursday, Friday, and Saturday (6am and 6pm marked each day; vertical range -40 to 140); series: raw measurement, baseline method, occupancy model]

Page 50

Missing Data

[Figure: Building Occupancy versus time (6am, 12pm, 6pm; vertical range -100 to 100); series: actual raw measurements, raw with missing, baseline method, occupancy model]

Page 51

Corrupted Data

[Figure: Building Occupancy versus time for Thursday and Friday (vertical range -100 to 150); series: actual raw measurements, corrupted raw measurements, baseline method, occupancy model]

Page 52

Future Work

• Freeway Traffic

• On and Off ramps

• 2300 sensors

• 6 months of measurements

Page 53

Sensor Failure Extension

Page 54

Spatial Correlation

[Figure: "Loop Sensor Locations: Los Angeles County", Latitude (33.7 to 34.5) versus Longitude (-118.6 to -117.7), with LAX and DS marked]

Page 55

Four Off-Ramps

[Figure: counts (20 to 80) versus time, 6:00am to 6:00pm]

Page 56

Publications

Modeling Count Data From Multiple Sensors: A Building Occupancy Model. J. Hutchins, A. Ihler, and P. Smyth. IEEE CAMSAP 2007, Computational Advances in Multi-Sensor Adaptive Processing, December 2007.

Learning to detect events with Markov-modulated Poisson processes. A. Ihler, J. Hutchins, and P. Smyth. ACM Transactions on Knowledge Discovery from Data, Dec 2007.

Adaptive event detection with time-varying Poisson processes. A. Ihler, J. Hutchins, and P. Smyth. Proceedings of the 12th ACM SIGKDD Conference (KDD-06), August 2006.

Prediction and ranking algorithms for event-based network data. J. O'Madadhain, J. Hutchins, and P. Smyth. ACM SIGKDD Explorations: Special Issue on Link Mining, 7(2), 23-30, December 2005.