Multi-Person Multi-Camera Tracking for EasyLiving


Multi-Person Multi-Camera Tracking for EasyLiving

John Krumm Steve Harris Brian Meyers

Barry Brumitt Michael Hale Steve Shafer

Vision Technology Research Group, Microsoft Research, Redmond, WA, USA

What Is EasyLiving?

EasyLiving is a prototype architecture and technologies for building intelligent environments that facilitate the unencumbered interaction of people with other people, with computers, and with devices.

Example Behaviors

• Adjust lights as you move around a space

• Route video to best display

• Move your Windows session as you move

• Deliver e-messages to where you are

• Monitor a young child or old person

EasyLiving Demo (7 min.)

Self-Aware Space

EasyLiving must know about people, computers, software, devices, and geometry to work right.

Who’s Where?

Person-Tracking System

5 Triclops stereo cameras

5 PCs running “Stereo Module” (and Microsoft Windows 2000)

1 PC running “Person Tracker”

(only U.S. $319)

(includes Internet Explorer)

(as part of the OS)

Triclops Color Stereo Cameras

Stereo Processing and Person Detection

Person Tracking

(for a limited time only?)

Triclops Cameras

Now superseded by the “Digiclops” digital IEEE-1394 version

Typical Images

Color image from Triclops / Disparity image from Triclops

Requirements

To work in a real-life intelligent environment, our tracking system must …

1. Maintain location & identity of people

2. Run at reasonable speeds (we get 3.5 Hz)

3. Work with multiple people (we handle up to three)

4. Create and delete people instances

5. Work with multiple cameras (we’re up to five)

6. Use cameras in the room

7. Work for extended periods

8. Tolerate partial occlusions and variable postures

Other Systems

Non-Vision
• Olivetti Research (’92) & Xerox PARC (’93) – IR badges
• AT&T Laboratories (Cambridge) (’97) – ultrasonic badges
• PinPoint, Ascension, Polhemus – commercial RF badges

Vision (for multiple people)
• Haritaoglu & Davis (’98-’99)
• Darrell et al. (’98)
• Orwell et al. (’99)
• Collins et al. (’99)
• Rosales & Sclaroff (’99)
• Kettnaker & Zabih (’99)
• Intille et al. (’95, ’97)
• Rehg et al. (’97)
• Boult et al. (’99)
• Stiefelhagen et al. (’99)
• MacCormick & Blake (’99)
• Cai & Aggarwal (’98)
• Halevi & Weinshall (’97)
• Gavrila & Davis (’96)

“I see by the current issue of ‘Lab News,’ Ridgeway, that you’ve been working for the last twenty years on the same problem I’ve been working on for the last twenty years.”

Why Use Vision?

Alternative sensors:
• Active badges
• Pressure-sensitive floors
• Motion sensors
• Localized sensors, e.g. on door, chair

But …
• Cameras are getting cheap
• Cameras are easy to install
• Cameras give location and identity
• Cameras can find other objects, e.g. video screens
• Cameras can be used to model room geometry

(active badge)

Person Detection Steps

1. Background subtraction
2. Blob clustering
3. Histogram identification

Camera calibration
Background modeling

Camera Calibration


• All tracking done in ground plane
• Record path of single person walking around room
• Compute (x, y, θ) that best aligns paths
• Requires robust alignment to deal with outliers

Paths before calibration / Paths after calibration
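The calibration step can be sketched as a robust 2D rigid alignment of the two recorded paths. This is a minimal illustration, not the EasyLiving implementation: it uses an iteratively trimmed Kabsch fit, and the iteration count and trim fraction are arbitrary choices.

```python
import numpy as np

def rigid_align_2d(path_a, path_b, iters=5, keep=0.8):
    """Estimate the (x, y, theta) that maps path_b onto path_a.

    path_a, path_b: (N, 2) arrays of corresponding ground-plane points
    from two cameras watching the same walk.  Each iteration refits on
    the best-fitting keep-fraction of pairs, a crude way to reject
    outliers (the trimming scheme here is an assumption).
    """
    a, b = np.asarray(path_a, float), np.asarray(path_b, float)
    idx = np.arange(len(a))
    for _ in range(iters):
        ca, cb = a[idx].mean(axis=0), b[idx].mean(axis=0)
        # 2D Kabsch: rotation from the cross-covariance of centered points
        U, _, Vt = np.linalg.svd((b[idx] - cb).T @ (a[idx] - ca))
        R = Vt.T @ U.T
        if np.linalg.det(R) < 0:        # force a proper rotation
            Vt[-1] *= -1
            R = Vt.T @ U.T
        t = ca - R @ cb
        # rank all pairs by residual and keep the best-fitting fraction
        resid = np.linalg.norm(a - (b @ R.T + t), axis=1)
        idx = np.argsort(resid)[: int(keep * len(a))]
    return t[0], t[1], np.arctan2(R[1, 0], R[0, 0])
```

With one such transform per camera (relative to a reference camera), all stereo reports can be mapped into the common ground-plane frame.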

Background Modeling

View of space / Combined color & disparity background image

Background Subtraction

Foreground if:

• valid depth over invalid depth
- OR -
• depth difference > Td
- OR -
• any (R,G,B) difference > Tc

• Color takes over when person sinks into couch cushions
• Potential problem when person walks in front of moving video

(thus turn on moving video when acquiring background)
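The three-way foreground rule above can be written as a vectorized per-pixel test. A minimal sketch, assuming invalid (unmatched) stereo depth is encoded as NaN and depth is in meters; Td and Tc are placeholder thresholds, not the system's actual values.

```python
import numpy as np

def foreground_mask(rgb, depth, bg_rgb, bg_depth, Td=0.15, Tc=30):
    """Per-pixel foreground classification from color and depth."""
    valid, bg_valid = ~np.isnan(depth), ~np.isnan(bg_depth)
    # 1) valid depth where the background had no depth
    depth_new = valid & ~bg_valid
    # 2) depth differs from the background by more than Td
    depth_diff = valid & bg_valid & (np.abs(depth - bg_depth) > Td)
    # 3) any color channel differs from the background by more than Tc
    color_diff = (np.abs(rgb.astype(int) - bg_rgb.astype(int)) > Tc).any(axis=-1)
    return depth_new | depth_diff | color_diff
```

The OR of depth and color tests is what lets color "take over" when a person sinks into the couch and the depth difference drops below Td.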

Person Detection

Region-growing on foreground pixels gives fragmented blobs

Group blobs into people-shaped clusters

Blob Clustering

• Minimum spanning tree
• Break really long links
• Find five remaining longest links
• Break all combinations of these five:

     1 2 3 4 5
 1   0 0 0 0 0
 2   0 0 0 0 1
 3   0 0 0 1 0
 …
30   1 1 1 0 1
31   1 1 1 1 0
32   1 1 1 1 1

• Covariance matrices of 3D coordinates of linked blobs
• Eigenvalues of covariance matrices
• Compare eigenvalues to person model
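The combination search over the five longest links can be sketched as follows. This is an illustrative reconstruction, not the original code: the person-model eigenvalues, the small-fragment penalty, and scoring clusters by Euclidean distance between eigenvalue vectors are all assumptions.

```python
import itertools
import numpy as np

def connected_components(n, edges):
    """Union-find over n blob indices using the kept MST edges."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for i, j, _ in edges:
        parent[find(i)] = find(j)
    comps = {}
    for k in range(n):
        comps.setdefault(find(k), []).append(k)
    return list(comps.values())

def eig_distance(pts, person_eigs):
    """Distance of a cluster's covariance eigenvalues from the person model."""
    if len(pts) < 4:
        return 0.5  # flat penalty for tiny fragments (arbitrary value)
    evals = np.sort(np.linalg.eigvalsh(np.cov(pts.T)))[::-1]
    return float(np.linalg.norm(evals - np.asarray(person_eigs)))

def cluster_blobs(points, links, person_eigs=(0.35, 0.04, 0.02)):
    """Try all 2**5 combinations of cutting the five longest MST links.

    points: (N, 3) blob centroids; links: MST edges as (i, j, length),
    sorted longest-first, with 'really long' links already removed.
    person_eigs is an assumed person model: covariance eigenvalues of a
    person's 3D points, largest first.
    """
    candidates, keep_always = links[:5], links[5:]
    best_score, best_clusters = np.inf, [list(range(len(points)))]
    for cuts in itertools.product((False, True), repeat=len(candidates)):
        kept = [l for l, cut in zip(candidates, cuts) if not cut] + keep_always
        clusters = connected_components(len(points), kept)
        score = sum(eig_distance(points[c], person_eigs) for c in clusters)
        if score < best_score:
            best_score, best_clusters = score, clusters
    return best_clusters
```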

Color Histograms

• Identify people with RGB color histograms, 16x16x16
• Each camera PC maintains its own histograms
• Space-variant histograms built as person moves around room
• Person tracker uses histogram to resolve ambiguities

Bluish tint (by window) / Regular color
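A 16x16x16 RGB histogram and a simple similarity measure can be sketched as below. Histogram intersection is used here as an assumed comparison function; the slides only say histograms resolve ambiguities, not which metric is used.

```python
import numpy as np

def rgb_histogram(pixels, bins=16):
    """16x16x16 RGB histogram of a person's pixels, normalized to sum to 1."""
    h, _ = np.histogramdd(pixels, bins=(bins,) * 3, range=[(0, 256)] * 3)
    return h / max(h.sum(), 1.0)

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means identical color distributions."""
    return float(np.minimum(h1, h2).sum())
```

Space-variant histograms could then be kept as one such histogram per room region, so that the bluish tint near the windows is compared against a histogram built under the same lighting.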

So Far

Triclops Color Stereo Cameras

Stereo Processing and Person Detection

Person Tracking

calibration, background

• background subtraction (color & depth)
• blob clustering
• histogram maintenance

Person Tracking

• Takes reports from stereo modules
• Transforms to common coordinate frame

(common coordinate frame)

Person Tracking – Steady State

One “track” for each person

Predicted location

Resolve with color histograms

Feed back results to stereo modules for histogram updating
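The steady-state loop of predicting each track's location, gating nearby reports, and falling back to color histograms for ambiguous cases can be sketched like this. Field names, the gate radius, and the tie margin are illustrative, not the system's actual values.

```python
import numpy as np

def associate(tracks, reports, gate=0.75, tie=0.1):
    """Greedy track-to-report assignment with a color-histogram tiebreak.

    tracks: dicts with 'pred' (predicted ground-plane position) and
    'hist' (normalized color histogram); reports: dicts with 'pos' and
    'hist'.  Returns {track index: report index}.
    """
    assignments, used = {}, set()
    for ti, tr in enumerate(tracks):
        cands = []
        for ri, r in enumerate(reports):
            if ri in used:
                continue
            d = float(np.linalg.norm(np.subtract(r['pos'], tr['pred'])))
            if d < gate:                 # gate reports to the prediction
                cands.append((d, ri))
        if not cands:
            continue                     # unsupported track: rely on timeout
        cands.sort()
        if len(cands) > 1 and cands[1][0] - cands[0][0] < tie:
            # nearly equidistant reports: resolve with color histograms
            cands.sort(key=lambda c: -float(
                np.minimum(reports[c[1]]['hist'], tr['hist']).sum()))
        assignments[ti] = cands[0][1]
        used.add(cands[0][1])
    return assignments
```

The winning report would then update the track and be fed back to the stereo modules for histogram updating.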

Person Tracking – Bad Data

Measurement Noise:
• Computed position based on predicted position and many reports

Occlusions:
• Multiple cameras
• Long timeout on unsupported tracks

Person Creation Zone

• Tracks begin and end here
• Initial tracks are provisional
• Makes remainder of room more robust
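The provisional-track bookkeeping might look like the following sketch. The state names, the timeouts, and the rule for promoting a provisional track are all assumptions, since the slides only state that tracks begin and end in the creation zone and that initial tracks are provisional.

```python
def update_track_state(track, in_creation_zone, supported, now,
                       confirm_after=1.0, drop_after=5.0):
    """Advance one track's lifecycle by one update.

    track: dict with 'state' ('provisional'/'confirmed'/'deleted'),
    'created', and 'last_seen' timestamps (seconds).  confirm_after and
    drop_after are illustrative timeouts; drop_after plays the role of
    the long timeout on unsupported (occluded) tracks.
    """
    if supported:
        track['last_seen'] = now
        # promote once the track has survived long enough and left the zone
        if (track['state'] == 'provisional'
                and now - track['created'] >= confirm_after
                and not in_creation_zone):
            track['state'] = 'confirmed'
    elif now - track['last_seen'] > drop_after:
        track['state'] = 'deleted'
    return track
```

Keeping creation and deletion confined to the zone means spurious blobs elsewhere in the room cannot spawn new people, which is what makes the remainder of the room more robust.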

Summary

• Live demos, 20 minutes long
• Person tracker runs at 3.5 Hz
• Up to three people in room
• People can:

• enter
• leave
• walk around
• stop moving
• sit
• collide

Recent Efforts

• Stop breaking the vision system!

• Moved chairs & changing lights → bad background model
• Special behavior, e.g. moving slowly through the person creation zone
• Lots of people, e.g. around conference table

• Find other objects to enable interesting behaviors, e.g. “Where’s that book?”

• Easier method to model room geometry

Workshop on Multi-Object Tracking
