Talk 2007-monash-seminar-behavior-recognition-framework

www.monash.edu.au

A real-time behavior recognition framework for visual surveillance

Mahfuzul Haque

Manzur Murshed


Motivation

Are we really protected?


Motivation

Deployment of a large number of surveillance cameras in recent years

London Heathrow airport has more than 5000 cameras!!


Motivation

Dependence on human monitors has increased.

Reliability of surveillance systems has decreased.


Research Question

How to recognize unusual, unsafe and abnormal human and group behaviors from a surveillance video stream in real-time?

• Automatic detection of abnormal behaviors to aid the human monitors

• Reduce the dependence on human monitors

• Improve the reliability of surveillance systems for ensuring human security


Proposed Research Framework

A real-time behavior recognition framework for visual surveillance

Surveillance video stream → 1. Environment Modeling → Identified active agents

Identified active agents → 2. Feature Extraction and Agent Classification → Classified active agents

Classified active agents → 3. Agent Tracking with Occlusion Handling → Tracked trajectories

Tracked trajectories + Pattern database → 4. Event/Behavior Recognition → High level description of unusual actions and interactions → Alarm!


Targeted Behaviors

Mob violence

Crowding

Sudden group formation/deformation

Shooting

Public panic


Research Problems


1. Environment Modeling

How to extract the active regions from surveillance video stream?

Challenges!!

• Background initialization is not a practical approach in the real world

• Dynamic nature of background environment due to illumination variation, local motion, camera displacement and shadow

Background Subtraction: Current frame - Background = Moving foreground
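
The subtraction above can be sketched in a few lines. This is a minimal illustration only, assuming grey-level frames stored as nested lists and a fixed, hypothetical difference threshold of 30; the slides do not prescribe either choice.

```python
def subtract_background(frame, background, threshold=30):
    """Return a binary mask: 1 where |frame - background| > threshold, else 0.

    frame, background: equally sized 2D lists of grey-level intensities.
    threshold: hypothetical value chosen only for this illustration.
    """
    return [
        [1 if abs(f - b) > threshold else 0 for f, b in zip(frow, brow)]
        for frow, brow in zip(frame, background)
    ]


# A static background and a frame in which a bright object covers the middle column.
background = [[10, 10, 10], [10, 10, 10]]
frame = [[12, 200, 11], [9, 190, 10]]
mask = subtract_background(frame, background)
```

A fixed background image like this is exactly what the challenges above argue against: it breaks as soon as illumination, local motion or the camera position changes.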


Environment Modeling in Literature (1 of 4)

Single Gaussian Model (Wren et al. PAMI’ 97)

Gaussian Mixture Model (Stauffer et al. CVPR’ 99, Lee PAMI’ 05)

Generalized Gaussian Mixture Model (Allili et al. CRV’ 07)

Gaussian Mixture Model with SVM (Zhang et al. THS’ 07)

Cascaded Classifiers (Chen et al. WMVS’ 07)

Environment modeling

Background subtraction

Background modeling

Background maintenance

Foreground detection

Moving foreground detection

Object detection

Moving object detection

Pixel-based approaches



Region and texture-based approaches: Incorporate neighborhood information using block or texture measures. (Sheikh et al. PAMI’ 07, Heikkila et al. PAMI’ 06, Schindler et al. ACCV’ 06)

Shape-based approaches: Use shape-based features instead of color features. (Noriega et al. BMVC’ 06, Jacobs et al. WMVC’ 07)



Background modeling




Object detection




Predictive modeling: Uses probabilistic prediction of the expected background. (Toyama et al. ICCV’ 99, Monnet et al. ICCV’ 03)

Model initialization approaches: Recover a clear background from a given sequence containing moving objects. (Gutchess et al. ICCV’ 01, Wang et al. ACCV’ 06, Figueroa et al. IVC’ 06)



Background modeling




Object detection




Nonparametric background modeling: Density estimation based on a sample of intensity values. (Elgammal et al. ECCV’ 00)

Stationary foreground detection: Uses multiple models operating on multiple time scales. (Cheng et al. WMVC’ 07)



Background modeling




Object detection



2. Agent Classification

How to classify the active regions in real-time?

Challenges!!

• Identifying the appropriate features for the targeted behaviors

• Real-time classification using those features

Example active regions: Vehicle, People in Group, Person carrying object, Single Person

Active Regions: Human / Non-human

H: Single Person / People in Groups

Carrying Object / Not Carrying any Object

Features

• Position

• Width/Height

• Centroid/Perimeter

• Aspect Ratio

• Compactness

• Others….

Which features to use?

B. Liu and H. Zhou (NNSP’ 03)
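
To make the listed features concrete, here is a small sketch that computes them for one active region. It assumes the region is given as a binary mask (nested lists), measures the perimeter as the number of exposed pixel edges, and defines compactness as 4πA/P². These definitions are illustrative assumptions; the slides do not fix them.

```python
import math


def region_features(mask):
    """Compute simple shape features of the foreground (value 1) in a binary mask."""
    pixels = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not pixels:
        raise ValueError("mask has no foreground pixels")
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    width = max(xs) - min(xs) + 1
    height = max(ys) - min(ys) + 1
    area = len(pixels)
    # Perimeter (assumed definition): pixel edges facing background or the image border.
    foreground = set(pixels)
    perimeter = sum(
        1
        for x, y in pixels
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
        if (x + dx, y + dy) not in foreground
    )
    return {
        "position": (min(xs), min(ys)),          # top-left corner of the bounding box
        "width": width,
        "height": height,
        "centroid": (sum(xs) / area, sum(ys) / area),
        "perimeter": perimeter,
        "aspect_ratio": width / height,
        "compactness": 4 * math.pi * area / perimeter ** 2,  # assumed definition
    }


# A 2 x 3 upright blob, roughly the silhouette of a standing person.
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
features = region_features(mask)
```

An upright blob like this gives an aspect ratio below 1, while a vehicle seen from the side would typically give a value above 1, which is one reason such cheap features are attractive for real-time classification.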


Agent Classification in Literature

Agent Classification

• Generic Classification Approaches

    • Binary image classification techniques

    • Algorithms for calculating ellipticity, rectangularity, and triangularity

    • Feature evaluation techniques

• Domain Specific Classifiers

    • Coastline Surveillance System: for classifying different kinds of ships.

    • Traffic Monitoring System: vehicle (including motorcycle, car, bus and truck) and human (including pedestrian and bicycler)

    • Industrial Robot Manipulator: for classifying objects on moving conveyor.

    • Residential Security System: for identifying humans, pets, and other objects.

• Classification Using Tracked Trajectories


3. Occlusion Handling during Tracking

Occlusion handling is a major problem in visual surveillance.

During occlusion, only portions of each object are visible, often at very low resolution.

Challenges!!

Better models need to be developed to cope with the correspondence between features and to eliminate errors when tracking multiple objects.


Occlusion Handling in Literature (1 of 3)

The most practical method for addressing occlusion is the use of multiple cameras.

Progress is being made using statistical methods to predict object pose, position, and so on, from the available image information.



Region-based tracking works well in scenes containing only a few objects (such as highways).

Active contour-based tracking reduces computational complexity and can track under partial occlusion, but it is sensitive to the initialization of tracking.



Model-based tracking has a high computational cost and is unsuitable for real-time implementations.

Feature-based tracking can handle occlusion between two objects as long as the velocities of their centroids are distinguishable.

Figure: a bounding box with its width, height and centroid (x, y).
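
As a rough illustration of the feature-based idea, the sketch below derives a centroid from a bounding box and compares the centroid velocities of two objects. The box layout (top-left x, y, width, height), the function names and the `min_diff` threshold are assumptions made for this example only.

```python
import math


def centroid(box):
    """Centroid of a bounding box given as (x, y, width, height), (x, y) = top-left."""
    x, y, w, h = box
    return (x + w / 2, y + h / 2)


def velocity(prev_box, curr_box):
    """Centroid displacement in pixels per frame between two consecutive frames."""
    (px, py), (cx, cy) = centroid(prev_box), centroid(curr_box)
    return (cx - px, cy - py)


def distinguishable(v1, v2, min_diff=1.0):
    """True if two velocities differ by more than min_diff pixels per frame (hypothetical threshold)."""
    return math.hypot(v1[0] - v2[0], v1[1] - v2[1]) > min_diff


# Two objects approaching each other just before an occlusion.
v_a = velocity((10, 20, 8, 16), (14, 20, 8, 16))  # moving right
v_b = velocity((60, 22, 8, 16), (57, 22, 8, 16))  # moving left
can_separate = distinguishable(v_a, v_b)
```

When two objects move in the same direction at similar speed, the check fails, which matches the limitation stated on the slide.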


4. Behavior Recognition

Challenges!!

• Identifying the time-varying features for a particular behavior

• Automatic learning of behaviors

• Recognizing the learned behaviors in different scenarios

How to learn and recognize a particular behavior?

Movement pattern + Pattern Database → Behavior Recognition → Crowd, Violence, Sudden group formation


Behavior Recognition in Literature (1 of 3)

Real-time system for recognizing human behaviors including following another person and altering one’s path to meet another. (Oliver et al. PAMI’ 00)

Real-time system to determine whether people are carrying objects, depositing an object, exchanging bags. (Haritaoglu et al. PAMI’ 00)

Following another person

Altering one’s path to meet another

Carrying object

Depositing an object

Exchanging objects

Behavior

Recognition



Identifying abnormal movement patterns. (Grimson et al. CVPR’ 98)

Interaction patterns among a group of people based on simple statistics computed on tracked trajectories. Behaviors: loitering, stalking and following. (Wei et al. ICME’ 04)

Real-time behavior interpretation from traffic video for producing lexical output. (Kumar et al. ITS’ 05)

Abnormal movement pattern

Loitering

Stalking

Following

Target moving towards point

Target crossing a point

Target stopped at a point

Behavior

Recognition



Tracking groups of people in metro scene and recognizing abnormal behaviors. Appearance/disappearance of groups, dynamics (split and merge) and failure of motion detector. (Cupillard et al. WAVS’ 01)

Analyzing vehicular trajectories for recognizing driving patterns. (Niu et al. ICSP’ 03)

Surveillance event primitives: entry/exit, crowding, splitting and track loss. (Guha et al. VSPETS’ 05)

Appearance of groups

Disappearance of groups

Merging of groups

Splitting of groups

Turn/Stop

Entry/Exit

Crowding

Track loss

Behavior

Recognition


Addressed Research Problem


Environment Modeling in the Proposed Framework

Surveillance video stream → 1. Environment Modeling → Identified active agents

Identified active agents → 2. Feature Extraction and Agent Classification → Classified active agents

Classified active agents → 3. Agent Tracking with Occlusion Handling → Tracked trajectories

Tracked trajectories + Pattern database → 4. Event/Behavior Recognition → High level description of unusual actions and interactions → Alarm!


Environment Modeling

Surveillance video stream → Environment Modeling → Identified moving objects

Baseline

• Pixel-based approaches are more suitable for visual surveillance

• The most popular and widely used pixel-based method was introduced at MIT by Stauffer and Grimson (CVPR’ 99)

• It uses a Gaussian Mixture Model (GMM) for environment modeling

• Lee (PAMI’ 05) proposed improved adaptability


Environment Modeling using Gaussian Mixtures

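Figure: example scenes and the surfaces a single pixel may observe over time: (Sky, Cloud, Leaf, Moving Person), (Road, Shadow, Moving Car), (Floor, Shadow, Walking People).

Figure: each observed surface is modeled by a Gaussian P(x) with mean µ and variance σ²; the pixel model is the mixture of the Gaussians for Sky, Cloud, Person and Leaf over x (pixel intensity).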
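
A per-pixel mixture of this kind can be evaluated as a weighted sum of Gaussians. The sketch below is illustrative only; the three (weight, mean, variance) triples are hypothetical numbers for a pixel that mostly sees sky, sometimes cloud, and occasionally a dark leaf.

```python
import math


def gaussian_pdf(x, mu, var):
    """Density of a 1D Gaussian with mean mu and variance var at x."""
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def mixture_pdf(x, models):
    """P(x) for a pixel model given as a list of (weight, mean, variance)."""
    return sum(w * gaussian_pdf(x, mu, var) for w, mu, var in models)


# Hypothetical pixel model: sky, cloud, leaf.
pixel_models = [(0.6, 200.0, 25.0), (0.3, 230.0, 100.0), (0.1, 80.0, 400.0)]

p_sky = mixture_pdf(200.0, pixel_models)  # intensity the pixel usually shows
p_odd = mixture_pdf(140.0, pixel_models)  # intensity none of the models explains well
```

An intensity that falls far from every component receives a very low density, which is the signal used to flag a pixel as moving foreground.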


Moving Object Detection

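K models per pixel, learned from Frame 1 to Frame N (example):

• Model 1 (road): ω1, µ1, σ1², weight 65%

• Model 2 (shadow): ω2, µ2, σ2², weight 20%

• Model 3 (car): ω3, µ3, σ3², weight 15%

Models are ordered by ω/σ.

Background Models: $B = \arg\min_b \left( \sum_{k=1}^{b} \omega_k > T \right)$, with T = 70%

T is the minimum portion of the data in the environment that should be accounted for by the background.

A model matches a new pixel value $X_t$ if $|X_t - \mu| < M_{th} \cdot \sigma$.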
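
The selection rule can be traced with the 65% / 20% / 15% example. The following is a sketch under assumptions: the means and variances are invented so that the ω/σ ordering is road, shadow, car, and the default `m_th=2.5` is a commonly used value rather than one given on the slide.

```python
def order_models(models):
    """Sort models by ω/σ, highest first."""
    return sorted(models, key=lambda m: m["w"] / m["var"] ** 0.5, reverse=True)


def select_background(models, T):
    """Return the first b ordered models whose cumulative weight first exceeds T."""
    ordered = order_models(models)
    total = 0.0
    for b, model in enumerate(ordered, start=1):
        total += model["w"]
        if total > T:
            return ordered[:b]
    return ordered


def matches(x, model, m_th=2.5):
    """Matching test |x - µ| < Mth * σ (m_th value is an assumption)."""
    return abs(x - model["mu"]) < m_th * model["var"] ** 0.5


# Hypothetical means and variances; only the weights come from the slide.
models = [
    {"label": "road", "w": 0.65, "mu": 120.0, "var": 16.0},
    {"label": "shadow", "w": 0.20, "mu": 70.0, "var": 36.0},
    {"label": "car", "w": 0.15, "mu": 200.0, "var": 100.0},
]

bg_70 = [m["label"] for m in select_background(models, T=0.70)]
bg_90 = [m["label"] for m in select_background(models, T=0.90)]
```

With T = 70% the background consists of road and shadow; with T = 90% the car model is absorbed into the background as well, so the result depends strongly on how T is tuned for the scene.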


An Observation

Figure: current frame, and the moving foreground detected by the background model with T = 70% and with T = 90%.

This model is sensitive to the environment!!

Not an ideal approach for the proposed framework!!


Background Representation

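How to obtain a visual representation of the background from the environment model?

Which value should be used to represent the background? Why?

K models per pixel, learned from Frame 1 to Frame N, each carrying an additional value m:

• Model 1 (road): ω1, µ1, σ1², m1

• Model 2 (shadow): ω2, µ2, σ2², m2

• Model 3 (car): ω3, µ3, σ3², m3

Background Representation: $m_j$, where $j = \arg\max_{i \in K} \frac{\omega_i}{\sigma_i}$

Background subtraction: Current frame - Background = Moving foreground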
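
One way to read this slide is that each pixel is represented by the value m of its highest-ranked model, with models ranked by ω/σ as stated earlier in the talk. The sketch below follows that reading. It treats m as a per-model representative intensity whose exact definition is not given on the slide, and all numbers are hypothetical.

```python
def background_value(models):
    """Return m of the model with the largest ω/σ (assumed reading of the slide)."""
    best = max(models, key=lambda model: model["w"] / model["var"] ** 0.5)
    return best["m"]


# Hypothetical pixel model; "m" is an assumed representative intensity per model.
models = [
    {"label": "road", "w": 0.65, "mu": 120.0, "var": 16.0, "m": 118.0},
    {"label": "shadow", "w": 0.20, "mu": 70.0, "var": 36.0, "m": 73.0},
    {"label": "car", "w": 0.15, "mu": 200.0, "var": 100.0, "m": 205.0},
]

bg_pixel = background_value(models)
```

Under this reading the background image contains only intensities taken from a single model per pixel, rather than a blend of several models.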



Representation of the Computed Background

(a) Test Frame, (b) Lee’s Formulation, (c) Proposed Approach

Lee (PAMI' 05) gave an intuitive solution: compute the expected value of the observations believed to be background.

$$E[X \mid B] = \sum_{k=1}^{K} E[X \mid G_k]\, P(G_k \mid B) = \frac{\sum_{k=1}^{K} \mu_k\, P(B \mid G_k)\, P(G_k)}{\sum_{j=1}^{K} P(B \mid G_j)\, P(G_j)}$$
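
The expectation attributed to Lee weights each model mean µ_k by P(B|G_k)P(G_k) and normalizes by the sum of those weights. A small numeric sketch, assuming the per-model probabilities P(B|G_k) are already available (how they are estimated is outside this example, and the values below are hypothetical):

```python
def expected_background(models):
    """E[X|B] = sum_k mu_k P(B|G_k) P(G_k) / sum_j P(B|G_j) P(G_j).

    models: list of (mu_k, P(B|G_k), P(G_k)) triples.
    """
    denom = sum(p_b * p_g for _, p_b, p_g in models)
    if denom == 0:
        raise ValueError("no model has any probability of being background")
    return sum(mu * p_b * p_g for mu, p_b, p_g in models) / denom


# Hypothetical values for road, shadow and car.
models = [(120.0, 0.9, 0.65), (70.0, 0.6, 0.20), (200.0, 0.1, 0.15)]
bg_expected = expected_background(models)
```

The result here is about 113.3, an intensity that lies between the road and shadow means and was never actually observed at that pixel, which is one way an expected-value background can look unrealistic.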


Another Observation

K = 3 models per pixel, learned from Frame 1 to Frame N (example):

• Model 1 (road): ω1, µ1, σ1², m1, weight 65%

• Model 2 (shadow): ω2, µ2, σ2², m2, weight 20%

• Model 3 (car): ω3, µ3, σ3², m3, weight 15%

• Incoming model for the new pixel value: ω, µ, σ², m

Which model should be dropped?

Selecting the least probable model for the new pixel value could sacrifice the most appropriate model representing the background!

Contradiction in model dropping strategy!!


Model Dropping Strategy

K = 3 models per pixel, learned from Frame 1 to Frame N (example):

• Model 1 (road): ω1, µ1, σ1², m1, weight 65%

• Model 2 (shadow): ω2, µ2, σ2², m2, weight 20%

• Model 3 (car): ω3, µ3, σ3², m3, weight 15%

• Incoming model for the new pixel value: ω, µ, σ², m

Objectives

• To have a realistic background representation

• To retain the most contributing background models as long as possible

Which model should be dropped?

The model having the least evidence for representing the background.
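
The slide does not say how "evidence" is measured. As a sketch only, the code below contrasts two candidate criteria: dropping the model with the smallest ω/σ (the ordering criterion stated earlier in the talk) and dropping the model with the smallest weight ω. Treating ω as the measure of evidence is an assumption of this example, and the numbers are hypothetical.

```python
def least_probable(models):
    """Index of the model with the smallest ω/σ."""
    return min(range(len(models)), key=lambda i: models[i]["w"] / models[i]["var"] ** 0.5)


def least_evidence(models):
    """Index of the model with the smallest weight ω (ASSUMED measure of evidence)."""
    return min(range(len(models)), key=lambda i: models[i]["w"])


# Hypothetical pixel: a stable surface, a high-variance background surface
# (for example waving leaves), and a briefly seen, tightly clustered object.
models = [
    {"label": "road", "w": 0.65, "var": 16.0},
    {"label": "leaves", "w": 0.30, "var": 900.0},
    {"label": "passing object", "w": 0.05, "var": 4.0},
]

dropped_by_ratio = models[least_probable(models)]["label"]
dropped_by_weight = models[least_evidence(models)]["label"]
```

In this example the ω/σ criterion discards the leaves, a model that explains 30% of the observations, while the weight criterion discards the briefly seen object. That is the kind of contradiction the previous slide points at.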


Representation of the Computed Background

(a) (b) (c) (d)

(a) Test Frame

(b) Lee’s Formulation

(c) Proposed (ODS)

(d) Proposed (MDS)

ODS - Original Dropping Strategy

MDS - Modified Dropping Strategy

And it works!



Background Response from Pixel Model - 1








Experiments

Datasets

Total 14 test sequences

• 5 PETS sequences (Performance Evaluation for Tracking and Surveillance)

• 7 Wallflower sequences (Microsoft Research)

• 2 other sequences

Evaluation

• Task: moving object detection (Current frame - Background = Moving foreground)

• Compared with the two most widely used GMM-based methods: Stauffer and Grimson (CVPR’ 99) and Lee (PAMI’ 05)

• Results are evaluated both visually and numerically: False Positive (FP), False Negative (FN), False Classification
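
The numeric measures can be computed by comparing a detected foreground mask with the ground-truth mask pixel by pixel. A minimal sketch, assuming binary masks as nested lists and taking false classification as FP + FN (an assumption consistent with the later chart titled "False Negative + False Positive"):

```python
def count_errors(detected, truth):
    """Count false positives and false negatives between two binary masks."""
    fp = fn = 0
    for detected_row, truth_row in zip(detected, truth):
        for d, t in zip(detected_row, truth_row):
            if d and not t:
                fp += 1  # background pixel reported as foreground
            elif t and not d:
                fn += 1  # foreground pixel missed
    return {"FP": fp, "FN": fn, "false_classification": fp + fn}


# Toy masks: one spurious detection and two missed foreground pixels.
truth = [[0, 1, 1], [0, 1, 1]]
detected = [[1, 1, 0], [0, 1, 0]]
errors = count_errors(detected, truth)
```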


Involved parameters, thresholds and constants

• Learning rate (α)

• Maximum number of distributions per pixel model (K), set to K = 3

• Matching threshold (Mth)

• Subtraction threshold (Sth)

• Initial high variance assigned to a new distribution (V0)

• Initial low weight assigned to a new distribution (W0)
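
To show where these parameters enter, here is a simplified single-pixel update in the style of the widely published Stauffer and Grimson scheme. It is a sketch, not the method evaluated in the talk: it uses the learning rate α directly for the mean and variance updates, omits the subtraction threshold Sth, and every default value except K = 3 is hypothetical.

```python
def update_pixel(models, x, alpha=0.01, K=3, m_th=2.5, v0=900.0, w0=0.05):
    """Update a per-pixel list of {"w", "mu", "var"} models with a new value x.

    Returns True if x matched an existing model, False if a new model was created.
    """
    # Check models in order of ω/σ, highest first.
    models.sort(key=lambda m: m["w"] / m["var"] ** 0.5, reverse=True)
    matched = next(
        (m for m in models if abs(x - m["mu"]) < m_th * m["var"] ** 0.5), None
    )
    if matched is None:
        if len(models) >= K:
            models.pop()  # drop the last model in ω/σ order
        # New distribution: mean at x, initial high variance V0, initial low weight W0.
        models.append({"w": w0, "mu": float(x), "var": v0})
    else:
        for m in models:
            m["w"] = (1 - alpha) * m["w"] + (alpha if m is matched else 0.0)
        matched["mu"] = (1 - alpha) * matched["mu"] + alpha * x
        diff = x - matched["mu"]
        matched["var"] = (1 - alpha) * matched["var"] + alpha * diff * diff
    total = sum(m["w"] for m in models)
    for m in models:
        m["w"] /= total  # keep the weights summing to 1
    return matched is not None


pixel = [{"w": 1.0, "mu": 100.0, "var": 25.0}]
hit = update_pixel(pixel, 102)   # close to the existing model
miss = update_pixel(pixel, 200)  # far from every model, so a new one is created
```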


Experimental Results (PETS Dataset)

Columns: First Frame, Test Frame, Ground Truth, GMM (Stauffer), GMM (Lee), Proposed (ODS), Proposed (MDS)

Rows: (1) PETS2000; (2) PETS2006-S7-T6-B-1; (3) PETS2006-S7-T6-B-2; (4) PETS2006-S7-T6-B-3; and (5) PETS2006-S7-T6-B-4.



Experimental Results (Wallflower Sequences)

Columns: First Frame, Test Frame, Ground Truth, GMM (Stauffer), GMM (Lee), Proposed (ODS), Proposed (MDS)

Rows: (6) Bootstrap; (7) Camouflage; (8) Foreground Aperture; (9) Light Switch; (10) Moved Object; (11) Time Of Day; and (12) Waving Tree



Experimental Results (Football and Walk)

Columns: First Frame, Test Frame, Ground Truth, GMM (Stauffer), GMM (Lee), Proposed (ODS), Proposed (MDS)

Rows: (13) Football; and (14) Walk



Experimental Results (Numeric Evaluation)

False Negative



False Positive



False Negative + False Positive



Contributions

• Independent of any environment-sensitive parameter

• Better detection quality than existing GMM-based methods

• No post-processing step required

• Operational with the same parameter setting in different environments

• Tolerant of small camera displacement

Surveillance video stream → Environment Modeling → Identified moving objects


Timetable

Columns: Task, First Year, Second Year, Third Year

Tasks: Behavior Recognition, Thesis Writing, Object Classification, Literature Review, Tracking/Occlusion

Framework stages: Environment Modeling → Feature Extraction and Agent Classification → Agent Tracking with Occlusion Handling → Behavior Recognition (with Pattern database) → Alarm!


Acknowledgments

• http://www.fotosearch.com/DGV464/766029/

• http://www.cyprus-trader.com/images/alert.gif

• http://security.polito.it/~lioy/img/einstein8ci.jpg

• http://www.dtsc.ca.gov/PollutionPrevention/images/question.jpg

• http://www.unmikonline.org/civpol/photos/thematic/violence/streetvio2.jpg

• http://www.airports-worldwide.com/img/uk/heathrow00.jpg

• http://www.highprogrammer.com/alan/gaming/cons/trips/genconindy2003/exhibit-hall-crowd-2.jpg

• http://www.bhopal.org/fcunited/archives/fcu-crowd.jpg

• http://img.dailymail.co.uk/i/pix/2006/08/passaPA_450x300.jpg

• http://www.defenestrator.org/drp/files/surveillance-cameras-400.jpg

• http://www.cityofsound.com/photos/centre_poin/crowd.jpg

• http://www.hindu.com/2007/08/31/images/2007083156401501.jpg

• http://paulaoffutt.com/pics/images/crowd-surfing.jpg

• http://msnbcmedia1.msn.com/j/msnbc/Components/Photos/070225/070225_surveillance_hmed.hmedium.jpg

URLs of the images used in this presentation


Thank you!

Q&A
