
http://imagelab.ing.unimo.it

Tutorial: multicamera and distributed video surveillance

Prof. Rita Cucchiara

Università di Modena e Reggio Emilia, Italy

Third ACM/IEEE International Conference on Distributed Smart Cameras (ICDSC 2009)

30/08/2009, Como (Italy)

Agenda

This tutorial addresses algorithms and techniques of computer vision and pattern recognition for multicamera and distributed video surveillance.

When multiple (heterogeneous) cameras are connected in a forest of sensors, the standard techniques used in single fixed-camera surveillance are no longer sufficient.

Different approaches should be taken into account depending on the camera layout (e.g., with overlapped or non-overlapped fields of view), the camera motion (e.g., fixed or PTZ cameras), the network capability and the availability of computational resources in the smart camera for early processing.

The tutorial aims at presenting a short survey of the research activities in this area, mainly focusing on people surveillance; models and algorithms for object segmentation and tracking in multi-camera environments will be presented in detail, with several demos from ImageLab of Modena.

Techniques for people detection in cluttered environments will be presented. Finally, recent advances in trajectory analysis for people behaviour classification in distributed camera systems will be discussed.

Benchmark videos with ground truth and tutorial material will be available for the tutorial attendees.

Presentation of ImageLab

IMAGELAB-SOFTECH: Lab of Computer Vision, Pattern Recognition and Multimedia
Dipartimento di Ingegneria dell'Informazione, Università di Modena e Reggio Emilia, Italy
http://imagelab.ing.unimore.it

Research areas:
• Computer vision
• Digital library: content-based retrieval
• Vision for robotic automation
• Multimedia: medical imaging
• Multimedia: video annotation
• Off-line video analysis for telemetry and forensics
• Video analysis for indoor/outdoor surveillance
• People and vehicle surveillance

Imagelab: recent projects in surveillance

Projects:

European
• THIS: Transport hubs intelligent surveillance, EU JLS/CHIPS Project 2009-2010
• VIDI-Video: STREP VI FP EU (VISOR: Video Surveillance Online Repository) 2007-2009

International
• BE SAFE: NATO Science for Peace project 2007-2008
• Detection of infiltrated objects for security 2006-2008, Australian Council

Italian & Regional
• Behave_Lib: Regione Emilia Romagna, Tecnopolo Softech 2010-2013
• LAICA: Regione Emilia Romagna 2005-2007
• FREE_SURF: MIUR PRIN Project 2006-2008

With Companies
• Stopped Vehicles, with Digitek Srl 2007-2008
• SmokeWave, with Bridge-129 Italia 2007-2010
• Sakbot for Traffic Analysis, with Traficon 2004-2006
• Mobile surveillance, with Sistemi Integrati 2007
• Domotica per disabili: posture detection, FCRM 2004-2005

Video surveillance

Video surveillance concerns models, techniques and systems for
• acquiring videos about the 3D external world,
• detecting targets along time and space,
• recognizing interesting or dangerous situations,
• generating real-time alarms,
• recording meaningful data about the controlled scene.

Multidisciplinary context: imaging, image processing, pattern recognition, computer vision, machine learning, physics, computer architecture, digital electronics, ubiquitous computing, networking, wired-wireless communication, multimedia, data management, knowledge representation, AI, …

Video surveillance

Commercial world
• '60-'70: Analogue cameras
• '80: Digital CCTV systems
• '90: Digital surveillance systems
• '90: Network surveillance systems
• '10?: Ubiquitous, WAN systems

Research world
• '60-'70: Hardware
• '80: Military research
• '90: Traffic monitoring
• '00: People surveillance
• '10?: Life surveillance

Commercial System Generations: Hardware


Commercial System Generations: Software


Research in surveillance: a brief history

'80-'90 Research on military environments
• Focus on sensors and image processing for human-based enhancement
• Focus on target detection

J. Ratches, C. Walters, R.G. Buser, B.D. Guenther, "Aided and Automatic Target Recognition Based upon Sensory Inputs from Image Forming Systems", IEEE Trans. on PAMI, No. 9, Sept. 1997, pp. 1004-1019.

'90-'00 Research on traffic surveillance
• Manual calibration of the road
• Vehicle detection with adaptive background: b_new(x) = α b_old(x) + β
• Double difference with gradient analysis
• Reasoning for cross monitoring
• Survey

D. Koller, J. Weber, T. Huang, J. Malik, "Towards Robust Automatic Traffic Scene Analysis in Real-Time", in Proc. of ICPR, 1994.

Z. Zhu, G. Xu, B. Yang, D. Shi, X. Lin, "VISATRAM: A Real-time vision system for Automatic Traffic Monitoring", Image and Vision Computing 18, 2000.

R. Cucchiara, M. Piccardi and P. Mello, "Image analysis and rule-based reasoning for a traffic monitoring system", IEEE Trans. on Intelligent Transportation Systems, v. 1, n. 2, Jun. 2000.

V. Kastrinaki, M. Zervakis, K. Kalaitzakis, "A survey of video processing techniques for traffic applications", Image and Vision Computing, 2003.

Research in surveillance:

'00 Research on people surveillance
• Systems developed in university campuses (PFINDER, W4, VSAM, SAKBOT, KNIGHT, …)
• Precise algorithms for detection and tracking
• Tracking from a single fixed camera
• Motion detection with background suppression
• Use of color for precise detection and shadow suppression
• Single object tracking
• Analysis of occlusions for groups of people
• Research labs in industries (Siemens, Honeywell, IBM, Sarnoff, MERL, …)

PFINDER: C. Wren, A. Azarbayejani, T. Darrell, and A.P. Pentland, "Pfinder: real-time tracking of the human body," IEEE Trans. PAMI, (19) 7, 1997.

VSAM PROJECT: Collins, Lipton, Kanade, Fujiyoshi, Duggins, Tsin, Tolliver, Enomoto, and Hasegawa, "A System for Video Surveillance and Monitoring: VSAM Final Report," Technical report CMU-RI-TR-00-12, Robotics Institute, Carnegie Mellon University, May 2000.

W4: Haritaoglu, D. Harwood, and L.S. Davis, "W4: real-time surveillance of people and their activities," IEEE Trans. PAMI, (22) 8, 2000.

SAKBOT: R. Cucchiara, C. Grana, M. Piccardi, A. Prati, "Detecting Moving Objects, Ghosts and Shadows in Video Streams", IEEE Trans. on PAMI, 2003.

KNIGHT: M. Shah, O. Javed, K. Shafique, "Automated Visual Surveillance in Realistic Scenarios", IEEE Multimedia, 2007.

… and many others.

Background suppression: milestones et al.

• Color analysis: S.J. McKenna, S. Jabri, Z. Duric, A. Rosenfeld, and H. Wechsler, "Tracking groups of people," Computer Vision and Image Understanding, (80) 1, 2000.

• Mixture of Gaussians: C. Stauffer and W.E.L. Grimson, "Learning patterns of activity using real-time tracking," IEEE Trans. PAMI, (22) 8, 2000.

• Background and shadow detection (a short survey): A. Prati, I. Mikic, M.M. Trivedi, R. Cucchiara, "Detecting Moving Shadows: Algorithms and Evaluation," IEEE Trans. on PAMI, July 2003.

• Background and layered representation: Tao, Sawhney, Kumar, "Object tracking with Bayesian estimation of dynamic layer representations", IEEE Trans. on PAMI 24, 1, 2002.

• Background with kernel density: A. Elgammal, R. Duraiswami, D. Harwood, L.S. Davis, "Background and Foreground Modeling Using Nonparametric Kernel Density for Visual Surveillance", Proceedings of the IEEE, 2003.

… and hundreds of other papers …

Techniques for people tracking: milestones

Many surveys:
• Moeslund, Hilton, Kruger, "A survey of advances in vision-based human motion capture and analysis", Computer Vision and Image Understanding, vol. 104, 2006
• Yilmaz, Javed, Shah, "Object tracking: a survey", ACM Computing Surveys, vol. 33, n. 4, 2006

Most used techniques:

• Mean shift: Dorin Comaniciu, Visvanathan Ramesh, "Mean Shift and Optimal Prediction for Efficient Object Tracking", ICIP 2000

• Particle filtering: M. Isard and A. Blake, "A smoothing filter for condensation", in Proc. ECCV, 1998

• Appearance-based tracking:
  Wu, H.; Sankaranarayanan, A.; Chellappa, R., "Online Empirical Evaluation of Tracking Algorithms", IEEE Trans. PAMI 2009, forthcoming
  Tao Zhao, Nevatia, R., "Tracking multiple humans in complex situations", IEEE Trans. PAMI 2004
  A. Senior, "Tracking people with appearance models", PETS 2002
  R. Cucchiara, C. Grana, G. Tardini, R. Vezzani, "Probabilistic people tracking for occlusion handling", ICPR04, 2004
  …

To parallel and distributed surveillance

• A review of commercial systems: Valera, M., Velastin, S.A. (Digital Imaging Res. Centre, Kingston Univ., UK), "Intelligent distributed surveillance systems: a review", IEE Proceedings - Vision, Image and Signal Processing, 2005

• A review of hardware and software requirements: Hu, Tan, Wang, Maybank, "A survey on visual surveillance of object motion and behaviors", IEEE Trans. on Systems, Man and Cybernetics, vol. 34, n. 3, 2004

• A survey of multicamera and distributed surveillance: A. C. Sankaranarayanan, A. Veeraraghavan, and R. Chellappa, "Object Detection, Tracking and Recognition for Multiple Smart Cameras", Proceedings of the IEEE, Vol. 96, No. 10, October 2008

Enlarging the surveillance dimensions.. 3T space

The 3T space: T for Time analysis

Time axis: historical past (t0-nDt) … recent past (t0-Dt) … present (t0) … near future (t0+Dt) … future (t0+nDt)

• Forensics (historical past): What happened? What is the result of the investigation?
• Human-driven surveillance/forensics (recent past): What has just happened? Where does he/she come from? What is the cause of the event?
• Surveillance (present): What is happening? Who is this one? Which event can be detected now?
• Forecasting, statistical analysis (future): What will happen?


The 3T space: T for Topology (in Space)

Topology axis: 1 FoV, 1 moving FoV, partitioned FoV, N overlapped FoVs, distributed FoV

• Surveillance with a single static fixed camera
• Surveillance with zoom analysis: focus on faces
• Surveillance with a PT moving camera: focus on target
• Hand-driven moving cameras; cameras installed in moving systems
• Surveillance with multicamera systems (static fixed cameras)
• Wide area surveillance with a network of cameras
• Distributed surveillance: focus on attention, focus on event gathering


The 3T space: T for Target

Target axis: body details (0.1), body parts (0.5), single individual (1), group of people (2, 3 … 10), crowd (100 …)

• Body details: surveillance and biometry (face detection, fingerprint, expression analysis)
• Body parts: people posture, gait, activity detection
• Single individual: surveillance (detection, identification)
• Group of people: surveillance/forensics (action and interaction recognition and behavioral analysis)
• Crowd: crowd analysis, people flow modeling

Putting all together

[Figure: topology axis (1 FoV, partitioned FoV, multiple overlapped FoVs, distributed FoVs over a wide area network) against target axis (body details 0.1, body parts 0.5, single individual 1, group of people 2, 3 … 10, crowd 100 …). Labeled regions: forensics & biometry; one-camera people surveillance; multiple overlapped FoVs surveillance; large area (network) surveillance.]

Focus on multicamera - distributed surveillance

1. Introduction

2. Multicamera surveillance: detection, tracking, trajectory analysis

3. Non-overlapped cameras: searching correspondences, tracking

4. Moving cameras: PTZ synchronization, cameras in motion

Architectural requirements

Multicamera (multiview) surveillance
• Fully synchronized acquisition; 1 framegrabber with 1-20 fixed and PTZ cameras; 1 (multiprocessor) computer for many cameras
• Shared memory architecture
• Challenges: more precision, 3D reconstruction, consistent labeling in multiview, occlusion handling, people identification, behavior analysis

Distributed (network camera) surveillance
• Loosely coupled acquisition and processing; potentially thousands of nodes with smart cameras and traditional network cameras
• Message passing architecture
• Challenges: large coverage, communication, bandwidth, tradeoff between local computation and computation power; less precision, multiple hypothesis generation, search for similarity

In addition: freely moving cameras on vehicles, moving infrastructure, hand-held cameras

Multi camera surveillance with overlapped FoVs

Using multiple sensors/cameras has many advantages:
• Wider coverage of the scene
• Multi-modal sensing
• Redundant data (improved accuracy)
• Fault tolerance

Multiple cameras require consistent labeling.

Consistent labeling requires homography and techniques for solving tracking in overlapped FoVs.

A. C. Sankaranarayanan, A. Veeraraghavan, and R. Chellappa, "Object Detection, Tracking and Recognition for Multiple Smart Cameras", Proceedings of the IEEE, Vol. 96, No. 10, October 2008

Saad M. Khan and Mubarak Shah, "Tracking Multiple Occluding People by Localizing on Multiple Scene Planes", IEEE Trans. on PAMI, Vol. 31, No. 3, March 2009

Consistent labeling: overview

Appearance-based approaches base the matching essentially on the color of the objects: color histograms, faces, …
• J. Orwell, P. Remagnino, G. Jones, Multi-camera colour tracking, in: Proc. of Second IEEE Workshop on Visual Surveillance (VS'99), 1999, pp. 14–21.
• K. Nummiaro, E. Koller-Meier, T. Svoboda, D. Roth, L. Van Gool, Color-based object tracking in multi-camera environments, in: DAGM03, 2003, pp. 591–599.

Geometry-based approaches exploit geometrical relations and constraints between different views: homography-based, epipolar geometry, calibration-less, …
• Z. Yue, S. Zhou, R. Chellappa, Robust two-camera tracking using homography, in: Proc. of IEEE Intl Conf. on Acoustics, Speech, and Signal Processing, Vol. 3, 2004, pp. 1–4
• A. Mittal, L. Davis, M2tracker: A multi-view approach to segmenting and tracking people in a cluttered scene, IJCV 51 (3) (2003) 189–203
• S. Khan, M. Shah, Consistent labeling of tracked objects in multiple cameras with overlapping fields of view, IEEE Trans. on PAMI 25 (10) (2003) 1355–1360

Mixed approaches combine information about the geometry with information provided by the visual appearance: probabilistic information fusion, BBNs, …
• J. Kang, I. Cohen, G. Medioni, Continuous tracking within and across camera streams, in: Proc. of IEEE Int'l Conference on Computer Vision and Pattern Recognition, Vol. 1, 2003, pp. I–267 – I–272.
• S. Dockstader, A. Tekalp, Multiple camera tracking of interacting and occluded human motion, Proc. of the IEEE 89 (10) (2001) 1441–1455.

Consistent labeling: overview

[11] J. Kang, I. Cohen, G. Medioni, Continuous tracking within and across camera streams, in: Proc. of IEEE CVPR 2003

[12] C. Stauffer, K. Tieu, Automated multi-camera planar tracking correspondence modeling, in: Proc. of IEEE CVPR 2003

[13] J. Orwell, P. Remagnino, G. Jones, Multi-camera colour tracking, in: Proc. of (VS'99)

[14] S. Chang, T.-H. Gong, Tracking multiple people with a multi-camera system, in: Proc. of IEEE Workshop on Multi-Object Tracking, 2001

[15] Z. Yue, S. Zhou, R. Chellappa, Robust two-camera tracking using homography, in: Proc. of IEEE ICASSP 2004

[16] S. Khan, M. Shah, Consistent labeling of tracked objects in multiple cameras with overlapping fields of view, IEEE Trans. on PAMI 25 (10) (2003)

[17] S. Dockstader, A. Tekalp, Multiple camera tracking of interacting and occluded human motion, Proc. of the IEEE 89 (10) (2001)

[18] H. Tsutsui, J. Miura, Y. Shirai, Optical flow-based person tracking by multiple cameras, in: Proc. 2001 Int. Conf. on Multisensor Fusion

[19] J. Krumm, S. Harris, B. Meyers, B. Brumitt, M. Hale, S. Shafer, Multi-camera multi-person tracking for EasyLiving, in: Proc. of IEEE Intl Workshop VS00

[20] Q. Zhou, J. Aggarwal, Object tracking in an outdoor environment using fusion of features and cameras, Image and Vision Computing 24 (11) (2006)

[21] J. Black, T. Ellis, Multi camera image tracking, Image and Vision Computing 24 (11) (2006)

[22] K. Nummiaro, E. Koller-Meier, T. Svoboda, D. Roth, L. Van Gool, Color-based object tracking in multi-camera environments, in: DAGM03, 2003

[23] M. H. Tan, S. Ranganath, Multi-camera people tracking using Bayesian networks, FPCRM 2003

[24] A. Mittal, L. Davis, Unified multi-camera detection and tracking using region matching, in: Proc. of IEEE Workshop on Multi-Object Tracking, 2001

[25] A. Mittal, L. Davis, M2tracker: A multi-view approach to segmenting and tracking people in a cluttered scene, IJCV 51 (3) (2003)

The solution at Imagelab

HECOL (Homography and Epipolar-based COnsistent Labeling)

• Ground-plane homography and epipolar geometry are automatically computed from training videos
• The person's main axis is warped to the other view and Bayesian inference is used to validate hypotheses

S. Calderara, A. Prati, R. Cucchiara, "Bayesian-competitive Consistent Labeling for People Surveillance", IEEE Trans. on PAMI, Feb. 2008

S. Calderara, A. Prati, R. Cucchiara, "HECOL: Homography and Epipolar-based Consistent Labeling for Outdoor Park Surveillance", in press on Computer Vision and Image Understanding, 2008

Video surveillance at ImageLab

[Figure: block diagram of the ImageLab video surveillance system.
Inputs: fixed cameras (Sakbot: segmentation, tracking); PTZ camera (Sakbot, ROI and model-based tracking, PTZ control); moving and mobile camera (mosaicing, segmentation & tracking); sensors (sensor data acquisition, ad-hoc tracking).
Processing: consistent labeling and multicamera tracking (Hecol) with geometry recovery (homography & epipoles); head tracking and face selection; high-resolution detection; trajectory analysis; action analysis; posture analysis; face obscuration; face recognition & people identification; people annotation; behavior recognition.
Outputs: VISOR video surveillance ontology; web MPEG streaming; annotated video storage; security control centers; mobile surveillance platforms (Moses).]

Hecol: on-line and off-line processes

1) off-line process for homography and epipole detection and construction of a Camera Transition Graph

2) detection and tracking in each single camera system

3) consistent labeling at each new track detection

PRMA, Barcelona 2006

Homography

One to one

One point in 3D space is projected onto a single point in the image plane. Thus, knowing the position of the point in space and the camera parameters, I can tell whether the point falls in the image plane.

[Figure: point A(X_A, Y_A, Z_A) projected through the optical center O onto A'(x_A, y_A) in the image plane I; world axes X, Y, Z, image axes x, y.]

One to many

Instead, a point in the image plane can correspond to many points: all the points of the line passing through the optical center and the source point.

[Figure: the same projection of A(X_A, Y_A, Z_A) onto A'(x_A, y_A), highlighting the line through the optical center O.]

Horizontal homography

But if I have at least one constraint (e.g. the point is at z=0), from a point P in the image plane I can recover its position in real space.

[Figure: point P(X_P, Y_P, Z_P = c) lying on a wall (Z = c), and point P(X_P, Y_P = 0, Z_P) lying on the floor (Y = 0), the case of horizontal homography.]

Rectification

Hypothesis: all the points lie on the ground plane.

Homographic transformation matrix

A homography is a linear, non-singular transformation of the projective plane onto itself.

Given a plane (e.g. the ground plane) and two bidimensional projections (two image planes), the homographic transformation links the coordinates of the points in the two reference systems:

$$
\lambda \begin{bmatrix} x'_1 \\ x'_2 \\ 1 \end{bmatrix}
= H_{3\times 3} \begin{bmatrix} x_1 \\ x_2 \\ 1 \end{bmatrix},
\qquad
H_{3\times 3} = \begin{bmatrix} h_1 & h_2 & h_3 \\ h_4 & h_5 & h_6 \\ h_7 & h_8 & 1 \end{bmatrix}
$$

H is defined up to a scale factor and thus has 8 degrees of freedom.

It can be determined from 4 point correspondences (4 points give 8 equations for the 8 variables).
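As an illustration of the 4-point statement above, here is a minimal sketch in Python with NumPy (not the tutorial's own code). It assumes the normalization with the last entry of H fixed to 1, as in the matrix above, and solves the resulting linear system for h1..h8; the function names `estimate_homography` and `warp_point` are hypothetical.

```python
import numpy as np

def estimate_homography(src, dst):
    """Estimate H (3x3, last entry fixed to 1) from >= 4 point correspondences.

    src, dst: sequences of (x, y) points, with dst ~ H applied to src.
    """
    rows, rhs = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        # Each correspondence gives two linear equations in h1..h8:
        # xp * (h7*x + h8*y + 1) = h1*x + h2*y + h3, and likewise for yp.
        rows.append([x, y, 1, 0, 0, 0, -xp * x, -xp * y])
        rhs.append(xp)
        rows.append([0, 0, 0, x, y, 1, -yp * x, -yp * y])
        rhs.append(yp)
    A = np.array(rows, dtype=float)
    b = np.array(rhs, dtype=float)
    # Least squares also covers the over-determined case (more than 4 points).
    h = np.linalg.lstsq(A, b, rcond=None)[0]
    return np.append(h, 1.0).reshape(3, 3)

def warp_point(H, point):
    """Apply H to an (x, y) point and divide by the scale factor lambda."""
    x, y, w = H @ np.array([point[0], point[1], 1.0])
    return np.array([x / w, y / w])
```

With exactly 4 non-collinear points the system is square; additional correspondences (e.g. from a tracked person's support points) are averaged in the least-squares sense.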

Homography computation

The homography is a planar projective transformation that relates the coordinates of points lying on the planes used to construct the warping matrix.

In projective coordinates a planar homography can be described by a 3x3 matrix with 8 independent parameters.

Selection of 4 far-apart, non-aligned points:
• Manual selection in the initial calibration step
• Automatic selection with a human probe

Off-line stage

The off-line process automatically computes the ground-plane homography with E2oFoV and epipolar constraints.

Define the Entry Edge Fields of View.

S. Khan and M. Shah. Consistent labeling of tracked objects in multiple cameras with overlapping fields of view. IEEE Trans. on PAMI, 25(10):1355–1360, October 2003.

S. Calderara, R. Vezzani, A. Prati, R. Cucchiara, "Entry Edge of Field of View for multi-camera tracking in distributed video surveillance", in IEEE International Conference on AVSS 2005, Como, Italy, pp. 93-98, 2005

Automatic Homography computation 1/2

Automatic learning phase to compute overlapping zones and ground-plane homography:
• Take many correspondences among ground-plane support points (with a tracking algorithm for a single person)
• Define the Entry Edge Fields of View (E2oFoV) using Least Squares Optimization
• Define the overlapping zones and the extreme points
• Compute the homography from point correspondences

[Figure labels: E2oFoV, EoFoV]

Examples

Engineering Campus of the University of Modena

From http://www-sop.inria.fr/orion/ETISEO/

G. Kayumbi and A. Cavallaro, "Multiview Trajectory Mapping Using Homography with Lens Distortion Correction", EURASIP Journal on Image and Video Processing, 2008

Epipolar geometry recovery

Pure ground-plane homography-based matching is not reliable in case of segmentation errors and groups of people: we need additional 3D information. Using only the homography we can detect only presence in the planar world. To recover the epipolar geometry we must obtain point correspondences that are not on the ground plane.

Thus we take the heads! ( upper points of the blob)

Epipole computation is performed with RANSAC to improve numerical stability of the solution.


EPIPOLAR GEOMETRY

Exploiting the parallax property of perspective planes: given a plane-to-plane homography, 2 point correspondences are sufficient to compute the epipole location.
Given a point up and its projections up_1 and up_2 in the two cameras C1 and C2, the epipolar line can be computed using the homography matrix H and the point projections:

$$
l_{1,up} = \langle up_1,\; H \cdot up_2 \rangle = up_1 \times (H \cdot up_2)
$$

where
• l_{1,up} is the epipolar line in the image plane of C1 passing through the projections of up,
• H is the homography matrix from the C2 to the C1 ground plane,
• ⟨a, b⟩ indicates the line passing through points a and b.

The intersection of two such lines uniquely identifies the epipole.
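The construction above can be sketched in a few lines of Python with NumPy, using homogeneous coordinates: the line through two points is their cross product, and so is the intersection of two lines. This is only an illustrative sketch; the function names are hypothetical and the inputs are assumed to be (x, y) head points in pixel coordinates.

```python
import numpy as np

def epipolar_line(up1, up2, H):
    """Line in the C1 image through up1 and the homography-warped up2.

    up1, up2: (x, y) head points in cameras C1 and C2.
    H: 3x3 ground-plane homography from C2 to C1.
    Returns the homogeneous line (a, b, c) with a*x + b*y + c = 0.
    """
    p1 = np.array([up1[0], up1[1], 1.0])
    p2 = H @ np.array([up2[0], up2[1], 1.0])
    return np.cross(p1, p2)

def intersect_lines(line_a, line_b):
    """Intersection of two homogeneous lines, returned as an (x, y) point.

    Parallel lines intersect at infinity (third coordinate 0); that case
    is not handled in this sketch.
    """
    e = np.cross(line_a, line_b)
    return e[:2] / e[2]
```

Two head correspondences give two epipolar lines, and `intersect_lines` applied to them returns the epipole estimate.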

EPIPOLAR GEOMETRY ALGORITHM

Point correspondences can be affected by errors if extracted automatically.
RANSAC is used to iteratively choose the two lines that give the best epipole location.

ALGORITHM:
1. During the previous E2oFoV training phase, correspondences between the object's head projections up in both cameras are sampled frame by frame.

2. After choosing a camera, C, we compute two epipolar lines, randomly taken from the sample set of N frames, using the epipolar line equation, and detect the epipole location.

3. We evaluate the consensus of the remaining samples, counting how many samples are close enough to the epipolar lines computed using the estimated epipole.

4. Iterate from 2 until the consensus is above a fixed threshold.
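The four steps can be sketched as follows in Python with NumPy. This is a hedged illustration, not the HECOL implementation: the pixel tolerance `tol`, the consensus threshold `min_consensus`, the iteration cap `max_iter`, and the function name are all assumptions, and the consensus ratio is computed over all samples (the two selected ones agree trivially).

```python
import numpy as np

def ransac_epipole(up1, up2, H, tol=1.0, min_consensus=0.8,
                   max_iter=200, seed=0):
    """Estimate the epipole in C1 from head correspondences with RANSAC.

    up1, up2: (N, 2) arrays of head points in C1 and C2, one row per frame.
    H: 3x3 ground-plane homography from C2 to C1.
    Returns (epipole_xy, consensus_ratio) of the best hypothesis found.
    """
    rng = np.random.default_rng(seed)
    ones = np.ones((len(up1), 1))
    p1 = np.hstack([np.asarray(up1, dtype=float), ones])
    p2 = (H @ np.hstack([np.asarray(up2, dtype=float), ones]).T).T
    lines = np.cross(p1, p2)              # one epipolar line per frame
    n = len(lines)
    best_e, best_ratio = None, -1.0
    for _ in range(max_iter):
        i, j = rng.choice(n, size=2, replace=False)
        e = np.cross(lines[i], lines[j])  # intersection of the two lines
        if abs(e[2]) < 1e-12:
            continue                      # (near-)parallel lines, skip
        e = e / e[2]
        # Lines through the candidate epipole and each warped head point.
        cand = np.cross(e, p2)
        norm = np.hypot(cand[:, 0], cand[:, 1])
        dist = np.full(n, np.inf)
        ok = norm > 1e-12
        # Distance of each C1 head point from its candidate epipolar line.
        dist[ok] = np.abs(np.sum(cand[ok] * p1[ok], axis=1)) / norm[ok]
        ratio = float(np.mean(dist < tol))
        if ratio > best_ratio:
            best_e, best_ratio = e[:2], ratio
        if ratio >= min_consensus:
            break
    return best_e, best_ratio
```

Stopping at the first hypothesis whose consensus exceeds the threshold mirrors step 4; keeping the best hypothesis seen so far is an extra safeguard in case the threshold is never reached.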

On-line Stage: Consistency resolver

HECOL defines a Bayesian-competitive approach with warping of the vertical axis and a two-contribution check.
Each hypothesis in two overlapped cameras is accounted for both as a single object and as a group.

The consistency resolver is invoked whenever a new object is detected (not only at the EoFoV), so an appearance-based tracking is needed.

CTG: modeling the camera overlaps

When a new track appears on one camera, the search for possible matching tracks could be time consuming.

A Camera Transition Graph is built to speed up the search process:
• The graph incorporates the camera topology learnt during the off-line geometry learning stage
• Each node consists of one camera
• Arcs represent the overlapping constraint among camera FoVs
• Variables (tracks) at each node must satisfy the unary constraint of having different labels
• When a new object appears (a variable is added at a node), the binary constraint that two instances of the same object on different nodes must have the same label is verified (Constraint Satisfaction Problem)
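A small Python sketch may help to make the two constraints concrete. The class below is a hypothetical data structure, not the one used in HECOL: cameras are nodes, overlaps are undirected arcs, and the decision about which track matches (the `match` argument) is assumed to come from an external consistency resolver.

```python
from collections import defaultdict

class CameraTransitionGraph:
    """Cameras as nodes, FoV overlaps as arcs, tracks as labeled variables."""

    def __init__(self, overlaps):
        self.neighbors = defaultdict(set)
        for cam_a, cam_b in overlaps:
            self.neighbors[cam_a].add(cam_b)
            self.neighbors[cam_b].add(cam_a)
        self.labels = defaultdict(dict)   # camera -> {track_id: label}
        self._next_label = 0

    def candidate_tracks(self, camera):
        """Tracks worth comparing: only those on overlapping cameras."""
        return [(cam, track)
                for cam in sorted(self.neighbors[camera])
                for track in sorted(self.labels[cam])]

    def add_track(self, camera, track_id, match=None):
        """Add a new track; `match` is (camera, track_id) or None.

        Binary constraint: a matched track inherits the label of its match.
        Unary constraint: labels within one camera must all differ.
        """
        if match is not None:
            other_cam, other_track = match
            if other_cam not in self.neighbors[camera]:
                raise ValueError("match must come from an overlapping camera")
            label = self.labels[other_cam][other_track]
        else:
            label = self._next_label
            self._next_label += 1
        if label in self.labels[camera].values():
            raise ValueError("label already used on this camera")
        self.labels[camera][track_id] = label
        return label
```

Restricting `candidate_tracks` to neighboring nodes is what avoids comparing a new track against every track in the whole network.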

Example

Bayesian-competitive framework

Given a new track on camera l, a hypothesis space is built for each of the possible overlapped cameras in the CTG.
Each hypothesis in two overlapped cameras is accounted for both as a single object and as a group.
Inter-camera MAP:

[Equation not recoverable from the extracted slide; its prior term is labeled.]

Prior computation

A hypothesis consisting of a single object will gain a higher prior if the warped l.s.p. is far enough from the other objects' support points.

A hypothesis consisting of two or more objects (i.e., a possible group) will gain a higher prior if the objects composing it are close to each other after the warping and, at the same time, the whole group is far from other objects.

Prior computation

Priors must account for the different probabilities in the case of multiple hypotheses:

[Equation not recoverable from the extracted slide.]

Likelihood

Likelihood is computed as the max of two contributions (forward - backward):

[Figure: the axis <a1_j, a2_j> of an object is warped to the other view using homography (a1_j) and epipolar geometry (a2_j) and compared with the foreground (FG). Labels: fitness measure, MAP label assignment.]

Example


Experimental results (1)

The system has been tested in a setup at our campus.

Examples of logging appearance

Park experimentation


Distributed surveillance

Network of (smart) cameras; non-overlapped FoVs; loosely coupled.

• Problems of node communication
• If moving cameras: problems of calibration and tracking. The simultaneous localization and tracking (SLAT) problem: estimate both the trajectory of the object and the poses of the cameras.
• Problem of tracking and distributed consistent labeling:
  1) Tracking on a single camera: multi-hypothesis generation
  2) Camera world calibration: geometry, color
  3) Prediction into new cameras' FoVs
  4) Matching

Zoltan Safar, John Aa. Sørensen, Jianjun Chen, and Kåre J. Kristoffersen, "Multimodal Wireless Networks: Distributed Surveillance with Multiple Nodes", Proc. of ICASSP 2005

Funiak, S.; Guestrin, C.; Paskin, M.; Sukthankar, R., "Distributed localization of networked cameras", Int. Conf. on Information Processing in Sensor Networks, 2006.

R. Vezzani, D. Baltieri, R. Cucchiara, "Pathnodes integration of standalone Particle Filters for people tracking on distributed surveillance systems", in Proceedings of 25° ICIAP2009, 2009

Color calibration

Methods:

• Linear transformation, independent channels:

$$
\begin{bmatrix} R_A \\ G_A \\ B_A \end{bmatrix}
=
\begin{bmatrix} \alpha_1 R_B + \beta_1 \\ \alpha_2 G_B + \beta_2 \\ \alpha_3 B_B + \beta_3 \end{bmatrix}
$$

• Full matrix (M computed with LSQ):

$$
\begin{bmatrix} R_A \\ G_A \\ B_A \end{bmatrix}
= M \begin{bmatrix} R_B \\ G_B \\ B_B \end{bmatrix}
$$

• Look-up table for non-linear transformation

Roullot, E., "A unifying framework for color image calibration," 15th International Conference on Systems, Signals and Image Processing, 2008. IWSSIP 2008, pp. 97-100, 25-28 June 2008

K. Yamamoto and J. U, "Color Calibration for Multi-Camera System by using Color Pattern Board", Technical Report MECSE-3-2006
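The two linear methods can be sketched with NumPy least squares. This is a minimal illustration under assumptions: corresponding color samples from the two cameras are given as N x 3 RGB arrays named `colors_a` and `colors_b` (hypothetical names, e.g. patches of a color board seen by both cameras), and the function names are not from the cited works.

```python
import numpy as np

def fit_independent_channels(colors_b, colors_a):
    """Per-channel gain and offset: colors_a[:, k] ~ alpha[k] * colors_b[:, k] + beta[k]."""
    alpha, beta = np.empty(3), np.empty(3)
    for k in range(3):
        # Design matrix [value, 1] for one channel of camera B.
        design = np.column_stack([colors_b[:, k], np.ones(len(colors_b))])
        sol = np.linalg.lstsq(design, colors_a[:, k], rcond=None)[0]
        alpha[k], beta[k] = sol
    return alpha, beta

def fit_full_matrix(colors_b, colors_a):
    """3x3 matrix M such that each RGB row satisfies a ~ M @ b (least squares)."""
    # Rows are samples, so the system is colors_b @ M.T = colors_a.
    m_transposed = np.linalg.lstsq(colors_b, colors_a, rcond=None)[0]
    return m_transposed.T
```

The full matrix also models cross-talk between channels, at the cost of 9 parameters instead of 6; a look-up table would be needed if the relation between the cameras is not linear.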

Example

[Figure panels: original; independent channels; look-up table; full matrix]

Feature to match

• Color (single / multiple)
• Shape (geometrical ratios / spline / elliptical models)
• Motion (speed, direction)
• Gait (Fourier transform)
• SIFT + grey-level co-occurrence matrix, Zernike moments and some simple colour features
• Polar color histogram + shape

Nicholas J. Redding, Julius Fabian Ohmer, Judd Kelly & Tristrom Cooke, "Cross-Matching via Feature Matching for Camera Handover with Non-Overlapping Fields of View", Proc. of DICTA 2008

Kang, Jinman; Cohen, Isaac; Medioni, Gerard, "Persistent Objects Tracking Across Multiple Non Overlapping Cameras," IEEE Workshop on Motion and Video Computing, 2005. WACV/MOTIONS '05, vol. 2, pp. 112-119, Jan. 2005

Distributed Surveillance at ImageLab

The problem: when a person disappears from the scene by exiting a camera FoV, where can he/she be detected in the future?
1) tracking within a camera FoV
2) tracking in exit zones
3) matching in the entering zones

Using Particle Filtering + Pathnodes
In computer graphics, all the possible avatar positions are represented by nodes and the connecting arcs refer to allowed paths. The sequence of visited nodes is called pathnodes. A weight can be associated with each arc in order to give some measure on it, such as the duration, the likelihood of being chosen with respect to other paths, and so on. Weights can be defined or learned in a testing phase.
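To make the pathnode graph concrete, here is a small Python sketch. It is an assumption-laden illustration rather than the system described in the tutorial: arc weights are taken to be likelihoods in (0, 1], node names are invented, and the most likely path is found with Dijkstra's algorithm on negative log-likelihoods. In the actual system the pathnodes guide particle diffusion, which is not modeled here.

```python
import heapq
import math

def most_likely_path(arcs, start, goal):
    """Most likely node sequence from start to goal.

    arcs: {node: {next_node: likelihood in (0, 1]}}, the weighted pathnode graph.
    Returns (path, likelihood), or (None, 0.0) if goal is unreachable.
    """
    queue = [(0.0, start, [start])]   # (negative log-likelihood, node, path)
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return path, math.exp(-cost)
        if node in visited:
            continue
        visited.add(node)
        for nxt, likelihood in arcs.get(node, {}).items():
            if nxt not in visited:
                # Multiplying likelihoods equals adding their negative logs.
                heapq.heappush(queue, (cost - math.log(likelihood), nxt, path + [nxt]))
    return None, 0.0
```

For example, with arcs from a hypothetical exit node of Camera 1 through intermediate nodes to an entry node of Camera 4, the function returns the node sequence a person is most likely to follow between the two FoVs.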

Exploit the knowledge about the scene

To avoid all-to-all matches, the tracking system can exploit knowledge about the scene:

• Preferential paths -> Pathnodes
• Border line / exit zones
• Physical constraints & Forbidden zones NVR
• Temporal constraints

Tracking with pathnode

A possible path between Camera 1 and Camera 4

Pathnodes lead particle diffusion

Results with PF and pathnodes

Single camera tracking: Recall = 90.27%, Precision = 88.64%
Multicamera tracking: Recall = 84.16%, Precision = 80.00%

Example

Frame 431: a man (#21) exits and his particles are propagated.

Frame 452: a person (#22) exits too and his particles are also propagated.

Frame 471: a person is detected in Camera #2 and the particles of both #21 and #22 are used, but those of #22 match and person #22 is recognized.

And thus?

Thus, by providing both multicamera and distributed consistent labeling, we could:
• identify people
• detect their trajectories
• analyze interactions
• define abnormal behaviors
• detect suspicious situations
• …
• react and define alarms in real time
• forecast crowd situations

An example: trajectory shape analysis

Trajectory shape analysis for “abnormal behavior” recognition in open space

Trajectory shape similarity, invariant to space shifts
Not only space-based or time-based similarity

Trajectory similarity and clustering

Trajectory similarity measures:
Exact pairwise comparison
• Euclidean distance [Fu et al. ICIP05]
• Hausdorff distance [Lou et al. ICPR02, Junejo et al. ICPR04]
Inexact alignment-based comparison
• LCSS (Longest Common Subsequence) [Buzan et al. ICPR04]
• DTW (Dynamic time warping) [Keogh et al. ACM KDD00]
Statistical matching
• HMM [Porikli et al. CVPRW04], [Bashir et al. TIP 07]

Clustering techniques:
Hard clustering
• Vector Quantization and Spatial Gaussians [Mecocci et al. ICIP 05]
• Graph Cuts [Junejo et al. ICPR04]
• Spectral clustering [Hu et al. TIP07, Porikli et al. CVPRW04]
Soft clustering
• EM [Bennewitz et al. Intelligent Robot Systems 02]
• Fuzzy K-means in spatial and temporal domains [Hu et al. PAMI06]
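As an illustration of the inexact alignment-based measures listed above, here is a minimal sketch of classic dynamic time warping on 2-D point sequences. It is the textbook recurrence, not the specific variant of any cited paper, and `dtw_distance` is a name chosen for this example.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping cost between two sequences of (x, y) points,
    using Euclidean distance as the local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


straight = [(0, 0), (1, 0), (2, 0)]
slow_start = [(0, 0), (0, 0), (1, 0), (2, 0)]  # same path, different timing
shifted = [(0, 1), (1, 1), (2, 1)]             # same shape, shifted in space
```

Here `dtw_distance(straight, slow_start)` is zero because DTW absorbs the timing difference, while `dtw_distance(straight, shifted)` is not zero: a purely spatial measure like this penalizes a space shift, which is what motivates a shape-based description.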

Imagelab Proposal

1. Trajectory description with angle sequence: $T_j = \{\theta_{1,j}, \theta_{2,j}, \ldots, \theta_{n_j,j}\}$
2. Statistical representation with a Mixture of Von Mises Distributions (MovM): definition of an EM algorithm for MovM
3. Coding with a sequence of selected vM pdf identifiers
4. Code alignment: using dynamic programming
5. Clustering with k-medoids: definition of the Bhattacharyya distance for vM, and on-line EM

A. Prati, S. Calderara, R. Cucchiara, "Using Circular Statistics for Trajectory Analysis" in Proceedings of International Conference on CVPR 2008
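Steps 1 and 3 of the proposal can be sketched as below. This is a hedged illustration, not the published code: the mixture parameters are assumed to be already fitted (the EM step is omitted), angles are taken as segment directions via `atan2`, and all function names are hypothetical.

```python
import math


def bessel_i0(x):
    """Modified Bessel function I0 by power series (fine for moderate x)."""
    total, term, q, k = 1.0, 1.0, (x * x) / 4.0, 0
    while term > 1e-16 * total:
        k += 1
        term *= q / (k * k)
        total += term
    return total


def angle_sequence(points):
    """Step 1: direction angle (radians) of each segment of a trajectory."""
    return [
        math.atan2(y1 - y0, x1 - x0)
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    ]


def von_mises_pdf(theta, mu, kappa):
    """von Mises density with mean direction mu and concentration kappa."""
    return math.exp(kappa * math.cos(theta - mu)) / (2 * math.pi * bessel_i0(kappa))


def map_code(angles, components):
    """Step 3: replace each angle by the index of the mixture component
    with the highest posterior. components: list of (weight, mu, kappa)."""
    codes = []
    for t in angles:
        post = [w * von_mises_pdf(t, mu, k) for w, mu, k in components]
        codes.append(max(range(len(post)), key=post.__getitem__))
    return codes


# Made-up trajectory: two steps to the right, then two steps up.
points = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
# Made-up mixture: one component pointing right, one pointing up.
components = [(0.5, 0.0, 4.0), (0.5, math.pi / 2, 4.0)]
```

For this toy input, `map_code(angle_sequence(points), components)` yields the symbol sequence `[0, 0, 1, 1]`. Because only directions are coded, translating the whole trajectory leaves the sequence unchanged.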

Training set and on-line classification

[Diagram. Training: trajectory repository -> $T_j = \{\theta_{1,j}, \theta_{2,j}, \ldots, \theta_{n_j,j}\}$ -> EM for MoVM -> coding with MAP, giving <S={S1j..Snjj}, MovM(Tj)> -> alignment -> clustering with Bhattacharyya distance -> clusters repository. On-line: surveillance system -> $T_j$ -> on-line EM for MoVM -> coding with MAP -> alignment -> classification with Bhattacharyya distance -> normal/abnormal.]
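The Bhattacharyya comparison between two von Mises components has a closed form, sketched below. The derivation (integrating the square root of the product of the two densities) is standard, but the distance convention used here, the negative log of the coefficient, is an assumption; the tutorial's own definition may differ.

```python
import math


def bessel_i0(x):
    """Modified Bessel function I0 by power series (fine for moderate x)."""
    total, term, q, k = 1.0, 1.0, (x * x) / 4.0, 0
    while term > 1e-16 * total:
        k += 1
        term *= q / (k * k)
        total += term
    return total


def bhattacharyya_coefficient(mu1, k1, mu2, k2):
    """Overlap of two von Mises pdfs:
    BC = I0(r / 2) / sqrt(I0(k1) * I0(k2)),
    with r = |k1 * exp(i*mu1) + k2 * exp(i*mu2)|."""
    r2 = k1 * k1 + k2 * k2 + 2 * k1 * k2 * math.cos(mu1 - mu2)
    r = math.sqrt(max(0.0, r2))  # guard against tiny negative rounding
    return bessel_i0(r / 2) / math.sqrt(bessel_i0(k1) * bessel_i0(k2))


def bhattacharyya_distance(mu1, k1, mu2, k2):
    """Assumed convention: distance = -ln(BC); 0 for identical pdfs."""
    return -math.log(bhattacharyya_coefficient(mu1, k1, mu2, k2))
```

A value like this could serve as the substitution cost when aligning two symbol sequences with dynamic programming, so that swapping two nearly identical directions costs little and swapping opposite directions costs a lot.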

Some tests

15 classes of synthetic trajectories
5 classes of real trajectories

Experiments:
Periodicity
Multimodality
Sequence order
Noise
…

Experimental results

• Results are evaluated in terms of:

• classification accuracy

• normal/abnormal accuracy

• Compared with a HMM-based approach*

* (part of) F. Porikli and T. Haga. Event detection by eigenvector decomposition using object and frame features. In Proc. of CVPR Workshop, volume 7, pages 114–121, 2004.

Example

Example of clustering similar trajectories using Mixture of Von Mises and speed

Another Application

Finding similar people at different times

S. Calderara, R. Cucchiara, A. Prati, "Multimedia Surveillance: Content-based Retrieval with Multicamera People Tracking", Proc. of VSSN 2006

Final example

Example of 2 people detected in 3 different cameras

In addition…

Moving cameras:
Problems of tracking with moving cameras
Problems of fast and low-bandwidth communication

G. Gualdi, A. Prati, R. Cucchiara, "Video Streaming for Mobile Video Surveillance" in IEEE Transactions on Multimedia, pp. 1142-1154, October, 2008

G. Gualdi, A. Prati, R. Cucchiara, "Covariance Descriptors on Moving Regions for Human Detection in Very Complex Outdoor Scenes" in Proc. of ICDSC 2009

Large view cameras: problem of people detection in large areas

Available data set

ViSOR: Video Surveillance Online Repository
http://Imagelab.ing.unimore.it/visor

http://www.openvisor.org

Aims of ViSOR

Gather and make freely available a repository of surveillance videos

Store metadata annotations, both manually provided as ground-truth and automatically generated by video surveillance tools and systems

Execute online performance evaluation and comparison

Create an open forum to exchange, compare and discuss problems and results on video surveillance


Video Surveillance Ontology

Concept
Content:
• Physical Object: people, fixed and mobile objects (e.g., trees, vehicles, buildings, and so on…)
• Action/Event: actions by or interaction between people; events
Context:
• Metadata (e.g.: author of the video, camera information, calibration data, location, date and time, and so on…)

Different types of annotation

Structural Annotation: video size, authors, keywords,…

Base Annotation: ground-truth, with concepts referred to the whole video. Annotation tool: online!

GT Annotation: ground-truth, with a frame-level annotation; concepts can be referred to the whole video, to a frame interval or to a single frame. Annotation tool: Viper-GT (offline)

Automatic Annotation: output of automatic systems shared by ViSOR users.

Current video corpus set

• At the moment (August 2009) ViSOR contains about 200 videos grouped into 14 categories

• Some examples:

• 5 videos for shadow detection

• 14 videos for smoke detection

• 40 Videos of different human actions

• 6 videos of surveillance of entrance doors

• 33 videos for consistent labeling on a set of partially overlapped cameras

• 1,160,698 frames, 13 h

Video corpus set: the 14 categories

Outdoor multicamera

Synchronized views

Surveillance of entrance door of a building

About 10h!

Videos for smoke detection with GT

Videos for shadow detection

Already used by many researchers working on shadow detection
Some videos with GT

A. Prati, I. Mikic, M.M. Trivedi, R. Cucchiara, "Detecting Moving Shadows: Algorithms and Evaluation" in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, n. 7, pp. 918-923, July, 2003

Videos for face detection with moving camera

Action recognition

40 videos
10 different actions, such as “Taking off the jacket”, “walking”, “wearing glasses”, …

S. Calderara, R. Cucchiara, A. Prati, "Action Signature: a Novel Holistic Representation for Action Recognition" in 5th IEEE International Conference AVSS2008, Santa Fe, New Mexico, 1-3 Sep, 2008

We invite you to

Contribute to the growth and population of both the video repository and the annotation set
Join the community using the forum
Send us comments and suggestions…

http://www.openvisor.org
Ing. Roberto Vezzani
Dipartimento d’Ingegneria dell’Informazione, Modena
[email protected]
+39-059-2056270

For any other information

http://Imagelab.ing.unimo.it
Rita Cucchiara
Dipartimento di Ingegneria dell’Informazione
[email protected]

Thanks to Imagelab:
Andrea Prati, Roberto Vezzani, Costantino Grana, Simone Calderara, Giovanni Gualdi, Paolo Piccinini, Paolo Santinelli, Daniele Borghesani, Davide Baltieri, Sara Chiossi, Rudy Melli, Emanuele Perini, Giuliano Pistoni
