
1

Employing a RGB-D Sensor for Real-Time Tracking of Humans Across Multiple Re-Entries in a Smart Environment

Jungong Han, Eric J. Pauwels, Paul M. de Zeeuw, and Peter H.N. de With, Fellow, IEEE

IEEE Transactions on Consumer Electronics, 2012

2

Outline

Introduction Related Works Proposed Method Experimental Results Conclusion

3

Introduction


5

Introduction

Smart environments need to know the location and identity of their occupants, especially for elderly or disabled people.

Goal: detect and track humans in a home-oriented system, using a low-cost consumer-level RGB-D camera and combining the advantages of color and depth characteristics.

6

Introduction

Requirements of a home-used human tracking system:
- Track multiple persons
- Be robust against changes in the environment
- Re-identify persons after they leave and re-enter
- Real-time performance
- Low-cost camera sensors

7

Related Works

8

Related Works

Human segmentation

RGB camera: background modeling
- Median filter [3] and Gaussian Mixture Model [4] (a GMM sketch follows below)

Depth camera [10]
- Motion and depth [11]
- Graph cut [12]
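For illustration, a minimal sketch of the GMM approach: OpenCV's BackgroundSubtractorMOG2 implements the adaptive Gaussian mixture model of Zivkovic [4]. The camera source and the parameter values here are illustrative, not the paper's.

```cpp
// Minimal sketch: GMM background subtraction in the spirit of [4].
#include <opencv2/opencv.hpp>

int main() {
    cv::VideoCapture cap(0);                 // placeholder: any RGB stream
    cv::Ptr<cv::BackgroundSubtractorMOG2> mog2 =
        cv::createBackgroundSubtractorMOG2(
            500,    // history: frames used to build the background model
            16.0,   // varThreshold: squared Mahalanobis distance to the model
            true);  // detectShadows: mark shadows separately in the mask

    cv::Mat frame, fgMask;
    while (cap.read(frame)) {
        mog2->apply(frame, fgMask);          // per-pixel foreground mask
        cv::imshow("foreground", fgMask);
        if (cv::waitKey(30) == 27) break;    // Esc quits
    }
    return 0;
}
```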

9

Related Works

Human tracking

RGB camera: appearance modeling
- Mean-shift tracker [5]: a real-time non-parametric technique (a sketch follows below)
- Particle filter [6]: a random search

Depth camera
- Expectation-Maximization algorithm [12]
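Below is a minimal sketch of a mean-shift tracker in the spirit of [5], using OpenCV's meanShift on a hue back-projection. The histogram size, color space, and ROI handling are illustrative choices, not taken from the paper.

```cpp
// Sketch: track a color histogram by climbing its back-projection [5].
#include <opencv2/opencv.hpp>

cv::Rect trackMeanShift(cv::VideoCapture& cap, cv::Rect roi) {
    cv::Mat frame, hsv, hist;
    cap.read(frame);
    cv::cvtColor(frame, hsv, cv::COLOR_BGR2HSV);

    // Hue histogram of the target region = the appearance model.
    int histSize = 30;
    float hrange[] = {0, 180};
    const float* ranges[] = {hrange};
    int channels[] = {0};
    cv::Mat target = hsv(roi);
    cv::calcHist(&target, 1, channels, cv::Mat(), hist, 1, &histSize, ranges);
    cv::normalize(hist, hist, 0, 255, cv::NORM_MINMAX);

    cv::TermCriteria crit(cv::TermCriteria::EPS | cv::TermCriteria::COUNT, 10, 1.0);
    while (cap.read(frame)) {
        cv::cvtColor(frame, hsv, cv::COLOR_BGR2HSV);
        cv::Mat backproj;
        cv::calcBackProject(&hsv, 1, channels, hist, backproj, ranges);
        cv::meanShift(backproj, roi, crit);  // shift window to the density mode
    }
    return roi;
}
```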

10

Related Works

RGB camera: intuitive and easy, but depends on color/intensity => unreliable

Depth camera: robust to illumination changes, but cannot handle occlusion and identification

Both cameras combined [13,14,15] => proposed method

11

References

[3] R. Cutler and L. Davis, "View-based detection," Proc. ICPR, 1998.

[4] Z. Zivkovic, "Improved adaptive Gaussian mixture model for background subtraction," Proc. ICPR, pp. 28-31, Aug. 2004.

[5] D. Comaniciu, V. Ramesh, and P. Meer, "Kernel-based object tracking," IEEE Trans. Pattern Anal. Mach. Intell., vol. 25, no. 5, pp. 564-577, 2003.

[6] K. Nummiaro, E. Koller-Meier, and L. Van Gool, "An adaptive color-based particle filter," Image Vis. Comput., 2003.

[10] A. Bevilacqua, L. Di Stefano, and P. Azzari, "People tracking using a time-of-flight depth sensor," Proc. IEEE Int. Conf. Video and Signal Based Surveillance, 2006.

[11] D. Hansen, M. Hansen, M. Kirschmeyer, R. Larsen, and D. Silvestre, "Cluster tracking with time-of-flight cameras," Proc. CVPR Workshop on TOF-CV, June 2008.

[12] O. Arif, W. Daley, P. Vela, J. Teizer, and J. Stewart, "Visual tracking and segmentation using time-of-flight sensor," Proc. ICIP, pp. 2241-2244, Sept. 2010.

[13] R. Crabb, C. Tracey, A. Puranik, and J. Davis, "Real-time foreground segmentation via range and color imaging," Proc. CVPR Workshop on TOF-CV, June 2008.

[14] S. Gould, P. Baumstarck, M. Quigley, A. Ng, and D. Koller, "Integrating visual and range data for robotic object detection," Proc. ECCV Workshop on Multi-camera and Multimodal Sensor Fusion, June 2008.

[15] L. Sabeti, E. Parvizi, and Q. Wu, "Visual tracking using color cameras and time-of-flight range imaging sensors," Journal of Multimedia, vol. 3, no. 2, pp. 28-36, June 2008.

12

Proposed Method

13

Proposed Method

[Flowchart: the four stages of the proposed method — 1. object labeling, 2. human detection, 3. human re-entry identification, 4. human ID tracking]

14

1. Object Labeling

Motion detection using background subtraction (B: the depth image of the background).

Depth clustering using depth information (a sketch of both steps follows below):
- Seeds: the moving pixels detected in the previous step
- Check the depth continuity of the neighboring pixels of the seeds
- Returns several separated clusters as the objects
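A minimal sketch of the two labeling steps, assuming a CV_32F depth map D, a background depth image B, and hand-picked thresholds tau (motion) and eps (depth continuity); the slide does not give the paper's actual threshold values.

```cpp
#include <opencv2/opencv.hpp>
#include <cmath>
#include <queue>

// 1) Motion mask: a pixel moves if its depth differs enough from the
//    background depth B. tau is an assumed threshold.
cv::Mat motionMask(const cv::Mat& D, const cv::Mat& B, float tau) {
    cv::Mat diff, mask;
    cv::absdiff(D, B, diff);                       // |D(x,y) - B(x,y)|
    cv::threshold(diff, mask, tau, 255.0, cv::THRESH_BINARY);
    mask.convertTo(mask, CV_8U);
    return mask;
}

// 2) Grow clusters from the moving seeds while neighboring depths stay
//    continuous (difference below eps); each cluster is one object.
cv::Mat depthClusters(const cv::Mat& D, const cv::Mat& seeds, float eps) {
    const cv::Point dirs[4] = {{1, 0}, {-1, 0}, {0, 1}, {0, -1}};
    cv::Mat labels = cv::Mat::zeros(D.size(), CV_32S);
    int next = 0;
    for (int y = 0; y < D.rows; ++y)
        for (int x = 0; x < D.cols; ++x) {
            if (!seeds.at<uchar>(y, x) || labels.at<int>(y, x)) continue;
            std::queue<cv::Point> q;
            q.push(cv::Point(x, y));
            labels.at<int>(y, x) = ++next;         // start a new cluster
            while (!q.empty()) {
                cv::Point p = q.front(); q.pop();
                for (const cv::Point& d : dirs) {
                    cv::Point n = p + d;
                    if (n.x < 0 || n.y < 0 || n.x >= D.cols || n.y >= D.rows)
                        continue;
                    if (labels.at<int>(n) == 0 &&
                        std::fabs(D.at<float>(n) - D.at<float>(p)) < eps) {
                        labels.at<int>(n) = next;  // depth-continuous neighbor
                        q.push(n);
                    }
                }
            }
        }
    return labels;  // 0 = background, 1..next = separated objects
}
```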

15

2. Detecting Humans

Problem: detection depends on the person's posture.
Solution: defer the decision until the person is standing.
Criterion: a moving object can be promoted to a human only when it is stable and of sufficient height (a sketch follows the reference below).
- Stable: size changes are less than 10% over 5 successive frames
- Height: derived from depth [16], where d is the distance between the camera and the object, and a1 and a2 are parameters obtained by off-line calibration

[16] P. Remagnino, A. Shihab, and G. Jones, “Distributed intelligence for multi-camera visual surveillance,” Pattern Recognition, vol. 37, no. 4, pp. 675-689, April 2004.
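A sketch of the promotion rule under stated assumptions: the 10%/5-frame stability test comes from the slide, while the linear form expectedHeight(d) = a1*d + a2 is only an assumed shape for the calibrated height-from-depth relation of [16].

```cpp
#include <cmath>
#include <deque>

struct TrackedObject {
    std::deque<double> sizes;   // object size over recent frames
    double depth;               // d: distance between camera and object
    double heightPx;            // observed height in the image
};

// Stable: size changes below 10% across 5 successive frames.
bool isStable(const std::deque<double>& sizes) {
    if (sizes.size() < 5) return false;
    for (size_t i = sizes.size() - 4; i < sizes.size(); ++i)
        if (std::abs(sizes[i] - sizes[i - 1]) / sizes[i - 1] > 0.10)
            return false;
    return true;
}

// Sufficient height: the observed height reaches the calibrated height of a
// standing person at this depth. The linear form is an assumption.
bool promoteToHuman(const TrackedObject& o, double a1, double a2) {
    double expectedPx = a1 * o.depth + a2;   // assumed calibrated model [16]
    return isStable(o.sizes) && o.heightPx >= expectedPx;
}
```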

16

3. Human Re-Entry Identification

How? Track persons across successive appearances in the scene and tag each person with a persistent ID label.

Technique: appearance-based matching
- An extended color histogram that includes both color and texture
- The length ratio between different body parts is fixed: head / torso / legs

17

3. Human Re-Entry Identification

Head: the top 1/3 of the entire body.
- Measure the silhouette length in the horizontal direction for each row
- The neck width is smaller than that of the surrounding rows (a local minimum); a sketch follows below

Torso and legs: located by fixed ratios.
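A heuristic sketch of the neck search on a binary silhouette; treating the widest row in the top third as the head center is an assumption added here to make the local-minimum search concrete.

```cpp
#include <opencv2/opencv.hpp>
#include <algorithm>
#include <vector>

// sil: binary silhouette of one person (CV_8U, person pixels = 255).
int findNeckRow(const cv::Mat& sil) {
    int top3 = sil.rows / 3;                       // head lies in the top 1/3
    std::vector<int> width(top3);
    for (int y = 0; y < top3; ++y)
        width[y] = cv::countNonZero(sil.row(y));   // horizontal extent per row

    // Widest row approximates the middle of the head (assumption); the neck
    // is the narrowest row (local minimum) between it and the shoulders.
    int headRow = int(std::max_element(width.begin(), width.end())
                      - width.begin());
    int neckRow = headRow;
    for (int y = headRow; y < top3; ++y)
        if (width[y] > 0 && width[y] < width[neckRow]) neckRow = y;
    return neckRow;  // torso and legs are then split below by fixed ratios
}
```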

18

3. Human Re-Entry Identification

Use a color histogram plus texture to describe the human appearance. Probability of feature u:

q_u = C · Σ_{i=1..n} δ(b(x_i) − u)

where {x_i, i = 1..n} are the pixel locations in the defined region, b(x_i) associates to the pixel at location x_i the index of its histogram bin, C is a normalization constant, and δ is the Kronecker delta function. Texture is measured as edge intensity obtained with a Canny detector. A histogram sketch follows below.
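A sketch of such an extended histogram: each region pixel contributes one Kronecker-delta vote to a joint (color bin, edge/no-edge) index, with Canny edges as the texture cue. The bin counts and thresholds are illustrative, not the paper's.

```cpp
#include <opencv2/opencv.hpp>
#include <algorithm>
#include <vector>

std::vector<double> appearanceHistogram(const cv::Mat& bgr,
                                        const cv::Mat& mask /* region */) {
    const int colorBins = 16;                 // hue bins (assumed count)
    cv::Mat hsv, gray, edges;
    cv::cvtColor(bgr, hsv, cv::COLOR_BGR2HSV);
    cv::cvtColor(bgr, gray, cv::COLOR_BGR2GRAY);
    cv::Canny(gray, edges, 50, 150);          // texture cue via Canny edges

    std::vector<double> hist(colorBins * 2, 0.0);
    int n = 0;
    for (int y = 0; y < bgr.rows; ++y)
        for (int x = 0; x < bgr.cols; ++x) {
            if (!mask.at<uchar>(y, x)) continue;
            int hue = hsv.at<cv::Vec3b>(y, x)[0];     // 0..179 in OpenCV
            int b = hue * colorBins / 180;            // b(x_i): bin index
            int tex = edges.at<uchar>(y, x) ? 1 : 0;  // edge indicator
            hist[b * 2 + tex] += 1.0;  // Kronecker delta: one vote per pixel
            ++n;
        }
    for (double& h : hist) h /= std::max(n, 1);       // C normalizes to 1
    return hist;
}
```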

19

3. Human Re-Entry Identification

Compare two histograms to measure appearance similarity.

20

4. Human ID Tracking

Two cues decide whether track T_i matches detection D_j (a combined sketch follows below):
- Depth continuity, modeled by a Gaussian distribution
- Appearance similarity, measured by the Bhattacharyya distance

Both cues are combined into the probability that T_i matches D_j.
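A sketch of the combined score under stated assumptions: the Gaussian depth cue and the Bhattacharyya distance follow the slide, while the exp(-distance) mapping, the product combination, and the value of sigma are assumptions made here.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Bhattacharyya coefficient between two normalized histograms.
double bhattCoeff(const std::vector<double>& p, const std::vector<double>& q) {
    double rho = 0.0;
    for (size_t u = 0; u < p.size(); ++u) rho += std::sqrt(p[u] * q[u]);
    return rho;                               // 1 = identical appearance
}

// Probability that track T_i matches detection D_j, combining both cues.
double matchProbability(double depthT, double depthD,
                        const std::vector<double>& histT,
                        const std::vector<double>& histD,
                        double sigma = 0.2 /* assumed, in meters */) {
    double dz = depthT - depthD;
    double pDepth = std::exp(-dz * dz / (2.0 * sigma * sigma));   // Gaussian
    double rho = bhattCoeff(histT, histD);
    double dist = std::sqrt(std::max(0.0, 1.0 - rho));  // Bhattacharyya dist.
    double pApp = std::exp(-dist);                      // assumed mapping
    return pDepth * pApp;
}
```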

21

Proposed Method

[Flowchart recap: 1. object labeling, 2. human detection, 3. human re-entry identification, 4. human ID tracking]

22

Experimental Results

23

Experimental Results

Device: dual-core 2.53 GHz CPU, 4 GB RAM, 64-bit operating system.

Implemented in C++ with the OpenNI and OpenCV libraries.

Evaluation:
A. Object labeling
B. Human detection and ID tracking
C. System efficiency

24

Experimental Results A. Object labeling in different situations

- Stable light, different colors between foreground and background
- Stable light, similar colors between foreground and background

25

[Figures: object labeling results using depth data vs. using RGB data]

26

Experimental Results B. Human region detection

96.1% accuracy over 2000 frames (only 78 frames failed).

27

Experimental Results B. Identification

Setup: an RGB-D sensor was placed in a living room for 30 minutes, and persons were asked to leave and come back 35 times, wearing 5 different coats.

Results: only 8 occasions failed, caused by coats with similar colors and by posture differences.

28

Experimental Results B.

Accuracy of the human tracking module (based on 5 videos, 2600 frames in total):
- Proposed: 96.27%
- Particle filter [6]: 83.54% (fails under illumination changes)
- Mean-shift tracker [5]: 71.23% (fails under occlusion)


29

Experimental Results B.

[Figures: identification results; detected persons are tagged with persistent ID labels (0, 1), while unrecognized persons are labeled "Anonym"]

30

Experimental Results C. System efficiency

Processing time measured over 100 frames:
- 1 person: 41.3 ms/frame
- 2 persons: 73.8 ms/frame
- 3 persons: 97.1 ms/frame

Overall throughput: more than 10 fps.

31

Conclusion

32

Conclusion

Proposed a two-camera system based on an RGB-D sensor, which enables person detection, tracking, and re-entry identification.

The proposed system achieves real-time performance with sufficient accuracy: about 95% detection, 80% re-identification, and 96% tracking accuracy under occlusion and illumination changes.

Future work: improve the human detector so it executes with a more general descriptor.
