Presented by Rajesh Radhakrishnan Instructor: K.R Rao

Statistical analysis and evaluation of spatio-temporal and compressed domains moving object detection

Scope of the Project

• Three modules are required for a comparative study of moving object detection in the two domains.

• The first is manual annotation of hand locations using a GUI, which gives the coordinate location of the object in every frame.

• The second obtains a time series of hand locations using the spatio-temporal algorithm.

• The third obtains a time series of hand locations using the compressed domain algorithm.

Spatio temporal moving object detection:

• Given a video test sequence, we need to perform background subtraction to separate the foreground moving objects from the background model.

Fig 1: Block diagram of steps involved in spatio temporal object detection

• Background modeling is the process of obtaining static image regions from a video sequence.

• Frame differencing is one technique for performing background modeling.

• Parametric and non-parametric estimation models are ways to improve candidate foreground object detection.

Parametric model

• A simple Gaussian model is an example of a parametric model. Its parameters, such as the mean and standard deviation, are estimated.

• Consider a block of ground truth image and estimate the mean and standard deviation of the block.

• Ground truth can be defined as the area under the actual moving object in each frame.

• By Bayes' rule, P(color/RGB) = (P(RGB/color) * P(color)) / P(RGB).

• Here P(color/RGB) is a conditional probability and P(x) is the probability of x.

• P(RGB/color) is estimated from the training set as the Gaussian probability of RGB given the color's mean and standard deviation (std).

• Assuming P(R), P(G) and P(B) are mutually independent, the Gaussian probability P(RGB/color) is given by

P(RGB/color) = P(R/color) * P(G/color) * P(B/color), where P(R/color(mean, std)) = Gaussian_probability(R(i,j), mean, std)

• where 1≤i≤N and 1≤j≤M for an image of size NxM.
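The Gaussian model above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the training pixels are made-up values, the function names (channel_stats, gaussian_probability, likelihood_rgb_given_color) are hypothetical, and the population standard deviation is used, which the slides do not specify.

```python
import math

def channel_stats(values):
    """Return (mean, std) of one color band of the training block."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n  # population variance (assumption)
    return mean, math.sqrt(variance)

def gaussian_probability(x, mean, std):
    """Gaussian density of x for the given mean and std (std must be non-zero)."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def likelihood_rgb_given_color(pixel, stats):
    """P(RGB/color) = P(R/color) * P(G/color) * P(B/color), assuming independence."""
    p = 1.0
    for value, (mean, std) in zip(pixel, stats):
        p *= gaussian_probability(value, mean, std)
    return p

# Hypothetical training pixels (R, G, B) taken from a green ground truth block.
training_block = [(30, 200, 40), (34, 210, 44), (28, 190, 36), (32, 204, 42)]
stats = [channel_stats([px[c] for px in training_block]) for c in range(3)]

green_like = likelihood_rgb_given_color((31, 201, 41), stats)  # close to the training colors
red_like = likelihood_rgb_given_color((200, 30, 40), stats)    # far from the training colors
print(green_like > red_like)
```

A pixel close to the training colors gets a much larger likelihood than a pixel far from them, which is what makes the model usable for separating the colored object from the rest of the frame.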

An example to find P(RGB/color)

• Consider the sub-block image shown in Fig 2 to estimate the mean and standard deviation of the green object to be detected.

This is a sub-block image of size 80x60. Find the mean and standard deviation of each color band separately.

Fig 2:Sub-block image to estimate mean and standard deviation.

• How are P(color) and P(RGB) estimated?

• P(color) can take any value; it determines the color adjustment factor.

• Some of the prior probabilities are shown in Fig. 2.1.

• P(RGB) = 1, where

P(RGB) = P(RGB/color) * P(color) + P(RGB/non_color) * P(non_color)
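As a hedged illustration of how Bayes' rule and the expansion of P(RGB) fit together, the sketch below computes P(color/RGB) from given likelihoods and a prior. The function name posterior_color and the numeric values are hypothetical, and P(non_color) = 1 - P(color) is an assumption of a two-class problem.

```python
def posterior_color(lik_color, prior_color, lik_non_color):
    """P(color/RGB) by Bayes' rule, with P(RGB) expanded over color and non_color."""
    prior_non_color = 1.0 - prior_color  # two-class assumption
    p_rgb = lik_color * prior_color + lik_non_color * prior_non_color
    if p_rgb == 0.0:
        return 0.0  # neither class explains the pixel
    return lik_color * prior_color / p_rgb

# Hypothetical likelihoods; raising the prior P(color) raises the posterior.
print(posterior_color(0.8, 0.5, 0.2))
print(posterior_color(0.8, 0.9, 0.2))
```

The second call shows the role of P(color) as an adjustment factor: the likelihoods are unchanged, but a larger prior pushes the decision toward the color class.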

Fig. 2.1: Prior probability distributions, three panels of P(x) against x [4]



Non-parametric Estimation

• A non-parametric model does not require any parameter estimates.

• A histogram-based distribution is one example of a non-parametric model.

• For basic color object detection, such as red, green or blue objects, an approximation-based method can be used.

• Suppose a green object has to be detected in a color image of dimension NxMx3.

• Subtract the red and blue bands from twice the green band to obtain the distribution of green in the image:

Green_dist(i,j)=2*Image(i,j,2)-Image(i,j,1)-Image(i,j,3)
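The Green_dist formula can be tried on a tiny image. The sketch below is an illustration only: the image is a made-up 2x2 example stored as rows of (R, G, B) tuples, and green_distribution is a hypothetical name. Note that band 2 in the 1-based formula is the green band, which is the middle element of each tuple here.

```python
def green_distribution(image):
    """Apply Green_dist = 2*G - R - B to every pixel of an RGB image."""
    return [[2 * g - r - b for (r, g, b) in row] for row in image]

# Hypothetical 2x2 image: green-ish, red-ish, gray and pure green pixels.
image = [
    [(10, 200, 20), (200, 10, 20)],
    [(100, 100, 100), (0, 255, 0)],
]
green_dist = green_distribution(image)
print(green_dist)
```

Green pixels give large positive values, gray pixels give zero and pixels dominated by another band give negative values.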

Experimental Results- Spatiotemporal object detection

• The spatio-temporal moving object detection algorithm was tested on two video sequences, one with a nearby object and another with a distant object.

• Four sets of outputs were generated, with single and multiple detection boxes.

Fig 3. Close up video Frame # 17, 3 detection boxes.

Fig 4. Distant video Frame # 19, 3 detection boxes.

Object detection output of a frame with three detection boxes

Fig 5. Close up video Frame # 19, 1 detection box

Fig 6. Close up video Frame # 17, 1 detection box

Object detection output of a frame with single detection box

Compressed Domain Object detection

• Motion vector estimate is used to predict the moving object block.

Algorithm:

• Rearrange the frames from bit stream order to display order.

• Consider three pairs of arrays, present, past and future, for storing the motion vectors.

• The process of placing the motion vectors into the correct arrays and reordering the frames was incorporated into the decoder.

• Each video sequence is divided into one or more groups of pictures (GOPs). The display order of a GOP has the form given in Fig 7.

• Here I, B and P are intra-coded, bidirectionally predicted and predicted frames, respectively.

Fig 7: MPEG group of pictures – Display order [11].

But the encoder output in bit stream order will be of the form I P B B P B B I B B P B B. [11]

Bit stream to display order conversion

Fig.8. Block diagram illustrating conversion from bit stream order to display order [11].

• If an I or P frame is encountered, place it in a temporary storage called future.

• The frame is left in future until another I or P frame comes in. On arrival of a new I or P frame, the existing I or P frame is removed from future and put in the display order.

• All B frames are immediately put in display order.

• The next step is to obtain the motion vectors from these frames.
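The reordering rules above can be sketched as follows. This is a minimal illustration, not the decoder code used in the project: frames are represented only by their type letter, and the function name bitstream_to_display is hypothetical. Flushing the frame still held in future at the end of the stream is an assumption.

```python
def bitstream_to_display(frame_types):
    """Reorder frame types from bit stream order to display order."""
    display = []
    future = None  # temporary storage for the most recent I or P frame
    for frame in frame_types:
        if frame in ("I", "P"):
            if future is not None:
                display.append(future)  # release the frame held in future
            future = frame
        elif frame == "B":
            display.append(frame)  # B frames go straight to display order
        else:
            raise ValueError("unknown frame type: " + repr(frame))
    if future is not None:
        display.append(future)  # flush at end of stream (assumption)
    return display

bit_stream = "I P B B P B B I B B P B B".split()
print(" ".join(bitstream_to_display(bit_stream)))
```

Applied to the bit stream order quoted earlier, the sketch moves each I or P frame behind the B frames that depend on it.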

Fig.9. Flow chart of the operational program. [9]

Frame handling- Program Operation

• Each incoming frame is placed in a past, present or future array location based on its type, i.e. whether it is an I, P or B frame.

• The size of each array equals the frame size in macro blocks. The frame size used in this project is 240x320, so for motion vectors of block size 8x8 the array size is 30x40.

• Once the motion vectors are stored, the next step is to find the motion from frame to frame.
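The array sizing described above is simple integer arithmetic. The sketch below is illustrative only: the function names are hypothetical, and storing one zero-initialised (horizontal, vertical) vector per block is an assumption about the layout.

```python
def motion_vector_array_shape(frame_height, frame_width, block_size):
    """Number of blocks along each dimension of the frame."""
    if frame_height % block_size or frame_width % block_size:
        raise ValueError("frame size must be a multiple of the block size")
    return frame_height // block_size, frame_width // block_size

def make_vector_array(rows, cols):
    """One (horizontal, vertical) motion vector per block, initialised to zero."""
    return [[(0, 0) for _ in range(cols)] for _ in range(rows)]

rows, cols = motion_vector_array_shape(240, 320, 8)
arrays = {name: make_vector_array(rows, cols) for name in ("past", "present", "future")}
print(rows, cols)
```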

Finding motion from frame to frame

• The motion vectors stored in the present and past frame arrays are used to find the motion from frame to frame.

Past    Present    Vector types that can be subtracted
I       B or P     Forward only
I       I          None
P       B or P     Forward only
P       I          None
B       B or P     Forward and backward

Table 1: Constraints to be taken into account.

• For example, consider a transition from a B frame to a P or B frame, where both the forward and backward vectors have to be considered.

• Let a B frame macro block motion vector have values (4,-6) for forward prediction and (-6,1) for backward prediction.

• Let a P frame macro block motion vector have values (9,-7) for forward prediction and (0,0) for backward prediction, as a P frame does not have backward prediction.

• The total motion is the average of the forward and backward differences.

• Forward = (9,-7) – (4,-6) = (5,-1), backward = (0,0) – (-6,1) = (6,-1), so the total motion is (5.5,-1).

• The resulting motion vector values were written to two files, one for the horizontal and one for the vertical component, and plotted using MATLAB.

• The motion vector with the maximum magnitude was located and its corresponding spatial domain coordinate location was noted.

• For example, if the array location (16,24) gave the maximum motion vector magnitude, the corresponding spatial coordinates were marked as (128,192).
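The frame-to-frame calculation and the lookup of the strongest block can be sketched together. This is a sketch under stated assumptions, not the project's program: the names ALLOWED, frame_to_frame_motion and strongest_block are hypothetical, frame pairs not listed in Table 1 are treated as "none", and "none" is reported as zero motion.

```python
import math

# Table 1: (past type, present type) -> vector types that can be subtracted.
ALLOWED = {
    ("I", "B"): ("forward",), ("I", "P"): ("forward",),
    ("I", "I"): (),
    ("P", "B"): ("forward",), ("P", "P"): ("forward",),
    ("P", "I"): (),
    ("B", "B"): ("forward", "backward"), ("B", "P"): ("forward", "backward"),
}

def frame_to_frame_motion(past_type, past_mv, present_type, present_mv):
    """Average of the allowed (present - past) vector differences for one block."""
    kinds = ALLOWED.get((past_type, present_type), ())  # unlisted pairs: assumed none
    if not kinds:
        return (0.0, 0.0)  # no vectors can be subtracted (assumed zero motion)
    diffs = [(present_mv[k][0] - past_mv[k][0], present_mv[k][1] - past_mv[k][1])
             for k in kinds]
    return (sum(d[0] for d in diffs) / len(diffs),
            sum(d[1] for d in diffs) / len(diffs))

def strongest_block(motion, block_size=8):
    """Array location of the largest motion vector and its pixel coordinates."""
    locations = ((r, c) for r in range(len(motion)) for c in range(len(motion[0])))
    best = max(locations, key=lambda rc: math.hypot(*motion[rc[0]][rc[1]]))
    return best, (best[0] * block_size, best[1] * block_size)

# Worked example from the slides: B frame followed by a P frame.
past = {"forward": (4, -6), "backward": (-6, 1)}
present = {"forward": (9, -7), "backward": (0, 0)}
total = frame_to_frame_motion("B", past, "P", present)

# Hypothetical 30x40 motion field with a single moving block at (16, 24).
motion = [[(0.0, 0.0) for _ in range(40)] for _ in range(30)]
motion[16][24] = total
location, pixel = strongest_block(motion)
print(total, location, pixel)
```

With 8x8 blocks, an array location maps to spatial coordinates by multiplying each index by 8, which reproduces the (16,24) to (128,192) example.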

Fig 10: Motion vector plot, values from frame #15 of close detect1 (axes: horizontal motion, vertical motion)

Fig 11: Corresponding spatial domain frame of the moving object, close detect frame #15

In this particular example the object was correctly detected from the motion vector estimate.

Problems/ Constraints in object detection using motion vectors

• If the test video contains moving background objects other than the object to be detected, the object detection accuracy will be very low.

• When a specific moving object has to be detected, in our case a colored object, even the user's hand that moves the object can be falsely classified as a correctly detected object.

• These constraints result in reduced accuracy of detection.

GUI for annotating hand locations:

Fig 12: GUI for annotating hand locations

Accuracy of Detection

1. Accuracy of detection = correctly classified object / correct answers.

2. The rectangular boxes, namely "correct answers" and "questions answered", are 40x40 detection boxes.

3. The "correct answers" detection boxes are obtained manually from the GUI used earlier, and the "questions answered" boxes are the results returned by the object detection algorithm.

4. The "correctly classified object" is the region of overlap between the two detection boxes.
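The accuracy measure can be sketched for a single frame as the overlap area divided by the area of the manually annotated box. This is an illustration only: representing each box by its top-left (x, y) corner is an assumption, and the function names are hypothetical.

```python
def overlap_area(box_a, box_b, size=40):
    """Overlap of two size x size boxes given by their top-left (x, y) corners."""
    dx = size - abs(box_a[0] - box_b[0])
    dy = size - abs(box_a[1] - box_b[1])
    return max(dx, 0) * max(dy, 0)

def detection_accuracy(correct_answer, question_answered, size=40):
    """Correctly classified region divided by the correct answers box area."""
    return overlap_area(correct_answer, question_answered, size) / (size * size)

# Hypothetical boxes: the detected box is shifted 10 pixels to the right.
print(detection_accuracy((100, 100), (110, 100)))
```

Identical boxes give an accuracy of 1, and boxes that do not overlap give 0.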

Fig 13: Evaluation of accuracy

Results:

• The results were obtained by considering the object detection from two different test video sequences. The results are tabulated as follows.

Test video sequence    Accuracy of detection,     Accuracy of detection, motion vector based
                       spatio-temporal method     object detection in compressed domain
Close detect           94%                        79%
Distant detect         91%                        68%

Table 2: Performance of moving object detection

Further Improvements

• The accuracy of moving object detection in the compressed domain can be further improved by considering parameters such as the DCT coefficients associated with each macro block [6].

• Including pre-processing and post-processing steps can enable the object detection algorithm to adapt to non-stationary background movements such as waving trees, image changes due to camera motion and illumination changes.

References

• [1] Z. Qiya and L. Zhicheng, “Moving object detection algorithm for H.264/AVC compressed video stream”, ISECS International Colloquium on Computing, Communication, Control and Management, pp. 186-189, Sep. 2009.

• [2] T. Yokoyama, T. Iwasaki and T. Watanabe, “Motion vector based moving object detection and tracking in the MPEG compressed domain”, Seventh International Workshop on Content Based Multimedia Indexing, pp. 201-206, Aug. 2009.

• [3] K. Kapotas and A. N. Skodras, “Moving object detection in the H.264 compressed domain”, International Conference on Imaging Systems and Techniques, pp. 325-328, Aug. 2010.

• [4] S. C Sen-Ching and C. Kamath, “Robust techniques for background subtraction in urban traffic video”, Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, Jul. 2004.

• [5] S. Y. Elhabian and K. M. El-Sayed, “Moving object detection in spatial domain using background removal techniques - state of the art”, Recent Patents on Computer Science, Vol. 1, pp. 32-54, Apr. 2008.

• [6] O. Sukmarg and K. R. Rao, “Fast object detection and segmentation in MPEG compressed domain”, IEEE TENCON 2000, Proceedings, pp. 364-368, Mar. 2000.

• [7] W.B. Thompson and Ting-Chuen, “Detecting moving objects”, International Journal of Computer Vision, pp. 39-57, Jun. 1990.

• [8] JM software - http://iphome.hhi.de/suehring/tml/

• [9] V. Y. Mariano, et al., “Performance evaluation of object detection algorithms”, International Conference on Pattern Recognition, Vol. 3, pp. 965-969, June 2002.

• [10] J. C. Nascimento and J. S. Marques, “Performance evaluation of object detection algorithms for video surveillance”, IEEE Transactions on Multimedia, Vol. 8, pp. 761-774, Dec. 2006.

• [11] J. Gilvarry, “Calculation of motion using motion vectors extracted from an MPEG stream”, Proc. ACM Multimedia 99, Boston MA, pp 3-50, Sept 20, 1999.

• [12] S. Aramvith and M. T. Sun, “MPEG-1 and MPEG-2 Video Standards”, Image and Video Processing Handbook, Vol. 2, pp. 320-342, June 1999.

• [13] FFMPEG - http://www.ffmpeg.org/download.html


QUESTIONS ???
