
Research Article
Using Traffic Light Signal to Enhance Intersection Foreground Detection Based on Video Sensor Networks

Rong Ding, Shunli Wang, and Xu Liu

State Key Laboratory of Software Development Environment, School of Computer Science and Engineering, Beihang University, Beijing 100191, China

Correspondence should be addressed to Rong Ding; [email protected]

Received 4 November 2013; Revised 1 March 2014; Accepted 28 March 2014; Published 15 April 2014

Academic Editor: Zhiyong Wang

Copyright © 2014 Rong Ding et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Foreground detection plays an important role in traffic surveillance applications, especially at urban intersections. Background subtraction is an efficient approach to segmenting the background and foreground with static cameras in video sensor networks. But when modelling the background, most statistical techniques adjust the learning rate, a crucial parameter controlling the updating speed, based only on changes observed in the video sequence. This causes slow adaptation to sudden environmental changes: for example, a stopped car fuses into the background before moving again, which lowers segmentation performance. This paper proposes an efficient way to address the problem by accounting for physical-world signals at traffic junctions. It assigns an adaptive learning rate to each pixel by integrating the traffic light signal obtained from the sensor network. Combined with such physical-world signals, a background subtraction method is able to adapt itself to outside-world changes instantly. We test our approach at a real urban traffic intersection; experimental results show that the new method increases the accuracy of detection.

1. Introduction

Intelligent video surveillance, which aims to make traffic more intelligent and to decrease the number of vehicle accidents, is a well-studied subject area with both existing application systems and new approaches still being developed. Within this area, detecting objects at the intersection is one of the most significant focuses of typical intelligent transportation systems (ITS) applications and the basis of high-level processing. At most real intersections, a single camera is not enough to monitor the whole scenario; a video sensor network provides a large-scale, redundant set of video streams to observe the intersection [1, 2]. Because some video-based traffic monitoring systems include high-level descriptions of both cars and their behaviours, continuous tracking results are significant to the high-level processing. Background subtraction is a widely used technique for foreground detection which compares an observed image with an estimated background image that does not contain any objects of interest. But before using this method, several parameters have to be

determined. Among these parameters, the learning rate is the most critical to performance. If the rate is large, slow or stopped vehicles fuse into the background quickly, as Figure 1 shows. But if we set the rate to a small value, the background will not be updated in time. In particular, in a traffic junction scene, vehicles always encounter congestion and stop-and-go conditions when there is a red light; at such times, a reasonable learning rate becomes even more significant. Usually, the majority of current methods adjust the learning rate relying only on changes observed in the video sequence. This very often leaves the method unable to adapt itself to outside-world changes instantly. When the traffic light is red, this typically causes the tracking of vehicles to be interrupted; once the light turns green, the cars move again and new tracks must be constructed for them. In summary, it has a bad effect on the continuous tracking of foreground objects and reduces the accuracy of some high-level understanding methods, such as [3, 4]. Unlike previous methods, this paper focuses on how to adjust the learning rate according to real-time and accurate physical-world signals from other sensors. This


technique guarantees the continuous tracking of vehicles in the traffic junction.

In this paper, we select the common traffic light as the external signal to improve the results of foreground detection. Meanwhile, to divide input images into reasonable regions, the road line detection result is used. Because the camera is static, road lines are detected only once in the whole monitoring period. If the system receives a red light signal, it indicates that vehicles will slow down and stop after some frames. To keep these vehicles of interest from blending into the background and losing their existing tracks, we decrease the learning rates of pixels in the red light region while the rates of other pixels remain unchanged. When the system receives a green light signal, it means that vehicles run through the intersection at normal speed, and normal learning rates are selected. Experimental results show that this kind of environment information can greatly improve the results of background subtraction and foreground detection.

We have previously done some work on improving detection results by combining the video sequence with physical-world information, published in [5]. This paper is an expanded version of [5]: it analyses the method in more detail, sets up more contrast experiments, and adds a quantitative evaluation of the experimental results. The remainder of this paper is organised as follows. In Section 2, a compact review of important developments and existing improvements in background subtraction and foreground detection is presented. In Section 3, we present the proposed framework, outline three classical methods, and illustrate how our approach makes these methods perform better. Further explanation of our method in a practical application scenario is supplied in Section 4, where the contrast experiment results between the previous methods and our method are also presented. The last section gives conclusions and future work.

2. Related Research

There have been numerous works devoted to the development of background subtraction for real-time video processing; several surveys of this topic can be found in [6, 7]. Statistical tools provide a good framework for modelling the background of a complex traffic scene, and many methods have been developed.

Among these methods, the Gaussian mixture model (GMM), first presented in [8], models the distribution of the values observed at each pixel by a weighted mixture of Gaussians. GMM is able to cope with the multimodal nature of many practical situations and leads to good results when there are repetitive background motions, such as leaves shaking or water rippling. By far, GMM is the most researched and applied method, and many enhanced algorithms [9] have been proposed over the years; reviews of them can be found in [7, 10]. The weakness of GMM lies in its strong assumption that the background is more frequently visible than the foreground and that its variance is significantly lower. Also, the initialization of the model and the estimation of the parameters are problematic and uncertain

in different real-world environments. Therefore, traditional GMM-based methods with empirical values are usually not competent to produce good background subtraction results. To avoid the difficulty of finding an appropriate shape for the probability density function, nonparametric methods using kernel density estimation to model background distributions have been proposed. These methods build a histogram of background values for each pixel by collecting values sampled from the pixel's recent time window [11]. In [12], a Bayesian framework which incorporates multiple types of features for modelling complex backgrounds is proposed (we abbreviate the foreground detection method in [12] as FGD) and solves sudden once-off background changes effectively.

All the background subtraction methods mentioned above still share a common problem: how to select an adaptive learning rate. Explicitly, background modelling methods with a global empirical learning rate are significantly penalized. Over the years, many research papers have discussed the adaptive learning rate and proposed various solutions for tuning it based on local intensity changes [13, 14], different levels of feedback [15, 16], and so on. Based on GMM, [17] proposed a background subtraction method using a pixel-wise adaptive learning rate for object tracking. Unlike traditional methods that use the same empirical learning rate, it assigns a learning rate to each pixel relying on two parameters: one depends on the difference of pixel intensities between the background model and the current frame, and the other depends on the duration for which the pixel has been classified as a background pixel. In [18], the learning rates for the mean and the variance terms are decoupled and independent so as to avoid the saturation phenomenon and degeneracy problem; the authors use an adaptive learning rate to update the mean and a semiparametric model for the variance. The authors of [19] use the time gap between moving and stopped objects to train the background model and obtain adaptive parameters for urban traffic video. Considering the slow learning problem of GMM in the beginning phase, KaewTraKulPong and Bowden improved the update mechanism in the learning step and proposed a fast-learning Gaussian mixture model [20]. In [21], an enhanced Gaussian mixture model detects still objects from a moving state and adjusts the learning rate to improve the performance of detecting moving objects with intermittent stops. The authors of [22] modulated the learning rate of the background model based on scene activity. In [23], an updating method with an adaptive learning rate (we abbreviate this method as GMMX) is proposed to accurately segment objects that move slowly or stop for a while during moving.

Even though many background subtraction approaches with adaptive learning rates have been proposed and indeed improved the naive GMM, as mentioned above, they still have some limits and are not suitable for foreground detection at the intersection. In particular, we find that many stopped cars gradually fuse into the background and cannot be traced again with the previous methods. Firstly, most of these methods [14, 15, 17, 18, 21–23] perceive sudden changes based only on image information, such as illumination changes and


Figure 1: (a) shows several stopped vehicles during a red light; (b) shows the corresponding detection results. As time passes, some of the foreground objects disappear gradually.

background movements; therefore, the accuracy of perception is difficult to guarantee. Secondly, statistical methods often need a period of time to confirm and learn new changes, and during this period they generate many detection mistakes; thus, some instant adjustment mechanism is needed. Thirdly, image processing for perception needs additional computing, which aggravates the burden on the system's real-time performance. The problems mentioned above motivate us to propose a new method that perceives environment changes and adjusts the learning rate by integrating the traffic light signal into video sensor networks.

3. Our Approach

To the best of our knowledge, there have not been any methods that utilize the traffic light signal to enhance vision-based background subtraction at the intersection. This paper proposes to regard such physical-world signals as the criteria for adjusting background model parameters. There are several parameters in background modelling methods; the learning rate $\alpha$, which controls the updating speed of the model, is the most important of them. Stopped foreground objects will not fuse into the background if $\alpha$ is adjusted according to the traffic light signal. We use GMM [8], GMMX [23], and FGD [12] as experimental subjects and traffic light signals as external information. How these methods are enhanced by our approach is described in detail in the following text. Note that our method can work with other existing background subtraction approaches applied to intersection scenes as well.

3.1. The Proposed Framework. Figure 2 depicts the whole proposed framework for using the traffic signal to enhance vision-based background subtraction. Model initialization, the first step, assigns all needed parameters and initializes the background model. The system then gets a new input image from the video sensor networks and simultaneously receives physical-world signals. The next module adjusts the learning rate according to the traffic light signal. The following step is background subtraction, which is the same as in the previous methods; it outputs the foreground detection results. Meanwhile, the system updates the background model and waits for the next input image. The grey modules in the figure are newly added to the framework and distinguish it from previous methods.

As wireless sensors become widely available and their costs come down, traffic control systems are being integrated with wireless sensor networks (WSN) in ITS [24]. Our method is designed to obtain traffic light signals from the WSN; for convenience, we set the traffic light signal manually in the experiments. The learning rate is adjusted according to the traffic light signal: when it is red, a small constant is selected, and once the light turns green, a normal one is applied. Both values are determined empirically. The following content describes the improved methods in detail.
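
As a concrete illustration of the framework, the following minimal Python sketch wires a traffic-light signal into OpenCV's MOG2 background subtractor (a stand-in for the models discussed below, which expose a per-call learning rate the same way). The helper read_traffic_light(), the video path, and the specific rate values are our own placeholders, not part of the paper's system.

import cv2

ALPHA_NORMAL = 0.005   # empirical rate used while the light is green
ALPHA_RED = 0.0001     # small rate used while the light is red

def read_traffic_light():
    # Stub for the WSN query; a real system would poll the sensor network.
    return "green"

def select_learning_rate(light_state):
    """Adjust the learning rate from the physical-world signal (Figure 2)."""
    return ALPHA_RED if light_state == "red" else ALPHA_NORMAL

subtractor = cv2.createBackgroundSubtractorMOG2(detectShadows=False)
capture = cv2.VideoCapture("intersection.avi")  # placeholder path

while True:
    ok, frame = capture.read()
    if not ok:
        break
    alpha = select_learning_rate(read_traffic_light())
    # MOG2 accepts a learning rate on each apply() call, so the red-light
    # adjustment maps directly onto this parameter.
    foreground_mask = subtractor.apply(frame, learningRate=alpha)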

3.2. Improved GMM Modelling.


Figure 2: Framework of the proposed background subtraction (modules: model initialization; new input image; receive physical world signals; adjust learning rate; background subtraction; output; update background model).

In the context of a traffic surveillance system, Friedman and Russell [25] proposed to model each background pixel using a mixture of three Gaussians corresponding to road, vehicle, and shadows. Maintenance is performed using an incremental EM algorithm for real-time operation. Stauffer and Grimson [8] generalized this idea by modelling the recent history of the colour features of each pixel $\{X_1, \ldots, X_t\}$ by a mixture of $K$ Gaussians.

The intensity of each pixel in the RGB colour space is selected as the feature to classify. The probability of observing the current pixel value in the multidimensional case is given by

\[ P(X_t) = \sum_{i=1}^{K} \omega_{i,t}\, \eta(X_t, \mu_{i,t}, \Sigma_{i,t}), \quad (1) \]

where the parameters are the number of Gaussians $K$ and, for the $i$th Gaussian at time $t$, a weight $\omega_{i,t}$, a mean $\mu_{i,t}$, and a covariance matrix $\Sigma_{i,t}$; $\eta$ is the Gaussian probability density function

\[ \eta(X_t, \mu, \Sigma) = \frac{1}{(2\pi)^{n/2} |\Sigma|^{1/2}}\, e^{-\frac{1}{2}(X_t - \mu)^T \Sigma^{-1} (X_t - \mu)}. \quad (2) \]

For computational reasons, Stauffer and Grimson [8] assumed that the RGB colour components are independent and have the same variance, so the covariance matrix is of the form

\[ \Sigma_{i,t} = \sigma^2_{i,t} I. \quad (3) \]

The $K$ Gaussians are sorted in descending order of the ratio $\omega_j / \sigma_j$. The first $B$ Gaussian distributions whose cumulative weight exceeds a threshold $T$ are retained as the background distribution:

\[ B = \arg\min_b \left( \sum_{i=1}^{b} \omega_{i,t} > T \right). \quad (4) \]

The others are regarded as the foreground distribution. When a new frame arrives at time $t+1$, a match test is made for each pixel: a pixel matches a Gaussian distribution if

\[ \left( (X_{t+1} - \mu_{i,t})^T \Sigma^{-1}_{i,t} (X_{t+1} - \mu_{i,t}) \right)^{1/2} < k \sigma_{i,t}. \quad (5) \]

When a match is found with one of the $K$ Gaussians, the matched component is updated as follows:

\[ \omega_{i,t+1} = (1 - \alpha)\, \omega_{i,t} + \alpha, \]
\[ \mu_{i,t+1} = (1 - \rho)\, \mu_{i,t} + \rho X_{t+1}, \]
\[ \sigma^2_{i,t+1} = (1 - \rho)\, \sigma^2_{i,t} + \rho\, (X_{t+1} - \mu_{i,t+1})^T (X_{t+1} - \mu_{i,t+1}), \quad (6) \]

where $\alpha$ and $\rho$ are two learning rates. For each unmatched component, only the weight is updated:

\[ \omega_{j,t+1} = (1 - \alpha)\, \omega_{j,t}. \quad (7) \]

When no match is found, the least probable distribution is replaced with a new one with initial parameters.

In our improved method, the learning rate is adaptively tuned in accordance with external physical-world events. Once a new input image arrives, the system queries the traffic light to perceive environment changes and instantly makes the corresponding adjustment. The learning rate is changed as follows:

\[ \alpha = \begin{cases} \alpha_{\mathrm{red}}, & \text{when the light is red} \\ \alpha_{\mathrm{normal}}, & \text{when the light is green.} \end{cases} \quad (8) \]

When the light is green, $\alpha$ is set to $\alpha_{\mathrm{normal}}$, the constant used by the original methods. Once the light turns red, $\alpha$ is adjusted to $\alpha_{\mathrm{red}}$, a small constant depending on the duration of the red light. From Figure 2, we can see that only the steps of receiving the signal and adjusting the learning rate are added, so this idea of improving models by integrating the traffic light signal can be generalized to many other methods.
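
To make the update rules concrete, here is a small sketch (our illustration, not the authors' code) of equations (6) and (8) for a single matched component; for brevity, $\rho$ is treated as a constant rather than derived from the component likelihood.

import numpy as np

def update_matched_component(pixel, mean, var, weight, light_is_red,
                             alpha_normal=0.005, alpha_red=0.0001, rho=0.01):
    """Update one matched Gaussian per equations (6) and (8).

    pixel and mean are length-3 RGB vectors; var is the scalar variance
    of the component (Sigma = var * I, equation (3))."""
    alpha = alpha_red if light_is_red else alpha_normal   # equation (8)
    weight = (1.0 - alpha) * weight + alpha               # weight update
    mean = (1.0 - rho) * mean + rho * pixel               # mean update
    diff = pixel - mean
    var = (1.0 - rho) * var + rho * float(diff @ diff)    # variance update
    return mean, var, weight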

3.3. Improved GMMX Modelling. The original GMM has many limitations, such as the number of Gaussians having to be predetermined, the need for good initialization, dependence of the results on the true distribution law (which can be non-Gaussian), and slow recovery from failures. To alleviate these disadvantages, numerous improvements have been proposed over recent years. In this paper, we choose as the comparative experiment a method abbreviated as GMMX [23], which has an outstanding performance in


detecting temporarily stopped objects with an adaptive learning rate.

The main contributions of [23] are a model-number-adaptive method to decrease the amount of computation and an updating method with an adaptive learning rate to accurately segment objects that move slowly or stop for a while during moving (here we are only interested in the second method). The authors argue that a fixed learning rate causes moving objects that stop for a short time to be rapidly updated into the background model by the GMM; thus, different learning rates should be assigned to different distributions. When a new match is found at time $t$, the learning rate $\alpha_t$ is changed as follows:

\[ \alpha_t = \max(\alpha_0, \omega_M \cdot \alpha), \quad (9) \]

where $\omega_M$ is the weight of the matched distribution. The value of $\alpha$ should be higher than $\alpha_0$; both are constants. The reason for evaluating $\alpha_t$ as the maximum of $\alpha_0$ and $\omega_M \cdot \alpha$, rather than as $\omega_M \cdot \alpha$ alone, is that when $\omega_M$ has a very low value, $\alpha_t$ would be almost zero, which would make an object that stays for a long time difficult to update into the background model.

The traffic light signal can still be combined with GMMX by changing $\alpha_t$ as follows:

\[ \alpha_t = \begin{cases} \min(\alpha_{\mathrm{red}}, \max(\alpha_0, \omega_M \cdot \alpha)), & \text{when the light is red} \\ \max(\alpha_0, \omega_M \cdot \alpha), & \text{when the light is green,} \end{cases} \quad (10) \]

where $\alpha_{\mathrm{red}}$ is a small constant that, in accordance with the external signal, prevents objects from fusing into the background. Because the red light may last for dozens of seconds, the value of $\alpha_t$ should not exceed $\alpha_{\mathrm{red}}$; otherwise GMMX also lets stopped objects disappear, especially at a later stage when the matched component has a large $\omega_M$. Thus, external signals are needed to make the output more accurate.
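
Equation (10) reduces to a one-line cap on the GMMX rate; the sketch below (with illustrative constants of our choosing) shows the computation.

def gmmx_learning_rate(omega_matched, light_is_red,
                       alpha_0=0.0001, alpha=0.005, alpha_red=0.0001):
    """Equations (9)-(10): adaptive GMMX rate, capped at alpha_red on red."""
    rate = max(alpha_0, omega_matched * alpha)     # equation (9)
    return min(alpha_red, rate) if light_is_red else rate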

3.4. Improved FGD Modelling. Li et al. proposed to classify background and foreground pixels under the Bayes decision theory [12]. Let $v_t$ be a discrete-valued feature vector extracted from an image sequence at pixel $s = (x, y)$ and time instant $t$. According to the Bayes rule, the posterior probability that $v_t$ comes from the background $b$ or the foreground $f$ is

\[ P(C \mid v_t, s) = \frac{P(v_t \mid C, s)\, P(C \mid s)}{P(v_t \mid s)}, \quad C = b \text{ or } f. \quad (11) \]

Using the Bayes decision rule, a pixel $s$ is classified as background according to its feature vector $v_{t,s}$ observed at time $t$ if

\[ P(b \mid v_{t,s}) > P(f \mid v_{t,s}). \quad (12) \]

Note that the feature vectors associated with the pixel $s$ come either from the background or from foreground objects, so it follows that

\[ P(v_t \mid s) = P(v_t \mid b, s) \cdot P(b \mid s) + P(v_t \mid f, s) \cdot P(f \mid s). \quad (13) \]

Substituting (11) and (13) into (12), the decision rule becomes

\[ 2 P(v_t \mid b, s) \cdot P(b \mid s) > P(v_t \mid s). \quad (14) \]

In this method, the colours of a pixel are chosen as the feature for the stationary background, while the colour cooccurrences of interframe changes at the pixel are chosen as the feature for the moving background. A table of statistics over the possible principal features is maintained for each feature type at $s$, denoted as

\[ S^{s,t,i}_{v_t} = \begin{cases} p^{t,i}_{v} = P(v^i_t \mid s) \\ p^{t,i}_{v,b} = P(v^i_t \mid b, s) \\ v^i_t = [a^i_1, \ldots, a^i_n]^T. \end{cases} \quad (15) \]

For each feature vector $v_t$ that is used to classify a pixel as foreground or background, the statistics of the corresponding features (colour or colour cooccurrence) are updated by

\[ p^{s,t+1}_{b} = (1 - \alpha_2)\, p^{s,t}_{b} + \alpha_2 M^{s,t}_{b}, \]
\[ p^{s,t+1,i}_{v} = (1 - \alpha_2)\, p^{s,t,i}_{v} + \alpha_2 M^{s,t,i}_{v}, \]
\[ p^{s,t+1,i}_{v,b} = (1 - \alpha_2)\, p^{s,t,i}_{v,b} + \alpha_2 \left( M^{s,t}_{b} \wedge M^{s,t,i}_{v} \right), \quad (16) \]

where $\alpha_2$ is the learning rate which controls the speed of feature learning. $M^{s,t}_{b} = 1$ when $s$ is labelled as background at time $t$, and $M^{s,t}_{b} = 0$ otherwise; $M^{s,t,i}_{v} = 1$ when $v^i_t$ of $S^{s,t,i}_{v_t}$ in (15) matches $v_t$ best, and $M^{s,t,i}_{v} = 0$ for the remainder. A reference background image that represents the most recent appearance of the background is maintained at each time step to make the background difference accurate. If $s$ is detected as a point of insignificant change in change detection, the reference background image is updated as

\[ B(s, t+1) = (1 - \alpha_1)\, B(s, t) + \alpha_1 I(s, t). \quad (17) \]

The traffic light signal is added to FGD just as it is to GMM in Figure 2. The external signal is mainly used to adjust the feature learning rate $\alpha_2$:

\[ \alpha_2 = \begin{cases} \alpha_{\mathrm{red}}, & \text{when the light is red} \\ \alpha_{\mathrm{normal}}, & \text{when the light is green.} \end{cases} \quad (18) \]

When the light is red, a small constant is used to keep the stopped cars from vanishing quickly; once the signal changes, $\alpha_2$ is set back to its normal value. This method not only ensures the robustness of foreground detection but also eliminates the loss of slow-moving or stopped vehicles in the results.

4. Experimental Results

To make the experiment more convincing, we recorded the video at a real traffic junction where there is a traffic light. We then ran the programs of GMM, GMMX, FGD, and their enhanced versions, and collected and analysed the execution outcomes.


Figure 3: (1)–(4) are four original frames (2122, 2256, 2381, and 2417) selected randomly from the test video while the light is red; these images show five cars that slow down and stop successively. The other rows are the foreground detection results of the three original methods and the corresponding improved methods: (a)–(d) are the results of GMM, (e)–(h) of improved GMM, (i)–(l) of GMMX, (m)–(p) of improved GMMX, (q)–(t) of FGD, and (u)–(x) of improved FGD.


Figure 4: Overall performance (FP and FN) on the twelve selected frames for the three previous methods (GMM, GMMX, and FGD) and their improvements.

Figure 5: FN per method across frames 2122–2672. The horizontal axis is the frame number; the vertical axis is the FN value (the number of foreground pixels that are wrongly marked as background). Different colours represent the different methods (GMM, improved GMM, GMMX, improved GMMX, FGD, and improved FGD).

4.1. Data and Qualitative Results. The test video, which consists of 2752 frames of 640 × 480 pixels acquired at 25 fps (frames per second), was taken at a busy intersection with a traffic light. In the original images, strong shadows cast by moving vehicles can be observed throughout the sequence, but shadow removal is beyond the scope of this paper, and we do not perform it in any of the experiments.

The detailed selection of important parameters is as follows. In GMM, we choose $K = 5$, $T = 0.6$, and $k = 2.5$; the learning rate is set to 0.005, $\sigma$ is initialized to 30, and the initial weight associated with each Gaussian is 0.05. To make the experiment more comparable, GMMX uses the same parameter values as GMM. In their enhanced versions, when the light is red, the learning rate is changed to 0.0001. In FGD, there are 64 and 32 bins in the joint histograms for colour and colour cooccurrence vectors, respectively. To make the computation and storage efficient, we set $N_1 = 15$ and $N_2 = 30$ for colour features and $N_1 = 25$ and $N_2 = 40$ for colour cooccurrence features. The background updating rate $\alpha_2$ is set to 0.01. In accordance with the original work, we initialize the prior and conditional probabilities as $p^{s,0}_{b} = 0$, $p^{s,0,i}_{v} = 0$, and $p^{s,0,i}_{v,b} = 0$ for $i = 1, \ldots, N_2$ and $v_t = \{c_t, cc_t\}$. In the same way, improved FGD uses the same parameter values as FGD; only when the light is red is $\alpha_2$ set to 0.0001.
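
For readability, the parameter values listed above can be gathered in one place; the values are quoted from the text, while the grouping and names below are our own.

# Values quoted from the text; the dictionary layout is our own.
GMM_PARAMS = {
    "K": 5, "T": 0.6, "k": 2.5,
    "alpha_normal": 0.005, "alpha_red": 0.0001,
    "sigma_init": 30, "initial_weight": 0.05,
}
FGD_PARAMS = {
    "color_bins": 64, "cooccurrence_bins": 32,
    "N1_color": 15, "N2_color": 30,
    "N1_cooccurrence": 25, "N2_cooccurrence": 40,
    "alpha2_normal": 0.01, "alpha2_red": 0.0001,
}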

In Figure 3, the first row contains four original frames selected randomly from the test video sequence: frames 2122, 2256, 2381, and 2417. These pictures show five cars that slow down and stop successively. The images below them are the detection results of the three initial approaches and their enhanced versions. We can see that the enhanced methods succeed in keeping the stopped vehicles in the detection results, while the outputs of the previous methods show that the cars at the front of every image have fused into the background. Our approach obviously improves the detection results.

4.2. Quantitative Evaluation. In the evaluation, we choose 12 frames from a segment of the test video during which the light is red for about 25 seconds; in other words, a frame is selected every 2 seconds. We use these images to analyse the performance of our method.

Three terms are used in the quantitative evaluation: false positive (FP) is the number of background pixels that are wrongly marked as foreground; false negative (FN) is the number of foreground pixels that are wrongly marked as background; total error (TE) is the sum of FP and FN. We calculate these terms for each image against the corresponding hand-segmented ground truth; the FN, FP, and TE of each approach are then summed over the selected frames.
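
Given a binary detection mask and its hand-segmented ground truth, these three terms follow directly from the definitions above; a minimal sketch:

import numpy as np

def segmentation_errors(detected, ground_truth):
    """FP, FN, and TE for boolean foreground masks of equal shape."""
    fp = int(np.count_nonzero(detected & ~ground_truth))  # background marked foreground
    fn = int(np.count_nonzero(~detected & ground_truth))  # foreground marked background
    return fp, fn, fp + fn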

Figure 4 illustrates the overall performance on the selected twelve frames for the three previous methods and their improvements. The total error of the improved versions is less than that of the previous ones. In particular, the new approach reduces FN vastly, which means that stopped cars do not disappear before the light turns green; the tracking of foreground objects is therefore not interrupted, which supplies a solid foundation for much high-level image processing. However, FP increases slightly, which is the next problem we intend to solve.

In Figure 5, the FN of the different methods is plotted against the frame number. We can see that the FN of the previous methods increases rapidly, which means that the foreground objects disappear from the detection results, while the FN of the improved ones stays low. This phenomenon is more obvious in the later period of the red light.

In summary, we conclude that our method can effectively improve the accuracy and reliability of foreground segmentation. Meanwhile, various background modelling methods are able to benefit from physical-world signals.

5. Conclusions

We have presented a novel method that utilises the traffic light signal to enhance the performance of background subtraction, whereas


existing methods use only image information to model and update the reference background. This paper also records the experimental process and results in detail. Compared with the FN, FP, and TE of the previous methods (GMM, GMMX, and FGD), the corresponding enhanced versions obviously perform better, which suggests that background subtraction methods based on the traffic light signal have a bright future. Considering that different pixels have different characteristics of colour change, it would be better to set different, more reasonable learning rates based on these signals for each pixel rather than using a few constants for all pixels; this is our future research focus. Meanwhile, in order to combine background modelling methods with physical-world signals more closely, we will consider more relations between the model parameters and these signals to obtain a better effect.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work was supported by the fund of the State Key Laboratory of Software Development Environment (Grant no. SKLSDE-2012ZX-01) and the Fundamental Research Funds for the Central Universities.

References

[1] M. M. Trivedi, I. Mikic, and G. Kogut, "Distributed video networks for incident detection and management," in Proceedings of the IEEE Intelligent Transportation Systems, pp. 155–160, October 2000.
[2] M. M. Trivedi, T. L. Gandhi, and K. S. Huang, "Distributed interactive video arrays for event capture and enhanced situational awareness," IEEE Intelligent Systems, vol. 20, no. 5, pp. 58–65, 2005.
[3] T. A. C. Bragatto, G. I. S. Ruas, V. A. P. Benso et al., "A new approach to multiple vehicle tracking in intersections using harris corners and adaptive background subtraction," in Proceedings of the IEEE Intelligent Vehicles Symposium (IV '08), pp. 548–553, June 2008.
[4] C. Zhang, S. C. Chen, M. L. Shyu et al., "Adaptive background learning for vehicle detection and spatio-temporal tracking," in Proceedings of the 4th Pacific Rim Conference on Multimedia Information, Communications and Signal Processing, vol. 2, pp. 797–801, 2003.
[5] R. Ding, X. Liu, W. Cui, and Y. Wang, "Intersection foreground detection based on the cyber-physical systems," in Proceedings of the IET International Conference on Information Science and Control Engineering, pp. 1881–1886, Shenzhen, China, 2012.
[6] M. Hedayati, W. M. D. W. Zaki, and A. Hussain, "A qualitative and quantitative comparison of real-time background subtraction algorithms for video surveillance applications," Journal of Computational Information Systems, vol. 8, no. 2, pp. 493–505, 2012.
[7] T. Bouwmans, F. El Baf, and B. Vachon, "Background modeling using mixture of Gaussians for foreground detection - a survey," Recent Patents on Computer Science, vol. 1, no. 3, pp. 219–237, 2008.
[8] C. Stauffer and W. E. L. Grimson, "Adaptive background mixture models for real-time tracking," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '99), pp. 246–252, June 1999.
[9] Y.-J. X. Hou and S.-R. Gong, "Adaptive shadows detection algorithm based on Gaussian Mixture Model," in Proceedings of the International Symposium on Information Science and Engineering (ISISE '08), pp. 116–120, December 2008.
[10] T. Bouwmans and F. Baf, "Statistical background modeling for foreground detection: a survey," Handbook of Pattern Recognition and Computer Vision, vol. 4, pp. 181–199, 2010.
[11] A. Tavakkoli, M. Nicolescu, G. Bebis, and M. Nicolescu, "Non-parametric statistical background modeling for efficient foreground region detection," Machine Vision and Applications, vol. 20, no. 6, pp. 395–409, 2009.
[12] L. Li, W. Huang, I. Y.-H. Gu, and Q. Tian, "Statistical modeling of complex backgrounds for foreground object detection," IEEE Transactions on Image Processing, vol. 13, no. 11, pp. 1459–1472, 2004.
[13] Y.-H. Li, H.-F. Tian, and Y. Zhang, "An improved Gaussian mixture background model with real-time adjustment of learning rate," in Proceedings of the International Conference on Information, Networking and Automation (ICINA '10), pp. V1512–V1515, October 2010.
[14] M. Shah, J. Deng, and B. J. Woodford, "Localized adaptive learning of Mixture of Gaussians models for background extraction," in Proceedings of the 25th International Conference of Image and Vision Computing New Zealand (IVCNZ '10), November 2010.
[15] A. Pnevmatikakis and L. Polymenakos, "2D person tracking using Kalman filtering and adaptive background learning in a feedback loop," Lecture Notes in Computer Science, vol. 4122, pp. 151–160, 2007.
[16] H.-H. Lin, J.-H. Chuang, and T.-L. Liu, "Regularized background adaptation: a novel learning rate control scheme for Gaussian mixture modeling," IEEE Transactions on Image Processing, vol. 20, no. 3, pp. 822–836, 2011.
[17] K. K. Ng and E. J. Delp, "Background subtraction using a pixel-wise adaptive learning rate for object tracking initialization," in Proceedings of Visual Information Processing and Communication II, January 2011.
[18] P. L. M. Bouttefroy, A. Bouzerdoum, S. L. Phung, and A. Beghdadi, "On the analysis of background subtraction techniques using Gaussian mixture models," in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '10), pp. 4042–4045, March 2010.
[19] P.-S. Hwang, K.-Y. Eom, J.-Y. Jung, and M.-H. Kim, "A statistical approach to robust background subtraction for urban traffic video," in Proceedings of the 2nd International Workshop on Computer Science and Engineering (WCSE '09), pp. 177–181, October 2009.
[20] P. KaewTraKulPong and R. Bowden, "An improved adaptive background mixture model for real-time tracking with shadow detection," in Video-Based Surveillance Systems, pp. 135–144, 2002.
[21] H. Fu, H. Ma, and A. Ming, "EGMM: an enhanced Gaussian mixture model for detecting moving objects with intermittent stops," in Proceedings of the 12th IEEE International Conference on Multimedia and Expo (ICME '11), July 2011.
[22] M. Harville, G. Gordon, and J. Woodfill, "Adaptive video background modeling using color and depth," in Proceedings of the IEEE International Conference on Image Processing (ICIP '01), pp. 90–93, October 2001.
[23] P. Suo and Y. Wang, "An improved adaptive background modeling algorithm based on Gaussian mixture model," in Proceedings of the 9th International Conference on Signal Processing (ICSP '08), pp. 1436–1439, October 2008.
[24] M. Tubaishat, Z. Peng, Q. Qi, and S. Yi, "Wireless sensor networks in intelligent transportation systems," Wireless Communications and Mobile Computing, vol. 9, no. 3, pp. 287–302, 2009.
[25] N. Friedman and S. Russell, "Image segmentation in video sequences: a probabilistic approach," in Proceedings of the 13th Conference on Uncertainty in Artificial Intelligence (UAI '97), pp. 175–181, San Francisco, Calif, USA, 1997.
