Seminar Report


Pedestrian Detection System

SEMINAR REPORT ON Pedestrian Detection System

Guided by Prof. S. V. Jagtap

Submitted by Shikha Sharma, B.E - I.T, Roll No-52

Bansilal Ramnath Agarwal Charitable Trust's Vishwakarma Institute of Technology, PUNE 411 037


CERTIFICATE

This is to certify that the seminar titled Pedestrian Detection System has been successfully presented and submitted by Shikha Sharma (J-52)

in partial fulfillment of the Bachelor degree in Information Technology as prescribed by the University of Pune in the academic year 2011-12.

Date: 7-11-2011    Prof. S. V. Jagtap (Guide)    Head of Department

Acknowledgement

On the submission of my seminar report, I would like to extend my gratitude and sincere thanks to my seminar guide Prof. S. V. Jagtap for her constant motivation and support during the course of my work. I truly appreciate and value her esteemed guidance and encouragement from the beginning to the end of this seminar report. She has been the source of inspiration throughout the seminar work, and without her invaluable advice and assistance it would not have been possible for me to complete this seminar report.

Shikha Sharma

Contents

1. What and Why Pedestrian Detection System
1.1 What is Pedestrian Detection System?
1.2 Architecture of Pedestrian Detection System
1.3 Why Pedestrian Detection System
2. INTRODUCTION
2.1 Context
2.2 Overview Diagram
2.3 Benefits
3. Functional Block Diagram
3.1 Block Diagram
3.2 Explanation
3.3 Working Mechanism
4. Description
4.1 Edge based target recognition
4.1.1 Naive Approach: Binary Matching
4.1.2 Chamfer Distance
4.1.3 Hausdorff Measure
4.1.4 Partial Hausdorff Measure
4.1.5 Distance transform
4.2 Matching Multiple Templates
4.3 AdaBoost Algorithm
5. Pedestrian Detection methods
5.1 Range Sensor Based method
5.2 Vision Sensor Based Method
6. Implementation
6.1 Monocular Camera
6.2 Stereo Camera
6.3 Infrared Camera
6.4 Image Processor
7. Observation
7.1 Overall Scenario
7.2 Advantages
7.3 Disadvantages
8. Improvement
9. Conclusion and Future scope
10. References

List of figures

1. Architecture of Pedestrian Detection System
2. Pedestrian Detection
3. Overview Diagram
4. Block Diagram
5. Edge Template
6. Distance Transform
7. Lookup Table
8. Tree Structure
9. Laser Scanner Map
10. Pedestrian detection Using Vision Camera
11. Pedestrian detection Using Infrared Camera
12. Monocular Camera
13. Stereo Camera
14. Infrared Camera
15. Image Processor
16. Range of Pedestrian Detection System

1. What And Why Pedestrian Detection System

1.1 What is Pedestrian Detection

Pedestrian detection is an essential and significant task in any intelligent video surveillance system, as it provides the fundamental information for semantic understanding of the video footage. This system is designed to protect pedestrians on urban streets in unfavorable conditions. The aim is to develop active (video-based) driver assistance systems which detect dangerous situations involving pedestrians ahead of time, allowing the possibility to warn the driver or to automatically control the vehicle (e.g. braking). Such systems are particularly valuable when the driver is distracted or visibility is poor.

The system uses a computer fed by information from a wide-angle radar system that detects objects and monitors their speed and distance from the car, and from a camera fitted near the rear view mirror. Using this information the computer identifies the objects and determines if they are on a collision path. If a collision is imminent, the car gives the driver an audible and visual warning and brakes hard if the driver does not react quickly enough.

Detectors are trained to search for pedestrians in the video frame by scanning the whole frame. The detector fires if the image features inside the local search window meet certain criteria. Some methods employ global features such as edge templates.

1.2 Architecture of Pedestrian Detection System

Fig:1 Architecture of Pedestrian Detection System

1.3 Why Pedestrian Detection System

Almost two-thirds of the 1.2 million people killed annually in road traffic crashes worldwide are pedestrians. Despite the magnitude of the problem, most attempts at reducing pedestrian deaths have focused solely on education and traffic regulation. Hence, there is a need for a system which would help reduce the probability of collision and the injury level, by assisting the driver and taking necessary action in case of no response from the driver.

Pedestrians have the absolute right of way and, except in the most extreme circumstances (a child dashing out from between parked cars, for example), the law will find the driver culpable in an accident. The presumption is that pedestrians are defenseless against automobiles and must be protected. Even so, US federal safety statistics suggest that pedestrians are more often to blame in these incidents. In 2009, nearly 40% of pedestrian fatalities were caused by pedestrians' improper crossing or walking/playing/working in the roadway, according to the National Highway Traffic Safety Administration.

Recently, efforts have been made to develop technology that ensures human safety in a vehicle accident. In particular, it is required to protect a pedestrian from fatal injury when he/she collides with the vehicle, as well as to ensure the safety of an occupant in the vehicle. One method considered for protecting a pedestrian who collides with the vehicle and then falls on its hood is to reduce the injury level (i.e. the strength of the impact caused by the collision). By reducing the injury level, the pedestrian may be protected from fatal injury.

A vehicle such as an automobile is equipped with a camera (visible camera, infrared camera, far infrared camera) serving as an image pickup device, and with a pedestrian detection processing unit for recognizing a pedestrian from the image of the area ahead of or around the vehicle picked up by the camera. These are the main features of the system.


2. INTRODUCTION

2.1 Context

The pedestrian detection system is a video-based driver assistance system for the detection of potentially dangerous situations with pedestrians, in order to either warn the driver or, if there is no time left for a warning, initiate appropriate protective measures (e.g. automatic vehicle braking). The use of video sensors comes quite naturally for this problem; they provide texture information at fine horizontal and vertical resolution, which in turn enables the use of discriminative pattern recognition techniques for distinguishing pedestrians from other static and dynamic objects in the traffic environment. The human visual perception system is perhaps the best example of what performance might be possible with such sensors, if only the appropriate processing were used. Yet the pedestrian application is very challenging from a machine vision perspective. It combines the difficulties of a moving camera, a wide range of possible (deformable) object appearances, cluttered backgrounds, stringent performance criteria and hard real-time constraints.

Fig:2 Pedestrian Detection

A vehicle such as an automobile is equipped with a camera (visible camera, infrared camera, far infrared camera) serving as an image pickup device, which picks up the image of an area ahead of or around the vehicle, and with a pedestrian detection processing unit, which recognizes a pedestrian from that image. These are the main features of the system. A visible camera picks up the image directly; an infrared camera picks up the image by capturing the small amount of infrared rays that an object generates even in the dark. Infrared rays are electromagnetic waves having a longer wavelength than visible rays.

The pedestrian recognition/processing unit includes a computer such as a microcomputer. The computer includes a central processing unit (CPU) for controlling the entire system according to a control program stored in ROM (read only memory). The ROM is used for storing the control program and fixed data such as a whole-body model and a head model of a pedestrian. A RAM (random access memory) serves as a temporary storage unit during processing.

2.2 Overview diagram

Sensing

Regions Of Interest

Features Extraction

Classification

Non Pedestrian / Pedestrian

Fig 3. Overview Diagram

2.3 Benefits

- Reduces pedestrian accidents by focusing on areas that are prone to collisions.
- Implements multiple safety solutions with a single product.
- Interfaces with existing system components, requiring minimal additional equipment.
- Mounts on existing intersection poles.
- Agencies can customize the types of alert response (audible/visual alerts, speed governing, etc.). It may be set up to sound alerts by voice annunciation, beeps, flashing LEDs, or any combination.
- All alerts and warnings can also be remotely monitored in real time, with a running log of alert activity.
- Capable of recognizing when turns are being initiated early enough to receive pedestrian alerts and early enough for the driver to safely react.

3. Functional Block Diagram

3.1 Block Diagram

Figure 4 briefly summarizes a prototype of a stationary-camera pedestrian detection system implemented using a combination of a CPU and an FPGA.

Figure 4: Block diagram of proof-of-concept pedestrian detection application using an FPGA and a CPU.

3.2 Explanation

In the figure, the Pre-processing block comprises operations such as scaling and noise reduction, intended to improve the quality of the image. The Image Analysis block incorporates motion detection, pixel statistics such as averages, color information, edge information, etc. At this stage of processing, the image is divided into small blocks. The object segmentation step groups blocks having similar statistics and thus creates an object. The statistics used for this purpose are based on user-defined features specified in the hardware configuration file.

The Identification and Meta Data generation block generates analysis results from the identified objects, such as location, size, color information, and statistical information. It puts the analysis results into a structured data format and transmits them to the CPU.

Finally, the On-screen Display block receives command information from the host and superimposes graphics on the video image for display.

Field programmable gate arrays (FPGAs) are flexible logic chips that can be reconfigured at the gate and block levels. This flexibility enables the user to craft computation structures that are tailored to the application at hand. It also allows selection of I/O interfaces and on-chip peripherals matched to the application requirements. The ability to customize compute structures, coupled with the massive amount of resources available in modern FPGAs, yields high performance together with good cost- and energy-efficiency.

3.3 Working Mechanism

To detect pedestrians in a video sequence, we learn a foreground model of the motion and appearance of pedestrians from example video sequences. The pedestrian detector builds on earlier face detection work. The earlier work is extended by using motion information as well as appearance information. This is in contrast to most prior work, which attempts to build a model of the background.

The learned detector uses a set of simple motion and appearance features. The appearance features are simple rectangle features acting on a single frame of the video. The motion features are simple rectangle features acting on the difference image between successive frames of the video. The optimal set of motion and appearance features is learned from a large library of possible features using the AdaBoost learning algorithm.
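As a rough illustration of the two feature types described in this section, the sketch below evaluates one rectangle feature on a single frame (appearance) and on the absolute difference of two successive frames (motion). It is a minimal sketch, not the detector's actual feature set: the summed-area table, the left-minus-right rectangle layout, the function names and the tiny frames are all assumptions made for the example.

```python
def integral_image(img):
    """Summed-area table with one extra zero row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the pixels in the w x h rectangle whose top-left corner is (x, y)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Left half minus right half of a w x h window (w is assumed to be even)."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)


def difference_image(prev, curr):
    """Absolute per-pixel difference between two successive frames."""
    return [[abs(c - p) for p, c in zip(pr, cr)] for pr, cr in zip(prev, curr)]


# Two hypothetical 4x2 grey-level frames: a bright region grows to the left.
frame_a = [[0, 0, 9, 9],
           [0, 0, 9, 9]]
frame_b = [[0, 9, 9, 9],
           [0, 9, 9, 9]]

appearance = two_rect_feature(integral_image(frame_b), 0, 0, 4, 2)
motion = two_rect_feature(
    integral_image(difference_image(frame_a, frame_b)), 0, 0, 4, 2)
print(appearance, motion)
```

The same rectangle evaluated on the difference image responds only where the scene changed between frames, which is why the two feature types carry complementary information.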

4. Description

4.1 Edge Based Target Recognition

Initially, we want to determine the presence and location of a template T in an image I.

Fig 5. Template

Our template T is an edge map; many such edge maps are stored in the ROM. The edge map created from the image is called the feature image I. To complete this initial step, T is slid over I until it delivers the best match. This requires searching, for each template pixel, for the closest image pixel (the distance between template and image). There are a number of ways to do this.

4.1.1 Naive Approach: Binary Matching

A match is determined by counting the pixels that match between the template and the edge image. If this count is high enough (close to the count of pixels in the template), then we have a match. This approach only works well if the template has exactly the same size, shape and orientation as the object in the image. It also gives no information about how far off the non-matching pixels are.

4.1.2 Chamfer Distance

In this method we let T be our template and let I be the image's edge map. The Chamfer distance is the average distance from each template pixel to the nearest feature (edge pixel) in the image.

This method does not handle occlusion well.

4.1.3 Hausdorff Measure

In this method we let M be the set of object model pixels and let I be the set of image edge pixels. The measure h(M,I) is the distance from the worst-matching object pixel to its closest image pixel, i.e. the maximum distance between template and image. The problem with this approach is that the Hausdorff measure assumes that each object pixel occurs in the image. This is obviously not true when an object is occluded, so the method does not handle occlusion at all.
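The usual way to write this measure is $h(M, I) = \max_{m \in M} \min_{i \in I} \lVert m - i \rVert$. The following brute-force sketch, with hypothetical pixel lists, shows how hiding part of the object inflates the measure even though every visible pixel matches perfectly.

```python
import math


def directed_hausdorff(model, image_edges):
    """Distance from the worst-matching model pixel to its closest image pixel."""
    return max(
        min(math.hypot(mx - ix, my - iy) for ix, iy in image_edges)
        for mx, my in model
    )


# Hypothetical 4-pixel vertical model.
model = [(0, 0), (0, 1), (0, 2), (0, 3)]
fully_visible = [(0, 0), (0, 1), (0, 2), (0, 3)]
lower_half_occluded = [(0, 0), (0, 1)]  # the two lower pixels are hidden

print(directed_hausdorff(model, fully_visible))        # perfect match
print(directed_hausdorff(model, lower_half_occluded))  # dominated by hidden pixels
```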

4.1.4 Partial Hausdorff Measure

The Partial Hausdorff Measure considers only the K object pixels that are closest to the image and returns the distance of the K-th closest match. K can be tuned to the minimum number of pixels that we expect to find in an image. K can also be set higher to reduce the rate of false positives, but we might miss some matches that way. Hence, occlusion can be handled by adjusting K.

4.1.5 Distance Transform

For each image we first compute the image's edge map. We then compute the Distance Transform (DT), which is an intensity map that marks, for every pixel, the distance to the closest pixel on the edge map.

Fig 6. Distance Transform

This method provides us with inherent distance information that can be used by our template matching algorithm. It acts as our lookup-table for finding the distance of the closest matching object pixel that we previously needed to search for manually.
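A small sketch of this lookup-table idea follows. The distance transform is computed once per image (here by brute force, which is only practical for tiny images; real systems use a fast two-pass algorithm), after which the Chamfer distance of Section 4.1.2 and the partial Hausdorff measure of Section 4.1.4 reduce to table lookups. The function names, the 1-based K convention and the 4x4 edge map are assumptions for illustration, and the edge map is assumed to contain at least one edge pixel.

```python
import math


def distance_transform(edge_map):
    """Brute-force Euclidean distance transform of a binary edge map."""
    h, w = len(edge_map), len(edge_map[0])
    edges = [(x, y) for y in range(h) for x in range(w) if edge_map[y][x]]
    return [[min(math.hypot(x - ex, y - ey) for ex, ey in edges)
             for x in range(w)] for y in range(h)]


def template_distances(dt, template, offset):
    """Sorted DT values under the template pixels: one lookup per pixel."""
    ox, oy = offset
    return sorted(dt[y + oy][x + ox] for x, y in template)


def chamfer_from_dt(dt, template, offset):
    d = template_distances(dt, template, offset)
    return sum(d) / len(d)


def partial_hausdorff_from_dt(dt, template, offset, k):
    """Distance of the k-th closest template pixel (k is 1-based)."""
    return template_distances(dt, template, offset)[k - 1]


# Hypothetical image: only the top half of a vertical edge is visible.
edge_map = [
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
template = [(0, 0), (0, 1), (0, 2), (0, 3)]  # 4-pixel vertical line

dt = distance_transform(edge_map)
print(chamfer_from_dt(dt, template, (1, 0)))
print(partial_hausdorff_from_dt(dt, template, (1, 0), k=2))  # tolerant of occlusion
print(partial_hausdorff_from_dt(dt, template, (1, 0), k=4))  # full Hausdorff
```

With K equal to the template size the partial measure coincides with the full Hausdorff measure; lowering K lets the occluded half of the template be ignored.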

Fig 7. Lookup Table

4.2 Matching Multiple Templates

In the real world, objects tend to appear in many different shapes: our viewpoint might change, or the object might actively change its shape (such as a walking pedestrian). Creating and matching a template for each expected combination of viewpoint and shape becomes very tedious, especially for real-time purposes. A tree structure is therefore employed. Our tree is ordered by generality, with the most general template at the root. The most general template is the one which has the lowest maximum distance measure to all other templates. The leaves of our tree are all possible templates.

Fig 8. Tree Structure

One can start at the root template and try to find a match in the image. The distance threshold should be chosen large enough that the match could potentially contain any of the child nodes. If a match is found, one can descend the tree and try to match the next level of templates (focusing only on the area of the image that was matched by the parent). A smaller distance threshold should now be used, one that is still large enough to possibly contain each of the child templates. This process is repeated (usually using depth-first search) until one of the leaves matches. A speed-up of up to three orders of magnitude is gained, depending on various factors.
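The coarse-to-fine search can be sketched as a depth-first traversal that prunes a whole subtree as soon as its prototype template fails its (looser) threshold. This is a simplified sketch under several assumptions: templates are already positioned in image coordinates (the search over image locations is omitted), the Chamfer distance is used as the distance measure, each parent is a prototype of its children, and all names, templates and thresholds are invented for the example.

```python
import math


def chamfer(template, edges):
    """Average distance from each template pixel to its nearest edge pixel."""
    return sum(min(math.hypot(tx - ex, ty - ey) for ex, ey in edges)
               for tx, ty in template) / len(template)


class Node:
    def __init__(self, name, template, threshold, children=()):
        self.name = name
        self.template = template
        self.threshold = threshold  # looser near the root, tighter at the leaves
        self.children = list(children)


def search(node, edges, evaluated):
    """Depth-first search; returns the name of the first matching leaf, or None."""
    evaluated.append(node.name)
    if chamfer(node.template, edges) > node.threshold:
        return None  # prune: none of this node's children is tried
    if not node.children:
        return node.name
    for child in node.children:
        found = search(child, edges, evaluated)
        if found is not None:
            return found
    return None


def column(x):
    """Hypothetical 4-pixel vertical template at column x."""
    return [(x, y) for y in range(4)]


lean = [(4, 0), (4, 1), (4, 2), (5, 3)]
tree = Node("root", column(2), 3.0, [
    Node("left", column(0), 1.5, [
        Node("left_a", column(0), 0.2),
        Node("left_b", [(0, 0), (0, 1), (0, 2), (1, 3)], 0.2),
    ]),
    Node("right", column(4), 1.5, [
        Node("right_straight", column(4), 0.2),
        Node("right_lean", lean, 0.2),
    ]),
])

image_edges = [(4, 0), (4, 1), (4, 2), (5, 3)]
evaluated = []
match = search(tree, image_edges, evaluated)
print(match, evaluated)
```

In this toy tree only five of the seven templates are evaluated, because the "left" branch is rejected at its prototype. The saving grows with the depth and branching factor of the tree.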

4.3 AdaBoost Algorithm

AdaBoost, short for Adaptive Boosting, is a machine learning algorithm. It is a meta-algorithm, and can be used in conjunction with many other learning algorithms to improve their performance. AdaBoost is adaptive in the sense that subsequent classifiers built are tweaked in favor of those instances misclassified by previous classifiers. AdaBoost is sensitive to noisy data and outliers. In some problems, however, it can be less susceptible to the overfitting problem than most learning algorithms.

AdaBoost calls a weak classifier repeatedly in a series of rounds $t = 1, \ldots, T$. For each call, a distribution of weights $D_t$ is updated that indicates the importance of examples in the data set for the classification. On each round, the weights of each incorrectly classified example are increased (or alternatively, the weights of each correctly classified example are decreased), so that the new classifier focuses more on those examples.

Given: a training set $(x_1, y_1), \ldots, (x_m, y_m)$ where $x_i \in X$, $y_i \in Y = \{-1, +1\}$, and the number of iterations $T$.

Initialize: $D_1(i) = \frac{1}{m}$ for $i = 1, \ldots, m$.

For $t = 1, \ldots, T$:

- Find the classifier $h_t : X \to \{-1, +1\}$ from the family of weak classifiers $\mathcal{H}$ that minimizes the error with respect to the distribution $D_t$:

  $$h_t = \arg\min_{h_j \in \mathcal{H}} \epsilon_j, \qquad \epsilon_j = \sum_{i=1}^{m} D_t(i)\, I\big(y_i \neq h_j(x_i)\big)$$

  where $I$ is the indicator function.
- If $\epsilon_t \geq 0.5$ then stop.
- Choose $\alpha_t \in \mathbb{R}$, typically $\alpha_t = \frac{1}{2} \ln \frac{1 - \epsilon_t}{\epsilon_t}$, where $\epsilon_t$ is the weighted error rate of classifier $h_t$.
- Update:

  $$D_{t+1}(i) = \frac{D_t(i) \exp\big(-\alpha_t y_i h_t(x_i)\big)}{Z_t}$$

  where $Z_t$ is a normalization factor (chosen so that $D_{t+1}$ will be a probability distribution, i.e. sums to one over all $x_i$).

Output the final classifier:

$$H(x) = \operatorname{sign}\left( \sum_{t=1}^{T} \alpha_t h_t(x) \right)$$

The equation to update the distribution $D_t$ is constructed so that:

$$-\alpha_t y_i h_t(x_i) \begin{cases} < 0, & y_i = h_t(x_i) \\ > 0, & y_i \neq h_t(x_i) \end{cases}$$

Thus, after selecting an optimal classifier $h_t$ for the distribution $D_t$, the examples $x_i$ that the classifier $h_t$ identified correctly are weighted less and those that it identified incorrectly are weighted more. Therefore, when the algorithm is testing the classifiers on the distribution $D_{t+1}$, it will select a classifier that better identifies those examples that the previous classifier missed.
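The boosting loop described above can be sketched in a few lines. This is a minimal illustration, assuming one-dimensional inputs and threshold "stumps" as the family of weak classifiers; the data set, the stump family and the function names are invented for the example and are not the features used by the pedestrian detector.

```python
import math


def stump(threshold, polarity):
    """Weak classifier: returns `polarity` if x > threshold, else `-polarity`."""
    return lambda x: polarity if x > threshold else -polarity


def adaboost(xs, ys, weak_learners, rounds):
    m = len(xs)
    d = [1.0 / m] * m  # D_1(i) = 1/m
    ensemble = []      # list of (alpha_t, h_t)
    for _ in range(rounds):
        # Pick the weak classifier with the smallest weighted error under D_t.
        best_h, best_err = None, float("inf")
        for h in weak_learners:
            err = sum(w for w, x, y in zip(d, xs, ys) if h(x) != y)
            if err < best_err:
                best_h, best_err = h, err
        if best_err >= 0.5:
            break
        eps = max(best_err, 1e-10)  # guard against division by zero
        alpha = 0.5 * math.log((1 - eps) / eps)
        ensemble.append((alpha, best_h))
        # Re-weight: misclassified examples gain weight, then normalize by Z_t.
        d = [w * math.exp(-alpha * y * best_h(x)) for w, x, y in zip(d, xs, ys)]
        z = sum(d)
        d = [w / z for w in d]
    return ensemble


def predict(ensemble, x):
    score = sum(alpha * h(x) for alpha, h in ensemble)
    return 1 if score >= 0 else -1


# Hypothetical data that no single stump can separate.
xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
learners = [stump(t + 0.5, p) for t in range(9) for p in (1, -1)]

weak = adaboost(xs, ys, learners, rounds=1)
strong = adaboost(xs, ys, learners, rounds=3)
print(sum(predict(weak, x) != y for x, y in zip(xs, ys)), "errors after 1 round")
print(sum(predict(strong, x) != y for x, y in zip(xs, ys)), "errors after 3 rounds")
```

On this toy set the best single stump still misclassifies three points, while the weighted vote of three stumps classifies all ten correctly, which illustrates how re-weighting steers later rounds towards the examples earlier rounds missed.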

5. Pedestrian Detection Methods

5.1 Range sensor based method

Short range radars are integrated in the front bumper of the test vehicle. They are able to observe and track multiple targets in the region of interest. However, one difficulty is to distinguish between pedestrians and other objects. The detection feature of this method is that raw data is clustered based on range discontinuities.
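Clustering on range discontinuities can be sketched as follows: consecutive readings of one scan are kept in the same cluster until the measured range jumps by more than a threshold. This is only an illustration; the scan values, the 1 m jump threshold and the function name are assumptions, not parameters of the system described here.

```python
def cluster_by_range_jumps(ranges, jump):
    """Group indices of consecutive readings; start a new cluster at each large jump."""
    if not ranges:
        return []
    clusters, current = [], [0]
    for i in range(1, len(ranges)):
        if abs(ranges[i] - ranges[i - 1]) > jump:
            clusters.append(current)
            current = [i]
        else:
            current.append(i)
    clusters.append(current)
    return clusters


# Hypothetical scan (metres): a wall at about 12 m with a nearer object in front.
scan = [12.0, 12.1, 12.0, 6.2, 6.1, 6.3, 12.2, 12.1]
clusters = cluster_by_range_jumps(scan, jump=1.0)
print(clusters)
```

Each cluster is then a candidate object. Deciding whether a cluster is a pedestrian still requires further features, which is the difficulty noted above.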

Fig 9. Laser Scanner Map

5.2 Vision sensor based method

A vision-based system can recognize pedestrians in front of the moving vehicle, and then either warn the driver of the dangerous situation with a loud alert or slow the vehicle down automatically to protect both drivers and pedestrians. In general, the vision-based pedestrian detection process can be divided into three consecutive steps: pedestrian detection, pedestrian recognition, and pedestrian tracking. It detects the pedestrian ahead of or around the vehicle using a visible light camera. The detection features of this method are the symmetry of the legs, brightness, shape, and the difference between two successive images. This method gives accurate results in daylight, but tends to become less reliable in the dark.

Fig 10. Pedestrians detected using visible camera

Pedestrian detection can be done using the visible light camera, but it tends to become increasingly less reliable in the dark (for example, at night). Therefore the use of non-visible light (an infrared camera) proves to be beneficial. All objects emit a certain amount of black-body radiation as a function of their temperatures. Generally speaking, the higher an object's temperature is, the more infrared radiation it emits as black-body radiation. A special camera can detect this radiation in much the same way as an ordinary camera detects visible light. It works even in total darkness because the ambient light level does not matter.

Fig 11. Pedestrians detected using infrared camera
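The temperature dependence mentioned above can be made concrete with two standard black-body relations: Wien's displacement law, $\lambda_{peak} = b / T$, and the Stefan-Boltzmann law, $j = \sigma T^4$. The sketch below treats a person as an ideal black body at about 310 K and the background at about 293 K; both temperatures are assumptions chosen for illustration (real skin and clothing are cooler and not ideal emitters).

```python
WIEN_B = 2.897771955e-3          # Wien displacement constant, m*K
STEFAN_BOLTZMANN = 5.670374419e-8  # W / (m^2 * K^4)


def peak_wavelength_um(temp_kelvin):
    """Wavelength (micrometres) at which an ideal black body emits most strongly."""
    return WIEN_B / temp_kelvin * 1e6


def radiant_exitance(temp_kelvin):
    """Total power emitted per unit area (W/m^2) by an ideal black body."""
    return STEFAN_BOLTZMANN * temp_kelvin ** 4


body_k, background_k = 310.0, 293.0  # assumed temperatures
print(round(peak_wavelength_um(body_k), 2), "um peak emission")
print(round(radiant_exitance(body_k) / radiant_exitance(background_k), 3),
      "times the emission of the background")
```

Under these assumptions the emission peaks at roughly 9 micrometres, far outside the visible range, and a warm body emits noticeably more than a cooler background, which is what lets an infrared camera separate a pedestrian from the scene without any ambient light.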

6. Implementation

We can implement a pedestrian detection system using the following instruments:

6.1 Monocular Camera

A monocular is a modified refracting telescope used to magnify the images of distant objects by passing light through a series of lenses and sometimes prisms; the use of prisms results in a lightweight telescope. Volume and weight are less than half those of binoculars of similar optical properties, making it easy to carry. Monoculars produce 2-dimensional images, while binoculars add perception of depth (3 dimensions).

Fig:12 monocular camera

A monocular with a straight optical path is relatively long; prisms can be used to fold the optical path to make an instrument which is much shorter.

6.2 Stereo camera

Fig:13 Stereo camera

A stereo camera is a type of camera with two or more lenses with a separate image sensor or film frame for each lens. This allows the camera to simulate human binocular vision, and therefore gives it the ability to capture three-dimensional images, a process known as stereo photography. Stereo cameras may be used for making stereoviews and 3D pictures for movies, or for range imaging. The distance between the lenses in a typical stereo camera (the intra-axial distance) is about the distance between one's eyes (known as the intra-ocular distance), about 6.35 cm, though a longer baseline (greater inter-camera distance) produces more extreme 3-dimensionality.

Stereo cameras are sometimes mounted in cars to detect the lane's width and the proximity of an object on the road. The stereo camera, which uses six-dimensional image analysis, is able to assess the direction in which people are moving. It can also determine the range to an object with an accuracy of between 20 and 30 centimeters at a distance of 20 to 30 meters. The stereo camera has a range of up to 60 meters, which, Continental said, provides huge potential for improved braking systems. Since the stereo camera also implements the already familiar assistance systems, such as lane departure warning, traffic sign recognition, and intelligent headlamp control, it is said that it will set a new trend in the medium to long term.
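For a rectified stereo pair, range follows from the standard pinhole relation $Z = f B / d$, where $f$ is the focal length in pixels, $B$ the baseline and $d$ the disparity between the two images. A first-order estimate of the range error is $\Delta Z \approx \frac{Z^2}{f B} \Delta d$. The sketch below uses invented camera parameters (1000 px focal length, 0.2 m baseline, 0.1 px disparity error); they are assumptions for illustration and not the specification of the camera quoted above.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Range Z = f * B / d for a rectified stereo pair."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def depth_error(focal_px, baseline_m, depth_m, disparity_error_px):
    """First-order range uncertainty: dZ ~ Z^2 / (f * B) * dd."""
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_error_px


focal_px, baseline_m = 1000.0, 0.2  # hypothetical camera parameters
z = depth_from_disparity(focal_px, baseline_m, disparity_px=8.0)
dz = depth_error(focal_px, baseline_m, z, disparity_error_px=0.1)
print(z, "m range,", dz, "m uncertainty")
```

Because the error grows with the square of the range, accuracy degrades quickly with distance, and a longer baseline or a higher-resolution sensor is needed to keep it small at long range.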

6.3 Infrared camera

Fig:14 Infrared camera

A thermographic camera or infrared camera is a device that forms an image using infrared radiation, similar to a common camera that forms an image using visible light. Instead of the 450-750 nanometer range of the visible light camera, infrared cameras operate in wavelengths as long as 14,000 nm (14 µm). All objects emit a certain amount of black-body radiation as a function of their temperatures. Generally speaking, the higher an object's temperature is, the more infrared radiation it emits as black-body radiation. A special camera can detect this radiation in much the same way as an ordinary camera detects visible light. It works even in total darkness because the ambient light level does not matter.

Infrared cameras are used for a wide variety of night vision applications. They produce a clear image in the darkest of nights and do not need any light whatsoever to operate. An infrared camera can also see through light fog and smoke, and produces a crisp image in practically all weather conditions.

An infrared camera can be incorporated in cars, buses, trucks and trains for driver vision enhancement. An infrared camera sees up to 5 times further than headlights. Thanks to infrared cameras the driver can see pedestrians and obstacles on the road from a greater distance. In this way an infrared camera can help to avoid deadly accidents.

6.4 Image Processor

Image processing is any form of signal processing for which the input is an image, such as a photograph or video frame; the output of image processing may be either an image or a set of characteristics or parameters related to the image. Most image-processing techniques involve treating the image as a two-dimensional signal and applying standard signal-processing techniques to it. Image processing usually refers to digital image processing, but optical and analog image processing are also possible.

Fig:15 Processor

Image processing focuses on 2D images and on how to transform one image into another, e.g., by pixel-wise operations such as contrast enhancement, local operations such as edge extraction or noise removal, or geometrical transformations such as rotating the image. This characterization implies that image processing/analysis neither requires assumptions nor produces interpretations about the image content.

Computer vision includes 3D analysis from 2D images. It analyzes the 3D scene projected onto one or several images, e.g., how to reconstruct structure or other information about the 3D scene from one or several images. Computer vision often relies on more or less complex assumptions about the scene depicted in an image.

7. Observation

7.1 Overall Scenario

The "support function", which works by using radar and camera technology to watch out for vehicles and pedestrians ahead of the car, is designed to save lives on urban streets. The detection component consists of a cascade of modules, each utilizing different visual criteria to successively focus on relevant image regions, carefully balancing robustness and efficiency considerations. The tracking component aggregates per-frame detections into trajectories. Finally, the risk assessment and warning/control component evaluates the probability of collision; if it exceeds a threshold, an acoustic driver warning is given or automatic vehicle braking is applied.
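The thresholding step of the risk assessment component can be illustrated with a simplified stand-in. The text above speaks of a collision probability; the sketch below instead uses time-to-collision (distance divided by closing speed), a simpler quantity swapped in for illustration, with two hypothetical thresholds that separate "warn" from "brake". The threshold values and function names are assumptions.

```python
def time_to_collision(distance_m, closing_speed_mps):
    """Seconds until impact if nothing changes; infinite if the gap is not closing."""
    if closing_speed_mps <= 0:
        return float("inf")
    return distance_m / closing_speed_mps


def decide(distance_m, closing_speed_mps, warn_s=2.5, brake_s=1.0):
    """Return 'brake', 'warn' or 'none' from two assumed time thresholds."""
    ttc = time_to_collision(distance_m, closing_speed_mps)
    if ttc <= brake_s:
        return "brake"
    if ttc <= warn_s:
        return "warn"
    return "none"


# Hypothetical tracked pedestrian at three distances, vehicle closing at 10 m/s.
for distance in (30.0, 20.0, 8.0):
    print(distance, decide(distance, 10.0))
```

A real system would feed this stage from the tracked trajectories and would account for the pedestrian's own motion and the uncertainty of the estimates.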

Fig:16 Range of pedestrian detection system

7.2 Advantages

- Increased Safety: Reduces pedestrian accidents by focusing on areas that are prone to collisions. Implements multiple safety solutions with a single product.
- Ease of Installation: Interfaces with existing system components, requiring minimal additional equipment. Mounts on existing intersection poles.
- Flexibility: Agencies can customize the types of alert response (audible/visual alerts, speed governing, etc.). It may be set up to sound alerts by voice annunciation, beeps, flashing LEDs, or any combination. All alerts and warnings can also be remotely monitored in real time, with a running log of alert activity.
- Timeliness: Capable of recognizing when turns are being initiated early enough to receive pedestrian alerts and early enough for the driver to safely react.

7.3 Disadvantages

- Moving camera: Because the vehicle is moving, one does not have the luxury of using simple background subtraction methods (such as those used in surveillance applications) to obtain a foreground region containing the human.
- Non-rigid body: A pedestrian is a non-rigid body. In other words, the shape and size of a pedestrian vary greatly, and therefore the model of a pedestrian is much more complex than that of rigid objects.
- Cluttered background: Whether we are analyzing images from a typical city or from a rural traffic environment, the background formed by vehicles, trees, wire poles, and billboards is very cluttered. Many of these background objects can be mistaken for pedestrians because of their similar shapes.

8. Improvement

The system described achieves a very good detection rate. However, the number of false positives is undesirable for a real application. A possible way to improve this rate is to combine visual information with a laser scanner. If the data collected from the laser scanner is used to define a region of interest (ROI) in the image, the number of sub-windows classified per image will strongly decrease and, consequently, the false alarm rate will decrease as well. Furthermore, taking into account the estimated distance between the car and the object, it is possible to discard a great number of false positives by ignoring inappropriate pedestrian sizes in the image. The laser scanner can be used to define the horizontal limits of the ROI in the image; the vertical limit can be set to the height of the image. Applying the classifier described above only to this ROI, while maintaining the same parameters and the same image set used in this system, the number of false positives is likely to decrease to about 35-39% of those detected by the initial system. The system might also speed up by more than 2 times.
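The two filters proposed here (a horizontal ROI from the laser scanner and a size check from the estimated distance) can be sketched as follows. Under a pinhole camera model a person of height $H$ at distance $Z$ appears about $f H / Z$ pixels tall. The focal length, the 1.7 m reference height, the 30% tolerance, the detection format (x, y, width, height) and the function names are all assumptions for the example.

```python
def expected_height_px(focal_px, distance_m, person_height_m=1.7):
    """Pinhole-model image height of a person standing at the given distance."""
    return focal_px * person_height_m / distance_m


def plausible_size(det_height_px, distance_m, focal_px, tolerance=0.3):
    """True if the detection height is within `tolerance` of the expected height."""
    expected = expected_height_px(focal_px, distance_m)
    return abs(det_height_px - expected) <= tolerance * expected


def filter_detections(detections, roi_x_min, roi_x_max, distance_m, focal_px):
    """Keep (x, y, w, h) boxes centred inside the ROI and of plausible height."""
    kept = []
    for x, y, w, h in detections:
        centre_x = x + w / 2
        if roi_x_min <= centre_x <= roi_x_max and plausible_size(
                h, distance_m, focal_px):
            kept.append((x, y, w, h))
    return kept


# Hypothetical case: the laser scanner reports an object 10 m away that maps
# to image columns 80..200; the camera focal length is assumed to be 800 px.
detections = [
    (100, 50, 60, 140),  # inside the ROI, plausible height
    (400, 60, 60, 130),  # outside the ROI
    (110, 50, 20, 40),   # inside the ROI but far too small for 10 m
]
kept = filter_detections(detections, 80, 200, distance_m=10.0, focal_px=800.0)
print(kept)
```

In practice the classifier would be run only inside the ROI in the first place, which is where the speed-up comes from; the sketch applies the filter after detection purely to keep the example short.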

9. Conclusion and Future scope

The proposed system has the following properties:

- It is able to detect pedestrians in various poses, shapes, sizes and clothing.
- It runs in real time with good accuracy.
- It is robust to lighting variation, background changes and camera motion.

Missed detections of pedestrians are due to a lack of contrast between pedestrians and background and to the enormous flexibility of the human body. In future work, the system can be improved by implementing data fusion. With data fusion it is possible to combine a classifier based on the laser scanner data with the vision-based classifier to produce the output of the global system. This is expected to give more accurate results and to significantly decrease the false alarm rate. Moreover, a Kalman filter can be integrated for tracking the pedestrians. Since false positives mostly appear on isolated image frames, they will be ignored during tracking. The classifier can also be extended to detect other objects, such as cars and traffic signs.
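To give an idea of what the proposed tracking stage involves, the sketch below implements a one-dimensional constant-velocity Kalman filter for a single coordinate of a detection. It is a minimal sketch under assumed noise values (process variance 0.01, measurement variance 1.0) and an assumed diagonal process noise; a real tracker would filter both image coordinates and handle association of detections to tracks.

```python
class ConstantVelocityKalman:
    """1-D Kalman filter with state (position, velocity) and position measurements."""

    def __init__(self, position, velocity=0.0, process_var=0.01, meas_var=1.0):
        self.p, self.v = position, velocity
        self.P = [[1.0, 0.0], [0.0, 1.0]]  # assumed initial state covariance
        self.q, self.r = process_var, meas_var

    def predict(self, dt=1.0):
        self.p += self.v * dt
        (a, b), (c, d) = self.P
        # P <- F P F^T + Q with F = [[1, dt], [0, 1]] and Q = q * I
        self.P = [[a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
                  [c + dt * d, d + self.q]]

    def update(self, z):
        (a, b), (c, d) = self.P
        s = a + self.r             # innovation variance
        k0, k1 = a / s, c / s      # Kalman gain for position and velocity
        y = z - self.p             # innovation
        self.p += k0 * y
        self.v += k1 * y
        # P <- (I - K H) P with H = [1, 0]
        self.P = [[(1 - k0) * a, (1 - k0) * b],
                  [c - k1 * a, d - k1 * b]]


# Hypothetical track: a pedestrian moving one pixel per frame for 20 frames.
measurements = [float(k) for k in range(20)]
kf = ConstantVelocityKalman(position=measurements[0])
for z in measurements[1:]:
    kf.predict()
    kf.update(z)
print(round(kf.p, 2), round(kf.v, 2))
```

Once the velocity estimate has settled, the predicted position gives a gate around which the next detection is expected, so a detection that appears in a single frame with no consistent predecessor can be left out of any track.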

10. References

- http://en.wikipedia.org/wiki/Pedestrian_detection
- http://www.isr.uc.pt/~urbano/mtdts04/pdf/REF_3_Robotica.pdf
- S. Agarwal, A. Awan, and D. Roth, Learning to Detect Objects in Images via a Sparse, Part-Based Representation, IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 26, no. 11, pp. 1475-1490, Nov. 2004.
- M. Enzweiler and D. M. Gavrila, Monocular Pedestrian Detection: Survey and Experiments, IEEE.
