Line Following and Obstacle Detection Through the Birds-Eye View

Brigit Schroeder
University of Massachusetts, Lowell
Department of Computer Science
[email protected]

ABSTRACT
In this paper, a method for outdoor robot line following is described, based upon a careful analysis of the HSV color space. This process is then extended to the detection of both lines and obstacles separately, such that targets can be marked in the original imagery. An inverse perspective mapping has been added to create a bird’s eye view of the original robot view, from which the physical distance of lines and obstacles from the robot can be derived.

Author Keywords
Line following, obstacle detection, birds-eye view, HSV, IGVC, robotics.

INTRODUCTION
For my final project, I developed a method of robot line following and obstacle detection through the use of image understanding. Image understanding is the process of interpreting regions in an image after they have been segmented and detected by a sequence of image processing techniques. For line following within the scope of the IGVC, it is important to identify and understand which regions are white painted lines, grassy areas, and basic obstacles such as traffic barrels. Once the lines and other obstacles have been detected in imagery, the physical locations of the lines can be derived (or at best, approximated) by projecting the detected or segmented image into a calibrated “bird’s eye” (i.e. top-down) reference frame.

PROJECT DESCRIPTION
The problem of detecting lines and obstacles in a stochastic outdoor environment in order to avoid them defines the “line following” problem in the scope of the IGVC. The problem in its simplest form would be to detect white (or various shades of white) lines on a green surface. The lines are painted on grass, which is an uneven surface in terms of texture and intensity, and the lighting conditions are also fairly variable given the time of day. However, besides lines, there are multi-colored traffic/garbage barrels (i.e. obstacles) on the ground that the robot must traverse around.

Line and Obstacle Detection
The problem can be broken down into a series of image processing steps that segment out the desired targets (lines and barrels) and mark the pixels of the segmented targets as “line” or “barrel”. The end result is an image mask indicating which groups of pixels are targets by various values. An example of the segmentation can be seen in the images below, where barrels and lines have been segmented out of the original image:

Figure 1. Series of image processing steps required for line/object detection and birds-eye projection.

Figure 2. Sample image collected from the robot’s imaging sensor on an IGVC robotics course, with lines, barrels, and grass.
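As a rough sketch of the processing chain in Figure 1, the pipeline can be expressed as a few steps. OpenCV is assumed here, and the threshold values are assumptions scaled from the normalized [0, 1] ranges reported later in this paper; the line/barrel separation stage is sketched further below.

```python
import cv2

def detect_targets(bgr_image):
    """Sketch of the Figure 1 pipeline: crop, HSV conversion, threshold, fuse."""
    # Keep only the lower 60% of the frame (closest to the robot).
    roi = bgr_image[int(0.4 * bgr_image.shape[0]):, :]
    hsv = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)
    # Hue mask for barrel colors: normalized hue below ~0.15 maps to
    # roughly H < 27 in OpenCV units (H spans [0, 180) for 8-bit images).
    color_map = cv2.inRange(hsv, (0, 60, 60), (27, 255, 255))
    # Inverse saturation mask for white lines and white barrel bands:
    # normalized saturation below 0.25 maps to roughly S < 64.
    inv_s_map = cv2.inRange(hsv, (0, 0, 100), (180, 64, 255))
    # Fused "all targets" mask ("final mask" in Figure 1).
    return cv2.bitwise_or(color_map, inv_s_map), color_map, inv_s_map
```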


Figure 3. Segmentation of the original robot sensor image based upon hue thresholding, showing the lower 60% of the robot’s field of view.

Note that only the lower 60% of the robot’s view frame is processed; the reasoning behind the 60% figure is two-fold: it reduces processing time, and navigation is generally concerned only with the things closest to the robot (i.e. at the bottom of the field of view). In previous IGVC competitions, a range of image processing techniques have been used to try to detect and segment the white lines in the image, including HSV color channel separation [1]. One common technique for segmenting targets is fusing multiple masks (e.g. binary maps), each of which indicates target pixel regions with high probability, to achieve a final target mask. The approach outlined in this paper uses image processing techniques similar to [1], including histogramming, thresholding, and connected components analysis, but differs from previous approaches in that both lines and color-specific targets, such as multi-colored barrels, are detected. From this, a series of fused masks indicating “all targets”, lines only, and barrels only (as shown in Figure 1) were created.

In this solution, binary masks were created by transforming RGB sensor images (see Figure 2) into individual HSV color channels and thresholding a single channel, such as hue. Hue (H) is a value that represents a unique shade on the “true” color spectrum, saturation (S) is a measure of how pure or saturated the color is, and value (V) is the “lightness”, or how much white there is in a color. The image data from the H and S color channels are light invariant, which makes them more reliable to use, especially outdoors. On the hue scale, an orange barrel, for example, will always reliably have the same color value (or range of values), which can be used for target-specific thresholding and detection.
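A minimal sketch of how such per-channel histograms might be computed (OpenCV and NumPy are assumptions; the normalized [0, 1] scale matches the ranges quoted below):

```python
import cv2
import numpy as np

bgr = cv2.imread("igvc_frame.png")          # hypothetical sample frame
h, s, v = cv2.split(cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV))

# Rescale the 8-bit channels to [0, 1]: OpenCV stores H in [0, 180)
# and S, V in [0, 255].
h_hist, _ = np.histogram(h / 180.0, bins=64, range=(0.0, 1.0))
s_hist, _ = np.histogram(s / 255.0, bins=64, range=(0.0, 1.0))
v_hist, _ = np.histogram(v / 255.0, bins=64, range=(0.0, 1.0))

# The strongest hue peak (grass) should land near 0.15-0.25 on this scale.
print("strongest hue peak at ~", np.argmax(h_hist) / 64.0)
```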

Figure 4 shows a series of multimodal H, S, and V histograms created from a single sensor image; note the strongest peaks in each, which correspond directly to the large amount of grass in the scene. In the hue (H) histogram, shades of grass (primarily green) fell in the largest peak, in a range between 0.15 and 0.25; traffic barrel colors fell in the second largest peak, in a range between 0 and 0.15; and white painted lines were observed to vary in a range from above 0.25 to 0.8. In the saturation (S) histogram, shades of white in both the painted lines and traffic barrels had very low saturation values (as might be expected, since white is an amalgamation of all shades on the color spectrum) and could be isolated by thresholding saturation values below 0.25, leaving only lines and white regions visible.

After thresholding, the final binary masks were created in a series of stages, the stages and results of which can be seen in Figure 5. The thresholded hue and saturation masks (“color map” and “inverse S map”) seen in Figure 5 were combined with a logical OR to create an “all target” mask (“final mask”) of lines and barrels. The “line mask” in Figure 5 was created from the thresholded saturation mask (“inverse S map”) by removing the horizontal white regions belonging to the orange-and-white traffic barrels seen in Figure 2. Connected components analysis was used to label blob regions in the saturation mask. Image moments [2], which characterize the shape-based properties of blobs (such as centroid and orientation), were used to filter out the horizontal white barrel regions. Since the white bands on the barrels are nearly always horizontal relative to the robot, filtering out blobs with an absolute orientation of 20 degrees or less to the image’s horizontal axis removed nearly all of the white barrel regions. The “barrel mask” was created last, by subtracting the line mask from the “all target” mask. The subtraction sometimes had the effect of leaving noise (e.g. small speckled regions), as can be seen in Figure 5. These regions could be removed with an image erosion step.
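A sketch of the mask-fusion and orientation-filtering stages described above (OpenCV assumed; the 20-degree threshold follows the text, while the erosion kernel size is a guess):

```python
import cv2
import numpy as np

def build_masks(color_map, inv_s_map, max_angle_deg=20.0):
    """color_map, inv_s_map: uint8 binary masks with 255 marking target pixels."""
    # Logical OR of the thresholded hue and saturation masks: "all targets".
    final_mask = cv2.bitwise_or(color_map, inv_s_map)

    # Label blobs in the saturation mask, then drop the near-horizontal
    # ones (white barrel bands) using image-moment orientation [2].
    n_labels, labels = cv2.connectedComponents(inv_s_map)
    line_mask = np.zeros_like(inv_s_map)
    for i in range(1, n_labels):
        blob = np.uint8(labels == i) * 255
        m = cv2.moments(blob, binaryImage=True)
        if m["m00"] == 0:
            continue
        # Principal-axis orientation from second-order central moments.
        theta = 0.5 * np.arctan2(2.0 * m["mu11"], m["mu20"] - m["mu02"])
        if abs(np.degrees(theta)) > max_angle_deg:   # keep non-horizontal blobs
            line_mask |= blob

    # Barrels are whatever targets remain once the lines are removed;
    # a small erosion cleans up the speckle left by the subtraction.
    barrel_mask = cv2.erode(cv2.subtract(final_mask, line_mask),
                            np.ones((3, 3), np.uint8))
    return final_mask, line_mask, barrel_mask
```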

Figure 4. Sample histograms of the H, S, and V color channels of an image from an IGVC course. Note the strong peaks in each, especially in the hue histogram, which correspond to the large amount of grass in the image.


Figure 5. Results from HSV color channel separation (1st row), hue and saturation thresholding (2nd row), and the final line/object detection masks (3rd row).

Bird’s Eye Projection
Once lines and obstacles are detected in imagery from the robot’s imaging sensor, their physical “real world” locations (relative to the robot) need to be derived for navigation, similar to the work in [3] for automobiles. This can be achieved by projecting the robot’s camera perspective into a bird’s eye or “top down” viewpoint. A “robot image to ground” transform needs to be calculated as a projective homography [4] between a large calibration target (placed on the ground in front of the robot) and an image of the target captured from the robot’s image sensor. The homography is calculated between corresponding points (such as the corners of the target) in the calibration target’s metric coordinate system (e.g. meters) and image pixel coordinates.
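A sketch of this calibration step in OpenCV (the point correspondences and metric coordinates below are placeholders, not measured values):

```python
import cv2
import numpy as np

# Four corners of the calibration target on the ground, in meters,
# in a robot-centric ground frame (placeholder values).
ground_pts = np.float32([[0.0, 1.0], [1.0, 1.0], [1.0, 2.0], [0.0, 2.0]])

# The same corners as seen in the robot's image, in pixels (hand-picked).
image_pts = np.float32([[210, 430], [415, 428], [470, 300], [180, 305]])

# Scale meters to output pixels so the warped image has usable resolution.
PX_PER_M = 200.0
H, _ = cv2.findHomography(image_pts, ground_pts * PX_PER_M)
```

With exactly four correspondences, cv2.getPerspectiveTransform would give the same result; cv2.findHomography generalizes to more points with a least-squares or RANSAC fit.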

Figure 6. The robot’s view (perspective view) with a checkerboard target is projected into a top-down bird’s eye viewpoint, demonstrated with a sample target.

The net effect of the bird’s eye projection, also called an “inverse perspective transform”, is that the perspective is removed from the original image (e.g. parallel lines no longer converge at infinity). This can be seen in Figure 6, where the robot’s view (perspective view) with a calibration target is projected into a top-down bird’s eye viewpoint. Note that the floorboards in the original image are parallel in the bird’s eye projection. For this project, the bird’s eye projection was applied to the final line and obstacle binary mask, as seen in Figure 7. An “in front of the robot” viewpoint was also cropped out of the full bird’s eye projection (see Figure 6), which is most useful for navigation purposes. The effect of the bird’s eye projection can be seen where the curved lane of the robotics course becomes straight.
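Continuing the sketch above (reusing the placeholder H and PX_PER_M, with final_mask standing in for the fused target mask), applying the projection and recovering metric positions might look like:

```python
import cv2
import numpy as np

# Warp the binary target mask into the bird's eye frame.
birdseye = cv2.warpPerspective(final_mask, H, (800, 800))

# Crop an "in front of the robot" window (bounds are placeholders).
front_view = birdseye[400:800, 200:600]

# A detected pixel maps directly to ground coordinates in meters:
px = np.float32([[[300, 410]]])                  # (x, y) in the sensor image
gx, gy = cv2.perspectiveTransform(px, H)[0, 0] / PX_PER_M
print("target at %.2f m lateral, %.2f m ahead" % (gx, gy))
```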

Figure 7. The bird’s eye projection was applied to sample IGVC course data, including the final line and obstacle binary mask and an “in front of the robot” viewpoint.

ANALYSIS OF RESULTS
For this project, pictures speak louder than words; in other words, the metric of success proved to be qualitative rather than quantitative. The first objective was to produce a reasonable mask that could be integrated into the UML IGVC robotics platform for 1) line following and, additionally/optionally, 2) barrel avoidance. The “all target” mask discussed in the previous section (also called the “final mask” in Figure 5) satisfies this objective, especially for use with the 2D ray-casting technique implemented for the 2011 IGVC.
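The 2011 ray-casting implementation itself is outside the scope of this paper, but as an assumed reconstruction, a 2D ray-cast over the binary mask might look like the following:

```python
import numpy as np

def cast_rays(mask, n_rays=31, fov_deg=120.0):
    """Walk rays out from the bottom-center of a binary mask and return,
    per ray, the pixel distance to the first target pixel (inf if none)."""
    height, width = mask.shape
    origin = np.array([width / 2.0, height - 1.0])
    angles = np.radians(np.linspace(-fov_deg / 2, fov_deg / 2, n_rays))
    distances = []
    for a in angles:
        step = np.array([np.sin(a), -np.cos(a)])   # "up" is -y in images
        hit = np.inf
        for r in range(1, height):
            x, y = (origin + r * step).astype(int)
            if not (0 <= x < width and 0 <= y < height):
                break
            if mask[y, x]:
                hit = float(r)
                break
        distances.append(hit)
    return distances
```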

Figure 8. The final “all targets” mask.


The second objective was to achieve line and obstacle (barrel) detection, which, in the larger scope of the UML IGVC system, was not as important as the previous binary mask result. However, reasonable target detection was achieved by being able to create the individual line and barrel masks. Either of these on its own might prove to be useful for future integration/processing with the UML IGVC system. Below (in Figure 9) is an example of good obstacle detection; the original robot sensor image has been overlaid with the line mask in blue and the barrel mask in red:

Figure 9. An example of “good” line and obstacle detection with line pixels marked in blue and barrels marked in red.

The random noise in the detection image around the yellow ramps could be removed with image erosion in the original mask, as discussed in the Project Description section. Figure 10 below shows a case where sections of barrels occasionally get classified as lines due to the connected components analysis discussed in the Project Description section (the apparent orientation of the white sections of traffic barrels changes).

Figure 10. An example of line and obstacle detection with mis-marked targets; some barrels are marked as lines.

However, this result was still very promising and could potentially be fixed with more time invested.

Calculating the bird’s eye projection of the line and obstacle detection was not in the original description of my final project proposal. However, figuring out how to make it work was both intriguing and useful for this year’s IGVC platform. I used estimated ground measurements, based upon the known diameter of the traffic barrels, etc., for the bird’s eye projection shown below in Figure 11:

Figure 11. A series of views of sample IGVC course data, including bird’s eye projection of original image and “all targets” mask.

The red trapezoid in the “Robot View” is an estimate of the ground plane with apparent perspective. The red rectangle in the “Birds-eye View” is an estimate of the unwarped (i.e. perspective-removed) ground plane. The bird’s eye views are only approximate due to the ground plane estimation.
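As an example of how such a ground measurement can be roughed out: if an object of known size is visible, a local meters-per-pixel scale follows directly (the barrel diameter below is an assumed value, not a measurement from the course):

```python
# A barrel of known diameter spanning a measurable number of pixels gives
# an approximate scale at that depth in the image.
BARREL_DIAMETER_M = 0.6       # assumed diameter of an IGVC traffic barrel
barrel_width_px = 140         # read off a sample frame by hand
m_per_px = BARREL_DIAMETER_M / barrel_width_px   # ~0.0043 m/pixel
```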

The amount of collected IGVC course video was limited, so I attempted to simulate a robotics course in my living room, as seen in Figure 12:

Figure 12. A “doll house” simulated robotics course I built.

Building the simulated robotics course was most useful for testing the bird’s eye projection, as I was able to use the pictured checkerboard to estimate a robot image to ground homography. Also, the videos of the simulated course were closer to the actual IGVC robot view than the IGVC video collected at the 2011 competition. Below are the same series of views as in Figure 11, generated for the simulated course:


Figure 13. A series of views of simulated course data, including the bird’s eye projection of the original image and the “all targets” mask.
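A sketch of extracting the image-side correspondences from such a checkerboard with OpenCV (the board dimensions and square size here are assumptions):

```python
import cv2
import numpy as np

gray = cv2.cvtColor(cv2.imread("sim_course_frame.png"), cv2.COLOR_BGR2GRAY)
PATTERN = (7, 5)       # inner-corner grid of the assumed board
SQUARE_M = 0.03        # assumed square size in meters

found, corners = cv2.findChessboardCorners(gray, PATTERN)
if found:
    corners = cv2.cornerSubPix(
        gray, corners, (11, 11), (-1, -1),
        (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.01))
    # Matching ground-plane coordinates of the corners, in meters.
    gx, gy = np.meshgrid(np.arange(PATTERN[0]), np.arange(PATTERN[1]))
    ground = np.float32(np.stack([gx.ravel(), gy.ravel()], axis=1)) * SQUARE_M
    H, _ = cv2.findHomography(corners.reshape(-1, 2), ground)
```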

Measuring the goodness of the bird’s eye projection depended upon how well the checkerboard target and the image of the target were calibrated. When the calibration is accurate, as seen with the test target in Figure 6, straight lines remain straight; but when it is estimated, as in Figure 11, there is a lot of distortion (i.e. inaccurate projection) in the desired straight lines. The projection results for the simulated course were midway between these two in terms of quality.
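One simple quantitative proxy for this goodness would be the residual of a line fit to points sampled along a projected course line; this was not measured in the original work, but a sketch might be:

```python
import numpy as np

def straightness_rms(points):
    """RMS perpendicular deviation of 2D points from their best-fit line;
    a value near zero means the projected line is truly straight."""
    pts = np.asarray(points, dtype=float)
    centered = pts - pts.mean(axis=0)
    # Principal direction via SVD; residuals lie along the second axis.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    residuals = centered @ vt[1]
    return float(np.sqrt(np.mean(residuals ** 2)))
```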

DISCUSSION
In order to come up with a robust solution to the line following problem, a variety of sample input data is a necessity. The analysis of a problem such as this can really suffer if the solution is only tuned to a limited set of data (in this case, one type of outdoor conditions at a specific time of day). What would have been ideal was to have sample IGVC course data from different days, at different times of day, and under different weather (i.e. lighting) conditions. Also, it is important to use “real” data rather than simulated data as much as possible (i.e. data collected from the robot’s perspective rather than from a human’s laptop). Even in the case where I simulated a robotics course, I found that it took several attempts to get the right camera perspective. Testing on different environmental conditions (indoor vs. outdoor) and different types of surfaces/materials proved to be challenging for some of the image processing steps that depended upon color and saturation thresholding.

Another thing I learned while attempting to come up with an approach to the line following problem was that complex solutions are not always the best answer. In some cases, such as this one, starting with a basic, even naïve, approach gives a good, solid result. Initially I proposed using a machine learning based technique that would have taken much longer to develop, especially for testing and training classifiers. I think the simpler, more straightforward solution could also be a good segue into a potentially more accurate, complex approach.

CONCLUSION
The work presented in this paper is a straightforward and effective solution to line and obstacle detection. The detection process can select all targets at once or segment out the different types of targets individually, with reasonable (qualitative) accuracy. The results are also projected into a bird’s eye perspective and an “in front of the robot” perspective, which are very realistic (i.e. the perspective is removed properly) when the projection calibration is done with accurate measurements. More time, data, and Brigits are needed to finesse the solution further, mostly for integration into and testing on the 2012 UML IGVC platform.

ACKNOWLEDGMENTS
The work described in this paper was conducted as part of a Spring 2012 robotics course, taught in the Computer Science department of the University of Massachusetts Lowell by Prof. Fred Martin.

REFERENCES
1. INDRIK: Design Report, 2011 Intelligent Ground Vehicle Competition (University of Waterloo). http://uwrobotics.uwaterloo.ca/igvc/uwrt_igvc_2011.pdf
2. Image Moments: http://en.wikipedia.org/wiki/Image_moment
3. S. Tuohy, D. O’Cualain, E. Jones, M. Glavin. Distance Determination for an Automobile Environment using Inverse Perspective Mapping in OpenCV. In IEEE Computer Engineering, March 2010.
4. Projective Homography: http://en.wikipedia.org/wiki/Homography