
Statistical Analysis of Image Features Used as Inputs of a Road Identifier Based on Artificial Neural Networks

Patrick Yuri Shinzato, Denis Fernando Wolf
Institute of Mathematics and Computer Science, University of Sao Paulo - ICMC-USP, Sao Carlos, SP, Brazil

[email protected], [email protected]

Abstract— Navigation is a broad topic that has been receiving considerable attention from the mobile robotics community. In order to execute autonomous driving outdoors, on streets and roads, the vehicle must identify the parts of the terrain that can be traversed and the parts that should be avoided. This paper describes an analysis of many multi-layer perceptron artificial neural networks (ANNs) used for image-based terrain identification. The ANNs differ in the image features used in the input layer. Experimental tests using a car and a video camera have been conducted in real scenarios to evaluate the proposed approach.

I. INTRODUCTION

Autonomous navigation capability is a requirement for mobile robots. In order to deal with this issue, the robot must obtain information about the environment through sensors and thereby identify safe regions for navigation [1]. Outdoor navigation in real scenarios and unknown terrain is certainly a complex problem. Beyond obstacle avoidance, the vehicle must identify the region where it can navigate safely. The irregularity of the terrain and the dynamics of the environment are some of the factors that make outdoor robot navigation difficult [2].

Several techniques for visual road following have been developed based on certain assumptions about road scenery. Detecting road boundaries through the use of gradient-based edge techniques is described in [3], [4], [5]. These algorithms assume that road edges are clear and fairly sharp. In [6], an approach was developed that extracts texture from road images and uses it as a feature for the segmentation of unmarked roads. The approach presented in [7] divides images into slices and tries to detect the path in each one. A work related to artificial neural networks is the Autonomous Land Vehicle In a Neural Network (ALVINN) of [8], where a network is trained to classify the entire image and detect the road. A more recent work is presented in [9], which uses ANNs and specific areas of the image for a real-time road detection application.

Usually it is desirable that the mobile robot (vehicle) can move along the road or street while avoiding obstacles and non-navigable areas. Since these elements usually differ in color and texture, cameras are a suitable option for identifying navigable regions. In this work we present an analysis of many multi-layer perceptron neural networks (ANNs) - each using different features obtained from the image as input - applied to image-based terrain identification.

Fig. 1. Image Division in Blocks.

II. METHODOLOGY

Navigation in outdoor spaces is considerably more complex than in structured indoor spaces. The terrain is composed of a variety of elements such as grass, gardens, sidewalks, streets, and gravel. These elements usually have different colors and textures, making it possible to use cameras to differentiate them.

Our earlier work [10], [11] focused on determining which parts of a sidewalk scene are linearly distinguishable with an acceptable error rate, and on the feasibility of classification using neural networks. Some neural network configurations obtained good results, indicating that neural networks can be used in road-following algorithms as image classifiers. Because of this, we evaluated neural networks with different image features for a road scene composed of paved street, walkways, and vegetation.

A. Block-based Classification Method

A block-based classification method consists of treating and evaluating a collection of directly connected (neighboring) pixels as a group. A value is generated to represent this group; this value can be the average of the RGB values, the entropy, or other features of the represented pixel collection. In the grouping step, a frame of (M x N) pixels is sliced into groups of (K x K) pixels, as shown in Fig. 1.

Suppose an image is represented by a matrix I of size (M x N). The element I(m, n) corresponds to the pixel in row m and column n of the image, where (0 ≤ m < M) and (0 ≤ n < N). Therefore, group G(i, j) contains all the pixels I(m, n) such that i·K ≤ m < (i+1)·K and j·K ≤ n < (j+1)·K. For each group, a feature value is calculated depending on the feature chosen. This strategy has been used to reduce the amount of data, allowing faster processing.
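As an illustration, a minimal Python/NumPy sketch of this grouping step follows; the function name and the block-dictionary layout are our own, not from the paper.

import numpy as np

def split_into_blocks(image, k):
    """Slice an (M x N) image into (K x K) pixel groups G(i, j).

    Group G(i, j) holds the pixels I(m, n) with
    i*K <= m < (i+1)*K and j*K <= n < (j+1)*K.
    """
    m, n = image.shape[:2]
    blocks = {}
    for i in range(m // k):
        for j in range(n // k):
            blocks[(i, j)] = image[i * k:(i + 1) * k, j * k:(j + 1) * k]
    return blocks

# A 240 x 320 frame with K = 10 yields 24 * 32 = 768 blocks,
# matching the setup described in Section IV.
frame = np.zeros((240, 320, 3), dtype=np.uint8)
assert len(split_into_blocks(frame, 10)) == 768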

B. Statistical Measures

We use several statistical measures as image features. Simple measures, such as mean and probability, and more complex ones, such as entropy and variance, were used. Their definitions and equations are given below.

1) Shannon Entropy:

E(X) = -\sum_{x \in X} p(x) \log p(x)    (1)

2) Energy:

\varepsilon = \sum_{i=0}^{C-1} p(x_i)^2    (2)

3) Variance:

\sigma^2 = \sum_{i=0}^{C-1} (x_i - \mu)^2 \, p(x_i)    (3)

where p(x) is the probability of pixel value x occurring in the collection, \mu is the mean of the collection, and C is the number of colors in the color space.
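For concreteness, here is a minimal Python/NumPy sketch of these three measures computed from a pixel-block's value distribution; the logarithm base and the function names are our choices, since the paper does not specify them.

import numpy as np

def entropy(p):
    # Shannon entropy, Eq. (1); zero-probability terms are dropped
    # under the usual convention 0 * log 0 = 0.
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def energy(p):
    # Energy, Eq. (2): sum of squared probabilities.
    return np.sum(p ** 2)

def variance(values, p):
    # Variance, Eq. (3), with mu the mean of the distribution.
    mu = np.sum(values * p)
    return np.sum((values - mu) ** 2 * p)

# Example: distribution of one 8-bit channel inside a 10 x 10 block.
block = np.random.randint(0, 256, size=(10, 10))
counts = np.bincount(block.ravel(), minlength=256)
p = counts / counts.sum()
print(entropy(p), energy(p), variance(np.arange(256), p))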

C. RGB Color Space

RGB is a color space where each color is defined by varying the quantities of the R (red), G (green), and B (blue) components [12]. Classification based on this color space generates a feature in RGB pixel format: the weighted average of the pixel occurrences in the pixel-block.

We also use the RGB entropy and energy as features. In order to obtain the entropy and energy values, the frequency of each pixel value within the pixel-block is calculated. For each pixel value x, p(x) is computed by dividing the frequency of x by the total number of pixels in the pixel-block. Note that for pixels x and y in RGB format, x = y if and only if:
• the red of x equals the red of y, and
• the green of x equals the green of y, and
• the blue of x equals the blue of y.
A sketch of this computation appears below.
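The following hypothetical sketch computes the per-block RGB features under the tuple-equality rule above (two pixels are equal only if all three channels match); the names are ours.

import numpy as np

def rgb_block_features(block):
    # block: (K, K, 3) array of RGB pixels.
    pixels = block.reshape(-1, 3)
    avg = pixels.mean(axis=0)          # average R, G, B over the block
    # p(x): frequency of each distinct RGB triple in the block.
    _, counts = np.unique(pixels, axis=0, return_counts=True)
    p = counts / counts.sum()
    ent = -np.sum(p * np.log2(p))      # RGB entropy
    ene = np.sum(p ** 2)               # RGB energy
    return avg, ent, ene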

D. HSV Color Space

The HSV color space contains hue (H), saturation (S), and value (V, brightness) channels [13]. As with RGB, we generate the average, entropy, and energy of HSV, where pixels x and y in HSV format satisfy x = y if and only if:
• the hue of x equals the hue of y, and
• the saturation of x equals the saturation of y, and
• the value of x equals the value of y.
However, for this color space, we also generate the entropy, energy, and variance of each channel independently. In other words, we also generate attributes such as hue entropy, saturation entropy, and value entropy, in addition to the other measures previously described. Another attribute generated was HAS, which is (hue + saturation)/2. This attribute was created to take advantage of the consistency of these two channels when they belong to a pixel of the street. The entropy of HAS was also generated in this work (see the sketch below).
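Under the same assumptions as the RGB sketch, the per-channel HSV features and the HAS attribute might be computed as follows; the 8-bit channel range is our assumption.

import numpy as np

def channel_entropy(c):
    # Entropy of a single 8-bit channel inside the block.
    counts = np.bincount(c.astype(int).ravel(), minlength=256)
    p = counts[counts > 0] / counts.sum()
    return -np.sum(p * np.log2(p))

def hsv_block_features(hsv_block):
    h = hsv_block[..., 0].astype(float)
    s = hsv_block[..., 1].astype(float)
    v = hsv_block[..., 2].astype(float)
    has = (h + s) / 2.0                 # HAS = (hue + saturation) / 2
    return {
        "h_avg": h.mean(), "s_avg": s.mean(), "v_avg": v.mean(),
        "h_ent": channel_entropy(h), "s_ent": channel_entropy(s),
        "v_ent": channel_entropy(v),
        "has": has.mean(), "has_ent": channel_entropy(has),
    }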

III. ARTIFICIAL NEURAL NETWORKS

Artificial Neural Networks (ANNs) are known for properties such as adaptability, the ability to learn from examples, and the ability to generalize. In this work, we used a multilayer perceptron (MLP) [14], a feedforward neural network model that maps sets of input data onto specific outputs. We used the backpropagation technique [15], which estimates the weights based on the amount of error in the output compared to the expected results.

In this work, we evaluated only one hidden-layer configuration: a layer with five neurons. All neural network configurations tested have only one neuron in the output layer, enough to classify the pixel-block as navigable (returning 1) or non-navigable (returning 0). However, the neural networks returned decimal values between 0 and 1. For this reason we defined the responses as follows (see the sketch below):
• if result ≤ 0.3, the region is classified as non-navigable;
• if result ≥ 0.7, the region is classified as navigable;
• if 0.3 < result < 0.7, the region is classified as unknown.
Notice that the unknown classification is actually an error value.
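These thresholds translate directly into a small decision function; a minimal sketch:

NON_NAVIGABLE, NAVIGABLE, UNKNOWN = 0, 1, -1

def classify(ann_output):
    # Map the network's real-valued output in [0, 1] to a class.
    if ann_output <= 0.3:
        return NON_NAVIGABLE
    if ann_output >= 0.7:
        return NAVIGABLE
    return UNKNOWN  # treated as an error value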

The input layer varies depending on the combination of evaluated features. If RGB is the evaluated feature, the input layer has three neurons, one for each channel. If the combination of RGB and H entropy is evaluated, the input layer has four neurons. The neural networks were evaluated every 100 training cycles, up to 2,000 cycles.
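The paper used SNNS for training; as a purely illustrative modern substitute (our choice, not the authors' tool), the same topology - n input neurons, five hidden neurons, one output trained by backpropagation - could be set up with scikit-learn:

import numpy as np
from sklearn.neural_network import MLPClassifier

# One feature vector per pixel-block; four inputs here would
# correspond to, e.g., the RGB averages plus H entropy.
X = np.random.rand(480, 4)
y = np.random.randint(0, 2, 480)   # 1 = navigable, 0 = non-navigable

mlp = MLPClassifier(hidden_layer_sizes=(5,), max_iter=2000)
mlp.fit(X, y)
outputs = mlp.predict_proba(X)[:, 1]   # values in [0, 1], thresholded as above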

IV. EXPERIMENTS AND RESULTS

In order to analyze the various attribute combinations, several experiments were carried out on the university campus. We collected data in realistic environments under different conditions. More specifically, we recorded the path of a car on campus, through streets flanked by sidewalks, parking lots, or vegetation. In addition, portions of the street had adverse conditions such as sand and dirt.

Our experimental setup was a car equipped with a Canon A610 digital camera. The image resolution was (320 x 240) pixels at 30 FPS. The car and camera were used only for data collection. In order to execute the experiments with ANNs, we used the Stuttgart Neural Network Simulator (SNNS), a software simulator for neural networks developed at the University of Stuttgart. The OpenCV library was used for image acquisition and to visualize the processed results from SNNS. The pixel-block size used was K = 10, giving each frame 768 pixel-blocks.

We performed the experiments in two phases. In the first phase we trained the neural networks with one simple frame, where the road is flanked by grass and a parking lot on each side. The frame used for evaluation was similar to the one used in the training step. The second phase was more complex: we used five frames with different conditions for the training step and evaluated with fifteen frames (the five from the training step plus ten other frames). We divided the work into two phases because the number of classifiers was very large, almost 28,000. The first phase eliminated the combinations of attributes that did not obtain satisfactory results, reducing the number of candidates for the second phase.

A. Phase 1

In Phase 1, we tested combinations of 21 features extracted from the pixel-blocks: average R (red), average G (green), average B (blue), RGB entropy, average H (hue), average S (saturation), average V (value), HSV entropy, H entropy, S entropy, V entropy, H variance, S variance, V variance, RGB energy, HSV energy, H energy, S energy, V energy, HAS, and HAS entropy. Each feature corresponds to a neuron in the input layer of the neural network, so we tested combinations with one, two, three, four, and five attributes, i.e., neural network configurations with one to five neurons in the input layer, totaling 27,890 different neural networks and attribute combinations evaluated.
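Enumerating all subsets of one to five of the 21 features with itertools gives 27,895 candidates, close to the 27,890 reported (a handful of combinations were presumably excluded); a sketch, with feature labels of our own choosing:

from itertools import combinations

FEATURES = [
    "R avg", "G avg", "B avg", "RGB ent", "H avg", "S avg", "V avg",
    "HSV ent", "H ent", "S ent", "V ent", "H var", "S var", "V var",
    "RGB ene", "HSV ene", "H ene", "S ene", "V ene", "HAS", "HAS ent",
]

subsets = [c for r in range(1, 6) for c in combinations(FEATURES, r)]
print(len(subsets))   # 27895 subsets of one to five features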

(a) Frame used in training (b) Frame used in evaluation

Fig. 2. Images Used in Phase 1.

The frames used in this phase are shown in Fig. 2. An important detail about this stage is that only the blocks below a certain line were used both to train and to evaluate; thus each frame generated only 480 pixel-blocks, as can be seen in the visualization of the network response (Fig. 3). This is due to the fact that much of the top of the image represents sky, which can be eliminated with pre-processing [16].

(a) Best Result (b) Evaluation of Best Result

Fig. 3. Classification results: 3(a) blocks classified as non-navigable (magenta) and navigable (cyan) by the network; 3(b) correct, false-negative, false-positive, and unknown classifications in green, blue, red, and yellow, respectively.

Among all the neural network configurations tested, 16,976 achieved hit rates between 90% and 98%, and about one thousand configurations reached the best result. It is important to note that the hit rate is a percentage of the 480 pixel-blocks in the evaluation frame, which means that a network with a hit rate of 98% misclassified only about ten pixel-blocks. However, besides comprising very different combinations, this number of networks is still intractable. Because of this, we executed the second phase evaluating only the neural network configurations that achieved a hit rate of 90% or more.

B. Phase 2

Based on the results obtained in Phase 1, we evaluated the 16,976 neural network configurations with new patterns from different street conditions. The evaluation method, the image region, and the other settings were the same as in Phase 1. This time, we used patterns generated from five frames for the training step; the frames can be seen in Fig. 4. For the evaluation stage, we used patterns generated from fifteen frames, including those used in the training stage. The other frames used in the evaluation step are shown in Fig. 5. Among these frames can be seen scenes of curves (Fig. 4(c)), dirt roads (Fig. 4(e)), and streets with no defined edges (Fig. 4(d)).

Out of the results obtained from all the neural network configurations tested, 5,967 achieved hit rates between 90% and 93%. One important detail is that in Phase 1 there were neural networks that reached 98%, while in Phase 2 the maximum reached was 93%. Among these 5,967 neural network configurations, we analyzed the number of times each subset of attributes appeared.

It is important to note that this analysis is more complex because the hit rate is a percentage of the (15 * 480) pixel-blocks from the evaluation frames; we call this the "general analysis". It means that a network with a hit rate of 93% misclassified 504 pixel-blocks, and in principle every error could be concentrated in a single frame. Therefore, we performed a more careful analysis that takes into account the results for each frame; we call this the "analysis per frame". In this analysis, we calculate the error rate for each frame and then compute the average and standard deviation of these values, so we can see which network performs steadily in all situations along the path. A sketch of both analyses appears below.
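A minimal sketch of the two analyses, assuming a list with the number of misclassified blocks per evaluation frame (names are ours):

import numpy as np

def general_analysis(errors_per_frame, blocks_per_frame=480):
    # Error rate over all (15 * 480) evaluated pixel-blocks.
    total = sum(errors_per_frame)
    return 100.0 * total / (len(errors_per_frame) * blocks_per_frame)

def analysis_per_frame(errors_per_frame, blocks_per_frame=480):
    # Mean and standard deviation of the per-frame error rates;
    # a large deviation exposes errors concentrated in a few frames.
    rates = 100.0 * np.array(errors_per_frame) / blocks_per_frame
    return rates.mean(), rates.std()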

C. Frequency Analysis

In this work we assume that all neural networks that achieve a hit rate of 90% or more are acceptable: approximately five thousand neural network configurations. Instead of analyzing them one by one, we decided to examine what these neural network configurations have in common. In other words, we analyzed the number of times each subset of attributes appeared among the neural network configurations considered good. The neural networks with these subsets were retrained and re-evaluated. Based on these new results it was possible to see the contribution of a combination of attributes to the network classifier.

Among all the 5,967 neural network configurations that were successful, there are neural networks that use five, four, three, two, and even one attribute as input.

(a) Frame 1 (b) Frame 2 (c) Frame 3 (d) Frame 4 (e) Frame 5

Fig. 4. Images Used in Training of Phase 2.

(a) Frame 6 (b) Frame 7 (c) Frame 8 (d) Frame 9 (e) Frame 10

(f) Frame 11 (g) Frame 12 (h) Frame 13 (i) Frame 14 (j) Frame 15

Fig. 5. Frames Used in the Evaluation Step of Phase 2.

So we counted how many times each subset X was used as input by these "good" neural networks, where X is a subset of up to four elements of the 21 attributes evaluated, as sketched below.
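A sketch of this frequency count, assuming each "good" network is described by the tuple of attribute names it uses as input (function and variable names are ours):

from collections import Counter
from itertools import combinations

def subset_frequencies(good_networks, max_size=4):
    # Count how often each attribute subset X of up to max_size
    # elements appears among the inputs of the good networks.
    freq = Counter()
    for inputs in good_networks:
        for r in range(1, min(max_size, len(inputs)) + 1):
            for subset in combinations(sorted(inputs), r):
                freq[subset] += 1
    return freq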

TABLE I
ATTRIBUTES THAT APPEARED MOST FREQUENTLY.

B average   | Hue average | Hue entropy   | Value entropy
Hue average | Hue entropy | Value entropy | HSV energy
Hue average | Hue entropy | Value entropy | HAS entropy

Among subsets of one element, the attribute that stood out was "V entropy", appearing 2,740 times; all others appeared fewer than two thousand times. However, the results with only one attribute were not satisfactory. Because of this, the analysis started from subsets of four elements. Table I shows the three four-element subsets that were used most. Each of these subsets appeared 18 times - the best possible result, since there are only 17 remaining attributes to serve as the fifth element (or none at all) - which means these combinations yielded good performance no matter which fifth attribute was used.

Note that these subsets have (Hue average, Hue entropy, Value entropy) in common, which is the most used subset of three elements, appearing 146 times. Based on the attributes from Table I, we reviewed all the neural network configurations that use these attributes. More specifically, we reviewed the neural network configurations shown in Table II, where the AT columns are attributes and the AVE column is the average hit rate from the "general analysis" over ten evaluation runs of each network.

(a) Answer of network with Blue average.

(b) Answer of network with HAS entropy.

Fig. 6. Classification results: blocks classified as non-navigable (magenta) and navigable (cyan) by the network; yellow represents the unknown classification.

The SD column is the standard deviation of AVE, the APF column is the average error rate from the "analysis per frame", and the SDF column is the standard deviation of APF. Table II is sorted by AVE.

From the results shown in Table II, note that the top ten from the "general analysis" are also the top ten from the "analysis per frame", although in a different order, and the best network is the best in both analyses. Note that the best network has only a 6.93% error rate per frame with a low standard deviation, so for all frames tested this network missed less than 10%, which in numerical terms is a good performance. However, it is necessary to determine how the errors are distributed in the frame, because misclassified blocks grouped together are more harmful than blocks scattered across the region of interest.

For example, the classification of Frame 1 was not so good for two of the best configurations, as discussed after Table II.


TABLE II
EVALUATION OF THE MOST CONTRIBUTING ATTRIBUTES. (AVE. = AVERAGE, ENT. = ENTROPY, ENE. = ENERGY)

AT 1      | AT 2   | AT 3   | AT 4   | AVE  | SD   | APF   | SDF
Blue ave. | H ave. | H ent. | V ent. | 92   | 0.94 | 6.93  | 3.32
HAS ent.  | H ave. | H ent. | V ent. | 91.7 | 0.82 | 7.74  | 3.68
Blue ave. | H ent. | V ent. |        | 91.6 | 0.7  | 7.28  | 4.49
Blue ave. | V ent. |        |        | 91.5 | 0.53 | 7.38  | 4.59
HSV ene.  | H ave. | H ent. | V ent. | 91.2 | 0.63 | 7.74  | 3.62
H ave.    | H ent. | V ent. |        | 91.0 | 0.47 | 8.18  | 3.21
HAS ent.  | H ave. | V ent. |        | 91.0 | 0.0  | 8.54  | 3.59
HSV ene.  | H ave. | V ent. |        | 90.5 | 0.53 | 8.72  | 3.59
H ave.    | V ent. |        |        | 90.3 | 0.48 | 8.56  | 3.73
Blue ave. | H ave. | V ent. |        | 90.2 | 1.32 | 7.39  | 4.3
Blue ave. | H ave. | H ent. |        | 89.8 | 0.63 | 9.04  | 3.8
H ave.    | H ent. |        |        | 89.0 | 0.67 | 10.0  | 3.11
HAS ent.  | H ave. |        |        | 89.0 | 0.0  | 11.93 | 4.08
HSV ene.  | H ave. | H ent. |        | 89.0 | 0.0  | 11.22 | 2.98
Blue ave. | H ave. |        |        | 88.9 | 0.99 | 10.17 | 3.74
Blue ave. | H ent. |        |        | 88.8 | 0.79 | 10.97 | 4.75
HAS ent.  | H ave. | H ent. |        | 87.7 | 7.27 | 9.29  | 4.25
HSV ene.  | H ave. |        |        | 87.5 | 0.85 | 13.31 | 4.41
HAS ent.  | H ent. | V ent. |        | 78.3 | 2.71 | 19.86 | 5.95
HAS ent.  | H ent. |        |        | 74.5 | 0.85 | 23.82 | 6.94
HAS ent.  | V ent. |        |        | 70.6 | 0.7  | 28.15 | 5.45
HSV ene.  | H ent. | V ent. |        | 67.5 | 0.53 | 31.47 | 8.59
H ent.    | V ent. |        |        | 65.7 | 0.48 | 33.54 | 10.38
HSV ene.  | V ent. |        |        | 63.5 | 0.71 | 36.92 | 8.86
HSV ene.  | H ent. |        |        | 59.3 | 2.11 | 42.22 | 7.24

(a) Sidewalk as unknown, with Blue average.

(b) Sidewalk as navigable, without Blue average.

(c) Errors in dirt road. With Blue average, the dirt road is not classified as navigable.

(d) Errors in dirt road, without Blue average.

Fig. 7. Classification results: blocks classified as non-navigable (magenta) and navigable (cyan) by the network; unknown classifications are shown in yellow.

Returning to Frame 1: for the neural networks with (Blue average, H average, H entropy, V entropy) and (HAS entropy, H average, H entropy, V entropy), the error rates for this frame are similar, but the network with Blue average is better than the network with HAS entropy, because the former classified the parking region as unknown (see Fig. 6(a)) while the latter classified it as navigable (see Fig. 6(b)). This type of assessment was made among the best neural network configurations for all frames, but we only discuss the most relevant cases.

Analyzing the results of the top four neural networks, we concluded that the networks with Blue average tend to classify the parking lot and the sidewalks as unknown (see Fig. 7(a)), while the networks without Blue average classify them as navigable (see Fig. 7(b)). However, the networks with Blue average err more on dirt roads, as seen in Fig. 7(c).

Another conclusion to be drawn is that the subset (Hue average, Value entropy) performs well, since it appears eight times in the top ten. One can also see that these two attributes, when combined with a third, slightly improve their performance. Due to this fact, we reanalyzed all neural network configurations of up to three elements that include at least these two attributes as input. Table III has the same columns as Table II, but covers these other neural networks over ten runs of experiments, and is also ordered by AVE.

TABLE III
EVALUATION OF HUE AVERAGE AND VALUE ENTROPY. (AVE. = AVERAGE, ENT. = ENTROPY)

#  | AT 1   | AT 2   | AT 3     | AVE  | SD   | APF   | SDF
1  | H ave. | V ent. | G ave.   | 91.3 | 1.1  | 7.54  | 4.0
2  | H ave. | V ent. | B ave.   | 91.1 | 0.83 | 7.33  | 4.21
3  | H ave. | V ent. | H ent.   | 91.1 | 0.3  | 7.54  | 3.37
4  | H ave. | V ent. | V ave.   | 90.9 | 1.58 | 6.86  | 4.63
5  | H ave. | V ent. | S ent.   | 90.8 | 0.4  | 7.63  | 3.87
6  | H ave. | V ent. | S var.   | 90.8 | 0.6  | 7.82  | 4.43
7  | H ave. | V ent. | HAS ent. | 90.8 | 0.4  | 9.0   | 3.32
8  | H ave. | V ent. | H var.   | 90.7 | 0.9  | 7.9   | 4.01
9  | H ave. | V ent. | HSV ent. | 90.7 | 0.46 | 8.14  | 3.93
10 | H ave. | V ent. | S ave.   | 90.6 | 1.02 | 8.24  | 2.69
11 | H ave. | V ent. | RGB ent. | 90.6 | 0.49 | 8.38  | 3.96
12 | H ave. | V ent. | HSV ene. | 90.6 | 0.49 | 8.58  | 3.85
13 | H ave. | V ent. | RGB ene. | 90.4 | 0.49 | 8.51  | 3.8
14 | H ave. | V ent. | H ene.   | 90.4 | 0.49 | 8.64  | 3.74
15 | H ave. | V ent. | R ave.   | 90.4 | 1.5  | 9.49  | 6.16
16 | H ave. | V ent. |          | 90.3 | 0.46 | 8.46  | 3.89
17 | H ave. | V ent. | V ene.   | 90.3 | 0.64 | 8.74  | 3.76
18 | H ave. | V ent. | S ene.   | 90.2 | 0.4  | 8.78  | 4.01
19 | H ave. | V ent. | HAS ave. | 89.8 | 0.6  | 11.01 | 6.29
20 | H ave. | V ent. | V var.   | 88.0 | 7.01 | 7.86  | 4.38

From the results shown in Table III, note that the top six from the "general analysis" are also the top six from the "analysis per frame". In addition, all the neural networks achieved good results, except network 20. This confirms that Hue average and Value entropy are good attributes for classifying an image of a road scene.

In general, neural network configurations with Blue average, Hue entropy, or Saturation entropy obtained better results in parking lots and sidewalks, but erred in greater proportion on dirt roads. It can be concluded that neural networks that accept dirt as a navigable region also include the sidewalks and parking lots, because of their similar color or texture. The Blue average attribute helps reduce the similarity between dirt streets and sidewalks, but not enough to classify the latter as non-navigable. If we assume that all blocks classified as unknown are non-navigable, then the network can be used in a road-following algorithm with good results.

Good overall performance can be seen in Fig. 8, with some errors in the traffic lanes and at the edges of sidewalks. The most significant error occurred on the dirt road, where the region in the middle of the road was classified as non-navigable, which can be expected due to its similarity in color to the sidewalk and plants.

V. CONCLUSIONS AND FUTURE WORK

Autonomous navigation is one of the main capabilities of autonomous robots.


(a) Frame 1 (b) Frame 6 (c) Frame 2 (d) Frame 7 (e) Frame 3

(f) Frame 8 (g) Frame 9 (h) Frame 10 (i) Frame 4 (j) Frame 11

(k) Frame 12 (l) Frame 13 (m) Frame 5 (n) Frame 14 (o) Frame 15

Fig. 8. Classification results: blocks classified as non-navigable (magenta) and navigable (cyan) by the network; unknown classifications are shown in yellow.

This paper addressed the problem of identifying navigable areas in the environment using artificial neural networks and vision information. Different combinations of network topologies were evaluated in realistic environments.

In general the results were satisfactory, since many neural networks obtained a good success rate. Furthermore, the neural networks classified well the main portion of the street where the car can travel, the region of interest. All networks had errors at the edges and in the traffic lanes, in different proportions, because of the block-based method, and had errors in parking areas because their statistical features are very similar to those of the road. Furthermore, when a neural network classifies the dirt road as navigable, it also classifies the color of dirt as navigable, increasing the number of errors, since dirt is usually considered non-navigable. It may be possible to get a better classification by combining the responses of these networks. As future work we plan to evaluate other image features and more complex environments in order to reduce the number of networks with good results. We also plan to integrate our approach with laser mapping, which provides depth information.

ACKNOWLEDGMENTS

The authors acknowledge the support granted by CNPq and FAPESP to the INCT-SEC (National Institute of Science and Technology - Critical Embedded Systems - Brazil), processes 573963/2008-9 and 08/57870-9.

REFERENCES

[1] R. C. Arkin, Behavior-Based Robotics. Cambridge, MA, USA: MIT Press, 1998.

[2] D. Wolf, G. Sukhatme, D. Fox, and W. Burgard, "Autonomous terrain mapping and classification using hidden markov models," in IEEE ICRA 2005, April 2005, pp. 2026–2031.

[3] Y. He, H. Wang, and B. Zhang, "Color-based road detection in urban traffic scenes," IEEE Transactions on Intelligent Transportation Systems, vol. 5, no. 4, Dec. 2004.

[4] A. Broggi and S. Bert, "Vision-based road detection in automotive systems: A real-time expectation-driven approach," Journal of Artificial Intelligence Research, vol. 3, pp. 325–348, 1995.

[5] C. Rotaru, T. Graf, and J. Zhang, "Extracting road features from color images using a cognitive approach," in IEEE Intelligent Vehicles Symposium, June 2004, pp. 298–303.

[6] J. Zhang and H.-H. Nagel, "Texture-based segmentation of road images," in Proceedings of the Intelligent Vehicles '94 Symposium, Oct. 1994, pp. 260–265.

[7] R. Ghurchian, T. Takahashi, Z. Wang, and E. Nakano, "On robot self-navigation in outdoor environments by color image processing," in ICARCV 2002, 7th International Conference on Control, Automation, Robotics and Vision, vol. 2, Dec. 2002.

[8] D. Pomerleau, "Neural network vision for robot driving," in The Handbook of Brain Theory and Neural Networks, M. Arbib, Ed., 1995.

[9] M. Foedisch and A. Takeuchi, "Adaptive real-time road detection using neural networks," in Proc. 7th Int. Conf. on Intelligent Transportation Systems, Washington, D.C., 2004.

[10] P. Y. Shinzato and D. F. Wolf, "Path recognition for outdoor navigation," in LARS, Valparaiso, Chile, 2009.

[11] P. Y. Shinzato, D. F. Wolf, L. C. Fernandes, and F. S. Osorio, "Path recognition for outdoor navigation using artificial neural networks: Case study," in International Conference on Industrial Technology, Vina del Mar, Chile, 2010.

[12] G. H. Joblove and D. Greenberg, "Color spaces for computer graphics," SIGGRAPH Comput. Graph., vol. 12, no. 3, pp. 20–25, 1978.

[13] C. Reiter, "With J: image processing 2: color spaces," SIGAPL APL Quote Quad, vol. 34, no. 3, pp. 3–12, 2004.

[14] P. S. Churchland and T. J. Sejnowski, The Computational Brain. Cambridge, MA, USA: MIT Press, 1994.

[15] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning internal representations by error propagation," pp. 673–695, 1988.

[16] J. Lee, C. D. C. III, S. Kim, and J. Kim, Eds., Road Following in an Unstructured Desert Environment Using Monocular Color Vision as Applied to the DARPA Grand Challenge. International Conference on Control, Automation and Systems, 2005.
