

Genetic Model Optimization
for Hausdorff Distance-Based Face Localization

Klaus J. Kirchberg, Oliver Jesorsky, and Robert W. Frischholz

BioID AG, Germany
{k.kirchberg,o.jesorsky,r.frischholz}@bioid.com
WWW home page: http://www.bioid.com

Abstract. In our previous work we presented a model-based approach to perform robust, high-speed face localization based on the Hausdorff distance. A crucial step during the design of the system is the choice of an appropriate edge model that fits for a wide range of different human faces. In this paper we present an optimization approach that creates and successively improves such a model by means of genetic algorithms. To speed up the process and to prevent early saturation we use a special bootstrapping method on the sample set. Several initialization functions are tested and compared.

© In Proc. International ECCV 2002 Workshop on Biometric Authentication, Springer, Lecture Notes in Computer Science, LNCS-2359, pp. 103–111, Copenhagen, Denmark, June 2002.

1 Introduction

Face localization is a fundamental step in the process of face recognition. Its aim is to decide whether there is a face in a given image and, in the positive case, to determine the coordinates of the face. The accuracy of the detected face coordinates has a heavy influence on the recognition performance.

In [5] we presented a method for robust frontal face detection based on the Hausdorff distance. This algorithm uses a predefined edge model of the human face to find face candidates in the image. While it is possible to use a simple ellipse as a model, the detection performance can be improved by using a more detailed model. However, it must still represent a wide variety of faces.

In this paper we follow a genetic algorithm approach to generate a face model from scratch and to optimize it based on a fairly large database of sample images. A coding scheme for binary edge models is presented along with the corresponding genetic operators. We describe a bootstrapping optimization framework that speeds up the process by successively adapting the subset of evaluation samples.

Several experiments prove the performance of the system and compare different initialization strategies.

2 Hausdorff Distance-Based Face Detection

This section gives a brief overview of the underlying face detection algorithm. A more detailed description can be found in [5].


The face detection problem can be stated as follows: given an input image, decide whether there is a face in the image or not. If a face is found, return the face coordinates inside the image. In our case these are the coordinates of the left and right eye centers, which is sufficient if the problem is restricted to frontal view faces in images of constant aspect ratio.

For simplicity, we concentrate on the task of finding a single face in an image. The extension to finding multiple faces is straightforward.

We use an edge-based method for finding faces. Therefore we first calculate an edge magnitude image with the Sobel operator. The relevant edge feature points are extracted by a locally adaptive threshold filter to compensate for variable illumination. We assume that this procedure will produce a characteristic arrangement of segmentation points in the facial area.
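The segmentation step can be illustrated with a minimal Python sketch. The Sobel magnitude follows the standard 3×3 operator. The thresholding rule is an assumption made only for illustration, since the text does not specify the locally adaptive filter: a pixel is kept when its magnitude exceeds `factor` times the mean magnitude of its neighbourhood. The names `sobel_magnitude`, `adaptive_edge_points`, `radius` and `factor` are hypothetical.

```python
import math


def sobel_magnitude(img):
    """Sobel gradient magnitude of a grayscale image given as a list of rows.

    Border pixels are left at 0 because the 3x3 kernel does not fit there.
    """
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            mag[y][x] = math.hypot(gx, gy)
    return mag


def adaptive_edge_points(mag, radius=2, factor=1.5):
    """Assumed local threshold (not the paper's filter): keep a pixel if its
    magnitude exceeds `factor` times the mean of its (2*radius+1)^2 window."""
    h, w = len(mag), len(mag[0])
    points = []
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            window = [mag[j][i] for j in ys for i in xs]
            local_mean = sum(window) / len(window)
            if mag[y][x] > 0 and mag[y][x] > factor * local_mean:
                points.append((x, y))  # feature point as (x, y)
    return points
```

Because the threshold is relative to the local mean, a uniformly darker or brighter region scales both sides of the comparison, which is one simple way to reduce sensitivity to illumination.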

Based on the typical layout, which strongly depends on the segmentation steps, we use a face model which itself consists of a set of feature points and can be represented as a binary image.

The feature points of the face model are chosen such that the pattern stored in the model resembles the patterns typically observed in the face area of the images.

To detect a face, the model is superimposed over the image at several discrete positions. At each position the similarity between the translated model and the covered part of the image is calculated. A face is considered to be located at the position yielding the highest similarity between model and covered image part. The procedure is illustrated in figure 1. Note that the model can be scaled to allow detecting faces of different sizes.

[Figure omitted. Labels in the figure: segmentation, localization, face model.]

Fig. 1. Face finding procedure

An efficient yet powerful method to calculate the similarity of two binary images is the Hausdorff distance [7], a metric between two point sets. We use a slightly adapted measure, called the (directed) modified Hausdorff distance (MHD) [2], to calculate the similarity between the image and the model. Given the two point sets A and B and some underlying norm || · || on the points, the MHD is defined as

$$h_{\mathrm{mod}}(A, B) = \frac{1}{|A|} \sum_{a \in A} \min_{b \in B} \|a - b\| . \tag{1}$$
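Equation (1) can be sketched directly in Python. This is a brute-force illustration that assumes the Euclidean norm and points given as `(x, y)` tuples; the function name `directed_mhd` is hypothetical.

```python
import math


def directed_mhd(a_points, b_points):
    """Directed modified Hausdorff distance h_mod(A, B) as in Eq. (1).

    For every point a in A, take the distance to its nearest point in B,
    then average these nearest-neighbour distances over A.
    """
    if not a_points or not b_points:
        raise ValueError("both point sets must be non-empty")
    total = 0.0
    for ax, ay in a_points:
        total += min(math.hypot(ax - bx, ay - by) for bx, by in b_points)
    return total / len(a_points)
```

The measure is directed: with A = {(0, 0), (3, 4)} and B = {(0, 0)}, `directed_mhd(A, B)` averages the distances 0 and 5, while `directed_mhd(B, A)` is 0 because the only point of B coincides with a point of A.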

With the two-dimensional point set A representing the image and T_p(B) representing the translated and scaled model with transformation parameters p, the formula

$$d_{\hat{p}} = \min_{p \in P} h_{\mathrm{mod}}(A, T_p(B)) \tag{2}$$

calculates the distance value of the best matching position and scale. The parameters of the corresponding transformation are represented by the parameter set p̂.

To make an efficient implementation possible, we use a discrete grid for the model point positions. A model can then be represented by a binary image where white pixels represent the model points. The resolution of this image has to be high enough to be able to represent enough detail but has to be as low as possible to minimize computation time for both the localization procedure and the model optimization process. We used a 45 × 47 model grid, which has turned out to be a good trade-off.

One of the major problems of the Hausdorff distance method is the actual creation of the face model. While a simple "hand-drawn" model will be sufficient for the detection of simple objects, a general face model must cover the broad variety of different faces.
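The minimization in Eq. (2) can be illustrated by an exhaustive search over a discrete set of scales and translations. The following Python sketch is an assumption-laden illustration, not the optimized implementation: T_p is taken to be uniform scaling followed by translation, the parameter set P is passed in explicitly, and for brevity the whole image point set is compared rather than only the part covered by the model. The names `transform` and `best_match` are hypothetical.

```python
import math


def directed_mhd(a_points, b_points):
    """Directed modified Hausdorff distance, Eq. (1), Euclidean norm."""
    total = 0.0
    for ax, ay in a_points:
        total += min(math.hypot(ax - bx, ay - by) for bx, by in b_points)
    return total / len(a_points)


def transform(model_points, scale, tx, ty):
    """Assumed form of T_p: scale the model points, then translate them."""
    return [(scale * x + tx, scale * y + ty) for x, y in model_points]


def best_match(image_points, model_points, scales, translations):
    """Return (distance, (scale, tx, ty)) minimizing Eq. (2) over the grid."""
    best = None
    for s in scales:
        for tx, ty in translations:
            d = directed_mhd(image_points, transform(model_points, s, tx, ty))
            if best is None or d < best[0]:
                best = (d, (s, tx, ty))
    return best
```

The cost of this search grows with the number of model points, image points and candidate transformations, which is consistent with the stated motivation for keeping the model grid as small as possible.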

3 Genetic Model Optimization

The task of finding a well-suited model for Hausdorff distance based face localization can be formulated as a discrete global optimization problem. An exhaustive search would produce the optimal result (with respect to a given sample set), but due to the exponential complexity it is not computationally feasible. In the broad area of global optimization methods, Genetic Algorithms (GA) form a widely accepted trade-off between global and local search strategy. They were chosen here because they are well-investigated and have proven their applicability in many fields.

Since their invention by Holland [4], Genetic Algorithms have become a standard solution approach for multi-dimensional global optimization problems. We use the algorithm and terminology of the Simple Genetic Algorithm (SGA) described by Goldberg [3].

To formulate our face finding problem as a genetic algorithm, we have to define the genotype coding of the face model and a fitness function, and set some further parameters (population size, crossover method etc.).

3.1 Genotype Coding

The genotype coding of the face model is done in a fairly straightforward way by a two-dimensional binary genome. We presume that the average face model is symmetric along the vertical axis, which is not exactly true for a single face but sufficient for our purposes. Thus, only the left half of the model is coded in the genome.
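The symmetric coding can be sketched as a decoding step that mirrors the left half of the model. How the centre column of an odd-width grid is handled is not stated in the text; the Python sketch below assumes the genome stores ceil(width / 2) columns, so that the centre column is part of the genome and is not duplicated. The function name `decode_genome` is hypothetical.

```python
def decode_genome(half_rows, width):
    """Expand a left-half binary genome into a full, mirror-symmetric model.

    Assumption: each genome row holds ceil(width / 2) bits, i.e. the left
    half plus the centre column when `width` is odd.
    """
    half = (width + 1) // 2
    full = []
    for row in half_rows:
        if len(row) != half:
            raise ValueError("each genome row must hold ceil(width / 2) bits")
        # Mirror the columns left of the centre; the centre column itself
        # (odd width) appears only once in the decoded row.
        mirrored = list(row[:width // 2])[::-1]
        full.append(list(row) + mirrored)
    return full
```

Coding only one half roughly halves the number of bits the genetic algorithm has to search over, at the price of excluding asymmetric models.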

3.2 Fitness Function

The fitness function assigns a real-valued number to a given model. This value must reflect the performance of the face localization algorithm with a certain model. During the reproduction phase of the GA this value determines an individual's probability to survive and produce offspring. To rate a specific model, it is tested on a set of sample face images. This sample set must be both large enough to be representative and also small enough to allow fast evaluation of the fitness function.

We define the fitness value of a model as the ratio of found faces to the overall number of faces in the set. A face is said to be found if some distance measure between true position and found position is below a certain threshold. We use here the accuracy measure d_eye introduced in [5]. Let d_l and d_r denote the distances between the true eye centers C_l, C_r ∈ R² and the corresponding estimated eye positions, respectively. Then the rating distance is defined as

$$d_{\mathrm{eye}} = \frac{\max(d_l, d_r)}{\|C_l - C_r\|} \tag{3}$$

with the Euclidean norm || · ||.
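Equation (3) and the fitness definition can be written down in a few lines of Python. This is a minimal sketch assuming eye positions given as `(x, y)` tuples; the names `d_eye` and `fitness` are chosen for illustration, and the threshold is left as a parameter.

```python
import math


def d_eye(true_left, true_right, found_left, found_right):
    """Relative eye distance of Eq. (3).

    The larger of the two eye position errors is normalized by the
    distance between the true eye centers.
    """
    d_l = math.hypot(true_left[0] - found_left[0], true_left[1] - found_left[1])
    d_r = math.hypot(true_right[0] - found_right[0], true_right[1] - found_right[1])
    eye_dist = math.hypot(true_left[0] - true_right[0], true_left[1] - true_right[1])
    return max(d_l, d_r) / eye_dist


def fitness(d_eye_values, threshold):
    """Ratio of found faces (d_eye below `threshold`) to all faces in the set."""
    found = sum(1 for d in d_eye_values if d < threshold)
    return found / len(d_eye_values)
```

Normalizing by the true eye distance makes the measure independent of the face size in the image, so one threshold can be used for faces at different scales.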

The threshold that defines a face as found was set to d̂_eye = 0.12 for the optimization process, which means that we allow a shift of 12% with respect to the distance between the true right and left eye. The true eye positions were marked manually.

We used a set consisting of 1362 face images for evaluation. To speed up evolution, we choose a subset of 80 images. This subset selection is updated every five generations. The process is shown in figure 2.

initialize population
do while not converged
    evaluate population on complete set
    build new evaluation set with best model
    run GA for five generations on evaluation set
end do

Fig. 2. Optimization process

When a new evaluation set of face images is built, the localization is performed on the whole set of images with the best model of the current population. For each image we record d_eye and sort the images by this distance. The new evaluation set is then compiled from 40 out of the 200 top-ranking images and 40 out of the 200 images with the lowest rating. This makes the GA learn to find new faces while not "forgetting" the others.
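The construction of the evaluation set can be sketched as follows. Two points are assumptions, because the text does not state them: "top-ranking" is read as the smallest d_eye values, and the 40 images are drawn uniformly at random from each pool of 200. The function name `build_evaluation_set` and its parameters are hypothetical.

```python
import random


def build_evaluation_set(d_eye_by_image, pool_size=200, picks=40, rng=None):
    """Pick `picks` images from the best-ranked pool and `picks` from the worst.

    `d_eye_by_image` maps an image identifier to the d_eye obtained with the
    best model of the current population. Random sampling within each pool
    is an assumption of this sketch.
    """
    rng = rng or random.Random()
    ranked = sorted(d_eye_by_image, key=d_eye_by_image.get)  # smallest d_eye first
    if len(ranked) < 2 * pool_size:
        raise ValueError("need at least 2 * pool_size images")
    best_pool = ranked[:pool_size]
    worst_pool = ranked[-pool_size:]
    return rng.sample(best_pool, picks) + rng.sample(worst_pool, picks)
```

With the defaults this yields the 80-image subset described above: half of it consists of faces the current best model already localizes well, the other half of faces it still misses.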

3.3 Selection, Crossover, Mutation, Population Size

In the choice of the other genetic operators we mostly follow the suggestions in Goldberg's book [3].

Selection is done by the Roulette wheel scheme, which means each individual's selection probability is directly proportional to its fitness.

As crossover operator we use the natural extension of the one-point crossover. Its function in the two-dimensional case is depicted in figure 3.

Fig. 3. One-point crossover operator for 2d binary genomes

Mutation is represented by random bit flip with a probability of 0.00025 for each bit.

The population size is constant and is set to 50 individuals.
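Roulette wheel selection and bit-flip mutation can be sketched in Python as shown below. This is a generic illustration of the two operators, not the code used in the experiments; the names `roulette_select` and `mutate` are hypothetical. The two-dimensional crossover is left out because its exact form is only given by figure 3.

```python
import random


def roulette_select(population, fitnesses, rng):
    """Select one individual with probability proportional to its fitness."""
    total = sum(fitnesses)
    if total <= 0:
        # Degenerate case (all fitness values zero): fall back to a uniform pick.
        return rng.choice(population)
    pick = rng.uniform(0, total)
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if pick < running:
            return individual
    return population[-1]  # guards against floating-point round-off


def mutate(genome, rate, rng):
    """Flip each bit of a 2D binary genome independently with probability `rate`."""
    return [[bit ^ 1 if rng.random() < rate else bit for bit in row]
            for row in genome]
```

With a per-bit rate of 0.00025, a genome of a few hundred to a thousand bits receives less than one flipped bit per individual on average, so most variation in a generation comes from selection and crossover.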

3.4 Initialization

Another important decision is how to initialize the population. In our experiments we used three different initializations:

– blank model
– average edge model
– hand-drawn model

These initialization options are described in more detail in the next section.

4 Experiments

The described approach was first tested on three different setups. They all differed in the method used to initialize the first generation. The purpose of this test was to check the influence of the initialization on the convergence behavior.

In the first method, the population was initialized with random points, each one having a 5% probability of being set. Further genetic parameters were:

crossover rate     0.9
mutation rate      0.00025
population size    50

Some models generated by the GA in this run are shown in figure 4.

Fig. 4. Models from the randomly initialized GA run

For the second setup, an average edge map was generated from a set of sample images. In the initialization step, the model points were randomly set with a probability proportional to the value of the corresponding point in the average edge map. Some models from this run are shown in figure 5.

Fig. 5. Models from the average edge map-initialized GA run

The third run was initialized with a hand-drawn face model and 5% of the bits flipped. Figure 6 shows the hand-drawn model and some other models from this run.

Fig. 6. Models from the hand-drawn initialized GA run
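The three initialization strategies can be summarized in a short Python sketch. Several details are assumptions: the proportionality constant for the edge map is exposed as a hypothetical `scale` parameter, edge map values are assumed to lie in [0, 1], and "5% of the bits flipped" is read as an independent 5% flip probability per bit. All function names are chosen for illustration.

```python
import random


def random_init(height, width, p, rng):
    """First setup: set each model point independently with probability p."""
    return [[1 if rng.random() < p else 0 for _ in range(width)]
            for _ in range(height)]


def edge_map_init(avg_edge_map, scale, rng):
    """Second setup: set each point with probability scale * edge map value.

    The product is clipped to 1; `scale` is a hypothetical parameter.
    """
    return [[1 if rng.random() < min(1.0, scale * v) else 0 for v in row]
            for row in avg_edge_map]


def perturbed_init(model, flip_probability, rng):
    """Third setup: copy a given binary model and flip bits at random."""
    return [[bit ^ 1 if rng.random() < flip_probability else bit for bit in row]
            for row in model]
```

Calling one of these functions once per individual, for example `random_init(rows, cols, 0.05, rng)` with the grid dimensions of the model, yields a first generation whose members differ from each other, which the crossover operator needs in order to produce new models.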

The resulting models were tested on the XM2VTS [6] data set and the BIOID face test set, which is publicly available at [1]. The first mentioned contains 2360, the second 1521 gray level images, each of them showing a single face.

The results for the three models on both sets are summarized in Figure 7. The figure shows the distribution function of the detection results rated using the same method as used by the fitness function described in the previous section (see eq. 3).

[Plots omitted. Two panels, (a) and (b); x-axis: d_eye, 0 to 0.5; y-axis: [%], 0 to 100; curves: average edge map, blank, hand-drawn.]

Fig. 7. Distribution function of relative eye distances for the XM2VTS (a) and the BIOID face data set (b) for the best models of the three runs
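A distribution function of this kind can be computed from the per-image d_eye values as sketched below. The sketch assumes that the plotted percentage is the share of images whose d_eye lies below the value on the x-axis, in line with the "found" criterion of the fitness function; the function names are hypothetical.

```python
def detection_rate(d_eye_values, threshold):
    """Percentage of images whose d_eye is below `threshold`."""
    if not d_eye_values:
        raise ValueError("d_eye_values must not be empty")
    hits = sum(1 for d in d_eye_values if d < threshold)
    return 100.0 * hits / len(d_eye_values)


def distribution_function(d_eye_values, thresholds):
    """Sample the distribution function at the given thresholds.

    Returns a list of (threshold, percentage) pairs, one per x-axis position.
    """
    return [(t, detection_rate(d_eye_values, t)) for t in thresholds]
```

Reading such a curve at a single x-axis position gives the detection rate for that tolerance, so one plot summarizes the performance for every choice of threshold.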

At a threshold of d_eye = 0.25, more than 80% of the faces in the XM2VTS test set are found. According to the definition of d_eye, a value of 0.25 corresponds to about half the width of an eye. This has been shown to be a reasonable threshold for robust face recognition.

The results on the BIOID test set are a little poorer because this set was recorded under a larger variety of illumination and face scale and therefore poses a harder problem for face detection systems.

The model gained from blank initialization performs best on both data sets. Therefore we used the blank initialization method to start a second optimization on a larger image database, also using more generations than in the first evaluations.

With the resulting model, which is shown in figure 8 together with the corresponding distribution functions, the localization performance could be increased to 92.8% on the BIOID and 94.2% on the XM2VTS dataset (again considering a maximum allowed error of 0.25 relative eye distance).

In comparison, the detection rate for the hand-drawn model itself is 62.3% on the XM2VTS database.

Due to the lack of a common performance measurement for face detection algorithms it is hard to compare different approaches. For example, Smeraldi et al. [8] reported a detection rate of 91% for an SVM approach on a subset of 349 images from the M2VTS database. They allowed an absolute tolerance of 3 pixels for eye and mouth positions.

[Figure omitted. Panel (a): model image; panel (b): plot with x-axis d_eye, 0 to 0.5; y-axis: [%], 0 to 100; curves: BioID, XM2VTS.]

Fig. 8. Resulting model (a) and corresponding distance distribution functions for the XM2VTS and the BIOID face set (b).

5 Conclusions

One of the major problems in model-based face detection is the creation of a proper face model. We have presented a genetic algorithm approach for obtaining a binary edge model that allows localization of a wide variety of faces with the Hausdorff search method.

The experiments showed that the GA performs better when starting from scratch than from a hand-drawn model. With this method, the localization performance could be improved to more than 90% compared to roughly 60% for the hand-drawn model.

Genetic Algorithms are a powerful tool that can help in finding an appropriate model for face localization. The presented framework led to a model that performed considerably better than a simple hand-drawn model.

Face localization can be improved by a multi-step detection approach that uses more than one model in different grades of detail. Each of these models can then be optimized separately. This not only speeds up the localization procedure but also produces more exact face coordinates.

References

[1] BioID face database. http://www.bioid.com/research/index.html.

[2] M.P. Dubuisson and A.K. Jain. A modified Hausdorff distance for object matching. In ICPR94, pages A:566–568, Jerusalem, Israel, 1994.

[3] David E. Goldberg. Genetic Algorithms in Search, Optimization and Machine Learning. Addison Wesley, 1989.

[4] John H. Holland. Adaptation in Natural and Artificial Systems. The University of Michigan Press, Ann Arbor, 1975.

[5] Oliver Jesorsky, Klaus J. Kirchberg, and Robert W. Frischholz. Robust Face Detection Using the Hausdorff Distance. In Josef Bigun and Fabrizio Smeraldi, editors, Audio- and Video-Based Person Authentication - AVBPA 2001, volume 2091 of Lecture Notes in Computer Science, pages 90–95, Halmstad, Sweden, 2001. Springer.

[6] K. Messer, J. Matas, J. Kittler, J. Luettin, and G. Maitre. XM2VTSDB: The extended M2VTS database. In Second International Conference on Audio and Video-based Biometric Person Authentication, pages 72–77, March 1999.

[7] W. Rucklidge. Efficient Visual Recognition Using the Hausdorff Distance, volume 1173 of Lecture Notes in Computer Science. Springer, 1996.

[8] F. Smeraldi, N. Capdevielle, and J. Bigun. Facial features detection by saccadic exploration of the Gabor decomposition and Support Vector Machines. In Proceedings of the 11th Scandinavian Conference on Image Analysis - SCIA 99, Kangerlussuaq, Greenland, volume I, pages 39–44, June 1999.