

52nd International Symposium ELMAR-2010, 15-17 September 2010, Zadar, Croatia

Corner Sharpening With Modified Harris Corner Detection to Localize Eyes in Facial Images

Moein Lak1, AmirMohamad Soleimani Yazdi2

1, 2 Islamic Azad University, South Tehran Branch, Technical & Engineering Faculty, P.O. Box: 11365-4435, Tehran, Iran [email protected]

Abstract— The Harris corner detection algorithm has been widely used to find corners in images, and it has been combined with filters for eye localization. In this paper, a new filter-combination strategy is proposed for eye localization, in which filters find and highlight the corners of regions with locally maximal intensity differences, referred to here as sharp regions. The proposed filtering process combines the Harris corner detector with Homomorphic and Tophat-Bothat morphologic filtering. The Tophat-Bothat filter first enhances the image, producing highlighted regions with maximal differences in intensity level, after which the Homomorphic filter extracts the desired regions and suppresses the others. The proposed method was tested in different applications, including biometrics and industrial (mechanical) image analysis, in which we used the sharp-region approach for corner detection. Experimental results, including facial image analysis on both color and grayscale images, all indicate that high performance can be achieved using the proposed method.

Keywords-Harris Corner Detection; Image Analysis; Eye Localization.

I. INTRODUCTION

Localization of eye position in facial images has always been considered to play an important role in many systems, such as face recognition, gaze tracking, and iris recognition. By locating the position of the eyes, the gaze can be determined. Detecting the eyes also facilitates locating other facial features, such as the eyelids, nose, and mouth, needed for recognition tasks. Eye detection is invaluable in determining the orientation of the face, and it serves model-based video coding, face normalization for passport images, gaze detection, and human-computer interaction. The significance of the eyes among facial features is due to their relatively constant intraocular distance.

Various algorithms for eye detection can generally be classified as follows. At the first stage, a rough estimate of the eye region is obtained, which also involves imposing a priori knowledge on the face image so that eye windows can be located. This is followed by fine-tuning of the eye zone to localize the eye positions. In this paper, we describe an eye detection algorithm that is robust against numerous factors, including changes in lighting conditions, visual angle, noise, contrast, and head orientation. At this stage, the background of the image must have no texture, but its color can be mixed. When the filters are applied to the image, only regions belonging to corners with a maximum difference in intensity, such as the eyes, nostrils, furrows in hair, shirt collars, and shirt pockets, gain improved contrast. Considering the density as well as the accumulation of intensity, it is then possible to select the eye-pair region using a suitable threshold. By projecting the intensity values along the axes of the image, the exact bounding box of each eye can then be identified. While other methods, such as projections and peak-amplitude finding, are also used to find the approximate eye region, our algorithm finds the eye zone globally. In this algorithm, other corners are also detected, as seen in Fig. 1(f), but they are mostly vertical in shape, whereas the eye shape is horizontal. Using this simple remedy, the eye region is obtained globally by binary enhancement via morphologic filters and labeling of the image components.

Section II summarizes various techniques that have been utilized for eye detection. Sections III and IV describe the details of the proposed method. Section V illustrates the experimental results. Finally, the paper is concluded in Section VI.

II. BRIEF OVERVIEW OF RELATED WORKS

There have been several research works dealing with eye detection and eye-region localization during the last decade, and numerous approaches have been proposed [1-5]. These techniques are based on texture, shape, and color information, or a combination of them, for eye detection. Generally, the approaches are categorized into image segmentation, grayscale projection, edge detection, template matching, and deformable template matching. Lam et al. [1] improve the deformable-template matching method using eye corners. Vezhnevets et al. [2] estimate approximate eyelid contours based on facial features (eye corners, iris border points); the iris center, radius, and upper eyelid are then detected, outliers are removed by filtering the resulting image, and a polynomial curve is fitted to the boundary points. Projection functions [3][4][5] have also been presented in different publications. Reinders et al. [6] used template matching of eye positions applied to a sequence of video images. Some other modified versions of the projection method are introduced by Feng and Yuen [3] and Zhou and Geng [4]. Huang et al. [7] applied optimal wavelet packets for eye representation and radial basis functions for the subsequent classification of facial areas into eye and non-eye regions.



Yuille et al. [8] proposed deformable templates for locating the human eye. Saber et al. [9] used the geometrical structure of facial images to estimate the location of the eyes. Real-time eye detection [10][11][12][13] using infrared images exploits the physiological properties of the eyes. Gabor filters [14], principal component analysis [15][16], Eigeneyes [17], SVM-based architectures [18], and neural-network classifiers [19] have also been used in the literature. Current eye detection methods can be divided into two categories: active and passive [23]. Active methods use special illumination and IR cameras to quickly locate pupil centers; their disadvantages are that they need special lighting sources and produce more false detections in outdoor environments. Passive methods directly detect eyes from images in the visual spectrum under normal illumination conditions. Niu et al. [24] used 2D cascaded AdaBoost for eye localization, in which two cascade classifiers bootstrap positive and negative samples. A good survey of recent works is available in [25].

III. SYSTEM ARCHITECTURE AND METHODOLOGY

As previously indicated, the proposed algorithm is composed of two major steps. At the first stage, a coarse estimate of the eye pair is obtained through sequential filtering. For color images, by converting the image to Lab space and applying the Homomorphic filter and the Harris corner detector combined with Tophat-Bothat filters, the main facial template image is generated. By trimming the output image with morphologic filters and applying a suitable enhancement, a texture-based threshold is applied to the image to obtain the corresponding binary image. At this stage, projection is performed on the binary image to localize the eye pair and remove outliers. Finally, by component labeling and removing the vertically shaped (non-eye) components, the final binary image is produced. Using morphologic operations, one can easily find the bounding box and centroid of the eyes (Fig. 1). Some of the related theoretical considerations of the main steps are described below.

IV. IMAGE PROCESSING AND ENHANCEMENT

A brief explanation of Homomorphic filters, morphologic filters, and the Harris corner detector is first given. Then, the output of each step of the algorithm is shown and the necessary details of each step are included. We used the luminance part of the image as input by simply converting the image into Lab space.
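The luminance input described above can be sketched as follows. This is a minimal, self-contained implementation of the standard sRGB-to-CIE L* (lightness) conversion, not the authors' code; the D65 weighting constants are the usual published values.

```python
import numpy as np

def rgb_to_lightness(rgb):
    """Return the CIE L* (lightness) channel of an sRGB image.

    rgb: float array in [0, 1] with shape (..., 3).
    Follows the standard sRGB -> linear RGB -> Y -> L* pipeline;
    the paper simply uses the luminance part of Lab space as the
    input to its filter chain.
    """
    rgb = np.asarray(rgb, dtype=float)
    # Undo the sRGB gamma to get linear RGB.
    linear = np.where(rgb <= 0.04045, rgb / 12.92,
                      ((rgb + 0.055) / 1.055) ** 2.4)
    # Relative luminance Y (D65 weights), so that white has Y = 1.
    y = (0.2126 * linear[..., 0] + 0.7152 * linear[..., 1]
         + 0.0722 * linear[..., 2])
    # CIE lightness function with its linear toe near black.
    eps = (6 / 29) ** 3
    f = np.where(y > eps, np.cbrt(y), y / (3 * (6 / 29) ** 2) + 4 / 29)
    return 116 * f - 16  # L* in [0, 100]
```

A grayscale copy of the image could be used instead; L* is chosen here only because the paper works in Lab space.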

A. Homomorphic Filtering

The illumination-reflectance model of an image can be used as the basis for a frequency-domain procedure that is useful for improving the appearance of an image by simultaneous brightness-range compression and contrast enhancement [20]. An image f(x,y) can be expressed in terms of its illumination and reflectance components as follows:

f(x,y) = i(x,y) r(x,y)    (1)

However, the product representation of the image in the spatial domain given above does not yield a separable representation of illumination and reflectance in the Fourier domain, because the Fourier transform of a product of two functions is not separable. The low-spatial-frequency illumination is separated from the high-frequency reflectance by Fourier high-pass filtering. In general, a high-pass filter is used to separate and suppress low-frequency components while still passing the high-frequency components of a signal, provided the two components are additive, i.e., the actual signal is the sum of the two. However, in this illumination/reflectance problem, the low-frequency illumination is multiplied by, rather than added to, the high-frequency reflectance. To be able to use the usual high-pass filter, a logarithmic operation is needed to convert the multiplication into addition. Therefore, we use the following image representation, to which the Fourier transform is applied:

ln f(x,y) = ln i(x,y) + ln r(x,y)    (2)

We can now apply filtering separately in the frequency domain as given below:

Z(u,v) = I(u,v) + R(u,v)    (3)

where I(u,v) and R(u,v) are the Fourier transforms of the two terms of equation (2). If we process Z(u,v) by means of a filter function H(u,v) and then apply the inverse Fourier transform, the equation becomes:

s(x,y) = i'(x,y) + r'(x,y)    (4)

where

i'(x,y) = F^-1{H(u,v) I(u,v)},  r'(x,y) = F^-1{H(u,v) R(u,v)}    (5)

Finally, as z(x,y) was formed by taking the logarithm of the original image f(x,y), the inverse (exponential) operation yields the desired enhanced image. The illumination component of an image is generally characterized by slow spatial variations, while the reflectance component tends to vary abruptly, particularly at the junctions of dissimilar objects. These characteristics lead to associating the low frequencies of the Fourier transform of the logarithm of an image with illumination and the high frequencies with reflectance. Although these associations are rough approximations, they can be used to improve image enhancement results.
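The log-FFT-filter-exp chain of equations (2)-(5) can be sketched as follows. The Gaussian high-emphasis transfer function and its parameters (`sigma`, `gamma_low`, `gamma_high`) are illustrative choices, not values given in the paper.

```python
import numpy as np

def homomorphic_filter(img, sigma=30.0, gamma_low=0.5, gamma_high=2.0):
    """Homomorphic filtering sketch: log -> FFT -> high-emphasis -> exp.

    img: 2-D array of non-negative intensities. `sigma` is the Gaussian
    cutoff; `gamma_low`/`gamma_high` are the gains applied to low/high
    frequencies (illumination/reflectance).
    """
    img = np.asarray(img, dtype=float)
    z = np.log1p(img)                 # ln f = ln i + ln r       (eq. 2)
    Z = np.fft.fft2(z)                # Z(u,v) = I(u,v) + R(u,v) (eq. 3)
    rows, cols = img.shape
    u = np.fft.fftfreq(rows)[:, None] * rows
    v = np.fft.fftfreq(cols)[None, :] * cols
    d2 = u ** 2 + v ** 2              # squared distance from the DC term
    # High-emphasis filter: attenuates low frequencies (illumination)
    # toward gamma_low, boosts high frequencies toward gamma_high.
    H = (gamma_high - gamma_low) * (1 - np.exp(-d2 / (2 * sigma ** 2))) + gamma_low
    s = np.real(np.fft.ifft2(H * Z))  # s = i' + r'              (eq. 4)
    return np.expm1(s)                # invert the logarithm
```

On a perfectly uniform image only the DC term survives, so the result is simply the input compressed by `gamma_low` in the log domain, which makes the behavior easy to sanity-check.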

B. Harris Corner Detection

The Harris corner detector is a popular point detector due to


its strong invariance to rotation, scale, illumination variation, and image noise [2]. The Harris corner detector is based on the local auto-correlation function of a signal, where the local auto-correlation function measures the local changes of the signal with patches shifted by a small amount in different directions. Harris and Stephens improved upon Moravec's corner detector by considering the differential of the corner score with respect to direction directly, instead of using shifted patches. Let the image be denoted by I. Consider taking an image patch over the area (u,v) and shifting it by (x,y). The sum of squared differences (SSD) between these two patches, S, is given by:

S(x,y) = Σ_u Σ_v (I(u,v) − I(u+x, v+y))²    (6)

The Harris matrix A is found by taking the second derivative (the Hessian) of S around (x,y) = (0,0):

A = | <Ix²>    <Ix Iy> |
    | <Ix Iy>  <Iy²>   |    (7)

where the angle brackets denote averaging (summation over (u,v)), and the typical notation for partial derivatives is used. If a circular window (or a circularly weighted window, such as a Gaussian) is used, then the response will be isotropic. Although there are two versions of the Harris corner detector response, we used the following equation as the corner detector:

Cor = (<Ix²><Iy²> − <Ix Iy>²) / (<Ix²> + <Iy²>)    (8)
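Equation (8) can be sketched as follows. The central-difference gradients and the box-shaped averaging window are illustrative stand-ins for the circularly weighted window mentioned above, not the authors' exact choices.

```python
import numpy as np

def harris_response(img, win=3):
    """Harris response Cor = (<Ix^2><Iy^2> - <IxIy>^2) / (<Ix^2> + <Iy^2>).

    Gradients are central differences; <.> is a box average over a
    `win` x `win` window (a plain stand-in for Gaussian weighting).
    """
    img = np.asarray(img, dtype=float)
    Iy, Ix = np.gradient(img)          # axis 0 = rows (y), axis 1 = cols (x)

    def box_avg(a):
        # Summed-area-table box filter, same output size, edge-padded.
        pad = win // 2
        ap = np.pad(a, pad, mode='edge')
        c = np.cumsum(np.cumsum(ap, axis=0), axis=1)
        c = np.pad(c, ((1, 0), (1, 0)))
        return (c[win:, win:] - c[:-win, win:]
                - c[win:, :-win] + c[:-win, :-win]) / win ** 2

    Sxx, Syy, Sxy = box_avg(Ix * Ix), box_avg(Iy * Iy), box_avg(Ix * Iy)
    return (Sxx * Syy - Sxy ** 2) / (Sxx + Syy + 1e-12)   # eq. (8)
```

On a flat region the response is zero, along a straight edge the determinant vanishes, and only at a two-directional intensity change (a corner) does the response become large, which is exactly the sharp-region behavior the paper relies on.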

C. Tophat-Bothat Filter

Morphology is a broad set of image processing operations that process images based on shapes. Morphological operations apply a structuring element to an input image, creating an output image of the same size. Instead of constructing a dot product of a mask array and a pixel array, morphological processes use set-theory operations such as intersection (AND), union (OR), and complement (NOT) to combine pixels logically into a resulting pixel value.

Morphological operators work much like spatial convolution, based on the values of the neighboring pixels. Bothat filtering is a combination of closing and subtraction of the original image, while Tophat filtering is a combination of opening and subtraction:

Bothat = subtract the input image from its closing
Tophat = subtract the opening from the input image

We combined the two filters by adding the Tophat-filtered image to the original image and subtracting the Bothat output.

V. EXPERIMENTAL RESULTS

In this section, the results of each block in Fig. 1 are explained; they demonstrate the performance of the algorithm. Because of the very low intensity of the output in some steps, for good visualization we show the complement of the output. Fig. 2(a) shows the original image in RGB space. We convert it to Lab space and use the luminance component of the image, shown in Fig. 2(b). Fig. 2(c) depicts the complement of the output of the Homomorphic and Harris filters. One can see that only the sharp regions explained in Section I are highlighted, but their contrast is poor. By applying the combined Tophat-Bothat filters (Fig. 2(d)) and enhancing the complement with the Homomorphic filter, shown in Fig. 2(e), the image is enhanced. By finding the maximum intensity of the image and thresholding (Fig. 2(f)), only the sharp regions remain. Because the eye layout is horizontal (the length of the eye is greater than its height), many components of the filtered image are removed by morphologic labeling and searching for components that violate this constraint. The output is shown in Fig. 2(g). Projection along the two axes yields the eye-pair location precisely, because in every image the largest projection peak occurs at the eyes. Finally, by applying morphologic labeling, the bounding box and centroid of each eye are easily obtained. The remainder of the images can be followed in Fig. 2(h-l).
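The projection and shape-constraint steps described above can be sketched as follows. The flood-fill labeling is a simple stand-in for morphologic labeling, and the width-versus-height test encodes the horizontal eye-shape constraint; none of this is the authors' exact implementation.

```python
import numpy as np
from collections import deque

def projection_profiles(binary):
    """Row and column sums of a binary image (projection onto each axis)."""
    b = np.asarray(binary, dtype=int)
    return b.sum(axis=1), b.sum(axis=0)

def keep_wide_components(binary):
    """Keep 4-connected components at least as wide as they are tall."""
    b = np.asarray(binary, dtype=bool)
    seen = np.zeros_like(b)
    out = np.zeros_like(b)
    for si in range(b.shape[0]):
        for sj in range(b.shape[1]):
            if b[si, sj] and not seen[si, sj]:
                comp, q = [], deque([(si, sj)])   # flood fill one component
                seen[si, sj] = True
                while q:
                    i, j = q.popleft()
                    comp.append((i, j))
                    for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
                        if (0 <= ni < b.shape[0] and 0 <= nj < b.shape[1]
                                and b[ni, nj] and not seen[ni, nj]):
                            seen[ni, nj] = True
                            q.append((ni, nj))
                rs = [p[0] for p in comp]
                cs = [p[1] for p in comp]
                if (max(cs) - min(cs)) >= (max(rs) - min(rs)):  # width >= height
                    for p in comp:
                        out[p] = True
    return out
```

After the vertical components are dropped, the peaks of the two projection profiles bracket the eye-pair bounding boxes.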

Our method was tested on the following databases:

- The ORL face database [22], which contains 400 facial images from 40 distinct subjects, with 10 different images per subject (92×112 pixels).
- Our color facial database (passport and ID-card images), which contains 220 images from 110 subjects (320×240 pixels).
- Additional images from the internet.
- Animation images.

The eye-pair candidates were selected successfully in all cases, whether the face patterns differ in scale, expression, or illumination conditions. The system was implemented in MATLAB on a 2.2 GHz Intel processor with 256 MB of RAM. Fig. 3 shows some results of eye localization. A perfect success rate on all images, combined with superior results on animation images, indicates the strength of the new method. The average localization time was only 0.47 seconds on our database and 0.17 seconds on ORL.

VI. CONCLUSION

In this paper, a new method based on regional synthesis was proposed for eye localization. The algorithm highlights sharp areas containing corners. Using combined Homomorphic, Harris, and Tophat-Bothat filtering with

morphologic operators, the algorithm emphasizes particularly the eye locations, with some minor outliers. By morphologic labeling and intelligent searching in the binary image, only the eye pair is obtained. The bounding box and centroid of each eye are then easily determined with morphologic operators that find the properties of each labeled region in the binary image. Experimental results on both grayscale and color facial images, as well as animated pictures, show the high accuracy and computational efficiency of the proposed method.

Fig. 1. Flowchart of the system for eye localization: Input Image → Homomorphic filter → Harris corner detection → Tophat-Bothat filter → Complement Homomorphic filter → Trimming & Binarization → Thresholding → Projection → Eye Localization.

Fig. 2. The results of the algorithm. Figures (a)-(l), from left to right and top to bottom: original image in RGB format, luminance component of the image, complement of applying Harris and Homomorphic filters, Tophat-Bothat filtered image, complement of Homomorphic filtered image, adjusted complement-contrast image, thresholded image with the maximum intensity of the adjusted image, output after labeling and removing, projection profile, eye-pair detection, morphologic labeling, eye detection.

REFERENCES

[1] K.M. Lam and H. Yan, "Locating and Extracting the Eye in Human Face Images," Pat. Rec. J., vol. 29, pp. 771-779, 1996.
[2] V. Vezhnevets and A. Degtiareva, "Robust and Accurate Eye Contour Extraction," in Proc. Graphicon-2003, Moscow, Russia, pp. 81-84, 2003.
[3] G.C. Feng and P.C. Yuen, "Variance Projection Function and its Application to Eye Detection for Human Face Recognition," Pat. Rec. Let., vol. 19, pp. 899-906, 1998.
[4] Z.H. Zhou and X. Geng, "Projection Functions for Eye Detection," Pat. Rec., vol. 37, pp. 1049-1056, 2004.
[5] R. Thilak Kumar, S. Kumar Raja and A.G. Ramakrishnan, "Eye Detection Using Color Cues and Projection Functions," in Proc. Int. Conf. Image Processing (ICIP), vol. 3, Rochester, New York, USA, pp. 337-340, 2002.
[6] M.J.T. Reinders, "Eye Tracking by Template Matching Using an Automatic Codebook Generation Scheme," in 3rd Conf. Adv. School for Computing and Imaging, Heijen, Netherlands, pp. 85-91, 1997.
[7] J. Huang and H. Wechsler, "Eye Detection Using Optimal Wavelet Packets and Radial Basis Functions," Int. J. Pat. Rec. and A.I., vol. 13, 1999.
[8] A.L. Yuille, P.W. Hallinan and D.S. Cohen, "Feature Extraction from Faces Using Deformable Templates," Int. J. Computer Vision, vol. 8, pp. 99-111, 1992.
[9] E. Saber and A.M. Tekalp, "Frontal-View Face Detection and Facial Feature Extraction Using Color, Shape and Symmetry Based Cost Functions," Pat. Rec. Let., vol. 19, pp. 669-680, 1998.
[10] X. Liu, F. Xu and K. Fujimura, "Real-Time Eye Detection and Tracking for Driver Observation under Various Light Conditions," in IEEE Intelligent Vehicle Symposium, Versailles, France, pp. 18-20, 2002.
[11] Z. Zhu, K. Fujimura and Q. Ji, "Real-Time Eye Detection and Tracking under Various Light Conditions and Face Orientations," in ACM Symp. Eye Tracking Research & Applications, New Orleans, USA, 2002.
[12] A. Haro, M. Flickner and I. Essa, "Detecting and Tracking Eyes by Using Their Physiological Properties, Dynamics, and Appearance," in Proc. IEEE CVPR, Hilton Head Island, South Carolina, 2000.
[13] C. Morimoto, D. Koons, A. Amir and M. Flickner, "Real-Time Detection of Eyes and Faces," in Proc. Workshop on Perceptual User Interfaces, San Francisco, CA, pp. 117-120, 1998.
[14] S.A. Sirohey and A. Rosenfeld, "Eye Detection in a Face Image Using Linear and Nonlinear Filters," Pat. Rec. J., vol. 34, pp. 1367-1391, 2001.
[15] K. Talmi and J. Liu, "Eye and Gaze Tracking for Visually Controlled Interactive Stereoscopic Displays," Signal Processing: Image Communication, vol. 14, Berlin, Germany, pp. 799-810, 1999.
[16] E. Hjelmås and J. Wroldsen, "Recognizing Faces from the Eyes Only," in Proc. 11th Scandinavian Conf. Image Analysis, 1999.
[17] A. Pentland, B. Moghaddam and T. Starner, "View-Based and Modular Eigenspaces for Face Recognition," in Proc. IEEE Int. Conf. Computer Vision and Pattern Recognition, Seattle, WA, pp. 84-91, 1994.
[18] J. Huang, D. Ii, X. Shao and H. Wechsler, "Pose Discrimination and Eye Detection Using Support Vector Machines (SVMs)," in Proc. NATO-ASI on Face Recognition: From Theory to Applications, Springer-Verlag, pp. 528-536.
[19] M.J.T. Reinders, R.W.C. Koch and J.J. Gerbrands, "Tracking Facial Features in Image Sequences Using Neural Networks," in Proc. 2nd Int. Conf. Automatic Face and Gesture Recognition, 1996.
[20] R.C. Gonzalez and R.E. Woods, Digital Image Processing, Second Edition. Addison-Wesley Longman Publishing Co., Inc., 2001.
[21] C. Schmid, R. Mohr and C. Bauckhage, "Evaluation of Interest Point Detectors," Int. J. Computer Vision, vol. 37, pp. 151-172, 2000.
[22] The ORL Database of Faces. http://www.uk.research.att.com:pub/data/
[23] Ji Qiang, H. Wechsler, A. Duchowski and M. Flickner, "Special issue: eye detection and tracking," Computer Vision and Image Understanding, pp. 1-3, 2005.
[24] N. Zhiheng, S. Shiguang, Y. Shengye, C. Xilin and G. Wen, "2D Cascaded AdaBoost for Eye Localization," in Proc. 18th International Conference on Pattern Recognition, 2006.
[25] P. Campadelli, R. Lanzarotti and G. Lipori, "Eye Localization: A Survey," in The Fundamentals of Verbal and Non-verbal Communication and the Biometrical Issue, NATO Science Series, September 2006.


Fig. 3. Samples of eye localization. Color images are from our database, grayscale images are from the ORL database, and three animated images are from the internet. The eyes in all images are localized very well.
