
AdaBoost Face Detection

Hamed Masnadi-Shirazi

Department of Electrical and Computer Engineering, University of California, San Diego

La Jolla, California [email protected]

Abstract  Viola and Jones [1] introduced a new and effective face detection algorithm based on simple features trained with the AdaBoost algorithm, integral images and cascaded feature sets. This paper attempts to replicate their results. The FERET face data set is used as the training set. The AdaBoost algorithm, the simple feature set and integral images are briefly explained and implemented in our Matlab based program. The ten best features were identified out of a set of close to fifty thousand. These best features were used to produce probability of error plots. Finally, our face detection algorithm is run on a series of random images taken from the Internet. More than ten best features are needed to obtain a face detector comparable to the two-hundred-feature detector of Viola and Jones [1], but our face detector still performs well, and anyone can use the program included in the Appendix to implement this effective face detection algorithm and train as many best features as suit their application.

1. Introduction  Face detection and recognition have become increasingly researched areas. The Viola and Jones [1] method for face detection is an especially successful method: it has a very low false positive rate, can detect faces in real time, and is flexible in the sense that it can be trained for the levels of computational complexity, speed and detection rate suitable for specific applications. What makes this algorithm even more attractive is that, with slight changes, it can be used to detect many other objects as well. Using a set of two hundred best features, Viola and Jones [1] were able to achieve a 95% detection rate and a 1 in 14084 false positive rate. They were also able to detect faces within a 380x280 image in less than 0.7 seconds. Such performance makes this method one of the best face detection algorithms. As mentioned above, the same algorithm can be applied to detecting other objects, and Viola, Jones and Snow [4] have successfully used this method for detecting pedestrians. Considering the above, it is highly desirable for anyone who might want to do research in this area to be able to implement this versatile method.

1.1 Overview  In this paper, Section 2 discusses the simple features and integral images. Section 3 discusses the AdaBoost training. Section 4 describes our program and the methods we used to implement the algorithm. Section 5 discusses our results. Appendix A contains a set of example runs of our program on images, and Appendix B contains the complete listing of our Matlab program.

2. Features and Integral Images  The first thing to keep in mind is that the Viola and Jones [1] method is a feature-based detection scheme, so a pool of features must be created and a scheme used to find the "good" features. "Good" features are those that best discriminate between faces and non-faces. Many different forms of feature pools can be created. A desirable feature pool is one that is exhaustive, has feature forms that can describe the object, and has features that can be applied to our images and computed efficiently. The rectangle features used here satisfy all of these conditions. Examples of such features can be seen in Figure 1.

Figure 1: Example rectangular features used in our program. (From left to right) feature numbers 1, 100, 1000, 15000, 25000, 49000. These features are 24x24 pixels. The image pixels that fall under the black feature pixels are subtracted from the image pixels that fall under the white feature pixels (or vice versa). The resulting number is the output of the feature applied to the image.


2.1 Integral Images  It might seem computationally expensive to find the output of each feature as described above, but using a clever construction called the integral image, the output can be found easily and efficiently. In this method a new image is created for each test image such that every new pixel is the sum of the pixels above and to the left of it:

ii(x', y') = \sum_{x < x',\, y < y'} i(x, y)

The same is done for each feature in our feature bank, to the point that each feature is represented only by a series of numbers that define how specific parts of the integral image should be added or subtracted to produce the feature output. This is further explained in Section 4, where we describe our specific program. Using this method it is no longer necessary to add or subtract individual pixels: the image is scanned once and the integral image is found, and after this stage the output of a feature applied to the image is found by adding and subtracting only a small number of integral image pixels.
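As a minimal illustration (not part of our program, which handles this through the padded feature representations described in Section 4), the integral image and a single rectangle sum can be computed in Matlab as follows; the file name and rectangle corners are hypothetical:

I = double(imread('test.tif'));        % hypothetical grayscale test image
ii = cumsum(cumsum(I, 1), 2);          % integral image: ii(y,x) = sum of I over rows 1..y and columns 1..x
r1 = 5; c1 = 5; r2 = 12; c2 = 16;      % hypothetical rectangle covering rows r1:r2 and columns c1:c2
% Sum of the pixels inside the rectangle from only four integral image lookups:
rectSum = ii(r2,c2) - ii(r1-1,c2) - ii(r2,c1-1) + ii(r1-1,c1-1);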

3. AdaBoost Training and Feature Selection  Now that we have a set of features and an efficient way to compute the output of each feature, we describe how the best features are found. To appreciate the importance of this stage, note that using 24x24 pixel features we produced an exhaustive set of 49,554 features. A method is needed that can reduce this feature set and give us the best features, features that discriminate faces from non-faces and also complement each other.

3.1 Weak Classifiers  Before we move on to the training algorithm, the concept of a weak classifier should be explained. A training set of labeled faces and non-faces is prepared by scaling each image to 24x24 pixels and normalizing it. For each feature, a weak classifier is defined by applying the feature to the training set. A weighted histogram is produced and an optimal threshold that best separates the faces from the non-faces is found. A parity is also found that describes whether the face outputs lie above or below the threshold. So each weak classifier hj(x) has a feature fj, a threshold θj and a parity pj assigned to it. (Finding the threshold of each weak classifier is an optimization step, and any optimization algorithm that minimizes the classification error can be used.) In our implementation the output of the weak classifier is 1 if the image x is classified as a face and -1 if it is classified as a non-face. Mathematically this is equivalent to:

h_j(x) = \begin{cases} 1 & \text{if } p_j f_j(x) < p_j \theta_j \\ -1 & \text{otherwise} \end{cases}

Figure 2: The first three features found using the naïve approach. It is readily seen that they are slightly scaled versions of the same basic feature.

Care should be taken to realize that a weak classifier is not a true classifier in our final algorithm. It is nothing more than a weighted histogram of the outputs of one feature applied to the entire training set, along with an optimal threshold that separates the face outputs from the non-face outputs and hence provides a classification error associated with that particular feature.
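In Matlab, evaluating one such weak classifier reduces to the comparison below; this mirrors the test used in our training script in Appendix B, with illustrative variable names (x is the padded integral image of one training window stored as a row vector; f, theta and p hold the feature vectors, thresholds and parities; j is the feature index):

fout = x * f(:, j);                    % feature output: integral image row vector times feature column vector
if p(j) * fout < p(j) * theta(j)       % the parity flips the direction of the comparison
    h = 1;                             % classified as a face
else
    h = -1;                            % classified as a non-face
end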

3.2 Training and the AdaBoost Algorithm  An intuitive yet naïve procedure for training and finding the set of best features would be to construct the weak classifier for each feature and rank the features in order of smallest classification error. Unfortunately this simple method does not work, because it produces the first best feature followed by a set of features that look very similar to the first but are slightly scaled or shifted. These features are all basically the same and fail on the same test images. Figure 2 shows the first few best features found using this naïve approach. The AdaBoost algorithm fixes this problem by reweighting the training images after each round when computing the classification error of each weak classifier: images misclassified by the chosen feature receive larger weights, which ensures that our first best feature, and any other feature similar to it, will not be chosen as our second best feature. This means that the second best feature is no longer similar to the first, and an entirely different feature is selected. This second best feature ideally complements the first in the sense that it succeeds on faces that the first best feature failed on. The process is repeated to find as many best features as desired; it is generally very time consuming. Once a number of best features have been found, they can be used to detect faces in a test image. Each feature votes on whether it thinks the test image is a face or not. Each feature's vote is weighted in log-inverse proportion to the error of that feature, so a feature with a smaller error gets a more heavily weighted vote. See Table 1 for the complete algorithm.


Table 1: The AdaBoost algorithm. In each round t a new best feature is found.
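The core of one training round, as implemented in our training script in Appendix B, can be summarized by the following Matlab sketch (variable names follow the script; classifyWeak is a hypothetical helper that applies the decision rule of Section 3.1):

% weightedErrors(j) holds the weighted classification error of weak classifier j under the current weights w
[minError, fNbest] = min(weightedErrors);                 % pick the best feature for this round
alpha_t = 0.5 * log((1 - minError) / (minError + 1e-6));  % vote weight: a smaller error gives a larger vote
for n = 1:xN
    h = classifyWeak(x(n,:), f(:,fNbest), thetaArray(fNbest), pArray(fNbest));  % +1 face / -1 non-face
    w(n) = w(n) * exp(-y(n) * alpha_t * h);               % misclassified images (y*h = -1) get larger weights
end
w = w / sum(w);                                           % renormalize the weights
% The final strong classifier is the sign of the alpha-weighted sum of the chosen weak classifiers.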

4. Our Implemented Algorithm In this section we explain the methods and procedures used and give a precise description of our approach to implementing the different aspects of training our face detector.

4.1 The Features  An exhaustive set of 49,554 features was produced using 24x24 pixel features with a minimum rectangle size of 8x8 pixels. Figure 1 shows a few of these features. This is not, however, how each feature is stored in our program. Storing 49,554 individual 24x24 pixel images would take a significant amount of memory, and, as previously explained, it would also be computationally expensive and inefficient to use the raw features in our algorithm. To deal with these problems we use integral images. The integral image of each feature is found and a simple representation for that feature is produced. This representation uses a few numbers to tell us how to apply each feature to a training image: each number tells us how the integral image pixel at that location should be added or subtracted to produce the feature output. Figure 3 shows two features and their representation as stored in our Matlab program.

Figure 3: Each feature is reshaped as a sparse column vector with elements at the corners of the black and white rectangles. The locations of the corners (left columns) and the numbers multiplied by the integral image at those locations (right columns) are stored. The left image is for feature #1 and the right image is for feature #15000.
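To make this corner representation concrete, the following Matlab sketch (illustrative only, not taken from Appendix B) builds the corner-coefficient vector of a hypothetical two-rectangle feature on the 25x25 zero-padded window and applies it with a single dot product:

n = 25;                                   % 24x24 window zero-padded by one extra top row and left column
idx = @(r,c) (c-1)*n + r;                 % column-major linear index into the padded window
fv = zeros(n*n, 1);                       % corner-coefficient representation of the feature
% "white" rectangle covering padded rows 2:13, columns 2:13 (added):
cA = [idx(13,13); idx(1,13); idx(13,1); idx(1,1)];
fv(cA) = fv(cA) + [1; -1; -1; 1];
% "black" rectangle covering padded rows 2:13, columns 14:25 (subtracted):
cB = [idx(13,25); idx(1,25); idx(13,13); idx(1,13)];
fv(cB) = fv(cB) - [1; -1; -1; 1];
fout = x * fv;                            % x is the 1x625 padded integral image of a training window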

4.2 The Training Data Set  The FERET face set was used as our face training set; 739 frontal faces were used. The faces were originally 256x384 pixels. The locations of the eyes, nose and mouth of each face were known and were used to cut out the portion of the image that contained only the face. Each face was then resized to 24x24 pixels and normalized for use by our algorithm. Figure 4 shows a few of these training faces. 739 non-face training images were used as well: 400 of these non-face images were randomly generated using a uniform distribution and 300 were produced from random images. Each image was resized and normalized, and the integral image of each training image was also computed and stored.
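The face preparation step follows Getimagesfromfile.m and normImageF.m in Appendix B; a condensed sketch is shown below, with a hypothetical file name and eye coordinates (the crop box proportions are the ones used in the script):

I = imread('face0001.tif');                        % hypothetical FERET frontal image
eyeDist = 60; xEyeL = 100; yEye = 150;             % hypothetical inter-eye distance and eye coordinates
xmin = xEyeL - round(eyeDist*3.5/5.5);             % crop box derived from the eye positions,
ymin = yEye - eyeDist;                             % using the same proportions as Getimagesfromfile.m
wid = 12*round(eyeDist/5.5); hgt = 14*round(eyeDist/5.5);
face = imresize(imcrop(I, [xmin ymin wid hgt]), [24 24]);   % cut out the face and scale to 24x24
face = double(face);
face = (face - mean(face(:))) / std(face(:));      % zero mean, unit variance, as in normImageF.m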

4.3 The Training Algorithm  The same general procedure for training with the AdaBoost algorithm explained in Table 1 was used. To make our algorithm more efficient we represent each integral image as a row vector; the output of a feature applied to any image is then simply the dot product of the integral image vector and the feature vector. The threshold for each weak classifier was found using the optimization function fminsearch in Matlab. We have tried to keep our program as clear as possible; Appendix B has a complete listing. Finding each best feature takes about 5 hours of computing time on a 2.4 GHz Pentium 4 processor. This time could be reduced by using other programming languages.
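The threshold search can be sketched as follows; this is a simplified stand-in for findClassifier.m in Appendix B, where weightedErrorFun is a hypothetical inline version of the weighted misclassification error that the script minimizes:

% faceOut / noFaceOut: feature outputs on the face and non-face training images (row vectors)
% wFace / wNoFace: the current AdaBoost weights of those images; parity: +1 or -1
weightedErrorFun = @(th) sum(wFace   .* (parity*faceOut   >= parity*th)) + ...
                         sum(wNoFace .* (parity*noFaceOut <  parity*th));
th0 = mean([faceOut noFaceOut]);                    % initial guess for the threshold
[theta, weightedError] = fminsearch(weightedErrorFun, th0);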


Figure 4: A few face and non-face images used in our training set.

5. Results  We allowed our program to run for approximately 50 hours and find the first 10 best features. These features can be seen in Figure 5. It is readily seen that these 10 best features correspond to the best features found by Viola and Jones [1]. They also make intuitive sense, as they represent the lighter and darker areas of the eyes, nose, forehead and cheeks of a typical face. These 10 best features were then used to produce probability of error plots, shown in Figure 6. The number of face images misclassified decreases as the number of features used increases, and the same is true for non-face images. The images used to produce these plots were the training set images themselves, which is why the probability of error is so low. Although using the training set as a test set is not the best course of action, we can still verify the correctness of our 10 best features by seeing that they are very successful at doing what they were trained to do, namely classifying faces over the training set. The fact that the probability of error is very low over the training set and decreases as the number of features increases supports the validity of our 10 best features. We also gathered a small set of random images from the Internet to validate our features on images outside the training set. To do this, our program scans the image at multiple scales, crops out sections of the image, resizes each section to 24x24 pixels, normalizes it and attempts to classify it. If the section is classified as a face, a rectangular box is drawn around the original cropped-out section of the image. It should be mentioned that a single face within an image will usually have multiple rectangular boxes around it, since it is classified as a face at multiple scales and by different scans; this phenomenon is mentioned in the Viola and Jones [1] paper as well.

Figure 5: The 10 best features.
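The scanning loop has roughly the following structure; it is a condensed sketch of FaceDetector.m in Appendix B, with a hypothetical image name, assuming the trained arrays (f, fNbestArray, thetaBestArray, pBestArray, alpha_t_Array, T) have been loaded from the saved training results, and with minSize the lower bound on window size discussed below:

Itest = normImageF(double(rgb2gray(imread('group.jpg'))));   % hypothetical test image
figure, imshow(Itest, []);
minSize = 48;                                   % lower bound on the scanning window size (see below)
for winSize = minSize:6:min(size(Itest))        % scan the image at multiple scales
    for x0 = 1:10:size(Itest,2)-winSize
        for y0 = 1:10:size(Itest,1)-winSize
            chunk = imcrop(Itest, [x0 y0 winSize winSize]);
            chunk = cumImageJN(padJN(reshape(imresize(chunk,[24 24]), 1, 24*24)));
            score = 0;                          % weighted vote of the T best features
            for t = 1:T
                fout = chunk * f(:, fNbestArray(t));
                if fout*pBestArray(t) < thetaBestArray(t)*pBestArray(t)
                    score = score + alpha_t_Array(t);   % this feature votes "face"
                else
                    score = score - alpha_t_Array(t);   % this feature votes "non-face"
                end
            end
            if score > 0                        % classified as a face: mark it on the displayed image
                rectangle('Position', [x0 y0 winSize winSize], 'EdgeColor', 'red');
            end
        end
    end
end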


Figure 6: The probability of error decreases as the number of features used increases. The horizontal axis is the number of features used; the vertical axis is P(error) for face and non-face images.

A few of these faces can be seen in Figure 7, and Appendix A shows more of these results. In order to limit the number of false positives within these images, a lower bound on the size of the cropped-out sections was imposed on our detection program. This was done to compensate for the small number of features used (10, compared with the 200 used by Viola and Jones [1]); had this not been done, the program would draw multiple small rectangles around portions of the image that do not contain faces. Assuming that the faces within the image are relatively larger than 24x24 pixels and that their size is approximately known, our 10 best features can be used successfully.

6. Conclusion  We were able to successfully replicate the results found by Viola and Jones [1] and even arrive at the same feature set, even though our training set was different. This further demonstrates the robustness of the algorithm. Although training takes a long time, the detection algorithm is fast and can be used to scan large images quickly. Using 10 best features is not sufficient for detecting faces with a low false positive rate in the general case, but a small set of 10 features can be used successfully if the faces within the image are relatively large and an approximate bound can be placed on their size.

Figure 7: Successfully detected faces.

7. Acknowledgements We would like to thank Ian Fasel for providing the training data set and the rectangular feature bank function.

References
[1] Viola and Jones, 2001. Robust Real-Time Object Detection. Proceedings of the 2nd International Workshop on Statistical and Computational Theories of Vision.
[2] Freund and Schapire, 1999. A Short Introduction to Boosting. Journal of the Japanese Society for Artificial Intelligence.


[3] Fasel, Schapire and Movellan, 2004. A Generative Framework for Boosting with Applications to Real-Time Eye Coding. Computer Vision and Image Understanding.
[4] Viola, Jones and Snow, 2001. Detecting Pedestrians Using a Boosted Cascade of Simple Features. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

Appendices

A. Examples


B. The Program Listing

Getimagesfromfile.m %%%%%%%%%%%% MAKE FACE DATA %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% [names,a,b,c,d,e,f,g,h] = textread('NameOfImagesONLY.txt','%s %f %f %f %f %f %f %f %f',-1); % read all data from txt file [row col]=size(a); FaceData=[]; cropsize=24; % CHANGE SIZE HERE for i=1:row % CHANGE NUMBER OF TRAIN IMAGES HERE if exist( strcat(char(names(i)),'.tif') ) == 2 count=count+1; I=imread(char(names(i)),'tif'); % CONVERT THE IMAGES TO GRAY SCALE TO FIT THE PROGRAM dist=c(i)-a(i); xmin=a(i)-round(dist*3.5/5.5); ymin=b(i)-dist; width=12*round(dist/5.5); height=14*round(dist/5.5); Im=imcrop(I,[xmin ymin width height] ); Imr=imresize(Im,[cropsize,cropsize]); %figure,imshow(Imr) Imr=reshape(Imr,1,cropsize*cropsize); FaceData=[FaceData ; Imr]; %each row is a croped face end end FaceData=double(FaceData); %Max=255 Min=0 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%% Make NOISE DATA %%%%%%%%%%%%%%%% count=0; tcount=0; NoiseData=[]; % cropsize=24; % CHANGE SIZE HERE picnoiseNum=300; % change this to add more data from the pics for i=1:row % CHANGE NUMBER OF TRAIN IMAGES HERE if exist( strcat(char(names(i)),'.tif') ) == 2 & count<picnoiseNum count=count+1; I=imread(char(names(i)),'tif'); % CONVERT THE IMAGES TO GRAY SCALE TO FIT THE PROGRAM [rowN colN]=size(I); xmin=round(colN/2); ymin=rowN-cropsize-1; width=cropsize; height=cropsize; Im=imcrop(I,[xmin ymin width height] );

Imr=imresize(Im,[cropsize,cropsize]); % THis Is NOT needed anymore since the crop size is already 24*24 %figure,imshow(Imr) Imr=reshape(Imr,1,cropsize*cropsize); NoiseData=[NoiseData ; Imr]; %each row is a croped face else noise= randImageF; % MUST ADD THIS FUNCCTION NoiseData=[NoiseData ; noise]; end tcount=tcount+1; if tcount==size(FaceData,1) break; end end NoiseData=double(NoiseData); %Max=255 Min=0 %%%%%%%%%%%%%%%% NORMALIZE THE IMAGES %%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% for i=1:size(FaceData,1) faceraw=reshape(FaceData(i,:),cropsize,cropsize); noiseraw=reshape(NoiseData(i,:),cropsize,cropsize); faceraw=normImageF(faceraw); noiseraw=normImageF(noiseraw); faceraw=reshape(faceraw,1,cropsize*cropsize); noiseraw=reshape(noiseraw,1,cropsize*cropsize); FaceData(i,:)=faceraw; NoiseData(i,:)=noiseraw; end %%%%%%%%%%%%%%%%%%%%%% SAVE THE FACES AND NOISE PICS %%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%% loc= 'C:\Documents and Settings\BAGHIYE\Desktop\Special topics Vision FALL 04\PROJECT\' ; save([loc 'FaceNoiseData.mat'], 'FaceData','NoiseData'); % Normalized and ready %imshow(reshape(NoiseData(1,:),24,24))

normImageF.m
function I = normImageF(I)
[r c] = size(I);
avgVal = sum(sum(I)) / (r*c);
subtrMat = repmat(avgVal, r, c);
I = I - subtrMat;


sdI = std(reshape(I, r*c, 1));
I = I ./ repmat(sdI, r, c);
% if you want, use these to check mean==0 & stdev==1
%meanI= sum(sum(I)) / (r*c)
%stNormI= std(reshape(I,r*c,1))

randImageF.m
function I = randImageF
picsize = 24;   % MUST CHANGE THIS FOR smaller sized face images
im1 = round(255*rand(picsize,picsize));   % THIS 255 WAS ADDED TO BE COMPATIBLE WITH THE Tiff FACE IMAGES
% *** we are taking out the gradient for now:
%grad= linspace(.3, -.3, 10 )' ;
%grad= linspace(0, 0, 10 )' ;
%im2= repmat(grad,1,10) ;
preI = im1; %+ im2 ;
%preI= normImageF(preI); % THIS LINE WAS TAKEN OUT TO NORMALIZE LATER AND AVOID NEGATIVE NUMBERS
I = reshape(preI,1,picsize*picsize);
%imagesc(preI);
%colormap gray;

padJN.m
function I = padJN(preI)
% takes a face image in as a column vector
% and returns it as a vector where it's been padded
% with an extra left column and top row as zeros
length = max(size(preI));
side = length.^.5;
preI = reshape(preI,side,side);
bigI = zeros(side+1, side+1);
bigI(2:end, 2:end) = preI;
I = reshape(bigI, 1, (side+1)^2);

cumImageJN.m
function I = cumImageJN(preI)
% takes a 1x100 or 1x121 face
% interprets it as a 10x10 or 11x11 face
% gets a cumsum image
% returns it with original 1x100 or 1x121 dimensions
numPix = max(size(preI));
side = numPix ^ .5;
preI = reshape(preI,side,side);
preI = cumsum(cumsum(preI,1),2);
I = reshape(preI,1,numPix);

findClassifier.m
function [threshold, parity, weightedError] = findClassifier(posHist, negHist, posHistWeights, negHistWeights)
% guess whether polarity is 1 (cutoff above faces)
% or -1 (cutoff below faces)
% probably these should be weighted appropriately
weightedPosHist = posHist*posHistWeights';
weightedNegHist = negHist*negHistWeights';
guessCutoff = weightedPosHist + weightedNegHist;
weightedPosMean = weightedPosHist / sum(posHistWeights);
weightedNegMean = weightedNegHist / sum(negHistWeights);
if weightedPosMean <= weightedNegMean
    parity = 1;
elseif weightedPosMean > weightedNegMean
    parity = -1;
else
    disp('freak out problem in findClassifier.m');
end
[threshold, weightedError] = fminsearch(@testClassifier, guessCutoff, [], parity, posHist, negHist, posHistWeights, negHistWeights);

makeImagesF.m function makeImagesF load('C:\Documents and Settings\BAGHIYE\Desktop\Special topics Vision FALL 04\PROJECT\FaceNoiseData.mat'); % this is how many faces you want. % there will be an equal number of nonfaces minfeaturesize=8; % CHANGE THIS FOR LESS FEATURES TO TRAIN ON patchi=24; numFaces= 739; patchsize=24*24; patchsizepad=25*25; %faces= zeros(numFaces,patchsize); facesPad= zeros(numFaces,patchsizepad); cumFaces= zeros(numFaces,patchsizepad); %noisePics= zeros(numFaces,patchsize); noisePicsPad= zeros(numFaces,patchsizepad); cumNoise= zeros(numFaces,patchsizepad);


for(ind= 1: numFaces) face= FaceData(ind,:); padFace= padJN(face); noise= NoiseData(ind,:); noisePad= padJN(noise); faces( ind,: )= face; facesPad(ind,:)= padFace; cumFaces(ind,:)= cumImageJN(padFace); noisePics(ind,:)= noise; noisePicsPad(ind,:)= noisePad; cumNoise(ind,:)= cumImageJN(noisePad); %viewFace(faceImage); end % xN is number of faces and of non-faces, total. % this is evenly split between faces and non-faces xN= numFaces*2 ; % get the features from Ian's code, for use with integral images [f, Vidx, F, Fidx, Pidx]= violaboxeszero(patchi,patchi,minfeaturesize); %first two numbers are feature size window the third number is the feature min size #MUST CHANGE FOR DIFFERENT SIZE IMAGES########################### loc= 'C:\Documents and Settings\BAGHIYE\Desktop\Special topics Vision FALL 04\PROJECT\' ; save([loc 'images.mat'], 'faces', 'cumFaces', 'noisePics', 'cumNoise', 'xN', 'f');

HamedProjectTrainRevisedNoNaN_FACES.m clear all tic % gives these variables: % 'faces', 'cumFaces', 'noisePics', 'cumNoise', 'xN' % for example, faces might be 1000 x 100, where each column is a face % xN is the number of images. % this is evenly split between faces and nonfaces. patchi=24; % CHANGE RTHIS FOR SMALLER IMAGE WINDOWS load('C:\Documents and Settings\BAGHIYE\Desktop\Special topics Vision FALL 04\PROJECT\images.mat'); x=[cumFaces ; cumNoise];%a matrix with each row being one of the test images fNbest=0; wface=1/xN; wNoface=1/xN; w=[wface*ones(1,0.5*xN) wNoface*ones(1,0.5*xN)]; %INITIALIZE%an array in order of x with wface as value if x is a face and wNoface as value if x is not a face %xN=%number of images faces=[]; nofaces=[];

thetaArray=[]; pArray=[]; %f=%a matrix with each column being a feature image %[f, Vidx, F, Fidx, Pidx]= violaboxeszero(10,10); % TOOK OUT THIS LINE BECAUSE f is ALREADY MADE IN IMAGES.DAT %f=[ f(:,4005) f(:,5250) ]; %############################################ %fN=%number of features fN=size(f,2); y=[ones(1,0.5*xN) -1*ones(1,0.5*xN)];% an array of length number of images with in order of x and with 1 or -1 elements fNbestArray=[]; alpha_t_Array=[]; minErrorArray=[]; thetaBestArray=[]; pBestArray=[]; T=2% number of itteration we want #################################### for t=1:T for fNcount=1:fN faces=[]; nofaces=[]; %if fNcount ~= fNbest for xNcount=1:xN fout=x(xNcount,:)*f(:,fNcount); %fout=IntegralImageFunction( x(:,xNcount),f(:,fNcount) ); %finds the output as a number fout of applyin the feature to image if y(xNcount)==1 faces=[faces fout]; else nofaces=[nofaces fout]; end end [theta , p, weightedError ]= findClassifier( faces , nofaces, w(1:0.5*xN), w(0.5*xN+1:xN) );%[theta , p ]= HistogramThreshholdFunction( faces , nofaces ); % a function to find the threshhold and parity that minimizes overlap of histogram thetaArray=[thetaArray theta]; pArray=[pArray p]; %else %thetaArray=[thetaArray NaN]; %pArray=[pArray NaN]; %end if mod(fNcount,100)==0 disp('1');disp(fNcount);disp(t) % DISP end end %Function to find if x is classified correctly for fNcount=1:fN Error=0;


%Error=(h==y)*w'; %if thetaArray(fNcount)~=NaN for xNcount=1:xN fout=x(xNcount,:)*f(:,fNcount);%fout=IntegralImageFunction( x(:,xNcount),f(:,fNcount) ); %finds the output as a number fout of applyin the feature to image if fout*pArray(fNcount) < thetaArray(fNcount)*pArray(fNcount) h(xNcount)=1; else h(xNcount)=-1; end Error=Error - ( -1 + 0.5*( abs( h(xNcount) + y(xNcount) ) ) )*w(xNcount); end %else %Error= NaN; %end ErrorArray(fNcount)=Error; if mod(fNcount,100)==0 disp('2');disp(fNcount); disp(t) % DISP end end [minError minErrorIndex]=min(ErrorArray) % if fNbest==minErrorIndex % ErrorArray(minErrorIndex)=inf; % [minError minErrorIndex]=min(ErrorArray); % end fNbest=minErrorIndex; fNbestArray=[fNbestArray fNbest]; minErrorArray=[minErrorArray minError]; thetaBestArray=[thetaBestArray thetaArray(fNbest) ]; pBestArray=[ pBestArray pArray(fNbest) ]; %update weights function for xNcount=1:xN alpha_t= (0.5)*log( (1-minError)/(minError+0.000001) ); alpha_t_Array=[alpha_t_Array alpha_t]; fout=x(xNcount,:)*f(:,fNbest);%fout=x(xNcount,:)*f(:,fNcount); %fout=IntegralImageFunction( x(:,xNcount),f(:,fNbest) ); %finds the output as a number fout of applyin the BEST feature to image if fout*pArray(fNbest) < thetaArray(fNbest)*pArray(fNbest) h(xNcount)=1; else h(xNcount)=-1; end w(xNcount)=w(xNcount)*exp( -1 * y(xNcount) * alpha_t * h(xNcount) ); % disp('3');disp(xNcount); % DISP end wsum=sum(w); w=w/wsum; % box=DisplayFeature(patchi,patchi,f(:,fNbest));

% imshow(box,[]) % title(['feature ' num2str(fNbest)]); % drawnow; end % END of first for loop ttoc=toc loc= 'C:\Documents and Settings\BAGHIYE\Desktop\Special topics Vision FALL 04\PROJECT\' ; save([loc 'TrainResults.mat'], 'fNbestArray','thetaBestArray','pBestArray','alpha_t_Array','T');

TestGetAccuracyPlots.m clear all PropOfNoFacesClassedCorrectly=[]; PropOfFacesClassedCorrectly=[]; load('C:\Documents and Settings\BAGHIYE\Desktop\Special topics Vision FALL 04\PROJECT\images.mat'); load('C:\Documents and Settings\BAGHIYE\Desktop\Special topics Vision FALL 04\PROJECT\TrainResults5.mat'); % hard coding T because it didn't like indices %T= 5; %[f, Vidx, F, Fidx, Pidx]= violaboxeszero(10,10); for i=1:T x=[cumFaces];%x=[cumFaces ; cumNoise];%a matrix with each column being one of the test images %f=%a matrix with each column being a feature image %x= % LOad in the image (MUST BE IN COLUMN ARRAY FORMAT!!!) we want to test to see if it is a face or not; %T=% number of best features found in our training program %fNbestArray=% Load in the array with the index of the best features found in training isfacenum=0; isnotfacenum=0; for xNcount=1:(xN/2) sum=0; for t=1:i %fout=IntegralImageFunction( x , f( : , fNbestArray(t) ) ); %finds the output as a number fout of applyin the feature to image fout=x(xNcount,:)*f( : , fNbestArray(t) ); if fout*pBestArray(t) < thetaBestArray(t)*pBestArray(t) h(t)=1;


else h(t)=-1; end sum=sum + alpha_t_Array(t) * h(t); end if sum>0 isfacenum=isfacenum+1;%fprintf('IS FACE!\n') else isnotfacenum=isnotfacenum+1;%fprintf('NO FACE!\n') end end PropOfFacesClassedCorrectly=[ PropOfFacesClassedCorrectly isfacenum/(xN/2)]; %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% x=[cumNoise];%x=[cumFaces ; cumNoise];%a matrix with each column being one of the test images %f=%a matrix with each column being a feature image %x= % LOad in the image (MUST BE IN COLUMN ARRAY FORMAT!!!) we want to test to see if it is a face or not; %T=% number of best features found in our training program %fNbestArray=% Load in the array with the index of the best features found in training isfacenum=0; isnotfacenum=0; for xNcount=1:(xN/2) sum=0; for t=1:i %fout=IntegralImageFunction( x , f( : , fNbestArray(t) ) ); %finds the output as a number fout of applyin the feature to image fout=x(xNcount,:)*f( : , fNbestArray(t) ); if fout*pBestArray(t) < thetaBestArray(t)*pBestArray(t) h(t)=1; else h(t)=-1; end sum=sum + alpha_t_Array(t) * h(t); end if sum>0 isfacenum=isfacenum+1;%fprintf('IS FACE!\n') else isnotfacenum=isnotfacenum+1;%fprintf('NO FACE!\n') end

end PropOfNoFacesClassedCorrectly=[PropOfNoFacesClassedCorrectly 1-isfacenum/(xN/2)]; end truePos= PropOfFacesClassedCorrectly ; trueNeg= PropOfNoFacesClassedCorrectly ; falsePos= 1 - trueNeg ; PropOfImagesClassedCorrectly= (PropOfFacesClassedCorrectly+PropOfNoFacesClassedCorrectly)./2 figure,plot([1:T],1 + -PropOfImagesClassedCorrectly); title('all images'); xlabel('number of features used'); ylabel('P(error)'); figure,plot([1:T],1 + -PropOfFacesClassedCorrectly); title('faces'); xlabel('number of features used'); ylabel('P(error)'); figure,plot([1:T],1 + -PropOfNoFacesClassedCorrectly); title('non faces'); xlabel('number of features used'); ylabel('P(error)'); figure,plot(falsePos, truePos, '*-'); title('ROC curve: P(true positive) as f( P(false positive))');

FaceDetector.m clear all load('C:\Documents and Settings\BAGHIYE\Desktop\Special topics Vision FALL 04\PROJECT\images.mat'); load('C:\Documents and Settings\BAGHIYE\Desktop\Special topics Vision FALL 04\PROJECT\TrainResults5.mat'); Itest=imread('C:\Documents and Settings\BAGHIYE\Desktop\Special topics Vision FALL 04\test faces funny\faces6.jpg'); %Itest=imread('C:\Documents and Settings\BAGHIYE\Desktop\Special topics Vision FALL 04\test faces funny\SmilyFaces.jpg'); %Itest=imread('C:\Documents and Settings\BAGHIYE\Desktop\Special topics Vision FALL 04\test faces funny\twotestfaces.jpg'); Itest=imcrop(Itest); % ############TAKE OUT IN GENERAL CASE figure,imshow(Itest); Itest=rgb2gray(Itest); figure,imshow(Itest); Itest=double(Itest); Itest= normImageF(Itest); figure,imshow(Itest,[]) %%%%%%%%%%%%%%%%%%%%%%%% % TAKE repeated chunk out of image FUNCTION GOES HERE [row col]=size(Itest); mindim=min(row,col);


for size = 24:6:mindim %# PLAY WITH THIS ######################## for xplace=1:10:col-size for yplace=1:10:row-size %%%%%%%%%%%%%%%%%%%%%%% [Ichunk, rect]=imcrop(Itest,[xplace yplace size size]); %[xmin ymin width height] Ichunkoriginal=Ichunk; cropsize=24; patchi=24; patchsize=24*24; patchsizepad=25*25; Ichunk=imresize(Ichunk,[cropsize,cropsize]); Ichunk=reshape(Ichunk,1,cropsize*cropsize); Ichunk= padJN(Ichunk); Ichunk=cumImageJN(Ichunk); x=Ichunk; sum=0; i=T; for t=1:i %fout=IntegralImageFunction( x , f( : , fNbestArray(t) ) ); %finds the output as a number fout of applyin the feature to image fout=x*f( : , fNbestArray(t) ); if fout*pBestArray(t) < thetaBestArray(t)*pBestArray(t) h(t)=1; else h(t)=-1; end sum=sum + alpha_t_Array(t) * h(t); end if sum>0 %isfacenum=isfacenum+1; %fprintf('IS FACE!\n'); rectangle('Position',rect,'edgecolor','red') else %isnotfacenum=isnotfacenum+1; %fprintf('NO FACE!\n') end end end end % SIZE if

checkFeature.m function good = checkFeature(F,nI,nJ) % verify goodness of a feature

good = 1; nI = nI + 1; nJ = nJ + 1; [rows,cols] = size(F); if ~((rows == nI) & (cols == nJ)) if (cols == 1) F = reshape(F,[nI,nJ]); end end [i j c] = find(F); % If all zeros, thats bad if length(c) == 0; good = 0; return end img = ones(nI,nJ); % A patch sized image img(2:end,2:end) = 1; img = cumsum(cumsum(img,1),2); img = reshape(img,[1,nI*nJ]); big_img = zeros(nI*2,nJ*2); % A larger sized image big_img(2:end,2:end) = 1; big_img = cumsum(cumsum(big_img,1),2); % big_img = reshape(big_img,[1,(nI*2)*(nJ*2)]); % make sure feature does the same thing for images of different sizes. v1 = img * F(:); img = reshape(img,[nI,nJ]); v11 = 0; for corner = 1:length(c) v11 = v11 + c(corner) * img(i(corner), j(corner)); end v2 = 0; for corner = 1:length(c) v2 = v2 + c(corner) * big_img(i(corner) + floor(nI/2), j(corner) + floor(nJ/2)); end if v1 ~= v2 good = 0; end return for i = 1:ci.numClassifiers showWeights(ci,i); disp(sprintf('Feature %d goodness: %d',i,checkFeature(ci.F(:,i),24,24) )) pause(1) end


DisplayFeature.m function boxen = DisplayFeature(nI,nJ,B, varargin) % Display feature B within an image of size nIxnJ % %if (nI~=nJ) % warning('nI and nJ should be equal'); % return; %end nI = nI+1; nJ = nJ + 1; L = tril(ones(nI,nI)); A = zeros(nI*nJ); for iv = 1:nJ for jv = 1:iv; A((nI*(iv-1)+1):nI*iv,(nI*(jv-1)+1):nI*jv) = L; end; end; wts = B'*A; S = reshape(wts,[nI,nJ]); boxen = S(2:end,2:end); if nargin == 4 ind = find(boxen==0); boxen = boxen + varargin{1}; end %imshow(boxen,... % [min(min(avgf)), max(max(avgf))],... %'notruesize'); %drawnow;

ShiftFeature.m function sB= ShiftFeature(nI,nJ, nx,ny,B) % Shifts the feature B xs pixels horizontally and ys pixels % vertically. If it cant be done it returns a zero vector. % nI, nJ are the rows and cols of the original image patch % B is a (nI+1)*(nJ+1) dimensional column vector representing % a feature of the Viola/Jones style % nI = nI+1; % rows in zero padded image nJ = nJ+1; % cols in zero padded image % reshape is an almost free operation -- it doesn't actually change any % of the data, only the indexes in the sparse matrix data structure. % Note that data *is* changed if you actually try to write into B, so % we won't do that. B = reshape(B, [nI, nJ]); [i,j] = find(B); newi = i+ny; newj = j+nx; if sum(newi>nI) | sum(newi<1) | sum(newj>nJ) | sum(newj<1) sB = []; else sB = zeros(nI,nJ); sB(newi,newj) = B(i,j);

end % % copy B into a zero padded matrix, with enough room to % % be shifted in any direction. % % so if B = 0 0 then % % 0 1 % % sp = 0 0 0 0 0 0 % % 0 0 0 0 0 0 % % 0 0 0 0 0 0 % % 0 0 0 1 0 0 % % 0 0 0 0 0 0 % % 0 0 0 0 0 0 % % this is almost free, since sparse matrices take almost no space % sp = sparse(3*nI,3*nJ); % sp(nI+1:2*nI,nJ+1:2*nJ) = B; % % % now select the shifted part of sp % sB = sp(nI+1-ny:2*nI-ny,nJ+1-nx:2*nJ-nx); % % % We know if it went off the edge by counting the number of nonzero % % elements. If it did, blank it out; if not, make it a column vector % if length(find(sB)) < length(find(B)) % sB = []; % else % sB = reshape(sB,[nI*nJ,1]); % end % end return; nI = nI+1; % rows in zero patched image nJ = nJ+1; % cols in zero patched image ulc = min(B); % Find the index to upper left corner of the box. lrc = max(B); % index to lower right corner of the box col_ulc = ceil(ulc/nI); % Find the row and col of ulc in the padded image row_ulc = ulc - nJ*(col_ulc-1); col_lrc = ceil(lrc/nI); % Find the row and col of ulc in the padded image row_lrc = lrc - nJ*(col_lrc-1); row_ulc = row_ulc + ny; % shift col_ulc = col_ulc + nx; row_lrc = row_lrc + ny; % shift2 col_lrc = col_lrc + nx; sB = zeros(nI*nJ,1); %Check whether shift is possible if sum(row_ulc < 1) | sum(col_ulc < 1) | sum(row_lrc > nI) | sum(col_lrc > nJ) sB = []; return else % Find the index to the shifted ulc pixel ulcs = (col_ulc -1)*nI + row_ulc; shift = ulcs - ulc; for i=1: nI*nJ if B(i) ~= 0 sB(i+shift) = B(i);


end end end

violaboxcount.m function c = violaboxcount(nI,nJ,nMin,nMax) if nargin < 4 nMax = max(nI,nJ); end if nargin < 3 nMin = 1; end; nI=nI+1; nJ=nJ+1; nPix = nI*nJ; c=0; %count viola features for preallocation (for speedup) for t = 1:6 si=1; sj=1; switch t case 1, si=2; % split i 2 case 2, sj=2; % split j 2 case 3, si=3; % split i 3 case 4, sj=3; % split j 3 case 5, si=2; sj=2; % split ij 2 case 6, si=3; sj=3; % center end si0 = si*ceil(nMin/si); sj0 = sj*ceil(nMin/sj); for i=2:nI; for j=2:nJ; nIend= min(i+nMax,nI); nJend= min(j+nMax,nJ); for i2=i+si0-1:si:nIend; for j2=j+sj0-1:sj:nJend; c=c+1; end; end; end; end; end;

violaboxzero.m function [V, Vidx, F, Fidx, Pidx]= violaboxeszero(nI,nJ,nMin,nMax,nScale) % nI nJ are actual size of window -- without zero padding % nMin, nMax are the minimum and maximum size of the boxes % nLoc is the maximum number of features centered at each location if nargin < 5 nScale = nI*nJ; end if nargin < 4 nMax = max(nI,nJ); end if nargin < 3 nMin = 1; end; c = violaboxcount(nI,nJ,nMin,nMax);

[F Fidx Pidx] = simpleboxeszero(nI,nJ); disp('did violaboxes zero') nI=nI+1; nJ=nJ+1; nPix = nI*nJ; % allocate sparse matrix V = sparse([],[],[],nPix,c,c); nF=size(F,2); Vidx = sparse([],[],[],nF,5,c); c=0; %index viola features for t = 1:6 si=1; sj=1; switch t case 1, si=2; % split i 2 case 2, sj=2; % split j 2 case 3, si=3; % split i 3 case 4, sj=3; % split j 3 case 5, si=2; sj=2; % split ij 2 case 6, si=3; sj=3; % center end %nIs = (min(2+nMax,nI) - (2+si*ceil(nMin/si)-1))/si+1 %nJs = (min(2+nMax,nJ) - (2+sj*ceil(nMin/sj)-1))/sj+1 si0 = si*ceil(nMin/si); % 0th size in i for this feature sj0 = sj*ceil(nMin/sj); % 0th size in j for this feature for i=2:nI; for j=2:nJ; nIend= min(i+nMax,nI); nJend= min(j+nMax,nJ); for i2=i+si0-1:si:nIend; for j2=j+sj0-1:sj:nJend; iw = i2-i+1; jw = j2-j+1; %switch t % case {1,2,3,4,5} % sr = nScale/sqrt((nI - iw + 1)*(nJ - jw + 1)); % case 6 % sr = 1; %end % if rand <= sr c=c+1; Vidx(Fidx(Pidx(i,j),Pidx(i2,j2)),t)=c; switch t case 1 %split i 2 iw = iw/2; b1 = Fidx(Pidx(i, j),Pidx(i+iw-1,j2)); b2 = Fidx(Pidx(i+iw,j),Pidx(i2, j2)); v = F(:,b1) - F(:,b2); case 2 % split j 2 jw = jw/2; b1 = Fidx(Pidx(i, j),Pidx(i2,j+jw-1)); b2 = Fidx(Pidx(i,j+jw),Pidx(i2, j2)); v = F(:,b1) - F(:,b2); case 3 % split i 3 iw = iw/3; b1 = Fidx(Pidx(i, j),Pidx(i+iw-1, j2)); b2 = Fidx(Pidx(i+iw, j),Pidx(i+iw+iw-1,j2)); b3 = Fidx(Pidx(i+iw+iw,j),Pidx(i2, j2)); v = F(:,b1) - F(:,b2) + F(:,b3); case 4 %split j 3 jw = jw/3; b1 = Fidx(Pidx(i, j),Pidx(i2, j+jw-1));


b2 = Fidx(Pidx(i, j+jw),Pidx(i2,j+jw+jw-1)); b3 = Fidx(Pidx(i,j+jw+jw),Pidx(i2, j2)); v = F(:,b1) - F(:,b2) + F(:,b3); case 5 % split ij 2 iw = iw/2; jw = jw/2; b1 = Fidx(Pidx(i, j),Pidx(i+iw-1,j+jw-1)); b2 = Fidx(Pidx(i+iw, j),Pidx(i2, j+jw-1)); b3 = Fidx(Pidx(i, j+jw),Pidx(i+iw-1, j2)); b4 = Fidx(Pidx(i+iw,j+jw),Pidx(i2, j2)); v = F(:,b1) - F(:,b2) - F(:,b3) + F(:,b4); case 6 % center iw = iw/3; jw = jw/3; b1 = Fidx(Pidx(i, j ),Pidx(i2, j2)); b2 = Fidx(Pidx(i+iw, j+jw ),Pidx(i+iw+iw-1,j+jw+jw-1)); v = F(:,b1) - F(:,b2) * 2; %DisplayFeature(nI-1,nJ-1,v);pause end; nf = sum(v~=0); % if(((nf>6) & (t<3)) | ((nf > 8) & (t < 5)) |(nf > 9 )) % disp([c, i,j,i2,j2,si,sj,t]); % error('bad feature'); % end if(nf>0) V(:,c)=v; else error ('empty feature!'); end %DisplayFeature(nI-1,nJ-1,v);pause %end end; end; end; end; end;