

An Original Face Anti-spoofing Approach using Partial Convolutional Neural Network

Lei Li1, Xiaoyi Feng1, Zinelabidine Boulkenafet2, Zhaoqiang Xia1, Mingming Li1, and Abdenour Hadid2

1 School of Electronics and Information, Northwestern Polytechnical University

Xi’an, Shaanxi, China 710129
e-mail: [email protected]

2 Center for Machine Vision Research (CMV), University of Oulu, Finland

e-mail: [email protected]

Abstract— Deep convolutional neural networks (CNNs) have recently been applied successfully to many computer vision tasks, and several works have introduced deep learning into face anti-spoofing. However, most of these approaches use only the final fully-connected layer to distinguish real from fake faces. Inspired by the idea that each convolutional kernel can be regarded as a part filter, we extract deep partial features from the convolutional layers of a CNN to distinguish real from fake faces. In our proposed approach, the CNN is first fine-tuned on face spoofing datasets. Then, block principal component analysis (PCA) is used to reduce the dimensionality of the features, which helps to avoid over-fitting. Lastly, a support vector machine (SVM) is employed to distinguish real from fake faces. Experiments on two publicly available databases, Replay-Attack and CASIA, show that the proposed method obtains satisfactory results compared to state-of-the-art methods.

Keywords— face anti-spoofing, deep part features, convolutional neural network, block PCA.

I. INTRODUCTION

In recent years, face biometrics have been successfully deployed in many smart devices and authentication systems, such as ATMs, mobile phones and entrance guard systems. With this trend, people no longer need to remember a complex credit card password and can pay for goods just by scanning their face. Many Android smartphones, such as Huawei’s Honor series, now support face unlock: to unlock the phone, the client simply places their face in front of the embedded camera. However, an attacker can invade these biometric systems with just one picture or video of the client’s face, and thereby steal money and private data. Existing face biometric systems are in fact vulnerable to various kinds of spoofing attack, as demonstrated in recent studies [1][2]. A face spoofing attack occurs when someone tries to bypass a face recognition system by presenting a fake face to the acquisition system (e.g., a camera). Based on the material of the fake face, face spoofing attacks include (i) print attacks, (ii) replay attacks, and (iii) 3D mask attacks. Print attacks use a photo of the client to spoof the system and are the most common and simplest of these attacks. Replay attacks use a video of the client’s face to intrude into the system. These two kinds of attack are 2D face attacks. In contrast, 3D mask attacks require a complex production process and high costs, since they need soft plastic that looks like human skin. Given the frequency of these attacks in practical applications, much research over the past decade, including this paper, has focused on print attacks and replay attacks.

In addition, with the development of graphics processing units (GPUs), such as NVIDIA’s GeForce cards, and the arrival of big data, such as the ImageNet Large Scale Visual Recognition Challenge database [3], convolutional neural networks (CNNs) have won a place in computer vision with very high performance. In face recognition, CNNs outperform all previously proposed methods, and the results obtained surpass human performance [4]. However, in computer vision tasks the overwhelming majority of CNN-based methods use only the responses of the last fully connected layer to classify the targets [5][6]. Only a few researchers have explored the effectiveness of the convolutional layers’ responses [7][8]. Inspired by the idea that deformable part models are convolutional neural networks [9], we want to extract the key parts of the face region, which retain the useful information and eliminate redundant information. Each convolutional kernel can be regarded as a part filter, which means that different convolutional kernels extract component features at different locations and therefore generate different responses. For face anti-spoofing, we extract deep part features from the regions of strong response in the convolutional neural network (DPCNN) and use them as the local descriptor to distinguish real from fake faces. Compared with the first deep learning work on face anti-spoofing [6], to our knowledge, the proposed DPCNN has a deeper architecture and uses the deep part features to classify real and fake faces, instead of only the last fully connected layer.

II. RELATED WORK

Based on the different types of spoofing attack, the countermeasures proposed in the past decade can be divided into several groups: (i) texture based; (ii) motion based; (iii) image quality analysis based; (iv) multi-cue fusion based.

978-1-4673-8910-5/16/$31.00 ©2016 IEEE


A. Methods Based on Texture

Intuitively, the texture of real facial skin differs from that of spoofing media such as soft plastic and LCD screens, which is why methods based on texture analysis are the mainstream of face anti-spoofing. Depending on the image space used, texture analysis methods can be classified into gray-scale texture analysis and color texture analysis. For instance, Maatta et al. [10] analyzed the texture of facial images using multi-scale local binary patterns (LBP) and then used a nonlinear SVM classifier with a radial basis function kernel to determine whether the input image corresponds to a live face. Their micro-texture approach is robust, computationally fast and does not require user cooperation. In another work [11], Chingovska et al. explored the performance of different kinds of LBP descriptors extracted from gray-scale images; the LBP features were fed into three kinds of classifier to distinguish real from fake facial images. Moreover, Yang et al. [6] proposed a method based on a convolutional neural network. They explored five different spatial scales of the face and tested on two spoofing databases. The results showed a relative decrease in HTER of over 70% compared with the state of the art; however, the best spatial scale differs between the two databases, which limits the practical value of the method for the variety of spoofing attacks seen in reality. Apart from training a canonical CNN structure, they chose an SVM classifier to distinguish real from fake faces, which was used to handle multiple frames.

B. Methods Based on Motion

Apart from texture analysis, motion information is also a vital cue for face anti-spoofing, especially for print attacks. The correlation between the face region and the background differs between a real face and a spoofing attack because of the motion of the spoofing medium (e.g., printed paper or the screen of a mobile device). In [12], Anjos et al. proposed an anti-spoofing method that analyzes the motion relationship between the background and the face region, which was tested on the Replay-Attack database. The results showed that it performs well on photo attacks. To exploit inter-frame information, Pereira et al. [13] proposed a method based on the LBP-TOP operator, which combines space and time information into a single multiresolution texture descriptor. After the feature extraction step, the features were fed into a binary classifier to discriminate spoofing attacks from real accesses. Experiments carried out on the Replay-Attack database showed an HTER improvement from 15.16% to 7.60%.

Optical flow plays an important role in moving-target detection. In [14], Kollreider et al. presented a technique to detect face spoofing. First, the system selects parts of the face region (e.g., eyes/nose, left and right ear) by a simplified optical flow analysis. Liveness is then evaluated on a short sequence of images using a binary detector.

In [15], the same authors put forward a face anti-spoofing method that fuses scores from several expert systems, which concurrently observe the 3-D face motion scheme introduced in the previous work and liveness properties (e.g., facial muscle twitches, eye blinks and lip movements). Beyond that, in [16], Chakka et al. proposed a method that detects planar media (such as paper or screens) to uncover attacks.

C. Methods Based on Image Analysis

As previously mentioned, real facial skin differs from the medium of a spoofing attack. One main reason is that the fake face in a replay attack is captured twice by cameras, which generates more noise than a real face. In [17], Zhang et al. filtered the face image with four DoG filters and concatenated the filtered images into one feature vector. After the feature extraction step, an SVM classifier was trained to distinguish real from fake faces. In [18], Wen et al. extracted four kinds of features (specular reflection, blurriness, chromatic moment and color diversity features) to detect spoofed faces. The four kinds of features were concatenated into one feature vector, which was used to train two different SVM classifiers to detect fake faces. In another work [19], Peng et al. extracted the High Frequency Descriptor (HFD) to combat attacks. The HFD indicates the amount of high-frequency content in the face region, and the disparity between a real face and a screen is enlarged when both are lit by a photoflash. Based on this fact, a threshold is set to separate real from fake faces.

D. Methods Based on Fusing Other Cues

Apart from the face, other biometric traits such as fingerprint and iris can be used to identify a person uniquely. In recent years, some researchers have therefore fused several cues to detect liveness, aiming at more robust and secure systems. For instance, in [20], Akhtar et al. concatenated three different biometric features: face, iris and fingerprint. The three features were described by the Locally Uniform Comparison Image Descriptor (LUCID) [21], a feature descriptor computed from order permutations. After extracting the features from face, iris and fingerprint, the authors concatenated them into one vector and fed the feature vectors into an SVM classifier to detect spoofing attacks. In the same year, the authors of [22] implemented a mobile biometric liveness detection system, which can select different anti-spoofing methods according to the required level of security. Concatenating different biometric features gives better detection accuracy and more robustness than a single cue, but it also requires more sensors.

Although many detection algorithms for face anti-spoofing already exist, many of them lack robustness in real-world use. This situation calls for new anti-spoofing methods, so we propose an anti-spoofing method based on a very deep part convolutional neural network (DPCNN), which is almost three times as deep as the network in [6].

Fig. 1. Primary architecture of the proposed DPCNN. Step 1: fine-tuning the VGG-face model (224 × 224 input training samples; convolutional, ReLU, max-pooling, dropout and fully connected layers; output: fake or real face). Step 2: part feature extraction from the convolutional responses (forward propagation through the fine-tuned VGG-face model, mean responses, part feature positions (10%), deep part features, block PCA, SVM).

III. PROPOSED METHOD

Inspired by the idea that deformable part models are convolutional neural networks [9], we propose the DPCNN to extract part features from the convolutional layers. The main architecture of the DPCNN is illustrated in Fig. 1. As shown there, the DPCNN involves two steps: (i) fine-tuning the pre-trained VGG-face model, and (ii) extracting the part features from the fine-tuned VGG-face model.

A. Fine-tuning the VGG-face Model

The use of convolutional neural networks in face anti-spoofing has already been proposed in [6] and [23]. In these previous works, the authors trained their CNN models from scratch using the existing face anti-spoofing databases. However, these databases are quite small and were collected in controlled environments, so it is hard to train a deep model that achieves high performance using only the few existing samples. To overcome this problem, we use a pre-trained VGG-face model for face anti-spoofing. The VGG-face model was designed for face recognition in [24]. It consists of 11 blocks. The first 8 blocks are convolutional blocks, each containing one or more convolution layers followed by one or more non-linear layers (ReLU and/or max pooling). The last three blocks are fully connected blocks, in which the convolution filters have the same size as the input data.

To adapt the VGG-face model to our face anti-spoofing problem, we fine-tune it on a training set of real and fake images. Since the face anti-spoofing problem has two classes (i.e., real and fake), we change the output size of the last fully connected layer from 2622 to 2 (2622 is the number of subjects used for face recognition). The softmax loss function shown in Eq. 1 is used as the cost function when fine-tuning the VGG-face model.

$$\mathrm{Cost} = \sum_{i=1}^{n} \left\{ \max(Y_i) + \log\left( \sum_{j=0}^{1} \exp\big(y_{ij} - \max(Y_i)\big) \right) - y_{ir} \right\} \tag{1}$$

Here $i$ is the index of the training sample, $n$ is the number of training samples, and $Y_i = [y_{i0}, y_{i1}]$ contains all predicted values of the $i$-th sample. $y_{ir}$ is the predicted value of the $i$-th sample for its ground-truth class $r$, and $\max(\cdot)$ returns the maximum element of its argument. The overall architecture for fine-tuning the VGG-face model is shown in Fig. 1.
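As an illustration of Eq. 1, the following minimal Python sketch evaluates the cost for a handful of score vectors. The function name `softmax_loss` and the toy scores are hypothetical, and which class index stands for real or fake is an assumption left open here. The example mainly shows why the maximum is subtracted before exponentiation: the second sample would overflow `exp` otherwise.

```python
import math

def softmax_loss(scores, labels):
    """Sum over samples of log-sum-exp(scores) minus the true-class score (Eq. 1)."""
    total = 0.0
    for y, r in zip(scores, labels):
        m = max(y)  # subtracting the maximum keeps exp() from overflowing
        total += m + math.log(sum(math.exp(v - m) for v in y)) - y[r]
    return total

# Hypothetical toy data: two samples, two class scores each.
scores = [[2.0, 0.0], [1000.0, 1001.0]]
labels = [0, 1]  # ground-truth class index r of each sample
loss = softmax_loss(scores, labels)
print(round(loss, 4))
```

Each term equals the negative log-probability that the softmax assigns to the ground-truth class, so the cost shrinks as the true-class score grows relative to the other one.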

B. Extract Deep Part Features

1) Find the parts in convolutional layers: To obtain the average responses over all training samples, we feed all training samples into the fine-tuned VGG-face model. Let $X = [x_1, x_2, \ldots, x_i, \ldots, x_n]$ denote the images of the training samples, where $n$ is the number of training samples. $C_l(x_i) = [C_l^1(x_i), C_l^2(x_i), \ldots, C_l^z(x_i), \ldots, C_l^Z(x_i)]$ is the convolutional response of the $l$-th layer, $C_l^z(x_i)$ is the response of the $z$-th convolution kernel, and $Z$ is the number of convolutional filters in the $l$-th layer. The average convolutional response of the $l$-th layer is then given by Eq. 2.

$$\bar{C}_l = [\bar{C}_l^1, \bar{C}_l^2, \ldots, \bar{C}_l^z, \ldots, \bar{C}_l^Z], \qquad \bar{C}_l^z = \frac{1}{n} \sum_{i=1}^{n} C_l^z(x_i) \tag{2}$$

In our work, we choose the 10% region of all the convolutional responses to represent the part features. We therefore set the threshold $\tau = 0.9 \cdot \max(\bar{C}_l^z)$, where $\max(\cdot)$ returns the maximum value in the matrix $\bar{C}_l^z$. The position of the component can then be obtained by Eq. 3.

$$P_l^z = \mathrm{find}(\bar{C}_l^z > \tau) \tag{3}$$

The operation $\mathrm{find}(\cdot)$ returns the positions where the value is greater than the threshold $\tau$. The part features of the input image $x_i$ are then computed as in Eq. 4.

$$F_l^z(x_i) = \mathrm{rank}(C_l^z(x_i), P_l^z) \tag{4}$$

$\mathrm{rank}(\cdot)$ gathers the elements at the corresponding positions and concatenates all gathered elements into a row vector. The part features of all training samples are given in Eq. 5.

$$F(X) = \begin{bmatrix} F_l^1(x_1) & F_l^2(x_1) & \cdots & F_l^Z(x_1) \\ F_l^1(x_2) & F_l^2(x_2) & \cdots & F_l^Z(x_2) \\ \vdots & \vdots & \ddots & \vdots \\ F_l^1(x_n) & F_l^2(x_n) & \cdots & F_l^Z(x_n) \end{bmatrix} \tag{5}$$
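Eqs. 2 to 5 can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names `part_positions` and `part_features`, the array layout `(n, Z, H, W)` and the random toy responses are all assumptions, and the responses are assumed non-negative (as after a ReLU) so that the threshold at 0.9 times the maximum is meaningful.

```python
import numpy as np

def part_positions(responses, ratio=0.9):
    """responses: array (n, Z, H, W) holding one layer's outputs for n training images.
    Returns, for each kernel z, the flat indices where the mean response
    exceeds ratio * max of that mean response (Eqs. 2 and 3)."""
    mean_resp = responses.mean(axis=0)                 # (Z, H, W), Eq. 2
    positions = []
    for z in range(mean_resp.shape[0]):
        flat = mean_resp[z].ravel()
        tau = ratio * flat.max()                       # per-kernel threshold
        positions.append(np.flatnonzero(flat > tau))   # Eq. 3
    return positions

def part_features(response, positions):
    """response: (Z, H, W) for a single image. Gathers the values at the stored
    positions of every kernel and concatenates them into one row vector (Eq. 4)."""
    return np.concatenate(
        [response[z].ravel()[p] for z, p in enumerate(positions)]
    )

# Hypothetical toy data: 4 images, 3 kernels, 5 x 5 response maps.
rng = np.random.default_rng(0)
responses = rng.random((4, 3, 5, 5))
positions = part_positions(responses)
F = np.stack([part_features(r, positions) for r in responses])  # Eq. 5, one row per image
print(F.shape)
```

The positions are computed once from the training set and then reused unchanged for every image, which is what makes the rows of $F(X)$ comparable across samples.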

2) Block PCA: The dimension of $F(X)$ is usually very high. For example, if we extract the part features from the third layer, the dimension is roughly three hundred thousand ($0.1 \times 224 \times 224 \times 64 = 321126.4$). Compared with the size of the training set, this makes it easy to over-fit the binary classifier, so the dimension of $F(X)$ must be reduced. Principal component analysis (PCA) is suited to this task, but computing the covariance matrix tends to run out of memory. In this paper, we therefore divide the feature matrix $F(X)$ into several blocks and apply PCA to each block, as illustrated in Eq. 6.

$$F_1(X) = \begin{bmatrix} F_l^1(x_1) & \cdots & F_l^i(x_1) \\ F_l^1(x_2) & \cdots & F_l^i(x_2) \\ \vdots & \ddots & \vdots \\ F_l^1(x_n) & \cdots & F_l^i(x_n) \end{bmatrix}, \quad \ldots, \quad F_S(X) = \begin{bmatrix} F_l^s(x_1) & \cdots & F_l^Z(x_1) \\ F_l^s(x_2) & \cdots & F_l^Z(x_2) \\ \vdots & \ddots & \vdots \\ F_l^s(x_n) & \cdots & F_l^Z(x_n) \end{bmatrix} \tag{6}$$

$S$ is the number of feature matrix blocks; the first block covers kernels $1$ to $i$ and the last block covers kernels $s$ to $Z$. After dimension reduction, the deep part features of the training set are $F_{PCA}(X) = [\mathrm{pca}(F_1(X)), \ldots, \mathrm{pca}(F_s(X)), \ldots, \mathrm{pca}(F_S(X))]$, where $\mathrm{pca}(\cdot)$ denotes principal component analysis.
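The block-wise reduction can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: the paper does not give the number of blocks or of retained components, so `n_blocks` and `n_components` are hypothetical parameters, and the columns are split into equal-sized groups rather than along kernel boundaries. The SVD of the centered block is used so that the large covariance matrix is never formed.

```python
import numpy as np

def pca_reduce(block, n_components):
    """Project the rows of `block` onto its leading principal components."""
    centered = block - block.mean(axis=0)
    # SVD of the centered data avoids building the d x d covariance matrix.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    k = min(n_components, vt.shape[0])
    return centered @ vt[:k].T

def block_pca(features, n_blocks, n_components):
    """Split the columns into n_blocks groups, reduce each group with PCA,
    and concatenate the reduced blocks side by side."""
    blocks = np.array_split(features, n_blocks, axis=1)
    return np.hstack([pca_reduce(b, n_components) for b in blocks])

# Hypothetical toy data: 20 training samples with 1000-dimensional part features.
rng = np.random.default_rng(0)
features = rng.standard_normal((20, 1000))
reduced = block_pca(features, n_blocks=4, n_components=5)
print(reduced.shape)
```

In a full pipeline the per-block mean and components learned on the training set would be stored and applied unchanged to the test samples; the sketch only shows the training-side reduction.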

3) Classification: After extracting the deep part features from the convolutional layers, we use a support vector machine (SVM) to learn the classifier from the training set for face anti-spoofing. In this paper, the LIBLINEAR toolkit [25] is used.
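To make the classification step concrete, here is a small stand-in for a linear SVM. It is explicitly not LIBLINEAR's solver: it minimizes the regularized hinge loss with plain full-batch subgradient descent, and the function names, hyper-parameters, toy data and the labeling convention (+1 and -1 for the two classes) are all assumptions made for illustration.

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=100):
    """Minimise lam/2 * ||w||^2 + mean(hinge loss) by subgradient descent.
    X: (n, d) features, y: (n,) labels in {-1, +1}."""
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        active = margins < 1                     # samples inside or beyond the margin
        grad_w = lam * w - (y[active, None] * X[active]).sum(axis=0) / n
        grad_b = -y[active].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def predict(X, w, b):
    """Return +1 or -1 according to the side of the separating hyperplane."""
    return np.where(X @ w + b >= 0, 1, -1)

# Hypothetical, linearly separable toy features (stand-ins for reduced part features).
X = np.array([[2.0, 2.0], [3.0, 1.0], [-2.0, -2.0], [-1.0, -3.0]])
y = np.array([1, 1, -1, -1])
w, b = train_linear_svm(X, y)
print(predict(X, w, b))
```

A dedicated solver such as the one in LIBLINEAR converges faster and more reliably on high-dimensional data; the sketch only shows the objective that a linear SVM optimizes.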

IV. EXPERIMENTAL DATA AND SETUP

A. Experimental Data

In this paper, we evaluate our proposed method on two public face anti-spoofing databases: the Replay-Attack database [26] and the CASIA Face Anti-Spoofing database (CASIA-FA) [27]. A description of these two databases is given below.

1) Replay-Attack: The Idiap Replay-Attack database (footnote 1) [26] consists of 1300 video clips of real accesses and attack attempts for 50 clients (Fig. 2 shows some examples of real and fake images). The genuine videos were recorded under two different lighting conditions: controlled and adverse.

The real and fake videos of the 50 subjects are divided into 3 subject-disjoint subsets for training, development and testing (15, 15 and 20 subjects, respectively). The training subset is used to train the countermeasure model, the development set is used to tune the model, and the testing subset is used for performance evaluation.

2) CASIA-FA: The CASIA Face Anti-Spoofing Database (CASIA-FA) (footnote 2) [27] consists of 600 video recordings of real accesses and attack attempts for 50 clients (Fig. 3 shows some examples of real and fake images). Compared with the Replay-Attack database, CASIA-FA is more challenging in terms of the attack types and the camera devices used for video recording. Three types of attack were created: video replay attacks, warped photo attacks and cut photo attacks. The real accesses and attack attempts were recorded using three camera devices with low, normal and high resolution.

The 50 subjects are divided into two subject-disjoint subsets for training and testing (20 and 30 subjects, respectively). For the performance evaluation, 7 protocols are defined. The first three scenarios were designed to study the effect of the imaging quality: low, normal and high. The next three scenarios study the types of spoofing attack: warped photo attacks, cut photo attacks and video attacks. Finally, the overall scenario evaluates the spoof detection performance in general by combining all the aforementioned scenarios.

Fig. 2. Samples from the Replay-Attack database. The first row presents images taken from the controlled scenario, while the second row corresponds to the images from the adverse scenario. From left to right: real faces and the corresponding high definition, mobile and print attacks.

Fig. 3. Samples from the CASIA FASD. From top to bottom: low, normal and high quality images. From left to right: real faces and the corresponding warped photo, cut photo and video replay attacks.

1 https://www.idiap.ch/dataset/replayattack/download-proc
2 http://www.cbsr.ia.ac.cn/english/FaceAntiSpoofDatabases.asp

B. Experimental Setup

For the performance evaluation, we followed the overall protocol associated with each of the two databases. For each database, we used the training set to fine-tune the VGG-face model and the testing set to evaluate the performance. On the CASIA-FA database, the results are reported in terms of Equal Error Rate (EER). The Replay-Attack database also provides a development set for tuning the model parameters, so its results are reported in terms of EER on the development set and Half Total Error Rate (HTER) on the test set.
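The two metrics can be illustrated with a short pure-Python sketch. It assumes the common convention that a higher score means "more likely real", that the decision threshold is the one at which the false acceptance rate (FAR) and false rejection rate (FRR) are closest on the development set, and that HTER is the mean of FAR and FRR on the test set at that threshold. The function names and the toy scores are hypothetical.

```python
def far_frr(genuine, attacks, threshold):
    """Scores >= threshold are accepted as real.
    FAR: fraction of attacks accepted; FRR: fraction of genuine accesses rejected."""
    far = sum(s >= threshold for s in attacks) / len(attacks)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr

def eer_threshold(genuine, attacks):
    """Pick the candidate threshold where |FAR - FRR| is smallest."""
    def gap(t):
        far, frr = far_frr(genuine, attacks, t)
        return abs(far - frr)
    return min(sorted(set(genuine) | set(attacks)), key=gap)

def hter(genuine, attacks, threshold):
    """Half Total Error Rate: the mean of FAR and FRR at a fixed threshold."""
    far, frr = far_frr(genuine, attacks, threshold)
    return (far + frr) / 2

# Hypothetical classifier scores.
dev_genuine = [0.9, 0.8, 0.7, 0.4]
dev_attacks = [0.1, 0.2, 0.3, 0.6]
test_genuine = [0.95, 0.85, 0.5, 0.65]
test_attacks = [0.05, 0.15, 0.7, 0.25]

threshold = eer_threshold(dev_genuine, dev_attacks)
dev_far, dev_frr = far_frr(dev_genuine, dev_attacks, threshold)
test_hter = hter(test_genuine, test_attacks, threshold)
print(threshold, dev_far, dev_frr, test_hter)
```

Fixing the threshold on the development set before touching the test set is what keeps the HTER an unbiased estimate of test performance.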

V. RESULTS AND DISCUSSION

A. Single DPCNN

We extract the deep part features from the 11th, 13th, 15th, 18th, 20th, 22nd, 25th and 27th layers of the fine-tuned VGG-face model, because the first several convolutional layers lack abstract information and the later convolutional layers lose structural features. The intra-database results are given in Table 1. The table shows that the choice of layer has almost no influence on the Replay-Attack database, in contrast to the CASIA-FA database. For CASIA-FA, there is a rough trend that the EER improves with the depth of the convolutional layer, with the 20th layer as an exception.

Table 1. Intra-database results based on the single DPCNN.

| Layer | Replay-Attack EER (%) | Replay-Attack HTER (%) | CASIA EER (%) |
|---|---|---|---|
| Layer-11 | 5.1549 | 6.9046 | 11.9014 |
| Layer-13 | 3.9231 | 5.5445 | 12.1512 |
| Layer-15 | 4.4170 | 7.1702 | 10.7570 |
| Layer-18 | 5.0465 | 5.9078 | 10.3097 |
| Layer-20 | 3.9285 | 5.5658 | 7.7712 |
| Layer-22 | 4.6124 | 6.2154 | 8.1561 |
| Layer-25 | 4.3844 | 6.0740 | 5.1804 |
| Layer-27 | 4.8564 | 5.1696 | 4.9931 |

B. Combined DPCNN

We select the three layers with the smallest EER values in the intra-database test (Table 1) and combine them: the 13th, 20th and 25th layers for the Replay-Attack database, and the 20th, 25th and 27th layers for the CASIA database. The intra-database results are shown in Table 2: the smallest EERs on Replay-Attack and CASIA are 2.8703% and 4.5458%, whereas the single DPCNN achieves 3.9231% and 4.9931% (Table 1).

Table 2. Intra-database results based on the combined DPCNN.

| Layers | Replay-Attack EER (%) | Replay-Attack HTER (%) | CASIA EER (%) |
|---|---|---|---|
| Layer-13&20 | 2.8703 | 6.1130 | 8.0414 |
| Layer-13&25 | 5.1495 | 6.2601 | 8.0834 |
| Layer-20&25 | 4.6882 | 5.5426 | 7.3239 |
| Layer-20&27 | 4.1132 | 5.1616 | 7.2194 |
| Layer-25&27 | 5.8062 | 5.7753 | 4.5458 |
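The layer choice described in this subsection can be reproduced from the single-layer EER values of Table 1. The short sketch below copies those values and picks the three layers with the smallest EER per database; the helper name `best_layers` is hypothetical, and the sketch covers only the selection, not how the selected layers' features are combined.

```python
# EER (%) per layer, copied from Table 1.
replay_eer = {11: 5.1549, 13: 3.9231, 15: 4.4170, 18: 5.0465,
              20: 3.9285, 22: 4.6124, 25: 4.3844, 27: 4.8564}
casia_eer = {11: 11.9014, 13: 12.1512, 15: 10.7570, 18: 10.3097,
             20: 7.7712, 22: 8.1561, 25: 5.1804, 27: 4.9931}

def best_layers(eer_by_layer, k=3):
    """Return the k layers with the smallest EER, listed in layer order."""
    ranked = sorted(eer_by_layer, key=eer_by_layer.get)
    return sorted(ranked[:k])

print(best_layers(replay_eer))  # [13, 20, 25]
print(best_layers(casia_eer))   # [20, 25, 27]
```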

C. Comparison with the state-of-the-art methods

Table 3 compares the performance of our proposed approach with state-of-the-art methods in the intra-database scenario. Table 3 shows that the best performance on the Replay-Attack database is obtained by the Motion Mag method [28], but its result on the CASIA-FA database is less competitive than those of other methods. Overall, our proposed approach gives state-of-the-art performance on the challenging CASIA-FA database and competitive performance on the Replay-Attack database.


Table 3. Comparison between the intra-database performance of the proposed countermeasure and state-of-the-art methods.

| Method | Replay-Attack EER (%) | Replay-Attack HTER (%) | CASIA EER (%) |
|---|---|---|---|
| IQA based [29] | - | - | 32.4 |
| CDD [30] | - | - | 11.8 |
| DOG (baseline) [27] | - | - | 17.0 |
| Motion+LBP [31] | 4.5 | 5.1 | - |
| Motion [32] | 11.6 | 11.7 | 26.6 |
| LBP [26] | 13.9 | 13.8 | 18.2 |
| LBP-TOP [13] | 7.8 | 7.6 | 10.6 |
| Motion Mag [28] | 0.2 | 0.0 | 14.4 |
| Deep Learning [6] | 6.1 | 2.1 | 7.3 |
| Fine-tuned VGG-face | 8.4 | 4.3 | 5.2 |
| Proposed single DPCNN | 3.9 | 5.5 | 5.0 |
| Proposed combined DPCNN | 2.9 | 6.1 | 4.5 |

VI. CONCLUSION

In this paper, a novel face anti-spoofing method based on a deep part convolutional neural network (DPCNN) is introduced. Unlike previously proposed methods, our approach is based on a pre-trained deep neural network and uses the responses of the convolutional layers to distinguish real from fake faces. The experiments show that the proposed DPCNN achieves promising results that compare favorably with state-of-the-art methods. Finally, it is worth noting that much better results can be expected from deep learning when a large number of training images is available. We therefore plan to set up a larger-scale database for face anti-spoofing.

REFERENCES

[1] Y. Li, K. Xu, Q. Yan, Y. Li, and R. H. Deng, “Understanding osn-basedfacial disclosure against face authentication systems,” Proceedings ofAcm Sigsac Symposium on Information Computer and CommunicationsSecurity, 2014. 1

[2] L. Omar and I. Ivrissimtzis, “Evaluating the resilience of face recogni-tion systems against malicious attacks,” in Proceedings of the 7th UKComputer Vision Student Workshop (BMVW), G. K. L. Tam, Ed. BMVAPress, September 2015, pp. 5.1–5.9. 1

[3] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma,Z. Huang, A. Karpathy, A. Khosla, M. Bernstein et al., “Imagenet largescale visual recognition challenge,” International Journal of ComputerVision, vol. 115, no. 3, pp. 211–252, 2015. 1

[4] Y. Sun, X. Wang, and X. Tang, “Deep learning face representationby joint identification-verification,” Advances in Neural InformationProcessing Systems, vol. 27, pp. 1988–1996, 2014. 1

[5] J. Hosang, M. Omran, R. Benenson, and B. Schiele, “Taking a deeperlook at pedestrians,” in 2015 IEEE Conference on Computer Vision andPattern Recognition (CVPR), June 2015, pp. 4073–4082. 1

[6] J. Yang, Z. Lei, and S. Z. Li, “Learn convolutional neural network forface anti-spoofing,” Eprint Arxiv, vol. 9218, pp. 373–384, 2014. 1, 2,3, 6

[7] M. Paulin, M. Douze, Z. Harchaoui, and J. Mairal, “Local convolu-tional features with unsupervised training for image retrieval,” in IEEEInternational Conference on Computer Vision, 2015, pp. 91–99. 1

[8] B. Yang, J. Yan, Z. Lei, and S. Z. Li, “Convolutional channel features,”in 2015 IEEE International Conference on Computer Vision (ICCV),Dec 2015, pp. 82–90. 1

[9] R. Girshick, F. Iandola, T. Darrell, and J. Malik, “Deformable partmodels are convolutional neural networks,” in IEEE Conference onComputer Vision and Pattern Recognition, 2015, pp. 437–446. 1, 3

[10] J. Maatta, A. Hadid, and M. Pietikainen, “Face spoofing detection from single images using micro-texture analysis,” in International Joint Conference on Biometrics, 2011, pp. 1–7.

[11] I. Chingovska, A. Anjos, and S. Marcel, “On the effectiveness of local binary patterns in face anti-spoofing,” in IEEE International Conference of the Biometrics Special Interest Group, 2012, pp. 1–7.

[12] A. Anjos and S. Marcel, “Counter-measures to photo attacks in face recognition: A public database and a baseline,” in Biometrics, International Joint Conference on, 2011, pp. 1–7.

[13] T. D. F. Pereira, A. Anjos, J. M. D. Martino, and S. Marcel, “LBP-TOP based countermeasure against face spoofing attacks,” in Proceedings of the 11th international conference on Computer Vision - Volume Part I, 2012, pp. 121–132.

[14] K. Kollreider, H. Fronthaler, and J. Bigun, “Non-intrusive liveness detection by face images,” Image and Vision Computing, vol. 27, no. 3, pp. 233–244, 2009.

[15] K. Kollreider, H. Fronthaler, and J. Bigun, “Verifying liveness by multiple experts in face biometrics,” in 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 2008, pp. 1–6.

[16] M. M. Chakka, A. Anjos, S. Marcel, and R. Tronci, “Competition on counter measures to 2-d facial spoofing attacks,” in Proceedings of the 2011 International Joint Conference on Biometrics, 2011, pp. 1–6.

[17] Z. Zhang, J. Yan, S. Liu, and Z. Lei, “A face antispoofing database with diverse attacks,” in Biometrics (ICB), 2012 5th IAPR International Conference on, 2012, pp. 26–31.

[18] D. Wen, H. Han, and A. K. Jain, “Face spoof detection with image distortion analysis,” IEEE Transactions on Information Forensics and Security, vol. 10, no. 4, pp. 746–761, 2015.

[19] J. Peng and P. P. K. Chan, “Face liveness detection for combating the spoofing attack in face recognition,” in Wavelet Analysis and Pattern Recognition (ICWAPR), 2014 International Conference on, July 2014, pp. 176–181.

[20] Z. Akhtar, C. Micheloni, and G. L. Foresti, “Liveness detection for biometric authentication in mobile applications,” in Security Technology (ICCST), 2014 International Carnahan Conference on, 2014, pp. 6695–6695.

[21] A. Ziegler, E. Christiansen, D. Kriegman, and S. Belongie, “Locally uniform comparison image descriptor,” Advances in Neural Information Processing Systems, vol. 1, pp. 1–9, 2012.

[22] Z. Akhtar, C. Micheloni, C. Piciarelli, and G. L. Foresti, “Mobio-livdet: Mobile biometric liveness detection,” in Advanced Video and Signal Based Surveillance (AVSS), 2014 11th IEEE International Conference on, Aug 2014, pp. 187–192.

[23] D. Menotti, G. Chiachia, A. Pinto, W. Robson Schwartz, H. Pedrini, A. Xavier Falcao, and A. Rocha, “Deep representations for iris, face, and fingerprint spoofing detection,” Information Forensics and Security, IEEE Transactions on, vol. 10, no. 4, pp. 864–879, 2015.

[24] O. M. Parkhi, A. Vedaldi, and A. Zisserman, “Deep face recognition,” in British Machine Vision Conference, 2015.

[25] R. E. Fan, K. W. Chang, C. J. Hsieh, X. R. Wang, and C. J. Lin, “Liblinear: A library for large linear classification,” Journal of Machine Learning Research, vol. 9, no. 12, pp. 1871–1874, 2010.

[26] I. Chingovska, A. Anjos, and S. Marcel, “On the effectiveness of local binary patterns in face anti-spoofing,” in International Conference of the Biometrics Special Interest Group (BIOSIG), Sept 2012, pp. 1–7.

[27] Z. Zhang, J. Yan, S. Liu, Z. Lei, D. Yi, and S. Li, “A face antispoofing database with diverse attacks,” in International Conference on Biometrics (ICB), March 2012, pp. 26–31.

[28] B. Samarth, D. T. I, V. Mayank, and S. Richa, “Face anti-spoofing via motion magnification and multifeature videolet aggregation,” University of Delhi, Department of Computer Science and Engineering, Tech. Rep., 5 2014, https://repository.iiitd.edu.in/jspui/handle/123456789/138.

[29] J. Galbally and S. Marcel, “Face anti-spoofing based on general image quality assessment,” in International Conference on Pattern Recognition (ICPR), Aug 2014, pp. 1173–1178.

[30] J. Yang, Z. Lei, S. Liao, and S. Li, “Face liveness detection with component dependent descriptor,” in Biometrics (ICB), 2013 International Conference on, June 2013, pp. 1–6.

[31] J. Komulainen, A. Hadid, M. Pietikainen, A. Anjos, and S. Marcel, “Complementary countermeasures for detecting scenic face spoofing attacks,” in International Conference on Biometrics (ICB), June 2013, pp. 1–7.

[32] T. de Freitas Pereira, A. Anjos, J. De Martino, and S. Marcel, “Can face anti-spoofing countermeasures work in a real world scenario?” in International Conference on Biometrics (ICB), June 2013, pp. 1–8.