Extended Sparse Linear Model for Face Recognition
Weihong Deng
Beijing Univ. Post. & Telecom.
Slide 2
Characteristics of Face Pattern
Facial shapes are very similar, sometimes identical! (~100% face detection rate, kinship verification)
A special object: easy for both detection and identification!
Within-class variation is larger than between-class variation.
Even though faces look similar, linear classifiers achieve ~100% accuracy on public databases (although not in real-world applications).
Slide 3
Learn linear structures in high-dimensional image space
Challenges
Geometry: How to describe low-dimensional structures in high-dimensional data?
Statistics: Deal with real data that contain large intra-class variations, such as pose, illumination, and expression (PIE), or occlusion.
Learning: Handle insufficient data, i.e. the small (single) sample size problem.
Computation: Implement a real-time recognition system!
Slide 4
Linear models are universal in (visual) signals
JPEG: Linear representation with pre-defined basis images
Sparse coding: Linear representation with learned basis images
Slide 5
Outline: Two historical linear models and their extensions
Both use the linear model y = Ax.
Principal Component Analysis (Eigenfaces): the columns of A are basis images.
Sparse Representation-based Classification (SRC): the columns of A are prototype images.
Slide 6
Eigenfaces: A human psychophysics experiment
Eigenfaces reveal that the intrinsic dimension of the face space is ~100.
Intrinsic dimension: Human observers can recognize faces with SNR ~7-8, which requires an eigenspace of as few as 100-200 dimensions.
[Figure: orthogonal basis images and reconstructed images]
Marsha Meytlis and Lawrence Sirovich, On the Dimensionality of Face Space, PAMI 2007
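The eigenface representation above can be sketched in a few lines. This is a minimal illustration on synthetic data standing in for face images, not the experiment from the cited paper; the function names `fit_eigenfaces` and `reconstruct` and all sizes are assumptions made for the example.

```python
import numpy as np

def fit_eigenfaces(images, n_components):
    """images: (n_samples, n_pixels). Returns the mean image and orthonormal basis rows."""
    mean = images.mean(axis=0)
    # Rows of vt are orthonormal principal directions (the "eigenfaces").
    _, _, vt = np.linalg.svd(images - mean, full_matrices=False)
    return mean, vt[:n_components]

def reconstruct(image, mean, basis):
    """Project an image onto the eigenspace and map it back to pixel space."""
    coeffs = basis @ (image - mean)
    return mean + basis.T @ coeffs

# Synthetic stand-in (assumption): 200 "images" near a 5-D subspace of a 64-D space.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 5))
mixing = rng.normal(size=(5, 64))
images = latent @ mixing + 0.01 * rng.normal(size=(200, 64))

mean, basis = fit_eigenfaces(images, n_components=5)
recon = reconstruct(images[0], mean, basis)
rel_err = np.linalg.norm(recon - images[0]) / np.linalg.norm(images[0])
print(f"relative reconstruction error: {rel_err:.4f}")
```

With real face images, the number of components would be chosen in the 100-200 range quoted above rather than 5.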
Slide 7
Face Recognition via Sparse Linear Representation
[Figure: the subspace of the same face within the face space]
Illumination theory: Images of the same face under varying illumination lie approximately on a low (nine)-dimensional subspace, known as the harmonic plane [Basri & Jacobs, PAMI, 2003].
This inspired a new linear model called sparse representation based classification (SRC) [PAMI 2009].
Slide 8
Face Recognition via Sparse Representation
Assumption: the test image, y, can be expressed as a linear combination of k training images, say of the same subject: y = Ax.
The solution, x, should be a sparse vector: most of its entries should be zero, except for the ones associated with the correct subject.
Reference: Wright et al., Robust Face Recognition via Sparse Representation. PAMI, 31(2):210-227, 2009.
Slide 9
Face Recognition via Sparse Representation
[Figure: training images 1, 2, 3, ..., N grouped into subject 1, ..., subject i, ..., subject n]
Classification criterion: assign the test image to the class with the smallest residual.
Sparse representation encodes membership through its nonzero coefficients!
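The residual-based criterion can be sketched as follows. This is a hedged illustration, not the implementation from Wright et al.: the L1 problem is solved with a plain ISTA loop (a swapped-in solver chosen only because it needs nothing beyond NumPy), the data are synthetic, and the names `lasso_ista` and `src_classify` and the value of `lam` are assumptions.

```python
import numpy as np

def lasso_ista(A, y, lam=0.01, n_iter=1000):
    """Approximately minimize 0.5*||y - A x||_2^2 + lam*||x||_1 with ISTA."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = x - step * (A.T @ (A @ x - y))                        # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold
    return x

def src_classify(A, labels, y, lam=0.01):
    """Code y over all training columns, then pick the class with the smallest residual."""
    x = lasso_ista(A, y, lam)
    classes = np.unique(labels)
    residuals = [np.linalg.norm(y - A[:, labels == c] @ x[labels == c])
                 for c in classes]
    return classes[int(np.argmin(residuals))], x

# Synthetic stand-in for face data (assumption): each subject spans its own 3-D subspace.
rng = np.random.default_rng(0)
n_pixels, n_subjects, n_per_class = 50, 3, 8
bases = [rng.normal(size=(n_pixels, 3)) for _ in range(n_subjects)]
A = np.hstack([b @ rng.normal(size=(3, n_per_class)) for b in bases])
A /= np.linalg.norm(A, axis=0)                  # unit-norm training columns
labels = np.repeat(np.arange(n_subjects), n_per_class)

y = bases[1] @ rng.normal(size=3)               # test sample drawn from subject 1
y /= np.linalg.norm(y)
predicted, x = src_classify(A, labels, y)
print("predicted subject:", predicted)
```

Only the coefficients of one class are used when computing each residual, which is how the nonzero pattern of x encodes membership.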
Slide 10
Limitations of Sparse Representation based Classification
SRC assumes that the training images have been carefully controlled and that the number of samples per class is sufficiently large.
However, when the sample size is small, the sparsity breaks down!
[Figure: test image, training dictionary, and coefficients]
Outside these operating conditions, SRC should not be expected to perform well (Wright et al.).
Many real-world applications are outside these operating conditions.
Reference: Wright et al., Sparse representation for computer vision and pattern recognition. Proceedings of the IEEE, 98(6):1031-1044, 2010.
Slide 11
Previous works (how to solve the small sample size (SSS) problem)
[1] Fisherfaces (Linear Discriminant Analysis)
The feature covariances of all classes are identical.
The within-class scatter matrix is shared by all classes for discriminant analysis.
[2] Quotient image
All faces share an identical 3D shape.
Render all lighting conditions of a novel face from the training images of other subjects.
[3] Adaptive discriminant analysis
The within-class variation of the gallery subjects can be represented by out-of-gallery subjects.
Adapt the within-class scatter matrix for DA on gallery samples.
1. Peter N. Belhumeur, Joao P. Hespanha, and David J. Kriegman, Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection, PAMI 1997
2. Amnon Shashua and Tammy Riklin-Raviv, The Quotient Image: Class-Based Re-Rendering and Recognition with Varying Illuminations, PAMI 2001
3. Meina Kan, Shiguang Shan, Yu Su, Xilin Chen, and Wen Gao, Adaptive discriminant analysis for face recognition from single sample per person, FG 2011
Slide 12
Observation
Human faces share similar shape.
Key assumption: the intra-class variation of any gallery face can be approximated by a linear combination of the intra-class differences from a sufficient number of generic faces.
1. Deng et al., Extended SRC: Undersampled Face Recognition via Intra-Class Variant Dictionary, PAMI 2012
2. Deng et al., In Defense of Sparsity Based Face Recognition, CVPR, 2013.
Slide 13
Extended Sparse Representation
Two novel assumptions:
1. An image can be represented as a superposition over the prototype and variance dictionaries;
2. Intra-class variant bases are shared across classes.
[Figure: prototypes and variance dictionaries]
Reference: Deng et al., Extended SRC: Undersampled Face Recognition via Intra-Class Variant Dictionary, PAMI 34(9): 1864-1870, 2012.
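The two assumptions can be sketched as coding a test image over a prototype dictionary A (one image per gallery subject) plus a shared variance dictionary D built from generic subjects. This is an illustrative sketch on synthetic data, not the authors' code: building D from centroid-subtracted generic samples, the ISTA solver, the symbols D and beta, and the names `lasso_ista` and `esrc_classify` are all assumptions made for the example.

```python
import numpy as np

def lasso_ista(B, y, lam=0.01, n_iter=2000):
    """Approximately minimize 0.5*||y - B c||_2^2 + lam*||c||_1 with ISTA."""
    step = 1.0 / np.linalg.norm(B, 2) ** 2
    c = np.zeros(B.shape[1])
    for _ in range(n_iter):
        z = c - step * (B.T @ (B @ c - y))
        c = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return c

def esrc_classify(A, D, y, lam=0.01):
    """A: one prototype column per subject; D: intra-class variant dictionary."""
    coef = lasso_ista(np.hstack([A, D]), y, lam)
    x, beta = coef[:A.shape[1]], coef[A.shape[1]:]
    corrected = y - D @ beta                      # remove the shared variation term
    residuals = np.linalg.norm(corrected[:, None] - A * x, axis=0)
    return int(np.argmin(residuals)), x, beta

rng = np.random.default_rng(0)
n_pixels, n_gallery, n_generic, n_per_generic = 60, 3, 10, 5
# Assumption: variation (e.g. lighting) lives in 4 directions shared by all faces.
V, _ = np.linalg.qr(rng.normal(size=(n_pixels, 4)))
A = rng.normal(size=(n_pixels, n_gallery))
A /= np.linalg.norm(A, axis=0)                    # single sample per gallery subject

# Variance dictionary: generic samples minus their class centroid.
columns = []
for _ in range(n_generic):
    identity = rng.normal(size=(n_pixels, 1))
    samples = identity + V @ rng.normal(size=(4, n_per_generic))
    columns.append(samples - samples.mean(axis=1, keepdims=True))
D = np.hstack(columns)
D /= np.linalg.norm(D, axis=0)

y = A[:, 1] + V @ rng.normal(size=4)              # subject 1 under unseen variation
predicted, x, beta = esrc_classify(A, D, y)
print("predicted subject:", predicted)
```

The gallery never contains the variation present in y; it is explained by columns of D that come from other people, which is the sharing assumption stated above.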
Slide 14
Recognition with single sample per class
Dramatically reduces the error rates.
Intra-class variance dictionaries of a single face can reduce the error rate by nearly half!
[Figure: intra-class variance dictionary and test variability]
Reference: Deng et al., Extended SRC: Undersampled Face Recognition via Intra-Class Variant Dictionary, PAMI 34(9): 1864-1870, 2012.
Slide 15
Recognition with uncontrolled training samples
Largely boosts the accuracy with uncontrolled training samples.
Outperforms methods with complicated dictionary learning!
Reference: Deng et al., In Defense of Sparsity Based Face Recognition, CVPR, 2013.
Slide 16
Recognition with over-complete intra-class variance dictionary
Construct an over-complete variance dictionary from the image differences of the FRGC training set (12,766 images).
Compare the effects of L1 (sparsity) and L2 (non-sparsity) regularization for computing the linear combination.
The L1 (sparsity) constraint leads to much better recognition accuracies.
Reference: Deng et al., In Defense of Sparsity Based Face Recognition, CVPR, 2013.
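The two regularizers being compared can be written as variants of one coding problem. The form below is a sketch under assumptions: the test image y is coded over a prototype dictionary A and a variance dictionary D, with coefficient vectors x and \beta and a regularization weight \lambda. The symbols D, \beta, and \lambda are introduced here for illustration and are not taken from the slide.

```latex
\begin{align}
\text{L1 (sparse):}\quad
  &\min_{x,\beta}\ \left\| y - [A,\ D]\begin{bmatrix} x \\ \beta \end{bmatrix} \right\|_2^2
   + \lambda \left\| \begin{bmatrix} x \\ \beta \end{bmatrix} \right\|_1 \\
\text{L2 (non-sparse):}\quad
  &\min_{x,\beta}\ \left\| y - [A,\ D]\begin{bmatrix} x \\ \beta \end{bmatrix} \right\|_2^2
   + \lambda \left\| \begin{bmatrix} x \\ \beta \end{bmatrix} \right\|_2^2
\end{align}
```

Writing B = [A, D], the L2 problem has the closed-form solution (B^T B + \lambda I)^{-1} B^T y, which is generally dense, while the L1 problem needs an iterative solver and drives most coefficients exactly to zero.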
Slide 17
Extended works of ESRC
Yi Ma:
Single-Sample Face Recognition with Image Corruption and Misalignment via Sparse Illumination Transfer (CVPR 2013, IJCV 2014)
Neither Global Nor Local: Regularized Patch-Based Representation for Single Sample Per Person Face Recognition (IJCV 2014)
Lei Zhang:
Sparse Variation Dictionary Learning for Face Recognition with A Single Training Sample Per Person (ICCV 2013)
Xudong Jiang:
Sparse And Dense Hybrid Representation via Dictionary Decomposition for Face Recognition (PAMI 2015)
Slide 18
From ESRC to Metric Learning
The metric space is the identity space, with parameter-free learning.
Gallery images are mapped to [1 0 0], [0 1 0] and [0 0 1], respectively.
Generic facial variations are all mapped to [0 0 0].
[Figure: 1. Gallery images 2. Basis images of the metric space before transfer learning 3. Basis images of the metric space after transfer learning]
Deng et al., Equidistant Prototypes Embedding for Single Sample Based Face Recognition with Generic Learning and Incremental Learning, Pattern Recognition, 2014.
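The mapping described above can be sketched as a least-squares problem: find a linear projection P that sends each gallery image to its one-hot target and every generic variation to the zero vector. The pseudoinverse solution below is an assumption chosen for brevity and is not claimed to be the exact procedure of the cited paper; the data are synthetic and the names `fit_embedding` and `classify` are hypothetical.

```python
import numpy as np

def fit_embedding(A, D, rcond=1e-10):
    """Least-squares P with P @ A ~ I (one-hot targets) and P @ D ~ 0."""
    n_gallery = A.shape[1]
    X = np.hstack([A, D])
    targets = np.hstack([np.eye(n_gallery), np.zeros((n_gallery, D.shape[1]))])
    # rcond truncates the near-zero singular values of the rank-deficient X.
    return targets @ np.linalg.pinv(X, rcond=rcond)

def classify(P, y):
    """Nearest one-hot prototype, which is the largest embedded coordinate."""
    return int(np.argmax(P @ y))

rng = np.random.default_rng(0)
n_pixels, n_gallery = 60, 3
# Assumption: facial variation lives in 4 directions shared by all subjects.
V, _ = np.linalg.qr(rng.normal(size=(n_pixels, 4)))
A = rng.normal(size=(n_pixels, n_gallery))        # one gallery image per subject

# Generic variations: samples of out-of-gallery subjects minus their centroid.
columns = []
for _ in range(10):
    identity = rng.normal(size=(n_pixels, 1))
    samples = identity + V @ rng.normal(size=(4, 5))
    columns.append(samples - samples.mean(axis=1, keepdims=True))
D = np.hstack(columns)

P = fit_embedding(A, D)
y = A[:, 1] + V @ rng.normal(size=4)              # subject 1 under unseen variation
embedded = P @ y
print("embedded:", np.round(embedded, 3), "predicted subject:", classify(P, y))
```

Because the variation directions are mapped to zero, the embedded test image lands on the prototype of its own subject in this idealized setting, and no regularization parameter has to be tuned.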
Slide 19
Excellent Applicability and Surprisingly Good Results
1. Parameter free
2. Super simple and fast; handles high-dimensional features in real time
3. Generic learning to address the small sample size problem
4. Higher flexibility: on-line update with identical recognition accuracy
5. Consistently better performance than SRC with any number of samples per class
[Figures: LRA vs SRC; comparative recognition accuracy via batch vs. online learning; comparative time via batch vs. online learning]
Reference: Deng et al., Equidistant Prototypes Embedding for Single Sample Based Face Recognition with Generic Learning and Incremental Learning, Pattern Recognition, 47(12): 3738-3749, 2014
Slide 20
Cross-database Transferred Metric Learning
Observations
Accuracy improvement by transfer metric learning from other databases is significant.
Transfer metric learning from images of 5 faces can reduce the recognition errors by a half.
The efficiency of transfer is much higher than in other tasks.
[Figure: error rate as a function of the number of face classes used for transferred metric learning]
1. Deng et al., Extended SRC: Undersampled Face Recognition via Intra-Class Variant Dictionary, PAMI 2012
2. Deng et al., Equidistant Prototypes Embedding for Single Sample Based Face Recognition with Generic Learning and Incremental Learning, Pattern Recognition, 2014.
Slide 21
Underestimated Core Problem of Face Recognition: Curse of Misalignment
Underestimated issue: There is no criterion to define a well-aligned face; current research works align faces manually in a heuristic manner.
Most frontal representation/recognition research aligns faces by feature points.
Curse of misalignment: Many representation/recognition errors in practice are caused by misalignment!
Our conjecture: Alignment by image plane is more stable than alignment by feature points.
Shan et al., Curse of Misalignment in face recognition: Problem and a novel misalignment learning solution, FG 2004.
Slide 22
From Point to Plane: Alignment via eigenspace
Our idea: Transform every training image toward its projection on the eigenspace.
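One way to read this idea is as a search over transforms for the one that brings an image closest to the eigenspace. The sketch below is a deliberately simplified illustration, not the TIPCA algorithm: it uses 1-D signals instead of images, integer circular shifts instead of a general image transform, and a fixed eigenspace learned from already aligned data. The names `fit_eigenspace`, `residual`, `align_to_eigenspace`, and `bump` are hypothetical.

```python
import numpy as np

def fit_eigenspace(signals, n_components):
    """Mean and orthonormal basis rows of the leading principal directions."""
    mean = signals.mean(axis=0)
    _, _, vt = np.linalg.svd(signals - mean, full_matrices=False)
    return mean, vt[:n_components]

def residual(signal, mean, basis):
    """Distance between a signal and its projection on the eigenspace."""
    centered = signal - mean
    return np.linalg.norm(centered - basis.T @ (basis @ centered))

def align_to_eigenspace(signal, mean, basis, max_shift=8):
    """Pick the circular shift whose result lies closest to the eigenspace."""
    shifts = range(-max_shift, max_shift + 1)
    errors = [residual(np.roll(signal, s), mean, basis) for s in shifts]
    best = shifts[int(np.argmin(errors))]
    return best, np.roll(signal, best)

grid = np.arange(64)

def bump(amplitude, width, center=32):
    """Toy 'aligned face': a Gaussian bump centered on the grid."""
    return amplitude * np.exp(-(grid - center) ** 2 / (2.0 * width ** 2))

rng = np.random.default_rng(0)
train = np.stack([bump(rng.uniform(0.5, 1.5), rng.uniform(3.0, 6.0))
                  for _ in range(100)])
mean, basis = fit_eigenspace(train, n_components=6)

misaligned = np.roll(bump(1.0, 4.5), 5)           # shifted right by 5 samples
best_shift, aligned = align_to_eigenspace(misaligned, mean, basis)
print("recovered shift:", best_shift)
```

No landmark is located anywhere in this procedure; the alignment criterion is the reconstruction error of the whole signal, which is the contrast with feature-point alignment drawn on the previous slide.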
Slide 23
Transform-invariant PCA
TIPCA: Joint Face Representation and Registration
Reference: Deng et al., Transform-Invariant PCA: A Unified Approach to Fully Automatic Face Alignment, Representation, and Recognition. PAMI 36(6): 1275-1284, 2014.
Slide 24
Fully Automatic Registration, Representation, and Recognition
Our belief: There is an underlying relationship among image registration, representation, and recognition.
TIPCA is a very simple method that relies on this profound relationship.
Reference: Deng et al., Transform-Invariant PCA: A Unified Approach to Fully Automatic Face Alignment, Representation, and Recognition. PAMI 36(6): 1275-1284, 2014.
Slide 26
Better Representation Improves Recognition
Recognition results on the standard FERET database.
Gallery (training): 1196 subjects (one image per subject).
Probe sets: fb: expression variation; fc: lighting variation; dup1: time interval < 18 months; dup2: time interval > 18 months.
TIPCA-aligned faces are more suitable for recognition than manually aligned faces.
Reference: Deng et al., Transform-Invariant PCA: A Unified Approach to Fully Automatic Face Alignment, Representation, and Recognition. PAMI 36(6): 1275-1284, 2014.
Slide 27
Aligning 1000 images of 100 subjects with real occlusion & lighting
Reference: Deng et al., Transformed Principal Gradient Orientation for Robust and Precise Batch Face Alignment. ACCV, 2014.
Slide 28
Aligning 1000 images of 100 subjects with real occlusion & lighting
[Figure: manually aligned images of the standard AR database]
1. Deng et al., Transformed Principal Gradient Orientation for Robust and Precise Batch Face Alignment. ACCV, 2014.
2. Deng et al., Transform-Invariant PCA: A Unified Approach to Fully Automatic Face Alignment, Representation, and Recognition. PAMI 36(6): 1275-1284, 2014.
Slide 29
Take-Home Messages
Extended linear models (combination or projection) for the undersampled face recognition problem.
Human faces share similar shape, which makes recognition difficult but makes knowledge transfer easy.
TIPCA: A unified framework for face registration, representation, and recognition.
Automatic alignment by image plane could be more precise than human labeling of landmarks.
Benchmark databases could be redefined to ensure they are meaningful for real-world applications.