Upload
others
View
1
Download
0
Embed Size (px)
Citation preview
1
Component Analysis for Signal Processing 1
Advanced Component Analysis
Methods for Signal Processing
Fernando De la TorreFernando De la Torre
[email protected]@cs.cmu.edu
Component Analysis for Signal Processing 2
Outline• Introduction
• Principal Component Analysis (PCA)– Robust Principal CA
– PCA missing data
– Parameterized Principal CA
• Linear Discriminant Analysis– Multimodal Oriented Discriminant Analysis
• Canonical Correlation Analysis– Canonical Time Warping
• K-means– Aligned Cluster Analysis (ACA)
– Discriminative Cluster Analysis (DCA)
Component Analysis for Signal Processing 3
Robust PCA•Two types of outliers:
Sample outliers Intra-sample outliers(Xu & Yuille., 1995) (de la Torre & Black, 2001b; Skocaj & Leonardis, 2003)
•Standard PCA solution (noisy data):
Component Analysis for Signal Processing 4
Robust PCA• Using robust statistics:
∑∑ ∑= = =
−−=n
i
d
p
p
k
j
jipjppirpca cbdE1 1 1
),(),,( σµρµCB
Pixel residual
meanBasis (B) &
Coefficients(c)
robustrobust
quadratic
outlieroutlier
(de la Torre & Black, 2001b; de la Torre & Black, 2003a)
2
Component Analysis for Signal Processing 5
Numerical Problems• No closed form solution in terms of an eigen-equation.
• Deflation approaches do not hold.
L
T
T
11
uuAA
uuAA
222
1
'''
'
λ
λ
−=
−=First eigenvector with
highest eigenvalue.
Second eigenvector with
highest eigenvalue.
• In the robust case all the basis have to be computed
simultaneously (including the mean).
Component Analysis for Signal Processing 6
How to Optimize it?
∑∑ ∑= = =
−−=n
i
d
p
p
k
j
jipjppirpca cbdE1 1 1
),(),,( σµρµCB
[ ]
[ ]C
HCC
BHBB
∂
∂−=
∂
∂−=
−+
−+
rpca
c
nn
rpca
b
nn
E
E
o
o
11
11
)(max
)(max
2
2
T
ii
rpca
T
ii
rpca
Ediag
Ediag
ccH
bbH
c
b
∂∂
∂=
∂∂
∂=
(Blake & Zisserman, 1987)
• Normalized Gradient descent
• Deterministic annealing methods to avoid local minima.
Component Analysis for Signal Processing 7
Example
• Small region
• Short amount of time
Statistical
outlier
Component Analysis for Signal Processing 8
Robust PCA
Original PCA RPCA Outliers
3
Component Analysis for Signal Processing 9
Structure from Motion
• Robust estimation of coefficients (Black & Jepson, 1998; Leonardis & Bischof, 2000; Ke & Kanade, 2004)
• Robust estimation of basis and coefficients (Gabriel & Odoro, 1984; Croux & Filzmoser., 1981; Skocaj et al., 2002;Skocaj & Leonardis, 2003; de la Torre & Black, 2001b; de la Torre & Black, 2003a)
• Other Robust PCA techniques (sample outliers) (Campbell, 1980; Ruymagaart, 1981; Xu & Yuille., 1995)
More work on Robust PCA
Component Analysis for Signal Processing 10
PCA with Uncertainty and Missing Data
• If weights are separable closed-form solution.
• Generalized SVD
productHadamard
wij
nd
o
0≥
ℜ∈ ×W
D=
dnd
n
n
dd
dd
dd
LL
MOM
O
LL
1
221
111
=
r
d
r
r
r
w
w
w
...
2
1
w
( )cnccc www 21=w ……
T
r cwwW =
(Greenacre, 1984; Irani & Anandan, 2000;)
∑∑ ∑= = =
−=−=d
i
n
j
k
s
sjisijijFcbdwE
1 1
2
1
2 )()()( BCDWCB, o• Adding uncertainty
Component Analysis for Signal Processing 11
General Case
• For arbitrary weights no closed-form solution.
• Alternated least squares algorithms– Slow convergence, easy implementation.
• Damped Newton Algorithm– Fast convergence.
– H definite positive:
vvvv
C
Bv
CBBCDWCB,
∂∂
∂∂
−=
=
++−=−
+ 2
1
2
2
2)1(
212
][
)(
)(
||||||||)()(
EE
vec
vec
E
nn
FFFλλo
∑
∑
=
=
−−
=−−=−=
d
p
pTppTpTp
n
i
iii
T
iiF
diag
diagE
1
1
2
))(()(
))(()()()(
bCdwbCd
BcdwBcdBCDWCB, o
Iv
H λ+∂∂
=2
2
2Eeconvergencuntil
FFuntil
I
repeat
EE
repeat
10;
)()(
)(
10
1
2
2
2
2
λλ
λ
λλ
==
<
+−=
=
∂∂
=∂∂
=
−
yx
xy
gHxy
vg
vH
(Buchanan & Fitzgibbon., 2005)
(Torre & Black, 2003a)
Component Analysis for Signal Processing 12
Experiments
4
Component Analysis for Signal Processing 13
Outline• Introduction
• Principal Component Analysis (PCA)– Robust Principal CA
– PCA missing data
– Parameterized Principal CA
• Linear Discriminant Analysis– Multimodal Oriented Discriminant Analysis
• Canonical Correlation Analysis– Canonical Time Warping
• K-means– Aligned Cluster Analysis (ACA)
– Discriminative Cluster Analysis (DCA)
Component Analysis for Signal Processing 14
• Learn a subspace invariant to geometric transformations?
. . .
• Data has to be geometrically normalized
– Tedious manual cropping.
– Inaccuracies due to matching ambiguities.
– Hard to achieve sub-pixel accuracy.
Parameterized Component Analysis (PaCA)(de la Torre & Black, 2003b)
Component Analysis for Signal Processing 15
Error function for PaCA
)()()(),,( 21
1
2
1
caBc)af(x,daCBW
t ppET
t
tt++−=∑
=
Basis ((BB) &) &
coefficients ((cc))
Motion (warping)
Regularization
2
312
1 1
2
211
WWaΓacΓc −
= =− −+−∑∑ tat
T
t
L
l
tct λλ
Component Analysis for Signal Processing 16
EigenEye Learning
5
Component Analysis for Signal Processing 17
Outline• Introduction
• Principal Component Analysis (PCA)– Robust Principal CA
– PCA missing data
– Parameterized Principal CA
• Linear Discriminant Analysis– Multimodal Oriented Discriminant Analysis
• Canonical Correlation Analysis– Canonical Time Warping
• K-means– Aligned Cluster Analysis (ACA)
– Discriminative Cluster Analysis (DCA)
Component Analysis for Signal Processing 18
Multimodal Oriented Component
Analysis (MODA)
• How to extend LDA to deal with:
– Model class covariances.
– Multimodal classes.
– Deal efficiently with huge covariance matrices
(e.g. 100*100).
(de la Torre & Kanade, 2005a)
Component Analysis for Signal Processing 19
Multimodality
−200−100
0100
200
−200
0
200−10
−5
0
5
10
MODA
0 10 20 30 40 50 60−40
−30
−20
−10
0
10
20
30
40
0 10 20 30 40 50 60−10
−8
−6
−4
−2
0
2
4
6
8
10
LDA
Component Analysis for Signal Processing 20
∑ ∑ ∑ ∑= ≠ ∈ ∈
−−−−−+−++
classes
i
Tr
j
r
i
classes
ij Cr Cr
r
i
r
j
r
j
r
i
r
i
r
j
r
i
r
j
T
i j
tr1
1111
)))())(((( 21
1 2
21211212 BµµΣΣµµΣΣΣΣB
Multimodal Oriented Discriminant
Analysis (MODA)�B that MAXIMIZES the Kullback-Leibler divergence between clusters among classes.
• 1 mode per class and equal covariances equivalent to LDA.
6
Component Analysis for Signal Processing 21
Optimization
W1
∑=
−−=1
1 ))()(()(i
i
T
i
TtrJ BABBΣBB
• Hard optimization problem
T(B)
)()(
)()(
00 BB
BBB
JT
JT
=
∀≥
W0
J(B)
• Iterative Majorization (Kiers, 1995; Leeuw, 1994)
Component Analysis for Signal Processing 22
Majorization
• Slow convergence…
∑
∑
=
−
−−
=
−≥
−=
classes
i
i
T
i
T
T
i
T
i
TT
i
Tclasses
i
tr
Tii
1
1
2
1
002
1
2
1
2
1
1
))()((
||)()()(||)(
BABBΣB
ABBΣBBΣBABBΣBB
Component Analysis for Signal Processing 23
Face recognition from video
Photometric normalization
Geometric normalization
Clustering.
• Challenges
– Low quality small images (40-50 pixels).
– Changes in expression/pose/occlusion/illumination.
– Real time and scalable to several users.
FernandoCarlos
PaulNot in the databaseNot in the database
Prototypes.
(de la Torre et al., 2005b)
Component Analysis for Signal Processing 24
Pro
toty
pes
??40
40
B
40*40 16
Number of basis
Recognition
LDA
PCA
MODA
7
Component Analysis for Signal Processing 25
Outline• Introduction
• Principal Component Analysis (PCA)– Robust Principal CA
– PCA missing data
– Parameterized Principal CA
• Linear Discriminant Analysis– Multimodal Oriented Discriminant Analysis
• Canonical Correlation Analysis– Canonical Time Warping
• K-means– Aligned Cluster Analysis (ACA)
– Discriminative Cluster Analysis (DCA)
Component Analysis for Signal Processing
Problem
Component Analysis for Signal Processing
Canonical Correlation Analysis (CCA)(Hotelling 1936)
• CCA minimizes:
CCA
different #rows, same #columns
Spatial transformation
2
),(F
T
y
T
xyxccaJ YVXVVV −= b
y
TT
y
x
TT
xts I
VYYV
VXXV=
.
ndnd yx×× ℜ∈ℜ∈ YX ,
Component Analysis for Signal Processing
A least-square formulation for DTW
same #rows, different #columnsyx ndnd ×× ℜ∈ℜ∈ YX ,
2
),(F
T
y
T
xyxdtwJ YWXWWW −=
8
Component Analysis for Signal Processing
Canonical Time Warping (CTW)
different #rows, different #columns
temporal alignment
spatial transformation
Reminder
b
y
T
y
T
y
T
y
x
T
x
T
x
T
x
F
T
y
T
y
T
x
T
xyxyxctw
ts
J
IVYWYWV
VXWXWV
YWVXWVVVWW
=
−=
..
),,,(2
yyxxndnd ×× ℜ∈ℜ∈ YX ,
2
2
),(
),(
F
T
y
T
xyxdtw
F
T
y
T
xyxcca
J
J
YWXWWW
YVXVVV
−=
−=
Component Analysis for Signal Processing
Facial expression alignment
Component Analysis for Signal Processing
Facial expression alignment
Component Analysis for Signal Processing
Aligning human motion
Boxing Opening a cabinet
9
Component Analysis for Signal Processing
Aligning motion capture and video
Component Analysis for Signal Processing 34
Outline• Introduction
• Principal Component Analysis (PCA)– Robust Principal CA
– PCA missing data
– Parameterized Principal CA
• Linear Discriminant Analysis– Multimodal Oriented Discriminant Analysis
• Canonical Correlation Analysis– Canonical Time Warping
• K-means– Aligned Cluster Analysis (ACA)
– Discriminative Cluster Analysis (DCA)
Component Analysis for Signal Processing
Problem
• Mining facial expression
Component Analysis for Signal Processing
• Mining facial expression for one subject• Mining facial expression for one subject
• Summarization
• Visualization
• Indexing
Problem
10
Component Analysis for Signal Processing
• Mining facial expression for one subject
Looking up Sleeping SmilingLooking
forwardWaking up
• Summarization
• Visualization
• Indexing
Problem
Component Analysis for Signal Processing
• Mining facial expression of one subject
• Summarization
• Embedding
• Indexing
Problem
Component Analysis for Signal Processing
• Mining facial expression for one subject
• Summarization
• Embedding
• Indexing
Problem
Component Analysis for Signal Processing
=MGxy
k-means and kernel k-means
2||||),( FJ MGXGM −=
(MacQueen 67, Ding et al. 02, Dhillon et al. 04, Zass and Shashua 05, De la Torre 06)
1
2
3
4
5
6
7
8
9
10
x
y
=G
xy=X
)))((()( 1
n GGGGIKG−−= TTtrJ
)(φ
)()( XXK φφ T=
=Mxy
=G
11
Component Analysis for Signal Processing
1 2 3 1
Labels (G)
Problem formulation for ACA (I)
1s 2s 3s 4sStart and end of the segments (s)
s 2)(),,(
FacaJ MGXGM −= ϕ )..[)..[)..[ 13221,...,,
mm ssssss −XXX
)..[ 21 ssX)..[ 43 ssX
)..[ 1 mm ss −X
Component Analysis for Signal Processing
( )∑∑= =
−=+
k
c
cSS
m
i
ci mgii
1
2
2)..[
11
Xϕ
Dynamic Time Alignment Kernel (Shimodaira et al. 01)
X[Si , Si+1)mc
X [Si , Si+1)
mc
Problem formulation for ACA (II)
2
)..[)..[)..[ ),...,,(),,(13221 Fssssssaca mm
J MGXXXSGM −=−
ϕ
Component Analysis for Signal Processing
Matrix formulation for ACA
GGGGILKL1
n )(with)( −−== TT
kmk trJ
samples
segments
}{ 2371,0
×∈H
GHGGGHILWLK1
n )(with))o(( −−== TTT
aca trJ
2323×∈RW
clusters
segments
}{ 731,0
×∈G
)()( XXK φφ T=
Component Analysis for Signal Processing
Optimizing ACA (forward step)
• Efficient Dynamic Programming
i =23 i =25 i =29
2.11.81.7
2.41.21.8
2.41.91.5
maxw
12
Component Analysis for Signal Processing
Optimizing ACA (backward step)
)( max
2wnO
Component Analysis for Signal Processing
Three behaviors:
1-waggle, 2-left turn, 3-right turn
Seq 1 Seq 2 Seq 3 Seq 4 Seq 5 Seq 6
ACA 0.845 0.925 0.600 0.922 0.878 0.928
PS- SLDS (Oh et al 08) 0.759 0.924 0.831 0.934 0.904 0.910
HDP- VAR(1)-HMM
(Fox et al 08)
0.465 0.441 0.456 0.832 0.932 0.887
Spectral Clustering 0.698 0.631 0.509 0.671 0.577 0.649
Honey bee dance data (Oh et al. 08)
Component Analysis for Signal Processing
Facial image features
Appearance
• Active Appearance Models (Baker and Matthews ‘04)
Upper
face
Lower
face
Shape• Image features
Component Analysis for Signal Processing
• Cohn-Kanade: 30 people and five different
expressions (surprise, joy, sadness, fear, anger)
Facial event discovery across subjects
13
Component Analysis for Signal Processing
ACA Spectral
Clustering
(SC)
0.87(.05) 0.56(.04)
• Cohn-Kanade: 30 people and five different
expressions (surprise, joy, sadness, fear, anger)
Facial event discovery across subjects
• 10 sets of 30 people
Component Analysis for Signal Processing
Unsupervised facial event discovery
Component Analysis for Signal Processing
Clustering human motion
Component Analysis for Signal Processing
clustering of human motion II
14
Component Analysis for Signal Processing 53
Outline• Introduction
• Principal Component Analysis (PCA)– Robust Principal CA
– PCA missing data
– Parameterized Principal CA
• Linear Discriminant Analysis– Multimodal Oriented Discriminant Analysis
• Canonical Correlation Analysis– Canonical Time Warping
• K-means– Aligned Cluster Analysis (ACA)
– Discriminative Cluster Analysis (DCA)
Component Analysis for Signal Processing 54
Problem
−10
−5
0
5
10 −10−5
05
10
−20
−15
−10
−5
0
5
10
15
20
YX
Z
−8 −6 −4 −2 0 2 4 6 8−10
−8
−6
−4
−2
0
2
4
6
8
X
Y
−10−5
05
10 −10−5
05
10−20
−15
−10
−5
0
5
10
15
20
Y
X
Z
−10
−5
0
5
10 −10−5
05
10
−20
−15
−10
−5
0
5
10
15
20
YX
Z
PCA+k-means DCA
Component Analysis for Signal Processing 55
Discriminative Cluster Analysis (DCA)
• Generative clustering (e.g. k-means):
• Not efficient for high dimensional data.
• Multiple local minima.
• Discriminative clustering (de la Torre & Kanade, 2006):
• Simultaneous dimensionality reduction and clustering.
nkij
c
i Cj
ijF
T
g
Ei
1G1
bdBGDBG
=∈
−=−= ∑∑= ∈
}1,0{
||||||||),(1
F
TTE ||)()(||),,( 2
1
DVBGGGGBV T−=−
(de la Torre & Kanade, 2006)
Component Analysis for Signal Processing 56
Optimization
• Eliminate V
• Optimize for B
• Optimize for G
)))(()((),( 11 BDGGGDGBBDDBGB TTTTTTtrE −−∝
Λ=−BDDBDGGGDG
TTTT 1)(
11
)()1(
)())(( −−
+
−=∂∂
∂∂
−==
GGAGGGGGIV
VVVVVG
TTT
C
nn
E
Eηo
SbSt
DBCCC)(CCATTT == −1
15
Component Analysis for Signal Processing 57
Experiments
20 40 60 80 100 120 140
20
40
60
80
100
120
140
−0.4−0.2
00.2
0.4
−0.2
0
0.2
0.4
0.6−0.2
−0.1
0
0.1
0.2
PCA
PCA DCA
≈− TT GGGG 1)(
Component Analysis for Signal Processing 58
DCA vs. PCA+k-means
5 10 15 20 25 30 35 400.65
0.7
0.75
0.8
0.85
0.9
0.95
1
1.05
Number of clusters (classes)
Acc
urac
y
DCA
PCA+k−means
Component Analysis for Signal Processing 59
What’s next?
CA
Component Analysis for Signal Processing 60
Bibliography
16
Component Analysis for Signal Processing 61 Component Analysis for Signal Processing 62
Component Analysis for Signal Processing 63 Component Analysis for Signal Processing 64
17
Component Analysis for Signal Processing 65 Component Analysis for Signal Processing 66
Bibliography
Zhou F., De la Torre F. and Hodgins J. (2008) "Aligned Cluster
Analysis for Temporal Segmentation of Human Motion“ IEEE
Conference on Automatic Face and Gestures Recognition, September,
2008.
De la Torre, F. and Nguyen, M. (2008) “Parameterized Kernel
Principal Component Analysis: Theory and Applications to
Supervised and Unsupervised Image Alignment“ IEEE Conference
on Computer Vision and Pattern Recognition, June, 2008.