17
1 Component Analysis for Signal Processing 1 Advanced Component Analysis Methods for Signal Processing Fernando De la Torre Fernando De la Torre [email protected] [email protected] Component Analysis for Signal Processing 2 Outline Introduction Principal Component Analysis (PCA) Robust Principal CA PCA missing data Parameterized Principal CA Linear Discriminant Analysis Multimodal Oriented Discriminant Analysis Canonical Correlation Analysis Canonical Time Warping K-means Aligned Cluster Analysis (ACA) Discriminative Cluster Analysis (DCA) Component Analysis for Signal Processing 3 Robust PCA Two types of outliers: Sample outliers Intra-sample outliers (Xu & Yuille., 1995) (de la Torre & Black, 2001b; Skocaj & Leonardis, 2003) Standard PCA solution (noisy data): Component Analysis for Signal Processing 4 Robust PCA Using robust statistics: ∑∑ == = = n i d p p k j ji pj p pi rpca c b d E 1 1 1 ) , ( ) , , ( σ μ ρ μ C B Pixel residual mean Basis (B) & Coefficients(c) robust robust quadratic outlier outlier (de la Torre & Black, 2001b; de la Torre & Black, 2003a)

RobustPCA Robust PCA

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: RobustPCA Robust PCA

1

Component Analysis for Signal Processing 1

Advanced Component Analysis

Methods for Signal Processing

Fernando De la TorreFernando De la Torre

[email protected]@cs.cmu.edu

Component Analysis for Signal Processing 2

Outline• Introduction

• Principal Component Analysis (PCA)– Robust Principal CA

– PCA missing data

– Parameterized Principal CA

• Linear Discriminant Analysis– Multimodal Oriented Discriminant Analysis

• Canonical Correlation Analysis– Canonical Time Warping

• K-means– Aligned Cluster Analysis (ACA)

– Discriminative Cluster Analysis (DCA)

Component Analysis for Signal Processing 3

Robust PCA•Two types of outliers:

Sample outliers Intra-sample outliers(Xu & Yuille., 1995) (de la Torre & Black, 2001b; Skocaj & Leonardis, 2003)

•Standard PCA solution (noisy data):

Component Analysis for Signal Processing 4

Robust PCA• Using robust statistics:

∑∑ ∑= = =

−−=n

i

d

p

p

k

j

jipjppirpca cbdE1 1 1

),(),,( σµρµCB

Pixel residual

meanBasis (B) &

Coefficients(c)

robustrobust

quadratic

outlieroutlier

(de la Torre & Black, 2001b; de la Torre & Black, 2003a)

Page 2: RobustPCA Robust PCA

2

Component Analysis for Signal Processing 5

Numerical Problems• No closed form solution in terms of an eigen-equation.

• Deflation approaches do not hold.

L

T

T

11

uuAA

uuAA

222

1

'''

'

λ

λ

−=

−=First eigenvector with

highest eigenvalue.

Second eigenvector with

highest eigenvalue.

• In the robust case all the basis have to be computed

simultaneously (including the mean).

Component Analysis for Signal Processing 6

How to Optimize it?

∑∑ ∑= = =

−−=n

i

d

p

p

k

j

jipjppirpca cbdE1 1 1

),(),,( σµρµCB

[ ]

[ ]C

HCC

BHBB

∂−=

∂−=

−+

−+

rpca

c

nn

rpca

b

nn

E

E

o

o

11

11

)(max

)(max

2

2

T

ii

rpca

T

ii

rpca

Ediag

Ediag

ccH

bbH

c

b

∂∂

∂=

∂∂

∂=

(Blake & Zisserman, 1987)

• Normalized Gradient descent

• Deterministic annealing methods to avoid local minima.

Component Analysis for Signal Processing 7

Example

• Small region

• Short amount of time

Statistical

outlier

Component Analysis for Signal Processing 8

Robust PCA

Original PCA RPCA Outliers

Page 3: RobustPCA Robust PCA

3

Component Analysis for Signal Processing 9

Structure from Motion

• Robust estimation of coefficients (Black & Jepson, 1998; Leonardis & Bischof, 2000; Ke & Kanade, 2004)

• Robust estimation of basis and coefficients (Gabriel & Odoro, 1984; Croux & Filzmoser., 1981; Skocaj et al., 2002;Skocaj & Leonardis, 2003; de la Torre & Black, 2001b; de la Torre & Black, 2003a)

• Other Robust PCA techniques (sample outliers) (Campbell, 1980; Ruymagaart, 1981; Xu & Yuille., 1995)

More work on Robust PCA

Component Analysis for Signal Processing 10

PCA with Uncertainty and Missing Data

• If weights are separable closed-form solution.

• Generalized SVD

productHadamard

wij

nd

o

0≥

ℜ∈ ×W

D=

dnd

n

n

dd

dd

dd

LL

MOM

O

LL

1

221

111

=

r

d

r

r

r

w

w

w

...

2

1

w

( )cnccc www 21=w ……

T

r cwwW =

(Greenacre, 1984; Irani & Anandan, 2000;)

∑∑ ∑= = =

−=−=d

i

n

j

k

s

sjisijijFcbdwE

1 1

2

1

2 )()()( BCDWCB, o• Adding uncertainty

Component Analysis for Signal Processing 11

General Case

• For arbitrary weights no closed-form solution.

• Alternated least squares algorithms– Slow convergence, easy implementation.

• Damped Newton Algorithm– Fast convergence.

– H definite positive:

vvvv

C

Bv

CBBCDWCB,

∂∂

∂∂

−=

=

++−=−

+ 2

1

2

2

2)1(

212

][

)(

)(

||||||||)()(

EE

vec

vec

E

nn

FFFλλo

=

=

−−

=−−=−=

d

p

pTppTpTp

n

i

iii

T

iiF

diag

diagE

1

1

2

))(()(

))(()()()(

bCdwbCd

BcdwBcdBCDWCB, o

Iv

H λ+∂∂

=2

2

2Eeconvergencuntil

FFuntil

I

repeat

EE

repeat

10;

)()(

)(

10

1

2

2

2

2

λλ

λ

λλ

==

<

+−=

=

∂∂

=∂∂

=

yx

xy

gHxy

vg

vH

(Buchanan & Fitzgibbon., 2005)

(Torre & Black, 2003a)

Component Analysis for Signal Processing 12

Experiments

Page 4: RobustPCA Robust PCA

4

Component Analysis for Signal Processing 13

Outline• Introduction

• Principal Component Analysis (PCA)– Robust Principal CA

– PCA missing data

– Parameterized Principal CA

• Linear Discriminant Analysis– Multimodal Oriented Discriminant Analysis

• Canonical Correlation Analysis– Canonical Time Warping

• K-means– Aligned Cluster Analysis (ACA)

– Discriminative Cluster Analysis (DCA)

Component Analysis for Signal Processing 14

• Learn a subspace invariant to geometric transformations?

. . .

• Data has to be geometrically normalized

– Tedious manual cropping.

– Inaccuracies due to matching ambiguities.

– Hard to achieve sub-pixel accuracy.

Parameterized Component Analysis (PaCA)(de la Torre & Black, 2003b)

Component Analysis for Signal Processing 15

Error function for PaCA

)()()(),,( 21

1

2

1

caBc)af(x,daCBW

t ppET

t

tt++−=∑

=

Basis ((BB) &) &

coefficients ((cc))

Motion (warping)

Regularization

2

312

1 1

2

211

WWaΓacΓc −

= =− −+−∑∑ tat

T

t

L

l

tct λλ

Component Analysis for Signal Processing 16

EigenEye Learning

Page 5: RobustPCA Robust PCA

5

Component Analysis for Signal Processing 17

Outline• Introduction

• Principal Component Analysis (PCA)– Robust Principal CA

– PCA missing data

– Parameterized Principal CA

• Linear Discriminant Analysis– Multimodal Oriented Discriminant Analysis

• Canonical Correlation Analysis– Canonical Time Warping

• K-means– Aligned Cluster Analysis (ACA)

– Discriminative Cluster Analysis (DCA)

Component Analysis for Signal Processing 18

Multimodal Oriented Component

Analysis (MODA)

• How to extend LDA to deal with:

– Model class covariances.

– Multimodal classes.

– Deal efficiently with huge covariance matrices

(e.g. 100*100).

(de la Torre & Kanade, 2005a)

Component Analysis for Signal Processing 19

Multimodality

−200−100

0100

200

−200

0

200−10

−5

0

5

10

MODA

0 10 20 30 40 50 60−40

−30

−20

−10

0

10

20

30

40

0 10 20 30 40 50 60−10

−8

−6

−4

−2

0

2

4

6

8

10

LDA

Component Analysis for Signal Processing 20

∑ ∑ ∑ ∑= ≠ ∈ ∈

−−−−−+−++

classes

i

Tr

j

r

i

classes

ij Cr Cr

r

i

r

j

r

j

r

i

r

i

r

j

r

i

r

j

T

i j

tr1

1111

)))())(((( 21

1 2

21211212 BµµΣΣµµΣΣΣΣB

Multimodal Oriented Discriminant

Analysis (MODA)�B that MAXIMIZES the Kullback-Leibler divergence between clusters among classes.

• 1 mode per class and equal covariances equivalent to LDA.

Page 6: RobustPCA Robust PCA

6

Component Analysis for Signal Processing 21

Optimization

W1

∑=

−−=1

1 ))()(()(i

i

T

i

TtrJ BABBΣBB

• Hard optimization problem

T(B)

)()(

)()(

00 BB

BBB

JT

JT

=

∀≥

W0

J(B)

• Iterative Majorization (Kiers, 1995; Leeuw, 1994)

Component Analysis for Signal Processing 22

Majorization

• Slow convergence…

=

−−

=

−≥

−=

classes

i

i

T

i

T

T

i

T

i

TT

i

Tclasses

i

tr

Tii

1

1

2

1

002

1

2

1

2

1

1

))()((

||)()()(||)(

BABBΣB

ABBΣBBΣBABBΣBB

Component Analysis for Signal Processing 23

Face recognition from video

Photometric normalization

Geometric normalization

Clustering.

• Challenges

– Low quality small images (40-50 pixels).

– Changes in expression/pose/occlusion/illumination.

– Real time and scalable to several users.

FernandoCarlos

PaulNot in the databaseNot in the database

Prototypes.

(de la Torre et al., 2005b)

Component Analysis for Signal Processing 24

Pro

toty

pes

??40

40

B

40*40 16

Number of basis

Recognition

LDA

PCA

MODA

Page 7: RobustPCA Robust PCA

7

Component Analysis for Signal Processing 25

Outline• Introduction

• Principal Component Analysis (PCA)– Robust Principal CA

– PCA missing data

– Parameterized Principal CA

• Linear Discriminant Analysis– Multimodal Oriented Discriminant Analysis

• Canonical Correlation Analysis– Canonical Time Warping

• K-means– Aligned Cluster Analysis (ACA)

– Discriminative Cluster Analysis (DCA)

Component Analysis for Signal Processing

Problem

Component Analysis for Signal Processing

Canonical Correlation Analysis (CCA)(Hotelling 1936)

• CCA minimizes:

CCA

different #rows, same #columns

Spatial transformation

2

),(F

T

y

T

xyxccaJ YVXVVV −= b

y

TT

y

x

TT

xts I

VYYV

VXXV=

.

ndnd yx×× ℜ∈ℜ∈ YX ,

Component Analysis for Signal Processing

A least-square formulation for DTW

same #rows, different #columnsyx ndnd ×× ℜ∈ℜ∈ YX ,

2

),(F

T

y

T

xyxdtwJ YWXWWW −=

Page 8: RobustPCA Robust PCA

8

Component Analysis for Signal Processing

Canonical Time Warping (CTW)

different #rows, different #columns

temporal alignment

spatial transformation

Reminder

b

y

T

y

T

y

T

y

x

T

x

T

x

T

x

F

T

y

T

y

T

x

T

xyxyxctw

ts

J

IVYWYWV

VXWXWV

YWVXWVVVWW

=

−=

..

),,,(2

yyxxndnd ×× ℜ∈ℜ∈ YX ,

2

2

),(

),(

F

T

y

T

xyxdtw

F

T

y

T

xyxcca

J

J

YWXWWW

YVXVVV

−=

−=

Component Analysis for Signal Processing

Facial expression alignment

Component Analysis for Signal Processing

Facial expression alignment

Component Analysis for Signal Processing

Aligning human motion

Boxing Opening a cabinet

Page 9: RobustPCA Robust PCA

9

Component Analysis for Signal Processing

Aligning motion capture and video

Component Analysis for Signal Processing 34

Outline• Introduction

• Principal Component Analysis (PCA)– Robust Principal CA

– PCA missing data

– Parameterized Principal CA

• Linear Discriminant Analysis– Multimodal Oriented Discriminant Analysis

• Canonical Correlation Analysis– Canonical Time Warping

• K-means– Aligned Cluster Analysis (ACA)

– Discriminative Cluster Analysis (DCA)

Component Analysis for Signal Processing

Problem

• Mining facial expression

Component Analysis for Signal Processing

• Mining facial expression for one subject• Mining facial expression for one subject

• Summarization

• Visualization

• Indexing

Problem

Page 10: RobustPCA Robust PCA

10

Component Analysis for Signal Processing

• Mining facial expression for one subject

Looking up Sleeping SmilingLooking

forwardWaking up

• Summarization

• Visualization

• Indexing

Problem

Component Analysis for Signal Processing

• Mining facial expression of one subject

• Summarization

• Embedding

• Indexing

Problem

Component Analysis for Signal Processing

• Mining facial expression for one subject

• Summarization

• Embedding

• Indexing

Problem

Component Analysis for Signal Processing

=MGxy

k-means and kernel k-means

2||||),( FJ MGXGM −=

(MacQueen 67, Ding et al. 02, Dhillon et al. 04, Zass and Shashua 05, De la Torre 06)

1

2

3

4

5

6

7

8

9

10

x

y

=G

xy=X

)))((()( 1

n GGGGIKG−−= TTtrJ

)(φ

)()( XXK φφ T=

=Mxy

=G

Page 11: RobustPCA Robust PCA

11

Component Analysis for Signal Processing

1 2 3 1

Labels (G)

Problem formulation for ACA (I)

1s 2s 3s 4sStart and end of the segments (s)

s 2)(),,(

FacaJ MGXGM −= ϕ )..[)..[)..[ 13221,...,,

mm ssssss −XXX

)..[ 21 ssX)..[ 43 ssX

)..[ 1 mm ss −X

Component Analysis for Signal Processing

( )∑∑= =

−=+

k

c

cSS

m

i

ci mgii

1

2

2)..[

11

Dynamic Time Alignment Kernel (Shimodaira et al. 01)

X[Si , Si+1)mc

X [Si , Si+1)

mc

Problem formulation for ACA (II)

2

)..[)..[)..[ ),...,,(),,(13221 Fssssssaca mm

J MGXXXSGM −=−

ϕ

Component Analysis for Signal Processing

Matrix formulation for ACA

GGGGILKL1

n )(with)( −−== TT

kmk trJ

samples

segments

}{ 2371,0

×∈H

GHGGGHILWLK1

n )(with))o(( −−== TTT

aca trJ

2323×∈RW

clusters

segments

}{ 731,0

×∈G

)()( XXK φφ T=

Component Analysis for Signal Processing

Optimizing ACA (forward step)

• Efficient Dynamic Programming

i =23 i =25 i =29

2.11.81.7

2.41.21.8

2.41.91.5

maxw

Page 12: RobustPCA Robust PCA

12

Component Analysis for Signal Processing

Optimizing ACA (backward step)

)( max

2wnO

Component Analysis for Signal Processing

Three behaviors:

1-waggle, 2-left turn, 3-right turn

Seq 1 Seq 2 Seq 3 Seq 4 Seq 5 Seq 6

ACA 0.845 0.925 0.600 0.922 0.878 0.928

PS- SLDS (Oh et al 08) 0.759 0.924 0.831 0.934 0.904 0.910

HDP- VAR(1)-HMM

(Fox et al 08)

0.465 0.441 0.456 0.832 0.932 0.887

Spectral Clustering 0.698 0.631 0.509 0.671 0.577 0.649

Honey bee dance data (Oh et al. 08)

Component Analysis for Signal Processing

Facial image features

Appearance

• Active Appearance Models (Baker and Matthews ‘04)

Upper

face

Lower

face

Shape• Image features

Component Analysis for Signal Processing

• Cohn-Kanade: 30 people and five different

expressions (surprise, joy, sadness, fear, anger)

Facial event discovery across subjects

Page 13: RobustPCA Robust PCA

13

Component Analysis for Signal Processing

ACA Spectral

Clustering

(SC)

0.87(.05) 0.56(.04)

• Cohn-Kanade: 30 people and five different

expressions (surprise, joy, sadness, fear, anger)

Facial event discovery across subjects

• 10 sets of 30 people

Component Analysis for Signal Processing

Unsupervised facial event discovery

Component Analysis for Signal Processing

Clustering human motion

Component Analysis for Signal Processing

clustering of human motion II

Page 14: RobustPCA Robust PCA

14

Component Analysis for Signal Processing 53

Outline• Introduction

• Principal Component Analysis (PCA)– Robust Principal CA

– PCA missing data

– Parameterized Principal CA

• Linear Discriminant Analysis– Multimodal Oriented Discriminant Analysis

• Canonical Correlation Analysis– Canonical Time Warping

• K-means– Aligned Cluster Analysis (ACA)

– Discriminative Cluster Analysis (DCA)

Component Analysis for Signal Processing 54

Problem

−10

−5

0

5

10 −10−5

05

10

−20

−15

−10

−5

0

5

10

15

20

YX

Z

−8 −6 −4 −2 0 2 4 6 8−10

−8

−6

−4

−2

0

2

4

6

8

X

Y

−10−5

05

10 −10−5

05

10−20

−15

−10

−5

0

5

10

15

20

Y

X

Z

−10

−5

0

5

10 −10−5

05

10

−20

−15

−10

−5

0

5

10

15

20

YX

Z

PCA+k-means DCA

Component Analysis for Signal Processing 55

Discriminative Cluster Analysis (DCA)

• Generative clustering (e.g. k-means):

• Not efficient for high dimensional data.

• Multiple local minima.

• Discriminative clustering (de la Torre & Kanade, 2006):

• Simultaneous dimensionality reduction and clustering.

nkij

c

i Cj

ijF

T

g

Ei

1G1

bdBGDBG

=∈

−=−= ∑∑= ∈

}1,0{

||||||||),(1

F

TTE ||)()(||),,( 2

1

DVBGGGGBV T−=−

(de la Torre & Kanade, 2006)

Component Analysis for Signal Processing 56

Optimization

• Eliminate V

• Optimize for B

• Optimize for G

)))(()((),( 11 BDGGGDGBBDDBGB TTTTTTtrE −−∝

Λ=−BDDBDGGGDG

TTTT 1)(

11

)()1(

)())(( −−

+

−=∂∂

∂∂

−==

GGAGGGGGIV

VVVVVG

TTT

C

nn

E

Eηo

SbSt

DBCCC)(CCATTT == −1

Page 15: RobustPCA Robust PCA

15

Component Analysis for Signal Processing 57

Experiments

20 40 60 80 100 120 140

20

40

60

80

100

120

140

−0.4−0.2

00.2

0.4

−0.2

0

0.2

0.4

0.6−0.2

−0.1

0

0.1

0.2

PCA

PCA DCA

≈− TT GGGG 1)(

Component Analysis for Signal Processing 58

DCA vs. PCA+k-means

5 10 15 20 25 30 35 400.65

0.7

0.75

0.8

0.85

0.9

0.95

1

1.05

Number of clusters (classes)

Acc

urac

y

DCA

PCA+k−means

Component Analysis for Signal Processing 59

What’s next?

CA

Component Analysis for Signal Processing 60

Bibliography

Page 16: RobustPCA Robust PCA

16

Component Analysis for Signal Processing 61 Component Analysis for Signal Processing 62

Component Analysis for Signal Processing 63 Component Analysis for Signal Processing 64

Page 17: RobustPCA Robust PCA

17

Component Analysis for Signal Processing 65 Component Analysis for Signal Processing 66

Bibliography

Zhou F., De la Torre F. and Hodgins J. (2008) "Aligned Cluster

Analysis for Temporal Segmentation of Human Motion“ IEEE

Conference on Automatic Face and Gestures Recognition, September,

2008.

De la Torre, F. and Nguyen, M. (2008) “Parameterized Kernel

Principal Component Analysis: Theory and Applications to

Supervised and Unsupervised Image Alignment“ IEEE Conference

on Computer Vision and Pattern Recognition, June, 2008.