An Alternating Direction Algorithm for Structure-enforced Matrix Factorization Lijun Xu (Dalian University of Technology) Supervised by Bo Yu (DUT) Yin Zhang (Rice University) March 27, 2013

An Alternating Direction Algorithm for Structure-enforced

Matrix Factorization

Lijun Xu (Dalian University of Technology)

Supervised by

Bo Yu (DUT) Yin Zhang (Rice University)

March 27, 2013

Outline

• Introduction
• Alternating Direction Method (ADM)
• ADM Extension to SeMF
• Numerical experiments
• Conclusion

Introduction

• Matrix Factorization

$$\min_{X,Y}\ \tfrac{1}{2}\|M - XY\|_F^2, \qquad M \in \mathbb{R}^{m\times n},\ X \in \mathbb{R}^{m\times k},\ Y \in \mathbb{R}^{k\times n}$$

• Various factorizations require different constraints on $X$ and $Y$:
  a) Exact factorizations: LU, QR, SVD, eigendecomposition, etc.
  b) Recent approximate factorizations: NMF, K-means, sparse PCA, matrix completion, dictionary learning, etc.

Introduction

• In practice, many constraints on $X$ and $Y$ impose structural properties such as non-negativity, sparsity, orthogonality, normalization, etc., which allow easy 'projections'.

• Structure-enforced Matrix Factorization (SeMF)

$$\min_{X,Y}\ \tfrac{1}{2}\|M - XY\|_F^2 \quad \text{s.t.}\quad X \in \mathcal{X},\ Y \in \mathcal{Y}$$

where $\mathcal{X}$ and $\mathcal{Y}$ are easily projectable sets.

Introduction

• Some examples of easily projectable sets $\mathcal{X}$, each with its projection $\mathcal{P}$:

Non-negativity: $\mathcal{X} = \{X : X_{ij} \ge 0\}$

$$\mathcal{P}(X)_{ij} = \begin{cases} X_{ij}, & X_{ij} \ge 0 \\ 0, & X_{ij} < 0 \end{cases}$$

Sparsity: $\mathcal{X} = \{X : \|X_i\|_0 \le k,\ i = 1, 2, \dots\}$

$$\mathcal{P}(X)_{ij} = \begin{cases} X_{ij}, & |X_{ij}| \text{ is among the } k \text{ largest absolute values of } X_i \\ 0, & \text{otherwise} \end{cases}$$

Orthogonality: $\mathcal{X} = \{X : X_i \perp X_J,\ i \in I\}$

$$\mathcal{P}(X)_{i} = \begin{cases} \left(I - X_J (X_J^T X_J)^{-1} X_J^T\right) X_i, & i \in I \\ X_j, & j \in J \end{cases}$$
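The three projections on this slide can be sketched in NumPy as follows. This is a minimal illustration rather than the author's code: the function names are hypothetical, and the sketch assumes that $X_i$ denotes the i-th column of $X$ and that `I` and `J` are lists of column indices.

```python
import numpy as np

def project_nonneg(X):
    """Non-negativity: set every negative entry to zero."""
    return np.maximum(X, 0.0)

def project_sparse_columns(X, k):
    """Sparsity: keep the k largest-magnitude entries of each column."""
    X = np.asarray(X, dtype=float)
    out = np.zeros_like(X)
    if k <= 0:
        return out
    k = min(k, X.shape[0])
    # Row indices of the k largest |entries| in every column.
    idx = np.argpartition(-np.abs(X), k - 1, axis=0)[:k, :]
    cols = np.arange(X.shape[1])
    out[idx, cols] = X[idx, cols]
    return out

def project_orthogonal(X, I, J):
    """Orthogonality: remove from columns I their component in span(X_J).

    Columns J are left unchanged. A least-squares solve stands in for
    X_J (X_J^T X_J)^{-1} X_J^T so that no explicit inverse is formed.
    """
    X = np.asarray(X, dtype=float)
    out = X.copy()
    XJ = X[:, J]
    coef, *_ = np.linalg.lstsq(XJ, X[:, I], rcond=None)
    out[:, I] = X[:, I] - XJ @ coef
    return out
```

Each function returns a new array, so any of them can be passed directly as the projection step of an alternating scheme.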

Introduction

Normalization: $\mathcal{X} = \{X : \|X_i\| \le 1,\ i = 1, 2, \dots\}$

$$\mathcal{P}(X)_{i} = \begin{cases} X_i / \|X_i\|, & \|X_i\| > 1 \\ X_i, & \|X_i\| \le 1 \end{cases}$$

Combinatorial structure: $\mathcal{X} = \{X = [X_{I_1}\ X_{I_2}\ \cdots\ X_{I_r}] : X_{I_i} \in \mathcal{X}_i,\ i = 1, 2, \dots, r\}$

$$\mathcal{P}(X) = [\mathcal{P}_1(X_{I_1})\ \ \mathcal{P}_2(X_{I_2})\ \ \cdots\ \ \mathcal{P}_r(X_{I_r})]$$

E.g. 3 groups, each group is sparse.

[Figure: a vector split into 3 groups, labeled "1 zero", "2 zeros", "1 zero".]
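The normalization and combinatorial projections can be sketched the same way. The function names below are hypothetical, and the sketch assumes that the groups $I_1, \dots, I_r$ are lists of column indices and that each group comes with its own projection function.

```python
import numpy as np

def project_unit_ball_columns(X):
    """Normalization: rescale columns with norm > 1 to norm 1, keep the rest."""
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=0)
    # Dividing by max(norm, 1) leaves columns with norm <= 1 untouched.
    return X / np.maximum(norms, 1.0)

def project_groups(X, groups, projections):
    """Combinatorial structure: apply projections[r] to the block X[:, groups[r]]."""
    X = np.asarray(X, dtype=float)
    out = X.copy()
    for cols, proj in zip(groups, projections):
        out[:, cols] = proj(X[:, cols])
    return out
```

For example, `project_groups(X, [[0, 1], [2]], [project_unit_ball_columns, lambda B: np.maximum(B, 0.0)])` would normalize the first two columns and clip the third to be non-negative.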

Introduction

• Problems with specific structural patterns

$$\min_{X,Y}\ \tfrac{1}{2}\|M - XY\|_F^2 \quad \text{s.t.}\quad X \in \mathcal{X},\ Y \in \mathcal{Y}$$

a) Sparse NMF: $X$ non-negative (+ sparse); $Y$ non-negative (+ sparse)
b) Sparse PCA: $X$ sparse; $Y$ column normalized
c) Dictionary learning for sparse representation: $X$ column normalized; $Y$ sparse
etc.

Alternating Direction Method

• Classic ADM: [formula not recovered] where the objective functions are convex and the constraint sets are closed and convex.
• Augmented Lagrangian: [formula not recovered]
• ADM: [formulas not recovered]
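For reference, the standard two-block form of ADM can be written as below. The symbols $f$, $g$, $A$, $B$, $b$, $\lambda$, $\beta$ and $\gamma$ are this sketch's own notation and an assumption; they are not taken from the slide. The sign convention is chosen to match the SeMF augmented Lagrangian used later in the talk.

$$\min_{x,y}\ f(x) + g(y) \quad \text{s.t.}\quad Ax + By = b,\ x \in \mathcal{X},\ y \in \mathcal{Y}$$

$$\mathcal{L}_A(x, y, \lambda) = f(x) + g(y) + \lambda^T (Ax + By - b) + \tfrac{\beta}{2}\|Ax + By - b\|_2^2$$

$$\begin{aligned}
x^{k+1} &\leftarrow \arg\min_{x \in \mathcal{X}}\ \mathcal{L}_A(x, y^k, \lambda^k), \\
y^{k+1} &\leftarrow \arg\min_{y \in \mathcal{Y}}\ \mathcal{L}_A(x^{k+1}, y, \lambda^k), \\
\lambda^{k+1} &\leftarrow \lambda^k + \gamma\beta\,(Ax^{k+1} + By^{k+1} - b).
\end{aligned}$$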

ADM Extension to SeMF

• Original model:

$$\min_{X,Y}\ \tfrac{1}{2}\|M - XY\|_F^2 \quad \text{s.t.}\quad X \in \mathcal{X},\ Y \in \mathcal{Y}$$

• Model with splitting variables:

$$\min_{X,Y,U,V}\ \tfrac{1}{2}\|M - XY\|_F^2 \quad \text{s.t.}\quad X - U = 0,\ Y - V = 0,\ U \in \mathcal{X},\ V \in \mathcal{Y}$$

The splitting variable $U$ separates $X$ from the constraint set $\mathcal{X}$ (similarly $V$ for $Y$); this separation facilitates alternating direction methods.

ADM framework for SeMF

• Augmented Lagrangian:

$$\mathcal{L}_A(X, Y, U, V, \Lambda, \Pi) = \tfrac{1}{2}\|M - XY\|_F^2 + \tfrac{\alpha}{2}\|X - U\|_F^2 + \tfrac{\beta}{2}\|Y - V\|_F^2 + \Lambda \bullet (X - U) + \Pi \bullet (Y - V)$$

where $\Lambda, \Pi$ are Lagrangian multipliers, $(\alpha, \beta) > 0$ are penalty parameters, and the product $A \bullet B = \sum_{i,j} a_{ij} b_{ij}$.

Minimize $\mathcal{L}_A$ with respect to $X$, $Y$, and $(U, V)$, one at a time while fixing the others, and then update $\Lambda, \Pi$ after each sweep of such alternating minimization.
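Each of these partial minimizations has a closed form. The short derivation below is a sketch obtained by setting the gradient of the augmented Lagrangian $\mathcal{L}_A(X,Y,U,V,\Lambda,\Pi) = \tfrac{1}{2}\|M - XY\|_F^2 + \tfrac{\alpha}{2}\|X - U\|_F^2 + \tfrac{\beta}{2}\|Y - V\|_F^2 + \Lambda \bullet (X - U) + \Pi \bullet (Y - V)$ to zero and by completing the square; $\mathcal{P}_{\mathcal{X}}$ and $I_k$ (projection onto $\mathcal{X}$ and the $k \times k$ identity) are notation assumed here.

$$\nabla_X \mathcal{L}_A = -(M - XY)Y^T + \alpha (X - U) + \Lambda = 0 \ \Longrightarrow\ X = (MY^T + \alpha U - \Lambda)(YY^T + \alpha I_k)^{-1}$$

$$\nabla_Y \mathcal{L}_A = -X^T (M - XY) + \beta (Y - V) + \Pi = 0 \ \Longrightarrow\ Y = (X^T X + \beta I_k)^{-1}(X^T M + \beta V - \Pi)$$

$$\tfrac{\alpha}{2}\|X - U\|_F^2 + \Lambda \bullet (X - U) = \tfrac{\alpha}{2}\left\|U - \left(X + \Lambda/\alpha\right)\right\|_F^2 - \tfrac{1}{2\alpha}\|\Lambda\|_F^2 \ \Longrightarrow\ U = \mathcal{P}_{\mathcal{X}}(X + \Lambda/\alpha)$$

The $V$-update follows in the same way with $(Y, \Pi, \beta, \mathcal{Y})$ in place of $(X, \Lambda, \alpha, \mathcal{X})$.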

ADM framework for SeMF

• Framework:

$$\begin{aligned}
X^{k+1} &\leftarrow \arg\min_{X}\ \mathcal{L}_A(X, Y^k, U^k, V^k, \Lambda^k, \Pi^k), \\
Y^{k+1} &\leftarrow \arg\min_{Y}\ \mathcal{L}_A(X^{k+1}, Y, U^k, V^k, \Lambda^k, \Pi^k), \\
U^{k+1} &\leftarrow \mathcal{P}_{\mathcal{X}}(X^{k+1} + \Lambda^k / \alpha), \\
V^{k+1} &\leftarrow \mathcal{P}_{\mathcal{Y}}(Y^{k+1} + \Pi^k / \beta), \\
\Lambda^{k+1} &\leftarrow \Lambda^k + \gamma\alpha\,(X^{k+1} - U^{k+1}), \\
\Pi^{k+1} &\leftarrow \Pi^k + \gamma\beta\,(Y^{k+1} - V^{k+1}).
\end{aligned}$$
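One sweep of this framework can be sketched in NumPy as below. This is an illustrative sketch under stated assumptions, not the author's implementation: the function name `semf_adm` and its arguments are hypothetical, the X- and Y-steps use the closed forms obtained by setting the gradient of the augmented Lagrangian to zero, the penalty parameters are kept fixed (the talk updates them adaptively), and the loop stops on the relative change of $\|M - XY\|_F$.

```python
import numpy as np

def semf_adm(M, k, proj_X, proj_Y, alpha=1.0, beta=1.0, gamma=1.0,
             tol=1e-6, max_iter=500, seed=0):
    """ADM sweeps for min 0.5*||M - XY||_F^2 with U = X in set X, V = Y in set Y.

    proj_X / proj_Y are caller-supplied projections onto the two sets.
    alpha and beta are held fixed here (a simplifying assumption).
    """
    rng = np.random.default_rng(seed)
    m, n = M.shape
    X = rng.standard_normal((m, k))
    Y = rng.standard_normal((k, n))
    U, V = proj_X(X), proj_Y(Y)
    Lam, Pi = np.zeros_like(X), np.zeros_like(Y)
    I_k = np.eye(k)
    f_old = np.linalg.norm(M - X @ Y)
    for _ in range(max_iter):
        # X-step: solve X (Y Y^T + alpha I) = M Y^T + alpha U - Lam.
        rhs = M @ Y.T + alpha * U - Lam
        X = np.linalg.solve(Y @ Y.T + alpha * I_k, rhs.T).T
        # Y-step: solve (X^T X + beta I) Y = X^T M + beta V - Pi.
        Y = np.linalg.solve(X.T @ X + beta * I_k, X.T @ M + beta * V - Pi)
        # Projection steps for the splitting variables.
        U = proj_X(X + Lam / alpha)
        V = proj_Y(Y + Pi / beta)
        # Multiplier updates with step length gamma.
        Lam = Lam + gamma * alpha * (X - U)
        Pi = Pi + gamma * beta * (Y - V)
        f_new = np.linalg.norm(M - X @ Y)
        if abs(f_new - f_old) <= tol * max(abs(f_old), 1e-12):
            break
        f_old = f_new
    return X, Y, U, V
```

The feasible iterates are `U` and `V`; `X` and `Y` only satisfy the constraints in the limit, so a caller would normally report `U` and `V` as the structured factors.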

Implementation

• Choice of $(\alpha, \beta, \gamma)$:
  Step length $\gamma \in (0, 1.618)$; we set $\gamma = 1$.
  Adaptive updating of $(\alpha, \beta)$.
  Motivation: fixed values often cause slow convergence and trapping in local minima.
  Intuition: balance the changes of the 3 terms $M - XY$, $X - U$ and $Y - V$.

• Stopping criterion: $|f_{k+1} - f_k| \,/\, |f_k| \le tol$, where $f_k = \|M - X^k Y^k\|_F$.

Implementation

• An updating strategy:

Implementation

• A simple example: solve

$$\min_{X,Y}\ \tfrac{1}{2}\|A - XY\|_F^2 \quad \text{s.t.}\quad \|x_i\|_2 = 1,\ \|y_i\|_0 \le 3$$

where $A = XY$, with
  $X$: random $40 \times 60$ matrix, $\|x_i\|_2 = 1$;
  $Y$: sparse $60 \times 1500$ matrix, each column has 3 non-zeros with random locations and values,

using different initial $(\alpha, \beta)$: $[1,\ 0.1] \times 10^{k-2} \times \|A\|$, $k = 1, \dots, 5$.

Numerical Experiments: Dictionary Learning

$$\min_{X,Y}\ \tfrac{1}{2}\|M - XY\|_F^2 \quad \text{s.t.}\quad \|x_i\|_2 \le 1,\ \|y_j\|_0 \le k,\ \forall\, i, j$$

$M$: samples of data; $X$: overcomplete dictionary matrix; $Y$: sparse representation of $M$.

Synthetic experiments (compared with K-SVD):
  $X^*$: random $20 \times 50$, columns normalized;
  $Y^*$: 3 random non-zeros in each column;
  $M$: $X^* Y^*$ + white Gaussian noise.

Denote $X$ as the learned dictionary. Distance measure:

$$dist(x_j^*, X) = \min_i \left(1 - |x_i^T x_j^*|\right)$$
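The synthetic setup described on this slide can be generated along the following lines. This is a sketch with assumptions: the function name `make_synthetic` is hypothetical, the default of 500 samples is arbitrary, and the non-zero values of $Y^*$ are drawn from a standard normal distribution, which the slide does not specify. The noise is scaled so that the signal-to-noise ratio, measured as $20\log_{10}(\|X^*Y^*\|_F / \|\text{noise}\|_F)$, equals the requested value in dB.

```python
import numpy as np

def make_synthetic(m=20, k=50, n=500, s=3, snr_db=20.0, seed=0):
    """Return (M, X_true, Y_true) with M = X_true @ Y_true + Gaussian noise."""
    rng = np.random.default_rng(seed)
    # Random dictionary with unit-norm columns.
    X_true = rng.standard_normal((m, k))
    X_true /= np.linalg.norm(X_true, axis=0)
    # Each column of Y_true gets s non-zeros at random positions.
    Y_true = np.zeros((k, n))
    for j in range(n):
        support = rng.choice(k, size=s, replace=False)
        Y_true[support, j] = rng.standard_normal(s)
    clean = X_true @ Y_true
    noise = rng.standard_normal(clean.shape)
    # Rescale the noise to hit the requested SNR in dB.
    noise *= np.linalg.norm(clean) / (np.linalg.norm(noise) * 10 ** (snr_db / 20))
    return clean + noise, X_true, Y_true
```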

Numerical Experiments

Test:
a) Solve with different numbers of samples and compute the percentage of recovered columns. A column $x_j^*$ is counted as recovered if $dist(x_j^*, X) \le 0.01$, and we define $dist(X^*, X) = mean_j\big(dist(x_j^*, X)\big)$.

Dictionary size: 20×50, sparsity: 3, noise: 20 dB.

In this case (sparsity = 3), SeMF recovers better when the number of samples is small (< 500).
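The recovery measure can be computed as in the sketch below. The function names are hypothetical. Two choices here are assumptions rather than statements from the slides: columns are rescaled to unit norm before comparison, and the absolute value of the inner product is used so that an atom and its negative count as the same atom.

```python
import numpy as np

def atom_distances(X_true, X_learned):
    """dist(x_j*, X) = min_i (1 - |x_i^T x_j*|) for every true atom x_j*."""
    eps = 1e-12  # guards against division by zero for all-zero columns
    A = X_true / np.maximum(np.linalg.norm(X_true, axis=0), eps)
    B = X_learned / np.maximum(np.linalg.norm(X_learned, axis=0), eps)
    G = np.abs(B.T @ A)          # G[i, j] = |x_i^T x_j*|
    return 1.0 - G.max(axis=0)   # min over i of (1 - G[i, j])

def recovery_rate(X_true, X_learned, thresh=0.01):
    """Return (fraction of recovered atoms, mean distance dist(X*, X))."""
    d = atom_distances(X_true, X_learned)
    return float(np.mean(d <= thresh)), float(d.mean())
```

With this helper, the experiment in a) amounts to calling `recovery_rate` once per sample size and plotting the first returned value.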

Numerical Experiments

b) The smallest number of samples needed to reach 95% recovery of the dictionary, for different sparsity levels.

Number of samples: [200:50:2000]; sparsity: [1 2 3 4 5 6]; results averaged over 10 experiments.

Dictionary size: 20×50, noise: 20 dB.

Numerical Experiments

c) Recovery with respect to different noise levels.

For each SNR, compute the number of recovered atoms; repeat 100 tests, sort the results, and average in groups of 20. SNR = [10 20 30] dB.

Numerical Experiments: Test on the Swimmer Dataset

• The Swimmer dataset consists of 256 images of size 32×32. Each image is composed of 5 parts drawn from 17 distinct non-overlapping basis images, i.e., a centered invariant part called the torso and four limbs, each in one of 4 positions.

• Goal: extract the non-negative basis images $\{X_1, \dots, X_{17}\}$, with $M \in \mathbb{R}^{1024 \times 256}$, $X \in \mathbb{R}^{1024 \times 17}$, $Y \in \mathbb{R}^{17 \times 256}$.

Numerical Experiments: Different structure enforcing

1. Sparse NMF

$$\min_{X \ge 0,\ Y \ge 0}\ \tfrac{1}{2}\|M - XY\|_F^2 \quad \text{s.t.}\quad \|y_j\|_0 \le 5,\ j = 1, \dots, 256$$

2. Sparse NMF with equal non-zero coefficients

Latent property: the 5 parts of a swimmer image have the same coefficient, which means there are 5 equal non-zeros in the sparse representation Y.

$$\min_{X \ge 0,\ Y \ge 0}\ \tfrac{1}{2}\|M - XY\|_F^2 \quad \text{s.t.}\quad \|y_j\|_0 \le 5,\ (y_j)_{nnz} = mean\big((y_j)_{nnz}\big),\ \forall j$$

where $(y_j)_{nnz}$ denotes the non-zero entries of $y_j$.
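A projection onto the "equal non-zeros" structure for a single column can be sketched as follows. This is one possible realization, not necessarily the one used in the talk, and the function name is hypothetical. It relies on a short argument: for a fixed support of size $s$ the best common non-negative value is the mean of the selected entries, which reduces the squared distance by $(\text{sum of the selected entries})^2 / s$, so it suffices to try the $s$ largest entries for $s = 1, \dots, k$.

```python
import numpy as np

def project_equal_nonzeros(y, k=5):
    """Project y onto {v >= 0 : at most k non-zeros, all non-zeros equal}."""
    y = np.asarray(y, dtype=float)
    order = np.argsort(-y)               # indices from largest to smallest value
    sums = np.cumsum(y[order][:k])       # sums[s-1] = sum of the s largest entries
    sizes = np.arange(1, len(sums) + 1)
    # Reduction in squared distance for each support size (0 if the mean is <= 0).
    gain = np.where(sums > 0, sums ** 2 / sizes, 0.0)
    out = np.zeros_like(y)
    best = int(np.argmax(gain))
    if gain[best] > 0:
        s = best + 1
        out[order[:s]] = sums[best] / s  # common value = mean of the kept entries
    return out
```

Applying this function to every column of $Y + \Pi/\beta$ would give the $V$-update for model 2 under the assumptions above.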

Numerical Experiments: Results on different structure enforcing

[Figure: basis images from Sparse NMF and from Sparse NMF with equal coefficients.]

Improved but no sequence

Numerical Experiments: Different structure enforcing

3. Sparse NMF with orthogonal property

Sparse NMF alone cannot clearly extract the central torso, but the torso is potentially sparse and orthogonal to the 4 limbs. (Actually all 5 parts are independent, with non-overlapping non-zero parts.)

$$\min_{X \ge 0,\ Y \ge 0}\ \tfrac{1}{2}\|M - XY\|_F^2 \quad \text{s.t.}\quad x_{17} \perp x_{1, \dots, 16},\ \|x_{17}\|_0 \le 17,\ \|y_j\|_0 \le 5$$

Numerical Experiments: Results on different structure enforcing

[Figure: basis images from Sparse NMF and from Sparse NMF with orthogonal structure.]

The torso is classified.

Numerical Experiments: Different structure enforcing

4. Sparse NMF with combinatorial patterns

Divide the rows of Y into 5 groups (4 limbs and 1 torso); each group has only 1 non-zero, and the 5 non-zeros are equal.

$$\min_{X \ge 0,\ Y \ge 0}\ \tfrac{1}{2}\|M - XY\|_F^2 \quad \text{s.t.}\quad \|y_{G_i, j}\|_0 = 1,\ i = 1, \dots, 5,\ (y_j)_{nnz} = mean\big((y_j)_{nnz}\big)$$

[Figure: rows of Y partitioned into groups G1, G2, G3, G4, G5.]

Numerical Experiments: Results on different structure enforcing

[Figure: basis images from Sparse NMF enforcing combinatorial patterns.]

Quite well classified parts.

Numerical Experiments: Test on Face Images

• Goal: return a part-based representation.

The basis elements extract facial features such as eyes, nose and lips.

Numerical Experiments

• Structure property: Y is non-negative; X is sparse and non-negative.

a) L1 sparse NMF (relaxation of L0 sparsity, convex): penalize or constrain the L1 norm of X or Y (Hoyer 2004).
b) L0 sparse NMF (more intuitive, non-convex): constrain the L0 norm of X or Y.

Few works address L0 sparse NMF: Non-negative K-SVD (NNK-SVD, 2005), Probabilistic sparse matrix factorization (PSMF, 2004), NMFL0 (2012).

Numerical Experiments

• Model: sparsity enforced on the matrix X

$$\min_{X \ge 0,\ Y \ge 0}\ \tfrac{1}{2}\|M - XY\|_F^2 \quad \text{s.t.}\quad \|x_i\|_0 \le K$$

• Compare to the $\ell_0$-NMF X algorithm (R. Peharz, F. Pernkopf, 2012):
  a) with Y fixed, compute X using non-negative least squares (NNLS);
  b) update Y while maintaining the sparse structure of X (ANLS or multiplicative update).

Difference in subproblems a) and b):
  SeMF: minimizes the augmented Lagrangian function;
  $\ell_0$-NMF X: minimizes the original objective.

Numerical Experiments

• Apply to the ORL dataset (10304×400, 25 basis parts)

[Figure: basis images from SeMF and NMFL0 at nnz: 33%, nnz: 25%, nnz: 10%.]

Numerical Experiments

• Comparison of reconstruction quality and running time.

SeMF gives similar quality but is faster than $\ell_0$-NMF X in the less sparse cases (more non-zeros).

Note: $\ell_0$-NMF X performs better than Hoyer's method in both SNR and time in the paper "Sparse nonnegative matrix factorization with L0-constraints" by R. Peharz and F. Pernkopf.

Conclusions

• SeMF can handle many different structures, provided they have easy projections.
• ADM approach applied to the augmented Lagrangian of a split model.
• Dynamically updating the penalty parameters performs well empirically.
• Potential applications to many problems with latent structure properties, to improve solution quality.
• Further work: experiments and comparisons, non-convex complications, parameter choices, etc.

Thank you!