Accelerated Outlier Detection in Low-Rank and Structured Data: Robust PCA and Extensions

HanQin Cai
Department of Mathematics, University of California at Los Angeles

Seminar on Applied Math and Data Science, March 25, 2020




Outline

1 Introduction
2 Existing Approaches with Theoretical Guarantee (Convex Methods; Non-Convex Methods: Gradient Descent, Alternating Projections)
3 Proposed Approach (Algorithm Design; Theoretical Guarantee)
4 Numerical Experiments (Synthetic Datasets; Video Background Subtraction)
5 Extension to Low-Rank Hankel Recovery
6 Conclusion and Future Work


Principal Component Analysis (PCA)

Principal Component Analysis (PCA), a.k.a. the best low-rank approximation:

    minimize_{L′}   ‖D − L′‖_F
    subject to      rank(L′) ≤ r

− Fundamental tool for dimension reduction

Can be efficiently solved via the truncated singular value decomposition (SVD) H_r:

    H_r(D) = H_r(UΣV^T) = UΣ_r V^T,

where Σ_r keeps only the first r singular values.

H.Q. Cai (UCLA Math) Robust PCA & Extensions 2 / 55
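As a concrete aside, H_r is a few lines of numpy; a minimal sketch (the function name is mine, not notation from the slides):

```python
import numpy as np

def truncated_svd(D, r):
    """Best rank-r approximation H_r(D): keep only the first r singular values."""
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    # Scale the first r left singular vectors by the singular values, then
    # multiply by the first r right singular vectors.
    return U[:, :r] * s[:r] @ Vt[:r, :]
```

By the Eckart-Young theorem this is exactly the minimizer of ‖D − L′‖_F over rank(L′) ≤ r.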

PCA tolerates small (zero-mean) noise, but is over-sensitive to outliers.


Robust PCA

Given D = L + S ∈ Rm×n, where L is low rank and S is sparse.

Split/recover L, S from D:

    minimize_{L′,S′}   ‖D − L′ − S′‖_F
    subject to         rank(L′) ≤ r,  ‖S′‖_0 ≤ αmn        (1)

where α is the sparsity level of S relative to its size.

− Non-convex problem
− Tolerates outliers better than plain PCA


Applications

Video background subtraction with static background

Face recognition

System identification

Fault isolation

Netflix problem

and more ...



RPCA is Ill-Posed without Assumptions

Cannot have a unique decomposition if D is both low rank and sparse.

    [0 0 0]   [0 0 0]   [0 0 0]   [0 0 0]   [0 0 0]
    [0 5 0] = [0 1 0] + [0 4 0] = [0 2 0] + [0 3 0] = · · ·
    [0 0 0]   [0 0 0]   [0 0 0]   [0 0 0]   [0 0 0]


Assumption A1 for Uniqueness of Solution

L ∈ R^{m×n} is rank-r with µ-incoherence, i.e.,

    max_i ‖e_i^T U‖_2 ≤ √(µr/m),    max_j ‖e_j^T V‖_2 ≤ √(µr/n),

where L = UΣV^T is the SVD of L.

µ ∈ [1, max{m, n}/r]

Smaller µ is better; it implies the entries of L are not too spiky/sparse.
− Energy is evenly distributed among the entries of L.
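The incoherence parameter can be checked numerically; a small sketch (the function name and the use of the empirical SVD are my own, not from the slides):

```python
import numpy as np

def incoherence(L, r):
    """Smallest mu for which the A1 bounds hold for the rank-r matrix L."""
    m, n = L.shape
    U, _, Vt = np.linalg.svd(L, full_matrices=False)
    U, V = U[:, :r], Vt[:r, :].T
    # max_i ||e_i^T U||_2^2 * (m/r) and max_j ||e_j^T V||_2^2 * (n/r)
    mu_U = np.max(np.linalg.norm(U, axis=1) ** 2) * m / r
    mu_V = np.max(np.linalg.norm(V, axis=1) ** 2) * n / r
    return max(mu_U, mu_V)
```

For a random Gaussian low-rank matrix the value is typically O(1), far from the worst case max{m, n}/r.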


Assumption A2 for Uniqueness of Solution

S ∈ R^{m×n} is α-sparse, i.e., S has at most αn non-zero entries in each row, and at most αm non-zero entries in each column.

Non-zero entries are not locally dense.

Satisfied with high probability if the support of S is drawn by some stochastic method:
− Bernoulli process
− Uniform sampling


Convex Relaxation

Consider the convex Principal Component Pursuit (PCP):

    minimize_{L′,S′}   ‖L′‖_* + λ‖S′‖_1
    subject to         L′ + S′ = D        (2)

Nuclear norm term for low rank:
− ‖L‖_* = Σ_{i=1}^{r} σ_i(L)

Vector ℓ_1-norm term for sparsity, where S is viewed as a long vector:
− ‖S‖_1 = Σ_{ij} |S_{ij}|


Recovery Guarantee of PCP

If λ = √(1/max{m, n}) is chosen, then, with high probability, the solution of the convex PCP

    minimize_{L′,S′}   ‖L′‖_* + λ‖S′‖_1
    subject to         L′ + S′ = D

is exactly the solution of the original non-convex Robust PCA problem

    minimize_{L′,S′}   ‖D − L′ − S′‖_F
    subject to         rank(L′) ≤ r,  ‖S′‖_0 ≤ αmn

under some natural conditions. [Candès, Li, Ma, Wright, 2009]


Algorithms for PCP

Off-the-shelf solver:
− CVX package for Matlab
  · Based on semidefinite programming
  · O(n^6) - expensive
  · Cannot handle problems larger than 200 × 200

Alternating direction method of multipliers (ADMM):
− Consider the augmented Lagrangian

    ℓ(L, S, Y) = ‖L‖_* + λ‖S‖_1 + ⟨Y, D − L − S⟩ + (η/2)‖D − L − S‖_F^2

− Update L, S, and the dual variable Y alternately
− Convergence of ADMM has been well studied
− Good per-iteration complexity, but sublinear convergence rate
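The ADMM iteration can be sketched in numpy: the L-update is singular-value soft-thresholding (the prox of the nuclear norm), the S-update is entrywise soft-thresholding, followed by a dual ascent step. The penalty parameter η below is a common heuristic, not a value prescribed in the slides:

```python
import numpy as np

def soft(X, t):
    """Entrywise soft-thresholding, the prox of t*||.||_1."""
    return np.sign(X) * np.maximum(np.abs(X) - t, 0.0)

def svt(X, t):
    """Singular-value soft-thresholding, the prox of t*||.||_*."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U * np.maximum(s - t, 0.0) @ Vt

def pcp_admm(D, iters=300):
    m, n = D.shape
    lam = 1.0 / np.sqrt(max(m, n))            # lambda from the PCP guarantee
    eta = 0.25 * m * n / np.abs(D).sum()      # heuristic penalty parameter
    L = np.zeros_like(D); S = np.zeros_like(D); Y = np.zeros_like(D)
    for _ in range(iters):
        L = svt(D - S + Y / eta, 1.0 / eta)   # minimize over L
        S = soft(D - L + Y / eta, lam / eta)  # minimize over S
        Y += eta * (D - L - S)                # dual ascent on L + S = D
    return L, S
```

With a fixed η the convergence is sublinear, which is exactly the drawback noted above.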



Gradient Descent (GD)

Updating L:
− Consider L = UΣV^T = (UΣ^{1/2})(Σ^{1/2}V^T) := PQ^T
− Gradient descent on P, Q separately, based on the objective function

    (1/2)‖PQ^T + S − D‖_F^2  [loss function]  +  (1/8)‖P^T P − Q^T Q‖_F^2  [keeps the scales of P, Q close]        (3)

− Enforce incoherence on P, Q after each gradient step

Updating S:
− Sparsification operator F_α
  · Keeps only the largest α-fraction of elements per row and column
  · Partial sorting on each row and column - expensive for larger α

[Yi, Park, Chen, Caramanis, 2016]
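One way to implement the sparsification operator F_α as a sketch (the exact tie handling and the helper name are my own simplifications):

```python
import numpy as np

def sparsify(X, alpha):
    """F_alpha: keep an entry only if its magnitude is among the largest
    alpha-fraction of both its row and its column; zero out the rest."""
    m, n = X.shape
    A = np.abs(X)
    kr, kc = max(int(alpha * n), 1), max(int(alpha * m), 1)
    # Per-row / per-column magnitude thresholds via partial sorting.
    row_thr = np.partition(A, n - kr, axis=1)[:, n - kr][:, None]
    col_thr = np.partition(A, m - kc, axis=0)[m - kc, :][None, :]
    return X * ((A >= row_thr) & (A >= col_thr))
```

The partial sorts (`np.partition`) on every row and column are the cost noted above, and they grow with α.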


Recovery Guarantee of GD

The output (P_k, Q_k) of GD satisfies

    ‖P_k Q_k^T − L‖_F ≤ ε

in O(κ log(1/ε)) iterations if the following are satisfied:
− L is µ-incoherent
− S is α-sparse with α ≤ O(min{1/(µ r^{1.5} κ^{1.5}), 1/(µ r κ^2)})
− γ = 2: parameter for sparsification
− η ≤ 1/(36 σ_1^L): step size for gradient descent

Linear convergence with rate 1 − O(1/κ).

κ: condition number of L


Alternating Projections (AltProj)

Consider two sets:

    M_r = {L ∈ R^{m×n} | rank(L) ≤ r}
    S_α = {S ∈ R^{m×n} | S is α-sparse}

Find {(L, S) | L ∈ M_r, S ∈ S_α and L + S = D}

Updating:

    L_new = H_r(D − S_old)
    − H_r: truncated SVD - expensive

    S_new = T_ζ(D − L_new)
    − T_ζ(x) = x · 1_{|x|>ζ}: hard thresholding
    − Choose ζ such that supp(S_new) ⊂ supp(S)

[Netrapalli, Niranjan, Sanghavi, Anandkumar, Jain, 2014]
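One AltProj iteration in numpy (as a sketch, the threshold ζ is taken as an input rather than chosen by the paper's stage-dependent schedule):

```python
import numpy as np

def altproj_step(D, S_old, r, zeta):
    """L-update: project D - S_old onto M_r; S-update: hard-threshold D - L_new."""
    U, s, Vt = np.linalg.svd(D - S_old, full_matrices=False)
    L_new = U[:, :r] * s[:r] @ Vt[:r, :]   # H_r(D - S_old)
    R = D - L_new
    S_new = R * (np.abs(R) > zeta)         # T_zeta keeps only large residuals
    return L_new, S_new
```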


Illustration of Alternating Projections

[Figure: the iterates alternate between the rank-r manifold M_r (via H_r) and the sparse set S (via T_ζ): S_0 → L_1 → S_1 → L_2 → S_2 → L_3 → · · ·]


Alternating Projections (AltProj)

Updating:

    L_new = H_r(D − S_old);   S_new = T_ζ(D − L_new)

Break the algorithm into r stages:
− At the t-th stage, use H_t instead of H_r
− Overcomes the case of a badly conditioned L

[Netrapalli, Niranjan, Sanghavi, Anandkumar, Jain, 2014]


Recovery Guarantee of AltProj

The output (L_{t,k}, S_{t,k}) of AltProj satisfies

    ‖L_{t,k} − L‖_F ≤ ε,   ‖S_{t,k} − S‖_∞ ≤ ε/√(mn),   and supp(S_{t,k}) ⊂ supp(S)

in O(t log(1/(2ε))) iterations if the following are satisfied:
− L is µ-incoherent
− S is α-sparse with α ≤ 1/(512 µr)
− β = 4µr/√(mn): parameter for thresholding

Linear convergence with rate 1/2 at the t-th stage.


Key Idea

AltProj is fast enough when updating S with hard thresholding, but using a truncated SVD to update L can still be very expensive when the problem size is large.

Accelerate by first projecting D − S_old onto a local low-dimensional subspace before obtaining a new estimate of L via truncated SVD.


The Low-Dimensional Subspace

M_r = {L ∈ R^{m×n} | rank(L) ≤ r} is a Riemannian manifold.

Tangent space of M_r at L:

    T = {UA^T + BV^T | A ∈ R^{n×r}, B ∈ R^{m×r}}

where L = UΣV^T is the SVD of L.
− This is the low-dimensional subspace we want, but we don't have it in practice.
− Instead, use the tangent space of M_r at L_old.

Projection of an arbitrary matrix Z onto the tangent space T:

    P_T Z = UU^T Z + ZVV^T − UU^T Z VV^T        (4)
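Formula (4) in numpy, together with its defining properties (P_T is a projection, and its image consists of matrices of rank at most 2r):

```python
import numpy as np

def project_tangent(Z, U, V):
    """P_T Z = U U^T Z + Z V V^T - U U^T Z V V^T, for orthonormal U, V."""
    UtZ = U.T @ Z
    return U @ UtZ + (Z @ V) @ V.T - U @ (UtZ @ V) @ V.T
```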

Illustration of Our Approach

[Figure: each residual D − S_k is first projected onto the current tangent space (P_{T_k}) and then truncated by H_r to give L_{k+1}; hard thresholding T_ζ gives S_{k+1}, tracing L_0 → S_0 → L_1 → S_1 → · · · between M_r and S.]


Algorithm: AccAltProj

RPCA by Accelerated Alternating Projections (AccAltProj)

    Initialization
    k = 0
    while ‖D − L_k − S_k‖_F / ‖D‖_F ≥ ε do
        L_k = Trim(L_k, µ)                  // enforce incoherence of tangent space
        L_{k+1} = H_r(P_{T_k}(D − S_k))
        ζ_{k+1} = β (σ_{r+1}(P_{T_k}(D − S_k)) + γ^{k+1} σ_1(P_{T_k}(D − S_k)))
        S_{k+1} = T_{ζ_{k+1}}(D − L_{k+1})
        k = k + 1
    end while


Accelerating Truncated SVD

Denote Z := D − S_k. Computing H_r(P_{T_k} Z) efficiently:

    P_{T_k} Z = [U_k  Q_1] · [ U_k^T Z V_k   R_2^T ] · [V_k  Q_2]^T
                             [ R_1           0     ]

where [U_k  Q_1] and [V_k  Q_2] are orthogonal, the middle matrix is 2r × 2r, and the QR decompositions are
− Q_1 R_1 = (I − U_k U_k^T) Z V_k ∈ R^{m×r}
− Q_2 R_2 = (I − V_k V_k^T) Z^T U_k ∈ R^{n×r}

Therefore

    H_r(P_{T_k} Z) = [U_k  Q_1] · H_r([ U_k^T Z V_k  R_2^T ; R_1  0 ]) · [V_k  Q_2]^T

Only an SVD of a 2r × 2r matrix plus two QR decompositions are needed.

Complexities: 4n^2 r + n^2 + O(nr^2 + r^3) for H_r ∘ P_{T_k}, versus O(n^2 r) with a large hidden constant for a direct H_r.
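The factorization above in numpy, which can be checked against forming P_{T_k} Z explicitly and truncating it directly:

```python
import numpy as np

def fast_hr_pt(Z, U, V, r):
    """H_r(P_T Z) via two QR decompositions and one 2r x 2r SVD."""
    A = U.T @ Z @ V                            # r x r core U^T Z V
    Q1, R1 = np.linalg.qr(Z @ V - U @ A)       # (I - U U^T) Z V
    Q2, R2 = np.linalg.qr(Z.T @ U - V @ A.T)   # (I - V V^T) Z^T U
    M = np.block([[A, R2.T], [R1, np.zeros((r, r))]])
    Um, s, Vmt = np.linalg.svd(M)              # SVD of the small 2r x 2r matrix
    left = np.hstack([U, Q1]) @ Um[:, :r]
    right = np.hstack([V, Q2]) @ Vmt[:r, :].T
    return left * s[:r] @ right.T
```

All dense SVD work happens on the 2r × 2r middle matrix, which is the source of the speedup when r ≪ n.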


Efficient S Updating with Hard Thresholding

Update S_{k+1} = T_{ζ_{k+1}}(D − L_{k+1}), where

    ζ_{k+1} = β (σ_{r+1}(P_{T_k}(D − S_k)) + γ^{k+1} σ_1(P_{T_k}(D − S_k))),

so that we have supp(S_{k+1}) ⊂ supp(S) in theory.

The singular values of P_{T_k}(D − S_k) were already computed during the L update.
− Cost O(1) for computing ζ_{k+1}

Complexity: 2n^2 + O(1)
− No expensive partial sorting needed.


Initialization

If we choose L_0 = 0, then U_0 = V_0 = 0.
− We don't want this to happen.

Initialization by two steps of AltProj:

    L_{−1} = 0
    ζ_{−1} = β_init · σ_1(D)
    S_{−1} = T_{ζ_{−1}}(D − L_{−1})
    L_0 = H_r(D − S_{−1})
    ζ_0 = β · σ_1(D − S_{−1})
    S_0 = T_{ζ_0}(D − L_0)

Two steps of AltProj give a close enough tangent space to start with.
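The two-step initialization above, as a numpy sketch (β_init and β are passed in; note S_{−1} = T_{ζ_{−1}}(D) since L_{−1} = 0):

```python
import numpy as np

def init_two_step(D, r, beta_init, beta):
    """Two AltProj steps: returns (L_0, S_0) to seed the AccAltProj loop."""
    zeta_m1 = beta_init * np.linalg.norm(D, 2)  # beta_init * sigma_1(D)
    S_m1 = D * (np.abs(D) > zeta_m1)            # S_{-1} = T_{zeta_{-1}}(D - 0)
    U, s, Vt = np.linalg.svd(D - S_m1, full_matrices=False)
    L0 = U[:, :r] * s[:r] @ Vt[:r, :]           # L_0 = H_r(D - S_{-1})
    zeta0 = beta * s[0]                         # beta * sigma_1(D - S_{-1})
    S0 = (D - L0) * (np.abs(D - L0) > zeta0)    # S_0 = T_{zeta_0}(D - L_0)
    return L0, S0
```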



Main Theoretical Result

Recovery Guarantee of AccAltProj

Let L and S be two matrices satisfying Assumptions A1 and A2 with

    α ≤ O(min{1/(µ r^2 κ^3), 1/(µ^{1.5} r^2 κ), 1/(µ^2 r^2)}).

If the thresholding parameters obey

    µr σ_1^L / (√(mn) σ_1^D) ≤ β_init ≤ 3µr σ_1^L / (√(mn) σ_1^D)   and   β = µr / (2√(mn)),

along with the convergence rate parameter γ ∈ (1/√12, 1), then the outputs of AccAltProj satisfy

    ‖L − L_k‖_F ≤ ε σ_1^L,   ‖S − S_k‖_∞ ≤ (ε/√(mn)) σ_1^L,   and supp(S_k) ⊂ supp(S)

in O(log(1/ε)) iterations.

Linear convergence with rate γ.


Proof Sketch

A. Show any m × n RPCA problem can be reduced to a symmetric RPCA problem.

B. Prove the symmetric case by mathematical induction.

Base case: the initialization by 2-step AltProj satisfies

    ‖L_0 − L‖_2 ≤ 8αµr σ_1^L,   ‖S_0 − S‖_∞ ≤ (µr/n) σ_1^L,   and supp(S_0) ⊂ supp(S).

Induction step: prove the following two lemmas.

1. If ‖L_k − L‖_2 ≤ 8αµr γ^k σ_1^L, ‖S_k − S‖_∞ ≤ (µr/n) γ^k σ_1^L, and supp(S_k) ⊂ supp(S), then ‖L_{k+1} − L‖_2 ≤ 8αµr γ^{k+1} σ_1^L and ‖L_{k+1} − L‖_∞ ≤ ((1 − 8αµrκ) µr/(2n)) γ^{k+1} σ_1^L.

2. If ‖L_{k+1} − L‖_∞ ≤ ((1 − 8αµrκ) µr/(2n)) γ^{k+1} σ_1^L, then ‖S_{k+1} − S‖_∞ ≤ (µr/n) γ^{k+1} σ_1^L and supp(S_{k+1}) ⊂ supp(S).


Key Properties of Tangent Space Projection (1)

If L, L_k are two rank-r matrices, then

    ‖(I − P_{T_k})(L − L_k)‖_2 ≤ ‖L − L_k‖_2^2 / σ_r^L.

− Makes the local convergence analysis possible.

[Figure: the component of L_k − L orthogonal to the tangent space shrinks quadratically as L_k approaches L along M_r.]


Key Properties of Tangent Space Projection (2)

If L_k is µ-incoherent and S is α-sparse, then

    ‖P_{T_k} S‖_∞ ≤ 3αµr ‖S‖_∞.

− P_{T_k} S is not sparse.
− Ensures sparsity can still be used for error bounds after the tangent space projection.


Overview

[Table comparing per-iteration complexities and convergence rates; the complexities in the table are based on D ∈ R^{n×n}.]

We have the best per-iteration complexity, with the best order of convergence, in the class of provable non-convex Robust PCA algorithms.


Setup

L = PQ^T, where P, Q ∈ R^{n×r} are random matrices whose entries are drawn i.i.d. from the standard normal distribution.

supp(S) is sampled uniformly without replacement.

The values of the non-zero entries of S are drawn i.i.d. from the uniform distribution over [−c · E(|L_{ij}|), c · E(|L_{ij}|)].
− Smaller c gives a harder splitting task (unless it becomes too small to count as outliers, like 10^{−8}).
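The synthetic model above in numpy (as a sketch; the amplitude uses the empirical mean of |L_ij| in place of E|L_ij|):

```python
import numpy as np

def make_rpca_problem(n, r, alpha, c, seed=0):
    """D = L + S with Gaussian rank-r L and alpha*n^2 uniform outliers."""
    rng = np.random.default_rng(seed)
    P, Q = rng.standard_normal((n, r)), rng.standard_normal((n, r))
    L = P @ Q.T                                 # rank-r with Gaussian factors
    amp = c * np.abs(L).mean()                  # c * E|L_ij| (empirical)
    S = np.zeros(n * n)
    # supp(S) sampled uniformly without replacement
    supp = rng.choice(n * n, size=int(alpha * n * n), replace=False)
    S[supp] = rng.uniform(-amp, amp, size=supp.size)
    S = S.reshape(n, n)
    return L + S, L, S
```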


Setup - Continue

Compare among three non-convex algorithms (AccAltProj is run both with and without trimming):

− GD
  · Use 1.1α for sparsification

− AltProj
  · Only time the r-th stage

− AccAltProj with Trimming
  · Use 1.1µ for trimming

− AccAltProj without Trimming
  · Skip the step Lk = Trim(Lk, µ)

PROPACK is used for large-size truncated SVDs

− Good for AltProj, which uses it every iteration


Speed Comparison: Dimension vs Runtime

[Figure: runtime (secs) vs. dimension n for AccAltProj w/ trimming, AccAltProj w/o trimming, AltProj, and GD]

r = 5, α = 0.1, c = 1, stop at errk < 10−4.


Speed Comparison: Sparsity vs Runtime

[Figure: runtime (secs) vs. sparsity factor p for AccAltProj w/ trimming, AccAltProj w/o trimming, AltProj, and GD]

r = 5, c = 1, n = 2500, stop at errk < 10−4.


Speed Comparison: Relative Error vs Runtime

[Figure: runtime (secs) vs. relative error errk for AccAltProj w/ trimming, AccAltProj w/o trimming, AltProj, and GD]

r = 5, α = 0.1, c = 1, n = 2500.


Rate of Success for Varying c and α

Each entry is the number of successes out of 10 trials.

c = 0.2                  α = 0.3  0.35  0.4  0.45  0.5  0.55  0.6
AccAltProj w/ trimming        10    10   10    10   10    10   10
AccAltProj w/o trimming       10    10   10    10   10    10   10
AltProj                       10    10   10    10   10    10   10
GD                            10    10   10     0    0     0    0

c = 1                    α = 0.3  0.35  0.4  0.45  0.5  0.55  0.6
AccAltProj w/ trimming        10    10   10    10   10    10   10
AccAltProj w/o trimming       10    10   10    10   10    10   10
AltProj                       10    10   10    10   10    10   10
GD                            10    10   10    10    9     0    0

c = 5                    α = 0.3  0.35  0.4  0.45  0.5  0.55  0.6
AccAltProj w/ trimming        10    10   10    10   10    10   10
AccAltProj w/o trimming       10    10   10    10   10    10   10
AltProj                       10    10   10    10   10    10   10
GD                            10    10   10    10   10    10    7

r = 5, n = 2500.


Outline

1 Introduction

2 Existing Approaches with Theoretical Guarantee

3 Proposed Approach

4 Numerical Experiments
    Synthetic Datasets
    Video Background Subtraction

5 Extension to Low-Rank Hankel Recovery

6 Conclusion and Future Work


Setup

Each frame of the video is vectorized and becomes a row (or a column) of a data matrix.

The static background is the low rank component of the matrix.

The moving objects, i.e., the foreground, form the sparse component of the matrix.
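A minimal sketch of the vectorization step (our own function name; tiny integer "frames" stand in for real grayscale images):

```python
def frames_to_matrix(frames):
    """Flatten each 2-D frame (a list of pixel rows) into one row of the
    data matrix D, as described on the slide (a sketch, our naming)."""
    return [[pixel for row in frame for pixel in row] for frame in frames]

# Two tiny 2x3 "frames" become a 2 x 6 data matrix: one row per frame.
frames = [[[1, 2, 3], [4, 5, 6]],
          [[7, 8, 9], [10, 11, 12]]]
D = frames_to_matrix(frames)  # -> [[1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12]]
```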


Demo

https://youtu.be/k6uaeQky2sc


Comparison of Video Outputs

Shoppingmall (S): 256 × 320 frame size; 1000 frames

Restaurant (R): 120 × 160 frame size; 3055 frames

     AccAltProj w/ trim    AccAltProj w/o trim    AltProj           GD
     runtime     µ         runtime     µ          runtime    µ     runtime    µ
S    38.98s      2.12      38.79s      2.26       82.97s     2.13  161.1s     2.85
R    28.09s      5.16      27.94s      5.25       69.12s     5.28  107.3s     6.07

Trimming helps the consistency of the background across frames a little bit, while using only slightly more time.


AccAltProj has been published as:

Cai, HanQin, Jian-Feng Cai, and Ke Wei. "Accelerated alternating projections for robust principal component analysis." The Journal of Machine Learning Research 20.1 (2019): 685-717.

Code can be found at:

https://github.com/caesarcai/AccAltProj_for_RPCA


Outline

1 Introduction

2 Existing Approaches with Theoretical Guarantee

3 Proposed Approach

4 Numerical Experiments

5 Extension to Low-Rank Hankel Recovery

6 Conclusion and Future Work


Form a Hankel Matrix

Given a complex vector x ∈ C^n, we can form a corresponding Hankel matrix H(x) ∈ C^{n1×n2} by

          ⎡ x_0       x_1       x_2       ···  x_{n2−1} ⎤
          ⎢ x_1       x_2       x_3       ···  x_{n2}   ⎥
   H(x) = ⎢ x_2       x_3       x_4       ···  x_{n2+1} ⎥
          ⎢  ⋮         ⋮         ⋮              ⋮       ⎥
          ⎣ x_{n1−1}  x_{n1}    x_{n1+1}  ···  x_{n−1}  ⎦

where n1 + n2 = n + 1.

We try to keep the matrix (almost) square: when n is odd, we choose n1 = n2 = (n+1)/2; when n is even, we choose n1 = n2 + 1 and n2 = n/2.

*We try to use notation consistent with the paper, which is unfortunately inconsistent with the notation in previous sections.


Low-Rank Hankel Matrix

Not all Hankel matrices are low-rank, but low-rank Hankel matrices appear in many applications:

Spectrally sparse signals
− Magnetic resonance imaging (MRI)
− Nuclear magnetic resonance (NMR) spectroscopy
− Fluorescence microscopy
− Analog-to-digital conversion
− Earthquake-induced vibration data

Dynamic sampling

Auto-regression


Hankel Outlier Detection Problem

To recover the vector x from the corrupted observation y = x + s ∈ C^n (or, equivalently, H(y) = H(x) + H(s) ∈ C^{n1×n2}), we want to solve the minimization problem:

    minimize_{x,s}   ‖y − x − s‖_2
    subject to       rank(H(x)) ≤ r,  ‖s‖_0 ≤ αn        (5)

The sparse vector s can be:
− Impulse noise in MRI/NMR signals
− A corrupted time snapshot in dynamic sampling
− Black swan events in financial markets


AAP-Hankel

Accelerated Alternating Projections for Hankel Recovery

Initialization; set k = 0
while ‖y − x_k − s_k‖ / ‖y‖ ≥ ε do
    L_k = Trim(L_k, µ)
    L_{k+1} = D_r P_{T_k} H(y − s_k)
    x_{k+1} = H†(L_{k+1})
    ζ_{k+1} = β ( σ_{r+1}(P_{T_k} H(y − s_k)) + γ^{k+1} σ_1(P_{T_k} H(y − s_k)) )
    s_{k+1} = T_{ζ_{k+1}}(y − x_{k+1})
    k = k + 1
end while

Trim: enforce incoherence on a low-rank matrix

D_r: the best rank-r approximation

P_{T_k}: projection onto the subspace T_k = {U_k A^T + B V_k^* | A ∈ C^{n2×r}, B ∈ C^{n1×r}}

H†: normalized dual of H such that H†H = I

T_ζ: hard thresholding


Computational Complexity of Updating x

Due to the construction of the Hankel matrix, a Hankel matrix–vector multiplication can be reformulated as a vector–vector convolution, which can be computed via the FFT. Hence, a Hankel matrix–matrix multiplication needs only O(rn log(n)) flops.

With an SVD of a 2r × 2r matrix plus two QR decompositions, we need O(nr^2 + n log(n) r + r^3) flops for updating L.

[x_{k+1}]_j = [H†(L_{k+1})]_j = ω_j ∑_{i=1}^{r} Σ_{k+1}^{(i,i)} ( ∑_{a+b=j} U_{k+1}^{(a,i)} V_{k+1}^{(b,i)} ),  ∀j

− ω_j is the weight for averaging the j-th anti-diagonal
− The convolutions can be computed by FFT; the r convolutions cost O(n log(n) r) flops in total.

Overall, we need O(nr^2 + n log(n) r + r^3) flops for updating x.
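The convolution reformulation can be sanity-checked with a small sketch (our own function names; a naive O(n·n2) convolution stands in for the FFT, which is what delivers the O(n log n) cost): since (H(x)v)_i = Σ_j x_{i+j} v_j, the product H(x)v is exactly a contiguous slice of the full convolution of x with the reversed v.

```python
def conv_full(a, b):
    """Full linear convolution (naive here; an FFT is used in practice)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def hankel_matvec(x, v):
    """(H(x) v)_i = sum_j x_{i+j} v_j, read off as entries
    n2-1 .. n2-1+n1-1 of conv(x, reversed(v))."""
    n, n2 = len(x), len(v)
    n1 = n + 1 - n2
    c = conv_full(x, v[::-1])
    return c[n2 - 1 : n2 - 1 + n1]

# Agrees with forming H(x) explicitly and multiplying row by row:
x, v = [1.0, 2.0, 3.0, 4.0, 5.0], [1.0, 0.0, -1.0]
direct = [sum(x[i + j] * v[j] for j in range(3)) for i in range(3)]
```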


Assumptions

A1 H(x) ∈ C^{n1×n2} is µ-incoherent: there exists a numerical constant µ0 > 0 such that

    σ_r(E_L^* E_L) ≥ n1/µ0,   σ_r(E_R^* E_R) ≥ n2/µ0,

where H(x) = E_L D E_R^T is its Vandermonde decomposition.

For undamped spectrally sparse signals, this assumption is guaranteed when there is at least 2/n separation between the non-zero frequencies [LiaoFannjiang'2016].

A2 s ∈ Cn is α-sparse: no more than αn non-zero entries in the vector.

− This implies that H(s) is at most a 2α-sparse matrix.


Main Theoretical Result of AAP-Hankel

Recovery Guarantee of AAP-Hankel

Let x and s be vectors satisfying Assumptions A1 and A2 with

    α ≲ O( min{ 1/(µ r^2 κ^3),  1/(µ^{1.5} r^2 κ),  1/(µ^2 r^2) } ).

If the thresholding parameters obey

    µ r σ_1^x / (√(n1 n2) σ_1^y) ≤ β_init ≤ 3 µ r σ_1^x / (√(n1 n2) σ_1^y)   and   β = µ r / (2 √(n1 n2)),

along with the convergence-rate parameter γ ∈ (1/√12, 1), then the outputs of Algorithm 1 satisfy

    ‖H(x − x_k)‖_F ≤ ε σ_1^x,   ‖s − s_k‖_∞ ≤ (ε / √(n1 n2)) σ_1^x,   and supp(s_k) ⊂ supp(s)

in O(log_γ ε) iterations.

σ_1^x denotes the largest singular value of H(x), and σ_1^y that of H(y).

Linear convergence with rate γ.


Speed Comparison: Dimension vs Runtime

r = 5, α = 0.1, c = 1, stop at errk < 10−4.


Speed Comparison: Sparsity vs Runtime

n = 4002, α = 0.1, c = 1, stop at errk < 10−4.


Speed Comparison: Relative Error vs Runtime

n = 4002, r = 5, α = 0.1, c = 1.


Phase Transition


Impulse-Corrupted NMR Signal

n = 32768, 50% corruption.


AAP-Hankel is under review; you can find it on arXiv:

Cai, HanQin, Jian-Feng Cai, Tianming Wang, and Guojian Yin. "Fast and Robust Spectrally Sparse Signal Recovery: A Provable Non-Convex Approach via Robust Low-Rank Hankel Matrix Reconstruction." arXiv preprint arXiv:1910.05859 (2019).

Code can be found at:

https://github.com/caesarcai/AAP-Hankel


Outline

1 Introduction

2 Existing Approaches with Theoretical Guarantee

3 Proposed Approach

4 Numerical Experiments

5 Extension to Low-Rank Hankel Recovery

6 Conclusion and Future Work


Conclusion

We present a new algorithm for Robust PCA and extend it to the low-rank Hankel recovery problem.

Non-convex with theoretical guarantee

Best per-iteration complexity

Guaranteed linear convergence

Significant speed advantage over the other state-of-the-art algorithms

High robustness in practice


Future Work

Recovery guarantee proof for “without Trimming”

Relax the theoretical tolerance on α to match our experimental results

Study partially observed Robust PCA

− Acceleration with guaranteed recovery

Study recovery stability with additive noise

− Theoretically and practically


Thank you!

Any questions?
