Iterative Shrinkage/Thresholding Algorithms: Some History and Recent Developments

Mário A. T. Figueiredo
Instituto de Telecomunicações and Instituto Superior Técnico, Technical University of Lisbon, PORTUGAL

CS Workshop, Duke, 2009
[email protected]
Signal/Image Restoration/Representation/Reconstruction
Many signal/image reconstruction/approximation criteria have the form

$\min_x\; f(x) + \tau\,\phi(x)$

where $f$ is smooth and convex (the data-fidelity term); usually $f(x) = \frac{1}{2}\|y - Ax\|_2^2$;
$\phi$ is a regularization/penalty function, typically convex (sometimes not), often non-differentiable.
Examples: TV-based and wavelet-based restoration/reconstruction, sparse representations, sparse (linear or logistic) regression, compressive sensing (with $\phi(x) = \|x\|_1$).
Outline
1. The optimization problem (previous slide)
2. IST Algorithms: four derivations
3. Convergence results
4. Enhanced (accelerated) versions: TwIST and SpaRSA
5. Warm starting and continuation
6. Concluding remarks
Denoising/shrinkage operators
If $A = I$, we have a denoising problem: $\min_x \frac{1}{2}\|x - y\|_2^2 + \tau\,\phi(x)$.

If $\phi$ is proper and convex, the objective is strictly convex, so there is a unique minimizer.

Thus, the so-called shrinkage/thresholding/denoising function

$\Psi_\tau(y) = \arg\min_x \frac{1}{2}\|x - y\|_2^2 + \tau\,\phi(x)$

is well defined (the Moreau proximal mapping) [Moreau, 1962], [Combettes, 2001].
Examples: $\phi(x) = \|x\|_1$ yields the soft-thresholding function $\Psi_\tau(y) = \mathrm{sign}(y)\,\max\{|y| - \tau,\, 0\}$ (componentwise); $\phi(x) = \|x\|_0$ (not convex) yields hard thresholding.
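As a concrete illustration, here is a minimal NumPy sketch of these two shrinkage functions (the names soft and hard are mine; the hard threshold at $\sqrt{2\tau}$ comes from comparing the two candidate minimizers of the scalar $\ell_0$ problem):

```python
import numpy as np

def soft(y, tau):
    # Prox of tau * ||.||_1: componentwise soft thresholding.
    return np.sign(y) * np.maximum(np.abs(y) - tau, 0.0)

def hard(y, tau):
    # Prox of tau * ||.||_0 (not convex): keep y_i iff y_i^2 / 2 > tau.
    return np.where(np.abs(y) > np.sqrt(2.0 * tau), y, 0.0)
```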
Iterative Shrinkage/Thresholding (IST)
Problem: $\min_x \frac{1}{2}\|y - Ax\|_2^2 + \tau\,\phi(x)$

IST algorithm: $x_{t+1} = \Psi_{\tau/\alpha}\bigl(x_t + \frac{1}{\alpha}A^T(y - Ax_t)\bigr)$

Adequate when products by $A$ and $A^T$ are efficiently computable (e.g., via the FFT).

Since $A^T(Ax - y)$ is the gradient of $\frac{1}{2}\|y - Ax\|_2^2$: if $\phi = 0$, IST is gradient descent with step length $1/\alpha$.
IST is also applicable in Bregman iterations to solve constrained problems [Yin, Osher, Goldfarb, Darbon, 2008].
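A minimal sketch of the IST loop in this setting, assuming $f(x) = \frac{1}{2}\|y - Ax\|_2^2$ and $\phi = \|\cdot\|_1$ (it reuses the soft function sketched above; a dense matrix stands in for the fast operators, e.g., FFT-based, mentioned above):

```python
import numpy as np

def ist(A, y, tau, alpha, n_iters=500):
    # x <- Psi_{tau/alpha}( x + (1/alpha) A^T (y - A x) ), Psi = soft threshold.
    x = np.zeros(A.shape[1])
    for _ in range(n_iters):
        x = soft(x + (A.T @ (y - A @ x)) / alpha, tau / alpha)
    return x
```

A safe step choice is $\alpha$ slightly above $\|A^TA\|_2$, e.g., `alpha = 1.01 * np.linalg.norm(A, 2) ** 2`.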
IST as Expectation-Maximization [F. and Nowak, 2001, 2003]
Underlying observation model: $y = Ax + n$, with $n \sim \mathcal{N}(0, \sigma^2 I)$.

Equivalent model: split the noise as $n = A\,n_1 + n_2$, so that $y = Az + n_2$, with hidden image $z = x + n_1$.

E-step (Wiener): $\hat z_t = \mathbb{E}[z \mid y, x_t]$, which (with the noise variances suitably normalized) reduces to the Landweber-type step $\hat z_t = x_t + A^T(y - Ax_t)$.

M-step (denoising): $x_{t+1} = \Psi_\tau(\hat z_t)$.

The EM construction guarantees monotonicity: the objective does not increase across iterations.
IST as Majorization-Minimization [Daubechies, Defrise, De Mol, 2004]
Majorization function:

$Q(x; x_t) = \frac{1}{2}\|y - Ax\|_2^2 + \tau\,\phi(x) + \frac{\alpha}{2}\|x - x_t\|_2^2 - \frac{1}{2}\|A(x - x_t)\|_2^2$

MM algorithm: $x_{t+1} = \arg\min_x Q(x; x_t)$, which is exactly the IST update.

If $\alpha \ge \lambda_{\max}(A^TA)$, we can set $\alpha I - A^TA \succeq 0$, so $Q(\cdot\,; x_t)$ majorizes the objective, with equality at $x = x_t$.

Thus, monotonicity: $f(x_{t+1}) + \tau\phi(x_{t+1}) \le Q(x_{t+1}; x_t) \le Q(x_t; x_t) = f(x_t) + \tau\phi(x_t)$.
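Spelling out those two assertions from the definitions above (my own check): subtracting the objective from $Q$ leaves a quadratic form, and completing the square in $Q$ recovers the IST update:

$$Q(x; x_t) - \Bigl(\tfrac{1}{2}\|y - Ax\|_2^2 + \tau\phi(x)\Bigr) = \tfrac{1}{2}(x - x_t)^T(\alpha I - A^TA)(x - x_t),$$

which is non-negative for all $x$ exactly when $\alpha \ge \lambda_{\max}(A^TA)$; and, dropping terms constant in $x$,

$$\arg\min_x Q(x; x_t) = \arg\min_x \tfrac{\alpha}{2}\Bigl\|x - \Bigl(x_t + \tfrac{1}{\alpha}A^T(y - Ax_t)\Bigr)\Bigr\|_2^2 + \tau\phi(x) = \Psi_{\tau/\alpha}\Bigl(x_t + \tfrac{1}{\alpha}A^T(y - Ax_t)\Bigr).$$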
IST as Forward-Backward Splitting
Back to the problem $\min_x f(x) + \tau\,\phi(x)$, with $f$ differentiable and $\phi$ convex.

$\hat x$ is a minimizer $\iff 0 \in \nabla f(\hat x) + \tau\,\partial\phi(\hat x) \iff \hat x = \Psi_{\tau/\alpha}\bigl(\hat x - \frac{1}{\alpha}\nabla f(\hat x)\bigr)$ (a fixed-point equation; the minimizer defining $\Psi$ is unique).

Fixed-point scheme: $x_{t+1} = \Psi_{\tau/\alpha}\bigl(x_t - \frac{1}{\alpha}\nabla f(x_t)\bigr)$, i.e., IST: a forward (gradient) step followed by a backward (proximal) step.
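The second equivalence, spelled out (it is just the optimality condition of the strictly convex problem defining $\Psi$):

$$0 \in \nabla f(\hat x) + \tau\,\partial\phi(\hat x) \iff \hat x - \tfrac{1}{\alpha}\nabla f(\hat x) \in \hat x + \tfrac{\tau}{\alpha}\,\partial\phi(\hat x) \iff \hat x = \arg\min_x \tfrac{1}{2}\Bigl\|x - \Bigl(\hat x - \tfrac{1}{\alpha}\nabla f(\hat x)\Bigr)\Bigr\|_2^2 + \tfrac{\tau}{\alpha}\phi(x).$$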
IST as Separable Approximation
Recall the problem: $\min_x f(x) + \tau\,\phi(x)$.

Separable approximation to $f$ around $x_t$: keep the gradient, replace the Hessian by $\alpha_t I$.

Iteration: $x_{t+1} = \arg\min_x\; (x - x_t)^T \nabla f(x_t) + \frac{\alpha_t}{2}\|x - x_t\|_2^2 + \tau\,\phi(x)$.

If $\phi$ is separable, the minimization decouples into independent scalar problems.

Can be rewritten as $x_{t+1} = \Psi_{\tau/\alpha_t}\bigl(x_t - \frac{1}{\alpha_t}\nabla f(x_t)\bigr)$, an IST step.

If $\phi$ is convex, the subproblem has a unique solution.

The objective function in each iteration can be seen as the Lagrangian of a trust-region method.
Bibliographical Notes
IST as expectation-maximization: [F. and Nowak, 2001, 2003]
IST as majorization-minimization: [Daubechies, Defrise, De Mol, 2003, 2004], [F., Nowak, Bioucas-Dias, 2005, 2007]
Forward-backward schemes in math: [Bruck, 1977], [Passty, 1979], [Lions and Mercier, 1979]
Forward-backward schemes in signal reconstruction: [Combettes and Wajs, 2003, 2004]
Separable approximation: [Wright, Nowak, and F., 2008]
Other authors independently proposed IST schemes for signal/image recovery: [Bect, Blanc-Féraud, Aubert, and Chambolle, 2004], [Elad, Matalon, and Zibulevsky, 2006], [Starck, Nguyen, Murtagh, 2003], [Starck, Candès, Donoho, 2003], [Hale, Yin, Zhang, 2007]
Existence, Uniqueness
The set of minimizers is non-empty if the objective $f + \tau\phi$ is coercive ($f(x) + \tau\phi(x) \to +\infty$ as $\|x\|_2 \to \infty$).

It has at most one element if $\phi$ is strictly convex or $A^TA$ is invertible.

It has exactly one element if, in addition, the objective is bounded below and coercive.

[Combettes and Wajs, 2004]
Convergence Results (I)
Problem: $\min_x \frac{1}{2}\|y - Ax\|_2^2 + \tau\,\phi(x)$

IST algorithm: $x_{t+1} = \Psi_{\tau/\alpha}\bigl(x_t + \frac{1}{\alpha}A^T(y - Ax_t)\bigr)$

[Daubechies, Defrise, De Mol, 2004] (applies in a Hilbert-space setting):
let $\phi = \|\cdot\|_1$, $\tau > 0$, and $\alpha > \|A^TA\|_2$; then IST converges to a minimizer of the objective.

[Combettes and Wajs, 2005] (applies to a more general version of IST):
let $\phi$ be convex and proper (never $-\infty$, not everywhere $+\infty$) and $\alpha > \frac{1}{2}\|A^TA\|_2$; then IST converges to a minimizer of the objective.
Convergence Results (II)
Problem: $\min_x \frac{1}{2}\|y - Ax\|_2^2 + \tau\,\|x\|_1$

IST algorithm: $x_{t+1} = \Psi_{\tau/\alpha}\bigl(x_t + \frac{1}{\alpha}A^T(y - Ax_t)\bigr)$

[Hale, Yin, Zhang, 2007]: let $\tau > 0$ and $\alpha > \frac{1}{2}\|A^TA\|_2$.

Then, IST converges to some minimizer $\hat x$ and, for all but a finite number of iterations, $(x_t)_i = 0$ for all $i \in L$, where $L = \{i : \hat x_i = 0,\ |(\nabla f(\hat x))_i| < \tau\}$ (the zero components are identified in finitely many steps).
Accelerating IST: Two-Step IST (TwIST)
IST becomes slow when $A^TA$ is very ill-conditioned and $\tau$ is small.

Inspired by two-step methods for linear systems [Frankel, 1950], [Axelsson, 1996], the TwIST algorithm [Bioucas-Dias and F., 2007]:

$x_{t+1} = (1 - \alpha)\,x_{t-1} + (\alpha - \beta)\,x_t + \beta\,\Gamma(x_t)$, where $\Gamma(x) = \Psi_\tau\bigl(x + A^T(y - Ax)\bigr)$.

Simplified analysis: assume the eigenvalues of $A^TA$ lie in $[\xi_1, 1]$, with $\xi_1 > 0$.

The minimizer $\hat x$ is unique and TwIST converges to $\hat x$ for suitable $\alpha$ and $\beta$.

There is an optimal choice for $\alpha$ and $\beta$, for which the linear convergence factor is $\frac{1 - \sqrt{\xi_1}}{1 + \sqrt{\xi_1}}$ (versus roughly $\frac{1 - \xi_1}{1 + \xi_1}$ for IST).
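A minimal sketch of the two-step recursion, in the same $\ell_1$/soft-threshold setting as the earlier sketches (soft() is assumed from above; the choice of parameters is left to the caller):

```python
import numpy as np

def twist(A, y, tau, alpha, beta, n_iters=500):
    # Gamma(x) = Psi_tau( x + A^T (y - A x) ), with Psi = soft threshold.
    gamma = lambda x: soft(x + A.T @ (y - A @ x), tau)
    x_prev = np.zeros(A.shape[1])
    x = gamma(x_prev)  # first iterate: one plain IST step
    for _ in range(n_iters):
        x, x_prev = ((1 - alpha) * x_prev + (alpha - beta) * x
                     + beta * gamma(x)), x
    return x
```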
Accelerating IST: TwIST (II)
A one-step method is recovered for $\alpha = 1$, which is an over-relaxed version of the original IST.

For the optimal choice of $\alpha$ and $\beta$, the number of iterations needed to decrease the error by a factor of 10 scales as $O(1/\sqrt{\xi_1})$ for TwIST, versus $O(1/\xi_1)$ for IST.

Example: with $\xi_1 = 10^{-4}$, that is on the order of $10^2$ TwIST iterations versus $10^4$ IST iterations.
Another two-step method (FISTA) was recently proposed in [Beck and Teboulle, 2008].
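For reference, a sketch of that method (FISTA): an IST step taken from an extrapolated point, with the standard $t_k$ momentum sequence; soft() is again the $\ell_1$ shrinkage from above, and $\alpha \ge \|A^TA\|_2$ is assumed:

```python
import numpy as np

def fista(A, y, tau, alpha, n_iters=500):
    x = np.zeros(A.shape[1])
    z, t = x.copy(), 1.0
    for _ in range(n_iters):
        x_new = soft(z + (A.T @ (y - A @ z)) / alpha, tau / alpha)
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        z = x_new + ((t - 1.0) / t_new) * (x_new - x)  # momentum extrapolation
        x, t = x_new, t_new
    return x
```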
Accelerating IST: TwIST (III)

[Figure: original image, blurred image (9×9 blur, 40 dB noise), and restored image; plots of the objective function and SNR versus iterations (0 to 1000) for IST, over-relaxed IST, and TwIST.]
Accelerating IST: The SpaRSA Algorithmic Framework

Initialization: choose $\eta > 1$, $0 < \alpha_{\min} < \alpha_{\max}$, and $x_0$; set $t = 0$.
repeat:
    choose $\alpha_t \in [\alpha_{\min}, \alpha_{\max}]$
    repeat:
        $x_{t+1} = \Psi_{\tau/\alpha_t}\bigl(x_t - \frac{1}{\alpha_t}\nabla f(x_t)\bigr)$
        $\alpha_t \leftarrow \eta\,\alpha_t$
    until $x_{t+1}$ satisfies the acceptance criterion
    $t \leftarrow t + 1$
until the stopping criterion is satisfied.
[Wright, Nowak, F., 2008]

Variants of SpaRSA are distinguished by the choices of $\alpha_t$, the acceptance criterion, and the stopping criterion.

Examples: accepting every step with a fixed $\alpha_t = \alpha$ yields standard IST; requiring a decrease of the objective yields monotone SpaRSA.
Choosing $\alpha_t$ for Speed

The Barzilai-Borwein approach: seek to mimic a Newton step, choosing $\alpha_t$ so that $\alpha_t I \approx \nabla^2 f = A^TA$ over the last step; a less conservative choice than in IST.

With a least-squares criterion over the last step $s_t = x_t - x_{t-1}$:

$\alpha_t = \arg\min_\alpha \|\alpha\,s_t - A^TA\,s_t\|_2^2 = \frac{s_t^T A^TA\, s_t}{s_t^T s_t} = \frac{\|A s_t\|_2^2}{\|s_t\|_2^2}$

If $s_t \ne 0$, then $\lambda_{\min}(A^TA) \le \alpha_t \le \lambda_{\max}(A^TA)$ (a Rayleigh quotient).

Alternative rule (SpaRSA-monotone): accept $x_{t+1}$ only if the objective decreases, with $\alpha_t$ increased by the factor $\eta$ until acceptance.
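Putting the framework and the BB rule together, a hedged sketch (the monotone acceptance test and the numeric safeguards are simplifications of the choices discussed above; soft() is the $\ell_1$ shrinkage from earlier):

```python
import numpy as np

def sparsa_bb(A, y, tau, n_iters=200, eta=2.0, alpha_min=1e-12, alpha_max=1e12):
    obj = lambda x: 0.5 * np.sum((y - A @ x) ** 2) + tau * np.sum(np.abs(x))
    x, alpha = np.zeros(A.shape[1]), 1.0
    for _ in range(n_iters):
        grad = A.T @ (A @ x - y)
        while True:
            x_new = soft(x - grad / alpha, tau / alpha)
            if obj(x_new) <= obj(x):             # monotone acceptance criterion
                break
            alpha = min(eta * alpha, alpha_max)  # shorten the step and retry
        s = x_new - x
        x = x_new
        if s @ s > 0:  # BB rule: alpha = ||A s||^2 / ||s||^2, safeguarded
            alpha = np.clip((A @ s) @ (A @ s) / (s @ s), alpha_min, alpha_max)
    return x
```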
Compressed Sensing Experiment
$A$ is a $2^{10} \times 2^{12}$ random (Gaussian) matrix; $x$ has 160 randomly located non-zeros.

Problem: $\min_x \frac{1}{2}\|y - Ax\|_2^2 + \tau\,\|x\|_1$

Compared algorithms: GPSR [F., Nowak, Wright, 2007], FPC [Hale, Yin, Zhang, 2007], l1_ls [Kim, Koh, Lustig, Boyd, Gorinevsky, 2007], TwIST [Bioucas-Dias, F., 2007], and Nesterov's method [Nesterov, 2007].

GPSR and l1_ls are "hardwired" for $\phi = \|\cdot\|_1$.
Non-monotonicity

[Figure: objective function versus iteration for GPSR, SpaRSA, and SpaRSA-monotone, illustrating the non-monotone behavior of SpaRSA.]
Convergence of SpaRSA
Problem: $\min_x f(x) + \tau\,\phi(x)$

Critical point: $\bar x$ is critical if $0 \in \nabla f(\bar x) + \tau\,\partial\phi(\bar x)$.

Criticality is necessary for optimality. If both $f$ and $\phi$ are convex, it is also sufficient.

Safeguarded SpaRSA (S-SpaRSA) [Wright, Nowak, F., 2008]: accept $x_{t+1}$ only if

$F(x_{t+1}) \le \max_{i = \max(t-M,\,0), \dots, t} F(x_i) - \frac{\sigma\,\alpha_t}{2}\|x_{t+1} - x_t\|_2^2$

where $F = f + \tau\phi$ and $\sigma \in (0, 1)$, usually small, e.g., $\sigma = 0.01$.

Let $f$ be Lipschitz continuously differentiable, $\phi$ convex and finite-valued, and $F$ bounded below. Then, all accumulation points of S-SpaRSA are critical points of $F$.
Warm Starting and Continuation
SpaRSA (like GPSR, IST, etc.) is slow for small $\tau$.

SpaRSA (like GPSR and IST) is "warm-startable", i.e., it benefits (a lot) from a good initialization.

Continuation scheme: start with a large $\tau$ and slowly decrease it while tracking the solution.
IST + continuation = fixed point continuation (FPC) [Hale, Yin, Zhang, 2007]
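A sketch of such a continuation loop; `solver` stands for any warm-startable method above (assumed here to accept an initial point `x0`, which the earlier sketches would need as an extra argument), with a geometric schedule as one common choice:

```python
import numpy as np

def continuation(solver, A, y, tau_final, n_stages=10):
    # For tau >= ||A^T y||_inf the solution is exactly zero, so start there.
    tau0 = np.max(np.abs(A.T @ y))
    x = np.zeros(A.shape[1])
    for k in range(1, n_stages + 1):
        tau = tau0 * (tau_final / tau0) ** (k / n_stages)  # geometric decrease
        x = solver(A, y, tau, x0=x)  # warm start from the previous solution
    return x
```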
Continuation Experiment
For $\tau \ge \|A^T y\|_\infty$, the solution is the zero vector.
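A one-line check of this threshold, from the optimality condition and $\partial\|\cdot\|_1(0) = [-1, 1]^n$:

$$x = 0 \text{ minimizes } \tfrac{1}{2}\|y - Ax\|_2^2 + \tau\|x\|_1 \iff 0 \in -A^Ty + \tau\,[-1, 1]^n \iff \|A^Ty\|_\infty \le \tau.$$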
Conclusions
- Reviewed several ways to derive the IST algorithm.

- Reviewed several convergence results for IST.

- Described recent accelerated versions: TwIST and SpaRSA.

- IST and SpaRSA benefit (a lot) from a continuation scheme.

- State-of-the-art performance for a variety of problems: MRI reconstruction (TV and wavelets), MEG imaging, deconvolution, compressed sensing, …