Imaging Inverse Problems
OBJECTIVE: to estimate an unknown image x from an observation y.
FORWARD MODEL:  y = A x + z,  where x ∈ ℝⁿ is the UNKNOWN IMAGE, y ∈ ℝᵐ is the OBSERVATION, A is the LINEAR OPERATOR, and z ~ N(0, σ²I) is the NOISE.
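As a concrete sketch of this forward model, the snippet below simulates a small 1-D deconvolution problem (the dimensions, blur kernel, and noise level are illustrative assumptions, not the poster's experimental setup):

```python
import numpy as np

rng = np.random.default_rng(0)

n = 64                      # signal dimension (n ~ 1e6 in real imaging problems)
x = rng.random(n)           # unknown "image" x in R^n

# A: circulant convolution matrix (uniform blur of width 5), a typical linear operator
kernel = np.zeros(n)
kernel[:5] = 1.0 / 5.0
A = np.stack([np.roll(kernel, i) for i in range(n)])

sigma = 0.01
z = rng.normal(0.0, sigma, size=n)   # Gaussian noise z ~ N(0, sigma^2 I)

y = A @ x + z                        # observation y = A x + z
```

Note that A here is square and invertible in exact arithmetic; the ill-posedness discussed below comes from rank deficiency or small eigenvalues combined with noise.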
Some canonical examples: • image deconvolution • compressive sensing • super-resolution • tomographic reconstruction • inpainting
CHALLENGE: not enough information in y to accurately estimate x.
For example, in many imaging problems y = Ax + z, where the operator A is either:
• RANK-DEFICIENT → not enough equations to solve for x, or
• has SMALL EIGENVALUES → NOISE AMPLIFICATION → unreliable estimates of x.
These are high-dimensional problems with n ~ 10⁶.
REGULARISATION: We can render the problem well-posed by using prior knowledge about the unknown signal x.
How do we choose the regularisation parameter θ? The regularisation parameter controls how much importance we give to prior knowledge and to the observations, depending on how ill-posed the problem is and on the intensity of the noise.
POSSIBLE APPROACHES FOR CHOOSING θ:
NON-BAYESIAN
• Cross-validation → exhaustive search method
• Discrepancy principle
• Stein-based methods → minimise Stein's Unbiased Risk Estimator (SURE), a surrogate of the MSE
• SUGAR → more efficient algorithms that use the gradient of SURE
Limitation: mostly designed for denoising problems; difficult to implement.
BAYESIAN
• Hierarchical → propose a prior for θ and work with the hierarchical model
• Marginalisation → remove θ from the model: p(x|y) = ∫ p(x, θ|y) dθ
  Limitation: only for homogeneous g(x) or cases with known C(θ)
• Empirical Bayes → choose θ by maximising the marginal likelihood p(y|θ)
  Difficulty: p(y|θ) becomes intractable in high-dimensional problems
Our strategy: Empirical Bayes
The challenge is that p(y|θ) is intractable, as it involves solving two integrals in ℝⁿ (to marginalise x and to compute C(θ)). This makes the computation of θ_MLE extremely difficult.
OUR CONTRIBUTION
We propose a stochastic optimisation scheme to compute the maximum marginal likelihood estimator of the regularisation parameter. Novelty: the optimisation is driven by proximal Markov chain Monte Carlo (MCMC) samplers.
We want to find θ that maximises the marginal likelihood p(y|θ):
θ_MLE = argmax_{θ ∈ Θ} p(y|θ),  with  p(y|θ) = ∫_{ℝⁿ} p(y, x | θ) dx
We should be able to reliably estimate θ from y, as y is very high-dimensional and θ is a scalar parameter.
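In general p(y|θ) has no closed form, but in a toy conjugate model it does, which makes θ_MLE easy to verify. Assuming a hypothetical scalar model x ~ N(0, 1/θ) and y = x + z with z ~ N(0, σ²), marginalising x gives p(y|θ) = N(0, 1/θ + σ²):

```python
import numpy as np

sigma, y = 0.5, 2.0          # assumed noise level and (scalar) observation

def log_marginal(theta):
    # p(y | theta) = N(0, 1/theta + sigma^2) after marginalising x
    v = 1.0 / theta + sigma**2
    return -0.5 * np.log(2 * np.pi * v) - y**2 / (2 * v)

# Closed form: the MLE of the marginal variance is y^2, so
# theta_MLE = 1 / (y^2 - sigma^2) whenever y^2 > sigma^2.
theta_closed = 1.0 / (y**2 - sigma**2)

# Grid search over Theta = [1e-3, 10] mirrors the argmax over a compact set
thetas = np.linspace(1e-3, 10.0, 200000)
theta_grid = thetas[np.argmax(log_marginal(thetas))]
```

In imaging problems neither the marginal density nor this grid search is available, which is what motivates the stochastic scheme below.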
Proposed stochastic optimisation algorithm
If p(y|θ) were tractable, we could use a standard projected gradient algorithm:
STOCHASTIC APPROXIMATION PROXIMAL GRADIENT (SAPG) ALGORITHM
θ_{t+1} = proj_Θ[ θ_t + δ_t (d/dθ) log p(y|θ_t) ],  where the step sizes δ_t satisfy the usual stochastic approximation conditions (Σ_t δ_t = ∞, Σ_t δ_t² < ∞).
To tackle the intractability, we propose a stochastic variant of this algorithm based on a noisy estimate of (d/dθ) log p(y|θ_t). Using Fisher's identity we have:
(d/dθ) log p(y|θ) = E_{x|y,θ}[ (d/dθ) log p(x, y|θ) ] = E_{x|y,θ}[ −g(x) ] − (d/dθ) log C(θ)

If C(θ) is unknown we can use the identity −(d/dθ) log C(θ) = E_{x|θ}[ g(x) ].

The intractable gradient becomes (d/dθ) log p(y|θ) = E_{x|y,θ}[ −g(x) ] + E_{x|θ}[ g(x) ].

Now we can approximate E_{x|y,θ}[ g(x) ] and E_{x|θ}[ g(x) ] with Monte Carlo estimates.
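The identity −(d/dθ) log C(θ) = E_{x|θ}[g(x)] can be checked numerically in a hypothetical 1-D case with g(x) = |x|, where p(x|θ) is a Laplace density with C(θ) = 2/θ, so both sides equal 1/θ:

```python
import numpy as np

rng = np.random.default_rng(1)
theta = 2.0

# Hypothetical 1-D example: g(x) = |x|, so p(x|theta) ∝ exp(-theta*|x|)
# is Laplace(scale = 1/theta) with C(theta) = 2/theta.
# Identity to check: -d/dtheta log C(theta) = 1/theta = E_{x|theta}[ g(x) ].
samples = rng.laplace(loc=0.0, scale=1.0 / theta, size=200_000)
mc_estimate = np.abs(samples).mean()   # Monte Carlo estimate of E_{x|theta}[|x|]
exact = 1.0 / theta                    # -d/dtheta log C(theta), computed analytically
```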
We construct a Stochastic Approximation Proximal Gradient (SAPG) algorithm driven by two Markov kernels M_θ and K_θ targeting the posterior p(x | y, θ) and the prior p(x | θ) respectively.
X_{t+1} ~ M( X_t, · | y, θ_t )     [posterior kernel]
U_{t+1} ~ K( U_t, · | θ_t )     [prior kernel]
θ_{t+1} = proj_Θ[ θ_t + δ_t ( g(U_{t+1}) − g(X_{t+1}) ) ]
The iterates CONVERGE JOINTLY TO:
X_t → p(x | y, θ_MLE)   [EMPIRICAL BAYES POSTERIOR]
U_t → p(x | θ_MLE)
θ_t → θ_MLE   [via the STOCHASTIC GRADIENT updates]
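A minimal sketch of the SAPG iteration on a toy conjugate model (an assumption for illustration: g(x) = x²/2, so both the prior and posterior are Gaussians that can be sampled exactly, standing in for the MYULA kernels used in practice):

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy model:  prior p(x|theta) = N(0, 1/theta)  (i.e. g(x) = x^2 / 2),
#             likelihood y = x + z with z ~ N(0, sigma^2).
sigma, y = 0.5, 2.0
g = lambda x: 0.5 * x**2
theta_mle = 1.0 / (y**2 - sigma**2)      # closed form for this toy model

theta = 1.0                              # initial theta_0
trace = []
for t in range(1, 50001):
    s2 = 1.0 / (theta + 1.0 / sigma**2)  # posterior variance
    m = s2 * y / sigma**2                # posterior mean
    X = rng.normal(m, np.sqrt(s2))            # X_{t+1} ~ p(x | y, theta_t)
    U = rng.normal(0.0, np.sqrt(1.0 / theta)) # U_{t+1} ~ p(x | theta_t)
    delta = 0.1 / t**0.8                 # steps: sum diverges, sum of squares converges
    theta = float(np.clip(theta + delta * (g(U) - g(X)), 1e-3, 10.0))  # proj onto Theta
    trace.append(theta)

theta_hat = np.mean(trace[len(trace) // 2 :])   # average the later iterates
```

Because θ_MLE has the closed form 1/(y² − σ²) here, the sketch can be checked against it; in imaging problems the two exact samplers would be replaced by MYULA chains.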
How do we generate the samples?
MOREAU-YOSIDA UNADJUSTED LANGEVIN ALGORITHM (MYULA)
x_{t+1} = x_t − γ ( ∇f_y(x_t) + (x_t − prox^λ_{θg}(x_t)) / λ ) + √(2γ) z_{t+1}
where (x_t − prox^λ_{θg}(x_t)) / λ = ∇(θg)^λ(x_t), the gradient of the Moreau envelope of θ g.
We use the MYULA algorithm for the Markov kernels K_θ and M_θ because it can handle:
▪ high dimensionality
▪ convex problems with a non-smooth g(x)
▪ The MYULA kernels do not target p(x | y, θ) and p(x | θ) exactly.
▪ Sources of asymptotic bias:
  ▪ discretisation of the Langevin diffusion, controlled by γ and λ
  ▪ smoothing of the non-differentiable g(x), controlled by λ
▪ γ and λ must be smaller than the inverse of the Lipschitz constant of the gradient driving each diffusion, respectively.
▪ More information about how to select each parameter can be found in [2].
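A minimal sketch of one MYULA chain (not the authors' implementation): it assumes a denoising-type likelihood f_y(x) = ‖y − x‖²/(2σ²) and g(x) = ‖x‖₁, whose proximal operator is soft-thresholding; all parameter values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)

sigma, theta, lam, n = 0.5, 5.0, 0.1, 100   # assumed parameters
y = rng.normal(0.0, 1.0, size=n)            # synthetic observation

grad_f = lambda x: (x - y) / sigma**2       # gradient of f_y(x) = ||y - x||^2 / (2 sigma^2)

def prox_theta_g(x, lam, theta):
    # prox of lam * theta * ||.||_1 : soft-thresholding
    return np.sign(x) * np.maximum(np.abs(x) - lam * theta, 0.0)

# Step size gamma below 1/L, with L = 1/sigma^2 + 1/lam (Lipschitz constant of the drift)
L = 1.0 / sigma**2 + 1.0 / lam
gamma = 0.9 / L

x = np.zeros(n)
samples = []
for t in range(5000):
    moreau_grad = (x - prox_theta_g(x, lam, theta)) / lam   # gradient of the Moreau envelope
    x = x - gamma * (grad_f(x) + moreau_grad) + np.sqrt(2 * gamma) * rng.normal(size=n)
    samples.append(x.copy())
```

The chain's sample mean should be strongly shrunk towards zero relative to y, as the ℓ₁ term dictates.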
WHERE DOES MYULA COME FROM?
LANGEVIN DIFFUSION
dX_t = (1/2) ∇ log p(X_t | y) dt + dW_t,  where W_t is a Brownian motion and the target density is p(x | y, θ) ∝ exp{ −f_y(x) − θ g(x) }.
UNADJUSTED LANGEVIN ALGORITHM (ULA)
Applying the Euler–Maruyama approximation (dX_t ≈ x_{t+1} − x_t, dt ≈ γ) gives
x_{t+1} = x_t + γ ∇ log p(x_t | y, θ) + √(2γ) z_{t+1},  with z_{t+1} ~ N(0, I).
If g(x) is not Lipschitz differentiable, ULA is unstable.
The MYULA algorithm uses Moreau–Yosida regularisation to replace the non-smooth term g(x) with its Moreau envelope g^λ(x), with the amount of smoothing controlled by λ.
[Figure: g(x) = |x| and its Moreau envelope for several values of λ.]
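To see what Moreau–Yosida smoothing does, the check below compares a brute-force Moreau envelope of the scalar g(x) = |x| (an illustrative choice) with its known closed form, the Huber function: x²/(2λ) for |x| ≤ λ and |x| − λ/2 otherwise:

```python
import numpy as np

lam = 0.5
xs = np.linspace(-3.0, 3.0, 601)

def moreau_env_abs(x, lam):
    # g^lam(x) = min_u ( |u| + (x - u)^2 / (2 lam) ), computed on a fine grid of u
    us = np.linspace(-4.0, 4.0, 8001)
    return np.min(np.abs(us) + (x - us) ** 2 / (2 * lam))

numeric = np.array([moreau_env_abs(x, lam) for x in xs])

# Closed form: the Huber function with threshold lam
huber = np.where(np.abs(xs) <= lam, xs**2 / (2 * lam), np.abs(xs) - lam / 2)
```

The envelope is differentiable everywhere and lies below |x|, which is exactly what makes the MYULA drift Lipschitz.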
Results
We illustrate the proposed methodology with an image deblurring problem using a total-variation prior.
We compare our results with those obtained with:
▪ SUGAR
▪ marginal maximum-a-posteriori estimation
▪ the optimal or oracle value θ*
EVOLUTION OF θ THROUGHOUT ITERATIONS
BOAT IMAGE
QUANTITATIVE COMPARISON
For each image, noise level, and method, we compute the MAP estimator; the table reports the average results over the 6 test images.
✓ Generally outperforms the other approaches.
• EB performs remarkably well at low SNR values.
• The longer computing time is only justified if marginalisation can't be used.
• SUGAR performs poorly when the main problem is not noise.
METHOD COMPARISON (close-up for SNR = 20 dB)
Over-smoothing: too much regularisation. Ringing: not enough regularisation.
[Figure panels: prior, data, degraded image, and empirical Bayes MAP estimator.]
SNR = 40 dB
Future work
▪ Detailed analysis of the convergence properties.
▪ Extending the methodology to problems involving multiple unknown regularisation parameters.
▪ Reducing computing times by accelerating the Markov kernels driving the stochastic approximation algorithm:
  ▪ via parallel computing
  ▪ by choosing a faster Markov kernel
▪ Adapting the algorithm to support some non-convex problems.
▪ Using the samples from p(x | y, θ_MLE) to perform uncertainty quantification.
References
[1] Marcelo Pereyra, José M. Bioucas-Dias, and Mário A. T. Figueiredo, "Maximum-a-posteriori estimation with unknown regularisation parameters," in 23rd European Signal Processing Conference (EUSIPCO), IEEE, pp. 230–234, 2015.
[2] Alain Durmus, Éric Moulines, and Marcelo Pereyra, "Efficient Bayesian computation by proximal Markov chain Monte Carlo: when Langevin meets Moreau," SIAM Journal on Imaging Sciences, vol. 11, no. 1, pp. 473–506, 2018.
[3] Ana Fernandez Vidal and Marcelo Pereyra, "Maximum likelihood estimation of regularisation parameters," in 25th IEEE International Conference on Image Processing (ICIP), IEEE, pp. 1742–1746, 2018.
MAXIMUM LIKELIHOOD ESTIMATION OF REGULARISATION PARAMETERS
Ana Fernandez Vidal, Dr. Marcelo Pereyra ([email protected], [email protected])
School of Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh EH14 4AS, United Kingdom
The observation y is related to the unknown x by a statistical model, with both treated as random variables. We use a prior density to reduce uncertainty about x and deliver accurate estimates.
The Bayesian Framework
g(x) penalises undesired properties, and the regularisation parameter θ controls the intensity of the penalisation.
Example: random vector x representing astronomical images.
p(x|θ) = e^{−θ g(x)} / C(θ),  where θ is the HYPERPARAMETER and C(θ) the normalising constant.
PRIOR INFORMATION
We model the unknown image x as a random vector with prior distribution p(x|θ) promoting desired properties of x.
OBSERVED DATA
The observation y is related to x by a statistical model, the LIKELIHOOD:  p(y|x) ∝ e^{−f_y(x)}
POSTERIOR DISTRIBUTION
Observed and prior information are combined by using Bayes' theorem:
p(x | y, θ) = p(y|x) p(x|θ) / p(y|θ) ∝ e^{−( f_y(x) + θ g(x) )}
The predominant Bayesian approach in imaging is MAP estimation, which can be computed very efficiently by convex optimisation:
x̂_MAP = argmax_{x ∈ ℝⁿ} p(x | y, θ) = argmin_{x ∈ ℝⁿ} f_y(x) + θ g(x)
We will focus on convex problems where:
- g(x) is lower semicontinuous, proper, and possibly non-smooth
- f_y(x) is Lipschitz differentiable with Lipschitz constant L
f_y(x) = ‖y − Ax‖² / (2σ²),  corresponding to y ~ N(Ax, σ²I)
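The MAP problem above can be solved by a proximal gradient (forward-backward) iteration. A minimal sketch, with hypothetical sizes and g(x) = ‖x‖₁ in place of TV so that the proximal step is a simple soft-threshold:

```python
import numpy as np

rng = np.random.default_rng(4)

n, sigma, theta = 50, 0.1, 1.0
A = rng.normal(size=(n, n)) / np.sqrt(n)
x_true = np.zeros(n); x_true[::10] = 1.0          # sparse ground truth
y = A @ x_true + rng.normal(0.0, sigma, size=n)

grad_f = lambda x: A.T @ (A @ x - y) / sigma**2   # gradient of f_y
L = np.linalg.norm(A, 2) ** 2 / sigma**2          # Lipschitz constant of grad f_y
step = 1.0 / L

x = np.zeros(n)
for _ in range(2000):
    v = x - step * grad_f(x)                      # forward (gradient) step on f_y
    x = np.sign(v) * np.maximum(np.abs(v) - step * theta, 0.0)  # prox of step*theta*||.||_1
```

With step = 1/L the objective f_y(x) + θ‖x‖₁ decreases monotonically, which is why MAP estimation is so cheap compared with sampling.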
Conclusions
▪ We presented an empirical Bayesian method to estimate regularisation parameters in convex imaging inverse problems.
▪ We address an intractable maximum marginal likelihood estimation problem by proposing a stochastic optimisation algorithm.
▪ The stochastic approximation is driven by two proximal MCMC kernels which can handle non-smooth regularisers efficiently.
▪ Our algorithm was illustrated with non-blind image deconvolution with a TV prior, where it:
  ✓ achieved close-to-optimal performance
  ✓ outperformed other state-of-the-art approaches in terms of MSE
  ✗ had longer computing times.
▪ More details can be found in [3].
EXPERIMENT DETAILS
• Recover x from a blurred noisy observation y, with y = Ax + z and z ~ N(0, σ²I)
• A is a circulant uniform blurring matrix of size 9x9 pixels
• g(x) = TV(x), the isotropic total-variation pseudo-norm
• We use 6 test images of size 512x512 pixels
• We compute the MAP estimator for each image using the different values of θ obtained with the different methods
• We compare methods by taking the average over the 6 images of the mean squared error (MSE) and the computing time (in minutes)
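The isotropic total-variation pseudo-norm used as g(x) can be computed with forward differences; this is a standard discretisation, and the poster's exact convention may differ:

```python
import numpy as np

def tv_iso(img):
    # Isotropic TV: sum over pixels of sqrt(dx^2 + dy^2) of forward differences
    dx = np.diff(img, axis=1)
    dy = np.diff(img, axis=0)
    # zero-pad so both difference fields keep the image's shape
    dx = np.pad(dx, ((0, 0), (0, 1)))
    dy = np.pad(dy, ((0, 1), (0, 0)))
    return np.sqrt(dx**2 + dy**2).sum()

flat = np.ones((8, 8))          # constant image: TV = 0
step_img = np.zeros((8, 8))
step_img[:, 4:] = 1.0           # one vertical unit-height edge: TV = 8
```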