
SOLUTION OF ILL-POSED INVERSE PROBLEMS PERTAINING TO SIGNAL RESTORATION: UNDERSTANDING NOISE IN THE BASIS

Rosemary Renaut, http://math.asu.edu/~rosie

EDINBURGH

APRIL 12, 2012

Outline

- Background: Examples of ill-posed problems; Motivation: total variation solutions
- The SVD solution: Importance of the basis; Picard condition for ill-posed problems
- Generalized regularization: GSVD for examining the solution; Revealing the noise in the GSVD basis; Stabilizing the GSVD solution
- Conclusions and future work
- Regularization parameter estimation for TV

Illustration: Blurred Signal Restoration

b(t) = ∫_{−π}^{π} h(t, s) x(s) ds,  h(s) = (1/(σ√(2π))) exp(−s²/(2σ²)),  σ > 0.

Forward Problem: Given x, A calculate b

- The source x is sampled at the s_j; b(t) is measured at the t_i.
- The kernel h determines the matrix A: the integral is approximated using numerical quadrature (weights w_j), yielding a matrix A (a minimal discretization is sketched below).
- The matrix equation b = Ax describes the linear relationship.
- Generally ∫_a^b h(x) dx = 1 (for a PSF no energy is introduced, the signal is only spread), which leads to the normalization ∑_j w_j h(x_j) = 1.
- When h(s, t) = h(t − s) the kernel is spatially invariant.
- A typical choice of h is the Gaussian blur, a low-pass filter. We are also interested in the general integral equation problem.
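As a concrete illustration, here is a minimal sketch (Python/NumPy, not from the original slides) of the midpoint-quadrature discretization of the Gaussian blur model above; the grid size, σ, and the box test signal are illustrative assumptions.

```python
import numpy as np

def gaussian_blur_matrix(n=200, sigma=0.05):
    """Midpoint quadrature for b(t) = int_{-pi}^{pi} h(t - s) x(s) ds."""
    s = -np.pi + (np.arange(n) + 0.5) * (2 * np.pi / n)   # quadrature midpoints
    w = 2 * np.pi / n                                     # quadrature weight
    T, S = np.meshgrid(s, s, indexing="ij")               # t_i down rows, s_j across
    h = np.exp(-(T - S) ** 2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
    return w * h, s

A, s = gaussian_blur_matrix()
x = (np.abs(s) < 1.0).astype(float)   # a box signal with sharp edges
b = A @ x                             # forward problem: given x and A, calculate b
```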


Inverse Problem: Given A, b find x

The solution depends on the conditioning of A and on the noise in the measurements b. The condition number of A is 1.8679e+05. Restoration with noise 0.0001: x = A^{-1} b (almost committing the inverse crime).

Example of restoration of 2D image with low noise

Figure: Original

Figure: Noisy and blurred, noise level 10^{-5}

Figure: Restored, noise level 10^{-5} (inverse crime)

Figure: Noisy and blurred, noise level 10^{-3}

Figure: Restored, noise level 10^{-3}

Summary

Goals: Given data b and kernel h, find x.

Features: One may wish to extract features from x, for example the edges in the signal.

Difficulties: The solution is very sensitive to the data.

Ill-posed (according to Hadamard): A problem is ill-posed if it does not satisfy the conditions for well-posedness, i.e. if
1. b ∉ range(A), or
2. the inverse is not unique because more than one image is mapped to the same data, or
3. an arbitrarily small change in b can cause an arbitrarily large change in x.

Required: Analyse the formulation / adapt solution techniques.

Tikhonov regularization (a numerical sketch follows below):

x_Tik(λ) = argmin_x {‖b − Ax‖² + λ²‖Lx‖²}
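One standard way to compute x_Tik(λ) is to solve the equivalent augmented least squares problem min_x ‖[A; λL]x − [b; 0]‖₂. A minimal sketch, assuming a dense A and defaulting to L = I:

```python
import numpy as np

def tikhonov(A, b, lam, L=None):
    """Solve min_x ||b - A x||^2 + lam^2 ||L x||^2 via the augmented system."""
    n = A.shape[1]
    L = np.eye(n) if L is None else L
    A_aug = np.vstack([A, lam * L])
    b_aug = np.concatenate([b, np.zeros(L.shape[0])])
    return np.linalg.lstsq(A_aug, b_aug, rcond=None)[0]
```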


Tikhonov Regularized Solutions x(λ) and derivative Lx for changing λ

Figure: We cannot capture Lx (red) from the solution (green). Notice that ‖Lx‖ decreases as λ increases.


Total Variation Regularization

A more general regularization term R(x) may be considered to better preserve properties of the solution x:

x(λ) = argmin_x {‖Ax − b‖²_W + λ² R(x)}

Suppose R is the total variation of x (more general options are possible). The total variation for a function x defined on a discrete grid is

TV(x(s)) = ∑_i |x(s_i) − x(s_{i−1})| ≈ Δ ∑_i |dx(s_i)/ds|

TV approximates a scaled sum of the magnitudes of the jumps in x; Δ is a scale factor dependent on the grid size. Notice TV(x(s)) ≈ Δ‖Lx‖₁, so we solve (a one-line computation is sketched below)

x(λ) = argmin_x {‖Ax − b‖₂² + λ²‖Lx‖₁}

This is problematic for large-scale problems.
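On a one-dimensional grid the discrete TV term is just the 1-norm of the forward differences. A one-line sketch (the grid factor Δ is absorbed into L here):

```python
import numpy as np

def tv(x):
    """Discrete total variation: sum of absolute jumps, i.e. ||L1 x||_1
    for the forward-difference matrix L1 (grid factor absorbed)."""
    return np.sum(np.abs(np.diff(x)))
```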


Split Bregman for TV

The main reference is here, with software:

T. Goldstein, Split Bregman Software Page, http://www.stanford.edu/~tagoldst/Tom_Goldstein/Split_Bregman.html

T. Goldstein and S. Osher, The Split Bregman Method for L1-Regularized Problems, SIAM J. Imaging Sci. 2, 2 (2009), 323-343.

Many developments have been made since that time:

S. Setzer, Operator Splittings, Bregman Methods and Frame Shrinkage in Image Processing, International Journal of Computer Vision, in press, 2011.

The above reference discusses the relationship of SB to the Augmented Lagrangian and Peaceman-Rachford alternating direction methods.

SB: The Main Idea of the GO Paper, for Regularization R(x)

For R(x) = ‖Lx‖₁ with L ∈ R^{q×n}: introduce d = Lx.

Rewrite R(x) as (λ²/2)‖d − Lx‖₂² + μ‖d‖₁ and restate the problem as

(x, d) = argmin_{x,d} {(1/2)‖Ax − b‖₂² + (λ²/2)‖d − Lx‖₂² + μ‖d‖₁}

Solve using an alternating minimization which separates the minimization for d from that for x. Various versions of the iteration can be defined; one is (a sketch follows below)

S1: x^{(k+1)} = argmin_x {(1/2)‖Ax − b‖₂² + (λ²/2)‖Lx − (d^{(k)} − g^{(k)})‖₂²}   (1)

S2: d^{(k+1)} = argmin_d {(λ²/2)‖d − (Lx^{(k+1)} + g^{(k)})‖₂² + μ‖d‖₁}   (2)

S3: g^{(k+1)} = g^{(k)} + Lx^{(k+1)} − d^{(k+1)}.   (3)
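A minimal sketch of the S1-S3 iteration in Python/NumPy. The direct normal-equations solve for S1, the fixed iteration count, and the parameter values a caller would pass are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def shrink(c, kappa):
    """Soft thresholding: minimizer of kappa*||d||_1 + (1/2)||d - c||_2^2."""
    return np.sign(c) * np.maximum(np.abs(c) - kappa, 0.0)

def split_bregman(A, b, L, lam, mu, n_iter=50):
    """One sketch of the S1-S3 split Bregman iteration."""
    d = np.zeros(L.shape[0])
    g = np.zeros(L.shape[0])
    M = A.T @ A + lam**2 * (L.T @ L)          # normal-equations matrix for S1
    for _ in range(n_iter):
        rhs = A.T @ b + lam**2 * (L.T @ (d - g))
        x = np.linalg.solve(M, rhs)           # S1: Tikhonov-type solve
        d = shrink(L @ x + g, mu / lam**2)    # S2: componentwise shrinkage
        g = g + L @ x - d                     # S3: Bregman update
    return x
```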


SB Unconstrained algorithm:

Update for x:

x = argmin_x {(1/2)‖Ax − b‖₂² + (λ²/2)‖Lx − h‖₂²},  h = d − g

This is a standard least squares update using a Tikhonov regularizer. It uses the standard basis, but with weights perturbed through h:

x^{(k+1)} = ∑_{i=1}^{p} [(υ_i u_i^T b + λ²μ_i v_i^T h^{(k)}) / (υ_i² + λ²μ_i²)] z̃_i + ∑_{i=p+1}^{n} (u_i^T b) z̃_i

Update for d:

d = argmin_d {μ‖d‖₁ + (λ²/2)‖d − c‖₂²},  c = Lx + g
  = argmin_d {‖d‖₁ + (γ/2)‖d − c‖₂²},  γ = λ²/μ.

To obtain an effective solution we need to find the parameters λ, μ. (A closed-form check of the d update is sketched below.)
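The d update has the well-known closed form of componentwise soft thresholding with threshold 1/γ. A small numerical check (with assumed test values) against direct minimization of one component:

```python
import numpy as np

gamma, c = 4.0, np.array([-2.0, 0.1, 0.5, 3.0])
d_closed = np.sign(c) * np.maximum(np.abs(c) - 1.0 / gamma, 0.0)

# Brute-force check of the first component on a fine grid.
t = np.linspace(-5, 5, 200001)
obj = np.abs(t) + 0.5 * gamma * (t - c[0]) ** 2
assert abs(t[np.argmin(obj)] - d_closed[0]) < 1e-3
```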


The Situation

1. For the inverse problem we need regularization.
2. For feature extraction we need more than Tikhonov regularization, e.g. TV.
3. Both techniques are parameter dependent.
4. Notice the dependence of the TV solution on Tikhonov solutions.
5. The TV iteration runs over many Tikhonov solutions.
6. Moreover, the parameters are needed.
7. We need to fully understand Tikhonov regularization and ill-posed problems.

Spectral Decomposition of the Solution: The SVD

Consider the general overdetermined discrete problem

Ax = b,  A ∈ R^{m×n}, b ∈ R^m, x ∈ R^n, m ≥ n.

The singular value decomposition (SVD) of A (full column rank)

A = UΣV^T = ∑_{i=1}^{n} u_i σ_i v_i^T,  Σ = diag(σ_1, …, σ_n),

gives an expansion for the solution (sketched below)

x = ∑_{i=1}^{n} (u_i^T b / σ_i) v_i.

Here u_i, v_i are the left and right singular vectors of A. The solution is a weighted linear combination of the basis vectors v_i.
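A sketch of this expansion in Python/NumPy, equivalent to x = VΣ^{-1}U^T b for a full column rank A:

```python
import numpy as np

def svd_solve(A, b):
    """Naive inverse solution x = sum_i (u_i^T b / sigma_i) v_i."""
    U, svals, Vt = np.linalg.svd(A, full_matrices=False)
    return Vt.T @ ((U.T @ b) / svals)   # weighted combination of the v_i
```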


Illustration: Example of the basis vectors

Problem wing, n = 84: evaluating the impact of higher precision. Cond(A) = 2.2056e+19. From Hansen's Regularization Toolbox.

h(s, t) = s e^{−t s²},  x(s) = 1 for 1/3 < s < 2/3 (0 otherwise),  b(t) = (e^{−t/9} − e^{−4t/9}) / (2t)

Left Singular Vectors and Basis Depend on the kernel (matrix A)

Figure: The first few left singular vectors u_i and basis vectors v_i. Are the results correct?

Use Matlab High Precision to examine the SVD

- Matlab's digits allows high precision; the standard setting is 32.
- The Symbolic Toolbox allows operations on high precision variables with vpa.
- The SVD of vpa variables calculates the singular values symbolically, but not the singular vectors.
- Higher accuracy for the singular values generates higher accuracy singular vectors.
- Solutions with high precision can take advantage of Matlab's Symbolic Toolbox.

Left Singular Vectors and Basis Calculated in High Precision

Figure: The first few left singular vectors u_i and basis vectors v_i. Apparently with higher precision we preserve the frequency content of the basis. How many can we use in the solution for x?

Power Spectrum for Detecting White Noise: A Time Series Analysis Technique

Suppose that a given vector y is a time series indexed by position, i.e. by the index i.

Diagnostic 1: Does the histogram of the entries of y look consistent with y ∼ N(0, 1), i.e. independent normally distributed entries with mean 0 and variance 1? It is not practical to automatically look at a histogram and make an assessment.

Diagnostic 2: Test the expectation that the y_i are selected from a white noise time series. Take the Fourier transform of y and form the cumulative periodogram z from the power spectrum c:

c_j = |(dft(y))_j|²,  z_j = ∑_{i=1}^{j} c_i / ∑_{i=1}^{q} c_i,  j = 1, …, q.

This is automatic: test whether the line (z_j, j/q) is close to a straight line with slope 1 and length √5/2. (A sketch is given below.)
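A sketch of Diagnostic 2 in Python/NumPy. The maximum-deviation statistic returned at the end is one simple choice of deviation measure, an assumption rather than the talk's exact test:

```python
import numpy as np

def cumulative_periodogram(y):
    """z_j = sum_{i<=j} c_i / sum_i c_i from the power spectrum c (DC excluded)."""
    q = len(y) // 2                              # frequencies up to Nyquist
    c = np.abs(np.fft.fft(y)[1:q + 1]) ** 2
    return np.cumsum(c) / np.sum(c)

def deviation_from_white(y):
    """Maximum deviation of (j/q, z_j) from the slope-one white-noise line."""
    z = cumulative_periodogram(y)
    q = len(z)
    return np.max(np.abs(z - np.arange(1, q + 1) / q))
```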


Cumulative Periodogram for the left singular vectors

Figure: Standard precision on the left and high precision on the right. On the left more vectors are close to white than on the right; vectors are white if the CP is close to the diagonal of the plot. Low frequency vectors lie above the diagonal and high frequency vectors below it; white noise follows the diagonal.

Cumulative Periodogram for the basis vectors

Figure: Standard precision on the left and high precision on the right. On the left more vectors are close to white than on the right; if you count, there are 9 vectors with true frequency content on the left.

Measure Deviation from Straight Line: Basis Vectors

Figure: Testing for white noise for the standard precision vectors. Calculate the cumulative periodogram and measure the deviation from the "white noise" line, or assess the proportion of the vector outside the Kolmogorov-Smirnov white noise lines at the 5% confidence level. In this case this suggests that about 9 vectors are noise free.

We cannot expect to use more than 9 vectors in the expansion for x. Additional terms are contaminated by noise, independent of the noise in b.


Discrete Picard condition: examine the weights of the expansion

Recall

x = ∑_{i=1}^{n} (u_i^T b / σ_i) v_i

Here |u_i^T b|/σ_i = O(1).

The ratios are not large, but are the values correct? Considering only the discrete Picard condition does not tell us whether the expansion for the solution is correct.

Discrete Picard condition: examine the weights of the expansion

- High precision calculation of the σ_i shows that they decay exponentially to zero (down to machine precision).
- The Picard condition considers the ratios |u_i^T b|/σ_i for data b; they decay exponentially (down to machine precision).
- Note the ratios in this case are O(1), hence the noise-contaminated basis vectors are not ignored; i.e. to approximate a solution with a discontinuity we need all the basis vectors.
- We may obtain solutions by truncating the SVD (a sketch follows below):

x = ∑_{i=1}^{k} (u_i^T b / σ_i) v_i

- Now the parameter k is a regularization parameter.
- For the given example we know k < 10, independent of the measured data b. We cannot see this from the Picard condition.
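A sketch of the truncated expansion, with k supplied by the diagnostics above:

```python
import numpy as np

def tsvd_solve(A, b, k):
    """Truncated SVD solution: keep only the first k terms of the expansion."""
    U, svals, Vt = np.linalg.svd(A, full_matrices=False)
    return Vt[:k].T @ ((U[:, :k].T @ b) / svals[:k])
```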

The Truncated Solutions (noise free data b)

Figure: Truncated SVD solutions, standard precision |u_i^T b|. Error in the basis contaminates the solution.

Figure: Truncated SVD solutions, VPA calculation |u_i^T b|. The number of terms is not sufficient to represent the solution discontinuity.

Observations

- Even when committing the inverse crime we will not achieve the solution if we cannot approximate the basis correctly.
- We need all the basis vectors which contain the high frequency terms in order to approximate a solution with high frequency components, e.g. edges.
- Reminder: this is independent of the data.
- But it is an indication of an ill-posed problem; here the difficulty exhibits itself in the decomposition of the matrix A.
- We now look at a problem with a smoother solution: what are the issues? Problem shaw from the Regularization Toolbox, defined on the interval [−π/2, π/2] (a rough construction is sketched below):

h(s, t) = (cos(s) + cos(t))² (sin(u)/u)²,  u = π(sin(s) + sin(t))

x(t) = 2 exp(−6(t − 0.8)²) + exp(−2(t + 0.5)²)
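For readers without the Regularization Toolbox, a rough midpoint-quadrature construction of shaw in Python/NumPy; Hansen's routine differs in details, so this is an assumption-laden sketch:

```python
import numpy as np

def shaw(n=84):
    """Midpoint-rule discretization of the shaw kernel on [-pi/2, pi/2]."""
    t = -np.pi / 2 + (np.arange(n) + 0.5) * np.pi / n
    S, T = np.meshgrid(t, t, indexing="ij")
    u = np.pi * (np.sin(S) + np.sin(T))
    with np.errstate(divide="ignore", invalid="ignore"):
        sinc2 = np.where(u == 0.0, 1.0, (np.sin(u) / u) ** 2)
    A = (np.pi / n) * (np.cos(S) + np.cos(T)) ** 2 * sinc2
    x = 2 * np.exp(-6 * (t - 0.8) ** 2) + np.exp(-2 * (t + 0.5) ** 2)
    return A, x, A @ x   # matrix, exact solution, noise free data
```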


Basis vectors for problem shaw

Figure: The first few basis vectors v_i (left) and their white noise content (right).

The Solutions with truncated SVD- problem shaw

Figure: Truncated SVD solutions; the data enters through the coefficients |u_i^T b|. On the left no noise in b, on the right noise 10^{-4}.

In this case the low frequency vectors can represent the solution, but we need to know the regularization parameter k.

Observations from the SVD analysis in presence of noise

- The number of terms k in the TSVD depends on the v_i.
- In practice the measured data are also contaminated by noise e:

x = ∑_{i=1}^{n} (u_i^T(b_exact + e) / σ_i) v_i = x_exact + ∑_{i=1}^{n} (u_i^T e / σ_i) v_i

- Note ‖U^T e‖ = ‖e‖, and ‖U^T b‖ = ‖b‖ is dominated by low frequency terms when A smooths x to give b.
- If e is white noise, we expect the |u_i^T e| to be of similar magnitude for all i (a numerical check is sketched below).
- When σ_i ≪ |u_i^T e| the contribution of the high frequency error is magnified, and with it that of the basis vector v_i.
- The truncated SVD is a special case of spectral filtering:

x_filt = ∑_{i=1}^{n} γ_i (u_i^T b / σ_i) v_i

- Spectral filtering damps the components in the spectral basis, so that noise in the signal is suppressed.
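A small numerical check of these claims: for white noise e the coefficients |u_i^T e| are of similar magnitude at all indices, while the σ_i decay. The smoothing matrix and noise draw here are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
t = np.linspace(-1, 1, n)
A = np.exp(-(t[:, None] - t[None, :]) ** 2 / (2 * 0.05**2))   # a smoothing kernel
U, svals, Vt = np.linalg.svd(A)

e = rng.standard_normal(n)                # white noise
print(np.abs(U.T @ e)[[0, n // 2, -1]])   # |u_i^T e|: similar magnitude everywhere
print(svals[[0, n // 2, -1]])             # sigma_i: decays by many orders
```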


Regularization by Spectral Filtering: This is Tikhonov Regularization

x_Tik = ∑_{i=1}^{n} γ_i (u_i^T b / σ_i) v_i

- Tikhonov regularization: γ_i = σ_i² / (σ_i² + λ²), i = 1, …, n, where λ is the regularization parameter, and the solution is

x_Tik(λ) = argmin_x {‖b − Ax‖² + λ²‖x‖²}

- The choice of λ² impacts the solution (a sketch follows below).
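A sketch of the filtered solution with these Tikhonov filter factors:

```python
import numpy as np

def tikhonov_svd(A, b, lam):
    """Filtered SVD solution with Tikhonov factors gamma_i = s_i^2/(s_i^2 + lam^2)."""
    U, svals, Vt = np.linalg.svd(A, full_matrices=False)
    gamma = svals**2 / (svals**2 + lam**2)
    return Vt.T @ (gamma * (U.T @ b) / svals)
```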

1-D Interesting but Noisy Signal

Blur with a Gaussian and add noise: can we find the solution?

Solution for Increasing λ

Solutions x(λ)


Finding the Optimal Parameter

1. Many options exist.
2. The best approach is debatable.
3. One should use information about the noise if it is available.
4. Parameter estimation is in general somewhat costly.
5. It usually involves finding several solutions.
6. We consider Unbiased Predictive Risk and the χ² method (the latter is very cheap); a UPRE sketch follows below.
7. But it is important to reduce the cost.
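As one concrete option, a sketch of UPRE selection for the Tikhonov filter via the SVD. The functional U(λ) = ‖Ax(λ) − b‖² + 2σ² trace(A(λ)) − mσ², with trace(A(λ)) = ∑ γ_i, is the standard form; the grid of candidate λ and the known noise variance σ² are assumptions:

```python
import numpy as np

def upre_lambda(A, b, sigma2, lambdas):
    """Pick lambda minimizing the UPRE estimate of the predictive risk."""
    m = A.shape[0]
    U, svals, _ = np.linalg.svd(A, full_matrices=False)
    beta = U.T @ b
    best_u, best_lam = np.inf, None
    for lam in lambdas:
        gamma = svals**2 / (svals**2 + lam**2)                # filter factors
        resid2 = np.sum(((1 - gamma) * beta) ** 2) + (b @ b - beta @ beta)
        u = resid2 + 2 * sigma2 * np.sum(gamma) - m * sigma2  # UPRE functional
        if u < best_u:
            best_u, best_lam = u, lam
    return best_lam
```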

Regularizing the TSVD: But we can just use the truncated expansion

The inaccuracy of the basis vectors v_i suggests that one should not just seek the filtered SVD solution but use the stabilized TSVD.

Figure: Two example solutions. Calculating the solution is robust for fewer basis vectors; 1% noise.


Further Observations

1. We can use a limited basis to obtain the solution.
2. We can stabilize the TSVD if needed, and parameter choice methods are cheaper.
3. Calculating the basis to use is data independent: it depends only on the ill-posedness of the system.
4. Removing the noisy vectors still does not permit all good solutions.
5. Clearly high frequency content cannot be represented with a low frequency basis.
6. We need to alter the basis.

Extending the Regularization - Change the basis

Notice that gradients in the solution are smoothed. Consider the more general weighting ‖Lx‖²:

x(λ) = argmin_x {‖Ax − b‖² + λ²‖Lx‖²}

A typical L approximates the first or second order derivative (constructions are sketched below):

L_1 =
[ −1  1
       ⋱  ⋱
          −1  1 ]  ∈ R^{(n−1)×n}

L_2 =
[ 1  −2  1
      ⋱   ⋱   ⋱
          1  −2  1 ]  ∈ R^{(n−2)×n}

Note that neither L_1 nor L_2 is invertible.
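A sketch constructing these operators:

```python
import numpy as np

def first_derivative(n):
    """L1 in R^{(n-1) x n}: rows [-1, 1]."""
    return np.eye(n - 1, n, k=1) - np.eye(n - 1, n)

def second_derivative(n):
    """L2 in R^{(n-2) x n}: rows [1, -2, 1]."""
    return np.eye(n - 2, n) - 2 * np.eye(n - 2, n, k=1) + np.eye(n - 2, n, k=2)
```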


The Generalized Singular Value Decomposition

Introduce a generalization of the SVD to obtain an expansion for

x(λ) = argmin_x {‖Ax − b‖² + λ²‖L(x − x₀)‖²}

Lemma (GSVD). Assume invertibility and m ≥ n ≥ p. There exist unitary matrices U ∈ R^{m×m}, V ∈ R^{p×p}, and a nonsingular matrix Z ∈ R^{n×n} such that

A = U [Υ; 0_{(m−n)×n}] Z^T,  L = V [M, 0_{p×(n−p)}] Z^T,

Υ = diag(υ_1, …, υ_p, 1, …, 1) ∈ R^{n×n},  M = diag(μ_1, …, μ_p) ∈ R^{p×p},

with

0 ≤ υ_1 ≤ ⋯ ≤ υ_p ≤ 1,  1 ≥ μ_1 ≥ ⋯ ≥ μ_p > 0,  υ_i² + μ_i² = 1, i = 1, …, p.

Use Υ̃ and M̃ to denote the rectangular matrices containing Υ and M.

Solution of the Generalized Problem using the GSVD

We can use the GSVD to write down the solution of the generalized problem:

x(λ) = ∑_{i=1}^{p} [υ_i / (υ_i² + λ²μ_i²)] (u_i^T b) z̃_i + ∑_{i=p+1}^{n} (u_i^T b) z̃_i

where z̃_i is the ith column of (Z^T)^{−1}. With generalized singular values ρ_i = υ_i/μ_i, i = 1, …, p,

x(λ) = ∑_{i=1}^{p} γ_i (u_i^T b / υ_i) z̃_i + ∑_{i=p+1}^{n} (u_i^T b) z̃_i,

Lx(λ) = ∑_{i=1}^{p} γ_i (u_i^T b / ρ_i) v_i,  γ_i = ρ_i² / (ρ_i² + λ²).

Notice the similarity with the filtered SVD solution

x(λ) = ∑_{i=1}^{n} γ_i (u_i^T b / σ_i) v_i,  γ_i = σ_i² / (σ_i² + λ²).


What Matrix to Use for the GSVD? Form the truncated matrix A_k using only k basis vectors v_i: above A and below A_k. Problem ilaplace.

Figure: Contrast of the GSVD basis vectors u (left) and z (right); trade-off between U and Z.

Figure: This is confirmed by examining the KS test for white noise.

Example Solution: Problem ilaplace, A above and A_k below

Figure: The solution is robust to using the reduced matrix A_k.

Figure: The solution is robust to truncating the GSVD.

Example Solution: Problem phillips

Figure: The solution is robust to using the reduced matrix A_k.

Figure: The solution is robust to truncating the GSVD.

Summary/Conclusions

- Basis vectors are subject to noise and contaminate the solution, independent of the data.
- Solutions require finding k.
- We can determine k by examining the noise in the basis.
- Using the GSVD we see that Tikhonov regularization yields a basis with smoothed basis vectors.
- But we can apply the same techniques.
- Stabilizing both the TSVD and the TGSVD is robust to parameter estimation.
- Parameter estimation for truncated expansions is cheaper.
- Reminder: the truncation level k is independent of the data b.


Extensions

- Truncating the basis means that we will not see high frequencies in the solutions.
- We cannot represent edges or steep gradients.
- We need to use total variation or alternative regularizers for feature enhancement.
- It will still be important to analyze the noise.
- We use the same ideas to further analyze TV solutions: work in progress.


Picard Condition for the GSVD: For x and Lx Examine the Weights in the Expansion

Figure: Weights for the expansion, λ = 0.0001: the weights blow up together.

Figure: Weights for the expansion, λ = 0.05: the weights separate for low frequencies.

Figure: Weights for the expansion, λ = 5.

Notice that for L_1 the υ_i are small except for large i, i.e. μ_i ≈ 1 dominates for small i.

Illustrating Total Variation Solutions for Noise Level 0.1 in the Data

Figure: λ = 0.1, γ = 0.5: the final h is too large relative to b.

Figure: λ = 1, γ = 0.5: the final h balances b.

Figure: λ = 10, γ = 0.5: b dominates and the solution is over smoothed.

The impact of γ:

Figure: λ = 0.1; γ = 0.5 on the left and γ = 5 on the right.

Observations

The advantage of TV is clear for constant components. But the solutions still depend on the parameters: we need to find both λ and μ. We may use standard parameter estimation to find λ (for updating x), and we can use reweighting for μ (for updating d).


Exploiting the GSVD for analysis

x^{(1)} = ∑_{i=1}^{p} (φ_i/υ_i)(u_i^T b) z_i + ∑_{i=p+1}^{n} (u_i^T b) z_i,

x^{(k+1)} = ∑_{i=1}^{p} [(φ_i/υ_i) u_i^T b + ((1 − φ_i)/μ_i) v_i^T h^{(k)}] z_i + ∑_{i=p+1}^{n} (u_i^T b) z_i   (4)

Lx^{(k+1)} = ∑_{i=1}^{p} [(φ_i μ_i/υ_i)(u_i^T b) + (1 − φ_i) v_i^T h^{(k)}] v_i.   (5)

Notice in this case that we must also control the coefficients for the terms with h. The relevant coefficients are

φ_i/υ_i,  φ_i μ_i/υ_i,  (1 − φ_i)/μ_i,  (1 − φ_i).

GSVD Coefficients for the Problem

Figure: λ = 0.001

Figure: λ = 0.01

Figure: λ = 1

Figure: λ = 10

Of course the coefficients are independent of the data.

Picard Condition Using the GSVD for Noise Level 0.1

But h changes with the iteration.


Further Observation

- For regularization in general, the choice of λ depends on the right hand side vector.
- In this case h changes at each step.
- It is clear that we should update λ at each step.
- We use standard approaches: Unbiased Predictive Risk Estimation and the L-curve.
- We also use an iteratively reweighted norm approach for the d update.

Example Solution: 1D

Figure: γ = 200, low noise. Without updating λ (left) and updated (right). SB UPRE uses the λ estimated by UPRE for all SB steps; Update SB updates λ at each step; SB IRN updates λ and iteratively reweights ‖d‖₁.

Figure: γ = 200, high noise; the same comparison as above.

Example Solution: 2D - similar blurring operator

Figure: γ = 5

SB with updated λ is useful


Further Observations and Future Work

The results demonstrate that basic analysis of the problem is worthwhile.

Parameter estimation from basic least squares can be used to find an appropriate parameter.

Questions that may be raised concern the cost of finding the optimal λ:
- The overhead of finding the optimal λ for the first step is reasonable.
- The overhead of subsequent steps: UPRE requires a matrix trace, but for deblurring we can use results about the spectrum of Toeplitz matrices.

Future: Implement using the Toeplitz operators.

Extensions: Implement using statistical estimation with the χ² approach, which takes account of the covariance of h. Convergence testing is based on h.

See me for extensive references to the literature.

THANK YOU
