
Linear and Nonlinear Optimization

Islam S. M. Khalil

German University in Cairo

October 10, 2016

Islam S. M. Khalil Gradient Descent and Levenberg-Marquardt Methods

Outline

Introduction

Gradient descent method

Gauss-Newton method

Levenberg-Marquardt method

Case study: Straight lines have to be straight


Introduction

Optimization is used to compensate for the radial distortion in this image

Figure: Image subjected to a radial distortion.


Introduction

In fitting a function $\hat{f}(x)$ with a vector of parameters $x$ to a set of $m$ data points $f$, it is convenient to minimize the sum of the weighted squares of the errors between the measured data ($f$) and the curve-fit function ($\hat{f}(x)$):

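\begin{align}
e^2(x) &= \frac{1}{2} \sum_{i=1}^{m} \left( \frac{f_i - \hat{f}_i(x)}{w_i} \right)^2 \tag{1} \\
&= \frac{1}{2} (f - \hat{f}(x))^T W (f - \hat{f}(x)) \tag{2} \\
&= \frac{1}{2} f^T W f - f^T W \hat{f}(x) + \frac{1}{2} \hat{f}^T(x) W \hat{f}(x), \tag{3}
\end{align}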

where $W$ is a diagonal weighting matrix with $W_{ii} = 1/w_i^2$.
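
To make the equivalence of Eqs. (1) and (2) concrete, the following minimal sketch evaluates the objective both ways. The function names and the numbers are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def weighted_sse(f, f_hat, w):
    """Eq. (1): e^2 = 1/2 * sum(((f_i - f_hat_i) / w_i)^2)."""
    r = (np.asarray(f, float) - np.asarray(f_hat, float)) / np.asarray(w, float)
    return 0.5 * float(r @ r)

def weighted_sse_matrix(f, f_hat, w):
    """Eq. (2): the same quantity with W = diag(1 / w_i^2)."""
    f, f_hat, w = (np.asarray(a, float) for a in (f, f_hat, w))
    W = np.diag(1.0 / w**2)
    r = f - f_hat
    return 0.5 * float(r @ W @ r)

# Made-up illustrative data (assumption, not from the slides).
f = [1.0, 2.0, 3.0]        # measured data
f_hat = [1.1, 1.8, 3.3]    # model predictions
w = [0.1, 0.2, 0.3]        # per-point weights w_i

e2_sum = weighted_sse(f, f_hat, w)
e2_mat = weighted_sse_matrix(f, f_hat, w)
print(e2_sum, e2_mat)      # both forms give the same value
```

Each scaled residual here has magnitude one, so both forms evaluate to roughly 1.5.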


Introduction

The function $\hat{f}$ is nonlinear in the model parameters $x$. Therefore, the minimization of $e^2$ with respect to the parameters $x$ must be done iteratively.

The goal of each iteration is to find a perturbation $h$ to the parameters $x$ that reduces $e^2$.

We can use three methods: the gradient descent method, the Gauss-Newton method, and the Levenberg-Marquardt method.


Gradient Descent Method

The steepest descent method is a general minimization method which updates parameter values in the direction opposite to the gradient of the objective function. The gradient of $e^2$ with respect to the parameters is

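\begin{align}
\frac{\partial e^2(x)}{\partial x} &= (f - \hat{f}(x))^T W \frac{\partial}{\partial x} (f - \hat{f}(x)) \tag{4} \\
&= -(f - \hat{f}(x))^T W \left[ \frac{\partial \hat{f}(x)}{\partial x} \right] \tag{5} \\
&= -(f - \hat{f}(x))^T W J, \tag{6}
\end{align}

where $J = \partial \hat{f}(x) / \partial x$ is the Jacobian matrix of the model with respect to the parameters.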

The perturbation $h$ that moves the parameters in the direction of steepest descent is given by

$$ h_{gd} = \alpha J^T W (f - \hat{f}(x)), \tag{7} $$

where $\alpha$ is a positive scalar that determines the length of the step in the steepest descent direction.
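
As a rough illustration of Eq. (7), the sketch below takes one gradient descent step on a toy straight-line model. The model, the data, and the step length `alpha=0.05` are assumptions chosen for the example; in practice the step length has to be tuned to the problem.

```python
import numpy as np

def gradient_descent_step(J, W, f, f_hat, alpha):
    """Eq. (7): h_gd = alpha * J^T W (f - f_hat)."""
    return alpha * (J.T @ W @ (f - f_hat))

# Hypothetical toy problem: straight-line model f_hat = x1 + x2 * t.
t = np.array([0.0, 1.0, 2.0, 3.0])
f = 1.0 + 2.0 * t                           # synthetic noise-free measurements
W = np.eye(t.size)                          # unit weights
J = np.column_stack([np.ones_like(t), t])   # Jacobian d f_hat / d x

def model(x):
    return x[0] + x[1] * t

def e2(x):
    r = f - model(x)
    return 0.5 * r @ W @ r

x = np.array([0.0, 0.0])                    # starting guess
h = gradient_descent_step(J, W, f, model(x), alpha=0.05)
x_new = x + h
print(e2(x), e2(x_new))                     # the objective decreases after the step
```

With this step length the objective drops from 42 to about 1.07 in a single step; a much larger `alpha` would overshoot and increase it instead.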


Gauss-Newton method

The Gauss-Newton method is a method of minimizing a sum-of-squares objective function. It presumes that the objective function is approximately quadratic in the parameters near the optimal solution. The function evaluated with perturbed model parameters may be locally approximated through a first-order Taylor series expansion:

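\begin{align}
\hat{f}(x + h) &\approx \hat{f}(x) + \left[ \frac{\partial \hat{f}(x)}{\partial x} \right] h \tag{8} \\
&= \hat{f}(x) + J h. \tag{9}
\end{align}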

Substituting the approximation for the perturbed function in (1) yields

$$ e^2(x + h) \approx \frac{1}{2} f^T W f + \frac{1}{2} \hat{f}^T W \hat{f} - f^T W \hat{f} - (f - \hat{f})^T W J h + \frac{1}{2} h^T J^T W J h. \tag{10} $$


Gauss-Newton method

The perturbation $h$ that minimizes $e^2(x + h)$ is found from $\partial e^2(x + h) / \partial h = 0$:

$$ \frac{\partial}{\partial h} e^2(x + h) \approx -(f - \hat{f})^T W J + h^T J^T W J, \tag{11} $$

and the resulting normal equations for the Gauss-Newton perturbation are

$$ [J^T W J] h_{gn} = J^T W (f - \hat{f}(x)). \tag{12} $$
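
The normal equations (12) can be solved directly with a linear solver. The sketch below does so for a hypothetical straight-line model with made-up weights. Because that model is linear in its parameters, the objective is exactly quadratic and a single Gauss-Newton step lands on the minimizer; for a nonlinear model the step would have to be repeated.

```python
import numpy as np

def gauss_newton_step(J, W, f, f_hat):
    """Solve the normal equations (12): [J^T W J] h_gn = J^T W (f - f_hat)."""
    A = J.T @ W @ J
    g = J.T @ W @ (f - f_hat)
    return np.linalg.solve(A, g)

# Hypothetical toy problem: f_hat = x1 + x2 * t, so J does not depend on x.
t = np.array([0.0, 1.0, 2.0, 3.0])
J = np.column_stack([np.ones_like(t), t])
x_true = np.array([1.0, 2.0])
f = J @ x_true                                         # noise-free data
W = np.diag(1.0 / np.array([0.1, 0.2, 0.1, 0.2])**2)   # W_ii = 1 / w_i^2

x = np.array([5.0, -3.0])                              # deliberately poor guess
h_gn = gauss_newton_step(J, W, f, J @ x)
x_new = x + h_gn
print(x_new)                                           # recovers x_true in one step
```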


Levenberg-Marquardt Method

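The Levenberg-Marquardt algorithm adaptively varies the parameter updates between the gradient descent and Gauss-Newton updates,

$$ [J^T W J + \lambda I] h_{lm} = J^T W (f - \hat{f}(x)), \tag{13} $$

where small values of the parameter $\lambda$ result in a Gauss-Newton update and large values of $\lambda$ result in a gradient descent update. A common variant scales the damping term by the diagonal of $J^T W J$:

$$ [J^T W J + \lambda \, \mathrm{diag}(J^T W J)] h_{lm} = J^T W (f - \hat{f}(x)). \tag{14} $$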
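
The two limits of Eq. (13) can be checked numerically. In this sketch (toy straight-line problem and function name are assumptions for illustration), a very small $\lambda$ reproduces the Gauss-Newton step of Eq. (12), while a very large $\lambda$ gives a short step along $J^T W (f - \hat{f})$, that is, a gradient descent step with $\alpha \approx 1/\lambda$.

```python
import numpy as np

def lm_step(J, W, f, f_hat, lam, scale_by_diagonal=False):
    """Solve Eq. (13), or Eq. (14) when scale_by_diagonal is True, for h_lm."""
    A = J.T @ W @ J
    g = J.T @ W @ (f - f_hat)
    D = np.diag(np.diag(A)) if scale_by_diagonal else np.eye(A.shape[0])
    return np.linalg.solve(A + lam * D, g)

# Hypothetical toy problem: straight-line model f_hat = x1 + x2 * t.
t = np.array([0.0, 1.0, 2.0, 3.0])
J = np.column_stack([np.ones_like(t), t])
W = np.eye(t.size)
f = 1.0 + 2.0 * t
f_hat = J @ np.array([0.0, 0.0])

g = J.T @ W @ (f - f_hat)                      # steepest descent direction
h_gn = np.linalg.solve(J.T @ W @ J, g)         # Gauss-Newton step, Eq. (12)

h_small = lm_step(J, W, f, f_hat, lam=1e-10)   # close to h_gn
h_large = lm_step(J, W, f, f_hat, lam=1e10)    # close to g / lam
print(h_small, h_gn)
print(h_large * 1e10, g)
```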



Calculate the Jacobian matrix J

If in an iteration $e^2(x) - e^2(x + h) > h^T (\lambda h + J^T W (f - \hat{f}))$, then $x + h$ is sufficiently better than $x$: accept the step and reduce $\lambda$ by a factor of ten.

If in an iteration $e^2(x) - e^2(x + h) < h^T (\lambda h + J^T W (f - \hat{f}))$, then $x + h$ is not sufficiently better than $x$: reject the step and increase $\lambda$ by a factor of ten.

Convergence is achieved if $\max(|J^T W (f - \hat{f})|) < t$, where $t$ denotes a threshold.
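
The steps above can be combined into a short loop. The following is a minimal sketch under stated assumptions, not a reference implementation. Since $e^2$ in Eq. (1) carries a factor of one half, the linearized model predicts a reduction of $\frac{1}{2} h^T (\lambda h + J^T W (f - \hat{f}))$, and the sketch accepts a step when the actual reduction exceeds a fraction `eps` of that prediction. The values of `eps`, the starting `lam`, `tol`, `max_iter`, and the exponential test model are all assumptions.

```python
import numpy as np

def levenberg_marquardt(model, jac, x0, f, W,
                        lam=1e-2, eps=0.1, tol=1e-8, max_iter=200):
    """Minimize e^2(x) = 1/2 (f - model(x))^T W (f - model(x)) using Eq. (13)."""
    x = np.asarray(x0, float)
    for _ in range(max_iter):
        r = f - model(x)
        J = jac(x)                                  # calculate the Jacobian matrix J
        g = J.T @ W @ r
        if np.max(np.abs(g)) < tol:                 # convergence test with threshold t
            break
        A = J.T @ W @ J
        h = np.linalg.solve(A + lam * np.eye(x.size), g)
        r_new = f - model(x + h)
        actual = 0.5 * (r @ W @ r - r_new @ W @ r_new)
        predicted = 0.5 * h @ (lam * h + g)         # reduction of the linearized model
        if actual > eps * predicted:                # sufficiently better: accept
            x = x + h
            lam /= 10.0
        else:                                       # not better: reject, damp more
            lam *= 10.0
    return x

# Hypothetical test problem: single decaying exponential, noise-free data.
t = np.linspace(0.0, 4.0, 9)
x_true = np.array([2.0, 1.5])

def model(x):
    return x[0] * np.exp(-t / x[1])

def jac(x):
    e = np.exp(-t / x[1])
    return np.column_stack([e, x[0] * e * t / x[1]**2])   # analytic derivatives

f = model(x_true)
W = np.eye(t.size)
x_fit = levenberg_marquardt(model, jac, [1.0, 1.0], f, W)
print(x_fit)
```

Rejected steps leave `x` unchanged, so the objective never increases from one accepted iterate to the next.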



The following functions can be used in fitting a set of measured data and finding the minimum of the function $e^2(x)$:

$$ \hat{f}(x) = x_1 \exp(-t/x_2) + x_3 \sin(t/x_4) \tag{15} $$

$$ \hat{f}(x) = \left( x_1 t_1^{x_2} + (1 - x_1) t_2^{x_2} \right)^{1/x_2} \tag{16} $$

$$ \hat{f}(x) = x_1 (t/\max(t)) + x_2 (t/\max(t))^2 + x_3 (t/\max(t))^3 + x_4 (t/\max(t))^4 \tag{17} $$
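
As an illustration, Eqs. (15) and (17) can be written as functions of the parameter vector, together with a forward-difference estimate of the Jacobian. The function names, the sample grid `t`, the parameter values, and the step size `rel_step` are assumptions for this sketch. Eq. (16) is left out because it takes two independent variables.

```python
import numpy as np

def model_15(x, t):
    """Eq. (15): x1 * exp(-t / x2) + x3 * sin(t / x4)."""
    return x[0] * np.exp(-t / x[1]) + x[2] * np.sin(t / x[3])

def model_17(x, t):
    """Eq. (17): quartic polynomial in the normalized variable t / max(t)."""
    s = t / np.max(t)
    return x[0] * s + x[1] * s**2 + x[2] * s**3 + x[3] * s**4

def numerical_jacobian(model, x, t, rel_step=1e-6):
    """Forward-difference estimate of J[i, j] = d f_hat(t_i) / d x_j."""
    x = np.asarray(x, float)
    f0 = model(x, t)
    J = np.empty((t.size, x.size))
    for j in range(x.size):
        dx = rel_step * max(1.0, abs(x[j]))
        x_pert = x.copy()
        x_pert[j] += dx                      # perturb one parameter at a time
        J[:, j] = (model(x_pert, t) - f0) / dx
    return J

t = np.linspace(1.0, 100.0, 50)              # hypothetical sample points
x = np.array([20.0, 10.0, 1.0, 50.0])        # hypothetical parameter values
J15 = numerical_jacobian(model_15, x, t)
J17 = numerical_jacobian(model_17, x, t)
print(J15.shape, J17.shape)
```

Analytic derivatives are preferable when they are easy to write down; the finite-difference version is mainly useful as a cross-check.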



Figure: Data are collected from a radially distorted image.



The lens distortion model can be written as an infinite series:

\begin{align}
x_u &= x_d (1 + k_1 r_d^2 + k_2 r_d^4 + \ldots) \tag{18} \\
y_u &= y_d (1 + k_1 r_d^2 + k_2 r_d^4 + \ldots), \tag{19}
\end{align}

where $x_u$ and $y_u$ are the undistorted coordinates, whereas $x_d$ and $y_d$ are the distorted coordinates. Further, $k_1$ and $k_2$ are the radial distortion parameters. The distorted radius ($r_d$) is given by

$$ r_d = \sqrt{x_d^2 + y_d^2}. \tag{20} $$
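
A minimal sketch of Eqs. (18)-(20), truncating the series after the $k_2$ term. It assumes the coordinates are already expressed relative to the center of distortion, which the slides do not state explicitly; the function name and the numbers are illustrative.

```python
def undistort_point(xd, yd, k1, k2=0.0):
    """Map distorted (xd, yd) to undistorted (xu, yu) using Eqs. (18)-(19).

    Assumes coordinates are measured from the distortion center and that
    terms beyond k2 are negligible.
    """
    rd2 = xd * xd + yd * yd                    # r_d^2, from Eq. (20)
    scale = 1.0 + k1 * rd2 + k2 * rd2 * rd2    # 1 + k1 r_d^2 + k2 r_d^4
    return xd * scale, yd * scale

# Made-up example: a point half a unit from the center, k1 = 0.1.
xu, yu = undistort_point(0.5, 0.0, k1=0.1)
print(xu, yu)
```

Points at the center are unchanged, and the correction grows with the distance from the center, which is why straight lines near the image border bend the most.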

Figure: Radial and tangential distortions.



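The distortion error of each edge segment is given by

$$ e^2 = a \sin^2 \phi - 2 |b| |\sin \phi| \cos \phi + c \cos^2 \phi, \tag{21} $$

where

\begin{align}
a &= \sum_{j=1}^{n} x_j^2 - \frac{1}{n} \left( \sum_{j=1}^{n} x_j \right)^2 \tag{22} \\
b &= \sum_{j=1}^{n} x_j y_j - \frac{1}{n} \sum_{j=1}^{n} x_j \sum_{j=1}^{n} y_j \tag{23} \\
c &= \sum_{j=1}^{n} y_j^2 - \frac{1}{n} \left( \sum_{j=1}^{n} y_j \right)^2. \tag{24}
\end{align}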

Figure: The distortion error is the sum of squares of the distances from the edgels to the least-squares fit line.
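
The sketch below computes $a$, $b$, and $c$ from Eqs. (22)-(24) for one edge segment. The slides do not define $\phi$, so the sketch assumes it is the orientation of the least-squares line. Under that assumption Eq. (21) equals the smaller eigenvalue of the matrix $[[a, b], [b, c]]$, which has the closed form $\frac{1}{2}(a + c) - \frac{1}{2}\sqrt{(a - c)^2 + 4 b^2}$. The function names and sample points are illustrative.

```python
import math

def edge_segment_moments(points):
    """Return (a, b, c) of Eqs. (22)-(24) for a list of (x, y) edgels."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    a = sum(x * x for x, _ in points) - sx * sx / n
    b = sum(x * y for x, y in points) - sx * sy / n
    c = sum(y * y for _, y in points) - sy * sy / n
    return a, b, c

def distortion_error(points):
    """Sum of squared distances from the edgels to their least-squares line.

    Assumption: phi in Eq. (21) is the orientation of the best-fit line, so
    the error is the smaller eigenvalue of [[a, b], [b, c]].
    """
    a, b, c = edge_segment_moments(points)
    return 0.5 * (a + c) - 0.5 * math.hypot(a - c, 2.0 * b)

straight = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]   # collinear edgels
bent = [(0.0, 0.0), (1.0, 0.0), (2.0, 1.0)]                   # not collinear
err_straight = distortion_error(straight)
err_bent = distortion_error(bent)
print(err_straight, err_bent)
```

A perfectly straight segment gives an error of zero (up to rounding), so summing this quantity over all detected segments yields an objective that the distortion parameters can be tuned to minimize.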



Figure: Data are collected from a radially distorted image after edge detection.



Figure: Compensation of the radial distortion on the processed image.



Figure: Compensation of the radial distortion on the original image.


Thanks

Questions please

