Introduction to the theory, numerical methods and applications of ill-posed
problems
Compendium
Instructor: Anatoly Yagola
E-mail: [email protected] Tel. +7-9161211549
Lomonosov Moscow State University, Gothenburg University, Chalmers
University of Technology
2013
CONTENTS
1. Introduction
2. Well-posed and ill-posed problems
3. Elements of functional analysis
4. Examples of ill-posed problems
5. Definition of the regularizing algorithm
6. Ill-posed problems on compact sets
7. Extremal problems statement
8. Solvability of an extremal problem. Simplest necessary and sufficient
conditions of minimum
9. Convex functionals
10. Solvability of a convex programming problem
11. Criteria of convexity and strong convexity
12. Least squares method. Pseudoinversion
13. Minimizing sequences
14. Some methods for solving one-dimensional extremal problems
15. Method of steepest descent
16. Conjugate gradient method
17. Newton’s method
18. Zero-order methods
19. Conditional gradient method
20. Projection of conjugate gradients method
21. Numerical methods for solving ill-posed problems on special compact
sets
22. Ill-posed problems with a source-wise represented solution
23. Tikhonov’s regularizing algorithm
24. Generalized discrepancy principle
25. Incompatible ill-posed problems
26. Numerical methods for the solution of Fredholm integral equations of
the first kind
27. Convolution type equations
A course plan
• Lecture 1: Sections 1–4.
• Lecture 2: Sections 5–11.
• Lecture 3: Sections 12–18.
• Lecture 4: Sections 19–21.
• Lecture 5: Sections 22–23.
• Lecture 6: Sections 24–25.
• Lecture 7: Sections 26–27.
Course projects
Each student is expected to complete one project. Computations performed with standard accessible software (MATLAB) illustrating the solutions are encouraged. A project consists in solving an ill-posed problem under different kinds of a priori information.
The students should present a written report on the solved problems.
1. Introduction
In this lecture course we describe the fundamentals of the theory of ill-posed problems, as well as numerical methods for their solution when different a priori information is available. For simplicity, only linear equations in normed spaces are considered, although all similar definitions can also be introduced for nonlinear problems in more general metric (even topological) spaces.
Numerous inverse problems can be found in different branches of physics. E.g., an astrophysicist has no possibility to actively influence processes in remote stars and galaxies, and is forced to draw conclusions about the physical characteristics of very remote objects from their indirect manifestations measured on the Earth's surface or near the Earth on space stations. Excellent examples of ill-posed problems occur in medicine; first of all, let us point out computerized tomography. Many applications of ill-posed problems are found in geophysics: indeed, it is easier and cheaper to judge what is going on under the Earth's surface by solving inverse problems than by drilling deep boreholes. Other examples come from radio astronomy, spectroscopy, nuclear physics, plasma diagnostics, and so on.
Mostly, these inverse problems are ill-posed: small deviations of the input data (due to experimental errors) can produce large errors in an approximate solution. The modern theory of ill-posed problems started seventy years ago (1943), when the famous Russian mathematician Andrey Tikhonov published in the journal Soviet Mathematics Doklady the paper "On stability of inverse problems". At that time Tikhonov studied inverse problems in geophysics, participating personally in geophysical field investigations near the Ural Mountains.
In 1963 Tikhonov formulated the definition of a regularizing algorithm. He and two other Russian mathematicians, Mikhail Lavrentiev and Valentin Ivanov, became the founders of the modern theory of inverse and ill-posed problems. Now this theory is highly developed throughout the world. The most eminent specialists in Sweden are Lars Elden from Linkoping and Larisa Beilina from Gothenburg.
Course literature
A.N. Tikhonov, A.V. Goncharsky, V.V. Stepanov, A.G. Yagola. Numerical methods for the solution of ill-posed problems. – Kluwer Academic Publishers, 1995.
A.N. Tikhonov, A.S. Leonov, A.G. Yagola. Nonlinear ill-posed problems. V. 1, 2. – Chapman and Hall, 1998.
L. Beilina, M.V. Klibanov. Approximate global convergence and adaptivity for coefficient inverse problems. – Springer, 2012.
Recommended reference literature
O.M. Alifanov, E.A. Artioukhine, S.V. Rumyantsev. Extreme Methods for
Solving Ill-Posed Problems with Applications to Inverse Heat Transfer
Problems. – Begell House Inc., 1995.
A. Bakushinsky, A. Goncharsky. Ill-posed problems: Theory and applications. -
Kluwer Academic Publishers, 1994.
A.B. Bakushinsky, M.Yu. Kokurin. Iterative Methods for Approximate Solution of
Inverse Problems. – Springer, 2005.
H.W. Engl, M. Hanke, A. Neubauer. Regularization of Inverse Problems. – Kluwer
Academic Publishers, 1996.
V.K. Ivanov, V.V. Vasin, V.P. Tanana. Theory of Linear Ill-Posed Problems and its
Applications. – VSP, 2002.
M.M. Lavrentiev, L.Ya. Saveliev. Operator Theory and Ill-Posed Problems. – De
Gruyter, 2006.
V.V. Vasin, A.L. Ageev. Ill-Posed Problems with A Priori Information. – VSP,
1995.
2. Well-posed and ill-posed problems
Let us consider an operator equation:
Az = u,
where A is a linear operator acting from a Hilbert space Z into a Hilbert space U. It is required to find a solution z of the operator equation corresponding to a given inhomogeneity (or right-hand side) u.
This equation is a typical mathematical model for many physical, so-called inverse, problems, where it is supposed that the unknown physical characteristics z cannot be measured directly; as the result of an experiment, it is possible to obtain only data u connected with z by means of an operator A.
The French mathematician J. Hadamard formulated the following conditions of well-posedness of mathematical problems. Let us consider these conditions for the operator equation above. The problem of solving the operator equation is called well-posed (according to Hadamard) if the following three conditions are fulfilled:
1) the solution exists for any u ∈ U;
2) the solution is unique;
3) if u_n → u, Az_n = u_n, Az = u, then z_n → z.
Condition 2) holds if and only if the operator A is one-to-one (injective). Conditions 1) and 2) imply that an inverse operator A⁻¹ exists and that its domain D(A⁻¹) (i.e., the range R(A) of the operator A) coincides with U; equivalently, the operator A is bijective. Condition 3) means that the inverse operator A⁻¹ is continuous, i.e., "small" perturbations of the right-hand side u correspond to "small" changes of the solution z. Moreover, J. Hadamard believed that only well-posed problems should be considered when solving practical problems. However, many examples are well known of ill-posed problems that have to be solved numerically when practical problems are investigated. It should be noted that stability or instability of solutions depends on the definition of the space of solutions Z. Usually, the choice of the space of solutions (including the choice of the norm) is determined by the requirements of an applied problem. A mathematical problem can be ill-posed or well-posed depending on the choice of a norm in a functional space.
In order to understand what normed spaces, convergence, etc., are, we need to recall some definitions from functional analysis.
3. Elements of functional analysis
A set L is called a (real) linear space if for any two of its elements x, y an element x+y ∈ L is defined, and for any element x ∈ L and any (real) number α an element αx ∈ L is defined, such that the following conditions are fulfilled:
1) for any x, y ∈ L: x+y = y+x;
2) for any x, y, z ∈ L: (x+y)+z = x+(y+z);
3) the zero element 0 ∈ L exists such that for any x ∈ L: x+0 = x;
4) for any x ∈ L the inverse element (-x) ∈ L exists such that x+(-x) = 0;
5) for any x, y ∈ L and any (real) α: α(x+y) = αx+αy;
6) for any (real) α and β and any x ∈ L: (α+β)x = αx+βx;
7) for any (real) α, β and any x ∈ L: (αβ)x = α(βx);
8) for any x ∈ L: 1·x = x.
Elements of a linear space are called vectors.
Example. The finite-dimensional vector space Rⁿ, which is very well known from linear algebra.
Example. A space of (real) functions defined on a segment [a,b].
Example. A space of (real) functions defined and continuous on a segment
[a,b].
Elements (vectors) y_1, …, y_n of a linear space L are called linearly independent if their linear combination λ_1y_1 + … + λ_ny_n ≠ 0 for any numbers λ_1, …, λ_n except λ_1 = … = λ_n = 0. If the maximal number of linearly independent vectors of a linear space L is finite, then this number is called the dimension of the space L, and the space is finite-dimensional. In the opposite case the space L is infinite-dimensional.
A set M is called a metric space if for any two of its elements x, y ∈ M a real number ρ(x,y) (metric, or distance) is defined such that:
1) for any x, y ∈ M: ρ(x,y) ≥ 0, and ρ(x,y) = 0 if and only if x and y coincide (x = y);
2) for any x, y ∈ M: ρ(x,y) = ρ(y,x);
3) for any x, y, z ∈ M: ρ(x,y) ≤ ρ(x,z) + ρ(z,y).
A sequence x_n ∈ M, n=1, 2, …, converges to x_0 ∈ M (x_n → x_0 as n → ∞) if ρ(x_n, x_0) → 0 as n → ∞.
A metric space need not be a linear space.
A linear space N is called a normed space if for any x ∈ N a real number ||x|| (the norm of x) is defined such that:
1) for any x ∈ N: ||x|| ≥ 0, and ||x|| = 0 if and only if x = 0;
2) for any x ∈ N and any (real) α: ||αx|| = |α|·||x||;
3) for any x, y ∈ N: ||x+y|| ≤ ||x|| + ||y||.
A normed space can be considered as a metric space if we define the metric ρ(x,y) = ||x - y||.
A sequence x_n ∈ N, n=1, 2, …, converges (has a limit) to x_0 ∈ N (x_n → x_0 as n → ∞) if ||x_n - x_0|| → 0 as n → ∞.
Lemma. If x_n → x_0 as n → ∞, then ||x_n|| → ||x_0||.
Examples of normed spaces:
Example. The finite-dimensional vector space Rⁿ.
Example. The space C[a,b] of (real) functions defined and continuous on a segment [a,b] with the norm ||y||_C[a,b] = max{ |y(s)| : s ∈ [a,b] }. Convergence in this space is called uniform convergence. Properties of uniformly convergent sequences (and series) of functions continuous on [a,b] were studied in mathematical analysis.
Definition. A sequence x_n ∈ N, n=1, 2, …, is called a fundamental sequence if for any ε > 0 there exists a natural number K such that for any n ≥ K and any natural number p: ||x_{n+p} - x_n|| ≤ ε.
If a sequence converges, then it is fundamental.
If every fundamental sequence in a normed space converges, then the space is called complete.
Definition. A complete normed space is called a Banach space.
Example. The finite-dimensional vector space Rⁿ is a Banach space.
Example. The space C^(p)[a,b] (with C^(0)[a,b] = C[a,b]) of (real) functions defined and continuous together with all derivatives up to the p-th order (p ≥ 0, integer) on a segment [a,b], with the norm
||y||_{C^(p)[a,b]} = max{ Σ_{k=0}^{p} |y^(k)(s)| : s ∈ [a,b] },
is a Banach space. Convergence in this space is called uniform convergence with all derivatives up to the p-th order.
Definition. A linear space E is called a Euclidean space if for any two elements x, y ∈ E a real number (x,y) (a scalar product) is defined such that:
1) for any x, y ∈ E: (x,y) = (y,x);
2) for any x_1, x_2, y ∈ E: (x_1+x_2, y) = (x_1,y) + (x_2,y);
3) for any x, y ∈ E and any real α: (αx,y) = α(x,y);
4) for any x ∈ E: (x,x) ≥ 0, and (x,x) = 0 if and only if x = 0.
If a scalar product exists, then a norm can be defined: ||x||_E = {(x,x)}^{1/2}.
The Cauchy inequality holds: |(x,y)| ≤ ||x||·||y||.
Example. The finite-dimensional vector space Rⁿ is a Euclidean space with the scalar product (x,y) = x_1y_1 + … + x_ny_n, where x_1,…,x_n and y_1,…,y_n are the components of the vectors x and y respectively.
Example. The space of (real) functions defined and continuous on a segment [a,b] with the scalar product
(y_1, y_2) = ∫_a^b y_1(x) y_2(x) dx
for any functions y_1(x), y_2(x) continuous on [a,b]. This is the space L̃_2[a,b] with the norm
||y|| = { ∫_a^b y²(x) dx }^{1/2}.
This Euclidean space is infinite-dimensional and is not complete.
Definition. An infinite-dimensional complete Euclidean space is called a Hilbert space.
By applying the completion procedure, any infinite-dimensional Euclidean space can be transformed into a Hilbert space. Applying this procedure to L̃_2[a,b], we get the Hilbert space L_2[a,b]. In order to describe elements of this space we should know not only the Riemann integral (known from mathematical analysis) but also the Lebesgue integral.
We also need one of the Sobolev spaces, W^1_2[a,b] (= H^1[a,b]). The scalar product in this space is defined as
(y_1, y_2) = ∫_a^b y_1(x) y_2(x) dx + ∫_a^b y_1′(x) y_2′(x) dx,
and the norm is equal to
||y|| = { ∫_a^b y²(x) dx + ∫_a^b (y′(x))² dx }^{1/2}.
If the derivatives are classical and the integrals are Riemann integrals, then this space is an infinite-dimensional Euclidean space. If the derivatives are generalized and the integrals are Lebesgue integrals, then it is a Hilbert space.
Definition. An operator A mapping a linear space L_1 into a linear space L_2 is called linear if for any elements y_1 and y_2 from L_1 and any real α_1 and α_2:
A(α_1y_1 + α_2y_2) = α_1Ay_1 + α_2Ay_2.
We denote the domain of A by D(A); for simplicity, D(A) = L_1. The range of A we denote by R(A). In this case R(A) ⊆ L_2 is a linear subspace of L_2.
If an operator A is a one-to-one (injective) operator, then an inverse operator A⁻¹ exists with domain D(A⁻¹) = R(A) and range R(A⁻¹) = D(A) = L_1. An inverse operator exists if and only if the null space (or kernel) Ker A = {x ∈ L_1: Ax = 0} is trivial. In this case we say that the operator A is nondegenerate. Obviously, Ker A is a linear subspace of L_1 and 0 ∈ Ker A. If Ker A ≠ {0}, then an inverse operator does not exist and the operator A is called degenerate.
Example. The Fredholm integral operator
Ay = ∫_a^b K(x,s) y(s) ds, x ∈ [a,b],
with a continuous kernel K(x,s) is a linear operator acting in the linear space of functions continuous on the segment [a,b].
Let a linear operator A be a mapping from a normed space N_1 into a normed space N_2 (for simplicity, D(A) = N_1).
Definition. An operator A is called continuous at y_0 ∈ D(A) if for any ε > 0 there exists δ > 0 such that for all y ∈ D(A) satisfying the inequality ||y - y_0|| ≤ δ the following inequality is true: ||Ay - Ay_0|| ≤ ε.
An equivalent definition is the following.
Definition. An operator A is called continuous at y_0 ∈ D(A) if for any sequence y_n, n=1, 2, …, y_n ∈ D(A), y_n → y_0, the sequence Ay_n converges to Ay_0.
Definition. An operator A is called continuous on D(A) (on N_1) if A is continuous at every y_0 ∈ D(A) (N_1).
A property of a linear operator: a linear operator A is continuous on D(A) if A is continuous at any fixed y_0 ∈ D(A) (e.g., at y_0 = 0 ∈ D(A)).
Definition. The norm of a linear operator A is the number
||A||_{N_1→N_2} = sup{ ||Ay||_{N_2} : ||y||_{N_1} ≤ 1 }.
For simplicity, we will write ||A||_{N_1→N_2} = ||A||.
Definition. If ||A|| < ∞, then the operator A is called bounded.
In finite-dimensional spaces any linear operator is bounded. In infinite-dimensional spaces it is not so.
Example. The space C[0,1] is an infinite-dimensional space. Let us consider the differentiation operator A = d/ds defined on the linear subspace of continuously differentiable functions in C[0,1]. The operator A is unbounded. For the sequence y_n = cos(ns), n=1, 2, …, we get
||y_n|| = max{ |cos(ns)| : s ∈ [0,1] } = 1, while Ay_n = -n sin(ns) and ||Ay_n|| → ∞ as n → ∞.
Lemma. If A: N_1 → N_2, then for any y ∈ N_1: ||Ay|| ≤ ||A||·||y||.
Theorem. A linear operator A: N_1 → N_2 is continuous if and only if A is bounded.
Definition. A sequence y_n, n=1, 2, …, of elements of a normed space N is called bounded if there exists a constant C such that ||y_n|| ≤ C, n=1, 2, ….
Any convergent sequence is bounded. Any fundamental sequence is bounded.
Definition. A sequence y_n, n=1, 2, …, of elements of a normed space N is called compact if any of its subsequences has a convergent subsequence. If the space N is not a Banach space, then a fundamental subsequence is required instead.
A compact sequence is bounded. According to the Bolzano–Weierstrass theorem, any bounded sequence in a finite-dimensional space is compact. In infinite-dimensional spaces this is not true.
Example. Let us consider the infinite-dimensional space L̃_2[a,b]. From mathematical analysis it is known that in L̃_2[a,b] there exists an orthonormal system of functions with an infinite number of elements (e.g., the trigonometric system):
(e_i, e_j) = 0 for i ≠ j, (e_i, e_i) = 1; i, j = 1, 2, ….
Let us consider the sequence e_i, i = 1, 2, …. It is bounded but not compact because
||e_i - e_j||² = (e_i - e_j, e_i - e_j) = 2, i.e., ||e_i - e_j|| = √2 if i ≠ j.
No subsequence of this sequence is fundamental.
Definition. A linear operator A: N_1 → N_2 is called compact if for any bounded sequence y_n, n=1, 2, …, of elements of N_1 the sequence z_n = Ay_n of elements of N_2 is compact.
Definition. A linear compact operator is called completely continuous.
Theorem. Any completely continuous operator is bounded (consequently, continuous).
In finite-dimensional spaces any linear operator is completely continuous. This is not true in infinite-dimensional spaces.
Example. Let us consider the identity operator I: L̃_2[a,b] → L̃_2[a,b], Iy = y for any y. It is bounded but not compact.
Theorem. Let A be the Fredholm integral operator mapping L_2[a,b] into L_2[a,b]. Then A is completely continuous.
Let A: E → E be a linear bounded operator, where E is a Euclidean space.
Definition. An operator A*: E → E is called the adjoint of A if for any y_1, y_2 ∈ E:
(Ay_1, y_2) = (y_1, A*y_2).
Definition. If A* = A, then the operator A is called self-adjoint.
4. Examples of ill-posed problems
The Fredholm integral equation of the 1st kind is a very well-known example of an ill-posed problem. Let an operator A be of the form
Az = ∫_a^b K(x,s) z(s) ds = u(x), x ∈ [c,d].
Let the kernel K(x,s) of the integral operator be a function depending continuously on the arguments x ∈ [c,d], s ∈ [a,b], and let the solution z(s) be a function continuous on the segment [a,b]. Then let us consider the operator A as acting between the following spaces: A: C[a,b] → C[c,d]. Let us show that in this case the problem is ill-posed. It is necessary to check the conditions of well-posedness:
1) Existence of a solution for any function u(x) continuous on [c,d]. In truth, it is not so: there exists an infinite number of continuous functions such that the integral equation has no solutions.
2) Uniqueness of a solution. This condition holds if and only if the kernel of the integral operator is closed.
The first two conditions are equivalent to the existence of an inverse operator A⁻¹ with domain D(A⁻¹) = C[c,d]. If the kernel of the integral operator is closed, then the inverse operator exists, but its domain does not coincide with C[c,d].
3) Stability of a solution. It means that for any sequence u_n → u (Az_n = u_n, Az = u) the sequence z_n → z. Stability is equivalent to continuity of the inverse operator A⁻¹ if it exists. In this case it does not hold. Let us consider an example. Let the sequence of continuous functions z_n(s), n=1, 2, …, s ∈ [a,b], be such that z_n(s) ≠ 0 on the segment [(a+b)/2 - d_n, (a+b)/2 + d_n] and equal to zero outside it, max{ |z_n(s)| : s ∈ [a,b] } = 1, where the numerical sequence d_n > 0, d_n → 0. Such functions can be chosen, e.g., as piecewise linear. Then for any x ∈ [c,d]
|u_n(x)| = | ∫_a^b K(x,s) z_n(s) ds | = | ∫_{(a+b)/2 - d_n}^{(a+b)/2 + d_n} K(x,s) z_n(s) ds | ≤ 2K_0 d_n → 0 as n → ∞,
where K_0 = max{ |K(x,s)| : x ∈ [c,d], s ∈ [a,b] }.
The functional sequence u_n(x) converges uniformly (that is, in the norm of C[c,d]) to u = 0. Though the solution of the equation Az = u in this case is z = 0, the sequence z_n does not converge to z, since ||z_n - z||_{C[a,b]} = 1.
The integral operator A is completely continuous (compact and continuous) while acting from L_2[a,b] into L_2[c,d], from C[a,b] into L_2[c,d], and from C[a,b] into C[c,d]. It means that the operator transforms any bounded sequence into a compact sequence. By definition, from any subsequence of a compact sequence it is possible to select a convergent subsequence. It is easy to indicate a sequence z_n, ||z_n||_{L_2[a,b]} = 1, from which it is impossible to select a subsequence converging in C[a,b]. For instance,
z_n(x) = (2/(b-a))^{1/2} sin( πn(x-a)/(b-a) ), n = 1, 2, ….
The norms of all terms of this sequence are equal to 1 in L_2[a,b], so the sequence is bounded. But from this sequence it is impossible to select a convergent subsequence, because ||z_i - z_j|| = √2, i ≠ j. Obviously, all functions z_n(x) are continuous on [a,b], and the sequence z_n(x), n=1, 2, …, is uniformly (in the norm of C[a,b]) bounded. From this sequence it is impossible to select a subsequence converging in C[a,b] (such a subsequence would converge also in L_2[a,b], as far as convergence in the mean follows from uniform convergence). Let us suppose that the operator A⁻¹ is continuous. It is very easy to arrive at a contradiction. If A is an injective operator, then an inverse operator exists. Evidently, if an operator B: C[c,d] → C[a,b] is continuous and an operator A is completely continuous, then the operator BA: C[a,b] → C[a,b] is completely continuous also. Since for any n
z_n = A⁻¹A z_n,
the sequence z_n is compact, and that is not true. An operator that is inverse to a completely continuous operator cannot be continuous. A similar proof can be given for any infinite-dimensional Banach (i.e., complete normed) space.
Since the problem of solving the Fredholm integral equation of the 1st kind is ill-posed in the spaces mentioned above, even very small errors in the right-hand side u(x) can make the equation unsolvable, or the solution may exist but differ arbitrarily strongly from the exact solution.
Thus, a completely continuous operator acting in infinite-dimensional Banach spaces has an inverse operator that is not continuous (not bounded). Moreover, the range of a completely continuous operator acting between infinite-dimensional Banach spaces is not closed. Therefore, in any neighborhood of a right-hand side u(x) for which the equation has a solution, there exists an infinite number of right-hand sides for which the equation is not solvable.
A mathematical problem can be ill-posed also in connection with errors in an operator. The simplest example is given by the problem of finding a normal pseudosolution of a system of linear algebraic equations. Instability of this problem is determined by errors in a matrix. We will discuss this problem a bit later.
Sometimes it is convenient to pose a problem of solving an operator equation as a problem of calculating values of an unbounded and not everywhere defined operator A⁻¹: z = A⁻¹u. The simplest example is the calculation of values of the differentiation operator: z(x) = du/dx. We showed before that this operator is unbounded and not everywhere defined if we consider it as a mapping from C[0,1] into C[0,1]. The problem of calculating values of this operator is not well-posed because the first and third conditions of well-posedness are not fulfilled.
If we consider the same operator as a mapping from C^(1)[0,1] into C[0,1], the problem is well-posed.
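A short sketch of the instability of differentiation as a mapping from C[0,1] into C[0,1] (the function u(x) = x² and the perturbation δ·cos(ns) are assumed for illustration): a perturbation that is small in C[0,1] produces a large error in the derivative.

```python
# Illustrative sketch: the data and the perturbation are assumed examples.
import numpy as np

s = np.linspace(0.0, 1.0, 1001)
h = s[1] - s[0]
u_exact = s ** 2                           # exact right-hand side; z = u' = 2s
delta, n = 1e-3, 500
u_noisy = u_exact + delta * np.cos(n * s)  # ||u_noisy - u_exact||_C = delta

z_exact = np.gradient(u_exact, h)          # finite-difference differentiation
z_noisy = np.gradient(u_noisy, h)
print(np.max(np.abs(u_noisy - u_exact)))   # ~1e-3: the data error is small
print(np.max(np.abs(z_noisy - z_exact)))   # ~delta*n ~ 0.5: the solution error is large
```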
5. Definition of the regularizing algorithm
Let us consider an operator equation:
Az = u,
where A is an operator acting between normed spaces Z and U. In 1963 A.N. Tikhonov formulated the famous definition of a regularizing algorithm (RA), which is a basic concept of the modern theory of ill-posed problems.
Definition. A regularizing algorithm (regularizing operator) R(u,δ) = R_δ(u) is an operator possessing two properties:
1) R_δ(u) is defined for any δ > 0 and any u ∈ U, and maps (0,∞)×U into Z;
2) for any z ∈ Z and any u_δ ∈ U such that Az = u, ||u - u_δ|| ≤ δ, δ > 0, the approximate solution z_δ = R_δ(u_δ) → z as δ → 0.
A problem of solving an operator equation is called regularizable if there exists at least one regularizing algorithm. Directly from the definition it follows that if there exists one regularizing algorithm, then the number of them is infinite.
At the present time, all mathematical problems can be divided into the following classes:
1) well-posed problems;
2) ill-posed regularizable problems;
3) ill-posed nonregularizable problems.
All well-posed problems are regularizable, since one can take R_δ(u) = A⁻¹u. Let us note that knowledge of δ > 0 is not obligatory in this case.
Not all ill-posed problems are regularizable, and this depends on the choice of the spaces Z, U. The Russian mathematician L.D. Menikhes constructed an example of an integral operator with a continuous closed kernel acting from C[0,1] into L_2[0,1] such that the inverse problem (that is, solving a Fredholm integral equation of the 1st kind) is nonregularizable. This depends on properties of the space C[0,1]. Below it will be shown that if Z is a Hilbert space and the operator A is bounded and injective, then the problem of solving the operator equation is regularizable. This result is valid for some Banach spaces, but not for all (for reflexive Banach spaces only). In particular, the space C[0,1] does not belong to such spaces.
An equivalent definition of the regularizing algorithm is the following. Let there be given an operator (mapping) R_δ(u) defined for any δ > 0 and any u ∈ U, mapping (0,∞)×U into Z. The accuracy of solving an operator equation at a point z ∈ Z using an operator R_δ, under the condition that the right-hand side is given with an error δ > 0, is defined as
Δ(R_δ, δ, z) = sup{ ||R_δ(u_δ) - z|| : u_δ ∈ U, ||u_δ - Az|| ≤ δ }.
An operator R_δ(u) is called a regularizing algorithm (operator) if for any z ∈ Z: Δ(R_δ, δ, z) → 0 as δ → 0. This definition is equivalent to the definition above.
Similarly, a definition of the regularizing algorithm can be formulated for a problem of calculating values of an operator (see the end of the previous section), that is, for a problem of calculating values of a mapping G: D(G) → Y, D(G) ⊆ X, under the condition that the argument of G is given with an error (X, Y are metric or normed spaces). Of course, if A is an injective operator, then a problem of solving an operator equation can be considered as a problem of calculating values of A⁻¹.
It is very important to get an answer to the following question: is it possible to solve an ill-posed problem (i.e., to construct a regularizing algorithm) without knowledge of the error level δ?
Evidently, if a problem is well-posed, then a stable method of its solution can be constructed without knowledge of the error δ. E.g., for an operator equation one can take z_δ = A⁻¹u_δ → A⁻¹u = z as δ → 0. This is impossible if a problem is ill-posed. A.B. Bakushinsky proved the following theorem for a problem of calculating values of an operator. An analogous theorem is valid for a problem of solving operator equations.
Theorem. If there exists a regularizing algorithm for calculating values of an operator G on a set D(G) ⊆ X, and the regularizing algorithm does not depend on δ (explicitly), then an extension of G from D(G) to X exists that is continuous on D(G).
So, the construction of regularizing algorithms that do not depend explicitly on errors is feasible only for problems that are well-posed on their domains.
The next very important property of ill-posed problems is the impossibility of error estimation for a solution, even if the error of the right-hand side of an operator equation, or the error of the argument in a problem of calculating values of an operator, is known. This basic result was also obtained by A.B. Bakushinsky for solving operator equations.
Theorem. Let Δ(R_δ, δ, z) = sup{ ||R_δ(u_δ) - z|| : u_δ ∈ U, ||u_δ - Az|| ≤ δ } → 0 as δ → 0 uniformly with respect to z on a set D ⊆ Z. Then the restriction of the inverse operator A⁻¹ to the set AD ⊆ U is continuous on AD.
So, a uniform in z error estimate for an operator equation on a set D ⊆ Z exists if and only if the inverse operator is continuous on AD. The theorem is valid also for nonlinear operator equations, and in metric spaces at that.
From the definition of the regularizing algorithm it follows immediately that if one exists, then there is an infinite number of them. While solving ill-posed problems, it is impossible to choose a regularizing algorithm that finds an approximate solution with the minimal error. It is also impossible to compare different regularizing algorithms according to the errors of approximate solutions. Only including a priori information in the statement of the problem can give such a possibility, but in this case the reformulated problem is in fact well-posed. We will consider examples below.
Regularizing algorithms for operator equations in infinite-dimensional Banach spaces also cannot be compared according to the convergence rates of approximate solutions to an exact solution as the errors of the input data tend to zero. The author of this principal result is V.A. Vinokurov.
In conclusion of the section, let us formulate a definition of the regularizing algorithm in the case when the operator can also contain an error, i.e., instead of an operator A we are given a bounded linear operator A_h: Z → U such that ||A - A_h|| ≤ h, h > 0. Briefly, let us denote the pair of errors (δ, h) as η = (δ, h).
Definition. A regularizing algorithm (regularizing operator) R(u_δ, A_h, η) = R_η(u_δ, A_h) is an operator possessing two properties:
1) R_η(u_δ, A_h) is defined for any δ > 0, h ≥ 0, u_δ ∈ U, A_h ∈ L(Z,U), and maps (0,∞)×[0,∞)×U×L(Z,U) into Z;
2) for any z ∈ Z, for any u_δ ∈ U such that Az = u, ||u - u_δ|| ≤ δ, δ > 0, and for any A_h ∈ L(Z,U) such that ||A - A_h|| ≤ h: z_η = R_η(u_δ, A_h) → z as η → 0.
Here L(Z,U) is the space of linear bounded operators acting from Z into U with the usual operator norm.
Similarly, it is possible to define a regularizing algorithm when the operator equation is considered on a set D ⊆ Z, i.e., when the a priori information that the exact solution z ∈ D ⊆ Z is available.
For ill-posed SLAE, A.N. Tikhonov was the first to prove the impossibility of constructing a regularizing algorithm that does not depend explicitly on h.
6. Ill-posed problems on compact sets
Let us consider an operator equation:
Az = u,
where A is a linear injective operator acting between normed spaces Z and U. Let z̄ be the exact solution of the operator equation, Az̄ = ū, where ū is the exact right-hand side, and let an approximate right-hand side u_δ be given such that ||ū - u_δ|| ≤ δ, δ > 0.
The set Z_δ = { z : ||Az - u_δ|| ≤ δ } is the set of approximate solutions of the operator equation. For linear ill-posed problems diam Z_δ = sup{ ||z_1 - z_2|| : z_1, z_2 ∈ Z_δ } = ∞ for any δ > 0, since the inverse operator A⁻¹ is not bounded.
The question is: is it possible to use a priori information in order to restrict the set of approximate solutions or (better) to reformulate the problem so that it becomes well-posed? A.N. Tikhonov proposed the following idea: if it is known that the set of solutions is a compact, then the problem of solving the operator equation is well-posed under the condition that the approximate right-hand side belongs to the image of the compact. A.N. Tikhonov proved this assertion using the following theorem as a basis.
Theorem. Let an injective continuous operator A map D ⊆ Z onto AD ⊆ U, where Z, U are normed spaces and D is a compact. Then the inverse operator A⁻¹ is continuous on AD.
The theorem is true for nonlinear operators also. So, the problem of solving the operator equation is well-posed under the condition that the approximate right-hand side belongs to AD. This idea made it possible for M.M. Lavrentiev to introduce the concept of a mathematical problem well-posed according to A.N. Tikhonov (it is supposed that a set of well-posedness exists), and for V.K. Ivanov to define a quasisolution of an ill-posed problem.
The theorem above is not valid if u_δ ∉ R(A). So, it should be generalized.
Definition. An element z_δ ∈ D such that z_δ = argmin{ ||Az - u_δ|| : z ∈ D } is called a quasisolution of the operator equation on the compact D (i.e., ||Az_δ - u_δ|| = min{ ||Az - u_δ|| : z ∈ D }).
A quasisolution exists but may be nonunique. Nevertheless, any quasisolution tends to the exact solution: z_δ → z̄ as δ → 0. In this case, knowledge of the error δ is not obligatory. If δ is known, then:
1) any element z_δ ∈ D satisfying the inequality ||Az_δ - u_δ|| ≤ δ can be chosen as an approximate solution, with the same property of convergence to the exact solution (a δ-quasisolution);
2) it is possible to find an error of an approximate solution by solving an extremal problem: find max ||z - z_δ|| over all z ∈ D satisfying the inequality ||Az - u_δ|| ≤ δ (obviously, the exact solution satisfies this inequality).
Thus, the problem of quasisolving an operator equation does not differ strongly from a well-posed problem. Only the condition of uniqueness may fail.
If the operator A is given with an error, then the definition of a quasisolution can be modified by changing the operator A to the operator A_h.
Definition. An element z_η ∈ D such that z_η = argmin{ ||A_h z - u_δ|| : z ∈ D } is called a quasisolution of the operator equation on the compact D.
Any element z ∈ D satisfying the inequality ||A_h z - u_δ|| ≤ δ + h||z|| can be chosen as an approximate solution (an η-quasisolution).
If Z and U are Hilbert spaces, then many numerical methods of finding quasisolutions of linear operator equations are based on the convexity and differentiability of the discrepancy functional ||Az - u_δ||². If D is a convex compact, then finding a quasisolution is a problem of convex programming. The inequalities written above, which define approximate solutions, can be used as stopping rules for procedures minimizing the discrepancy. The problem of calculating errors of an approximate solution is a nonstandard problem of convex programming, because it is necessary to maximize (not to minimize) a convex functional.
Some sets of correctness are very well known in applied sciences. First of all, if an exact solution belongs to a family of functions depending on a finite number of bounded parameters, then the problem of finding the parameters can be well-posed. The same problem without such a priori information can be ill-posed. If an unknown function z(s), s ∈ [a,b], is monotonic and bounded, then this is sufficient to define a compact set in the space L_2[a,b]. After finite-dimensional approximation, the problem of finding a quasisolution is a quadratic programming problem; for its numerical solution, known methods such as the method of projection of conjugate gradients or the conditional gradient method can be applied (a sketch is given below). A similar approach can be used also when the solution is monotonic and bounded, or monotonic and convex, or has a given number of maxima and minima. In these cases, an error of an approximate solution can be calculated.
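A hedged sketch of computing a quasisolution on the set of monotone functions (the model operator, the exact solution, and the noise level are assumptions; the upper bound defining the compact is omitted for simplicity). Writing a nondecreasing nonnegative z as z = Lv with v ≥ 0, where L is the lower triangular matrix of ones, turns min ||Az - u_δ|| over monotone z into a nonnegative least squares problem:

```python
# Sketch under assumed data; nnls solves min ||Bv - u|| subject to v >= 0.
import numpy as np
from scipy.optimize import nnls

m = 60
s = np.linspace(0.0, 1.0, m)
h = 1.0 / (m - 1)
A = np.exp(-5 * (s[:, None] - s[None, :]) ** 2) * h   # a model integral operator

z_true = np.clip(2 * s - 0.5, 0.0, 1.0)               # monotone exact solution
rng = np.random.default_rng(0)
u_delta = A @ z_true + 1e-4 * rng.standard_normal(m)  # noisy right-hand side

L = np.tril(np.ones((m, m)))        # z = L v with v >= 0  <=>  z nondecreasing, z >= 0
v, rnorm = nnls(A @ L, u_delta)
z_quasi = L @ v
print(rnorm, np.max(np.abs(z_quasi - z_true)))        # discrepancy and solution error
```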
In order to understand how to minimize functionals, we will now study the basics of convex programming.
7. Extremal problems statement
As the main problem below, we will consider the following problem (Problem (*)):
find min f(x) = f* for x ∈ X ⊆ H, and an element x* ∈ X such that f(x*) = f*.
Here H is a Hilbert space (or the finite-dimensional Euclidean space Rⁿ), X is its subset, and f(x) is a real functional defined on X. The problem (*) is called an extremal problem or an optimization problem. We will consider minimization problems only, because the maximization problem (find max f(x) for x ∈ X ⊆ H) is equivalent to the problem of finding min (-f(x)) on the same set.
The set X is called the admissible set or the set of constraints. Any element x ∈ X is called an admissible point. If X = H, then the problem (*) is called a problem without constraints; in the opposite case it is a problem with constraints.
We denote:
x* = argmin{ f(x) : x ∈ X }, f(x*) = f* = min{ f(x) : x ∈ X }.
If the point of minimum is nonunique:
X* = Argmin{ f(x) : x ∈ X }.
A point x* is called a point of (global) minimum of a functional f(x) on a set X if for any x ∈ X: f(x) ≥ f(x*); and it is called a point of strict (global) minimum of f(x) on X if for any x ∈ X, x ≠ x*: f(x) > f(x*).
We define a closed ball with radius r > 0 and centre x* as the set S_r(x*) = { x ∈ H : ||x - x*|| ≤ r }. A point x* is called a point of local minimum of a functional f(x) on a set X if there exists r > 0 such that for any x ∈ X ∩ S_r(x*): f(x) ≥ f(x*); it is called a point of strict local minimum of f(x) on X if there exists r > 0 such that for any x ∈ X ∩ S_r(x*), x ≠ x*: f(x) > f(x*).
8. Solvability of an extremal problem. Simplest necessary and sufficient
conditions of minimum
The following theorem is fundamental for the investigation of the solvability of extremal problems.
Theorem (Weierstrass). Let X be a compact in H, and let a functional f(x) be continuous on X. Then there exists a point of global minimum of f(x) on X.
Corollary. Let a functional f(x) be lower semicontinuous on X, i.e., for any x^(0) ∈ X and any sequence x^(n) ∈ X, n=1, 2, …, such that x^(n) → x^(0): lim inf f(x^(n)) ≥ f(x^(0)). The Theorem is valid in this case also.
Remark. The proof of the Theorem never uses the fact that H is a Hilbert space. The proof is valid for any space in which a sequential convergence and a sequential compactness can be defined.
Let us now consider an extremal problem without constraints: X = H.
Definition. A functional f(x) is called differentiable at a point x^(0) of a Hilbert space H if there exists f′(x^(0)) ∈ H such that for any x ∈ H
f(x) = f(x^(0)) + (f′(x^(0)), x - x^(0)) + o(||x - x^(0)||).
Here f′(x^(0)) is called the Frechet derivative, or the gradient, of the functional f(x) at the point x^(0).
In the finite-dimensional case (H = Rⁿ) the gradient is a vector, and its components are the partial derivatives of the function of n variables f(x) at the point x^(0).
Definition. A functional f(x) is called twice differentiable at a point x^(0) of a Hilbert space H if for any x ∈ H
f(x) = f(x^(0)) + (f′(x^(0)), x - x^(0)) + ½ (f″(x^(0))(x - x^(0)), x - x^(0)) + o(||x - x^(0)||²).
Here the bounded linear operator f″(x^(0)): H → H is called the second Frechet derivative of the functional f(x) at the point x^(0). In the finite-dimensional case f″(x^(0)) is given by the Hesse matrix (or Hessian). This matrix consists of the second partial derivatives ∂²f/∂x_i∂x_j at x^(0).
Definition. A linear bounded operator A: H → H is called nonnegatively definite if for any h ∈ H: (Ah, h) ≥ 0, and positively definite if for any h ∈ H, h ≠ 0: (Ah, h) > 0.
The following simple results are known from mathematical analysis. Here H = Rⁿ, and the functional f(x) is a function of n variables.
Theorem (necessary condition of a local extremum). Let a functional f(x) be differentiable at x* ∈ H. If x* is a point of local minimum, then f′(x*) = 0.
Theorem (necessary condition of a local minimum). Let a functional f(x) be twice differentiable at x* ∈ H. If x* is a point of local minimum, then f′(x*) = 0 and f″(x*) ≥ 0 (nonnegatively definite).
Theorem (sufficient condition of a local minimum). Let a functional f(x) be twice differentiable at x* ∈ H. If the gradient f′(x*) = 0 and the operator f″(x*) > 0 (positively definite), then x* is a point of local minimum.
These results are easily generalized to the infinite-dimensional case. The necessary conditions are the same; the sufficient condition should be strengthened.
The theorems remain valid if the problem has constraints and x* is an interior point of X.
9. Convex functionals
Definition. A set X in a linear space is called convex if for any x^(1), x^(2) ∈ X the segment [x^(1), x^(2)] = { x = x^(1) + α(x^(2) - x^(1)) : α ∈ [0,1] } ⊆ X.
Definition. A functional f(x) defined on a convex set X is called convex if for any x^(1), x^(2) ∈ X and any α ∈ [0,1]:
f(αx^(1) + (1-α)x^(2)) ≤ αf(x^(1)) + (1-α)f(x^(2)).
The point αx^(1) + (1-α)x^(2), α ∈ [0,1], is called a convex combination of x^(1), x^(2).
Definition. A functional f(x) defined on a convex set X is called strictly convex if for any x^(1), x^(2) ∈ X, x^(1) ≠ x^(2), and any α ∈ (0,1):
f(αx^(1) + (1-α)x^(2)) < αf(x^(1)) + (1-α)f(x^(2)).
Definition. A functional f(x) defined on a convex set X of a Hilbert space is called strongly convex if there exists a constant κ > 0 (the constant of strong convexity) such that for any x^(1), x^(2) ∈ X and any α ∈ [0,1]:
f(αx^(1) + (1-α)x^(2)) ≤ αf(x^(1)) + (1-α)f(x^(2)) - κα(1-α)||x^(1) - x^(2)||².
If a functional is strongly convex, then it is strictly convex.
Examples. 1) The function of one variable f(x) = x² is strongly convex; the function f(x) = x⁴ is not strongly convex, but it is strictly convex. 2) The functional f(x) = ||x|| defined on a Hilbert space H is convex, but it is not strictly convex. The functional f(x) = ||x||² = (x,x) is strongly convex.
Theorem (Jensen's inequality). Let a functional f(x) be convex on a convex set X. Then for any x^(1), …, x^(n) ∈ X and any α_1, …, α_n such that α_i ≥ 0, Σ_{i=1}^{n} α_i = 1:
f(α_1x^(1) + … + α_nx^(n)) ≤ α_1f(x^(1)) + … + α_nf(x^(n)).
A classification of extremal problems:
1) If X is a convex set and f(x) is a convex functional, then (*) is a problem of convex programming (or convex optimization).
2) Often X is defined by a system of equalities and inequalities:
X = { x ∈ P : g_i(x) ≤ 0, i = 1,…,k; g_i(x) = 0, i = k+1,…,m }. Here P is a set of direct constraints. Then (*) is a problem of mathematical programming. It is a problem of convex programming if P is a convex set, the functionals g_i(x) in the inequality constraints are convex, and in the equality constraints they are affine. This follows from: 1) if g(x) is a convex functional, then the set { x : g(x) ≤ β } (β is a constant) is convex (or empty); 2) an intersection of convex sets is convex or empty.
3) (*) is a problem of linear programming if H = Rⁿ, f(x) = (a,x) + b is a linear functional (more exactly, an affine functional, but we will call it linear), and all constraints produce a polyhedron (i.e., P is a polyhedron and all functionals g_i(x) are linear). A linear programming problem is a special case of a convex programming problem. Using the necessary condition of minimum, it is easy to prove that a point of minimum (if it exists) is a vertex of the polyhedron.
4) If H = Rⁿ, f(x) = (a,x) + b + ½(Cx,x) (a quadratic functional), and the constraints are linear, then (*) is a problem of quadratic programming. Here C is a symmetric matrix. If C is nonnegatively definite, then a problem of quadratic programming is a special case of a problem of convex programming.
Definition. A functional f(x) is called concave (strictly concave, strongly concave) if -f(x) is convex (strictly convex, strongly convex).
Theorem. If (*) is a convex programming problem, then any point of local minimum is a point of the global minimum.
So a convex functional has no local minima other than global ones.
Theorem. Let a functional f(x) be convex on a Hilbert space H and differentiable at x* ∈ H. If f′(x*) = 0, then x* is a point of global minimum of f(x) on H.
Theorem. Let (*) be a solvable convex programming problem. Then the set of its solutions X* = Argmin_{x∈X} f(x) is convex. If f(x) is a strictly convex functional, then X* contains a unique point x*.
10. Solvability of a convex programming problem
Theorem. A convex continuous (lower semicontinuous) functional f(x) on a closed bounded convex set of a Hilbert space H has a minimum point (i.e., the convex programming problem (*) is solvable).
Definition. A sequence x^(n), n=1, 2, …, of elements of a normed space converges weakly to an element x if for any linear continuous functional ℓ(x): ℓ(x^(n)) → ℓ(x). If H is a Hilbert space, then x^(n) → x weakly in H if for any a ∈ H: (a, x^(n)) → (a, x).
Definition. A set X is called a weak (sequential) compact if from any sequence of its elements it is possible to choose a subsequence weakly convergent to an element of X.
Definition. A functional f(x) is called weakly lower semicontinuous on a set X if for any x^(0) ∈ X and any sequence x^(n) ∈ X, n=1, 2, …, such that x^(n) → x^(0) weakly: lim inf f(x^(n)) ≥ f(x^(0)).
Theorem (Weierstrass). A weakly lower semicontinuous functional on a weak compact has a minimum point.
We need to prove only two assertions:
1. A convex continuous (lower semicontinuous) functional is weakly lower semicontinuous.
2. A closed bounded convex set of a Hilbert space H is a weak compact.
Examples. 1) The closed ball S_1(0) = { x : ||x|| ≤ 1 } in an infinite-dimensional Hilbert space is not a compact, but it is a weak compact. 2) The sphere { x : ||x|| = 1 } in an infinite-dimensional Hilbert space is closed and bounded but not weakly closed. So it is not a weak compact.
Theorem. Let { x^(n) }_{n=1}^{∞}, x^(n) ∈ H, x^(n) → x weakly, and ||x^(n)|| → ||x||. Then x^(n) → x.
I.e., weak convergence together with convergence of norms implies strong convergence (the H-property of Hilbert spaces).
A Hilbert space is also strictly normed: if x, y ≠ 0 and ||x + y|| = ||x|| + ||y||, then y = λx with λ > 0.
The result concerning the solvability of a convex programming problem is not valid for all Banach spaces.
Example. Let us consider the Banach space C[-1,1] and the unit ball in this space, S_1(0) = { z(s) : -1 ≤ z(s) ≤ 1, s ∈ [-1,1] }. Obviously, S_1(0) is a closed bounded convex set. The linear continuous functional
f(z) = ∫_{-1}^{0} z(s) ds - ∫_{0}^{1} z(s) ds
has no minimum point on this set.
The theorem formulated above is valid only for reflexive Banach spaces.
Theorem. Let f(x) be a continuous strongly convex functional defined on a closed convex set X of a Hilbert space H. Then f(x) has a minimum point in X, and this point is unique.
Let f* = min{ f(x) : x ∈ X } and x* = argmin{ f(x) : x ∈ X }, i.e., f* = f(x*).
Definition. A sequence x^(n), n=1, 2, …, of elements of a set X is called a minimizing sequence if f(x^(n)) → f* as n → ∞.
Definition. A sequence x^(n), n=1, 2, …, of elements of a set X is called convergent to a minimum point if x^(n) → x* as n → ∞.
Theorem. If a functional f(x) is strongly convex, then any minimizing sequence converges to the minimum point.
11. Criteria of convexity and strong convexity
In this chapter, and only in this chapter, to simplify the formulation of the theorems we treat a convex functional as a strongly convex functional with constant κ = 0.
Theorem. Let f(x) be a functional differentiable on a convex set X ⊆ H, H a Hilbert space. Then f(x) is strongly convex with a constant κ ≥ 0 if and only if for any x, x̄ ∈ X:
f(x) ≥ f(x̄) + (f′(x̄), x - x̄) + κ||x - x̄||².
Theorem. Let f(x) be a continuously differentiable functional defined on a convex set X ⊆ H. Then f(x) is strongly convex with a constant κ ≥ 0 if and only if for any x, x̄ ∈ X:
(f′(x) - f′(x̄), x - x̄) ≥ 2κ||x - x̄||².
Theorem. Let f(x) be a twice continuously differentiable functional defined on a convex set X of a Hilbert space H, with int X ≠ Ø. Then f(x) is strongly convex with a constant κ ≥ 0 if and only if for any x ∈ X and any h ∈ H:
(f″(x)h, h) ≥ 2κ||h||².
In the finite-dimensional case (H = Rⁿ) this means that the minimal eigenvalue of the Hessian satisfies λ_min(x) ≥ 2κ.
Let A be a symmetric matrix defining a linear operator A: Rⁿ → Rⁿ. Let us consider the quadratic functional
f(x) = ½(Ax, x) - (b, x) + c.
It is easy to calculate that f′(x) = Ax - b, f″(x) = A.
If A is strictly positively definite, then λ_min(A) > 0, and the quadratic functional is strongly convex. If λ_min(A) = 0, the functional is convex but not strongly convex. If λ_min(A) < 0, then f(x) is not convex.
If A is a self-adjoint operator A: H → H, we can define the quadratic functional in an infinite-dimensional Hilbert space. If A is strictly positively definite (for any x ≠ 0: (Ax, x) > 0), this does not imply that there exists λ_min(A) > 0. In this case the quadratic functional is strictly convex but, generally speaking, not strongly convex.
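A short numerical check of the eigenvalue criterion above for quadratic functionals (an illustrative sketch; the matrices are arbitrary examples):

```python
# f(x) = 1/2 (Ax,x) - (b,x) is strongly convex iff lambda_min(A) > 0 (then 2*kappa = lambda_min).
import numpy as np

def classify(A, tol=1e-12):
    lmin = np.linalg.eigvalsh(A).min()   # smallest eigenvalue of the symmetric matrix A
    if lmin > tol:
        return "strongly convex, kappa = lambda_min/2 = %g" % (lmin / 2)
    if lmin < -tol:
        return "not convex"
    return "convex but not strongly convex"

print(classify(np.array([[2.0, 0.0], [0.0, 1.0]])))   # lambda_min = 1 > 0
print(classify(np.array([[1.0, 1.0], [1.0, 1.0]])))   # lambda_min = 0
print(classify(np.array([[1.0, 2.0], [2.0, 1.0]])))   # lambda_min = -1 < 0
```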
12. Least squares method. Pseudoinversion
Let us consider a system of linear algebraic equations (SLAE):
Ax = b, x ∈ Rⁿ, b ∈ Rᵐ, A: Rⁿ → Rᵐ.
The system can be solvable or unsolvable. In the beginning of the XIX century, Gauss and Legendre independently proposed the least squares method. Instead of solving the SLAE, they suggested minimizing the quadratic functional (discrepancy):
Φ(x) = ||Ax - b||² = (A*Ax, x) - 2(A*b, x) + (b, b),
where A* is the conjugate (transposed) matrix. The matrix A*A is nonnegatively definite, so Φ(x) is a convex functional. For a convex differentiable functional the problem of finding min_{x∈Rⁿ} Φ(x) is equivalent to finding a stationary point, namely solving the equation Φ′(x) = 0. It is easy to see that Φ′(x) = 2(A*Ax - A*b), Φ″(x) = 2A*A ≥ 0. Then the equation Φ′(x) = 0 (the gradient of the discrepancy is equal to zero) turns into a system of linear algebraic equations with a square nonnegatively definite matrix (the system of normal equations):
A*Ax = A*b.
In the finite-dimensional case it is easy to prove that the system of normal equations has a solution for any vector b (while a solution of the original SLAE may not exist). This solution is called a pseudosolution (or least squares solution) of the SLAE Ax = b. The pseudosolution can be nonunique if the determinant det(A*A) = 0. If det(A*A) ≠ 0, then the pseudosolution is unique. The set of pseudosolutions is an affine (or linear) subspace in Rⁿ; it is convex and closed.
If the system Ax = b has a solution, then it coincides with a solution of the system A*Ax = A*b. In this case min_{x∈Rⁿ} Φ(x) = μ² = 0. But if min_{x∈Rⁿ} Φ(x) = μ² > 0, then the system Ax = b has no solutions, though, as was shown above, there exists a pseudosolution (maybe nonunique). The number μ ≥ 0 is called the measure of incompatibility of the system Ax = b.
Definition. A normal pseudosolution x_n of the system Ax = b is a pseudosolution with minimal norm, i.e., it is a solution of the extremal problem: find min{ ||x|| : A*Ax = A*b }.
It is easy to prove that a normal pseudosolution exists and is unique. We can formulate a problem: given a vector b, find the corresponding normal pseudosolution x_n. The operator A⁺ that realizes this correspondence is linear. It is called the pseudoinverse of A: x_n = A⁺b. If a solution of the original system Ax = b exists and is unique for any vector b, then A⁺ = A⁻¹. If there exists a unique solution of the system A*Ax = A*b for any vector b (which means that the operator A*A is invertible), then A⁺ = (A*A)⁻¹A*. In the general case, an expression for A⁺ has the following form: A⁺ = lim (A*A + αE)⁻¹A* as α → 0, α > 0.
If instead of a vector b a vector b̃ is given such that ||b - b̃|| ≤ δ, δ > 0, and x_n = A⁺b, x̃_n = A⁺b̃, then ||x̃_n - x_n|| ≤ ||A⁺||δ (this proves the stability of a normal pseudosolution with respect to disturbances of the right-hand side). So, the problem of finding a normal pseudosolution is well-posed if the pseudoinverse operator can be calculated exactly. But the problem of calculating the pseudoinverse operator can be ill-posed: the problem of finding x_n = A⁺b may be unstable with respect to errors in A.
Example. Let us consider the system:
x + y = 1,
x + y = 1.
Obviously, the system has an infinite number of solutions, and (1/2, 1/2) is its normal solution.
In this case,
A = [ 1 1 ; 1 1 ], b = ( 1 ; 1 ).
Let the matrix contain an error:
x + y = 1,
(1+ε)x + y = 1, ε > 0.
Such an approximate system has a unique "approximate" solution x = 0, y = 1, which does not depend on ε. Moreover, as ε → 0 it does not converge to the exact normal pseudosolution (1/2, 1/2). Assuming that all four entries of the matrix have errors, it is easy to construct examples of convergence of "approximate" solutions to different vectors as the errors tend to zero.
Let us change the vector b and consider the system:
x + y = 1/2,
x + y = 3/2.
It has no solutions. Its normal pseudosolution is the same as in the previous case: (1/2, 1/2). It is no problem to construct examples of its instability.
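The two systems above can be checked numerically; a minimal sketch (numpy's pinv computes the pseudoinverse A⁺ via the singular value decomposition):

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 1.0]])
b = np.array([1.0, 1.0])
print(np.linalg.pinv(A) @ b)                 # normal pseudosolution: [0.5 0.5]

for eps in [1e-2, 1e-4, 1e-8]:               # perturb one entry of the matrix
    A_eps = np.array([[1.0, 1.0],
                      [1.0 + eps, 1.0]])
    print(eps, np.linalg.solve(A_eps, b))    # always [0. 1.]: no convergence to [0.5 0.5]

b2 = np.array([0.5, 1.5])                    # the incompatible system
print(np.linalg.pinv(A) @ b2)                # again [0.5 0.5]
```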
13. Minimizing sequences
Let us consider the problem: find x* = argmin{ f(x) : x ∈ X }. The main approach to solving this problem is to construct an iterative process:
x^(k+1) = x^(k) + α_k h^(k), k = 0, 1, 2, …,
where x^(0) is an initial approximation, h^(k) is a direction of minimization, and α_k is a step along the direction of minimization.
The basic algorithm for constructing a sequence x^(k) is the following:
1. Assign k = 0, define x^(0).
2. Check the stopping rules; if they are satisfied, go to 3; if not, go to 4.
3. Stop calculations.
4. Calculate h^(k).
5. Calculate α_k.
6. Calculate x^(k+1).
7. Add 1 to k. Go to 2.
If h^(k) and α_k are calculated without calculating derivatives of f(x), then such methods are called zero-order methods. If it is necessary to calculate first derivatives, then it is a first-order (or gradient) method, etc.
We call a sequence x^(k) minimizing if f(x^(k)) → f* = min_{x∈X} f(x). We also need x^(k) → x* = argmin_{x∈X} f(x) (convergence of arguments). In this case, it is possible to estimate the rate of convergence. A sequence x^(k) converges to x* with a linear rate if ||x^(k+1) - x*|| ≤ q||x^(k) - x*||, 0 < q < 1. If ||x^(k+1) - x*|| ≤ q_k||x^(k) - x*||, q_k → 0 as k → ∞, then the convergence is superlinear. If ||x^(k+1) - x*|| ≤ C||x^(k) - x*||², C = const, then the convergence is quadratic.
The most frequently used stopping rules are:
1) ||x^(k+1) - x^(k)|| ≤ ε_1;
2) |f(x^(k+1)) - f(x^(k))| ≤ ε_2;
3) ||f′(x^(k))|| ≤ ε_3,
where ε_1, ε_2, ε_3 are given small positive numbers. Criteria 1) and 2) are usually applied simultaneously. Criterion 3) is typical for gradient methods.
Many minimization methods are so-called descent methods. h is a direction of descent for f(x) at x if f(x + αh) < f(x) for all sufficiently small α > 0. We denote the set of directions of descent by V(x, f). A minimization method is called a descent method if h^(k) ∈ V(x^(k), f).
Lemma. Let f(x) be differentiable at a point x ∈ H. If (f′(x), h) < 0, then h ∈ V(x, f). If h ∈ V(x, f), then (f′(x), h) ≤ 0.
In particular, if h^(k) = -f′(x^(k)), then it is a method of gradient descent.
If x^(k) and h^(k) have been calculated, then we need to calculate α_k. In very many methods it is necessary to solve a one-dimensional extremal problem:
f(x^(k) + α_k h^(k)) = min_{α∈R¹} f(x^(k) + αh^(k)).
If h^(k) is a direction of descent, then
f(x^(k) + α_k h^(k)) = min_{α≥0} f(x^(k) + αh^(k)), i.e., α_k = argmin_{α≥0} f(x^(k) + αh^(k)).
In some cases a solution of this problem can be found analytically. E.g., for the quadratic functional f(x) = ½(Ax, x) - (b, x) (A = A* > 0 is a symmetric matrix) this solution is easy to find. A minimum point of the function of one variable
φ(α) = f(x^(k) + αh^(k)) = ½α²(Ah^(k), h^(k)) + α(Ax^(k) - b, h^(k)) + f(x^(k))
is
α_k = -(Ax^(k) - b, h^(k)) / (Ah^(k), h^(k)) = -(f′(x^(k)), h^(k)) / (Ah^(k), h^(k)).
14. Some methods for solving one-dimensional extremal problems
Let a function of one variable f(x) be defined on [a,b] and have a unique minimum point. The two most popular methods of finding it are described below.
Golden section. At first, we find numbers F_1 and F_2 such that
F_1 + F_2 = 1, F_2² = F_1,
i.e., F_1 = (3 - √5)/2 ≈ 0.38, F_2 = (√5 - 1)/2 ≈ 0.62.
Let us define the points a′ = a + F_1(b - a), b′ = a + F_2(b - a); here a′ < b′. After that, we compare f(a′) and f(b′). If f(a′) ≤ f(b′), then we move b to b′. If f(a′) > f(b′), we move a to a′. When we divide the new segment again, one of the two interior points is "old", so only one new value of f is computed per step. The stopping rule is: b - a ≤ ε, ε > 0.
Quadratic approximation. Let a_1 < c_1 < b_1 and f(c_1) ≤ f(a_1), f(c_1) ≤ f(b_1). Using these three points, we approximate f(x) by a quadratic function (parabola) and can calculate its minimum point. After that we can stop or reiterate the procedure.
15. Method of steepest descent
The method of steepest descent (or gradient descent) was proposed by Cauchy in the middle of the 19th century. In this method
h^(k) = -f′(x^(k)), α_k = argmin_{α≥0} f(x^(k) + αh^(k)),
i.e., α_k is a solution of the one-dimensional problem on a ray.
Theorem. Let a functional f(x) satisfy the following conditions:
1) it is defined, bounded from below and differentiable on a Hilbert space H;
2) its gradient satisfies the Lipschitz condition: for any x, x̂ ∈ H: ||f′(x) - f′(x̂)|| ≤ M||x - x̂||, where M > 0 is a constant.
Then for any initial condition x^(0) the sequence x^(k) constructed according to the method of steepest descent has the property lim_{k→∞} ||f′(x^(k))|| = 0.
Remarks. 1) Any limit point of the sequence x^(k) is a stationary point of f(x). If the functional is strongly convex, then it is the minimum point.
2) α_k can also be calculated by different methods. E.g., α_k = 1/(2M), or, given 0 < ε < 1, choose α_k using the condition
f(x^(k)) - f(x^(k+1)) ≥ εα_k||f′(x^(k))||².
3) The method of steepest descent for a strongly convex functional converges with a linear rate.
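A sketch of steepest descent with the exact step for the quadratic functional f(x) = ½(Ax,x) - (b,x) (the matrix, right-hand side, and tolerance are assumptions for illustration):

```python
import numpy as np

def steepest_descent(A, b, x0, eps=1e-8, kmax=10000):
    x = x0.astype(float)
    for k in range(kmax):
        g = A @ x - b                        # f'(x^(k))
        if np.linalg.norm(g) <= eps:         # stopping rule 3) of Section 13
            break
        alpha = (g @ g) / (g @ (A @ g))      # exact step along h = -g
        x = x - alpha * g
    return x, k

A = np.array([[3.0, 1.0], [1.0, 2.0]])       # symmetric positive definite
b = np.array([1.0, 1.0])
x, k = steepest_descent(A, b, np.zeros(2))
print(x, k, np.linalg.norm(A @ x - b))       # x ~ A^{-1} b; linear convergence rate
```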
16. Conjugate gradient method
Let us consider a quadratic functional f(x) = ½(Ax, x) - (b, x), A = A* > 0, x ∈ Rⁿ.
The problem: find directions (vectors) h^(0), …, h^(n-1) ∈ Rⁿ such that
f(x^(n)) = min_{x∈Rⁿ} f(x),
where x^(0) ∈ Rⁿ is arbitrary and
x^(k+1) = x^(k) + α_k h^(k), α_k = argmin_α f(x^(k) + αh^(k)).
If we can find such vectors, then we can find a point of minimum of a quadratic functional after a finite number of iterations.
Definition. Two vectors h′, h″ ≠ 0 are conjugate (with respect to A) if (Ah′, h″) = 0.
Definition. Nonzero vectors h^(0), …, h^(k) ∈ Rⁿ are conjugate (with respect to A) if (Ah^(i), h^(j)) = 0 for i ≠ j, i, j = 0, …, k.
Remark. If vectors are conjugate with respect to A = A* > 0, then they are linearly independent. So, if we find such a system h^(0), …, h^(n-1) ∈ Rⁿ, then it is a basis in Rⁿ.
Theorem. If the vectors h^(k), k = 0, …, m-1 (m ≤ n), are conjugate with respect to A = A* > 0 and the α_k are solutions of the one-dimensional minimization problem (see above), then
f(x^(m)) = min_{x∈X_m} f(x), where X_m = { x = x^(0) + Σ_{j=0}^{m-1} β_j h^(j) } and the β_j are arbitrary real numbers.
Remark. If m = n, then f(x^(n)) = min_{x∈Rⁿ} f(x).
This is the conjugate directions method. It guarantees convergence of the minimization procedure for a quadratic functional after n iterations.
Let us describe the conjugate gradient method. The initial approximation x^(0) is an arbitrary vector, and
h^(0) = -f′(x^(0)); h^(k) = -f′(x^(k)) + β_{k-1}h^(k-1), k ≥ 1; f′(x) = Ax - b;
β_{k-1} = ||f′(x^(k))||² / ||f′(x^(k-1))||² for k ≠ n, 2n, …, and β_{k-1} = 0 for k = n, 2n, … (periodic restarts);
or, equivalently for the quadratic case,
β_{k-1} = (f′(x^(k)), f′(x^(k)) - f′(x^(k-1))) / ||f′(x^(k-1))||² for k ≠ n, 2n, …, and β_{k-1} = 0 for k = n, 2n, ….
The coefficients β_{k-1} are such that the vectors h^(k) and h^(k-1) are conjugate with respect to A. We suppose that f′(x^(k)) ≠ 0 (if f′(x^(k)) = 0, the minimization procedure should be stopped). So,
0 = (h^(k), Ah^(k-1)) = (-f′(x^(k)) + β_{k-1}h^(k-1), Ah^(k-1)),
and
β_{k-1} = (f′(x^(k)), Ah^(k-1)) / (h^(k-1), Ah^(k-1)).
Finally, x^(k+1) = x^(k) + α_k h^(k), α_k = argmin_α f(x^(k) + αh^(k)).
Lemma. Under the conditions formulated above, the vectors h^(0), …, h^(n-1) are conjugate with respect to A, and the conjugate gradient method is a particular case of the conjugate directions method.
Theorem. The conjugate gradient method finds a minimum point of the quadratic functional f(x) = ½(Ax, x) - (b, x) (A = A* > 0) in the finite-dimensional space Rⁿ after no more than n iterations.
The conjugate gradient method can be applied to the minimization of a nonquadratic differentiable functional if the gradient of the functional satisfies a Lipschitz condition:
x^(0) ∈ Rⁿ, x^(k+1) = x^(k) + α_k h^(k), α_k = argmin_{α≥0} f(x^(k) + αh^(k)),
h^(0) = -f′(x^(0)), h^(k) = -f′(x^(k)) + β_{k-1}h^(k-1).
If β_{k-1} = ||f′(x^(k))||² / ||f′(x^(k-1))||², then it is the Fletcher–Reeves method.
If β_{k-1} = (f′(x^(k)), f′(x^(k)) - f′(x^(k-1))) / ||f′(x^(k-1))||², then it is the Polak–Ribiere method.
For a quadratic functional both methods are equivalent. If f(x) is a strongly convex functional, then the conjugate gradient method has a superlinear rate of convergence.
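A sketch of the conjugate gradient method with the Fletcher–Reeves coefficient for the quadratic functional (the test data are assumed); for a quadratic it terminates after at most n iterations, in agreement with the theorem above:

```python
import numpy as np

def conjugate_gradient(A, b, x0):
    x = x0.astype(float)
    g = A @ x - b                            # f'(x^(0))
    h = -g
    for k in range(len(b)):                  # at most n iterations for a quadratic
        if np.linalg.norm(g) == 0.0:
            break
        alpha = -(g @ h) / (h @ (A @ h))     # exact one-dimensional minimization
        x = x + alpha * h
        g_new = A @ x - b
        beta = (g_new @ g_new) / (g @ g)     # Fletcher-Reeves coefficient
        h = -g_new + beta * h
        g = g_new
    return x

rng = np.random.default_rng(2)
M = rng.standard_normal((6, 6))
A = M.T @ M + np.eye(6)                      # A = A* > 0
b = rng.standard_normal(6)
x = conjugate_gradient(A, b, np.zeros(6))
print(np.linalg.norm(A @ x - b))             # ~0 after at most n = 6 iterations
```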
17. Newton’s method
Let )(xf be a twice differentiable functional such that there exists a bounded
linear operator 1" )]([ xf for any Hx , H is a Hilbert space.
Newton’s method produces the following minimizing sequence: Hx )0( is
an initial approximation: )()]([ )('1)(")()1( nnnn xfxfxx , ,...1,0n .
Remark. The sequence }{ )(nx in Newton’s method can be constructed in such
way:
1) Hx )0( ;
2) )(minarg)1( xfx nHx
n
,
here )),)(((2
1)),(()()( )()()(")()(')( nnnnnn
n xxxxxfxxxfxfxf is a quadratic
approximation of f(x) in a point )(nx according to a Taylor series.
Theorem. Suppose that:
1) f(x) is a twice differentiable strongly convex functional in H;
2) the second derivative f″(x) satisfies the Lipschitz condition, i.e. there exists a constant L > 0 such that for any x′, x″ ∈ H, ||f″(x′) − f″(x″)|| ≤ L ||x′ − x″||;
3) the initial approximation is chosen such that
q = (L/(2κ)) ||x^(1) − x^(0)|| < 1,
where κ > 0 is the constant of strong convexity.
Then:
1) x^(n) → x* ∈ H;
2) ||x^(n) − x*|| ≤ (2κ/L) q^(2^n),
where x* is the solution of the minimization problem (*).
An advantage of Newton’s method is its very high rate of convergence. Disadvantages are: 1) the method converges only from a specially chosen initial approximation; 2) it is necessary to invert the Hessian at each iteration. There exist modifications of the method.
If the functional is quadratic, f(x) = ½(Ax, x) − (b, x), A = A* > 0, then Newton’s method converges in one iteration starting from an arbitrary initial approximation.
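A minimal Python sketch of Newton’s method is given below; instead of inverting the Hessian, a linear system is solved at each iteration. The nonquadratic strongly convex test functional is our own hypothetical example (on a quadratic one the loop would stop after a single step).

    import numpy as np

    def newton(grad, hess, x0, tol=1e-12, max_iter=50):
        # x_{n+1} = x_n - [f''(x_n)]^{-1} f'(x_n), solving a system each step
        x = np.asarray(x0, dtype=float)
        for _ in range(max_iter):
            g = grad(x)
            if np.linalg.norm(g) < tol:
                break
            x = x - np.linalg.solve(hess(x), g)
        return x

    # toy functional f(x) = 0.5*(Ax,x) - (b,x) + (1/12)*sum(x_i^4)
    A = np.array([[3.0, 1.0], [1.0, 2.0]])
    b = np.array([1.0, -1.0])
    grad = lambda x: A @ x - b + x**3 / 3.0
    hess = lambda x: A + np.diag(x**2)
    print(newton(grad, hess, np.zeros(2)))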
18. Zero-order methods
If the calculation of derivatives of a functional in a finite-dimensional space is difficult or impossible, zero-order methods can sometimes be applied. We list some of the best-known methods.
1. Coordinate descent.
2. Hooke-Jeeves method.
3. Nelder-Mead method.
4. Random search.
5. Genetic algorithms.
In any case, gradient methods make it possible to study at least the local behavior of a functional. From this point of view they are preferable when derivatives are available; a zero-order usage sketch is given below.
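For completeness, here is a short sketch of calling a zero-order method from a standard library (SciPy’s Nelder–Mead implementation); the test function and tolerances are our own arbitrary choices.

    import numpy as np
    from scipy.optimize import minimize

    # Nelder-Mead uses only values of f, no derivatives
    f = lambda x: (x[0] - 1.0)**2 + 10.0 * (x[1] + 2.0)**2
    res = minimize(f, x0=np.zeros(2), method='Nelder-Mead',
                   options={'xatol': 1e-8, 'fatol': 1e-8})
    print(res.x)   # approximately [1, -2]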
19. Conditional gradient method
Let a differentiable functional f(x) be defined on a closed convex bounded set X in a Hilbert space H. The conditional gradient method (or the Frank–Wolfe algorithm) produces two sequences {x^(n)} ⊂ X and {x̄^(n)} ⊂ X. An initial approximation x^(0) ∈ X is an arbitrary point of X. After that we choose x̄^(n) solving the problem
(f′(x^(n)), x̄^(n)) = min_{x ∈ X} (f′(x^(n)), x)
(starting with n = 0), and construct
x^(n+1) = x^(n) + α_n (x̄^(n) − x^(n))
under the condition 0 ≤ α_n ≤ 1. We repeat the process.
All points x̄^(n) ∈ X exist and belong to the boundary of X, because the bounded linear functional (f′(x^(n)), x) has a point of minimum on the boundary of X.
The “parametrical steps” α_n can be chosen in different ways. We will consider Demyanov’s method.
Theorem. Let a functional f(x) be defined on a closed convex bounded set X ⊂ H, and let its gradient f′(x) satisfy a Lipschitz condition (i.e. there exists a constant L > 0 such that for any x, y ∈ X, ||f′(x) − f′(y)|| ≤ L ||x − y||). Let also the parametrical steps α_n ∈ [0, 1] be solutions of the one-dimensional extremal problem
f(x^(n+1)) = f(x^(n) + α_n (x̄^(n) − x^(n))) = min_{0 ≤ α ≤ 1} f(x^(n) + α(x̄^(n) − x^(n))).
Then for the sequences {x^(n)} ⊂ X and {x̄^(n)} ⊂ X defined above the following assertions are valid:
1) lim_{n→∞} (f′(x^(n)), x^(n) − x̄^(n)) = 0;
2) if the functional f(x) is convex, then f(x*) = lim_{n→∞} f(x^(n)) = min_{x ∈ X} f(x);
3) if the functional f(x) is strongly convex, then the sequence {x^(n)} ⊂ X converges to x* ∈ X, the unique point of the global minimum of f(x) in X, and
||x^(n) − x*|| ≤ C/√n.
It is very easy to implement the method if H = R^n, the set X is a convex bounded polyhedron, and the functional f(x) is quadratic. In this case, the problem of finding the minimum of (f′(x^(n)), x) and computing x̄^(n) is a linear programming problem. If we know the vertices of the polyhedron X, we can apply an enumeration method; if we don’t know the vertices, we can apply the simplex method. The problem of finding α_n is trivial.
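A minimal Python sketch of the conditional gradient method for a quadratic functional on a box X = {x : 0 ≤ x_i ≤ C} follows; the box is our own example of a convex bounded polyhedron, and the matrix and data are illustrative. On a box the linear subproblem is solved coordinate-wise, and for a quadratic f the step α_n is found exactly and clipped to [0, 1].

    import numpy as np

    def frank_wolfe(A, b, C, n_iter=200):
        x = np.zeros(len(b))                  # x^(0) in X
        for _ in range(n_iter):
            g = A @ x - b                     # f'(x)
            x_bar = np.where(g > 0, 0.0, C)   # vertex minimizing (g, x) over X
            d = x_bar - x
            denom = d @ A @ d
            if denom <= 1e-15:
                break
            alpha = np.clip(-(g @ d) / denom, 0.0, 1.0)
            x = x + alpha * d
        return x

    A = np.array([[2.0, 0.5], [0.5, 1.0]])
    b = np.array([3.0, -1.0])
    print(frank_wolfe(A, b, C=1.0))

The unconstrained minimizer lies outside the box here, so the iterates converge to a point on the boundary of X, as the theorem above predicts.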
20. Projection of conjugate gradients method
We will describe the method of projection of conjugate gradients for the minimization of a quadratic functional on a set of linear constraints in R^n. A finite number of iterations is needed.
Let J be a set of indexes and C_J be an m × n matrix whose rows are the vectors c^(i), i ∈ J.
Lemma. If the vectors c^(i), i ∈ J, are linearly independent, then the matrix C_J C_J* is nondegenerate.
Let us define P = C_J* (C_J C_J*)⁻¹ C_J. It is easy to see that
P = P*, P² = P, (I − P)² = I − P, P(I − P) = 0.
An operator P : R^n → R^n such that P² = P = P* is called a projection operator. Obviously, I − P is a projection operator as well. From the above it is clear that PR^n and (I − P)R^n are orthogonal linear subspaces. So, P is the orthogonal projection operator onto the subspace spanned by the vectors c^(i), i ∈ J.
We minimize a quadratic functional
f(x) = ½(Ax, x) − (b, x)
under the linear constraints
(c^(i), x) − d_i = 0, i ∈ J.
Here x, b, c^(i) ∈ R^n, the d_i are numbers, the n × n matrix A is symmetric positive definite, and J is a finite set of indexes.
Let the vectors c^(i), i ∈ J, be linearly independent, and let x^(0) satisfy the linear constraints. We define a new variable y:
x = x^(0) + (I − P) y.
Consider the function φ(y) = f(x^(0) + (I − P) y). Note that φ′(y) = (I − P) f′(x).
Lemma. Let ȳ be a point of global minimum of φ(y). Then x̄ = x^(0) + (I − P) ȳ is a point of minimum of f(x) under the linear constraints.
Theorem. The minimization problem for a quadratic functional on a set of linear constraints can be solved in a finite number of iterations using the method of projection of conjugate gradients. Starting from an initial approximation x^(0) satisfying the constraints, we apply the iterations:
h^(0) = (I − P) f′(x^(0));
x^(k+1) = x^(k) − α_k h^(k), α_k = (f′(x^(k)), h^(k)) / (p^(k), h^(k)), p^(k) = Ah^(k);
h^(k+1) = (I − P) f′(x^(k+1)) + β_k h^(k), β_k = ||(I − P) f′(x^(k+1))||² / ||(I − P) f′(x^(k))||²,
k = 0, 1, ... .
If we have linear constraints of inequality type, the method can be applied as well; it is just a bit more complicated.
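A minimal Python sketch of the iterations above is given below; the matrix, the single equality constraint, and the initial feasible point (the minimum-norm solution of C_J x = d) are our own illustrative assumptions.

    import numpy as np

    def projected_cg(A, b, CJ, d, max_iter=None):
        n = len(b)
        P = CJ.T @ np.linalg.solve(CJ @ CJ.T, CJ)   # projector onto span of rows
        Q = np.eye(n) - P                           # projector onto {y : C_J y = 0}
        x = CJ.T @ np.linalg.solve(CJ @ CJ.T, d)    # feasible starting point
        g = Q @ (A @ x - b)                         # (I - P) f'(x)
        h = g.copy()
        for _ in range(max_iter or n):
            if np.linalg.norm(g) < 1e-12:
                break
            alpha = (g @ h) / (h @ A @ h)
            x = x - alpha * h
            g_new = Q @ (A @ x - b)
            beta = (g_new @ g_new) / (g @ g)
            h = g_new + beta * h
            g = g_new
        return x

    A = np.diag([1.0, 2.0, 3.0])
    b = np.array([1.0, 1.0, 1.0])
    CJ = np.array([[1.0, 1.0, 1.0]])    # constraint: x1 + x2 + x3 = 1
    print(projected_cg(A, b, CJ, np.array([1.0])))   # expected (6/11, 3/11, 2/11)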
21. Numerical methods for solving ill-posed problems on special compact sets
Let us consider again an operator equation:
Az = u,
where A is a linear bounded injective operator mapping a normed space Z into a normed space U. Let z̄ be an exact solution of the operator equation, ū = Az̄ the exact right-hand side, and u_δ an approximate right-hand side such that ||u_δ − ū|| ≤ δ, δ ≥ 0. Let A_h : Z → U be a linear bounded operator such that ||A − A_h|| ≤ h, h ≥ 0. We denote the pair of errors (δ, h) by η = (δ, h).
It is given that z̄ ∈ D, where D ⊂ Z is a compact. Then we can take as an approximate solution any z_η ∈ D that satisfies the inequality
||A_h z_η − u_δ|| ≤ δ + h ||z_η||
(an η-quasisolution); then z_η → z̄ as η → 0.
Suppose it is known from a priori considerations that the exact solution of the operator equation is a monotonic function on [a, b] (for definiteness, nonincreasing) bounded from below and from above by constants C₁ and C₂. Without loss of generality, we suppose that the upper bound is C > 0 and the lower bound is 0. We consider the set Z_C of nonincreasing functions z(s) bounded by the constants C and 0, i.e. C ≥ z(s) ≥ 0 for all s ∈ [a, b]. This set Z_C is a compact in L₂[a, b], and thereby z_η → z̄ as η → 0 in L₂[a, b]. This result can be strengthened if the exact solution is continuous on [a, b], i.e. z̄(s) ∈ Z_C ∩ C[a, b]. In this case, regularized approximate solutions converge to the exact solution uniformly on any segment [a′, b′] ⊂ (a, b). If the exact solution is piecewise continuous, then the convergence is uniform on any segment [a′, b′] ⊂ (a, b) that doesn’t contain points of discontinuity of z̄. Obviously, the set Z_C is convex.
Other sets under consideration are the following: the set Ẑ_C of functions concave on [a, b] and bounded from above and from below by the constants C and 0, and the set Ž_C of functions concave on [a, b] and monotonic (for definiteness, nonincreasing) that are bounded from above and from below by the constants C and 0. Both these sets are compacts in L₂[a, b], and the convergence of approximate solutions to the exact solution is uniform on any segment [a′, b′] ⊂ (a, b].
Let Z = L₂[a, b], U = L₂[c, d]. Let us consider the quadratic functional (the discrepancy)
Φ(z) = ||A_h z − u_δ||²; Φ′(z) = 2(A_h* A_h z − A_h* u_δ).
In order to find a quasisolution z_η ∈ D it is sufficient to minimize the discrepancy on the set D using the stopping rule
Φ(z) ≤ (δ + h ||z||)².
The conditional gradient method can be applied.
After finite-dimensional approximation of the sets Z_C, Ẑ_C, Ž_C we get the sets M_C, M̂_C, M̌_C respectively:
M_C = { z ∈ R^n : z_i − z_{i+1} ≥ 0, i = 1, ..., n−1; 0 ≤ z_i ≤ C, i = 1, ..., n };
M̂_C = { z ∈ R^n : z_{i+1} − 2z_i + z_{i−1} ≤ 0, i = 2, ..., n−1; 0 ≤ z_i ≤ C, i = 1, ..., n };
M̌_C = { z ∈ R^n : z_i − z_{i+1} ≥ 0, i = 1, ..., n−1; z_{i+1} − 2z_i + z_{i−1} ≤ 0, i = 2, ..., n−1; 0 ≤ z_i ≤ C, i = 1, ..., n },
where z_i = z(s_i), and {s_i}, i = 1, ..., n, is a uniform grid on [a, b] with a step h_s.
These sets M_C, M̂_C, M̌_C are closed bounded convex polyhedrons in R^n, and their vertices T^(j) are as follows.
Vertices of M_C (“step functions”): T^(0) = 0 and, for j = 1, ..., n,
T_i^(j) = C, i ≤ j; T_i^(j) = 0, i > j.
Vertices of M̌_C (a plateau at the level C followed by a linear descent to 0): T^(0) = 0 and, for j = 1, ..., n,
T_i^(j) = C, i ≤ j; T_i^(j) = C(n − i)/(n − j), i > j.
Vertices of M̂_C (a linear rise, a plateau at the level C, and a linear descent; two indexes 1 ≤ i ≤ j ≤ n): T^(0,0) = 0 and
T_k^(i,j) = C(k − 1)/(i − 1), k < i; T_k^(i,j) = C, i ≤ k ≤ j; T_k^(i,j) = C(n − k)/(n − j), k > j.
Let us consider the Fredholm integral equation of the 1st kind:
Az = ∫_a^b K(x, s) z(s) ds = u(x), c ≤ x ≤ d,
where the kernel K(x, s) is nondegenerate, real, and continuous on Π = {c ≤ x ≤ d, a ≤ s ≤ b}. Instead of the exact right-hand side ū = Az̄ we are given u_δ: ||u_δ − ū||_{L₂[c,d]} ≤ δ, and instead of K(x, s) we are given K_h(x, s): ||K_h − K||_{L₂(Π)} ≤ h. For simplicity, we choose uniform grids with steps h_s and h_x respectively: {s_j}, j = 1, ..., n, on the segment [a, b], and {x_i}, i = 1, ..., m, on the segment [c, d].
Then a finite-dimensional approximation of the discrepancy
Φ(z) = ||A_h z − u_δ||²_{L₂[c,d]} = ∫_c^d ( ∫_a^b K_h(x, s) z(s) ds − u_δ(x) )² dx
is the quadratic functional
Φ̂(ẑ) = Σ_{i=1}^m ( Σ_{j=1}^n a_ij z_j h_s − u_i )² h_x,
where a_ij = K_h(x_i, s_j), ẑ = {z_j}, j = 1, ..., n, and u_i = u_δ(x_i).
For its minimization on the polyhedrons described above, the conditional gradient method (or, better, the method of projection of conjugate gradients) can be applied.
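A minimal Python sketch of the conditional gradient method on the set M_C follows: the linear subproblem min_{z ∈ M_C} (Φ̂′, z) is solved by enumerating the n + 1 step-function vertices T^(j) described above. The kernel, the grids, the noise level, and the exact model are our own toy assumptions.

    import numpy as np

    rng = np.random.default_rng(0)
    n, m, C = 40, 50, 1.0
    s = np.linspace(0.0, 1.0, n); hs = s[1] - s[0]
    x = np.linspace(0.0, 1.0, m); hx = x[1] - x[0]
    a = np.exp(-(x[:, None] - s[None, :])**2 / 0.02)   # a_ij = K_h(x_i, s_j)
    z_exact = np.clip(1.0 - s, 0.0, C)                 # nonincreasing model
    u = a @ z_exact * hs + 1e-3 * rng.standard_normal(m)

    def best_vertex(g):
        # (g, T^(j)) = C * sum(g[:j]); scan j = 0..n and take the smallest
        csum = np.concatenate(([0.0], np.cumsum(g)))
        j = np.argmin(C * csum)
        v = np.zeros(n); v[:j] = C
        return v

    z = np.zeros(n)
    for _ in range(500):
        r = a @ z * hs - u
        g = 2.0 * hx * hs * (a.T @ r)                  # gradient of Phi-hat
        d = best_vertex(g) - z
        Ad = a @ d * hs
        denom = Ad @ Ad
        if denom <= 1e-15:
            break
        alpha = np.clip(-(r @ Ad) / denom, 0.0, 1.0)   # exact step, clipped
        z = z + alpha * d
    print(np.max(np.abs(z - z_exact)))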
22. Ill-posed problems with a source-wise represented solution
Let us consider again an operator equation:
Az = u,
where A is a linear bounded injective operator mapping a normed space Z into a normed space U. Let z̄ be an exact solution of the equation, ū = Az̄ the exact right-hand side, and u_δ an approximate right-hand side such that ||u_δ − ū|| ≤ δ, δ ≥ 0. Suppose that we are given the following a priori information: the exact solution z̄ is source-wise represented, i.e. z̄ = Bv̄, v̄ ∈ V, where B : V → Z is an injective completely continuous operator mapping a Hilbert (or reflexive Banach) space V into Z.
We introduce the method of extended compacts, first proposed by V. Ivanov and I. Dombrovskaya.
At first, we set the iteration number n = 1 and consider the closed ball in the space V: S_n(0) = {v : ||v|| ≤ n}. Its image Z_n = B(S_n(0)) is a compact, since B is a completely continuous operator and V is a Hilbert space. Therefore, there exists
min_{z ∈ B(S_n(0))} ||Az − u_δ||.
If min_{z ∈ B(S_n(0))} ||Az − u_δ|| ≤ δ, then we stop the iterations, define n(δ) = n, and as an approximate solution we can choose any element z_δ^(n(δ)) ∈ B(S_{n(δ)}(0)) such that ||A z_δ^(n(δ)) − u_δ|| ≤ δ. If min_{z ∈ B(S_n(0))} ||Az − u_δ|| > δ, we need to extend the compact: for this purpose we add 1 to n and repeat the process.
Theorem. The process described above terminates: n(δ) < ∞. There exists δ₀ > 0 (generally speaking, depending on z̄) such that n(δ) = n(δ₀) for all δ ∈ (0, δ₀]. The approximate solution z_δ^(n(δ)) converges to the exact solution z̄ as δ → 0.
For this method it is possible to construct a so-called a posteriori error estimate. It means that there exists a function ε(δ, u_δ) such that ε(δ, u_δ) → 0 as δ → 0, and ε(δ, u_δ) ≥ ||z_δ^(n(δ)) − z̄||, at least for sufficiently small δ > 0. As an a posteriori error estimate we can take
ε(δ, u_δ) = max{ ||z − z_δ^(n(δ))|| : z ∈ Z_{n(δ)}, ||Az − u_δ|| ≤ δ }.
This approach can be generalized to the case when the operators A and B are specified with errors, and also to the case of nonlinear ill-posed problems.
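A minimal Python sketch of the method of extended compacts follows. The minimum of ||ABv − u_δ|| over the ball ||v|| ≤ n is computed through the Lagrange parameter λ: v(λ) solves (M*M + λI)v = M*u with M = AB, and λ is found by bisection so that ||v(λ)|| = n when the unconstrained minimum is infeasible. The operators, data, noise level, and the 1.01 tolerance are our own toy assumptions.

    import numpy as np

    def min_on_ball(M, u, radius):
        # ||v(lam)|| decreases as lam grows, so bisection on lam works
        MtM, Mtu, I = M.T @ M, M.T @ u, np.eye(M.shape[1])
        v = np.linalg.solve(MtM + 1e-12 * I, Mtu)
        if np.linalg.norm(v) <= radius:
            return v                      # unconstrained minimum is feasible
        lo, hi = 0.0, 1.0
        while np.linalg.norm(np.linalg.solve(MtM + hi * I, Mtu)) > radius:
            hi *= 2.0                     # bracket the Lagrange parameter
        for _ in range(60):
            mid = 0.5 * (lo + hi)
            v = np.linalg.solve(MtM + mid * I, Mtu)
            lo, hi = (mid, hi) if np.linalg.norm(v) > radius else (lo, mid)
        return v

    rng = np.random.default_rng(0)
    A = np.vander(np.linspace(0.0, 1.0, 20), 5, increasing=True)  # toy operator
    B = np.diag(1.0 / np.arange(1.0, 6.0))                        # toy source operator
    v_true = np.array([1.0, -2.0, 0.5, 0.0, 1.0])
    delta = 1e-3
    noise = rng.standard_normal(20)
    u_delta = A @ B @ v_true + delta * noise / np.linalg.norm(noise)

    n = 1
    while n < 100:                        # extend the compact step by step
        v = min_on_ball(A @ B, u_delta, n)
        if np.linalg.norm(A @ B @ v - u_delta) <= 1.01 * delta:
            break
        n += 1
    z = B @ v                             # approximate solution z = Bv
    print(n, np.linalg.norm(v - v_true))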
23. Tikhonov’s regularizing algorithm
A.N. Tikhonov proposed a variational approach to constructing regularizing
algorithms.
We have an operator equation:
Az = u,
where A is a linear bounded injective operator mapping a Hilbert space Z into a Hilbert space U. We suppose that the exact solution z̄ ∈ D, where D is a closed convex set such that 0 ∈ D. The set D is defined by a priori constraints; if there are no constraints, then D = Z.
Suppose that the exact right-hand side ū = Az̄ is not known, and we are given its approximation u_δ such that ||u_δ − ū|| ≤ δ. We also know the error δ > 0. Let the operator A be specified with an error as well: a linear bounded operator A_h : Z → U is known such that ||A − A_h|| ≤ h, with the error h ≥ 0. For conciseness, η = (δ, h). The problem of constructing a regularizing algorithm is the following: given {u_δ, A_h, η}, we have to find an approximate solution z_η ∈ D such that z_η → z̄ as η → 0; that is, z_η = R(u_δ, A_h, η) ∈ D, z_η → z̄ as η → 0, where R(u_δ, A_h, η) is a regularizing algorithm.
Tikhonov introduced the smoothing functional (now it is called Tikhonov’s functional)
M^α[z] = ||A_h z − u_δ||² + α ||z||²,
where α > 0 is the parameter of regularization, and considered the extremal problem: to find
min_{z ∈ D} M^α[z].
Lemma. Under the conditions formulated above, there exists a unique element
z_η^α = argmin_{z ∈ D} M^α[z].
Numerical methods are based on minimization procedures or (if there are no constraints, D = Z) on the numerical solution of Euler’s equation for Tikhonov’s functional.
The parameter of regularization should be connected with the errors of the right-hand side and of the operator if the problem is ill-posed (Bakushinsky’s theorem). So such “methods” as the “L-curve method” and “generalized cross-validation” (GCV) cannot be applied for solving ill-posed problems. Various examples show that these methods fail even on the simplest well-posed problems.
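In the unconstrained case D = Z, Euler’s equation for M^α[z] is (A_h* A_h + αI) z = A_h* u_δ. A minimal Python sketch follows; the badly conditioned test matrix and the trial values of α are our own illustrative assumptions.

    import numpy as np

    def tikhonov_euler(A, u, alpha):
        # Euler's equation for ||A z - u||^2 + alpha*||z||^2:
        # (A*A + alpha*I) z = A*u
        n = A.shape[1]
        return np.linalg.solve(A.T @ A + alpha * np.eye(n), A.T @ u)

    A = np.array([[1.0, 1.0], [1.0, 1.0001], [0.0, 1.0]])  # nearly dependent columns
    u = np.array([2.0, 2.0, 1.0])
    for alpha in (1e-1, 1e-3, 1e-6):
        print(alpha, tikhonov_euler(A, u, alpha))

Note how the answer depends on α: this is exactly why α must be tied to the error levels, as the following discussion of parameter choice explains.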
Methods for choosing the regularization parameter can be divided into a priori and a posteriori methods. The first a priori method was proposed by A.N. Tikhonov:
a) α(η) → 0 as η → 0; b) (δ + h)² / α(η) → 0 as η → 0.
In this case z_η^α(η) → z̄. If the operator A is not injective, then it is possible to prove convergence to the normal solution.
An a priori choice of the regularization parameter cannot be applied for solving practical problems because the error levels are fixed. Below we will describe an a posteriori method, the generalized discrepancy principle (GDP), proposed by A.V. Goncharsky, A.S. Leonov, and A.G. Yagola. This method is a generalization of the discrepancy principle proposed by V.A. Morozov for the case h = 0.
24. Generalized discrepancy principle
Definition. The incompatibility measure of the operator equation with approximate data on the set D ⊆ Z is defined as
μ_η(u_δ, A_h) = inf_{z ∈ D} ||A_h z − u_δ||.
Lemma. If ||u_δ − ū|| ≤ δ, ū = Az̄, z̄ ∈ D, ||A − A_h|| ≤ h, then μ_η(u_δ, A_h) → 0 as η → 0.
We suppose that we can calculate the incompatibility measure with an error κ ≥ 0. It means that we are given μ_η^κ(u_δ, A_h) such that
μ_η(u_δ, A_h) ≤ μ_η^κ(u_δ, A_h) ≤ μ_η(u_δ, A_h) + κ, κ = κ(η) → 0 as η → 0.
Definition. The generalized discrepancy is defined as the following function of the regularization parameter α > 0:
ρ_η^κ(α) = ||A_h z_η^α − u_δ||² − (δ + h ||z_η^α||)² − (μ_η^κ(u_δ, A_h))².
Let us formulate the generalized discrepancy principle (GDP); we suppose that 0 ∈ D.
If the condition
||u_δ||² > δ² + (μ_η^κ(u_δ, A_h))²
is not satisfied, then we take z_η = 0 as an approximate solution of the operator equation. If this condition is true, then the generalized discrepancy has a positive root α* > 0, i.e.
||A_h z_η^α* − u_δ||² = (δ + h ||z_η^α*||)² + (μ_η^κ(u_δ, A_h))².
In this case, the approximate solution is z_η = z_η^α*; it is unique, and z_η → z̄ as η → 0. If the operator is not injective, then the convergence is to the normal exact solution.
The generalized discrepancy ρ_η^κ(α) = ||A_h z_η^α − u_δ||² − (δ + h ||z_η^α||)² − (μ_η^κ(u_δ, A_h))² has the following properties:
1. ρ_η^κ(α) is continuous and monotone nondecreasing for α > 0.
2. lim_{α→∞} ρ_η^κ(α) = ||u_δ||² − δ² − (μ_η^κ(u_δ, A_h))².
3. lim_{α→0+} ρ_η^κ(α) ≤ 0.
From these properties it follows immediately that a positive root of the generalized discrepancy exists if ||u_δ||² > δ² + (μ_η^κ(u_δ, A_h))².
Let us consider the extremal problem with constraints:
to find inf{ ||z|| : z ∈ D, ||A_h z − u_δ||² ≤ (δ + h ||z||)² + (μ_η^κ(u_δ, A_h))² }.
It is the so-called generalized discrepancy method (GDM).
Theorem. Let A, A_h be linear bounded operators mapping Z into U; let D ⊆ Z be a closed convex set containing 0; ||A − A_h|| ≤ h, ||u_δ − ū|| ≤ δ, ū = Az̄, z̄ ∈ D. Then GDP and GDM are equivalent.
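Since ρ_η^κ(α) is monotone, its root can be found by bisection. A minimal Python sketch for the simplest special case h = 0, μ = 0, D = Z (i.e. Morozov’s discrepancy principle, ||A z_α − u_δ||² = δ²) follows; the matrix, data, noise level, and bracketing interval are our own toy assumptions.

    import numpy as np

    def z_alpha(A, u, alpha):
        n = A.shape[1]
        return np.linalg.solve(A.T @ A + alpha * np.eye(n), A.T @ u)

    def discrepancy_root(A, u_delta, delta, lo=1e-12, hi=1e3, iters=80):
        # rho(alpha) = ||A z_alpha - u_delta||^2 - delta^2 is nondecreasing;
        # we assume rho(lo) <= 0 < rho(hi) and bisect on geometric midpoints
        rho = lambda a: np.linalg.norm(A @ z_alpha(A, u_delta, a) - u_delta)**2 - delta**2
        for _ in range(iters):
            mid = np.sqrt(lo * hi)
            lo, hi = (mid, hi) if rho(mid) <= 0 else (lo, mid)
        return np.sqrt(lo * hi)

    rng = np.random.default_rng(0)
    A = np.vander(np.linspace(0.0, 1.0, 30), 6, increasing=True)
    z_true = np.array([1.0, -1.0, 0.5, 0.2, 0.0, -0.3])
    delta = 1e-3
    noise = rng.standard_normal(30)
    u_delta = A @ z_true + delta * noise / np.linalg.norm(noise)
    a_star = discrepancy_root(A, u_delta, delta)
    print(a_star, np.linalg.norm(z_alpha(A, u_delta, a_star) - z_true))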
25. Incompatible ill-posed problems
Let us consider an operator equation:
Az = u,
where z ∈ D ⊆ Z, D is a convex closed set, 0 ∈ D, u ∈ U, Z and U are Hilbert spaces, and A is a linear bounded operator. If u = ū, a solution of the equation may not exist, but we suppose that there exists z̄ ∈ D that minimizes the discrepancy:
μ̄ = inf_{z ∈ D} ||Az − ū|| ≥ 0.
μ̄ is called the incompatibility measure of the operator equation with exact data ū and A. The element z̄ = argmin_{z ∈ D} ||Az − ū|| (or: ||Az̄ − ū|| = μ̄, z̄ ∈ D) is called a pseudosolution (or quasisolution). If the set Z̄ = {z ∈ D : ||Az − ū|| = μ̄} contains more than one element, then Z̄ is convex and closed. Consequently, there exists a unique element
z̄ = argmin_{z ∈ Z̄} ||z||
that is called the normal pseudosolution. If μ̄ = 0, then a pseudosolution is a solution of the operator equation, and the normal pseudosolution is the normal solution.
Let us try to construct an algorithm for stable approximation of the normal pseudosolution of the incompatible operator equation if we are given approximate data {A_h, u_δ, η}, where η = (δ, h), δ > 0, h ≥ 0, ||u_δ − ū|| ≤ δ, ||A − A_h|| ≤ h (μ̄ = 0 we will consider as a particular case).
We will consider an upper estimate for the incompatibility measure:
μ̃_η(u_δ, A_h) = inf_{z ∈ D} ( ||A_h z − u_δ|| + δ + h ||z|| ).
Lemma. Under the conditions formulated above, μ̃_η(u_δ, A_h) ≥ μ̄, and lim_{η→0} μ̃_η(u_δ, A_h) = μ̄.
We denote the minimizer of Tikhonov’s functional M^α[z] by z_η^α and introduce the generalized discrepancy
ρ̃_η(α) = ||A_h z_η^α − u_δ||² − (δ + h ||z_η^α||)² − μ̃_η²(u_δ, A_h).
If μ̃_η(u_δ, A_h) is calculated with an error κ ≥ 0, then the generalized discrepancy is
ρ̃_η^κ(α) = ||A_h z_η^α − u_δ||² − (δ + h ||z_η^α||)² − (μ̃_η^κ(u_δ, A_h))²,
where μ̃_η(u_δ, A_h) ≤ μ̃_η^κ(u_δ, A_h) ≤ μ̃_η(u_δ, A_h) + κ, and μ̃_η^κ(u_δ, A_h) → μ̄ as η → 0 if κ = κ(η) → 0 as η → 0.
The generalized discrepancy has the following properties:
1) ρ̃_η^κ(α) is continuous and nondecreasing for α > 0;
2) lim_{α→∞} ρ̃_η^κ(α) = ||u_δ||² − δ² − (μ̃_η^κ(u_δ, A_h))²;
3) lim sup_{α→0+} ρ̃_η^κ(α) ≤ μ̃_η²(u_δ, A_h) − (μ̃_η^κ(u_δ, A_h))² ≤ 0;
4) if ||u_δ||² > δ² + (μ̃_η^κ(u_δ, A_h))², then there exists α* > 0 such that ρ̃_η^κ(α*) = 0, which is equivalent to
||A_h z_η^α* − u_δ||² = (δ + h ||z_η^α*||)² + (μ̃_η^κ(u_δ, A_h))²;
the element z_η^α* is unique.
Modified GDP: we are given {A_h, u_δ, η}. We calculate
μ̃_η^κ(u_δ, A_h): μ̃_η(u_δ, A_h) ≤ μ̃_η^κ(u_δ, A_h) ≤ μ̃_η(u_δ, A_h) + κ, κ ≥ 0, κ = κ(η) → 0 as η → 0.
If the condition ||u_δ||² > δ² + (μ̃_η^κ(u_δ, A_h))² is not true, then z_η = 0; in the opposite case we find the root α* of the generalized discrepancy and calculate z_η = z_η^α*.
Theorem. lim_{η→0} z_η = z̄, where z̄ is the normal pseudosolution of the operator equation.
The modified GDP is equivalent to the modified GDM:
to find inf{ ||z|| : z ∈ D, ||A_h z − u_δ||² ≤ (δ + h ||z||)² + (μ̃_η^κ(u_δ, A_h))² }.
26. Numerical methods for the solution of Fredholm integral equations of the
first kind
Suppose that we are given the Fredholm integral equation of the first kind:
Az = ∫_a^b K(x, s) z(s) ds = u(x), c ≤ x ≤ d,
where the kernel K(x, s) is nondegenerate, real, and continuous on Π = {c ≤ x ≤ d, a ≤ s ≤ b}. We choose U = L₂[c, d] and Z = W₂¹[a, b]. In this case, we can guarantee convergence of regularized approximations in W₂¹[a, b] and, consequently, uniform convergence (in C[a, b]). We do not know the exact right-hand side ū = Az̄; we are given u_δ: ||u_δ − ū||_{L₂[c,d]} ≤ δ, and instead of K(x, s) we know K_h(x, s): ||K_h − K||_{L₂(Π)} ≤ h. For simplicity, we introduce uniform grids with steps h_x and h_s: on the segment [a, b] the grid {s_j}, j = 1, ..., n, and on the segment [c, d] the grid {x_i}, i = 1, ..., m.
Then the discrepancy functional
Φ(z) = ||A_h z − u_δ||²_{L₂[c,d]} = ∫_c^d ( ∫_a^b K_h(x, s) z(s) ds − u_δ(x) )² dx
can be approximated by the quadratic functional
Φ̂(ẑ) = Σ_{i=1}^m ( Σ_{j=1}^n a_ij z_j h_s − u_i )² h_x,
where a_ij = K_h(x_i, s_j), ẑ = {z_j}, j = 1, ..., n, u_i = u_δ(x_i). Some methods for the minimization of this functional when a priori information is available were described earlier.
A finite-dimensional approximation of Tikhonov’s functional
M^α[z] = ||A_h z − u_δ||²_{L₂[c,d]} + α ||z||²_{W₂¹[a,b]}
= ∫_c^d ( ∫_a^b K_h(x, s) z(s) ds − u_δ(x) )² dx + α ∫_a^b [ z²(s) + (z′(s))² ] ds
can be written as follows:
M^α[ẑ] = Σ_{i=1}^m ( Σ_{j=1}^n a_ij z_j h_s − u_i )² h_x + α [ Σ_{j=1}^n z_j² h_s + Σ_{j=2}^n (z_j − z_{j−1})² / h_s ].
It is a quadratic functional defined on R^n. If linear constraints are available, then different methods can be used for its minimization, e.g., the method of projection of conjugate gradients.
If there are no constraints, then we get a system of linear algebraic equations:
(M^α[ẑ])′ = 0.
For its solution different numerical methods can be used, including the Cholesky decomposition.
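A minimal Python sketch of assembling and solving this linear system by Cholesky decomposition follows; the kernel, the grids, the noise level, and the value of α are our own toy assumptions (in practice α would be chosen by the GDP).

    import numpy as np

    rng = np.random.default_rng(0)
    n, m = 60, 80
    s = np.linspace(0.0, 1.0, n); hs = s[1] - s[0]
    x = np.linspace(0.0, 1.0, m); hx = x[1] - x[0]
    a = np.exp(-30.0 * (x[:, None] - s[None, :])**2)   # a_ij = K_h(x_i, s_j)
    z_true = np.sin(np.pi * s)
    u = a @ z_true * hs + 1e-4 * rng.standard_normal(m)

    # first-difference matrix: (D z)_j = z_j - z_{j-1}, j = 2..n
    D = (np.eye(n) - np.eye(n, k=-1))[1:]
    alpha = 1e-6
    # setting the gradient of M^alpha to zero gives the SPD system H z = rhs
    H = hx * hs**2 * (a.T @ a) + alpha * (hs * np.eye(n) + D.T @ D / hs)
    rhs = hx * hs * (a.T @ u)
    L = np.linalg.cholesky(H)                          # H = L L^T
    z = np.linalg.solve(L.T, np.linalg.solve(L, rhs))
    print(np.max(np.abs(z - z_true)))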
27. Convolution type equations
The 1D convolution type equation is the following:
Az = ∫_{−∞}^{+∞} K(x − s) z(s) ds = u(x), −∞ < x < +∞.
We suppose that the functions in the equation satisfy the requirements:
K(y) ∈ L₁(R¹) ∩ L₂(R¹), u(x) ∈ L₂(R¹), z(s) ∈ W₂¹(R¹),
and A is an injective operator (the kernel K is closed).
There are no constraints (D = Z). For ū there exists a unique solution of Az = ū in W₂¹(R¹).
We are given approximate data: u_δ and the integral operator A_h of convolution type (with the kernel K_h), δ > 0, h ≥ 0, such that
||u_δ − ū||_{L₂} ≤ δ; ||A − A_h||_{W₂¹→L₂} ≤ h.
Let us consider Tikhonov’s functional:
M^α[z] = ||A_h z − u_δ||²_{L₂} + α ||z||²_{W₂¹}.
If we choose the regularization parameter in an appropriate way (e.g., using the GDP), we can guarantee convergence of the regularized approximations to the exact solution in W₂¹(R¹) and, consequently, on any segment [a, b]:
||z_η^α − z̄||_{C[a,b]} → 0 as η → 0.
Let us find z_η^α(s). We define the direct and inverse Fourier transforms for f(x) ∈ L₂(R¹) as
f̃(ω) = ∫_{−∞}^{+∞} f(x) e^{iωx} dx; f(x) = (1/2π) ∫_{−∞}^{+∞} f̃(ω) e^{−iωx} dω.
Using the convolution theorem, the Plancherel equality, and varying M^α over W₂¹(R¹), we obtain:
z_η^α(s) = (1/2π) ∫_{−∞}^{+∞} [ K̃_h*(ω) ũ_δ(ω) e^{−iωs} ] / [ K̃_h(ω) K̃_h*(ω) + α(ω² + 1) ] dω,
where K̃_h(ω), ũ_δ(ω) are the Fourier transforms of K_h(s), u_δ(x), and K̃_h*(ω) = K̃_h(−ω).
If we substitute the Fourier transform of u_δ(x) into the expression for z_η^α(s), we obtain:
z_η^α(s) = ∫_{−∞}^{+∞} K^α(x − s) u_δ(x) dx,
where
K^α(t) = (1/2π) ∫_{−∞}^{+∞} [ K̃_h*(ω) e^{−iωt} ] / [ K̃_h(ω) K̃_h*(ω) + α(ω² + 1) ] dω.
For solving 2D convolution type equations we need to apply the 2D Fourier transform.
Consider the case when the solution and the kernel have local supports. We recall that the support of a function f(x) is the set Supp f = {x ∈ R¹ : f(x) ≠ 0}.
In this case the equation has the form:
Az = ∫_0^{2a} K(x − s) z(s) ds = u(x), 0 ≤ x ≤ 2a.
We assume the following conditions:
A : W₂¹[0, 2a] → L₂[0, 2a], Supp K(y) ⊆ [−l/2, l/2], Supp z(s) ⊆ [l_z/2, 2a − l_z/2],
l > 0, l_z > 0, l + l_z ≤ 2a.
Under these conditions we define K(y) to be equal to zero elsewhere on [−a, a] and z(s) elsewhere on [0, 2a], and after that define their periodical continuations on R¹ with the period T = 2a. In this case, the right-hand side u(x) has its support in [0, 2a] and also has a periodical continuation on R¹ with the period T = 2a.
We now introduce uniform grids x_k = s_k = k Δx, Δx = 2a/n, k = 0, ..., n − 1, where n is even. For simplicity we approximate the equation by the rectangle formula:
Σ_{j=0}^{n−1} K(x_k − s_j) z(s_j) Δx = u(x_k).
Put K(x_k − s_j) = K_{k−j}, z(s_j) = z_j, u(x_k) = u_k. We define the discrete direct and inverse Fourier transforms for functions (vectors) of a discrete variable k that are periodic with the period n (f_{k+n} = f_k):
f̃_m = Σ_{k=0}^{n−1} f_k e^{iω_m x_k}; f_k = (1/n) Σ_{m=0}^{n−1} f̃_m e^{−iω_m x_k},
where ω_m = 2πm/T, k, m = 0, ..., n − 1.
It is easy to check Plancherel’s equality and the convolution theorem for the discrete Fourier transform:
Σ_{k=0}^{n−1} |f_k|² = (1/n) Σ_{m=0}^{n−1} |f̃_m|²;
the discrete Fourier transform of the convolution Σ_{j=0}^{n−1} K_{k−j} z_j Δx equals K̃_m z̃_m Δx.
We approximate Tikhonov’s functional:
M^α[z] = Σ_{k=0}^{n−1} ( Σ_{j=0}^{n−1} K_{k−j} z_j Δx − u_k )² Δx + α Σ_{k=0}^{n−1} ( z_k² + (z′_k)² ) Δx.
Using Plancherel’s equality, the convolution theorem, and the equality (z′)̃_m = −iω_m z̃_m, we obtain:
M^α[z] = (Δx/n) Σ_{m=0}^{n−1} ( |K̃_m|² |z̃_m|² Δx² − 2Δx Re(K̃_m z̃_m ũ_m*) + |ũ_m|² + α(1 + ω_m²) |z̃_m|² ).
The minimum of this functional is attained on the vector with the discrete Fourier transform coefficients
z̃_m = K̃_m* ũ_m Δx / ( |K̃_m|² Δx² + α(1 + ω_m²) ), m = 0, ..., n − 1.
Applying the inverse discrete Fourier transform, we find the regularized solution. If n = 2^k, the FFT (fast Fourier transform) can be used.
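A minimal Python sketch of this FFT-based regularized deconvolution follows; the kernel, the model solution, the noise level, and α are our own toy assumptions, and NumPy’s FFT sign convention differs from the text only by conjugation, which the matched forward/inverse pair below handles consistently.

    import numpy as np

    rng = np.random.default_rng(0)
    a = 1.0; n = 256                    # period T = 2a; n = 2^k, so FFT applies
    T = 2 * a; dx = T / n
    xk = dx * np.arange(n)
    w = 2 * np.pi * np.fft.fftfreq(n, d=dx)       # frequencies w_m

    K = np.exp(-((xk - a)**2) / 0.01)             # toy convolution kernel
    z_true = np.where((xk > 0.6) & (xk < 1.4), 1.0, 0.0)
    u = np.real(np.fft.ifft(np.fft.fft(K) * np.fft.fft(z_true))) * dx
    u += 1e-3 * rng.standard_normal(n)            # noisy right-hand side

    alpha = 1e-6
    Kf, uf = np.fft.fft(K), np.fft.fft(u)
    # z~_m = conj(K~_m) u~_m dx / ( |K~_m|^2 dx^2 + alpha*(1 + w_m^2) )
    zf = np.conj(Kf) * uf * dx / (np.abs(Kf)**2 * dx**2 + alpha * (1 + w**2))
    z = np.real(np.fft.ifft(zf))
    print(np.linalg.norm(z - z_true) * np.sqrt(dx))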
In the 2D case we should use the 2D discrete Fourier transform.