Systems Engineering Procedia 4 (2012) 219–225
Available online at www.sciencedirect.com
www.elsevier.com/locate/procedia
ISSN 2211-3819. doi:10.1016/j.sepro.2011.11.069
The 2nd International Conference on Complexity Science & Information Engineering

An Improved Mixed Conjugate Gradient Method

Wen Jia a,*, Jizhou Zong a, Xiaodong Wang b

a North China Electric Power University, Beijing, 102206, China
b Air Defense Forces Command College of People's Liberation Army, Zhengzhou, 450052, China

Abstract
The conjugate gradient method is an important and efficient method for solving unconstrained optimization problems, especially large-scale problems. Based on the mixed conjugate gradient method proposed by Jing and Deng [1], we improve the mixed conjugate gradient method and prove its global convergence under a sufficient condition. Finally, some numerical experiments from engineering are shown.
© 2011 Published by Elsevier Ltd. Selection and peer-review under responsibility of Desheng Dash Wu
Keywords: unconstrained optimization; conjugate gradient; Wolfe line search; global convergence; engineering
1. Introduction
The conjugate gradient (CG) method is widely used to solve unconstrained optimization problems, which arise throughout engineering. Through continuous development, the conjugate gradient algorithm has evolved into many variants and has become one of the more practical methods in model optimization. With the development and requirements of science and technology, new variants of the conjugate gradient method remain a research focus [1]. Consider the unconstrained optimization problem
$$\min_{x \in R^n} f(x), \qquad (1)$$
where $f: R^n \to R^1$ is a continuously differentiable function. There are many methods for solving this problem. The conjugate gradient method has the following advantages: it uses only first-order derivatives, which both overcomes the slow convergence of the steepest descent method and avoids the storage and computation of second-order derivatives demanded by Newton's method. It needs little memory and its program is relatively simple, so it is an important method for solving unconstrained optimization problems, especially large-scale ones [2]. It is widely used to find optimal solutions in national defense, economics, finance, engineering design, radio communication, management, and many other areas. The detailed conjugate gradient method is as follows:
$$x_{k+1} = x_k + \alpha_k d_k, \quad k = 1, 2, \ldots, \qquad (2)$$
* Corresponding author. Tel.: +86-15811460674; E-mail address: [email protected].
where $x_1$ is a given initial point, $\alpha_k$ is the step length along $d_k$, and $d_k$ is the search direction given by
$$d_k = \begin{cases} -g_k, & k = 1, \\ -g_k + \beta_k d_{k-1}, & k \geq 2, \end{cases} \qquad (3)$$
where $g_k = \nabla f(x_k)$ is the gradient of $f(x)$ at $x_k$, and the value of $\beta_k$ depends on $x_{k-1}$ and $x_k$. Different choices of $\beta_k$ yield different conjugate gradient methods. There are many famous formulas for $\beta_k$, such as Fletcher-Reeves (FR), Polak-Ribiere-Polyak (PRP), and Hestenes-Stiefel (HS) [3]. Inspired by the HS, PRP and FR methods, Jing and Deng [1] proposed a new conjugate gradient method. The new $\beta_k^{new}$ is
$$\beta_k^{new} = \begin{cases} 0, & \|g_k\|^2 < g_k^T g_{k-1}, \\ \dfrac{\|g_k\|^2 - g_k^T g_{k-1}}{\mu\, g_k^T d_{k-1} + \|g_{k-1}\|^2}, & \|g_k\|^2 \geq g_k^T g_{k-1}, \end{cases} \qquad (4)$$
where $0 < \mu \leq 1$. The new $\beta_k^{new}$ and $\beta_k^{PRP}$ are then combined into a mixed $\beta_k$:
$$\beta_k^{mix} = \begin{cases} \beta_k^{new}, & \|g_k\|^2 \geq g_k^T g_{k-1},\ \mu = 1, \\ \beta_k^{PRP}, & g_k^T d_{k-1} = 0. \end{cases} \qquad (5)$$
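For reference, the classical choices of $\beta_k$ mentioned above have simple closed forms. The following minimal NumPy sketch (ours, for illustration; it assumes nonzero denominators) computes the FR, PRP and HS values:

```python
import numpy as np

def beta_fr(g, g_prev):
    # Fletcher-Reeves: ||g_k||^2 / ||g_{k-1}||^2
    return (g @ g) / (g_prev @ g_prev)

def beta_prp(g, g_prev):
    # Polak-Ribiere-Polyak: g_k^T (g_k - g_{k-1}) / ||g_{k-1}||^2
    return (g @ (g - g_prev)) / (g_prev @ g_prev)

def beta_hs(g, g_prev, d_prev):
    # Hestenes-Stiefel: g_k^T (g_k - g_{k-1}) / (d_{k-1}^T (g_k - g_{k-1}))
    y = g - g_prev
    return (g @ y) / (d_prev @ y)
```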
In Section 2, we modify $\beta_k^{mix}$ and give a new algorithm based on the new $\beta_k^{Mmix}$. In Section 3, we discuss the sufficient descent property of the algorithm. In Section 4, we discuss its global convergence using a proof method proposed in reference [4]. In Section 5, we present some numerical experiments. In Section 6, we give the conclusion.
2. Algorithm
In this section, we improve (4) and obtain a more concise $\beta_k$:
$$\beta_k^{Mnew} = \max\left(0,\ \frac{\|g_k\|^2 - g_k^T g_{k-1}}{\mu\, g_k^T d_{k-1} + \|g_{k-1}\|^2}\right). \qquad (6)$$
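For concreteness, (6) translates almost line for line into code. This is an illustrative sketch of ours; the value of `mu` is a placeholder satisfying $0 < \mu \leq 1$, not a tuning recommended by the paper.

```python
import numpy as np

def beta_mnew(g, g_prev, d_prev, mu=0.5):
    # (6): max(0, (||g_k||^2 - g_k^T g_{k-1}) / (mu * g_k^T d_{k-1} + ||g_{k-1}||^2))
    num = g @ g - g @ g_prev
    den = mu * (g @ d_prev) + g_prev @ g_prev
    return max(0.0, num / den)
```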
Considering the mixed HS-DY conjugate gradient method [6], we also modify the second part of (5) to get a new
$$\beta_k^{MPRP} = \lambda \frac{\|g_k\|^2}{g_k^T d_{k-1} + \|g_{k-1}\|^2}.$$
Then we put the new $\beta_k^{Mnew}$ and $\beta_k^{MPRP}$ together to get a new mixed $\beta_k^{Mmix}$:
$$\beta_k^{Mmix} = \begin{cases} \beta_k^{Mnew}, & \|g_k\|^2 \geq g_k^T g_{k-1}, \\ \lambda \dfrac{\|g_k\|^2}{g_k^T d_{k-1} + \|g_{k-1}\|^2}, & \text{otherwise}, \end{cases} \qquad (7)$$
where $\lambda$ is a parameter with $0 < \lambda < \mu \leq 1$. It is easy to verify that $\beta_k \geq 0$. The new method is globally convergent and needs fewer iterations.
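A matching sketch of the case split in (7), reusing `beta_mnew` from above; again `lam` and `mu` are placeholder values chosen only to satisfy $0 < \lambda < \mu \leq 1$.

```python
def beta_mmix(g, g_prev, d_prev, mu=0.5, lam=0.1):
    # (7): beta^Mnew when ||g_k||^2 >= g_k^T g_{k-1}; otherwise the damped
    # PRP-type term lam * ||g_k||^2 / (g_k^T d_{k-1} + ||g_{k-1}||^2).
    if g @ g >= g @ g_prev:
        return beta_mnew(g, g_prev, d_prev, mu)
    return lam * (g @ g) / (g @ d_prev + g_prev @ g_prev)
```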
Our modified mixed conjugate gradient algorithm is given below.
Algorithm (a) (Modified mixed method: Mmix, see [7])
Step 0: Choose $x_0 \in R^n$ and $\varepsilon > 0$.
Step 1: Set $d_0 = -g_0$, $k := 0$. If $\|g_0\| \leq \varepsilon$, stop; otherwise go to Step 2.
Step 2: (General Wolfe line search) Compute a step size $\alpha_k$ such that
$$f(x_k + \alpha_k d_k) - f(x_k) \leq \delta \alpha_k g_k^T d_k, \qquad (8)$$
$$\sigma_1 g_k^T d_k \leq g(x_k + \alpha_k d_k)^T d_k \leq -\sigma_2 g_k^T d_k, \qquad (9)$$
where $\delta, \sigma_1, \sigma_2 \in (0, 1)$. Let $x_{k+1} = x_k + \alpha_k d_k$ and $k := k + 1$.
Step 3: Compute $g_k$. If $\|g_k\| \leq \varepsilon$, stop; otherwise go to Step 4.
Step 4: Compute $d_k = -g_k + \beta_k^{Mmix} d_{k-1}$ and go to Step 2.
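Below is a self-contained Python sketch of Algorithm (a) built on `beta_mmix`. It uses SciPy's `line_search`, which enforces the strong Wolfe conditions, as a practical stand-in for the general Wolfe search (8)-(9); that substitution, and the fallback step on line-search failure, are our assumptions rather than part of the paper.

```python
import numpy as np
from scipy.optimize import line_search  # strong Wolfe line search

def mmix_cg(f, grad, x0, mu=0.5, lam=0.1, eps=1e-6, max_iter=1000):
    # Sketch of Algorithm (a) with the beta_mmix rule (7).
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g                            # Step 1: d_0 = -g_0
    k = 0
    while np.linalg.norm(g) > eps and k < max_iter:
        alpha = line_search(f, grad, x, d, gfk=g)[0]
        if alpha is None:             # line search failed; take a conservative step
            alpha = 1e-4
        x = x + alpha * d             # Step 2: x_{k+1} = x_k + alpha_k d_k
        g_prev, g = g, grad(x)        # Step 3: new gradient
        d = -g + beta_mmix(g, g_prev, d, mu, lam) * d  # Step 4
        k += 1
    return x, k
```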
3. Sufficient Descent Property

In this section, we show that our modified method has the sufficient descent property under any line search, by the following theorem.

Theorem 1. Assume $\beta_k = \beta_k^{Mmix}$ in (2) and (3). Then for $k \geq 1$,
$$g_k^T d_k \leq -(1 - \mu)\|g_k\|^2. \qquad (10)$$

Proof. The proof holds without any line search. When $\beta_k^{Mmix} = \beta_k^{Mnew}$ and $k \geq 1$,
$$g_k^T d_k \leq -(1 - \mu)\|g_k\|^2.$$
When $\beta_k^{Mmix} = \lambda \dfrac{\|g_k\|^2}{g_k^T d_{k-1} + \|g_{k-1}\|^2}$ and $k \geq 1$,
$$g_k^T d_k = -\|g_k\|^2 + \lambda \frac{\|g_k\|^2\, g_k^T d_{k-1}}{g_k^T d_{k-1} + \|g_{k-1}\|^2} \leq -\|g_k\|^2 + \lambda \|g_k\|^2 \leq -(1 - \mu)\|g_k\|^2, \qquad (11)$$
where the last inequality uses $\lambda < \mu$.
Hence, the theorem is proved.
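For intuition, the base case of (10) is immediate, since the first direction is the steepest descent direction:
$$g_0^T d_0 = g_0^T(-g_0) = -\|g_0\|^2 \leq -(1 - \mu)\|g_0\|^2,$$
which holds because $0 < \mu \leq 1$.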
4. Global Convergence
In this section we discuss the convergence properties of the method in conjunction with the Wolfe line search. To ensure the convergence of Algorithm (a), we need the following standard assumptions.

Assumption (a). (i) For any $x \in R^n$, the level set
$$\Gamma = \{x \in R^n : f(x) \leq f(x_1)\} \qquad (12)$$
is bounded. (ii) $f$ is continuously differentiable, and there exists a constant $L > 0$ such that the gradient of $f$ satisfies
$$\|g(x) - g(y)\| \leq L\|x - y\|, \quad \forall x, y \in N. \qquad (13)$$

To prove the convergence of Algorithm (a), we need the following theorem, which is presented in
reference [4].

Theorem 2 (Convergence theorem, see [4]). Let the conjugate gradient method be determined by (2) and (3) with the Wolfe line search, and let the objective function $f(x)$ in (1) satisfy Assumption (a). If
$$0 \leq \beta_k^{Mmix} \leq \frac{\|g_k\|^2}{\|g_{k-1}\|^2},$$
and there exist $a, b > 0$ such that
$$a \leq \frac{-g_k^T d_k}{\|g_k\|^2} \leq b \quad (k = 1, 2, \ldots), \qquad (14)$$
then the method is globally convergent, i.e.,
$$\liminf_{k \to \infty} \|g_k\| = 0. \qquad (15)$$
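As a practical aside (ours, not from [4]), condition (14) can be monitored numerically: record the ratio $r_k = -g_k^T d_k / \|g_k\|^2$ at each iteration of the sketch above and check that it stays inside a fixed interval $[a, b]$.

```python
def descent_ratio(g, d):
    # r_k = -g_k^T d_k / ||g_k||^2; Theorem 2 requires a <= r_k <= b for all k,
    # with constants a, b > 0 independent of k.
    return -(g @ d) / (g @ g)
```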
In the next part, we prove that our modified algorithm meets the conditions of Theorem 2, via the following lemma.

Lemma 1. Let the objective function $f(x)$ in (1) satisfy Assumption (a), let $g_k \neq 0$, let $\alpha_k$ satisfy the Wolfe conditions, and let $d_k$ be determined by (2), (3) and (7).

If $g_k^T d_{k-1} \geq 0$, then
$$a \leq \frac{-g_k^T d_k}{\|g_k\|^2} \leq 1, \quad 0 < a \leq 1 - \sigma. \qquad (16)$$
If $g_k^T d_{k-1} < 0$, then
$$1 < \frac{-g_k^T d_k}{\|g_k\|^2} \leq b, \quad b \geq 1 + \sigma. \qquad (17)$$
Proof. (i) When $k \geq 1$ and $\|g_k\|^2 \geq g_k^T g_{k-1}$, we have $\beta_k^{Mmix} = \beta_k^{Mnew}$ and
$$0 \leq \beta_k^{Mmix} = \beta_k^{Mnew} = \frac{\|g_k\|^2 - g_k^T g_{k-1}}{\mu\, g_k^T d_{k-1} + \|g_{k-1}\|^2} \leq \frac{\mu \|g_k\|^2}{\mu \|g_{k-1}\|^2} = \frac{\|g_k\|^2}{\|g_{k-1}\|^2}. \qquad (18)$$
If $g_k^T d_{k-1} \geq 0$, then $\beta_k^{Mmix} g_k^T d_{k-1} \geq 0$, and together with Theorem 1 we have
$$-1 \leq \frac{g_k^T d_k}{\|g_k\|^2} = -1 + \frac{\|g_k\|^2 - g_k^T g_{k-1}}{\mu\, g_k^T d_{k-1} + \|g_{k-1}\|^2} \cdot \frac{g_k^T d_{k-1}}{\|g_k\|^2} \leq -(1 - \mu). \qquad (19)$$
Hence, (16) is obtained. If $g_k^T d_{k-1} \leq 0$, we have
$$1 \leq \frac{-g_k^T d_k}{\|g_k\|^2} = 1 - \frac{\|g_k\|^2 - g_k^T g_{k-1}}{\mu\, g_k^T d_{k-1} + \|g_{k-1}\|^2} \cdot \frac{g_k^T d_{k-1}}{\|g_k\|^2} \leq 1 + \frac{-g_k^T d_{k-1}}{\mu\, g_k^T d_{k-1} + \|g_{k-1}\|^2} \leq 1 + \mu \leq 1 + \sigma. \qquad (20)$$
Then, we get (17).
(ii) When $k \geq 1$ and $\|g_k\|^2 < g_k^T g_{k-1}$, we have $\beta_k^{Mmix} = \lambda \dfrac{\|g_k\|^2}{g_k^T d_{k-1} + \|g_{k-1}\|^2}$, so
$$0 \leq \beta_k^{Mmix} = \lambda \frac{\|g_k\|^2}{g_k^T d_{k-1} + \|g_{k-1}\|^2} \leq \lambda \frac{\|g_k\|^2}{\|g_{k-1}\|^2} \leq \frac{\|g_k\|^2}{\|g_{k-1}\|^2}. \qquad (21)$$
If $g_k^T d_{k-1} \geq 0$, we have
$$-1 \leq \frac{g_k^T d_k}{\|g_k\|^2} = -1 + \lambda \frac{g_k^T d_{k-1}}{g_k^T d_{k-1} + \|g_{k-1}\|^2} \leq -1 + \lambda \leq -(1 - \mu). \qquad (22)$$
Hence, (16) is obtained.
If $g_k^T d_{k-1} \leq 0$, we have
$$1 \leq \frac{-g_k^T d_k}{\|g_k\|^2} = 1 - \lambda \frac{g_k^T d_{k-1}}{g_k^T d_{k-1} + \|g_{k-1}\|^2} \leq 1 + \lambda \leq 1 + \mu \leq 1 + \sigma. \qquad (23)$$
Hence, we get (17). According to Lemma 1 and its proof, we have the following theorem.
Theorem 3. We have
$$0 \leq \beta_k^{Mmix} \leq \frac{\|g_k\|^2}{\|g_{k-1}\|^2},$$
and
$$a \leq \frac{-g_k^T d_k}{\|g_k\|^2} \leq b \quad (k = 1, 2, \ldots), \qquad (24)$$
where $0 < a \leq 1 - \sigma$ and $b \geq 1 + \sigma$.

From Lemma 1 and Theorem 3, the conditions of Theorem 2 (the convergence theorem) hold, and we reach the following conclusion: if the function $f(x)$ in (1) satisfies Assumption (a), then the CG method (2) and (3) is globally convergent under the Wolfe line search.
5. Numerical Experiments
In this section we present some numerical experiments, comparing the number of iterations of our new algorithm with that of the FR method. Each test was run several times with the same function, producing the following table. These tests show that the new algorithm is efficient on general functions. To demonstrate its efficiency, the following functions are considered:
(i) $f_1(x) = (x_1 - 2)^2 + (x_2 + 2)^2$,
(ii) $f_2(x) = (x_1 + x_2 - 5)^2 + 2(x_1 - x_2)^2 + (2x_3 + x_4 - 7)^2 + (x_2 + x_4 + 1)^2$,
(iii) $f_3(x) = (x_1 + 10x_2)^2 + 5(x_3 - x_4)^2 + (x_2 - 2x_3)^4 + 10(x_1 - x_4)^4$.
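For reproduction purposes, functions (i) and (iii) and their analytic gradients can be coded directly. This is our sketch, written from the formulas above; the function names are ours.

```python
import numpy as np

def f1(x):
    # (i): minimizer (2, -2)
    return (x[0] - 2.0)**2 + (x[1] + 2.0)**2

def grad_f1(x):
    return np.array([2.0 * (x[0] - 2.0), 2.0 * (x[1] + 2.0)])

def f3(x):
    # (iii): Powell-type quartic, minimizer (0, 0, 0, 0)
    return ((x[0] + 10.0 * x[1])**2 + 5.0 * (x[2] - x[3])**2
            + (x[1] - 2.0 * x[2])**4 + 10.0 * (x[0] - x[3])**4)

def grad_f3(x):
    a = x[0] + 10.0 * x[1]
    b = x[2] - x[3]
    c = x[1] - 2.0 * x[2]
    e = x[0] - x[3]
    return np.array([2.0 * a + 40.0 * e**3,
                     20.0 * a + 4.0 * c**3,
                     10.0 * b - 8.0 * c**3,
                     -10.0 * b - 40.0 * e**3])
```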
Numerical results are shown in Table 1, where $x_0$ is the initial point and $x_{min}$ is the final point reached by the new algorithm.
Table 1. Iterative times for the three functions

Function    FR    Mix    $x_0$           $x_{min}$
$f_1(x)$    25    11     (1.2, -1.2)     (2, -2)
$f_2(x)$    38    25     (1, 1, 1, 1)    (2.5, 2.5, 5.25, -3.5)
$f_3(x)$    43    23     (1, 1, 1, 1)    (0, 0, 0, 0)
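As a usage example, the Algorithm (a) sketch above can be run on $f_1$ from the same starting point as Table 1; iteration counts depend on the line-search details, so they need not match the table exactly.

```python
x_star, iters = mmix_cg(f1, grad_f1, x0=np.array([1.2, -1.2]))
print(x_star, iters)  # expect a point close to (2, -2)
```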
6. Conclusion
The conjugate gradient method is an important method for solving large-scale unconstrained nonlinear optimization problems. In this paper, a new mixed conjugate gradient method was proposed. The new mixed CG method was shown to have the sufficient descent property and global convergence. Finally, numerical experiments on several functions demonstrated the efficiency of the new mixed CG method.
Acknowledgments
The authors wish to thank Dr. Y.Y. Shi for valuable assistance. The authors also gratefully acknowledge the helpful comments and suggestions of the reviewers.
References
[1] Jing SJ, Deng T, et al. A mixed conjugate gradient method with sufficient descent property for the exact line search. J HNPU 2010;29:266-270.
[2] Zhang ZH, Shi ZJ, et al. A new descent conjugate gradient method for unconstrained optimization. J AM 2009;38:340-344.
[3] Chen Y, Mi HL, et al. A new descent conjugate gradient method for unconstrained optimization. J SYU 2010;7:9-11.
[4] Wang XY, Chen GJ, et al. A sufficient condition for global convergence of conjugate gradient methods. J NUC 2010;31:5-8.
[5] Zheng XF, Tian ZY, Song LW, et al. The global convergence of a mixed conjugate gradient method with the Wolfe line search. J OR Transactions 2009;13:18-24.
[6] Dai ZF, Chen LP, et al. A mixed HS-DY conjugate gradient method. J MNS 2005;27:429-436.
[7] Luo ZJ, Xie J, Xiong R. Convergence of Dai-Yuan conjugate method with general Wolfe line search. J IJPAM 2011;71:343-349.