
Systems Engineering Procedia 4 (2012) 219 – 225

Available online at www.sciencedirect.com
www.elsevier.com/locate/procedia

2211-3819 © 2011 Published by Elsevier Ltd. Selection and peer-review under responsibility of Desheng Dash Wu. doi:10.1016/j.sepro.2011.11.069

The 2nd International Conference on Complexity Science & Information Engineering

An Improved Mixed Conjugate Gradient Method

Wen Jia a,*, Jizhou Zong a, Xiaodong Wang b

a North China Electric Power University, Beijing, 102206, China
b Air Defense Forces Command College of People's Liberation Army, Zhengzhou, 450052, China

Abstract

The conjugate gradient method is an important and efficient method for solving unconstrained optimization problems, especially large scale problems. Based on the mixed conjugate gradient method proposed by Jing and Deng [1], we improve the mixed conjugate gradient method and prove its global convergence under a sufficient condition. Finally, some numerical experiments from engineering are reported.

© 2011 Published by Elsevier Ltd. Selection and peer-review under responsibility of Desheng Dash Wu

Keywords: unconstrained optimization, conjugate gradient, Wolfe line search, global convergence, engineering;

1. Introduction

The conjugate gradient (CG) method is widely used to solve unconstrained optimization problems, which arise widely in engineering. Through continuous development, the conjugate gradient algorithm has evolved into many variants and has become a practical method in model optimization. With the development and requirements of science and technology, new variants of the conjugate gradient method remain a research focus [1]. Consider the unconstrained optimization problem

min_{x ∈ R^n} f(x),        (1)

where f(x): R^n → R is a continuously differentiable function. There are many methods to solve this problem. The conjugate gradient method has the following advantages: it uses only first order derivatives, thereby overcoming the slow convergence of the steepest descent method while avoiding the storage and computation of the second order derivatives required by Newton's method. It needs little memory and the program is relatively simple, so it is an important method for solving unconstrained optimization problems, especially large scale problems [2]. It is widely used to find optimal solutions in national defense, economics, finance, engineering design, radio communication, management, and many other areas. The conjugate gradient method is as follows:

x_{k+1} = x_k + α_k d_k,  k = 1, 2, ...,        (2)

* Corresponding author. Tel.: 86-15811460674; E-mail address: [email protected].



where x_1 is a given initial point, α_k is the step length along d_k, and d_k is the search direction,

d_k = -g_k,                    if k = 1,
d_k = -g_k + β_k d_{k-1},      if k ≥ 2,        (3)

where g_k = ∇f(x_k) is the gradient of f(x) at x_k, and the choice of β_k depends on x_{k-1} and x_k. Different choices of β_k yield different conjugate gradient methods. There are many famous formulas for β_k, such as Fletcher-Reeves (FR), Polak-Ribiere-Polyak (PRP) and Hestenes-Stiefel (HS) [3]. Inspired by the HS, PRP and FR formulas, Jing and Deng [1] proposed a new conjugate gradient method. The new

β_k^{new} is

β_k^{new} = 0,                                                               if ||g_k||^2 < g_k^T g_{k-1},
β_k^{new} = (||g_k||^2 - g_k^T g_{k-1}) / (μ ||g_{k-1}||^2 + g_k^T d_{k-1}),  if ||g_k||^2 ≥ g_k^T g_{k-1},        (4)

where 0 < μ ≤ 1. Then the new β_k^{new} and the PRP formula β_k^{PRP} are put together to get a new mixed β_k:

β_k^{mix} = β_k^{new},                                          if ||g_k||^2 ≥ g_k^T g_{k-1},
β_k^{mix} = β_k^{PRP} = g_k^T (g_k - g_{k-1}) / ||g_{k-1}||^2,   otherwise.        (5)
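For reference, the classical FR, PRP and HS formulas named above (including the PRP formula appearing in (5)) can be written as small NumPy helpers. This is a minimal illustrative sketch; the function names are ours, not the paper's:

import numpy as np

def beta_fr(g_new, g_old):
    # Fletcher-Reeves: ||g_k||^2 / ||g_{k-1}||^2
    return (g_new @ g_new) / (g_old @ g_old)

def beta_prp(g_new, g_old):
    # Polak-Ribiere-Polyak: g_k^T (g_k - g_{k-1}) / ||g_{k-1}||^2
    return (g_new @ (g_new - g_old)) / (g_old @ g_old)

def beta_hs(g_new, g_old, d_old):
    # Hestenes-Stiefel: g_k^T (g_k - g_{k-1}) / (d_{k-1}^T (g_k - g_{k-1}))
    y = g_new - g_old
    return (g_new @ y) / (d_old @ y)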

In Section 2, we modify β_k^{mix} and give a new algorithm with the new formula β_k^{Mmix}. In Section 3, we discuss the sufficient descent property of the algorithm. In Section 4, we discuss the global convergence of the algorithm using the proof method proposed in reference [4]. In Section 5, we report some numerical experiments. In Section 6, we give the conclusion.

2. Algorithm

In this section, we improve (4) and obtain a more compact β_k,

β_k^{Mnew} = max(0, (||g_k||^2 - g_k^T g_{k-1}) / (μ ||g_{k-1}||^2 + g_k^T d_{k-1})).        (6)

Considering the mixed HS-DY conjugate gradient method [6], we modify the second part of (5).

We get a new β_k^{MPRP} = λ ||g_k||^2 / (||g_{k-1}||^2 + g_k^T d_{k-1}). Then we put the new β_k^{Mnew} and β_k^{MPRP} together to get a new mixed β_k^{Mmix},


β_k^{Mmix} = β_k^{Mnew},                                       if ||g_k||^2 ≥ g_k^T g_{k-1},
β_k^{Mmix} = λ ||g_k||^2 / (||g_{k-1}||^2 + g_k^T d_{k-1}),     otherwise,        (7)
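A minimal NumPy sketch of (6) and (7) follows. It is one reading of the formulas above; the function names and default parameter values are illustrative assumptions (mu and lam stand for μ and λ with 0 < λ < μ ≤ 1):

import numpy as np

def beta_mnew(g_new, g_old, d_old, mu=0.5):
    # (6): max(0, (||g_k||^2 - g_k^T g_{k-1}) / (mu ||g_{k-1}||^2 + g_k^T d_{k-1}))
    num = g_new @ g_new - g_new @ g_old
    den = mu * (g_old @ g_old) + g_new @ d_old
    return max(0.0, num / den)

def beta_mmix(g_new, g_old, d_old, mu=0.5, lam=0.1):
    # (7): use beta^{Mnew} when ||g_k||^2 >= g_k^T g_{k-1}, otherwise the lambda form
    if g_new @ g_new >= g_new @ g_old:
        return beta_mnew(g_new, g_old, d_old, mu)
    return lam * (g_new @ g_new) / (g_old @ g_old + g_new @ d_old)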

where λ is a parameter and 0 < λ < μ ≤ 1. It is easy to see that β_k ≥ 0. The new method is globally convergent and needs fewer iterations.

Our modified mixed conjugate gradient algorithm is given below.

Algorithm (a) (Modified mixed method: Mmix, see [7])
Step 0: Choose x_0 ∈ R^n and ε > 0.
Step 1: Set d_0 = -g_0 and k := 0. If ||g_0|| ≤ ε, stop; otherwise go to Step 2.
Step 2: (General Wolfe line search) Compute a step size α_k such that

f(x_k + α_k d_k) - f(x_k) ≤ δ α_k g_k^T d_k,        (8)

σ_1 g_k^T d_k ≤ g(x_k + α_k d_k)^T d_k ≤ -σ_2 g_k^T d_k,        (9)

where δ, σ_1, σ_2 ∈ (0, 1). Let x_{k+1} = x_k + α_k d_k and k := k + 1.
Step 3: Compute g_k. If ||g_k|| ≤ ε, stop; otherwise go to Step 4.
Step 4: Compute d_k = -g_k + β_k^{Mmix} d_{k-1} and go to Step 2.
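The following is a minimal Python sketch of Algorithm (a). It relies on scipy.optimize.line_search, which enforces the strong Wolfe conditions and is used here only as a stand-in for the general Wolfe conditions (8)-(9); the function name cg_mmix, the default parameters and the fallback step are illustrative assumptions, not the authors' implementation:

import numpy as np
from scipy.optimize import line_search

def cg_mmix(f, grad, x0, mu=0.5, lam=0.1, eps=1e-6, max_iter=500):
    # Sketch of Algorithm (a): CG directions with the mixed beta of (7).
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g
    for k in range(max_iter):
        if np.linalg.norm(g) <= eps:
            return x, k
        # Wolfe-type line search (strong Wolfe via SciPy, standing in for (8)-(9))
        alpha = line_search(f, grad, x, d)[0]
        if alpha is None:
            alpha = 1e-4          # fallback step if the line search fails
        x_new = x + alpha * d
        g_new = grad(x_new)
        # beta^{Mmix} as in (6)-(7)
        if g_new @ g_new >= g_new @ g:
            num = g_new @ g_new - g_new @ g
            den = mu * (g @ g) + g_new @ d
            beta = max(0.0, num / den)
        else:
            beta = lam * (g_new @ g_new) / (g @ g + g_new @ d)
        d = -g_new + beta * d
        x, g = x_new, g_new
    return x, max_iter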

3. Sufficient Descent Property

In this section, we show that the modified method has the sufficient descent property with any line search, by the following theorem.

Theorem 1. Assume β_k = β_k^{Mmix} in (2) and (3). Then for k ≥ 1,

g_k^T d_k ≤ -(1 - μ) ||g_k||^2.        (10)

Proof. Without any line search, when β_k^{Mmix} = β_k^{Mnew} and k ≥ 1,

g_k^T d_k ≤ -(1 - μ) ||g_k||^2.

When β_k^{Mmix} = λ ||g_k||^2 / (||g_{k-1}||^2 + g_k^T d_{k-1}),

g_k^T d_k = -||g_k||^2 + λ ||g_k||^2 g_k^T d_{k-1} / (||g_{k-1}||^2 + g_k^T d_{k-1}) ≤ -||g_k||^2 + λ ||g_k||^2 ≤ -(1 - μ) ||g_k||^2.        (11)

Hence, the theorem is proved.

4. Global convergence


In this section we discuss the convergence properties of the method in conjunction with the Wolfe line search. To ensure the convergence of Algorithm (a), we need the following standard assumptions.

Assumption (a). (i) For any x ∈ R^n, the level set

Γ = {x ∈ R^n : f(x) ≤ f(x_1)}        (12)

is bounded.
(ii) f is continuously differentiable in a neighborhood N of the level set Γ, and there exists a constant L > 0 such that the gradient of f satisfies

||g(x) - g(y)|| ≤ L ||x - y||,  ∀ x, y ∈ N.        (13)

To prove the convergence of Algorithm (a), we need the following theorem, which is presented in reference [4].

Theorem 2 (Convergence theorem, see [4]). Determine the conjugate gradient method by (2) and (3), use the Wolfe line search, and let the objective function f(x) in (1) satisfy Assumption (a). If

0 ≤ β_k^{Mmix} ≤ ||g_k||^2 / ||g_{k-1}||^2,

and there exist a, b > 0 such that

a ||g_k||^2 ≤ -g_k^T d_k ≤ b ||g_k||^2,  k = 1, 2, ...,        (14)

then the method is globally convergent, i.e.,

lim inf_{k→∞} ||g_k|| = 0.        (15)

In the next part, we prove that our modified algorithm meets the conditions of Theorem 2 by the following lemma.

Lemma 1. Suppose that the objective function f(x) in (1) satisfies Assumption (a), g_k ≠ 0, α_k satisfies the Wolfe conditions, and d_k is determined by (2), (3) and (7).

If g_k^T d_{k-1} ≥ 0, then

a ≤ -g_k^T d_k / ||g_k||^2 ≤ 1,   (0 < a ≤ 1 - σ).        (16)

If g_k^T d_{k-1} < 0, then

1 < -g_k^T d_k / ||g_k||^2 ≤ b,   (b ≥ 1 + σ).        (17)

Proof. (i) When k ≥ 1 and ||g_k||^2 ≥ g_k^T g_{k-1}, we have β_k^{Mmix} = β_k^{Mnew} and

0 ≤ β_k^{Mmix} = β_k^{Mnew} = (||g_k||^2 - g_k^T g_{k-1}) / (μ ||g_{k-1}||^2 + g_k^T d_{k-1}) ≤ ||g_k||^2 / (μ ||g_{k-1}||^2 + g_k^T d_{k-1}) ≤ ||g_k||^2 / ||g_{k-1}||^2.        (18)


If g_k^T d_{k-1} ≥ 0, we have

1 - σ ≤ 1 - g_k^T d_{k-1} / (μ ||g_{k-1}||^2 + g_k^T d_{k-1}) ≤ 1 - (||g_k||^2 - g_k^T g_{k-1}) g_k^T d_{k-1} / ((μ ||g_{k-1}||^2 + g_k^T d_{k-1}) ||g_k||^2) = 1 - β_k^{Mmix} g_k^T d_{k-1} / ||g_k||^2 = -g_k^T d_k / ||g_k||^2 ≤ 1.        (19)

Hence, (16) is obtained. If g_k^T d_{k-1} ≤ 0, we have

-g_k^T d_k / ||g_k||^2 = 1 - β_k^{Mmix} g_k^T d_{k-1} / ||g_k||^2 = 1 - (||g_k||^2 - g_k^T g_{k-1}) g_k^T d_{k-1} / ((μ ||g_{k-1}||^2 + g_k^T d_{k-1}) ||g_k||^2) ≤ 1 - g_k^T d_{k-1} / (μ ||g_{k-1}||^2 + g_k^T d_{k-1}) ≤ 1 + σ.        (20)

Then, (17) is obtained.

(ii) When k ≥ 1 and ||g_k||^2 < g_k^T g_{k-1}, we have β_k^{Mmix} = λ ||g_k||^2 / (||g_{k-1}||^2 + g_k^T d_{k-1}) and

0 ≤ β_k^{Mmix} = λ ||g_k||^2 / (||g_{k-1}||^2 + g_k^T d_{k-1}) ≤ λ ||g_k||^2 / ||g_{k-1}||^2 ≤ ||g_k||^2 / ||g_{k-1}||^2.        (21)

If g_k^T d_{k-1} ≥ 0, we have

1 - σ ≤ 1 - λ g_k^T d_{k-1} / (||g_{k-1}||^2 + g_k^T d_{k-1}) = 1 - β_k^{Mmix} g_k^T d_{k-1} / ||g_k||^2 = -g_k^T d_k / ||g_k||^2 ≤ 1.        (22)

Hence, (16) is obtained.

If g_k^T d_{k-1} ≤ 0, we have

-g_k^T d_k / ||g_k||^2 = 1 - β_k^{Mmix} g_k^T d_{k-1} / ||g_k||^2 = 1 - λ g_k^T d_{k-1} / (||g_{k-1}||^2 + g_k^T d_{k-1}) ≤ 1 + λσ ≤ 1 + σ.        (23)


Hence, we get (17). According to Lemma 1 and its proof, we have the following theorem.

Theorem 3. For Algorithm (a) we have

0 ≤ β_k^{Mmix} ≤ ||g_k||^2 / ||g_{k-1}||^2

and

a ||g_k||^2 ≤ -g_k^T d_k ≤ b ||g_k||^2,  k = 1, 2, ...,        (24)

where 0 < a ≤ 1 - σ and b ≥ 1 + σ.

From Lemma 1 and Theorem 3, the conditions of Theorem 2 (the convergence theorem) are satisfied, and we obtain the following conclusion: if the function f(x) in (1) satisfies Assumption (a), then the CG method (2) and (3) with β_k = β_k^{Mmix} is globally convergent under the Wolfe line search.

5. Numerical experiments

In this section we report some numerical experiments, comparing the number of iterations of the new algorithm with that of the FR method. Each test was run several times with the same function, giving the following table. From these tests we find that the new algorithm is efficient on general functions. To show the efficiency of the new algorithm, the following functions are considered:

(i) f_1(x) = (x_1 - 2)^2 + (x_2 + 2)^2,

(ii) f_2(x), a four-variable test function with quadratic and quartic terms in x_1, x_2, x_3, x_4,

(iii) f_3(x) = (x_1 + 10 x_2)^2 + 5 (x_3 - x_4)^2 + (x_2 - 2 x_3)^4 + 10 (x_1 - x_4)^4.

Numerical results are shown in Table 1, where x_0 is the starting point and x_min is the final point obtained by the new algorithm.

Table 1. Iterative times with the three functions

Function   FR   Mix   x_0             x_min
f_1(x)     25   11    (1.2, -1.2)     (2, -2)
f_2(x)     38   25    (1, 1, 1, 1)    (2.5, 2.5, 5.25, -3.5)
f_3(x)     43   23    (1, 1, 1, 1)    (0, 0, 0, 0)
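Test problems (i) and (iii) are easy to reproduce. The sketch below runs them through SciPy's built-in CG solver (which uses a Polak-Ribiere-type β, not β_k^{Mmix}), so it only illustrates the experimental setup rather than reproducing the iteration counts in Table 1; the cg_mmix sketch from Section 2 could be substituted to exercise the mixed method itself.

import numpy as np
from scipy.optimize import minimize

def f1(x):
    # (i): (x_1 - 2)^2 + (x_2 + 2)^2, minimizer (2, -2)
    return (x[0] - 2.0) ** 2 + (x[1] + 2.0) ** 2

def f3(x):
    # (iii): Powell's singular function, minimizer (0, 0, 0, 0)
    return ((x[0] + 10.0 * x[1]) ** 2 + 5.0 * (x[2] - x[3]) ** 2
            + (x[1] - 2.0 * x[2]) ** 4 + 10.0 * (x[0] - x[3]) ** 4)

for fun, x0 in [(f1, [1.2, -1.2]), (f3, [1.0, 1.0, 1.0, 1.0])]:
    res = minimize(fun, x0, method="CG")   # gradient approximated by finite differences
    print(res.nit, np.round(res.x, 4))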

6. Conclusion

The conjugate gradient method is an important method for solving large scale unconstrained nonlinear optimization problems. In this paper, a new mixed conjugate gradient method is proposed. The new mixed CG method is shown to have the sufficient descent property and global convergence. Finally, some numerical experiments with several functions show the efficiency of the new mixed CG method.


Acknowledgments

The authors wish to thank Dr. Y.Y. Shi for various forms of assistance. The authors also gratefully acknowledge the helpful comments and suggestions of the reviewers.

References

[1] Jing SJ, Deng T, et al. A mixed conjugate gradient method with sufficient descent property for the exact line search. J HNPU 2010;29:266-270.
[2] Zhang ZH, Shi ZJ, et al. A new descent conjugate gradient method for unconstrained optimization. J AM 2009;38:340-344.
[3] Chen Y, Mi HL, et al. A new descent conjugate gradient method for unconstrained optimization. J SYU 2010;7:9-11.
[4] Wang XY, Chen GJ, et al. A sufficient condition for global convergence of conjugate gradient methods. J NUC 2010;31:5-8.
[5] Zheng XF, Tian ZY, Song LW, et al. The global convergence of a mixed conjugate gradient method with the Wolfe line search. J OR Transactions 2009;13:18-24.
[6] Dai ZF, Chen LP, et al. A mixed HS-DY conjugate gradient method. J MNS 2005;27:429-436.
[7] Luo ZJ, Xie J, Xiong R. Convergence of Dai-Yuan conjugate method with general Wolfe line search. J IJPAM 2011;71:343-349.