View
2.958
Download
0
Category
Tags:
Preview:
Citation preview
SelfSelf--Adaptive Simulated Binary CrossoverAdaptive Simulated Binary Crossoverfor Realfor Real--Parameter OptimizationParameter Optimization
Reviewed by
Paskorn Champrasertpaskorn@cs.umb.eduhttp://dssg.cs.umb.edu
Kalyanmoy Deb*, S Karthik*, Tatsuya Okabe*GECCO 2007
*Indian Inst. India**Honda Research Inst., Japan
November 7, 08 DSSG Group Meeting 2/31
Outline• Simulated Binary Crossover (SBX Crossover )
(revisit)• Problem in SBX• Self-Adaptive SBX• Simulation Results
– Self-Adaptive SBX for Single-OOP– Self-Adaptive SBX for Multi-OOP
• Conclusion
November 7, 08 DSSG Group Meeting 3/31
SBXSimulated Binary Crossover
• Simulated Binary Crossover (SBX) was proposed in 1995 by Deb and Agrawal.
• SBX was designed with respect to the one-point crossover properties in binary-coded GA.– Average Property:
The average of the decoded parameter values is the same before and after the crossover operation.
– Spread Factor Property:The probability of occurrence of spread factor β ≈ 1 is more likely than any other β value.
November 7, 08 DSSG Group Meeting 4/31
Important Property of One-Point CrossoverSpread factor:
Spread factor is defined as the ratio of the spread of offspring points to that of the parent points:
β
- Expanding Crossover β > 1
- Contracting Crossover β < 1
- Stationary Crossover β = 1
The offspring points are enclosed by the parent points.
The offspring points enclose by the parent points.
The offspring points are the same as parent points
p1 p2
c1 c2
p1 p2
c1 c2
p1 p2
c1 c2
β = | c1−c2p1−p2 |
November 7, 08 DSSG Group Meeting 5/31
Spread Factor (β) in SBX• The probability distribution of β in SBX should be similar (e.g.
same shape) to the probability distribution of β in Binary-coded crossover.
0 1 2 3 4 5 6 7
Prob
abili
ty D
istri
butio
n
0.0
0.2
0.4
0.6
0.8
1.0
1.2
1.4
1.6
β
2=n
• The probability distribution function is defined as:
c(β) = 0.5(nc + 1)βnc ,β ≤ 1
c(β) = 0.5(nc + 1)1
βnc+2 ,β > 1
November 7, 08 DSSG Group Meeting 6/31
c(β) = 0.5(n+ 1) 1βn+2 ,β > 1
c(β) = 0.5(n+ 1)βn,β ≤ 1contracting
expanding
Mostly, n=2 to 5
A large value of n gives a higher probability for creating ‘near-parent’solutions.
November 7, 08 DSSG Group Meeting 7/31
Spread Factor (β) as a random number
• β is a random number that follows the proposed probability distribution function:
0 1 2 3 4 5 6 7
Prob
abili
ty D
istri
butio
n
0.0
0.2
0.4
0.6
0.8
1.0
1.2
1.4
1.6
β
2=n
To get a β value,1) A unified random number u
is generated [0,1]2) Get β value that makes the
area under the curve = uu
β Offspring:c1 = x− 1
2β(p2 − p1)c2 = x+
12β(p2 − p1)
November 7, 08 DSSG Group Meeting 8/31
SBX Summary• Simulated Binary Crossover (SBX) is a real-parameter
combination operator which is commonly used in evolutionary algorithms (EA) to optimization problems.
– SBX operator uses two parents and creates two offspring.
– SBX operator uses a parameter called the distribution index (nc) which is kept fixed (i.e., 2 or 5) throughout a simulation run.
• If nc is large the offspring solutions are close to the parent solutions• If nc is small the offspring solutions are away from the parent solutions
– The distribution index (nc) has a direct effect in controlling the spread of offspring solutions.
November 7, 08 DSSG Group Meeting 9/31
Research Issues• Finding the distribution index (nc) to an appropriate value
is an important task.– The distribution index (nc) has effect in
1) Convergence speed if nc is very high the offspring solutions are very close to the parent solutions the
convergence speed is very low.2) Local/Global optimum solutions.
Past studies of SBX crossover GA found that it cannot work well in complicated problems (e.g., Restrigin’s function)
November 7, 08 DSSG Group Meeting 10/31
Self-Adaptive SBX• This paper proposed a procedure for updating the
distribution index (nc) adaptively to solve optimization problems.
– The distribution index (nc) is updated every generation.
– How the distribution index of the next generation (nc) is updated (increased or decreased) depends on how the offspring outperform (has better fitness value) than its two parents.
• The next slides will show the process to make an equation to update the distribution index.
November 7, 08 DSSG Group Meeting 11/31
Probability density per offspring
0 1 2 3 4 5 6 7
Prob
abili
ty D
istri
butio
n
0.0
0.2
0.4
0.6
0.8
1.0
1.2
1.4
1.6
β
2=cn
u
β
Spread factor distribution
An example, when p1=2, p2=5
Offspring:
c2 = x+12 β(p2 − p1)
c1 = x− 12 β(p2 − p1)
β>1
c(β) = 0.5(nc + 1)βnc ,β ≤ 1(1)
(2)
(1)
(1)
(2)
(2)
(1) (2)c(β) = 0.5(nc + 1)
1βnc+2 ,β > 1
November 7, 08 DSSG Group Meeting 12/31
Self-Adaptive SBX (1)Consider the case that β > 1:
p1 p2
c1 c2β > 1
β = | c2−c1p2−p1 |
--- (1)β = 1 + 2(c2−p2)p2−p1
β = (c2−p2)+(p2−p1)+(p1−c1)p2−p1
(c2 − p2) = (p1 − c1)
November 7, 08 DSSG Group Meeting 13/31
Self-Adaptive SBX (2)
0 1 2 3 4 5 6 7
Prob
abili
ty D
istri
butio
n
0.0
0.2
0.4
0.6
0.8
1.0
1.2
1.4
1.6
β
u
β
(1) (2)
c(β) = 0.5(nc + 1)βnc ,β ≤ 1(1)
(2) c(β) = 0.5(nc + 1)1
βnc+2 ,β > 1
--- (2)
(2)R β10.5(nc+1)βnc+2 dβ = (u− 0.5)
November 7, 08 DSSG Group Meeting 14/31
--- (2)
(2)R β10.5(nc+1)βnc+2 dβ = (u− 0.5)
Self-Adaptive SBX (3)
R β10.5(nc + 1)β
−(nc+2)dβ = (u− 0.5)
−0.5β−nc−1 − (−0.5) = (u− 0.5)−0.5β−nc−1 = (u− 1)log(β−nc−1) = log− (u−1)0.5
--- (3)nc = −(1 + log 2(1−u)log β )
(−nc − 1) logβ = log(−2(u− 1))(−nc − 1) = log 2(1−u)
log β
November 7, 08 DSSG Group Meeting 15/31
--- (4)
• If c2 is better than both parent solutions,– the authors assume that the region in which the child
solutions is created is better than the region in which the parents solutions are.
– the authors intend to extend the child solution further away from the closet parent solution (p2) by a factor α(α >1)
Self-Adaptive SBX (4)
β > 1p1 p2
c1 c2 c2
(c2 − p2) = α(c2 − p2)
November 7, 08 DSSG Group Meeting 16/31
β − 1
β Update
from (1) β = 1+2(c2−p2)p2−p1
β > 1c2
p1 p2
c1 c2
--- (1)β = 1 + 2(c2−p2)p2−p1 --- (4)(c2−p2) =α(c2−p2)
= 1 + 2((α(c2−p2)+p2)−p2)p2−p1
--- (5)β = 1 + α(β − 1)
= 1 + α2(c2−p2)p2−p1
November 7, 08 DSSG Group Meeting 17/31
Distribution Index (nc) Update--- (3)nc = −(1 + log 2(1−u)
log β ) --- (5)β=1+α(β−1)
nc = −(1 + log 2(1−u)log β
)
nc = −(1 + log 2(1−u)log(1+α(β−1)) )
log 2(1− u) = −(nc + 1) log β
nc = −1 + (nc+1) log βlog(1+α(β−1)) --- (6)
November 7, 08 DSSG Group Meeting 18/31
Distribution Index (nc) Update
nc = −1 + (nc+1) log βlog(1+α(β−1)) --- (6)
When c is better than p1 and p2 then the distribution index for the next generation is calculated as (6). The distribution index will be decreased in order to create an offspring (c’) further away from its parent(p2) than the offspring (c) in the previous generation.
[0,50]
α is a constant (α >1 )
November 7, 08 DSSG Group Meeting 19/31
Distribution Index (nc) Update
nc = −1 + (nc+1) log βlog(1+α(β−1)) --- (6)
When c is worse than p1 and p2 then the distribution index for the next generation is calculated as (7). The distribution index will be increased in order to create an offspring (c’) closer to its parent(p2) than the offspring (c) in the previous generation.
The 1/α is used instead of α
α is a constant (α >1 )
--- (7)nc = −1 + (nc+1) log βlog(1+(β−1)/α)
November 7, 08 DSSG Group Meeting 20/31
Distribution Index (nc) UpdateSo, we are done with the expanding case
0 1 2 3 4 5 6 7
Prob
abili
ty D
istri
butio
n
0.0
0.2
0.4
0.6
0.8
1.0
1.2
1.4
1.6
β
u
β
(1) (2)
c(β) = 0.5(nc + 1)βnc ,β ≤ 1(1)
(2) c(β) = 0.5(nc + 1)1
βnc+2 ,β > 1
(2)
How about contracting case ?(1)
The same process is applied for (1)
--- (x)
(1)R β00.5(nc + 1)β
ncdβ = u
November 7, 08 DSSG Group Meeting 21/31
Distribution Index (nc) Updatefor contracting case
When c is better than p1 and p2 then the distribution index for the next generation is calculated as (8). The distribution index will be decreased in order to create an offspring (c’) father away from its parent(p2) than the offspring (c) in the previous generation.
α is a constant (α >1 )
--- (8)nc =1+ncα − 1
When c is worse than p1 and p2, 1/α is used instead of α
--- (9)nc = α(1 + nc)− 1
November 7, 08 DSSG Group Meeting 22/31
Distribution Index Update Summary
nc = −1 + (nc+1) log βlog(1+α(β−1))
nc = −1 + (nc+1) log βlog(1+(β−1)/α)
nc =1+ncα − 1
nc=α(1+nc)−1
Expanding Case(u > 0.5)
Contracting Case(u<0.5)
c is better than parents
c is worse than parents
(u = 0.5), nc = nc
November 7, 08 DSSG Group Meeting 23/31
N = 150nc = 2α = 1.5
If rand < pc
Ptinitial
population
Parent 1
Parent 2
Update nc
Offspring C1Selection Crossover
Offspring C2
If rand > pc
Mutation
Pt+1Population
at t+1
N = 150α = 1.5
Self-SBX Main Loop
November 7, 08 DSSG Group Meeting 24/31
Simulation Results
SBX• Initial nc is 2.• Population size =150• Pc = 0.9• The number of variables (n) =30• Pm =1/30• A run is terminated when a solution having a function value = 0.001• 11 runs are applied.
Self-SBX• Pc = 0.7• α = 1.5
Sphere Problem:
November 7, 08 DSSG Group Meeting 25/31
Fitness Value
• Self-SBX can find better solutions
November 7, 08 DSSG Group Meeting 26/31
α Study
• α is varied in [1.05,2]• The effect of α is not
significant since the average number of function evaluations is similar.
November 7, 08 DSSG Group Meeting 27/31
Simulation Results
Many local optima, one global minimum (0)The number of variables = 20SBX• Initial nc is 2.• Population size =100• Pc = 0.9• The number of variables (n) =30• Pm =1/30• A run is terminated when a solution having a function value = 0.001 or the maximum
of 40,000 generations is reached.• 11 runs are applied.
Self-SBX• Pc = 0.7• α = 1.5
Rastrigin’s Function:
November 7, 08 DSSG Group Meeting 28/31
Fitness Value
• Self-SBX can find a very close global optimal
Best Median Worst
November 7, 08 DSSG Group Meeting 29/31
α Study
• α is varied in [1.05,10]
• The effect of α is not significant since the average number of function evaluations is similar.
• α = 3 is the best
November 7, 08 DSSG Group Meeting 30/31
Self-SBX in MOOP
• Self-SBX is applied in NSGA2-II.
• Larger hyper-volume is better
November 7, 08 DSSG Group Meeting 31/31
Conclusion
• Self-SBX is proposed in this paper.– The distribution index is adjust adaptively on the run
time.• The simulation results show that Self-SBX can
find the better solutions in single objective optimization problems and multi-objective optimization problems.
• However, α is used to tune up the distribution index.
Recommended