Swarm Intelligence – W6: An Introduction to Control Architectures for Mobile Robots and Two Additional Metaheuristics
Outline
• General Concepts
  – Autonomy
  – Perception-to-Action loop
  – Terminology
• Main examples of reactive control architectures
  – Proximal architectures
  – Distal architectures
• Introduction to two new key metaheuristics
  – Genetic Algorithms
  – Particle Swarm Optimization
  – Comparison between GA and PSO
General Concepts and Principles for Mobile Robotics
Autonomy
• Different levels/degrees of autonomy
  – Energetic level
  – Sensory, motor, and computational level
  – Decisional level
• The needed degree of autonomy depends on the task/environment in which the unit has to operate
• Environmental unpredictability is crucial: robot manipulator vs. mobile robot vs. sensor node
Autonomy – Mobile Robotics
[Figure: task complexity (x-axis) vs. autonomy (y-axis)]
State of the Art in Mobile Robotics
[Figure: industry vs. research positioned along a progression from human-guided robotics, to autonomous robotics, to collective autonomous robotics, and beyond (?)]
Perception-to-Action Loop
[Diagram: perception → computation → action loop, closed through the environment via sensors and actuators]
• Reactive (e.g., nonlinear transform, single loop)
• Reactive + memory (e.g., filter, state variable, multi-loops)
• Deliberative (e.g., planning, multi-loops)
Sensors
• Proprioceptive (“body”) vs. exteroceptive (“environment”)
  – Ex. proprioceptive: motor speed, robot arm joint angle, battery voltage
  – Ex. exteroceptive: distance measurement, light intensity, sound amplitude
• Passive (“measure ambient energy”) vs. active (“emit energy into the environment and measure the environmental reaction”)
  – Ex. passive: temperature probes, microphones, cameras
  – Ex. active: laser rangefinder, IR proximity sensors, ultrasound sonars
Actuators
• For different purposes: locomotion, control of a part of the body (e.g., an arm), heating, sound production, etc.
• Examples of electrical-to-mechanical actuators: DC motors, step motors, servos, loudspeakers, etc.
Computation/Control
• Usually microcontroller based
• “Discretization” (analog-to-digital for values, continuous-to-discrete for time) and “continuization” (digital-to-analog for values, discrete-to-continuous for time)
• Different types of control architectures:
  – Reactive (“reflex-based”) vs. deliberative (“planning”)
  – Proximal vs. distal
Examples of Reactive Control Architectures
Reactive Architectures: Proximal vs. Distal in Theory
• Proximal:
  – Close to sensors and actuators
  – Very simple linear/nonlinear operators on crude data
  – High flexibility in shaping the behavior
  – Difficult to engineer in a “human-guided” way; machine-learning techniques usually perform better
Reactive Architectures: Proximal vs. Distal in Theory
• Distal:
  – Farther from sensors and actuators
  – More elaborate data processing (e.g., filtering)
  – Less flexibility in shaping the behavior
  – Easier to engineer the basic blocks in a “human-guided” way (handcoding); more difficult to compose the blocks in the right way (e.g., sequence, parallel, …)
Reactive Architectures: Proximal vs. Distal in Practice
• A whole blend!
• Four “classical” examples of reactive control architectures for solving the same problem: obstacle avoidance
• Two proximal: Braitenberg and Artificial Neural Network
• Two distal: Subsumption and Motor Schema, both behavior-based
Ex. 1: Braitenberg’s Vehicles
[Figure: vehicles 2a, 2b, 3a, 3b, each with two light sensors coupled to two motors via excitatory (+) or inhibitory (−) connections; symmetry axis along the main axis of the vehicle]
• Work on the difference (gradient) between sensors
• Originally omni-directional sensors, but works even better with directional sensors
• + excitation, − inhibition; linear controller (out = signed coefficient × in)
• Symmetry axis along the main axis of the vehicle
• Originally: light sensors; works perfectly also with proximity sensors (3c?)
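The linear couplings above can be sketched in a few lines. This is an illustrative sketch, not code from the lecture: the sensor/motor interface, base speed, and gain are assumptions; only the coupling pattern (ipsilateral for 2a/3a, contralateral for 2b/3b; excitation for 2x, inhibition for 3x) comes from the slide.

```python
# Hypothetical Braitenberg vehicle controller: two sensors, two motors,
# linear controller out = base + signed_coefficient * in.

def braitenberg_step(left_sensor, right_sensor, coupling="3a"):
    """Compute (left_motor, right_motor) from two sensor readings.

    2a/3a: ipsilateral coupling (left sensor -> left motor).
    2b/3b: contralateral coupling (left sensor -> right motor).
    Vehicles 2x excite (+), vehicles 3x inhibit (-).
    """
    base = 1.0   # forward bias so the vehicle moves without stimulus (assumed)
    gain = 0.5   # signed coefficient magnitude (assumed)
    sign = 1.0 if coupling in ("2a", "2b") else -1.0
    if coupling in ("2a", "3a"):           # ipsilateral
        return (base + sign * gain * left_sensor,
                base + sign * gain * right_sensor)
    else:                                  # contralateral
        return (base + sign * gain * right_sensor,
                base + sign * gain * left_sensor)
```

For instance, vehicle 3a with a stimulus on the left slows its left motor, turning the vehicle toward the stimulus.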
Ex. 2: Artificial Neural Network
[Figure: robot with proximity sensors S1–S8 feeding neurons that drive motors M1, M2; excitatory and inhibitory connections]
Neuron Ni with sigmoid transfer function f(x):
  x_i = Σ_{j=1..m} w_ij I_j + I_0
  O_i = f(x_i)
  f(x) = 2 / (1 + e^(−x)) − 1
where I_j is an input, w_ij a synaptic weight, and O_i the output.
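The neuron equations above translate directly into code. A minimal sketch, assuming the 8-sensor/2-motor layout of the figure; the weights and bias values are illustrative, not the ones used in the lecture:

```python
import math

def sigmoid(x):
    # f(x) = 2 / (1 + e^(-x)) - 1, output bounded in (-1, 1)
    return 2.0 / (1.0 + math.exp(-x)) - 1.0

def ann_step(inputs, weights, bias):
    """One motor neuron: weighted sum of inputs plus bias, squashed.

    x_i = sum_j w_ij * I_j + I_0 ; O_i = f(x_i)
    """
    x = sum(w * i for w, i in zip(weights, inputs)) + bias
    return sigmoid(x)
```

With one such neuron per motor (and negative weights for inhibitory connections), this is a complete proximal obstacle-avoidance controller.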
Ex. 3: Rule-Based
Rule 1: if (proximity sensors on the left active) then turn right
Rule 2: if (proximity sensors on the right active) then turn left
Rule 3: if (no proximity sensors active) then move forwards
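The three rules can be written as a trivial priority-ordered conditional. The activation threshold and the grouping of sensors into “left” and “right” readings are assumptions for illustration:

```python
# Hypothetical rule-based obstacle avoidance (discrete response encoding).

def rule_based_step(left_prox, right_prox, threshold=0.5):
    """Map two grouped proximity readings to a discrete action."""
    if left_prox > threshold:      # Rule 1: obstacle on the left
        return "turn_right"
    if right_prox > threshold:     # Rule 2: obstacle on the right
        return "turn_left"
    return "move_forward"          # Rule 3: path is clear
```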
Subsumption Architecture
• Rodney Brooks 1986, MIT
• Precursors: Braitenberg (1984), Walter (1953)
• Behavioral modules (basic behaviors) represented by Augmented Finite State Machines (AFSM)
• Response encoding: predominantly discrete (rule-based)
• Behavioral coordination method: competitive (priority-based arbitration via inhibition and suppression)
Subsumption Architecture
Classical paradigm (serial), emphasis on deliberative control:
  Sense → Model → Plan → Act
Subsumption (parallel), emphasis on reactive control; layered competences, e.g.:
  Modify the World / Create Maps / Discover / Avoid Collisions / Move Around
Subsumption Architecture: AFSM
[Diagram: behavioral module with input lines, an inhibitor (I) on the inputs, a reset line (R), and a suppressor (S) on the output lines]
Inhibitor: blocks the transmission
Suppressor: blocks the transmission and replaces the signal with the suppressing message
Ex. 4: Behavior-Based with Subsumption
[Diagram: sensors feed two behavioral layers, (1) Obstacle avoidance and (2) Wander; layer 1 suppresses and replaces layer 2 at a suppressor node S before the actuators]
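The suppressor node of this two-layer network reduces to a simple priority check. This is a toy sketch with assumed behavior outputs and threshold; real subsumption modules would be AFSMs running asynchronously:

```python
# Toy two-layer subsumption: layer 1 (obstacle avoidance) suppresses
# and replaces layer 2 (wander) whenever it is active.

def wander():
    return "move_forward"                     # layer 2 (lower priority)

def obstacle_avoidance(prox):
    # layer 1 (higher priority): active only when an obstacle is close
    return "turn_away" if prox > 0.5 else None

def subsumption_step(prox):
    """Suppressor S: layer 1's output replaces layer 2's when present."""
    command = obstacle_avoidance(prox)
    return command if command is not None else wander()
```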
Evaluation of Subsumption
+ Support for parallelism: each behavioral layer can run independently and asynchronously (including different loop times)
+ HW retargetability: can compile down directly to programmable-array logic circuitry
− Hardwiring means less run-time flexibility
− Coordination mechanisms are restrictive (“black or white”)
− Limited support for modularity (upper layers cannot be designed independently from lower layers)
Motor Schemas
• Ronald Arkin 1987, Georgia Tech
• Precursors: Arbib (1981), Khatib (1985)
• Parametrized behavioral libraries (schemas)
• Response encoding: continuous, using a potential-field analog
• Behavioral coordination method: cooperative, via vector summation and normalization
Motor Schemas
[Diagram: sensors S1, S2, S3, … feed perceptual schemas PS1, PS2, PS3 (some via perceptual subschemas PSS1, PSS2); motor schemas MS1, MS2 each output a vector, summed (Σ) into the motor command]
PS: Perceptual Schema; PSS: Perceptual Subschema; MS: Motor Schema; S: sensor
Ex. 5: Behavior-Based with Motor Schemas
[Diagram: sensors feed Detect-obstacles → Avoid-obstacle and Detect-Goal → Move-to-Goal; a Noise → Generate-direction schema is added; all output vectors are summed (Σ) and sent to the actuators]
The noise schema avoids getting stuck in local minima (a typical problem of vector-field approaches)
Visualization of Vector Field for Ex. 5
Avoid-static-obstacle
[Figure: repulsive vector field around an obstacle]
V_direction = radially along a line between robot and obstacle center, directed away from the obstacle
V_magnitude =
  ∞                        for d ≤ R
  ((S − d) / (S − R)) × G  for R < d ≤ S
  0                        for d > S
where S = obstacle’s sphere of influence, R = radius of the obstacle, G = gain, d = distance from robot to the obstacle’s center
Visualization of Vector Field for Ex. 5
Move-to-goal (ballistic)
[Figure: uniform vector field pointing toward the goal]
Output = vector = (r, φ) (magnitude, direction)
V_magnitude = fixed gain value
V_direction = towards perceived goal
Visualization of Vector Field for Ex. 5
Move-to-goal + avoid-obstacle
[Figure: goal G and two obstacles O; the two fields are combined by linear combination (weighted sum)]
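The two schemas and their cooperative combination can be sketched directly from the formulas above. A minimal 2-D sketch: the world representation, gains, and radii are assumptions; the magnitude profile follows the piecewise definition on the avoid-static-obstacle slide:

```python
import math

def avoid_obstacle(robot, obstacle, R=0.5, S=2.0, G=1.0):
    """Repulsive vector away from one obstacle (potential-field analog)."""
    dx, dy = robot[0] - obstacle[0], robot[1] - obstacle[1]
    d = math.hypot(dx, dy)
    if d > S:
        mag = 0.0                       # outside the sphere of influence
    elif d > R:
        mag = (S - d) / (S - R) * G     # linear ramp inside the sphere
    else:
        mag = 1e9                       # "infinite" repulsion inside R
    return (dx / d * mag, dy / d * mag)

def move_to_goal(robot, goal, gain=1.0):
    """Attractive vector of fixed magnitude toward the goal (ballistic)."""
    dx, dy = goal[0] - robot[0], goal[1] - robot[1]
    d = math.hypot(dx, dy)
    return (dx / d * gain, dy / d * gain)

def combine(*vectors):
    """Cooperative coordination: plain vector summation."""
    return (sum(v[0] for v in vectors), sum(v[1] for v in vectors))
```

A Generate-direction noise schema would simply contribute one more small random vector to `combine`.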
Evaluation of Motor Schemas
+ Support for parallelism: motor schemas are naturally parallelizable
+ Run-time flexibility: schemas = software agents → reconfigurable on the fly
− Robustness: well-known problems of the potential-field approach require the extra injection of noise (no clear method for exploiting the noise already generated by sensors, …)
− Sometimes slow and computationally expensive
− No HW retargetability: no HW compilers provided; the system is not taken into account as a whole
Evaluation of Both Architectures in Practice
• In practice (my experience) you tend to mix both, and even more …
• The way to combine basic behaviors (collaborative and/or competitive) depends on how you developed the basic behaviors (or motor schemas), the reaction time required, the on-board computational capabilities, …
• Pierre Arnaud’s work (thesis, EPFL, 2000; see references at the end) went in this direction
A Well-Known Example of a Hybrid, Practical Control Solution: Boids
• A time-constant, linear weighted sum of the three behavioral rules did not work in front of obstacles → motor schema alone did not work!
• A time-varying, nonlinear weighted sum worked much better: allocate the maximal acceleration available to the highest behavioral priority, and the remainder to the other behaviors → a mixture of motor schema & subsumption was OK!
Very interesting timing: Brooks (1986), Arkin (1987), Reynolds (1987)!
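The time-varying allocation described above can be sketched as a prioritized acceleration budget. This is an illustrative reconstruction, not Reynolds’ actual code: behaviors are assumed to request 2-D acceleration vectors, ordered by decreasing priority:

```python
import math

def allocate_acceleration(requests, max_accel):
    """Grant each behavior's requested acceleration, highest priority
    first, until the total magnitude budget max_accel is spent; the
    last behavior served may get only a truncated share."""
    ax = ay = 0.0
    budget = max_accel
    for rx, ry in requests:            # ordered by decreasing priority
        mag = math.hypot(rx, ry)
        if mag <= budget:
            ax, ay = ax + rx, ay + ry  # grant the full request
            budget -= mag
        else:
            if mag > 0.0:              # grant the remaining budget only
                scale = budget / mag
                ax, ay = ax + rx * scale, ay + ry * scale
            break
    return (ax, ay)
```

When obstacle avoidance (highest priority) needs the whole budget, lower-priority flocking rules are effectively suppressed, which is what makes this a motor-schema/subsumption hybrid.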
Rationale behind Automatic Design and Optimization, and Terminology
Why Machine-Learning?
• Complementary to model-based/engineering approaches: when low-level details matter (optimization) and/or good solutions do not exist (design)!
• When the design/optimization space is too big (infinite) or too computationally expensive to be searched systematically
• Automatic design and optimization techniques
• Role of the engineer reduced to specifying performance requirements and the problem encoding
Why Machine-Learning?
• There are design and optimization techniques robust to noise, nonlinearities, and discontinuities
• Individual real-time adaptation to new environmental conditions, i.e., increased individual flexibility when environmental conditions are not known or cannot be predicted a priori
• Search space: parameters and/or rules
ML Techniques: Classification
– Supervised techniques: a “trainer/teacher” is available.
  • Ex: a set of input-output examples is provided to the system; the error between the output of the system and the teacher’s perfect answer for the same input is calculated and fed back to the system using a machine-learning algorithm, so that the error is reduced over trials
  • The generalization capabilities of the system after training are tested on examples not presented to the system before (i.e., not in the training set)
– Unsupervised techniques: “trial-and-error”, “evaluative” techniques; no teacher available.
  • The system judges its performance according to a given metric to be optimized
  • The metric does not refer to any specific input-to-output mapping
  • The system tries out possible design solutions, makes mistakes, and tries to learn from its mistakes
  • The number of possible examples is very large, possibly infinite, and not known a priori
ML Techniques: Classification
– Off-line: in simulation; the learned/evolved solution is downloaded onto real hardware when certain criteria are met
– Hybrid: most of the time in simulation (e.g., 90%), last period (e.g., 10%) of the process on real hardware
– On-line: from the beginning on real hardware (no simulation); more or less rapid depending on the algorithm
ML Techniques: Classification
ML algorithms sometimes require fairly substantial computational resources (in particular multi-agent search algorithms); a further classification is therefore:
– On-board: the machine-learning algorithm runs on the system to be learned or evolved (no external unit)
– Off-board: the machine-learning algorithm runs off-board, and the system to be learned or evolved just serves as the phenotypical, embodied implementation of a candidate solution
Selected Unsupervised ML Techniques Robust to Noisy Fitness/Reinforcement Functions
• Evolutionary computation
  – Genetic Algorithms (GA) [today & next week]
  – Genetic Programming (GP) [next week]
  – Evolutionary Strategies (ES) [next week]
  – Particle Swarm Optimization (PSO) [today & next week]
• Learning
  – In-Line Adaptive Learning [maybe last 2 weeks of the course]
  – Reinforcement Learning [maybe last 2 weeks of the course]
Genetic Algorithms
GA: Terminology
• Population: set of m candidate solutions (e.g., m = 100); each candidate solution can also be considered as a genetic individual endowed with a single chromosome, which in turn consists of multiple genes
• Generation: new population after the genetic operators have been applied (n = number of generations, e.g., 50, 100, 1000)
• Fitness function: measurement of the efficacy of each candidate solution
• Evaluation span: evaluation period of each candidate solution during a given generation. Its time cost differs greatly from scenario to scenario: it can be extremely cheap (e.g., simply computing the fitness on a benchmark function) or involve an experimental period (e.g., evaluating the performance of a given control parameter set on a robot)
• Life span: number of generations a candidate solution survives
• Population manager: applies genetic operators to generate the candidate solutions of the new generation from the current one
• Principles: select, recombine, and mutate
Evolutionary Loop: Several Generations
[Flowchart: Start → Initialize Population → Generation loop → End criterion met? If no, repeat the generation loop; if yes, End]
Ex. of end criteria:
• # of generations
• best solution performance
• …
Generation Loop
[Diagram: the population manager performs selection, crossover and mutation, and population replenishing; each candidate solution is decoded (genotype → phenotype) into the system, its fitness is measured, and the evaluation of the individual candidate solutions feeds back into the population manager]
GA: Coding & Decoding
phenotype –(coding)→ genotype (chromosome) –(decoding)→ phenotype
• Phenotype: usually represented by a vector of dimension D, D being the dimension of the hyperspace to search; vector components are usually real numbers in a bounded range
• Genotype: chromosome = string of genotypical segments, i.e., genes; mathematically speaking, again a vector of real or binary numbers, whose dimension varies according to the coding schema (≥ D)
  G1 G2 G3 G4 … Gn    (Gi = gene = binary or real number)
• Coding: real-to-real, or real-to-binary via Gray code (minimizes nonlinear jumps between phenotype and genotype)
• Decoding: the inverse operation
Rem:
• Artificial evolution: usually a one-to-one mapping between phenotypic and genotypic space
• Natural evolution: 1 gene codes for several functions, 1 function is coded by several genes
GA: Basic Operators
• Selection: roulette wheel, ranked selection, elitist selection
• Crossover: 1-point, 2-point (e.g., p_crossover = 0.2)
• Mutation: Gk → Gk’ (e.g., p_mutation = 0.05)
Note: examples for fixed-length chromosomes!
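The loop and operators above fit in a short sketch. A minimal GA for real-valued, fixed-length chromosomes; the selection scheme (ranked, top half), elitism, Gaussian mutation, and all parameter values are illustrative choices, not the lecture’s:

```python
import random

def evolve(fitness, dim=4, m=20, generations=50,
           p_crossover=0.2, p_mutation=0.05, seed=0):
    """Maximize `fitness` over real vectors of dimension `dim`."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-5.12, 5.12) for _ in range(dim)]
           for _ in range(m)]
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        parents = scored[: m // 2]           # ranked selection (top half)
        new_pop = [scored[0][:]]             # elitism: keep the best
        while len(new_pop) < m:
            a, b = rng.sample(parents, 2)
            child = a[:]
            if rng.random() < p_crossover:   # 1-point crossover
                cut = rng.randrange(1, dim)
                child = a[:cut] + b[cut:]
            for j in range(dim):             # per-gene mutation
                if rng.random() < p_mutation:
                    child[j] += rng.gauss(0.0, 0.5)
            new_pop.append(child)
        pop = new_pop
    return max(pop, key=fitness)

# Ex.: maximize -||x||^2, whose optimum is at the origin
best = evolve(lambda x: -sum(g * g for g in x))
```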
Particle Swarm Optimization
PSO: Terminology
• Population: set of candidate solutions tested in one time step; consists of m particles (e.g., m = 20)
• Particle: represents a candidate solution; it is characterized by a velocity vector v and a position vector x in the hyperspace of dimension D
• Evaluation span: evaluation period of each candidate solution during one time step; as in GA, the evaluation span might take more or less time depending on the experimental scenario
• Fitness function: measurement of the efficacy of a given candidate solution during the evaluation span
• Population manager: updates the velocity and position of each particle according to the main PSO loop
• Principles: imitate, evaluate, compare
Evolutionary Loop: Several Generations
[Flowchart: Start → Initialize particles → Perform main PSO loop → End criterion met? If no, repeat the main PSO loop; if yes, End]
Ex. of end criteria:
• # of time steps
• best solution performance
• …
Initialization: Positions and Velocities
The Main PSO Loop – Parameters and Variables
• Functions
  – rand() = uniform random number in [0,1]
• Parameters
  – w: velocity inertia (positive scalar)
  – c_p: personal coefficient/weight (positive scalar)
  – c_n: neighborhood coefficient/weight (positive scalar)
• Variables
  – x_ij(t): position of particle i in the j-th dimension at time step t (j = 1…D)
  – v_ij(t): velocity of particle i in the j-th dimension at time step t
  – x*_ij(t): position of particle i in the j-th dimension with maximal fitness up to iteration t
  – x*_i’j(t): position of the particle i’ in the j-th dimension having achieved the maximal fitness up to iteration t in the neighborhood of particle i (i’ ≠ i)
The Main PSO Loop (Eberhart, Kennedy, and Shi, 1995, 1998)
At each time step t,
  for each particle i,
    for each component j:
      update the velocity
        v_ij(t+1) = w v_ij(t) + c_p rand() (x*_ij(t) − x_ij(t)) + c_n rand() (x*_i’j(t) − x_ij(t))
      then move
        x_ij(t+1) = x_ij(t) + v_ij(t+1)
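The update equations above can be sketched as follows. A minimal sketch using a global-best neighborhood for brevity (the slides’ lbest ring topology would instead track a per-particle neighborhood best); parameter values are illustrative:

```python
import random

def pso(fitness, dim=2, m=20, steps=100, w=0.7, cp=1.5, cn=1.5, seed=0):
    """Minimize `fitness` over real vectors of dimension `dim`."""
    rng = random.Random(seed)
    x = [[rng.uniform(-5.12, 5.12) for _ in range(dim)] for _ in range(m)]
    v = [[0.0] * dim for _ in range(m)]
    pbest = [xi[:] for xi in x]            # personal bests x*_i
    gbest = min(x, key=fitness)[:]         # neighborhood best x*_i'
    for _ in range(steps):
        for i in range(m):
            for j in range(dim):
                # v_ij(t+1) = w v_ij + cp rand (x*_ij - x_ij)
                #                    + cn rand (x*_i'j - x_ij)
                v[i][j] = (w * v[i][j]
                           + cp * rng.random() * (pbest[i][j] - x[i][j])
                           + cn * rng.random() * (gbest[j] - x[i][j]))
                x[i][j] += v[i][j]         # then move
            if fitness(x[i]) < fitness(pbest[i]):
                pbest[i] = x[i][:]
                if fitness(pbest[i]) < fitness(gbest):
                    gbest = pbest[i][:]
    return gbest

# Ex.: minimize the sphere function
best = pso(lambda p: sum(c * c for c in p))
```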
The Main PSO Loop – Vector Visualization
[Diagram: particle i at position x_i with velocity v_i (“Here I am!”); x*_i is my position with optimal fitness up to date (p-proximity); x*_i’ is the position with optimal fitness of my neighbors up to date (n-proximity)]
Neighborhood Types
• Size:
  – The neighborhood index also counts the particle itself
  – Local: only k neighbors considered over the m particles in the population (1 < k < m); k = 1 means no neighbor is considered in the velocity update
  – Global: m neighbors
• Topology:
  – Geographical
  – Social
  – Indexed
  – Random
  – …
Neighborhood Examples: Geographical vs. Social
[Figure: geographical vs. social neighborhoods]
Neighborhood Example: Indexed and Circular
[Figure: particles 1–8 arranged on a virtual circle; particle 1’s 3-neighbourhood]
PSO Animated Illustration
[Animation: particles converging toward the global optimum]
GA vs. PSO - Qualitative

Parameter/function                                       | GA                                | PSO
General features                                         | multi-agent, probabilistic search | multi-agent, probabilistic search
Particle's variables (tracked by the population manager) | position                          | position and velocity
Social operators                                         | selection, crossover              | neighborhood best position history
Individual operators                                     | mutation                          | personal best position history, velocity inertia
Individual memory                                        | no                                | yes (randomized hill climbing)
GA vs. PSO - Qualitative

Parameter/function                  | GA                                                    | PSO
Global/local search balance         | somehow tunable via pc/pm ratio and selection schema  | tunable with w (w↑ → global search; w↓ → local search)
Population diversity                | somehow tunable via pc/pm ratio and selection schema  | mainly via local neighborhood
# of algorithmic parameters (basic) | pm, pc, selection par. (1), m, position range (1) = 5 | w, cn, cp, k, m, position and velocity range (2) = 7
GA vs. PSO - Quantitative
• Goal: minimization of a given f(x)
• Standard benchmark functions with thirty terms (n = 30) and a fixed number of iterations
• All x_i constrained to [−5.12, 5.12]
• GA: roulette wheel for propagation; mutation applies a numerical adjustment to the gene
• PSO: lbest ring topology with a neighborhood of size 2
• Algorithm parameters used (but not optimized!)
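The benchmark functions behind the results table are standard and easy to write down. These are the usual textbook formulations, restricted to the slides’ search range; the lecture’s exact variants (e.g., scaling constants) are assumed:

```python
import math

def sphere(x):
    # f(x) = sum x_i^2, global minimum 0 at the origin
    return sum(c * c for c in x)

def rastrigin(x):
    # highly multimodal; global minimum 0 at the origin
    return 10 * len(x) + sum(c * c - 10 * math.cos(2 * math.pi * c)
                             for c in x)

def rosenbrock(x):
    # generalized Rosenbrock; global minimum 0 at (1, 1, ..., 1)
    return sum(100 * (x[i + 1] - x[i] ** 2) ** 2 + (1 - x[i]) ** 2
               for i in range(len(x) - 1))

def griewank(x):
    # global minimum 0 at the origin
    s = sum(c * c for c in x) / 4000.0
    p = math.prod(math.cos(c / math.sqrt(i + 1))
                  for i, c in enumerate(x))
    return s - p + 1.0
```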
GA vs. PSO - Quantitative

Function (no noise)    | GA (mean ± std dev) | PSO (mean ± std dev)
Sphere                 | 0.02 ± 0.01         | 0.00 ± 0.00
Generalized Rosenbrock | 34.6 ± 18.9         | 7.38 ± 3.27
Rastrigin              | 157 ± 21.8          | 48.3 ± 14.4
Griewank               | 0.01 ± 0.01         | 0.01 ± 0.03

30 runs; no noise; bold in the original slide marked the best results.
Additional Literature – Week 6
Books
• Braitenberg V., “Vehicles: Experiments in Synthetic Psychology”, MIT Press, 1986.
• Siegwart R. and Nourbakhsh I. R., “Introduction to Autonomous Mobile Robots”, MIT Press, 2004.
• Arkin R. C., “Behavior-Based Robotics”, MIT Press, 1998.
• Everett H. R., “Sensors for Mobile Robots, Theory and Application”, A. K. Peters, Ltd., 1995.
• Nolfi S. and Floreano D., “Evolutionary Robotics: The Biology, Intelligence, and Technology of Self-Organizing Machines”, MIT Press, 2004.
• Kennedy J. and Eberhart R. C. with Shi Y., “Swarm Intelligence”, Morgan Kaufmann, 2001.
• Mitchell M., “An Introduction to Genetic Algorithms”, MIT Press, 1996.
• Goldberg D. E., “Genetic Algorithms in Search, Optimization, and Machine Learning”, Addison-Wesley, Reading, MA, 1989.