The Neuronal Replicator Hypothesis
Chrisantha Fernando & Eörs Szathmáry
CUNY, December 2009
1 Collegium Budapest (Institute for Advanced Study), Budapest, Hungary
2 Centre for Computational Neuroscience and Robotics, Sussex University, UK
3 MRC National Institute for Medical Research, Mill Hill, London, UK
4 Parmenides Foundation, Kardinal-Faulhaber-Straße 14a, D-80333 Munich, Germany
5 Institute of Biology, Eötvös University, Pázmány Péter sétány 1/c, H-1117 Budapest, Hungary
Visiting Fellow, MRC National Institute for Medical Research, London
Post-Doc, Centre for Computational Neuroscience and Robotics, Sussex University
Marie Curie Fellow, Collegium Budapest (Institute for Advanced Study), Hungary
The Hypothesis
• Evolution by natural selection takes place in the brain at rapid timescales and contributes to solving cognitive/behavioural search problems.
• Our background is in evolutionary biology, the origin of non-enzymatic template replication, evolutionary robotics, and computational neuroscience.
Outline
• Limitations of some proposed search algorithms, e.g.:
• Reward-biased stochastic search
• Reinforcement Learning
•How copying/replication of neuronal data structures can alleviate these limitations.
•Mechanisms of neuronal replication
•Applications and future work
Simple Search Tasks
• Behavioural and neuropsychological learning tasks can be solved by stochastic hill-climbing
•Stroop Task
•Wisconsin Card Sorting Task (WCST)
•Instrumental Conditioning in Spiking Neural Networks
•Simple inverse kinematics problem
Stochastic Hill-Climbing
• Initially P(x_i = 1) = 0.5 for each bit i; initial reward = 0
• Make a random change to P
• Generate M sample binary strings from P
• Calculate the reward r(t)
• If r(t) > r(t−1), keep the changes to P; else revert to the previous P values
• One solution at a time: change it, keep good changes, lose bad changes
[Figure: the probability vector P starts at (0.5, 0.5, 0.5, 0.5, 0.5) and is perturbed, e.g. to (0.8, 0.5, 0.5, 0.4, 0.5)]
Can get stuck on local optima
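A minimal runnable sketch of this distribution-based hill-climber, assuming a stand-in reward that simply counts 1s in the sampled strings (the real tasks below would supply their own reward signal):

```python
import numpy as np

rng = np.random.default_rng(0)

N, M = 20, 50            # string length, samples per evaluation (assumed sizes)
p = np.full(N, 0.5)      # P(x_i = 1), initialised to 0.5 as on the slide
best_r = 0.0             # initial reward = 0

def reward(samples):
    # Stand-in reward: mean number of 1s per string; the real task's
    # feedback signal would go here.
    return samples.sum(axis=1).mean()

for step in range(2000):
    p_new = np.clip(p + rng.normal(0.0, 0.05, N), 0.0, 1.0)  # random change to P
    samples = (rng.random((M, N)) < p_new).astype(int)       # M binary strings
    r = reward(samples)
    if r > best_r:                    # keep good changes...
        p, best_r = p_new, r
    # ...else revert: p is simply left unchanged (lose bad changes)

print(best_r, p.round(2))
```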
Stroop Task
Green Red Blue Purple Blue Purple
Blue Purple Red Green Purple Green
Name the colour of the words.
Dehaene et al, 1998
dW = Reward × pre × post. Decreased reward → instability in the workspace.
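As a sketch, the slide's three-factor rule can be written as a reward-gated Hebbian update; the learning rate and activity vectors below are illustrative assumptions:

```python
import numpy as np

# Three-factor Hebbian update from the slide: dW = reward * pre * post,
# as used in workspace models such as Dehaene et al. 1998 (rates assumed).
def update_weights(W, pre, post, reward, lr=0.05):
    """Weight change gated by reward: correlated pre/post activity is
    strengthened when reward is positive, weakened when negative."""
    return W + lr * reward * np.outer(post, pre)

W = np.zeros((3, 4))
pre, post = np.array([1., 0., 1., 0.]), np.array([0., 1., 1.])
W = update_weights(W, pre, post, reward=+1.0)   # rewarded trial: stabilise
W = update_weights(W, pre, post, reward=-1.0)   # unrewarded trial: destabilise
```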
WCST
• Each card has several "features". Subjects must sort cards according to one feature (color, number, shape, size).
• Rougier et al. 2005: PFC weights are stabilised if the expected reward is obtained, destabilised if it is not, i.e. TD learning.
Instrumental Conditioning in a spiking neural net
Izhikevich 2007
• Simple spiking model
• Random connections
• STDP
• Delayed reward
• Eligibility traces
• Synapse selected
STDP
[Figure: the STDP window; the sign and size of the weight change depend on the interval t_post − t_pre between pre- and postsynaptic spike times]
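A minimal sketch of how these pieces combine, in the spirit of Izhikevich 2007: an STDP kernel tags a synapse with an eligibility trace, and a sparse, delayed reward signal converts the decaying trace into a weight change. All constants and the event statistics are illustrative assumptions:

```python
import numpy as np

A_PLUS, A_MINUS = 0.10, 0.12   # STDP amplitudes (assumed)
TAU_PLUS = TAU_MINUS = 20.0    # STDP time constants, ms (assumed)
TAU_C = 1000.0                 # eligibility-trace decay, ms (assumed)

def stdp(interval_ms):
    """STDP kernel; interval_ms = t_post - t_pre."""
    if interval_ms > 0:                                  # pre before post: potentiate
        return A_PLUS * np.exp(-interval_ms / TAU_PLUS)
    return -A_MINUS * np.exp(interval_ms / TAU_MINUS)    # post before pre: depress

rng = np.random.default_rng(1)
w, c = 0.5, 0.0                        # synaptic weight, eligibility trace
for t in range(5000):                  # 1 ms time steps
    if rng.random() < 0.01:            # occasional pre/post spike pairing
        c += stdp(rng.normal(5.0, 10.0))   # tag the synapse
    c *= np.exp(-1.0 / TAU_C)          # trace decays between events
    if rng.random() < 0.002:           # sparse, delayed reward signal
        w += c                         # reward converts the trace into change
print(round(w, 3))
```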
A simple 2D inverse kinematics problem
Reinforcement Learning
• For large problems a tabular representation of state-action pairs is not possible.
• How does compression of state representation occur? Function approximation
• Domain-specific knowledge is provided by the designer: e.g. "TD-Gammon was dependent on Tesauro's skillful design of a non-linear multilayered neural network, used for value function approximation in the Backgammon domain consisting of approximately 10^20 states" (p. 20, [51]).
So far…
• SHC works on simple problems
• RL is a sophisticated kind of SHC
• In order for RL/SHC to work, action/value representations must fit the problem domain.
• RL doesn't explain how appropriate data structures/representations arise.
Large search spaces make random or exhaustive search impossible.
The representation is critical: a poor one strands search on local optima.
Complex tasks require internal sub-goals; there is no explicit reward.
What neural mechanisms underlie complex search?
What is natural selection?
Some hereditary traits affect survival and/or fertility
1. multiplication
2. heredity
3. variability
Natural selection reinvented itself
Evolutionary Computation
• Solving problems by EC also requires decisions about genetic representations
• And about fitness functions
• For example, we use EC to solve the 10-coins problem
Fitness function
• Convolution of the desired inverted triangle over the grid
• Instant fitness = number of coins occupying the inverted-triangle template (see the sketch below)
• An important question is how such fitness functions (subgoals/goals) could themselves be bootstrapped in cognition.
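A minimal sketch of this template-matching fitness, with an assumed grid size and a toy inverted-triangle template standing in for the real one:

```python
import numpy as np

# Hypothetical grid of coin positions (1 = coin) and a toy inverted-triangle
# template; the real problem's grid size and template are assumptions here.
template = np.array([[1, 1, 1],
                     [0, 1, 0]])

def fitness(grid, template):
    """Slide ("convolve") the template over the grid; instant fitness is the
    largest number of coins that fall on template cells."""
    gh, gw = grid.shape
    th, tw = template.shape
    best = 0
    for r in range(gh - th + 1):
        for c in range(gw - tw + 1):
            best = max(best, int((grid[r:r+th, c:c+tw] * template).sum()))
    return best

grid = np.zeros((5, 5), dtype=int)
grid[1, 1:4] = 1                  # three coins in a row
grid[2, 2] = 1                    # one coin below the middle
print(fitness(grid, template))    # -> 4: the template is fully covered
```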
Michael Ollinger, Parmenides Foundation, Munich
Structuring Phenotypic Variation
• Natural selection can act on:
• genetic representations
• variability properties (genetic operators, e.g. mutation rates)
Variation in Variability
Improvement of representations for free…
Non-trivial Neutrality
[Figure: distinct genotypes g1 and g2 map to the same phenotype p but carry different exploration distributions, ed 1 and ed 2]
Adapted from Toussaint 2003
Population Search
• Natural selection allows redistribution of search resources between multiple solutions.
• We propose that multiple (possibly interacting) solutions to a search problem exist at the same time in the neuronal substrate.
[Figure: a population of solutions A, B, C, D; search effort shifts towards the fitter solution D, whose variants D′, D″, D‴ replace the others, while unselected solutions are waste]
Can units of selection exist in the brain?
• We propose 3 possible mechanisms:
• Copying of connectivity patterns
• Copying of bistable activity patterns
• Copying of spatio-temporal spike patterns & explicit rules
Copying of connectivity patterns
How to copy small neuronal circuits
DNA vs. a neuronal network
STDP and causal inference
With error correction and sparse activation
(1+1) Evolution Strategy
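For reference, a minimal (1+1) evolution strategy, the selection scheme named on the slide: one parent, one mutated offspring, keep whichever is fitter. The real-valued toy objective below is an assumption standing in for circuit-copying fidelity:

```python
import numpy as np

rng = np.random.default_rng(2)
target = rng.random(8)                    # hypothetical "target circuit"

def fitness(x):
    return -np.sum((x - target) ** 2)     # 0 is a perfect copy

x = rng.random(8)                         # parent
for _ in range(2000):
    child = x + rng.normal(0.0, 0.05, x.shape)   # mutate the single parent
    if fitness(child) >= fitness(x):             # (1+1) selection
        x = child
print(round(float(fitness(x)), 4))
```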
Copying of bistable activity patterns
1 bit copy
Hebbian Learning can Structure Exploration Distributions
- Search is biased towards previous local optima
The Origin of Heredity in Neuronal Networks.
[Figure: genotype 1 and genotype 2 produce phenotype 1 and phenotype 2 through the maps M1 and M2; the copying map C relates the two genotypes]
For the copy to reproduce the phenotype: M2 C = M1, hence C = M2⁻¹ M1.
This is non-local, e.g. it requires terms such as AᵀA.
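A tiny numerical illustration of that relation, assuming (for the sketch only) that M1 and M2 are invertible linear maps:

```python
import numpy as np

# If the genotype-to-phenotype maps M1, M2 are invertible matrices and the
# copy must reproduce the parent's phenotype, C solves M2 C = M1, i.e.
# C = M2^{-1} M1 -- a non-local computation, as the slide notes.
rng = np.random.default_rng(3)
M1 = rng.random((4, 4))
M2 = rng.random((4, 4)) + 4 * np.eye(4)   # keep M2 well-conditioned
C = np.linalg.solve(M2, M1)               # solves M2 @ C = M1 directly
assert np.allclose(M2 @ C, M1)
```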
Stochastic hill climbing can select for neuronal template replication
[Figure: the same scheme (maps M1, M2 and copying map C over genotypes 1 and 2), now with copying error]
Copying of Spatiotemporal Spike Patterns & Explicit Rules
Spatiotemporal spike patterns
ABA vs ABB
DD vs DS
Visual shift-invariance mechanisms applied to linguistics.
APPLICATIONS
•Evolution of Predictors (Feed-forward Models/Emulators/Bayesian Causal Networks).
•First derivative of predictability
•Evolution of Linguistic Construction
•Evolution of controllers for robot hand-manipulation
•Evolution of Productions in ACT-R/Copycat
•Evolution of representations and search for insight problem solving.
Operations to construct a BN
Larranaga et al., 1996. Structure Learning of Bayesian Networks by Genetic Algorithms.
Kemp & Tenenbaum, 2008. The Discovery of Structural Form.
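A sketch of the arc-level operators such a genetic algorithm applies when searching network structures, in the spirit of Larranaga et al. 1996: add, delete, or reverse one arc, rejecting offspring with directed cycles (details here are illustrative):

```python
import random

def is_acyclic(adj):
    """True if the graph can be fully peeled by removing parentless nodes."""
    n = len(adj)
    removed, changed = set(), True
    while changed:
        changed = False
        for i in range(n):
            if i in removed:
                continue
            if not any(adj[j][i] for j in range(n) if j not in removed):
                removed.add(i)
                changed = True
    return len(removed) == n

def mutate(adj):
    """Apply one random structural operator to a copy of the adjacency matrix."""
    n = len(adj)
    new = [row[:] for row in adj]
    i, j = random.sample(range(n), 2)
    op = random.choice(["add", "delete", "reverse"])
    if op == "add":
        new[i][j] = 1
    elif op == "delete":
        new[i][j] = 0
    else:
        new[i][j], new[j][i] = new[j][i], new[i][j]
    return new if is_acyclic(new) else adj    # reject cyclic offspring

dag = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]       # 0 -> 1 -> 2
print(mutate(dag))
```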
Luc Steels et al, Sony Labs
Istvan Zachar, Collegium Budapest (Institute for Advanced Study)
[Figure sequence: rule structures over nodes K(v), S(p), C(p), with rule sets building up from K–C to K–C–S]
Helge Ritter, Bielefeld, Germany
Thanks to
Richard Goldstein, Richard Watson, Dan Bush, Eugene Izhikevich, Phil Husbands, Luc Steels, K.K. Karishma, Anna Fedor, Zoltan Szatmary, Szabolcs Szamado, Istvan Zachar, Anil Seth