How the Brain Learns: Rules and Outcomes

Psychology 209, January 28, 2013

The brain is highly plastic and changes in response to experience

• Alteration of experience leads to alterations of neural representations in the brain.

• What neurons represent, and how precisely they represent it, are strongly affected by experience.

• We allocate more of our brain to things we have the most experience with.

The Sensory Homunculus

Monkey Somatosensory Cortex

Merzenich’s Joined Finger Experiment

Receptive fields after fingers were sewn together

Control receptive fields

Merzenich’s Rotating Disk Experiment

Merzenich’s Rotating Disk Experiment: Redistribution and Shrinkage of Fields

Merzenich’s Rotating Disk Experiment: Expansion of Sensory Representation

Temporal Sharpening

Synaptic Transmission and Learning

• Learning may occur by changing the strengths of connections.

• Addition and deletion of synapses, as well as larger changes in dendritic and axonal arbors, also occur in response to experience.

• Recent evidence suggests that neurons may be added or deleted in some cases as well. (This occurs in a specialized subregion of the hippocampus, and perhaps elsewhere as well, e.g., in neocortex.)

Pre Post

Hebb’s Postulate

“When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A’s efficiency, as one of the cells firing B, is increased.”

D. O. Hebb, Organization of Behavior, 1949

In other words: “Cells that fire together wire together.”

Unknown

Mathematically, this is often taken as: $\Delta w_{ba} = \epsilon\, a_b a_a$

(Generally you have to subtract something to achieve stability)

[Figure: units a1, a2, and b]
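The remark that something generally has to be subtracted for stability can be illustrated with a minimal sketch. It assumes a linear receiving unit b and a fixed input pattern. The function names, the learning rate of 0.1, and the starting weights are illustrative choices, not values from the slides.

```python
def hebb_step(weights, inputs, eps=0.1):
    """Apply one plain Hebbian update, dw_ba = eps * a_b * a_a, to unit b's weights."""
    # Assumption: b's activation is a linear sum of its weighted inputs.
    a_b = sum(w * a for w, a in zip(weights, inputs))
    return [w + eps * a_b * a for w, a in zip(weights, inputs)]


def run_plain_hebb(steps, weights=(0.1, 0.1), inputs=(1.0, 1.0)):
    """Present the same input pattern repeatedly and return the final weights."""
    w = list(weights)
    for _ in range(steps):
        w = hebb_step(w, inputs)
    return w


w_after_10 = run_plain_hebb(10)
w_after_50 = run_plain_hebb(50)
```

With nothing subtracted, each presentation multiplies both weights by the same factor greater than 1, so they grow without bound. That runaway growth is what a decay or normalization term is meant to prevent.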

The Molecular Basis of Hebbian Learning (Short Course!)

Glutamate ejected from the pre-synaptic terminal activates AMPA receptors, exciting the post-synaptic neuron.

Glutamate also binds to the NMDA receptor, but the receptor only opens when the level of depolarization on the post-synaptic side exceeds a threshold.

When the NMDA receptor opens, Ca++ flows in, triggering a biochemical cascade that results in an increase in AMPA receptors.

The increase in AMPA receptors means that the same amount of transmitter release at a later time will cause a stronger post-synaptic effect (LTP).

Linsker’s papers (PNAS, 1986a,b,c)

• Showed how a simple Hebbian learning rule can give rise to:

– Center-surround receptive fields at initial stages of visual processing

– Oriented edge detectors at later stages

• Extended the idea to consider why topographic maps may form.

Emergence of Center-Surround Organization

• Assume a first layer (A) of independent, randomly spiking neurons projects to a second layer (B), whose neurons receive inputs according to a Gaussian function of distance from the corresponding point in the layer below.

• Then neighboring neurons in layer B will have a tendency to have slightly correlated activations, due to the fact that they tend to receive input from some of the same neurons.

• Now feed activation forward to layer C from layer B. Activation follows a linear activation rule, and learning occurs according to the Hebbian rule shown below.

• Additional assumptions: Weights ($w_{rs}$) are bounded in [-1,+1]. Alternatively, there could be two types of weights, some in [0,1] and others in [-1,0]. Either way, when a weight reaches its min or max value, it cannot change further in that direction.

• $d$ represents a decay term tending to cause an overall negative drift.

• The result of running this for a long time with small $\epsilon$ is center-surround organization as shown (small dots = excitatory connections, large dots = inhibitory connections).

• This comes from the fact that inputs near the center of the receptive field tend to be more correlated with the other inputs to the cell than the ones near the edges.

$\Delta w_{rs} = \epsilon (a_r - \Theta_r)(a_s - \Theta_s) - d$

$a_r = \sum_s w_{rs} a_s + b_r$
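A single update step of this rule, including the [-1,+1] weight bounds, might be sketched as follows. The function name `linsker_update` is hypothetical, and using one shared threshold `theta_s` for all inputs is a simplifying assumption; the slides do not say how the thresholds are set.

```python
def linsker_update(weights, inputs, eps, theta_r, theta_s, d, b_r=0.0):
    """One step of dw_rs = eps * (a_r - theta_r) * (a_s - theta_s) - d, clipped to [-1, +1]."""
    # Linear activation rule: a_r = sum_s w_rs * a_s + b_r
    a_r = sum(w * a for w, a in zip(weights, inputs)) + b_r
    updated = []
    for w, a_s in zip(weights, inputs):
        dw = eps * (a_r - theta_r) * (a_s - theta_s) - d
        # A weight at its bound cannot move further in that direction.
        updated.append(max(-1.0, min(1.0, w + dw)))
    return updated


# Illustrative values only: both weights start near a bound and get clipped.
w_next = linsker_update([0.99, -0.99], [1.0, 0.0],
                        eps=1.0, theta_r=0.0, theta_s=0.5, d=0.0)
```

Here the first weight would move to 1.485 and the second to -1.485, so both end up pinned at the bounds.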

How Hebbian Learning Plus Weight Decay Strengthens Correlated Inputs and Weakens Isolated Inputs to a Receiving Neuron

[Figure: input units 1, 2, 3 projecting to receiving unit r]

Activation rule: $a_r = \sum_s a_s w_{rs}$

Learning rule: $\Delta w_{rs} = \epsilon a_r a_s - d$, with $\epsilon = 1.0$ and $d = .075$

Initial weights all = .1

Input patterns: units 1 & 2 active; unit 3 active alone

Final weights: .15 .15 .05

This works because inputs correlated with other inputs are associated with stronger activation of the receiving unit (2x here) than inputs that occur on their own.
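The final weights in this example can be reproduced with a short sketch. It assumes a batch update: both patterns are evaluated with the initial weights, the decay is charged once per pattern presentation, and the changes are then summed. That update schedule is an assumption that happens to match the slide's numbers, and the name `batch_hebb` is illustrative.

```python
EPS = 1.0      # learning rate from the slide
DECAY = 0.075  # decay term d from the slide


def batch_hebb(weights, patterns, eps=EPS, d=DECAY):
    """Sum dw_rs = eps * a_r * a_s - d over all patterns, then apply once."""
    deltas = [0.0] * len(weights)
    for pattern in patterns:
        # Activation rule: a_r = sum_s a_s * w_rs (computed with the initial weights)
        a_r = sum(w * a for w, a in zip(weights, pattern))
        for s, a_s in enumerate(pattern):
            deltas[s] += eps * a_r * a_s - d
    return [w + dw for w, dw in zip(weights, deltas)]


patterns = [
    [1, 1, 0],  # units 1 & 2 active together
    [0, 0, 1],  # unit 3 active alone
]
final_weights = batch_hebb([0.1, 0.1, 0.1], patterns)
print([round(w, 3) for w in final_weights])  # [0.15, 0.15, 0.05]
```

When units 1 and 2 fire together, the receiving unit's activation is .2 rather than .1, so their Hebbian increment (.125) outweighs the decay incurred on the other pattern (.075). Unit 3's increment when it fires alone (.025) does not.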

Linsker’s rule maximizes a ‘correlatedness of inputs’ measure

What Linsker calls ‘Hebb Optimal’ weights are those maximizing the sum of the pair-wise correlations among a unit’s inputs, subject to constraints on the sum of the weights:

$\sum_i \sum_j Q_{ij} w_{ri} w_{rj}$

Where

$Q_{ij} = \dfrac{\sum_t (a_i(t) - \bar{a}_i)(a_j(t) - \bar{a}_j)}{\sqrt{\sum_t (a_i(t) - \bar{a}_i)^2 \, \sum_t (a_j(t) - \bar{a}_j)^2}}$

When $\Theta_r = \bar{a}_r$ and $\Theta_s = \bar{a}_s$, the learning rule shown previously tends to maximize the sum of the $Q_{ij}$’s subject to a constraint against large weights (enforced by having a non-zero value for d).
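The ‘correlatedness of inputs’ measure can be made concrete with a small sketch. It assumes Q_ij is the ordinary Pearson correlation of two input units' activations over time. The function names and the toy activity traces are hypothetical.

```python
import math


def correlation(x, y):
    """Pearson correlation between two activation time series of equal length."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mean_x) ** 2 for a in x) *
                    sum((b - mean_y) ** 2 for b in y))
    return num / den


def correlatedness(weights, activity):
    """Compute sum_i sum_j Q_ij * w_i * w_j for one receiving unit."""
    n = len(weights)
    return sum(correlation(activity[i], activity[j]) * weights[i] * weights[j]
               for i in range(n) for j in range(n))


# Toy traces: inputs 0 and 1 always fire together, input 2 fires at other times.
activity = [[1, 0, 1, 0], [1, 0, 1, 0], [0, 1, 0, 1]]
favor_pair = correlatedness([0.15, 0.15, 0.05], activity)
favor_loner = correlatedness([0.05, 0.15, 0.15], activity)
```

Both weight vectors have the same sum, but the one that puts its weight on the correlated pair scores higher (0.0625 versus 0.0025 in this toy case).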

When applied to inputs with ‘Mexican hat’ receptive fields, Linsker’s rule tends to produce edge detectors!

• Two representative simulated receptive fields are shown for units at a layer receiving inputs from a layer below consisting of center-surround units.

• The Mexican hat receptive field shape implies a Mexican hat Q function, and this favors each point being like its near neighbors and unlike its midrange neighbors.

• “Each point cannot be the center of a ‘like island’ in an ‘unlike sea’”.

• Whether you get blobs or edge detectors depends on many factors…

Miller, Keller, and Stryker (Science, 1989) model ocular dominance column development using Hebbian learning

Architecture:

• L and R LGN layers and a cortical layer containing 25x25 simple neuron-like units.

• Each neuron in each LGN layer has an initial weak projection that forms a Gaussian hump (illustrated with a disc) at the corresponding location in the cortex, but with some noise around it.

• In the cortex, there are short-range excitatory connections and longer-range inhibitory connections, with the net effect as shown in B (scaled version shown next to cortex to indicate approximate scale).

Simulation of Ocular Dominance Column Development based on Hebbian Learning

Experience and Training:

• Before ‘birth’, random activity occurs in each retina. Due to overlapping projections to LGN, neighboring LGN neurons in the same eye tend to be correlated. No (or less) between-eye correlation is assumed.

• Learning of weights to cortex from LGN occurs through a Hebbian learning rule: $\Delta w_{cl} = \epsilon a_c a_l - \text{decay}$

(Note that w’s are not allowed to go below 0.)

• Results indicate that ocular dominance columns slowly develop over time. Each panel shows the cortical sheet at a different point in time during development. (Strength of R vs L eye input to a given cortical neuron is indicated with light to dark gray as shown.)

• Some studies indicate that ocular dominance columns are present before there is neural activity, but even in that case mechanisms like those considered here may play a role in maintaining and refining the columnar organization.

Competitive Learning and the Self-Organizing Map (Hbk, Ch. 6)

• Competitive learning:

– Units organized into mutually inhibitory clusters

– Unit with largest net input is chosen as ‘winner’

– Winner’s weights are updated to align with the input.

– The sum (or sum of squares) of the weights coming into each unit is held constant.

• Self-organizing map:

– Very similar but units are arranged in a line, sheet or higher-d space, and:

– Weights are updated for the winner and for other units near the winner

– Extent of update falls off with distance.

– Units then spread out over the input data, and are assigned in proportion to density of data (as in Merzenich experiments).

– In simulation result below, units are arranged along a one-d line. Inputs are points in the 2-d space as shown.
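The self-organizing map steps listed above can be sketched for units on a one-d line with 2-d inputs. This is a minimal illustration under stated assumptions, not the Handbook's implementation. The winner is picked by smallest Euclidean distance, a common stand-in for "largest net input" when weights are normalized. The Gaussian neighborhood, the parameter values, and the name `train_som` are all illustrative choices.

```python
import math
import random


def train_som(data, n_units=10, epochs=20, lr=0.2, width=2.0, seed=0):
    """Train a 1-d chain of units on 2-d points; return the units' weight vectors."""
    rng = random.Random(seed)
    units = [[rng.random(), rng.random()] for _ in range(n_units)]
    for _ in range(epochs):
        for x in data:
            # Winner: the unit whose weight vector is closest to the input.
            winner = min(
                range(n_units),
                key=lambda k: (units[k][0] - x[0]) ** 2 + (units[k][1] - x[1]) ** 2,
            )
            for k in range(n_units):
                # Update strength falls off with distance from the winner along the line.
                h = math.exp(-((k - winner) ** 2) / (2 * width ** 2))
                units[k][0] += lr * h * (x[0] - units[k][0])
                units[k][1] += lr * h * (x[1] - units[k][1])
    return units


# Hypothetical data: 200 points drawn uniformly from the unit square.
data_rng = random.Random(1)
data = [(data_rng.random(), data_rng.random()) for _ in range(200)]
units = train_som(data)
```

Because each update moves a unit part of the way toward an input, the units stay inside the region covered by the data. Neighboring units on the line are pulled toward similar inputs, which is what produces the topographic ordering.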

Plusses and Minuses of Hebbian Learning

• Hebbian learning tends to reinforce whatever response occurs to a particular input.

– If a neuron becomes activated, the inputs that activate it are strengthened, so the neuron will be even more likely to be activated next time.

• This may contribute to failures of learning and even to stamping in of bad habits when the response we make to an input is not the best one to make.

• Possible examples (see articles by me in today’s readings):

- Failure of adults to learn new speech sounds

- Phobias, racism

- Entrenchment of “Habits of Mind”

Learning Depends on More than Mere Stimulation

• Exposure to task-irrelevant stimuli does not lead to change in studies from the Merzenich lab and other labs.

• Outcome feedback can lead to enhanced learning. E.g., feedback helps Japanese adults improve their discrimination of /r/ vs /l/ sounds.

How might outcomes shape learning?

• Learning can be influenced by neuro-modulators such as ACH or dopamine.

– One Merzenich experiment showed that an entire cortical area can come to respond to a particular tone played repeatedly, if a sub-cortical nucleus that releases the modulator ACH is continually stimulated while the tone is being presented. ACH may enhance connection weight changes, and level of ACH release may depend on the animal’s attentional state.

• Reinforcement learning:

– Release of the modulator dopamine may be triggered by occurrence of rewards, and dopamine too may modulate the size of connection weight changes.

$\Delta w_{rs} = \epsilon R a_r a_s - d$, where $R$ = reinforcement

– This provides a powerful way of shaping Hebb-like learning to strengthen activations that lead to rewards, and not other activations.

– Maybe the R signal corresponds to level of dopamine (above or below a baseline level).

– R may correspond to the extent to which the reward is greater or less than expected (we will return to this idea later).

• Error-correcting learning:

– Learning rules driven by the difference between what a network produces and a ‘teaching signal’ are the main engines of learning in neural networks; we will turn to these next time.
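The reinforcement-modulated Hebbian rule above can be sketched in a few lines. The sketch assumes a linear receiving unit and treats R as a signed number, positive when the outcome is better than baseline and negative when worse. The function name, learning rate, and example values are illustrative and not taken from the slides.

```python
def reward_hebb_step(weights, inputs, reward, eps=0.5, d=0.0):
    """One step of dw_rs = eps * R * a_r * a_s - d for a single receiving unit r."""
    # Assumption: linear activation, a_r = sum_s w_rs * a_s
    a_r = sum(w * a for w, a in zip(weights, inputs))
    return [w + eps * reward * a_r * a_s - d for w, a_s in zip(weights, inputs)]


w = [0.2, 0.2]
# Input 1 drives the unit and the outcome is rewarded (R = +1).
w = reward_hebb_step(w, [1, 0], reward=+1.0)
# Input 2 drives the unit and the outcome is worse than baseline (R = -1).
w = reward_hebb_step(w, [0, 1], reward=-1.0)
```

After these two steps the weight from the rewarded input has grown to 0.3 and the other has shrunk to 0.1. Plain Hebbian learning would have strengthened both.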
