
Neuromorphic Computing and Learning: A Stochastic Signal Processing Perspective

Osvaldo Simeone, joint work with Hyeryung Jang (KCL), Bipin Rajendran (NJIT), Brian Gardner and André Grüning (Univ. of Surrey)

King's College London

December 11, 2018

Overview

Motivation

Models

Algorithms

Examples


Machine Learning Today

Breakthroughs in ML have come at the expense of massive memory, energy, and time requirements, making many state-of-the-art solutions not suitable for mobile or embedded devices.

[Rajendran '18]

Machine Learning at the Edge

A solution is mobile edge or cloud computing: offload computations to an edge or cloud server.

Possible privacy and latency issues.

Machine Learning on Mobile Devices

Scaling down energy and memory requirements for implementation on mobile devices requires exploring trade-offs between accuracy and complexity.

Active field: many new chips released by established players and start-ups to implement artificial neural networks...

Human vs Machine

Beyond artificial neural networks...

Machine (supercomputer): 13 million Watts, 5600 sq. ft. & 340 tons, ~10^10 ops/J
Human (brain): ~20 Watts, 2 sq. ft. & 1.4 kg, ~10^15 ops/J

Source: https://www.olcf.ornl.gov, Google Images

Neuromorphic Computing

Neurons in the brain process and communicate over time using sparse binary signals (spikes or action potentials). This results in a dynamic, sparse, and event-driven operation.

[Gerstner]

Spiking Neural Networks

Spiking Neural Networks (SNNs) are networks of spiking neurons.

Topic at the intersection of computational neuroscience (focused on biological plausibility) and machine learning (focused on accuracy and efficiency).

[Figure: an Artificial Neural Network (ANN) neuron combines static inputs x_1, ..., x_n through weights w_1, ..., w_n into an output y; a Spiking Neural Network (SNN) neuron combines spike trains x_1(t), ..., x_n(t) through weights w_1, ..., w_n into an output spike train y(t) evolving over time.]

Spiking Neural Networks

Proof-of-concept hardware implementations of SNNs have demonstrated significant energy savings as compared to ANNs, generating significant (and perhaps premature) press coverage and positive market predictions.


I/O Interfaces

[Figure: a neuromorphic sensor (e.g., silicon cochlea, retina) feeds spikes into the SNN, which drives a neuromorphic actuator.]

I/O Interfaces

[Figure: a conventional source (the digit "5" in the example) is converted into spikes by an encoder, processed by the SNN, and converted back by a decoder that drives an actuator.]

Rate encoding, time encoding, population encoding, ...

Rate decoding, first-to-spike decoding, ...
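To make the rate-based interface concrete, here is a minimal Python sketch of a rate encoder and a spike-count decoder; the Bernoulli spiking model, the rate_encode/rate_decode names, and the parameter choices are illustrative assumptions rather than the scheme used in the deck.

```python
import numpy as np

def rate_encode(x, T, max_rate=0.5, rng=np.random.default_rng(0)):
    """Rate-encode features x in [0, 1] into binary spike trains of length T:
    each input spikes i.i.d. Bernoulli(max_rate * x[i]) at every time step."""
    probs = np.clip(x, 0.0, 1.0) * max_rate
    return (rng.random((T, x.size)) < probs).astype(np.int8)   # shape (T, n_inputs)

def rate_decode(output_spikes):
    """Rate-decode: pick the output neuron with the largest spike count."""
    return int(np.argmax(output_spikes.sum(axis=0)))

# Toy usage: encode a 4-pixel input, then decode a dummy 3-neuron output train.
spikes_in = rate_encode(np.array([0.9, 0.1, 0.5, 0.0]), T=20)
spikes_out = np.array([[1, 0, 0], [0, 0, 1], [1, 0, 0], [1, 0, 1]])
print(spikes_in.shape, rate_decode(spikes_out))   # (20, 4) 0
```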

Internal Operation

An SNN is a network of spiking neurons.

Unlike ANNs, in SNNs neurons operate sequentially over time by processing and communicating via spikes.

Discrete-time vs continuous-time: most hardware implementations follow the former (e.g., Intel's Loihi).

[Figure: ANN neuron with static inputs and output vs. SNN neuron with spike-train inputs x_1(t), ..., x_n(t) and spike-train output y(t), as above.]

Internal Operation

Internal operation defined by:
- topology (connectivity)
- spiking mechanism

[Figure: ANN vs. SNN neuron, as above.]

Topology

Two types of connections between neurons:
- Synaptic links
  - Causal dependency of a post-synaptic neuron on a pre-synaptic neuron
  - Possibly recurrent: long-term memory
- Lateral dependencies
  - Instantaneous correlation between the spiking of two neurons
  - Excitatory or inhibitory

Topology

An important example is a multi-layer SNN with lateral connections within each layer.

Focus on this topology in the following, although many considerations generalize.

Spiking Mechanism

Each neuron is characterized by an internal state known as the membrane potential $u_{i,t}^{(l)}$ [Gerstner and Kistler '02].

Generally, a higher membrane potential entails a larger probability of spiking.

It evolves over time as a function of the past behavior of the pre-synaptic neurons and of the neuron itself.

Membrane Potential

$$u_{i,t}^{(l)} = \sum_{j \in \mathcal{V}^{(l-1)}} w_{j,i}^{(l)} \big(a_t \ast s_{j,t}^{(l-1)}\big) + w_i^{(l)} \big(b_t \ast s_{i,t}^{(l)}\big) + \gamma_i^{(l)}$$

- Feedforward filter (kernel) $a_t$ with learnable synaptic weight $w_{j,i}^{(l)}$
- Feedback filter (kernel) $b_t$ with learnable weight $w_i^{(l)}$ (e.g., refractory period)
- Bias (threshold) $\gamma_i^{(l)}$
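As a concrete reading of this expression, the sketch below computes the membrane potentials of one layer in discrete time from the spike histories; the exponentially decaying kernels, array shapes, and function names are illustrative assumptions, not the deck's choices.

```python
import numpy as np

def causal_filter(spikes, kernel):
    """Causal convolution (kernel * spikes) per neuron: output at t uses spikes up to t-1.
    spikes: (T, n) binary array; kernel: (K,) filter taps; returns (T, n)."""
    T, n = spikes.shape
    out = np.zeros((T, n))
    for t in range(T):
        for k, h in enumerate(kernel, start=1):   # tap k looks back k time steps
            if t - k >= 0:
                out[t] += h * spikes[t - k]
    return out

def membrane_potential(s_prev, s_self, W, w_fb, gamma, a, b):
    """u_{i,t} = sum_j W[j,i] (a * s_prev_j)_t + w_fb[i] (b * s_self_i)_t + gamma[i]."""
    return causal_filter(s_prev, a) @ W + w_fb * causal_filter(s_self, b) + gamma

# Toy example: 3 pre-synaptic neurons, 2 neurons in the layer, T = 5 time steps.
rng = np.random.default_rng(0)
T, n_pre, n_post = 5, 3, 2
a = 0.8 ** np.arange(1, 6)          # assumed exponentially decaying feedforward kernel
b = -1.0 * 0.5 ** np.arange(1, 6)   # assumed inhibitory feedback kernel (refractoriness)
W = rng.normal(size=(n_pre, n_post))
u = membrane_potential(rng.integers(0, 2, (T, n_pre)),
                       rng.integers(0, 2, (T, n_post)),
                       W, w_fb=np.ones(n_post), gamma=np.zeros(n_post), a=a, b=b)
print(u.shape)  # (T, n_post)
```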

Membrane Potential

Kernels can more generally be parameterized via multiple basis functions and learnable weights [Pillow et al '08].

This allows learning of temporal processing, e.g., by adapting synaptic delays.
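For instance, a kernel can be written as a weighted sum of fixed basis functions with learnable weights; the raised-cosine basis below follows the general idea in [Pillow et al '08], but its exact parameterization here is an illustrative assumption.

```python
import numpy as np

def raised_cosine_basis(K, T):
    """K raised-cosine bumps spread over a window of T time steps (one basis per row)."""
    t = np.arange(T)
    centers = np.linspace(0, T - 1, K)
    width = (T - 1) / max(K - 1, 1)
    phase = np.clip((t[None, :] - centers[:, None]) * np.pi / width, -np.pi, np.pi)
    return 0.5 * (1 + np.cos(phase))            # shape (K, T)

# A learnable kernel is then a_t = sum_k theta_k * basis_k(t).
theta = np.array([0.2, 1.0, -0.3])              # per-synapse learnable weights (assumed)
a = theta @ raised_cosine_basis(K=3, T=10)      # kernel of length 10
print(a.round(2))
```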

Deterministic Models

The most common model is leaky integrate-and-fire [Gerstner and Kistler '02]:
- Spike when the membrane potential is positive
- Non-differentiable with respect to model parameters
- Heuristic training algorithms based on ideas such as the surrogate gradient [Neftci '18] [Anwani and Rajendran '18]

Probabilistic models are more flexible and yield principled, differentiable learning rules [Koller and Friedman '09].
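For reference, a minimal discrete-time leaky integrate-and-fire step might look as follows; the leak constant, the reset-to-zero rule, and the hard u > 0 test (exactly the non-differentiable operation mentioned above) are illustrative assumptions.

```python
import numpy as np

def lif_step(u, w, spikes_in, leak=0.9, reset=0.0):
    """One discrete-time leaky integrate-and-fire update for a single neuron.
    u: membrane potential, w: input weights, spikes_in: binary inputs at this step."""
    u = leak * u + w @ spikes_in          # leaky integration of weighted input spikes
    spike = 1 if u > 0 else 0             # hard threshold: not differentiable in w
    if spike:
        u = reset                         # reset after firing (assumed reset-to-zero)
    return u, spike

u, out = -1.0, []
w = np.array([0.6, 0.4, -0.2])
for x in np.array([[1, 0, 1], [1, 1, 0], [0, 0, 0]]):
    u, s = lif_step(u, w, x)
    out.append(s)
print(out)  # e.g., [0, 1, 0]
```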

Probabilistic Models

Basic probabilistic model: Generalized Linear Model (GLM)
- There are no lateral connections, and the conditional spiking probability is [Pillow et al '08]

$$p\big(s_{i,t}^{(l)} = 1 \,\big|\, s_{\le t-1}^{(l-1)}, s_{\le t-1}^{(l)}\big) = \sigma\big(u_{i,t}^{(l)}\big)$$
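Under this GLM, simulating a layer at time t reduces to drawing independent Bernoulli spikes from the sigmoid of the membrane potentials; a minimal sketch (assuming discrete time and given potentials) is:

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def glm_sample_layer(u, rng=np.random.default_rng(0)):
    """Given membrane potentials u (shape (n,)) at time t, draw spikes
    s_i ~ Bernoulli(sigmoid(u_i)) independently across neurons (no lateral terms)."""
    p = sigmoid(u)
    return (rng.random(u.shape) < p).astype(np.int8), p

spikes, p = glm_sample_layer(np.array([-2.0, 0.0, 3.0]))
print(p.round(3), spikes)   # low, 0.5, and high spiking probabilities
```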

Probabilistic Models

More general energy-based model (e.g., Boltzmann machines for time series [Osogami '17]):
- With lateral correlations defined by parameters $r_{i,j}^{(l)}$, the joint probability of spiking for layer $l$ is

$$p_{\theta^{(l)}}\big(s_t^{(l)} \,\big|\, s_{\le t-1}^{(l-1)}, s_{\le t-1}^{(l)}\big) \propto \exp\bigg\{ \sum_{i \in \mathcal{V}^{(l)}} u_{i,t}^{(l)} s_{i,t}^{(l)} + \sum_{i,j \in \mathcal{V}^{(l)}} r_{i,j}^{(l)} s_{i,t}^{(l)} s_{j,t}^{(l)} \bigg\}$$
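For a small layer, this joint distribution can be evaluated exactly by enumerating all 2^n spike patterns and normalizing; the brute-force sketch below only illustrates the role of the lateral parameters r_{i,j} (for larger layers one would use Gibbs sampling or variational approximations instead, not shown here).

```python
import itertools
import numpy as np

def layer_joint_pmf(u, R):
    """Exact joint pmf over binary spike vectors s for one layer at one time step:
    p(s) ∝ exp(sum_i u_i s_i + sum_{i,j} R[i,j] s_i s_j).  Feasible only for small n."""
    n = len(u)
    patterns = np.array(list(itertools.product([0, 1], repeat=n)))
    energies = patterns @ u + np.einsum('ki,ij,kj->k', patterns, R, patterns)
    p = np.exp(energies - energies.max())      # subtract max for numerical stability
    return patterns, p / p.sum()

u = np.array([0.5, -0.5, 0.0])                 # membrane potentials at time t
R = np.array([[0.0, 1.0, 0.0],                 # strong excitatory coupling between neurons 0 and 1
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 0.0]])
patterns, pmf = layer_joint_pmf(u, R)
print(patterns[np.argmax(pmf)], pmf.max().round(3))   # most likely joint spike pattern
```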


Training SNNs

Supervised learning:
- training set = {(input, output)} → generalization

Unsupervised learning:
- training set = {input or output} → compression, sample generation, clustering, ...

Reinforcement learning:
- active (iterative) training = state (input) ↦ action (output) ↦ reward and new input

Training SNNs

Training is carried out by following a learning rule.

A learning rule describes how the model parameters are updated on the basis of data in order to carry out a given task.

- Online or batch
- Local vs global information

Learning Rules

The general form of many learning rules for synaptic weights follows the three-factor format [Fremaux and Gerstner '16]:

$$\theta \leftarrow \theta + \eta \times \text{learning signal} \times \text{pre-syn} \times \text{post-syn}$$

Pre-synaptic and post-synaptic terms are local to each neuron.

The learning signal, aka neuromodulator, is global.

The product pre-syn × post-syn tends to be large when the two neurons spike at nearly the same time.

"Neurons that fire together, wire together" (Hebbian theory, STDP, BCM theory) [Hebb '49] [Bienenstock et al '82].
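In code, the three-factor format is simply a product of one global and two local terms; the sketch below keeps the factors abstract (the specific pre-/post-synaptic quantities and learning signal are instantiated in the following slides), and all names and shapes are illustrative.

```python
import numpy as np

def three_factor_update(theta, lr, learning_signal, pre_syn, post_syn):
    """theta <- theta + eta * (global learning signal) * (local pre-syn) x (local post-syn).
    pre_syn: (n_pre,) local pre-synaptic terms; post_syn: (n_post,) local post-synaptic terms."""
    return theta + lr * learning_signal * np.outer(pre_syn, post_syn)

# One update of a 3x2 weight matrix with a global (e.g., reward-like) signal.
theta = np.zeros((3, 2))
theta = three_factor_update(theta, lr=0.01, learning_signal=1.0,
                            pre_syn=np.array([0.8, 0.0, 0.3]),
                            post_syn=np.array([0.5, -0.2]))
print(theta.round(4))
```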

Deriving the Learning Rules

Three-factor rules can be derived as a form of stochastic gradient descent for the probabilistic model:

$$\theta \leftarrow \theta + \underbrace{\eta}_{\text{learning rate}} \times \underbrace{M}_{\text{learning signal}} \times \nabla \ln p(\text{output} \,|\, \text{input})$$

Deriving the Learning Rules

Gradient for the synaptic weights under the GLM (no lateral connections), obtained by summing over time t:

$$\nabla_{w_{j,i}^{(l)}} \log p_{\theta}\big(s_{i,t}^{(l)} \,\big|\, s_{\le t-1}^{(l-1)}, s_{\le t-1}^{(l)}\big) = \underbrace{\big(a_t \ast s_{j,t}^{(l-1)}\big)}_{\text{pre-synaptic trace}} \; \underbrace{\big(s_{i,t}^{(l)} - \sigma\big(u_{i,t}^{(l)}\big)\big)}_{\text{post-synaptic error}}$$

Post-synaptic error = desired/observed behavior − model average behavior [Bienenstock et al '82]
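Putting the two factors together, a minimal online SGD sketch for the fully observed GLM case (output spikes clamped to training data, learning signal M = 1) could look like this; the array shapes and learning rate are illustrative assumptions.

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def glm_sgd_step(W, pre_trace, s_target, u, lr=0.05, M=1.0):
    """One online update of the synaptic weights W (shape (n_pre, n_post)).
    pre_trace: filtered pre-synaptic spikes (a_t * s_j,t), shape (n_pre,)
    s_target:  observed/desired post-synaptic spikes at time t, shape (n_post,)
    u:         membrane potentials at time t, shape (n_post,)."""
    post_error = s_target - sigmoid(u)              # desired minus model-average behavior
    return W + lr * M * np.outer(pre_trace, post_error)

# Toy usage: 3 pre-synaptic neurons, 2 post-synaptic neurons, one time step.
W = np.zeros((3, 2))
W = glm_sgd_step(W, pre_trace=np.array([0.9, 0.0, 0.4]),
                 s_target=np.array([1, 0]), u=np.array([0.2, -1.0]))
print(W.round(3))
```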

Deriving the Learning Rules: Supervised Learning

$$\theta \leftarrow \theta + \eta \times \underbrace{M}_{=\,1\ \text{or from VI}} \times \nabla \ln p(\underbrace{\text{output}}_{\leftarrow\ \text{training}} \,|\, \underbrace{\text{input}}_{\leftarrow\ \text{training}})$$

Variational Inference (VI) needed if there are intermediate layers [Rezende et al '11] [Osogami '17] [Jang et al '18].

Deriving the Learning Rules: Unsupervised Learning

For generative (unsupervised) models:

$$\theta \leftarrow \theta + \eta \times \underbrace{M}_{\text{from VI}} \times \nabla \ln p(\underbrace{\text{output}}_{\leftarrow\ \text{training}} \,|\, \underbrace{\text{input}}_{\leftarrow\ \emptyset})$$

Unsupervised learning models always have hidden layers.

Deriving the Learning Rules: Reinforcement Learning

Using policy gradient to learn an SNN policy:

$$\theta \leftarrow \theta + \eta \times \underbrace{M}_{\text{reward/return \& VI}} \times \nabla \ln p(\underbrace{\text{output}}_{\leftarrow\ \text{action}} \,|\, \underbrace{\text{input}}_{\leftarrow\ \text{state}})$$

Variational Inference (VI) needed if there are intermediate layers [Rezende et al '11] [Osogami '17] [Jang et al '18].
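In the reinforcement learning case, the same per-step gradient terms are accumulated over an episode and scaled by the return, which acts as the global learning signal; the REINFORCE-style sketch below (single episode, no baseline, GLM output layer, illustrative shapes) is only meant to convey the structure of the update.

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def policy_gradient_update(W, pre_traces, actions, potentials, ret, lr=0.01):
    """REINFORCE-style update: sum per-step GLM gradients over the episode,
    then scale by the episode return (the global learning signal M).
    pre_traces: (T, n_pre), actions: (T, n_out) binary spikes, potentials: (T, n_out)."""
    grad = np.zeros_like(W)
    for x, a, u in zip(pre_traces, actions, potentials):
        grad += np.outer(x, a - sigmoid(u))      # pre-synaptic trace * post-synaptic error
    return W + lr * ret * grad

W = np.zeros((3, 2))
T = 4
rng = np.random.default_rng(1)
W = policy_gradient_update(W, rng.random((T, 3)), rng.integers(0, 2, (T, 2)),
                           rng.normal(size=(T, 2)), ret=1.5)
print(W.round(3))
```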


Supervised Learning

[Figure: a source digit "5" is passed through an encoder, the SNN, and a decoder; the decoded outputs show the digit "5" and a rotated version.]

[Results figure from [Jang et al '18-2].]

Unsupervised Learning

[Figure: a generative SNN model with a decoder; training uses a variational SNN encoder.]

Reinforcement Learning

[Figure: the state is encoded into spikes, processed by the SNN, and decoded into an action (e.g., "up") for the actuator.]

[Results figure from [Rosenfeld et al '18].]

Concluding Remarks

Statistical signal processing review of neuromorphic computing via Spiking Neural Networks.

Additional topics:
- recurrent SNNs for long-term memory [Maass '11]
- neural sampling: information encoded in steady-state behavior [Buesing et al '11]
- Bayesian learning via Langevin dynamics [Pecevski et al '11] [Kappel et al '15]

Some open problems:
- meta-learning, life-long learning, transfer learning [Bellec et al '18]
- learning I/O interfaces [Lazar and Toth '03]

Acknowledgements

This work has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement No. 725731) and from the US National Science Foundation (NSF) under grant ECCS 1710009.

References

[Gerstner and Kistler '02] W. Gerstner and W. M. Kistler, Spiking Neuron Models: Single Neurons, Populations, Plasticity. Cambridge University Press, 2002.

[Pillow et al '08] J. W. Pillow, J. Shlens, L. Paninski, A. Sher, A. M. Litke, E. Chichilnisky, and E. P. Simoncelli, "Spatio-temporal correlations and visual signalling in a complete neuronal population," Nature, vol. 454, no. 7207, p. 995, 2008.

[Osogami '17] T. Osogami, "Boltzmann machines for time-series," arXiv preprint arXiv:1708.06004, 2017.

[Ibnkahla '00] M. Ibnkahla, "Applications of neural networks to digital communications - a survey," Signal Processing, vol. 80, no. 7, pp. 1185-1215, 2000.

[Koller and Friedman '09] D. Koller and N. Friedman, Probabilistic Graphical Models: Principles and Techniques. MIT Press, 2009.

[Fremaux and Gerstner '16] N. Fremaux and W. Gerstner, "Neuromodulated spike-timing-dependent plasticity, and theory of three-factor learning rules," Frontiers in Neural Circuits, vol. 9, p. 85, 2016.

[Jang et al '18] H. Jang, O. Simeone, B. Gardner, and A. Gruning, "Spiking neural networks: A stochastic signal processing perspective," ...

[Rezende et al '11] D. J. Rezende, D. Wierstra, and W. Gerstner, "Variational learning for recurrent spiking networks," in Advances in Neural Information Processing Systems, 2011, pp. 136-144.

[Brea et al '13] J. Brea, W. Senn, and J.-P. Pfister, "Matching recall and storage in sequence learning with spiking neural networks," Journal of Neuroscience, vol. 33, no. 23, pp. 9565-9575, 2013.

[Hebb '49] D. Hebb, The Organization of Behavior. New York: Wiley and Sons, 1949.

[Bienenstock et al '82] E. L. Bienenstock, L. N. Cooper, and P. W. Munro, "Theory for the development of neuron selectivity: orientation specificity and binocular interaction in visual cortex," Journal of Neuroscience, vol. 2, no. 1, pp. 32-48, 1982.

[Pecevski et al '11] D. Pecevski, L. Buesing, and W. Maass, "Probabilistic inference in general graphical models through sampling in stochastic networks of spiking neurons," PLOS Computational Biology, vol. 7, no. 12, pp. 1-25, 2011.

[Rosenfeld et al '18] B. Rosenfeld, O. Simeone, and B. Rajendran, "Learning first-to-spike policies for neuromorphic control using policy gradients," arXiv preprint arXiv:1810.09977, 2018.

[Jang et al '18-2] H. Jang and O. Simeone, "Training dynamic exponential family models with causal and lateral dependencies for generalized neuromorphic computing," arXiv preprint arXiv:1810.08940, 2018.

[Bellec et al '18] G. Bellec, D. Salaj, A. Subramoney, R. Legenstein, and W. Maass, "Long short-term memory and learning-to-learn in networks of spiking neurons," arXiv preprint arXiv:1803.09574, 2018.

[Kappel et al '15] D. Kappel, S. Habenschuss, R. Legenstein, and W. Maass, "Synaptic sampling: a Bayesian approach to neural network plasticity and rewiring," in Advances in Neural Information Processing Systems, 2015, pp. 370-378.

[Buesing et al '11] L. Buesing, J. Bill, B. Nessler, and W. Maass, "Neural dynamics as sampling: a model for stochastic computation in recurrent networks of spiking neurons," PLoS Computational Biology, vol. 7, no. 11, e1002211, 2011.

[Lazar and Toth '03] A. A. Lazar and L. T. Toth, "Time encoding and perfect recovery of bandlimited signals," in Proc. ICASSP, 2003.

[Maass '11] W. Maass, "Liquid state machines: motivation, theory, and applications," in Computability in Context: Computation and Logic in the Real World, 2011, pp. 275-296.

[Neftci '18] E. O. Neftci, "Data and power efficient intelligence with neuromorphic learning machines," iScience, vol. 5, p. 52, 2018.

[Anwani and Rajendran '18] N. Anwani and B. Rajendran, "Training multilayer spiking neural networks using NormAD based spatio-temporal error backpropagation," arXiv preprint arXiv:1811.10678, 2018.