Overview
• Sensory systems from a mile (kilometer) up
• Different approaches to modeling neural encoding
• Lab: nuts & bolts of basic encoding models
A specialized sensory periphery …
[Diagram: transduction of EM, mechanical, and chemical signals into electrical signals (spikes)]
… feeds (more or less) standard central circuits
Sensory input guides behavior
Sensory processing is lossy information flow
Sensory inputs (many* bits)
Motor outputs (a few bits)
Brain
* In theory, the auditory nerve can encode 10^360 different 1-second sounds. The universe is about 10^18 seconds old.
All sensory processing is “active”
[Diagram: sensory inputs → perception / decision-making → motor outputs, with reward assessment / learning and “context” feeding back within the brain]
Context spans a large spatio-temporal continuum
[Figure: sensory inputs and motor outputs; context dimensions (evoked activity, network state, neuromodulators, synaptic weights, genes) span temporal scales from ~10^-2 s to ~10^8 s (10^13?) and spatial scales from ~10^-7 m to ~10^-1 m]
Context spans a large spatio-temporal continuum
[Figure: spatio-temporal context schematic annotated with phenomena at each scale]
1. Context, predictive coding
2. Attention
3. Synaptic plasticity
4. Development
5. Selection / evolution
Today: focus on encoding at the perceptual scale
[Figure: spatio-temporal context schematic]
Ken Harris
Christian Machens
Srdjan Ostojic
Today: focus on the auditory system
Visual system: Retina → LGN → V1
Auditory system: Cochlear nucleus → Superior olive → IC → MGB → A1
(Kandel, Schwartz & Jessell)
Encoding models applied to other systems
• Visual system – Jones & Palmer 1987 – Ringach, Hawken, Shapley 1997 – David, Vinje, Gallant 2004 – Pillow et al. 2008 – Yamins, DiCarlo 2016 – … and many, many more.
• Somatosensory system – DiCarlo, Johnson, Hsiao 1998
• Olfactory system – Nagel, Hong, Wilson 2015
Tools for characterizing selectivity
[Diagram: stimulus s(x,t) → filter h(s) → neural response r(t)]
• Tuning curves (best frequency, modulation rate, etc.)
• Filter-based encoding models (e.g., the STRF)
• Selectivity indices (no tuning space required; e.g., information theory)
• Optimal stimuli
How does neural activity encode stimuli?
• The fact that information exists in a neural response does not mean that it is used by the brain.
• Two perspectives:
– Classic: What information about a stimulus can be recovered from the neural response?
– More nuanced: How is a stimulus represented across the entire neural population? What impact does a neuron’s activity have on downstream neurons and behavior?
• Implicit vs. explicit coding
– Implicit: information exists in the periphery but may not be readily accessible.
– Explicit: information is available to decoding scheme “X”, typically a rate code. “Untangle” the representation (DiCarlo & Cox 2007)
Neural encoding models
• Tuning curve – Spike counting
• Spectro-temporal receptive field (STRF) – Spike-triggered averaging (linear regression)
• Arbitrary (nonlinear) stimulus-response mapping – Gradient descent (machine learning)
• Two themes:
– How is each method implemented? – How is each method useful?
[Trade-off axes: generalizability, complexity, interpretability, data requirements]
Neural encoding models
• Tuning curve – Spike counting
• Spectro-temporal receptive field (STRF) – Spike-triggered averaging (linear regression)
• Arbitrary (nonlinear) stimulus-response mapping – Gradient descent (machine learning)
• Two themes:
– How is each method implemented? – How is each method useful?
[Trade-off axes: generalizability, complexity, interpretability, data requirements]
Mammalian auditory system
Cochlea (frequency decomposition):
(http://jan.ucc.nau.edu; Kandel, Schwartz & Jessell)
Hair cells (transduction to spikes):
Ear (collector):
Auditory inputs to brain
Spectrogram: (phase-locking largely absent)
(Yang, Wang & Shamma 1992)
Place code
Frequency tuning curve
Tuning curve: mean spike rate for each parametric manipulation of a stimulus. Spikes recorded from A1 of an awake ferret during presentation of bandpass noise centered at 20 logarithmically-spaced frequencies.
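The spike-counting recipe is simple enough to sketch directly. A minimal example with invented data (the function name and toy counts are mine, not from the lecture):

```python
import numpy as np

def tuning_curve(spike_counts, stim_freqs):
    """Mean spike count for each stimulus frequency.

    spike_counts: one spike count per trial.
    stim_freqs:   stimulus frequency presented on each trial.
    Returns (unique_freqs, mean_count_per_freq).
    """
    freqs = np.unique(stim_freqs)
    rates = np.array([spike_counts[stim_freqs == f].mean() for f in freqs])
    return freqs, rates

# Hypothetical unit tuned near 2 kHz, two trials per frequency
stim_freqs = np.array([1.0, 1.0, 2.0, 2.0, 4.0, 4.0])   # kHz
counts = np.array([2, 4, 10, 12, 3, 5])
uf, tc = tuning_curve(counts, stim_freqs)
best_frequency = uf[np.argmax(tc)]   # peak of the tuning curve
```

In a real experiment the same averaging runs over repeated presentations at each of the (here, 20 logarithmically spaced) center frequencies.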
Emergent tuning properties
• Example 1: Space
– Phase-locking encodes spatial information implicitly. Spatially tuned cells encode space explicitly:
(Laback & Majdak 2008; Bala et al. 2003)
Owl midbrain: Interaural time difference (ITD):
Envelope amplitude
Emergent tuning properties
• Example 2: Amplitude modulation (AM)
Amplitude of natural vocalizations is modulated in time:
Sinusoidal amplitude modulated (SAM) noise
Time
3 Hz 10 Hz
Stimulus-locked response
Non-stimulus-locked
Temporal code: reliability of response at a fixed time in the AM cycle
Rate code: average firing rate for a given AM rate
Emergent tuning properties
• Example 2: Amplitude modulation (AM) – Temporal code: precision of spike times relative to stimulus envelope varies with AM rate
– Rate code: total spike count varies with AM rate, but timing can be imprecise
– (NB: sometimes, confusingly, responses that follow the stimulus envelope are also referred to as “phase-locked”)
(Liang & Wang 2002)
Temporal code (vector strength)
Rate code (rate tuning)
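The temporal-code metric (vector strength) can be computed directly from spike times. A minimal sketch with toy data (this is the standard vector-strength formula, not the specific Liang & Wang analysis):

```python
import numpy as np

def vector_strength(spike_times, am_rate):
    """Degree of phase-locking to a sinusoidal AM envelope.

    Each spike is a unit vector at its phase in the AM cycle; vector
    strength is the length of the mean vector (1 = perfect locking,
    ~0 = spikes uniformly spread across the cycle).
    """
    phases = 2 * np.pi * am_rate * np.asarray(spike_times, dtype=float)
    return np.abs(np.mean(np.exp(1j * phases)))

# Perfectly locked: one spike per cycle at the same phase (10 Hz AM)
vs_locked = vector_strength([0.1, 0.2, 0.3, 0.4], am_rate=10.0)
# Unlocked: spikes at four evenly spaced phases of the cycle
vs_flat = vector_strength([0.0, 0.025, 0.05, 0.075], am_rate=10.0)
```

The rate code is just the mean firing rate per AM rate, i.e. the tuning-curve computation above applied with AM rate as the stimulus parameter.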
What does the central auditory network do?
Cortex vs. periphery:
• Broader tuning
• Sloppier timing
• Less reliability
Possible explanations: invariance? Feature integration? Transformation to rate code? State dependence? Plasticity?
IC A1 (core) dPEG (belt)
Neural encoding models
• Tuning curve – Spike counting
• Spectro-temporal receptive field (STRF) – Spike-triggered averaging (linear regression)
• Arbitrary (nonlinear) stimulus-response mapping – Gradient descent (machine learning)
• Two themes:
– How is each method implemented? – How is each method useful?
[Trade-off axes: generalizability, complexity, interpretability, data requirements]
Receptive field models
• Neural selectivity is often described as a sensory filter that captures the relationship between neural activity and the preceding stimulus at each point in time.
• Common neural filter models in the auditory system are the spike-triggered average (STA) and the spectro-temporal receptive field (STRF).
• Filter models map conceptually onto the idea of lossy information flow (many input channels map to one spike rate).
Spike-triggered average
[Figure: sound pressure s(t) and spike times; cross-correlation yields the STA (spikes/dB vs. time lag), revealing the resonant frequency (de Boer 1968)]
• Derived from methods for systems identification in engineering (a.k.a. white-noise analysis).
• Basic idea: present a complex sound and derive statistically what features of the sound evoke neural activity.
• Can be applied to evoked potentials, spikes, LFPs, and BOLD signals.
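In code, the spike-triggered average is just an average of the stimulus windows preceding each spike. A minimal sketch with made-up data (window length and variable names are mine):

```python
import numpy as np

def spike_triggered_average(stimulus, spike_bins, n_lags):
    """Average the stimulus over the n_lags samples preceding each spike.

    stimulus:   1-D array, e.g. sound pressure s(t), one value per time bin.
    spike_bins: indices of time bins containing a spike.
    Returns an array over lags -n_lags .. -1 relative to the spike.
    """
    windows = [stimulus[t - n_lags:t] for t in spike_bins if t >= n_lags]
    return np.mean(windows, axis=0)

# Toy stimulus: an impulse occurs 2 bins before each of three spikes
stim = np.zeros(40)
stim[[8, 18, 28]] = 1.0
sta = spike_triggered_average(stim, spike_bins=[10, 20, 30], n_lags=4)
# sta recovers the impulse at lag -2
```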
Spectro-temporal receptive fields (STRFs)
(Aertsen et al. 1981)
Stimulus spectrogram Neural response
Linearization. Compute the spike-triggered average after transforming the stimulus into a representation that accounts for early processing. In this case, frequency tuning of the cochlea.
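After the linearizing spectrogram transform, the same averaging extends to two dimensions (frequency x time lag). A hypothetical sketch with toy data:

```python
import numpy as np

def strf_sta(spectrogram, spike_rate, n_lags):
    """Spike-triggered average of a spectrogram: for each time lag,
    cross-correlate the response with the stimulus `lag` bins earlier.

    spectrogram: (n_freq, n_time) linearized stimulus.
    spike_rate:  (n_time,) spike count per time bin.
    Returns an (n_freq, n_lags) filter estimate.
    """
    n_freq, n_time = spectrogram.shape
    strf = np.zeros((n_freq, n_lags))
    for lag in range(n_lags):
        strf[:, lag] = spectrogram[:, :n_time - lag] @ spike_rate[lag:]
    return strf / spike_rate.sum()

# Toy data: energy in frequency channel 1 precedes each spike by 2 bins
spec = np.zeros((3, 50))
spec[1, [5, 15, 25]] = 1.0
rate = np.zeros(50)
rate[[7, 17, 27]] = 1.0
strf = strf_sta(spec, rate, n_lags=4)   # peak lands at strf[1, 2]
```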
STRFs: How?
[Figure: spectrogram segments preceding each spike are summed and averaged to form the spike-triggered average]
Temporally-orthogonal ripple combinations (TORCs) are broadband noise stimuli that drive activity in AC and sample the space of possible stimuli efficiently (Klein et al 2001).
Why STRFs?
• STRFs are a general model. They can predict the neural response to any arbitrary natural stimulus.
• Perfect prediction = perfect model
• Unbiased characterization of tuning properties. Tuning curves report tuning through a pre-selected slice of stimulus space.
• Infer the biophysical mechanisms and networks that make the brain work
A snapshot of auditory cortex
STRFs for 24 channels recorded simultaneously in A1/AAF using a fixed array of platinum-iridium electrodes.
Rate/scale space
(Singh and Theunissen, 2003)
According to Fourier theory, spectrograms can be decomposed into the sum of 2-dimensional sine wave gratings, analogous to spatial gratings in the visual system.
Rate/scale space
Modulation tuning functions (MTFs)
[Figure: example STRFs (frequency vs. time lag) and their 2-D |FFT|, giving modulation tuning functions over rate (temporal modulation) and scale (spectral modulation)]
Spectro-temporal receptive fields
(Miller & Schreiner 2002)
STRFs permit the simultaneous measurement of multiple tuning properties
Matched tuning to natural stimuli?
(Miller & Schreiner 2002)
(Singh & Theunissen, 2003)
Modulation tuning vs. stimulus modulation spectra: does the STRF distribution match the modulation spectrum of natural sounds?
STRFs for single units in awake ferret auditory cortex from primary (core, A1) and secondary (belt, dPEG) fields.
STRFs describe the processing hierarchy
A1 vs. dPEG:
Increasing integration time
Increasing complexity
But some dPEG neurons look like A1: longer tails in the distributions.
STRFs describe the processing hierarchy
(Miller et al 2001)
STRFs recorded from connected pairs of MGB and A1 neurons reveal convergent inputs.
Neural activity and sensory representation
Encoding approach: How does a neural signal respond to (encode) a given stimulus?
Decoding approach: What information about a stimulus can be inferred (decoded) from a neural signal?
Stimulus reconstruction
Linear decoder (Bialek 1991; Mesgarani 2009)
Linear encoder (STRF; Theunissen 2001; David 2009)
What information is encoded in the neural population?
Stay tuned for Nima Mesgarani’s lecture in a few weeks!
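As a preview, linear decoding is just the regression run in the opposite direction: regress the stimulus on (appropriately lagged) neural responses. Everything below is simulated (fake neurons with known lags); it only illustrates the direction of the mapping:

```python
import numpy as np

rng = np.random.default_rng(2)
stim = rng.standard_normal(2000)

# Simulate three neurons, each responding to the stimulus at a different lag
resp = np.stack([np.roll(stim, k) + 0.3 * rng.standard_normal(2000)
                 for k in (1, 2, 3)])

# Decoder design matrix: shift each response back so its column aligns
# with the stimulus sample it encodes
R = np.column_stack([np.roll(resp[i], -(i + 1)) for i in range(3)])
g, *_ = np.linalg.lstsq(R, stim, rcond=None)   # decoder weights
stim_hat = R @ g                               # reconstructed stimulus
r_decode = np.corrcoef(stim_hat, stim)[0, 1]   # reconstruction accuracy
```

Combining the three noisy, lag-corrected copies recovers the stimulus more accurately than any single neuron could.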
Spike-triggered average as linear regression
[Figure: example static output function (spikes/s vs. stimulus amplitude)]
STRF = cross-correlation between stimulus spectrogram and time-varying spike rate.
[Figure: TORC spectrograms (0.5–16 kHz), spike rasters over 10 repetitions, and PSTHs (spikes/s vs. time, 0–250 ms) for TORCs 1–30; actual vs. predicted PSTH (normalized amplitude)]
Correlate stimulus and response at preferred frequency and time-lag:
Spike-triggered average
STRF = cross-correlation between stimulus spectrogram and time-varying spike rate.
[Figure: example spike-triggered average STRF (frequency 0.5–16 kHz vs. time lag 20–100 ms) with static output function (spikes/s vs. stimulus amplitude)]
STRFs can predict neural responses
[Figure: TORC spectrograms, rasters, and actual vs. predicted PSTHs illustrating STRF estimation and prediction]
Reverse correlation as linear algebra
Linear algebraic form (using a delay-line, so that all stimulus samples influencing response at time t occur in row t of stimulus matrix S):
STRF estimation: h = (SᵀS)⁻¹ Sᵀr
STRF prediction: r̂ = Sh
(For white noise, SᵀS ∝ I, so estimation reduces to the spike-triggered average, h ∝ Sᵀr.)
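The delay-line formulation can be checked numerically. A toy sketch with a known filter and white-noise stimulus; the normal-equation solution h = (SᵀS)⁻¹Sᵀr is exact here because the simulated response is noiseless and linear:

```python
import numpy as np

def delay_line(stim, n_lags):
    """Build the delay-line matrix S: row t holds stim[t], stim[t-1], ..."""
    n = len(stim)
    S = np.zeros((n, n_lags))
    for lag in range(n_lags):
        S[lag:, lag] = stim[:n - lag]
    return S

rng = np.random.default_rng(0)
h_true = np.array([0.0, 1.0, 0.5, 0.0, 0.0])   # hypothetical ground-truth filter
stim = rng.standard_normal(5000)               # white-noise stimulus
S = delay_line(stim, n_lags=5)
r = S @ h_true                                 # noiseless linear response

h_est = np.linalg.solve(S.T @ S, S.T @ r)      # estimation: h = (S'S)^-1 S'r
r_pred = S @ h_est                             # prediction: r_hat = S h
```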
What about natural stimuli?
Speech (natural)
TORC (parametric)
(David et al. 2009)
Natural stimuli and STRFs
(Woolley & Theunissen 2004)
STRFs in the birdsong system depend on the estimation stimulus
Natural stimuli and STRFs
STRFs are piecewise-linear estimates of a nonlinear function
What about natural stimuli?
Unlike rippled noise, speech and other natural sounds are correlated in frequency and time:
(Theunissen et al 2001)
[Figure: STA from speech vs. STA from TORCs (500–8000 Hz vs. time lag 10–110 ms)]
Simply computing the spike-triggered average produces an artefactually smoothed STRF:
Normalized reverse correlation
[Figure: NRC STRF estimates at tolerance values 0.9, 0.99, 1 − 10⁻³ … 1 − 10⁻⁶ (normalized gain, 500–8000 Hz vs. time lag 10–110 ms)]
Optimal-tolerance estimate (based on cross-validation)
Normalized reverse correlation. Natural stimuli contain spectro-temporal correlations, which must be accounted for to obtain the correct regression solution.
(Theunissen et al 2001)
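The published method truncates small eigenvalues of the stimulus autocorrelation matrix at a tolerance chosen by cross-validation; the sketch below substitutes a ridge penalty, which plays the same role, on simulated correlated data (all values invented):

```python
import numpy as np

def nrc_strf(S, r, lam):
    """Regularized reverse correlation: divide out the stimulus
    autocorrelation S'S, with a ridge penalty standing in for the
    eigenvalue-truncation 'tolerance' of the original method."""
    n_lags = S.shape[1]
    return np.linalg.solve(S.T @ S + lam * np.eye(n_lags), S.T @ r)

# Correlated (smoothed) stimulus: the plain STA is biased here
rng = np.random.default_rng(3)
raw = rng.standard_normal(4000)
stim = np.convolve(raw, np.ones(5) / 5, mode="same")   # temporal correlations
S = np.column_stack([np.roll(stim, k) for k in range(6)])
h_true = np.array([0.0, 1.0, 0.5, 0.0, 0.0, 0.0])
r = S @ h_true

h_nrc = nrc_strf(S, r, lam=1e-6)   # recovers h_true despite the correlations
h_sta = S.T @ r / len(r)           # plain STA: smeared and rescaled
```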
Speech vs. noise STRFs
[Figure: STA from speech, STA from TORCs, and STRF from speech estimated by NRC (500–8000 Hz vs. time lag 10–110 ms)]
(Careful!) application of normalized reverse correlation reveals that TORC and speech STRFs have similar spectral tuning and differ primarily in their temporal dynamics.
(David et al. 2009)
Alternative: STRF estimation by boosting
[Figure: STRF estimate evolving across boosting iterations (David & Shamma 2007)]
Boosting (a.k.a. coordinate descent) is a specific implementation of a gradient descent algorithm that works by iteratively updating the single STRF coefficient that best improves prediction accuracy.
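A minimal coordinate-descent sketch on toy data; the fixed step size and iteration count are my simplifications (the published algorithm searches step signs and uses early stopping against a validation set):

```python
import numpy as np

def boost_strf(S, r, step=0.01, n_iter=2000):
    """Boosting / coordinate descent: at each iteration, nudge the single
    coefficient whose change most reduces mean squared error."""
    h = np.zeros(S.shape[1])
    for _ in range(n_iter):
        grad = -2.0 * S.T @ (r - S @ h) / len(r)   # MSE gradient
        j = np.argmax(np.abs(grad))                # most helpful coefficient
        h[j] -= step * np.sign(grad[j])
    return h

# Recover a sparse toy filter from a white-noise stimulus
rng = np.random.default_rng(4)
S = rng.standard_normal((3000, 6))
h_true = np.array([0.4, 0.0, -0.2, 0.0, 0.0, 0.0])
h_boost = boost_strf(S, S @ h_true)
```

Because each update touches only one coefficient, the solution tends to stay sparse, which is part of the method's appeal for natural stimuli.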
[Figure: boosted STRF estimate (normalized gain, 500–8000 Hz vs. time lag 10–110 ms)]
Reverse correlation vs. boosting?
[Figure: STRF from boosting vs. STRF from reverse correlation for an example neuron; scatter of prediction correlations, boosted vs. NRC STRF, n=164]
Prediction accuracy (for single neurons in A1 during speech presentation) is slightly higher for boosted STRFs.
Neural encoding models
• Tuning curve – Spike counting
• Spectro-temporal receptive field (STRF) – Spike-triggered averaging (linear regression)
• Arbitrary (nonlinear) stimulus-response mapping – Gradient descent (machine learning)
• Two themes:
– How is each method implemented? – How is each method useful?
[Trade-off axes: generalizability, complexity, interpretability, data requirements]
A1 neurons are not linear
[Figure: example A1 STRF and measured static nonlinearity (spikes/s vs. stimulus amplitude); scatter of prediction correlations, boosted vs. NRC STRF, n=164]
Moving beyond the linear STRF
Problems with existing models
• Classical STRF (LN model) has limited accuracy.
• Several alternatives have been proposed, but no single model has been established as a replacement.
• Behavior makes things more complicated: data from behaving animals are often limited, and a larger parameter count makes fitting harder.
Strategy (Thorson & David 2015):
• Compare a large variety of model architectures on a standard data set
• Reduce dimensionality while maximizing prediction accuracy
• Starting point: LN STRF
[Diagram: stimulus cochleogram → linear filter → static nonlinearity → response (all states)]
Maximum a posteriori (MAP) estimation
David & Gallant 2004; Wu, David & Gallant 2007
θ* = argmin_θ { E[ ℒ(r(t), h(s(t))) ] + R(θ) }
Loss function: smaller prediction error = more probable (e.g., mean squared error)
Model class: function that predicts response to any stimulus (e.g., the STRF)
Prior: penalize unlikely fits (e.g., flat, smooth, sparse prior).
What are the most probable fit parameters, θ*, given available data and existing knowledge of the system?
Estimation stimulus: sample of stimuli used for fitting
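A toy illustration of assembling the three ingredients (model, loss, prior) into one objective. The data, the λ value, and the smoothness penalty are all invented for illustration:

```python
import numpy as np
from scipy.optimize import minimize

# Simulated data: stimulus matrix S, smooth ground-truth filter, noisy response
rng = np.random.default_rng(1)
S = rng.standard_normal((500, 8))
h_true = np.exp(-np.arange(8) / 2.0)
r = S @ h_true + 0.1 * rng.standard_normal(500)

def neg_log_posterior(h, lam=0.01):
    loss = np.mean((r - S @ h) ** 2)        # smaller error = more probable
    prior = lam * np.sum(np.diff(h) ** 2)   # weak penalty on non-smooth filters
    return loss + prior

# MAP estimate: most probable parameters given data and prior
h_map = minimize(neg_log_posterior, np.zeros(8)).x
```

Swapping the prior term (flat, smooth, sparse) changes which fits are considered plausible without touching the loss or the model class.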
Model class
Basic model:
“Engine”:
[Diagram: stimulus cochleogram → linear filter → static nonlinearity → response (all states)]
The obvious first nonlinearity to try: apply a static nonlinearity to the output of the STRF (a.k.a. the Generalized Linear Model, or GLM; Paninski et al. 2004):
This model can be thought of as an instance of a cascade of transformations from stimulus to response:
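A sketch of the LN cascade; the rectifier here is one hypothetical choice of static nonlinearity (GLMs often use an exponential or sigmoid instead):

```python
import numpy as np

def ln_predict(strf, spectrogram, threshold=0.0, slope=1.0):
    """LN model: linear STRF filtering followed by a static nonlinearity.

    strf:        (n_freq, n_lags) filter.
    spectrogram: (n_freq, n_time) stimulus.
    Returns the predicted response, rectified at `threshold`.
    """
    n_freq, n_lags = strf.shape
    n_time = spectrogram.shape[1]
    lin = np.zeros(n_time)
    for lag in range(n_lags):
        # add each lag's contribution: strf[:, lag] . spec[:, t - lag]
        lin[lag:] += strf[:, lag] @ spectrogram[:, :n_time - lag]
    return np.maximum(slope * (lin - threshold), 0.0)

# One channel, one lag: the nonlinearity rectifies the negative sample
out = ln_predict(np.array([[1.0]]), np.array([[-1.0, 2.0, 3.0]]))
```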
Fitter
Basic model:
Engine:
Fitter: Choose values of θ for which the engine produces the minimum prediction error.
Gradient:
Reduced rank spectro-temporal filters
Full rank STRF
Rank 1
Rank 2
Rank 3
Rank 4
(Simon et al. 2007; Park & Pillow 2012)
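Reduced-rank (factorized) filters can be illustrated with the SVD. This is a sketch of the idea, not the fitting procedure of the cited papers, which estimate the spectral and temporal factors directly during optimization:

```python
import numpy as np

def reduce_rank(strf, rank):
    """Approximate an STRF as a sum of `rank` separable
    (spectral vector x temporal vector) components via the SVD."""
    U, s, Vt = np.linalg.svd(strf, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]

# A separable (rank-1) STRF is reproduced exactly by a single component
strf1 = np.outer([1.0, 2.0, 0.5], [3.0, -1.0, 0.0, 0.5])
err1 = np.abs(reduce_rank(strf1, 1) - strf1).max()
```

Each retained component costs only n_freq + n_lags parameters instead of n_freq × n_lags, which is the source of the dimensionality savings.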
Reduced rank STRFs
Model prediction accuracy compared for a standard single-unit data set recorded in ferret A1 during presentation of natural vocalizations.
(Thorson & David 2015)
Reduced rank STRFs
[Figure: prediction correlation of the D=2 factorized model vs. the full FIR model for N=176 neurons; mean prediction correlation vs. parameter count for FIR and factorized models, D = 1–4 (***)]
Reduced-rank model performs better* and requires fewer parameters
* “better” only because the full-rank model suffers from over-fitting.
(Thorson & David 2015)
Parametric STRFs
Assume frequency tuning is Gaussian (2 free parameters):
[Figure: Gaussian spectral tuning curve (gain vs. frequency, parameters μ and σ); temporal filter gain vs. time lag]
Assume temporal tuning is pole-zero filter (3-5 parameters):
(Thorson & David 2015)
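A sketch of a separable parametric STRF. The Gaussian spectral weighting follows the slide; the damped exponential is my simplified stand-in for the pole-zero temporal filter:

```python
import numpy as np

def parametric_strf(freqs_khz, lags_ms, mu_khz, sigma_oct, tau_ms, gain=1.0):
    """Separable parametric STRF: Gaussian spectral tuning (mean mu,
    width sigma in octaves) times a damped-exponential temporal kernel."""
    ws = np.exp(-0.5 * ((np.log2(freqs_khz) - np.log2(mu_khz)) / sigma_oct) ** 2)
    wt = np.exp(-np.asarray(lags_ms, dtype=float) / tau_ms)
    return gain * np.outer(ws, wt)

freqs = np.array([0.5, 1.0, 2.0, 4.0, 8.0])
strf = parametric_strf(freqs, lags_ms=[0, 10, 20, 30],
                       mu_khz=2.0, sigma_oct=0.5, tau_ms=15.0)
bf = freqs[np.argmax(strf[:, 0])]   # best frequency equals mu
```

The whole filter is described by a handful of parameters (μ, σ, τ, gain) instead of one coefficient per frequency-lag bin.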
Parametric STRFs
[Figure: mean prediction correlation vs. parameter count (0–277) for FIR, factorized (Ws & Wt), and parameterized (Gaussian Ws, pole-zero Wt) models; example STRFs for cell por053a-06-01 (R = 0.58 / 0.61 / 0.64) and cell oni013b-b1 (R = 0.37 / 0.42 / 0.55), stimulus frequency 0.2–20 kHz vs. response latency 0–150 ms]
(Thorson & David 2015)
How many degrees of freedom are required?
Performance of n=1061 LN STRF architectures, compared for A1 neurons during presentation of vocalizations. A 29-dimensional model had best prediction accuracy.
(Thorson & David 2015)
“Standard” STRF (278 free parameters) vs. parameterized STRF (29 free parameters)
Focus on temporal dynamics
Speech (many spectral channels)
Speech-modulated noise (one spectral channel, still naturalistic)
Encoding of vocalization-modulated noise
(David & Shamma 2013)
Encoding of vocalization-modulated noise
Linear (“LN”) receptive field model
A role for short-term plasticity?
• Evidence for strong influence of nonlinear synaptic depression in A1
– Fine timing of responses to amplitude-modulated stimuli (Elhilali et al. 2004)
– Dynamics of forward suppression (Wehr & Zador 2005)
– Changes in STRFs estimated from speech and rippled noise (David et al. 2009)
• Synaptic depression is a well-modeled process at the single-synapse level
– Vesicle depletion rate, u, proportional to presynaptic input
– Recovery time constant, τ, independent of input
(from Tsodyks & Markram 1997)
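A minimal discrete-time sketch of the depression dynamics just described; the parameter values and the clipping are invented for illustration:

```python
import numpy as np

def stp_depress(stim, u=5.0, tau=0.2, dt=0.01):
    """Short-term synaptic depression in the spirit of Tsodyks & Markram:
    the available-resource fraction d is depleted at rate u in proportion
    to the input and recovers toward 1 with time constant tau (seconds).
    Returns the depressed input, d * stim."""
    d = np.ones(len(stim))
    for t in range(1, len(stim)):
        dd = (1.0 - d[t - 1]) / tau - u * stim[t - 1] * d[t - 1]
        d[t] = min(max(d[t - 1] + dt * dd, 0.0), 1.0)
    return d * stim

# A sustained input depresses toward a steady state of 1 / (1 + u*tau) = 0.5
out = stp_depress(np.ones(300))
```

Inserting this nonlinearity before the temporal filter is what turns the LN model into the STP STRF discussed next.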
Nonlinear STP receptive field model
STP STRF has improved predictive power
(David & Shamma 2013)
STP STRF expanded for fully natural sounds
[Diagram: stimulus spectrogram → spectral weighting (gain vs. frequency) → STP → temporal filter → static nonlinearity → spike response]
Nonlinear encoding models
[Figure: median prediction correlation vs. parameter count for n=117 A1 neurons; nonlinear filters define a Pareto frontier (performance vs. complexity) above linear filters, from the standard LN STRF to the best-performing nonlinear STRF]
>1000 encoding model architectures compared using 20 minutes of data recorded from n=117 A1 single units in awake ferrets during presentation of natural vocalizations.
Long-term plan: put the model-fitting system online to allow other labs to fit and compare models on their own data.
Other variants of the STRF
• Linearize the input space – Gill, Woolley, Fremouw, Theunissen 2006 (cochlear model) – David, Shamma 2013 (STP example above) – Willmore, Schoppe, King, Schnupp, Harper 2016 (midbrain model)
• Subspace models
– Atencio, Sharpee, Schreiner 2008 (information theory-based) – Kozlov, Gentner 2016 – Atencio, Sharpee 201
• Gain control – Rabbinowitz, Willmore, Schnupp 2012 – Williamson, Ahrens, Linden, Sahani 2016
• Neural networks, deep networks – Harper, Willmore, Cui, Schnupp 2016 (similar to subspace models) – Yamins, DiCarlo 2016 (visual cortex) – Kell, Yamins, Norman-Haignere, McDermott (in prep!)
Take-home messages
• Neural encoding models span a wide range of complexity and generalizability
• Important factors: model architecture, fit stimulus, fit algorithm, cost function, priors
Prep for lab
• Download and install Anaconda: – https://www.anaconda.com/download/ (Python v3.6) – Follow instructions for installation – Required packages (if using a different Python distribution):
• numpy, scipy, matplotlib, ipython
• Download STRF demo code & data:
– https://bitbucket.org/lbhb/strf_demo/downloads/ – Click “Download repository” and unzip to desired directory
• Run ipython in terminal window from directory where STRF demo files are installed. Then test:
In [1]: %pylab
In [2]: run cartoon_rc.py