34
Classification: Biological Sciences, Neuroscience Title: A triplet model of spike-timing dependent plasticity generalizes the BCM rule to spatio- temporal inputs Authors: Julijana Gjorgjieva 1 , Claudia Clopath 2 , Juliette Audet 3 , Jean-Pascal Pfister 4,5 Address: (1) Department of Applied Mathematics and Theoretical Physics, University of Cam- bridge, Cambridge, UK (2) Laboratory of Neurophysics and Physiology, Universit´ e Paris Descartes, Paris, France (3) Laboratory of Computational Neuroscience, Ecole Polytechnique F´ ed´ erale de Lausanne, Lausanne, Switzerland (4) Computational Neuroscience Lab, Physiology Department, University of Bern, Bern, CH (5) Department of Engineering, University of Cambridge, Cambridge, UK Corresponding author: Julijana Gjorgjieva Corresponding author address: Department of Applied Mathematics and Theoretical Physics, Wilberforce Road, Cambridge, CB3 0WA, UK Corresponding author email: [email protected] Keywords: STDP, BCM, triplet model, higher-order correlations, input selectivity Acknowledgements: We would like to thank Stephen Eglen and Adrienne Fairhall for helpful discussions and reading drafts of this manuscript. This work was funded by a Cambridge Overseas Research Studentship and Trinity College Internal Graduate Studentship (JG), the Agence Nationale de la Recherche grant ANR-08-SYSC-005 (CC), the Wellcome Trust and the Swiss National Science Foundation (JPP).

Classi cation: Biological Sciences, Neuroscience Julijana …spiral.imperial.ac.uk/bitstream/10044/1/21442/2/PNAS_108... · 2015. 12. 15. · Classi cation: Biological Sciences, Neuroscience

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

  • Classification: Biological Sciences, Neuroscience

    Title: A triplet model of spike-timing dependent plasticity generalizes the BCM rule to spatio-temporal inputs

    Authors: Julijana Gjorgjieva1, Claudia Clopath2, Juliette Audet3, Jean-Pascal Pfister4,5

    Address: (1) Department of Applied Mathematics and Theoretical Physics, University of Cam-bridge, Cambridge, UK(2) Laboratory of Neurophysics and Physiology, Université Paris Descartes, Paris, France(3) Laboratory of Computational Neuroscience, Ecole Polytechnique Fédérale de Lausanne,Lausanne, Switzerland(4) Computational Neuroscience Lab, Physiology Department, University of Bern, Bern, CH(5) Department of Engineering, University of Cambridge, Cambridge, UK

    Corresponding author: Julijana Gjorgjieva

    Corresponding author address: Department of Applied Mathematics and Theoretical Physics,Wilberforce Road, Cambridge, CB3 0WA, UK

    Corresponding author email: [email protected]

    Keywords: STDP, BCM, triplet model, higher-order correlations, input selectivity

    Acknowledgements: We would like to thank Stephen Eglen and Adrienne Fairhall for helpfuldiscussions and reading drafts of this manuscript. This work was funded by a CambridgeOverseas Research Studentship and Trinity College Internal Graduate Studentship (JG), theAgence Nationale de la Recherche grant ANR-08-SYSC-005 (CC), the Wellcome Trust andthe Swiss National Science Foundation (JPP).

  • Abstract

    Synaptic strength depresses for low, and potentiates for high activation of the postsynaptic neuron.This feature is a key property of the well-known Bienenstock-Cooper-Munro (BCM) synaptic rule,which has been shown to maximize the selectivity of the postynaptic neuron, and thereby offersa possible explanation for experience-dependent cortical plasticity such as orientation selectivity.However, this BCM framework is rate-based and a significant amount of recent work has shown thatsynaptic plasticity depends on the precise timing of presynaptic and postsynaptic spikes. In thispaper we consider a triplet model of Spike-Timing Dependent Plasticity (STDP) which dependson the interactions of three precisely-timed spikes. Triplet STDP has been shown to describeexperimentally-observed plasticity which the classical STDP rule, based on pairs of spikes, has failedto capture. In the case of rate-based patterns, we show a tight correspondence between the tripletSTDP rule and the BCM rule. We demonstrate the selectivity property of the triplet STDP ruleanalytically for orthogonal inputs and numerically for non-orthogonal inputs. Moreover, in contrastto BCM, we show that triplet STDP can also elicit selectivity to input patterns consisting of higher-order spatio-temporal correlations. This property is crucial given the large amount of higher-ordercorrelations measured in the brain. We studied the triplet STDP model in a biologically-inspiredsetting from receptive field theory, where orientation and directions selectivity arise due to thesensitivity of triplet STDP to spatio-temporal correlations.

    Introduction

    Synaptic plasticity depends on the activity of presynaptic and postsynaptic neurons and is believedto provide the basis for learning and memory (1, 2). It has been shown that low frequency stim-ulation (1–3 Hz) (3) or stimulation paired with low postsynaptic membrane potential (4) inducessynaptic long-term depression (LTD), whereas synapses undergo long-term potentiation (LTP) af-ter high frequency stimulation (100 Hz) (5). Such findings are consistent with the well-knownBienenstock-Cooper-Munro (BCM) learning rule (6). This BCM model has been shown to elicitorientation selectivity and other aspects of experience-dependent cortical plasticity (6, 7). Further-more, in this model the modification of the threshold between LTP and LTD varies as a functionof the history of postsynaptic activity, a prediction which has been confirmed experimentally (8).

    Despite its consistency with experimental data and its functional relevance, the BCM frameworkis limited in several aspects. Experimentally, since the learning rule is expressed in terms of firingrates, it cannot predict synaptic modification based on the timing of pre- and postsynaptic spikeswhich has attracted particular interest more recently (9, 10). This form of plasticity, called Spike-Timing Dependent Plasticity (STDP) uses the timing of spike pairs to induce synaptic modification(11, 12). Synapses undergo LTP if a presynaptic spike precedes a postsynaptic spike, whereas thereverse timing of pre- and postsynaptic spikes leads to LTD (9, 10, 13, 14). Functionally, the BCMmodel cannot segregate input patterns that are characterized by their spiking temporal structure.

    Here, we consider a spike-based learning rule, “the triplet STDP model” (15, 16), and show that

    1

  • it overcomes those two important limitations of the BCM rule, and as a result generalizes the BCMframework. This triplet model uses sets of three spikes (triplets) – instead of pairs of spikes as inthe case of classical STDP – to induce potentiation. More precisely, LTP depends on the intervalbetween the pre- and postsynaptic spikes, and on the timing of the previous postsynaptic spike(Fig. 1A). Furthermore, this triplet learning rule has been shown to explain a variety of synapticplasticity data (17, 18) significantly better than pair-based STDP (16) (see also Fig. 1B).

    Computationally, it has been shown that under some rather crude assumptions – when theinput and output neurons have independent Poisson statistics – this triplet model can be mappedto the BCM learning rule (16). In this paper, we take a more biologically-plausible approach byincorporating contributions from input-output spiking correlations in inducing synaptic plasticity.Consistent with results from BCM theory, we demonstrate that in the presence of N orthogonalrate-based patterns, the maximally selective fixed points of the weight dynamics induced by thetriplet rule are stable. Furthermore, we show that the triplet rule acts as a generalized BCM rule inthe sense that postsynaptic neurons become not only selective to rate-based patterns of the inputs,but also to patterns that can be differentiated only based on their spiking correlation structure.

    In contrast to existing biophysical learning rules (19) that are consistent with the BCM model,the mathematical simplicity of the triplet model allowed us to characterize the explicit dependenceof the synaptic dynamics on higher-order input correlations. This is of critical importance giventhe ubiquity of those higher-order correlations (20, 21) and their relevance for neural coding (22).

    Materials and Methods

    Neuronal dynamics. We consider a feedforward network withN input neurons x(t) = [x1(t), . . . , xN (t)]T

    where xj(t) =∑

    tprejδ(t− tprej ) denotes the Dirac delta spike train at time t of the jth neuron and

    tprej its spike times. Input neurons are connected to a single output neuron (Fig. 2A) through aweight vector w(t) = [w1(t), . . . , wN (t)]T giving rise to the postsnynaptic spike train y(t). Theinput spike trains x(t) have instantaneous firing rates ρinst(t) = [ρinst1 (t), . . . , ρ

    instN (t)]

    T = 〈x(t)〉.(The expectation 〈·〉 is assumed to be taken over the input statistics.) In addition to this firstmoment, the input spike trains have higher-order moments, of which we only considered second(pairwise correlations) and third moments (triplet correlations). The input correlation matrix1 isdefined as C(s) = 1T

    ∫ T0

    〈x(t)xT(t− s)

    〉dt where ρ = 1T

    ∫ T0 ρ

    inst(t) dt is the mean firing rate aver-aged over a duration T . Note that this input correlation matrix has an atomic discontinuity ats = 0: C(0) = diag(ρ) δ(0).

    The third-order correlation is described by a third-order tensor U whose (k, j, n) componentsare defined as Ukjn(s1, s2) = 1T

    ∫ T0 〈xk(t)xj(t − s1)xn(t − s2)〉 dt. Like the pairwise correlation

    matrix, this third-order tensor also has atomic discontinuities when s1 = 0, s2 = 0 and s1 = s2(see Supplementary Information). For further notational convenience, we let Uj denote a matrixwhose (k, n) element is Ukjn. We assumed that the membrane potential of the postsynaptic neuron

    1Formally, the correlation matrix should be written as C(t; s). For notational convenience, we omit here thedependence on t.

    2

  • u(t) increases with the spike times of each input by the excitatory postsynaptic potential (EPSP)scaled by the corresponding weight

    u(t) = wT(t)∫ ∞

    0�(r)x(t− r) dr [1]

    The function �(r) denotes the EPSP kernel with∫∞

    0 �(r) dr = 1, taken to be a decaying exponential,1/τm e−r/τm , r ≥ 0, with τm being a membrane time constant of 11 ms (the exact choice of the timeconstant was not important). Postsynaptic spikes were generated stochastically from the membranepotential (23, 24), with a probability density of firing a spike at time t given by the transferfunction 〈y(t)〉 = g(u(t)). For simplicity, we used linear neurons where the transfer function wasapproximated by

    g(u(t)) ≈ g(u0) + g′(u0)(u(t)− u0) [2]

    where the expected membrane potential averaged over a period T is u0 = 1T∫ T

    0 〈u(t)〉 dt = wTρ.

    We also used ν = g(u0) to denote the mean postsynaptic firing rate. (The symbol 〈·〉 denotes theexpectation over both the pre- and postsynaptic stochastic spike trains.)

    Having described the dynamics of the postsynaptic neuron, we expressed the correlation functionK(s) = 1T

    ∫ T0 〈y(t)x(t− s)〉 dt between pre- and postsynaptic spike trains solely as a function of the

    (convolved) input correlations

    K(s) = g′(u0)C�T(s)w + α(u0) ρ [3]

    where α(u) = g(u)− g′(u)u, and the superscript � denotes a convolution with respect to the EPSPkernel �(s), i.e. C�(s) =

    ∫∞0 �(s

    ′)C(s−s′) ds′ (see Supplementary Information). Note that when thetransfer function is linear g(u) = u, then α(u0) = 0 and K(s) = C�T(s)w. Similarly, by using theapproximation in Eq. 2, the pre-post-post correlation vector Q(s1, s2) = 1T

    ∫ T0 〈y(t)x(t − s1) y(t −

    s2)〉 dt was expressed as

    Q(s1, s2) =(g′(u0)

    )2∑j

    wTU �j (s1, s2)w ej

    + α(u0) g′(u0)(C�T(s1) + C�T(s1 − s2)− 2 ρ ρT

    )w + β(u0) ρ [4]

    where ej is a vector of zeros with a 1 at the j-th component, and β(u) = g2(u) − g′2(u)u2

    (see Supplementary Information). The � superscript in U �j denotes a double convolution, i.e.U �j (s1, s2) =

    ∫∞0

    ∫∞0 �(r) �(q)Uj(s1 − r, s2 − r + q) dr dq. When the transfer function is linear

    g(u) = u, then α(u0) = β(u0) = 0. In this case, we expressed the pre-post-post correlation vectorQ simply as a function of the (convolved) third-order input correlation tensor U �: Q(s1, s2) =∑

    j wTU �j (s1, s2)w ej .

    Synaptic dynamics. In the previous section we assumed that the weights w were fixed. Here,we allowed the weights to be dynamic, but with a very small learning rate such that the results inthe previous section were still valid. Consistent with the approach of (23), we expressed the weight

    3

  • change as a Volterra expansion of both the pre- and postsynaptic spike trains. Setting the firstorder terms to zero, we assumed that synaptic plasticity depends on second- and third-order termsonly, i.e. pairs of spikes (1 pre and 1 post) and triplets of spikes (1 pre and 2 post)

    ẇ = y(t)∫ ∞

    0W2(s)x(t− s) ds+ x(t)

    ∫ ∞0

    W2(−s) y(t− s) ds

    + y(t)∫ ∞

    0

    ∫ ∞0

    W3(s1, s2)x(t− s1) y(t− s2) ds1 ds2. [5]

    where W2 is the pair-based STDP learning window (11, 23, 25) given by

    W2(∆t) =

    A+2 e−∆t/τ+ , ∆t ≥ 0−A−2 e∆t/τ− , ∆t < 0 [6]where ∆t = tpost − tpre denotes the timing between a post- and a presynaptic spike, τ+ is thepotentiation time constant and τ− is the depression time constant (Fig. 2B, blue curve). In addi-tion to the spike pair effect described in Eq. 6, spike triplets (tpre, tpost, t′post) also affect synapticpotentiation depending on their timing difference ∆t1 = tpost − tpre and ∆t2 = tpost − t′post

    W3(∆t1,∆t2) =

    A+3 e−∆t1/τ+e−∆t2/τy ,∆t1 ≥ 0,∆t2 ≥ 00, otherwise. [7]Thus, the level of potentiation induced by pairs of presynaptic and postsynaptic spikes in pair-basedSTDP is modulated by the timing difference between the two postsynaptic spikes, given by the timeconstant τy (Fig. 1A and 2C, left). The parameters used throughout this paper were those of theminimal triplet rule by (26)2, i.e. A+2 = 0, A

    −2 = −6.5 × 10−3, A

    +3 = 7.1 × 10−3, τ+ = 16.8 ms,

    τ− = 33.7 ms, τy = 114 ms. Assuming slow learning dynamics (implying small W2 and W3), wereplaced the weights w by their expectation averaged over a period of T : w(t)← T−1

    ∫ t+Tt 〈w(s)〉ds

    ẇ =∫ ∞−∞

    W2(s)K(s) ds+∫ ∞−∞

    ∫ ∞−∞

    W3(s1, s2)Q(s1, s2) ds1 ds2. [8]

    The terms arising from the contributions of pairwise and third-order correlations in this equationare illustrated in Fig. 2. By using the expression of the pre-post correlation vector K derived inEq. 3 and the pre-post-post correlation vector Q from Eq. 4 in the case of linear transfer function,we derived the following equation for the weight dynamics

    ẇ = Aw +N∑j=1

    (wTBjw

    )ej [9]

    2Note that (26) assumed the wrong number of pairs in the STDP experiments (60 instead of 75) and thereforethe fitted amplitude parameters were overestimated by approximately 10%. For the sake of consistency with (26) wekept the same parameters.

    4

  • where A =∫∞−∞W2(s)C

    �T(s) ds is the (convolved) input correlation matrix integrated with thepairwise learning rule W2. Similarly, we have Bj =

    ∫∞−∞

    ∫∞−∞W3(s1, s2)U

    �j (s1, s2) ds1 ds2 which

    corresponds to the integral of the triplet rule W3 with the third-order input statistics. Equivalently,Eq. 9 can be written as

    ẇ = φ(ν) ρ+ ∆Aw +N∑j=1

    (wT∆Bj w

    )ej [10]

    where φ(ν) = W 2ν + W 3ν2 corresponds to the BCM term, and W 2 =∫∞−∞W2(s) ds and W 3 =∫∞

    −∞∫∞−∞W3(s1, s2) ds1 ds2 are the integrals of the pair-based and triplet STDP rules, respec-

    tively. The new covariance terms ∆A and ∆B are defined analogously to A and B: ∆A =∫∞−∞W2(s) ∆C

    �T(s) ds and ∆Bj =∫∞

    0

    ∫∞0 W3(s1, s2) ∆U

    �j (s1, s2) ds1 ds2. Here, instead of the pair-

    wise and triplet input correlations C and U respectively, we subtracted the means to obtain thepairwise covariance matrix ∆C(s) = C(s) − ρ ρT and triplet covariance tensor ∆Ukjn(s1, s2) =Ukjn(s1, s2)− ρk ρj ρn.

    Weight dynamics for input selectivity. In the general case, we considered P input patterns,where pattern i has mean firing rate ρ(i), and pairwise and triplet correlation terms A(i) and B(i),respectively. Each input pattern i was associated with a probability pi of occurrence, and gave riseto an average postsynaptic firing rate ν(i) = wTρ(i). The level of selectivity of the postsynapticneuron defined by (6) is given by Sel(w) = 1 −

    (∑i piw

    T ρ(i))/(maxiwT ρ(i)

    ). The selectivity is

    zero if the output firing rate is identical for each input pattern. The selectivity is maximal is theoutput firing rate is zero in response to all input patterns except to one, when the selectivity isgiven by (N − 1)/N if pi = 1/N .

    To match the triplet rule to the BCM model of rate-based selectivity, we set A−2 → A−2 ν̄/ρ

    p0,

    where the expectation of the pth power of the postsynaptic firing rate can be expressed as ν̄ =∑Pi=1 pi

    (ν(i))p

    . This quantity was approximated by low-pass filtering the pth power of the instan-taneous postsynaptic firing rate ν(t) = g(u(t)) with a time constant which has to be larger than Ptimes the frequency of pattern presentation, i.e. r ' ν̄ where τrṙ = −r + νp with a time constantof τr = 5 s (unless otherwise noted). For all the calculations in this paper we took p = 2. Usingthe minimal triplet model of (26) where A+2 = 0, A

    (i) contains only depression effects from the pairSTDP rule (A−2 6= 0) such that Eq. 10 becomes

    ẇ =P∑i=1

    pi

    φ(ν(i), ν̄) ρ(i) + ∆A(i)(ν̄)w + N∑j=1

    (wT∆B(i)j w

    )ej

    . [11]To find the nontrivial fixed points of this system, we solved ẇ = 0.

    Selectivity with N rate-based patterns. Here, we show that the maximally selective fixedpoints are stable points of the triplet STDP rule in the case of independent Poisson inputs. Thefull proof is given in the Supplementary Information. By assuming the minimal triplet model(A+2 = 0) we have ∆A = 0 and ∆B is given by (see Supplementary Information) (∆B

    (i)j )kn =(

    b1 δkj ρ(i)n + b2 δkn ρ

    (i)k + b3 δkj δkn

    (i)j with b1 =

    ∫∞0

    ∫∞0 W3(s1, s2) (�(s1) + �(s2 − s1)) ds1 ds2, b2 =

    5

  • ∫∞0

    ∫∞0 W3(s1, s2)

    ∫∞0 �(r) �(r− s2) dr ds1 ds2 and b3 =

    ∫∞0

    ∫∞0 W3(s1, s2) �(s1) �(s2− s1) ds1 ds2. In

    the presence of P Poisson patterns, we can write the expected weight dynamics as

    ẇ =P∑i=1

    pi

    (φ(ν(i), ν̄)1 + Λ(i)

    )ρ(i) [12]

    where 1 is the identity matrix and Λ(i) is a diagonal matrix where the jth diagonal element isΛ(i)jj =

    (b1wj ν

    (i) + b2 ν(i)2 + b3w

    2j

    (i)j with ν

    (i)2 =

    ∑k w

    2k ρ

    (i)k . Solving for the fixed points we find

    a total of 2N fixed points of which N are maximally-selective, in agreement with BCM theory (seeSupplementary Information). Moreover, these maximally-selective fixed points are stable, becausethe Jacobian evaluated at each point is a diagonal matrix with negative entries (see SupplementaryInformation). We also calculate the sliding threshold around the maximally-selective fixed points(see the main text).

    Correlation-based patterns in a two-dimensional network. Here, we presented patternswith the same rates ρ(1) = ρ(2), but different pairwise and third-order correlations in each pattern.We further imposed a lower bound on the system, w ≥ wmin = 0, which in the case of orthogonalrate-based pattern was automatically satisfied. Then the fixed points of maximum selectivity (w∗1, 0)and (0, w∗2) are given by

    w∗1 =

    ∑i pi

    (B

    (i)1

    )11(∑

    i piA(i)11

    ) Pi pi

    “ρ(i)1

    ”2ρ20

    and w∗2 =

    ∑i pi

    (B

    (i)2

    )22(∑

    i piA(i)22

    ) Pi pi

    “ρ(i)2

    ”2ρ20

    [13]

    To find the stability of these fixed points, evaluating the Jacobian is not appropriate because ofthe lower bound on the weights. Denoting the system in Eq. 11 by ẇ = F (w) for stability of eachfixed point w∗, two conditions must be satisfied: (1) ∂Fm(w)∂wm

    ∣∣∣w=w∗

    < 0, and (2) Fj 6=m(w∗) < 0.Numerical simulations show that these conditions are always satisfied for the minimal triplet model(see Supplementary Information). This system can be also generalized to two pools of inputsassuming the weights in each pool evolve together, however, extensions to N correlated patternsare currently only possible with numerical simulations.

    Simulations of correlated spike trains. Higher-order correlations in the spike trains in Fig. 5were simulated by taking the union of an independent spike train and a common spike train (27–29).This approach was used for the generation of spatial correlations, however, for the spatio-temporalcorrelations we implemented the mixture method described by (29). In short, instantaneous corre-lations were generated by taking the union of an independent spike train and a common spike train(as in the case of spatial correlations), and then the generated spikes were shifted by independentand identically distributed random numbers from an appropriate distribution function. We usedan exponential distribution with a time constant τc (see Supplementary Information). Thus, forsimplicity we studied symmetric correlation functions, and assumed uniform correlations for allinput pairs and triplets.

    6

  • Numerical simulations with multiple patterns. For the simulations with rate-based pat-terns in Fig. 3, the inputs within each pattern were given independent Poisson spike trains lackingcorrelations. In each pattern, 90 inputs had rates of 1 Hz and 10 inputs had rates of 50 Hz. The10 inputs with the high rates were always assigned to a set of neighboring neurons (1–10, 11–20,. . ., 91–100), determining the pattern.

    For the simulations with correlation-based patterns in Fig. 5, each of the 100 inputs had thesame firing rate of 10 Hz. In each pattern, 90 inputs were given independent Poisson spikes,and 10 inputs had uniform correlations between any pair and triplet of inputs. For the spatialcorrelations in Fig. 5B, each pair and triplet of inputs shared 90% identical spikes. For the spatio-temporal correlations in Fig. 5C, one half of the 90% shared spikes for each pair and triplet of inputswere shifted by an exponential random distribution with a mean of 5 ms resulting in symmetric,exponentially-decaying correlations with a timescale of 5 ms.

    In Fig. 3 and 5, a new randomly-chosen pattern was presented to the network every 200 ms.Pre- and postsynaptic spikes were simulated stochastically given the respective firing rates. Theinitial weights were set to 1 and hard bounds were set between 0 and 3. The target firing rate wasset to ρ0 = 12 Hz. Postsynaptic activity was low-pass filtered with a time constant of 5 seconds. A+2and A−3 were reduced by a factor 10 compared to the parameters in (26) to give smooth evolutionof the weights.

    Orientation selectivity and receptive field development. In Fig. 4, every 200 ms (cor-responding to a fixation time between saccades), a small patch of 16 × 16 pixels was randomlychosen from ten natural images. The images were taken from the standard benchmark (30) alreadyprewhitened. In order to remove orientation bias, half of the time the patch was transposed, re-flected about the vertical or the horizontal axis. The inputs of the feedforward network consistedof 16 × 16 independent Poisson processes firing at a rate corresponding to the positive gray levelof the pixel (ON inputs) and another 16× 16 OFF inputs firing proportional to the negative graylevel of the pixel. The initial weights were set to 1 and hard bounds were set between 0 and 2. Thetarget firing rate was set to ρ0 = 5 Hz. A+2 and A

    −3 were reduced by a factor 100 compared to the

    parameters in (26). Every 50 s, the norm of the ON weights to the one of the OFF weights wasequalized.

    For Fig. 7, bars of different orientations: vertical, horizontal, two diagonal directions at ±45◦,which could move in one of the two directions perpendicular to the bar’s orientation were presentedto a feedforward network, giving a total of 8 patterns. In the network, an image of 9× 9 pixels cor-responded to the probability of firing of 81 independent Poisson inputs. When a bar was presented,all the 9 inputs corresponding to the bar fired at 50 Hz. The bar stayed at the given position for10 ms and then shifted to the next pixels in one of the two directions (positive velocity for onedirection and negative for the other). A new randomly-chosen patten was presented to the networkevery 200 ms.

    The model for the postsynaptic neuron was the same as described previously, but in additionto an EPSP kernel of 10 ms, an IPSP kernel of 20 ms was also used. This additional IPSP kernel

    7

  • allowed the neuron to be more sensitive to temporal structure of its input (31). The weightsevolved under the minimal triplet rule augmented with a sliding threshold as described above. Alower bound of 0 and an upper bound of 3 was imposed on the weights. Postsynaptic activity waslow-pass filtered with a time constant of 20 seconds.

    Results

    Triplet STDP implies selectivity with rate-based patterns

    Orientation-selective neurons in the primary visual cortex respond strongly when a bar is presentedin a certain orientation and respond much less if the bar is presented in a different orientation (32).This orientation selectivity is learned during receptive field development, and normal patterns ofsensory experience are important for receptive field maturation (33, 34). How is this orientationselectivity, or more generally pattern selectivity learned by a neuronal network? In order to solvethis task, (6) proposed a simple model of synaptic plasticity which is known as the BCM learningrule. In the BCM framework, a randomly chosen input pattern pattern i (of P possible patterns)with rates ρ(i) is presented with probability pi to a feedforward network with N inputs (Fig. 3A).This causes a postsynaptic firing rate ν(i) = wTρ(i), where w is the weight vector. The weightchange induced by the BCM rule is proportional to the input firing rate ρ

    ẇ = φ(ν, ν̄) ρ [14]

    and scales with a non-linear function φ, which depends not only on the postsynaptic firing rateν, but also on the average (over all patterns) of a non-linear function of the postsynaptic rateν̄ =

    ∑i pi(ν(i))2

    . The non-linear function φ must be negative when the postsynaptic firing rate isbelow a given threshold θ – which itself depends on ν̄ – and positive when it is above (Fig. 3B). Inthe case of orthogonal input patterns, (6) showed that the maximally-selective fixed points of thedynamics (for which the postsynaptic neuron shows the largest response when the selected patternis presented) are stable.

    Interestingly, the average weight change under the triplet STDP rule can be written preciselyas the BCM term plus some perturbation terms due to the input correlations

    ẇ = φ(ν) ρ+ ∆Aw +N∑j=1

    (wT∆Bj w

    )ej [15]

    where φ(ν) = W 2 ν + W 3 ν2 is the BCM term (Fig. 3B) with W 2 and W 3 being the overall areaunder the pair-based and triplet STDP rules, respectively. The additional terms, ∆A and ∆B,depend on the pairwise and third-order input statistics integrated over the pair-based and tripletSTDP learning rules, respectively (see Materials and Methods). In order to get depression at lowpostsynaptic firing rate and potentiation for higher firing rate, pairs of spikes must have an overalldepressive effect (W 2 < 0) and triplets of spikes must induce potentiation (W 3 > 0). This is indeed

    8

  • the case for the minimal triplet STDP model that has been proposed by (26) and that we considerhere.

    There are two differences between the original BCM term in Eq. 14 and the term in the tripletmodel in Eq. 15. Firstly, in the triplet model, the function φ depends only on the temporally-averaged postsynaptic activity ν, whereas in the BCM model, φ also depends on the postsynapticactivity averaged over all patterns, ν̄. However, if we redefine the amplitude parameter for pair-based depression A−2 as A

    −2 ν̄/ρ

    p0, where ρ0 is a constant denoting the target rate of the postsynaptic

    neuron (26), then both φ and ∆A in Eq. 15 will depend on ν̄.The second difference is the presence of the two additional terms (∆A and ∆B) in Eq. 15. If

    the inputs are Poisson neurons, we can rewrite Eq. 15 as

    ẇ = (φ(ν, ν̄)1 + Λ) ρ [16]

    where 1 denotes the identity matrix and Λ is a diagonal matrix (see Materials and Methods). Ifwe now assume that the patterns are orthogonal, we can show that the condition ẇ = 0 in Eq. 16gives rise to 2N fixed points, which is consistent with the results of BCM theory. Moreover, the Nmaximally selective fixed points, w∗(n) = (0, . . . , 0, w∗n, 0, . . . , 0), are stable fixed points (Fig. 3D,Eshows an illustration in two dimensions). The sliding threshold of the system around the fixedpoint (Fig. 3B) is given by

    θ(ν̄) = θ0(ν̄)(

    1 +(τ1ρ

    (n)n

    )−1+(τ2ρ

    (n)n

    )−2)−1[17]

    where θ0(ν̄) = A−2 τ−ν̄/(A+3 τ+τyρ

    p0) is the sliding threshold of the BCM term only when the correla-

    tion terms can be neglected (i.e. ∆A = 0 and ∆B = 0) and τ1 and τ2 are time constants dependingon the parameters of the system (see Supplementary Information).

    For non-orthogonal inputs, deriving selectivity analytically is too difficult, so we performednumerical simulations. Ten rate-based non-orthogonal patterns (Fig. 3A) were presented to 100inputs neurons in the feedforward network (Fig. 2A). At the end of the simulation the postsynapticneuron became selective to only one pattern, i.e. only the weights from one set of ten input neurons(61–70) potentiated, while the other inputs depressed (Fig. 3C). The choice of the selected patternis random. Note that standard pair-based STDP also exhibits selectivity of rate-based patterns(12), but the type of competition is of a different nature. As pointed out by (6), the competition inthe BCM rule (and equivalently with triplet STDP) is temporal via the sliding threshold whereaswith a pure Hebbian rule (and equivalently with pair-based STDP), the competition is spatial –between synapses.

    Receptive fields development with natural stimuli

    We have shown that the triplet STDP rule can be mapped to the BCM model in the case of rate-based patterns presented to a feedforward network. Therefore, we expected that the triplet STDP

    9

  • rule would be able to explain the development of localized receptive fields. In Fig. 4 we providedinput consisting of small patches of natural images chosen from a standard benchmark (30) andwhitened with standard preprocessing (35). The weights in the feedforward network evolved underthe triplet STDP rule (Fig. 4A). After learning, the weights were rearranged on the image, showing astable spatial structure resembling a receptive field (Fig. 4B). Different runs of the same simulationprotocol resulted in receptive fields with different locations and orientations (Fig. 4C).

    While classical pair-based STDP has been shown to support selectivity in the case of rate-based patterns (25), the results here suggest that triplet STDP performs a fundamentally differentcomputation than pair-based STDP. The receptive fields shown in Fig. 4 were spatially-localized,which has been previously related to independent component analysis (ICA) (35) and sparse coding(7, 30, 36). This stands in stark contrast to principal component analysis (PCA) of image patches,for example, as implemented with linear neurons and a linear Hebbian rate-based learning rule(37) or with pair-based STDP (data not shown). Unlike most other ICA algorithms (35), however,triplet STDP is supported by experimental plasticity data.

    Triplet STDP elicits selectivity with correlation-based patterns

    So far we have shown that the triplet STDP model can be mapped to the BCM model of rate-based selectivity for independent patterns based on different firing rates, and can perform ICA-likecomputations. While this is useful because it extends the rate-based framework of the BCM modelto a spike-based rule, the triplet learning rule can be seen as a further generalization of the BCMmodel in the following sense: in Eq. 15, ∆A and ∆B depend on the second and third-order inputcorrelations, respectively, therefore, triplet STDP must be sensitive to spatio-temporal correlationsin the inputs.

    In order to examine our hypothesis, we presented 10 ‘correlation-based’ patterns to the feed-forward network in Fig. 2A with 100 inputs. Unlike the rate-based patterns examined so far,correlation-based patterns consisted of the same rates, but had different pairwise and third-ordercorrelations in the inputs (Fig. 5A). Since the firing rates of the inputs in each pattern were thesame, the postsynaptic neuron elicited the same response to each pattern. Therefore, in the case ofcorrelation-based patterns selectivity was defined in terms of the selective potentiation of a group ofinputs which shared nonzero spatio-temporal correlations. Fig. 5B shows a simulation with purelyspatial correlations – the correlation structure between any pair or triplet of inputs was instanta-neous and had no temporal structure. The weights from one pattern potentiated (from a set of teninput neurons with indices 41–50), while the other weights depressed. Increasing the complexityof the correlation structure, in Fig. 5C spatio-temporally correlated inputs were presented to thenetwork. Despite the temporal structure diluting the strength of the input correlations, a set often weights (81–90) characteristic of one pattern potentiated, while the other weights depressed.

    We also found that the map between the BCM rule and the triplet STDP model can be es-tablished even in the case of correlation-based patterns. To see this, in Fig. 5D we illustrate theexpected weight change as a function of the postsynaptic firing rate in the feedforward network

    10

  • receiving spatio-temporal correlation-based patterns. While the additional terms ∆A and ∆Bin Eq. 15 prevent us from deriving an explicit expression for the modification threshold, such athreshold still exists and depends on the correlation strength of the inputs. Example curves fortwo different input correlation strengths are presented in Fig. 5D, demonstrating the dependenceof the effective threshold for potentiation versus depression on correlation strength.

    Due to the increased complexity of the system when correlations exist in the inputs, we derivedthe fixed points of maximal selectivity (w∗1, 0) and (0, w

    ∗2) in a small network of two inputs neurons

    (or two input pools), and analyzed their stability (see Materials and Methods). The correlationterms ∆A and ∆B introduced additional nonlinearities in the weight dynamics in Eq. 15, such thata lower bound had to be introduced to prevent the weights from becoming negative (in agreementwith Dale’s law). For the two-dimensional network, we found the maximally-selective fixed pointsto always be stable. Fig. 5E shows the two-dimensional phase plane, where the two unstable fixedpoints (red symbols) drive the weight trajectories toward the axes where the stable maximally-selective fixed points are located (black symbols). Example weight trajectories are shown in Fig. 5Ffor one choice of initial condition.

    However, for networks with more than two input pools, the correlation structure of the inputscan have a significant effect on the weight dynamics. We extended the simulation in Fig. 5C toexamine how the temporal correlation structure of the inputs influences the selective potentiationof synaptic weights corresponding to different patterns. We studied a particular example of asymmetric spatio-temporal correlation: exponentially-decaying in time, which was the same forall pairs and the same for all triplets of inputs (see Materials and Methods). Therefore, whilepreserving the peak of correlation strength, we examined the role of the correlation structure on theselective potentiation of synaptic weights by studying a single parameter: the correlation timescale(Fig. 5G). Increasing the correlation timescale resulted in a ‘dilution’ of the correlation strength.The triplet STDP rule failed to consistently potentiate the weights of one input pattern and oftentwo or three patterns were simultaneously selected (Fig. 5G). This result suggests that correlationsover broad timescales fail to evoke selective potentiation of correlation-based input patterns, andcould be used to understand the implications of different correlation structure occurring in differentbrain regions.

    Triplet STDP, but not pair-based STDP, can elicit selectivity driven by third-

    order correlations

    We have shown that triplet STDP (because of its link to the BCM model) can do ICA-like com-putations, resulting in the development of spatially-localized receptive fields (Fig. 4). While thiscontrasts classical Hebbian learning rules, including pair-based STDP, we asked whether tripletSTDP can do other computations which pair-based STDP cannot. We have also shown that tripletSTDP can induce selectivity driven by rate-based (Fig. 3) and correlation-based patterns (Fig. 5).Previous work has shown that pair-based STDP can also elicit rate-based input selectivity (25, 38),but it has not been studied how pair-based STDP performs in the case of correlation-based patterns.

    11

  • To answer this question, we designed a simpler selectivity task consisting of two spatial correlation-based patterns presented to a feedforward network of 6 input neurons. The two patterns consistedof the same firing rates, the same pairwise correlations, but differed in the presence or absence ofthird-order correlations (Fig. 6A). The probabilities of presentation of the two patterns were varied,as the weights evolved under triplet STDP (Fig. 6B) and pair-based STDP (Fig. 6C). If p1 denotesthe probability with which the first pattern (with third-order input correlations) was presented tothe neurons in the first group, then the probability that the weights corresponding to this patternwin is shown in Fig. 6B,C. For the static pattern case when p1 = 1 (corresponding to the twopatterns always being presented to the same group of neurons in the network), the triplet STDPrule selectively potentiated the weights of the pattern with third-order correlations. However, thepair-based STDP rule continued selectively potentiating the weights from either inputs group withequal probability.

    In summary, when weight dynamics evolve under pair-based STDP, the rule is not sensitive tothe third-order correlations which distinguish one pattern from another, thus, regardless of p1 therule behaves as if two identical patterns were presented to the network. In contrast, when thenweights evolve under triplet STDP, then the rule almost always selects the pattern with the third-order correlation for p1 = 1. This demonstrates that the triplet STDP rule can distinguish betweeninputs solely based on the higher-order correlation structure, which pair-based STDP ignores. Asa result, triplet STDP will be computationally more powerful in systems where such higher-ordercorrelations have been characterized (20–22).

    Spatio-temporal receptive field development

    Finally, we illustrate the selectivity property of the triplet rule shown analytically, in a biologically-inspired setting from visual cortex receptive field theory. We presented eight different patternsconsisting of four bars of different orientations (horizontal, vertical, and two diagonals tilted at±45◦) each moving in one of the two directions perpendicular to the bar’s orientation, as in Fig. 7A,to a feedforward network. The input neurons in the network were organized in a 9×9 grid mappingto a single postsynaptic neuron as in Fig. 7A. Each input spike produced both an excitatorypostsynaptic potential (EPSP) and an inhibitory postsynaptic potential (IPSP) with a longer timeconstant (see Materials and Methods). This additional IPSP makes the postsynaptic neuron moresensitive to the temporal derivative of the input currents (31). Subjected to learning governed bytriplet STDP, the synapses in the network selectively refined, some depressing to the lower bound ofzero and others potentiating maximally to the upper bound (Fig. 7B). At the end of the simulationthe postsynaptic neuron became selective to one of the bars, as shown in Fig. 7C by the strongweights corresponding to inputs placed in the bar with a vertical orientation.

    In addition to having a spatial structure as given by the orientation selectivity, the receptive fieldin this scenario also developed a temporal structure as a result of the spatio-temporal correlationsimposed by the moving bars. In order to visualize this effect, the weights were frozen after learningand the averaged firing rate of the postsynaptic neuron was recorded as all four bars were re-

    12

  • presented to the network, but at different velocities. The postsynaptic neuron fired the most whenthe bar with vertical orientation was presented at the velocity used during the learning (Fig. 7D).Therefore, the postsynaptic neuron became selective to both orientation and velocity.

    Recently, after training with moving stimuli, (39) observed the emergence of direction selectivityin cortical neurons of visually naive ferrets for the trained directions of motion. Motivated by thisexperimental result, we plotted the time evolution of the averaged firing rate for bars of all fourorientations moving in each direction in the scenario described above. We found that in addition toorientation selectivity, the triplet STDP rule also induced direction selectivity in the postsynapticneuron. Thus, while the postsynaptic neuron had the highest firing rate for the bar in the verticalorientation relative to all other orientations, it had a higher firing rate for the bar moving to theleft (∼ 40 Hz), compared to the firing rate for the bar moving to the right (∼ 30 Hz) at the end ofthe simulation.

    In this scenario the development of direction selectivity relied on two elements. First, tripletSTDP was sensitive to the spatio-temporal input correlations, and second, we used a modifiedpostsynaptic potential kernel which made the postsynaptic neuron sensitive to temporal derivativesof the input current. This last requirement was essential for obtaining direction selectivity becauseeach pattern (bar with a given orientation and direction) activated every input the same numberof times during its presentation. If the postsynaptic potential was only excitatory, then the totalinput to the postsynaptic neuron would be the same regardless of the direction of the stimulation.More robust direction selectivity as observed biologically (32, 40) can most likely be obtained byusing a more elaborate neuron that is strongly dependent on the temporal order, e.g. a neuron withfiring rate adapting properties, or a plastic recurrent network with unidirectional connections alongthe selected direction.

    Discussion

    The biologically-plausible BCM theory is attractive because it exhibits selectivity (6) and performsICA computation (41). Synaptic plasticity, however, is shown to depend on the precise spike timing(9, 10) classically modeled by pair-based STDP (11, 12). In this paper, we claim that a differentspike-based rule, triplet STDP, which is known to describe plasticity experiments accurately (26),exhibits the computational properties of the BCM rule, and also acts as a generalized BCM rulebecause of its sensitivity to spatio-temporal correlations.

    Rate-based patterns. First, we showed that triplet STDP supports pattern selectivity whenthe patterns are determined by firing rates. Consistent with the BCM theory, the maximally-selective fixed points of the weight dynamics under triplet STDP are always stable. Triplet STDPachieved rate-based selectivity even though the equations for the weight dynamics under tripletSTDP contained contributions from the input auto-correlations. In addition, triplet STDP performsICA computation allowing the development of localized receptive fields. This would not be possiblewith pair-based STDP which only performs PCA.

    13

  • Correlation-based patterns. Second, the spiking-based implementation of triplet STDP allowedus to extend the analysis beyond BCM to correlation-based patterns, determined purely by thespatio-temporal input correlations. In this case, we showed that triplet STDP can select for patternswhich consist of higher-order correlations, when classical pair-based STDP fails. The contributionof spike timing correlations has recently been extensively studied in different circuits (20–22, 42).However, even though higher-order statistical dependencies have been observed in many brain areas,their role in the processing of sensory information has been debated (43, 44). Due to its sensitivityto higher-order correlations widely observed in the nervous system, in contrast to earliest modelsof synaptic plasticity (rate-based (6, 45, 46) and pair-based STDP (11, 12)), triplet STDP offers anadditional computational advantage over other learning rules. Although when studying selectivitydriven by input patterns determined purely by the input correlations we could not use the definitionof selectivity as in the BCM theory, we still observed the selective potentiation of synaptic weightscorresponding to input patterns with nonzero correlations. We observed that increasing the inputcorrelation timescale dilutes the correlation strength which prevents the cooperation of inputsnecessary for the emergence of selectivity. Therefore, our results make experimental predictionsabout the type of correlation structure which leads to selectivity.

    Other Models. We showed that triplet STDP is a biologically-plausible learning rule which canperform ICA-like computations (35, 41). Recently, (47) showed that a voltage-based learning ruleconsistent with triplet STDP also supports the development of localized receptive fields. However,the rule was too complex to analyze mathematically. Several other models have addressed the issueof spiking-based implementations of the BCM learning framework. Ref. (48) and (49) established acorrespondence between classical pair-based STDP and BCM under certain assumptions, however,they failed to demonstrate an exact mapping and to derive a modification threshold. Those twomodels, as well as the classical pair-based STDP, are inconsistent with a variety of experimentaldata such as frequency dependence (17, 50), in contrast to triplet STDP. Ref. (51) derived analternate spike-based learning rule designed to maximize the information transmission between anensemble of inputs and the output of a postsynaptic neuron. Though such plasticity rules derivedfrom the infomax principle can generalize the BCM theory to spiking neurons (52, 53) and canbe reduced under some assmptions to the triplet STDP rule (54), the dynamics of these rules arerather complicated to be studied analytically in contrast to the triplet STDP model. The sameproblem arises with a different class of biophysical models (19, 55, 56), which have primarily beenstudied numerically. Therefore, the triplet STDP model is a good trade off: it can reproduce alarge set of electrophysiological data, and yet has a relatively simple formulation allowing analyticaltreatment.

    Extensions. The analytical study presented in this paper uses a simple neuron model. In particu-lar, there is no threshold membrane potential for the occurrence of spikes, nor a spike afterpotential.Several extensions can be performed which include a more realistic non-linear transfer function forthe neuron, inhibitory inputs, the presence of a refractory period or firing rate adaptation. Fur-thermore, the current results were derived for a feedforward network with a single postsynaptic

    14

  • neuron. Analytical calculations of recurrently-connected networks with plastic weights are alreadysophisticated for pair-based STDP (57–61), but with a theory for the simple feedforward networkin place, those calculations could be addressed in the future for the triplet model as well.

    15

  • References

    [1] Bliss, TV, Collingridge, GL (1993) A synaptic model of memory: long-term potentiation inthe hippocampus. Nature 361:31—39.

    [2] Malenka, RC, Nicoll, RA (2009) Long-Term Potentiation–a decade of progress? Science285:1870–1874.

    [3] Dudek, SM, Bear, MF (1992) Homosynaptic long-term depression in area CA1 of hippocampusand effects of N-methyl-D-aspartate receptor blockade. Proc Natl Acad Sci USA 89:4363–4367.

    [4] Artola, A, Bröcher, S, Singer, W (1990) Different voltage-dependent thresholds for inducinglong-term depression and long-term potentiation in slices of rat visual cortex. Nature 347:69–72.

    [5] Bliss, TV, Lomø, T (1973) Long-lasting potentiation of synaptic transmission in the dentatearea of the anaesthetized rabbit following stimulation of the perforant path. J Physiol (Lond)232:331–356.

    [6] Bienenstock, EL, Cooper, LN, Munro, PW (1982) Theory of the development of neuronselectivity: orientation specificity and binocular interaction in visual cortex. J Neurosci 2:32–48.

    [7] Cooper, LN, Intrator, N, Blais, BS, Shouval, HZ (2004) Theory of cortical plasticity (WorldScientific, Singapore).

    [8] Kirkwood, A, Rioult, MC, Bear, MF (1996) Experience-dependent modification of synapticplasticity in visual cortex. Nature 381:526–528.

    [9] Markram, H, Lübke, J, Frotscher, M, Sakmann, B (1997) Regulation of synaptic efficacy bycoincidence of postysnaptic AP and EPSP. Science 275:213–215.

    [10] Bi, GQ, Poo, MM (1998) Synaptic modifications in cultured hippocampal neurons: dependenceon spike timing, synaptic strength, and postsynaptic cell type. J Neurosci 18:10464–10472.

    [11] Gerstner, W, Kempter, R, van Hemmen, JL, Wagner, H (1996) A neuronal learning rule forsub-millisecond temporal coding. Nature 383:76–78.

    [12] Abbott, LF, Nelson, SB (2000) Synaptic plastictiy - taming the beast. Nat Neurosci 3:1178–1183.

    [13] Zhang, LI, Tao, HW, Holt, CE, Harris, WA, Poo, MM (1998) A critical window for cooperationand competition among developing retinotectal synapses. Nature 395:37–44.

    [14] Feldman, DE (2000) Timing-based LTP and LTD at vertical inputs to layer II/III pyramidalcells in rat barrel cortex. Neuron 27:45–56.

    16

  • [15] Pfister, JP, Toyoizumi, T, Barber, D, Gerstner, W (2006) Optimal spike-timing-dependentplasticity for precise action potential firing in supervised learning. Neural Comp 18:1318–1348.

    [16] Pfister, JP, Gerstner, W (2006) in Adv Neur Inf Proc Sys 18, eds Weiss, Y, Schölkopf, B,Platt, J (MIT Press, Cambridge, MA), pp 1083–1090.

    [17] Sjöström, PJ, Turrigiano, GG, Nelson, SB (2001) Rate, timing, and cooperativity jointlydetermine cortical synaptic plasticity. Neuron 32:1149–1164.

    [18] Wang, HX, Gerkin, RC, Nauen, DW, Wang, GQ (2005) Coactivation and timing-dependentintegration of synaptic potentiation and depression. Nat Neurosci 8:187–193.

    [19] Shouval, HZ, Bear, MF, Cooper, LN (2002) A unified model of NMDA receptor-dependentbidirectional synaptic plasticity. Prov Natl Acad Sci USA 99:10831–10836.

    [20] Montani, F et al. (2009) The impact of high-order interactions on the rate of synchronousdischarge and information transmission in somatosensory cortex. Phil Trans R Soc A 367:3297–3310.

    [21] Ohiorhenuan, IE et al. (2010) Sparse coding and high-order correlations in fine-scale corticalnetworks. Nature 466:617–621.

    [22] Pillow, JW et al. (2008) Spatio-temporal correlations and visual signalling in a completeneuronal population. Nature 454:995–999.

    [23] Kempter, R, Gerstner, W, van Hemmen, JL (1999) Hebbian learning and spiking neurons.Phys Rev E 59:4498–4514.

    [24] Gerstner, W, Kistler, WK (2002) Spiking Neuron Models (Cambridge University Press, Cam-bridge UK).

    [25] Song, S, Miller, KD, Abbott, LF (2000) Competitive Hebbian learning through spike-timing-dependent synaptic plasticity. Nat Neurosci 3:919–926.

    [26] Pfister, JP, Gerstner, W (2006) Triplets of spikes in a model of spike timing-dependent plas-ticity. J Neurosci 26:9673–9682.

    [27] Gütig, R, Aharonov, R, Rotter, S, Sompolinsky, H (2003) Learning input correlations throughnon-linear temporally asymmetric Hebbian plasticity. J Neurosci 23:3697–3714.

    [28] Kuhn, A, Aertsen, A, Rotter, S (2003) Higher-order statistics of input ensembles and theresponse of simple model neurons. Neural Comp 15:67–101.

    [29] Brette, R (2009) Generation of correlated spike trains. Neural Comp 21:188–215.

    [30] Olshausen, BA, Field, DJ (1996) Emergence of simple-cell receptive field properties by learninga sparse code for natural images. Nature 381:607–609.

    17

  • [31] Rao, RP, Sejnowski, TJ (2003) Self-organizing neural systems based on predictive learning.Philos Transact A Math Phys Eng Sci 361:1149–75.

    [32] Hubel, DH, Wiesel, TN (1962) Receptive fields, binocular interaction and functional architec-ture in the cat’s visual cortex. J Physiol 160:106–154.

    [33] White, LE, Coppola, DM, Fitzpatrick, D (2001) The contribution of sensory experience to thematuration of orientation selectivity in ferret visual cortex. Nature 411:1049–1052.

    [34] White, LE, Fitzpatrick, D (2007) Vision and cortical map development. Neuron 56:327–338.

    [35] Hyvärinen, A, Karhunen, J, Oja, E (2001) Independent Component Analysis (Wiley, NewYork).

    [36] Blais, BS, Intrator, N, Shouval, H, Cooper, L (1998) Receptive field formation in natural sceneenvironments. comparison of single-cell learning rules. Neural Comp 10:1797–1813.

    [37] Oja, E (1982) A simplified neuron as a principal component analyzer. J Math Biol 15:267–273.

    [38] Song, S, Abbott, LF (2001) Cortical development and remapping through spike timing-dependent plasticity. Neuron 32:339–350.

    [39] Li, Y, Hooser, SDV, Mazurek, M, White, LE, Fitzpatrick, D (2008) Experience with movingvisual stimuli drives the early development of cortical direction selectivity. Nature 456:952–956.

    [40] De Valois, RL, Yund, EW, Hepler, N (1982) The orientation and direction selectivity of cellsin macaque visual cortex. Vision Research 22:531–544.

    [41] Intrator, N, Cooper, LN (1992) Objective function formulation of the BCM theory of visualcortical plasticity: Statistical connections, stability conditions. Neural Networks 5:3–17.

    [42] Luczak, A, P. Barthó, SLM, Buzsáki, G, Harris, KD (2007) Sequential structure of neocorticalspontaneous activity in vivo. Proc Natl Acad Sci USA 104:347–352.

    [43] Shadlen, M, Newsome, W (1998) The variable discharge of cortical neurons: implications forconnectivity, computation and information coding. J Neurosci 18:3870–3896.

    [44] Latham, PE, Nirenberg, S (2005) Synergy, redundancy and independence in population codes,revisited. J Neurosci 25:5195–5206.

    [45] Sejnowski, TJ (1977) Storing covariance with nonlinearly interacting neurons. J Math Biol4:303–321.

    [46] Miller, KD (1994) A model for the development of simple cell receptive fields and the orderedarrangement of orientation columns through activity dependent competition between ON- andOFF-center inputs. J Neurosci 14:409–441.

    18

  • [47] Clopath, C, Büsing, L, Vasilaki, E, Gerstner, W (2010) Connectivity reflects coding: a modelof voltage-based STDP with homeostasis. Nat Neurosci 13:344–352.

    [48] Izhikevich, EM, Desai, NS (2003) Relating STDP to BCM. Neural Comp 15:1511–1523.

    [49] Senn, W, Tsodyks, M, Markram, H (2001) An algorithm for modifying neurotransmitterrelease probability based on pre- and postsynaptic spike timing. Neural Comp 13:35–67.

    [50] Clopath, C, Gerstner, W (2010) Voltage and Spike Timing Interact in STDP - A UnifiedModel. Front Synaptic Neurosci 2:25.

    [51] Toyoizumi, T, Pfister, JP, Aihara, K, Gerstner, W (2005) Generalized Bienenstock-Cooper-Munro rule for spiking neurons that maximizes information transmission. Proc Natl Acad SciUSA 102:5239–5244.

    [52] Toyoizumi, T, Pfister, JP, Aihara, K, Gerstner, W (2005) in Adv Neur Inf Proc Sys 17, edsSaul, L, Weiss, Y, Bottou, L (MIT Press, Cambridge, MA), pp 1409–1416.

    [53] Toyoizumi, T, Pfister, JP, Aihara, K, Gerstner, W (2007) Optimality model of unsupervisedspike-timing dependent plasticity: Synaptic memory and weight distribution. Neural Comp19:639–671.

    [54] Hennequin, G, Gerstner, W, Pfister, JP (2010) STDP in adaptive neurons gives close-to-optimal information transmission. Front Comput Neurosci 4:1–16.

    [55] Castellani, GC, Quinlan, EM, Cooper, LN, Shouval, HZ (2001) A biophysical model of bidi-rectional synaptic plasticity: dependence on AMPA and NMDA receptors. Proc Natl Acad SciUSA 98:12772–12777.

    [56] Shouval, HZ, Castellani, GC, Blais, BS, Yeung, LC, Cooper, LN (2002) Converging evidencefor a simplified biophysical model of synaptic plasticity. Biol Cybern 87:383–391.

    [57] Burkitt, AN, Gilson, M, van Hemmen, JL (2007) Spike-timing-dependent plasticity for neuronswith recurrent connections. Biol Cybern 96:533–546.

    [58] Gilson, M, Burkitt, AN, Grayden, DB, Thomas, DA, van Hemmen, JL (2009) Emergence ofnetwork structure due to spike-timing-dependent plasticity in recurrent neuronal networks. I.Input selectivity–strengthening correlated input pathways. Biol Cybern 101:81–102.

    [59] Gilson, M, Burkitt, AN, Grayden, DB, Thomas, DA, van Hemmen, JL (2009) Emergence ofnetwork structure due to spike-timing-dependent plasticity in recurrent neuronal networks. II.Input selectivity–symmetry breaking. Biol Cybern 101:103–114.

    [60] Gilson, M, Burkitt, AN, Grayden, DB, Thomas, DA, van Hemmen, JL (2009) Emergence ofnetwork structure due to spike-timing-dependent plasticity in recurrent neuronal networks III:Partially connected neurons driven by spontaneous activity. Biol Cybern 101:411–426.

    19

  • [61] Pfister, JP, Tass, PA (2010) STDP in oscillatory recurrent networks: Theoretical conditionsfor desynchronization and applications to deep-brain stimulation. Frontiers Comp Neurosci4:1–10.

    20

  • B

    −20 −10 0 10 20−50

    0

    50

    100

    150

    200

    [ms]

    [%]

    0.1Hz

    20Hz

    50Hz

    A pre

    post

    −50 0 50

    -100

    -50

    0

    50

    100

    [ms]

    =10 ms

    =50 ms

    =100 ms

    [%]

    Figure 1: The triplet STDP rule. A. Synaptic depression is induced as in classical pair-basedSTDP using spike pairs separated by ∆t1 = tpost− tpre < 0. Synaptic potentiation is induced usingtriplets of spikes consisting of two postsynaptic spikes and one presynaptic spike based on timinginterval between them ∆t1 = tpost − tpre > 0 and ∆t2 = tpost − t′post > 0. (See also Fig. 2B for athree-dimenstional representation of the rule.) B. Synaptic change as a function of the time betweenpre- and postsynaptic spikes in a protocol where 60 pairs were presented at different frequencies0.1, 20, and 50 Hz. Depression predominated at low frequency, whereas potentiation was moreprevalent at high frequencies. The data points are experiments performed by (17) and the lines aregenerated with the triplet STDP rule with the parameters taken from (26).

  • B

    -50 0 50

    0

    50

    0 500

    1

    2

    x 105

    0

    3

    6

    x 10-3

    -50 0 50

    0

    50

    0 50

    0

    50

    0 50

    C

    −6

    −3

    0

    3

    6x 10

    −3

    −6

    −3

    0

    3

    6x 10

    −3

    0

    1000

    2000

    0

    3

    6

    x 10-3

    0

    50

    0 50

    E

    0

    1000

    2000D

    . . .

    A

    0

    1

    2

    x 105

    Figure 2: Weight dynamics depend on pairwise and triplet input correlations. A. Themodelling framework consists of a feedforward network of N input spiking neurons connectedthrough the weight vector w = [w1, . . . , wN ]T to a single postsynaptic neuron. B and C. Theweight dynamics in the case of independent Poisson inputs. B. The pairwise contribution consistsof the integral of the pair-based learning window W2 (blue line) and the pre-post correlation vectorKj (red analytics, black numerics). C. The triplet contribution is obtained by multiplying thetriplet learning window W3 (left) with the pre-post-post correlation vector Qj (right). D and E .Same as in B and C, but with correlated inputs.

  • 0 2 4 6

    0

    2

    4

    6

    0 5 100

    2

    4

    6

    C

    time [a.u.]

    E

    time [hr]

    inp

    ut n

    um

    be

    r

    100

    80

    60

    40

    20

    5 10 15 0

    1

    2

    3

    D

    0 50 100

    0

    1

    [Hz]

    θ = 90 Hz

    θ = 65 Hz

    θ = 28 Hz

    B

    ...

    pattern 10pattern 1 pattern 2 pattern 3

    input number

    input

    firing rate

    A

    Figure 3: Triplet STDP implies selectivity with rate-based patterns. A. Ten rate-basedpatterns which differ by their firing rates were presented to a feedforward network. B. ∆w as afunction of postsynaptic activity for three different values of the threshold θ between depressionand potentiation. The inputs were independent Poisson spike trains. Symbols denote numerics andlines analytics. C. Evolution of the weights illustrates selectivity in the case of rate-based patterns.Ten rate-based patterns (no correlations) were presented to a feedforward network as in Fig. 1Awith 100 inputs: 90 inputs with rates 1 Hz and 10 inputs with rates 50 Hz. The 10 inputs withhigh rate in each pattern were neighboring (1–10, 11–20, . . . 91–100). A new randomly-chosenpattern was presented to the network every 200 ms. D. Two-dimensional phase plane analysis forrate-based patterns. Nullclines in green and purple intersect at the equilibria shown in red. E. Anexample trajectory for the two weights attracted to one of the stable nodes in D. (a.u. arbitraryunits)

  • filterOFF weightsON weights

    - =

    C

    A

    time [hr]in

    pu

    t n

    um

    be

    r

    0 2 4

    100

    200

    300

    400

    500

    B

    Figure 4: Receptive fields development with natural images. A small patch was chosenrandomly from the whitened natural images benchmark (30) every 200 ms. The positive level ofgray of the 16× 16 patch pixels corresponded to the firing rate of the 16× 16 Poisson ON inputs,the negative level to the 16×16 OFF inputs. A. The weights of the ON inputs and the OFF inputsrearranged on a 16× 16 grid after learning. The filter, calculated by subtracting the OFF weightsfrom the ON weights, represents an oriented localized receptive field. B. Temporal evolution of theweights. C. Examples for ten different neurons.

  • 0 1 20

    0.5

    1

    1.5

    2

    0 5 100

    0.5

    1

    1.5

    2

    B

    FEtime [hr]

    5 10 15 0

    1

    2

    3

    time [hr]

    input

    num

    ber

    10080604020

    5 10 15

    time [a.u.]

    C

    ...

    pattern 10pattern 1 pattern 2 pattern 3

    input number

    input firing rate

    correlationstrength

    A

    0 50 100

    0

    2

    4D θ = 45 Hzθ = 70 Hz

    [Hz]G

    % ca

    ses o

    ne p

    atte

    rn

    is se

    lecte

    d

    correlation timescale [ms]2 4 6 8 10 150

    20

    40

    60

    80

    100

    Figure 5: Triplet STDP implies selectivity with correlation-based patterns. A. Tencorrelation-based patterns which have the same firing rates, but different correlation strength. B.Evolution of the weights illustrates selectivity in the case of correlation-based patterns. Simulationscenario as in Figure 3B, except that the rate of each of the 100 inputs was set to 10 Hz: 90inputs had no correlations (independent Poisson spike trains) and 10 inputs had strong purelyspatial correlations (90% identical spikes). C. Same as B except for the 10 correlated inputs wherespatio-temporal input correlations with a time constant of 5 ms were used. D. ∆w as a function ofpostsynaptic activity for two different input correlation strengths, changing the effective thresholdfor potentiation. Symbols denote numerics and lines analytics. E. Two-dimensional phase planeanalysis for the spatial correlation-based patterns. Nullclines in green and purple intersect at theunstable fixed points shown in red. Imposing a lower bound at 0 resulted in stable maximally-selective fixed points on the axes shown in black. F. An example trajectory for the two weightsattracted to one of the black equilibria in D. G. Percent cases (out of 100 trials) where all inputsfrom a single pattern selectively potentiated. Correlation-based patterns were used with symmetric-exponentially decaying correlations (inset) whose correlation timescale progressively increased.

  • 1.0

    0.8

    0.6

    0.4

    0.2

    00.6 0.80.70.5 0.9 1.0 0.6 0.80.70.5 0.9 1.0

    B

    pro

    ba

    bili

    ty o

    f g

    rou

    p 1

    win

    nin

    g

    Ctriplet STDP pair STDP

    pattern 1 pattern 2

    A

    input

    firing rate

    third-order

    correlation

    strength

    pairwise

    Figure 6: Triplet STDP, and not pair-based STDP, can distinguish between patternsdetermined by third-order correlations. A. Two groups of inputs each consisting of threeneurons were presented, which had the same rates, and the same pairwise spatial correlations. Theneurons in group 1 had spatial third-order correlations as strong the pairwise correlations, whilethe neurons in group 2 lacked any third-order correlations. The probability of presenting pattern 1to group 1 was denoted by p1. B. Input selectivity under triplet STDP. C. Input selectivity underpair-based STDP. Mean ± s.e.m. were computed from 200 simulations in each case.

  • time [min]

    inp

    ut n

    um

    be

    r

    0 10 20 30

    20

    40

    60

    80

    0

    1

    2

    3

    0

    1

    2

    3

    time [min]

    0 10 20 3010

    20

    30

    [Hz]

    A B C

    D

    x

    y

    x

    y

    E

    −20 0 2010

    20

    30

    40

    pres time [ms]

    firin

    g r

    ate

    [H

    z]

    Figure 7: Triplet STDP leads to spatio-temporal receptive field development. A. Fourdifferent bars (horizontal, vertical, and the two diagonals on a 9 × 9 pixels image) were presentedas inputs to a feedforward network with a single postsynaptic neuron, each bar can moving in oneof two directions, giving a total of eight patterns. B. Time evolution of the 81 synaptic weights.C. Final weights re-ordered in a grid corresponding to the input location. D. After learning, theweights were frozen during a testing phase. The four moving bars were presented again to thenetwork with different velocities (positive and negative giving the two directions of motion) whilethe firing rate was measured (averaged over 500 seconds). The training velocity corresponded to apresentation time of −10 ms. E. Time evolution of the firing rate of the postsynaptic neuron forthe eight different patterns. Time axes was cut in bins of 200 seconds to compute a robust averageof the firing rate.

  • Supplementary Text 1

    Supplementary text for the manuscript “ A triplet model of spike-timing dependent plas-ticity generalizes the BCM rule to spatio-temporal inputs” by J. Gjorgjieva, C. Clopath,J. Audet and J.-P. Pfister

    Derivation of the correlation functions

    We derive the pre-post correlation vector K from Eqn. 5 using pair-based STDP. Given the meanpostsynaptic firing rate ν = g(u0) and using the superscript � to denote a convolution with theEPSP kernel �(s) (see the main text), the pre-post correlation vector K in index notation is

    Kj(s) =1T

    ∫ T0〈y(t)xj(t− s)〉 dt (S1)

    =1T

    ∫ T0〈g(u(t))xj(t− s)〉dt (S2)

    =1T

    ∫ T0

    (g(u0) ρj(t− s) + g′(u0) 〈u(t)xj(t− s)〉 − g′(u0)u0 ρj(t− s)

    )dt (S3)

    = g(u0) ρj + g′(u0)(

    1T

    ∫ T0〈u(t)xj(t− s)〉 dt− u0 ρj

    )(S4)

    =(g(u0)− g′(u0)u0

    )ρj + g′(u0)

    ∑k

    wkC�kj(s) (S5)

    = g′(u0)∑k

    wk C�kj(s) + α(u0) ρj (S6)

    where α(u) = g(u)− g′(u)u. This is Eqn. 3 in index notation.For the post-pre-post correlation tensor Q, we ignore the case when the two postynaptic

    spikes overlap (s2 = 0) because it is accounted in the contribution from the pair rule. Then Qof Eqn. 4 in index notation becomes

    Qj(s1, s2) =1T

    ∫ T0〈y(t− s2)xj(t− s1) y(t)〉 dt (S7)

    =1T

    ∫ T0〈g(u(t− s2))xj(t− s1) g(u(t))〉 dt (S8)

    = g2(u0) ρj − (g′(u0))2 u20 ρj − 2g′(u0)(g(u0)− g′(u0)u0

    )u0 ρj

    + g′(u0)(g(u0)− g′(u0)u0

    ) 1T

    ∫ T0

    (〈xj(t− s1)u(t)〉+ 〈u(t− s2)xj(t− s1)〉) dt

    + (g′(u0))21T

    ∫ T0〈u(t)u(t− s2)xj(t− s1)〉 dt (S9)

  • Supplementary Text 2

    Using the expression for α above and letting β(u) = g′(u) (g(u)− g′(u)u) we get

    Qj(s1, s2) = β(u0) ρj − α(u0) g′(u0) (2u0 ρj)

    + α(u0) g′(u0)(

    1T

    ∫ T0〈xj(t− s1)u(t)〉 dt+

    1T

    ∫ T0〈u(t− s2)xj(t− s1)〉 dt

    )+ (g′(u0))2

    (1T

    ∫ T0〈u(t)u(t− s2)xj(t− s1)〉 dt

    ). (S10)

    Now using the expression for u0 =∑

    j wj ρj , we get

    Qj(s1, s2) = β(u0) ρj + α(u0) g′(u0)∑k

    wk(C�kj(s1) + C

    �kj(s1 − s2)− 2u0 ρj

    )+ (g′(u0))2

    ∑k,n

    wk wn U�kjn(s1, s2). (S11)

    where U �j (s1, s2) =∫∞0

    ∫∞0 �(r) �(q)Uj(s1 − r, s2 − r + q) dr dq. Eqn. S11 corresponds to Eqn. 4

    in the main text but in index notation. Note that this third-order correlation function U hasatomic discontinuities, and we can write

    Ukjn(s1, s2) = U◦kjn(s1, s2) + δkj δjn δ(s2 − s1) δ(s1) ρk(t)+ δkn δ(s2)Ckj(s1) + δjn δ(s2 − s1)Ckj(s1) + δkj δ(s1)Ckn(s2) (S12)

    where U◦kjn(s1, s2) is the third-order correlation without atomic discontinuities and is equal toρkρjρn if the inputs are Poisson.

    Selectivity with N orthogonal rate-based input patterns

    We show that the maximally selective fixed points are stable points of the triplet STDP rule inthe case of independent Poisson inputs. We use the minimal triplet model (A+2 = 0) such that∆A = 0. By using Eqn. S11 and S12, (∆B(i)j )kn can be written as(

    ∆B(i)j)kn

    =(b1 δkj ρ

    (i)n + b2 δkn ρ

    (i)k + b3 δkj δkn

    )ρ(i)j (S13)

    whereb1 =

    ∫ ∞0

    ∫ ∞0

    W3(s1, s2) (�(s1) + �(s2 − s1)) ds1 ds2, (S14)

    b2 =∫ ∞

    0

    ∫ ∞0

    W3(s1, s2)∫ ∞

    0�(r) �(r − s2) dr ds1 ds2, (S15)

    andb3 =

    ∫ ∞0

    ∫ ∞0

    W3(s1, s2) �(s1) �(s2 − s1) ds1 ds2. (S16)

    In the presence of P Poisson patterns, we can write the expected weight dynamics as

    ẇ =P∑i=1

    pi

    (φ(ν(i), ν̄)1 + Λ(i)

    )ρ(i), (S17)

  • Supplementary Text 3

    whereφ(ν(i), ν̄) = W 2

    ν

    ρ20ν(i) +W 3

    (ν(i))2, (S18)

    1 is the identity matrix, and Λ(i) is a diagonal matrix where the jth diagonal element is

    Λ(i)jj =(b1wj ν

    (i) + b2 ν(i)2 + b3w

    2j

    )ρ(i)j (S19)

    with ν(i)2 =∑

    k w2k ρ

    (i)k . Recall from the main text that W 2 =

    ∫ 0−∞ W2(s) ds and W 3 =∫∞

    0

    ∫∞0 W3(s) ds1 ds2; also ν̄ =

    ∑m pm

    (ν(m)

    )2with ν(m) = wTρ(m).

    If we further assume that the input patterns are orthogonal (and N = P ), then the conditionẇ = 0 implies that φ(ν(i), ν̄(i)) + Λ(i)ii = 0, for all i = 1, . . . , N and therefore

    w2i F(ρ(i)i

    )−Gρ(i)i wi ν̄ = 0 ∀i = 1, . . . , N (S20)

    where

    G = −W 2ρ20

    > 0 and F(ρ(i)i

    )= W 3

    (ρ(i)i

    )2+ (b1 + b2) ρ

    (i)i + b3. (S21)

    Each fixed point of Eqn. S17 must satisfy the N conditions from Eqn. S20. Each condition hastwo solutions: either w∗i = 0 or w

    ∗i = Gρ

    (i)i /F

    (ρ(i)i

    ). As a consequence, there are 2N fixed

    points, which is consistent with the BCM theory.It remains to be shown that the maximally selective fixed points are stable. The nth fixed

    point is given byw∗(n) = (0, . . . , 0, w∗n, 0, . . . , 0)

    T

    where

    w∗n =F(ρ(n)n

    )Gpn

    (ρ(n)n

    )3 takes the nth position.To demonstrate that this fixed point is stable, we have to show that the eigenvalues of theJacobian of Eqn. S17 are negative when evaluated at w∗(n). We find that this Jacobian matrixis given by

    Jij

    ((w∗(n)

    )= −δij piG

    (ρ(i)i

    )2ν̄(n) (S22)

    where ν̄(n) = pnw2n(ρ(n)n

    )2. Since this matrix is diagonal with all diagonal elements being

    negative, we know that all the maximally selective fixed points are stable.Note that around this fixed point, the sliding threshold takes the value

    θ = w∗(n)ρ(n)n = θ0(ν̄)(

    1 +(τ1ρ

    (n)n

    )−1+(τ2ρ

    (n)n

    )−2)−1(S23)

    where we recall form the main text that θ0(ν̄) = A−2 τ−ν̄/(A+3 τ+τyρ

    p0) is the sliding threshold of

    the BCM term only when the correlation terms can be neglected (i.e. ∆A = 0 and ∆B = 0) andthe new timecales are

    τ1 = W 3/(b1 + b2) and τ2 =√W 3/b3. (S24)

  • Supplementary Text 4

    One can calculate these values and obtain

    W 3 = A+3 τ+τy, (S25)

    b1 + b2 =1

    τm + τ++

    τ+(τm + τ+)(τ+ + τy)

    +1

    2(τm + τy), (S26)

    andb3 =

    τ+(τm + 2τ+)(τ+τy + τmτ+ + τmτy)

    . (S27)

    Selectivity with two correlation-based input patterns

    We derive the fixed points of the two-dimensional system of Eqn. 9 for P = 2 patterns withrates ρ(i), pairwise correlations A(i) and third-order correlations B(i) presented to the networkwith probabilities p1 and p2, respectively

    ẇ =∑i=1,2

    pi

    A(i)w∑i=1,2 pi (wTρ(i))2ρ20

    +∑k

    wTB(i)k w ek

    . (S28)To obtain the general fixed points of this equation, we have to simultaneosly solve ẇ1 = 0 andẇ2 = 0. This amounts to solving two cubic equations for which we does not give nice analyticalexpressions. Thus, we looked for fixed points of maximum selectivity (w∗1, 0) and (0, w

    ∗2) with

    w∗1, w∗2 > 0 (ignoring the origin as the trivial fixed point).

    To find a fixed point on the w1 axis, we solve ẇ1 = 0 at w1 = w∗1 and w2 = 0

    0 = p1

    [A

    (1)11 w1

    p1(wTρ(1)

    )2+ p2

    (wTρ(2)

    )2ρ20

    +(B

    (1)1

    )11w21

    ]

    + p2

    [A

    (2)11 w1

    p1(wTρ(1)

    )2+ p2

    (wTρ(2)

    )2ρ20

    +(B

    (2)1

    )11w21

    ](S29)

    which gives the following linear equation

    (p1A

    (1)11 + p2A

    (2)11

    ) p1 (ρ(1)1 )2 + p2 (ρ(2)1 )2ρ20

    w∗1 +(p1

    (B

    (1)1

    )11

    + p2(B

    (2)1

    )11

    )= 0. (S30)

    This equation has the solution

    w∗1 = −p1

    (B

    (1)1

    )11

    + p2(B

    (2)1

    )11

    ρ−20

    (p1A

    (1)11 + p2A

    (2)11

    )p1

    (ρ(1)1

    )2+ p2

    (ρ(2)1

    )2 (S31)which is given in Eqn. 13.

  • Supplementary Text 5

    Instead of examining the Jacobian of the system at the fixed points to obtain their stability,because of the nonlinearity due to the lower bound on the weights, two conditions must besatisfied for stability:

    ∂F1(w)∂w1

    ∣∣∣w=w∗

    < 0, (S32)

    F2(w∗) < 0. (S33)

    The first condition becomes

    3(p1A

    (1)11 + p2A

    (2)11

    ) p1 (ρ(1)1 )2 + p2 (ρ(2)1 )2ρ20

    w∗1 + 2(p1

    (B

    (1)1

    )11

    + p2(B

    (2)1

    )11

    )< 0. (S34)

    If we use the expression for w∗1 from Eqn. S31, then the condition reduces to

    p1

    (B

    (1)1

    )11

    + p2(B

    (2)1

    )11> 0 (S35)

    which is always true since the correlation terms convolved with the triplet rule in B(i)k are alwayspositive.

    The second condition becomes

    (p1A

    (1)21 + p2A

    (2)21

    ) p1 (ρ(1)1 )2 + p2 (ρ(2)1 )2ρ20

    w∗1 +(p1

    (B

    (1)2

    )11

    + p2(B

    (2)2

    )11

    )< 0. (S36)

    If we use the expression for w∗1 from Eqn. S31, the condition reduces to

    p1A(1)21 + p2A

    (2)21

    p1A(1)11 + p2A

    (2)11

    >p1

    (B

    (1)2

    )11

    + p2(B

    (2)2

    )11

    p1

    (B

    (1)1

    )11

    + p2(B

    (2)1

    )11

    . (S37)

    Similarly, the fixed point on the w2 axis is stable if the following condition holds

    p1A(1)12 + p2A

    (2)12

    p1A(1)22 + p2A

    (2)22

    >p1

    (B

    (1)1

    )22

    + p2(B

    (2)1

    )22

    p1

    (B

    (1)2

    )11

    + p2(B

    (2)2

    )22

    . (S38)

    Numerical evaluations of these two conditions for a variety of firing rates, pairwise and third-order correlations demonstrated that there conditions always hold. Therefore, in the case of atwo-dimenstional network, the system always results in selectivity. As we show in the main text,this is not always the case for a general N -dimensional system, where selectivity depends on theinput correlation structure.

  • Supplementary Text 6

    Simulations of correlated spike trains.

    Higher-order correlations in the spike trains in Figure 5 were simulated by taking the union of anindependent spike train and a common spike train (Gütig et al., 2003; Kuhn et al., 2003; Brette,2009). Applying this method resulted in spike trains with the same rates and same higher-ordercorrelations, with the cross-correlations being delta functions.

    This approach was used for the generation of spatial correlations, however, for the spatio-temporal correlations the mixture method described by Brette (2009) was used. In this case,instantaneous correlations were generated by taking the union of an independent spike trainand a common spike train (as in the case of spatial correlations), and then the generated spikeswere shifted by independent and identically distributed random numbers from an appropriatedistribution function. We used an exponential distribution with a time constant τc, resulting inpairwise correlations equal to

    Cij(s) = γij/(2τc) e−|s|/τc + ρi ρj (S39)

    and a triplet correlations

    Ukjn(s1, s2) = Vkjn(s1, s2) + ρk Cjn(s2 − s1) + ρj Ckn(s2) + ρnCkj(s1)− ρk ρj ρn (S40)

    where Vkjn(s1, s2) is given by

    Vkjn(s1, s2) =βkjn3τ2c

    e−(s1+s2)/τc , s1 ≥ 0, s2 ≥ 0e(2s1−s2)/τc , s1 < 0, s2 > s1e(−s1+2s2)/τc , s1 > s2, s2 < 0.

    (S41)

    Thus, for simplicity we studied symmetric correlation functions, and assumed uniform correla-tions for all inputs pairs and triplets, i.e. the correlation strength was regulated by the coefficientsγij = γ and βkjn = β.

    References

    Brette, R. (2009). Generation of correlated spike trains. Neural Comp, 21:188–215.

    Gütig, R., Aharonov, R., Rotter, S., and Sompolinsky, H. (2003). Learning input correlationsthrough non-linear temporally asymmetric Hebbian plasticity. J Neurosci, 23:3697–3714.

    Kuhn, A., Aertsen, A., and Rotter, S. (2003). Higher-order statistics of input ensembles andthe response of simple model neurons. Neural Comp, 15:67–101.