2
POSTER PRESENTATION Open Access Stochastic gradient ascent learning with spike timing dependent plasticity Joana Vieira * , Orlando Arévalo, Klaus Pawelzik From Twentieth Annual Computational Neuroscience Meeting: CNS*2011 Stockholm, Sweden. 23-28 July 2011 Stochastic gradient ascent learning exploits correlations of parameter variations with overall success of a system. This algorithmic idea has been related to neuronal net- work learning by postulating eligibility traces at synapses, which make them selectable for synaptic changes depending on later reward signals ([1] and [2]). Formalizations of the synaptic and neuronal dynamics supporting gradient ascent learning in terms of differen- tial equations exhibit strong similarities with a recent formulation of spike timing dependent plasticity (STDP) [3] when it is combined with a reward signal. Here we present conditions under which reward modulated STDP is in fact guaranteed to maximize expected reward. We present numerical simulations underlining the relevance of realistic STDP models for reward dependent learning. In particular, we find that the non- linear adaptation to pre- and post-synaptic activities of STDP [3] contributes to stable learning. * Correspondence: [email protected] Institute for Theoretical Physics, University of Bremen, Bremen, D-28359, Germany Figure 1 Learning the XOR function with a reward modulated STDP rule. Left: Output activity versus training episode in a feed forward network with Poisson-like neurons (2 input nodes, 10 hidden nodes and 1 output node). The output activity for the [true, false] and [false, true] inputs becomes stronger, while the output for the [true, true] and [false, false] inputs becomes weak after training. Right: Accumulated administered reward for the four input patterns versus training episode. Vieira et al. BMC Neuroscience 2011, 12(Suppl 1):P250 http://www.biomedcentral.com/1471-2202/12/S1/P250 © 2011 Vieira et al; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Stochastic gradient ascent learning with spike timing dependent plasticity

Embed Size (px)

Citation preview

POSTER PRESENTATION Open Access

Stochastic gradient ascent learning with spiketiming dependent plasticityJoana Vieira*, Orlando Arévalo, Klaus Pawelzik

From Twentieth Annual Computational Neuroscience Meeting: CNS*2011Stockholm, Sweden. 23-28 July 2011

Stochastic gradient ascent learning exploits correlationsof parameter variations with overall success of a system.This algorithmic idea has been related to neuronal net-work learning by postulating eligibility traces atsynapses, which make them selectable for synapticchanges depending on later reward signals ([1] and [2]).Formalizations of the synaptic and neuronal dynamicssupporting gradient ascent learning in terms of differen-tial equations exhibit strong similarities with a recent

formulation of spike timing dependent plasticity (STDP)[3] when it is combined with a reward signal. Here wepresent conditions under which reward modulatedSTDP is in fact guaranteed to maximize expectedreward. We present numerical simulations underliningthe relevance of realistic STDP models for rewarddependent learning. In particular, we find that the non-linear adaptation to pre- and post-synaptic activities ofSTDP [3] contributes to stable learning.

* Correspondence: [email protected] for Theoretical Physics, University of Bremen, Bremen, D-28359,Germany

Figure 1 Learning the XOR function with a reward modulated STDP rule. Left: Output activity versus training episode in a feed forward networkwith Poisson-like neurons (2 input nodes, 10 hidden nodes and 1 output node). The output activity for the [true, false] and [false, true] inputsbecomes stronger, while the output for the [true, true] and [false, false] inputs becomes weak after training. Right: Accumulated administeredreward for the four input patterns versus training episode.

Vieira et al. BMC Neuroscience 2011, 12(Suppl 1):P250http://www.biomedcentral.com/1471-2202/12/S1/P250

© 2011 Vieira et al; licensee BioMed Central Ltd. This is an open access article distributed under the terms of the Creative CommonsAttribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction inany medium, provided the original work is properly cited.

Published: 18 July 2011

References1. Sebastian Seung H: Learning in Spiking Neural Networks by

Reinforcement of Stochastic Synaptic Transmission. Neuron 2003,40(6):1063-1073.

2. Xiaohui Xie, Sebastian Seung H: Learning in neural networks byreinforcement of irregular spiking. Phys Rev E 2004, 69(4):041909-1-041909-10.

3. Schmiedt Joscha T, Christian Albers, Klaus Pawelzik: Spike timing-dependent plasticity as dynamic filter. In Advances in Neural InformationProcessing Systems 23 J. Lafferty and C. K. I. Williams and J. Shawe-Taylorand R.S. Zemel and A. Culotta 2010, 2110-2118.

doi:10.1186/1471-2202-12-S1-P250Cite this article as: Vieira et al.: Stochastic gradient ascent learning withspike timing dependent plasticity. BMC Neuroscience 2011 12(Suppl 1):P250.

Submit your next manuscript to BioMed Centraland take full advantage of:

• Convenient online submission

• Thorough peer review

• No space constraints or color figure charges

• Immediate publication on acceptance

• Inclusion in PubMed, CAS, Scopus and Google Scholar

• Research which is freely available for redistribution

Submit your manuscript at www.biomedcentral.com/submit

Vieira et al. BMC Neuroscience 2011, 12(Suppl 1):P250http://www.biomedcentral.com/1471-2202/12/S1/P250

Page 2 of 2