A spin-glass-like Lyapunov function for a neurotrophic model of neuronal development

Abstract. We derive a spin-glass-like energy or Lyapunov function for our previously studied neurotrophic model of anatomical synaptic plasticity and neuronal development. This function is then used in Monte-Carlo simulations of the model applied to the development of ocular dominance columns. We discuss the relationship between our model and other models, and speculate on the implications of underlying spin glass structures in many models of neuronal development, learning and plasticity.

1 Introduction

Understanding the mechanisms underlying activity-dependent, competitive interactions in the developing vertebrate nervous system is a major goal of developmental neuroscience. These interactions are widely believed, for example, to lead to the development of ocular dominance (OD) columns (ODCs) in the primary visual cortex of many mammals, including cats and Old World monkeys (Hubel and Wiesel 1962; LeVay et al. 1978, 1980). Recent experimental data implicate neurotrophic factors (NTFs), particularly the neurotrophin gene family of NTFs, in activity-dependent competition in the developing visual cortex (reviewed in McAllister et al. 1999).

In previous work, we have built, partially analysed and simulated a model of anatomical synaptic plasticity based on activity-dependent competition for retrograde neurotrophic support, and later extended it to include simultaneous anatomical and physiological plasticity (Elliott and Shadbolt 1998a,b, 1999; Elliott et al. 2001). Here we extend our earlier analysis by deriving an energy or Lyapunov function for our neurotrophic model and use this function in Monte-Carlo simulations of our model. This energy function is exact and globally valid for two afferents. For more than two afferents, it is approximate and Lyapunov, and valid only around the fixed points of the model. The form of the energy or Lyapunov function is that of a spin glass.

The plan for this paper is as follows. In Sect. 2, we first state without derivation our neurotrophic model of anatomical synaptic plasticity. After transforming variables, we derive the energy function for two afferents and then, for more than two afferents, go on to generalise this function and show that it is Lyapunov in most regions of parameter space. In Sect. 3, we present the results of Monte-Carlo simulations of our neurotrophic model of synaptic plasticity formulated in terms of energy minimisation. In Sect. 4, we discuss the relationship between the energy function formulation of our neurotrophic model and other models of ODC formation. Finally, in Sect. 5, we speculate on the implications of the uncovering of spin glass-like structures underlying models of neuronal development.

2 Construction of an energy function

Let letters such as $i$ and $j$ label presynaptic or afferent cells and letters such as $x$ and $y$ label postsynaptic or target cells. The number of synapses projected by afferent cell $i$ to target cell $x$ is denoted by $s_{xi}$, and the activity of afferent cell $i$ is $a_i \in [0, 1]$. Then the fundamental equation governing the evolution of $s_{xi}$ in our neurotrophic model is

$$\dot{s}_{xi} = \epsilon s_{xi} \left[ \frac{(a + a_i)}{\sum_j s_{xj}(a + a_j)} \sum_y \Delta_{xy} \left( T_0 + T_1 \frac{\sum_j s_{yj} a_j}{\sum_j s_{yj}} \right) - 1 \right], \tag{1}$$

where a dot over a quantity denotes the (total) time derivative. A derivation, justification and discussion of the assumptions underlying this equation can be found elsewhere (Elliott and Shadbolt 1998a). The two parameters $T_0$ and $T_1$ denote, respectively, the activity-independent and activity-dependent release of NTFs by target cells, while the parameter $a$ defines a resting afferent NTF uptake capacity; $\epsilon$ is an overall "learning rate" inversely proportional to a time scale for averaging NTF availability (Elliott and Shadbolt 1998a). In previous work, we have regarded $\Delta_{xy}$ as a function characterising the diffusion of NTFs between target cells $x$ and $y$. Here, for increased generality, we shall regard $\Delta_{xy}$ as arising from a plexus of lateral connections between target cells, with excitatory lateral synapses ($\Delta_{xy} > 0$) elevating NTF release and inhibitory lateral synapses ($\Delta_{xy} < 0$) decreasing NTF release. We will continue to assume, without much loss of generality, that $\sum_y \Delta_{xy} = 1 \; \forall x$ for convenience. We assume, furthermore, that the matrix $\Delta$ is symmetric, so that $\Delta_{xy} = \Delta_{yx} \; \forall x, y$.

Biol. Cybern. 86, 473–481 (2002). DOI 10.1007/s00422-002-0313-6. © Springer-Verlag 2002

Terry Elliott

Department of Electronics and Computer Science, University of Southampton, Highfield, Southampton SO17 1BJ, United Kingdom

Received: 1 October 2001 / Accepted in revised form: 15 January 2002

Correspondence to: e-mail: [email protected], Tel.: +44-23-80596000, Fax: +44-23-80593313

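We introduce a more useful set of variables via the definitions

$$s_x^+ = \sum_j s_{xj}, \tag{2}$$

$$v_{xi} = s_{xi}/s_x^+, \tag{3}$$

so that the $v_{xi}$ are independent of the overall synaptic scale $s_x^+$, with $v_{xi} \in [0, 1]$ and, by definition, $\sum_j v_{xj} \equiv 1 \; \forall x$. Equation (1) then becomes the two equations

$$s_x^+ \dot{v}_{xi} = \epsilon T_1 v_{xi} \left[ \frac{\sum_j (a_i - a_j) v_{xj}}{\sum_j (a + a_j) v_{xj}} \right] \sum_{y,j} \Delta_{xy} (ac + a_j) v_{yj}, \tag{4}$$

$$\dot{s}_x^+ + \epsilon s_x^+ = \epsilon T_1 \sum_{y,j} \Delta_{xy} (ac + a_j) v_{yj}, \tag{5}$$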

where $c = T_0/(aT_1)$. To facilitate further analysis, we average (4) and (5) over the ensemble of afferent activity patterns. Because of the non-linearities in the $a_i$ in (4), we assume, for tractability, that for $n$ distinct afferents, $i = 1, \ldots, n$, there are just $n$ distinct activity patterns. Activity pattern number $i$ is defined by $a_i = 1$, and $a_j = p \; \forall j \neq i$, with $p \in [0, 1]$; for two afferents, these patterns are reasonably general. If $\mu$ is the average activity of an afferent, so that $n\mu = 1 + (n - 1)p$, and dropping for notational convenience the $\langle\,\rangle$ brackets around variables that conventionally indicate ensemble averaging, after some algebra we obtain

$$n s_x^+ \dot{v}_{xi} = \epsilon T_1 v_{xi} \left[ a(c - 1) \left( \sum_j \frac{1 + r\delta_{ij}}{1 + r v_{xj}} - n \right) + (1 - p) \sum_j \frac{1 + r\delta_{ij}}{1 + r v_{xj}} \sum_y \Delta_{xy} (v_{yj} - v_{xj}) \right], \tag{6}$$

and

$$\dot{s}_x^+ + \epsilon s_x^+ = \epsilon T_1 (ac + \mu), \tag{7}$$

where $\delta_{ij}$ is the Kronecker delta ($\delta_{ij} = 1$ if, and only if, $i = j$; and zero otherwise), and $r = (1 - p)/(a + p)$.
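The averaging step can be checked numerically. The sketch below is an illustration only, not the code used for the paper: it evaluates the right-hand side of Eq. (1) for each of the $n$ activity patterns, averages, converts the result to $\dot{v}_{xi} = (\dot{s}_{xi} - v_{xi}\dot{s}_x^+)/s_x^+$, and compares it with the closed form in Eq. (6). The function names and the nested-list layout `s[x][i]`, `delta[x][y]` are assumptions made for this example, and the rows of `delta` are assumed to sum to one.

```python
def rate_eq1(s, act, delta, a, T0, T1, eps):
    """Right-hand side of Eq. (1) for synapse counts s[x][i] and activities act[i]."""
    cells, n = range(len(s)), range(len(act))
    # NTF released by each target cell y: T0 + T1 * (activity-weighted input fraction)
    release = [T0 + T1 * sum(s[y][j] * act[j] for j in n) / sum(s[y]) for y in cells]
    rates = []
    for x in cells:
        supply = sum(delta[x][y] * release[y] for y in cells)
        uptake = sum(s[x][j] * (a + act[j]) for j in n)
        rates.append([eps * s[x][i] * ((a + act[i]) / uptake * supply - 1.0) for i in n])
    return rates


def averaged_vdot(s, delta, a, T0, T1, eps, p):
    """Average Eq. (1) over the n activity patterns, then convert ds/dt to dv/dt."""
    cells, n = len(s), len(s[0])
    mean = [[0.0] * n for _ in range(cells)]
    for k in range(n):
        act = [1.0 if j == k else p for j in range(n)]  # pattern k: a_k = 1, others p
        rates = rate_eq1(s, act, delta, a, T0, T1, eps)
        for x in range(cells):
            for i in range(n):
                mean[x][i] += rates[x][i] / n
    out = []
    for x in range(cells):
        s_plus, total = sum(s[x]), sum(mean[x])
        out.append([(mean[x][i] - s[x][i] / s_plus * total) / s_plus for i in range(n)])
    return out


def vdot_eq6(s, delta, a, T0, T1, eps, p):
    """dv_xi/dt from the closed form in Eq. (6)."""
    cells, n = len(s), len(s[0])
    c = T0 / (a * T1)
    r = (1.0 - p) / (a + p)
    v = [[sxi / sum(row) for sxi in row] for row in s]
    out = []
    for x in range(cells):
        row = []
        for i in range(n):
            f = [(1.0 + (r if j == i else 0.0)) / (1.0 + r * v[x][j]) for j in range(n)]
            lateral = [sum(delta[x][y] * (v[y][j] - v[x][j]) for y in range(cells))
                       for j in range(n)]
            bracket = (a * (c - 1.0) * (sum(f) - n)
                       + (1.0 - p) * sum(fj * lj for fj, lj in zip(f, lateral)))
            row.append(eps * T1 * v[x][i] * bracket / (n * sum(s[x])))
        out.append(row)
    return out
```

For any positive `s` and any symmetric `delta` with unit row sums, the two routes should agree to rounding error, and the components of $\dot{v}_{xi}$ should sum to zero over $i$ on each target cell.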

We now seek a Lyapunov or energy function $E$ such that $\dot{E} \le 0$ always, or at least $\dot{E} \le 0$ in the neighbourhood of the fixed points of (6) and (7). We can find an exact form for $E$ for two afferents, and can prove that this form generalises for more than two afferents in most regions of parameter space. We consider the two cases separately.

2.1 Two afferents

For two afferents we write $v_x = 2v_{xi} - 1 \in [-1, +1]$ for any one of the two afferents, $i$. The variable $v_x$ indicates whether afferent $i$'s control of target $x$ dominates ($v_x > 0$), whether the other afferent's control dominates ($v_x < 0$), or whether both have equal control ($v_x = 0$). Equations (6) and (7) then become

$$s_x^+ \dot{v}_x = \epsilon T_1 r^2 \frac{1 - v_x^2}{(2 + r)^2 - r^2 v_x^2} \sum_y \hat{\Delta}_{xy} v_y, \tag{8}$$

$$\dot{s}_x^+ = \epsilon \left[ T_1 (ac + \mu) - s_x^+ \right], \tag{9}$$

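where

$$\hat{\Delta}_{xy} = \left[ a(1 - c) - (1 - p) \left( \frac{1}{n} + \frac{1}{r} \right) \right] \delta_{xy} + (1 - p) \left( \frac{1}{n} + \frac{1}{r} \right) \Delta_{xy} \tag{10}$$

with $n = 2$, which can be rewritten in the more transparent form

$$\hat{\Delta}_{xy} = (a + \mu) \Delta_{xy} - (ac + \mu) \delta_{xy}. \tag{11}$$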
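As a hedged illustration of how the modified interaction matrix can be built, the sketch below forms $\hat{\Delta}$ in two ways: from the compact form $(a + \mu)\Delta_{xy} - (ac + \mu)\delta_{xy}$, and from the longer form with coefficient $(1 - p)(1/n + 1/r)$, where $r = (1 - p)/(a + p)$ and $n\mu = 1 + (n - 1)p$. The plus sign inside $(1/n + 1/r)$ is the reading assumed here because it is the one that makes the two forms agree; the function names are invented for this example.

```python
def mean_activity(n, p):
    """Average afferent activity mu, from n * mu = 1 + (n - 1) * p."""
    return (1.0 + (n - 1) * p) / n


def delta_hat_compact(delta, a, c, p, n):
    """Modified interaction matrix: (a + mu) * Delta - (a*c + mu) * identity."""
    mu = mean_activity(n, p)
    size = range(len(delta))
    return [[(a + mu) * delta[x][y] - (a * c + mu) * (1.0 if x == y else 0.0)
             for y in size] for x in size]


def delta_hat_long(delta, a, c, p, n):
    """Same matrix written with the coefficient (1 - p) * (1/n + 1/r)."""
    r = (1.0 - p) / (a + p)  # needs p < 1 so that r is non-zero
    k = (1.0 - p) * (1.0 / n + 1.0 / r)
    size = range(len(delta))
    return [[(a * (1.0 - c) - k) * (1.0 if x == y else 0.0) + k * delta[x][y]
             for y in size] for x in size]
```

Since $(1 - p)/r = a + p$ and $(1 - p)/n = \mu - p$, the coefficient $k$ reduces to $a + \mu$ for any $n$, so the two functions should return the same matrix. The off-diagonal entries of $\hat{\Delta}$ therefore keep the sign of the corresponding entries of $\Delta$.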

Equation (8) manifestly possesses fixed points at $v_x = \pm 1$ and, for $\Delta_{xy} = \delta_{xy}$, at $v_x = 0$, with the stability of all three fixed points reversing as the sign of $c - 1$ reverses. All relevant quantities in (8) are positive semi-definite, except for $\sum_y \hat{\Delta}_{xy} v_y$, so the sign of $\dot{v}_x$ depends only on the sign of this convolution. Thus, to construct an energy function $E$, it suffices to require that

$$\frac{\partial E}{\partial v_x} = -\sum_y \hat{\Delta}_{xy} v_y, \tag{12}$$

$$\frac{\partial E}{\partial s_x^+} = -\left[ T_1 (ac + \mu) - s_x^+ \right], \tag{13}$$

which integrate to give

$$E = -\frac{1}{2} \sum_{x,y} v_x \hat{\Delta}_{xy} v_y + \frac{1}{2} \sum_x \left[ T_1 (ac + \mu) - s_x^+ \right]^2. \tag{14}$$

This energy satisfies $\dot{E} \le 0$ everywhere and hence is globally valid rather than locally valid only around the fixed points.

The solution of (9), and therefore the minimisation of the $s_x^+$-dependent part of $E$, is trivial, with $s_x^+ = T_1 (ac + \mu) \; \forall x$ being stable always. The $s_x^+$ dynamics are therefore uninteresting, and although the value of $s_x^+$ affects the evolution of $v_x$ in (8), this does not affect the solutions of $\dot{v}_x = 0$. We may therefore safely ignore the $s_x^+$ dynamics, with the model's competitive dynamics residing entirely in the $v_x$ dynamics. Hence, we may take $E$ just to be

$$E = -\frac{1}{2} \sum_{x,y} v_x \hat{\Delta}_{xy} v_y. \tag{15}$$

The minima of $E$ on the space $[-1, +1]^s$, where $s$ is the number of target cells, correspond to the fixed points of (8). If the matrix $\hat{\Delta}$ is negative definite, then the only minimum of $E$ corresponds to $v_x = 0 \; \forall x$, an unsegregated state of equal control of each target cell by both afferent cells. If $\hat{\Delta}$ is positive definite, then some state in which $v_x = \pm 1 \; \forall x$ will be a minimum; in general, there will be many such states.
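The descent property can be illustrated with a small numerical experiment. The sketch below is an assumption-laden toy, not the paper's simulation code: it takes the two-afferent dynamics in the form $s_x^+ \dot{v}_x = \epsilon T_1 r^2 (1 - v_x^2)/[(2 + r)^2 - r^2 v_x^2] \sum_y \hat{\Delta}_{xy} v_y$, holds $s_x^+$ fixed, integrates with forward Euler steps, and records $E = -\frac{1}{2} \sum_{x,y} v_x \hat{\Delta}_{xy} v_y$ along the way. The step size, the fixed $s_x^+$ and the function names are all choices made for this example.

```python
def energy(v, dhat):
    """E = -1/2 * sum_{x,y} v_x * dhat[x][y] * v_y."""
    cells = range(len(v))
    return -0.5 * sum(v[x] * dhat[x][y] * v[y] for x in cells for y in cells)


def vdot(v, dhat, r, eps_T1=1.0, s_plus=1.0):
    """Two-afferent rate of change of v_x, with s_x^+ held at s_plus for every cell."""
    cells = range(len(v))
    out = []
    for x in cells:
        field = sum(dhat[x][y] * v[y] for y in cells)
        gain = r * r * (1.0 - v[x] ** 2) / ((2.0 + r) ** 2 - r * r * v[x] ** 2)
        out.append(eps_T1 * gain * field / s_plus)
    return out


def descend(v, dhat, r, dt=0.1, steps=5000):
    """Forward-Euler integration; returns the final state and the energy trace."""
    v = list(v)
    trace = [energy(v, dhat)]
    for _ in range(steps):
        rate = vdot(v, dhat, r)
        # Clip to [-1, +1] in case a finite step overshoots the boundary.
        v = [min(1.0, max(-1.0, vx + dt * rx)) for vx, rx in zip(v, rate)]
        trace.append(energy(v, dhat))
    return v, trace
```

With, for instance, `dhat = [[0.05, 0.45], [0.45, 0.05]]`, `r = 1.0` and a small positive initial state, the energy trace should be non-increasing (up to rounding) while both target cells move towards control by the same afferent. The energy is also unchanged under $v_x \to -v_x$ for all $x$.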

2.2 More than two afferents

We now extend our results for two afferents to more than two afferents. We construct an approximate form for $E$ by generalising the two-afferent result, which we show is Lyapunov around the unsegregated fixed point and Lyapunov for most choices of parameters around the segregated fixed points.

For two afferents, we have that $E = -\frac{1}{2} \sum_{x,y} v_x \hat{\Delta}_{xy} v_y$. Since $v_x = 2v_{xi} - 1$ for any one of the two afferents $i$, by discarding constants and irrelevant overall multipliers, we can without loss of generality replace $E$ by

$$E = -\frac{1}{2} \sum_{i,x,y} v_{xi} \hat{\Delta}_{xy} v_{yi}, \tag{16}$$

where the sum in $i$ extends over $i = 1$ and $i = 2$ for two afferents. We conjecture, justify and then prove that the generalisation of (16) to more than two afferents consists merely of extending the sum in $i$ over all the afferents.
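The constant and the overall multiplier that are discarded in passing from the $v_x$ form to the per-afferent form can be made explicit. Writing $v_{x1} = (1 + v_x)/2$ and $v_{x2} = (1 - v_x)/2$ gives $\sum_i v_{xi} v_{yi} = (1 + v_x v_y)/2$, so the per-afferent energy equals half the $v_x$ energy minus $\frac{1}{4} \sum_{x,y} \hat{\Delta}_{xy}$. The sketch below, with function names invented for this example, checks that identity numerically.

```python
def energy_single(v, dhat):
    """Two-afferent energy in the v_x variables: -1/2 * sum v_x dhat_xy v_y."""
    cells = range(len(v))
    return -0.5 * sum(v[x] * dhat[x][y] * v[y] for x in cells for y in cells)


def energy_per_afferent(frac, dhat):
    """Per-afferent energy: -1/2 * sum over i, x, y of frac[x][i] dhat_xy frac[y][i]."""
    cells = range(len(frac))
    afferents = range(len(frac[0]))
    return -0.5 * sum(frac[x][i] * dhat[x][y] * frac[y][i]
                      for i in afferents for x in cells for y in cells)


def to_fractions(v):
    """Map v_x in [-1, +1] to the pair (v_x1, v_x2) with v_x1 + v_x2 = 1."""
    return [[(1.0 + vx) / 2.0, (1.0 - vx) / 2.0] for vx in v]
```

Because the two energies differ only by a positive factor and an additive constant, they have the same minima.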

Our first justification for this generalisation follows from taking a "natural" approximation of (6). We assume that, on any given target cell $x$ and for each afferent $i$, the evolution of that afferent's synapses, $s_{xi}$, does not depend on the precise dynamics of all the other afferents' synapses, $s_{xj} \; \forall j \neq i$, but rather on a "mean synaptic field" generated by all the other afferents. Thus, in (6), we replace each occurrence of $v_{xj}$, $j \neq i$, by $\bar{v}_x$, where $\bar{v}_x$ represents a mean synaptic number for all the $n - 1$ afferents $j \neq i$, so that $v_{xi} + (n - 1)\bar{v}_x = 1$. The result of this approximation is effectively to reduce the system, for each afferent, to a two-afferent system. With this approximation, (6) becomes

$$s_x^+ \dot{v}_{xi} = \epsilon T_1 r^2 \frac{v_{xi} \bar{v}_x}{(1 + r v_{xi})(1 + r \bar{v}_x)} \left[ a(c - 1) \frac{1}{n} + \sum_y \hat{\Delta}_{xy} v_{yi} \right] \tag{17}$$

where $\hat{\Delta}_{xy}$ is as in (10) without the restriction that $n = 2$. For each afferent $i$, we then require that

$$\frac{\partial E}{\partial v_{xi}} = a(1 - c) \frac{1}{n} - \sum_y \hat{\Delta}_{xy} v_{yi} \quad \forall x, \tag{18}$$

giving us the required generalisation of the two-afferent energy function, after discarding the constant term $a(1 - c) \frac{1}{n} \sum_{x,i} v_{xi} = a(1 - c) s/n$. If we expand about this approximation by writing $v_{xj} = \bar{v}_x + \delta v_{xj} \; \forall j \neq i$ and requiring that $\sum_{j \neq i} \delta v_{xj} = 0$ in order that $\sum_j v_{xj} = 1$, then we find that the linear terms in $\delta v_{xj} \; \forall j \neq i$ vanish in the equation for $\dot{v}_{xi}$, so that $E$ is, in fact, exact to $O(\delta v)$ in this approximation.

A second justification follows from two observations. Considering only one target cell $x$ (so that, in effect, $\Delta_{xy} = \delta_{xy}$), when all but two of the $v_{xi}$ are zero, (16) is exact. Similarly, the first justification above shows that when $v_{xj} = \bar{v}_x \; \forall j \neq i$, (16) is also exact. Thus, (16) is pairwise exact, meaning that on any line connecting any pair of fixed points in the space $\sum_j v_{xj} = 1 \; \forall x$, (16) is exact.

We now prove that (16) is Lyapunov for most choices of parameters around the segregated and unsegregated fixed points. We consider the case $\Delta_{xy} = \delta_{xy}$ separately from general $\Delta_{xy}$.

2.2.1 The $\Delta_{xy} = \delta_{xy}$ case. In this case, target cells decouple, so for notational convenience we consider only one target cell and drop the $x$ subscript.

First consider the fixed point $v_i = 1$ for some $i$, and $v_j = 0 \; \forall j \neq i$, corresponding to a segregated state. We expand about this point by writing $v_j = 0 + \delta v_j \; \forall j \neq i$, and necessarily $\delta v_j \ge 0 \; \forall j \neq i$. Then, since $\sum_j v_j = 1$, we must have $v_i = 1 - \sum_{j \neq i} \delta v_j$. Then, $\forall j \neq i$, (6) becomes

$$n s^+ \delta\dot{v}_j = \epsilon T_1 a(c - 1) \frac{r^2}{1 + r} \delta v_j + O(\delta v^2). \tag{19}$$

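We have that $\dot{E} = a(c - 1) \sum_j \dot{v}_j v_j$, from which we obtain

$$\dot{E} = -\epsilon T_1 \frac{1}{n s^+} a^2 (c - 1)^2 \frac{r^2}{1 + r} \sum_{j \neq i} \delta v_j + O(\delta v^2). \tag{20}$$

Thus, to $O(\delta v)$, as $\delta v_j \ge 0 \; \forall j \neq i$, we have that $\dot{E} \le 0$ and hence $E$ is Lyapunov at the segregated fixed points.

We now consider the unsegregated fixed point, $v_j = \frac{1}{n} + \delta v_j \; \forall j$, with $\sum_j \delta v_j = 0$, but with no restriction on the sign of the $\delta v_j$'s. We then obtain

$$s^+ \delta\dot{v}_j = -\epsilon T_1 a(c - 1) \frac{r^2}{(n + r)^2} \delta v_j + O(\delta v^2) \tag{21}$$

and hence

$$\dot{E} = -\epsilon T_1 \frac{1}{s^+} a^2 (c - 1)^2 \frac{r^2}{(n + r)^2} \sum_j \delta v_j^2 + O(\delta v^3). \tag{22}$$

To $O(\delta v^2)$, we have that $\dot{E} \le 0$ and so $E$ is Lyapunov around the unsegregated fixed point.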

In this limit of no lateral interactions between target cells, the form for $E$ in (16) for any $n$ is Lyapunov around all $n$ segregated fixed points and around the unsegregated fixed point. Because $E$ is also pairwise exact, as defined above, we are justified in analytically continuing $E$ over the entire space in which $\sum_j v_j = 1$. We can, in fact, obtain an exact energy function when $\Delta_{xy} = \delta_{xy}$ for general $n$, by introducing the variables $X_j = 1/(1 + r v_j) \; \forall j$ in (6), so that $E$ satisfies $\dot{E} \le 0$ everywhere without the need for analytic continuation. However, the form of this exact energy function does not lend itself to ready generalisation to arbitrary $\Delta_{xy}$ and hence possesses little utility.
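The two linearisations for a single target cell can be checked numerically. For $\Delta_{xy} = \delta_{xy}$ the lateral term in the pattern-averaged dynamics drops out, leaving $n s^+ \dot{v}_i = \epsilon T_1 v_i\, a(c - 1) \left[ \sum_j (1 + r\delta_{ij})/(1 + r v_j) - n \right]$. The sketch below evaluates this expression near each kind of fixed point and compares it with the expected linear coefficients, $a(c - 1) r^2/(1 + r)$ near a segregated state and $-n\,a(c - 1) r^2/(n + r)^2$ near the unsegregated state (both for the scaled quantity $n s^+ \dot{v}/(\epsilon T_1)$). The function names and parameter values are illustrative assumptions.

```python
def scaled_vdot(v, a, c, p):
    """n * s^+ * dv_i/dt / (eps * T1) for one target cell (no lateral coupling)."""
    n = len(v)
    r = (1.0 - p) / (a + p)
    out = []
    for i in range(n):
        total = sum((1.0 + (r if j == i else 0.0)) / (1.0 + r * v[j]) for j in range(n))
        out.append(v[i] * a * (c - 1.0) * (total - n))
    return out


def segregated_coefficient(a, c, p):
    """Expected growth coefficient of a small intruding afferent near v_i = 1."""
    r = (1.0 - p) / (a + p)
    return a * (c - 1.0) * r * r / (1.0 + r)


def unsegregated_coefficient(a, c, p, n):
    """Expected coefficient of a small perturbation about v_j = 1/n."""
    r = (1.0 - p) / (a + p)
    return -n * a * (c - 1.0) * r * r / (n + r) ** 2
```

For $c > 1$ the first coefficient is positive and the second negative, so small intrusions into a segregated state grow while perturbations of the unsegregated state decay; the signs reverse for $c < 1$.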


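2.2.2 The general $\Delta_{xy}$ case. We consider first the unsegregated fixed point $v_{xi} = \frac{1}{n} \; \forall x, i$, and expand about it as usual, so that $v_{xi} = \frac{1}{n} + \delta v_{xi}$, where $\sum_j \delta v_{xj} = 0 \; \forall x$, with no restriction on the sign of the $\delta v_{xi}$'s. After some algebra, we find that

$$s_x^+ \delta\dot{v}_{xi} = \epsilon T_1 \frac{r^2}{(n + r)^2} \sum_y \hat{\Delta}_{xy} \delta v_{yi} + O(\delta v^2), \tag{23}$$

from which we obtain

$$\dot{E} = -\epsilon T_1 \frac{r^2}{(n + r)^2} \sum_{x,i} \frac{1}{s_x^+} \left( \sum_y \hat{\Delta}_{xy} \delta v_{yi} \right)^2 + O(\delta v^3). \tag{24}$$

Thus, to $O(\delta v^2)$, $\dot{E} \le 0$ and hence $E$ is Lyapunov around the unsegregated fixed point.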

We now turn to the segregated fixed points. Because of the large number of segregated fixed points and the coupling between target cells induced by a general $\Delta_{xy}$, a general characterisation and consideration of the properties of all such fixed points is intractable. We therefore restrict our attention to a specific class of segregated fixed points. First we suppose that the number of afferent cells, $n$, equals the number of target cells, $s$. We then consider the fixed point corresponding to a state of perfect topographic mapping between a line of afferent cells and a line of target cells, defined by $v_{xi} = \delta_{xi}$; i.e. afferent cell $i$ projects to target cell $x$ if, and only if, $i = x$, and expand about this point by writing $v_{xi} = \delta_{xi} + \delta v_{xi}$, where $\sum_j \delta v_{xj} = 0 \; \forall x$. Notationally, it is convenient for the purposes of this analysis to label afferent cells by letters such as $x$ and $y$ instead of $i$ and $j$. Thus, we actually write $v_{xy} = \delta_{xy} + \delta v_{xy}$, with $\sum_y \delta v_{xy} = 0$, where, as usual, the first index refers to a target cell and the second to an afferent cell. This notation has the advantage that $\delta_{xx}$ is manifestly unity without our having parenthetically to state that $i = x$. The expansion $v_{xy} = \delta_{xy} + \delta v_{xy}$ subject to $\sum_y \delta v_{xy} = 0$ is achieved by writing $v_{xy} = 0 + \delta v_{xy}$ with $\delta v_{xy} > 0$ for $x \neq y$, and $v_{xx} = 1 - \sum_{y \neq x} \delta v_{xy}$; i.e. $\delta v_{xx} = -\sum_{y \neq x} \delta v_{xy}$. After some algebra, for $y \neq x$ we obtain

$$n s_x^+ \delta\dot{v}_{xy} = \epsilon T_1 \frac{r}{1 + r} \delta v_{xy} \left\{ a(c - 1) r + (1 - p) \left[ 1 - \Delta_{xx} + (1 + r) \Delta_{xy} \right] \right\} + O(\delta v^2) \tag{25}$$

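and $\delta\dot{v}_{xx}$ is obtained using $\delta\dot{v}_{xx} = -\sum_{y \neq x} \delta\dot{v}_{xy}$. We then have that

$$\dot{E} = -\epsilon T_1 \frac{r^2}{1 + r} \frac{1}{n} \sum_x \frac{1}{s_x^+} \sum_{y \neq x} \delta v_{xy} A_{xy} B_{xy} + O(\delta v^2), \tag{26}$$

where

$$A_{xy} = ac + p - (a + p) \Delta_{xx} + (a + 1) \Delta_{xy}, \tag{27}$$

$$B_{xy} = ac + \mu - (a + \mu) \Delta_{xx} + (a + \mu) \Delta_{xy}. \tag{28}$$

Because the $\delta v_{xy}$, $\forall x, y \neq x$, are independent variables, for $\dot{E}$ to be guaranteed to be negative semi-definite, we must have that $A_{xy} B_{xy} \ge 0 \; \forall x, y \neq x$.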

Consider first a specific choice for the lateral interaction matrix. Suppose that all non-diagonal entries are identical, so that $\Delta_{xy} = d \; \forall x, y \neq x$ for some value of $d$. Then the requirement that $\sum_y \Delta_{xy} = 1 \; \forall x$ translates into the requirement that $\Delta_{xx} = 1 - (s - 1)d \; \forall x$. For $s = 2$, this form of the interaction matrix is completely general. For this form of matrix, we find that $A_{xy} \equiv B_{xy} \; \forall x, y \neq x$, and hence $\dot{E} \le 0$ and so $E$ is Lyapunov. In some sense, this choice of interaction matrix is analogous to our choice of the afferent activity patterns over which we averaged the basic equation of our model: for each pattern (each matrix row or column) all afferents (targets) are treated identically except for one cell.
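The identity $A_{xy} \equiv B_{xy}$ for a uniform off-diagonal matrix is easy to confirm numerically. The sketch below assumes $A_{xy} = ac + p - (a + p)\Delta_{xx} + (a + 1)\Delta_{xy}$ and $B_{xy} = ac + \mu - (a + \mu)\Delta_{xx} + (a + \mu)\Delta_{xy}$, with $n = s$ and $n\mu = 1 + (n - 1)p$; under these definitions both reduce to $a(c - 1) + n(a + \mu)d$. The function names are invented for this example.

```python
def coefficient_a(dxx, dxy, a, c, p):
    """A_xy = a*c + p - (a + p) * Delta_xx + (a + 1) * Delta_xy."""
    return a * c + p - (a + p) * dxx + (a + 1.0) * dxy


def coefficient_b(dxx, dxy, a, c, mu):
    """B_xy = a*c + mu - (a + mu) * Delta_xx + (a + mu) * Delta_xy."""
    return a * c + mu - (a + mu) * dxx + (a + mu) * dxy


def uniform_matrix_coefficients(d, cells, a, c, p):
    """A and B when every off-diagonal entry is d and each row sums to one."""
    mu = (1.0 + (cells - 1) * p) / cells  # n = s afferents
    dxx = 1.0 - (cells - 1) * d
    return coefficient_a(dxx, d, a, c, p), coefficient_b(dxx, d, a, c, mu)
```

Since the two coefficients coincide, their product is a perfect square and cannot be negative, whatever the sign of $d$.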

In general, however, $A_{xy} B_{xy}$, as a function of $\Delta_{xx}$ and $\Delta_{xy}$ (which can be independent for $n > 2$), can be negative, and hence $E$ is not guaranteed to be Lyapunov. In the $\Delta_{xx}$–$\Delta_{xy}$ plane, the function $A_{xy} B_{xy}$, for $p \neq 1$, has a saddle at the point

$$\Delta_{xx} = 1 + \frac{a(1 - c)(\mu - 1)}{(1 - p)(a + \mu)}, \tag{29}$$

$$\Delta_{xy} = \frac{a(1 - c)(\mu - p)}{(1 - p)(a + \mu)}, \tag{30}$$

and has value zero along the lines $A_{xy} = 0$ and $B_{xy} = 0$. The normals to these two lines are in the directions of the vectors $(-1, 1 + r)^{\mathrm{T}}$ and $(-1, 1)^{\mathrm{T}}$, and the angle, $\theta$, between them, defining the region in which $A_{xy} B_{xy} < 0$, is given by

$$\cos\theta = \frac{1}{\sqrt{1 + \left( \dfrac{1 - p}{2a + 1 + p} \right)^2}}. \tag{31}$$

For $a = 1$, a typical value used in most of our earlier work (Elliott and Shadbolt 1998a,b, 1999), the largest value of $\theta$ is given by $\cos\theta = 3/\sqrt{10}$, so that the possibly negative, non-Lyapunov region occupies slightly over 10% of the $\Delta_{xx}$–$\Delta_{xy}$ plane. Later work has required taking $a$ to be larger, $a \approx 10$ (Elliott and Shadbolt 2002). In this case we can approximate (31), to obtain $\theta \approx (1 - p)/(2a + 1 + p)$. For $a = 10$, the possibly negative, non-Lyapunov region amounts to only 1.5% of the $\Delta_{xx}$–$\Delta_{xy}$ plane. Hence, for reasonable choices of the parameters $a$ and $p$, it is easy to select the elements of the interaction matrix, $\Delta_{xx}$ and $\Delta_{xy}$, such that $A_{xy} B_{xy} > 0$. In most regions of parameter space, we therefore have $\dot{E} \le 0$, and hence $E$ is Lyapunov here.
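The quoted percentages can be reproduced with a few lines of arithmetic. The sketch assumes that the region in which $A_{xy} B_{xy} < 0$ consists of the two opposite wedges of opening angle $\theta$ between the lines $A_{xy} = 0$ and $B_{xy} = 0$, so that it covers a fraction $2\theta/2\pi = \theta/\pi$ of the plane, with $\tan\theta = (1 - p)/(2a + 1 + p)$, which is equivalent to the expression for $\cos\theta$ above. The function names are illustrative.

```python
import math


def wedge_angle(a, p):
    """Angle theta between the lines A = 0 and B = 0, from tan(theta) = (1-p)/(2a+1+p)."""
    return math.atan((1.0 - p) / (2.0 * a + 1.0 + p))


def non_lyapunov_fraction(a, p):
    """Fraction of the plane in which A * B < 0: two opposite wedges of angle theta."""
    return wedge_angle(a, p) / math.pi


if __name__ == "__main__":
    for a in (1.0, 10.0):
        # p = 0 gives the largest angle for a given a.
        print(a, math.cos(wedge_angle(a, 0.0)), non_lyapunov_fraction(a, 0.0))
```

The angle is largest at $p = 0$ and shrinks to zero as $p \to 1$, so the fractions computed at $p = 0$ are worst cases for each $a$.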

2.3 Further considerations

In Sect. 2.1 we have shown that for two afferents, we can derive an energy function $E$ of the form given in (15) or (16) that satisfies $\dot{E} \le 0$ everywhere. For $n > 2$ we have shown that the form in (16) remains valid, in as much as it can be proved to be Lyapunov in the neighbourhood of all segregated and unsegregated fixed points for $\Delta_{xy} = \delta_{xy}$. Furthermore, for general $\Delta_{xy}$, the form in (16) is Lyapunov around the unsegregated fixed point, and Lyapunov for most choices of parameters ("pseudo-Lyapunov") in the neighbourhood of a class of segregated fixed points whose characterisation and analysis is tractable. When the target-cell interaction matrix is assumed to be of restricted form, with all non-diagonal entries equal, $E$ becomes fully Lyapunov at these segregated fixed points. The form of $E$ for more than two afferents is further justified by its being derivable from a controlled approximation to (6), and, moreover, being pairwise exact, as defined above, along any line connecting any pair of fixed points.

For $n = 2$, and restricting $v_x$ to lie in the set $v_x \in \{-1, +1\}$, (15) is, of course, just the Hamiltonian of a spin-glass system, where the "spin" of a target cell indicates whether one afferent ($v_x = +1$) or the other ($v_x = -1$) controls the cell. The modified lateral interaction matrix, $\hat{\Delta}$, determines whether it is energetically favourable to align ($\Delta_{xy} > 0$, "cooperative" interactions between target cells $x$ and $y$) or anti-align ($\Delta_{xy} < 0$, "competitive" interactions) the "spins" of target cells $x$ and $y$. If the unmodified lateral interaction matrix $\Delta$ takes the standard Mexican hat form of local excitation with an inhibitory surround, then it is energetically favourable to form patterns of control reminiscent of ODCs. Provided that $\Delta$ is symmetric, we can assume that the distribution of states follows the Boltzmann distribution and use standard Monte-Carlo techniques to minimise $E$. To allow the possibility of unsegregated states, such as those that arise when $c > 1$, we can allow the variables $v_x$ to take values from the set $\{-1, 0, +1\}$, permitting transitions only between neighbouring states (note that finer-grained levels of control can be admitted by allowing additional states). For $n > 2$ and with the restriction that $v_{xi} \in \{0, 1\} \; \forall x, i$, (16) is the Hamiltonian of a generalised spin glass, what we might term a "Potts glass". In this case, the variable $v_{xi}$ (or, equivalently, $2v_{xi} - 1$) indicates whether afferent $i$ controls target cell $x$ ($v_{xi} = 1$) or whether some other afferent, $j \neq i$, controls it ($v_{xi} = 0$). Again, we may assume that the distribution of states follows the Boltzmann distribution and use Monte-Carlo techniques to minimise $E$.

3 Monte-Carlo simulations

We now turn to Monte-Carlo simulations of our neurotrophic model formulated as an energy minimisation problem, and consider the development of ODCs. For simplicity, we restrict to two afferents: one afferent representing the left eye and the other the right eye. (Strictly, of course, these cells represent cells in the lateral geniculate nucleus receiving input from the two retinae.) We admit, for computational convenience, only the states $v_x = +1$ and $v_x = -1$. We consider a $40 \times 40$ square array of primary visual cortical cells. Periodic boundary conditions are imposed on this array to avoid edge effects. Because we assume that the target-cell interaction matrix is symmetric, we can assume that the distribution of states of connectivity follows the Boltzmann distribution, and hence the probability of accepting a "spin flip" (change in OD) that induces a change in energy of the system, $\Delta E$, is given by

$$\frac{1}{1 + \exp(\Delta E / T)}, \tag{32}$$

where $T$ is the (computational) temperature. One possible interpretation of $T$ is that it is related to the reliability of synaptic transmission, and hence to the maturity of synapses. Less-mature synapses tend to be less reliable and more noisy in transmission, while more-mature synapses tend to be more reliable and less noisy. Larger $T$ gives more noise in the decision process, and so can be thought of as correlating with less-mature synapses, while smaller $T$ gives less noise in the decision process, thus correlating with more-mature synapses.
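A single Monte-Carlo update can be sketched as follows. This is a minimal illustration under stated assumptions rather than the simulation code behind the figures: spins take the values $\pm 1$, the coupling matrix `dhat` is symmetric, the energy is $E = -\frac{1}{2} \sum_{x,y} v_x \hat{\Delta}_{xy} v_y$, and a proposed flip is accepted with probability $1/[1 + \exp(\Delta E/T)]$. For a flip of spin $x$ the energy change is $\Delta E = 2 v_x \sum_{y \neq x} \hat{\Delta}_{xy} v_y$, since the diagonal term is unchanged when $v_x^2 = 1$. The function names and the random site selection are choices made for this example.

```python
import math
import random


def energy(v, dhat):
    """E = -1/2 * sum_{x,y} v_x * dhat[x][y] * v_y."""
    cells = range(len(v))
    return -0.5 * sum(v[x] * dhat[x][y] * v[y] for x in cells for y in cells)


def accept_probability(dE, T):
    """Probability 1 / (1 + exp(dE / T)) of accepting a flip, for T > 0."""
    z = dE / T
    if z > 700.0:  # exp would overflow; the probability is zero to machine precision
        return 0.0
    return 1.0 / (1.0 + math.exp(z))


def flip_energy_change(v, dhat, x):
    """Energy change on flipping spin x, assuming dhat is symmetric and v_x = +/-1."""
    field = sum(dhat[x][y] * v[y] for y in range(len(v)) if y != x)
    return 2.0 * v[x] * field


def sweep(v, dhat, T, rng):
    """Propose len(v) flips at randomly chosen sites; returns the updated list."""
    for _ in range(len(v)):
        x = rng.randrange(len(v))
        if rng.random() < accept_probability(flip_energy_change(v, dhat, x), T):
            v[x] = -v[x]
    return v
```

At large $T$ every flip is accepted with probability close to one half, whereas as $T \to 0$ only flips that lower the energy are accepted, which is the quenched limit.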

In Fig. 1 we present three OD maps for three different choices of target-cell interaction functions. These functions have a uniform excitatory core of radius $r$ surrounded by a region of uniform inhibition of width $r$. (Here $r$ is a length on the cortical array, not the ratio $r = (1 - p)/(a + p)$ of Sect. 2.) The parameter $r$ gives an indication of the extent of the range of lateral interactions between target cells. In Fig. 1A, B and C we set, respectively, $r = 2$, $r = 3$ and $r = 4$. These maps are generated by quenched simulations in which $T$ is set from the start of the simulations to be very close to zero. We see, as expected, that ODC width increases as $r$ increases.

Fig. 1A–C. Ocular dominance (OD) maps generated for three different choices of the target-cell interaction function in quenched simulations: A $r = 2$; B $r = 3$; C $r = 4$. Each square represents a cortical cell, white cells controlled by one eye and black cells controlled by the other
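One way to realise an interaction function with a uniform excitatory core and a uniform inhibitory surround on a periodic square array is sketched below. Several details are assumptions that the text does not specify: distances are Euclidean, the core and surround strengths are $+1$ and $-1$, the kernel is left unnormalised, and the cell's own contribution is excluded from the local field. The function names are hypothetical.

```python
import math


def lateral_kernel(dx, dy, radius, size, excite=1.0, inhibit=-1.0):
    """Coupling between two cells separated by (dx, dy) on a periodic size x size array."""
    # Shortest displacement under periodic boundary conditions.
    dx = min(dx % size, -dx % size)
    dy = min(dy % size, -dy % size)
    dist = math.hypot(dx, dy)
    if dist <= radius:
        return excite          # uniform excitatory core of the given radius
    if dist <= 2 * radius:
        return inhibit         # uniform inhibitory surround of the same width
    return 0.0


def local_field(spins, x, y, radius):
    """Sum of kernel * spin over the neighbourhood of cell (x, y), excluding itself."""
    size = len(spins)
    reach = 2 * radius
    total = 0.0
    for dx in range(-reach, reach + 1):
        for dy in range(-reach, reach + 1):
            if dx == 0 and dy == 0:
                continue
            total += lateral_kernel(dx, dy, radius, size) * spins[(x + dx) % size][(y + dy) % size]
    return total
```

The local field is what enters the energy change for a proposed flip, so only cells within a distance $2r$ of the chosen cell need to be visited; this assumes an integer `radius` and an array at least `4 * radius + 1` cells wide so that no neighbour is counted twice.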

In Fig. 2 we anneal the maps by initially setting $T = 25$ and reducing $T$ to very close to zero in 101 steps of $\Delta T = 0.25$. At each temperature step, 200 000 "spin flips" are considered. The fact that the simulations appear to reach states that likely correspond to the global minimum of $E$ in (15) indicates that this annealing schedule is acceptably slow. The three maps in this figure are distinguished from the maps in Fig. 1 by exhibiting a global ordering, with the ODCs being straight, parallel lines.
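The annealing schedule can be written down directly. In the sketch below the linear ramp from $T = 25$ in decrements of $0.25$ over 101 temperature values follows the description above; the small positive floor that stands in for "very close to zero" is an assumption made for this example, as are the function names.

```python
def annealing_schedule(t_start=25.0, dt=0.25, steps=101, t_floor=1e-6):
    """Linearly decreasing temperatures, with the last value held just above zero."""
    return [max(t_start - k * dt, t_floor) for k in range(steps)]


def total_proposals(steps=101, flips_per_step=200_000):
    """Total number of spin flips considered over the whole anneal."""
    return steps * flips_per_step


def proposals_per_cell(side=40, steps=101, flips_per_step=200_000):
    """Average number of flips considered per cell on a side x side array."""
    return total_proposals(steps, flips_per_step) / (side * side)
```

With these numbers each of the 1600 cells is offered a flip 125 times on average at every temperature, or about 12 600 times over the full anneal.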

Figure 3 shows a graph of $\partial E/\partial T$, which corresponds to the "heat capacity" of the network, against $T$, averaged over 2500 separate annealing processes. The spike in the heat capacity indicates the presence of a phase transition, during which ODCs "freeze" into the network. Above this phase transition temperature, OD "boils", changing value essentially randomly. A phase transition in the development of ODCs has also been observed in other models (Tanaka 1991a; Elliott et al. 1996). As we show in Sect. 4, these other models are closely related to our present model, so the presence of a phase transition in all of them is not unexpected.
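Given the mean energy recorded at each temperature of the anneal, the heat capacity can be estimated by finite differences. The sketch below uses central differences at the interior temperatures; this particular estimator and the function names are assumptions for illustration, since the text does not say how $\partial E/\partial T$ was computed.

```python
def heat_capacity(temps, mean_energy):
    """Central-difference estimate of dE/dT at the interior temperatures."""
    return [(mean_energy[k + 1] - mean_energy[k - 1]) / (temps[k + 1] - temps[k - 1])
            for k in range(1, len(temps) - 1)]


def peak_temperature(temps, mean_energy):
    """Temperature at which the estimated heat capacity is largest."""
    capacity = heat_capacity(temps, mean_energy)
    k = max(range(len(capacity)), key=lambda idx: capacity[idx])
    return temps[k + 1]
```

The estimate is independent of whether the temperatures are listed in increasing or decreasing order, and averaging the energies over many independent anneals before differencing reduces the noise in the estimated peak.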

The energy $E$ in (15) possesses a $Z_2$ symmetry under $v_x \to -v_x \; \forall x$. The states exhibited in Figs. 1 and 2, however, do not respect this symmetry. Hence, below the phase transition, the $Z_2$ symmetry is spontaneously broken. Above this temperature, on average the $Z_2$ symmetry is respected. Corresponding to a broken symmetry, our system should exhibit domain walls. These walls, of course, correspond exactly to the boundaries between ODCs controlled by different eyes.

We now consider the simulation of monocular deprivation by modifying the target-cell interaction matrix (cf. Swindale 1980). The simplest way to alter the target-cell interaction matrix in order to model monocular deprivation is to allow the excitatory core of the lateral connections surrounding target cells controlled by the undeprived eye to expand while shrinking the excitatory core surrounding target cells controlled by the deprived eye. This is done in Fig. 4. Greater expansion and shrinkage produce greater responses to monocular deprivation. The blob-like pattern exhibited in Fig. 4 is reminiscent of those found by Tanaka (1991b) during the simulation of imbalanced retinal activity.

4 Relationship to other models

We now turn to a discussion of the relationship between our model and other models of ODC formation.

4.1 Swindale’s model

Swindale (1980) introduced a model that made explicit the role of lateral interactions between target cells in the development of ODCs. In our notation, assuming symmetric interactions between the two eyes, his basic equation is given by

$$\dot{v}_x = (1 - v_x^2) \sum_y \hat{\Delta}_{xy} v_y, \tag{33}$$

where $\hat{\Delta}$ is some matrix denoting coupling between target cells and encoding co-operative or competitive interactions between the two eyes. (Note that the symbol $\hat{\Delta}$ is used for notational consistency. Of course, in Swindale's model, it is not necessarily decomposable according to our (10).) Swindale introduced the convolution term $\sum_y \hat{\Delta}_{xy} v_y$ in order to account for the basic structure of the OD map, while the multiplying term $1 - v_x^2$ was introduced in order to prevent the unconstrained growth of the $v_x$ variables. Equation (33) is very similar to our own, two-afferent activity-averaged equation in (8). In particular, the fixed point structure is identical for identical $\hat{\Delta}$, as the denominator in our equation is essentially irrelevant. Although not fully equivalent, the dynamics of our two models will therefore be very similar. Swindale did not introduce an energy function, but of course his equations can be recast in an energy minimisation formalism, with his global energy function given by our (15). Viewed as spin-glass dynamics with $v_x \in \{-1, +1\}$, the minimisation of $E$ by alignment or anti-alignment of "spins" according to the sign of the coupling between target cells gives an even clearer and more intuitive understanding of the role of $\hat{\Delta}$. To the extent that Swindale's model is a phenomenological model, written down to reproduce the phenomenology of ODC development without seeking to explicate the underlying mechanisms, we believe that our approach is more informative, although the final equations are very similar. For example, Swindale (1980) introduces the term $1 - v_x^2$ in a rather arbitrary fashion, but in our approach it emerges automatically.

Fig. 2A–C. Same as Fig. 1, for annealed simulations

Fig. 3. A graph of the heat capacity, $\partial E/\partial T$, against temperature, $T$, averaged over 2500 annealings. The spike in the heat capacity indicates a phase transition in the development of OD columns

4.2 Tanaka’s model

Tanaka (1989, 1990, 1991b) derives an elegant and very general "thermodynamic model" inspired by the Changeux and Danchin (1976) model of selective synaptic stabilisation, based on the dynamics of the NMDA receptor. As we have done, Tanaka starts from a set of differential equations describing the detailed dynamics that he wishes to model, and from these equations he extracts a thermodynamic limit with a Hamiltonian and a transition probability based on standard Monte-Carlo techniques. Ignoring complicating factors such as noise, in our notation his general energy function is essentially

$$E = -\frac{1}{2} \sum_{i,j,x,y} s_{xi} C_{ij} \hat{\Delta}_{xy} s_{yj}, \tag{34}$$

where $C_{ij}$ is an afferent-activity correlation matrix. In dealing with the development of ODCs, Tanaka restricts the $s_{xi}$ variables to represent OD, so that his energy function, in our notation, becomes

$$E = -\frac{1}{2} \sum_{x,y} s_x \hat{\Delta}_{xy} s_y, \tag{35}$$

which, again, is just our two-afferent derived energy function in (15). Clearly, Tanaka's thermodynamic model is what we have termed a "Potts glass", although he calls it, equivalently, a "Potts spin system". Tanaka takes this thermodynamic limit as fundamental, so that his simulations involve the minimisation of $E$ via Monte-Carlo techniques and not the direct (numerical) solution of his basic differential equations. At the level of the thermodynamic formulation, Tanaka's model and ours are formally equivalent. This equivalence accounts for the similarity of many of our results to his, especially in relation to the presence of a phase transition during ODC formation and the blob-like patterns found during monocular deprivation. Indeed, Tanaka's approach is more general, embracing our model as a special case. At the level of the basic differential equations, however, the models are different and the detailed dynamics differ.

4.3 Elliott et al.’s earlier model

In our earlier work on the development of ODCs, we simply wrote down an Ising model-like Hamiltonian, inspired by the similarity between patterns of ODCs and magnetic domains in ferromagnets (Elliott et al. 1996). The model, however, was essentially unjustified, in as much as it was not derived from an underlying, biologically plausible model of synaptic plasticity, unlike our current model (Elliott and Shadbolt 1998a,b) or Tanaka's model (Tanaka 1989, 1991b). To the extent that an Ising model is a specific type of spin or Potts glass, this model is subsumed by the thermodynamic formulation of our current model, which is itself subsumed by Tanaka's thermodynamic formulation of his model.

4.4 Miller et al.’s model

Miller et al. (1989) write down a linear model of ODC formation and subject it to some degree of analysis. In order to account for the development of ODCs in the presence of positively correlated images between the two eyes, however, they introduce the device of subtractive synaptic normalisation, which we have argued at length elsewhere is biologically implausible (Elliott and Shadbolt 1999). They show that their basic differential equations can be recast in terms of the steepest-descent minimisation of an energy function, and, ignoring the asymmetries introduced by their arbor function, this energy function can be written, in our notation, as

$$E = -\frac{1}{2} \sum_{i,j,x,y} s_{xi} C_{ij} \hat{\Delta}_{xy} s_{yj}. \tag{36}$$

This, of course, is identical to Tanaka's Potts-glass Hamiltonian. Miller et al. (1989) do not, however, employ this energy in a thermodynamic formulation, preferring instead to regard their differential equations as primary.

Fig. 4A, B. The simulation of monocular deprivation, achieved by extending the excitatory core of lateral connections surrounding target cells controlled by the undeprived eye and shrinking the excitatory core of lateral connections surrounding target cells controlled by the deprived eye. A The OD state immediately before the onset of deprivation; B The final state after deprivation

Miller (1998) later argued that his model (Miller et al. 1989) is mathematically equivalent to our earlier, Ising-like model (Elliott et al. 1996). The basis of this claim rests on an identity between the energy functions underlying both models. We do not contest that the energy functions are identical, but we have argued elsewhere that identity of energy functions does not entail dynamical identity of the full models (Elliott et al. 1998). Furthermore, Tanaka (1990) proved rather earlier that the model of Miller et al. (1989) is derivable as a special case from Tanaka's Potts-glass model, essentially by discarding non-linearities and noise terms. Since an Ising model is a special case of a Potts glass, Tanaka had already established that the energy functions used in our earlier model, Miller et al.'s model and Tanaka's model are identical.

4.5 Other models

Many models exist of the development of ODCs other than those discussed above (see, e.g., von der Malsburg and Willshaw 1976; Bienenstock et al. 1982; Montague et al. 1991; Goodhill 1993; Harris et al. 1997). Although many of these models employ some of the same features as the models discussed earlier, such as a plexus of lateral connections between target cells, we do not discuss these models in any detail because they either have not been or cannot easily be recast in a thermodynamic formulation with an underlying spin or Potts-glass Hamiltonian.

Other models, although not of the development of ODCs specifically, do possess energy functions that are spin or Potts glasses, or simple variants thereof. Examples include Linsker’s model of the development of orientation selectivity (Linsker 1986a,b,c) and Hopfield’s classic model of associative memory (Hopfield 1982).

5 Discussion

In this paper, we have shown that our previously studied neurotrophic model of neuronal development and synaptic plasticity can be reformulated in terms of the minimisation of an exact energy function for two afferents, and in terms of the minimisation of a Lyapunov function, having the same form as the two-afferent energy function, for more than two afferents. The energy or Lyapunov functions are identical to the Hamiltonians of spin or Potts-glass systems. Under certain assumptions about the symmetry of the target-cell interaction matrix, we may then use statistical mechanics and therefore standard Monte-Carlo techniques to minimise these functions, rather than solving the model’s differential equations directly. We have shown the results of Monte-Carlo simulations of our two-afferent system, revealing phase transitions, symmetry breaking and domain-wall formation, and a possible role for the target-cell interaction matrix in the phenomenology of monocular deprivation.
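To illustrate what minimising a spin-glass-like energy by standard Monte-Carlo techniques involves, the following is a minimal single-spin-flip Metropolis sketch. It is not the simulation code used in this paper: it assumes Ising-like variables s_x = ±1, a generic quadratic energy E = (1/2) Σ s_x M_xy s_y, and a coupling matrix `M` that is symmetric with zero diagonal. The names `energy` and `metropolis`, the random couplings and the temperature value are all hypothetical.

```python
import math
import numpy as np

def energy(s, M):
    """Quadratic spin energy E = (1/2) * sum_{x,y} s_x M_xy s_y."""
    return 0.5 * float(s @ M @ s)

def metropolis(s, M, temperature, n_sweeps, rng):
    """Single-spin-flip Metropolis updates.

    Assumes M is symmetric with zero diagonal, so that flipping spin x
    changes the energy by dE = -2 * s_x * sum_y M_xy s_y.
    """
    s = s.copy()
    n = s.size
    for _ in range(n_sweeps * n):
        x = rng.integers(n)
        dE = -2.0 * s[x] * float(M[x] @ s)
        # Always accept downhill moves; accept uphill moves with
        # the Boltzmann probability exp(-dE / T).
        if dE <= 0.0 or rng.random() < math.exp(-dE / temperature):
            s[x] = -s[x]
    return s

# Hypothetical example: random symmetric couplings between 50 cells.
rng = np.random.default_rng(0)
n = 50
A = rng.normal(size=(n, n))
M = 0.5 * (A + A.T)
np.fill_diagonal(M, 0.0)
s0 = rng.choice(np.array([-1.0, 1.0]), size=n)
# A very low temperature makes the dynamics effectively pure descent.
s1 = metropolis(s0, M, temperature=1e-9, n_sweeps=20, rng=rng)
```

The symmetry requirement on `M` mirrors the symmetry assumption on the target-cell interaction matrix stated above: the expression for `dE` is only a true energy difference when the couplings are symmetric.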

We have also demonstrated or discussed deep formal similarities between five models (Swindale 1980; Miller et al. 1989; Tanaka 1989; Elliott et al. 1996; Elliott and Shadbolt 1998b) of ODC formation, indicating that all of them possess an underlying spin or Potts-glass structure. Other models, although not of the formation of ODCs, are also based on spin glasses (e.g. Hopfield 1982; Linsker 1986a,b,c). The existence of this spin or Potts-glass structure is intriguing, especially as four (Swindale 1980; Tanaka 1989; Miller et al. 1989; Elliott and Shadbolt 1998b) of the five models discussed in detail earlier were originally formulated in terms of differential equations with no appeal or reference to the machinery of statistical mechanics or spin glasses.

Is there any significance to the presence of an underlying spin or Potts-glass-like structure in a wide range of models of neuronal development, learning and plasticity? There are several possible answers to this question, but perhaps one of the most intriguing is the following. During the evolution of simple sensory systems, evolution has of necessity had to conform to the principles of, for example, optics and acoustics in the construction of eyes and ears. If evolution can discover and exploit these physical principles, then it is entirely possible that it could discover other, more general principles applying to the collective computational capacities of large networks of simple devices. Evolution thus may have discovered that by exploiting the computational capacities of spin glasses or spin-glass-like systems, it can create neuronal structures that confer adaptive advantage on those organisms possessing them. If this is correct, then the appropriate level of description for understanding cortical structures is not the neuronal level or lower, but the collective, systems level. Furthermore, this account provides a simple answer to the question: what function does the plexus of lateral inhibitory and excitatory cortical connections serve? The answer, from this viewpoint, is that it provides the coupling between cortical cells that imparts to the collective system the computational capacities of spin-glass or spin-glass-like systems. The cortex should then be seen not as a biological system, but as a physical device. Of course, these claims are very strong and reductionistic, and certainly go much further than the original motivations for the introduction of spin-glass theory into neural network theory, which included an appeal to, and analogy between, the physics of frustrated systems and constrained optimisation problems (see, e.g., Hertz et al. 1991; Amit 1992; and references therein). These claims also ignore the fact that the plexus of lateral connections itself develops and is highly plastic even in adulthood, and ignore the asymmetries in lateral interactions that prevent the distribution of states of afferent connectivity from following the Boltzmann distribution. To this extent, the claims should be seen not as exact, but rather as giving a first-order, approximate understanding.

Acknowledgement. The author thanks the Royal Society for the support of a University Research Fellowship.

References

Amit DJ (1992) Modeling brain function. Cambridge University Press, Cambridge

Bienenstock EL, Cooper LN, Munro PW (1982) Theory for the development of neuron selectivity: orientation specificity and binocular interaction in visual cortex. J Neurosci 2: 32–48

Changeux JP, Danchin A (1976) Selective stabilization of developing synapses as a mechanism for the specification of neural networks. Nature 264: 705–712

Elliott T, Shadbolt NR (1998a) Competition for neurotrophic factors: mathematical analysis. Neural Comput 10: 1939–1981

Elliott T, Shadbolt NR (1998b) Competition for neurotrophic factors: ocular dominance columns. J Neurosci 18: 5850–5858

Elliott T, Shadbolt NR (1999) A neurotrophic model of the development of the retinogeniculocortical pathway induced by spontaneous retinal waves. J Neurosci 19: 7951–7970

Elliott T, Shadbolt NR (2002) Dissociating ocular dominance column development and ocular dominance plasticity: a neurotrophic model. Biol Cybern 86: 281–292

Elliott T, Howarth CI, Shadbolt NR (1996) Axonal processes and neural plasticity. I: ocular dominance columns. Cereb Cortex 6: 781–788

Elliott T, Howarth CI, Shadbolt NR (1998) Axonal processes and neural plasticity: a reply. Neural Comput 10: 549–554

Elliott T, Maddison AC, Shadbolt NR (2001) Competitive anatomical and physiological plasticity: a neurotrophic bridge. Biol Cybern 84: 13–22

Goodhill GJ (1993) Topography and ocular dominance: a model exploring positive correlations. Biol Cybern 69: 109–118

Harris AE, Ermentrout GB, Small SL (1997) A model of ocular dominance column development by competition for trophic support. Proc Natl Acad Sci USA 94: 9944–9949

Hertz J, Krogh A, Palmer RG (1991) Introduction to the theory of neural computation. Addison-Wesley, Redwood City, Calif.

Hopfield JJ (1982) Neural networks and physical systems with emergent collective computational abilities. Proc Natl Acad Sci USA 79: 2554–2558

Hubel DH, Wiesel TN (1962) Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex. J Physiol (Lond) 160: 106–154

LeVay S, Stryker MP, Shatz CJ (1978) Ocular dominance columns and their development in layer IV of the cat’s visual cortex: a quantitative study. J Comp Neurol 179: 223–244

LeVay S, Wiesel TN, Hubel DH (1980) The development of ocular dominance columns in normal and visually deprived monkeys. J Comp Neurol 191: 1–51

Linsker R (1986a) From basic network principles to neural architecture: emergence of spatial-opponent cells. Proc Natl Acad Sci USA 83: 7508–7512

Linsker R (1986b) From basic network principles to neural architecture: emergence of orientation-selective cells. Proc Natl Acad Sci USA 83: 8390–8394

Linsker R (1986c) From basic network principles to neural architecture: emergence of orientation columns. Proc Natl Acad Sci USA 83: 8779–8783

Malsburg von der C, Willshaw DJ (1976) A mechanism for producing continuous neural mappings: ocularity dominance stripes and ordered retino-tectal projections. Exp Brain Res Suppl 1: 463–469

McAllister AK, Katz LC, Lo DC (1999) Neurotrophins and synaptic plasticity. Annu Rev Neurosci 22: 295–318

Miller KD (1998) Equivalence of a sprouting-and-retraction model and correlation-based plasticity models of neural development. Neural Comput 10: 529–547

Miller KD, Keller JB, Stryker MP (1989) Ocular dominance column development: analysis and simulation. Science 245: 605–615

Montague PR, Gally JA, Edelman GM (1991) Spatial signaling in the development and function of neural connections. Cereb Cortex 1: 199–220

Swindale NV (1980) A model for the formation of ocular dominance stripes. Proc Roy Soc Lond B 208: 243–264

Tanaka S (1989) Theory of self-organization of cortical maps. In: Touretzky DS (ed) Advances in neural information processing systems 1. Kaufmann, San Mateo, Calif., pp 451–548

Tanaka S (1990) Theory of self-organization of cortical maps: mathematical framework. Neural Netw 3: 625–640

Tanaka S (1991a) Phase transition theory for abnormal ocular dominance column formation. Biol Cybern 65: 91–98

Tanaka S (1991b) Theory of ocular dominance column formation. Mathematical basis and computer simulation. Biol Cybern 64: 263–272
