    Revista Mexicana de Física 38, Suplemento 1 (1992) 127-1.0

    Modelling genetic evolution with coupled map lattices

    G. MARTÍNEZ-MEKLER, G. COCHO, A. GELOVER
    Instituto de Física, Universidad Nacional Autónoma de México
    Apartado postal 20-364, 01000 México, D.F., México

    AND

    R. BULAJICH
    Departamento de Matemáticas, Facultad de Ciencias
    Universidad Nacional Autónoma de México, 04510 México, D.F., México

    Received 20 December 1991; accepted 9 April 1992

    ABSTRACT. The evolution of genetic sequences, in particular the case of retroviruses as encountered in AIDS, is analyzed by means of a coupled map lattice model. Mutations are introduced as coupling terms and several ecological constraints are considered. The resulting equations are of the reaction-diffusion type. The model establishes an evolution fitness criterion in terms of dynamical and structural properties at the molecular level. Under a global ecological constraint the model dynamics generates quasi-species and presents an error threshold. Qualitative predictions regarding the AIDS virus RNA stability under mutations are corroborated experimentally.

    PACS: 87.10.+e; 03.20.+i; 05.45.+b

    1. INTRODUCTION

    During the last few years the challenge of understanding the evolution of genetic sequences has received considerable attention. The availability of an impressive amount of data emerging from the Human Genome Project, the remarkable advances in the field of molecular biology, the advent of fast computers and the conceptual developments associated with complex systems research have given rise to a diversity of new outlooks on this subject. The dynamics of genetic evolution is a complex problem and, like all complex systems, it can be studied at different levels of complexity [1], ranging from a highly structured network to an intricate physicochemical process. S. Kauffman, for example, looks into the behavior of NK boolean networks, where N is the number of objects and K the network coordination number [2,3]. Concepts such as adaptation, antichaos and self-organized criticality emerge from his treatment. Other authors such as M. Eigen and P. Schuster write kinetic evolution equations for which the secondary structure of the genetic material acts as the phenotype regulator [4,5]. In this paper we are primarily interested in relating the concept of evolutionary fitness with physicochemical constraints of the ribonucleic acids and the molecular machinery at the ribosome level. Our mathematical modelling is done in terms of coupled map lattices and is based on phenomenological studies carried out by Cocho and co-workers [6,7,8,9].

    In a previous publication some of us presented a detailed account of the construction of a coupled map lattice model for genetic evolution [9]. In this paper we shall give a brief exposition of that derivation and extend it to a wider family of models; this is the content of Sect. 2. We then proceed in Sect. 3 to analyze some aspects of the evolution of these equations. Our modelling is particularly suitable for the study of retroviruses, specifically the AIDS virus, and allows for predictions to be made, which are corroborated at a qualitative level. In order to be in a position for a quantitative study a more refined modelling is necessary. The purpose of this work is mainly to establish a working framework and to point out general trends. Section 4 is a brief discussion.

    2. MODEL BUILDING

    Let us begin by introducing some basic definitions and terminology.¹ Genetic material is constituted by DNA (deoxyribonucleic acid) and RNA (ribonucleic acid). DNA molecules are linear polymers built from four basic units called nucleotides. Each unit or monomer has a constant component (phosphate and deoxyribose groups) and a variable part, one of the four bases: adenine, guanine, thymine and cytosine. Adenine and guanine are purines (R) and are larger than the two pyrimidine bases (Y): thymine and cytosine. DNA is a double helix (duplex) built from two antiparallel linear polymers with complementary bases. The guanine (G)-cytosine (C) base pair is linked by three hydrogen bonds, while the adenine (A)-thymine (T) base pair shares two hydrogen bonds. In this respect we shall consider C and G to be strong bases (S), and A and T to be weak ones (W). Analyses of DNA oligonucleotides have shown that the local values of the structural angles depend on the local composition [12] and in particular they are mainly associated with the R (large) or Y (small) character of the bases.

    ¹ For a detailed introduction to the basic biological elements discussed in this paper refer to [10,11].

    RNA molecules are also linear polymers, similar to DNA, with ribose instead of deoxyribose and with the pyrimidine uracil (U) instead of thymine (T). In some regions they show a double helix structure (as in some of the structural parts of ribosomal RNA); they may also exist as a single polymer (as in messenger RNA).

    Initially, after the work of Watson and Crick, DNA was considered as a static, rigid structure altered occasionally by stochastic mutations. Over the years this image has evolved to that of a molecule continuously subject to fluctuations with a non-homogeneous structure along a backbone. Experimental studies of thermodynamic properties, in conjunction with a statistical analysis of the Gene-Bank, as well as a deeper understanding of the molecular mechanisms involved in protein synthesis, are indicative of a link between physicochemical constraints and DNA evolution. We shall refer in the text to these local (along the chain) restrictions as internal selection constraints. Additionally, there are restrictions associated with the structural features of functional proteins, which we shall call external selection constraints. Namely, the three dimensional protein structure relates


    Since the total number of codon classes is eight (WWW, WWS, WSW, WSS, SWW, SWS, SSW, SSS), a point in the space of possible configurations is determined by the number $i_{\alpha\beta\gamma}$ (with $\alpha, \beta, \gamma$ = W, S) of codons of each of the above types. If we take into account that the sum $i_{WWW} + \dots + i_{SSS} = L$ is conserved, the configuration space is a seven dimensional discrete hypertetrahedron with coordinates $[i_{WWW}, i_{WWS}, \dots, i_{SSW}]$, where the $i_{\alpha\beta\gamma}$ only take integer values. As a system variable we consider the number $N_i(t)$ of codon sequences with length $L$ that at a discrete time $t$ have a composition $i$.

    In the absence of mutations, we assume that the time evolution of $N_i(t)$ depends linearly on a growth coefficient $A(i)$, which is configuration dependent (further on we shall give some criteria for the functional choice of $A$), and that it is subjected to an ecological constraint. In the simplest case, as with population dynamics problems, we may express this constraint with a negative quadratic term. We thus have for this uncoupled problem a set of $Q$ logistic equations [14] ($Q$ is the number of sites in the lattice) with space dependent parameters of the form:

    $$N_i(t+1) = A(i)\,N_i(t) - c(i)\,[N_i(t)]^2, \tag{2.1}$$

    where the $c(i)$ are positive.

    The mappings will be coupled if we consider mutations. Let us first define as a one-mutant neighboring sequence the one obtained by a change in one codon. A one-mutant neighborhood is then the set of all one-mutant sequences of a given sequence. If we assume that in a codon at most one point mutation (substitution of a base by another) takes place, the coupling terms will be local. Since the mutation may present itself in any of the codon bases xyz, we must consider the effective point mutation probabilities, as accepted by selection, $p_x$, $p_y$ and $p_z$. For the amino acid codification the y position is the most relevant, while the z position is basically negligible. Furthermore, on the average $p_z > p_x > p_y$. A y point mutation usually produces important changes in the amino acid chemical properties, decreasing in most cases the protein functional efficiency. In this case, external selection tends to eliminate the mutation and $p_y$ is small. A similar type of argumentation leads to higher values for $p_z$.

    Point mutations couple codons of different type, e.g. SWS and WWS are coupled by $p_x$, WWS and WSS are coupled by $p_y$, and WWW and WWS by $p_z$. If all three probabilities are different from zero, the eight codon classes will be coupled by "diffusion terms" associated to these point mutations. In this case, the diffusion will take place in the 7-dimensional hypertetrahedron space mentioned above. If $p_y$ is taken to be zero and the other two probabilities are different from zero, the diffusion will take place independently in two 3-dimensional tetrahedra connected by $A(i)$. If only $p_z$ is different from zero, the 8 codons separate into 4 disjoint groups: WWW and WWS, WSW and WSS, SWW and SWS, SSW and SSS. The number of codons in each of the 4 disjoint groups is constant. Taking these four constraints into account, the "state space" is reduced to a 4-dimensional hyperparallelepiped, with a one dimensional diffusion taking place on each axis. If we have initially codons of only one of the 4 disjoint groups (e.g. WWW and WWS), we will have in the dynamical evolution only codons of that group, and the composition will be defined by only one parameter (e.g. $i_{WWS} \equiv i$, as in this case $i_{WWW} + i_{WWS} = L$).
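The grouping of codon classes described above can be checked with a small sketch. It treats the eight W/S classes as nodes of a graph and links two classes whenever a point mutation with non-zero probability turns one into the other. The function name `coupled_groups` and the numerical probabilities in the usage note are illustrative assumptions, not values from the paper.

```python
from itertools import product

def coupled_groups(px, py, pz):
    """Groups of codon classes connected by point mutations with non-zero probability.

    px, py, pz are the effective mutation probabilities at codon positions x, y, z.
    Only whether each one is zero matters for the grouping.
    """
    classes = ["".join(c) for c in product("WS", repeat=3)]
    active = [pos for pos, prob in enumerate((px, py, pz)) if prob > 0]
    parent = {c: c for c in classes}

    def find(c):
        # Follow parent links up to the representative of the group.
        while parent[c] != c:
            c = parent[c]
        return c

    for c in classes:
        for pos in active:
            # A W <-> S change at an active position couples the two classes.
            flipped = c[:pos] + ("S" if c[pos] == "W" else "W") + c[pos + 1:]
            parent[find(c)] = find(flipped)

    groups = {}
    for c in classes:
        groups.setdefault(find(c), []).append(c)
    return sorted(sorted(g) for g in groups.values())
```

With hypothetical values such as `coupled_groups(1e-3, 0.0, 1e-2)` the classes split into two groups of four (the two tetrahedra), and with `coupled_groups(0.0, 0.0, 1e-2)` into the four pairs listed in the text.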


    In order to be more precise we need information on the multiplication and mutation rates, and on the alternation or simultaneity of multiplication and point mutations. In bacteria and eukaryotes (cells with a nucleus) most of the point mutations take place in association with DNA duplication and therefore multiplication and mutation are more or less simultaneous. Moreover, due to the action of the repair mechanisms, the mutation rate is very low ($\simeq 10^{-9}$ per base and DNA duplication cycle). For a type of virus known as retrovirus, such as the AIDS virus, we have a rather different scenario. These RNA viruses, after entering the cell, get transcribed to DNA by means of the retrotranscriptase enzyme and are then integrated into the cell genome. At this stage the virus is called a provirus. The cell (including the provirus) can divide many times before the provirus is liberated and transcribed into RNA, building viruses that can kill the cell and begin a new cycle. Retrotranscriptase makes many errors [15,16], and the effective mutation rate is of the order of $10^{-3}$ per base and cycle; consequently, the diffusive terms are more relevant than in the previous case. We therefore have a situation where the alternation of mutation and multiplication allows for a natural implementation of a CML dynamics in two steps. Taking all this into account, we can write, for the particular case when only $p_z$, $i_{WWW}$ and $i_{WWS}$ are different from zero, the evolution equation

    $$N_i(t+1) = f\{N_i(t)\} + p\,\bigl[(i+1)\,f\{N_{i+1}(t)\} + (L+1-i)\,f\{N_{i-1}(t)\} - L\,f\{N_i(t)\}\bigr] \tag{2.2}$$

    with the function $f$ defined by

    $$f\{N_i(t)\} = A(i)\,N_i(t) - c(i)\,[N_i(t)]^2 \tag{2.3}$$

    and where $p$ is the mutation rate per base and cycle, $i_{WWW} + i_{WWS} = L$ and $i_{WWS} \equiv i$.² If instead of the alternation of multiplication and mutation (as in retrovirus), they occur more or less simultaneously (as in other RNA viruses with a high mutation rate, e.g., foot-and-mouth disease virus), $f\{N_i(t)\}$ should be replaced by $N_i(t)$ in Eq. (2.3).

    If we rearrange the terms in equation (2.2) we have

    $$N_i(t+1) = \Bigl[1 + \frac{2\epsilon}{L+2}\Bigr]\,f\{N_i(t)\} + A + B \tag{2.4}$$

    with

    $$A = \frac{\epsilon}{2}\,\bigl[f\{N_{i+1}(t)\} - 2 f\{N_i(t)\} + f\{N_{i-1}(t)\}\bigr] \tag{2.5}$$

    $$B = \frac{\epsilon\,(2i-L)}{2(L+2)}\,\bigl[f\{N_{i+1}(t)\} - f\{N_{i-1}(t)\}\bigr] \tag{2.6}$$

    and where $\epsilon = p(L+2)$ and $f$ is given by Eq. (2.3).

    ² Notice that in this particular case the number of configuration space elements, i.e. the number of "space" lattice sites $Q$, is equal to the number of codons in the genetic sequence. In general they will not coincide.
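The rearrangement of Eq. (2.2) into Eqs. (2.4) to (2.6) can be checked numerically. The sketch below evaluates both right-hand sides on arbitrary values of $f$ and compares them site by site. The helper names and the convention that $f$ vanishes outside $0 \le i \le L$ are assumptions made for this illustration.

```python
import random

def neighbours(F, i, L):
    """Values of f at sites i+1 and i-1, taken as zero outside the lattice (assumption)."""
    up = F[i + 1] if i < L else 0.0
    down = F[i - 1] if i > 0 else 0.0
    return up, down

def rhs_22(F, i, L, p):
    """Right-hand side of Eq. (2.2), with F[i] standing for f{N_i(t)}."""
    up, down = neighbours(F, i, L)
    return F[i] + p * ((i + 1) * up + (L + 1 - i) * down - L * F[i])

def rhs_24(F, i, L, p):
    """Right-hand side of Eq. (2.4): source term plus diffusion (2.5) and gradient (2.6)."""
    eps = p * (L + 2)
    up, down = neighbours(F, i, L)
    diffusion = eps / 2 * (up - 2 * F[i] + down)
    gradient = eps * (2 * i - L) / (2 * (L + 2)) * (up - down)
    return (1 + 2 * eps / (L + 2)) * F[i] + diffusion + gradient

def max_gap(L=20, p=1e-3, seed=1):
    """Largest difference between the two forms over all sites, for random f values."""
    rng = random.Random(seed)
    F = [rng.random() for _ in range(L + 1)]
    return max(abs(rhs_22(F, i, L, p) - rhs_24(F, i, L, p)) for i in range(L + 1))
```

Calling `max_gap()` should return a value at the level of floating point round-off, since the two forms are algebraically identical for any values of $f$.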


    Equation (2.4) shows that our problem falls within the class of reaction-diffusion processes. The first term in (2.4) is a nonlinear source term, $A$ is a discrete diffusion operator and $B$ corresponds to a discrete gradient contribution. In this perspective the equation has the peculiarity that the value of the "diffusion constant", $\epsilon/2$, is also related to the source and gradient terms.

    If we relax some of the assumptions behind Eq. (2.4) we end up with a hierarchy of equations. For example, if we allow for two simultaneous point mutations we lose the local nature of (2.4); alternatively, if we consider that more than one of the point mutation probabilities is different from zero, then both the codon composition space and the discrete coupling operators acquire a more complex structure. On the other hand, if short range correlations are considered, a self-consistent mean field treatment may be called for.

    In this work we shall look into the dynamics generated by Eq. (2.4) with a particular choice for $f$.

    3. SPACE-TIME EVOLUTION

    In the section above we chose the logistic equation for the ecological restriction function $f$, based on the population dynamics character of our problem and taking into account that considerable experience has been accumulated in the last few years for this equation. However, the fact that we are dealing with a distribution of competing sequences with different codon composition suggests that a global ecological constraint should be more realistic. Sequences with a given chemical composition are expected to compete for the environment not only with themselves but in general with all the other elements of the distribution. In this case, in Eq. (2.3), instead of the term $c(i)[N_i(t)]^2$ we may consider the sum over all possible compositions $\gamma$: $N_i(t) \sum_\gamma c(i,\gamma)\,N_\gamma(t)$.

    In the following, we shall look into the behavior of (2.4) with the ecological constraint

    $$f\{N_i(t)\} = A(i)\,N_i(t) - c\,[N_i(t)]^2 - \frac{d}{L+1}\,N_i(t) \sum_{\gamma=0}^{L} N_\gamma(t), \tag{3.1}$$

    which we shall refer to as a generic ecological constraint if $c \neq 0$ and $d \neq 0$. Whenever $c \neq 0$, $d = 0$ we shall say we are under a local ecological constraint, and the case $c = 0$, $d \neq 0$ we shall call a global ecological constraint. The most general version of (3.1) should take into account the dependence of the parameters $c$ and $d$ on $i$. As a first approximation, and in the absence of specific information on this issue, we shall consider that $d$ and $c$ are constant; moreover, the case which turns out to be more realistic corresponds to $c = 0$.

    Notice that in (3.1), even in the absence of the mutation coupling parameter $\epsilon$, the species are no longer independent. Equations of this type (with $\epsilon = 0$) are referred to in the current literature as globally coupled maps [17].

    In order to completely determine our model we must specify a functional form for $A(i)$. In their phenomenological study, Cocho and Martínez-Mekler [9] propose that for retroviruses such as the AIDS type, $A(i)$ should be a non-decreasing monotonic function of $i$. The choice of $A(i)$ is determined by the processes that take place at the ribosome.


    The experimental work of [18,19] and the genetic sequence analysis of [9,20] suggest that codons of the WWS type are more efficient (rate of protein synthesis) and accurate (degree of absence of mistakes) than the WWW type, and that the SWS are more efficient and accurate than the SWW. This behavior can be related both to the tRNA-mRNA and rRNA-mRNA interactions [9,18,19,20,21]. We should remark that the above behavior corresponds to a regime for which mutations are negligible, i.e. for $p = 0$. Under our restriction of considering only the WWW and WWS group, sequences with a higher content of WWS should have a bigger growth rate. Since $i$ is a measure of the WWS concentration, and $A(i)$ is the growth rate of the $i$-th composition, an increasing monotonic behavior is a reasonable choice. Notice that the functional form of $A(i)$ reflects biological features and is not introduced as an ad hoc assumption. For our calculations we shall use

    $$A(i) = a + b\left(\frac{i}{L}\right)^2. \tag{3.2}$$

    In (3.2), as $b$ increases the efficiency and accuracy mentioned above grow. The quadratic dependence on $i$ is an arbitrary choice consistent with the requirements on $A(i)$. We have checked that a linear dependence does not significantly alter our results.

    Let us now consider the case for which mutations are present, i.e., the behavior of Eqs. (2.4), (2.5) and (2.6) with $f$ given by (3.1) and $A(i)$ by (3.2). Figure 1 shows the variation with $\epsilon$ of the graphs of $N_i(t)$ with the global ecological constraint parameter values ($c = 0$, $d \neq 0$) as a function of $i$ for 300 consecutive values of $t$ after a transient of 2000 iterations. The $L = 500$ initial values are taken at random and $\epsilon$ is chosen to be of order 1. The parameter values are related to the gp120 external protein of the HIV1 AIDS virus, which has an average length of 1500 bases (500 codons) and presents a mutation rate $p$ per base and cycle of the order of $10^{-3}$. This region is known as the env protein of the virus and is the most active region in the interaction with the host cell. Since we have assumed only one point mutation per codon, the RNA sequence mutation rate per cycle $\epsilon = p(L+2)$ is of order 1. Under these conditions we are at what is referred to in the literature as the threshold of hyper-mutability. S. Kauffman describes this situation as a regime at the edge of chaos [2]. The parameters $a$ and $b$ comply with the necessary condition for fixed point dynamics, $a + b < 3$. This condition was determined from a continuum limit analysis of the time evolution of the sum of $N_i(t)$ over all $i$.

    Our numerical simulations show the following general features: i) For $\epsilon = 0$, the $i = L$ strain survives. ii) As $\epsilon$ increases a distribution of configurations is formed. For $\epsilon$ in the range $0 < \epsilon < 1$ this distribution has a delocalization, measured either by the standard deviation or by its support, that grows with increasing $\epsilon$. The distribution maximum is shifted toward smaller values as $\epsilon$ grows (see Figs. 1.a to 1.d). iii) Above $\epsilon = 1$ the fixed point dynamics is lost and an alternating behavior emerges. Notice the presence of two branches in the graph of Fig. 1.e for values of $i$ within a region contained in the interval [300,400]. In this region the distribution $N_i$ alternates in one time step between having a non-zero value for $i$ even and a zero value for i o

    [FIGURE 1. Panels a to e: $N_i(t)$ plotted against $i$ (0 to 500) with $a = 0.500$, $b = 1.500$, $c = 0$, $d = 1$, $Q = 500$, $T = 2000$; the panel values of the coupling are $\epsilon$ = 0.01, 0.10, 0.50, 1.00 and 1.10.]

    Feature (i) indicates that in the absence of mutations, the fastest growing species predominates. This result can be obtained analytically by looking at the evolution of the sum of all the strains.
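One way to make this argument explicit is sketched below for the global constraint ($c = 0$, $d \neq 0$) with $\epsilon = 0$. The symbol $S(t)$ for the total population and the fixed point notation $N_L^{*}$ are introduced here for illustration and do not appear in the original text.

```latex
\begin{align*}
N_i(t+1) &= N_i(t)\left[A(i) - \frac{d}{L+1}\,S(t)\right],
\qquad S(t) = \sum_{\gamma=0}^{L} N_\gamma(t), \\[4pt]
\frac{N_i(t+1)}{N_L(t+1)} &= \frac{N_i(t)}{N_L(t)}\;
\frac{A(i) - d\,S(t)/(L+1)}{A(L) - d\,S(t)/(L+1)} , \\[4pt]
\text{only } i = L \text{ present:}\quad
N_L(t+1) &= (a+b)\,N_L(t) - \frac{d}{L+1}\,[N_L(t)]^2 ,
\qquad N_L^{*} = \frac{(a+b-1)(L+1)}{d} .
\end{align*}
```

The last line is a logistic map with parameter $A(L) = a + b$, whose non-trivial fixed point is stable for $1 < a + b < 3$. At that fixed point $d\,S/(L+1) = a + b - 1$, so the ratio $N_i/N_L$ is multiplied at each step by $A(i) - (a+b) + 1$. For $i < L$ this factor is smaller than 1, and its magnitude stays below 1 as long as $A(i) > a + b - 2$, which holds for every $i$ when $b < 2$. Under these conditions all strains other than $i = L$ die out.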

    The distribution delocalization observed in (ii) is produced by the diffusive term $A$, and the shift in the maximum is governed by the gradient term, which makes the system evolve towards $L/2$. Since both of the above terms are proportional to $\epsilon$, their effect is amplified as $\epsilon$ grows.
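Features (i) and (ii) can be explored with a minimal simulation sketch of Eq. (2.2) with $f$ from Eq. (3.1) and $A(i)$ from Eq. (3.2). Several choices here are assumptions made for illustration and are not taken from the paper: the lattice has sites $i = 0, \dots, L$ with $f$ set to zero outside it, the default $L = 50$ is much smaller than the $L = 500$ of the figures so that the run is fast, and the initial values are drawn uniformly from $(0, 0.1)$. All function names are hypothetical.

```python
import random

def growth_rate(i, L, a, b):
    """Growth coefficient A(i) = a + b (i/L)^2, as in Eq. (3.2)."""
    return a + b * (i / L) ** 2

def cml_step(N, a, b, c, d, eps):
    """One cycle: multiplication through f, Eq. (3.1), then mutation, Eq. (2.2)."""
    L = len(N) - 1                 # assumption: lattice sites i = 0..L
    p = eps / (L + 2)              # from eps = p (L + 2)
    total = sum(N)
    F = [growth_rate(i, L, a, b) * n - c * n * n - d / (L + 1) * n * total
         for i, n in enumerate(N)]
    new = []
    for i in range(L + 1):
        up = F[i + 1] if i < L else 0.0      # assumption: f = 0 beyond i = L
        down = F[i - 1] if i > 0 else 0.0    # assumption: f = 0 below i = 0
        new.append(F[i] + p * ((i + 1) * up + (L + 1 - i) * down - L * F[i]))
    return new

def run(L=50, a=0.5, b=1.5, c=0.0, d=1.0, eps=0.5, steps=2000, seed=0):
    """Iterate the map from small random initial values and return N_i after `steps` cycles."""
    rng = random.Random(seed)
    N = [rng.uniform(0.0, 0.1) for _ in range(L + 1)]
    for _ in range(steps):
        N = cml_step(N, a, b, c, d, eps)
    return N

def peak(N):
    """Composition i at which N_i is largest."""
    return max(range(len(N)), key=lambda i: N[i])

def spread(N):
    """Standard deviation of i weighted by N_i (negative round-off values clipped to zero)."""
    w = [max(n, 0.0) for n in N]
    s = sum(w)
    mean = sum(i * x for i, x in enumerate(w)) / s
    return (sum((i - mean) ** 2 * x for i, x in enumerate(w)) / s) ** 0.5
```

Comparing `peak(run(eps=0.0))` with `peak(run(eps=0.5))`, and the corresponding `spread` values, gives a quick look at how the surviving strain at $i = L$ turns into a broader distribution with a maximum at smaller $i$ once mutations are switched on. Quantitative agreement with the figures would require the paper's $L = 500$ and its initial conditions.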

    We may re-examine the above results in terms of the Darwinian scenario of evolution guided by the survival of the fittest species. It has been argued [5,13] that in an environmentally constrained population growth problem, the continuous generation of mutants produces a pool of variants of the fittest species which is the source of an optimization and adaptation process. Instead of a single fittest type, a clan of closely related types called a quasi-species is formed. In our modelling, the strain (sequence configuration) resembles a quasi-species. Furthermore, as pointed out by Schuster [15], the suggestion that higher error rates produce more mutations, which are available for selection and subject to a more efficient optimization process, is valid only up to a certain limit. At high error rates, mutations take over, inheritance breaks down, and selection finds nothing to act upon. The mutation rate at which this sharp transition between efficient optimization and almost random selection occurs is known as the error threshold. It has been pointed out previously [5.e] that RNA viruses evolve under conditions close to this threshold. Our model supports this view. The error threshold in our formalism is $\epsilon_c(a, b, d, L)$ and is dependent on parameters related to specific reproduction mechanisms at a molecular level.

    In Figs. 2.a to 2.d we show graphs of $N_i(t)$, again under a global ecological constraint and a fixed point restriction ($a + b = 2$), for several $b$ values with $d = 1$ and $\epsilon = 0.5$. As $b$ decreases, the population distribution widens, its dispersion $\sigma$ increases and its mean value $\bar{\imath}$ decreases.

    FIGURE 2. Effect of the variation of the parameter $b$ on the sequence distribution. The number $N_i(t)$ of $L$ codon sequences with codon composition $i$ is plotted as a function of $i$, for 300 consecutive iterations after a transient $T$ specified in the figure, starting from random initial conditions (consecutive points are joined by a straight line segment). As $b$ changes from Fig. 2.a to 2.d, the parameter values ($c = 0$, $d$), the number of composition space elements $Q$ ($L = Q$) and the coupling $\epsilon$ are kept constant. The value of $a$ satisfies $a + b = 2$, which ensures a fixed point dynamics (look at Fig. 1.c for the case with $b = 1.5$). The parameter choice is indicated in the figure and corresponds to a global ecological constraint. [Panel values: $b$ = 1.995, 1.000, 0.500, 0.005 with $a = 2 - b$, $c = 0$, $d = 1$, $\epsilon = 0.50$, $Q = 500$; axes $N_i(t)$ versus $i$ (0 to 500).]

    So far, we have looked only at the behavior of sequences coding for only one gene. If we allow for two genes, one with length $L_1$ and the other with length $L_2$, equation (2.2) reads

    $$N_{i,j}(t+1) = f_{i,j} + p\,\bigl[(i+1)\,f_{i+1,j} + (L_1+1-i)\,f_{i-1,j} + (j+1)\,f_{i,j+1} + (L_2+1-j)\,f_{i,j-1} - L\,f_{i,j}\bigr], \tag{3.3}$$

    where

    $$f_{i,j} = A_{i,j}\,N_{i,j}(t) - d\,N_{i,j}(t)\,\bar{N}, \tag{3.4}$$

    with $\bar{N}$ given by Eq. (3.5) [not legible in the source], $L = L_1 + L_2$, and $A_{i,j}$ the growth rate for $N_{i,j}$. Whenever there is a double subscript it is understood that the first symbol takes integer values in the interval $[0, L_1]$ and the second one in the interval $[0, L_2]$.

    A factorized fixed point solution ($N_{i,j}(t+1) = N_{i,j}(t) = N^{(1)}_i N^{(2)}_j$) of (3.3) can be obtained if we assume that $A_{i,j} = A^{(1)}_i + A^{(2)}_j$, where the superscript in parentheses distinguishes between the first and the second genes. The last expression for $A_{i,j}$ states that the efficiency of both genes equals the sum of the efficiency of each one taken separately. The solution is obtained from

    $$N^{(1)}_i N^{(2)}_j = f_{i,j} + p\,\bigl\{(i+1)\,f_{i+1,j} + (L_1+1-i)\,f_{i-1,j} + (j+1)\,f_{i,j+1} + (L_2+1-j)\,f_{i,j-1} - L\,f_{i,j}\bigr\}, \qquad f_{i,j} = \bigl[A^{(1)}_i + A^{(2)}_j - d\bar{N}\bigr]\,N^{(1)}_i N^{(2)}_j, \tag{3.6}$$

    where $A^{(k)}_i = a^{(k)} + b^{(k)}(i/L_k)^2$, with $k = 1, 2$, if the coefficients $A^{(2)}_j$ and $A^{(1)}_i$ are substituted by effective constants, determined through perturbative or variational methods. Under these conditions, the two gene problem corresponds to two one gene problems and the qualitative features described so far would hold for each gene.

    In principle, the above result extends to the case of sequences coding for $K$ genes.

    A similar situation is encountered in a gene where domains with different evolutionary characteristics can be identified; each domain fulfills the role of one of the $K$ genes mentioned above. In terms of this analogy, for each domain $K$ we associate a set of parameter values $\{a^{(K)}, b^{(K)}, D_K\}$, where $D_K$ is the codon length of the $K$-th domain. In this case the dependence of the quasi-species distribution on the parameter $b$ described for the full sequence will now hold semi-locally in the domain $D_K$ for the parameter $b_K$. In particular, we predict that in viral RNA, domains with low $\bar{\imath}_K$ (average composition value of the $D_K$ quasi-species) present a distribution with a high dispersion $\sigma_K$.

    The upper graph of Fig. 3 shows the semi-local period three amplitude $P_K$ of the power spectrum of the AIDS virus env-RNA coding for the protein gp120. The power spectrum is calculated from a Fourier transformation taken at each codon position $K$ for a 40 codon window centered at that position using the W/S degenerate representation with the technique developed in references [7,8]. If only WWS and WWW codons are present, $P_K$ is the square of the average composition value $\bar{\imath}_K$ [8], evaluated along the $K$-th window (domain) centered at the codon position $K$. Taking into consideration the experimental evidence [18,19] mentioned at the beginning of this section, it is rea

    FIGURE 4. a) Number $N_i(t)$ of codon sequences of length $L$, with codon composition $i$, plotted as a function of $i$, for 300 consecutive iterations, after a transient of 2000 steps, starting from random initial conditions. The parameter values ($a$, $b$, $d$), as well as the coupling constant $\epsilon$ and the number of codon compositions $Q$ ($L = Q$), are indicated in the figure. The choice of parameters corresponds to a generic ecological constraint. b) Same plot as above with a transient $T$ of only 300 iterations. [Panel values: $a = 1.500$, $b = 2.500$, $c = 1$, $d = 1$, $\epsilon = 1.104$, $Q = 500$; $T = 2000$ in (a) and $T = 300$ in (b); axes $N_i(t)$ versus $i$ (0 to 500).]

    dynamics and $T$ is referred to as the transient time. If the asymptotic orbit is an invariant set of the time evolution, and there is a neighborhood of initial conditions that tend to it, then the graphs are "space"-time attractors of the CML. In Fig. 4.b we decrease $T$ to 300 iterations. The purpose of this comparison is to exhibit the relevance of transients for this type of dynamics. It is often the case for extended systems that transients are unusually long (e.g. see references [23,24,25]). Notice that in order to attain the behavior shown in Fig. 4.a we need of the order of $10^3$ AIDS virus mutations (recall that we are assuming one mutation per cycle, i.e., $\epsilon \simeq 1$).

    4. DISCUSSION

    In the introduction we mentioned that the study of the evolution of genetic sequences is a complex problem. Two of the characteristics that typify complex systems research are the need for an interdisciplinary approach and the variety of alternative and complementary treatments. In this work we have chosen one of the many possible formulations for this problem: we have adopted the formalism of coupled map lattices based on an extensive phenomenological study [6,7,8,9]. This previous information was crucial in order to acquire the necessary insight for the introduction of approximations, with biological relevance, that allow for some numerical and analytical results. Under these conditions we have been able to formulate qualitative predictions regarding the stability under mutations of RNA sequences. The correspondence with experiment gives support to our approximations and suggests the likelihood of a formal justification. Work is currently underway in this direction. Based again on the previous phenomenology we have been able to re-express experimental information at the ribosome machinery level and trends from the statistical analysis of codon compositions as a fitness criterion for evolution. Concepts such as quasi-species and error threshold emerge in a natural way. Considerable investigation is required, however, in order to bridge the gap between our

    REFERENCES

    15. B.D. Preston et al., Science 242 (1988) 1168.
    16. J.D. Roberts et al., Science 242 (1988) 1171.
    17. K. Kaneko, Physica D41 (1990) 137.
    18. L.K. Thomas, D.B. Dix and R.C. Thompson, Proc. Natl. Acad. Sci. USA 85 (1988) 4242.
    19. D.B. Dix and R.C. Thompson, Proc. Natl. Acad. Sci. USA 86 (1988) 688.
    20. C.R. Woese, R.G. Gutell, R. Gupta and H. Noller, Microbiol. Rev. 47 (1983) 621.
    21. M.A. Firpo and A.E. Dahlberg, Post-Transcriptional Control of Gene Expression, Eds. J.E.G. McCarthy and M.F. Tuite, NATO ASI Series H49 (Springer-Verlag, Berlin, 1990) p. 185.
    22. B.R. Starcich et al., Cell 45 (1986) 637.
    23. R. Livi, G. Martínez-Mekler and S. Ruffo, Physica D45 (1990) 452.
    24. F. Bagnoli, S. Isola, G. Martínez-Mekler and S. Ruffo, Periodic Orbits in a Coupled Map Lattice Model, in Cellular Automata and Modelling of Complex Physical Systems, eds. P. Manneville, N. Boccara, G.Y. Vichniac and R. Bidaux, Springer Proceedings in Physics 46 (Springer-Verlag, Berlin, 1990) p. 282.
    25. J.P. Crutchfield and K. Kaneko, Phys. Rev. Lett. 60 (1988) 2715.