19
Natural selection. IV. The Price equation Steven A. Frank * Department of Ecology and Evolutionary Biology, University of California, Irvine, CA 92697–2525 USA The Price equation partitions total evolutionary change into two components. The first compo- nent provides an abstract expression of natural selection. The second component subsumes all other evolutionary processes, including changes during transmission. The natural selection component is often used in applications. Those applications attract widespread interest for their simplicity of expression and ease of interpretation. Those same applications attract widespread criticism by dropping the second component of evolutionary change and by leaving unspecified the detailed as- sumptions needed for a complete study of dynamics. Controversies over approximation and dynamics have nothing to do with the Price equation itself, which is simply a mathematical equivalence rela- tion for total evolutionary change expressed in an alternative form. Disagreements about approach have to do with the tension between the relative valuation of abstract versus concrete analyses. The Price equation’s greatest value has been on the abstract side, particularly the invariance relations that illuminate the understanding of natural selection. Those abstract insights lay the foundation for applications in terms of kin selection, information theory interpretations of natural selection, and partitions of causes by path analysis. I discuss recent critiques of the Price equation by Nowak and van Veelen ab . The heart and soul of much mathematics consists of the fact that the “same” object can be presented to us in different ways. Even if we are faced with the simple-seeming task of “giving” a large number, there is no way of doing this without also, at the same time, “giving” a hefty amount of extra struc- ture that comes as a result of the way we pin down—or the way we present—our large number. If we write our number as 1729 we are, sotto voce, ordering a preferred way of “computing it” (add one thousand to seven hundreds to two tens to nine). If we present it as 1+12 3 we are recommending another mode of computation, and if we pin it down—as Ra- manujuan did—as the first number express- ible as a sum of two cubes in two different ways, we are being less specific about how to compute our number, but have underscored a characterizing property of it within a subtle diophantine arena. ... This issue has been with us, of course, for- ever: the general question of abstraction, as separating what we want from what we are presented with. It is neatly packaged in the Greek verb aphairein, as interpreted by Aris- totle in the later books of the Metaphysics to mean simply separation : if it is whiteness we want to think about, we must somehow sep- arate it from white horse, white house, white hose, and all the other white things that it * email: [email protected]; homepage: http://stevefrank.org a doi: 10.1111/j.1420-9101.2012.02498.x in J. Evol. Biol. b Part of the Topics in Natural Selection series. See Box 1. invariably must come along with, in order for us to experience it at all [1, pp. 222–223]. Somewhere ... between the specific that has no meaning and the general that has no con- tent there must be, for each purpose and at each level of abstraction, an optimum degree of generality [2, pp. 197–198]. INTRODUCTION Evolutionary theory analyzes the change in pheno- type over time. We may interpret phenotype broadly to include organismal characters, variances of characters, correlations between characters, gene frequency, DNA sequence—essentially anything we can measure. How does a phenotype influence its own change in fre- quency or the change in the frequencies of correlated phe- notypes? Can we separate that phenotypic influence from other evolutionary forces that also cause change? The as- sociation of a phenotype with change in frequency, sep- arated from other forces that change phenotype, is one abstract way to describe natural selection. The Price equation is that kind of abstract separation. Do we really need such abstraction, which may seem rather distant and vague? Instead of wasting time on such things as the abstract essence of natural selection, why not get down to business and analyze real problems? For example, we may wish to know how the evolution- ary forces of mutation and selection interact to determine biological pattern. We could make a model with genes that have phenotypic effects, selection that acts on those phenotypes to change gene frequency, and mutation that changes one gene into another. We could do some calcu- lations, make some predictions about, for example, the frequency of deleterious mutations that cause disease, arXiv:1204.1515v1 [q-bio.PE] 6 Apr 2012

Natural selection. IV. The Price equation · 2018. 4. 16. · Natural selection. IV. The Price equation Steven A. Frank Department of Ecology and Evolutionary Biology, University

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

  • Natural selection. IV. The Price equation

    Steven A. Frank∗

    Department of Ecology and Evolutionary Biology,University of California, Irvine, CA 92697–2525 USA

    The Price equation partitions total evolutionary change into two components. The first compo-nent provides an abstract expression of natural selection. The second component subsumes all otherevolutionary processes, including changes during transmission. The natural selection componentis often used in applications. Those applications attract widespread interest for their simplicityof expression and ease of interpretation. Those same applications attract widespread criticism bydropping the second component of evolutionary change and by leaving unspecified the detailed as-sumptions needed for a complete study of dynamics. Controversies over approximation and dynamicshave nothing to do with the Price equation itself, which is simply a mathematical equivalence rela-tion for total evolutionary change expressed in an alternative form. Disagreements about approachhave to do with the tension between the relative valuation of abstract versus concrete analyses. ThePrice equation’s greatest value has been on the abstract side, particularly the invariance relationsthat illuminate the understanding of natural selection. Those abstract insights lay the foundationfor applications in terms of kin selection, information theory interpretations of natural selection,and partitions of causes by path analysis. I discuss recent critiques of the Price equation by Nowakand van Veelenab.

    The heart and soul of much mathematicsconsists of the fact that the “same” objectcan be presented to us in different ways.Even if we are faced with the simple-seemingtask of “giving” a large number, there is noway of doing this without also, at the sametime, “giving” a hefty amount of extra struc-ture that comes as a result of the way wepin down—or the way we present—our largenumber. If we write our number as 1729 weare, sotto voce, ordering a preferred way of“computing it” (add one thousand to sevenhundreds to two tens to nine). If we present itas 1+123 we are recommending another modeof computation, and if we pin it down—as Ra-manujuan did—as the first number express-ible as a sum of two cubes in two differentways, we are being less specific about how tocompute our number, but have underscored acharacterizing property of it within a subtlediophantine arena. . . .

    This issue has been with us, of course, for-ever: the general question of abstraction, asseparating what we want from what we arepresented with. It is neatly packaged in theGreek verb aphairein, as interpreted by Aris-totle in the later books of the Metaphysics tomean simply separation: if it is whiteness wewant to think about, we must somehow sep-arate it from white horse, white house, whitehose, and all the other white things that it

    ∗ email: [email protected]; homepage: http://stevefrank.orga doi: 10.1111/j.1420-9101.2012.02498.x in J. Evol. Biol.b Part of the Topics in Natural Selection series. See Box 1.

    invariably must come along with, in order forus to experience it at all [1, pp. 222–223].

    Somewhere . . . between the specific that hasno meaning and the general that has no con-tent there must be, for each purpose and ateach level of abstraction, an optimum degreeof generality [2, pp. 197–198].

    INTRODUCTION

    Evolutionary theory analyzes the change in pheno-type over time. We may interpret phenotype broadlyto include organismal characters, variances of characters,correlations between characters, gene frequency, DNAsequence—essentially anything we can measure.

    How does a phenotype influence its own change in fre-quency or the change in the frequencies of correlated phe-notypes? Can we separate that phenotypic influence fromother evolutionary forces that also cause change? The as-sociation of a phenotype with change in frequency, sep-arated from other forces that change phenotype, is oneabstract way to describe natural selection. The Priceequation is that kind of abstract separation.

    Do we really need such abstraction, which may seemrather distant and vague? Instead of wasting time onsuch things as the abstract essence of natural selection,why not get down to business and analyze real problems?For example, we may wish to know how the evolution-ary forces of mutation and selection interact to determinebiological pattern. We could make a model with genesthat have phenotypic effects, selection that acts on thosephenotypes to change gene frequency, and mutation thatchanges one gene into another. We could do some calcu-lations, make some predictions about, for example, thefrequency of deleterious mutations that cause disease,

    arX

    iv:1

    204.

    1515

    v1 [

    q-bi

    o.PE

    ] 6

    Apr

    201

    2

    mailto:[email protected]://stevefrank.orghttp://dx.doi.org/10.1111/j.1420-9101.2012.02498.x

  • 2

    Box 1. Topics in the theory of natural selection

    This article is part of a series on natural selection. Althoughthe theory of natural selection is simple, it remains endlesslycontentious and difficult to apply. My goal is to make more ac-cessible the concepts that are so important, yet either mostlyunknown or widely misunderstood. I write in a nontechni-cal style, showing the key equations and results rather thanproviding full derivations or discussions of mathematical prob-lems. Boxes list technical issues and brief summaries of theliterature.

    and compare those predictions to observations. All clearand concrete, without need of any discussion of theessence of things.

    However, we may ask the following. Is there some re-orientation for the expression of natural selection thatmay provide subtle perspective, from which we can un-derstand our subject more deeply and analyze our prob-lems with greater ease and greater insight? My answeris, as I have mentioned, that the Price equation providesthat sort of reorientation. To argue the point, I will haveto keep at the distinction between the concrete and theabstract, and the relative roles of those two endpoints inmature theoretical understanding.

    Several decades have passed since Price’s [3, 4] originalarticles. During that span, published claims, counter-claims and misunderstandings have accumulated to thepoint that it seems worthwhile to revisit the subject. Onthe one hand, the Price equation has been applied to nu-merous practical problems, and has also been elevated bysome to almost mythical status, as if it were the ultimatepath to enlightenment for those devoted to evolutionarystudy (Box 2).

    On the other hand, the opposition has been gainingadherents who boast the sort of disparaging anecdotesand slogans that accompany battle. In a recent book,Nowak and Highfield [5] counter

    The Price equation did not, however, proveas useful as [Price and Hamilton] had hoped.It turned out to be the mathematical equiv-alent of a tautology. . . . If the Price equationis used instead of an actual model, then thearguments hang in the air like a tantalizingmirage. The meaning will always lie just outof the reach of the inquisitive biologist. Thismirage can be seductive and misleading. ThePrice equation can fool people into believingthat they have built a mathematical modelof whatever system they are studying. Butthis is often not the case. Although answersdo indeed seem to pop out of the equation,like rabbits from a magician’s hat, nothing isachieved in reality.

    Nowak and Highfield [5] approvingly quote van Vee-len et al. [6] with regard to calling the Price equation

    a mathematical tautology. van Veelen et al. [6] empha-size the point by saying that the Price equation is likesoccer/football star Johan Cruyff’s quip about the se-cret of success: “You always have to make sure that youscore one goal more than your opponent.” The state-ment is always true, but provides no insight. Nowak andHighfield [5] and van Veelen et al. [6] believe their argu-ments demonstrate that the Price equation is true in thesame trivial sense, and they call that trivial type of trutha mathematical tautology. Interestingly, magazines, on-line articles, and the scientific literature have for severalyears been using the phrase mathematical tautology forthe Price equation, although Nowak and Highfield [5] andvan Veelen et al. [6] do not provide citations to previousliterature.

    As far as I know, the first description of the Price equa-tion as a mathematical tautology was in Frank [7]. I usedthe phrase in the sense of the epigraph from Mazur, a for-mal equivalence between different expressions of the sameobject. Mathematics and much of statistics are aboutformal equivalences between different expressions of thesame object. For example, the Laplace transform changesa mathematical expression into an alternative form withthe same information, and analysis of variance decom-poses the total variance into a sum of component vari-ances. For any mathematical or statistical equivalence,value depends on enhanced analytical power that easesfurther derivations and calculations, and on the ways inwhich previously hidden relations are revealed.

    In light of the contradictory points of view, the maingoal of this article is to sort out exactly what the Priceequation is, how we should think about it, and its valueand limitations in reasoning about evolution. Subsequentarticles will show the Price equation in action, appliedto kin selection, causal analysis in evolutionary models,and an information perspective of natural selection andFisher’s fundamental theorem.

    OVERVIEW

    The first section derives the Price equation in its fulland most abstract form. That derivation allows us toevaluate the logical status of the equation in relation tovarious claims of fundamental flaw. The equation sur-vives scrutiny. It is a mathematical relation that ex-presses the total amount of evolutionary change in analternative and mathematically equivalent way. Thatequivalence provides insight into aspects of natural se-lection and also provides a guide that, in particular ap-plications, often leads to good approaches for analysis.

    The second section contrasts two perspectives of evo-lutionary analysis. In standard models of evolutionarychange, one begins with the initial population state andthe rules of change. The rules of change include the fit-ness of each phenotype and the change in phenotype be-tween ancestor and descendant. Given the initial stateand rules of change, one deduces the state of the changed

  • 3

    population. Alternatively, one may have data on theinitial population state, the changed population state,and the ancestor-descendant relations that map entitiesfrom one population to the other. Those data may bereduced to the evolutionary distance between two pop-ulations, providing inductive information about the un-derlying rules of change. Natural populations have no in-trinsic notion of fitness or rules of change. Instead, theyinductively accumulate information. The Price equationincludes both the standard deductive model of evolution-ary change and the inductive model by which informa-tion accumulates in relation to the evolutionary distancebetween populations.

    The third and fourth sections discuss the Price equa-tion’s abstract properties of invariance and recursion.The invariance properties include the information theoryinterpretation of natural selection. Recursion providesthe basis for analyzing group selection and other modelsof multilevel selection.

    The fifth section relates the Price equation to variousexpressions that have been used throughout the historyof evolutionary theory to analyze natural selection. Themost common form describes natural selection by the co-variance between phenotype and fitness or by the covari-ance between genetic breeding value and fitness. Thecovariance expression is one part of the Price equationthat, when used alone, describes the natural selectioncomponent of total evolutionary change. The essenceof those covariance forms arose in the early studies ofpopulation and quantitative genetics, have been used ex-tensively during much of the modern history of animalbreeding, and began to receive more mathematical de-velopment in the 1960s and 1970s. Recent critiques ofthe Price equation focus on the same covariance expres-sion that has been widely used throughout the history ofpopulation and quantitative genetics to analyze naturalselection and to approximate total evolutionary change.

    The sixth section returns to the full abstract form ofthe equation. I compare a few variant expressions thathave been promoted as improvements on the originalPrice equation. Variant forms are indeed helpful withregard to particular abstract problems or particular ap-plications. However, most variants are simply minor re-arrangements of the mathematical equivalence for totalevolutionary change given by the original Price equation.The recent extension by Kerr and Godfrey-Smith [8] doesprovide a slightly more general formulation by expand-ing the fundamental set mapping that defines Price’s ap-proach. The set mapping basis for the Price equationdeserves more careful study and further mathematicalwork.

    The seventh section analyzes various flaws that havebeen ascribed to the Price equation. For example, thePrice equation in its most abstract form does not con-tain enough information to follow evolutionary dynamicsthrough multiple rounds of natural selection. By con-trast, classical dynamic models of population genetics aresufficient to follow change through time. Much has been

    Box 2. Price equation literature

    A large literature introduces and reviews the Price equation.I list some key references that can be used to get started[7, 9–16].

    Diverse applications have been developed with the Priceequation. I list a few examples [17–28].

    Quantitative genetics theory often derives from the covari-ance expression given by Robertson [29], which is a form ofthe covariance term of the Price equation. The basic theorycan be found in textbooks [30, 31]. Much of the modern workcan be traced through the widely cited article by Lande andArnold [32].

    Harman [33] provides an interesting overview of Price’slife and evokes an Olympian sense of the power and magicof the Price equation. See Schwartz [34] for an alternativebiographical sketch.

    made of this distinction with regard to dynamic suffi-ciency. The distinction arises from the fact that classicaldynamics in population genetics makes more initial as-sumptions than the abstract Price equation. It must betrue that all mathematical equivalences for total evolu-tionary change have the same dynamic status given thesame initial assumptions. Each additional well-chosen as-sumption typically enhances the specificity and reducesthe scope and generality of the analysis. The epigraphfrom Boulding emphasizes that the degree of specificityversus generality is an explicit choice of the analyst withrespect to initial assumptions.

    The Discussion considers the value and limitations ofthe Price equation in relation to recent criticisms byNowak and van Veelen. The critics confuse the distinctroles of general abstract theory and concrete dynamicalmodels for particular cases. The enduring power of thePrice equation arises from the discovery of essential in-variances in natural selection. For example, kin selectiontheory expresses biological problems in terms of related-ness coefficients. Relatedness measures the associationbetween social partners. The proper measure of related-ness identifies distinct biological scenarios with the same(invariant) evolutionary outcome. Invariance relationsprovide the deepest insights of scientific thought.

    THE PRICE EQUATION

    The mathematics given here applies not onlyto genetical selection but to selection in gen-eral. It is intended mainly for use in deriv-ing general relations and constructing theo-ries, and to clarify understanding of selectionphenomena, rather than for numerical calcu-lation [4, p. 485].

    I have emphasized that the Price equation is a mathe-matical equivalence. The equation focuses on separation

  • 4

    of total evolutionary change into a part attributed to se-lection and a remainder term. That separation providesan abstraction of the nature of selection. As Price wrotesometime around 1970 but published posthumously inPrice [35]: “Despite the pervading importance of selec-tion in science and life, there has been no abstraction andgeneralization from genetical selection to obtain a generalselection theory and general selection mathematics.”

    It is useful first to consider the Price equation in thismost abstract form. I follow my earlier derivations [7,10, 24, 36], which differ little from the derivation givenby Price [4] when interpreted in light of Price [35].

    The abstract expression can best be thought of interms of mapping items between two sets [7, 35]. In biol-ogy, we usually think of an ancestral population at sometime and a descendant population at a later time. Al-though there is no need to have an ancestor-descendantrelation, I will for convenience refer to the two sets as an-cestor and descendant. What does matter is the relationsbetween the two sets, as follows.

    Definitions

    The full abstract power of the Price equation requiresadhering strictly to particular definitions. The definitionsarise from the general expression of the relations betweentwo sets.

    Let qi be the frequency of the ith type in the ancestralpopulation. The index i may be used as a label for anysort of property of things in the set, such as allele, geno-type, phenotype, group of individuals, and so on. Letq′i be the frequencies in the descendant population, de-fined as the fraction of the descendant population that isderived from members of the ancestral population thathave the label i. Thus, if i = 2 specifies a particularphenotype, then q′2 is not the frequency of the phenotypei = 2 among the descendants. Rather, it is the frac-tion of the descendants derived from entities with thephenotype i = 2 in the ancestors. One can have par-tial assignments, such that a descendant entity derivesfrom more than one ancestor, in which case each ances-tor gets a fractional assignment of the descendant. Thekey is that the i indexing is always with respect to theproperties of the ancestors, and descendant frequencieshave to do with the fraction of descendants derived fromparticular ancestors.

    Given this particular mapping between sets, we canspecify a particular definition for fitness. Let q′i =qi(wi/w̄), where wi is the fitness of the ith type andw̄ =

    ∑qiwi is average fitness. Here, wi/w̄ is propor-

    tional to the fraction of the descendant population thatderives from type i entities in the ancestors.

    Usually, we are interested in how some measurementchanges or evolves between sets or over time. Let themeasurement for each i be zi. The value z may be thefrequency of a gene, the squared deviation of some phe-notypic value in relation to the mean, the value obtained

    by multiplying measurements of two different phenotypesof the same entity, and so on. In other words, zi can bea measurement of any property of an entity with label,i. The average property value is z̄ =

    ∑qizi, where this

    is a population average.The value z′i has a peculiar definition that parallels

    the definition for q′i. In particular, z′i is the average mea-

    surement of the property associated with z among thedescendants derived from ancestors with index i. Thepopulation average among descendants is z̄′ =

    ∑q′iz′i.

    The Price equation expresses the total change in theaverage property value, ∆z̄ = z̄′ − z̄, in terms of thesespecial definitions of set relations. This way of expressingtotal evolutionary change and the part of total changethat can be separated out as selection is very differentfrom the usual ways of thinking about populations andevolutionary change. The derivation itself is very easy,but grasping the meaning and becoming adept at usingthe equation is not so easy.

    I will present the derivation in two stages. The firststage makes the separation into a part ascribed to selec-tion and a part ascribed to property change that coverseverything beyond selection. The second stage retainsthis separation, changing the notation into standard sta-tistical expressions that provide the form of the Priceequation commonly found in the literature. I follow withsome examples to illustrate how particular set relationsare separated into selection and property change compo-nents. The next section considers two distinct interpre-tations of the Price equation in relation to dynamics.

    Derivation: separation into selection and propertyvalue change

    We use ∆qi = q′i − qi for frequency change associ-

    ated with selection, and ∆zi = z′i − zi for property value

    change. Both expressions for change depend on the spe-cial set relation definitions given above.

    We are after an alternative expression for total change,∆z̄. Thus,

    ∆z̄ = z̄′ − z̄

    =∑

    q′iz′i −∑

    qizi

    =∑

    q′i(z′i − zi) +

    ∑q′izi −

    ∑qizi

    =∑

    q′i(∆zi) +∑

    (∆qi)zi.

    Switching the order of the terms on the right side of thelast line yields

    ∆z̄ =∑

    (∆qi)zi +∑

    q′i(∆zi), (1)

    a form emphasized by Frank [10, eqn 1]. The first termseparates the part of total change caused by changes infrequency. We call this the part caused by selection, be-cause this is the part that arises directly from differential

  • 5

    contribution by ancestors to the descendant population[35]. Because the set mappings define all of the directattributions of success for each i with respect to the as-sociated properties zi, it is reasonable to separate outthis direct component as the abstraction of selection. Itis of course possible to define other separations. I discussone particular alternative later. However, it is hard tothink of other separations that would describe selectionin a better way at the most abstract and general levelof the mappings between two sets. This first term hasalso been called the partial evolutionary change causedby natural selection (Eq. (7)).

    The second term describes the part of total changecaused by changes in property values. Recall that ∆zi =z′i − zi, and that z′i is the property value among enti-ties that descend from i. Many different processes maycause descendant property values to differ from ances-tral values. In fact, the assignment of a descendant to anancestor can be entirely arbitrary, so that there is no rea-son to assume that descendants should be like ancestors.Usually, we will work with systems in which descendantsdo resemble ancestors, but the degree of such associationscan be arranged arbitrarily. This term for change in prop-erty value encompasses everything beyond selection. Theidea is that selection affects the relative contribution ofancestors and thus the changes in frequencies of repre-sentation, but what actually gets represented among thedescendants will be subject to a variety of processes thatmay alter the value expressed by descendants.

    The equation is exact and must apply to every evolu-tionary system that can be expressed as two sets withcertain ancestor-descendant or mapping relations. It isin that sense that I first used the phrase mathematicaltautology [7]. The nature of separation and abstractionis well described by the epigraph from Mazur at the startof this article.

    Derivation: statistical notation

    Price [4] used statistical notation to write Eq. (1). Forthe first term, by following prior definitions we have

    ∆qi = q′i − qi

    = qiwiw̄− qi

    = qi

    (wiw̄− 1),

    so that∑(∆qi)zi =

    ∑qi

    (wiw̄− 1)zi = Cov(w, z)/w̄,

    using the standard definition for population covariance.For the second term, we have∑

    q′i(∆zi) =∑

    qiwiw̄

    (∆zi) = E(w∆z)/w̄,

    where E means expectation, or average over the full pop-ulation. Putting these statistical forms into Eq. (1) and

    moving w̄ to the left side for notational convenience yieldsa commonly published form of the Price equation

    w̄∆z̄ = Cov(w, z) + E(w∆z). (2)

    Price [35] and Frank [7] present examples of set mappingsexpressed in relation to the Price equation.

    DYNAMICS: INDUCTIVE AND DEDUCTIVEPERSPECTIVES

    The Price equation describes evolutionary change be-tween two populations. Three factors express one itera-tion of dynamical change: initial state, rules of change,and next state. In the Price equation, the phenotypes,zi, and their frequencies, qi, describe the initial popula-tion state. Fitnesses, wi, and property changes, ∆zi, setthe rules of change. Derived phenotypes, z′i, and theirfrequencies, q′i, express the next population state.

    Models of evolutionary change essentially always an-alyze forward or deductive dynamics. In that case, onestarts with initial conditions and rules of change and cal-culates the next state. Most applications of the Priceequation use this traditional deductive analysis. Suchapplications lead to predictions of evolutionary outcomegiven assumptions about evolutionary process, expressedby the fitness parameters and property changes.

    Alternatively, one can take the state of the initial pop-ulation and the state of the changed population as given.If one also has the mappings between initial and changedpopulations that connect each entity, i, in the initial pop-ulation to entities in the changed population, then onecan calculate (induce) the underlying rules of change. Atfirst glance, this inductive view of dynamics may seemrather odd and not particularly useful. Why start withknowledge of the evolutionary sequence of populationstates and ancestor-descendant relations as given, and in-ductively calculate fitnesses and property changes? Theinductive view takes the fitnesses, wi, to be derived fromthe data rather than an intrinsic property of each type.

    The Price equation itself does not distinguish betweenthe deductive and inductive interpretations. One canspecify initial state and rules of change and then deduceoutcome. Or one can specify initial state and outcomealong with ancestor-descendant mappings, and then in-duce the underlying rules of change. It is useful to under-stand the Price equation in its full mathematical gener-ality, and to understand that any specific interpretationarises from additional assumptions that one brings to aparticular problem. Much of the abstract power of thePrice equation comes from understanding that, by itself,the equation is a minimal description of change betweenpopulations.

    The deductive interpretation of the Price equation isclear. What value derives from the inductive perspec-tive? In observational studies of evolutionary change, weonly have data on population states. From those data,we use the inductive perspective to make inferences about

  • 6

    the underlying rules of change. Note that inductive es-timates for evolutionary process derive from the amountof change, or distance, between ancestor and descendantpopulations. The Price equation includes that inductive,or retrospective view, by expressing the distance betweenpopulations in terms of ∆z̄. I develop that distance in-terpretation in the following sections.

    Perhaps more importantly, natural selection itself isinherently an inductive process by which information ac-cumulates in populations. Nature does not intrinsically“know” of fitness parameters. Instead, frequency changesand the mappings between ancestor and descendant areinherent in a population’s response to the environment,leading to a sequence of population states, each separatedby an evolutionary distance. That evolutionary distanceprovides information that populations accumulate induc-tively about the fitnesses of each phenotype [36]. ThePrice equation includes both the deductive and inductiveperspectives. We may choose to interpret the equationin either way depending on our goals of analysis.

    ABSTRACT PROPERTIES: INVARIANCE

    The Price equation describes selection by the term∑(∆qi)zi = Cov(w, z)/w̄. Any instance of evolutionary

    change that has the same value for this sum has the sameamount of total selection. Put another way, for any par-ticular value for total selection, there is an infinite num-ber of different combinations of frequency changes andcharacter measurements that will add up to the sametotal value for selection. All of those different combina-tions lead to the same value with respect to the amountof selection. We may say that all of those different com-binations are invariant with respect to the total quantityof selection. The deepest insights of science come fromunderstanding what does not matter, so that one can alsosay exactly what does matter—what is invariant [37, 38].

    The invariance of selection with respect to transforma-tions of the fitnesses, w, and the phenotypes, z, that havethe same Cov(w, z) means that, to evaluate selection, itis sufficient to analyze this covariance. At first glance, itmay seem contradictory that the covariance, commonlythought of as a linear measure of association, can bea complete description for selection, including nonlinearprocesses. Let us step through this issue, first looking atwhy the covariance is a sufficient expression of selection,and then at the limitations of this covariance expressionin evolutionary analysis.

    Covariance as a measure of distance: definitions

    Much of the confusion with respect to covariance andvariance terms in selection equations arises from think-ing only of the traditional statistical usage. In statistics,covariance typically measures the linear association be-tween pairs of observations, and variance is a measure of

    the squared spread of observations. Alternatively, covari-ances and variances provide measures of distance, whichultimately can be understood as measures of information[36]. This section introduces the notation for the geomet-ric interpretation of distance. The next section gives themain geometric result, and the following section presentssome examples.

    The identity∑

    (∆qi)zi = Cov(w, z)/w̄ provides thekey insight. It helps to write this identity in an alter-native form. Note from the prior definition q′i = qiwi/w̄that

    ∆qi = q′i − qi = qi(wi/w̄ − 1) = qiai, (3)

    where ai = wi/w̄− 1 is Fisher’s average excess in fitness,a commonly used expression in population and quantita-tive genetics [39–41]. A value of zero means that an entityhas average fitness, and therefore fitness effects and se-lection do not change the frequency of that entity. Usingthe average excess in fitness, we can write the invariantexpression for selection as∑

    (∆qi)zi =∑

    qiaizi = Cov(w, z)/w̄. (4)

    We can think of the state of the population as thelisting of character states, zi. Thus we write the pop-ulation state as z = (z1, z2, . . .). The subscripts runover every different entity in the population, so the vec-tor z is a complete description of the entire population.Similarly, for the frequency fluctuations, ∆qi = qiai,we can write the listing of all fluctuations as a vector,∆q = (∆q1,∆q2, . . .).

    It is often convenient to use the dot product notation

    ∆q · z =∑

    (∆qi)zi = Cov(w, z)/w̄

    in which the dot specifies the sum obtained by multiply-ing each pair of items from two vectors. Before turningto some geometric examples in the following section, weneed a definition for the length of a vector. Traditionally,one uses the definition

    ‖z‖ =√∑

    z2i ,

    in which the length is the square root of the sum ofsquares, which is the standard measure of length in Eu-clidean geometry.

    Covariance as a measure of distance: examples

    A simple identity relates a dot product to a measureof distance and to covariance selection

    ∆q · z = ‖∆q‖‖z‖ cosφ = Cov(w, z)/w̄, (5)

    where φ is the angle between the vectors ∆q and z(Fig. 1). If we standardize the character vector −z =z/‖z‖, then the standardized vector has a length of one,

  • 7

    ‖−z‖ = 1, which simplifies the dot product expression ofselection to

    ∆q · −z = ‖∆q‖ cosφ,

    providing the geometric representation illustrated inFig. 1.

    The covariance can be expressed as the product of aregression coefficient and a variance term

    Cov(w, z)/w̄ = βzwVar(w)/w̄ = βwzVar(z)/w̄, (6)

    where the notation βxy describes the regression coeffi-cient of x on y [3]. This identity shows that the expres-sion of selection in terms of a regression coefficient anda variance term is equivalent to the geometric expressionof selection in terms of distance.

    I emphasize these identities for two reasons. First, asMazur stated in the epigraph: “The heart and soul ofmuch mathematics consists of the fact that the ‘same’object can be presented to us in different ways.” If anobject is important, such as natural selection surely is,then it pays to study that object from different perspec-tives to gain deeper insight.

    Second, the appearance of statistical functions, suchas the covariance and variance, in selection equationssometimes leads to mistaken conclusions. In the selec-tion equations, it is better to think of the covariance andvariance terms arising because they are identities with ge-ometric or other interpretations of selection, rather thanthinking of those terms as summary statistics of proba-bility distributions. The problem with thinking of thoseterms as statistics of probability distributions is that thevariance and covariance are not in general sufficient de-scriptions for probability distributions. That lack of suffi-ciency for probability may lead one to conclude that thoseterms are not sufficient for a general expression of selec-tion. However, those covariance and variance terms aresufficient. That sufficiency can be understood by think-ing of those terms as identities for distance or measuresof information [36].

    It is true that in certain particular applications ofquantitative genetics or stochastic sampling processes,one does interpret the variances and covariances as sum-mary statistics of probability distributions, usually thenormal or Gaussian distribution. However, it is impor-tant to distinguish those special applications from thegeneral selection equations.

    Invariance and information

    For the general selection expression in Eq. (5), anytransformations that do not affect the net values are in-variant with respect to selection. For example, trans-formations of the fitnesses and associated frequencychanges, ∆q, are invariant if they leave unchanged thedistance expressed by ∆q · z = Cov(w, z)/w̄. Similarly,changes in the pattern of phenotypes are invariant to the

    z

    z

    q

    q

    qco

    s

    qco

    s

    (a)

    (b)

    FIG. 1. Geometric expression of selection. The plots show theequivalence of the dot product, the geometric expression andthe covariance, as given in Eq. (5). For both plots, z = (1, 4)and −z = z/‖z‖ = (0.24, 0.97). The dashed line shows the per-pendicular between the pattern of frequency changes derivedfrom fitnesses, ∆q, and the phenotypic pattern, −z. The ver-tex of the two vectors is at the origin (0, 0). The distance fromthe origin to the intersection of the perpendicular along −z isthe total amount of selection, ‖∆q‖ cosφ. (a) The vector offrequency changes that summarize fitness is ∆q = (−0.4, 0.4).The angle between the vector of frequency changes and thephenotypes is φ = arccos [(∆q · −z)/‖∆q‖] which, in this ex-ample, is 1.03 radians or 59◦. In this case, the total selectionis ‖∆q‖ cosφ = 0.29. (b) In this plot, ∆q = (0.4,−0.4),yielding an angle φ of 121◦. The perpendicular intersectsthe negative projection of the phenotype vector, shown as adashed line, associated with the negative change by selectionof ‖∆q‖ cosφ = −0.29.

  • 8

    extent that they leave ∆q·z unchanged. These invarianceproperties of selection, measured as distance, may notappear very interesting at first glance. They seem to besaying that the outcome is the outcome. However, thehistory of science suggests that studying the invariantproperties of key expressions can lead to insight.

    Few authors have developed an interest in the invariantqualities of selection. Fisher [39] initiated discussion withhis fundamental theorem of natural selection, a specialcase of Eq. (5) [10]. Although many authors commentedon the fundamental theorem, most articles did not ana-lyze the theorem with respect to its essential mathemat-ical insights about selection. Ewens [42] reviewed thefew attempts to understand the mathematical basis ofthe theorem and its invariant quantities. Frank [36] tiedthe theorem to Fisher information [43, 44], hinting atan information theory interpretation that arises from thefundamental selection equation of Eq. (5).

    In spite of the importance of selection in many fieldsof science, the potential interpretation of Eq. (5) withrespect to invariants of information theory has hardlybeen developed. I briefly outline the potential connec-tions here [36]. I develop this information perspective ofselection in a later article, along with Fisher’s fundamen-tal theorem.

    To start, define the partial change in phenotype causedby natural selection as

    ∆S z̄ = ∆q · z = Cov(w, z)/w̄. (7)

    The concept of a partial change caused by natural selec-tion arises from Fisher’s fundamental theorem [39, 45–47]. With this definition, we can use eqns 5 and 6 towrite

    ∆S z̄ = βzwVar(w)/w̄ = w̄βzwVar(w/w̄). (8)

    From Eq. (3), we have the definition for the average ex-cess in fitness ai = wi/w̄ − 1. Thus, we can expand theexpression for the variance in fitness as

    Var(w/w̄) =∑

    qi

    (wiw̄− 1)2

    =∑

    qia2i .

    From Eq. (3), we also have the change in frequency interms of the average excess, ∆qi = qiai, and equivalently,ai = ∆qi/qi, thus

    Var(w/w̄) =∑

    qi

    (∆qiqi

    )2=∑(∆qi√

    qi

    )2= ∆q̂ ·∆q̂,

    where ∆q̂i = ∆qi/√qi is a standardized fluctuation in

    frequency, and ∆q̂ is the vector of standardized fluctu-ations. These alternative forms simply express the vari-ance in fitness in different ways. The interesting resultfollows from the fact that

    Var(w/w̄) = ∆q̂ ·∆q̂ = F (∆q̂)

    is the Fisher information, F , in the frequency fluctua-tions, ∆q̂. Fisher information is a fundamental quantityin information theory, Bayesian analysis, likelihood the-ory and the informational foundations of statistical in-ference. Fisher information is a variant form of the morefamiliar Shannon and Kullback-Leibler information mea-sures, in which the Fisherian form expresses changes ininformation.

    Once again, we have a simple identity. Although itis true that Fisher information is just an algebraic rear-rangement of the variance in fitness, some insight may begained by relating selection to information. The varianceform calls to mind a statistical description of selection ora partial description of a probability distribution. TheFisher information form suggests a relation between nat-ural selection and the way in which populations accumu-late information [36].

    We may now write our fundamental expression for se-lection as

    ∆S z̄ = w̄βzw F (∆q̂).

    We may read this expression for selection as: the changein mean character value caused by natural selection, ∆S z̄,is equal to the total Fisher information in the frequencyfluctuations, F , multiplied the scaling β that describesthe amount of the potential information that the pop-ulation captures when expressed in units of phenotypicchange. In other words, the distance ∆S z̄ measures theinformational gain by the population caused by naturalselection.

    The invariances set by this expression may be viewedin different ways. For example, the distance of evolution-ary change by selection, ∆S z̄, is invariant with respect tomany different combinations of frequency fluctuations,∆q̂, and scalings between phenotype and fitness. Simi-larly, any transformations of frequency fluctuations thatleave the measure of information, F (∆q̂), invariant donot alter the scaled change in phenotype caused by natu-ral selection. The full implications remain to be explored.

    NOTE ADDED AFTER PUBLICATION OFTHE JOURNAL ARTICLE. I have oversimplified abit in this section, with the aim to keep the presenta-tion brief. The proper expression of Fisher information isF (∆q̂)(∆θ)2, where ∆θ is the scale of change over whichpopulation differences are measured, typically taken as∆θ → 0 [36].

  • 9

    Summary of selection identities

    The various identities for the part of total evolutionarychange caused by selection include

    ∆S z̄ = Cov(w, z)/w̄

    = w̄βzwVar(w/w̄)

    = ∆q · z= ‖∆q‖‖z‖ cosφ= w̄βzw(∆q̂ ·∆q̂)= w̄βzw F (∆q̂). (9)

    These forms show the equivalence of the statistical, ge-ometrical and informational expressions for natural se-lection. These general abstract forms make no assump-tions about the nature of phenotypes and the patternsof frequency fluctuations caused by differential fitness.The phenotypes may be squared deviations so that theaverage is actually a variance, or the product of mea-surements on different characters leading to measures ofassociation, or any other nonlinear combination of mea-surements. Thus, there is nothing inherently linear orrestrictive about these expressions.

    Selection versus evolution

    The previous sections discussed the part of evolution-ary change caused by selection. The full Price equation(Eq. (2)) gives a complete and exact expression of totalchange, repeated here as

    ∆z̄ = Cov(w, z)/w̄ + E(w∆z)/w̄ (10)

    or in terms of the dot product notation as

    ∆z̄ = ∆q · z + q′ ·∆z. (11)

    The full change in the phenotype is the sum of the twoterms, which we may express in symbols as

    ∆z̄ = ∆S z̄ + ∆E z̄.

    Fisher [39] called the term ∆E z̄ the change caused by theenvironment [47]. However, the word environment oftenleads to confusion. The proper interpretation is that ∆E z̄encompasses everything not included in the expression forselection. The term is environmental only in the sensethat it includes all those forces external to the particulardefinition of the selective forces for a particular problem.

    The ∆E term is sometimes associated with changes intransmission [7, 10, 15, 48]. This interpretation arisesbecause E(w∆z) is the fitness weighted changes in char-acter value between ancestor and descendant. One maythink of changes in character values as changes duringtransmission.

    It is important to realize that everything truly meansevery possible force that might arise and that is not ac-counted for by the particular expression for selection.

    Lightning may strike. New food sources may appear.The Price equation in its general and abstract form is amathematical identity—what I previously called a math-ematical tautology [7].

    In applications, one considers how to express ∆E z̄,or one searches for ways to formulate the problem sothat ∆E z̄ is zero or approximately zero. This article isnot about particular applications. Here, I simply notethat when one works with Fisher’s breeding value as z,then near equilibria (fixed points), one typically obtains∆z → 0 and thus E(w∆z) → 0. In other cases, thesearch for a good way to express a problem means find-ing a form of character measurement that defines z suchthat characters tend to remain stable over time, so that∆z → 0 and thus E(w∆z) → 0. For applications thatemphasize calculation of complex dynamics rather than amore abstract conceptual analysis of a problem, methodsother than the Price equation often work better.

    ABSTRACT PROPERTIES: RECURSION ANDGROUP SELECTION

    To iterate is human, to recurse, divine [49].

    Essentially all modern discussions of multilevel selectionand group selection derive from Price [4], as developedby Hamilton [9]. Price and Hamilton noted that thePrice equation can be expanded recursively to representnested levels of analysis, for example, individuals livingin groups.

    Start with the basic Price equation as given in Eq. (10).The left side is the total change in average phenotype, z̄.The second term on the right side includes the terms ∆ziin E(w∆z) =

    ∑qiwi∆zi.

    Recall that in defining zi, we specified the meaning ofthe index i to be any sort of labeling of set members, sub-ject to minimal consistency requirements. We may, forexample, label all members of a group by i, and measurezi as some property of the group. If the index i itself rep-resents a set, then we may consider the members of thatset. For example, zij may be the jth member of the ithset, or we may say, the ith group. In the abstract math-ematical expression, there is no need to think of the ithgroup as having any spatial or biological meaning. How-ever, we may consider i as a label for spatially definedgroups if we wish to do so.

    With i defining a group, we may analyze the selectionand evolution of that ith group. The term ∆zi becomesthe average change in the z measure for the ith group,composed of members with values zij . The terms z

    ′ij are

    the average property values of the descendants of the jthentity in the ith group. The descendant entities that de-rive from the ith group do not have to form any sort ofgroup or other meaningful structuring, just as the origi-nal i labeling does not have to refer to group structuringin the ancestors. However, we may if we wish considerdescendants of i as retaining some sense of the ancestralgrouping.

  • 10

    Because zi represents an averaging over the entities j inthe ith group, we are assuming the notational equivalence∆zi = ∆z̄i. From that point of view, for each group i wemay from Eq. (10) express the change in the group meanby thinking of each group as a separate set or population,yielding for each i the expression

    ∆zi = ∆z̄i = Cov(wi, zi)/w̄i + E(wi∆zi)/w̄i.

    We may substitute this expression for each i into theE(w∆z) =

    ∑qiwi∆zi term on the right side of Eq. (10).

    That substitution recursively expands each change inproperty value, ∆zi, to itself be composed of a selectionterm and property value change term. For each group, i,we now have expressions for selection within the group,Cov(wi, zi)/w̄i, and average property value change withinthe group, E(wi∆zi)/w̄i. If we write out the full expres-sion for this last term, we obtain

    E(wi∆zi)/w̄i =∑j

    wij∆zij/w̄i.

    In the term ∆zij , each labeling, j, may itself be a sub-group within the larger grouping represented by i. Therecursive nature of the Price equation allows another ex-pansion to the characters zijk for the kth entity in thejth grouping that is nested in the ith group, and so on.Once again, the indexing for levels i, j, and k do not haveto correspond to any particular structuring, but we maychoose to use a structuring if we wish.

    One could analyze biological problems of group selec-tion without using the Price equation. Because the Priceequation is a mathematical identity, there are alwaysother ways of expressing the same thing. However, inthe 1970s, when group selection was a very confused sub-ject, the Price equation’s recursive nature and Hamilton’sdevelopment provided the foundation for subsequent un-derstanding of the topic. All modern conceptual insightsabout group selection derive from Price’s recursive ex-pansion of his abstract expression of selection.

    HISTORY AND ALTERNATIVE EXPRESSIONSOF SELECTION

    I have emphasized the general and abstract form ofthe Price equation. That abstract form was first pre-sented rather cryptically by Price [4]. In that article,Price described the recursive expansion to analyze groupselection. Apart from the recursive aspect, the more gen-eral abstract properties were hardly mentioned in Price[4] and not developed by others until 1995.

    While I was writing my history of Price’s contributionsto evolutionary genetics [7], I found Price’s unpublishedmanuscript The nature of selection among W. D. Hamil-ton’s papers. Price’s unpublished manuscript gave a verygeneral and abstract scheme for analyzing selection interms of set relations. However, Price did not explic-itly connect the abstract set relation scheme to the Priceequation or to his earlier publications [3, 4].

    I had The nature of selection published posthumouslyas Price [35]. In my own article, I explicitly developedthe general interpretation of the Price equation as theformal abstract expression of the relation between twosets [7].

    Price [3] wrote an earlier article in which he presenteda covariance selection equation that emphasized the con-nection to classical models of population genetics andgene frequency change. That earlier covariance formlacks the abstract set interpretation and generally hasnarrower scope. Preceding Price, Robertson [29] and Li[50] also presented selection equations that are similarto Price’s [3] covariance expression. Robertson’s covari-ance form itself arises from classical quantitative geneticsand the breeder’s equation, ultimately deriving from thefoundations of quantitative genetics established by Fisher[51]. Li’s form presents a covariance type of expressionfor classical population genetic models of gene frequencychange.

    One cannot understand the current literature withouta clear sense of this history. Almost all applications ofthe Price equation to kin and group selection, and toother problems of evolutionary analysis, derive from ei-ther the classical expressions of quantitative genetics [29]or classical expressions of population genetics [50].

    In light of this history, criticisms can be confusing withregard to the ways in which the Price equation is com-monly used. For example, in applications to kin or groupselection, the Price equation mainly serves to package thenotation for the Robertson form of quantitative geneticanalysis or the Li form of population genetic analysis.The Price equation packaging brings no extra assump-tions. In some applications, critics may believe that theparticular analysis lacks enough assumptions to attain adesired level of specificity. One can, of course, easily addmore assumptions, at the expense of reduced generality.

    The following sections briefly describe some alternativeforms of the Price equation and the associated history.That history helps to place criticisms of the Price equa-tion and its applications into clearer light.

    Quantitative genetics and the breeder’s equation

    Fisher [51] established the modern theory of quantita-tive genetics, following the early work of Galton, Pear-son, Weldon, Yule and others. The equations of selectionin quantitative genetics and animal breeding arose fromthat foundation. Many modern applications of the Priceequation to particular problems follow this tradition ofquantitative genetics. A criticism of these Price equationapplications is a criticism of the central approach of evo-lutionary quantitative genetics. Such criticisms may bevalid for certain applications, but they must be evaluatedin the broader context of quantitative genetics theory.This section shows the relation between quantitative ge-netics and a commonly applied form of the Price equation[14].

  • 11

    Evolutionary aspects of quantitative genetics devel-oped from the breeder’s equation

    R = Sh2,

    in which the response to selection, R, equals the selectiondifferential, S, multiplied by the heritability, h2. Theseparation of selection and transmission is the key to thebreeder’s equation and to quantitative genetics theory.

    The covariance term of the Price equation is equivalentto the selection differential, S, when one interprets themeaning of fitness and descendants in a particular way.Suppose that we label each potential parent in the ances-tral population of size N with the index, i. The initialweighting of each parent in the ancestral population isqi = 1/N . Assign to each potential parent a weightingwith respect to breeding contribution, q′i = qiwi, with fit-nesses standardized so that w̄ = 1 and the wi are relativefitnesses.

    With this setup, ancestors are the initial populationof potential parents, each weighted equally, and descen-dants are the same population of parents, weighted bytheir breeding contribution. The character value foreach individual remains unchanged between the ances-tor and descendant labelings. These assumptions leadto ∆z̄∗ = Cov(w, z), the change in the average charac-ter value between the breeding population and the initialpopulation. That difference is defined as S, the selectiondifferential.

    To analyze the fraction of the selection differentialtransmitted to offspring, classical quantitative geneticsfollows Fisher [51] to separate the character value asz = g+�, with a transmissible genetic component, g, anda component that is not transmitted, which we may callthe environmental or unexplained component, �. Follow-ing standard regression theory for this sort of expression,�̄ = 0.

    For a parent with z = g + �, the average charactervalue contribution ascribed to the parent among its de-scendants is z′ = g, following the idea that g representsthe component of the parental character that is trans-mitted to offspring. If we assume that the only fluctu-ations of average character value in offspring are causedby the transmissible component that comes from parents,then the genetic component measured by g is sufficientto explain expected offspring character values. Thus,∆z = z′ − z = −�, and E(w∆z) = −Cov(w, �).

    Substituting into the full Price equation from Eq. (2)and assuming w̄ = 1 so that all fitnesses are normalized

    ∆z̄ = Cov(w, z) + E(w∆z)

    = Cov(w, g) + Cov(w, �)− Cov(w, �)= Cov(w, g). (12)

    The expression ∆z̄ = Cov(w, g) was first emphasized byRobertson [29], and is sometimes called Robertson’s sec-ondary theorem of natural selection. Robertson’s expres-sion summarizes the foundational principles of quantita-tive genetics, as conceived by Fisher [51] and developedover the past century [30, 52, 53].

    It is commonly noted that Robertson’s theorem is re-lated to the classic breeder’s equation. In particular,

    R = ∆z̄ = Cov(w, g) = Cov(w, z)h2 = Sh2,

    where R is the response to selection, S = Cov(w, z) is theselection differential, and h2 = Var(g)/Var(z) is a form ofheritability, a measure of the transmissible genetic com-ponent. Additional details and assumptions can be foundin several articles and texts [10, 14, 54].

    Population genetics and the covariance expression

    Price [3] expressed his original formulation in terms ofgene frequency change and classical population genetics,rather than the abstract set relations that I have empha-sized. At that time, it seems likely that Price alreadyhad the broader, more abstract theory in hand, and waspresenting the population genetics form because of itspotential applications. The article begins

    This is a preliminary communication describ-ing applications to genetical selection of a newmathematical treatment of selection in gen-eral.

    Gene frequency change is the basic eventin biological evolution. The followingequation. . .which gives frequency change un-der selection from one generation to the nextfor a single gene or for any linear function ofany number of genes at any number of loci,holds for any sort of dominance or epistasis,for sexual or asexual reproduction, for ran-dom or nonrandom mating, for diploid, hap-loid or polyploid species, and even for imagi-nary species with more than two sexes. . .

    Using my notation, Price writes the basic covariance form

    ∆P = Cov(w, p)/w̄ = βwpVar(p)/w̄. (13)

    In a simple application, p could be interpreted as genefrequency at a single diploid locus with two alleles. ThenP = p̄ is the gene frequency in the population, and βwp isthe regression of individual fitness on individual gene fre-quency, in which the individual gene frequency is either0, 1/2 or 1 for an individual with 0, 1 or 2 copies of theallele of interest. Li [50] gave an identical gene frequencyexpression in his eqn 4.

    In more general applications, one can study a p-scorethat summarizes the number of copies of various allelespresent in an individual, or in whatever entities are be-ing tracked. In classical population genetics, the p-scorewould be, in Price’s words above, “any linear function ofany number of genes at any number of loci.” Here, lin-earity means that p is essentially a counting of presenceversus absence of various things within the ith entity.Such counting does not preclude nonlinear interactions

  • 12

    between alleles or those things being counted with re-spect to phenotype, which is why Price said that theexpression holds for any form of dominance or epistasis.

    Hamilton [17] used Price’s gene frequency form in hisfirst clear derivations of the direct and the inclusive fit-ness models of kin selection theory. Most early appli-cations of the Price equation used this gene frequencyinterpretation.

    Price [3] emphasized that the value of Eq. (13) arisesfrom its benefits for qualitative reasoning rather than cal-culation. The necessary assumptions can be seen fromthe form given by Price, which is always exact, here writ-ten in my notation

    ∆P = Cov(w, p)/w̄ + E(w∆p)/w̄,

    where ∆p is interpreted as the change in state betweenparental gene frequency for the ith entity and the averagegene frequency for the part of descendants derived fromthe ith entity.

    In practice, ∆p = 0 usually means Mendelian segre-gation, no biased mutation, and no sampling biases as-sociated with drift. Most population genetics theory oftraits such as social behavior typically make those as-sumptions, so that Eq. (13) is sufficient with respect toanalyzing change in gene frequency or in p-scores [55].However, the direction of change in gene frequency or p-score is not sufficient to predict the direction of changein phenotype. To associate the direction of change in p-score to the direction of change in phenotype, one mustmake the assumption that phenotype changes monotoni-cally with p-score. Such monotonicity is a strong assump-tion, which is not always met. For that reason, p-scoremodels sometimes buy simplicity at a rather high cost.In other applications, monotonicity is a reasonable as-sumption, and the p-score models provide a very simpleand powerful approach to understanding the direction ofevolutionary change.

    The costs and benefits of the p-score model are not par-ticular to the Price equation. Any analysis based on thesame assumptions has the same limitations. The Priceequation provides a concise and elegant way to explorethe consequences when certain simplifying assumptionscan reasonably be applied to a particular problem.

    ALTERNATIVE FORMS ORINTERPRETATIONS OF THE FULL EQUATION

    The full Price equation partitions total evolutionarychange into components. Many alternative partitions ex-ist. A partition provides value if it improves conceptualclarity or eases calculation.

    Which partitions are better than others? Better is al-ways partly subjective. What may seem hard for me mayappear easy to you. Nonetheless, it would be a mistake tosuggest that all differences are purely subjective. Someforms are surely better than others for particular prob-lems, even if better remains hard to quantify. As Russell

    [56, p. 14] said in another context, “All such conventionsare equally legitimate, though not all are equally conve-nient.”

    Many partitions of evolutionary change include someaspect of selection and some aspect of property or trans-mission change. Most of those variants arise by minorrearrangements or extensions of the basic Price expres-sion. A few examples follow.

    Contextual analysis

    Heisler and Damuth [57] introduced the phrase con-textual analysis to the evolutionary literature. Contex-tual analysis is a form of path analysis, which partitionscauses by statistical regression models. Path analysis hasbeen used throughout the history of genetics [58]. It is auseful approach whenever one wishes to partition varia-tion with respect to candidate causes. The widely usedmethod of Lande and Arnold [32] to analyze selection isa particular form of path analysis.

    Okasha [15] argued that contextual analysis is an alter-native to the Price equation. To develop a simple exam-ple, let us work with just the selection part of the Priceequation

    w̄∆z̄ = Cov(w, z).

    A path (contextual) analysis refines this expression bypartitioning the causes of fitness with a regression equa-tion. Suppose we express fitness as depending on twopredictors: the focal character that we are studying, z,and another character, y. Then we can write fitness as

    w = βwzz + βwyy + �

    in which the β terms are partial regressions of fitnesson each character, and � is the unexplained residual offitness. Substituting into the Price equation, we get thesort of expression made popular by Lande and Arnold[32]

    w̄∆z̄ = βwzVar(z) + βwyCov(y, z).

    If the partitioning of fitness into causes is done in auseful way, this type of path analysis can provide signifi-cant insight. I based my own studies of natural selectionand social evolution on this approach [10, 24].

    Authors such as Okasha [15] consider the partition-ing of fitness into distinct causes as an alternative tothe Price equation. If one thinks of the character z inCov(w, z) as a complete causal explanation for fitness,then a partition into separate causes y and z does in-deed lead to a different causal understanding of fitness.In that regard, the Price equation and path analysis leadto different causal perspectives.

    One can find articles that use the Price equa-tion and interpret z as a lone cause of fitness [?]see¿okasha06evolution. Thus, if one equates those spe-cific applications with the general notion of the Price

  • 13

    equation, then one can say that path or contextual analy-sis provides a significantly different perspective from thePrice equation. To me, that seems like a socially con-structed notion of logic and mathematics. If someonehas applied an abstract truth in a specific way, and onecan find an alternative method for the same specific ap-plication that seems more appealing, then one can saythat the alternative method is superior to the generalabstract truth.

    The abstract Price equation does not compel one tointerpret z strictly as a single cause explanation. Rather,in the general expression, z should always be interpretedas an abstract placeholder. Path (contextual) analysisfollows as a natural extension of the Price equation, inwhich one makes specific models of fitness expressed byregression. It does not make sense to discuss the Priceequation and path analysis as alternatives.

    Alternative partitions of selection and transmission

    In the standard form of the Price equation, the fitnessterm, w, appears in both components

    w̄∆z̄ = Cov(w, z) + E(w∆z).

    Frank [10, 24] derived an alternative expression

    ∆z̄ =∑

    q′iz′i −∑

    qizi

    =∑

    qi(wi/w̄)z′i −∑

    qizi

    =∑

    qi(wi/w̄)z′i −∑

    qiz′i +∑

    qiz′i −∑

    qizi

    =∑

    qi (wi/w̄ − 1) z′i +∑

    qi(z′i − zi)

    = Cov(w, z′)/w̄ + E(∆z). (14)

    This form sometimes provides an easier method to cal-culate effects. For example, the second term now ex-presses the average change in phenotype between parentand offspring without weighting by fitness effects. A bi-ased mutational process would be easy to calculate withthis expression—one only needs to know about the mu-tation process to calculate the outcome. The new covari-ance term can be partitioned into meaningful componentswith minor assumptions [10, p. 1721], yielding

    Cov(w, z′) = Cov(w, z)βz′z,

    where βz′z is usually interpreted as the offspring-parentregression, which is a type of heritability. Thus, wemay combine selection with the heritability componentof transmission into the covariance term, with the sec-ond term containing only a fitness-independent measureof change during transmission.

    Okasha [15] strongly favored the alternative partitionfor the Price equation in Eq. (14), because it separates allfitness effects in the first term from a pure transmissioninterpretation of the second term. In my view, there

    are costs and benefits for the standard Price equationexpression compared with Eq. (14). One gains by havingboth, and using the particular form that fits a particularproblem.

    For example, the term E(∆z) is useful when one has tocalculate the effects of a biased mutational process thatoperates independently of fitness. Alternatively, supposemost individuals have unbiased transmission, such that∆z = 0, whereas very sick individuals do not reproducebut, if they were to reproduce, would have a very biasedtransmission process. Then E(∆z) differs significantlyfrom zero, because the sick, nonreproducing individualsappear in this term equally with the reproducing popula-tion. However, the actual transmission bias that occursin the population would be zero, E(w∆z) = 0, becauseall reproducing individuals have nonbiased transmission.

    Both the standard Price form and the alternative inEq. (14) can be useful. Different scenarios favor differentways of expressing problems. I cannot understand whyone would adopt an a priori position that unduly limitsone’s perspective.

    Extended set mapping expression

    The Price equation’s power arises from its abstractionof selection in terms of mapping relations between sets[7, 35]. Although the Price equation is widely cited in theliterature, almost no work has developed the set mappingformalism beyond the description given in the initial pub-lications. I know of only one article.

    Kerr and Godfrey-Smith [8] noted that, in the originalPrice formulation, every descendant must derive from oneor more ancestors. There is no natural way for novelentities to appear. In applications, new entities couldarise by immigration from outside the system or, in acultural interpretation, by de novo generation of an ideaor behavior.

    Kerr and Godfrey-Smith [8] present an extended ex-pression to handle unconnected descendants. Their for-mulation depends on making explicit the connectionnumber between each individual ancestor and each in-dividual descendant, rather than using the fitnesses oftypes. Some descendants may have zero connections.

    With an explicit description of connections, an ex-tended Price equation follows. The two core componentsof covariance for selection and expected change for trans-mission occur, plus a new factor to account for novel de-scendants unconnected to ancestors.

    The notation in Kerr and Godfrey-Smith [8] is com-plex, so I do not repeat it here. Instead, I show a simpli-fied version. Suppose that a fraction p of the descendantsare unconnected to ancestors. Then we can write the av-erage trait value among descendants as

    z̄′ = p∑

    αjz∗j + (1− p)

    ∑q′iz′i,

    where z∗j is the phenotype for the jth member of thedescendant population that is unconnected to ancestors,

  • 14

    and αj is the frequency of each unconnected type, with∑αj = 1. Given those definitions, we can proceed with

    the usual Price equation expression

    ∆z̄ = z̄′ − z̄

    = p∑

    αjz∗j + (1− p)

    ∑q′iz′i − (p+ 1− p)

    ∑qizi

    = (1− p)(∑

    q′iz′i −∑

    qizi

    )+ p

    (∑αjz∗j −

    ∑qizi

    ).

    Note that the term weighted by 1−p leads to the standardform of the Price equation, so we can write

    ∆z̄ = (1− p) (Cov(w, z) + E(w∆z)) /w̄ + p(∑

    αjz∗j −

    ∑qizi

    )= (1− p) (Cov(w, z) + E(w∆z)) /w̄ + p (z̄∗ − z̄) .

    In the component weighted by p, no connections existbetween the descendant z∗j and a member of the ances-tral population. Thus, we have no basis to relate thoseterms to fitness, transmission, or property change. Kerrand Godfrey-Smith [8] use an alternative notation thatassociates all entities with their number of connections,including those with zero. The outcome is an extendedset mapping theory for evolutionary change. The mainconcepts and the value of the approach are best explainedby the application presented in the next section.

    Gains and losses in descendants and ancestors

    Fox and Kerr [59] analyze changes in ecosystem func-tion by modifying the method of Kerr and Godfrey-Smith [8]. They measure ecosystem function by sum-ming the functional contribution of each species presentin an ecosystem. To compare ecosystems, they consideran initial site and a second site. When comparing ecosys-tems, the notion of ancestors and descendants may notmake sense. Instead, one appeals to the more general setmapping relations of the Price equation.

    Assume that there is an initial site with total functionT =

    ∑zi, where zi is the function of the ith species. At

    the initial site, there are s different species, thus we mayalso express the total as T = sz̄, where z̄ is the averagefunction per species. At a second site, total function isT ′ =

    ∑z′j , with s

    ′ different species in the summation,and T ′ = s′z̄′. Let the number of species in commonbetween the sites be sc. Thus, the initial site has S =s−sc unique species, and the second site has S′ = s′−scunique species.

    Fox and Kerr [59] write the change in total ecosystemfunction as

    ∆T = T ′ − T = s′z̄′ − sz̄= (s′ − sc)z̄′ − (s− sc)z̄ + sc (z̄′ − z̄)= S′z̄′ − Sz̄ + sc(∆z̄).

    The term S′z̄′ represents the change in function causedthe gain of an average species, in which S′ is the num-ber of newly added species, and z̄′ is the average func-tion per species. Fox & Kerr suggest that a randomly

    added species would be expected to function as an aver-age species, and so interpret this term as the contributionof random species gain. The term Sz̄ is interpreted simi-larly as random species loss with respect to the S uniquespecies in the first ecosystem not present in the secondecosystem.

    Fox & Kerr partition the term sc(∆z̄) into three com-ponents of species function: deviation from the averagefor species gained at the second site, deviation from theaverage for species lost from the first site, and the changesin function for those species in common between sites.

    The point here concerns the approach rather than thetheory of ecosystem function. To analyze changes be-tween two sets, one often benefits by an explicit decom-position of the relations between the two sets. The orig-inal Price equation is one sort of decomposition, basedon tracing the ways in which descendants derive fromand change with respect to ancestors. Fox and Kerr [59]extend the decomposition of change by set mapping toinclude specific components that make sense in the con-text of changes in ecosystem function.

    More work on the mathematics of set mapping anddecomposition would be very valuable. The Price equa-tion and the extensions by Kerr, Godfrey-Smith, and Foxshow the potential for thinking carefully about the ab-stract components of change between sets, and how to ap-ply that abstract understanding to particular problems.

    Other examples

    No clear guidelines determine what constitutes an ex-tension to the Price equation. From a broad perspective,many different partitions of total change have similari-ties, because they separate something like selection fromother forces that alter the similarity between populations.

    For example, the stochastic effects of sampling anddrift create a distribution of descendant phenotypesaround the ancestral mean. In the classical Price formu-lation, there is only the single realization of the actualdescendants. A stochastic version analyzes a collectionof possible descendant sets over some probability distri-bution, and a mapping from the ancestor set to eachpossible realization of the descendant set.

    In other cases, partitions will split components morefinely or add new components not in Price’s formulation.I do not have space to review every partition of totalchange and consider how each may be related to Price’sformulation. I list a few examples here.

    Grafen [60] and Rice [61] developed stochastic ap-proaches. Grafen [27] based a long-term project on in-terpretations and extensions of the Price equation. Pageand Nowak [12] related the Price equation to variousother evolutionary analyses, providing some minor exten-sions. Wolf et al. [62], Bijma and Wade [63], and manyothers developed extended partitions by splitting causeswith regression or similar methods such as path analysis.Various forms of the Price equation have been applied in

  • 15

    economic theory [13].

    DIFFICULTIES WITH VARIOUS CRITIQUES OFTHE PRICE EQUATION

    A reliable way to make people believe in false-hoods is frequent repetition, because familiar-ity is not easily distinguished from truth [64,p. 62].

    One must distinguish the full, exact Price equation fromvarious derived forms used in applications. The derivedforms always make additional assumptions or expressapproximate relations [10]. Each assumption increasesspecificity and reduces generality in relation to particu-lar goals.

    Critiques of the Price equation rarely distinguish thecosts and benefits of particular assumptions in relationto particular goals. I use van Veelen’s recent series ofpapers as a proxy for those critiques. That series repeatssome of the common misunderstandings and adds somenew ones. Nowak recently repeated van Veelen’s critiqueas the basis for his commentary on the Price equation[5, 6, 65–68].

    Dynamic sufficiency

    The Price equation describes the change in some mea-surement, expressed as ∆z̄. Change is calculated withrespect to particular mapping relations between ances-tor and descendant populations. We can think of themappings and the beginning value of z̄ as the initial con-ditions or inputs, and ∆z̄ as the output.

    The output, z̄′ = z̄ + ∆z̄, does not provide enough in-formation to iterate the calculation of change in order toget another value of ∆z̄ starting with z̄′. We would alsoneed the mapping relations between the new descendantpopulation and its subsequent descendants. That infor-mation is not part of the initial input. Thus, we cannotstudy the dynamics of change over time without addi-tional information.

    This limitation with regard to repeated iteration iscalled a lack of dynamic sufficiency [69]. Confusion aboutthe nature of dynamic sufficiency in relation to the Priceequation has been common in the literature. In Frank [7,pp. 378–379], I wrote

    It is not true, however, that dynamic suffi-ciency is a property that can be ascribed tothe Price Equation—this equation is simplya mathematical tautology for the relationshipamong certain quantities of populations. In-stead, dynamic sufficiency is a property ofthe assumptions and information providedin a particular problem, or added by addi-tional assumptions contained within numer-ical techniques such as diffusion analysis or

    applied quantitative genetics. . . . What prob-lems can the Price equation solve that cannotbe solved by other methods? The answer is,of course, none, because the Price Equationis derived from, and is no more than, a set ofnotational conventions. It is a mathematicaltautology.

    I showed how the Price equation helps to define the neces-sary conditions for dynamic sufficiency. Once again, thePrice equation proves valuable for clarifying the abstractstructure of evolutionary analysis.

    Compare my statement to van Veelen et al. [6]

    Dynamic insufficiency is regularly mentionedas a drawback of the Price equation (see forexample Frank, 1995; Rice, 2004). We thinkthat this is not an entirely accurate descrip-tion of the problem. We would like to ar-gue that the perception of dynamic insuffi-ciency is a symptom of the fundamental prob-lem with the Price equation, and not just adrawback of an otherwise fine way to describeevolution. To begin with, it is important torealize that the Price equation itself, by itsvery nature, cannot be dynamically sufficientor insufficient. The Price equation is just anidentity. If we are given a list of numbers thatrepresent a transition from one generation tothe next, then we can fill in those numbers inboth the right and the left hand side of thePrice equation. The fact that it is an identityguarantees that the numbers that appear onboth sides of the equality sign are the same.There is nothing dynamically sufficient or in-sufficient about that (this point is also madeby Gardner et al., 2007, p. 209). A model, onthe other hand, can be dynamically sufficientor insufficient.

    This quote from van Veelen et al. [6] demonstrates aninteresting approach to scholarship. They first cite Frankas stating that dynamic insufficiency is a drawback of thePrice equation. They then disagree with that point ofview, and present as their own interpretation an argu-ment that is nearly identical in concept and phrasing tomy own statement in the very paper that they cited asthe foundation for their disagreement.

    In this case, I think it is important to clarify the con-cepts and history, because influential and widely citedauthors, such as Nowak, are using van Veelen’s articles asthe basis for their own critiques of the Price equation andapproaches to fundamental issues of evolutionary analy-sis.

    With regard to dynamics, any analysis achieves thesame dynamic status given the same underlying assump-tions. The Price equation, when used with the sameunderlying assumptions as population genetics, has thesame attributes of dynamic sufficiency as population ge-netics.

  • 16

    Interpretation of covariance

    van Veelen et al. [6] claim that

    Maybe the most unfortunate thing about thePrice equation is that the term on the righthand side is denoted as a covariance, eventhough it is not. The equation thereby turnsinto something that can easily set us off inthe wrong direction, because it now resemblesequations as they feature in other sciences,where probabilistic models are used that douse actual covariances.

    One can see the covariance expression in the standardform of the Price equation given in Eq. (2). In the Priceequation, the covariance is measured with respect to thetotal population, in other words, it expresses the associa-tion over all members of the population. In many statis-tical applications, one only has data on a subset of thefull population, that subset forming a sample. It is im-portant to distinguish between population measures andsample measures, because they refer to different things.

    Price [4, p. 485] made clear that his equation is abouttotal change in entire populations, so the covariance isinterpreted as a population measure

    [W]e will be concerned with population func-tions and make no use of sample functions,hence we will not observe notational conven-tions for distinguishing population and sam-ple variables and functions.

    In additional to population and sample measures, co-variance also arises in mathematical models of process.Suppose, for example, that I develop a model in whichrandom processes influence fitness and random processesinfluence phenotype. If the random fluctuations in fitnessand the random fluctuations in phenotype are associated,the random variables of fitness and phenotype would co-vary. All of these different interpretations of covarianceare legitimate, they simply reflect different situations.

    DISCUSSION

    In Frank [7], I wrote: “What problems can the Priceequation solve that cannot be solved by other methods?The answer is, of course, none, because the Price Equa-tion is derived from, and is no more than, a set of nota-tional conventions. It is a mathematical tautology.”

    Nowak and Highfield [5] and van Veelen et al. [6] em-phasize the same point in their critique of the Price equa-tion, although they present the argument as a novel in-sight without attribution. Given that the Price equa-tion is a set of notational conventions, it cannot uniquelyspecify any predictions or insights. A particular set of as-sumptions leads to the same predictions, no matter whatnotational conventions one uses. The Price equation is

    a tool that sometimes helps in analysis or in seeing gen-eral connections between apparently disparate ideas. Formany problems, the Price equation provides no value, be-cause it is the wrong tool for the job.

    If the Price equation is just an equivalence, or tautol-ogy, then why am I enthusiastic about it? Mathematicsis, in its essence, about equivalences, as expressed beau-tifully in the epigraph from Mazur. Not all equivalencesare interesting or useful, but some are, just as not allmathematical expressions are interesting or useful, butsome are.

    That leads us to the question of how we might knowwhether the Price equation is truly useful or a mere iden-tity? It is not always easy to say exactly what makes anabstract mathematical equivalence interesting or useful.However, given the controversy over the Price equation,we should try. Because there is no single answer, or evena truly unique and unambiguous question, the problemremains open. I list a few potential factors.

    “[A] good notation has a subtlety and suggestivenesswhich at times make it seem almost like a live teacher”[70, pp. 17–18]. Much of creativity and understandingcomes from seeing previously hidden associations. Thetools and forms of expression that we use play a strongrole in suggesting connections and are inseparable fromcognition [64]. Equivalences and alternative notationsare important.

    The various forms of the covariance component fromthe Price equation given in Eq. (9) show the equivalenceof the statistical, geometrical and informational expres-sions for natural selection. The recursive form of the fullPrice equation provides the foundation for all modernstudies of group selection and multilevel analysis. ThePrice equation helped in discovering those various con-nections, although there are many other ways in whichto derive the same relations.

    Hardy [71] also emphasized the importance of seeingnew connections between apparently disparate ideas:

    We may say, roughly, that a mathematicalidea is ‘significant’ if it can be connected, ina natural and illuminating way, with a largecomplex of other mathematical ideas. Thusa serious mathematical theorem, a theoremwhich connects significant ideas, is likely tolead to important advances in mathematicsitself and even in other sciences.

    What sort of connections? One type concerns the in-variances discovered or illuminated by the Price equation.I discussed some of those invariances in an earlier section,particularly the information theory interpretation of nat-ural selection through the measure of Fisher information[36]. Fisher’s fundamental theorem of natural selectionis a similar sort of invariance [72]. Kin selection theoryderives much of its power by identifying an invariant in-formational quantity sufficient to unify a wide variety ofseemingly disparate processes [24, Chapter 6]. The inter-pretation of kin selection as an informational invariance

  • 17

    has not been fully developed and remains an open prob-lem.

    Invariances provide the foundation of scientific under-standing: “It is only slightly overstating the case to saythat physics is the study of symmetry” [73]. Invarianceand symmetry mean the same thing [38]. Feynman [37]emphasized that invariance is The Character of Physi-cal Law. The commonly observed patterns of probabilitycan be unified by the study of invariance and its associa-tion to measurement [74, 75]. There has been little effortin biology to pursue similar understanding of invarianceand measurement [76, 77].

    Price argued for the great value of abstraction, in thesense of the epigraph from Mazur. In Price [35]

    [D]espite the pervading importance of selec-tion in science and life, there has been noabstraction and generalization from geneticalselection to obtain a general selection theoryand general selection mathematics. Instead,particular selection problems are treated inways appropriate to particular fields of sci-ence. Thus one might say that ‘selection the-ory’ is a theory waiting to be born—muchas communication theory was 50 years ago.Probably the main lack that has been holdingback any development of a general selectiontheory is lack of a clear concept of the generalnature or meaning of ‘selection’.

    This article has been about the Price equation in re-lation to its abstract properties and its connections tovarious topics, such as information or fundamental in-

    variances. Some readers may feel that those aspects ofabstraction and invariance are nice, but far from dailywork in biology. What of the many applications of thePrice equation to kin or group selection? Do those ap-plications hold up? How much value has been added?

    Because the Price equation is a tool, one can alwaysarrive at the same result by other methods. How wellthe Price equation works depends partly on the goal andpartly on the fit of the tool to the problem. There isinevitably a strongly subjective aspect to deciding abouthow well a tool works. Nonetheless, hammers truly aregood for nails and bad for screws. For valuing tools, thereis a certain component that should be open to agreement.For example, the Robertson [29] form of the Price equa-tion is widely regarded as the foundational method foranalyzing models of evolutionary quantitative genetics.However, not all problems in quantitative genetics arebest studied with the Robertson-Price equation. Andnot all problems in social evolution benefit from a Priceequation approach.

    The Price equation or descendant methods have ledto many useful models for kin selection [24]. The mostpowerful follow a path analysis decomposition of causesor use a simple maximization method to analyze easilywhat would otherwise have been difficult. I will returnto those applications in subsequent articles.

    ACKNOWLEDGMENTS

    I thank R. M. Bush and W. J. Ewens for helpful com-ments. My research is supported by National ScienceFoundation grant EF-0822399, National Institute of Gen-eral Medical Sciences MIDAS Program grant U01-GM-76499, and a grant from the James S. McDonnell Foun-dation.

    [1] B. Mazur, “When is one thing equal to some otherthing?” in Proof and Other Dilemmas: Mathematics andPhilosophy, edited by B. Gold and R. A. Simons (Mathe-matical Association of America, Washington, D.C., 2008)pp. 221–241.

    [2] K. E. Boulding, “General systems theory: the skeletonof science,” Management Science 2, 197–208 (1956).

    [3] G. R. Price, “Selection and covariance,” Nature 227,520–521 (1970).

    [4] G. R. Price, “Extension of covariance selection mathe-matics,” Annals of Human Genetics 35, 485–490 (1972).

    [5] M. A. Nowak and R. Highfield, SuperCooperators: Altru-ism, Evolution, and Why We Need Each Other to Succeed(Free Press, New York, 2011).

    [6] M. van Veelen, J. Garćıa, M. Sabelis, and M. Egas,“Group selection and inclusive fitness are not equivalent;the Price equation vs. models and statistics,” Journal ofTheoretical Biology 299, 64–80 (2012).

    [7] S. A. Frank, “George Price’s contributions to evolution-ary genetics,” Journal of Theoretical Biology 175, 373–388 (1995).

    [8] B. Kerr and P. Godfrey-Smith, “Generalization of thePrice equation for evolutionary change,” Evolution 63,531–536 (2009).

    [9] W. D. Hamilton, “Innate social aptitudes of man: anapproach from evolutionary genetics,” in Biosocial An-thropology, edited by R. Fox (Wiley, New York, 1975)pp. 133–155.

    [10] S. A. Frank, “The Price equation, Fisher’s fundamentaltheorem, kin selection, and causal analysis,” Evolution51, 1712–