

PHYSICAL REVIEW D VOLUME 41, NUMBER 12 15 JUNE 1990

Higher-derivative Lagrangians, nonlocality, problems, and solutions

Jonathan Z. Simon
Department of Physics, University of California, Santa Barbara, California 93106

(Received 15 November 1989)

Higher-derivative theories are frequently avoided because of undesirable properties, yet they occur naturally as corrections to general relativity and cosmic strings. We discuss some of their more interesting and disturbing problems, with examples. A natural method of removing all the problems of higher derivatives is reviewed. This method of "perturbative constraints" is required for at least one class of higher-derivative theories: those which are associated with nonlocality. Nonlocality often appears in low-energy theories described by effective actions. The method may also be applied to a wide class of other higher-derivative theories. An example system is solved, exactly and perturbatively, for which the perturbative solutions approximate the exact solutions only when the method of "perturbative constraints" is employed. Ramifications for corrections to general relativity, cosmic strings with rigidity terms, and other higher-derivative theories are explored.

I. INTRODUCTION

Theories with higher derivatives (third derivative or higher in time in the equations of motion, second derivative or higher in the Lagrangian) occur naturally for various reasons in different areas of physics. Quite often the higher-derivative terms are added to a more standard (lower-derivative) theory as a correction. This occurs in general relativity, for instance, where quantum corrections naturally contain higher derivatives of the metric (see, e.g., Birrell and Davies), or where nonlinear $\sigma$ models of string theory predict terms of order $R^2$ and higher (see, e.g., de Alwis). It occurs in the case of cosmic strings where higher-order corrections, dependent on the "rigidity" of the string, contain higher derivatives, and in Dirac's relativistic model of the classical radiating electron. Unlike the case of lower-derivative corrections, however, it is false to assume that adding a higher-derivative correction term with a small coefficient will only perturb the original theory. The presence of an unconstrained higher-derivative term, no matter how small it may naively appear, makes the new theory dramatically different from the original.

Unconstrained higher-derivative theories have very distinctive features. As will be shown below, they have more degrees of freedom than lower-derivative theories, and they lack a lower energy bound. There is nothing mathematically inconsistent with these features, but they make two almost identical-looking theories, one a lower-derivative theory and the other the same theory with a higher-derivative correction added, very different. The lack of a lowest-energy state for the higher-derivative theory is probably the most dramatic change. This always occurs when higher-derivative terms are present (assuming no degeneracy or constraints), independently of how small their coefficients are. The addition of more degrees of freedom might be physically more accurate, but then it means that the original lower-derivative theory was incomplete and missing (the most interesting) new families of solutions. It is particularly disturbing if there is a progression of higher-order, higher-derivative corrections, each system of which has more and more degrees of freedom. Classically, more degrees of freedom means that more initial data are required to specify motion. Quantum mechanically this means that, for a particle, $x$ and $\dot{x}$ now commute since they are freely specifiable, and it becomes possible to measure the position and velocity at the same time. The momentum conjugate to $x$, $\pi_x$, still does not commute with $x$: $[x, \pi_x] = i\hbar$, but $\pi_x \neq m\dot{x}$. From the path-integral point of view, the paths which dominate the functional integral are of a different class: where once they were nowhere differentiable, now they are everywhere once differentiable. Examples of all these types of behavior are presented below. No familiarity with any of the properties of higher-derivative theories is assumed.

There is a large class of theories naturally containing higher derivatives that do not suffer the above problems. Nonlocal theories, where the nonlocality is regulated by a naturally small parameter, have perturbation expansions with higher derivatives. They avoid the above problems because they are constrained systems. They contain implicit constraints which keep the number of degrees of freedom constant and maintain a lower energy bound. Higher-derivative theories that are truncated expansions of a nonlocal theory also avoid these problems, once the proper constraints are imposed. Any theory for which the higher-derivative terms have been added as small corrections can be treated in the same manner, also avoiding the above problems.

Nonlocality naturally appears in effective theories, valid only in a low-energy limit and derived from a larger theory with some degrees of freedom frozen out. A good example is Wheeler-Feynman electrodynamics, in which the degrees of freedom of the electromagnetic field are frozen out. For two particles of mass $m$,

41 3720 1990 The American Physical Society


$$S = -\sum_i mc \int ds \left( \frac{dx_i^{\mu}}{ds} \frac{dx_{i\mu}}{ds} \right)^{1/2} + \frac{e_1 e_2}{c} \int\!\!\int ds\, ds'\, \frac{dx_1^{\mu}}{ds} \frac{dx_{2\mu}}{ds'}\, \delta(\Delta x^{\nu} \Delta x_{\nu}), \tag{1}$$

where $\Delta x^{\mu} = x_1^{\mu} - x_2^{\mu}$. The only degrees of freedom remaining are those of the charged particles. This is nonlocal because the particle-particle interaction is not instantaneous and pointlike, but occurs in retarded time (action at a distance with finite propagation speed). The nonlocal Wheeler-Feynman theory is not valid for large $v/c$ (e.g., particle creation and annihilation is not allowed for), so there is a natural perturbative expansion in powers of $v/c$. Higher derivatives occur directly as a result of the nonlocality. The action can be naturally expanded as

$$S = \int dt \left\{ -\sum_i mc^2 \left( 1 - \frac{v_i^2}{c^2} \right)^{1/2} - e_1 e_2 \sum_{p=0}^{\infty} \frac{(-D_1 D_2)^p}{(2p)!\, c^{2p}} \left[ \left( 1 - \frac{\mathbf{v}_1 \cdot \mathbf{v}_2}{c^2} \right) r^{2p-1} \right] \right\}, \tag{2}$$

where $D_i$ signifies differentiation with respect to $t$ of $x_i$ only, and $r = |\mathbf{x}_1 - \mathbf{x}_2|$. To achieve the same solutions as the original Wheeler-Feynman theory, however, particular constraints must be imposed. Without the constraints, the expansion would have all the problems associated with higher-derivative theories, which are not present in the Wheeler-Feynman theory. The effect of the constraints is to throw away "runaway" solutions. This is accomplished by only allowing solutions that can be Taylor expanded in powers of $c^{-1}$ about $c^{-1} = 0$ (corresponding to infinite propagation speed).

These constraints allow the series expansion to be considered as a legitimate perturbative expansion. Without the constraints, higher order does not correspond to higher powers of $v/c$, but instead all terms contribute equally. With the constraints imposed, each term in the series contributes commensurately less as its order increases. For this reason the constraints will be referred to as "perturbative constraints."

The need for perturbative constraints was first pointed out by Bhabha in the context of Dirac's classical theory of the radiating electron (and its higher-order generalizations), although Dirac realized that runaway solutions should be excluded. The use of perturbative constraints as a method to remove the problems of higher-derivative theories in general was discovered independently by Jaen, Llosa, and Molina (JLM) and Eliezer and Woodard (EW). An explicit method for finding the perturbative constraints for any system expanded in a higher-derivative series about some small expansion parameter was found also by JLM (Ref. 9). Given the perturbative constraints, a method of implementing them in a canonical fashion which greatly simplifies the calculation was found by EW (Ref. 10). Perturbative constraints can be implemented for either infinite or finite series expansions, though for the infinite case the perturbative constraints are already implicitly present if it is demanded that the equations of motion converge. For finite series expansions, where convergence of the series is not an issue, the perturbative constraints play an extremely important role. The finite series expansion, with perturbative constraints imposed, describes a system with the same solutions as those of the full nonlocal series (up to the appropriate order). The finite series expansion without the perturbative constraints describes a system with solutions most of which are nothing like the solutions to the original nonlocal system.

Finding the perturbative constraints does not depend on knowledge of the full nonlocal theory. It can be done just as easily if only a finite number of perturbative terms are known. For this reason it can equally be applied to any higher-order system (with a small expansion coefficient) without knowing whether or not the theory is part of an infinite expansion. This is where the application of perturbative constraints is most powerful and most underutilized.

In general relativity typical corrections take the form of curvature-squared terms in the Lagrangian. Even for small coefficients these terms can easily dominate the evolution of the system (as in Starobinsky inflation). Applying the appropriate perturbative constraints describes a (different) system, in which the number of degrees of freedom is the same as in Einstein gravity, and which has no runaway solutions or ghostlike particles. It is a perturbative correction to Einstein gravity, which we know to be a very good approximation of nature.

Applying the perturbative constraints is not just an ad hoc procedure. It is completely natural and necessary in cases where the higher-derivative theory is a truncated perturbative expansion of some larger, nonlocal (but otherwise well-behaved) theory. The nonlocal theory itself may be the low-energy effective limit of some even larger theory for which fields have been integrated out. It will be shown that the case of cosmic strings with higher-derivative "rigidity" terms falls in this category.

There is sometimes a small cost to the use of perturbative constraints with higher-derivative theories. Even for finite series expansions, locality can be lost under the influence of explicitly time-dependent sources. This is well known in the case of the self-interacting electron, where the nonlocal phenomenon of preacceleration (acceleration in response to a force that has yet to be applied) occurs. An example is demonstrated below. In the case of the electron, this is considered unimportant, since it takes place only on the scale of the time light travels across the classical electron radius. At any rate, acausality only arises at scales for which the approximation of the electron as a classical particle breaks down. If a similar effect were to occur in a theory of gravity the nonlocality would be at the Planck scale. Most physicists, though, would agree that at the Planck scale the usual notions of geometry probably break down (e.g., the appearance of spacetime foam), and so the possible presence of nonlocality (and the accompanying loss of causality) should not be worrisome.
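The textbook nonrelativistic Abraham-Lorentz equation gives a compact illustration of preacceleration. The sketch below assumes the standard form $a - \tau\dot{a} = F(t)/m$ and its runaway-free solution $a(t) = m^{-1}\int_0^\infty e^{-u} F(t + \tau u)\,du$, as found in electrodynamics texts; it is not the example worked later in the paper, and the name `preaccelerated`, the `step` force, and the quadrature settings are illustrative choices.

```python
import math


def preaccelerated(t, force, tau, mass=1.0, horizon=40.0, n=4000):
    """Runaway-free acceleration a(t) = (1/m) * int_0^inf exp(-u) F(t + tau*u) du.

    Assumed model: nonrelativistic Abraham-Lorentz equation a - tau*da/dt = F/m.
    The integral is truncated at u = horizon and evaluated with the midpoint rule.
    """
    h = horizon / n
    total = 0.0
    for k in range(n):
        u = (k + 0.5) * h  # midpoint of the k-th cell
        total += math.exp(-u) * force(t + tau * u)
    return total * h / mass


def step(t):
    """Unit force switched on at t = 0 (hypothetical test source)."""
    return 1.0 if t >= 0.0 else 0.0


if __name__ == "__main__":
    tau = 1.0
    for t in (-2.0, -1.0, 1.0):
        print(t, preaccelerated(t, step, tau))
```

For the unit step force the closed form is $a = (F_0/m)\,e^{t/\tau}$ for $t < 0$ and $a = F_0/m$ for $t > 0$, so the particle is already accelerating during a time of order $\tau$ before the force is applied.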

The structure of the paper is as follows. First is a review of the behavior of unconstrained higher-derivative theories in general, both classical and quantum, with simple examples of all interesting properties. Next is a discussion of various higher-derivative theories that have been studied in the literature, including Dirac's classical electrodynamics, corrections to general relativity, and corrections to cosmic strings. This is followed by a discussion of the higher-derivative theories that do not suffer from the above problems, and how the problems are avoided by the use of perturbative constraints.

Nonlocal systems, when cast into their higher-derivative expansion, demand the use of perturbative constraints to reproduce the results of the original equations of motion. For any finite expansion approximation to the nonlocal theory, perturbative constraints are still required, and they must be imposed explicitly. The same finite expansion without the perturbative constraints would be a very different theory, completely unrelated to the original nonlocal theory in terms of its available solutions. The constraints must also be applied to systems where the purpose of the higher-derivative terms is to provide small corrections to the original theory. Any theory which is intended (by construction or by physical motivation) to provide perturbative corrections to known solutions, but does not do so, is either incorrect or is being applied beyond its domain of applicability. The method of perturbative constraints is the only means by which a theory with higher-derivative corrections can self-consistently avoid these problems.

Finally, the specific effects of applying the perturbative constraints are calculated for the cases of higher-derivative extensions to general relativity and cosmic strings.

II. A REVIEW OF HIGHER-DERIVATIVE THEORIES

(Many of the ideas in this review section are also covered by EW in a particularly lucid presentation. All the equations presented here apply to one-particle, one-dimensional systems, but the generalization is trivial.)

The Lagrangian formalism is straightforwardly applied to higher-derivative theories. For a Lagrangian

$$L = L(q, \dot{q}, \ldots, q^{(N)}), \tag{3}$$

applying the variational principle gives

$$\delta S = \int_{t_{\rm i}}^{t_{\rm f}} dt \left[ \sum_{i=0}^{N} \left( -\frac{d}{dt} \right)^{i} \frac{\partial L}{\partial q^{(i)}} \right] \delta q + \left[ \sum_{i=0}^{N-1} p_{q^{(i)}}\, \delta q^{(i)} \right]_{t_{\rm i}}^{t_{\rm f}}, \tag{4}$$

where the $p_{q^{(i)}}$ are given by

$$p_{q^{(i)}} = \sum_{k=i+1}^{N} \left( -\frac{d}{dt} \right)^{k-i-1} \frac{\partial L}{\partial q^{(k)}}. \tag{5}$$

Assuming that the $\delta q^{(i)}$ are all held fixed at the boundary, the Euler-Lagrange equation is

$$\sum_{i=0}^{N} \left( -\frac{d}{dt} \right)^{i} \frac{\partial L}{\partial q^{(i)}} = 0. \tag{6}$$

The canonical formalism for higher-derivative theories was developed by Ostrogradski. The canonical momenta are defined by (5), which shows the generality of the Hamiltonian-Jacobi formalism. The Hamiltonian, as expected, is given by

$$H = \sum_{n=0}^{N-1} p_{q^{(n)}}\, \dot{q}^{(n)} - L = \sum_{n=0}^{N-1} p_{q^{(n)}}\, q^{(n+1)} - L. \tag{7}$$

It is conserved and generates evolution in time, and so is equal to the energy of the system. Note that $q^{(N)} = q^{(N)}(q, \dot{q}, \ldots, q^{(N-1)}, p_{q^{(N-1)}})$ (assuming no degeneracy), but all the remaining $q^{(n)}$ are independent generalized coordinates and so are not inverted. For this reason, $L = L(q, \dot{q}, \ldots, q^{(N-1)}, p_{q^{(N-1)}})$ as well. The first-order equations of motion are

$$\dot{q}^{(n)} = \frac{\partial H}{\partial p_{q^{(n)}}}, \qquad \dot{p}_{q^{(n)}} = -\frac{\partial H}{\partial q^{(n)}}, \qquad n = 0, 1, \ldots, N-1, \tag{8}$$

which reproduce the Euler-Lagrange equation. Note that it is entirely self-consistent from within this formalism to consider $q$ and all its derivatives up to $N$ completely independent. The dependence is regained from the equations of motion by the first relation of (8), which states

$$\frac{d}{dt} q^{(n)} = q^{(n+1)} \quad \text{for } n = 0, 1, \ldots, N-2. \tag{9}$$

To demonstrate how higher-derivative theories differ from their lower-derivative counterparts, I will use the simple example

$$L = \tfrac{1}{2}(1 + \epsilon^2\omega^2)\dot{x}^2 - \tfrac{1}{2}\omega^2 x^2 - \tfrac{1}{2}\epsilon^2 \ddot{x}^2, \tag{10}$$

which is a simple harmonic oscillator with the mass term slightly modified, and an acceleration-squared piece. It may be helpful to think of $\epsilon\omega \ll 1$, but this is never assumed in our calculations. The kinetic term has been modified only to make the calculations easier; it has no qualitative effect whatsoever, and all quantitative effects are small: $O(\epsilon^2\omega^2)$. This contrasts strongly with the effects of the last term. It is tempting at first to view the last term as a small correction, but we shall see that this is false, independently of how small $\epsilon$ is. (The analogous example in scalar field theory has been examined by Hawking.)

That the number of degrees of freedom of a higher-derivative theory is more than the lower-derivative theory can be seen by examining Eqs. (4) and (8). For the unconstrained system, there are $2N$ constants that determine the motion, corresponding to the $2N$ initial and final $q^{(n)}$'s, or to the $N$ initial (or final) $q^{(n)}$'s and the $N$ initial (or final) $p_{q^{(n)}}$'s. This is a major qualitative difference from the lower-derivative theory, which needs only two constants to specify the motion. This is also reflected in the quantum theory. The wave function has $N$ arguments, and the commutation relations reflect the Ostrogradski canonical structure:

$$[q^{(n)}, p_{q^{(m)}}] = i\hbar\,\delta_{nm}, \qquad [q^{(n)}, q^{(m)}] = [p_{q^{(n)}}, p_{q^{(m)}}] = 0. \tag{11}$$

The second of these equations looks especially odd: the position and velocity of a particle commute. The wave function of the system will typically be a function of all the $q^{(n)}$, although one may, of course, Fourier transform any of the generalized coordinates and obtain it in terms of any of the conjugate momenta in their place.

Next we examine how these properties are exhibited in the example. The equation of motion is

$$\epsilon^2 D^4 x + (1 + \epsilon^2\omega^2) D^2 x + \omega^2 x = 0, \quad \text{where } D \equiv \frac{d}{dt}. \tag{12}$$

Being fourth order in time, it requires twice as many initial conditions as the $\epsilon = 0$ case, independent of the size of $\epsilon$. This is also reflected in the Ostrogradski canonical formalism, where the independent generalized coordinates are $x$ and $\dot{x}$, and their respective generalized momenta are

$$\pi_x = \frac{\partial L}{\partial \dot{x}} - D \frac{\partial L}{\partial \ddot{x}} = (1 + \epsilon^2\omega^2)\dot{x} + \epsilon^2 \dddot{x}, \qquad \pi_{\dot{x}} = \frac{\partial L}{\partial \ddot{x}} = -\epsilon^2 \ddot{x}. \tag{13}$$

The Hamiltonian is

$$H = \tfrac{1}{2}\left[ 2\pi_x \dot{x} - \epsilon^{-2}\pi_{\dot{x}}^2 - (1 + \epsilon^2\omega^2)\dot{x}^2 + \omega^2 x^2 \right]. \tag{14}$$

Note the impossibility of taking the $\epsilon \to 0$ limit in this case. The general solution is

$$x = A_+ \cos(\omega t + \phi_+) + A_- \cos(\epsilon^{-1} t + \phi_-). \tag{15}$$

For $\epsilon\omega \ll 1$, the second mode oscillates extremely rapidly. The modes separate exactly because the Lagrangian is quadratic in all terms; nonquadratic terms would couple the modes. The oscillatory nature of the second term is not related to the fact that the $\epsilon = 0$ case is a simple harmonic oscillator: in the case $\omega = 0$, the solution is $x = x_0 + v_0 t + A_- \cos(\epsilon^{-1} t + \phi_-)$. In the case $\epsilon^2 < 0$, the solution is

$$x = A_+ \cos(\omega t + \phi_+) + A_1 \cosh(|\epsilon|^{-1} t) + A_2 \sinh(|\epsilon|^{-1} t). \tag{15a}$$

Quantum mechanically, since $x$ and $\dot{x}$ are independent coordinates, the wave function will be a function of both: $\psi = \psi(x, \dot{x})$ [though we could also use $\psi = \psi(x, \pi_{\dot{x}})$, $\psi = \psi(\pi_x, \dot{x})$, or $\psi = \psi(\pi_x, \pi_{\dot{x}})$]. Note that $[x, \dot{x}] = 0$, allowing the position and velocity to be measured in the same experiment to arbitrary accuracy. This is also independent of the size of $\epsilon$, so long as it is nonvanishing.

The quantum-mechanical system is solved exactly in Appendix A. The energy eigenstates are labeled by two non-negative integers:

$$E = (n + \tfrac{1}{2})\,\omega - (m + \tfrac{1}{2})\,\epsilon^{-1} \quad \text{for } n, m = 0, 1, 2, \ldots. \tag{16}$$
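A short numerical sketch of the statement that Eq. (12) needs four initial data rather than two. It assumes nothing beyond Eqs. (12) and (15); the helper names `simulate` and `initial_data` and the fixed-step RK4 integrator are illustrative choices, not part of the paper.

```python
import math


def simulate(eps, omega, x0, v0, a0, j0, t_end, dt=1e-3):
    """Integrate eps^2 x'''' + (1 + eps^2 omega^2) x'' + omega^2 x = 0 with RK4.

    The state is (x, x', x'', x'''): all four values are needed to start.
    Returns the state at t_end.
    """
    def rhs(state):
        x, v, a, j = state
        snap = -((1.0 + eps**2 * omega**2) * a + omega**2 * x) / eps**2
        return (v, a, j, snap)

    def shifted(state, k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))

    state = (x0, v0, a0, j0)
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(state)
        k2 = rhs(shifted(state, k1, dt / 2))
        k3 = rhs(shifted(state, k2, dt / 2))
        k4 = rhs(shifted(state, k3, dt))
        state = tuple(
            s + dt / 6 * (c1 + 2 * c2 + 2 * c3 + c4)
            for s, c1, c2, c3, c4 in zip(state, k1, k2, k3, k4)
        )
    return state


def initial_data(eps, omega, a_plus, a_minus):
    """(x, x', x'', x''') at t = 0 for x = A+ cos(omega t) + A- cos(t / eps)."""
    return (a_plus + a_minus, 0.0,
            -a_plus * omega**2 - a_minus / eps**2, 0.0)


if __name__ == "__main__":
    eps, omega = 0.1, 1.0
    slow_only = simulate(eps, omega, *initial_data(eps, omega, 1.01, 0.0), t_end=2.0)
    with_fast = simulate(eps, omega, *initial_data(eps, omega, 1.0, 0.01), t_end=2.0)
    print(slow_only[0], with_fast[0])
```

The two runs in the usage example share the same $x(0)$ and $\dot{x}(0)$ and differ only in $\ddot{x}(0)$, yet they follow different trajectories, which is the classical content of the extra degrees of freedom.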

The simplest wave function to calculate is

[Eq. (17): the wave function of the $n = m = 0$ state, a Gaussian in $x$ and $\dot{x}$; the printed expression is not recoverable from this copy.]

As expected, the limit $\epsilon \to 0$ does not approach the purely simple harmonic-oscillator ground-state wave function.
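One way to see how the limit fails is to look for a Gaussian eigenfunction of the Hamiltonian (14) directly. The following is a sketch under stated assumptions: units with $\hbar = 1$, the representation $\pi_x \to -i\partial_x$, $\pi_{\dot{x}} \to -i\partial_{\dot{x}}$ acting on $\psi(x, \dot{x})$, and $0 < \epsilon\omega < 1$. The coefficients $a$, $b$, $c$ are introduced here for illustration and need not match the normalization or form used in the paper.

```latex
\begin{align*}
\psi(x,\dot{x}) &\propto \exp\!\left[-\tfrac{1}{2}\left(a x^{2} + 2 b\, x\dot{x} + c\, \dot{x}^{2}\right)\right], \\
H\psi &= \left[-i\dot{x}\,\partial_x + \frac{1}{2\epsilon^{2}}\,\partial_{\dot{x}}^{2}
        - \tfrac{1}{2}(1+\epsilon^{2}\omega^{2})\,\dot{x}^{2} + \tfrac{1}{2}\omega^{2}x^{2}\right]\psi = E\psi, \\
x^{2}:&\quad \frac{b^{2}}{2\epsilon^{2}} + \frac{\omega^{2}}{2} = 0, \\
x\dot{x}:&\quad i a + \frac{b c}{\epsilon^{2}} = 0, \\
\dot{x}^{2}:&\quad i b + \frac{c^{2}}{2\epsilon^{2}} - \tfrac{1}{2}(1+\epsilon^{2}\omega^{2}) = 0, \\
1:&\quad -\frac{c}{2\epsilon^{2}} = E, \\
\Rightarrow\quad & b = -i\epsilon\omega,\quad c = \epsilon(1-\epsilon\omega),\quad
a = \omega(1-\epsilon\omega),\quad E = \tfrac{1}{2}\omega - \tfrac{1}{2}\epsilon^{-1}.
\end{align*}
```

This is the only sign choice for which both $a$ and $c$ are positive, and its energy agrees with the $n = m = 0$ level of Eq. (16). As $\epsilon \to 0$ the coefficient $c$ vanishes, so the spread in $\dot{x}$ grows without bound while $E \to -\infty$; within this sketch the state has no normalizable lower-derivative limit.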

Strongly related to the fact that $[x, \dot{x}] = 0$ is that the class of paths that dominate the Feynman path integral changes. The path integral sums over all possible paths, but a particular class of paths dominates the sum, which can be seen by examining the expectation values in transition amplitudes. First we examine the properties of these paths for a lower-derivative theory. For a path integral skeletonized into time slices of duration $\delta$, the expectation value of the distance crossed in that time is approximately

So, as $\delta \to 0$, the typical paths (averaged with a complex weighting) are continuous. But the expectation value of the particle's velocity diverges:
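For orientation, the standard estimates for a lower-derivative Lagrangian $L = \tfrac{1}{2}m\dot{x}^2 - V(x)$ can be written as follows. They are quoted from the usual textbook treatment of the path integral (Feynman and Hibbs) in general units, as an assumption about the content of the two estimates referred to above; the paper's own normalization may differ.

```latex
\begin{align*}
\langle (\Delta x)^{2} \rangle &\sim \frac{i\hbar\,\delta}{m} \;\longrightarrow\; 0
  \quad (\delta \to 0), \\
\left\langle \left(\frac{\Delta x}{\delta}\right)^{2} \right\rangle &\sim \frac{i\hbar}{m\,\delta} \;\longrightarrow\; \infty
  \quad (\delta \to 0).
\end{align*}
```

The typical step therefore scales as $|\Delta x| \sim \delta^{1/2}$: the paths are continuous but have no well-defined velocity.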

For a higher-derivative theory, this is not true. The typical paths for acceleration-dependent Lagrangians have finite velocities, but their acceleration diverges; i.e., the paths are continuous in $(x, \dot{x})$ space. The higher the derivatives in the Lagrangian, the smoother the paths become. An infinite number of higher derivatives would have, in some sense, only perfectly smooth paths contributing. (In fact, because the path-integral formulation can be used to derive Schrödinger's equation, one can read off expectation values from the Hamiltonian, as done by Feynman, and as shown in Appendix B.)

To illustrate the path-integral properties of higher-derivative Lagrangians, we will use the simpler case $\omega = 0$: a free particle with a (seemingly small) quadratic acceleration term (the $\omega \neq 0$ case is conceptually no more difficult but requires enormously more calculation):

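$$L = \tfrac{1}{2}\left( \dot{x}^2 - \epsilon^2 \ddot{x}^2 \right), \qquad H = \tfrac{1}{2}\left( -\epsilon^{-2}\pi_{\dot{x}}^2 + 2\pi_x \dot{x} - \dot{x}^2 \right). \tag{20}$$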

We calculate the transition expectations


[Eq. (21): transition expectation values of $\Delta x$, $\Delta\dot{x}$, $\dot{x}$, and $\ddot{x}$ as $\delta \to 0$; the printed expressions are not recoverable from this copy.]

The dominant paths are now once differentiable. The exact propagator has also been calculated for this system using modes, as shown in Appendix C.

Another extremely important property of higher-derivative theories, both classical and quantum, is the lack of any lower energy bound. This can be seen most easily through (7). The only dependence on the $p_{q^{(n)}}$ for $n < N-1$ is linear, permitting the Hamiltonian to take on arbitrarily negative values. This carries over into the quantized system as well. This property is easily demonstrated by our example system: the Hamiltonian is manifestly indefinite in Eq. (A2). The energy for the general solution given in Eq. (15) is

$$E = \tfrac{1}{2}(1 - \epsilon^2\omega^2)\left( \omega^2 A_+^2 - \epsilon^{-2} A_-^2 \right), \tag{22}$$

which is also manifestly indefinite. Even a small amplitude for the negative mode leads to enormously negative energies (for $\epsilon\omega \ll 1$). Even though exciting the negative-energy modes leads only to oscillatory behavior (for the $\epsilon^2 > 0$ case), it is nevertheless unstable since even small excitations of those modes lower the energy dramatically. Any coupling present in a not purely quadratic Lagrangian system would make the problem worse. The quantized system has the same negative-energy problems, as seen in (16) and Appendix A. Attempts have been made within quantum mechanics to change the minus sign in (16) into a plus by giving half of the quantum states negative norm. This merely shifts the problem from the lack of a ground state to the lack of unitarity (arising from the now possible zero-norm modes), but it is really the same problem transformed.

Higher-derivative field theories have the related problem of ghosts: excitations (particles) of negative energy (mass) (see, e.g., Hawking). They behave analogously to the oscillatory excitations of negative-energy states in our example. Creation of ghost particles not only costs no energy, it produces excess energy, causing them to be spontaneously produced in infinite numbers.

In short, the distinct features of higher-derivative theories fall into two major categories, either deriving from the more numerous degrees of freedom than the lower-derivative case, or from the loss of a lowest-energy state. It should be noted that there is nothing fundamentally contradictory or mathematically inconsistent with higher-derivative systems. A good example of this kind of theory is the pure $R^2$ theory of Horowitz (although there are still problems with the negative-energy modes, as pointed out by Eliezer and Woodard).

These features do become serious problems in most cases, however. Except in a purely cosmological context, the lack of a ground state is very unphysical. It is also unphysical when there is a sequence of higher-order theories for which the higher-order terms are supposed to provide small corrections, but instead introduce new degrees of freedom and new behavior at every step. In our example above, the problems become manifest when the system is compared to a simple harmonic oscillator ($\epsilon = 0$). These problems cannot be avoided in unconstrained higher-derivative theories, whether oscillating particles, flexing cosmic strings, or $R^2$ gravity.
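The indefinite energy of the example oscillator can be checked numerically. The sketch below evaluates the Ostrogradski Hamiltonian (14), with the momenta (13), along the general solution (15) and compares it with the closed form (22); the function names are hypothetical and the parameter values in the usage lines are arbitrary.

```python
import math


def ostrogradski_energy(t, eps, omega, a_plus, phi_plus, a_minus, phi_minus):
    """Hamiltonian (14) evaluated on the general solution (15) at time t."""
    fast = 1.0 / eps  # frequency of the extra mode
    cp, sp = math.cos(omega * t + phi_plus), math.sin(omega * t + phi_plus)
    cm, sm = math.cos(fast * t + phi_minus), math.sin(fast * t + phi_minus)
    # x and its first three time derivatives
    x0 = a_plus * cp + a_minus * cm
    x1 = -a_plus * omega * sp - a_minus * fast * sm
    x2 = -a_plus * omega**2 * cp - a_minus * fast**2 * cm
    x3 = a_plus * omega**3 * sp + a_minus * fast**3 * sm
    # canonical momenta, Eq. (13)
    pi_x = (1.0 + eps**2 * omega**2) * x1 + eps**2 * x3
    pi_xdot = -eps**2 * x2
    return 0.5 * (2.0 * pi_x * x1 - pi_xdot**2 / eps**2
                  - (1.0 + eps**2 * omega**2) * x1**2 + omega**2 * x0**2)


def mode_energy(eps, omega, a_plus, a_minus):
    """Closed form, Eq. (22)."""
    return 0.5 * (1.0 - eps**2 * omega**2) * (
        omega**2 * a_plus**2 - a_minus**2 / eps**2)


if __name__ == "__main__":
    for t in (0.0, 0.3, 1.7):
        print(ostrogradski_energy(t, 0.1, 1.0, 1.0, 0.4, 0.05, 1.1))
    print(mode_energy(0.1, 1.0, 1.0, 0.05), mode_energy(0.1, 1.0, 1.0, 0.2))
```

The Hamiltonian comes out time independent, as it should for a conserved energy, and with $\epsilon\omega = 0.1$ an amplitude ratio $A_-/A_+$ above $\epsilon\omega$ already makes the total energy negative.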

III. NATURALLY OCCURRING HIGHER-DERIVATIVE THEORIES

As stated in the Introduction, higher-derivative theories appear naturally in at least two contexts. The first is as corrections to a lower-derivative theory. The oldest example of this is the Abraham-Lorentz model of a nonrelativistic, classical, radiating, charged particle (see, e.g., Jackson) and the relativistic generalization due to Dirac. In attempting to take into account the loss of energy due to radiation, a third-derivative term is introduced into the equation of motion. The higher-derivative term has a small coefficient, $\tau = \tfrac{2}{3}(e^2/mc^3) \sim 10^{-23}$ s, yet there are now twice as many solutions as for the nonradiating electron, and half the solutions are runaways: solutions qualitatively different from solutions of the nonradiating electron. (This is a dissipative system due to the radiation, so the lack of a lower energy bound is not manifest.) As an example, Dirac's equation of motion in the absence of external forces,

$$\dot{v}_{\mu} = \tau\left( \ddot{v}_{\mu} - v_{\mu}\, \dot{v}^{\nu} \dot{v}_{\nu} \right), \quad \text{where } \eta_{\mu\nu} = (-, +, +, +), \tag{23}$$

has the general solution (for motion in the $x$ direction)

$$v_x = \sinh(e^{s/\tau} + b), \qquad v_t = \cosh(e^{s/\tau} + b), \tag{24}$$

where $s$ is proper time and $b$ is an integration constant, or

$$v_{\mu} = \text{const}. \tag{25}$$

For the first solution, the free electron accelerates to near the speed of light in a time comparable to $\tau$. For the second, the electron remains unaccelerated, which is the expected answer for zero external force.
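The runaway solution (24) can be checked against Eq. (23) with a few lines of arithmetic. The sketch below uses the analytic derivatives of (24), works in units where $c = 1$, and treats the components as $(v^t, v^x)$ with signature $(-,+,+,+)$; the function names `dirac_residual` and `speed` are illustrative, not from the paper.

```python
import math


def dirac_residual(s, tau=1.0, b=0.0):
    """Residual of Eq. (23) for the runaway solution (24), plus v.v.

    Returns ((res_t, res_x), norm); both residual components should vanish
    and norm should equal -1 for a properly normalized four-velocity.
    """
    w = math.exp(s / tau) + b          # argument of sinh/cosh in Eq. (24)
    w1 = math.exp(s / tau) / tau       # dw/ds
    w2 = math.exp(s / tau) / tau**2    # d^2 w / ds^2
    v = (math.cosh(w), math.sinh(w))
    v1 = (math.sinh(w) * w1, math.cosh(w) * w1)
    v2 = (math.cosh(w) * w1**2 + math.sinh(w) * w2,
          math.sinh(w) * w1**2 + math.cosh(w) * w2)
    acc_sq = -v1[0]**2 + v1[1]**2      # \dot v^nu \dot v_nu
    norm = -v[0]**2 + v[1]**2          # v^nu v_nu
    res = tuple(v1[k] - tau * (v2[k] - v[k] * acc_sq) for k in range(2))
    return res, norm


def speed(s, tau=1.0, b=0.0):
    """Ordinary velocity v^x / v^t (in units of c) along the runaway solution."""
    return math.tanh(math.exp(s / tau) + b)


if __name__ == "__main__":
    print(dirac_residual(0.5))
    print(speed(0.0), speed(3.0))
```

With $b = 0$ the velocity is already about $0.76\,c$ at $s = 0$ and indistinguishable from $c$ in double precision by $s = 3\tau$, consistent with acceleration to near the speed of light in a time comparable to $\tau$.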

Another example of a naturally occurring higher-derivative theory is the case of cosmic strings. If treated as an unconstrained higher-derivative theory, as is always done in the literature, it suffers from all the above problems. The number of degrees of freedom depends on the order at which the higher-order expansion is stopped. The excitations of the newly available modes carry negative energy, just as in the case of the oscillator above. Note that this is independent of the sign of the coefficient of the higher-derivative term (corresponding to rigidity). It is completely analogous to the acceleration-dependent harmonic-oscillator example above. For $\epsilon^2 < 0$ the negative-energy modes are exponential in time and obviously unstable, but even for $\epsilon^2 > 0$, exciting the negative-energy modes allows arbitrary amounts of energy to be extracted. Even a small kick ($A_- \ll A_+$ in the example) can extract large amounts of energy since the negative energies are inversely proportional to the small parameter, in this case the width of the string. We shall see below that the perturbative constraints must be applied for consistency.

Higher-order corrections to general relativity itself can arise either quantum mechanically or classically. Quantum mechanically, conformal anomalies give rise to an effective action with higher-derivative terms (see, e.g., Birrell and Davies), which can be local ($\propto R^2$) or nonlocal (not expressible in local quantities such as the metric and Riemann tensor). Renormalizability arguments demand the presence of terms $\propto R^2$ and $\propto R_{\mu\nu}R^{\mu\nu}$ (Ref. 13). The effect of these terms is to give additional families of solutions, some of which are "runaways." The negative-energy problem manifests itself when coupled to matter; there is in general no positive-energy theorem (with the exception of special initial conditions for certain higher-derivative terms). There are also, in general, problems with ghost fields and local instabilities due to the presence of negative-energy modes. The concept of a runaway solution on a cosmological scale is somewhat unclear in the case of gravity where, with an ordinary cosmological constant, exponential inflation is a physical solution. Nevertheless, the smaller the coefficient of the higher-derivative terms, the faster the rate of inflation it can induce. These extra solutions are not Taylor expandable in powers of the small coefficient. Lovelock gravities have similar problems, despite the fact that they are, strictly speaking, not higher-derivative theories. Lovelock theories contain higher-order terms in the Lagrangian which are dimensionally extended Euler densities. They allow extra solutions to the field equations, though the number of new solutions is finite, not a continuous family. Nevertheless, some of the new solutions are dramatically different from the original and can be considered runaways.

Classically, string theory gives higher-order (local) corrections to the action in higher powers of curvature and its derivatives, as shown below in (50). If left unconstrained, in addition to all the problems of the preceding paragraph, each theory obtained by truncating the expansion at a given order has a different number of degrees of freedom than the previous one.

IV. HIGHER-DERIVATIVE THEORIES WITHOUT THE PROBLEMS

When higher-derivative theories occur as a result of truncating a perturbative expansion of a nonlocal theory, the usual problems of higher-derivative theories do not occur because the perturbative constraints must be applied. They guarantee that of all possible solutions to the unconstrained higher-derivative equation of motion, only solutions that are Taylor expandable in the small expansion parameter are permitted. All other solutions are considered runaways, solutions that do not exist in the limit of zero expansion parameter. This corresponds to the limit of infinite propagation speed of instantaneous interactions in the Wheeler-Feynman model. The extra solutions have extremely rapid behavior for small expansion coefficients and are always associated with negative-energy behavior. The remaining solutions form a two-parameter family. The first use of the exclusion of runaway solutions was in the context of removing obviously unphysical solutions such as (24) from the Dirac model (and its nonrelativistic variations). It was suggested that runaway solutions be isolated and defined by their late time behavior, that the acceleration should be finite in the far future for finite forces. Imposing future boundary conditions at large scales is undesirable, since, if there is acausality present in the Universe, it is likely to be only at the smallest (i.e., Planck) scales. Furthermore, the finite acceleration criterion does not generalize well to other classical theories, such as general relativity, which has extremely varied cosmological solutions. Bhabha pointed out that for Dirac's theory (and higher-order extensions) all nonrunaway solutions are Taylor expandable in the natural small expansion parameter of the theory, $\tau$. Imposing the perturbative constraints is equivalent to throwing away runaway solutions, but relieves us of the obligation to specify future conditions.

Nonlocal systems, such as the Wheeler-Feynman theory, when cast into their higher-derivative expansion (first done for the Wheeler-Feynman theory by Kerner), demand the use of perturbative constraints to reproduce the results of the original equations of motion. The perturbative constraints are implicit in the Lagrangian by demanding convergence. This will be demonstrated below with a simple model. For any finite expansion approximation to the nonlocal theory, perturbative constraints are still required, but must be imposed explicitly, since they are no longer demanded by convergence of the series. The same finite expansion without the perturbative constraints is a very different theory, completely unrelated to the original nonlocal theory in terms of its available solutions.

It is also perfectly self-consistent and valid to impose perturbative constraints on a higher-derivative system even without the sure knowledge that it is a truncated expansion of some nonlocal theory. This was done by Dirac when he threw away the undesirable runaway solutions. It is not unreasonable to apply it to any higher-derivative theory for which the method is applicable and examine the consequences. It is absolutely necessary if the higher-derivative theory is to resemble at all the original lower-order theory in its behavior.

The JLM procedure for finding the perturbative constraints strictly imposes the condition that all solutions must be Taylor expandable in the perturbative expansion parameter. It is not possible to invert all of the canonical momenta within the limits of Taylor expandability, which signals the presence of a primary constraint. Secondary constraints are obtained by taking time derivatives of the primary constraints. Linear combinations of the constraints can always be put in the form

$$ x^{(i)} - f_i(x,\dot x) = 0, \qquad i = 2, \ldots, N. \tag{26} $$

Add the constraints in this form to the Lagrangian with Lagrange multipliers, and proceed either to the Euler-Lagrange equations or the Hamilton-Dirac equations (whose equivalence for constrained higher-derivative systems has been shown by Pons).

There is no issue of whether or not to quantize on a large phase space and apply weak constraints afterwards, because there is no larger phase space. The perturbative constraints are second-class constraints, and hold strongly. One must use the Dirac procedure, with Dirac brackets instead of Poisson brackets, calculate the Hamiltonian, and use the Hamilton-Dirac equations to quantize the system.

There is an easier method to put the system in canonical form, however, based on the fact that we know what form the final answer must take, due to EW (Ref. 10) and quickly reviewed here. There exists a local, lower-order theory equivalent to the higher-derivative theory plus the constraints. By knowing the energy, reduced to a function of $x$ and $\dot x$ only, $E_r(x,\dot x)$ (once the constraints have been applied), and knowing that the Hamiltonian is equal in value to the energy, the value of $p$, the canonical momentum to $x$, for a reduced version of the same system can be inferred. In order that

$$ \dot x = \{x, H_r(x,p)\} \tag{27} $$

hold true, we must have

$$ p(x,\dot x) = \int_0^{\dot x} \frac{dv}{v}\,\frac{\partial E_r(x,v)}{\partial v} + p(x,0). \tag{28} $$

[There may be some uncertainty in the choice of $p(x,0)$, but this corresponds to the uncertainty in the initial Lagrangian of whether to add total derivatives of the form $\delta L = dF(x(t))/dt$, which one always has the freedom to do. This addition corresponds to making the canonical transformation

$$ x' = x, \qquad p' = p + \partial F/\partial x. ] \tag{29} $$

We can then invert (28) to get $\dot x(x,p)$ and arrive at $H_r(x,p) = E_r(x,\dot x(x,p))$ and $L_r(x,\dot x) = p(x,\dot x)\,\dot x - E_r(x,\dot x)$. Using the new $x$ and $p$, where $\{x,p\} = 1$, is equivalent in every way to using all the $x^{(n)}$, $p_{(n)}$, and constraints, with Dirac brackets.

In quantum cosmology there is not, in general, a special physical quantity that takes the role of time which is so essential to Hamiltonian-based quantum mechanics. In these cases the action is taken to be the fundamental basis and quantization can proceed using the Feynman sum-over-histories approach. The technique is to Euclideanize the action and sum over all paths with a specified boundary condition, which produces a specific quantum state. This does not require any canonical formalism. If there is a special time parameter or symmetry that allows a useful canonical formalism, the EW method can be used for ease of calculation of the action, but is not required.

Let us observe all the above properties and characteristics of a nonlocal theory with a simple model. The model is solved exactly, both classically and quantum mechanically, and can be expanded in an infinite or arbitrarily truncated series in higher derivatives. The system is a one-dimensional harmonic oscillator but for which the potential depends not only on the position at a given time, but at all times past and future. Distant times, however, contribute exponentially weakly. We begin with the equation of motion

$$ 0 = \ddot x + \omega^2 \int_{-\infty}^{\infty} \frac{ds}{2}\, e^{-|s|}\, \tfrac12\left[x(t+\epsilon s) + x(t-\epsilon s)\right]. \tag{30} $$

When $\epsilon\omega \ll 1$ we might expect this system's behavior to be very similar to a simple harmonic oscillator, and indeed this is the case. Using the ansatz $x = A e^{irt}$ we find two roots:

$$ r^2 = \gamma^2, \qquad r^2 = -(\gamma^2 + \epsilon^{-2}), \quad \text{where} \quad \gamma = \epsilon^{-1}\left(-\tfrac12 + \tfrac12\sqrt{1+4\epsilon^2\omega^2}\right)^{1/2} = \omega\left(1 - \tfrac12\epsilon^2\omega^2 + \cdots\right). \tag{31} $$
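As a quick numerical illustration of the two roots, the sketch below assumes that the equation of motion is equivalent to the series form $0 = \ddot x + \omega^2\sum_n(\epsilon D)^{2n}x$, so that the ansatz $x \propto e^{irt}$ gives $\epsilon^2 r^4 + r^2 - \omega^2 = 0$. The function names and the convergence test (the smearing kernel $e^{-|s|}$ must beat any exponential growth of $x(t+\epsilon s)$) are illustrative assumptions.

```python
import math

def roots_r2(eps, omega):
    """Both roots r^2 of eps^2 r^4 + r^2 - omega^2 = 0."""
    disc = math.sqrt(1.0 + 4.0 * eps**2 * omega**2)
    r2_osc = (-1.0 + disc) / (2.0 * eps**2)   # = gamma^2 > 0, oscillatory
    r2_run = (-1.0 - disc) / (2.0 * eps**2)   # = -(gamma^2 + eps^-2) < 0
    return r2_osc, r2_run

def kernel_converges(r2, eps):
    """True if the integral of exp(-|s|) * x(t + eps*s) converges for x ~ exp(i r t).

    For r^2 >= 0 the integrand is bounded by exp(-|s|). For r^2 < 0 the
    solution grows like exp(kappa*eps*|s|) in one direction, with
    kappa = sqrt(-r^2), so convergence requires kappa*eps < 1.
    """
    if r2 >= 0.0:
        return True
    return math.sqrt(-r2) * eps < 1.0
```

For small $\epsilon\omega$ the oscillatory root stays near $\omega^2$, while the other root scales as $-\epsilon^{-2}$ and always fails the convergence test.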

The second root, when reinserted into the equation of motion, fails to converge, and so is not a solution. The remaining root corresponds to the solution we expect: harmonic motion with frequency close to the original, $\gamma = \omega[1 + O(\epsilon^2\omega^2)]$. The general real solution is

$$ x = A\cos(\gamma t + \phi). \tag{32} $$

To put the system into a Lagrangian form, we expand out the equation of motion into an infinite series

$$ 0 = \ddot x + \omega^2 \sum_{n=0}^{\infty} (\epsilon D)^{2n} x, \tag{33} $$

and we can construct a Lagrangian that will give us this:

$$ 0 = \sum_{n=0}^{\infty} (-D)^n \frac{\partial L}{\partial (D^n x)}, $$

$$ L = \tfrac12\dot x^2 - \tfrac12\omega^2 x^2 + \frac{\omega^2}{2}\sum_{n=0}^{\infty}\sum_{m=0}^{\infty}\left[(\epsilon D)^{2n+1}x\,(\epsilon D)^{2m+1}x + (\epsilon D)^{2n+2}x\,(\epsilon D)^{2m+2}x\right] $$
$$ \phantom{L} = \tfrac12\dot x^2 - \tfrac12\omega^2 x^2 + \frac{\omega^2}{2}\left\{\left[x - \int_{-\infty}^{\infty}\frac{ds}{2}\,e^{-|s|}\,x(t+\epsilon s)\right]^2 + \left[\int_{-\infty}^{\infty}\frac{ds}{2}\,e^{-|s|}\,\epsilon\dot x(t+\epsilon s)\right]^2\right\}. \tag{34} $$

There are, of course, other Lagrangians that give us the same equations of motion. For instance, we can always add the total derivative $(d/dt)f(x)$ without changing the classical equation of motion. When adding total derivatives with higher derivatives, however, one must exercise caution, or the variational principle necessary for quantization will be lost. Details of the need for a valid variational formulation are discussed in Appendix D.

Since there is only one true degree of freedom (i.e., the evolution is specified completely by $x_i$ and $\dot x_i$), not infinitely many as implied by the infinite expansion, there must be an infinite number of constraints implicit in the expansion Lagrangian. They must express all higher derivatives in terms of $x$ and $\dot x$, and can be put in the form of (26). By inspection of the known solution (32) they must be

$$ D^{2n+2}x = (-1)^{n+1}\gamma^{2n+2}x, \qquad D^{2n+3}x = (-1)^{n+1}\gamma^{2n+2}\dot x, \qquad n = 0, 1, \ldots, \tag{35} $$

or some linear combination. The JLM procedure is unnecessary here because we have the general solution (32), and finding constraints to enforce it can be done by inspection. These constraints may be put explicitly into the Lagrangian with Lagrange multipliers. It is not necessary, since the convergence of the series enforces the constraints implicitly, but it is helpful to acknowledge them explicitly as well.

The energy is calculated using the Ostrogradski Hamiltonian:

$$ E = \sum_{n=1}^{\infty}\sum_{m=0}^{n-1} (D^{n-m}x)\,(-D)^m\frac{\partial L}{\partial(D^n x)} - L $$
$$ \phantom{E} = \tfrac12\dot x^2 + \omega^2 x\int_{-\infty}^{+\infty}\frac{ds}{2}\,e^{-|s|}\,x(t+\epsilon s) - \frac{\omega^2}{2}\int_0^{\infty} ds\,e^{-s}\,x(t+\epsilon s)\int_0^{\infty} ds'\,e^{-s'}\,x(t-\epsilon s'). \tag{36} $$
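The conservation of this energy on the physical solutions can be checked numerically. The sketch below assumes the nonlocal form $E = \tfrac12\dot x^2 + \omega^2 x\,\bar x - \tfrac12\omega^2 u_+u_-$, where $u_\pm = \int_0^\infty ds\,e^{-s}x(t\pm\epsilon s)$ and $\bar x = \tfrac12(u_+ + u_-)$ is the two-sided smeared position, evaluated on $x = A\cos(\gamma t+\phi)$. The Simpson quadrature, the cutoff, and all names are illustrative choices, not taken from the original.

```python
import math

def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return total * h / 3.0

def gamma_of(eps, omega):
    """Physical frequency: positive root of eps^2 g^4 + g^2 - omega^2 = 0."""
    return math.sqrt((-1.0 + math.sqrt(1.0 + 4.0 * eps**2 * omega**2)) / (2.0 * eps**2))

def energy(t, eps, omega, A=1.0, phi=0.3, cutoff=40.0):
    """Nonlocal energy evaluated on x(t) = A cos(gamma t + phi)."""
    g = gamma_of(eps, omega)
    x = lambda u: A * math.cos(g * u + phi)
    xdot = -A * g * math.sin(g * t + phi)
    u_plus = simpson(lambda s: math.exp(-s) * x(t + eps * s), 0.0, cutoff)
    u_minus = simpson(lambda s: math.exp(-s) * x(t - eps * s), 0.0, cutoff)
    smeared = 0.5 * (u_plus + u_minus)  # two-sided integral with weight exp(-|s|)/2
    return 0.5 * xdot**2 + omega**2 * x(t) * smeared - 0.5 * omega**2 * u_plus * u_minus
```

Evaluated at different times, the result should be constant and equal to $\tfrac12\eta\gamma^2A^2$ with $\eta = 2 - \gamma^2/\omega^2$, which reduces to the ordinary oscillator energy as $\epsilon \to 0$.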

Note that $\lim_{\epsilon\to0}E = \tfrac12(\dot x^2 + \omega^2x^2)$, as expected.

The next step is to quantize the system. As usual, finding the ground state of the system (which would not exist if it really were an unconstrained higher-derivative theory) can be done without reference to the canonical formalism or Poisson and/or Dirac brackets. The ground state can be calculated using the Euclidean sum over paths:

$$ \psi_0(x_f) = \int \mathcal{D}x\; e^{-I[x]}, \tag{37} $$

where the Euclidean action $I = -iS$, $t = -i\tau$, the sum is over all paths of finite Euclidean action ending at $x_f$, and $X(\tau) = x(t)$. We may take the final Euclidean time to be $\tau_f = 0$ without loss of generality:

$$ I = \tfrac12\int_{-\infty}^{0} d\tau\left\{\left(\frac{dX}{d\tau}\right)^2 + \omega^2X^2 - \omega^2\left[X - \int_{-\infty}^{+\infty}\frac{ds}{2}\,e^{-|s|}\,X(\tau+i\epsilon s)\right]^2 + \omega^2\left[\int_{-\infty}^{+\infty}\frac{ds}{2}\,e^{-|s|}\,\epsilon\frac{dX}{d\tau}(\tau+i\epsilon s)\right]^2\right\}. \tag{38} $$

The action is still nonlocal in real time, even though the paths are in Euclidean time. The path integral can be done exactly since the action is quadratic in $x$ (Ref. 18). There is only one classical solution with finite Euclidean action: $x_{cl} = x_0e^{\gamma\tau}$. Let $X = x_{cl} + q$. Then

$$ \psi_0(x_0) = \int\mathcal{D}q\; e^{-I[q]}\,e^{-I[x_{cl}]} = \text{const}\times e^{-\eta\gamma x_0^2/2}, \tag{39} $$

where

$$ \eta = 2 - \gamma^2/\omega^2 = 1 + O(\epsilon^2\omega^2), \tag{39a} $$

so

$$ \psi_0(x) \propto e^{-\omega x^2[1+O(\epsilon^2\omega^2)]/2}, \tag{39b} $$

as expected.

This system can be put into canonical form in two ways: expanding the nonlocal integrals into infinite sums, defining the momenta by Eq. (5), applying the (second-class) constraints of Eq. (35) with Lagrange multipliers, and using Dirac brackets, or by the calculationally much simpler EW method discussed above. The reduced relevant quantities given by this method are

$$ E_r = \tfrac12\eta(\dot x^2 + \gamma^2x^2), \qquad p = \eta\dot x, \qquad H_r = \tfrac12(\eta^{-1}p^2 + \eta\gamma^2x^2), \qquad L_r = \tfrac12\eta(\dot x^2 - \gamma^2x^2). \tag{40} $$

The function $p(x,0)$ is determined in this case by demanding $L(x,\dot x) = L_r(x,\dot x)$, which gives $p(x,0) = 0$. Now we know the whole system is canonically equivalent to a simple harmonic oscillator, so, in particular,

$$ H_r\psi_n = \gamma(n+\tfrac12)\psi_n \quad\text{where } n = 0, 1, \ldots, \qquad \psi_0 \propto e^{-\eta\gamma x^2/2}, \tag{41} $$

which agrees with the Euclideanized sum-over-histories calculation of the ground-state wave function.
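The EW reduction can be sketched numerically. The example below assumes the reduced energy $E_r(x,v) = \tfrac12\eta(v^2+\gamma^2x^2)$ with $\eta = 2-\gamma^2/\omega^2$, evaluates the momentum integral $p = \int_0^{\dot x}(dv/v)\,\partial E_r/\partial v + p(x,0)$ by a midpoint rule with $p(x,0)=0$, and compares the result with $p=\eta\dot x$. The helper names, step sizes, and sample values are hypothetical.

```python
import math

def reduced_parameters(eps, omega):
    """Return (gamma, eta) for the nonlocal oscillator, with eta = 2 - gamma^2/omega^2."""
    g2 = (-1.0 + math.sqrt(1.0 + 4.0 * eps**2 * omega**2)) / (2.0 * eps**2)
    return math.sqrt(g2), 2.0 - g2 / omega**2

def momentum_from_energy(E, x, xdot, p0=0.0, n=2000, dv=1e-6):
    """Midpoint-rule estimate of p = int_0^xdot (dv/v) dE/dv + p(x, 0)."""
    h = xdot / n
    total = 0.0
    for k in range(n):
        v = (k + 0.5) * h                                 # midpoints avoid v = 0
        dE = (E(x, v + dv) - E(x, v - dv)) / (2.0 * dv)   # central difference in v
        total += dE / v * h
    return p0 + total

def reduced_hamiltonian(x, p, eta, gamma):
    """H_r = p^2/(2 eta) + eta gamma^2 x^2 / 2."""
    return 0.5 * (p * p / eta + eta * gamma**2 * x * x)
```

By construction the Hamiltonian evaluated at the inferred momentum equals the reduced energy, so the reduced system is an ordinary oscillator of mass $\eta$ and frequency $\gamma$.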

Now suppose we are not given the full theory, but only the first $N$ terms. We may not even know where they came from. But we do know that the zeroth-order term is a good approximation when $\epsilon\omega \ll 1$. Or perhaps we do not have the tools to solve the full theory, but only for the first $N$ terms:

$$ L = \tfrac12\dot x^2 - \tfrac12\omega^2\,x\sum_{n=0}^{N}(\epsilon D)^{2n}x + \text{total derivatives}. \tag{42} $$


The cases $N = 0, 1$ are trivial, and the solutions are the same as the solution to the full nonlocal theory, to order $(\epsilon\omega)^0$ and $(\epsilon\omega)^2$, respectively. If left unconstrained, however, the case $N = 2$ has all the quirks and problems of the acceleration-dependent oscillator above: twice as many solutions as the zeroth-order case, negatively unbounded energy, etc. For $N = 3, 4, \ldots$, the number of solutions continues to increase and all associated problems get worse.

The perturbative constraints are needed explicitly here (they are implicit in the full theory). The new "solutions" to the unconstrained finite series approximations do not converge when put in the equation of motion for the full theory. The appropriate constraints can be obtained by the JLM procedure, which is necessary when the full theory is not known, and gives

$$ \ddot x = -\gamma_N^2\,x, \tag{43} $$

where $\gamma_N$ depends on the expansion order $N$, and $\gamma_N - \gamma = O((\epsilon\omega)^{2N+2})$. Higher-derivative constraints are obtained by differentiating and substituting as necessary. Solving this system, for finite $N$, gives the correct solutions of the full infinite system (to the appropriate order), whether classical or quantum, whether via the Lagrangian or Hamiltonian. Solving the system without the perturbative constraints, while describing a well-defined system, does not approximate the full nonlocal system in any sense.
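The difference between the perturbative root and the extra roots of a truncated expansion can be seen in a few lines. The sketch below assumes the truncated dispersion relation $\gamma^2 = \omega^2\sum_{n=0}^{N}(-\epsilon^2\gamma^2)^n$ obtained from $x \propto e^{i\gamma t}$. The fixed-point iteration started at $\gamma^2=\omega^2$ is an illustrative stand-in for selecting the Taylor-expandable root, not the JLM procedure itself, and all names are hypothetical.

```python
import math

def gamma_exact(eps, omega):
    """Frequency of the full nonlocal theory."""
    return math.sqrt((-1.0 + math.sqrt(1.0 + 4.0 * eps**2 * omega**2)) / (2.0 * eps**2))

def gamma_truncated(N, eps, omega, iters=200):
    """Perturbative root of the order-N truncation, by fixed-point iteration."""
    z = omega**2                      # zeroth-order value of gamma^2
    for _ in range(iters):
        z = omega**2 * sum((-eps**2 * z) ** n for n in range(N + 1))
    return math.sqrt(z)

def n2_roots(eps, omega):
    """Both roots gamma^2 of the N = 2 truncation (a quadratic in gamma^2)."""
    a = eps**4 * omega**2
    b = -(1.0 + eps**2 * omega**2)
    c = omega**2
    disc = math.sqrt(b * b - 4.0 * a * c)
    return (-b - disc) / (2.0 * a), (-b + disc) / (2.0 * a)
```

For $\epsilon\omega = 0.1$ the perturbative root approaches the exact frequency rapidly as $N$ grows, while the second $N=2$ root scales as $1/(\epsilon^4\omega^2)$ and has no limit as $\epsilon \to 0$.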

The above nonlocal oscillator is an example of a perfectly well-behaved system that appears sick when expanded naively in a perturbation series. But when expanded properly, with the knowledge that the only contributing solutions are those close to the zeroth-order solutions, the expansion is useful, perturbative, physical, and agrees with the full theory to the appropriate order.

The one important aspect of higher-derivative and nonlocal systems that does not appear in the above example is the appearance of acausal solutions, i.e., preacceleration types of effects. These appear when the system is coupled to explicitly time-dependent terms. The best known example is Dirac's classical electron. In the case of a nonzero force, the electron experiences a preacceleration on times of the order of $\tau$ (Refs. 21 and 5). For a one-dimensional delta-function impulse the equation of motion (23) in the nonrelativistic limit becomes

$$ \ddot x = \tau\,\dddot x + \kappa\,\delta(t), \tag{44} $$

which has the general solution $\dot x = c_1e^{t/\tau} + c_2$ for $t \neq 0$ with an instantaneous change in $\ddot x$ of $-\kappa/\tau$ across $t = 0$. For $t > 0$, $c_1 = 0$ by requiring finite acceleration in the infinite future. For $t < 0$, $c_2 = 0$ if we desire zero velocity in the far past. The solution is

$$ \dot x = \begin{cases} \kappa\,e^{t/\tau}, & t < 0, \\ \kappa, & t > 0, \end{cases} \tag{45} $$

which has nonzero acceleration before the force is applied, but the time scale it occurs on is $\tau$. In general, noncausality only appears at the scale of the small expansion parameter, and as shown by Wheeler and Feynman for similar theories, the noncausality decreases as the number of particles rises. At any rate, for theories in which the nonlocality appears only as a low-energy effective theory, the theory itself is only an approximation at that scale, and its results at that scale will reflect this. If the theory is truly nonlocal, as the true theory of everything might be (we have no experimental evidence whether or not nature is local near the Planck scale), then the nonlocality will be manifest in the solutions.
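The preaccelerated solution can be written out explicitly. The sketch below assumes the nonrelativistic equation $\ddot x = \tau\,\dddot x + \kappa\,\delta(t)$ with unit mass, and the boundary conditions described above (zero velocity in the far past, finite acceleration in the far future). The function names and sample values are illustrative.

```python
import math

def velocity(t, kappa, tau):
    """Velocity for a delta-function kick of strength kappa at t = 0."""
    return kappa * math.exp(t / tau) if t < 0 else kappa

def acceleration(t, kappa, tau):
    """Acceleration: nonzero only before the kick (preacceleration)."""
    return (kappa / tau) * math.exp(t / tau) if t < 0 else 0.0

def jerk(t, kappa, tau):
    """Third derivative of position away from t = 0."""
    return (kappa / tau**2) * math.exp(t / tau) if t < 0 else 0.0

def ode_residual(t, kappa, tau):
    """Value of x'' - tau*x''' for t != 0, where the source term vanishes."""
    return acceleration(t, kappa, tau) - tau * jerk(t, kappa, tau)
```

The velocity is continuous at $t=0$, the acceleration jumps by $-\kappa/\tau$, and essentially all of the velocity change happens within a few $\tau$ before the kick.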

The JLM procedure fails in the presence of external sources, though a related procedure has been proposed. Note that (45) is not analytic in $\tau$, and so whatever method is needed to remove ill-behaved solutions will most likely not allow a solution of this form. The noncausality may still manifest itself, though probably in a different way. The form of the source term might be restricted due to (nonlocal) back reaction, on the order of the expansion parameter.

It is important to mention and put to rest a common fallacy that the extra degrees of freedom of higher-derivative theories are somehow related to the degrees of freedom frozen out in creating the effective nonlocal theory; for instance, that the higher-derivative degrees of freedom in a cosmic string arise from the lost degrees of freedom of the scalar and gauge fields, or that the higher-derivative degrees of freedom from curvature-squared corrections to general relativity arise from the frozen-out massive string modes. The field degrees of freedom frozen out to create the nonlocal effective theory are gone and cannot be regained. The apparent higher-derivative degrees of freedom, as in the model above, are artifacts from trying to perturbatively expand a nonlocal theory without the necessary perturbative constraints. This is also particularly relevant in the case of cosmic strings, where negative-energy modes are available in the unconstrained higher-derivative case, yet exact traveling wave solutions along a string in the full field theory have only positive energy.

V. EFFECTS ON GENERAL RELATIVITY, COSMIC STRINGS, AND OTHER THEORIES

Under what circumstances should perturbative constraints be applied? They must always be applied in the case where the theory in question is known to be a truncated expansion (with a small coupling constant) of a nonlocal theory. The nonlocal theory may itself be a low-energy limit of some larger local theory. Specific cases of this are Wheeler-Feynman electrodynamics and cosmic strings (as will be shown below). If the expansion is to be perturbative, then the perturbative constraints are the only means of enforcing it. To verify whether the perturbative expansion itself is appropriate, check the behavior of the zeroth-order approximation (e.g., a slowly moving electron for which radiation effects are ignored, or a very straight slow cosmic string). If it is appropriate to approximate the system with only the zeroth-order term, then it is appropriate to use higher-order perturbative corrections, and hence the perturbative constraints. This is certainly the case for Wheeler-Feynman electrodynamics and cosmic strings without too much curvature (i.e., no kinks or cusps). Where the zeroth-order approximate theory is inappropriate (e.g., electrons with speeds near $c$ and intersecting cosmic strings), the perturbative expansion is inappropriate as well. The expansion without perturbative constraints is never appropriate.

Consider the case of cosmic strings. The usual derivation of the string action begins with the full gauge theory

$$ S = \int d^4x\,\sqrt{-g}\;\mathcal{L}(\phi, \partial_\mu\phi, A_\mu, F_{\mu\nu}). \tag{46} $$

Let $\phi = \phi_0(x^\mu)$, $A_\mu = A_{\mu0}(x^\nu)$ be a field configuration that describes a string and let $x^\mu = X^\mu$ be the location in spacetime of the string. We want to write down an effective action based only on the movement of the string, i.e., formed only from functions and operators acting on $X^\mu$. First pick a coordinate system such that two coordinates $\xi^a$ are in the world sheet of the string and two coordinates $\rho^A$ are Gaussian normal coordinates perpendicular to the world sheet,

$$ x^\mu = X^\mu(\xi^a) + \rho^A n^\mu_A, \tag{47} $$

where $n^\mu_A$ are two (arbitrary) unit vectors normal to the world sheet, $a = 1, 2$, and $A = 1, 2$. The action, still exact, now reads

$$ S = \int d^2\xi\,d^2\rho\,\left[\mathcal{L}(\phi_0, A_{\mu0}) + \cdots\right]\times\sqrt{-\gamma}\,\left(1 + \rho^AK_A + \tfrac12\rho^A\rho^BK_AK_B + \cdots\right), \tag{48} $$

where $\gamma_{ab}(\xi)$ is the metric on the world sheet and $K_A(\xi)$ are the traces of the two extrinsic curvatures. We can make an effective, nonlocal theory by integrating out all degrees of freedom off the string world sheet. It is nonlocal because the string has finite thickness, so the energy of a piece of the string propagates not only along the string but also over and around it, but those extra degrees of freedom are now frozen out of the picture. Once we have done this, the lost degrees of freedom cannot be recovered; i.e., we cannot reconstruct the original system from the effective theory. This effective, nonlocal theory is not usually examined per se, but is itself perturbatively expanded, effectively in powers of the string width multiplied by the extrinsic curvatures,

$$ S_{\rm eff} = -\mu\int d^2\xi\,\sqrt{-\gamma}\,\left(1 + \frac{s}{\mu}K^AK_A + \cdots\right), \tag{49} $$

where $\mu$ is the string tension, $s$ is the "rigidity," etc. The zeroth-order term is just the Nambu-Goto action. Higher-order expansions contain higher derivatives via the extrinsic curvatures and their derivatives. If left unconstrained, these higher-derivative terms would have the usual disastrous effect, making the so-called perturbative theory totally different from the full nonlocal theory. Instead, enforcing the perturbative constraints produces solutions consistent with the full gauge field theory, allows the expansion to be truly perturbative, and removes all the problems of extra degrees of freedom and negative energy.

For cosmic strings the perturbative constraints ensure that all higher-order solutions remain close to the solutions of the old, zeroth-order, Nambu-Goto action. Since it has been shown that solutions of the Nambu-Goto action are also solutions when the first-order "rigidity" corrections are present, the appropriate constraint, for a noninteracting string, is that the original equations of motion remain unchanged. The result is that, for an isolated string, the rigidity term has no effect at all on the motion, independently of the sign of its coefficient. The first contributing corrections to the Nambu-Goto action must come from higher-order terms, e.g., torsion or derivatives of extrinsic curvature. As in the case of Dirac's theory, however, the rigidity term will certainly play a role once external forces are considered.

Similar to the cases of nonradiating electrons and cosmic strings, Newtonian gravity and its post-Newtonian corrections can be derived from a perturbation expansion of an effective nonlocal theory ultimately derived from Einstein gravity. One might expect the same phenomenon to occur, since the effects of gravitons have been integrated out. Acceleration-dependent terms do occur in the post-Newtonian and post-post-Newtonian approximations, but only linearly, which is a degenerate case (though perfectly suited to the Ostrogradski canonicalization procedure with second-class constraints). The higher-derivative terms at these low orders arise only from coordinate and gauge choices, but there is little reason to doubt that nondegenerate higher-derivative terms will appear at higher order.

Another important case where perturbative constraints have not been considered, but should be, is the case of gravity as a low-energy limit of string theory. Einstein's equations and the corrections to arbitrarily high orders (in the slope parameter) can be obtained from nonlinear $\sigma$ models (e.g., de Alwis). The first-order corrected gravitational action in $D$ dimensions is

I

S~ J dDx& gR — —(R R""~ V'R}—4 pvpa

+(matter terms)+0(a' )

where $\alpha'$ is the slope parameter. As in the preceding case, to the extent that the zeroth-order theory (Einstein gravity) is a good approximation of nature, higher-order terms produced by this method should only be perturbative corrections; they should not completely alter the dynamics of the system. The perturbative constraints must be applied for consistency.

It is important to note in this case that the large nonlocal theory which Einstein gravity and the constrained higher-order terms approximate well is not string theory itself. It is a nonlocal, low-energy effective theory that is derivable from string theory and is appropriate in cases where Einstein gravity is also a good approximation (though not as good as the nonlocal low-energy effective theory). Neither is appropriate in regions of very high curvature or near singularities. The analog of the intermediate nonlocal theory for the case of electrodynamics would be the Wheeler-Feynman theory, which falls between full field-theoretic electrodynamics and slowly moving, nonradiating, Lorentz-force-law motion. Just as the Wheeler-Feynman theory is not accurate at high energies (comparable to the electron mass), neither general relativity nor general relativity plus string corrections will be accurate at high curvatures.

When the higher-derivative corrections arise from quantum effects, then the above argument does not hold. One may or may not choose to apply the perturbative constraints. But one should be aware that the higher-derivative theory without perturbative constraints is dramatically different from Einstein gravity, while the same higher-derivative theory with perturbative constraints is a true perturbative correction. If there is any reason to believe that the quantum corrections will not radically alter the behavior of the system, then the perturbative constraints must be applied. (The same holds for Lovelock gravities, which, while strictly speaking not higher-derivative theories, have some solutions that are close to Einstein gravity and others that are far from Einstein gravity. One must throw away the dramatically different solutions, i.e., impose the perturbative constraints, if the corrections to Einstein gravity are intended to be small.)

The perturbatively constrained system has qualitatively different properties than the unconstrained higher-derivative theory. The renormalizability gained from the higher derivatives is lost once the constraints are applied. The extra particles (degrees of freedom) present in the unconstrained theory do not exist in the constrained theory, since any solution containing them is nonanalytic in the expansion parameters.

For higher-order terms $\propto R^2$, all vacuum solutions to the Einstein action are still vacuum solutions, and so just as for the case of cosmic strings and the unforced Dirac electron, the perturbative constraint is just the old equation of motion. This has the effect that the new equations of motion ignore the $R^2$ piece entirely (though again, when coupled to matter, this could easily change). Dramatically different solutions, such as those offered in Starobinsky-type inflation, are excluded from the realm of acceptable solutions. For the additional nonlocal, higher-derivative terms arising from quantum corrections, the first-order terms contribute nontrivially even in vacuum, but do not dominate the evolution. In the case of Lovelock gravities, the de Sitter-like and anti-de Sitter-like solutions are disallowed as acceptable spherically symmetric solutions.

VI. SUMMARY

Higher-derivative theories occur in various places throughout theoretical physics, usually as a correction term to a standard, lower-derivative theory. In particular, they arise in the context of theories of gravity and of cosmic strings. Though the higher-derivative theories are mathematically self-consistent, there are distinctive features of unconstrained higher-derivative theories that set them apart from similar lower-derivative theories. There are more degrees of freedom, associated with new solutions called "runaways," qualitatively different from those of a related lower-derivative theory. There is no lower-energy bound. In the case of field theories, these features can cause the problems of "ghost" fields and loss of unitarity.

There is a natural way to constrain many higher-derivative theories and save them from the above problems. It applies in cases where the higher-derivative terms are associated with a small, perturbative, expansion parameter. The method, called the method of perturbative constraints, is to exclude solutions that have no Taylor expansion in that small expansion parameter. The effect is to throw away all the runaway, negative-energy solutions. Without the perturbative constraints, higher-order terms in the expansion contribute as much as the lower-order terms, not commensurately less.

Higher-derivative theories that are expansions of a nonlocal theory require these perturbative constraints to give the same results as the full nonlocal theory. The perturbative constraints are actually present implicitly in the full theory, but they must be included explicitly for any finite expansion. Important examples of this case, where the perturbative constraints should be (but have not been) applied, include higher-order corrections to general relativity from string theory, and to cosmic strings from the original gauge theory from which they arise. Nonlocality is a common feature in low-energy effective theories, and is not at all necessarily present in the full theory from which they are derived.

Higher-derivative theories which are not necessarily a truncated version of an infinite series, but can still be viewed as corrections to a valid lower-derivative theory, can also reap the benefits of the method of perturbative constraints. The constrained theory will resemble the original, lower-order theory in its solutions and number of degrees of freedom, and will have a lower-energy bound, all of which one would hope for in a perturbative expansion. The constraints are necessary if the perturbative higher-derivative corrections are to produce perturbative solutions. The unconstrained version would have all of the associated problems of higher-derivative theories, and the higher-derivative "correction" would completely dominate the behavior of the solutions, complete with negative-energy modes.

ACKNOWLEDGMENTS

The author would like to especially thank the Department of Applied Mathematics and Theoretical Physics at the University of Cambridge, where most of this work was done, for its warm hospitality and for the many important conversations held with its members. Special thanks go to Tanmay Vachaspati and Nico Giulini for many helpful conversations, and to Jim Hartle for his help, guidance, and support. This work was supported in part by NSF Grant No. PHY85-06686.

APPENDIX A

The quantized version of the system presented in Eq. (10) can be solved exactly by algebraic methods. We define the new canonical variables

$$
q_+ = \frac{1}{\omega\sqrt{1-\epsilon^2\omega^2}}\,(\epsilon^2\omega^2\dot{x}-\pi_x), \qquad
p_+ = \frac{\omega}{\sqrt{1-\epsilon^2\omega^2}}\,(x-\pi_{\dot{x}}),
$$

$$
q_- = \frac{\epsilon}{\sqrt{1-\epsilon^2\omega^2}}\,(\pi_x-\dot{x}), \qquad
p_- = \frac{1}{\epsilon\sqrt{1-\epsilon^2\omega^2}}\,(\epsilon^2\omega^2 x-\pi_{\dot{x}}), \tag{A1}
$$

such that the Hamiltonian is in the form of the difference of two harmonic oscillators:

$$
H = \tfrac{1}{2}\,(p_+^2+\omega^2 q_+^2) - \tfrac{1}{2}\,(p_-^2+\epsilon^{-2} q_-^2). \tag{A2}
$$

The energy spectrum is then given by

$$
E = (n+\tfrac{1}{2})\,\omega - (m+\tfrac{1}{2})\,\epsilon^{-1} \quad \text{for } n,m = 0,1,2,\ldots. \tag{A3}
$$

The wave function can then be put in the form

$$
\psi_{nm}(\pi_x,\dot{x}) = \chi_{n,\omega}(q_+)\,\chi_{m,\epsilon^{-1}}(q_-), \tag{A4}
$$

where $\chi_{n',\omega'}$ is the standard simple harmonic-oscillator wave function with energy level $n'$ and frequency $\omega'$, and

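$q_\pm$ are defined above. $\psi_{nm}(x,\dot{x})$ is given by the Fourier transform; e.g.,

$$
\psi_{00}(x,\dot{x}) \propto \int d\pi_x\, e^{i x \pi_x}\,\psi_{00}(\pi_x,\dot{x})
\propto \exp\left\{-\tfrac{1}{2}\left[\omega(1-\epsilon\omega)\,x^2 - 2i\epsilon\omega\,x\dot{x} + \epsilon(1-\epsilon\omega)\,\dot{x}^2\right]\right\}. \tag{A5}
$$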
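The absence of a lowest-energy state can be read off from (A3) directly. The short Python sketch below tabulates the spectrum in units with hbar = 1, as (A3) is written; the function names and the sample values of omega and epsilon are hypothetical and serve only as an illustration.

```python
def energy(n, m, omega, eps):
    """Energy level of (A3): E = (n + 1/2) omega - (m + 1/2) / eps, with hbar = 1."""
    return (n + 0.5) * omega - (m + 0.5) / eps


def lowest_energy(max_m, omega, eps):
    """Smallest energy among the levels with n = 0 and 0 <= m <= max_m."""
    return min(energy(0, m, omega, eps) for m in range(max_m + 1))


if __name__ == "__main__":
    omega, eps = 1.0, 0.1   # sample values, chosen only for illustration
    for max_m in (0, 10, 100):
        # The minimum keeps decreasing as more m levels are admitted,
        # and each step down costs 1/eps, which grows as eps shrinks.
        print(max_m, lowest_energy(max_m, omega, eps))
```

Each excitation of the second oscillator lowers the energy by 1/epsilon, so the smaller the coefficient of the higher-derivative term, the faster the spectrum runs off to negative energies.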

APPENDIX B

The rules for reading off the expectation values from the Hamiltonian are simple:

(q'"') =tiri5Xcoefficient of (p, „,),((q'"') ) =iirtSXcoefftcient of ((p ~„I) ), etc. ,

(B1)

but they only apply when the Hamiltonian and/or Schrödinger formulation is equivalent to the Feynman path-integral formulation of quantum mechanics (for more details on when this is true, see, e.g., Popov).

To calculate meaningful quantities, take the expectation value of classical expressions (since path integrals are semiclassical approximations at the smallest scale). For example, when calculating the expectation value of velocity, take the transition expectation value of

u„-x+O(bx/5) . (B2)

For the example in Eq. (20), this velocity is given exactly by the derivative of (C1).

APPENDIX C

The general solution for the system described by (20) is

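$$
\begin{aligned}
x(t) = \Big\{ & (\dot{x}_i-\dot{x}_f)\sin[\epsilon^{-1}(T-t)]
+ (\epsilon^{-2}t x_i-\epsilon^{-2}t x_f-\epsilon^{-2}T x_i-\dot{x}_i+\dot{x}_f)\sin(\epsilon^{-1}T) \\
& + (\dot{x}_i-\dot{x}_f)\sin(\epsilon^{-1}t)
- \epsilon^{-1}(T\dot{x}_i+x_i-x_f)\cos[\epsilon^{-1}(T-t)] \\
& + \epsilon^{-1}(T\dot{x}_f+x_i-x_f)\cos(\epsilon^{-1}t)
+ \epsilon^{-1}(-t\dot{x}_i-t\dot{x}_f+T\dot{x}_i-x_i-x_f)\cos(\epsilon^{-1}T) \\
& + \epsilon^{-1}(t\dot{x}_i+t\dot{x}_f-T\dot{x}_f+x_i+x_f) \Big\}
\Big/ \Big\{ \epsilon^{-1}\left[-2\cos(\epsilon^{-1}T)-\epsilon^{-1}T\sin(\epsilon^{-1}T)+2\right] \Big\}
\end{aligned} \tag{C1}
$$

for a particle beginning its motion at $x_i$ with velocity $\dot{x}_i$ at time $t=0$ and ending at $x_f$ with velocity $\dot{x}_f$ at time $t=T$. From this we can compute the classical action

$$
S = \frac{\epsilon}{2E}\left[A\,(\dot{x}_i^2+\dot{x}_f^2) - 2B\,\dot{x}_i\dot{x}_f
- 2\epsilon^{-1}C\,(\dot{x}_i+\dot{x}_f)(x_i-x_f) + \epsilon^{-2}D\,(x_i-x_f)^2\right], \tag{C2}
$$

where

$$
\begin{aligned}
A &= -4\epsilon^{-1}T\cos\epsilon^{-1}T + 3\epsilon^{-1}T\cos 2\epsilon^{-1}T + 4\sin\epsilon^{-1}T + (\epsilon^{-2}T^2-2)\sin 2\epsilon^{-1}T + \epsilon^{-1}T, \\
B &= 4\epsilon^{-1}T\cos\epsilon^{-1}T + \epsilon^{-1}T\cos 2\epsilon^{-1}T + 2(\epsilon^{-2}T^2+2)\sin\epsilon^{-1}T - 2\sin 2\epsilon^{-1}T - 5\epsilon^{-1}T, \\
C &= 8\cos\epsilon^{-1}T - 2\cos 2\epsilon^{-1}T + 2\epsilon^{-1}T\sin\epsilon^{-1}T - \epsilon^{-1}T\sin 2\epsilon^{-1}T - 6, \\
D &= \epsilon^{-1}T\cos 2\epsilon^{-1}T + 4\sin\epsilon^{-1}T - 2\sin 2\epsilon^{-1}T - \epsilon^{-1}T, \\
E &= 16\cos\epsilon^{-1}T + (\epsilon^{-2}T^2-4)\cos 2\epsilon^{-1}T + 8\epsilon^{-1}T\sin\epsilon^{-1}T - 4\epsilon^{-1}T\sin 2\epsilon^{-1}T - \epsilon^{-2}T^2 - 12.
\end{aligned} \tag{C3}
$$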
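The boundary-value problem behind (C1) and (C2) can be checked numerically. The following Python sketch is an illustration under a stated assumption, not the paper's own procedure: it takes the Lagrangian of Eq. (20) to be L = x'^2/2 - eps^2 x''^2/2, which is consistent with the sines and cosines of t/eps appearing in (C1). It solves for the coefficients of x(t) = a + b t + c sin(t/eps) + d cos(t/eps) from the four boundary data, then compares the action obtained from endpoint data with a direct quadrature of the Lagrangian. All function names are hypothetical.

```python
import math


def solve_coefficients(xi, vi, xf, vf, T, eps):
    """Coefficients (a, b, c, d) of x(t) = a + b t + c sin(t/eps) + d cos(t/eps)
    with x(0) = xi, x'(0) = vi, x(T) = xf, x'(T) = vf."""
    k = 1.0 / eps
    theta = k * T
    s, c0 = math.sin(theta), math.cos(theta)
    det = 2.0 - 2.0 * c0 - theta * s          # the bracket in the denominator of (C1)
    if abs(det) < 1e-12:
        raise ValueError("boundary-value problem is singular for this T / eps")
    Y = xf - xi - vi * T
    V = vf - vi
    c = (Y * s + eps * V * (c0 - 1.0)) / det
    d = ((c0 - 1.0) * Y - eps * V * (s - theta)) / det
    return xi - d, vi - k * c, c, d


def position(coef, t, eps):
    a, b, c, d = coef
    return a + b * t + c * math.sin(t / eps) + d * math.cos(t / eps)


def velocity(coef, t, eps):
    _, b, c, d = coef
    return b + (c * math.cos(t / eps) - d * math.sin(t / eps)) / eps


def acceleration(coef, t, eps):
    _, _, c, d = coef
    return -(c * math.sin(t / eps) + d * math.cos(t / eps)) / eps ** 2


def action_from_endpoints(coef, xi, vi, xf, vf, T, eps):
    """On-shell action S = (1/2) [x pi_x + x' pi_x']_0^T for the assumed Lagrangian,
    where pi_x = b is conserved and pi_x' = -eps^2 x''."""
    _, b, c, d = coef
    pi_v_final = c * math.sin(T / eps) + d * math.cos(T / eps)
    pi_v_initial = d
    return 0.5 * (b * (xf - xi) + vf * pi_v_final - vi * pi_v_initial)


def action_by_quadrature(coef, T, eps, steps=20000):
    """Composite Simpson's rule for S = integral of (x'^2 - eps^2 x''^2) / 2 dt."""
    def lagrangian(t):
        return 0.5 * velocity(coef, t, eps) ** 2 \
            - 0.5 * eps ** 2 * acceleration(coef, t, eps) ** 2
    h = T / steps                              # steps must be even
    total = lagrangian(0.0) + lagrangian(T)
    for j in range(1, steps):
        total += (4.0 if j % 2 else 2.0) * lagrangian(j * h)
    return total * h / 3.0


if __name__ == "__main__":
    xi, vi, xf, vf, T, eps = 0.0, 1.0, 1.0, 0.5, 1.0, 0.3
    coef = solve_coefficients(xi, vi, xf, vf, T, eps)
    print(action_from_endpoints(coef, xi, vi, xf, vf, T, eps))
    print(action_by_quadrature(coef, T, eps))
```

The two printed numbers should agree to quadrature accuracy. Note that the solver fails when 2 - 2 cos(T/eps) - (T/eps) sin(T/eps) vanishes, which is where the denominator of (C1) has its zeros.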


Because the Lagrangian is quadratic, the quantum transition amplitude given by the path-integral formulation is exactly

$$
K(x_f,\dot{x}_f,t_i+T;\,x_i,\dot{x}_i,t_i) = F(T)\,e^{iS/\hbar}, \tag{C4}
$$

where $F(T)$ can be derived from the classical action alone (see, e.g., Marinov for details):

$$
F(T) = \frac{1}{2\pi i}\left[\frac{-BD-C^2}{E^2}\right]^{1/2}. \tag{C5}
$$

Schrödinger's equation for this system can be obtained directly from (C4) without the use of canonical formalism or the Hamiltonian.

APPENDIX D

One may always add any total derivative to a Lagrangian without affecting the equations of motion, but not every Lagrangian has a valid variational formulation associated with it, and without one, the associated quantum-mechanical wave function will not fold. A simple example of a Lagrangian without a valid variational formulation is

$$
L = \tfrac{1}{2}\,(\dot{x}^2-\omega^2x^2) + a\,\frac{d}{dt}(x\dot{x}) \quad \text{for } a \neq 0,\,-1, \tag{D1}
$$

for which the variational principle is

$$
\delta S = -\int (\ddot{x}+\omega^2x)\,\delta x\,dt + (1+a)\,\dot{x}\,\delta x\,\Big|_i^f + a\,x\,\delta\dot{x}\,\Big|_i^f. \tag{D2}
$$

This tells us to fix four boundary conditions for a second-order equation, which is an overdetermined system. Any Lagrangian lacking a valid variational formulation can regain it by adding a total derivative. In this case, the most obvious total derivative to add is $-a\,(d/dt)(x\dot{x})$. Note that for $a=-1$ there is a valid variational formulation (even though there are also constraints); this is the theory obtained by choosing the canonical momentum as the generalized coordinate. (A valid variational formulation is not needed to put the system into canonical form; it is only necessary for quantum mechanics.)

From Eq. (4) and from calculating the $p_{q^{(n)}}$ for our nonlocal oscillator [expressed as the infinite sum in the second line of (34)], we can see that the model Lagrangian presented above does have a valid variational formulation, once the implied constraints of (35) have been taken into account. The implied constraints tell us that holding all even derivatives fixed on the boundaries holds $x$ fixed, and holding all odd derivatives fixed holds $\dot{x}$ fixed. Because the $p_{q^{(n)}}$ vanish for odd $n$,

$$
\delta S = (\text{equations of motion}) + (\text{some function})\,\delta x\,\Big|_i^f \tag{D3}
$$

once the constraints are used. It is correct and, in fact, necessary to use the constraints to determine whether or not the system is overdetermined, as they are just part of the equations of motion (cf. the case of $a=-1$). In general, adding an arbitrary total derivative to the Lagrangian, if it contains higher derivatives, corresponds to a canonical change in variable, which would destroy the variational formulation.

*Electronic address: JSimon@Voodoo.
¹N. D. Birrell and P. C. W. Davies, Quantum Fields in Curved Space (Cambridge University Press, Cambridge, England, 1982).
²S. P. de Alwis, Phys. Rev. D 34, 3760 (1986).
³T. L. Curtright, G. I. Ghandour, and C. K. Zachos, Phys. Rev. D 34, 3811 (1986).
⁴K. Maeda and N. Turok, Phys. Lett. B 202, 376 (1988); R. Gregory, ibid. 199, 206 (1988).
⁵P. A. M. Dirac, Proc. R. Soc. London A167, 148 (1938).
⁶J. A. Wheeler and R. P. Feynman, Rev. Mod. Phys. 21, 425 (1949).
⁷E. J. Kerner, J. Math. Phys. 3, 35 (1962).
⁸H. J. Bhabha, Phys. Rev. 70, 759 (1946).
⁹X. Jaen, J. Llosa, and A. Molina, Phys. Rev. D 34, 2302 (1986).
¹⁰D. A. Eliezer and R. P. Woodard, Nucl. Phys. B325, 389 (1989).
¹¹K. S. Stelle, Gen. Relativ. Gravit. 9, 353 (1978).
¹²David G. Boulware, S. Deser, and K. S. Stelle, in Quantum Field Theory and Quantum Statistics: Essays in Honor of the Sixtieth Birthday of E. S. Fradkin, edited by I. A. Batalin, C. J. Isham, and G. A. Vilkovisky (Hilger, Bristol, England, 1987).
¹³K. S. Stelle, Phys. Rev. D 16, 953 (1977).
¹⁴David G. Boulware, in Quantum Field Theory and Quantum Statistics: Essays in Honor of the Sixtieth Birthday of E. S. Fradkin (Ref. 12).
¹⁵A. A. Starobinsky, Phys. Lett. 91B, 99 (1980).
¹⁶M. Ostrogradski, Mem. Ac. St. Petersbourg VI 4, 385 (1850).
¹⁷S. W. Hawking, in Quantum Field Theory and Quantum Statistics: Essays in Honor of the Sixtieth Birthday of E. S. Fradkin (Ref. 12).
¹⁸R. P. Feynman, Rev. Mod. Phys. 20, 267 (1948); R. P. Feynman and A. R. Hibbs, Quantum Mechanics and Path Integrals (McGraw-Hill, New York, 1965).
¹⁹T. D. Lee and G. C. Wick, Nucl. Phys. B9, 209 (1969).
²⁰G. T. Horowitz, Phys. Rev. D 31, 1169 (1985).
²¹J. D. Jackson, Classical Electrodynamics (Wiley, New York, 1975).
²²A. Strominger, Phys. Rev. D 30, 2257 (1984).
²³D. Lovelock, J. Math. Phys. 12, 498 (1971); 13, 874 (1972).
²⁴D. G. Boulware and S. Deser, Phys. Rev. Lett. 55, 2656 (1985); J. T. Wheeler, Nucl. Phys. B268, 737 (1986); B273, 732 (1986); R. C. Myers and J. Z. Simon, Phys. Rev. D 38, 2434 (1988).
²⁵J. M. Pons, Lett. Math. Phys. 17, 181 (1989).
²⁶P. A. M. Dirac, Lectures on Quantum Mechanics (Yeshiva University, New York, 1964).
²⁷A. Hanson, T. Regge, and C. Teitelboim, Constrained Hamiltonian Systems (Acad. Naz. dei Lincei, Rome, 1976).
²⁸J. A. Wheeler and R. P. Feynman, Rev. Mod. Phys. 17, 157 (1945).
²⁹L. Bel and H. Sirousse-Zia, Phys. Rev. D 32, 3128 (1985).
³⁰Vachaspati and T. Vachaspati, Phys. Lett. B 238, 41 (1990).
³¹Y. Saito, R. Sugano, T. Ohta, and T. Kimura, J. Math. Phys. 30, 1122 (1989).
³²G. Schäfer, Phys. Lett. 100A, 128 (1984).
³³J. Z. Simon (in preparation).
³⁴V. N. Popov, Functional Integrals in Quantum Field Theory and Statistical Physics (Reidel, Boston, 1983).
³⁵M. S. Marinov, Phys. Rep. 60, 1 (1980).