Transcript
Page 1: Classical and Quantum Mechanics

Notes on Classical and Quantum Mechanics

Jos Thijssen

February 10, 2005

(560 pages)

Available beginning of 1999


Preface

These notes have been developed over several years for use with the courses Classical and Quantum Mechanics A and B, which are part of the third-year applied physics degree program at Delft University of Technology. Part of these notes stems from courses which I taught at Cardiff University of Wales, UK.

These notes are intended to be used alongside standard textbooks. For the classical part, several texts can be used, such as the books by Hand and Finch (Analytical Mechanics, Cambridge University Press, 1999) and Goldstein (Classical Mechanics, third edition, Addison Wesley, 2004), the older book by Corben and Stehle (Classical Mechanics, second edition, Dover, 1994, reprint of 1960 edition), and the textbook by Kibble and Berkshire (Classical Mechanics, 5th edition, World Scientific, 2004). The part on classical mechanics is more self-contained than the quantum part, although consultation of one or more of the texts mentioned is essential for a thorough understanding of this field.

For the quantum mechanics part, we use the book by D. J. Griffiths (Introduction to Quantum Mechanics, Second Edition, Pearson Education International/Prentice Hall, 2005). This is a very nice, student-friendly text which, however, has two drawbacks. Firstly, the informal way in which the material is covered has led to an inconsistent use of Dirac notation; very often, the wavefunction formalism is used instead of the linear algebra notation. Secondly, the book does not go into modern applications of quantum mechanics, such as quantum cryptography and quantum computing. Hopefully these notes remedy that situation. Other books which are useful for learning this material are Introductory Quantum Mechanics by Liboff (fourth edition, Addison Wesley, 2004) and Quantum Mechanics by Bransden and Joachain (second edition, Prentice Hall, 2000). Many more standard texts are available – we mention here Quantum Mechanics by Basdevant and Dalibard (Springer, 2002) and, by the same authors, The Quantum Mechanics Solver (Springer, 2000). Finally, the older text by Messiah (North Holland, 1961), the books by Cohen-Tannoudji, Diu and Laloe (2 vols., John Wiley, 1996), by Gasiorowicz (John Wiley, 3rd edition, 2003) and by Merzbacher (John Wiley, 1997) can all be recommended.

Not all the material in these notes can be found in undergraduate standard texts. In particular, the chapter on the relation between classical and quantum mechanics, and those on quantum cryptography and on quantum information theory are not found in all books listed here, although Liboff’s book contains a chapter on the last two subjects. If you want to know more about these new developments, consult Quantum Computation and Quantum Information by Nielsen and Chuang (Cambridge, 2000).

Along with these notes, there is a large problem set, which is more essential than the notes themselves. There are many things in life which you can only learn by doing them yourself. Nobody would seriously believe you can master a sport or a musical instrument by reading books. For physics, the situation is exactly the same. You have to learn the subject by doing it yourself – even by failing to solve a difficult problem you learn a lot, since in that situation you start thinking about the structure of the subject.

In writing these notes I benefited from numerous discussions with, and advice from, Herre van der Zant and Miriam Blaauboer. I hope the resulting set of notes and problems will help students learn and appreciate the beautiful theory of classical and quantum mechanics.


Contents

Preface

1 Introduction: Newtonian mechanics and conservation laws
    1.1 Newton’s laws
    1.2 Systems of point particles – symmetries and conservation laws

2 Lagrange and Hamilton formulations of classical mechanics
    2.1 Generalised coordinates and virtual displacements
    2.2 d’Alembert’s principle
    2.3 Examples
        2.3.1 The pendulum
        2.3.2 The block on the inclined plane
        2.3.3 Heavy bead on a rotating wire
    2.4 d’Alembert’s principle in generalised coordinates
    2.5 Conservative systems – the mechanical path
    2.6 Examples
        2.6.1 A system of pulleys
        2.6.2 Example: the spinning top
    2.7 Non-conservative forces – charged particle in an electromagnetic field
        2.7.1 Charged particle in an electromagnetic field
    2.8 Hamilton mechanics
    2.9 Applications of the Hamiltonian formalism
        2.9.1 The three-pulley system
        2.9.2 The spinning top
        2.9.3 Charged particle in an electromagnetic field

3 The two-body problem
    3.1 Formulation and analysis of the two-body problem
    3.2 Solution of the Kepler problem

4 Examples of variational calculus, constraints
    4.1 Variational problems
    4.2 The brachistochrone
    4.3 Fermat’s principle
    4.4 The minimal area problem
    4.5 Constraints
        4.5.1 Constraint forces
        4.5.2 Global constraints

5 From classical to quantum mechanics
    5.1 The postulates of quantum mechanics
    5.2 Relation with classical mechanics
    5.3 The path integral: from classical to quantum mechanics
    5.4 The path integral: from quantum mechanics to classical mechanics

6 Operator methods for the harmonic oscillator
    6.1 Introduction
    6.2 The harmonic oscillator

7 Angular momentum
    7.1 Spectrum of the angular momentum operators
    7.2 Orbital angular momentum
    7.3 Spin
    7.4 Addition of angular momenta
    7.5 Angular momentum and rotations

8 Introduction to Quantum Cryptography
    8.1 Introduction
    8.2 The idea of classical encryption
    8.3 Quantum Encryption

9 Scattering in classical and in quantum mechanics
    9.1 Classical analysis of scattering
    9.2 Quantum scattering with a spherical potential
        9.2.1 Calculation of scattering cross sections
        9.2.2 The Born approximation

10 Symmetry and conservation laws
    10.1 Noether’s theorem
    10.2 Liouville’s theorem

11 Systems close to equilibrium
    11.1 Introduction
    11.2 Analysis of a system close to equilibrium
        11.2.1 Example: Double pendulum
    11.3 Normal modes
    11.4 Vibrational analysis
    11.5 The chain of particles

12 Density operators – Quantum information theory
    1 Introduction
    2 The density operator
    3 Entanglement
    4 The EPR paradox and Bell’s theorem
    5 No cloning theorem
    6 Dense coding
    7 Quantum computing and Shor’s factorisation algorithm

Appendix A Review of Linear Algebra
    1 Hilbert spaces
    2 Operators

Appendix B The time-dependent Schrodinger equation

Appendix C Review of the Schrodinger equation in one dimension

1 Introduction: Newtonian mechanics and conservation laws

In this lecture course, we shall introduce some mathematical techniques for studying problems in classical mechanics and apply them to several systems. In a previous course, you have already met Newton’s laws and some of their applications. In this chapter, we briefly review the basic theory, and consider the interpretation of Newton’s laws in some detail. Furthermore, we consider conservation laws of classical mechanics which are connected to symmetries of the forces, and derive these conservation laws starting from Newton’s laws.

1.1 Newton’s laws

The aim of a mechanical theory is to predict the motion of objects. It is convenient to start with point particles which have no dimensions. The trajectory of such a point particle is described by its position at each time. Denoting the spatial position vector by $\mathbf{r}$, the trajectory of the particle is given as $\mathbf{r}(t)$, a three-dimensional function depending on a one-dimensional coordinate: the time. The velocity is defined as the time derivative of the vector $\mathbf{r}(t)$, and by convention it is denoted as $\dot{\mathbf{r}}(t)$:

$$\dot{\mathbf{r}}(t) = \frac{d}{dt}\mathbf{r}(t), \tag{1.1}$$

and the acceleration $\mathbf{a}$ is defined as the second derivative of the position vector with respect to time:

$$\mathbf{a}(t) = \ddot{\mathbf{r}}(t). \tag{1.2}$$

The last concept we must introduce is that of momentum $\mathbf{p}$: it is defined as

$$\mathbf{p} = m\dot{\mathbf{r}}(t), \tag{1.3}$$

where $m$ is the mass. Although we have an intuitive idea about the meaning of mass, this is also a rather subtle physical concept, as is clear from the frequent confusion of mass with the concept of weight (see below).

Now let us state Newton’s laws:

1. A body not influenced by any other matter will move at constant velocity.

2. The rate of change of momentum of a body is equal to the force, $\mathbf{F}$:

$$\frac{d\mathbf{p}}{dt} = \mathbf{F}(\mathbf{r}, t). \tag{1.4}$$

3. When a particle exerts a force $\mathbf{F}$ on another particle, then the other particle exerts a force on the first particle which is equal in magnitude but opposite in direction to the force $\mathbf{F}$ – these forces are directed along the line connecting the two particles. Denoting the particles by indices 1 and 2, the force exerted on 1 by 2 by $\mathbf{F}_{1,2}$ and the force exerted on 2 by 1 by $\mathbf{F}_{2,1}$, we have:

$$\mathbf{F}_{1,2} = -\mathbf{F}_{2,1} = \pm F_{1,2}\,\hat{\mathbf{r}}_{1,2}, \tag{1.5}$$

where $\hat{\mathbf{r}}_{1,2}$ is a unit vector pointing from $\mathbf{r}_1$ to $\mathbf{r}_2$. The $\pm$ denotes whether the force is repulsive ($-$) or attractive ($+$).

Table 1.1: Forces for various systems. The symbol $m_i$ stands for the mass of point particle $i$, $q_i$ stands for the electric charge of particle $i$, $\mathbf{B}$ is a magnetic, and $\mathbf{E}$ an electric field. $G$, $\varepsilon_0$ and $g$ are known constants. The gravitational and the electrostatic forces are directed along the line connecting the two particles $i = 1, 2$.

Forces in nature
System                                  Force
Gravity                                 $F_G = G\,mM\,\dfrac{1}{r^2}$
Gravity near earth’s surface            $\mathbf{F}_g = -mg\,\hat{\mathbf{z}}$
Electrostatics                          $F_C = \dfrac{1}{4\pi\varepsilon_0}\,q_1 q_2\,\dfrac{1}{r^2}$
Particle in an electromagnetic field    $\mathbf{F}_{EM} = q\,(\mathbf{E} + \dot{\mathbf{r}}\times\mathbf{B})$
Air friction                            $\mathbf{F}_{fr} = -\gamma\,\dot{\mathbf{r}}$

Some remarks about these laws are in order. It is questionable whether the second law is really a statement, as a new vector quantity, called ‘force’, is introduced, which is not yet defined. Only if we know the force can we predict how a particle will move. In that sense, a real ‘law’ is only formed by combining Newton’s second law with an explicit expression for the force. In table 1.1, known forces are given for several systems. Note that the force generally depends on the position $\mathbf{r}$, on the velocity $\dot{\mathbf{r}}$, and also explicitly on time (e.g. when an external, time-varying field is present). An implicit dependence on time is further provided by the time dependence of the position vector $\mathbf{r}(t)$.

In most cases, the mass is taken to be constant, although this is not always true: you may think of a rocket burning its fuel, or disposing of its launching system, or bodies moving at a speed of the order of the speed of light, where the mass deviates from the rest mass. With constant mass, the second law reads:

$$m\ddot{\mathbf{r}}(t) = \mathbf{F}(\mathbf{r}, t). \tag{1.6}$$
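As an illustration of how Eq. (1.6) turns into a prediction once the force is specified, the following is a minimal numerical sketch. It assumes a force made of gravity near the earth's surface plus air friction (both taken from table 1.1); the function names, the integration scheme (semi-implicit Euler) and all numerical values are choices made for this example and are not part of the notes.

```python
# Minimal sketch: integrate m * r'' = F for a single point particle.
# Assumed force model (table 1.1): F = -m g z_hat - gamma * r_dot.
# The values of g, gamma, dt and the initial conditions are illustrative only.

def force(velocity, mass, g=9.81, gamma=0.5):
    """Gravity near the earth's surface plus air friction."""
    vx, vy, vz = velocity
    return (-gamma * vx, -gamma * vy, -mass * g - gamma * vz)

def integrate(position, velocity, mass, dt=1e-3, steps=20000):
    """Semi-implicit Euler: update the velocity first, then the position."""
    r = list(position)
    v = list(velocity)
    for _ in range(steps):
        f = force(v, mass)
        for k in range(3):
            v[k] += f[k] / mass * dt   # acceleration a = F / m, Eq. (1.6)
            r[k] += v[k] * dt
    return tuple(r), tuple(v)

r_final, v_final = integrate((0.0, 0.0, 0.0), (10.0, 0.0, 10.0), mass=1.0)

# Friction balances gravity when gamma * |v_z| = m g.
terminal_speed = 1.0 * 9.81 / 0.5
print(v_final, terminal_speed)
```

After many friction time constants $m/\gamma$ the horizontal velocity has decayed and the vertical velocity approaches $-mg/\gamma$, which is what the final print statement lets you compare.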

In fact, the second law disentangles two ingredients of the motion. One is the mass $m$, which is a property of the moving particle which is acted upon by the force, and the other the force, which itself arises from some external origin. In the case of gravitational interaction, the force depends on the mass, which drops out of the equation of motion. Generally, mass can be described as the resistance to velocity change, as the second law states that the larger the mass, the smaller the change in velocity (for the same force). It is an experimental fact that the mass which enters the expression for the gravitational force is the same as this universal quantity mass, which occurs for any force in the equation of motion. The weight is the gravity force acting on a body.

Usually, the first law is phrased as follows: ‘when there is no force acting on a point particle, the particle moves at constant velocity’. This statement obviously follows from the second law by taking $\mathbf{F} = 0$. The formulation adopted above emphasises that force has a material origin. It is impossible to fulfil the requirements of this law, as everywhere in the universe gravitational forces are present: the first law is an idealisation. The first law is not obvious from everyday life, where it is never possible to switch friction off completely: in everyday life, motion requires a force in order to be maintained.

The third law is a statement about forces. It turns out that this statement does not hold exactly, as the forces of this statement should act simultaneously. In quantum field theory, particles travelling between the interacting particles are held responsible for the interactions, and these particles cannot travel at a speed faster than that of light in vacuum (about $3\cdot 10^8$ m/s). However, for everyday life mechanics, the third law holds to sufficient precision, unless the moving particles carry a charge and interact through electromagnetic interactions. In that case, the force no longer acts along the line connecting the two particles.

1.2 Systems of point particles – symmetries and conservation laws

Real objects which we describe in mechanics are not point particles, but to a very good approximation they can be considered as large collections of interacting point particles – in this section we consider systems consisting of $N$ point particles. It is possible to disentangle the mutual forces acting between these particles from the external ones. The mutual forces satisfy Newton’s third law: for every force $\mathbf{F}_{i,j}$, which is the force exerted by particle $j$ on particle $i$, the force $\mathbf{F}_{j,i}$ is equal in magnitude but opposite in direction to $\mathbf{F}_{i,j}$. For a particle $i$, we consider all the mutual forces $\mathbf{F}_{i,j}$ for $j \neq i$ – the remaining forces on $i$ must then be due to external sources (i.e., not depending on the other particles in our system), and we lump these forces together in one external force, $\mathbf{F}^{\mathrm{Ext}}_i$:

$$\mathbf{F}_i = \sum_{j=1;\, j\neq i}^{N} \mathbf{F}_{i,j} + \mathbf{F}^{\mathrm{Ext}}_i, \qquad i = 1, \ldots, N. \tag{1.7}$$

The equations of motion read:

$$m_i\ddot{\mathbf{r}}_i = \sum_{j=1;\, j\neq i}^{N} \mathbf{F}_{i,j} + \mathbf{F}^{\mathrm{Ext}}_i. \tag{1.8}$$

The total momentum of the system is the sum of the momenta of all the particles:

$$\mathbf{p} = \sum_{i=1}^{N} \mathbf{p}_i = \sum_{i=1}^{N} m_i\dot{\mathbf{r}}_i. \tag{1.9}$$

We can view the total momentum of the system as the momentum of a single particle with a mass equal to the total mass $M$ of the system, and position vector $\mathbf{r}_C$. This position vector is then defined through:

$$\mathbf{p} = M\dot{\mathbf{r}}_C = \sum_{i=1}^{N} m_i\dot{\mathbf{r}}_i; \qquad M = \sum_{i=1}^{N} m_i. \tag{1.10}$$

This is equivalent to

$$\mathbf{r}_C = \frac{1}{M}\sum_{i=1}^{N} m_i\mathbf{r}_i \tag{1.11}$$

up to an integration constant which is always taken to be zero. The vector $\mathbf{r}_C$ is called the centre of mass of the system. A particle of mass $M$ at the centre of mass (which obviously changes in time) represents the same momentum as the total momentum of the system.


Let us find an equation of motion for the centre of mass. We do this by summing Eq. (1.8) over $i$:

$$\sum_{i=1}^{N} m_i\ddot{\mathbf{r}}_i = \sum_{i,j=1;\, i\neq j}^{N} \mathbf{F}_{i,j} + \sum_{i=1}^{N} \mathbf{F}^{\mathrm{Ext}}_i. \tag{1.12}$$

In the first term on the right hand side, for every term $\mathbf{F}_{i,j}$, there will also be a term $\mathbf{F}_{j,i}$, but this is equal in magnitude and opposite to $\mathbf{F}_{i,j}$! So the first term vanishes, and we are left with

$$\sum_{i=1}^{N} m_i\ddot{\mathbf{r}}_i = \dot{\mathbf{p}} = \sum_{i=1}^{N} \mathbf{F}^{\mathrm{Ext}}_i \equiv \mathbf{F}^{\mathrm{Ext}}. \tag{1.13}$$

We see that the centre of mass behaves as a point particle with mass $M$ subject to the total external force acting on the system.

Conservation of physical quantities, such as energy, momentum etcetera, is always the result of some symmetry. This deep relation is borne out in a beautiful theorem, formulated by E. Noether, which we shall consider in the next semester. In this section we shall derive three conservation properties from Newton’s laws and the appropriate symmetries.

The first symmetry we consider is that of a system of particles experiencing only mutual forces, and no external ones. We then see immediately from Eq. (1.13) with $\mathbf{F}^{\mathrm{Ext}} = 0$ that $\dot{\mathbf{p}} = 0$, in other words, the total momentum is conserved.

Conservation of momentum
In a system consisting of interacting particles, not subject to an external force, the total momentum is always conserved.
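The cancellation of the mutual forces in Eq. (1.12) can be checked numerically. The sketch below is an illustration under assumptions of its own: three particles interact pairwise through springs along the connecting lines (a force law chosen only for this example), there is no external force, and the masses, positions and velocities are arbitrary.

```python
# Minimal sketch: total momentum of particles with only mutual forces.
# The pairwise spring force and all numerical values are illustrative assumptions.

def mutual_force(r_i, r_j, k=2.0, rest_length=1.0):
    """Force on particle i due to particle j, directed along the connecting line."""
    d = [a - b for a, b in zip(r_i, r_j)]
    dist = sum(c * c for c in d) ** 0.5
    return [-k * (dist - rest_length) * c / dist for c in d]

def total_momentum(masses, velocities):
    """p = sum_i m_i * r_dot_i, Eq. (1.9)."""
    return [sum(m * v[k] for m, v in zip(masses, velocities)) for k in range(3)]

def simulate(masses, positions, velocities, dt=1e-3, steps=5000):
    r = [list(p) for p in positions]
    v = [list(u) for u in velocities]
    n = len(masses)
    for _ in range(steps):
        forces = [[0.0] * 3 for _ in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                f_ij = mutual_force(r[i], r[j])
                for k in range(3):
                    forces[i][k] += f_ij[k]
                    forces[j][k] -= f_ij[k]   # third law: F_ji = -F_ij
        for i in range(n):
            for k in range(3):
                v[i][k] += forces[i][k] / masses[i] * dt
                r[i][k] += v[i][k] * dt
    return r, v

masses = [1.0, 2.0, 3.0]
positions = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 2.0, 0.5)]
velocities = [(0.1, 0.0, 0.0), (0.0, -0.2, 0.1), (0.0, 0.1, 0.0)]

p_before = total_momentum(masses, velocities)
_, v_after = simulate(masses, positions, velocities)
p_after = total_momentum(masses, v_after)
print(p_before, p_after)
```

Because every contribution $\mathbf{F}_{i,j}$ is added to particle $i$ and subtracted from particle $j$, the total momentum stays constant up to rounding error, independently of the time step.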

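Next, let us consider the angular momentum $\mathbf{L}$. This is a vector quantity, which for particle $i$ is defined as $\mathbf{L}_i = \mathbf{r}_i\times\mathbf{p}_i$. The total angular momentum $\mathbf{L}$ is the sum of the vectors $\mathbf{L}_i$:

$$\mathbf{L} = \sum_{i=1}^{N} \mathbf{L}_i. \tag{1.14}$$

To see how $\mathbf{L}$ varies in time, we calculate the time derivative of $\mathbf{L}_i$:

$$\dot{\mathbf{L}}_i = \dot{\mathbf{r}}_i\times\mathbf{p}_i + \mathbf{r}_i\times\dot{\mathbf{p}}_i. \tag{1.15}$$

The first term of the right hand side vanishes because $\mathbf{p}_i$ is parallel to $\dot{\mathbf{r}}_i$, so we are left with

$$\dot{\mathbf{L}}_i = \mathbf{r}_i\times\dot{\mathbf{p}}_i = \mathbf{N}_i; \tag{1.16}$$

$\mathbf{N}_i$ is the torque acting on particle $i$.

Now we calculate the torque on the total system by summing over $i$ and replacing $\dot{\mathbf{p}}_i$ by the force (according to the second law):

$$\dot{\mathbf{L}} = \sum_{i=1}^{N} \mathbf{r}_i\times\mathbf{F}_i = \sum_{i=1}^{N} \mathbf{r}_i\times\left(\sum_{j=1,\, j\neq i}^{N} \mathbf{F}_{i,j} + \mathbf{F}^{\mathrm{Ext}}_i\right). \tag{1.17}$$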

The first term on the right hand side vanishes, again as a result of the third law:

$$\mathbf{r}_i\times\mathbf{F}_{i,j} + \mathbf{r}_j\times\mathbf{F}_{j,i} = (\mathbf{r}_i - \mathbf{r}_j)\times\mathbf{F}_{i,j} = 0, \tag{1.18}$$

where the last equality is a result of the direction of $\mathbf{F}_{i,j}$ coinciding with that of the line connecting $\mathbf{r}_i$ and $\mathbf{r}_j$ (which excludes electromagnetic interactions between moving particles from the discussion). We therefore have

$$\dot{\mathbf{L}} = \sum_{i=1}^{N} \mathbf{r}_i\times\mathbf{F}^{\mathrm{Ext}}_i. \tag{1.19}$$

We see that if the external forces vanish, the angular momentum does not change:

Conservation of angular momentum
In a system consisting of interacting particles (not electromagnetic), not subject to an external force, the angular momentum is always conserved.

Figure 1.1: Path from $\mathbf{r}_1$ to $\mathbf{r}_2$. The force at some point along the path is shown, together with the contribution to the work of a small segment $d\mathbf{r}$ along the path.
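The vanishing of the mutual torques in Eq. (1.18) is easy to verify for concrete vectors. The sketch below assumes a central mutual force proportional to $\mathbf{r}_i - \mathbf{r}_j$ (a hypothetical force law used only for illustration) and evaluates the combined torque of an action-reaction pair for randomly chosen positions.

```python
import random

# Minimal sketch: torque of an action-reaction pair, Eq. (1.18).
# The linear central force law and the random positions are illustrative assumptions.

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def mutual_torque(r_i, r_j, strength):
    """r_i x F_ij + r_j x F_ji for F_ij = strength * (r_i - r_j), F_ji = -F_ij."""
    f_ij = tuple(strength * (a - b) for a, b in zip(r_i, r_j))
    f_ji = tuple(-c for c in f_ij)
    t_i = cross(r_i, f_ij)
    t_j = cross(r_j, f_ji)
    return tuple(a + b for a, b in zip(t_i, t_j))

random.seed(0)
r1 = tuple(random.uniform(-1.0, 1.0) for _ in range(3))
r2 = tuple(random.uniform(-1.0, 1.0) for _ in range(3))
torque = mutual_torque(r1, r2, strength=-3.0)
print(torque)
```

The printed components vanish up to rounding error. If the force were given a component perpendicular to the connecting line, the two torques would no longer cancel, which is the situation excluded in the text.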

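Finally, we consider the energy. Let us evaluate the work $W$ done by moving a single particle from $\mathbf{r}_1$ to $\mathbf{r}_2$ along some path $\Gamma$ (see figure 1.1). This is by definition the inner product of the force and the infinitesimal displacements, summed over the path:

$$W = \int_{\Gamma} \mathbf{F}\cdot d\mathbf{r}(t) = \int_{t_1}^{t_2} \mathbf{F}\cdot\dot{\mathbf{r}}\, dt. \tag{1.20}$$

Using Newton’s second law, we can write:

$$W = \int_{t_1}^{t_2} m\ddot{\mathbf{r}}\cdot\dot{\mathbf{r}}\, dt = \int_{t_1}^{t_2} \frac{m}{2}\frac{d}{dt}\left(\dot{\mathbf{r}}^2\right) dt = \frac{m}{2}\left(\dot{\mathbf{r}}_2^2 - \dot{\mathbf{r}}_1^2\right), \tag{1.21}$$

where $\dot{\mathbf{r}}_1$ is the velocity at time $t_1$ and similarly for $\dot{\mathbf{r}}_2$. We see that from Newton’s second law it follows that the work done along the path $\Gamma$ is equal to the change in the kinetic energy $T = m\dot{\mathbf{r}}^2/2$.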
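The statement of Eq. (1.21), that the work along the actual trajectory equals the change in kinetic energy, can be illustrated numerically. The sketch below assumes a simple position-dependent force (a horizontal spring plus gravity, chosen only for this example). The work increment in each step uses the average of the old and new velocity; with that choice the discrete sum reproduces the change in kinetic energy up to rounding error.

```python
# Minimal sketch: work done along a trajectory versus change in kinetic energy.
# The force law, time step and initial conditions are illustrative assumptions.

def force(r, m=1.0, k=2.0, g=9.81):
    """Horizontal spring force plus gravity near the earth's surface."""
    x, y, z = r
    return (-k * x, -k * y, -m * g)

def work_and_kinetic_change(r, v, m=1.0, dt=1e-3, steps=5000):
    r, v = list(r), list(v)
    t_start = 0.5 * m * sum(c * c for c in v)
    work = 0.0
    for _ in range(steps):
        f = force(r, m)
        v_new = [vi + fi / m * dt for vi, fi in zip(v, f)]
        v_mid = [0.5 * (a + b) for a, b in zip(v, v_new)]
        # Contribution F . r_dot dt to the work integral, Eq. (1.20)
        work += sum(fi * vi for fi, vi in zip(f, v_mid)) * dt
        r = [ri + vi * dt for ri, vi in zip(r, v_mid)]
        v = v_new
    t_end = 0.5 * m * sum(c * c for c in v)
    return work, t_end - t_start

work, delta_t = work_and_kinetic_change((1.0, 0.0, 0.0), (0.0, 1.0, 2.0))
print(work, delta_t)
```

The two printed numbers agree, which mirrors the step in Eq. (1.21) where $m\ddot{\mathbf{r}}\cdot\dot{\mathbf{r}}$ is recognised as the time derivative of $m\dot{\mathbf{r}}^2/2$.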

A conservation law can be derived for the case where $\mathbf{F}$ is a conservative and time-independent force. This means that $\mathbf{F}$ can be written as the negative gradient of some scalar function, called the potential:¹

$$\mathbf{F}(\mathbf{r}) = -\nabla V(\mathbf{r}). \tag{1.22}$$

In that case we can write the work in a different way:

$$W = -\int_{\Gamma} \nabla V(\mathbf{r})\cdot d\mathbf{r}(t) = -\int_{t_1}^{t_2} \frac{dV(\mathbf{r})}{dt}\, dt = V(\mathbf{r}_1) - V(\mathbf{r}_2). \tag{1.23}$$

¹ From vector calculus it is known that a necessary and sufficient condition for this to be possible is that the force is curl-free, i.e. $\nabla\times\mathbf{F} = 0$.


From this and from Eq. (1.21) it follows that

$$T_1 + V_1 = T_2 + V_2, \tag{1.24}$$

where $T_1$ is the kinetic energy in the point $\mathbf{r}_1$ (or at the time $t_1$) etcetera. Thus $T + V$ is a conserved quantity, which we call the energy $E$.

Of course, now that we know the expression for the energy, we can verify that it is a conserved quantity by calculating its time derivative, using Newton’s second law:

$$\dot{E} = m\ddot{\mathbf{r}}\cdot\dot{\mathbf{r}} + \nabla V(\mathbf{r})\cdot\dot{\mathbf{r}} = \mathbf{F}\cdot\dot{\mathbf{r}} - \mathbf{F}\cdot\dot{\mathbf{r}} = 0. \tag{1.25}$$
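Energy conservation as expressed in Eq. (1.25) can also be observed in a numerical trajectory. The following sketch assumes a harmonic potential $V = k\mathbf{r}^2/2$ and uses the velocity Verlet scheme; both are choices made for this example, not something prescribed in the notes. A numerical integrator conserves the energy only approximately, so the comparison is made within a small tolerance.

```python
# Minimal sketch: E = T + V along a numerically integrated trajectory.
# The harmonic potential, the velocity Verlet scheme and all values are assumptions.

def potential(r, k=2.0):
    """V(r) = k r^2 / 2."""
    return 0.5 * k * sum(c * c for c in r)

def force(r, k=2.0):
    """F = -grad V, Eq. (1.22)."""
    return [-k * c for c in r]

def energy(r, v, m):
    """E = T + V with T = m r_dot^2 / 2."""
    return 0.5 * m * sum(c * c for c in v) + potential(r)

def velocity_verlet(r, v, m, dt=1e-3, steps=10000):
    r, v = list(r), list(v)
    f = force(r)
    for _ in range(steps):
        v = [vi + 0.5 * dt * fi / m for vi, fi in zip(v, f)]   # half kick
        r = [ri + dt * vi for ri, vi in zip(r, v)]             # drift
        f = force(r)
        v = [vi + 0.5 * dt * fi / m for vi, fi in zip(v, f)]   # half kick
    return r, v

m = 1.0
r0, v0 = (1.0, 0.0, 0.5), (0.0, 1.0, 0.0)
e_start = energy(r0, v0, m)
r1, v1 = velocity_verlet(r0, v0, m)
e_end = energy(r1, v1, m)
print(e_start, e_end)
```

The kinetic and potential contributions change individually during the motion, while their sum stays constant to within the accuracy of the integration scheme.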

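For a many-particle system, the derivation is similar – the condition on the force is then that there exists a potential function $V(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N)$, such that the force $\mathbf{F}_i$ on particle $i$ is given by

$$\mathbf{F}_i = -\nabla_i V(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N). \tag{1.26}$$

Note that $V$ depends on $3N$ coordinates – the gradient $\nabla_i$ acting on $V$ gives a 3-dimensional vector

$$\nabla_i V = \left(\frac{\partial V}{\partial x_i}, \frac{\partial V}{\partial y_i}, \frac{\partial V}{\partial z_i}\right). \tag{1.27}$$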

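The kinetic energy is the sum of the one-particle kinetic energies. Now the energy conservation is derived as follows:

$$\dot{E} = \sum_i m_i\ddot{\mathbf{r}}_i\cdot\dot{\mathbf{r}}_i + \sum_i \nabla_i V\cdot\dot{\mathbf{r}}_i = \sum_i \left(\mathbf{F}_i\cdot\dot{\mathbf{r}}_i - \mathbf{F}_i\cdot\dot{\mathbf{r}}_i\right) = 0. \tag{1.28}$$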

The function $V$ above depends on time only through the time-dependence of the arguments $\mathbf{r}_i$. If we consider a charged particle in a time-dependent electric field, this is no longer the case: then $t$ occurs as an additional, explicit argument in $V$. If $V$ depends explicitly on time, the energy changes at a rate

$$\dot{E} = \frac{\partial}{\partial t} V(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N, t), \tag{1.29}$$

where the arguments $\mathbf{r}_i$ also depend on time (but do not take part in the differentiation with respect to $t$). If $V$ does not depend explicitly on time, we can define the zero of time (i.e. the time when we set our clock to zero) arbitrarily. This time translation invariance is essential for having conservation of energy.

Similarly, the conservation of momentum is related to space translation invariance of the potential, i.e. this potential should not change when we translate all particles by the same vector. Finally, conservation of angular momentum is related to rotational symmetry of the potential. In quantum mechanics, all these symmetries lead to the same conserved quantities (or rather their quantum mechanical analogues).

A final remark concerns the evaluation of the kinetic energy of a many-particle system. As we have seen above, the motion of the centre of mass can be split off from the analysis in a suitable way. This procedure also works for the kinetic energy. Let us decompose the position vector $\mathbf{r}_i$ of particle $i$ into two parts: the centre of mass position vector $\mathbf{r}_C$ and the position relative to the centre of mass, which we call $\mathbf{r}'_i$:

$$\mathbf{r}_i = \mathbf{r}_C + \mathbf{r}'_i. \tag{1.30}$$

As, by definition, $\mathbf{r}_C = \sum_i m_i\mathbf{r}_i/M$, we have

$$\sum_i m_i\mathbf{r}'_i = \sum_i m_i\mathbf{r}_i - M\mathbf{r}_C = 0. \tag{1.31}$$


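We can use this decomposition to rewrite the kinetic energy:

$$T = \sum_i \frac{m_i}{2}\left(\dot{\mathbf{r}}_C + \dot{\mathbf{r}}'_i\right)^2 = \frac{M}{2}\dot{\mathbf{r}}_C^2 + \dot{\mathbf{r}}_C\cdot\sum_i m_i\dot{\mathbf{r}}'_i + \sum_i \frac{m_i}{2}\dot{\mathbf{r}}'^2_i. \tag{1.32}$$

The second term vanishes as a result of the time derivative of (1.31) and therefore we have succeeded in writing the kinetic energy of the many-particle system as the kinetic energy of the centre of mass, $T_{CM} = M\dot{\mathbf{r}}_C^2/2$, plus the kinetic energy of the relative coordinates:

$$T = T_{CM} + \sum_i \frac{m_i}{2}\dot{\mathbf{r}}'^2_i. \tag{1.33}$$

This formula is a convenient device for calculating the kinetic energy in many applications.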
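The decomposition of Eq. (1.33) holds for any set of masses and velocities, which makes it easy to check numerically. The sketch below draws random masses and velocities (an arbitrary choice for this illustration) and compares the total kinetic energy with the sum of the centre-of-mass and relative contributions.

```python
import random

# Minimal sketch: T = T_CM + sum_i m_i r_dot'_i^2 / 2, Eq. (1.33).
# The random masses and velocities are illustrative assumptions.

def kinetic_energy_split(masses, velocities):
    total_mass = sum(masses)
    # Centre-of-mass velocity from Eq. (1.10): M r_dot_C = sum_i m_i r_dot_i
    v_c = [sum(m * v[k] for m, v in zip(masses, velocities)) / total_mass
           for k in range(3)]
    t_total = sum(0.5 * m * sum(c * c for c in v)
                  for m, v in zip(masses, velocities))
    t_cm = 0.5 * total_mass * sum(c * c for c in v_c)
    # Velocities relative to the centre of mass: r_dot'_i = r_dot_i - r_dot_C
    t_rel = sum(0.5 * m * sum((v[k] - v_c[k]) ** 2 for k in range(3))
                for m, v in zip(masses, velocities))
    return t_total, t_cm, t_rel

random.seed(1)
masses = [random.uniform(0.5, 2.0) for _ in range(5)]
velocities = [[random.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(5)]
t_total, t_cm, t_rel = kinetic_energy_split(masses, velocities)
print(t_total, t_cm + t_rel)
```

The cross term of Eq. (1.32) never has to be computed: it drops out because the mass-weighted relative velocities sum to zero.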

2 Lagrange and Hamilton formulations of classical mechanics

The laws of classical mechanics, formulated by Newton, and the various laws for the forces (see table 1.1) supply sufficient ingredients for predicting the motion of mechanical systems in the classical limit. Working out the solution for particular cases is not always easy, however. In this chapter we shall develop an alternative formulation of the laws of classical mechanics, which renders the analysis of many systems easier than the traditional Newtonian formulation, in particular when the moving particles are subject to constraints. The new formulation will not only enable us to analyse new applications more easily than using Newton’s laws, but it also leads to an important example of a variational formulation of a physical theory. Broadly speaking, in a variational formulation, a physical solution is found by minimising a mathematical expression involving a function by varying that function. Many physical theories can be formulated in a variational way, in particular quantum mechanics and electrodynamics.

2.1 Generalised coordinates and virtual displacements

When observing motion in everyday life, we often encounter systems in which the moving particles are subject to constraints. For example, when a car moves on the road, the road surface prevents the car from moving downward, just as a billiard table does for the balls on it. Another example is a particle suspended on a rigid rod (i.e. the pendulum), which can only move on the sphere around the suspension point with radius equal to the rod length. The constraints are realised by forces, which we call the forces of constraint. The forces of constraint guarantee that the constraints are met – they often do not influence the motion within the subspace.¹ The main object of the next few sections is to show that it is possible to eliminate these constraint forces from the description of the mechanical problem.

¹ The subspace on which the particle is allowed to move is not necessarily a linear subspace, e.g. the spherical subspace in the case of a pendulum. Mathematicians would use the term ‘submanifold’ rather than subspace.

As the presence of constraints reduces the actual degrees of freedom of the system, it is useful to use a smaller set of degrees of freedom to describe the system. As an example, consider a ball on a billiard table. In that case, the $z$-coordinate drops out of the description, and we are left with the $x$ and $y$ coordinates only. This is obviously a very simple example, in which one of the Cartesian coordinates is simply left out of the description of the system. More interesting is a ball suspended on a rod. In that case we can use the angular coordinates $\vartheta$ and $\varphi$ to describe the system – that is, we replace the coordinates $x$, $y$ and $z$ by the angles $\varphi$ and $\vartheta$ – see figure 2.1. In this case, we see that the coordinates no longer represent distances, that is, they do not have the dimension of length, but rather they are values of angles, and therefore dimensionless. This is the reason why we speak of generalised coordinates. These coordinates form a reduced representation of a system subject to constraints. In chapter 2 of the Schaum book you find many examples of constraints and generalised coordinates. Generalised coordinates are denoted by $q_j$, where $j$ is an index which runs over the degrees of freedom of the constrained system.

Figure 2.1: The pendulum in three dimensions. The position of the mass is described by the two angles $\varphi$ and $\vartheta$.

We shall now look at constraints and generalised coordinates from a more formal viewpoint. Let us consider a system consisting of $N$ particles in 3 dimensions, so that the total number of coordinates is $3N$. The system is subject to a number of constraints, which are of the form

$$g^{(k)}(\mathbf{r}_1, \ldots, \mathbf{r}_N, t) = 0, \qquad k = 1, \ldots, K. \tag{2.1}$$

Constraints of this form (i.e. independent of the velocities) are called holonomic. Usually, it is then possible to transform the $3N$ degrees of freedom to a reduced set of $3N-K$ generalised coordinates $\mathbf{q} = (q_j)$, $j = 1, \ldots, 3N-K$. It is now possible to express the position vectors in terms of these new coordinates:

$$\mathbf{r}_i = \mathbf{r}_i(\mathbf{q}, t). \tag{2.2}$$

As an example, consider the particle suspended on a rod; see figure 2.1. The Cartesian coordinates are $x$, $y$ and $z$ and they can be written in terms of the generalised coordinates $\vartheta$ and $\varphi$ as:

$$x = l\sin\vartheta\cos\varphi; \tag{2.3}$$

$$y = l\sin\vartheta\sin\varphi; \tag{2.4}$$

$$z = -l\cos\vartheta, \tag{2.5}$$

where $l$ is the length of the rod (and therefore fixed). These equations are a particular example of Eqs. (2.2).
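Equations (2.3)-(2.5) can be turned directly into a small function mapping the generalised coordinates to Cartesian ones. The sketch below also evaluates the constraint $x^2 + y^2 + z^2 - l^2 = 0$, which is the form of Eq. (2.1) assumed here for the rigid rod; the function names and sample angles are chosen for this illustration only.

```python
import math

# Minimal sketch: generalised coordinates (theta, phi) of the pendulum on a rod.
# The constraint function is the assumed form of Eq. (2.1) for a rod of length l.

def pendulum_position(theta, phi, length=1.0):
    """Cartesian position from Eqs. (2.3)-(2.5)."""
    x = length * math.sin(theta) * math.cos(phi)
    y = length * math.sin(theta) * math.sin(phi)
    z = -length * math.cos(theta)
    return x, y, z

def constraint(x, y, z, length=1.0):
    """Holonomic constraint g(r) = x^2 + y^2 + z^2 - l^2; zero on the sphere."""
    return x * x + y * y + z * z - length * length

samples = [(0.0, 0.0), (0.3, 1.2), (1.0, -2.0), (2.5, 4.0)]
residuals = [constraint(*pendulum_position(theta, phi, 2.0), length=2.0)
             for theta, phi in samples]
print(residuals)
```

Whatever values the two angles take, the position automatically satisfies the constraint, so the three Cartesian coordinates with one constraint have been traded for $3N - K = 2$ unconstrained coordinates.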

The velocity can be expressed in terms of the $\dot{q}_j$:

$$\dot{\mathbf{r}}_i = \sum_{j=1}^{3N-K} \frac{\partial\mathbf{r}_i}{\partial q_j}\dot{q}_j + \frac{\partial\mathbf{r}_i}{\partial t}. \tag{2.6}$$

From this equation we also find directly

$$\frac{\partial\dot{\mathbf{r}}_i}{\partial\dot{q}_j} = \frac{\partial\mathbf{r}_i}{\partial q_j}, \tag{2.7}$$

a result which will be very useful further on.

Newton’s laws predict the evolution of a mechanical system without ambiguity from a given initial state (if that state is not on an unstable point, such as zero velocity at the top of a hill). However, we are sometimes interested in a variation of the path of a system, i.e. a displacement of one or more particles in some direction. Such displacements are called virtual displacements in order to distinguish them from the actual displacement, which is always governed by the Newton equations of motion. If we now generalise the definition of work, Eq. (1.20), to include virtual displacements $\delta\mathbf{r}_i$ rather than the mechanical displacements which actually take place, then the work done due to this displacement is defined as

$$\delta W = \sum_{i=1}^{N} \mathbf{F}_i\cdot\delta\mathbf{r}_i. \tag{2.8}$$

The notion of virtual work is very important in the following section.

2.2 d’Alembert’s principle

We start from Newton’s law of motion for an $N$-particle system:

$$\dot{\mathbf{p}}_i = m_i\ddot{\mathbf{r}}_i = \mathbf{F}_i, \qquad i = 1, \ldots, N. \tag{2.9}$$

It is always possible to decompose the total force on a particle into a force of constraint $\mathbf{F}^C$ and the remaining force, which we call the applied force $\mathbf{F}^A$:

$$\mathbf{F} = \mathbf{F}^C + \mathbf{F}^A. \tag{2.10}$$

If you consider any system consisting of a single particle (or nonrotating rigid body), subject to constraints, you will find that the forces of constraint are always perpendicular to the space in which the particle is allowed to move. For example, if a particle is attached to a rigid rod which is suspended such that it can rotate freely, the particle can only move on a spherical surface. The force of constraint, which is the tension in the rod, is always normal to that surface. Similarly, the force of the billiard table on the balls is always vertical, i.e. perpendicular to the plane of motion. This notion provides a way to eliminate these forces from the description. Consider an arbitrary but small virtual displacement $\delta\mathbf{r}$ within the subspace allowed by the constraint. Because the force of constraint is perpendicular to this subspace, we have:

$$\dot{\mathbf{p}} \cdot \delta\mathbf{r} = (\mathbf{F}^C + \mathbf{F}^A) \cdot \delta\mathbf{r} = \mathbf{F}^A \cdot \delta\mathbf{r}. \tag{2.11}$$

We see that the force of constraint drops out of the system, and we are left with a motion determined by the applied force only. Because (2.11) holds for every small $\delta\mathbf{r}$, we have

$$\dot{\mathbf{p}} = \mathbf{F}^A \tag{2.12}$$

if we restrict all vectors to be tangential to the constraint subspace. The principle we have formulated in Eq. (2.11) is called d’Alembert’s principle. For systems consisting of a single rigid body, it expresses the fact that the forces of constraint are perpendicular to the subspace of the constraint. The expression $\mathbf{F} \cdot \delta\mathbf{r}$ is the virtual work done as a result of the virtual displacement.

It is important to note that the virtual displacements are always considered to be spatial – the time is not changed. This is particularly important in cases where the constraints are time-dependent. In the next section we shall consider an example of this.

Figure 2.2: The pendulum moving in a plane. The rod of length $l$ is rigid, massless, and is suspended frictionless.

For more than one object, the contributions to the virtual work must be added, so that we obtain:

$$\sum_{i}^{N} \dot{\mathbf{p}}_i \cdot \delta\mathbf{r}_i = \sum_{i} \mathbf{F}^A_i \cdot \delta\mathbf{r}_i. \tag{2.13}$$

In this form, the contributions of the constraint forces to the virtual work do not all vanish for each individual object, but the total virtual work due to the constraint forces vanishes:

$$\sum_{i} \mathbf{F}^C_i \cdot \delta\mathbf{r}_i = 0. \tag{2.14}$$

In summary, we can formulate d’Alembert’s principle in the following, concise form:

The virtual work due to the forces of constraint is always zero for virtual displacements which do not violate the constraint.

The use of d’Alembert’s principle can simplify the analysis of systems subject to constraints, although we often use this principle tacitly in tackling problems in the ‘Newtonian’ approach. In that approach we usually demand that the forces of constraint balance the components of the applied force perpendicular to the constraint subspace. Nevertheless, it is convenient to skip this step, using d’Alembert’s principle, especially in complicated problems (many applied forces and constraints).

2.3 Examples

2.3.1 The pendulum

As a simple example, let us consider a pendulum moving in a plane. This system is shown in figure 2.2. Using Newton’s mechanics, we say that the ball of mass $m$ is kept on the circle by the tension in the suspension rod. This tension is directed along the rod, and it precisely compensates the component of the gravitational force along the same line. The component of the gravitational force tangential to the circle of motion determines the motion. The motion is given by $r(t) = l\varphi(t)$, where $\varphi$ is the angle shown in the figure. So $\ddot{r}(t) = l\ddot{\varphi}(t)$, and the equation of motion is

$$l\ddot{\varphi}(t) = -g \sin\varphi(t). \tag{2.15}$$

Figure 2.3: Small block on an inclined plane.

Using d’Alembert’s principle simplifies the first part of this analysis. We can simply say that the motion is determined by the component of the applied force (i.e. gravity) lying in the subspace of the motion (i.e. the circle) and this leads to the same equation of motion. Although in this simple case the difference between the approaches with and without d’Alembert’s principle is minute, in more complicated systems, the possibility to avoid analysing the forces of constraint is a real gain.

2.3.2 The block on the inclined plane

Now we consider a more complicated example: that of a block sliding on a wedge. We shall denote the block by SB (small block) and the wedge by IP (inclined plane). The setup is shown in figure 2.3. It consists of the wedge (inclined plane) of mass $M$ which can move freely (i.e. without friction) over a horizontal table, and the small block of mass $m$, which can slide over the inclined plane (also frictionless). The aim is to find expressions for the accelerations of IP and SB. The cartesian unit vectors are $\hat{\mathbf{x}}$ and $\hat{\mathbf{y}}$, the unit vector along the inclined plane, pointing to the right, is $\hat{\mathbf{d}}$, and the upward normal unit vector to the plane is called $\hat{\mathbf{n}}$. Let us solve this problem using the standard approach. The acceleration of IP is called $\mathbf{A}$, and that of the small block is $\mathbf{A} + \mathbf{a}$, i.e., $\mathbf{a}$ is the acceleration of the small block with respect to the inclined plane.

Newton’s second law for the two bodies reads:

$$M\mathbf{A} = -Mg\hat{\mathbf{y}} + F_2\hat{\mathbf{y}} - F_1\hat{\mathbf{n}}, \tag{2.16a}$$

$$m(\mathbf{A} + \mathbf{a}) = -mg\hat{\mathbf{y}} + F_1\hat{\mathbf{n}}. \tag{2.16b}$$

As we know that the motion of IP is horizontal, we know that all $y$ components of the forces acting on it will cancel, and $\mathbf{A}$ is directed along $\hat{\mathbf{x}}$. Similarly, we know that $\mathbf{a}$ is zero along $\hat{\mathbf{n}}$. This allows us to simplify the equations:

$$MA = -F_1 \sin\alpha; \tag{2.17a}$$

$$m(A\hat{\mathbf{x}} + a_\parallel\hat{\mathbf{d}}) = -mg\hat{\mathbf{y}} + F_1\hat{\mathbf{n}}, \tag{2.17b}$$

where $a_\parallel$ is the component of $\mathbf{a}$ directed along $\hat{\mathbf{d}}$. The first of these equations is a scalar equation. The second equation represents in fact two equations, one for the $x$ and one for the $y$ component. We have three unknowns: $A$, $a_\parallel$ and $F_1$. Translating $\hat{\mathbf{d}}$ and $\hat{\mathbf{n}}$ into their $x$- and $y$-components is straightforward, and (2.17b) becomes:

$$m(A + a_\parallel \cos\alpha) = F_1 \sin\alpha, \tag{2.18a}$$

$$-m a_\parallel \sin\alpha = F_1 \cos\alpha - mg. \tag{2.18b}$$

Now we can solve for the accelerations by eliminating $F_1$ from our equations, and we find:

$$a_\parallel = g\,\frac{(M + m)\sin\alpha}{M + m\sin^2\alpha}; \tag{2.19a}$$

$$A = -g\,\frac{m \sin\alpha \cos\alpha}{M + m\sin^2\alpha}. \tag{2.19b}$$

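The solution of this problem contains one nontrivial step: the fact that we have split the acceleration of SB into the acceleration of IP plus the acceleration of the SB with respect to the IP has enabled us to remove the latter’s component along $\hat{\mathbf{n}}$. This is not so easy a step when a different representation is used (e.g. when the acceleration is not split into these parts).

Now we turn to the solution using d’Alembert’s principle:

$$\dot{\mathbf{p}}_{SB} \cdot \delta\mathbf{r}_{SB} + \dot{\mathbf{p}}_{IP} \cdot \delta\mathbf{r}_{IP} = \mathbf{F}^A_{SB} \cdot \delta\mathbf{r}_{SB} + \mathbf{F}^A_{IP} \cdot \delta\mathbf{r}_{IP}. \tag{2.20}$$

We identify two natural coordinates: the coordinate $X$ of the IP along the horizontal direction, and the distance $d$ from the top of the IP to the SB. The total virtual work done as a result of displacements $\delta X$ and $\delta d$ is the sum of the work done by both bodies:

$$\delta\mathbf{r}_{SB} = \delta d\,\hat{\mathbf{d}} + \delta X\,\hat{\mathbf{x}} \quad\text{and}\quad \delta\mathbf{r}_{IP} = \delta X\,\hat{\mathbf{x}}. \tag{2.21}$$

The applied forces are the gravity forces – we do not care about constraint forces any longer – and we find

$$\mathbf{F}^A_{IP} \cdot \delta\mathbf{r}_{IP} = 0, \tag{2.22}$$

as the displacement is perpendicular to the applied (gravity) force. Furthermore

$$\mathbf{F}^A_{SB} \cdot \delta\mathbf{r}_{SB} = mg \sin\alpha\,\delta d. \tag{2.23}$$

On the other hand:

$$\mathbf{p}_{SB} = m(\dot{X}\hat{\mathbf{x}} + \dot{d}\,\hat{\mathbf{d}}) \tag{2.24}$$

and

$$\mathbf{p}_{IP} = M\dot{X}\hat{\mathbf{x}}, \tag{2.25}$$

so that $\dot{\mathbf{p}}_{IP} = MA\hat{\mathbf{x}}$ and $\dot{\mathbf{p}}_{SB} = m(A\hat{\mathbf{x}} + a_\parallel\hat{\mathbf{d}})$. Taking time derivatives of (2.24) and (2.25) and using d’Alembert’s equations (2.20) for this problem, together with (2.22) and (2.23), we obtain

$$mA\,\delta X + m a_\parallel\,\delta d + mA \cos\alpha\,\delta d + m a_\parallel \cos\alpha\,\delta X + MA\,\delta X = mg \sin\alpha\,\delta d. \tag{2.26}$$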

Figure 2.4: Bead on a rotating wire.

As this equation should hold for any pair of virtual displacements $\delta X$ and $\delta d$, the coefficients of both $\delta X$ and $\delta d$ should vanish simultaneously, giving the equations:

$$(m + M)A + m a_\parallel \cos\alpha = 0, \tag{2.27a}$$

$$m(a_\parallel + A \cos\alpha) = mg \sin\alpha. \tag{2.27b}$$

Not surprisingly, these equations lead to the same result (2.19) as obtained before. Although the second approach does not seem simpler, it is safer since the constraint forces do not have to be taken into account explicitly. This manifests itself explicitly in the fact that we do not have to eliminate the constraint force $F_1$ as in the direct approach.
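As a quick numerical cross-check (a sketch added for illustration, not part of the original derivation), the pair of equations (2.27) can be solved as a 2×2 linear system and compared with the closed forms (2.19). The function names and the sample values of the masses and the angle are assumptions made for this example only.

```python
import math

def solve_dalembert(m, M, alpha, g=9.81):
    """Solve (2.27a,b) for A and a_par by Cramer's rule."""
    c = math.cos(alpha)
    # (m + M) A + m c a_par = 0
    # m c A     + m   a_par = m g sin(alpha)
    a11, a12, b1 = m + M, m * c, 0.0
    a21, a22, b2 = m * c, m, m * g * math.sin(alpha)
    det = a11 * a22 - a12 * a21
    A = (b1 * a22 - a12 * b2) / det
    a_par = (a11 * b2 - a21 * b1) / det
    return A, a_par

def closed_form(m, M, alpha, g=9.81):
    """Closed-form accelerations (2.19b) and (2.19a), in that order."""
    s, c = math.sin(alpha), math.cos(alpha)
    denom = M + m * s * s
    return -g * m * s * c / denom, g * (M + m) * s / denom

if __name__ == "__main__":
    # Hypothetical sample values: m = 1 kg, M = 3 kg, alpha = 0.5 rad.
    print(solve_dalembert(1.0, 3.0, 0.5))
    print(closed_form(1.0, 3.0, 0.5))
```

Both calls should print the same pair of numbers up to rounding, with $A$ negative (the wedge recoils) and $a_\parallel$ positive.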

2.3.3 Heavy bead on a rotating wire

In this section, we consider a system with a time-dependent constraint. A bead slides without friction along a straight wire which rotates about a vertical axis, at an angle $\alpha$ to that axis (see figure 2.4). The position of the bead along the wire is denoted by $q$, which is the distance of the bead from the origin. The momentum of the bead is given by

$$\mathbf{p} = m q \omega \sin\alpha\,\hat{\mathbf{t}} + m\dot{q}\,\hat{\mathbf{q}}. \tag{2.28}$$

It should however be noted that the unit vectors $\hat{\mathbf{t}}$ and $\hat{\mathbf{q}}$ rotate themselves, and hence their time derivatives occur in $\dot{\mathbf{p}}$. The latter occurs in d’Alembert’s equation, in which gravity enters as the applied force $\mathbf{F}^A$. Instead of working out $\dot{\mathbf{p}}$ explicitly, we can use the following trick:

$$\dot{\mathbf{p}} \cdot \delta\mathbf{r} = \frac{d}{dt}(\mathbf{p} \cdot \delta\mathbf{r}) - \mathbf{p} \cdot \delta\dot{\mathbf{r}}. \tag{2.29}$$

At first sight, you might think that the second term on the right hand side is zero as $\delta\mathbf{r} = \delta q\,\hat{\mathbf{q}}$ and $\delta q$ does not involve any time dependence: virtual displacements are always assumed to be instantaneous and do not involve any time dependence. However, even with a time-independent $\delta q$, the displacement $\delta\mathbf{r}$ is time-dependent as the displacement is carried out in a rotating frame. This can also be seen from the fact that $\hat{\mathbf{q}}$ is time-dependent. In fact, in our system the displacement along the wire will cause a change in the rotational velocity, and it is this velocity change which gives $\delta\dot{\mathbf{r}}$. If the bead is moved upward, for example, the bead will move along a circle which has a larger radius, but still at the same angular velocity, so that the orbital speed increases. The orbital speed is given as $q\omega\sin\alpha$, so that we have:

$$\delta\dot{\mathbf{r}} = \omega \sin\alpha\,\delta q\,\hat{\mathbf{t}}. \tag{2.30}$$

As $\delta\mathbf{r}$ is given by $\delta q\,\hat{\mathbf{q}}$, we find

$$\dot{\mathbf{p}} \cdot \delta\mathbf{r} = m\ddot{q}\,\delta q - m\omega^2 \sin^2\alpha\; q\,\delta q = \mathbf{F}^A \cdot \delta\mathbf{r} = -mg \cos\alpha\,\delta q \tag{2.31}$$

and we find the equation of motion:

$$\ddot{q} - \omega^2 \sin^2\alpha\; q = -g \cos\alpha. \tag{2.32}$$

The solution to this equation can be found straightforwardly:

$$q(t) = q_0 + A e^{\Omega t} + B e^{-\Omega t} \tag{2.33}$$

with $q_0 = g\cot\alpha/(\omega^2 \sin\alpha)$, $A$ and $B$ arbitrary constants and $\Omega = \omega\sin\alpha$. Later we shall encounter more powerful techniques which enable us to solve such a problem more easily.
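The claim that (2.33) solves (2.32) can be checked numerically. The sketch below is an illustration under assumed sample parameters; the helper names `bead_position` and `residual` are hypothetical. It fixes $A$ and $B$ from initial conditions and evaluates the residual of the equation of motion with a central finite difference.

```python
import math

def bead_position(t, q_init, v_init, omega, alpha, g=9.81):
    """q(t) from (2.33), with A and B fixed by q(0) = q_init and q'(0) = v_init."""
    Om = omega * math.sin(alpha)
    q0 = g * math.cos(alpha) / Om**2   # equals g cot(alpha) / (omega^2 sin(alpha))
    A = 0.5 * ((q_init - q0) + v_init / Om)
    B = 0.5 * ((q_init - q0) - v_init / Om)
    return q0 + A * math.exp(Om * t) + B * math.exp(-Om * t)

def residual(t, h=1e-4, **kw):
    """Finite-difference residual of (2.32): q'' - Omega^2 q + g cos(alpha)."""
    q = lambda s: bead_position(s, **kw)
    qdd = (q(t + h) - 2 * q(t) + q(t - h)) / h**2
    Om = kw["omega"] * math.sin(kw["alpha"])
    return qdd - Om**2 * q(t) + kw.get("g", 9.81) * math.cos(kw["alpha"])

if __name__ == "__main__":
    # Assumed sample values, chosen only for illustration.
    params = dict(q_init=1.0, v_init=0.0, omega=2.0, alpha=math.pi / 3)
    print(residual(0.3, **params))   # should be close to zero
```

Starting the bead exactly at $q_0$ with zero velocity gives $A = B = 0$, so the bead stays at $q_0$; since the general solution contains a growing exponential, this equilibrium is unstable.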

2.4 d’Alembert’s principle in generalised coordinates

In the previous section we have encountered a few examples of systems subject to constraints, and analysed them using d’Alembert’s principle. In this section we shall do the same for an unspecified system and derive the equations of motion for a general constrained system using d’Alembert’s principle.

We start from d’Alembert’s equation for $N$ objects:

$$\sum_{i=1}^{N} \dot{\mathbf{p}}_i \cdot \delta\mathbf{r}_i = \sum_{i=1}^{N} \mathbf{F}^A_i \cdot \delta\mathbf{r}_i. \tag{2.34}$$

If we write

$$\delta\mathbf{r}_i = \sum_{j=1}^{3N-K} \frac{\partial \mathbf{r}_i}{\partial q_j}\,\delta q_j, \tag{2.35}$$

and realise that the $q_j$ can be varied independently, we see that we must have

$$\sum_{i=1}^{N} \dot{\mathbf{p}}_i \cdot \frac{\partial \mathbf{r}_i}{\partial q_j} = \sum_{i=1}^{N} \mathbf{F}^A_i \cdot \frac{\partial \mathbf{r}_i}{\partial q_j}. \tag{2.36}$$

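In order to reformulate this equation we use a trick similar to the one we applied already to the bead sliding along the wire:

$$\sum_{i=1}^{N} \dot{\mathbf{p}}_i \cdot \frac{\partial \mathbf{r}_i}{\partial q_j} = \frac{d}{dt}\left(\sum_{i=1}^{N} \mathbf{p}_i \cdot \frac{\partial \mathbf{r}_i}{\partial q_j}\right) - \sum_{i=1}^{N} \mathbf{p}_i \cdot \frac{d}{dt}\frac{\partial \mathbf{r}_i}{\partial q_j}. \tag{2.37}$$

We note furthermore that in the second term, the time derivative can be written as

$$\frac{d}{dt}\left(\frac{\partial \mathbf{r}_i}{\partial q_j}\right) = \frac{\partial \dot{\mathbf{r}}_i}{\partial q_j}. \tag{2.38}$$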


In section 1.2 we have seen that the work done equals the change in kinetic energy. This suggests that the kinetic energy might be a convenient device for expressing d’Alembert’s equation in generalised coordinates. To see that this is indeed the case, we first calculate its derivative with respect to $q_j$:

$$\frac{\partial T}{\partial q_j} = \sum_{i=1}^{N} m_i \dot{\mathbf{r}}_i \cdot \frac{\partial \dot{\mathbf{r}}_i}{\partial q_j}. \tag{2.39}$$

Similarly:

$$\frac{\partial T}{\partial \dot{q}_j} = \sum_{i=1}^{N} m_i \dot{\mathbf{r}}_i \cdot \frac{\partial \dot{\mathbf{r}}_i}{\partial \dot{q}_j} = \sum_{i=1}^{N} \mathbf{p}_i \cdot \frac{\partial \mathbf{r}_i}{\partial q_j}, \tag{2.40}$$

where we have used (2.7). We see that the left hand side of d’Alembert’s equation leads to

$$\frac{d}{dt}\left(\frac{\partial T}{\partial \dot{q}_j}\right) - \frac{\partial T}{\partial q_j}. \tag{2.41}$$

Defining

$$\sum_{i=1}^{N} \mathbf{F}^A_i \cdot \frac{\partial \mathbf{r}_i}{\partial q_j} = F_j, \tag{2.42}$$

where $F_j$ is the generalised force, we have the following

Formulation for d’Alembert’s principle in generalised coordinates:

$$\frac{d}{dt}\left(\frac{\partial T}{\partial \dot{q}_j}\right) - \frac{\partial T}{\partial q_j} = F_j. \tag{2.43}$$

There is no sum over $j$ in this equation because the variations $\delta q_j$ are arbitrary and independent. It is then possible to obtain the form (2.43) from d’Alembert’s principle by taking only one particular $\delta q_j$ to be nonzero.
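To make the generalised force (2.42) concrete, the following sketch evaluates it for the planar pendulum of section 2.3.1, using $\mathbf{r}(\varphi) = (l\sin\varphi, -l\cos\varphi)$ and gravity as the only applied force. The function names and the finite-difference step are assumptions made for this illustration. With $T = \frac{1}{2} m l^2 \dot\varphi^2$, Eq. (2.43) then reads $m l^2 \ddot\varphi = F_\varphi$, which reproduces (2.15) if $F_\varphi = -mgl\sin\varphi$.

```python
import math

def position(phi, l):
    """Cartesian position of the pendulum bob for angle phi."""
    return (l * math.sin(phi), -l * math.cos(phi))

def generalised_force(phi, m, l, g=9.81, h=1e-6):
    """F_phi = F^A . dr/dphi as in (2.42), with dr/dphi by central differences."""
    xp, yp = position(phi + h, l)
    xm, ym = position(phi - h, l)
    drdphi = ((xp - xm) / (2 * h), (yp - ym) / (2 * h))
    applied = (0.0, -m * g)   # gravity only; the rod tension is a constraint force
    return applied[0] * drdphi[0] + applied[1] * drdphi[1]

if __name__ == "__main__":
    # Assumed sample values: phi = 0.7 rad, m = 2 kg, l = 1.5 m.
    print(generalised_force(0.7, 2.0, 1.5))
    print(-2.0 * 9.81 * 1.5 * math.sin(0.7))
```

The constraint force never enters the calculation; only the applied force and the parametrisation $\mathbf{r}(\varphi)$ are needed.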

2.5 Conservative systems – the mechanical path

Consider now a particle which moves in a constrained subspace under the influence of a potential. As an example you can imagine a non-flat surface on which a ball is moving from $\mathbf{r}_1$ to $\mathbf{r}_2$. If the ball is not forced to obey the laws of mechanics, it can move from $\mathbf{r}_1$ at time $t_1$ to $\mathbf{r}_2$ at time $t_2$ along many different paths. Instead of approaching the problem of finding the motion of the ball from a differential point of view, where we update the position and the velocity of a particle at each infinitesimal time step, we consider the path allowed for by the laws of mechanics¹ as a special one among all the available paths from $\mathbf{r}_1$ at $t_1$ to $\mathbf{r}_2$ at $t_2$.

We thus try to find a condition on the path as a whole rather than for each of its infinitesimal segments. To this end, we start from d’Alembert’s principle, and apply it to two paths, $\mathbf{r}_a(t)$ and $\mathbf{r}_b(t)$, which are close together for all times. The difference between the two paths at some time $t$ between $t_1$ and $t_2$ is $\delta\mathbf{r}(t) = \mathbf{r}_b(t) - \mathbf{r}_a(t)$, and we write down d’Alembert’s principle at time $t$ using this $\delta\mathbf{r}(t)$:

$$m\ddot{\mathbf{r}}(t) \cdot \delta\mathbf{r}(t) = \mathbf{F} \cdot \delta\mathbf{r}(t), \tag{2.44}$$

where it is understood that $\mathbf{F}$ is the applied force only, as $\delta\mathbf{r}$ lies in the constrained subspace.² This equation holds for every $t$ between $t_1$ and $t_2$, and we can formulate a global condition on the path by integrating over time from $t_1$ to $t_2$:

$$\int_{t_1}^{t_2} m\ddot{\mathbf{r}}(t) \cdot \delta\mathbf{r}(t)\,dt = \int_{t_1}^{t_2} \mathbf{F} \cdot \delta\mathbf{r}(t)\,dt. \tag{2.45}$$

¹ This path is not always, but nearly always, unique.

The analysis which follows resembles that of the previous chapter when we derived the conservation property of the energy. Indeed, the right hand side looks like an expression for the work, but it should be kept in mind that $\delta\mathbf{r}$ is not a real displacement of the particle, but a difference between two possible paths.

Via partial integration, and using the fact that the begin and end point of the path are fixed, we can transform the left hand side of (2.45):

$$\int_{t_1}^{t_2} m\ddot{\mathbf{r}}(t) \cdot \delta\mathbf{r}(t)\,dt = -\int_{t_1}^{t_2} \frac{m}{2}\frac{\partial \dot{\mathbf{r}}^2}{\partial \dot{\mathbf{r}}}\,\delta\dot{\mathbf{r}}\,dt = -\int_{t_1}^{t_2} \frac{\partial T}{\partial \dot{\mathbf{r}}}\,\delta\dot{\mathbf{r}}\,dt \approx -\int_{t_1}^{t_2} \left[T(\dot{\mathbf{r}}_b) - T(\dot{\mathbf{r}}_a)\right] dt, \tag{2.46}$$

where the approximation holds to first order in $\delta\mathbf{r}$. The resulting expression is the difference in kinetic energy between the two paths, integrated over time.

If we are dealing with a conservative force field, the right hand side of (2.45) can also be transformed to a difference between two global quantities:

$$\int_{t_1}^{t_2} \mathbf{F} \cdot \delta\mathbf{r}(t)\,dt = -\int_{t_1}^{t_2} \nabla V \cdot \delta\mathbf{r}(t)\,dt \approx -\int_{t_1}^{t_2} \left[V(\mathbf{r}_b) - V(\mathbf{r}_a)\right] dt. \tag{2.47}$$

Combining (2.46) and (2.47) we obtain:

$$\delta \int_{t_1}^{t_2} (T - V)\,dt = 0, \tag{2.48}$$

in other words, d’Alembert’s principle for a conservative force can be transformed to the condition that the linear variation (2.48) vanishes. This global condition distinguishes the mechanical path from all other ones.

The quantity $T - V$ is called the Lagrangian, $L$. The integral over time of this quantity, $\int_{t_1}^{t_2} L\,dt$, is called the action, denoted by $S$:

$$S = \int_{t_1}^{t_2} dt\,(T - V) = \int_{t_1}^{t_2} dt\,L. \tag{2.49}$$

We have derived a new principle:

The mechanical path of a particle moving in a conservative potential field from a position $\mathbf{r}_1$ at time $t_1$ to a position $\mathbf{r}_2$ at $t_2$ is a stationary solution of the action, i.e. the linear variation of the action with respect to an allowed variation of the path around the mechanical path, vanishes.

This principle is called Hamilton’s principle. Note that the variations of the path are restricted to lie within the constrained subspace. The advantage of this new formulation of mechanics with conservative force fields over the Newtonian formulation is that it holds for any system subject to constraints, and that it holds independently of the coordinates which are chosen to represent the motion. This is clear from the fact that we search for the minimum of the action within the subspace allowed for by the constraint, and this subspace is properly described by the generalised coordinates $q_j$. When solving the motion of some particular mechanical system our task is therefore to properly express $T$ and $V$ in terms of these generalised coordinates, plug the Lagrangian $L = T - V$ into the action, and minimise the latter with respect to the generalised coordinates (which are functions of time). Although this might seem a complicated way of solving a simple problem, it should be realised that the transformation of forces and accelerations to generalised coordinates is usually more complicated than writing the kinetic energy and the potential in terms of these new coordinates. Furthermore we shall see below that the problem of finding the stationary solution for a given action leads straightforwardly to a second-order differential equation, which is the correct form of the Newtonian equation of motion in terms of the chosen generalised coordinates.

² We suppose that the constrained subspace is smooth and that $\mathbf{r}_a(t)$ is close to $\mathbf{r}_b(t)$ for all $t$.
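Hamilton’s principle can be illustrated numerically. The sketch below is an assumption-laden illustration, not part of the original notes: it discretises the action for a particle in uniform gravity, $L = \frac{1}{2} m \dot{y}^2 - mgy$, and compares the mechanical path $y(t) = \frac{1}{2} g t (t_{\rm end} - t)$ with neighbouring paths that share the same end points. The perturbation shape $\varepsilon \sin(\pi t / t_{\rm end})$, the grid size and the function names are choices made for this example.

```python
import math

def action(path, dt, m=1.0, g=9.81):
    """Discretised action: sum of (T - V) * dt over the segments of the path."""
    S = 0.0
    for y0, y1 in zip(path[:-1], path[1:]):
        v = (y1 - y0) / dt          # velocity on this segment
        y_mid = 0.5 * (y0 + y1)     # height at the segment midpoint
        S += (0.5 * m * v * v - m * g * y_mid) * dt
    return S

def varied_path(eps, n=1000, t_end=1.0, g=9.81):
    """Mechanical path g t (t_end - t) / 2 plus eps * sin(pi t / t_end)."""
    dt = t_end / n
    ts = [k * dt for k in range(n + 1)]
    path = [0.5 * g * t * (t_end - t) + eps * math.sin(math.pi * t / t_end)
            for t in ts]
    return path, dt

if __name__ == "__main__":
    for eps in (-0.1, 0.0, 0.1):
        path, dt = varied_path(eps)
        print(eps, action(path, dt))
```

The action changes by the same amount for $+\varepsilon$ and $-\varepsilon$, which shows that the term linear in the variation vanishes around the mechanical path. In this particular example the change is positive, so the stationary point happens to be a minimum.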

As an example, consider the pendulum. The position of the mass $m$ is given by the 2 coordinates $x$ and $y$ (we neglect the third coordinate $z$). The constraint obeyed by these coordinates is $x^2 + y^2 = l^2$. This constraint allows us to use only a single generalised coordinate $\varphi$: $x = l\sin\varphi$ and $y = -l\cos\varphi$. The velocity is given by $v_\varphi = l\dot{\varphi}$. This example shows that the generalised coordinate $q = \varphi$ does not necessarily have to have the dimension of length, and likewise $\dot{q} = \dot{\varphi}$ does not necessarily have the dimension of velocity. The kinetic energy is now given as $T = m l^2 \dot{\varphi}^2/2$, and the potential energy by $V = -mgl\cos\varphi$. The Lagrangian of the pendulum is therefore

$$L = T - V = m\left(\frac{l^2\dot{\varphi}^2}{2} + gl\cos\varphi\right). \tag{2.50}$$

We now turn to the problem of determining the stationary solution for an action with such a Lagrangian.

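The Lagrangian can have many different forms, depending on the particular set of generalised coordinates chosen; therefore we shall now work out a general prescription for determining the stationary solution of the action without making any assumptions concerning the form of the Lagrangian, except that it may depend on the $q_j$ and on their time derivatives $\dot{q}_j$:

$$S[\mathbf{q}] = \int_{t_1}^{t_2} L(\mathbf{q}, \dot{\mathbf{q}}, t)\,dt. \tag{2.51}$$

Here $\mathbf{q}(t)$ is any vector-valued function, $\mathbf{q}(t) = (q_1(t), \ldots, q_N(t))$. We now consider an arbitrary, but small variation $\delta\mathbf{q}(t)$ of the path $\mathbf{q}(t)$, and calculate the change in $S$ as a result of this variation:

$$\delta S[\mathbf{q}] = S[\mathbf{q} + \delta\mathbf{q}] - S[\mathbf{q}] = \int_{t_1}^{t_2} L(\mathbf{q} + \delta\mathbf{q}, \dot{\mathbf{q}} + \delta\dot{\mathbf{q}}, t)\,dt - \int_{t_1}^{t_2} L(\mathbf{q}, \dot{\mathbf{q}}, t)\,dt \approx \int_{t_1}^{t_2} \left[\frac{\partial L(\mathbf{q}, \dot{\mathbf{q}}, t)}{\partial \mathbf{q}}\,\delta\mathbf{q} + \frac{\partial L(\mathbf{q}, \dot{\mathbf{q}}, t)}{\partial \dot{\mathbf{q}}}\,\delta\dot{\mathbf{q}}\right] dt. \tag{2.52}$$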

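Note that both $\mathbf{q}$ and $\dot{\mathbf{q}}$ depend on time. Note further that $\partial/\partial\mathbf{q}$ is a vector – the derivative must be interpreted as a gradient with respect to all the components of $\mathbf{q}$. The use of $\partial$ and not $d$ in the derivatives indicates that when calculating the gradient with respect to $\mathbf{q}$, $\dot{\mathbf{q}}$ is considered as a constant, and vice-versa.

Of course, $\delta\mathbf{q}$ and $\delta\dot{\mathbf{q}}$ are not independent: if we know $\mathbf{q}(t)$ for all $t$ in the interval under consideration, we also know the time derivative $\dot{\mathbf{q}}$. We can remove $\delta\dot{\mathbf{q}}$ by partial integration:

$$\int_{t_1}^{t_2} \left[\frac{\partial L(\mathbf{q}, \dot{\mathbf{q}}, t)}{\partial \mathbf{q}}\,\delta\mathbf{q} + \frac{\partial L(\mathbf{q}, \dot{\mathbf{q}}, t)}{\partial \dot{\mathbf{q}}}\,\delta\dot{\mathbf{q}}\right] dt = \int_{t_1}^{t_2} \left(\frac{\partial L}{\partial \mathbf{q}} - \frac{d}{dt}\frac{\partial L}{\partial \dot{\mathbf{q}}}\right)\delta\mathbf{q}\,dt. \tag{2.53}$$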


Because $\delta\mathbf{q}$ is small but arbitrary, this variation can only vanish when the term in brackets on the right hand side vanishes. Consider for example a $\delta\mathbf{q}$ which is zero except for a very small range of $t$-values around some $t_0$ in the interval between $t_1$ and $t_2$. Then the term in brackets must vanish in that small range. We can do this for any small interval on the time axis, and we conclude that the term in brackets vanishes for all $t$ in the integration interval. So our conclusion reads

The action $S[\mathbf{q}]$ is stationary, that is, its variation with respect to $\mathbf{q}$ vanishes to first order, if the following equations are satisfied:

$$\frac{\partial L}{\partial q_j} = \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_j}, \quad\text{for } j = 1, \ldots, N. \tag{2.54}$$

The equations (2.54) are called Euler equations. In the case where $L$ is the Lagrangian of classical mechanics, $L = T - V$, the equations are called Euler–Lagrange equations (note that in the above derivation, no assumption has been made with respect to the form of $L$ nor what it means – the only assumption is that $L$ depends at most on $\mathbf{q}$, $\dot{\mathbf{q}}$ and $t$). The Euler equations have many applications outside mechanics.

Often the following notation is used:

$$\delta L = \sum_{j=1}^{N} \left(\frac{\partial L}{\partial q_j} - \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_j}\right)\delta q_j \tag{2.55}$$

and

$$\frac{\delta L}{\delta \mathbf{q}} = \left(\frac{\partial L}{\partial \mathbf{q}} - \frac{d}{dt}\frac{\partial L}{\partial \dot{\mathbf{q}}}\right), \tag{2.56}$$

or, written in another way:

$$\frac{\delta L}{\delta q_j} = \left(\frac{\partial L}{\partial q_j} - \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_j}\right). \tag{2.57}$$

Note that (2.56) is an equality between ($N$-dimensional) vector quantities.

The analysis given here can be summarised by a procedure for solving a mechanical problem in classical mechanics with conservative forces:

• Find a suitable set of coordinates which parametrises the subspace of the motion allowed for by the constraints.

• Express the kinetic energyT and the potentialV in those coordinates.

• Write down the Lagrange equations (2.54) for the LagrangianL = T−V and solve them.

Turning again to our simple example of a pendulum, we use the Lagrangian found in (2.50) and write down the Euler–Lagrange equation for this:

$$\frac{\partial L}{\partial \varphi} = -mgl\sin\varphi = \frac{d}{dt}\frac{\partial L}{\partial \dot{\varphi}} = m l^2 \ddot{\varphi}. \tag{2.58}$$

The solution to this equation can be found through numerical integration. In the next section we shall encounter some more complicated examples which show the advantages of the new approach more clearly.
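One possible way to carry out such a numerical integration is sketched below. It rewrites (2.58) as two first-order equations and advances them with the classical fourth-order Runge–Kutta scheme; the choice of integrator, the step size, the default parameter values and the function names are all assumptions of this example. The energy $E = T + V$ is used as a consistency check, since it should stay constant along the motion.

```python
import math

def pendulum_rhs(phi, omega, g=9.81, l=1.0):
    """First-order form of (2.58): phi' = omega, omega' = -(g/l) sin(phi)."""
    return omega, -(g / l) * math.sin(phi)

def rk4_step(phi, omega, dt, g=9.81, l=1.0):
    """Advance (phi, omega) by one classical Runge-Kutta step of size dt."""
    k1 = pendulum_rhs(phi, omega, g, l)
    k2 = pendulum_rhs(phi + 0.5 * dt * k1[0], omega + 0.5 * dt * k1[1], g, l)
    k3 = pendulum_rhs(phi + 0.5 * dt * k2[0], omega + 0.5 * dt * k2[1], g, l)
    k4 = pendulum_rhs(phi + dt * k3[0], omega + dt * k3[1], g, l)
    phi_new = phi + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
    omega_new = omega + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return phi_new, omega_new

def energy(phi, omega, m=1.0, g=9.81, l=1.0):
    """E = T + V with T = m l^2 omega^2 / 2 and V = -m g l cos(phi)."""
    return 0.5 * m * l * l * omega * omega - m * g * l * math.cos(phi)

def integrate(phi0, omega0, dt=1e-3, steps=5000):
    """Integrate from (phi0, omega0) over steps * dt seconds."""
    phi, omega = phi0, omega0
    for _ in range(steps):
        phi, omega = rk4_step(phi, omega, dt)
    return phi, omega

if __name__ == "__main__":
    phi, omega = integrate(1.0, 0.0)
    print(phi, omega, energy(phi, omega) - energy(1.0, 0.0))
```

Here `omega` stands for the angular velocity $\dot\varphi$ of the pendulum, not for a rotation frequency as in section 2.3.3.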


2.6 Examples

2.6.1 A system of pulleys

We consider a system of massless pulleys as in the figure below.

[Figure: pulley system with masses $m_a$, $m_b$ and $m_c$ and string segments $l_1$, $l_2$, $l_3$ and $l_4$.]

The string is also massless and furthermore inextensible. It is quite complicated to find out what the forces on the system are when taking all the forces on the pulleys and on the wire into account. However, it turns out that using Hamilton’s principle makes it an easy problem. The total string length is $l = l_1 + l_2 + l_3 + l_4$ and is fixed. Of course $l_2 = l_3$. Therefore, we can take $l_1$ and $l_4$ as generalised coordinates, and we have:

$$l_2 = l_3 = \frac{1}{2}(l - l_1 - l_4). \tag{2.59}$$

The height of the central pulley is given by $l_2$ (or $l_3$), and the total potential energy is therefore given as:

$$V = -g\left[m_a l_1 + \frac{m_b}{2}(l - l_1 - l_4) + m_c l_4\right]. \tag{2.60}$$

The speed of the left mass $m_a$ is given by $\dot{l}_1$, and that of the right one, $m_c$, by $\dot{l}_4$. Using (2.59) we find that the speed of the central pulley is given by $\frac{1}{2}(-\dot{l}_1 - \dot{l}_4)$. The Lagrangian is therefore given as

$$L = \frac{1}{2} m_a \dot{l}_1^2 + \frac{1}{2} m_c \dot{l}_4^2 + \frac{1}{8} m_b (\dot{l}_1^2 + \dot{l}_4^2 + 2\dot{l}_1\dot{l}_4) + g\left[m_a l_1 + \frac{m_b}{2}(l - l_1 - l_4) + m_c l_4\right]. \tag{2.61}$$

The Euler-Lagrange equations can be derived straightforwardly:

$$\left(m_a + \frac{1}{4} m_b\right)\ddot{l}_1 + \frac{1}{4} m_b \ddot{l}_4 = \left(m_a - \frac{1}{2} m_b\right) g; \tag{2.62a}$$

$$\left(m_c + \frac{1}{4} m_b\right)\ddot{l}_4 + \frac{1}{4} m_b \ddot{l}_1 = \left(m_c - \frac{1}{2} m_b\right) g. \tag{2.62b}$$


The two equations can be solved for $\ddot{l}_1$ and $\ddot{l}_4$ and the result is

$$\ddot{l}_1 = \frac{4 m_a m_c + m_a m_b - 3 m_c m_b}{m_c m_b + 4 m_a m_c + m_a m_b}\,g; \tag{2.63a}$$

$$\ddot{l}_4 = \frac{4 m_a m_c + m_c m_b - 3 m_a m_b}{m_a m_b + 4 m_a m_c + m_b m_c}\,g. \tag{2.63b}$$

To check whether the answer is reasonable we verify that a stationary motion (i.e. a motion with constant velocity) is possible if $m_b = 2m_a = 2m_c$. The solution is now trivial, since the right hand sides of (2.63) vanish as should indeed be the case. We see that the Lagrange equations provide a framework which enables us to find the equations of motion quite easily.
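The algebra leading from (2.62) to (2.63) can be verified with exact rational arithmetic. The sketch below is an illustration only; the function names and the sample masses are assumptions. It solves the linear system (2.62) by Cramer’s rule and compares the result with the closed forms (2.63), in units where $g = 1$.

```python
from fractions import Fraction

def accelerations(ma, mb, mc, g=1):
    """Solve (2.62a,b) exactly for the accelerations of l1 and l4."""
    ma, mb, mc, g = (Fraction(x) for x in (ma, mb, mc, g))
    a11, a12 = ma + mb / 4, mb / 4
    a21, a22 = mb / 4, mc + mb / 4
    b1, b2 = (ma - mb / 2) * g, (mc - mb / 2) * g
    det = a11 * a22 - a12 * a21
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det

def closed_form(ma, mb, mc, g=1):
    """Right-hand sides of (2.63a,b)."""
    ma, mb, mc, g = (Fraction(x) for x in (ma, mb, mc, g))
    denom = mc * mb + 4 * ma * mc + ma * mb
    return ((4 * ma * mc + ma * mb - 3 * mc * mb) * g / denom,
            (4 * ma * mc + mc * mb - 3 * ma * mb) * g / denom)

if __name__ == "__main__":
    print(accelerations(1, 2, 3), closed_form(1, 2, 3))
    print(accelerations(1, 2, 1))   # balanced case m_b = 2 m_a = 2 m_c
```

Using `Fraction` avoids rounding, so the two results can be compared with `==` rather than with a tolerance.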

2.6.2 Example: the spinning top

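Consider a top with cylindrical symmetry. The position of the top is defined by its two polar angles $\vartheta$ and $\varphi$, and a third angle, $\psi$, defines the rotation of the top around its symmetry axis. The angular velocity is given in terms of these three angles as:

$$\boldsymbol{\omega} = \dot{\varphi}\,\hat{\mathbf{z}} + \dot{\vartheta}\,\hat{\mathbf{e}} + \dot{\psi}\,\hat{\mathbf{d}}, \tag{2.64}$$

where $\hat{\mathbf{z}}$ is a unit vector along the $z$-axis; $\hat{\mathbf{e}}$ is a unit vector in the $xy$ plane which is perpendicular to the axis of the top, and $\hat{\mathbf{d}}$ is a unit vector along the axis of the top. The axis of the top is shown in the figure:

[Figure: the axis of the top, showing the angles $\vartheta$, $\varphi$ and $\psi$, the axes $x$, $y$ and $z$, and the unit vectors $\hat{\mathbf{e}}$, $\hat{\mathbf{f}}$ and $\hat{\mathbf{d}}$.]

From this figure, it is clear that

$$\hat{\mathbf{e}} = (-\sin\varphi, \cos\varphi, 0) \quad\text{and} \tag{2.65a}$$

$$\hat{\mathbf{d}} = (\cos\varphi \sin\vartheta, \sin\varphi \sin\vartheta, \cos\vartheta). \tag{2.65b}$$

And it follows that

$$\hat{\mathbf{f}} = \hat{\mathbf{e}} \times \hat{\mathbf{d}} = (\cos\varphi \cos\vartheta, \sin\varphi \cos\vartheta, -\sin\vartheta). \tag{2.66}$$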

The rotational kinetic energy of the top is given by

$$T = \frac{1}{2}\boldsymbol{\omega}^T I \boldsymbol{\omega} \tag{2.67}$$

(the superscript $T$ turns the column vector $\boldsymbol{\omega}$ into a row vector). It is always possible to find some axes with respect to which the moment of inertia tensor is diagonal, and as a result of the axial symmetry of the top one diagonal element, which we shall denote by $I_3$, corresponds to the symmetry axis $\hat{\mathbf{d}}$, and two other diagonal elements correspond to axes in the plane perpendicular to the body axis, such as $\hat{\mathbf{e}}$ and $\hat{\mathbf{f}}$ – we call these elements $I_1$.

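The kinetic energy is then given by

$$T = \frac{1}{2} I_1 (\boldsymbol{\omega} \cdot \hat{\mathbf{e}})^2 + \frac{1}{2} I_1 (\boldsymbol{\omega} \cdot \hat{\mathbf{f}})^2 + \frac{1}{2} I_3 (\boldsymbol{\omega} \cdot \hat{\mathbf{d}})^2 = \frac{1}{2} I_1 \dot{\varphi}^2 \sin^2\vartheta + \frac{I_1}{2}\dot{\vartheta}^2 + \frac{1}{2} I_3 (\dot{\psi} + \dot{\varphi}\cos\vartheta)^2. \tag{2.68}$$

The gravitational force results in a potential $V = MgR\cos\vartheta$, where $M$ is the top’s mass and $R$ the distance from the point where it rests on the ground to the centre of mass. The Lagrangian therefore reads:

$$L = \frac{1}{2} I_1 \dot{\varphi}^2 \sin^2\vartheta + \frac{I_1}{2}\dot{\vartheta}^2 + \frac{1}{2} I_3 (\dot{\psi} + \dot{\varphi}\cos\vartheta)^2 - MgR\cos\vartheta. \tag{2.69}$$

The Lagrange equations for $\vartheta$, $\varphi$ and $\psi$ are then given by:

$$I_1 \ddot{\vartheta} = I_1 \dot{\varphi}^2 \sin\vartheta \cos\vartheta - I_3 (\dot{\psi} + \dot{\varphi}\cos\vartheta)\dot{\varphi}\sin\vartheta + MgR\sin\vartheta; \tag{2.70a}$$

$$\frac{d}{dt}\left[I_1 \dot{\varphi}\sin^2\vartheta + I_3 (\dot{\psi} + \dot{\varphi}\cos\vartheta)\cos\vartheta\right] = 0; \tag{2.70b}$$

$$I_3 \frac{d}{dt}(\dot{\psi} + \dot{\varphi}\cos\vartheta) = 0. \tag{2.70c}$$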

We immediately see that $\dot{\psi} + \dot{\varphi}\cos\vartheta$ is a constant of the motion – we shall call this $\omega_3$:

$$\omega_3 = \dot{\psi} + \dot{\varphi}\cos\vartheta = \text{Constant}. \tag{2.71}$$

$\omega_3$ denotes the component of angular velocity along the spinning axis.

Let us search for solutions of constant precession: $\vartheta = $ constant, or $\dot{\vartheta} = 0$. We furthermore set $\dot{\varphi} = \Omega$. The first Lagrange equation (2.70a) then gives, after division by $\sin\vartheta$:

$$I_1 \Omega^2 \cos\vartheta - I_3 \omega_3 \Omega + MgR = 0. \tag{2.72}$$

If $\omega_3$ is large, we find the two solutions

$$\Omega = \frac{MgR}{I_3 \omega_3} \tag{2.73}$$

for which $\Omega$ is inversely proportional to $\omega_3$ and

$$\Omega = \frac{I_3 \omega_3}{I_1 \cos\vartheta} \tag{2.74}$$

i.e. $\Omega$ is proportional to $\omega_3$. The first solution corresponds to slow precession and fast spinning around the spinning axis; the second solution corresponds to rapid precession in which the gravitational force is negligible.

For general $\omega_3$, the quadratic equation (2.72) with $\dot{\vartheta} = 0$ has two real solutions for $\Omega$ if

$$I_3^2 \omega_3^2 > 4 I_1 \cos\vartheta\, MgR. \tag{2.75}$$

For smaller values of $\omega_3$, a wobbling motion sets in (“nutation”).
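The two precession rates can be computed directly from the quadratic (2.72) and compared with the large-$\omega_3$ approximations (2.73) and (2.74). The following is a minimal sketch under stated assumptions: the function name and the sample numbers are hypothetical, and the labels "slow" and "fast" assume $0 \le \vartheta < \pi/2$ and $\omega_3 > 0$.

```python
import math

def precession_rates(I1, I3, omega3, theta, MgR):
    """Real roots of I1 cos(theta) Omega^2 - I3 omega3 Omega + MgR = 0, Eq. (2.72).

    Returns (slow, fast), or None when condition (2.75) is violated.
    Assumes 0 <= theta < pi/2 and omega3 > 0.
    """
    a, b, c = I1 * math.cos(theta), -I3 * omega3, MgR
    disc = b * b - 4 * a * c          # positive exactly when (2.75) holds
    if disc < 0:
        return None                   # no steady precession at this angle
    # Numerically stable evaluation of both roots (avoids cancellation).
    qq = -0.5 * (b + math.copysign(math.sqrt(disc), b))
    fast, slow = qq / a, c / qq
    return slow, fast

if __name__ == "__main__":
    # Assumed sample values in arbitrary consistent units.
    I1, I3, omega3, theta, MgR = 1.0, 0.5, 200.0, 0.3, 2.0
    print(precession_rates(I1, I3, omega3, theta, MgR))
    print(MgR / (I3 * omega3), I3 * omega3 / (I1 * math.cos(theta)))
```

For a rapidly spinning top the exact roots and the approximations agree closely; lowering `omega3` until the function returns `None` locates the threshold in (2.75).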


2.7 Non-conservative forces – charged particle in an electromagnetic field

In this section we consider one particular type of force which is not conservative, but which can still be analysed fully within the Lagrangian approach. This is the very important example of a charged particle in an electromagnetic field.

Suppose we have a collection of $N$ particles which experience a non-conservative force which can be derived from a generalised potential $W(\mathbf{r}_i, \dot{\mathbf{r}}_i)$ in the following way:

$$\mathbf{F}_i = -\frac{\partial W}{\partial \mathbf{r}_i} + \frac{d}{dt}\frac{\partial W}{\partial \dot{\mathbf{r}}_i}. \tag{2.76}$$

Analogous to the previous section we can derive a variational condition, starting from d’Alembert’s principle:

$$\int_{t_1}^{t_2} m\ddot{\mathbf{r}}_i \cdot \delta\mathbf{r}_i\,dt = -\int_{t_1}^{t_2} \delta T\,dt = \int_{t_1}^{t_2} \left[-\frac{\partial W}{\partial \mathbf{r}_i} + \frac{d}{dt}\frac{\partial W}{\partial \dot{\mathbf{r}}_i}\right] \cdot \delta\mathbf{r}_i\,dt. \tag{2.77}$$

The left hand side has been transformed as in (2.46), and the procedure for the right hand side is similar with the extension that the second term of the integrand is subject to a partial integration, leading to

$$-\int_{t_1}^{t_2} \delta T\,dt = \int_{t_1}^{t_2} \left[-\frac{\partial W}{\partial \mathbf{r}_i} \cdot \delta\mathbf{r}_i - \frac{\partial W}{\partial \dot{\mathbf{r}}_i} \cdot \delta\dot{\mathbf{r}}_i\right] dt = -\int_{t_1}^{t_2} \delta W\,dt. \tag{2.78}$$

So we see that the variation of the action

$$S[\mathbf{q}] = \int_{t_1}^{t_2} [T - W]\,dt \tag{2.79}$$

vanishes. It can also be checked by working out the Euler-Lagrange equations, which for this action lead directly to the classical equation of motion $m\ddot{\mathbf{r}}_i = \mathbf{F}_i$.

2.7.1 Charged particle in an electromagnetic field

A point particle with charge $q$ moving in an electromagnetic field experiences a force

$$\mathbf{F} = q(\mathbf{E} + \mathbf{v} \times \mathbf{B}). \tag{2.80}$$

The charge $q$ of the particle should not be confused with the generalised coordinates $q_i$ introduced before. $\mathbf{E}$ is the electric field, $\mathbf{B}$ is the magnetic field. These fields are not independent, but they are related through the Maxwell equations. We use the following two Maxwell equations

$$\nabla \cdot \mathbf{B} = 0 \quad\text{and} \tag{2.81a}$$

$$\nabla \times \mathbf{E} + \frac{\partial \mathbf{B}}{\partial t} = 0. \tag{2.81b}$$

We know from vector calculus that a vector field whose divergence is zero can always be written as the curl of a vector function depending on space (and, in our case, time); applying this to (2.81a) we see that we can write $\mathbf{B}$ in the form $\mathbf{B} = \nabla \times \mathbf{A}$, where $\mathbf{A}$ is a vector function, called the vector potential, depending on space and time. Substituting this expression for $\mathbf{B}$ in Eq. (2.81b) leads to

$$\nabla \times \left(\mathbf{E} + \frac{\partial \mathbf{A}}{\partial t}\right) = 0. \tag{2.82}$$


Now we use another result from vector calculus, which says that any function whose curl is zero can be written as the gradient of a scalar function, which in this case we call the potential, $\phi(\mathbf{r}, t)$. This results in the following representations of the electromagnetic field:

$$\mathbf{E}(\mathbf{r}, t) = -\nabla\phi(\mathbf{r}, t) - \frac{\partial \mathbf{A}}{\partial t}(\mathbf{r}, t); \tag{2.83a}$$

$$\mathbf{B}(\mathbf{r}, t) = \nabla \times \mathbf{A}(\mathbf{r}, t). \tag{2.83b}$$

In fact, by using two Maxwell equations, we have reduced the set of 6 field values (3 for $\mathbf{E}$ and 3 for $\mathbf{B}$) to 4 (3 for $\mathbf{A}$ and 1 for $\phi$).

As the force is velocity-dependent, it is not conservative. We are after a function $W(\mathbf{r}, \dot{\mathbf{r}})$ which, when used in an action of the usual form, yields the correct equation of motion with the force (2.80). The potential which does the job is

$$W(\mathbf{r}, \dot{\mathbf{r}}) = q\phi(\mathbf{r}, t) - q\dot{\mathbf{r}} \cdot \mathbf{A}(\mathbf{r}, t) = q\phi - q(\dot{x}A_x + \dot{y}A_y + \dot{z}A_z). \tag{2.84}$$

Note that $A_x$ denotes the $x$-component and not the partial derivative with respect to $x$. The Lagrangian occurring in the action is therefore:

$$L = \frac{1}{2} m\dot{\mathbf{r}}^2 + q\dot{\mathbf{r}} \cdot \mathbf{A}(\mathbf{r}, t) - q\phi(\mathbf{r}, t). \tag{2.85}$$

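To see that this Lagrangian is indeed correct we work out the force component in the $x$-direction. First we calculate the derivative of the potential $W$ with respect to $x$:

$$-\frac{\partial W}{\partial x} = -q\frac{\partial \phi}{\partial x} + q\left(\dot{x}\frac{\partial A_x}{\partial x} + \dot{y}\frac{\partial A_y}{\partial x} + \dot{z}\frac{\partial A_z}{\partial x}\right). \tag{2.86}$$

Furthermore

$$\frac{d}{dt}\left(\frac{\partial W}{\partial \dot{x}}\right) = -q\frac{dA_x}{dt} = -q\left(\frac{\partial A_x}{\partial t} + \frac{\partial A_x}{\partial x}\dot{x} + \frac{\partial A_x}{\partial y}\dot{y} + \frac{\partial A_x}{\partial z}\dot{z}\right). \tag{2.87}$$

The Euler-Lagrange equations for the action contain the two contributions resulting from the potential. We have

$$m\ddot{x} = -\frac{\partial W}{\partial x} + \frac{d}{dt}\left(\frac{\partial W}{\partial \dot{x}}\right) = -q\left(\frac{\partial \phi}{\partial x} + \frac{\partial A_x}{\partial t}\right) + q\left[\dot{y}\left(\frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}\right) + \dot{z}\left(\frac{\partial A_z}{\partial x} - \frac{\partial A_x}{\partial z}\right)\right] = qE_x + q(\dot{y}B_z - \dot{z}B_y), \tag{2.88}$$

i.e. precisely the equation of motion with the force given in (2.80)!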
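The result (2.88) can be spot-checked numerically for a simple field configuration. The sketch below assumes uniform static fields $\mathbf{E} = (E_0, 0, 0)$ and $\mathbf{B} = (0, 0, B_0)$, for which one valid choice of potentials is $\phi = -E_0 x$ and $\mathbf{A} = \frac{1}{2}\mathbf{B} \times \mathbf{r} = (-B_0 y/2, B_0 x/2, 0)$. This gauge choice, the helper names and the sample numbers are assumptions of the example. The $x$-component of (2.76) is evaluated by finite differences and should equal $q(E_0 + \dot{y}B_0)$.

```python
def W(r, v, q, E0, B0):
    """Generalised potential (2.84) for E = (E0, 0, 0) and B = (0, 0, B0)."""
    x, y, z = r
    phi = -E0 * x
    A = (-0.5 * B0 * y, 0.5 * B0 * x, 0.0)
    return q * phi - q * sum(vi * Ai for vi, Ai in zip(v, A))

def shifted(vec, i, d):
    """Copy of vec with component i increased by d."""
    out = list(vec)
    out[i] += d
    return tuple(out)

def force_x(r, v, q, E0, B0, h=1e-3):
    """x-component of (2.76): -dW/dx + d/dt (dW/d xdot), by central differences."""
    dW_dx = (W(shifted(r, 0, h), v, q, E0, B0)
             - W(shifted(r, 0, -h), v, q, E0, B0)) / (2 * h)

    def dW_dvx(rr):
        return (W(rr, shifted(v, 0, h), q, E0, B0)
                - W(rr, shifted(v, 0, -h), q, E0, B0)) / (2 * h)

    # The fields are static and W is linear in v, so dW/d(xdot) depends on r
    # only and its total time derivative is sum_k v_k d/dr_k.
    ddt = sum(v[k] * (dW_dvx(shifted(r, k, h)) - dW_dvx(shifted(r, k, -h))) / (2 * h)
              for k in range(3))
    return -dW_dx + ddt

if __name__ == "__main__":
    r, v = (0.3, -0.2, 0.1), (1.0, 2.0, -0.5)
    q, E0, B0 = 1.5, 0.7, 2.0
    print(force_x(r, v, q, E0, B0), q * (E0 + v[1] * B0))
```

The two printed numbers should agree up to rounding. A time-dependent $\mathbf{A}$ would need the extra $\partial A_x/\partial t$ term of (2.87), which this static sketch omits.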

2.8 Hamilton mechanics

It is possible to formulate Lagrangian mechanics in a different way. At first sight this does not add anything new to the formalism which was constructed in the previous sections, but we shall see that this new formalism provides us with a conserved quantity which is the energy or some analogous object. More importantly, this formalism is essential for setting up quantum mechanics in a structured way, as will be shown in a later course.


Let us again consider a system described by a Lagrangian formulated in terms of generalisedcoordinates, with the equations of motion given by:

ddt

∂L∂ q j

=∂L∂q j

. (2.89)

This is a second order differential equation, which we shall transform into two first order ones.We define thecanonical momenta pj as

p j =∂L∂ q j

. (2.90)

The canonical momentum should not be confused with themechanical momentum, which is simply∑i mr i , although the two coincide when the generalised coordinates are simply ther i . Using thecanonical momenta, the equations of motion can be formulated as:

p j =∂L∂q j

. (2.91)

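In the particular example of a conservative system formulated in terms of the position coordinates $\mathbf{r}_i$:

\[ L = \sum_{i=1}^{N} \frac{m_i}{2} \dot{\mathbf{r}}_i^2 - V(\mathbf{r}_1, \ldots, \mathbf{r}_N), \tag{2.92} \]

the momenta are given as

\[ \mathbf{p}_i = m_i \dot{\mathbf{r}}_i \tag{2.93} \]

and the equations of motion are

\[ \dot{\mathbf{p}}_i = -\frac{\partial V}{\partial \mathbf{r}_i}. \tag{2.94} \]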

We see that in the case of a particle moving in a conservative force field, the generalised momentum corresponds to the usual definition of momentum.

We have reformulated the Euler-Lagrange equation as two first-order differential equations. The Euler-Lagrange equations were derived from a variational principle, the Hamilton principle, which requires the action to be stationary for the mechanical path. We may ask ourselves if it is possible to define our two new equations in terms of the same variational principle. This turns out to be the case indeed. If a variational principle should lead to two equations for each generalised coordinate, the corresponding functional to be minimised should have two independent parameters per generalised coordinate $q_j$ which should be varied. Of course, in addition to the generalised coordinate $q_j$ we use $p_j$ for the second coordinate. We know the form of the Lagrangian in terms of $q_j$ and $p_j$ (the parameter $\dot{q}_j$ obviously disappears from the description as argued above).

The problem is that straightforward application of variational calculus with respect to $q_j$ and $p_j$ is quite intricate. In fact, in order to simplify the derivation of the variational principle, it is useful to introduce a new function, called the Hamiltonian $H$, depending on the generalised coordinates and momenta, and the time $t$, as follows:

\[ H(p_j, q_j, t) = \sum_{j=1}^{N} p_j \dot{q}_j - L\left[ q_j, \dot{q}_j(q_k, p_k), t \right]. \tag{2.95} \]


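Note that we can indeed express $\dot{q}_j$ in terms of the $p_k$ and $q_k$, as indicated in the second argument of $L$, by inversion of Eq. (2.90) (see footnote 1).

Let us calculate the derivatives of $H$ with respect to $q_j$ and $p_j$:

\[ \frac{\partial H}{\partial p_j} = \dot{q}_j + \sum_k p_k \frac{\partial \dot{q}_k}{\partial p_j} - \sum_k \frac{\partial L}{\partial \dot{q}_k} \frac{\partial \dot{q}_k}{\partial p_j}. \tag{2.96} \]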

Note that it follows from (2.90) that the second and third terms on the right hand side cancel, so that we are left with

\[ \frac{\partial H}{\partial p_j} = \dot{q}_j. \tag{2.97} \]

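Now let us calculate the derivative with respect to $q_j$:

\[ \frac{\partial H}{\partial q_j} = -\frac{\partial L}{\partial q_j} + \sum_k p_k \frac{\partial \dot{q}_k}{\partial q_j} - \sum_k \frac{\partial L}{\partial \dot{q}_k} \frac{\partial \dot{q}_k}{\partial q_j}. \tag{2.98} \]

Again using (2.90) we see that the second and third terms on the right hand side cancel. Furthermore, by (2.91), the first term on the right hand side is equal to $-\dot{p}_j$, and we are left with:

\[ \frac{\partial H}{\partial q_j} = -\dot{p}_j. \tag{2.99} \]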

Eqs. (2.97) and (2.99), together with the definition of the Hamiltonian (2.95) and of the momentum (2.90), are equivalent to the equations of motion. Eqs. (2.97) and (2.99) are called Hamilton's equations. Note that we must consider the generalised coordinates and the canonical momenta as independent coordinates, in contrast to the Lagrange picture, in which $q_j$ and $\dot{q}_j$ are related by

\[ \dot{q}_j = \frac{dq_j}{dt}. \]

This independence of coordinates and momenta is needed in order to arrive at the correct equations of motion. When these equations are solved, we obtain relations between them. It is very important to realise the difference between the formal independence of the coordinates at the level of formulating the Hamiltonian and deriving the equations of motion, and the dependence which is a consequence of the solution of these equations.

If the system does not depend explicitly on time, the Hamiltonian is the analogue of the energy. The simplest case is a conservative system with the positions $\mathbf{r}_i$ as coordinates. In that case it is easy to see that

\[ H = \sum_{i=1}^{N} \frac{\mathbf{p}_i^2}{2 m_i} + V(\mathbf{r}_1, \ldots, \mathbf{r}_N). \tag{2.100} \]

More generally, let us consider a conservative system formulated in terms of generalised coordinates $q_1, \ldots, q_s$. Note the difference with Eq. (2.2), where $\mathbf{r}_i$ may contain an explicit time-dependence: in the present case we assume that the constraints have no explicit time-dependence. In that case it is possible to express the position coordinates $\mathbf{r}_i$ in terms of the $s$ generalised coordinates $q_j$, $j = 1, \ldots, s$:

\[ \mathbf{r}_i = \mathbf{r}_i(q_1, \ldots, q_s) \tag{2.101} \]

(Footnote 1: For this inversion to be possible, the Lagrangian should be convex, but we shall not go into details concerning this point.)

and therefore the velocities can be calculated as

\[ \dot{\mathbf{r}}_i = \sum_{j=1}^{s} \frac{\partial \mathbf{r}_i}{\partial q_j} \dot{q}_j. \tag{2.102} \]

Therefore, if we formulate the kinetic energy $\sum_i \frac{1}{2} m_i \dot{\mathbf{r}}_i^2$ in terms of the generalised coordinates, we obtain an expression which is quadratic in the $\dot{q}_j$:

\[ T = \sum_{k,j=1}^{s} M_{kj}(q_1, \ldots, q_s)\, \dot{q}_j \dot{q}_k \tag{2.103} \]

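where

\[ M_{jk} = M_{kj} = \sum_{i=1}^{N} \frac{m_i}{2} \frac{\partial \mathbf{r}_i}{\partial q_j} \cdot \frac{\partial \mathbf{r}_i}{\partial q_k}. \tag{2.104} \]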

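If we calculate the contribution to the momenta arising from the kinetic energy, we find that they depend linearly on the $\dot{q}_j$:

\[ p_j = \frac{\partial T}{\partial \dot{q}_j} = \sum_{k=1}^{s} \left( M_{jk} + M_{kj} \right) \dot{q}_k = 2 \sum_{k=1}^{s} M_{kj} \dot{q}_k. \tag{2.105} \]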

Hence

\[ \sum_{j=1}^{s} \dot{q}_j p_j = 2T \tag{2.106} \]

and

\[ H = 2T - (T - V) = T + V = \text{Energy}. \tag{2.107} \]

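For a general system Hamilton's equations of motion can be used to derive the time derivative of the Hamiltonian:

\[ \frac{dH}{dt} = \sum_{j=1}^{s} \frac{\partial H}{\partial q_j} \dot{q}_j + \sum_{j=1}^{s} \frac{\partial H}{\partial p_j} \dot{p}_j + \frac{\partial H}{\partial t}. \tag{2.108} \]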

Using Hamilton's equations of motion (2.97) and (2.99) we see that the first two terms on the right hand side cancel and we are left with:

\[ \frac{dH}{dt} = \frac{\partial H}{\partial t}. \tag{2.109} \]

We see therefore that if $H$ (or $L$) does not depend explicitly on time, then $H$ is a conserved quantity. If the potential does not contain a $\dot{q}_j$ dependence, this implies conservation of energy. If the potential on the other hand does contain such a dependence, then (2.109) implies conservation of some quantity which plays a role more or less equivalent to energy.
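As a numerical illustration of Eq. (2.109), the following minimal Python sketch integrates Hamilton's equations (2.97) and (2.99) for a one-dimensional harmonic oscillator and monitors the value of $H$. The oscillator itself, the parameter values, and the leapfrog integration scheme are assumptions made for this sketch only; they are not taken from these notes.

```python
import math

M_PARTICLE = 1.0   # mass m (illustrative value, an assumption of this sketch)
K_SPRING = 4.0     # spring constant k (illustrative value)


def hamiltonian(q, p):
    """H = p^2/(2m) + k q^2/2 for the example oscillator."""
    return p * p / (2.0 * M_PARTICLE) + 0.5 * K_SPRING * q * q


def dH_dq(q):
    """Partial derivative of H with respect to q."""
    return K_SPRING * q


def dH_dp(p):
    """Partial derivative of H with respect to p."""
    return p / M_PARTICLE


def integrate(q, p, dt, steps):
    """Leapfrog steps for dq/dt = dH/dp and dp/dt = -dH/dq."""
    for _ in range(steps):
        p -= 0.5 * dt * dH_dq(q)   # half step in p
        q += dt * dH_dp(p)         # full step in q
        p -= 0.5 * dt * dH_dq(q)   # second half step in p
    return q, p


q0, p0 = 1.0, 0.0
q1, p1 = integrate(q0, p0, dt=1e-3, steps=5000)   # total time t = 5
energy_drift = abs(hamiltonian(q1, p1) - hamiltonian(q0, p0))
print("change in H over the run:", energy_drift)
```

Since this $H$ has no explicit time dependence, the printed change should be small and should shrink further when `dt` is reduced.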

2.9 Applications of the Hamiltonian formalism

In this section we shall reconsider the systems studied before in the Lagrange framework and point out which features are different when these systems are considered within the Hamiltonian framework. From the derivation of the Hamiltonian and Hamilton's equations, it is seen that the latter can be viewed as a different way of writing Lagrange's equations. The reason for introducing the Hamiltonian and Hamilton's equations is that they are often used in quantum mechanics and that the Hamilton formalism is more convenient for discovering some conserved quantities.


2.9.1 The three-pulley system

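From the Lagrangian (2.61), the momenta $p_1$ and $p_4$ associated with the degrees of freedom $l_1$ and $l_4$ are found as:

\[ p_1 = \left( m_a + \frac{m_b}{4} \right) \dot{l}_1 + \frac{m_b}{4} \dot{l}_4; \tag{2.110a} \]

\[ p_4 = \left( m_c + \frac{m_b}{4} \right) \dot{l}_4 + \frac{m_b}{4} \dot{l}_1. \tag{2.110b} \]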

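After some calculation, we therefore find for the Hamiltonian:

\[ H = \frac{1}{2\Delta} \left[ m_c p_1^2 + m_a p_4^2 + \frac{m_b}{4} (p_1 - p_4)^2 \right] - g \left[ m_a l_1 + \frac{m_b}{2} (l - l_1 - l_4) + m_c l_4 \right], \tag{2.111} \]

with

\[ \Delta = (m_a + m_c) \frac{m_b}{4} + m_a m_c. \tag{2.112} \]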

The Hamilton equations read:

\[ \dot{p}_1 = \left( m_a - \frac{m_b}{2} \right) g; \tag{2.113a} \]

\[ \dot{p}_4 = \left( m_c - \frac{m_b}{2} \right) g. \tag{2.113b} \]

The solution is simple since the right hand sides of these equations are constants:

\[ p_1 = \left( m_a - \frac{m_b}{2} \right) g t; \tag{2.114a} \]

\[ p_4 = \left( m_c - \frac{m_b}{2} \right) g t, \tag{2.114b} \]

where the initial conditions are that the system is standing still at $t = 0$. Together with Eqs. (2.110), we obtain the same solution as in the Lagrangian case. We see that the difference between the two approaches is not very dramatic in this case. Note that it is now easy to see from Eqs. (2.113) that for $m_a = m_c = m_b/2$ the system is in equilibrium.
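The kinetic part of Eq. (2.111) can be checked numerically against Eqs. (2.110). The Python sketch below is only an illustration: the function names and the test masses are assumptions of this sketch, and it uses the fact that a kinetic energy quadratic in the velocities satisfies $T = \frac{1}{2} \sum_j p_j \dot{q}_j$, as in Eq. (2.106).

```python
def delta(ma, mb, mc):
    """The factor Delta of Eq. (2.112)."""
    return (ma + mc) * mb / 4.0 + ma * mc


def momenta(ma, mb, mc, v1, v4):
    """Canonical momenta (2.110) from the velocities v1 = dl1/dt and v4 = dl4/dt."""
    p1 = (ma + mb / 4.0) * v1 + (mb / 4.0) * v4
    p4 = (mc + mb / 4.0) * v4 + (mb / 4.0) * v1
    return p1, p4


def kinetic_from_momenta(ma, mb, mc, p1, p4):
    """Kinetic part of the Hamiltonian (2.111)."""
    bracket = mc * p1**2 + ma * p4**2 + (mb / 4.0) * (p1 - p4) ** 2
    return bracket / (2.0 * delta(ma, mb, mc))


def momentum_rates(ma, mb, mc, g=9.81):
    """Right hand sides of Hamilton's equations (2.113)."""
    return (ma - mb / 2.0) * g, (mc - mb / 2.0) * g


# Arbitrary test values (assumptions of this sketch)
ma, mb, mc = 1.0, 3.0, 2.0
v1, v4 = 0.7, -0.4
p1, p4 = momenta(ma, mb, mc, v1, v4)
t_from_velocities = 0.5 * (p1 * v1 + p4 * v4)   # T = (1/2) sum_j p_j * qdot_j
t_from_momenta = kinetic_from_momenta(ma, mb, mc, p1, p4)
print(t_from_velocities, t_from_momenta)
```

Both numbers should agree to rounding accuracy, and `momentum_rates` returns zero for both momenta when each outer mass equals half the middle mass.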

2.9.2 The spinning top

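From the Lagrangian, we can derive the momenta associated with the three degrees of freedom $\varphi$, $\vartheta$ and $\psi$:

\[ p_\varphi = I_1 \dot{\varphi} \sin^2\vartheta + I_3 (\dot{\psi} + \dot{\varphi} \cos\vartheta) \cos\vartheta; \tag{2.115a} \]

\[ p_\vartheta = I_1 \dot{\vartheta}; \tag{2.115b} \]

\[ p_\psi = I_3 (\dot{\psi} + \dot{\varphi} \cos\vartheta). \tag{2.115c} \]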

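If we want to express the kinetic energy in terms of these momenta, we need to solve for the time derivatives of the angular coordinates $\vartheta$, $\varphi$ and $\psi$ in terms of these momenta:

\[ \dot{\varphi} = \frac{p_\varphi - p_\psi \cos\vartheta}{I_1 \sin^2\vartheta}; \tag{2.116a} \]

\[ \dot{\vartheta} = \frac{p_\vartheta}{I_1}; \tag{2.116b} \]

\[ \dot{\psi} = \frac{p_\psi}{I_3} - \frac{p_\varphi - p_\psi \cos\vartheta}{I_1 \sin^2\vartheta} \cos\vartheta. \tag{2.116c} \]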


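After some calculation, the Hamiltonian is then found to be

\[ H = \frac{(p_\varphi - p_\psi \cos\vartheta)^2}{2 I_1 \sin^2\vartheta} + \frac{p_\vartheta^2}{2 I_1} + \frac{p_\psi^2}{2 I_3} + M g r \cos\vartheta. \tag{2.117} \]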

As the Hamiltonian does not depend on $\psi$ and $\varphi$, we see immediately that $p_\psi$ and $p_\varphi$ must be constant. Coordinates of which only the momentum appears in the Hamiltonian are called ignorable: the corresponding momenta are constant in time, so they represent constants of the motion. We have seen that both $p_\psi$ and $p_\varphi$ are constants of the motion.

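The Hamiltonian now reduces to a simple form:

\[ H = \frac{p_\vartheta^2}{2 I_1} + U(\vartheta), \tag{2.118} \]

where

\[ U(\vartheta) = \frac{(p_\varphi - p_\psi \cos\vartheta)^2}{2 I_1 \sin^2\vartheta} + \frac{p_\psi^2}{2 I_3} + M g r \cos\vartheta. \tag{2.119} \]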

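The Hamilton equations yield

\[ -I_1 \ddot{\vartheta} = -\dot{p}_\vartheta = \frac{dU}{d\vartheta}. \tag{2.120} \]

This equation is difficult to solve analytically. Note that apart from the momenta of the ignorable coordinates, we have an additional constant of the motion, the energy [see Eq. (2.118)]:

\[ \frac{p_\vartheta^2}{2 I_1} + U(\vartheta) = E = \text{constant}. \tag{2.121} \]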

The motion and its analysis will be considered in a worksheet.

2.9.3 Charged particle in an electromagnetic field

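Finally we consider again the charged particle in an electromagnetic field. The momentum can be found as usual from the Lagrangian; we obtain

\[ \mathbf{p} = m \dot{\mathbf{r}} + q \mathbf{A}. \tag{2.122} \]

The Hamiltonian is

\[ H = \mathbf{p} \cdot \dot{\mathbf{r}} - \frac{m}{2} \dot{\mathbf{r}}^2 - q\, \dot{\mathbf{r}} \cdot \mathbf{A} + q\phi = \frac{m}{2} \dot{\mathbf{r}}^2 + q\phi(\mathbf{r}) = \frac{(\mathbf{p} - q\mathbf{A})^2}{2m} + q\phi(\mathbf{r}). \tag{2.123} \]

You might already know that this Hamiltonian is used in quantum mechanics for a particle in an electromagnetic field.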

3 The two-body problem

In this chapter we consider the two-body problem within the framework of Lagrangian mechanics. One of the most impressive results of classical mechanics is the correct description of the planetary motion around the sun, which is equivalent to electric charges moving in each other's field. With the analytic solution of this problem, we shall recover the famous Kepler laws. The problem is also important in quantum mechanics: the hydrogen atom is a quantum version of the Kepler problem.

3.1 Formulation and analysis of the two-body problem

The two-body problem describes two point particles with masses $m_1$ and $m_2$. We denote their positions by $\mathbf{r}_1$ and $\mathbf{r}_2$ respectively, and their relative position, $\mathbf{r}_2 - \mathbf{r}_1$, by $\mathbf{r}$. Finding the Lagrangian is quite simple. The kinetic energy is the sum of the kinetic energies of the two particles, and the potential energy is the interaction, which depends only on the separation $r = |\mathbf{r}|$ of the two particles, and is directed along the line connecting them (note that this last restriction excludes magnetic interactions). We therefore have:

\[ L = \frac{m_1}{2} \dot{\mathbf{r}}_1^2 + \frac{m_2}{2} \dot{\mathbf{r}}_2^2 - V(r). \tag{3.1} \]

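Before deriving the equations of motion, we note that instead of writing the kinetic energy as the sum of the kinetic energies of the two particles, it can also be separated into the kinetic energy of the centre of mass and that of the relative motion, as in Eq. (1.33):

\[ T = T_{\mathrm{CM}} + \sum_{i=1}^{2} \frac{m_i}{2} \dot{\mathbf{r}}_i'^2, \tag{3.2} \]

where

\[ \mathbf{r}_i' = \mathbf{r}_i - \mathbf{r}_{\mathrm{CM}}; \tag{3.3} \]

\[ \mathbf{r}_{\mathrm{CM}} = \frac{m_1 \mathbf{r}_1 + m_2 \mathbf{r}_2}{M}, \tag{3.4} \]

and

\[ T_{\mathrm{CM}} = \frac{M}{2} \dot{\mathbf{r}}_{\mathrm{CM}}^2, \qquad M = m_1 + m_2. \tag{3.5} \]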

As there are only two particles, we can work out the coordinates $\mathbf{r}_i'$ relative to the centre of mass $\mathbf{r}_{\mathrm{CM}}$, and we find, using Eq. (3.4):

\[ \mathbf{r}_1' = \mathbf{r}_1 - \mathbf{r}_{\mathrm{CM}} = \frac{m_2}{M} (\mathbf{r}_1 - \mathbf{r}_2), \tag{3.6} \]

and

\[ \mathbf{r}_2' = \mathbf{r}_2 - \mathbf{r}_{\mathrm{CM}} = \frac{m_1}{M} (\mathbf{r}_2 - \mathbf{r}_1). \tag{3.7} \]

We can take time derivatives by simply putting a dot on each $\mathbf{r}$ in these equations and then, after some calculation, we find for the kinetic energy:

\[ T = \frac{M}{2} \dot{\mathbf{r}}_{\mathrm{CM}}^2 + \frac{m_1 m_2}{2M} \dot{\mathbf{r}}^2. \tag{3.8} \]

The Lagrangian is therefore

\[ L = T - V = \frac{M}{2} \dot{\mathbf{r}}_{\mathrm{CM}}^2 + \frac{m_1 m_2}{2M} \dot{\mathbf{r}}^2 - V(r). \tag{3.9} \]

We see that the kinetic energy of the relative motion has the form of the kinetic energy of a single particle of mass $m_1 m_2/(m_1 + m_2)$ and position vector $\mathbf{r}(t)$. The mass term $\mu = m_1 m_2/(m_1 + m_2)$ is called the reduced mass.
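The separation of the kinetic energy in Eq. (3.8) is easy to verify numerically. The following Python sketch compares the direct sum of the two kinetic energies with the centre-of-mass plus reduced-mass form; the masses and velocity vectors are arbitrary test values chosen for this sketch.

```python
def dot(a, b):
    """Scalar product of two vectors given as sequences."""
    return sum(x * y for x, y in zip(a, b))


def kinetic_direct(m1, v1, m2, v2):
    """Sum of the kinetic energies of the two particles."""
    return 0.5 * m1 * dot(v1, v1) + 0.5 * m2 * dot(v2, v2)


def kinetic_split(m1, v1, m2, v2):
    """Centre-of-mass term plus relative term, as in Eq. (3.8)."""
    total_mass = m1 + m2
    v_cm = [(m1 * a + m2 * b) / total_mass for a, b in zip(v1, v2)]
    v_rel = [b - a for a, b in zip(v1, v2)]    # time derivative of r = r2 - r1
    mu = m1 * m2 / total_mass                  # reduced mass
    return 0.5 * total_mass * dot(v_cm, v_cm) + 0.5 * mu * dot(v_rel, v_rel)


# Arbitrary test values (assumptions of this sketch)
m1, v1 = 2.0, (1.0, 0.0, -1.0)
m2, v2 = 3.0, (0.5, 2.0, 0.0)
t_direct = kinetic_direct(m1, v1, m2, v2)
t_split = kinetic_split(m1, v1, m2, v2)
print(t_direct, t_split)
```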

Of course we could write down the Euler–Lagrange equations for this Lagrangian as before, but it is convenient to perform a further separation: that of the kinetic energy of the relative coordinate into a radial and a tangential part. First we must realise that the plane through the origin which contains the initial relative position $\mathbf{r}$ and the initial velocity vector $\dot{\mathbf{r}}$ will always remain the plane of the motion, as the force acts only within that plane. In this plane, we choose an $x$ and a $y$ axis. Then we can conveniently introduce polar coordinates $r$ and $\varphi$, in which the $x$ and $y$ coordinates can be expressed as follows:

\[ x = r \cos\varphi; \tag{3.10} \]

\[ y = r \sin\varphi. \tag{3.11} \]

It then immediately follows that the kinetic energy of the relative motion can be rewritten as

\[ \frac{\mu}{2} \left( \dot{x}^2 + \dot{y}^2 \right) = \frac{\mu}{2} \left( \dot{r}^2 + r^2 \dot{\varphi}^2 \right). \tag{3.12} \]

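The Lagrange equations given in Eq. (2.54) then take the form:

\[ M \ddot{\mathbf{r}}_{\mathrm{CM}} = 0; \tag{3.13a} \]

\[ \frac{d}{dt} \left( \mu r^2 \dot{\varphi} \right) = 0; \tag{3.13b} \]

\[ \mu \ddot{r} - \mu r \dot{\varphi}^2 = -\frac{dV(r)}{dr}. \tag{3.13c} \]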

The first equation tells us that the centre of mass moves at constant velocity: it does not feel a net force. This follows from the fact that it does not appear in the potential, and is in accordance with the conservation of total momentum in the absence of external forces. Coordinates such as $\mathbf{r}_{\mathrm{CM}}$, with constant canonical momentum, are called ignorable (see section 2.9.2). The second and third equations do not depend on $\mathbf{r}_{\mathrm{CM}}$; therefore we see that the relative motion can entirely be understood in terms of a single particle with mass $\mu$ moving in a plane under the influence of a potential $V(r)$.

We now use the second equation to eliminate $\dot{\varphi}$. First note that the term in brackets occurring in this equation must be a constant. We call this constant $\ell$ ($\ell = \mu r^2 \dot{\varphi}$ is precisely the angular momentum, and we see that it is conserved); the third equation then transforms into

\[ \mu \ddot{r} - \frac{\ell^2}{\mu r^3} = -\frac{dV(r)}{dr}. \tag{3.14} \]

Note that this equation can be viewed as that of a one-dimensional particle subject to a force $F = \frac{\ell^2}{\mu r^3} - \frac{dV(r)}{dr}$. Such a force can in turn be derived from a conservative potential:

\[ F(r) = -\frac{d}{dr} V_{\mathrm{Eff}}(r) = -\frac{d}{dr} \left[ \frac{\ell^2}{2\mu r^2} + V(r) \right]. \tag{3.15} \]

Figure 3.1: Effective potential for a two-particle system. (Plot of $V_{\mathrm{Eff}}(r)$ against $r$; the turning points $r_{\min}$ and $r_{\max}$ are marked for an energy $E < 0$, and the single turning point $r_{\min}$ for an energy $E > 0$.)

The subscript Eff is used to distinguish between this 'effective' potential and the original, bare attraction potential $V(r)$. The potential $V_{\mathrm{Eff}}$ is represented in figure 3.1 for the case $V(r) = -1/r$. From Eqs. (3.13b) and (3.13c) and from figure 3.1, we can infer the qualitative behaviour of the motion.

We have seen [Eq. (3.13b)] that the angular momentum is constant. This implies that the motion will always keep the same orientation (i.e. clockwise or anti-clockwise). If the particles move apart, the speed at which they orbit around each other will be slower (since increasing $r$ implies decreasing $r\dot{\varphi}$).

The motion in the radial direction can be understood qualitatively as follows. Note that we can interpret Eq. (3.13c) as the motion of a particle in one dimension. This particle has a mechanical energy which is the sum of its kinetic energy and the effective potential, and this energy should remain constant. Furthermore, the energy cannot be lower than the lowest value of the effective potential shown in figure 3.1. If it lies between this value and 0, then $r$ will vary between some minimum and maximum value as shown in this figure. If $E$ is on the other hand positive, $r$ will vary between some minimum value and infinity.

We have seen that the $r$-component of the two-body motion can be described in terms of a single particle in one dimension. The energy of this particle is the sum of its kinetic and potential energy, where the latter is the effective potential [see Eq. (3.15)]. It turns out that this energy is equal to the total energy of the two-particle system (neglecting the contribution of the centre of mass motion to the latter). As we have already worked out the kinetic and potential energy of the two-body problem above, we immediately see that

\[ E = T + V = \frac{\mu}{2} \left( \dot{r}^2 + r^2 \dot{\varphi}^2 \right) + V(r), \tag{3.16} \]

which can easily be identified as the kinetic energy $\mu \dot{r}^2/2$ of the one-dimensional particle plus the effective potential.
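The turning points discussed above can be computed explicitly for the attraction $V(r) = -A/r$. The Python sketch below solves $E = V_{\mathrm{Eff}}(r)$, which for this potential is the quadratic equation $E r^2 + A r - \ell^2/(2\mu) = 0$. The function names and the test values $\mu = A = \ell = 1$ are assumptions made for this illustration.

```python
import math


def effective_potential(r, ell, mu, A):
    """V_Eff(r) = ell^2 / (2 mu r^2) - A / r, for the attraction V(r) = -A/r."""
    return ell**2 / (2.0 * mu * r**2) - A / r


def turning_points(E, ell, mu, A):
    """Return (r_min, r_max) with E = V_Eff(r); r_max is math.inf when E >= 0."""
    lowest = -mu * A**2 / (2.0 * ell**2)   # minimum of V_Eff, reached at r = ell^2/(mu A)
    if E < lowest:
        raise ValueError("E lies below the minimum of the effective potential")
    if E == 0.0:
        return ell**2 / (2.0 * mu * A), math.inf
    # Roots of E r^2 + A r - ell^2/(2 mu) = 0; max() guards against rounding.
    root = math.sqrt(max(0.0, A**2 + 2.0 * E * ell**2 / mu))
    r_min = (-A + root) / (2.0 * E)
    if E > 0.0:
        return r_min, math.inf
    return r_min, (-A - root) / (2.0 * E)


# Bound motion, E < 0 (test values are assumptions of this sketch)
r_min, r_max = turning_points(-0.25, ell=1.0, mu=1.0, A=1.0)
print(r_min, r_max)
```

For positive energies the second element of the returned pair is infinite, in line with the unbounded motion described above.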


3.2 Solution of the Kepler problem

The special case $V(r) = -A/r$ is very important as it describes the gravitational and the Coulomb attraction. Also, in this special case, the motion can be studied further by analytical means. Finding the solution in the form $r(t)$, $\varphi(t)$ is not convenient; rather, we search for $r(\varphi)$, which contains explicit information about the shape of the orbit.

We use the fact that the angular momentum $\ell = \mu r^2 \dot{\varphi}$ is constant and combine this with the fact that the energy is constant and given by (3.16):

\[ \dot{\varphi} = \frac{\ell}{\mu r^2}; \tag{3.17} \]

\[ \dot{r}^2 = \frac{2}{\mu} (E - V) - \frac{\ell^2}{\mu^2 r^2}. \tag{3.18} \]

Eliminating the $dt$ of the time derivatives by dividing (3.17) by the square root of (3.18) leads to

\[ \frac{d\varphi}{dr} = \frac{\pm \ell}{r^2 \left[ 2\mu (E - V(r)) - \ell^2/r^2 \right]^{1/2}}. \tag{3.19} \]

With $V(r) = -A/r$ this can directly be integrated to give

\[ \varphi - C = \arcsin \left( \frac{\mu A r - \ell^2}{\varepsilon \mu A r} \right). \tag{3.20} \]

In addition to the integration constant $C$ on the left hand side, we see a constant $\varepsilon$, called the eccentricity, which is given in terms of the problem parameters as

\[ \varepsilon = \sqrt{1 + \frac{2 E \ell^2}{\mu A^2}}. \tag{3.21} \]

Inverting Eq. (3.20) to find $r$ as a function of the polar angle $\varphi$ gives:

\[ r = \frac{\ell^2}{\mu A \left[ 1 - \varepsilon \sin(\varphi - C) \right]}. \tag{3.22} \]

We have some freedom in choosing $C$, as it changes the definition of the angle $\varphi$. If we take $\varphi = 0$ as the angle for which the two particles are closest (perihelion), we see that $C = \pi/2$.

The motion can now be classified according to the value of $\varepsilon$. We take $\varepsilon$ positive; changing the sign of $\varepsilon$ does not change the shape of the orbit (putting $\varphi \to \varphi + \pi$ compensates this sign change). For $\varepsilon = 0$, $r$ does not depend on $\varphi$. This corresponds to a circle. If $0 < \varepsilon < 1$, we have an ellipse ($r$ varies between some maximum and minimum value). For $\varepsilon = 1$, we have a parabola ($r \to \infty$ for $\varphi = \pi$), and for $\varepsilon > 1$ we have a hyperbola ($r \to \infty$ for $\cos\varphi = -1/\varepsilon$).

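Usually, the notation

\[ \lambda = \frac{\ell^2}{\mu A} \frac{1}{1 + \varepsilon} \tag{3.23} \]

is used, so that the equation relating the two polar coordinates on the curve of the motion reads:

\[ r = \frac{\lambda (1 + \varepsilon)}{1 + \varepsilon \cos\varphi}. \tag{3.24} \]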
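The classification of the orbits by their eccentricity can be sketched in a few lines of Python. The code below evaluates Eq. (3.21) and the orbit equation (3.24), using $\lambda(1 + \varepsilon) = \ell^2/(\mu A)$ from Eq. (3.23). The function names, the numerical tolerance, and the test values are assumptions of this sketch.

```python
import math


def eccentricity(E, ell, mu, A):
    """Eq. (3.21); max() guards against rounding errors for circular orbits."""
    return math.sqrt(max(0.0, 1.0 + 2.0 * E * ell**2 / (mu * A**2)))


def classify(eps, tol=1e-12):
    """Name of the conic section belonging to the eccentricity eps."""
    if eps < tol:
        return "circle"
    if eps < 1.0 - tol:
        return "ellipse"
    if eps <= 1.0 + tol:
        return "parabola"
    return "hyperbola"


def radius(phi, eps, ell, mu, A):
    """Eq. (3.24), written with lambda * (1 + eps) = ell^2 / (mu A)."""
    return ell**2 / (mu * A * (1.0 + eps * math.cos(phi)))


# Test values (assumptions of this sketch): mu = A = ell = 1, E = -1/4
eps = eccentricity(-0.25, 1.0, 1.0, 1.0)
lam = radius(0.0, eps, 1.0, 1.0, 1.0)                    # closest approach
semi_major = 0.5 * (lam + radius(math.pi, eps, 1.0, 1.0, 1.0))
print(classify(eps), lam, semi_major)
```

In this example the closest approach equals $\lambda$, and the mean of the smallest and largest distance is the semi-major axis of the ellipse.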

Figure 3.2: Ellipse with various parameters indicated. (The semi-major axis $a$, the semi-minor axis $b$, the focal points $F_1$ and $F_2$, and the distance $a\varepsilon$ are marked.)

In figure 3.2, we indicate the semi-major and semi-minor axes $a$ and $b$ respectively and the focal points. The semi-major axis can be related to the parameters we use to represent the motion:

\[ a = \frac{\lambda}{1 - \varepsilon}. \tag{3.25} \]

The area of an ellipse in terms of its semi-major axis is $\pi a^2 \sqrt{1 - \varepsilon^2}$. This can be related to the angular momentum by realising that the infinitesimal area swept by a line from the origin to the point of the motion is given by $r^2 d\varphi/2$. This tells us that the rate at which this area changes is $r^2 \dot{\varphi}/2 = \ell/(2\mu)$, so that the total area which is swept in one revolution of period $T$ is equal to $T\ell/(2\mu)$, and we have:

\[ \frac{T \ell}{2\mu} = \pi a^2 \sqrt{1 - \varepsilon^2}. \tag{3.26} \]

The quantities $a$ and $\varepsilon$ are not independent: remember that $a = \lambda/(1 - \varepsilon)$; furthermore, $\ell$ is related to $\lambda$ and $\varepsilon$ [see Eq. (3.23)]. Using these relations to eliminate $\ell$ and $\varepsilon$ finally leads to

\[ T^2 = \frac{4\pi^2 \mu}{A} a^3. \tag{3.27} \]

We have now recovered all three laws of Kepler:

• All planets move around the sun in elliptical paths. In fact, most planets have eccentricities very close to zero.

• A line drawn from the sun to a planet sweeps out equal areas in equal times. The rate at which this area increases is given by $\ell/(2\mu)$, as we have seen above.

• The squares of the periods of revolution of the planets about the sun are proportional to the cubes of the semi-major axes of the ellipses. See Eq. (3.27).
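As a rough numerical check of the third law, Eq. (3.27) can be evaluated for the Earth-Sun system with $A = G m_1 m_2$. The Python sketch below does this; the constants are approximate textbook values inserted for this illustration and are not taken from these notes.

```python
import math

G = 6.674e-11        # gravitational constant in m^3 kg^-1 s^-2 (approximate)
M_SUN = 1.989e30     # mass of the Sun in kg (approximate)
M_EARTH = 5.972e24   # mass of the Earth in kg (approximate)
A_EARTH = 1.496e11   # semi-major axis of the Earth's orbit in m (approximate)


def orbital_period(a, m1, m2):
    """Period from Eq. (3.27) with A = G m1 m2 and mu = m1 m2 / (m1 + m2)."""
    strength = G * m1 * m2
    mu = m1 * m2 / (m1 + m2)
    return 2.0 * math.pi * math.sqrt(mu * a**3 / strength)


period_days = orbital_period(A_EARTH, M_SUN, M_EARTH) / 86400.0
print("period in days:", period_days)

# Quadrupling the semi-major axis should multiply the period by 4**1.5 = 8.
ratio = orbital_period(4.0 * A_EARTH, M_SUN, M_EARTH) / orbital_period(A_EARTH, M_SUN, M_EARTH)
print("period ratio:", ratio)
```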

4 Examples of variational calculus, constraints

4.1 Variational problems

In the previous chapters, we have considered a reformulation of classical mechanics in terms of a variational principle. This will lead the way to formulating quantum mechanics, which is the subject of the next chapter. In this chapter we make an excursion which is still in the field of classical problems, though not classical dynamics as in the previous chapters. In fact, variational calculus is not only useful for mechanics. Many physical problems which occur in everyday life can be formulated as variational problems. In the next sections we shall consider a few examples.

We shall first introduce some further analysis concerning the problems we are about to treat in this chapter. Consider an expression of the form

\[ J = \int dx\, F(y, y', x). \tag{4.1} \]

We have used a notation which differs from that used in the previous chapters in order to emphasise that $J$ is not always the action and $F$ not always the Lagrangian. $J$ assigns a real value to every function $y(x)$; it is called a functional. There is a whole branch of mathematics, called functional analysis, dedicated to such objects. Here we shall only consider finding the stationary solutions (minima, maxima or saddle points) of $J$; they are given as the solutions to the Euler equations

\[ \frac{\partial F}{\partial y} - \frac{d}{dx} \frac{\partial F}{\partial y'} = 0. \tag{4.2} \]

In the case where $F$ does not depend explicitly on $x$, i.e.

\[ F = F(y, y'), \tag{4.3} \]

we can directly integrate the Euler equation(s) once: by multiplying the Euler equation by $y'$ we find

\[ \frac{d}{dx} \left[ F(y, y') - y' \frac{\partial F(y, y')}{\partial y'} \right] = 0. \tag{4.4} \]

From this it follows that the solution must obey

\[ F(y, y') - y' \frac{\partial F(y, y')}{\partial y'} = \text{Constant}. \tag{4.5} \]

This is a first order differential equation: we have integrated the second order Euler equations once.


4.2 The brachistochrone

Near the end of the 17th century, Johann (Jean) Bernoulli was studying a problem which we formulate as follows. Suppose you are to design a monorail in an amusement park. There is a track in your monorail where the trolleys, which arrive at some high point $A$ with low (approximately zero) speed, should move to another place $B$ under the influence of gravity (no motor is used and friction is neglected) in the shortest possible time. The problem is to design the shape of the track in order to achieve this goal. It will be clear that the track lies in a plane. Let us first consider the possible solutions heuristically. One could argue that a straight line would be the best solution because it is the shortest path between $A$ and $B$. On the other hand, it would seem favourable to increase the particle's velocity as much as possible in the beginning of the motion. This would call for a steep slope near the starting point $A$, followed by a more or less horizontal path to $B$, but the resulting curve is considerably longer than the straight line. We must therefore find the optimum between the shortness of the path and the earliest increase of the velocity by a steeper slope.

We can solve this problem using the techniques of the previous section. We must minimise the time for a curve which can be parametrised as $x(s)$, $y(s)$. Obviously, there are many ways to parametrise a curve; we shall use for $s$ the distance along the curve, measured from the point $A$. The infinitesimal segment $ds$ is given by

\[ ds = \sqrt{dx^2 + dy^2} = dx \sqrt{1 + \left( \frac{dy}{dx} \right)^2} = dx \sqrt{1 + y'^2}. \tag{4.6} \]

$s$ can be expressed as a function of $t$; the relation between the two is given by

\[ ds = v\, dt \tag{4.7} \]

where $v = \sqrt{v_x^2 + v_y^2}$ is the particle speed. The time needed to go from $A$ to $B$ is given by

\[ t = \int_A^B \frac{ds}{v}. \tag{4.8} \]

We need an equation for $v$ in terms of the path length. As the gravitational force is responsible for the change in velocity it is useful to consider the $x$- and $y$-components of the path. In fact, we have the following relation between $v$ and $y$ as a result of conservation of energy:

\[ \frac{1}{2} v^2 = g y, \tag{4.9} \]

where the height $y$ is measured downwards from the point $A$. This means that we have put in the boundary condition that when $y = 0$, then $v = 0$, which is correct since the particle is released from $A$ with zero velocity. Therefore, using (4.9), we arrive at

\[ t[y] = \int_0^{x_0} dx\, \frac{\sqrt{1 + y'^2}}{\sqrt{2 g y}} \tag{4.10} \]

where $x_0$ is the horizontal distance between $A$ and $B$. We have to find the stationary function $y(x)$ for the functional $t[y]$. The Euler-Lagrange equations have the solution [see Eq. (4.5)]:

\[ \sqrt{\frac{1 + y'^2}{2 g y}} - y'^2 \sqrt{\frac{1}{(1 + y'^2)\, 2 g y}} = \text{Constant}. \tag{4.11} \]

This can be simplified to

\[ y (1 + y'^2) = C = \text{Constant}. \tag{4.12} \]

In order to solve this equation we substitute

\[ y' = \tan\phi \tag{4.13} \]

so that we have:

\[ y = C \cos^2\phi = C \left[ \frac{1}{2} + \frac{1}{2} \cos(2\phi) \right]. \tag{4.14} \]

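Differentiating (4.14) gives $dy/d\phi = -C \sin(2\phi)$, and therefore

\[ \frac{dx}{d\phi} = \frac{1}{y'} \frac{dy}{d\phi} = -\frac{C \sin(2\phi)}{\tan\phi} = -2C \cos^2\phi. \tag{4.15} \]

The solution is therefore

\[ x = D - C \left[ \phi + \frac{1}{2} \sin(2\phi) \right] \tag{4.16a} \]

\[ y = C \left[ \frac{1}{2} + \frac{1}{2} \cos(2\phi) \right]. \tag{4.16b} \]

$D$ and $C$ are integration constants. If we identify the point $A$ with $(0, 0)$, the curve starts at $\phi = \pi/2$, and $D/C = \pi/2$; $x$ then increases as $\phi$ decreases from $\pi/2$. The two coordinates of $B$ are used to fix the value of $\phi$ at point $B$ and the constant $C$. Note that the boundary condition $v = 0$ at point $A$ was already imposed in Eq. (4.9). The resulting curve is called the cycloid: it is the curve described by a point on the rim of a wheel rolling along a horizontal line.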
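To see how much time the cycloid saves, the descent time (4.8) can be evaluated numerically and compared with that along a straight line between the same end points. The Python sketch below uses the standard rolling-wheel parametrisation $x = a(\theta - \sin\theta)$, $y = a(1 - \cos\theta)$ with $y$ measured downwards; the wheel radius $a$, the parameter $\theta$, the end point at $\theta = \pi$, and the midpoint integration rule are assumptions of this sketch.

```python
import math


def cycloid_time(a, theta_end, g=9.81, n=10000):
    """Descent time from rest along the cycloid, integrating dt = ds / v (midpoint rule)."""
    h = theta_end / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        # |d(x, y)/d(theta)| for x = a (theta - sin theta), y = a (1 - cos theta)
        ds_dtheta = a * math.sqrt((1.0 - math.cos(theta)) ** 2 + math.sin(theta) ** 2)
        v = math.sqrt(2.0 * g * a * (1.0 - math.cos(theta)))   # speed from Eq. (4.9)
        total += ds_dtheta / v * h
    return total


def straight_line_time(x_end, y_end, g=9.81):
    """Descent time from rest along the straight line from (0, 0) to (x_end, y_end)."""
    length = math.hypot(x_end, y_end)
    # Constant acceleration g * y_end / length along the incline.
    return math.sqrt(2.0 * length**2 / (g * y_end))


a = 1.0                                   # wheel radius in metres (test value)
x_end, y_end = a * math.pi, 2.0 * a       # end point reached at theta = pi
t_cycloid = cycloid_time(a, math.pi)
t_line = straight_line_time(x_end, y_end)
print(t_cycloid, t_line)
```

For this end point the cycloid time comes out shorter than the straight-line time, as the heuristic argument at the start of the section suggests.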

4.3 Fermat’s principle

The path traversed by a ray of light in a medium with a varying refractive index is not a straight line. According to Fermat's principle this path is determined by the requirement that the light ray follows the path which allows it to go from one point to another in the shortest possible time. The time needed to traverse a path is determined by the speed of light along that path, and this quantity is given as

\[ c(n) = \frac{c}{n} \tag{4.17} \]

where $c$ is the speed of light in vacuum and $n$ is the refractive index. The latter might vary with position.

If the path lies in the $xy$ plane, the path length $dl$ of a segment corresponding to a distance $dx$ along the $x$-axis is given by

\[ dl = dx \sqrt{1 + \left( \frac{dy}{dx} \right)^2}. \tag{4.18} \]

The time $dt$ needed to traverse the path $dl$ is given as:

\[ dt = \frac{dl}{c/n}, \tag{4.19} \]

so that the total time can now be given as an integral over $dx$:

\[ t = \int_0^L dx\, \frac{n(y)}{c} \sqrt{1 + \left( \frac{dy}{dx} \right)^2}. \tag{4.20} \]


Here we have assumed that $n$ depends on the coordinate $y$ only. Now take $n(y) = \sqrt{1 + y^2}$; then we must minimise

\[ c t = \int_0^L dx\, \sqrt{1 + y^2} \sqrt{1 + \left( \frac{dy}{dx} \right)^2}. \tag{4.21} \]

For this case, the Euler-Lagrange equations reduce to the equation [see Eq. (4.5)]

\[ \frac{dy}{dx} = \frac{1}{A} \sqrt{(1 - A^2) + y^2}. \tag{4.22} \]

The solution is given as

\[ y(x) = \pm \sqrt{1 - A^2}\, \sinh\left( \frac{x}{A} + B \right). \tag{4.23} \]

The possible range of $A$-values is $|A| \le 1$.

4.4 The minimal area problem

Consider a soap film which is suspended between two parallel hoops (see figure). The soap film has a finite surface tension, which means that its energy scales linearly with its area. As the film tends to minimise its energy, it minimises its area. The area of a surface of revolution described by a function $y(x)$ is, apart from a constant factor $2\pi$, given by:

\[ \int_0^L dx\, y \sqrt{1 + y'^2}. \tag{4.24} \]

Minimising this functional of $y$ leads to the standard Euler-Lagrange solution Eq. (4.5) for functionals with no explicit $x$-dependence:

\[ \frac{y}{\sqrt{1 + y'^2}} = C. \tag{4.25} \]

The solution to this equation is given by

\[ y(x) = C \cosh\left( \frac{x + A}{C} \right). \tag{4.26} \]

We now assume that the hoops have the same diameter. Let us furthermore choose the $x$-axis such that the origin is in the middle between the two hoops. Using the fact that cosh is an even function, we have:

\[ R = C \cosh\left( \frac{L}{2C} \right), \tag{4.27} \]

where $R$ is the radius of the hoops. Consider now the graph of $C \cosh[L/(2C)]$ as a function of $C$ for fixed $L$:

[Figure: graph of $x \cosh(0.5/x)$ for $x$ between $-4$ and $4$.]

It is clear that for $R$ lying in the “gap” of the graph, no solution can be found. What happens is that if the hoops are not too far apart, the soap film will form a nice cosh-formed shape. However, when we pull the hoops apart, there will be a moment at which the film can no longer be sustained and it collapses.

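It can be seen from the graph that usually there are two different solutions. The one with the smallest surface is to be selected. Integrating $2\pi y \sqrt{1 + y'^2}$ over the film and using Eq. (4.27), the surface area is found as

\[ A(y) = \pi \left[ L C + 2R \sqrt{R^2 - C^2} \right]. \tag{4.28} \]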
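The two solutions of Eq. (4.27), and the "gap" in which none exists, can be found numerically. The Python sketch below uses plain bisection on either side of the minimum of $C \cosh[L/(2C)]$, which lies where $u \tanh u = 1$ with $u = L/(2C)$. The function names, the bracketing intervals, and the test values are assumptions of this sketch; the area is evaluated directly from the integral $2\pi \int y \sqrt{1 + y'^2}\, dx$ for the cosh profile.

```python
import math


def hoop_radius(C, L):
    """Right hand side of Eq. (4.27)."""
    return C * math.cosh(L / (2.0 * C))


def surface_area(C, L):
    """Area 2*pi * integral of y*sqrt(1 + y'^2) for y = C cosh(x/C), -L/2 <= x <= L/2."""
    return math.pi * C * (L + C * math.sinh(L / C))


def bisect(func, lo, hi, tol=1e-12):
    """Plain bisection; assumes func(lo) and func(hi) have opposite signs."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def catenoid_constants(R, L):
    """Positive C solving R = C cosh(L/(2C)); empty list when R lies in the gap."""
    u0 = bisect(lambda u: u * math.tanh(u) - 1.0, 0.5, 2.0)
    c_star = L / (2.0 * u0)               # position of the minimum of C cosh(L/(2C))
    if hoop_radius(c_star, L) > R:
        return []
    mismatch = lambda C: hoop_radius(C, L) - R
    small = bisect(mismatch, 0.01 * L, c_star)
    large = bisect(mismatch, c_star, 10.0 * R + L)
    return [small, large]


roots = catenoid_constants(R=1.0, L=1.0)  # test values, assumptions of this sketch
for C in roots:
    print(C, surface_area(C, 1.0))
print(catenoid_constants(R=0.5, L=1.0))   # hoops too small for this separation
```

In this example the solution with the larger $C$ has the smaller area and is therefore the one to be selected.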

4.5 Constraints

4.5.1 Constraint forces

In d'Alembert's approach, the forces of constraint are neglected as they are usually of limited physical importance. In some cases, however, it might be useful to know what these forces are. For example, a designer of a monorail would like to know the force which is exerted on that rail by the train in order to certify that the monorail is robust enough. In fact, it is possible to work out within a Lagrangian analysis what the forces of constraint are.

Let us first recall the solution to the following problem:

Find the minimum of the function $f(\mathbf{x})$, where $\mathbf{x} = (x_1, x_2, \ldots, x_N)$, under the conditions $g^{(k)}(\mathbf{x}) = C_k$, where the $C_k$ are constants; $k = 1, \ldots, K$.

Consider a small variation $\delta\mathbf{x}$ such that $g^{(k)}(\mathbf{x} + \delta\mathbf{x}) = C_k$ still holds for all $k$. Then it holds to first order that

\[ g^{(k)}(\mathbf{x}) + \delta\mathbf{x} \cdot \nabla g^{(k)}(\mathbf{x}) = g^{(k)}(\mathbf{x}) = C_k \tag{4.29} \]

hence

\[ \delta\mathbf{x} \cdot \nabla g^{(k)}(\mathbf{x}) = 0 \tag{4.30} \]

for all $k$. On the other hand, for variations $\delta\mathbf{x}$ satisfying (4.30), $f$ should not change to first order along $\delta\mathbf{x}$, so we have

\[ \delta\mathbf{x} \cdot \nabla f(\mathbf{x}) = 0. \tag{4.31} \]

Now we can show that $\nabla f(\mathbf{x})$ must lie in the span of the set $\nabla g^{(k)}(\mathbf{x})$. If it would lie outside the span, we can write it as the sum of a vector lying in the span of $\nabla g^{(k)}(\mathbf{x})$ plus a vector perpendicular to this space. If we take $\delta\mathbf{x}$ to be proportional to the latter, then (4.30) is satisfied, but (4.31) is not. Therefore we conclude that $\nabla f$ can be written as a linear combination of the gradients $\nabla g^{(k)}$:

\[ \nabla f(\mathbf{x}) = \sum_{k=1}^{K} \lambda_k \nabla g^{(k)}(\mathbf{x}). \tag{4.32} \]

This is the well-known Lagrange multiplier theorem.

Let us consider a simple example: finding the minimum or maximum of the function $f(x, y) = xy$ on the unit circle: $g(x, y) = x^2 + y^2 - 1 = 0$. There is only one Lagrange parameter $\lambda$, and Eq. (4.32) for this case reads

\[ (y, x) = \lambda (2x, 2y), \tag{4.33} \]

whose solution is $x = \pm y$, $\lambda = \pm 1/2$. The constraint $x^2 + y^2 = 1$ then fixes the solution to $x = \pm y = 1/\sqrt{2}$ and $x = \pm y = -1/\sqrt{2}$. Indeed, the symmetry of the problem allows only the axes $x = \pm y$ and the lines $x = 0$ or $y = 0$ as the possible solutions, and it is easy to identify the stationary points (minima, maxima, saddle points) among these.
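The example can be checked with a few lines of Python. The sketch below lists the four candidate points, verifies Eq. (4.33) at each of them, and compares the extreme values of $f$ with a brute-force scan along the circle. The function names and the scan resolution are assumptions made for this illustration.

```python
import math


def f(x, y):
    """The function to be extremised."""
    return x * y


def grad_f(x, y):
    return (y, x)


def grad_g(x, y):
    """Gradient of the constraint function g = x^2 + y^2 - 1."""
    return (2.0 * x, 2.0 * y)


def stationary_points():
    """The four candidates x = +-y = +-1/sqrt(2), as tuples (x, y, lambda, f)."""
    s = 1.0 / math.sqrt(2.0)
    points = []
    for x in (s, -s):
        for y in (s, -s):
            lam = grad_f(x, y)[0] / grad_g(x, y)[0]   # from y = 2 * lambda * x
            points.append((x, y, lam, f(x, y)))
    return points


points = stationary_points()
for x, y, lam, value in points:
    print(x, y, lam, value)

# Brute-force scan of f along the unit circle, for comparison
angles = [2.0 * math.pi * i / 3600 for i in range(3600)]
scan_max = max(f(math.cos(t), math.sin(t)) for t in angles)
scan_min = min(f(math.cos(t), math.sin(t)) for t in angles)
print(scan_max, scan_min)
```

The maxima lie on the axis $x = y$ and the minima on the axis $x = -y$, and in this particular example the value of $f$ at each stationary point coincides with $\lambda$.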

Now suppose we have a mechanical $N$-particle system without constraints. For such a system we know the Lagrangian $L$. The combined coordinates of the system are represented as a vector $\mathbf{R} = (\mathbf{r}_1, \ldots, \mathbf{r}_N)$. Then we have for any displacement $\delta\mathbf{R}$ that the corresponding change in Lagrangian vanishes:

\[ \sum_{i=1}^{N} \delta\mathbf{r}_i \cdot \frac{\delta L(\mathbf{R}, \dot{\mathbf{R}}, t)}{\delta \mathbf{r}_i} \equiv \delta L(\mathbf{R}, \dot{\mathbf{R}}, t) = 0 \tag{4.34} \]

for all $t$, where we have used the notation of (2.55). Now suppose that there are constraints present of the form

\[ g^{(k)}(\mathbf{R}) = 0. \tag{4.35} \]

The argument used above for ordinary functions can be generalised to show that we should have

\[ \frac{\delta L(\mathbf{R}, \dot{\mathbf{R}}, t)}{\delta \mathbf{R}} = \sum_{k=1}^{K} \lambda_k \nabla_{\mathbf{R}}\, g^{(k)}(\mathbf{R}); \tag{4.36} \]

the reader is invited to verify this. As $L$ is the Lagrangian of a mechanical system without constraints, we know that the left hand side of this equation can be written as $\dot{\mathbf{p}} - \mathbf{F}_A$. The right hand side has the dimension of a force and must therefore coincide with the constraint force.

Let us analyse the simple example of the pendulum once again. Without constraints we have

\[ L = \frac{m}{2} \left( \dot{x}^2 + \dot{y}^2 \right) - m g y. \tag{4.37} \]

The constraint is given by

\[ l^2 = x^2 + y^2. \tag{4.38} \]

So the pendulum equations of motion become

\[ m \ddot{x} = 2\lambda x \tag{4.39a} \]

\[ m \ddot{y} = -m g + 2\lambda y. \tag{4.39b} \]

These equations cannot be solved analytically, as they describe the full pendulum, and not the small angle limit. In the small angle limit the force in the $x$-direction dominates, and therefore $m g/(2\lambda)$ should be approximately equal to $y \approx -l$. We then see that $\lambda$ is negative, so that the solution is oscillatory and the frequency is given by $\omega = \left| \sqrt{2\lambda/m} \right| = \sqrt{g/l}$. The $\lambda$-dependent terms in the equation of motion represent indeed a force in the $+y$ direction of magnitude $m g$: this is the tension in the string or rod on which the weight is suspended.

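When using polar coordinates, with $\varphi$ measured from the downward vertical, we have

\[ L = m \left[ \frac{\dot{r}^2}{2} + \frac{(r \dot{\varphi})^2}{2} \right] + m g r \cos\varphi, \tag{4.40} \]

with the constraint $r = l$, which leads to the Lagrange equations:

\[ m \ddot{r} = m g \cos\varphi + m r \dot{\varphi}^2 - \lambda; \tag{4.41} \]

\[ m \left( r^2 \ddot{\varphi} + 2 r \dot{r} \dot{\varphi} \right) = -m g r \sin\varphi. \tag{4.42} \]

Filling in the constraint is particularly easy. The constraint force is given by

\[ \lambda = m g \cos\varphi + m r \dot{\varphi}^2. \tag{4.43} \]

The constraint force consists of a term which compensates for the gravity force (first term) and an extra term which is necessary for keeping the circular motion going (a centripetal force, the second term). The equation for $\varphi$ reduces to the usual pendulum equation when the constraint is used:

\[ \ddot{\varphi} = -\frac{g}{l} \sin\varphi. \tag{4.44} \]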
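The constraint force (4.43) can be followed along the motion. The Python sketch below integrates the pendulum equation (4.44) with a leapfrog scheme and compares $\lambda = m g \cos\varphi + m l \dot{\varphi}^2$ with the closed form $m g (3\cos\varphi - 2\cos\varphi_0)$, which follows from energy conservation for a pendulum released from rest at $\varphi_0$. The parameter values, the integration scheme, and the function names are assumptions of this sketch.

```python
import math


def constraint_force_closed(phi, phi0, m=1.0, g=9.81):
    """Eq. (4.43) with l * phidot^2 = 2 g (cos(phi) - cos(phi0)) from energy conservation."""
    return m * g * (3.0 * math.cos(phi) - 2.0 * math.cos(phi0))


def simulate(phi0, l=1.0, g=9.81, dt=1e-4, steps=3000):
    """Leapfrog integration of Eq. (4.44), starting at rest at the angle phi0."""
    phi, omega = phi0, 0.0
    for _ in range(steps):
        omega -= 0.5 * dt * (g / l) * math.sin(phi)
        phi += dt * omega
        omega -= 0.5 * dt * (g / l) * math.sin(phi)
    return phi, omega


m, l, g = 1.0, 1.0, 9.81        # test values (assumptions of this sketch)
phi0 = 1.0                      # release angle in radians
phi, omega = simulate(phi0, l=l, g=g)
force_numeric = m * g * math.cos(phi) + m * l * omega**2   # Eq. (4.43) with r = l
force_closed = constraint_force_closed(phi, phi0, m=m, g=g)
print(force_numeric, force_closed)
```

At the release point the closed form reduces to $m g \cos\varphi_0$, and at the lowest point it gives $m g (3 - 2\cos\varphi_0)$, which exceeds the weight because of the centripetal term.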

In practice, constraints are seldom used explicitly in the solution of mechanical problems.

4.5.2 Global constraints

In (4.5.1) we have analysed constraints of the form:

g(k)(r1, r2, . . . , rN; t) = 0. (4.45)

This type of constraint is called holonomic, and it frequently allows us to represent the system using generalised coordinates. This type of constraint imposes conditions on the system which should hold at any moment in time. We may therefore consider this type of constraint as an infinite set (one constraint for each time). Such constraints are called local, where this term refers to the fact that the constraint is local in time.

Figure 4.1: Example of a function $\delta y(x)$ which is nonzero only near $x_1$ and $x_2$.

Sometimes, however, we must deal with constraints of a different form. Consider for example the problem of finding the shape of a chain of homogeneous density $\rho$ (= mass/unit length) suspended at its two end points. We represent this shape by a function $y(x)$, where $x$ is the coordinate along the line connecting the two end points and $y(x)$ is the height of the chain for coordinate $x$. The shape is determined by the condition that it minimises the (gravitational) potential energy, and it is readily seen that this energy is given by the functional

$$V = g\rho \int_0^X dx\; y\,\sqrt{1 + \left(\frac{dy}{dx}\right)^2}. \tag{4.46}$$

We leave out the constants $g$ and $\rho$ in the following as they do not affect the shape.

If we were to minimise the potential energy (4.46) as it stands, we would get divergences, as we have not yet required the total length $L$ of the chain to be fixed. This requirement can be formulated as

$$L = \int_0^X dx\; \sqrt{1 + \left(\frac{dy}{dx}\right)^2}. \tag{4.47}$$

This is a constraint which is not holonomic, and there is no way to reduce the number of degrees of freedom. This type of constraint is called global, as it is formulated as a condition on an integral of the same type as the functional to be minimised, and is not to be satisfied for all values of the integration variable. Therefore, we must generalise the derivation of the Euler equations to cases in which a functional constraint is present.

Let us consider two functionals, $J$ and $K$:

$$J = \int_a^b dx\; F(y, y', x) \tag{4.48a}$$

$$K = \int_a^b dx\; G(y, y', x), \tag{4.48b}$$

and suppose we want to minimise $J$ under the condition that $K$ has a given value, i.e., for each variation $\delta y$ which satisfies

$$\int \delta G(y, y', x)\, dx = 0 \tag{4.49}$$

we require that

$$\int \delta F(y, y', x)\, dx = 0. \tag{4.50}$$

Consider now a particular variation which is nonzero only in a small neighbourhood of two values $x_1$ and $x_2$ (see figure 4.1). If the areas under these two humps are $A_1$ and $A_2$ respectively, we have

$$\int_a^b \delta G(y, y', x)\, dx = A_1 \frac{\delta G[y(x_1), y'(x_1), x_1]}{\delta y} + A_2 \frac{\delta G[y(x_2), y'(x_2), x_2]}{\delta y}, \tag{4.51}$$

which must vanish according to (4.49), and therefore

$$\frac{\delta G_1/\delta y}{\delta G_2/\delta y} = -\frac{A_2}{A_1} \tag{4.52}$$

with an obvious shorthand notation.

Applying this argument once again, we see that for variations satisfying requirement (4.52) we should have

$$\frac{\delta F_1/\delta y}{\delta F_2/\delta y} = -\frac{A_2}{A_1}. \tag{4.53}$$

But this can only be true for arbitrary $x_1$ and $x_2$ when $\delta F/\delta y$ and $\delta G/\delta y$ are proportional:

$$\frac{\delta F}{\delta y} = \lambda \frac{\delta G}{\delta y}. \tag{4.54}$$

Therefore, we must solve the Euler equations for the combined functional

$$J(y) - \lambda K(y) \tag{4.55}$$

where $\lambda$ is fixed by putting the solution of this minimisation back into the constraint. This is the Lagrange multiplier theorem for functionals.

Figure 4.2: The cosh solution to the suspended chain problem. The end points are at $(0,0)$ and $(X,0)$.

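We shall now apply this to the suspended chain problem. We have $F = y\sqrt{1 + y'^2}$ and $G = \sqrt{1 + y'^2}$. As neither depends explicitly on $x$, the Euler equations can be integrated once. Writing the multiplier as $-\lambda$ (its sign is immaterial, as its value is still to be determined), they read:

$$(y + \lambda)\left(\sqrt{1 + y'^2} - \frac{y'^2}{\sqrt{1 + y'^2}}\right) = \text{Constant}, \tag{4.56}$$

which leads to

$$y + \lambda = C\sqrt{1 + y'^2}. \tag{4.57}$$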

The solution is given by

$$y(x) = A\cosh[\alpha(x - x_0)] + B \tag{4.58}$$

with

$$A = C = 1/\alpha \tag{4.59a}$$

$$B = -\lambda. \tag{4.59b}$$

Boundary conditions are $y(0) = y(X) = 0$ and the length of the chain must be equal to $L$. These conditions fix $x_0$, $\lambda$ and $C$: $x_0 = X/2$, $\lambda = C\cosh[X/(2C)]$ and $L = 2C\sinh[X/(2C)]$, where the last relation determines $C$ implicitly. In figure 4.2 the solution is shown.
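The length condition is a transcendental equation for $C$, so in practice it has to be solved numerically. The sketch below is an illustration under stated assumptions, not part of the original notes: it assumes the length condition in the form $L = 2C\sinh[X/(2C)]$, which follows from integrating $\sqrt{1+y'^2} = \cosh[(x - x_0)/C]$ over the span, and solves it by bisection. The helper names `catenary_parameters` and `chain_height` are hypothetical.

```python
import math

def catenary_parameters(X, L, tol=1e-12):
    """Solve L = 2 C sinh(X / (2 C)) for C by bisection.

    Returns (C, lam, x0) for the solution y(x) = C cosh((x - x0)/C) - lam
    with y(0) = y(X) = 0.
    """
    if L <= X:
        raise ValueError("chain length L must exceed the span X")

    def f(C):
        return 2.0 * C * math.sinh(X / (2.0 * C)) - L

    # f decreases monotonically from +infinity (C -> 0) to X - L < 0 (C -> infinity)
    lo = hi = X
    while f(lo) <= 0.0:
        lo /= 2.0
    while f(hi) >= 0.0:
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    C = 0.5 * (lo + hi)
    lam = C * math.cosh(X / (2.0 * C))   # from the boundary condition y(0) = 0
    return C, lam, X / 2.0

def chain_height(x, X, L):
    """Height y(x) of the chain for span X and length L."""
    C, lam, x0 = catenary_parameters(X, L)
    return C * math.cosh((x - x0) / C) - lam

C, lam, x0 = catenary_parameters(X=1.0, L=1.5)
lowest_point = chain_height(0.5, X=1.0, L=1.5)
```

The lowest point of the chain lies at the midpoint $x = X/2$ and is negative, since the chain hangs below its suspension points.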


5 From classical to quantum mechanics

In the first few chapters we have considered classical problems, in particular the variational formulation of classical mechanics, in the formulations of Hamilton and Lagrange. In this chapter, we look at quantum mechanics. In the first section, we introduce quantum mechanics by formulating the postulates on which the quantum theory is based. Later on, we shall try to establish the link between classical mechanics and quantum mechanics, via Poisson brackets and via the path integral.

5.1 The postulates of quantum mechanics

When we consider classical mechanics, we start from Newton's laws and derive the behaviour of moving bodies subject to forces from these laws. This is a nice approach, as we always like to see a structured presentation of the world surrounding us. However, in reality, people thought about motion and forces for thousands of years before Newton's compact formulation of the underlying principles was found. It is not justified to forget this and to pretend that physics only consists of understanding and predicting phenomena from a limited set of laws. The 'dirty' process of walking in the dark and trying to find a comprehensive formulation of the phenomena under consideration is an essential part of physics.

This also holds for quantum mechanics, although it was developed in a substantially shorter amount of time than classical mechanics. In fact, quantum mechanics started at the beginning of the twentieth century, and its formulation was more or less complete around 1930. This formulation consisted of a set of postulates which, however, do not have a canonical form similar to Newton's laws: most books have their own version of these postulates and even their number varies.

We now present a particular formulation of these postulates.

1. The state of a physical system at any time $t$ is given by the wavefunction of the system at that time. This wavefunction is an element of the Hilbert space of the system. The evolution of the system in time is determined by the Schrödinger equation:

$$i\hbar\frac{\partial}{\partial t}|\psi(t)\rangle = \hat{H}|\psi(t)\rangle.$$

Here $\hat{H}$ is a Hermitian operator: the Hamiltonian.

2. Any physical quantity $Q$ is represented by a Hermitian operator $\hat{Q}$. When we perform a measurement of the quantity $Q$, we will always find one of the eigenvalues of the operator $\hat{Q}$. For a system in the state $|\psi(t)\rangle$, the probability of finding a particular eigenvalue $\lambda_i$, with an associated eigenvector $|\phi_i\rangle$ of $\hat{Q}$, is given by

$$P_i = \frac{|\langle\phi_i|\psi(t)\rangle|^2}{\langle\psi(t)|\psi(t)\rangle\,\langle\phi_i|\phi_i\rangle}.$$

Immediately after the measurement, the system will find itself in the state $|\phi_i\rangle$ corresponding to the value $\lambda_i$ which was found in the measurement of $Q$.

Several remarks can be made.

1. The wavefunction contains the maximum amount of information we can have about the system. In practice, we often do not know the wavefunction of the system.

2. Note that the eigenvectors $|\phi_i\rangle$ always form a basis of the Hilbert space of the system under consideration. This implies that the state $|\psi(t)\rangle$ of the system before the measurement can always be written in the form

$$|\psi(t)\rangle = \sum_i c_i |\phi_i\rangle.$$

For normalised eigenvectors $|\phi_i\rangle$, the probability to find in a measurement the value $\lambda_i$ is therefore given by

$$P_i = \frac{|c_i|^2}{\sum_j |c_j|^2}.$$

If the state $|\psi(t)\rangle$ is normalised too, it holds that

$$\sum_i |c_i|^2 = 1.$$

In that case $P_i = |c_i|^2$.

3. So far we have suggested in our notation that the eigenvalues and eigenvectors form a discrete set. In reality, not only discrete, but also continuous spectra are possible. In those cases, the sums are replaced by integrals.

4. In understanding quantum mechanics, it helps to make a clear distinction between the formalism which describes the evolution of the wavefunction (the Schrödinger equation, postulate 1) and the interpretation scheme. We see that the wavefunction contains the information we need to predict the outcome of measurements, using the measurement postulate (number 2).
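The measurement postulate translates directly into a few lines of linear algebra for a finite-dimensional Hilbert space. The following sketch is illustrative only: the observable (a Pauli-x matrix) and the state are hypothetical choices, the function name `measurement_probabilities` is an assumption, and a nondegenerate spectrum is assumed so that each eigenvalue belongs to a single eigenvector.

```python
import numpy as np

def measurement_probabilities(Q, psi):
    """Eigenvalues of a Hermitian matrix Q and the probabilities of measuring
    them in the (not necessarily normalised) state psi."""
    eigvals, eigvecs = np.linalg.eigh(Q)   # columns are orthonormal eigenvectors |phi_i>
    c = eigvecs.conj().T @ psi             # expansion coefficients c_i = <phi_i|psi>
    probs = np.abs(c) ** 2 / np.vdot(psi, psi).real
    return eigvals, probs

# Hypothetical example: a two-level system; the state is deliberately not normalised
Q = np.array([[0, 1], [1, 0]], dtype=complex)
psi = np.array([2.0, 0.0], dtype=complex)
eigvals, probs = measurement_probabilities(Q, psi)
```

Because `np.linalg.eigh` returns normalised eigenvectors, the factor $\langle\phi_i|\phi_i\rangle$ in the postulate equals one here, and only the norm of $|\psi\rangle$ has to be divided out.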

It now seems that we have arrived at a formulation of quantum mechanics which is similar to that of classical mechanics: a limited set of laws (prescriptions) from which everything can be derived, provided we know the form of the Hamiltonian (this is analogous to the situation in classical mechanics, where Newton's laws do not tell us what the form of the forces is).

However, there is an important difference: the classical laws of motion can be understood by using our everyday life experience, so that we have some intuition for their meaning and content. In quantum mechanics, however, our laws are formulated as mathematical statements concerning objects (vectors and operators) for which we do not have a natural intuition. This is the reason why quantum mechanics is so difficult in the beginning (although its mathematical structure as such is rather simple). You should not despair when quantum mechanics seems difficult: many people find it difficult, and the role of the measurement is still the object of intensive debate. Sometimes you must switch your intuition off and use the rules of linear algebra to solve problems.

Above, we have mentioned that quantum mechanics does not prescribe the form of the Hamiltonian. In fact, although the Schrödinger equation, quite unlike the classical equation of motion, is a linear equation, which allows us to make ample use of linear algebra knowledge, the structure of quantum mechanics is richer than that of classical mechanics because in principle any type of Hilbert space could occur in Nature. In classical mechanics, the space containing all possible states of a system is essentially a $6N$-dimensional space (for an $N$-body system we have $3N$ space and $3N$ momentum coordinates). In quantum mechanics, wavefunctions can be part of infinite-dimensional spaces (like the wavefunctions of a particle moving along a one-dimensional axis) but they can also lie in a finite-dimensional space (for example spin, which has no classical analogue).

5.2 Relation with classical mechanics

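In order to see whether we can guess the structure of the Hamiltonian for systems which have a classical analogue, we consider the time evolution of a physical quantity $Q$. We assume that $Q$ does not depend on time explicitly. However, the expectation value of $Q$ may vary in time due to the change of the wavefunction in the course of time. For normalised wavefunctions:

$$\frac{d}{dt}\langle\psi(t)|\hat{Q}|\psi(t)\rangle = \left(\frac{\partial}{\partial t}\langle\psi(t)|\right)\hat{Q}|\psi(t)\rangle + \langle\psi(t)|\hat{Q}\left(\frac{\partial}{\partial t}|\psi(t)\rangle\right).$$

Using the Schrödinger equation and its Hermitian conjugate:

$$-i\hbar\frac{\partial}{\partial t}\langle\psi(t)| = \langle\psi(t)|\hat{H},$$

(note the minus sign on the left hand side, which results from the Hermitian conjugation) we obtain

$$i\hbar\frac{d}{dt}\langle\psi(t)|\hat{Q}|\psi(t)\rangle = \langle\psi(t)|\hat{Q}\hat{H} - \hat{H}\hat{Q}|\psi(t)\rangle = \langle\psi(t)|[\hat{Q}, \hat{H}]|\psi(t)\rangle,$$

where $[\hat{Q}, \hat{H}]$ is the commutator. We see that the time derivative of the expectation value of $\hat{Q}$ is related to the commutator between $\hat{Q}$ and $\hat{H}$. This should wake you up or ring a bell. In the exercises, we have seen that for any function $f(q_j, p_j)$ of the coordinates $q_j$ and momenta $p_j$, the time derivative is given by

$$\frac{df}{dt} = \sum_j \left(\frac{\partial f}{\partial q_j}\frac{\partial H}{\partial p_j} - \frac{\partial f}{\partial p_j}\frac{\partial H}{\partial q_j}\right) \equiv \{f, H\}.$$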

We see that this equation is very similar to that obtained above for the time derivative of the expectation value of the operator $\hat{Q}$! The differences consist of replacing the Poisson bracket by the commutator and adding a factor $i\hbar$. It seems that classical and quantum mechanics are not that different after all. Could this perhaps be a guide to formulating quantum mechanics for systems for which we already have a classical version? This turns out to be the case.
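The relation $i\hbar\, d\langle\hat{Q}\rangle/dt = \langle[\hat{Q},\hat{H}]\rangle$ can be checked numerically for a finite-dimensional system. The sketch below is an illustration, not taken from the notes: it uses randomly generated Hermitian matrices for $\hat{H}$ and $\hat{Q}$ in units with $\hbar = 1$ (an assumption) and compares a central finite difference of $\langle\hat{Q}\rangle$ with the commutator expression. All names in it are hypothetical.

```python
import numpy as np

hbar = 1.0  # units with hbar = 1 (an assumption of this sketch)
rng = np.random.default_rng(0)

def random_hermitian(n):
    """Random n x n Hermitian matrix, used as a stand-in for H and Q."""
    M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return 0.5 * (M + M.conj().T)

def evolve(H, psi0, t):
    """Return exp(-i H t / hbar) psi0 using the eigendecomposition of H."""
    E, V = np.linalg.eigh(H)
    return V @ (np.exp(-1j * E * t / hbar) * (V.conj().T @ psi0))

def expectation(Q, psi):
    return np.vdot(psi, Q @ psi).real

H = random_hermitian(4)
Q = random_hermitian(4)
psi0 = rng.normal(size=4) + 1j * rng.normal(size=4)
psi0 /= np.linalg.norm(psi0)

t, dt = 0.3, 1e-5
# Left hand side: d<Q>/dt from a central finite difference
lhs = (expectation(Q, evolve(H, psi0, t + dt))
       - expectation(Q, evolve(H, psi0, t - dt))) / (2 * dt)
# Right hand side: <[Q, H]> / (i hbar)
psi_t = evolve(H, psi0, t)
rhs = (np.vdot(psi_t, (Q @ H - H @ Q) @ psi_t) / (1j * hbar)).real
```

The two numbers `lhs` and `rhs` should agree up to the finite-difference error. Since the commutator of two Hermitian matrices is anti-Hermitian, its expectation value is purely imaginary, and the division by $i\hbar$ yields a real rate of change.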

As an example, we start by considering a one-dimensional system for which the relevant classical observables are the position $x$ and the momentum $p$. Classically, we have

$$\{x, p\} = \frac{\partial x}{\partial x}\frac{\partial p}{\partial p} - \frac{\partial x}{\partial p}\frac{\partial p}{\partial x} = 1.$$

The second term in the expression vanishes because $x$ and $p$ are to be considered as independent coordinates. From this, we may guess the quantum version of this relation:

$$[\hat{x}, \hat{p}] = i\hbar,$$

which should sound familiar (if it does not, return to the second year quantum mechanics course). It seems that our recipe of making quantum mechanics out of classical mechanics makes sense! Therefore we can now state the following rule:

If the Hamiltonian of some classical system is known, we can use the same form in quantum mechanics, taking into account the fact that the coordinates $q_j$ and $p_j$ become Hermitian operators and that their commutation relations are:

$$[q_j, q_k] = 0; \quad [p_j, p_k] = 0; \quad [q_j, p_k] = i\hbar\,\delta_{jk}.$$

You can verify these extended commutation relations easily by working out the corresponding classical Poisson brackets.

In the second year, you have learned that

$$\hat{p} = \frac{\hbar}{i}\frac{d}{dx}.$$

What about this relation? It has not been mentioned here so far. The striking message is that this relation can be derived from the commutation relation. In order to show this, we must discuss another object you might have missed too: the wavefunction written in the form $\psi(\mathbf{r})$ (for a particle in 3D). It is important to study the relation between this and the state $|\psi\rangle$. Consider a vector $\mathbf{a}$ in two dimensions. This vector can be represented by two numbers $a_1$ and $a_2$, which are the components of the vector $\mathbf{a}$. However, the actual values of the components depend on how we have chosen our basis vectors. The vector $\mathbf{a}$ is an arrow in a two-dimensional space. In that space, $\mathbf{a}$ has a particular length and a particular orientation. By changing the basis vectors, we do not change the object $\mathbf{a}$, but we do change the numbers $a_1$ and $a_2$.

In the case of the Hilbert space of a one-dimensional particle, we can use as basis vectors the states in which the particle is localised at a particular position $x$. We call these states $|x\rangle$. They are eigenvectors of the position operator $\hat{x}$ with eigenvalue $x$:

$$\hat{x}|x\rangle = x|x\rangle.$$

The states $|x\rangle$ are properly normalised:

$$\langle x|x'\rangle = \delta(x - x'),$$

where $\delta(x - x')$ is the Dirac delta-function. We can now define $\psi(x)$:

$$\psi(x) = \langle x|\psi\rangle,$$

that is, the $\psi(x)$ are the 'components' of the 'vector' $|\psi\rangle$ with respect to the basis $|x\rangle$. For three dimensions, we have a wavefunction which is expressed with respect to the basis $|\mathbf{r}\rangle$.

In order to derive the representation of the momentum operator, $\hat{p} = \frac{\hbar}{i}\frac{d}{dx}$, we first calculate the matrix element of the commutator:

$$\langle x|[\hat{x}, \hat{p}]|x'\rangle = \langle x|\hat{x}\hat{p} - \hat{p}\hat{x}|x'\rangle = (x - x')\langle x|\hat{p}|x'\rangle.$$

The last expression is obtained by having $\hat{x}$ act on the bra-vector $\langle x|$ on its left in the first term, and on the ket $|x'\rangle$ on its right in the second term.

On the other hand, using the commutation relation, we know that

$$\langle x|[\hat{x}, \hat{p}]|x'\rangle = i\hbar\langle x|x'\rangle = i\hbar\,\delta(x - x').$$

This is an even function of $x - x'$, as interchanging $x$ and $x'$ does not change the matrix element on the right hand side. Since this function is equal to $(x - x')\langle x|\hat{p}|x'\rangle$, we know that $\langle x|\hat{p}|x'\rangle$ must be an odd function of $x - x'$.


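Now we evaluate the matrix element $\langle x|\hat{p}|\psi\rangle$. We recall from linear algebra that, since the $|x\rangle$ are the eigenstates of a Hermitian operator, they form a complete set, that is:

$$I = \int |x\rangle\langle x|\, dx,$$

where $I$ is the unit operator. Then we can write

$$\langle x|\hat{p}|\psi\rangle = \int \langle x|\hat{p}|x'\rangle\langle x'|\psi\rangle\, dx'.$$

Now we perform a Taylor expansion around $x$ in order to rewrite $\langle x'|\psi\rangle$:

$$\langle x'|\psi\rangle = \langle x|\psi\rangle + (x' - x)\frac{d}{dx}\langle x|\psi\rangle + \frac{(x' - x)^2}{2!}\frac{d^2}{dx^2}\langle x|\psi\rangle + \cdots$$

Then we obtain

$$\langle x|\hat{p}|\psi\rangle = \int \langle x|\hat{p}|x'\rangle\left(\langle x|\psi\rangle + (x' - x)\frac{d}{dx}\langle x|\psi\rangle + \frac{(x' - x)^2}{2!}\frac{d^2}{dx^2}\langle x|\psi\rangle + \cdots\right) dx'.$$

The first term in brackets gives zero after integration, as it is multiplied by $\langle x|\hat{p}|x'\rangle$, which is an odd function of $x - x'$. The second term gives

$$\int \langle x|\hat{p}|x'\rangle (x' - x)\, dx'\, \frac{d}{dx}\langle x|\psi\rangle = -i\hbar\frac{d}{dx}\langle x|\psi\rangle,$$

where we have used the relation

$$\langle x|\hat{p}|x'\rangle (x' - x) = -i\hbar\,\delta(x' - x).$$

We use the same relation for the second-order term. But then we obtain a term of the form

$$(x' - x)\,\delta(x' - x)$$

in the integral over $dx'$. This obviously yields zero. The same holds for all higher order terms, so we are left with

$$\langle x|\hat{p}|\psi\rangle = \frac{\hbar}{i}\frac{d}{dx}\langle x|\psi\rangle,$$

which is the required result.

Having obtained this, we can analyse the form of the eigenstates of the momentum operator: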

$$\hat{p}|p\rangle = p|p\rangle.$$

The states $|p\rangle$ can be represented in the basis $|x\rangle$; the components then are $\langle x|p\rangle$. We can find the form of these functions by using the eigenvalue equation and the representation of the momentum operator as a derivative:

$$\langle x|\hat{p}|p\rangle = p\langle x|p\rangle \quad\text{and}\quad \langle x|\hat{p}|p\rangle = \frac{\hbar}{i}\frac{d}{dx}\langle x|p\rangle.$$

The first of these equations expresses the fact that $|p\rangle$ is an eigenstate of the operator $\hat{p}$, and the second one follows directly from the fact that the momentum operator acts as a derivative in the $x$-representation. Combining these two we obtain a simple differential equation

$$\frac{\hbar}{i}\frac{d}{dx}\langle x|p\rangle = p\langle x|p\rangle,$$

with a normalised solution:

$$\langle x|p\rangle = \frac{1}{\sqrt{2\pi\hbar}}\, e^{ipx/\hbar}.$$

This allows us to find any state $\psi$ in the momentum representation, that is, the representation in which we use the states $|p\rangle$ as basis states. Using $\langle p|x\rangle = \langle x|p\rangle^*$:

$$\langle p|\psi\rangle = \int \langle p|x\rangle\langle x|\psi\rangle\, dx = \frac{1}{\sqrt{2\pi\hbar}}\int e^{-ipx/\hbar}\,\psi(x)\, dx.$$

The analysis presented here for a one-dimensional particle can be generalised to three or more dimensions in a natural way.
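As a quick numerical illustration of the momentum representation, one can transform a Gaussian wavefunction on a grid and compare with the known analytic result. This is a sketch under stated assumptions, not part of the notes: units with $\hbar = 1$, the test function $\psi(x) = \pi^{-1/4}e^{-x^2/2}$ (whose transform is $\pi^{-1/4}e^{-p^2/2}$), and a plain Riemann sum on a hypothetical grid.

```python
import numpy as np

hbar = 1.0                           # units with hbar = 1 (assumption)
x = np.linspace(-20.0, 20.0, 4001)   # hypothetical grid, wide enough for the Gaussian
dx = x[1] - x[0]
psi = np.pi ** -0.25 * np.exp(-x ** 2 / 2)   # normalised Gaussian test function

def momentum_amplitude(p):
    """<p|psi> = (2 pi hbar)^(-1/2) * integral of exp(-i p x / hbar) psi(x) dx."""
    integrand = np.exp(-1j * p * x / hbar) * psi
    return np.sum(integrand) * dx / np.sqrt(2 * np.pi * hbar)

amplitude = momentum_amplitude(0.7)
analytic = np.pi ** -0.25 * np.exp(-0.7 ** 2 / 2)
```

For this smooth, rapidly decaying function the simple sum is already very accurate, and `amplitude` should coincide with `analytic` to many digits, with a negligible imaginary part.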

5.3 The path integral: from classical to quantum mechanics

The path integral is a very powerful concept for connecting classical and quantum mechanics. Moreover, this formulation renders the connection between quantum mechanics and statistical mechanics very explicit. We shall restrict ourselves here to a discussion of the path integral in quantum mechanics. The reader is advised to consult the excellent book of Feynman and Hibbs (Quantum Mechanics and Path Integrals, McGraw-Hill, 1965) for more details.

The path integral formulation can be derived from the following heuristics:

• A point particle which moves with momentum $\mathbf{p}$ at energy $E$ can also be viewed as a wave with a phase $\varphi$ given by

$$\varphi = \mathbf{k}\cdot\mathbf{r} - \omega t,$$

where $\mathbf{p} = \hbar\mathbf{k}$ and $E = \hbar\omega$.

• For a single path, these phases are additive, i.e. the phases for different segments of the path should be added.

• The probability to find a particle which at $t = t_0$ was at $\mathbf{r}_0$, at position $\mathbf{r}_1$ at time $t = t_1$, is given by the absolute square of the sum of the phase factors $\exp(i\varphi)$ of all possible paths leading from $(\mathbf{r}_0, t_0)$ to $(\mathbf{r}_1, t_1)$:

$$P(\mathbf{r}_0, t_0; \mathbf{r}_1, t_1) = \left|\sum_{\text{all paths}} e^{i\varphi_{\text{path}}}\right|^2.$$

This probability is defined up to a constant which can be fixed by normalisation (i.e. the term within the absolute bars must reduce to a delta-function in $\mathbf{r}_1 - \mathbf{r}_0$ for $t_1 \to t_0$).

These heuristics are the analogue of the Huygens principle in wave optics.

Figure 5.1: A possible path running from an initial position $x_i$ at time $t_i$ to a final position $x_f$ at time $t_f$. The time is divided up into many identical slices.

To analyse the consequences of these heuristics, we chop the time interval between $t_0$ and $t_1$ into many identical time slices (see Fig. 5.1) and consider one such slice. Within this slice we take the path to be linear. To simplify the analysis we consider one-dimensional motion. We first consider the contribution of $kx$ to the phase difference. If the particle moves in a time $\Delta t$ over a distance $\Delta x$, we know that its wave vector is given by

$$k = \frac{mv}{\hbar} = \frac{m\Delta x}{\hbar\Delta t}.$$

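The phase change resulting from the displacement of the particle can therefore be given as

$$\Delta\varphi = k\Delta x = \frac{m\Delta x^2}{\hbar\Delta t}.$$

We still must add the contribution of $\omega\Delta t$ to the phase. Neglecting the potential energy, so that $\hbar\omega = \hbar^2k^2/(2m)$, we obtain

$$\Delta\varphi = \frac{1}{\hbar}\left[\frac{m\Delta x^2}{\Delta t} - \frac{\hbar^2k^2}{2m}\Delta t\right] = \frac{m\Delta x^2}{2\hbar\Delta t}.$$

The potential also enters through the $\omega\Delta t$ term, to give the result:

$$\Delta\varphi = \frac{m\Delta x^2}{2\hbar\Delta t} - \frac{V(x)}{\hbar}\Delta t.$$

For the $x$ occurring in the potential we may choose any value between $x_0$ and $x_1$; the most accurate result is obtained by substituting the mean value.

If we now use the fact that phases are additive, we see that for the entire path the phase is given by

$$\varphi = \frac{1}{\hbar}\sum_j \left\{\frac{m}{2}\left[\frac{x(t_{j+1}) - x(t_j)}{\Delta t}\right]^2 - \frac{V[x(t_j)] + V[x(t_{j+1})]}{2}\right\}\Delta t.$$

This is nothing but the discrete form of the classical action of the path, divided by $\hbar$! Taking the limit $\Delta t \to 0$ we obtain

$$\varphi = \frac{1}{\hbar}\int_{t_0}^{t_1}\left[\frac{m\dot{x}^2}{2} - V(x)\right] dt = \frac{1}{\hbar}\int_{t_0}^{t_1} L(x, \dot{x})\, dt.$$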


We therefore conclude that the probability to go from $\mathbf{r}_0$ at time $t_0$ to $\mathbf{r}_1$ at time $t_1$ is given by

$$P(\mathbf{r}_0, t_0; \mathbf{r}_1, t_1) = \left| N \sum_{\text{all paths}} \exp\left[\frac{i}{\hbar}\int_{t_0}^{t_1} L(x, \dot{x})\, dt\right]\right|^2,$$

where $N$ is the normalisation factor, which for a single time slice reads

$$N = \sqrt{\frac{m}{2\pi i\hbar\Delta t}}.$$

This now is the path integral formulation of quantum mechanics.

Let us spend a moment to study this formulation. First note the large prefactor $1/\hbar$ in front of the exponent. If the phase factor varies when varying the path, this large prefactor will cause the exponential to vary wildly over the unit circle in the complex plane. The joint contribution to the probability will therefore become very small. If on the other hand there is a region in 'path space' where the variation of the phase factor with the path is zero or very small, the phase factors will add up to a significant amount. Such regions are those where the action is stationary, that is, we recover the classical paths as those giving the major contribution to the sum. For $\hbar \to 0$ (the classical case), only the stationary paths remain, whereas for small $\hbar$, small fluctuations around these paths are allowed: these are the quantum fluctuations.
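The stationarity of the action at the classical path can be made concrete with the discretised action derived above. The following is a minimal sketch, with assumptions labelled: a free particle ($V = 0$) travelling from $x = 0$ at $t = 0$ to $x = 1$ at $t = 1$ with $m = 1$, and a hypothetical deformation $\epsilon\sin(\pi t)$ which vanishes at both end points. The function name `discrete_action` is not from the notes.

```python
import numpy as np

def discrete_action(x, dt, m=1.0, V=lambda x: 0.0 * x):
    """Discretised action:
    sum_j [ m/2 ((x_{j+1} - x_j)/dt)^2 - (V(x_j) + V(x_{j+1}))/2 ] dt."""
    x = np.asarray(x, dtype=float)
    kinetic = 0.5 * m * ((x[1:] - x[:-1]) / dt) ** 2
    potential = 0.5 * (V(x[:-1]) + V(x[1:]))
    return float(np.sum((kinetic - potential) * dt))

# Free particle from x = 0 at t = 0 to x = 1 at t = 1 (assumed example)
n = 200
t = np.linspace(0.0, 1.0, n + 1)
dt = t[1] - t[0]
classical = t.copy()            # straight line x(t) = t is the classical path
bump = np.sin(np.pi * t)        # deformation that vanishes at both end points

S_cl = discrete_action(classical, dt)
S_small = discrete_action(classical + 0.01 * bump, dt)
S_large = discrete_action(classical + 0.1 * bump, dt)
ratio = (S_large - S_cl) / (S_small - S_cl)
```

The action of the deformed path exceeds that of the straight line by an amount which grows quadratically with $\epsilon$: there is no term linear in the deformation, which is precisely the statement that the action is stationary at the classical path. Paths with $S - S_{\text{cl}} \ll \hbar$ therefore contribute with nearly the same phase.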

You may not yet recognise how this formulation is related to the Schrödinger equation. On the other hand, we may identify the expression within the absolute signs in the last equation with a matrix element of the time evolution operator, since both have the same meaning:

$$\langle x_1|\hat{U}(t_1 - t_0)|x_0\rangle = \sum_{\text{all paths}} N \exp\left[\frac{i}{\hbar}\int_{t_0}^{t_1} L(x, \dot{x})\, dt\right].$$

This particular form of the time evolution operator is sometimes called the propagator. Let us now evaluate this form of the time evolution operator acting for a small time interval $\Delta t$ on the wavefunction $\psi(x, t)$:

$$\psi(x_1, t_1) = N\int D[x(t)] \int_{-\infty}^{\infty} \exp\left\{\frac{i}{\hbar}\int_{t_0}^{t_1}\left[\frac{m\dot{x}^2(t)}{2} - V[x(t)]\right] dt\right\}\psi(x_0, t_0)\, dx_0.$$

The notation $\int D[x(t)]$ indicates an integral over all possible paths from $(x_0, t_0)$ to $(x_1, t_1)$. We first approximate the integral over time in the same fashion as above, taking $t_1$ very close to $t_0$, and assuming a linear variation of $x(t)$ from $x_0$ to $x_1$:

$$\psi(x_1, t_1) = N\int_{-\infty}^{\infty} \exp\left\{\frac{i}{\hbar}\left[\frac{m(x_1 - x_0)^2}{2\Delta t^2} - \frac{V(x_0) + V(x_1)}{2}\right]\Delta t\right\}\psi(x_0, t_0)\, dx_0.$$

An argument similar to that used above to single out paths close to stationary ones can be used here to argue that the (imaginary) Gaussian factor will force $x_0$ to be very close to $x_1$. The allowed range for $x_0$ is

$$(x_1 - x_0)^2 \lesssim \frac{\hbar\Delta t}{m}.$$

As $\Delta t$ is taken very small, we may expand the exponential with respect to the $V\Delta t$ term:

$$\psi(x_1, t_1) = N\int_{-\infty}^{\infty} \exp\left[\frac{i}{\hbar}\frac{m(x_1 - x_0)^2}{2\Delta t}\right]\left[1 - \frac{i}{\hbar}\frac{V(x_0) + V(x_1)}{2}\Delta t\right]\psi(x_0, t_0)\, dx_0.$$


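As $x_0$ is close to $x_1$ we may approximate $[V(x_0) + V(x_1)]/2$ by $V(x_1)$. We now change the integration variable from $x_0$ to $u = x_0 - x_1$:

$$\psi(x_1, t_1) = N\int_{-\infty}^{\infty} \exp\left(\frac{i}{\hbar}\frac{mu^2}{2\Delta t}\right)\left[1 - \frac{i}{\hbar}V(x_1)\Delta t\right]\psi(x_1 + u, t_0)\, du.$$

As $u$ must be small, we can expand $\psi(x)$ about $x_1$ and obtain

$$\psi(x_1, t_1) = N\int_{-\infty}^{\infty} \exp\left(\frac{i}{\hbar}\frac{mu^2}{2\Delta t}\right)\left[1 - \frac{i}{\hbar}V(x_1)\Delta t\right]\left[\psi(x_1, t_0) + u\frac{\partial}{\partial x}\psi(x_1, t_0) + \frac{u^2}{2}\frac{\partial^2}{\partial x^2}\psi(x_1, t_0)\right] du.$$

Note that the second term in the Taylor expansion of $\psi$ leads to a vanishing integral, as the integrand is an antisymmetric function of $u$. All in all, after evaluating the Gaussian integrals and keeping terms up to first order in $\Delta t$, we are left with

$$\psi(x_1, t_1) = \psi(x_1, t_0) - \frac{i\Delta t}{\hbar}V(x_1)\psi(x_1, t_0) + \frac{i\hbar\Delta t}{2m}\frac{\partial^2}{\partial x^2}\psi(x_1, t_0).$$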
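For completeness, the Gaussian (Fresnel) integrals used in this step can be written out explicitly. The abbreviation $a = m/(2\hbar\Delta t)$ is introduced here only for convenience and does not appear in the derivation above; the integrals are understood with the usual convergence factor, i.e. as the analytic continuation of the real Gaussian integrals.

$$\int_{-\infty}^{\infty} e^{iau^2}\, du = \sqrt{\frac{i\pi}{a}} = \sqrt{\frac{2\pi i\hbar\Delta t}{m}} = \frac{1}{N}, \qquad \int_{-\infty}^{\infty} u\, e^{iau^2}\, du = 0,$$

$$\int_{-\infty}^{\infty} u^2 e^{iau^2}\, du = \frac{i}{2a}\sqrt{\frac{i\pi}{a}} = \frac{i\hbar\Delta t}{m}\,\frac{1}{N}.$$

The zeroth-order term of the Taylor expansion is therefore reproduced with unit weight, which confirms the value of the normalisation factor $N$, and the $u^2/2$ term yields the factor $i\hbar\Delta t/(2m)$ in front of the second derivative. The product of the $V\Delta t$ term with the $u^2$ term is of order $\Delta t^2$ and has been dropped.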

Using

$$\frac{\psi(x_1, t_1) - \psi(x_1, t_0)}{\Delta t} \approx \frac{\partial}{\partial t}\psi(x_1, t_1),$$

we obtain the time dependent Schrödinger equation for a particle moving in one dimension:

$$i\hbar\frac{\partial}{\partial t}\psi(x, t) = \left[-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2} + V(x)\right]\psi(x, t).$$

You may have found this derivation a bit involved. It certainly is not the easiest way to arrive at the Schrödinger equation, but it has two attractive features:

• Everything was derived from simple heuristics which were based on viewing a particle as a wave and allowing for interference of the waves;

• The formulation shows that the classical path is obtained from quantum mechanics when we let $\hbar \to 0$.

5.4 The path integral: from quantum mechanics to classical mechanics

In the previous section we have considered how we can arrive from classical mechanics at the Schrödinger equation. This formalism can be generalised in the sense that for each system for which we can write down a Lagrangian, we have a way to find a quantum formulation in terms of the path integral. Whether a Schrödinger-like equation can be found is not guaranteed: sometimes we run into problems which are beyond the scope of these notes. In this section we assume that we have a system described by some Hamiltonian and show that the time evolution operator has the form of a path integral as found in the previous section.

The starting point is the time evolution operator, or propagator, which, for a time-independent Hamiltonian, takes the form

$$U(\mathbf{r}_f, t_f; \mathbf{r}_i, t_i) = \left\langle \mathbf{r}_f\left| e^{-\frac{i}{\hbar}(t_f - t_i)\hat{H}}\right| \mathbf{r}_i\right\rangle.$$

The matrix element is difficult to evaluate. The reason is that the Hamiltonian which, for a particle in one dimension, takes the form

$$\hat{H} = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + V(x),$$

is the sum of two noncommuting operators. Although it is possible to evaluate the exponentials of the separate terms occurring in the Hamiltonian, the exponential of the sum involves an infinite series of increasingly complicated commutators. For any two noncommuting operators $A$ and $B$ we have

$$e^{A}e^{B} = \exp\left(A + B + \tfrac{1}{2}[A, B] + \tfrac{1}{12}\left([A, [A, B]] + [B, [B, A]]\right) - \tfrac{1}{24}[B, [A, [A, B]]] + \ldots\right).$$

This is the so-called Campbell–Baker–Hausdorff (CBH) formula. The cumbersome commutators occurring on the right can only be neglected if the operators $A$ and $B$ are small in some sense. We can try to arrive at an expression involving small commutators by applying the time slicing procedure of the previous section:

$$e^{-\frac{i}{\hbar}(t_f - t_i)\hat{H}} = e^{-\frac{i}{\hbar}\Delta t\hat{H}}\, e^{-\frac{i}{\hbar}\Delta t\hat{H}}\, e^{-\frac{i}{\hbar}\Delta t\hat{H}} \cdots$$

Note that no CBH commutators occur because $\Delta t\hat{H}$ commutes with itself.

Having this, we can rewrite the propagator as (we omit the hat for operators)

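$$U(x_f, t_f; x_i, t_i) = \int dx_1 \ldots dx_{N-1}\, \langle x_f|e^{-i\Delta tH/\hbar}|x_{N-1}\rangle\langle x_{N-1}|e^{-i\Delta tH/\hbar}|x_{N-2}\rangle \cdots \langle x_1|e^{-i\Delta tH/\hbar}|x_i\rangle.$$

Now that the operators occurring in the exponents can be made arbitrarily small by taking $\Delta t$ very small, we can neglect the commutators and evaluate the matrix elements explicitly:

$$\langle x_j|e^{-i\Delta tH/\hbar}|x_{j+1}\rangle = \langle x_j|e^{-i\Delta t[p^2/(2m) + V(x)]/\hbar}|x_{j+1}\rangle \approx e^{-i\Delta tV(x_j)/\hbar}\,\langle x_j|e^{-i\Delta t\,p^2/(2m\hbar)}|x_{j+1}\rangle.$$

The last matrix element can be evaluated by inserting two unit operators formulated in terms of integrals over the complete sets $|p\rangle$:

$$\langle x|e^{-i\Delta t\,p^2/(2m\hbar)}|x'\rangle = \int\!\!\int dp\, dp'\, \langle x|p\rangle\langle p|e^{-i\Delta t\,p^2/(2m\hbar)}|p'\rangle\langle p'|x'\rangle.$$

We have seen that $\langle x|p\rangle = \exp(ipx/\hbar)/\sqrt{2\pi\hbar}$. Realising that the exponential operator is diagonal in $p$ space, we find, after integrating over $p$:

$$\langle x|e^{-i\Delta t\,p^2/(2m\hbar)}|x'\rangle = \sqrt{\frac{m}{2\pi i\hbar\Delta t}}\, \exp\left[\frac{im(x - x')^2}{2\hbar\Delta t}\right].$$

All in all we have

$$\langle x_j|e^{-i\Delta tH/\hbar}|x_{j+1}\rangle = \sqrt{\frac{m}{2\pi i\hbar\Delta t}}\; e^{-i\Delta tV(x_j)/\hbar}\, \exp\left[\frac{im(x_j - x_{j+1})^2}{2\hbar\Delta t}\right].$$

Note that we have evaluated matrix elements of operators. The result is expressed completely in terms of numbers, and we no longer have to bother about commutation relations. Collecting all terms together we obtain, up to the constant prefactors,

$$U(x_f, t_f; x_i, t_i) = \int dx_1 \ldots dx_{N-1}\, \exp\left\{\frac{i}{\hbar}\sum_{j=0}^{N-1}\left[\frac{m(x_{j+1} - x_j)^2}{2\Delta t^2} - V(x_j)\right]\Delta t\right\},$$

with $x_0 = x_i$ and $x_N = x_f$.

The expression in the exponent is the discrete form of the action, i.e. of the time integral of the Lagrangian; the integral over all intermediate values $x_j$ is the sum over all paths. We have therefore shown that the time evolution operator from $x_i$ to $x_f$ is equivalent to the sum of the phase factors of all possible paths from $x_i$ to $x_f$.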
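The key step of this section, replacing $e^{-i\Delta t(T+V)/\hbar}$ by a product of separate exponentials for small $\Delta t$, can be tested numerically on small matrices. The sketch below is illustrative and rests on assumptions: random Hermitian $5\times 5$ matrices stand in for the kinetic and potential parts, units with $\hbar = 1$ are used, and the names `expm_hermitian` and `trotter_error` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

def random_hermitian(n):
    """Random n x n Hermitian matrix (a stand-in for T or V)."""
    M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return 0.5 * (M + M.conj().T)

def expm_hermitian(H, tau):
    """exp(-i tau H) for Hermitian H, computed from its eigendecomposition."""
    E, U = np.linalg.eigh(H)
    return (U * np.exp(-1j * tau * E)) @ U.conj().T

def trotter_error(T, V, t, n_slices):
    """Norm of the difference between the time-sliced product and exp(-i t (T+V))."""
    dt = t / n_slices
    step = expm_hermitian(V, dt) @ expm_hermitian(T, dt)
    approx = np.linalg.matrix_power(step, n_slices)
    exact = expm_hermitian(T + V, t)
    return np.linalg.norm(approx - exact)

T = random_hermitian(5)
V = random_hermitian(5)
errors = [trotter_error(T, V, 0.2, n) for n in (10, 100, 1000)]
```

Each slice carries an error of order $\Delta t^2\,[T, V]$, and there are $t/\Delta t$ slices, so the total error is expected to decrease roughly in proportion to $1/N$ as the number of slices $N$ grows. This is the sense in which the commutators can be neglected in the limit $\Delta t \to 0$.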


6 Operator methods for the harmonic oscillator

6.1 Introduction

Now that we know the basic formulation of quantum mechanics in terms of postulates, we are ready to treat standard quantum problems. You have already met some wavefunction problems in the second year; they are briefly mentioned in the appendix. In this chapter we consider a completely different approach for finding the energy spectrum and eigenfunctions: the operator method. In the wavefunction, or direct, method one tries to find an explicit form of a wavefunction satisfying the Schrödinger equation in, usually, the spatial representation. Operator methods, however, aim at solving the problem by finding particular operators which satisfy particular commutation relations and in which the Hamiltonian can easily be expressed. By applying the commutation relations and a few general physical criteria, the solution is obtained without using tedious mathematics but at the expense of a somewhat higher level of abstraction.

We shall consider an application to the harmonic oscillator here, and use operator methods to find spectra of angular momentum operators in the next chapter. The harmonic oscillator is of considerable interest in numerous problems. The reason is that systems in nature are often close to the classical ground state, and the potential can then be treated well in a harmonic approximation. Consider for example the hydrogen molecule, which consists of two atoms linked together by a chemical bond with an equilibrium distance $r_0$. We can stretch or contract the bond and it will then act as a spring, which for small deviations from the equilibrium distance is approximately harmonic, as we shall see in chapter 11. The harmonic oscillator also forms the basis of many advanced quantum mechanical field theories, which we shall not go into.

6.2 The harmonic oscillator

Consider the one-dimensional harmonic oscillator. The Schrödinger equation reads

$$-\frac{\hbar^2}{2m}\frac{d^2}{dx^2}\psi(x) + \frac{1}{2}m\omega^2x^2\psi(x) = E\psi(x). \tag{6.1}$$

Here $\omega$ is the frequency of the classical harmonic oscillator. This equation can also be written as:

$$\frac{p^2}{2m}\psi(x) + \frac{1}{2}m\omega^2x^2\psi(x) = E\psi(x) \tag{6.2}$$

where we have used the momentum operator $p \equiv \frac{\hbar}{i}\frac{d}{dx}$. The momentum operator does not commute with the position $x$. We have:

$$[p, x] = \frac{\hbar}{i}. \tag{6.3}$$


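In order to simplify the notation, we scale the momentum and the distance according to:

$$\tilde{p} = \frac{p}{\sqrt{m\hbar\omega}} \tag{6.4a}$$

$$\tilde{x} = \sqrt{\frac{m\omega}{\hbar}}\, x \tag{6.4b}$$

so that $\tilde{p} = \frac{1}{i}\, d/d\tilde{x}$. The Schrödinger equation now assumes the form:

$$\frac{\hbar\omega}{2}\left[\tilde{p}^2 + \tilde{x}^2\right]\psi(\tilde{x}) = E\psi(\tilde{x}) \tag{6.5}$$

or,

$$\frac{\hbar\omega}{2}\left[-\frac{d^2}{d\tilde{x}^2} + \tilde{x}^2\right]\psi(\tilde{x}) = E\psi(\tilde{x}). \tag{6.6}$$

The commutation relation for $\tilde{p}$ and $\tilde{x}$ can be found using (6.3) and we have:

$$[\tilde{p}, \tilde{x}] = -i. \tag{6.7}$$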

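We shall first consider the solution of this problem following the direct method. In order to solve the Schrödinger equation it turns out to be convenient to write $\psi(\tilde{x})$ in the form:

$$\psi(\tilde{x}) = e^{-\tilde{x}^2/2}u(\tilde{x}) \tag{6.8}$$

where a new function $u(\tilde{x})$ has been introduced. Denoting derivatives with respect to $\tilde{x}$ by a prime $'$, we have:

$$\psi'(\tilde{x}) = \left[-\tilde{x}u(\tilde{x}) + u'(\tilde{x})\right]e^{-\tilde{x}^2/2} \tag{6.9}$$

$$\psi''(\tilde{x}) = \left[\tilde{x}^2u(\tilde{x}) - u(\tilde{x}) - 2\tilde{x}u'(\tilde{x}) + u''(\tilde{x})\right]e^{-\tilde{x}^2/2} \tag{6.10}$$

and substituting these expressions in (6.6) we obtain:

$$\frac{\hbar\omega}{2}\left[2\tilde{x}u'(\tilde{x}) + u(\tilde{x}) - u''(\tilde{x})\right] = Eu(\tilde{x}) \tag{6.11}$$

or

$$-u''(\tilde{x}) + 2\tilde{x}u'(\tilde{x}) + u(\tilde{x}) - \frac{2E}{\hbar\omega}u(\tilde{x}) = 0. \tag{6.12}$$

The resulting equation can be analysed by writing $u$ as a power series expansion in $\tilde{x}$:

$$u(\tilde{x}) = \sum_{n=0}^{\infty} c_n\tilde{x}^n. \tag{6.13}$$

Substituting this series into (6.12) leads to

$$\sum_{n=0}^{\infty}\left[-n(n-1)c_n\tilde{x}^{n-2} + 2nc_n\tilde{x}^n + c_n\tilde{x}^n - \frac{2E}{\hbar\omega}c_n\tilde{x}^n\right] = 0. \tag{6.14}$$

Collecting equal powers in this expression and demanding that the resulting coefficient for each power should vanish, we obtain a recursive equation for the $c_n$:

$$c_{n+2} = -\frac{\frac{2E}{\hbar\omega} - 1 - 2n}{(n+2)(n+1)}\, c_n. \tag{6.15}$$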


This power series expansion diverges so strongly for large values of $\tilde{x}$ that it is impossible to normalise the corresponding wavefunction, unless it truncates for a particular value of $n$. Therefore we must require that the numerator in (6.15) vanishes for some $n$, so that $c_{n+2}$ and all subsequent coefficients are zero. This leads to the equation

$$E_n = \hbar\omega(n + 1/2). \tag{6.16}$$

This is the spectrum of the one-dimensional harmonic oscillator: it is equidistant and bounded from below.
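The truncation of the series can be seen directly by iterating the recursion (6.15). The sketch below is illustrative; the function name `series_coefficients` and the dimensionless energy `eps` $= E/(\hbar\omega)$ are notational assumptions introduced for the example.

```python
def series_coefficients(eps, n_max, even=True):
    """Coefficients c_0..c_{n_max} from the recursion (6.15),
    c_{n+2} = (2n + 1 - 2 eps) / ((n + 2)(n + 1)) * c_n,
    where eps = E / (hbar * omega). Only the even or the odd
    series is generated, starting from c_0 = 1 or c_1 = 1."""
    c = [0.0] * (n_max + 1)
    start = 0 if even else 1
    c[start] = 1.0
    for n in range(start, n_max - 1, 2):
        c[n + 2] = (2 * n + 1 - 2 * eps) / ((n + 2) * (n + 1)) * c[n]
    return c

truncating = series_coefficients(eps=2.5, n_max=8)       # eps = n + 1/2 with n = 2
non_truncating = series_coefficients(eps=2.4, n_max=8)   # eps slightly off
```

For `eps = 2.5` the series stops after the $\tilde{x}^2$ term and gives $u(\tilde{x}) = 1 - 2\tilde{x}^2$, which is proportional to the Hermite polynomial $H_2$; for `eps = 2.4` all even coefficients remain nonzero and the series does not terminate.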

The solutions $\psi$ can be written in terms of the solutions $u$ which, for the condition (6.16), are the so-called Hermite polynomials $H_n$:

$$\psi_n(\tilde{x}) = \left(\sqrt{\pi}\, 2^n n!\right)^{-1/2} e^{-\tilde{x}^2/2} H_n(\tilde{x}). \tag{6.17}$$

We now show that the harmonic oscillator problem can also be solved by a different method, in which only commutation relations between operators are used to arrive at the energy spectrum. We define two operators, $a$ and $a^\dagger$, which are each other's Hermitian conjugates:

$$a = \frac{1}{\sqrt2}\,(\tilde x + i\tilde p), \tag{6.18a}$$

$$a^\dagger = \frac{1}{\sqrt2}\,(\tilde x - i\tilde p). \tag{6.18b}$$

That these operators are each other's Hermitian conjugates is easily checked using the fact that both $\tilde x$ and $\tilde p$ are Hermitian.

Using (6.7), it can be verified that

$$[a, a^\dagger] = \frac12\,[\tilde x + i\tilde p,\ \tilde x - i\tilde p] = \frac i2\,[\tilde p, \tilde x] - \frac i2\,[\tilde x, \tilde p] = 1. \tag{6.19}$$

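Furthermore, using Eqs. (6.19) and (6.18), we obtain immediately:

$$H = \frac{\hbar\omega}{2}\left(a^\dagger a + a a^\dagger\right) = \hbar\omega\left(a^\dagger a + \tfrac12\right). \tag{6.20}$$

From this, it is easy to calculate the following commutation relations:

$$[H, a] = \hbar\omega\,[a^\dagger a, a] = \hbar\omega\,[a^\dagger, a]\,a = -\hbar\omega\,a \tag{6.21}$$

and similarly

$$[H, a^\dagger] = \hbar\omega\,a^\dagger. \tag{6.22}$$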

After these preparations, we now consider the eigenvalue problem. Suppose $\psi_E$ is an eigenstate with energy $E$:

$$H\psi_E = E\psi_E. \tag{6.23}$$

We now consider the action of the Hamiltonian on the state $a\psi_E$. Using the commutation relation (6.21):

$$H a\psi_E = a H\psi_E - \hbar\omega\,a\psi_E = a E\psi_E - \hbar\omega\,a\psi_E \tag{6.24}$$

or

$$H(a\psi_E) = (E - \hbar\omega)(a\psi_E), \tag{6.25}$$

and we see that $a\psi_E$ is an eigenstate of $H$ with energy $E - \hbar\omega$!


Similarly we have for $a^\dagger\psi_E$:

$$H a^\dagger\psi_E = a^\dagger H\psi_E + \hbar\omega\,a^\dagger\psi_E = (E + \hbar\omega)(a^\dagger\psi_E), \tag{6.26}$$

that is, $a^\dagger\psi_E$ is an eigenstate with energy $E + \hbar\omega$. We say that $a$ is a "lowering" operator, as it lowers the energy eigenvalue by $\hbar\omega$, and accordingly $a^\dagger$ is called a "raising" operator.

Note that if $\psi_E$ is normalised, $a\psi_E$ and $a^\dagger\psi_E$ need not have this property, as an eigenvector is defined only up to a normalisation constant. We will return to this below.

In order to find the spectrum, we use a physical argument. The spectrum must be bounded from below, as the potential does not assume infinitely negative values. Therefore, if we start with some $\psi_E$ and act successively on it with the lowering operator $a$, we must have at some point:

$$a^n\psi_E = 0 \tag{6.27}$$

because otherwise the spectrum would not be bounded from below. Taking $n$ to be the smallest power for which this happens, let us call $a^{n-1}\psi_E = \psi_0$. Then $a\psi_0 = 0$. Therefore,

$$H\psi_0 = \hbar\omega\left(a^\dagger a + \tfrac12\right)\psi_0 = \tfrac12\hbar\omega\,\psi_0, \tag{6.28}$$

that is, $\psi_0$ is an eigenstate of $H$ with eigenvalue $\hbar\omega/2$. Acting with $a^\dagger$ on $\psi_0$ we obtain an eigenstate $\psi_1$ (up to a constant) with eigenvalue $3\hbar\omega/2$, etc. Acting $n$ times with $a^\dagger$ on $\psi_0$, we obtain an eigenstate $\psi_n$ (up to a constant) with energy $\hbar\omega(n + 1/2)$, in accordance with the result derived above using the direct method.

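Often the operator $a^\dagger a$ is called the number operator, denoted by $N$, and $H$ can now be written as $\hbar\omega(N + 1/2)$. $\psi_n$ is an eigenstate of $N$ with eigenvalue $n$. The norm of $a^\dagger\psi_n$ can be expressed in terms of that of $\psi_n$:

$$\langle a^\dagger\psi_n \,|\, a^\dagger\psi_n\rangle = \langle\psi_n \,|\, a a^\dagger\psi_n\rangle = \langle\psi_n \,|\, (a^\dagger a + 1)\psi_n\rangle = (n+1)\,\langle\psi_n | \psi_n\rangle. \tag{6.29}$$

Therefore, if $\psi_n$ is normalised, $a^\dagger\psi_n/\sqrt{n+1}$ is normalised too, and normalised states $\psi_n$ can be constructed from a normalised state $\psi_0$ according to:

$$\psi_n = \frac{1}{\sqrt{n!}}\,(a^\dagger)^n\,\psi_0. \tag{6.30}$$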
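The operator relations can be made concrete by writing $a$ as a matrix in the basis $\psi_0, \psi_1, \ldots$. The sketch below assumes the matrix elements $a\psi_n = \sqrt{n}\,\psi_{n-1}$, which follow from (6.30) by Hermitian conjugation, and cuts the basis off at `dim` states; the helper names are illustrative only.

```python
import math

def lowering_matrix(dim):
    """Matrix of a in the basis psi_0 .. psi_{dim-1}, with a psi_n = sqrt(n) psi_{n-1}."""
    a = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        a[n - 1][n] = math.sqrt(n)
    return a

def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

dim = 6
a = lowering_matrix(dim)
a_dag = transpose(a)          # a is real, so its Hermitian conjugate is the transpose
number = matmul(a_dag, a)     # N = a^dagger a
a_a_dag = matmul(a, a_dag)
commutator = [[a_a_dag[i][j] - number[i][j] for j in range(dim)] for i in range(dim)]

# Energies in units of hbar*omega: the diagonal of N + 1/2.
energies = [number[n][n] + 0.5 for n in range(dim)]
print(energies)
# Diagonal of [a, a^dagger]: 1 everywhere except in the last entry,
# which is an artefact of cutting the basis off at dim states.
print([commutator[n][n] for n in range(dim)])
```

The exact operators act on an infinite-dimensional space, so the commutation relation (6.19) can only be reproduced by a finite matrix away from the cut-off.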

Using the commutation relations for $a$, $a^\dagger$, it is also possible to show that states belonging to different energy levels are mutually orthogonal:

$$\langle\psi_n | \psi_m\rangle \propto \langle\psi_0 |\, a^n (a^\dagger)^m \,| \psi_0\rangle. \tag{6.31}$$

Moving the $a$'s to the right by application of the commutation relations leads to a form involving the lowering operator $a$ acting on $\psi_0$, which vanishes.

Exercise: show that $\langle\psi_2 | \psi_3\rangle$ indeed vanishes.

We have succeeded in finding the energy spectrum, but it might seem that we have not made any progress in finding the form of the eigenfunctions $\psi_n$. However, we have a simple differential equation defining the ground state $\psi_0$:

$$a\psi_0(\tilde x) = \frac{\sqrt2}{2}\,(\tilde x + i\tilde p)\,\psi_0(\tilde x) = 0 \tag{6.32}$$

or:

$$\left(\tilde x + \frac{d}{d\tilde x}\right)\psi_0(\tilde x) = 0. \tag{6.33}$$


The solution can immediately be found as:

$$\psi_0(\tilde x) = \text{Const.}\; e^{-\tilde x^2/2} \tag{6.34}$$

in accordance with the result obtained in the direct method. The normalisation constant is found as

$$\text{Const.} = \left(\frac{m\omega}{\pi\hbar}\right)^{1/4} \tag{6.35}$$

(check!). Using (6.30) and (6.18b), we can write the solution for general $n$ as:

$$\psi_n(\tilde x) = \left(\frac{m\omega}{\pi\hbar}\right)^{1/4} \frac{1}{\sqrt{n!\,2^n}}\,(\tilde x - i\tilde p)^n\, e^{-\tilde x^2/2}, \tag{6.36}$$

which indeed turns out to be in accordance with the solution found in the direct method, but we shall not go into this any further.


7 Angular momentum

7.1 Spectrum of the angular momentum operators

We have seen that the energy spectrum of the harmonic oscillator is easy to find using creation and annihilation operators. Similar methods can be used to find the eigenvalues of angular momentum operators. We know two such types of operators: the analogue of the classical angular momentum,

$$\mathbf L = \mathbf r \times \mathbf p, \tag{7.1}$$

and the spin $\mathbf S$. These operators satisfy the commutation relations:

$$[J_i, J_j] = i\hbar\,\varepsilon_{ijk} J_k. \tag{7.2}$$

Here $i$, $j$ and $k$ are indices denoting the Cartesian components $x, y, z$, and a sum over the repeated index $k$ is implied. The operator $\mathbf J$ is an angular momentum operator like $\mathbf L$ or $\mathbf S$. $\varepsilon_{ijk}$ is the Levi-Civita tensor: it is $1$ if $ijk$ is an even permutation of 123 (see footnote 1), $-1$ for an odd permutation, and $0$ if two of the indices are equal. In fact, we will call every operator satisfying (7.2) an angular momentum operator.

From the commutation relations (7.2) it can be derived that the components of $\mathbf J$ commute with $J^2$; we can write this symbolically as:

$$[\mathbf J, J^2] = \mathbf 0. \tag{7.3}$$

Exercise: prove this relation.

The operator $J^2$ is positive. This means that for any state $|u\rangle$,

$$\langle u | J^2 | u\rangle \ge 0.$$

Exercise: prove this.

If the Hamiltonian of a physical system commutes with every component of an angular momentum operator $\mathbf J$, the eigenstates can be rearranged to be simultaneous eigenstates of the Hamiltonian, $J^2$ and $J_z$ (it is impossible to include $J_x$ or $J_y$ because they do not commute with $J_z$).

In analogy to the raising and lowering operators for the harmonic oscillator, we define the operators $J_+$ and $J_-$ as follows:

$$J_+ = J_x + iJ_y, \tag{7.4a}$$

$$J_- = J_x - iJ_y. \tag{7.4b}$$

These operators are not Hermitian; they are each other's Hermitian conjugates. They satisfy the following commutation relations:

$$[J_z, J_\pm] = \pm\hbar J_\pm; \tag{7.5a}$$

$$[J_+, J_-] = 2\hbar J_z. \tag{7.5b}$$

Footnote 1: The even permutations of 123 are 123, 231 and 312. The remaining three are the odd permutations.


By definition, we call the eigenvalues of $J^2$ $\hbar^2 j(j+1)$ and those of $J_z$ $\hbar m$. Here $j$ and $m$ are real (i.e. not necessarily integer) numbers which we will have to find. The eigenstates can now be written as $|jm\rangle$, where we have omitted quantum labels associated with other operators, such as the Hamiltonian. Note that we can always take $j \ge 0$ because $J^2$ is a positive operator.

We now show that for an angular momentum eigenstate $|jm\rangle$, $J_\pm|jm\rangle$ is an angular momentum eigenstate too:

$$J_z\left[J_+|jm\rangle\right] = (J_+J_z + \hbar J_+)\,|jm\rangle = \hbar(m+1)\left[J_+|jm\rangle\right] \tag{7.6}$$

and because $J_+$ commutes with $J^2$, we see that $J_+|jm\rangle$ is proportional to an angular momentum eigenstate $|j, m+1\rangle$. Similarly, $J_-|jm\rangle$ is proportional to $|j, m-1\rangle$. Therefore, $J_\pm$ are called raising and lowering operators for the quantum number $m$. This means that, given an eigenstate $|jm\rangle$, we can in principle construct an infinite sequence of eigenstates by acting an arbitrary number of times on it with $J_\pm$. The sequence is finite only if, after acting a finite number of times with either $J_+$ or $J_-$, the new state is zero. The first result we have obtained is that the eigenstates $|jm\rangle$ occur in sequences of states with the same $j$ but with $m$ stepping up and down by 1.

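Suppose $|jm\rangle$ is normalised; then we can calculate the norm of $J_+|jm\rangle$. Using $J_-J_+ = J^2 - J_z^2 - \hbar J_z$ (check!) we have:

$$\langle J_+ jm \,|\, J_+ jm\rangle = \langle jm | J_-J_+ | jm\rangle = \langle jm |\, J^2 - J_z^2 - \hbar J_z \,| jm\rangle = \hbar^2 (j-m)(j+m+1). \tag{7.7}$$

Similarly:

$$\langle J_- jm \,|\, J_- jm\rangle = \langle jm | J_+J_- | jm\rangle = \langle jm |\, J^2 - J_z^2 + \hbar J_z \,| jm\rangle = \hbar^2 (j+m)(j-m+1). \tag{7.8}$$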

Both expressions are norms and must therefore be non-negative, which restricts $m$ to the values

$$-j \le m \le j. \tag{7.9}$$

The only way to restrict $m$ to $|m| \le j$ is for $J_+$, acting a certain number of times on $|jm\rangle$, to yield zero:

$$J_+^p\,|jm\rangle = 0. \tag{7.10}$$

Similarly,

$$J_-^q\,|jm\rangle = 0. \tag{7.11}$$

Taking $p$ to be the smallest power for which (7.10) holds, now consider the state

$$|j, m+p-1\rangle = J_+^{p-1}\,|jm\rangle \tag{7.12}$$

where the equality holds up to a normalisation constant. This is an angular momentum eigenstate since it is obtained by acting $p-1$ times with $J_+$ on an eigenstate $|jm\rangle$. We must have

$$J_+\,|j, m+p-1\rangle = 0 \tag{7.13}$$

which implies that the norm of the resulting state vanishes. By (7.7) it follows that

$$j = m + p - 1 \tag{7.14}$$

(note that the other solution $j = -m-p$ is impossible because $|m| \le j$ and $p > 0$). In a similar fashion we find

$$-j = m - q + 1 \tag{7.15}$$


and combining the last two equations yields:

$$2j = p + q - 2 = \text{integer}. \tag{7.16}$$

Therefore, $j$ is either integer or half-integer and $m$ assumes the values

$$m = -j, -j+1, \ldots, j-1, j. \tag{7.17}$$

In conclusion we have:

The angular momentum states can be labelled $|j, m\rangle$. The numbers $j$ are either integer or half-integer. For a given $j$, the numbers $m$ run through the values

$$-j, -j+1, \ldots, j-1, j. \tag{7.18}$$

From (7.7) and (7.8) we see that from a properly normalised state $|jm\rangle$ we can obtain properly normalised states as follows:

$$|j, m-1\rangle = \frac{1}{\hbar\sqrt{(j+m)(j-m+1)}}\,J_-|jm\rangle, \tag{7.19a}$$

$$|j, m+1\rangle = \frac{1}{\hbar\sqrt{(j-m)(j+m+1)}}\,J_+|jm\rangle. \tag{7.19b}$$

These states are defined up to a phase $e^{i\alpha}$.
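The matrix elements implied by (7.7) and (7.19) can be used to build explicit matrices for $J_z$ and $J_\pm$ and to check the relations (7.5b) and $J^2 = \hbar^2 j(j+1)$ numerically. The following sketch sets $\hbar = 1$ and chooses all phases real and positive; the function names are illustrative and not part of these notes.

```python
import math
from fractions import Fraction

def angular_momentum_matrices(j):
    """Return (Jz, J+, J-) in the basis m = j, j-1, ..., -j, with hbar = 1."""
    j = Fraction(j)
    dim = int(2 * j + 1)
    ms = [j - k for k in range(dim)]
    jz = [[float(ms[r]) if r == c else 0.0 for c in range(dim)] for r in range(dim)]
    jplus = [[0.0] * dim for _ in range(dim)]
    for c in range(1, dim):
        m = ms[c]
        # <j, m+1 | J+ | j, m> = sqrt((j - m)(j + m + 1)), cf. (7.7) and (7.19b)
        jplus[c - 1][c] = math.sqrt((j - m) * (j + m + 1))
    jminus = [list(row) for row in zip(*jplus)]  # real matrix: conjugate = transpose
    return jz, jplus, jminus

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def check(j):
    """Largest deviations from [J+, J-] = 2 Jz and from J^2 = j(j+1) * identity."""
    jz, jp, jm = angular_momentum_matrices(j)
    dim = len(jz)
    pm, mp, zz = matmul(jp, jm), matmul(jm, jp), matmul(jz, jz)
    jj = float(Fraction(j) * (Fraction(j) + 1))
    dev_comm = max(abs(pm[r][c] - mp[r][c] - 2 * jz[r][c])
                   for r in range(dim) for c in range(dim))
    # J^2 = J-J+ + Jz^2 + Jz, the relation used to derive (7.7)
    dev_j2 = max(abs(mp[r][c] + zz[r][c] + jz[r][c] - (jj if r == c else 0.0))
                 for r in range(dim) for c in range(dim))
    return dev_comm, dev_j2

for j in (Fraction(1, 2), 1, Fraction(3, 2), 2):
    print(j, check(j))
```

Both deviations vanish up to rounding for integer and half-integer $j$ alike, which illustrates that the algebra (7.2) by itself does not distinguish between the two cases.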

7.2 Orbital angular momentum

As we have seen, the quantum analogue of the classical angular momentum $\mathbf L = \mathbf r \times \mathbf p$ is an angular momentum operator because it satisfies the commutation relations (7.2). This can be shown using the commutation relation

$$[p, x] = \frac{\hbar}{i}. \tag{7.20}$$

Exercise: Show that (7.2) holds, using the following formulation of the cross product:

$$(\mathbf a \times \mathbf b)_k = \varepsilon_{ijk}\,a_i b_j. \tag{7.21}$$

This type of angular momentum will be called orbital angular momentum since it is expressed in the orbital coordinates of the particle. Another type of angular momentum operator is that representing the spin. We will now find the spectrum of the orbital angular momentum operators for a single particle in three dimensions.

It turns out to be convenient to express $L_z$ in polar coordinates. We will not derive this expression but simply give:

$$L_z = -i\hbar\,\frac{\partial}{\partial\varphi}. \tag{7.22}$$

The fact that it depends only on $\varphi$ is obvious, since $L_z$ is associated with a rotation around the $z$-axis and such a motion is expressed as a variation of the angle $\varphi$. The eigenfunctions of $L^2$, $L_z$ can be written as functions of the angles $\vartheta$ and $\varphi$. We know that these are eigenfunctions of $L_z$ with eigenvalue $\hbar m$. Denoting the eigenfunctions $F(\vartheta, \varphi)$ we have:

$$-i\hbar\,\frac{\partial F(\vartheta, \varphi)}{\partial\varphi} = \hbar m\,F(\vartheta, \varphi). \tag{7.23}$$


This differential equation has the solution

$$F(\vartheta, \varphi) = G(\vartheta)\,e^{im\varphi}. \tag{7.24}$$

The wavefunction of the particle should be single-valued. Hence it should be equal for $\varphi$ and $\varphi + 2\pi$, and this restricts $m$ to integer values. Hence we have:

The orbital angular momentum of a single particle has only integer quantum numbers $j$, $m$.

This result can be generalised to the orbital angular momentum of a system consisting of an arbitrary number of particles. Half-integer values of $j$ can only come about by having particles with half-integer spin.

7.3 Spin

Classically, a charged particle having a nonzero angular momentum has a nonzero magnetic moment. The magnetic moment $\mathbf m$ for a particle of charge $q$ and mass $m$ is given by:

$$\mathbf m = \frac q2\,(\mathbf r \times \mathbf v) = \frac{q}{2m}\,\mathbf L. \tag{7.25}$$

The energy of a magnetic moment in an external magnetic field $\mathbf B$ is given by

$$-\mathbf m \cdot \mathbf B. \tag{7.26}$$

According to the correspondence principle we add this energy as an extra term to the Hamiltonian of a (spinless) electron ($q = -e$):

$$H = H_0 + H_1, \tag{7.27a}$$

$$H_0 = \frac{p^2}{2m} + V(\mathbf r); \tag{7.27b}$$

$$H_1 = \frac{e}{2m}\,\mathbf L \cdot \mathbf B. \tag{7.27c}$$

We take $\mathbf B$ in the $z$-direction. If the potential is spherically symmetric, $V(\mathbf r) = V(r)$, the eigenstates of $H_0$ can be taken as simultaneous eigenstates of $L^2$ and $L_z$. But in that case, they are also eigenstates of $H_1$ and therefore of $H$. For an eigenstate of $H_0$ with energy $E_0$, $H_1$ shifts the energy by an amount

$$\Delta E_1 = \frac{e\hbar}{2m}\,M B = \mu_B M B. \tag{7.28}$$

Here $M$ is the quantum number associated with $L_z$ (the capital letter $M$ is used in order to avoid confusion with the mass $m$). We see that a magnetic field lifts the $L_z$-degeneracy, yielding a splitting of an $l$-level into $2l+1$ sublevels.
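To get a feeling for the size of the shift (7.28), the sketch below evaluates $\mu_B = e\hbar/2m$ and the $2l+1$ shifts for a given field. The numerical constants are standard SI values that are not quoted in these notes, and the function name is illustrative.

```python
# Physical constants in SI units (assumed values, not taken from these notes).
E_CHARGE = 1.602176634e-19      # elementary charge in C
HBAR = 1.054571817e-34          # reduced Planck constant in J s
M_ELECTRON = 9.1093837015e-31   # electron mass in kg

MU_B = E_CHARGE * HBAR / (2 * M_ELECTRON)   # Bohr magneton in J/T

def zeeman_shifts(l, b_field):
    """Energy shifts mu_B * M * B in joule for M = -l, ..., l, following (7.28)."""
    return [MU_B * m * b_field for m in range(-l, l + 1)]

print(MU_B)
print(zeeman_shifts(1, 1.0))    # an l = 1 level in a field of 1 tesla: three sublevels
```

In a field of 1 T neighbouring sublevels are separated by roughly $9 \times 10^{-24}$ J, which is small compared with typical atomic level spacings.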

Zeeman indeed observed a splitting of the levels of atoms in a magnetic field, but these splittings were not in accordance with (7.28). Later, Uhlenbeck and Goudsmit (1925) explained the observed anomaly by assuming the existence of an intrinsic angular momentum variable, i.e. one not associated with the orbital coordinates. This angular momentum was called spin, $\mathbf S$. With the spin there is associated a magnetic moment, which is given by

$$\mathbf m = -\frac{eg}{2m}\,\mathbf S. \tag{7.29}$$


The factor $g$ is very close to 2 and its value can be derived only by using relativistic quantum mechanics. The eigenvalues of the spin operators $S^2$, $S_z$ are by convention $\hbar^2 s(s+1)$ and $\hbar m_s$. $s$ is always 1/2 for an electron and therefore $m_s$ can only assume the values 1/2 and $-1/2$. Therefore, the eigenvalue of $S^2$ is always $3\hbar^2/4$ for an electron. Other particles have been found with spin 0, 1, 3/2 etc.

Writing down the Hamiltonian for an electron with spin, we have:

$$H = H_0 + H_1, \tag{7.30}$$

$$H_0 = \frac{p^2}{2m} + V(\mathbf r); \tag{7.31}$$

$$H_1 = \frac{e}{2m}\,(\mathbf L + 2\mathbf S) \cdot \mathbf B. \tag{7.32}$$

We have however forgotten something. Associating a magnetic moment with the spin and one with the orbital angular momentum, we must also take into account the interaction between these two! For a proper calculation of this interaction we would need relativistic electrodynamics, and therefore we simply quote the result:

$$H_{SO} = \frac{ge}{4\pi\varepsilon_0\,2m^2c^2}\;\mathbf S \cdot \mathbf L\;\frac1r\,\frac{dV(r)}{dr}. \tag{7.33}$$

For the hydrogen atom, with $V(r) = 1/r$ (choosing suitable units), we have:

$$H_{SO} = \frac{ge^2}{8\pi\varepsilon_0\,m^2c^2}\;\mathbf S \cdot \mathbf L\;\frac{1}{r^3}. \tag{7.34}$$

This spin-orbit splitting is observed experimentally.

7.4 Addition of angular momenta

Consider an electron in a hydrogen atom. The electron has orbital angular momentum, characterised by the quantum numbers $l$, $m_l$, and spin quantum numbers $s$, $m_s$. The total angular momentum $\mathbf J$ is the sum of the vector operators $\mathbf L$ and $\mathbf S$:

$$\mathbf J = \mathbf L + \mathbf S. \tag{7.35}$$

What are the possible eigenvalues of $J^2$ and $J_z$? Heuristically, we can approach this problem by adding $\mathbf L$ and $\mathbf S$ as vectors. However, the relative orientation of the two is not arbitrary, as we know that the eigenvalues $j$, $m$ of the resultant operator are quantised. If $\mathbf L$ and $\mathbf S$ are "aligned", we have $j = l + s$, and if they are opposite we have $j = l - s$. This means that $j$ is half-integer and does not differ by more than 1/2 from $l$. We now want to analyse the combination of angular momenta in a more formal way, starting with the problem of adding two spins, $\mathbf S_1$ and $\mathbf S_2$.

Let us first ask ourselves why we would like to know the relation between the two angular momenta to be added and the result. To answer this question, consider a system consisting of two particles with orbital angular momentum zero ($l = 0$, $m_l = 0$) and each with spin 1/2, described by the interaction

$$V(r) = V_1(r) + V_2(r)\,\frac{\mathbf S_1 \cdot \mathbf S_2}{\hbar^2}. \tag{7.36}$$

The second term contains the magnetic interaction between the spins. Neither $S_{1z}$ nor $S_{2z}$ commutes with the second term, and therefore the eigenstates of the Hamiltonian are not simultaneous eigenstates of $S_{1z}$ and $S_{2z}$. To find observables which do commute with the second term, we note that

$$\mathbf S_1 \cdot \mathbf S_2 = \frac12\left(S^2 - S_1^2 - S_2^2\right) \tag{7.37}$$


and this commutes with $S^2$, $S_1^2$, $S_2^2$ and $S_z = S_{1z} + S_{2z}$.

Exercise: Prove these commutation relations.

Therefore, the eigenstates can be labelled by $s_1$, $s_2$ (both 1/2), $s_{\rm tot}$ (to be evaluated) and $m_s$ (the eigenvalue for $S_z$; to be evaluated).

So let us consider the possible eigenvalues of $S^2$ and $S_z$. We start from the states $|s_1, m_1; s_2, m_2\rangle$, where the labels belonging to the two particles are still separated. As $m_1$ and $m_2$ can assume the values 1/2 and $-1/2$ (denoted by $+$ and $-$ respectively), we have four such states. As the values of $s_1$ and $s_2$ are fixed, we can denote the four states simply by $|m_1, m_2\rangle$:

χ1 = |++〉 (7.38a)

χ2 = |+−〉 (7.38b)

χ3 = |−+〉 (7.38c)

χ4 = |−−〉. (7.38d)

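We must find linear combinations of these states which are eigenstates of $S^2$ and $S_z$. It turns out that all four states are already eigenstates of $S_z$:

$$S_z\chi_1 = (S_{1z} + S_{2z})\chi_1 = \hbar\left(\tfrac12 + \tfrac12\right)\chi_1 = \hbar\chi_1 \tag{7.39}$$

and furthermore:

$$S_z\chi_2 = 0, \tag{7.40a}$$

$$S_z\chi_3 = 0, \tag{7.40b}$$

$$S_z\chi_4 = -\hbar\chi_4. \tag{7.40c}$$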

Now consider the state $\chi_2 + \chi_3$. This is certainly not an eigenstate of $S_{1z}$, nor of $S_{2z}$. But it is an eigenstate of $S_z$ with eigenvalue 0. We see therefore that eigenstates of $S_z$ need not be eigenstates of $S_{1z}$ or $S_{2z}$.

Now we try to find eigenstates of $S^2$. It is convenient to write this operator in the following form:

$$S^2 = S_1^2 + S_2^2 + 2S_{1z}S_{2z} + S_{1+}S_{2-} + S_{1-}S_{2+} \tag{7.41}$$

where we have used the raising and lowering operators:

$$S_{1\pm} = S_{1x} \pm iS_{1y} \tag{7.42}$$

etc. These operators have the usual effect when acting on our states:

$$S_{1+}|{+}{+}\rangle = 0, \tag{7.43a}$$

$$S_{1+}|{-}{+}\rangle = \hbar\,|{+}{+}\rangle \tag{7.43b}$$

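etcetera (check this). From this, it can be verified that the required eigenstates are:

$$\Psi_1 = \chi_1 = |{+}{+}\rangle; \tag{7.44a}$$

$$\Psi_2 = \chi_4 = |{-}{-}\rangle; \tag{7.44b}$$

$$\Psi_3 = \frac{\chi_2 + \chi_3}{\sqrt2} = \frac{1}{\sqrt2}\left(|{+}{-}\rangle + |{-}{+}\rangle\right); \tag{7.44c}$$

$$\Psi_4 = \frac{\chi_2 - \chi_3}{\sqrt2} = \frac{1}{\sqrt2}\left(|{+}{-}\rangle - |{-}{+}\rangle\right). \tag{7.44d}$$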


Using Eq. (7.41), it follows that

$$S^2\Psi_1 = 2\hbar^2\Psi_1, \quad\text{hence } s = 1; \tag{7.45a}$$

$$S^2\Psi_2 = 2\hbar^2\Psi_2, \quad\text{hence } s = 1; \tag{7.45b}$$

$$S^2\Psi_3 = 2\hbar^2\Psi_3, \quad\text{hence } s = 1; \tag{7.45c}$$

$$S^2\Psi_4 = 0, \quad\text{hence } s = 0. \tag{7.45d}$$

Exercise: Check these results.

The states can now be labelled $|s, m_s\rangle$ (both $s_1$ and $s_2$ are equal to 1/2), with either $s = 1$ and $m_s$ equal to $-1$, 0 or $+1$, or $s = 0$ and $m_s = 0$. The $s = 1$ states are called the triplet and the $s = 0$ state the singlet; the names refer to the degeneracy.
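The action of $S^2$ on the four product states can also be worked out mechanically. The sketch below applies the decomposition (7.41) with $\hbar = 1$ to a state stored as a dictionary that maps the labels $(m_1, m_2)$ to amplitudes; this representation and the function name `s_squared` are choices made for this illustration only. Unnormalised combinations are used so that the arithmetic stays exact.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def s_squared(state):
    """Apply S^2 (hbar = 1) to a two-spin state {(m1, m2): amplitude}, using (7.41)."""
    out = {}
    for (m1, m2), amp in state.items():
        # S1^2 + S2^2 + 2 S1z S2z is diagonal: 3/4 + 3/4 + 2 m1 m2
        diag = Fraction(3, 2) + 2 * m1 * m2
        out[(m1, m2)] = out.get((m1, m2), 0) + diag * amp
        if m1 != m2:
            # S1+S2- or S1-S2+ flips both spins; the other term gives zero
            out[(m2, m1)] = out.get((m2, m1), 0) + amp
    return out

up_up = {(HALF, HALF): 1}
triplet_0 = {(HALF, -HALF): 1, (-HALF, HALF): 1}    # chi_2 + chi_3
singlet = {(HALF, -HALF): 1, (-HALF, HALF): -1}     # chi_2 - chi_3

print(s_squared(up_up))      # 2 times the input: s(s+1) = 2, so s = 1
print(s_squared(triplet_0))  # 2 times the input: s = 1
print(s_squared(singlet))    # all amplitudes zero: s = 0
```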

Now we consider the addition of an orbital angular momentum $\mathbf L$ to a single spin $\mathbf S$ which has quantum number $s = 1/2$:

$$\mathbf J = \mathbf L + \mathbf S. \tag{7.46}$$

The eigenstates we start from are $|l m; s m_s\rangle$, where $s = 1/2$, and we will omit the quantum number $s$ in the remainder. Furthermore, we again denote the two possible values for $m_s$, 1/2 and $-1/2$, by $+$ and $-$ respectively. The fact that we should end up with linear combinations which are eigenstates of $J_z = L_z + S_z$ restricts the combinations to the pairs

$$\alpha\,|l, m; +\rangle + \beta\,|l, m+1; -\rangle \tag{7.47}$$

which have eigenvalue $\hbar m_j$ of $J_z$, with $m_j = m + 1/2$. $\alpha$ and $\beta$ will be fixed by the requirement that the resulting combination is an eigenstate of $J^2$, which we write in the form:

$$J^2 = L^2 + S^2 + 2L_zS_z + L_+S_- + L_-S_+. \tag{7.48}$$

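Consider the action of this operator on the state (7.47):

$$\frac{J^2}{\hbar^2}\left[\alpha\,|l, m; +\rangle + \beta\,|l, m+1; -\rangle\right] = \alpha\left[l(l+1) + \tfrac34 + m\right]|l, m; +\rangle + \alpha\sqrt{(l-m)(l+m+1)}\;|l, m+1; -\rangle$$

$$\qquad + \beta\left[l(l+1) + \tfrac34 - m - 1\right]|l, m+1; -\rangle + \beta\sqrt{(l-m)(l+m+1)}\;|l, m; +\rangle. \tag{7.49}$$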

We require that this is equal to $j(j+1)\left[\alpha\,|l, m; +\rangle + \beta\,|l, m+1; -\rangle\right]$. This leads to a linear homogeneous set of equations for $\alpha$, $\beta$:

$$\alpha\left[l(l+1) + \tfrac34 + m - j(j+1)\right] + \beta\sqrt{(l-m)(l+m+1)} = 0, \tag{7.50}$$

$$\alpha\sqrt{(l-m)(l+m+1)} + \beta\left[l(l+1) + \tfrac34 - m - 1 - j(j+1)\right] = 0. \tag{7.51}$$

This can only hold if the determinant of the system of linear equations vanishes, and this leads to:

$$\left[l(l+1) + \tfrac34 + m - j(j+1)\right]\left[l(l+1) + \tfrac34 - m - 1 - j(j+1)\right] = (l-m)(l+m+1) \tag{7.52}$$

and for given $l$ and $m$ this equation has two solutions for $j$, given by the conditions:

$$j(j+1) - l(l+1) - \tfrac34 = -l - 1 \quad\text{or} \tag{7.53a}$$

$$j(j+1) - l(l+1) - \tfrac34 = l \tag{7.53b}$$


which leads to

$$j = l + 1/2 \quad\text{or}\quad j = l - 1/2. \tag{7.54}$$

The ratio $\alpha/\beta$ follows from the above equations, and the coefficients are fixed by the requirement that the state is normalised. For $j = l + 1/2$:

$$\alpha = \sqrt{\frac{l+m+1}{2l+1}}; \qquad \beta = \sqrt{\frac{l-m}{2l+1}} \tag{7.55}$$

and for $j = l - 1/2$, where (7.50) requires $\alpha$ and $\beta$ to have opposite signs:

$$\alpha = \sqrt{\frac{l-m}{2l+1}}; \qquad \beta = -\sqrt{\frac{l+m+1}{2l+1}}. \tag{7.56}$$

The analysis presented here can be generalised to arbitrary angular momentum operators. This becomes a tedious job, which leads to the identification of the linear expansion coefficients, which are called Clebsch-Gordan coefficients. For details, see Messiah.
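The coefficients for $j = l \pm 1/2$ can be checked by applying $J^2/\hbar^2$ to the pair (7.47), following (7.49). The sketch below does this numerically. It assumes a relative minus sign between $\alpha$ and $\beta$ for $j = l - 1/2$, as required by (7.50) and by orthogonality to the $j = l + 1/2$ state; the function names are illustrative.

```python
import math

def j_squared_on_pair(l, m, alpha, beta):
    """Apply J^2/hbar^2 to alpha|l,m;+> + beta|l,m+1;->, following (7.49)."""
    root = math.sqrt((l - m) * (l + m + 1))
    new_alpha = alpha * (l * (l + 1) + 0.75 + m) + beta * root
    new_beta = alpha * root + beta * (l * (l + 1) + 0.75 - m - 1)
    return new_alpha, new_beta

def coefficients(l, m, upper):
    """(alpha, beta) for j = l + 1/2 (upper=True) or j = l - 1/2 (upper=False)."""
    a = math.sqrt((l + m + 1) / (2 * l + 1))
    b = math.sqrt((l - m) / (2 * l + 1))
    # The sign for the lower state is an assumption fixed by orthogonality.
    return (a, b) if upper else (b, -a)

def eigenvalue_error(l, m, upper):
    """Deviation of J^2 (alpha, beta) from j(j+1) (alpha, beta)."""
    j = l + 0.5 if upper else l - 0.5
    alpha, beta = coefficients(l, m, upper)
    new_alpha, new_beta = j_squared_on_pair(l, m, alpha, beta)
    return max(abs(new_alpha - j * (j + 1) * alpha),
               abs(new_beta - j * (j + 1) * beta))

for m in range(-2, 2):   # l = 2: both states of the pair exist for m = -2, ..., 1
    print(m, eigenvalue_error(2, m, True), eigenvalue_error(2, m, False))
```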

7.5 Angular momentum and rotations

In this section we consider rotations of the physical system at hand. In three-dimensional space such a rotation can be expressed as a rotation matrix. Rotation matrices are orthogonal: the columns, when considered as vectors, form an orthonormal set, and the same can be said of the rows. Furthermore the determinant of the matrix is $+1$ (if the determinant is $-1$, there is an additional reflection). For simplicity, we will confine ourselves in the analysis which follows to rotations around the $z$-axis. The matrix of such a rotation over an angle $\alpha$ reads:

$$R(\alpha) = \begin{pmatrix} \cos\alpha & -\sin\alpha & 0 \\ \sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{7.57}$$

Of course, if we rotate a physical system, its state, which we will denote $|\psi\rangle$, will change, and we represent this change by an operator $R$:

$$|\psi\rangle \xrightarrow{\ \text{Rotation}\ } R\,|\psi\rangle. \tag{7.58}$$

Now consider the $\mathbf r$-representation

$$\psi(\mathbf r) = \langle\mathbf r | \psi\rangle. \tag{7.59}$$

The new state of the system is the same as the old one up to a rotation, so if we evaluate the old state at a position $\mathbf r$ rotated back over an angle $\alpha$, we should get exactly the same result as when we evaluate the new state in $\mathbf r$ (see figure 1):

$$\langle R^{-1}\mathbf r | \psi\rangle = \langle\mathbf r | R\psi\rangle. \tag{7.60}$$

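Using this relation we can find an expression for the operator $R$.

Consider an infinitesimal rotation of a single particle around the $z$-axis with rotation angle $\delta$, evaluated at $\mathbf r = (x, y, z)$:

$$R(\delta)\psi(\mathbf r) = \psi(R^{-1}\mathbf r) = \psi(x + y\delta,\ y - x\delta,\ z) = \psi(x, y, z) + \delta\left[y\,\frac{\partial\psi(\mathbf r)}{\partial x} - x\,\frac{\partial\psi(\mathbf r)}{\partial y}\right] = \left(1 - i\delta L_z/\hbar\right)\psi(x, y, z). \tag{7.61}$$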


This relation is valid for small angles. The expression for larger angles can be found by applying many rotations over small angles in succession. Chopping the angle $\alpha$ into $N$ pieces ($N$ large), we have

$$R(\alpha) = \left(1 - \frac{i\alpha}{N\hbar}\,L_z\right)^N = \exp(-iL_z\alpha/\hbar). \tag{7.62}$$
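The limit in (7.62) can be illustrated on an eigenfunction of $L_z$. Acting on $F = G(\vartheta)e^{im\varphi}$, each small rotation multiplies the function by $1 - i\alpha m/N$, and the product of $N$ such factors should approach $e^{-im\alpha}$, which is the factor that turns $F(\vartheta, \varphi)$ into $F(\vartheta, \varphi - \alpha)$. A minimal numerical sketch, with an illustrative function name:

```python
import cmath

def rotation_factor(m, alpha, n_steps):
    """Factor picked up by exp(i m phi) after n_steps rotations over alpha/n_steps."""
    return (1 - 1j * alpha * m / n_steps) ** n_steps

m, alpha = 2, 0.7
exact = cmath.exp(-1j * m * alpha)
for n_steps in (10, 1000, 100000):
    # The difference from the exponential shrinks roughly like 1/n_steps.
    print(n_steps, abs(rotation_factor(m, alpha, n_steps) - exact))
```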

This result can be generalised to rotations around an arbitrary axis characterised by a unit vector $\mathbf u$:

$$R(\alpha) = \exp(-i\alpha\,\mathbf u \cdot \mathbf L/\hbar). \tag{7.63}$$

This equation has been derived for a single particle, but it can be generalised to systems consisting of more particles. The angular momentum operator is then the sum of the angular momentum operators of the individual particles. If the particles have spin, this is to be included in the total angular momentum. Equation (7.63) is in fact often used as the definition of total angular momentum. The commutation relations (7.2) can be derived from it, using the commutation relations for rotation matrices (exercise!).

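Suppose we have a Hamiltonian $H$ which is spherically symmetric. This implies that a rotation has no effect on the matrix elements of $H$:

$$\langle\psi | H | \phi\rangle = \langle\psi' | H | \phi'\rangle \tag{7.64}$$

where the primed states are related to the unprimed ones through a rotation. Therefore we have:

$$\langle\psi | H | \phi\rangle = \langle\psi | R^\dagger H R | \phi\rangle = \langle\psi |\, e^{i\alpha\,\mathbf u\cdot\mathbf J/\hbar}\, H\, e^{-i\alpha\,\mathbf u\cdot\mathbf J/\hbar} \,| \phi\rangle. \tag{7.65}$$

This relation should hold in particular for infinitesimal rotations, and expanding the exponentials to first order in $\alpha$ we obtain:

$$\langle\psi | H | \phi\rangle = \left\langle\psi\left|\, H - \frac{i\alpha}{\hbar}\left[H\,\mathbf J\cdot\mathbf u - \mathbf J\cdot\mathbf u\,H\right] \right|\phi\right\rangle = \left\langle\psi\left|\, H - \frac{i\alpha}{\hbar}\,\mathbf u\cdot[H, \mathbf J] \right|\phi\right\rangle. \tag{7.66}$$

As this should hold for arbitrary directions $\mathbf u$ and arbitrary states $\psi$, $\phi$, we have

$$[H, \mathbf J] = 0 \tag{7.67}$$

and therefore $\mathbf J$ is a conserved quantity, as

$$\frac{d\langle\mathbf J\rangle}{dt} = \frac{i}{\hbar}\,\langle[H, \mathbf J]\rangle = 0. \tag{7.68}$$

Note that it is essential here to consider the total angular momentum, that is, including the spin degrees of freedom.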


8 Introduction to Quantum Cryptography

8.1 Introduction

Some of the most important technical developments in the next few years will be based on quantum mechanics. In particular, spectacular developments and applications are expected in the areas of quantum cryptography, quantum teleportation and quantum computing. In this note, I shall briefly explain some issues involved in quantum cryptography.

The idea of quantum cryptography hinges upon the measurement postulate of quantum mechanics. This postulate deals with measurements of physical observables. In quantum mechanics, such an observable is represented by a Hermitian operator, say $Q$. The eigenvectors of this Hermitian operator are denoted $|\phi_n\rangle$, with corresponding eigenvalues $\lambda_n$. The measurement postulate says that, for a state $|\psi\rangle$ in Hilbert space, which can be expanded as

$$|\psi\rangle = \sum_n c_n\,|\phi_n\rangle, \tag{8.1}$$

a measurement of $Q$ yields one of its eigenvalues $\lambda_n$. The probability of finding a particular value $\lambda_n$ is given by $|c_n|^2$, and after the measurement the state of the system is reduced to the corresponding eigenvector $|\phi_n\rangle$. This last aspect, the fact that the state of a system is influenced by any observation, enables us to detect whether someone, an eavesdropper, has tried to read information, as we shall see below.
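The measurement postulate can be mimicked with a random number generator. The sketch below takes the expansion coefficients $c_n$ of (8.1), computes the probabilities $|c_n|^2$ and draws outcomes accordingly; the function names, the example coefficients and the seed are arbitrary choices for this illustration.

```python
import random

def outcome_probabilities(coeffs):
    """Probabilities |c_n|^2 for the outcomes lambda_n (the state is normalised first)."""
    norm = sum(abs(c) ** 2 for c in coeffs)
    return [abs(c) ** 2 / norm for c in coeffs]

def measure(coeffs, rng):
    """Return the index n of the eigenvalue found; the state is then reduced to |phi_n>."""
    probs = outcome_probabilities(coeffs)
    return rng.choices(range(len(coeffs)), weights=probs)[0]

rng = random.Random(1)
coeffs = [1 / 2 ** 0.5, 1j / 2 ** 0.5]   # equal weights, with a relative phase i
counts = [0, 0]
for _ in range(10000):
    counts[measure(coeffs, rng)] += 1
print(outcome_probabilities(coeffs))
print(counts)                             # both outcomes occur about equally often
```

Note that the relative phase between the coefficients does not affect the probabilities for this particular observable.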

8.2 The idea of classical encryption

Encryption of messages can be useful for many different applications. In all these applications, someone, denoted as A, sends a message to B in such a way that an eavesdropper (E) cannot read the information sent. To make the example more lively, A is usually given the name Alice, B is Bob, and the eavesdropper E is called Eve.

A schematic drawing of the procedure is depicted here:

[Figure: Alice sends a message to Bob, with Eve listening in on the channel.]

A message which Alice sends to Bob is a series of bits:


0110011101011101....

In order to prevent Eve from eavesdropping on the message, Alice and Bob decide to encrypt their messages. Several schemes exist for this purpose, and we shall present the simplest one here.

Alice and Bob have met once, and on that occasion they have agreed on a key which they will use to encrypt messages. A key is some sequence of bits, e.g.:

1111010001010100....

The key does not have any particular structure. Before Alice sends over her message, she encrypts it by performing an exclusive or (XOR) of her message with the key. The exclusive or is a bitwise comparison: if two corresponding bits (at the same position) of the message and the key are equal, the result has bit value 0; if the bits are unequal, the result has bit value 1:

0110011101011101.... Message
1111010001010100.... Key
1001001100001001.... Message XOR Key = Encrypted message.

Bob receives the encrypted message and again performs an exclusive or with the key, which unveils the original contents of the message:

1001001100001001.... Encrypted message
1111010001010100.... Key
0110011101011101.... Message.

Eve can only intercept the encrypted message, and without the key it is difficult (usually impossible) for her to make sense of it.
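The exclusive-or scheme takes only a few lines of code. The sketch below works on strings of the characters 0 and 1, using the message and key from the example above; the function name `xor_bits` is an illustrative choice.

```python
def xor_bits(bits, key):
    """Bitwise exclusive or of two equally long strings of '0' and '1'."""
    if len(bits) != len(key):
        raise ValueError("message and key must have the same length")
    return "".join("0" if b == k else "1" for b, k in zip(bits, key))

message = "0110011101011101"
key = "1111010001010100"

encrypted = xor_bits(message, key)     # what Alice sends
decrypted = xor_bits(encrypted, key)   # what Bob recovers
print(encrypted)
print(decrypted == message)
```

Applying the same operation twice returns the original message because each key bit either flips a message bit twice or not at all.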

Suppose however that Alice and Bob communicate very frequently, using the same key for each message. In that case, Eve might guess what the key is by letting her computer generate many different keys and using them to decrypt the messages exchanged between Alice and Bob. She might then quickly guess parts of the key, so that less and less of the key remains to be discovered, which takes less and less effort. Therefore, it would be wise for Alice and Bob to use a key which is at least as long as their messages. Here we have a problem. In order to safely exchange the keys, Bob and Alice have to meet before each message, or they must use a (hopefully) reliable courier. The dependence on couriers makes this encryption method cumbersome and vulnerable.

Another way of encrypting messages is to use much shorter keys and to encrypt the message using some elaborate mathematical transformation depending on this key. This is done in the Rivest, Shamir and Adleman (RSA) encryption. The idea is based on the factorisation of numbers into prime numbers. Consider the product of two large prime numbers. If you know that product, it is difficult to find its two prime factors. On the other hand, if you know one of these factors, it is easy to compute the other. So, if Bob and Alice have an encryption and decryption algorithm based on the two prime factors, they can encrypt and decrypt their messages if they know these factors. The product is public in this case, that is, it is available to everyone, and to Eve in particular.
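As an illustration of how the two prime factors enter, here is a toy version of the standard textbook RSA scheme. The details (the exponents `e` and `d`, the quantity `phi`) are not given in these notes and are filled in from the usual textbook description; the primes are far too small for real use, and actual implementations add padding and other safeguards that are omitted here.

```python
p, q = 61, 53                 # two (far too small) primes, chosen for illustration
n = p * q                     # the public product
phi = (p - 1) * (q - 1)       # can only be computed if the factors are known
e = 17                        # public exponent, without common factors with phi
d = pow(e, -1, phi)           # private exponent: inverse of e modulo phi (Python 3.8+)

def encrypt(message):
    """Anyone who knows the public numbers n and e can do this."""
    return pow(message, e, n)

def decrypt(cipher):
    """Requires d, and hence knowledge of the prime factors."""
    return pow(cipher, d, n)

message = 65
cipher = encrypt(message)
print(n, d)
print(cipher, decrypt(cipher))
```

Eve sees `n`, `e` and the encrypted number; to obtain `d` she would have to find `p` and `q` from `n`.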

Now suppose that Eve finds out the factorisation of the product; then she can eavesdrop on all the messages. The point is that the factorisation requires an amount of CPU time which grows exponentially with the number of bits of the product. So, if that number is large enough, Eve will never be able to crack the code, and this method seems quite safe.

In 1994, it was shown that a new type of computer, based on the quantum mechanical behaviour of matter, should be able to do the factorisation in a number of steps which grows only as a power of the number of bits of the product. Only a very primitive quantum computer has been developed to date, but people believe that in the future, RSA will not be safe anymore.

8.3 Quantum Encryption

In quantum encryption, two channels are used: one is a public channel, such as the internet, and the other is a private one. The channels are shown in the figure.

[Figure: Alice and Bob connected by a public channel and by a (private) QM channel, with Eve in between.]

This private channel cannot always be guaranteed to be safe from Eve (otherwise, encryption would not be necessary any longer), but Bob and Alice can detect whether their communication has been eavesdropped on by Eve, as we shall explain below.

The private channel is used to communicate the key only, and this key can be used for the standard exclusive-or encryption described in the previous section. The communication through the private channel is based on quantum mechanics. The information carriers of this channel are photons in some polarisation state. Unfortunately, the details of the quantum states of photons cannot be given here, as they involve quantum field theory. Therefore you must accept some of the facts which are given in the following.

Recall that light is an electromagnetic wave phenomenon and is therefore a wave with a certain polarisation. A polaroid filter will be transparent for photons with a certain direction of polarisation only, and opaque for photons with the perpendicular polarisation. A photon polarisation state can be represented as a unit vector in the two-dimensional plane perpendicular to the direction of propagation of the photon. The states $|1\rangle$ and $|0\rangle$ shown in the figure below form a basis in the Hilbert space of all possible polarisation states (the wave propagates along a direction perpendicular to the paper).

[Figure: the two basis polarisation states $|0\rangle$ and $|1\rangle$.]

A state with angle $\vartheta$ with respect to the $x$-axis would then have the form

$$|\vartheta\rangle = \cos\vartheta\,|1\rangle + \sin\vartheta\,|0\rangle.$$


If a detector is put behind a polarisation filter aligned along the $x$-direction, and a photon is sent to that filter, the detector will register the arrival of the photon if it is polarised in the $x$-direction, and not when its polarisation is along the $y$-axis. A photon in the state $|\vartheta\rangle$ would thus be detected with a probability $\cos^2\vartheta$.

Now we consider the transmission of data through the quantum channel. This channel is a glass fiber through which Alice sends photons which she selects by a polariser which has one of the four possible orientations depicted in the figure below:

[Figure: the four polariser orientations $1\alpha$, $0\alpha$, $1\beta$ and $0\beta$.]

Note that

$$|1\beta\rangle = \frac{1}{\sqrt2}\left(|1\alpha\rangle + |0\alpha\rangle\right)$$

etcetera.

Now Bob will receive these photons at his end of the fiber. He first lets them pass through a polariser before they can arrive at the detector. For each photon, he aligns his polariser along either $1\alpha$ in the figure, or along $1\beta$. When he detects a photon, he records a '1', otherwise a '0'. Whether Bob detects a photon depends on his and on Alice's polariser. Suppose Bob has his polariser along $1\alpha$. Then, if Alice has sent a $|1\alpha\rangle$ photon, Bob will detect it. If she has sent a $1\beta$ photon, Bob will or will not detect it with equal probabilities. The same holds for a $0\beta$ photon. If a $0\alpha$ photon was sent by Alice, Bob will not detect it. The figure below gives the probabilities with which Bob detects a photon for each of the four possible polarisations which Alice can send over.


Probability that Bob detects the photon:

Alice sends    Bob uses α    Bob uses β
1α             1             0.5
0α             0             0.5
1β             0.5           1
0β             0.5           0

After a number of photons has been sent over, Bob sends to Alice the settings of the polariser hehas used, 1α or 1β , using thepublicchannel for this information. Alice responds and tells Bob whichof his settings corresponded to hers (i.e. whether she used anα or a β polariser, and not whethershe used the 1 or the 0 setting). For compatible settings, i.e. when Bob and Allice both usedα orboth usedβ , they both know the result of Bob’s detections. They both keep these results as bits of asequence. For all other photons, Bob has detected at random 0 or 1 photon, so these events are deleted.The sequence of retained bits is now taken as the key for encrypting a message using an exclusive-orencryption described in the previous section.

Let us consider what would happen during a sample session:

Alice: 1α 0α 0β 1β 1α 1β 1α

Bob’s settings: 1α 1β 1β 1α 1α 1β 1β

Bob’s detections: 1 1 0 0 1 1 1Now Bob sends over his settings (see second line)Alice tells Bob which of these were compatible with hersRetained bits: 1 x 0 x 1 1 x

An ‘x’ in the last line denotes a discarded bit. The sequence 1011 is now the key.

Now consider the possibility of eavesdropping. If Eve intercepts the channel, the photons she measures are lost, so she has to send new photons to Bob. Suppose however that Alice used polarisation |1α〉 and Bob has used the 1α polariser. Then he would receive a correct bit of the key, which in this case is a 1. But suppose Eve used polariser β. If Eve detects |0β〉, she will send a similar photon to Bob. But Bob used polariser α, so he will find a 0 (i.e. no detection) with 50 % probability. If Bob and Alice exchange messages, they immediately discover a mismatch in the keys, so they stop communicating.
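The exchange described above can be imitated with a short classical simulation. The sketch below is only an illustration under simplifying assumptions: the names `measure`, `run_session` and `mismatch_fraction` are hypothetical, the two polariser types α and β are encoded as the characters "a" and "b", and Eve is assumed to measure every photon with a randomly chosen polariser and to resend what she found.

```python
import random

def measure(photon_basis, photon_bit, polariser_basis, rng):
    """Return 1 if a photon is detected behind a polariser set to the '1'
    orientation of polariser_basis, and 0 otherwise."""
    if photon_basis == polariser_basis:
        return photon_bit           # matching bases: the outcome is certain
    return rng.randint(0, 1)        # mismatched bases: 50 % detection

def run_session(n_photons, eavesdrop=False, seed=0):
    """Simulate one key exchange; returns Alice's and Bob's retained bits."""
    rng = random.Random(seed)
    alice_key, bob_key = [], []
    for _ in range(n_photons):
        alice_basis, alice_bit = rng.choice("ab"), rng.randint(0, 1)
        basis, bit = alice_basis, alice_bit
        if eavesdrop:
            # Assumption: Eve measures in a random basis and resends her result.
            eve_basis = rng.choice("ab")
            bit = measure(basis, bit, eve_basis, rng)
            basis = eve_basis
        bob_basis = rng.choice("ab")
        bob_bit = measure(basis, bit, bob_basis, rng)
        if bob_basis == alice_basis:        # only compatible settings are kept
            alice_key.append(alice_bit)
            bob_key.append(bob_bit)
    return alice_key, bob_key

def mismatch_fraction(alice_key, bob_key):
    """Fraction of retained bits on which Alice and Bob disagree."""
    return sum(a != b for a, b in zip(alice_key, bob_key)) / len(alice_key)
```

Without Eve the two keys coincide. With Eve present, she picks the wrong polariser half of the time and Bob then finds a random bit, so roughly a quarter of the retained bits should disagree in this model.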


It is thus necessary to send over only one photon at a time, otherwise Eve could insert a beamsplitter in the quantum channel and detect half or more of the key without being noticed. Therefore, only low intensities must be used, which limits the distance over which communication is possible. With present-day technology, a few tens of kilometers can be reliably bridged with low intensity optical fibers.


9 Scattering in classical and in quantum mechanics

Scattering experiments are perhaps the most important tool for obtaining detailed information on the structure of matter, in particular the interaction between particles. Examples of scattering techniques include neutron and X-ray scattering for liquids, atoms scattering from crystal surfaces, and elementary particle collisions in accelerators. In most of these scattering experiments, a beam of incident particles hits a target which also consists of many particles. The distribution of scattered particles over the different directions is then measured, for different energies of the incident particles. This distribution is the result of many individual scattering events. Quantum mechanics enables us, in principle, to evaluate for an individual event the probabilities for the incident particles to be scattered off in different directions; and this probability is identified with the measured distribution.

Suppose we have an idea of what the potential between the particles involved in the scattering process might look like, for example from quantum mechanical energy calculations (programs for this purpose will be discussed in the next few chapters). We can then parametrise the interaction potential, i.e. we write it as an analytic expression involving a set of constants: the parameters. If we evaluate the scattering probability as a function of the scattering angles for different values of these parameters, and compare the results with experimental scattering data, we can find those parameter values for which the agreement between theory and experiment is optimal. Of course, it would be nice if we could evaluate the scattering potential directly from the scattering data (this is called the inverse problem), but this is unfortunately very difficult (if not impossible), as many different interaction potentials can have similar scattering properties, as we shall see below.

Many different motivations for obtaining accurate interaction potentials can be given. One is that we might use the interaction potential to make predictions about the behaviour of a system consisting of many interacting particles, such as a dense gas or a liquid.

Scattering might be elastic or inelastic. In the former case the energy is conserved, in the latter energy disappears. This means that energy transfer takes place from the scattered particles to degrees of freedom which are not included explicitly in the system (inclusion of these degrees of freedom would cause the energy to be conserved). In this chapter we shall consider elastic scattering.

9.1 Classical analysis of scattering

In chapter 3, we have analysed the motion of two bodies attracting each other by a gravitational force whose value decays with increasing separation r as 1/r². This analysis is also correct for opposite charges which feel an attractive force of the same form (Coulomb’s law). When the force is repulsive, the solution remains the same – we only have to change the sign of the parameter A which defines the interaction potential according to V(r) = A/r. One of the key experiments in physics which led to the notion that atoms consist of small but heavy nuclei, surrounded by a cloud of light electrons, is Rutherford scattering. In this experiment, a thin gold sheet was bombarded with α-particles (i.e. helium-4 nuclei) and the scattering of the latter was analysed using detectors behind the gold film. In this section, we shall first formulate some new quantities for describing scattering processes and then calculate those quantities for the case of Rutherford scattering.

Rutherford scattering is chosen as an example here – scattering problems can be studied more generally; see Griffiths, chapter 11, section 11.1.1 for a nice description of classical scattering.

We consider scattering of particles incident on a so-called ‘scattering centre’, which may be another particle. The scattering centre is supposed to be at rest. This might not always be justified in a real experiment, but the analysis in chapter 3, in which the full two-body problem was reduced to a one-body problem with a reduced mass, pertains to the present case. The incident particles interact with the scattering centre located at r = 0 by the usual scalar two-point potential V(r) which satisfies the requirements of Newton’s third law. Suppose we have a beam of incident particles parallel to the z-axis. The beam has a homogeneous density close to that axis, and we can define a flux, which is the number of particles passing a unit area perpendicular to the beam, per unit time. Usually, particles close to the z-axis will be scattered more strongly than particles far from the z-axis, as the interaction potential between the incident particles and scattering centre falls off with their separation r. An experimentalist cannot analyse the detailed orbits of the individual particles – instead a detector is placed at a large distance from the scattering centre and this detector counts the number of particles arriving at each position. You may think of this detector as a photographic plate which changes colour to an extent related to the number of particles hitting it. The theorist wants to predict what the experimentalist measures, starting from the interaction potential V(r) which governs the interaction process.

In figure 9.1, the geometry of the process is shown. In addition a small cone, spanned by the spherical polar angles dϑ and dϕ, is displayed. It is assumed here that the scattering takes place in a small neighbourhood of the scattering centre, and for the detector the orbits of the scattered particles all seem to be directed radially outward from the scattering centre. The surface dA of the intersection of the cone with a sphere of radius R around the scattering centre is given by dA = R² sin ϑ dϑ dϕ. The quantity sin ϑ dϑ dϕ is called the solid angle and is usually denoted by dΩ. This dΩ defines a cone like the one shown in figure 9.1. Now consider the number of particles which will hit the detector within this small area per unit time. This number, divided by the total incident flux (see above), is called the differential scattering cross section, dσ/dΩ:

\[
\frac{d\sigma(\Omega)}{d\Omega} = \frac{\text{Number of particles leaving the scattering centre through the cone } d\Omega \text{ per unit time}}{\text{Flux of incident beam}}. \tag{9.1}
\]

The differential cross section has the dimension of area (length²).

First we note that the problem is symmetric with respect to rotations around the z-axis, so the differential scattering cross section only depends on ϑ. The only two relevant parameters of the incoming particle then are its velocity and its distance b from the z-axis. This distance is called the impact parameter – it is also shown in figure 9.1.

We first calculate the scattering angle ϑ as a function of the impact parameter b. We use the solution found in chapter 3 [Eq. (3.24)] which is now a hyperbola. We write this solution in the form

\[
r = \lambda\,\frac{1+\varepsilon}{\varepsilon\cos(\vartheta - C) - 1}. \tag{9.2}
\]

The integration constant C reappears in the cosine because we have not chosen ϑ = 0 at the perihelion – the closest approach occurs when the particle crosses the dashed line in figure 9.1 which bisects the in- and outgoing particle direction.


Figure 9.1: Geometry of the scattering process. b is the impact parameter and ϕ and ϑ are the angles of the orbit of the outgoing particle.

We know that for ϑ = π, r → ∞, from which we have

\[
\cos(\pi - C) = 1/\varepsilon. \tag{9.3}
\]

Because the cosine is even [cos x = cos(−x)], we can infer that the other value of ϑ for which r goes to infinity, and which corresponds to the outgoing direction, occurs when the argument of the cosine is C − π, so that we find

\[
\vartheta_\infty - C = C - \pi, \tag{9.4}
\]

or ϑ_∞ = 2C − π. The subscript ∞ indicates that this value corresponds to t → ∞. From the last two equations we find the following relation between the scattering angle ϑ_∞ and ε:

\[
\sin(\vartheta_\infty/2) = 1/\varepsilon. \tag{9.5}
\]

We want to know ϑ_∞ as a function of b rather than ε, however. To this end we note that the angular momentum is given as

\[
\ell = \mu v_{\text{inc}} b, \tag{9.6}
\]

where ‘inc’ stands for ‘incident’, and the total energy as

\[
E = \frac{\mu}{2} v_{\text{inc}}^2, \tag{9.7}
\]

so that the impact parameter can be found as

\[
b = \frac{\ell}{\sqrt{2\mu E}}. \tag{9.8}
\]


Using Eq. (3.21), we can finally write (9.5) in the form:

\[
\cot(\vartheta_\infty/2) = \sqrt{\varepsilon^2 - 1} = \frac{2Eb}{|A|}. \tag{9.9}
\]

From the relation between b and ϑ_∞ we can find the differential scattering cross section. The particles scattered with angle between ϑ and ϑ + dϑ must have approached the scattering centre with impact parameters between particular boundaries b and b + db. The number of particles flowing per unit time through the ring segment with radius b and width db is given as j 2πb db, where j is the incident flux. We consider a segment dϕ of this ring. Hence:

\[
d\sigma(\Omega) = b(\vartheta)\, db\, d\varphi. \tag{9.10}
\]

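Relation (9.9) can be used to express the right hand side in terms of ϑ_∞:

\[
d\sigma(\Omega) = \left(\frac{A}{2E}\right)^2 \cot(\vartheta/2)\, d\cot(\vartheta/2)\, d\varphi
= \left(\frac{A}{2E}\right)^2 \cot(\vartheta/2)\, \frac{d\cot(\vartheta/2)}{d\cos\vartheta}\, d\cos\vartheta\, d\varphi. \tag{9.11}
\]

This can be worked out straightforwardly to yield:

\[
\frac{d\sigma(\Omega)}{d\Omega} = \left(\frac{A}{4E}\right)^2 \frac{1}{\sin^4(\vartheta/2)}. \tag{9.12}
\]

This is known as the famous Rutherford formula.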
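The step from (9.9) and (9.10) to the Rutherford formula can be checked numerically. The sketch below is an illustration only (the function names and the parameter values in the usage are made up): it uses dσ/dΩ = (b / sin ϑ) |db/dϑ|, which follows from (9.10) with dΩ = sin ϑ dϑ dϕ, evaluates the derivative with a central difference, and compares the result with (9.12).

```python
import math

def impact_parameter(theta, A, E):
    """Eq. (9.9) solved for b: b = |A| / (2E) * cot(theta / 2)."""
    return abs(A) / (2.0 * E) / math.tan(theta / 2.0)

def cross_section_numeric(theta, A, E, h=1e-5):
    """d(sigma)/d(Omega) = b / sin(theta) * |db/dtheta| (central difference)."""
    db = (impact_parameter(theta + h, A, E)
          - impact_parameter(theta - h, A, E)) / (2.0 * h)
    return impact_parameter(theta, A, E) * abs(db) / math.sin(theta)

def rutherford(theta, A, E):
    """Closed form of Eq. (9.12)."""
    return (A / (4.0 * E)) ** 2 / math.sin(theta / 2.0) ** 4

if __name__ == "__main__":
    for theta in (0.3, 1.0, 2.5):
        print(theta, cross_section_numeric(theta, 1.0, 2.0), rutherford(theta, 1.0, 2.0))
```

The two columns printed should agree to within the accuracy of the finite difference, for either sign of A.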

9.2 Quantum scattering with a spherical potential

We now consider the scattering problem within quantum mechanics, by looking at a particle incident on a scattering centre which is usually another particle.¹ We assume that we know the scattering potential, which is spherically symmetric so that it depends on the distance between the particle and the scattering centre only.

We shall again calculate the differential cross section, dσ/dΩ(Ω), which describes how the scattered intensities are distributed over the various solid angles Ω. This quantity, integrated over the spherical angles ϑ and ϕ, is the total cross section, σ_tot.

The scattering process is described by the solutions of the single-particle Schrödinger equation involving the (reduced) mass m, the relative coordinate r and the interaction potential V between the particle and the interaction centre:

\[
\left[ -\frac{\hbar^2}{2m}\nabla^2 + V(r) \right]\psi(\mathbf{r}) = E\psi(\mathbf{r}). \tag{9.13}
\]

This is a partial differential equation in three dimensions, which could be solved using the ‘brute force’ discretisation methods presented in appendix A, but exploiting the spherical symmetry of the potential, we can solve the problem in another, more elegant, way which, moreover, works much faster on a computer. More specifically, in section 9.2.1 we shall establish a relation between the phase shift and the scattering cross sections. In this section, we shall restrict ourselves to a description of the concept of phase shift and describe how it can be obtained from the solutions of the radial Schrödinger equation.

¹ Every two-particle collision can be transformed into a single scattering problem involving the relative position; in the transformed problem the incoming particle has the reduced mass m = m1m2/(m1 + m2).

Figure 9.2: The radial wave functions for l = 0 for various square well potential depths.

For the potential V(r) we make the assumption that it vanishes for r larger than a certain value r_max. In case we are dealing with an asymptotically decaying potential, we neglect contributions from the potential beyond the range r_max, which must be chosen suitably, or treat the tail in a perturbative manner.

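For a spherically symmetric potential, the solution of the Schrödinger equation can always be written as

\[
\psi(\mathbf{r}) = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} A_{lm}\,\frac{u_l(r)}{r}\,Y_l^m(\vartheta,\varphi) \tag{9.14}
\]

where u_l satisfies the radial Schrödinger equation:

\[
\left\{ \frac{\hbar^2}{2m}\frac{d^2}{dr^2} + \left[ E - V(r) - \frac{\hbar^2 l(l+1)}{2mr^2} \right] \right\} u_l(r) = 0. \tag{9.15}
\]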

Figure 9.2 shows the solution of the radial Schrödinger equation with l = 0 for a square well potential for various well depths – our discussion applies also to nonzero values of l. Outside the well, the solution u_l can be written as a linear combination of the two independent solutions j_l and n_l, the regular and irregular spherical Bessel functions. We write this linear combination in the particular form

\[
u_l(r > r_{\max}) \propto kr\left[ \cos\delta_l\, j_l(kr) + \sin\delta_l\, n_l(kr) \right]. \tag{9.16}
\]

δ_l is determined via a matching procedure at the well boundary. The motivation for writing u_l in this form follows from the asymptotic expansion for the spherical Bessel functions:

\[
kr\, j_l(kr) \approx \sin(kr - l\pi/2) \tag{9.17a}
\]

\[
kr\, n_l(kr) \approx \cos(kr - l\pi/2) \tag{9.17b}
\]

with

\[
k = \sqrt{2mE}/\hbar,
\]

which can be used to rewrite (9.16) as

\[
u_l(r) \propto \sin(kr - l\pi/2 + \delta_l), \quad \text{large } r. \tag{9.18}
\]

We see that u_l approaches a sine-wave form for large r and the phase of this wave is determined by δ_l, hence the name ‘phase shift’ for δ_l (for l = 0, u_l is a sine wave for all r > r_max).

The phase shift as a function of energy and l contains all the information about the scattering properties of the potential. In particular, the phase shift enables us to calculate the scattering cross sections and this will be done in section 9.2.1; here we simply quote the results. The differential cross section is given in terms of the phase shift by

\[
\frac{d\sigma}{d\Omega} = \frac{1}{k^2}\left| \sum_{l=0}^{\infty} (2l+1)\, e^{i\delta_l} \sin(\delta_l)\, P_l(\cos\vartheta) \right|^2 \tag{9.19}
\]

and for the total cross section we find

\[
\sigma_{\text{tot}} = 2\pi\int d\vartheta\, \sin\vartheta\, \frac{d\sigma}{d\Omega}(\vartheta) = \frac{4\pi}{k^2}\sum_{l=0}^{\infty} (2l+1)\sin^2\delta_l. \tag{9.20}
\]

Summarising the analysis up to this point, we see that the potential determines the phase shift through the solution of the Schrödinger equation for r < r_max. The phase shift acts as an intermediate object between the interaction potential and the experimental scattering cross sections, as the latter can be determined from it.

Unfortunately, the expressions (9.19) and (9.20) contain sums over an infinite number of terms – hence they cannot be evaluated on the computer exactly. However, cutting off these sums can be motivated by a physical argument. Classically, only the waves with an angular momentum smaller than l_max = k r_max will ‘feel’ the potential – particles with higher l-values will pass by unaffected. Therefore we can safely cut off the sums at a somewhat higher value of l – we can always check whether the results obtained change significantly when taking more terms into account.

How is the phase shift determined in practice? First, the Schrödinger equation must be integrated from r = 0 outwards with boundary condition u_l(r = 0) = 0. At r_max, the numerical solution must be matched onto the form (9.16) to fix δ_l. This can be done straightforwardly in the few cases where an analytical solution is known. For example, if the potential is a hard core with

\[
V(r) = \begin{cases} \infty & \text{for } r < a \\ 0 & \text{for } r \ge a, \end{cases} \tag{9.21}
\]

we know that for l = 0 the solution is given as

\[
u_0(r) \sim (r-a)\, j_0(k(r-a)) \propto \sin[k(r-a)], \tag{9.22}
\]

which vanishes for r = a. Comparing with (9.18), we immediately see that δ_0 = −ka, which can be substituted directly in the expressions for the cross sections.
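The cut-off argument can be tried out on the hard core. Requiring the exterior form (9.16) to vanish at r = a gives tan δ_l = −j_l(ka)/n_l(ka), and (9.20) then only needs sin²δ_l. The following sketch is an illustration, not part of the original notes: the function names are hypothetical, y_l denotes the standard spherical Bessel function of the second kind (which equals −n_l in the sign convention of (9.17b)), and the Bessel functions are generated by simple upward recurrence, which is adequate here because terms with l well above ka contribute negligibly to the sum.

```python
import math

def spherical_bessel(lmax, x):
    """Lists j[l] and y[l], l = 0..lmax, of the regular and irregular
    spherical Bessel functions, built by upward recurrence."""
    j = [math.sin(x) / x, math.sin(x) / x**2 - math.cos(x) / x]
    y = [-math.cos(x) / x, -math.cos(x) / x**2 - math.sin(x) / x]
    for l in range(1, lmax):
        j.append((2 * l + 1) / x * j[l] - j[l - 1])
        y.append((2 * l + 1) / x * y[l] - y[l - 1])
    return j[:lmax + 1], y[:lmax + 1]

def hard_sphere_sigma_tot(k, a, lmax):
    """Eq. (9.20) truncated at lmax, with tan(delta_l) = j_l(ka) / y_l(ka)."""
    j, y = spherical_bessel(lmax, k * a)
    total = 0.0
    for l in range(lmax + 1):
        sin2_delta = j[l] ** 2 / (j[l] ** 2 + y[l] ** 2)
        total += (2 * l + 1) * sin2_delta
    return 4.0 * math.pi / k**2 * total

if __name__ == "__main__":
    # Increase lmax until the result no longer changes (here ka = 5).
    for lmax in (2, 5, 8, 12, 20):
        print(lmax, hard_sphere_sigma_tot(5.0, 1.0, lmax))
```

For small ka only l = 0 matters and the total cross section approaches 4πa²; for ka = 5 the printed values should settle once lmax is somewhat larger than ka, in line with the classical argument.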


Figure 9.3: The effective potential for the Lennard-Jones interaction for various l-values. [Plot axes: V_eff(r) in meV versus r in units of σ.]

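In a computational approach, the matching can be done either by using the value and the derivative of the numerical solution at a single point beyond r_max, or by using the value of the numerical solution at two different points r1 and r2 beyond r_max. We will use the latter method in order to avoid calculating derivatives. From (9.16) it follows directly that the phase shift is given by

\[
\tan\delta_l = \frac{K j_l^{(1)} - j_l^{(2)}}{n_l^{(2)} - K n_l^{(1)}} \quad \text{with} \tag{9.23a}
\]

\[
K = \frac{r_1 u_l^{(2)}}{r_2 u_l^{(1)}}. \tag{9.23b}
\]

In this equation, j_l^{(1)} stands for j_l(kr1) etc.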
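The two-point matching can be sketched for the simplest case, l = 0, where j_0(x) = sin(x)/x and, in the sign convention of (9.16) and (9.17), n_0(x) = cos(x)/x. Solving (9.16) at the two points for the phase shift gives tan δ_l = (K j_l(kr1) − j_l(kr2)) / (n_l(kr2) − K n_l(kr1)) with K = r1 u_l(r2) / (r2 u_l(r1)). The code below is an illustrative sketch under stated assumptions: units with ħ²/2m = 1, a Numerov integrator (a choice made here, not prescribed by the text), hypothetical function names, and a square well of depth V0 and radius a as the test potential.

```python
import math

def numerov_l0(E, V, h, n_steps):
    """Integrate u'' = (V(r) - E) u outward from u(0) = 0 for l = 0,
    in units where hbar^2 / 2m = 1. Returns the list u[0..n_steps]."""
    def w(r):
        return 1.0 + h * h * (E - V(r)) / 12.0
    u = [0.0, h]                      # u(0) = 0; the overall scale is arbitrary
    for n in range(1, n_steps):
        r = n * h
        u.append(((12.0 - 10.0 * w(r)) * u[n] - w(r - h) * u[n - 1]) / w(r + h))
    return u

def phase_shift_l0(u1, u2, r1, r2, k):
    """Two-point matching for l = 0; the result is defined modulo pi."""
    j1, j2 = math.sin(k * r1) / (k * r1), math.sin(k * r2) / (k * r2)
    n1, n2 = math.cos(k * r1) / (k * r1), math.cos(k * r2) / (k * r2)
    K = r1 * u2 / (r2 * u1)
    return math.atan2(K * j1 - j2, n2 - K * n1)

def square_well_demo(V0=4.0, a=1.0, E=1.0):
    """Return (numerical, analytical) l = 0 phase shifts for a square well."""
    # The well edge is placed halfway between two grid points, which keeps
    # the error caused by the discontinuity of the potential small.
    h = a / 1000.5
    k = math.sqrt(E)
    u = numerov_l0(E, lambda r: -V0 if r < a else 0.0, h, 2500)
    delta_num = phase_shift_l0(u[2000], u[2500], 2000 * h, 2500 * h, k)
    kappa = math.sqrt(E + V0)
    delta_exact = math.atan(k / kappa * math.tan(kappa * a)) - k * a
    return delta_num, delta_exact
```

The analytical value used for comparison follows from matching sin(κr) inside the well to sin(kr + δ_0) outside; since both phase shifts are only defined modulo π, they are best compared through the sine of their difference.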

A computational example is based on the work by Toennies et al. (J. Chem. Phys., 71, p. 614, 1979) on the scattering of hydrogen off noble gas atoms. Figure 9.3 shows the Lennard-Jones interaction potential plus the centrifugal barrier l(l+1)/r² of the radial Schrödinger equation. For higher l-values, the potential consists essentially of a hard core, a well and a barrier which is caused by the 1/r² centrifugal term in the Schrödinger equation. In such a potential, quasi-bound states are possible. These are states which would be genuine bound states for a potential for which the barrier does not drop to zero for larger values of r, but remains at its maximum height. You can imagine the following to happen when a particle is injected into the potential at precisely this energy: it tunnels through the barrier, remains in the well for a relatively long time, and then tunnels outward through the barrier in an arbitrary direction because it has ‘forgotten’ its original direction. In wave-like terms, the particle resonates in the well, and this state decays after a relatively long time. This phenomenon is called ‘scattering resonance’. This means that particles injected at this energy are strongly scattered and this shows up as a peak in the total cross section.


Figure 9.4: The total cross section shown as function of the energy for a Lennard-Jones potential modeling the H–Kr system. Peaks correspond to the resonant scattering states. [Plot axes: total cross section versus energy in meV.]

Such peaks can be seen in figure 9.4, which shows the total cross section as a function of the energy calculated with a program as described above. The peaks are due to l = 4, l = 5 and l = 6 scattering, with energies increasing with l. Figure 9.5 finally shows the experimental results for the total cross section for H–Kr. We see that the agreement is excellent.

9.2.1 Calculation of scattering cross sections

In this section we derive Eqs. (9.19) and (9.20). At a large distance from the scattering centre we can make an Ansatz for the wave function. This consists of the incoming beam and a scattered wave:

\[
\psi(\mathbf{r}) \sim e^{i\mathbf{k}\cdot\mathbf{r}} + f(\vartheta)\frac{e^{ikr}}{r}. \tag{9.24}
\]

ϑ is the angle between the incoming beam and the line passing through r and the scattering centre. f does not depend on the azimuthal angle ϕ because the incoming wave has azimuthal symmetry, and the spherically symmetric potential will not generate m ≠ 0 contributions to the scattered wave. f(ϑ) is called the scattering amplitude. From the Ansatz it follows that the differential cross section is given directly by the square of this amplitude:

\[
\frac{d\sigma}{d\Omega} = |f(\vartheta)|^2. \tag{9.25}
\]

Figure 9.5: Experimental results as obtained by Toennies et al. for the total cross section (arbitrary units) of the scattering of hydrogen atoms by noble gas atoms as function of centre of mass energy.

Beyond r_max, the solution can also be written in the form (9.14), leaving out all m ≠ 0 contributions because of the azimuthal symmetry:

\[
\psi(\mathbf{r}) = \sum_{l=0}^{\infty} A_l\,\frac{u_l(r)}{r}\,P_l(\cos\vartheta) \tag{9.26}
\]

where we have used the fact that Y_l^0(ϑ, ϕ) is proportional to P_l[cos(ϑ)]. Because the potential vanishes in the region r > r_max, the solution u_l(r)/r is given by the linear combination of the regular and irregular spherical Bessel functions, and as we have seen this reduces for large r to

\[
u_l(r) \approx \sin\left(kr - \frac{l\pi}{2} + \delta_l\right). \tag{9.27}
\]

We want to derive the scattering amplitude f(ϑ) by equating the expressions (9.24) and (9.26) for the wave function. For large r we obtain, using (9.27):

\[
\sum_{l=0}^{\infty} A_l \left[ \frac{\sin(kr - l\pi/2 + \delta_l)}{kr} \right] P_l(\cos\vartheta) = e^{i\mathbf{k}\cdot\mathbf{r}} + f(\vartheta)\frac{e^{ikr}}{r}. \tag{9.28}
\]

We write the right hand side of this equation as an expansion similar to that in the left hand side, using the following expression for a plane wave (see e.g. Abramowitz and Stegun, Handbook of Mathematical Functions, 1965, Dover)

\[
e^{i\mathbf{k}\cdot\mathbf{r}} = \sum_{l=0}^{\infty} (2l+1)\, i^l\, j_l(kr)\, P_l(\cos\vartheta). \tag{9.29}
\]


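f(ϑ) can also be written as an expansion in Legendre polynomials:

\[
f(\vartheta) = \sum_{l=0}^{\infty} f_l\, P_l(\cos\vartheta), \tag{9.30}
\]

so that we obtain:

\[
\sum_{l=0}^{\infty} A_l \left[ \frac{\sin(kr - l\pi/2 + \delta_l)}{kr} \right] P_l(\cos\vartheta) =
\sum_{l=0}^{\infty} \left[ (2l+1)\, i^l\, j_l(kr) + f_l\,\frac{e^{ikr}}{r} \right] P_l(\cos\vartheta). \tag{9.31}
\]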

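If we substitute the asymptotic form (9.17) of j_l in the right hand side, we find:

\[
\sum_{l=0}^{\infty} A_l \left[ \frac{\sin(kr - l\pi/2 + \delta_l)}{kr} \right] P_l(\cos\vartheta) =
\frac{1}{r}\sum_{l=0}^{\infty} \left[ \frac{2l+1}{2ik}(-)^{l+1} e^{-ikr} + \left( f_l + \frac{2l+1}{2ik} \right) e^{ikr} \right] P_l(\cos\vartheta). \tag{9.32}
\]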

Both the left and the right hand side of (9.32) contain in- and outgoing spherical waves (the occurrence of incoming spherical waves does not violate causality: they arise from the incoming plane wave). For each l, the prefactors of the incoming and of the outgoing waves should be equal on both sides in (9.32). This condition leads to

\[
A_l = (2l+1)\, e^{i\delta_l}\, i^l \tag{9.33}
\]

and

\[
f_l = \frac{2l+1}{k}\, e^{i\delta_l} \sin(\delta_l). \tag{9.34}
\]
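The matching of prefactors can be written out explicitly. The following lines are a sketch of the algebra implied by (9.32), obtained by writing the sine on the left hand side in terms of exponentials and using exp(±ilπ/2) = (±i)^l; no new assumptions enter.

```latex
\begin{align*}
% term l of the left hand side of (9.32)
A_l\,\frac{\sin(kr - l\pi/2 + \delta_l)}{kr}
  &= \frac{A_l}{2ikr}\left[ (-i)^l e^{i\delta_l} e^{ikr} - i^l e^{-i\delta_l} e^{-ikr} \right], \\
% prefactors of e^{-ikr}
-A_l\, i^l e^{-i\delta_l} &= (2l+1)(-1)^{l+1}
  \quad\Longrightarrow\quad A_l = (2l+1)\, i^l e^{i\delta_l}, \\
% prefactors of e^{ikr}
\frac{A_l\,(-i)^l e^{i\delta_l}}{2ik} &= f_l + \frac{2l+1}{2ik}
  \quad\Longrightarrow\quad
  f_l = \frac{2l+1}{2ik}\left( e^{2i\delta_l} - 1 \right)
      = \frac{2l+1}{k}\, e^{i\delta_l}\sin\delta_l .
\end{align*}
```

In the second line (−1)^l / i^l = i^l has been used, and in the third line i^l (−i)^l = 1.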

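Using (9.25), (9.30), and (9.34), we can write down an expression for the differential cross section in terms of the phase shifts δ_l:

\[
\frac{d\sigma}{d\Omega} = \frac{1}{k^2}\left| \sum_{l=0}^{\infty} (2l+1)\, e^{i\delta_l} \sin(\delta_l)\, P_l(\cos\vartheta) \right|^2. \tag{9.35}
\]

For the total cross section we find, using the orthogonality relations of the Legendre polynomials:

\[
\sigma_{\text{tot}} = 2\pi\int d\vartheta\, \sin\vartheta\, \frac{d\sigma}{d\Omega}(\vartheta) = \frac{4\pi}{k^2}\sum_{l=0}^{\infty} (2l+1)\sin^2\delta_l. \tag{9.36}
\]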
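The consistency of the two expressions can be checked numerically: integrating the differential cross section over all directions should reproduce the sum over l. The sketch below is illustrative only; the function names are hypothetical, the phase shifts in the usage are arbitrary made-up numbers, and a simple midpoint rule is used for the integral over ϑ.

```python
import cmath
import math

def legendre(lmax, x):
    """Legendre polynomials P_0(x)..P_lmax(x) from the standard recurrence."""
    p = [1.0, x]
    for l in range(1, lmax):
        p.append(((2 * l + 1) * x * p[l] - l * p[l - 1]) / (l + 1))
    return p[:lmax + 1]

def dsigma_domega(theta, k, deltas):
    """Eq. (9.35) for a finite list of phase shifts deltas[l]."""
    p = legendre(len(deltas) - 1, math.cos(theta))
    f = sum((2 * l + 1) * cmath.exp(1j * d) * math.sin(d) * p[l]
            for l, d in enumerate(deltas)) / k
    return abs(f) ** 2

def sigma_tot_sum(k, deltas):
    """Right hand side of Eq. (9.36)."""
    return 4.0 * math.pi / k**2 * sum((2 * l + 1) * math.sin(d) ** 2
                                      for l, d in enumerate(deltas))

def sigma_tot_integral(k, deltas, n=2000):
    """2 pi * integral of sin(theta) * dsigma/dOmega over theta (midpoint rule)."""
    h = math.pi / n
    total = sum(math.sin((i + 0.5) * h) * dsigma_domega((i + 0.5) * h, k, deltas)
                for i in range(n))
    return 2.0 * math.pi * total * h

if __name__ == "__main__":
    deltas = [0.8, 0.3, -0.2, 0.05]     # arbitrary illustrative phase shifts
    print(sigma_tot_sum(1.3, deltas), sigma_tot_integral(1.3, deltas))
```

The cross terms between different l drop out of the integral because the Legendre polynomials are orthogonal, with the integral of P_l² over cos ϑ equal to 2/(2l+1).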

9.2.2 The Born approximation

Consider again a particle which is being scattered by a potential. We shall now relax the condition that the potential be spherically symmetric. Let us write down the stationary Schrödinger equation for the wavefunction:

\[
\left[ -\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{r}) \right]\psi(\mathbf{r}) = E\psi(\mathbf{r}).
\]

For V(r) ≡ 0, an incoming plane wave would be a solution to this equation. It turns out to be possible to write the solution to the Schrödinger equation with potential formally as an integral expression. This is done using the Green’s function formalism. The Green’s function depends on two positions r and r′ – it is defined by

\[
\left[ E + \frac{\hbar^2}{2m}\nabla^2 - V(\mathbf{r}) \right] G(\mathbf{r},\mathbf{r}') = \delta(\mathbf{r} - \mathbf{r}').
\]

To understand the Green’s function (and easily recall its definition) you may view the delta function on the right hand side as a unit operator, so that G may be called the inverse of the operator EI − H, where I is the unit operator. For V(r) ≡ 0 we call the Green’s function G_0:

\[
\left[ E + \frac{\hbar^2}{2m}\nabla^2 \right] G_0(\mathbf{r},\mathbf{r}') = \delta(\mathbf{r} - \mathbf{r}').
\]

Before calculating G_0 let us assume we have it at our disposal. We then may write the solution to the full Schrödinger equation, i.e. including the potential V, in terms of a solution φ(r) to the ‘bare’ Schrödinger equation, that is, the Schrödinger equation with potential V ≡ 0:

\[
\psi(\mathbf{r}) = \phi(\mathbf{r}) + \int G_0(\mathbf{r},\mathbf{r}')\, V(\mathbf{r}')\, \psi(\mathbf{r}')\, d^3r'. \tag{9.37}
\]

This can easily be checked by substituting the solution into the full Schrödinger equation and using the fact that EI − H, acting on the Green’s function, gives a delta function.

Now we consider the scattering problem with an incoming beam of the form φ(r) = exp(ik_i · r) (the subscript ‘i’ denotes the incoming wave vector). We see from Eq. (9.37) that this wave persists but that it is accompanied by a scattering term which is the integral on the right hand side. Now the wavefunction ψ(r) is still very difficult to find, as it occurs in Eq. (9.37) in an implicit form. We can make the equation explicit if we assume that the potential V(r) is small, so that the scattered part of the wave is much smaller than the wavefunction of the incoming beam. In a first approximation we might then replace ψ(r′) on the right hand side of Eq. (9.37) by φ(r′), which is a plane wave:

\[
\psi(\mathbf{r}) = \phi(\mathbf{r}) + \int G_0(\mathbf{r},\mathbf{r}')\, V(\mathbf{r}')\, \phi(\mathbf{r}')\, d^3r'
= e^{i\mathbf{k}_i\cdot\mathbf{r}} + \int G_0(\mathbf{r},\mathbf{r}')\, V(\mathbf{r}')\, e^{i\mathbf{k}_i\cdot\mathbf{r}'}\, d^3r'.
\]

The key to the scattering amplitude is given by the notion that it must always be possible to write the solution (9.37) in the form:

\[
\psi(\mathbf{r}) = e^{i\mathbf{k}_i\cdot\mathbf{r}} + f(\vartheta,\varphi)\frac{e^{ikr}}{r}.
\]

At this moment we hardly recognise this form in the expression obtained for the wavefunction. We first must find the explicit expression for the Green’s function G_0. Without going through the derivation (see for example Griffiths, pp. 364–366) we give it here:

\[
G_0(\mathbf{r},\mathbf{r}') = \frac{2m}{\hbar^2}\,\frac{e^{ik|\mathbf{r}-\mathbf{r}'|}}{4\pi|\mathbf{r}-\mathbf{r}'|}
\]

with k = √(2mE/ħ²).

Now we take r far from the origin. As the range of the potential is finite, we know that only contributions with r′ ≪ r have to be taken into account. Taylor expanding the exponent occurring in the Green’s function:

\[
|\mathbf{r}-\mathbf{r}'| = \sqrt{r^2 - 2\mathbf{r}\cdot\mathbf{r}' + r'^2} \approx r - \frac{\mathbf{r}\cdot\mathbf{r}'}{r}
\]

leads to

\[
G_0(\mathbf{r},\mathbf{r}') \approx \frac{2m}{\hbar^2}\,\frac{e^{ikr}}{4\pi r}\, e^{-ik\mathbf{r}\cdot\mathbf{r}'/r}.
\]

In the denominator, |r − r′| has simply been replaced by r: the correction does not have to be taken into account, as it gives a much smaller contribution to the result for r ≫ 1/k. Now we define k_f = k r/r, i.e. k_f is a wave vector corresponding to an outgoing wave from the scattering centre to the point r. We have

\[
\psi(\mathbf{r}) = \phi(\mathbf{r}) + \frac{2m}{\hbar^2}\,\frac{e^{ikr}}{4\pi r}\int V(\mathbf{r}')\, e^{-i\mathbf{k}_f\cdot\mathbf{r}'}\, e^{i\mathbf{k}_i\cdot\mathbf{r}'}\, d^3r'.
\]

This is precisely of the required form provided we set

\[
f(\vartheta,\varphi) = \frac{m}{2\pi\hbar^2}\int V(\mathbf{r}')\, e^{i(\mathbf{k}_i-\mathbf{k}_f)\cdot\mathbf{r}'}\, d^3r'.
\]

This is the so-called first Born approximation. It is valid for weak scattering – higher order approximations can be made by iterative substitution for ψ(r′) in the integral occurring in Eq. (9.37). In the first order Born approximation, the scattering amplitude f(ϑ, ϕ) is in fact a Fourier transform of the scattering potential.

As an example, we consider a potential which is not weak but which is easily tractable within the Born scheme: the Coulomb potential

\[
V(r) = \frac{q_1 q_2}{4\pi\varepsilon_0}\,\frac{1}{r}.
\]

The Fourier transform of this potential reads

\[
V(\mathbf{k}) = \frac{q_1 q_2}{\varepsilon_0}\,\frac{1}{k^2}.
\]

Therefore, we immediately find for f(ϑ):

\[
f(\vartheta) = \frac{m q_1 q_2}{2\pi\varepsilon_0\hbar^2 (\mathbf{k}_i - \mathbf{k}_f)^2}.
\]

The angle ϑ is hidden in k_i − k_f, the norm of which is equal to 2k sin(ϑ/2). The result therefore is, using E = ħ²k²/(2m):

\[
\frac{d\sigma}{d\Omega} = \left[ \frac{q_1 q_2}{16\pi\varepsilon_0 E \sin^2(\vartheta/2)} \right]^2.
\]

This is precisely the classical Rutherford formula (9.12) with A = q1q2/(4πε0), which also turns out to be the exact quantum mechanical result. This could not possibly be anticipated beforehand, but it is a happy coincidence.
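The intermediate steps of this example can be spelled out as follows. This is a sketch under one stated assumption, namely the Fourier convention V(q) = ∫ V(r) exp(iq·r) d³r, for which the Coulomb potential transforms into q1q2/(ε0 q²); since |k_i| = |k_f| = k for elastic scattering, the momentum transfer q = k_i − k_f depends on the scattering angle only.

```latex
\begin{align*}
|\mathbf{k}_i - \mathbf{k}_f|^2 &= k^2 + k^2 - 2k^2\cos\vartheta = 4k^2\sin^2(\vartheta/2), \\
|f(\vartheta)| &= \frac{m}{2\pi\hbar^2}\,\frac{q_1 q_2}{\varepsilon_0\,|\mathbf{k}_i - \mathbf{k}_f|^2}
   = \frac{m\, q_1 q_2}{8\pi\varepsilon_0\hbar^2 k^2 \sin^2(\vartheta/2)}
   = \frac{q_1 q_2}{16\pi\varepsilon_0 E \sin^2(\vartheta/2)}, \\
\frac{d\sigma}{d\Omega} &= |f(\vartheta)|^2
   = \left(\frac{A}{4E}\right)^2 \frac{1}{\sin^4(\vartheta/2)},
   \qquad A = \frac{q_1 q_2}{4\pi\varepsilon_0}.
\end{align*}
```

In the second line ħ²k² = 2mE has been used; the last line is the classical result of section 9.1 with the Coulomb value of A.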


10 Symmetry and conservation laws

In this chapter, we return to classical mechanics and shall explore the relation between the symmetry of a physical system and the conservation of physical quantities. In the first chapter, we have already seen that translational symmetry implies momentum conservation, that time translation symmetry implies energy conservation and that rotational symmetry implies conservation of angular momentum. There exists a fundamental theorem, called Noether’s theorem, which shows that, indeed, for every continuous symmetry of a system which can be described by a Lagrangian, some physical quantity is conserved, and the theorem also allows us to find that quantity.

The special form of the equations of motion for a system described by a Lagrangian (or Hamiltonian) leads already to a large number of conserved quantities, called Poincaré invariants. We shall consider only one Poincaré invariant here: phase space volume. The associated conservation law is called Liouville’s theorem.

10.1 Noether’s theorem

Suppose a mechanical system is invariant under symmetry transformations which can be parametrised using some real, continuous parameter. Examples include those mentioned already above: rotations (parametrised by the rotation angles) or translations in space or time. The fact that the system is invariant under these transformations is reflected by the Lagrangian being invariant under these symmetries. For simplicity we shall restrict ourselves to a single continuous parameter, s. In the case of rotations one could imagine s to be the rotation angle about an axis fixed in space, such as the z-axis. The mechanical path for some system, i.e. the solution of the Euler-Lagrange equations of motion, is called q(t). Now we perform a symmetry transformation. This gives rise to a different path, which we call Q(s, t), with Q(0, t) = q(t). The path Q(s, t) should have the same value of the Lagrangian L as the path q(t); in other words, L should not depend on s:

\[
\frac{d}{ds} L(Q(s,t), \dot{Q}(s,t)) = 0. \tag{10.1}
\]

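This leads to

\[
\sum_{j=1}^{N} \left[ \frac{\partial L}{\partial Q_j}\frac{\partial Q_j}{\partial s} + \frac{\partial L}{\partial \dot{Q}_j}\frac{\partial \dot{Q}_j}{\partial s} \right] = 0. \tag{10.2}
\]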

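Now we use the Euler-Lagrange equations:

\[
\frac{\partial L}{\partial Q_j} = \frac{d}{dt}\frac{\partial L}{\partial \dot{Q}_j} \tag{10.3}
\]

in order to write

\[
\frac{dL}{ds} = \sum_{j=1}^{N} \left[ \frac{\partial L}{\partial Q_j}\frac{\partial Q_j}{\partial s} + \frac{\partial L}{\partial \dot{Q}_j}\frac{\partial \dot{Q}_j}{\partial s} \right]
= \sum_{j=1}^{N} \left[ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{Q}_j} \right)\frac{\partial Q_j}{\partial s} + \frac{\partial L}{\partial \dot{Q}_j}\frac{\partial \dot{Q}_j}{\partial s} \right]
= \frac{d}{dt}\left[ \sum_{j=1}^{N} \frac{\partial L}{\partial \dot{Q}_j}\frac{dQ_j}{ds} \right] = 0, \tag{10.4}
\]

and we see that the term within brackets in the last expression must be a constant of the motion:

\[
\sum_{j=1}^{N} \frac{\partial L}{\partial \dot{Q}_j}\frac{dQ_j}{ds} = \sum_{j=1}^{N} p_j \frac{dQ_j}{ds} = \text{Constant in time}. \tag{10.5}
\]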

We see that any continuous symmetry of the Lagrangian leads to a constant of the motion, given by(10.5). This analysis is obviously rather abstract, so let us now consider an example.

Suppose a one-particle system in three dimensional space is invariant under rotations around the z-axis. The rotation angle is called α. In order to be able to evaluate the derivatives of the coordinates with respect to α, we use cylindrical coordinates (r, ϕ, z) with x = r cos ϕ and y = r sin ϕ. A rotation about the z-axis over an angle α then corresponds to

\[
\varphi \to \varphi + \alpha, \tag{10.6}
\]

so that we have

\[
p_x \frac{dx}{d\alpha} = -p_x r \sin(\varphi + \alpha) = -p_x y; \tag{10.7}
\]

\[
p_y \frac{dy}{d\alpha} = p_y r \cos(\varphi + \alpha) = p_y x; \tag{10.8}
\]

\[
p_z \frac{dz}{d\alpha} = 0, \tag{10.9}
\]

so that the conserved quantity, from (10.5), is

\[
x p_y - y p_x = L_z, \tag{10.10}
\]

the z-component of the angular momentum. Similarly, we would find L_x and L_y for the conserved quantities associated with rotations around the x- and y-axes respectively. Also, it is easy to verify that for more than one particle, the total angular momentum is conserved.

The reader is invited to check that space translation symmetry results in momentum conservation.
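For a single particle and a translation along the x-axis, the check runs as follows. This is a sketch of the simplest case only; the choice of the x-direction and of the parameter s as the displacement is made here for illustration.

```latex
\begin{align*}
Q_x(s,t) &= x(t) + s, \qquad Q_y(s,t) = y(t), \qquad Q_z(s,t) = z(t), \\
\frac{dQ_x}{ds} &= 1, \qquad \frac{dQ_y}{ds} = \frac{dQ_z}{ds} = 0, \\
\sum_j p_j \frac{dQ_j}{ds} &= p_x = \text{constant in time}.
\end{align*}
```

Translations along y and z give p_y and p_z in the same way, and for several particles shifted together the sum in (10.5) yields the total momentum.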

10.2 Liouville’s theorem

A special conservation law is due to the fact that the equations of motion can be derived from a Hamiltonian (or from a Lagrangian). Such equations of motion are called canonical. The fact that the equations of motion are canonical reflects a symmetry which is called symplecticity (or symplecticness), a discussion of which is outside the scope of these notes. The important notion is that this type of symmetry leads to a number of conserved quantities, called Poincaré invariants, of which we shall consider only one, the volume of phase space.

The proof of Liouville’s theorem hinges upon the fact that whenever in a volume integral like

V =∫

Ωdnx (10.11)

Page 95: Classical and Quantum Mechanics

10.2. Liouville’s theorem 89

we perform a variable transformationx → y, we must put a correction factor det(J) in the integral,whereJ is theJacobian matrix, given by

Ji j =∂yi

∂x j. (10.12)

We thus have

V =∫

Ωdnx det(J =

∫Ω′

dny. (10.13)

whereΩ′ is the volumeΩ transformed toy-space.The state of a mechanical system consisting ofN degrees of freedom is represented by a point in

2N-dimensional phase space(qi , p j). In the course of time, this point moves in phase space, and formsa trajectory. We now consider not a single mechanical system in phase space, but a set of systemswhich are initially homogeneously distributed over some regionΩ0, with volumeV0. In the course oftime, every point inΩ0 will move in phase space,Ω0 will therefore transform into some new regionΩ(t). The volume of this new space is given as

$$V(t) = \int_{\Omega(t)} d^n q \, d^n p. \tag{10.14}$$

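We want to show that $V(t) = V_0$, hence the volume does not change in time. To this end, we consider a transformation from time $t = 0$ to $\Delta t$:

$$q'_i \equiv q_i(\Delta t) = q_i(0) + \frac{\partial H(p,q)}{\partial p_i}\,\Delta t + O(\Delta t^2); \tag{10.15}$$

$$p'_i \equiv p_i(\Delta t) = p_i(0) - \frac{\partial H(p,q)}{\partial q_i}\,\Delta t + O(\Delta t^2), \tag{10.16}$$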

where we have used a first order Taylor expansion and replaced time derivatives of $q_i$ and $p_i$ using Hamilton’s equations. Now we can evaluate the original volume $V_0$ as follows:

$$V_0 = \int_{\Omega_0} d^n q \, d^n p = \int_{\Omega(\Delta t)} d^n q \, d^n p \, \det[J(\Delta t)]. \tag{10.17}$$

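The Jacobi determinant can be written in block form as follows:

$$\det[J(\Delta t)] = \begin{vmatrix} 1 + \Delta t \dfrac{\partial^2 H}{\partial q_i \partial p_j} & -\Delta t \dfrac{\partial^2 H}{\partial q_i \partial q_j} \\[2ex] \Delta t \dfrac{\partial^2 H}{\partial p_i \partial p_j} & 1 - \Delta t \dfrac{\partial^2 H}{\partial q_j \partial p_i} \end{vmatrix} \tag{10.18}$$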

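Careful consideration of this expression should convince you that $\det[J(\Delta t)] = 1 + O(\Delta t^2)$: the terms linear in $\Delta t$ coming from the two diagonal blocks cancel each other, and the off-diagonal blocks only contribute in order $\Delta t^2$. We see therefore that $V(\Delta t) = V_0 + O(\Delta t^2)$, from which it follows that

$$\frac{dV(0)}{dt} = 0. \tag{10.19}$$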

This argument can be extended to arbitrary times, so that we have proven that V is a constant of the motion. We have found Liouville’s theorem in the form:

The volume of a region of phase space occupied by a set of systems does not change in time.
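The claim that the Jacobi determinant of one small time step equals 1 up to terms of order $\Delta t^2$ can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses a hypothetical one-dimensional Hamiltonian $H = p^2/2 - \cos q$ (a pendulum in units where mass, length and gravity are 1), which is not taken from these notes, and the function name is invented for the example.

```python
import math

def euler_step_jacobian_det(q, dt):
    """Determinant of the Jacobian of one first-order time step for the
    assumed Hamiltonian H = p**2/2 - cos(q)."""
    # First-order step, as in (10.15)-(10.16):
    #   q' = q + dt * dH/dp = q + dt * p
    #   p' = p - dt * dH/dq = p - dt * sin(q)
    dq_dq, dq_dp = 1.0, dt
    dp_dq, dp_dp = -dt * math.cos(q), 1.0
    return dq_dq * dp_dp - dq_dp * dp_dq

q0 = 0.7
for dt in (1e-1, 1e-2, 1e-3):
    det = euler_step_jacobian_det(q0, dt)
    # If det = 1 + O(dt**2), the last column stays roughly constant.
    print(dt, det, (det - 1.0) / dt**2)
```

For this particular Hamiltonian the mixed derivatives vanish, so the deviation from 1 comes entirely from the product of the off-diagonal entries, which is of order $\Delta t^2$.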


Figure 10.1: A box divided into two halves by a wall with a hole. Initially, particles will be in the right hand volume, and will move to the left. After large times, they will all come back to the right hand volume.

Of course, the region can change in shape, but its total volume will remain constant in time. We could have put any density distribution of points in phase space in the integrals, which does not change the derivation.

Liouville’s theorem is important in equilibrium statistical mechanics. So-called ergodic systems are assumed to move to a time-independent distribution in phase space, that is, any large set of systems setting off at time t = 0 from different points in phase space and moving according to the Hamiltonian equations of motion will assume the same, invariant distribution after long times. Liouville’s theorem moreover tells us that the systems will not all end up in the same point in phase space, but spread over a region with a volume equal to the initial volume.

There exists a more specific theorem concerning the behaviour of systems in time. This is Poincaré’s theorem, which says that a system which evolves under the mechanical equations of motion will always return arbitrarily close to its starting point within a finite time. Consider for example a box partitioned into two sub-volumes (figure 10.1). There is a small hole in the middle, and there are N particles in the right hand volume. Obviously, a fraction of these particles will move to the left hand volume, but the Poincaré recurrence theorem tells us that after a finite time, all particles will reassemble in the right hand volume! This seems to be in contradiction with the second law of thermodynamics. This law states that the entropy will not decrease in the course of time. Here we see an increase of entropy when the particles distribute themselves over the two volume halves rather than a single one, but they come back in a more ordered (less entropic) state after some time. This is only an apparent contradiction, as the Poincaré theorem holds for a finite number of particles (finite-dimensional phase space). What we see here is an example of the inequivalence of interchanging the order in which limits are taken: if we first take the system size to infinity (the approach of statistical mechanics and thermodynamics), the recurrence time will become infinite. If, on the other hand, we consider a finite system over infinitely large times (the mechanics approach), we see that it returns arbitrarily close to its initial state infinitely often. Taking then the system size to infinity does not alter this conclusion.


11

Systems close to equilibrium

11.1 Introduction

When we prepare a conservative system in a state with a certain energy, it will conserve this energy ad infinitum. In practice, such is never the case, as it is impossible to decouple a system from its environment or from its internal degrees of freedom. This requires some explanation. We usually describe macroscopic objects in terms of the coordinates of their centre of mass and their Euler angles. These are the macroscopic degrees of freedom. As these objects consist of particles (atoms, molecules), they have very many additional, internal, or microscopic degrees of freedom. In fact the heat which is generated during friction is nothing but a transfer of mechanical energy associated with the macroscopic degrees of freedom to mechanical energy of the internal (microscopic) degrees of freedom. So heat in the end is a form of mechanical energy. As a result of friction, macroscopic objects will, when subject to a conservative, time-independent force (apart from friction), always end up at rest in a point where their potential energy is minimal. Any system at a point where its potential energy is minimal is said to be in a stable state. A system which loses its kinetic energy via friction is called a dissipative system. All the macroscopic systems we know are dissipative, although some can approach conservative systems very well.

If the interactions are not all harmonic (‘harmonic’ means that the potential energy is a quadratic function of the coordinates) then there may be more than one minimum. Local minima correspond to metastable states. A system in a metastable state will return to that state under a small perturbation, but, when it is strongly perturbed, it might move to another metastable state with lower potential energy, or to the stable state. An example is shown in figure 11.1, for a particle in a one-dimensional potential.

Figure 11.1: System with a metastable (M) and a stable (S) state. A strong perturbation may kick the ball out of its metastable state, and under the influence of dissipation it will then move to the stable state.

Molecular systems, in which we take all degrees of freedom explicitly into account, are believed to be non-dissipative. We know from statistical physics that every degree of freedom in a system in thermal equilibrium carries a kinetic energy equal (on average) to $k_B T/2$, where $k_B$ is Boltzmann’s constant. At low temperatures, the energies of the particles are small, as can be seen from the Boltzmann distribution, which gives the probability of finding a system with energy E as proportional to $\exp[-E/(k_B T)]$. Therefore, for low temperatures, the kinetic and the potential energy of a system are small. It can therefore be inferred that at low temperatures, a system is close to a (meta-)stable state.

In this section, we analyse systems close to mechanical equilibrium. The beautiful result of this analysis is a description in terms of a set of uncoupled harmonic oscillators, which are themselves trivial to analyse. Moreover we obtain a straightforward recipe for finding the resonance frequencies (related to the coupling strengths) of those oscillators.

11.2 Analysis of a system close to equilibrium

Consider a conservative system characterised by generalised coordinates $q_j$, $j = 1, \ldots, N$. The system is in equilibrium, defined by $\tilde q_1, \ldots, \tilde q_N$, if its potential energy is minimal. In that case we have

$$\frac{\partial}{\partial q_j} V(\tilde q_1, \ldots, \tilde q_N) = 0; \quad j = 1, \ldots, N. \tag{11.1}$$

Now suppose that we perturb the system slightly, i.e. we change the values of the $q_j$ slightly with respect to their equilibrium values. As the first derivative of the potential with respect to each of the $q_j$ vanishes, a Taylor expansion of the potential only contains second and higher order terms:

$$V(q_1, \ldots, q_N) = V(\tilde q_1, \ldots, \tilde q_N) + \frac{1}{2} \sum_{j,k=1}^{N} (q_j - \tilde q_j)(q_k - \tilde q_k) \frac{\partial^2}{\partial q_j \partial q_k} V(\tilde q_1, \ldots, \tilde q_N) + \ldots. \tag{11.2}$$

The terms of order higher than two will be neglected, as we are interested in systems close to equilibrium (i.e. $q_j - \tilde q_j$ small).

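We can represent the resulting expansion using matrix notation. Introduce the matrix

$$K = \begin{pmatrix} \dfrac{\partial^2 V}{\partial q_1 \partial q_1} & \dfrac{\partial^2 V}{\partial q_1 \partial q_2} & \cdots & \dfrac{\partial^2 V}{\partial q_1 \partial q_N} \\ \dfrac{\partial^2 V}{\partial q_2 \partial q_1} & \dfrac{\partial^2 V}{\partial q_2 \partial q_2} & \cdots & \dfrac{\partial^2 V}{\partial q_2 \partial q_N} \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial^2 V}{\partial q_N \partial q_1} & \dfrac{\partial^2 V}{\partial q_N \partial q_2} & \cdots & \dfrac{\partial^2 V}{\partial q_N \partial q_N} \end{pmatrix}. \tag{11.3}$$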

The matrix K is obviously symmetric. We can write

$$V(q_1, \ldots, q_N) = V(\tilde q_1, \ldots, \tilde q_N) + \frac{1}{2}\, \delta q^T K \, \delta q \tag{11.4}$$

where $\delta q$ is a column vector with components $q_j - \tilde q_j$, $j = 1, \ldots, N$; the superscript T denotes the transpose of the vector.

Now we write down the kinetic energy in terms of the generalised coordinates. We assume that the constraints only depend on the generalised coordinates $q_j$ and not on their derivatives or on time. In that case, the kinetic energy can be written in the form (see page 26):

$$T = \frac{1}{2} \sum_{j,k=1}^{N} M_{jk}\, \dot q_j \dot q_k, \tag{11.5}$$

where the matrix M is symmetric: $M_{jk} = M_{kj}$. Note that $M_{jk}$ may depend on the $q_j$. In terms of $q_j - \tilde q_j$, and using vector notation, we can rewrite the kinetic energy as

$$T = \frac{1}{2}\, \delta \dot q^T M \, \delta \dot q. \tag{11.6}$$

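The equations of motion now read:

$$\sum_{k=1}^{N} \left( M_{jk}\, \delta \ddot q_k + \delta \ddot q_k\, M_{kj} \right) = -\frac{\partial V}{\partial q_j} + \frac{\partial T}{\partial q_j} = -\sum_{k=1}^{N} \left( K_{jk}\, \delta q_k + \delta q_k\, K_{kj} \right). \tag{11.7}$$

We have omitted the dependence of $M_{ij}$ on $q_j$; this dependence generates terms on the left and right hand side which are both of order $\delta \dot q^2$ and can therefore be neglected. Using the symmetry of the matrices M and K, (11.7) reduces to

$$\sum_{k=1}^{N} M_{jk}\, \delta \ddot q_k = -\sum_{k=1}^{N} K_{jk}\, \delta q_k. \tag{11.8}$$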

Let us consider the two-dimensional case to clarify the procedure. We consider two generalised coordinates $q_1$ and $q_2$ with the matrix $M_{jk}$ equal to the identity. Then, the kinetic energy has the form:

$$T = \frac{1}{2} \dot q_1^2 + \frac{1}{2} \dot q_2^2. \tag{11.9}$$

The potential energy depends on the two coordinates $q_1$ and $q_2$:

$$V = V(q_1, q_2). \tag{11.10}$$

The equations of motion read:

$$\ddot q_1 = \delta \ddot q_1 = -\frac{\partial V}{\partial q_1} \tag{11.11}$$

$$\ddot q_2 = \delta \ddot q_2 = -\frac{\partial V}{\partial q_2} \tag{11.12}$$

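Expanding about the point $(\tilde q_1, \tilde q_2)$, where V is supposed to be minimal, we have

$$V(q_1, q_2) = V(\tilde q_1, \tilde q_2) + \frac{1}{2}(q_1 - \tilde q_1)^2 \frac{\partial^2}{\partial q_1^2} V(\tilde q_1, \tilde q_2) + (q_1 - \tilde q_1)(q_2 - \tilde q_2) \frac{\partial^2}{\partial q_1 \partial q_2} V(\tilde q_1, \tilde q_2) + \frac{1}{2}(q_2 - \tilde q_2)^2 \frac{\partial^2}{\partial q_2^2} V(\tilde q_1, \tilde q_2). \tag{11.13}$$

This can be written in the form:

$$V(q_1, q_2) = V(\tilde q_1, \tilde q_2) + \frac{1}{2} (q_1 - \tilde q_1, \; q_2 - \tilde q_2) \begin{pmatrix} \dfrac{\partial^2 V}{\partial q_1^2} & \dfrac{\partial^2 V}{\partial q_1 \partial q_2} \\ \dfrac{\partial^2 V}{\partial q_1 \partial q_2} & \dfrac{\partial^2 V}{\partial q_2^2} \end{pmatrix} \begin{pmatrix} q_1 - \tilde q_1 \\ q_2 - \tilde q_2 \end{pmatrix}. \tag{11.14}$$

Defining $\delta q_1 = q_1 - \tilde q_1$ and similarly for $\delta q_2$, this equation reads:

$$V(q_1, q_2) = V(\tilde q_1, \tilde q_2) + \frac{1}{2} (\delta q_1, \; \delta q_2) \begin{pmatrix} \dfrac{\partial^2 V}{\partial q_1^2} & \dfrac{\partial^2 V}{\partial q_1 \partial q_2} \\ \dfrac{\partial^2 V}{\partial q_1 \partial q_2} & \dfrac{\partial^2 V}{\partial q_2^2} \end{pmatrix} \begin{pmatrix} \delta q_1 \\ \delta q_2 \end{pmatrix}. \tag{11.15}$$

The 2×2 matrix occurring in this expression is our matrix K.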
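When the potential is only available as a function that can be evaluated, the matrix K of the two-dimensional case can be estimated by central finite differences. The following is a minimal sketch, not part of the original notes: the function `hessian_2d`, the step size `h` and the test potential `V_example` (with its minimum placed at (1, 2)) are all assumptions made for illustration.

```python
def hessian_2d(V, q1, q2, h=1e-4):
    """Central-difference estimate of the 2x2 matrix K of second
    derivatives of V at the point (q1, q2)."""
    k11 = (V(q1 + h, q2) - 2.0 * V(q1, q2) + V(q1 - h, q2)) / h**2
    k22 = (V(q1, q2 + h) - 2.0 * V(q1, q2) + V(q1, q2 - h)) / h**2
    k12 = (V(q1 + h, q2 + h) - V(q1 + h, q2 - h)
           - V(q1 - h, q2 + h) + V(q1 - h, q2 - h)) / (4.0 * h**2)
    # K is symmetric by construction: the mixed derivative is used twice.
    return [[k11, k12], [k12, k22]]

def V_example(q1, q2):
    """Hypothetical potential with a minimum at (1, 2) and a small
    anharmonic (quartic) term."""
    x, y = q1 - 1.0, q2 - 2.0
    return 1.5 * x**2 - x * y + 0.5 * y**2 + 0.1 * x**4

K = hessian_2d(V_example, 1.0, 2.0)
print(K)
```

The quartic term does not contribute to K at the minimum, in line with the neglect of terms of order higher than two in the expansion.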


11.2.1 Example: Double pendulum

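Consider as an example the double pendulum, consisting of two rigid massless rods of length L (upper) and l (lower), carrying masses M and m respectively:

[Figure: double pendulum. The upper rod, of length L, makes an angle ϑ with the vertical and carries the mass M; the lower rod, of length l, makes an angle ϕ with the vertical and carries the mass m.]

The velocity of the upper mass is $L\dot\vartheta$; that of the lower one is the vector sum of the velocity of the upper one and that of the lower one with respect to the upper one. For very small angles ϑ and ϕ both velocities will be approximately in the horizontal direction so that they can simply be added: $v_m = L\dot\vartheta + l\dot\varphi$. The kinetic energy therefore reads:

$$T = \frac{M}{2} \left( L\dot\vartheta \right)^2 + \frac{m}{2} \left( L\dot\vartheta + l\dot\varphi \right)^2. \tag{11.16}$$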

Let us perform a transformation to more convenient variables

$$x = L\vartheta \tag{11.17a}$$

$$y = L\vartheta + l\varphi. \tag{11.17b}$$

Note that x and y do not denote cartesian coordinates. In terms of these variables the kinetic energy can simply be written as

$$T = \frac{M}{2} \dot x^2 + \frac{m}{2} \dot y^2. \tag{11.18}$$

The potential energy of the upper mass is $MgL(1 - \cos\vartheta) \approx MgL\vartheta^2/2$, and that of the lower mass is given as $mg[L(1 - \cos\vartheta) + l(1 - \cos\varphi)]$, which, in the small angle approximation becomes

$$V_{\rm Lower}(\vartheta, \varphi) = \frac{1}{2} mg \left( L\vartheta^2 + l\varphi^2 \right). \tag{11.19}$$

The total potential energy, written in terms of x and y, therefore reads:

$$V = \frac{(M + m)g}{2L} x^2 + \frac{mg}{2l} (y - x)^2. \tag{11.20}$$

The equations of motion can therefore be written as

$$\begin{pmatrix} M & 0 \\ 0 & m \end{pmatrix} \begin{pmatrix} \ddot x \\ \ddot y \end{pmatrix} = \begin{pmatrix} -\dfrac{M+m}{L} g - \dfrac{mg}{l} & \dfrac{mg}{l} \\ \dfrac{mg}{l} & -\dfrac{mg}{l} \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}. \tag{11.21}$$

This is the form given in (11.8). We shall return to this example in the next section.

11.3 Normal modes

Let us try to find solutions to Eq. (11.8) of the form

$$\delta q_j = A_j e^{i\omega t} \tag{11.22}$$

where ω does not depend on j: all the degrees of freedom oscillate at the same frequency. Such a motion is called a normal mode. In the following we shall use $q_j$ instead of $\delta q_j$: $q_j$ is the generalised coordinate measured with respect to its equilibrium value.

We have $\ddot q_j = -\omega^2 q_j$, so (11.8) reduces to

$$\sum_{k=1}^{N} M_{jk}\, \omega^2 A_k = \sum_{k=1}^{N} K_{jk} A_k. \tag{11.23}$$

If the mass tensor $M_{jk}$ were the identity, Eq. (11.23) would be an eigenvalue equation. For general mass tensors, the equation is a generalised eigenvalue equation. We can reduce this equation to an ordinary eigenvalue equation by multiplying the left and right hand side by the inverse $M^{-1}$ of the mass matrix. We then have:

$$\omega^2 A_j = \sum_{k,l=1}^{N} \left( M^{-1} \right)_{jk} K_{kl} A_l. \tag{11.24}$$

In algebraic terms, the solutions to these equations are the eigenvectors A (with components $A_j$) and the corresponding eigenvalues $\omega^2$. In physical terms, the components $A_j$ are the amplitudes of the oscillatory motions of the generalised coordinates, and ω is the frequency of the oscillation. In order for the normal modes to exist, the eigenvalues should be real. That the eigenvalues are indeed real follows from the fact that both M and K are real, symmetric matrices. The product $M^{-1}K$ is in general not symmetric itself, but for a positive mass matrix it has the same eigenvalues as the real, symmetric matrix $M^{-1/2} K M^{-1/2}$, and it is a well-known result of linear algebra that the eigenvalues of a Hermitian matrix are real (real and symmetric implies Hermitian).

Another question is whether the eigenvalues are positive or negative. Assuming that we are expanding the potential around a minimum, the matrix K can be shown to be positive definite. A positive definite matrix has only positive eigenvalues.¹ Moreover, the mass matrix can be shown to be positive. Then its inverse $M^{-1}$ is also positive. The product of two positive symmetric matrices has only positive eigenvalues. Therefore the eigenvalues $\omega^2$ of $M^{-1}K$ are positive. Hence the frequencies of the oscillations are always real; we do not find exponential growth or decay. In physical terms one could say that perturbing the system from equilibrium always pushes it back to this equilibrium. The ‘spring force’ experienced by the coordinates is therefore always opposite to the perturbation, and an oscillation arises, and not a drift away from equilibrium, or some exponential decay. Such decay may however be found near a local maximum or near a saddle point of the potential.

¹ In fact, we shall occasionally allow for zero eigenvalues; in that case, the matrix is called positive semidefinite.

Let us find the normal modes for the coupled pendulums. Note that this problem is relatively simple as a result of the fact that the mass matrix is diagonal and therefore trivial to invert. After multiplying both sides of Eq. (11.21) by $M^{-1}$, we have a standard diagonalisation problem for the matrix:

$$\begin{pmatrix} \dfrac{M+m}{ML} g + \dfrac{mg}{Ml} & -\dfrac{mg}{Ml} \\ -\dfrac{g}{l} & \dfrac{g}{l} \end{pmatrix}. \tag{11.25}$$

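The eigenvalues are the solutions of the so-called secular equation, which has the form

$$\begin{vmatrix} \dfrac{M+m}{ML} g + \dfrac{mg}{Ml} - \omega^2 & -\dfrac{mg}{Ml} \\ -\dfrac{g}{l} & \dfrac{g}{l} - \omega^2 \end{vmatrix} = 0. \tag{11.26}$$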

This reduces to the following quadratic equation in $\omega^2$:

$$\omega^4 - \frac{M+m}{M} \left( \frac{g}{L} + \frac{g}{l} \right) \omega^2 + \frac{M+m}{M} \frac{g^2}{Ll} = 0. \tag{11.27}$$

This equation has two solutions for $\omega^2$.

We will examine some special cases. If $M \gg m$, then, provided that l is not too close to L, the two roots with corresponding eigenvectors $(A_x, A_y)$ are given by

$$\omega \approx \sqrt{\frac{g}{l}}; \quad \frac{A_x}{A_y} \approx \frac{m}{M} \frac{L}{l - L} \tag{11.28}$$

and

$$\omega \approx \sqrt{\frac{g}{L}}; \quad \frac{A_x}{A_y} \approx \frac{L - l}{L}. \tag{11.29}$$

The first solution describes an almost stationary upper pendulum with the lower one oscillating at its natural frequency. In the second case, the motions of the upper and lower mass are of the same order of magnitude, and the frequency is the natural frequency of the upper pendulum.

If $M \ll m$, the solutions are

$$\omega^2 \approx \frac{g}{L + l}; \quad \frac{A_x}{A_y} \approx \frac{L}{L + l} \tag{11.30}$$

and

$$\omega^2 \approx \frac{m}{M} \left( \frac{g}{L} + \frac{g}{l} \right); \quad \frac{A_x}{A_y} \approx -\frac{m}{M} \frac{L + l}{L}. \tag{11.31}$$

The first case describes a motion in which the two rods are aligned so that we have essentially a single pendulum of length l + L and mass m. The second case corresponds to a very high frequency of the upper mass with an almost stationary lower mass.
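The limiting cases can be compared with the exact roots of the quadratic equation (11.27). The sketch below is an illustration only: the function name, the numerical parameter values and the default g = 9.81 are assumptions, and the amplitude ratio is obtained from the second row of the matrix (11.25), which gives $A_x/A_y = 1 - \omega^2 l/g$.

```python
import math

def double_pendulum_modes(M, m, L, l, g=9.81):
    """Return [(omega_squared, Ax/Ay), ...] for the two normal modes of the
    small-angle double pendulum, lowest frequency first."""
    ratio = (M + m) / M
    b = ratio * (g / L + g / l)     # coefficient of -omega^2 in (11.27)
    c = ratio * g * g / (L * l)     # constant term in (11.27)
    disc = math.sqrt(b * b - 4.0 * c)
    roots = [(b - disc) / 2.0, (b + disc) / 2.0]
    # Second row of (11.25): -(g/l) Ax + (g/l - omega^2) Ay = 0
    return [(w2, 1.0 - w2 * l / g) for w2 in roots]

# Heavy upper mass (M >> m): compare with g/L and g/l.
for w2, amp_ratio in double_pendulum_modes(M=1000.0, m=1.0, L=1.0, l=0.5):
    print("M >> m:", w2, amp_ratio)

# Light upper mass (M << m): compare with g/(L+l) and (m/M)(g/L + g/l).
for w2, amp_ratio in double_pendulum_modes(M=1e-3, m=1.0, L=1.0, l=0.5):
    print("M << m:", w2, amp_ratio)
```

The printed values can be set against the approximate expressions (11.28) to (11.31); the agreement improves as the mass ratio becomes more extreme.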

11.4 Vibrational analysis

The way in which atoms are bound together in molecules is described by quantum mechanics. There is a long standing tradition in the quantum mechanical calculation of stationary states of molecules. In the last fifteen years or so it has become possible to perform dynamical computations of molecules to very good accuracy using fully quantum mechanical calculations. These calculations are quite demanding on computer resources and they do not always give a very good insight into the dynamics of interest. Therefore, a semi-classical approach is often adopted in order to calculate vibration spectra, for example.

Figure 11.2: Interactions in a molecule (stretch, bend and torsion).

First, the total energy of the molecule is calculated as a function of the nuclear positions $R_i$, $i = 1, \ldots, N$ for an N-atomic molecule. There is however a problem in doing this. Suppose we want to calculate this energy for 10 values of all the coordinates of a 10-atom molecule. As there are 30 coordinates, we need to perform $10^{30}$ stationary quantum calculations, which would require a computation time comparable to the age of the universe. Therefore the potential is parametrised in a sensible way, which we now describe. All the chemically bonded atoms are described by harmonic or goniometric interactions. The degrees of freedom chosen for this parametrisation are the bond length, bond angle and dihedral angle. The forces associated with these degrees of freedom are called stretch, bend, and torsion respectively. These degrees of freedom are shown in figure 11.2. The form of the potentials associated with bond stretching is given as

$$V_{\rm Stretch} = \frac{\kappa}{2} (l - l_0)^2 \tag{11.32}$$

where l is the bond length and $l_0$ is the equilibrium bond length. The spring constant κ determines how difficult it is to stretch the bond. The bending potential is given in terms of the bond angle ϕ:

$$V_{\rm Bend} = \frac{\alpha}{2} (\varphi - \varphi_0)^2 \tag{11.33}$$

A similar expression exists for the torsional energy.

The constants κ and α can be determined from stationary quantum mechanical calculations. Assuming that these parameters are known, we shall now use the given form of the potential to calculate the vibration spectrum of a triatomic, linear molecule, such as CS$_2$ or CO$_2$ (see figure 11.3). We neglect bending here, so only bond stretching is taken into account.

If the initial configuration is linear, the motion takes place along a straight line, which we take as our X-axis. The coordinates of the three atoms are $x_1$, $x_2$ and $x_3$. The kinetic energy can therefore be written down immediately:

$$T = \frac{\mu}{2} \left( \dot x_1^2 + \dot x_3^2 \right) + \frac{m}{2} \dot x_2^2. \tag{11.34}$$

The potential energy is given by

$$V = \frac{\kappa}{2} (x_2 - x_1 - l)^2 + \frac{\kappa}{2} (x_3 - x_2 - l)^2. \tag{11.35}$$

Here, l is the equilibrium bond length. The centre of mass of the system will move uniformly as there are no external forces acting, and we take this centre as the origin. The equilibrium coordinates are then $x_1 = -l$, $x_2 = 0$ and $x_3 = l$. The deviations from these values are

$$\delta x_1 = x_1 + l; \quad \delta x_2 = x_2 \quad \text{and} \quad \delta x_3 = x_3 - l. \tag{11.36}$$

Figure 11.3: Triatomic molecule (outer atoms 1 and 3 of mass µ, central atom 2 of mass m).

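In this representation, we have

$$T = \frac{\mu}{2} \left( \delta \dot x_1^2 + \delta \dot x_3^2 \right) + \frac{m}{2} \delta \dot x_2^2 \tag{11.37}$$

and

$$V = \frac{\kappa}{2} (\delta x_2 - \delta x_1)^2 + \frac{\kappa}{2} (\delta x_3 - \delta x_2)^2. \tag{11.38}$$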

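We can find the matrices K and M directly from these expressions:

$$M = \begin{pmatrix} \mu & 0 & 0 \\ 0 & m & 0 \\ 0 & 0 & \mu \end{pmatrix}, \tag{11.39}$$

and

$$K = \kappa \begin{pmatrix} 1 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 1 \end{pmatrix}. \tag{11.40}$$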

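The normal modes can now be found by solving (11.23) with these matrices. The eigenfrequencies follow from the secular equation:

$$\begin{vmatrix} \kappa - \mu\omega^2 & -\kappa & 0 \\ -\kappa & 2\kappa - m\omega^2 & -\kappa \\ 0 & -\kappa & \kappa - \mu\omega^2 \end{vmatrix} = 0. \tag{11.41}$$

This leads to:

$$(\kappa - \mu\omega^2)\, \omega^2 \, (\mu m \omega^2 - \kappa m - 2\kappa\mu) = 0, \tag{11.42}$$

from which we find:

$$\omega_1 = 0; \quad \omega_2 = \sqrt{\frac{\kappa}{\mu}}; \quad \omega_3 = \sqrt{\kappa \left( \frac{1}{\mu} + \frac{2}{m} \right)}. \tag{11.43}$$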

Figure 11.4: The three modes of the triatomic molecule.

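The corresponding eigenvectors can be found after some algebra:

$$A_1 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}; \quad A_2 = \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}; \quad A_3 = \begin{pmatrix} 1 \\ -2\mu/m \\ 1 \end{pmatrix}. \tag{11.44}$$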

The first of these, corresponding to $\omega_1 = 0$, is a mode in which the atoms all slide in the same direction with the same speed. This is a manifestation of the translational symmetry of the problem, which has been recovered by our procedure. The second one represents a mode in which the middle atom stands still and the two outer atoms vibrate oppositely. Obviously, the frequency of this mode is $\omega_2 = \sqrt{\kappa/\mu}$, corresponding to the two springs. Finally, the last mode is one in which the two outer atoms move in one direction, and the central atom in the opposite direction. The motion can be understood by replacing the two outer masses by a single one with mass 2µ at their midpoint, coupled by a spring with spring constant 2κ to the central mass. The reduced mass of this system, $(1/(2\mu) + 1/m)^{-1}$, then occurs in the expression for the resonance frequency. The three modes are depicted in figure 11.4.
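The frequencies and eigenvectors quoted for the triatomic molecule can be verified by inserting them into the generalised eigenvalue equation $K A = \omega^2 M A$. The following sketch does this for illustrative parameter values; the numbers chosen for κ, µ and m are arbitrary assumptions in arbitrary units (they are not fitted to any real molecule), and the helper `residual` is a name introduced here.

```python
import math

def residual(K, M_diag, A, omega2):
    """Largest component of K A - omega^2 M A for a diagonal mass matrix."""
    n = len(A)
    return max(
        abs(sum(K[j][k] * A[k] for k in range(n)) - omega2 * M_diag[j] * A[j])
        for j in range(n)
    )

kappa, mu, m = 1.0, 16.0, 12.0      # assumed values, arbitrary units
K = [[kappa, -kappa, 0.0],
     [-kappa, 2.0 * kappa, -kappa],
     [0.0, -kappa, kappa]]          # matrix (11.40)
M_diag = [mu, m, mu]                # diagonal of matrix (11.39)

modes = [
    (0.0, [1.0, 1.0, 1.0]),                                     # translation
    (kappa / mu, [1.0, 0.0, -1.0]),                             # symmetric stretch
    (kappa * (1.0 / mu + 2.0 / m), [1.0, -2.0 * mu / m, 1.0]),  # antisymmetric stretch
]
for omega2, A in modes:
    print(math.sqrt(omega2), residual(K, M_diag, A, omega2))
```

A residual at the level of rounding errors for each mode indicates that the pair $(\omega^2, A)$ indeed solves the equation.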

11.5 The chain of particles

In the previous section we have analysed a triatomic molecule. Now we shall analyse a larger system: a chain of N particles. We assume that all particles have the same mass, and that they are connected by a string with tension τ. The particles are assumed to move only in the vertical (y) direction, and the x-components of adjacent particles differ by a separation d. The first and last spring are connected to points at y = 0. The chain is depicted in figure 11.5. The chain is a model for a continuous string, which is obtained by letting $N \to \infty$ and $d \to 0$ while keeping the string length Nd fixed.

Let us consider particle number k. The springs connecting this particle to its neighbours are stretched, and this may result in a net force acting on particle k. The spring between particle k and k+1 has a length

$$l = \sqrt{d^2 + (y_{k+1} - y_k)^2} \approx d + \frac{(y_{k+1} - y_k)^2}{2d} \tag{11.45}$$

Figure 11.5: The harmonic chain of particles.

where a first order Taylor expansion is used to obtain the second expression. The potential energy for this link is equal to the tension τ times the extension of the string, and therefore we find for the total potential energy:

$$V = \frac{\tau}{2d} \sum_{k=0}^{N} (y_{k+1} - y_k)^2 \tag{11.46}$$

with

$$y_0 = y_{N+1} = 0. \tag{11.47}$$

The kinetic energy is given by

$$T = \sum_{k=1}^{N} \frac{m}{2} \dot y_k^2. \tag{11.48}$$

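We now find the matrices $M_{kl}$ and $K_{kl}$ as:

$$M_{kl} = m\,\delta_{kl} \tag{11.49}$$

where $\delta_{kl}$ is the Kronecker delta; in other words, $M_{kl}$ is m times the unit matrix. For $K_{kl}$ we find:

$$K = \begin{pmatrix} \frac{2\tau}{d} & -\frac{\tau}{d} & 0 & 0 & 0 & \cdots & 0 \\ -\frac{\tau}{d} & \frac{2\tau}{d} & -\frac{\tau}{d} & 0 & 0 & \cdots & 0 \\ 0 & -\frac{\tau}{d} & \frac{2\tau}{d} & -\frac{\tau}{d} & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \end{pmatrix}. \tag{11.50}$$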

The normal mode equation (11.23) can be solved analytically for arbitrary N by substituting for the eigenvector $A_k = \gamma \exp(i\alpha k)$, where γ is some constant. This trial solution does not satisfy the boundary conditions (11.47), but we do not bother about this for the moment. Then for $2 \le k \le N - 1$ we find

$$m\omega^2 e^{ik\alpha} = \frac{\tau}{d} \left( -e^{i\alpha(k-1)} + 2e^{ik\alpha} - e^{i\alpha(k+1)} \right). \tag{11.51}$$

Dividing left and right hand side by $\exp(ik\alpha)$, we find

$$m\omega^2 = \frac{2\tau}{d} (1 - \cos\alpha). \tag{11.52}$$

For each α, there is also a solution for −α with the same ω. This can be used to construct a solution

$$A_k = \gamma \left( e^{ik\alpha} - e^{-ik\alpha} \right) = 2i\gamma \sin(k\alpha). \tag{11.53}$$

This solution always vanishes at k = 0, and it also vanishes at k = N + 1, as required by (11.47), when $(N+1)\alpha = n\pi$ for integer n. So the conclusion is that for each n = 1, . . . , N, we have a nontrivial solution which vanishes at the two ends of the string (for n = 0 and n = N + 1 all amplitudes $A_k$ vanish). For values of n outside this range, the solutions obtained are identical, up to a sign, to the solutions with $1 \le n \le N$. For each solution, all particles move up and down with the same frequency, given by (11.52). The wavelength is given by kd such that $k\alpha = 2\pi$, so $\lambda = 2\pi d/\alpha$, and the wavevector is $q = \alpha/d$.

It is possible to formulate the Lagrangian directly in a continuum form, and derive the wave equation from this. Note that in the continuum limit, α and d small, we obtain from (11.52) for the frequency:

$$\omega^2 = \frac{\tau\alpha^2}{md} = \frac{\tau d}{m} q^2. \tag{11.54}$$

Comparing with the well known dispersion relation ω = cq, we learn that the sound speed c is given as $\sqrt{\tau d/m}$. Defining the density ρ = m/d, we have

$$c = \sqrt{\frac{\tau}{\rho}}. \tag{11.55}$$
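The analytic solution for the chain can be checked by inserting it into the normal mode equation. The sketch below assumes the fixed-end boundary conditions (11.47), $y_0 = y_{N+1} = 0$, which select $\alpha = n\pi/(N+1)$ with amplitudes proportional to $\sin(k\alpha)$; the function names and the default parameter values are assumptions made for this illustration.

```python
import math

def chain_mode(n, N, tau=1.0, d=1.0, m=1.0):
    """Frequency squared and amplitudes A_1..A_N of mode n for a chain
    with fixed ends y_0 = y_{N+1} = 0."""
    alpha = n * math.pi / (N + 1)
    omega2 = 2.0 * tau / (m * d) * (1.0 - math.cos(alpha))   # from (11.52)
    A = [math.sin(k * alpha) for k in range(1, N + 1)]
    return omega2, A

def mode_residual(n, N, tau=1.0, d=1.0, m=1.0):
    """Largest violation of sum_l K_kl A_l = m omega^2 A_k, with the
    tridiagonal K of (11.50)."""
    omega2, A = chain_mode(n, N, tau, d, m)
    padded = [0.0] + A + [0.0]      # boundary values y_0 and y_{N+1}
    worst = 0.0
    for k in range(1, N + 1):
        force = tau / d * (-padded[k - 1] + 2.0 * padded[k] - padded[k + 1])
        worst = max(worst, abs(force - m * omega2 * padded[k]))
    return worst

N = 8
for n in range(1, N + 1):
    print(n, chain_mode(n, N)[0], mode_residual(n, N))

# Long-wavelength check: omega / q should approach c = sqrt(tau * d / m).
omega2, _ = chain_mode(1, 1000)
q = math.pi / 1001                  # q = alpha / d with d = 1
print(math.sqrt(omega2) / q)
```

The last line compares the lowest mode of a long chain with the continuum sound speed, which equals 1 for the default parameters.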


12

Density operators — Quantum information theory

1 Introduction

In this section, we extend the formalism of quantum physics to include statistical uncertainty which can be traced back to our lack of knowledge of what the wavefunction actually is. States whose form we are not certain about are described by an object called the density matrix. Density matrices can be used to detect coupling between a particle and an outside world, which in the simplest case is another particle. We shall see that this coupling may lead to quantum states which do not have a classical analogue; these states are called entangled. Entanglement is used in novel technological applications which are based on the quantum nature of matter. The most spectacular realisation of this trend which may be achieved in the next few decades is the quantum computer, which will be briefly discussed towards the end of this chapter.

2 The density operator

Up to this point, we have always assumed that a quantum system can be described by a wavefunction which contains all the information which can in principle be obtained about the system. In particular, knowledge of the wavefunction enables us to predict the possible outcomes of physical measurements and their probabilities. For example, if we know that the electron in a hydrogen atom finds itself in a state

$$\frac{1}{\sqrt{3}} \left( |2,1,1,+\rangle + i\,|2,1,0,+\rangle - |2,1,-1,-\rangle \right), \tag{1}$$

where the ket-vectors are of the form $|n, l, m_z, s_z\rangle$, we can calculate the possible outcome of any measurement and its respective probability. In this example, a measurement of $L_z$ would yield the value $\hbar$, 0 or $-\hbar$, all with probability 1/3. Although knowledge of the quantum mechanical wavefunction does not predict outcomes of measurements unambiguously, the wavefunction is the most precise knowledge we can have about a system. In that sense, the wavefunction is for quantum mechanics what the positions and velocities are for a classical many-particle system.

In some cases we might indeed know the state of a quantum system, for example at very low temperatures where particles are almost certainly in the ground state, or when they are in some collective quantum state, as is the case in superfluidity or superconductivity. Another example is a quantum system of which we just have measured all quantities corresponding to the observation maximum. If we would measure for example $L_z = -\hbar$, $L^2 = 2\hbar^2$, $E = E_2$ and $S_z = -\hbar/2$, we can be sure that just after that measurement the system is in the state

$$|2,1,-1,-\rangle. \tag{2}$$

Note that we do not know the overall phase factor, but this factor drops out when calculating physical quantities.

In most practical situations, however, we do not know the wavefunction at all! Moreover, if we do not know the state, we cannot infer its shape from whatever sequence of whatever measurements, as the first measurement reduces the state so that it changes considerably (as mentioned above, we do know its state immediately after the measurement). Moreover, suppose we have carefully prepared a system in a well-defined state; the interaction with its surroundings will alter that state. Does this mean that quantum mechanics might be a nice theory, but that the state of affairs is that we cannot do anything with it in practice? The answer is no: even if we do not know the state of a system precisely, we usually have some statistical knowledge about it. This means that we know the probability $P_r$ that a system is in state $|r\rangle$. Now this might become very confusing: quantum mechanics allows us to make statistical predictions, and now I say that the state of a system is specified in a statistical manner. It might be helpful to keep in mind that a wavefunction by itself is not a statistical object at all: it is a well-defined object whose time-evolution can be calculated with, in principle, arbitrary precision. However, measurements performed on a system described by a known wavefunction are subject to quantum mechanical uncertainty. This is called intrinsic uncertainty. If we have, for whatever reason, incomplete knowledge of the state of the system, we speak of extrinsic uncertainty. One of the conceptual difficulties students and scientists have with quantum mechanics is the difference between the wavefunction which evolves smoothly and deterministically according to the time-dependent Schrödinger equation on the one hand, and the abrupt change taking place at a measurement, where the state is instantaneously reduced to the eigenfunction of the operator corresponding to the physical quantity we measure, on the other hand.

As an example of a state which is not known explicitly, consider again the hydrogen atom. Suppose we have thousands of hydrogen atoms which we can measure. All these atoms have undergone the same preparation. We measure the energy, $L^2$ and $L_z$ of the atom. In all cases we find that the energy is that of the first excited state, $E = E_2$, and l = 1, but in 25 % of the cases we find m = 1, in 50 % of the cases m = 0 and in the remaining 25 % we find m = −1. It now is tempting to say that the state of every hydrogen atom (neglecting the electron spin) can be written as

$$|\psi\rangle = \frac{1}{2} |2,1,1\rangle + \frac{1}{\sqrt{2}} |2,1,0\rangle + \frac{1}{2} |2,1,-1\rangle \tag{3}$$

But can we really infer this information from our measurements? We could flip the sign of the second term on the right hand side, and we would find the same probabilities as with the state given above. Can you now still tell me which state the atoms are in?
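That the measured probabilities do not fix the state can be made explicit with a few lines of code. This is only a sketch: the list representation of the expansion coefficients and the function name `probabilities` are choices made here for illustration.

```python
import math

def probabilities(amplitudes):
    """Measurement probabilities |c_m|^2 for expansion coefficients c_m."""
    return [abs(c) ** 2 for c in amplitudes]

# Coefficients of |2,1,1>, |2,1,0>, |2,1,-1> in state (3) ...
state_a = [0.5, 1.0 / math.sqrt(2.0), 0.5]
# ... and the same state with the sign of the second term flipped.
state_b = [0.5, -1.0 / math.sqrt(2.0), 0.5]

print(probabilities(state_a))
print(probabilities(state_b))
```

Both lists of probabilities coincide, so measurements of the energy, $L^2$ and $L_z$ alone cannot distinguish between the two states.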

To emphasise the point more strongly, we look at the simplest possible nontrivial system, described by a two-dimensional Hilbert space, e.g. a spin-1/2 particle. Suppose someone, Charlie, gives us an electron but he does not know its spin state. He does however know that there is no reason for the spin to be preferably up or down, so the probability to measure spin ‘up’ or ‘down’ is 1/2 for both. Does that give us enough information to specify the state? Well, you might guess that the state is

$$|\psi\rangle = \frac{1}{\sqrt{2}} \left( |1/2\rangle + |-1/2\rangle \right), \tag{4}$$

but why couldn’t it be

$$|\psi\rangle = \frac{1}{\sqrt{2}} \left( |1/2\rangle - |-1/2\rangle \right)? \tag{5}$$

In fact the state of the system could be anything of the form

$$|\psi\rangle = \frac{1}{\sqrt{2}} \left( |1/2\rangle + e^{i\varphi} |-1/2\rangle \right), \tag{6}$$

for any real ϕ.

Although we do not know the wavefunction exactly, we can evaluate the expectation value of the z-component of the spin: as we find $\hbar/2$ and $-\hbar/2$ with equal probabilities, the expectation value is 0. More generally, if we have a spin which is in the spin-up state with probability p and in the down state with probability 1 − p, the expectation value of the z-component of the spin is $\hbar(p - 1/2)$. So expectation values can still be found, although we do not have complete information about the state of the system. This fact might raise the question whether there is any difference in measured physical quantities between one of the candidate wavefunctions suggested above, and the information that the particle is in the ‘up’ state with probability 1/2 and in the ‘down’ state with the same probability. After all, they both give the same value for the expectation value of the z-component of the spin.

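We now introduce the following states:

$$|\psi_1\rangle = |1/2\rangle; \tag{7a}$$

$$|\psi_2\rangle = |-1/2\rangle; \tag{7b}$$

$$|\psi_3\rangle = \frac{1}{\sqrt{2}} \left( |1/2\rangle + |-1/2\rangle \right); \tag{7c}$$

$$|\psi_4\rangle = \frac{1}{\sqrt{2}} \left( |1/2\rangle - |-1/2\rangle \right); \tag{7d}$$

$$|\psi_5\rangle = \frac{1}{\sqrt{2}} \left( |1/2\rangle + i\,|-1/2\rangle \right); \tag{7e}$$

$$|\psi_6\rangle = \frac{1}{\sqrt{2}} \left( |1/2\rangle - i\,|-1/2\rangle \right). \tag{7f}$$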

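These states are recognised as the spin-up and -down states for the z, x and y directions. Let us consider a particle in the state $\left( |1/2\rangle + e^{i\varphi} |-1/2\rangle \right)/\sqrt{2}$, and calculate the probability of finding at a measurement this particle in the state $|\psi_3\rangle$:

$$\left| \frac{1}{2} \left( \langle 1/2| + \langle -1/2| \right) \left( |1/2\rangle + e^{i\varphi} |-1/2\rangle \right) \right|^2 = \left| \frac{1 + \exp(i\varphi)}{2} \right|^2 = \frac{1}{2} (1 + \cos\varphi). \tag{8}$$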

If we evaluate the probability to find the particle in the state |ψ3〉 in the case where it was, before the measurement, in a so-called mixed state, which is given with equal probabilities to be |1/2〉 and |−1/2〉, we find 1/2, as can easily be verified. Calculating the probabilities for a particle to be found in the states |ψ1〉 to |ψ6〉 we find the following results.

State    Pure state (|1/2〉 + e^{iϕ}|−1/2〉)/√2    Equal mixture of |1/2〉 and |−1/2〉
|ψ1〉     1/2                                     1/2
|ψ2〉     1/2                                     1/2
|ψ3〉     (1 + cos ϕ)/2                           1/2
|ψ4〉     (1 − cos ϕ)/2                           1/2
|ψ5〉     (1 + sin ϕ)/2                           1/2
|ψ6〉     (1 − sin ϕ)/2                           1/2
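The entries of such a table can be checked numerically. The following is a minimal sketch in plain Python; the names `STATES`, `pure_prob` and `mixed_prob` are illustrative choices, and states are stored as (up, down) amplitude pairs.

```python
import cmath
import math

S = 1 / math.sqrt(2)

# Measurement states of Eq. (7) as (up, down) amplitude pairs.
STATES = {
    "psi1": (1, 0),
    "psi2": (0, 1),
    "psi3": (S, S),
    "psi4": (S, -S),
    "psi5": (S, 1j * S),
    "psi6": (S, -1j * S),
}


def overlap_prob(chi, psi):
    """Return |<chi|psi>|^2 for two-component states."""
    amp = sum(c.conjugate() * p for c, p in zip(map(complex, chi), psi))
    return abs(amp) ** 2


def pure_prob(name, phi):
    """Probability of finding (|up> + e^{i phi}|down>)/sqrt(2) in STATES[name]."""
    psi = (S, S * cmath.exp(1j * phi))
    return overlap_prob(STATES[name], psi)


def mixed_prob(name):
    """Same probability for the equal classical mixture of |up> and |down>."""
    chi = STATES[name]
    return 0.5 * overlap_prob(chi, (1, 0)) + 0.5 * overlap_prob(chi, (0, 1))


if __name__ == "__main__":
    for name in STATES:
        print(name, round(pure_prob(name, 0.3), 4), round(mixed_prob(name), 4))
```

Scanning `phi` over a range of values shows that the pure-state column always depends on the phase for at least two of the six measurements, whereas the mixture column does not.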

We see that there is no ϕ, i.e. no pure state, which leads to the same probabilities for all measurement results. It is important to make the distinction between the two cases very clearly: if Charlie gives us, millions of times, a particle which is in a state with an arbitrary ϕ, we will find probabilities 1/2 to find the particle in either (|1/2〉 + |−1/2〉)/√2 or (|1/2〉 − |−1/2〉)/√2. If the phase were always the same, say ϕ = 0, then we would find probabilities 1 and 0 respectively. Therefore, a mixed state indicates uncertainty of the relative phase of the components of the wave function.

Let us summarize what we have learned:

A system can be either in a pure or a mixed state. In the first case, we know precisely the wavefunction of the system. In the second case, we are not sure about the state, but we can ascribe a probability for the system to be in any of the states accessible to it.

Note that the uncertainty about the state the particle is in, is a classical uncertainty. We can for example flip a coin and, depending on whether the result is heads or tails, send a spin-up or -down particle to a friend. Our friend then only knows that the probability for the particle he receives to be ‘up’ is 1/2, and similarly for ‘down’.

We now turn to the general case of a system which can be in any one of a set of normalised, but not necessarily orthogonal, states |ψ_i〉. The probability for the system to be in the state |ψ_i〉 is p_i, with obviously ∑_i p_i = 1. Suppose the expectation value of some operator A in state |ψ_i〉 is given by A_i. Then the expectation value of A for the system at hand is given by

\[ \langle A \rangle = \sum_i p_i A_i = \sum_i p_i \langle\psi_i| A |\psi_i\rangle. \tag{9} \]

We now introduce the density operator, which is in some sense the ‘optimal’ specification of the system. The density operator is defined as

\[ \rho = \sum_i p_i |\psi_i\rangle\langle\psi_i|. \tag{10} \]

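Suppose the set |φ_n〉 forms a basis of the Hilbert space of the system under consideration. Then the expectation value of the operator A can be rewritten, after inserting the unit operator 1 = ∑_n |φ_n〉〈φ_n|, as

\begin{align}
\langle A \rangle &= \sum_i p_i \langle\psi_i| A |\psi_i\rangle = \sum_i p_i \langle\psi_i| A \sum_n |\phi_n\rangle\langle\phi_n|\psi_i\rangle \\
&= \sum_n \langle\phi_n| \Big[ \sum_i p_i |\psi_i\rangle\langle\psi_i| \Big] A |\phi_n\rangle = \sum_n \langle\phi_n| \rho A |\phi_n\rangle = \mathrm{Tr}\,(\rho A). \tag{11}
\end{align}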

Here we have used the trace operator, Tr, which adds all diagonal terms of an operator. For a general operator Q:

\[ \mathrm{Tr}\, Q = \sum_n \langle\phi_n| Q |\phi_n\rangle. \tag{12} \]

The trace is independent of the basis used: it is invariant under a basis transformation. We omit the hat from operators unless confusion may arise. Another property of the trace is

\[ \mathrm{Tr}\,(|\psi\rangle\langle\chi|) = \langle\chi|\psi\rangle, \tag{13} \]

which is easily verified by writing out the trace with respect to a basis φ_n.

If a system is in a well-defined quantum state |ψ〉, we say that the system is in a pure state. In that case the density operator is

\[ \rho = |\psi\rangle\langle\psi|. \tag{14} \]
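The equivalence of the weighted average (9) and the trace formula (11) can be checked for a small example. The sketch below assumes a two-dimensional Hilbert space, an arbitrary choice of three non-orthogonal states with weights 0.5, 0.3 and 0.2, and the Pauli matrix σ_x as the operator A; all helper names are illustrative.

```python
import math


def matmul(X, Y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def trace(X):
    return X[0][0] + X[1][1]


def expval(A, psi):
    """<psi|A|psi> for a two-component state psi."""
    return sum(complex(psi[i]).conjugate() * A[i][j] * psi[j]
               for i in range(2) for j in range(2))


def density(probs, states):
    """rho = sum_i p_i |psi_i><psi_i|, Eq. (10)."""
    rho = [[0j, 0j], [0j, 0j]]
    for p, psi in zip(probs, states):
        for i in range(2):
            for j in range(2):
                rho[i][j] += p * psi[i] * complex(psi[j]).conjugate()
    return rho


s = 1 / math.sqrt(2)
states = [(1, 0), (s, s), (s, 1j * s)]   # normalised, not orthogonal
probs = [0.5, 0.3, 0.2]
sigma_x = [[0, 1], [1, 0]]

rho = density(probs, states)
weighted_average = sum(p * expval(sigma_x, psi) for p, psi in zip(probs, states))
trace_formula = trace(matmul(rho, sigma_x))
```

Both `weighted_average` and `trace_formula` should agree up to rounding, and `trace(rho)` should equal 1.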


If the system is not in a pure state, but only the statistical weights p_i of the states |ψ_i〉 are known, we say that the system is in a mixed state. When someone gives you a density operator, how can you assess whether it corresponds to a pure or a mixed state? Well, it is clear that for a pure state we have ρ² = ρ, which means that ρ is a projection operator (see footnote 1):

\[ \rho^2 = |\psi\rangle\langle\psi|\psi\rangle\langle\psi| = |\psi\rangle\langle\psi| = \rho, \tag{15} \]

where we have used the fact that ψ is normalised.

For a mixed state, such as

\[ \rho = \alpha |\psi\rangle\langle\psi| + \beta |\phi\rangle\langle\phi| \tag{16} \]

where 〈ψ|φ〉 = 0, we have

\[ \rho^2 = \alpha^2 |\psi\rangle\langle\psi| + \beta^2 |\phi\rangle\langle\phi| \neq \rho. \tag{17} \]

Although we have considered a particular example here, it holds in general for a mixed state that ρ is not a projection operator.

Another way to see this is to look at the eigenvalues of ρ. For a pure state, for which ρ = |ψ〉〈ψ|, clearly |ψ〉 is an eigenstate of ρ with eigenvalue 1, and all other eigenvalues are 0 (their eigenstates are all states which are perpendicular to |ψ〉). These values for the eigenvalues are the only ones which are allowed by a projection operator. As

\[ \mathrm{Tr}\,\rho = \sum_i p_i = 1, \tag{18} \]

we have

\[ \sum_i \lambda_i = 1. \tag{19} \]

Now let us evaluate, for a normalised state |φ〉,

\[ \langle\phi|\rho|\phi\rangle = \sum_i p_i |\langle\psi_i|\phi\rangle|^2 \le 1, \tag{20} \]

where the fact that |〈ψ_i|φ〉| ≤ 1, combined with ∑_i p_i = 1, leads to the inequality; the sum is also non-negative, as every term is. The condition ∑_i λ_i = 1 means that either one of the eigenvalues is 1 and the rest are 0, or they are all strictly smaller than 1. Thus, for an eigenstate φ of the density operator, we have

\[ \langle\phi|\rho|\phi\rangle = \langle\phi|\lambda|\phi\rangle = \lambda \le 1. \tag{21} \]

We see that a density operator has eigenvalues between 0 and 1. In summary:

The sum of the eigenvalues of the density operator is 1. The situation where only one of these eigenvalues is 1 and the rest are 0 corresponds to a pure state. If there are eigenvalues 0 < λ < 1, then we are dealing with a mixed state.
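For a two-dimensional Hilbert space this criterion is easy to test in practice. The sketch below uses the closed-form eigenvalues of a 2×2 Hermitian matrix; the function names and the three example matrices are illustrative assumptions.

```python
import math


def eigenvalues_2x2(rho):
    """Eigenvalues (low, high) of a 2x2 Hermitian matrix [[a, b], [conj(b), d]]."""
    a = complex(rho[0][0]).real
    d = complex(rho[1][1]).real
    b = rho[0][1]
    mean = 0.5 * (a + d)
    radius = math.sqrt(0.25 * (a - d) ** 2 + abs(b) ** 2)
    return mean - radius, mean + radius


def is_pure(rho, tol=1e-12):
    """A density matrix is pure when its eigenvalues are exactly 0 and 1."""
    low, high = eigenvalues_2x2(rho)
    return abs(high - 1) < tol and abs(low) < tol


pure = [[0.5, 0.5], [0.5, 0.5]]        # |psi3><psi3| of Eq. (7c)
mixed = [[0.5, 0.0], [0.0, 0.5]]       # equal mixture of up and down
partly = [[0.75, 0.0], [0.0, 0.25]]    # unequal mixture
```

Here `is_pure(pure)` holds, while both mixtures have eigenvalues strictly between 0 and 1.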

To summarize, we can say that if a system is in a mixed state, it can be characterized by a set of possible wavefunctions |ψ_i〉 and probabilities p_i for the system to be in each of those wave functions. But a more compact way of representing our knowledge of the system is by using the density operator, which can be constructed when we know the possible states ψ_i and their probabilities p_i [see Eq. (10)]. The density operator can be used to calculate expectation values using the trace, see Eq. (11).

1 Recall that a projection operator P is a Hermitian operator satisfying P² = P.


Let us consider an example. Take again the case where Charlie sends us a spin-up or -down particle with equal probabilities. For convenience, we denote these two states as |0〉 (spin-up) and |1〉 (spin-down). Then the density operator can be evaluated as

\[ \rho = \frac{1}{2} |0\rangle\langle 0| + \frac{1}{2} |1\rangle\langle 1|. \tag{22} \]

This operator works in a two-dimensional Hilbert space; therefore it can be represented as a 2×2 matrix:

\[ \rho = \begin{pmatrix} 1/2 & 0 \\ 0 & 1/2 \end{pmatrix}. \tag{23} \]

The matrix elements are evaluated as follows. The upper-left element is

\[ \langle 0|\rho|0\rangle = \frac{1}{2} \langle 0|0\rangle\langle 0|0\rangle + \frac{1}{2} \langle 0|1\rangle\langle 1|0\rangle = 1/2 \tag{24} \]

as follows from (22) and from the orthonormality of the two basis states. The upper-right element is given by

\[ \langle 0|\rho|1\rangle = \frac{1}{2} \langle 0|0\rangle\langle 0|1\rangle + \frac{1}{2} \langle 0|1\rangle\langle 1|1\rangle = 0 \tag{25} \]

as a result of orthogonality. The lower-left element 〈1|ρ|0〉 and the lower-right 〈1|ρ|1〉 are found similarly. Another interesting way to find the density matrix (i.e. the matrix representation of the density operator) is by directly using the vector representation of the states |0〉 and |1〉:

\[ \rho = \frac{1}{2} \begin{pmatrix} 1 \\ 0 \end{pmatrix} (1, 0) + \frac{1}{2} \begin{pmatrix} 0 \\ 1 \end{pmatrix} (0, 1) = \begin{pmatrix} 1/2 & 0 \\ 0 & 1/2 \end{pmatrix}. \tag{26} \]

Note the somewhat unusual order in which we encounter column and row vectors: the result is not a number, but an operator.

Another day, Charlie decides to send us particles which are either ‘up’ or ‘down’ along the x-axis. As you might remember, the eigenstates are

\[ \frac{1}{\sqrt{2}} \left( |0\rangle + |1\rangle \right) \tag{27} \]

for spin-up (along x) and

\[ \frac{1}{\sqrt{2}} \left( |0\rangle - |1\rangle \right) \tag{28} \]

for spin-down. You recognize these states as the states |ψ3〉 and |ψ4〉 given above. Now let us work out the density operator:

\[ \rho = \frac{1}{4} \begin{pmatrix} 1 \\ 1 \end{pmatrix} (1, 1) + \frac{1}{4} \begin{pmatrix} 1 \\ -1 \end{pmatrix} (1, -1) = \begin{pmatrix} 1/2 & 0 \\ 0 & 1/2 \end{pmatrix}. \tag{29} \]

We see that we obtain the same density matrix! Apparently, the particular axis used by Charlie does not affect what we measure at our end.
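The outer-product construction used in (26) and (29) translates directly into code. The following sketch, with the illustrative helper `density`, builds both density matrices and lets one compare them entry by entry.

```python
import math


def density(probs, states):
    """rho = sum_i p_i |psi_i><psi_i| for two-component states (column times row)."""
    rho = [[0j, 0j], [0j, 0j]]
    for p, psi in zip(probs, states):
        for i in range(2):
            for j in range(2):
                rho[i][j] += p * psi[i] * complex(psi[j]).conjugate()
    return rho


s = 1 / math.sqrt(2)
rho_z = density([0.5, 0.5], [(1, 0), (0, 1)])     # up/down along z
rho_x = density([0.5, 0.5], [(s, s), (s, -s)])    # up/down along x

max_difference = max(abs(rho_z[i][j] - rho_x[i][j])
                     for i in range(2) for j in range(2))
```

The quantity `max_difference` vanishes up to rounding, so no measurement on the received particles can reveal which axis was used.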

Another question we frequently ask ourselves when dealing with quantum systems is: What is the probability to find the system in a state |φ〉 in a measurement? The answer for a system which is in a pure state |ψ〉 is:

\[ P_\phi = |\langle\phi|\psi\rangle|^2. \tag{30} \]


If the system can be in any one of a set of states |ψ_i〉 with respective probabilities p_i, the answer is therefore

\[ P_\phi = \sum_i p_i |\langle\phi|\psi_i\rangle|^2. \tag{31} \]

Another way to obtain the expression on the right hand side is by using the density operator:

\[ \langle\phi|\rho|\phi\rangle = \sum_i p_i |\langle\phi|\psi_i\rangle|^2 = P_\phi. \tag{32} \]

This equation follows directly from the definition of the density operator.

Important examples of systems in a mixed state are statistical systems connected to a heat bath. Loosely speaking, the actual state of the system without the bath varies with time, and we do not know that state when we perform a measurement. We know however from statistical physics that the probability for the system to be in a state with energy E is proportional to the Boltzmann factor exp[−E/(k_B T)], so the density operator can be written as

\[ \rho = N \sum_i |\psi_i\rangle\, e^{-E_i/(k_B T)}\, \langle\psi_i| \tag{33} \]

where the ψ_i are eigenstates of the Hamiltonian. The prefactor N is adjusted such that N ∑_i e^{−E_i/(k_B T)} = 1 in order to guarantee that Tr ρ = 1. The density operator can also be written as

\[ \rho = N e^{-H/(k_B T)}, \tag{34} \]

as can be verified as follows:

\[ e^{-H/(k_B T)} = \sum_i |\psi_i\rangle\langle\psi_i|\, e^{-H/(k_B T)} \sum_j |\psi_j\rangle\langle\psi_j| = \sum_i |\psi_i\rangle\, e^{-E_i/(k_B T)}\, \langle\psi_i|. \tag{35} \]

Any expectation value can now in principle be evaluated. For example, consider a spin-1/2 particle connected to a heat bath of temperature T in a magnetic field B pointing in the z-direction. The Hamiltonian is given by

\[ H = -\gamma B S_z. \tag{36} \]

Then the expectation value of the z-component of the spin can be calculated as

\[ \langle S_z \rangle = \mathrm{Tr}\,(\rho S_z). \tag{37} \]

We can evaluate ρ. Using the notation β = 1/(k_B T) it reads:

\[ \rho = \frac{1}{e^{\beta\gamma\hbar B/2} + e^{-\beta\gamma\hbar B/2}} \begin{pmatrix} e^{\beta\gamma\hbar B/2} & 0 \\ 0 & e^{-\beta\gamma\hbar B/2} \end{pmatrix}. \tag{38} \]

Now the expectation value 〈S_z〉 can immediately be found, using S_z = ħσ_z/2, where σ_z is the Pauli matrix:

\[ \langle S_z \rangle = \mathrm{Tr}\,(\rho S_z) = \frac{\hbar}{2} \tanh(\beta\gamma\hbar B/2). \tag{39} \]
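The thermal average can be reproduced numerically by building the Boltzmann weights explicitly. The sketch below assumes units with ħ = k_B = 1 by default; the function name `thermal_sz` and its parameters are illustrative.

```python
import math


def thermal_sz(gamma, B, T, hbar=1.0, k_B=1.0):
    """<S_z> = Tr(rho S_z) for H = -gamma * B * S_z at temperature T.

    Both rho and S_z are diagonal in the S_z eigenbasis, so the trace
    reduces to a sum over the two diagonal entries.
    """
    beta = 1.0 / (k_B * T)
    sz = [hbar / 2, -hbar / 2]                    # eigenvalues of S_z (up, down)
    energies = [-gamma * B * s for s in sz]       # eigenvalues of H
    weights = [math.exp(-beta * E) for E in energies]
    Z = sum(weights)                              # 1/N in the notation of Eq. (33)
    rho_diag = [w / Z for w in weights]
    return sum(r * s for r, s in zip(rho_diag, sz))
```

For any positive temperature the result should coincide with (ħ/2) tanh(βγħB/2); at high temperature it tends to zero, and at low temperature to ħ/2 for γB > 0.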

Considering systems of noninteracting particles, the density operator can be used to derive the average occupation of energy levels, leading to the well-known Fermi-Dirac distribution for fermions, and the Bose-Einstein distribution for bosons. This derivation is however beyond the scope of this lecture course; it is treated in your statistical mechanics course.


3 Entanglement

Entanglement is a phenomenon which can occur when two or more quantum systems are coupled. We shall focus on the simplest nontrivial system exhibiting entanglement: two particles, A and B, whose degrees of freedom each span a two-dimensional Hilbert space (as usual, you may think of two spin-1/2 particles). The states of the particles are denoted |0〉 and |1〉. Therefore, the possible states of the two-particle system are linear combinations of the states

\[ |00\rangle, \quad |01\rangle, \quad |10\rangle \quad \text{and} \quad |11\rangle \tag{40} \]

(the first number denotes the state of particle A and the second one that of particle B). We use these states (in this order) as a basis of the four-dimensional Hilbert space, that is, we may identify

\[ |00\rangle \Leftrightarrow \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} \tag{41} \]

and so on.

Suppose the system is in the state

\[ |\psi\rangle = \frac{1}{2} \left( |00\rangle + |01\rangle + |10\rangle + |11\rangle \right) \tag{42} \]

or, in vector notation:

\[ \psi = \frac{1}{2} \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix}. \tag{43} \]

Note that this state is normalised.

We perform measurements of the first spin only. More specifically, we measure the probabilities for the system to be in the states

\[ |\psi_1\rangle = |0\rangle, \quad |\psi_2\rangle = |1\rangle, \quad |\psi_3\rangle = \frac{1}{\sqrt{2}} \left( |0\rangle + |1\rangle \right) \quad \text{or} \quad |\psi_4\rangle = \frac{1}{\sqrt{2}} \left( |0\rangle - |1\rangle \right). \tag{44} \]

The resulting probabilities are (check this!):

\begin{align}
P_1 &= P_2 = 1/2; \tag{45a} \\
P_3 &= 1; \quad P_4 = 0. \tag{45b}
\end{align}

These are precisely the same results as those found above for a single particle in the state |ψ3〉 = (|0〉 + |1〉)/√2, that is, if we want to predict measurements on the first particle, we can forget about the second particle. The reason for this is that we can write the state (42) as

\[ \frac{1}{2} \left( |0\rangle_A + |1\rangle_A \right) \otimes \left( |0\rangle_B + |1\rangle_B \right) \tag{46} \]

where ⊗ is the so-called tensor product. The fact that (42) can be written as a (tensor) product of pure states of the two subsystems A and B is responsible for the fact that the second particle does not ‘interfere’ with the first one.


Now consider the state

\[ |\psi_E\rangle = \frac{1}{\sqrt{2}} \left( |00\rangle + |11\rangle \right). \tag{47} \]

If we evaluate again the probabilities of finding the first spin in the states ψ1 or ψ2, we find:

\[ P_1 = P_2 = \frac{1}{2}, \tag{48} \]

as can readily be seen from (47). In order to evaluate the probability to find the first spin in the state ψ3, we write ψ_E in the form

\[ |\psi_E\rangle = \frac{1}{2} \left[ \left( |\psi_3\rangle + |\psi_4\rangle \right) |0\rangle + \left( |\psi_3\rangle - |\psi_4\rangle \right) |1\rangle \right]. \tag{49} \]

Now, the probability for the first spin to be in the state ψ3 or ψ4, while not caring about whether the second spin is 0 or 1, is seen to be

\[ P_3 = P_4 = 1/2. \tag{50} \]

These results are the same as for a single particle with density operator

\[ \rho = \frac{1}{2} \left( |0\rangle\langle 0| + |1\rangle\langle 1| \right). \tag{51} \]

The conclusion is that for the state (47), the first particle is well described by a mixed state. There is no way to assign a pure state to particle A: we say that particle A is entangled with particle B and ψ_E is called an entangled state. Thus, we see that the density operator is very useful for describing particles coupled to an ‘outside world’. A state is entangled when it does not allow us to assign a pure state to a part of the quantum system under consideration.
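The single-particle density operator for particle A can be obtained from any two-particle pure state by summing over the states of particle B (a partial trace). The sketch below is one way to do this for amplitude vectors ordered as in (40); the helper names are illustrative assumptions.

```python
import math


def reduced_density_A(c):
    """Trace out particle B from a two-qubit pure state c = (c00, c01, c10, c11)."""
    amp = [[c[0], c[1]], [c[2], c[3]]]            # amp[a][b]
    return [[sum(amp[a][b] * complex(amp[a2][b]).conjugate() for b in range(2))
             for a2 in range(2)] for a in range(2)]


def prob_first_spin(c, chi):
    """<chi|rho_A|chi>: probability of finding particle A in the state chi."""
    rho = reduced_density_A(c)
    return sum(complex(chi[i]).conjugate() * rho[i][j] * chi[j]
               for i in range(2) for j in range(2)).real


s = 1 / math.sqrt(2)
product = (0.5, 0.5, 0.5, 0.5)       # state (42)
entangled = (s, 0, 0, s)             # state (47)
plus, minus = (s, s), (s, -s)        # psi3 and psi4 of Eq. (44)
```

Calling `prob_first_spin(product, plus)` and `prob_first_spin(entangled, plus)` reproduces the contrast between (45b) and (50): the product state behaves like a pure single-particle state, the entangled state like an equal mixture.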

Let us now consider entanglement from another point of view. We perform measurements on particle A and on B, checking whether these particles are found in state 1 or 0. For our entangled state (47) we find

\begin{align}
P_{00} &= P_{11} = 1/2, \tag{52a} \\
P_{10} &= P_{01} = 0, \tag{52b}
\end{align}

where P_{01} is the probability to find particle A in state 0 and particle B in state 1, etcetera. We see that in terms of classical probabilities, system A is strongly correlated with system B. It turns out that this correlation remains complete even when the measurement is performed with respect to another basis (see exercises):

Entanglement gives rise to correlation of probabilities, and this correlation cannot be lifted by a basis transformation.

Now let's start with a system which is not entangled; it might for example be in the state (42). We assume that the system evolves according to a Hamiltonian H which, in the basis |00〉, |01〉, |10〉, |11〉, has the following form:

\[ \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \tag{53} \]


The time evolution operator is given by T = exp(−itH/ħ). At t = πħ/2 it has the form

\[ \begin{pmatrix} -i & 0 & 0 & 0 \\ 0 & -i & 0 & 0 \\ 0 & 0 & -i & 0 \\ 0 & 0 & 0 & i \end{pmatrix} \tag{54} \]

so that we find

\[ |\psi(t = \pi\hbar/2)\rangle = -\frac{i}{2} \left( |00\rangle + |01\rangle + |10\rangle - |11\rangle \right), \tag{55} \]

which is an entangled state (you will find no way to write it as a tensor product of two pure states of A and B). Thus we see that when a system starts off in a non-entangled state, it might evolve into an entangled state in the course of time.
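Whether a two-qubit pure state with amplitudes (c00, c01, c10, c11) is a product state can be decided with a simple test: it factorises exactly when c00·c11 − c01·c10 = 0. The sketch below applies this test before and after the evolution; it assumes units with ħ = 1, and the function names are illustrative.

```python
import cmath
import math

H_DIAG = (1, 1, 1, -1)               # diagonal of the Hamiltonian (53)


def evolve(c, t, hbar=1.0):
    """Apply T = exp(-i t H / hbar) to c = (c00, c01, c10, c11); H is diagonal."""
    return tuple(cmath.exp(-1j * t * E / hbar) * x for E, x in zip(H_DIAG, c))


def entanglement_witness(c):
    """|c00*c11 - c01*c10|, which vanishes exactly for product states."""
    return abs(c[0] * c[3] - c[1] * c[2])


start = (0.5, 0.5, 0.5, 0.5)         # product state (42)
final = evolve(start, math.pi / 2)   # state (55)
```

The witness is zero for `start` and nonzero for `final`, in line with the statement that the evolved state cannot be written as a tensor product.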

4 The EPR paradox and Bell’s theorem

In 1935, Einstein, Podolsky and Rosen (EPR) published a thought experiment, which demonstrated that quantum mechanics is not compatible with some obvious ideas which we tacitly apply when describing phenomena. In particular, the notions of a reality existing independently of experimental measurements and of locality cannot both be reconciled with quantum mechanics. Locality is used here to denote the idea that events cannot have an effect at a distance before information has travelled from that event to another place where its effect is noticed. Together, the notions of reality and locality are commonly denoted as ‘local realism’. From the failure of quantum mechanics to comply with local realism, EPR concluded that quantum mechanics is not a complete theory.

The EPR paradox is quite simple to explain. At some point in space, a stationary particle with spin 0 decays into two spin-1/2 particles which fly off in opposite directions (momentum conservation does not allow the directions not to be opposite). During the decay process, angular momentum is conserved, which implies that the two particles must have opposite spin: when one particle is found to have spin ‘up’ along some measuring axis, the other particle must have spin ‘down’ along the same axis. Obviously, we are dealing with an entangled state.

Suppose Alice and Bob both receive an outgoing particle from the same decay event. Alice measures the spin of the particle along the z direction, and Bob does the same with his particle. Superficially, we can say that they would both have the same probability to find either ħ/2 or −ħ/2. However, if quantum mechanics is correct, these measurements should be strongly correlated: if Alice has measured spin up, then Bob's particle must have spin down along the z-axis, so the measurement results are fully correlated. According to the ‘orthodox’, or ‘Copenhagen’ interpretation of quantum mechanics, if Alice is the first one to measure the spin, the particular value measured by her is decided at the very moment of that measurement. But this means that at the same moment the spin state of Bob's particle is determined. But Bob could be lightyears away from Alice, and perform his measurement immediately after her. According to the orthodox interpretation, his measurement would be influenced by Alice's. But this was inconceivable to Einstein, who maintained that the information about Alice's measurement could not reach Bob's particle instantaneously, as the speed of light is a limiting factor for communication. In Einstein's view, the outcome of the measurements of the particles is determined at the moment when they leave the source, and he believed that a more complete theory could be found which would unveil the ‘hidden variables’ which determine the outcomes of Alice and Bob's measurements when the particles left the source. These hidden variables would then represent some “reality” which exists irrespective of the measurement.


[Figure 12.1: The measuring axis for a spin (three axes, labelled a, b and c).]

The EPR puzzle remained unsettled for a long time, until, in 1964, John Bell formulated a theorem which would allow us to distinguish between Einstein's scenario and the orthodox quantum mechanical interpretation. We shall now derive Bell's theorem. Suppose we count in an audience the numbers of people having certain properties, such as ‘red hair’, ‘wearing yellow socks’ or ‘taller than 1.70 m’. We take three such properties, called A, B and C. If we select one person from the audience, he or she will either comply with each of these properties or not. We denote this by a person being ‘in the state’ A+, B−, C+ for example. The number of people in the state A+, B−, C+ is denoted N(A+, B−, C+). We now write

\[ N(A^+, B^-) = N(A^+, B^-, C^+) + N(A^+, B^-, C^-) \tag{56} \]

which is a rather obvious relation.

We use similar relations in order to rewrite this as

\begin{align}
N(A^+, B^-) &= N(A^+, C^-) - N(A^+, B^+, C^-) + N(B^-, C^+) - N(A^-, B^-, C^+) \\
&\le N(A^+, C^-) + N(B^-, C^+). \tag{57}
\end{align}

This is Bell's inequality, which can also be formulated in terms of probabilities [P(A+, B−) instead of N(A+, B−), etcetera]. We have used everyday-life examples in order to emphasise that there is nothing mysterious, let alone quantum mechanical, about Bell's inequality. But let us now turn to quantum mechanics, and spin determination in particular.

Consider the three axes a, b and c shown in the figure. A+ is now identified with a spin-up measurement along a, etcetera. We can now evaluate P(A+, C−). Measuring A+ happens with probability 1/2, but after this measurement, the particle is in the spin-up state along the a-axis. If the spin is then measured along the c direction, we have a probability sin²(π/8) to find C− (see problem 16 of the exercises). The combined probability P(A+, C−) is therefore ½ sin²(π/8). Similarly, P(B−, C+) is also equal to ½ sin²(π/8), and P(A+, B−) is 1/4. Inserting these numbers into Bell's inequality gives:

\[ \frac{1}{4} \le \sin^2(\pi/8) = \frac{1}{2} \left( 1 - \frac{1}{2}\sqrt{2} \right), \tag{58} \]

which is obviously wrong, as the right-hand side is approximately 0.146. Therefore, we see that quantum mechanics does not obey Bell's inequality.

Now what does this have to do with the EPR paradox? Well, first of all, the EPR paradox allows us to measure the spin in two different directions at virtually the same moment. But, more importantly, if the particles were to leave the origin with predefined probabilities, Bell's inequality would unambiguously hold. The only way to violate Bell's inequality is by accepting that Alice's measurement reduces the entangled wavefunction of the two-particle system, which is also noticed by Bob instantaneously. So, there is some ‘action at a distance’, in contrast to what we usually have in physics, where every action is mediated by particles such as photons, mesons, . . . .
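Both halves of the argument can be illustrated numerically: any classical assignment of counts to the eight states (A±, B±, C±) satisfies Bell's inequality, while the quantum probabilities quoted in the text do not. The sketch below uses randomly generated counts as a stand-in for an audience; the names `bell_gap`, `classical_gaps` and `quantum_gap` are illustrative.

```python
import itertools
import math
import random


def bell_gap(counts):
    """N(A+,C-) + N(B-,C+) - N(A+,B-) for counts keyed by (a, b, c) in {+1, -1}^3."""
    n_ac = sum(n for (a, b, c), n in counts.items() if a == +1 and c == -1)
    n_bc = sum(n for (a, b, c), n in counts.items() if b == -1 and c == +1)
    n_ab = sum(n for (a, b, c), n in counts.items() if a == +1 and b == -1)
    return n_ac + n_bc - n_ab


# Classical case: random non-negative counts for each of the eight states.
rng = random.Random(0)
classical_gaps = []
for _ in range(1000):
    counts = {key: rng.randint(0, 100)
              for key in itertools.product((+1, -1), repeat=3)}
    classical_gaps.append(bell_gap(counts))

# Quantum case: the probabilities quoted in the text for the axes a, b and c.
p_ac = 0.5 * math.sin(math.pi / 8) ** 2
p_bc = 0.5 * math.sin(math.pi / 8) ** 2
p_ab = 0.25
quantum_gap = p_ac + p_bc - p_ab
```

The classical gap equals N(A+, B+, C−) + N(A−, B−, C+) and is therefore never negative, whereas `quantum_gap` comes out negative.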

In 1982, Aspect, Dalibard and Roger performed experiments with photons emerging from decaying atoms in order to check whether Bell's inequality holds or not. Since then, several other groups have redone this experiment, sometimes with different setups. The conclusion is now generally accepted that Bell's inequality does not hold for quantum mechanical probabilities. The implications of this conclusion for our view of Nature are enormous: somehow actions can be performed without intermediary particles, so that the speed of light is not a limiting factor for this kind of communication. ‘Communication’ is however a dangerous term to use in this context, as it suggests that information can be transmitted instantaneously. However, the ‘information’ which is transmitted from Alice to Bob or vice versa is purely probabilistic, since neither Bob nor Alice can predict the outcome of their measurements. So far, no schemes have been invented or realised which would allow us to send over a Mozart symphony at speeds faster than the speed of light.

5 No cloning theorem

In recent years, much interest has arisen in quantum information processing. In this field, people try to exploit quantum mechanics in order to process information in a way completely different from classical methods. We have already encountered one example of these attempts: quantum cryptography, where a random encryption key can be shared between Bob and Alice without Eve being capable of eavesdropping. Another very important application, which unfortunately is still far from a realisation, is the quantum computer. When I speak of a quantum computer, you should not forget that I mean a machine which exists only on paper, not in reality. A quantum computer is a quantum machine in which qubits evolve in time. A qubit is a quantum system with a 2-dimensional Hilbert space. It can always be denoted

\[ |\varphi\rangle = a|0\rangle + b|1\rangle, \tag{59} \]

where a and b are complex constants satisfying |a|² + |b|² = 1. The states |0〉 and |1〉 form a basis in the Hilbert space. A quantum computer manipulates several qubits in parallel. A system consisting of n qubits has a 2^n-dimensional Hilbert space. A quantum computation consists of a preparation of the qubits in some well-defined state, followed by an autonomous evolution of the qubit system, and concluded by reading out the state of the qubits. As the system is autonomous, it is described by a (Hermitian) Hamiltonian. The time-evolution operator U = exp(−itH/ħ) is then a unitary operator, so the quantum computation between initialisation and reading out the results can be described in terms of a sequence of unitary transformations applied to the system. In this section we shall derive a general theorem for such an evolution, the no-cloning theorem:

An unknown quantum state cannot be cloned.

By cloning we mean that we can copy the state of some quantum system into some other system without losing the state of our original system.

Before proceeding with the proof of this theorem, let us assume that cloning would be possible. In that case, communication at speeds faster than light would in principle be possible. To see this, imagine Alice has a qubit of which Bob has many clones, which are entangled with Alice's qubit. If Alice performs a measurement on her qubit along the axis |0〉 or |0〉 + |1〉, Bob's clones will become aligned along the same axis. As Bob has many clones, he can find out which measurement Alice performed without ambiguity (how?). So the no-cloning theorem is essential in making communication at speeds faster than the speed of light impossible.

The proof of the no-cloning theorem for qubit systems proceeds as follows. Cloning for a qubit pair means that we have a unitary evolution U with the following effect on a qubit pair:

\[ U|\alpha 0\rangle = |\alpha\alpha\rangle. \tag{60} \]

The evolution U should work for any state α, therefore it cannot depend on α. Therefore, for some other state |β〉 we must have

\[ U|\beta 0\rangle = |\beta\beta\rangle. \tag{61} \]

Now let us operate with U on the state |γ0〉 with |γ〉 = (|α〉 + |β〉)/√2:

\[ U|\gamma 0\rangle = \left( |\alpha\alpha\rangle + |\beta\beta\rangle \right)/\sqrt{2} \neq |\gamma\gamma\rangle, \tag{62} \]

which completes the proof.
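To make the last inequality explicit, one can expand the product state that a cloning machine would have to deliver. The following worked step is a sketch which assumes, as the normalisation of |γ〉 already implies, that |α〉 and |β〉 are orthogonal:

\[ |\gamma\gamma\rangle = \frac{1}{2} \left( |\alpha\rangle + |\beta\rangle \right) \otimes \left( |\alpha\rangle + |\beta\rangle \right) = \frac{1}{2} \left( |\alpha\alpha\rangle + |\alpha\beta\rangle + |\beta\alpha\rangle + |\beta\beta\rangle \right). \]

The cross terms |αβ〉 and |βα〉 are absent from the linear result U|γ0〉 = (|αα〉 + |ββ〉)/√2, so the two states cannot coincide.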

6 Dense coding

In this section, I describe a way of sending over more information than bits. This sounds completely impossible, but, again, quantum mechanics is in principle able to realise the impossible. It is however difficult to implement, as it is based on Bob and Alice having an entangled pair of qubits, in the state

\[ |00\rangle + |11\rangle. \tag{63} \]

From now on, we shall adopt the convention in this field to omit normalisation factors in front of the wavefunctions. We can imagine this state to be realised by having an entangled pair generator midway between Alice and Bob, sending entangled particles in opposite directions as in the EPR setup.

Note that the following qubit operations are all unitary:

\begin{align}
I|\phi\rangle &= |\phi\rangle; \tag{64a} \\
X|0\rangle &= |1\rangle, \tag{64b} \\
X|1\rangle &= |0\rangle; \tag{64c} \\
Z|0\rangle &= |0\rangle, \tag{64d} \\
Z|1\rangle &= -|1\rangle; \tag{64e} \\
Y|0\rangle &= |1\rangle, \tag{64f} \\
Y|1\rangle &= -|0\rangle, \qquad Y = XZ. \tag{64g}
\end{align}

The operator I is the identity; X is called the NOT operator. We assume that Alice has a device with which she can perform any of the four transformations (I, X, Y, Z) on her member (i.e. the first) of the entangled qubit pair. The resulting, mutually orthogonal, states for these four transformations are:

\begin{align}
I\left( |00\rangle + |11\rangle \right) &= |00\rangle + |11\rangle; \tag{65a} \\
X\left( |00\rangle + |11\rangle \right) &= |10\rangle + |01\rangle; \tag{65b} \\
Y\left( |00\rangle + |11\rangle \right) &= |10\rangle - |01\rangle; \tag{65c} \\
Z\left( |00\rangle + |11\rangle \right) &= |00\rangle - |11\rangle. \tag{65d}
\end{align}

Alice does not perform any measurement: she performs one of these four transformations and then she sends her qubit to Bob. Bob then measures in which of the four possible states the entangled pair is; in other words, he now knows which transformation Alice applied. This information is ‘worth’ two bits, but Alice had to send only one qubit to Bob!
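That Bob can tell the four outcomes apart relies on their mutual orthogonality, which is quickly verified. In the sketch below, the 2×2 matrices are written in the basis (|0〉, |1〉) and act on the first qubit of an unnormalised amplitude vector (c00, c01, c10, c11); the helper names are illustrative.

```python
# Single-qubit operations of Eq. (64) as 2x2 matrices in the basis (|0>, |1>).
I = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]
Y = [[0, -1], [1, 0]]                # Y = XZ


def apply_to_first(op, c):
    """Apply a single-qubit matrix to the first qubit of c = (c00, c01, c10, c11)."""
    return tuple(sum(op[a][k] * c[2 * k + b] for k in range(2))
                 for a in range(2) for b in range(2))


def inner(u, v):
    """Inner product <u|v> of two amplitude vectors."""
    return sum(complex(x).conjugate() * y for x, y in zip(u, v))


bell = (1, 0, 0, 1)                  # |00> + |11>, unnormalised as in the text
encoded = {name: apply_to_first(op, bell)
           for name, op in (("I", I), ("X", X), ("Y", Y), ("Z", Z))}
```

Evaluating `inner` for every pair of distinct entries of `encoded` gives zero, so a single measurement in this four-state basis identifies Alice's choice.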


7 Quantum computing and Shor’s factorisation algorithm

A quantum computer is a device containing one or more sets of qubits (called registers), which can be initialised without ambiguity, which can evolve in a controlled way under the influence of unitary transformations, and which can be measured after completion of this evolution.

The most general single-qubit transformation is a four-parameter family. For more than one qubit, it can be shown that every nontrivial unitary transformation can be generated by a single-qubit transformation of the form

\[ U(\theta, \phi) = \begin{pmatrix} \cos(\theta/2) & -i e^{-i\phi} \sin(\theta/2) \\ -i e^{i\phi} \sin(\theta/2) & \cos(\theta/2) \end{pmatrix} \tag{66} \]

and another unitary transformation involving more than a single qubit, the so-called 2-qubit XOR. This transformation acts on a qubit pair and has the following effect:

\begin{align}
\mathrm{XOR}\,(|00\rangle) &= |00\rangle; \tag{67a} \\
\mathrm{XOR}\,(|01\rangle) &= |01\rangle; \tag{67b} \\
\mathrm{XOR}\,(|10\rangle) &= |11\rangle; \tag{67c} \\
\mathrm{XOR}\,(|11\rangle) &= |10\rangle. \tag{67d}
\end{align}

We see that the first qubit is left unchanged and the second one is the eXclusive OR of the two input bits. Unitary transformations are realised by hardware elements called gates.

Several proposals for building quantum computers exist. In the ion trap, an array of ions which can be in either the ground state (|0〉) or the excited state (|1〉) is controlled by laser pulses. Coupling of neighbouring ions in order to realise an XOR-gate is realised through a controlled momentum transfer to displacement excitations (phonons) of the chain.

Here in Delft, activities focus on arrays of Josephson junctions. Josephson junctions are very thin layers of ordinary conductors separating two superconductors. Current can flow through these junctions in either the clockwise or anti-clockwise direction (interpreted as 0 and 1 respectively). Other initiatives include NMR devices and optical cavities. With the NMR technique it has recently become possible to factorise the number 15. Realisation of a working quantum computer will take at least a few decades, if it will come at all.

A major problem in realising a working quantum computer is to ensure a unitary evolution. In practice, the system will always be coupled to the outside world. Quantum computing hinges upon the possibility to have controlled, coherent superpositions. Coherent superpositions are linear combinations of quantum states into another, pure state. As we have seen in the previous section, coupling to the environment may lead to entanglement which would cause the quantum computer to be described by a density operator rather than by a pure state. In particular, any phase relation between constitutive parts of a phase-coherent superposition is destroyed by coupling to the environment. We shall now treat this phenomenon in more detail.

Consider a qubit which interacts with its environment. We denote the state of the environment by the ket |m〉. The interaction is described by the following prescription:

\begin{align}
|0\rangle |m\rangle &\to |0\rangle |m_0\rangle; \tag{68a} \\
|1\rangle |m\rangle &\to |1\rangle |m_1\rangle. \tag{68b}
\end{align}

In this interaction, the qubit itself does not change; if it did, our computer would be useless to start with.


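Suppose we start with a state

\[ |0\rangle + e^{i\phi} |1\rangle \tag{69} \]

which is coupled to the environment. This coupling will induce the transition

\[ \left( |0\rangle + e^{i\phi} |1\rangle \right) |m\rangle \to |0\rangle |m_0\rangle + e^{i\phi} |1\rangle |m_1\rangle. \tag{70} \]

Suppose this qubit is then fed into a so-called Hadamard gate, which has the effect

\begin{align}
H|0\rangle &= \frac{1}{\sqrt{2}} \left( |0\rangle + |1\rangle \right); \tag{71a} \\
H|1\rangle &= \frac{1}{\sqrt{2}} \left( |0\rangle - |1\rangle \right). \tag{71b}
\end{align}

Then the outcome is

\[ e^{i\phi/2} \left[ |0\rangle \left( e^{-i\phi/2} |m_0\rangle + e^{i\phi/2} |m_1\rangle \right) + |1\rangle \left( e^{-i\phi/2} |m_0\rangle - e^{i\phi/2} |m_1\rangle \right) \right]. \tag{72} \]

If we suppose that 〈m0|m1〉 is real, we find for the probabilities to measure the qubit in the state |0〉 or |1〉 (after normalisation):

\begin{align}
P_0 &= \frac{1}{2} \left( 1 + \langle m_0|m_1\rangle \cos\phi \right); \tag{73a} \\
P_1 &= \frac{1}{2} \left( 1 - \langle m_0|m_1\rangle \cos\phi \right). \tag{73b}
\end{align}

If there is no coupling, m0 = m1 = m, and we recognise the phase relation between the two states in the probabilities. On the other hand, if 〈m0|m1〉 = 0, then we find for both probabilities 1/2, and the phase relation has disappeared completely.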
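The loss of the phase relation can be followed numerically by computing the two probabilities directly from the state after the Hadamard gate. The sketch below is a toy model: the environment is assumed to be two-dimensional, with |m0〉 = (1, 0) and |m1〉 = (cos θ, sin θ), so that the overlap 〈m0|m1〉 = cos θ is real; the function name `hadamard_probs` is illustrative.

```python
import cmath
import math


def hadamard_probs(phi, theta):
    """Return (P0, P1) for the qubit after coupling to the environment and a Hadamard gate.

    The initial qubit state is (|0> + e^{i phi}|1>)/sqrt(2); theta sets the
    overlap <m0|m1> = cos(theta) of the two environment states.
    """
    m0 = (1.0, 0.0)
    m1 = (math.cos(theta), math.sin(theta))
    phase = cmath.exp(1j * phi)
    # Environment vectors attached to the qubit states |0> and |1> after the gate.
    env0 = [(x + phase * y) / 2 for x, y in zip(m0, m1)]
    env1 = [(x - phase * y) / 2 for x, y in zip(m0, m1)]
    p0 = sum(abs(z) ** 2 for z in env0)
    p1 = sum(abs(z) ** 2 for z in env1)
    return p0, p1
```

With θ = 0 (no coupling) and φ = 0 the qubit is found in |0〉 with certainty, as expected from H(|0〉 + |1〉)/√2 = |0〉; with θ = π/2 both probabilities are 1/2 for every φ.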

It is interesting to construct a density operator for the qubit in the final state (72). Consider a qubit

\[ \alpha |0\rangle + \beta |1\rangle \tag{74} \]

which has interacted with its environment, so that we have the combined state

\[ \alpha |0\rangle |m_0\rangle + \beta |1\rangle |m_1\rangle. \tag{75} \]

We can arrive at a density operator for the qubit alone by performing the trace over the m-system only. Using (13) we find

\[ \rho_{\mathrm{qubit}} = \begin{pmatrix} |\alpha|^2 & \alpha\beta^* \langle m_1|m_0\rangle \\ \alpha^*\beta \langle m_0|m_1\rangle & |\beta|^2 \end{pmatrix}. \tag{76} \]

The eigenvalues of this matrix are

\[ \lambda = \frac{1}{2} \pm \frac{1}{2} \sqrt{ \left( |\alpha|^2 - |\beta|^2 \right)^2 + 4 |\alpha|^2 |\beta|^2 |\langle m_0|m_1\rangle|^2 } \tag{77} \]

and these lie between 0 and 1, where, for nonzero α and β, the value 1 is reached only for |〈m0|m1〉| = 1. The terms coherence/decoherence derive from the name coherence which is often used for the matrix element 〈m0|m1〉.

Now let us return to the very process of quantum computing itself. The most impressive algorithm, which was developed in 1994 by Peter Shor, is that of factorising large integers, an important problem in the field of encryption and code-breaking. We shall not describe this algorithm in detail, but present a brief sketch of an important sub-step, finding the period of an integer function f. It is assumed here that all unitary transformations used can be realised with a limited number of gates.

The algorithm works with two registers, both containingn qubits. These registers are describedby a 2n-dimensional Hilbert space. As basis states we use the bit-sequences of the integers between0 and 2n−1. The basis state corresponding to such an integerx is denoted|x〉n. Now we perform theHademard gate (71) to all bits of the state|0〉n. This yields

H |0〉n ≡ |w〉n = 2−n2n−1

∑x=0

|x〉n . (78)

It is possible (but we shall not describe the method here) to construct, for any functionf which mapsthe set of numbers 0 to 2n−1 onto itself, a unitary transformationU f which has the effect

U_f |x〉n |0〉n = |x〉n |f(x)〉n (79)

using a limited number of gates.

Now we are ready for the big trick in quantum computing. If we let U_f act on the state |w〉n then we obtain

$$U_f |w\rangle_n |0\rangle_n = 2^{-n/2} \sum_{x=0}^{2^n-1} |x\rangle_n |f(x)\rangle_n . \qquad (80)$$

We see that the new state contains f(x) for all possible values of x. In other words, applying the gates U_f to our state |w〉n |0〉n, we have evaluated the function f for 2^n different arguments. This feature is called quantum parallelism and it is this feature which is responsible for the (theoretical) performance of quantum computing.

Of course, if we were to read out the results of the computation for each x-value, we would not have gained much, as this would take 2^n operations. In general, however, the final result that we are after consists of only a few data, so a useful problem does not consist of simply calculating f for all of its possible arguments. As an example we consider the problem of finding the periodicity of the function f, which is an important step in Shor’s algorithm. This is done by reading out only one particular value of the result in the second register, f(x) = u, say. The first register is then the sum of all x-states for which it holds that f(x) = u. If f has a period r, we will find that these x-values lie a distance r apart from each other. Now we act with a (unitary) Fourier transform operator on this register, and the result will be a linear combination of the registers corresponding to the period(s) of the function f. If there is only one period, we can read this out straightforwardly.
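The period-finding step can be mimicked on a classical computer for a small register. The Python sketch below is only an illustration under stated assumptions: we take n = 4 qubits and the sample function f(x) = 7^x mod 15 (our own choice, with period r = 4), store the amplitudes of the first register in a plain list, and apply a discrete Fourier transform by direct summation. It does not represent an actual quantum gate sequence.

```python
import cmath

n = 4
N = 2 ** n                      # dimension of one register

def f(x):
    # Illustrative periodic function (assumption): 7^x mod 15 has period 4.
    return pow(7, x, 15)

# Reading out the second register with result u leaves the first register
# in an equal superposition of all x with f(x) = u.
u = f(0)
xs = [x for x in range(N) if f(x) == u]
amp = [1 / len(xs) ** 0.5 if x in xs else 0.0 for x in range(N)]

def fourier(a):
    """Unitary discrete Fourier transform of a list of amplitudes."""
    M = len(a)
    return [sum(a[x] * cmath.exp(2j * cmath.pi * x * k / M) for x in range(M)) / M ** 0.5
            for k in range(M)]

prob = [abs(c) ** 2 for c in fourier(amp)]
peaks = [k for k, p in enumerate(prob) if p > 1e-9]
period = N // peaks[1]          # spacing of the peaks is N / r
print(xs, peaks, period)
```

The surviving x-values lie a distance r apart, and after the Fourier transform only multiples of N/r carry probability, from which the period follows.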

It has been said already that finding the period of some function is an important step in the factorising algorithm. Shor’s algorithm is able to factorise an n-bit integer in about 300 n^3 steps. A very rough estimate of the size of the number to be factorised at which a quantum computer starts outperforming a classical machine is about 10^130.


Appendix A

Review of Linear Algebra

1 Hilbert spaces

A Hilbert space is defined as a linear, closed inner product space. The notions of linearity, inner product and closure may need some explanation.

• A linear vector space is a vector space in which any linear combination of vectors is an element of that space. In other words, if u and v are elements of the space H, then

αu + βv lies in H. (1)

• An inner product is a scalar expression depending on two vectors u and v. It is denoted by 〈u|v〉 and it satisfies the following requirements:

1. 〈u|v〉 = 〈v|u〉∗, (2a)

where the asterisk denotes complex conjugation.

2. Linearity: 〈w|αu+βv〉 = α 〈w|u〉 + β 〈w|v〉. (2b)

3. Positive-definiteness: 〈u|u〉 ≥ 0, (2c)

and the equals-sign only holds when u = 0.

An inner product space is a linear vector space in which an inner product is defined.

• Closure means that if we take a converging sequence of vectors in the Hilbert space then the limit of the sequence also lies inside the space.

We shall now discuss two examples of Hilbert spaces.

1. Linear vector space in finite dimension N. The elements are represented as column vectors:

$$u = |u\rangle = \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_N \end{pmatrix}. \qquad (3)$$

The elements u_i are complex. The vector 〈u| is conveniently denoted as

$$\langle u| = (u_1^*, u_2^*, \ldots, u_N^*). \qquad (4)$$


It is called the Hermitian conjugate of the column vector |u〉; 〈u| is often denoted as |u〉†. The inner product 〈u|v〉 is the product between the row vector 〈u| and the column vector |v〉 – hence it can be written as

$$\langle u|v\rangle = \sum_{i=1}^{N} u_i^* v_i . \qquad (5)$$

This definition satisfies all the requirements of the inner product, mentioned above.
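The requirements (2a)–(2c) can be checked numerically for the inner product (5). The following Python sketch does this for a few arbitrarily chosen complex vectors; the vectors and the helper name `inner` are illustrative assumptions.

```python
def inner(u, v):
    """Inner product (5): sum over u_i^* v_i for two lists of complex numbers."""
    return sum(complex(ui).conjugate() * vi for ui, vi in zip(u, v))

# Arbitrary sample vectors and coefficients (assumed values).
u = [1 + 2j, 3j]
v = [2 - 1j, 1 + 1j]
w = [0.5j, -1.0]
a, b = 2 - 1j, 0.5 + 3j

# (2a): <u|v> = <v|u>^*
symmetry_ok = abs(inner(u, v) - inner(v, u).conjugate()) < 1e-12

# (2b): <w|a u + b v> = a <w|u> + b <w|v>
combo = [a * ui + b * vi for ui, vi in zip(u, v)]
linearity_ok = abs(inner(w, combo) - (a * inner(w, u) + b * inner(w, v))) < 1e-12

# (2c): <u|u> is real and positive for a nonzero vector
norm2 = inner(u, u)
positive_ok = abs(norm2.imag) < 1e-12 and norm2.real > 0

print(symmetry_ok, linearity_ok, positive_ok)
```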

2. A second example is the space of square integrable functions, i.e. complex-valued functions f depending on n real variables x1, . . . , xn ≡ x satisfying

$$\int d^n x \, |f(x)|^2 < \infty. \qquad (6)$$

Note that x may be restricted to some domain.

The inner product for complex-valued functions is defined as

$$\langle f|g\rangle = \int d^n x \, f^*(x)\, g(x). \qquad (7)$$

2 Operators

An operator transforms a vector into some other vector. We shall be mainly concerned with linear operators T, which, for any two complex numbers α and β, satisfy

T (α |u〉+β |v〉) = αT |u〉+β T |v〉 . (8)

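Examples are operators represented by matrices in a finite-dimensional Hilbert space:

$$\begin{pmatrix} 1 & 2 & 3 \\ -1 & -2 & 1 \\ 1 & -1 & 0 \end{pmatrix} \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix} = \begin{pmatrix} 8 \\ -4 \\ -1 \end{pmatrix}. \qquad (9)$$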

An example of a linear operator in function space is the derivative operator D = d/dx:

$$D f(x) = \frac{d}{dx} f(x). \qquad (10)$$

The Hermitian conjugate T† of an operator T is defined as

$$\left(T|u\rangle\right)^\dagger = \langle u|\, T^\dagger. \qquad (11)$$

As an example, consider a two-dimensional Hilbert space:

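$$T|u\rangle = \begin{pmatrix} T_{11} & T_{12} \\ T_{21} & T_{22} \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \end{pmatrix} = \begin{pmatrix} T_{11}u_1 + T_{12}u_2 \\ T_{21}u_1 + T_{22}u_2 \end{pmatrix}. \qquad (12)$$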

Taking the Hermitian conjugate of this we have, using (4):

$$\left(T|u\rangle\right)^\dagger = \left(T_{11}^* u_1^* + T_{12}^* u_2^*,\; T_{21}^* u_1^* + T_{22}^* u_2^*\right). \qquad (13)$$


According to (11), this must be equal to

$$(u_1^*, u_2^*) \begin{pmatrix} T^\dagger_{11} & T^\dagger_{12} \\ T^\dagger_{21} & T^\dagger_{22} \end{pmatrix} \qquad (14)$$

and we immediately see that

$$T^\dagger = \begin{pmatrix} T_{11}^* & T_{21}^* \\ T_{12}^* & T_{22}^* \end{pmatrix}. \qquad (15)$$

We conclude that the Hermitian conjugate of a matrix is the transpose and complex conjugate of the original. This result holds for matrices of arbitrary size.
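The defining relation (11) can be verified numerically for a concrete matrix. In the Python sketch below the matrix T and the vector u are arbitrary sample values, and the helper names (`dagger`, `apply`, `bra`, `bra_times`) are our own.

```python
def dagger(T):
    """Hermitian conjugate: transpose and complex-conjugate a matrix (list of rows)."""
    return [[complex(T[j][i]).conjugate() for j in range(len(T))]
            for i in range(len(T[0]))]

def apply(T, u):
    """Matrix times column vector, T|u>."""
    return [sum(T[i][j] * u[j] for j in range(len(u))) for i in range(len(T))]

def bra(u):
    """Row vector <u| belonging to the column vector |u>, as in (4)."""
    return [complex(ui).conjugate() for ui in u]

def bra_times(row, M):
    """Row vector times matrix."""
    return [sum(row[i] * M[i][j] for i in range(len(row))) for j in range(len(M[0]))]

# Arbitrary sample data (assumed values).
T = [[1 + 1j, 2], [3j, 4 - 2j]]
u = [1j, 2]

lhs = bra(apply(T, u))               # (T|u>)^dagger
rhs = bra_times(bra(u), dagger(T))   # <u| T^dagger
print(lhs, rhs)
```

Both sides give the same row vector, in agreement with (11) and (15).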

Now let us find the Hermitian conjugate of the operator D = d/dx:

$$\langle f|D|g\rangle = \left(\langle g|D^\dagger|f\rangle\right)^*. \qquad (16)$$

Writing out the integral expressions for the inner product we have:

$$\langle f|D|g\rangle = \int dx\, f^*(x) \frac{d}{dx} g(x) = -\int dx \left(\frac{d}{dx} f^*(x)\right) g(x) = -\int dx\, g(x)\, D f^*(x) = \left(\langle g|{-D}|f\rangle\right)^* \qquad (17)$$

where we have used partial integration to arrive at the second equality and we have assumed that the integrated terms vanish. This condition holds for virtually all sensible quantum systems. Comparing (16) with (17), we see that

D† =−D. (18)
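A discretised version of this result is easy to inspect. The Python sketch below builds a central-difference matrix for d/dx on a small grid (grid size and spacing are arbitrary assumptions) with functions taken to vanish outside the grid, which plays the role of the vanishing integrated terms. It then checks that the matrix is anti-Hermitian and that its square is Hermitian.

```python
def derivative_matrix(npoints, h):
    """Central-difference approximation to D = d/dx on a grid with spacing h."""
    D = [[0.0] * npoints for _ in range(npoints)]
    for i in range(npoints):
        if i + 1 < npoints:
            D[i][i + 1] = 1.0 / (2 * h)
        if i - 1 >= 0:
            D[i][i - 1] = -1.0 / (2 * h)
    return D

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

size = 6
D = derivative_matrix(size, 0.1)
Dt = transpose(D)   # D is real, so its Hermitian conjugate is its transpose

antihermitian = all(Dt[i][j] == -D[i][j] for i in range(size) for j in range(size))
D2 = matmul(D, D)
d2_hermitian = all(abs(D2[i][j] - D2[j][i]) < 1e-9
                   for i in range(size) for j in range(size))
print(antihermitian, d2_hermitian)
```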

A Hermitian operatorH is an operator satisfying

H† = H. (19)

We have seen that the differentiation operator D is not Hermitian – however, D^2 is.

A unitary operator U is an operator which satisfies

UU† = U†U = I , (20)

where I is the unit operator which leaves any vector unchanged, I |u〉 = |u〉.

An eigenvector of a linear operator T is a vector which satisfies

T |u〉= λ |u〉 , (21)

where λ is a complex number, which is called the eigenvalue. In geometrical terms, this means that an eigenvector which is operated on by T will change its length, but not its direction. Eigenvectors are extremely important in quantum mechanics, as we shall see in this course. Eigenvalues are said to be degenerate if they are shared by at least two linearly independent eigenvectors.

For an Hermitian operator we have the following:

• The eigenvectors span the whole Hilbert space, which means that any vector of the space can be written as a linear combination of the eigenvectors. This property of the eigenvectors is called completeness.

• All eigenvalues are real.

• Any two eigenvectors belonging to distinct eigenvalues are mutually orthogonal.
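The last two properties can be made concrete for a 2×2 Hermitian matrix, for which the eigenvalue problem can be solved in closed form. The Python sketch below is an illustration only; the sample matrix and the helper names are assumptions.

```python
import math

def hermitian_2x2_eigen(a, b, c):
    """Eigenvalues and (unnormalised) eigenvectors of H = [[a, c], [c*, b]].

    a and b are real, c is a nonzero complex number.
    """
    mean = 0.5 * (a + b)
    root = math.sqrt((0.5 * (a - b)) ** 2 + abs(c) ** 2)
    result = []
    for lam in (mean + root, mean - root):
        # The first row of (H - lam) v = 0 is solved by v = (c, lam - a).
        result.append((lam, [c, lam - a]))
    return result

def inner(u, v):
    return sum(complex(ui).conjugate() * vi for ui, vi in zip(u, v))

# Sample Hermitian matrix (assumed values): a = 2, b = 3, c = 1 - i.
(l1, v1), (l2, v2) = hermitian_2x2_eigen(2.0, 3.0, 1 - 1j)
overlap = inner(v1, v2)
print(l1, l2)      # both eigenvalues are real
print(overlap)     # vanishes up to rounding: the eigenvectors are orthogonal
```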


In the special case of a finite dimensional Hilbert space, the matrix representation of a Hermitian operator H satisfies

Diag = S H S† (22)

where the matrix Diag is diagonal, i.e. only its diagonal elements are nonzero, and the columns of the matrix S† are the (orthonormal) eigenvectors of H.

Two operators A and B are said to commute if their product does not depend on the order in which it is evaluated:

A and B commute if AB = BA. (23)

For two commuting operators A and B it holds that any nondegenerate eigenvector of B is also an eigenvector of A. If however A has a degenerate eigenvalue, then there can always be found a special orthogonal basis in the degenerate eigenspace of that eigenvalue such that all basis vectors are also eigenvectors of B, with eigenvalues which may or may not be degenerate.


Appendix B

The time-dependent Schrödinger equation

The time-dependent Schrödinger equation reads:

$$i\hbar \frac{\partial}{\partial t} \psi(R,t) = H \psi(R,t). \qquad (1)$$

The coordinate R denotes any dependence other than time. Usually R contains space coordinates of the particle(s) in the system and their spin.

In Dirac vector notation, the Schrödinger equation can be written as

$$i\hbar \frac{\partial}{\partial t} |\psi(t)\rangle = H |\psi(t)\rangle. \qquad (2)$$

This equation has a formal solution for the case where the Hamiltonian H does not depend on time:

$$|\psi(t)\rangle = e^{-itH/\hbar} |\psi(t=0)\rangle. \qquad (3)$$

This expression is difficult to evaluate as it involves the exponential of an operator.

In case we know the eigenvectors |ϕn〉 and eigenvalues En of the Hamiltonian:

H |ϕn〉= En |ϕn〉 , (4)

the solution is not difficult to find. We have for any eigenvector |ϕn〉:

$$|\varphi_n(t)\rangle = e^{-itE_n/\hbar} |\varphi_n\rangle. \qquad (5)$$

Because of completeness, we can write |ψ(t = 0)〉 as

$$|\psi(t=0)\rangle = \sum_n c_n |\varphi_n\rangle, \qquad (6)$$

and we see that the following solution

$$\sum_n c_n e^{-itE_n/\hbar} |\varphi_n\rangle \qquad (7)$$

satisfies the time-dependent Schrödinger equation with starting value |ψ(t = 0)〉, as can easily be verified by substitution.
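The expansion (6)–(7) can be tried out on the simplest possible example. The Python sketch below assumes units with ħ = 1 and takes as Hamiltonian the 2×2 matrix with rows (0, 1) and (1, 0), whose normalised eigenvectors are (1, ±1)/√2 with energies ±1. These choices, and the function names, are illustrative assumptions and not part of the notes.

```python
import cmath
import math

HBAR = 1.0                          # units with hbar = 1 (assumption)
s = 1 / math.sqrt(2)
energies = [1.0, -1.0]              # eigenvalues E_n of H = [[0, 1], [1, 0]]
eigvecs = [[s, s], [s, -s]]         # corresponding normalised eigenvectors

def inner(u, v):
    return sum(complex(ui).conjugate() * vi for ui, vi in zip(u, v))

def evolve(psi0, t):
    """Time-evolved state according to (7): sum_n c_n exp(-i t E_n / hbar) |phi_n>."""
    coeffs = [inner(phi, psi0) for phi in eigvecs]      # c_n = <phi_n|psi(0)>
    psi = [0j, 0j]
    for c, E, phi in zip(coeffs, energies, eigvecs):
        phase = cmath.exp(-1j * E * t / HBAR)
        psi = [p + c * phase * comp for p, comp in zip(psi, phi)]
    return psi

t = 0.7
psi_t = evolve([1.0, 0.0], t)
print(psi_t)
```

For this Hamiltonian and the starting state (1, 0), the result should agree with the closed form (cos t, −i sin t), and the norm of the state is conserved.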

The stationary Schrödinger equation can be derived from the time-dependent Schrödinger equation by a separation of variables. Let us try to write the solution to the time-dependent Schrödinger equation in the form

ψ(R, t) = Φ(R)Q(t). (8)


Substitution into the time-dependent Schrödinger equation and division of the left- and right-hand sides by ψ(R, t) leads to

$$i\hbar\, \frac{\partial Q(t)/\partial t}{Q(t)} = \frac{H\Phi(R)}{\Phi(R)}. \qquad (9)$$

On the left hand side, we have an expression depending on t, whereas on the right hand side we have an expression depending on R. These two expressions can therefore be equal only when they are constant. We call this constant the energy, En. This leads to the two equations

$$i\hbar \frac{\partial Q}{\partial t} = E_n Q(t); \qquad (10a)$$

$$H\Phi(R) = E_n \Phi(R). \qquad (10b)$$

The second equation is the stationary Schrödinger equation which is essentially an eigenvalue equation for the operator H. The first equation has as its solution a time-dependent phase factor exp(−iEn t/ħ) which must be multiplied by the eigenfunction of H at energy En in order to obtain a solution to the time-dependent Schrödinger equation. From (7) we see that the full solution to the time-dependent Schrödinger equation can always be written as a linear combination of the solutions found via the stationary approach.


Appendix C

Review of the Schrödinger equation in one dimension

In your first quantum mechanics course, you have encountered the stationary Schrödinger equation in one dimension. In this appendix we briefly review some aspects of this equation and its solutions.

The stationary Schrödinger equation in one dimension reads

$$\left[-\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + V(x)\right]\psi(x) = E\psi(x). \qquad (1)$$

This is an eigenvalue equation: on the left hand side, we have an operator acting on the wave function ψ(x), and the result must be proportional to ψ, with proportionality constant E, the energy. A restriction on the possible solutions is that they must be square integrable, that is, they must have finite norm.

The solution of this equation is known in a few cases only: the constant potential, the harmonic oscillator and the Morse potential, which is related to the hydrogen atom. Here we shall restrict ourselves to the constant potential, V(x) = V.

For E > V, the solutions can be written as

$$\psi(x) = e^{\pm ikx}, \qquad (2)$$

with

$$k^2 = 2m(E-V)/\hbar^2. \qquad (3)$$

In Eq. (2), the + sign in the exponent corresponds to a wave running to the right, and the − sign to a left-running wave. This can be seen when the solutions are multiplied by the appropriate time-dependent phase factor exp(−iEt/ħ). The solution exp(±ikx) is not normalisable. Nevertheless, it is accepted as a solution, because it is the limit of a sequence of normalisable solutions ψn which are of the form exp(±ikx) for −n < x < n, and which are smoothly cut off to zero beyond these two boundaries.

For E < V, the solution is

$$\psi(x) = e^{\pm qx}, \qquad (4)$$

with

$$q^2 = 2m(V-E)/\hbar^2. \qquad (5)$$

When the solution extends to x → ∞, only the − sign is allowed as a result of the normalisability of the wave function. For x → −∞, only the + sign is admissible for the same reason. The difference with exp(±ikx) (which is also not normalisable but nevertheless accepted) is that it is not possible to find a sequence of normalisable solutions whose limit behaves as a diverging exp(±qx).


We often deal with a potential which is piecewise constant. At the boundary between two regions with constant potential, the boundary condition must be met that the value and the derivative of the wave function are equal on both sides. Often we do not care about the normalisation of the wave function at first (we do however care about the normalisability!). In that case, the two matching conditions for value and derivative can be replaced by a continuity condition for the so-called logarithmic derivative ψ′(x)/ψ(x) (the prime ′ stands for the derivative).
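The matching of logarithmic derivatives can be illustrated for a finite square well. The Python sketch below rests on several assumptions of our own: units with ħ = m = 1, a well with V = 0 for |x| < a and V = V0 outside (the values a = 1 and V0 = 10 are arbitrary), and only the even ground state, which is cos(kx) inside and exp(−qx) for x > a. Equating the logarithmic derivatives at x = a gives k tan(ka) = q, which is solved here by simple bisection.

```python
import math

A = 1.0      # half-width of the well (assumed value)
V0 = 10.0    # potential outside the well (assumed value); inside V = 0

def mismatch(E):
    """Difference of the logarithmic derivatives at x = A for an even state."""
    k = math.sqrt(2.0 * E)           # inside:  psi = cos(kx),  psi'/psi = -k tan(kx)
    q = math.sqrt(2.0 * (V0 - E))    # outside: psi = exp(-qx), psi'/psi = -q
    return k * math.tan(k * A) - q

def bisect(g, lo, hi, steps=200):
    """Find a zero of g between lo and hi, assuming g changes sign there."""
    glo = g(lo)
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        gmid = g(mid)
        if (gmid > 0) == (glo > 0):
            lo, glo = mid, gmid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# The ground state has k*A between 0 and pi/2, i.e. 0 < E < pi^2 / (8 A^2).
E_max = math.pi ** 2 / (8.0 * A ** 2)
E0 = bisect(mismatch, 1e-9, E_max - 1e-9)
print(E0)
```

Only for this particular energy in the interval do the inside and outside solutions join smoothly, which is an instance of the discrete spectrum found when E < V on both sides.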

We now consider a general (i.e. a non-constant) potential V(x). When E > V for x → ∞ or x → −∞, then a solution can be found for all values of E larger than V. In other words, the energy spectrum is then continuous. When E < V for x → ∞ and x → −∞, then a normalisable solution is found only for a discrete set of E-values. The spectrum is then discrete. Note that it is the normalisability condition which restricts in this case the energy to a discrete set.

