
Approximation Methods

Physics 130B, UCSD Fall 2009

Joel Broida

November 15, 2009

Contents

1 The Variation Method
    1.1 The Variation Theorem
    1.2 Excited States
    1.3 Linear Variation Functions
        1.3.1 Proof that the Roots of the Secular Equation are Real

2 Time-Independent Perturbation Theory
    2.1 Perturbation Theory for a Nondegenerate Energy Level
    2.2 Perturbation Theory for a Degenerate Energy Level
    2.3 Perturbation Treatment of the First Excited States of Helium
    2.4 Spin–Orbit Coupling and the Hydrogen Atom Fine Structure
        2.4.1 Supplement: Miscellaneous Proofs
    2.5 The Zeeman Effect
        2.5.1 Strong External Field
        2.5.2 Weak External Field
        2.5.3 Intermediate-Field Case
        2.5.4 Supplement: The Electromagnetic Hamiltonian

3 Time-Dependent Perturbation Theory
    3.1 Transitions Between Two Discrete States
    3.2 Transitions to a Continuum of States


1 The Variation Method

1.1 The Variation Theorem

The variation method is one approach to approximating the ground state energy of a system without actually solving the Schrödinger equation. It is based on the following theorem, sometimes called the variation theorem.

Theorem 1.1. Let a system be described by a time-independent Hamiltonian H, and let ϕ be any normalized well-behaved function that satisfies the boundary conditions of the problem. If E0 is the true ground state energy of the system, then

〈ϕ|Hϕ〉 ≥ E0 . (1.1)

Proof. Consider the integral I = 〈ϕ|(H − E0)ϕ〉. Then

I = 〈ϕ|Hϕ〉 − E0〈ϕ|ϕ〉 = 〈ϕ|Hϕ〉 − E0 .

We must show that I ≥ 0. Let ψn be the true (stationary state) solutions to the Schrödinger equation, so that Hψn = Enψn. By assumption, the ψn form a complete, orthonormal set, so we can write

\[ \varphi = \sum_n a_n \psi_n \]

where 〈ψn|ψm〉 = δnm. Then

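\[
\begin{aligned}
I &= \sum_n a_n^* \Big\langle \psi_n \Big| (H - E_0) \sum_m a_m \psi_m \Big\rangle \\
  &= \sum_{n,m} a_n^* a_m \left( \langle \psi_n | H\psi_m \rangle - E_0 \delta_{nm} \right) \\
  &= \sum_{n,m} a_n^* a_m (E_m - E_0)\,\delta_{nm} \\
  &= \sum_n |a_n|^2 (E_n - E_0) .
\end{aligned}
\]

But |an|² ≥ 0 and En ≥ E0 for all n because E0 is the ground state energy of the system (the n = 0 term vanishes identically). Therefore I ≥ 0 as claimed.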
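As a quick numerical illustration of the last line of the proof, the sketch below draws random complex expansion coefficients an for a made-up finite spectrum and checks that the weighted average Σ|an|²En never drops below E0. The spectrum values and the helper name `expectation_energy` are assumptions chosen purely for illustration; they are not part of the notes.

```python
import random

def expectation_energy(coeffs, energies):
    """Return sum_n |a_n|^2 E_n / sum_n |a_n|^2 for expansion coefficients a_n."""
    norm = sum(abs(a) ** 2 for a in coeffs)
    return sum(abs(a) ** 2 * e for a, e in zip(coeffs, energies)) / norm

# Hypothetical spectrum (arbitrary units), sorted so that energies[0] plays the role of E_0.
energies = [1.0, 4.0, 9.0, 16.0, 25.0]

random.seed(0)
lowest_seen = float("inf")
for _ in range(1000):
    # Random complex expansion coefficients of a trial function.
    coeffs = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in energies]
    lowest_seen = min(lowest_seen, expectation_energy(coeffs, energies))

# The bound is saturated when the trial function is the ground state itself.
ground = expectation_energy([1, 0, 0, 0, 0], energies)
print(lowest_seen >= energies[0], ground)
```

Every random trial vector gives an energy above E0, and only the pure ground state reaches the bound.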

Suppose we have a trial function ϕ that is not normalized. Then Nϕ is normalized for a suitable constant N, and applying equation (1.1) to Nϕ gives |N|² 〈ϕ|Hϕ〉 ≥ E0. But by definition we know that 1 = 〈Nϕ|Nϕ〉 = |N|² 〈ϕ|ϕ〉 so that |N|² = 1/〈ϕ|ϕ〉, and hence our variation theorem becomes

\[ \frac{\langle\varphi|H\varphi\rangle}{\langle\varphi|\varphi\rangle} \ge E_0 . \tag{1.2} \]

The integral in (1.1) (or the ratio of integrals in (1.2)) is called the variational integral.

So the idea is to try a number of different trial functions, and see how low we can get the variational integral to go. Fortunately, the variational integral approaches E0 a lot faster than ϕ approaches ψ0 (an error of order ε in ϕ produces an error of only order ε² in the energy), so it is possible to get a good approximation to E0 even with a poor ϕ. In practice, a common approach is to introduce adjustable parameters into ϕ and minimize the energy with respect to them.

Before continuing with an example, there are two points I need to make. First, I state without proof that the bound stationary states of a one-dimensional system are characterized by having no nodes interior to the boundary points in the ground state (i.e., the wavefunction is never zero there), and the number of nodes increases by one for each successive excited state. While the proof of this statement is not particularly difficult (it’s really a statement about Sturm-Liouville type differential equations), it would take us too far astray at the moment. If you are interested, a proof may be found in Messiah, Quantum Mechanics, Chapter III, Sections 8-12.

A related issue is the following: In one dimension, the bound states are nondegenerate. To prove this, suppose we have two degenerate states ψ1 and ψ2, both with the same energy E. Multiply the Schrödinger equation for ψ1 by ψ2:

\[ -\frac{\hbar^2}{2m}\,\psi_2\frac{d^2\psi_1}{dx^2} + V\psi_1\psi_2 = E\psi_1\psi_2 \]

and multiply the Schrödinger equation for ψ2 by ψ1:

\[ -\frac{\hbar^2}{2m}\,\psi_1\frac{d^2\psi_2}{dx^2} + V\psi_1\psi_2 = E\psi_1\psi_2 . \]

Subtracting, we obtain

\[ \psi_2\frac{d^2\psi_1}{dx^2} - \psi_1\frac{d^2\psi_2}{dx^2} = 0 . \]

But then

\[ \frac{d}{dx}\left[\psi_2\frac{d\psi_1}{dx} - \psi_1\frac{d\psi_2}{dx}\right] = \psi_2\frac{d^2\psi_1}{dx^2} - \psi_1\frac{d^2\psi_2}{dx^2} = 0 \]

so that

\[ \psi_2\frac{d\psi_1}{dx} - \psi_1\frac{d\psi_2}{dx} = \text{const} . \]

However, we know that ψ → 0 as x → ±∞, and hence the constant must equal zero. Dividing by ψ1ψ2, we now have d ln ψ1 = d ln ψ2 or ln ψ1 = ln ψ2 + ln k where ln k is an integration constant. This is equivalent to ψ1 = kψ2 so that ψ1 and ψ2 are linearly dependent. They therefore describe the same state, and the level is nondegenerate as claimed.

The second topic I need to address is the notion of classification by symmetry. So, let us consider the time-independent Schrödinger equation Hψ = Eψ, and suppose that the potential energy function V(x) is symmetric, i.e.,

V(−x) = V(x) .

Under these conditions, the total Hamiltonian is also symmetric:

H(−x) = H(x) .

To understand the consequences of this, let us introduce an operator Π called the parity operator, defined by

Πf(x) = f(−x)

where f(x) is an arbitrary function. It is easy to see that Π is Hermitian because

\[
\begin{aligned}
\langle f|\Pi g\rangle &= \int_{-\infty}^{\infty} f(x)^*\,\Pi g(x)\,dx = \int_{-\infty}^{\infty} f(x)^*\,g(-x)\,dx \\
&= \int_{-\infty}^{\infty} f(-x)^*\,g(x)\,dx = \int_{-\infty}^{\infty} [\Pi f(x)]^*\,g(x)\,dx \\
&= \langle \Pi f|g\rangle
\end{aligned}
\]

where in going from the first line to the second we simply changed variables x → −x. (I will use the symbol dx to denote the volume element in whatever n-dimensional space is under consideration.)

Now what can we say about the eigenvalues of Π? Well, if Πf = λf , then

Π²f = Π(Πf) = λΠf = λ²f .

On the other hand, it is clear that

Π²f(x) = Π(Πf(x)) = Πf(−x) = f(x)

and hence we must have λ² = 1, so the eigenvalues of Π are ±1. Let us denote the corresponding eigenfunctions by f±:

Πf+ = f+ and Πf− = −f− .

In other words,

f+(−x) = f+(x) and f−(−x) = −f−(x) .

Thus f+ is any even function, and f− is any odd function. Note that what we have shown is the existence of a Hermitian operator with only two eigenvalues, each of which is infinitely degenerate. (I leave it as an easy exercise for you to show that f+ and f− are orthogonal as they should be.)

Next, note that any f(x) can always be written in the form

f(x) = f+(x) + f−(x)

where

\[ f_+(x) = \frac{f(x) + f(-x)}{2} \qquad\text{and}\qquad f_-(x) = \frac{f(x) - f(-x)}{2} \]

are obviously symmetric and antisymmetric, respectively. Thus the eigenfunctions of the parity operator are complete, i.e., any function can be written as the sum of a symmetric function and an antisymmetric function.
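The decomposition into even and odd parts is easy to check numerically. The following is a minimal sketch, assuming a displaced Gaussian as the test function and a simple trapezoidal rule for the overlap integral; the helper names `even_part`, `odd_part` and `overlap` are illustrative only.

```python
import math

def even_part(f):
    """f_+(x) = [f(x) + f(-x)] / 2."""
    return lambda x: 0.5 * (f(x) + f(-x))

def odd_part(f):
    """f_-(x) = [f(x) - f(-x)] / 2."""
    return lambda x: 0.5 * (f(x) - f(-x))

def overlap(f, g, a=10.0, n=20001):
    """Trapezoidal estimate of the integral of f(x) g(x) over [-a, a] (real functions)."""
    h = 2 * a / (n - 1)
    vals = [f(-a + i * h) * g(-a + i * h) for i in range(n)]
    return h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

# Hypothetical test function with no definite parity: a displaced Gaussian.
f = lambda x: math.exp(-(x - 1.0) ** 2)
f_plus, f_minus = even_part(f), odd_part(f)

x0 = 0.7
recombined = f_plus(x0) + f_minus(x0)   # reproduces f(x0)
even_check = f_plus(-x0) - f_plus(x0)   # vanishes for an even function
odd_check = f_minus(-x0) + f_minus(x0)  # vanishes for an odd function
ortho = overlap(f_plus, f_minus)        # even and odd parts are orthogonal
print(recombined - f(x0), even_check, odd_check, ortho)
```

All four printed numbers should be zero up to rounding, which also confirms the orthogonality of f+ and f− left as an exercise above.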

It will be extremely convenient to now introduce the operators Π± defined by

\[ \Pi_\pm = \frac{1 \pm \Pi}{2} . \]

In terms of these operators, we can write

Π±f = f± .

It is easy to see that the operators Π± satisfy the three properties

\[ \Pi_\pm^2 = \Pi_\pm \qquad\quad \Pi_+\Pi_- = \Pi_-\Pi_+ = 0 \qquad\quad \Pi_+ + \Pi_- = 1 . \]

The operators Π± are called projection operators.

Returning to our symmetric Hamiltonian, we observe that

Π(H(x)ψ(x)) = H(−x)ψ(−x) = H(x)ψ(−x) = H(x)Πψ(x)

and thus the Hamiltonian commutes with the parity operator. But if [H,Π] = 0, then it is trivial to see that [H,Π±] = 0 also, and therefore acting on HψE = EψE with Π± and writing ψE± := Π±ψE we see that

\[ H\psi_{E+} = E\psi_{E+} \qquad\text{and}\qquad H\psi_{E-} = E\psi_{E-} . \]

Thus the stationary states in a symmetric potential can always be classified according to their parity, i.e., they can always be chosen to have a definite symmetry. Moreover, since, as we saw above, the bound states in one dimension are nondegenerate, it follows that each bound state in a one-dimensional symmetric potential must be either even or odd.

Example 1.1. Let us find a trial function for a particle in a one-dimensional box of length l. Since the true wavefunction vanishes at the ends x = 0 and x = l, our trial function must also have this property. A simple (un-normalized) function that obeys these boundary conditions is

ϕ = x(l − x) for 0 ≤ x ≤ l

and ϕ = 0 outside the box.

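The integrals in equation (1.2) are

\[
\begin{aligned}
\langle\varphi|H\varphi\rangle &= -\frac{\hbar^2}{2m}\int_0^l x(l-x)\,\frac{d^2}{dx^2}\,x(l-x)\,dx \\
&= \frac{\hbar^2}{m}\int_0^l x(l-x)\,dx = \frac{\hbar^2l^3}{6m}
\end{aligned}
\]

and

\[ \langle\varphi|\varphi\rangle = \int_0^l x^2(l-x)^2\,dx = \frac{l^5}{30} . \]

Therefore

\[ E_0 \le \frac{\langle\varphi|H\varphi\rangle}{\langle\varphi|\varphi\rangle} = 5\,\frac{\hbar^2}{ml^2} . \]

For comparison, the exact solution has energy levels

\[ E_n = \frac{n^2\hbar^2\pi^2}{2ml^2} \qquad n = 1, 2, \ldots \]

so the ground state (n = 1) has energy

\[ \frac{\pi^2}{2}\,\frac{\hbar^2}{ml^2} = 4.9348\,\frac{\hbar^2}{ml^2} \]

for an error of 1.3%. The figure below is a plot of the exact normalized ground state solution to the particle in a box together with the normalized trial function. You can see how close the trial function is to the exact solution.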

Figure 1: Plot of the exact solution √2 sin πx and the trial function √30 x(1 − x) for 0 ≤ x ≤ 1.
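The arithmetic of Example 1.1 can be reproduced exactly with rational numbers. The sketch below works in units l = ħ = m = 1 and represents polynomials as coefficient lists; the helpers `poly_mul`, `poly_diff` and `poly_int01` are hypothetical names introduced here, not anything from the notes.

```python
from fractions import Fraction
import math

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists [c0, c1, ...]."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_diff(p):
    """Differentiate a polynomial coefficient list."""
    return [k * c for k, c in enumerate(p)][1:] or [Fraction(0)]

def poly_int01(p):
    """Integrate a polynomial over 0 <= x <= 1."""
    return sum(Fraction(c) / (k + 1) for k, c in enumerate(p))

# Trial function phi = x(1 - x) = x - x^2.
phi = [Fraction(0), Fraction(1), Fraction(-1)]

# <phi|H phi> = -(1/2) * integral of phi * phi'' (V = 0 inside the box).
h_elem = Fraction(-1, 2) * poly_int01(poly_mul(phi, poly_diff(poly_diff(phi))))
s_elem = poly_int01(poly_mul(phi, phi))   # <phi|phi>
w = h_elem / s_elem                       # variational integral in units hbar^2/(m l^2)

exact = math.pi ** 2 / 2                  # exact ground state energy in the same units
relative_error = (float(w) - exact) / exact
print(h_elem, s_elem, w, relative_error)
```

The two integrals come out as 1/6 and 1/30, so the ratio is exactly 5, in agreement with the hand calculation.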

Example 1.2. Let us construct a variation function with a parameter for the one-dimensional harmonic oscillator, and find the optimal value for that parameter.

What do we know in general? First, the wavefunction must vanish as x → ±∞. The most obvious function that satisfies this is $e^{-x^2}$. However, x has units of length, and we can only take the exponential of a dimensionless quantity (think of the power series expansion for $e^{-x^2}$). If we include a constant α with dimensions of length⁻², then $e^{-\alpha x^2}$ is satisfactory from a dimensional standpoint. In addition, since the potential V = ½kx² (with k = mω²) is symmetric, we know that the eigenstates will have a definite parity. And since the ground state has no nodes, it must be an even function (since an odd function has a node at the origin). Thus the trial function

\[ \varphi = e^{-\alpha x^2} \]

has all of our desired properties.

Since ϕ is unnormalized, we use equation (1.2). The Hamiltonian is

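\[ -\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + \frac{1}{2}m\omega^2x^2 \]

and hence

\[
\begin{aligned}
\langle\varphi|H\varphi\rangle &= -\frac{\hbar^2}{2m}\int_{-\infty}^{\infty} e^{-\alpha x^2}\,\frac{d^2e^{-\alpha x^2}}{dx^2}\,dx + \frac{1}{2}m\omega^2\int_{-\infty}^{\infty} x^2e^{-2\alpha x^2}\,dx \\
&= -\frac{\hbar^2}{2m}\int_{-\infty}^{\infty}\left[4\alpha^2x^2e^{-2\alpha x^2} - 2\alpha e^{-2\alpha x^2}\right]dx + \frac{1}{2}m\omega^2\int_{-\infty}^{\infty} x^2e^{-2\alpha x^2}\,dx \\
&= \left[-\frac{2\hbar^2\alpha^2}{m} + \frac{1}{2}m\omega^2\right]\int_{-\infty}^{\infty} x^2e^{-2\alpha x^2}\,dx + \frac{\hbar^2\alpha}{m}\int_{-\infty}^{\infty} e^{-2\alpha x^2}\,dx .
\end{aligned}
\]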

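The second integral is easy (and you should already know the answer):

\[ \int_{-\infty}^{\infty} e^{-2\alpha x^2}\,dx = \sqrt{\frac{\pi}{2\alpha}} . \]

Using this, the first integral is also easy. Letting β = 2α we have

\[
\begin{aligned}
\int_{-\infty}^{\infty} x^2e^{-2\alpha x^2}\,dx &= \int_{-\infty}^{\infty} x^2e^{-\beta x^2}\,dx = -\frac{\partial}{\partial\beta}\int_{-\infty}^{\infty} e^{-\beta x^2}\,dx \\
&= -\frac{\partial}{\partial\beta}\sqrt{\frac{\pi}{\beta}} = \frac{1}{2}\,\frac{\pi^{1/2}}{\beta^{3/2}} = \frac{1}{2}\,\frac{\pi^{1/2}}{(2\alpha)^{3/2}} .
\end{aligned}
\]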

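After a little algebra, we now arrive at

\[ \langle\varphi|H\varphi\rangle = \frac{\hbar^2\pi^{1/2}\alpha^{1/2}}{2^{3/2}m} + \frac{m\omega^2\pi^{1/2}\alpha^{-3/2}}{2^{7/2}} . \]

And the denominator in equation (1.2) is just

\[ \langle\varphi|\varphi\rangle = \int_{-\infty}^{\infty} e^{-2\alpha x^2}\,dx = \sqrt{\frac{\pi}{2\alpha}} . \]

Thus our variational integral becomes

\[ W := \frac{\langle\varphi|H\varphi\rangle}{\langle\varphi|\varphi\rangle} = \frac{\hbar^2\alpha}{2m} + \frac{m\omega^2}{8\alpha} . \]

To minimize this with respect to α we set dW/dα = 0 and solve for α:

\[ \frac{\hbar^2}{2m} - \frac{m\omega^2}{8\alpha^2} = 0 \]

or

\[ \alpha = \pm\frac{m\omega}{2\hbar} . \]

The negative root must be rejected because otherwise $\varphi = e^{-\alpha x^2}$ would be divergent. Substituting the positive root for α into our expression for W yields

\[ W = \frac{1}{2}\hbar\omega \]

which is the exact ground state harmonic oscillator energy. This isn’t surprising, because up to normalization, our ϕ with α = mω/2ħ is just the exact ground state harmonic oscillator wave function.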
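When the minimization cannot be done by hand, the same step is carried out numerically. As a hedged sketch, the code below minimizes W(α) = ħ²α/2m + mω²/8α with a golden-section search in units ħ = m = ω = 1, where the analytic answer is α = 1/2 and W = 1/2. The search routine and its bracketing interval are illustrative choices, not a method prescribed in the notes.

```python
import math

def w_of_alpha(alpha, hbar=1.0, m=1.0, omega=1.0):
    """Variational integral W(alpha) = hbar^2 alpha/(2m) + m omega^2/(8 alpha)."""
    return hbar ** 2 * alpha / (2 * m) + m * omega ** 2 / (8 * alpha)

def golden_section_min(f, lo, hi, tol=1e-10):
    """Locate the minimum of a unimodal function f on the interval [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d   # minimum lies in [a, d]
        else:
            a = c   # minimum lies in [c, b]
    return (a + b) / 2

# Only positive alpha is searched, since a negative alpha gives a divergent trial function.
alpha_min = golden_section_min(w_of_alpha, 0.01, 5.0)
w_min = w_of_alpha(alpha_min)
print(alpha_min, w_min)
```

Because W is flat near its minimum, the energy converges much faster than α does, which is the numerical counterpart of the remark that the variational integral approaches E0 faster than ϕ approaches ψ0.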

1.2 Excited States

So far all we have discussed is how to approximate the ground-state energy of a system. Now we want to take a look at how to go about approximating the energy of an excited state. Let us assume that the stationary states of our system are numbered so that

E0 ≤ E1 ≤ E2 ≤ · · · .

If {ψn} is a complete set of orthonormal eigenstates of H, then our normalized trial function can be written ϕ = Σn anψn where an = 〈ψn|ϕ〉. Then as we have seen

\[ \langle\varphi|H\varphi\rangle = \sum_{n,m} a_n^*a_mE_m\langle\psi_n|\psi_m\rangle = \sum_{n,m} a_n^*a_mE_m\delta_{nm} = \sum_{n=0}^{\infty} |a_n|^2E_n \]

and

\[ \langle\varphi|\varphi\rangle = \sum_{n=0}^{\infty} |a_n|^2 = 1 . \]

Suppose we restrict ourselves to trial functions that are orthogonal to the true ground-state wavefunction ψ0. Then a0 = 〈ψ0|ϕ〉 = 0 and we are left with

\[ \langle\varphi|H\varphi\rangle = \sum_{n=1}^{\infty} |a_n|^2E_n \qquad\text{and}\qquad \langle\varphi|\varphi\rangle = \sum_{n=1}^{\infty} |a_n|^2 = 1 . \]

For n ≥ 1 we have En ≥ E1 so that |an|²En ≥ |an|²E1 and hence

\[ \sum_{n=1}^{\infty} |a_n|^2E_n \ge \sum_{n=1}^{\infty} |a_n|^2E_1 = E_1\sum_{n=1}^{\infty} |a_n|^2 = E_1 . \]

This gives us our desired result

〈ϕ|Hϕ〉 ≥ E1 if 〈ψ0|ϕ〉 = 0 and 〈ϕ|ϕ〉 = 1 . (1.3)

While equation (1.3) gives an upper bound on the energy E1 of the first excited state, it depends on the restriction 〈ψ0|ϕ〉 = 0 which can be problematic. However, for some systems this is not a difficult requirement to achieve even though we don’t know the exact ground-state wavefunction. For example, a one-dimensional problem with a symmetric potential has a ground-state wavefunction that is always even, while the first excited state is always odd. This means that any (normalized) trial function ϕ that is an odd function will automatically satisfy 〈ψ0|ϕ〉 = 0.
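To see the parity argument at work, here is a small sketch for the particle in a box (units l = ħ = m = 1). It takes the trial function x(1 − x)(1/2 − x), which is odd about the center of the box, evaluates the variational integral exactly with rational arithmetic, and checks numerically that the overlap with the exact ground state √2 sin πx vanishes. The choice of trial function and all helper names are assumptions made for this illustration.

```python
from fractions import Fraction
import math

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists [c0, c1, ...]."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def second_derivative(p):
    """Second derivative of a polynomial coefficient list."""
    return [k * (k - 1) * c for k, c in enumerate(p)][2:] or [Fraction(0)]

def poly_int01(p):
    """Integrate a polynomial over 0 <= x <= 1."""
    return sum(Fraction(c) / (k + 1) for k, c in enumerate(p))

def poly_eval(p, x):
    """Evaluate the polynomial at a floating-point x."""
    return sum(float(c) * x ** k for k, c in enumerate(p))

# Odd (about the box center) trial function x(1 - x)(1/2 - x).
phi = poly_mul([Fraction(0), Fraction(1), Fraction(-1)], [Fraction(1, 2), Fraction(-1)])

h_elem = Fraction(-1, 2) * poly_int01(poly_mul(phi, second_derivative(phi)))
w = h_elem / poly_int01(poly_mul(phi, phi))   # upper bound on the first excited level

# Trapezoidal check that phi is orthogonal to the exact ground state sqrt(2) sin(pi x).
n = 2001
h = 1.0 / (n - 1)
vals = [math.sqrt(2) * math.sin(math.pi * i * h) * poly_eval(phi, i * h) for i in range(n)]
overlap = h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

exact_first_excited = 2 * math.pi ** 2         # n = 2 box level, about 19.74
print(w, exact_first_excited, overlap)
```

The bound obtained this way is 21 ħ²/ml², which lies above the exact first excited level 2π²ħ²/ml² as equation (1.3) requires.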

It is also possible to extend this approach to approximating the energy levels of higher excited states. In particular, if we somehow choose the trial function ϕ so that

〈ψ0|ϕ〉 = 〈ψ1|ϕ〉 = · · · = 〈ψn|ϕ〉 = 0,

then, following exactly the same argument as above, it is easy to see that if 〈ϕ|ϕ〉 = 1 we have

〈ϕ|Hϕ〉 ≥ En+1 .

For example, consider any particle moving under a central potential V(r) (e.g., the hydrogen atom). Then the Schrödinger equation separates into a radial equation that depends on V(r) and an angular equation (that is independent of V) whose solutions are just the spherical harmonics $Y_l^m(\theta, \phi)$. It may very well be that we can’t solve the radial equation with this potential, but we know that spherical harmonics with different values of l are orthogonal. Thus, we can get an upper bound to the energy of the lowest state with a particular angular momentum l by choosing a trial function that contains the factor $Y_l^m$.

1.3 Linear Variation Functions

The approach that we are now going to describe is probably the most common method of finding approximate molecular wave functions. A linear variation function ϕ is a linear combination of n linearly independent functions fi:

\[ \varphi = \sum_{i=1}^{n} c_if_i . \]

The functions fi are called basis functions, and they must obey the boundary conditions of the problem. The coefficients ci are to be determined by minimizing the variational integral.

We shall restrict ourselves to a real ϕ, so the functions fi and coefficients ci are taken to be real. Later we will remove this requirement. Furthermore, note that the basis functions are not generally orthogonal since they are not necessarily the eigenfunctions of any operator. Let us define the overlap integrals Sij by

\[ S_{ij} := \langle f_i|f_j\rangle = \int f_i^*f_j\,dx \]

(where the asterisk on fi isn’t necessary because we are assuming that our basis functions are real). Then (remember that the ci are real)

\[ \langle\varphi|\varphi\rangle = \sum_{i,j=1}^{n} c_ic_j\langle f_i|f_j\rangle = \sum_{i,j=1}^{n} c_ic_jS_{ij} . \]

Next, we define the integrals

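\[ H_{ij} := \langle f_i|Hf_j\rangle = \int f_i^*Hf_j\,dx \]

so that

\[ \langle\varphi|H\varphi\rangle = \sum_{i,j=1}^{n} c_ic_j\langle f_i|Hf_j\rangle = \sum_{i,j=1}^{n} c_ic_jH_{ij} . \]

Then the variation theorem (1.2) becomes

\[ W = \frac{\langle\varphi|H\varphi\rangle}{\langle\varphi|\varphi\rangle} = \frac{\sum_{i,j=1}^{n} c_ic_jH_{ij}}{\sum_{i,j=1}^{n} c_ic_jS_{ij}} \]

or

\[ W\sum_{i,j=1}^{n} c_ic_jS_{ij} = \sum_{i,j=1}^{n} c_ic_jH_{ij} . \tag{1.4} \]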

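Now W is a function of the n ci’s, and we know that W ≥ E0. In order to minimize W with respect to all of the ck’s, we must require that at the minimum we have

\[ \frac{\partial W}{\partial c_k} = 0 ; \qquad k = 1, \ldots, n . \]

Taking the derivative of (1.4) with respect to ck and using

\[ \frac{\partial c_i}{\partial c_k} = \delta_{ik} \]

we have

\[ \frac{\partial W}{\partial c_k}\sum_{i,j=1}^{n} c_ic_jS_{ij} + W\sum_{i,j=1}^{n}(\delta_{ik}c_j + c_i\delta_{jk})S_{ij} = \sum_{i,j=1}^{n}(\delta_{ik}c_j + c_i\delta_{jk})H_{ij} \]

or (since ∂W/∂ck = 0)

\[ W\sum_{j=1}^{n} c_jS_{kj} + W\sum_{i=1}^{n} c_iS_{ik} = \sum_{j=1}^{n} c_jH_{kj} + \sum_{i=1}^{n} c_iH_{ik} . \]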

However, the basis functions fi are real so we have

\[ S_{ik} = \int f_if_k\,dx = S_{ki} \]

and since H is Hermitian (and H(x) is real) we also have

Hik = 〈fi|Hfk〉 = 〈Hfi|fk〉 = 〈fk|Hfi〉∗ = 〈fk|Hfi〉 = Hki .

Therefore, because the summation indices are dummy indices, we see that the two terms on each side of the last equation are identical, and we are left with

\[ W\sum_{j=1}^{n} c_jS_{kj} = \sum_{j=1}^{n} c_jH_{kj} \]

or

\[ \sum_{j=1}^{n}(H_{kj} - WS_{kj})c_j = 0 ; \qquad k = 1, \ldots, n . \tag{1.5} \]

This is just a system of n homogeneous linear equations in n unknowns (the n coefficients cj), and hence for a nontrivial solution to exist (we don’t want all of the cj’s to be zero) we must have the secular equation

det(Hkj − WSkj) = 0 . (1.6)

(You can think of this as a system of the form Σj akjxj = 0 where the matrix A = (akj) must be singular or else A⁻¹ would exist and then the equation Ax = 0 would imply that x = 0. The requirement that A be singular is equivalent to the requirement that det A = 0.) Written out, equation (1.6) looks like

\[
\begin{vmatrix}
H_{11} - WS_{11} & H_{12} - WS_{12} & \cdots & H_{1n} - WS_{1n} \\
H_{21} - WS_{21} & H_{22} - WS_{22} & \cdots & H_{2n} - WS_{2n} \\
\vdots & \vdots & & \vdots \\
H_{n1} - WS_{n1} & H_{n2} - WS_{n2} & \cdots & H_{nn} - WS_{nn}
\end{vmatrix} = 0 .
\]

The determinant in (1.6) is a polynomial in W of degree n, and it can be proved that all n roots of this equation are real. (The proof is given at the end of this section for those who are interested.) Let us arrange the roots in order of increasing value as

W0 ≤ W1 ≤ · · · ≤ Wn−1 .

Similarly, we number the bound states of the system so that the corresponding true energies of these bound states are also arranged in increasing order:

E0 ≤ E1 ≤ · · · ≤ En−1 ≤ En ≤ · · · .

From the variation theorem we know that E0 ≤ W0. Furthermore, it can also be proved (see the homework) that

Ei ≤ Wi for each i = 0, . . . , n − 1 .

In other words, the linear variation method provides upper bounds for the energies of the lowest n bound states of the system. It can be shown that adding basis functions to the set (and hence increasing the number of states whose energies are approximated) can only improve, never worsen, the accuracy of the previously calculated energies.

Once we have found the n roots Wi, we can substitute them one-at-a-time back into equation (1.5) and solve for the coefficients $c_j^{(i)}$, where the superscript denotes the fact that this particular set of coefficients applies to the root Wi. (Again, this is just like finding the eigenvector corresponding to a given eigenvalue.) Note also that all we can really find is the ratios of the coefficients, say relative to c1, and then fix c1 by normalization.

There are some tricks that can simplify the solution of equation (1.6). For example, if we choose the basis functions to be orthonormal, then Skj = δkj. If the originally chosen set of basis functions isn’t orthonormal, we can always use the Gram-Schmidt process to construct an orthonormal set. Also, we can make some of the off-diagonal Hkj’s vanish if we choose our basis functions to be eigenfunctions of some other Hermitian operator A that commutes with H. This is because of the following theorem:

Theorem 1.2. Let fi and fj be eigenfunctions of a Hermitian operator A corresponding to the eigenvalues ai ≠ aj. If H is an operator that commutes with A, then

Hji = 〈fj|Hfi〉 = 0 .

Proof. Let us first assume that the eigenvalue ai is nondegenerate. Then Afi = aifi and

A(Hfi) = HAfi = ai(Hfi) .

Thus Hfi is in the eigenspace $V_{a_i}$ of A corresponding to the eigenvalue ai. But ai is nondegenerate so that the eigenspace is one-dimensional and spanned by fi. Hence we must have Hfi = bifi for some scalar bi. Recalling that eigenfunctions belonging to distinct eigenvalues of a Hermitian operator are orthogonal, we have

〈fj|Hfi〉 = bi〈fj|fi〉 = 0 .

Now assume that the eigenvalue ai is degenerate. This means that the eigenspace $V_{a_i}$ has dimension greater than one, say dim $V_{a_i}$ = n. Then $V_{a_i}$ has a basis g1, . . . , gn consisting of eigenvectors of A corresponding to the eigenvalue ai, i.e., Agk = aigk for each k = 1, . . . , n. Since Hfi is in $V_{a_i}$ (by the same calculation as before), we have $Hf_i = \sum_{k=1}^{n} c_kg_k$ for some expansion coefficients ck. But then we again have

\[ \langle f_j|Hf_i\rangle = \sum_{k=1}^{n} c_k\langle f_j|g_k\rangle = 0 \]

because the eigenfunctions fj and gk belong to the distinct eigenvalues aj and ai respectively.

Another (possibly easier) way to prove Theorem 1.2 is this. Let Afi = aifi and Afj = ajfj where ai ≠ aj. (In other words, fi and fj belong to different eigenspaces of A.) Then on the one hand we have

〈fj|HAfi〉 = ai〈fj|Hfi〉

while on the other hand, we can use the fact that H and A commute along with the fact that A is Hermitian and hence has real eigenvalues, to write

〈fj|HAfi〉 = 〈fj|AHfi〉 = 〈Afj|Hfi〉 = aj〈fj|Hfi〉 .

Equating these results shows that (ai − aj)〈fj|Hfi〉 = 0. Therefore, if ai ≠ aj, we must have 〈fj|Hfi〉 = 0.

Finally, it is left as a homework problem to show that equations (1.5) and (1.6) also hold if the variation function is in fact allowed to be complex.

Example 1.3. In Example 1.1 we constructed the trial function ϕ = x(l − x) for the ground state of the one-dimensional particle in a box. Let us now construct a linear variation function ϕ = Σi cifi to approximate the energies of the first four states. This means that we need at least four independent functions fi that obey the boundary conditions of vanishing at the ends of the box. While there are an infinite number of possibilities, we want to limit ourselves to integrals that are easy to evaluate.

We begin by taking

f1 = x(l − x) ,

and another simple function that obeys the proper boundary conditions is

f2 = x²(l − x)² .

If the origin were chosen to be at the center of the box, we know that the exact solutions would have a definite parity, alternating between even and odd functions, starting with the even ground state. To see that both f1 and f2 are even functions, we shift the origin to the center of the box by changing variables to x′ = x − l/2. Then x = x′ + l/2 and we find

f1 = (x′ + l/2)(l/2 − x′) and f2 = (x′ + l/2)²(l/2 − x′)²

which shows that f1 and f2 are both clearly even functions of x′.

Since both f1 and f2 are even functions, if we took ϕ = c1f1 + c2f2 we would end up with an upper bound for the two lowest energy even states (the n = 1 and n = 3 states). In order to also approximate the odd n = 2 and n = 4 states, we must add in two odd functions. Thus we need two functions that vanish at x = 0, x = l and x = l/2. Two functions that satisfy these requirements are

f3 = x(l − x)(l/2 − x)

and

f4 = x²(l − x)²(l/2 − x) .

By again changing variables as we did for f1 and f2, you can easily show that f3 and f4 are indeed odd functions. Note also that the four functions we have chosen are linearly independent as they must be.

One of the advantages in choosing our functions to have a definite parity is that many of the integrals that occur in equation (1.6) will vanish. In particular, since any integral of an odd function over a symmetric interval is identically zero, and since the product of an even function with an odd function is odd, it should be clear that

S13 = S31 = 0 S14 = S41 = 0

S23 = S32 = 0 S24 = S42 = 0 .

Furthermore, since the functions have a definite parity, they are eigenfunctions of the parity operator Π with Πf1,2 = +f1,2 and Πf3,4 = −f3,4. And since the potential is symmetric, we have [Π, H] = 0 so that by Theorem 1.2 we know that Hij = 0 if one index refers to an even function and the other refers to an odd function:

H13 = H31 = 0 H14 = H41 = 0

H23 = H32 = 0 H24 = H42 = 0 .

With these simplifications, (1.6) becomes

\[
\begin{vmatrix}
H_{11} - WS_{11} & H_{12} - WS_{12} & 0 & 0 \\
H_{21} - WS_{21} & H_{22} - WS_{22} & 0 & 0 \\
0 & 0 & H_{33} - WS_{33} & H_{34} - WS_{34} \\
0 & 0 & H_{43} - WS_{43} & H_{44} - WS_{44}
\end{vmatrix} = 0 .
\]

Since the determinant of a block diagonal matrix is the product of the determinants of the blocks, we can find all four roots by finding the two roots of each of the following equations:

\[
\begin{vmatrix}
H_{11} - WS_{11} & H_{12} - WS_{12} \\
H_{21} - WS_{21} & H_{22} - WS_{22}
\end{vmatrix} = 0 \tag{1.7a}
\]

\[
\begin{vmatrix}
H_{33} - WS_{33} & H_{34} - WS_{34} \\
H_{43} - WS_{43} & H_{44} - WS_{44}
\end{vmatrix} = 0 . \tag{1.7b}
\]

Let the roots of (1.7a) be denoted W1, W3. These are the approximations to the energies of the n = 1 and n = 3 even states. Similarly, the roots W2, W4 of (1.7b) are the approximations to the odd energy states n = 2 and n = 4. Once we have the roots Wi, we substitute them one-at-a-time back into equation (1.5) to determine the set of coefficients $c_j^{(i)}$ corresponding to that particular root. In the particular case of W1, this yields the set of equations

\[
\begin{aligned}
(H_{11} - W_1S_{11})c_1^{(1)} + (H_{12} - W_1S_{12})c_2^{(1)} &= 0 \\
(H_{21} - W_1S_{21})c_1^{(1)} + (H_{22} - W_1S_{22})c_2^{(1)} &= 0
\end{aligned}
\tag{1.8a}
\]

\[
\begin{aligned}
(H_{33} - W_1S_{33})c_3^{(1)} + (H_{34} - W_1S_{34})c_4^{(1)} &= 0 \\
(H_{43} - W_1S_{43})c_3^{(1)} + (H_{44} - W_1S_{44})c_4^{(1)} &= 0 .
\end{aligned}
\tag{1.8b}
\]

Now, W1 was a root of (1.7a), so the determinant of the coefficients in (1.8a) must vanish, and we have a nontrivial solution for $c_1^{(1)}$ and $c_2^{(1)}$. However, W1 was not a root of (1.7b), so the determinant of the coefficients in (1.8b) does not vanish, and hence there is only the trivial solution $c_3^{(1)} = c_4^{(1)} = 0$. Thus the trial function for W1 is $\varphi_1 = c_1^{(1)}f_1 + c_2^{(1)}f_2$. Exactly the same reasoning applies to the other three roots, and we have the trial functions

\[
\begin{aligned}
\varphi_1 &= c_1^{(1)}f_1 + c_2^{(1)}f_2 & \varphi_3 &= c_1^{(3)}f_1 + c_2^{(3)}f_2 \\
\varphi_2 &= c_3^{(2)}f_3 + c_4^{(2)}f_4 & \varphi_4 &= c_3^{(4)}f_3 + c_4^{(4)}f_4 .
\end{aligned}
\]

So we see that the even states ψ1 and ψ3 are approximated by the trial functions ϕ1 and ϕ3 consisting of linear combinations of the even functions f1 and f2. Similarly, the odd states ψ2 and ψ4 are approximated by the trial functions ϕ2 and ϕ4 that are linear combinations of the odd functions f3 and f4.

To proceed any further, we need to evaluate the non-zero integrals Hij and Sij. From Example 1.1 we can immediately write down H11 and S11. The rest of the integrals are also straightforward to evaluate, and the result is

\[
\begin{array}{lll}
H_{11} = \hbar^2l^3/6m & H_{12} = H_{21} = \hbar^2l^5/30m & H_{22} = \hbar^2l^7/105m \\
H_{33} = \hbar^2l^5/40m & H_{44} = \hbar^2l^9/1260m & H_{34} = H_{43} = \hbar^2l^7/280m \\
S_{11} = l^5/30 & S_{12} = S_{21} = l^7/140 & S_{22} = l^9/630 \\
S_{33} = l^7/840 & S_{44} = l^{11}/27720 & S_{34} = S_{43} = l^9/5040 .
\end{array}
\]
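These integrals are tedious by hand but mechanical, so they are a natural thing to verify by machine. The sketch below recomputes every Hij and Sij exactly in units l = ħ = m = 1, using Hij = −(1/2)∫ fi fj″ dx inside the box. Polynomials are stored as coefficient lists, and the helper names are hypothetical.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists [c0, c1, ...]."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def second_derivative(p):
    """Second derivative of a polynomial coefficient list."""
    return [k * (k - 1) * c for k, c in enumerate(p)][2:] or [Fraction(0)]

def poly_int01(p):
    """Integrate a polynomial over 0 <= x <= 1."""
    return sum(Fraction(c) / (k + 1) for k, c in enumerate(p))

x = [Fraction(0), Fraction(1)]
one_minus_x = [Fraction(1), Fraction(-1)]
half_minus_x = [Fraction(1, 2), Fraction(-1)]

f1 = poly_mul(x, one_minus_x)       # x(1 - x)
f2 = poly_mul(f1, f1)               # x^2 (1 - x)^2
f3 = poly_mul(f1, half_minus_x)     # x(1 - x)(1/2 - x)
f4 = poly_mul(f2, half_minus_x)     # x^2 (1 - x)^2 (1/2 - x)
basis = [f1, f2, f3, f4]

# Overlap and Hamiltonian matrices (V = 0 inside the box).
S = [[poly_int01(poly_mul(fi, fj)) for fj in basis] for fi in basis]
H = [[Fraction(-1, 2) * poly_int01(poly_mul(fi, second_derivative(fj))) for fj in basis]
     for fi in basis]

for i in range(4):
    for j in range(i, 4):
        print(f"H{i + 1}{j + 1} = {H[i][j]}   S{i + 1}{j + 1} = {S[i][j]}")
```

The even-odd entries come out exactly zero, as the parity argument predicts, and the remaining entries reproduce the numerical factors listed above.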

Substituting these results into equation (1.7a) to determine W1 and W3 we have

\[
\begin{vmatrix}
\dfrac{\hbar^2l^3}{6m} - \dfrac{l^5}{30}W & \dfrac{\hbar^2l^5}{30m} - \dfrac{l^7}{140}W \\[2ex]
\dfrac{\hbar^2l^5}{30m} - \dfrac{l^7}{140}W & \dfrac{\hbar^2l^7}{105m} - \dfrac{l^9}{630}W
\end{vmatrix} = 0 .
\]

To evaluate this, it is easiest to recall that multiplying any single row of a determinant by some scalar is the same as multiplying the original determinant by that same scalar. (This is an obvious consequence of the definition

\[ \det A = \sum_{i_1,\ldots,i_n=1}^{n} \varepsilon_{i_1\cdots i_n}\,a_{1i_1}\cdots a_{ni_n} .) \]

Since the right hand side of our determinantal equation is zero, we don’t change anything by multiplying any row in this determinant by some constant. Multiplying the first row by 420m/l³ and the second row by 1260m/l⁵ we obtain

\[
\begin{vmatrix}
70\hbar^2 - 14ml^2W & 14\hbar^2l^2 - 3ml^4W \\
42\hbar^2 - 9ml^2W & 12\hbar^2l^2 - 2ml^4W
\end{vmatrix} = 0 \tag{1.9}
\]

or, after expanding and dividing by l²,

\[ m^2l^4W^2 - 56ml^2\hbar^2W + 252\hbar^4 = 0 . \]

The roots of this quadratic are

\[ W_{1,3} = (\hbar^2/ml^2)(28 \pm \sqrt{532}) = 4.93487\,\hbar^2/ml^2 ,\ 51.0651\,\hbar^2/ml^2 . \]

Similarly, substituting the values for Hij and Sij into (1.7b) results in

\[ W_{2,4} = (\hbar^2/ml^2)(60 \pm \sqrt{1620}) = 19.7508\,\hbar^2/ml^2 ,\ 100.249\,\hbar^2/ml^2 . \]

For comparison, the first four exact solutions En = n²ħ²π²/2ml² are

\[ E_n = 4.9348\,\hbar^2/ml^2 ,\ 19.7392\,\hbar^2/ml^2 ,\ 44.4132\,\hbar^2/ml^2 ,\ 78.9568\,\hbar^2/ml^2 \]

so the errors are (in the order of increasing energy levels) 0.0014%, 0.059%, 15.0% and 27.0%. As expected, we did great for n = 1 and n = 2, but not so great for n = 3 and n = 4.
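The four roots and their errors can be checked with a few lines of code. This sketch solves each 2×2 secular equation as a quadratic in W, using the integral values quoted in the text with l = ħ = m = 1; the function name `secular_roots_2x2` is an assumption introduced for the illustration.

```python
from fractions import Fraction as F
import math

def secular_roots_2x2(h11, h12, h22, s11, s12, s22):
    """Roots of det(H - W S) = 0 for real symmetric 2x2 H and S, smallest first."""
    # Expanding the determinant gives a W^2 + b W + c = 0 with these coefficients.
    a = s11 * s22 - s12 ** 2
    b = -(h11 * s22 + h22 * s11 - 2 * h12 * s12)
    c = h11 * h22 - h12 ** 2
    disc = math.sqrt(float(b * b - 4 * a * c))
    return ((-float(b) - disc) / (2 * float(a)), (-float(b) + disc) / (2 * float(a)))

# Even block (f1, f2) and odd block (f3, f4), values as quoted in the text.
w1, w3 = secular_roots_2x2(F(1, 6), F(1, 30), F(1, 105), F(1, 30), F(1, 140), F(1, 630))
w2, w4 = secular_roots_2x2(F(1, 40), F(1, 280), F(1, 1260), F(1, 840), F(1, 5040), F(1, 27720))

roots = [w1, w2, w3, w4]
exact = [n ** 2 * math.pi ** 2 / 2 for n in (1, 2, 3, 4)]
for n, (w, e) in enumerate(zip(roots, exact), start=1):
    print(f"n = {n}: W = {w:.5f}, E = {e:.5f}, error = {100 * (w - e) / e:.4f}%")
```

Each approximate level lies above the corresponding exact level, as the inequality Ei ≤ Wi requires, with the two lowest levels far more accurate than the two highest.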

We still have to find the approximate wave functions that correspond to each of the Wi’s. We want to substitute W1 = 4.93487ħ²/ml² into equations (1.8a) and use the integrals we have already evaluated. However, it is somewhat easier to note that the coefficients of $c_{1,2}^{(1)}$ in equations (1.8a) are equivalent to the entries in equation (1.9). Furthermore, as we have already noted, all we can find is the ratio of the ci’s, so the two rows of (1.9) give equivalent equations, and we only need to use either one of them. (That the equations are equivalent is a consequence of the fact that the determinant (1.9) is zero, so the rows must be linearly dependent. Hence we get no new information by using both rows.)

So choosing the first row we have

\[
\begin{aligned}
70\hbar^2 - 14ml^2W_1 &= 70\hbar^2 - 14ml^2(4.93487\,\hbar^2/ml^2) = 0.91182\,\hbar^2 \\
14\hbar^2l^2 - 3ml^4W_1 &= 14\hbar^2l^2 - 3ml^4(4.93487\,\hbar^2/ml^2) = -0.80461\,\hbar^2l^2
\end{aligned}
\]

so that

\[ c_2^{(1)} = \frac{0.91182\,\hbar^2}{0.80461\,\hbar^2l^2}\,c_1^{(1)} = 1.133\,c_1^{(1)}/l^2 . \]

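To fix the value of $c_1^{(1)}$ we use the normalization condition:

\[
\begin{aligned}
1 = \langle\varphi_1|\varphi_1\rangle &= \langle c_1^{(1)}f_1 + c_2^{(1)}f_2 \,|\, c_1^{(1)}f_1 + c_2^{(1)}f_2\rangle \\
&= [c_1^{(1)}]^2S_{11} + 2c_1^{(1)}c_2^{(1)}S_{12} + [c_2^{(1)}]^2S_{22} \\
&= [c_1^{(1)}]^2\left[S_{11} + 2\cdot\frac{1.133}{l^2}S_{12} + \frac{(1.133)^2}{l^4}S_{22}\right] \\
&= [c_1^{(1)}]^2\left[\frac{l^5}{30} + 2\cdot\frac{1.133}{l^2}\,\frac{l^7}{140} + \frac{(1.133)^2}{l^4}\,\frac{l^9}{630}\right] \\
&= 0.05156\,[c_1^{(1)}]^2\,l^5
\end{aligned}
\]

and hence $c_1^{(1)} = 4.404\,l^{-5/2}$.

Putting this all together we finally obtain

\[
\begin{aligned}
\varphi_1 &= 4.404\,l^{-5/2}f_1 + 4.990\,l^{-9/2}f_2 \\
&= 4.404\,l^{-5/2}x(l - x) + 4.990\,l^{-9/2}x^2(l - x)^2 \\
&= l^{-1/2}\left[4.404(x/l)(1 - x/l) + 4.990(x/l)^2(1 - x/l)^2\right] .
\end{aligned}
\]

As you can see from the plot below, the function ϕ1 is almost identical to the exact solution $\psi_1 = \sqrt{2/l}\,\sin(\pi x/l)$: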

Figure 2: Plot of ψ1 (exact solution) and ϕ1 (trial function) vs x/l.
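The coefficient calculation for ϕ1 (and, by the same steps, ϕ3) can be sketched in code as follows. It uses the first row of equation (1.5) to get the ratio c2/c1 and then normalizes, working in units l = ħ = m = 1 with the even-block integrals quoted in the text. The function name `block_coefficients` and the sign convention c1 > 0 are choices made for this illustration.

```python
from fractions import Fraction as F
import math

def block_coefficients(w, h11, h12, s11, s12, s22):
    """Return (c1, c2) for the root w of a 2x2 block, normalized and with c1 > 0."""
    # First row of (1.5): (H11 - w S11) c1 + (H12 - w S12) c2 = 0.
    ratio = -(float(h11) - w * float(s11)) / (float(h12) - w * float(s12))  # c2 / c1
    # Normalization: c1^2 (S11 + 2 ratio S12 + ratio^2 S22) = 1.
    norm = float(s11) + 2 * ratio * float(s12) + ratio ** 2 * float(s22)
    c1 = 1 / math.sqrt(norm)
    return c1, ratio * c1

# Even-block integrals from the text.
H11, H12 = F(1, 6), F(1, 30)
S11, S12, S22 = F(1, 30), F(1, 140), F(1, 630)

w1 = 28 - math.sqrt(532)
w3 = 28 + math.sqrt(532)
c1_1, c2_1 = block_coefficients(w1, H11, H12, S11, S12, S22)
c1_3, c2_3 = block_coefficients(w3, H11, H12, S11, S12, S22)
print(c1_1, c2_1, c1_3, c2_3)
```

Only one row of the block is needed because the rows are linearly dependent once W is a root, which is the same observation made in the text about equation (1.9).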

Repeating all of this with the other roots W2, W3 and W4 we eventually arrive at

\[
\begin{aligned}
\varphi_2 &= l^{-1/2}\left[16.78(x/l)(1 - x/l)(1/2 - x/l) + 71.85(x/l)^2(1 - x/l)^2(1/2 - x/l)\right] \\
\varphi_3 &= l^{-1/2}\left[28.65(x/l)(1 - x/l) - 132.7(x/l)^2(1 - x/l)^2\right] \\
\varphi_4 &= l^{-1/2}\left[98.99(x/l)(1 - x/l)(1/2 - x/l) - 572.3(x/l)^2(1 - x/l)^2(1/2 - x/l)\right]
\end{aligned}
\]

1.3.1 Proof that the Roots of the Secular Equation are Real

In this section we will prove that the roots of the polynomial in W defined by equation (1.6) are in fact real. In order to show this, we must first review some basic linear algebra.

Let V be a vector space over C. By an inner product on V (sometimes called the Hermitian inner product), we mean a mapping 〈· , ·〉 : V × V → C such that for all u, v, w ∈ V and a, b ∈ C we have

(IP1) 〈au + bv, w〉 = a∗〈u, w〉 + b∗〈v, w〉 ;
(IP2) 〈u, v〉 = 〈v, u〉∗ ;
(IP3) 〈u, u〉 ≥ 0 and 〈u, u〉 = 0 if and only if u = 0 .

If {ei} is a basis for V, then in terms of components we have

\[ \langle u, v\rangle = \sum_{i,j} u_i^*v_j\langle e_i, e_j\rangle := \sum_{i,j} u_i^*v_jg_{ij} \]

where we have defined the (square) matrix G = (gij) with gij = 〈ei, ej〉. As a matrix product, we may write

\[ \langle u, v\rangle = u^{*T}Gv . \]

I emphasize that this is the most general inner product on V, and any inner product can be written in this form. (For example, if V is a real space and gij = 〈ei, ej〉 = δij, then we obtain the usual Euclidean inner product on V.) Notice that

\[ g_{ij} = \langle e_i, e_j\rangle = \langle e_j, e_i\rangle^* = g_{ji}^* \]

and hence G = G† so that G is in fact a Hermitian matrix. (Some of you may realize that in the case where V is a real vector space, the matrix G is just the usual metric on V.)

Now, given an inner product, we may define a norm on V by ‖u‖ = 〈u, u〉^{1/2}. Note that because of condition (IP3), we have ‖u‖ ≥ 0 and ‖u‖ = 0 if and only if u = 0. This imposes a condition on G because

\[ \|u\|^2 = \langle u, u\rangle = u^{*T}Gu = \sum_{i,j} u_i^*u_jg_{ij} \ge 0 \]

and equality holds if and only if u = 0. A Hermitian matrix G with the property that $u^{*T}Gu > 0$ for all u ≠ 0 is said to be positive definite.

It is important to realize that conversely, given a positive definite Hermitian matrix G, we can define an inner product by $\langle u, v\rangle = u^{*T}Gv$. That this is true follows easily by reversing the above steps.

Another fundamental concept is that of the kernel of a linear transformation (or matrix). If T is a linear transformation, we define the kernel of T to be the set

Ker T = {u ∈ V : Tu = 0} .

A linear transformation whose kernel is {0} is said to be nonsingular.

The reason the kernel is so useful is that it allows us to determine whether or not a linear transformation is an isomorphism (i.e., one-to-one). A linear transformation T on V is said to be one-to-one if u ≠ v implies Tu ≠ Tv. An equivalent way to say this is that Tu = Tv implies u = v (this is the contrapositive statement). Thus, if Tu = Tv, then using the linearity of T we see that 0 = Tu − Tv = T(u − v) and hence u − v ∈ Ker T. But if Ker T = {0}, then we in fact have u = v so that T is an isomorphism. Conversely, if T is an isomorphism, then we must have Ker T = {0}. This is because T is one-to-one, and any linear transformation has the property that T0 = 0. (Because Tu = T(u + 0) = Tu + T0 so that T0 = 0.)

Now suppose that $T$ is a nonsingular surjective (i.e., onto) linear transformation on $V$. Such a $T$ is said to be a bijection. You should already know that the matrix representation $A = (a_{ij})$ of $T$ with respect to the basis $\{e_i\}$ for $V$ is defined by

\[ T e_i = \sum_j e_j a_{ji} . \]

This is frequently written as $A = [T]_e$. Then the fact that $T$ is a bijection simply means that the matrix $A$ is invertible (i.e., that $A^{-1}$ exists).

(Actually, if $T : U \to V$ is a nonsingular (one-to-one) linear transformation between two finite-dimensional vector spaces of equal dimensions, then it is automatically surjective. This is a consequence of the well-known rank theorem which says

\[ \operatorname{rank} T + \dim \operatorname{Ker} T = \dim U \]

where $\operatorname{rank} T$ is another term for the dimension of the image of $T$. Therefore, if $\operatorname{Ker} T = \{0\}$ we have $\dim \operatorname{Ker} T = 0$ so that $\operatorname{rank} T = \dim U = \dim V$. The proof of the rank theorem is also not hard: Let $\dim U = n$, and let $\{w_1, \dots, w_k\}$ be a basis for $\operatorname{Ker} T$. Extend this to a basis $\{w_1, \dots, w_n\}$ for $U$. Then $\operatorname{Im} T$ is spanned by $\{Tw_{k+1}, \dots, Tw_n\}$, and it is easy to see that these are linearly independent. Thus $\dim U = n = k + (n - k) = \dim \operatorname{Ker} T + \dim \operatorname{Im} T$.)

Note that if $G$ is positive definite, then we must have $\operatorname{Ker} G = \{0\}$. This is because if $\mathbf{u} \ne 0$ and $G\mathbf{u} = 0$, we would have $\langle\mathbf{u},\mathbf{u}\rangle = \mathbf{u}^{*T} G\,\mathbf{u} = 0$ in contradiction to the assumed positive definiteness of $G$. Thus a positive definite matrix is necessarily nonsingular.

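Let us take a more careful look at $S_{ij} = \langle f_i|f_j\rangle$. I claim that the matrix $S = (S_{ij})$ is positive definite. To show this, I will prove a general result. Suppose I have $n$ linearly independent (complex) vectors $\mathbf{v}_1, \dots, \mathbf{v}_n$, and I construct the nonsingular matrix $M$ whose columns are just the vectors $\mathbf{v}_i$. Letting $v_{ij}$ denote the $j$th component of the vector $\mathbf{v}_i$, we have

\[ M = \begin{pmatrix} v_{11} & v_{21} & \cdots & v_{n1} \\ v_{12} & v_{22} & \cdots & v_{n2} \\ \vdots & \vdots & & \vdots \\ v_{1n} & v_{2n} & \cdots & v_{nn} \end{pmatrix} . \]

From this we see that

\[ M^\dagger = \begin{pmatrix} v_{11}^* & v_{12}^* & \cdots & v_{1n}^* \\ v_{21}^* & v_{22}^* & \cdots & v_{2n}^* \\ \vdots & \vdots & & \vdots \\ v_{n1}^* & v_{n2}^* & \cdots & v_{nn}^* \end{pmatrix} \]

and therefore

\[ M^\dagger M = \begin{pmatrix} \langle\mathbf{v}_1|\mathbf{v}_1\rangle & \langle\mathbf{v}_1|\mathbf{v}_2\rangle & \cdots & \langle\mathbf{v}_1|\mathbf{v}_n\rangle \\ \langle\mathbf{v}_2|\mathbf{v}_1\rangle & \langle\mathbf{v}_2|\mathbf{v}_2\rangle & \cdots & \langle\mathbf{v}_2|\mathbf{v}_n\rangle \\ \vdots & \vdots & & \vdots \\ \langle\mathbf{v}_n|\mathbf{v}_1\rangle & \langle\mathbf{v}_n|\mathbf{v}_2\rangle & \cdots & \langle\mathbf{v}_n|\mathbf{v}_n\rangle \end{pmatrix} . \tag{1.10} \]

A matrix of this form is called a Gram matrix.

If I denote the Hermitian matrix $M^\dagger M$ by $S$, then for any vector $c \ne 0$ we have

\[ \langle c|Sc\rangle = \langle c|M^\dagger M c\rangle = \langle Mc|Mc\rangle = \|Mc\|^2 > 0 \]

so that $S$ is positive definite. That this is strictly greater than zero (and not greater than or equal to zero) follows from the fact that $M$ is nonsingular so its kernel is $\{0\}$, together with the assumption that $c \ne 0$. In other words, any matrix of the form (1.10) is positive definite.

But this is exactly what we had when we defined $S_{ij} = \langle f_i|f_j\rangle = \langle i|j\rangle$, where the linearly independent functions $f_i$ define a basis for a vector space. In other words, what we really have is $f_i = \mathbf{v}_i$ so that the matrix $M^\dagger M$ defined above is exactly the matrix $S$ defined by $S_{ij} = \langle i|j\rangle$.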
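The Gram matrix argument can be checked numerically. The following is a minimal sketch, assuming NumPy is available and using randomly generated complex vectors (the vectors, the seed, and the names `quad_form`, `norm_sq`, `min_eig` are illustrative choices, not part of the notes). It builds $S = M^\dagger M$ and compares $\langle c|Sc\rangle$ with $\|Mc\|^2$.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily
n = 4

# Columns of M play the role of the vectors v_1, ..., v_n.
M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
full_rank = np.linalg.matrix_rank(M) == n  # columns are linearly independent

# Gram matrix S_ij = <v_i|v_j>
S = M.conj().T @ M
is_hermitian = np.allclose(S, S.conj().T)

def quad_form(S, c):
    """Return <c|Sc>; np.vdot conjugates its first argument."""
    return np.vdot(c, S @ c)

c = rng.normal(size=n) + 1j * rng.normal(size=n)  # an arbitrary nonzero vector
q = quad_form(S, c)
norm_sq = np.linalg.norm(M @ c) ** 2

# Smallest eigenvalue of the Hermitian matrix S; positive for a positive definite S.
min_eig = np.linalg.eigvalsh(S).min()
```

Here `q` should be real and equal to `norm_sq` up to rounding, and `min_eig` should be strictly positive, which is the numerical counterpart of the statement that any matrix of the form (1.10) is positive definite.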

With all of this formalism out of the way, it is now easy to show that the roots of the secular equation are real. Let us write equation (1.5) in matrix form as

\[ Hc = WSc \]

so that

\[ \langle c|Hc\rangle = W\langle c|Sc\rangle . \]

On the other hand, using the fact that $H$ is Hermitian and $S$ is real and symmetric, we can write

\[ \langle c|Hc\rangle = \langle Hc|c\rangle = \langle WSc|c\rangle = W^*\langle Sc|c\rangle = W^*\langle c|Sc\rangle . \]

Thus we have

\[ (W - W^*)\langle c|Sc\rangle = 0 \]

which implies $W = W^*$ because $c \ne 0$ so that $\langle c|Sc\rangle > 0$.

Note that this proof is also valid in the case where $\varphi$ is complex because (1.5) still holds, and $S = M^\dagger M$ is Hermitian so that $\langle Sc|c\rangle = \langle c|Sc\rangle$.
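As a numerical illustration of this result, the sketch below solves $Hc = WSc$ for a randomly generated Hermitian $H$ and a Gram-type overlap matrix $S$. The matrices are hypothetical test data, and rewriting the problem as an ordinary eigenvalue problem for $S^{-1}H$ is just one convenient way to do it with NumPy, not a method taken from the notes.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed, chosen arbitrarily
n = 4

# A random Hermitian matrix standing in for H_ij
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
H = (A + A.conj().T) / 2

# A positive definite overlap matrix of the form S = M^dagger M
M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
S = M.conj().T @ M

# H c = W S c is equivalent to (S^{-1} H) c = W c since S is nonsingular.
W = np.linalg.eigvals(np.linalg.solve(S, H))

# The imaginary parts should vanish up to rounding error.
max_imag = np.max(np.abs(W.imag))
scale = max(1.0, np.max(np.abs(W)))
```

Even though $S^{-1}H$ is not itself Hermitian, `max_imag` comes out at the level of rounding error relative to `scale`, as the proof requires.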


2 Time-Independent Perturbation Theory

2.1 Perturbation Theory for a Nondegenerate Energy Level

Suppose that we want to solve the time-independent Schrodinger equation $H\psi_n = E_n\psi_n$, but the Hamiltonian is too complicated for us to find an exact solution. However, let us suppose that the Hamiltonian can be written in the form

\[ H = H_0 + \lambda H' \]

where we know the exact solutions to $H_0\psi_n^{(0)} = E_n^{(0)}\psi_n^{(0)}$. (We will use a superscript 0 to denote the energies and eigenstates of the unperturbed Hamiltonian $H_0$.) The additional term $H'$ is called a perturbation, and it must in some sense be considered small relative to $H_0$. The dimensionless parameter $\lambda$ is redundant, but is introduced for mathematical convenience; it will not remain a part of our final solution. For example, the unperturbed Hamiltonian $H_0$ could be the (free) hydrogen atom, and the perturbation $H'$ could represent the interaction energy $e\mathbf{E}\cdot\mathbf{r}$ of the electron with an electric field $\mathbf{E}$. (This leads to an energy level shift called the Stark effect.)

The full (i.e., interacting or perturbed) Schrodinger equation is written

\[ H\psi_n = (H_0 + \lambda H')\psi_n = E_n\psi_n \tag{2.1} \]

and the unperturbed equation is

\[ H_0\psi_n^{(0)} = E_n^{(0)}\psi_n^{(0)} . \tag{2.2} \]

We think of the parameter $\lambda$ as varying from 0 to 1, and taking the system smoothly from the unperturbed system described by $H_0$ to the fully interacting system described by $H$. And as long as we are discussing nondegenerate states, we can think of each unperturbed state $\psi_n^{(0)}$ as undergoing a smooth transition to the exact state $\psi_n$. In other words,

\[ \lim_{\lambda\to 0}\psi_n = \psi_n^{(0)} \qquad\text{and}\qquad \lim_{\lambda\to 0}E_n = E_n^{(0)} . \]

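Since the states $\psi_n = \psi_n(\lambda,\mathbf{x})$ and energies $E_n = E_n(\lambda)$ depend on $\lambda$, let us expand both in a Taylor series about $\lambda = 0$:

\[ \psi_n = \psi_n^{(0)} + \lambda\left(\frac{\partial\psi_n}{\partial\lambda}\right)_{\lambda=0} + \frac{\lambda^2}{2!}\left(\frac{\partial^2\psi_n}{\partial\lambda^2}\right)_{\lambda=0} + \cdots \]

\[ E_n = E_n^{(0)} + \lambda\left(\frac{dE_n}{d\lambda}\right)_{\lambda=0} + \frac{\lambda^2}{2!}\left(\frac{d^2E_n}{d\lambda^2}\right)_{\lambda=0} + \cdots . \]

Now introduce the notation

\[ \psi_n^{(k)} = \frac{1}{k!}\left.\frac{\partial^k\psi_n}{\partial\lambda^k}\right|_{\lambda=0} \qquad E_n^{(k)} = \frac{1}{k!}\left.\frac{d^kE_n}{d\lambda^k}\right|_{\lambda=0} \]

so we can write

\[ \psi_n = \psi_n^{(0)} + \lambda\psi_n^{(1)} + \lambda^2\psi_n^{(2)} + \cdots \tag{2.3a} \]

\[ E_n = E_n^{(0)} + \lambda E_n^{(1)} + \lambda^2 E_n^{(2)} + \cdots . \tag{2.3b} \]

For each $k = 1, 2, \dots$ we call $\psi_n^{(k)}$ and $E_n^{(k)}$ the $k$th-order correction to the wavefunction and energy. We assume that the series converges for $\lambda = 1$, and that the first few terms give a good approximation to the exact solutions.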

It will be convenient to simplify some of our notation, so integrals such as $\langle\psi_n^{(j)}|\psi_n^{(k)}\rangle$ will simply be written $\langle n^{(j)}|n^{(k)}\rangle$. We assume that the unperturbed states are orthonormal so that

\[ \langle m^{(0)}|n^{(0)}\rangle = \delta_{mn} \]

and we also choose our normalization so that

\[ \langle n^{(0)}|n\rangle = 1 . \tag{2.4} \]

If this last condition on $\psi_n$ isn’t satisfied, then multiplying $\psi_n$ by $\langle n^{(0)}|n\rangle^{-1}$ will ensure that it is. Since multiplying the Schrodinger equation $H\psi_n = E_n\psi_n$ by a constant doesn’t change $E_n$, this has no effect on the energy levels. If so desired, at the end of the calculation we can always re-normalize $\psi_n$ in the usual way.

Substituting (2.3a) into (2.4) yields

\[ 1 = \langle n^{(0)}|n^{(0)}\rangle + \lambda\langle n^{(0)}|n^{(1)}\rangle + \lambda^2\langle n^{(0)}|n^{(2)}\rangle + \cdots . \]

Now, it is a general result that if you have a power series equation of the form $\sum_{n=0}^{\infty} a_n x^n = 0$ for all $x$, then $a_n = 0$ for all $n$. That $a_0 = 0$ follows by letting $x = 0$. Now take the derivative with respect to $x$ and let $x = 0$ to obtain $a_1 = 0$. Taking the derivative again and letting $x = 0$ yields $a_2 = 0$. Clearly we can continue this procedure to arrive at $a_n = 0$ for all $n$. Applying this result to the above power series in $\lambda$ and using the fact that $\langle n^{(0)}|n^{(0)}\rangle = 1$ we conclude that

\[ \langle n^{(0)}|n^{(k)}\rangle = 0 \quad\text{for all } k = 1, 2, \dots . \tag{2.5} \]

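We now substitute equations (2.3) into the Schrodinger equation (2.1):

\[ (H_0 + \lambda H')(\psi_n^{(0)} + \lambda\psi_n^{(1)} + \lambda^2\psi_n^{(2)} + \cdots) = (E_n^{(0)} + \lambda E_n^{(1)} + \lambda^2 E_n^{(2)} + \cdots)(\psi_n^{(0)} + \lambda\psi_n^{(1)} + \lambda^2\psi_n^{(2)} + \cdots) \]

or, grouping powers of $\lambda$,

\[ H_0\psi_n^{(0)} + \lambda(H_0\psi_n^{(1)} + H'\psi_n^{(0)}) + \lambda^2(H_0\psi_n^{(2)} + H'\psi_n^{(1)}) + \cdots \]
\[ \qquad = E_n^{(0)}\psi_n^{(0)} + \lambda(E_n^{(0)}\psi_n^{(1)} + E_n^{(1)}\psi_n^{(0)}) + \lambda^2(E_n^{(0)}\psi_n^{(2)} + E_n^{(1)}\psi_n^{(1)} + E_n^{(2)}\psi_n^{(0)}) + \cdots . \]

Again ignoring questions of convergence, we can equate powers of $\lambda$ on both sides of this equation. For $\lambda^0$ we simply have

\[ H_0\psi_n^{(0)} = E_n^{(0)}\psi_n^{(0)} \tag{2.6a} \]

which doesn’t tell us anything new. For $\lambda^1$ we have

\[ H_0\psi_n^{(1)} + H'\psi_n^{(0)} = E_n^{(0)}\psi_n^{(1)} + E_n^{(1)}\psi_n^{(0)} \]

or

\[ (H_0 - E_n^{(0)})\psi_n^{(1)} = (E_n^{(1)} - H')\psi_n^{(0)} . \tag{2.6b} \]

For $\lambda^2$ we have

\[ H_0\psi_n^{(2)} + H'\psi_n^{(1)} = E_n^{(0)}\psi_n^{(2)} + E_n^{(1)}\psi_n^{(1)} + E_n^{(2)}\psi_n^{(0)} \]

or

\[ (H_0 - E_n^{(0)})\psi_n^{(2)} = (E_n^{(1)} - H')\psi_n^{(1)} + E_n^{(2)}\psi_n^{(0)} . \tag{2.6c} \]

And in general we have for $k \ge 1$

\[ (H_0 - E_n^{(0)})\psi_n^{(k)} = (E_n^{(1)} - H')\psi_n^{(k-1)} + E_n^{(2)}\psi_n^{(k-2)} + \cdots + E_n^{(k)}\psi_n^{(0)} . \tag{2.6d} \]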

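Notice that at each step along the way, $\psi_n^{(k)}$ is determined by $\psi_n^{(k-1)}, \psi_n^{(k-2)}, \dots, \psi_n^{(0)}$. We can also add an arbitrary multiple of $\psi_n^{(0)}$ to each $\psi_n^{(k)}$ without affecting the left side of these equations. Hence we can choose this multiple so that $\langle n^{(0)}|n^{(k)}\rangle = 0$ for $k \ge 1$, which is the same result as we had in (2.5).

Now using the hermiticity of $H_0$ and the fact that $E_n^{(0)}$ is real, we have

\[ \langle n^{(0)}|H_0 n^{(k)}\rangle = \langle H_0 n^{(0)}|n^{(k)}\rangle = E_n^{(0)}\langle n^{(0)}|n^{(k)}\rangle = 0 \quad\text{for } k \ge 1 . \]

Then multiplying (2.6d) from the left by $\psi_n^{(0)*}$ and integrating, we see that the left-hand side vanishes. On the right-hand side, every term $E_n^{(j)}\langle n^{(0)}|n^{(k-j)}\rangle$ with $1 \le j < k$ also vanishes by (2.5), and we are left with (since $\langle n^{(0)}|n^{(0)}\rangle = 1$)

\[ 0 = -\langle n^{(0)}|H' n^{(k-1)}\rangle + E_n^{(k)} \]

or

\[ E_n^{(k)} = \langle n^{(0)}|H' n^{(k-1)}\rangle \quad\text{for } k \ge 1 . \tag{2.7} \]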

In particular, we have the extremely important result for the first-order energy correction to the $n$th state

\[ E_n^{(1)} = \langle n^{(0)}|H' n^{(0)}\rangle = \int \psi_n^{(0)*} H' \psi_n^{(0)}\,dx . \tag{2.8} \]

Letting $\lambda = 1$ in (2.3b), we see that to first order, the energy of the $n$th state is given by

\[ E_n \approx E_n^{(0)} + E_n^{(1)} = E_n^{(0)} + \int \psi_n^{(0)*} H' \psi_n^{(0)}\,dx . \]

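Example 2.1. Let the unperturbed system be the free harmonic oscillator, with ground-state wavefunction

\[ \psi_0^{(0)} = \left(\frac{m\omega}{\pi\hbar}\right)^{1/4} e^{-m\omega x^2/2\hbar} \]

and energy levels

\[ E_n^{(0)} = \left(n + \frac{1}{2}\right)\hbar\omega . \]

Now consider the anharmonic oscillator with Hamiltonian

\[ H = H_0 + H' := H_0 + ax^3 + bx^4 . \]

The first-order energy correction to the ground state is given by

\[ E_0^{(1)} = \langle 0^{(0)}|H' 0^{(0)}\rangle = \left(\frac{m\omega}{\pi\hbar}\right)^{1/2}\int_{-\infty}^{\infty} e^{-m\omega x^2/\hbar}(ax^3 + bx^4)\,dx . \]

However, the integral over $x^3$ vanishes by symmetry (the integral of an odd function over a symmetric interval), and, writing $\alpha = m\omega/\hbar$, we are left with

\[ E_0^{(1)} = b\left(\frac{m\omega}{\pi\hbar}\right)^{1/2}\int_{-\infty}^{\infty} x^4 e^{-m\omega x^2/\hbar}\,dx = b\left(\frac{\alpha}{\pi}\right)^{1/2}\int_{-\infty}^{\infty} x^4 e^{-\alpha x^2}\,dx \]
\[ \qquad = b\left(\frac{\alpha}{\pi}\right)^{1/2}\frac{\partial^2}{\partial\alpha^2}\int_{-\infty}^{\infty} e^{-\alpha x^2}\,dx = b\left(\frac{\alpha}{\pi}\right)^{1/2}\frac{\partial^2}{\partial\alpha^2}\left(\frac{\pi}{\alpha}\right)^{1/2} = \frac{3b}{4\alpha^2} = \frac{3b}{4}\frac{\hbar^2}{m^2\omega^2} . \]

Thus, to first order, the ground state energy of the anharmonic oscillator is given by

\[ E_0 \approx E_0^{(0)} + E_0^{(1)} = \frac{1}{2}\hbar\omega + \frac{3b}{4}\frac{\hbar^2}{m^2\omega^2} . \]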
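The closed-form result of Example 2.1 can be cross-checked by evaluating the integral (2.8) numerically. The sketch below is only an illustration: it assumes units with $\hbar = m = \omega = 1$, picks arbitrary values for the couplings `a` and `b`, and approximates the integral by a simple Riemann sum on a wide symmetric grid.

```python
import numpy as np

# Assumed units: hbar = m = omega = 1 (chosen for convenience only)
hbar = m = omega = 1.0
a, b = 0.3, 0.1  # hypothetical anharmonic couplings

# Symmetric grid wide enough that the Gaussian is negligible at the ends
x = np.linspace(-10.0, 10.0, 200001)
dx = x[1] - x[0]

# Unperturbed ground-state wavefunction of the harmonic oscillator
psi0 = (m * omega / (np.pi * hbar)) ** 0.25 * np.exp(-m * omega * x**2 / (2 * hbar))

# First-order correction: integral of psi0^* H' psi0 with H' = a x^3 + b x^4
E1_numeric = np.sum(psi0**2 * (a * x**3 + b * x**4)) * dx

# Closed form from the example: 3 b hbar^2 / (4 m^2 omega^2)
E1_formula = 0.75 * b * hbar**2 / (m**2 * omega**2)
```

The two values agree to many digits, and changing `a` has no effect on `E1_numeric` (up to rounding), reflecting the fact that the $x^3$ term integrates to zero by symmetry.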

Now let’s find the first-order correction to the wavefunction. Since the unperturbed states $\psi_n^{(0)}$ form a complete orthonormal set, we may expand $\psi_n^{(1)}$ in terms of them as

\[ \psi_n^{(1)} = \sum_m a_m\psi_m^{(0)} \]

where

\[ a_m = \langle m^{(0)}|n^{(1)}\rangle . \]

(It would be way too cluttered to try and label these expansion coefficients to denote the fact that they also refer to the first-order correction of the $n$th state.) Then for $m \ne n$, we multiply (2.6b) from the left by $\psi_m^{(0)*}$ and integrate:

\[ \langle m^{(0)}|(H_0 - E_n^{(0)}) n^{(1)}\rangle = E_n^{(1)}\langle m^{(0)}|n^{(0)}\rangle - \langle m^{(0)}|H' n^{(0)}\rangle \]

or (since $H_0\psi_m^{(0)} = E_m^{(0)}\psi_m^{(0)}$ and $\langle m^{(0)}|n^{(0)}\rangle = 0$ for $m \ne n$)

\[ (E_m^{(0)} - E_n^{(0)})\langle m^{(0)}|n^{(1)}\rangle = -\langle m^{(0)}|H' n^{(0)}\rangle . \]

Therefore

\[ a_m = \langle m^{(0)}|n^{(1)}\rangle = \frac{\langle m^{(0)}|H' n^{(0)}\rangle}{E_n^{(0)} - E_m^{(0)}} \quad\text{for } m \ne n . \tag{2.9} \]

You should realize that this last step was where the assumed nondegeneracy of the states came in. In order for us to divide by $E_n^{(0)} - E_m^{(0)}$, we must assume that it is nonzero. This is true as long as $m \ne n$ implies that $E_n^{(0)} \ne E_m^{(0)}$. Since $a_n = \langle n^{(0)}|n^{(1)}\rangle = 0$ (this is equation (2.5)), we finally obtain

\[ \psi_n^{(1)} = \sum_{m\ne n}\frac{\langle m^{(0)}|H' n^{(0)}\rangle}{E_n^{(0)} - E_m^{(0)}}\,\psi_m^{(0)} . \tag{2.10} \]

Now that we have the first-order correction to the wavefunction, it is easy to get the second-order correction to the energy. Using (2.10) in (2.7) with $k = 2$ we immediately have

\[ E_n^{(2)} = \sum_{m\ne n}\frac{\langle m^{(0)}|H' n^{(0)}\rangle\langle n^{(0)}|H' m^{(0)}\rangle}{E_n^{(0)} - E_m^{(0)}} = \sum_{m\ne n}\frac{\bigl|\langle n^{(0)}|H' m^{(0)}\rangle\bigr|^2}{E_n^{(0)} - E_m^{(0)}} . \tag{2.11} \]
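To see how well (2.8) and (2.11) work in practice, one can compare them with exact diagonalization of a small matrix Hamiltonian. The three-level model below is entirely hypothetical (the energies in `E0` and the matrix `Hp` are made-up numbers with $\lambda = 1$ already absorbed); it is a sketch of the bookkeeping, not a system discussed in the notes.

```python
import numpy as np

# Hypothetical nondegenerate unperturbed energies and a small Hermitian perturbation
E0 = np.array([0.0, 1.0, 2.5])
H0 = np.diag(E0)
Hp = 0.05 * np.array([[1.0, 2.0, 0.5],
                      [2.0, -1.0, 1.0],
                      [0.5, 1.0, 0.3]])

def energy_through_second_order(E0, Hp, n):
    """E_n^(0) + E_n^(1) + E_n^(2), using (2.8) and (2.11) in the unperturbed basis."""
    E1 = Hp[n, n]
    E2 = sum(abs(Hp[m, n]) ** 2 / (E0[n] - E0[m])
             for m in range(len(E0)) if m != n)
    return E0[n] + E1 + E2

approx = np.array([energy_through_second_order(E0, Hp, n) for n in range(3)])
exact = np.linalg.eigvalsh(H0 + Hp)  # exact eigenvalues, in ascending order
max_error = np.max(np.abs(exact - approx))
```

Because the perturbation is small compared with the level spacings, the ordering of the levels is preserved and `max_error` is of the size expected for the neglected third-order terms.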

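The last term we will compute is the second-order correction to the wavefunction. We again expand in terms of the $\psi_n^{(0)}$ as

\[ \psi_n^{(2)} = \sum_m b_m\psi_m^{(0)} \]

where $b_m = \langle m^{(0)}|n^{(2)}\rangle$. Multiplying (2.6c) from the left by $\psi_m^{(0)*}$ and integrating we have (assuming $m \ne n$)

\[ (E_m^{(0)} - E_n^{(0)})\langle m^{(0)}|n^{(2)}\rangle = E_n^{(1)}\langle m^{(0)}|n^{(1)}\rangle - \langle m^{(0)}|H' n^{(1)}\rangle \]

or

\[ b_m = \langle m^{(0)}|n^{(2)}\rangle = \frac{E_n^{(1)}}{E_m^{(0)} - E_n^{(0)}}\langle m^{(0)}|n^{(1)}\rangle - \frac{\langle m^{(0)}|H' n^{(1)}\rangle}{E_m^{(0)} - E_n^{(0)}} . \]

Now use (2.9) in the first term on the right-hand side and use (2.10) in the second term to write

\[ b_m = -\frac{E_n^{(1)}\langle m^{(0)}|H' n^{(0)}\rangle}{(E_n^{(0)} - E_m^{(0)})^2} - \sum_{k\ne n}\frac{\langle m^{(0)}|H' k^{(0)}\rangle\langle k^{(0)}|H' n^{(0)}\rangle}{(E_m^{(0)} - E_n^{(0)})(E_n^{(0)} - E_k^{(0)})} . \]

Using (2.8) we finally obtain

\[ \psi_n^{(2)} = \sum_{m\ne n}\sum_{k\ne n}\frac{\langle m^{(0)}|H' k^{(0)}\rangle\langle k^{(0)}|H' n^{(0)}\rangle}{(E_n^{(0)} - E_m^{(0)})(E_n^{(0)} - E_k^{(0)})}\,\psi_m^{(0)} - \sum_{m\ne n}\frac{\langle m^{(0)}|H' n^{(0)}\rangle\langle n^{(0)}|H' n^{(0)}\rangle}{(E_n^{(0)} - E_m^{(0)})^2}\,\psi_m^{(0)} . \tag{2.12} \]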

Let me make several points. First, recall that because of equation (2.4), our states are not normalized. Second, be sure to realize that the sums in equations (2.10), (2.11) and (2.12) are over states, and not energy levels. If some of the energy levels other than the $n$th are degenerate, then we must include a term in each of these sums for each linearly independent wavefunction corresponding to the degenerate energy level. The reason for this is that the expansions of $\psi_n^{(1)}$ and $\psi_n^{(2)}$ were in terms of a complete set of functions, and hence we must be sure to include all linearly independent states in the sums. Furthermore, if there happens to be a continuum of states in the unperturbed system, then we must also include an integral over these so that we have included all linearly independent states in our expansion.

2.2 Perturbation Theory for a Degenerate Energy Level

We now turn to the perturbation treatment of a degenerate energy level, meaning that there are multiple unperturbed states that all have the same energy. If we let $d$ be the degree of degeneracy, then we have states $\psi_1^{(0)}, \dots, \psi_d^{(0)}$ satisfying the unperturbed Schrodinger equation

\[ H_0\psi_n^{(0)} = E_n^{(0)}\psi_n^{(0)} \tag{2.13a} \]

with

\[ E_1^{(0)} = E_2^{(0)} = \cdots = E_d^{(0)} . \tag{2.13b} \]

You must be careful with the notation here, because we don’t want to clutter it up with too many indices. Even though we write $E_1^{(0)}, \dots, E_d^{(0)}$, this does not mean that these are necessarily the $d$ lowest-lying states that satisfy the unperturbed Schrodinger equation. We are referring here to a single degenerate energy level.

The interacting (or perturbed) Schrodinger equation is

Hψn = (H0 + λH ′)ψn = Enψn .

In our treatment of a nondegenerate energy level, we assumed that $\lim_{\lambda\to 0}E_n = E_n^{(0)}$ and $\lim_{\lambda\to 0}\psi_n = \psi_n^{(0)}$ where the state $\psi_n^{(0)}$ was unique. However, in the case of degeneracy, the second of these does not hold. While it is true that as $\lambda$ goes to zero we still have

\[ \lim_{\lambda\to 0}E_n = E_n^{(0)} \]

the presence of the perturbation generally splits the degenerate energy level into multiple distinct levels. However, there are varying degrees of splitting, and while the perturbation may completely remove the degeneracy, it may also only partially remove it or have no effect at all. This is illustrated in the figure below.

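[Figure (image not reproduced): energy levels plotted against λ from 0 to 1. The levels E1, E3 and E5 are unsplit; the degenerate level E2abc splits into E2ab and E2c, and the degenerate level E4abc splits into E4a, E4b and E4c.]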

Figure 3: Splitting of energy levels due to a perturbation.

The important point to realize here is that in the limit $\lambda \to 0$, the state $\psi_n$ does not necessarily go to a unique $\psi_n^{(0)}$, but rather only to some linear combination of the normalized degenerate states $\psi_1^{(0)}, \dots, \psi_d^{(0)}$. This is because any such linear combination

\[ c_1\psi_1^{(0)} + c_2\psi_2^{(0)} + \cdots + c_d\psi_d^{(0)} \]

will satisfy (2.13a) with the same eigenvalue $E_n^{(0)}$. Thus there are an infinite number of such linear combinations made up of these $d$ linearly independent normalized eigenfunctions, and any of them will work as the unperturbed state.

For example, recall that the hydrogen atom states are labeled $\psi_{nlm}$ where the energy is independent of $m$ (for hydrogen it depends only on $n$), and the factor $e^{im\phi}$ makes the wave function complex for $m \ne 0$. The 2p states correspond to $n = 2$ and $l = 1$, and the ones with $m \ne 0$ are the wave functions $2p_1$ and $2p_{-1}$. However, instead of these complex wave functions, we can take the real linear combinations defined by

\[ \psi_{2p_x} = \frac{1}{\sqrt{2}}(\psi_{2p_1} + \psi_{2p_{-1}}) \]

and

\[ \psi_{2p_y} = \frac{1}{i\sqrt{2}}(\psi_{2p_1} - \psi_{2p_{-1}}) \]

which have the same energies. For most purposes in chemistry, these real wave functions are much more convenient to work with. And while the $2p_0$, $2p_1$ and $2p_{-1}$ states are degenerate, the presence of an electric or magnetic field will split the degeneracy because the interaction term in the Hamiltonian depends on the $m$ value of the state.

Returning to our problem, all we can say is that

\[ \lim_{\lambda\to 0}\psi_n = \sum_{i=1}^{d} c_i\psi_i^{(0)} , \quad 1 \le n \le d . \]

Hence the first thing we must do is determine the correct zeroth-order wave functions, which we denote by $\phi_n^{(0)}$. In other words,

\[ \phi_n^{(0)} := \lim_{\lambda\to 0}\psi_n = \sum_{i=1}^{d} c_i\psi_i^{(0)} , \quad 1 \le n \le d \tag{2.14} \]

where each $\phi_n^{(0)}$ has a different set of coefficients $c_i$. (These should be labeled $c_i^{(n)}$, but I’m trying to keep it simple.) Note that since $H_0\psi_i^{(0)} = E_d^{(0)}\psi_i^{(0)}$ for each $i = 1, \dots, d$ it follows that

\[ H_0\phi_n^{(0)} = E_d^{(0)}\phi_n^{(0)} . \tag{2.15} \]

For the $d$-fold degenerate case, we proceed as in the nondegenerate case, except that now we use $\phi_n^{(0)}$ instead of $\psi_n^{(0)}$ for the zeroth-order wave function. Then equations (2.3) become

\[ \psi_n = \phi_n^{(0)} + \lambda\psi_n^{(1)} + \lambda^2\psi_n^{(2)} + \cdots \tag{2.16a} \]

\[ E_n = E_d^{(0)} + \lambda E_n^{(1)} + \lambda^2 E_n^{(2)} + \cdots \tag{2.16b} \]

where we have used (2.13b). Equations (2.16) apply for each $n = 1, \dots, d$. As in the nondegenerate case, we substitute these into the Schrodinger equation $H\psi_n = E_n\psi_n$ and equate powers of $\lambda$. This is exactly the same as we had before, except that now we have $\phi_n^{(0)}$ instead of $\psi_n^{(0)}$, so we can immediately write down the results from equations (2.6).

Equating the coefficients of $\lambda^0$ we have $H_0\phi_n^{(0)} = E_d^{(0)}\phi_n^{(0)}$. Since for each $n = 1, \dots, d$ the linear combination $\phi_n^{(0)}$ is an eigenstate of $H_0$ with eigenvalue $E_d^{(0)}$ (this is just the statement of equation (2.15)), this doesn’t give us any new information. From the coefficients of $\lambda^1$ we have (for each $n = 1, \dots, d$)

\[ (H_0 - E_d^{(0)})\psi_n^{(1)} = (E_n^{(1)} - H')\phi_n^{(0)} . \tag{2.17} \]

Multiplying this from the left by $\phi_n^{(0)*}$ and integrating we have (here I’m not using $n^{(0)}$ as a shorthand for $\psi_n^{(0)}$ to make sure there is no confusion with $\phi_n^{(0)}$)

\[ \langle\phi_n^{(0)}|H_0\psi_n^{(1)}\rangle - E_d^{(0)}\langle\phi_n^{(0)}|\psi_n^{(1)}\rangle = E_n^{(1)}\langle\phi_n^{(0)}|\phi_n^{(0)}\rangle - \langle\phi_n^{(0)}|H'\phi_n^{(0)}\rangle . \]

Using the hermiticity of $H_0$ together with (2.15) we see that the left-hand side of this equation vanishes, so assuming that the correct zeroth-order wave functions are normalized, we arrive at the first-order correction to the energy

\[ E_n^{(1)} = \langle\phi_n^{(0)}|H'\phi_n^{(0)}\rangle . \tag{2.18} \]

This is similar to the nondegenerate result (2.8) except that now we use the correct zeroth-order wave functions. Of course, in order to evaluate these integrals, we must know the functions $\phi_n^{(0)}$ which, so far, we don’t.

So, for any $1 \le m \le d$, we multiply (2.17) from the left by the complex conjugate of one of the $d$-fold degenerate unperturbed wave functions $\psi_m^{(0)}$ and integrate to obtain

\[ \langle\psi_m^{(0)}|H_0\psi_n^{(1)}\rangle - E_d^{(0)}\langle\psi_m^{(0)}|\psi_n^{(1)}\rangle = E_n^{(1)}\langle\psi_m^{(0)}|\phi_n^{(0)}\rangle - \langle\psi_m^{(0)}|H'\phi_n^{(0)}\rangle . \]

Since $H_0\psi_m^{(0)} = E_d^{(0)}\psi_m^{(0)}$, we see that the left-hand side of this equation vanishes, and we are left with

\[ \langle\psi_m^{(0)}|H'\phi_n^{(0)}\rangle - E_n^{(1)}\langle\psi_m^{(0)}|\phi_n^{(0)}\rangle = 0 , \quad m = 1, \dots, d . \]

There is no loss of generality in assuming that the zeroth-order wave functions $\psi_i^{(0)}$ of the degenerate level are orthonormal, so we take

\[ \langle\psi_m^{(0)}|\psi_i^{(0)}\rangle = \delta_{mi} \quad\text{for } m, i = 1, \dots, d . \tag{2.19} \]

(If the zeroth-order wave functions $\psi_i^{(0)}$ aren’t orthonormal, then apply the Gram-Schmidt process to construct an orthonormal set. Since the new orthonormal functions are just linear combinations of the original set, and the correct zeroth-order functions $\phi_n^{(0)}$ are linear combinations of the $\psi_i^{(0)}$, the $\phi_n^{(0)}$ will just be different linear combinations of the new orthonormal functions.) Then substituting the definition (2.14) for $\phi_n^{(0)}$ we have

\[ \sum_{i=1}^{d} c_i\langle\psi_m^{(0)}|H'\psi_i^{(0)}\rangle - E_n^{(1)}\sum_{i=1}^{d} c_i\langle\psi_m^{(0)}|\psi_i^{(0)}\rangle = 0 \]

or

\[ \sum_{i=1}^{d}(H'_{mi} - E_n^{(1)}\delta_{mi})c_i = 0 , \quad m = 1, \dots, d \tag{2.20a} \]

where

\[ H'_{mi} = \langle\psi_m^{(0)}|H'\psi_i^{(0)}\rangle . \]

This is just another homogeneous system of $d$ equations in the $d$ unknowns $c_i$. In fact, if we let $c$ be the vector with components $c_i$, then we can write (2.20a) in matrix form as

\[ H'c = E_n^{(1)}c \tag{2.20b} \]

which shows that this is nothing more than an eigenvalue equation for the matrix $H'$ acting on the $d$-dimensional eigenspace of degenerate wave functions.

As usual, if (2.20a) is to have a nontrivial solution, we must have the secular equation

\[ \det(H'_{mi} - E_n^{(1)}\delta_{mi}) = 0 . \tag{2.21} \]

Written out, this looks like

\[ \begin{vmatrix} H'_{11} - E_n^{(1)} & H'_{12} & \cdots & H'_{1d} \\ H'_{21} & H'_{22} - E_n^{(1)} & \cdots & H'_{2d} \\ \vdots & \vdots & & \vdots \\ H'_{d1} & H'_{d2} & \cdots & H'_{dd} - E_n^{(1)} \end{vmatrix} = 0 . \]

This is a polynomial of degree $d$ in $E_n^{(1)}$, and the $d$ roots $E_1^{(1)}, E_2^{(1)}, \dots, E_d^{(1)}$ are the first-order corrections to the energy of the $d$-fold degenerate unperturbed state. So, we solve (2.21) for the eigenvalues $E_n^{(1)}$, and use these in (2.20b) to solve for the eigenvectors $c$. These then define the correct zeroth-order wave functions according to (2.14).
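In numerical work the secular equation (2.21) is rarely expanded as a polynomial; one simply diagonalizes the matrix $H'_{mi}$. The sketch below does this with NumPy for a hypothetical $3 \times 3$ perturbation matrix (the numbers in `Hp` are invented for illustration and are assumed to be matrix elements in an orthonormal basis of the degenerate level).

```python
import numpy as np

# Hypothetical matrix elements H'_mi within a 3-fold degenerate level
Hp = np.array([[0.0, 1.0, 0.0],
               [1.0, 0.0, 0.0],
               [0.0, 0.0, 2.0]])

# eigh is appropriate because H' is Hermitian. E1[n] are the roots of (2.21),
# and column C[:, n] holds the coefficients c_i of the n-th correct
# zeroth-order function in (2.14).
E1, C = np.linalg.eigh(Hp)

# Each root makes the secular determinant vanish.
dets = [np.linalg.det(Hp - E * np.eye(3)) for E in E1]

# In the basis of correct zeroth-order functions, H' is diagonal.
Hp_in_phi_basis = C.conj().T @ Hp @ C
```

For this matrix the first-order shifts are $-1$, $1$ and $2$, and the first two eigenvectors mix only the first two basis functions while the third is left alone. The columns returned by `eigh` are already normalized.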

Again, note that all we are doing is finding the eigenvalues and eigenvectors of the matrix $H'_{mi}$. And since $H'$ is Hermitian, eigenvectors belonging to distinct eigenvalues are orthogonal. But each eigenvector $c$ has components that are just the expansion coefficients in (2.14), and therefore (reverting to a more complete notation)

\[ \langle\phi_m^{(0)}|H'\phi_n^{(0)}\rangle = \sum_{i,j=1}^{d} c_i^{(m)*}\langle\psi_i^{(0)}|H'\psi_j^{(0)}\rangle c_j^{(n)} = \sum_{i,j=1}^{d} c_i^{(m)*}H'_{ij}c_j^{(n)} = c^{(m)\dagger}H'c^{(n)} = E_n^{(1)}c^{(m)\dagger}c^{(n)} = E_n^{(1)}\langle c^{(m)}|c^{(n)}\rangle \]

or

\[ \langle\phi_m^{(0)}|H'\phi_n^{(0)}\rangle = E_n^{(1)}\delta_{mn} \tag{2.22} \]

where we assume that the eigenvectors are normalized.

In the case where $m = n$, we arrive back at (2.18). What about the case $m \ne n$? Recall that in our treatment of nondegenerate perturbation theory, the reason we had to assume the nondegeneracy was because equations (2.10) and (2.11) would blow up if there were another state $\psi_m^{(0)}$ with the same energy as $\psi_n^{(0)}$. However, in that case, we would be saved if the numerator also went to zero, and that is precisely what happens if we use the correct zeroth-order wave functions. Essentially then, the degenerate case proceeds just like the nondegenerate case, except that we must use the correct zeroth-order wave functions.

Returning to (2.21), if all $d$ roots are distinct, then we have completely split the degeneracy into $d$ distinct levels

\[ E_d^{(0)} + E_1^{(1)} , \; E_d^{(0)} + E_2^{(1)} , \; \dots , \; E_d^{(0)} + E_d^{(1)} . \]

If not all of the roots are distinct, then we have only partly removed the degeneracy (at least to first order). We will assume that all $d$ roots are distinct, and hence that the degeneracy has been completely lifted in first order.

Now that we have the $d$ roots $E_n^{(1)}$, we can take them one-at-a-time and plug back into the system of equations (2.20a) and solve for $c_2, \dots, c_d$ in terms of $c_1$. (Recall that because the determinant of the coefficient matrix of the system (2.20a) is zero, the $d$ equations in (2.20a) are linearly dependent, and hence we can only find $d - 1$ of the unknowns in terms of one of them.) Finally, we fix $c_1$ by normalization, using equations (2.14) and (2.19):

\[ 1 = \langle\phi_n^{(0)}|\phi_n^{(0)}\rangle = \sum_{i,j=1}^{d} c_i^* c_j\langle\psi_i^{(0)}|\psi_j^{(0)}\rangle = \sum_{i,j=1}^{d} c_i^* c_j\delta_{ij} = \sum_{i=1}^{d}|c_i|^2 . \tag{2.23} \]

Also be sure to realize that we obtain a separate set of coefficients $c_i$ for each root $E_n^{(1)}$. This is how we get the $d$ independent zeroth-order wave functions.

Obviously, finding the roots of (2.21) is a difficult problem in general. However, under some special conditions, the problem may be much more tractable. The best situation would be if all off-diagonal elements $H'_{mi}$, $m \ne i$, vanished. Then the determinant is just the product of the diagonal elements, and the $d$ roots are simply $E_m^{(1)} = H'_{mm}$ for $m = 1, \dots, d$ or

\[ E_1^{(1)} = H'_{11} , \; E_2^{(1)} = H'_{22} , \; \dots , \; E_d^{(1)} = H'_{dd} . \]

Let us assume that all $d$ roots are distinct. Taking the root $E_n^{(1)} = E_1^{(1)} = H'_{11}$ as a specific example, (2.20a) becomes the set of $d - 1$ equations

\[ (H'_{22} - E_1^{(1)})c_2 = 0 \]
\[ (H'_{33} - E_1^{(1)})c_3 = 0 \]
\[ \vdots \]
\[ (H'_{dd} - E_1^{(1)})c_d = 0 . \]

Since $E_1^{(1)} = H'_{11} \ne H'_{mm}$ for $m = 2, 3, \dots, d$, it follows that $c_2 = c_3 = \cdots = c_d = 0$. Normalization then implies that $c_1 = 1$, and the corresponding zeroth-order wave function defined by (2.14) is $\phi_1^{(0)} = \psi_1^{(0)}$. Clearly this applies to any of the $d$ roots, so we have

\[ \phi_i^{(0)} = \psi_i^{(0)} , \quad i = 1, \dots, d . \]

Thus we have shown that when the secular determinant is diagonal and the $d$ matrix elements $H'_{mm}$ are all distinct, then the initial wave functions $\psi_i^{(0)}$ are the correct zeroth-order wave functions $\phi_i^{(0)}$.

Another situation that lends itself to a relatively simple solution is when the secular determinant is block diagonal. For example, in the case where $d = 4$ we would have

\[ \begin{vmatrix} H'_{11} - E_n^{(1)} & H'_{12} & 0 & 0 \\ H'_{21} & H'_{22} - E_n^{(1)} & 0 & 0 \\ 0 & 0 & H'_{33} - E_n^{(1)} & H'_{34} \\ 0 & 0 & H'_{43} & H'_{44} - E_n^{(1)} \end{vmatrix} = 0 . \]

This is of the same form as we had in Example 1.3 (except with $S_{ij} = \delta_{ij}$). Exactly the same reasoning we used to show that two of the variation functions were linear combinations of $f_1$ and $f_2$ and two of the variation functions were linear combinations of $f_3$ and $f_4$ now shows that the correct zeroth-order wave functions are of the form

\[ \phi_1^{(0)} = c_1^{(1)}\psi_1^{(0)} + c_2^{(1)}\psi_2^{(0)} \qquad \phi_2^{(0)} = c_1^{(2)}\psi_1^{(0)} + c_2^{(2)}\psi_2^{(0)} \]

\[ \phi_3^{(0)} = c_3^{(3)}\psi_3^{(0)} + c_4^{(3)}\psi_4^{(0)} \qquad \phi_4^{(0)} = c_3^{(4)}\psi_3^{(0)} + c_4^{(4)}\psi_4^{(0)} \]

Is there any way we can choose our initial wave functions $\psi_i^{(0)}$ to make things easier? Well, referring back to Theorem 1.2, suppose we have a Hermitian operator $A$ that commutes with both $H_0$ and $H'$. If we choose our initial wave functions to be eigenfunctions of both $A$ and $H_0$, then the off-diagonal matrix elements $H'_{ij} = \langle\psi_i^{(0)}|H'\psi_j^{(0)}\rangle$ will vanish if $\psi_i^{(0)}$ and $\psi_j^{(0)}$ belong to different eigenspaces of $A$. Therefore, if the functions $\psi_i^{(0)}$ all have different eigenvalues of $A$, the secular determinant will be diagonal so that $\phi_i^{(0)} = \psi_i^{(0)}$.

If more than one $\psi_i^{(0)}$ belongs to a given eigenvalue $a_k$ of $A$ (in other words, $\dim V_{a_k} > 1$), then this subcollection will form a block in the secular determinant. So in general, we will have a secular determinant that is block diagonal where each block has size $\dim V_{a_k}$. In this case, each correct zeroth-order wave function will be a linear combination of those $\psi_i^{(0)}$ that belong to the same eigenvalue of $A$.

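Before proceeding with an example, let me prove a very important and useful property of the spherical harmonics. The parity operation is $\mathbf{r} \to -\mathbf{r}$, and in spherical coordinates, this is equivalent to $\theta \to \pi - \theta$ and $\varphi \to \varphi + \pi$.

[Figure (image not reproduced): the spherical coordinates $r$, $\theta$, $\varphi$ of a point relative to the $x$, $y$, $z$ axes.]

Indeed, we know that (for the unit sphere) $z = \cos\theta$, and from the figure we see that $-z$ would be at $\pi - \theta$. Similarly, a point on the $x$-axis at $\varphi = 0$ goes to the point $-x$ at $\varphi = \pi$. Alternatively, letting $\theta \to \pi - \theta$ in $x = \sin\theta\cos\varphi$ doesn’t change $x$, so in order to have $x \to -x$ we need $\cos\varphi \to -\cos\varphi$ which is accomplished by letting $\varphi \to \varphi + \pi$.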

Now observe that under parity, $\mathbf{r} \to -\mathbf{r}$ and $\mathbf{p} \to -\mathbf{p}$, so that $\mathbf{L} = \mathbf{r}\times\mathbf{p}$ is unchanged. Thus angular momentum is a pseudo-vector, as you probably already knew. But this means that the parity operation $\Pi$ commutes with the quantum mechanical operator $\mathbf{L}$, so that the three operators $L^2$, $L_z$ and $\Pi$ are mutually commuting, and the eigenfunctions $Y_l^m(\theta,\varphi)$ of angular momentum can be chosen to have a definite parity. Note also that since $\Pi$ and $\mathbf{L}$ commute, it follows that $\Pi$ and $L_\pm$ commute, so acting on any $Y_l^m$ with $L_\pm$ won’t change its parity.

Look at the explicit form of the state $Y_l^l$:

\[ Y_l^l(\theta,\varphi) = (-1)^l\left[\frac{(2l+1)!}{4\pi}\right]^{1/2}\frac{1}{2^l\,l!}(\sin\theta)^l e^{il\varphi} . \]

Letting $\theta \to \pi - \theta$ we have $(\sin\theta)^l \to (\sin\theta)^l$, but under $\varphi \to \varphi + \pi$ we have $e^{il\varphi} \to e^{il\pi}e^{il\varphi} = (-1)^l e^{il\varphi}$. Therefore, under parity we see that $Y_l^l \to (-1)^l Y_l^l$. But we can get to any $Y_l^m$ by repeatedly applying $L_-$ to $Y_l^l$, and since this doesn’t change the parity of $Y_l^m$ we have the extremely useful result

\[ \Pi Y_l^m(\theta,\varphi) = (-1)^l Y_l^m(\theta,\varphi) . \tag{2.24} \]
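The parity of $Y_l^l$ is easy to confirm numerically from the explicit formula above. The following sketch uses only the Python standard library; the function names `Y_ll` and `parity_ratio` and the sample angles are illustrative choices, and only the top state $m = l$ is checked (the extension to general $m$ relies on the $L_-$ argument in the text).

```python
import cmath
import math

def Y_ll(l, theta, phi):
    """Explicit form of the spherical harmonic Y_l^l quoted in the text."""
    prefactor = ((-1) ** l
                 * math.sqrt(math.factorial(2 * l + 1) / (4 * math.pi))
                 / (2 ** l * math.factorial(l)))
    return prefactor * math.sin(theta) ** l * cmath.exp(1j * l * phi)

def parity_ratio(l, theta, phi):
    """Ratio Y_l^l(pi - theta, phi + pi) / Y_l^l(theta, phi); should equal (-1)^l."""
    return Y_ll(l, math.pi - theta, phi + math.pi) / Y_ll(l, theta, phi)

# Sample angles chosen arbitrarily, away from the poles where sin(theta) = 0
ratios = [parity_ratio(l, 0.7, 1.3) for l in range(5)]
```

The entries of `ratios` alternate between $+1$ and $-1$ (up to rounding), in agreement with (2.24) for $m = l$.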

Example 2.2 (Stark Effect). In this example we will take a look at the effect of a uniform electric field $\mathbf{E} = \mathcal{E}\hat{\mathbf{z}}$ on a hydrogen atom, where the unperturbed Hamiltonian is given by

\[ H_0 = \frac{p^2}{2m} - \frac{e^2}{r} \]

and $\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2$ is the relative position vector from the proton to the electron. We first need to find the perturbing potential energy.

The force on a particle of charge $q$ in an electric field $\mathbf{E} = -\nabla\phi$ is $\mathbf{F} = q\mathbf{E} = -q\nabla\phi$ where $\phi(\mathbf{r})$ is the electric potential. On the other hand, the force is also given in terms of the potential energy $V(\mathbf{r})$ by $\mathbf{F} = -\nabla V$, and hence $\nabla V = q\nabla\phi$ so that

\[ \int_0^{\mathbf{r}}\nabla V\cdot d\mathbf{r} = q\int_0^{\mathbf{r}}\nabla\phi\cdot d\mathbf{r} \]

or

\[ V(\mathbf{r}) - V(0) = q[\phi(\mathbf{r}) - \phi(0)] . \]

If we take $V(0) = \phi(0) = 0$, then we have

\[ V(\mathbf{r}) = q\phi(\mathbf{r}) . \]

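Thus the interaction Hamiltonian $H'$ consists of both the energy $e\phi(\mathbf{r}_2)$ of the proton and the energy $-e\phi(\mathbf{r}_1)$ of the electron, and therefore

\[ H' = e[\phi(\mathbf{r}_2) - \phi(\mathbf{r}_1)] . \]

But the electric field is constant so that

\[ \int_{\mathbf{r}_1}^{\mathbf{r}_2}\mathbf{E}\cdot d\mathbf{r} = \mathbf{E}\cdot(\mathbf{r}_2 - \mathbf{r}_1) = -\mathbf{E}\cdot(\mathbf{r}_1 - \mathbf{r}_2) = -\mathbf{E}\cdot\mathbf{r} = -\mathcal{E}z \]

while we also have

\[ \int_{\mathbf{r}_1}^{\mathbf{r}_2}\mathbf{E}\cdot d\mathbf{r} = -\int_{\mathbf{r}_1}^{\mathbf{r}_2}\nabla\phi\cdot d\mathbf{r} = -[\phi(\mathbf{r}_2) - \phi(\mathbf{r}_1)] . \]

Hence the final form of our perturbation is $H' = e\mathbf{E}\cdot\mathbf{r}$ or

\[ H' = e\mathcal{E}z . \]

Note also that if we define the electric dipole moment $\boldsymbol{\mu}_e = e(\mathbf{r}_2 - \mathbf{r}_1) = -e\mathbf{r}$, then $H'$ can be called a dipole interaction because

\[ H' = -\boldsymbol{\mu}_e\cdot\mathbf{E} . \]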

Let us first consider the ground state $\psi_{100}$ of the hydrogen atom. This state is nondegenerate, so the first-order energy correction to the ground state is, from equation (2.8),

\[ E_{100}^{(1)} = \langle\psi_{100}|e\mathcal{E}z|\psi_{100}\rangle = e\mathcal{E}\langle\psi_{100}|z|\psi_{100}\rangle . \]

But $H_0$ is parity invariant, so the states $\psi_{nlm}$ all have a definite parity $(-1)^l$. Then $|\psi_{100}|^2$ is even while $z$ is odd, so $E_{100}^{(1)}$ is the integral of an odd function over all of space, and hence it vanishes:

\[ E_{100}^{(1)} = 0 . \]

In fact, this shows that any nondegenerate state of the hydrogen atom has no first-order Stark effect.

Now consider the $n = 2$ level of hydrogen. This is a four-fold degenerate level consisting of the wave functions $\psi_{200}$, $\psi_{210}$, $\psi_{211}$ and $\psi_{21-1}$. Since the parity of the states is given by $(-1)^l$, we see that the $l = 0$ state has even parity while the $l = 1$ states are odd.

However, it is not hard to see that $[H', L_z] = 0$. This is either a consequence of the fact that $H'$ is a function of $z = r\cos\theta$ while $L_z = -i\hbar\,\partial/\partial\varphi$, or you can note that $[L_i, r_j] = i\hbar\sum_k\varepsilon_{ijk}r_k$ so that $[L_z, z] = 0$. Either way, we have

\[ 0 = \langle\psi_{nl'm'}|[H', L_z]|\psi_{nlm}\rangle = \langle\psi_{nl'm'}|H'L_z - L_zH'|\psi_{nlm}\rangle = \hbar(m - m')\langle\psi_{nl'm'}|H'|\psi_{nlm}\rangle \]

and hence we have the selection rule

\[ \langle\psi_{nl'm'}|H'|\psi_{nlm}\rangle = 0 \quad\text{if } m \ne m' . \]

(This is an example of Theorem 1.2.) This shows that $H'$ can only connect states with the same $m$ values. And since $H'$ has odd parity, it can only connect states with opposite parities, i.e., in the present case it can only connect an $l = 0$ state with an $l = 1$ state.

Suppressing the index $n = 2$, we order our basis states $\psi_{lm}$ as $\{\psi_{00}, \psi_{10}, \psi_{11}, \psi_{1-1}\}$. (In other words, the rows and columns are labeled by these functions in this order.) Then the secular equation (2.21) becomes (also writing $E$ instead of $E^{(1)}$ for simplicity)

\[ \begin{vmatrix} -E & \langle\psi_{00}|H'|\psi_{10}\rangle & 0 & 0 \\ \langle\psi_{10}|H'|\psi_{00}\rangle & -E & 0 & 0 \\ 0 & 0 & -E & 0 \\ 0 & 0 & 0 & -E \end{vmatrix} = 0 \]

or (since it’s block diagonal)

\[ [E^2 - (H'_{12})^2]E^2 = 0 \]

where

\[ H'_{12} = \langle\psi_{00}|H'|\psi_{10}\rangle = \langle\psi_{10}|H'|\psi_{00}\rangle = H'_{21} \]

because both $H'$ and the wave functions are real. Therefore the roots of the secular equation are

\[ E_n^{(1)} = \pm H'_{12}, 0, 0 . \]

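For our wave functions we have $\psi_{nlm} = R_{nl}Y_l^m$ or

\[ \psi_{200} = \left(\frac{1}{2a_0^3}\right)^{1/2}\left(1 - \frac{r}{2a_0}\right)e^{-r/2a_0}\,Y_0^0 \]

\[ \psi_{210} = \left(\frac{1}{24a_0^3}\right)^{1/2}\frac{r}{a_0}\,e^{-r/2a_0}\,Y_1^0 \]

where $a_0$ is the Bohr radius defined by $a_0 = \hbar^2/m_e e^2$, and hence

\[ H'_{12} = \langle\psi_{200}|e\mathcal{E}z|\psi_{210}\rangle = e\mathcal{E}\int(2a_0)^{-3}\frac{2}{\sqrt{3}}\,e^{-r/a_0}\,\frac{r}{a_0}\left(1 - \frac{r}{2a_0}\right)z\,Y_0^{0*}Y_1^0\,r^2\,dr\,d\Omega . \]

But

\[ Y_0^{0*} = \frac{1}{\sqrt{4\pi}} \qquad\text{and}\qquad z = r\cos\theta = r\sqrt{\frac{4\pi}{3}}\,Y_1^{0*} \]

so that using

\[ \int d\Omega\,Y_l^{m*}Y_{l'}^{m'} = \delta_{ll'}\delta_{mm'} \]

we have

\[ H'_{12} = e\mathcal{E}(2a_0)^{-3}\frac{2}{3a_0}\int e^{-r/a_0}\,r^4\left(1 - \frac{r}{2a_0}\right)Y_1^{0*}Y_1^0\,dr\,d\Omega = e\mathcal{E}(2a_0)^{-3}\frac{2}{3a_0}\int_0^{\infty}\left(r^4 - \frac{r^5}{2a_0}\right)e^{-r/a_0}\,dr . \]

Using the general result

\[ \int_0^{\infty} r^n e^{-\alpha r}\,dr = (-1)^n\frac{\partial^n}{\partial\alpha^n}\int_0^{\infty} e^{-\alpha r}\,dr = (-1)^n\frac{\partial^n}{\partial\alpha^n}\alpha^{-1} = \frac{n!}{\alpha^{n+1}} \]

we finally arrive at

\[ H'_{12} = -3e\mathcal{E}a_0 . \]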
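The arithmetic in the last step can be reproduced exactly with rational numbers. The sketch below works in units where $a_0 = 1$ (an assumption made only to keep the numbers simple) and evaluates the radial integrals with the general result $n!/\alpha^{n+1}$; the helper name `radial_moment` is illustrative.

```python
from fractions import Fraction
from math import factorial

def radial_moment(n, alpha):
    """Integral of r^n exp(-alpha r) from 0 to infinity, i.e. n!/alpha^(n+1)."""
    return Fraction(factorial(n)) / alpha ** (n + 1)

a0 = Fraction(1)   # Bohr radius, set to 1 (assumed units)
alpha = 1 / a0     # the exponential in the integrand is exp(-r/a0)

# Integral of (r^4 - r^5/(2 a0)) exp(-r/a0) dr
integral = radial_moment(4, alpha) - radial_moment(5, alpha) / (2 * a0)

# H'_12 divided by e*E, using the prefactor (2 a0)^(-3) * 2/(3 a0) from the text
H12_over_eE = (2 * a0) ** -3 * Fraction(2, 3) / a0 * integral
```

The integral evaluates to $24 - 60 = -36$ (in units of $a_0^5$), and multiplying by the prefactor $1/12$ gives $-3$, i.e. $H'_{12} = -3e\mathcal{E}a_0$.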

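Now we need to find the corresponding eigenvectors $c$ that will specify the correct zeroth-order wave functions. These are the solutions to the system of equations $H'c = E^{(1)}_n c$ for each value of $E^{(1)}_n$ (see equation (2.20b)). Let $E^{(1)}_1 = H'_{12}$. Then the eigenvector $c^{(1)}$ satisfies

$$
\begin{pmatrix}
-H'_{12} & H'_{12} & 0 & 0 \\
H'_{12} & -H'_{12} & 0 & 0 \\
0 & 0 & -H'_{12} & 0 \\
0 & 0 & 0 & -H'_{12}
\end{pmatrix}
\begin{pmatrix} c_1 \\ c_2 \\ c_3 \\ c_4 \end{pmatrix} = 0 .
$$

This implies that $c_1 = c_2$ and $c_3 = c_4 = 0$. Normalizing we have $c_1 = c_2 = 1/\sqrt{2}$ so that

$$\varphi^{(0)}_1 = \frac{1}{\sqrt{2}}(\psi_{200} + \psi_{210}) .$$

Next we let $E^{(1)}_2 = -H'_{12}$. Now the eigenvector $c^{(2)}$ satisfies

$$
\begin{pmatrix}
H'_{12} & H'_{12} & 0 & 0 \\
H'_{12} & H'_{12} & 0 & 0 \\
0 & 0 & H'_{12} & 0 \\
0 & 0 & 0 & H'_{12}
\end{pmatrix}
\begin{pmatrix} c_1 \\ c_2 \\ c_3 \\ c_4 \end{pmatrix} = 0
$$

so that $c_1 = -c_2$ and $c_3 = c_4 = 0$. Again, normalization yields $c_1 = -c_2 = 1/\sqrt{2}$ and hence

$$\varphi^{(0)}_2 = \frac{1}{\sqrt{2}}(\psi_{200} - \psi_{210}) .$$

Finally, for the two degenerate roots $E^{(1)}_3 = E^{(1)}_4 = 0$ we have

$$
\begin{pmatrix}
0 & H'_{12} & 0 & 0 \\
H'_{12} & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}
\begin{pmatrix} c_1 \\ c_2 \\ c_3 \\ c_4 \end{pmatrix} = 0
$$

so that $c_1 = c_2 = 0$ while $c_3$ and $c_4$ are completely arbitrary. Thus we can simply choose

$$\varphi^{(0)}_3 = \psi_{211} \qquad\text{and}\qquad \varphi^{(0)}_4 = \psi_{21-1} .$$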

In summary, the correct zeroth-order wave functions for treating the Stark effect are $\varphi^{(0)}_1$, which gets a first-order energy shift of $-3e\mathcal{E}a_0$, the wave function $\varphi^{(0)}_2$, which gets a first-order energy shift of $+3e\mathcal{E}a_0$, and the original degenerate states $\varphi^{(0)}_3 = \psi_{211}$ and $\varphi^{(0)}_4 = \psi_{21-1}$, which remain degenerate to this order.
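The eigenvectors found above can be checked by direct matrix multiplication. The following is a minimal sketch in plain Python, working in units of $e\mathcal{E}a_0$ so that $H'_{12} = -3$; the helper names `matvec` and `is_eigenpair` are assumptions introduced for this illustration.

```python
from math import sqrt, isclose

H12 = -3.0  # H'_12 in units of e * (field strength) * a0

# Perturbation matrix in the ordered basis psi_200, psi_210, psi_211, psi_21-1
H = [
    [0.0, H12, 0.0, 0.0],
    [H12, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]


def matvec(M, v):
    """Plain matrix-vector product."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def is_eigenpair(M, v, lam):
    """True if M v = lam v to within rounding error."""
    return all(isclose(x, lam * y, abs_tol=1e-12) for x, y in zip(matvec(M, v), v))


s = 1 / sqrt(2)
candidates = [
    ([s, s, 0.0, 0.0], H12),     # (psi_200 + psi_210)/sqrt(2), shift +H'_12
    ([s, -s, 0.0, 0.0], -H12),   # (psi_200 - psi_210)/sqrt(2), shift -H'_12
    ([0.0, 0.0, 1.0, 0.0], 0.0),  # psi_211, no first-order shift
    ([0.0, 0.0, 0.0, 1.0], 0.0),  # psi_21-1, no first-order shift
]

for vector, shift in candidates:
    print(shift, is_eigenpair(H, vector, shift))
```

By contrast, `is_eigenpair(H, [1.0, 0.0, 0.0, 0.0], 0.0)` is false: $\psi_{200}$ by itself is not a correct zeroth-order function, because $H'$ mixes it with $\psi_{210}$.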

2.3 Perturbation Treatment of the First Excited States of Helium

The helium atom consists of a nucleus with two protons and two neutrons, and two orbiting electrons. If we take the nuclear charge to be $+Ze$ instead of $+2e$, then our discussion will apply equally well to helium-like ions such as H$^-$, Li$^+$ or Be$^{2+}$. Neglecting terms such as spin–orbit coupling, the Hamiltonian is

$$H = -\frac{\hbar^2}{2m_e}\nabla_1^2 - \frac{\hbar^2}{2m_e}\nabla_2^2 - \frac{Ze^2}{r_1} - \frac{Ze^2}{r_2} + \frac{e^2}{r_{12}} \tag{2.25}$$

where $r_i$ is the distance to electron $i$, $r_{12}$ is the distance from electron 1 to electron 2, and $\nabla_i^2$ is the Laplacian with respect to the coordinates of electron $i$. The wave function that solves the Schrödinger equation is thus a function of six variables, the three coordinates for each of the two electrons. (Technically, the electron mass $m_e$ should be replaced by the reduced mass $m = m_e M/(m_e + M)$ where $M$ is the mass of the nucleus. But $M \gg m_e$ so that $m \approx m_e$. If this isn’t familiar to you, we will treat two-body problems such as this in detail when we discuss identical particles.)

Because of the term $e^2/r_{12}$ the Schrödinger equation isn’t separable, and we must resort to approximation methods. We write

$$H = H^0 + H'$$

where

$$H^0 = H^0_1 + H^0_2 = -\frac{\hbar^2}{2m_e}\nabla_1^2 - \frac{Ze^2}{r_1} - \frac{\hbar^2}{2m_e}\nabla_2^2 - \frac{Ze^2}{r_2} \tag{2.26}$$

is the sum of two independent hydrogen atom Hamiltonians, and

$$H' = \frac{e^2}{r_{12}} . \tag{2.27}$$

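We can now use separation of variables to write the unperturbed wave function $\Psi(\mathbf{r}_1, \mathbf{r}_2)$ as a product

$$\Psi(\mathbf{r}_1, \mathbf{r}_2) = \psi_1(\mathbf{r}_1)\,\psi_2(\mathbf{r}_2) .$$

In this case we have the time-independent equation

$$H^0\Psi = (H^0_1 + H^0_2)\,\psi_1\psi_2 = \psi_2 H^0_1\psi_1 + \psi_1 H^0_2\psi_2 = E\,\psi_1\psi_2$$

so that dividing by $\psi_1\psi_2$ yields

$$\frac{H^0_1\psi_1}{\psi_1} = E - \frac{H^0_2\psi_2}{\psi_2} .$$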

Since the left side of this equation is a function of $\mathbf{r}_1$ only, and the right side is a function of $\mathbf{r}_2$ only, each side must in fact be equal to a constant, and we can write

$$E = E_1 + E_2$$

where each $E_i$ is the energy of a hydrogenlike wave function:

$$E_1 = -\frac{Z^2}{n_1^2}\frac{e^2}{2a_0} \qquad E_2 = -\frac{Z^2}{n_2^2}\frac{e^2}{2a_0}$$

and $a_0$ is the Bohr radius

$$a_0 = \frac{\hbar^2}{m_e e^2} = 0.529\ \text{Å} .$$

In other words, we have the unperturbed zeroth-order energies

$$E^{(0)} = -Z^2\left(\frac{1}{n_1^2} + \frac{1}{n_2^2}\right)\frac{e^2}{2a_0}, \qquad n_1 = 1, 2, \ldots, \quad n_2 = 1, 2, \ldots . \tag{2.28}$$

Correspondingly, the zeroth-order wave functions are products of the usual hydrogenlike wave functions.

The lowest excited states of helium have $n_1 = 1$, $n_2 = 2$ or $n_1 = 2$, $n_2 = 1$. Then from (2.28) we have (for $Z = 2$)

$$E^{(0)} = -2^2\left(\frac{1}{1^2} + \frac{1}{2^2}\right)\frac{e^2}{2a_0} = -5(13.606\ \text{eV}) = -68.03\ \text{eV} .$$
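For readers who want to tabulate (2.28) for other configurations or other helium-like ions, here is a minimal Python sketch. It assumes the value $e^2/2a_0 = 13.606$ eV used in the text; the function name `zeroth_order_energy` is hypothetical.

```python
RYDBERG_EV = 13.606  # e^2/(2 a0) in eV, the value used in the text


def zeroth_order_energy(Z, n1, n2):
    """Unperturbed two-electron energy of equation (2.28), in eV."""
    return -Z**2 * (1 / n1**2 + 1 / n2**2) * RYDBERG_EV


# First excited configuration of helium (n1 = 1, n2 = 2)
print(f"{zeroth_order_energy(2, 1, 2):.2f} eV")
# Ground configuration of helium (n1 = n2 = 1), for comparison
print(f"{zeroth_order_energy(2, 1, 1):.2f} eV")
```

The first line reproduces the $-68.03$ eV found above. Interchanging `n1` and `n2` gives the same energy, which is the exchange degeneracy of the unperturbed problem.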

For $n = 2$, the possible values of $l$ are $l = 0, 1$, and since there are $2l + 1$ values of $m_l$, we see that the $n = 2$ level of a hydrogenlike atom is fourfold degenerate. (This just says that the 2s and 2p states have the same energy.) Thus the first excited unperturbed state of He is eightfold degenerate, and the eight unperturbed wave functions are

$$
\begin{aligned}
\psi^{(0)}_1 &= 1s(1)2s(2) & \psi^{(0)}_2 &= 2s(1)1s(2) & \psi^{(0)}_3 &= 1s(1)2p_x(2) & \psi^{(0)}_4 &= 2p_x(1)1s(2) \\
\psi^{(0)}_5 &= 1s(1)2p_y(2) & \psi^{(0)}_6 &= 2p_y(1)1s(2) & \psi^{(0)}_7 &= 1s(1)2p_z(2) & \psi^{(0)}_8 &= 2p_z(1)1s(2)
\end{aligned}
$$

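Here the notation $1s(1)2s(2)$ means, for example, that electron 1 is in the 1s state and electron 2 is in the 2s state. I have also chosen to use the real hydrogenlike wave functions $2p_x$, $2p_y$ and $2p_z$ which are defined as linear combinations of the complex wave functions $2p_0$, $2p_1$ and $2p_{-1}$:

$$2p_x := \frac{1}{\sqrt{2}}(2p_1 + 2p_{-1}) = \frac{1}{4\sqrt{2\pi}}\left(\frac{Z}{a_0}\right)^{5/2} r\,e^{-Zr/2a_0}\sin\theta\cos\phi = \frac{1}{4\sqrt{2\pi}}\left(\frac{Z}{a_0}\right)^{5/2} x\,e^{-Zr/2a_0} \tag{2.29a}$$

$$2p_y := \frac{1}{i\sqrt{2}}(2p_1 - 2p_{-1}) = \frac{1}{4\sqrt{2\pi}}\left(\frac{Z}{a_0}\right)^{5/2} r\,e^{-Zr/2a_0}\sin\theta\sin\phi = \frac{1}{4\sqrt{2\pi}}\left(\frac{Z}{a_0}\right)^{5/2} y\,e^{-Zr/2a_0} \tag{2.29b}$$

$$2p_z := 2p_0 = \frac{1}{4\sqrt{2\pi}}\left(\frac{Z}{a_0}\right)^{5/2} r\,e^{-Zr/2a_0}\cos\theta = \frac{1}{4\sqrt{2\pi}}\left(\frac{Z}{a_0}\right)^{5/2} z\,e^{-Zr/2a_0} \tag{2.29c}$$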

This is perfectly valid since any linear combination of solutions with a given energy is also a solution with that energy. (However, the $2p_x$ and $2p_y$ functions are not eigenfunctions of $L_z$ since they are linear combinations of eigenfunctions with different values of $m_l$.) These real hydrogenlike wave functions are more convenient for many purposes in constructing chemical bonds and molecular wave functions. In fact, you have probably seen these wave functions in more elementary chemistry courses. For example, a contour plot in the plane (i.e., a cross section) of a real 2p wave function is shown in Figure 4 below. (Let $\phi = \pi/2$ in any of equations (2.29).) The three-dimensional orbital is obtained by rotating this plot about the horizontal axis, so we see that the actual shape of a real 2p orbital (i.e., a one-electron wave function) is two separated, distorted ellipsoids.

It is not hard to verify that the real 2p wave functions are orthonormal, and hence the eight degenerate wave functions $\psi^{(0)}_i$ are also orthonormal as required by equation (2.19). The secular determinant contains $8^2 = 64$ elements. However, $H'$ is real, as are the $\psi^{(0)}_i$, so that $H'_{ij} = H'_{ji}$ and the determinant is symmetric about the main diagonal. This cuts the number of integrals almost in half.

Figure 4: Contour plot in the plane of a real 2p wave function.

Even better, by using parity we can easily show that most of the $H'_{ij}$ are zero. Indeed, the perturbing Hamiltonian $H' = e^2/r_{12}$ is an even function of $\mathbf{r}$ since

$$r_{12} = [(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2]^{1/2}$$

and this is unchanged if $\mathbf{r}_1 \to -\mathbf{r}_1$ and $\mathbf{r}_2 \to -\mathbf{r}_2$. Also, the hydrogenlike s-wave functions depend only on $r = |\mathbf{r}|$ and hence are invariant under $\mathbf{r} \to -\mathbf{r}$. Furthermore, you can see from the above forms that the 2p wave functions are odd under parity since they depend on $r$ and either $x$, $y$ or $z$. Hence, since we are integrating over all space, any integral with only a single factor of 2p must vanish:

$$H'_{13} = H'_{14} = H'_{15} = H'_{16} = H'_{17} = H'_{18} = 0$$

and

$$H'_{23} = H'_{24} = H'_{25} = H'_{26} = H'_{27} = H'_{28} = 0 .$$

Now consider an integral such as

$$H'_{35} = \int_{-\infty}^{\infty} 1s(1)2p_x(2)\,\frac{e^2}{r_{12}}\,1s(1)2p_y(2)\,d\mathbf{r}_1\,d\mathbf{r}_2 .$$

If we let $x_1 \to -x_1$ and $x_2 \to -x_2$, then $r_{12}$ is unchanged as are $1s(1)$ and $2p_y(2)$. However, $2p_x(2)$ changes sign, and the net result is that the integrand is an odd function under this transformation. Hence it is not hard to see that the integral vanishes. This lets us conclude that

$$H'_{35} = H'_{36} = H'_{37} = H'_{38} = 0$$

and

$$H'_{45} = H'_{46} = H'_{47} = H'_{48} = 0 .$$

Similarly, by considering the transformation $y_1 \to -y_1$ and $y_2 \to -y_2$, it follows that

$$H'_{57} = H'_{58} = H'_{67} = H'_{68} = 0 .$$
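The parity bookkeeping above can be automated. The sketch below assumes only what was just argued: $e^2/r_{12}$ and the 1s and 2s orbitals are even under each of the simultaneous reflections $x_i \to -x_i$, $y_i \to -y_i$, $z_i \to -z_i$, while $2p_x$, $2p_y$, $2p_z$ are odd under exactly one of them. The names `PARTNER`, `PARITY` and `may_be_nonzero` are hypothetical labels introduced for the illustration.

```python
from itertools import product

# Orbital that accompanies 1s in each basis function psi_1 ... psi_8.
# Which electron occupies it is irrelevant for the reflection parities.
PARTNER = ["2s", "2s", "2px", "2px", "2py", "2py", "2pz", "2pz"]

# Sign acquired under the reflections of x, y and z (applied to both electrons)
PARITY = {
    "2s": (1, 1, 1),
    "2px": (-1, 1, 1),
    "2py": (1, -1, 1),
    "2pz": (1, 1, -1),
}


def may_be_nonzero(i, j):
    """False if some reflection makes the integrand of H'_ij odd (1-based i, j)."""
    signs_i = PARITY[PARTNER[i - 1]]
    signs_j = PARITY[PARTNER[j - 1]]
    return all(a * b == 1 for a, b in zip(signs_i, signs_j))


surviving = [(i, j) for i, j in product(range(1, 9), repeat=2)
             if i < j and may_be_nonzero(i, j)]
print(surviving)  # [(1, 2), (3, 4), (5, 6), (7, 8)]
```

Only the four off-diagonal elements $H'_{12}$, $H'_{34}$, $H'_{56}$ and $H'_{78}$ survive, which is the block structure of the secular determinant. Note that the test only rules elements out; it does not by itself show that the surviving ones are nonzero.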

With these simplifications, the secular equation becomes

$$
\begin{vmatrix}
b_{11} & H'_{12} & 0 & 0 & 0 & 0 & 0 & 0 \\
H'_{12} & b_{22} & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & b_{33} & H'_{34} & 0 & 0 & 0 & 0 \\
0 & 0 & H'_{34} & b_{44} & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & b_{55} & H'_{56} & 0 & 0 \\
0 & 0 & 0 & 0 & H'_{56} & b_{66} & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & b_{77} & H'_{78} \\
0 & 0 & 0 & 0 & 0 & 0 & H'_{78} & b_{88}
\end{vmatrix} = 0 \tag{2.30}
$$

where

$$b_{ii} = H'_{ii} - E^{(1)}, \qquad i = 1, 2, \ldots, 8 .$$

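Since the secular determinant is in block-diagonal form with $2 \times 2$ blocks on the diagonal, the same logic that we used in Example 1.3 would seem to tell us that the correct zeroth-order wave functions have the form

$$
\begin{aligned}
\phi^{(0)}_1 &= c_1\psi^{(0)}_1 + c_2\psi^{(0)}_2 & \phi^{(0)}_2 &= \bar{c}_1\psi^{(0)}_1 + \bar{c}_2\psi^{(0)}_2 \\
\phi^{(0)}_3 &= c_3\psi^{(0)}_3 + c_4\psi^{(0)}_4 & \phi^{(0)}_4 &= \bar{c}_3\psi^{(0)}_3 + \bar{c}_4\psi^{(0)}_4 \\
\phi^{(0)}_5 &= c_5\psi^{(0)}_5 + c_6\psi^{(0)}_6 & \phi^{(0)}_6 &= \bar{c}_5\psi^{(0)}_5 + \bar{c}_6\psi^{(0)}_6 \\
\phi^{(0)}_7 &= c_7\psi^{(0)}_7 + c_8\psi^{(0)}_8 & \phi^{(0)}_8 &= \bar{c}_7\psi^{(0)}_7 + \bar{c}_8\psi^{(0)}_8
\end{aligned}
$$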

where the barred and unbarred coefficients distinguish between the two roots of each second-order determinant. However, while that argument applies to the upper $2 \times 2$ determinant (i.e., the first two equations of the system), it doesn’t apply to the whole determinant in this case. This is because it turns out (as we will see below) that the lower three $2 \times 2$ determinants are identical. Therefore their pairs of roots are the same, each of the two roots is threefold degenerate, and all we can say is that the six-dimensional space spanned by $\psi^{(0)}_3, \ldots, \psi^{(0)}_8$ splits into two three-dimensional eigenspaces, one for each root. In other words, all we can say is that for each $n = 3, 4, \ldots, 8$ the function $\phi^{(0)}_n$ will be a linear combination of $\psi^{(0)}_3, \ldots, \psi^{(0)}_8$ lying in one of these eigenspaces. However, we can choose any orthonormal basis we wish within each eigenspace, so we choose the three pairs of orthonormal $\phi^{(0)}_n$’s as shown above.

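The first determinant is

$$
\begin{vmatrix}
H'_{11} - E^{(1)} & H'_{12} \\
H'_{12} & H'_{22} - E^{(1)}
\end{vmatrix} = 0 \tag{2.31}
$$

where

$$H'_{11} = \int 1s(1)2s(2)\,\frac{e^2}{r_{12}}\,1s(1)2s(2)\,d\mathbf{r}_1\,d\mathbf{r}_2 = \int [1s(1)]^2[2s(2)]^2\,\frac{e^2}{r_{12}}\,d\mathbf{r}_1\,d\mathbf{r}_2$$

$$H'_{22} = \int [1s(2)]^2[2s(1)]^2\,\frac{e^2}{r_{12}}\,d\mathbf{r}_1\,d\mathbf{r}_2 .$$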

Since the integration variables are just dummy variables, it is pretty obvious that letting $\mathbf{r}_1 \leftrightarrow \mathbf{r}_2$ shows that

$$H'_{11} = H'_{22} .$$

Similarly, it is easy to see that

$$H'_{33} = H'_{44} \qquad H'_{55} = H'_{66} \qquad H'_{77} = H'_{88} .$$

The integral $H'_{11}$ is sometimes denoted by $J_{1s2s}$ and called a Coulomb integral:

$$H'_{11} = J_{1s2s} = \int [1s(1)]^2[2s(2)]^2\,\frac{e^2}{r_{12}}\,d\mathbf{r}_1\,d\mathbf{r}_2 .$$

The reason for the name is that this represents the electrostatic energy of repulsion between an electron with the probability density function $[1s]^2$ and an electron with probability density function $[2s]^2$. The integral $H'_{12}$ is denoted by $K_{1s2s}$ and called an exchange integral:

$$H'_{12} = K_{1s2s} = \int 1s(1)2s(2)\,\frac{e^2}{r_{12}}\,2s(1)1s(2)\,d\mathbf{r}_1\,d\mathbf{r}_2 .$$

Here the functions to the left and right of $H'$ differ from each other by the exchange of electrons 1 and 2. The general definitions of the Coulomb and exchange integrals are

$$J_{ij} = \langle f_i(1)f_j(2)|e^2/r_{12}|f_i(1)f_j(2)\rangle$$

$$K_{ij} = \langle f_i(1)f_j(2)|e^2/r_{12}|f_j(1)f_i(2)\rangle$$

where the range of integration is over the full range of spatial coordinates of particles 1 and 2, and the functions $f_i$, $f_j$ are spatial orbitals.

Substituting these integrals into (2.31) we have

$$
\begin{vmatrix}
J_{1s2s} - E^{(1)} & K_{1s2s} \\
K_{1s2s} & J_{1s2s} - E^{(1)}
\end{vmatrix} = 0 \tag{2.32}
$$

or

$$J_{1s2s} - E^{(1)} = \pm K_{1s2s}$$

and hence the two roots are

$$E^{(1)}_1 = J_{1s2s} - K_{1s2s} \qquad\text{and}\qquad E^{(1)}_2 = J_{1s2s} + K_{1s2s} .$$

Just as in Example 1.3, we substitute $E^{(1)}_1$ back into (2.20a) to write

$$
\begin{aligned}
K_{1s2s}c_1 + K_{1s2s}c_2 &= 0 \\
K_{1s2s}c_1 + K_{1s2s}c_2 &= 0
\end{aligned}
$$

and hence $c_2 = -c_1$. Normalizing $\phi^{(0)}_1$ we have (using the orthonormality of the $\psi^{(0)}_i$)

$$\langle\phi^{(0)}_1|\phi^{(0)}_1\rangle = \langle c_1\psi^{(0)}_1 - c_1\psi^{(0)}_2 \,|\, c_1\psi^{(0)}_1 - c_1\psi^{(0)}_2\rangle = |c_1|^2 + |c_2|^2 = 1$$

so that $c_1 = 1/\sqrt{2}$. Thus the zeroth-order wave function corresponding to $E^{(1)}_1$ is

$$\phi^{(0)}_1 = 2^{-1/2}[\psi^{(0)}_1 - \psi^{(0)}_2] = 2^{-1/2}[1s(1)2s(2) - 2s(1)1s(2)] .$$

Similarly, the wave function corresponding to $E^{(1)}_2$ is easily found to be

$$\phi^{(0)}_2 = 2^{-1/2}[\psi^{(0)}_1 + \psi^{(0)}_2] = 2^{-1/2}[1s(1)2s(2) + 2s(1)1s(2)] .$$

This takes care of the first determinant in (2.30), but we still have the remaining three to handle.

First look at the integrals $H'_{33}$ and $H'_{55}$:

$$H'_{33} = \int 1s(1)2p_x(2)\,\frac{e^2}{r_{12}}\,1s(1)2p_x(2)\,d\mathbf{r}_1\,d\mathbf{r}_2$$

$$H'_{55} = \int 1s(1)2p_y(2)\,\frac{e^2}{r_{12}}\,1s(1)2p_y(2)\,d\mathbf{r}_1\,d\mathbf{r}_2 .$$

The only difference between these is the $2p(2)$ orbital, and the only difference between the $2p_x$ and $2p_y$ orbitals is their spatial orientation. Since the 1s orbitals are spherically symmetric, it should be clear that these integrals are the same. Formally, in $H'_{33}$ we can change variables by letting $x_1 \to y_1$, $y_1 \to x_1$, $x_2 \to y_2$ and $y_2 \to x_2$. This leaves $r_{12}$ unchanged, and transforms $H'_{33}$ into $H'_{55}$. The same argument shows that $H'_{77} = H'_{33}$ also. Hence we have

$$H'_{33} = H'_{55} = H'_{77} = \int 1s(1)2p_z(2)\,\frac{e^2}{r_{12}}\,1s(1)2p_z(2)\,d\mathbf{r}_1\,d\mathbf{r}_2 := J_{1s2p} .$$

A similar argument shows that we also have equal exchange integrals:

$$H'_{34} = H'_{56} = H'_{78} = \int 1s(1)2p_z(2)\,\frac{e^2}{r_{12}}\,2p_z(1)1s(2)\,d\mathbf{r}_1\,d\mathbf{r}_2 := K_{1s2p} .$$

Thus the remaining three determinants in (2.30) are the same and have the form

$$
\begin{vmatrix}
J_{1s2p} - E^{(1)} & K_{1s2p} \\
K_{1s2p} & J_{1s2p} - E^{(1)}
\end{vmatrix} = 0 .
$$

But this is the same as (2.32) if we replace 2s by 2p, and hence we can immediately write down the solutions:

$$E^{(1)}_3 = E^{(1)}_5 = E^{(1)}_7 = J_{1s2p} - K_{1s2p}$$

$$E^{(1)}_4 = E^{(1)}_6 = E^{(1)}_8 = J_{1s2p} + K_{1s2p}$$

and

$$
\begin{aligned}
\phi^{(0)}_3 &= 2^{-1/2}[1s(1)2p_x(2) - 1s(2)2p_x(1)] \\
\phi^{(0)}_4 &= 2^{-1/2}[1s(1)2p_x(2) + 1s(2)2p_x(1)] \\
\phi^{(0)}_5 &= 2^{-1/2}[1s(1)2p_y(2) - 1s(2)2p_y(1)] \\
\phi^{(0)}_6 &= 2^{-1/2}[1s(1)2p_y(2) + 1s(2)2p_y(1)] \\
\phi^{(0)}_7 &= 2^{-1/2}[1s(1)2p_z(2) - 1s(2)2p_z(1)] \\
\phi^{(0)}_8 &= 2^{-1/2}[1s(1)2p_z(2) + 1s(2)2p_z(1)]
\end{aligned}
$$

So what has happened? Starting from the eight degenerate (unperturbed) states $\psi^{(0)}_i$ that would exist in the absence of electron-electron repulsion, we find that including this repulsion term splits the degenerate states into two nondegenerate levels associated with the configuration 1s2s, and two triply degenerate levels associated with the configuration 1s2p. Interestingly, going to higher-order energy corrections will not completely remove the degeneracy, and in fact it takes the application of an external magnetic field to do so.

In order to evaluate the Coulomb and exchange integrals in the expressions for $E^{(1)}$ we need to use the expansion

$$\frac{1}{r_{12}} = \sum_{l=0}^{\infty}\sum_{m=-l}^{l}\frac{4\pi}{2l+1}\,\frac{r_<^l}{r_>^{l+1}}\,[Y_l^m(\theta_1, \varphi_1)]^*\,Y_l^m(\theta_2, \varphi_2) \tag{2.33}$$

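where $r_<$ means the smaller of $r_1$ and $r_2$ and $r_>$ is the larger of these. The details of this type of integral are left to the homework, and the results are

$$J_{1s2s} = \frac{17}{81}\frac{Ze^2}{a_0} = 11.42\ \text{eV} \qquad J_{1s2p} = \frac{59}{243}\frac{Ze^2}{a_0} = 13.21\ \text{eV}$$

$$K_{1s2s} = \frac{16}{729}\frac{Ze^2}{a_0} = 1.19\ \text{eV} \qquad K_{1s2p} = \frac{112}{6561}\frac{Ze^2}{a_0} = 0.93\ \text{eV}$$

where we used $Z = 2$ and $e^2/2a_0 = 13.606$ eV. Recalling that $E^{(0)} = -68.03$ eV we obtain

$$
\begin{aligned}
E^{(0)} + E^{(1)}_1 &= E^{(0)} + J_{1s2s} - K_{1s2s} = -57.8\ \text{eV} \\
E^{(0)} + E^{(1)}_2 &= E^{(0)} + J_{1s2s} + K_{1s2s} = -55.4\ \text{eV} \\
E^{(0)} + E^{(1)}_3 &= E^{(0)} + J_{1s2p} - K_{1s2p} = -55.7\ \text{eV} \\
E^{(0)} + E^{(1)}_4 &= E^{(0)} + J_{1s2p} + K_{1s2p} = -53.9\ \text{eV} .
\end{aligned}
$$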
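These numbers are easy to reproduce. The following minimal Python sketch takes the four rational coefficients quoted above as given (it does not evaluate the integrals themselves) and assumes $e^2/2a_0 = 13.606$ eV; the dictionary names are introduced here for illustration only.

```python
from fractions import Fraction

RYDBERG_EV = 13.606      # e^2/(2 a0) in eV, the value used in the text
Z = 2                    # nuclear charge of helium
E0 = -5 * RYDBERG_EV     # zeroth-order energy from (2.28)

# Coefficients of Z e^2/a0 quoted for the Coulomb and exchange integrals
COEFFICIENTS = {
    "J1s2s": Fraction(17, 81),
    "J1s2p": Fraction(59, 243),
    "K1s2s": Fraction(16, 729),
    "K1s2p": Fraction(112, 6561),
}

# Z e^2/a0 = 2 Z (e^2/2a0)
integrals = {name: float(c) * 2 * Z * RYDBERG_EV for name, c in COEFFICIENTS.items()}

levels = {
    "1s2s, J - K": E0 + integrals["J1s2s"] - integrals["K1s2s"],
    "1s2s, J + K": E0 + integrals["J1s2s"] + integrals["K1s2s"],
    "1s2p, J - K": E0 + integrals["J1s2p"] - integrals["K1s2p"],
    "1s2p, J + K": E0 + integrals["J1s2p"] + integrals["K1s2p"],
}

for name, value in integrals.items():
    print(f"{name} = {value:.2f} eV")
for name, value in levels.items():
    print(f"{name}: {value:.1f} eV")
```

The output also makes the level ordering explicit: to first order the lower 1s2p level ($J - K$) lies slightly below the upper 1s2s level ($J + K$).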

(See Figure 5 below.) The first-order energy corrections place the lower 1s2p level below the upper 1s2s level, which disagrees with the actual helium spectrum. This is due to the neglect of higher-order corrections. Since the electron-electron repulsion is not a small quantity, this is not surprising.

Figure 5: The first excited levels of the helium atom. (Energy-level diagram: the unperturbed level $E^{(0)} = -68.0$ eV is raised by the Coulomb integrals $J_s$ and $J_p$ and split by the exchange integrals $K_s$ and $K_p$, giving the 1s2s levels at $-57.8$ eV and $-55.4$ eV and the 1s2p levels at $-55.7$ eV and $-53.9$ eV.)

Finally, let us look at the sources of the degeneracy of the original eight zeroth-order wave functions and the reason for the partial lifting of this degeneracy. There are three types of degeneracy to consider: (1) The degeneracy between states with the same $n$ but different values of $l$. The 2s and 2p functions have the same energy. (2) The degeneracy between wave functions with the same $n$ and $l$ but different values of $m_l$. The $2p_x$, $2p_y$ and $2p_z$ functions have the same energy. (This could just as well have been the $2p_0$, $2p_1$ and $2p_{-1}$ complex functions.) (3) There is an exchange degeneracy between functions that differ only in the exchange of electrons between the orbitals. For example, $\psi^{(0)}_1 = 1s(1)2s(2)$ and $\psi^{(0)}_2 = 1s(2)2s(1)$ have the same energy.

By introducing the electron-electron perturbation $H' = e^2/r_{12}$ we removed the degeneracy associated with $l$ and the exchange degeneracy, but not the degeneracy due to $m_l$. To understand the reason for the lifting of the $l$ degeneracy, realize that a 2s electron has a greater probability than a 2p electron of being closer to the nucleus than a 1s electron, and hence a 2s electron is not as effectively shielded from the nucleus by the 1s electrons as a 2p electron is. Since the energy levels are given by

$$E = -\frac{Z^2}{n^2}\frac{e^2}{2a_0}$$

we see that a larger nuclear charge means a lower energy, and hence the 2s electron has a lower energy than the 2p electron. This is also evident from the Coulomb integrals, where we see that $J_{1s2s}$ is less than $J_{1s2p}$. These integrals represent the electrostatic repulsion of their respective charge distributions: when the 2s electron penetrates the 1s charge distribution it only feels a repulsion due to the unpenetrated portion of the 1s distribution. Therefore the 1s-2s electrostatic repulsion is less than the 1s-2p repulsion, and the 1s2s levels lie below the 1s2p levels. So we see that the interelectronic repulsion in many-electron atoms lifts the $l$ degeneracy, and the orbital energies for the same value of $n$ increase with increasing $l$.

To understand the removal of the exchange degeneracy, note that the original zeroth-order wave functions specified which electron went into which orbital. Since the secular determinant wasn’t diagonal, these couldn’t have been the correct zeroth-order wave functions. In fact, the correct zeroth-order wave functions do not assign a specific electron to a specific orbital, as is evident from the form of each $\phi^{(0)}_i$. This is a consequence of the indistinguishability of identical particles, and will be discussed at length a little later in this course. Since, for example, $\phi^{(0)}_1$ and $\phi^{(0)}_2$ have different energies, the exchange degeneracy is removed by using the correct zeroth-order wave functions.

2.4 Spin–Orbit Coupling and the Hydrogen Atom Fine Structure

The Hamiltonian

$$H_0 = -\frac{\hbar^2}{2m}\left(\frac{\partial^2}{\partial r^2} + \frac{2}{r}\frac{\partial}{\partial r}\right) + \frac{1}{2mr^2}L^2 - \frac{e^2}{r} \tag{2.34}$$

used to derive the hydrogen atom wave functions $\psi_{nlm}$ that we have worked with so far consists of the kinetic energy of the electron plus the potential energy of the Coulomb force binding the electron and proton together. (Recall that in this equation, $m$ is really the reduced mass $m = m_e M_p/(m_e + M_p) \approx m_e$.) While this works very well, the actual Hamiltonian is somewhat more complicated than this. In this section we derive an additional term in the Hamiltonian that is due to a coupling between the orbital angular momentum $\mathbf{L}$ and the spin angular momentum $\mathbf{S}$.

The discussion that follows is a somewhat heuristic approach to deriving an interaction term that agrees with experiment. You shouldn’t take the physical picture too seriously. However, the basic idea is simple enough. From the point of view of the electron, the moving nucleus (i.e., a proton) generates a current that is the source of a magnetic field $\mathbf{B}$. This current is proportional to the electron’s angular momentum $\mathbf{L}$. The interaction energy of a magnetic moment $\boldsymbol{\mu}$ with this magnetic field is $-\boldsymbol{\mu}\cdot\mathbf{B}$. Since the magnetic moment of an electron is proportional to its spin $\mathbf{S}$, we see that the interaction energy will be proportional to $\mathbf{L}\cdot\mathbf{S}$.

With the above disclaimer, the interaction term we are looking for is due to the fact that from the point of view of the electron, the moving hydrogen nucleus (the proton) forms a current, and thus generates a magnetic field. From special relativity, we know that the electric and magnetic fields are related by a Lorentz transformation so that

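$$\mathbf{B}_\perp = \gamma(\mathbf{B}'_\perp + \boldsymbol{\beta}\times\mathbf{E}') \qquad \mathbf{B}_\parallel = \mathbf{B}'_\parallel$$

$$\mathbf{E}_\perp = \gamma(\mathbf{E}'_\perp - \boldsymbol{\beta}\times\mathbf{B}') \qquad \mathbf{E}_\parallel = \mathbf{E}'_\parallel$$

where $\boldsymbol{\beta} = \mathbf{v}/c$ is the velocity of the primed frame with respect to the unprimed frame, $\gamma = (1 - \beta^2)^{-1/2}$ and $\perp$, $\parallel$ refer to the components perpendicular or parallel to $\boldsymbol{\beta}$.

(Figure: the unprimed frame $O$ and the primed frame $O'$, which moves with velocity $\boldsymbol{\beta}$ relative to $O$.)

We let the primed frame be the proton rest frame, and note that there is no $\mathbf{B}'$ field in the proton’s frame due to the proton itself. Also, if $\beta \ll 1$, then $\gamma \approx 1$ and we then have

$$\mathbf{B} = \boldsymbol{\beta}\times\mathbf{E}' \qquad\text{and}\qquad \mathbf{E} = \mathbf{E}' .$$

If $\mathbf{v}$ is the electron’s velocity with respect to the lab (or the proton), then $\boldsymbol{\beta} = -\mathbf{v}/c$ so the field felt by the electron is

$$\mathbf{B} = -\frac{\mathbf{v}}{c}\times\mathbf{E}' . \tag{2.35}$$

The electric field $\mathbf{E}'$ due to the proton is

$$\mathbf{E}' = \frac{e}{r^2}\,\hat{\mathbf{r}} = \frac{e}{r^3}\,\mathbf{r} \tag{2.36}$$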

where $e > 0$ and $\mathbf{r}$ is the position vector from the proton to the electron.

From basic electrodynamics, we know that the energy of a particle with magnetic moment $\boldsymbol{\mu}$ in a magnetic field $\mathbf{B}$ is given by (see the end of this section)

$$W = -\boldsymbol{\mu}\cdot\mathbf{B} \tag{2.37}$$

so we need to know $\boldsymbol{\mu}$. Consider a particle of charge $q$ moving in a circular orbit. It forms an effective current

$$I = \frac{\Delta q}{\Delta t} = \frac{q}{2\pi r/v} = \frac{qv}{2\pi r} .$$

By definition, the magnetic moment has magnitude

$$\mu = \frac{I}{c}\times\text{area} = \frac{qv}{2\pi rc}\cdot\pi r^2 = \frac{qvr}{2c} .$$

But the angular momentum of the particle is $L = mvr$ so we conclude that the magnetic moment due to orbital motion is

$$\boldsymbol{\mu}_l = \frac{q}{2mc}\,\mathbf{L} . \tag{2.38}$$

The ratio of $\mu$ to $L$ is called the gyromagnetic ratio.

While the above derivation of (2.38) was purely classical, we know that the electron also possesses an intrinsic spin angular momentum. Let us hypothesize that the electron magnetic moment associated with this spin is of the form

$$\boldsymbol{\mu}_s = g\,\frac{-e}{2mc}\,\mathbf{S} .$$

The constant $g$ is found by experiment to be very close to 2. (However, the relativistic Dirac equation predicts that $g$ is exactly 2. Higher order corrections in quantum electrodynamics predict a slightly different value, and the measurement of $g - 2$ is one of the most accurate experimental results in all of physics.)

So we now have the electron magnetic moment given by

$$\boldsymbol{\mu}_s = \frac{-e}{mc}\,\mathbf{S} \tag{2.39}$$

and hence the interaction energy of the electron with the magnetic field of the proton is (using equations (2.35) and (2.36))

$$W = -\boldsymbol{\mu}_s\cdot\mathbf{B} = +\frac{e}{mc}\,\mathbf{S}\cdot\mathbf{B} = -\frac{e}{mc}\,\mathbf{S}\cdot\left(\frac{e}{r^3c}\,\mathbf{v}\times\mathbf{r}\right) = \frac{e^2}{m^2c^2r^3}\,\mathbf{S}\cdot(\mathbf{r}\times\mathbf{p})$$

or

$$W = \frac{e^2}{m^2c^2r^3}\,\mathbf{S}\cdot\mathbf{L} . \tag{2.40}$$

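Alternatively, we can write $W$ in another form as follows. If we assume that the electron moves in a spherically symmetric potential field, then the force $-e\mathbf{E}$ on the electron may be written as the negative gradient of this potential energy:

$$-e\mathbf{E} = -\nabla V(r) = -\frac{dV}{dr}\,\hat{\mathbf{r}} = -\frac{\mathbf{r}}{r}\frac{dV}{dr} .$$

Using this in (2.35) we have

$$\mathbf{B} = -\frac{\mathbf{v}}{c}\times\mathbf{r}\,\frac{1}{er}\frac{dV}{dr} = \frac{1}{mc}\,\mathbf{r}\times\mathbf{p}\,\frac{1}{er}\frac{dV}{dr}$$

and hence

$$W = -\boldsymbol{\mu}_s\cdot\mathbf{B} = \frac{e}{m^2c^2}\,\mathbf{S}\cdot(\mathbf{r}\times\mathbf{p})\,\frac{1}{er}\frac{dV}{dr}$$

or

$$W = \frac{1}{m^2c^2}\,\frac{1}{r}\frac{dV}{dr}\,\mathbf{S}\cdot\mathbf{L} . \tag{2.41}$$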

However, we have made one major mistake. The classical equation that leads to (2.37) is

$$\frac{d\mathbf{L}}{dt} = \mathbf{N} = \boldsymbol{\mu}\times\mathbf{B} \tag{2.42}$$

where $\mathbf{L}$ is the angular momentum of the particle in its rest frame, $\mathbf{N}$ is the applied torque, and $\mathbf{B}$ is the magnetic field in that frame. But this only applies if the electron’s rest frame isn’t rotating. If it is, then the left side of this equation isn’t valid (i.e., it isn’t equal to only the applied torque), and we must use the correct (operator) expression from classical mechanics:

$$\left(\frac{d}{dt}\right)_{\text{lab}} = \left(\frac{d}{dt}\right)_{\text{rot}} + \boldsymbol{\omega}\times . \tag{2.43}$$

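(If you don’t know this result, I will derive it at the end of this section so you can see what is going on and why.)

For the electron, (2.42) gives $d\mathbf{S}/dt$ in the lab frame, so in the electron’s frame we must use

$$\left(\frac{d\mathbf{S}}{dt}\right)_{\text{rot}} = \left(\frac{d\mathbf{S}}{dt}\right)_{\text{lab}} - \boldsymbol{\omega}_T\times\mathbf{S} \tag{2.44}$$

where $\boldsymbol{\omega}_T$ is called the Thomas precessional frequency. Thus we see that the change in the spin angular momentum of the electron, $(d\mathbf{S}/dt)_{\text{rot}}$, is given by the change due to the applied torque $\boldsymbol{\mu}\times\mathbf{B}$ minus an effect due to the rotation of the coordinate system:

$$\left(\frac{d\mathbf{S}}{dt}\right)_{\text{rot}} = \boldsymbol{\mu}\times\mathbf{B} - \boldsymbol{\omega}_T\times\mathbf{S} = -\frac{e}{mc}\,\mathbf{S}\times\mathbf{B} + \mathbf{S}\times\boldsymbol{\omega}_T$$

or

$$\left(\frac{d\mathbf{S}}{dt}\right)_{\text{rot}} = \mathbf{S}\times\left(-\frac{e\mathbf{B}}{mc} + \boldsymbol{\omega}_T\right) . \tag{2.45}$$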

This is the analogue of (2.42), so the analogue of (2.37) is

$$W = -\mathbf{S}\cdot\left(-\frac{e\mathbf{B}}{mc} + \boldsymbol{\omega}_T\right) = \frac{e}{mc}\,\mathbf{S}\cdot\mathbf{B} - \mathbf{S}\cdot\boldsymbol{\omega}_T . \tag{2.46}$$

Note that the first term is what we already calculated in equation (2.40). What we need to know is the Thomas factor $\mathbf{S}\cdot\boldsymbol{\omega}_T$. This is not a particularly easy calculation to do exactly, so we will give a very simplified derivation. (See Jackson, Classical Electrodynamics, Chapter 11 if you want a careful derivation.)

Basically, Thomas precession can be attributed to time dilation, i.e., observers on the electron and proton disagree on the time required for one particle to make a revolution about the other. Let $T$ be the time required for a revolution according to the electron, and let it be $T'$ according to the proton. Then $T' = \gamma T$ where $\gamma = (1 - \beta^2)^{-1/2}$. (Note that a circular orbit means an acceleration, so even this isn’t really correct.) Then the electron and proton each measure orbital angular velocities of $2\pi/T$ and $2\pi/T'$ respectively.

To the electron, its spin $\mathbf{S}$ maintains its direction in space, but to the proton, it appears to precess at a rate equal to the difference in angular velocities, or

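$$\omega_T = \frac{2\pi}{T} - \frac{2\pi}{T'} = 2\pi\left(\frac{1}{T} - \frac{1}{T'}\right) = 2\pi\left(\frac{\gamma}{T'} - \frac{1}{T'}\right) = \frac{2\pi}{T'}\left[(1 - \beta^2)^{-1/2} - 1\right] \approx \frac{2\pi}{T'}\,\frac{\beta^2}{2} .$$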
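To get a feel for how good the last approximation is, here is a minimal Python sketch comparing $\gamma - 1$ with $\beta^2/2$. The choice $\beta \approx \alpha \approx 1/137$ for the hydrogen ground state is an assumption taken from the Bohr model rather than from the derivation above, and the function names are introduced only for illustration.

```python
from math import sqrt

ALPHA = 1 / 137  # rough electron speed in units of c (Bohr-model assumption)


def exact_factor(beta):
    """gamma - 1 = (1 - beta^2)^(-1/2) - 1."""
    return 1 / sqrt(1 - beta**2) - 1


def leading_order(beta):
    """First term of the expansion of gamma - 1 in powers of beta."""
    return beta**2 / 2


for beta in (ALPHA, 0.1, 0.5):
    exact = exact_factor(beta)
    approx = leading_order(beta)
    rel_err = (exact - approx) / exact
    print(f"beta = {beta:.4f}: gamma - 1 = {exact:.3e}, "
          f"beta^2/2 = {approx:.3e}, relative error = {rel_err:.1e}")
```

The next term in the expansion is $3\beta^4/8$, so the relative error is roughly $3\beta^2/4$. For atomic velocities it is of order $10^{-5}$, which is why keeping only $\beta^2/2$ is adequate here.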

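But in general we know that $\omega = v/r$ and hence

$$\frac{2\pi}{T'} = \frac{v}{r} = \frac{mvr}{mr^2} = \frac{L}{mr^2}$$

and therefore

$$\omega_T = \frac{L}{mr^2}\frac{\beta^2}{2} = \frac{L}{mr^2}\frac{v^2}{2c^2} = \frac{1}{2}\frac{L}{m^2c^2}\frac{1}{r}\frac{mv^2}{r} .$$

We also know that $\mathbf{F} = m\mathbf{a}$, where for circular motion we have an inward directed acceleration $a = v^2/r$. Since $\mathbf{F} = -\nabla V$, we have

$$\mathbf{F} = -\frac{mv^2}{r}\,\hat{\mathbf{r}} = -\frac{dV}{dr}\,\hat{\mathbf{r}}$$

and we can write

$$\boldsymbol{\omega}_T = \frac{1}{2}\frac{1}{m^2c^2}\frac{1}{r}\frac{dV}{dr}\,\mathbf{L} . \tag{2.47}$$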

From this we see that $\mathbf{S}\cdot\boldsymbol{\omega}_T$ is just one-half the energy given by equation (2.41), and equation (2.46) shows that it is subtracted off. Therefore the correct spin–orbit energy is given by

$$W = \frac{1}{2m^2c^2}\,\frac{1}{r}\frac{dV}{dr}\,\mathbf{L}\cdot\mathbf{S} \tag{2.48a}$$

or, from (2.40) with a slight change of notation,

$$H_{so} = \frac{e^2}{2m^2c^2r^3}\,\mathbf{L}\cdot\mathbf{S} . \tag{2.48b}$$

Calculating the spin–orbit interaction energy $E_{so}$ by finding the eigenfunctions and eigenvalues of the Hamiltonian $H = H_0 + H_{so}$ is a difficult problem. Since the effect of $H_{so}$ is small compared to $H_0$ (at least for the lighter atoms), we will estimate the value of $E_{so}$ by using first-order perturbation theory. Then the first-order energy shifts for the hydrogen atom will be the integrals

$$E^{(1)}_{so} \approx \langle\Psi|H_{so}\Psi\rangle$$

where the hydrogen atom wave functions including spin are of the form

$$\Psi = R_{nl}(r)\,Y_l^m(\theta, \varphi)\,\chi(s) .$$

From $\mathbf{J} = \mathbf{L} + \mathbf{S}$, we have $J^2 = L^2 + S^2 + 2\mathbf{L}\cdot\mathbf{S}$ so that

$$\mathbf{L}\cdot\mathbf{S} = \frac{1}{2}(J^2 - L^2 - S^2) . \tag{2.49}$$

Note that neither $\mathbf{L}$ nor $\mathbf{S}$ separately commutes with $\mathbf{L}\cdot\mathbf{S}$, but you can easily show that $\mathbf{J} = \mathbf{L} + \mathbf{S}$ does in fact commute with $\mathbf{L}\cdot\mathbf{S}$. Because of this, we can choose our states to be simultaneous eigenfunctions of $J^2$, $J_z$, $L^2$ and $S^2$, all of which commute with $H$.

Since $Y_l^m$ is an eigenfunction of $L_z$ and $\chi$ is an eigenfunction of $S_z$, the wave function $Y_l^m\chi$ is an eigenfunction of $J_z = L_z + S_z$ but not of $J^2$. However, by the usual addition of angular momentum problem, in this case $\mathbf{L}$ and $\mathbf{S}$, we can construct simultaneous eigenfunctions $\psi$ of $J^2$, $J_z$, $L^2$ and $S^2$. In this case we have $s = 1/2$, so we know that the resulting possible $j$ values are $j = l - 1/2,\ l + 1/2$. The reason we want to do this is because there are $2(2l + 1)$ degenerate levels for a given $n$ and $l$, where the additional factor of 2 comes from the two possible spin orientations.

Let us assume that we have constructed these eigenfunctions, and we now denote the hydrogen atom wave functions by

$$\Psi = R_{nl}(r)\,\psi(\theta, \varphi, s)$$

where, by (2.49)

$$\mathbf{L}\cdot\mathbf{S}\,\psi = \frac{\hbar^2}{2}[j(j+1) - l(l+1) - s(s+1)]\,\psi = \frac{\hbar^2}{2}\left[j(j+1) - l(l+1) - \frac{3}{4}\right]\psi .$$

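Using this, our first-order energy estimate becomes

$$E^{(1)}_{so} \approx \left\langle R_{nl}\psi\left|\frac{e^2}{2m^2c^2}\frac{1}{r^3}\,\mathbf{L}\cdot\mathbf{S}\right|R_{nl}\psi\right\rangle = \frac{e^2\hbar^2}{4m^2c^2}\left[j(j+1) - l(l+1) - \frac{3}{4}\right]\left\langle R_{nl}\left|\frac{1}{r^3}\right|R_{nl}\right\rangle \tag{2.50}$$

where

$$\left\langle R_{nl}\psi\left|\frac{1}{r^3}\right|R_{nl}\psi\right\rangle = \left\langle R_{nl}\left|\frac{1}{r^3}\right|R_{nl}\right\rangle$$

because $\langle\psi|\psi\rangle = 1$. The integral in (2.50) is not at all hard to do if you use some clever tricks. I will show how to do it at the end of this section, and the answer is

$$\left\langle R_{nl}\left|\frac{1}{r^3}\right|R_{nl}\right\rangle = \frac{1}{a_0^3\,n^3\,l(l+1/2)(l+1)} \tag{2.51}$$

where the Bohr radius is

$$a_0 = \frac{\hbar^2}{me^2} = \frac{\hbar}{mc\alpha} \tag{2.52}$$

and the fine structure constant is

$$\alpha = \frac{e^2}{\hbar c} \approx \frac{1}{137} . \tag{2.53}$$

Note that for $l = 0$ we also have $\mathbf{L}\cdot\mathbf{S} = 0$ anyway, so there is no spin–orbit energy.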

Recall that the energy corresponding to $H_0$ is

$$E^{(0)}_n = -\frac{me^4}{2\hbar^2n^2} = -\frac{mc^2\alpha^2}{2n^2} \tag{2.54a}$$

or

$$E^{(0)}_n = \frac{E^{(0)}_1}{n^2} = \frac{-13.6\ \text{eV}}{n^2} . \tag{2.54b}$$

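Combining (2.50) and (2.51) we have

$$E^{(1)}_{so} = \frac{e^2\hbar^2}{4m^2c^2a_0^3n^3}\,\frac{[j(j+1) - l(l+1) - 3/4]}{l(l+1/2)(l+1)} = \frac{\left|E^{(0)}_n\right|\alpha^2}{2n}\,\frac{[j(j+1) - l(l+1) - 3/4]}{l(l+1/2)(l+1)} . \tag{2.55}$$

Since $j = l \pm 1/2$, this gives us the two corrections to the energy

$$E^{(1)}_{so} = \frac{\left|E^{(0)}_n\right|\alpha^2}{n}\left[\frac{1}{(2l+1)(l+1)}\right] \qquad\text{for } j = l + 1/2 \text{ and } l \neq 0 \tag{2.56a}$$

$$E^{(1)}_{so} = -\frac{\left|E^{(0)}_n\right|\alpha^2}{n}\left[\frac{1}{l(2l+1)}\right] \qquad\text{for } j = l - 1/2 \text{ and } l \neq 0 . \tag{2.56b}$$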
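The step from (2.55) to (2.56) is pure algebra, and it can be checked with exact rational arithmetic. The minimal sketch below works in units of $|E^{(0)}_n|\alpha^2/n$; the function names `so_general` and `so_closed_form` are hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def so_general(l, j):
    """Right side of (2.55) in units of |E_n| alpha^2 / n (requires l >= 1)."""
    bracket = j * (j + 1) - l * (l + 1) - Fraction(3, 4)
    return HALF * bracket / (l * (l + HALF) * (l + 1))


def so_closed_form(l, j):
    """Equations (2.56a) and (2.56b) in the same units (requires l >= 1)."""
    if j == l + HALF:
        return Fraction(1, (2 * l + 1) * (l + 1))
    return -Fraction(1, l * (2 * l + 1))


for l in range(1, 5):
    for j in (l + HALF, l - HALF):
        assert so_general(l, j) == so_closed_form(l, j)
        print(f"l = {l}, j = {j}: {so_closed_form(l, j)}")
```

The check works because the bracket $j(j+1) - l(l+1) - 3/4$ reduces to $l$ for $j = l + 1/2$ and to $-(l+1)$ for $j = l - 1/2$, which cancels one factor in the denominator of (2.55).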

There is yet another correction to the hydrogen atom energy levels due to the relativistic contribution to the kinetic energy of the electron. The kinetic energy is really the difference between the total relativistic energy $E = (p^2c^2 + m^2c^4)^{1/2}$ and the rest energy $mc^2$. To order $p^4$ this is

$$T = (p^2c^2 + m^2c^4)^{1/2} - mc^2 \approx \frac{p^2}{2m} - \frac{p^4}{8m^3c^2} .$$
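A quick numerical comparison shows how well this truncated expansion works for small momenta. The sketch below uses units with $m = c = 1$ purely for convenience (an assumption of the example, not of the text); the function names are illustrative.

```python
from math import sqrt


def kinetic_exact(p, m=1.0, c=1.0):
    """Exact relativistic kinetic energy."""
    return sqrt(p**2 * c**2 + m**2 * c**4) - m * c**2


def kinetic_expanded(p, m=1.0, c=1.0):
    """Expansion to order p^4."""
    return p**2 / (2 * m) - p**4 / (8 * m**3 * c**2)


for p in (0.01, 0.1, 0.3):
    exact = kinetic_exact(p)
    approx = kinetic_expanded(p)
    print(f"p = {p}: exact = {exact:.10f}, expanded = {approx:.10f}, "
          f"difference = {exact - approx:.2e}")
```

The difference is dominated by the next term of the series, $+p^6/16m^5c^4$, so it falls off very rapidly as $p/mc$ decreases.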

Since the Hamiltonian is the sum of kinetic and potential energies, we see from this that the term

$$H_{rel} = -\frac{p^4}{8m^3c^2} \tag{2.57}$$

may be treated as a perturbation to the states $\psi_{nlm}$.

While the states $\psi_{nlm}$ are in general degenerate, in this case we don’t have to worry about it. The reason is that $H_{rel}$ is rotationally invariant, so it’s already diagonal in the $\psi_{nlm}$ basis, and that is precisely what the zeroth-order wave functions $\varphi^{(0)}_n$ accomplish (see equation (2.22)). Therefore we can use simple first-order perturbation theory so that

$$E^{(1)}_{rel} = -\frac{1}{8m^3c^2}\langle\psi_{nlm}|p^4|\psi_{nlm}\rangle .$$

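Using $H_0 = p^2/2m - e^2/r$ we can write

$$p^4 = 4m^2\left(\frac{p^2}{2m}\right)^2 = 4m^2\left(H_0 + \frac{e^2}{r}\right)^2$$

and therefore

$$E^{(1)}_{rel} = -\frac{1}{2mc^2}\left[(E^{(0)}_n)^2 + 2E^{(0)}_n e^2\left\langle\frac{1}{r}\right\rangle + e^4\left\langle\frac{1}{r^2}\right\rangle\right]$$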

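where $\langle\,\cdot\,\rangle$ is shorthand for $\langle\psi_{nlm}|\cdot|\psi_{nlm}\rangle$. These integrals are not hard to evaluate (see the end of this section), and the result (in different forms) is

$$E^{(1)}_{rel} = -\frac{(E^{(0)}_n)^2}{2mc^2}\left[-3 + \frac{4n}{l + 1/2}\right] = -\frac{\left|E^{(0)}_n\right|\alpha^2}{n^2}\left[-\frac{3}{4} + \frac{n}{l + 1/2}\right] = -\frac{1}{2}mc^2\alpha^4\left[-\frac{3}{4n^4} + \frac{1}{n^3(l + 1/2)}\right] . \tag{2.58}$$

Adding equations (2.56) and (2.58) we obtain the fine structure energy shift

$$E^{(1)}_{fs} = -\frac{mc^2\alpha^4}{2n^3}\left[-\frac{3}{4n} + \frac{1}{j + 1/2}\right] = -\frac{\left|E^{(0)}_n\right|\alpha^2}{n^2}\left[-\frac{3}{4} + \frac{n}{j + 1/2}\right] \tag{2.59}$$

which is valid for both $j = l \pm 1/2$. This is the first-order energy correction due to the “fine structure Hamiltonian”

$$H_{fs} = H_{so} + H_{rel} . \tag{2.60}$$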
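That the $l$ dependence really drops out of the sum of (2.56) and (2.58) can be confirmed with exact fractions. The sketch below works in units of $|E^{(0)}_n|\alpha^2/n^2$ and only covers $l \geq 1$, since (2.56) is stated for $l \neq 0$; the function names are assumptions made for this illustration.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def spin_orbit(n, l, j):
    """(2.56a) or (2.56b) in units of |E_n| alpha^2 / n^2, for l >= 1."""
    if j == l + HALF:
        return Fraction(n, (2 * l + 1) * (l + 1))
    return -Fraction(n, l * (2 * l + 1))


def relativistic(n, l):
    """(2.58) in the same units."""
    return -(Fraction(-3, 4) + n / (l + HALF))


def fine_structure(n, j):
    """(2.59) in the same units."""
    return -(Fraction(-3, 4) + n / (j + HALF))


for n in range(2, 6):
    for l in range(1, n):
        for j in (l + HALF, l - HALF):
            total = spin_orbit(n, l, j) + relativistic(n, l)
            assert total == fine_structure(n, j)

print(fine_structure(2, Fraction(3, 2)))  # shift of the n = 2, j = 3/2 level: -1/4
print(fine_structure(2, Fraction(1, 2)))  # shift of the n = 2, j = 1/2 level: -5/4
```

States with the same $n$ and $j$ but different $l$ therefore receive the same shift to this order, for example the $l = 1$ and $l = 2$ states with $n = 3$, $j = 3/2$.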

2.4.1 Supplement: Miscellaneous Proofs

Now let’s go back and prove several miscellaneous results stated in this section. The first thing we want to show is that the energy of a magnetic moment in a uniform magnetic field is given by $-\boldsymbol{\mu}\cdot\mathbf{B}$ where $\boldsymbol{\mu}$ for a loop of area $A$ carrying current $I$ is defined to have magnitude $IA$ and pointing perpendicular to the loop in the direction of your thumb if the fingers of your right hand are along the direction of the current. To see this, we simply calculate the work required to rotate a current loop from its equilibrium position to the desired orientation.

Consider Figure 6 below, where the current flows counterclockwise out of the page at the bottom and into the page at the top. Let the loop have length $a$ on the sides and $b$ across the top and bottom, so its area is $ab$. The magnetic force on a current-carrying wire is

$$\mathbf{F}_B = \int I\,d\mathbf{l}\times\mathbf{B}$$

and hence the forces on the opposite “$a$ sides” of the loop cancel, and the force on the top and bottom “$b$ sides” is $F_B = IbB$. The equilibrium position of the loop is horizontal, so the potential energy of the loop is the work required to rotate it from $\theta = 0$ to some value $\theta$. This work is given by $W = \int\mathbf{F}\cdot d\mathbf{r}$ where $\mathbf{F}$ is the force that I must apply against the magnetic field to rotate the loop.

Figure 6: A current loop in a uniform magnetic field. (The sketch shows the field $\mathbf{B}$, the forces $\mathbf{F}_B$ acting on the top and bottom sides at distance $a/2$ from the axis, the moment $\boldsymbol{\mu}$ and the angle $\theta$.)

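Since the loop is rotating, the force I must apply at the top of the loop is in the direction of $\boldsymbol{\mu}$ and perpendicular to the loop, and hence has magnitude $F_B\cos\theta$. Then the work I do is (the factor of 2 takes into account both the top and bottom sides)

$$W = \int\mathbf{F}\cdot d\mathbf{r} = 2\int F_B\cos\theta\,(a/2)\,d\theta = IabB\int_0^\theta\cos\theta\,d\theta = \mu B\sin\theta .$$

But note that $\boldsymbol{\mu}\cdot\mathbf{B} = \mu B\cos(90^\circ + \theta) = -\mu B\sin\theta$, and therefore

$$W = -\boldsymbol{\mu}\cdot\mathbf{B} . \tag{2.61}$$

In this derivation, I never explicitly mentioned the torque on the loop due to $\mathbf{B}$. However, we see that

$$\|\mathbf{N}\| = \|\mathbf{r}\times\mathbf{F}_B\| = 2(a/2)F_B\sin(90^\circ + \theta) = IabB\sin(90^\circ + \theta) = \mu B\sin(90^\circ + \theta) = \|\boldsymbol{\mu}\times\mathbf{B}\|$$

and therefore

$$\mathbf{N} = \boldsymbol{\mu}\times\mathbf{B} . \tag{2.62}$$

Note that $W = \int\|\mathbf{N}\|\,d\theta$.

Next I will prove equation (2.43). Let $\mathbf{A}$ be a vector as seen in both the rotating and lab frames, and let $\mathbf{e}_i$ be a fixed basis in the rotating frame. Then (using the summation convention) $\mathbf{A} = A^i\mathbf{e}_i$ so that

$$\frac{d\mathbf{A}}{dt} = \frac{d}{dt}(A^i\mathbf{e}_i) = \frac{dA^i}{dt}\,\mathbf{e}_i + A^i\frac{d\mathbf{e}_i}{dt} .$$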

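Now $(dA^i/dt)\,\mathbf{e}_i$ is the rate of change of $\mathbf{A}$ with respect to the rotating frame, so we have

$$\frac{dA^i}{dt}\,\mathbf{e}_i = \left(\frac{d\mathbf{A}}{dt}\right)_{\text{rot}} .$$

And $\mathbf{e}_i$ is a fixed basis vector in the frame that is rotating with respect to the lab frame. Then, just like any vector rotating in the lab with angular velocity $\boldsymbol{\omega}$, we have

$$\frac{d\mathbf{e}_i}{dt} = \boldsymbol{\omega}\times\mathbf{e}_i .$$

(See the figure below. Here $\omega = d\phi/dt$, and $dv = v\sin\theta\,d\phi$ so $dv/dt = v\sin\theta\,\omega$ or $d\mathbf{v}/dt = \boldsymbol{\omega}\times\mathbf{v}$.)

(Figure: a vector $\mathbf{v}$ at angle $\theta$ to the rotation axis $\boldsymbol{\omega}$, rotated through the angle $d\phi$ and changing by $d\mathbf{v}$.)

Then

$$A^i\frac{d\mathbf{e}_i}{dt} = A^i\boldsymbol{\omega}\times\mathbf{e}_i = \boldsymbol{\omega}\times A^i\mathbf{e}_i = \boldsymbol{\omega}\times\mathbf{A} .$$

Putting this all together we have

$$\left(\frac{d\mathbf{A}}{dt}\right)_{\text{lab}} = \left(\frac{d\mathbf{A}}{dt}\right)_{\text{rot}} + \boldsymbol{\omega}\times\mathbf{A} .$$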

Equation (2.43) is just the ‘operator’ version of this result.

Finally, let me show how to evaluate the integrals $\langle 1/r \rangle$, $\langle 1/r^2 \rangle$ and $\langle 1/r^3 \rangle$ where the expectation values are taken with respect to the hydrogen atom wave functions $\psi_{nlm}$.

First, instead of $\langle 1/r \rangle$, consider $\langle \lambda/r \rangle$. This can be interpreted as the first-order correction to the energy due to the perturbation $\lambda/r$. But $H_0 = T + V = T - e^2/r$, so $H = H_0 + H' = H_0 + \lambda/r = T - (e^2 - \lambda)/r$, and this is just our original problem if we replace $e^2$ by $e^2 - \lambda$ everywhere. In particular, the exact energy solution is then

$$E_n(\lambda) = -\frac{m(e^2 - \lambda)^2}{2\hbar^2 n^2} = -\frac{me^4}{2\hbar^2 n^2} + \lambda \frac{me^2}{\hbar^2 n^2} - \lambda^2 \frac{m}{2\hbar^2 n^2} .$$

But another way of looking at this is as the expansion of $E_n(\lambda)$ given in (2.3b):

$$E_n = E_n^{(0)} + \lambda \left(\frac{dE_n}{d\lambda}\right)_{\lambda=0} + \frac{\lambda^2}{2!}\left(\frac{d^2E_n}{d\lambda^2}\right)_{\lambda=0} + \cdots = E_n^{(0)} + \lambda E_n^{(1)} + \lambda^2 E_n^{(2)} + \cdots$$

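where the first-order correction $\lambda E_n^{(1)} = \langle H' \rangle$ is just the term linear in $\lambda$. Therefore, letting $\lambda \to 1$, we have $\langle 1/r \rangle = \langle H' \rangle = me^2/\hbar^2 n^2$ or

$$\left\langle \frac{1}{r} \right\rangle = \frac{1}{a_0 n^2} . \tag{2.63}$$

Note that if you have the exact solution $E_n(\lambda)$, you can obtain the first-order correction by simply evaluating $\lambda (dE_n/d\lambda)_{\lambda=0}$.

Before continuing, let me rewrite the hydrogen atom Hamiltonian as follows:

$$H_0 = -\frac{\hbar^2}{2m}\left(\frac{\partial^2}{\partial r^2} + \frac{2}{r}\frac{\partial}{\partial r}\right) + \frac{1}{2mr^2}L^2 - \frac{e^2}{r} = \frac{p_r^2}{2m} + \frac{L^2}{2mr^2} - \frac{e^2}{r} \tag{2.64}$$

where I have defined the “radial momentum” $p_r$ by

$$p_r = -i\hbar\left(\frac{\partial}{\partial r} + \frac{1}{r}\right) .$$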

Now consider $\langle \lambda/r^2 \rangle$. Again, letting $H = H_0 + H' = H_0 + \lambda/r^2$, we can still solve the problem exactly because all we are doing is modifying the centrifugal term

$$\frac{L^2}{2mr^2} \to \frac{L^2 + 2m\lambda}{2mr^2} \to \frac{\hbar^2 l(l+1) + 2m\lambda}{2mr^2} = \frac{\hbar^2 l'(l'+1)}{2mr^2}$$

where $l' = l'(\lambda)$ is a function of $\lambda$. (Just write $\hbar^2 l'(l'+1) = \hbar^2 l(l+1) + 2m\lambda$ and use the quadratic formula to find $l'$ as a function of $\lambda$.)

Recall that the exact energies are given by

$$E_n = -\frac{me^4}{2\hbar^2 n^2} = -\frac{me^4}{2\hbar^2 (k + l + 1)^2}$$

where $k = 0, 1, 2, \ldots$ was the integer that terminated the power series solution of the radial equation. Now what we have is

$$E(l') = -\frac{me^4}{2\hbar^2 (k + l' + 1)^2} = E(\lambda) = E^{(0)} + \lambda E^{(1)} + \cdots$$

where (note $\lambda = 0$ implies $l' = l$)

$$E^{(1)} = \left.\frac{dE}{d\lambda}\right|_{\lambda=0} = \left.\frac{dl'}{d\lambda}\right|_{l'=l} \left.\frac{dE}{dl'}\right|_{l'=l} .$$

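Then from the explicit form of $E(l')$ and the definition of $n$ we have

$$\left.\frac{dE}{dl'}\right|_{l'=l} = \frac{me^4}{\hbar^2 (k + l + 1)^3} = \frac{me^4}{\hbar^2 n^3}$$

and taking the derivative of $\hbar^2 l'(l'+1) = \hbar^2 l(l+1) + 2m\lambda$ with respect to $\lambda$ yields

$$\left.\frac{dl'}{d\lambda}\right|_{l'=l} = \frac{2m}{\hbar^2}\frac{1}{2l+1} = \frac{m}{\hbar^2}\frac{1}{(l + 1/2)} .$$

Therefore

$$E^{(1)} = \frac{(me^2/\hbar^2)^2}{(l + 1/2)n^3}$$

and $\langle \lambda/r^2 \rangle = \lambda E^{(1)}$ so that

$$\left\langle \frac{1}{r^2} \right\rangle = \frac{1}{a_0^2 (l + 1/2) n^3} . \tag{2.65}$$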
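As a quick numerical sanity check of (2.63) and (2.65), the following sketch differentiates the exact energies with respect to $\lambda$ by a central difference. It assumes atomic units ($m = \hbar = e = 1$, so $a_0 = 1$); the function names are illustrative only and not part of these notes.

```python
import math

def energy_coulomb(n, lam):
    """Exact E_n(lambda) for H0 + lambda/r in atomic units (assumed m = hbar = e = 1)."""
    return -((1.0 - lam) ** 2) / (2.0 * n ** 2)

def energy_centrifugal(k, l, lam):
    """Exact E(lambda) for H0 + lambda/r^2 in atomic units.

    l' is the positive root of l'(l' + 1) = l(l + 1) + 2*lambda.
    """
    l_eff = -0.5 + math.sqrt((l + 0.5) ** 2 + 2.0 * lam)
    return -1.0 / (2.0 * (k + l_eff + 1.0) ** 2)

def derivative_at_zero(f, h=1e-5):
    """Central-difference estimate of df/dlambda at lambda = 0."""
    return (f(h) - f(-h)) / (2.0 * h)

def inv_r(n):
    """<1/r> obtained as dE_n/dlambda at lambda = 0."""
    return derivative_at_zero(lambda lam: energy_coulomb(n, lam))

def inv_r2(n, l):
    """<1/r^2> obtained as dE/dlambda at lambda = 0, with k = n - l - 1 held fixed."""
    k = n - l - 1
    return derivative_at_zero(lambda lam: energy_centrifugal(k, l, lam))

if __name__ == "__main__":
    n, l = 2, 1
    print(inv_r(n), 1.0 / n ** 2)                  # compare with (2.63)
    print(inv_r2(n, l), 1.0 / ((l + 0.5) * n ** 3))  # compare with (2.65)
```

Note that $k$, not $n$, is held fixed while $\lambda$ varies, exactly as in the derivation above.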

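The last integral to evaluate is $\langle 1/r^3 \rangle$. Since there is no term in $H_0$ that goes like $1/r^3$, we have to try something else. Note that $H_0 \psi_{nlm} = E_n \psi_{nlm}$ so that

$$\langle [H_0, p_r] \rangle = \langle \psi_{nlm} | H_0 p_r - p_r H_0 | \psi_{nlm} \rangle = E_n \langle p_r \rangle - \langle p_r \rangle E_n = 0 .$$

Using

$$\left[\frac{1}{r}, \frac{\partial}{\partial r}\right] = \frac{1}{r^2} \quad\text{and}\quad \left[\frac{1}{r^2}, \frac{\partial}{\partial r}\right] = \frac{2}{r^3}$$

(recall $[ab, c] = a[b, c] + [a, c]b$), it is easy to use (2.64) and show that

$$[H_0, p_r] = -\frac{i\hbar}{m}\frac{L^2}{r^3} + \frac{i\hbar e^2}{r^2} .$$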

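But now

$$0 = \langle [H_0, p_r] \rangle = -\frac{i\hbar}{m}\left\langle \frac{L^2}{r^3} \right\rangle + i\hbar e^2 \left\langle \frac{1}{r^2} \right\rangle = -\frac{i\hbar^3 l(l+1)}{m}\left\langle \frac{1}{r^3} \right\rangle + i\hbar e^2 \left\langle \frac{1}{r^2} \right\rangle$$

and therefore

$$\left\langle \frac{1}{r^3} \right\rangle = \frac{me^2}{\hbar^2 l(l+1)}\left\langle \frac{1}{r^2} \right\rangle$$

or

$$\left\langle \frac{1}{r^3} \right\rangle = \frac{1}{a_0 l(l+1)}\left\langle \frac{1}{r^2} \right\rangle . \tag{2.66}$$

Combining this with (2.65) we have

$$\left\langle \frac{1}{r^3} \right\rangle = \frac{1}{a_0^3 l(l+1)(l + 1/2) n^3} . \tag{2.67}$$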

2.5 The Zeeman Effect

In the previous section we studied the effect of an atomic electron’s magnetic moment interacting with the magnetic field generated by the nucleus (a proton). In this section, I want to investigate what happens when a hydrogen atom is placed in a uniform external magnetic field $\mathbf{B}$. These types of interactions are generally referred to as the Zeeman effect, and they were instrumental in the discovery of spin. (Pieter Zeeman and H.A. Lorentz shared the second Nobel prize in physics in 1902. For a very interesting summary of the history of spin, read Chapter 10 in the text Quantum Mechanics by Hendrik Hameka.)

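The hydrogen atom Hamiltonian, including fine structure, is given by

$$H = H_0 + H_{\rm fs} = H_0 + H_{\rm so} + H_{\rm rel}$$

where

$$H_0 = -\frac{\hbar^2}{2m}\left(\frac{\partial^2}{\partial r^2} + \frac{2}{r}\frac{\partial}{\partial r}\right) + \frac{1}{2mr^2}L^2 - \frac{e^2}{r} \qquad \text{(equation (2.34))}$$

$$H_{\rm so} = \frac{e^2}{2m^2c^2r^3}\,\mathbf{L} \cdot \mathbf{S} \qquad \text{(equation (2.48b))}$$

$$H_{\rm rel} = -\frac{p^4}{8m^3c^2} \qquad \text{(equation (2.57))} .$$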

(And where I’m approximating the reduced mass by the electron mass $m_e$.) The easy way to include the presence of an external field $\mathbf{B}$ is to simply add an interaction energy

$$H_{\rm mag} = -\boldsymbol{\mu}_{\rm tot} \cdot \mathbf{B}$$

where, from equations (2.38) and (2.39), we know that the total magnetic moment for a hydrogenic electron is

$$\boldsymbol{\mu}_{\rm tot} = \boldsymbol{\mu}_l + \boldsymbol{\mu}_s = -\frac{e}{2m_ec}(\mathbf{L} + 2\mathbf{S}) = -\frac{e}{2m_ec}(\mathbf{J} + \mathbf{S}) . \tag{2.68}$$

However, the correct way to arrive at this is to rewrite the Hamiltonian taking into account the presence of an electromagnetic field. For those who are interested, I work through this approach at the end of this section.

In any case, the Hamiltonian for a hydrogen atom in an external uniform magnetic field is then

$$H = H_0 + H_{\rm so} + H_{\rm rel} + H_{\rm mag} .$$

There are really three cases to consider. (I’ll ignore $H_{\rm rel}$ for now because it’s a correction to the kinetic energy and irrelevant to this discussion.) The first is when $\mathbf{B}$ is strong enough that $H_{\rm mag}$ is large relative to $H_{\rm so}$. In this case we can treat $H_{\rm so}$ as a perturbation on the states defined by $H_0 + H_{\rm mag}$, where these states are simultaneous eigenfunctions of $L^2$, $S^2$, $L_z$ and $S_z$ (rather than $J^2$ and $J_z$). The reason that $J$ is not a good quantum number is that the external field exerts a torque $\boldsymbol{\mu}_{\rm tot} \times \mathbf{B}$ on the total magnetic moment, and this is equivalent to a changing total angular momentum $d\mathbf{J}/dt$. Thus $\mathbf{J}$ is not conserved, and in fact precesses about $\mathbf{B}$. In addition, if there is a spin–orbit interaction, then this internal field causes $\mathbf{L}$ and $\mathbf{S}$ to precess about $\mathbf{J}$.

The second case is when $\mathbf{B}$ is weak and $H_{\rm so}$ dominates $H_{\rm mag}$. In this situation, $H_{\rm mag}$ is treated as a perturbation on the states defined by $H_0 + H_{\rm so}$. As we saw in our discussion of $H_{\rm so}$, in this case we must choose our states to be eigenfunctions of $L^2$, $S^2$, $J^2$ and $J_z$ because $\mathbf{L}$ and $\mathbf{S}$ are not conserved separately, even though $\mathbf{J} = \mathbf{L} + \mathbf{S}$ is conserved. (Neither $\mathbf{L}$ nor $\mathbf{S}$ alone commutes with $\mathbf{L} \cdot \mathbf{S}$, but $[J_i, \mathbf{L} \cdot \mathbf{S}] = 0$ and hence $J^2$ commutes with $H$.)

And the third and most difficult case is when $H_{\rm so}$ and $H_{\rm mag}$ are comparable in magnitude. In this “intermediate-field” situation, we must take them together and use degenerate perturbation theory to break the degeneracies of the basis states.

2.5.1 Strong External Field

Let us first consider the case where the external magnetic field is much stronger than the internal field felt by the electron due to its orbital motion. Taking $\mathbf{B} = B\hat{\mathbf{z}}$ we have

$$H_{\rm mag} = \frac{eB}{2m_ec}(L_z + 2S_z) . \tag{2.69}$$

If we first ignore spin, then the first-order correction to the hydrogen atom energy levels is

$$E^{(1)}_{nlm} = \left\langle \psi_{nlm} \left| \frac{eB}{2m_ec} L_z \right| \psi_{nlm} \right\rangle = \frac{e\hbar}{2m_ec} Bm := \mu_B Bm$$

where

$$\mu_B = \frac{e\hbar}{2m_ec} = 5.79 \times 10^{-9}\ \text{eV/gauss} = 9.27 \times 10^{-21}\ \text{erg/gauss}$$

is called the (electron) Bohr magneton. Thus we see that for a given $l$, the $(2l+1)$-fold degeneracy is lifted. For example, the 3-fold degenerate $l = 1$ state is split into three states, with an energy difference of $\mu_B B$ between states:

[Level diagram: the $l = 1$ level splits into $m = 1, 0, -1$, with adjacent levels separated by $\mu_B B$.]

This strong field case is sometimes called the Paschen-Back effect.

If we now include spin, then

$$E^{(1)}_{nlm_lm_s} = \mu_B B(m_l + 2m_s) \tag{2.70}$$

where $m_s = \pm 1/2$. This yields the further splitting (or lifting of degeneracies) sometimes called the anomalous Zeeman effect:

[Level diagram: each of the $l = 1$ levels $m_l = 1, 0, -1$ splits further according to $m_s = \pm 1/2$, giving levels with $m_l + 2m_s = 2, 1, 0, -1, -2$ separated by $\mu_B B$.]
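The level pattern implied by (2.70) can be enumerated directly. The sketch below lists the first-order shifts in units of $\mu_B B$ for a given $l$; the function name `paschen_back_levels` is a hypothetical helper introduced only for this illustration.

```python
from fractions import Fraction

def paschen_back_levels(l):
    """Map each shift m_l + 2*m_s (in units of mu_B * B) to the (m_l, m_s) pairs giving it."""
    levels = {}
    for ml in range(-l, l + 1):
        for ms in (Fraction(1, 2), Fraction(-1, 2)):
            shift = ml + 2 * ms  # equation (2.70) divided by mu_B * B
            levels.setdefault(shift, []).append((ml, ms))
    return levels

if __name__ == "__main__":
    levels = paschen_back_levels(1)
    for shift in sorted(levels, reverse=True):
        print(shift, levels[shift])
```

For $l = 1$ the six states fall onto five distinct shifts, because $(m_l, m_s) = (1, -1/2)$ and $(-1, 1/2)$ both give zero shift.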

This gives us the energy levels $E_n^{(0)} + E^{(1)}_{nlm_lm_s}$ where $E_n^{(0)}$ is given by (2.54a).

However, since the basis states we used here are just the usual hydrogen atom wave functions, it is easy to include further corrections due to both $H_{\rm so}$ and the relativistic correction $H_{\rm rel}$ discussed in Section 2.4. We simply apply first-order perturbation theory using these as the perturbing potentials. For $H_{\rm rel}$, we can simply use the result (2.58). However, we can’t just use equations (2.56) for $H_{\rm so}$ because they were derived using the eigenfunctions of $J^2$, which don’t apply when there is a strong external magnetic field.

To get around this problem, we simply calculate $\langle \psi_{nlm_lm_s} | \mathbf{L} \cdot \mathbf{S} | \psi_{nlm_lm_s} \rangle$. We have

$$\mathbf{L} \cdot \mathbf{S} = L_xS_x + L_yS_y + L_zS_z$$

where $L_x = (L_+ + L_-)/2$ and $L_y = (L_+ - L_-)/2i$ with similar results for $S_x$ and $S_y$. Using these, it is quite easy to see that the orthogonality of the eigenfunctions yields

$$\langle \psi | L_xS_x | \psi \rangle = \langle \psi | L_yS_y | \psi \rangle = 0$$

while

$$\langle \psi_{nlm_lm_s} | L_zS_z | \psi_{nlm_lm_s} \rangle = \hbar^2 m_l m_s . \tag{2.71}$$
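One way to see why the transverse terms drop out (a short worked step, using only the ladder-operator definitions quoted above) is to combine them:

$$L_xS_x + L_yS_y = \tfrac{1}{4}(L_+ + L_-)(S_+ + S_-) - \tfrac{1}{4}(L_+ - L_-)(S_+ - S_-) = \tfrac{1}{2}(L_+S_- + L_-S_+) .$$

Each surviving term changes $m_l$ by $\pm 1$ and $m_s$ by $\mp 1$, so it maps $\psi_{nlm_lm_s}$ onto a state orthogonal to it, and the diagonal matrix element of $L_xS_x + L_yS_y$ vanishes. Only $L_zS_z$ contributes to the expectation value.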

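Combining the results for $H_{\rm rel}$ and $H_{\rm so}$ we obtain the following corrections to the “unperturbed” energies $E_n^{(0)} + E^{(1)}_{nlm_lm_s}$:

$$E^{(1)}_{\rm rel} + E^{(1)}_{\rm so} = \frac{mc^2\alpha^4}{2n^3}\left[\frac{3}{4n} - \frac{1}{l + 1/2}\right] + \frac{e^2}{2m^2c^2}\,\hbar^2 m_l m_s\, \frac{1}{a_0^3 n^3 l(l+1)(l + 1/2)}$$

where we used equations (2.58), (2.48b), (2.67) and (2.71). After a little algebra, which I leave to you, we arrive at

$$E^{(1)}_{\rm rel} + E^{(1)}_{\rm so} = \frac{me^4\alpha^2}{2\hbar^2 n^3}\left\{\frac{3}{4n} - \left[\frac{l(l+1) - m_lm_s}{l(l+1)(l + 1/2)}\right]\right\} = -E_1^{(0)}\frac{\alpha^2}{n^3}\left\{\frac{3}{4n} - \left[\frac{l(l+1) - m_lm_s}{l(l+1)(l + 1/2)}\right]\right\} . \tag{2.72}$$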

2.5.2 Weak External Field

Now we turn to the second case where the external field is weak relative to the spin–orbit term. As we discussed above, now we must take our basis states to be eigenfunctions of $L^2$, $S^2$, $J^2$ and $J_z$.

For a many-electron atom, there are basically two ways to calculate the total $\mathbf{J}$. The first way is to calculate $\mathbf{L} = \sum \mathbf{L}_i$ and $\mathbf{S} = \sum \mathbf{S}_i$ and then evaluate $\mathbf{J} = \mathbf{L} + \mathbf{S}$. This is called L–S or Russell-Saunders coupling. It is applicable to the lighter elements where interelectronic repulsion energies are significantly greater than the spin–orbit interaction energies. This is because if the spin–orbit coupling is weak, then $\mathbf{L}$ and $\mathbf{S}$ “almost” commute with $H_0 + H_{\rm so}$.

The second way is to first calculate $\mathbf{J}_i = \mathbf{L}_i + \mathbf{S}_i$ so that $\mathbf{J} = \sum \mathbf{J}_i$. This is called j–j coupling. It is used for heavier elements where the electrons are moving very rapidly, and hence there is a strong spin–orbit interaction. Because of this, $\mathbf{L}$ and $\mathbf{S}$ no longer commute with $H$, even though $\mathbf{J}$ does. This type of coupling is also more difficult to use, so we will deal only with the L–S scheme.

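Here is the physical situation:

[Figure: vector diagram showing $\mathbf{B}$, $\mathbf{L}$, $\mathbf{S}$, $\mathbf{J}$ and $-\boldsymbol{\mu}_{\rm tot} \sim \mathbf{J} + \mathbf{S} = \mathbf{L} + 2\mathbf{S}$.]

Since $\mathbf{J}$ commutes with $H_0 + H_{\rm so}$, it is conserved (and hence is fixed in space), even though $\mathbf{L}$ and $\mathbf{S}$ are not. This means that $\mathbf{L}$ and $\mathbf{S}$ both precess about $\mathbf{J}$. If the applied external $\mathbf{B}$ field is much weaker than the internal field, then $\mathbf{J}$ will precess much more slowly about $\mathbf{B}$ than $\mathbf{L}$ and $\mathbf{S}$ precess about $\mathbf{J}$. We need to evaluate the correction (2.69) in first-order perturbation theory.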

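Since our basis states are eigenfunctions of $J^2$ and $J_z$ but not $L_z$ and $S_z$, we can’t directly evaluate the expectation value of $L_z + 2S_z = J_z + S_z$. The correct way to handle this is to use the Wigner-Eckart theorem, which is rather beyond the scope of this course. Instead, we will use a physical argument that gets us to the same answer.

We note that since $\mathbf{L}$ and $\mathbf{S}$ (and hence $\boldsymbol{\mu}_{\rm tot}$) precess rapidly about $\mathbf{J}$, the time average of the Hamiltonian $H^{\rm av}_{\rm mag} = -\langle \boldsymbol{\mu}_{\rm tot} \cdot \mathbf{B} \rangle$ will be the same as $-\langle \boldsymbol{\mu}_{\rm tot} \rangle \cdot \mathbf{B}$. But the average of $\boldsymbol{\mu}_{\rm tot}$ is just its component along $\mathbf{J}$, which is

$$\langle \boldsymbol{\mu}_{\rm tot} \rangle = (\boldsymbol{\mu}_{\rm tot} \cdot \hat{\mathbf{J}})\hat{\mathbf{J}} = \frac{\boldsymbol{\mu}_{\rm tot} \cdot \mathbf{J}}{J^2}\,\mathbf{J} .$$

Using $\mathbf{L} = \mathbf{J} - \mathbf{S}$ so that $L^2 = J^2 + S^2 - 2\mathbf{S} \cdot \mathbf{J}$ we have

$$(\mathbf{J} + \mathbf{S}) \cdot \mathbf{J} = J^2 + \mathbf{S} \cdot \mathbf{J} = J^2 + \frac{1}{2}(J^2 + S^2 - L^2) .$$

Then since $\mathbf{B} = B\hat{\mathbf{z}}$, we now have

$$H^{\rm av}_{\rm mag} = -B\langle \boldsymbol{\mu}_{\rm tot} \rangle \cdot \hat{\mathbf{z}} = \frac{eB}{2m_ec}\frac{(\mathbf{J} + \mathbf{S}) \cdot \mathbf{J}}{J^2} J_z = \frac{eBJ_z}{2m_ec}\left[1 + \frac{J^2 + S^2 - L^2}{2J^2}\right] .$$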

Our basis states are simultaneous eigenstates of $L^2$, $S^2$, $J^2$ and $J_z$, so the average energy $E^{\rm av}_{\rm mag}$ is given by the first-order correction

$$E^{\rm av}_{\rm mag} = \frac{e\hbar Bm_j}{2m_ec}\left[1 + \frac{j(j+1) + s(s+1) - l(l+1)}{2j(j+1)}\right] = \frac{e\hbar Bm_j}{2m_ec}\left[\frac{3}{2} + \frac{3/4 - l(l+1)}{2j(j+1)}\right] := \mu_B B m_j g_J \tag{2.73}$$

where the Landé g-factor $g_J$ is defined by

$$g_J = 1 + \frac{j(j+1) + s(s+1) - l(l+1)}{2j(j+1)} .$$
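To make the g-factor concrete, here is a minimal sketch that evaluates $g_J$ and the weak-field shift (2.73) in units of $\mu_B B$ using exact rational arithmetic. The function names are illustrative assumptions, not notation from these notes.

```python
from fractions import Fraction

def lande_g(j, l, s=Fraction(1, 2)):
    """Lande g-factor g_J = 1 + [j(j+1) + s(s+1) - l(l+1)] / [2 j(j+1)]."""
    j, l, s = Fraction(j), Fraction(l), Fraction(s)
    return 1 + (j * (j + 1) + s * (s + 1) - l * (l + 1)) / (2 * j * (j + 1))

def weak_field_shift(j, l, mj, s=Fraction(1, 2)):
    """First-order weak-field shift of (2.73), in units of mu_B * B."""
    return Fraction(mj) * lande_g(j, l, s)

if __name__ == "__main__":
    half = Fraction(1, 2)
    print(lande_g(half, 0))             # s-state, j = 1/2
    print(lande_g(3 * half, 1))         # p-state, j = 3/2
    print(lande_g(half, 1))             # p-state, j = 1/2
    print(weak_field_shift(3 * half, 1, half))
```

For an $l = 0$ state the result is $g_J = 2$, the pure spin value, as expected since then $\mathbf{J} = \mathbf{S}$.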

The total energy of a hydrogen atom in a weak uniform magnetic field is now given by the sum of the unperturbed energy $E_n^{(0)}$ (equation (2.54a)), the fine-structure correction $E^{(1)}_{\rm fs}$ (equation (2.59)) and $E^{\rm av}_{\rm mag}$ (equation (2.73)).

2.5.3 Intermediate-Field Case

Finally, we consider the intermediate-field case where the internal and external magnetic fields are approximately the same. In this situation, we must apply degenerate perturbation theory to the degenerate “unperturbed” states $\psi_{nlm_lm_s}$ by treating $H' = H_{\rm fs} + H_{\rm mag}$ as a perturbation. It is easiest to simply work out an example.

As we saw in our discussion of spin–orbit coupling, it is best to work in the basis in which our states are simultaneous eigenstates of $L^2$, $S^2$, $J^2$ and $J_z$. (The choice of basis has no effect on the eigenvalues of $H_{\rm fs} + H_{\rm mag}$, and the eigenvalues are just what we are looking for when we solve (2.21).) Let us consider the hydrogen atom state with $n = 2$, so that $l = 0, 1$. Since $s = 1/2$, the possible $j$ values are

$$\left(0 \otimes \tfrac{1}{2}\right) \oplus \left(1 \otimes \tfrac{1}{2}\right) = \tfrac{1}{2} \oplus \tfrac{3}{2} \oplus \tfrac{1}{2}$$

or $j = 1/2, 3/2, 1/2$. Our basis states $|l\, s\, j\, m_j\rangle$ are given in terms of the states $|l\, s\, m_l\, m_s\rangle$ using the appropriate Clebsch-Gordan coefficients (which you can look up or calculate for yourself).

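For $l = 0$ we have $j = 1/2$ so $m_j = \pm 1/2$ and we have the two states

$$\begin{aligned}
\psi_1 &:= \left|0\ \tfrac{1}{2}\ \tfrac{1}{2}\ \tfrac{1}{2}\right\rangle = \left|0\ \tfrac{1}{2}\ 0\ \tfrac{1}{2}\right\rangle \\
\psi_2 &:= \left|0\ \tfrac{1}{2}\ \tfrac{1}{2}\ {-\tfrac{1}{2}}\right\rangle = \left|0\ \tfrac{1}{2}\ 0\ {-\tfrac{1}{2}}\right\rangle
\end{aligned}$$

where the first state in each line is the state $|l\, s\, j\, m_j\rangle$, and the second state in each line is the linear combination of states $|l\, s\, m_l\, m_s\rangle$ with Clebsch-Gordan coefficients. (For $l = 0$ the C-G coefficients are just 1.)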

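For $l = 1$ we have the four states with $j = 3/2$ and the two states with $j = 1/2$ (which we order with a little hindsight so the determinant (2.21) turns out block diagonal):

$$\begin{aligned}
\psi_3 &:= \left|1\ \tfrac{1}{2}\ \tfrac{3}{2}\ \tfrac{3}{2}\right\rangle = \left|1\ \tfrac{1}{2}\ 1\ \tfrac{1}{2}\right\rangle \\
\psi_4 &:= \left|1\ \tfrac{1}{2}\ \tfrac{3}{2}\ {-\tfrac{3}{2}}\right\rangle = \left|1\ \tfrac{1}{2}\ {-1}\ {-\tfrac{1}{2}}\right\rangle \\
\psi_5 &:= \left|1\ \tfrac{1}{2}\ \tfrac{3}{2}\ \tfrac{1}{2}\right\rangle = \sqrt{\tfrac{2}{3}}\left|1\ \tfrac{1}{2}\ 0\ \tfrac{1}{2}\right\rangle + \sqrt{\tfrac{1}{3}}\left|1\ \tfrac{1}{2}\ 1\ {-\tfrac{1}{2}}\right\rangle \\
\psi_6 &:= \left|1\ \tfrac{1}{2}\ \tfrac{1}{2}\ \tfrac{1}{2}\right\rangle = -\sqrt{\tfrac{1}{3}}\left|1\ \tfrac{1}{2}\ 0\ \tfrac{1}{2}\right\rangle + \sqrt{\tfrac{2}{3}}\left|1\ \tfrac{1}{2}\ 1\ {-\tfrac{1}{2}}\right\rangle \\
\psi_7 &:= \left|1\ \tfrac{1}{2}\ \tfrac{3}{2}\ {-\tfrac{1}{2}}\right\rangle = \sqrt{\tfrac{1}{3}}\left|1\ \tfrac{1}{2}\ {-1}\ \tfrac{1}{2}\right\rangle + \sqrt{\tfrac{2}{3}}\left|1\ \tfrac{1}{2}\ 0\ {-\tfrac{1}{2}}\right\rangle \\
\psi_8 &:= \left|1\ \tfrac{1}{2}\ \tfrac{1}{2}\ {-\tfrac{1}{2}}\right\rangle = -\sqrt{\tfrac{2}{3}}\left|1\ \tfrac{1}{2}\ {-1}\ \tfrac{1}{2}\right\rangle + \sqrt{\tfrac{1}{3}}\left|1\ \tfrac{1}{2}\ 0\ {-\tfrac{1}{2}}\right\rangle .
\end{aligned}$$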

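Now we need to evaluate the matrices of $H_{\rm fs} = H_{\rm so} + H_{\rm rel}$ and $H_{\rm mag}$ in the $|j\, m_j\rangle$ basis $\psi_i$. Since $H_{\rm rel} \sim p^4$, it’s already diagonal in the $|j\, m_j\rangle$ basis. And since $H_{\rm so} \sim \mathbf{S} \cdot \mathbf{L} = (1/2)(J^2 - L^2 - S^2)$, it’s also diagonal in the $|j\, m_j\rangle$ basis. Therefore $H_{\rm fs}$ is diagonal and its contribution is given by (2.59):

$$\langle j m_j | H_{\rm fs} | j m_j \rangle = -\left|E_n^{(0)}\right|\frac{\alpha^2}{n^2}\left[-\frac{3}{4} + \frac{n}{j + 1/2}\right] = -\left|E_1^{(0)}\right|\frac{\alpha^2}{16}\left[\frac{2}{j + 1/2} - \frac{3}{4}\right]$$

where I used $E_n^{(0)} = E_1^{(0)}/n^2$ and let $n = 2$. For states with $j = 1/2$, this gives a contribution

$$\langle \psi_i | H_{\rm fs} | \psi_i \rangle = -\frac{5\left|E_1^{(0)}\right|\alpha^2}{64} := -5\xi \quad \text{for } i = 1, 2, 6, 8 \tag{2.74a}$$

and for states with $j = 3/2$ this is

$$\langle \psi_i | H_{\rm fs} | \psi_i \rangle = -\frac{\left|E_1^{(0)}\right|\alpha^2}{64} := -\xi \quad \text{for } i = 3, 4, 5, 7 . \tag{2.74b}$$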

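Next, we easily see that the first four states $\psi_1$–$\psi_4$ are eigenstates of $H_{\rm mag} \sim L_z + 2S_z$ (since they each contain only a single factor $|l\, s\, m_l\, m_s\rangle$). Hence $H_{\rm mag}$ is already diagonal in this $4 \times 4$ block, and so contributes the diagonal terms

$$\langle \psi_i | H_{\rm mag} | \psi_i \rangle = \mu_B B(m_l + 2m_s) := \beta(m_l + 2m_s) \quad \text{for } i = 1, 2, 3, 4 .$$

For the remaining four states $\psi_5$–$\psi_8$ we must explicitly evaluate the matrix elements. For example,

$$\begin{aligned}
H_{\rm mag}|\psi_5\rangle &= \frac{\mu_B B}{\hbar}(L_z + 2S_z)\left[\sqrt{\tfrac{2}{3}}\left|1\ \tfrac{1}{2}\ 0\ \tfrac{1}{2}\right\rangle + \sqrt{\tfrac{1}{3}}\left|1\ \tfrac{1}{2}\ 1\ {-\tfrac{1}{2}}\right\rangle\right] \\
&= \mu_B B\left[1 \cdot \sqrt{\tfrac{2}{3}}\left|1\ \tfrac{1}{2}\ 0\ \tfrac{1}{2}\right\rangle + 0 \cdot \sqrt{\tfrac{1}{3}}\left|1\ \tfrac{1}{2}\ 1\ {-\tfrac{1}{2}}\right\rangle\right] \\
&= \mu_B B\sqrt{\tfrac{2}{3}}\left|1\ \tfrac{1}{2}\ 0\ \tfrac{1}{2}\right\rangle
\end{aligned}$$

and therefore (using the orthonormality of the states $|l\, s\, m_l\, m_s\rangle$)

$$\langle \psi_5 | H_{\rm mag} | \psi_5 \rangle = \frac{2}{3}\mu_B B := \frac{2}{3}\beta$$

and

$$\langle \psi_6 | H_{\rm mag} | \psi_5 \rangle = \langle \psi_5 | H_{\rm mag} | \psi_6 \rangle = -\frac{\sqrt{2}}{3}\mu_B B := -\frac{\sqrt{2}}{3}\beta .$$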

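Also,

$$\langle \psi_6 | H_{\rm mag} | \psi_6 \rangle = \left\langle \psi_6 \left| -\sqrt{\tfrac{1}{3}}\,\mu_B B \right| 1\ \tfrac{1}{2}\ 0\ \tfrac{1}{2} \right\rangle = \frac{1}{3}\mu_B B := \frac{1}{3}\beta .$$

Since all other matrix elements with $\psi_5$ and $\psi_6$ vanish, there is a $2 \times 2$ block corresponding to the subspace spanned by $\psi_5$ and $\psi_6$. Similarly, there is a $2 \times 2$ block corresponding to the subspace spanned by $\psi_7$ and $\psi_8$ with

$$\begin{aligned}
\langle \psi_7 | H_{\rm mag} | \psi_7 \rangle &= -\frac{2}{3}\beta \\
\langle \psi_8 | H_{\rm mag} | \psi_7 \rangle = \langle \psi_7 | H_{\rm mag} | \psi_8 \rangle &= -\frac{\sqrt{2}}{3}\beta \\
\langle \psi_8 | H_{\rm mag} | \psi_8 \rangle &= -\frac{1}{3}\beta .
\end{aligned}$$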
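These matrix elements can be reproduced mechanically from the Clebsch-Gordan expansions of $\psi_5$ through $\psi_8$. The sketch below represents each state as a dictionary of $(m_l, m_s)$ components and uses the fact that $H_{\rm mag}$ is diagonal in that basis with eigenvalue $\beta(m_l + 2m_s)$. The data layout and the name `hmag_element` are assumptions made for this illustration.

```python
import math
from fractions import Fraction

HALF = Fraction(1, 2)

# Each state maps (m_l, m_s) -> Clebsch-Gordan coefficient (l = 1, s = 1/2 throughout).
psi5 = {(0, HALF): math.sqrt(2 / 3), (1, -HALF): math.sqrt(1 / 3)}
psi6 = {(0, HALF): -math.sqrt(1 / 3), (1, -HALF): math.sqrt(2 / 3)}
psi7 = {(-1, HALF): math.sqrt(1 / 3), (0, -HALF): math.sqrt(2 / 3)}
psi8 = {(-1, HALF): -math.sqrt(2 / 3), (0, -HALF): math.sqrt(1 / 3)}

def hmag_element(bra, ket):
    """<bra|H_mag|ket> in units of beta = mu_B * B.

    H_mag is diagonal in the |l s m_l m_s> basis with eigenvalue (m_l + 2 m_s),
    so only components shared by bra and ket contribute.
    """
    total = 0.0
    for (ml, ms), c_ket in ket.items():
        c_bra = bra.get((ml, ms), 0.0)
        total += c_bra * float(ml + 2 * ms) * c_ket
    return total

if __name__ == "__main__":
    for name, bra, ket in [("55", psi5, psi5), ("65", psi6, psi5), ("66", psi6, psi6),
                           ("77", psi7, psi7), ("87", psi8, psi7), ("88", psi8, psi8)]:
        print(name, hmag_element(bra, ket))
```

The elements connecting the $\{\psi_5, \psi_6\}$ and $\{\psi_7, \psi_8\}$ subspaces vanish automatically because the two pairs share no $(m_l, m_s)$ components.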

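Combining all of these matrix elements, the matrix of $H' = H_{\rm fs} + H_{\rm mag}$ used in (2.21) becomes

$$\begin{bmatrix}
-5\xi + \beta & & & & & & & \\
 & -5\xi - \beta & & & & & & \\
 & & -\xi + 2\beta & & & & & \\
 & & & -\xi - 2\beta & & & & \\
 & & & & -\xi + \frac{2}{3}\beta & -\frac{\sqrt{2}}{3}\beta & & \\
 & & & & -\frac{\sqrt{2}}{3}\beta & -5\xi + \frac{1}{3}\beta & & \\
 & & & & & & -\xi - \frac{2}{3}\beta & -\frac{\sqrt{2}}{3}\beta \\
 & & & & & & -\frac{\sqrt{2}}{3}\beta & -5\xi - \frac{1}{3}\beta
\end{bmatrix} .$$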

Now we need to find the eigenvalues of this matrix (which are the first-order energy corrections). Since it’s block diagonal, the first four diagonal entries are precisely the first four eigenvalues. For the remaining four eigenvalues, we must diagonalize the two $2 \times 2$ submatrices. Calling the eigenvalues $\lambda$, the characteristic equation for the $\psi_5, \psi_6$ block is

$$\begin{vmatrix}
-\xi + \frac{2}{3}\beta - \lambda & -\frac{\sqrt{2}}{3}\beta \\
-\frac{\sqrt{2}}{3}\beta & -5\xi + \frac{1}{3}\beta - \lambda
\end{vmatrix} = \lambda^2 + \lambda(6\xi - \beta) + 5\xi^2 - \frac{11}{3}\xi\beta = 0 .$$

From the quadratic formula we find the roots

$$\lambda^{5,6}_{\pm} = -3\xi + \frac{\beta}{2} \pm \sqrt{4\xi^2 + \frac{2}{3}\xi\beta + \frac{1}{4}\beta^2} .$$

Looking at the $\psi_7, \psi_8$ block, we see that we can just let $\beta \to -\beta$ and use the same equation for the roots:

$$\lambda^{7,8}_{\pm} = -3\xi - \frac{\beta}{2} \pm \sqrt{4\xi^2 - \frac{2}{3}\xi\beta + \frac{1}{4}\beta^2} .$$

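The energy $E_i^{(1)}$ of each of these eight states is then given by

$$\begin{aligned}
E_1^{(1)} &= E_2^{(0)} - 5\xi + \beta \\
E_2^{(1)} &= E_2^{(0)} - 5\xi - \beta \\
E_3^{(1)} &= E_2^{(0)} - \xi + 2\beta \\
E_4^{(1)} &= E_2^{(0)} - \xi - 2\beta \\
E_5^{(1)} &= E_2^{(0)} - 3\xi + \frac{\beta}{2} + \sqrt{4\xi^2 + \frac{2}{3}\xi\beta + \frac{1}{4}\beta^2} \\
E_6^{(1)} &= E_2^{(0)} - 3\xi + \frac{\beta}{2} - \sqrt{4\xi^2 + \frac{2}{3}\xi\beta + \frac{1}{4}\beta^2} \\
E_7^{(1)} &= E_2^{(0)} - 3\xi - \frac{\beta}{2} + \sqrt{4\xi^2 - \frac{2}{3}\xi\beta + \frac{1}{4}\beta^2} \\
E_8^{(1)} &= E_2^{(0)} - 3\xi - \frac{\beta}{2} - \sqrt{4\xi^2 - \frac{2}{3}\xi\beta + \frac{1}{4}\beta^2}
\end{aligned}$$

For $i = 1, 2, 3, 4$ the energy $E_i^{(1)}$ corresponds to $\psi_i$. But for $i = 5, 6$ the energy $E_i^{(1)}$ corresponds to some linear combination of $\psi_5$ and $\psi_6$, and similarly for $i = 7, 8$ the energy $E_i^{(1)}$ corresponds to a linear combination of $\psi_7$ and $\psi_8$. (This is the essential content of Section 2.2.)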

It is easy to see that for $\beta = 0$ (i.e., $B = 0$), these energies reduce to $E_{\rm fs}$ given by (2.74), and for very large $\beta$, we obtain the Paschen-Back energies given by (2.70). Thus our results have the correct limiting behavior. See Figure 7 below.

Figure 7: Intermediate-field energy corrections as a function of B for n = 2.
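The two limits quoted above are easy to confirm numerically. The following sketch evaluates the eight first-order corrections (without the common $E_2^{(0)}$) from the closed-form roots; `intermediate_field_corrections` is a hypothetical helper name, and $\xi$ and $\beta$ are passed in arbitrary common energy units.

```python
import math

def intermediate_field_corrections(xi, beta):
    """Eigenvalues of H_fs + H_mag for n = 2, ordered as i = 1..8 in the text."""
    root_plus = math.sqrt(4 * xi**2 + (2 / 3) * xi * beta + beta**2 / 4)
    root_minus = math.sqrt(4 * xi**2 - (2 / 3) * xi * beta + beta**2 / 4)
    return [
        -5 * xi + beta,                     # psi_1
        -5 * xi - beta,                     # psi_2
        -xi + 2 * beta,                     # psi_3
        -xi - 2 * beta,                     # psi_4
        -3 * xi + beta / 2 + root_plus,     # psi_5, psi_6 block, upper root
        -3 * xi + beta / 2 - root_plus,     # psi_5, psi_6 block, lower root
        -3 * xi - beta / 2 + root_minus,    # psi_7, psi_8 block, upper root
        -3 * xi - beta / 2 - root_minus,    # psi_7, psi_8 block, lower root
    ]

if __name__ == "__main__":
    print(sorted(intermediate_field_corrections(1.0, 0.0)))   # zero-field limit
    strong = intermediate_field_corrections(1.0, 1000.0)
    print([round(e / 1000.0, 2) for e in strong])             # shifts in units of beta
```

At $\beta = 0$ the eight values collapse onto $-5\xi$ and $-\xi$ (four each), matching (2.74a) and (2.74b). For $\beta \gg \xi$ the ratios $E/\beta$ approach the integers $m_l + 2m_s$ of (2.70).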

2.5.4 Supplement: The Electromagnetic Hamiltonian

In a proper derivation of the Lagrange equations of motion, one starts from d’Alembert’s principle of virtual work, and derives Lagrange’s equations

$$\frac{d}{dt}\frac{\partial T}{\partial \dot{q}_i} - \frac{\partial T}{\partial q_i} = Q_i \tag{2.75}$$

where the $q_i$ are generalized coordinates, $T = T(q_i, \dot{q}_i)$ is the kinetic energy and $Q_i = \sum_j F_j (\partial x_j/\partial q_i)$ is a generalized force. In the particular case that $Q_i$ is derivable from a conservative force $F_j = -\partial V/\partial x_j$, then we have $Q_i = -\partial V/\partial q_i$. Since the potential energy $V$ is assumed to be independent of $\dot{q}_i$, we can replace $\partial T/\partial \dot{q}_i$ by $\partial(T - V)/\partial \dot{q}_i$ and we arrive at the usual Lagrange’s equations

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_i} - \frac{\partial L}{\partial q_i} = 0 \tag{2.76}$$

where $L = T - V$. However, even if there is no potential function $V$, we can still arrive at this result if there exists a function $U = U(q_i, \dot{q}_i)$ such that the generalized forces may be written as

$$Q_i = -\frac{\partial U}{\partial q_i} + \frac{d}{dt}\frac{\partial U}{\partial \dot{q}_i}$$

because defining $L = T - U$ we again arrive at equation (2.76). The function $U$ is called a generalized potential or a velocity dependent potential. We now seek such a function to describe the force on a charged particle in an electromagnetic field.

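Recall from electromagnetism that the Lorentz force law is given by

$$\mathbf{F} = q\left(\mathbf{E} + \frac{\mathbf{v}}{c} \times \mathbf{B}\right)$$

or

$$\mathbf{F} = q\left(-\nabla\phi - \frac{1}{c}\frac{\partial \mathbf{A}}{\partial t} + \frac{\mathbf{v}}{c} \times (\nabla \times \mathbf{A})\right)$$

where $\mathbf{E} = -\nabla\phi - (1/c)\partial\mathbf{A}/\partial t$ and $\mathbf{B} = \nabla \times \mathbf{A}$. Our goal is to write this in the form

$$F_i = -\frac{\partial U}{\partial x_i} + \frac{d}{dt}\frac{\partial U}{\partial \dot{x}_i}$$

for a suitable $U$. All it takes is some vector algebra. We have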

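$$\begin{aligned}
[\mathbf{v} \times (\nabla \times \mathbf{A})]_i &= \varepsilon_{ijk}\varepsilon_{klm} v_j \partial_l A_m = (\delta_{il}\delta_{jm} - \delta_{im}\delta_{jl}) v_j \partial_l A_m \\
&= v_j \partial_i A_j - v_j \partial_j A_i = v_j \partial_i A_j - (\mathbf{v} \cdot \nabla) A_i .
\end{aligned}$$

But $x_i$ and $\dot{x}_j$ are independent variables (in other words, $\dot{x}_j$ has no explicit dependence on $x_i$) so that

$$v_j \partial_i A_j = \dot{x}_j \frac{\partial A_j}{\partial x_i} = \frac{\partial}{\partial x_i}(\dot{x}_j A_j) = \frac{\partial}{\partial x_i}(\mathbf{v} \cdot \mathbf{A})$$

and we have

$$[\mathbf{v} \times (\nabla \times \mathbf{A})]_i = \frac{\partial}{\partial x_i}(\mathbf{v} \cdot \mathbf{A}) - (\mathbf{v} \cdot \nabla) A_i .$$

But we also have

$$\frac{dA_i}{dt} = \frac{\partial A_i}{\partial x_j}\frac{dx_j}{dt} + \frac{\partial A_i}{\partial t} = v_j \frac{\partial A_i}{\partial x_j} + \frac{\partial A_i}{\partial t} = (\mathbf{v} \cdot \nabla) A_i + \frac{\partial A_i}{\partial t}$$

so that

$$(\mathbf{v} \cdot \nabla) A_i = \frac{dA_i}{dt} - \frac{\partial A_i}{\partial t}$$

and therefore

$$[\mathbf{v} \times (\nabla \times \mathbf{A})]_i = \frac{\partial}{\partial x_i}(\mathbf{v} \cdot \mathbf{A}) - \frac{dA_i}{dt} + \frac{\partial A_i}{\partial t} .$$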

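But we can write $A_i = \partial(v_j A_j)/\partial v_i = \partial(\mathbf{v} \cdot \mathbf{A})/\partial v_i$ which gives us

$$[\mathbf{v} \times (\nabla \times \mathbf{A})]_i = \frac{\partial}{\partial x_i}(\mathbf{v} \cdot \mathbf{A}) - \frac{d}{dt}\frac{\partial}{\partial v_i}(\mathbf{v} \cdot \mathbf{A}) + \frac{\partial A_i}{\partial t} .$$

The Lorentz force law can now be written in the form

$$\begin{aligned}
F_i &= q\left(-\frac{\partial \phi}{\partial x_i} - \frac{1}{c}\frac{\partial A_i}{\partial t} + \frac{1}{c}[\mathbf{v} \times (\nabla \times \mathbf{A})]_i\right) \\
&= q\left(-\frac{\partial \phi}{\partial x_i} - \frac{1}{c}\frac{\partial A_i}{\partial t} + \frac{1}{c}\frac{\partial}{\partial x_i}(\mathbf{v} \cdot \mathbf{A}) - \frac{1}{c}\frac{d}{dt}\frac{\partial}{\partial v_i}(\mathbf{v} \cdot \mathbf{A}) + \frac{1}{c}\frac{\partial A_i}{\partial t}\right) \\
&= q\left[-\frac{\partial}{\partial x_i}\left(\phi - \frac{\mathbf{v}}{c} \cdot \mathbf{A}\right) - \frac{d}{dt}\frac{\partial}{\partial v_i}\left(\frac{\mathbf{v}}{c} \cdot \mathbf{A}\right)\right] .
\end{aligned}$$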

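Since $\phi$ is independent of $\mathbf{v}$ we can write

$$-\frac{d}{dt}\frac{\partial}{\partial v_i}\left(\frac{\mathbf{v}}{c} \cdot \mathbf{A}\right) = \frac{d}{dt}\frac{\partial}{\partial v_i}\left(\phi - \frac{\mathbf{v}}{c} \cdot \mathbf{A}\right)$$

so that

$$F_i = q\left[-\frac{\partial}{\partial x_i}\left(\phi - \frac{\mathbf{v}}{c} \cdot \mathbf{A}\right) + \frac{d}{dt}\frac{\partial}{\partial v_i}\left(\phi - \frac{\mathbf{v}}{c} \cdot \mathbf{A}\right)\right]$$

or

$$F_i = -\frac{\partial U}{\partial x_i} + \frac{d}{dt}\frac{\partial U}{\partial \dot{x}_i}$$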

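where $U = q(\phi - \mathbf{v}/c \cdot \mathbf{A})$. This shows that $U$ is a generalized potential and that the Lagrangian for a particle of charge $q$ in an electromagnetic field is

$$L = T - q\phi + \frac{q}{c}\mathbf{v} \cdot \mathbf{A} \tag{2.77a}$$

or

$$L = \frac{1}{2}mv^2 - q\phi + \frac{q}{c}\mathbf{v} \cdot \mathbf{A} . \tag{2.77b}$$

From this, the canonical momentum is defined by $p_i = \partial L/\partial \dot{x}_i = \partial L/\partial v_i$ so that

$$\mathbf{p} = m\mathbf{v} + \frac{q}{c}\mathbf{A} .$$

Using this, the Hamiltonian is then given by

$$\begin{aligned}
H &= \sum p_i \dot{x}_i - L = \mathbf{p} \cdot \mathbf{v} - L \\
&= mv^2 + \frac{q}{c}\mathbf{A} \cdot \mathbf{v} - \frac{1}{2}mv^2 + q\phi - \frac{q}{c}\mathbf{A} \cdot \mathbf{v} \\
&= \frac{1}{2}mv^2 + q\phi \\
&= \frac{1}{2m}\left(\mathbf{p} - \frac{q}{c}\mathbf{A}\right)^2 + q\phi .
\end{aligned}$$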

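This is the basis for the oft-heard statement that to include electromagnetic forces, you need to make the replacement $\mathbf{p} \to \mathbf{p} - (q/c)\mathbf{A}$. Including any other additional potential energy terms, the Hamiltonian becomes

$$H = \frac{1}{2m}\left(\mathbf{p} - \frac{q}{c}\mathbf{A}\right)^2 + q\phi + V(r) . \tag{2.78}$$

Let’s evaluate (2.78) for the case of a uniform magnetic field. Since $\mathbf{B} = \nabla \times \mathbf{A}$, it is not hard to verify that

$$\mathbf{A} = -\frac{1}{2}\mathbf{r} \times \mathbf{B}$$

will work (I’ll work it out, but you could also just plug into a vector identity if you take the time to look it up):

$$\begin{aligned}
[\nabla \times (\mathbf{r} \times \mathbf{B})]_i &= \varepsilon_{ijk}\varepsilon_{klm}\partial_j(x_l B_m) \\
&= (\delta_{il}\delta_{jm} - \delta_{im}\delta_{jl})[\delta_{jl}B_m + x_l \partial_j B_m] \\
&= B_i - 3B_i = -2B_i
\end{aligned}$$

where I used $\partial_j x_l = \delta_{jl}$, $\delta_{jl}\delta_{lj} = \delta_{jj} = 3$ and $\partial_j B_m = 0$ since $\mathbf{B}$ is uniform. This shows that $\mathbf{B} = (-1/2)[\nabla \times (\mathbf{r} \times \mathbf{B})] = \nabla \times \mathbf{A}$ as claimed. Note also that for this $\mathbf{B}$ we have

$$-2\nabla \cdot \mathbf{A} = \nabla \cdot (\mathbf{r} \times \mathbf{B}) = \varepsilon_{ijk}\partial_i(x_j B_k) = \varepsilon_{ijk}\delta_{ij}B_k = 0$$

because $\varepsilon_{ijk}\delta_{ij} = \varepsilon_{iik} = 0$. Hence $\nabla \cdot \mathbf{A} = 0$.

Before writing out (2.78), let me use this last result to show that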

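$$(\mathbf{p} \cdot \mathbf{A})\psi = -i\hbar\nabla \cdot (\mathbf{A}\psi) = -i\hbar(\nabla \cdot \mathbf{A})\psi - i\hbar\mathbf{A} \cdot \nabla\psi = (\mathbf{A} \cdot \mathbf{p})\psi$$

and hence $\mathbf{p} \cdot \mathbf{A} = \mathbf{A} \cdot \mathbf{p}$. (Note this shows that $\mathbf{p} \cdot \mathbf{A} = \mathbf{A} \cdot \mathbf{p}$ even if $\mathbf{B}$ is not uniform if we are using the Coulomb gauge $\nabla \cdot \mathbf{A} = 0$.) Now using this, we have

$$\frac{1}{2m}\left(\mathbf{p} - \frac{q}{c}\mathbf{A}\right)^2 = \frac{1}{2m}\left[p^2 - \frac{q}{c}(\mathbf{p} \cdot \mathbf{A} + \mathbf{A} \cdot \mathbf{p}) + \frac{q^2}{c^2}A^2\right] = \frac{p^2}{2m} - \frac{q}{mc}\mathbf{A} \cdot \mathbf{p} + \frac{q^2}{2mc^2}A^2 .$$

But (thinking of the scalar triple product as a determinant and switching two rows)

$$\frac{q}{mc}\mathbf{A} \cdot \mathbf{p} = -\frac{q}{2mc}(\mathbf{r} \times \mathbf{B}) \cdot \mathbf{p} = +\frac{q}{2mc}\mathbf{B} \cdot (\mathbf{r} \times \mathbf{p}) = \frac{q}{2mc}\mathbf{B} \cdot \mathbf{L} .$$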

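And using (I’ll leave the proof to you)

$$A^2 = \frac{1}{4}(\mathbf{r} \times \mathbf{B}) \cdot (\mathbf{r} \times \mathbf{B}) = \frac{1}{4}[r^2B^2 - (\mathbf{r} \cdot \mathbf{B})^2]$$

we obtain

$$\frac{1}{2m}\left(\mathbf{p} - \frac{q}{c}\mathbf{A}\right)^2 = \frac{p^2}{2m} - \frac{q}{2mc}\mathbf{B} \cdot \mathbf{L} + \frac{q^2}{8mc^2}[r^2B^2 - (\mathbf{r} \cdot \mathbf{B})^2] .$$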
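The vector potential $\mathbf{A} = -\frac{1}{2}\mathbf{r} \times \mathbf{B}$ used in this calculation can also be checked numerically. The sketch below uses central differences (which are exact up to rounding for a field linear in $\mathbf{r}$) to confirm that $\nabla \times \mathbf{A} = \mathbf{B}$ and $\nabla \cdot \mathbf{A} = 0$. All function names and the sample values of $\mathbf{r}$ and $\mathbf{B}$ are arbitrary choices for this illustration.

```python
def cross(a, b):
    """Cartesian cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def vector_potential(r, B):
    """A = -(1/2) r x B for a uniform field B."""
    return tuple(-0.5 * c for c in cross(r, B))

def partial(field, r, axis, h=1e-3):
    """Central difference of a vector field along one coordinate axis."""
    r_plus, r_minus = list(r), list(r)
    r_plus[axis] += h
    r_minus[axis] -= h
    f_plus, f_minus = field(r_plus), field(r_minus)
    return tuple((p - m) / (2 * h) for p, m in zip(f_plus, f_minus))

def curl(field, r):
    """Numerical curl: (d_y A_z - d_z A_y, d_z A_x - d_x A_z, d_x A_y - d_y A_x)."""
    dx, dy, dz = (partial(field, r, axis) for axis in range(3))
    return (dy[2] - dz[1], dz[0] - dx[2], dx[1] - dy[0])

def divergence(field, r):
    """Numerical divergence: d_x A_x + d_y A_y + d_z A_z."""
    return sum(partial(field, r, axis)[axis] for axis in range(3))

if __name__ == "__main__":
    B = (0.3, -1.2, 2.0)      # arbitrary uniform field
    r = (1.0, -2.0, 0.5)      # arbitrary evaluation point
    A = lambda point: vector_potential(point, B)
    print(curl(A, r))         # should reproduce B
    print(divergence(A, r))   # should vanish
```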

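Let’s compare the relative magnitudes of the $\mathbf{B} \cdot \mathbf{L}$ term and the quadratic (last) term for an electron. Taking $r^2 \approx a_0^2$ and $L \sim \hbar$, we have

$$\frac{(e^2/8mc^2)r^2B^2}{(e/2mc)\mathbf{B} \cdot \mathbf{L}} = \frac{(e^2/8mc^2)a_0^2B^2}{(e/2mc)\hbar B} = \frac{1}{4}\frac{e^2}{\hbar c}\frac{B}{e/a_0^2} = \frac{1}{4}\frac{1}{137}\frac{B}{(4.8 \times 10^{-10}\ \text{esu})/(0.5 \times 10^{-8}\ \text{cm})^2} \approx \frac{B}{9 \times 10^9\ \text{gauss}} .$$

Since magnetic fields in the lab are of order $10^4$ gauss or less, we see that the quadratic term is negligible in comparison.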
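The order-of-magnitude estimate above is simple to reproduce. This sketch computes the field at which the quadratic term would equal the $\mathbf{B} \cdot \mathbf{L}$ term, assuming Gaussian units with $e = 4.8 \times 10^{-10}$ esu, $\alpha = 1/137$, and the slightly more precise value $a_0 = 0.529 \times 10^{-8}$ cm (an assumption on my part; with the rounded $0.5 \times 10^{-8}$ cm the result is about $1 \times 10^{10}$ gauss instead).

```python
ALPHA = 1.0 / 137.0      # fine-structure constant e^2 / (hbar c)
E_ESU = 4.8e-10          # electron charge in esu
A0_CM = 0.529e-8         # Bohr radius in cm (assumed value, see lead-in)

def crossover_field_gauss():
    """Field at which the quadratic term equals the B.L term: 4 * (1/alpha) * e / a0^2."""
    return 4.0 / ALPHA * E_ESU / A0_CM**2

def quadratic_to_linear_ratio(b_gauss):
    """Ratio of the quadratic term to the B.L term for a field of b_gauss."""
    return b_gauss / crossover_field_gauss()

if __name__ == "__main__":
    print(f"crossover field: {crossover_field_gauss():.2e} gauss")
    print(f"ratio at 1e4 gauss: {quadratic_to_linear_ratio(1.0e4):.1e}")
```

For a typical laboratory field of $10^4$ gauss the ratio comes out near $10^{-6}$, which is why the quadratic term is dropped.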

Referring back to (2.38), we see that

$$\frac{q}{2mc}\mathbf{L} = \boldsymbol{\mu}_l$$

where, for an electron, we have $q = -e$. And as we have also seen, for spin we must postulate a magnetic moment of the form

$$\boldsymbol{\mu}_s = g\frac{q}{2mc}\mathbf{S}$$

where $g = 2$ for an electron (and $g = 5.59$ for a proton). Therefore, an electron has a total magnetic moment

$$\boldsymbol{\mu}_{\rm tot} = -\frac{e}{2m_ec}(\mathbf{L} + 2\mathbf{S})$$

as we stated in (2.68).

Combining our results, the Hamiltonian for a hydrogen atom in a uniform external magnetic field is then given by

$$H = \frac{p^2}{2m_e} - \frac{e^2}{r} - \boldsymbol{\mu}_{\rm tot} \cdot \mathbf{B} = H_0 - \boldsymbol{\mu}_{\rm tot} \cdot \mathbf{B} = H_0 + H'$$

where we are taking $q\phi + V(r) = 0 - e^2/r$, and $m_e$ in this equation is really the reduced mass, which is approximately the same as the electron mass.

3 Time-Dependent Perturbation Theory

3.1 Transitions Between Two Discrete States

We now turn our attention to the situation where the perturbation depends on time. In this situation, we assume that the system is originally in some definite state, and that applying a time-dependent external force then induces a transition to another state. For example, shining electromagnetic radiation on an atom in its ground state will (may) cause it to undergo a transition to a higher energy state. We assume that the external force is weak enough that perturbation theory applies.

There are several ways to deal with this problem, and everyone seems to have their own approach. We shall follow a method that is closely related to the time-independent method that we employed.

To begin, suppose

$$H = H_0 + H'(t)$$

and that we have the orthonormal solutions

$$H_0\varphi_n = E_n\varphi_n$$

with

$$\varphi_n(t) = \varphi_n e^{-iE_nt/\hbar} .$$

Note that we no longer need to add a superscript 0 to the energies, because with a time-dependent Hamiltonian there is no energy conservation and hence we are not looking for energy corrections.

We would like to solve the time-dependent Schrödinger equation

$$H\psi(t) = [H_0 + H'(t)]\psi(t) = i\hbar\frac{\partial\psi(t)}{\partial t} . \tag{3.1}$$

In this case, the solutions $\varphi_n$ still form a complete set (they describe every possible state available to the system), the difference being that now the state $\psi(t)$ that results from the perturbation will depend on time. So let us write

$$\psi(t) = \sum_k c_k(t)e^{-iE_kt/\hbar}\varphi_k . \tag{3.2}$$

The reason for this form is that we want the time-dependent coefficients $c_k(t)$ to reduce to constants if $H'(t) = 0$. In other words, $H'(t) \to 0$ implies that $\psi(t)$ reduces to a superposition of the stationary states $\varphi_k(t)$ with constant coefficients. Our goal is to find the probability that if the system is in an eigenstate $\varphi_i = \psi(0)$ at time $t = 0$, it will be found in the eigenstate $\varphi_f$ at a later time $t$. This probability is given by

$$P_{if}(t) = |\langle\varphi_f|\psi(t)\rangle|^2 = |c_f(t)|^2 \tag{3.3}$$

where $\langle\psi(t)|\psi(t)\rangle = 1$ implies

$$\sum_k |c_k(t)|^2 = 1 .$$

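Using (3.2) in (3.1) we obtain

$$\sum_k c_k(t)e^{-iE_kt/\hbar}[E_k + H'(t)]\varphi_k = \sum_k i\hbar\left[\dot{c}_k(t) - \frac{iE_k}{\hbar}c_k(t)\right]e^{-iE_kt/\hbar}\varphi_k$$

or

$$i\hbar\sum_k \dot{c}_k(t)e^{-iE_kt/\hbar}\varphi_k = \sum_k H'(t)c_k(t)e^{-iE_kt/\hbar}\varphi_k . \tag{3.4}$$

But $\langle\varphi_n|\varphi_k\rangle = \delta_{nk}$ so that

$$i\hbar\dot{c}_n(t)e^{-iE_nt/\hbar} = \sum_k \langle\varphi_n|H'(t)|\varphi_k\rangle c_k(t)e^{-iE_kt/\hbar} .$$

Defining the Bohr angular frequency

$$\omega_{nk} = \frac{E_n - E_k}{\hbar} \tag{3.5}$$

we can write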

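$$\dot{c}_n(t) = \frac{1}{i\hbar}\sum_k \langle\varphi_n|H'(t)|\varphi_k\rangle c_k(t)e^{i\omega_{nk}t} . \tag{3.6a}$$

This set of equations for $c_n(t)$ is exact and completely equivalent to the original Schrödinger equation (3.1). Defining

$$H'_{nk}(t) = \langle\varphi_n|H'(t)|\varphi_k\rangle$$

we may write out (3.6a) in matrix form as (for a finite number of terms)

$$i\hbar\begin{bmatrix} \dot{c}_1(t) \\ \dot{c}_2(t) \\ \vdots \\ \dot{c}_n(t) \end{bmatrix} = \begin{bmatrix}
H'_{11} & H'_{12}e^{i\omega_{12}t} & \cdots & H'_{1n}e^{i\omega_{1n}t} \\
H'_{21}e^{i\omega_{21}t} & H'_{22} & \cdots & H'_{2n}e^{i\omega_{2n}t} \\
\vdots & \vdots & & \vdots \\
H'_{n1}e^{i\omega_{n1}t} & H'_{n2}e^{i\omega_{n2}t} & \cdots & H'_{nn}
\end{bmatrix}\begin{bmatrix} c_1(t) \\ c_2(t) \\ \vdots \\ c_n(t) \end{bmatrix} . \tag{3.6b}$$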

As we did in the time-independent case, we now let $H'(t) \to \lambda H'(t)$, and expand $c_k(t)$ in a power series in $\lambda$:

$$c_k(t) = c_k^{(0)}(t) + \lambda c_k^{(1)}(t) + \cdots . \tag{3.7}$$

Inserting this into (3.6a) yields

$$\dot{c}_n^{(0)}(t) + \lambda\dot{c}_n^{(1)}(t) + \lambda^2\dot{c}_n^{(2)}(t) + \cdots = \frac{1}{i\hbar}\sum_k H'_{nk}(t)\left[\lambda c_k^{(0)}(t) + \lambda^2 c_k^{(1)}(t) + \lambda^3 c_k^{(2)}(t) + \cdots\right]e^{i\omega_{nk}t} .$$

Equating powers of $\lambda$, for $\lambda^0$ we have

$$\dot{c}_n^{(0)}(t) = 0 \tag{3.8a}$$

and for $\lambda^{s+1}$ with $s \geq 0$ we have

$$\dot{c}_n^{(s+1)}(t) = \frac{1}{i\hbar}\sum_k H'_{nk}(t)c_k^{(s)}(t)e^{i\omega_{nk}t} . \tag{3.8b}$$

In principle, these may be solved successively. Solving (3.8a) gives $c_k^{(0)}(t)$, and using this in (3.8b) then gives $c_n^{(1)}(t)$. Then putting these back into (3.8b) again yields $c_n^{(2)}(t)$, and this can be continued to any desired order.

Let us assume that the system is initially in the state $\varphi_i$, so that

$$c_n(0) = \delta_{ni} . \tag{3.9a}$$

Since this must be true for all $\lambda$, we have

$$c_n^{(0)}(0) = \delta_{ni} \tag{3.9b}$$

and

$$c_n^{(s)}(0) = 0 \quad \text{for } s \geq 1 . \tag{3.9c}$$

From (3.8a) we see that the zeroth-order coefficients are constant in time, so we have

$$c_n^{(0)}(t) = \delta_{ni} \tag{3.9d}$$

and the zeroth-order solutions are completely determined.

Using (3.9d) in (3.8b) we obtain, to first order,

$$\dot{c}_n^{(1)}(t) = \frac{1}{i\hbar}\sum_k H'_{nk}(t)\delta_{ki}e^{i\omega_{nk}t} = \frac{1}{i\hbar}H'_{ni}(t)e^{i\omega_{ni}t}$$

so that

$$c_n^{(1)}(t) = \frac{1}{i\hbar}\int_0^t H'_{ni}(t')e^{i\omega_{ni}t'}\,dt' \tag{3.10}$$

where the constant of integration is zero by (3.9c). Using (3.9d) and (3.10) in (3.2) yields $\psi(t)$ to first order:

$$\psi(t) = \varphi_i e^{-iE_it/\hbar} + \lambda\sum_k\left(\frac{1}{i\hbar}\int_0^t H'_{ki}(t')e^{i\omega_{ki}t'}\,dt'\right)e^{-iE_kt/\hbar}\varphi_k .$$

From (3.3) we know that the transition probability to the state $\varphi_f$ is given by

$$P_{if}(t) = |\langle\varphi_f|\psi(t)\rangle|^2 = |c_f(t)|^2$$

where $c_f(t) = c_f^{(0)}(t) + \lambda c_f^{(1)}(t) + \cdots$. We will only consider transitions to states $\varphi_f$ that are distinct from the initial state $\varphi_i$, and hence $c_f^{(0)}(t) = 0$. Then the first-order transition probability is

$$P_{if}(t) = \lambda^2\left|c_f^{(1)}(t)\right|^2$$

or, from (3.10) and letting $\lambda \to 1$,

$$P_{if}(t) = \frac{1}{\hbar^2}\left|\int_0^t H'_{fi}(t')e^{i\omega_{fi}t'}\,dt'\right|^2 . \tag{3.11}$$

A minor point is that our initial conditions could equally well be defined at $t \to -\infty$. In this case, the lower limit on the above integrals would obviously be $-\infty$ rather than 0.
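To see how good the first-order result (3.11) is, one can integrate the exact equations (3.6a) numerically for a small system and compare. The sketch below does this for a two-level system with a constant, purely off-diagonal perturbation. The setup is an assumption chosen for illustration: units with $\hbar = 1$, $H'_{12} = H'_{21} = V$ real, $H'_{11} = H'_{22} = 0$, and a hand-written fourth-order Runge-Kutta integrator; none of this is prescribed by the notes.

```python
import cmath
import math

def rhs(t, c, V, w0):
    """Right-hand side of (3.6a) for two levels with hbar = 1 and omega_21 = w0."""
    c1, c2 = c
    dc1 = -1j * V * cmath.exp(-1j * w0 * t) * c2   # omega_12 = -w0
    dc2 = -1j * V * cmath.exp(1j * w0 * t) * c1    # omega_21 = +w0
    return (dc1, dc2)

def integrate(V, w0, t_final, steps=2000):
    """RK4 integration from c = (1, 0) at t = 0; returns (c1, c2) at t_final."""
    h = t_final / steps
    t, c = 0.0, (1.0 + 0j, 0j)
    for _ in range(steps):
        k1 = rhs(t, c, V, w0)
        k2 = rhs(t + h / 2, tuple(ci + h / 2 * ki for ci, ki in zip(c, k1)), V, w0)
        k3 = rhs(t + h / 2, tuple(ci + h / 2 * ki for ci, ki in zip(c, k2)), V, w0)
        k4 = rhs(t + h, tuple(ci + h * ki for ci, ki in zip(c, k3)), V, w0)
        c = tuple(ci + h / 6 * (a + 2 * b + 2 * d + e)
                  for ci, a, b, d, e in zip(c, k1, k2, k3, k4))
        t += h
    return c

def first_order_probability(V, w0, t):
    """Equation (3.11) for constant H'_fi = V with hbar = 1."""
    return V**2 * (math.sin(w0 * t / 2) / (w0 / 2)) ** 2

if __name__ == "__main__":
    V, w0, t = 0.01, 1.0, 2.0
    c1, c2 = integrate(V, w0, t)
    print("numerical  P:", abs(c2) ** 2)
    print("first-order P:", first_order_probability(V, w0, t))
    print("norm check   :", abs(c1) ** 2 + abs(c2) ** 2)
```

For a weak perturbation ($V \ll \hbar\omega_{21}$) the two probabilities agree closely, and the agreement degrades as $V$ grows, which is one way to gauge where first-order theory stops being reliable.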

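Example 3.1. Consider a one-dimensional harmonic oscillator consisting of a particle of charge $q$ with characteristic frequency $\omega$. Let this oscillator be placed in an electric field that is turned on and off so that its potential energy is given by

$$H'(t) = q\mathcal{E}x\,e^{-t^2/\tau^2}$$

where $\tau$ is a constant. If the particle starts out in its ground state, let us find the probability that it will be in its first excited state after a time $t \gg \tau$.

Since $t \gg \tau$, we may as well take $t \to \pm\infty$ as limits. From (3.11), we see that we must evaluate the integral

$$I = \int_{-\infty}^{\infty} H'_{10}(t')e^{i\omega_{10}t'}\,dt'$$

where

$$H'_{10}(t) = q\mathcal{E}e^{-t^2/\tau^2}\langle\psi_1|x|\psi_0\rangle$$

and $E_n = \hbar\omega(n + 1/2)$ so that $\omega_{10} = (E_1 - E_0)/\hbar = \omega$. Then (keeping $\omega_{10}$ for generality at this point)

$$\begin{aligned}
I &= q\mathcal{E}\langle\psi_1|x|\psi_0\rangle\int_{-\infty}^{\infty} e^{-t^2/\tau^2}e^{i\omega_{10}t}\,dt \\
&= q\mathcal{E}\langle\psi_1|x|\psi_0\rangle\int_{-\infty}^{\infty} e^{-(1/\tau^2)(t^2 - i\omega_{10}\tau^2 t)}\,dt \\
&= q\mathcal{E}\langle\psi_1|x|\psi_0\rangle e^{-\omega_{10}^2\tau^2/4}\int_{-\infty}^{\infty} e^{-(1/\tau^2)(t - i\omega_{10}\tau^2/2)^2}\,dt \\
&= q\mathcal{E}\langle\psi_1|x|\psi_0\rangle e^{-\omega_{10}^2\tau^2/4}\int_{-\infty}^{\infty} e^{-(1/\tau^2)u^2}\,du \\
&= q\mathcal{E}\langle\psi_1|x|\psi_0\rangle e^{-\omega_{10}^2\tau^2/4}\sqrt{\pi\tau^2} .
\end{aligned}$$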

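The easy way to do the spatial integral is to use the harmonic oscillator ladder operators. From

$$x = \sqrt{\frac{\hbar}{2m\omega}}(a + a^\dagger)$$

where

$$a\psi_n = \sqrt{n}\,\psi_{n-1} \quad\text{and}\quad a^\dagger\psi_n = \sqrt{n+1}\,\psi_{n+1}$$

we have

$$\langle\psi_1|x|\psi_0\rangle = \sqrt{\frac{\hbar}{2m\omega}}\langle\psi_1|a^\dagger\psi_0\rangle = \sqrt{\frac{\hbar}{2m\omega}}\langle\psi_1|\psi_1\rangle = \sqrt{\frac{\hbar}{2m\omega}} .$$

Therefore

$$I = q\mathcal{E}\tau\sqrt{\frac{\pi\hbar}{2m\omega}}\,e^{-\omega_{10}^2\tau^2/4}$$

so that

$$P_{01}(t \to \infty) = \frac{\pi q^2\mathcal{E}^2\tau^2}{2m\hbar\omega}e^{-\omega_{10}^2\tau^2/2} = \frac{\pi q^2\mathcal{E}^2\tau^2}{2m\hbar\omega}e^{-\omega^2\tau^2/2} .$$

Note that as $\tau \to \infty$ (i.e., the electric field is turned on very slowly), we have $P_{01} \to 0$. This shows that the system adjusts “adiabatically” to the field and is not shocked into a transition.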
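The Gaussian time integral in this example can be verified numerically. The sketch below compares a trapezoid-rule estimate of $\int_{-\infty}^{\infty} e^{-t^2/\tau^2}e^{i\omega t}\,dt$ with the closed form $\tau\sqrt{\pi}\,e^{-\omega^2\tau^2/4}$ obtained above by completing the square. The cutoff at $\pm 12\tau$ and the step count are arbitrary choices for this illustration.

```python
import cmath
import math

def gaussian_pulse_integral(omega, tau, half_width=12.0, steps=20000):
    """Trapezoid estimate of the integral of exp(-t^2/tau^2) exp(i omega t) dt.

    The infinite range is truncated to [-half_width*tau, +half_width*tau],
    where the Gaussian envelope is negligible.
    """
    a = -half_width * tau
    h = 2.0 * half_width * tau / steps
    total = 0j
    for n in range(steps + 1):
        t = a + n * h
        weight = 0.5 if n in (0, steps) else 1.0
        total += weight * math.exp(-(t / tau) ** 2) * cmath.exp(1j * omega * t)
    return total * h

def closed_form(omega, tau):
    """tau * sqrt(pi) * exp(-omega^2 tau^2 / 4), from completing the square."""
    return tau * math.sqrt(math.pi) * math.exp(-((omega * tau) ** 2) / 4.0)

if __name__ == "__main__":
    omega, tau = 1.5, 2.0
    print(gaussian_pulse_integral(omega, tau))
    print(closed_form(omega, tau))
```

The imaginary part of the numerical result vanishes (up to rounding) because the integrand's odd part integrates to zero over a symmetric range.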

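Example 3.2. Let us consider a harmonic perturbation of the form

$$H'(t) = V_0(\mathbf{r})\cos\omega t , \quad t \geq 0 .$$

Note that letting $\omega = 0$ we obtain the constant perturbation $H'(t) = V_0(\mathbf{r})$ as a special case. It just isn’t much harder to treat the more general situation, which represents the interaction of the system with an electromagnetic wave of frequency $\omega$.

If we define

$$V_{fi} = \langle\varphi_f|V_0(\mathbf{r})|\varphi_i\rangle ,$$

then

$$H'_{fi} = \langle\varphi_f|V_0(\mathbf{r})\cos\omega t|\varphi_i\rangle = \langle\varphi_f|V_0(\mathbf{r})|\varphi_i\rangle\cos\omega t = V_{fi}\cos\omega t .$$

Using $\cos\omega t = (e^{i\omega t} + e^{-i\omega t})/2$, we then have

$$\begin{aligned}
\int_0^t H'_{fi}(t')e^{i\omega_{fi}t'}\,dt' &= \frac{V_{fi}}{2}\int_0^t (e^{i\omega t'} + e^{-i\omega t'})e^{i\omega_{fi}t'}\,dt' \\
&= \frac{V_{fi}}{2}\int_0^t (e^{i(\omega_{fi}+\omega)t'} + e^{i(\omega_{fi}-\omega)t'})\,dt' \\
&= \frac{V_{fi}}{2}\left[\frac{e^{i(\omega_{fi}+\omega)t} - 1}{i(\omega_{fi}+\omega)} + \frac{e^{i(\omega_{fi}-\omega)t} - 1}{i(\omega_{fi}-\omega)}\right] .
\end{aligned}$$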

Inserting this into (3.11), we can write

$$P_{if}(t;\omega) = \frac{|V_{fi}|^2}{4\hbar^2}\left|\frac{1 - e^{i(\omega_{fi}+\omega)t}}{\omega_{fi}+\omega} + \frac{1 - e^{i(\omega_{fi}-\omega)t}}{\omega_{fi}-\omega}\right|^2 \tag{3.12}$$

where I’m specifically including $\omega$ as an argument of $P_{if}$ because the transition probability depends on $\omega$.

Let us consider the special case of a constant (i.e., time-independent) perturbation, $\omega = 0$. In this case, (3.12) reduces to

$$P_{if}(t;0) = \frac{|V_{fi}|^2}{\hbar^2\omega_{fi}^2}\left|1 - e^{i\omega_{fi}t}\right|^2 = \frac{|V_{fi}|^2}{\hbar^2\omega_{fi}^2}\,2(1 - \cos\omega_{fi}t) .$$

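Using the elementary identity

$$\cos A = \cos(A/2 + A/2) = \cos^2(A/2) - \sin^2(A/2) = 1 - 2\sin^2(A/2)$$

we can write the transition probability as

$$P_{if}(t;0) = \frac{|V_{fi}|^2}{\hbar^2}\left[\frac{\sin(\omega_{fi}t/2)}{\omega_{fi}/2}\right]^2 := \frac{|V_{fi}|^2}{\hbar^2}F(t;\omega_{fi}) . \tag{3.13}$$

The function

$$F(t;\omega_{fi}) = \left[\frac{\sin(\omega_{fi}t/2)}{\omega_{fi}/2}\right]^2 = t^2\left[\frac{\sin(\omega_{fi}t/2)}{\omega_{fi}t/2}\right]^2$$

has peak height $t^2$ (at $\omega_{fi} = 0$), and zeros at $\omega_{fi} = 2\pi n/t$ for nonzero integers $n$. See Figure 8 below.

Figure 8: Plot of $F(t;\omega_{fi})$ vs $\omega_{fi}$ for $t = 2$.

The main peak lies between zeros at $\pm 2\pi/t$, so its width goes like $1/t$ while its height goes like $t^2$, and hence its area grows like $t$.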

It is also interesting to see how the transition probability depends on time.

Figure 9: Plot of $F(t;\omega_{fi})$ vs $t$ for $\omega_{fi} = 2$.

Here we see clearly that for times $t = 2\pi n/\omega_{fi}$ the transition probability is zero, and the system is certain to be in its initial state. Because of this oscillatory behavior, the transition probability is greatest if the perturbation is allowed to act only for a short time $\pi/\omega_{fi}$.

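For future reference, let me make a (very un-rigorous but useful) mathematical observation. From Figure 8, we see that as $t \to \infty$, the function $F(t;\omega) = t^2[\sin(\omega t/2)/(\omega t/2)]^2$ has an amplitude $t^2$ that also goes to infinity, and a width $4\pi/t$ centered at $\omega = 0$ that goes to zero. Then if we include $F(t;\omega)$ inside the integral of a smooth function $f(\omega)$, the only contribution to the integral will come from where $\omega = 0$. Using the well-known result

$$\int_{-\infty}^{\infty}\frac{\sin^2 x}{x^2}\,dx = \pi$$

we have (with $x = \omega t/2$ so $dx = (t/2)\,d\omega$)

$$\lim_{t\to\infty}\int_{-\infty}^{\infty} f(\omega)\,t^2\left[\frac{\sin(\omega t/2)}{\omega t/2}\right]^2 d\omega = 2t f(0)\int_{-\infty}^{\infty}\frac{\sin^2 x}{x^2}\,dx = 2\pi t f(0)$$

and hence we conclude that

$$F(t;\omega) = \left[\frac{\sin(\omega t/2)}{\omega/2}\right]^2 = t^2\left[\frac{\sin(\omega t/2)}{\omega t/2}\right]^2 \xrightarrow{\;t\to\infty\;} 2\pi t\,\delta(\omega). \tag{3.14}$$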
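Since the argument leading to (3.14) is admittedly un-rigorous, it may be reassuring to test it numerically. The following is a minimal sketch, assuming a Gaussian test function $f(\omega) = e^{-\omega^2}$ and a simple trapezoid rule; the names `F`, `smeared_ratio` and `gaussian`, as well as the integration window and step size, are illustrative assumptions of mine.

```python
import math


def F(t, omega):
    """F(t; omega) = [sin(omega*t/2) / (omega/2)]^2, with F = t^2 at omega = 0."""
    if abs(omega * t) < 1e-12:
        return t * t
    return (math.sin(omega * t / 2) / (omega / 2)) ** 2


def smeared_ratio(t, f, half_width=8.0, step=1e-3):
    """Trapezoid estimate of [integral of f(w) F(t; w) dw] / (2*pi*t*f(0))."""
    n = int(round(2 * half_width / step))
    total = 0.0
    for k in range(n + 1):
        w = -half_width + k * step
        weight = 0.5 if k in (0, n) else 1.0  # trapezoid end-point weights
        total += weight * f(w) * F(t, w)
    return total * step / (2 * math.pi * t * f(0.0))


def gaussian(w):
    """Smooth test function (an arbitrary choice for this check)."""
    return math.exp(-w * w)
```

If (3.14) is right, `smeared_ratio(t, gaussian)` should move toward 1 as `t` is increased, for example when going from `t = 50` to `t = 200`. The step size must stay small compared with the oscillation scale $2\pi/t$ of $F$ for the trapezoid rule to be trustworthy.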

Example 3.3. Let us take a look at equation (3.12) when $\omega \approx \omega_{fi}$. This is called a resonance phenomenon. We will assume that $\omega \ge 0$ by definition, and we will consider the case where $\omega_{fi} > 0$. The alternative case where $\omega_{fi} < 0$ can be treated in an analogous manner.

We begin by rewriting the two complex terms in (3.12). For the first we have

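$$\begin{aligned}
A_+ := \frac{1 - e^{i(\omega_{fi}+\omega)t}}{\omega_{fi}+\omega} &= e^{i(\omega_{fi}+\omega)t/2}\left[\frac{e^{-i(\omega_{fi}+\omega)t/2} - e^{i(\omega_{fi}+\omega)t/2}}{\omega_{fi}+\omega}\right] \\
&= -i\,e^{i(\omega_{fi}+\omega)t/2}\left[\frac{\sin[(\omega_{fi}+\omega)t/2]}{(\omega_{fi}+\omega)/2}\right]
\end{aligned}$$

and similarly for the second

$$A_- := \frac{1 - e^{i(\omega_{fi}-\omega)t}}{\omega_{fi}-\omega} = -i\,e^{i(\omega_{fi}-\omega)t/2}\left[\frac{\sin[(\omega_{fi}-\omega)t/2]}{(\omega_{fi}-\omega)/2}\right].$$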

If $\omega \approx \omega_{fi}$, then $A_-$ dominates and is called the resonant term, while the term $A_+$ is called the anti-resonant term. (These terms would be switched if we were considering the case $\omega_{fi} < 0$.)

We are considering the case where $|\omega - \omega_{fi}| \ll |\omega_{fi}|$, so $A_+$ can be neglected in comparison to $A_-$. Under these conditions, (3.12) becomes

$$P_{if}(t;\omega) = \frac{|V_{fi}|^2}{4\hbar^2}|A_-|^2 = \frac{|V_{fi}|^2}{4\hbar^2}\left[\frac{\sin[(\omega_{fi}-\omega)t/2]}{(\omega_{fi}-\omega)/2}\right]^2 := \frac{|V_{fi}|^2}{4\hbar^2}F(t;\omega_{fi}-\omega). \tag{3.15}$$

A plot of $F(t;\omega_{fi}-\omega)$ as a function of $\omega$ would be identical to Figure 8 except that the peak would be centered over the point $\omega = \omega_{fi}$. In particular, $F(t;\omega_{fi}-\omega)$ has a maximum value of $t^2$, and a width between its first two zeros of

$$\Delta\omega = \frac{4\pi}{t}. \tag{3.16}$$

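Here is another way to view Example 3.3. Let us consider a time-dependent potential of the form

$$H'(t) = V_0(\mathbf{r})\,e^{\pm i\omega t}. \tag{3.17}$$

Then

$$\begin{aligned}
\int_0^t H'_{fi}(t')\,e^{i\omega_{fi}t'}\,dt' &= V_{fi}\int_0^t e^{i(\omega_{fi}\pm\omega)t'}\,dt' = V_{fi}\,\frac{e^{i(\omega_{fi}\pm\omega)t} - 1}{i(\omega_{fi}\pm\omega)} \\
&= V_{fi}\,e^{i(\omega_{fi}\pm\omega)t/2}\,\frac{\sin[(\omega_{fi}\pm\omega)t/2]}{(\omega_{fi}\pm\omega)/2}
\end{aligned}$$

and (3.11) becomes

$$P_{if}(t) = \frac{|V_{fi}|^2}{\hbar^2}\left[\frac{\sin[(\omega_{fi}\pm\omega)t/2]}{(\omega_{fi}\pm\omega)/2}\right]^2. \tag{3.18}$$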

As $t \to \infty$, we can use (3.14) to write

$$\lim_{t\to\infty}P_{if}(t) = \frac{2\pi}{\hbar}|V_{fi}|^2\,\delta(E_f - E_i \pm \hbar\omega)\,t$$

where we used the general result $\delta(ax) = (1/|a|)\,\delta(x)$ so that $\delta(\omega) = \delta(E/\hbar) = \hbar\,\delta(E)$. Note that the transition probability grows linearly with time. We can write this as

$$P_{if}(t\to\infty) = \Gamma_{i\to f}\,t \tag{3.19a}$$

where the transition rate (i.e., the transition probability per unit time) is defined by

$$\Gamma_{i\to f} = \frac{2\pi}{\hbar}|V_{fi}|^2\,\delta(E_f - E_i \pm \hbar\omega). \tag{3.19b}$$

(The result (3.19b) differs from (3.15) by a factor of 4 in the denominator. This is because in Example 3.2 we used $\cos\omega t$, which contains the terms $(1/2)e^{\pm i\omega t}$.)

Because of the delta function, we only get transitions in those cases where $|E_f - E_i| = \hbar\omega$, which is simply a statement of energy conservation. With $\omega > 0$, in the case of a potential of the form $V_0e^{+i\omega t}$, we have $E_f = E_i - \hbar\omega$ so the system has emitted a quantum of energy. And in the case where we have a potential of the form $V_0e^{-i\omega t}$, we have $E_f = E_i + \hbar\omega$ so the system has absorbed a quantum of energy.

In Example 3.3, we saw that resonance occurs when $\omega = \omega_{fi}$. Since we are considering the case where $\omega_{fi} = (E_f - E_i)/\hbar \ge 0$, this means that resonance is at the point where $E_f = E_i + \hbar\omega$. In other words, a system with energy $E_i$ undergoes a resonant absorption of a quantum of energy $\hbar\omega$ to transition to a state with energy $E_f$. Had we started with the case where $\omega_{fi} < 0$, we would have found that the system underwent a resonant induced emission of the same quantum of energy $\hbar\omega$, so that $E_f = E_i - \hbar\omega$.

Also recall that in Example 3.3, we neglected $A_+$ relative to $A_-$. Noting that $|A_+(\omega)|^2 = |A_-(-\omega)|^2$, it is easy to see that a plot of $|A_+|^2$ is exactly the same as a plot of $|A_-|^2$ reflected about the vertical axis $\omega = 0$. See Figure 10 below. Note that both of these curves have a width $\Delta\omega = 4\pi/t$ that narrows as time increases.

Figure 10: Plot of $|A_+|^2$ and $|A_-|^2$ vs $\omega$ for $t = 2$ and $\omega_{fi} = 20$.

In addition, we see that $A_+$ will be negligible relative to $A_-$ as long as they are well-separated, in other words, as long as the distance $2|\omega_{fi}|$ between the two peaks satisfies

$$2|\omega_{fi}| \gg \Delta\omega.$$

Since $\Delta\omega = 4\pi/t$, this is equivalent to requiring

$$t \gg \frac{1}{|\omega_{fi}|} \approx \frac{1}{\omega}.$$

Physically, this means that the perturbation must act over a long enough time interval $t$ for it to go through enough oscillations that it indeed appears sinusoidal to the system.
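To get a feeling for how good the resonant approximation is, one can compare $|A_+ + A_-|^2$ with $|A_-|^2$ directly. The sketch below does this exactly at resonance, $\omega = \omega_{fi}$; the helper names are hypothetical, and the parameter values in the usage note are simply those of Figure 10.

```python
import cmath


def amplitude_term(Omega, t):
    """(1 - exp(i*Omega*t)) / Omega, using the limit -i*t as Omega -> 0."""
    if abs(Omega * t) < 1e-12:
        return -1j * t
    return (1 - cmath.exp(1j * Omega * t)) / Omega


def resonant_approx_error(t, omega, omega_fi):
    """Relative error made by dropping the anti-resonant term A+ in (3.12)."""
    a_plus = amplitude_term(omega_fi + omega, t)
    a_minus = amplitude_term(omega_fi - omega, t)
    full = abs(a_plus + a_minus) ** 2   # what (3.12) actually contains
    resonant = abs(a_minus) ** 2        # what (3.15) keeps
    return abs(full - resonant) / full
```

With $\omega = \omega_{fi} = 20$, evaluating `resonant_approx_error(2.0, 20.0, 20.0)` and then `resonant_approx_error(20.0, 20.0, 20.0)` should show the error shrinking as $t$ grows past $1/|\omega_{fi}|$, since at resonance $|A_-| = t$ while $|A_+|$ never exceeds $1/|\omega_{fi}|$.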

On the other hand, in both Examples 3.2 and 3.3, the transition probability $P_{if}(t;\omega)$ has a maximum value proportional to $t^2$. Since this approaches infinity as $t \to \infty$, and since a probability always has to be less than or equal to 1, there is clearly something wrong. One answer is that the first-order approximation we are using has a limited time range. In Example 3.3, resonance occurs when $\omega = \omega_{fi}$, in which case

$$P_{if}(t;\omega = \omega_{fi}) = \frac{|V_{fi}|^2}{4\hbar^2}\,t^2.$$

So in order for our first-order approximation to be valid, we must have $P_{if} \ll 1$, that is,

$$t \ll \frac{\hbar}{|V_{fi}|}.$$

Combining this with the previous paragraph, we conclude that

$$\frac{1}{|\omega_{fi}|} \ll \frac{\hbar}{|V_{fi}|}.$$

This is the same as

$$\hbar|\omega_{fi}| = |E_f - E_i| \gg |V_{fi}| = |\langle\varphi_f|V_0|\varphi_i\rangle|$$

and hence the energy difference between the initial and final states must be much larger than the matrix element $V_{fi}$ between these states.

3.2 Transitions to a Continuum of States

In the previous section we considered the transition probability $P_{if}(t)$ from an initial state $\varphi_i$ to a final state $\varphi_f$. But in the real experimental world, detectors generally observe transitions over at least a small range of energies and over a finite range of incident angles. Thus, we should treat not a single final state $\varphi_f$, but rather a group (or continuum) of closely spaced states centered about some $\varphi_f$. Since the area under the curve in Figure 8 grows like $t$, we expect the transition probability to a set of states with approximately the same energy as $\varphi_f$ to grow linearly with time. (We saw this for a transition to a single state in equation (3.19a).)

Let us now generalize (3.19b) to a more physically realistic detector. After all, no physical transition rate can go like a delta function. To get a good idea of what to expect, we first consider the perturbation (3.17) and the resulting transition probability (3.18).

For a physically realistic detector, instead of a transition to a single final state we must consider all transitions to a group of final states centered about $E_f$:

$$P(t) = \sum_{E_f\in\Delta E_f}\frac{|V_{fi}|^2}{\hbar^2}\left[\frac{\sin[(\omega_{fi}\pm\omega)t/2]}{(\omega_{fi}\pm\omega)/2}\right]^2 = \sum_{E_f\in\Delta E_f}|V_{fi}|^2\left[\frac{\sin[(E_f - E_i \pm \hbar\omega)t/2\hbar]}{(E_f - E_i \pm \hbar\omega)/2}\right]^2$$

where the sum is over all states with energies in the range $\Delta E_f$. We assume that the final states are very closely spaced, and hence may be treated as a continuum of states. In that case, the sum may be converted to an integral over the interval $\Delta E_f$ by writing the number of states with energy between $E_f$ and $E_f + dE_f$ as $\rho(E_f)\,dE_f$, where $\rho(E_f)$ is called the density of final states. It is just the number of states per unit energy. Then

$$P(t) = \int_{E_f-\Delta E_f/2}^{E_f+\Delta E_f/2}\rho(E_f)\,dE_f\,|V_{fi}|^2\left[\frac{\sin[(E_f - E_i \pm \hbar\omega)t/2\hbar]}{(E_f - E_i \pm \hbar\omega)/2}\right]^2. \tag{3.20}$$

As $t$ becomes very large, we have seen that the term in brackets becomes sharply peaked about $E_f = E_i \mp \hbar\omega$, and hence we may assume that $\rho(E_f)$ and $|V_{fi}|$ are essentially constant over the region of integration, whose limits we may also let go to $\pm\infty$. Changing variables to $x = (E_f - E_i \pm \hbar\omega)t/2\hbar$ we then have

$$P(t) = \rho(E_f)\,|V_{fi}|^2\,\frac{2t}{\hbar}\int_{-\infty}^{\infty}\frac{\sin^2 x}{x^2}\,dx = \frac{2\pi}{\hbar}\,\rho(E_f)\,|V_{fi}|^2\,t.$$

Defining the transition rate $\Gamma = dP/dt$ we finally arrive at

$$\Gamma = \frac{2\pi}{\hbar}\Big[\rho(E_f)\,|V_{fi}|^2\Big]_{E_f = E_i\mp\hbar\omega} \tag{3.21}$$

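which is called Fermi's golden rule.

A completely equivalent way to write this is to take equations (3.19) and write

$$P(t) = \sum_{\text{final states}}P_{if}(t) = \sum_{\text{final states}}\Gamma_{i\to f}\,t = \Gamma t$$

where

$$\Gamma_{i\to f} = \frac{2\pi}{\hbar}|V_{fi}|^2\,\delta(E_f - E_i \pm \hbar\omega)$$

and

$$\Gamma = \sum_{\text{final states}}\Gamma_{i\to f}.$$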

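If you wish, you can then replace the sum over states by an integral over energies if you include a density of states factor $\rho(E)$. This has the same effect as simply using (3.14) in (3.20) to write

$$\begin{aligned}
P(t) &= \int_{E_f-\Delta E_f/2}^{E_f+\Delta E_f/2}\rho(E_f)\,dE_f\,|V_{fi}|^2\,\frac{2\pi}{\hbar}\,t\,\delta(E_f - E_i \pm \hbar\omega) \\
&= \left[\frac{2\pi}{\hbar}|V_{fi}|^2\int_{E_f-\Delta E_f/2}^{E_f+\Delta E_f/2}\rho(E_f)\,\delta(E_f - E_i \pm \hbar\omega)\,dE_f\right]t \\
&= \Gamma t.
\end{aligned}$$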
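The passage from (3.20) to the golden rule can also be checked numerically. Here is a minimal sketch, assuming that $\rho$ and $|V_{fi}|$ are constant across the detector window and that the window is centered on the energy-conserving value; the function name `golden_rule_ratio` and all parameter values are illustrative assumptions.

```python
import math


def golden_rule_ratio(t, window, rho=1.0, v_fi=1.0, hbar=1.0, step=1e-3):
    """Ratio of the integral (3.20), with constant rho and |V_fi|, to Gamma*t."""
    n = int(round(window / step))
    total = 0.0
    for k in range(n + 1):
        delta = -window / 2 + k * step      # delta stands for E_f - E_i ± hbar*omega
        if abs(delta * t / hbar) < 1e-12:
            bracket = (t / hbar) ** 2       # limit of the bracket as delta -> 0
        else:
            bracket = (math.sin(delta * t / (2 * hbar)) / (delta / 2)) ** 2
        weight = 0.5 if k in (0, n) else 1.0  # trapezoid end-point weights
        total += weight * bracket
    probability = rho * abs(v_fi) ** 2 * total * step
    gamma = 2 * math.pi / hbar * rho * abs(v_fi) ** 2   # equation (3.21)
    return probability / (gamma * t)
```

A call such as `golden_rule_ratio(50.0, 4.0)` should come out close to 1, with the small shortfall coming from the tails of $\sin^2 x/x^2$ that lie outside the finite window. Making the product of window width and time larger should shrink that shortfall.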

Example 3.4. Let us consider a simple, one-dimensional model of photo-ionization, in which a particle of charge $e$ in its ground state $\psi_0$ in a potential $U(x)$ is irradiated by light of frequency $\omega$, and hence is ejected into the continuum.

To keep things simple, we first assume that the wavelength of the incident light is much longer than atomic dimensions. Under these conditions, the electric field of the light may be considered uniform in space, but harmonic in time. (The magnetic field of the light exerts a force that is of order $v/c$ less than the electric force, and may be neglected.) Since we are treating the absorption of energy, we write the electric field as $\mathbf{E} = \mathcal{E}e^{-i\omega t}\,\hat{\mathbf{x}}$. Using $\mathbf{E} = -\nabla\varphi$ we have

$$\int\mathbf{E}\cdot d\mathbf{x} = \mathcal{E}e^{-i\omega t}\int dx = \mathcal{E}e^{-i\omega t}x = -\int\nabla\varphi\cdot d\mathbf{x} = -\varphi(x)$$

so that $\varphi(x) = -\mathcal{E}e^{-i\omega t}x$. From Example 2.2 we know that the interaction energy of the particle in the electric field is given by $e\varphi(x)$, and hence the perturbation is

$$H'(x,t) = -e\mathcal{E}x\,e^{-i\omega t} = V_0(x)\,e^{-i\omega t}.$$

The second assumption we shall make is that the frequency $\omega$ is large enough that the final state energy $E_f$ is very large compared to $U(x)$, and therefore we may treat the final state of the ejected particle as a plane wave (i.e., a free particle of definite energy and momentum).

We need to find the density of final states and the normalization of these states. The standard trick for accomplishing this is to consider our system to be in a box of length $L$, and then let $L \to \infty$. By a proper choice of boundary conditions, this will give us a discrete set of normalizable states. However, we can't treat this like a “particle in a box,” because such states must vanish at the walls, and a state of definite momentum can't vanish. Therefore, we employ the mathematical (but non-physical) trick of assuming periodic boundary conditions, whereby the walls are taken to lie at $x_0$ and $x_0 + L$ together with $\psi(x_0 + L) = \psi(x_0)$.

The free particle plane waves are of the form $e^{ipx/\hbar}$, so our periodic boundary conditions become

$$e^{ip(x_0+L)/\hbar} = e^{ipx_0/\hbar}$$

so that $e^{ipL/\hbar} = 1$ and hence

$$p = \sqrt{2mE} = \frac{2\pi n\hbar}{L}; \qquad n = 0, \pm 1, \pm 2, \ldots.$$

This shows that the momentum (and hence energy) of the particle takes on discrete values. Note that as $L$ gets larger and larger, the spacing of the states becomes closer and closer, and in the limit $L \to \infty$ they become the usual free particle continuum states of definite momentum. This is the justification for using periodic boundary conditions. Finally, the normalization condition $\int_{x_0}^{x_0+L}|\psi|^2\,dx = 1$ implies that the normalized wave functions are then

$$\psi_E = \frac{1}{\sqrt{L}}\,e^{i\sqrt{2mE}\,x/\hbar}.$$

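The next thing we need to do is find the density of states $\rho(E)$, which is defined so that $\rho(E)\,dE$ is the number of states with an energy between $E$ and $E + dE$, i.e., $\rho(E) = dN/dE$. Consider a state with energy $E$ defined by

$$\sqrt{2mE} = \frac{2\pi N\hbar}{L}$$

so that

$$N = \frac{L}{2\pi\hbar}\sqrt{2mE}.$$

From $n = 0, \pm 1, \pm 2, \ldots, \pm N$, we see that there are $2N + 1$ states with energy less than or equal to $E$. Calling this number $N(E)$, we have

$$N(E) = 2N + 1 = \frac{L}{\pi\hbar}\sqrt{2mE} + 1.$$

But then

$$\begin{aligned}
N(E + dE) &= \frac{L}{\pi\hbar}\sqrt{2m(E + dE)} + 1 = \frac{L}{\pi\hbar}\sqrt{2mE}\,\sqrt{1 + dE/E} + 1 \\
&\approx \frac{L}{\pi\hbar}\sqrt{2mE}\,(1 + dE/2E) + 1 = N(E) + \frac{L}{2\pi\hbar}\sqrt{\frac{2m}{E}}\,dE
\end{aligned}$$

and hence

$$dN = N(E + dE) - N(E) = \frac{L}{2\pi\hbar}\sqrt{\frac{2m}{E}}\,dE.$$

Directly from the definition of $\rho(E)$ we then have

$$\rho(E) = \frac{L}{2\pi\hbar}\sqrt{\frac{2m}{E}}. \tag{3.22}$$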
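One way to convince yourself of (3.22) is to count the box states directly. The sketch below counts the integers $n$ whose energies $E_n = (2\pi\hbar n/L)^2/2m$ fall in a small window and compares the result with $\rho(E)\,dE$. The function names and the units $m = \hbar = 1$ are assumptions made only for this illustration.

```python
import math


def count_states(E, dE, L, m=1.0, hbar=1.0):
    """Count periodic-box states n with E <= E_n <= E + dE (E > 0 assumed)."""
    n_lo = L * math.sqrt(2 * m * E) / (2 * math.pi * hbar)
    n_hi = L * math.sqrt(2 * m * (E + dE)) / (2 * math.pi * hbar)
    positive = math.floor(n_hi) - math.ceil(n_lo) + 1
    return 2 * positive  # the states n and -n have the same energy


def rho(E, L, m=1.0, hbar=1.0):
    """One-dimensional density of states, equation (3.22)."""
    return L / (2 * math.pi * hbar) * math.sqrt(2 * m / E)
```

For a large box, say `L = 1e6` with `E = 1.0` and `dE = 0.01`, the value of `count_states(E, dE, L)` should be close to `rho(E + dE / 2, L) * dE`. The agreement gets worse for small boxes, where only a handful of states lie in the window.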

Now we turn to the matrix element $V_{fi}$. The initial state is the normalized wave function $\psi_0$ with energy $E_0 = -\epsilon$ where $\epsilon$ is the binding energy. The final state is the normalized free particle state $\psi_{E_f}$ with energy $E_f = E_0 + \hbar\omega = \hbar\omega - \epsilon$. Then

$$V_{fi} = -\mathcal{E}\,\langle\psi_{E_f}|ex|\psi_0\rangle = -\mathcal{E}\int\frac{1}{\sqrt{L}}\,e^{-i\sqrt{2mE_f}\,x/\hbar}\,ex\,\psi_0\,dx.$$

Note that this is the quantum mechanical average of the energy of an electric dipole in a uniform electric field $\mathcal{E}$.

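Putting all of this together in (3.21), we have the transition rate

$$\begin{aligned}
\Gamma &= \frac{2\pi}{\hbar}\,\frac{L}{2\pi\hbar}\sqrt{\frac{2m}{E_f}}\;e^2\mathcal{E}^2\,\frac{1}{L}\left|\int e^{-i\sqrt{2mE_f}\,x/\hbar}\,x\,\psi_0\,dx\right|^2 \\
&= \frac{e^2\mathcal{E}^2}{\hbar^2}\sqrt{\frac{2m}{E_f}}\left|\int e^{-i\sqrt{2mE_f}\,x/\hbar}\,x\,\psi_0\,dx\right|^2.
\end{aligned} \tag{3.23}$$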

Note that the box size $L$ has canceled out of the final result, as it must.

Let's actually evaluate the integral in (3.23) for the specific example of a particle in a square well potential. Recall that the solutions to this problem consist of sines and cosines inside the well, and exponentially decaying solutions outside. To simplify the calculation, we assume first that the well is so narrow that the ground state is the only bound state (a cosine wave function), and second, that this state is only very slightly bound, so that its wave function extends far beyond the edges of the well. By making the well so narrow, we can simply replace the cosine wave function inside the well by extending the exponential wave functions back to the origin.

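With these additional simplifications, the normalized ground state wave function is

$$\psi_0 = \left(\frac{2m\epsilon}{\hbar^2}\right)^{1/4}e^{-\sqrt{2m\epsilon}\,|x|/\hbar}$$

where $\epsilon$ is the binding energy. Then the integral in (3.23) becomes

$$\begin{aligned}
\int_{-\infty}^{\infty}e^{-i\sqrt{2mE_f}\,x/\hbar}\,x\,\psi_0\,dx &= \left(\frac{2m\epsilon}{\hbar^2}\right)^{1/4}\int_{-\infty}^{\infty}e^{-\sqrt{2m}\,(\sqrt{\epsilon}\,|x| + i\sqrt{E_f}\,x)/\hbar}\,x\,dx \\
&= \left(\frac{2m\epsilon}{\hbar^2}\right)^{1/4}\left[\int_{-\infty}^{0}e^{\sqrt{2m}\,(\sqrt{\epsilon} - i\sqrt{E_f})x/\hbar}\,x\,dx + \int_{0}^{\infty}e^{-\sqrt{2m}\,(\sqrt{\epsilon} + i\sqrt{E_f})x/\hbar}\,x\,dx\right].
\end{aligned}$$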

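Using

$$\int_{-\infty}^{0}e^{ax}\,x\,dx = \frac{\partial}{\partial a}\int_{-\infty}^{0}e^{ax}\,dx = \frac{\partial}{\partial a}\frac{1}{a} = -\frac{1}{a^2}$$

and

$$\int_{0}^{\infty}e^{-bx}\,x\,dx = -\frac{\partial}{\partial b}\int_{0}^{\infty}e^{-bx}\,dx = -\frac{\partial}{\partial b}\frac{1}{b} = \frac{1}{b^2}$$

we have

$$\begin{aligned}
\int_{-\infty}^{\infty}e^{-i\sqrt{2mE_f}\,x/\hbar}\,x\,\psi_0\,dx &= \left(\frac{2m\epsilon}{\hbar^2}\right)^{1/4}\frac{\hbar^2}{2m}\left[\frac{1}{(\sqrt{\epsilon} + i\sqrt{E_f})^2} - \frac{1}{(\sqrt{\epsilon} - i\sqrt{E_f})^2}\right] \\
&= \left(\frac{2m\epsilon}{\hbar^2}\right)^{1/4}\frac{\hbar^2}{2m}\,\frac{(-4i)\sqrt{\epsilon E_f}}{(\epsilon + E_f)^2}.
\end{aligned}$$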
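Because it is easy to drop a sign or a factor of 2 in an integral like this, here is a minimal numerical cross-check of the closed form just obtained. It is only a sketch: the function names are hypothetical, the units $m = \hbar = 1$ are assumed by default, and the integration window `half_width` is assumed to be many decay lengths $\hbar/\sqrt{2m\epsilon}$ wide.

```python
import cmath
import math


def dipole_integral_numeric(eps, e_f, m=1.0, hbar=1.0, half_width=40.0, n=80000):
    """Trapezoid estimate of the integral of exp(-i*k*x) * x * psi_0(x) over x."""
    kappa = math.sqrt(2 * m * eps) / hbar   # decay constant of the bound state
    k = math.sqrt(2 * m * e_f) / hbar       # wave number of the ejected particle
    norm = math.sqrt(kappa)                 # equals (2*m*eps/hbar^2)^(1/4)
    h = 2 * half_width / n
    total = 0j
    for j in range(n + 1):
        x = -half_width + j * h
        weight = 0.5 if j in (0, n) else 1.0  # trapezoid end-point weights
        total += weight * cmath.exp(-1j * k * x) * x * norm * math.exp(-kappa * abs(x))
    return total * h


def dipole_integral_closed(eps, e_f, m=1.0, hbar=1.0):
    """Closed form derived in the text for the same integral."""
    prefactor = (2 * m * eps / hbar ** 2) ** 0.25 * hbar ** 2 / (2 * m)
    return prefactor * (-4j) * math.sqrt(eps * e_f) / (eps + e_f) ** 2
```

For example, with `eps = 0.5` and `e_f = 2.0` the two functions should return the same purely imaginary number to within the accuracy of the trapezoid rule.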

Hence equation (3.23) becomes

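$$\Gamma = \frac{8\hbar e^2\mathcal{E}^2}{m}\,\frac{\epsilon^{3/2}E_f^{1/2}}{(\epsilon + E_f)^4}$$

where $E_f = \hbar\omega - \epsilon$, or $\epsilon + E_f = \hbar\omega$. Since our second initial assumption was essentially that $\hbar\omega \gg \epsilon$, we can replace $E_f$ in the numerator by $\hbar\omega$, leaving us with the final result

$$\Gamma = \frac{8e^2\mathcal{E}^2}{m\hbar^{5/2}}\,\frac{\epsilon^{3/2}}{\omega^{7/2}}.$$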
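To see how much is lost by replacing $E_f$ with $\hbar\omega$ in the numerator, the two expressions for $\Gamma$ can be coded side by side. This is a minimal sketch in which the function names, the argument `field` (standing for the amplitude $\mathcal{E}$) and the default units $e = m = \hbar = 1$ are assumptions for illustration only.

```python
import math


def gamma_exact(omega, eps, field, e=1.0, m=1.0, hbar=1.0):
    """Rate before the high-frequency approximation, with E_f = hbar*omega - eps."""
    e_f = hbar * omega - eps
    if e_f <= 0:
        raise ValueError("photon energy must exceed the binding energy")
    return 8 * hbar * e ** 2 * field ** 2 / m * eps ** 1.5 * math.sqrt(e_f) / (eps + e_f) ** 4


def gamma_high_frequency(omega, eps, field, e=1.0, m=1.0, hbar=1.0):
    """Final result, valid when hbar*omega is much larger than eps."""
    return 8 * e ** 2 * field ** 2 * eps ** 1.5 / (m * hbar ** 2.5 * omega ** 3.5)
```

The ratio `gamma_high_frequency(...) / gamma_exact(...)` is just $\sqrt{\hbar\omega/(\hbar\omega - \epsilon)}$, so it approaches 1 as $\hbar\omega/\epsilon$ grows, and doubling $\omega$ in `gamma_high_frequency` lowers the rate by a factor of $2^{7/2}$.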

What this means is that if we have a collection of $N$ particles of charge $e$ and mass $m$ in their ground state in a potential well with binding energy $\epsilon$, and they are placed in an electromagnetic wave of frequency $\omega$ and electric vector $\mathcal{E}$, then the number of photoelectrons with energy $\hbar\omega - \epsilon$ produced per second is $N\Gamma$.

Now that we have an idea of what the density of states means and how to use the golden rule, let us consider a somewhat more general three-dimensional problem. We will consider an atomic decay $\varphi_i \to \varphi_f$, with the emission of a particle (photon, electron etc.), whose detection is far from the atom, and hence may be described by a plane wave

$$\psi(\mathbf{r},t) = \frac{1}{\sqrt{V}}\,e^{i(\mathbf{p}\cdot\mathbf{r}/\hbar - \omega_p t)}.$$

(At the end of our derivation, we will generalize to multiple particles in the final state.) Here $V$ is the volume of a box that contains the entire system, and the factor $1/\sqrt{V}$ is necessary to normalize the wave function. If we take the box to be very large, its shape doesn't matter, so we take it to be a cube of side $L$. In order to determine the allowed momenta, we impose periodic boundary conditions:

$$\psi(x + L, y, z) = \psi(x, y, z)$$

and similarly for $y$ and $z$. Then $e^{ip_xL/\hbar} = e^{ip_yL/\hbar} = e^{ip_zL/\hbar} = 1$ so that we must have

$$p_x = \frac{2\pi\hbar}{L}\,n_x; \qquad p_y = \frac{2\pi\hbar}{L}\,n_y; \qquad p_z = \frac{2\pi\hbar}{L}\,n_z$$

where each $n_i = 0, \pm 1, \pm 2, \ldots$.

Our real detector will measure all incoming momenta in a range $\mathbf{p}$ to $\mathbf{p} + \delta\mathbf{p}$, and hence we want to calculate the transition rate to all final states in this range. Thus we want

$$\Gamma = \sum_{\delta\mathbf{p}}\Gamma_{i\to f}(\mathbf{p})$$

where $\Gamma_{i\to f}(\mathbf{p})$ is given by (3.19b). Since each momentum state is described by the triple of integers $(n_x, n_y, n_z)$, this is equivalent to the sum

$$\Gamma = \sum_{\delta n_x,\,\delta n_y,\,\delta n_z}\Gamma_{i\to f}(\mathbf{n}) \to \int d^3n\,\Gamma_{i\to f}(\mathbf{n})$$

where we have gone over to an integral in the limit of a very large box, so that compared to $L$, each $\delta n_i$ becomes an infinitesimal $dn_i$. Noting that

$$d^3n = dn_x\,dn_y\,dn_z = \left(\frac{L}{2\pi\hbar}\right)^3dp_x\,dp_y\,dp_z = \frac{V}{(2\pi\hbar)^3}\,d^3p \tag{3.24}$$

we then have (from (3.19b))

$$\Gamma = \frac{2\pi}{\hbar}\int\frac{V}{(2\pi\hbar)^3}\,d^3p\,|M_{fi}|^2\,\delta(E_f - E_i + E) \tag{3.25}$$

where we have assumed that the emitted particle has energy $E$ (which is essentially the integration variable), and we changed notation slightly to $|M_{fi}|^2 = |\langle\varphi_f|H'(t)|\varphi_i\rangle|^2$ where $H'(t) = V_0(\mathbf{r})\,e^{+i\omega t}$ as in (3.17).

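If we let $d\Omega_p = d\cos\theta_p\,d\phi_p$ be the element of solid angle about the direction defined by $\mathbf{p}$, then

$$\begin{aligned}
\Gamma &= \frac{2\pi}{\hbar}\int d\Omega_p\int\frac{V}{(2\pi\hbar)^3}\,p^2\,dp\,|M_{fi}|^2\,\delta(E_f - E_i + E) \\
&= \frac{2\pi}{\hbar}\int d\Omega_p\int\frac{V}{(2\pi\hbar)^3}\,dE\left(\frac{p^2\,dp}{dE}\right)|M_{fi}|^2\,\delta(E_f - E_i + E) \\
&= \frac{2\pi}{\hbar}\int d\Omega_p\,\frac{V}{(2\pi\hbar)^3}\left[\left(\frac{p^2\,dp}{dE}\right)|M_{fi}|^2\right]_{E = E_i - E_f}.
\end{aligned} \tag{3.26}$$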

Here the integral is over $\Omega_p$, and is to cover whatever solid angle range we wish to include. This could be just a small detector angle, or as large as $4\pi$ to include all emitted particles. The quantity in brackets is evaluated at $E = E_i - E_f$ as required by the energy conserving delta function. And the factor of $V$ in the numerator will be canceled by the normalization factor $(1/\sqrt{V})^2$ coming from $|M_{fi}|^2$ and due to the outgoing plane wave particle.

From (3.24) we see that

$$d\Omega_p\,\frac{V}{(2\pi\hbar)^3}\left(\frac{p^2\,dp}{dE}\right) = \frac{V}{(2\pi\hbar)^3}\,\frac{d^3p}{dE} = \frac{d^3n}{dE} := \rho(E) \tag{3.27}$$

where the density of states $\rho(E)$ is defined as the number of states per unit of energy. Note that in the case of a photon (i.e., a massless particle) we have $E = pc$ so that

$$\frac{p^2\,dp}{dE} = \frac{p^2}{c} = \frac{E^2}{c^3} = \frac{\hbar^2}{c^3}\,\omega^2$$

where we used the alternative relation $E = \hbar\omega$. And in the case of a massive particle, we have $E = p^2/2m$ and

$$\frac{p^2\,dp}{dE} = p^2\,\frac{m}{p} = mp = m\sqrt{2mE}.$$

You should compare (3.27) using these results to (3.22). In all cases, the density of states has dimensions of 1/energy, as it should.
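The three-dimensional counting behind (3.24) and (3.27) can be checked by brute force for a massive particle. Integrating (3.27) over all directions and over energies from 0 to $E$ gives $V(4\pi/3)p^3/(2\pi\hbar)^3$ states, which should match the number of integer triples $(n_x, n_y, n_z)$ inside a sphere of radius $Lp/2\pi\hbar$. The sketch below makes that comparison; the function names and the units $m = \hbar = 1$ are my own illustrative assumptions.

```python
import math


def count_momentum_states(radius):
    """Count integer triples (nx, ny, nz) with nx^2 + ny^2 + nz^2 <= radius^2."""
    r = int(math.floor(radius))
    r2 = radius * radius
    count = 0
    for nx in range(-r, r + 1):
        for ny in range(-r, r + 1):
            rem = r2 - nx * nx - ny * ny
            if rem >= 0:
                # nz runs over -floor(sqrt(rem)), ..., +floor(sqrt(rem))
                count += 2 * int(math.floor(math.sqrt(rem))) + 1
    return count


def states_below(E, L, m=1.0, hbar=1.0):
    """Integral of rho over all directions and energies up to E, from (3.27)."""
    p = math.sqrt(2 * m * E)
    return L ** 3 / (2 * math.pi * hbar) ** 3 * (4 * math.pi / 3) * p ** 3
```

Choosing `L` so that `L * math.sqrt(2 * E) / (2 * math.pi)` is about 30, the lattice count `count_momentum_states(30.0)` and the continuum estimate `states_below(E, L)` should agree to well under a percent, with the difference coming from lattice points near the surface of the sphere.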

In terms of the density of states, (3.26) may be written

$$\Gamma = \frac{2\pi}{\hbar}\Big[\rho(E)\,|M_{fi}|^2\Big]_{E = E_i - E_f}. \tag{3.28}$$

This is the golden rule for the emission of a particle of energy $E$. If the final state contains several particles labeled by $k$, then (3.25) becomes

$$\Gamma = \frac{2\pi}{\hbar}\int_{\text{indep }\mathbf{p}_k}\prod_k\frac{V\,d^3p_k}{(2\pi\hbar)^3}\,|M_{fi}|^2\,\delta\Big(E_f - E_i + \sum_k E_k\Big)$$

where the integral is over all independent momenta, since the energy conserving delta function is a condition on the total momenta of the emitted particles, and hence eliminates a degree of freedom. However, the product of phase space factors $V\,d^3p_k/(2\pi\hbar)^3$ is over all particles in the final state. Alternatively, we may leave the integral over all momenta if we include a momentum conserving delta function in addition:

$$\Gamma = \frac{2\pi}{\hbar}\int\prod_k\frac{V\,d^3p_k}{(2\pi\hbar)^3}\,|M_{fi}|^2\,\delta\Big(E_f + \sum_k E_k - E_i\Big)\,\delta\Big(\mathbf{p}_f + \sum_k\mathbf{p}_k - \mathbf{p}_i\Big).$$
