Relativistic Quantum Mechanics

Dipankar Chakrabarti

Department of Physics, Indian Institute of Technology Kanpur, Kanpur 208016, India

(Dated: August 6, 2020)


I. INTRODUCTION

Till now we have dealt with non-relativistic quantum mechanics. A free particle satisfying the Schrödinger equation has the non-relativistic energy $E = \vec{p}^{\,2}/2m$. Non-relativistic QM is applicable to particles with velocity much smaller than the velocity of light ($v \ll c$). But for relativistic particles, i.e., particles with velocity comparable to the velocity of light (e.g., inner-shell electrons in heavy atoms), we need to use relativistic QM. For relativistic QM, we need to formulate a wave equation which is consistent with the Lorentz transformations of the special theory of relativity. A characteristic feature of relativistic wave equations is that the spin of the particle is built into the theory from the beginning and cannot be added afterwards. (The Schrödinger equation carries no spin information; the spin wave function has to be added separately.) This makes a particular relativistic equation applicable only to a particular kind of particle (with a specific spin), i.e., a relativistic equation which describes a scalar particle (spin 0) cannot be applied to a fermion (spin 1/2) or a vector particle (spin 1).

Before discussing relativistic QM, let us briefly summarise some features of the special theory of relativity. Specifying an instant of time $t$ and a point $\vec{r} = (x, y, z)$ of ordinary space defines a point in space-time. We'll use the notation

$$x^\mu = (x^0, x^1, x^2, x^3) \;\Rightarrow\; x^\mu = (x^0, x^i), \qquad x^0 = ct, \quad \mu = 0, 1, 2, 3 \ \text{and} \ i = 1, 2, 3.$$

$x \equiv x^\mu$ is called a 4-vector, whereas $\vec{r} \equiv x^i$ is a 3-vector (for a 4-vector we do not put the vector sign on top of $x$).

Consider two events in space-time, $(x^0, x^1, x^2, x^3)$ and $(x^0 + dx^0, x^1 + dx^1, x^2 + dx^2, x^3 + dx^3)$, where $x^0 = ct$, so $dx^0 = c\,dt$ because $c$, the velocity of light, is a constant. In three-dimensional space we define the distance between two points. We generalize this notion to the interval $ds$ between two points in space-time. For $ds$ to be the same for all observers (i.e., in all inertial frames), it must be invariant under Lorentz transformations and rotations. The interval is defined as

$$ds^2 = g_{\mu\nu}\, dx^\mu dx^\nu \tag{1}$$

where $g_{\mu\nu}$ is the metric of the space-time. In Minkowski space

$$g_{\mu\nu} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \tag{2}$$

So,

$$ds^2 = (c\,dt)^2 - \left((dx^1)^2 + (dx^2)^2 + (dx^3)^2\right) = (c\,dt)^2 - (d\vec{r})^2. \tag{3}$$

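Under a Lorentz transformation $x^\mu$ transforms as $x'^\mu = \Lambda^\mu{}_\nu x^\nu$, where $\Lambda^\mu{}_\nu$ is a $4 \times 4$ matrix representing the Lorentz transformation operator. For example, the operator for a boost along the $x^1$ axis is

$$\Lambda^\mu{}_\nu = \begin{pmatrix} \gamma & -\gamma\beta & 0 & 0 \\ -\gamma\beta & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{4}$$

where $\beta = v/c$ and $\gamma = 1/\sqrt{1 - (v/c)^2}$. So, the transformed coordinates under the boost along $x^1$ are

$$ct' = \gamma\left(ct - \frac{v}{c}\,x^1\right), \quad x'^1 = \gamma\left(x^1 - \frac{v}{c}\,ct\right), \quad x'^2 = x^2, \quad x'^3 = x^3. \tag{5}$$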

Check that

$$ds'^2 = (c\,dt')^2 - \left((dx'^1)^2 + (dx'^2)^2 + (dx'^3)^2\right) = \gamma^2 (c\,dt - \beta\,dx^1)^2 - \gamma^2 (dx^1 - \beta c\,dt)^2 - (dx^2)^2 - (dx^3)^2 = ds^2, \tag{6}$$

i.e., $ds^2$ is Lorentz invariant. Unlike the spatial distance $(d\vec{r})^2$, which is never negative, $ds^2$ can be positive, negative or zero. If

$ds^2 > 0$, i.e., $(c\,dt)^2 > (d\vec{r})^2$, the interval is called "time-like";

$ds^2 < 0$, i.e., $(c\,dt)^2 < (d\vec{r})^2$, the interval is called "space-like";

$ds^2 = 0$, i.e., $(c\,dt)^2 = (d\vec{r})^2$, the interval is called "light-like".
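The invariance of the interval under the boost of Eq. (5) can also be checked numerically. The following is only a minimal sketch: the function names (`boost_x`, `interval`, `classify`) and the sample numbers are illustrative choices, not part of the notes, and the separation is given in arbitrary units as $(c\,dt, dx^1, dx^2, dx^3)$.

```python
import math

def boost_x(separation, beta):
    """Boost a separation (c dt, dx1, dx2, dx3) along x^1 with velocity beta = v/c, as in Eq. (5)."""
    cdt, dx1, dx2, dx3 = separation
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return (gamma * (cdt - beta * dx1), gamma * (dx1 - beta * cdt), dx2, dx3)

def interval(separation):
    """ds^2 = (c dt)^2 - (dx1^2 + dx2^2 + dx3^2), the Minkowski interval of Eq. (3)."""
    cdt, dx1, dx2, dx3 = separation
    return cdt ** 2 - (dx1 ** 2 + dx2 ** 2 + dx3 ** 2)

def classify(ds2, tol=1e-12):
    """Name the interval type from the sign of ds^2 (tol absorbs rounding errors)."""
    if ds2 > tol:
        return "time-like"
    if ds2 < -tol:
        return "space-like"
    return "light-like"

separation = (2.0, 1.0, 0.5, 0.0)          # illustrative numbers only
boosted = boost_x(separation, beta=0.6)
print(interval(separation), interval(boosted), classify(interval(separation)))
```

Because the boost is linear, applying it to a coordinate difference is the same as boosting both events and then subtracting.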

Covariant & contravariant vectors: Any quantity which transforms like $x^\mu$ under a Lorentz transformation is called a contravariant vector, while anything which transforms like $\frac{\partial}{\partial x^\mu}$ is called a covariant vector. The general convention is to write a contravariant vector as $a^\mu$ (i.e., $\mu$ is a superscript) and a covariant vector as $a_\mu$ (i.e., $\mu$ is a subscript), so that $\frac{\partial}{\partial x^\mu} = \partial_\mu$. The inner product of a covariant vector and a contravariant vector is a Lorentz invariant (i.e., a scalar). The contravariant and covariant vectors are related by

$$x_\mu = \sum_\nu g_{\mu\nu}\, x^\nu. \tag{7}$$

Using the convention of summation over repeated indices, we can write the above equation as $x_\mu = g_{\mu\nu} x^\nu$, where the index $\nu$ in $g_{\mu\nu}$ is repeated in $x^\nu$ and hence is summed over. Similarly, $x^\mu = g^{\mu\nu} x_\nu$. In Minkowski space, $g^{\mu\nu} = g_{\mu\nu}$. So, we have

$$x_0 = g_{0\nu} x^\nu = g_{00} x^0 + g_{01} x^1 + g_{02} x^2 + g_{03} x^3 = g_{00} x^0 = x^0 \tag{8}$$

$$x_1 = g_{1\nu} x^\nu = g_{10} x^0 + g_{11} x^1 + g_{12} x^2 + g_{13} x^3 = g_{11} x^1 = -x^1. \tag{9}$$

Similarly $x_2 = -x^2$ and $x_3 = -x^3$.

The inner product or scalar product of two 4-vectors is defined as

$$A \cdot B = A_\mu B^\mu = (A_0 B^0 + A_1 B^1 + A_2 B^2 + A_3 B^3) = (A^0 B^0 - A^1 B^1 - A^2 B^2 - A^3 B^3) \tag{10}$$

$$= A^0 B^0 - \vec{A} \cdot \vec{B} = g_{\mu\nu} A^\mu B^\nu = g^{\mu\nu} A_\mu B_\nu. \tag{11}$$

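Differential operators:

$$\partial_\mu = \frac{\partial}{\partial x^\mu} = \left(\frac{1}{c}\frac{\partial}{\partial t}, \frac{\partial}{\partial x^1}, \frac{\partial}{\partial x^2}, \frac{\partial}{\partial x^3}\right) \tag{12}$$

$$= (\partial_0, \partial_1, \partial_2, \partial_3) = \left(\frac{1}{c}\frac{\partial}{\partial t}, \vec{\nabla}\right) \tag{13}$$

$$\partial^\mu = g^{\mu\nu} \partial_\nu = \left(\frac{1}{c}\frac{\partial}{\partial t}, -\vec{\nabla}\right) \tag{14}$$

The Lorentz-invariant second-order differential operator, the d'Alembertian operator, is

$$\Box = \partial_\mu \partial^\mu = \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right) = \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \nabla^2. \tag{15}$$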

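We know the relativistic mass $m_r = \gamma m$ and energy $E = m_r c^2 = \gamma m c^2$. The energy-momentum 4-vector is $p^\mu = (E/c, \vec{p})$, where $\vec{p} = \gamma m \vec{v}$. So,

$$p^2 = g_{\mu\nu} p^\mu p^\nu = p_\mu p^\mu = \left(\frac{E}{c}\right)^2 - (\vec{p})^2 = \frac{(\gamma m c^2)^2}{c^2} - (\gamma m \vec{v})^2 = m^2 c^2 \tag{16}$$

(in natural units, $\hbar = c = 1$, $p^2 = m^2$). So, the relativistic energy-momentum relation is given by $E^2 = \vec{p}^{\,2} c^2 + m^2 c^4$. Another useful quantity is

$$p \cdot x = p_\mu x^\mu = Et - \vec{p} \cdot \vec{x}. \tag{17}$$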
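The index gymnastics above can be made concrete with a small numerical sketch. The helper names (`lower`, `dot`, `four_momentum`) and the sample values are assumptions made for this illustration only; the motion is taken along $x^1$ and the metric is the diagonal Minkowski metric of Eq. (2).

```python
import math

METRIC_DIAGONAL = (1.0, -1.0, -1.0, -1.0)   # g_{mu nu} = diag(+1, -1, -1, -1), Eq. (2)

def lower(a_up):
    """Covariant components a_mu = g_{mu nu} a^nu, Eqs. (7)-(9)."""
    return tuple(g * a for g, a in zip(METRIC_DIAGONAL, a_up))

def dot(a_up, b_up):
    """Scalar product A.B = A_mu B^mu = A^0 B^0 - (3-vector dot product), Eqs. (10)-(11)."""
    return sum(a_low * b for a_low, b in zip(lower(a_up), b_up))

def four_momentum(m, v, c=1.0):
    """p^mu = (E/c, p) = (gamma m c, gamma m v, 0, 0) for velocity v along x^1."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return (gamma * m * c, gamma * m * v, 0.0, 0.0)

m, c = 2.0, 1.0                      # illustrative values, natural units
p_up = four_momentum(m, v=0.6, c=c)
x_up = (c * 3.0, 1.0, 0.0, 0.0)      # event at t = 3, x^1 = 1

print(lower(p_up))                   # spatial components change sign
print(dot(p_up, p_up), (m * c) ** 2) # p^2 should equal m^2 c^2, Eq. (16)
print(dot(p_up, x_up))               # p.x = E t - p.x (3-vector), Eq. (17)
```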

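For a non-relativistic particle ($v \ll c$), we can write

$$E = \sqrt{\vec{p}^{\,2} c^2 + m^2 c^4} = mc^2 \left(1 + \frac{\vec{p}^{\,2}}{m^2 c^2}\right)^{1/2} \tag{18}$$

$$= mc^2 \left(1 + \frac{\vec{p}^{\,2}}{2 m^2 c^2} - \frac{(\vec{p}^{\,2})^2}{8 m^4 c^4} + \cdots\right) = mc^2 + \frac{\vec{p}^{\,2}}{2m} - \cdots \tag{19}$$

Neglecting the higher-order terms, the kinetic energy of a non-relativistic particle is $\frac{\vec{p}^{\,2}}{2m} = E - mc^2$.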
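To get a feeling for how quickly the expansion in Eq. (19) converges, one can compare the exact kinetic energy $E - mc^2$ with $\vec{p}^{\,2}/2m$ and with the first correction $-(\vec{p}^{\,2})^2/8m^3c^2$. This is only an illustrative sketch; the function names and the choice $m = c = 1$ are assumptions made here.

```python
import math

def exact_kinetic(p, m, c=1.0):
    """E - m c^2 with E = sqrt(p^2 c^2 + m^2 c^4), Eq. (18)."""
    return math.sqrt(p ** 2 * c ** 2 + m ** 2 * c ** 4) - m * c ** 2

def nr_kinetic(p, m):
    """Leading non-relativistic term p^2 / 2m."""
    return p ** 2 / (2.0 * m)

def first_correction(p, m, c=1.0):
    """Next term of Eq. (19): m c^2 * (-p^4 / 8 m^4 c^4) = -p^4 / (8 m^3 c^2)."""
    return -p ** 4 / (8.0 * m ** 3 * c ** 2)

m, c = 1.0, 1.0
for p in (0.01, 0.1, 0.5):           # momentum in units of m c
    exact = exact_kinetic(p, m, c)
    leading = nr_kinetic(p, m)
    corrected = leading + first_correction(p, m, c)
    print(p, exact, leading, corrected)
```

For $|\vec{p}| \ll mc$ the leading term is already very close to the exact value, and the gap grows as $|\vec{p}|$ approaches $mc$.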

A. Klein-Gordon Equation

Schrödinger proposed a relativistic form of his equation at the same time as he developed his non-relativistic (NR) equation. Klein and Gordon developed this equation later, and it is known as the Klein-Gordon (KG) equation. The Schrödinger equation follows from the NR energy-momentum dispersion relation $E = \frac{\vec{p}^{\,2}}{2m}$. Using the correspondence principle

$$E \to \hat{E} = i\hbar\frac{\partial}{\partial t}, \qquad \vec{p} \to \hat{\vec{p}} = -i\hbar\vec{\nabla} \tag{20}$$

in $E\phi(\vec{r}, t) = \frac{\vec{p}^{\,2}}{2m}\phi(\vec{r}, t)$, we arrive at the Schrödinger equation for a free particle. Now extend the same procedure to a relativistic particle with the energy-momentum relation $E^2 = \vec{p}^{\,2} c^2 + m^2 c^4$. We then get the relativistic wave equation

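$$E^2 \phi(x) = (\vec{p}^{\,2} c^2 + m^2 c^4)\,\phi(x) \tag{21}$$

$$\Rightarrow\; -\hbar^2 \frac{\partial^2}{\partial t^2}\phi(x) = (-\hbar^2 c^2 \vec{\nabla}^2 + m^2 c^4)\,\phi(x) \tag{22}$$

$$\Rightarrow\; \left(\frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \vec{\nabla}^2\right)\phi(x) = -\frac{m^2 c^2}{\hbar^2}\,\phi(x) \tag{23}$$

$$\Rightarrow\; \left(\Box + \frac{m^2 c^2}{\hbar^2}\right)\phi(x) = 0. \tag{24}$$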

This equation is known as the Klein-Gordon equation. Note that $\Box = \partial_\mu \partial^\mu$ is a Lorentz-invariant operator, so the KG equation is Lorentz invariant only if $\phi$ is Lorentz invariant, i.e., a Lorentz scalar. Thus the KG equation describes the relativistic dynamics of a scalar particle. The plane wave solution of the KG equation is

$$\phi(x) = N e^{-i(Et - \vec{p} \cdot \vec{x})/\hbar} \tag{25}$$

where $N$ is the normalization constant and the energy is $E = \pm\sqrt{\vec{p}^{\,2} c^2 + m^2 c^4}$, i.e., the energy can be both positive and negative.

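Continuity Equation:

Pre-multiply Eq. (23) by $\phi^*(x)$ to get

$$\phi^*(x)\left(\frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \vec{\nabla}^2\right)\phi(x) = -\frac{m^2 c^2}{\hbar^2}\,\phi^*(x)\phi(x). \tag{26}$$

Now take the complex conjugate of Eq. (23) and post-multiply by $\phi(x)$, which gives

$$\left(\frac{1}{c^2}\frac{\partial^2 \phi^*}{\partial t^2}\right)\phi - (\vec{\nabla}^2 \phi^*)\,\phi = -\frac{m^2 c^2}{\hbar^2}\,\phi^*(x)\phi(x). \tag{27}$$

Eq. (26) $-$ Eq. (27) gives:

$$\phi^* \frac{1}{c^2}\frac{\partial^2 \phi}{\partial t^2} - \frac{1}{c^2}\frac{\partial^2 \phi^*}{\partial t^2}\,\phi - (\phi^* \nabla^2 \phi - \phi \nabla^2 \phi^*) = 0. \tag{28}$$

Multiplying by $i\hbar/2m$ and writing each difference as a total derivative,

$$\Rightarrow\; \frac{1}{c}\frac{\partial}{\partial t}\left[\frac{i\hbar}{2mc}\left(\phi^* \frac{\partial \phi}{\partial t} - \frac{\partial \phi^*}{\partial t}\phi\right)\right] + \vec{\nabla} \cdot \left[\frac{\hbar}{2im}\left(\phi^* \vec{\nabla}\phi - (\vec{\nabla}\phi^*)\phi\right)\right] = 0 \tag{29}$$

$$\Rightarrow\; \frac{1}{c}\frac{\partial \rho}{\partial t} + \vec{\nabla} \cdot \vec{j} = 0 \tag{30}$$

$$\Rightarrow\; \partial_\mu j^\mu = 0. \tag{31}$$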

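This is the continuity equation for the KG equation, where

$$j^0 = \rho = \frac{i\hbar}{2mc}\left(\phi^* \frac{\partial \phi}{\partial t} - \frac{\partial \phi^*}{\partial t}\phi\right) \tag{32}$$

$$\vec{j} = \frac{\hbar}{2im}\left(\phi^* \vec{\nabla}\phi - (\vec{\nabla}\phi^*)\phi\right). \tag{33}$$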

Recall that in the continuity equation for the Schrödinger equation, $\rho$ is the probability density and $\vec{j}$ is the probability current. The continuity equation expresses the conservation of probability. It says that if the probability of finding a particle in some region decreases, the probability of finding it outside that region increases, i.e., there is a flow of probability current such that the total probability remains conserved. Since the KG equation satisfies a continuity equation of the same form, it is natural to interpret $\rho$ as the probability density and $\vec{j}$ as the probability current. [Note: A density transforms like the 0th component of a 4-vector ($j^\mu$) under Lorentz transformations. Since $\phi$ is a Lorentz-invariant quantity, $|\phi|^2$ does not transform like a density, but $\rho$ defined in Eq. (32) does.] For the plane wave solution the probability density is proportional to the energy: with the normalization of Eq. (32), $\rho = |N|^2 E/(mc)$, which reads $\rho = 2|N|^2 E$ in natural units with the common normalization $j^\mu = i(\phi^* \partial^\mu \phi - \phi\, \partial^\mu \phi^*)$.
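The sign of the KG density for plane waves can be checked with a short numerical sketch. Everything here is an illustrative assumption: one spatial dimension, units with $\hbar = m = c = 1$, the helper names, and a central finite difference in place of the exact time derivative in Eq. (32).

```python
import cmath
import math

HBAR = M = C = 1.0   # natural units, chosen only for this illustration

def plane_wave(E, p, norm=1.0):
    """phi(t, x) = N exp(-i (E t - p x) / hbar), the plane wave of Eq. (25) in one dimension."""
    return lambda t, x: norm * cmath.exp(-1j * (E * t - p * x) / HBAR)

def kg_density(phi, t, x, dt=1e-6):
    """rho = (i hbar / 2 m c) (phi* d_t phi - (d_t phi*) phi), Eq. (32), via a central difference."""
    dphi_dt = (phi(t + dt, x) - phi(t - dt, x)) / (2.0 * dt)
    value = phi(t, x)
    rho = 1j * HBAR / (2.0 * M * C) * (value.conjugate() * dphi_dt - dphi_dt.conjugate() * value)
    return rho.real   # the bracket is purely imaginary, so rho is real

p = 0.75
E = math.sqrt(p ** 2 * C ** 2 + M ** 2 * C ** 4)
rho_positive = kg_density(plane_wave(+E, p), t=0.3, x=0.2)
rho_negative = kg_density(plane_wave(-E, p), t=0.3, x=0.2)
print(rho_positive, rho_negative)   # equal magnitude, opposite sign
```

The two densities differ only in sign, which is the numerical counterpart of $\rho \propto E$.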

There are two major problems with the KG equation.

(1) The equation has both positive and negative energy solutions. The negative energy solutions pose a problem! For large $|\vec{p}|$ we can have a large negative energy, i.e., the energy of the system is unbounded from below. So, we could extract an arbitrarily large amount of energy from the system by pushing it into more and more negative energy states. One may say that we truncate the physical space to the positive energy states only, i.e., only $E = +\sqrt{\vec{p}^{\,2} c^2 + m^2 c^4}$ is physical. But then (a) the eigenstates do not form a complete basis (both +ve and -ve energy states are Fourier modes of $\phi$); without a completeness relation we lose the superposition principle too, i.e., we cannot expand a state $\chi$ in the basis of $\phi$ ($\chi = \sum_i c_i \phi_i$ is no longer valid), and (b) a perturbation may cause the system to jump to a negative energy state. Since the -ve energy states are valid solutions of the KG equation, we cannot stop that. So, just declaring the negative energy states unphysical does not work.

(2) The second problem is associated with the probability density. As we have seen, $\rho \propto E$ ($\rho = 2|N|^2 E$ in natural units), i.e., $\rho$ is negative if $E$ is negative. But to interpret $\rho$ as a probability density, it must be positive definite.

[Though the KG equation looks awkward in QM at this point, in QFT it is a valid equation for scalar (spin 0) particles. Feynman and Stückelberg interpreted the positive energy states as particles propagating forward in time and the negative energy states as particles propagating backward in time, which represent antiparticles propagating forward in time. But we'll not discuss those developments here.]

B. Dirac Equation:

The probability density in the KG equation depends on the energy and becomes negative for negative energy. The energy in the expression for $\rho$ appears due to the time derivative in Eq. (32). Dirac realised that this is because the KG equation involves a second-order time derivative. Notice that the Schrödinger equation involves a first-order time derivative, and its $\rho$ does not involve any time derivative. So, if we want to write a relativistic wave equation with a positive definite probability density, the equation should be first order in the time derivative. To be consistent with the Lorentz transformations of the special theory of relativity, a wave equation with a first-order time derivative must also be first order in the space derivatives. So, Dirac wrote the Hamiltonian as

$$H = \alpha_1 p_1 c + \alpha_2 p_2 c + \alpha_3 p_3 c + \beta m c^2. \tag{34}$$

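Writing the momentum in differential operator form in position space, we must have the wave equation

$$i\hbar\frac{\partial \psi(x)}{\partial t} = \left(-i\hbar c\left(\alpha_1 \frac{\partial}{\partial x^1} + \alpha_2 \frac{\partial}{\partial x^2} + \alpha_3 \frac{\partial}{\partial x^3}\right) + \beta m c^2\right)\psi(x) = (-i\hbar c\,\vec{\alpha} \cdot \vec{\nabla} + \beta m c^2)\,\psi(x). \tag{35}$$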

Since the above Hamiltonian has to describe a free particle, $\alpha_i$ and $\beta$ cannot depend on space and time: such terms would act like space-time dependent energies and give rise to forces. Also, $\alpha_i$ and $\beta$ cannot contain space or time derivatives; the derivatives should appear only in $p_i$ and $E$, since the equation is to be linear in all these derivatives. Thus $\alpha_i$ and $\beta$ are constants.

A relativistic particle must satisfy the relativistic energy-momentum relation $E^2 = \vec{p}^{\,2} c^2 + m^2 c^4$, i.e., each solution of the Dirac equation must also satisfy the KG equation.

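Applying the operators on both sides of Eq. (35) twice, we get

$$\left(i\hbar\frac{\partial}{\partial t}\right)^2 \psi = \left(-i\hbar c\left(\alpha_1 \frac{\partial}{\partial x^1} + \alpha_2 \frac{\partial}{\partial x^2} + \alpha_3 \frac{\partial}{\partial x^3}\right) + \beta m c^2\right)\left(-i\hbar c\left(\alpha_1 \frac{\partial}{\partial x^1} + \alpha_2 \frac{\partial}{\partial x^2} + \alpha_3 \frac{\partial}{\partial x^3}\right) + \beta m c^2\right)\psi$$

$$= \Bigg[-\hbar^2 c^2\left(\alpha_1^2 \frac{\partial^2}{\partial (x^1)^2} + \alpha_2^2 \frac{\partial^2}{\partial (x^2)^2} + \alpha_3^2 \frac{\partial^2}{\partial (x^3)^2}\right) + \beta^2 m^2 c^4$$

$$\quad - \hbar^2 c^2\left((\alpha_1\alpha_2 + \alpha_2\alpha_1)\frac{\partial}{\partial x^1}\frac{\partial}{\partial x^2} + (\alpha_1\alpha_3 + \alpha_3\alpha_1)\frac{\partial}{\partial x^1}\frac{\partial}{\partial x^3} + (\alpha_2\alpha_3 + \alpha_3\alpha_2)\frac{\partial}{\partial x^2}\frac{\partial}{\partial x^3}\right)$$

$$\quad - i m c^3 \hbar\left((\alpha_1\beta + \beta\alpha_1)\frac{\partial}{\partial x^1} + (\alpha_2\beta + \beta\alpha_2)\frac{\partial}{\partial x^2} + (\alpha_3\beta + \beta\alpha_3)\frac{\partial}{\partial x^3}\right)\Bigg]\psi. \tag{36}$$

To satisfy $E^2 = \vec{p}^{\,2} c^2 + m^2 c^4$, the above equation must reduce to

$$-\hbar^2 \frac{\partial^2 \psi}{\partial t^2} = -\hbar^2 c^2\left(\frac{\partial^2}{\partial (x^1)^2} + \frac{\partial^2}{\partial (x^2)^2} + \frac{\partial^2}{\partial (x^3)^2}\right)\psi + m^2 c^4 \psi. \tag{37}$$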

Now if Eq. (36) has to agree with Eq. (37), then $\alpha_i$ ($i = 1, 2, 3$) and $\beta$ must satisfy

$$\alpha_i \alpha_j + \alpha_j \alpha_i = 0, \quad (i \neq j) \tag{38}$$

$$\alpha_i \beta + \beta \alpha_i = 0 \tag{39}$$

$$\alpha_i^2 = 1, \quad \beta^2 = 1. \tag{40}$$

Clearly, $\alpha_i$ and $\beta$ cannot be ordinary classical numbers, since they anticommute with each other. So, Dirac proposed that they are matrices. The above anticommutation relations can be written in short form as

$$\{\alpha_i, \alpha_j\} = 0 \ (i \neq j), \qquad \{\alpha_i, \beta\} = 0. \tag{41}$$

(The notation $\{\ ,\ \}$ is called the anticommutator.) Combining this with the fact that $\alpha_i^2 = 1$, we can write

$$\{\alpha_i, \alpha_j\} = 2\delta_{ij} I. \tag{42}$$

If $\alpha_i$ and $\beta$ are matrices, $\psi$ cannot be a single-component wave function; it must have more than one component, and these can be written as a column vector on which the matrices operate.

For the Dirac equation, we need four linearly independent matrices satisfying the anticommutation relations. Since the Hamiltonian is hermitian, each of the four matrices $\alpha_i$, $\beta$ must be hermitian, and hence they are square matrices ($n \times n$). Since the square of each of the four matrices is unity, their eigenvalues are $+1$ and $-1$. If we choose $\beta$ to be diagonal, then the $\alpha_i$ cannot be diagonal, as they anticommute with $\beta$. In 2 dimensions, we have the three Pauli matrices, which anticommute with each other, but the fourth linearly independent $2 \times 2$ matrix is the identity matrix, which commutes with all other matrices. So, we cannot find a linearly independent fourth matrix that anticommutes with the Pauli matrices. Similarly, we fail to find four $3 \times 3$ matrices satisfying all the above conditions (the anticommutation relations force each matrix to be traceless, and a traceless matrix with eigenvalues $\pm 1$ must have even dimension). The smallest possible dimension that admits four such matrices is $4 \times 4$. One such set of matrices is:

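$$\alpha_i = \begin{pmatrix} 0 & \sigma_i \\ \sigma_i & 0 \end{pmatrix}, \qquad \beta = \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix} \tag{43}$$

where $\sigma_i$ are the Pauli matrices and $I$ is the $2 \times 2$ identity matrix:

$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{44}$$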

$\alpha_i$ and $\beta$ are not unique. All matrices related to these matrices by any unitary $4 \times 4$ matrix $U$ are equally valid, i.e.,

$$\alpha'_i = U \alpha_i U^{-1}, \qquad \beta' = U \beta U^{-1} \tag{45}$$

also satisfy all the anticommutation relations and can be used in the Dirac equation. Since $\alpha_i$ and $\beta$ are $4 \times 4$ matrices, $\psi$ is a 4-component column vector. As $U U^{-1} = I$, you can show that for the Dirac equation to keep its form, $\psi$ must then transform as $\psi' = U\psi$.
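The anticommutation relations (38)-(42) for the representation in Eq. (43) can be verified by direct matrix multiplication. The sketch below uses plain nested lists; the helper names (`matmul`, `block`, `anticommutator`) are illustrative choices, not anything defined in these notes.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    top = [row_a + row_b for row_a, row_b in zip(a, b)]
    bottom = [row_c + row_d for row_c, row_d in zip(c, d)]
    return top + bottom

def anticommutator(a, b):
    """{a, b} = ab + ba."""
    return add(matmul(a, b), matmul(b, a))

ZERO2 = [[0, 0], [0, 0]]
I2 = [[1, 0], [0, 1]]
MINUS_I2 = [[-1, 0], [0, -1]]
SIGMA = [
    [[0, 1], [1, 0]],        # sigma_1
    [[0, -1j], [1j, 0]],     # sigma_2
    [[1, 0], [0, -1]],       # sigma_3
]

ALPHA = [block(ZERO2, s, s, ZERO2) for s in SIGMA]   # Eq. (43)
BETA = block(I2, ZERO2, ZERO2, MINUS_I2)             # Eq. (43)
IDENTITY4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
ZERO4 = [[0] * 4 for _ in range(4)]
TWICE_IDENTITY4 = [[2 * x for x in row] for row in IDENTITY4]

for i in range(3):
    assert anticommutator(ALPHA[i], BETA) == ZERO4                 # Eq. (39)
    for j in range(3):
        expected = TWICE_IDENTITY4 if i == j else ZERO4
        assert anticommutator(ALPHA[i], ALPHA[j]) == expected      # Eq. (42)
assert matmul(BETA, BETA) == IDENTITY4                             # Eq. (40)
print("Eqs. (38)-(42) hold for the matrices of Eq. (43)")
```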

Free particle solution: As for the KG equation, we look for a solution whose space-time behaviour is of plane wave form:

$$\psi(x) = \omega\, e^{-i p \cdot x/\hbar} = \omega\, e^{-iEt/\hbar + i\vec{p} \cdot \vec{x}/\hbar} \tag{46}$$

where $\omega$ is a 4-component vector, independent of $x$, and is called the Dirac spinor. Let us write $\omega$ in 2-component notation

$$\omega = \begin{pmatrix} \phi \\ \chi \end{pmatrix} \tag{47}$$

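where $\phi$ and $\chi$ are 2-component spinors. Putting the solution into the Dirac equation (Eq. 35), we get

$$E \begin{pmatrix} \phi \\ \chi \end{pmatrix} = c\,\vec{\alpha} \cdot \vec{p} \begin{pmatrix} \phi \\ \chi \end{pmatrix} + \beta m c^2 \begin{pmatrix} \phi \\ \chi \end{pmatrix} = \begin{pmatrix} 0 & c\,\vec{\sigma} \cdot \vec{p} \\ c\,\vec{\sigma} \cdot \vec{p} & 0 \end{pmatrix} \begin{pmatrix} \phi \\ \chi \end{pmatrix} + m c^2 \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix} \begin{pmatrix} \phi \\ \chi \end{pmatrix} = \begin{pmatrix} m c^2 I & c\,\vec{\sigma} \cdot \vec{p} \\ c\,\vec{\sigma} \cdot \vec{p} & -m c^2 I \end{pmatrix} \begin{pmatrix} \phi \\ \chi \end{pmatrix}. \tag{48}$$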

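The matrix equation can be written as two coupled equations:

$$E\phi = m c^2 \phi + c\,\vec{\sigma} \cdot \vec{p}\,\chi \;\Rightarrow\; (E - m c^2)\,\phi = c\,\vec{\sigma} \cdot \vec{p}\,\chi, \tag{49}$$

and

$$E\chi = -m c^2 \chi + c\,\vec{\sigma} \cdot \vec{p}\,\phi \;\Rightarrow\; (E + m c^2)\,\chi = c\,\vec{\sigma} \cdot \vec{p}\,\phi \;\Rightarrow\; \chi = \frac{c\,\vec{\sigma} \cdot \vec{p}}{E + m c^2}\,\phi. \tag{50}$$

Putting Eq. (50) into Eq. (49), we have

$$(E - m c^2)\,\phi = c\,\vec{\sigma} \cdot \vec{p}\,\frac{c\,\vec{\sigma} \cdot \vec{p}}{E + m c^2}\,\phi = \frac{c^2 (\vec{\sigma} \cdot \vec{p})^2}{E + m c^2}\,\phi = \frac{\vec{p}^{\,2} c^2}{E + m c^2}\,\phi \tag{51}$$

where we have used $(\vec{\sigma} \cdot \vec{A})(\vec{\sigma} \cdot \vec{B}) = \vec{A} \cdot \vec{B}\, I + i\vec{\sigma} \cdot (\vec{A} \times \vec{B}) \Rightarrow (\vec{\sigma} \cdot \vec{p})^2 = (\vec{p})^2$. So finally we get

$$(E - m c^2)(E + m c^2)\,\phi = \vec{p}^{\,2} c^2 \phi \;\Rightarrow\; E^2 = \vec{p}^{\,2} c^2 + m^2 c^4 \tag{52}$$

i.e., $E = \pm\sqrt{\vec{p}^{\,2} c^2 + m^2 c^4}$, which means negative energy solutions are still admitted. Dirac's prescription cannot get rid of the negative energy solutions. Let us postpone the discussion of negative energies for now. We'll come back to the issue of the negative energy solutions at the end.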
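The algebra of Eqs. (48)-(52) can be cross-checked numerically: the square of $H = c\,\vec{\alpha} \cdot \vec{p} + \beta m c^2$ should be $E^2$ times the identity, and the spinor built from Eq. (50) should be an eigenvector of $H$ with eigenvalue $+E$. The sketch below assumes $c = 1$ by default and an arbitrary sample momentum; all function names are illustrative.

```python
import math

def sigma_dot(p):
    """2x2 matrix sigma . p built from the Pauli matrices of Eq. (44)."""
    px, py, pz = p
    return [[pz, px - 1j * py], [px + 1j * py, -pz]]

def dirac_hamiltonian(p, m, c=1.0):
    """H = c alpha.p + beta m c^2 in the representation of Eq. (43), i.e. the matrix in Eq. (48)."""
    sp = sigma_dot(p)
    mc2 = m * c ** 2
    return [
        [mc2, 0.0, c * sp[0][0], c * sp[0][1]],
        [0.0, mc2, c * sp[1][0], c * sp[1][1]],
        [c * sp[0][0], c * sp[0][1], -mc2, 0.0],
        [c * sp[1][0], c * sp[1][1], 0.0, -mc2],
    ]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def matvec(a, v):
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(a))]

def positive_energy_spinor(p, m, phi, c=1.0):
    """omega = (phi, chi) with chi = c sigma.p phi / (E + m c^2), Eq. (50), unnormalized."""
    E = math.sqrt(sum(x * x for x in p) * c ** 2 + m ** 2 * c ** 4)
    sp = sigma_dot(p)
    chi = [c * (sp[i][0] * phi[0] + sp[i][1] * phi[1]) / (E + m * c ** 2) for i in range(2)]
    return list(phi) + chi

p, m = (0.3, -0.4, 1.2), 1.0                      # illustrative values
E = math.sqrt(sum(x * x for x in p) + m ** 2)
H = dirac_hamiltonian(p, m)
H_squared = matmul(H, H)
omega = positive_energy_spinor(p, m, (1.0, 0.0))
residual = max(abs(h - E * w) for h, w in zip(matvec(H, omega), omega))
print(H_squared[0][0], E ** 2, residual)
```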

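Let us first check what happens to the probability density. To derive the continuity equation, first pre-multiply the Dirac equation by $\psi^\dagger$:

$$\psi^\dagger\, i\hbar\frac{\partial \psi}{\partial t} = \psi^\dagger(-i\hbar c\,\vec{\alpha} \cdot \vec{\nabla} + \beta m c^2)\,\psi. \tag{53}$$

Take the hermitian conjugate of the Dirac equation and post-multiply by $\psi$:

$$-i\hbar\left(\frac{\partial \psi^\dagger}{\partial t}\right)\psi = \psi^\dagger(i\hbar c\,\vec{\alpha} \cdot \overleftarrow{\nabla} + \beta m c^2)\,\psi. \tag{54}$$

Note that the spatial derivative $\overleftarrow{\nabla}$ acts to the left, i.e., on $\psi^\dagger$, and that $\alpha_i^\dagger = \alpha_i$ and $\beta^\dagger = \beta$. Now subtracting Eq. (54) from Eq. (53), we get

$$i\hbar\frac{\partial}{\partial t}(\psi^\dagger \psi) = -i\hbar c\,\psi^\dagger(\vec{\alpha} \cdot \vec{\nabla} + \vec{\alpha} \cdot \overleftarrow{\nabla})\,\psi = -i\hbar c\,\vec{\nabla} \cdot (\psi^\dagger \vec{\alpha}\,\psi). \tag{55}$$

Thus we get the continuity equation (in covariant form)

$$\frac{1}{c}\frac{\partial}{\partial t}(\psi^\dagger \psi) + \vec{\nabla} \cdot (\psi^\dagger \vec{\alpha}\,\psi) = 0 \tag{56}$$

$$\Rightarrow\; \frac{1}{c}\frac{\partial \rho}{\partial t} + \vec{\nabla} \cdot \vec{j} = 0 \tag{57}$$

$$\Rightarrow\; \partial_\mu j^\mu = 0 \tag{58}$$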

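where $\rho = j^0 = \psi^\dagger \psi$ and $\vec{j} = \psi^\dagger \vec{\alpha}\,\psi$. Since $\psi$ is a 4-component vector, let us write

$$\psi = \begin{pmatrix} \psi_1 \\ \psi_2 \\ \psi_3 \\ \psi_4 \end{pmatrix}. \tag{59}$$

Now the probability density is

$$\rho = \psi^\dagger \psi = \begin{pmatrix} \psi_1^* & \psi_2^* & \psi_3^* & \psi_4^* \end{pmatrix} \begin{pmatrix} \psi_1 \\ \psi_2 \\ \psi_3 \\ \psi_4 \end{pmatrix} = |\psi_1|^2 + |\psi_2|^2 + |\psi_3|^2 + |\psi_4|^2 \geq 0 \tag{60}$$

$\Rightarrow \rho$ is positive definite. Thus it can be interpreted as a probability density.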

But now we need to interpret $\psi$, which is a four-component vector. What is the significance or physical meaning of these components? Note that the $\alpha$ matrices involve the Pauli matrices $\sigma_i$. We know that the spin operator is written as $\vec{S} = \frac{\hbar}{2}\vec{\sigma}$. So, one obvious question arises: do the different components of the Dirac spinor represent different spin components?

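Let us consider the positive energy solution only. From Eqs. (47) and (50), we can write

$$\omega = \begin{pmatrix} \phi \\ \frac{c\,\vec{\sigma} \cdot \vec{p}}{E + m c^2}\,\phi \end{pmatrix}. \tag{61}$$

The 2-component spinor $\phi$ is completely arbitrary. We may choose two linearly independent forms

$$\phi_\uparrow = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad \phi_\downarrow = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{62}$$

These are the eigenstates of $S_z = \frac{\hbar}{2}\sigma_z$. The most general form can be expanded in terms of these two basis vectors:

$$\phi = a\,\phi_\uparrow + b\,\phi_\downarrow = \begin{pmatrix} a \\ b \end{pmatrix}. \tag{63}$$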

We have two linearly independent solutions for any energy $E$. So, for a given 4-momentum, there are just two linearly independent solutions, i.e., the solutions for $\omega$ are 2-fold degenerate, just as expected for a quantum system with $j = 1/2$ (multiplicity $(2j + 1) = 2$). To give the spin interpretation, let us consider the rest frame of the particle, i.e., $\vec{p} = 0$. Then $E = mc^2$ (we are considering positive energy only). Two linearly independent solutions in the rest frame can be written as

$$\psi^1 = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} e^{-i m c^2 t/\hbar}, \qquad \psi^2 = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} e^{-i m c^2 t/\hbar}. \tag{64}$$

These are eigenfunctions of the operator

$$\Sigma_z = \frac{\hbar}{2}\begin{pmatrix} \sigma_z & 0 \\ 0 & \sigma_z \end{pmatrix} \tag{65}$$

with eigenvalues $\pm\frac{\hbar}{2}$. Both solutions have the same energy, but the eigenvalues of $\Sigma_z$ distinguish them. Similarly, for the negative energy solutions ($E = -mc^2$), we can have two solutions $\psi^3$ and $\psi^4$ with 1 as the third and the fourth element of the column matrix respectively, and the corresponding sign change in the exponential. Generalizing the definition of the operator $\Sigma_z$ to all three components, we write

$$\vec{\Sigma} = \frac{\hbar}{2}\begin{pmatrix} \vec{\sigma} & 0 \\ 0 & \vec{\sigma} \end{pmatrix}. \tag{66}$$

The components satisfy the standard commutation relations for spin operators, $[\Sigma_x, \Sigma_y] = i\hbar\Sigma_z$. Thus $\vec{\Sigma}$ is the appropriate spin-1/2 operator for our rest-frame solutions, and we may conclude that, at least in the rest frame, the Dirac solutions represent spin-1/2 particles. We know that spin is a fundamental property of a particle; the spin of a particle does not change if we boost the system. So a spin-half particle in the rest frame is a spin-half particle in all frames. Thus the Dirac equation describes the dynamics of spin-half particles, or fermions.

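But for $\vec{p} \neq 0$, $\vec{\Sigma}$ is no longer a suitable operator to describe spin, as it does not commute with $H = c\,\vec{\alpha} \cdot \vec{p} + \beta m c^2$. We need to find an operator that commutes with $H$ and whose eigenvalues distinguish the two states with the same energy. One such operator is the helicity operator

$$h(\vec{p}) = \frac{\hbar}{2}\begin{pmatrix} \frac{\vec{\sigma} \cdot \vec{p}}{|\vec{p}|} & 0 \\ 0 & \frac{\vec{\sigma} \cdot \vec{p}}{|\vec{p}|} \end{pmatrix}. \tag{67}$$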

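Physically, the helicity operator $h(\vec{p})$ gives the projection of the spin ($\vec{\Sigma}$) along the direction of $\vec{p}$. The eigenvalues of $h(\vec{p})$ are called the helicity of the particle. $h(\vec{p})$ has the eigenvalues $h = \pm\frac{\hbar}{2}$, i.e.,

$$h(\vec{p})\,\omega = \pm\frac{\hbar}{2}\,\omega \;\Rightarrow\; \frac{\hbar}{2}\begin{pmatrix} \frac{\vec{\sigma} \cdot \vec{p}}{|\vec{p}|} & 0 \\ 0 & \frac{\vec{\sigma} \cdot \vec{p}}{|\vec{p}|} \end{pmatrix} \begin{pmatrix} \phi \\ \frac{c\,\vec{\sigma} \cdot \vec{p}}{E + m c^2}\,\phi \end{pmatrix} = \pm\frac{\hbar}{2}\begin{pmatrix} \phi \\ \frac{c\,\vec{\sigma} \cdot \vec{p}}{E + m c^2}\,\phi \end{pmatrix}. \tag{68}$$

For positive helicity, we must have

$$\frac{\vec{\sigma} \cdot \vec{p}}{|\vec{p}|}\,\phi_+ = \phi_+, \tag{69}$$

and for negative helicity,

$$\frac{\vec{\sigma} \cdot \vec{p}}{|\vec{p}|}\,\phi_- = -\phi_-. \tag{70}$$

$\phi_+$ and $\phi_-$ are linearly independent helicity spinors.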
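That helicity commutes with $H$ while $\Sigma_z$ in general does not can be checked by brute force. The following sketch works in units $\hbar = c = 1$ with an arbitrary sample momentum; the helper names and the "largest commutator entry" measure are choices made for this illustration.

```python
import math

def sigma_dot(p):
    """2x2 matrix sigma . p."""
    px, py, pz = p
    return [[pz, px - 1j * py], [px + 1j * py, -pz]]

def block_diag(s, scale=1.0):
    """4x4 matrix scale * diag(s, s) built from a 2x2 block s."""
    zeros = [0.0, 0.0]
    top = [[scale * x for x in row] + zeros for row in s]
    bottom = [zeros + [scale * x for x in row] for row in s]
    return top + bottom

def dirac_hamiltonian(p, m, c=1.0):
    """H = c alpha.p + beta m c^2 in the representation of Eq. (43)."""
    sp = sigma_dot(p)
    mc2 = m * c ** 2
    return [
        [mc2, 0.0, c * sp[0][0], c * sp[0][1]],
        [0.0, mc2, c * sp[1][0], c * sp[1][1]],
        [c * sp[0][0], c * sp[0][1], -mc2, 0.0],
        [c * sp[1][0], c * sp[1][1], 0.0, -mc2],
    ]

def commutator_size(a, b):
    """Largest absolute entry of [a, b] = ab - ba."""
    n = len(a)
    ab = [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
    ba = [[sum(b[i][k] * a[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
    return max(abs(ab[i][j] - ba[i][j]) for i in range(n) for j in range(n))

def helicity(p):
    """h(p) = (1/2) diag(sigma.p/|p|, sigma.p/|p|), Eq. (67) with hbar = 1."""
    return block_diag(sigma_dot(p), scale=0.5 / math.sqrt(sum(x * x for x in p)))

SPIN_Z = block_diag([[1.0, 0.0], [0.0, -1.0]], scale=0.5)   # Sigma_z, Eq. (65) with hbar = 1

p, m = (0.3, -0.4, 1.2), 1.0
H = dirac_hamiltonian(p, m)
print(commutator_size(H, helicity(p)))   # vanishes up to rounding
print(commutator_size(H, SPIN_Z))        # non-zero for momentum not along z
print(commutator_size(dirac_hamiltonian((0.0, 0.0, 1.2), m), SPIN_Z))  # zero for p along z
```

For momentum along $z$ the helicity operator coincides with $\Sigma_z$, which is why the last commutator vanishes.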

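Lorentz transformation properties of Dirac spinor:

Rotation: For simplicity, consider a particle moving along the z-direction, i.e., $\vec{p} = (0, 0, p)$. Then we can simply choose

$$\phi_+ = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad \phi_- = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{71}$$

Now rotate the coordinate system about the x axis by an angle $\theta$. Since $\vec{p}$ is a vector, it transforms in just the same way as $\vec{r}$:

$$p'_x = p_x = 0, \tag{72}$$

$$p'_y = \cos\theta\, p_y + \sin\theta\, p_z = \sin\theta\, p \tag{73}$$

$$p'_z = -\sin\theta\, p_y + \cos\theta\, p_z = \cos\theta\, p. \tag{74}$$

Let $\phi_+ \to \phi'_+$ under the rotation, so we must have

$$\frac{\vec{\sigma} \cdot \vec{p}\,'}{|\vec{p}\,'|}\,\phi'_+ = \phi'_+ \tag{75}$$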

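since it must represent a positive helicity state; the state is merely being described in a different coordinate system. Writing the above equation in explicit form,

$$\frac{\sigma_x p'_x + \sigma_y p'_y + \sigma_z p'_z}{|\vec{p}\,'|}\,\phi'_+ = \phi'_+ \tag{76}$$

or,

$$\frac{1}{p}\left[p\begin{pmatrix} 0 & -i\sin\theta \\ i\sin\theta & 0 \end{pmatrix} + p\begin{pmatrix} \cos\theta & 0 \\ 0 & -\cos\theta \end{pmatrix}\right]\phi'_+ = \phi'_+ \tag{77}$$

or,

$$\begin{pmatrix} \cos\theta & -i\sin\theta \\ i\sin\theta & -\cos\theta \end{pmatrix}\phi'_+ = \phi'_+. \tag{78}$$

Let $\phi'_+ = \begin{pmatrix} a \\ b \end{pmatrix}$, so we can write the above equation as

$$a\cos\theta - ib\sin\theta = a \tag{79}$$

$$ia\sin\theta - b\cos\theta = b. \tag{80}$$

Solving the above two coupled equations (and normalizing), we get $a = \cos(\theta/2)$ and $b = i\sin(\theta/2)$, i.e.,

$$\phi'_+ = \begin{pmatrix} \cos(\theta/2) \\ i\sin(\theta/2) \end{pmatrix} = \begin{pmatrix} \cos(\theta/2) & i\sin(\theta/2) \\ i\sin(\theta/2) & \cos(\theta/2) \end{pmatrix}\phi_+. \tag{81}$$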

Similarly, we can write the solution for the negative helicity state as

$$\phi'_- = \begin{pmatrix} i\sin(\theta/2) \\ \cos(\theta/2) \end{pmatrix} = \begin{pmatrix} \cos(\theta/2) & i\sin(\theta/2) \\ i\sin(\theta/2) & \cos(\theta/2) \end{pmatrix}\phi_-. \tag{82}$$

Thus, the wavefunctions in the rotated coordinate system are linear combinations of the corresponding components in the original coordinate system. The transformation is given by a $2 \times 2$ matrix, as the spinors ($\phi_\pm$) are 2-component vectors. The matrix elements of the rotation operator depend on the rotation angle, but in the spinor case it is the half-angle $\theta/2$ that enters. (Note: In the case of a vector, the rotation angle $\theta$ itself appears in the rotation matrix.) We can write the above rotation in a compact form as

$$\phi' = e^{i\sigma_x \theta/2}\,\phi. \tag{83}$$

For a rotation about an arbitrary direction $\hat{n}$, the transformation rule can be generalized to

$$\phi' = e^{i\vec{\sigma} \cdot \hat{n}\,\theta/2}\,\phi = U\phi. \tag{84}$$

So, spinors have a well-defined transformation rule under rotations. But it is different from that of a vector, tensor or scalar. Since the rotation operator $U$ is unitary, the norm of the state is preserved: $\phi'^\dagger \phi' = \phi^\dagger \phi$.
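The rotation rule of Eqs. (81)-(83) is easy to test numerically. The sketch below uses $e^{i\sigma_x\theta/2} = \cos(\theta/2)\,I + i\sin(\theta/2)\,\sigma_x$ (which follows from $\sigma_x^2 = I$) and checks that the rotated spinors are helicity eigenstates of the matrix in Eq. (78). Function names and the sample angle are illustrative assumptions.

```python
import math

def rotate_about_x(phi, theta):
    """Apply U = exp(i sigma_x theta / 2) = cos(theta/2) I + i sin(theta/2) sigma_x, Eq. (83)."""
    c, s = math.cos(theta / 2.0), math.sin(theta / 2.0)
    a, b = phi
    return (c * a + 1j * s * b, 1j * s * a + c * b)

def rotated_helicity_matrix(theta):
    """sigma . p' / |p'| for p' = p (0, sin theta, cos theta), the matrix of Eq. (78)."""
    return ((math.cos(theta), -1j * math.sin(theta)),
            (1j * math.sin(theta), -math.cos(theta)))

def apply(matrix, phi):
    return tuple(row[0] * phi[0] + row[1] * phi[1] for row in matrix)

def norm_squared(phi):
    return sum(abs(x) ** 2 for x in phi)

theta = 0.7                                   # arbitrary sample angle in radians
phi_plus, phi_minus = (1.0, 0.0), (0.0, 1.0)  # Eq. (71)
rotated_plus = rotate_about_x(phi_plus, theta)
rotated_minus = rotate_about_x(phi_minus, theta)

print(rotated_plus, apply(rotated_helicity_matrix(theta), rotated_plus))
print(rotated_minus, apply(rotated_helicity_matrix(theta), rotated_minus))
print(rotate_about_x(phi_plus, 2.0 * math.pi))   # a full turn flips the sign of the spinor
```

The last line illustrates a consequence of the half-angle: a rotation by $2\pi$ returns $-\phi$ rather than $\phi$, since $U(2\pi) = \cos\pi\, I = -I$.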

Boost: Let the system be at rest in the $S$ frame, i.e., $\vec{p} = 0$ and energy $E = mc^2$, and let it have momentum $\vec{p}\,' = (p'_x, 0, 0)$ and energy $E'$ in the $S'$ frame. That is, we are considering a Lorentz transformation where the $S'$ frame has velocity $(-v_x, 0, 0)$ with respect to the $S$ frame, where $v_x = c^2 p'_x/E'$. $p^\mu = (E/c, \vec{p})$ transforms like a 4-vector. For a boost along the x-axis,

$$\frac{E'}{c} = \cosh\xi\,\frac{E}{c} + \sinh\xi\, p_x \tag{85}$$

$$p'_x = \sinh\xi\,\frac{E}{c} + \cosh\xi\, p_x \tag{86}$$

where $\tanh\xi = c\,p'_x/E'$.

In the rest frame ($S$ frame), we choose $\phi = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$, i.e., $\omega = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}$.

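In the $S'$ frame the spinor is

$$\omega' = N\begin{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \\ \frac{c\,\sigma_x p'_x}{E' + m c^2}\begin{pmatrix} 1 \\ 0 \end{pmatrix} \end{pmatrix} = N\begin{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \\ \frac{c\,p'_x}{E' + m c^2}\begin{pmatrix} 0 \\ 1 \end{pmatrix} \end{pmatrix} \tag{87}$$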

where $N$ is the normalization constant. A Lorentz boost can be regarded as a rotation by an imaginary angle (formally $i\theta \to \xi$). But $\phi$ and $\chi$ transform differently under a boost, because $E$ and $\vec{p}$ enter differently in $\phi$ and $\chi$. So, compared to a rotation, we expect a pure imaginary angle and a matrix that couples $\phi$ and $\chi$. The transformation can be written as

$$\omega' = e^{\alpha_x \xi/2}\,\omega \tag{88}$$

where $\alpha_x = \begin{pmatrix} 0 & \sigma_x \\ \sigma_x & 0 \end{pmatrix}$, so

$$e^{\alpha_x \xi/2} = \cosh(\xi/2) + \alpha_x \sinh(\xi/2) = \begin{pmatrix} \cosh(\xi/2) & \sigma_x \sinh(\xi/2) \\ \sigma_x \sinh(\xi/2) & \cosh(\xi/2) \end{pmatrix} \tag{89}$$

as $\alpha_x^2 = 1$. For a boost in an arbitrary direction the general transformation rule is:

$$\omega' = \exp\left(\frac{\vec{\alpha} \cdot \vec{v}}{|\vec{v}|}\,\frac{\xi}{2}\right)\omega. \tag{90}$$

Thus the Lorentz transformation rule for $\omega$ or $\psi$ is different from that of a vector. So, $\psi$ is not a vector under Lorentz transformations; it is called a spinor. Any object which transforms like $\psi$ under Lorentz transformations is called a spinor.
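One can check that the boost matrix of Eq. (89), acting on the rest-frame spinor, reproduces the spinor of Eq. (87) up to normalization. This sketch assumes a rapidity $\xi$ with $E' = mc^2\cosh\xi$ and $p'_x = mc\sinh\xi$ (from Eqs. (85)-(86) with $\vec{p} = 0$); the function names and numbers are illustrative only.

```python
import math

def boost_rest_spinor(xi):
    """exp(alpha_x xi / 2) acting on omega = (1, 0, 0, 0): the first column of Eq. (89)."""
    return (math.cosh(xi / 2.0), 0.0, 0.0, math.sinh(xi / 2.0))

def direct_spinor(p_x, m, c=1.0):
    """Spinor of Eq. (87) without the normalization constant N."""
    energy = math.sqrt(p_x ** 2 * c ** 2 + m ** 2 * c ** 4)
    return (1.0, 0.0, 0.0, c * p_x / (energy + m * c ** 2))

m, c, xi = 1.0, 1.0, 0.8                 # illustrative values
p_x = m * c * math.sinh(xi)              # momentum in the S' frame
energy = m * c ** 2 * math.cosh(xi)      # energy in the S' frame

boosted = boost_rest_spinor(xi)
rescaled = tuple(x / boosted[0] for x in boosted)   # fix the overall normalization
print(rescaled)
print(direct_spinor(p_x, m, c))
print(math.tanh(xi), c * p_x / energy)   # tanh(xi) = c p'_x / E'
```

The agreement rests on the identity $\tanh(\xi/2) = \sinh\xi/(\cosh\xi + 1)$.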

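Positive and negative energy spinors: Write $E = +\sqrt{\vec{p}^{\,2} c^2 + m^2 c^4}$. As the time derivative in the Dirac equation acting on the plane wave brings down $p^0 c$, we can write the Dirac equation as

$$p^0 c\begin{pmatrix} \phi \\ \chi \end{pmatrix} = \begin{pmatrix} m c^2 I & c\,\vec{\sigma} \cdot \vec{p} \\ c\,\vec{\sigma} \cdot \vec{p} & -m c^2 I \end{pmatrix}\begin{pmatrix} \phi \\ \chi \end{pmatrix}. \tag{91}$$

For the positive energy solution, $p^0 c = E > 0$ and the Dirac spinor is written as

$$\omega^s = N\begin{pmatrix} \phi^s \\ \frac{c\,\vec{\sigma} \cdot \vec{p}}{E + m c^2}\,\phi^s \end{pmatrix} \tag{92}$$

where $s = 1, 2$ corresponds to the different spin states. Choosing the normalization condition $\omega^\dagger \omega = 2E$, we can write the positive energy spinor as

$$u(p, s) = \sqrt{E + m c^2}\begin{pmatrix} \phi^s \\ \frac{c\,\vec{\sigma} \cdot \vec{p}}{E + m c^2}\,\phi^s \end{pmatrix}, \tag{93}$$

and the complete plane wave positive energy solution is

$$\psi^s = u(p, s)\, e^{-i p \cdot x/\hbar} \tag{94}$$

where $p^\mu = (+E/c, \vec{p})$.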

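For negative energy states, $p^0 c = -\sqrt{\vec{p}^{\,2} c^2 + m^2 c^4} = -E < 0$. To construct the spinor for -ve energy, let us first consider the solution in the rest frame: $p^0 c = -E = -mc^2$, $\vec{p} = 0$ (we assume $E$ is a positive quantity). Then the Dirac equation simplifies to

$$-m c^2\begin{pmatrix} \phi \\ \chi \end{pmatrix} = \begin{pmatrix} m c^2 I & 0 \\ 0 & -m c^2 I \end{pmatrix}\begin{pmatrix} \phi \\ \chi \end{pmatrix} \tag{95}$$

which gives $\phi = 0 \Rightarrow \omega = \begin{pmatrix} 0 \\ \chi \end{pmatrix}$. For finite momentum, we have from the Dirac equation

$$-E\begin{pmatrix} \phi \\ \chi \end{pmatrix} = \begin{pmatrix} m c^2 I & c\,\vec{\sigma} \cdot \vec{p} \\ c\,\vec{\sigma} \cdot \vec{p} & -m c^2 I \end{pmatrix}\begin{pmatrix} \phi \\ \chi \end{pmatrix}, \tag{96}$$

which gives

$$\phi = -\frac{c\,\vec{\sigma} \cdot \vec{p}}{E + m c^2}\,\chi \tag{97}$$

$$\Rightarrow\; \omega(-E, \vec{p}, s) = N\begin{pmatrix} -\frac{c\,\vec{\sigma} \cdot \vec{p}}{E + m c^2}\,\chi^s \\ \chi^s \end{pmatrix}. \tag{98}$$

Let us now change the sign of $\vec{p}$, i.e., consider the solution for the negative 4-momentum $p_-^\mu = (-E/c, -\vec{p}) = -p^\mu$:

$$\omega(-E, -\vec{p}, s) = N\begin{pmatrix} \frac{c\,\vec{\sigma} \cdot \vec{p}}{E + m c^2}\,\chi^s \\ \chi^s \end{pmatrix}. \tag{99}$$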

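Again using the same normalization condition $\omega^\dagger \omega = 2E$, we write the negative energy spinor as

$$v(p, s) = \sqrt{E + m c^2}\begin{pmatrix} \frac{c\,\vec{\sigma} \cdot \vec{p}}{E + m c^2}\,\chi^s \\ \chi^s \end{pmatrix} \tag{100}$$

with $s = 1, 2$, and the complete plane wave solution reads

$$\psi^s = v(p, s)\, e^{-i p_- \cdot x/\hbar} = v(p, s)\, e^{i p \cdot x/\hbar} \tag{101}$$

where $p^\mu = (E/c, \vec{p})$.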
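The normalization $\omega^\dagger \omega = 2E$ of the spinors in Eqs. (93) and (100) can be confirmed numerically. The sketch below takes $c = 1$ by default and an arbitrary sample momentum; the function names are illustrative, and the two-component spinors are passed in as unit-norm tuples.

```python
import math

def sigma_dot_p(p, spinor):
    """Apply sigma . p to a 2-component spinor."""
    px, py, pz = p
    a, b = spinor
    return (pz * a + (px - 1j * py) * b, (px + 1j * py) * a - pz * b)

def energy(p, m, c=1.0):
    """E = +sqrt(p^2 c^2 + m^2 c^4)."""
    return math.sqrt(sum(x * x for x in p) * c ** 2 + m ** 2 * c ** 4)

def u_spinor(p, m, phi, c=1.0):
    """Positive energy spinor u(p, s) of Eq. (93)."""
    E = energy(p, m, c)
    lower = tuple(c * x / (E + m * c ** 2) for x in sigma_dot_p(p, phi))
    return tuple(math.sqrt(E + m * c ** 2) * x for x in tuple(phi) + lower)

def v_spinor(p, m, chi, c=1.0):
    """Negative energy spinor v(p, s) of Eq. (100)."""
    E = energy(p, m, c)
    upper = tuple(c * x / (E + m * c ** 2) for x in sigma_dot_p(p, chi))
    return tuple(math.sqrt(E + m * c ** 2) * x for x in upper + tuple(chi))

def norm_squared(omega):
    """omega^dagger omega."""
    return sum(abs(x) ** 2 for x in omega)

p, m = (0.3, -0.4, 1.2), 1.0              # illustrative values
E = energy(p, m)
for two_spinor in ((1.0, 0.0), (0.0, 1.0)):
    print(norm_squared(u_spinor(p, m, two_spinor)),
          norm_squared(v_spinor(p, m, two_spinor)),
          2.0 * E)
```

The result follows from $(E + mc^2) + \vec{p}^{\,2}c^2/(E + mc^2) = (E + mc^2) + (E - mc^2) = 2E$.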

Dirac's interpretation of the negative energy solutions:

The physical interpretation of the positive energy solutions is straightforward. They describe spin-1/2 particles with 4-momentum $p^\mu = (E/c, \vec{p})$. The probability density $\rho$ is positive definite and, together with the probability current $\vec{j}$, satisfies the continuity equation. But since the negative energy solutions are also allowed, as for the KG equation, a particle with +ve energy can cascade down through the -ve energy levels without limit. Hence the +ve energy states cannot be stable! To make any sense of the Dirac equation, one then needs to make the +ve energy states stable, preventing them from making transitions to the -ve energy states. Here comes the masterstroke of Dirac! Dirac postulated that the normal empty or vacuum state corresponds to the state with no positive energy particle and with all the negative energy states completely filled up! The state with completely filled negative energy levels is called the Dirac sea. Since the Dirac equation describes fermions, according to the Pauli exclusion principle only two electrons (one spin up and one spin down) can occupy an energy level, and once the levels are occupied, any +ve energy particle is forbidden to fall into the -ve energy levels. Let us assume that the spin-1/2 particle we are talking about is an electron. So, the vacuum is the state where all negative energy levels are filled up by electrons, i.e., it has infinite negative charge and energy! But since all observations represent finite fluctuations of charge and energy with respect to the vacuum, this leads to an acceptable theory, and we rescale the vacuum to be without any charge and energy (charge of the vacuum = 0, energy of the vacuum = 0, spin of the vacuum = 0). Assume an electron with energy $-E$ and spin up is removed from the Dirac sea. It will create a "hole" relative to the normal vacuum:

energy of the "hole" $= -(-E) = +E$ → positive energy

charge of the "hole" $= -(-e) = +e$ → positive charge

spin of the "hole" $= -(\text{up}) =$ down.

FIG. 1: Energy levels of the Dirac equation. The blobs represent electrons occupying the energy levels in the Dirac sea. Each level can hold two electrons (spin up and spin down) according to the Pauli exclusion principle. The open circle represents the absence of an electron, i.e., the presence of a "hole".

Thus the absence of a negative energy electron with spin up is equivalent to the presence of a positive energy, positively charged "hole" with spin down. So, the "hole" represents the antiparticle of the electron (i.e., the positron). So, according to Dirac, the unfilled negative energy states represent positive energy antiparticles. Thus, in order to give stability to the +ve energy states, Dirac predicted the existence of the positron! (Actually, when Dirac wrote this equation, the positron was not known, and he thought that the proton, which is a positively charged particle, might be the antiparticle of the electron!) Anderson discovered the positron in 1932, for which he won the Nobel Prize.

Dirac had to consider an infinite number of electrons filling up the negative energy states in order to describe a stable "single" electron with positive energy! So, in that sense, it is no longer a "single-particle" theory! Exciting a negative energy electron to a positive energy state, i.e., creating a physical electron from the vacuum, also creates a "hole" in the Dirac sea, i.e., a positive energy positron; this corresponds to the process of creating an $e^- e^+$ pair! The appropriate theory to describe the creation or destruction of particles is quantum field theory! Dirac's theory thus suggests moving to quantum field theory.

Here we have discussed only free relativistic equations. If we consider the Dirac equation for the hydrogen atom (i.e., the Dirac equation in a central potential), the fine structure appears in the energy eigenvalues. One can also include the interaction with radiation by introducing gauge fields, as we discussed before for the Schrödinger equation. But we'll not discuss them here.

