Relativistic Quantum Mechanics
Dipankar Chakrabarti
Department of Physics, Indian Institute of Technology Kanpur, Kanpur 208016, India
(Dated: August 6, 2020)
I. INTRODUCTION
So far we have dealt with non-relativistic quantum mechanics (QM). A free particle satisfying the Schrödinger equation has the non-relativistic energy $E = \vec{p}^{\,2}/2m$. Non-relativistic QM is applicable to particles with velocity much smaller than the velocity of light ($v \ll c$). For relativistic particles, i.e., particles with velocity comparable to the velocity of light (e.g., electrons in atomic orbits), we need relativistic QM: a wave equation that is consistent with the Lorentz transformations of the special theory of relativity. A characteristic feature of relativistic wave equations is that the spin of the particle is built into the theory from the beginning and cannot be added afterwards. (The Schrödinger equation carries no spin information; the spin wave function has to be added separately.) This makes a particular relativistic equation applicable only to a particular kind of particle with a specific spin: a relativistic equation that describes a scalar particle (spin 0) cannot be applied to a fermion (spin 1/2) or to a vector particle (spin 1).
Before discussing relativistic QM, let us briefly summarise some features of the special theory of relativity. Specifying an instant of time $t$ and a point $\vec{r} = (x, y, z)$ of ordinary space defines a point in space-time. We'll use the notation
\[ x^\mu = (x^0, x^1, x^2, x^3) \;\Rightarrow\; x^\mu = (x^0, x^i), \quad x^0 = ct, \quad \mu = 0, 1, 2, 3 \ \text{and}\ i = 1, 2, 3. \]
$x \equiv x^\mu$ is called a 4-vector, whereas $\vec{r} \equiv x^i$ is a 3-vector (for a 4-vector we don't put the vector sign ($\rightarrow$) on top of $x$).
Consider two events in space-time, $(x^0, x^1, x^2, x^3)$ and $(x^0 + dx^0, x^1 + dx^1, x^2 + dx^2, x^3 + dx^3)$, where $x^0 = ct$, so $dx^0 = c\,dt$ since $c$, the velocity of light, is a constant. In three-dimensional space we define the distance between two points. We generalize this notion to the interval $ds$ between two points in space-time. For $ds$ to be the same for all observers (i.e., in all inertial frames), it must be invariant under Lorentz transformations and rotations. The interval is defined as
\[ ds^2 = g_{\mu\nu}\, dx^\mu dx^\nu, \tag{1} \]
where $g_{\mu\nu}$ is the metric of space-time. In Minkowski space
\[ g_{\mu\nu} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \tag{2} \]
So,
\[ ds^2 = (c\,dt)^2 - \left((dx^1)^2 + (dx^2)^2 + (dx^3)^2\right) = (c\,dt)^2 - (d\vec{r})^2. \tag{3} \]
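Under a Lorentz transformation $x^\mu$ transforms as $x'^\mu = \Lambda^\mu{}_\nu x^\nu$, where $\Lambda^\mu{}_\nu$ is a $4 \times 4$ matrix representing the Lorentz transformation. For example, for a boost along the $x^1$ axis
\[ \Lambda^\mu{}_\nu = \begin{pmatrix} \gamma & -\gamma\beta & 0 & 0 \\ -\gamma\beta & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \tag{4} \]
where $\beta = v/c$ and $\gamma = 1/\sqrt{1 - (v/c)^2}$. So the transformed coordinates under the boost along $x^1$ are
\[ ct' = \gamma\left(ct - \frac{v}{c}x^1\right), \quad x'^1 = \gamma\left(x^1 - \frac{v}{c}ct\right), \quad x'^2 = x^2, \quad x'^3 = x^3. \tag{5} \]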
Check that
\[ ds'^2 = (c\,dt')^2 - \left((dx'^1)^2 + (dx'^2)^2 + (dx'^3)^2\right) = \gamma^2 (c\,dt - \beta\, dx^1)^2 - \gamma^2 (dx^1 - \beta c\, dt)^2 - (dx^2)^2 - (dx^3)^2 = ds^2, \tag{6} \]
i.e., $ds^2$ is Lorentz invariant. Unlike the spatial distance $(d\vec{r})^2$, which is never negative, $ds^2$ can be positive, negative or zero. If
$ds^2 > 0$, i.e., $(c\,dt)^2 > (d\vec{r})^2$, the interval is called "time-like";
$ds^2 < 0$, i.e., $(c\,dt)^2 < (d\vec{r})^2$, the interval is called "space-like";
$ds^2 = 0$, i.e., $(c\,dt)^2 = (d\vec{r})^2$, the interval is called "light-like".
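The invariance of the interval under the boost of Eq. (4) can also be checked numerically. The following is a minimal sketch in Python with NumPy (an assumption of this example; the notes themselves contain no code). The helper names `boost_x`, `interval` and `classify` are hypothetical, and the sample displacement is arbitrary.

```python
import numpy as np

# Minkowski metric g_{mu nu} = diag(1, -1, -1, -1), Eq. (2)
g = np.diag([1.0, -1.0, -1.0, -1.0])

def boost_x(beta):
    """Boost matrix of Eq. (4) for velocity v = beta * c along x^1."""
    gamma = 1.0 / np.sqrt(1.0 - beta**2)
    lam = np.eye(4)
    lam[0, 0] = lam[1, 1] = gamma
    lam[0, 1] = lam[1, 0] = -gamma * beta
    return lam

def interval(dx):
    """ds^2 = g_{mu nu} dx^mu dx^nu for dx = (c dt, dx^1, dx^2, dx^3)."""
    return float(dx @ g @ dx)

def classify(ds2, tol=1e-12):
    """Name the interval according to the sign of ds^2."""
    if ds2 > tol:
        return "time-like"
    if -ds2 > tol:
        return "space-like"
    return "light-like"

lam = boost_x(0.6)                      # beta = 0.6, gamma = 1.25
dx = np.array([2.0, 1.0, 0.5, -0.3])    # arbitrary sample displacement
dx_prime = lam @ dx                     # components in the boosted frame

print(interval(dx), interval(dx_prime), classify(interval(dx)))
```

The matrix identity `lam.T @ g @ lam == g` is the statement of Lorentz invariance for every displacement at once; the single vector `dx` is only an illustration.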
Covariant & contravariant vectors: Any quantity that transforms like $x^\mu$ under a Lorentz transformation is called a contravariant vector, while anything that transforms like $\partial/\partial x^\mu$ is called a covariant vector. The general convention is to write a contravariant vector as $a^\mu$ (i.e., $\mu$ in the superscript) and a covariant vector as $a_\mu$ (i.e., $\mu$ in the subscript), e.g., $\partial/\partial x^\mu = \partial_\mu$. The inner product of a covariant vector and a contravariant vector is a Lorentz invariant (i.e., a scalar). The covariant and contravariant vectors are related by
\[ x_\mu = \sum_\nu g_{\mu\nu} x^\nu. \tag{7} \]
Using the convention of summation over repeated indices, we can write the above equation as $x_\mu = g_{\mu\nu} x^\nu$, where the index $\nu$ in $g_{\mu\nu}$ is repeated in $x^\nu$ and hence is summed over. Similarly, $x^\mu = g^{\mu\nu} x_\nu$. In Minkowski space, $g_{\mu\nu} = g^{\mu\nu}$. So we have
\[ x_0 = g_{0\nu} x^\nu = g_{00} x^0 + g_{01} x^1 + g_{02} x^2 + g_{03} x^3 = g_{00} x^0 = x^0, \tag{8} \]
\[ x_1 = g_{1\nu} x^\nu = g_{10} x^0 + g_{11} x^1 + g_{12} x^2 + g_{13} x^3 = g_{11} x^1 = -x^1. \tag{9} \]
Similarly $x_2 = -x^2$ and $x_3 = -x^3$.
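The inner product or scalar product of two 4-vectors is defined as
\[ A \cdot B = A_\mu B^\mu = A_0 B^0 + A_1 B^1 + A_2 B^2 + A_3 B^3 = A^0 B^0 - A^1 B^1 - A^2 B^2 - A^3 B^3 \tag{10} \]
\[ = A^0 B^0 - \vec{A} \cdot \vec{B} = g_{\mu\nu} A^\mu B^\nu = g^{\mu\nu} A_\mu B_\nu. \tag{11} \]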
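Differential operators:
\[ \partial_\mu = \frac{\partial}{\partial x^\mu} = \left(\frac{1}{c}\frac{\partial}{\partial t}, \frac{\partial}{\partial x^1}, \frac{\partial}{\partial x^2}, \frac{\partial}{\partial x^3}\right) \tag{12} \]
\[ = (\partial_0, \partial_1, \partial_2, \partial_3) = \left(\frac{1}{c}\frac{\partial}{\partial t}, \vec{\nabla}\right), \tag{13} \]
\[ \partial^\mu = g^{\mu\nu} \partial_\nu = \left(\frac{1}{c}\frac{\partial}{\partial t}, -\vec{\nabla}\right). \tag{14} \]
The Lorentz-invariant second-order differential operator, the d'Alembertian, is
\[ \Box = \partial_\mu \partial^\mu = \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right) = \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \nabla^2. \tag{15} \]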
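We know the relativistic mass $m_r = \gamma m$ and energy $E = m_r c^2 = \gamma m c^2$. The energy-momentum 4-vector is $p^\mu = (E/c, \vec{p})$, where $\vec{p} = \gamma m \vec{v}$. So,
\[ p^2 = g_{\mu\nu} p^\mu p^\nu = p_\mu p^\mu = \left(\frac{E}{c}\right)^2 - \vec{p}^{\,2} = \frac{(\gamma m c^2)^2}{c^2} - (\gamma m \vec{v})^2 = m^2 c^2 \tag{16} \]
(in natural units, $\hbar = c = 1$, $p^2 = m^2$). So the relativistic energy-momentum relation is $E^2 = \vec{p}^{\,2} c^2 + m^2 c^4$. Another useful quantity is
\[ p \cdot x = p_\mu x^\mu = Et - \vec{p} \cdot \vec{x}. \tag{17} \]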
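For a non-relativistic particle ($v \ll c$), we can write
\[ E = \sqrt{\vec{p}^{\,2} c^2 + m^2 c^4} = mc^2 \left(1 + \frac{\vec{p}^{\,2}}{m^2 c^2}\right)^{1/2} \tag{18} \]
\[ = mc^2 \left(1 + \frac{\vec{p}^{\,2}}{2 m^2 c^2} - \frac{(\vec{p}^{\,2})^2}{8 m^4 c^4} + \cdots\right) = mc^2 + \frac{\vec{p}^{\,2}}{2m} - \cdots \tag{19} \]
Neglecting the higher-order terms, the kinetic energy of a non-relativistic particle is $\vec{p}^{\,2}/2m = E - mc^2$.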
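To get a feeling for how good the truncation in Eq. (19) is, one can compare the exact energy with the first terms of the series. The sketch below is only an illustration in Python with NumPy; the function names and the choice of units $m = c = 1$ are assumptions of this example.

```python
import numpy as np

def energy_exact(p, m=1.0, c=1.0):
    """Relativistic energy E = sqrt(p^2 c^2 + m^2 c^4)."""
    return np.sqrt(p**2 * c**2 + m**2 * c**4)

def energy_series(p, m=1.0, c=1.0):
    """First three terms of Eq. (19): mc^2 + p^2/2m - p^4/(8 m^3 c^2)."""
    return m * c**2 + p**2 / (2 * m) - p**4 / (8 * m**3 * c**2)

def kinetic_nr(p, m=1.0):
    """Non-relativistic kinetic energy p^2 / 2m."""
    return p**2 / (2 * m)

# Relative error of p^2/2m against the exact kinetic energy E - mc^2
for p in (0.01, 0.1, 0.5):
    exact_kinetic = energy_exact(p) - 1.0
    rel_err = abs(kinetic_nr(p) - exact_kinetic) / exact_kinetic
    print(f"p = {p}: relative error of p^2/2m = {rel_err:.2e}")
```

The relative error grows roughly like $\vec{p}^{\,2}/4m^2c^2$, which is the size of the first neglected term relative to $\vec{p}^{\,2}/2m$.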
A. Klein-Gordon Equation
Schrödinger proposed a relativistic form of his equation at the same time as he developed the non-relativistic (NR) equation. Klein and Gordon developed this equation later, and it is known as the Klein-Gordon (KG) equation. For the NR equation, Schrödinger used the NR energy-momentum dispersion relation $E = \vec{p}^{\,2}/2m$. Using the correspondence principle
\[ E \to \hat{E} = i\hbar \frac{\partial}{\partial t}, \qquad \vec{p} \to \hat{\vec{p}} = -i\hbar \vec{\nabla} \tag{20} \]
in $E\phi(\vec{r}, t) = \frac{\vec{p}^{\,2}}{2m}\phi(\vec{r}, t)$, we arrive at the Schrödinger equation for a free particle. Now apply the same procedure to a relativistic particle with energy-momentum relation $E^2 = \vec{p}^{\,2} c^2 + m^2 c^4$. We get the relativistic wave equation
\[ \hat{E}^2 \phi(x) = (\hat{\vec{p}}^{\,2} c^2 + m^2 c^4)\phi(x) \tag{21} \]
\[ \Rightarrow\; -\hbar^2 \frac{\partial^2}{\partial t^2}\phi(x) = (-\hbar^2 c^2 \vec{\nabla}^2 + m^2 c^4)\phi(x) \tag{22} \]
\[ \Rightarrow\; \left(\frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \vec{\nabla}^2\right)\phi(x) = -\frac{m^2 c^2}{\hbar^2}\phi(x) \tag{23} \]
\[ \Rightarrow\; \left(\Box + \frac{m^2 c^2}{\hbar^2}\right)\phi(x) = 0. \tag{24} \]
This equation is known as the Klein-Gordon equation. Note that $\Box = \partial_\mu \partial^\mu$ is a Lorentz-invariant operator, so the KG equation is Lorentz invariant only if $\phi$ is Lorentz invariant, i.e., a Lorentz scalar. Thus the KG equation describes the relativistic dynamics of a scalar particle. The plane wave solution of the KG equation is
\[ \phi(x) = N e^{-i(Et - \vec{p} \cdot \vec{x})/\hbar}, \tag{25} \]
where $N$ is the normalization constant and the energy is $E = \pm\sqrt{\vec{p}^{\,2} c^2 + m^2 c^4}$, i.e., the energy can be both positive and negative.
Continuity Equation:
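Pre-multiply Eq. (23) by $\phi^*(x)$ to get
\[ \phi^*(x)\left(\frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \vec{\nabla}^2\right)\phi(x) = -\frac{m^2 c^2}{\hbar^2}\phi^*(x)\phi(x). \tag{26} \]
Now take the complex conjugate of Eq. (23) and post-multiply by $\phi(x)$, which gives
\[ \left(\frac{1}{c^2}\frac{\partial^2 \phi^*}{\partial t^2}\right)\phi - (\vec{\nabla}^2 \phi^*)\phi = -\frac{m^2 c^2}{\hbar^2}\phi^*(x)\phi(x). \tag{27} \]
Subtracting Eq. (27) from Eq. (26) gives
\[ \phi^* \frac{1}{c^2}\frac{\partial^2 \phi}{\partial t^2} - \frac{1}{c^2}\frac{\partial^2 \phi^*}{\partial t^2}\phi - (\phi^* \nabla^2 \phi - \phi \nabla^2 \phi^*) = 0. \tag{28} \]
Multiplying by $i\hbar/2m$ and writing each bracket as a total derivative, we get
\[ \Rightarrow\; \frac{1}{c}\frac{\partial}{\partial t}\left[\frac{i\hbar}{2mc}\left(\phi^* \frac{\partial \phi}{\partial t} - \frac{\partial \phi^*}{\partial t}\phi\right)\right] + \vec{\nabla} \cdot \left[\frac{\hbar}{2im}\left(\phi^* \vec{\nabla}\phi - (\vec{\nabla}\phi^*)\phi\right)\right] = 0 \tag{29} \]
\[ \Rightarrow\; \frac{1}{c}\frac{\partial}{\partial t}\rho + \vec{\nabla} \cdot \vec{j} = 0 \tag{30} \]
\[ \Rightarrow\; \partial_\mu j^\mu = 0. \tag{31} \]
This is the continuity equation for the KG equation, where
\[ j^0 = \rho = \frac{i\hbar}{2mc}\left(\phi^* \frac{\partial \phi}{\partial t} - \frac{\partial \phi^*}{\partial t}\phi\right), \tag{32} \]
\[ \vec{j} = \frac{\hbar}{2im}\left(\phi^* \vec{\nabla}\phi - (\vec{\nabla}\phi^*)\phi\right). \tag{33} \]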
Recall that in the continuity equation for the Schrödinger equation, $\rho$ is the probability density and $\vec{j}$ is the probability current. The continuity equation expresses conservation of probability: if the probability of finding a particle in some region decreases, the probability of finding it outside that region increases, i.e., there is a flow of probability current such that the total probability remains conserved. Since the KG equation satisfies a continuity equation of the same form, it is natural to interpret $\rho$ as the probability density and $\vec{j}$ as the probability current. [Note: A density transforms like the 0th component of a 4-vector ($j^\mu$) under Lorentz transformations. Since $\phi$ is a Lorentz-invariant quantity, $|\phi|^2$ does not transform like a density, but $\rho$ defined in Eq. (32) does.] For the plane wave solution, Eq. (32) gives $\rho = |N|^2 E/(mc)$, i.e., $\rho$ is proportional to $E$ (in natural units, with the overall factor $1/2m$ dropped, this reads $\rho = 2|N|^2 E$).
There are two major problems with the KG equation.
(1) The equation has both positive and negative energy solutions. The negative energy solutions pose a problem: for large $|\vec{p}|$ the energy can be arbitrarily negative, i.e., the spectrum is unbounded from below. So we could extract an arbitrarily large amount of energy from the system by pushing it into more and more negative energy states. One may propose to truncate the physical space to the positive energy states only, i.e., to declare only $E = +\sqrt{\vec{p}^{\,2} c^2 + m^2 c^4}$ physical. But then (a) the eigenstates don't form a complete basis (both positive and negative energy states are Fourier modes of $\phi$); without a completeness relation we lose the superposition principle, i.e., we cannot expand an arbitrary state $\chi$ in the basis of $\phi$ ($\chi = \sum_i c_i \phi_i$ is no longer valid); and (b) a perturbation may cause the system to jump to a negative energy state. Since negative energy states are valid solutions of the KG equation, we cannot prevent that. So simply declaring the negative energy states unphysical does not work.
(2) The second problem is associated with the probability density. As we have seen, for a plane wave $\rho$ is proportional to $|N|^2 E$, i.e., $\rho$ is negative if $E$ is negative. But to interpret $\rho$ as a probability density, it must be positive definite.
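The sign problem in (2) can be made concrete by evaluating Eq. (32) on a plane wave. The following is a minimal sketch in Python with NumPy, restricted to one space dimension for brevity; the function name `kg_density`, the units $\hbar = m = c = 1$ and the sample values are assumptions of this example. The time derivative of the plane wave is inserted analytically, $\partial\phi/\partial t = (-iE/\hbar)\phi$.

```python
import numpy as np

def kg_density(N, E, p, t, x, m=1.0, c=1.0, hbar=1.0):
    """rho of Eq. (32) for the plane wave phi = N exp(-i(E t - p x)/hbar)."""
    phi = N * np.exp(-1j * (E * t - p * x) / hbar)
    dphi_dt = (-1j * E / hbar) * phi          # analytic time derivative
    rho = 1j * hbar / (2 * m * c) * (np.conj(phi) * dphi_dt
                                     - np.conj(dphi_dt) * phi)
    return rho.real                           # imaginary part vanishes

p = 0.75
E = np.sqrt(p**2 + 1.0)                       # E = 1.25 for m = c = 1
N = 0.5 + 0.2j                                # arbitrary normalization

rho_pos = kg_density(N, +E, p, t=0.3, x=0.7)  # positive energy branch
rho_neg = kg_density(N, -E, p, t=0.3, x=0.7)  # negative energy branch
print(rho_pos, rho_neg)
```

With the normalization of Eq. (32) the result is $\rho = |N|^2 E/(mc)$, independent of $t$ and $x$, so the sign of $\rho$ follows the sign of $E$.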
[Although the KG equation looks awkward within QM at this point, in QFT it is a valid equation for scalar (spin 0) particles. Feynman and Stückelberg interpreted the positive energy states as particles propagating forward in time, and the negative energy states as particles propagating backward in time, which represent antiparticles propagating forward in time. We'll not discuss those developments here.]
B. Dirac Equation:
The probability density in the KG equation depends on the energy and becomes negative for negative energy. The energy appears in the expression for $\rho$ because of the time derivative in Eq. (32). Dirac realised that this is due to the fact that the KG equation involves a second-order time derivative. Notice that the Schrödinger equation involves a first-order time derivative, and its $\rho$ does not involve any time derivative. So if we want to write a relativistic wave equation with a positive definite probability density, the equation should be first order in the time derivative. To be consistent with the Lorentz transformations of special relativity, a wave equation that is first order in the time derivative must also be first order in the space derivatives. So Dirac wrote the Hamiltonian as
\[ H = \alpha_1 p_1 c + \alpha_2 p_2 c + \alpha_3 p_3 c + \beta m c^2. \tag{34} \]
Writing the momentum as a differential operator in position space, we get the wave equation
\[ i\hbar \frac{\partial \psi(x)}{\partial t} = \left(-i\hbar c\left(\alpha_1 \frac{\partial}{\partial x^1} + \alpha_2 \frac{\partial}{\partial x^2} + \alpha_3 \frac{\partial}{\partial x^3}\right) + \beta m c^2\right)\psi(x) = (-i\hbar c\, \vec{\alpha} \cdot \vec{\nabla} + \beta m c^2)\psi(x). \tag{35} \]
Since the above Hamiltonian has to describe a free particle, $\alpha_i$ and $\beta$ cannot depend on space and time; such terms would act like space-time dependent energies and give rise to forces. Also, $\alpha_i$ and $\beta$ cannot contain space or time derivatives: the derivatives should appear only in $p_i$ and $E$, since the equation is to be linear in these derivatives. Thus $\alpha_i$ and $\beta$ are constants.
A relativistic particle must satisfy the relativistic energy-momentum relation $E^2 = \vec{p}^{\,2} c^2 + m^2 c^4$, i.e., each component of $\psi$ must satisfy the KG equation.
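Applying the operators on both sides of Eq. (35) twice, we get
\[ \left(i\hbar \frac{\partial}{\partial t}\right)^2 \psi = \left(-i\hbar c\left(\alpha_1 \frac{\partial}{\partial x^1} + \alpha_2 \frac{\partial}{\partial x^2} + \alpha_3 \frac{\partial}{\partial x^3}\right) + \beta m c^2\right)\left(-i\hbar c\left(\alpha_1 \frac{\partial}{\partial x^1} + \alpha_2 \frac{\partial}{\partial x^2} + \alpha_3 \frac{\partial}{\partial x^3}\right) + \beta m c^2\right)\psi \]
\[ = \Big[ -\hbar^2 c^2 \left(\alpha_1^2 \frac{\partial^2}{\partial (x^1)^2} + \alpha_2^2 \frac{\partial^2}{\partial (x^2)^2} + \alpha_3^2 \frac{\partial^2}{\partial (x^3)^2}\right) + \beta^2 m^2 c^4 \]
\[ \quad - \hbar^2 c^2 \left((\alpha_1 \alpha_2 + \alpha_2 \alpha_1)\frac{\partial}{\partial x^1}\frac{\partial}{\partial x^2} + (\alpha_1 \alpha_3 + \alpha_3 \alpha_1)\frac{\partial}{\partial x^1}\frac{\partial}{\partial x^3} + (\alpha_2 \alpha_3 + \alpha_3 \alpha_2)\frac{\partial}{\partial x^2}\frac{\partial}{\partial x^3}\right) \]
\[ \quad - i m c^3 \hbar \left((\alpha_1 \beta + \beta \alpha_1)\frac{\partial}{\partial x^1} + (\alpha_2 \beta + \beta \alpha_2)\frac{\partial}{\partial x^2} + (\alpha_3 \beta + \beta \alpha_3)\frac{\partial}{\partial x^3}\right) \Big]\psi. \tag{36} \]
To satisfy $E^2 = \vec{p}^{\,2} c^2 + m^2 c^4$, each component of $\psi$ must obey
\[ -\hbar^2 \frac{\partial^2 \psi}{\partial t^2} = -\hbar^2 c^2 \left(\frac{\partial^2}{\partial (x^1)^2} + \frac{\partial^2}{\partial (x^2)^2} + \frac{\partial^2}{\partial (x^3)^2}\right)\psi + m^2 c^4 \psi. \tag{37} \]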
If Eq. (36) is to reduce to Eq. (37), then $\alpha_i$ ($i = 1, 2, 3$) and $\beta$ must satisfy
\[ \alpha_i \alpha_j + \alpha_j \alpha_i = 0 \quad (i \neq j), \tag{38} \]
\[ \alpha_i \beta + \beta \alpha_i = 0, \tag{39} \]
\[ \alpha_i^2 = 1, \quad \beta^2 = 1. \tag{40} \]
Clearly, $\alpha_i$ and $\beta$ cannot be ordinary numbers, since they anticommute with each other. So Dirac proposed that they are matrices. The above anticommutation relations can be written in the short form
\[ \{\alpha_i, \alpha_j\} = 0 \ (i \neq j), \quad \{\alpha_i, \beta\} = 0. \tag{41} \]
(The notation $\{\ ,\ \}$ denotes the anticommutator.) Combining this with $\alpha_i^2 = 1$, we can write
\[ \{\alpha_i, \alpha_j\} = 2\delta_{ij} I. \tag{42} \]
If $\alpha_i$ and $\beta$ are matrices, $\psi$ cannot be a single-component wave function; it must have more than one component, written as a column vector on which the matrices act.
For the Dirac equation we need four linearly independent matrices satisfying the anticommutation relations. Since the Hamiltonian is hermitian, each of the four matrices $\alpha_i$, $\beta$ must be hermitian, and hence they are square ($n \times n$) matrices. Since the squares of all four matrices are unity, their eigenvalues are $+1$ and $-1$. If we choose $\beta$ to be diagonal, then the $\alpha_i$ cannot be diagonal, as they anticommute with $\beta$. In 2 dimensions, we have the three Pauli matrices, which anticommute with each other, but the fourth linearly independent $2 \times 2$ matrix is the identity, which commutes with all other matrices. So we cannot find a fourth linearly independent matrix that anticommutes with the Pauli matrices. Similarly, we fail to find four $3 \times 3$ matrices satisfying all the above conditions: the anticommutation relations force each matrix to be traceless (e.g., $\mathrm{Tr}\,\alpha_i = \mathrm{Tr}(\beta^2 \alpha_i) = -\mathrm{Tr}(\beta \alpha_i \beta) = -\mathrm{Tr}\,\alpha_i$), and a traceless matrix with eigenvalues $\pm 1$ must have even dimension. The smallest possible dimension for four such matrices is $4 \times 4$. One such set of matrices is
\[ \alpha_i = \begin{pmatrix} 0 & \sigma_i \\ \sigma_i & 0 \end{pmatrix}, \quad \beta = \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix}, \tag{43} \]
where $\sigma_i$ are the Pauli matrices and $I$ is the $2 \times 2$ identity matrix.
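\[ \sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{44} \]
$\alpha_i$ and $\beta$ are not unique. All sets of matrices related to these by a unitary $4 \times 4$ matrix $U$, i.e.,
\[ \alpha'_i = U \alpha_i U^{-1}, \quad \beta' = U \beta U^{-1}, \tag{45} \]
are equally valid: they satisfy the same anticommutation relations and give an equivalent Dirac equation. Since $\alpha_i$ and $\beta$ are $4 \times 4$ matrices, $\psi$ is a 4-component column vector. Using $U^{-1} U = I$, you can show that the Dirac equation keeps its form under this change of representation if $\psi$ transforms as $\psi' = U\psi$.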
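The algebra in Eqs. (38)-(42) is easy to verify for the explicit representation of Eq. (43). The sketch below uses Python with NumPy (an assumption of this example); the helper `anticomm` and the random unitary `U`, built from a QR decomposition, are hypothetical illustrations and not part of the derivation.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)

# Pauli matrices, Eq. (44)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

# Dirac matrices in the representation of Eq. (43)
alpha = [np.block([[Z2, s], [s, Z2]]) for s in sigma]
beta = np.block([[I2, Z2], [Z2, -I2]])
I4 = np.eye(4, dtype=complex)

def anticomm(a, b):
    """Anticommutator {a, b} = ab + ba."""
    return a @ b + b @ a

# A random unitary matrix, used to test the statement around Eq. (45)
rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4)))
alpha_rot = [U @ a @ U.conj().T for a in alpha]   # U^{-1} = U^dagger
beta_rot = U @ beta @ U.conj().T

for i in range(3):
    for j in range(3):
        expected = 2 * I4 if i == j else 0 * I4
        assert np.allclose(anticomm(alpha[i], alpha[j]), expected)
    assert np.allclose(anticomm(alpha[i], beta), 0 * I4)
print("Eqs. (38)-(42) hold for the representation of Eq. (43)")
```

The same checks pass for `alpha_rot` and `beta_rot`, which illustrates that the representation is fixed only up to a unitary transformation.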
Free particle solution: As for the KG equation, we look for a solution whose space-time behaviour is of plane wave form:
\[ \psi(x) = \omega\, e^{-i p \cdot x/\hbar} = \omega\, e^{-iEt/\hbar + i\vec{p} \cdot \vec{x}/\hbar}, \tag{46} \]
where $\omega$ is a 4-component vector, independent of $x$, called the Dirac spinor. Let us write $\omega$ in 2-component notation,
\[ \omega = \begin{pmatrix} \phi \\ \chi \end{pmatrix}, \tag{47} \]
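where $\phi$ and $\chi$ are 2-component spinors. Putting the solution in the Dirac equation, Eq. (35), we get
\[ E \begin{pmatrix} \phi \\ \chi \end{pmatrix} = c\,\vec{\alpha} \cdot \vec{p} \begin{pmatrix} \phi \\ \chi \end{pmatrix} + \beta m c^2 \begin{pmatrix} \phi \\ \chi \end{pmatrix} = \begin{pmatrix} 0 & c\,\vec{\sigma} \cdot \vec{p} \\ c\,\vec{\sigma} \cdot \vec{p} & 0 \end{pmatrix} \begin{pmatrix} \phi \\ \chi \end{pmatrix} + m c^2 \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix} \begin{pmatrix} \phi \\ \chi \end{pmatrix} = \begin{pmatrix} m c^2 I & c\,\vec{\sigma} \cdot \vec{p} \\ c\,\vec{\sigma} \cdot \vec{p} & -m c^2 I \end{pmatrix} \begin{pmatrix} \phi \\ \chi \end{pmatrix}. \tag{48} \]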
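The matrix equation can be written as two coupled equations:
\[ E\phi = m c^2 \phi + c\,\vec{\sigma} \cdot \vec{p}\,\chi \;\Rightarrow\; (E - m c^2)\phi = c\,\vec{\sigma} \cdot \vec{p}\,\chi, \tag{49} \]
and
\[ E\chi = -m c^2 \chi + c\,\vec{\sigma} \cdot \vec{p}\,\phi \;\Rightarrow\; (E + m c^2)\chi = c\,\vec{\sigma} \cdot \vec{p}\,\phi \;\Rightarrow\; \chi = \frac{c\,\vec{\sigma} \cdot \vec{p}}{E + m c^2}\phi. \tag{50} \]
Putting Eq. (50) in Eq. (49), we have
\[ (E - m c^2)\phi = c\,\vec{\sigma} \cdot \vec{p}\, \frac{c\,\vec{\sigma} \cdot \vec{p}}{E + m c^2}\phi = \frac{c^2 (\vec{\sigma} \cdot \vec{p})^2}{E + m c^2}\phi = \frac{\vec{p}^{\,2} c^2}{E + m c^2}\phi, \tag{51} \]
where we have used $(\vec{\sigma} \cdot \vec{A})(\vec{\sigma} \cdot \vec{B}) = \vec{A} \cdot \vec{B}\, I + i\vec{\sigma} \cdot (\vec{A} \times \vec{B}) \Rightarrow (\vec{\sigma} \cdot \vec{p})^2 = \vec{p}^{\,2}$. So finally we get
\[ (E - m c^2)(E + m c^2)\phi = \vec{p}^{\,2} c^2 \phi \;\Rightarrow\; E^2 = \vec{p}^{\,2} c^2 + m^2 c^4, \tag{52} \]
i.e., $E = \pm\sqrt{\vec{p}^{\,2} c^2 + m^2 c^4}$, which means negative energy solutions are still admitted. Dirac's prescription cannot get rid of the negative energy solutions. Let us postpone the discussion of negative energy for now. We'll come back to the issue of the negative energy solutions at the end.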
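The result of Eq. (52) can be cross-checked by diagonalizing the $4 \times 4$ matrix of Eq. (48) numerically. The following is a minimal sketch in Python with NumPy; the function name `dirac_hamiltonian`, the units $m = c = 1$ and the sample momentum are assumptions of this example.

```python
import numpy as np

# Pauli matrices stacked as sigma[i], i = 0, 1, 2
sigma = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)

def dirac_hamiltonian(p, m=1.0, c=1.0):
    """Momentum-space matrix of Eq. (48): [[mc^2, c s.p], [c s.p, -mc^2]]."""
    sp = np.einsum("i,ijk->jk", np.asarray(p, dtype=complex), sigma)
    I2 = np.eye(2, dtype=complex)
    return np.block([[m * c**2 * I2, c * sp],
                     [c * sp, -m * c**2 * I2]])

p = np.array([0.3, -0.4, 1.2])          # arbitrary sample momentum
H = dirac_hamiltonian(p)
energies = np.linalg.eigvalsh(H)        # real eigenvalues, ascending order
E = np.sqrt(p @ p + 1.0)                # +sqrt(p^2 c^2 + m^2 c^4)
print(energies, E)
```

The spectrum consists of $-E$ and $+E$, each appearing twice, which anticipates the two-fold degeneracy discussed later for the spin states.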
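Let us first check what happens to the probability density. To derive the continuity equation, first pre-multiply the Dirac equation by $\psi^\dagger$:
\[ \psi^\dagger\, i\hbar \frac{\partial \psi}{\partial t} = \psi^\dagger (-i\hbar c\, \vec{\alpha} \cdot \vec{\nabla} + \beta m c^2)\psi. \tag{53} \]
Take the hermitian conjugate of the Dirac equation and post-multiply by $\psi$:
\[ -i\hbar \left(\frac{\partial \psi^\dagger}{\partial t}\right)\psi = \psi^\dagger (i\hbar c\, \vec{\alpha} \cdot \overleftarrow{\nabla} + \beta m c^2)\psi. \tag{54} \]
Note that the spatial derivative $\overleftarrow{\nabla}$ acts to the left, i.e., on $\psi^\dagger$, and that $\alpha_i^\dagger = \alpha_i$ and $\beta^\dagger = \beta$. Subtracting Eq. (54) from Eq. (53), we get
\[ i\hbar \frac{\partial}{\partial t}(\psi^\dagger \psi) = -i\hbar c\, \psi^\dagger (\vec{\alpha} \cdot \vec{\nabla} + \vec{\alpha} \cdot \overleftarrow{\nabla})\psi = -i\hbar c\, \vec{\nabla} \cdot (\psi^\dagger \vec{\alpha} \psi). \tag{55} \]
Thus we get the continuity equation (in covariant form)
\[ \frac{1}{c}\frac{\partial}{\partial t}(\psi^\dagger \psi) + \vec{\nabla} \cdot (\psi^\dagger \vec{\alpha} \psi) = 0 \tag{56} \]
\[ \Rightarrow\; \frac{1}{c}\frac{\partial}{\partial t}\rho + \vec{\nabla} \cdot \vec{j} = 0 \tag{57} \]
\[ \Rightarrow\; \partial_\mu j^\mu = 0, \tag{58} \]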
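where $\rho = j^0 = \psi^\dagger \psi$ and $\vec{j} = \psi^\dagger \vec{\alpha} \psi$. Since $\psi$ is a 4-component vector, let us write
\[ \psi = \begin{pmatrix} \psi_1 \\ \psi_2 \\ \psi_3 \\ \psi_4 \end{pmatrix}. \tag{59} \]
Now the probability density is
\[ \rho = \psi^\dagger \psi = \begin{pmatrix} \psi_1^* & \psi_2^* & \psi_3^* & \psi_4^* \end{pmatrix} \begin{pmatrix} \psi_1 \\ \psi_2 \\ \psi_3 \\ \psi_4 \end{pmatrix} = |\psi_1|^2 + |\psi_2|^2 + |\psi_3|^2 + |\psi_4|^2 \geq 0, \tag{60} \]
so $\rho$ is positive definite and can be interpreted as a probability density.
But now we need to interpret $\psi$, which is a four-component vector. What is the physical meaning of these components? Note that the $\alpha$ matrices involve the Pauli matrices $\sigma_i$. We know that the spin operator is written as $\vec{S} = \frac{\hbar}{2}\vec{\sigma}$. So an obvious question arises: do the different components of the Dirac spinor represent different spin components?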
Let us consider the positive energy solution only. From Eqs. (47) and (50), we can write
\[ \omega = \begin{pmatrix} \phi \\ \frac{c\,\vec{\sigma} \cdot \vec{p}}{E + m c^2}\phi \end{pmatrix}. \tag{61} \]
The 2-component spinor $\phi$ is completely arbitrary. We may choose the two linearly independent forms
\[ \phi_\uparrow = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad \phi_\downarrow = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{62} \]
These are the eigenstates of $S_z = \frac{\hbar}{2}\sigma_z$. The most general form can be expanded in terms of these two basis vectors:
\[ \phi = a\phi_\uparrow + b\phi_\downarrow = \begin{pmatrix} a \\ b \end{pmatrix}. \tag{63} \]
We have two linearly independent solutions for any energy $E$. So for a given 4-momentum there are just two linearly independent solutions, i.e., the solutions for $\omega$ are 2-fold degenerate, just as expected for a quantum system with $j = 1/2$ (multiplicity $2j + 1 = 2$). To give the spin interpretation, let us consider the rest frame of the particle, i.e., $\vec{p} = 0$. Then $E = mc^2$ (we are considering positive energy only). Two linearly independent solutions in the rest frame can be written as
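\[ \psi^1 = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} e^{-i m c^2 t/\hbar}, \quad \psi^2 = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} e^{-i m c^2 t/\hbar}. \tag{64} \]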
These are eigenfunctions of the operator
\[ \Sigma_z = \frac{\hbar}{2} \begin{pmatrix} \sigma_z & 0 \\ 0 & \sigma_z \end{pmatrix} \tag{65} \]
with eigenvalues $\pm\hbar/2$. Both solutions have the same energy, but the eigenvalues of $\Sigma_z$ distinguish them. Similarly, for the negative energy solutions ($E = -mc^2$), we have two solutions $\psi^3$ and $\psi^4$ with 1 in the third and fourth entry of the column matrix, respectively, and the corresponding sign change in the exponential. Generalizing the definition of $\Sigma_z$ to all three components, we write
\[ \vec{\Sigma} = \frac{\hbar}{2} \begin{pmatrix} \vec{\sigma} & 0 \\ 0 & \vec{\sigma} \end{pmatrix}. \tag{66} \]
They satisfy the standard commutation relations for spin operators, $[\Sigma_x, \Sigma_y] = i\hbar\Sigma_z$ (and cyclic permutations). Thus $\vec{\Sigma}$ is the appropriate spin-1/2 operator for our rest-frame solutions, and we may conclude that at least in the rest frame the Dirac solutions represent spin-1/2 particles. Spin is a fundamental property of a particle; it does not change if we boost the system. So a spin-half particle in its rest frame is a spin-half particle in all frames. Thus the Dirac equation describes the dynamics of spin-half particles, i.e., fermions.
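But for $\vec{p} \neq 0$, $\vec{\Sigma}$ is no longer a suitable operator to describe spin, as it does not commute with $H = c\,\vec{\alpha} \cdot \vec{p} + \beta m c^2$. We need an operator that commutes with $H$ and whose eigenvalues distinguish the two states with the same energy. One such operator is the helicity operator
\[ h(\vec{p}) = \frac{\hbar}{2} \begin{pmatrix} \frac{\vec{\sigma} \cdot \vec{p}}{|\vec{p}|} & 0 \\ 0 & \frac{\vec{\sigma} \cdot \vec{p}}{|\vec{p}|} \end{pmatrix}. \tag{67} \]
Physically, the helicity operator $h(\vec{p})$ gives the projection of the spin $\vec{\Sigma}$ along the direction of $\vec{p}$. The eigenvalues of $h(\vec{p})$ are called the helicity of the particle. $h(\vec{p})$ has the eigenvalues $h = \pm\hbar/2$, i.e.,
\[ h(\vec{p})\,\omega = \pm\frac{\hbar}{2}\omega \;\Rightarrow\; \frac{\hbar}{2} \begin{pmatrix} \frac{\vec{\sigma} \cdot \vec{p}}{|\vec{p}|} & 0 \\ 0 & \frac{\vec{\sigma} \cdot \vec{p}}{|\vec{p}|} \end{pmatrix} \begin{pmatrix} \phi \\ \frac{c\,\vec{\sigma} \cdot \vec{p}}{E + m c^2}\phi \end{pmatrix} = \pm\frac{\hbar}{2} \begin{pmatrix} \phi \\ \frac{c\,\vec{\sigma} \cdot \vec{p}}{E + m c^2}\phi \end{pmatrix}. \tag{68} \]
For positive helicity, we must have
\[ \frac{\vec{\sigma} \cdot \vec{p}}{|\vec{p}|}\phi_+ = \phi_+, \tag{69} \]
and for negative helicity,
\[ \frac{\vec{\sigma} \cdot \vec{p}}{|\vec{p}|}\phi_- = -\phi_-. \tag{70} \]
$\phi_+$ and $\phi_-$ are linearly independent helicity spinors.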
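The statements that $\Sigma_z$ fails to commute with $H$ for $\vec{p} \neq 0$ while the helicity operator does commute can be checked directly. The sketch below is an illustration in Python with NumPy, with $\hbar = m = c = 1$ and an arbitrary sample momentum; all helper names are assumptions of this example.

```python
import numpy as np

sigma = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)

def sigma_dot(v):
    """2x2 matrix sigma . v for a 3-vector v."""
    return np.einsum("i,ijk->jk", np.asarray(v, dtype=complex), sigma)

def hamiltonian(p, m=1.0, c=1.0):
    """H = c alpha.p + beta m c^2 in the representation of Eq. (43)."""
    sp = sigma_dot(p)
    return np.block([[m * c**2 * I2, c * sp], [c * sp, -m * c**2 * I2]])

def block_diag2(a):
    """4x4 matrix diag(a, a) built from a 2x2 block a."""
    return np.block([[a, Z2], [Z2, a]])

def commutator(a, b):
    return a @ b - b @ a

p = np.array([0.3, -0.4, 1.2])
H = hamiltonian(p)
Sigma_z = 0.5 * block_diag2(sigma[2])                       # Eq. (65), hbar = 1
helicity = 0.5 * block_diag2(sigma_dot(p) / np.linalg.norm(p))  # Eq. (67)

print(np.linalg.norm(commutator(H, Sigma_z)))    # nonzero for this p
print(np.linalg.norm(commutator(H, helicity)))   # zero up to rounding
```

For $\vec{p}$ along the $z$ axis the two operators coincide, which is why $\Sigma_z$ works in that special case as well as in the rest frame.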
Lorentz transformation properties of the Dirac spinor:
Rotation: For simplicity, consider a particle moving along the z-direction, i.e., $\vec{p} = (0, 0, p)$. Then we can simply choose
\[ \phi_+ = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad \phi_- = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{71} \]
Now rotate the coordinate system about the x axis by an angle $\theta$. Since $\vec{p}$ is a vector, it transforms in the same way as $\vec{r}$:
\[ p'_x = p_x = 0, \tag{72} \]
\[ p'_y = \cos\theta\, p_y + \sin\theta\, p_z = p\sin\theta, \tag{73} \]
\[ p'_z = -\sin\theta\, p_y + \cos\theta\, p_z = p\cos\theta. \tag{74} \]
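Let $\phi_+ \to \phi'_+$ under the rotation. Then we must have
\[ \frac{\vec{\sigma} \cdot \vec{p}\,'}{|\vec{p}\,'|}\phi'_+ = \phi'_+, \tag{75} \]
since $\phi'_+$ must still represent a positive helicity state; the state is merely being described in a different coordinate system. Writing the above equation in explicit form,
\[ \frac{\sigma_x p'_x + \sigma_y p'_y + \sigma_z p'_z}{|\vec{p}\,'|}\phi'_+ = \phi'_+ \tag{76} \]
or,
\[ \frac{1}{p}\left[ p \begin{pmatrix} 0 & -i\sin\theta \\ i\sin\theta & 0 \end{pmatrix} + p \begin{pmatrix} \cos\theta & 0 \\ 0 & -\cos\theta \end{pmatrix} \right]\phi'_+ = \phi'_+ \tag{77} \]
or,
\[ \begin{pmatrix} \cos\theta & -i\sin\theta \\ i\sin\theta & -\cos\theta \end{pmatrix}\phi'_+ = \phi'_+. \tag{78} \]
Let $\phi'_+ = \begin{pmatrix} a \\ b \end{pmatrix}$, so we can write the above equation as
\[ a\cos\theta - ib\sin\theta = a, \tag{79} \]
\[ ia\sin\theta - b\cos\theta = b. \tag{80} \]
Solving these two coupled equations, we get $a = \cos(\theta/2)$ and $b = i\sin(\theta/2)$, i.e.,
\[ \phi'_+ = \begin{pmatrix} \cos(\theta/2) \\ i\sin(\theta/2) \end{pmatrix} = \begin{pmatrix} \cos(\theta/2) & i\sin(\theta/2) \\ i\sin(\theta/2) & \cos(\theta/2) \end{pmatrix}\phi_+. \tag{81} \]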
Similarly, we can write the solution for the negative helicity state as
\[ \phi'_- = \begin{pmatrix} i\sin(\theta/2) \\ \cos(\theta/2) \end{pmatrix} = \begin{pmatrix} \cos(\theta/2) & i\sin(\theta/2) \\ i\sin(\theta/2) & \cos(\theta/2) \end{pmatrix}\phi_-. \tag{82} \]
Thus the wave functions in the rotated coordinate system are linear combinations of the corresponding components in the original coordinate system. The transformation is given by a $2 \times 2$ matrix, as the spinors $\phi_\pm$ are 2-component vectors. The matrix elements of the rotation operator depend on the rotation angle, but in the spinor case it is the half-angle $\theta/2$ that enters. (Note: for a vector, the rotation angle $\theta$ itself appears in the rotation matrix.) We can write the above rotation in the compact form
\[ \phi' = e^{i\sigma_x \theta/2}\phi. \tag{83} \]
For a rotation about an arbitrary direction $\hat{n}$, the transformation rule generalizes to
\[ \phi' = e^{i\vec{\sigma} \cdot \hat{n}\,\theta/2}\phi = U\phi. \tag{84} \]
So spinors have a well-defined transformation rule under rotations, but it is different from that of a vector, tensor or scalar. Since the rotation operator $U$ is unitary, the norm of the state is preserved: $\phi'^\dagger \phi' = \phi^\dagger \phi$.
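As a check of Eqs. (75)-(83), one can verify numerically that the half-angle matrix maps the helicity spinors of Eq. (71) onto helicity eigenstates of the rotated momentum. The following is a minimal sketch in Python with NumPy; it uses the closed form $e^{i\sigma_x\theta/2} = \cos(\theta/2) + i\sigma_x\sin(\theta/2)$, and the function name and sample values are assumptions of this example.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def spinor_rotation_x(theta):
    """exp(i sigma_x theta/2) = cos(theta/2) I + i sin(theta/2) sigma_x."""
    return np.cos(theta / 2) * np.eye(2) + 1j * np.sin(theta / 2) * sx

theta, p = 0.7, 2.0                     # arbitrary angle and momentum magnitude
# Momentum components in the rotated coordinate system, Eqs. (72)-(74)
p_rot = p * np.array([0.0, np.sin(theta), np.cos(theta)])
helicity_rot = (p_rot[0] * sx + p_rot[1] * sy + p_rot[2] * sz) / p

phi_plus = np.array([1, 0], dtype=complex)
phi_minus = np.array([0, 1], dtype=complex)
U = spinor_rotation_x(theta)

# Rotated spinors remain helicity eigenstates with eigenvalues +1 and -1
print(np.allclose(helicity_rot @ (U @ phi_plus), U @ phi_plus))
print(np.allclose(helicity_rot @ (U @ phi_minus), -(U @ phi_minus)))
```

A consequence of the half-angle is that `spinor_rotation_x(2 * np.pi)` equals minus the identity: a spinor changes sign under a full rotation, unlike a vector.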
Boost: Suppose that in the $S$ frame the system is at rest, i.e., $\vec{p} = 0$ and $E = mc^2$, and that in the $S'$ frame it has momentum $\vec{p}\,' = (p'_x, 0, 0)$ and energy $E'$. That is, we are considering a Lorentz transformation where the $S'$ frame has velocity $(-v_x, 0, 0)$ with respect to the $S$ frame, where $v_x = c^2 p'_x/E'$. $p^\mu = (E/c, \vec{p})$ transforms like a 4-vector. For a boost along the x-axis
\[ \frac{E'}{c} = \cosh\xi\, \frac{E}{c} + \sinh\xi\, p_x, \tag{85} \]
\[ p'_x = \sinh\xi\, \frac{E}{c} + \cosh\xi\, p_x, \tag{86} \]
where $\tanh\xi = c\,p'_x/E' = v_x/c$.
In the rest frame ($S$ frame), we choose $\phi = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$, i.e., $\omega = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}$.
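In the $S'$ frame the spinor is
\[ \omega' = N \begin{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \\ \frac{c\,\sigma_x p'_x}{E' + m c^2} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \end{pmatrix} = N \begin{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \\ \frac{c\,p'_x}{E' + m c^2} \begin{pmatrix} 0 \\ 1 \end{pmatrix} \end{pmatrix}, \tag{87} \]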
where $N$ is the normalization constant. A Lorentz boost can be regarded as a rotation by an imaginary angle, $\xi = i\theta$. But $\phi$ and $\chi$ transform differently under the boost, because $E$ and $\vec{p}$ enter differently in $\phi$ and $\chi$. So, compared to a rotation, we expect a purely imaginary angle and a matrix that couples $\phi$ and $\chi$. The transformation can be written as
\[ \omega' = e^{\alpha_x \xi/2}\omega, \tag{88} \]
where $\alpha_x = \begin{pmatrix} 0 & \sigma_x \\ \sigma_x & 0 \end{pmatrix}$, so
\[ e^{\alpha_x \xi/2} = \cosh(\xi/2) + \alpha_x \sinh(\xi/2) = \begin{pmatrix} \cosh(\xi/2) & \sigma_x \sinh(\xi/2) \\ \sigma_x \sinh(\xi/2) & \cosh(\xi/2) \end{pmatrix}, \tag{89} \]
as $\alpha_x^2 = 1$. For a boost in an arbitrary direction the general transformation rule is
\[ \omega' = \exp\left(\vec{\alpha} \cdot \frac{\vec{v}}{|\vec{v}|}\, \frac{\xi}{2}\right)\omega. \tag{90} \]
Thus the Lorentz transformation rule for $\omega$ or $\psi$ is different from that of a vector. So $\psi$ is not a vector under Lorentz transformations; it is called a spinor. Any object that transforms like $\psi$ under Lorentz transformations is called a spinor.
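One way to see that the boost matrix of Eq. (89) reproduces the spinor of Eq. (87) is to apply it to the rest-frame spinor and compare the ratio of lower to upper components. The sketch below is an illustration in Python with NumPy, in units $m = c = 1$; it assumes the rapidity is fixed by $\tanh\xi = c\,p'_x/E'$, and the names `boost_spinor_x` and `omega_boosted` are hypothetical.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
alpha_x = np.block([[Z2, sx], [sx, Z2]])

def boost_spinor_x(xi):
    """exp(alpha_x xi/2) = cosh(xi/2) I + sinh(xi/2) alpha_x, Eq. (89)."""
    return np.cosh(xi / 2) * np.eye(4) + np.sinh(xi / 2) * alpha_x

m, c = 1.0, 1.0
p_prime = 0.75                                     # momentum seen in S'
E_prime = np.sqrt(p_prime**2 * c**2 + m**2 * c**4) # 1.25 for these values
xi = np.arctanh(c * p_prime / E_prime)             # rapidity of the boost

omega_rest = np.array([1, 0, 0, 0], dtype=complex)
omega_boosted = boost_spinor_x(xi) @ omega_rest

# Eq. (87) predicts components proportional to (1, 0, 0, c p'/(E' + m c^2))
ratio = (omega_boosted[3] / omega_boosted[0]).real
print(ratio, c * p_prime / (E_prime + m * c**2))
```

The overall factor $\cosh(\xi/2)$ that multiplies the boosted spinor plays the role of the normalization constant $N$ in Eq. (87).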
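Positive and negative energy spinors: Write $E = +\sqrt{\vec{p}^{\,2} c^2 + m^2 c^4}$. As the time derivative in the Dirac equation gives the energy eigenvalue $p^0 c$, we can write the Dirac equation as
\[ p^0 c \begin{pmatrix} \phi \\ \chi \end{pmatrix} = \begin{pmatrix} m c^2 I & c\,\vec{\sigma} \cdot \vec{p} \\ c\,\vec{\sigma} \cdot \vec{p} & -m c^2 I \end{pmatrix} \begin{pmatrix} \phi \\ \chi \end{pmatrix}. \tag{91} \]
For the positive energy solution $p^0 c = E > 0$, and the Dirac spinor is written as
\[ \omega^s = N \begin{pmatrix} \phi^s \\ \frac{c\,\vec{\sigma} \cdot \vec{p}}{E + m c^2}\phi^s \end{pmatrix}, \tag{92} \]
where $s = 1, 2$ corresponds to the different spin states. Choosing the normalization condition $\omega^\dagger \omega = 2E$, we can write the positive energy spinor
\[ u(p, s) = \sqrt{E + m c^2} \begin{pmatrix} \phi^s \\ \frac{c\,\vec{\sigma} \cdot \vec{p}}{E + m c^2}\phi^s \end{pmatrix}, \tag{93} \]
and the complete plane wave positive energy solution is
\[ \psi^s = u(p, s)\, e^{-i p \cdot x/\hbar}, \tag{94} \]
where $p^\mu = (+E/c, \vec{p})$.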
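For negative energy states, $p^0 c = -\sqrt{\vec{p}^{\,2} c^2 + m^2 c^4} = -E < 0$. To construct the spinor for negative energy, let us first consider the solution in the rest frame: $p^0 c = -E = -mc^2$, $\vec{p} = 0$ ($E$ is taken to be a positive quantity). Then the Dirac equation simplifies to
\[ -m c^2 \begin{pmatrix} \phi \\ \chi \end{pmatrix} = \begin{pmatrix} m c^2 I & 0 \\ 0 & -m c^2 I \end{pmatrix} \begin{pmatrix} \phi \\ \chi \end{pmatrix}, \tag{95} \]
which gives $\phi = 0 \Rightarrow \omega = \begin{pmatrix} 0 \\ \chi \end{pmatrix}$. For finite momentum, we have from the Dirac equation
\[ -E \begin{pmatrix} \phi \\ \chi \end{pmatrix} = \begin{pmatrix} m c^2 I & c\,\vec{\sigma} \cdot \vec{p} \\ c\,\vec{\sigma} \cdot \vec{p} & -m c^2 I \end{pmatrix} \begin{pmatrix} \phi \\ \chi \end{pmatrix}, \tag{96} \]
which gives
\[ \phi = -\frac{c\,\vec{\sigma} \cdot \vec{p}}{E + m c^2}\chi \tag{97} \]
\[ \Rightarrow\; \omega(-E, \vec{p}, s) = \begin{pmatrix} -\frac{c\,\vec{\sigma} \cdot \vec{p}}{E + m c^2}\chi^s \\ \chi^s \end{pmatrix}. \tag{98} \]
Let us now change the sign of $\vec{p}$, i.e., consider the solution for negative 4-momentum $p_-^\mu = (-E/c, -\vec{p}) = -p^\mu$:
\[ \omega(-E, -\vec{p}, s) = N \begin{pmatrix} \frac{c\,\vec{\sigma} \cdot \vec{p}}{E + m c^2}\chi^s \\ \chi^s \end{pmatrix}. \tag{99} \]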
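Again using the same normalization condition $\omega^\dagger \omega = 2E$, we write the negative energy spinor as
\[ v(p, s) = \sqrt{E + m c^2} \begin{pmatrix} \frac{c\,\vec{\sigma} \cdot \vec{p}}{E + m c^2}\chi^s \\ \chi^s \end{pmatrix}, \tag{100} \]
with $s = 1, 2$, and the complete plane wave solution reads
\[ \psi^s = v(p, s)\, e^{-i p_- \cdot x/\hbar} = v(p, s)\, e^{i p \cdot x/\hbar}, \tag{101} \]
where $p^\mu = (E/c, \vec{p})$.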
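The normalization and the eigenvalue properties of the spinors in Eqs. (93) and (100) can be verified numerically. The following is a minimal sketch in Python with NumPy, in units $m = c = 1$; the function names `u_spinor`, `v_spinor` and `hamiltonian` are assumptions of this example. Note that $v(p, s)$ multiplies $e^{ip \cdot x/\hbar}$, so it is an eigenvector of the matrix of Eq. (91) evaluated at momentum $-\vec{p}$ with eigenvalue $-E$.

```python
import numpy as np

sigma = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)
I2 = np.eye(2, dtype=complex)

def sigma_dot(v):
    return np.einsum("i,ijk->jk", np.asarray(v, dtype=complex), sigma)

def hamiltonian(p, m=1.0, c=1.0):
    """Matrix on the right-hand side of Eq. (91)."""
    sp = sigma_dot(p)
    return np.block([[m * c**2 * I2, c * sp], [c * sp, -m * c**2 * I2]])

def energy(p, m=1.0, c=1.0):
    return np.sqrt(np.dot(p, p) * c**2 + m**2 * c**4)

def u_spinor(p, phi, m=1.0, c=1.0):
    """Positive energy spinor of Eq. (93) for a normalized 2-spinor phi."""
    E = energy(p, m, c)
    lower = c * sigma_dot(p) @ phi / (E + m * c**2)
    return np.sqrt(E + m * c**2) * np.concatenate([phi, lower])

def v_spinor(p, chi, m=1.0, c=1.0):
    """Negative energy spinor of Eq. (100) for a normalized 2-spinor chi."""
    E = energy(p, m, c)
    upper = c * sigma_dot(p) @ chi / (E + m * c**2)
    return np.sqrt(E + m * c**2) * np.concatenate([upper, chi])

p = np.array([0.3, -0.4, 1.2])
E = energy(p)
up = np.array([1, 0], dtype=complex)
u = u_spinor(p, up)
v = v_spinor(p, up)

print(np.vdot(u, u).real, np.vdot(v, v).real, 2 * E)
```

Both norms come out as $2E$ because $(E + mc^2) + \vec{p}^{\,2}c^2/(E + mc^2) = (E + mc^2) + (E - mc^2)$.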
Dirac's interpretation of the negative energy solutions:
The physical interpretation of the positive energy solutions is straightforward. They describe spin-1/2 particles with 4-momentum $p^\mu = (E/c, \vec{p})$, and the probability density $\rho$ is positive definite. But since negative energy solutions are also allowed, as in the KG equation, a particle with positive energy can cascade down through the negative energy levels without limit. Hence the positive energy states cannot be stable! To make sense of the Dirac equation, one needs to make the positive energy states stable by preventing transitions to negative energy states. Here comes Dirac's masterstroke. Dirac postulated that the normal empty or vacuum state corresponds to the state with no positive energy particles and with all the negative energy states completely filled. The state with completely filled negative energy levels is called the Dirac sea. Since the Dirac equation describes fermions, according to the Pauli exclusion principle only two electrons (one spin up and one spin down) can occupy an energy level, and once the negative energy levels are occupied, any positive energy particle is forbidden from falling into them. Let us assume that the spin-1/2 particle we are talking about is an electron. The vacuum is then the state where all negative energy levels are filled by electrons, i.e., it has infinite negative charge and energy! But since all observations represent finite fluctuations of charge and energy with respect to the vacuum, this leads to an acceptable theory, and we rescale the vacuum to carry no charge and energy (charge of the vacuum = 0, energy of the vacuum = 0, spin of the vacuum = 0). Assume that an electron with energy $-E$ and spin up is removed from the Dirac sea. It will create a "hole" relative to the normal vacuum:
energy of the "hole" = $-(-E) = +E$ → positive energy
charge of the "hole" = $-(-e) = +e$ → positive charge
spin of the "hole" = −(up) = down.
FIG. 1: Energy levels of the Dirac equation. The blobs represent electrons occupying the energy levels in the Dirac sea. Each level can hold two electrons (spin up and spin down) according to the Pauli exclusion principle. The open circle represents the absence of an electron, i.e., the presence of a "hole".
Thus the absence of a negative energy electron with spin up is equivalent to the presence of a positive energy, positively charged "hole" with spin down. So the "hole" represents the antiparticle of the electron (i.e., the positron). The unfilled negative energy states, according to Dirac, represent positive energy antiparticles. Thus, in order to give stability to the positive energy states, Dirac predicted the existence of the positron! (Actually, when Dirac wrote this equation the positron was not known, and he thought the proton, which is a positively charged particle, might be the antiparticle of the electron!) Anderson discovered the positron in 1932, for which he was awarded the Nobel Prize.
Dirac needed an infinite number of electrons filling up the negative energy states to describe a stable "single" electron with positive energy! In that sense, it is no longer a "single-particle" theory. Exciting a negative energy electron to a positive energy state, i.e., creating a physical electron from the vacuum, also creates a "hole" in the Dirac sea, i.e., a positive energy positron; this corresponds to the process of creating an $e^- e^+$ pair. The appropriate theory to describe particle creation and destruction is quantum field theory. Dirac's theory thus points the way to quantum field theory.
Here we have discussed only free relativistic equations. If we consider the Dirac equation for the hydrogen atom (i.e., the Dirac equation in a central potential), the fine structure appears in the energy eigenvalues. One can also include interactions with radiation by introducing gauge fields, as we have discussed before for the Schrödinger equation. But we'll not discuss them here.