
SI2371 Special Relativity, Lecture notes

Mattias Blennow

May 5, 2017

Contents

1 Introduction
  1.1 Relativity and classical mechanics

2 Lorentz transformations and Minkowski space
  2.1 A detour to the Euclidean plane
  2.2 The Lorentz transformation
  2.3 Minkowski space
    2.3.1 World lines
  2.4 Minkowski diagrams
    2.4.1 Axes of other inertial frames
    2.4.2 Relativity of simultaneity
  2.5 Classifying regions of Minkowski space

3 Vectors and tensors in Minkowski space
  3.1 Vectors in Minkowski space
    3.1.1 The event vector and inner products
    3.1.2 Raising and lowering indices
    3.1.3 Changing basis
    3.1.4 Classification of vectors
  3.2 Tensors in Minkowski space
    3.2.1 Raising and lowering indices
    3.2.2 Transformation rules
    3.2.3 Contractions
    3.2.4 Gradients

4 Time dilation and length contraction
  4.1 Proper time
  4.2 Proper length
  4.3 Paradoxes
    4.3.1 The twin paradox
    4.3.2 Garage paradox

5 Relativistic dynamics and kinematics
  5.1 Addition of velocities
  5.2 4-acceleration
    5.2.1 Proper acceleration
  5.3 4-momentum
  5.4 4-force

6 Electromagnetic fields in SR
  6.1 Field equations
  6.2 The 4-potential
  6.3 The 4-current density
  6.4 The Lorentz force

7 Surfaces and waves in Minkowski space
  7.1 Space-like surfaces and simultaneity
  7.2 Waves and phase functions
  7.3 The relativistic Doppler effect
    7.3.1 Doppler effect without Lorentz transformation
  7.4 Aberration of light

8 Tetrads

9 Particle collisions and decays
  9.1 Massless particles
  9.2 Elastic scattering
  9.3 Masses of quickly decaying particles
  9.4 Threshold energies
  9.5 Boosted decays

10 Energy and momentum in EM fields
  10.1 A problem of action at a distance
  10.2 The stress-energy tensor
  10.3 Stress-energy and the EM fields
  10.4 Interpreting the Maxwell stress tensor

11 Electromagnetic waves
  11.1 The plane wave
    11.1.1 Polarisation and wave vector relations
    11.1.2 Fields in the electromagnetic wave
  11.2 Stress-energy of the plane wave

12 Continuum mechanics
  12.1 The stress-energy tensor
  12.2 Perfect fluids
    12.2.1 Particle gases
  12.3 Conservation of energy and momentum in a continuum
  12.4 Angular momentum of a continuum
    12.4.1 Continua and external forces

1 Introduction

The theory of relativity is perhaps one of the most iconic physics models and one that shook the very foundations of how we see the world. This document contains a set of lecture notes on the special theory of relativity, intended for the first year of master studies in theoretical physics. It is assumed that the reader has previous knowledge of multivariable calculus, vector analysis, linear algebra, classical mechanics, and electrodynamics. The theory will be presented at a more in-depth level than what is usually encountered in entry level courses covering relativity at the bachelor level. We will not shy away from using tensors (a short introduction will be given).

These lecture notes are exactly that: a set of briefly explained concepts presented in a manner that I will also try to follow throughout the lectures. Use them as you see fit, but keep in mind that they are not intended to be a textbook replacement.

1.1 Relativity and classical mechanics

As we will see several times in these notes, there are many concepts in special relativity that bear a close resemblance to the corresponding concepts in classical mechanics. We shall see that the theories are not that different from each other, although some of the perhaps most fundamental assumptions of classical mechanics are found to be violated. Starting from the basic assumptions, both classical mechanics and special relativity include the special principle of relativity as a postulate, stating that the laws of physics should take the same form in all inertial frames and that any frame in uniform rectilinear motion relative to an inertial frame is another inertial frame.

Figure 1: Two inertial frames S and S′ with a relative velocity v in the x direction are said to be in standard configuration to each other.

As we shall see, the real difference between special relativity and classical mechanics arises due to the postulate in special relativity that the speed of light is constant and the same in all inertial frames. We will find that this is in direct contradiction to the classical mechanics postulate of a universal absolute time and a universal and absolute measure of distances.

2 Lorentz transformations and Minkowski space

Let us consider a coordinate transformation between two inertial frames. We call the frames S and S′, respectively, and arrange them in a way such that the x axes of the frames are aligned and that the origin of S′ moves with velocity v in the x-direction in S, see Fig. 1. This configuration of S and S′ is called the standard configuration. We could derive the transformation between two completely general inertial frames, but it is significantly more cumbersome and often not necessary. An event is anything that occurs at a given place and time, but this place and time may be assigned different coordinates in different inertial frames. We will call the time coordinate t and the spatial coordinates x, y, and z in S and denote the corresponding quantities in S′ using a prime. Assuming that time and space are homogeneous, the transformation between the coordinates in S and the coordinates in S′ should be given by a linear transformation. Furthermore, it is rather straightforward to argue that the coordinates perpendicular to the relative motion between S and S′ are not changed under the transformation and therefore y′ = y and z′ = z. We therefore wish to find a linear transformation of the form

$$ t' = At + Bx, \qquad x' = Ct + Dx \tag{1} $$

in order to relate the coordinates in S′ to those in S (we have here also assumed that the origins of S and S′ coincide at t = t′ = 0). The origin in S′ by definition has the coordinate x′ = 0, but since it moves with velocity v in S, it must also be describable as x = vt, which gives us the requirement

$$ 0 = Ct + Dvt \quad\Longrightarrow\quad C = -vD. \tag{2} $$

Inverting the expressions for t′ and x′, we now find that

$$ t = K(Dt' - Bx'), \qquad x = K(vDt' + Ax'), \tag{3} $$

where K = 1/[D(A + vB)]. By symmetry, the frame S must move in the negative x′ direction with speed v, i.e., for the origin x = 0 of S, the coordinates in S′ must satisfy x′ = −vt′, leading to

$$ 0 = K(vDt' - Avt') \quad\Longrightarrow\quad A = D. \tag{4} $$

The transformation is now of the form

$$ t' = Dt + Bx, \qquad x' = D(x - vt). \tag{5} $$

In classical mechanics, we postulate that there exists an absolute time experienced the same by all observers and therefore set t = t′. This leads us to identify D = 1 and B = 0 and we find the Galilei transformation

$$ t' = t, \qquad x' = x - vt. \tag{6} $$

The problem with the Galilei transformation when it comes to special relativity is that it makes every velocity frame dependent, so that no speed, including that of light, can be the same in all inertial frames. Indeed, if an object moves with velocity u′ in S′, then its position is given by $x' = u't' + x'_0 = u't + x'_0$. Inserting this into the Galilei transformation gives

$$ x = x' + vt = (u' + v)t + x'_0, \tag{7} $$

which describes an object moving with velocity u = u′ + v. However, up to the point where we set t = t′, i.e., up to and including Eq. (5), the argument goes through exactly the same in special relativity.
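The Galilean addition of velocities can be illustrated numerically. The following is only a minimal sketch: the function names `galilei_to_S` and `velocity_in_S` and the sample values are assumptions made for this example and do not appear elsewhere in the notes.

```python
def galilei_to_S(t_prime, x_prime, v):
    """Invert the Galilei transformation: t = t', x = x' + v t."""
    t = t_prime
    x = x_prime + v * t
    return t, x

def velocity_in_S(u_prime, v, x0_prime=0.0):
    """Velocity in S of an object moving as x' = u' t' + x0' in S'."""
    # Sample the world line at t' = 0 and t' = 1 and take the slope dx/dt.
    t1, x1 = galilei_to_S(0.0, x0_prime, v)
    t2, x2 = galilei_to_S(1.0, u_prime * 1.0 + x0_prime, v)
    return (x2 - x1) / (t2 - t1)

c = 299_792_458.0   # speed of light in m/s
v = 0.5 * c         # relative velocity of the frames (sample value)

# A signal moving at c in S' would move at c + v in S under the Galilei transformation.
print(velocity_in_S(c, v) / c)
```

With these sample values the ratio comes out as 1.5, i.e., the signal does not move at speed c in S, which is exactly the conflict with the light speed postulate.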


2.1 A detour to the Euclidean plane

We will soon return to the derivation of coordinate transformations in special relativity, but before doing so we will have a brief look back at coordinate transformations in the two-dimensional Euclidean plane, where we are going to consider the coordinates x and y in the coordinate system K and x′ and y′ in the coordinate system K′. We want both coordinate systems to be orthonormal and require that, for any point given by x and y (or their corresponding primed coordinates), the distance ℓ between the point and the origin is given by Pythagoras’ theorem

$$ \ell^2 = x^2 + y^2 = x'^2 + y'^2, \tag{8} $$

since the distance should not depend on which coordinate system we are using. The transformation from the unprimed to the primed coordinates is linear and therefore

$$ x' = ax + by, \qquad y' = cx + dy \tag{9} $$

(note the similarity to our assumption above when deriving the Galilei transformation!). By squaring each of these relations and summing them, we find that

$$ x'^2 + y'^2 = (a^2 + c^2)x^2 + (b^2 + d^2)y^2 + 2(ab + cd)xy = x^2 + y^2. \tag{10} $$

In order for this to hold for all combinations of x and y, we can identify

$$ a^2 + c^2 = 1, \qquad b^2 + d^2 = 1, \qquad ab + cd = 0. \tag{11} $$

The first two of these relations describe circles of radius one and can be parametrised by two angles φ and θ as

$$ a = \cos(\varphi), \qquad c = \sin(\varphi), \qquad b = \sin(\theta), \qquad d = \cos(\theta). \tag{12} $$

Inserted into the third relation, this gives

$$ \cos(\varphi)\sin(\theta) + \sin(\varphi)\cos(\theta) = \sin(\theta + \varphi) = 0 \quad\Longrightarrow\quad \theta = -\varphi \tag{13} $$

and therefore

$$ x' = \cos(\varphi)x - \sin(\varphi)y, \qquad y' = \sin(\varphi)x + \cos(\varphi)y, \tag{14} $$

which should be familiar as the expression for a two-dimensional rotation. (We could also pick other solutions to sin(θ + φ) = 0, but they will be equivalent.)
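As a quick numerical check of Eq. (14), the sketch below applies the rotation to a sample point and compares the squared distances of Eq. (8). The function name `rotate`, the sample point, and the angle are arbitrary choices made for this illustration.

```python
import math

def rotate(x, y, phi):
    """Apply Eq. (14): coordinates of the point (x, y) in the system K'."""
    x_new = math.cos(phi) * x - math.sin(phi) * y
    y_new = math.sin(phi) * x + math.cos(phi) * y
    return x_new, y_new

x, y, phi = 3.0, 4.0, 0.7            # arbitrary sample point and angle
xp, yp = rotate(x, y, phi)

length_sq = x**2 + y**2              # squared distance in K
length_sq_prime = xp**2 + yp**2      # squared distance in K'
print(length_sq, length_sq_prime)    # the two values agree up to rounding
```

The same pattern, a linear map checked against the quantity it is supposed to preserve, can be reused for the Lorentz transformation with the sum replaced by a difference.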


2.2 The Lorentz transformation

In special relativity, we are instead looking for a transformation for which an object moving at light speed in one frame moves at light speed in all frames. In particular, we can consider an object moving from the origin x = 0 at light speed. Such an object is described by x = ±ct, where c is the speed of light, but also by x′ = ±ct′, since it also moves at speed c in S′. These conditions can be summarised without the ± as

$$ (ct)^2 - x^2 = 0 \quad\Longleftrightarrow\quad (ct')^2 - x'^2 = 0. \tag{15} $$

Instead of treating this condition, let us treat the more general condition

$$ (ct)^2 - x^2 = (ct')^2 - x'^2 \tag{16} $$

for all combinations of x and t with their corresponding x′ and t′ values. It should be clear that this implies Eq. (15) and we will see that we can find a transform that satisfies this more general set-up, which in particular will give us a transform that satisfies Eq. (15). This condition is essentially the same condition we enforced when deriving the coordinate transformation for a rotation, with the exception that there now appears a negative sign. We will see that this sign is the crucial difference to Euclidean geometry when we deal with the space-time of special relativity (it has far-reaching consequences!). We find that, using the slightly different convention

$$ ct' = Act + Bx, \qquad x' = Cct + Dx \tag{17} $$

for the transformation,

$$ (ct')^2 - x'^2 = (A^2 - C^2)(ct)^2 + (B^2 - D^2)x^2 + 2(AB - CD)xct = (ct)^2 - x^2. \tag{18} $$

Identification gives

$$ A^2 - C^2 = 1, \qquad D^2 - B^2 = 1, \qquad AB - CD = 0. \tag{19} $$

Unlike the case of the rotation, where the corresponding relations described circles, the first two relations here describe hyperbolae and may be parametrised using the hyperbolic functions

$$ A = \cosh(\theta), \qquad C = -\sinh(\theta), \qquad B = -\sinh(\varphi), \qquad D = \cosh(\varphi), \tag{20} $$

where the signs for B and C are purely conventional. The third relation now gives

$$ \sinh(\theta)\cosh(\varphi) - \cosh(\theta)\sinh(\varphi) = \sinh(\theta - \varphi) = 0 \quad\Longrightarrow\quad \theta = \varphi \tag{21} $$

and therefore

$$ ct' = \cosh(\theta)ct - \sinh(\theta)x, \qquad x' = -\sinh(\theta)ct + \cosh(\theta)x. \tag{22} $$

This is the Lorentz transform expressed in terms of the rapidity θ. In order to express it in a form more familiar to most, we can again consider that the origin x′ = 0 of S′ should move at velocity v in S, i.e., x = vt, giving the condition

$$ 0 = -\sinh(\theta)ct + \cosh(\theta)vt \quad\Longrightarrow\quad \frac{v}{c} = \tanh(\theta). \tag{23} $$

Squaring and applying the hyperbolic relation $\sinh^2(\theta) = \cosh^2(\theta) - 1$, we find that

$$ \frac{v^2}{c^2} = \frac{\cosh^2(\theta) - 1}{\cosh^2(\theta)} \quad\Longrightarrow\quad \cosh(\theta) = \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}} \equiv \gamma, \tag{24} $$

where γ = γ(v) is often called the Lorentz factor. Since $\sinh(\theta) = \tanh(\theta)\cosh(\theta) = \gamma v/c$, this puts the Lorentz transformations in the form

$$ ct' = \gamma\left(ct - \frac{v}{c}x\right), \qquad x' = \gamma\left(x - \frac{v}{c}ct\right). \tag{25} $$

We note in particular that this directly violates the assumption of t′ = t from classical mechanics.
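The two forms of the transformation, Eqs. (22) and (25), can be compared numerically. The sketch below is only an illustration under stated assumptions: the function names are made up for this example, the velocity is given as a fraction of c (the default argument `c=1.0` is a convenience of the code), and the event coordinates are arbitrary.

```python
import math

def gamma(v, c=1.0):
    """Lorentz factor of Eq. (24)."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)

def boost(ct, x, v, c=1.0):
    """Eq. (25): coordinates (ct', x') in S' of the event (ct, x) in S."""
    g = gamma(v, c)
    return g * (ct - v / c * x), g * (x - v / c * ct)

def boost_rapidity(ct, x, theta):
    """Eq. (22): the same transformation written in terms of the rapidity."""
    return (math.cosh(theta) * ct - math.sinh(theta) * x,
            -math.sinh(theta) * ct + math.cosh(theta) * x)

v = 0.6                     # velocity as a fraction of c (sample value)
theta = math.atanh(v)       # rapidity from Eq. (23)
ct, x = 2.0, 0.5            # arbitrary event in S

ctp, xp = boost(ct, x, v)
print(ctp, xp)
print(boost_rapidity(ct, x, theta))     # agrees with the line above up to rounding
print(ct**2 - x**2, ctp**2 - xp**2)     # Eq. (16): both sides agree up to rounding
```

The last line checks the general condition of Eq. (16) for a single event only; it is a sanity check, not a proof.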

Exercise 1 Verify that, using the Lorentz transformation, any object travelling at speed c in S travels at speed c in S′ as well.

In relativity, it is very common to work in units where c = 1 (meaning that all velocities are dimensionless and measured in fractions of the speed of light) and from now on we will do so in these notes as well. Furthermore, the time coordinate t is often denoted $x^0$. Another common approach is to simply define $x^0 = ct$, but we will keep to units where c = 1. In a few cases, it will be instructive to work in units where c ≠ 1, but this will be mentioned explicitly.


2.3 Minkowski space

We have seen how the Lorentz transformations arise as the coordinate transformations between different inertial frames and compared the derivation to the derivation of rotations in the Euclidean plane. When we consider the Euclidean plane, there is nothing particular about any set of coordinates, just as there is nothing particular about any inertial frame. They are only descriptions of the underlying structure, the Euclidean plane. In the same fashion, the coordinates in an inertial frame are only descriptions of what is called Minkowski space, which is a four-dimensional space on which we can impose the coordinates belonging to a given inertial frame. Just as rotations describe the transformation between different orthonormal coordinates on the Euclidean plane, we will see that the Lorentz transforms describe the transformation between different orthonormal coordinates on Minkowski space.

The major difference between Minkowski space and a four-dimensional Euclidean space is that Minkowski space does not have a distance function given by Pythagoras’ theorem. Instead, Minkowski space is endowed with a square-length function

$$ s^2 = t^2 - \vec{x}^2, \tag{26} $$

where $\vec{x}$ collects all of the spatial coordinates into one three-dimensional vector. In the coming sections, we will interpret the meaning of this square-length function, but for now we simply note the peculiar property that this square-length function can be negative, depending on t and $\vec{x}$. Just as we found rotations on the Euclidean plane as the transformations that preserved length, we found the Lorentz transformations as the transformations that preserve the square-length function on Minkowski space.

2.3.1 World lines

In Euclidean space, any curve can be parametrised by some curve parameter s. The curve may be specified by giving the coordinates in some coordinate system as functions of this curve parameter.

Example 1 Think of the circle, which can be parametrised by x = cos(s) and y = sin(s).

In many cases, it is also possible to use one of the coordinates as the curve parameter. The requirement for this to be possible is that there is only one point on the curve for each value of that coordinate.


A curve in Minkowski space is called a world line; it is specified by giving the time and space coordinates as functions of a curve parameter. The importance of world lines lies in describing the history of an object. As long as we are dealing with objects that cannot be in two places at once, we can use the time coordinate t as the world line parameter and specify the world line by giving the spatial coordinates as a function of t, but this is in general not necessary. For any given object, its world line assigns $\vec{x}(t)$ to be the position of the object at time t.

Example 2 A light signal travelling along the positive x axis, starting at the origin at time t = 0, can be described by the functions

$$ t = t, \qquad x(t) = t, \qquad y(t) = z(t) = 0. \tag{27} $$

However, we do not need to use t as the curve parameter. We could have chosen another parameter s and instead written

$$ t(s) = ks, \qquad x(s) = ks, \qquad y(s) = z(s) = 0, \tag{28} $$

with k a non-zero constant, and still described the same world line.

2.4 Minkowski diagrams

A very common pictorial representation of Minkowski space is the Minkowski diagram (or space-time diagram). By convention, a Minkowski diagram is drawn based upon some given inertial frame S with the time axis being vertical and one of the spatial directions (usually the x-direction) as a horizontal axis, see Fig. 2. In a Minkowski diagram, we can draw events, world lines, as well as other things that may help us to get a pictorial representation of Minkowski space.

2.4.1 Axes of other inertial frames

We created the Minkowski diagram based on some inertial frame S, with its time and space axes being vertical and horizontal, respectively. We can also represent the time and space axes of another inertial frame in the same Minkowski diagram. Taking S′ to be in standard configuration with S, the t′ axis is the world line described by x′ = 0 and therefore

$$ 0 = \gamma(x - vt) \quad\Longrightarrow\quad x(t) = vt. \tag{29} $$

Since x is on the horizontal axis, this is a straight line with slope 1/v. In the same fashion, the x′ axis is defined by t′ = 0 and therefore

$$ 0 = \gamma(t - vx) \quad\Longrightarrow\quad x(t) = \frac{t}{v} \tag{30} $$

and the x′ axis is a straight line with slope v, see Fig. 3. In order to properly normalise the t′ axis, we can draw the hyperbola $t^2 - x^2 = 1$. Since the square-length function is the same in all inertial frames, this hyperbola will intersect the t′ axis when x′ = 0 and therefore t′ = 1. A similar construction with the hyperbola $t^2 - x^2 = -1$ gives the proper normalisation of the x′ axis.

Figure 2: A Minkowski diagram based on the inertial frame S usually has the time of that frame on the vertical axis and one of the spatial directions of that frame on the horizontal axis. In the Minkowski diagram, we can draw points (here red), representing events, and curves, representing world lines (here green). This gives us a pictorial description of Minkowski space.
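The normalisation construction can be checked numerically. In the sketch below, the intersection points follow from inserting x = vt into $t^2 - x^2 = 1$ and t = vx into $t^2 - x^2 = -1$; the function names and the value of v are assumptions made for this illustration, and c = 1 throughout.

```python
import math

def unit_points(v):
    """Points (t, x) in S marking t' = 1 on the t' axis and x' = 1 on the x' axis."""
    g = 1.0 / math.sqrt(1.0 - v**2)
    t_axis_unit = (g, g * v)     # lies on x = v t and on t^2 - x^2 = 1
    x_axis_unit = (g * v, g)     # lies on t = v x and on t^2 - x^2 = -1
    return t_axis_unit, x_axis_unit

def to_S_prime(t, x, v):
    """Lorentz transformation in standard configuration with c = 1."""
    g = 1.0 / math.sqrt(1.0 - v**2)
    return g * (t - v * x), g * (x - v * t)

v = 0.5                                   # sample relative velocity
(t1, x1), (t2, x2) = unit_points(v)
print(to_S_prime(t1, x1, v))              # approximately (1, 0)
print(to_S_prime(t2, x2, v))              # approximately (0, 1)
```

Both unit points lie further from the origin of the diagram than the unit points of S, which is why the primed axes appear stretched when drawn on paper.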

2.4.2 Relativity of simultaneity

We noted that the x′ axis in the Minkowski diagram based on S had a slope v. In fact, this will be true for any line describing a fixed t′, i.e., a set of events that are simultaneous in S′. At the same time, any line with a fixed t is going to be a horizontal line. From here, we can draw a crucial conclusion, namely that events that are simultaneous in S are generally not going to be simultaneous in S′. Indeed, we can see this directly from the Lorentz transformations if we consider two events with the same time coordinate t and different space coordinates $x_1$ and $x_2$, respectively. Applying the Lorentz transformation yields

$$ t'_2 - t'_1 = \gamma(t - vx_2) - \gamma(t - vx_1) = \gamma v(x_1 - x_2), \tag{31} $$

i.e., if $x_1 > x_2$ (and v > 0), then the second event occurs after the first in S′.

Figure 3: The time and space axes of other inertial frames may be added to a Minkowski diagram. If the inertial frame S′ is moving with speed v relative to the frame the diagram is based on, the time axis has a slope 1/v and the space axis has a slope v. The axes can be properly normalised by drawing the hyperbolae $t^2 - x^2 = \pm 1$, which intersect the axes when the corresponding coordinate is equal to one.
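A small numerical illustration of Eq. (31), as a sketch only: the helper `t_prime` and the sample values are assumptions made for this example, with c = 1.

```python
import math

def t_prime(t, x, v):
    """Time coordinate in S' of the event (t, x) in S, with c = 1."""
    g = 1.0 / math.sqrt(1.0 - v**2)
    return g * (t - v * x)

v = 0.6                  # sample relative velocity
t = 0.0                  # both events are simultaneous in S
x1, x2 = 2.0, 1.0        # positions of the first and second event

dt_prime = t_prime(t, x2, v) - t_prime(t, x1, v)
g = 1.0 / math.sqrt(1.0 - v**2)
print(dt_prime, g * v * (x1 - x2))   # both approximately 0.75
```

Swapping the sign of v in this sketch swaps the time order of the two events in S′, in line with the statement that events of this kind have no frame-independent ordering.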

2.5 Classifying regions of Minkowski space

Given an event E, we can split Minkowski space into several different regions based on the sign of the square-length from E to other events. We make the following identification:

  • Events with a positive square-length to E are said to be at time-like separation from E. Regardless of the inertial frame, such events are going to be either in the future or in the past of E, depending on the difference in the time coordinates. If the time coordinate of E is larger, the event is said to be in the past of E, while if it is smaller, it is in the future of E.

  • Events with a negative square-length to E are said to be at space-like separation from E. A straight line drawn between E and such an event has a slope of magnitude less than one in a Minkowski diagram, and therefore the two events are simultaneous in the inertial frame whose x axis is parallel to this line. Depending on the inertial frame, events of this type will have a larger or smaller time coordinate than E. We therefore cannot assign them to the future or past of E and instead assign them to be elsewhere.

  • Events with zero square-length to E are said to be on the light cone of E. Just like the events at time-like separation, these events can be subdivided into events in the future of E and events in the past of E.

These concepts are illustrated in Fig. 4.

Figure 4: A classification of how different parts of Minkowski space relate to the event E. Unlike classical mechanics, where we can split events into future, present, and past, Minkowski space splits into the future and past light cones and elsewhere. Elsewhere contains the surfaces of simultaneity containing E of all inertial frames. Here, as in Minkowski diagrams, the time direction is assumed to be vertical.
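The classification above translates directly into a short function. This is a minimal sketch: the function name `classify`, the tuple layout (t, x, y, z), and the returned labels are choices made for this illustration, with c = 1.

```python
def classify(event, reference):
    """Classify `event` relative to `reference`; both are (t, x, y, z) with c = 1."""
    dt = event[0] - reference[0]
    dx_sq = sum((a - b) ** 2 for a, b in zip(event[1:], reference[1:]))
    s_sq = dt**2 - dx_sq                    # square-length of Eq. (26)
    if s_sq < 0:
        return "elsewhere (space-like)"
    kind = "time-like" if s_sq > 0 else "light cone"
    if dt > 0:
        return f"future ({kind})"
    if dt < 0:
        return f"past ({kind})"
    return "same event"                     # s_sq = 0 and dt = 0

E = (0, 0, 0, 0)
print(classify((2, 1, 0, 0), E))    # future (time-like)
print(classify((-1, 1, 0, 0), E))   # past (light cone)
print(classify((1, 3, 0, 0), E))    # elsewhere (space-like)
```

Integer coordinates are used in the examples so that the zero square-length case is hit exactly; with floating point input, a tolerance would be needed to detect the light cone.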


3 Vectors and tensors in Minkowski space

Before we start looking at the physical implications of special relativity, let us take some time to acquaint ourselves with vector and tensor analysis in Minkowski space, with particular focus on the difference to vector and tensor analysis in Euclidean space. We will later use vectors and tensors extensively, so this is as good a time as any to have a short crash course.

3.1 Vectors in Minkowski space

As we have already seen, Minkowski space is 4-dimensional and as such we are going to use four basis vectors $\vec{e}_\mu$, where the index µ runs from zero to three, zero representing the time-direction. A vector in Minkowski space, often called a 4-vector, is a linear combination of the basis vectors

$$ V = V^\mu \vec{e}_\mu, \tag{32} $$

where $V^\mu$ are the components of V. We have here used the Einstein summation convention that repeated indices (in this case µ) are summed from zero to three. In general, we will use Greek letters to denote indices that run over all space-time indices and Roman letters to denote indices that just run over the spatial indices (one to three). In what follows, we will usually use capital letters without a vector arrow to indicate 4-vectors.

3.1.1 The event vector and inner products

We define the event vector X as the 4-vector describing when and where a particular event occurs relative to the origin. It is given by

$$ X = x^\mu \vec{e}_\mu, \tag{33} $$

where $x^\mu$ are the coordinates of the event (again, note that $x^0 = t$). This vector will play the same role in Minkowski space as the position vector does in Euclidean space.

The (square of the) distance from the origin in Euclidean space can be found by taking the inner product of the position vector with itself. However, if we try introducing the inner product

$$ V \cdot W = V^\mu W^\mu \tag{34} $$

in Minkowski space, we find that

$$ X \cdot X = t^2 + \vec{x}^2. \tag{35} $$

In Euclidean space, this was the invariant square length given by Pythagoras’ theorem, but this is no longer an invariant in Minkowski space. Instead, we define our inner product as

$$ V \cdot W = V^0 W^0 - V^i W^i = \eta_{\mu\nu} V^\mu W^\nu, \tag{36} $$

where we have introduced the components $\eta_{\mu\nu}$ of the Minkowski metric as

$$ \eta_{\mu\nu} = \begin{cases} 1, & \mu = \nu = 0 \\ -1, & \mu = \nu \neq 0 \\ 0, & \mu \neq \nu \end{cases}. \tag{37} $$

This can be implemented by letting

$$ \vec{e}_0 \cdot \vec{e}_0 = 1, \qquad \vec{e}_i \cdot \vec{e}_j = -\delta_{ij}, \qquad \vec{e}_0 \cdot \vec{e}_i = 0 \tag{38} $$

or, equivalently,

$$ \eta_{\mu\nu} = \vec{e}_\mu \cdot \vec{e}_\nu. \tag{39} $$

With this definition, we find that

$$ X \cdot X = t^2 - \vec{x}^2, \tag{40} $$

which is exactly the invariant square-length of Minkowski space.

Note: The inner product introduced here is not a strict inner product in the mathematical sense, since it is possible to have V · V < 0. Instead, the inner product on Minkowski space is a pseudo inner product.
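The inner product of Eq. (36) can be written out in a few lines. The sketch below stores contravariant components as plain lists in the order (t, x, y, z); the names `ETA` and `inner` and the sample vectors are assumptions made for this illustration.

```python
# Components eta_{mu nu} of the Minkowski metric, Eq. (37)
ETA = [[1, 0, 0, 0],
       [0, -1, 0, 0],
       [0, 0, -1, 0],
       [0, 0, 0, -1]]

def inner(V, W):
    """Minkowski inner product eta_{mu nu} V^mu W^nu of Eq. (36)."""
    return sum(ETA[m][n] * V[m] * W[n] for m in range(4) for n in range(4))

X = [5, 3, 0, 4]          # sample event with t = 5 and spatial distance 5
V = [1, 2, 0, 0]          # sample vector with V . V < 0

print(inner(X, X))        # 25 - 9 - 0 - 16 = 0
print(inner(V, V))        # 1 - 4 = -3
```

The second example shows explicitly why this is only a pseudo inner product: a non-zero vector can have a negative, or vanishing, product with itself.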

3.1.2 Raising and lowering indices

The components $V^\mu$ of the vector V are called the contravariant vector components, identified by having a super-index. We can also introduce the covariant vector components

$$ V_\mu = \eta_{\mu\nu} V^\nu, \tag{41} $$

identified by the sub-index. In special relativity, the spatial covariant components differ from the contravariant ones only by a minus sign, while the time components are the same, i.e., $V^0 = V_0$ and $V^i = -V_i$. We can therefore always move indices up and down using the metric tensor. The inverse metric tensor $\eta^{\mu\nu}$ has the property that

$$ \eta^{\mu\nu} \eta_{\nu\sigma} = \delta^\mu_\sigma, \tag{42} $$

where $\delta^\mu_\sigma$ is the Kronecker delta (which is equal to one if µ = σ and zero otherwise).

Using the covariant vector components, the inner product between V and W can be written interchangeably as

$$ V \cdot W = \eta_{\mu\nu} V^\mu W^\nu = V_\nu W^\nu = V^\mu W_\mu. \tag{43} $$
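The three equivalent expressions in Eq. (43) can be compared for sample vectors. This is a minimal sketch; the helper names and the sample components are assumptions made for the example, and only the diagonal of the metric is stored since all other components vanish.

```python
ETA_DIAG = [1, -1, -1, -1]   # diagonal of eta_{mu nu}; eta^{mu nu} has the same entries

def lower(V):
    """Covariant components V_mu = eta_{mu nu} V^nu, Eq. (41)."""
    return [ETA_DIAG[m] * V[m] for m in range(4)]

def raise_index(V_cov):
    """Contravariant components V^mu = eta^{mu nu} V_nu."""
    return [ETA_DIAG[m] * V_cov[m] for m in range(4)]

V = [2, 1, 0, 3]             # sample contravariant components V^mu
W = [4, -1, 2, 1]            # sample contravariant components W^mu
V_low, W_low = lower(V), lower(W)

ip_metric = sum(ETA_DIAG[m] * V[m] * W[m] for m in range(4))   # eta_{mu nu} V^mu W^nu
ip_v_low = sum(V_low[m] * W[m] for m in range(4))              # V_nu W^nu
ip_w_low = sum(V[m] * W_low[m] for m in range(4))              # V^mu W_mu

print(V_low)                              # [2, -1, 0, -3]
print(ip_metric, ip_v_low, ip_w_low)      # 6 6 6
```

Lowering and then raising an index returns the original components, which is the content of Eq. (42).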

3.1.3 Changing basis

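We have here considered the basis $\vec{e}_\mu$ of 4-vectors based on the inertial frame S. What if we wish to change to a description in terms of the inertial frame S′, where the event vector should take the form

$$ X = x^{\mu'} \vec{e}\,'_{\mu'} \tag{44} $$

based on the same line of argument as above? We note that in both bases, we have

$$ \vec{e}_\mu = \frac{\partial X}{\partial x^\mu} \qquad \text{and} \qquad \vec{e}\,'_{\mu'} = \frac{\partial X}{\partial x^{\mu'}}, \tag{45} $$

respectively. Using the expression for X in terms of $\vec{e}\,'_{\mu'}$, we find that

$$ \vec{e}_\mu = \frac{\partial x^{\mu'}}{\partial x^\mu} \vec{e}\,'_{\mu'} \equiv \Lambda^{\mu'}{}_{\mu} \vec{e}\,'_{\mu'}, \tag{46} $$

where we have introduced the transformation coefficients

$$ \Lambda^{\mu'}{}_{\mu} = \frac{\partial x^{\mu'}}{\partial x^\mu}. \tag{47} $$

Note that the transformation coefficients are just the constant coefficients that appear in the Lorentz transformation as $x^{\mu'} = \Lambda^{\mu'}{}_{\mu} x^\mu$. In matrix representation and standard configuration

$$ (\Lambda^{\mu'}{}_{\mu}) = \begin{pmatrix} \gamma & -v\gamma & 0 & 0 \\ -v\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \tag{48} $$

Note that $\Lambda^{\mu'}{}_{\mu}$ is not a 4-vector (or tensor). I generally recommend against lowering and raising its indices.

For any 4-vector V, it should not matter which basis we use to represent it. In particular, we must have

$$ V = V^\mu \vec{e}_\mu = V^{\mu'} \vec{e}\,'_{\mu'} = V^\mu \Lambda^{\mu'}{}_{\mu} \vec{e}\,'_{\mu'}. \tag{49} $$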

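Identification gives us the transformation property

$$ V^{\mu'} = \Lambda^{\mu'}{}_{\mu} V^\mu \tag{50} $$

for the contravariant vector components.

What about the covariant components? In general, we know that the inner product should not depend on the inertial frame used to express it. Therefore, we must have

$$ V \cdot W = V_\mu W^\mu = V_{\mu'} W^{\mu'} = V_{\mu'} \Lambda^{\mu'}{}_{\mu} W^\mu. \tag{51} $$

For this to hold for all V and W, we must have

$$ V_\mu = V_{\mu'} \Lambda^{\mu'}{}_{\mu}. \tag{52} $$

Defining

$$ \Lambda_{\mu'}{}^{\mu} = \frac{\partial x^\mu}{\partial x^{\mu'}}, \tag{53} $$

we find that

$$ \Lambda_{\mu'}{}^{\mu} \Lambda^{\mu'}{}_{\nu} = \frac{\partial x^\mu}{\partial x^{\mu'}} \frac{\partial x^{\mu'}}{\partial x^\nu} = \frac{\partial x^\mu}{\partial x^\nu} = \delta^\mu_\nu, \tag{54} $$

$$ \Lambda_{\mu'}{}^{\mu} \Lambda^{\nu'}{}_{\mu} = \frac{\partial x^\mu}{\partial x^{\mu'}} \frac{\partial x^{\nu'}}{\partial x^\mu} = \frac{\partial x^{\nu'}}{\partial x^{\mu'}} = \delta^{\nu'}_{\mu'}. \tag{55} $$

Multiplying both sides of Eq. (52) by $\Lambda_{\nu'}{}^{\mu}$ therefore results in

$$ V_{\nu'} = \Lambda_{\nu'}{}^{\mu} V_\mu. \tag{56} $$

Note that the components of $\Lambda_{\nu'}{}^{\mu}$ are the components of the inverse Lorentz transform, which in standard configuration are given by

$$ (\Lambda_{\nu'}{}^{\mu}) = \begin{pmatrix} \gamma & v\gamma & 0 & 0 \\ v\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \tag{57} $$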
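The transformation rules of Eqs. (50) and (56) can be tried out on sample components. The following sketch assumes standard configuration and c = 1; the helper names, the velocity, and the sample vectors are choices made for this illustration. It uses the fact that the matrix of Eq. (57) is the matrix of Eq. (48) with v replaced by −v.

```python
import math

ETA = [1.0, -1.0, -1.0, -1.0]   # diagonal of the Minkowski metric

def boost_matrix(v):
    """Matrix of Eq. (48) for relative velocity v, in units where c = 1."""
    g = 1.0 / math.sqrt(1.0 - v**2)
    return [[g, -v * g, 0.0, 0.0],
            [-v * g, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def apply(matrix, components):
    """Contract the column index of the matrix with the given components."""
    return [sum(matrix[i][k] * components[k] for k in range(4)) for i in range(4)]

def lower(V):
    """Covariant components V_mu = eta_{mu nu} V^nu."""
    return [ETA[m] * V[m] for m in range(4)]

def contract(V_low, W_up):
    """The sum V_mu W^mu."""
    return sum(V_low[m] * W_up[m] for m in range(4))

v = 0.8
Lam = boost_matrix(v)         # Eq. (48)
Lam_inv = boost_matrix(-v)    # Eq. (57)

V = [1.0, 0.2, -0.3, 0.5]     # sample contravariant components V^mu
W = [2.0, 1.0, 0.4, -0.1]     # sample contravariant components W^mu

V_low_prime = apply(Lam_inv, lower(V))   # Eq. (56)
W_prime = apply(Lam, W)                  # Eq. (50)

print(contract(lower(V), W), contract(V_low_prime, W_prime))   # agree up to rounding
print(V_low_prime, lower(apply(Lam, V)))                       # agree up to rounding
```

The second printed line compares two routes to the primed covariant components: transforming the covariant components with the inverse matrix, or transforming the contravariant components first and lowering the index afterwards.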

3.1.4 Classification of vectors

Unlike in the Euclidean case, the inner product of a non-zero 4-vector with itself, which we will refer to as the square of the 4-vector, need not be positive. Instead, with the appearance of the minus signs in the Minkowski metric, the square of a 4-vector can have any sign. This allows us to make the following classification of 4-vectors:

  • A 4-vector V is time-like if $V^2 > 0$.

  • A 4-vector V is space-like if $V^2 < 0$.

  • A 4-vector V is light-like (or null) if $V^2 = 0$.

Furthermore, if $V^0 > 0$ for a time-like 4-vector, then it is said to be future-directed, while if $V^0 < 0$ it is said to be past-directed. It is relatively straightforward to complete the following exercises:

Exercise 2 Show that if V is time-like and V · W = 0, then W is space-like.

Exercise 3 Show that, for any two time-like future-directed 4-vectors V and W, it holds that

$$ V \cdot W \geq \sqrt{V^2 W^2} \tag{58} $$

with equality only if W is a multiple of V.

3.2 Tensors in Minkowski space

The step from vectors to tensors is not as large as many students think. While, colloquially, a vector can be said to represent a physical quantity with a direction and a magnitude, a tensor may be said to represent several directions and a magnitude. The most hands-on example of a tensor might be the stress tensor of solid mechanics, describing how the force (a vector) on a surface element depends on the direction of the surface normal (another vector). Involving two vectors, the stress tensor is a rank two tensor, but there are some tensors in physics that involve even more directions and thus are represented by higher rank tensors.

We start by defining the tensor product of the vectors V and W as a bilinear product V ⊗ W. In other words,

$$ V \otimes (a_1 W_1 + a_2 W_2) = a_1 (V \otimes W_1) + a_2 (V \otimes W_2), \tag{59} $$

where $a_1$ and $a_2$ are constants, with a similar relation holding for $(a_1 V_1 + a_2 V_2) \otimes W$. This is an object that involves two directions, just as we are looking for. However, the most general tensor of rank two is a linear combination of such products and the most general linear combination can be decomposed into the form

$$ T = T^{\mu\nu} e_{\mu\nu}, \tag{60} $$

where $e_{\mu\nu} = \vec{e}_\mu \otimes \vec{e}_\nu$ is the tensor product of two basis vectors and objects of this type form the basis for tensors of rank two. Higher rank tensors are defined in a completely analogous manner and we can also define tensor products of tensors similarly, the product of a tensor of rank n and one of rank m being a new tensor of rank n + m, i.e.,

$$ T \otimes S = T^{\mu_1 \ldots \mu_n} S^{\nu_1 \ldots \nu_m}\, \vec{e}_{\mu_1} \otimes \ldots \otimes \vec{e}_{\mu_n} \otimes \vec{e}_{\nu_1} \otimes \ldots \otimes \vec{e}_{\nu_m}. \tag{61} $$

3.2.1 Raising and lowering indices

The components $T^{\mu\nu}$ of the tensor $T$ are referred to as the contravariant tensor components and, as for contravariant vector components, all the indices of such components are super-indices. However, just as for the 4-vectors, we can lower and raise indices by using the metric. Indices can be raised and lowered individually, but if all of the indices are moved down, i.e.,

$$T_{\mu\nu} = \eta_{\mu\sigma} \eta_{\nu\rho} T^{\sigma\rho}, \tag{62}$$

we refer to the components $T_{\mu\nu}$ as covariant tensor components. If a tensor component has both covariant and contravariant indices, it is referred to as a mixed tensor component.

We note that whenever we raise or lower a time index of a tensor component, we get back the same value, whereas we get a relative minus sign whenever we raise or lower a spatial index. For example,

$$T^{0i} = T_0{}^{i} = -T^{0}{}_{i} = -T_{0i}. \tag{63}$$
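The sign rules of Eq. (63) can be illustrated with a small script. This is a minimal sketch, assuming the diagonal metric $\eta = \mathrm{diag}(1,-1,-1,-1)$; the component values of `T` are arbitrary placeholders and the helper names are hypothetical.

```python
# Diagonal entries of the Minkowski metric, signature (+, -, -, -).
ETA = [1.0, -1.0, -1.0, -1.0]

def lower_first(T):
    """Mixed components T_mu^nu = eta_{mu sigma} T^{sigma nu}."""
    return [[ETA[m] * T[m][n] for n in range(4)] for m in range(4)]

def lower_both(T):
    """Covariant components T_{mu nu} = eta_{mu sigma} eta_{nu rho} T^{sigma rho}, Eq. (62)."""
    return [[ETA[m] * ETA[n] * T[m][n] for n in range(4)] for m in range(4)]

# Arbitrary contravariant components T^{mu nu} (illustrative values only).
T = [[float(4 * m + n + 1) for n in range(4)] for m in range(4)]

T_mixed = lower_first(T)
T_cov = lower_both(T)

print(T[0][1], T_mixed[0][1], T_cov[0][1])  # lowering the time index keeps the value; lowering i flips the sign
print(T[1][2], T_cov[1][2])                 # two spatial indices: the two signs cancel
```

Because the metric is diagonal, lowering an index reduces to multiplying by $\pm 1$, which is why no explicit double sum is needed here.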

3.2.2 Transformation rules

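Just as for vectors, we might want to express tensors using different bases. In the case of a rank two tensor, we can consider the basis related to the inertial frame $S$ versus the basis related to the inertial frame $S'$, which are given by

$$e_{\mu\nu} = \vec{e}_\mu \otimes \vec{e}_\nu \quad \text{and} \quad e'_{\mu'\nu'} = \vec{e}\,'_{\mu'} \otimes \vec{e}\,'_{\nu'}, \tag{64}$$

respectively. From the transformation rule for the basis vectors, we find that

$$e_{\mu\nu} = (\Lambda^{\mu'}{}_{\mu}\, \vec{e}\,'_{\mu'}) \otimes (\Lambda^{\nu'}{}_{\nu}\, \vec{e}\,'_{\nu'}) = \Lambda^{\mu'}{}_{\mu} \Lambda^{\nu'}{}_{\nu} (\vec{e}\,'_{\mu'} \otimes \vec{e}\,'_{\nu'}) = \Lambda^{\mu'}{}_{\mu} \Lambda^{\nu'}{}_{\nu}\, e'_{\mu'\nu'}. \tag{65}$$

Just like a vector, a tensor does not depend on the basis it is expressed in and therefore we must have

$$T = T^{\mu\nu} e_{\mu\nu} = T^{\mu'\nu'} e'_{\mu'\nu'}. \tag{66}$$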


Inserting the expression of the unprimed basis in terms of the primed and identifying components, we find that

$$T^{\mu'\nu'} = \Lambda^{\mu'}{}_{\mu} \Lambda^{\nu'}{}_{\nu} T^{\mu\nu}, \tag{67}$$

i.e., the contravariant tensor components transform with one factor of $\Lambda^{\mu'}{}_{\mu}$ for each index.

Exercise 4 Verify that the covariant tensor components instead transform with one factor of $\Lambda_{\mu'}{}^{\mu}$ for each index, i.e.,

$$T_{\mu'\nu'} = \Lambda_{\mu'}{}^{\mu} \Lambda_{\nu'}{}^{\nu} T_{\mu\nu}. \tag{68}$$

It is common to refer to a vector or tensor, not by writing it down completely, but instead by referring to its components in some given basis. This is a practice we will adopt in the future. What we really need to remember when doing this are the transformation rules derived here.

3.2.3 Contractions

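In general, if we have a tensor of rank two or greater, then we can contract two of the indices and obtain a new tensor of lower rank. This amounts to multiplying by the metric and summing the metric indices with the indices in the tensor that we wish to contract. For example, for the rank three tensor $T^{\mu\nu\sigma}$, the contraction of the first two indices is given by

$$A^{\sigma} = \eta_{\mu\nu} T^{\mu\nu\sigma} = T_{\nu}{}^{\nu\sigma}. \tag{69}$$

Under basis changes, we find that

$$A^{\sigma'} = T_{\nu'}{}^{\nu'\sigma'} = \Lambda^{\sigma'}{}_{\sigma} \Lambda^{\nu'}{}_{\nu} \Lambda_{\nu'}{}^{\mu}\, T_{\mu}{}^{\nu\sigma} = \Lambda^{\sigma'}{}_{\sigma}\, \delta^{\mu}_{\nu}\, T_{\mu}{}^{\nu\sigma} = \Lambda^{\sigma'}{}_{\sigma}\, T_{\nu}{}^{\nu\sigma} = \Lambda^{\sigma'}{}_{\sigma} A^{\sigma}. \tag{70}$$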

Thus, $A^{\sigma}$ transforms just like the components of a 4-vector. The argument for tensors of other ranks is exactly the same and we conclude that the contraction of two indices results in a new tensor with a rank two lower than the original one. In particular, if the two indices of a rank two tensor are contracted, we obtain a rank zero tensor, which is just a scalar, i.e., a single number that does not depend on the coordinate system.

Example 3 We have already encountered a contraction earlier, namely when we considered the contraction of the tensor product of two vectors $V$ and $W$. The tensor product $V \otimes W$ has components $V^{\mu} W^{\nu}$ and so the contraction is

$$\eta_{\mu\nu} V^{\mu} W^{\nu} = V_{\nu} W^{\nu} = V \cdot W, \tag{71}$$

i.e., the inner product of the vectors.
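The transformation rule of Eq. (67) and the frame independence of a full contraction can be checked numerically. The sketch below assumes a standard boost along $x$ with velocity $v = 0.6$ (in units where $c = 1$) for $\Lambda^{\mu'}{}_{\mu}$; the vectors and helper names are illustrative assumptions.

```python
import math

# Diagonal entries of the Minkowski metric, signature (+, -, -, -).
ETA = [1.0, -1.0, -1.0, -1.0]

def boost_x(v):
    """Matrix Lambda^{mu'}_{mu} for a standard boost with velocity v along x."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, -g * v, 0.0, 0.0],
            [-g * v, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def transform_rank2(L, T):
    """Contravariant rank two components: one factor of Lambda per index, Eq. (67)."""
    return [[sum(L[a][m] * L[b][n] * T[m][n] for m in range(4) for n in range(4))
             for b in range(4)] for a in range(4)]

def contract(T):
    """Full contraction eta_{mu nu} T^{mu nu} of a rank two tensor."""
    return sum(ETA[m] * T[m][m] for m in range(4))

V = [2.0, 1.0, 0.0, 0.0]
W = [3.0, 0.0, 2.0, 1.0]
T = [[V[m] * W[n] for n in range(4)] for m in range(4)]  # components of V (x) W

T_primed = transform_rank2(boost_x(0.6), T)

print(T[0][0], T_primed[0][0])          # individual components change
print(contract(T), contract(T_primed))  # the contraction V.W does not
```

The individual components differ between the two frames, while the contraction agrees up to floating-point rounding, as expected for a scalar.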


3.2.4 Gradients

Let us consider the partial derivative of a tensor $T^{\mu_1 \ldots \mu_n}$ with respect to the coordinate $x^{\nu}$. We can rewrite this in $S'$ using the chain rule as

$$\partial_{\nu'} T^{\mu'_1 \ldots \mu'_n} = \frac{\partial x^{\nu}}{\partial x^{\nu'}}\, \partial_{\nu}\, \Lambda^{\mu'_1}{}_{\mu_1} \ldots \Lambda^{\mu'_n}{}_{\mu_n} T^{\mu_1 \ldots \mu_n}. \tag{72}$$

Since all of the transformation coefficients are constant, they can be taken out of the partial derivative and we obtain

$$\partial_{\nu'} T^{\mu'_1 \ldots \mu'_n} = \Lambda_{\nu'}{}^{\nu}\, \Lambda^{\mu'_1}{}_{\mu_1} \ldots \Lambda^{\mu'_n}{}_{\mu_n}\, \partial_{\nu} T^{\mu_1 \ldots \mu_n}. \tag{73}$$

It follows that $\partial_{\nu} T^{\mu_1 \ldots \mu_n}$ transforms just like $T^{\mu_1 \ldots \mu_n}$ but with an additional covariant transformation coefficient belonging to the index of the partial derivative. Thus, we define this as the gradient of $T^{\mu_1 \ldots \mu_n}$. Note that gradients are naturally covariant, i.e., there is no minus sign related to the spatial components; the minus signs appear when we make the spatial components contravariant!

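Example 4 Consider the gradient $G_{\mu} = \partial_{\mu} \phi$ of the scalar field $\phi$. Since the 4-vector $G$ is given by $G = G^{\mu} \vec{e}_{\mu}$, we find that

$$G = G^{0} \vec{e}_0 + G^{i} \vec{e}_i = G_{0} \vec{e}_0 - G_{i} \vec{e}_i = \frac{\partial \phi}{\partial t} \vec{e}_0 - \nabla \phi, \tag{74}$$

where $\nabla \phi$ is the regular gradient in three spatial dimensions.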

Of course, we can also contract the covariant index from the gradient with one of the contravariant indices from the tensor being differentiated. In this case, we talk about a divergence, and there will be no minus signs due to the metric since the index from the partial derivative is already covariant.

Example 5 Consider the divergence $\partial_{\mu} V^{\mu}$ of the vector field $V$. We find that

$$\partial_{\mu} V^{\mu} = \frac{\partial V^{0}}{\partial t} + \nabla \cdot \vec{V}, \tag{75}$$

where $\vec{V}$ is the spatial part of $V$.


4 Time dilation and length contraction

4.1 Proper time

For a general world line, we can define its tangent vector as the derivative of the event vector with respect to the curve parameter, i.e., if the world line is given by the functions $x^{\mu}(\alpha)$, then its tangent vector is given by

$$T^{\mu} = \frac{dx^{\mu}}{d\alpha}. \tag{76}$$

Just like any 4-vector, the tangent vector may be time-like, space-like, or null, but we will now restrict ourselves to the case of time-like world lines, i.e., world lines with a time-like tangent vector. For such world lines, $T^2 > 0$ and we can define the length of the tangent as $\sqrt{T^2}$ without feeling uneasy.

The distance between two events with infinitesimal separation along the world line is given by

$$ds = \sqrt{dt^2 - d\vec{x}^2} = \sqrt{\eta_{\mu\nu}\, dx^{\mu} dx^{\nu}} = \sqrt{T^2}\, d\alpha \tag{77}$$

and summing the distances along the world line from event $x^{\mu}(0)$ to event $x^{\mu}(\alpha_0)$ gives the world line length

$$s = \int ds = \int_0^{\alpha_0} \sqrt{T^2}\, d\alpha. \tag{78}$$

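Note that we can always reparametrise the world line in terms of this curve length by assuming a new curve parameter $\tau$ with corresponding tangent vector $V$ such that $V^2 = 1$. This results in

$$V = \frac{dX}{d\tau} = \frac{dX}{d\alpha} \frac{d\alpha}{d\tau} = T\, \frac{d\alpha}{d\tau}. \tag{79}$$

Squaring this relation and requiring $V^2 = 1$ gives $T^2 (d\alpha/d\tau)^2 = 1$. Given a world line with an arbitrary parametrisation in terms of $\alpha$, we can therefore always find this reparametrisation by solving the differential equation

$$\frac{d\tau}{d\alpha} = \sqrt{T^2}. \tag{80}$$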

In terms of $\tau$, we find the world line length

$$s = \int_0^{\tau_0} d\tau = \tau_0. \tag{81}$$

Note that this length is a property of the world line between the two events and does not depend on the inertial frame used to compute it. But what is the physical interpretation of it?


Consider the instantaneous rest frame of the world line, i.e., the frame where the tangent vector $V$ can be written as $V = \vec{e}_0$. For a small change $d\tau$ in the curve parameter, we find that

$$dt = d\tau. \tag{82}$$

Assuming that clocks are not affected by acceleration, a clock following the world line will tick at the same rate as a clock at rest in the instantaneous rest frame and the above equation therefore identifies $d\tau$ with the differential time interval $dt$. We can therefore conclude that integrating $d\tau$ will give the total time elapsed for a clock following the given world line. The time $\tau$ is therefore known as the proper time along the world line and we have already identified it with the world line length $s$, which depends only on the world line itself and not on its parametrisation in any given inertial frame.

The tangent vector $V = dX/d\tau$ can be described in an arbitrary frame as

$$V = \frac{dX}{d\tau} = \frac{dX}{dt} \frac{dt}{d\tau}. \tag{83}$$

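We here note that

$$\frac{dx^{0}}{dt} = 1 \quad \text{and} \quad \frac{d\vec{x}}{dt} = \vec{v} \tag{84}$$

and therefore

$$V^2 = \left(\frac{dt}{d\tau}\right)^2 (1 - \vec{v}^2) = 1 \quad \Longrightarrow \quad \frac{dt}{d\tau} = \frac{1}{\sqrt{1 - v^2}} = \gamma. \tag{85}$$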

It follows that $V$, known as the 4-velocity, can be expressed in component form as

$$V^{\mu} = \gamma (1, \vec{v})^{\mu}, \tag{86}$$

where we have used $(W^{0}, \vec{W})^{\mu}$ to represent the components of the 4-vector $W$ with time component $W^{0}$ and spatial part $\vec{W}$.

Looking at Eq. (85), we can conclude that

$$d\tau = \frac{dt}{\gamma}. \tag{87}$$

This is the time dilation formula, which tells us that the elapsed proper time along a world line in the coordinate time interval $dt$ is given by $dt/\gamma \leq dt$, where the equality holds only when $v = 0$. In other words, the elapsed proper time will always be at most as large as the elapsed time $t$. The time dilation formula can also be found directly from Eq. (78) by using the time $t$ as the world line parameter, which results in

$$s = \int \sqrt{\left(\frac{dt}{dt}\right)^2 - \left(\frac{d\vec{x}}{dt}\right)^2}\, dt = \int \sqrt{1 - v^2}\, dt = \int \frac{dt}{\gamma}. \tag{88}$$
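Eq. (88) lends itself to direct numerical integration. The following is a minimal sketch, assuming units where $c = 1$ and a midpoint rule; the function name `proper_time`, the step count, and the second velocity profile $v(t) = t/\sqrt{1 + t^2}$ are illustrative assumptions chosen because the integral is then known in closed form, $\mathrm{arsinh}(t)$.

```python
import math

def proper_time(velocity, t_start, t_end, steps=20_000):
    """Midpoint-rule estimate of the integral of sqrt(1 - v(t)^2) dt, Eq. (88)."""
    dt = (t_end - t_start) / steps
    total = 0.0
    for k in range(steps):
        t = t_start + (k + 0.5) * dt
        total += math.sqrt(1.0 - velocity(t) ** 2) * dt
    return total

# Constant speed v = 0.6 (gamma = 1.25) during a coordinate time of 10.
tau_const = proper_time(lambda t: 0.6, 0.0, 10.0)

# A time-dependent speed, v(t) = t / sqrt(1 + t^2), for which the exact
# proper time between t = 0 and t = 1 is asinh(1).
tau_varying = proper_time(lambda t: t / math.sqrt(1.0 + t * t), 0.0, 1.0)

print(tau_const)                      # close to 10 / 1.25 = 8
print(tau_varying, math.asinh(1.0))   # the two values should agree closely
```

For a constant velocity the integrand is constant and the result reduces to $\Delta t/\gamma$; the numerical route only becomes useful for world lines with varying velocity.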

Example 6 Consider an observer moving at constant velocity $v$ in the inertial frame $S$. The world line of this observer is given by $x(t) = vt$. Between the times $t = 0$ and $t = \Delta t$, the proper time elapsed for the observer is given by

$$\Delta\tau = \int_0^{\Delta t} \sqrt{1 - v^2}\, dt = \sqrt{1 - v^2}\, \Delta t = \frac{\Delta t}{\gamma}. \tag{89}$$

Note that the time dilation is reciprocal. An observer at rest in $S$ would experience a proper time $\Delta t'/\gamma$ between the times $t' = 0$ and $t' = \Delta t'$ of the rest frame $S'$ of the moving observer.

4.2 Proper length

The first question to ask ourselves when talking about length is how we define the length of an object. For a stationary object, this is not a big issue: we can just take the difference between the coordinates of the object's end points and apply Pythagoras' theorem. However, for a moving object, we need to be more specific. We will here specialise to one spatial dimension, the direction of relative motion, as the other directions will not be affected. We then define the length of an object as the distance between its end points at a given time, i.e., we must measure the position of the end points at the same time. If we do not do this, we will end up with a different result.

Example 7 Consider a rod of length $\ell$ moving in the positive $x$ direction with velocity $v$. The coordinates of the end points are then given by $x_1(t) = vt$ and $x_2(t) = vt + \ell$ such that the length of the rod is

$$\ell = x_2(t) - x_1(t). \tag{90}$$

If we would measure the positions of the end points at different times $t_2 \neq t_1$, we would instead get the result

$$L(t_2, t_1) = x_2(t_2) - x_1(t_1) = v(t_2 - t_1) + \ell \neq \ell. \tag{91}$$

In classical mechanics, we can apply the Galilei transformation to express the length in a system moving at speed $u$ relative to the first. We would find that

$$t' = t, \quad x'_1(t') = (v + u)t', \quad x'_2(t') = (v + u)t' + \ell \tag{92}$$

leading to

$$\ell' = x'_2(t') - x'_1(t') = \ell. \tag{93}$$

In special relativity, there are two complications when computing the length of an object in a frame where it is moving. First of all, the spatial coordinate $x$ appears with a factor of $\gamma$ in the expression for $x'$ and, second, the very definition of "at the same time" depends on the inertial frame. Let us consider an object with length $\ell_0$ in its rest frame $S$. We can choose the coordinates in this frame such that $x_1(t) = 0$ and $x_2(t) = \ell_0$, where $\ell_0$ is the proper length of the object, i.e., its length in its rest frame. As discussed, in the object's rest frame, it does not matter at what times we measure the positions of the end points, since they are stationary, and we find

$$\ell_0 = x_2(t_2) - x_1(t_1) \tag{94}$$

for all $t_2$ and $t_1$. In order to find the length of the object in an inertial frame $S'$ moving with velocity $v$ relative to $S$, we need to find the coordinates of its end points. Applying the Lorentz transformation results in

$$x'_1 = -\gamma v t_1, \quad x'_2 = \gamma(\ell_0 - v t_2). \tag{95}$$

In order to find the length of the object in $S'$, we need to make sure that these coordinates are measured at the same time in $S'$ and in order to do this we consider the Lorentz transformation for the time coordinates

$$t'_1 = \gamma t_1, \quad t'_2 = \gamma(t_2 - v \ell_0). \tag{96}$$

Requiring $t'_1 = t'_2 = t'$ therefore results in the condition

$$t_2 - t_1 = v \ell_0. \tag{97}$$

It follows that the length of the object in $S'$ is given by

$$\ell' = x'_2(t') - x'_1(t') = \gamma[\ell_0 - v(t_2 - t_1)] = \gamma \ell_0 (1 - v^2) = \frac{\ell_0}{\gamma}. \tag{98}$$

Since $\gamma \geq 1$, the length $\ell' \leq \ell_0$ and the length observed in $S'$ is shorter than the proper length of the object. This is known as length contraction.

Note that the assumption about measuring the $x'$ coordinates at the same time in $S'$ is crucial to the argument. If we instead took two events that are simultaneous in the rest frame, the difference between their $x'$ coordinates would be $\gamma \ell_0$ instead of $\ell_0/\gamma$, but the events would not be simultaneous in $S'$.
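The role of simultaneity in Eqs. (95)-(98) can be made concrete with numbers. This is a minimal sketch, assuming units where $c = 1$ and the illustrative values $v = 0.6$ and $\ell_0 = 10$; the helper `lorentz` is a hypothetical name for the standard Lorentz transformation in one spatial dimension.

```python
import math

def lorentz(t, x, v):
    """Standard Lorentz transformation of the event (t, x) to a frame moving with velocity v."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return g * (t - v * x), g * (x - v * t)

v, l0 = 0.6, 10.0                   # illustrative values
g = 1.0 / math.sqrt(1.0 - v * v)

# End point events chosen to satisfy the condition t2 - t1 = v * l0 of Eq. (97).
t1 = 0.0
t2 = t1 + v * l0
t1p, x1p = lorentz(t1, 0.0, v)
t2p, x2p = lorentz(t2, l0, v)
length_moving = x2p - x1p           # events simultaneous in S'

# End point events that are simultaneous in the rest frame S instead.
_, x1s = lorentz(0.0, 0.0, v)
_, x2s = lorentz(0.0, l0, v)
separation_rest_simultaneous = x2s - x1s

print(t1p, t2p)                             # equal: the events are simultaneous in S'
print(length_moving, l0 / g)                # contracted length
print(separation_rest_simultaneous, g * l0) # larger separation, but not a length in S'
```

Only the first pair of events corresponds to a length measurement in $S'$; the second pair gives $\gamma\ell_0$ precisely because the events are not simultaneous there.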


4.3 Paradoxes

The concepts of time dilation and length contraction are often misunderstood by people reading popular literature. The most common misconception arises from failing to account for the fact that both effects are based on a mismatch in what different observers consider to be simultaneous. We will here resolve two of the more common paradoxes that arise due to this misconception.

4.3.1 The twin paradox

The twin paradox is perhaps the most encountered paradox when dealing with relativity. The argument goes as follows: Consider two twins A and B and let B travel away from A at velocity $v$ for some distance $vt_0$, which takes a time $t_0$ in the rest frame of A. After travelling this distance, B reverses direction and travels back to A. Applying the time dilation formula, A finds that B will have aged an amount $2t_0/\gamma$ while A has aged by $2t_0$, meaning that A is older. The apparent paradox arises when considering the course of events from the perspective of B. What is stopping B from applying the same argument and concluding that A should be younger? After all, A is always moving at speed $v$ relative to B and should be time dilated in the rest frame of B.

Of course, we know that the proper times corresponding to the world lines of the two twins do not depend on selecting a particular inertial frame. Indeed, if we would consider only one of B's inertial frames for computing the times elapsed for both observers, we would find the same result that we found in the rest frame of A. The problem arises when B changes rest frame at the half-point of the journey. It is true that A is time dilated in the rest frame of B. However, due to the relativity of simultaneity, the event on A's world line that is simultaneous with the turn-around of B is not the half-time event for A, but an event that occurs earlier. In the rest frame of A, the half-time events for both twins are given by

$$t_A = t_B = t_0, \quad x_A = 0, \quad x_B = v t_0. \tag{99}$$

Transforming the times of these events to the rest frame $S'$ of B during the first half of the journey, we find that

$$t'_A = \gamma t_0, \quad t'_B = \gamma(t_0 - v^2 t_0) = \gamma(1 - v^2) t_0 = (1 - v^2)\, t'_A. \tag{100}$$

Thus, to find half the time elapsed for A for the full course of events, we should not compute the time dilation in $S'$ based on $t'_B$, but instead on $t'_A = t'_B/(1 - v^2)$. Doing so results in

$$\tau_A = \frac{\tau_B}{\gamma(1 - v^2)} = \tau_B \gamma, \tag{101}$$

where $\tau_X$ is half the time elapsed for twin X during the full scenario. A similar argument can be made in the returning rest frame of B. The resolution to the twin paradox is illustrated with a Minkowski diagram in Fig. 5.

Figure 5 (diagram not reproduced; axes $x$, $t$ and $x'$, $t'$, with the times $t_0$, $t'_0$ and $t'_0/\gamma$ marked): A Minkowski diagram based on the staying twin's rest frame. The world line of the travelling twin is shown in blue and the world line of the staying twin in green/yellow. The dashed line shows the events simultaneous with the turnaround of the travelling twin in its original rest frame. It is therefore the yellow part of the world line of the staying twin that is taken into account when using the time dilation formula with the time $t'_0$. As seen here, this does not correspond to half of the staying twin's time.
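The bookkeeping of Eqs. (99)-(101) can be followed with concrete numbers. The sketch below assumes units where $c = 1$ and the illustrative values $v = 0.6$ and $t_0 = 5$; the variable names are hypothetical.

```python
import math

v, t0 = 0.6, 5.0                   # illustrative values
g = 1.0 / math.sqrt(1.0 - v * v)   # gamma = 1.25

# Half of the elapsed proper times, computed in the rest frame of A.
tau_A = t0
tau_B = t0 / g

# Times of the half-time events of Eq. (99) in the outbound rest frame S' of B, Eq. (100).
t_prime_A = g * (t0 - v * 0.0)       # A's half-time event at x = 0
t_prime_B = g * (t0 - v * (v * t0))  # B's turn-around at x = v * t0

# Applying time dilation in S' to t'_B only covers A's world line up to the
# event that is simultaneous with the turn-around in S'.
tau_A_up_to_turnaround = t_prime_B / g
# Applying it to t'_A instead reproduces half of A's elapsed time, Eq. (101).
tau_A_from_S_prime = t_prime_A / g

print(tau_B, t_prime_B)                   # B is at rest in S', so these agree
print(tau_A_up_to_turnaround)             # smaller than t0
print(tau_A_from_S_prime, g * tau_B)      # both equal tau_A
```

With these numbers, B ages 4 during the outbound leg while A ages 5, in agreement with the computation in the rest frame of A.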

4.3.2 Garage paradox

The second apparent paradox we will look at appears under several different names. We will go with the name garage paradox. It is based on the symmetry of length contraction. The statement of the paradox is as follows: Consider an observer A driving a car of proper length $\ell_0$ towards a garage which also has proper length $\ell_0$. The garage has two doors, one in the front and one in the back, which are initially open. In the rest frame $S'$ of A, the garage is length contracted and has length $\ell_0/\gamma$. However, in the rest frame $S$ of the garage, it is the car that is length contracted and has length $\ell_0/\gamma$. Thus, in $S$, the car fits in the garage and the doors can be closed simultaneously, whereas in $S'$, the car is longer than the garage and the doors cannot be closed simultaneously, which is an apparent paradox, see Fig. 6.

Figure 6 (diagram not reproduced; panel $S$ with the car moving at velocity $v$ and panel $S'$ with the garage moving at velocity $-v$): In the rest frame $S$ of the garage, the car is short enough to fit in the garage, but in the rest frame of the car $S'$, the garage is too short for the car to fit.

The resolution to this paradox is hinted at already in its formulation, where the word "simultaneous" is used. The question that should pop into your mind is "what does simultaneous mean?" Indeed, as we have discussed, the notion of simultaneity is different in $S$ and $S'$ and it is therefore not so strange that we might arrive at different conclusions in the different frames. However, for the sake of the argument, let us do the maths.

We consider coordinates in $S$ such that the ends of the garage follow the world lines

$$x_{b,\pm}(t) = \pm \frac{\ell_0}{2}. \tag{102}$$

We also let the ends of the car follow the world lines

$$x_{p,\pm}(t) = vt \pm \frac{\ell_0}{2\gamma}. \tag{103}$$

It should be clear that this describes an object of length $\ell_0/\gamma$ moving at velocity $v$. At time $t = 0$, the car is completely within the garage and we can close the doors. The events of the doors closing are therefore given by

$$t_{\pm} = 0, \quad x_{\pm} = x_{b,\pm}(0) = \pm \frac{\ell_0}{2}. \tag{104}$$

How do these events appear in $S'$? By Lorentz transformation, we find that

$$t'_{\pm} = -v \gamma x_{\pm} = \mp \frac{v \gamma \ell_0}{2}. \tag{105}$$

Figure 7 (diagrams not reproduced; panels $S$ and $S'$ with axes $x$, $t$ and $x'$, $t'$, the simultaneities $t = t_0$ and $t' = t'_0$, and the lengths $\ell_0$ and $\ell_0/\gamma$ marked): The world sheets (extended objects do not make lines in Minkowski space) of the car (red) and garage (green) in the garage paradox, as illustrated in Minkowski diagrams based on $S$ and $S'$, respectively. The dashed lines represent simultaneities in their corresponding frames. The car clearly fits in the garage at $t = t_0$ in $S$ and is clearly too large to fit in the garage at $t' = t'_0$ in $S'$.

Thus, the door at the back of the garage closes at $t'_{+} = -v\gamma\ell_0/2$, earlier than the door at the front of the garage, which closes at $t'_{-} = v\gamma\ell_0/2$. If we do not want the car to crash into the back door, we need to reopen the back door before the front door can be closed (in $S'$). Minkowski diagrams illustrating the garage paradox and its resolution are shown in Fig. 7.
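The door-closing events of Eqs. (104) and (105) can be transformed explicitly. This is a minimal sketch, assuming units where $c = 1$ and the illustrative values $v = 0.6$ and $\ell_0 = 10$; the helper `lorentz` and the variable names are hypothetical.

```python
import math

def lorentz(t, x, v):
    """Standard Lorentz transformation of the event (t, x) to a frame moving with velocity v."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return g * (t - v * x), g * (x - v * t)

v, l0 = 0.6, 10.0                  # illustrative values
g = 1.0 / math.sqrt(1.0 - v * v)

# Door-closing events in S, Eq. (104): both at t = 0.
t_back, x_back = lorentz(0.0, +l0 / 2, v)    # back door, x = +l0/2
t_front, x_front = lorentz(0.0, -l0 / 2, v)  # front door, x = -l0/2

# In S' the car is at rest and, from Eq. (103), occupies -l0/2 <= x' <= l0/2.
car_rear, car_front = -l0 / 2, +l0 / 2

print(t_back, t_front)    # the back door closes first in S', Eq. (105)
print(x_back, car_front)  # the back door closes ahead of the front of the car
print(x_front, car_rear)  # the front door closes behind the rear of the car
```

Both closing events lie outside the region occupied by the car in $S'$, so neither door hits the car at the moment it closes; the two frames only disagree on whether the closings happen at the same time.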

5 Relativistic dynamics and kinematics

5.1 Addition of velocities

If the velocity of an object in the frame $S$ is $\vec{u}$, we ask the question of what its velocity in a different frame $S'$ is. For simplicity, we assume $S$ and $S'$ to be in standard configuration with respect to each other so that $S'$ moves with velocity $v$ in the positive $x$ direction in $S$.

In general, the 4-velocity $V$ of the object in $S$ is given by

$$V = \gamma(u)(1, \vec{u}). \tag{106}$$

The velocity $\vec{u}$ can be found by dividing the spatial components by the temporal one

$$\frac{\vec{V}}{V^{0}} = \frac{\gamma \vec{u}}{\gamma} = \vec{u}. \tag{107}$$

Since the 4-velocity is a 4-vector, we can find its components in $S'$ by Lorentz transformation

$$V^{\mu'} = \Lambda^{\mu'}{}_{\mu} V^{\mu}. \tag{108}$$

For the case where $\vec{u}$ is also in the $x$ direction we find

$$V' = \gamma(v)\gamma(u)(1 - uv,\ u - v,\ 0,\ 0). \tag{109}$$

We can now find the velocity of the object in the primed system by dividing the spatial components by the temporal one and find

$$\vec{u}\,' = \frac{\gamma(v)\gamma(u)(u - v)}{\gamma(v)\gamma(u)(1 - uv)}\, \vec{e}_1 = \frac{u - v}{1 - uv}\, \vec{e}_1. \tag{110}$$

For motion in the same direction as the relative motion of the inertial frames, we therefore find the relation for relativistic velocity addition

$$u' = \frac{u - v}{1 - uv}. \tag{111}$$
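Eq. (111) can be cross-checked against the 4-velocity route of Eqs. (106)-(110). The following is a minimal sketch, assuming units where $c = 1$ and motion along the $x$ direction only; the function names and sample velocities are illustrative.

```python
import math

def gamma(v):
    """Lorentz factor for speed v, in units where c = 1."""
    return 1.0 / math.sqrt(1.0 - v * v)

def velocity_in_primed_frame(u, v):
    """Relativistic velocity addition, Eq. (111)."""
    return (u - v) / (1.0 - u * v)

def velocity_from_4velocity(u, v):
    """Boost the 4-velocity gamma(u) * (1, u) and divide spatial by temporal component."""
    V0, V1 = gamma(u), gamma(u) * u
    V0p = gamma(v) * (V0 - v * V1)
    V1p = gamma(v) * (V1 - v * V0)
    return V1p / V0p

u, v = 0.8, -0.5   # illustrative values: S' moves in the negative x direction

print(velocity_in_primed_frame(u, v))
print(velocity_from_4velocity(u, v))
print(velocity_in_primed_frame(1.0, 0.5))  # light speed is the same in both frames
```

Even though the Galilean result $u - v = 1.3$ would exceed the speed of light, the relativistic result stays below 1.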

Exercise 5 Find the relation between $\vec{u}$ and $\vec{u}\,'$ when $\vec{u}$ is not in the $x$ direction using the same type of argumentation as above.

5.2 4-acceleration

In classical mechanics, the acceleration $\vec{a}$ is given by

$$\vec{a} = \frac{d^2 \vec{x}}{dt^2}. \tag{112}$$

In order to generalise this to a concept that can be used in Minkowski space, we face two problems: First of all, the 3-vector $\vec{x}$ is not a 4-vector and it is not clear how it should transform under Lorentz transformations. Second, the time with respect to which the derivatives are taken is frame dependent in Minkowski space.

We define the 4-acceleration as the second derivative of the event vector with respect to the proper time, i.e.,

$$A = \frac{dV}{d\tau} = \frac{d^2 X}{d\tau^2}. \tag{113}$$

The first thing we note is that

$$0 = \frac{d1}{d\tau} = \frac{dV^2}{d\tau} = 2V \cdot \frac{dV}{d\tau} = 2V \cdot A \tag{114}$$

and hence the 4-acceleration $A$ is necessarily orthogonal to the 4-velocity $V$. Since $V$ is a time-like 4-vector, this necessarily implies that $A$ is either space-like or the zero 4-vector.

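Let us examine how to express the 4-acceleration in terms of the velocity and acceleration in a given frame. We know that the 4-velocity is given by Eq. (106) and therefore we find

$$\frac{dV}{d\tau} = \frac{d}{d\tau}\left[\gamma(1, \vec{u})\right] = \frac{du^{i}}{d\tau} \frac{\partial}{\partial u^{i}}\left[\gamma(1, \vec{u})\right] = \frac{dt}{d\tau} \frac{du^{i}}{dt} \left[\frac{\partial \gamma}{\partial u^{i}}(1, \vec{u}) + \gamma(0, \vec{e}_i)\right]. \tag{115}$$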

We now apply the result of the following exercise:

Exercise 6 Show that

$$\frac{\partial \gamma}{\partial u^{i}} = u^{i} \gamma^3. \tag{116}$$

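Together with $dt/d\tau = \gamma$, this leads to

$$A = \gamma a^{i}\left[u^{i} \gamma^3 (1, \vec{u}) + \gamma(0, \vec{e}_i)\right] = \gamma^3 (\vec{a} \cdot \vec{u}) V + \gamma^2 (0, \vec{a}), \tag{117}$$

where $\vec{a} = d\vec{u}/dt$ is the 3-acceleration.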

Exercise 7 Verify that $A^{\mu} V_{\mu} = 0$ with the above expressions for the 4-velocity and 4-acceleration.

5.2.1 Proper acceleration

In the instantaneous rest frame of an object, its velocity is given by $\vec{u} = 0$. In this frame, the 4-acceleration takes the form

$$A = (0, \vec{a}), \tag{118}$$

since $\gamma = 1$ and $\vec{a} \cdot \vec{u} = 0$. This implies that

$$A^2 = -\vec{a}^2 \equiv -\alpha^2, \tag{119}$$

where $\alpha$ is called the proper acceleration. The proper acceleration is the acceleration experienced by an object in its instantaneous rest frame.


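Since the 4-acceleration is a 4-vector, its square is an invariant and the proper acceleration can always be computed by $\alpha^2 = -A^2$ in any frame. In particular, in the most general case, we can square the relation

$$\gamma^2 (0, \vec{a}) = A - \gamma^3 (\vec{u} \cdot \vec{a}) V \tag{120}$$

and use $A \cdot V = 0$ and $V^2 = 1$ to find

$$-\gamma^4 \vec{a}^2 = A^2 + \gamma^6 (\vec{u} \cdot \vec{a})^2 = -\alpha^2 + \gamma^6 (\vec{u} \cdot \vec{a})^2, \tag{121}$$

which can be rearranged to

$$\alpha^2 = \gamma^4 \vec{a}^2 + \gamma^6 (\vec{u} \cdot \vec{a})^2. \tag{122}$$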

Writing $|\vec{u}| = u$ and $|\vec{a}| = a$, this expression for the proper acceleration becomes

$$\alpha^2 = \gamma^6 a^2 \left[(1 - u^2) + u^2 \cos^2(\theta)\right], \tag{123}$$

where $\theta$ is the angle between the velocity and acceleration. In particular, when $\theta = 0$, i.e., when $\vec{u}$ and $\vec{a}$ are parallel, the relation between the acceleration in a frame where the object is moving with velocity $\vec{u}$ and the proper acceleration is

$$\alpha = \gamma^3 a, \tag{124}$$

while the relation when $\vec{u}$ and $\vec{a}$ are perpendicular is given by

$$\alpha = \gamma^2 a. \tag{125}$$

The conclusion from the above discussion is that for a given speed and acceleration, the proper acceleration will generally depend on the angle between the velocity and acceleration.

We also here see the seeds of why a massive object cannot be accelerated to light speed. To maintain a given acceleration $a$ as $u$ grows, the proper acceleration must grow as well. For an object with constant proper acceleration, the acceleration goes to zero as $u \to 1$.
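The angular dependence in Eqs. (122)-(125) can be checked numerically. This is a minimal sketch, assuming units where $c = 1$; the function name `proper_acceleration` and the sample values $u = 0.6$, $a = 2$ are illustrative assumptions.

```python
import math

def dot(a, b):
    """Euclidean inner product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def proper_acceleration(u, a):
    """Proper acceleration from Eq. (122): alpha^2 = gamma^4 a^2 + gamma^6 (u.a)^2."""
    g = 1.0 / math.sqrt(1.0 - dot(u, u))
    return math.sqrt(g ** 4 * dot(a, a) + g ** 6 * dot(u, a) ** 2)

u = (0.6, 0.0, 0.0)                 # gamma = 1.25
g = 1.0 / math.sqrt(1.0 - dot(u, u))
a_mag = 2.0

alpha_parallel = proper_acceleration(u, (a_mag, 0.0, 0.0))
alpha_perpendicular = proper_acceleration(u, (0.0, a_mag, 0.0))

# General angle, compared with Eq. (123).
theta = math.pi / 3
a_tilted = (a_mag * math.cos(theta), a_mag * math.sin(theta), 0.0)
alpha_tilted = proper_acceleration(u, a_tilted)
alpha_eq123 = math.sqrt(g ** 6 * a_mag ** 2 * ((1 - 0.6 ** 2) + 0.6 ** 2 * math.cos(theta) ** 2))

print(alpha_parallel, g ** 3 * a_mag)        # Eq. (124)
print(alpha_perpendicular, g ** 2 * a_mag)   # Eq. (125)
print(alpha_tilted, alpha_eq123)
```

For the same coordinate acceleration, the parallel case requires the largest proper acceleration, by an extra factor of $\gamma$ compared with the perpendicular case.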

5.3 4-momentum

Looking back at Newtonian mechanics, the momentum of an object was defined as

$$\vec{p} = m\vec{u}, \tag{126}$$

where $m$ is called the mass of the object. Newton's second law then states that the force $\vec{F}$ is the time derivative of the momentum, i.e.,

$$\vec{F} = \frac{d\vec{p}}{dt}, \tag{127}$$

while Newton's third law states that for every force acting on one object from another, there is a force of equal magnitude but opposite direction acting on the second object from the first, implying the conservation of momentum for systems on which there are no external forces, i.e.,

$$\frac{d}{dt}(\vec{p}_1 + \vec{p}_2) = \frac{d\vec{p}_1}{dt} + \frac{d\vec{p}_2}{dt} = \vec{F}_1 + \vec{F}_2 = 0 \tag{128}$$

in the case when $\vec{F}_1 = -\vec{F}_2$, implying that the total momentum $\vec{p}_1 + \vec{p}_2$ is conserved.

For the case when $m$ is constant, Newton's second law gives

$$\vec{F} = m\vec{a}, \tag{129}$$

identifying the mass $m$ with the inertia of an object.

Let us see if we can do something similar in relativity. Since we are dealing with Minkowski space, we need to deal with 4-vectors rather than 3-vectors and so we select the 4-velocity in order to define the 4-momentum

$$P = mV, \tag{130}$$

where $m$ is some scalar. Our intention is now to examine how this new 4-vector can be interpreted if it, like the momentum in classical mechanics, is conserved for closed systems. We will do so and identify the components of $P$ by looking at the classical limit when $u \ll 1$. In this limit, neglecting terms of $O(u^3)$, we find that

$$P = m\gamma(1, \vec{u}) \simeq m\left(1 + \frac{u^2}{2}\right)(1, \vec{u}) \simeq \left(m + \frac{m u^2}{2},\ m\vec{u}\right). \tag{131}$$

It is here instructive to reinsert the speed of light $c$ in units where $c \neq 1$ and so we get

$$P = \left(\frac{mc^2 + \frac{m u^2}{2}}{c},\ m\vec{u}\right). \tag{132}$$

We can immediately identify the spatial components of the 4-momentum with the classical 3-momentum $\vec{p}$. In the temporal component, we have something which looks like the classical kinetic energy $mu^2/2$ with an additional constant contribution $mc^2$, all divided by $c$. We therefore interpret this component as the total energy $E$ (divided by $c$) of the object in the given frame. In particular, when $\vec{u} = 0$, the total energy becomes

$$E = mc^2, \tag{133}$$

which we therefore identify as the rest energy. Note that so far, while the mass parameter $m$ has been identified with the rest energy (up to multiplication by $c^2$), we have not yet seen that we can still identify it with the inertia of the object.

With the above in mind, we simply define the 4-momentum of an object to have the components

$$P = (E, \vec{p}) \tag{134}$$

(now again in units where $c = 1$), where $E$ is the total energy and $\vec{p}$ the 3-momentum of an object. A useful relation that will become important later is the relativistic dispersion relation

$$E^2 = \vec{p}^2 + m^2, \tag{135}$$

which can be found by squaring the 4-momentum, i.e., using that

$$P^2 = m^2 V^2 = m^2 = (E, \vec{p})^2 = E^2 - \vec{p}^2. \tag{136}$$
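The dispersion relation (135) and the classical limit (131) can both be checked with a few lines of code. This is a minimal sketch, assuming units where $c = 1$; the function name `four_momentum` and the sample mass and velocities are illustrative.

```python
import math

def four_momentum(m, u):
    """Components (E, p_x, p_y, p_z) of P = m V = m gamma (1, u), Eqs. (130) and (134)."""
    g = 1.0 / math.sqrt(1.0 - sum(c * c for c in u))
    return (m * g,) + tuple(m * g * c for c in u)

m = 2.0  # illustrative mass

# Relativistic case: u = 0.6 gives gamma = 1.25.
E, px, py, pz = four_momentum(m, (0.6, 0.0, 0.0))
print(E, px)                                    # close to 2.5 and 1.5
print(E ** 2 - (px ** 2 + py ** 2 + pz ** 2))   # equals m^2, Eq. (136)

# Classical limit u << 1: E - m approaches the kinetic energy m u^2 / 2, Eq. (131).
u_small = 0.01
E_small, p_small, _, _ = four_momentum(m, (u_small, 0.0, 0.0))
print(E_small - m, m * u_small ** 2 / 2)
print(p_small, m * u_small)
```

The difference between $E - m$ and $mu^2/2$ in the last case is of relative order $u^2$, consistent with the terms neglected in Eq. (131).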

5.4 4-force

Just as we defined the force as the time derivative of momentum in classical mechanics, we can do the same in relativity. However, since we want an invariant object and time is frame dependent, we instead differentiate the 4-momentum with respect to proper time $\tau$. Thus, we define the 4-force on an object as

$$F = \frac{dP}{d\tau}. \tag{137}$$

Just as for the 4-momentum, let us check that the 4-force can be interpreted in a manner we understand in the non-relativistic limit. For $u \ll 1$, the temporal component of the definition of the 4-force becomes

$$F^{0} = \frac{dE}{d\tau} = \gamma \frac{dE}{dt} \simeq \frac{dE}{dt}. \tag{138}$$

In other words, the temporal component describes the change in total energy of an object. At the same time, using the non-relativistic expansion of the 4-momentum gives

$$F^{0} \simeq \frac{d\left[m(1 + \vec{u}^2/2)\right]}{dt} = m\vec{u} \cdot \vec{a} = \vec{u} \cdot m\vec{a} = \frac{dE}{dt} \tag{139}$$

for the case of constant mass $m$. For the spatial components of $F$, we find that

$$\vec{F} \simeq \frac{d\vec{p}}{dt} = \frac{d(m\vec{u})}{dt} = m\vec{a}. \tag{140}$$

Together with the temporal component, this gives the classical work-energy theorem

$$\frac{dE}{dt} = \vec{u} \cdot \vec{F}. \tag{141}$$

We can here also identify the mass $m$ with the inertia of the object as long as $u \ll 1$ or, more precisely, the inertia of the object in its instantaneous rest frame. With this identification, we have now arrived at one of the deeper insights in special relativity: the inertia of an object in its instantaneous rest frame is equal to its rest energy (up to a factor $c^2$ if we use units where $c \neq 1$).

In general, there is no reason to assume that the rest energy of an object is fixed. Indeed, changing the internal structure of an object or heating it will generally result in the object containing more energy. Since we have identified the total rest energy of an object with its inertia in its rest frame, we can conclude that increasing the internal energy of an object results in the object having a larger inertia as well. When it comes to the 4-force, we can use the product rule for derivatives to write

$$F = \frac{d(mV)}{d\tau} = \frac{dm}{d\tau} V + m \frac{dV}{d\tau} = \frac{dm}{d\tau} V + mA. \tag{142}$$

Thus, the 4-force on an object can be split into one contribution that is proportional to the 4-acceleration and another which is proportional to the 4-velocity of the object. In general, the change in the rest energy of the object can be found by multiplying the 4-force with the 4-velocity $V$ as

$$F \cdot V = \frac{dm}{d\tau} V^2 + mA \cdot V = \frac{dm}{d\tau}. \tag{143}$$

If the 4-force does not give any acceleration, we find that $A = 0$ and hence

$$F = \frac{dm}{d\tau} V. \tag{144}$$

Since such a force changes the rest energy of the object without accelerating it, we call it a heat-like force. On the other hand, we can consider a force that does not change the rest energy of an object and instead only accelerates it. For such a force we have $dm/d\tau = 0$ and therefore

$$F = mA. \tag{145}$$

We call this type of force a pure force. By definition, a pure force satisfies $F \cdot V = mA \cdot V = 0$. As demonstrated by Eq. (142), any force can be split into a heat-like and a pure part.
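The split into a heat-like and a pure part can be carried out explicitly using Eqs. (142) and (143): the heat-like part is $(F \cdot V)V$ and the pure part is the remainder. The sketch below assumes units where $c = 1$; the 4-force components are arbitrary illustrative numbers and the helper names are hypothetical.

```python
import math

def minkowski_dot(a, b):
    """Inner product with signature (+, -, -, -)."""
    return a[0] * b[0] - sum(x * y for x, y in zip(a[1:], b[1:]))

def split_force(F, V):
    """Split F into a heat-like part (F.V) V and a pure part F - (F.V) V."""
    dm_dtau = minkowski_dot(F, V)                     # Eq. (143)
    heat_like = tuple(dm_dtau * c for c in V)         # Eq. (144)
    pure = tuple(f - h for f, h in zip(F, heat_like)) # equals m A, Eq. (145)
    return heat_like, pure

u = 0.6
g = 1.0 / math.sqrt(1.0 - u * u)
V = (g, g * u, 0.0, 0.0)        # 4-velocity with V^2 = 1
F = (1.0, 2.0, 0.5, 0.0)        # arbitrary illustrative 4-force components

heat_like, pure = split_force(F, V)

print(minkowski_dot(F, V))      # rate of change of the rest energy, dm/dtau
print(heat_like, pure)
print(minkowski_dot(pure, V))   # vanishes: the remainder is a pure force
```

The decomposition relies only on $V^2 = 1$, so it works for any 4-force and any 4-velocity.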

6 Electromagnetic fields in SR

The aim of this section is to argue that electromagnetism is the absolutely simplest field theory for a pure force that can be written down in special relativity. We will do so by considering how a field has to behave in order for the corresponding force to be pure and then assuming some dynamics of the field.

From classical mechanics, we are used to dealing with force fields of different types. In particular, we should be familiar with the gravitational and electrostatic forces with the corresponding fields $\vec{g}$ and $\vec{E}$. The force on an object due to these fields is usually given by the field at the position of the object multiplied by a scalar property of the object. In the case of the gravitational field, this scalar property is the mass $m$ and in the case of the electrostatic field, it is the charge $q$, implying that

$$\vec{F}_g = m\vec{g} \quad \text{and} \quad \vec{F}_E = q\vec{E}. \tag{146}$$

Apart from these force fields, we are also acquainted with the magnetostatic field $\vec{B}$, for which the resulting force depends not only on a scalar property $q$ of the object, but also on its velocity $\vec{u}$ as

$$\vec{F} = q\vec{u} \times \vec{B}. \tag{147}$$

Trying to generalise this to the setting of 4-vectors, we look for a force field such that the 4-force on an object depends at most on a scalar property $q$, the 4-velocity $V$, and the force field. Of course, we also want the expression for the 4-force to transform as a 4-vector.

Exercise 8 We can try to make the force field a scalar field $f$. With this assumption, the corresponding 4-force must be equal to $F^{\mu} = qfV^{\mu}$ (think of why!). Show that this force field does not result in a pure force on any object.

Exercise 9 We can also try to make the force field a vector field $f^{\mu}$. In this case the corresponding 4-force must be equal to $F^{\mu} = qf^{\mu}$ (again, think of why!). Show that this force field may be pure for some choices of $V^{\mu}$, but not for a general 4-velocity $V^{\mu}$.

With the above exercises completed, it should be clear that a force field giving a pure force cannot be a scalar field or a vector field. The next thing we can try is to consider a rank two tensor field $F^{\mu\nu}$ such that

$$F^{\mu} = qF^{\mu\nu} V_{\nu}. \tag{148}$$

Note that the 4-force $F^{\mu}$ and the force field $F^{\mu\nu}$ are completely different objects: one is a vector and the other a rank two tensor. We have just used the same letter to denote them. Which one is intended should be clear from the number of indices used. Demanding that this force is pure regardless of the 4-velocity of the object results in the requirement

$$F^{\mu} V_{\mu} = qF_{\mu\nu} V^{\mu} V^{\nu} = 0 \tag{149}$$

for all V . This is satisfied if (and only if) the field strength tensor Fµν is ananti-symmetric tensor, i.e., if Fµν = −Fνµ. The simplest possible descriptionof a pure force field is therefore in terms of a rank two anti-symmetric tensor,which generally has six independent components. In matrix representation,we can write

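$$(F_{\mu\nu}) = \begin{pmatrix} 0 & F_{01} & F_{02} & F_{03} \\ -F_{01} & 0 & F_{12} & F_{13} \\ -F_{02} & -F_{12} & 0 & F_{23} \\ -F_{03} & -F_{13} & -F_{23} & 0 \end{pmatrix} = \begin{pmatrix} 0 & E_1 & E_2 & E_3 \\ -E_1 & 0 & -B_3 & B_2 \\ -E_2 & B_3 & 0 & -B_1 \\ -E_3 & -B_2 & B_1 & 0 \end{pmatrix}, \tag{150}$$

where we have introduced the components

$$E_i = F_{0i} \quad\text{and}\quad B_i = -\frac{1}{2}\varepsilon_{ijk}F_{jk}. \tag{151}$$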

Here, εijk is the standard permutation symbol in three spatial dimensions.
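As a numerical sanity check of the purity argument, the following minimal sketch builds the covariant field tensor of Eq. (150) from some field components and verifies that the resulting 4-force is orthogonal to the 4-velocity. The function names and all numerical values are illustrative assumptions, not part of the notes; units with $c = 1$ and metric signature $(+,-,-,-)$ are assumed.

```python
import math

def field_tensor(E, B):
    """Covariant components F_{mu nu}, arranged as in Eq. (150)."""
    (E1, E2, E3), (B1, B2, B3) = E, B
    return [[0.0, E1, E2, E3],
            [-E1, 0.0, -B3, B2],
            [-E2, B3, 0.0, -B1],
            [-E3, -B2, B1, 0.0]]

def four_velocity(u):
    """Contravariant 4-velocity V^mu = gamma * (1, u) for a 3-velocity u with |u| < 1."""
    gamma = 1.0 / math.sqrt(1.0 - sum(x * x for x in u))
    return [gamma] + [gamma * x for x in u]

def four_force_lower(q, F, V):
    """Covariant 4-force F_mu = q F_{mu nu} V^nu (Eq. (148) with the free index lowered)."""
    return [q * sum(F[mu][nu] * V[nu] for nu in range(4)) for mu in range(4)]

# Hypothetical field components and velocity, chosen only for illustration.
F = field_tensor(E=(0.3, -1.2, 0.5), B=(0.7, 0.1, -0.4))
V = four_velocity((0.2, -0.5, 0.3))
force = four_force_lower(q=1.5, F=F, V=V)

# Pure force: F_mu V^mu vanishes because F_{mu nu} is anti-symmetric.
purity = sum(force[mu] * V[mu] for mu in range(4))
print(abs(purity) < 1e-12)
```

The contraction vanishes for any choice of velocity, since only the anti-symmetry of the matrix enters.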

Exercise 10 Show that the second of these relations also implies that

Fij = −εijkBk. (152)

Exercise 11 Derive the transformation properties of $E_i$ and $B_i$ under standard Lorentz transformations.

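For future reference, we also introduce the dual field tensor

$$\tilde F^{\mu\nu} = \frac{1}{2}\varepsilon^{\mu\nu\sigma\rho}F_{\sigma\rho}, \tag{153}$$

where $\varepsilon^{\mu\nu\sigma\rho}$ is the 4-dimensional permutation symbol, which is totally anti-symmetric with $\varepsilon^{0123} = 1$.

Exercise 12 Show that

$$\tilde F^{0i} = -B_i \quad\text{and}\quad \tilde F^{ij} = \varepsilon_{ijk}E_k. \tag{154}$$

The results of the above exercise imply that the dual field tensor can be represented as

$$(\tilde F^{\mu\nu}) = \begin{pmatrix} 0 & -B_1 & -B_2 & -B_3 \\ B_1 & 0 & E_3 & -E_2 \\ B_2 & -E_3 & 0 & E_1 \\ B_3 & E_2 & -E_1 & 0 \end{pmatrix}. \tag{155}$$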

Using the field tensor $F$ and its dual $\tilde F$, we can construct two scalar invariants, namely

$$F_{\mu\nu}F^{\mu\nu} \quad\text{and}\quad F_{\mu\nu}\tilde F^{\mu\nu} \tag{156}$$

(the combination $\tilde F_{\mu\nu}\tilde F^{\mu\nu}$ is equal to $-F_{\mu\nu}F^{\mu\nu}$). The first of these can be written in terms of $\vec E$ and $\vec B$ by splitting the sums into the temporal and spatial components

$$F_{\mu\nu}F^{\mu\nu} = F_{00}F^{00} + F_{0i}F^{0i} + F_{i0}F^{i0} + F_{ij}F^{ij} = 2F_{0i}F^{0i} + F_{ij}F^{ij}. \tag{157}$$

We can now use that $F^{0i} = -F_{0i} = -E_i$ (lowering a spatial index gives a minus sign and lowering a temporal index gives a plus sign) while $F_{ij} = F^{ij}$ (lowering two spatial indices gives two minus signs that cancel). It follows that

$$F_{\mu\nu}F^{\mu\nu} = -2\vec E^2 + \varepsilon_{ijk}B_k\varepsilon_{ij\ell}B_\ell = -2\vec E^2 + 2\vec B^2. \tag{158}$$

Thus, regardless of which frame is used to describe the field tensor, $\vec E^2 - \vec B^2$ must take the same value (note that a Lorentz transformation generally mixes the $\vec E$ and $\vec B$ components). The second invariant $F_{\mu\nu}\tilde F^{\mu\nu}$ can be expressed as

$$F_{\mu\nu}\tilde F^{\mu\nu} = 2F_{0i}\tilde F^{0i} + F_{ij}\tilde F^{ij} = -2E_iB_i - \varepsilon_{ijk}B_k\varepsilon_{ij\ell}E_\ell = -4\vec E\cdot\vec B. \tag{159}$$

This is therefore also a frame independent quantity.
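The frame independence of $\vec E^2 - \vec B^2$ and $\vec E\cdot\vec B$ can be checked numerically. The sketch below transforms the covariant field tensor with a boost in standard configuration and compares the two combinations before and after. Function names and numbers are hypothetical; the boost matrix for covariant indices is the inverse of the one acting on contravariant components.

```python
import math

def field_tensor(E, B):
    """Covariant components F_{mu nu}, arranged as in Eq. (150)."""
    (E1, E2, E3), (B1, B2, B3) = E, B
    return [[0.0, E1, E2, E3],
            [-E1, 0.0, -B3, B2],
            [-E2, B3, 0.0, -B1],
            [-E3, -B2, B1, 0.0]]

def boost_x(F, v):
    """F'_{mu nu} in a frame moving with velocity v along x (c = 1).

    Covariant indices transform with the inverse boost, so F' = M F M^T
    with +gamma*v in the off-diagonal entries of M.
    """
    g = 1.0 / math.sqrt(1.0 - v * v)
    M = [[g, g * v, 0.0, 0.0],
         [g * v, g, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0],
         [0.0, 0.0, 0.0, 1.0]]
    return [[sum(M[m][a] * F[a][b] * M[n][b]
                 for a in range(4) for b in range(4))
             for n in range(4)] for m in range(4)]

def invariants(F):
    """Return (E^2 - B^2, E.B) with E_i = F_{0i} and B_i = -(1/2) eps_{ijk} F_{jk}."""
    E = (F[0][1], F[0][2], F[0][3])
    B = (-F[2][3], F[1][3], -F[1][2])
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    return dot(E, E) - dot(B, B), dot(E, B)

# Hypothetical field values and boost velocity.
F = field_tensor(E=(0.3, -1.2, 0.5), B=(0.7, 0.1, -0.4))
before = invariants(F)
after = invariants(boost_x(F, v=0.8))
print(before, after)  # the two pairs agree up to rounding
```

The individual components of $\vec E$ and $\vec B$ change under the boost; only the two combinations stay fixed.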

6.1 Field equations

In classical mechanics, the force fields generally satisfy some set of field equations. In the case of the gravitational and electrostatic fields, we have

$$\nabla\cdot\vec g = \partial_i g_i = \rho_m \quad\text{and}\quad \nabla\cdot\vec E = \partial_i E_i = \rho_q, \tag{160}$$

where $\rho_m$ and $\rho_q$ are proportional to the mass and charge densities, respectively. The fields also satisfy

∇× ~g = 0 and ∇× ~E = 0, (161)

implying that the fields can be written as the gradient of some potential

~g = −∇φg and ~E = −∇φE . (162)

The condition on the curl can also be written as

εijk∂jgk = 0 (163)

with a corresponding equation for $E_k$.

We can try to see if we can take a similar approach in the case of our pure force field in relativity. However, our field is a rank two tensor and its divergence is a 4-vector instead of a scalar field. We will call this 4-vector $J^\nu$ and the field equation then becomes

$$\partial_\mu F^{\mu\nu} = J^\nu. \tag{164}$$

In addition, the condition corresponding to the vanishing curl condition on the gravitational field is given by

$$\varepsilon^{\mu\nu\sigma\rho}\partial_\nu F_{\sigma\rho} = 2\partial_\nu\tilde F^{\mu\nu} = 0, \tag{165}$$

where $\tilde F^{\mu\nu}$ is the dual field tensor of Eq. (153).

Exercise 13 Show that this equation is equivalent to

∂µFνσ + ∂νFσµ + ∂σFµν = 0 (166)

for any anti-symmetric tensor Fµν .

Let us see what these field equations look like in terms of the components $\vec E$ and $\vec B$ of the field tensor. In particular, we can split the sum in the divergence of $F^{\mu\nu}$ into the temporal and spatial parts, giving

$$\partial_\mu F^{\mu\nu} = \partial_0 F^{0\nu} + \partial_i F^{i\nu} = J^\nu. \tag{167}$$

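Considering the case when $\nu = 0$, we find that

$$J^0 = \partial_0 F^{00} + \partial_i F^{i0} = \partial_i F_{0i} = \partial_i E_i = \nabla\cdot\vec E. \tag{168}$$

This looks suspiciously like the field equation for the electrostatic field if we let $J^0$ be the charge density $\rho$ (in units where $\varepsilon_0 = 1$). Multiplying the component $J^i$ by the spatial basis vector $\vec e_i$, we also find that

$$\vec J = \vec e_i\left[\partial_0 F^{0i} + \partial_j F^{ji}\right] = \vec e_i\left[-\frac{\partial E_i}{\partial t} - \varepsilon_{jik}\partial_j B_k\right] = -\frac{\partial\vec E}{\partial t} + \nabla\times\vec B. \tag{169}$$

Doing exactly the same computation but exchanging $J \to 0$ and $F^{\mu\nu} \to \tilde F^{\mu\nu}$ (the dual field tensor) results in similar equations with the roles of $\vec E$ and $\vec B$ exchanged (up to some minus signs and without the $J$). This can be summarised as

$$\nabla\cdot\vec E = \rho, \tag{170}$$

$$\nabla\cdot\vec B = 0, \tag{171}$$

$$\nabla\times\vec B - \frac{\partial\vec E}{\partial t} = \vec J, \tag{172}$$

$$\nabla\times\vec E + \frac{\partial\vec B}{\partial t} = 0. \tag{173}$$

These equations should be familiar as they are Maxwell's equations for the electromagnetic field in units where $c = \varepsilon_0 = \mu_0 = 1$, where $\vec J$ is the current density.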

6.2 The 4-potential

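As mentioned, the condition $\partial_\mu\tilde F^{\mu\nu} = 0$ on the dual field tensor is similar to the condition that the curl of the gravitational field vanishes, which lets us define a potential for the gravitational field. Similarly, there is a generalisation of this statement to general anti-symmetric tensors (known as Poincaré's lemma; the proof of this lemma is rather involved and we will leave it out) that tells us that $\partial_\mu\tilde F^{\mu\nu} = 0$ implies that there exists a 4-vector potential $A^\mu$ such that

$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu. \tag{174}$$

Writing out the components of this relation, and using that the covariant spatial components satisfy $A_i = -A^i$, we find that

$$\vec E = \vec e_i F_{0i} = \vec e_i(\partial_0 A_i - \partial_i A_0) = -\frac{\partial\vec A}{\partial t} - \nabla A_0, \tag{175}$$

$$\vec B = -\frac{1}{2}\vec e_i\varepsilon_{ijk}(\partial_j A_k - \partial_k A_j) = \vec e_i\varepsilon_{ijk}\partial_j A^k = \nabla\times\vec A. \tag{176}$$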

If we let $A^0 = A_0 = \phi$, these are exactly the expressions for the electromagnetic field in terms of the scalar potential $\phi$ and vector potential $\vec A$. The 4-potential $A^\mu$ is therefore given by

$$A = (\phi, \vec A), \tag{177}$$

where $\phi$ is the scalar potential and $\vec A$ the vector potential from classical electrodynamics.

Just like the gravitational potential for a given gravitational field is not unique, the 4-potential corresponding to a particular electromagnetic field is not unique. Indeed, we can shift the 4-potential by the 4-gradient of an arbitrary scalar field $\psi$

Aµ → A′µ = Aµ + ∂µψ. (178)

This is known as a gauge transformation of the 4-potential.

Exercise 14 Show that the gauge transformation above does not change the field strength tensor $F_{\mu\nu}$.

6.3 The 4-current density

The above discussion essentially shows that Maxwell's equations are invariant under Lorentz transformations: as they can be expressed completely in terms of the field tensor and the source 4-vector $J^\nu$, they will have the same form in all inertial frames. However, this also requires $J^\nu$ to transform as a 4-vector. As we saw, $J$ is given by

J = (ρ, ~J), (179)

where ρ is the charge density and ~J is the current density.

Example 8 Consider the 4-current density for a stationary homogeneous charge distribution of charge density $\rho_0$. In its rest frame, the 4-current density is given by

$$J = (\rho_0, 0) = \rho_0(1, 0) = \rho_0 V, \tag{180}$$

where $V$ is the 4-velocity of the distribution. Since $V$ is a 4-vector, $\rho_0$ must be a scalar and the 4-current takes the form $\rho_0 V$ in all inertial frames. In particular, this implies that

$$J = \rho_0\gamma(1, \vec u) \tag{181}$$

in a frame where the charge distribution moves with velocity $\vec u$. Thus, in such a frame, there is a current density $\vec J = \rho_0\gamma\vec u$ and the charge density is not $\rho_0$ but $\rho_0\gamma$.

Example 9 Consider a long straight neutral conductor of cross-sectional area $A$ carrying a current $I$ in the $x$ direction. For this conductor, the 4-current density is equal to

$$J = (0, I/A), \tag{182}$$

where we have suppressed the $y$ and $z$ components, which are zero. Applying a standard configuration Lorentz transformation to this 4-current results in

$$J' = \frac{I}{A}\gamma(-v, 1). \tag{183}$$

Thus, in a frame moving with velocity $v$ in the current direction, the conductor is charged with charge density $-v\gamma I/A$.
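The transformation in the conductor example can be reproduced with a few lines of code. This is a minimal sketch with a hypothetical current density $I/A = 2$ and frame velocity $v = 0.6$ (units with $c = 1$); the helper name `boost_x` is an assumption made for illustration.

```python
import math

def boost_x(X, v):
    """Standard-configuration Lorentz transformation of contravariant components (c = 1)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    t, x, y, z = X
    return (g * (t - v * x), g * (x - v * t), y, z)

def minkowski_square(X):
    """X . X with signature (+, -, -, -)."""
    return X[0] ** 2 - X[1] ** 2 - X[2] ** 2 - X[3] ** 2

current_density = 2.0                    # I/A, hypothetical value
v = 0.6                                  # hypothetical frame velocity
J = (0.0, current_density, 0.0, 0.0)     # neutral conductor, Eq. (182)
J_prime = boost_x(J, v)

gamma = 1.0 / math.sqrt(1.0 - v * v)
print(J_prime)                           # approximately (-1.5, 2.5, 0.0, 0.0)
print(-v * gamma * current_density)      # charge density predicted by Eq. (183)
```

The Minkowski square of the 4-current is the same in both frames, as expected for a 4-vector.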

From Maxwell's equations, we know that the divergence of the 4-current is given by

$$\partial_\mu J^\mu = \partial_\mu\partial_\nu F^{\nu\mu} = 0 \tag{184}$$

as $F^{\mu\nu}$ is anti-symmetric. Expanding in terms of the temporal and spatial components of the sum, we find that this is equivalent to

$$\partial_\mu J^\mu = \partial_0\rho + \partial_i J^i = \frac{\partial\rho}{\partial t} + \nabla\cdot\vec J = 0. \tag{185}$$

This is nothing but the source free continuity equation for the charge density $\rho$, which tells us that charge is conserved.

6.4 The Lorentz force

We started this discussion by requiring that the 4-force could be written as $F^\mu = qF^{\mu\nu}V_\nu$, but focused on the field strength tensor itself and never took the time to discuss how this force is expressed in terms of the components $\vec E$ and $\vec B$. Expanding the sum in the force in terms of its temporal and spatial contributions, we find that

$$F^\mu = qF^{\mu 0}V_0 + qF^{\mu i}V_i = qF^{\mu 0}V^0 - qF^{\mu i}V^i. \tag{186}$$

For $\mu = 0$, we therefore arrive at

$$F^0 = \frac{dE}{d\tau} = -qF^{0i}V^i = qF_{0i}\gamma u^i = q\gamma\vec E\cdot\vec u, \tag{187}$$

whereas the spatial part of the equation becomes

$$\vec F = \frac{d\vec p}{d\tau} = q\gamma\vec e_j(F^{j0} - F^{ji}u^i) = q\gamma(\vec E + \varepsilon_{jik}u^iB_k\vec e_j) = q\gamma(\vec E + \vec u\times\vec B). \tag{188}$$

Dividing both of these relations by $\gamma = dt/d\tau$, we conclude that

$$\frac{dE}{dt} = q\vec E\cdot\vec u, \tag{189}$$

$$\frac{d\vec p}{dt} = q(\vec E + \vec u\times\vec B). \tag{190}$$

The first of these equations tells us that the change in the energy of the object only depends on the motion of the object in the direction of the electric field $\vec E$, whereas the second is the Lorentz force law that should be familiar from electrodynamics. Note that the magnetic field cannot change the energy of the object as the magnetic force is always perpendicular to the motion of the object and therefore does not appear in the expression for the change in the energy.
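The component expressions can be cross-checked against the tensor contraction $F^\mu = qF^{\mu\nu}V_\nu$ directly. The sketch below raises the indices of the field tensor of Eq. (150), contracts with the covariant 4-velocity, and compares with $q\gamma(\vec E\cdot\vec u, \vec E + \vec u\times\vec B)$. All names and numerical values are hypothetical; $c = 1$ and signature $(+,-,-,-)$ are assumed.

```python
import math

ETA = (1.0, -1.0, -1.0, -1.0)  # diagonal of the metric, signature (+, -, -, -)

def cross(a, b):
    """Ordinary 3-dimensional cross product."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def four_force(q, E, B, u):
    """Return (F^mu, gamma) with F^mu = q F^{mu nu} V_nu for a particle of velocity u."""
    (E1, E2, E3), (B1, B2, B3) = E, B
    F_low = [[0.0, E1, E2, E3],
             [-E1, 0.0, -B3, B2],
             [-E2, B3, 0.0, -B1],
             [-E3, -B2, B1, 0.0]]
    # Raise both indices with the diagonal metric.
    F_up = [[ETA[m] * ETA[n] * F_low[m][n] for n in range(4)] for m in range(4)]
    gamma = 1.0 / math.sqrt(1.0 - sum(x * x for x in u))
    V_low = [gamma] + [-gamma * x for x in u]  # V_mu = eta_{mu nu} V^nu
    force = [q * sum(F_up[m][n] * V_low[n] for n in range(4)) for m in range(4)]
    return force, gamma

# Hypothetical charge, fields and velocity.
q, E, B, u = 2.0, (0.3, -1.2, 0.5), (0.7, 0.1, -0.4), (0.2, -0.5, 0.3)
force, gamma = four_force(q, E, B, u)

# Right-hand sides of Eqs. (187) and (188).
uxB = cross(u, B)
expected = [q * gamma * sum(e * x for e, x in zip(E, u))] + \
           [q * gamma * (E[i] + uxB[i]) for i in range(3)]
print(all(abs(a - b) < 1e-12 for a, b in zip(force, expected)))
```

Setting `E = (0, 0, 0)` in this sketch gives a vanishing time component of the 4-force, in line with the remark that the magnetic field does no work.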

7 Surfaces and waves in Minkowski space

Let us first consider a number of properties of hypersurfaces of dimension $n - 1$ in $\mathbb{R}^n$. Such surfaces can generally be described as level surfaces of some function $f(\vec x)$, i.e., they are the set of points such that $f(\vec x) = f_0$ for some constant $f_0$. Assuming that $\vec x$ is in the surface, we can look at small displacements $d\vec x$ from $\vec x$ and expand the function as

f(~x+ d~x) = f(~x) + dxi∂if(~x) = f0 + d~x · ∇f (191)

to leading order in d~x. If d~x is a displacement within the surface, then

f(~x+ d~x) = f0 =⇒ d~x · ∇f = 0. (192)

It follows that $\nabla f$ is orthogonal to all displacements in the surface and we define it to be the normal vector $\vec n$ (note that it is not necessarily normalised). We also define a vector $\vec t$ to be tangent to the hypersurface if it is parallel to some infinitesimal displacement within the hypersurface, i.e., if it is orthogonal to $\vec n$, so that

$$\vec t\cdot\vec n = 0. \tag{193}$$

In $\mathbb{R}^n$, the set of tangent vectors and the normal vector at a point of a hypersurface together form a complete vector basis.

If we now look at the case of Minkowski space, we can still define hypersurfaces as the level surfaces of a function $f(x)$. In the same way as for $\mathbb{R}^n$, we can expand $f(x + dx)$ according to

$$f(x + dx) = f_0 + dx^\mu\partial_\mu f \tag{194}$$

and if the displacement $dx^\mu$ is within the hypersurface it follows that

$$dx^\mu\partial_\mu f = 0 \tag{195}$$

and therefore the gradient $N_\mu = \partial_\mu f$ is orthogonal to all displacements within the hypersurface. We here note that $N_\mu$ are the covariant components of the 4-vector $N$. Also in analogy to the case of $\mathbb{R}^n$, we define a tangent vector $T^\mu$ as any vector proportional to a displacement within the hypersurface and it follows that

$$N_\mu T^\mu = 0. \tag{196}$$

So far, all of the properties have been completely analogous to those in the case of $\mathbb{R}^n$. However, there is a subtle, but important, difference. In the case of $\mathbb{R}^n$, it always holds that $\vec n^2 > 0$ (as long as $\vec n \neq 0$), but in the case of Minkowski space

$$N^2 = \eta^{\mu\nu}N_\mu N_\nu \tag{197}$$

can be equal to zero even if $N$ is not, namely in the case when $N$ is a light-like 4-vector. In fact, if the normal vector is light-like, then it is also a tangent vector. In this situation, the collection of tangent vectors and the normal vector does not form a complete vector basis. Since there then exists a light-like displacement within the surface, such a surface is called a light-like surface.

7.1 Space-like surfaces and simultaneity

For a hypersurface with $N^2 > 0$, i.e., one that has a time-like normal, all of the tangent vectors are necessarily space-like, as a time-like vector cannot be orthogonal to other time-like or light-like vectors. It follows that for such a surface all of the displacements within the surface are space-like and the surface is therefore referred to as a space-like surface. In fact, we have already encountered a family of such surfaces:

Example 10 Consider the case where $f = t$ in some system $S$. The level surfaces of this function consist of the events for which $t = t_0$ for some constant time $t_0$, i.e., they are the surfaces of simultaneity in $S$. The normal vector of this surface is given by

$$N_\mu = \partial_\mu t = (1, 0, 0, 0) \implies N^\mu = (1, 0, 0, 0) = V^\mu, \tag{198}$$

where $V$ is the 4-velocity of an observer at rest in $S$. The tangent vectors are any vectors for which

$$T^\mu N_\mu = T^0 = 0, \tag{199}$$

i.e., any 4-vector that is purely spatial in $S$. Of course, in a general frame the same tangent vector will have a non-zero time component as $V$ will now have non-zero spatial components.

Note that different frames will have different simultaneity conventions. In fact, the concept of simultaneity can be generalised by considering an arbitrary function $f$ for which all level surfaces are space-like and defining them as simultaneities. This only goes to underline the fact that simultaneity is not something uniquely defined in relativity, but that there are several different possibilities.

7.2 Waves and phase functions

We will consider wave motion in Minkowski space by studying a scalar field $F$ that satisfies the relation

$$F(x) = f(\phi(x)), \tag{200}$$

where $\phi(x)$ is the phase of the wave, which is also a scalar field, and $f$ is a one parameter function describing the shape of the wave.

Example 11 For a plane wave, we would have the phase function

$$\phi(x) = \nu t - \vec k\cdot\vec x, \tag{201}$$

where $\nu$ is the wave frequency (as we shall see) and $\vec k$ is the wave vector. We would also have

$$f(s) = e^{2\pi i s}. \tag{202}$$

For the remainder of this discussion, it may be useful to keep this example in mind.

Let us now try to interpret the normal vector $N_\mu = \partial_\mu\phi$ of the phase function $\phi$ for the case when the wave shape $f$ is periodic with period one, i.e., $f(s + 1) = f(s)$. For reasons that will become clear, we will parametrise $N$ as

$$N = \nu(1, k\vec n), \tag{203}$$

where $\vec n$ is a unit vector, and call it the 4-frequency. We start by considering how an observer at rest experiences the wave. In the observer's rest frame, we have $d\vec x = 0$ and therefore

$$d\phi = N_0\,dt = \nu\,dt \implies \frac{d\phi}{dt} = \nu. \tag{204}$$

Figure 8: A snapshot in the $x$-$y$ plane of the position of a number of wave fronts (blue) for a fixed time $t$. The vectors $\vec n$ (red), being the spatial part of the gradient of the phase function, are orthogonal to the wave fronts.

Thus, the number $\nu$ tells us how fast the phase changes for this observer. Since the period of the wave shape is one, $\nu$ is therefore the frequency of the wave (note that the wave phase must change sufficiently fast for us to make this interpretation; it is strictly true only for plane waves).

A wave front at time $t$ is the set of points for which the phase function is the same at $t$, i.e., it is the intersection between a level surface of the phase function and the simultaneity in the frame being considered. Displacements within the wave front therefore satisfy $dt = 0$ as well as

$$dx^\mu N_\mu = -\nu\,d\vec x\cdot k\vec n = 0 \implies d\vec x\cdot\vec n = 0. \tag{205}$$

This condition tells us that the vector $\vec n$ is orthogonal to the wave front, see Fig. 8. It remains only to give an interpretation of the number $k$. Considering the motion of the wave front between times $t$ and $t + dt$, requiring that the wave front keeps the same phase results in

$$\nu(dt - k\vec n\cdot d\vec x) = 0 \implies k = \frac{1}{w}, \tag{206}$$

where

$$w = \vec n\cdot\frac{d\vec x}{dt} \tag{207}$$

is the phase velocity of the wave, i.e., how fast the wave front moves.

In the remainder of these notes, we are going to consider only light-like waves, i.e., waves that move at the speed of light, such as electromagnetic waves in vacuum. For such waves, we find that $w = 1$ and consequently

N = ν(1, ~n). (208)

7.3 The relativistic Doppler effect

For a general light-like wave, we have written down the 4-frequency as $N = \nu(1, \vec n)$ and interpreted $\nu$ as the frequency of the wave in the given frame. Since $\nu$ is equal to the time-component of $N$, it will in general not be the same in different frames and therefore the frequency of a wave is not a Lorentz invariant. However, $N$ itself transforms as a 4-vector and we can find the components of $N$ in any other frame by applying the appropriate Lorentz transformation

$$N^{\mu'} = \Lambda^{\mu'}{}_{\mu}N^\mu. \tag{209}$$

Example 12 Consider a wave moving in the positive $x$ direction. For such a wave, $N = \nu(1, 1)$, where we have suppressed the $y$ and $z$ components, which are zero. Applying the Lorentz transformation in standard configuration results in

$$N^{\mu'} = \nu\frac{1 - v}{\sqrt{1 - v^2}}(1, 1)^{\mu'} = \nu\sqrt{\frac{1 - v}{1 + v}}(1, 1)^{\mu'} = \nu'(1, 1)^{\mu'}, \tag{210}$$

where $\nu'$ is the frequency of the wave in $S'$. It follows that

$$\nu' = \nu\sqrt{\frac{1 - v}{1 + v}}. \tag{211}$$

This is the regular expression for the relativistic Doppler shift for the frequency of a wave when moving in the same direction as the wave.
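The Doppler example can be replayed numerically by boosting the 4-frequency and comparing its time component with the closed form of Eq. (211). This is a small sketch with a hypothetical frequency and frame velocity; the helper `boost_x` is an illustrative name and units with $c = 1$ are assumed.

```python
import math

def boost_x(X, v):
    """Standard-configuration Lorentz transformation of (X^0, X^1, X^2, X^3), c = 1."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return (g * (X[0] - v * X[1]), g * (X[1] - v * X[0]), X[2], X[3])

nu, v = 5.0, 0.6            # hypothetical frequency and frame velocity
N = (nu, nu, 0.0, 0.0)      # 4-frequency nu*(1, 1, 0, 0) of a wave moving along +x
N_prime = boost_x(N, v)

nu_prime = N_prime[0]
closed_form = nu * math.sqrt((1.0 - v) / (1.0 + v))
print(nu_prime, closed_form)   # both are close to 2.5
```

The transformed 4-frequency is still of the form $\nu'(1, 1)$, i.e., it remains light-like and points in the same direction.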

Exercise 15 Compute the Doppler shift when going to a frame that is moving in the opposite direction of the wave using the same approach.

7.3.1 Doppler effect without Lorentz transformation

The above computation applied a Lorentz transformation to the 4-frequency in order to find the frequency of the wave in a different frame. However, there is an approach that avoids Lorentz transformations and instead uses the invariance of the Minkowski inner product.

In the rest frame of an observer, the observed frequency of the wave can be expressed as

$$\nu = \nu(1, \vec n)\cdot(1, 0) = N\cdot V, \tag{212}$$

where $V$ is the 4-velocity of the observer. The last form of this expression just references the 4-vectors $N$ and $V$, whose inner product will be the same regardless of the frame it is computed in. In particular, this means that we can find the frequency experienced by the observer in any frame by taking the inner product of $N$ and $V$ as expressed in that frame. Thus, if the 4-frequency in $S$ is $N = \nu(1, \vec n)$ and an observer is moving with velocity $\vec u$ in $S$, then the frequency experienced by that observer is

$$\nu' = N\cdot V = \nu(1, \vec n)\cdot\gamma(1, \vec u) = \nu\gamma(1 - \vec n\cdot\vec u). \tag{213}$$
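The inner-product form is convenient to evaluate for arbitrary directions. The sketch below computes $\nu' = N\cdot V$ for a wave travelling along $y$ and an observer moving along $x$, a transverse configuration chosen purely as an illustration. Function names and numbers are assumptions; $c = 1$.

```python
import math

def minkowski_dot(X, Y):
    """Inner product with signature (+, -, -, -)."""
    return X[0] * Y[0] - sum(x * y for x, y in zip(X[1:], Y[1:]))

def observed_frequency(nu, n, u):
    """nu' = N . V for a light-like wave N = nu*(1, n) and an observer with velocity u."""
    gamma = 1.0 / math.sqrt(1.0 - sum(x * x for x in u))
    N = (nu,) + tuple(nu * x for x in n)
    V = (gamma,) + tuple(gamma * x for x in u)
    return minkowski_dot(N, V)

nu = 5.0                      # hypothetical frequency in S
n = (0.0, 1.0, 0.0)           # wave direction (unit vector)
u = (0.6, 0.0, 0.0)           # observer velocity, orthogonal to n
nu_prime = observed_frequency(nu, n, u)
print(nu_prime)               # close to nu * gamma = 6.25
```

In this transverse case $\vec n\cdot\vec u = 0$ and the shift is purely the factor $\gamma$, an effect with no counterpart in the classical Doppler formula.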

Exercise 16 Show that this expression for $\nu'$ agrees with the expressions derived with the Lorentz transformation for the cases when the motion is in the direction of the wave or opposite to the direction of the wave.

Exercise 17 Determine the angle between $\vec n$ and $\vec u$ for which $\nu' = \nu$.

7.4 Aberration of light

Aberration is the change in the direction of motion when changing between different inertial frames. The most commonly experienced version of aberration is likely the fact that when driving a car in the rain, the rain seemingly hits the car at an angle even though it is falling straight down in the rest frame of the ground. This aberration results from the rain acquiring a horizontal velocity component when being transformed into the car's rest frame. If the rain is falling with speed $u$ and the car is moving at speed $v$ relative to the ground, the angle $\theta'$ the rain makes to the horizontal can be found by considering $u$ and $v$ as the sides of a right triangle, leading to

$$\tan(\theta') = \frac{u}{v}. \tag{214}$$

Note that $\theta' \to \pi/2$ as $v \to 0$, as expected.

If we exchange the rain for light in the above consideration, Newtonian velocity addition is no longer valid and the vertical component of the velocity of the light is no longer one in the car's rest frame. However, the horizontal component must still be equal to $v$ and the total speed of the light must be one, leading to $v$ being one of the sides of a right triangle with a hypotenuse of one. The angle to the horizontal is therefore given by

$$\cos(\theta') = v. \tag{215}$$

Figure 9: Aberration of rain in classical mechanics (upper panel) and of light in special relativity (lower panel). In both cases, the rain/light is moving vertically in $S$, but at an angle $\theta'$ in the rest frame $S'$ of an observer moving at a velocity $v$ relative to $S$.

Again, note that $\theta' \to \pi/2$ as $v \to 0$.

For the general case, when the angle of the light to the horizontal in the frame $S$ is $\theta$, we can find the direction in which the light is moving in $S'$ by Lorentz transforming the 4-frequency. In $S$, the 4-frequency is given by (suppressing the $z$ direction)

$$N^\mu = \nu(1, -\cos(\theta), -\sin(\theta))^\mu. \tag{216}$$

Applying the Lorentz transformation in standard configuration to this 4-frequency, we find that

$$N^{\mu'} = \nu\gamma(1 + v\cos(\theta), -\cos(\theta) - v, -\sin(\theta)/\gamma)^{\mu'} = \nu'(1, -\cos(\theta'), -\sin(\theta'))^{\mu'}. \tag{217}$$

From this we can conclude that

$$\sin(\theta') = -\frac{N^{2'}}{N^{0'}} = \frac{\sin(\theta)}{\gamma(1 + v\cos(\theta))}, \tag{218}$$

$$\cos(\theta') = -\frac{N^{1'}}{N^{0'}} = \frac{\cos(\theta) + v}{1 + v\cos(\theta)}, \tag{219}$$

$$\tan\left(\frac{\theta'}{2}\right) = \frac{\sin(\theta')}{1 + \cos(\theta')} = \sqrt{\frac{1 - v}{1 + v}}\tan\left(\frac{\theta}{2}\right). \tag{220}$$

Exercise 18 Fill in the computations in the last relation for $\tan(\theta'/2)$.

Note that the factor in front of $\tan(\theta/2)$ is always smaller than one as long as $v > 0$. Consequently, the angle $\theta'$ is smaller than $\theta$ and the light will tend to come from the direction in which the observer is moving, just as the rain tends to hit the car from the front.
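The aberration relations can be compared numerically. The sketch below obtains $\theta'$ once by Lorentz transforming the 4-frequency and once from the half-angle relation of Eq. (220). Function names and the sample values are hypothetical, the overall factor $\nu$ is set to one since it cancels in the ratios, and angles are restricted to $0 < \theta < \pi$.

```python
import math

def aberrated_angle(theta, v):
    """theta' from the boosted 4-frequency, using Eqs. (216)-(219) with nu = 1."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    N0, N1, N2 = 1.0, -math.cos(theta), -math.sin(theta)
    N0p = g * (N0 - v * N1)      # time component in S'
    N1p = g * (N1 - v * N0)      # x component in S'
    N2p = N2                     # y component is unchanged by a boost along x
    return math.atan2(-N2p / N0p, -N1p / N0p)

def half_angle_formula(theta, v):
    """theta' from Eq. (220)."""
    return 2.0 * math.atan(math.sqrt((1.0 - v) / (1.0 + v)) * math.tan(theta / 2.0))

v = 0.5                                   # hypothetical observer velocity
for theta in (0.3, 1.0, math.pi / 2, 2.5):
    print(theta, aberrated_angle(theta, v), half_angle_formula(theta, v))
```

For $\theta = \pi/2$ both routes give $\theta' = \pi/3$, which reproduces $\cos(\theta') = v$ from Eq. (215) for $v = 0.5$.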

8 Tetrads

In the previous section, we computed the Doppler shift of the frequency of a wave by taking the inner product between the 4-frequency and the 4-velocity of the observer, as this always evaluates to the frequency in the observer's rest frame. When we moved on to computing the aberration of light, we were instead interested in the spatial components of the 4-frequency in an observer's rest frame in order to find the direction in which the light was moving. Can we do this without having to Lorentz transform the 4-frequency? (Of course we can! I would not ask that rhetorical question otherwise ... Let us find out how it is done.)

Consider an observer at rest in the frame $S$. Given a 4-vector $X$, the spatial component $X^i$ can be found by taking the inner product with the 4-vector with components $T_i^\mu = (0, \vec e_i)^\mu$ in $S$. Note that the $i$ in $T_i$ is not a Lorentz index, it is just a counter to keep track of which of the three $T_i$ we are talking about. We find that

$$T_i\cdot X = (0, \vec e_i)\cdot(X^0, \vec X) = -\vec e_i\cdot\vec X = -X^i. \tag{221}$$

Just as we could express the 4-velocity of a moving observer in a frame where the observer is moving with a velocity $\vec u$, we can express the $T_i$ in such a frame as well. In order to do so, we must apply a Lorentz transformation with velocity $\vec v = -\vec u$ and if we let this Lorentz transformation be in standard configuration, we find that in a frame where the observer is moving with velocity $\vec u = u\vec e_1$, the $T_i$ are given by

$$T_1 = \gamma(u, 1, 0, 0), \quad T_2 = (0, 0, 1, 0), \quad T_3 = (0, 0, 0, 1). \tag{222}$$

Letting $T_0 = V$ be the 4-velocity of the observer, we can easily verify that

$$T_\mu\cdot T_\nu = \eta_{\mu\nu}. \tag{223}$$

The vectors $T_\mu$ therefore form an orthonormal basis for 4-vectors known as a tetrad. Any 4-vector can be written as a linear combination of this basis

$$X = X^\mu T_\mu, \tag{224}$$

where $X^\mu$ are the contravariant components of $X$ in the inertial frame on which the tetrad is based. In particular, we note that

$$X\cdot T_\mu = X^\nu T_\nu\cdot T_\mu = \eta_{\mu\nu}X^\nu, \tag{225}$$

which tells us that

$$X\cdot T_0 = X\cdot V = \eta_{00}X^0 = X^0, \quad X\cdot T_i = \eta_{ij}X^j = -\delta_{ij}X^j = -X^i. \tag{226}$$

Using the tetrad instead of the Lorentz transformation often simplifies finding individual components of a 4-vector.

Example 13 Consider the aberration of light again. Computing the component $N^{1'}$ for an observer moving in the $x$ direction with velocity $v$ can now be done as

$$N^{1'} = -\nu(1, -\cos(\theta), -\sin(\theta))\cdot\gamma(v, 1, 0) = -\nu\gamma(v + \cos(\theta)), \tag{227}$$

which is the same result as that found by Lorentz transformation.
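The tetrad approach lends itself to a direct numerical comparison with the Lorentz transformation. The sketch below builds the tetrad of Eq. (222) for an observer moving along $x$, checks its orthonormality, and extracts the rest-frame components of a generic 4-vector via Eq. (226). All names and values are illustrative assumptions; $c = 1$.

```python
import math

ETA = (1.0, -1.0, -1.0, -1.0)  # diagonal of the metric, signature (+, -, -, -)

def mdot(X, Y):
    """Minkowski inner product."""
    return sum(ETA[m] * X[m] * Y[m] for m in range(4))

def tetrad_x(u):
    """Tetrad (T_0, T_1, T_2, T_3) of an observer moving with velocity u along x, Eq. (222)."""
    g = 1.0 / math.sqrt(1.0 - u * u)
    return [(g, g * u, 0.0, 0.0), (g * u, g, 0.0, 0.0),
            (0.0, 0.0, 1.0, 0.0), (0.0, 0.0, 0.0, 1.0)]

def rest_frame_components(X, u):
    """Components of X in the observer's rest frame: X.T_0 and -X.T_i, Eq. (226)."""
    T = tetrad_x(u)
    return [mdot(X, T[0])] + [-mdot(X, T[i]) for i in (1, 2, 3)]

def boost_x(X, v):
    """Standard-configuration Lorentz transformation, used here only for comparison."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [g * (X[0] - v * X[1]), g * (X[1] - v * X[0]), X[2], X[3]]

X = (3.0, 1.0, -2.0, 0.5)     # hypothetical 4-vector
u = 0.4                       # hypothetical observer velocity
via_tetrad = rest_frame_components(X, u)
via_boost = boost_x(X, u)
print(via_tetrad, via_boost)  # the two lists agree up to rounding
```

The advantage is that a single component can be obtained from one inner product without writing down the full transformation.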

Exercise 19 Consider the 4-momentum given by $p = (E, \vec p)$ in $S$ and determine its components in the rest frame of an observer moving with velocity $\vec v = v\vec e_1$ in $S$ using the tetrad approach.

9 Particle collisions and decays

An important aspect of special relativity is its application to solving the kinematics involved in particle collisions and decays. The basic underlying principle that we will use in this section is the conservation of 4-momentum (which essentially means conservation of energy and 3-momentum), i.e., if we have $N$ incoming particles with 4-momenta $p_a$ and $M$ outgoing particles with 4-momenta $k_b$, where $a$ and $b$ label the particles, then

$$P = \sum_{a=1}^{N}p_a = \sum_{b=1}^{M}k_b = K, \tag{228}$$

where $P$ is the total incoming 4-momentum and $K$ is the total outgoing 4-momentum. As both sides of this equation are 4-vectors, this relation must hold in all frames and we can do any algebraic modifications as long as we do the same modifications on both sides.

Example 14 Consider the decay of a particle $A$ with 4-momentum $p_A$ to two particles $B$ and $C$ with 4-momenta $k_B$ and $k_C$, respectively. The conservation of 4-momentum tells us that

$$p_A = k_B + k_C. \tag{229}$$

We can make algebraic manipulations of this relation such as subtracting $k_B$ from both sides, leading to

$$p_A - k_B = k_C, \tag{230}$$

or squaring it, leading to

$$p_A^2 = m_A^2 = (k_B + k_C)^2 = k_B^2 + k_C^2 + 2k_B\cdot k_C. \tag{231}$$

A relation that we will use repeatedly in this section is the fact that the 4-momentum squared of a particle is equal to its mass squared. This is an invariant and is true regardless of which frame we use. Furthermore, all inner products of the form $p\cdot k$ are also Lorentz scalars and can be computed in any frame. In particular, in the example above, we would find that

$$m_A^2 = m_B^2 + m_C^2 + 2k_B\cdot k_C, \tag{232}$$

where the inner product $k_B\cdot k_C$ can be computed in any frame of interest. We can also use algebraic manipulation and the fact that inner products are Lorentz invariants to get rid of superfluous information that we are a priori not interested in computing.

Example 15 In the case of the decay in the previous example, let us imagine that we want to know the energy of the particle $B$ in the rest frame of the original particle $A$. In order to get rid of the kinematic information on particle $C$, we square Eq. (230) and find that

$$(p_A - k_B)^2 = m_A^2 + m_B^2 - 2p_A\cdot k_B = k_C^2 = m_C^2. \tag{233}$$

In the rest frame of $A$, we have $p_A = m_A(1, 0)$ and $k_B = (E_B, \vec p_B)$, where $E_B$ is the sought energy of $B$ in the rest frame of $A$. Thus, we find that

$$p_A\cdot k_B = m_AE_B \tag{234}$$

and therefore

$$m_A^2 + m_B^2 - 2m_AE_B = m_C^2 \implies E_B = \frac{m_A^2 + m_B^2 - m_C^2}{2m_A}. \tag{235}$$
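As an illustration of Eq. (235), the sketch below evaluates the daughter energies for the decay of a charged pion into a muon and a neutrino, using approximate masses in MeV and treating the neutrino as massless. The choice of process, the rounded mass values and the function name are assumptions made for this example.

```python
import math

def daughter_energy(m_A, m_B, m_C):
    """Energy of B in the rest frame of A for the decay A -> B + C, Eq. (235)."""
    return (m_A ** 2 + m_B ** 2 - m_C ** 2) / (2.0 * m_A)

# Approximate masses in MeV; the neutrino mass is neglected (assumption).
m_pi, m_mu, m_nu = 139.57, 105.66, 0.0

E_mu = daughter_energy(m_pi, m_mu, m_nu)
E_nu = daughter_energy(m_pi, m_nu, m_mu)

# Consistency checks in the rest frame of the pion.
p_mu = math.sqrt(E_mu ** 2 - m_mu ** 2)   # magnitude of the muon 3-momentum
p_nu = E_nu                               # massless particle: |p| = E
print(E_mu, E_nu)                         # roughly 109.8 and 29.8
print(E_mu + E_nu, p_mu - p_nu)           # energies add up to m_pi, momenta balance
```

The two energies sum to the parent mass and the two momenta have equal magnitude, as required by 4-momentum conservation in the rest frame of $A$.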

9.1 Massless particles

It is worth taking a few moments to consider what the 4-momentum of a massless particle is. In general, for a massive particle moving at speed $v < 1$, we concluded that

$$P = mV, \tag{236}$$

where $V$ is the 4-velocity of the particle, i.e., the tangent of the particle's world line normalised such that $V^2 = 1$. For a particle moving at light speed, it is impossible to normalise the tangent vector of its world line in this manner. However, we still have the possibility of letting

$$P = \omega\frac{dX}{ds}, \tag{237}$$

where $s$ is an affine line parameter, i.e., we assume that the 4-momentum is proportional to the tangent of the particle's world line with some proportionality constant $\omega$. This leads to

$$P^2 = \omega^2\frac{dX}{ds}\cdot\frac{dX}{ds} = 0, \tag{238}$$

since the world line has a light-like tangent vector. Writing the 4-momentum as $P = (E, \vec p)$, this implies that

$$P^2 = E^2 - \vec p^2 = m^2 = 0 \tag{239}$$

and we therefore call such a particle massless. Note that this mass cannot be related to the inertia of the particle in its rest frame because the particle will move at light speed in all frames, i.e., it has no rest frame. However, it appears in the relativistic dispersion relation just as a non-zero mass would.

9.2 Elastic scattering

An important special case of particle scattering is elastic scattering, where the incoming particles are the same as the outgoing particles. In particular, we will here treat the elastic scattering of two particles, $A$ and $B$. Let us further assume that we are utterly uninterested in (or unable to measure) the kinematic properties of $B$ after the scattering, but we still want to make some statements about the behaviour of $A$. The conservation of 4-momentum now gives us

$$p_A + p_B = k_A + k_B \implies p_A + p_B - k_A = k_B \tag{240}$$

and squaring this results in

$$m_B^2 = k_B^2 = (p_A + p_B - k_A)^2 = 2m_A^2 + m_B^2 + 2p_A\cdot p_B - 2k_A\cdot(p_A + p_B). \tag{241}$$

Note how the 4-momentum $k_B$ of $B$ after the scattering has now been eliminated from the equation. A bit of algebra now results in

$$m_A^2 + p_A\cdot p_B = k_A\cdot(p_A + p_B). \tag{242}$$

Example 16 Let us consider a special case of the elastic scattering, where the incoming particles are a photon of energy $E_0$ and a stationary electron. Wishing to know the energy of the scattered photon, we can write down all of the involved 4-momenta as

$$p_\gamma = E_0(1, 1, 0), \quad p_e = (m_e, 0), \quad k_\gamma = E(1, \cos(\theta), \sin(\theta)), \tag{243}$$

where $\theta$ is the scattering angle relative to the incoming direction of motion for the photon and $E$ is the sought energy, see Fig. 10. The above consideration with $A = \gamma$ and $B = e$ results in $m_A = 0$ and therefore

$$m_eE_0 = E(1, \cos(\theta), \sin(\theta))\cdot(m_e + E_0, E_0, 0) = E[m_e + E_0(1 - \cos(\theta))]. \tag{244}$$

Solving for $E$ gives us

$$E = \frac{E_0}{1 + \frac{E_0}{m_e}(1 - \cos(\theta))}. \tag{245}$$

This is the famous Compton scattering formula that relates the scattered photon energy $E$ to the scattering angle $\theta$ and that has been well tested by experiment.
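A quick way to test Eq. (245) is to reconstruct the recoil electron from 4-momentum conservation and check that it ends up on its mass shell. The sketch below does this for a hypothetical photon energy of 1 MeV and a scattering angle of 60 degrees, with the electron mass taken as approximately 0.511 MeV; function names are illustrative.

```python
import math

def compton_energy(E0, theta, m_e=0.511):
    """Scattered photon energy from Eq. (245); energies in MeV, c = 1."""
    return E0 / (1.0 + (E0 / m_e) * (1.0 - math.cos(theta)))

def mass_squared(p):
    """Minkowski square of a 4-momentum given as (E, px, py)."""
    return p[0] ** 2 - p[1] ** 2 - p[2] ** 2

E0, theta, m_e = 1.0, math.radians(60.0), 0.511   # hypothetical kinematics
E = compton_energy(E0, theta, m_e)

# Recoil electron 4-momentum k_e = p_gamma + p_e - k_gamma, using Eq. (243).
k_e = (E0 + m_e - E,
       E0 - E * math.cos(theta),
       -E * math.sin(theta))

print(E)
print(abs(mass_squared(k_e) - m_e ** 2) < 1e-12)
```

For $\theta = 0$ the function returns $E = E_0$ (no scattering), and for backscattering at $\theta = \pi$ the energy loss is largest.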

Figure 10: The elastic scattering of a photon ($\gamma$) with energy $E_0$ off of a stationary electron ($e^-$). We seek the dependence of the final photon energy $E$ on the angle of scattering $\theta$.

9.3 Masses of quickly decaying particles

To some extent we have already discussed particle decays in some examples. However, let us consider the situation where a particle $A$ is very short lived and decays so fast that it is impossible to directly measure it in an experiment. By looking at the decay products, we can infer the mass of $A$ in the following manner.

Assuming that the decay is a two-body decay into particles $B$ and $C$ (the generalisation to three-body decays or more is straightforward), the conservation of 4-momentum tells us that

$$p_A = k_B + k_C \implies m_A^2 = (k_B + k_C)^2 = m_B^2 + m_C^2 + 2k_B\cdot k_C. \tag{246}$$

Thus, if we measure the energies and momenta of $B$ and $C$, we should always be able to reconstruct the mass $m_A$. In practice, there are usually many backgrounds to this type of process, where particles of the same type as $B$ and $C$ are found in the detector. However, such backgrounds usually produce values of $(k_B + k_C)^2$ which are different from $m_A^2$ and fall along a continuum. If the particle $A$ is present, plotting the spectrum of $(k_B + k_C)^2$ for all pairs of $B$ and $C$ appearing in the detector, the decays of $A$ will manifest themselves as a peak in the spectrum at $(k_B + k_C)^2 = m_A^2$. Sometimes, the combination $(k_B + k_C)^2$ is called $m_{BC}^2$, and $m_{BC}$ is referred to as the invariant mass of the $BC$ system.

Example 17 The recent discovery of the Higgs boson $H$ was mainly driven by looking for the decay of $H$ into two photons. Since photons are massless particles, we find that for a Higgs decay

$$m_H^2 = m_{\gamma\gamma}^2 = 2k_{\gamma 1}\cdot k_{\gamma 2} = 2E_1E_2(1 - \cos(\theta)), \tag{247}$$

where $E_1$ and $E_2$ are the energies of the photons and $\theta$ is the angle between their directions. The experiments at the Large Hadron Collider (LHC) found a peak in the spectrum for $m_{\gamma\gamma}$ at 125 GeV and could deduce that the Higgs boson exists and has a mass of 125 GeV, see Fig. 11.

Figure 11: The peak at 125 GeV over a continuous background in the distribution of $m_{\gamma\gamma}$ in the CMS data. This figure is taken from CMS Collaboration, Phys. Lett. B 716 (2012), 30-61 (CC BY-NC-ND).
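The invariant mass of a photon pair can be computed either by summing the two 4-momenta and squaring, or from the closed form that follows from $k_{\gamma 1}\cdot k_{\gamma 2} = E_1E_2(1 - \cos(\theta))$ for massless particles, i.e., $m_{\gamma\gamma}^2 = 2E_1E_2(1 - \cos(\theta))$. The sketch below compares the two for hypothetical photon energies in GeV; the function names are illustrative.

```python
import math

def photon(E, angle):
    """4-momentum (E, px, py) of a photon travelling at the given angle in the x-y plane."""
    return (E, E * math.cos(angle), E * math.sin(angle))

def invariant_mass(k1, k2):
    """Invariant mass of a two-particle system from the square of the summed 4-momenta."""
    total = tuple(a + b for a, b in zip(k1, k2))
    m2 = total[0] ** 2 - sum(x * x for x in total[1:])
    return math.sqrt(max(m2, 0.0))  # guard against tiny negative rounding errors

E1, E2, theta = 80.0, 60.0, math.radians(130.0)   # hypothetical values, GeV
m_from_vectors = invariant_mass(photon(E1, 0.0), photon(E2, theta))
m_from_formula = math.sqrt(2.0 * E1 * E2 * (1.0 - math.cos(theta)))
print(m_from_vectors, m_from_formula)

# Two back-to-back photons of 62.5 GeV each reconstruct to 125 GeV.
print(invariant_mass(photon(62.5, 0.0), photon(62.5, math.pi)))
```

In an analysis, a quantity of this kind would be histogrammed over all photon pairs to look for a peak over the continuum.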

9.4 Threshold energies

The concept of threshold energies is important for particle accelerators hoping to produce new particles. It refers to the minimum energy (as defined in some frame) required in order for a given reaction to occur. For some processes, such as reactions where the total mass of the outgoing particles is smaller than that of the incoming particles, the threshold energy is zero and the reaction can occur regardless of the energies of the incoming particles.

Consider the collision of two particles, A and B, and let us assume that we wish to produce N particles $C_i$ with corresponding 4-momenta $k_i$. By conservation of 4-momentum, we must have

$$p_A + p_B = \sum_{i=1}^{N} k_i. \tag{248}$$

Squaring this relation results in

$$m_{AB}^2 = (p_A + p_B)^2 = \sum_{i=1}^{N}\sum_{j=1}^{N} k_i \cdot k_j. \tag{249}$$

Exercise 20 Show that for any 4-momenta $k_i$ and $k_j$, it holds that

$$k_i \cdot k_j \geq m_i m_j, \tag{250}$$

where the equality holds only if the relative velocity between $C_i$ and $C_j$ is zero.

Based on the above exercise, we now have

$$m_{AB}^2 \geq \sum_{i=1}^{N}\sum_{j=1}^{N} m_i m_j = \left(\sum_{i=1}^{N} m_i\right)^2. \tag{251}$$

Thus, in order for the reaction to be possible, it is necessary that the invariant mass of the AB system is at least as large as the sum of the masses of the outgoing particles. Furthermore, since the equality holds only if all final state particles are at relative rest, the threshold for the reaction occurs when all final state particles are moving at the same velocity. The concept of threshold energy refers to the minimal energy required to make the reaction possible, i.e., the energy for which the equality holds in Eq. (251).

Example 18 Consider the reaction $\gamma + p \to p + \pi$, i.e., the creation of a pion from a proton absorbing a photon. If the proton is at rest in the laboratory frame, we search for the threshold photon energy E for the reaction. We obtain

$$m_{p\gamma}^2 = 2 p_\gamma \cdot p_p + m_p^2 \geq (m_p + m_\pi)^2. \tag{252}$$

If the photon has energy E in the proton rest frame, we find that

$$p_\gamma \cdot p_p = m_p E \tag{253}$$

and therefore

$$m_p(2E + m_p) \geq m_p^2 + 2 m_p m_\pi + m_\pi^2. \tag{254}$$

The threshold energy for the reaction is the energy E such that the equality holds and therefore

$$E_{\rm th} = \frac{2 m_p m_\pi + m_\pi^2}{2 m_p} = m_\pi + \frac{m_\pi^2}{2 m_p}. \tag{255}$$

Note that this energy is larger than the pion mass $m_\pi$: it is not enough to supply just the rest energy of the pion in the laboratory frame, since some energy must also go into kinetic energy of the final state to ensure 3-momentum conservation.
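The threshold condition of Eq. (251) can be turned into a small calculator for a beam hitting a target at rest. The sketch below is a minimal illustration: the function name is hypothetical, the masses are approximate values in MeV, and the neutral pion is assumed for the final state.

```python
def threshold_beam_energy(m_beam, m_target, final_masses):
    """Total beam energy at threshold for a beam hitting a target at rest.

    Follows from (p_A + p_B)^2 = m_A^2 + m_B^2 + 2 m_B E >= (sum of m_i)^2.
    """
    m_sum = sum(final_masses)
    return (m_sum ** 2 - m_beam ** 2 - m_target ** 2) / (2.0 * m_target)

# Approximate masses in MeV (assumed values for illustration).
M_PROTON = 938.27
M_PION0 = 134.98

e_th = threshold_beam_energy(0.0, M_PROTON, [M_PROTON, M_PION0])
e_closed_form = M_PION0 + M_PION0 ** 2 / (2.0 * M_PROTON)   # Eq. (255)
print(e_th, e_closed_form)
```

The general expression reduces to Eq. (255) for a massless beam particle, and the result exceeds the pion mass by the recoil term $m_\pi^2/(2m_p)$.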

9.5 Boosted decays

Let us consider a particle A decaying via a two-body decay to particles B and C as above. Assuming that the decay is isotropic in the rest frame of A, i.e., that there is an equal probability of particle B going in any direction, we can ask how the distribution of the directions of B depends on the inertial frame. In particular, let us consider an inertial frame where A is moving with velocity v in the x direction. For simplicity, we will assume that the particle B is massless. A similar approach can be applied when this is not the case.

By the same argument as in the case of the aberration of light, the angle $\theta$ between B and the direction of motion of A in the laboratory frame is related to the angle $\theta'$ in the rest frame of A according to

$$\cos(\theta') = \frac{\cos(\theta) - v}{1 - v\cos(\theta)} \tag{256}$$

(note the direction B is moving and the definition of the angle $\theta$). An isotropic distribution in the rest frame of A is given by a flat distribution $f_0(\xi') = 1/2$ in the variable $\xi' = \cos(\theta')$, with the factor of 1/2 arising from the fact that $\xi'$ varies from $-1$ to $1$. The probability that any given decay will result in a B moving in a direction between $\xi'$ and $\xi' + d\xi'$ is therefore equal to $f_0(\xi')\,d\xi' = d\xi'/2$. Changing variables to $\xi = \cos(\theta)$, we find that the probability for a decay to result in a B with a direction between $\xi$ and $\xi + d\xi$ is given by

$$f(\xi)\,d\xi = f_0(\xi'(\xi))\,\frac{d\xi'}{d\xi}\,d\xi. \tag{257}$$

It follows that the distribution function in the variable $\xi$ is given by

$$f(\xi) = \frac{1}{2}\frac{d}{d\xi}\frac{\xi - v}{1 - v\xi} = \frac{1}{2\gamma^2(1 - v\xi)^2}, \tag{258}$$

which is no longer isotropic, but instead concentrated in the direction of motion of A. Note that the probability to obtain a B moving in any direction is still given by

$$\int_{-1}^{1} f(\xi)\,d\xi = \frac{1}{2\gamma^2}\int_{-1}^{1}\frac{d\xi}{(1 - v\xi)^2} = \frac{1}{2\gamma^2 v}\int_{1-v}^{1+v}\frac{dt}{t^2} = 1, \tag{259}$$

where we made the substitution $t = 1 - v\xi$ in order to compute the integral (the remaining steps are just algebra). We show the resulting angular distribution along with the energy as a function of $\cos(\theta)$ in Fig. 13.

Figure 12: We are looking at the decay of a particle A to two particles, B and C, which is isotropic in the rest frame S′ of A. Based on this, we can figure out the angular distribution of the particle B in a frame S where A is moving with velocity v.
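The distribution in Eq. (258) is easy to check numerically. The following Python sketch integrates it with a simple midpoint rule to confirm the normalisation of Eq. (259) and to estimate the fraction of decays in which B moves into the forward hemisphere of the frame S. The helper names, the number of integration steps, and the velocity v = 0.9 are assumptions made for illustration.

```python
def boosted_density(xi, v):
    """f(xi) = 1 / (2 gamma^2 (1 - v xi)^2) for xi = cos(theta), Eq. (258)."""
    return (1.0 - v * v) / (2.0 * (1.0 - v * xi) ** 2)

def rest_to_lab(xi_rest, v):
    """Inverse of Eq. (256): cos(theta) expressed in terms of cos(theta')."""
    return (xi_rest + v) / (1.0 + v * xi_rest)

def integrate(func, a, b, n=20000):
    """Midpoint rule, sufficient for these smooth integrands."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

v = 0.9
norm = integrate(lambda xi: boosted_density(xi, v), -1.0, 1.0)
# Fraction of decays with B in the forward hemisphere (cos(theta) > 0) in S.
forward = integrate(lambda xi: boosted_density(xi, v), 0.0, 1.0)
print(norm, forward)
```

The forward fraction can also be obtained in closed form as $(1 + v)/2$, since $\cos(\theta) > 0$ corresponds to $\cos(\theta') > -v$ in the rest frame of A.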

10 Energy and momentum in EM fields

10.1 A problem of action at a distance

In Newtonian gravity and electrostatics, two masses/charges affect each other with an action at a distance described by a force

$$\vec F \propto \vec e_r \frac{1}{r^2}. \tag{260}$$

Consider the case of two point charges of charge q in electrostatics. If one of the charges is moved a distance $d\vec x$, then the magnitude of the force on the other charge immediately changes by an amount

$$dF = F(\vec x + d\vec x) - F(\vec x) \simeq \frac{q^2}{4\pi(r^2 + 2\vec x \cdot d\vec x)} - \frac{q^2}{4\pi r^2} \simeq \frac{q^2}{4\pi r^2}\left(1 - \frac{2\vec x \cdot d\vec x}{r^2}\right) - \frac{q^2}{4\pi r^2} = -\frac{2q^2}{4\pi r^4}\,\vec x \cdot d\vec x, \tag{261}$$

where we have assumed that the original separation vector is $\vec x$, with $r = |\vec x|$. In the Newtonian setting, this is not a problem. Since the forces on both particles change simultaneously, the change in momentum of one particle is exactly opposite to the change in momentum of the other particle, in accordance with Newton's third law.

Figure 13: The energy and angular distribution of the daughter neutrino in the decay $\pi^+ \to \mu^+ + \nu_\mu$ depending on the velocity of the parent pion and the angle $\theta$. Energies are given in units of MeV. Note that the scales differ in the different panels.

But what happens in relativity? Due to the relativity of simultaneity, it is no longer clear when the force on the second particle should change if we move the first. Furthermore, if the change occurs simultaneously with the movement of the first particle in one frame, it will not occur simultaneously in other frames.

In a more abstract argument, we can consider an instant transfer of momentum from a point particle at point $\vec x$ to a particle at point $\vec y$. Since the transfer is instant, there will be no problem with conservation of momentum ... in the frame where the transfer is instant. In other frames, the transfer will not be instant and momentum and energy will then not be conserved, i.e., one of the particles would change its motion before the other.

The above problem can be solved only if we require that all momentum and energy transfer must be local. In other words, we can have momentum and energy densities with corresponding currents, but we cannot simply take momentum from one position and arbitrarily assign it to another.

So in the case of the two charges interacting, how can we explain the action at a distance? The particles only exist at their positions and cannot carry momentum and energy at a different position, so what carries the energy and momentum currents? This dilemma can be resolved if we can ascribe energy and momentum densities to the field that carries the interaction, in this case the electromagnetic field, which is what we will discuss in this section.

10.2 The stress-energy tensor

The above discussion highlights the need to ascribe an energy and momentum to the electromagnetic field. In order to find out how we can do that, let us consider the force on a general 4-current density J. In terms of J, the 4-force on the charge within a volume element dV is given by

$$F_\mu = F_{\mu\nu}\rho V^\nu\,dV = \gamma F_{\mu\nu}J^\nu\,dV, \tag{262}$$

since $J^\nu = \rho_0 V^\nu$, where $\rho_0 = \rho/\gamma$ is the charge density in the rest frame of the charges. Similarly, the 4-momentum of the charge within the volume element is going to be given by

$$P_\mu = p_\mu\,dV, \tag{263}$$

where $p_\mu$ is the 4-momentum density. By the definition of the 4-force, we therefore find that

$$F_\mu = \frac{dP_\mu}{d\tau} = \gamma\frac{dP_\mu}{dt} = \gamma\frac{dp_\mu}{dt}\,dV = \gamma F_{\mu\nu}J^\nu\,dV. \tag{264}$$

Cancelling the $\gamma$ and the volume element, we therefore find that

$$\frac{dp_\mu}{dt} = F_{\mu\nu}J^\nu, \tag{265}$$


where the left-hand side is the rate of change of the momentum density of the charge within the volume dV due to the electromagnetic force acting on that volume. The 4-momentum density added to the electromagnetic field per time unit must therefore be $K_\mu = -dp_\mu/dt$. (Note that it is not obvious that the left-hand side transforms as a 4-vector. This is due to the fact that the 4-momentum density in a frame will turn out to be $p^\mu = T^{\nu\mu}V_\nu$, where V is the 4-velocity of the observer and $T^{\mu\nu}$ a rank two tensor. More on this later.)

In order to see how the above will help us define the energy and momentum of the electromagnetic field, we need to manipulate the right-hand side. From Maxwell's equations, we know that $J^\nu = \partial_\lambda F^{\lambda\nu}$, which leads to

$$-K_\mu = F_{\mu\nu}J^\nu = F_{\mu\nu}\partial_\lambda F^{\lambda\nu}. \tag{266}$$

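The right-hand side here can be rewritten

$$F_{\mu\nu}\partial_\lambda F^{\lambda\nu} = \partial_\lambda(F_{\mu\nu}F^{\lambda\nu}) - F^{\lambda\nu}\partial_\lambda F_{\mu\nu}. \tag{267}$$

The last term in this expression can be shown to also be a total derivative. We start by splitting it in two terms and applying the result in Eq. (166)

$$F^{\lambda\nu}\partial_\lambda F_{\mu\nu} = \frac{1}{2}F^{\lambda\nu}\partial_\lambda F_{\mu\nu} - \frac{1}{2}F^{\lambda\nu}(\partial_\mu F_{\nu\lambda} + \partial_\nu F_{\lambda\mu}) = \frac{1}{4}\partial_\mu F^2 + \frac{1}{2}F^{\lambda\nu}(\partial_\lambda F_{\mu\nu} + \partial_\nu F_{\mu\lambda}) = \frac{1}{4}\partial_\mu F^2, \tag{268}$$

where $F^2 = F_{\lambda\nu}F^{\lambda\nu}$ and the second term vanishes as the factor $F^{\lambda\nu}$ is anti-symmetric under the exchange $\lambda \leftrightarrow \nu$ and the rest of that term is symmetric under the same exchange. We conclude that

$$F_{\mu\nu}J^\nu = \partial_\lambda(F_{\mu\nu}F^{\lambda\nu}) - \frac{1}{4}\partial_\mu F^2 = \partial_\lambda\left(F_{\mu\nu}F^{\lambda\nu} - \frac{1}{4}\delta^\lambda_\mu F^2\right). \tag{269}$$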

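Introducing the rank two tensor $M^\lambda{}_\mu$ as

$$M^\lambda{}_\mu = -F_{\mu\nu}F^{\lambda\nu} + \frac{1}{4}\delta^\lambda_\mu F^2, \tag{270}$$

we arrive at the result

$$\partial_\mu M^{\mu\lambda} = -\frac{dp^\lambda}{dt} = K^\lambda. \tag{271}$$

Expanding the sum over $\mu$ in the temporal and spatial parts, this equation reads

$$\frac{\partial M^{0\lambda}}{\partial t} + \partial_i M^{i\lambda} = K^\lambda. \tag{272}$$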


Thus, for every fixed $\lambda$, this is a continuity equation. Based on the nature of the source term on the right-hand side, we can make the following identifications:

• For $\lambda = 0$, the right-hand side is the energy density added to the electromagnetic field per time unit. We therefore identify $\rho = M^{00}$ with the energy density and $g^i = M^{i0}$ with the energy current of the electromagnetic field.

• For $\lambda = j$, the right-hand side is a component of the momentum density added to the electromagnetic field per time unit. We therefore identify $k^j = M^{0j}$ with the momentum density and $\sigma^{ij} = M^{ij}$ with the current in direction i of the momentum in direction j. In more common language, this momentum current is just the stress tensor (we shall soon identify $\sigma^{ij}$ with something familiar).

Because of this identification of the components of $M^{\mu\lambda}$, it is called the stress-energy tensor of the electromagnetic field.

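Exercise 21 Show that

$$\frac{1}{2}\delta^\nu_\mu F^2 = F^{\sigma\nu}F_{\sigma\mu} - \tilde F^{\sigma\nu}\tilde F_{\sigma\mu}, \tag{273}$$

where $\tilde F$ is the dual field strength tensor, and use this result to derive

$$M^\nu{}_\mu = -\frac{1}{2}\left(F_{\mu\sigma}F^{\nu\sigma} + \tilde F_{\mu\sigma}\tilde F^{\nu\sigma}\right). \tag{274}$$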

Raising the second index of the stress-energy tensor, we find that

$$M^{\lambda\mu} = -F^{\mu}{}_{\nu}F^{\lambda\nu} + \frac{1}{4}\eta^{\lambda\mu}F^2 = -F^{\mu\nu}F^{\lambda}{}_{\nu} + \frac{1}{4}\eta^{\lambda\mu}F^2 = M^{\mu\lambda}, \tag{275}$$

i.e., the stress-energy tensor is symmetric. This leads to the insight that

$$g^i = M^{i0} = M^{0i} = k^i, \tag{276}$$

i.e., that the energy current is equal to the momentum density (if we use units where $c \neq 1$, the two will instead be equal only up to multiplication with the appropriate number of factors of c to make the units come out right).


10.3 Stress-energy and the EM fields

We have so far only treated the stress-energy tensor in terms of the field strength tensor $F_{\mu\nu}$. It is instructive to look at the components of $M^{\mu\nu}$ in terms of the electric and magnetic fields. We start by considering the energy density

$$\rho = M^{00} = -F^{0}{}_{\nu}F^{0\nu} + \frac{1}{2}(\vec B^2 - \vec E^2) = -F^{0}{}_{i}F^{0i} + \frac{1}{2}(\vec B^2 - \vec E^2) = F_{0i}F_{0i} + \frac{1}{2}(\vec B^2 - \vec E^2) = \vec E^2 + \frac{1}{2}(\vec B^2 - \vec E^2) = \frac{1}{2}(\vec E^2 + \vec B^2). \tag{277}$$

If you have taken a course in electromagnetism, this should be familiar as the energy density of the electromagnetic field (as we have just identified from the continuity equation!). For the energy current (or equivalently, momentum density), we find

$$\vec g = \vec e_i M^{0i} = \vec e_i \eta^{i0}\,\frac{1}{2}(\vec B^2 - \vec E^2) - \vec e_i F^{0}{}_{\nu}F^{i\nu} = -\vec e_i F_{0j}F_{ij} = \vec e_i E^j \epsilon_{ijk}B^k = \vec E \times \vec B. \tag{278}$$

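This should be familiar from electromagnetism as the Poynting vector. Finally, for the momentum current, we have

$$\begin{aligned}
\sigma_{ij} = M^{ij} &= -F^{i}{}_{\nu}F^{j\nu} - \frac{1}{2}\delta_{ij}(\vec B^2 - \vec E^2) \\
&= -F^{i}{}_{0}F^{j0} - F^{i}{}_{k}F^{jk} - \frac{1}{2}\delta_{ij}(\vec B^2 - \vec E^2) \\
&= -F_{i0}F_{j0} + F^{ik}F^{jk} - \frac{1}{2}\delta_{ij}(\vec B^2 - \vec E^2) \\
&= -E_i E_j + \epsilon_{ik\ell}B_\ell\,\epsilon_{jkm}B_m - \frac{1}{2}\delta_{ij}(\vec B^2 - \vec E^2) \\
&= -E_i E_j + \delta_{ij}\vec B^2 - B_i B_j - \frac{1}{2}\delta_{ij}(\vec B^2 - \vec E^2) \\
&= \frac{1}{2}(\vec E^2 + \vec B^2)\delta_{ij} - E_i E_j - B_i B_j.
\end{aligned} \tag{279}$$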

This will be recognised from electromagnetism as the Maxwell stress tensor (usually with the opposite sign so that it represents the flux of momentum in the opposite direction of the surface normal).

To summarise, we have shown that the stress-energy tensor combines three different quantities from electrodynamics, namely the energy density, the Poynting vector, and the Maxwell stress tensor into a single rank two 4-tensor with components

$$(M^{\mu\nu}) = \begin{pmatrix} \rho & g_1 & g_2 & g_3 \\ g_1 & \sigma_{11} & \sigma_{12} & \sigma_{13} \\ g_2 & \sigma_{21} & \sigma_{22} & \sigma_{23} \\ g_3 & \sigma_{31} & \sigma_{32} & \sigma_{33} \end{pmatrix}. \tag{280}$$

Of course, as this is a tensor in Minkowski space, any Lorentz transformation will mix the components, e.g., we cannot know the Poynting vector in S′ solely based on its value in S; we also need to know the energy density and the stress.
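The component identifications above can be verified numerically by building the field strength tensor from given fields and contracting it as in Eq. (275). The Python sketch below assumes the metric signature (+, −, −, −) and the conventions $F_{0i} = E_i$ and $F_{ij} = -\epsilon_{ijk}B_k$, which match the expressions for $\vec E$ and $\vec B$ used in these notes; the function names and the sample field values are illustrative.

```python
ETA = (1.0, -1.0, -1.0, -1.0)  # diagonal of the metric, signature (+, -, -, -)

def levi_civita(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def field_strength(E, B):
    """Components F_{mu nu} with F_{0i} = E_i and F_{ij} = -eps_{ijk} B_k."""
    F = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        F[0][i + 1] = E[i]
        F[i + 1][0] = -E[i]
        for j in range(3):
            F[i + 1][j + 1] = -sum(levi_civita(i, j, k) * B[k] for k in range(3))
    return F

def stress_energy(E, B):
    """M^{lm} = -F^{m}_{a} F^{la} + (1/4) eta^{lm} F^2, all indices raised."""
    F_lo = field_strength(E, B)
    F_up = [[ETA[m] * ETA[n] * F_lo[m][n] for n in range(4)] for m in range(4)]
    F2 = sum(F_lo[m][n] * F_up[m][n] for m in range(4) for n in range(4))
    M = [[0.0] * 4 for _ in range(4)]
    for l in range(4):
        for m in range(4):
            contraction = sum(ETA[a] * F_up[m][a] * F_up[l][a] for a in range(4))
            M[l][m] = -contraction + (0.25 * ETA[l] * F2 if l == m else 0.0)
    return M

E = (1.0, 2.0, -0.5)   # arbitrary illustrative field values
B = (0.3, -1.0, 0.7)
M = stress_energy(E, B)
energy_density = M[0][0]
poynting = (M[0][1], M[0][2], M[0][3])
print(energy_density, poynting)
```

The printed energy density equals $(\vec E^2 + \vec B^2)/2$ and the printed vector equals $\vec E \times \vec B$, while the spatial block reproduces Eq. (279).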

10.4 Interpreting the Maxwell stress tensor

We have identified the components of the Maxwell stress tensor $\sigma_{ij}$ with the momentum current, i.e., the flow of momentum across a surface element $d\vec S$ is given by

$$\frac{dp_i}{dt} = \sigma_{ij}\,dS_j. \tag{281}$$

Thus, this is the momentum transferred per time unit to the volume the surface normal is pointing to from the volume the surface normal is pointing from. In other words, it is the force on the volume the normal is pointing to from the volume the normal is pointing from. The component of this force in the normal direction is therefore given by

$$P_{\vec n}\,dS = n_i n_j \sigma_{ij}\,dS, \tag{282}$$

where $\vec n$ is a unit normal and $P_{\vec n}$ is the electromagnetic pressure in the direction $\vec n$. If this pressure is positive, the resulting force on the field is directed away from the surface, while it is directed towards the surface if $P_{\vec n}$ is negative. In particular, let us consider what happens when the surface normal is parallel to the field lines. For this purpose, we consider a situation when $\vec B = 0$ and pick a coordinate system such that $\vec E = E\vec e_1$ and $\vec n = \vec e_1$. It follows that

$$P_{\vec e_1} = \sigma_{11} = \frac{1}{2}E^2 - E_1 E_1 = -\frac{1}{2}E^2 \tag{283}$$

and the force is directed from the volume towards the surface. On the other hand, if we consider the case when the field is perpendicular to the surface normal, i.e., when the field lines are along the surface, we can pick our coordinate system such that $\vec E = E\vec e_1$ and $\vec n = \vec e_2$, leading to

$$P_{\vec e_2} = \sigma_{22} = \frac{1}{2}E^2 - E_2 E_2 = \frac{1}{2}E^2 \tag{284}$$

and the force is directed into the volume.

Figure 14: The electric field lines between two opposite charges (left) and two same sign charges (right). The direction of the forces on the fields across a plane between the charges can be inferred from the direction of the field lines.
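The two cases in Eqs. (283) and (284) can be reproduced with a few lines of Python. This is a minimal sketch: the function names are hypothetical and the field magnitude is chosen arbitrarily.

```python
def maxwell_stress(E, B):
    """sigma_ij = (E^2 + B^2)/2 delta_ij - E_i E_j - B_i B_j, Eq. (279)."""
    energy = 0.5 * (sum(e * e for e in E) + sum(b * b for b in B))
    return [[(energy if i == j else 0.0) - E[i] * E[j] - B[i] * B[j]
             for j in range(3)] for i in range(3)]

def pressure(E, B, n):
    """P_n = n_i n_j sigma_ij for a unit normal n, Eq. (282)."""
    sigma = maxwell_stress(E, B)
    return sum(n[i] * n[j] * sigma[i][j] for i in range(3) for j in range(3))

E = (2.0, 0.0, 0.0)          # field along e_1, illustrative magnitude
B = (0.0, 0.0, 0.0)
p_parallel = pressure(E, B, (1.0, 0.0, 0.0))        # normal along the field lines
p_perpendicular = pressure(E, B, (0.0, 1.0, 0.0))   # normal across the field lines
print(p_parallel, p_perpendicular)
```

For $E = 2$ this gives $-E^2/2 = -2$ along the field lines (a tension) and $+E^2/2 = 2$ across them (a pressure).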

Example 19 Consider the case of two stationary opposite charges q and −q located at positions $x_0\vec e_1$ and $-x_0\vec e_1$, respectively, see Fig. 14. The corresponding electric field on the surface x = 0 is given by

$$\vec E = -\vec e_1\frac{q x_0}{2\pi(x_0^2 + y^2 + z^2)^{3/2}} = -\vec e_1\frac{q x_0}{2\pi(x_0^2 + r^2)^{3/2}} \equiv -\vec e_1 E(r), \tag{285}$$

where r is the radial coordinate in polar coordinates on the yz-plane. The force from the electromagnetic field in the volume x < 0 on the field in the volume x > 0 is given by the surface integral

$$F_i = \int_{x=0}\sigma_{ij}n_j\,dS, \tag{286}$$

where the surface normal is given by $\vec n = \vec e_1$. Inserting the expression for the Maxwell stress tensor results in

$$F_i = \int_{x=0}\left[\frac{1}{2}E(r)^2\delta_{i1} + E_i E(r)\right]dS = -\frac{1}{2}\delta_{i1}\int_{x=0}E(r)^2\,dS. \tag{287}$$

Inserting the expression for E(r) and using polar coordinates now gives us

$$F_i = -\delta_{i1}\frac{q^2 x_0^2}{4\pi}\int_0^\infty\frac{r\,dr}{(x_0^2 + r^2)^3} = -\delta_{i1}\frac{q^2 x_0^2}{8\pi}\int_{x_0^2}^\infty\frac{dt}{t^3} = -\frac{q^2\delta_{i1}}{16\pi x_0^2} = -\frac{q^2\delta_{i1}}{4\pi(2x_0)^2}. \tag{288}$$

In other words, the force is equal to

$$\vec F = -\frac{q^2\,\vec e_1}{4\pi d^2}, \tag{289}$$

where $d = 2x_0$ is the distance between the charges. We note that this is exactly the force from the field in the region x > 0 on the charge at $x = x_0$ and so the charge must act on the field with the force $-\vec F$. Thus, the total force on the field is equal to zero, as it should be in a static situation.
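The surface integral in this example can also be evaluated numerically and compared with the Coulomb force between the charges. The sketch below assumes the same units as the example (no factors of $\varepsilon_0$), uses the substitution $r = x_0\tan(u)$ to map the infinite radial range to a finite interval, and integrates with a midpoint rule; the function names and the values of q and $x_0$ are illustrative.

```python
import math

def e_field_on_plane(r, q, x0):
    """Magnitude E(r) of the field on the plane x = 0, Eq. (285)."""
    return q * x0 / (2.0 * math.pi * (x0 ** 2 + r ** 2) ** 1.5)

def force_across_plane(q, x0, n=20000):
    """F_1 = -(1/2) * integral of E(r)^2 over the plane, Eq. (287).

    Uses r = x0 * tan(u) with u in (0, pi/2) and the midpoint rule.
    """
    h = (math.pi / 2.0) / n
    total = 0.0
    for k in range(n):
        u = (k + 0.5) * h
        r = x0 * math.tan(u)
        dr_du = x0 / math.cos(u) ** 2
        total += e_field_on_plane(r, q, x0) ** 2 * 2.0 * math.pi * r * dr_du * h
    return -0.5 * total

q, x0 = 1.5, 0.7
numeric = force_across_plane(q, x0)
coulomb = -q ** 2 / (4.0 * math.pi * (2.0 * x0) ** 2)   # Eq. (289) with d = 2 x0
print(numeric, coulomb)
```

The two printed values agree to within the accuracy of the integration, confirming that the momentum flowing through the plane accounts for the full attractive force.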

Exercise 22 Repeat the computation in the above example for the case of two equal charges.

11 Electromagnetic waves

11.1 The plane wave

In our first discussion about the electromagnetic field, we found that the 4-potential in Lorenz gauge $\partial_\mu A^\mu = 0$ satisfies the sourced wave equation

$$\partial_\mu\partial^\mu A^\nu = J^\nu \tag{290}$$

as a direct consequence of Maxwell's equations. For a wave propagating in free space without any charges or currents, this turns into the free wave equation with $J^\nu = 0$, which has plane wave solutions of the form

$$A^\nu = d^\nu\cos(k\cdot x) + D^\nu\sin(k\cdot x) \tag{291}$$

and superpositions thereof (with different k). The phase function of this wave is $\phi = 2\pi k\cdot x$ and consequently we find that the relation between the wave vector k and the 4-frequency is $N_\mu = \partial_\mu\phi = 2\pi k_\mu$. This 4-potential can also be written as

$$A^\nu = \mathrm{Re}\left(\varepsilon^\nu e^{ik\cdot x}\right), \tag{292}$$

where

$$\varepsilon^\nu = d^\nu - iD^\nu \tag{293}$$

is the polarisation vector. In what follows, we will always assume that we are taking the real part of any complex expression and not write out $\mathrm{Re}(\,)$ explicitly.


11.1.1 Polarisation and wave vector relations

Inserting the plane wave solution into the wave equation, we find

$$-k_\mu k^\mu A^\nu = -k^2 A^\nu = 0, \tag{294}$$

leading to $k^2 = 0$ if we wish to have a non-trivial solution. Thus, hardly surprisingly, plane waves that are solutions to Maxwell's equations have light-like 4-frequencies. Using the Lorenz gauge condition results in

$$\partial_\mu A^\mu = ik_\mu\varepsilon^\mu e^{ik\cdot x} = ik\cdot\varepsilon\,e^{ik\cdot x} = 0, \tag{295}$$

which implies that k and $\varepsilon$ are orthogonal. Furthermore, we note that if $\varepsilon \propto k$, then the field strength tensor is given by

$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu = i(k_\mu\varepsilon_\nu - k_\nu\varepsilon_\mu)e^{ik\cdot x} \propto k_\mu k_\nu - k_\nu k_\mu = 0 \tag{296}$$

and therefore $\varepsilon \propto k$ gives no contribution to the electromagnetic fields in the wave. Since a light-like vector cannot be orthogonal to a time-like vector and two light-like vectors are orthogonal only if they are proportional to each other, it follows that the polarisation vector $\varepsilon$ must be space-like. Since a polarisation vector proportional to k gives no contribution to the fields, given a polarisation vector $\varepsilon$, we can always define a new polarisation vector

$$\varepsilon' = \varepsilon - \frac{\varepsilon^0}{k^0}k, \tag{297}$$

which has time component $\varepsilon'^0 = \varepsilon^0 - \varepsilon^0 k^0/k^0 = 0$ and results in the same field strength tensor. Thus, we can always assume that it is possible to find a suitable polarisation vector with time component zero (of course, such a choice is frame dependent).

If we consider a plane wave moving in the $\vec e_3$ direction, the wave vector is given by

$$k^\mu = \omega(1, 0, 0, 1), \tag{298}$$

where $k^0 = \omega = 2\pi\nu$ is the angular frequency of the wave. For this choice of k, the two space-like directions that have time component zero and are orthogonal to the wave vector are

$$\varepsilon_1^\mu = (0, 1, 0, 0) \quad\text{and}\quad \varepsilon_2^\mu = (0, 0, 1, 0). \tag{299}$$

The general polarisation vector can therefore be written as any (complex) linear combination of those

$$\varepsilon = c_1\varepsilon_1 + c_2\varepsilon_2. \tag{300}$$


This basis for the polarisation vector corresponds to decomposing the wave in linearly polarised waves. We could also use the basis

$$\varepsilon_1^\mu = (0, 1, i, 0) \quad\text{and}\quad \varepsilon_2^\mu = (0, 1, -i, 0), \tag{301}$$

corresponding to decomposing the wave in circularly polarised waves. Note that, when making sure that $\varepsilon$ is space-like, it is crucial to remember that the inner product for complex vectors is of the form $X\cdot Y = \eta_{\mu\nu}X^{\mu*}Y^\nu$.

For a wave moving in an arbitrary direction $\vec n$, we can similarly define

$$k^\mu = \omega(1, \vec n) = (\omega, \vec k) \quad\text{and}\quad \varepsilon^\mu = (0, \vec\varepsilon), \tag{302}$$

where $k\cdot\varepsilon = 0$ leads to $\vec k\cdot\vec\varepsilon = 0$. Since $\vec\varepsilon$ is orthogonal to $\vec k$, we refer to this as a transverse wave.
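The relations between the wave vector and the polarisation vectors can be checked with Python's built-in complex numbers. The sketch below uses the complex inner product $X\cdot Y = \eta_{\mu\nu}X^{\mu*}Y^\nu$ with signature (+, −, −, −); the function name, the angular frequency, and the admixture 0.4 of k in the polarisation vector are assumptions made for illustration.

```python
def dot(X, Y):
    """Inner product eta_{mu nu} conj(X^mu) Y^nu with signature (+, -, -, -)."""
    x = [complex(c).conjugate() for c in X]
    return x[0] * Y[0] - x[1] * Y[1] - x[2] * Y[2] - x[3] * Y[3]

omega = 2.0                              # illustrative angular frequency
k = (omega, 0.0, 0.0, omega)             # wave moving in the e_3 direction
eps_linear = (0.0, 1.0, 0.0, 0.0)
eps_circular = (0.0, 1.0, 1j, 0.0)

# A polarisation vector with a non-zero time component, still orthogonal to k.
eps = tuple(e + 0.4 * kc for e, kc in zip(eps_linear, k))
# Shift of Eq. (297): remove the time component without changing the fields.
eps_shifted = tuple(e - (eps[0] / k[0]) * kc for e, kc in zip(eps, k))

print(dot(k, k), dot(k, eps_circular), dot(eps_circular, eps_circular))
print(eps_shifted)
```

Without the complex conjugation, the circular polarisation vector would appear to be light-like; with it, the inner product is negative and the vector is space-like as required.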

11.1.2 Fields in the electromagnetic wave

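We can compute the field strength tensor of the plane wave described above by using the definition

$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu = i(k_\mu\varepsilon_\nu - k_\nu\varepsilon_\mu)e^{ik\cdot x}. \tag{303}$$

For the electric field $\vec E$, we find that

$$\vec E = F_{0i}\vec e_i = i\vec e_i(\omega\varepsilon_i - k_i\varepsilon_0)e^{ik\cdot x} = -i\omega\vec\varepsilon\,e^{ik\cdot x}. \tag{304}$$

Consequently, the electric field $\vec E$ is proportional to $\vec\varepsilon$ and therefore orthogonal to the wave vector $\vec k$ (and therefore to the direction of propagation). In a similar fashion, we find that the magnetic field $\vec B$ is given by

$$\vec B = -\frac{\vec e_i}{2}\epsilon_{ijk}F_{jk} = -i\vec e_i\epsilon_{ijk}k_j\varepsilon_k e^{ik\cdot x} = i\vec\varepsilon\times\vec k\,e^{ik\cdot x} \tag{305}$$

and thus $\vec B$ is orthogonal to both $\vec k$ and $\vec E$, which may be a familiar result.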

Exercise 23 Verify that $\varepsilon^\mu = (0, 1, 0, 0)$ corresponds to a linearly polarised wave and that $\varepsilon^\mu = (0, 1, i, 0)$ corresponds to a circularly polarised wave.

11.2 Stress-energy of the plane wave

In order to compute the stress-energy tensor of the plane wave, let us focus on the case where $\varepsilon^\mu = (0, \vec\varepsilon)$ is real and we have a plane wave of the form

$$A^\mu = \varepsilon^\mu\cos(k\cdot x). \tag{306}$$


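Since the invariant $F^2$ is given by

$$F^2 = F_{\mu\nu}F^{\mu\nu} = (k_\mu\varepsilon_\nu - k_\nu\varepsilon_\mu)(k^\mu\varepsilon^\nu - k^\nu\varepsilon^\mu)\sin^2(k\cdot x) = 2[k^2\varepsilon^2 - (k\cdot\varepsilon)^2]\sin^2(k\cdot x) = 0, \tag{307}$$

we can express the stress-energy tensor as

$$M^{\lambda\mu} = -F^{\mu}{}_{\nu}F^{\lambda\nu} = -(k^\mu\varepsilon_\nu - k_\nu\varepsilon^\mu)(k^\lambda\varepsilon^\nu - k^\nu\varepsilon^\lambda)\sin^2(k\cdot x) = -\varepsilon^2 k^\mu k^\lambda\sin^2(k\cdot x) = k^\mu k^\lambda A^2\sin^2(k\cdot x), \tag{308}$$

where we have used that $k^2 = k\cdot\varepsilon = 0$ and where $A^2 = \vec\varepsilon^{\,2} = -\varepsilon^2$ is the squared amplitude of the wave. The energy density is thus given by

$$\rho = M^{00} = \omega^2 A^2\sin^2(k\cdot x) \tag{309}$$

and is proportional to the square of the amplitude A as well as the square of the frequency $\omega$, as expected. The Poynting vector in this scenario is given by

$$\vec g = \omega\vec k A^2\sin^2(k\cdot x). \tag{310}$$

It should come as no surprise that, since the Poynting vector is proportional to $\vec k$, the energy flow and momentum of the wave are in the direction of wave propagation.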
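As a cross-check of Eqs. (309) and (310), the following Python sketch computes the energy density and Poynting vector of a plane wave directly from its fields and compares them with the tensor expressions. It assumes the real fields $\vec E = \omega\vec\varepsilon\sin(k\cdot x)$ and $\vec B = \vec k\times\vec\varepsilon\sin(k\cdot x)$ obtained from $A^\mu = \varepsilon^\mu\cos(k\cdot x)$ with the conventions used above; the function names and numerical values are illustrative.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def plane_wave_fields(omega, n, eps, phase):
    """E and B for A = eps cos(k.x) with k = omega (1, n) and eps orthogonal to n."""
    s = math.sin(phase)
    k_vec = tuple(omega * c for c in n)
    E = tuple(omega * c * s for c in eps)
    B = tuple(c * s for c in cross(k_vec, eps))
    return E, B, k_vec

omega, phase = 3.0, 0.7                  # illustrative values
n = (0.0, 0.6, 0.8)                      # unit propagation direction
eps = (2.0, 0.0, 0.0)                    # spatial polarisation, orthogonal to n
E, B, k_vec = plane_wave_fields(omega, n, eps, phase)

amp2 = sum(c * c for c in eps)           # A^2 in Eq. (308)
rho_fields = 0.5 * (sum(c * c for c in E) + sum(c * c for c in B))
rho_tensor = omega ** 2 * amp2 * math.sin(phase) ** 2                     # Eq. (309)
g_fields = cross(E, B)
g_tensor = tuple(omega * c * amp2 * math.sin(phase) ** 2 for c in k_vec)  # Eq. (310)
print(rho_fields, rho_tensor)
print(g_fields, g_tensor)
```

The electric and magnetic contributions to the energy density are equal for a plane wave, which is why the sum reproduces $\omega^2A^2\sin^2(k\cdot x)$ without a factor of 1/2.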

12 Continuum mechanics

12.1 The stress-energy tensor

The electromagnetic field, as described by the field strength tensor, is a continuum and we were able to describe its energy and momentum properties using a symmetric rank two tensor. This directly raises the question of whether we can do something similar for a general continuum such as a gas, fluid, or solid. Not surprisingly, it turns out that this is not only possible, but also very natural.

Introducing a stress-energy tensor $T^{\mu\nu}$ for a general continuum, we expect that if the continuum interacts with the electromagnetic field, the stress-energy tensor should satisfy

$$\partial_\mu T^{\mu\nu} = -\partial_\mu M^{\mu\nu} = F^{\nu\mu}J_\mu, \tag{311}$$

where $M^{\mu\nu}$ is the stress-energy tensor of the electromagnetic field. This follows from the very same reasoning that gave us the stress-energy tensor for the electromagnetic field, starting from the force on a small volume and relating it to the change in the fluid 4-momentum. Note that the above equation is just the relativistic equivalent of Newton's third law as it can be written in the form

$$K_T^\mu = -K_E^\mu, \tag{312}$$

where $K_T$ is the source term in the continuity equation for the continuum and $K_E$ the source term in the continuity equation for the electromagnetic field. This statement is just saying that whatever 4-momentum is taken from the continuum is put into the electromagnetic field and vice versa.

As was the case for the stress-energy tensor of the electromagnetic field, writing down the continuity equation for the continuum, we can identify the components of $T^{\mu\nu}$ as

• the energy density $\rho = T^{00}$,

• the momentum density/energy current $p^i = T^{0i} = T^{i0}$, and

• the stress tensor $\sigma_{ij} = T^{ij}$.

As in the case of the Maxwell stress tensor, we have here used the opposite sign to the usual definition.

Under some quite reasonable assumptions on the relations among the components of the stress-energy tensor, we can find a unique frame in which the energy current $\vec p = 0$. If such a frame exists, we refer to it as the rest frame of the continuum and refer to the corresponding energy density $\rho_0$ and stress tensor $\sigma^0_{ij}$ as the rest energy density and rest stress tensor, respectively.

Since Lorentz transformations generally mix all of the components of the stress-energy tensor, it is worth noting that what appears as stress in one frame may be partially converted into momentum or energy density in another.

12.2 Perfect fluids

If a continuum has a rest frame in which the stress tensor $\sigma_{ij}$ is isotropic, i.e., its components do not change under rotations, we refer to it as a perfect fluid. Since the only isotropic tensor of rank two is $\delta_{ij}$, this implies that the fluid has no tangential stresses, i.e., $\sigma_{ij} = 0$ for $i \neq j$, meaning that the fluid is non-viscous. It follows that $\sigma_{ij} = p\,\delta_{ij}$, where p is the pressure of the fluid.

Exercise 24 Verify that p is actually the pressure by computing the force on a surface element $d\vec S$ within the fluid.


In the rest frame of the fluid, we can now represent the stress-energy tensor as

$$(T^{\mu\nu}) = \mathrm{diag}(\rho_0, p, p, p) = \mathrm{diag}(\rho_0 + p, 0, 0, 0) + p\,\mathrm{diag}(-1, 1, 1, 1). \tag{313}$$

The second of these terms is just equal to $(-p\eta^{\mu\nu})$, while the first term has a non-zero time-time component, with all other components equal to zero. We note that this is a property of $(V^\mu V^\nu)$, where V is the 4-velocity of an observer at rest in the fluid rest frame, which has the time-time component one and all other components equal to zero. We conclude that, for a perfect fluid,

$$T^{\mu\nu} = (\rho_0 + p)V^\mu V^\nu - p\eta^{\mu\nu}. \tag{314}$$

For any perfect fluid we define the equation of state parameter $w = p/\rho_0$ as the quotient between the pressure and the energy density in the fluid's rest frame.
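Because Eq. (314) is written in terms of the 4-velocity, it directly gives the components in a frame where the fluid is moving. The Python sketch below evaluates it for a fluid at rest and for a fluid moving with speed 0.6 in the x direction; the function name and the numerical values are assumptions chosen for illustration.

```python
import math

ETA = (1.0, -1.0, -1.0, -1.0)  # diagonal of the metric, signature (+, -, -, -)

def perfect_fluid_tensor(rho0, p, velocity):
    """T^{mu nu} = (rho0 + p) V^mu V^nu - p eta^{mu nu} for a fluid 3-velocity."""
    gamma = 1.0 / math.sqrt(1.0 - sum(c * c for c in velocity))
    V = (gamma,) + tuple(gamma * c for c in velocity)
    return [[(rho0 + p) * V[m] * V[n] - (p * ETA[m] if m == n else 0.0)
             for n in range(4)] for m in range(4)]

rho0, p = 1.0, 1.0 / 3.0        # illustrative radiation-like fluid, w = 1/3
T_rest = perfect_fluid_tensor(rho0, p, (0.0, 0.0, 0.0))
T_moving = perfect_fluid_tensor(rho0, p, (0.6, 0.0, 0.0))
print([T_rest[i][i] for i in range(4)])
print(T_moving[0][0], T_moving[0][1])
```

In the moving frame, the energy density becomes $\gamma^2(\rho_0 + pv^2)$ and a momentum density $\gamma^2(\rho_0 + p)v$ appears, illustrating how pressure in the rest frame contributes to energy and momentum density in other frames.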

12.2.1 Particle gases

Let us argue for the structure of the stress-energy tensor for a gas of non-interacting particles of mass m. We will be working in the rest frame of the gas, where the distribution of the particle momenta $\vec p$ is given by a spherically symmetric distribution $f(\vec p) = f(p)$. The canonical choice for the distribution would be the Maxwell–Jüttner distribution, which is the relativistic generalisation of the Maxwell distribution, but we will not need to be specific. We will use the notation

$$\langle A(\vec p)\rangle = \int A(\vec p)f(p)\,d^3\vec p \tag{315}$$

to represent the expectation value of the quantity $A(\vec p)$ for any given particle. Consequently, the expectation value of the density of A will be $n\langle A(\vec p)\rangle$, where n is the number density of the particles.

The energy of a particle of momentum $\vec p$ and mass m is given by

$$E(p) = \sqrt{m^2 + p^2}. \tag{316}$$

The energy and momentum densities of the gas are therefore given by

$$\rho_0 = n\langle E(p)\rangle = n\int E(p)f(p)\,d^3\vec p \quad\text{and}\quad \vec g = n\langle\vec p\rangle = n\int\vec p\,f(p)\,d^3\vec p, \tag{317}$$

respectively. Since f(p) only depends on the magnitude p and the integrand $\vec p\,f(p)$ is odd under $\vec p \to -\vec p$, it directly follows that $\vec g = 0$, i.e., we are indeed in the rest frame of the gas. For the energy density, we obtain

$$\rho_0 = 4\pi n\int_0^\infty\sqrt{m^2 + p^2}\,f(p)\,p^2\,dp, \tag{318}$$

where the factor $4\pi$ comes from the angular integration.

Let us now express the stress tensor $\sigma_{ij}$. Any given particle is carrying its momentum $\vec p$ with velocity $\vec p/E(p)$. The momentum current density is therefore of the form

$$\sigma_{ij} = n\langle p_i p_j/E(p)\rangle = n\int\frac{p_i p_j}{\sqrt{m^2 + p^2}}f(p)\,d^3\vec p. \tag{319}$$

If $i \neq j$, then the integrand is again odd under $p_i \to -p_i$ and therefore $\sigma_{ij}$ is non-zero only for i = j. Due to the rotational symmetry of f(p), we must therefore have $\sigma_{ij} = P\delta_{ij}$, where P is some constant. We can find P by considering the trace

$$\sigma_{ii} = 3P = 4\pi n\int_0^\infty\frac{p^4}{\sqrt{m^2 + p^2}}f(p)\,dp \tag{320}$$

or, equivalently,

$$P = \frac{4\pi n}{3}\int_0^\infty\frac{p^4}{\sqrt{m^2 + p^2}}f(p)\,dp. \tag{321}$$

We therefore conclude that a gas of this form is a perfect fluid with energy density $\rho_0$ and pressure P.

There are two special cases that are particularly important for cosmology. The first of those cases is the gas of massless particles, for which m = 0 (or where m is so small that it can be neglected in the integrands). In this case, we find that

$$\rho_0 = 4\pi n\int_0^\infty p^3 f(p)\,dp \quad\text{and}\quad P = \frac{4\pi n}{3}\int_0^\infty p^3 f(p)\,dp = \frac{\rho_0}{3}. \tag{322}$$

Thus, the equation of state parameter for this case is w = 1/3 and we refer to a gas of this type as a radiation gas. An example of this type of gas is a photon gas.

The second special case of interest is the case when the temperature of the gas is so low that $p \ll m$ for all p such that f(p) is significant. In this case, we instead find that

$$\rho_0 \simeq 4\pi n m\int_0^\infty p^2 f(p)\,dp = nm \tag{323}$$

and

$$P \simeq \frac{4\pi\rho_0}{3}\int_0^\infty\frac{p^4}{m^2}f(p)\,dp \ll \frac{4\pi\rho_0}{3}\int_0^\infty p^2 f(p)\,dp = \frac{\rho_0}{3}, \tag{324}$$

where we have used the normalisation requirement on f(p). In this situation, the pressure is therefore much smaller than the energy density and we have $w \simeq 0$. This type of gas is referred to as a matter gas or dust.
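The two limits can be illustrated numerically by evaluating the integrals in Eqs. (318) and (321) for a concrete distribution. The Python sketch below assumes a distribution of the Maxwell–Jüttner shape, $f(p) \propto e^{-E(p)/T}$, with a hypothetical temperature parameter T in units where the Boltzmann constant is one; the cut-off of the momentum integral and the number of integration steps are ad hoc choices.

```python
import math

def equation_of_state(m, T, steps=20000):
    """w = P / rho0 for f(p) proportional to exp(-E(p)/T) (Maxwell-Juttner shape).

    The normalisation of f and the number density n cancel in the ratio.
    """
    p_max = 30.0 * T + 12.0 * math.sqrt(m * T)   # cut-off where f is negligible
    h = p_max / steps
    energy_integral = 0.0
    pressure_integral = 0.0
    for i in range(steps):
        p = (i + 0.5) * h                         # midpoint rule
        energy = math.sqrt(m * m + p * p)
        weight = math.exp(-(energy - m) / T)      # common factor exp(-m/T) dropped
        energy_integral += energy * p ** 2 * weight
        pressure_integral += p ** 4 / energy * weight
    return pressure_integral / (3.0 * energy_integral)

w_radiation = equation_of_state(m=0.0, T=1.0)
w_matter = equation_of_state(m=1.0, T=1e-3)
print(w_radiation, w_matter)
```

For m = 0 the ratio is 1/3 regardless of the distribution, while for $T \ll m$ the result is of order T/m and therefore close to zero, in line with the radiation and matter limits above.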

The Universe started out being radiation dominated. As the Universe expanded and cooled down, radiation was diluted more than matter and the Universe became matter dominated. Recently (in cosmological terms), it seems that the Universe has become dominated by a new and exotic component that has the equation of state $w \simeq -1$, i.e., a component with negative pressure. We do not know what this component consists of, but it is generally referred to as dark energy.

Although this is all very interesting, understanding the interplay between perfect fluids and the expansion of the Universe is a topic for a course in general relativity and cosmology, and we will not examine it further here.

12.3 Conservation of energy and momentum in a continuum

We have interpreted the components $T^{0\mu}$ as the 4-momentum density of a continuum. By integrating this over a volume V at a fixed time $t_0$, we should therefore obtain the total 4-momentum inside that volume

$$P^\mu(t_0) = \int_{t=t_0}T^{0\mu}\,dV. \tag{325}$$

Now that we have studied relativity for some time, this expression should seem nasty and ugly, since it is not explicitly transforming as a 4-vector, but makes reference to the time components of $T^{\mu\nu}$ as well as the frame-dependent definition of volume. We can rectify this by rewriting $T^{0\mu}$ as

$$T^{0\mu} = V_\nu T^{\nu\mu}, \tag{326}$$

where V is the 4-velocity of an observer at rest in the system S we are considering. We now have the integral

$$P^\mu(t_0) = \int_{t=t_0}T^{\nu\mu}V_\nu\,dV. \tag{327}$$

We now note that $t = t_0$ defines a surface of simultaneity in S and that $V_\mu\,dV$ is the three-dimensional surface element $dS_\mu$, since $V_\mu$ is the normal to the surface of simultaneity. We therefore conclude that for any surface of simultaneity $\Sigma$, the 4-momentum crossing it in its rest frame is given by

$$P^\mu(\Sigma) = \int_\Sigma T^{\nu\mu}\,dS_\nu. \tag{328}$$

This expression is frame independent as long as we account for how $\Sigma$ is expressed in the frame where we are computing the integral.

So what about 4-momentum conservation? Assume that we have a continuum such that the stress-energy tensor vanishes sufficiently fast at spatial infinity. We can then write the difference between the 4-momenta at two different times as

$$P^\mu(t_2) - P^\mu(t_1) = P^\mu(\Sigma_2) - P^\mu(\Sigma_1) = \int_{\Sigma_2 - \Sigma_1}T^{\nu\mu}\,dS_\nu, \tag{329}$$

where the surface normal is future-directed in both integrals. Since the stress-energy tensor is assumed to vanish at spatial infinity, we can apply the divergence theorem to rewrite this as a 4-dimensional volume integral

$$P^\mu(t_2) - P^\mu(t_1) = \int_{t_1<t<t_2}\partial_\nu T^{\nu\mu}\,d^4x, \tag{330}$$

where $d^4x$ is the Minkowski space volume element dt dx dy dz, see Fig. 15. We recognise the integrand here as the force density on the continuum and if there is no external force density acting on it, then $\partial_\nu T^{\nu\mu} = 0$ and we conclude that

$$P^\mu(t_2) = P^\mu(t_1), \tag{331}$$

i.e., the 4-momentum is conserved.

While the expression for the total 4-momentum was written in a Lorentz invariant manner, it might still depend on the chosen surface of simultaneity. If we consider two surfaces of simultaneity $\Sigma$ and $\Sigma'$, belonging to two different frames, we can make an argument similar to the above and write

$$P^\mu(\Sigma) - P^\mu(\Sigma') = \int_{\Sigma - \Sigma'}T^{\nu\mu}\,dS_\nu. \tag{332}$$

Again assuming that the stress-energy tensor vanishes sufficiently fast at spatial infinity and applying the divergence theorem, we find that

$$P^\mu(\Sigma) - P^\mu(\Sigma') = \int_{M_{\Sigma\Sigma'}}\partial_\nu T^{\nu\mu}\,d^4x, \tag{333}$$

where $M_{\Sigma\Sigma'}$ is the space-time volume between $\Sigma$ and $\Sigma'$ (note that this volume has an inward pointing normal in some part of Minkowski space; the integral over this part should be taken with negative sign, see Fig. 16). If there is no external force acting on the continuum, we again find that $\partial_\nu T^{\nu\mu} = 0$ and therefore

$$P^\mu(\Sigma) = P^\mu(\Sigma'). \tag{334}$$

In the case of no external forces, it therefore does not matter which surface of simultaneity we use to define the total 4-momentum; it is frame independent and conserved.

Figure 15: The difference of the integrals over the surfaces of simultaneity at $t = t_2$ and $t = t_1$, respectively, can be written as a 4-dimensional integral over the Minkowski space region $t_1 < t < t_2$ by using the divergence theorem.

Figure 16: When the two surfaces of simultaneity are not parallel, i.e., not based on the same inertial frame, the integration volume will have one part that needs to be integrated with positive sign (red) and one that must be integrated with a negative sign (blue).

What happens in the case when there is a force density acting on the continuum? According to the above discussion, there should then be a possibility that the total 4-momenta crossing $\Sigma$ and $\Sigma'$ are different. This is rather simple to understand. In general, if we add some 4-momentum to the continuum at an event $x^\mu$, then this might have occurred before $\Sigma$, but after $\Sigma'$. Unless $\Sigma$ and $\Sigma'$ are parallel (in which case we end up in the same situation as before with 4-momentum conservation) there will be other events that occur before $\Sigma'$, but after $\Sigma$.

12.4 Angular momentum of a continuum

In classical mechanics, we often study the angular momentum

$$\vec L = \vec x \times \vec p. \tag{335}$$

The components of the angular momentum can be written as

$$L^i = \varepsilon_{ijk} x^j p^k, \tag{336}$$

but we can equally well represent these using the components of an anti-symmetric rank two angular momentum tensor

$$L^{ij} = \varepsilon_{ijk} L^k = \varepsilon_{ijk}\varepsilon_{k\ell m} x^\ell p^m = x^i p^j - x^j p^i. \tag{337}$$
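The equivalence of the vector and tensor representations in Eq. (337) can be checked with a few lines of code. The following is a minimal sketch, not part of the original notes: the function names and the test values for $\vec x$ and $\vec p$ are arbitrary choices made for illustration, and the indices 0, 1, 2 stand for the spatial directions $x$, $y$, $z$.

```python
def levi_civita(i, j, k):
    """Levi-Civita symbol for indices 0, 1, 2 (standing for x, y, z)."""
    return (i - j) * (j - k) * (k - i) // 2


def angular_momentum_vector(x, p):
    """L^i = eps_{ijk} x^j p^k, i.e. the cross product of x and p."""
    return [
        sum(levi_civita(i, j, k) * x[j] * p[k]
            for j in range(3) for k in range(3))
        for i in range(3)
    ]


def tensor_from_vector(L):
    """L^{ij} = eps_{ijk} L^k."""
    return [[sum(levi_civita(i, j, k) * L[k] for k in range(3))
             for j in range(3)] for i in range(3)]


def tensor_direct(x, p):
    """L^{ij} = x^i p^j - x^j p^i."""
    return [[x[i] * p[j] - x[j] * p[i] for j in range(3)] for i in range(3)]


# Arbitrary integer test values (an assumption of this sketch), chosen so
# that the comparison below is exact.
x = [1, 2, 3]
p = [-2, 4, 1]

L_vec = angular_momentum_vector(x, p)
L_from_vec = tensor_from_vector(L_vec)
L_direct = tensor_direct(x, p)

print(L_vec)                   # [-10, -7, 8]
print(L_from_vec == L_direct)  # True
```

The tensor form carries exactly the same three independent numbers as the vector, but it does not rely on the cross product, which is what makes it suitable for generalisation beyond three dimensions.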

Inspired by this, let us exchange the momentum $\vec p$ for the momentum density and see how we can generalise this to Minkowski space. We define the angular momentum density as

$$K^{\mu\nu\sigma} = x^\mu T^{\nu\sigma} - x^\nu T^{\mu\sigma}. \tag{338}$$

Assuming that there is no external force acting on the continuum, i.e., $\partial_\mu T^{\mu\nu} = 0$, we find that

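$$\partial_\sigma K^{\mu\nu\sigma} = \delta^\mu_\sigma T^{\nu\sigma} + x^\mu \partial_\sigma T^{\nu\sigma} - \delta^\nu_\sigma T^{\mu\sigma} - x^\nu \partial_\sigma T^{\mu\sigma} = T^{\nu\mu} - T^{\mu\nu} = 0. \tag{339}$$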


As a consequence, we find that $K^{\mu\nu\sigma}$ is a conserved current with respect to the $\sigma$ index and therefore the entire argument we made for the total 4-momentum goes through for the total angular momentum

$$L^{\mu\nu}(\Sigma) = \int_\Sigma K^{\mu\nu\sigma}\, dS_\sigma = \int_\Sigma (x^\mu T^{\nu\sigma} - x^\nu T^{\mu\sigma})\, dS_\sigma, \tag{340}$$

which therefore is a conserved quantity in the absence of external forces.

Based on the introduction to this section, we expect that the spatial components of this tensor should be the angular momentum tensor in classical mechanics. Indeed, we find that in any given frame

$$L^{ij}(t_0) = \int_{t=t_0} (x^i T^{j0} - x^j T^{i0})\, dV = \int_{t=t_0} (x^i p^j - x^j p^i)\, dV, \tag{341}$$

where $\vec p$ is the momentum density and therefore the integrand is the angular momentum density.

Since $L^{\mu\nu}$ is anti-symmetric, $L^{00} = 0$ by definition and we get no further information from this component. However, let us consider what we can learn from the mixed components $L^{0i}$. Expanding these components, we find that

$$L^{0i}(t_0) = \int_{t=t_0} (t\, T^{i0} - x^i T^{00})\, dV = t_0 \int_{t=t_0} p^i\, dV - \int_{t=t_0} \rho\, x^i\, dV. \tag{342}$$

Multiplying this by $\vec e_i$, we find that

$$E \vec X(t_0) \equiv \vec e_i L^{0i}(t_0) = t_0 \vec P(t_0) - E \vec x_c(t_0), \tag{343}$$

where $\vec P(t_0)$ is the total momentum at time $t_0$ and we have defined the center of mass (more precisely, the center of energy)

$$\vec x_c(t_0) = \frac{1}{E} \int_{t=t_0} \rho\, \vec x\, dV \tag{344}$$

and the total energy

$$E = \int_{t=t_0} \rho\, dV. \tag{345}$$

If there are no external forces, then $\vec X(t_0) = \vec X_0$, $\vec P(t_0) = \vec P_0$, and $E$ are constants and we find the relation

$$\vec x_c(t_0) = \frac{\vec P_0}{E}\, t_0 - \vec X_0. \tag{346}$$

This just states that the center of mass is in uniform motion with velocity $\vec v = \vec P_0/E$.
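A discrete analogue makes this statement concrete. The sketch below replaces the continuum by a handful of free point particles in one spatial dimension (an assumption of this example, as are the particle data and all function names), so that the integrals in Eqs. (342)-(345) become sums. It uses units with $c = 1$ and checks that $L^{01}$ stays constant and that the center of energy drifts with velocity $P/E$.

```python
import math

# Hypothetical free particles in one spatial dimension, units with c = 1:
# (rest mass, position at t = 0, velocity). The values are arbitrary.
PARTICLES = [(1.0, 0.0, 0.6), (2.0, 1.0, -0.3), (0.5, -2.0, 0.8)]


def energy_and_momentum(mass, velocity):
    """Relativistic energy and momentum of a single particle."""
    gamma = 1.0 / math.sqrt(1.0 - velocity ** 2)
    return gamma * mass, gamma * mass * velocity


def totals():
    """Total energy E and total momentum P (constant for free particles)."""
    pairs = [energy_and_momentum(m, v) for m, _, v in PARTICLES]
    return sum(e for e, _ in pairs), sum(p for _, p in pairs)


def center_of_energy(t):
    """Discrete analogue of Eq. (344): energy-weighted mean position."""
    total_energy, _ = totals()
    weighted = sum(energy_and_momentum(m, v)[0] * (x0 + v * t)
                   for m, x0, v in PARTICLES)
    return weighted / total_energy


def mixed_component(t):
    """Discrete analogue of Eq. (343): L^{01}(t) = t P - E x_c(t)."""
    total_energy, total_momentum = totals()
    return t * total_momentum - total_energy * center_of_energy(t)


E, P = totals()
drift_velocity = (center_of_energy(5.0) - center_of_energy(0.0)) / 5.0

print(mixed_component(0.0), mixed_component(5.0))  # equal up to rounding
print(drift_velocity, P / E)                       # equal up to rounding
```

For free particles the check works because each particle satisfies $p = E v$, so the time-dependent parts of $t P$ and $E x_c(t)$ cancel exactly.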


12.4.1 Continua and external forces

We have treated the situation where there is no external force acting on the continuum. If we instead include an external force such that

$$\partial_\mu T^{\mu\nu} = f^\nu, \tag{347}$$

where $f^\mu$ is the 4-force density, then we can start from Eq. (330) with $t_2 = t_0$ and $t_1 = 0$ and find that

$$P^\mu(t_0) - P^\mu(0) = \int_{t=0}^{t_0} \int_{\mathbb{R}^3} f^\mu\, dV\, dt. \tag{348}$$

Differentiation with respect to $t_0$ now gives

$$\frac{dP^\mu}{dt_0} = \int f^\mu\, dV = F^\mu, \tag{349}$$

where $F^\mu$ is the total 4-force acting on the continuum at time $t = t_0$. Note that this is not exactly the same 4-force that we discussed when considering a single point-like object, where we could define the 4-force relative to the proper time along the object's world-line. Here we have instead differentiated the 4-momentum with respect to the time $t_0$ in some given frame. You will not be able to get the 4-force in $S'$ by Lorentz transforming this $F^\mu$, as the 4-force in $S'$ is defined using a different surface of simultaneity and a different volume element. However, this $F^\mu$ will tell us something particular about the frame we are considering.

The above definition of $f^\mu$ can be inserted into the computation of the angular momentum tensor $L^{\mu\nu}$. Where the divergence of $K^{\mu\nu\sigma}$ previously vanished, we now find that

$$\partial_\sigma K^{\mu\nu\sigma} = x^\mu f^\nu - x^\nu f^\mu. \tag{350}$$

The conservation of angular momentum now becomes

$$L^{\mu\nu}(t_0) - L^{\mu\nu}(0) = \int_{t=0}^{t_0} \int (x^\mu f^\nu - x^\nu f^\mu)\, dV\, dt. \tag{351}$$

Differentiation with respect to t0 therefore gives

$$\frac{dL^{\mu\nu}}{dt_0} = \int (x^\mu f^\nu - x^\nu f^\mu)\, dV. \tag{352}$$

For the purely spatial part of $L^{\mu\nu}$ this leads us to

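$$\frac{d\vec L}{dt_0} = \frac{1}{2} \vec e_i \varepsilon_{ijk} \frac{dL^{jk}}{dt_0} = \int \vec e_i \varepsilon_{ijk} x^j f^k\, dV = \int \vec x \times \vec f\, dV. \tag{353}$$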


Since $\vec f$ is the force density, it is clear that $\vec x \times \vec f$ is the torque density and this equation therefore states that the rate of change of the overall angular momentum is equal to the integral of the torque density over all of space. As for the interpretation of $L^{ij}$ as the angular momentum, this should be rather expected from the construction of $L^{\mu\nu}$.

We obtain a bit more insight when we consider the time derivative of $L^{0i}$, which is given by

$$\frac{dL^{0i}}{dt_0} = \int (t_0 f^i - f^0 x^i)\, dV = t_0 F^i - F^0 x^i_f, \tag{354}$$

where we have defined the center of power

$$\vec x_f = \frac{1}{F^0} \int f^0\, \vec x\, dV, \tag{355}$$

which describes where (on average) energy is being added to the continuum. Differentiating our earlier expression for $L^{0i}(t_0)$ with respect to $t_0$, we find that

$$\frac{dL^{0i}}{dt_0} = P^i + t_0 F^i - F^0 x^i_c - E \frac{dx^i_c}{dt_0}. \tag{356}$$

Equating the two different expressions for $dL^{0i}/dt_0$ and solving for $dx^i_c/dt_0$, we find that

$$\frac{dx^i_c}{dt_0} = \frac{P^i}{E} + \frac{F^0}{E} (x^i_f - x^i_c). \tag{357}$$

The first of these terms describes the usual motion of the center of mass due to the inherent momentum in the continuum. The second term, however, is quite peculiar, but has a straightforward interpretation. The quantity $F^0/E$ describes the relative rate of energy increase in the system, as it is equal to $d\ln(E)/dt_0$. The factor $x^i_f - x^i_c$ describes the displacement between the center of power and the center of energy, and the equation tells us that the center of energy will move towards the center of power with a velocity that depends on the power relative to the energy that is already in the system. This makes sense! If we add a lot of energy at a point $\vec x_f$ and there was little energy in the system from the beginning, the center of energy should quickly adapt and stabilise at $\vec x_f$. Note that there is nothing stopping the second term from giving a speed larger than one, as it depends on an external addition of energy to a given point, not on parts of the continuum itself transmitting energy or momentum.
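The behaviour of the second term can be illustrated numerically. The following sketch makes several simplifying assumptions that are not in the notes: one spatial dimension, no momentum and no spatial force ($P^i = 0$, $F^i = 0$), and a constant power $F^0$ delivered at a fixed point $x_f$, so that $E(t_0) = E_0 + F^0 t_0$ by the $\mu = 0$ component of Eq. (349). The parameter values and function names are arbitrary. Under these assumptions Eq. (357) is solved by the energy-weighted average of the initial center of energy and $x_f$, which the code compares against a standard fourth-order Runge-Kutta integration.

```python
# Assumed parameters for this sketch (units with c = 1).
E0 = 1.0    # energy in the system at t0 = 0
F0 = 5.0    # constant power F^0 added to the system
X_F = 1.0   # fixed center of power
X_C0 = 0.0  # center of energy at t0 = 0


def velocity(t, x_c):
    """Right-hand side of Eq. (357) with P^i = 0 and E(t) = E0 + F0 * t."""
    return F0 / (E0 + F0 * t) * (X_F - x_c)


def integrate_rk4(t_end, steps=2000):
    """Integrate dx_c/dt0 = velocity(t0, x_c) from t0 = 0 to t_end."""
    h = t_end / steps
    t, x = 0.0, X_C0
    for _ in range(steps):
        k1 = velocity(t, x)
        k2 = velocity(t + h / 2, x + h * k1 / 2)
        k3 = velocity(t + h / 2, x + h * k2 / 2)
        k4 = velocity(t + h, x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return x


def exact(t):
    """Energy-weighted average of the initial center of energy and X_F."""
    return (E0 * X_C0 + F0 * t * X_F) / (E0 + F0 * t)


print(velocity(0.0, X_C0))             # initial speed of the center of energy
print(integrate_rk4(2.0), exact(2.0))  # numerical and closed-form x_c(2)
```

With these values the initial speed of the center of energy is $F^0 (x_f - x_c)/E_0 = 5$, i.e., larger than one, and $x_c$ then approaches $x_f$ as the added energy comes to dominate the total.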
