A detailed solution manual and guide for Schutz’s First Course in General Relativity (Schutz, 2009) Robert B. Scott, 1* 1 Department of Physics, University of Brest, Brest, France *To whom correspondence should be addressed; E-mail: [email protected] February 10, 2012


Note to user. This manual is to be used as a companion to the textbook “A First Course in General Relativity”, by Bernard Schutz, 2nd edition, published 2009 by Cambridge University Press. It will only make sense when read with Schutz’s text. Herein you’ll find brief notes meant to clarify and amplify his textbook, presented in the same order as his textbook. Most importantly you’ll find solutions to almost all the exercises. These are presented in great detail, with each step explained including references to the text. You’ll also find my supplementary problems that are meant to establish intermediate goals necessary to solve Schutz’s exercises or to amplify concepts not covered by his exercises. Comments should be sent to [email protected].

Chapter 1

Special Relativity

1.1 Fundamental principles of special relativity (SR) theory

In footnote 4 on p. 3, the answer to the first question is “no, the soup is unaffected by the acceleration experienced by an astronaut in orbit.” This would appear to also cause problems for SR, since how do we know that an observer is in an inertial frame? The acceleration cannot, necessarily, be measured locally. And there’s no special reference frame with which to measure one’s acceleration.

1.2 Definition of an inertial observer in SR

Gives a “geometrical” definition of an inertial reference frame, or coordinate system.

Notes that gravity makes it impossible to construct such an inertial coordinate system.

1.3 New units

Introduces what Misner et al. (1973) called “geometric units”, wherein time is measured in distance of light travel.

The stated motivation is that $c = 3 \times 10^{8}$ m/s in SI is a “ridiculous value”. I disagree, since then 1/3 of a second becomes the ridiculously large $10^{5}$ km! A more useful motivation comes from

• velocity becoming a dimensionless parameter,

• space-time diagrams having the same units on all axes, and

• the world lines of light paths having unit slope.

1.4 Spacetime diagrams

Typo: Fig. 1.4: $\mathbf{v}$ is of course a vector, so one should replace this with $v = |\mathbf{v}|$.

1.5 Construction of the coordinates used by another observer

This is an extremely important section. Unfortunately he doesn’t explain why the angle of the $\bar{x}$-axis to the $x$-axis is $\phi = \arctan(v)$, where $v = |\mathbf{v}|$ is the magnitude of the velocity of $\bar{O}$ along the $x$-axis. Rather, this result appears in Fig. 1.5 without explanation, without even being relegated to an “exercise for the student”. The result does follow from the construction of the $\bar{x}$-axis, but the steps involved are not trivial. Please see supplementary problem SP.1 in section 1.15.

1.6 Invariance of the interval

This section purports to provide a proof of the invariance of the interval. But bear in mind that it assumes that the relationship between coordinates in different frames is linear; see the discussion before Eq. (1.2).

He reduces the relationship between the interval in one frame and another to a function of the relative velocity of their origins, see Eq. (1.5) on p. 10,

$$\Delta\bar{s}^2 = -M_{00}\,\Delta s^2 = \phi(\mathbf{v})\,\Delta s^2.$$

To show that $\phi(\mathbf{v})$ depends only upon the magnitude of the velocity, and not upon its direction, he considers the case of a rod (or two events A and B at the ends of the rod) lying along the $y$-axis. A and B are simultaneous in O, and he argues that they are therefore also simultaneous in $\bar{O}$, by constructing the $\bar{y}$-axis as he did in Fig. 1.3. But now the velocity of $\bar{O}$ is orthogonal to the constructed axis, so of course the simultaneity of events is not changed by the coordinate transform. The intermediate result is that the space-time interval between A and B in either frame is just the square of the length, so their ratio is the sought-after $\phi(\mathbf{v})$. Now the subtle point is that he then claims that this ratio cannot depend on the direction of the velocity, because the rod is perpendicular to it and there are no preferred directions?! So what??

I think the solution is that $\mathbf{v}$ could be in an arbitrary direction in the $x$-$z$ plane. The ratio of lengths should not depend upon this direction, because then there would be preferred directions. But as far as I can tell, this only shows that the direction of the component orthogonal to the $y$-axis cannot influence $\phi(\mathbf{v})$.

1.7 Invariant hyperbolae

At the end of the section it’s stated that

“The lesson of Fig. 1.12b is that the tangent to a hyperbola at any event P is a line of simultaneity of the Lorentz frame whose time axis joins P to the origin. If this frame has velocity v, the tangent has slope v.”

The above is stated without proof or even a hint that there’s some calculation involved. Fortunately it proceeds straightforwardly. We seek the slope of the tangent to a hyperbola. Differentiate any timelike hyperbola, $-t^2 + x^2 = \mathrm{const}$, with respect to $x$, to obtain in general

$$\frac{dt}{dx} = \frac{x}{t}.$$

At some point P the slope of the tangent with respect to the $x$-axis is $x_P/t_P$. Now if the $\bar{t}$-axis is chosen to go through the origin and P, its slope with respect to the $t$-axis will also be $x_P/t_P$, corresponding to $\tan(\phi) = v = x_P/t_P$. But we know from Fig. 1.5 that the corresponding $\bar{x}$-axis will have slope $v$ relative to the $x$-axis. That is, the tangent is parallel to the $\bar{x}$-axis, and is therefore a line of simultaneity for $\bar{O}$. QED.
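The tangent-slope argument can be checked numerically. The following is only a sketch: the speed `v = 0.6`, the event time `t_p = 2.0` and the helper name `tangent_slope` are my own illustrative choices, not anything from Schutz.

```python
import math

def tangent_slope(x_p, t_p, h=1e-6):
    """Central-difference estimate of dt/dx along the hyperbola
    -t^2 + x^2 = -a^2 that passes through the event (x_p, t_p)."""
    a2 = t_p**2 - x_p**2          # a^2 > 0 because the event is timelike

    def t_of_x(x):
        return math.sqrt(a2 + x**2)   # upper branch of the hyperbola

    return (t_of_x(x_p + h) - t_of_x(x_p - h)) / (2 * h)

v = 0.6            # assumed speed of the barred frame (illustrative)
t_p = 2.0          # arbitrary event P on the t-bar axis, which obeys x = v t
x_p = v * t_p
slope = tangent_slope(x_p, t_p)
print(slope)       # numerically close to v, the slope of the x-bar axis
```

The estimate agrees with $x_P/t_P = v$ to within the accuracy of the finite difference.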


1.8 Particularly important results

Time dilation. This was straightforward once one uses the invariant hyperbolae. The event $x_B$ was constructed so that it had $\bar{t} = 1$. The corresponding event in O is obtained by tracing the point back to the $t$-axis along the hyperbola with the same interval, $\Delta s^2 = -1$,

$$-t^2 + x^2 = -1.$$

One must also note that the equation for the $\bar{t}$-axis is $t = x/v$. Substituting this into the hyperbola,

$$-t_B^2 + x_B^2 = -1 \qquad (1.1)$$

$$-t_B^2 + (t_B v)^2 = -1 \qquad (1.2)$$

$$t_B = \frac{1}{\sqrt{1 - v^2}} \qquad (1.3)$$

This gives Eq. (1.8).

Lorentz contraction. I still don’t see how he came up with

$$x_C = \frac{l}{\sqrt{1 - v^2}}.$$

But I obtain the same end result using instead the invariance of the interval, which in $\bar{O}$ is

$$\Delta\bar{s}^2_{AC} = -\Delta\bar{t}^2 + \Delta\bar{x}^2 = -\bar{t}^2 + \bar{x}^2 = 0 + l^2 = l^2,$$

and therefore must also be $l^2$ in O. I also used the equation for the $\bar{x}$-axis, $t = vx$. This was confusing at first since the units look wrong! But it’s clear when you go back to Fig. 1.5 and note that $\tan(\phi) = v$, which was obvious for the $\bar{t}$-axis since the observer $\bar{O}$ is moving along the $x$-axis at speed $v$. That the $\bar{x}$-axis was also inclined at the same angle $\phi$ was more complicated. One also needs a third relation, which is simply $x_C - x_B = v\,t_C$. A little algebra gives the Lorentz contraction:

$$x_B = l\sqrt{1 - v^2}.$$
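The algebra behind both results of this section is short enough to check numerically. This is a sketch under my own assumptions: the values `v = 0.6` and `l = 1.0` and the function names are illustrative only.

```python
import math

def time_dilation_tB(v):
    """t of the event where the t-bar axis (x = v t) meets -t^2 + x^2 = -1."""
    # -t^2 + (v t)^2 = -1  =>  t = 1 / sqrt(1 - v^2)
    return 1.0 / math.sqrt(1.0 - v**2)

def rod_ends(v, l):
    """Return (x_C, t_C, x_B) for a rod of rest length l in the barred frame."""
    x_c = l / math.sqrt(1.0 - v**2)  # from -t_C^2 + x_C^2 = l^2 with t_C = v x_C
    t_c = v * x_c                    # C lies on the x-bar axis, t = v x
    x_b = x_c - v * t_c              # third relation: x_C - x_B = v t_C
    return x_c, t_c, x_b

v, l = 0.6, 1.0                      # illustrative values
x_c, t_c, x_b = rod_ends(v, l)
print(time_dilation_tB(v))           # Lorentz factor for v = 0.6
print(x_b, l * math.sqrt(1 - v**2))  # both give the contracted length
```

The interval between A (the origin) and C evaluates to $l^2$, as the invariance argument requires.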


1.9 Lorentz transformation

The first step, substituting the equations for the $\bar{O}$ axes, proceeds immediately to

$$\bar{t} = \alpha(t - v x) \qquad (1.4)$$

$$\bar{x} = \sigma(x - v t) \qquad (1.5)$$

I had trouble seeing how $\alpha = \sigma$ from the path of a light ray, so I used the invariant hyperbolae instead. Substituting (1.4) and (1.5) into the equation for the interval from the origin gives

$$\Delta\bar{s}^2 = -\bar{t}^2 + \bar{x}^2 = \Delta s^2 = -t^2 + x^2.$$

After the substitution, the cross term involving $x t$ must be zero, giving $\alpha^2 = \sigma^2$. Equating either of the other terms gives the Lorentz factor,

$$\alpha^2 = \frac{1}{1 - v^2}, \qquad \alpha = \frac{1}{\sqrt{1 - v^2}}.$$

As Schutz (2009, p. 22) points out, the positive root is selected so that the coordinates are not inverted when $v = 0$.

In retrospect, it is clear how the path of a light ray gives $\alpha = \sigma$. Simply note that the world line of a light ray has $\Delta x = \pm\Delta t$ and $\Delta\bar{x} = \pm\Delta\bar{t}$. Substitution into (1.4) and (1.5) gives

$$\frac{\sigma}{\alpha}\left(\frac{\Delta x - v\,\Delta t}{\Delta t - v\,\Delta x}\right) = \frac{\Delta\bar{x}}{\Delta\bar{t}} = 1.$$

So $\sigma = \alpha$.

The Lorentz transformation is often said to reduce to the Galilean transformation in the limit $v \ll 1$, but that’s not strictly true. Unlike for the Galilean transformation, in the Lorentz transformation time is affected at large distances even for small velocities.
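A small numerical illustration of this last remark, with numbers I picked purely for illustration: for a slow boost but a very distant event, the transformed time differs from the Galilean one by roughly $-vx$, which need not be small.

```python
import math

def lorentz(t, x, v):
    """Lorentz boost along x in geometric units (c = 1)."""
    g = 1.0 / math.sqrt(1.0 - v**2)
    return g * (t - v * x), g * (x - v * t)

def galilean(t, x, v):
    """Galilean transformation: time is left untouched."""
    return t, x - v * t

v = 1e-4            # small speed in units of c (assumed value)
t, x = 1.0, 1e6     # a distant event; t and x in the same units
t_l, x_l = lorentz(t, x, v)
t_g, x_g = galilean(t, x, v)
print(t_l - t_g)    # approximately -v * x, i.e. about -100
print(x_l - x_g)    # by contrast, the positions agree closely
```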

1.10 Velocity composition law

1.11 Paradoxes and physical intuition

1.12 Further reading

A more thoughtful look at fundamentals, Bohm (2008).


1.13 Appendix: the twin paradox dissected

1.14 Exercises

1.14.1 Convert to geometric units

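a)

$$10\ \mathrm{J} = 10\ \mathrm{N\,m} = 10\ \mathrm{kg\,m^2\,s^{-2}} = \frac{10}{9 \times 10^{16}}\ \mathrm{kg} = 1.11 \times 10^{-16}\ \mathrm{kg}.$$

b)

$$100\ \mathrm{W} = 100\ \mathrm{J\,s^{-1}} = 1.11 \times 10^{-15}\ \mathrm{kg\,s^{-1}} = \frac{1.11 \times 10^{-15}}{3 \times 10^{8}}\ \mathrm{kg\,m^{-1}} = 0.371 \times 10^{-23}\ \mathrm{kg\,m^{-1}}$$

c)

$$\hbar = 1.05 \times 10^{-34}\ \mathrm{J\,s} = \frac{1.05 \times 10^{-34}\ \mathrm{J\,s}}{3 \times 10^{8}\ \mathrm{m\,s^{-1}}} = 0.352 \times 10^{-42}\ \mathrm{kg\,m}$$

d) Car velocity [108 km/hr]

$$v = 30\ \mathrm{m\,s^{-1}} = 10^{-7}$$

e) Car momentum

$$p = 30\ \mathrm{m\,s^{-1}} \times 1000\ \mathrm{kg} = 10^{-4}\ \mathrm{kg}$$

f) Atmospheric pressure,

$$1\ \mathrm{bar} = 10^{5}\ \mathrm{N\,m^{-2}} = \frac{10^{5}\ \mathrm{kg\,m\,s^{-2}}}{9 \times 10^{16}\ \mathrm{m^{4}\,s^{-2}}} = 1.1 \times 10^{-12}\ \mathrm{kg\,m^{-3}}$$

g) Water density

$$10^{3}\ \mathrm{kg\,m^{-3}}$$

h) Luminosity flux

$$10^{6}\ \mathrm{J\,s^{-1}\,cm^{-2}} = 10^{10}\ \mathrm{J\,s^{-1}\,m^{-2}} = \frac{10^{10}\ \mathrm{J\,s^{-1}\,m^{-2}} \times 1.11 \times 10^{-17}\ \mathrm{kg\,J^{-1}}}{3 \times 10^{8}\ \mathrm{m\,s^{-1}}} = 3.71 \times 10^{-16}\ \mathrm{kg\,m^{-3}}$$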
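All of these conversions follow one rule: every power of the second in the SI unit is traded for the same power of ($c$ metres). A minimal sketch of that rule, assuming $c = 2.998 \times 10^{8}$ m/s; the helper `to_geometric` and its `seconds_power` argument are names I made up for this illustration.

```python
C = 2.998e8  # speed of light in m/s (assumed value for this sketch)

def to_geometric(value_si, seconds_power):
    """Convert an SI value to geometric units (c = 1).

    seconds_power is the exponent of the second in the SI unit,
    e.g. -2 for the joule (kg m^2 s^-2). Each s is replaced by C metres.
    """
    return value_si * C**seconds_power

energy = to_geometric(10.0, -2)       # 10 J = 10 kg m^2 s^-2      -> kg
power = to_geometric(100.0, -3)       # 100 W = 100 kg m^2 s^-3    -> kg/m
hbar = to_geometric(1.05e-34, -1)     # kg m^2 s^-1                -> kg m
speed = to_geometric(30.0, -1)        # m s^-1                     -> dimensionless
momentum = to_geometric(30.0 * 1000.0, -1)  # kg m s^-1            -> kg
pressure = to_geometric(1.0e5, -2)    # 1 bar = 1e5 kg m^-1 s^-2   -> kg m^-3
flux = to_geometric(1.0e10, -3)       # 1e10 J s^-1 m^-2 = kg s^-3 -> kg m^-3

for name, value in [("energy", energy), ("power", power), ("hbar", hbar),
                    ("speed", speed), ("momentum", momentum),
                    ("pressure", pressure), ("flux", flux)]:
    print(f"{name}: {value:.3e}")
```

Water density (part g) contains no seconds, so it is unchanged.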


1.14.2 Convert from natural units (c = 1) to SI units

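2 (a) Velocity, $v = 10^{-2}$:

$$v = 10^{-2} \times c\ [\mathrm{m\,s^{-1}}] = 3 \times 10^{6}\ [\mathrm{m\,s^{-1}}]$$

2 (b) Pressure, $10^{19}$ [kg m$^{-3}$]:

$$10^{19}\ [\mathrm{kg\,m^{-3}}] \times c^2\ [\mathrm{m^{2}\,s^{-2}}] = 9 \times 10^{35}\ [\mathrm{N\,m^{-2}}]$$

2 (c) Time, $10^{18}$ [m]:

$$\frac{10^{18}\ [\mathrm{m}]}{c\ [\mathrm{m\,s^{-1}}]} = 3.3 \times 10^{9}\ [\mathrm{s}]$$

2 (d) Energy density, 1 [kg m$^{-3}$]:

$$1\ [\mathrm{kg\,m^{-3}}] \times c^2\ [\mathrm{m^{2}\,s^{-2}}] = 9 \times 10^{16}\ [\mathrm{J\,m^{-3}}]$$

2 (e) Acceleration, 10 [m$^{-1}$]:

$$10\ [\mathrm{m^{-1}}] \times c^2\ [\mathrm{m^{2}\,s^{-2}}] = 9 \times 10^{17}\ [\mathrm{m\,s^{-2}}]$$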
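Going back to SI is the inverse operation: restore the missing powers of the second by dividing by the matching power of $c$. A sketch, again assuming $c = 2.998 \times 10^{8}$ m/s and using a helper name (`to_si`) of my own invention.

```python
C = 2.998e8  # speed of light in m/s (assumed value for this sketch)

def to_si(value_geom, seconds_power):
    """Convert a geometric-unit value back to SI.

    seconds_power is the exponent of the second in the target SI unit,
    e.g. -1 for a velocity in m s^-1, +1 for a time in s.
    """
    return value_geom / C**seconds_power

velocity = to_si(1e-2, -1)        # dimensionless -> m s^-1
pressure = to_si(1e19, -2)        # kg m^-3       -> N m^-2 = kg m^-1 s^-2
time = to_si(1e18, 1)             # m             -> s
energy_density = to_si(1.0, -2)   # kg m^-3       -> J m^-3
acceleration = to_si(10.0, -2)    # m^-1          -> m s^-2

print(velocity, pressure, time, energy_density, acceleration)
```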

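1.14.6 Show that Eq. (1.2) contains only $M_{\alpha\beta} + M_{\beta\alpha}$ when $\alpha \neq \beta$, not $M_{\alpha\beta}$ and $M_{\beta\alpha}$ independently. Argue that this allows us to set $M_{\alpha\beta} = M_{\beta\alpha}$ without loss of generality.

$$\Delta\bar{s}^2 = \sum_{\alpha=0}^{3}\sum_{\beta=0}^{3} M_{\alpha\beta}\,(\Delta x^{\alpha})(\Delta x^{\beta})$$

Pick a pair of indices, $\alpha = \alpha'$ and $\beta = \beta'$ say, where $\alpha' \neq \beta'$, and $\alpha' \in \{0, \ldots, 3\}$ and $\beta' \in \{0, \ldots, 3\}$. So $\Delta\bar{s}^2$ contains a term like

$$M_{\alpha'\beta'}\,(\Delta x^{\alpha'})(\Delta x^{\beta'}).$$

But $\Delta\bar{s}^2$ also contains a term like

$$M_{\beta'\alpha'}\,(\Delta x^{\beta'})(\Delta x^{\alpha'}) = M_{\beta'\alpha'}\,(\Delta x^{\alpha'})(\Delta x^{\beta'}).$$

The equality follows because of course the product does not depend upon the order of the factors. So we can group these two terms and factor out the $(\Delta x^{\alpha'})(\Delta x^{\beta'})$, leaving

$$(\Delta x^{\alpha'})(\Delta x^{\beta'})\,(M_{\alpha'\beta'} + M_{\beta'\alpha'}).$$

Because the off-diagonal terms always appear in pairs as above, we could without changing the interval (and therefore without loss of generality) replace them with their mean value

$$\tilde{M}_{\alpha\beta} \equiv (M_{\alpha\beta} + M_{\beta\alpha})/2.$$

Thus the new tensor $\tilde{M}_{\alpha\beta}$ is by construction symmetric.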
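The claim that symmetrizing leaves the quadratic form untouched is easy to test with random numbers. This is just a sketch; the random matrix and displacement are arbitrary and the function names are mine.

```python
import random

def quadratic_form(M, dx):
    """Evaluate sum over alpha, beta of M[alpha][beta] dx[alpha] dx[beta]."""
    return sum(M[a][b] * dx[a] * dx[b] for a in range(4) for b in range(4))

def symmetrize(M):
    """Replace each entry by the mean of itself and its transpose partner."""
    return [[0.5 * (M[a][b] + M[b][a]) for b in range(4)] for a in range(4)]

random.seed(0)  # fixed seed so the check is reproducible
M = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(4)]
dx = [random.uniform(-1, 1) for _ in range(4)]

M_sym = symmetrize(M)
difference = quadratic_form(M, dx) - quadratic_form(M_sym, dx)
print(difference)   # zero up to rounding error
```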

1.14.7 In the discussion leading up to Eq. (1.2), assume that the coordinates of $\bar{O}$ are given as the following linear combinations of those of O:

$$\bar{t} = \alpha t + \beta x, \qquad (1.6)$$

$$\bar{x} = \mu t + \nu x, \qquad (1.7)$$

$$\bar{y} = a y, \qquad (1.8)$$

$$\bar{z} = b z, \qquad (1.9)$$

where $\alpha$, $\beta$, $\mu$, $\nu$, $a$, and $b$ may be functions of the velocity $\mathbf{v}$ of $\bar{O}$ relative to O, but they do not depend on the coordinates. Find the numbers $M_{\alpha\beta}$, $\alpha, \beta = 0, \ldots, 3$ of Eq. (1.2) in terms of $\alpha$, $\beta$, $\mu$, $\nu$, $a$, and $b$.

First note that the origins of the two coordinate systems line up, and that $\Delta t = t$ etc. Then the result follows from straightforward substitution of (1.6) to (1.9) into Eq. (1.1):

$$\Delta\bar{s}^2 = -\Delta\bar{t}^2 + \Delta\bar{x}^2 + \Delta\bar{y}^2 + \Delta\bar{z}^2 \qquad (1.10)$$

$$= -(\alpha\Delta t + \beta\Delta x)^2 + (\mu\Delta t + \nu\Delta x)^2 + (a\Delta y)^2 + (b\Delta z)^2 \qquad (1.11)$$

Grouping terms we find that $(-\alpha^2 + \mu^2)$ multiplies $\Delta t^2$, so $M_{00} = -\alpha^2 + \mu^2$. Similarly, the term multiplying $\Delta x^2$ is $M_{11} = -\beta^2 + \nu^2$. The cross terms give $M_{01} = M_{10} = -\alpha\beta + \mu\nu$, and the remaining diagonal terms are $M_{22} = a^2$, $M_{33} = b^2$. Other cross terms are nil.
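As a sanity check on these expressions, one can plug in the coefficients of an actual Lorentz boost and confirm that $M_{\alpha\beta}$ collapses to diag(−1, 1, 1, 1). The boost coefficients and the speed `v = 0.6` below are my own illustrative choice, not part of the exercise.

```python
import math

def m_matrix(alpha, beta, mu, nu, a, b):
    """Build M from the coefficients of the linear map, per the result above."""
    M = [[0.0] * 4 for _ in range(4)]
    M[0][0] = -alpha**2 + mu**2
    M[1][1] = -beta**2 + nu**2
    M[0][1] = M[1][0] = -alpha * beta + mu * nu
    M[2][2] = a**2
    M[3][3] = b**2
    return M

v = 0.6                                   # illustrative speed
g = 1.0 / math.sqrt(1.0 - v**2)
# Lorentz boost: t_bar = g t - g v x,  x_bar = -g v t + g x,  y_bar = y,  z_bar = z
M = m_matrix(alpha=g, beta=-g * v, mu=-g * v, nu=g, a=1.0, b=1.0)
for row in M:
    print(row)    # numerically diag(-1, 1, 1, 1)
```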

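1.14.8 a) Derive Eq. (1.3) from Eq. (1.2) for general $M_{\alpha\beta}$.

Start with Eq. (1.2),

$$\Delta\bar{s}^2 = M_{\alpha\beta}\,\Delta x^{\alpha}\Delta x^{\beta}.$$

Expanding the sum into its time and space parts,

$$\Delta\bar{s}^2 = M_{00}\Delta t^2 + M_{0i}\Delta x^{i}\Delta t + M_{i0}\Delta x^{i}\Delta t + M_{ij}\Delta x^{i}\Delta x^{j}.$$

Note that $M_{i0} = M_{0i}$ (problem 6). Consider the case $\Delta s^2 = 0$, so from Eq. (1.1), $\Delta t = \Delta r = \sqrt{\Delta x^2 + \Delta y^2 + \Delta z^2}$. Then,

$$\Delta\bar{s}^2 = M_{00}\Delta r^2 + 2M_{0i}\Delta x^{i}\Delta r + M_{ij}\Delta x^{i}\Delta x^{j},$$

which is Eq. (1.3).

b) Since $\Delta\bar{s}^2 = 0$ in Eq. (1.3) for any $\Delta x^{i}$, replace $\Delta x^{i}$ by $-\Delta x^{i}$ in Eq. (1.3) and subtract the resulting equations from Eq. (1.3) to establish that $M_{0i} = 0$ for $i = 1, 2, 3$.

We have set $\Delta s^2 = 0$ and it followed, based upon the universality of the speed of light, that $\Delta\bar{s}^2 = 0$. Note that changing $\Delta x^{i}$ to $-\Delta x^{i}$ does not change $\Delta r$ nor $\Delta s$. So that’s why $\Delta\bar{s}^2 = 0$ in Eq. (1.3).

The only term in Eq. (1.3) to change sign when changing $\Delta x^{i}$ to $-\Delta x^{i}$ is the $2M_{0i}\Delta x^{i}\Delta r$ term. The final term doesn’t because changing $\Delta x^{i}$ to $-\Delta x^{i}$ also changes $\Delta x^{j}$ to $-\Delta x^{j}$; the $i$ is just a dummy index. So when we subtract from Eq. (1.3) the following

$$\Delta\bar{s}^2 = M_{00}\Delta r^2 - 2M_{0i}\Delta x^{i}\Delta r + M_{ij}\Delta x^{i}\Delta x^{j}$$

we’re left with

$$0 = 4M_{0i}\Delta x^{i}\Delta r.$$

This must be true for arbitrary $\Delta x^{i}$ so $M_{0i} = 0$. QED.

c) Derive Eq. (1.4)b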

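Required to show:

$$M_{ij} = -M_{00}\,\delta_{ij}, \quad (i, j = 1, 2, 3).$$

Adding to Eq. (1.3) the following

$$0 = \Delta\bar{s}^2 = M_{00}\Delta r^2 - 2M_{0i}\Delta x^{i}\Delta r + M_{ij}\Delta x^{i}\Delta x^{j}$$

and dividing by 2 gives

$$0 = M_{00}\Delta r^2 + M_{ij}\Delta x^{i}\Delta x^{j} \qquad (1.12)$$

Suppose $\Delta x = \Delta r$, $\Delta y = \Delta z = 0$. Substituting into (1.12) then gives $M_{00} = -M_{11}$. Or, when $\Delta y = \Delta r$, $\Delta x = \Delta z = 0$, we see that $M_{00} = -M_{22}$. Similarly, $M_{00} = -M_{33}$. To see that the off-diagonal terms are zero, note that it’s also possible that $\Delta x = \Delta y = \Delta r/\sqrt{2}$ and $\Delta z = 0$. Substitution into (1.12), together with the diagonal results just found, gives

$$0 = (M_{12} + M_{21})\,\Delta r^2/2 = \Delta r^2 M_{12},$$

so $M_{12} = 0$. Similarly, $M_{13} = 0 = M_{23}$. In summary,

$$M_{ij} = -M_{00}\,\delta_{ij}, \quad (i, j = 1, 2, 3),$$

which is Eq. (1.4)b. QED.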

1.14.18 a) Show that velocity parameters add linearly, b) apply to a specific problem

Define the velocity parameter $W$ through $w = \tanh(W)$. We want to show that the velocity addition law,

$$w' = \frac{u + w}{1 + wu},$$

implies linear addition of velocity parameters. Simply substitute the definition of the velocity parameter,

$$w' = \frac{\tanh(U) + \tanh(W)}{1 + \tanh(U)\tanh(W)} \qquad (1.13)$$

$$= \frac{(\tanh(U) + \tanh(W))\cosh(W)\cosh(U)}{\cosh(W)\cosh(U) + \sinh(U)\sinh(W)} \qquad (1.14)$$

The numerator can be written as

$$N = \sinh(W)\cosh(U) + \cosh(W)\sinh(U)$$

so that

$$w' = \frac{\sinh(W)\cosh(U) + \cosh(W)\sinh(U)}{\cosh(W)\cosh(U) + \sinh(U)\sinh(W)}.$$

The following identities are useful:

$$\cosh(a)\cosh(b) = \left(\frac{e^{a} + e^{-a}}{2}\right)\left(\frac{e^{b} + e^{-b}}{2}\right) = \frac{e^{a+b} + e^{-(a+b)}}{4} + \frac{e^{a-b} + e^{-(a-b)}}{4} = \frac{\cosh(a+b)}{2} + \frac{\cosh(a-b)}{2} \qquad (1.15)$$

$$\sinh(a)\sinh(b) = \left(\frac{e^{a} - e^{-a}}{2}\right)\left(\frac{e^{b} - e^{-b}}{2}\right) = \frac{e^{a+b} + e^{-(a+b)}}{4} - \frac{e^{a-b} + e^{-(a-b)}}{4} = \frac{\cosh(a+b)}{2} - \frac{\cosh(a-b)}{2} \qquad (1.16)$$

$$\sinh(a)\cosh(b) = \left(\frac{e^{a} - e^{-a}}{2}\right)\left(\frac{e^{b} + e^{-b}}{2}\right) = \frac{e^{a+b} - e^{-(a+b)}}{4} + \frac{e^{a-b} - e^{-(a-b)}}{4} = \frac{\sinh(a+b)}{2} + \frac{\sinh(a-b)}{2} \qquad (1.17)$$

Using (1.15) and (1.16) the denominator above simplifies to $D = \cosh(U + W)$. Using (1.17) twice, the $\sinh(W - U)$ and $\sinh(U - W)$ contributions cancel and the numerator simplifies to $N = \sinh(U + W)$. So,

$$w' = \tanh(U + W)$$

which reveals that we can linearly add velocity parameters, then apply tanh to reduce the final parameter to the final velocity.
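The result is simple to confirm numerically. In this sketch the two speeds `u = 0.5` and `w = 0.8` are arbitrary illustrative values and the function names are my own.

```python
import math

def add_velocities(u, w):
    """Einstein velocity composition law in units with c = 1."""
    return (u + w) / (1.0 + u * w)

def add_via_parameters(u, w):
    """Add the velocity parameters U = atanh(u), W = atanh(w), then map back."""
    return math.tanh(math.atanh(u) + math.atanh(w))

u, w = 0.5, 0.8   # illustrative speeds
print(add_velocities(u, w))
print(add_via_parameters(u, w))   # agrees with the line above
```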

b) The velocity of the 2nd star relative to the first is $u_2 = 0.9$. The velocity of the $n$th star relative to the $(n-1)$th is also 0.9. So the velocity of the $N$th star relative to the first is

$$u'_N = \tanh[(N - 1)U]$$

where $0.9 = \tanh(U)$.

My answer disagrees with that given by Schutz. Where I have $N - 1$ he has $N$. Note that my answer is correct for $N = 2$: the 2nd star moves at 0.9 relative to the first.
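The $N - 1$ counting can be cross-checked by applying the velocity addition law repeatedly. A sketch, with function names of my own choosing:

```python
import math

def nth_star_velocity(n, step=0.9):
    """Velocity of the n-th star relative to the first, using tanh[(n - 1) U]."""
    return math.tanh((n - 1) * math.atanh(step))

def by_repeated_addition(n, step=0.9):
    """Same quantity, built by composing the velocity `step` (n - 1) times."""
    u = 0.0
    for _ in range(n - 1):
        u = (u + step) / (1.0 + u * step)
    return u

for n in range(1, 6):
    print(n, nth_star_velocity(n), by_repeated_addition(n))
```

For $n = 2$ both functions return 0.9, as they should.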

1.14.19 a) Lorentz transformation using velocity parameter

$$\bar{t} = \gamma t - \gamma v x \qquad (1.18)$$

$$\bar{x} = -\gamma v t + \gamma x$$

$$\bar{y} = y$$

$$\bar{z} = z$$

Let $v = \tanh(V)$. Note that the Lorentz factor also simplifies,

$$\gamma \equiv \frac{1}{\sqrt{1 - v^2}} = \left(1 - \tanh^2(V)\right)^{-1/2} = \left(\frac{\cosh^2(V)}{\cosh^2(V) - \sinh^2(V)}\right)^{1/2} = \pm\cosh(V) \qquad (1.19)$$

We always take the positive root in the Lorentz factor so that the Lorentz transformation reduces to the identity matrix when $v = 0$.

The final equality follows from the following identity (stated without proof in b):

$$\cosh^2(V) - \sinh^2(V) = \left(\frac{e^{V} + e^{-V}}{2}\right)^2 - \left(\frac{e^{V} - e^{-V}}{2}\right)^2 = \left(\frac{e^{2V} + e^{-2V} + 2}{4}\right) - \left(\frac{e^{2V} + e^{-2V} - 2}{4}\right) = 1 \qquad (1.20)$$

Substituting $v = \tanh(V)$ and (1.19) into (1.18) gives the desired result,

$$\bar{t} = \cosh(V)\,t - \sinh(V)\,x \qquad (1.21)$$

$$\bar{x} = -\sinh(V)\,t + \cosh(V)\,x$$

$$\bar{y} = y$$

$$\bar{z} = z$$

1.14.19 b) invariance of the interval using velocity parameter

The given identity is derived above (1.20). Invariance of the interval follows from straightforward substitution of (1.21):

$$\Delta\bar{s}^2 = -\Delta\bar{t}^2 + \Delta\bar{x}^2 + \Delta\bar{y}^2 + \Delta\bar{z}^2$$

$$= -(\cosh(V)\Delta t - \sinh(V)\Delta x)^2 + (-\sinh(V)\Delta t + \cosh(V)\Delta x)^2 + \Delta y^2 + \Delta z^2$$

$$= \Delta s^2 \qquad (1.22)$$

In the final equality, the cross terms cancelled directly while the squared terms simplified with the identity (1.20).

1.14.19 c) analogy between Lorentz transformation using velocity parameter and Euclidean coordinate transformation

Hyperbolic trigonometric functions replace regular trigonometric functions, but the sign changes for the sine term in the Euclidean coordinate transformation and not the sinh term of the Lorentz transformation.

The analog to the interval $\Delta s^2$ is the squared distance to the origin.

The analog to the invariant hyperbolae are circles. These could be used to calibrate axes of the rotated Euclidean frame.

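1.14.20 Lorentz transformation in matrix form

$$\bar{\mathbf{x}} = A\,\mathbf{x}$$

where

$$\bar{\mathbf{x}} = \begin{pmatrix} \bar{t} \\ \bar{x} \\ \bar{y} \\ \bar{z} \end{pmatrix}, \qquad \mathbf{x} = \begin{pmatrix} t \\ x \\ y \\ z \end{pmatrix}$$

and

$$A = \begin{pmatrix} \cosh(V) & -\sinh(V) & 0 & 0 \\ -\sinh(V) & \cosh(V) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$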
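The matrix form lends itself to a quick numerical check of the interval invariance shown in 1.14.19 b), and of the fact that the boost with parameter $-V$ undoes the boost with parameter $V$. This is a sketch in plain Python lists; the event coordinates and $v = 0.6$ are arbitrary illustrative values, and the helper names are mine.

```python
import math

def boost_matrix(V):
    """4x4 Lorentz boost along x for velocity parameter V (v = tanh V)."""
    c, s = math.cosh(V), math.sinh(V)
    return [[c, -s, 0.0, 0.0],
            [-s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def apply(A, x):
    """Matrix-vector product A x for a 4x4 matrix and a 4-component list."""
    return [sum(A[i][j] * x[j] for j in range(4)) for i in range(4)]

def interval(x):
    """Interval from the origin: -t^2 + x^2 + y^2 + z^2."""
    t, xx, y, z = x
    return -t**2 + xx**2 + y**2 + z**2

V = math.atanh(0.6)                  # illustrative speed v = 0.6
event = [2.0, 1.0, 0.5, -0.3]        # arbitrary event (t, x, y, z)
event_bar = apply(boost_matrix(V), event)
round_trip = apply(boost_matrix(-V), event_bar)

print(interval(event), interval(event_bar))   # equal up to rounding
print(round_trip)                             # recovers the original event
```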

1.14.21 a) Timelike separated events can be transformed to occur at the same point.

Without loss of generality, consider two events, one at the origin and another at $(t, x)$ in inertial frame O. For the interval to be timelike, we require

$$\Delta s^2 = -\Delta t^2 + \Delta x^2 = -t^2 + x^2 < 0.$$

Consider another inertial frame $\bar{O}$ moving at velocity $v$ along the $x$-axis with origins that coincide at $t = 0$. From the Lorentz transformation

$$\bar{t} = \gamma t - \gamma v x, \qquad \bar{x} = -\gamma v t + \gamma x \qquad (1.23)$$

so

$$(\Delta\bar{x})^2 = \bar{x}^2 = (-\gamma v t + \gamma x)^2.$$

For the two events to occur at the same point in $\bar{O}$ we need $\Delta\bar{x} = 0$. Setting the expression above to zero and dividing through by $\gamma^2 t^2$ reduces it to one in a single parameter $\alpha \equiv x/t$, with $|\alpha| < 1$ for the timelike interval:

$$\alpha^2 - 2v\alpha + v^2 = 0$$

so

$$(\alpha - v)^2 = 0.$$

This has solution $\alpha = v$, and this is possible for a realistic velocity boost $|v| < 1$ because $|\alpha| < 1$ for the timelike interval. Q.E.D.
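A concrete instance of this result, with an event I chose for illustration: boosting with $v = x/t$ brings the event to the spatial origin of the new frame, and the elapsed time there is the square root of $-\Delta s^2$.

```python
import math

def boost(t, x, v):
    """Lorentz boost along x with speed v, in units with c = 1."""
    g = 1.0 / math.sqrt(1.0 - v**2)
    return g * (t - v * x), g * (x - v * t)

t, x = 5.0, 3.0      # timelike separation from the origin, since |x| < |t|
v = x / t            # the boost speed found above (alpha = v)
t_bar, x_bar = boost(t, x, v)

print(x_bar)                           # zero: both events at the same place
print(t_bar, math.sqrt(t**2 - x**2))   # elapsed time equals sqrt(-Delta s^2)
```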


1.15 Rob’s supplementary problems

SP.1 In Fig. 1.5, explain why the angle of the $\bar{x}$-axis to the $x$-axis is $\phi = \arctan(v)$, where $v = |\mathbf{v}|$ is the magnitude of the velocity of $\bar{O}$ along the $x$-axis. The result follows from the construction of the $\bar{x}$-axis, but the steps involved are not trivial.

Call the unknown angle between the $\bar{x}$-axis and the $x$-axis $\alpha$. Extend the line from P to R all the way to the $t$-axis, and call this intersection Q. Draw two lines parallel to the $x$-axis: one through R, crossing the $t$-axis at a point we call U, and the other through P, crossing the $t$-axis at a point we call T. The events ξ, P, R form a right triangle, with hypotenuse ξR $= 2a$. We need the angle at RξP, which turns out to be $\chi = \pi/4 - \phi$. (Call angle ORQ $\gamma$. Then $\phi + \gamma + \pi/4 = \pi$, and angle ξRP is $\pi - \gamma = \pi/4 + \phi$. It follows that $\chi = \pi/4 - \phi$.)

So now we can compute the length RP $= 2a\sin(\chi) = a\sqrt{2}\,(\cos(\phi) - \sin(\phi))$. Also UR $= a\sin(\phi)$, so QR $=$ UR$/\sin(\pi/4) = a\sqrt{2}\,\sin(\phi)$. Summing the two lengths, QP $=$ QR $+$ RP $= a\sqrt{2}\,\cos(\phi)$.

We’re now after OT $=$ OQ $-$ TQ. But OQ $=$ OU $+$ UQ, with UQ $=$ UR $= a\sin(\phi)$, and OU $= a\cos(\phi)$. Also, TQ $=$ QP $\cos(\pi/4) = a\cos(\phi)$. So OT $= a(\sin(\phi) + \cos(\phi)) - a\cos(\phi) = a\sin(\phi)$. Note that the sought-after angle $\alpha$ satisfies $\tan(\alpha) =$ OT/TP $=$ OT/TQ $= a\sin(\phi)/(a\cos(\phi))$, so $\alpha = \phi$, the desired result.
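The same conclusion can be reached by coordinates rather than by triangle geometry, which makes a handy cross-check. In this sketch I place the emission and reception events on the line $x = vt$ at coordinate times $\mp a$ in O (an assumption of mine for convenience; the slope of the result does not depend on that normalization) and intersect the two light rays.

```python
import math

def reflection_event(a, v):
    """Event P where the outgoing ray from E meets the ray that arrives at R.

    E and R lie on the t-bar axis (x = v t), symmetric about the origin.
    Outgoing ray:  t - x = t_e - x_e.   Incoming ray:  t + x = t_r + x_r.
    """
    t_e, x_e = -a, -a * v
    t_r, x_r = a, a * v
    t_p = 0.5 * ((t_r + x_r) + (t_e - x_e))
    x_p = 0.5 * ((t_r + x_r) - (t_e - x_e))
    return t_p, x_p

a, v = 1.0, 0.6                 # illustrative values
t_p, x_p = reflection_event(a, v)
print(t_p / x_p)                # slope of the x-bar axis, equal to v
print(math.atan(t_p / x_p), math.atan(v))   # so alpha = phi
```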

1.16 Additional thoughts

I think it’s worth mentioning that the Lorentz transformation, which is linear by construction, transforms lines to lines. This is easily verified by substituting the equation for a line in O and confirming that it’s also a line in $\bar{O}$.

It’s also worth pointing out that a tangent line to a curve in O remains a tangent line in $\bar{O}$. Of course it would be quite strange if this were not true, but on the other hand it was not immediately obvious to me that it holds.

Chapter 2

Vector Analysis in Special Relativity

An elementary introduction to 4-vectors, working with Lorentz transformations. Contains lots of hand-holding about the algebra of working with vectors, the summation convention, changing dummy indices etc.

2.1 Definition of a vector

Regarding the Einstein summation convention introduced on p. 34, Schutz states that it applies whenever there is a repeated index, one up and one down, in the same expression. I find this misleading because the repeated index must actually appear in the same term or same factor.

Buried in footnote 2 on p. 35 is an important notational point.

2.2 Vector algebra

Eq. (2.10) introduces a strange notational twist. Apparently enclosing the vectors $\vec{e}_\alpha$ with parentheses and writing a superscript $\beta$ implies that we are forming a tensor from the set of these vectors?

$$(\vec{e}_\alpha)^{\beta} = \delta_{\alpha}{}^{\beta}$$

There’s no comment to explain this. Earlier the author explained that the superscript notation will become clear when he introduces differential geometry. For now I just note that the RHS is the Kronecker delta, which is a second-rank tensor.

[Coming back to the above point after having read most of the book (all but Chapters 9 and 12), I’m still amazed at this leap in notation. Generally Schutz is clear and fairly careful but this is an exception. I interpret it as follows. Enclosing the vector in parentheses I believe means taking the set of components, since later when we want to write down the components of a 2nd rank tensor as a matrix we enclose the tensor in parentheses and the analogous operation for a vector would be writing down its 4 components as a row or column vector. Apparently the superscript $\beta$ means then selecting the $\beta$th one of these components. Indeed, this interpretation is consistent with his usage in Eq. (2.21) and later in the book, such as Eq. (5.52).]

On p. 38 he introduces the notation that putting a tensor within square brackets $[\Lambda^{\bar{\beta}}{}_{\alpha}]$ gives the matrix of components. He will oscillate between square brackets and parentheses throughout the book, sometimes in the same paragraph!! For example, the two sentences after Eq. (2.18).

Eq. (2.18) is described as a key formula. Exercise 2.11c is to verify it.

Schutz doesn’t like the terms “contravariant” and “covariant”, used by many others. $A^{\alpha}$ are the contravariant components of a vector (Hobson et al., 2009).

2.3 The four-velocity

First paragraph. Not clear what “uniformly moving” means. I guess he means not accelerating so that the inertial frame in which the particle is at rest is not constantly changing. This also makes sense because the next paragraph discusses the case of an accelerating particle.

2.4 The four-momentum

Typo on p. 42, in the example, p1 = mv(1− v2)−1/2 should be p1 = mv(1−v2)−1/2.

2.6 Some applications

Top of p. 48: “In the MCRF $\vec{U}$ has only a zero component . . .”. It might be better to say “only a time component” since he means “zero” in the sense of the first component.

2.7 Photons

Four-momentum

Typo: Last paragraph: “. . . a photon has frequency v and . . . ”. This should be “. . . a photon has frequency ν and . . . ”.

The derivation of Eq. (2.39) seems a bit mysterious until one realizes that it’s based on the idea that the four-momentum must transform under a Lorentz transformation. In particular, the momentum of a photon directed in the $x$-direction is:

$$\vec{p} \underset{O}{\to} (E, E, 0, 0)$$

If we change reference frames to $\bar{O}$ then this four-vector must transform under a Lorentz transformation, giving:

$$\vec{p} \underset{\bar{O}}{\to} (\bar{E}, \bar{E}, 0, 0)$$

where

$$\bar{E} = \gamma E - E v \gamma = E\gamma(1 - v)$$

etc.
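A numerical instance of this transformation may help. The sketch below boosts the photon four-momentum $(E, E, 0, 0)$ with an illustrative speed `v = 0.6` and compares the result with the equivalent closed form $\sqrt{(1 - v)/(1 + v)}$, which follows from $\gamma(1 - v)$ by elementary algebra; the function name is my own.

```python
import math

def photon_in_barred_frame(E, v):
    """Boost the photon four-momentum (E, E, 0, 0) along x with speed v."""
    g = 1.0 / math.sqrt(1.0 - v**2)
    p0, p1 = E, E
    E_bar = g * p0 - g * v * p1       # time component
    px_bar = -g * v * p0 + g * p1     # x component
    return E_bar, px_bar

E, v = 1.0, 0.6                       # illustrative values
E_bar, px_bar = photon_in_barred_frame(E, v)
print(E_bar, px_bar)                  # equal: the momentum stays null
print(E * math.sqrt((1 - v) / (1 + v)))   # same number as E_bar
```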

2.9 Exercises

2. Identify the dummy and free indices, count the equations:

a) α is the dummy index. One equation.

b) ν is the dummy index. µ is the free index. Four equations.

c) λ, µ dummy indices; α, γ free indices; 16 equations.

d) ν and µ are free indices, and there are 16 equations. Although the indices are repeated, they’re not repeated in the same factor, and one is not superscript.

3 Prove Eq. (2.5).

There’s nothing to prove really. It follows immediately from the definition and notation conventions. In particular, the LHS involves a sum over all values of the dummy index $\beta \in \{0, 1, 2, 3\}$, see p. 34. The RHS merely spells this out, with the convention that Roman indices like $i$ take all values $i \in \{1, 2, 3\}$.

4 Practise adding components of vectors, and multiplying by a scalar.

a) $-6\vec{A} \underset{O}{\to} (-30, 6, 0, -6)$

5 a) Show that the basis vectors are linearly independent. Start with a general linear combination with coefficients $a^{\mu}$,

$$0 = a^{\mu}(\vec{e}_{\mu})^{\alpha} = a^{\mu}\delta_{\mu}{}^{\alpha}$$

Start with the first component, $\alpha = 0$. The equation above is $0 = a^{0} \times 1$, so $a^{0}$ must be zero. Similarly for the other components. Since this trivial solution is the only solution, the basis vectors must be linearly independent.

More formally, one could write this out in matrix notation:

$$\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} a^{0} \\ a^{1} \\ a^{2} \\ a^{3} \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}$$

It’s a result of elementary linear algebra that this system has nontrivial solutions only if the determinant of the matrix is zero. But the determinant is +1. So there are no nontrivial solutions and thus the basis vectors must be linearly independent, Q.E.D.

5 b) The given set is not linearly independent, since the linear combination (−5, −3, +2, 1) gives the zero vector.

6 As in Fig. 1.5, the $\bar{t}$ and $\bar{x}$ axes are tilted at an angle $\phi$ relative to their O frame counterparts and toward the world line of the light ray $t = x$. The basis vectors are parallel to these $\bar{O}$ axes. Here $\tan(\phi) = 0.6$. For the $\bar{\bar{O}}$ frame the axes will be tilted even further toward $t = x$. The angle $\theta$ of these basis vectors can be computed as

$$\tan(\theta) = \tanh(2\,\mathrm{arctanh}(0.6))$$
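For the record, this expression evaluates as follows. The sketch also compares it with direct use of the velocity composition law, $2v/(1 + v^2)$, which must give the same number.

```python
import math

v = 0.6
tan_theta = math.tanh(2 * math.atanh(v))    # add the two velocity parameters
direct = 2 * v / (1 + v**2)                 # Einstein composition of v with v
theta_deg = math.degrees(math.atan(tan_theta))
phi_deg = math.degrees(math.atan(v))

print(tan_theta, direct)    # both about 0.882
print(phi_deg, theta_deg)   # theta exceeds phi but stays below 45 degrees
```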

7 a) Verify Eq. (2.10). As mentioned above, this is a strange notational twist. If we write the basis vectors as row vectors as in Eq. (2.9), then the set forms a matrix, and the matrix element is unity when row and column numbers are equal, and zero otherwise, i.e. the identity matrix.

$$\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$

The RHS of Eq. (2.10) can of course be written as the identity matrix too, which demonstrates the equality.

7 b) I’ve always thought of Eq. (2.11) as the definition of the vector, so it seems to me a tautology, rather than something to prove. Perhaps it’s worth stating the result in words. If you use the components of the vector $\vec{A}$ to form the linear combination of basis vectors $\vec{e}_\alpha$, i.e.

$$A^{\alpha}\vec{e}_{\alpha}$$

then you, of course, recover the vector $\vec{A}$. In particular, in the first component of this sum only the term $A^{0}\vec{e}_{0}$ contributes, since the other basis vectors are all zero in the first component. Similarly for the other components.

8 a) Prove that the zero vector has the same components in all reference frames.

This follows immediately from the use of a linear transformation to go between reference frames. See p. 35, and Eq. (2.7) for the definition of the general (4-) vector and the linear transformation. In other words,

$$A \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}$$

for all matrices $A$, and the Lorentz transformation can always be written as a matrix $A$.

8 b) Prove that if two vectors have equal components in one frame their components are equal in all frames.

My first thought is that if their components are equal in a given frame, then they’re the same vector. By the definition of a vector, they are invariant under coordinate transformation. So their components are equal in all other frames. But that doesn’t use 8a.

Using 8a, one could subtract the two equal vectors, giving the zero vector in that frame. Under coordinate transformation, this difference vector remains the zero vector. Thus their components must be equal in any other frame.

9 There are 16 terms to write out, which is too much work. It seems convincing enough to me to note that for each term in the sum on the LHS, there is a corresponding term on the RHS. In general these terms look like,

$$\Lambda^{\bar{\alpha}}{}_{\beta}\,A^{\beta}\,\vec{e}_{\bar{\alpha}}$$

Of course the order of summation doesn’t matter for a finite sum. Substituting specific values for the dummy indices might make this more clear, say $\bar{\alpha} = 0$, $\beta = 1$.

10 Prove Eq. (2.13) from

$$A^{\alpha}(\Lambda^{\bar{\beta}}{}_{\alpha}\,\vec{e}_{\bar{\beta}} - \vec{e}_{\alpha}) = 0$$

Eq. (2.13) was:

$$\vec{e}_{\alpha} = \Lambda^{\bar{\beta}}{}_{\alpha}\,\vec{e}_{\bar{\beta}}$$

Choosing any $A^{\alpha}$ with only one non-zero entry, like (1, 0, 0, 0), or (10, 0, 0, 0), shows straight away that

$$\Lambda^{\bar{\beta}}{}_{0}\,\vec{e}_{\bar{\beta}} = \vec{e}_{0}$$

which is Eq. (2.13) with $\alpha = 0$. Similarly choosing $A^{\alpha}$ as (0, 1, 0, 0), or (0, 2, 0, 0), shows straight away

$$\Lambda^{\bar{\beta}}{}_{1}\,\vec{e}_{\bar{\beta}} = \vec{e}_{1}.$$

So repeating this argument gives the result for the other two basis vectors.

Perhaps more instructive is to note that this result works for more general situations. The quantity inside the parentheses is a set of 4 different vectors $\vec{v}_{\alpha}$,

$$(\Lambda^{\bar{\beta}}{}_{\alpha}\,\vec{e}_{\bar{\beta}} - \vec{e}_{\alpha}) = \vec{v}_{\alpha}$$

Then view the components of $A^{\alpha}$ as the coefficients of a linear combination of these vectors $\vec{v}_{\alpha}$. Now it’s clear that the RHS is not just the number zero, but the 4-vector (0, 0, 0, 0). The linear combination of the set of $\vec{v}_{\alpha}$ must sum to the zero vector for arbitrary coefficients of the linear combination. If the first three led to a non-zero vector,

$$\sum_{\alpha=0}^{2} \vec{v}_{\alpha} = (2, 4, 6, 8)$$

then $A^{3}$ would have to be chosen so as to bring this to zero. For example, if $\vec{v}_{3} = (1, 2, 3, 4)$ one would have to choose $A^{3} = -2$. But since $A^{\alpha}$ was arbitrary, choosing $A^{3} = +2$ would violate the equality. So this means that the only way it could work is if

$$\sum_{\alpha=0}^{2} \vec{v}_{\alpha} = 0$$

and $\vec{v}_{3} = 0$. One can now repeat this argument for $\sum_{\alpha=0}^{1} \vec{v}_{\alpha}$ etc. and show that all $\vec{v}_{\alpha}$ are the zero 4-vector. And the result Eq. (2.13) holds.

11 (a) Matrix of $\Lambda^{\nu}{}_{\bar{\mu}}(-v)$. Exercise 1.20 was to put the Lorentz transformation in matrix form. Note that $\sinh(-V) = -\sinh(V)$, $\cosh(-V) = \cosh(V)$. So we only have to change the sign of the $\sinh(V)$ elements,

$$\Lambda = \begin{pmatrix} \cosh(V) & \sinh(V) & 0 & 0 \\ \sinh(V) & \cosh(V) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$

where $v = \tanh(V)$.

(b) $A^{\bar{\alpha}}$ for all $\bar{\alpha}$.

$$A^{\bar{0}} = \cosh(V)A^{0} - \sinh(V)A^{1} \qquad (2.1)$$

$$A^{\bar{1}} = -\sinh(V)A^{0} + \cosh(V)A^{1} \qquad (2.2)$$

$$A^{\bar{2}} = A^{2} \qquad (2.3)$$

$$A^{\bar{3}} = A^{3} \qquad (2.4)$$

(c) Verify Eq. (2.18). Written out in matrix form Eq. (2.18) becomes,

$$\begin{pmatrix} \cosh(V) & \sinh(V) & 0 & 0 \\ \sinh(V) & \cosh(V) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} \cosh(V) & -\sinh(V) & 0 & 0 \\ -\sinh(V) & \cosh(V) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$

To show this it’s useful to use the hyperbolic function identity,

$$\cosh^2(x) - \sinh^2(x) = 1.$$

Eq. (2.18) follows immediately from matrix multiplication. This identity is easy to derive, and can be found at http://en.wikipedia.org/wiki/Hyperbolic_function#Similarities_to_circular_trigonometric_functions along with other properties.

(d) The Lorentz transformation matrix from $\bar{O}$ to O is just the matrix in (a). Since $\bar{O}$ is moving toward increasing $x$ with velocity $v$ with respect to O, from the point of view of $\bar{O}$, O is moving toward increasing $\bar{x}$ with velocity $-v$.

(e) $A^{\alpha}$ for all $\alpha$.

$$A^{0} = \cosh(V)A^{\bar{0}} + \sinh(V)A^{\bar{1}} \qquad (2.5)$$

$$A^{1} = +\sinh(V)A^{\bar{0}} + \cosh(V)A^{\bar{1}} \qquad (2.6)$$

$$A^{2} = A^{\bar{2}} \qquad (2.7)$$

$$A^{3} = A^{\bar{3}} \qquad (2.8)$$

Relation to Eq. (2.18): Multiplying the vector $\vec{A}$ on the left by the Lorentz transformation matrix $\Lambda(v)$ gives the components in the $\bar{O}$ frame, $A^{\bar{\alpha}} = \Lambda^{\bar{\alpha}}{}_{\beta}(v)\,A^{\beta}$. Multiplying this vector in turn by the Lorentz transformation matrix $\Lambda(-v)$ should return the vector to the O frame. And indeed it does, when we use Eq. (2.18) in the final step below:

$$\Lambda^{\nu}{}_{\bar{\alpha}}(-v)\,A^{\bar{\alpha}} = A^{\nu} \qquad (2.9)$$

$$\Lambda^{\nu}{}_{\bar{\alpha}}(-v)\,\Lambda^{\bar{\alpha}}{}_{\beta}(v)\,A^{\beta} = A^{\nu} \qquad (2.10)$$

$$\delta^{\nu}{}_{\beta}\,A^{\beta} = A^{\nu} \qquad (2.11)$$

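(f) Verify that the order of applying the transformations doesn't matter. Physically we know this must be true. Mathematically it works out because if we repeat (c) with the matrices in the opposite order, we get the same result:

\[
\begin{pmatrix}
\cosh(V) & -\sinh(V) & 0 & 0\\
-\sinh(V) & \cosh(V) & 0 & 0\\
0 & 0 & 1 & 0\\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix}
\cosh(V) & \sinh(V) & 0 & 0\\
\sinh(V) & \cosh(V) & 0 & 0\\
0 & 0 & 1 & 0\\
0 & 0 & 0 & 1
\end{pmatrix}
=
\begin{pmatrix}
1 & 0 & 0 & 0\\
0 & 1 & 0 & 0\\
0 & 0 & 1 & 0\\
0 & 0 & 0 & 1
\end{pmatrix}.
\]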

(g) Establish that

\[
\vec{e}_{\alpha} = \delta^{\nu}{}_{\alpha}\,\vec{e}_{\nu}
\]

I find this a rather strange question. From the definition of the Kronecker delta function, Eq. (1.4c), the result is immediately obvious. Another way to see this is that the Kronecker delta can be written as the identity matrix. And of course, writing the vector on the RHS as a column vector and multiplying by the identity matrix gives back the original vector.

12 (b) Remember not to add the velocities linearly, but to use the Einstein law of composition of velocities Eq. (1.13), or use the velocity parameters introduced in Exercise 1.18.

(c) Note that the definition of the magnitude of the vector is analogous to the interval introduced in Chapter 1, see Eq. (2.24).

\[
\vec{A}^2 = -0^2 + (-2)^2 + 3^2 + 5^2 = 38.
\]

(d) The magnitude should be independent of the reference frame, becauseof the invariance of the interval.

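13 (a) The transformation of coordinates from $O$ to $\bar{\bar{O}}$ can be constructed in two steps. First transform to $\bar{O}$,

\[
A^{\bar\gamma} = \Lambda^{\bar\gamma}{}_{\mu}(v)\,A^{\mu}.
\]

Then transform from $\bar{O}$ to $\bar{\bar{O}}$,

\[
A^{\bar{\bar\alpha}} = \Lambda^{\bar{\bar\alpha}}{}_{\bar\gamma}(v')\,\bigl(\Lambda^{\bar\gamma}{}_{\mu}(v)\,A^{\mu}\bigr).
\]

So the Lorentz transformation from $O$ to $\bar{\bar{O}}$ is

\[
\Lambda^{\bar{\bar\alpha}}{}_{\mu} = \Lambda^{\bar{\bar\alpha}}{}_{\bar\gamma}(v')\,\Lambda^{\bar\gamma}{}_{\mu}(v).
\]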

(b) I thought we just did show that Eq. (2.41) was the matrix product ofthe two individual Lorentz transformations. Maybe he means write it out inmatrix form? I’m not sure what he’s looking for.

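(c) This was an important exercise for me because I learned that the Lorentz transformation matrix does not have to be symmetric when there are velocity components in two directions.

\[
\Lambda^{\bar{\bar\alpha}}{}_{\mu} =
\begin{pmatrix}
\gamma(v)\gamma(v') & -\gamma(v)v\,\gamma(v') & -\gamma(v')v' & 0\\
-\gamma(v)v & \gamma(v) & 0 & 0\\
-\gamma(v)\gamma(v')v' & \gamma(v)v\,\gamma(v')v' & \gamma(v') & 0\\
0 & 0 & 0 & 1
\end{pmatrix}.
\]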

(d) Show that the interval is invariant under the above transformation.

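(e) Show that the order matters in constructing the Lorentz transformation as in (a), i.e.

\[
\Lambda^{\bar{\bar\alpha}}{}_{\bar\gamma}(v)\,\Lambda^{\bar\gamma}{}_{\mu}(v') \neq \Lambda^{\bar{\bar\alpha}}{}_{\bar\gamma}(v')\,\Lambda^{\bar\gamma}{}_{\mu}(v)
\]

Using the example from (c), the LHS of the above would be,

\[
\mathrm{LHS} = \Lambda^{\bar{\bar\alpha}}{}_{\mu} =
\begin{pmatrix}
\gamma(v)\gamma(v') & -\gamma(v)v & -\gamma(v)\gamma(v')v' & 0\\
-\gamma(v)v\,\gamma(v') & \gamma(v) & \gamma(v)v\,\gamma(v')v' & 0\\
-\gamma(v')v' & 0 & \gamma(v') & 0\\
0 & 0 & 0 & 1
\end{pmatrix}
\]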

Comparison with the matrix in (c) shows it's different. In fact, another observation, not discussed by Schutz, is that one is the transpose of the other. This can be understood because

\[
(AB)^T = B^T A^T = BA
\]

where the final equality holds because $A$ and $B$ are symmetric.

This is surprising if we think in a Galilean way. However, mathematically we know in general that matrix multiplication is not commutative, http://en.wikipedia.org/wiki/Matrix_multiplication#Common_properties. Physically we know that the Lorentz transformation results in the axes tilting toward the $t = x$ line, as in Fig. 1.5. The order of rotations matters. For example, rotating the globe 90° to the east about the polar axis, then 45° clockwise about the axis through the Equator and 90°W and 90°E, puts the coordinates 0°N, 0°E where the South Indian Ocean used to be. But performing the same rotations in the opposite order leaves the coordinates 0°N, 0°E on the old Equatorial plane.
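The claims in parts (c) to (e) of Exercise 13 can be checked with a few lines of arithmetic. This is a rough sketch under stated assumptions: the speeds 0.6 (along x) and 0.8 (along y) are illustrative values chosen here, and the helpers `boost`, `matmul`, `transpose` and `close` are hypothetical names, not anything from the text. It tests that the two orderings differ, that one is the transpose of the other, and that the composite transformation preserves the interval via $\Lambda^T \eta \Lambda = \eta$.

```python
import math

def gamma(v):
    return 1.0 / math.sqrt(1.0 - v * v)

def boost(v, axis):
    """Boost with speed v along spatial axis 1 (x) or 2 (y)."""
    g = gamma(v)
    L = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    L[0][0] = L[axis][axis] = g
    L[0][axis] = L[axis][0] = -g * v
    return L

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(4) for j in range(4))

# Minkowski metric with signature (-, +, +, +)
ETA = [[-1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

Lx = boost(0.6, 1)          # illustrative v along x (assumption)
Ly = boost(0.8, 2)          # illustrative v' along y (assumption)
x_then_y = matmul(Ly, Lx)   # the matrix of part (c)
y_then_x = matmul(Lx, Ly)   # the LHS of part (e)

order_matters = not close(x_then_y, y_then_x)
is_transpose = close(y_then_x, transpose(x_then_y))
preserves_interval = close(matmul(transpose(x_then_y), matmul(ETA, x_then_y)), ETA)
print(order_matters, is_transpose, preserves_interval)
```

The last check is one way to approach part (d): a matrix that satisfies $\Lambda^T \eta \Lambda = \eta$ leaves $-t^2 + x^2 + y^2 + z^2$ unchanged.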

14 (a) $v = -3/5$ in the positive $z$ direction. The off-diagonal term gives the direction, $-v\gamma = 0.75$, and the diagonal term gives $\gamma = 1.25$. One can confirm that $\gamma = 1/\sqrt{1 - v^2}$, once $v$ is found.

(b) Since it's a Lorentz transformation, the inverse is obtained from the Lorentz transformation from $\bar{O}$ back to $O$.

\[
\Lambda(-v) =
\begin{pmatrix}
1.25 & 0 & 0 & -0.75\\
0 & 1 & 0 & 0\\
0 & 0 & 1 & 0\\
-0.75 & 0 & 0 & 1.25
\end{pmatrix}
\]

And matrix multiplication confirms this is the inverse.

(c)
\[
\begin{pmatrix}
1.25 & 0 & 0 & -0.75\\
0 & 1 & 0 & 0\\
0 & 0 & 1 & 0\\
-0.75 & 0 & 0 & 1.25
\end{pmatrix}
\begin{pmatrix} 1\\ 2\\ 0\\ 0 \end{pmatrix}
=
\begin{pmatrix} 1.25\\ 2\\ 0\\ -0.75 \end{pmatrix}
\]

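15 (a) The particle 3-velocity is $\mathbf{v} = (v, 0, 0)$. In the frame $\bar{O}$ moving with the particle, the 4-velocity is $\vec{e}_{\bar 0}$, so $\vec{A} \to_{\bar O} (1, 0, 0, 0)$. The Lorentz transformation back to the $O$ frame is

\[
\Lambda(-v) =
\begin{pmatrix}
\gamma(v) & v\,\gamma(v) & 0 & 0\\
v\,\gamma(v) & \gamma(v) & 0 & 0\\
0 & 0 & 1 & 0\\
0 & 0 & 0 & 1
\end{pmatrix}.
\]

So $\vec{A}$ in the $O$ frame has components $\vec{A} \to_{O} (\gamma(v), v\gamma(v), 0, 0)$.

(b) Consider now a general particle 3-velocity $\mathbf{v} = (u, v, w)$. Let's start with a slightly less general 3-velocity $\mathbf{v} = (u, v, 0)$ to make the algebra easier. One could rotate through an angle $\theta$ to a frame where $\mathbf{v} = (|\mathbf{v}|, 0, 0)$. Here $\theta$ is such that

\[
\begin{pmatrix} u'\\ v' \end{pmatrix}
=
\begin{pmatrix}
\cos(\theta) & \sin(\theta)\\
-\sin(\theta) & \cos(\theta)
\end{pmatrix}
\begin{pmatrix} u\\ v \end{pmatrix}
=
\begin{pmatrix} |\mathbf{v}|\\ 0 \end{pmatrix}
\]

Now we have the situation as in (a) so we can apply the Lorentz transformation back to the $O$ frame

\[
\Lambda(-|\mathbf{v}|) =
\begin{pmatrix}
\gamma(|\mathbf{v}|) & |\mathbf{v}|\,\gamma(|\mathbf{v}|) & 0 & 0\\
|\mathbf{v}|\,\gamma(|\mathbf{v}|) & \gamma(|\mathbf{v}|) & 0 & 0\\
0 & 0 & 1 & 0\\
0 & 0 & 0 & 1
\end{pmatrix}.
\]

So $\vec{A}$ in a frame at rest relative to $O$ but rotated through $\theta$ has components $(\gamma(|\mathbf{v}|), |\mathbf{v}|\gamma(|\mathbf{v}|), 0, 0)$. Finally we rotate through $-\theta$ to obtain $\vec{A}$ in the $O$ frame

\begin{align}
\vec{A} \to_{O}\ &(\gamma(|\mathbf{v}|),\ |\mathbf{v}|\gamma(|\mathbf{v}|)\cos(\theta),\ |\mathbf{v}|\gamma(|\mathbf{v}|)\sin(\theta),\ 0) \tag{2.12}\\
=\ &(\gamma(|\mathbf{v}|),\ u\,\gamma(|\mathbf{v}|),\ v\,\gamma(|\mathbf{v}|),\ 0) \tag{2.13}
\end{align}

Finally, there's no reason for the $z$ component to behave differently, so we can generalize this. For a general particle 3-velocity $\mathbf{v} = (u, v, w)$, the 4-velocity is

\[
\vec{A} \to_{O} (\gamma(|\mathbf{v}|),\ u\,\gamma(|\mathbf{v}|),\ v\,\gamma(|\mathbf{v}|),\ w\,\gamma(|\mathbf{v}|)) \tag{2.15}
\]

where

\[
|\mathbf{v}| = \sqrt{u^2 + v^2 + w^2}.
\]

(c) Starting with the 4-velocity components $U^{\alpha}$, one can write the 3-velocity,

\[
\mathbf{v} = (U^1/\gamma,\ U^2/\gamma,\ U^3/\gamma)
\]

where $\gamma \equiv 1/\sqrt{1 - \mathbf{v}\cdot\mathbf{v}} = U^0$.

(d) Applying the above formula, if the 4-velocity is given as $(2, 1, 1, 1)$ then the 3-velocity is $\mathbf{v} = (1/2, 1/2, 1/2)$. Note the magnitude of the 4-velocity is $-4 + 3 = -1$, making it a legitimate example.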
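The conversions in parts (b) to (d) are easy to try out numerically. The sketch below is illustrative only; the function names `four_velocity`, `three_velocity` and `minkowski_dot` are hypothetical helpers introduced here, and units with $c = 1$ are assumed as in the text.

```python
import math

def four_velocity(u, v, w):
    """4-velocity (gamma, gamma*u, gamma*v, gamma*w) from a 3-velocity, c = 1."""
    speed_sq = u * u + v * v + w * w
    if speed_sq >= 1.0:
        raise ValueError("3-velocity must have magnitude below 1")
    g = 1.0 / math.sqrt(1.0 - speed_sq)
    return (g, g * u, g * v, g * w)

def three_velocity(U):
    """3-velocity from 4-velocity components, using gamma = U^0."""
    g = U[0]
    return (U[1] / g, U[2] / g, U[3] / g)

def minkowski_dot(A, B):
    """Scalar product with signature (-, +, +, +)."""
    return -A[0] * B[0] + A[1] * B[1] + A[2] * B[2] + A[3] * B[3]

U = (2.0, 1.0, 1.0, 1.0)       # the example of part (d)
v3 = three_velocity(U)         # expect (1/2, 1/2, 1/2)
norm = minkowski_dot(U, U)     # expect -1 for a legitimate 4-velocity
roundtrip = four_velocity(*v3) # should reproduce U
print(v3, norm, roundtrip)
```

Checking `minkowski_dot(U, U)` against $-1$ is a quick test of whether a proposed set of components can be a 4-velocity at all.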

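16 A particle moves with speed $w$, say along the $\bar{x}$-axis, in a reference frame $\bar{O}$ moving along the $x$-axis with speed $v$. We derive Einstein's velocity addition law from a Lorentz transformation of the particle's 4-velocity.

The particle's 4-velocity in reference frame $\bar{O}$ is $\vec{U} \to_{\bar O} (\gamma(w), \gamma(w)w, 0, 0)$. The Lorentz transformation from $\bar{O}$ to $O$ is

\[
\Lambda(-v) =
\begin{pmatrix}
\gamma(v) & v\,\gamma(v) & 0 & 0\\
v\,\gamma(v) & \gamma(v) & 0 & 0\\
0 & 0 & 1 & 0\\
0 & 0 & 0 & 1
\end{pmatrix}.
\]

So the 4-velocity is $\vec{U} \to_{O} (\gamma(w)\gamma(v) + vw\,\gamma(w)\gamma(v),\ v\gamma(v)\gamma(w) + w\gamma(v)\gamma(w),\ 0,\ 0)$. Converting this to the 3-velocity using the formula in 15c,

\begin{align}
v_x &= \frac{U^x}{U^0} \tag{2.16}\\
&= \frac{\gamma(v)\,\gamma(w)\,(v + w)}{\gamma(v)\,\gamma(w)\,(1 + vw)} \tag{2.17}\\
&= \frac{v + w}{1 + vw} \tag{2.18}
\end{align}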
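As a numerical cross-check of this derivation, one can boost the 4-velocity explicitly and compare the ratio $U^1/U^0$ with the closed form $(v + w)/(1 + vw)$. The following sketch assumes illustrative speeds $w = 0.8$ and $v = 0.6$; the function name `add_velocities_via_four_velocity` is a hypothetical helper.

```python
import math

def gamma(v):
    return 1.0 / math.sqrt(1.0 - v * v)

def add_velocities_via_four_velocity(w, v):
    """Boost (gamma(w), gamma(w)*w) by Lambda(-v) and return U^1 / U^0."""
    gw, gv = gamma(w), gamma(v)
    U0_bar, U1_bar = gw, gw * w              # components in the moving frame
    U0 = gv * U0_bar + gv * v * U1_bar       # first row of Lambda(-v)
    U1 = gv * v * U0_bar + gv * U1_bar       # second row of Lambda(-v)
    return U1 / U0

w, v = 0.8, 0.6                              # illustrative speeds (assumption)
combined = add_velocities_via_four_velocity(w, v)
einstein = (v + w) / (1.0 + v * w)
print(combined, einstein)
```

The two numbers agree to rounding error, and both stay below 1 even though $v + w > 1$.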

17 (a) Prove that any timelike vector $\vec{U}$ for which $U^0 > 0$ and $\vec{U}\cdot\vec{U} = -1$ is the four-velocity of some world line.

The four-velocity is the $\vec{e}_0$ in the MCRF. If $\vec{U}$ is some world line's four-velocity, then there exists a Lorentz transformation for which $\vec{U} \to_{\bar O} (1, 0, 0, 0)$. Let's see if that's possible for the given vector $\vec{U}$.

The coordinate system can be rotated so that $U^{\alpha} = (U^0, u, 0, 0)$, just to make the algebra simpler. Now apply an arbitrary Lorentz transformation

\[
\begin{pmatrix}
\gamma(v) & -v\,\gamma(v) & 0 & 0\\
-v\,\gamma(v) & \gamma(v) & 0 & 0\\
0 & 0 & 1 & 0\\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} U^0\\ u\\ 0\\ 0 \end{pmatrix}
=
\begin{pmatrix} 1\\ 0\\ 0\\ 0 \end{pmatrix},
\]

for some $v$ and $\gamma(v)$. Thus we require

\begin{align}
1 &= \gamma\,(U^0 - v\,u) \tag{2.19}\\
0 &= \gamma\,(u - U^0 v). \tag{2.20}
\end{align}

But in general we require $\gamma \geq 1$, so the second equation (2.20) requires $v = u/U^0$. We know $U^0 > 0$ (given) and it follows from the fact that $\vec{U}$ is timelike that $U^0 > |u|$. Thus $|v| < 1$, so $\gamma(v) \geq 1$ and, most importantly, $\gamma(v) \in \mathbb{R}$, i.e. the Lorentz transformation is possible. Does this required Lorentz transformation also bring the time component to unity?

The algebra can get messy, but simplifies if we use the fact that $\vec{U}\cdot\vec{U} = -1$. Eliminating $v$ in the first equation (2.19) gives

\[
\frac{1}{\gamma(v)} = U^0 - \frac{u^2}{U^0} = \frac{1}{U^0}\left((U^0)^2 - u^2\right) = \frac{1}{U^0}
\]

So

\[
\gamma = U^0
\]

This proves that the required Lorentz transformation to make $\vec{U} = \vec{e}_{\bar 0}$ is possible, which is enough to show that the original 4-vector is the 4-velocity of something. Note that the requirement that $U^0 \geq 1$ is hidden in the requirement that $\vec{U}\cdot\vec{U} = -1$.

(b) Use this to prove that for any timelike vector $\vec{V}$ there is a Lorentz frame in which $\vec{V}$ has zero spatial components.

The magnitude of a vector is the interval between the origin and the coordinates of the vector. For a timelike interval the vector is timelike, and vice versa. Timelike intervals can be transformed via a Lorentz transformation to have zero spatial part, see Exercise 1.21. The corresponding vector will have zero spatial components.

If you haven't done Exercise 1.21, you can construct a proof using part 17(a). We are no longer required to make the time part unity; we only require the space part to be zero, i.e. (2.20), $0 = \gamma(v)(u - V^0 v)$, where $u$ is now defined by $u^2 = V^i V_i$. We no longer have $V^0 > 0$, but that doesn't matter. Because it's a timelike vector we have

\[
(V^0)^2 > V^i V_i = u^2
\]

So (2.20) implies now that

\[
|v| < 1
\]

and again, $\gamma(v) \in \mathbb{R}$, i.e. the Lorentz transformation is possible.

18 (a) Sum of two spacelike orthogonal vectors is spacelike.

By definition, orthogonal vectors have $\vec{A}\cdot\vec{B} = 0$, so

\begin{align}
(\vec{A} + \vec{B})\cdot(\vec{A} + \vec{B}) &= \vec{A}\cdot\vec{A} + \vec{B}\cdot\vec{B} + 2\vec{A}\cdot\vec{B} \tag{2.21}\\
&= \vec{A}\cdot\vec{A} + \vec{B}\cdot\vec{B} > 0. \tag{2.22}
\end{align}

Spacelike vectors have positive magnitude, $\vec{A}\cdot\vec{A} > 0$. So $(\vec{A} + \vec{B})$ is also spacelike.

(b) Timelike vector and null vector cannot be orthogonal.

Call the timelike vector $\vec{A}$. Let's keep the algebra simple and rotate to a coordinate frame such that the space part of the null vector $\vec{N}$ is all in one component,

\[
\vec{N} \to_{O} (N^0, N^1, 0, 0) = (N^0, N^0, 0, 0)
\]

Because it's a null vector, $N^0 = N^1$ (orienting the $x$-axis so that the signs agree). The timelike vector $\vec{A}$ has unknown components in this frame, but

\[
\vec{A}\cdot\vec{N} = -A^0 N^0 + A^1 N^0 = N^0 (A^1 - A^0)
\]

But if $(A^1 - A^0) = 0$ then

\[
\vec{A}\cdot\vec{A} = (A^2)^2 + (A^3)^2 \geq 0
\]

which contradicts the stipulation that $\vec{A}$ is timelike. Thus $(A^1 - A^0) \neq 0$. Thus if $\vec{N}$ is not the zero vector then $N^0 \neq 0$ and $\vec{A}\cdot\vec{N} \neq 0$, so they are not orthogonal.

19 Consider a uniformly accelerated particle. That is, it has a 4-acceleration $\vec{a}$ of constant magnitude $\vec{a}\cdot\vec{a} = \alpha^2$ where $\alpha^2 \geq 0$ is a constant.

(a) Show that $\vec{a}$ has constant components in the particle's MCRF, and that these components look like normal, Galilean, acceleration terms.

From Eq. (2.32) the 4-acceleration is always normal to the 4-velocity. But in the MCRF $\vec{U} = \vec{e}_{\bar 0}$. Without loss of generality we can take $\vec{a}$ to have only one spatial component, say in the $x$-direction, with increasing $x$ in the direction of the acceleration. So

\[
\vec{a} = (a^0, a^1, 0, 0)
\]

with this orientation of the spatial axes. Then Eq. (2.32) requires that $a^0 = 0$ and $a^1 = \alpha$.

\[
\vec{a} \to_{\mathrm{MCRF}} (0, \alpha, 0, 0)
\]

In some small amount of time, say $d\bar{t} = d\tau$, in the MCRF at point $P$, the 4-velocity will change by

\[
d\vec{U} = d\bar{t}\,\vec{a} = d\bar{t}\,(0, \alpha, 0, 0)
\]

Since the 4-velocity started at $\vec{e}_{\bar 0}$ in the MCRF (by definition), the velocity at the end of this small amount of time will be simply

\[
\vec{U}(d\bar{t}) \approx (1, d\bar{t}\,\alpha, 0, 0)
\]

The equality can be made arbitrarily accurate for small $d\bar{t}$, wherein the relativistic effects are negligible. So the 3-velocity changes during $d\bar{t}$ from zero to $d\bar{t}\,\alpha$ in the $x$-direction, just as in Galilean acceleration, which is what we were required to prove. (See Exercise 15c for converting 4-velocity to 3-velocity.)

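19 (b) Say $\alpha = 10\ \mathrm{m\,s^{-2}}$ and the particle starts from rest at $t = 0$. Find a general expression for the speed at time $t$.

The trick is to work in the MCRF to find the change in velocity in some small increment of time $d\bar{t}$, such that relativistic corrections are small, as in (a) above. Then one finds the new 4-velocity in the MCRF. But then one must transform the new 4-velocity and the time increment back to the original frame using a Lorentz transformation.

At some arbitrary point $P$ with coordinates $(t_p, x_p, 0, 0)$ the particle will be moving at speed $v_p$ in the positive $x$-direction. So applying the Lorentz transformation from the MCRF back to the original frame we find the new velocity is just:

\[
\vec{U} \to_{O}
\begin{pmatrix}
\gamma(-v_p) & +v_p\,\gamma(-v_p) & 0 & 0\\
+v_p\,\gamma(-v_p) & \gamma(-v_p) & 0 & 0\\
0 & 0 & 1 & 0\\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} 1\\ \alpha\,d\bar{t}\\ 0\\ 0 \end{pmatrix}
=
\begin{pmatrix} \gamma\,[1 + v_p\,\alpha\,d\bar{t}]\\ \gamma\,[v_p + \alpha\,d\bar{t}]\\ 0\\ 0 \end{pmatrix},
\]

The new 3-velocity is:

\begin{align}
u_x = \frac{U^1}{U^0} &= \frac{\gamma\,[v_p + \alpha\,d\bar{t}]}{\gamma\,[1 + v_p\,\alpha\,d\bar{t}]}
= \frac{[v_p + \alpha\,d\bar{t}]}{[1 + v_p\,\alpha\,d\bar{t}]} \nonumber\\
&\approx [v_p + \alpha\,d\bar{t}]\,[1 - v_p\,\alpha\,d\bar{t}] \nonumber\\
&\approx v_p + \alpha\,d\bar{t} - v_p^2\,\alpha\,d\bar{t} + O(d\bar{t}^2) \tag{2.23}
\end{align}

This change in velocity occurred in time increment $d\bar{t}$ in the MCRF, which corresponds in $O$ to

\begin{align}
dt &= \gamma\,d\bar{t} + \gamma\,v_p\,d\bar{x} \nonumber\\
&= \gamma\,d\bar{t} + \gamma\,v_p\,\tfrac{1}{2}\alpha\,d\bar{t}^2 \nonumber\\
&\approx \gamma\,d\bar{t} \tag{2.24}
\end{align}

The acceleration in $O$ is

\begin{align}
\frac{du_x}{dt} &= \frac{v_p + \alpha\,d\bar{t} - v_p^2\,\alpha\,d\bar{t} - v_p}{dt} \nonumber\\
&= \frac{\alpha\,d\bar{t} - v_p^2\,\alpha\,d\bar{t}}{\gamma\,d\bar{t}} \nonumber\\
&= \frac{\alpha\,(1 - v_p^2)}{\gamma} = \frac{\alpha}{\gamma^3} \tag{2.25}
\end{align}

This differential equation can be written

\[
\gamma^3\,dv = \alpha\,dt \tag{2.26}
\]

which can be integrated immediately

\begin{align}
\int_0^{v_p} \gamma^3\,dv &= \int_0^{t_p} \alpha\,dt \nonumber\\
\frac{v_p}{\sqrt{1 - v_p^2}} &= \alpha\,t_p \tag{2.27}
\end{align}

Solving for $v_p$,

\[
v_p(t_p) = \frac{\alpha\,t_p}{\sqrt{1 + \alpha^2 t_p^2}} \tag{2.28}
\]

We can immediately solve for the distance traveled by

\begin{align}
\frac{dx}{dt} &= v_p(t) = \frac{\alpha\,t}{\sqrt{1 + \alpha^2 t^2}} \nonumber\\
dx &= \frac{\alpha\,t}{\sqrt{1 + \alpha^2 t^2}}\,dt \nonumber\\
x &= \frac{1}{\alpha}\left(\sqrt{1 + \alpha^2 t^2} - 1\right) \tag{2.29}
\end{align}

Setting $v_p = 0.999$ and $\alpha = 10\ \mathrm{m\,s^{-2}} = \frac{10}{3\times 10^8}\ \mathrm{s^{-1}}$ we find a time of

\[
t_p = 6.7\times 10^8\ \mathrm{s} \approx 21\ \text{years}
\]

and expressing $\alpha = 10/(3\times 10^8)^2\ \mathrm{m^{-1}}$

\[
x_p = \frac{(3\times 10^8)^2}{10}\left[\sqrt{1 + \frac{\alpha^2[\mathrm{m/s^2}]^2\,t_p^2[\mathrm{s}]^2}{(3\times 10^8)^2}} - 1\right] = 1.9\times 10^{17}\ \mathrm{m}
\]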

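19 (c) Find the elapsed proper time for the particle in (b).

Recall from part (a) that the time increment in the MCRF was the proper time increment, $d\bar{t} = d\tau$, and this was related to the time increment in the original frame via the Lorentz transformation with the spatial part playing a negligible role:

\begin{align}
dt &= \gamma\,d\bar{t} + \gamma\,v_p\,d\bar{x} \nonumber\\
&= \gamma\,d\bar{t} + \gamma\,v_p\,\tfrac{1}{2}\alpha\,d\bar{t}^2 \nonumber\\
&\approx \gamma\,d\bar{t} \tag{2.30}
\end{align}

So

\[
d\tau = d\bar{t} = \frac{1}{\gamma(v)}\,dt
\]

We can use the expression derived above (2.26)

\begin{align}
d\tau &= \frac{1}{\gamma(v)}\,dt \nonumber\\
&= \frac{1}{\alpha}\,\gamma(v)^2\,dv \tag{2.31}
\end{align}

which can also be integrated immediately

\begin{align}
\int_0^{\tau_p} d\tau &= \int_0^{v_p} \frac{1}{\alpha}\,\gamma(v)^2\,dv \nonumber\\
\tau_p &= \frac{1}{\alpha}\,\mathrm{arctanh}(v_p) \nonumber\\
\tau_p(t_p) &= \frac{1}{\alpha}\,\mathrm{arctanh}\!\left(\frac{\alpha\,t_p}{\sqrt{1 + \alpha^2 t_p^2}}\right) \tag{2.32}
\end{align}

We can solve for the proper time elapsed during the 21 years it took to accelerate the particle to $v = 0.999$:

\[
\tau_p = \frac{c\,[\mathrm{m/s}]}{\alpha\,[\mathrm{m/s^2}]}\,\mathrm{arctanh}(0.999) = 1.14\times 10^8\ [\mathrm{s}] \approx 3.6\ \text{years}
\]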
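The numbers quoted for Exercise 19 can be reproduced with a short calculation. This is a sketch under stated assumptions: it takes $c = 3\times 10^8$ m/s as in the text, a year of $3.156\times 10^7$ s (an assumed conversion factor), and uses $\alpha t = v\gamma$ from (2.27), $x = (c^2/\alpha)(\gamma - 1)$, which follows from integrating $v = \alpha t/\sqrt{1 + \alpha^2 t^2}$, and $\tau = (c/\alpha)\,\mathrm{arctanh}(v)$. The function names are hypothetical.

```python
import math

C = 3.0e8                 # speed of light in m/s, rounded as in the text
ALPHA = 10.0              # proper acceleration in m/s^2
SECONDS_PER_YEAR = 3.156e7  # assumed conversion factor

def gamma(v):
    return 1.0 / math.sqrt(1.0 - v * v)

def coordinate_time(v):
    """Time in frame O to reach speed v (in units of c), from alpha*t/c = v*gamma."""
    return (C / ALPHA) * v * gamma(v)

def distance(v):
    """Distance covered in frame O when the speed reaches v: (c^2/alpha)(gamma - 1)."""
    return (C * C / ALPHA) * (gamma(v) - 1.0)

def proper_time(v):
    """Proper time elapsed on the particle's clock: (c/alpha) arctanh(v)."""
    return (C / ALPHA) * math.atanh(v)

v_final = 0.999
t_p = coordinate_time(v_final)
x_p = distance(v_final)
tau_p = proper_time(v_final)
print(t_p / SECONDS_PER_YEAR, x_p, tau_p / SECONDS_PER_YEAR)
```

With these inputs the coordinate time comes out near 21 years, the distance near $1.9\times 10^{17}$ m, and the proper time near 3.6 years.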

20 The particle moves in a circle in the $x$-$y$ plane of radius $b$, in a clockwise sense when viewed in the direction of decreasing $z$. The circle translates along the $x$-axis at speed $a$. It's stated that $|b\omega| < 1$, but the requirement for a realistic particle is actually that $|a + b\omega| < 1$. The 3-velocity is computed directly by differentiating the given equations, $\mathbf{v} \to_{O} (\dot{x}, \dot{y}, 0)$, where

\begin{align}
\dot{x} &= a + \omega b\cos(\omega t) \tag{2.33}\\
\dot{y} &= -\omega b\sin(\omega t) \tag{2.34}
\end{align}

The 4-velocity is obtained from the 3-velocity using the formula derived in problem 2.15b.

\begin{align}
\vec{U} &\to_{O} (\gamma(v),\ \dot{x}\,\gamma(v),\ \dot{y}\,\gamma(v),\ 0) \nonumber\\
&= \gamma(v)\,(1,\ a + b\omega\cos(\omega t),\ -\omega b\sin(\omega t),\ 0) \tag{2.35}
\end{align}

where $v = |\mathbf{v}| = \sqrt{(a + \omega b\cos(\omega t))^2 + (\omega b\sin(\omega t))^2} = \sqrt{a^2 + 2a\omega b\cos(\omega t) + \omega^2 b^2}$.

To obtain the 4-acceleration we require the 4-velocity as a function of proper time, $\tau$, not $t$, the time in the inertial frame. But remember that the proper time is the time measured by a clock at, say, the origin of the MCRF. Call this frame $\bar{O}$, and then $\bar{t} = \tau = x^{\bar 0}$. And $t = \Lambda(-v)^{0}{}_{\bar\alpha}\,x^{\bar\alpha}$.

For simplicity we choose the MCRF with origin at the particle location, so $x^{\bar\alpha} \to_{\bar O} (\tau, 0, 0, 0)$, and $t = \gamma(-v)\tau = \gamma(v)\tau$. Then we obtain the 4-acceleration from the given equations in $t$ and the chain rule,

\[
\vec{a} \equiv \frac{d\vec{U}}{d\tau} = \frac{d\vec{U}}{dt}\frac{dt}{d\tau} = \gamma(v)\frac{d\vec{U}}{dt}
\]

We now confront the question as to whether or not to let $\gamma(v)$ vary in this derivative! The answer is yes. See my supplementary problem SP.3 in the next section, in which we derived a general expression for the 4-acceleration, see (2.53).

\[
\vec{a} = \gamma^3\,[\dot{x}\ddot{x} + \dot{y}\ddot{y} + \dot{z}\ddot{z}]\,\vec{U} + \gamma^2\,(0, \ddot{x}, \ddot{y}, \ddot{z}) \tag{2.36}
\]

Substituting the values for our particular problem we find:

\[
\vec{a} = \gamma^3\,[-\omega^2 ab\sin(\omega t)]\,\vec{U} + \gamma^2\,(0,\ -\omega^2 b\sin(\omega t),\ -\omega^2 b\cos(\omega t),\ 0) \tag{2.37}
\]

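21 The motion is hyperbolic in frame $O$,

\[
x^2 - t^2 = a^2\cosh^2\!\left(\frac{\lambda}{a}\right) - a^2\sinh^2\!\left(\frac{\lambda}{a}\right) = a^2
\]

and therefore hyperbolic in all reference frames, $-\bar{t}^2 + \bar{x}^2 = a^2$. The velocity is obtained by differentiating with respect to $\lambda$,

\[
v = \frac{dx}{dt} = \frac{dx}{d\lambda}\Big/\frac{dt}{d\lambda} = \tanh\!\left(\frac{\lambda}{a}\right).
\]

So we notice that $\left(\frac{\lambda}{a}\right)$ is a velocity parameter for $v$, see problem 1.18.

The Lorentz transformation to the MCRF can be written in a simple form with the velocity parameter, see problem 1.20:

\[
\Lambda =
\begin{pmatrix}
\cosh\!\left(\frac{\lambda}{a}\right) & -\sinh\!\left(\frac{\lambda}{a}\right)\\
-\sinh\!\left(\frac{\lambda}{a}\right) & \cosh\!\left(\frac{\lambda}{a}\right)
\end{pmatrix}
\]

Thus we find the points transform to

\[
\begin{pmatrix} \bar{t}(\lambda)\\ \bar{x}(\lambda) \end{pmatrix}
=
\begin{pmatrix}
\cosh\!\left(\frac{\lambda}{a}\right) & -\sinh\!\left(\frac{\lambda}{a}\right)\\
-\sinh\!\left(\frac{\lambda}{a}\right) & \cosh\!\left(\frac{\lambda}{a}\right)
\end{pmatrix}
\begin{pmatrix} a\sinh\!\left(\frac{\lambda}{a}\right)\\ a\cosh\!\left(\frac{\lambda}{a}\right) \end{pmatrix}
=
\begin{pmatrix} 0\\ a \end{pmatrix}
\]

The particle always ends up on the $\bar{x}$-axis.

To show that the parameter $\lambda$ is the proper time, we show that

\[
\frac{d\bar{t}}{d\lambda} = 1
\]

for a MCRF and any $\lambda$. This is a bit subtle, because we want to hold the Lorentz transformation fixed (so hold $\lambda = \lambda_{\mathrm{MCRF}}$ fixed), so that the MCRF is inertial. But we want to let $\lambda$ vary about $\lambda = \lambda_{\mathrm{MCRF}}$ so we can take the derivative of $\bar{t}(\lambda)$ wrt $\lambda$. I've written out this dependence explicitly below:

\[
\bar{t}(\lambda) = \cosh\!\left(\frac{\lambda_{\mathrm{MCRF}}}{a}\right) a\sinh\!\left(\frac{\lambda}{a}\right) - \sinh\!\left(\frac{\lambda_{\mathrm{MCRF}}}{a}\right) a\cosh\!\left(\frac{\lambda}{a}\right)
\]

Now differentiate wrt $\lambda$, and evaluate at $\lambda = \lambda_{\mathrm{MCRF}}$ giving,

\[
\frac{d\bar{t}}{d\lambda} = \cosh^2\!\left(\frac{\lambda}{a}\right) - \sinh^2\!\left(\frac{\lambda}{a}\right) = 1
\]

The 4-velocity is

\[
\vec{U} \to_{O} \left(\cosh\!\left(\frac{\lambda}{a}\right),\ \sinh\!\left(\frac{\lambda}{a}\right),\ 0,\ 0\right)
\]

The 4-acceleration is easy for this problem because we have the 4-velocity as a function of proper time!

\[
\vec{a} \equiv \frac{d\vec{U}}{d\tau} = \frac{d\vec{U}}{d\lambda} \to_{O} \left(\frac{1}{a}\sinh\!\left(\frac{\lambda}{a}\right),\ \frac{1}{a}\cosh\!\left(\frac{\lambda}{a}\right),\ 0,\ 0\right)
\]

We can check if it's orthogonal to the 4-velocity, as it should be.

\[
\vec{U}\cdot\vec{a} = -\frac{1}{a}\sinh\!\left(\frac{\lambda}{a}\right)\cosh\!\left(\frac{\lambda}{a}\right) + \frac{1}{a}\sinh\!\left(\frac{\lambda}{a}\right)\cosh\!\left(\frac{\lambda}{a}\right) = 0.
\]

Is it uniformly accelerating?

\[
\vec{a}\cdot\vec{a} = -\frac{1}{a^2}\sinh^2\!\left(\frac{\lambda}{a}\right) + \frac{1}{a^2}\cosh^2\!\left(\frac{\lambda}{a}\right) = \frac{1}{a^2}.
\]

And $a$ was given as constant, and it's always pointing in the $x$-direction, so it is uniformly accelerating (see definition in problem 2.19).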

22 (a) Given 4-momentum, $\vec{p} \to_{O} (4, 1, 1, 0)$ kg. Find:

Energy in $O$: In general $\vec{p} \to_{O} (E, p^1, p^2, p^3)$, so $E = 4$ kg.

3-velocity in $O$: In general $m\vec{U} = \vec{p}$, where $m$ is the rest mass and $\vec{U}$ is the 4-velocity. And the 3-velocity is related to the 4-velocity as inferred in problem 2.15b, $U^{\alpha} = \Lambda(-|\mathbf{v}|)^{\alpha}{}_{\bar 0}$. So $\vec{p} \to_{O} (m\gamma, mu\gamma, mv\gamma, mw\gamma)$, where $\mathbf{v} \to_{O} (u, v, w)$ are the components of the 3-velocity. Note that $E = m\gamma$, and simply dividing through by $E$ gives $\mathbf{v} \to_{O} (1/4, 1/4, 0)$.

Rest mass:

\[
\gamma = \frac{1}{\sqrt{1 - \mathbf{v}\cdot\mathbf{v}}} = \frac{4}{\sqrt{14}}.
\]

From which it follows from $E = m\gamma = 4$ that $m = \sqrt{14} \approx 3.74$ kg.

(b) We must apply the law of conservation of 4-momentum.

\[
\vec{p}_I = \vec{p}_1 + \vec{p}_2 \to_{O} (5, 0, 1, 0)\ \text{kg}
\]

By conservation of 4-momentum,

\[
\vec{p}_F = \vec{p}_I = \vec{p}_3 + \vec{p}_4 + \vec{p}_5,
\]

so

\[
\vec{p}_5 = \vec{p}_I - \vec{p}_3 - \vec{p}_4 \to_{O} (3, -1/2, 1, 0)\ \text{kg}.
\]

Now, like in problem (a), we know the 4-momentum. From an analysis just like in (a), we find the 5th particle has in this same reference frame: $E_5 \to_{O} 3$ kg, and $\mathbf{v}_5 \to_{O} (-1/6, 1/3, 0)$. Finally, the rest mass is $m = \sqrt{31}/2 \approx 2.78$ kg.

The CM frame is found by finding the Lorentz transformation that transforms $\vec{p}_F$ to have only a time component,

\[
\Lambda^{\bar\alpha}{}_{\beta}\,p^{\beta} \propto (\vec{e}_{\bar 0})^{\bar\alpha}
\]

This gives the equation for the $y$-direction, with $p^0 = 5$ and $p^2 = 1$,

\[
-v\gamma\,5 + \gamma = 0
\]

So the CM has 3-velocity $\mathbf{v} \to_{O} (0, 1/5, 0)$.
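The bookkeeping in Exercise 22 is a natural candidate for a quick numerical check. The sketch below uses only the 4-momenta quoted above, in units of kg with $c = 1$; the helper names `rest_mass` and `three_velocity` are hypothetical.

```python
import math

def rest_mass(p):
    """Rest mass from m^2 = E^2 - |p|^2, signature (-, +, +, +), c = 1."""
    return math.sqrt(p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2)

def three_velocity(p):
    """3-velocity as spatial momentum divided by energy."""
    return tuple(component / p[0] for component in p[1:])

# Part (a)
p_a = (4.0, 1.0, 1.0, 0.0)
m_a = rest_mass(p_a)            # sqrt(14)
v_a = three_velocity(p_a)       # (1/4, 1/4, 0)

# Part (b): total initial momentum and the fifth particle, as quoted in the text
p_initial = (5.0, 0.0, 1.0, 0.0)
p_5 = (3.0, -0.5, 1.0, 0.0)
m_5 = rest_mass(p_5)            # sqrt(31)/2
v_5 = three_velocity(p_5)       # (-1/6, 1/3, 0)
v_cm = three_velocity(p_initial)  # velocity of the frame with zero total 3-momentum

print(m_a, v_a, m_5, v_5, v_cm)
```

Dividing the total spatial momentum by the total energy gives the CM velocity directly, which is equivalent to solving the $y$-component condition above.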

23 Find the energy given the 3-velocity and rest mass.

First find the 4-momentum, $\vec{p} = m\vec{U} = m\gamma\,(1, u, v, w)$. And the energy is the time-part of the 4-momentum,

\[
E = m\gamma
\]

We can find an approximate value of $\gamma$ from the binomial series, http://en.wikipedia.org/wiki/Binomial_series. This is just a Taylor series about $x = 0$ of $(1 + x)^{\alpha}$. Let $x = -\mathbf{v}\cdot\mathbf{v} = -v^2$, and $\alpha = -1/2$, so we obtain,

\[
\gamma = 1 + \frac{1}{2}v^2 + \frac{3}{8}v^4 + \ldots
\]

So

\[
E \approx m\left(1 + \frac{1}{2}v^2 + \frac{3}{8}v^4 + \ldots\right)
\]

i.e. the rest mass, plus the classical kinetic energy, plus a correction of order $O(v^4)$. The correction is 1/2 the kinetic energy when,

\[
v = \sqrt{2/3}
\]

24 Show that it's impossible for a positron and an electron to annihilate and produce a single $\gamma$-ray.

Apparently particles come and go, but 4-momentum is conserved. Line up the coordinates such that the $x$-axis is aligned with the direction of propagation of the $\gamma$-ray. Then conservation of 4-momentum,

\[
\vec{p}_{e^+} + \vec{p}_{e^-} = \vec{p}_{\gamma},
\]

gives two equations. The time part looks like conservation of energy,

\[
p^0_{e^+} + p^0_{e^-} = p^0_{\gamma},
\]

while the spatial part looks like traditional conservation of momentum,

\[
p^1_{e^+} + p^1_{e^-} = p^1_{\gamma}.
\]

It's important to realize that they are not independent, since in a reference frame wherein the electron and positron move with velocities $v_{e^-}$ and $v_{e^+}$, we have

\begin{align}
m\,(\gamma(v_{e^+}) + \gamma(v_{e^-})) &= h\nu \tag{2.38}\\
m\,(\gamma(v_{e^+})\,v_{e^+} + \gamma(v_{e^-})\,v_{e^-}) &= h\nu, \tag{2.39}
\end{align}

where $m$ is the rest mass of the electron and positron and $\nu$ is the frequency of the $\gamma$-ray. The only mathematical solution is then $v_{e^-} = v_{e^+} = 1$, which is physically impossible because of their non-zero rest mass. Nothing moves at the speed of light, except electromagnetic radiation and possibly gravity waves if they exist.

It's possible to produce two $\gamma$-rays. Suppose they travel in opposite directions with equal and opposite momentum in some frame of reference. Then the total final 3-momentum is zero. To satisfy momentum conservation we only require that the positron and electron have equal and opposite momentum in the same frame of reference, so $v_{e^+} = -v_{e^-}$ with arbitrary $v_{e^+}$, which can obviously be satisfied.

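25 Doppler shift.

In frame $O$ the photon has 4-momentum

\[
\vec{p} \to_{O} (h\nu,\ h\nu\cos(\theta),\ h\nu\sin(\theta),\ 0)
\]

Transforming to the frame $\bar{O}$ moving at speed $v$ along the $x$-axis, we apply the Lorentz transformation

\[
\Lambda(v) =
\begin{pmatrix}
\gamma(v) & -v\,\gamma(v) & 0 & 0\\
-v\,\gamma(v) & \gamma(v) & 0 & 0\\
0 & 0 & 1 & 0\\
0 & 0 & 0 & 1
\end{pmatrix}
\]

to obtain

\[
\vec{p} \to_{\bar O}
\begin{pmatrix}
\gamma(v)\,h\nu - v\gamma(v)\,h\nu\cos(\theta)\\
-v\gamma(v)\,h\nu + \gamma(v)\,h\nu\cos(\theta)\\
h\nu\sin(\theta)\\
0
\end{pmatrix}
\]

So the Doppler shift is obtained from the time component, i.e. the first component, and can be expressed as,

\[
\frac{\bar\nu}{\nu} = \gamma(v)\,(1 - v\cos(\theta)) = \frac{1}{\sqrt{1 - v^2}}\,(1 - v\cos(\theta))
\]

as given.

(b)

No Doppler shift occurs when

\[
\frac{\bar\nu}{\nu} = 1 = \frac{1}{\sqrt{1 - v^2}}\,(1 - v\cos(\theta))
\]

So

\[
v = \frac{2\cos\theta}{1 + \cos^2\theta}
\quad\text{or}\quad
\theta = \arccos\!\left(\frac{1 - \sqrt{1 - v^2}}{v}\right)
\]

Extra questions: Does this have solutions? For $|v| \ll 1$ use the binomial series to see that $\theta \approx \pi/2$. What's the maximum angle of no Doppler shift? As $v \to 1$, $\theta \to 0$. Show that at $v = 1/2$, $\theta \approx 74.5°$.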
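The extra questions in (b) can be explored numerically. Here is a minimal sketch, with hypothetical function names `doppler_ratio` and `no_shift_angle`, that evaluates the angle of no Doppler shift and then substitutes it back into the Doppler formula.

```python
import math

def doppler_ratio(v, theta):
    """Ratio of observed to emitted frequency, gamma(v) * (1 - v*cos(theta))."""
    return (1.0 - v * math.cos(theta)) / math.sqrt(1.0 - v * v)

def no_shift_angle(v):
    """Angle (radians) at which the ratio equals 1, for 0 < v < 1."""
    return math.acos((1.0 - math.sqrt(1.0 - v * v)) / v)

theta_half = no_shift_angle(0.5)
theta_half_deg = math.degrees(theta_half)    # roughly 74.5 degrees
ratio_check = doppler_ratio(0.5, theta_half) # should be 1 to rounding error

theta_slow_deg = math.degrees(no_shift_angle(1e-3))   # close to 90 degrees
theta_fast_deg = math.degrees(no_shift_angle(0.9999)) # shrinks as v -> 1
print(theta_half_deg, ratio_check, theta_slow_deg, theta_fast_deg)
```

The slow and fast cases illustrate the limits $\theta \to \pi/2$ for $|v| \ll 1$ and $\theta \to 0$ as $v \to 1$.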

(c)

Eq. (2.35) is the frame-invariant expression for energy $\bar{E}$ relative to an observer moving with velocity $\vec{U}_{\mathrm{obs}}$,

\[
-\vec{p}\cdot\vec{U}_{\mathrm{obs}} = \bar{E}
\]

and Eq. (2.38) was just $E = h\nu$. This calculation ends up being exactly the same as above, but allows one to focus on the relevant parts, i.e. just the time component. Since

\[
\vec{U}_{\mathrm{obs}} \to_{O} (\gamma(v),\ \gamma(v)\,v,\ 0,\ 0)
\]

and recall

\[
\vec{p} \to_{O} (h\nu,\ h\nu\cos(\theta),\ h\nu\sin(\theta),\ 0)
\]

so we can immediately find

\[
\bar{E} = \gamma(v)\,h\nu - v\gamma(v)\,h\nu\cos(\theta),
\]

which was the time component of the $\vec{p} \to_{\bar O}$ found in (a) above.

26 Energy required to accelerate an object with rest mass $m$ from $v$ to $v + \delta v$, to first order in $\delta v$.

\[
E = m\gamma(v) = m\frac{1}{\sqrt{1 - v^2}}
\]

so the change in energy is just

\[
\delta E = m\,(\gamma(v + \delta v) - \gamma(v)).
\]

When $v \ll 1$ the problem is easy. Just differentiate $\gamma$ wrt $v$ to get the Taylor series approximation

\[
\gamma(v + \delta v) - \gamma(v) = \gamma'(v)\,\delta v + \frac{1}{2}\gamma''(v)\,\delta v^2 + \ldots
\]

where

\begin{align}
\gamma' &= \frac{d\gamma}{dv} = v\gamma^3 \tag{2.40}\\
\gamma'' &= \gamma^3 + 3v\gamma^2\gamma' \tag{2.41}
\end{align}

So

\[
\gamma(v + \delta v) - \gamma(v) = v\gamma^3\,\delta v + O(\delta v^2)
\]

And so the change in energy is,

\[
\delta E \approx m v \gamma^3\,\delta v.
\]

A subtlety arises when $v$ is not small. The coefficient $\gamma''$ becomes large relative to $\gamma'$, so ignoring the $O(\delta v^2)$ term becomes misleading. The author should have instructed us to check this. In particular,

\[
\frac{\gamma''}{\gamma'} = \frac{1}{v} + 3v\gamma^2
\]

When $v \ll 1$ we can replace

\[
\frac{1}{2}\gamma''\,\delta v^2 \approx \gamma'\,\delta v\left(\frac{\delta v}{2v}\right) \ll \gamma'(v)\,\delta v
\]

since we're given that $\left(\frac{\delta v}{v}\right) \ll 1$. So we're still justified in ignoring the 2nd term in the Taylor series. But when $v$ is not small we need another approach.

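The above argument is not formally correct when $v$ is not small because the higher order terms in the Taylor series can no longer be ignored. Here is one approach.

Write $v = 1 - \epsilon$ where $0 < \epsilon \ll 1$, so we're close to the speed of light. Use $\epsilon \ll 1$ and the Binomial series to simplify $\gamma$,

\[
\gamma(v) = \frac{1}{\sqrt{\epsilon(2 - \epsilon)}} \approx \frac{1}{\sqrt{2\epsilon}}\,(1 + \epsilon/4),
\]

and

\[
\gamma(v + \delta v) = \frac{1}{\sqrt{(\epsilon - \delta\epsilon)(2 - \epsilon + \delta\epsilon)}}
\]

where $\delta\epsilon = \delta v$. To simplify the latter we need to consider the case where $|\delta\epsilon| \ll \epsilon$. But this is not so restrictive. Then

\[
\gamma(v + \delta v) = \frac{1}{\sqrt{(\epsilon - \delta\epsilon)(2 - \epsilon + \delta\epsilon)}} \approx \frac{1}{\sqrt{2\epsilon}}\left(1 + \frac{\delta\epsilon}{2\epsilon}\right)\left(1 + \frac{\epsilon - \delta\epsilon}{4}\right)
\]

To find the perturbation in energy we take the difference, keeping the leading term,

\[
\gamma(v + \delta v) - \gamma(v) \approx \frac{1}{\sqrt{2\epsilon}}\left(\frac{\delta\epsilon}{2\epsilon}\right)
\]

It's clear that as $\epsilon \to 0$, so $v \to 1$, this difference grows without bound.

A simpler and better solution: Write $v = 1 - \epsilon$ where $0 < \epsilon \ll 1$, so we're close to the speed of light.

\[
\gamma(v) = \frac{1}{\sqrt{\epsilon(2 - \epsilon)}}.
\]

Now expand this in a Taylor Series in $\epsilon$:

\[
\gamma(v + \delta v) - \gamma(v) = \frac{d\gamma}{d\epsilon}(-\delta\epsilon) + \frac{1}{2}\frac{d^2\gamma}{d\epsilon^2}(-\delta\epsilon)^2 + \ldots
\]

And

\[
\frac{d\gamma}{d\epsilon} = \frac{-(1 - \epsilon)}{(2\epsilon)^{3/2}(1 - \epsilon/2)^{3/2}} \approx \frac{-(1 - \epsilon)\left(1 + \frac{3\epsilon}{4}\right)}{(2\epsilon)^{3/2}}
\]

where the approximation exploits $0 < \epsilon \ll 1$ with the Binomial Series approximation. It's important to check the size of the 2nd derivative relative to the first. We find, again using the Binomial Series,

\[
\frac{d^2\gamma}{d\epsilon^2} \approx -\frac{3}{2\epsilon}\,\frac{d\gamma}{d\epsilon}
\]

so we're only justified in ignoring the 2nd term if $|\delta\epsilon| \ll \epsilon$. In this case, the change in energy is

\[
\delta E \approx m\frac{1}{(2\epsilon)^{3/2}}\,\delta v = m\gamma^3\,\delta v + O(\epsilon)
\]

This actually agrees with the result we would have obtained from using the simple Taylor Series above.

We're asked to show that the energy becomes infinite when $v \to 1$. This is easily obtained by noting that $\gamma$ is finite for $0 \leq v < 1$. However,

\[
\lim_{v \to 1} \gamma(v) \to \infty.
\]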

27 Increasing temperature increases the rest mass.

Object has rest mass, $m(T_0) = 10$ [kg]. Increasing temperature from $T_0$ to $T$ by heat flux $\delta Q = 100$ J. This must be reflected in an increase in rest mass, since in the MCRF of the object, $U^0 = 1$ and $mU^0 = p^0 = E$. So

\[
m(T) = m(T_0)\,[\mathrm{kg}] + \delta Q\,[\mathrm{J}]/c^2\,[\mathrm{m^2/s^2}] = 10 + 1.1\times 10^{-15}\ [\mathrm{kg}]
\]

This problem is interesting to look at from a thermodynamics point of view. The heat flux increases the temperature and enthalpy of the object, which is reflected on a microscopic scale by an increase in the motion, relative to the centre of mass of the object, of the elements (atoms or molecules or sea of electrons depending on the material) composing the object. This motion increases the effective mass of the elements. Say an element has rest mass $m_i$, then when it has thermal speed $v_i$ it has "relativistic mass"

\[
m_{i,\mathrm{rel}} = m_i\,\gamma(v_i).
\]

I found this website, which expands on these ideas http://en.wikipedia.org/wiki/Massenergy_equivalence.

28 Boring.

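29

\begin{align}
\frac{d}{d\tau}(\vec{U}\cdot\vec{U}) &= \frac{d}{d\tau}\left(-(U^0)^2 + (U^1)^2 + (U^2)^2 + (U^3)^2\right) \tag{2.42}\\
&= -2U^0\frac{dU^0}{d\tau} + 2U^i\frac{dU^i}{d\tau} \tag{2.43}\\
&= 2\,\vec{U}\cdot\frac{d\vec{U}}{d\tau} \tag{2.44}
\end{align}

Q.E.D.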

30 Four velocity of rocket ship,

\[
\vec{U} \to_{O} (2, 1, 1, 1)
\]

High-velocity cosmic ray with 4-momentum,

\[
\vec{P} \to_{O} (300, 299, 0, 0)\times 10^{-27}\ \mathrm{kg}
\]

(a) Transform to MCRF of rocket ship. We know from Ex. 2.15 that for a general particle 3-velocity $\mathbf{v} = (u, v, w)$, the 4-velocity is

\[
\vec{A} \to_{O} (\gamma(|\mathbf{v}|),\ u\,\gamma(|\mathbf{v}|),\ v\,\gamma(|\mathbf{v}|),\ w\,\gamma(|\mathbf{v}|)) \tag{2.46}
\]

where

\[
|\mathbf{v}| = \sqrt{u^2 + v^2 + w^2}.
\]

Inspection of $\vec{U}$ reveals that

\[
\gamma = 2, \qquad u = 1/2, \qquad v = 1/2, \qquad w = 1/2
\]

and $|\mathbf{v}| = \sqrt{u^2 + v^2 + w^2} = \sqrt{3}/2$. Now we need the Lorentz transformation for a reference frame moving with a 3-velocity with more than one non-zero component. Up to this point we haven't learned this, and I'm a bit surprised Schutz has thrown this at us now. To lead one through the steps to construct a general Lorentz transformation, I've created supplementary problem SP.1 in section 2.10. Here we note that we actually only need the first row of the Lorentz transformation matrix, since we only require $P^{\bar 0} = \bar{E}$. This first row must be such that it transforms $\vec{U} \to_{\bar O} (1, 0, 0, 0)$. Thus it must be related to the components of $\vec{U}$ as follows:

\[
\Lambda^{\bar 0}{}_{0} = U^0, \qquad \Lambda^{\bar 0}{}_{i} = -U^i.
\]

Applying $\Lambda^{\bar 0}{}_{\alpha}$ to the given $\vec{P}$ gives $\bar{E} = (2\times 300 - 1\times 299)\times 10^{-27}\ \mathrm{kg} = 301\times 10^{-27}$ kg in the rocket ship frame.

(b)

\[
-\vec{P}\cdot\vec{U}_{\mathrm{obs}} = E_{\mathrm{obs}} = -10^{-27}\left(-300\times 2 + 299\times 1 + 0\times 1 + 0\times 1\right) = 301\times 10^{-27}\ \mathrm{kg}
\]

(c) Of course (b) was faster. The same computations were performed to get the answer, but in (b) we only did the necessary computations.
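Both routes to the observed energy can be written in a few lines. This is a small sketch, with energies in units of $10^{-27}$ kg and a hypothetical helper `minkowski_dot`; it evaluates $-\vec{P}\cdot\vec{U}_{\mathrm{obs}}$ and, separately, the first row of the transformation acting on $\vec{P}$.

```python
def minkowski_dot(A, B):
    """Scalar product with signature (-, +, +, +)."""
    return -A[0] * B[0] + A[1] * B[1] + A[2] * B[2] + A[3] * B[3]

U_obs = (2.0, 1.0, 1.0, 1.0)        # rocket ship 4-velocity
P = (300.0, 299.0, 0.0, 0.0)        # cosmic-ray 4-momentum, units of 1e-27 kg

# Part (b): frame-invariant expression for the energy seen by the observer
E_invariant = -minkowski_dot(P, U_obs)

# Part (a): first row of the Lorentz transformation, (U^0, -U^1, -U^2, -U^3)
first_row = (U_obs[0], -U_obs[1], -U_obs[2], -U_obs[3])
E_first_row = sum(row_entry * p for row_entry, p in zip(first_row, P))

print(E_invariant, E_first_row)
```

The two results coincide because the first row of the transformation is just $-U_{\alpha}$ with the index lowered by the metric.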

31 Photon reflects off mirror without changing frequency $\nu$. Angle of incidence is $\theta$.

This appears to be a straightforward application of conservation of 4-momentum, but it's fun because it gets us thinking about all 4 components.

Let the mirror lie in the $y$-$z$ plane, with the photon travelling initially in the $x$-$y$ plane, with angle $\theta$ to the $x$-axis. Then the initial 4-momentum of the photon is written

\[
\vec{P}_i = (h\nu,\ \cos(\theta)\,h\nu,\ \sin(\theta)\,h\nu,\ 0).
\]

First let's construct the 4-momentum of the reflected photon $\vec{P}_r$. Since the photon frequency doesn't change, we know instantly the time component,

\[
P^0_r = P^0_i = h\nu.
\]

For a smooth mirror we assume (actually I'm just guessing!) that the momentum transferred is only in the $x$-direction. So then we can also construct the components,

\[
P^2_r = P^2_i = \sin(\theta)\,h\nu, \qquad P^3_r = P^3_i = 0
\]

Recall from Eq. (2.37) that the 4-momentum of a photon is orthogonal to itself. This alone gives us two possibilities, $P^1_r = \pm\cos(\theta)\,h\nu$. For the reflected photon, we choose the minus sign. In summary,

\[
\vec{P}_r = (h\nu,\ -h\nu\cos(\theta),\ h\nu\sin(\theta),\ 0).
\]

By conservation of 4-momentum, we see that the momentum transferred to the mirror must be $\Delta P^1_m = 2h\nu\cos(\theta)$ in the $x$-direction. How did the mirror acquire $x$-direction momentum without gaining energy? See Supplementary Problem SP 2 in section 2.10.

If the photon is absorbed, then the momentum transferred to the mirror has three components,

\[
\Delta\vec{P}_m = (\Delta E_m,\ \Delta P^1_m,\ \Delta P^2_m,\ 0) = (h\nu,\ h\nu\cos(\theta),\ h\nu\sin(\theta),\ 0),
\]

How did the mirror acquire the extra energy $\Delta E_m = h\nu$? See Supplementary Problem SP 2 in section 2.10.

32 Derive the Compton scattering relationship Eq. (2.43).

Initially the total 4-momentum in the particle's initial rest frame $O$ is

\[
\vec{P} \to_{O} (h\nu_i,\ h\nu_i,\ 0,\ 0) + (m,\ 0,\ 0,\ 0)
\]

After the scattering event,

\[
\vec{P} \to_{O} (h\nu_f,\ h\nu_f\cos(\theta),\ h\nu_f\sin(\theta),\ 0) + m\,(\gamma,\ v\gamma\cos(\phi),\ v\gamma\sin(\phi),\ 0)
\]

where $v$ and $\phi$ are the speed and the angle of the particle's scattered trajectory in the $x$-$y$ plane relative to the initial direction of the incident photon. Equating the three nonzero components of 4-momentum gives 3 equations for the 3 unknowns $\nu_f$, $v$, $\phi$. In principle one can then solve for $\nu_f$ in terms of the other two unknowns, but I found it too tedious to do so.

33 Compton scattering of a cosmic microwave background radiation photon off a cosmic ray (high-energy proton). What's the max frequency of the scattered photon?

Very nice problem. At first it appears very challenging, but the extreme difference in energy between the two particles simplifies things.

First we note that in the rest frame of the particle, Compton scattering only reduces the frequency, and more so for less massive particles (see also supplementary problem SP 2 below). So how can Compton scattering increase the energy of the photon? The increase in energy is revealed via the Doppler shift.

The key simplification in this problem is that the Compton scattering in the frame of the particle has very little effect on frequency.

\[
\frac{1}{h\nu_i} = 5000\ \mathrm{eV^{-1}} \gg \frac{1}{m_p} = 10^{-9}\ \mathrm{eV^{-1}}.
\]

So the angle of the Compton scattering has very little effect on the final frequency in the particle's initial rest frame. So in considering the effect of the angle, we need only consider its effect on the Doppler shift.

Now the problem is easy. The Doppler shift in frequency is given in general by Eq. (2.42). Obviously to maximize the frequency in the cosmic ray frame, $\bar\nu_i$, we want the photon and cosmic ray traveling in a line in opposite directions, i.e. $\theta = \pi$ radians, for which Eq. (2.42) gives

\[
h\bar\nu_i = h\nu_i\,\frac{1}{\sqrt{1 - v^2}}\,(1 + v) \approx h\nu_i\,\frac{2}{\sqrt{1 - v^2}} = h\nu_i\; 2\times 10^9 = 4\times 10^5\ \mathrm{eV}.
\]

The Doppler shift has made a tremendous increase in frequency! The Compton scattering will make very little difference, so to maximize the scattered frequency in the Sun's frame, choose the Compton scattering angle to maximize the Doppler shift. That is, choose the scattering angle to be $\pi$. Eq. (2.43) gives

\[
\frac{1}{h\bar\nu_f} = \frac{1}{h\bar\nu_i} + \frac{2}{m_p} = 0.25\times 10^{-5} + 2\times 10^{-9} \approx 0.25\times 10^{-5}\ [\mathrm{eV}]^{-1}.
\]

Compton scattering caused a negligible decrease in energy in the proton's frame. The proton, like the mirror in problem 31, is massive enough to cause little change in frequency of the photon in the proton's frame. See also Supplementary problem SP 2. Now Lorentz transform back to the Sun's frame. The photon again gains tremendously from the Doppler shift (that's why we choose the scattering angle to be complete reflection).

\[
h\nu_f \approx h\bar\nu_f\; 2\times 10^9 \approx 8\times 10^{14}\ \mathrm{eV}.
\]

This is a very hard $\gamma$-ray. A pair of 511 keV photons arising from annihilation of an electron and positron are considered to be $\gamma$-rays. And this is more than a billion times more energetic than that.
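The chain of estimates in Exercise 33 can be reproduced with simple arithmetic. The sketch below is illustrative only; it takes its inputs from the numbers quoted above ($1/h\nu_i = 5000\ \mathrm{eV^{-1}}$, $1/m_p = 10^{-9}\ \mathrm{eV^{-1}}$, and a head-on Doppler factor of $2\times 10^9$) and uses hypothetical variable names.

```python
# Inputs read off from the text (energies in eV)
h_nu_i = 1.0 / 5000.0        # incident photon energy in the Sun's frame, 2e-4 eV
m_p = 1.0e9                  # proton rest-mass energy as rounded in the text
doppler_factor = 2.0e9       # (1 + v) * gamma, approximated as 2 * gamma

# Step 1: Doppler shift into the proton's rest frame (head-on approach)
h_nu_i_bar = h_nu_i * doppler_factor

# Step 2: Compton back-scattering in the proton frame, Eq. (2.43) with angle pi
h_nu_f_bar = 1.0 / (1.0 / h_nu_i_bar + 2.0 / m_p)

# Step 3: Doppler shift back to the Sun's frame for the reflected photon
h_nu_f = h_nu_f_bar * doppler_factor

fractional_loss = 1.0 - h_nu_f_bar / h_nu_i_bar   # energy lost in the scattering
ratio_to_511_keV = h_nu_f / 511.0e3
print(h_nu_i_bar, h_nu_f_bar, h_nu_f, fractional_loss, ratio_to_511_keV)
```

The fractional loss in step 2 comes out well below one percent, which is the quantitative content of the statement that the scattering angle matters only through the Doppler shift.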

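34 These are quite trivial. For example, expand out the dot product in terms of components using the definition in Eq. (2.26), and use the linearity property given by Eq. (2.8),

\[
\begin{aligned}
(\alpha \vec{A}) \cdot \vec{B} &= -\alpha A^0 B^0 + \alpha A^1 B^1 + \alpha A^2 B^2 + \alpha A^3 B^3 \\
&= \alpha(-A^0 B^0 + A^1 B^1 + A^2 B^2 + A^3 B^3) \\
&= \alpha(\vec{A} \cdot \vec{B})
\end{aligned}
\tag{2.47}
\]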

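35 Show that $\vec{e}_{\bar\beta}$ obtained from Eq. (2.15),

\[ \vec{e}_{\bar\mu} = \Lambda^{\nu}{}_{\bar\mu}(-v)\,\vec{e}_{\nu}, \]

obey

\[ \vec{e}_{\bar\alpha} \cdot \vec{e}_{\bar\beta} = \eta_{\bar\alpha\bar\beta}. \]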


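\[
\begin{aligned}
\vec{e}_{\bar\alpha} \cdot \vec{e}_{\bar\beta} &= \Lambda^{\nu}{}_{\bar\alpha}(-v)\,\vec{e}_{\nu} \cdot \Lambda^{\mu}{}_{\bar\beta}(-v)\,\vec{e}_{\mu} \\
&= \Lambda^{\nu}{}_{\bar\alpha}\,\Lambda^{\mu}{}_{\bar\beta}\;\vec{e}_{\nu} \cdot \vec{e}_{\mu} \\
&= \Lambda^{\nu}{}_{\bar\alpha}\,\Lambda^{\mu}{}_{\bar\beta}\,\eta_{\nu\mu}
\end{aligned}
\]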

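The LHS is a vector expression, and it shouldn’t depend upon the orientation of the coordinate axes. So let’s rotate the axes so that $\mathbf{v}$ is oriented along the $x$-axis. Then

\[
\Lambda(v) =
\begin{pmatrix}
\gamma & -v\gamma & 0 & 0 \\
-v\gamma & \gamma & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\]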

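Note that $\Lambda$ is symmetric so we can interchange indices on one without effect,

\[ \vec{e}_{\bar\alpha} \cdot \vec{e}_{\bar\beta} = \Lambda^{\bar\beta}{}_{\mu}\,\Lambda^{\nu}{}_{\bar\alpha}\,\eta_{\nu\mu} \]

For given $\bar\alpha = \bar\beta$, the RHS looks like the product of a row of $\Lambda^{\bar\beta}{}_{\mu}$ times a column of $\Lambda^{\nu}{}_{\bar\alpha}$. It’s easy to see that the result is $-1$ for $\bar\alpha = \bar\beta = 0$ and $+1$ for $\bar\alpha = \bar\beta > 0$. When $\bar\alpha \neq \bar\beta$, the RHS $= 0$. Q.E.D.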
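The "easy to see" step can also be checked by brute force. Below is a minimal sketch in plain Python that evaluates $\Lambda^{\nu}{}_{\bar\alpha}\Lambda^{\mu}{}_{\bar\beta}\eta_{\nu\mu}$ for a boost along $x$; the speed $v = 0.6$ and the helper names are illustrative assumptions, not part of the exercise.

```python
import math

ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def boost_x(v):
    """Lorentz boost matrix Lambda(v) for speed v (units of c) along x."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, -v * g, 0, 0],
            [-v * g, g, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 1]]

def basis_dot_products(v):
    """Return the 4x4 array of Lambda^nu_a Lambda^mu_b eta_{nu mu}, summed over nu, mu."""
    lam = boost_x(-v)  # the basis vectors transform with Lambda(-v)
    return [[sum(lam[n][a] * lam[m][b] * ETA[n][m]
                 for n in range(4) for m in range(4))
             for b in range(4)]
            for a in range(4)]

result = basis_dot_products(0.6)
for row in result:
    print([round(x, 12) for x in row])
```

Up to rounding, the printed array should reproduce the components of $\eta$, whatever speed $|v| < 1$ is chosen.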

2.10 Rob’s supplemental problems

R.1 Suppose the 4-velocity of a rocket ship is $\vec{U} \to_O (2, 1, \sqrt{2}, 0)$ in some reference frame $O$.

(a) Show that the given $\vec{U}$ is a legitimate 4-velocity. Show that $\vec{V} \to_O (2, 1, 1, 0)$ is not possible.

(b) Find the 3-velocity in $O$. Hint: see Ex. 2.15. (You’ll need this for (c).)

(c) Find the matrix that rotates the spatial coordinates such that the 3-velocity has only one non-zero component, in say the $x$-direction. What’s the matrix that rotates the 4-velocity to have only one nonzero spatial component?

(d) Find the inverse rotation matrices for the above. Hint: Think physically and check mathematically, i.e. $R_4^{-1} R_4 = I$.

(e) Find the Lorentz transformation from $O$ to the MCRF of the rocket ship. Confirm that it has the correct effect applied to $\vec{U}$ itself. Hint: The problem here is that we have so far only seen the Lorentz transformation when the 3-velocity has only one non-zero component. Use your rotation matrix from above and its inverse.

Solution:

(a)

\[ \vec{U} \cdot \vec{U} = -2^2 + 1^2 + (\sqrt{2})^2 = -1 \]

which is consistent with Eq. (2.28). On the other hand,

\[ \vec{V} \cdot \vec{V} = -2^2 + 1^2 + 1^2 = -2 \]

which is inconsistent with Eq. (2.28).

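(b) See solution to Ex. 2.15:

\[ \mathbf{v} \to_O (1/2, \sqrt{2}/2, 0) \]

(c) Rotating anticlockwise through angle $\theta = \arccos(1/\sqrt{3})$ aligns the $x$-axis with the 3-velocity. This is accomplished with the matrix $R_3$,

\[
R_3 =
\begin{pmatrix}
\cos\theta & \sin\theta & 0 \\
-\sin\theta & \cos\theta & 0 \\
0 & 0 & 1
\end{pmatrix}
\]

For the 4-velocity

\[
R_4 =
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & \cos\theta & \sin\theta & 0 \\
0 & -\sin\theta & \cos\theta & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\]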

(d) To find the inverse of the rotation matrix just change the sign of the angle!

\[
R_4^{-1} =
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & \cos\theta & -\sin\theta & 0 \\
0 & \sin\theta & \cos\theta & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\]


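(e) The Lorentz transformation for the case $\Lambda(u, v, 0)$ can be built from the above tools. Consider transforming a vector, $\vec{U}$.

\[
\begin{aligned}
\vec{\bar{U}} &= \Lambda \vec{U} \\
&= R_4^{-1}\, \Lambda'(u', 0, 0)\, R_4\, \vec{U}
\end{aligned}
\]

where

\[
\Lambda'(u', 0, 0) =
\begin{pmatrix}
\gamma(u') & -u'\gamma(u') & 0 & 0 \\
-u'\gamma(u') & \gamma(u') & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\]

So this defines the desired Lorentz transformation $\Lambda(u, v, 0)$,

\[
\Lambda(u, v, 0) =
\begin{pmatrix}
\gamma(|\mathbf{v}|) & -u\gamma(|\mathbf{v}|) & -v\gamma(|\mathbf{v}|) & 0 \\
-u\gamma(|\mathbf{v}|) & \gamma(|\mathbf{v}|)\cos^2\theta + \sin^2\theta & (\gamma(|\mathbf{v}|) - 1)\cos\theta\sin\theta & 0 \\
-v\gamma(|\mathbf{v}|) & (\gamma(|\mathbf{v}|) - 1)\cos\theta\sin\theta & \gamma(|\mathbf{v}|)\sin^2\theta + \cos^2\theta & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\tag{2.48}
\]

where $|\mathbf{v}| = \sqrt{u^2 + v^2}$ and $\theta = \arctan(v/u)$. It’s straightforward, albeit a bit tedious, to show that

\[
\Lambda(u, v, 0)\,\vec{U} =
\begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}.
\]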
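The tedious last step can be delegated to a few lines of code. The sketch below, in plain Python with hypothetical helper names of my own, builds $\Lambda = R_4^{-1}\,\Lambda'\,R_4$ for the given $\vec{U}$, taking the boost speed to be $|\mathbf{v}|$ (my reading of $u'$ above), and applies it to $\vec{U}$.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def rotation4(theta):
    """R_4 from part (c): rotation of the x-y plane, time and z untouched."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1, 0, 0, 0], [0, c, s, 0], [0, -s, c, 0], [0, 0, 0, 1]]

def boost_x(speed):
    """Lambda'(u', 0, 0): boost with speed u' along the x-axis."""
    g = 1.0 / math.sqrt(1.0 - speed**2)
    return [[g, -speed * g, 0, 0], [-speed * g, g, 0, 0],
            [0, 0, 1, 0], [0, 0, 0, 1]]

U = [2.0, 1.0, math.sqrt(2.0), 0.0]
u, v = U[1] / U[0], U[2] / U[0]     # 3-velocity components from part (b)
speed = math.hypot(u, v)            # |v|
theta = math.atan2(v, u)            # angle between the x-axis and the 3-velocity

# Lambda = R_4^{-1} Lambda'(|v|) R_4, where R_4^{-1} is the rotation by -theta.
Lam = matmul(rotation4(-theta), matmul(boost_x(speed), rotation4(theta)))
U_bar = [sum(Lam[i][j] * U[j] for j in range(4)) for i in range(4)]
print([round(x, 12) for x in U_bar])
```

In the MCRF the rocket is at rest, so the transformed components should be those of a 4-velocity with no spatial part.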

R.2 (a) How did the mirror in problem 2.31 acquire $x$-direction momentum without acquiring energy when the photon was reflected?

(b) How did it acquire the energy when the photon was absorbed?

Solution:

(a) The change in 4-momentum is related to the change in 4-velocity of a massive object,

\[ \Delta\vec{P}_m = m\,\Delta\vec{U} = m(\Delta\gamma, \Delta(u\gamma), 0, 0) = m(\gamma - 1, u\gamma, 0, 0), \]

where the 2nd equality assumes the mirror is initially at rest. Thus the ratio is

\[ \frac{\Delta P^0_m}{\Delta P^1_m} = \frac{\Delta E_m}{\Delta(mU^1)} = \frac{1}{u}\left(1 - \sqrt{1 - u^2}\right) \approx \frac{u}{2}. \]


The approximation applies in the limit $u \ll 1$ using the binomial series. So the change in energy can be arbitrarily small for a given change in momentum if the change in velocity is correspondingly small. This corresponds to the intuition that a more massive mirror would rebound less for a given momentum transfer. I suspect the imposition of “reflection without change in frequency” is an idealization applicable for massive “mirrors”. Indeed the next problem, 2.32, covers Compton scattering, wherein a photon reflects off a particle of mass $m$. In Eq. (2.43) we see that for

\[ m \gg h\nu_i \]

where $\nu_i$ is the incident frequency of the photon, the reflected frequency $\nu_f \approx \nu_i$.

(b) For a massive mirror, the energy must have become mostly thermal energy. For a less massive mirror, more of the energy would go into the translational kinetic energy of the rebound.
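To see how good the small-$u$ approximation in part (a) is, one can compare the exact ratio $(\gamma - 1)/(u\gamma)$ with $u/2$. This is only an illustrative sketch in plain Python; the function name and the sample speeds are my own choices.

```python
import math

def energy_to_momentum_ratio(u):
    """Return (gamma - 1) / (u * gamma), i.e. Delta P^0 / Delta P^1 for a
    mirror that starts at rest and recoils with speed u (units of c)."""
    gamma = 1.0 / math.sqrt(1.0 - u * u)
    return (gamma - 1.0) / (u * gamma)

for u in (0.3, 0.1, 0.01):
    exact = energy_to_momentum_ratio(u)
    print(f"u = {u:5.2f}   exact = {exact:.6f}   u/2 = {u / 2:.6f}")
```

The mass drops out of the ratio, so the statement about massive mirrors enters only through the recoil speed: for a fixed momentum transfer, a larger $m$ means a smaller $u$ and hence a smaller energy transfer.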

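R.3 Start with the expression for the 4-velocity in terms of the 3-velocity with the components of the 3-velocity written as a function of time in an inertial frame $O$:

\[ \vec{U} \to_O \gamma(v)(1, \dot{x}, \dot{y}, \dot{z}) \]

where

\[ v = |\mathbf{v}| = \sqrt{\dot{x}^2 + \dot{y}^2 + \dot{z}^2} \]

Show that the 4-acceleration is orthogonal to the 4-velocity by using the chain rule to derive a general expression for the 4-acceleration involving derivatives with respect to time.

Solution:

The 4-acceleration is defined as

\[
\begin{aligned}
\vec{a} &\equiv \frac{d\vec{U}}{d\tau}, \qquad \text{Eq. (2.32)} \\
&= \frac{d}{d\tau}\left[\gamma(v)(1, \dot{x}, \dot{y}, \dot{z})\right] \\
&= \frac{dt}{d\tau}\frac{d}{dt}\left[\gamma(v)(1, \dot{x}, \dot{y}, \dot{z})\right]
\end{aligned}
\tag{2.49}
\]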


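One can always put the origin of the MCRF at the particle location, so that

\[ x^{\bar\alpha} \to_{\bar{O}} (\tau, 0, 0, 0) \]

and thus for short increments in time,

\[
\begin{aligned}
t &= \Lambda^{0}{}_{\bar\alpha}\, x^{\bar\alpha} \\
&= \gamma(-v)\,\tau \\
&= \gamma(v)\,\tau
\end{aligned}
\tag{2.50}
\]

and

\[ \frac{dt}{d\tau} = \gamma(v) \tag{2.51} \]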

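To resolve (2.49) we also require

\[ \frac{d}{dt}\gamma(v) = \frac{d}{dt}\left[\frac{1}{\sqrt{1 - \dot{x}^2 - \dot{y}^2 - \dot{z}^2}}\right] = \gamma^3\left[\dot{x}\ddot{x} + \dot{y}\ddot{y} + \dot{z}\ddot{z}\right] \tag{2.52} \]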

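Substituting (2.51) and (2.52) into (2.49) we find,

\[
\begin{aligned}
\vec{a} &= \gamma\,\gamma^3\left[\dot{x}\ddot{x} + \dot{y}\ddot{y} + \dot{z}\ddot{z}\right](1, \dot{x}, \dot{y}, \dot{z}) + \gamma^2(0, \ddot{x}, \ddot{y}, \ddot{z}) \\
&= \gamma\,\gamma^2\left[\dot{x}\ddot{x} + \dot{y}\ddot{y} + \dot{z}\ddot{z}\right]\vec{U} + \gamma^2(0, \ddot{x}, \ddot{y}, \ddot{z}) \\
&= \gamma^3\left[\dot{x}\ddot{x} + \dot{y}\ddot{y} + \dot{z}\ddot{z}\right]\vec{U} + \gamma^2(0, \ddot{x}, \ddot{y}, \ddot{z})
\end{aligned}
\tag{2.53}
\]

Now take the dot product with $\vec{U}$, using $\vec{U} \cdot \vec{U} = -1$:

\[
\begin{aligned}
\vec{U} \cdot \vec{a} &= \vec{U} \cdot \left\{\gamma^3\left[\dot{x}\ddot{x} + \dot{y}\ddot{y} + \dot{z}\ddot{z}\right]\vec{U} + \gamma^2(0, \ddot{x}, \ddot{y}, \ddot{z})\right\} \\
&= -\gamma^3\left[\dot{x}\ddot{x} + \dot{y}\ddot{y} + \dot{z}\ddot{z}\right] + \gamma^3(1, \dot{x}, \dot{y}, \dot{z}) \cdot (0, \ddot{x}, \ddot{y}, \ddot{z}) \\
&= 0.
\end{aligned}
\tag{2.54}
\]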
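A quick numerical spot check of (2.53) and (2.54) is sketched below in plain Python. The velocity and acceleration values are arbitrary test numbers of my own (any velocity with $|\mathbf{v}| < 1$ will do), and the helper names are hypothetical.

```python
import math

def minkowski_dot(a, b):
    """Scalar product with signature (-, +, +, +)."""
    return -a[0] * b[0] + sum(x * y for x, y in zip(a[1:], b[1:]))

def four_velocity_and_acceleration(vel, acc):
    """vel = (xdot, ydot, zdot) and acc = (xddot, yddot, zddot), with c = 1."""
    gamma = 1.0 / math.sqrt(1.0 - sum(x * x for x in vel))
    v_dot_a = sum(x * y for x, y in zip(vel, acc))
    U = [gamma] + [gamma * x for x in vel]
    spatial_acc = [0.0] + list(acc)
    # Eq. (2.53): a = gamma^3 (v . vdot) U + gamma^2 (0, vdot)
    a = [gamma**3 * v_dot_a * U[i] + gamma**2 * spatial_acc[i] for i in range(4)]
    return U, a

U, a = four_velocity_and_acceleration((0.3, -0.2, 0.5), (1.0, 2.0, -0.7))
print("U . U =", minkowski_dot(U, U))   # should be close to -1
print("U . a =", minkowski_dot(U, a))   # should be close to 0
```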


Chapter 3

Tensor Analysis in Special Relativity


3.1 The metric tensor

We learn that $\eta_{\alpha\beta}$ (introduced in Eq. (2.27)) is the metric tensor, and it provides a frame-invariant way to write the scalar product of two vectors, Eq. (3.1).

I don’t see why this is “frame-invariant” when the RHS depends upon the components, which in turn depend upon the frame. Maybe he means the LHS? Of course the scalar product itself is frame-invariant, and perhaps that’s all he means here.

In any case, he wants to talk about tensors in general and is using the metric tensor as a concrete example to get the discussion going.

3.2 Definition of tensors

Tensors are defined as rules for mapping $N$ vectors to real numbers that are linear in the arguments: $\binom{0}{N}$.

An ordinary function $y = f(t, x, y, z)$ is a rule for mapping reals onto a real and is classed as $\binom{0}{0}$ because it takes zero vectors as its input. I find this odd because nowhere in the definition of a tensor is there mention of the dimension of the vectors, and scalars are vectors of dimension 0.

But this is just semantics.

Aside on the usage of the term ‘function’

Emphasizes that a regular function $y = f(x)$ is more generally thought of as a rule for associating real values of $x$ with real values of $y$.

So we should think of tensors as rules for associating vectors with real scalars. For example $\mathbf{g}$ is the tensor that associates vectors with their dot product; for example, it associates $\vec{A}$ and $\vec{B}$ with the number $\vec{A} \cdot \vec{B}$.

Components of a tensor


So the components of a tensor, for a given frame, are the values of the tensor applied to the basis vectors of that frame. This gives new insight into Eq. (2.27) that introduced the metric tensor, for now we see the same equation repeated here as Eq. (3.5), but now with the interpretation of the 16 values of $\eta_{\alpha\beta}$ being the “components of the metric tensor” in basis vectors Eq. (2.9). And the metric tensor provides the rule associated with the dot product.

3.3 The $\binom{0}{1}$ tensors: one-forms

“Covector” = “covariant vector” = “one-form”.

The concept of one-form was so confusing (to me at least) in Misner et al. (1973) that I put their book aside and bought Schutz. Here Schutz comes through brilliantly, helping me through this hurdle.

General properties

Typo in Eq. (3.6b).

\[ \tilde{r}(A) = \alpha\,\vec{p}(\vec{A}) \]

should be

\[ \tilde{r}(\vec{A}) = \alpha\,\tilde{p}(\vec{A}) \]

The set of all one-forms forms a vector space. The axioms of a vector space are given in Appendix A on p. 374. BTW

. . . an abelian group, also called a commutative group, is a group in which the result of applying the group operation to two group elements does not depend on their order (the axiom of commutativity). Abelian groups generalize the arithmetic of addition of integers.

which I got from http://en.wikipedia.org/wiki/Abelian_group.

Components of one-forms transform in the same way as basis vectors do; see Eq. (3.9).

A one-form is frame-independent, see p. 59.

Notation for derivatives


Result Eq. (3.20) is very fundamental, yet I found it a bit weakly explained. Up until this point the gradient was defined only for a scalar field, like $\phi(\vec{x})$, c.f. Eq. (3.15). Suddenly, and without comment to the effect, in Eq. (3.20) Schutz is applying the gradient operator to the component of a vector, $x^\alpha$. One might be tempted to say, but hold $\alpha$ fixed and then $x^\alpha$ is like a scalar field. I find that unsatisfactory because components of vectors are very different things from scalar fields. Components change under a change of basis, while scalars don’t! In any case, Eq. (3.20) appears out-of-hat. I now appreciate the approach of Hobson et al. (2009), who start with manifolds and co-ordinate curves therein, then cover vector and tensor calculus on pseudo-Riemann manifolds. Then the notation of basis vector and basis one-form, and their connection with coordinate curves, appears more naturally.

In exercise 34(e) we’ll learn by example that the definition in Eq. (3.20) means that the one-form bases are not necessarily just the metric applied to the vector basis vectors.

Normal one-forms

In my opinion it is necessary to state that the normal one-form is not the zero one-form. This was specified for instance in Problem 12.

3.5 Metric as a mapping of vectors into one-forms

Why distinguish one-forms from vectors

This is the crucial material missing from Misner et al. (1973). There’s also a nice connection made with row vectors and column vectors and Dirac’s bra and ket vector formulation of quantum mechanics.

Typo: Eq. (3.45) should be

\[ \vec{d}\phi \to \]

because it’s a vector gradient.


3.6 Finally: $\binom{M}{N}$ tensors

Typo: Paragraph before Eq. (3.55), the equation below should have the RHS in regular, not bold, face type because it’s the components.

\[ \mathbf{R}(\tilde{\omega}^{\alpha}; \vec{e}_{\beta}) := R^{\alpha}{}_{\beta} \]

Similarly for Eq. (3.55), last line, the RHS should be in regular, not bold, face type because it’s the components. Also the first line of this equation should have $\alpha$ on the RHS, not $\vec{\alpha}$.

3.8 Differentiation of tensors

This is a very important section for later work. Unfortunately it is a bit rushed. As discussed in my solution to problem 28 of § 3.10, I find that Eq. (3.66) is not clearly explained. Rather than saying that we “deduce” Eq. (3.66) from Eq. (3.65), I’d find it more satisfying if he’d said something like:

We chose to define the gradient of a rank 2 tensor as in Eq. (3.66). And in so doing, we can then obtain Eq. (3.65), which is desirable because it appears as a straightforward generalization of Eq. (3.14).


3.10 Exercises

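1(a). The double sum is obviously different because it includes the off-diagonal terms

\[ M_{\alpha\beta}, \quad \text{when } \alpha \neq \beta \]

(b)

\[
\begin{aligned}
A^{\alpha} B^{\beta} \eta_{\alpha\beta} &= \sum_{\alpha=0}^{3} \sum_{\beta=0}^{3} A^{\alpha} B^{\beta} \eta_{\alpha\beta} && \text{(3.1)} \\
&= A^0 B^0 (-1) + A^1 B^1 + A^2 B^2 + A^3 B^3 && \text{(3.2)}
\end{aligned}
\]

using $\eta_{\alpha\beta}$ defined after Eq. (2.7).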

2. To prove that the set of all one-forms is a vector space, we must show that this set meets the axioms (1) and (2) given in Appendix A, p. 374.

Axiom (1): The sum of two one-forms must also be a one-form, which is satisfied by Eq. (3.6a), and the order of summation doesn’t matter, which is satisfied by Eq. (3.6b) because a one-form evaluates to a real and the sum of two reals doesn’t depend upon the order. We also require a zero. The zero one-form gives zero for any vector (see p. 60). So say $\tilde{q}$ is the zero one-form. Then assuming Eq. (3.6a) and by Eq. (3.6b)

\[
\begin{aligned}
\tilde{s}(\vec{A}) &= \tilde{p}(\vec{A}) + \tilde{q}(\vec{A}) && \text{(3.3)} \\
&= \tilde{p}(\vec{A}) + 0 && \text{(3.4)} \\
&= \tilde{p}(\vec{A}) && \text{(3.5)}
\end{aligned}
\]

so $\tilde{p} + 0 = \tilde{p}$ and we have a zero. Axiom (1) is satisfied.

Axiom (2): There are four requirements to meet. Although it’s not made explicit in Eq. (3.6a), let’s assume $\alpha \in \mathbb{R}$. By Eq. (3.6) it’s clear that multiplication of a one-form by a real scalar meets the requirements of Axiom 2.

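3(a). Show

\[ \tilde{p}(A^{\alpha}\vec{e}_{\alpha}) = A^{\alpha}\,\tilde{p}(\vec{e}_{\alpha}) \]

\[
\begin{aligned}
\tilde{p}(A^{\alpha}\vec{e}_{\alpha}) &= \tilde{p}\left(\sum_{\alpha=0}^{3} A^{\alpha}\vec{e}_{\alpha}\right) && \text{(3.6)} \\
&= \sum_{\alpha=0}^{3} A^{\alpha}\,\tilde{p}(\vec{e}_{\alpha}), \quad \text{by linearity in arguments, c.f. p. 56,} && \text{(3.7)} \\
&= A^{\alpha}\,\tilde{p}(\vec{e}_{\alpha}) && \text{(3.8)}
\end{aligned}
\]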

3(b). (i)

\[
\begin{aligned}
\tilde{p}(\vec{A}) &= A^{\alpha} p_{\alpha} \\
&= -2 + 1 + 0 + 0 \\
&= -1
\end{aligned}
\]

(ii) +2, (iii) −7, (iv) −7.

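4(a). To show the vectors are linearly independent we require that there is no non-trivial linear combination

\[ a\vec{A} + b\vec{B} + c\vec{C} + d\vec{D} = 0 \tag{3.9} \]

\[
\begin{bmatrix} \vec{A} & \vec{B} & \vec{C} & \vec{D} \end{bmatrix}
\begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix}
=
\begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}
\tag{3.10}
\]

A non-trivial combination (i.e. not $a = b = c = d = 0$) requires the determinant of the matrix with columns formed by the vectors $\vec{A}$ through $\vec{D}$ to be zero. But the determinant is −8.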

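4(b). Find components of $\tilde{p}$ given $\tilde{p}(\vec{A}) = 1$ etc.

Using the definition of components Eq. (3.8), we can write a linear system in the four unknown components:

\[
\begin{bmatrix} \vec{A} \\ \vec{B} \\ \vec{C} \\ \vec{D} \end{bmatrix}
\begin{pmatrix} p_0 \\ p_1 \\ p_2 \\ p_3 \end{pmatrix}
=
\begin{pmatrix} \tilde{p}(\vec{A}) \\ \tilde{p}(\vec{B}) \\ \tilde{p}(\vec{C}) \\ \tilde{p}(\vec{D}) \end{pmatrix}
\tag{3.11}
\]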


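Note the matrix is written with the rows given by the vectors. This can be solved in MATLAB as follows:

A = [2 1 1 0; 1 2 0 0; 0 0 1 1; -3 2 0 0];   % rows are the components of A, B, C, D
b = [1; -1; -1; 0];                           % values p(A), p(B), p(C), p(D)
x = A\b                                       % solve A*x = b for the components of p

x =

   -0.2500
   -0.3750
    1.8750
   -2.8750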

4(c). Given

\[ \vec{E} \to_O (1, 1, 0, 0) \]

we easily find that

\[
\begin{aligned}
\tilde{p}(\vec{E}) &= E^{\alpha} p_{\alpha} \\
&= p_0 + p_1 \\
&= -5/8
\end{aligned}
\]
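The results of 4(a) to 4(c) can be reproduced exactly with rational arithmetic. The following is a sketch in plain Python using the same matrix and right-hand side as the MATLAB call above; the `solve` helper is a hypothetical Gauss-Jordan routine of my own, not a library function.

```python
from fractions import Fraction

def solve(matrix, rhs):
    """Gauss-Jordan elimination with exact fractions.
    Returns (solution, determinant); assumes the matrix is non-singular."""
    n = len(matrix)
    m = [[Fraction(x) for x in row] + [Fraction(b)]
         for row, b in zip(matrix, rhs)]
    det = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero pivot.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det                      # a row swap flips the sign
        det *= m[col][col]
        m[col] = [x / m[col][col] for x in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [row[-1] for row in m], det

# Rows are the components of A, B, C, D; values are p(A), p(B), p(C), p(D).
vectors = [[2, 1, 1, 0], [1, 2, 0, 0], [0, 0, 1, 1], [-3, 2, 0, 0]]
values = [1, -1, -1, 0]

p, det = solve(vectors, values)
E = [1, 1, 0, 0]
p_of_E = sum(e * c for e, c in zip(E, p))
print("det =", det)
print("p   =", [str(c) for c in p])
print("p(E) =", p_of_E)
```

Working with `Fraction` avoids the rounding in the four-decimal MATLAB display, so the components appear as exact eighths.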

4(d). Given the values of the four one-forms, $\tilde{p}, \tilde{q}, \tilde{r}, \tilde{s}$ applied to the four known vectors $\vec{A}, \vec{B}, \vec{C}, \vec{D}$ we can, in principle, find all components of all four one-forms, repeating the procedure we did in 4b. And then one could write a matrix $M$ where the columns of $M$ are taken from the one-form components. If the determinant of $M$ is zero the one-forms are linearly dependent. But that’s a lot of work.

I believe there is a simpler way to test for linear dependence. If the one-forms are linearly dependent, then there are non-trivial real numbers $a, b, c, d$ such that

\[ a\tilde{p} + b\tilde{q} + c\tilde{r} + d\tilde{s} = \tilde{t} = 0 \]

\[
\begin{bmatrix} \tilde{p} & \tilde{q} & \tilde{r} & \tilde{s} \end{bmatrix}
\begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix}
= \tilde{t} =
\begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}
\tag{3.12}
\]


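But then

\[
\begin{bmatrix} \vec{A} \\ \vec{B} \\ \vec{C} \\ \vec{D} \end{bmatrix} \tilde{t}
=
\begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}
\tag{3.13}
\]

By Eq. (3.6) we have

\[
\begin{bmatrix} \vec{A} \\ \vec{B} \\ \vec{C} \\ \vec{D} \end{bmatrix} \tilde{t}
=
\begin{pmatrix}
\tilde{p}(\vec{A}) & \tilde{q}(\vec{A}) & \tilde{r}(\vec{A}) & \tilde{s}(\vec{A}) \\
\tilde{p}(\vec{B}) & \tilde{q}(\vec{B}) & \tilde{r}(\vec{B}) & \tilde{s}(\vec{B}) \\
\tilde{p}(\vec{C}) & \tilde{q}(\vec{C}) & \tilde{r}(\vec{C}) & \tilde{s}(\vec{C}) \\
\tilde{p}(\vec{D}) & \tilde{q}(\vec{D}) & \tilde{r}(\vec{D}) & \tilde{s}(\vec{D})
\end{pmatrix}
\begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix}
=
\begin{pmatrix}
1 & 0 & 2 & -1 \\
-1 & 0 & 0 & -1 \\
-1 & 1 & 0 & 0 \\
0 & -1 & 0 & 0
\end{pmatrix}
\begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix}
=
\begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}
\tag{3.14}
\]

The latter can only be true for non-trivial $a, b, c, d$ if the determinant is zero, but

\[
\begin{vmatrix}
1 & 0 & 2 & -1 \\
-1 & 0 & 0 & -1 \\
-1 & 1 & 0 & 0 \\
0 & -1 & 0 & 0
\end{vmatrix}
= -2
\tag{3.15}
\]

so the one-forms must not be linearly dependent.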
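The determinant in (3.15) is small enough to evaluate by cofactor expansion. Here is a minimal sketch in plain Python; the recursive `det` helper is my own illustration and is only sensible for small integer matrices like this one.

```python
def det(m):
    """Determinant by cofactor (Laplace) expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        if entry == 0:
            continue  # zero entries contribute nothing
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

# Rows: A, B, C, D; columns: values of p, q, r, s on that vector, as in (3.14).
one_form_values = [[1, 0, 2, -1],
                   [-1, 0, 0, -1],
                   [-1, 1, 0, 0],
                   [0, -1, 0, 0]]
print(det(one_form_values))
```

A non-zero result means the only solution of (3.14) is $a = b = c = d = 0$, which is the linear-independence statement.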

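5. Justify steps from Eq. (3.10a) to Eq. (3.10d).

\[
\begin{aligned}
A^{\bar\alpha} p_{\bar\alpha} &= (\Lambda^{\bar\alpha}{}_{\beta} A^{\beta})(\Lambda^{\mu}{}_{\bar\alpha}\, p_{\mu}), && \text{by Eq. (2.7) and Eq. (3.9) respectively} \\
&= (\Lambda^{\mu}{}_{\bar\alpha}\, \Lambda^{\bar\alpha}{}_{\beta})(A^{\beta} p_{\mu}), && \text{just rearranged the terms}
\end{aligned}
\tag{3.16}
\]

We claimed that the transformation for one-forms, $\Lambda^{\mu}{}_{\bar\alpha}$, was the same as for basis vectors, Eq. (3.9).

\[
\begin{aligned}
A^{\bar\alpha} p_{\bar\alpha} &= (\Lambda^{\mu}{}_{\bar\alpha}\, \Lambda^{\bar\alpha}{}_{\beta})(A^{\beta} p_{\mu}), \\
&= \delta^{\mu}{}_{\beta}\,(A^{\beta} p_{\mu}), && \text{by Eq. (2.18),} \\
&= A^{\mu} p_{\mu}, && \text{by the properties of the Kronecker delta.}
\end{aligned}
\]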

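6. Given a basis $\{\vec{e}_{\alpha}\}$ of a frame $O$ and a basis $\{\tilde{\lambda}^0, \tilde{\lambda}^1, \tilde{\lambda}^2, \tilde{\lambda}^3\}$ for the space of one-forms, with

\[
\begin{aligned}
\tilde{\lambda}^0 &\to_O (1, 1, 0, 0) && \text{(3.17)} \\
\tilde{\lambda}^1 &\to_O (1, -1, 0, 0) && \text{(3.18)} \\
\tilde{\lambda}^2 &\to_O (0, 0, 1, -1) && \text{(3.19)} \\
\tilde{\lambda}^3 &\to_O (0, 0, 1, 1) && \text{(3.20)}
\end{aligned}
\]

6a. Consider an arbitrary one-form $\tilde{p}$ and vector $\vec{A}$.

\[
\begin{aligned}
\tilde{p}(\vec{e}_{\alpha})\,\tilde{\lambda}^{\alpha}(\vec{A}) &= p_{\alpha}\,\tilde{\lambda}^{\alpha}(\vec{A}) && \text{(3.21)} \\
&= p_{\alpha}\,\tilde{\lambda}^{\alpha}(\vec{e}_{\beta})\,A^{\beta} && \text{(3.22)} \\
&= p_{\alpha} A^{\alpha} \quad \text{iff } \tilde{\lambda}^{\alpha}(\vec{e}_{\beta}) = \delta^{\alpha}{}_{\beta} && \text{(3.23)}
\end{aligned}
\]

but it is clear that $\tilde{\lambda}^{\alpha}(\vec{e}_{\beta}) \neq \delta^{\alpha}{}_{\beta}$ by inspection of the given basis.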

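6b. Given $\tilde{p} \to_O (1, 1, 1, 1)$, find

\[ \tilde{p} = l_{\alpha}\tilde{\lambda}^{\alpha} \]

We can write this as a linear system of equations to solve for $l_{\alpha}$:

\[
\begin{aligned}
\tilde{\lambda}^{\alpha}(\vec{e}_{\beta})\, l_{\alpha} &= \tilde{p}(\vec{e}_{\beta}) && \text{(3.24)} \\
(\tilde{\lambda}^{\alpha})_{\beta}\, l_{\alpha} &= p_{\beta} && \text{(3.25)}
\end{aligned}
\]

where I’m using that weird notation introduced in Eq. (2.10). This can be solved in MATLAB with

A = [1 1 0 0; 1 -1 0 0; 0 0 1 1; 0 0 -1 1];   % column alpha holds the components of lambda^alpha
p = [1 1 1 1]';                               % components of p as a column
l = A\p                                       % solve A*l = p for the coefficients l_alpha

l =

     1
     0
     0
     1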
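Both parts of exercise 6 can be confirmed by direct substitution. The sketch below is plain Python with variable names of my own: it rebuilds $\tilde{p}$ from the coefficients $l_\alpha$ found above and tests whether $\tilde{\lambda}^{\alpha}(\vec{e}_{\beta}) = \delta^{\alpha}{}_{\beta}$ holds for the given basis.

```python
# Components of the four basis one-forms lambda^0 .. lambda^3 in frame O.
lam = [(1, 1, 0, 0), (1, -1, 0, 0), (0, 0, 1, -1), (0, 0, 1, 1)]
l = (1, 0, 0, 1)   # coefficients found in 6b

# p_beta = l_alpha (lambda^alpha)_beta, summed over alpha.
p = tuple(sum(l[a] * lam[a][b] for a in range(4)) for b in range(4))

# 6a: lambda^alpha(e_beta) is just the beta-th component of lambda^alpha,
# so duality to the basis vectors would require lam to be the identity array.
is_dual = all(lam[a][b] == (1 if a == b else 0)
              for a in range(4) for b in range(4))

print("p =", p)
print("dual to the basis vectors?", is_dual)
```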

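7. The proof of Eq. (3.13), we’re told on p. 61, is analogous to the corresponding relation for basis vectors, which was given on p. 37.

Imagine that $\tilde{p}$ is an arbitrary one-form, and $\vec{A}$ an arbitrary vector. Let $\vec{e}_{\alpha}$ and $\vec{e}_{\bar\alpha}$ be the basis vectors in frames $O$ and $\bar{O}$ respectively. The components of $\vec{A}$ in the two frames are related by:

\[ \Lambda^{\bar\alpha}{}_{\beta} A^{\beta} = A^{\bar\alpha}, \]

see Eq. (2.7) on p. 35. We seek the transformation relating the components of the one-form in frames $O$ and $\bar{O}$. Let’s call it $T^{\mu}{}_{\bar\alpha}$. Because $\tilde{p}(\vec{A})$ is frame-independent (after all, it’s just a scalar):

\[
\begin{aligned}
\tilde{p}(\vec{A}) = A^{\alpha} p_{\alpha} &= A^{\bar\alpha} p_{\bar\alpha} \\
&= \Lambda^{\bar\alpha}{}_{\beta} A^{\beta} p_{\bar\alpha}, \\
&= \Lambda^{\bar\alpha}{}_{\beta} A^{\beta}\, T^{\mu}{}_{\bar\alpha}\, p_{\mu}, \\
&= A^{\beta} p_{\mu}\, \Lambda^{\bar\alpha}{}_{\beta}\, T^{\mu}{}_{\bar\alpha}, && \text{just rearranging terms}
\end{aligned}
\]

At this point I believe Schutz would relabel his indices and replace $\beta$ with $\alpha$. I’m a little uncomfortable with this because my understanding is that the sums on the two sides of the equal sign are equal, but we cannot immediately say the individual terms are equal. We can note however that $\vec{A}$ is arbitrary, and therefore we can imagine the case where all components of $\vec{A}$ are zero but one, $\vec{A} = a\vec{e}_0$ say, and then:

\[
\begin{aligned}
\tilde{p}(\vec{A}) = A^0 p_0 &= A^{\beta} p_{\mu}\, \Lambda^{\bar\alpha}{}_{\beta}\, T^{\mu}{}_{\bar\alpha}, && \text{from above} \\
&= A^0 p_{\mu}\, \Lambda^{\bar\alpha}{}_{0}\, T^{\mu}{}_{\bar\alpha},
\end{aligned}
\tag{3.26}
\]

And similarly, we can imagine $\vec{A} = a\vec{e}_1$ etc. In this way we can see that indeed, it is valid to set $\alpha = \beta$.

\[
\begin{aligned}
\tilde{p}(\vec{A}) = A^{\alpha} p_{\alpha} &= A^{\beta} p_{\mu}\, \Lambda^{\bar\alpha}{}_{\beta}\, T^{\mu}{}_{\bar\alpha}, && \text{from above} \\
&= A^{\alpha} p_{\mu}\, \Lambda^{\bar\alpha}{}_{\alpha}\, T^{\mu}{}_{\bar\alpha}, && \text{relabel } \beta \text{ with } \alpha
\end{aligned}
\tag{3.27}
\]

and now it’s clear that

\[
\begin{aligned}
\Lambda^{\bar\alpha}{}_{\alpha}\, T^{\mu}{}_{\bar\alpha} &= \delta^{\mu}{}_{\alpha} \\
T^{\mu}{}_{\bar\alpha} &= \Lambda^{\mu}{}_{\bar\alpha}
\end{aligned}
\tag{3.28}
\]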

8. The basis one-forms of $\tilde{d}t$, viewed from the $t$–$x$ plane, would be equally spaced straight lines through the $t$-axis. The spacing is one unit of $t$ between surfaces so that $\vec{e}_0$ would cross one surface. The equation for $\tilde{d}x^{\alpha}$ is given in Eq. (3.20) in terms of the basis one-forms, which are written out on the top of page 61.

The basis one-forms of $\tilde{d}x$, viewed from the $t$–$x$ plane, would be equally spaced straight lines through the $x$-axis. The spacing is one unit of $x$ between surfaces.

9. The components of $\tilde{d}T$ are given by Eq. (3.15), and the partial derivatives can be estimated from Fig. 3.5, at least for the $x$ and $y$ directions. It’s not clear what we’re supposed to do for the $t$ and $z$ directions.

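10 (a) It’s obvious that

\[ \frac{\partial x^{\alpha}}{\partial x^{\beta}} = \delta^{\alpha}{}_{\beta} \tag{3.29} \]

because when $\alpha \neq \beta$ we have terms like, say,

\[ \frac{\partial x^0}{\partial x^1} = \frac{\partial t}{\partial x} = 0 \]

because $t$ and $x$ are independent variables. But when $\alpha = \beta$ we have terms like, say,

\[ \frac{\partial x^3}{\partial x^3} = \frac{\partial z}{\partial z} = 1. \]

10 (b)

\[
\begin{aligned}
\frac{\partial x^{\beta}}{\partial x^{\mu}} &= \delta^{\beta}{}_{\mu}, && \text{from (a)} \\
\frac{\partial (\Lambda^{\beta}{}_{\bar\alpha}\, x^{\bar\alpha})}{\partial x^{\mu}} &= \delta^{\beta}{}_{\mu}, && \text{sub co-ordinate transform} \\
\Lambda^{\beta}{}_{\bar\alpha}\, \frac{\partial x^{\bar\alpha}}{\partial x^{\mu}} &= \delta^{\beta}{}_{\mu}, && \text{transform is a constant} \\
\Lambda^{\beta}{}_{\bar\alpha}\, \Lambda^{\bar\alpha}{}_{\mu} &= \delta^{\beta}{}_{\mu}, && \text{from Eq. (3.18)}
\end{aligned}
\tag{3.30}
\]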

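11 Eq. (3.14) in different notation:

\[ \frac{d\phi}{d\tau} = \phi_{,t}\frac{dt}{d\tau} + \phi_{,x}\frac{dx}{d\tau} + \phi_{,y}\frac{dy}{d\tau} + \phi_{,z}\frac{dz}{d\tau} \tag{3.31} \]

Eq. (3.15) in different notation:

\[ \tilde{d}\phi \to_O (\phi_{,t}, \phi_{,x}, \phi_{,y}, \phi_{,z}) \tag{3.32} \]

Eq. (3.18) in different notation:

\[ x^{\bar\beta}{}_{,\alpha} = \Lambda^{\bar\beta}{}_{\alpha} \tag{3.33} \]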

12 (a)

$\vec{V}$ is not tangent to surface $S$, so it must have a component in the $\vec{e}_x$ direction, $V^x \neq 0$.

\[ \tilde{n}(\vec{V}) = n_x V^x \tag{3.34} \]

To show this is nonzero, we must show that $n_x \neq 0$. The normal one-form is the one-form that is zero for every vector tangent to the surface. For the $x = 0$ surface in 3D space,

\[ \tilde{n} \to_O (n_x, 0, 0) \tag{3.35} \]

where $n_x \neq 0$ will serve as a non-zero, normal one-form.

12 (b)

Suppose

\[
\begin{aligned}
\tilde{n}(\vec{V}) &= V^{\alpha} n_{\alpha}, \\
&= V^x n_x > 0.
\end{aligned}
\tag{3.36}
\]

Suppose $\vec{W}$ has the same sign of $W^x$ as $V^x$. Then,

\[
\begin{aligned}
\tilde{n}(\vec{W}) &= W^{\alpha} n_{\alpha}, \\
&= W^x n_x > 0.
\end{aligned}
\tag{3.37}
\]

12 (c)

Any normal to the surface $S$ must have $n_y = 0$ and $n_z = 0$. To be non-zero it requires $n_x \neq 0$. So

\[ \tilde{n} \to_O a(n_x, 0, 0) \tag{3.38} \]

where $a \neq 0$ and $a \in \mathbb{R}$ will serve also as a non-zero, normal one-form.

12 (d)


To generalize the above results to a 3D surface in 4D space-time, I found it hard to work with surfaces that are not simply one of the coordinates equals a constant. So I suggest that we require that the surface be sufficiently smooth that we can approximate the surface locally by a tangent plane. Then we can also rotate the coordinates so that the tangent plane is $x = 0$, and then the above immediately holds.

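13 Show that the one-form formed from the gradient of a scalar function $f$ is normal to surfaces of constant value of $f$.

Consider an arbitrary point $p = (t, x, y, z)$ where $f(t, x, y, z) = f_p$. Now imagine taking an infinitesimal step

\[ \Delta t, \Delta x, \Delta y, \Delta z \]

such that the change in value of $f$,

\[
\begin{aligned}
\Delta f &= \frac{\partial f}{\partial t}\Delta t + \frac{\partial f}{\partial x}\Delta x + \frac{\partial f}{\partial y}\Delta y + \frac{\partial f}{\partial z}\Delta z \\
&= 0.
\end{aligned}
\tag{3.39}
\]

This ensures we don’t leave the surface of constant $f$. So a tangent vector to the surface of constant $f$ is obtained from an arbitrary multiple of such a step:

\[ \vec{A} \to_O a(\Delta t, \Delta x, \Delta y, \Delta z) \tag{3.40} \]

where $a \in \mathbb{R}$ and $a \neq 0$. The gradient one-form evaluated with such a tangent vector is

\[ \tilde{d}f(\vec{A}) = a\left(\frac{\partial f}{\partial t}\Delta t + \frac{\partial f}{\partial x}\Delta x + \frac{\partial f}{\partial y}\Delta y + \frac{\partial f}{\partial z}\Delta z\right) = 0. \tag{3.41} \]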

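14 Given

\[ \tilde{p} \to_O (1, 1, 0, 0) \]

\[ \tilde{q} \to_O (-1, 0, 1, 0) \]

I chose something very easy: $\vec{A} = \vec{e}_0$ and $\vec{B} = \vec{e}_1$. Then the computations are easy because we only have one component contributing for each vector:

\[
\begin{aligned}
\tilde{p} \otimes \tilde{q}(\vec{A}, \vec{B}) &= \tilde{p}(\vec{A})\,\tilde{q}(\vec{B}) \\
&= 1 \cdot 0 = 0.
\end{aligned}
\]

While changing the order of the vectors gives:

\[
\begin{aligned}
\tilde{p} \otimes \tilde{q}(\vec{B}, \vec{A}) &= \tilde{p}(\vec{B})\,\tilde{q}(\vec{A}) \\
&= 1 \cdot (-1) = -1 \\
&\neq \tilde{p} \otimes \tilde{q}(\vec{A}, \vec{B}).
\end{aligned}
\]

To find the components we must input the basis vectors. Because there are two vector inputs we have a two-index ($4 \times 4$) array of components:

\[
\begin{aligned}
\tilde{p} \otimes \tilde{q}(\vec{e}_{\alpha}, \vec{e}_{\beta}) &= \tilde{p}(\vec{e}_{\alpha})\,\tilde{q}(\vec{e}_{\beta}) \\
&= (\tilde{p} \otimes \tilde{q})_{\alpha\beta}.
\end{aligned}
\]

Let’s write these in a matrix where $\alpha$ is the row and $\beta$ is the column:

\[
(\tilde{p} \otimes \tilde{q})_{\alpha\beta} =
\begin{pmatrix}
-1 & 0 & 1 & 0 \\
-1 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}.
\]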

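15 Supply the reasoning leading from Eq. (3.23) to Eq. (3.24).

\[
\begin{aligned}
\mathbf{f} &= f_{\alpha\beta}\,\tilde{\omega}^{\alpha\beta}, && \text{what we mean by a basis} \\
f_{\mu\nu} &= \mathbf{f}(\vec{e}_{\mu}, \vec{e}_{\nu}), && \text{what we mean by components} \\
f_{\mu\nu} &= f_{\alpha\beta}\,\tilde{\omega}^{\alpha\beta}(\vec{e}_{\mu}, \vec{e}_{\nu}), && \text{sub first line into 2nd line} \\
\tilde{\omega}^{\alpha\beta}(\vec{e}_{\mu}, \vec{e}_{\nu}) &= \delta^{\alpha}{}_{\mu}\,\delta^{\beta}{}_{\nu}, && \text{solving above, verify by substitution, used Eq. (3.12)}
\end{aligned}
\tag{3.42}
\]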

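16 (a)

It’s obvious from the definition Eq. (3.69) that the $\binom{0}{2}$ tensor $\mathbf{h}_{(S)}$ is symmetric. Just interchange the arguments $\vec{A}$ and $\vec{B}$ and you obtain the same result because

\[ \tfrac{1}{2}\mathbf{h}(\vec{A}, \vec{B}) \quad \text{and} \quad \tfrac{1}{2}\mathbf{h}(\vec{B}, \vec{A}) \]

are just real numbers and the order of addition doesn’t matter for reals.

16 (b)

Similar argument for the antisymmetric $\binom{0}{2}$ tensor $\mathbf{h}_{(A)}$.

16 (c)

Components of the symmetric part of

\[
(\tilde{p} \otimes \tilde{q})_{\alpha\beta} =
\begin{pmatrix}
-1 & 0 & 1 & 0 \\
-1 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}
\]

are

\[
(\tilde{p} \otimes \tilde{q})_{(S)} =
\begin{pmatrix}
-1 & -\tfrac{1}{2} & \tfrac{1}{2} & 0 \\
-\tfrac{1}{2} & 0 & \tfrac{1}{2} & 0 \\
\tfrac{1}{2} & \tfrac{1}{2} & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}.
\]

The antisymmetric part is:

\[
(\tilde{p} \otimes \tilde{q})_{(A)} =
\begin{pmatrix}
0 & \tfrac{1}{2} & \tfrac{1}{2} & 0 \\
-\tfrac{1}{2} & 0 & \tfrac{1}{2} & 0 \\
-\tfrac{1}{2} & -\tfrac{1}{2} & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}.
\]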
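The component arrays of exercises 14 and 16(c) follow mechanically from $p_\alpha q_\beta$, so they are easy to regenerate. A minimal sketch in plain Python with exact fractions follows; the variable names are mine.

```python
from fractions import Fraction

p = [1, 1, 0, 0]
q = [-1, 0, 1, 0]

# (p ⊗ q)_{alpha beta} = p_alpha q_beta, with alpha the row and beta the column.
outer = [[pa * qb for qb in q] for pa in p]

half = Fraction(1, 2)
# Symmetric and antisymmetric parts: (1/2)(h_ab + h_ba) and (1/2)(h_ab - h_ba).
sym = [[half * (outer[a][b] + outer[b][a]) for b in range(4)] for a in range(4)]
anti = [[half * (outer[a][b] - outer[b][a]) for b in range(4)] for a in range(4)]

for name, array in (("outer", outer), ("sym", sym), ("anti", anti)):
    print(name)
    for row in array:
        print("   ", [str(x) for x in row])
```

Here `outer[0][1]` and `outer[1][0]` are the two values computed by hand in exercise 14 for $(\vec{e}_0, \vec{e}_1)$ and $(\vec{e}_1, \vec{e}_0)$, and adding `sym` and `anti` entry by entry recovers `outer`.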

16 (d)

From the definition of an antisymmetric tensor, Eq. (3.32), we know that

\[
\begin{aligned}
\mathbf{h}(\vec{A}, \vec{B}) &= -\mathbf{h}(\vec{B}, \vec{A}), && \forall\ \vec{B}, \vec{A} \\
\mathbf{h}(\vec{A}, \vec{A}) &= -\mathbf{h}(\vec{A}, \vec{A}), && \text{a special case} \\
&= 0, && \text{solving for LHS.}
\end{aligned}
\tag{3.43}
\]

16 (e)

Number of independent components of $\mathbf{h}_{(S)}$

For a general $\binom{0}{2}$ tensor there are $4 \times 4 = 16$ components. But for the symmetric tensor we have Eq. (3.28)

\[ f_{\alpha\beta} = f_{\beta\alpha} \]

This means that the 4 diagonal components and the 6 components above the diagonal are independent, so there are 10 independent components, while the 6 components below the diagonal are determined by the symmetry.

For an antisymmetric tensor we have Eq. (3.33)

\[ f_{\alpha\beta} = -f_{\beta\alpha} \]

which means the diagonal elements are zero,

\[ f_{\alpha\alpha} = 0, \]

so there are only 6 independent components in total.

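17 (a) This problem takes some time to work through. There must be an easier way, but here is my long-winded solution!

In general,

\[ \mathbf{h}(\vec{C}, \vec{A}) = h_{\gamma\beta}\, C^{\gamma} A^{\beta} \]

Let’s treat $\vec{C}$ as an arbitrary vector. We’re told that for arbitrary vectors $\vec{A}$ and $\vec{B}$, but with $\vec{B} \neq 0$,

\[ \mathbf{h}(\ \ , \vec{A}) = \alpha\,\mathbf{h}(\ \ , \vec{B}) \tag{3.44} \]

Suppose $C^{\gamma} \to_O (1, 0, 0, 0)$. Then

\[
\begin{aligned}
h_{0\beta} A^{\beta} &= \alpha\, h_{0\mu} B^{\mu} && \text{(3.45)} \\
\tilde{q}(\vec{A}) &= \alpha\, \tilde{q}(\vec{B}) && \text{(3.46)}
\end{aligned}
\]

Each side of (3.45) has the form of a one-form acting on a vector, so we wrote that explicitly in (3.46), with $q_{\beta} \equiv h_{0\beta}$. So far there’s no restriction on $\tilde{q}$; we simply choose

\[ \alpha = \frac{\tilde{q}(\vec{A})}{\tilde{q}(\vec{B})} \tag{3.47} \]

and we note the stipulation that $\vec{B} \neq 0$.

Now suppose $C^{\gamma} \to_O (1, 1, 0, 0)$. Then

\[
\begin{aligned}
h_{0\beta} A^{\beta} + h_{1\beta} A^{\beta} &= \alpha\,(h_{0\mu} B^{\mu} + h_{1\mu} B^{\mu}) && \text{(3.48)} \\
h_{1\beta} A^{\beta} &= \alpha\,(h_{1\mu} B^{\mu}) && \text{(3.49)}
\end{aligned}
\]

For (3.49) we used (3.45), but now $\alpha$ is no longer a free variable, being set by (3.47). Both (3.45) and (3.49) can be satisfied iff

\[ h_{1\beta} = a\, h_{0\beta} \tag{3.50} \]

for an arbitrary $a$. And in general, we see simply by repeating the argument above for different $\vec{C}$, that

\[ h_{\mu\beta} = p_{\mu}\, h_{0\beta}. \tag{3.51} \]

That is, if the tensor $\mathbf{h}$ is written as a matrix, the rows are arbitrary scalar multiples $p_{\mu}$ of the first row. And so, with $q_{\beta} \equiv h_{0\beta}$,

\[ h_{\mu\beta} = p_{\mu} q_{\beta} \tag{3.52} \]

so the tensor $\mathbf{h}$ has the form of an outer product

\[ \mathbf{h} = \tilde{p} \otimes \tilde{q}. \tag{3.53} \]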

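17 (b) Given a $\binom{1}{1}$ tensor $\mathbf{T}$, show that $\mathbf{T}(\ \ ; \vec{v})$ is a vector.

In general, for a $\binom{1}{1}$ tensor $\mathbf{T}$,

\[
\begin{aligned}
\mathbf{T}(\tilde{p}; \vec{v}) &= \mathbf{T}(p_{\alpha}\tilde{\omega}^{\alpha}; v^{\beta}\vec{e}_{\beta}) && \text{(3.54)} \\
&= T^{\nu}{}_{\mu}\, p_{\alpha} v^{\beta}\, \mathbf{u}^{\mu}{}_{\nu}(\tilde{\omega}^{\alpha}; \vec{e}_{\beta}) && \text{(3.55)} \\
&= T^{\nu}{}_{\mu}\, p_{\nu} v^{\mu} && \text{(3.56)}
\end{aligned}
\]

I’ve used $\mathbf{u}$ as the basis of $\binom{1}{1}$ tensors, and $\tilde{p}$ as my one-form argument because I wanted to keep $\tilde{\omega}$ as the basis for the one-forms. By equating (3.55) and (3.56) we see that

\[ \mathbf{u}^{\mu}{}_{\nu}(\tilde{\omega}^{\alpha}; \vec{e}_{\beta}) = \delta^{\alpha}{}_{\nu}\,\delta^{\mu}{}_{\beta} \tag{3.57} \]

and we identify $\tilde{\omega}^{\mu}(\vec{e}_{\beta}) = \delta^{\mu}{}_{\beta}$ from Eq. (3.12) on p. 60, and from the duality of one-forms and vectors we identify $\vec{e}_{\nu}(\tilde{\omega}^{\alpha}) = \delta^{\alpha}{}_{\nu}$, so

\[ \mathbf{u}^{\mu}{}_{\nu} = \vec{e}_{\nu} \otimes \tilde{\omega}^{\mu}. \tag{3.58} \]

Now I believe we can go back and write $\mathbf{T}$ acting on the vector alone:

\[
\begin{aligned}
\mathbf{T}(\ \ ; \vec{v}) &= \mathbf{T}(\ \ ; v^{\beta}\vec{e}_{\beta}) && \text{(3.59)} \\
&= T^{\nu}{}_{\mu}\, v^{\beta}\, \mathbf{u}^{\mu}{}_{\nu}(\ \ ; \vec{e}_{\beta}) = T^{\nu}{}_{\mu}\, v^{\beta}\, \vec{e}_{\nu} \otimes \tilde{\omega}^{\mu}(\ \ ; \vec{e}_{\beta}) && \text{(3.60)} \\
&= T^{\nu}{}_{\mu}\, v^{\beta}\, \tilde{\omega}^{\mu}(\vec{e}_{\beta})\, \vec{e}_{\nu} && \text{(3.61)} \\
&= T^{\nu}{}_{\mu}\, v^{\beta}\, \delta^{\mu}{}_{\beta}\, \vec{e}_{\nu} && \text{(3.62)} \\
&= T^{\nu}{}_{\mu}\, v^{\mu}\, \vec{e}_{\nu}. && \text{(3.63)}
\end{aligned}
\]

And now we’re done because the RHS is a scalar $T^{0}{}_{\mu} v^{\mu}$ times the basis vector $\vec{e}_0$ etc., which is a vector!

And similarly we can go back and write $\mathbf{T}$ acting on the one-form alone:

\[
\begin{aligned}
\mathbf{T}(\tilde{p}; \ \ ) &= \mathbf{T}(p_{\alpha}\tilde{\omega}^{\alpha}; \ \ ) \\
&= T^{\nu}{}_{\mu}\, p_{\alpha}\, \vec{e}_{\nu} \otimes \tilde{\omega}^{\mu}(\tilde{\omega}^{\alpha}; \ \ ) \\
&= T^{\nu}{}_{\mu}\, p_{\alpha}\, \tilde{\omega}^{\mu}\, \delta^{\alpha}{}_{\nu} \\
&= T^{\nu}{}_{\mu}\, p_{\nu}\, \tilde{\omega}^{\mu}.
\end{aligned}
\tag{3.64}
\]

Again we’re done here because the RHS is clearly a one-form.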

18 (a) Applying the metric tensor to a vector gives the corresponding one-form. The result is simply a change in sign of the first component, so

\[ \vec{A} \to_O (1, 0, -1, 0) \]

results in

\[ \tilde{A} \to_O (-1, 0, -1, 0). \]

Note that there is a caution on p. 69 that the metric tensor will change, but for now we’re still within Special Relativity, and the metric tensor has the simple values given on p. 45.

(b) Applying the inverse metric tensor to a one-form gives the corresponding vector. The inverse metric tensor is the same matrix as the metric tensor, so again, the result is simply a change in sign of the first component, so

\[ \tilde{p} \to_O (3, 0, -1, -1) \]

results in

\[ \vec{p} \to_O (-3, 0, -1, -1). \]

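19 (a) The inverse metric tensor $\eta^{\alpha\beta}$ was given in Eq. (3.44). It’s a matrix with the same values as the metric tensor itself, given first on p. 45.

\[
\begin{pmatrix}
-1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix}
-1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
=
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\tag{3.65}
\]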
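Exercises 18 and 19(a) amount to three small matrix contractions, sketched below in plain Python. The function name `apply_metric` is my own; in SR the same array serves for lowering and raising because $\eta^{\alpha\beta}$ and $\eta_{\alpha\beta}$ have the same values.

```python
ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def apply_metric(components):
    """Contract four components with eta (lowers or raises an index in SR)."""
    return [sum(ETA[a][b] * components[b] for b in range(4)) for a in range(4)]

A_vector = [1, 0, -1, 0]        # exercise 18(a)
p_oneform = [3, 0, -1, -1]      # exercise 18(b)

A_oneform = apply_metric(A_vector)
p_vector = apply_metric(p_oneform)

# Exercise 19(a): eta times its inverse (the same matrix) gives the identity.
product = [[sum(ETA[a][k] * ETA[k][b] for k in range(4)) for b in range(4)]
           for a in range(4)]

print(A_oneform)
print(p_vector)
print(product)
```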

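19 (b) To derive the formula for the inner product of one-forms in terms of components, Eq. (3.53), we start with the definition Eq. (3.52). This involves only the squares. But first we must establish how to obtain the components of the addition of two one-forms. Intuitively one might guess we just add the components. Indeed this is so, but to establish this rigorously we start with the definition of addition in Eq. (3.6). If $\tilde{s} = \tilde{p} + \tilde{q}$ then $\tilde{s}(\vec{A}) = \tilde{p}(\vec{A}) + \tilde{q}(\vec{A})$ for all $\vec{A}$. Suppose

\[ \vec{A} \to_O (1, 0, 0, 0). \]

Then

\[
\begin{aligned}
\tilde{p}(\vec{A}) &= p_{\alpha} A^{\alpha} = p_0 && \text{(3.66)} \\
\tilde{q}(\vec{A}) &= q_{\alpha} A^{\alpha} = q_0 && \text{(3.67)} \\
\tilde{s}(\vec{A}) &= s_{\alpha} A^{\alpha} = s_0 = p_0 + q_0. && \text{(3.68)}
\end{aligned}
\]

Similarly for $\vec{A} \to_O (0, 1, 0, 0)$ etc. This establishes that Eq. (3.6) implies that to add two one-forms one just adds the components. Now we can expand $(\tilde{p} + \tilde{q})^2$ in terms of components using Eq. (3.50).

\[
\begin{aligned}
(\tilde{p} + \tilde{q})^2 &= \tilde{s}^2 \\
&= \eta^{\alpha\beta} s_{\alpha} s_{\beta}, && \text{by Eq. (3.50)} \\
&= \eta^{\alpha\beta}(p_{\alpha} + q_{\alpha})(p_{\beta} + q_{\beta}), && \text{component-wise addition of one-forms} \\
&= \eta^{\alpha\beta}(p_{\alpha} p_{\beta} + q_{\alpha} q_{\beta} + p_{\alpha} q_{\beta} + p_{\beta} q_{\alpha}), && \text{components are just reals}
\end{aligned}
\tag{3.69}
\]

Now we are finally ready to deal with the definition:

\[
\begin{aligned}
\tilde{p} \cdot \tilde{q} &= \tfrac{1}{2}\left[(\tilde{p} + \tilde{q})^2 - \tilde{p}^2 - \tilde{q}^2\right], && \text{by definition Eq. (3.52)} \\
&= \tfrac{1}{2}\left[\eta^{\alpha\beta}(p_{\alpha} p_{\beta} + q_{\alpha} q_{\beta} + p_{\alpha} q_{\beta} + p_{\beta} q_{\alpha}) - \tilde{p}^2 - \tilde{q}^2\right], && \text{by (3.69)} \\
&= \tfrac{1}{2}\left[\eta^{\alpha\beta}(p_{\alpha} p_{\beta} + q_{\alpha} q_{\beta} + p_{\alpha} q_{\beta} + p_{\beta} q_{\alpha}) - \eta^{\alpha\beta} p_{\alpha} p_{\beta} - \eta^{\alpha\beta} q_{\alpha} q_{\beta}\right], && \text{by Eq. (3.50)} \\
&= \tfrac{1}{2}\left[\eta^{\alpha\beta}(p_{\alpha} q_{\beta} + p_{\beta} q_{\alpha})\right], && \text{after cancelling terms} \\
&= -p_0 q_0 + p_1 q_1 + p_2 q_2 + p_3 q_3, && \text{using } \eta^{\alpha\beta} \text{ from Eq. (3.44)}
\end{aligned}
\tag{3.70}
\]

which is Eq. (3.53).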
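The identity just derived can be spot-checked by evaluating both sides for specific one-forms. This is a sketch in plain Python; the function names are mine and the sample components are borrowed from exercises 14 and 18.

```python
from fractions import Fraction

ETA_INV_DIAG = (-1, 1, 1, 1)   # diagonal of eta^{alpha beta}

def square(p):
    """p^2 = eta^{alpha beta} p_alpha p_beta, as in Eq. (3.50)."""
    return sum(e * x * x for e, x in zip(ETA_INV_DIAG, p))

def dot_from_definition(p, q):
    """(1/2)[(p + q)^2 - p^2 - q^2], the definition in Eq. (3.52)."""
    s = [a + b for a, b in zip(p, q)]
    return Fraction(1, 2) * (square(s) - square(p) - square(q))

def dot_from_components(p, q):
    """-p_0 q_0 + p_1 q_1 + p_2 q_2 + p_3 q_3, the component form Eq. (3.53)."""
    return sum(e * a * b for e, a, b in zip(ETA_INV_DIAG, p, q))

p = (1, 1, 0, 0)
q = (-1, 0, 1, 0)
print(dot_from_definition(p, q), dot_from_components(p, q))
```

The two printed numbers should agree for any choice of components, since (3.70) is an algebraic identity.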

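20 Suppose we’re in Euclidean 3-space in Cartesian coordinates.

(a) We want to show that

\[ A^{\bar\alpha} = \Lambda^{\bar\alpha}{}_{\beta} A^{\beta} \]

and

\[ P_{\bar\beta} = \Lambda^{\alpha}{}_{\bar\beta} P_{\alpha} \tag{3.71} \]

are the same transformation since the matrix $\Lambda^{\bar\alpha}{}_{\beta}$ is equal to the transpose of its inverse.

The inverse transformation takes us back from the $\bar{O}$ frame to the $O$ frame, and is thus written,

\[ \left(\Lambda^{\bar\alpha}{}_{\beta}\right)^{-1} = \Lambda^{\alpha}{}_{\bar\beta}. \tag{3.72} \]

If the one-form components on the RHS of (3.71), $P_{\alpha}$, are written as a column matrix, then the transformation in (3.71) amounts to multiplying the column vector by the matrix

\[ \left(\Lambda^{\alpha}{}_{\bar\beta}\right)^{T} = \Lambda_{\bar\beta}{}^{\alpha}. \]

And if the original matrix is orthogonal then we’re back to the same matrix.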

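(b) All we’re given is that the metric tensor for Cartesian 3-space is $\delta_{ij}$, $i, j = 1, 2, 3$. The metric tensor is used in forming the inner product of vectors, which we know must be frame invariant. So let’s write the inner product between two 3-space vectors in two different frames,

\[
\begin{aligned}
\delta_{ij} A^{i} B^{j} &= \vec{A} \cdot \vec{B} \\
&= \delta_{\bar{k}\bar{l}} A^{\bar{k}} B^{\bar{l}} \\
&= \delta_{\bar{k}\bar{l}} (A^{i} \Lambda^{\bar{k}}{}_{i})(B^{j} \Lambda^{\bar{l}}{}_{j})
\end{aligned}
\tag{3.73}
\]

and so upon cancelling the $A^{i} B^{j}$ on either side and rearranging we see that

\[ \delta_{ij} = \Lambda^{\bar{k}}{}_{i}\, \Lambda^{\bar{l}}{}_{j}\, \delta_{\bar{k}\bar{l}}, \tag{3.74} \]

as required. If one were uncomfortable with cancelling the $A^{i} B^{j}$ on either side, one can write

\[
\begin{aligned}
\delta_{ij} A^{i} B^{j} &= \vec{A} \cdot \vec{B} \\
&= \delta_{\bar{k}\bar{l}} A^{\bar{k}} B^{\bar{l}} \\
&= \delta_{\bar{k}\bar{l}} (A^{m} \Lambda^{\bar{k}}{}_{m})(B^{n} \Lambda^{\bar{l}}{}_{n}).
\end{aligned}
\tag{3.75}
\]

And then because $\vec{A}$ and $\vec{B}$ were arbitrary we can consider cases like

\[ \vec{A} \to_O (1, 0, 0). \]

Then there is only one non-zero $A^{i}$, $A^{m}$ component on each side, and we can divide through by this single component. And this is true for all the components $A^{i}$ and $B^{i}$. So again, we assert (3.74).

\[
\begin{aligned}
\delta_{ij} &= \Lambda^{\bar{k}}{}_{i}\, \Lambda^{\bar{l}}{}_{j}\, \delta_{\bar{k}\bar{l}}, && \text{from above} && \text{(3.76)} \\
&= \Lambda^{\bar{k}}{}_{i}\, \Lambda^{\bar{k}}{}_{j}, && \text{after summing over } \bar{l}. && \text{(3.77)}
\end{aligned}
\]

The RHS of (3.77) is the product of a matrix by its transpose, and for this to equal the identity matrix (i.e. the LHS), we require the matrix to be orthogonal.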
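As a concrete illustration of condition (3.77), the sketch below (plain Python) evaluates $\Lambda^{\bar{k}}{}_{i}\Lambda^{\bar{k}}{}_{j}$ for a rotation about the $z$-axis and, for contrast, for a shear. Both matrices and the angle 0.7 rad are examples I chose; they are not part of the exercise.

```python
import math

def rotation_z(theta):
    """Rotation of the Cartesian axes about z by angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def gram(matrix):
    """Return sum_k M[k][i] * M[k][j], the right-hand side of Eq. (3.77)."""
    n = len(matrix)
    return [[sum(matrix[k][i] * matrix[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

def is_identity(matrix, tol=1e-12):
    n = len(matrix)
    return all(abs(matrix[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(n) for j in range(n))

rotation = rotation_z(0.7)
shear = [[1.0, 0.5, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

print("rotation preserves delta_ij:", is_identity(gram(rotation)))
print("shear preserves delta_ij   :", is_identity(gram(shear)))
```

Only the transformation that passes this test keeps $\delta_{ij}$ as the metric components, which is why the distinction between one-form and vector components is invisible under rotations of Cartesian axes.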

And now I know why we never learned one-forms in undergrad, and called the gradient of a scalar field a vector. Incidentally, it was precisely this point that I didn’t understand in reading the explanation of a one-form by Misner et al. (1973) that made me give up on their book and try Schutz’s book. I couldn’t see the difference between their description of a one-form and the gradient (and of course there is no difference) but I also thought I “knew” that the gradient of a scalar field was a vector. So hats off to Schutz for making this point clear. Though perhaps if I had persisted with Misner et al. (1973) all would have been fine in the end.

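21 (a) Starting with the $t = 0$ boundary and moving counter clockwise, so next the $x = 1$ boundary, let’s call the normal one-forms $\tilde{a}, \tilde{b}, \tilde{c}, \tilde{d}$. These take on the values:

\[
\begin{aligned}
\tilde{a} &\to_O (a_0, 0), \quad a_0 > 0 && \text{(3.78)} \\
\tilde{b} &\to_O (0, b_1), \quad b_1 > 0 && \text{(3.79)} \\
\tilde{c} &\to_O (c_0, 0), \quad c_0 < 0 && \text{(3.80)} \\
\tilde{d} &\to_O (0, d_1), \quad d_1 < 0. && \text{(3.81)}
\end{aligned}
\]

To obtain the corresponding vectors we simply change the sign of the time component:

\[
\begin{aligned}
\vec{a} &\to_O (-a_0, 0), \quad a_0 > 0 && \text{(3.82)} \\
\vec{b} &\to_O (0, b_1), \quad b_1 > 0 && \text{(3.83)} \\
\vec{c} &\to_O (-c_0, 0), \quad c_0 < 0 && \text{(3.84)} \\
\vec{d} &\to_O (0, d_1), \quad d_1 < 0. && \text{(3.85)}
\end{aligned}
\]

The normal vectors in the time direction, $\vec{a}$ and $\vec{c}$, look odd because they appear to be pointing inward. But the metric is such,

\[
\eta =
\begin{bmatrix}
-1 & 0 \\
0 & 1
\end{bmatrix}
\]

that the scalar product of vectors that point outward will be positive.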

21 (b) The first challenge is finding out what is meant by the “null boundary”. I would guess it’s the surface in the direction of the null vector, a null vector being one whose inner product with itself is zero, e.g.:

$$
\vec V \underset{\mathcal O}{\to} (1, 1).
$$

This vector has the strange property of being orthogonal to itself, see p. 45. The other two boundaries, x = 1 and t = 1, are easily named, and so a process of elimination also points to the boundary between (1, 0) and (2, 1) as the “null boundary”.

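The outward normal one-form, $\tilde c$, is easily found:

$$
\begin{aligned}
\tilde c(\vec V) &= 0, && \text{definition of normal} && (3.86)\\
&= c_\alpha V^\alpha && && (3.87)\\
&= c_0 V^0 + c_1 V^1 && && (3.88)\\
&= c_0 + c_1. && && (3.89)
\end{aligned}
$$

To ensure the outward normal, we require $\tilde c$ to be positive on a vector that points out of the region across this boundary, such as (1, −1):

$$
c_0(1) + c_1(-1) = c_0 - c_1 > 0. \qquad (3.90)
$$

For example, $\tilde c \underset{\mathcal O}{\to} (1, -1)$.

The associated vector is

$$
\vec c \underset{\mathcal O}{\to} (-1, -1).
$$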

22 To show that vectors form a vector space, when introduced as functions that take one-forms as arguments, we would proceed as in section 3.3, interchanging the roles of vectors and one-forms in Eq. (3.6). That is, we would define the addition of vectors and multiplication of a vector by a scalar as follows. Suppose

$$
\begin{aligned}
\vec A &= \vec B + \vec C && (3.91)\\
\vec D &= a \vec B, \quad a \in \mathbb{R}, && (3.92)
\end{aligned}
$$

then we require for all one-form arguments $\tilde p$,

$$
\begin{aligned}
\vec A(\tilde p) &= \vec B(\tilde p) + \vec C(\tilde p) && (3.93)\\
\vec D(\tilde p) &= a \vec B(\tilde p), \quad a \in \mathbb{R}, && (3.94)
\end{aligned}
$$

in analogy to Eq. (3.6b).


We need a zero, which is provided by the zero vector, the one that gives zero for any one-form argument:

$$
\vec V(\tilde q) = 0, \quad \forall\, \tilde q.
$$

The two axioms of Appendix A are now clearly satisfied. (See also problems 2 and 23.)

23 (a) Prove that the set of all $\binom{M}{N}$ tensors for fixed M and N forms a vector space.

This is like question 2. But now we need to define what we mean by the addition of two $\binom{M}{N}$ tensors and the multiplication of an $\binom{M}{N}$ tensor by a scalar. Of course we are guided by Eq. (3.6). That is, we note that $\binom{M}{N}$ tensors produce real numbers that can be added like real numbers, so the generalization of Eq. (3.6) is trivial. The tensor $\mathbf S$ where

$$
\mathbf S = \mathbf P + \mathbf Q \qquad (3.95)
$$

is defined to be that which gives the sum of the two values obtained by applying the input to $\mathbf P$ and $\mathbf Q$. That is,

$$
\mathbf S(\tilde a^1, \tilde a^2, \ldots, \tilde a^M; \vec b_1, \vec b_2, \ldots, \vec b_N) = \mathbf P(\tilde a^1, \tilde a^2, \ldots, \tilde a^M; \vec b_1, \vec b_2, \ldots, \vec b_N) + \mathbf Q(\tilde a^1, \tilde a^2, \ldots, \tilde a^M; \vec b_1, \vec b_2, \ldots, \vec b_N) \qquad (3.96)
$$

where I have invented the notation

$$
(\tilde a^1, \tilde a^2, \ldots, \tilde a^M; \vec b_1, \vec b_2, \ldots, \vec b_N)
$$

to show the M one-form inputs and N vector inputs. The choice of one-forms first, we’ll see later, gives the basis in the order Schutz gave in 23b. I have followed the convention that superscript integers are used as indices of different one-forms. That is, $\tilde a^1$ and $\tilde a^2$ are two different one-forms, not components of the same one-form. Similarly, subscripts are used to denote different vectors. In analogy with Eq. (3.6b) we can define multiplication of an $\binom{M}{N}$ tensor by a scalar α

$$
\mathbf R = \alpha \mathbf P \qquad (3.97)
$$

to be the tensor that, for a given input, gives just α times the real number produced by supplying the input to $\mathbf P$:

$$
\mathbf R(\tilde a^1, \tilde a^2, \ldots, \tilde a^M; \vec b_1, \vec b_2, \ldots, \vec b_N) = \alpha\, \mathbf P(\tilde a^1, \tilde a^2, \ldots, \tilde a^M; \vec b_1, \vec b_2, \ldots, \vec b_N) \qquad (3.98)
$$


The set of $\binom{M}{N}$ tensors for fixed M and N forms a vector space by the same argument as given for question 2. Perhaps it’s worth making explicit what we mean by the zero $\binom{M}{N}$ tensor. This is the tensor that gives zero for any input, $(\tilde a^1, \tilde a^2, \ldots, \tilde a^M; \vec b_1, \vec b_2, \ldots, \vec b_N)$.

The set of $\binom{M}{N}$ tensors, with (3.95) and (3.96), then meets axiom (1) in Appendix A: $\binom{M}{N}$ tensors form an abelian group with the operation of addition.

23 (b) Prove that the basis for the vector space formed from the set of all $\binom{M}{N}$ tensors for fixed M and N is the set:

$$
\vec e_\alpha \otimes \vec e_\beta \otimes \ldots \otimes \vec e_\gamma \otimes \tilde\omega^\mu \otimes \tilde\omega^\nu \otimes \ldots \otimes \tilde\omega^\lambda \qquad (3.99)
$$

with M vectors labeled with α … γ and N one-forms labeled µ … λ.

This is a nice question because it forces us to think about what we mean by a basis. The answer is a straightforward generalization of the argument for the basis of the $\binom{0}{2}$ tensors starting at the bottom of page 66 and ending with Eq. (3.26) on p. 67.

The notation is cumbersome because one needs to refer to M superscripts and N subscripts where M and N are arbitrary. In defining the basis (3.99) Schutz has used a series of Greek letters like α … γ. I’ve decided to put subscript indices on the Greek letters, $\alpha_1, \alpha_2, \ldots, \alpha_M$. That way I can be explicit about how many there are. Remember that each Greek letter index can take on 4 values, e.g. $\alpha_1 = 0, 1, 2, 3$, corresponding to the four dimensions.

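As in Eq. (3.23) we write the $\binom{M}{N}$ tensor as a sum of components times the basis that we seek:

$$
\mathbf R = R^{\alpha_1,\alpha_2,\ldots,\alpha_M}{}_{\beta_1,\beta_2,\ldots,\beta_N}\; \omega^{\beta_1,\beta_2,\ldots,\beta_N}{}_{\alpha_1,\alpha_2,\ldots,\alpha_M} \qquad (3.100)
$$

And furthermore, the components correspond to the real values produced by applying the tensor to arguments that are the basis one-forms and basis vectors. So,

$$
R^{\alpha_1,\alpha_2,\ldots,\alpha_M}{}_{\beta_1,\beta_2,\ldots,\beta_N} = \mathbf R(\tilde\omega^{\alpha_1}, \tilde\omega^{\alpha_2}, \ldots, \tilde\omega^{\alpha_M}; \vec e_{\beta_1}, \vec e_{\beta_2}, \ldots, \vec e_{\beta_N}) \qquad (3.101)
$$

which is the generalization of the formula given between Eq. (3.23) and Eq. (3.24) on p. 67. Now, we simply substitute the tensor (3.100) into (3.101) to obtain:

$$
R^{\mu_1,\mu_2,\ldots,\mu_M}{}_{\nu_1,\nu_2,\ldots,\nu_N} = R^{\alpha_1,\alpha_2,\ldots,\alpha_M}{}_{\beta_1,\beta_2,\ldots,\beta_N}\; \omega^{\beta_1,\beta_2,\ldots,\beta_N}{}_{\alpha_1,\alpha_2,\ldots,\alpha_M}(\tilde\omega^{\mu_1}, \tilde\omega^{\mu_2}, \ldots, \tilde\omega^{\mu_M}; \vec e_{\nu_1}, \vec e_{\nu_2}, \ldots, \vec e_{\nu_N}) \qquad (3.102)
$$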


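This implies the analogue to Eq. (3.24),

$$
\omega^{\beta_1,\beta_2,\ldots,\beta_N}{}_{\alpha_1,\alpha_2,\ldots,\alpha_M}(\tilde\omega^{\mu_1}, \tilde\omega^{\mu_2}, \ldots, \tilde\omega^{\mu_M}; \vec e_{\nu_1}, \vec e_{\nu_2}, \ldots, \vec e_{\nu_N}) = \delta^{\mu_1}{}_{\alpha_1}\, \delta^{\mu_2}{}_{\alpha_2} \ldots \delta^{\mu_M}{}_{\alpha_M}\; \delta^{\beta_1}{}_{\nu_1}\, \delta^{\beta_2}{}_{\nu_2} \ldots \delta^{\beta_N}{}_{\nu_N} \qquad (3.103)
$$

Using Eq. (3.12) we identify

$$
\begin{aligned}
\delta^{\beta_1}{}_{\nu_1} &= \tilde\omega^{\beta_1}(\vec e_{\nu_1})\\
\delta^{\beta_2}{}_{\nu_2} &= \tilde\omega^{\beta_2}(\vec e_{\nu_2})\\
&\;\;\vdots\\
\delta^{\beta_N}{}_{\nu_N} &= \tilde\omega^{\beta_N}(\vec e_{\nu_N}) && (3.104)
\end{aligned}
$$

Based upon the dualism between vectors and one-forms, we identify:

$$
\begin{aligned}
\delta^{\mu_1}{}_{\alpha_1} &= \vec e_{\alpha_1}(\tilde\omega^{\mu_1})\\
\delta^{\mu_2}{}_{\alpha_2} &= \vec e_{\alpha_2}(\tilde\omega^{\mu_2})\\
&\;\;\vdots\\
\delta^{\mu_M}{}_{\alpha_M} &= \vec e_{\alpha_M}(\tilde\omega^{\mu_M}) && (3.105)
\end{aligned}
$$

So,

$$
\delta^{\mu_1}{}_{\alpha_1}\, \delta^{\mu_2}{}_{\alpha_2} \ldots \delta^{\mu_M}{}_{\alpha_M} = \vec e_{\alpha_1}(\tilde\omega^{\mu_1})\, \vec e_{\alpha_2}(\tilde\omega^{\mu_2}) \ldots \vec e_{\alpha_M}(\tilde\omega^{\mu_M}). \qquad (3.106)
$$

So focusing on just the tensor, i.e. dropping the arguments, we’re left with the basis that is the analogue to Eq. (3.25),

$$
\omega^{\beta_1,\beta_2,\ldots,\beta_N}{}_{\alpha_1,\alpha_2,\ldots,\alpha_M} = \vec e_{\alpha_1} \otimes \vec e_{\alpha_2} \otimes \ldots \otimes \vec e_{\alpha_M} \otimes \tilde\omega^{\beta_1} \otimes \tilde\omega^{\beta_2} \otimes \ldots \otimes \tilde\omega^{\beta_N} \qquad (3.107)
$$

where we have introduced the idea of an outer product of N one-forms as a simple extension of the case when N = 2 introduced on p. 66. That is, the outer product of N one-forms

$$
\tilde p^1 \otimes \tilde p^2 \otimes \ldots \otimes \tilde p^N
$$

is simply the tensor that, when supplied with N vector inputs, say $\vec A_1, \vec A_2, \ldots, \vec A_N$, as arguments, produces the number that results from multiplying together each real number that results from applying $\tilde p^n$ to vector argument $\vec A_n$, i.e.

$$
\tilde p^1 \otimes \tilde p^2 \otimes \ldots \otimes \tilde p^N(\vec A_1, \vec A_2, \ldots, \vec A_N) = \tilde p^1(\vec A_1)\, \tilde p^2(\vec A_2) \ldots \tilde p^N(\vec A_N)
$$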
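The definition of the outer product of N one-forms is easy to mimic in code. The following is only a sketch: one-forms are modelled as Python functions on component lists, and the names `one_form` and `outer` as well as the sample components are hypothetical choices of mine.

```python
from math import prod

def one_form(components):
    """Return a one-form as a function acting on a vector's components."""
    def apply(vector):
        return sum(p * a for p, a in zip(components, vector))
    return apply

def outer(*one_forms):
    """Outer product p1 (x) ... (x) pN: takes N vectors, returns a real number."""
    def tensor(*vectors):
        if len(vectors) != len(one_forms):
            raise ValueError("need one vector argument per one-form")
        # Multiply together the N real numbers p_n(A_n).
        return prod(p(v) for p, v in zip(one_forms, vectors))
    return tensor

p1 = one_form([1, 0, 2, 0])
p2 = one_form([0, 3, 0, 1])
p3 = one_form([1, 1, 1, 1])
A1, A2, A3 = [1, 2, 3, 4], [0, 1, 0, 1], [2, 0, 0, 0]

T = outer(p1, p2, p3)
value = T(A1, A2, A3)                  # p1(A1) * p2(A2) * p3(A3)
doubled = T([2, 4, 6, 8], A2, A3)      # linear in each vector argument
```

Doubling one vector argument doubles the result, which is the multilinearity that makes this object a tensor.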


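24 (a)(i) The definitions of the symmetric and antisymmetric tensors are given in Eq. (3.31) and Eq. (3.34).

$$
M^{(\alpha\beta)} = \begin{pmatrix}
0 & 1 & 1 & \tfrac12 \\
1 & -1 & 0 & 1 \\
1 & 0 & 0 & -\tfrac12 \\
\tfrac12 & 1 & -\tfrac12 & 0
\end{pmatrix} \qquad (3.108)
$$

$$
M^{[\alpha\beta]} = \begin{pmatrix}
0 & 0 & -1 & -\tfrac12 \\
0 & 0 & 0 & 1 \\
1 & 0 & 0 & \tfrac32 \\
\tfrac12 & -1 & -\tfrac32 & 0
\end{pmatrix} \qquad (3.109)
$$

As a check, the sum of the two matrices reproduces $M^{\alpha\beta}$.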

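(ii) Section 3.7 shows how to raise and lower indices using the metric tensor.

$$
\begin{aligned}
M^{\alpha}{}_{\beta} &= \eta_{\beta\gamma} M^{\alpha\gamma} && (3.110)\\
&= M^{\alpha\gamma} (\eta^{T})_{\gamma\beta} && (3.111)
\end{aligned}
$$

$$
\begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & -1 & 0 & 2 \\ 2 & 0 & 0 & 1 \\ 1 & 0 & -2 & 0 \end{pmatrix}
\begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
=
\begin{pmatrix} 0 & 1 & 0 & 0 \\ -1 & -1 & 0 & 2 \\ -2 & 0 & 0 & 1 \\ -1 & 0 & -2 & 0 \end{pmatrix}
\qquad (3.112)
$$

(iii)

$$
\begin{aligned}
M_{\alpha}{}^{\beta} &= \eta_{\alpha\gamma} M^{\gamma\beta} && (3.113)\\
&= \eta_{\alpha\gamma} M^{\gamma\beta} && (3.114)
\end{aligned}
$$

$$
\begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & -1 & 0 & 2 \\ 2 & 0 & 0 & 1 \\ 1 & 0 & -2 & 0 \end{pmatrix}
=
\begin{pmatrix} 0 & -1 & 0 & 0 \\ 1 & -1 & 0 & 2 \\ 2 & 0 & 0 & 1 \\ 1 & 0 & -2 & 0 \end{pmatrix}
\qquad (3.115)
$$

(iv)

$$
M_{\alpha\beta} = \eta_{\alpha\gamma} M^{\gamma}{}_{\beta} = \qquad (3.116)
$$

$$
\begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 0 & 1 & 0 & 0 \\ -1 & -1 & 0 & 2 \\ -2 & 0 & 0 & 1 \\ -1 & 0 & -2 & 0 \end{pmatrix}
=
\begin{pmatrix} 0 & -1 & 0 & 0 \\ -1 & -1 & 0 & 2 \\ -2 & 0 & 0 & 1 \\ -1 & 0 & -2 & 0 \end{pmatrix}
\qquad (3.117)
$$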


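OR,

$$
\begin{aligned}
M_{\alpha\beta} &= \eta_{\beta\gamma} M_{\alpha}{}^{\gamma} && (3.118)\\
&= M_{\alpha}{}^{\gamma} (\eta^{T})_{\gamma\beta} && (3.119)
\end{aligned}
$$

$$
\begin{pmatrix} 0 & -1 & 0 & 0 \\ 1 & -1 & 0 & 2 \\ 2 & 0 & 0 & 1 \\ 1 & 0 & -2 & 0 \end{pmatrix}
\begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
=
\begin{pmatrix} 0 & -1 & 0 & 0 \\ -1 & -1 & 0 & 2 \\ -2 & 0 & 0 & 1 \\ -1 & 0 & -2 & 0 \end{pmatrix}.
\qquad (3.120)
$$

Of course the two different ways to obtain $M_{\alpha\beta}$ agree.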
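The index gymnastics of problem 24 (a) can be double-checked with a few lines of plain Python. This is a sketch under the assumption that the matrix row labels the first tensor index and the column labels the second; the components of M are read off from the first factor of Eq. (3.112), and the helper names are mine.

```python
from fractions import Fraction

ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
# Components M^{alpha beta}; row = first index, column = second index.
M_UP = [[0, 1, 0, 0], [1, -1, 0, 2], [2, 0, 0, 1], [1, 0, -2, 0]]

def matmul(a, b):
    """Multiply two 4x4 matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def symmetric_part(m):
    """(1/2)(M^{ab} + M^{ba}), kept exact with fractions."""
    return [[Fraction(m[i][j] + m[j][i], 2) for j in range(4)] for i in range(4)]

def antisymmetric_part(m):
    """(1/2)(M^{ab} - M^{ba}), kept exact with fractions."""
    return [[Fraction(m[i][j] - m[j][i], 2) for j in range(4)] for i in range(4)]

M_sym = symmetric_part(M_UP)
M_anti = antisymmetric_part(M_UP)

M_up_down = matmul(M_UP, ETA)        # M^alpha_beta: lower the second index
M_down_up = matmul(ETA, M_UP)        # M_alpha^beta: lower the first index
M_down_a = matmul(ETA, M_up_down)    # M_{alpha beta}, first route
M_down_b = matmul(M_down_up, ETA)    # M_{alpha beta}, second route

for row in M_down_a:
    print(row)
```

Lowering an index with the Minkowski metric just flips the sign of the time row or the time column, which is why the two routes to $M_{\alpha\beta}$ must agree.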

24 (b) Does it make sense to speak of the symmetric and antisymmetric parts of $M^{\alpha}{}_{\beta}$? This $\binom{1}{1}$ tensor is represented by a matrix, so if it did make sense then it would be easy to find the symmetric and antisymmetric parts! But I would guess that it doesn’t make sense, because symmetry has to do with the interchange of the order of the arguments. For a $\binom{1}{1}$ tensor, one argument is a vector, the other a one-form. So they cannot be interchanged. (On the other hand, each vector has a corresponding one-form and vice versa, so if we incorporate this into the idea of symmetry, then one could define symmetric and antisymmetric parts.) I guess I’m sitting on the fence.

This first view is the correct one. It only makes sense to discuss symmetry after both indices have been raised or lowered, as we’ve done in this example using the metric tensor.

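24 (c)

The $\binom{2}{0}$ tensor $\eta^{\alpha\beta}$ was introduced in section 3.5 as the inverse of the metric tensor. So when we raise an index on the metric tensor, we naturally get the identity matrix:

$$
\begin{aligned}
\eta^{\alpha}{}_{\beta} &= \eta^{\alpha\gamma}\eta_{\gamma\beta}, && \text{using Eq. (3.58)} && (3.121)\\
&= \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, && \text{using Eq. (3.44)} && (3.122)\\
&= \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, && && (3.123)\\
&= \delta^{\alpha}{}_{\beta}. && && (3.124)
\end{aligned}
$$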

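25 Show that $A^{\alpha\beta}B_{\alpha\beta}$ is frame invariant.

This simple-looking problem turned out to be quite tedious. Perhaps I’m doing more than required?

[In retrospect, having now read most of the book, the solution below of course contains much more detail than required. But it was only through working through this sort of detail that I was able to gain confidence in tensor calculus.]

First I believe we need to interpret $A^{\alpha\beta}B_{\alpha\beta}$ in terms of tensor calculus. It looks like it has the form of $\mathbf A(\mathbf B)$, where $\mathbf A$ is a $\binom{2}{0}$ tensor and $\mathbf B$ is a $\binom{0}{2}$ tensor. To confirm this we write:

$$
\begin{aligned}
\mathbf A &= A^{\alpha\beta}\, \vec e_\alpha \otimes \vec e_\beta, && \text{see problem 23 (b)} && (3.125)\\
\mathbf B &= B_{\mu\nu}\, \tilde\omega^\mu \otimes \tilde\omega^\nu. && && (3.126)
\end{aligned}
$$

Now one applies the tensor $\mathbf A$ to the arguments $\mathbf B$,

$$
\begin{aligned}
\mathbf A(\mathbf B) &= A^{\alpha\beta}\, \vec e_\alpha \otimes \vec e_\beta (B_{\mu\nu}\, \tilde\omega^\mu \otimes \tilde\omega^\nu) && (3.127)\\
&= A^{\alpha\beta} B_{\mu\nu}\, \vec e_\alpha(\tilde\omega^\mu)\, \vec e_\beta(\tilde\omega^\nu) && (3.128)\\
&= A^{\alpha\beta} B_{\mu\nu}\, \delta^{\mu}{}_{\alpha}\, \delta^{\nu}{}_{\beta} && (3.129)\\
&= A^{\alpha\beta} B_{\alpha\beta}, && (3.130)
\end{aligned}
$$

which confirms our interpretation.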


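At this point, we expect $A^{\alpha\beta}B_{\alpha\beta} = \mathbf A(\mathbf B)$ to be frame-invariant, because it’s composed of vectors and one-forms, which are frame-invariant. But we can demonstrate this more explicitly. That is, with barred and primed indices both referring to the same second frame,

$$
\begin{aligned}
\mathbf A &= A^{\alpha\beta}\, \vec e_\alpha \otimes \vec e_\beta, && (3.131)\\
&= A^{\bar\alpha\bar\beta}\, \Lambda^{\alpha}{}_{\bar\alpha}\, \Lambda^{\beta}{}_{\bar\beta}\; \Lambda^{\alpha'}{}_{\alpha}\, \Lambda^{\beta'}{}_{\beta}\; \vec e_{\alpha'} \otimes \vec e_{\beta'}, && (3.132)\\
&= A^{\bar\alpha\bar\beta}\, \delta^{\alpha'}{}_{\bar\alpha}\, \delta^{\beta'}{}_{\bar\beta}\; \vec e_{\alpha'} \otimes \vec e_{\beta'}, && (3.133)\\
&= A^{\bar\alpha\bar\beta}\, \vec e_{\bar\alpha} \otimes \vec e_{\bar\beta}. && (3.134)
\end{aligned}
$$

And similarly for $\mathbf B$:

$$
\begin{aligned}
\mathbf B &= B_{\mu\nu}\, \tilde\omega^\mu \otimes \tilde\omega^\nu, && (3.135)\\
&= B_{\bar\mu\bar\nu}\, \Lambda^{\bar\mu}{}_{\mu}\, \Lambda^{\bar\nu}{}_{\nu}\; \Lambda^{\mu}{}_{\mu'}\, \Lambda^{\nu}{}_{\nu'}\; \tilde\omega^{\mu'} \otimes \tilde\omega^{\nu'}, && (3.136)\\
&= B_{\bar\mu\bar\nu}\, \delta^{\bar\mu}{}_{\mu'}\, \delta^{\bar\nu}{}_{\nu'}\; \tilde\omega^{\mu'} \otimes \tilde\omega^{\nu'}, && (3.137)\\
&= B_{\bar\mu\bar\nu}\, \tilde\omega^{\bar\mu} \otimes \tilde\omega^{\bar\nu}. && (3.138)
\end{aligned}
$$

Now we should be quite convinced that $A^{\alpha\beta}B_{\alpha\beta} = \mathbf A(\mathbf B)$ is frame-invariant, because both $\mathbf A$ and $\mathbf B$ are frame-invariant. And of course we can show this in tedious detail:

$$
\begin{aligned}
\mathbf A(\mathbf B) &= A^{\alpha\beta}\, \vec e_\alpha \otimes \vec e_\beta (B_{\mu\nu}\, \tilde\omega^\mu \otimes \tilde\omega^\nu) && (3.139)\\
&= A^{\alpha\beta} B_{\mu\nu}\; \vec e_\alpha \otimes \vec e_\beta (\tilde\omega^\mu \otimes \tilde\omega^\nu) && (3.140)\\
&= A^{\alpha\beta} B_{\mu\nu}\; \vec e_\alpha(\tilde\omega^\mu)\, \vec e_\beta(\tilde\omega^\nu) && (3.141)\\
&= A^{\bar\alpha\bar\beta}\, \Lambda^{\alpha}{}_{\bar\alpha}\, \Lambda^{\beta}{}_{\bar\beta}\; B_{\bar\mu\bar\nu}\, \Lambda^{\bar\mu}{}_{\mu}\, \Lambda^{\bar\nu}{}_{\nu}\; \Lambda^{\alpha'}{}_{\alpha}\, \Lambda^{\beta'}{}_{\beta}\; \vec e_{\alpha'}(\tilde\omega^{\mu'})\; \Lambda^{\mu}{}_{\mu'}\, \Lambda^{\nu}{}_{\nu'}\; \vec e_{\beta'}(\tilde\omega^{\nu'}), && (3.142)\\
&= A^{\bar\alpha\bar\beta}\, \delta^{\alpha'}{}_{\bar\alpha}\, \delta^{\beta'}{}_{\bar\beta}\; B_{\bar\mu\bar\nu}\, \delta^{\bar\mu}{}_{\mu'}\, \delta^{\bar\nu}{}_{\nu'}\; \vec e_{\alpha'}(\tilde\omega^{\mu'})\, \vec e_{\beta'}(\tilde\omega^{\nu'}), && (3.143)\\
&= A^{\bar\alpha\bar\beta}\, \delta^{\alpha'}{}_{\bar\alpha}\, \delta^{\beta'}{}_{\bar\beta}\; B_{\bar\mu\bar\nu}\, \delta^{\bar\mu}{}_{\mu'}\, \delta^{\bar\nu}{}_{\nu'}\; \delta^{\mu'}{}_{\alpha'}\, \delta^{\nu'}{}_{\beta'}, && (3.144)\\
&= A^{\bar\alpha\bar\beta} B_{\bar\alpha\bar\beta}. && (3.145)
\end{aligned}
$$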
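The same invariance can be seen numerically. The sketch below assumes a boost along x with speed v = 0.6 (in units with c = 1) and two arbitrary component arrays; these values and the function names are illustrative assumptions of mine, not taken from Schutz.

```python
import math

def boost_x(v):
    """Lorentz boost along x with speed v (c = 1)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, -v * g, 0.0, 0.0], [-v * g, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

def transform_up(a, lam):
    """New components of a (2,0) tensor: Lam^i_a Lam^j_b A^{ab}."""
    n = range(4)
    return [[sum(lam[i][p] * lam[j][q] * a[p][q] for p in n for q in n)
             for j in n] for i in n]

def transform_down(b, lam_inv):
    """New components of a (0,2) tensor: LamInv^a_i LamInv^b_j B_{ab}."""
    n = range(4)
    return [[sum(lam_inv[p][i] * lam_inv[q][j] * b[p][q] for p in n for q in n)
             for j in n] for i in n]

def contract(a, b):
    """Full contraction A^{ab} B_{ab}."""
    return sum(a[i][j] * b[i][j] for i in range(4) for j in range(4))

# Arbitrary illustrative components in the original frame.
A = [[1.0, 2.0, 0.0, -1.0], [0.5, 3.0, 1.0, 0.0],
     [-2.0, 1.0, 4.0, 2.0], [0.0, -1.0, 0.5, 1.0]]
B = [[2.0, 0.0, 1.0, 1.0], [-1.0, 1.0, 0.0, 3.0],
     [0.5, 2.0, -1.0, 0.0], [1.0, 0.0, 2.0, 1.0]]

v = 0.6
lam, lam_inv = boost_x(v), boost_x(-v)   # the inverse boost has velocity -v
scalar = contract(A, B)
scalar_boosted = contract(transform_up(A, lam), transform_down(B, lam_inv))
print(scalar, scalar_boosted)
```

The individual components change under the boost, but the two printed numbers should agree to rounding error.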

26 $\mathbf A$ is an antisymmetric $\binom{2}{0}$ tensor, $\mathbf B$ is a symmetric $\binom{0}{2}$ tensor, $\mathbf C$ is an arbitrary $\binom{0}{2}$ tensor, $\mathbf D$ is an arbitrary $\binom{2}{0}$ tensor. Prove:


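(a) $A^{\alpha\beta}B_{\alpha\beta} = 0$. I don’t think Schutz discussed the symmetry of $\binom{2}{0}$ tensors explicitly in this chapter. But in Section 3.6 we are introduced to $\binom{M}{0}$ tensors and assured that “All our previous discussions of $\binom{0}{2}$ tensors apply here.” So I think it’s safe to use the obvious analogues of the symmetry ideas of Section 3.4. [Yes, obviously it is.]

Because $\mathbf A$ is an antisymmetric $\binom{2}{0}$ tensor, we can write it as the antisymmetric part of some more general tensor, say $\mathbf A'$:

$$
A'^{[\alpha\beta]} = \frac12\left(A'^{\alpha\beta} - A'^{\beta\alpha}\right) = A^{\alpha\beta}, \quad \text{by Eq. (3.34)}, \qquad (3.146)
$$

and because $\mathbf B$ is a symmetric $\binom{0}{2}$ tensor, we can write it as follows

$$
B_{\alpha\beta} = B'_{(\alpha\beta)} = \frac12\left(B'_{\alpha\beta} + B'_{\beta\alpha}\right), \quad \text{by Eq. (3.31)}. \qquad (3.147)
$$

Now we simply apply $\mathbf A$ to $\mathbf B$:

$$
\begin{aligned}
A^{\alpha\beta}B_{\alpha\beta} &= A'^{[\alpha\beta]} B'_{(\alpha\beta)}\\
&= \frac12\left(A'^{\alpha\beta} - A'^{\beta\alpha}\right) \frac12\left(B'_{\alpha\beta} + B'_{\beta\alpha}\right),\\
&= \frac14\left(A'^{\alpha\beta}B'_{\alpha\beta} + A'^{\alpha\beta}B'_{\beta\alpha} - A'^{\beta\alpha}B'_{\alpha\beta} - A'^{\beta\alpha}B'_{\beta\alpha}\right). && (3.148)
\end{aligned}
$$

But we can call the indices α and β whatever we like, and so

$$
\begin{aligned}
A'^{\beta\alpha}B'_{\beta\alpha} &= A'^{\alpha\beta}B'_{\alpha\beta}\\
A'^{\beta\alpha}B'_{\alpha\beta} &= A'^{\alpha\beta}B'_{\beta\alpha} && (3.149)
\end{aligned}
$$

just by relabelling. The first and fourth terms of (3.148) therefore cancel, as do the second and third, and from this we see that $A^{\alpha\beta}B_{\alpha\beta} = 0$.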

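(b) Prove that: $A^{\alpha\beta}C_{\alpha\beta} = A^{\alpha\beta}C_{[\alpha\beta]}$.

$$
\begin{aligned}
A^{\alpha\beta}C_{\alpha\beta} &= A^{\alpha\beta}\left(C_{(\alpha\beta)} + C_{[\alpha\beta]}\right), && \text{by Eq. (3.35)},\\
&= A^{\alpha\beta}C_{[\alpha\beta]}, && \text{by part (a)!} && (3.150)
\end{aligned}
$$

(c) Prove that: $B_{\alpha\beta}D^{\alpha\beta} = B_{\alpha\beta}D^{(\alpha\beta)}$.

$$
\begin{aligned}
B_{\alpha\beta}D^{\alpha\beta} &= B_{\alpha\beta}\left(D^{(\alpha\beta)} + D^{[\alpha\beta]}\right), && \text{by Eq. (3.35)}\\
&= B_{\alpha\beta}D^{(\alpha\beta)}, && \text{by part (a)!} && (3.151)
\end{aligned}
$$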
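All three parts of problem 26 can be spot-checked with random components. This is only a numerical sanity check under my own assumptions (a fixed random seed and hypothetical helper names), not a proof.

```python
import random

rng = random.Random(0)   # fixed seed so the check is reproducible

def random_matrix():
    """Arbitrary 4x4 array of components."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(4)]

def symmetric_part(m):
    return [[0.5 * (m[i][j] + m[j][i]) for j in range(4)] for i in range(4)]

def antisymmetric_part(m):
    return [[0.5 * (m[i][j] - m[j][i]) for j in range(4)] for i in range(4)]

def contract(a, b):
    """Full contraction over both index pairs."""
    return sum(a[i][j] * b[i][j] for i in range(4) for j in range(4))

A = antisymmetric_part(random_matrix())   # antisymmetric A^{ab}
B = symmetric_part(random_matrix())       # symmetric B_{ab}
C = random_matrix()                       # arbitrary C_{ab}
D = random_matrix()                       # arbitrary D^{ab}

part_a = contract(A, B)                                        # expect 0
part_b = contract(A, C) - contract(A, antisymmetric_part(C))   # expect 0
part_c = contract(B, D) - contract(B, symmetric_part(D))       # expect 0
print(part_a, part_b, part_c)
```

All three printed numbers should vanish to within floating-point rounding.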


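27 (a) $\mathbf A$ is an antisymmetric $\binom{2}{0}$ tensor. Show that $A_{\alpha\beta}$ is an antisymmetric $\binom{0}{2}$ tensor.

$$
A_{\alpha\beta} = \eta_{\alpha\mu}\eta_{\beta\nu}A^{\mu\nu}, \quad \text{by Eq. (3.56)}. \qquad (3.152)
$$

So in matrix notation:

$$
\begin{aligned}
A_{\alpha\beta} &=
\begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 0 & a_{12} & a_{13} & a_{14} \\ -a_{12} & 0 & a_{23} & a_{24} \\ -a_{13} & -a_{23} & 0 & a_{34} \\ -a_{14} & -a_{24} & -a_{34} & 0 \end{pmatrix}
\begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}\\
&=
\begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 0 & a_{12} & a_{13} & a_{14} \\ a_{12} & 0 & a_{23} & a_{24} \\ a_{13} & -a_{23} & 0 & a_{34} \\ a_{14} & -a_{24} & -a_{34} & 0 \end{pmatrix}\\
&=
\begin{pmatrix} 0 & -a_{12} & -a_{13} & -a_{14} \\ a_{12} & 0 & a_{23} & a_{24} \\ a_{13} & -a_{23} & 0 & a_{34} \\ a_{14} & -a_{24} & -a_{34} & 0 \end{pmatrix}
\qquad (3.153)
\end{aligned}
$$

which is again antisymmetric.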

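27 (b) Suppose $V^\alpha = W^\alpha$. Prove that $V_\alpha = W_\alpha$.

$$
\begin{aligned}
V_\alpha &= \eta_{\alpha\beta}V^\beta, && \text{by Eq. (3.39)}\\
W_\alpha &= \eta_{\alpha\beta}W^\beta. && && (3.154)
\end{aligned}
$$

So

$$
\begin{aligned}
V_\alpha - W_\alpha &= \eta_{\alpha\beta}V^\beta - \eta_{\alpha\gamma}W^\gamma\\
&= \eta_{\alpha\beta}\left(V^\beta - W^\beta\right)\\
&= 0. && (3.155)
\end{aligned}
$$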

28 Deduce Eq. (3.66) from Eq. (3.65).

I would start from comparing Eq. (3.65) with Eq. (3.14). For scalar function φ, we had

$$
\begin{aligned}
\frac{d\phi}{d\tau} &= \frac{\partial\phi}{\partial t}U^t + \frac{\partial\phi}{\partial x}U^x + \frac{\partial\phi}{\partial y}U^y + \frac{\partial\phi}{\partial z}U^z && \text{from Eq. (3.14)}\\
&= \tilde d\phi(\vec U), && \text{just different notation.} && (3.156)
\end{aligned}
$$

So I believe we want to write the derivative of the $\binom{1}{1}$ tensor in a similar way, i.e. as something that takes $\vec U$ as an argument and gives the rate of change of that $\binom{1}{1}$ tensor following the world line of $\vec U$:

$$
\begin{aligned}
\frac{d\mathbf T}{d\tau} &= \nabla\mathbf T(\vec U), && \text{analogous to (3.156) above,}\\
&= T^{\alpha}{}_{\beta,\gamma}\; \vec e_\alpha \otimes \tilde\omega^\beta \otimes \tilde\omega^\gamma (U^\mu \vec e_\mu) && \text{Eq. (3.66)}\\
&= T^{\alpha}{}_{\beta,\gamma}\; \vec e_\alpha \otimes \tilde\omega^\beta\; U^\mu\, \tilde\omega^\gamma(\vec e_\mu) && \text{rearranging}\\
&= T^{\alpha}{}_{\beta,\gamma}\; \vec e_\alpha \otimes \tilde\omega^\beta\; U^\mu\, \delta^{\gamma}{}_{\mu} && \text{Eq. (3.12)}\\
&= T^{\alpha}{}_{\beta,\gamma}\; \vec e_\alpha \otimes \tilde\omega^\beta\; U^\gamma. && && (3.157)
\end{aligned}
$$

Thus we were led back to Eq. (3.65) from Eq. (3.66) using the analogy with Eq. (3.14).

[In retrospect, the above solution doesn’t seem to answer the question posed. But I think the question is not well-posed. The solution given by Schutz merely refers to the arbitrariness of $\vec U$. But there is more going on here in making the step from Eq. (3.65) to Eq. (3.66), and what’s involved is the analogy with the gradient of a scalar, as discussed above. In short, I’m not happy with either my solution or Schutz’s, and I blame the question! I discuss this more in my notes on this section above.]

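29 Prove that tensor differentiation obeys the Leibniz (product) rule.

$$
\nabla(\mathbf A \otimes \mathbf B) = (\nabla\mathbf A) \otimes \mathbf B + \mathbf A \otimes (\nabla\mathbf B).
$$

We’re not told what $\mathbf A$ and $\mathbf B$ are, which probably means that it doesn’t matter. For convenience, I’ll assume that they are $\binom{1}{1}$ tensors:

$$
\begin{aligned}
\mathbf A &= A^{\alpha}{}_{\beta}\; \tilde\omega^\beta \otimes \vec e_\alpha, && \text{Eq. (3.61)}\\
\mathbf B &= B^{\alpha}{}_{\beta}\; \tilde\omega^\beta \otimes \vec e_\alpha. && && (3.158)
\end{aligned}
$$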


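We then have

$$
\begin{aligned}
\mathbf A \otimes \mathbf B &= A^{\alpha}{}_{\beta}\; \tilde\omega^\beta \otimes \vec e_\alpha \otimes (B^{\mu}{}_{\nu}\; \tilde\omega^\nu \otimes \vec e_\mu)\\
&= A^{\alpha}{}_{\beta}\, B^{\mu}{}_{\nu}\; \tilde\omega^\beta \otimes \vec e_\alpha \otimes \tilde\omega^\nu \otimes \vec e_\mu. && (3.159)
\end{aligned}
$$

My strategy is to just redo what Schutz did in Section 3.8. Let’s differentiate $\mathbf A \otimes \mathbf B$ with respect to proper time, as in Eq. (3.63):

$$
\frac{d(\mathbf A \otimes \mathbf B)}{d\tau} = \frac{d(A^{\alpha}{}_{\beta}\, B^{\mu}{}_{\nu})}{d\tau}\; \tilde\omega^\beta \otimes \vec e_\alpha \otimes \tilde\omega^\nu \otimes \vec e_\mu, \qquad (3.160)
$$

where we have assumed that the basis one-forms and basis vectors are uniform in spacetime. And

$$
\frac{d(A^{\alpha}{}_{\beta}\, B^{\mu}{}_{\nu})}{d\tau}
$$

is just the ordinary derivative of this function $(A^{\alpha}{}_{\beta}\, B^{\mu}{}_{\nu})$ along the world line. Because it’s just the ordinary derivative, we can use the ordinary product rule:

$$
\frac{d(A^{\alpha}{}_{\beta}\, B^{\mu}{}_{\nu})}{d\tau} = \left(\frac{dA^{\alpha}{}_{\beta}}{d\tau}\right) B^{\mu}{}_{\nu} + A^{\alpha}{}_{\beta} \left(\frac{dB^{\mu}{}_{\nu}}{d\tau}\right). \qquad (3.161)
$$

Now just sub this into (3.160):

$$
\begin{aligned}
\frac{d(\mathbf A \otimes \mathbf B)}{d\tau} &= \left(\left(\frac{dA^{\alpha}{}_{\beta}}{d\tau}\right) B^{\mu}{}_{\nu} + A^{\alpha}{}_{\beta} \left(\frac{dB^{\mu}{}_{\nu}}{d\tau}\right)\right) \tilde\omega^\beta \otimes \vec e_\alpha \otimes \tilde\omega^\nu \otimes \vec e_\mu,\\
&= \nabla\mathbf A(\vec U) \otimes \mathbf B + \mathbf A \otimes \nabla\mathbf B(\vec U)\\
&= \nabla(\mathbf A \otimes \mathbf B)(\vec U). && (3.162)
\end{aligned}
$$

If we wrote out $\vec U = U^\gamma \vec e_\gamma$ everywhere, we’d find that it cancels on the two sides of the equal sign. So it’s clear that:

$$
\nabla(\mathbf A \otimes \mathbf B) = (\nabla\mathbf A) \otimes \mathbf B + \mathbf A \otimes (\nabla\mathbf B). \qquad (3.163)
$$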

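30 Given the following fields:

$$
\begin{aligned}
\vec U &\underset{\mathcal O}{\to} (1 + t^2,\; t^2,\; \sqrt2\, t,\; 0)\\
\vec D &\underset{\mathcal O}{\to} (x,\; 5xt,\; \sqrt2\, t,\; 0)\\
\rho &= x^2 + t^2 - y^2 && (3.164)
\end{aligned}
$$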


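(a) By Eq. (2.28) we require a four-velocity to have $\vec U \cdot \vec U = -1$. And $\vec U$ does at all points.

$$
\begin{aligned}
\vec U \cdot \vec U &= \eta_{\alpha\beta}U^\alpha U^\beta, && \text{see Eq. (3.1)}\\
&= -(1 + t^2)^2 + (t^2)^2 + (\sqrt2\, t)^2 + 0\\
&= -1. && && (3.165)
\end{aligned}
$$

On the other hand, $\vec D$ cannot be a four-velocity because it doesn’t meet this property everywhere:

$$
\begin{aligned}
\vec D \cdot \vec D &= \eta_{\alpha\beta}D^\alpha D^\beta, && \text{see Eq. (3.1)}\\
&= -x^2 + (5tx)^2 + (\sqrt2\, t)^2 + 0\\
&\neq -1 \text{ everywhere.} && && (3.166)
\end{aligned}
$$

And for some reason we’re asked to also calculate:

$$
\begin{aligned}
\vec U \cdot \vec D &= \eta_{\alpha\beta}U^\alpha D^\beta, && \text{see Eq. (3.1)}\\
&= -(1 + t^2)x + (t^2)(5tx) + (\sqrt2\, t)^2 + 0\\
&= -x - xt^2 + 5t^3x + 2t^2. && && (3.167)
\end{aligned}
$$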

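(b) Find the spatial velocity $\mathbf v$ given the four-velocity $\vec U$. The relation between the components of a four-velocity and the spatial velocity was explained on p. 42: $\vec U \to (\gamma, \gamma\mathbf v)$, so $v^i = U^i/U^0$. Here,

$$
\mathbf v \underset{\mathcal O}{\to} \left(\frac{t^2}{1 + t^2},\; \frac{\sqrt2\, t}{1 + t^2},\; 0\right).
$$

At t = 0, $\mathbf v = 0$. As t → ∞, $\mathbf v \to (1, 0, 0)$: the speed approaches the speed of light but never reaches it, since $|\mathbf v|^2 = 1 - 1/(1 + t^2)^2 < 1$.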

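(c) Find $U_\alpha$ for all α.

$$
U_\alpha = \eta_{\alpha\beta}U^\beta, \quad \text{see section 3.7.} \qquad (3.168)
$$

$$
\begin{aligned}
U_0 &= -U^0 = -(1 + t^2)\\
U_1 &= U^1 = t^2\\
U_2 &= U^2 = \sqrt2\, t\\
U_3 &= U^3 = 0.
\end{aligned}
$$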


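(d) Find $U^{\alpha}{}_{,\beta}$ for all α, β.

$$
U^{\alpha}{}_{,\beta} \equiv \frac{\partial U^\alpha}{\partial x^\beta}, \quad \text{see Eq. (3.19)} \qquad (3.169)
$$

For α = 0:

$$
\begin{aligned}
U^{0}{}_{,0} &= \frac{\partial U^0}{\partial x^0} = \frac{\partial U^0}{\partial t} = \frac{\partial(1 + t^2)}{\partial t} = 2t,\\
U^{0}{}_{,1} &= \frac{\partial U^0}{\partial x^1} = \frac{\partial U^0}{\partial x} = \frac{\partial(1 + t^2)}{\partial x} = 0,\\
U^{0}{}_{,2} &= \frac{\partial U^0}{\partial x^2} = \frac{\partial U^0}{\partial y} = \frac{\partial(1 + t^2)}{\partial y} = 0,\\
U^{0}{}_{,3} &= \frac{\partial U^0}{\partial x^3} = \frac{\partial U^0}{\partial z} = \frac{\partial(1 + t^2)}{\partial z} = 0.
\end{aligned}
$$

For α = 1:

$$
\begin{aligned}
U^{1}{}_{,0} &= \frac{\partial U^1}{\partial x^0} = \frac{\partial t^2}{\partial t} = 2t,\\
U^{1}{}_{,1} &= \frac{\partial U^1}{\partial x^1} = \frac{\partial t^2}{\partial x} = 0,\\
U^{1}{}_{,2} &= \frac{\partial U^1}{\partial x^2} = \frac{\partial t^2}{\partial y} = 0,\\
U^{1}{}_{,3} &= \frac{\partial U^1}{\partial x^3} = \frac{\partial t^2}{\partial z} = 0.
\end{aligned}
$$

For α = 2:

$$
\begin{aligned}
U^{2}{}_{,0} &= \frac{\partial U^2}{\partial x^0} = \frac{\partial \sqrt2\, t}{\partial t} = \sqrt2,\\
U^{2}{}_{,1} &= \frac{\partial U^2}{\partial x^1} = \frac{\partial \sqrt2\, t}{\partial x} = 0,\\
U^{2}{}_{,2} &= \frac{\partial U^2}{\partial x^2} = \frac{\partial \sqrt2\, t}{\partial y} = 0,\\
U^{2}{}_{,3} &= \frac{\partial U^2}{\partial x^3} = \frac{\partial \sqrt2\, t}{\partial z} = 0.
\end{aligned}
$$

For α = 3:

$$
U^{3}{}_{,\beta} = \frac{\partial U^3}{\partial x^\beta} = \frac{\partial 0}{\partial x^\beta} = 0. \qquad (3.170)
$$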


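(e) Show that $U_\alpha U^{\alpha}{}_{,\beta} = 0$ for all β.

For β = 0:

$$
\begin{aligned}
U_\alpha U^{\alpha}{}_{,t} &= -(1 + t^2)\,2t + t^2\, 2t + \sqrt2\, t\, \sqrt2 + 0\\
&= 0. && (3.171)
\end{aligned}
$$

For β > 0, $U^{\alpha}{}_{,\beta} = 0$. Thus, $U_\alpha U^{\alpha}{}_{,\beta} = 0$.

Show that $U^\alpha U_{\alpha,\beta} = 0$ for all β.

$$
\begin{aligned}
U^\alpha U_{\alpha,\beta} &= U^\alpha (\eta_{\alpha\gamma}U^\gamma)_{,\beta}\\
&= \eta_{\gamma\alpha}U^\alpha (U^{\gamma}{}_{,\beta})\\
&= U_\gamma (U^{\gamma}{}_{,\beta})\\
&= 0, \quad \text{see above!} && (3.172)
\end{aligned}
$$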

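(f) Find $D^{\beta}{}_{,\beta}$

$$
\begin{aligned}
D^{\beta}{}_{,\beta} &= \frac{\partial D^0}{\partial t} + \frac{\partial D^1}{\partial x} + \frac{\partial D^2}{\partial y} + \frac{\partial D^3}{\partial z}\\
&= \frac{\partial x}{\partial t} + \frac{\partial\, 5tx}{\partial x} + \frac{\partial \sqrt2\, t}{\partial y} + \frac{\partial 0}{\partial z}\\
&= 5t. && (3.173)
\end{aligned}
$$

(g) Find $(U^\alpha D^\beta)_{,\beta}$ for all α.

$$
\begin{aligned}
(U^\alpha D^\beta)_{,\beta} &= U^\alpha (D^{\beta}{}_{,\beta}) + U^{\alpha}{}_{,\beta} D^\beta\\
&= \left((1 + t^2),\; t^2,\; \sqrt2\, t,\; 0\right)(5t) + D^0 \left(2t,\; 2t,\; \sqrt2,\; 0\right)\\
&= \left((1 + t^2),\; t^2,\; \sqrt2\, t,\; 0\right)(5t) + x \left(2t,\; 2t,\; \sqrt2,\; 0\right)\\
&= \left(5t(1 + t^2) + 2xt,\; 5t^3 + 2xt,\; 5\sqrt2\, t^2 + \sqrt2\, x,\; 0\right) && (3.174)
\end{aligned}
$$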

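(h) Find $U_\alpha (U^\alpha D^\beta)_{,\beta}$.

$$
\begin{aligned}
U_\alpha (U^\alpha D^\beta)_{,\beta} &= U_\alpha \left[U^{\alpha}{}_{,\beta} D^\beta + U^\alpha D^{\beta}{}_{,\beta}\right]\\
&= U_\alpha U^{\alpha}{}_{,\beta} D^\beta + U_\alpha U^\alpha D^{\beta}{}_{,\beta}\\
&= U_\alpha U^\alpha D^{\beta}{}_{,\beta} && \text{using result from (e)}\\
&= -D^{\beta}{}_{,\beta} && \text{using result from (a) and Eq. (2.28)}\\
&= -5t && \text{using result from (f)} && (3.175)
\end{aligned}
$$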

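(i) Find $\rho_{,\alpha}$ for all α

$$
\begin{aligned}
\rho_{,\alpha} &= \left(\frac{\partial\rho}{\partial t},\; \frac{\partial\rho}{\partial x},\; \frac{\partial\rho}{\partial y},\; \frac{\partial\rho}{\partial z}\right)\\
&= (2t,\; 2x,\; -2y,\; 0) && (3.176)
\end{aligned}
$$

Find $\rho^{,\alpha}$ for all α

$$
\begin{aligned}
\rho^{,\alpha} &= \eta^{\alpha\beta}\rho_{,\beta}\\
&= (-2t,\; 2x,\; -2y,\; 0) && (3.177)
\end{aligned}
$$

Interpretation:

$$
\tilde d\rho \underset{\mathcal O}{\to} \rho_{,\alpha} \qquad (3.178)
$$

That is, the $\rho_{,\alpha}$ are the components of the one-form that is the gradient of the scalar field ρ. The $\rho^{,\alpha}$ are the components of the associated vector.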

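(j) Find $\nabla_{\vec U}\rho$

$\nabla_{\vec U}\rho$ is a scalar, the sum over α of $U^\alpha\rho_{,\alpha}$:

$$
\begin{aligned}
\nabla_{\vec U}\rho &= U^\alpha \rho_{,\alpha} && \text{from Eq. (3.68)}\\
&= 2t(1 + t^2) + 2xt^2 - 2\sqrt2\, ty + 0 && && (3.179)
\end{aligned}
$$

Find $\nabla_{\vec U}\vec D$

$$
\begin{aligned}
\nabla_{\vec U}\vec D &\underset{\mathcal O}{\to} U^\alpha D^{\beta}{}_{,\alpha}\\
&= \left(U^\alpha D^{0}{}_{,\alpha},\; U^\alpha D^{1}{}_{,\alpha},\; U^\alpha D^{2}{}_{,\alpha},\; U^\alpha D^{3}{}_{,\alpha}\right)\\
&= \left(t^2,\; 5x(1 + t^2) + 5t(t^2),\; \sqrt2(1 + t^2),\; 0\right) && (3.180)
\end{aligned}
$$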

31 There’s a mistake in the initial definition of $\mathbf P$: We’re told that $\vec U$ is a unit timelike vector. This implies its scalar product with itself is both negative, i.e.

$$
U^\alpha U_\alpha < 0,
$$

and of magnitude unity like a 4-velocity. So what we require for this problem is that:

$$
U^\alpha U_\alpha = -1.
$$

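(i) Given the definition of $\mathbf P$ and $\vec V_\perp$ it is then simply a matter of taking the scalar product with $\vec U$ to show that it’s orthogonal.

$$
\begin{aligned}
\eta_{\alpha\mu} V_\perp^\alpha U^\mu &= \eta_{\alpha\mu} U^\mu (\eta^{\alpha}{}_{\beta} + U^\alpha U_\beta) V^\beta\\
&= U_\alpha (\eta^{\alpha}{}_{\beta} + U^\alpha U_\beta) V^\beta, && \text{using Eq. (3.39)}\\
&= U_\alpha (\delta^{\alpha}{}_{\beta} + U^\alpha U_\beta) V^\beta, && \text{using Eq. (3.60)}\\
&= (U_\beta + U_\alpha U^\alpha U_\beta) V^\beta,\\
&= (U_\beta - U_\beta) V^\beta,\\
&= 0 && && (3.181)
\end{aligned}
$$

(ii) Similarly for showing that $\vec V_\perp$ is unaffected by $\mathbf P$.

$$
\begin{aligned}
P^{\mu}{}_{\alpha} V_\perp^\alpha &= (\eta^{\mu}{}_{\alpha} + U^\mu U_\alpha) V_\perp^\alpha\\
&= (\eta^{\mu}{}_{\alpha} + U^\mu U_\alpha)(\eta^{\alpha}{}_{\beta} + U^\alpha U_\beta) V^\beta\\
&= (\eta^{\mu}{}_{\alpha}\eta^{\alpha}{}_{\beta} + U^\mu U_\alpha \eta^{\alpha}{}_{\beta} + \eta^{\mu}{}_{\alpha} U^\alpha U_\beta + U^\mu U_\alpha U^\alpha U_\beta) V^\beta\\
&= (\eta^{\mu}{}_{\beta} + U^\mu U_\beta + U^\mu U_\beta + (U_\alpha U^\alpha) U^\mu U_\beta) V^\beta\\
&= (\eta^{\mu}{}_{\beta} + U^\mu U_\beta + U^\mu U_\beta + (-1) U^\mu U_\beta) V^\beta\\
&= (\eta^{\mu}{}_{\beta} + U^\mu U_\beta) V^\beta\\
&= V_\perp^\mu && (3.182)
\end{aligned}
$$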
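Both properties of the projection can be checked numerically for a concrete four-velocity. This is a sketch only: the three-velocity (0.3, −0.4, 0.5), the test vector V and the function names are arbitrary assumptions of mine.

```python
import math

ETA = [-1.0, 1.0, 1.0, 1.0]   # diagonal of the Minkowski metric

def four_velocity(vx, vy, vz):
    """Four-velocity (gamma, gamma v) for a three-velocity with |v| < 1."""
    g = 1.0 / math.sqrt(1.0 - (vx * vx + vy * vy + vz * vz))
    return [g, g * vx, g * vy, g * vz]

def lower(vec):
    """Lower the index of a vector with the Minkowski metric."""
    return [e * c for e, c in zip(ETA, vec)]

def dot(a, b):
    return sum(e * p * q for e, p, q in zip(ETA, a, b))

def projector(u):
    """Mixed components P^mu_alpha = delta^mu_alpha + U^mu U_alpha."""
    u_low = lower(u)
    return [[(1.0 if m == a else 0.0) + u[m] * u_low[a] for a in range(4)]
            for m in range(4)]

def apply(p, v):
    """Contract P^mu_alpha with V^alpha."""
    return [sum(p[m][a] * v[a] for a in range(4)) for m in range(4)]

u = four_velocity(0.3, -0.4, 0.5)
P = projector(u)
V = [1.0, 2.0, -3.0, 0.5]

V_perp = apply(P, V)
orthogonality = dot(V_perp, u)      # part (i): should vanish
V_perp_again = apply(P, V_perp)     # part (ii): should equal V_perp
print(orthogonality, V_perp, V_perp_again)
```

Applying the projector a second time changes nothing, which is the defining property of a projection operator.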


(b) Show that the tensor

$$
\eta_{\mu\nu} - \frac{q_\mu q_\nu}{q^\alpha q_\alpha},
$$

with the only restriction that $\vec q$ is not a null vector, projects orthogonally.

Based upon (a) we can guess that “projects orthogonally” means that this tensor converts vectors into one-forms that are orthogonal to $\vec q$. (Note that the given tensor produces a one-form from a vector input because it’s a $\binom{0}{2}$ tensor.) Much like in (a) we simply apply the given tensor to an arbitrary vector, say $\vec s$. Here this produces a one-form. And then we apply this one-form to $\vec q$ to show that it’s zero, indicating a one-form orthogonal to $\vec q$.

$$
\begin{aligned}
\left(\eta_{\mu\nu} - \frac{q_\mu q_\nu}{q^\alpha q_\alpha}\right) s^\nu q^\mu &= \left(\eta_{\mu\nu} q^\mu - \frac{q_\mu q_\nu q^\mu}{q^\alpha q_\alpha}\right) s^\nu\\
&= \left(\eta_{\nu\mu} q^\mu - \frac{q_\mu q^\mu\, q_\nu}{q^\alpha q_\alpha}\right) s^\nu && \text{metric tensor is its own transpose,}\\
&= (q_\nu - q_\nu)\, s^\nu,\\
&= 0. && && (3.183)
\end{aligned}
$$

This fails when $q^\mu q_\mu = 0$ because then of course we have zero in the denominator of the definition. Now the relation to (a) is clear. In (a) we didn’t need the $q_\mu q^\mu$ in the denominator of the 2nd term, nor the negative sign, because it was negative unity for vector $\vec U$.

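(c) Show that

$$
\mathbf P(\vec V_\perp, \vec W_\perp) = \mathbf g(\vec V_\perp, \vec W_\perp).
$$

I think it is as simple as this:

$$
\begin{aligned}
\mathbf P(\vec V_\perp, \vec W_\perp) &= P_{\alpha\beta}\, V_\perp^\alpha\, W_\perp^\beta,\\
&= (\eta_{\alpha\beta} + U_\alpha U_\beta)\, W_\perp^\beta\, V_\perp^\alpha, && \text{substituting definition given}\\
&= \eta_{\alpha\beta}\, W_\perp^\beta\, V_\perp^\alpha, && \text{using (a) (i) above,}\\
&= \mathbf g(\vec V_\perp, \vec W_\perp). && && (3.184)
\end{aligned}
$$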

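32 (a) Prove the given transformation law for $\binom{0}{2}$ tensors.

$$
\begin{aligned}
f_{\bar\alpha\bar\beta} &= \mathbf f(\vec e_{\bar\alpha}, \vec e_{\bar\beta})\\
&= f_{\alpha\beta}\; \tilde\omega^\alpha \otimes \tilde\omega^\beta(\vec e_{\bar\alpha}, \vec e_{\bar\beta}), && \text{using frame-independent definition of } \mathbf f, \text{ Eq. (3.26)}\\
&= f_{\alpha\beta}\; \tilde\omega^\alpha \otimes \tilde\omega^\beta(\Lambda^{\mu}{}_{\bar\alpha}\, \vec e_\mu,\; \Lambda^{\nu}{}_{\bar\beta}\, \vec e_\nu), && \text{using Eq. (2.15)}\\
&= f_{\alpha\beta}\; \Lambda^{\mu}{}_{\bar\alpha}\, \Lambda^{\nu}{}_{\bar\beta}\; \delta^{\alpha}{}_{\mu}\, \delta^{\beta}{}_{\nu}, && \text{using Eq. (3.24, 3.25)}\\
&= f_{\alpha\beta}\; \Lambda^{\alpha}{}_{\bar\alpha}\, \Lambda^{\beta}{}_{\bar\beta}. && && (3.185)
\end{aligned}
$$

Let $A = (a_{\alpha\beta})$ be the matrix with components $a_{\alpha\beta}$, and similarly for B. It’s easy to see that regular matrix multiplication reads

$$
(AB)_{\alpha\beta} = a_{\alpha\gamma}b_{\gamma\beta}, \quad \text{i.e. } AB = (a)(b).
$$

So we can write

$$
\begin{aligned}
f_{\bar\alpha\bar\beta} &= \Lambda^{\alpha}{}_{\bar\alpha}\; f_{\alpha\beta}\; \Lambda^{\beta}{}_{\bar\beta}, && \text{just rearranging}\\
(\bar f) &= (\Lambda)^T (f) (\Lambda). && && (3.186)
\end{aligned}
$$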

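32 (b) Prove that the familiar Lorentz transformation associated with a “velocity boost” obeys the generalization suggested.

The familiar Lorentz transformation was given on p. 22 by Eq. (1.12), and deemed a “velocity boost” v in the x-direction.

$$
\begin{aligned}
\bar t &= \gamma t - v\gamma x\\
\bar x &= \gamma x - v\gamma t\\
\bar y &= y\\
\bar z &= z
\end{aligned}
$$

where $\gamma = 1/\sqrt{1 - v^2}$. Note this transformation can be written in matrix form

$$
\begin{pmatrix} \bar t \\ \bar x \\ \bar y \\ \bar z \end{pmatrix}
=
\begin{pmatrix} \gamma & -v\gamma & 0 & 0 \\ -v\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} t \\ x \\ y \\ z \end{pmatrix}
$$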


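Testing whether this transformation matrix meets Eq. (3.71) is just a matter of doing the matrix multiplication. Writing Eq. (3.71) in matrix form:

$$
(\eta) = (\Lambda)^T (\eta) (\Lambda), \quad \text{see problem 32 (a)} \qquad (3.187)
$$

Substituting our familiar Lorentz transformation we get

$$
\begin{aligned}
\text{RHS} &= (\Lambda)^T (\eta) (\Lambda),\\
&=
\begin{pmatrix} \gamma & -v\gamma & 0 & 0 \\ -v\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} \gamma & -v\gamma & 0 & 0 \\ -v\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}\\
&=
\begin{pmatrix} \gamma & -v\gamma & 0 & 0 \\ -v\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} -\gamma & v\gamma & 0 & 0 \\ -v\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}\\
&=
\begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}\\
&= \text{LHS}. && (3.188)
\end{aligned}
$$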

32 (c) Suppose (L) and (Λ) are two matrices that satisfy Eq. (3.71). Prove that (Λ)(L) also obeys Eq. (3.71).

This is, of course, what we would hope since each Lorentz transformation corresponds to changing reference frames, and (Λ)(L) = (N) corresponds to changing twice.

$$
\begin{aligned}
\text{RHS} &= (N)^T (\eta) (N)\\
&= ((\Lambda)(L))^T (\eta) ((\Lambda)(L))\\
&= (L)^T (\Lambda)^T (\eta) (\Lambda) (L)\\
&= (L)^T \left((\Lambda)^T (\eta) (\Lambda)\right) (L)\\
&= (L)^T (\eta) (L), && \text{using Eq. (3.71)}\\
&= (\eta), && \text{using Eq. (3.71)}\\
&= \text{LHS}. && && (3.189)
\end{aligned}
$$
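Parts (b) and (c) of problem 32 lend themselves to a quick numerical check. The sketch below assumes two illustrative boosts, one along x with v = 0.6 and one along y with v = −0.8; these values and the function names are my own choices.

```python
import math

ETA = [[-1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def boost_x(v):
    """Velocity boost along x (c = 1), as in Eq. (1.12)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, -v * g, 0.0, 0.0], [-v * g, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

def boost_y(v):
    """Velocity boost along y (c = 1)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, 0.0, -v * g, 0.0], [0.0, 1.0, 0.0, 0.0],
            [-v * g, 0.0, g, 0.0], [0.0, 0.0, 0.0, 1.0]]

def satisfies_lorentz_condition(lam, tol=1e-9):
    """Check Lambda^T eta Lambda = eta, the matrix form of Eq. (3.71)."""
    lhs = matmul(matmul(transpose(lam), ETA), lam)
    return all(abs(lhs[i][j] - ETA[i][j]) < tol
               for i in range(4) for j in range(4))

L1 = boost_x(0.6)
L2 = boost_y(-0.8)
composite = matmul(L1, L2)   # two changes of frame in succession

print(satisfies_lorentz_condition(L1),
      satisfies_lorentz_condition(L2),
      satisfies_lorentz_condition(composite))
```

The product of two boosts along different axes is not itself a pure boost, yet it still satisfies the defining condition, which is the content of part (c).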


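33 (a) Find the matrix for the identity element of the Lorentz group.

Apparently “identity element” is a very general concept, beyond just group theory: http://en.wikipedia.org/wiki/Identity_element. For the Lorentz group, we seek I such that

$$
I\, L = L
$$

for all elements L. Clearly the 4×4 identity matrix I meets this requirement. Note that Λ(v = 0) = I.

The implicit matrix in Eq. (1.12) was

$$
\begin{pmatrix} \gamma & -v\gamma & 0 & 0 \\ -v\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
$$

where $\gamma = 1/\sqrt{1 - v^2}$. Its inverse is

$$
\begin{pmatrix} \gamma & v\gamma & 0 & 0 \\ v\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
$$

which is obvious on physical grounds, and can be easily confirmed by multiplication:

$$
\begin{pmatrix} \gamma & v\gamma & 0 & 0 \\ v\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} \gamma & -v\gamma & 0 & 0 \\ -v\gamma & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
=
\begin{pmatrix} \gamma^2(1 - v^2) & 0 & 0 & 0 \\ 0 & \gamma^2(1 - v^2) & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
=
\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
\qquad (3.190)
$$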

33 (b) Prove that the determinant of any matrix representing a Lorentz transformation is ±1.

It’s easy to show that the determinant of the Lorentz transformation associated with the “velocity boost” v in the x-direction is +1. But what about velocity components in different directions? In Chapter 1 we did find the most general Lorentz transformation for arbitrarily oriented velocity. But it would be messy to find the determinant, and also I’m not sure if Schutz wants us to consider this familiar Lorentz transformation, or his generalization defined by Eq. (3.71).

Let’s work with the generalization defined by Eq. (3.71).

$$
\begin{aligned}
|(\eta)| &= |(L)^T (\eta) (L)|\\
&= |(L)^T|\; |(\eta)|\; |(L)|\\
-1 &= -|(L)|^2\\
|(L)| &= \pm 1, && (3.191)
\end{aligned}
$$

where we have used $|(L)^T| = |(L)|$ and the other properties of the determinant of matrices specified at http://en.wikipedia.org/wiki/Determinant#Multiplicativity_and_matrix_groups.

Clearly if |A| = 1, and |B| = 1, then |C| = |AB| = 1, and this forms a subgroup.

But if |A| = −1, and |B| = −1, then |C| = |AB| = 1. Thus C is not a member, and the set of matrices with determinant −1 does not form a subgroup because it fails to meet the closure axiom. The axioms of a group are given here: http://en.wikipedia.org/wiki/Group_(mathematics).

33 (c) Show that the 3D orthogonal matrices form a group.

Matrix multiplication meets the following three axioms of a group: associativity, identity, and invertibility. We need to show that the set of 3D orthogonal matrices is closed. That is, if A and B are elements of the set, is C = AB also an element?

$$
\begin{aligned}
C^T C &= (AB)^T (AB)\\
&= (B^T A^T)(AB)\\
&= B^T (A^T A) B && \text{associativity}\\
&= B^T (I) B && \text{orthogonality of } A\\
&= B^T B\\
&= I && \text{orthogonality of } B
\end{aligned}
$$

which implies the orthogonality of C. Thus the set is closed, and forms a matrix group.

If we remove the first row and column of the L(4) matrices we see that Eq. (3.71) becomes just the condition for orthogonal 3D matrices given in problem 20. I believe this implies that the O(3) matrices are a subgroup of L(4).

[Yes, this essentially agrees with Schutz’s solution.]

34 Introduce variables: u = t − x and v = t + x.

(a) We are given that $\vec e_u$ connects the event u = 1, v = 0, y = 0, z = 0 to the origin (u = 0, v = 0, y = 0, z = 0). Therefore

$$
\begin{aligned}
u &= 1 = t - x\\
v &= 0 = t + x
\end{aligned}
$$

so that t = 1/2 and x = −1/2. And thus $\vec e_u = (\vec e_t - \vec e_x)/2$.

We are given that $\vec e_v$ connects the event u = 0, v = 1, y = 0, z = 0 to the origin (u = 0, v = 0, y = 0, z = 0). Therefore this event is

$$
\begin{aligned}
u &= 0 = t - x\\
v &= 1 = t + x
\end{aligned}
$$

and can be written as t = 1/2 and x = 1/2. And thus $\vec e_v = (\vec e_t + \vec e_x)/2$.

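(b) Show that $\vec e_u$, $\vec e_v$, $\vec e_y$, $\vec e_z$ form a basis.

To be a basis, the vectors must be at least linearly independent: no one of them can be written as a linear combination of the others. Orthogonality would guarantee this, but $\vec e_u$ and $\vec e_v$ are not orthogonal:

$$
\begin{aligned}
\vec e_u \cdot \vec e_v &= (\vec e_t - \vec e_x)/2 \cdot (\vec e_t + \vec e_x)/2\\
&= \tfrac14 (\vec e_t \cdot \vec e_t - \vec e_x \cdot \vec e_x)\\
&= -\tfrac12. && (3.192)
\end{aligned}
$$

Also,

$$
\begin{aligned}
\vec e_u \cdot \vec e_y &= \vec e_u \cdot \vec e_z = 0,\\
\vec e_v \cdot \vec e_y &= \vec e_v \cdot \vec e_z = 0.
\end{aligned}
$$

So we check linear independence directly. Suppose $a\,\vec e_u + b\,\vec e_v + c\,\vec e_y + d\,\vec e_z = 0$. Substituting the results of (a) gives $\tfrac12(a + b)\,\vec e_t + \tfrac12(b - a)\,\vec e_x + c\,\vec e_y + d\,\vec e_z = 0$, and since $\vec e_t$, $\vec e_x$, $\vec e_y$, $\vec e_z$ are themselves a basis this requires a = b = c = d = 0. So the 4 vectors are linearly independent and form a basis, though not an orthogonal one.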

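(c) The metric tensor is given in terms of the basis vectors by Eq. (3.5). The trick to finding the metric tensor is then to do the scalar product in the basis where we know the metric tensor so that we can find the metric tensor in the new basis.

$$
\begin{aligned}
\eta_{uu} &= \vec e_u \cdot \vec e_u\\
&= (\vec e_t - \vec e_x)/2 \cdot (\vec e_t - \vec e_x)/2\\
&= \tfrac14 (\vec e_t \cdot \vec e_t + \vec e_x \cdot \vec e_x)\\
&= \tfrac14 (-1 + 1)\\
&= 0 && (3.193)
\end{aligned}
$$

$$
\begin{aligned}
\eta_{uv} &= \eta_{vu}\\
&= \vec e_u \cdot \vec e_v\\
&= (\vec e_t - \vec e_x)/2 \cdot (\vec e_t + \vec e_x)/2\\
&= \tfrac14 (\vec e_t \cdot \vec e_t - \vec e_x \cdot \vec e_x)\\
&= \tfrac14 (-1 - 1)\\
&= -\tfrac12 && (3.194)
\end{aligned}
$$

$$
\begin{aligned}
\eta_{uy} &= \vec e_u \cdot \vec e_y\\
&= (\vec e_t - \vec e_x)/2 \cdot \vec e_y\\
&= 0 && (3.195)
\end{aligned}
$$

$$
\begin{aligned}
\eta_{uz} &= \vec e_u \cdot \vec e_z\\
&= (\vec e_t - \vec e_x)/2 \cdot \vec e_z\\
&= 0 && (3.196)
\end{aligned}
$$

And similarly $\eta_{vy} = \eta_{vz} = 0$.

$$
\begin{aligned}
\eta_{vv} &= \vec e_v \cdot \vec e_v\\
&= (\vec e_t + \vec e_x)/2 \cdot (\vec e_t + \vec e_x)/2\\
&= \tfrac14 (\vec e_t \cdot \vec e_t + \vec e_x \cdot \vec e_x)\\
&= \tfrac14 (-1 + 1)\\
&= 0 && (3.197)
\end{aligned}
$$

While of course the rest of the tensor is the same as the familiar one. Collecting terms:

$$
\eta = \begin{pmatrix} 0 & -\tfrac12 & 0 & 0 \\ -\tfrac12 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \qquad (3.198)
$$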
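The metric components in the (u, v, y, z) basis can also be generated mechanically from the scalar products of the basis vectors. The sketch below uses exact fractions; the dictionary layout and the names are hypothetical conveniences of mine.

```python
from fractions import Fraction

ETA_TXYZ = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
half = Fraction(1, 2)

# Components of the new basis vectors on the (t, x, y, z) basis, from part (a).
BASIS = {
    "u": [half, -half, 0, 0],
    "v": [half, half, 0, 0],
    "y": [0, 0, 1, 0],
    "z": [0, 0, 0, 1],
}

def dot(a, b):
    """Scalar product using the familiar Minkowski metric."""
    return sum(ETA_TXYZ[i][j] * a[i] * b[j] for i in range(4) for j in range(4))

labels = ["u", "v", "y", "z"]
# Each metric component is the scalar product of two basis vectors.
metric = [[dot(BASIS[r], BASIS[c]) for c in labels] for r in labels]

for row in metric:
    print([str(entry) for entry in row])
```

The vanishing diagonal entries for u and v show directly that both of these basis vectors are null, while the off-diagonal −1/2 shows they are not orthogonal to each other.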

(d) We have already shown that $\vec e_u \cdot \vec e_u = 0$, in (c) above, which means that it is a null vector, c.f. p. 45. (Not sure why he asked us that now, instead of before (c)!?) Similarly $\vec e_v \cdot \vec e_v = 0$, so it is also a null vector.

And we have already shown that $\vec e_u \cdot \vec e_v = -1/2 \neq 0$, so we have already shown that $\vec e_u$ and $\vec e_v$ are not orthogonal.

(e) Compute the four one-forms $\tilde d u$, $\tilde d v$, $\mathbf g(\vec e_u, \;)$, and $\mathbf g(\vec e_v, \;)$ in terms of $\tilde d t$, $\tilde d x$.


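The formula for the gradient is given by Eq. (3.15), but we’re asked here for the answer “in terms of $\tilde d t$, $\tilde d x$”. So I think we’re to use the basis formed from $\vec e_t$, $\vec e_x$ etc., and since the gradient is a one-form the basis elements are $\tilde\omega^t$, $\tilde\omega^x$, etc. Note that Eq. (3.20) lets us write these basis one-forms as $\tilde d t = \tilde\omega^t$, $\tilde d x = \tilde\omega^x$.

The computation is trivial:

$$
\begin{aligned}
\tilde d u &= \frac{\partial u}{\partial x^\alpha}\, \tilde d x^\alpha\\
&= \frac{\partial(t - x)}{\partial t}\, \tilde d t + \frac{\partial(t - x)}{\partial x}\, \tilde d x\\
&= \tilde d t - \tilde d x. && (3.199)
\end{aligned}
$$

And similarly for

$$
\begin{aligned}
\tilde d v &= \frac{\partial v}{\partial x^\alpha}\, \tilde d x^\alpha\\
&= \frac{\partial(t + x)}{\partial t}\, \tilde d t + \frac{\partial(t + x)}{\partial x}\, \tilde d x\\
&= \tilde d t + \tilde d x. && (3.200)
\end{aligned}
$$

And now we’re asked for $\mathbf g(\vec e_u, \;)$. Look in Section 3.5 if you’re having trouble here.

$$
\begin{aligned}
\mathbf g(\vec e_u, \;) &= \eta_{\alpha\beta}\; \tilde\omega^\alpha \otimes \tilde\omega^\beta(\vec e_u, \;)\\
&= \eta_{\alpha\beta}\; \tilde\omega^\alpha(\vec e_u)\, \tilde\omega^\beta(\;)\\
&= \eta_{\alpha\beta}\; \delta^{\alpha}{}_{u}\, \tilde\omega^\beta(\;)\\
&= \eta_{u\beta}\; \tilde\omega^\beta(\;) && (3.201)
\end{aligned}
$$

The “u row” of η only has one non-zero element, so we might as well continue to narrow this down:

$$
\begin{aligned}
\mathbf g(\vec e_u, \;) &= \eta_{u\beta}\; \tilde\omega^\beta(\;)\\
&= \eta_{uv}\; \tilde\omega^v(\;)\\
&= -\tfrac12\, \tilde\omega^v(\;)\\
&= -\tfrac12\, \tilde d v(\;)\\
&= -\tfrac12\, \tilde d t - \tfrac12\, \tilde d x. && (3.202)
\end{aligned}
$$

And similarly for $\mathbf g(\vec e_v, \;)$.

$$
\begin{aligned}
\mathbf g(\vec e_v, \;) &= \eta_{\alpha\beta}\; \tilde\omega^\alpha \otimes \tilde\omega^\beta(\vec e_v, \;)\\
&= \eta_{\alpha\beta}\; \tilde\omega^\alpha(\vec e_v)\, \tilde\omega^\beta(\;)\\
&= \eta_{\alpha\beta}\; \delta^{\alpha}{}_{v}\, \tilde\omega^\beta(\;)\\
&= \eta_{v\beta}\; \tilde\omega^\beta(\;)\\
&= \eta_{vu}\; \tilde\omega^u(\;) && \text{only non-zero element}\\
&= -\tfrac12\, \tilde\omega^u(\;)\\
&= -\tfrac12\, \tilde d u(\;)\\
&= -\tfrac12\, \tilde d t + \tfrac12\, \tilde d x. && && (3.203)
\end{aligned}
$$

[Schutz stopped one line before I have above, so he left the answer in terms of $\tilde d u$ and $\tilde d v$.]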


Chapter 4

Perfect fluids in Special Relativity

GR books tend to cover mostly the same material: SR, tensor calculus, curvature and the Riemann tensor, Einstein’s field equations, Schwarzschild solution, etc. But not necessarily the material of this chapter on fluid mechanics. This chapter has come in lieu of one on electricity and magnetism. Either of these topics would give us practise on working with tensors and developing frame-invariant equations, as we will have to do to understand the development of the Einstein field equations. I like Schutz’s choice here because fluid mechanics seems of more general relevance to astrophysical applications of GR. The last question of § 4.10 is on electricity and magnetism, so Schutz is tipping his hat to the other possible choice. See (Hobson et al., 2009) for a full chapter on electricity and magnetism with the aim of preparing the student for the development of the Einstein field equations (but without a fluids chapter).

4.1 Fluids

The continuum hypothesis is discussed in much more detail in classical fluid dynamics texts, see for example (Batchelor, 1967).

Traditionally the qualitative distinction between solids and fluids is that solids can sustain a shear stress (a force parallel to an interface between elements) without a strain (relative movement between the elements). I’m not sure why Schutz avoids making this distinction here.


4.2 Dust: the number-flux vector $\vec N$

The discussion at the end of this section, on p. 88, is very fundamental. It’s perhaps worth adding that Einstein arrived at GR by searching for a description of gravity that was consistent with SR in the sense that it was written in terms of tensors that are invariant under Lorentz transformations. One can develop such a tensorial description of classical Electricity and Magnetism, see problem 25 of § 4.10, or (Hobson et al., 2009).

4.3 One-forms and surfaces

Number density as a timelike flux

I found this section confusing until I reached the end and realized that all he wanted to say was that the time-component of $\vec N$ in Eq. (4.5) looks like the spatial components [but with velocity c]. The mathematics is much clearer than all the words in my opinion. This point is also immediately obvious to anyone who looked at the four-velocity and realized that in the MCRF one is moving at the speed of light in the direction of time.

Representation of a frame by a one-form

Last line of this section,

$$
E = -\vec p \cdot \vec U
$$

looks very confusing because, taken literally, it suggests that energy E is a scalar, i.e. a frame-independent concept! But we know that energy is the zero-component of the four-momentum, and as such depends upon the reference frame (as one would expect!). The resolution is to flip back to p. 48 and remind oneself that Eq. (2.35) was written with $\vec U = \vec U_{\rm obs}$ being the 4-velocity of the observer:

$$
E = -\vec p \cdot \vec U_{\rm obs}
$$

In short, he’s being sloppy at the end of § 4.10.


4.4 Dust again: the stress-energy tensor

This is great practise for developing the frame-independent view of tensor equations – energy density must be a component of a $\binom{2}{0}$ tensor since it transforms like $\gamma^2$. Indeed, the stress-energy tensor $\mathbf{T}$ forms the RHS of the Einstein field equations.

4.5 General Fluids

Symmetry of $T^{\alpha\beta}$ in the MCRF

Don’t take the lower indices of $F^i_1$ as indicating the covariant components. They are just indices to give names to the forces on the different faces so he can talk about them.

Conservation of energy-momentum

Typo bottom of p. 98: $-l^2 T^{0x}(x = a)$ should be $-l^2 T^{0x}(x = l)$.

4.6 Perfect fluids

No viscosity

The only matrix diagonal in all frames is a multiple of the identity matrix. Here by all frames, he means all orientations of the spatial axes. In two spatial dimensions this is easy to see. Just apply the rotation matrix

$$R = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}$$

Then consider the transformation of some arbitrary diagonal matrix

$$A = \begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix}$$

We find that in a co-ordinate system rotated by $\theta$ we have

$$R^T A R = \begin{bmatrix} a\cos^2\theta + b\sin^2\theta & (a-b)\cos\theta\sin\theta \\ (a-b)\cos\theta\sin\theta & b\cos^2\theta + a\sin^2\theta \end{bmatrix} \tag{4.1}$$

which is again diagonal for arbitrary $\theta$ iff $a = b$.
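The rotation argument is easy to check numerically. The following is only a sketch of my own (not from Schutz); the helper names `matmul` and `rotate_diagonal` are made up for illustration, and the values of $a$, $b$ and $\theta$ are arbitrary.

```python
import math

def matmul(X, Y):
    """Multiply two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def rotate_diagonal(a, b, theta):
    """Return R^T A R for A = diag(a, b) and R = [[c, s], [-s, c]]."""
    c, s = math.cos(theta), math.sin(theta)
    R = [[c, s], [-s, c]]
    RT = [[c, -s], [s, c]]
    A = [[a, 0.0], [0.0, b]]
    return matmul(matmul(RT, A), R)

theta = 0.3
unequal = rotate_diagonal(2.0, 5.0, theta)  # a != b: off-diagonal terms appear
equal = rotate_diagonal(4.0, 4.0, theta)    # a == b: matrix stays diagonal

# Off-diagonal entry should equal (a - b) cos(theta) sin(theta).
print(unequal[0][1], (2.0 - 5.0) * math.cos(theta) * math.sin(theta))
print(equal[0][1])
```

With $a \neq b$ the off-diagonal entry matches $(a-b)\cos\theta\sin\theta$; with $a = b$ it vanishes up to rounding.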


The conservation laws

Typo: Eq. (4.52) should read:

$$(\rho + p)\, U^i{}_{,\beta}\, U^\beta + p_{,\beta}\, \eta^{i\beta} = 0$$

To see that $U_{i,\beta} U^\beta$ is the definition of the four-acceleration $a_i$, we must piece together the definition in Eq. (2.32) and the notation in Eq. (3.64), and one must lower the index using the metric, which in special relativity is constant and so commutes with derivatives.

It’s a bit beyond the scope of this text, but Eq. (4.55) is the famous Euler equation, which is the inviscid form of the Navier-Stokes equations of classical fluid mechanics. As Schutz says, it’s the $F = ma$ of fluid mechanics.

4.10 Problems

1. The continuum hypothesis applies to which of these situations.

(a) Planetary motions in the solar system. The continuum hypothesis does not apply because there are only eight planets, and they have different orbits, periods, velocities, etc.

(b) A lava flow from a volcano is likely well suited to the continuum approximation because the molten rock flows like a liquid. Presumably elements can be found that are much bigger than the molecules of the minerals but small enough to have uniform macroscopic properties like temperature, density, etc.

(c) Traffic on a major road at rush hour is likely to be well suited to the continuum approximation if one considers scales much larger than an individual car so there are many cars in the element, but small enough that the speed and direction of the cars in one element are roughly constant. At rush hour it’s more likely to have bumper to bumper traffic, which would force the cars in a vicinity to travel at the same speed.

(d) Traffic at an intersection with stop signs is likely to be not well suited to the continuum approximation. The stop signs ensure that the cars in an element will have different speeds at different times. There is no near-uniform element.

(e) Plasma dynamics is likely to be well suited to the continuum approximation unless the plasma is extremely rarefied. In the latter case there might not be sufficient collisions to bring the ions into statistical equilibrium.

2. “Flux across a surface of constant $x$” is often called “flux in the $x$ direction”. This is inappropriate because it implies that “flux in the $x$ direction” is a component of a vector. However, the “flux across a surface of constant $x$” is actually the result of the application of a vector to a one-form, or vice versa:

$$\vec{N}(\tilde{d}x) = \langle \tilde{n}, \vec{N} \rangle = \langle \tilde{d}x, \vec{N} \rangle \tag{4.2}$$

This was described in Section 4.3.

3. (a) Galilean momentum is frame-dependent in a manner that relativistic momentum is not. Galilean momentum is the ordinary 3-vector velocity, $\vec{v}$, times the (frame-independent) mass $m$:

$$\vec{p}_g = m\vec{v}. \tag{4.3}$$

The velocity depends very much on the frame. It’s not just the components that change with reference frame, but the vector itself that changes.

In contrast with Galilean momentum, the relativistic momentum, $\vec{p}$, is a 4-vector, created by the scalar rest mass $m$ times the 4-velocity $\vec{U}$:

$$\vec{p} = m\vec{U}. \tag{4.4}$$

The rest mass is obviously frame-invariant. The 4-velocity is too, while its components do depend upon reference frame. Note for instance that the magnitude of the 4-velocity is always

$$\vec{U} \cdot \vec{U} = U^\alpha U_\alpha = -1. \tag{4.5}$$

(b) How is the above situation possible given that Galilean momentum is an approximation to relativistic momentum? Hint: Define a Galilean 4-momentum.

The Galilean 4-momentum would look like the regular 3-momentum but with a time component,

$$\vec{p}_G \underset{O}{\to} m\,(1, v_x, v_y, v_z), \tag{4.6}$$

where $(v_x, v_y, v_z)$ is the ordinary 3-velocity in frame $O$. Because $|\vec{v}| \ll 1$, the $\vec{p}_G$ vector is dominated by its first component, just the inertial mass. Thus, for instance, the magnitude of the Galilean 4-momentum is almost frame-invariant.
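To make “almost frame-invariant” concrete, here is a small numerical sketch of my own. The function names and the chosen velocities are assumptions for illustration only; velocities are in units with $c = 1$, and the change of frame is a Galilean one (velocities simply subtract).

```python
def galilean_four_momentum(m, v):
    """Galilean 4-momentum m(1, vx, vy, vz) for 3-velocity v (units with c = 1)."""
    return (m, m * v[0], m * v[1], m * v[2])

def minkowski_norm_sq(p):
    """Minkowski magnitude squared with signature (-, +, +, +)."""
    return -p[0] ** 2 + p[1] ** 2 + p[2] ** 2 + p[3] ** 2

m = 2.0
v = (1.0e-4, 0.0, 0.0)   # particle velocity in frame O, |v| << 1
w = (3.0e-4, 0.0, 0.0)   # velocity of a second frame relative to O

# Galilean transformation of the 3-velocity: v' = v - w
v_prime = tuple(vi - wi for vi, wi in zip(v, w))

norm_O = minkowski_norm_sq(galilean_four_momentum(m, v))
norm_Obar = minkowski_norm_sq(galilean_four_momentum(m, v_prime))
relative_change = abs(norm_O - norm_Obar) / abs(norm_O)
print(norm_O, norm_Obar, relative_change)
```

Both magnitudes are within a few parts in $10^8$ of $-m^2$, so the frame dependence is there but is of order $v^2$.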

4. Show that the number density of dust to an arbitrary observer with 4-velocity $\vec{U}_{\rm obs}$ is

$$-\vec{N} \cdot \vec{U}_{\rm obs}$$

Let’s transform ourselves into the rest frame of the observer, $O$, so then

$$\vec{U}_{\rm obs} \underset{O}{\to} (1, 0, 0, 0)$$

Now,

$$\begin{aligned}
-\vec{N} \cdot \vec{U}_{\rm obs} &= -n\,\vec{U} \cdot \vec{U}_{\rm obs} \\
&= -n(-U^0) \\
&= n \frac{1}{\sqrt{1 - v^2}}
\end{aligned} \tag{4.7}$$

where $v$ is the ordinary velocity of the dust measured in the $O$ frame. But this is exactly what we came up with for the number density in Eq. (4.2).


Now we note that the observer’s 4-velocity and $\vec{N}$ are frame-invariant vectors, and hence $\vec{N} \cdot \vec{U}_{\rm obs}$ is frame-invariant. So the result must be true in all reference frames.
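Here is a quick numerical check of the frame-invariance claim, as a sketch under my own assumptions: motion is restricted to the $x$ direction, the names `four_velocity` and `minkowski_dot` are made up for illustration, and the velocities are arbitrary. The dot product is evaluated in a third “lab” frame, in which neither the dust nor the observer is at rest, and compared with $n\gamma$ built from the relative velocity.

```python
import math

def four_velocity(v):
    """4-velocity (gamma, gamma*v, 0, 0) for motion along x, with c = 1."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return (g, g * v, 0.0, 0.0)

def minkowski_dot(A, B):
    """Scalar product with signature (-, +, +, +)."""
    return -A[0] * B[0] + A[1] * B[1] + A[2] * B[2] + A[3] * B[3]

n = 3.0          # number density in the dust's MCRF
v_dust = 0.6     # dust velocity in the lab frame
v_obs = -0.3     # observer velocity in the lab frame

N = tuple(n * U for U in four_velocity(v_dust))
density_seen = -minkowski_dot(N, four_velocity(v_obs))

# Relativistic velocity of the dust relative to the observer.
v_rel = (v_dust - v_obs) / (1.0 - v_dust * v_obs)
expected = n / math.sqrt(1.0 - v_rel * v_rel)
print(density_seen, expected)
```

The two numbers agree, as they must if $-\vec{N} \cdot \vec{U}_{\rm obs}$ can be evaluated in any frame.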

5. Complete the proof that Eq. (4.14) [for the stress-energy tensor] defines a tensor by arguing that it must be linear in both its arguments.

Eq. (4.14) defines the stress-energy tensor

$$\mathbf{T}(\tilde{d}x^\alpha, \tilde{d}x^\beta) = T^{\alpha\beta},$$

and $T^{\alpha\beta}$ is the “flux of $\alpha$-momentum across a surface of constant $x^\beta$”. Of course we require this flux to be proportional to the area of the surface of constant $x^\beta$. This requirement is met because $\mathbf{T}(\tilde{d}x^\alpha, \tilde{d}x^\beta)$ is linear in the 2nd argument $\tilde{d}x^\beta$.

Furthermore, the $\alpha$-component of the four-momentum is

$$p^\alpha = \langle \tilde{d}x^\alpha, \vec{p} \rangle, \tag{4.8}$$

which is linear in $\tilde{d}x^\alpha$, the first argument of $\mathbf{T}(\tilde{d}x^\alpha, \tilde{d}x^\beta)$, by the properties of one-forms and vectors.

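6. Derive Eq. (4.19).

We only need to show that $\vec{p} \otimes \vec{N}$ has only one non-zero component in the MCRF, namely $(T^{00})_{\rm MCRF} = nm$.

In the MCRF, say $O$,

$$\vec{U} \underset{O}{\to} (1, 0, 0, 0)$$

$$\vec{N} = n\vec{U} \underset{O}{\to} (n, 0, 0, 0)$$

$$\vec{p} = m\vec{U} \underset{O}{\to} (m, 0, 0, 0).$$

Thus $\mathbf{T} = \vec{p} \otimes \vec{N}$ has only one non-zero component, namely $(T^{00})_{\rm MCRF} = nm$.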

7. Derive Eq. (4.21).

The terms in Eq. (4.21) are immediately clear from the preceding expression for the 4-velocity in frame $O$:

$$\vec{U} \underset{O}{\to} (\gamma, v_x\gamma, v_y\gamma, v_z\gamma) \tag{4.9}$$

where $\gamma = 1/\sqrt{1 - v^2}$, and $v^2 = \sum_i v^i v_i$.

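8. (a) Argue that Eqs. (4.25) and (4.26) can be written as statements about one-forms.

In the derivation of Eqs. (4.25) and (4.26), we started with a fluid element in the MCRF and considered how its energy could change by first a finite amount $\Delta E$ and then we took the limit of an infinitesimal change $dE$. If we simply divided by the change in proper time, $\Delta\tau$, then for instance Eq. (4.24) would become

$$n\frac{\Delta q}{\Delta\tau} = \frac{\Delta\rho}{\Delta\tau} - \frac{\rho + p}{n}\frac{\Delta n}{\Delta\tau} \tag{4.10}$$

Now, following Schutz, we suppose that the changes are infinitesimal. The RHS becomes

$$\begin{aligned}
\lim_{\Delta\tau \to 0}\left(\frac{\Delta\rho}{\Delta\tau} - \frac{\rho + p}{n}\frac{\Delta n}{\Delta\tau}\right) &= \frac{d\rho}{d\tau} - \frac{\rho + p}{n}\frac{dn}{d\tau} \\
&= \tilde{d}\rho \cdot \vec{U} - \frac{\rho + p}{n}\,\tilde{d}n \cdot \vec{U} \\
&= nT\,\tilde{d}S \cdot \vec{U}
\end{aligned} \tag{4.11}$$

If we “divide by” the $\vec{U}$, we obtain

$$\tilde{d}\rho - \frac{\rho + p}{n}\,\tilde{d}n = nT\,\tilde{d}S \tag{4.12}$$

[Solution above is different from Schutz’s solutions, but I believe the two are compatible.]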

8. (b) Show that $\Delta q$ is not a gradient.

Unlike $\rho$ and $n$, there was no property of each element that we could define by $q$. (Yes, formally one can write $q := Q/N$, but what is $Q$? Heat is only defined in terms of transfer of energy.) I find it misleading that Schutz writes $q := Q/N$ on p. 95.

9. Show that Eq. (4.34), when $\alpha = i$, i.e. any spatial index, expresses Newton’s 2nd law.

Let’s assume we’re dealing with dust, so that the stress-energy tensor is given by Eq. (4.19).

Let’s start with $\beta = 0$. Then Eq. (4.34) gives the following term:

$$T^{i0}{}_{,0} = \frac{\partial T^{i0}}{\partial x^0} = \frac{\partial T^{i0}}{\partial t} \tag{4.13}$$

which we interpret as the time rate of change of the $i$-direction momentum density.

The remaining terms

$$T^{ij}{}_{,j} = \frac{\partial T^{ij}}{\partial x^j}$$

we directly interpret as the divergence of the flux of $i$-direction momentum. The same terms can be written in a form familiar to classical fluid mechanics:

$$\frac{\partial T^{i0}}{\partial t} = \frac{\partial \rho u_i}{\partial t}, \qquad \frac{\partial T^{ij}}{\partial x^j} = \frac{\partial \rho u_i u_j}{\partial x_j} \tag{4.14}$$

where on the RHS we have used the standard symbols of fluid mechanics: $\rho$ is the mass density, $u_i$ is the fluid velocity in the $x_i$ direction. So

$$\frac{\partial T^{i0}}{\partial t} + \frac{\partial T^{ij}}{\partial x^j} = \frac{\partial \rho u_i}{\partial t} + \frac{\partial \rho u_i u_j}{\partial x_j} = 0. \tag{4.15}$$

The RHS is the Navier-Stokes equations but without the pressure gradient term, forcing or dissipation. This is of course a form of Newton’s 2nd law but when forces are not present. This case applies to dust only.

In a perfect fluid the stress-energy tensor is given by Eq. (4.38) and we immediately see the possibility of a pressure gradient term.

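10. Take the limit $|\vec{v}| \ll 1$, showing that Eq. (4.35) reduces to the equation given.

Solution:

$$\begin{aligned}
N^\alpha{}_{,\alpha} &= 0, && \text{restating Eq. (4.35)} \\
\frac{\partial n\gamma}{\partial t} + \frac{\partial n\gamma v^i}{\partial x^i} &= 0, && \text{using Eq. (4.5)} \\
\frac{\partial n}{\partial t} + \frac{\partial n v^i}{\partial x^i} &= 0, && \gamma \to 1 \text{ when } |\vec{v}| \ll 1
\end{aligned}$$

where $\gamma = 1/\sqrt{1 - v^2}$.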

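11. (a) Show that the matrix $\delta^{ij}$ is unchanged when transformed by a rotation of the spatial axes.

Why is this question in this chapter? Recall the discussion of viscosity on p. 101. If there is no viscosity then $\mathbf{T}$ should be diagonal in any reference frame.

First one has to remind oneself how to transform matrices under a change of coordinates. I don’t have my linear algebra book with me, but we can find this by pretending that the matrix represents a $\binom{2}{0}$ tensor, say $\mathbf{A}$. This tensor should be an entity that is invariant under the arbitrary choice of coordinate system, so we demand that

$$\begin{aligned}
\mathbf{A} &= A^{\alpha\beta}\,\vec{e}_\alpha \otimes \vec{e}_\beta, && \text{see Chapter 3, section 6 for basis} \\
&= A^{\alpha\beta}\,\Lambda^{\bar\alpha}{}_\alpha\,\vec{e}_{\bar\alpha} \otimes \vec{e}_{\bar\beta}\,\Lambda^{\bar\beta}{}_\beta, && \text{using bases of } \bar{O} \\
&= A^{\bar\alpha\bar\beta}\,\vec{e}_{\bar\alpha} \otimes \vec{e}_{\bar\beta}, && \text{entity independent of coordinate system.}
\end{aligned} \tag{4.16}$$

So it’s now clear that

$$\begin{aligned}
A^{\bar\alpha\bar\beta} &= A^{\alpha\beta}\,\Lambda^{\bar\alpha}{}_\alpha\,\Lambda^{\bar\beta}{}_\beta \\
(A^{\bar\alpha\bar\beta}) &= (\Lambda^{\bar\alpha}{}_\alpha)(A^{\alpha\beta})(\Lambda^{\bar\beta}{}_\beta)^T, && \text{matrix version of previous line.}
\end{aligned} \tag{4.17}$$

Now we can address the problem by replacing $(A^{\alpha\beta})$ with the matrix $\delta^{ij}$ and the transformation $(\Lambda^{\bar\alpha}{}_\alpha)$ with a rotation $R^{\bar i}{}_i$. Let’s work in 2D. The matrix $\delta^{\bar i \bar j}$ in the rotated coordinate system $\bar{O}$ will be

$$\begin{aligned}
\delta^{\bar i \bar j} &= R^{\bar i}{}_i\,\delta^{ij}\,R^{\bar j}{}_j \\
&= (R)\,I\,(R)^T \\
&= \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \\
&= \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \\
&= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.
\end{aligned} \tag{4.18}$$

11. (b) Show that any matrix that has this property is a multiple of $\delta^{ij}$.

It is quite obvious that any multiple of the identity matrix has this property. In fact I went through this in my notes above for § 4.6.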

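12. Derive Eq. (4.37) from Eq. (4.36).

We simply go term by term as Schutz suggested on p. 101. He has already addressed $\alpha = \beta = 0$. Using the convention $i \in \{1, 2, 3\}$, consider the three terms $T^{0i}$:

$$\begin{aligned}
T^{0i} &= (\rho + p)\,U^0 U^i + p\,\eta^{0i} \\
&= (\rho + p)\,U^i + p\,\eta^{0i} && \text{in MCRF, } U^0 = 1 \\
&= (\rho + p)\,U^i && \eta \text{ is diagonal} \\
&= 0. && \text{in MCRF, nil spatial components of } \vec{U}
\end{aligned} \tag{4.19}$$

Then by symmetry,

$$T^{i0} = T^{0i} = 0. \tag{4.20}$$

Finally,

$$\begin{aligned}
T^{ij} &= (\rho + p)\,U^i U^j + p\,\eta^{ij} \\
&= p\,\eta^{ij} && \text{in MCRF, nil spatial components of } \vec{U} \\
&= p\,\delta^{ij}.
\end{aligned} \tag{4.21}$$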


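13. Supply reasoning behind Eq. (4.44).

We want the derivative of the dot product of the 4-velocity with itself. This can be written

$$\begin{aligned}
(U^\alpha U^\gamma \eta_{\alpha\gamma})_{,\beta} &= (U^\alpha U^\gamma)_{,\beta}\,\eta_{\alpha\gamma}, && \text{metric tensor is constant} \\
&= (U^\alpha{}_{,\beta} U^\gamma + U^\alpha U^\gamma{}_{,\beta})\,\eta_{\alpha\gamma}, && \text{product rule} \\
&= (2U^\alpha{}_{,\beta} U^\gamma)\,\eta_{\alpha\gamma}, && \text{metric tensor is symmetric.}
\end{aligned} \tag{4.22}$$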

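14. Argue that Eq. (4.46) is the time component of Eq. (4.45) in the MCRF.

Recall Eq. (4.45) was

$$nU^\beta\left(\frac{\rho + p}{n}U^\alpha\right)_{,\beta} + p_{,\beta}\,\eta^{\alpha\beta} = 0$$

Although there are two indices, $\beta$ is just a dummy index (it appears both upstairs and downstairs implying a sum over $\beta$). So the time component is when $\alpha = 0$. The following argument is not correct:

$$\begin{aligned}
0 &= nU^\beta\left(\frac{\rho + p}{n}U^0\right)_{,\beta} + p_{,\beta}\,\eta^{0\beta} \\
&= nU^\beta\left(\frac{\rho + p}{n}\right)_{,\beta} + p_{,\beta}\,\eta^{0\beta}, && \text{since } \vec{U} = \vec{e}_0 \text{ in the MCRF} \\
&= nU^\beta\left(\frac{\rho + p}{n}\right)_{,\beta} - \frac{\partial p}{\partial t}, && \text{recall properties of } \eta \text{, see Eq. (3.44)}
\end{aligned}$$

Although $\vec{U} = \vec{e}_0$ in the MCRF, we cannot immediately conclude that the gradient of the time component is zero. Instead, we must first take the gradient in Eq. (4.45), and then set $\alpha = 0$:

$$\begin{aligned}
0 &= nU^\beta\left(\frac{\rho + p}{n}U^\alpha\right)_{,\beta} + p_{,\beta}\,\eta^{\alpha\beta} && \text{Eq. (4.45)} \\
&= nU^\beta\left[U^\alpha\left(\frac{\rho + p}{n}\right)_{,\beta} + \left(\frac{\rho + p}{n}\right)U^\alpha{}_{,\beta}\right] + p_{,\beta}\,\eta^{\alpha\beta}, && \text{expanding}
\end{aligned} \tag{4.23}$$

And now focus attention on the 2nd term in square brackets:

$$\begin{aligned}
nU^\beta\left(\frac{\rho + p}{n}\right)U^\alpha{}_{,\beta} &= (\rho + p)\,U^\beta U^\alpha{}_{,\beta} \\
&= (\rho + p)\,\nabla_{\vec{U}}\vec{U} && \text{c.f. Eq. (3.68)} \\
&= (\rho + p)\,\frac{d\vec{U}}{d\tau} && \text{c.f. Eq. (3.67)} \\
&= 0 && \text{for } t \text{ component in MCRF, c.f. p. 48}
\end{aligned} \tag{4.24}$$

So for the above reason, Eq. (4.45) in the time component reduces to

$$\begin{aligned}
0 &= nU^\beta\left[U^\alpha\left(\frac{\rho + p}{n}\right)_{,\beta} + \left(\frac{\rho + p}{n}\right)U^\alpha{}_{,\beta}\right] + p_{,\beta}\,\eta^{\alpha\beta}, && \text{from above} \\
0 &= nU^\beta\left[U^\alpha\left(\frac{\rho + p}{n}\right)_{,\beta}\right] + p_{,\beta}\,\eta^{\alpha\beta}, && \text{using above} \\
0 &= nU^\beta\left[\left(\frac{\rho + p}{n}\right)_{,\beta}\right] + p_{,\beta}\,\eta^{0\beta}, && \text{time component} \\
&= nU^\beta\left(\frac{\rho + p}{n}\right)_{,\beta} - \frac{\partial p}{\partial t}, && \text{recall properties of } \eta \text{, see Eq. (3.44)}
\end{aligned} \tag{4.25}$$

On the other hand, Eq. (4.46) was

$$0 = nU^\beta U_\alpha\left(\frac{\rho + p}{n}U^\alpha\right)_{,\beta} + p_{,\beta}\,\eta^{\alpha\beta}U_\alpha$$

Consider:

$$\begin{aligned}
U_\alpha\left(\frac{\rho + p}{n}U^\alpha\right)_{,\beta} &= U_\alpha U^\alpha\left(\frac{\rho + p}{n}\right)_{,\beta} + U_\alpha\frac{\rho + p}{n}U^\alpha{}_{,\beta}, && \text{product rule} \\
&= -\left(\frac{\rho + p}{n}\right)_{,\beta} + U_\alpha\frac{\rho + p}{n}U^\alpha{}_{,\beta}, && \text{using Eq. (2.28)} \\
&= -\left(\frac{\rho + p}{n}\right)_{,\beta}, && \text{using Eq. (4.42).}
\end{aligned} \tag{4.26}$$

So Eq. (4.46) becomes:

$$\begin{aligned}
0 &= -nU^\beta\left(\frac{\rho + p}{n}\right)_{,\beta} + p_{,\beta}\,\eta^{\alpha\beta}U_\alpha \\
&= -nU^\beta\left(\frac{\rho + p}{n}\right)_{,\beta} + p_{,\beta}\,U^\beta \\
&= -nU^\beta\left(\frac{\rho + p}{n}\right)_{,\beta} + \frac{\partial p}{\partial t}, && \text{in MCRF} \\
&= nU^\beta\left(\frac{\rho + p}{n}\right)_{,\beta} - \frac{\partial p}{\partial t}, && \text{same as time component of Eq. (4.45).}
\end{aligned} \tag{4.27}$$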

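15. Derive Eq. (4.48) from Eq. (4.47).

This is just trivial manipulation to encourage the student to follow the steps of the argument and become comfortable with the notation:

$$\begin{aligned}
0 &= U^\beta\left[-n\left(\frac{\rho + p}{n}\right)_{,\beta} + p_{,\beta}\right], && \text{recall Eq. (4.47)} \\
&= U^\beta\left[-\frac{n}{n}(\rho_{,\beta} + p_{,\beta}) + \frac{n}{n^2}(\rho + p)\,n_{,\beta} + p_{,\beta}\right], && \text{product rule and chain rule} \\
&= -U^\beta\left[\rho_{,\beta} - \left(\frac{\rho + p}{n}\right)n_{,\beta}\right], && \text{algebra.}
\end{aligned} \tag{4.28}$$

Which is Eq. (4.48).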

16. In the MCRF, $U^i = 0$. Why can’t we assume $U^i{}_{,\beta} = 0$?

The analogous statement in 3D space is also true. In fluid mechanics, for instance, one can always transform the equations into a frame momentarily co-moving with the local fluid velocity, but that doesn’t mean the velocity gradient will be zero. The 3-velocity in fluid mechanics, and the 4-velocity in SR, can depend upon space so that adjacent fluid elements have different MCRFs.


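17. We have defined $a^\mu = U^\mu{}_{,\beta}\,U^\beta$. Show that in the nonrelativistic limit:

$$a^i = \dot{v}^i + (\vec{v} \cdot \nabla)v^i = \frac{Dv^i}{Dt}.$$

The 4-velocity can be written in terms of the 3-velocity as

$$U^\mu = [\gamma, \gamma u, \gamma v, \gamma w] \tag{4.29}$$

where

$$\gamma = \frac{1}{\sqrt{1 - v^2}}.$$

(Recall that $v$ is assumed to be the ordinary 3-velocity divided by $c$.)

$$\begin{aligned}
a^i &= U^i{}_{,\beta}\,U^\beta \\
&= \gamma\frac{\partial \gamma u^i}{\partial t} + \gamma u\frac{\partial \gamma u^i}{\partial x} + \gamma v\frac{\partial \gamma u^i}{\partial y} + \gamma w\frac{\partial \gamma u^i}{\partial z}
\end{aligned} \tag{4.30}$$

In the nonrelativistic limit $v \ll 1$, so

$$\gamma \approx 1,$$

and,

$$\begin{aligned}
a^i &= \frac{\partial u^i}{\partial t} + u\frac{\partial u^i}{\partial x} + v\frac{\partial u^i}{\partial y} + w\frac{\partial u^i}{\partial z} \\
&= \frac{Du^i}{Dt}.
\end{aligned} \tag{4.31}$$

Unfortunately I’ve used $v$ as both the $y$-component of the 3-velocity (standard in fluid mechanics), and also as the magnitude of the 3-velocity in the definition of $\gamma$.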

18. Sharpen the discussion at the end of § 4.6 by showing that $-\nabla p$ is actually the net force per unit volume on the fluid element in the MCRF.

I believe this is simply the same argument used in classical fluid mechanics. Imagine a cube with one corner at the origin, with sides parallel to the Cartesian coordinate axes, and of volume $\delta x\,\delta y\,\delta z$. Without loss of generality let the pressure gradient be in the $y$-direction. The pressure force on the face at $y = 0$ is $p(y = 0)\,\delta x\,\delta z$, while the pressure force on the face at $y = \delta y$ is $-p(y = \delta y)\,\delta x\,\delta z$. So the pressure gradient force (PGF) per unit volume is

$$\begin{aligned}
\frac{\rm PGF}{\delta x\,\delta y\,\delta z} &= -\left[p(y = \delta y) - p(y = 0)\right]\frac{\delta x\,\delta z}{\delta x\,\delta y\,\delta z} \\
&= -\left[\frac{p(y = \delta y) - p(y = 0)}{\delta y}\right]
\end{aligned} \tag{4.32}$$

Taking the limit $\delta x \to 0$, $\delta y \to 0$, $\delta z \to 0$,

$$\frac{\rm PGF}{dx\,dy\,dz} = -\frac{\partial p}{\partial y}. \tag{4.33}$$
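The limiting argument can be illustrated numerically. This is only a sketch under my own assumptions: the pressure profile `pressure` is an arbitrary made-up quadratic in $y$, and `net_force_per_volume` is a hypothetical helper that just adds up the forces on the two $y$-faces of the cube as described above.

```python
def pressure(y):
    """Made-up pressure profile varying only in y; dp/dy = 3 at y = 0."""
    return 100.0 + 3.0 * y + 0.5 * y * y

def net_force_per_volume(p, dx, dy, dz):
    """Net y-force on a cube with one corner at the origin, per unit volume."""
    force_at_y0 = p(0.0) * dx * dz    # pushes in +y on the face at y = 0
    force_at_dy = -p(dy) * dx * dz    # pushes in -y on the face at y = dy
    return (force_at_y0 + force_at_dy) / (dx * dy * dz)

exact = -3.0  # -dp/dy at y = 0 for the profile above
for size in (1.0, 1.0e-2, 1.0e-4):
    print(size, net_force_per_volume(pressure, size, size, size))

coarse = net_force_per_volume(pressure, 1.0, 1.0, 1.0)
fine = net_force_per_volume(pressure, 1.0e-4, 1.0e-4, 1.0e-4)
```

As the cube shrinks, the net force per unit volume approaches $-\partial p/\partial y = -3$ for this profile.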

19. Starting with Eq. (4.58) prove Eq. (4.57).

Eq. (4.58) contains the sum of terms like:

$$\int \left[V^0(t_2) - V^0(t_1)\right] dx\,dy\,dz + \int \left[V^x(x_2) - V^x(x_1)\right] dt\,dy\,dz + \ldots \tag{4.34}$$

Let’s start with the first term.

$$\int \left[V^0(t_2) - V^0(t_1)\right] dx\,dy\,dz = \int \frac{V^0(t_1 + \delta t) - V^0(t_1)}{\delta t}\,\delta t\,dx\,dy\,dz, \tag{4.35}$$

where $\delta t = t_2 - t_1$. Taking the limit $\delta t \to 0$,

$$\int \left[V^0(t_2) - V^0(t_1)\right] dx\,dy\,dz = \int \frac{\partial V^0(t)}{\partial t}\,dt\,dx\,dy\,dz. \tag{4.36}$$

And similarly for the other terms. For example,

$$\int \left[V^x(x_2) - V^x(x_1)\right] dt\,dy\,dz = \int \frac{\partial V^x(x)}{\partial x}\,dt\,dx\,dy\,dz. \tag{4.37}$$

Combining these terms we obtain Eq. (4.57).


20. (a) Show that if particles are not conserved but are generated locally at a rate $\varepsilon$ particles per unit volume per unit time in the MCRF, then the conservation law, Eq. (4.35), becomes:

$$N^\alpha{}_{,\alpha} = \varepsilon.$$

We must essentially derive Eq. (4.35) but including the source term. We were told just before Eq. (4.35) that the procedure was the same as for Eq. (4.34), see p. 98. Consider a fluid element as described in Fig. 4.8 and bottom of p. 98. The number density is $n$ in the MCRF, and by Eq. (4.42) it is $\gamma n$ in a reference frame moving at speed $v$ relative to the fluid element, with

$$\gamma = \frac{1}{\sqrt{1 - v^2}}.$$

The 4-velocity of the fluid is, by definition, $\vec{U} = \vec{e}_0$ in the MCRF and the time component is in general $U^0 = \gamma$. Thus we can write the number of particles in an element of volume $l^3$ as

$$l^3 n\gamma = n\,l^3\,U^0.$$

The rate of flow (or flux of) particles across surface 4 (c.f. Fig. 4.8) is $l^2 nU^x(x = 0)$. (This may seem strange because we know that $U^x = 0$ in the MCRF, but soon we’re going to take a derivative that will not be zero – recall problem 16 above.) The flux of particles across surface 2 is $l^2 nU^x(x = l)$. Similarly, in the $y$-direction and $z$-direction the net inflow of particles is $l^2 nU^y(y = 0) - l^2 nU^y(y = l)$ and $l^2 nU^z(z = 0) - l^2 nU^z(z = l)$ respectively. These net inflow terms increase the number of particles in the fluid element at a rate

\begin{align}
\frac{\partial\, n l^3 U^0}{\partial t} = {} & l^2\big[(nU^x)(x = 0) - (nU^x)(x = l) + (nU^y)(y = 0) - (nU^y)(y = l) \tag{4.38} \\
& + (nU^z)(z = 0) - (nU^z)(z = l)\big] + \ldots \text{ other terms} \tag{4.39}
\end{align}

There are other terms contributing now, unlike in deriving Eq. (4.35), because there is also a source term, giving

\begin{align}
\frac{\partial\, n l^3 U^0}{\partial t} = {} & l^2\big[(nU^x)(x = 0) - (nU^x)(x = l) + (nU^y)(y = 0) - (nU^y)(y = l) \tag{4.40} \\
& + (nU^z)(z = 0) - (nU^z)(z = l)\big] + l^3\varepsilon. \tag{4.41}
\end{align}

Note that this relation should be frame-invariant because $n$ is obviously frame-invariant and $\vec{U}$ is also frame-invariant.

Note: $\varepsilon$ is frame-invariant! Recall $\varepsilon$ is the rate of generation of particles per unit volume per unit time in the MCRF. In another reference frame, there is a factor of $\gamma$ to account for the fact that the volume will be smaller, thus tending to increase the generation rate, but the time will be slower by a factor $1/\gamma$. In short, time dilation cancels length contraction.

And we can pull $l^3$ out of the derivative because it is a specified constant, and then divide both sides by $l^3$:

\begin{align}
\frac{\partial\, nU^0}{\partial t} = {} & -\frac{(nU^x)(x = l) - (nU^x)(x = 0)}{l} - \frac{(nU^y)(y = l) - (nU^y)(y = 0)}{l} \tag{4.42} \\
& - \frac{(nU^z)(z = l) - (nU^z)(z = 0)}{l} + \varepsilon. \tag{4.43}
\end{align}

In the limit $l \to 0$,

$$\frac{\partial\, nU^0}{\partial t} = -\frac{\partial\, nU^x}{\partial x} - \frac{\partial\, nU^y}{\partial y} - \frac{\partial\, nU^z}{\partial z} + \varepsilon. \tag{4.44}$$

Or

$$N^\alpha{}_{,\alpha} = \varepsilon.$$

20. (b) Show that if momentum and energy are not conserved (due to interactions with external systems) then Eq. (4.34) becomes:

$$T^{\alpha\beta}{}_{,\beta} = F^\alpha,$$

where $F^\alpha$ is the relativistic 4-vector force.

This problem is easier than 20. (a) because we follow the derivation of Eq. (4.34) on pp. 98 and 99. Recall Eq. (4.31):

$$\frac{\partial T^{00}}{\partial t} = -T^{0i}{}_{,i}$$

where the LHS is the time rate of change of energy per unit volume, and the RHS is the net influx of energy per unit time per unit volume. Thus for non-conservative systems we must add an $F^0$, which is the net rate of energy forcing (or supply of energy from external sources and not associated with fluxes across the boundaries of the fluid element) per unit time per unit volume.

$$\frac{\partial T^{00}}{\partial t} = -T^{0i}{}_{,i} + F^0. \tag{4.45}$$

I believe, unlike the source of particles, $\varepsilon$, $F^0$ will be frame-dependent. For suppose it is associated with particles being generated at a rate $\varepsilon$. We argued that $\varepsilon$ was frame-invariant in 20 (a). But the total energy of each particle is $m\gamma$, where $m$ is the rest mass. So the energy source $F^0$ will increase as one moves to a reference frame moving relative to the fluid element.

[In retrospect, of course it’s frame-dependent, it’s a component of a 4-vector.]

Similarly for the other components. Consider

$$\frac{\partial T^{x0}}{\partial t} = -T^{xi}{}_{,i}.$$

The LHS is the time rate of change of $x$-direction momentum per unit volume, and the RHS is the net influx of $x$-momentum per unit time per unit volume. Thus we must add any external forces

$$\frac{\partial T^{x0}}{\partial t} = -T^{xi}{}_{,i} + F^x, \tag{4.46}$$

where $F^x$ is the external force on the fluid element per unit volume in the MCRF.

21. Find the stress-energy tensor components in an inertial frame $O$.

(a) Group of particles all with the same 3-velocity $\vec{v} = \beta\vec{e}_x$ in frame $O$. The rest-mass density is $\rho_0$ and we can make the continuum approximation.

Making the continuum approximation, we treat the particles as a fluid, and because all the particles have the same velocity the fluid is dust.

The stress-energy tensor for dust was discussed in § 4.4, see in particular Eq. (4.19):

$$\mathbf{T} = \rho\,\vec{U} \otimes \vec{U}.$$

Here the energy density in the rest frame is $\rho = \rho_0$ and the 4-velocity is

$$\vec{U} \underset{O}{\to} (\gamma, \beta\gamma, 0, 0)$$

with $\gamma = 1/\sqrt{1 - \beta^2}$. So the stress-energy tensor is simply

$$T^{\alpha\beta} \underset{O}{\to} \begin{bmatrix} \rho_0\gamma^2 & \rho_0\gamma^2\beta & 0 & 0 \\ \rho_0\gamma^2\beta & \rho_0\gamma^2\beta^2 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}.$$
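As a cross-check on the matrix in 21 (a), one can also start from the MCRF form $\mathrm{diag}(\rho_0, 0, 0, 0)$ and Lorentz-transform it to $O$. The sketch below is my own, not Schutz’s method; the function names and the values of $\rho_0$ and $\beta$ are illustrative assumptions.

```python
import math

def boost_x(beta):
    """Matrix taking MCRF components to frame O, where the fluid moves at +beta along x."""
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    return [[g, g * beta, 0.0, 0.0],
            [g * beta, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def transform(L, T):
    """Apply T'^{ab} = L^a_m L^b_k T^{mk} to a 4x4 tensor."""
    return [[sum(L[a][m] * L[b][k] * T[m][k] for m in range(4) for k in range(4))
             for b in range(4)] for a in range(4)]

def dust_tensor(rho0, beta):
    """T = rho0 U (x) U with U = (gamma, gamma*beta, 0, 0)."""
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    U = [g, g * beta, 0.0, 0.0]
    return [[rho0 * U[a] * U[b] for b in range(4)] for a in range(4)]

rho0, beta = 1.5, 0.6
T_mcrf = [[rho0, 0.0, 0.0, 0.0]] + [[0.0] * 4 for _ in range(3)]
T_boosted = transform(boost_x(beta), T_mcrf)
T_direct = dust_tensor(rho0, beta)
print(T_boosted[0][0], T_boosted[0][1], T_boosted[1][1])
```

Both routes give $T^{00} = \rho_0\gamma^2$, $T^{0x} = \rho_0\gamma^2\beta$ and $T^{xx} = \rho_0\gamma^2\beta^2$.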

(b) A ring of $N$ similar particles of mass $m$ rotating counter-clockwise with angular velocity $\omega$ in the $x$-$y$ plane about the origin of $O$. The radius of the ring is $a$, and the circular cross-section of the ring has radius $\delta a \ll a$. Ignore the force needed to keep the particles in circular orbit.

Because we can ignore the forces to keep the particles in the circular orbit, I believe we can treat the fluid as a dust, even though the frame co-moving with the particles is not inertial. [Yes, my solution agrees with Schutz’s solution, so this must be right.]

Let’s assume that the given mass $m$ is the rest mass of each of the particles. The relativistic mass of each particle is then $m\gamma$ measured in frame $O$, where

$$\gamma = \frac{1}{\sqrt{1 - \frac{\omega^2 a^2}{c^2}}}.$$

Let’s assume that $a$ was given in geometric units so that $c = 1$, and

$$\gamma = \frac{1}{\sqrt{1 - \omega^2 a^2}}.$$

The total energy is then $N\gamma m$. And this energy is uniformly distributed over the volume of the torus so the energy density in frame $O$ is

$$\rho_r = \frac{N\gamma m}{2a\pi\,(\delta a)^2\pi} = \frac{N\gamma m}{2a\pi^2(\delta a)^2}.$$

But the particles each have speed $\omega a$ in $O$. To transform to the MCRF energy density one must use Eq. (4.13),

$$\rho = \frac{\rho_r}{\gamma^2} = \frac{Nm}{2a\pi^2(\delta a)^2\gamma}.$$

(Here we’re assuming that we can treat the fluid as dust even though there is centripetal acceleration.)

Now all we need is the 4-velocity. In frame $O$, the $x$-component of velocity is $-\omega a\sin(\theta) = -\omega a\,y/a = -\omega y$. The 3-velocity in frame $O$ is

$$\vec{v} = (-\omega y, \omega x, 0).$$

The 4-velocity is then

$$\vec{U} \underset{O}{\to} (\gamma, -\gamma\omega y, \gamma\omega x, 0).$$

We can now easily form the stress-energy tensor

$$\mathbf{T} = \rho\,\vec{U} \otimes \vec{U}$$

which in matrix form is

$$(\mathbf{T}) = \rho \begin{bmatrix} \gamma^2 & -\gamma^2\omega y & \gamma^2\omega x & 0 \\ -\gamma^2\omega y & \gamma^2\omega^2 y^2 & -\gamma^2\omega^2 xy & 0 \\ \gamma^2\omega x & -\gamma^2\omega^2 xy & \gamma^2\omega^2 x^2 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$

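(c) For two such rings that are identical in every way except the sense of rotation and wherein the particles do not interact.

We simply add the two stress-energy tensors since the energy density is linear in the number of particles and the volume is fixed. The 2nd stress-energy tensor is obtained by just changing the sign of $\omega$ in the first stress-energy tensor.

$$\begin{aligned}
(\mathbf{T}) &= \rho \begin{bmatrix} \gamma^2 & -\gamma^2\omega y & \gamma^2\omega x & 0 \\ -\gamma^2\omega y & \gamma^2\omega^2 y^2 & -\gamma^2\omega^2 xy & 0 \\ \gamma^2\omega x & -\gamma^2\omega^2 xy & \gamma^2\omega^2 x^2 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} + \rho \begin{bmatrix} \gamma^2 & \gamma^2\omega y & -\gamma^2\omega x & 0 \\ \gamma^2\omega y & \gamma^2\omega^2 y^2 & -\gamma^2\omega^2 xy & 0 \\ -\gamma^2\omega x & -\gamma^2\omega^2 xy & \gamma^2\omega^2 x^2 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \\
&= 2\rho \begin{bmatrix} \gamma^2 & 0 & 0 & 0 \\ 0 & \gamma^2\omega^2 y^2 & -\gamma^2\omega^2 xy & 0 \\ 0 & -\gamma^2\omega^2 xy & \gamma^2\omega^2 x^2 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\end{aligned} \tag{4.47}$$

Note the overall factor of 2: the terms that are even in $\omega$ add, while the energy-flux (momentum-density) terms, which are odd in $\omega$, cancel.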
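The cancellation of the momentum-density terms and the doubling of the remaining terms can be confirmed with a few lines of arithmetic. This is just a sketch with made-up values of $\rho$, $\omega$, $a$ and the position on the ring; `ring_tensor` is a hypothetical helper that builds $\rho\,\vec{U} \otimes \vec{U}$ from the 4-velocity used in 21 (b).

```python
import math

def ring_tensor(rho, omega, a, x, y):
    """rho U (x) U with U = (gamma, -gamma*omega*y, gamma*omega*x, 0), c = 1."""
    g = 1.0 / math.sqrt(1.0 - (omega * a) ** 2)
    U = [g, -g * omega * y, g * omega * x, 0.0]
    return [[rho * U[i] * U[j] for j in range(4)] for i in range(4)]

rho, omega, a = 1.0, 0.5, 1.0
angle = 0.7                                   # arbitrary point on the ring
x, y = a * math.cos(angle), a * math.sin(angle)
gamma_sq = 1.0 / (1.0 - (omega * a) ** 2)

T_ccw = ring_tensor(rho, omega, a, x, y)      # counter-clockwise ring
T_cw = ring_tensor(rho, -omega, a, x, y)      # clockwise ring
T_total = [[T_ccw[i][j] + T_cw[i][j] for j in range(4)] for i in range(4)]

print(T_total[0][0], 2.0 * rho * gamma_sq)    # energy density doubles
print(T_total[0][1], T_total[0][2])           # momentum densities cancel
```

The spatial stresses, e.g. $T^{xy} = -2\rho\gamma^2\omega^2 xy$, also come out doubled.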


22. (i) Here we must argue that a collection of noncolliding particles, like a galaxy, with random velocities with no preferred direction in the MCRF has the stress-energy tensor of a perfect fluid.

Here we must simply argue that heat conduction and viscosity are zero, so that the conditions of a perfect fluid are met. Heat conduction and momentum diffusion (viscosity) result from the net transfer of energy and momentum, respectively, due to particle motions. In classical fluid dynamics this results from the molecular motions having a preferred direction due to temperature gradients or momentum gradients. But here we are assuming a priori that there is no preferred direction in the MCRF (this must be the same MCRF for the entire system so that there are no gradients). So there can be no net transfer of energy or momentum due to the motion of the particles. Hence no heat transfer by conduction or momentum transfer by viscosity. With the conditions of a perfect fluid being met, the arguments of § 4.6 apply for the form of the stress-energy tensor.

This is interesting because the condition of no bias in any direction of the particle velocities (in the MCRF) is a statistical condition, applying in the mean. But if the velocities are truly randomly distributed, then we should expect random fluctuations about the mean, implying a random heat conduction and viscosity effect, albeit with a time mean of zero. In classical atmosphere/ocean fluid dynamics this is only recently being considered under the name of “stochastic parameterization”.

22. (ii) If all particles have the same speed $v$ and mass $m$ (let’s assume that’s rest mass), express $p$ and $\rho$ as functions of $m$, $v$, and $n$.

The magnitude of the momentum of a given particle is

$$m\gamma v,$$

so the momentum flux in a given direction is

$$\sum_i n v\cos(\theta_i)\,m\gamma v\cos(\theta_i) = \sum_i n m\gamma v^2\cos^2(\theta_i).$$

We make the continuum hypothesis and approach the problem statistically. Suppose a particle is at the origin. Because the direction of a given particle’s trajectory is random, the probability of leaving through a given piece of the unit sphere centred at the origin is proportional to the area of the piece. Dividing by the area of the sphere we get the fraction of the total solid angle. The fraction of solid angle in the strip of width $d\theta$ making an angle $\theta$ to the $z$-axis is $2\pi\sin(\theta)\,d\theta/4\pi = \sin(\theta)\,d\theta/2$. So the mean value of the momentum flux of one particle in the $z$-direction is

$$\begin{aligned}
m\gamma v^2\int_0^\pi \sin(\theta)\cos^2(\theta)\,\frac{d\theta}{2} &= \frac{m\gamma v^2}{2}\,\frac{-1}{3}\left[\cos^3(\theta)\right]_0^\pi \\
&= m\gamma v^2\,\frac{1}{3}.
\end{aligned} \tag{4.48}$$

Multiplying by the number density gives $p = n m\gamma v^2/3$, while the energy density is $\rho = n m\gamma$.

Originally I integrated only over half the sphere’s area because I mistakenly thought that only half the particles contribute to the momentum flux in a given direction, the other half fluxing momentum in the other direction. But this is wrong because the momentum flux goes like the square of the velocity – a particle moving in the negative $x$-direction carries negative momentum in the negative $x$-direction and thus results in a positive momentum flux! This was made clear in the previous problem, see 21 (c).
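The angular average is simple enough to check by brute force. The sketch below is my own (a plain midpoint rule; the function name and step count are arbitrary choices) and evaluates $\int \sin\theta\cos^2\theta\,d\theta/2$ over the full sphere and, for comparison, over only the upper half.

```python
import math

def mean_cos_sq(upper=math.pi, n_steps=10000):
    """Midpoint-rule estimate of the integral of sin(t) cos^2(t) / 2 from 0 to upper."""
    h = upper / n_steps
    total = 0.0
    for k in range(n_steps):
        theta = (k + 0.5) * h
        total += math.sin(theta) * math.cos(theta) ** 2 / 2.0 * h
    return total

full_sphere = mean_cos_sq()                    # the integral used in Eq. (4.48)
half_sphere = mean_cos_sq(upper=math.pi / 2)   # my original, mistaken choice
print(full_sphere, half_sphere)
```

The full-sphere value is $1/3$, giving the factor in Eq. (4.48); the half-sphere integral gives only $1/6$, which would underestimate the pressure by a factor of two.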

For a photon gas, say the energy of each photon is $E$ and the MCRF number density is $n$, so that

$$\rho = nE$$

The momentum of each photon is $E$, c.f. Eq. (2.37), and its speed is $c = 1$. So the momentum flux of a given photon in a given direction will be

$$\frac{E\,c}{3} = \frac{E}{3}$$

and the total momentum flux from all photons per unit volume will be

$$p = n\frac{E}{3} = \frac{\rho}{3}$$

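23. This question was confusing at first because $d^3x$ is not defined in the text. Is that $dx^1 dx^2 dx^3$ or $dx^0 dx^2 dx^3$ etc.? Below I just assumed it was the 3 spatial dimensions $d^3x = dx\,dy\,dz$, and found that it makes sense.

23 (a) Prove

$$\frac{\partial}{\partial t}\int T^{0\alpha}\,d^3x = 0$$

$$\begin{aligned}
0 &= T^{\alpha\beta}{}_{,\beta} && \text{Eq. (4.34)} \\
&= T^{\beta\alpha}{}_{,\beta} && \text{symmetry} \\
&= T^{0\alpha}{}_{,t} + T^{i\alpha}{}_{,i} && \text{expanding} \\
&= \int_\Omega \left(T^{0\alpha}{}_{,t} + T^{i\alpha}{}_{,i}\right) dx\,dy\,dz && \text{integrate over volume } \Omega \\
&= \int_\Omega \left(T^{0\alpha}{}_{,t} + T^{i\alpha}{}_{,i}\right) d^3x && \text{change notation} \\
\int_\Omega T^{0\alpha}{}_{,t}\,d^3x &= -\int_\Omega T^{i\alpha}{}_{,i}\,d^3x && \text{rearrange} \\
&= -\int_{\partial\Omega} T^{i\alpha}\,n_i\,d^2x && \text{3D version of Gauss’ theorem} \\
&= 0 && \text{choosing surface } \partial\Omega \text{ in region where } T^{\alpha\beta} = 0 \\
\partial_t\int_\Omega T^{0\alpha}\,d^3x &= 0 && \text{choosing time-independent } \partial\Omega
\end{aligned} \tag{4.49}$$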

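23 (b) Prove

$$\frac{\partial^2}{\partial t^2}\int T^{00}\,x^i x^j\,d^3x = 2\int T^{ij}\,d^3x$$

This is a bit trickier. I just played with the integrand such that it would give me terms that appeared in the above. Even if you guess incorrectly, after doing a few computations you’ll learn how things work and can then find the solution.

$$\begin{aligned}
\frac{\partial}{\partial x^\beta}\frac{\partial}{\partial x^\alpha}\left(T^{\beta\alpha}\,x^\mu x^\nu\right) &= \frac{\partial}{\partial x^\beta}\left(T^{\beta\alpha}{}_{,\alpha}\,x^\mu x^\nu + T^{\beta\alpha}\frac{\partial\, x^\mu x^\nu}{\partial x^\alpha}\right) \\
&= \frac{\partial}{\partial x^\beta}\left(T^{\beta\alpha}\frac{\partial\, x^\mu x^\nu}{\partial x^\alpha}\right) && \text{by Eq. (4.34)} \\
&= \frac{\partial}{\partial x^\beta}\left(T^{\alpha\beta}\frac{\partial\, x^\mu x^\nu}{\partial x^\alpha}\right) && \text{symmetry} \\
&= T^{\alpha\beta}\frac{\partial^2\, x^\mu x^\nu}{\partial x^\alpha \partial x^\beta} && \text{by Eq. (4.34)} \\
&= T^{\alpha\beta}\frac{\partial}{\partial x^\beta}\left(x^\mu\delta^\nu_\alpha + x^\nu\delta^\mu_\alpha\right) && \text{product rule of elementary calculus} \\
&= T^{\alpha\beta}\left(\delta^\mu_\beta\,\delta^\nu_\alpha + \delta^\nu_\beta\,\delta^\mu_\alpha\right) && \text{product rule of elementary calculus} \\
&= T^{\mu\nu} + T^{\nu\mu} \\
&= 2T^{\mu\nu} && \text{symmetry}
\end{aligned} \tag{4.50}$$

Now we just restrict attention to spatial ranges of these indices, $\mu = i$ and $\nu = j$, and we integrate over a 3D spatial volume $\Omega$ fixed in time so that temporal derivatives commute with the spatial integral (without Leibniz terms) and large enough that it entirely encloses the region in which the stress-energy tensor is non-zero:

$$\begin{aligned}
\frac{\partial}{\partial x^\beta}\frac{\partial}{\partial x^\alpha}\left(T^{\beta\alpha}\,x^i x^j\right) &= 2T^{ij} && \text{restricting above} \\
\int_\Omega \frac{\partial}{\partial x^\beta}\frac{\partial}{\partial x^\alpha}\left(T^{\beta\alpha}\,x^i x^j\right) d^3x &= \int_\Omega 2T^{ij}\,d^3x && \text{integrating} \\
\frac{\partial^2}{\partial t^2}\int_\Omega \left(T^{00}\,x^i x^j\right) d^3x + 2\int_\Omega \frac{\partial}{\partial t}\frac{\partial}{\partial x^k}\left(T^{0k}\,x^i x^j\right) d^3x + \int_\Omega \frac{\partial}{\partial x^k}\frac{\partial}{\partial x^l}\left(T^{kl}\,x^i x^j\right) d^3x &= \int_\Omega 2T^{ij}\,d^3x && \text{expanding}
\end{aligned} \tag{4.51}$$

The middle terms (those containing at least one spatial derivative) can be shown to be zero because they always involve an integral over a direction for which there is a spatial derivative, giving terms like, say for $k = 2$:

$$\int_\Omega \frac{\partial}{\partial y}\frac{\partial}{\partial x^l}\left(T^{yl}\,x^i x^j\right) dx\,dy\,dz = \int \left[\frac{\partial}{\partial x^l}\left(T^{yl}\,x^i x^j\right)\right]_{y_L}^{y_R} dx\,dz = 0$$

where $y_R$ and $y_L$ are the $y$ coordinates of the bounding surface on the right and left sides of the volume $\Omega$. But on this surface $\mathbf{T} = 0$ because we choose the surface to be outside the bounded region of non-zero $\mathbf{T}$.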


Maybe I could have used Gauss’ theorem again instead of the final argument above?

In any case, we’re left with just

$$\frac{\partial^2}{\partial t^2}\int_\Omega \left(T^{00}\,x^i x^j\right) d^3x = \int_\Omega 2T^{ij}\,d^3x \tag{4.52}$$

which was what we had to prove.

23 (c) Prove

∂2

∂t2

∫T 00 (xi xi)

2 d3x = 4

∫T ii x

j xj d3x+ 8

∫T ij xi xj d

3x

The solution to this problem proceeds much like in (b) but is a bit morecomplicated. The difficulty is mostly in guessing where to start, and inparticular what operator to apply to Tαβ. Again I chose something toogeneral and made life difficult for myself, but it soon became clear whereI should have started. These problems are actually much easier than theymight seem at first. Here is some “thinking out loud” that might help. Wesee on the left a second derivative wrt t so we want to start with a generaldouble derivative of Tαβ.

∂2

∂xαxβTαβ

As in parts (a) and (b), we anticipate exploiting the fact that these deriva-tives are zero for deriving the RHS. Clearly we have to multiple by somecombination of xµ and xν and there should be four of them. Some will ap-parently be eliminated somehow and somehow one of the indices of Tαβ willneed to be lowered but I had no clue how at this point. I started very general,with

∂2

∂xαxβTαβxµxνxγxσ

But after a few computations to see how things worked, it became clear Ionly needed the spatial components and that I only needed two pairs, thatis

∂2

∂xαxβTαβxixjxixj

Page 137: FirstCourseGR_notes_on_Schutz2009.pdf

Rob’s notes on Schutz 137

It wasn’t that tricky. It was obvious what needed to be done to get some ofthe terms on the desired RHS. It also became clear how to lower the indexof Tαβ. Let’s just proceed now with the computations.

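\begin{align}
&\frac{\partial^2}{\partial x^\alpha \partial x^\beta}\left( T^{\alpha\beta} x^i x^j x_i x_j \right) \notag\\
&= T^{\alpha\beta}\frac{\partial^2}{\partial x^\alpha \partial x^\beta}\left( x^i x^j x_i x_j \right) && \text{Eq. (4.34) and symmetry} \notag\\
&= T^{\alpha\beta}\frac{\partial}{\partial x^\beta}\left[\frac{\partial}{\partial x^\alpha}\left( x^i x^j\, \eta_{ki} x^k\, \eta_{lj} x^l \right)\right] && \text{raise indices so we can differentiate} \notag\\
&= T^{\alpha\beta}\eta_{ki}\eta_{lj}\frac{\partial}{\partial x^\beta}\left[\frac{\partial}{\partial x^\alpha}\left( x^i x^j x^k x^l \right)\right] && \text{metric is uniform in space} \notag\\
&= T^{\alpha\beta}\eta_{ki}\eta_{lj}\frac{\partial}{\partial x^\beta}\left[ x^i x^j x^k \delta^l_\alpha + x^i x^j \delta^k_\alpha x^l + x^i \delta^j_\alpha x^k x^l + \delta^i_\alpha x^j x^k x^l \right] && \text{product rule} \notag\\
&= T^{\alpha\beta}\eta_{ki}\eta_{lj}\Big\{ \delta^l_\alpha\big[ x^i x^j \delta^k_\beta + {\color{red} x^i \delta^j_\beta x^k} + \delta^i_\beta x^j x^k \big] + \delta^k_\alpha\big[ x^i x^j \delta^l_\beta + x^i \delta^j_\beta x^l + {\color{red}\delta^i_\beta x^j x^l} \big] \notag\\
&\qquad + \delta^j_\alpha\big[ {\color{red} x^i x^k \delta^l_\beta} + x^i \delta^k_\beta x^l + \delta^i_\beta x^k x^l \big] + \delta^i_\alpha\big[ x^j x^k \delta^l_\beta + {\color{red} x^j \delta^k_\beta x^l} + \delta^j_\beta x^k x^l \big] \Big\} \tag{4.53}
\end{align}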

The terms in red are those for which the two remaining factors of $x$, say $x^i x^k$ for the first one, carry indices that both sit within the same metric term, e.g. $\eta_{ki}$, as opposed to being spread across the two metric terms. Thus it’s only possible to lower one of the two indices. It doesn’t matter which one we lower (you’ll see in a second that, because of the symmetry of $T^{\alpha\beta}$, we can get what we want in the end either way). So the 4 red terms become

\[
4\, T^{i}{}_{i}\, x^j x_j
\]

where we have relabelled dummy indices. And the remaining 8 black terms become

\[
8\, T^{ij}\, x_i x_j
\]

After integrating over all space this gives the RHS. The LHS follows as it did in (b).
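
The counting of 4 and 8 terms is equivalent to the identity $\partial_k \partial_l (x^i x_i)^2 = 8\, x_k x_l + 4\, (x^i x_i)\, \delta_{kl}$ in Cartesian coordinates. The following is only a minimal numerical sketch of that identity, not part of the original solution: the function names, the sample point and the finite-difference step are all my own illustrative choices.

```python
import itertools

def r4(x):
    """(x^i x_i)^2 for a 3-vector x in Cartesian coordinates."""
    return sum(c * c for c in x) ** 2

def second_derivative(f, x, k, l, h=1e-3):
    """Central finite-difference estimate of d^2 f / dx^k dx^l."""
    def shifted(dk, dl):
        y = list(x)
        y[k] += dk * h
        y[l] += dl * h
        return f(y)
    return (shifted(1, 1) - shifted(1, -1)
            - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)

def analytic(x, k, l):
    """8 x_k x_l + 4 (x^i x_i) delta_kl: the 8 'black' plus 4 'red' terms."""
    r2 = sum(c * c for c in x)
    return 8 * x[k] * x[l] + (4 * r2 if k == l else 0.0)

point = [0.3, -0.7, 1.1]  # arbitrary sample point (assumption)
max_error = max(
    abs(second_derivative(r4, point, k, l) - analytic(point, k, l))
    for k, l in itertools.product(range(3), repeat=2)
)
print(max_error)  # small, limited only by the finite-difference step
```

Contracting the analytic expression with $T^{kl}$ reproduces the two integrands $8\,T^{ij}x_i x_j$ and $4\,T^{i}{}_{i}\,x^j x_j$.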

24. (a) Show that in the rest frame $O$ of a star of constant luminosity $L$ (total energy radiated per second), the stress-energy tensor of the radiation from the star at the event $(t, x, 0, 0)$ has components $T^{00} = T^{0x} = T^{x0} = T^{xx} = L/(4\pi x^2)$. The star sits at the origin.


Assume the star emits radiation isotropically. So a sphere of radius $x$ centred at the origin has radiation flowing out of it at a rate of $L$. The surface area of the sphere is $4\pi x^2$. The flux will be evenly distributed over the surface of the sphere by the assumption of isotropy. Thus the energy flux (energy per unit time per unit area) is everywhere of magnitude $L/(4\pi x^2)$. In particular, at event $(t, x, 0, 0)$ this flux is $T^{0x} = L/(4\pi x^2)$.

In time period $\delta t$ the energy flowing out of the sphere will be $L\,\delta t$, and this energy will fill a spherical shell of volume $(4\pi x^2)\, c\, \delta t = (4\pi x^2)\, \delta t$, since $c = 1$ in geometric units. Thus the energy density at a distance $x$ from the origin will be $T^{00} = L\, \delta t/(4\pi x^2\, \delta t) = L/(4\pi x^2)$.

By the symmetry properties of $\mathbf{T}$, we know that in general $T^{0x} = T^{x0}$. Finally, the energy flux is the photon flux, say $F_p$, times the energy per photon, $h\nu$,

\[
T^{0x} = F_p\, h\nu .
\]

And the momentum flux will be the photon flux times the momentum per photon, $h\kappa$,

\[
T^{xx} = F_p\, h\kappa = F_p\, \frac{h\nu}{c} = T^{0x}
\]

because again $c = 1$ in geometric units, c.f. p. 49.

24. (b) By drawing the world lines of photons emitted and absorbed at an event, I can only guess that “a null vector that separates the emission and reception of the radiation” means a null vector through the event that points in the direction of the emitted radiation and opposite to the received radiation. We are given that

\[
\vec{X} \to_O (x, x, 0, 0)
\]

is such a vector for the event $(x, x, 0, 0)$. Recall the radiation came from the origin of $O$, wherein the star is at rest, see (a). I don’t see that there’s anything to prove. I guess we’re just supposed to learn the above definition, which wasn’t explicitly provided.

To show that the stress-energy tensor has the given frame-invariant form, we must show that it is indeed frame-invariant, and that it reproduces the results of (a) in the MCRF. Vectors are frame-invariant, and so their outer product will form a $\binom{2}{0}$ tensor that is also frame-invariant. The radiation emitted per second, the luminosity, will depend upon reference frame (as we will have to discuss in part (c)), so we must assume that $L$ is the luminosity in the rest frame, even though this was not stated explicitly. Finally, the denominator contains an inner product of two four-vectors, and is also frame-invariant. So the given expression for $\mathbf{T}$ is frame-invariant.

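In the MCRF it is easy to verify that the expression given produces the results of (a):

\[
T^{00} = \frac{L}{4\pi}\frac{X^0 X^0}{(U_\alpha X^\alpha)^4} = \frac{L}{4\pi}\frac{x \cdot x}{(1 \cdot x)^4} = \frac{L}{4\pi x^2}. \tag{4.54}
\]

And

\[
T^{0x} = \frac{L}{4\pi}\frac{X^0 X^1}{(U_\alpha X^\alpha)^4} = \frac{L}{4\pi}\frac{x \cdot x}{(1 \cdot x)^4} = \frac{L}{4\pi x^2}, \tag{4.55}
\]

and

\[
T^{xx} = \frac{L}{4\pi}\frac{X^1 X^1}{(U_\alpha X^\alpha)^4} = \frac{L}{4\pi}\frac{x \cdot x}{(1 \cdot x)^4} = \frac{L}{4\pi x^2}. \tag{4.56}
\]

Furthermore,

\[
T^{0y} = \frac{L}{4\pi}\frac{X^0 X^2}{(U_\alpha X^\alpha)^4} = \frac{L}{4\pi}\frac{x \cdot 0}{(1 \cdot x)^4} = 0, \tag{4.57}
\]

and similarly for the remaining terms.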


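24. (c) Find $T^{\bar{0}\bar{x}}$ in a frame $\bar{O}$ moving at speed $v$ along the $x$-axis of $O$.

Applying the Lorentz transformation to $\mathbf{T}$, we first transform $\vec{X}$:

\begin{align}
\vec{X} &\to_{\bar{O}} (\gamma(1-v)x,\ \gamma(1-v)x,\ 0,\ 0), \notag\\
&\to_{\bar{O}} (R, R, 0, 0), \tag{4.58}
\end{align}

where $\gamma = 1/\sqrt{1-v^2}$, see p. 38, and we have found

\[
R = \gamma(1-v)\,x = \sqrt{\frac{1-v}{1+v}}\; x .
\]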

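The 4-velocity of the star in the Earth’s frame of reference is

\[
\vec{U}_s \to_{\bar{O}} (\gamma, -v\gamma, 0, 0)
\]

which gives $\vec{U}_s \cdot \vec{X} = -R\gamma(1+v)$. Using the frame-invariant expression for the stress-energy tensor we find in the Earth’s frame of reference:

\[
T^{\bar{0}\bar{x}} = \frac{L}{4\pi}\frac{X^{\bar{0}} X^{\bar{1}}}{(U_\alpha X^\alpha)^4} = \frac{L}{4\pi}\frac{R^2}{(R\gamma(1+v))^4} = \frac{L}{4\pi R^2}\frac{(1-v)^2}{(1+v)^2} \tag{4.59}
\]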
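
The algebra leading to (4.59) can be spot-checked numerically. The sketch below is only an illustration under my own assumptions (the function name `check_star_flux`, the sample values of $v$, $x$ and $L$, and the metric signature $(-,+,+,+)$ used for the inner product); it evaluates the frame-invariant expression in the moving frame and returns the quantities that appear in (4.58) and (4.59).

```python
import math

def check_star_flux(v, x=2.0, L=1.0):
    """Evaluate the frame-invariant stress-energy expression in the moving frame.

    Returns (R, T0x_moving, T0x_rest) in geometric units (c = 1).
    """
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    R = gamma * (1.0 - v) * x
    X = (R, R, 0.0, 0.0)                  # null vector in the moving frame
    U = (gamma, -v * gamma, 0.0, 0.0)     # star's 4-velocity in the moving frame
    # Minkowski inner product with signature (-, +, +, +)
    U_dot_X = -U[0] * X[0] + sum(u * c for u, c in zip(U[1:], X[1:]))
    T0x_moving = L / (4 * math.pi) * X[0] * X[1] / U_dot_X ** 4
    T0x_rest = L / (4 * math.pi * x * x)  # result of part (a)
    return R, T0x_moving, T0x_rest

v = 0.6
R, T0x_moving, T0x_rest = check_star_flux(v)
print(R, T0x_moving / T0x_rest)  # ratio should equal (1 - v) / (1 + v)
```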

Interpretation

Consider the emission of a photon from the star as event A at $(0, 0, 0, 0)$ in frame $O$, see Fig. 4.1. The photon travels along the $x$-axis, which is parallel to the $\bar{x}$-axis. Consider the absorption of the photon in a detector as event B at $(x, x, 0, 0)$ in frame $O$. Suppose that frame $\bar{O}$ coincides with frame $O$ at event A, so both frames observe event A at their origins. During the time period between A and B, $\Delta t = x - 0 = x$, the reference frame $\bar{O}$ has travelled a distance $v\,\Delta t = vx$ in the $x$-direction. So under a Galilean transformation the event B at $(x, x, 0, 0)$ in $O$ would occur at $\bar{x} = x - vt = x(1 - v)$ in the frame $\bar{O}$ moving at speed $v$ in the positive $x$-direction. Due to relativistic effects, this becomes $\bar{x} = x\gamma(1 - v) = R$. This is different from Lorentz contraction or time dilation because it is neither a ruler nor a clock’s reading that is being transformed between Lorentz frames, but the spatial part of a light path.

From the point of view of $\bar{O}$ the light emitted at event A spreads out in a spherical shell centred at the origin of $\bar{O}$, despite the fact that the star was moving when the light was emitted. In principle, this makes things simple! $T^{\bar{0}\bar{x}}$ is the flux of energy across a surface of constant $\bar{x} = R$. But note that the luminosity varies with reference frame. Suppose the star emits $N$ photons per second of mean energy $h\nu$ such that $L = Nh\nu$ in $O$, the MCRF of the star. The emission rate is a clock of sorts, and moving clocks run slowly, so in frame $\bar{O}$ the emission rate will be $N/\gamma$. But the radiation will also be red-shifted by the relativistic Doppler effect,

\[
\frac{\bar{\nu}}{\nu} = (1 - v)\gamma
\]

see Eq. (2.39) on p. 49. Thus the luminosity observed in $\bar{O}$ will be reduced:

\[
\bar{L} = \frac{N}{\gamma}\, h\nu\, (1 - v)\gamma = L(1 - v).
\]

Based on the above argument I would have anticipated

\[
T^{\bar{0}\bar{x}} = \frac{L}{4\pi R^2}(1 - v)
\]

but this is wrong by a factor of $(1 - v)/(1 + v)^2$!

A more accurate but only partial answer is to note, from the frame-invariant expression given in 24 (b), that the components of $\mathbf{T}$ only change because of the $\vec{X} \otimes \vec{X}$ term in the numerator, since the remaining factors are frame-invariant. For all four non-zero terms,

\[
(\vec{X} \otimes \vec{X})^{\bar{\alpha}\bar{\beta}} \to_{\bar{O}} R^2, \qquad (\bar{\alpha}, \bar{\beta}) \in \{\bar{0}, \bar{x}\}
\]

And $R/x = \gamma(1 - v) = \sqrt{\frac{1-v}{1+v}}$, which implies that the magnitude of the components of $\mathbf{T}$ change like

\[
\frac{T^{\bar{0}\bar{x}}}{T^{0x}} = \frac{R^2}{x^2} = \frac{1 - v}{1 + v}
\]

The same result can of course be obtained directly from the Lorentz transformation. Tensors are geometric objects and thus invariant under changes of reference frame, and the Lorentz transformation tells us how the components change. Applying this to $\mathbf{T}$ we find

\[
T^{\bar{\alpha}\bar{\beta}} = T^{\alpha\beta}\, \Lambda^{\bar{\alpha}}{}_{\alpha}\, \Lambda^{\bar{\beta}}{}_{\beta}
\]

which also gives

\[
\frac{T^{\bar{0}\bar{x}}}{T^{0x}} = \frac{R^2}{x^2} = \frac{1 - v}{1 + v}
\]

25. This problem is apparently not required to understand the remainder of the book so I’ve put it on hold.


Figure 4.1: Events A and B seen in the star’s reference frame ($x$-$t$ axes) and in a reference frame co-moving with Earth but with origin coinciding with the star at event A ($\bar{x}$-$\bar{t}$ axes), when radiation was emitted that reaches Earth a time $x$ later at event B.


Chapter 5

Preface to curvature


5.1 On the relation of gravitation to curvature

Bad choice of symbols in Eq. (5.1): $h$ is Planck’s constant when multiplied by frequency and $h$ is the height of the tower when multiplied by $g$.

5.2 Tensor calculus in polar coordinates

How did he get Eq. (5.28b) for the magnitudes of the one-form bases? Use Eq. (3.51), but adapted to 2D Cartesian space instead of Minkowski space,

\[
\tilde{p}^2 = p_1^2 + p_2^2 \tag{5.1}
\]

So e.g.

\[
|\tilde{d}r| = \sqrt{\cos^2\theta + \sin^2\theta} = 1 \quad \text{using Eq. (5.27) and (5.1)}
\]

Some typos:

• Eq. (5.28a), double == is just a typo.

• Typo middle p. 129, just before Eq. (5.54), the final $\mu$ should be superscript:

\[
V^\alpha{}_{;\alpha} = \frac{\partial V^\alpha}{\partial x^\alpha} + \Gamma^\alpha{}_{\mu\alpha} V^\mu
\]

• pp. 130 and 131, two separate equations labeled Eq. (5.64).

5.3 Christoffel symbols and the metric

Clarification: On p. 132, before Eq. (5.70), the superscripted α on the LHS of

Vα;β = gαµVµ;β

looks odd. But it is correct. It follows because we’re in Cartesian coordinates, wherein $g_{\alpha\mu} = \delta_{\alpha\mu}$.


5.4 Noncoordinate bases

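Verifying Eq. (5.78):

First we use Eqs. (5.22, 5.23) for $\vec{e}_r$ and $\vec{e}_\theta$, and Eq. (5.76) to obtain:

\begin{align}
\vec{e}_{\hat{r}} &= \cos\theta\, \vec{e}_x + \sin\theta\, \vec{e}_y, \notag\\
\vec{e}_{\hat{\theta}} &= -\sin\theta\, \vec{e}_x + \cos\theta\, \vec{e}_y, \tag{5.2}
\end{align}

and we use Eqs. (5.26, 5.27) for $\tilde{d}r$ and $\tilde{d}\theta$, and Eq. (5.77) to obtain:

\begin{align}
\tilde{d}\hat{r} &= \cos\theta\, \tilde{d}x + \sin\theta\, \tilde{d}y, \notag\\
\tilde{d}\hat{\theta} &= -\sin\theta\, \tilde{d}x + \cos\theta\, \tilde{d}y. \tag{5.3}
\end{align}

Taking the various dot products of basis vectors we find:

\begin{align}
\vec{e}_{\hat{r}} \cdot \vec{e}_{\hat{r}} &= \cos^2\theta + \sin^2\theta = 1, \notag\\
\vec{e}_{\hat{r}} \cdot \vec{e}_{\hat{\theta}} &= -\sin\theta\cos\theta + \sin\theta\cos\theta = 0, \notag\\
\vec{e}_{\hat{\theta}} \cdot \vec{e}_{\hat{\theta}} &= \cos^2\theta + \sin^2\theta = 1. \tag{5.4}
\end{align}

Doing the same for the one-form bases gives the analogous result. If one is uncomfortable taking dot products of one-form bases, recall Eq. (3.47) and Eq. (3.52).

\begin{align}
\tilde{d}\hat{r} \cdot \tilde{d}\hat{r} &= (\cos\theta\, \tilde{d}x + \sin\theta\, \tilde{d}y) \cdot (\cos\theta\, \tilde{d}x + \sin\theta\, \tilde{d}y), \notag\\
&= \cos^2\theta\; \tilde{d}x \cdot \tilde{d}x + \sin^2\theta\; \tilde{d}y \cdot \tilde{d}y + 2\cos\theta\sin\theta\; \tilde{d}x \cdot \tilde{d}y, \notag\\
&= \cos^2\theta + \sin^2\theta = 1. \tag{5.5}
\end{align}

Eq. (5.84) reduces to

\[
\frac{1}{\sqrt{x^2 + y^2}} \neq 0.
\]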

5.8 Exercises

1. Repeat the argument that led to Eq. (5.1) under more realistic assumptions: suppose a fraction $\varepsilon$ of the kinetic energy of the mass at the bottom can be converted into a photon and sent back up, the remaining energy staying at ground level in a useful form. Devise a perpetual motion engine if Eq. (5.1) is violated.

Taken literally, $\varepsilon$ is the fraction of the kinetic energy of the mass at the bottom, not the fraction of the total energy. The remaining energy then means the rest mass energy plus $(1 - \varepsilon)$ of the kinetic energy. Somehow I doubt that’s what Schutz meant, because it makes the problem needlessly complicated, but that’s what he said, so we’ll go with that interpretation. Conceptually it doesn’t really matter, since we’re supposed to just see that the Einstein thought experiment carries forward the same message even when inefficiencies are introduced.

Let’s introduce an index $i$ to keep track of the iterations of the mass falling and the photon propagating to the top of the tower. Say we start with mass $m_0$ at the top; it falls, gaining kinetic energy

\[
\frac{m_0 g h}{c^2} = m_0 g h
\]

where the constants are in geometric units so that $c = 1$ and $gh$ is dimensionless. For Earth conditions of course $gh \ll 1$. Of this kinetic energy only a fraction $\varepsilon$ is available for generating the photon at the bottom of the tower,

\[
\varepsilon m_0 g h = 2\pi\hbar\, \nu_0,
\]

while the remaining energy is accumulated, apparently in useful form, at the base of the tower:

\[
\bar{m}_0 = [(1 - \varepsilon) g h + 1]\, m_0 .
\]

The key assumption is that the radiation is unaffected by the gravitational field (in violation of Eq. (5.1)), yielding a photon at the top of the tower of the same energy as at the bottom:

\[
\varepsilon m_0 g h = 2\pi\hbar\, \nu_0 .
\]

Now this is converted into mass

\[
m_1 = 2\pi\hbar\, \nu_0 = m_0\, \varepsilon g h .
\]

This yields kinetic energy at the base of the tower:

\[
m_1 g h = m_0\, \varepsilon (g h)^2,
\]


of which the fraction $\varepsilon$ is available for the 2nd photon:

\[
\varepsilon m_1 g h = 2\pi\hbar\, \nu_1 = m_0 (\varepsilon g h)^2 .
\]

The remaining energy accumulates at the bottom:

\[
\bar{m}_1 = [(1 - \varepsilon) g h + 1]\, m_1 = [(1 - \varepsilon) g h + 1]\, m_0\, \varepsilon g h .
\]

The photon energy at the top of the tower is again taken to equal the photon energy at the base of the tower, yielding a new mass at the top of the tower:

\[
m_2 = 2\pi\hbar\, \nu_1 = m_0 (\varepsilon g h)^2 .
\]

At the bottom we generate another photon:

\[
2\pi\hbar\, \nu_2 = m_0 (\varepsilon g h)^3,
\]

and accumulate more mass:

\[
\bar{m}_2 = [(1 - \varepsilon) g h + 1]\, m_2 = [(1 - \varepsilon) g h + 1]\, (\varepsilon g h)^2\, m_0 .
\]

The process will repeat indefinitely. After $n + 1$ iterations we have accumulated this much mass at the base of the tower:

\[
\sum_{i=0}^{n} \bar{m}_i = \sum_{i=0}^{n} a\, r^i,
\]

with $a = [(1 - \varepsilon) g h + 1]\, m_0$ and $r = \varepsilon g h$. As $n \to \infty$, the accumulated mass approaches

\[
M = \frac{a}{1 - r} = \frac{[(1 - \varepsilon) g h + 1]\, m_0}{1 - \varepsilon g h},
\]

see Boas (1983, Eq. (1.8)) for the sum of an infinite geometric series. Assuming Earth-like values, $gh \ll 1$, and

\[
M \approx [(1 - \varepsilon) g h + 1]\, m_0\, (1 + \varepsilon g h) \approx m_0 (1 + g h).
\]

The accumulated mass is not much more than the starting mass, so the process is not an efficient way to create energy. However, we gained something for nothing and generated an infinite process. Clearly something is wrong, and in particular, it was the violation of Eq. (5.1) describing the gravitational


redshift. Einstein’s simple thought experiment is robust to the inclusion of inefficiencies.
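
The bookkeeping of this iteration is easy to mimic numerically. The following is a hedged sketch only: the function names `accumulated_mass` and `closed_form`, and the deliberately exaggerated value $gh = 10^{-3}$, are my own illustrative assumptions and not part of Schutz’s problem. It sums the mass left at the base of the tower cycle by cycle and compares with the geometric-series result $M = a/(1-r)$.

```python
def accumulated_mass(m0, eps, gh, iterations):
    """Sum the mass-energy left at the base after each fall (geometric units)."""
    total = 0.0
    m_top = m0
    for _ in range(iterations):
        # rest mass plus the unused fraction (1 - eps) of the kinetic energy
        total += ((1.0 - eps) * gh + 1.0) * m_top
        # photon of energy eps*m*gh arrives at the top undiminished (violating
        # Eq. (5.1)) and is converted into the next mass
        m_top = eps * gh * m_top
    return total

def closed_form(m0, eps, gh):
    """Limit of the geometric series, a / (1 - r)."""
    return ((1.0 - eps) * gh + 1.0) * m0 / (1.0 - eps * gh)

m0, eps, gh = 1.0, 0.5, 1e-3   # illustrative values (assumptions)
print(accumulated_mass(m0, eps, gh, 50), closed_form(m0, eps, gh), m0 * (1 + gh))
```

Because $r = \varepsilon g h \ll 1$ the series converges after only a few cycles, which is why the total is barely larger than $m_0(1 + gh)$.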

2. A uniform external gravitational field would contribute a uniform acceleration to the solid Earth and its fluid envelopes, engendering no relative motion between them.

3. (a) Consider the coordinate transformation $(x, y) \to (\xi, \eta)$ with $\xi = x$ and $\eta = 1$. Note that $\partial\eta/\partial x = 0$ and $\partial\eta/\partial y = 0$. This violates Eq. (5.6), implying that this coordinate transformation is not good. In fact this same example was worked out on p. 118, complete with an example of a distinct pair of points $(x, y)$ having the same $(\xi, \eta)$ coordinates.

3. (b) Are the following coordinate transformations good ones? Compute the Jacobian and list any points where the transformations fail.

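(i) $\xi = (x^2 + y^2)^{1/2}$, $\eta = \arctan(y/x)$. This is of course Eq. (5.3), the polar coordinate transformation.

\begin{align}
\frac{\partial \xi}{\partial x} &= \frac{x}{\sqrt{x^2 + y^2}}, &
\frac{\partial \xi}{\partial y} &= \frac{y}{\sqrt{x^2 + y^2}}, \notag\\
\frac{\partial \eta}{\partial x} &= \frac{-y}{x^2 + y^2}, &
\frac{\partial \eta}{\partial y} &= \frac{x}{x^2 + y^2}. \tag{5.6}
\end{align}

The determinant is $1/\sqrt{x^2 + y^2}$, so I believe the only problem is at the origin, where $r = 0$ and the derivatives above are undefined.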

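(ii) $\xi = \ln(x)$, $\eta = y$.

\begin{align}
\frac{\partial \xi}{\partial x} &= \frac{1}{x}, &
\frac{\partial \xi}{\partial y} &= 0, \notag\\
\frac{\partial \eta}{\partial x} &= 0, &
\frac{\partial \eta}{\partial y} &= 1. \tag{5.7}
\end{align}

The determinant is $1/x$, so the transformation fails along the line $x = 0$ and, of course, for all $x \le 0$, where $\xi = \ln(x)$ is undefined.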

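(iii) $\xi = \arctan(y/x)$, $\eta = (x^2 + y^2)^{-1/2}$.

\begin{align}
\frac{\partial \xi}{\partial x} &= \frac{-y}{x^2 + y^2}, &
\frac{\partial \xi}{\partial y} &= \frac{x}{x^2 + y^2}, \notag\\
\frac{\partial \eta}{\partial x} &= \frac{-x}{(x^2 + y^2)^{3/2}}, &
\frac{\partial \eta}{\partial y} &= \frac{-y}{(x^2 + y^2)^{3/2}}. \tag{5.8}
\end{align}

The determinant is $1/(x^2 + y^2)^{3/2}$, so I believe the only problems are at the origin, where the derivatives above are undefined, and as $x^2 + y^2 \to \infty$, where the determinant vanishes.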
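
The three Jacobian determinants quoted above can be checked by finite differences. This is a rough sketch under my own assumptions (the helper name `jacobian_det`, the step size, and a sample point with $x > 0$ so that $\arctan(y/x)$ and $\ln x$ are well defined); it is not meant as a general-purpose tool.

```python
import math

def jacobian_det(xi, eta, x, y, h=1e-6):
    """Central-difference estimate of d(xi, eta)/d(x, y)."""
    dxi_dx = (xi(x + h, y) - xi(x - h, y)) / (2 * h)
    dxi_dy = (xi(x, y + h) - xi(x, y - h)) / (2 * h)
    deta_dx = (eta(x + h, y) - eta(x - h, y)) / (2 * h)
    deta_dy = (eta(x, y + h) - eta(x, y - h)) / (2 * h)
    return dxi_dx * deta_dy - dxi_dy * deta_dx

# (xi, eta, claimed determinant) for cases (i), (ii), (iii)
cases = {
    "i": (lambda x, y: math.hypot(x, y),
          lambda x, y: math.atan(y / x),
          lambda x, y: 1.0 / math.hypot(x, y)),
    "ii": (lambda x, y: math.log(x),
           lambda x, y: y,
           lambda x, y: 1.0 / x),
    "iii": (lambda x, y: math.atan(y / x),
            lambda x, y: 1.0 / math.hypot(x, y),
            lambda x, y: 1.0 / math.hypot(x, y) ** 3),
}

x0, y0 = 1.2, 0.7   # sample point with x > 0 (assumption)
errors = {
    name: abs(jacobian_det(xi, eta, x0, y0) - claimed(x0, y0))
    for name, (xi, eta, claimed) in cases.items()
}
print(errors)
```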

4. A curve is defined by $x = f(\lambda)$, $y = g(\lambda)$, $0 \le \lambda \le 1$. Show that the tangent vector $(dx/d\lambda, dy/d\lambda)$ does actually lie tangent to the curve.

The slope of the tangent to the curve is

\begin{align}
\lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} &= \lim_{\Delta\lambda \to 0} \frac{\Delta\lambda\; dy/d\lambda}{\Delta\lambda\; dx/d\lambda} \notag\\
&= \frac{dy/d\lambda}{dx/d\lambda}, \tag{5.9}
\end{align}

which is also the slope of the tangent vector.

5. Sketch the curves, compare paths, find tangent vectors when parameter is nil. The computations in this exercise are a bit trivial but still I found it instructive. If one likes this sort of approach, one might like the text by Faber (1983). The plots can be found in the accompanying Maple™ file schutz2009_ch5.mw.

(a)

\begin{align}
x &= \sin\lambda \notag\\
y &= \cos\lambda \tag{5.10}
\end{align}

This is a unit circle centred at the origin. When $\lambda = 0$ the tangent vector is at $(0, 1)$, points in the $x$-direction and has unit length.

(b)

\begin{align}
x &= \cos(2\pi t^2) \notag\\
y &= \sin(2\pi t^2 + \pi) \tag{5.11}
\end{align}

This is a unit circle centred at the origin, as in (a). The tangent vector is a bit subtle.

\begin{align}
\dot{x} &= -\sin(2\pi t^2)\, 4\pi t \notag\\
\dot{y} &= \cos(2\pi t^2 + \pi)\, 4\pi t \tag{5.12}
\end{align}

When $t = 0$ the tangent vector is the zero vector. But strangely we can still identify its direction! The angle of the tangent vector to the $x$-axis is given by

\[
\cot\theta = \frac{\dot{x}}{\dot{y}} = -\frac{\sin(2\pi t^2)}{\cos(2\pi t^2 + \pi)}
\]

When $t = 0$ it’s clear that $\theta = \pm\pi/2$. And we can decide the sign as follows. Let $t = \epsilon$, a small and positive parameter. Then $x$ is close to unity but slightly less, and $y$ is close to zero but negative. Taking $\epsilon \to 0$ moves the point to $(1, 0)$, so the tangent vector points to $y = -\infty$. That is, $\theta = -\pi/2$.


(c)

\begin{align}
x &= s \notag\\
y &= s + 4 \tag{5.13}
\end{align}

The path is a straight line with $y$ intercept at $y = 4$. The tangent vector is uniform, $(1, 1)$.

(d)

\begin{align}
x &= s^2 \notag\\
y &= -(s - 2)(s + 2) = 4 - s^2 \tag{5.14}
\end{align}

The path is a straight line with $y$ intercept at $y = 4$, as in (c), but now we’re restricted to $x \ge 0$. The tangent vector is not uniform but depends upon $s$:

\begin{align}
\dot{x} &= 2s \notag\\
\dot{y} &= -2s \tag{5.15}
\end{align}

As in (b) the tangent vector is the zero vector at $s = 0$, but again we can still define its direction. In this case the direction is uniform (although the magnitude is not): $(1, -1)$ or $\theta = -\pi/4$.

(e)

\begin{align}
x &= \mu \notag\\
y &= 1 \tag{5.16}
\end{align}

The path is a straight line with $y$ intercept at $y = 1$. The tangent vector is uniform:

\begin{align}
\dot{x} &= 1 \notag\\
\dot{y} &= 0 \tag{5.17}
\end{align}

6. Justify the basis vectors and one-forms in Fig. 5.5.

The $\vec{e}_r$ vectors should point away from the origin, and be of the same length regardless of position because $|\vec{e}_r| = 1$, c.f. Eq. (5.28b).

The $\vec{e}_\theta$ vectors should be orthogonal to $\vec{e}_r$, and point in the anticlockwise direction about the origin. The length should increase linearly with distance from the origin, c.f. Eq. (5.28a).

The $\tilde{d}r$ basis should be surfaces tangent to a curve of constant $r$, so orthogonal to $\vec{e}_r$, which point away from the origin. The “amplitude” (indicated by the spacing of the lines) should be constant regardless of position because $|\tilde{d}r| = 1$, c.f. Eq. (5.28b) or (5.1) in my notes of § 5.3.

The $\tilde{d}\theta$ basis should be surfaces tangent to a curve of constant $\theta$, so orthogonal to $\vec{e}_\theta$. The amplitude (weaker amplitude indicated by greater spacing of the lines) should decrease with distance from the origin, $|\tilde{d}\theta| = r^{-1}$, c.f. Eq. (5.28b).

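7. Let primed indices indicate polar coordinates and unprimed Cartesian. Find $\Lambda^{\alpha'}{}_{\beta}$ and $\Lambda^{\mu}{}_{\nu'}$.

Let’s start with $\Lambda^{\mu}{}_{\nu'}$. Using Eq. (5.3) for the coordinates $(x, y)$ in terms of polar coordinate variables $(r, \theta)$, we calculate the terms of the transformation given in Eq. (5.13):

\[
(\Lambda^{\mu}{}_{\nu'}) = \begin{bmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{bmatrix}. \tag{5.18}
\]

Slightly more awkward is $\Lambda^{\alpha'}{}_{\beta}$. Use Eq. (5.8) with $\xi = r$ and $\eta = \theta$. Note that for the second row we must differentiate arctan:

\[
\frac{\partial \eta}{\partial x} = \frac{\partial \theta}{\partial x} = \frac{\partial \arctan(y/x)}{\partial x}. \tag{5.19}
\]

I found it simpler to write $\tan\theta = y/x$ and differentiate both sides with respect to $x$:

\begin{align}
\frac{\partial \tan\theta}{\partial x} &= \frac{d \tan\theta}{d\theta}\frac{\partial \theta}{\partial x}, \notag\\
&= \frac{1}{\cos^2\theta}\frac{\partial \theta}{\partial x}. \tag{5.20}
\end{align}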


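Now we solve for

\begin{align}
\frac{\partial \theta}{\partial x} &= \cos^2\theta\, \frac{\partial \tan\theta}{\partial x} \notag\\
&= \cos^2\theta\, \frac{\partial (y/x)}{\partial x} \notag\\
&= \cos^2\theta \left( \frac{-y}{x^2} \right) = \frac{-y}{x^2 + y^2}. \tag{5.21}
\end{align}

And similarly for

\begin{align}
\frac{\partial \theta}{\partial y} &= \cos^2\theta\, \frac{\partial \tan\theta}{\partial y} \notag\\
&= \frac{x}{x^2 + y^2}. \tag{5.22}
\end{align}

Finally we arrive at:

\[
(\Lambda^{\alpha'}{}_{\beta}) = \begin{bmatrix} \dfrac{x}{\sqrt{x^2 + y^2}} & \dfrac{y}{\sqrt{x^2 + y^2}} \\[2ex] \dfrac{-y}{x^2 + y^2} & \dfrac{x}{x^2 + y^2} \end{bmatrix} \tag{5.23}
\]

As a check we can multiply the two transformations together to confirm that they are indeed a pair of inverses. To do this we must choose a common set of variables. Let’s use $(r, \theta)$. It’s straightforward to convert

\begin{align}
(\Lambda^{\alpha'}{}_{\beta}) &= \begin{bmatrix} \dfrac{x}{\sqrt{x^2 + y^2}} & \dfrac{y}{\sqrt{x^2 + y^2}} \\[2ex] \dfrac{-y}{x^2 + y^2} & \dfrac{x}{x^2 + y^2} \end{bmatrix}, \notag\\
&= \begin{bmatrix} \cos\theta & \sin\theta \\ -\dfrac{\sin\theta}{r} & \dfrac{\cos\theta}{r} \end{bmatrix}. \tag{5.24}
\end{align}

And then we find their product gives the identity matrix,

\begin{align}
(\Lambda^{\alpha'}{}_{\beta})(\Lambda^{\beta}{}_{\gamma'}) &= \begin{bmatrix} \cos\theta & \sin\theta \\ -\dfrac{\sin\theta}{r} & \dfrac{\cos\theta}{r} \end{bmatrix} \begin{bmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{bmatrix}, \notag\\
&= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}. \tag{5.25}
\end{align}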
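
The inverse check (5.25) can also be done numerically at an arbitrary point. The sketch below is illustrative only; the function names `to_polar_matrix`, `to_cartesian_matrix` and `matmul`, and the sample values of $r$ and $\theta$, are my own assumptions.

```python
import math

def to_polar_matrix(r, theta):
    """Lambda^{alpha'}_beta of result (5.24): Cartesian -> polar components."""
    return [[math.cos(theta), math.sin(theta)],
            [-math.sin(theta) / r, math.cos(theta) / r]]

def to_cartesian_matrix(r, theta):
    """Lambda^mu_{nu'} of result (5.18): polar -> Cartesian components."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta), r * math.cos(theta)]]

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

r, theta = 1.7, 0.4   # arbitrary sample point (assumption)
product = matmul(to_polar_matrix(r, theta), to_cartesian_matrix(r, theta))
print(product)  # should be the 2x2 identity up to rounding
```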


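8. (a) Let $f = x^2 + y^2 + 2xy$, and consider two vectors in Cartesian coordinates,

\begin{align}
\vec{V} &\to_c (x^2 + 3y,\ y^2 + 3x) \notag\\
\vec{W} &\to_c (1, 1). \tag{5.26}
\end{align}

Find $f$ as a function of $r$ and $\theta$, and find the components of the two vectors on the polar basis.

From Eq. (5.3) we find immediately that

\[
f = r^2 + 2r^2 \cos\theta\sin\theta , \quad \text{by simple substitution.} \tag{5.27}
\]

For the vectors, we need to express the Cartesian components in terms of $(r, \theta)$, and we need the transformation matrix $\Lambda^{\alpha'}{}_{\beta}$, where prime refers to the polar coordinates and unprimed the Cartesian, as in problem 7. The former is

\begin{align}
\vec{V} &\to_c (x^2 + 3y,\ y^2 + 3x) \notag\\
&\to_c (r^2\cos^2\theta + 3r\sin\theta,\ r^2\sin^2\theta + 3r\cos\theta) , \quad \text{by substitution.} \tag{5.28}
\end{align}

These are still Cartesian components, but expressed as a function of $(r, \theta)$. Using the transformation matrix from problem 7 we have

\begin{align}
\vec{V} &\to_p \begin{bmatrix} \cos\theta & \sin\theta \\ -\dfrac{\sin\theta}{r} & \dfrac{\cos\theta}{r} \end{bmatrix} \begin{pmatrix} r^2\cos^2\theta + 3r\sin\theta \\ r^2\sin^2\theta + 3r\cos\theta \end{pmatrix} \notag\\
\vec{V} &\to_p \begin{pmatrix} r^2(\cos^3\theta + \sin^3\theta) + 6r\sin\theta\cos\theta \\ -r\cos^2\theta\sin\theta + r\sin^2\theta\cos\theta + 3(\cos^2\theta - \sin^2\theta) \end{pmatrix}. \tag{5.29}
\end{align}

And similarly for $\vec{W}$, again using the transformation matrix from problem 7, we have simply:

\begin{align}
\vec{W} &\to_p \begin{bmatrix} \cos\theta & \sin\theta \\ -\dfrac{\sin\theta}{r} & \dfrac{\cos\theta}{r} \end{bmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix} \notag\\
\vec{W} &\to_p \begin{pmatrix} \sin\theta + \cos\theta \\ -\dfrac{\sin\theta}{r} + \dfrac{\cos\theta}{r} \end{pmatrix}. \tag{5.30}
\end{align}

The components of the simple vector $\vec{W}$ are a function of $(r, \theta)$ because the basis vectors change with position, c.f. Eqs. (5.22) & (5.23).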


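8. (b) Find the components of the one-form $\tilde{d}f$ in Cartesian coordinates.

Solution: Using Eq. (5.10) we find

\[
\tilde{d}f \to_c \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right) \to_c (2x + 2y,\ 2x + 2y). \tag{5.31}
\]

(i) Find the components of the one-form $\tilde{d}f$ in polar coordinates by direct computation.

We use result (5.27) above and Eq. (5.10),

\begin{align}
\tilde{d}f &\to_p \left( \frac{\partial f}{\partial r}, \frac{\partial f}{\partial \theta} \right), \quad \text{Eq. (5.10)} \notag\\
&\to_p \left( 2r + 4r\cos\theta\sin\theta,\ 2r^2(\cos^2\theta - \sin^2\theta) \right). \tag{5.32}
\end{align}

(ii) Find the components of the one-form $\tilde{d}f$ in polar coordinates by transforming the Cartesian components.

We use of course Eq. (5.14) in general to relate the polar and Cartesian components of a one-form. The result (5.31) above gives the Cartesian components, and the transformation is in general Eq. (5.13) and was found above in result (5.18).

\begin{align}
(\tilde{d}f)_{\beta'} &= (\tilde{d}f)_{\alpha}\, \Lambda^{\alpha}{}_{\beta'} , \quad \text{Eq. (5.14)} \notag\\
&= \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right) \begin{bmatrix} \dfrac{\partial x}{\partial r} & \dfrac{\partial x}{\partial \theta} \\[2ex] \dfrac{\partial y}{\partial r} & \dfrac{\partial y}{\partial \theta} \end{bmatrix} \notag\\
&= (2x + 2y,\ 2x + 2y) \begin{bmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{bmatrix} \notag\\
&= \left( 2r(\cos\theta + \sin\theta),\ 2r(\cos\theta + \sin\theta) \right) \begin{bmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{bmatrix} \notag\\
&= \left( 2r + 4r\cos\theta\sin\theta,\ 2r^2(\cos^2\theta - \sin^2\theta) \right), \tag{5.33}
\end{align}

which agrees, of course, with the result (5.32).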


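8. (c) (i) Find the components of the one-forms $\tilde{V}$ and $\tilde{W}$ in polar coordinates using the metric tensor.

The metric tensor in polar coordinates was given in Eq. (5.32). We found $\vec{V}$ in polar coordinates in problem 8(a), see result (5.29).

\begin{align}
V_\alpha &= g_{\alpha\beta} V^\beta \quad \text{c.f. Eq. (3.39)} \notag\\
&= \begin{bmatrix} 1 & 0 \\ 0 & r^2 \end{bmatrix} \begin{pmatrix} V^r \\ V^\theta \end{pmatrix} = (V^r,\ r^2 V^\theta) \notag\\
&= \begin{pmatrix} r^2(\cos^3\theta + \sin^3\theta) + 6r\sin\theta\cos\theta \\ -r^3\cos^2\theta\sin\theta + r^3\sin^2\theta\cos\theta + 3r^2(\cos^2\theta - \sin^2\theta) \end{pmatrix}. \tag{5.34}
\end{align}

For $\vec{W}$ in polar coordinates we use result (5.30):

\begin{align}
W_\alpha &= g_{\alpha\beta} W^\beta \quad \text{c.f. Eq. (3.39)} \notag\\
&= \begin{bmatrix} 1 & 0 \\ 0 & r^2 \end{bmatrix} \begin{pmatrix} W^r \\ W^\theta \end{pmatrix} = \begin{pmatrix} \sin\theta + \cos\theta \\ -r\sin\theta + r\cos\theta \end{pmatrix}. \tag{5.35}
\end{align}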

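8. (c) (ii) Find the components of the one-forms $\tilde{V}$ and $\tilde{W}$ in polar coordinates using the transformation from Cartesian components.

Because the metric tensor in Cartesian components corresponds to the identity matrix, the components of the one-forms are identical to those of the vectors:

\begin{align}
V_\alpha &= g_{\alpha\beta} V^\beta \quad \text{c.f. Eq. (3.39)} \notag\\
V_\alpha &= \delta_{\alpha\beta} V^\beta , \quad \text{used Eq. (5.29)} \notag\\
V_\alpha &\to_c V^\alpha , \quad \text{also explained p. 131 bottom,} \tag{5.36}
\end{align}

so we immediately have the components of $\tilde{V}$ and $\tilde{W}$ in Cartesian components. We merely have to transform these to polar coordinates.

\begin{align}
V_{\alpha'} &= V_\beta\, \Lambda^{\beta}{}_{\alpha'} \quad \text{Eq. (5.14) or Eq. (3.16)} \notag\\
&= (r^2\cos^2\theta + 3r\sin\theta,\ r^2\sin^2\theta + 3r\cos\theta) \begin{bmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & +r\cos\theta \end{bmatrix} \notag\\
&= \big( r^2(\cos^3\theta + \sin^3\theta) + 6r\sin\theta\cos\theta, \notag\\
&\qquad -r^3(\cos^2\theta\sin\theta - \sin^2\theta\cos\theta) + 3r^2(\cos^2\theta - \sin^2\theta) \big), \tag{5.37}
\end{align}

which agrees with result (5.34). And similarly for the other one-form:

\begin{align}
W_{\alpha'} &= W_\beta\, \Lambda^{\beta}{}_{\alpha'} \notag\\
&= (1, 1) \begin{bmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & +r\cos\theta \end{bmatrix} \notag\\
&= \left( \cos\theta + \sin\theta,\ r(\cos\theta - \sin\theta) \right) \tag{5.38}
\end{align}

which agrees with the result (5.35).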

9. Draw a figure to explain Eqs. (5.38a, b).

Recall the Eqs. (5.38a, b):

\begin{align}
\frac{\partial \vec{e}_\theta}{\partial r} &= \frac{1}{r}\vec{e}_\theta, \notag\\
\frac{\partial \vec{e}_\theta}{\partial \theta} &= -r\, \vec{e}_r. \notag
\end{align}

We see that changing $r$ does not change the direction of the polar coordinate basis vectors. But $\vec{e}_\theta$ does change in magnitude, since it must increase in length as one moves further from the origin, albeit more slowly the farther one is from the origin, see Fig. 5.1.

Changing $\theta$ on the other hand does change the orientation of the basis vectors. Increasing $\theta$ when one is in the first quadrant, for instance, results in $\vec{e}_\theta$ pointing more toward the $-x$-direction, see Fig. 5.2.

10. Prove that $\nabla\vec{V}$ defined in Eq. (5.52) is a $\binom{1}{1}$ tensor.


Figure 5.1: Moving from point A to B (a displacement $\Delta r$) we see the basis vector $\vec{e}_\theta$ increases in length but doesn’t change direction. For larger $r$ the relative change is smaller. Plot was partly generated with Maple™, see accompanying file schutz2009_ch5.mw.

First a note on terminology. One might complain that the RHS below

\[
(\nabla\vec{V})^\alpha{}_\beta \equiv V^\alpha{}_{;\beta} \tag{5.39}
\]

is not a tensor because it lacks a basis. In Chapter 3, Schutz was careful to include the basis functions but here they were lost. While this complaint is valid, Schutz is following tradition here by calling the components of a tensor “a tensor” for short. Hobson et al. (2009) were explicit about using such an abbreviation and noted that it is commonly done. In Schutz’s partial defence here, in Eq. (5.52) he enclosed the tensor on the LHS in parentheses and pulled off the $\alpha$ and $\beta$ components explicitly – recall the discussion in my notes on chapter 2 §2.2 of this notation introduced without explanation.

Going back to the subsection The covariant derivative of § 5.3, if we take the gradient of Eq. (5.43) instead of just the derivative, then instead of Eq. (5.46) we obtain:

\[
\tilde{d}x^\beta \frac{\partial \vec{V}}{\partial x^\beta} = \tilde{d}x^\beta \left( \frac{\partial V^\alpha}{\partial x^\beta}\vec{e}_\alpha + V^\alpha \Gamma^\mu{}_{\alpha\beta}\, \vec{e}_\mu \right). \tag{5.40}
\]


Figure 5.2: Moving from point A to B (a displacement $\Delta\theta$) we see the basis vector $\vec{e}_\theta$ changes direction, in this case pointing more toward the $-x$-direction, but doesn’t change in length. Plot was partly generated with Maple™, see accompanying file schutz2009_ch5.mw.

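In other words, I’ve just included the one-form basis. Continuing the development on p. 128 leads up to Eq. (5.51), but with the one-form basis:

\begin{align}
\tilde{d}x^\beta \frac{\partial \vec{V}}{\partial x^\beta} &= \tilde{d}x^\beta\, V^\alpha{}_{;\beta}\, \vec{e}_\alpha \notag\\
&= \tilde{d}x^\beta \otimes \vec{e}_\alpha\, V^\alpha{}_{;\beta}. \tag{5.41}
\end{align}

To completely prove this is a $\binom{1}{1}$ tensor would include showing that it is invariant under a change of basis. This is easy if we restrict ourselves to Cartesian bases, for then we don’t have to deal with the Christoffel symbols. Consider frame $\bar{O}$, related to frame $O$ as follows:

\begin{align}
V^{\bar{\alpha}} &= \Lambda^{\bar{\alpha}}{}_{\alpha} V^{\alpha} \notag\\
\vec{e}_{\bar{\alpha}} &= \Lambda^{\alpha'}{}_{\bar{\alpha}}\, \vec{e}_{\alpha'} \notag\\
\tilde{d}x^{\bar{\beta}} &= \Lambda^{\bar{\beta}}{}_{\beta}\, \tilde{d}x^{\beta} \notag\\
\frac{\partial}{\partial x^{\bar{\beta}}} &= \frac{\partial x^{\beta'}}{\partial x^{\bar{\beta}}}\frac{\partial}{\partial x^{\beta'}} = \Lambda^{\beta'}{}_{\bar{\beta}}\frac{\partial}{\partial x^{\beta'}}, \tag{5.42}
\end{align}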


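where overbar indices indicate components in the $\bar{O}$ frame, and those without overbar indicate components in the $O$ frame regardless of whether or not they have primes. We simply substitute these into the expression for the gradient of the velocity in the frame $\bar{O}$:

\begin{align}
\tilde{d}x^{\bar{\beta}} \frac{\partial \vec{V}}{\partial x^{\bar{\beta}}} &= \tilde{d}x^{\bar{\beta}} \left( \frac{\partial V^{\bar{\alpha}}}{\partial x^{\bar{\beta}}}\vec{e}_{\bar{\alpha}} \right), \notag\\
&= \Lambda^{\bar{\beta}}{}_{\beta}\, \tilde{d}x^{\beta}\; \Lambda^{\bar{\alpha}}{}_{\alpha}\, \Lambda^{\beta'}{}_{\bar{\beta}}\, \Lambda^{\alpha'}{}_{\bar{\alpha}} \left( \frac{\partial V^{\alpha}}{\partial x^{\beta'}}\vec{e}_{\alpha'} \right), \notag\\
&= \delta^{\beta'}{}_{\beta}\, \tilde{d}x^{\beta}\; \delta^{\alpha'}{}_{\alpha} \left( \frac{\partial V^{\alpha}}{\partial x^{\beta'}}\vec{e}_{\alpha'} \right), \notag\\
&= \tilde{d}x^{\beta} \left( \frac{\partial V^{\alpha}}{\partial x^{\beta}}\vec{e}_{\alpha} \right). \tag{5.43}
\end{align}

Voilà! It transforms properly such that it is frame invariant, at least for frames with a Cartesian basis.

[Schutz’s solution focuses on showing linearity.]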

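11. For $\vec{V}$ from Exer. 8 above, find:

(a) $V^\alpha{}_{,\beta}$ in Cartesian coordinates.

\begin{align}
V^\alpha{}_{,\beta} &\to_c \begin{bmatrix} \dfrac{\partial V^x}{\partial x} & \dfrac{\partial V^x}{\partial y} \\[2ex] \dfrac{\partial V^y}{\partial x} & \dfrac{\partial V^y}{\partial y} \end{bmatrix}, \notag\\
&= \begin{bmatrix} 2x & 3 \\ 3 & 2y \end{bmatrix}, \notag\\
&= \begin{bmatrix} 2r\cos\theta & 3 \\ 3 & 2r\sin\theta \end{bmatrix}. \tag{5.44}
\end{align}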

11. (b) The transformation to polar coordinates.

Transformation of vectors between different coordinates was explained in § 2.2 and § 5.2. Perhaps it was not thoroughly explained how to transform tensors. This exercise then is very important for clarifying that. The key ingredient is that one needs to apply a transformation matrix for each index or rank of the tensor – Eq. (5.8) for the superscript indices (what other books call the contravariant components) and Eq. (5.13) for the subscript indices (what other books call the covariant components).

\begin{align}
\Lambda^{\mu'}{}_{\alpha}\, V^\alpha{}_{,\beta}\, \Lambda^{\beta}{}_{\nu'} &= \begin{bmatrix} \cos\theta & \sin\theta \\ -\dfrac{\sin\theta}{r} & \dfrac{\cos\theta}{r} \end{bmatrix} \begin{bmatrix} 2r\cos\theta & 3 \\ 3 & 2r\sin\theta \end{bmatrix} \begin{bmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{bmatrix} \notag\\
&= \begin{bmatrix} A & B \\ C & D \end{bmatrix} \tag{5.45}
\end{align}

where

\begin{align}
A &= 2r(\cos^3\theta + \sin^3\theta) + 3\sin 2\theta \notag\\
B &= 2r^2(-\cos^2\theta\sin\theta + \sin^2\theta\cos\theta) + 3r\cos 2\theta \notag\\
C &= 2(-\cos^2\theta\sin\theta + \sin^2\theta\cos\theta) + \frac{3}{r}\cos 2\theta \notag\\
D &= 2r(\cos^2\theta\sin\theta + \sin^2\theta\cos\theta) - 3\sin 2\theta \tag{5.46}
\end{align}

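11. (c) $V^{\mu'}{}_{;\nu'}$ directly in polar coordinates.

First we need the velocity field in polar coordinates. We use the result (5.29) found in problem 8(a) above, which was:

\begin{align}
\vec{V} &\to_p \begin{bmatrix} \cos\theta & \sin\theta \\ -\dfrac{\sin\theta}{r} & \dfrac{\cos\theta}{r} \end{bmatrix} \begin{pmatrix} r^2\cos^2\theta + 3r\sin\theta \\ r^2\sin^2\theta + 3r\cos\theta \end{pmatrix} \notag\\
\vec{V} &\to_p \begin{pmatrix} r^2(\cos^3\theta + \sin^3\theta) + 3r\sin 2\theta \\ -r\cos^2\theta\sin\theta + r\sin^2\theta\cos\theta + 3\cos 2\theta \end{pmatrix}. \tag{5.47}
\end{align}

From Eq. (5.50), the velocity gradient has two parts. The first part, due to the gradient of the components:

\[
V^{\mu'}{}_{,\nu'} = \begin{bmatrix} \dfrac{\partial V^r}{\partial r} & \dfrac{\partial V^r}{\partial \theta} \\[2ex] \dfrac{\partial V^\theta}{\partial r} & \dfrac{\partial V^\theta}{\partial \theta} \end{bmatrix} = \begin{bmatrix} A & E \\ F & G \end{bmatrix} \tag{5.48}
\]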


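where

\begin{align}
A &= 2r(\cos^3\theta + \sin^3\theta) + 3\sin 2\theta , \quad \text{as in (b) above} \notag\\
E &= 3r^2(-\cos^2\theta\sin\theta + \sin^2\theta\cos\theta) + 6r\cos 2\theta \notag\\
F &= (-\cos^2\theta\sin\theta + \sin^2\theta\cos\theta) \notag\\
G &= r(2\sin^2\theta\cos\theta + 2\cos^2\theta\sin\theta - \cos^3\theta - \sin^3\theta) - 6\sin 2\theta \tag{5.49}
\end{align}

The second part is due to the gradient in basis vectors. Using Eq. (5.45) for the Christoffel symbols:

\begin{align}
V^{\gamma'}\Gamma^{\mu'}{}_{\gamma'\nu'} &= \begin{bmatrix} 0 & V^\theta\, \Gamma^r{}_{\theta\theta} \\ V^\theta\, \Gamma^\theta{}_{\theta r} & V^r\, \Gamma^\theta{}_{r\theta} \end{bmatrix} \notag\\
&= \begin{bmatrix} 0 & r^2(\cos^2\theta\sin\theta - \sin^2\theta\cos\theta) - 3r\cos 2\theta \\ (-\cos^2\theta\sin\theta + \sin^2\theta\cos\theta) + \dfrac{3\cos 2\theta}{r} & r(\cos^3\theta + \sin^3\theta) + 3\sin 2\theta \end{bmatrix} \tag{5.50}
\end{align}

Combining these two we obtain

\begin{align}
V^{\mu'}{}_{;\nu'} &\to_p \tag{5.51}\\
&\begin{bmatrix} 2r(\cos^3\theta + \sin^3\theta) + 3\sin 2\theta & 2r^2(-\cos^2\theta\sin\theta + \sin^2\theta\cos\theta) + 3r\cos 2\theta \\ 2(-\cos^2\theta\sin\theta + \sin^2\theta\cos\theta) + \dfrac{3\cos 2\theta}{r} & r(2\sin^2\theta\cos\theta + 2\cos^2\theta\sin\theta) - 3\sin 2\theta \end{bmatrix} \tag{5.52}
\end{align}

And voilà, it agrees with the much simpler calculation in Cartesian coordinates, transformed to polar (5.45).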
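
The agreement between (5.45) and (5.52) can be spot-checked numerically. The following is a sketch under my own assumptions (function names, sample point, and finite-difference step are illustrative): it builds $V^{\mu'}{}_{;\nu'}$ directly from the polar components (5.47) using the non-zero Christoffel symbols $\Gamma^r{}_{\theta\theta} = -r$ and $\Gamma^\theta{}_{r\theta} = \Gamma^\theta{}_{\theta r} = 1/r$, and compares it with the transformed Cartesian gradient.

```python
import math

def v_polar(r, th):
    """Polar components (V^r, V^theta) from result (5.47)."""
    c, s = math.cos(th), math.sin(th)
    return (r * r * (c ** 3 + s ** 3) + 6 * r * s * c,
            -r * c * c * s + r * s * s * c + 3 * (c * c - s * s))

def covariant_derivative_polar(r, th, h=1e-6):
    """V^{mu'}_{;nu'}: rows mu' = (r, theta), columns nu' = (r, theta)."""
    v_r, v_th = v_polar(r, th)
    d_r = [(a - b) / (2 * h) for a, b in zip(v_polar(r + h, th), v_polar(r - h, th))]
    d_th = [(a - b) / (2 * h) for a, b in zip(v_polar(r, th + h), v_polar(r, th - h))]
    return [[d_r[0], d_th[0] - r * v_th],          # Gamma^r_{theta theta} = -r
            [d_r[1] + v_th / r, d_th[1] + v_r / r]]  # Gamma^theta_{r theta} = 1/r

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transformed_cartesian_gradient(r, th):
    """Lambda^{mu'}_alpha V^alpha_{,beta} Lambda^beta_{nu'} as in (5.45)."""
    c, s = math.cos(th), math.sin(th)
    to_polar = [[c, s], [-s / r, c / r]]
    gradient = [[2 * r * c, 3.0], [3.0, 2 * r * s]]
    to_cartesian = [[c, -r * s], [s, r * c]]
    return matmul(matmul(to_polar, gradient), to_cartesian)

r0, th0 = 1.3, 0.8   # arbitrary sample point (assumption)
direct = covariant_derivative_polar(r0, th0)
transformed = transformed_cartesian_gradient(r0, th0)
max_difference = max(abs(direct[i][j] - transformed[i][j])
                     for i in range(2) for j in range(2))
print(max_difference)
```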

11. (d) The divergence using results from (a).

The divergence should be frame-independent and is the trace of the matrix of the covariant derivative of the vector. Using the covariant derivative in Cartesian coordinates from (a) above:

\[
V^\alpha{}_{,\alpha} = 2r(\cos\theta + \sin\theta). \tag{5.53}
\]

11. (e) The divergence using results from (b).

\begin{align}
V^\alpha{}_{;\alpha} &= 2r(\cos^3\theta + \sin^3\theta + \cos^2\theta\sin\theta + \sin^2\theta\cos\theta), \notag\\
&= 2r[\cos^2\theta(\cos\theta + \sin\theta) + \sin^2\theta(\sin\theta + \cos\theta)], \notag\\
&= 2r(\cos\theta + \sin\theta). \tag{5.54}
\end{align}

And of course this agrees with the result (5.53) obtained in (d).

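11. (f) The divergence using Eq. (5.56).

Recall from (5.29) that we had:

\[
\vec{V} \to_p \begin{pmatrix} r^2(\cos^3\theta + \sin^3\theta) + 6r\sin\theta\cos\theta \\ -r\cos^2\theta\sin\theta + r\sin^2\theta\cos\theta + 3(\cos^2\theta - \sin^2\theta) \end{pmatrix}. \tag{5.55}
\]

So applying Eq. (5.56) we get

\begin{align}
\nabla \cdot \vec{V} &= \frac{1}{r}\frac{\partial}{\partial r}[r V^r] + \frac{\partial}{\partial \theta}V^\theta \notag\\
&= \frac{1}{r}\frac{\partial}{\partial r}\left[ r\left( r^2(\cos^3\theta + \sin^3\theta) + 6r\sin\theta\cos\theta \right) \right] \tag{5.56}\\
&\quad + \frac{\partial}{\partial \theta}\left[ -r\cos^2\theta\sin\theta + r\sin^2\theta\cos\theta + 3(\cos^2\theta - \sin^2\theta) \right] \notag\\
&= \left[ 3r(\cos^3\theta + \sin^3\theta) + 12\sin\theta\cos\theta \right] \tag{5.57}\\
&\quad + \left[ 2r\cos\theta\sin^2\theta - r\cos^3\theta + 2r\sin\theta\cos^2\theta - r\sin^3\theta - 6\sin(2\theta) \right] \notag\\
&= 2r(\cos\theta + \sin\theta). \tag{5.58}
\end{align}

And of course this also agrees with the result (5.53) obtained in (d).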
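
As a final cross-check of (5.58), the polar divergence formula can be evaluated by finite differences and compared with $2r(\cos\theta + \sin\theta)$. This is a minimal sketch only; the function names, sample points and step size are my own assumptions.

```python
import math

def v_polar(r, th):
    """Polar components (V^r, V^theta) from result (5.55)."""
    c, s = math.cos(th), math.sin(th)
    return (r * r * (c ** 3 + s ** 3) + 6 * r * s * c,
            -r * c * c * s + r * s * s * c + 3 * (c * c - s * s))

def divergence_polar(r, th, h=1e-6):
    """(1/r) d(r V^r)/dr + dV^theta/dtheta by central differences."""
    d_r = ((r + h) * v_polar(r + h, th)[0]
           - (r - h) * v_polar(r - h, th)[0]) / (2 * h)
    d_th = (v_polar(r, th + h)[1] - v_polar(r, th - h)[1]) / (2 * h)
    return d_r / r + d_th

def divergence_cartesian(r, th):
    """Trace of (5.44): 2x + 2y written in polar variables."""
    return 2 * r * (math.cos(th) + math.sin(th))

samples = [(0.5, 0.1), (1.3, 0.8), (2.0, 2.5)]   # arbitrary points (assumption)
divergence_errors = [abs(divergence_polar(r, th) - divergence_cartesian(r, th))
                     for r, th in samples]
print(divergence_errors)
```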

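12. Given $\tilde{p}$ whose components in Cartesian coordinates are the same as those of $\vec{V}$ in Exers. 8(a) and 11.

(a) Find $p_{\alpha,\beta}$ in Cartesian coordinates.

\[
p_{\alpha,\beta} = \frac{\partial p_\alpha}{\partial x^\beta}
\]

and we’ve done all these calculations in Exer. 11(a).

\begin{align}
p_{\alpha,\beta} &\to_c \begin{bmatrix} \dfrac{\partial p_x}{\partial x} & \dfrac{\partial p_x}{\partial y} \\[2ex] \dfrac{\partial p_y}{\partial x} & \dfrac{\partial p_y}{\partial y} \end{bmatrix}, \notag\\
&= \begin{bmatrix} 2x & 3 \\ 3 & 2y \end{bmatrix}, \notag\\
&= \begin{bmatrix} 2r\cos\theta & 3 \\ 3 & 2r\sin\theta \end{bmatrix}. \tag{5.59}
\end{align}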


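(b) Transform $p_{\alpha,\beta}$ from Cartesian coordinates to polar coordinates, $p_{\alpha';\beta'}$.

Note the different notation now for the derivative – because we’re using curvilinear coordinates there are derivatives of the basis vectors too, so we use the semicolon instead of the comma. Now we get something different from Exer. 11(b) because the transformation is different for one-forms than for vectors:

\[
p_{\alpha';\beta'} = \Lambda^{\alpha}{}_{\alpha'}\, \Lambda^{\beta}{}_{\beta'}\, p_{\alpha,\beta}
\]

Instead of using matrices, let’s try using tensors as follows.

\begin{align}
p_{r;r} &= \Lambda^{\alpha}{}_{r}\, \Lambda^{\beta}{}_{r}\, p_{\alpha,\beta} \notag\\
&= \Lambda^{x}{}_{r}\Lambda^{x}{}_{r}\, p_{x,x} + \Lambda^{x}{}_{r}\Lambda^{y}{}_{r}\, (p_{x,y} + p_{y,x}) + \Lambda^{y}{}_{r}\Lambda^{y}{}_{r}\, p_{y,y} \notag\\
&= (\Lambda^{x}{}_{r})^2\, 2x + \Lambda^{x}{}_{r}\Lambda^{y}{}_{r}\, (3 + 3) + (\Lambda^{y}{}_{r})^2\, 2y \notag\\
&= (\Lambda^{x}{}_{r})^2\, 2r\cos\theta + \Lambda^{x}{}_{r}\Lambda^{y}{}_{r}\, (6) + (\Lambda^{y}{}_{r})^2\, 2r\sin\theta \notag\\
&= (\cos\theta)^2\, 2r\cos\theta + \cos\theta\sin\theta\, (6) + (\sin\theta)^2\, 2r\sin\theta \notag\\
&= 2r(\cos^3\theta + \sin^3\theta) + 6\cos\theta\sin\theta \notag\\
&= V^r{}_{;r} \quad \text{see result (5.46) above} \tag{5.60}
\end{align}

And

\begin{align}
p_{\theta;r} &= \Lambda^{\alpha}{}_{\theta}\, \Lambda^{\beta}{}_{r}\, p_{\alpha,\beta} \notag\\
&= \Lambda^{x}{}_{\theta}\Lambda^{x}{}_{r}\, p_{x,x} + \Lambda^{x}{}_{\theta}\Lambda^{y}{}_{r}\, p_{x,y} + \Lambda^{y}{}_{\theta}\Lambda^{x}{}_{r}\, p_{y,x} + \Lambda^{y}{}_{\theta}\Lambda^{y}{}_{r}\, p_{y,y} \notag\\
&= \Lambda^{x}{}_{\theta}\Lambda^{x}{}_{r}\, 2r\cos\theta + \Lambda^{x}{}_{\theta}\Lambda^{y}{}_{r}\, 3 + \Lambda^{y}{}_{\theta}\Lambda^{x}{}_{r}\, 3 + \Lambda^{y}{}_{\theta}\Lambda^{y}{}_{r}\, 2r\sin\theta \notag\\
&= [-r\sin\theta\cos\theta]\, 2r\cos\theta + [-r\sin^2\theta]\, 3 + [r\cos^2\theta]\, 3 + [r\sin\theta\cos\theta]\, 2r\sin\theta \notag\\
&= 2r^2[-\cos^2\theta\sin\theta + \sin^2\theta\cos\theta] + 3r\cos(2\theta) \notag\\
&= r^2\, V^\theta{}_{;r} \quad \text{see result (5.46) above} \tag{5.61}
\end{align}

We can save some time by noting that, because $p_{\alpha,\beta} = p_{\beta,\alpha}$,

\begin{align}
p_{r;\theta} &= \Lambda^{\alpha}{}_{r}\, \Lambda^{\beta}{}_{\theta}\, p_{\alpha,\beta} \notag\\
&= p_{\theta;r} \tag{5.62}
\end{align}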


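And finally,

\begin{align}
p_{\theta;\theta} &= \Lambda^\alpha{}_\theta \Lambda^\beta{}_\theta\, p_{\alpha,\beta} \notag\\
&= \Lambda^x{}_\theta \Lambda^x{}_\theta\, p_{x,x} + \Lambda^x{}_\theta \Lambda^y{}_\theta (p_{x,y} + p_{y,x}) + \Lambda^y{}_\theta \Lambda^y{}_\theta\, p_{y,y} \notag\\
&= (\Lambda^x{}_\theta)^2\, 2r\cos\theta + \Lambda^x{}_\theta \Lambda^y{}_\theta (6) + (\Lambda^y{}_\theta)^2\, 2r\sin\theta \notag\\
&= (r\sin\theta)^2\, 2r\cos\theta + (-r^2\cos\theta\sin\theta)\, 6 + (r\cos\theta)^2\, 2r\sin\theta \notag\\
&= 2r^3(\cos^2\theta\sin\theta + \sin^2\theta\cos\theta) - 3r^2\sin(2\theta) \notag\\
&= r^2\, V^\theta{}_{;\theta} && \text{see result (5.46) above} \tag{5.63}
\end{align}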

12. (c) Obtain $p_{\alpha';\beta'}$ directly in polar coordinates.

We need $p_\alpha$ in polar coordinates. We can use the results we obtained in Exer. 8(c). Because $\vec{V}$ from Exer. 8 had the same components as our one-form $\tilde{p}$ when both were in Cartesian components, $\vec{V}$ must be the vector dual of $\tilde{p}$. This works because $g_{\alpha\beta} = \delta_{\alpha\beta}$ for Cartesian coordinates. So $p_\alpha = V_\alpha$ and, recalling result (5.34) from above,

\begin{align}
p_r &= r^2(\cos^3\theta + \sin^3\theta) + 6r\sin\theta\cos\theta \notag\\
p_\theta &= -r^3\cos^2\theta\sin\theta + r^3\sin^2\theta\cos\theta + 3r^2(\cos^2\theta - \sin^2\theta) \tag{5.64}
\end{align}

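Let’s separate out the 4 terms in Eq. (5.63) and work on them one at a time.

\begin{align}
p_{r;r} &= p_{r,r} - p_{\alpha'}\Gamma^{\alpha'}{}_{rr} && \text{Eq. (5.63)} \notag\\
&= p_{r,r}, && \text{Eq. (5.45) gives } \Gamma^{\alpha'}{}_{rr} = 0 \notag\\
&= \frac{\partial}{\partial r}\left(r^2(\cos^3\theta + \sin^3\theta) + 6r\sin\theta\cos\theta\right) \notag\\
&= 2r(\cos^3\theta + \sin^3\theta) + 3\sin(2\theta) \tag{5.65}
\end{align}

as in result (5.60) above.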

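\begin{align}
p_{r;\theta} &= p_{r,\theta} - p_{\alpha'}\Gamma^{\alpha'}{}_{r\theta} && \text{Eq. (5.63)} \notag\\
&= p_{r,\theta} - p_\theta\,\Gamma^\theta{}_{r\theta}, && \text{only one non-zero Christoffel symbol from Eq. (5.45)} \notag\\
&= \frac{\partial}{\partial\theta}\left(r^2(\cos^3\theta + \sin^3\theta) + 6r\sin\theta\cos\theta\right) - p_\theta\left(\frac{1}{r}\right) \notag\\
&= 2r^2(-\cos^2\theta\sin\theta + \sin^2\theta\cos\theta) + 3r\cos(2\theta), && \text{several terms canceled} \tag{5.66}
\end{align}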

as in result (5.61) above.

You can save yourself some work by noting that order doesn’t matter for partial derivatives: since $p_{\alpha,\beta}$ is symmetric in Cartesian coordinates, Eq. (5.59), $\tilde{p}$ is the gradient of a scalar, so $p_{\theta,r} = p_{r,\theta}$, and the latter was already calculated above. And the Christoffel symbol cannot make a difference because of the symmetry (Eq. (5.74)). Thus $p_{\theta;r} = p_{r;\theta}$:

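\begin{align}
p_{\theta;r} &= p_{\theta,r} - p_{\alpha'}\Gamma^{\alpha'}{}_{r\theta} && \text{Eq. (5.63)} \notag\\
&= p_{r,\theta} - p_\theta\,\Gamma^\theta{}_{r\theta}, && \text{only one non-zero Christoffel symbol from Eq. (5.45)} \notag\\
&= 2r^2(-\cos^2\theta\sin\theta + \sin^2\theta\cos\theta) + 3r\cos(2\theta) \notag\\
&= p_{r;\theta} \tag{5.67}
\end{align}

as in result (5.62) above. Finally

\begin{align}
p_{\theta;\theta} &= p_{\theta,\theta} - p_{\alpha'}\Gamma^{\alpha'}{}_{\theta\theta} && \text{Eq. (5.63)} \notag\\
&= p_{\theta,\theta} - p_r\,\Gamma^r{}_{\theta\theta}, && \text{only one non-zero Christoffel symbol from Eq. (5.45)} \notag\\
&= \frac{\partial}{\partial\theta}(p_\theta) - p_r\,(-r) \notag\\
&= 2r^3(\cos^2\theta\sin\theta + \sin^2\theta\cos\theta) - 3r^2\sin(2\theta), && \text{several terms canceled} \tag{5.68}
\end{align}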

as in result (5.63) above.
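Here is a small numerical cross-check of parts (b) and (c), offered as a sketch only (the function names are mine). It builds $p_{\alpha';\beta'}$ by contracting the Cartesian matrix (5.59) with the transformation $\Lambda^\alpha{}_{\alpha'} = \partial x^\alpha/\partial x^{\alpha'}$, and compares with the closed forms (5.65) to (5.68).

```python
import math

def cartesian_derivative(r, th):
    """p_{alpha,beta} in Cartesian coordinates, Eq. (5.59); index 0 = x, 1 = y."""
    x, y = r * math.cos(th), r * math.sin(th)
    return [[2 * x, 3.0], [3.0, 2 * y]]

def jacobian(r, th):
    """L[a][i] = d x^a / d x^{i'}, with a in (x, y) and i' in (r, theta)."""
    c, s = math.cos(th), math.sin(th)
    return [[c, -r * s], [s, r * c]]

def transformed(r, th):
    """p_{i';j'} = L^a_{i'} L^b_{j'} p_{a,b}, as in Exer. 12(b)."""
    P, L = cartesian_derivative(r, th), jacobian(r, th)
    return [[sum(L[a][i] * L[b][j] * P[a][b] for a in range(2) for b in range(2))
             for j in range(2)] for i in range(2)]

def closed_form(r, th):
    """Results (5.65)-(5.68) of Exer. 12(c); index 0 = r, 1 = theta."""
    c, s = math.cos(th), math.sin(th)
    rr = 2 * r * (c**3 + s**3) + 3 * math.sin(2 * th)
    rt = 2 * r**2 * (-c**2 * s + s**2 * c) + 3 * r * math.cos(2 * th)
    tt = 2 * r**3 * (c**2 * s + s**2 * c) - 3 * r**2 * math.sin(2 * th)
    return [[rr, rt], [rt, tt]]
```

No finite differences are needed here, so the two results should agree to rounding error.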

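13. Show that one could have obtained the results in Exer. 12(b) by lowering the index using the metric.

\[
p_{\alpha';\beta'} = g_{\alpha'\sigma'}\, V^{\sigma'}{}_{;\beta'}
\]

Recall in Exer. 12(b) we found

\begin{align}
p_{r;r} &= \Lambda^\alpha{}_r \Lambda^\beta{}_r\, p_{\alpha,\beta} \notag\\
&= 2r(\cos^3\theta + \sin^3\theta) + 6\cos\theta\sin\theta \notag\\
&= V^r{}_{;r} \tag{5.69}
\end{align}

But

\begin{align}
p_{r;r} &= g_{r\sigma'}\, V^{\sigma'}{}_{;r} \notag\\
&= g_{rr}\, V^r{}_{;r} \notag\\
&= V^r{}_{;r}, && \text{used Eq. (5.31)} \tag{5.70}
\end{align}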

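Recall in Exer. 12(b) we found

\begin{align}
p_{\theta;r} &= \Lambda^\alpha{}_\theta \Lambda^\beta{}_r\, p_{\alpha,\beta} \notag\\
&= r^2\left(2[-\cos^2\theta\sin\theta + \sin^2\theta\cos\theta] + \frac{3}{r}\cos(2\theta)\right) \notag\\
&= r^2\, V^\theta{}_{;r} \tag{5.71}
\end{align}

But

\begin{align}
p_{\theta;r} &= g_{\theta\sigma'}\, V^{\sigma'}{}_{;r} \notag\\
&= g_{\theta\theta}\, V^\theta{}_{;r} \notag\\
&= r^2\, V^\theta{}_{;r}, && \text{used Eq. (5.31)} \tag{5.72}
\end{align}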

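And

\begin{align}
p_{r;\theta} &= g_{r\sigma'}\, V^{\sigma'}{}_{;\theta} \notag\\
&= g_{rr}\, V^r{}_{;\theta} \notag\\
&= V^r{}_{;\theta}, && \text{used Eq. (5.31)} \notag\\
&= r^2\, V^\theta{}_{;r}, && \text{used result (5.46)} \tag{5.73}
\end{align}

Recall in Exer. 12(b) we found

\begin{align}
p_{\theta;\theta} &= 2r^3[\cos^2\theta\sin\theta + \sin^2\theta\cos\theta] - 3r^2\sin(2\theta) \notag\\
&= r^2\, V^\theta{}_{;\theta} \tag{5.74}
\end{align}

But

\begin{align}
p_{\theta;\theta} &= g_{\theta\sigma'}\, V^{\sigma'}{}_{;\theta} \notag\\
&= g_{\theta\theta}\, V^\theta{}_{;\theta} \notag\\
&= r^2\, V^\theta{}_{;\theta}, && \text{used Eq. (5.31)} \tag{5.75}
\end{align}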

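14. Given a $\binom{2}{0}$ tensor $\mathbf{A}$ with components in polar coordinates:

\[
(\mathbf{A}) = \begin{pmatrix} r^2 & r\sin\theta \\ r\cos\theta & \tan\theta \end{pmatrix}
\]

find the components of $\nabla\mathbf{A}$.

From Eq. (5.65) there are three contributions:

\[
A^{\alpha\beta}{}_{;\mu} = A^{\alpha\beta}{}_{,\mu} + A^{\alpha\sigma}\Gamma^\beta{}_{\sigma\mu} + A^{\sigma\beta}\Gamma^\alpha{}_{\sigma\mu}
\]

The three contributions are respectively:

\[
\begin{array}{l|lll|l}
A^{\alpha\beta}{}_{;\mu} & A^{\alpha\beta}{}_{,\mu} & A^{\alpha\sigma}\Gamma^\beta{}_{\sigma\mu} & A^{\sigma\beta}\Gamma^\alpha{}_{\sigma\mu} & \text{sum} \\ \hline
A^{rr}{}_{;r} & 2r & +0 & +0 & = 2r \\
A^{rr}{}_{;\theta} & 0 & -r^2\sin\theta & -r^2\cos\theta & = -r^2(\sin\theta + \cos\theta) \\
A^{r\theta}{}_{;r} & \sin\theta & +\sin\theta & +0 & = 2\sin\theta \\
A^{r\theta}{}_{;\theta} & r\cos\theta & +r & -r\tan\theta & = r(1 + \cos\theta - \tan\theta) \\
A^{\theta r}{}_{;r} & \cos\theta & +0 & +\cos\theta & = 2\cos\theta \\
A^{\theta r}{}_{;\theta} & -r\sin\theta & -r\tan\theta & +r & = r(1 - \sin\theta - \tan\theta) \\
A^{\theta\theta}{}_{;r} & 0 & +r^{-1}\tan\theta & +r^{-1}\tan\theta & = 2r^{-1}\tan\theta \\
A^{\theta\theta}{}_{;\theta} & \sec^2\theta & +\cos\theta & +\sin\theta & = \sec^2\theta + \cos\theta + \sin\theta
\end{array}
\]

The above solution was verified with Maple™; please see the accompanying file schutz2009_ch5.mw.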
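For readers without Maple, the table can also be checked numerically. The following is a rough sketch under my own naming (it is not the accompanying worksheet): partial derivatives are taken by central differences and the two connection terms are added using the polar Christoffel symbols $\Gamma^\theta{}_{r\theta} = \Gamma^\theta{}_{\theta r} = 1/r$, $\Gamma^r{}_{\theta\theta} = -r$.

```python
import math

def A(r, th):
    """Components A^{ab} in polar coordinates; index 0 = r, 1 = theta."""
    return [[r**2, r * math.sin(th)], [r * math.cos(th), math.tan(th)]]

def christoffel(r):
    """G[a][b][c] = Gamma^a_{bc} for polar coordinates."""
    G = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    G[1][0][1] = G[1][1][0] = 1.0 / r
    G[0][1][1] = -r
    return G

def grad_A(r, th, h=1e-5):
    """D[a][b][m] = A^{ab}_{;m} = A^{ab}_{,m} + A^{as} G^b_{sm} + A^{sb} G^a_{sm}."""
    G, A0 = christoffel(r), A(r, th)
    plus = [A(r + h, th), A(r, th + h)]
    minus = [A(r - h, th), A(r, th - h)]
    D = [[[0.0, 0.0] for _ in range(2)] for _ in range(2)]
    for a in range(2):
        for b in range(2):
            for m in range(2):
                partial = (plus[m][a][b] - minus[m][a][b]) / (2 * h)
                conn = sum(A0[a][s] * G[b][s][m] + A0[s][b] * G[a][s][m] for s in range(2))
                D[a][b][m] = partial + conn
    return D

def table(r, th):
    """Right-hand column of the table above, keyed by (alpha, beta, mu)."""
    s, c, t = math.sin(th), math.cos(th), math.tan(th)
    return {(0, 0, 0): 2 * r, (0, 0, 1): -r**2 * (s + c),
            (0, 1, 0): 2 * s, (0, 1, 1): r * (1 + c - t),
            (1, 0, 0): 2 * c, (1, 0, 1): r * (1 - s - t),
            (1, 1, 0): 2 * t / r, (1, 1, 1): 1 / c**2 + c + s}
```

Comparing `grad_A(r, th)[a][b][m]` with `table(r, th)[(a, b, m)]` at a few sample points (away from $\theta = \pi/2$, where $\tan\theta$ blows up) is enough to catch an index slip.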

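15. Given the uniform vector in polar coordinates $V^r = 1$ and $V^\theta = 0$, which points radially from the origin, find

\[
V^\alpha{}_{;\mu;\nu}
\]

In principle this is quite straightforward, but there are several places one might slip up. First I think it’s a good idea to write down the general expression, and then substitute the given vector field. Write

\begin{align}
T^\alpha{}_\mu &\equiv V^\alpha{}_{;\mu} \notag\\
&= V^\alpha{}_{,\mu} + V^\sigma\,\Gamma^\alpha{}_{\sigma\mu} && \text{Eq. (5.64), the 2nd one that is} \tag{5.76}
\end{align}

Don’t substitute the given vector at this point because we’re still going to take another derivative:

\begin{align}
V^\alpha{}_{;\mu;\nu} &\equiv T^\alpha{}_{\mu;\nu} \notag\\
&= T^\alpha{}_{\mu,\nu} + T^\sigma{}_\mu\,\Gamma^\alpha{}_{\sigma\nu} - T^\alpha{}_\sigma\,\Gamma^\sigma{}_{\mu\nu} && \text{Eq. (5.66)} \tag{5.77}
\end{align}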

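Now it’s straightforward substitution. The problem simplifies tremendously because there are only three nonzero components of the Christoffel symbol, c.f. Eq. (5.45):

\[
\Gamma^\theta{}_{r\theta} = \Gamma^\theta{}_{\theta r} = \frac{1}{r}, \qquad \Gamma^r{}_{\theta\theta} = -r \tag{5.78}
\]

To help you debug, I’ve split (5.77) into its 3 parts (colour coded in the original), and labelled each contribution below with the term of (5.77) it comes from:

\begin{align}
V^\theta{}_{;\theta;r} &= \underbrace{-\frac{1}{r^2}}_{\text{1st term}} + \underbrace{\frac{1}{r^2}}_{\text{2nd term}} \underbrace{{}-\frac{1}{r^2}}_{\text{3rd term}} = -\frac{1}{r^2} \notag\\
V^\theta{}_{;r;\theta} &= \underbrace{-\frac{1}{r^2}}_{\text{3rd term}} = -\frac{1}{r^2} \notag\\
V^r{}_{;\theta;\theta} &= \underbrace{-1}_{\text{2nd term}} = -1 \tag{5.79}
\end{align}

All other components vanish.

The above solution was verified with Maple™; please see the accompanying file schutz2009_ch5.mw.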
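A plain numerical sketch of the same calculation follows, in case you want to see which components survive without opening Maple. The function names are my own, and the derivatives are central differences; the code applies (5.76) and then (5.77) literally.

```python
def V(r, th):
    """The given field: V^r = 1, V^theta = 0; index 0 = r, 1 = theta."""
    return [1.0, 0.0]

def christoffel(r):
    """G[a][b][c] = Gamma^a_{bc} for polar coordinates, Eq. (5.78)."""
    G = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    G[1][0][1] = G[1][1][0] = 1.0 / r
    G[0][1][1] = -r
    return G

def first_cov(r, th, h=1e-4):
    """T[a][m] = V^a_{;m} = V^a_{,m} + V^s Gamma^a_{sm}, Eq. (5.76)."""
    G, V0 = christoffel(r), V(r, th)
    plus = [V(r + h, th), V(r, th + h)]
    minus = [V(r - h, th), V(r, th - h)]
    return [[(plus[m][a] - minus[m][a]) / (2 * h)
             + sum(V0[s] * G[a][s][m] for s in range(2))
             for m in range(2)] for a in range(2)]

def second_cov(r, th, h=1e-4):
    """S[a][m][n] = V^a_{;m;n} = T^a_{m,n} + T^s_m Gamma^a_{sn} - T^a_s Gamma^s_{mn}, Eq. (5.77)."""
    G, T0 = christoffel(r), first_cov(r, th)
    plus = [first_cov(r + h, th), first_cov(r, th + h)]
    minus = [first_cov(r - h, th), first_cov(r, th - h)]
    return [[[(plus[n][a][m] - minus[n][a][m]) / (2 * h)
              + sum(T0[s][m] * G[a][s][n] - T0[a][s] * G[s][m][n] for s in range(2))
              for n in range(2)] for m in range(2)] for a in range(2)]
```

With index 0 for $r$ and 1 for $\theta$, `second_cov(r, th)[1][1][0]` corresponds to $V^\theta{}_{;\theta;r}$, `[1][0][1]` to $V^\theta{}_{;r;\theta}$ and `[0][1][1]` to $V^r{}_{;\theta;\theta}$.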

16. Fill in steps between Eq. (5.74) and Eq. (5.75).

There are no steps to fill in! He has explained every step in detail. See Exer. 20 for a problem that forces one to thoroughly understand each of these steps.

17. Discover how $V^\beta{}_{,\alpha}$ transforms under a change of coordinates. Do the same for $V^\mu\,\Gamma^\beta{}_{\mu\alpha}$.

I’ve created SP.3 and SP.4 as an alternative to this problem. They carry the same message but in a more straightforward way that follows naturally from what we did in Chapter 2 for vectors.

18. Verify Eq. (5.78).


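Recall Eq. (5.78) gave

\begin{align}
\vec{e}_{\hat\alpha}\cdot\vec{e}_{\hat\beta} &\equiv g_{\hat\alpha\hat\beta} = \delta_{\hat\alpha\hat\beta} \notag\\
\tilde\omega^{\hat\alpha}\cdot\tilde\omega^{\hat\beta} &\equiv g^{\hat\alpha\hat\beta} = \delta^{\hat\alpha\hat\beta} \tag{5.80}
\end{align}

So in the first line there are two equalities to verify. Recall how we obtain the components of any tensor, Eq. (3.21), and the metric tensor in particular, Eq. (3.5), consistent with:

\[
\vec{e}_{\hat\alpha}\cdot\vec{e}_{\hat\beta} \equiv g_{\hat\alpha\hat\beta}
\]

For the second equality of the first line, we have four terms to verify.

\begin{align}
\vec{e}_{\hat r}\cdot\vec{e}_{\hat r} &= \vec{e}_r\cdot\vec{e}_r, && \text{substituted Eq. (5.76)} \notag\\
&= g_{rr}, && \text{Eq. (3.5)} \notag\\
&= 1, && \text{Eq. (5.31a)} \tag{5.81}
\end{align}

\begin{align}
\vec{e}_{\hat r}\cdot\vec{e}_{\hat\theta} &= \vec{e}_r\cdot\frac{1}{r}\vec{e}_\theta, && \text{substituted Eq. (5.76)} \notag\\
&= \frac{1}{r}\,g_{r\theta}, && \text{Eq. (3.5)} \notag\\
&= 0, && \text{Eq. (5.31b)} \notag\\
&= \vec{e}_{\hat\theta}\cdot\vec{e}_{\hat r}, && \text{order of dot product above doesn’t matter.} \tag{5.82}
\end{align}

\begin{align}
\vec{e}_{\hat\theta}\cdot\vec{e}_{\hat\theta} &= \frac{1}{r}\vec{e}_\theta\cdot\frac{1}{r}\vec{e}_\theta, && \text{substituted Eq. (5.76)} \notag\\
&= \frac{1}{r^2}\,g_{\theta\theta}, && \text{Eq. (3.5)} \notag\\
&= 1, && \text{Eq. (5.31a)} \tag{5.83}
\end{align}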

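So in the second line there are two equalities to verify. Recall how we obtain the components of any tensor, Eq. (3.21), consistent with:

\[
\tilde\omega^{\hat\alpha}\cdot\tilde\omega^{\hat\beta} \equiv g^{\hat\alpha\hat\beta}
\]

For the second equality of the second line, we have four terms to verify.

\begin{align}
\tilde\omega^{\hat r}\cdot\tilde\omega^{\hat r} &= \tilde{d}r\cdot\tilde{d}r, && \text{substituted Eq. (5.77)} \notag\\
&= g^{rr}, && \text{see p. 124} \notag\\
&= 1, && \text{Eq. (5.34)} \tag{5.84}
\end{align}

\begin{align}
\tilde\omega^{\hat r}\cdot\tilde\omega^{\hat\theta} &= \tilde{d}r\cdot r\,\tilde{d}\theta, && \text{substituted Eq. (5.77)} \notag\\
&= r\,g^{r\theta}, && \text{see p. 124} \notag\\
&= 0, && \text{Eq. (5.34)} \notag\\
&= \tilde\omega^{\hat\theta}\cdot\tilde\omega^{\hat r}, && \text{order of dot product above doesn’t matter.} \tag{5.85}
\end{align}

\begin{align}
\tilde\omega^{\hat\theta}\cdot\tilde\omega^{\hat\theta} &= r\,\tilde{d}\theta\cdot r\,\tilde{d}\theta, && \text{substituted Eq. (5.77)} \notag\\
&= r^2\,g^{\theta\theta}, && \text{see p. 124} \notag\\
&= 1, && \text{Eq. (5.34)} \tag{5.86}
\end{align}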

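19. Repeating the argument from Eq. (5.81) to Eq. (5.84) using $\tilde{d}r$ and $\tilde{d}\theta$ leads to the conclusion that these are a coordinate basis.

We simply repeat the argument but instead of substituting Eq. (5.77) we use $\tilde{d}r$ and $\tilde{d}\theta$. We find the 2nd line of Eq. (5.81) changes:

\begin{align}
\tilde{d}r &= \cos\theta\,\tilde{d}x + \sin\theta\,\tilde{d}y, && \text{used Eq. (5.27)} \notag\\
\tilde{d}\theta &= -\frac{1}{r}\sin\theta\,\tilde{d}x + \frac{1}{r}\cos\theta\,\tilde{d}y, && \text{used Eq. (5.26)} \tag{5.87}
\end{align}

So now instead of Eq. (5.82) we get

\[
\frac{\partial\eta}{\partial x} = -\frac{1}{r}\sin\theta, \qquad \frac{\partial\eta}{\partial y} = \frac{1}{r}\cos\theta \tag{5.88}
\]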

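So instead of Eq. (5.83) we have factors of $1/r$ on both sides,

\begin{align}
\frac{\partial^2\eta}{\partial y\,\partial x} &= \frac{\partial}{\partial y}\left(-\frac{1}{r}\sin\theta\right) \notag\\
&= \frac{\partial}{\partial y}\left(\frac{-y}{x^2 + y^2}\right) \notag\\
&= -\frac{1}{x^2 + y^2} + \frac{2y^2}{(x^2 + y^2)^2}, && \text{chain rule} \notag\\
&= -\frac{x^2 + y^2}{(x^2 + y^2)^2} + \frac{2y^2}{(x^2 + y^2)^2} \notag\\
&= \frac{-x^2 + y^2}{(x^2 + y^2)^2} \tag{5.89}
\end{align}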

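On the other hand,

\begin{align}
\frac{\partial^2\eta}{\partial x\,\partial y} &= \frac{\partial}{\partial x}\left(\frac{1}{r}\cos\theta\right) \notag\\
&= \frac{\partial}{\partial x}\left(\frac{x}{x^2 + y^2}\right) \notag\\
&= \frac{1}{x^2 + y^2} - \frac{2x^2}{(x^2 + y^2)^2} \notag\\
&= \frac{-x^2 + y^2}{(x^2 + y^2)^2} \notag\\
&= \frac{\partial^2\eta}{\partial y\,\partial x} \tag{5.90}
\end{align}

Thus the basis is consistent with a coordinate basis. See SP.5 to complete the proof that $\tilde{d}r$ and $\tilde{d}\theta$ are a coordinate basis.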
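The equality of the mixed partial derivatives in (5.89) and (5.90) is easy to spot-check numerically. The sketch below (my own helper names; central differences with an arbitrary step) differentiates the first derivatives from (5.88), written in Cartesian form, and compares both with $(-x^2 + y^2)/(x^2 + y^2)^2$.

```python
def d_eta_dx(x, y):
    """-(1/r) sin(theta) written in Cartesian form, from Eq. (5.88)."""
    return -y / (x * x + y * y)

def d_eta_dy(x, y):
    """(1/r) cos(theta) written in Cartesian form, from Eq. (5.88)."""
    return x / (x * x + y * y)

def mixed_yx(x, y, h=1e-5):
    """d/dy of d(eta)/dx by central differences, cf. Eq. (5.89)."""
    return (d_eta_dx(x, y + h) - d_eta_dx(x, y - h)) / (2 * h)

def mixed_xy(x, y, h=1e-5):
    """d/dx of d(eta)/dy by central differences, cf. Eq. (5.90)."""
    return (d_eta_dy(x + h, y) - d_eta_dy(x - h, y)) / (2 * h)

def closed_form(x, y):
    """Common value found in Eqs. (5.89) and (5.90)."""
    return (-x * x + y * y) / (x * x + y * y) ** 2
```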

20. For a noncoordinate basis $\{\vec{e}_\mu\}$, define

\[
\nabla_{\vec{e}_\mu}\vec{e}_\nu - \nabla_{\vec{e}_\nu}\vec{e}_\mu \equiv c^\alpha{}_{\mu\nu}\,\vec{e}_\alpha
\]

and use this in place of Eq. (5.74) to generalize Eq. (5.75).

The Christoffel symbol [of the second kind] arises from the derivative of the basis vectors, as in Eq. (5.44). For coordinate bases it is symmetric on the two lower indices, Eq. (5.74), and this was used to derive Eq. (5.75). Now we must derive the generalization of Eq. (5.75) appropriate for noncoordinate bases.

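It is clear that $c^\alpha{}_{\mu\nu}$ so defined is antisymmetric on the two lower indices. It is convenient to introduce a new symbol for the connection coefficient appropriate for the noncoordinate bases as follows:

\begin{align}
\nabla_{\vec{e}_\mu}\vec{e}_\nu &\equiv \lambda^\alpha{}_{\mu\nu}\,\vec{e}_\alpha \notag\\
&= \left(\Gamma^\alpha{}_{\mu\nu} + \frac{1}{2}c^\alpha{}_{\mu\nu}\right)\vec{e}_\alpha \tag{5.91}
\end{align}

For each $\alpha$, $\lambda^\alpha{}_{\mu\nu}$ is a $4\times4$ matrix and it can be uniquely decomposed into its symmetric and antisymmetric parts. The $c^\alpha{}_{\mu\nu}$ is then identified with twice the antisymmetric part and the regular Christoffel symbol with the symmetric part. We then replace in Eq. (5.72)

\[
\Gamma^\alpha{}_{\mu\nu} \rightarrow \lambda^\alpha{}_{\mu\nu}
\]

and repeat the steps on p. 134:

\begin{align}
g_{\alpha\beta,\mu} &= \lambda^\nu{}_{\alpha\mu}\,g_{\nu\beta} + \lambda^\nu{}_{\beta\mu}\,g_{\alpha\nu} \notag\\
g_{\alpha\mu,\beta} &= \lambda^\nu{}_{\alpha\beta}\,g_{\nu\mu} + \lambda^\nu{}_{\mu\beta}\,g_{\alpha\nu} \notag\\
-g_{\beta\mu,\alpha} &= -\lambda^\nu{}_{\beta\alpha}\,g_{\nu\mu} - \lambda^\nu{}_{\mu\alpha}\,g_{\beta\nu} \tag{5.92}
\end{align}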

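Simply add both sides, grouping terms with common factors of $g$, remembering that we can still exploit the symmetry of $g$. [These have been colour coded above to help you find them quickly.] We get of course the same as Schutz on p. 134 but with $\Gamma^\alpha{}_{\mu\nu} \rightarrow \lambda^\alpha{}_{\mu\nu}$:

\begin{align}
g_{\alpha\beta,\mu} + g_{\alpha\mu,\beta} - g_{\beta\mu,\alpha} &= (\lambda^\nu{}_{\alpha\mu} - \lambda^\nu{}_{\mu\alpha})\,g_{\beta\nu} + (\lambda^\nu{}_{\alpha\beta} - \lambda^\nu{}_{\beta\alpha})\,g_{\nu\mu} \notag\\
&\quad + (\lambda^\nu{}_{\beta\mu} + \lambda^\nu{}_{\mu\beta})\,g_{\alpha\nu} \tag{5.93}
\end{align}

Now we exploit the fact that $\lambda^\alpha{}_{\mu\nu}$ has been represented in terms of its symmetric and antisymmetric parts. That is,

\begin{align}
(\lambda^\nu{}_{\alpha\mu} - \lambda^\nu{}_{\mu\alpha}) &= c^\nu{}_{\alpha\mu} \notag\\
(\lambda^\nu{}_{\alpha\beta} - \lambda^\nu{}_{\beta\alpha}) &= c^\nu{}_{\alpha\beta} \notag\\
(\lambda^\nu{}_{\beta\mu} + \lambda^\nu{}_{\mu\beta}) &= 2\Gamma^\nu{}_{\beta\mu} \tag{5.94}
\end{align}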

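which gives us

\begin{align}
g_{\alpha\beta,\mu} + g_{\alpha\mu,\beta} - g_{\beta\mu,\alpha} &= c^\nu{}_{\alpha\mu}\,g_{\beta\nu} + c^\nu{}_{\alpha\beta}\,g_{\nu\mu} + 2\Gamma^\nu{}_{\beta\mu}\,g_{\alpha\nu} \notag\\
g_{\alpha\beta,\mu} + g_{\alpha\mu,\beta} - g_{\beta\mu,\alpha} &= c_{\alpha\mu\beta} + c_{\alpha\beta\mu} + 2\Gamma^\nu{}_{\beta\mu}\,g_{\alpha\nu}, && \text{lowered index} \tag{5.95}
\end{align}

We’re almost there! Multiply by $(1/2)\,g^{\alpha\gamma}$ and solve for the Christoffel symbol:

\begin{align}
\Gamma^\nu{}_{\beta\mu}\,\delta^\gamma{}_\nu &= \frac{1}{2}g^{\alpha\gamma}\left(g_{\alpha\beta,\mu} + g_{\alpha\mu,\beta} - g_{\beta\mu,\alpha} - c_{\alpha\mu\beta} - c_{\alpha\beta\mu}\right) \notag\\
\Gamma^\gamma{}_{\beta\mu} &= \frac{1}{2}g^{\alpha\gamma}\left(g_{\alpha\beta,\mu} + g_{\alpha\mu,\beta} - g_{\beta\mu,\alpha} - c_{\alpha\mu\beta} - c_{\alpha\beta\mu}\right) \tag{5.96}
\end{align}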

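21a. Hold $\lambda$ fixed and let $a$ vary in Eq. (5.96), showing that these coordinate curves are orthogonal to the world lines (coordinate curves obtained with $a$ fixed and $\lambda$ varying).

Let $\vec{A}$ be the tangent to the curve with $\lambda$ fixed and varying $a$:

\[
\vec{A} = \vec{e}_0\,\frac{\partial t(a,\lambda)}{\partial a} + \vec{e}_x\,\frac{\partial x(a,\lambda)}{\partial a} = \vec{e}_0\sinh(\lambda) + \vec{e}_x\cosh(\lambda) \tag{5.97}
\]

Let $\vec{B}$ be the tangent to the curve with $a$ fixed and varying $\lambda$ (i.e. the world lines):

\[
\vec{B} = \vec{e}_0\,\frac{\partial t(a,\lambda)}{\partial\lambda} + \vec{e}_x\,\frac{\partial x(a,\lambda)}{\partial\lambda} = \vec{e}_0\,a\cosh(\lambda) + \vec{e}_x\,a\sinh(\lambda) \tag{5.98}
\]

Now it’s easy to show that

\[
\vec{A}\cdot\vec{B} = -a\sinh(\lambda)\cosh(\lambda) + a\sinh(\lambda)\cosh(\lambda) = 0 \tag{5.99}
\]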

21b. Show that Eq. (5.96) defines a transformation from coordinates $(t, x)$ to coordinates $(\lambda, a)$ that form an orthogonal coordinate system. Draw these coordinates and show they only cover half of the original $t$-$x$ plane. Apparently there are two “disjoint quadrants” separated by $|t| = |x|$.

The transformation matrix is

\[
\Lambda^\alpha{}_{\alpha'} =
\begin{pmatrix}
\partial t/\partial\lambda & \partial t/\partial a \\
\partial x/\partial\lambda & \partial x/\partial a
\end{pmatrix}
=
\begin{pmatrix}
a\cosh(\lambda) & \sinh(\lambda) \\
a\sinh(\lambda) & \cosh(\lambda)
\end{pmatrix} \tag{5.100}
\]

And the determinant of this transformation matrix is

\[
\det(\Lambda^\alpha{}_{\alpha'}) = a
\]

which for $a \neq 0$ is a legitimate transformation.

The basis vectors will be

\[
\Lambda^\alpha{}_{\alpha'}\,\vec{e}_\alpha = \vec{e}_{\alpha'}
\]

so that we have already found the new basis vectors in terms of the old:

\begin{align}
\vec{e}_a &= \vec{A} \notag\\
\vec{e}_\lambda &= \vec{B} \tag{5.101}
\end{align}

and shown they were orthogonal in part (a) above.

“Plotting the coordinates” sounds vague, but certainly the easiest thing to do is plot the curves in the $t$-$x$ plane obtained by holding one of the $a$-$\lambda$ pair fixed and varying the other. These are coordinate curves. Before plotting these coordinate curves (see Fig. 5.3) it’s useful to reflect on what they will look like. Say $a = 1$ and $\lambda \ll -1$, then

\begin{align}
t(1,\lambda) &= 1\cdot\sinh(\lambda) = \frac{\exp(\lambda) - \exp(-\lambda)}{2} \approx -\frac{\exp(-\lambda)}{2} \notag\\
x(1,\lambda) &= 1\cdot\cosh(\lambda) = \frac{\exp(\lambda) + \exp(-\lambda)}{2} \approx +\frac{\exp(-\lambda)}{2} \tag{5.102}
\end{align}

so the curve approaches the straight line $t = -x$ in the limit $\lambda \to -\infty$. And the family of curves approaches this same limit regardless of the value of $a$ as long as it’s finite. (We’ll discuss $a < 0$ in a minute. For now think of $a > 0$.) At $\lambda = 0$ we have $t = 0$ and $x = a$. So for various $a > 0$ we have a family of curves in the 4th quadrant ($t \le 0$ and $x > 0$).

For $\lambda > 0$ the curves in the 1st quadrant are the reflection about the $x$-axis at $t = 0$ of the curves we just described in the 4th quadrant. In particular, all curves with $a > 0$ approach $t = x$ as $\lambda \to +\infty$.

For $a < 0$ we have a family of curves in the 2nd and 3rd quadrants that are the reflection about the $t$-axis at $x = 0$ of the curves we just described in the 1st and 4th quadrants respectively. Thus we see that the regions between $t = x$ and $t = -x$ that contain the $t$-axis (where $|t| > |x|$) are not parameterized by $(\lambda, a)$.

[Figure 5.3: Coordinate curves for Eq. (5.96), plotted in the $t$-$x$ plane. Plot was partly generated with Maple™, see accompanying file schutz2009_ch5.mw.]

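21c. Metric tensor and Christoffel symbols.

We know the metric $\eta_{\alpha\beta}$ in $t$-$x$ coordinates. So we transform this metric to $\lambda$-$a$ coordinates as we did in § 5.2:

\begin{align}
g_{\alpha'\beta'} &= \Lambda^\alpha{}_{\alpha'}\,\Lambda^\beta{}_{\beta'}\,g_{\alpha\beta} \notag\\
&= \frac{\partial x^\alpha}{\partial x^{\alpha'}}\frac{\partial x^\beta}{\partial x^{\beta'}}\,g_{\alpha\beta} \tag{5.103}
\end{align}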

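Let’s do this one component at a time. There are only 3 to check since $g_{\alpha'\beta'} = g_{\beta'\alpha'}$:

\begin{align}
g_{0'0'} &= \Lambda^\alpha{}_{0'}\,\Lambda^\beta{}_{0'}\,g_{\alpha\beta} \notag\\
g_{\lambda\lambda} &= \frac{\partial x^\alpha}{\partial\lambda}\frac{\partial x^\beta}{\partial\lambda}\,g_{\alpha\beta}, && \text{just rewriting} \notag\\
&= \frac{\partial t}{\partial\lambda}\frac{\partial t}{\partial\lambda}\,\eta_{00} + \frac{\partial x}{\partial\lambda}\frac{\partial x}{\partial\lambda}\,\eta_{11} \notag\\
&= \left(\frac{\partial\,a\sinh\lambda}{\partial\lambda}\right)^2(-1) + \left(\frac{\partial\,a\cosh\lambda}{\partial\lambda}\right)^2(+1) \notag\\
&= (a\cosh\lambda)^2(-1) + (a\sinh\lambda)^2(+1) \notag\\
&= -a^2 \tag{5.104}
\end{align}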

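And the off-diagonal term:

\begin{align}
g_{0'1'} &= \Lambda^\alpha{}_{0'}\,\Lambda^\beta{}_{1'}\,g_{\alpha\beta} \notag\\
g_{\lambda a} &= \frac{\partial x^\alpha}{\partial\lambda}\frac{\partial x^\beta}{\partial a}\,g_{\alpha\beta}, && \text{just rewriting} \notag\\
&= \frac{\partial t}{\partial\lambda}\frac{\partial t}{\partial a}\,\eta_{00} + \frac{\partial x}{\partial\lambda}\frac{\partial x}{\partial a}\,\eta_{11} \notag\\
&= \left(\frac{\partial\,a\sinh\lambda}{\partial\lambda}\right)\left(\frac{\partial\,a\sinh\lambda}{\partial a}\right)(-1) + \left(\frac{\partial\,a\cosh\lambda}{\partial\lambda}\right)\left(\frac{\partial\,a\cosh\lambda}{\partial a}\right)(+1) \notag\\
&= a\cosh\lambda\,\sinh\lambda\,(-1) + a\sinh\lambda\,\cosh\lambda\,(+1) \notag\\
&= 0 \notag\\
&= g_{a\lambda} && \text{symmetry of the metric tensor.} \tag{5.105}
\end{align}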

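The final component:

\begin{align}
g_{1'1'} &= \Lambda^\alpha{}_{1'}\,\Lambda^\beta{}_{1'}\,g_{\alpha\beta} \notag\\
g_{aa} &= \frac{\partial x^\alpha}{\partial a}\frac{\partial x^\beta}{\partial a}\,g_{\alpha\beta}, && \text{just rewriting} \notag\\
&= \frac{\partial t}{\partial a}\frac{\partial t}{\partial a}\,\eta_{00} + \frac{\partial x}{\partial a}\frac{\partial x}{\partial a}\,\eta_{11} \notag\\
&= \left(\frac{\partial\,a\sinh\lambda}{\partial a}\right)^2(-1) + \left(\frac{\partial\,a\cosh\lambda}{\partial a}\right)^2(+1) \notag\\
&= (\sinh\lambda)^2(-1) + (\cosh\lambda)^2(+1) \notag\\
&= 1 \tag{5.106}
\end{align}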

There are only $2\times3 = 6$ components of the Christoffel symbol to compute. We use Eq. (5.75) since we’re in a coordinate basis. We’ll need the inverse metric tensor

\[
(g^{\alpha\beta}) = \begin{pmatrix} -a^2 & 0 \\ 0 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} -a^{-2} & 0 \\ 0 & 1 \end{pmatrix} \tag{5.107}
\]

\[
\Gamma^\alpha{}_{\mu\nu} = \frac{1}{2}g^{\alpha\sigma}\left(g_{\sigma\mu,\nu} + g_{\sigma\nu,\mu} - g_{\mu\nu,\sigma}\right) \tag{5.108}
\]

The metric depends only upon $a$ since nowhere does $\lambda$ appear in our components of $g$ above. So we immediately conclude

\[
\Gamma^\lambda{}_{\lambda\lambda} = 0. \tag{5.109}
\]

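\begin{align}
\Gamma^\lambda{}_{\lambda a} &= \frac{1}{2}g^{\lambda\sigma}\left(g_{\sigma\lambda,a} + g_{\sigma a,\lambda} - g_{\lambda a,\sigma}\right) \notag\\
&= \frac{1}{2}g^{\lambda\lambda}\left(g_{\lambda\lambda,a} + g_{\lambda a,\lambda} - g_{\lambda a,\lambda}\right), && \text{diagonal metric} \notag\\
&= \frac{1}{2}g^{\lambda\lambda}\left(g_{\lambda\lambda,a}\right), && \text{diagonal metric} \notag\\
&= \frac{1}{2}\left(\frac{-1}{a^2}\right)\frac{\partial(-a^2)}{\partial a} = \frac{1}{a} \notag\\
&= \Gamma^\lambda{}_{a\lambda} && \text{Eq. (5.74)} \tag{5.110}
\end{align}

\begin{align}
\Gamma^\lambda{}_{aa} &= \frac{1}{2}g^{\lambda\sigma}\left(g_{\sigma a,a} + g_{\sigma a,a} - g_{aa,\sigma}\right) \notag\\
&= \frac{1}{2}g^{\lambda\lambda}\left(g_{\lambda a,a} + g_{\lambda a,a} - g_{aa,\lambda}\right) \notag\\
&= \frac{1}{2}g^{\lambda\lambda}\left(-g_{aa,\lambda}\right), && \text{diagonal metric} \notag\\
&= 0. \tag{5.111}
\end{align}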

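\begin{align}
\Gamma^a{}_{\lambda\lambda} &= \frac{1}{2}g^{a\sigma}\left(g_{\sigma\lambda,\lambda} + g_{\sigma\lambda,\lambda} - g_{\lambda\lambda,\sigma}\right) \notag\\
&= \frac{1}{2}g^{aa}\left(g_{a\lambda,\lambda} + g_{a\lambda,\lambda} - g_{\lambda\lambda,a}\right) \notag\\
&= \frac{1}{2}g^{aa}\left(-g_{\lambda\lambda,a}\right), && \text{diagonal metric} \notag\\
&= \frac{1}{2}(+1)\left(-\frac{\partial(-a^2)}{\partial a}\right) = a \tag{5.112}
\end{align}

\begin{align}
\Gamma^a{}_{\lambda a} &= \frac{1}{2}g^{a\sigma}\left(g_{\sigma\lambda,a} + g_{\sigma a,\lambda} - g_{\lambda a,\sigma}\right) \notag\\
&= \frac{1}{2}g^{aa}\left(g_{a\lambda,a} + g_{aa,\lambda} - g_{\lambda a,a}\right) \notag\\
&= \frac{1}{2}g^{aa}\left(+g_{aa,\lambda}\right) \notag\\
&= 0. \tag{5.113}
\end{align}

\begin{align}
\Gamma^a{}_{aa} &= \frac{1}{2}g^{a\sigma}\left(g_{\sigma a,a} + g_{\sigma a,a} - g_{aa,\sigma}\right) \notag\\
&= \frac{1}{2}g^{aa}\left(g_{aa,a} + g_{aa,a} - g_{aa,a}\right) \notag\\
&= 0. \tag{5.114}
\end{align}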
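The metric and all six Christoffel symbols of part (c) can be cross-checked with a few lines of code. This is only a sketch with names of my own choosing: the metric is built from the transformation matrix (5.100), and the Christoffel symbols from Eq. (5.108) with the metric derivatives approximated by central differences.

```python
import math

def metric(lam, a):
    """g_{i'j'} for t = a sinh(lam), x = a cosh(lam); index 0 = lambda, 1 = a."""
    J = [[a * math.cosh(lam), math.sinh(lam)],   # dt/dlam, dt/da
         [a * math.sinh(lam), math.cosh(lam)]]   # dx/dlam, dx/da
    eta = [-1.0, 1.0]                            # diagonal of eta_{alpha beta}
    return [[sum(eta[k] * J[k][i] * J[k][j] for k in range(2))
             for j in range(2)] for i in range(2)]

def christoffel(lam, a, h=1e-5):
    """G[al][mu][nu] = Gamma^al_{mu nu} from Eq. (5.108)."""
    g = metric(lam, a)
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    ginv = [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]
    plus = [metric(lam + h, a), metric(lam, a + h)]
    minus = [metric(lam - h, a), metric(lam, a - h)]
    # dg[s][i][j] = g_{ij,s}
    dg = [[[(plus[s][i][j] - minus[s][i][j]) / (2 * h) for j in range(2)]
           for i in range(2)] for s in range(2)]
    return [[[0.5 * sum(ginv[al][s] * (dg[nu][s][mu] + dg[mu][s][nu] - dg[s][mu][nu])
                        for s in range(2))
              for nu in range(2)] for mu in range(2)] for al in range(2)]
```

For any $a \neq 0$ the only nonzero entries should be `G[0][0][1] = G[0][1][0]` (that is, $\Gamma^\lambda{}_{\lambda a} = 1/a$) and `G[1][0][0]` ($\Gamma^a{}_{\lambda\lambda} = a$).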

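22. Show that if

\[
U^\alpha\nabla_\alpha V^\beta = W^\beta
\]

then

\[
U^\alpha\nabla_\alpha V_\beta = W_\beta
\]

We simply use the metric tensor to lower the index $\beta$ on both sides of the equals sign in the first expression:

\begin{align}
g_{\sigma\beta}\,U^\alpha\nabla_\alpha V^\sigma &= g_{\sigma\beta}\,W^\sigma \notag\\
U^\alpha\nabla_\alpha(g_{\sigma\beta}V^\sigma) &= g_{\sigma\beta}\,W^\sigma && \text{metric commutes with covariant derivative, Eq. (5.71)} \notag\\
U^\alpha\nabla_\alpha V_\beta &= W_\beta && \text{lower the index} \tag{5.115}
\end{align}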


5.9 Rob’s supplementary problems

SP.1 Eq. (5.28b) states that the magnitude of the one-form basis is

\[
|\tilde{d}\theta| = \frac{1}{r}
\]

while Eq. (5.28a) implies that

\[
|\vec{e}_\theta| = r
\]

Does this contradict Eq. (3.47), wherein the magnitude of a one-form was stated to be the same as that of its associated vector? Hint: The answer is of course “no”. Work through problem 34 of § 3.10, which might help.

Solution: The answer is of course “no, there is no contradiction”. The one-form bases are not simply the associated one-forms of the vector bases. This was stated explicitly by Schutz in § 3.3, p. 61. See also the next supplementary problem SP.2.

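SP.2 Find the one-form associated with vector $\vec{e}_\alpha$ for a fixed $\alpha$.

Solution: The one-form associated with a vector $\vec{A}$ is

\[
\tilde{A} = \mathbf{g}(\vec{A},\ )
\]

as stated just before Eq. (5.67), and first introduced in § 3.5. We simply substitute $A^\alpha\vec{e}_\alpha$, with $A^\alpha = 1$ for fixed $\alpha$, into the above expression:

\begin{align}
\tilde{A} &= \mathbf{g}(\vec{A},\ ) \notag\\
&= g_{\mu\nu}\,\tilde\omega^\mu\otimes\tilde\omega^\nu\,(A^\alpha\vec{e}_\alpha,\ ) \notag\\
&= A^\alpha\,g_{\mu\nu}\,\tilde\omega^\mu(\vec{e}_\alpha)\,\tilde\omega^\nu(\ ) \notag\\
&= A^\alpha\,g_{\mu\nu}\,\delta^\mu{}_\alpha\,\tilde\omega^\nu(\ ) \notag\\
&= g_{\alpha\nu}\,\tilde\omega^\nu(\ ), && \text{used } A^\alpha = 1 \tag{5.116}
\end{align}

So we see that only if the metric is diagonal is $g_{\alpha\alpha}\,\tilde\omega^\alpha$ the one-form associated with $\vec{e}_\alpha$, for a fixed $\alpha$, and furthermore only if $g_{\alpha\alpha} = 1$ is $\tilde\omega^\alpha$ the one-form associated with $\vec{e}_\alpha$.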


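SP.3 Write the identity matrix as the product of two transformations and take the partial derivative $\partial/\partial x^\mu$ to show that

\[
\Lambda^\alpha{}_{\alpha',\mu}\,\Lambda^{\alpha'}{}_\beta + \Lambda^\alpha{}_{\alpha'}\,\Lambda^{\alpha'}{}_{\beta,\mu} = 0
\]

Solution:

\begin{align}
\Lambda^\alpha{}_{\alpha'}\,\Lambda^{\alpha'}{}_\beta &= \delta^\alpha{}_\beta && \text{a transform and its inverse transform} \notag\\
\Lambda^\alpha{}_{\alpha',\mu}\,\Lambda^{\alpha'}{}_\beta + \Lambda^\alpha{}_{\alpha'}\,\Lambda^{\alpha'}{}_{\beta,\mu} &= 0 && \text{LHS: product rule, RHS: identity matrix constant} \tag{5.117}
\end{align}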

SP.4 Multiply Eq. (5.43) by the one-form basis $\tilde\omega^\beta = \tilde{d}x^\beta$ so that it’s a proper tensor equation. Then show that it transforms as we would hope, that is like a $\binom{1}{1}$ tensor. This supplementary problem is meant to be easier than Schutz’s problem 17 but to carry the same message. Hint: Go back to Eq. (3.10) and remind yourself how we showed that $\tilde{p}$ was invariant under a change of coordinates. Furthermore, you’ll need the result from SP.3.

Solution: Multiplying Eq. (5.43) by the one-form basis $\tilde\omega^\beta = \tilde{d}x^\beta$ gives,

\[
\tilde{d}x^\beta\,\frac{\partial}{\partial x^\beta}(V^\alpha\vec{e}_\alpha) = \tilde{d}x^\beta\left[\vec{e}_\alpha\,\frac{\partial}{\partial x^\beta}V^\alpha + V^\alpha\,\frac{\partial}{\partial x^\beta}\vec{e}_\alpha\right]
\]

In Eq. (3.10) we showed that one-forms are invariant under a change of coordinates (or rather we assumed they were invariant and found the appropriate transformation). However you interpret it, we do the same manipulations here but for a mixed rank 2 tensor. To be frame invariant, we require

\[
\tilde{d}x^\beta\,\frac{\partial}{\partial x^\beta}(V^\alpha\vec{e}_\alpha) = \tilde{d}x^{\beta'}\,\frac{\partial}{\partial x^{\beta'}}(V^{\alpha'}\vec{e}_{\alpha'})
\]

where, in keeping with Chapter 5 notation, the primes indicate a different coordinate system. Now let’s expand the RHS, writing it in terms of known transformations from things in the unprimed frame. Our strategy will be to obtain terms on the RHS like $V^\alpha$ and $\tilde{d}x^\beta$ that we can cancel with terms on the LHS.

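\begin{align}
&\tilde{d}x^\beta\,\frac{\partial}{\partial x^\beta}(V^\alpha\vec{e}_\alpha) \notag\\
&= \tilde{d}x^{\beta'}\,\frac{\partial}{\partial x^{\beta'}}(V^{\alpha'}\vec{e}_{\alpha'}) && \text{assumed it’s frame invariant} \notag\\
&= \tilde{d}x^{\beta'}\left[\vec{e}_{\alpha'}\,\frac{\partial}{\partial x^{\beta'}}V^{\alpha'} + V^{\alpha'}\,\frac{\partial}{\partial x^{\beta'}}\vec{e}_{\alpha'}\right] && \text{product rule (as in Eq. (5.43))} \notag\\
&= (\Lambda^{\beta'}{}_\beta\,\tilde{d}x^\beta)\left[(\Lambda^\alpha{}_{\alpha'}\vec{e}_\alpha)\,\frac{\partial}{\partial x^{\beta'}}V^{\alpha'} + V^{\alpha'}\,\frac{\partial}{\partial x^{\beta'}}(\Lambda^\alpha{}_{\alpha'}\vec{e}_\alpha)\right] && \text{unprimed bases} \notag\\
&= (\Lambda^{\beta'}{}_\beta\,\tilde{d}x^\beta)\left[(\Lambda^\alpha{}_{\alpha'}\vec{e}_\alpha)\,\frac{\partial}{\partial x^{\beta'}}(V^\nu\Lambda^{\alpha'}{}_\nu) + (V^\nu\Lambda^{\alpha'}{}_\nu)\,\frac{\partial}{\partial x^{\beta'}}(\Lambda^\alpha{}_{\alpha'}\vec{e}_\alpha)\right] && \text{unprimed components} \notag\\
&= (\Lambda^{\beta'}{}_\beta\,\tilde{d}x^\beta)\left[(\Lambda^\alpha{}_{\alpha'}\vec{e}_\alpha)\left(\frac{\partial x^\mu}{\partial x^{\beta'}}\frac{\partial}{\partial x^\mu}\right)(V^\nu\Lambda^{\alpha'}{}_\nu) + (V^\nu\Lambda^{\alpha'}{}_\nu)\left(\frac{\partial x^\mu}{\partial x^{\beta'}}\frac{\partial}{\partial x^\mu}\right)(\Lambda^\alpha{}_{\alpha'}\vec{e}_\alpha)\right] && \text{chain rule} \notag\\
&= (\tilde{d}x^\beta)\left[(\Lambda^\alpha{}_{\alpha'}\vec{e}_\alpha)\left(\delta^\mu{}_\beta\frac{\partial}{\partial x^\mu}\right)(V^\nu\Lambda^{\alpha'}{}_\nu) + (V^\nu\Lambda^{\alpha'}{}_\nu)\left(\delta^\mu{}_\beta\frac{\partial}{\partial x^\mu}\right)(\Lambda^\alpha{}_{\alpha'}\vec{e}_\alpha)\right] && \text{inverses} \notag\\
&= (\tilde{d}x^\mu)\left[(\Lambda^\alpha{}_{\alpha'}\vec{e}_\alpha)\left(\frac{\partial}{\partial x^\mu}\right)(V^\nu\Lambda^{\alpha'}{}_\nu) + (V^\nu\Lambda^{\alpha'}{}_\nu)\left(\frac{\partial}{\partial x^\mu}\right)(\Lambda^\alpha{}_{\alpha'}\vec{e}_\alpha)\right] && \text{simplified} \notag\\
&= (\tilde{d}x^\mu)\left[(\Lambda^\alpha{}_{\alpha'}\vec{e}_\alpha)\left(V^\nu\Lambda^{\alpha'}{}_{\nu,\mu} + V^\nu{}_{,\mu}\Lambda^{\alpha'}{}_\nu\right) + (V^\nu\Lambda^{\alpha'}{}_\nu)\left(\Lambda^\alpha{}_{\alpha',\mu}\,\vec{e}_\alpha + \Lambda^\alpha{}_{\alpha'}\frac{\partial\vec{e}_\alpha}{\partial x^\mu}\right)\right] && \text{product rule} \notag\\
&= (\tilde{d}x^\mu)\left[(\Lambda^\alpha{}_{\alpha'}\vec{e}_\alpha)\left(V^\nu{}_{,\mu}\Lambda^{\alpha'}{}_\nu\right) + (V^\nu\Lambda^{\alpha'}{}_\nu)\left(\Lambda^\alpha{}_{\alpha'}\frac{\partial\vec{e}_\alpha}{\partial x^\mu}\right)\right] && \text{used SP.3} \notag\\
&= (\tilde{d}x^\mu)\left[\delta^\alpha{}_\nu\,\vec{e}_\alpha\,V^\nu{}_{,\mu} + V^\nu\,\delta^\alpha{}_\nu\,\frac{\partial\vec{e}_\alpha}{\partial x^\mu}\right] && \text{inverse transforms} \notag\\
&= (\tilde{d}x^\mu)\left[\vec{e}_\nu\,V^\nu{}_{,\mu} + V^\nu\,\frac{\partial\vec{e}_\nu}{\partial x^\mu}\right] && \text{simplified} \tag{5.118}
\end{align}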

And if we changed dummy indices, we’d be back where we started. So applying the expected transformation rules we found the covariant derivative of a vector does transform as we expected, like a $\binom{1}{1}$ tensor. And notice that, because of the product rule of elementary differential calculus, the covariant derivative involved two terms. Neither of these terms transformed like a tensor on its own, because of the troublesome derivatives of the transformations, terms like $\Lambda^\alpha{}_{\alpha',\mu}$ above. But these two terms cancelled when we used the result of SP.3. This was the point of Schutz’s problem 17.


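SP.5 In Exer. 19 we showed that $\tilde{d}r$ and $\tilde{d}\theta$ are consistent with a coordinate basis. Complete the proof that $\tilde{d}r$ and $\tilde{d}\theta$ are a coordinate basis.

Solution: One must repeat the argument for $\xi$ also. Analogous to Eq. (5.82) we obtain:

\[
\frac{\partial\xi}{\partial x} = \cos\theta, \qquad \frac{\partial\xi}{\partial y} = \sin\theta \tag{5.119}
\]

Find the common mixed partial derivative,

\begin{align}
\frac{\partial^2\xi}{\partial y\,\partial x} &= \frac{\partial}{\partial y}(\cos\theta) \notag\\
&= \frac{\partial}{\partial y}\left(\frac{x}{\sqrt{x^2 + y^2}}\right) = -\frac{xy}{(x^2 + y^2)^{3/2}}, && \text{chain rule} \tag{5.120}
\end{align}

And

\begin{align}
\frac{\partial^2\xi}{\partial x\,\partial y} &= \frac{\partial}{\partial x}(\sin\theta) \notag\\
&= \frac{\partial}{\partial x}\left(\frac{y}{\sqrt{x^2 + y^2}}\right) = -\frac{xy}{(x^2 + y^2)^{3/2}}, && \text{chain rule} \notag\\
&= \frac{\partial^2\xi}{\partial y\,\partial x} \tag{5.121}
\end{align}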


Chapter 6

Curved Manifolds


6.3 Covariant differentiation

Typo in Eqs. (6.30) and (6.31): the colon and period should be a semicolon and a comma.

Typo just before Eq. (6.36), reference should be to Eq. (5.52), not (5.53).

6.5 The Curvature Tensor

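Typo in Eq. (6.67), the first upper index should be $\alpha$ not $\sigma$:

\[
R^\alpha{}_{\beta\mu\nu} = \frac{1}{2}g^{\alpha\sigma}\left(g_{\sigma\nu,\beta\mu} - g_{\beta\nu,\sigma\mu} - g_{\sigma\mu,\beta\nu} + g_{\beta\mu,\sigma\nu}\right). \tag{6.1}
\]

This will be derived in Exer. 17.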

6.6 Bianchi identities: Ricci and Einstein tensors

Typo in Eq. (6.91), the $\sigma$ should be an $\alpha$, i.e.

\[
R_{\alpha\beta} \equiv R^\mu{}_{\alpha\mu\beta} = R_{\beta\alpha}
\]

6.9 Exercises

It might help to tackle my supplementary problems first, see § 6.10 below.

1. Decide if the following sets are manifolds and say why. If there are exceptional points at which the sets are not manifolds, give them:

We are given only an intuitive explanation of manifolds in § 6.1, where we’re told that a manifold is a space that can be continuously parameterized; that there is a smooth mapping from points of the manifold to a Euclidean space of the same dimension. So I believe such an intuitive explanation is all that is required here.

(a) Phase space of Hamiltonian mechanics, the space of the canonical coordinates and momenta $p_i$ and $q^i$;

I believe this is a typo and should read “. . . the space of the canonical coordinates i.e. momenta $p_i$ and $q^i$”, since in Hamiltonian mechanics the canonical coordinates are momenta $p_i$ and $q^i$, see http://en.wikipedia.org/wiki/Canonical_coordinates.

The answer is “yes” of course, since the phase space is a manifold. The intuitive reason is that the generalized momentum $p_i$ and generalized coordinates $q^i$ are parameterized by time, and on physical grounds this parameterization must be continuous; otherwise the particles would jump instantly in position or accelerate infinitely.

(b) The interior of a circle of unit radius in two-dimensional Euclidean space.

“Yes”, because again this can be parameterized by putting the centre of the circle at the origin, say, and parameterizing the points by the length of the radius vector and its angle to the $x$-axis.

(c) The set of permutations of $n$ objects.

I’m not completely clear on what the objects are nor what permutations of them are, but I believe the answer is “No”, because I don’t see how separate objects, potentially with a space between them, can be continuously parameterized.

(d) The subset of Euclidean space of two dimensions (coordinates $x$ and $y$) which is a solution of

\[
x\,y\,(x^2 + y^2 - 1) = 0.
\]

The solution is the union of the $x$-axis (i.e. $\forall x$, and $y = 0$), the $y$-axis, and the unit circle centred at the origin. So because these intersect, one could parameterize it continuously. But it’s not differentiable everywhere, so it’s not a differential manifold.

2. Of the manifolds in Exer. 1, on which is it customary to use a metric, and what is that metric? On which would a metric not normally be defined, and why?


(b) The natural coordinates would be polar coordinates, since the domain is then easily defined: $r < 1$ and $0 \le \theta \le 2\pi$. The metric tensor was given in § 5.2, c.f. Eq. (5.32).

3. (a) Show that given a diagonal matrix $D$, one can always find a matrix $R$ such that $R^T D R$ is also diagonal, with the same elements as $D$ but in ascending order.

Let’s call the reordered matrix $\tilde{D}$,

\[
\tilde{D} = R^T D R,
\]

and the elements of the matrices $d_{ij}$ for elements of $D$ etc. Then

\begin{align}
\tilde{d}_{kl} &= (r^T)_{ki}\,d_{ij}\,r_{jl} \notag\\
&= r_{ik}\,d_{ij}\,r_{jl} \notag\\
&= r_{ik}\,d_{ii}\,r_{il} \tag{6.2}
\end{align}

Suppose we want to move the diagonal element $d_{II}$ into slot $K$ for given (fixed) $I$ and $K$, i.e. we want

\[
\tilde{d}_{KK} = d_{II}
\]

We simply choose $r_{IK} = 1$ and $r_{iK} = 0\ \forall i \neq I$. But are the off-diagonal terms still zero? Yes, this is guaranteed because when $k \neq l$, we cannot have both $r_{ik} \neq 0$ and $r_{il} \neq 0$ since that would correspond to moving the diagonal element $d_{ii}$ into two different slots $\tilde{d}_{kk}$ and $\tilde{d}_{ll}$. That’s not allowed because each diagonal element can only be moved into one slot.

3. (b) Show that the diagonal elements can be scaled such that they are either $-1$, $0$, or $+1$ using another matrix $N$ as follows: $N^T \tilde{D} N$.

As we found above the new elements will be

\begin{align}
\tilde{\tilde{d}}_{kl} &= (n^T)_{ki}\,\tilde{d}_{ij}\,n_{jl} \notag\\
&= n_{ik}\,\tilde{d}_{ij}\,n_{jl} \notag\\
&= n_{ik}\,\tilde{d}_{ii}\,n_{il} \tag{6.3}
\end{align}

Suppose $\tilde{d}_{KK} \neq 0$ and we want the diagonal element $\tilde{\tilde{d}}_{KK}$ for given (fixed) $K$ to be $-1$ or $+1$, i.e. we want

\[
\tilde{\tilde{d}}_{KK} = 1\cdot\mathrm{sign}(\tilde{d}_{KK}).
\]

We simply choose $n_{KK} = 1/\sqrt{|\tilde{d}_{KK}|}$ and $n_{iK} = 0\ \forall i \neq K$.

But if $\tilde{d}_{KK} = 0$ we cannot do this, and must choose $n_{KK} = a$, where $a$ is any nonzero number, and $n_{iK} = 0\ \forall i \neq K$. We then end up with $\tilde{\tilde{d}}_{KK} = 0$.
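To make the constructions in (a) and (b) concrete, here is a small sketch in plain Python (helper names are my own). It builds the reordering matrix $R$ with $r_{IK} = 1$ whenever old slot $I$ moves to new slot $K$, then the scaling matrix $N$ with $n_{KK} = 1/\sqrt{|d_{KK}|}$, choosing $a = 1$ when the element is zero.

```python
import math

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def diag(values):
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

def reorder_matrix(values):
    """R with r_{IK} = 1 when old slot I should move to new slot K (ascending order)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])  # order[K] = I
    R = [[0.0] * n for _ in range(n)]
    for K, I in enumerate(order):
        R[I][K] = 1.0
    return R

def scaling_matrix(values):
    """Diagonal N with n_KK = 1/sqrt|d_KK|, or 1 (an arbitrary nonzero a) when d_KK = 0."""
    return diag([1.0 / math.sqrt(abs(d)) if d != 0 else 1.0 for d in values])

def canonical_form(values):
    """Return (R^T D R, N^T (R^T D R) N) for D = diag(values)."""
    D = diag(values)
    R = reorder_matrix(values)
    D1 = matmul(transpose(R), matmul(D, R))
    N = scaling_matrix([D1[i][i] for i in range(len(values))])
    return D1, matmul(transpose(N), matmul(D1, N))
```

For example, `canonical_form([4.0, -9.0, 0.0, 0.25])` should give a first matrix with diagonal $(-9, 0, 0.25, 4)$ and a second with diagonal $(-1, 0, 1, 1)$, up to rounding.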

3. (c) Show that none of the diagonal elements can be zero for the inverse of $A$ to exist.

The equation for the inverse is trivial for a diagonal matrix because one can find one equation with one unknown for each element of the inverse.

\[
D D^{-1} = I. \tag{6.4}
\]

So for element $(D^{-1})_{ii}$ we have

\[
(D^{-1})_{ii} = 1/(D)_{ii},
\]

and for off-diagonal elements

\[
(D^{-1})_{ij} = 0/(D)_{ii} = 0.
\]

So the inverse of a diagonal matrix is also diagonal but with the elements equal to the inverse of the original elements. When the original matrix had zero for one or more diagonal elements, then the inverse doesn’t exist because finding it would involve dividing by zero.

3. (d) Show that the metric of Eq. (6.2) can always be found.

As we are reminded in the text (p. 145), the metric tensor $\mathbf{g}$ is symmetric by definition (for example if it is defined from the dot product of two vectors, the order of the vectors does not matter). So as stated in Exer. 3, this implies that $(g)$ can be diagonalized. Furthermore $(g)$ must have an inverse (for the mapping from vectors to one-forms to be invertible). Then the results (a) through (c) show that we can reduce $(g)$ to a matrix with either $-1$ or $+1$ on the diagonal. Since we choose $(g)$ to have one negative eigenvalue we end up with one $-1$ on the diagonal, and remaining entries $+1$.


4. Prove the following results used in the proof of the local flatness theorem in § 6.2:

(a) $\partial^2 x^\alpha/\partial x^{\gamma'}\partial x^{\mu'}|_0$ has 40 independent values.

Consider first the operator, $\partial^2/\partial x^{\gamma'}\partial x^{\mu'}|_0$. It is represented by a $4\times4$ symmetric (because partial differentiation commutes) matrix, and has 10 independent elements (4 on the diagonal and 6 above the diagonal). For each element of this differential operator, there are 4 independent coordinates it can act on, $x^\alpha$. Hence there are $4\times10 = 40$ degrees of freedom in $\partial^2 x^\alpha/\partial x^{\gamma'}\partial x^{\mu'}|_0$.

(b) $\partial^3 x^\alpha/\partial x^{\lambda'}\partial x^{\mu'}\partial x^{\nu'}|_0$ has 80 independent values.

Again the problem is to omit the elements made redundant by the fact that the partial derivatives commute. Again, start with just the differential operator which here is $\partial^3/\partial x^{\lambda'}\partial x^{\mu'}\partial x^{\nu'}|_0$. There are 4 diagonal elements. There are $4\times3 = 12$ ways to choose two of the derivatives the same and one different (like $\lambda' = \mu' \neq \nu'$. It’s not necessary to also consider $\lambda' \neq \mu' = \nu'$ because we’ve accounted for these elements already, since the order of the derivatives does not matter). And finally there are the elements where $\lambda' \neq \mu' \neq \nu'$. Here it is easiest to count them by noting that for any given choice there is only one unused index value, so there are 4 such elements. Adding these three types of terms we get $4 + 12 + 4 = 20$. Again, for each element of this differential operator, there are 4 independent coordinates it can act on, $x^\alpha$. Hence there are $4\times20 = 80$ degrees of freedom.

(c) $g_{\alpha\beta,\gamma'\mu'}|_0$ has 100 independent values.

This one is easy. The differential operator has 10 degrees of freedom, see Exer. (a) above. And this is applied to the metric which is a symmetric $4\times4$ tensor, and hence has 10 independent values. Thus there are $10\times10 = 100$ independent values in total.
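The three counts above amount to counting unordered index choices, which can be done by brute force. A minimal sketch (the function name is my own) using the standard library:

```python
from itertools import combinations_with_replacement

def symmetric_count(n_dim, n_indices):
    """Number of independent components of an object symmetric in n_indices indices,
    each running over n_dim values: count unordered index choices."""
    return sum(1 for _ in combinations_with_replacement(range(n_dim), n_indices))

# (a) 4 coordinates x^alpha times a symmetric pair of derivative indices
second_derivatives = 4 * symmetric_count(4, 2)
# (b) 4 coordinates x^alpha times a symmetric triple of derivative indices
third_derivatives = 4 * symmetric_count(4, 3)
# (c) symmetric metric indices times a symmetric pair of derivative indices
metric_second_derivatives = symmetric_count(4, 2) * symmetric_count(4, 2)
```

Here `symmetric_count(4, 2)` reproduces the 10 of part (a) and `symmetric_count(4, 3)` the 20 of part (b).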

5. (a) Prove that $\Gamma^\mu{}_{\alpha\beta} = \Gamma^\mu{}_{\beta\alpha}$ in any coordinate system in a curved Riemannian space.

This Exercise is so important one really must do it. By the local flatness theorem, c.f. § 6.2, on a general Riemann manifold, there is a local inertial (Lorentz) reference frame. In a Lorentz frame spacetime is flat and one can construct a reference frame with basis vectors that do not change with position, so the Christoffel symbols are zero. This is all one needs to reproduce the argument of § 5.4 leading to Eq. (5.74).

5. (b) Use this to prove that Eq. (6.32) can be derived in the same manner as in flat space.

The same principle as in (a), i.e. the local flatness theorem, is involved in deriving Eq. (6.31) (which is identical to Eq. (5.71)). And Eq. (6.32) is identical to Eq. (5.75). The argument leading to Eq. (5.75) can be repeated in curved Riemann space because it used Eqs. (5.71, 5.72, 5.74).

Eq. (5.72) is valid in curved space. Eq. (5.74) was proved above in (a). Eq. (5.71) was given.

6. Prove the first term in Eq. (6.37) vanishes.

Recall Eq. (6.37),

\[
\Gamma^\alpha{}_{\mu\alpha} = \frac{1}{2}g^{\alpha\beta}\left(g_{\beta\mu,\alpha} - g_{\mu\alpha,\beta}\right) + \frac{1}{2}g^{\alpha\beta}g_{\alpha\beta,\mu}
\]

As pointed out in the text, we only need to prove that

\[
(g_{\beta\mu,\alpha} - g_{\mu\alpha,\beta})
\]

is antisymmetric. Then we can use the result of Exer. 26(a) of § 3.10 to show that the term vanishes. First we note that $g_{\beta\mu} = g_{\mu\beta}$ because of the symmetry of the metric tensor.

\[
(g_{\beta\mu,\alpha} - g_{\mu\alpha,\beta}) = (g_{\mu\beta,\alpha} - g_{\mu\alpha,\beta}).
\]

Then we note that for each $\mu$, the RHS is antisymmetric in $\alpha$ and $\beta$ since obviously

\[
(g_{\mu\beta,\alpha} - g_{\mu\alpha,\beta}) = -(g_{\mu\alpha,\beta} - g_{\mu\beta,\alpha}).
\]

And for each $\mu$, this term antisymmetric in $\alpha$ and $\beta$ will be multiplied by the inverse metric tensor $g^{\alpha\beta}$ that is obviously symmetric in $\alpha$ and $\beta$. Thus, by the result of Exer. 26(a) of § 3.10, the first term vanishes.

7. This problem is similar to 8.16 on page 222 of the book by Misner et al. (1973).


7. (a) Give the definition of the determinant of a matrix A in terms ofcofactors of elements.

7. (b) Differentiate the determinant of an arbitrary 2 × 2 matrix andshow that it satisfies Eq. (6.39).

7. (c) Generalize Eq. (6.39) (by induction or otherwise) to arbitrary $n\times n$ matrices.

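8. Fill in the missing algebra leading to Eqs. (6.40) and (6.42).

I find it easiest to work backwards. That is, start with Eq. (6.40) and differentiate using the chain rule:

\begin{align}
\Gamma^\alpha{}_{\mu\alpha} &= \frac{(\sqrt{-g})_{,\mu}}{\sqrt{-g}} \notag\\
&= \frac{1}{2}\frac{(-g)_{,\mu}}{(\sqrt{-g})^2} \notag\\
&= \frac{1}{2}\frac{g_{,\mu}}{g}. \tag{6.5}
\end{align}

Substitution of Eq. (6.39) leads directly to Eq. (6.38).

And for the second part, again I find it easiest to work backwards. That is, start with Eq. (6.42) and differentiate using the product rule and chain rule:

\begin{align}
V^\alpha{}_{;\alpha} &= \frac{1}{\sqrt{-g}}\left(\sqrt{-g}\,V^\alpha\right)_{,\alpha} \notag\\
&= \frac{\sqrt{-g}}{\sqrt{-g}}\,V^\alpha{}_{,\alpha} + \frac{V^\alpha}{\sqrt{-g}}\left(\sqrt{-g}\right)_{,\alpha} \notag\\
&= V^\alpha{}_{,\alpha} + \frac{V^\alpha}{\sqrt{-g}}\left(\sqrt{-g}\right)_{,\alpha}, \tag{6.6}
\end{align}

which is Eq. (6.41). And Eq. (6.41) was obtained directly by substitution of Eq. (6.40) into Eq. (6.36).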


9. Show that Eq. (6.42) leads to Eq. (5.56).

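This amounts to showing that the general formula for the divergence of a velocity field is consistent with the special case derived in § 5.3 for polar coordinates. We start with Eq. (5.32) for the metric in polar coordinates. The determinant is simply

$$g = \det(g_{\alpha\beta}) = r^{2}.$$

(Here $g > 0$, so $\sqrt{-g}$ is imaginary, but the factor $\sqrt{-1}$ cancels between numerator and denominator below.) Substituting into Eq. (6.42), we find

$$\begin{aligned}
V^{\alpha}{}_{;\alpha} &= \frac{1}{\sqrt{-g}}\,(\sqrt{-g}\,V^{\alpha})_{,\alpha} \\
&= V^{\alpha}{}_{,\alpha} + \frac{V^{\alpha}}{\sqrt{-g}}\,(\sqrt{-g})_{,\alpha} \\
&= V^{r}{}_{,r} + V^{\theta}{}_{,\theta} + \frac{V^{r}}{\sqrt{-r^{2}}}\,(\sqrt{-r^{2}})_{,r} \\
&= V^{r}{}_{,r} + V^{\theta}{}_{,\theta} + \frac{V^{r}}{r},
\end{aligned} \tag{6.7}$$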

consistent with the first line of Eq. (5.56).

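Find the divergence formula for the metric given in Eq. (6.19), i.e. that for spherical polar coordinates.

From Eq. (6.19), the determinant is

$$g = r^{4}\sin^{2}\theta,$$

whose square root $\sqrt{-g}$ has two nonzero gradient components,

$$\frac{\partial\sqrt{-r^{4}\sin^{2}\theta}}{\partial r} = \frac{-2r^{3}\sin^{2}\theta}{\sqrt{-r^{4}\sin^{2}\theta}}, \qquad
\frac{\partial\sqrt{-r^{4}\sin^{2}\theta}}{\partial\theta} = \frac{-r^{4}\sin\theta\cos\theta}{\sqrt{-r^{4}\sin^{2}\theta}}. \tag{6.8}$$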


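Substituting into Eq. (6.42), we find

$$\begin{aligned}
V^{\alpha}{}_{;\alpha} &= V^{r}{}_{,r} + V^{\theta}{}_{,\theta} + V^{\phi}{}_{,\phi}
+ \frac{V^{r}}{\sqrt{-r^{4}\sin^{2}\theta}}\,\frac{-2r^{3}\sin^{2}\theta}{\sqrt{-r^{4}\sin^{2}\theta}}
+ \frac{V^{\theta}}{\sqrt{-r^{4}\sin^{2}\theta}}\,\frac{-r^{4}\sin\theta\cos\theta}{\sqrt{-r^{4}\sin^{2}\theta}} \\
&= V^{r}{}_{,r} + V^{\theta}{}_{,\theta} + V^{\phi}{}_{,\phi} + \frac{2V^{r}}{r} + \frac{V^{\theta}}{\tan\theta}.
\end{aligned} \tag{6.9}$$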
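As a sanity check on Eq. (6.9), here is a minimal numerical sketch. It evaluates the right-hand side of (6.9) by central differences for two test fields whose Cartesian divergence is known: the position vector (divergence 3) and the constant vector along z (divergence 0). The function names, the sample point and the step size `h` are my own illustrative choices, and the components are those in the coordinate basis.

```python
import math

def divergence_spherical(V, r, th, ph, h=1e-5):
    """Right-hand side of Eq. (6.9); partial derivatives by central differences.

    V(r, th, ph) must return the coordinate-basis components (V^r, V^theta, V^phi).
    """
    d_r = (V(r + h, th, ph)[0] - V(r - h, th, ph)[0]) / (2 * h)
    d_th = (V(r, th + h, ph)[1] - V(r, th - h, ph)[1]) / (2 * h)
    d_ph = (V(r, th, ph + h)[2] - V(r, th, ph - h)[2]) / (2 * h)
    Vr, Vth, _ = V(r, th, ph)
    return d_r + d_th + d_ph + 2 * Vr / r + Vth / math.tan(th)

def radial_field(r, th, ph):
    # Position vector (x, y, z) = r e_r; its Cartesian divergence is 3.
    return (r, 0.0, 0.0)

def uniform_z_field(r, th, ph):
    # Constant unit vector along z: V^r = cos(theta), V^theta = -sin(theta)/r.
    # Its Cartesian divergence is 0.
    return (math.cos(th), -math.sin(th) / r, 0.0)

point = (1.7, 0.8, 0.3)  # arbitrary (r, theta, phi) away from the axis
div_radial = divergence_spherical(radial_field, *point)
div_uniform = divergence_spherical(uniform_z_field, *point)
print(div_radial, div_uniform)
```

The two printed values should be close to 3 and 0 respectively, up to finite-difference error.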

10. Consider a triangle made up of great circles on a sphere intersecting at points A, B, and C, as in Fig. 6.3 but with B not necessarily on the pole. Show that the amount by which a vector is rotated by parallel transport around such a triangle equals the excess of the sum of the angles over 180° = π rad.

This is a good problem because it forces us to think carefully through all the steps of the first subsection of § 6.4. We strongly need a diagram to do this one. Let's use that of Fig. 6.3. The only constraint given was that the sides of the triangle form great circles on the sphere. Without loss of generality we can take A and C to be on the equator. We can label the interior angles of the triangle as follows: CAB is the angle at A between the great circle through CA and that through AB, and similarly for the other two. As in § 6.4, let's start with a vector at A that is parallel to the equator. The angle from the vector to the geodesic through AB is CAB. This angle doesn't change as we move to B because the vector is parallel-transported. Now imagine looking down on the point B. The angle between the extension of geodesic AB and the vector is still CAB. Let φ be the angle between the vector at B and geodesic BC. Then

ABC + φ + CAB = π rad.

I believe this is the only tricky part of this problem. It stems from the fact that the intersection of two great circles forms four angles that add to 2π rad, with those on one side of a great circle adding to π rad. I take this as visually obvious (unfortunately I can't offer any proof of this). Moving from B to C the angle φ doesn't change. At C, let θ be the angle between the vector and AC, i.e. the equator. There,

BCA = θ + φ.


Combining these two equations and solving for θ gives

θ = CAB + ABC + BCA − π.

11. When is the gradient of the velocity field zero everywhere?

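$$V^{\alpha}{}_{;\beta} = 0 = V^{\alpha}{}_{,\beta} + \Gamma^{\alpha}{}_{\mu\beta}V^{\mu} \tag{6.10}$$

(a) A necessary condition for the solution of (6.10) is the integrability condition for (6.10), which apparently follows from the commuting of partial derivatives. We are to use the fact that $V^{\alpha}{}_{,\nu\beta} = V^{\alpha}{}_{,\beta\nu}$ to derive the equation:

$$(\Gamma^{\alpha}{}_{\mu\beta,\nu} - \Gamma^{\alpha}{}_{\mu\nu,\beta})\,V^{\mu} = (\Gamma^{\alpha}{}_{\mu\beta}\Gamma^{\mu}{}_{\sigma\nu} - \Gamma^{\alpha}{}_{\mu\nu}\Gamma^{\mu}{}_{\sigma\beta})\,V^{\sigma} \tag{6.11}$$

This problem is fairly straightforward once one thinks about "what would I do to (6.10) to obtain a term like . . . in (6.11)". Obviously to obtain a term like $\Gamma^{\alpha}{}_{\mu\beta,\nu}$ one must take $\partial/\partial x^{\nu}$ of (6.10). To get rid of the resulting $V^{\alpha}{}_{,\beta\nu}$ one simply does this twice (the second time with the roles of $\beta$ and $\nu$ interchanged) and subtracts the two results, using the commuting property of partial differentiation, as Schutz indicated. But then one must also deal with terms like $\Gamma^{\alpha}{}_{\mu\beta}V^{\mu}{}_{,\nu}$. For these one uses (6.10) again (without differentiating), i.e. $V^{\mu}{}_{,\nu} = -\Gamma^{\mu}{}_{\sigma\nu}V^{\sigma}$.

(b) By relabeling indices, work this into the form given. Actually the form given has a typo and should be:

$$(\Gamma^{\alpha}{}_{\mu\beta,\nu} - \Gamma^{\alpha}{}_{\mu\nu,\beta} + \Gamma^{\alpha}{}_{\sigma\nu}\Gamma^{\sigma}{}_{\mu\beta} - \Gamma^{\alpha}{}_{\sigma\beta}\Gamma^{\sigma}{}_{\mu\nu})\,V^{\mu} = 0. \tag{6.12}$$

This problem is straightforward. The first two terms in (6.12) and (6.11) are identical. Based on the sign of the 3rd term in (6.12), it's obviously the final term in (6.11), moved to the left-hand side. So we must replace µ → σ. That leaves the final term in (6.12) as the 3rd term in (6.11). So we need to also replace σ → µ.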

12. Prove that Eq. (6.52) defines a new affine parameter.


Using φ defined in Eq. (6.52) to parameterize the curve, we can find the equation for the geodesics by using Eq. (6.51) and the chain rule. So the operator

$$\frac{d}{d\phi}\left(\frac{d}{d\phi}\right) = \left(\frac{d\lambda}{d\phi}\right)^{2}\frac{d}{d\lambda}\left(\frac{d}{d\lambda}\right) = \left(\frac{1}{a}\right)^{2}\frac{d}{d\lambda}\left(\frac{d}{d\lambda}\right). \tag{6.13}$$

That means we can rewrite Eq. (6.51) in terms of φ by simply dividing Eq. (6.51) by $a^{2}$, which is constant. Because Eq. (6.51) is set to zero, this doesn't change the form of the equation at all, as indicated on p. 157 in the equation below Eq. (6.52).

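13. (a) Show that if $\vec{A}$ and $\vec{B}$ are parallel-transported along a curve, then $g(\vec{A}, \vec{B}) = \vec{A}\cdot\vec{B}$ is constant on the curve.

A vector that is parallel-transported along a curve is moved in the direction of the tangent to the curve without rotating or changing its length. From this notion it seems obvious that the dot product of two vectors that were parallel-transported along a curve would not change. To demonstrate this mathematically, one could take the derivative along the curve (parameterized by λ) of the dot product:

$$\frac{d\,g(\vec{A}, \vec{B})}{d\lambda} = \frac{d}{d\lambda}\left(g_{\alpha\beta}A^{\alpha}B^{\beta}\right)
= A^{\alpha}B^{\beta}\frac{d}{d\lambda}(g_{\alpha\beta}) + g_{\alpha\beta}B^{\beta}\frac{d}{d\lambda}(A^{\alpha}) + g_{\alpha\beta}A^{\alpha}\frac{d}{d\lambda}(B^{\beta}). \tag{6.14}$$

Here $d/d\lambda$ acting on components is to be read as the covariant derivative along the curve, $U^{\mu}\nabla_{\mu}$, where $\vec{U}$ is the tangent to the curve. This is legitimate because $g(\vec{A}, \vec{B})$ is a scalar, and it coincides with the ordinary derivative in a locally inertial frame at the point in question. All the derivatives are zero. The first term is the derivative of the metric along the curve and is zero as a consequence of the local flatness theorem:

$$\frac{d}{d\lambda}(g_{\alpha\beta}) = U^{\mu}g_{\alpha\beta;\mu} = 0, \quad \text{c.f. Eq. (6.31)}. \tag{6.15}$$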


The 2nd and 3rd terms are the derivatives of the components of the vectors. These are zero because these vectors were assumed to be parallel-transported along the curve, c.f. Eq. (6.47).
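The result of 13(a) can be illustrated numerically. The sketch below is my own construction, not Schutz's: it parallel-transports two vectors once around a circle of constant latitude on the unit sphere, integrating $dA^{\alpha}/d\lambda = -\Gamma^{\alpha}{}_{\mu\beta}U^{\beta}A^{\mu}$ with a fourth-order Runge-Kutta step. It assumes the unit-sphere metric $g_{\theta\theta} = 1$, $g_{\phi\phi} = \sin^{2}\theta$ and the standard Christoffel symbols $\Gamma^{\theta}{}_{\phi\phi} = -\sin\theta\cos\theta$, $\Gamma^{\phi}{}_{\theta\phi} = \cot\theta$. The function names, the latitude and the step count are illustrative choices.

```python
import math

def transport(A, theta0, n_steps=2000):
    """Parallel-transport A = (A^theta, A^phi) once around the circle
    theta = theta0 on the unit sphere (tangent U = (0, 1), parameter = phi)."""
    s, c = math.sin(theta0), math.cos(theta0)

    def rhs(a):
        # dA^theta/dlam = -Gamma^theta_{phi phi} A^phi = sin(theta) cos(theta) A^phi
        # dA^phi/dlam   = -Gamma^phi_{theta phi} A^theta = -cot(theta) A^theta
        return (s * c * a[1], -(c / s) * a[0])

    h = 2 * math.pi / n_steps
    a = tuple(A)
    for _ in range(n_steps):  # classical RK4 step
        k1 = rhs(a)
        k2 = rhs((a[0] + 0.5 * h * k1[0], a[1] + 0.5 * h * k1[1]))
        k3 = rhs((a[0] + 0.5 * h * k2[0], a[1] + 0.5 * h * k2[1]))
        k4 = rhs((a[0] + h * k3[0], a[1] + h * k3[1]))
        a = (a[0] + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6,
             a[1] + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6)
    return a

def dot(A, B, theta0):
    """g(A, B) with g_theta_theta = 1 and g_phi_phi = sin^2(theta)."""
    return A[0] * B[0] + math.sin(theta0) ** 2 * A[1] * B[1]

theta0 = 1.0
A, B = (1.0, 0.5), (-0.3, 2.0)
A_end, B_end = transport(A, theta0), transport(B, theta0)
dot_before = dot(A, B, theta0)
dot_after = dot(A_end, B_end, theta0)
print(dot_before, dot_after)
```

The components of each vector change around the loop (the vectors come back rotated), but the two printed dot products should agree to within integration error.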

13. (b) Conclude from the results of (a) that if a geodesic is spacelike (or timelike or null) at some point, it is necessarily spacelike (or timelike or null) at all points.

Vectors were defined as spacelike (or timelike or null) if their magnitude was > 0 (< 0, = 0), c.f. § 2.5 on p. 44. A geodesic is of course not a vector, but it does have a tangent vector at each point along the curve that gives the linear approximation to the displacement along the curve at the point, per unit of the parameter that parameterizes the curve. So it would be reasonable to call a geodesic spacelike at a point if its tangent vector $\vec{U}$ were of positive magnitude at that point,

$$\vec{U}\cdot\vec{U} = g_{\alpha\beta}U^{\alpha}U^{\beta}.$$

Can this change as one moves along the curve? The geodesic is, by definition, the curve that parallel-transports its own tangent vector. But from (a) we have that any two vectors that are parallel-transported along any curve keep the same dot product. So the tangent vector, dotted with itself, does not change as it is parallel-transported along the geodesic.

14. Proper distance along a curve whose tangent is $\vec{V}$ is given by Eq. (6.8). Show that if the curve is a geodesic, then proper length is an affine parameter. (Use results of Exerc. 13.)


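$$\begin{aligned}
l &= \int_{\lambda_{0}}^{\lambda_{1}} |\vec{V}\cdot\vec{V}|^{1/2}\,d\lambda, && \text{Eq. (6.8)} \\
&= \int_{\lambda_{0}}^{\lambda_{1}} |\vec{U}\cdot\vec{U}|^{1/2}\,d\lambda, && \text{where } \vec{U} \text{ is the tangent vector to the geodesic} \\
&= |\vec{U}\cdot\vec{U}|^{1/2}\int_{\lambda_{0}}^{\lambda_{1}} d\lambda, && \text{using Exerc. 13(b)}
\end{aligned} \tag{6.16}$$

Thus the proper distance along the curve has the form

$$\begin{aligned}
l(\lambda) &= |\vec{U}\cdot\vec{U}|^{1/2}(\lambda - \lambda_{0}) \\
&= |\vec{U}\cdot\vec{U}|^{1/2}\lambda - \lambda_{0}|\vec{U}\cdot\vec{U}|^{1/2} \\
&= a\lambda + b,
\end{aligned} \tag{6.17}$$

where $a = |\vec{U}\cdot\vec{U}|^{1/2}$ is constant (c.f. Exerc. 13(a)), and $b = -\lambda_{0}a$. This is the same form as Eq. (6.52), which was (hopefully) shown to be an affine parameter in Exerc. 12.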

15. Use Exercs. 13 and 14 to prove that the proper length of a geodesic between two points is unchanged to first order by small changes in the curve that do not change its endpoints.

This looks like a nice problem because it implies that geodesics are extrema in distance between two fixed points.

16. (a) Derive Eqs. (6.59) and (6.60) from Eq. (6.58).

This problem makes sure we're following the argument. Eq. (6.59) is found simply by expanding the integrand in a Taylor series (about point A) and keeping only the constant term and the term in the first derivative. The constant terms cancel in both cases leaving only the first derivative terms, leading to Eq. (6.59).

Treating the integrands in Eq. (6.59) as constants (in keeping with the Taylor series expansions), the integrals can be performed, giving immediately Eq. (6.60).


16. (b) Provide algebra needed to justify Eq. (6.61).

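Again a simple problem to make sure we're following the argument. Let's start with the first term in Eq. (6.60).

$$\begin{aligned}
-\frac{\partial}{\partial x^{1}}\left(\Gamma^{\alpha}{}_{\mu 2}V^{\mu}\right) &= -V^{\mu}\frac{\partial}{\partial x^{1}}\left(\Gamma^{\alpha}{}_{\mu 2}\right) - \Gamma^{\alpha}{}_{\mu 2}\frac{\partial}{\partial x^{1}}\left(V^{\mu}\right) \\
&= -V^{\mu}\,\Gamma^{\alpha}{}_{\mu 2,1} - \Gamma^{\alpha}{}_{\mu 2}\left(-\Gamma^{\mu}{}_{\nu 1}V^{\nu}\right), && \text{using Eq. (6.53)} \\
&= \left[-\Gamma^{\alpha}{}_{\mu 2,1} + \Gamma^{\alpha}{}_{\nu 2}\Gamma^{\nu}{}_{\mu 1}\right]V^{\mu},
\end{aligned} \tag{6.18}$$

where we let ν → µ and µ → ν in the 2nd term so that we can pull out the $V^{\mu}$. This is allowed because they are dummy (repeated) indices.

The 2nd term in Eq. (6.60) is dealt with in exactly the same way:

$$\begin{aligned}
\frac{\partial}{\partial x^{2}}\left(\Gamma^{\alpha}{}_{\mu 1}V^{\mu}\right) &= V^{\mu}\frac{\partial}{\partial x^{2}}\left(\Gamma^{\alpha}{}_{\mu 1}\right) + \Gamma^{\alpha}{}_{\mu 1}\frac{\partial}{\partial x^{2}}\left(V^{\mu}\right) \\
&= V^{\mu}\,\Gamma^{\alpha}{}_{\mu 1,2} + \Gamma^{\alpha}{}_{\mu 1}\left(-\Gamma^{\mu}{}_{\nu 2}V^{\nu}\right), && \text{using Eq. (6.53)} \\
&= \left[\Gamma^{\alpha}{}_{\mu 1,2} - \Gamma^{\alpha}{}_{\nu 1}\Gamma^{\nu}{}_{\mu 2}\right]V^{\mu},
\end{aligned} \tag{6.19}$$

where we let ν → µ and µ → ν in the 2nd term so that we can pull out the $V^{\mu}$.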

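17. (a) Prove that Eq. (6.5) implies that

$$g^{\alpha\beta}{}_{,\mu}(\mathcal{P}) = 0.$$

Because the metric tensor applied to its inverse gives the identity matrix, which is of course a constant, we have

$$\begin{aligned}
g_{\alpha\mu}g^{\mu\beta} &= g^{\beta}{}_{\alpha} = \delta^{\beta}{}_{\alpha}, \\
(g_{\alpha\mu}g^{\mu\beta})_{,\gamma} &= \delta^{\beta}{}_{\alpha,\gamma} = 0, \\
g_{\alpha\mu,\gamma}\,g^{\mu\beta} + g_{\alpha\mu}\,g^{\mu\beta}{}_{,\gamma} &= 0, \\
g_{\alpha\mu}\,g^{\mu\beta}{}_{,\gamma} &= 0, && \text{using Eq. (6.5)}.
\end{aligned} \tag{6.20}$$

And now for the tricky bit. In general $g_{\alpha\mu}$ is a general tensor and there could in principle be several non-zero terms in each column that cancel to produce zero when multiplied by $g^{\mu\beta}{}_{,\gamma}$. However, one can always choose one's basis such that $g_{\alpha\beta} = \eta_{\alpha\beta}$ at $\mathcal{P}$, c.f. Eq. (6.2), so I believe one can argue as follows. Then there is only one non-zero term in each column. For instance, let α = 0:

$$g_{0\mu}\,g^{\mu\beta}{}_{,\gamma} = -1\,g^{0\beta}{}_{,\gamma} = 0 \quad\Longrightarrow\quad g^{0\beta}{}_{,\gamma} = 0, \ \forall\,\beta, \gamma. \tag{6.21}$$

The same argument of course applies to all the other values of α, and thus

$$g^{\alpha\beta}{}_{,\gamma} = 0, \ \forall\,\alpha, \beta, \gamma.$$

(Alternatively, contracting the last line of (6.20) with $g^{\sigma\alpha}$ gives $\delta^{\sigma}{}_{\mu}\,g^{\mu\beta}{}_{,\gamma} = g^{\sigma\beta}{}_{,\gamma} = 0$ directly, since the metric is invertible.)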

17. (b) Use results of (a) to establish Eq. (6.64).

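A nice easy problem, perhaps the point being to show us what use the results in (a) can be put to. Starting with Eq. (6.32),

$$\Gamma^{\alpha}{}_{\mu\nu} = \frac{1}{2}g^{\alpha\beta}\,(g_{\beta\mu,\nu} + g_{\beta\nu,\mu} - g_{\mu\nu,\beta}), \tag{6.22}$$

we simply differentiate with respect to $x^{\sigma}$,

$$\begin{aligned}
\Gamma^{\alpha}{}_{\mu\nu,\sigma} &= \frac{1}{2}\frac{\partial}{\partial x^{\sigma}}\left(g^{\alpha\beta}\,(g_{\beta\mu,\nu} + g_{\beta\nu,\mu} - g_{\mu\nu,\beta})\right), \\
&= \frac{1}{2}g^{\alpha\beta}{}_{,\sigma}\,(g_{\beta\mu,\nu} + g_{\beta\nu,\mu} - g_{\mu\nu,\beta}) + \frac{1}{2}g^{\alpha\beta}\,(g_{\beta\mu,\nu\sigma} + g_{\beta\nu,\mu\sigma} - g_{\mu\nu,\beta\sigma}), \\
&= \frac{1}{2}g^{\alpha\beta}\,(g_{\beta\mu,\nu\sigma} + g_{\beta\nu,\mu\sigma} - g_{\mu\nu,\beta\sigma}).
\end{aligned} \tag{6.23}$$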

17. (c) Fill in step needed to establish Eq. (6.68).

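We start with Eq. (6.63). Because we're in a local inertial frame at point $\mathcal{P}$, the Christoffel symbol vanishes at $\mathcal{P}$. For the derivative of the Christoffel symbol, we substitute from Eq. (6.64), for the first term making the substitutions

$$\mu \to \beta, \qquad \sigma \to \mu, \qquad \beta \to \sigma. \tag{6.24}$$

For the 2nd term we simply interchange µ and ν in the first term and change the sign, giving first Eq. (6.65), and then, after cancelling the terms $g_{\sigma\beta,\nu\mu}$ and $-g_{\sigma\beta,\mu\nu}$ using Eq. (6.66), arriving at Eq. (6.67):

$$\begin{aligned}
R^{\alpha}{}_{\beta\mu\nu} &= \frac{1}{2}g^{\alpha\sigma}(g_{\sigma\beta,\nu\mu} + g_{\sigma\nu,\beta\mu} - g_{\beta\nu,\sigma\mu} - g_{\sigma\beta,\mu\nu} - g_{\sigma\mu,\beta\nu} + g_{\beta\mu,\sigma\nu}), \\
&= \frac{1}{2}g^{\alpha\sigma}(g_{\sigma\nu,\beta\mu} - g_{\beta\nu,\sigma\mu} - g_{\sigma\mu,\beta\nu} + g_{\beta\mu,\sigma\nu}), \\
&= \frac{1}{2}g^{\alpha\sigma}(g_{\sigma\nu,\beta\mu} - g_{\sigma\mu,\beta\nu} + g_{\beta\mu,\sigma\nu} - g_{\beta\nu,\sigma\mu}), && \text{just changed order.}
\end{aligned} \tag{6.25}$$

Finally we must lower the index,

$$\begin{aligned}
R_{\alpha\beta\mu\nu} &= g_{\alpha\lambda}R^{\lambda}{}_{\beta\mu\nu} \\
&= g_{\alpha\lambda}\frac{1}{2}g^{\lambda\sigma}(g_{\sigma\nu,\beta\mu} - g_{\sigma\mu,\beta\nu} + g_{\beta\mu,\sigma\nu} - g_{\beta\nu,\sigma\mu}), \\
&= g^{\sigma}{}_{\alpha}\frac{1}{2}(g_{\sigma\nu,\beta\mu} - g_{\sigma\mu,\beta\nu} + g_{\beta\mu,\sigma\nu} - g_{\beta\nu,\sigma\mu}), \\
&= \delta^{\sigma}{}_{\alpha}\frac{1}{2}(g_{\sigma\nu,\beta\mu} - g_{\sigma\mu,\beta\nu} + g_{\beta\mu,\sigma\nu} - g_{\beta\nu,\sigma\mu}), \\
&= \frac{1}{2}(g_{\alpha\nu,\beta\mu} - g_{\alpha\mu,\beta\nu} + g_{\beta\mu,\alpha\nu} - g_{\beta\nu,\alpha\mu}),
\end{aligned} \tag{6.26}$$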

which is Eq. (6.68).

18. (a) Derive Eqs. (6.69) and (6.70) from Eq. (6.68)

This question involves trivial (and not so trivial) index manipulation. In Eq. (6.68) one notices that changing the order of α and β changes the sign. This is clear because the first term, $g_{\alpha\nu,\beta\mu}$, is the negative of the last term, $-g_{\beta\nu,\alpha\mu}$, but with α and β in the opposite order. And similarly for the two middle terms. This observation gives the first equality of Eq. (6.69),

$$R_{\alpha\beta\mu\nu} = -R_{\beta\alpha\mu\nu}.$$

An analogous observation gives the 2nd equality in Eq. (6.69). That is, the first two terms of Eq. (6.68) are of opposite sign and have the second pair of indices, µ and ν, in the opposite order. Similarly for the 3rd and 4th terms. This gives the second equality of Eq. (6.69),

$$R_{\alpha\beta\mu\nu} = -R_{\alpha\beta\nu\mu}.$$

The 3rd and final equality in Eq. (6.69) is not so immediately clear. I found a proof using the following somewhat tedious procedure. Use Eq. (6.68) to find the expression for $R_{\mu\nu\alpha\beta}$. That is, simply use Eq. (6.68) with the following substitutions:

$$\alpha \to \mu, \qquad \beta \to \nu, \qquad \mu \to \alpha, \qquad \nu \to \beta.$$

One finds

$$R_{\mu\nu\alpha\beta} = \frac{1}{2}(g_{\mu\beta,\nu\alpha} - g_{\mu\alpha,\nu\beta} + g_{\nu\alpha,\mu\beta} - g_{\nu\beta,\mu\alpha}). \tag{6.27}$$

The most direct way to proceed from here is to use the facts that g is always symmetric and that derivatives commute. So we can interchange the order of the two indices before the comma, and of the two after it, without changing the result. Then one notices that the 3rd term in (6.27) above is the same as the first term in Eq. (6.68),

$$g_{\nu\alpha,\mu\beta} = g_{\alpha\nu,\beta\mu}. \tag{6.28}$$

And furthermore, the 2nd term in (6.27) above is the same as the 2nd term in Eq. (6.68); the 1st term in (6.27) above is the same as the 3rd term in Eq. (6.68); the 4th term in (6.27) above is the same as the 4th term in Eq. (6.68). This gives the 3rd equality in Eq. (6.69).

Proving Eq. (6.70) is also rather straightforward, albeit rather messy (at least for the brute force method I came up with). I simply wrote down the expressions for all three terms in Eq. (6.70), $R_{\alpha\beta\mu\nu}$, $R_{\alpha\nu\beta\mu}$ and $R_{\alpha\mu\nu\beta}$, using Eq. (6.68):

$$2R_{\alpha\beta\mu\nu} = +g_{\alpha\nu,\beta\mu} - g_{\alpha\mu,\beta\nu} + g_{\beta\mu,\alpha\nu} - g_{\beta\nu,\alpha\mu} \tag{6.29}$$

$$2R_{\alpha\nu\beta\mu} = +g_{\alpha\mu,\nu\beta} - g_{\alpha\beta,\nu\mu} + g_{\nu\beta,\alpha\mu} - g_{\nu\mu,\alpha\beta} \tag{6.30}$$

$$2R_{\alpha\mu\nu\beta} = +g_{\alpha\beta,\mu\nu} - g_{\alpha\nu,\mu\beta} + g_{\mu\nu,\alpha\beta} - g_{\mu\beta,\alpha\nu} \tag{6.31}$$


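Recognizing again that the order within the first pair and within the last pair of indices in terms like $g_{\alpha\beta,\mu\nu}$ doesn't matter, because the metric is symmetric and partial derivatives commute, we can cancel all the terms in the sum of (6.29), (6.30) and (6.31). The cancelling pairs are: $+g_{\alpha\nu,\beta\mu}$ in (6.29) with $-g_{\alpha\nu,\mu\beta}$ in (6.31); $-g_{\alpha\mu,\beta\nu}$ in (6.29) with $+g_{\alpha\mu,\nu\beta}$ in (6.30); $+g_{\beta\mu,\alpha\nu}$ in (6.29) with $-g_{\mu\beta,\alpha\nu}$ in (6.31); $-g_{\beta\nu,\alpha\mu}$ in (6.29) with $+g_{\nu\beta,\alpha\mu}$ in (6.30); $-g_{\alpha\beta,\nu\mu}$ in (6.30) with $+g_{\alpha\beta,\mu\nu}$ in (6.31); and $-g_{\nu\mu,\alpha\beta}$ in (6.30) with $+g_{\mu\nu,\alpha\beta}$ in (6.31).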

18. (b) Show that Eq. (6.69) reduces the number of independent components of $R_{\alpha\beta\mu\nu}$ to 21.

If you're having trouble here, perhaps it's worth having a go at Exerc. 35, since that problem involves writing down the independent elements for a specific example, perhaps forcing you to think about it in a more concrete or different way. Of course, Exerc. 35 would be much easier if you solved Exer. 18 first! If you're still stuck, here's my solution below.

First of all one must notice that terms of the form $R_{\alpha\alpha\mu\nu} = 0$, and of course also $R_{\alpha\beta\mu\mu} = 0$. Speaking from personal experience, one can waste quite some time if one doesn't appreciate this from the start! But once one imposes this, then the problem simplifies tremendously. For now there are only six independent choices for the first pair α, β (where order doesn't matter because of Eq. (6.69)). One can establish this a number of ways. For instance, there are 4 ways of choosing α and then only 3 ways of choosing β ≠ α, giving 4 × 3 = 12 pairs. But we have counted them twice because we counted α = 1, β = 2 separately from α = 2, β = 1, etc., so we must divide this by two to get 12/2 = 6 independent pairs. Similarly for the second pair, but let's be careful in choosing these because we're going to have to impose the symmetry given by the last equality in Eq. (6.69). Let's consider first those pairs µ, ν where of course we require µ ≠ ν, but also we impose that the pair µ, ν differs from the pair α, β. Thus for any given pair α, β we have only 5 pairs µ, ν. This counts twice the permutations $R_{\alpha\beta\mu\nu}$ and $R_{\mu\nu\alpha\beta}$, so we must divide this by 2 to get 6 × 5/2 = 15 independent elements. To this we must add the six cases in which the pair µ, ν is not different from the pair α, β, i.e. µ = α or β and ν = β or α. This gives a total of 15 + 6 = 21 independent elements of $R_{\alpha\beta\mu\nu}$ accounting for Eq. (6.69) only. (We account for Eq. (6.70) in the next problem.)

18. (c) Show that Eq. (6.70),

$$R_{\alpha\beta\mu\nu} + R_{\alpha\nu\beta\mu} + R_{\alpha\mu\nu\beta} = 0,$$

imposes only one restriction that is independent of Eq. (6.69), and thus reduces the number of independent components of $R_{\alpha\beta\mu\nu}$ to 20.

My solution is, as is often the case, rather longwinded. If you find a neater solution, please let me know! First let's establish that all the indices must be different. I do this by simply considering all the cases. Consider α = β, and then $R_{\alpha\alpha\mu\nu} = 0$, which implies, from Eq. (6.70),

$$\begin{aligned}
R_{\alpha\alpha\mu\nu} + R_{\alpha\nu\alpha\mu} + R_{\alpha\mu\nu\alpha} &= 0, \\
R_{\alpha\nu\alpha\mu} + R_{\alpha\mu\nu\alpha} &= 0, \\
R_{\alpha\nu\alpha\mu} - R_{\alpha\mu\alpha\nu} &= 0.
\end{aligned} \tag{6.32}$$

But this final result is a special case of Eq. (6.69), so it is not a new result, and does not provide any constraints on $R_{\alpha\beta\mu\nu}$ beyond those of Eq. (6.69). We proceed like this, next considering α = µ:

$$\begin{aligned}
R_{\alpha\beta\alpha\nu} + R_{\alpha\nu\beta\alpha} + R_{\alpha\alpha\nu\beta} &= 0, \\
R_{\alpha\beta\alpha\nu} + R_{\alpha\nu\beta\alpha} &= 0, \\
R_{\alpha\beta\alpha\nu} - R_{\alpha\nu\alpha\beta} &= 0.
\end{aligned} \tag{6.33}$$

This last equality was also implied by Eq. (6.69) as a special case. Unfortunately we have a few more cases to consider: α = ν, and separately β = µ, and separately µ = ν. They all reduce to special cases of Eq. (6.69). So we conclude that the only new information in Eq. (6.70) must come from the case when the indices are all unique. There is only one set of 4 indices all different, 0, 1, 2, 3, and these can form 3 unique pairs:

$$\begin{array}{cc}
\alpha, \beta & \mu, \nu \\
\hline
0, 1 & 2, 3 \\
0, 2 & 1, 3 \\
0, 3 & 2, 1
\end{array} \tag{6.34}$$

where the order of α, β does not matter, nor does the order of µ, ν. And notice also we don't distinguish between the case α, β = 0, 1 while µ, ν = 2, 3 and the case α, β = 2, 3 while µ, ν = 0, 1, because the elements must have the same value due to the last equality in Eq. (6.69). So of the 21 elements of $R_{\alpha\beta\mu\nu}$ described in (b) above, there are 3 such that the indices are all different. But these three elements obey the relation Eq. (6.70), which can be written, assuming all the indices α, β, µ, ν are unique, for example as

$$R_{0123} + R_{0312} + R_{0231} = 0.$$


It is not necessary to distinguish this equation from others that can be obtained by simple applications of the rules in Eq. (6.69). This equation then imposes one constraint, independently of Eq. (6.69), and thus reduces the number of independent elements of $R_{\alpha\beta\mu\nu}$ from the 21 we found in (b) to 20.
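The counting in 18(b) and 18(c) can be checked by brute force. The sketch below is my own bookkeeping, not part of Schutz's text: it maps every index quadruple to a canonical representative using the three symmetries of Eq. (6.69), counts the distinct non-vanishing representatives, and then finds how many independent linear constraints the cyclic identity Eq. (6.70) adds, using exact Gaussian elimination. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def canonical(a, b, c, d):
    """Reduce R_abcd to (sign, key) using the symmetries of Eq. (6.69).
    Returns (0, None) for components that vanish identically."""
    if a == b or c == d:
        return 0, None
    sign = 1
    if a > b:
        a, b, sign = b, a, -sign      # antisymmetry in the first pair
    if c > d:
        c, d, sign = d, c, -sign      # antisymmetry in the second pair
    if (a, b) > (c, d):
        a, b, c, d = c, d, a, b       # symmetry under exchange of the pairs
    return sign, (a, b, c, d)

def rank(rows):
    """Rank of a list of equal-length rows of Fractions (Gaussian elimination)."""
    rows = [r[:] for r in rows]
    n_cols = len(rows[0]) if rows else 0
    r = 0
    for col in range(n_cols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            if rows[i][col] != 0:
                f = rows[i][col] / rows[r][col]
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def count_independent(n):
    """Return (count after Eq. (6.69), count after also imposing Eq. (6.70))."""
    keys = sorted({canonical(*idx)[1] for idx in product(range(n), repeat=4)} - {None})
    pos = {k: i for i, k in enumerate(keys)}
    rows = []
    for a, b, c, d in product(range(n), repeat=4):
        row = [Fraction(0)] * len(keys)
        # R_abcd + R_adbc + R_acdb = 0, the index pattern of Eq. (6.70)
        for idx in ((a, b, c, d), (a, d, b, c), (a, c, d, b)):
            s, k = canonical(*idx)
            if s:
                row[pos[k]] += s
        if any(row):
            rows.append(row)
    return len(keys), len(keys) - rank(rows)

print(count_independent(4))
```

For n = 4 this should reproduce the 21 and 20 found above; the same function can be called with n = 2 or n = 3 to compare with the lower-dimensional cases.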

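19. Prove that $R^{\alpha}{}_{\beta\mu\nu} = 0$ for polar coordinates in the Euclidean plane.

First we find the number of independent components of $R_{\alpha\beta\mu\nu}$ in two dimensions. [Refer to Exer. 18(b) for the case of 4-dimensional space.] There is only one degree of freedom associated with the first pair of indices because

$$R_{r\theta\mu\nu} = -R_{\theta r\mu\nu}$$

and

$$R_{\alpha\alpha\mu\nu} = 0.$$

And similarly there is only one degree of freedom associated with the last two indices since

$$R_{\alpha\beta r\theta} = -R_{\alpha\beta\theta r}$$

and

$$R_{\alpha\beta\mu\mu} = 0.$$

It's not possible to use the cyclic identity to reduce this any further since, applying the cyclic identity, we discover something that was already true by the other symmetry relations, that is:

$$\begin{aligned}
R_{r\theta r\theta} + R_{r\theta\theta r} + R_{rr\theta\theta} &= 0, && \text{by the cyclic identity, Eq. (6.70)} \\
R_{r\theta r\theta} + R_{r\theta\theta r} &= 0, && \text{which is just a symmetry relation, Eq. (6.69)}
\end{aligned} \tag{6.35}$$

So there is only one independent value to compute. Let's compute

$$\begin{aligned}
R^{r}{}_{\theta r\theta} &= \Gamma^{r}{}_{\theta\theta,r} - \Gamma^{r}{}_{\theta r,\theta} + \Gamma^{\sigma}{}_{\theta\theta}\Gamma^{r}{}_{\sigma r} - \Gamma^{\sigma}{}_{\theta r}\Gamma^{r}{}_{\sigma\theta}, && \text{Eq. (6.63)} \\
&= \Gamma^{r}{}_{\theta\theta,r} + \Gamma^{r}{}_{\theta\theta}\Gamma^{r}{}_{rr} - \Gamma^{\theta}{}_{\theta r}\Gamma^{r}{}_{\theta\theta}, && \text{Eq. (5.45)} \\
&= \Gamma^{r}{}_{\theta\theta,r} - \Gamma^{\theta}{}_{\theta r}\Gamma^{r}{}_{\theta\theta}, && \text{Eq. (5.45)} \\
&= \frac{\partial(-r)}{\partial r} - \frac{1}{r}(-r), && \text{Eq. (5.45)} \\
&= 0.
\end{aligned} \tag{6.36}$$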


And this is of course what we expect since (despite the polar coordinates) we're in Euclidean space, which is flat, and Eq. (6.71) tells us that the Riemann tensor must then be zero.
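The calculation in (6.36) can be repeated numerically for every component at once. This is a rough sketch under my own assumptions: it takes the polar-coordinate Christoffel symbols used above ($\Gamma^{r}{}_{\theta\theta} = -r$, $\Gamma^{\theta}{}_{r\theta} = \Gamma^{\theta}{}_{\theta r} = 1/r$, all others zero), evaluates $R^{\alpha}{}_{\beta\mu\nu} = \Gamma^{\alpha}{}_{\beta\nu,\mu} - \Gamma^{\alpha}{}_{\beta\mu,\nu} + \Gamma^{\alpha}{}_{\sigma\mu}\Gamma^{\sigma}{}_{\beta\nu} - \Gamma^{\alpha}{}_{\sigma\nu}\Gamma^{\sigma}{}_{\beta\mu}$ with central differences for the derivatives, and reports the largest component in magnitude. The function names, sample point and step size are illustrative.

```python
from itertools import product

def christoffel_polar(x):
    """G[a][b][c] = Gamma^a_{bc} in plane polar coordinates x = (r, theta);
    index 0 is r and index 1 is theta."""
    r = x[0]
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = -r                      # Gamma^r_{theta theta}
    G[1][0][1] = G[1][1][0] = 1.0 / r    # Gamma^theta_{r theta}
    return G

def riemann(gamma, x, a, b, m, n, h=1e-5):
    """R^a_{bmn} built from Christoffel symbols; derivatives by central differences."""
    def d_gamma(i, j, l, k):
        # partial derivative of Gamma^i_{jl} with respect to coordinate k
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        return (gamma(xp)[i][j][l] - gamma(xm)[i][j][l]) / (2 * h)

    G = gamma(x)
    dim = len(x)
    quadratic = sum(G[a][s][m] * G[s][b][n] - G[a][s][n] * G[s][b][m]
                    for s in range(dim))
    return d_gamma(a, b, n, m) - d_gamma(a, b, m, n) + quadratic

point = (2.0, 0.7)  # arbitrary (r, theta) with r > 0
largest = max(abs(riemann(christoffel_polar, point, *idx))
              for idx in product(range(2), repeat=4))
print(largest)
```

The printed value should be zero to within finite-difference error, in agreement with (6.36).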

20. Fill in the algebra necessary to establish Eq. (6.73).

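The covariant derivative of a vector:

$$\begin{aligned}
\nabla_{\beta}V^{\mu} &= V^{\mu}{}_{;\beta} \\
&= V^{\mu}{}_{,\beta} + \Gamma^{\mu}{}_{\sigma\beta}V^{\sigma}, && \text{see Eq. (6.33)}.
\end{aligned} \tag{6.37}$$

Now we simply apply the gradient operator another time,

$$\begin{aligned}
\nabla_{\alpha}(\nabla_{\beta}V^{\mu}) &= (V^{\mu}{}_{;\beta})_{;\alpha} \\
&= (V^{\mu}{}_{;\beta})_{,\alpha} + \Gamma^{\mu}{}_{\sigma\alpha}V^{\sigma}{}_{;\beta} - \Gamma^{\sigma}{}_{\beta\alpha}V^{\mu}{}_{;\sigma}, && \text{see Eq. (6.34)} \\
&= (V^{\mu}{}_{;\beta})_{,\alpha}
\end{aligned} \tag{6.38}$$

where the last step used the fact that Christoffel symbols are zero in local inertial coordinates. But their gradients are not, so

$$\begin{aligned}
\nabla_{\alpha}(\nabla_{\beta}V^{\mu}) &= (V^{\mu}{}_{;\beta})_{,\alpha} \\
&= V^{\mu}{}_{,\beta\alpha} + (\Gamma^{\mu}{}_{\nu\beta}V^{\nu})_{,\alpha} \\
&= V^{\mu}{}_{,\beta\alpha} + \Gamma^{\mu}{}_{\nu\beta,\alpha}V^{\nu}
\end{aligned} \tag{6.39}$$

which is Eq. (6.73).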

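21. Following Eq. (6.78) it was claimed that one could generalize Eq. (6.77) for the commutator of the covariant derivative of a vector to a tensor with Eq. (6.78). Each index got a Riemann tensor and the sign was always positive, even when the index was a lower index. In parentheses it was claimed that this must be the case because the metric tensor g is unaffected by the covariant derivative.

This is a great problem. The result Eq. (6.78) seems to contradict Eq. (6.34). I cannot find the explanation Schutz is looking for. All I've managed so far on this is to derive Eq. (6.78) (see my supplementary problem SP2 in section 6.10 above) and to show that Eq. (6.34) is at least consistent in the following sense.

$$\begin{aligned}
V_{\alpha;\beta} &= (g_{\alpha\sigma}V^{\sigma})_{;\beta} \\
&= g_{\alpha\sigma;\beta}V^{\sigma} + g_{\alpha\sigma}V^{\sigma}{}_{;\beta} \\
&= (g_{\alpha\sigma,\beta} - \Gamma^{\mu}{}_{\sigma\beta}\,g_{\alpha\mu} - \Gamma^{\mu}{}_{\alpha\beta}\,g_{\mu\sigma})V^{\sigma} + g_{\alpha\sigma}V^{\sigma}{}_{;\beta} \\
&= (-\Gamma^{\mu}{}_{\sigma\beta}\,g_{\alpha\mu} - \Gamma^{\mu}{}_{\alpha\beta}\,g_{\mu\sigma})V^{\sigma} + g_{\alpha\sigma}V^{\sigma}{}_{;\beta} \\
&= (-\Gamma^{\mu}{}_{\sigma\beta}\,g_{\alpha\mu} - \Gamma^{\mu}{}_{\alpha\beta}\,g_{\mu\sigma})V^{\sigma} + g_{\alpha\sigma}(V^{\sigma}{}_{,\beta} + \Gamma^{\sigma}{}_{\nu\beta}V^{\nu})
\end{aligned} \tag{6.40}$$

Let's relabel the dummy indices so that it's more clear which terms cancel:

$$\begin{aligned}
V_{\alpha;\beta} &= (-\Gamma^{\mu}{}_{\sigma\beta}\,g_{\alpha\mu} - \Gamma^{\mu}{}_{\alpha\beta}\,g_{\mu\sigma})V^{\sigma} + g_{\alpha\mu}(V^{\mu}{}_{,\beta} + \Gamma^{\mu}{}_{\sigma\beta}V^{\sigma}) \\
&= -\Gamma^{\mu}{}_{\sigma\beta}\,g_{\alpha\mu}V^{\sigma} - \Gamma^{\mu}{}_{\alpha\beta}\,g_{\mu\sigma}V^{\sigma} + g_{\alpha\mu}V^{\mu}{}_{,\beta} + g_{\alpha\mu}\Gamma^{\mu}{}_{\sigma\beta}V^{\sigma} \\
&= g_{\alpha\mu}V^{\mu}{}_{,\beta} - \Gamma^{\mu}{}_{\alpha\beta}\,g_{\mu\sigma}V^{\sigma} \\
&= V_{\alpha,\beta} - \Gamma^{\mu}{}_{\alpha\beta}V_{\mu}
\end{aligned} \tag{6.41}$$

which is consistent with Eq. (6.34). Of course we used Eq. (6.34) in obtaining this, so it's not a proof. But it is a check on internal consistency.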

22. Establish Eqs. (6.84), (6.85), and (6.86).

23. Prove Eq. (6.88).

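Eq. (6.88) gives the partial derivative of the Riemann curvature tensor. As suggested in § 6.6, we can find it by starting with the definition of the Riemann curvature tensor given in Eq. (6.63).

$$\begin{aligned}
R_{\alpha\beta\mu\nu,\lambda} &= (g_{\alpha\gamma}R^{\gamma}{}_{\beta\mu\nu})_{,\lambda} \\
&= g_{\alpha\gamma}R^{\gamma}{}_{\beta\mu\nu,\lambda} && \text{using Eq. (6.5)} \\
&= g_{\alpha\gamma}(\Gamma^{\gamma}{}_{\beta\nu,\mu\lambda} - \Gamma^{\gamma}{}_{\beta\mu,\nu\lambda} + \Gamma^{\gamma}{}_{\sigma\mu,\lambda}\Gamma^{\sigma}{}_{\beta\nu} + \Gamma^{\gamma}{}_{\sigma\mu}\Gamma^{\sigma}{}_{\beta\nu,\lambda} - \Gamma^{\gamma}{}_{\sigma\nu,\lambda}\Gamma^{\sigma}{}_{\beta\mu} - \Gamma^{\gamma}{}_{\sigma\nu}\Gamma^{\sigma}{}_{\beta\mu,\lambda}) \\
&= g_{\alpha\gamma}(\Gamma^{\gamma}{}_{\beta\nu,\mu\lambda} - \Gamma^{\gamma}{}_{\beta\mu,\nu\lambda}) && \text{in local inertial frame}
\end{aligned} \tag{6.42}$$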


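We now use Eq. (6.32), which is purportedly correct in any coordinate system, to write Christoffel symbols in terms of the metric tensor. We must differentiate Eq. (6.32) with respect to $x^{\sigma}$:

$$\Gamma^{\gamma}{}_{\mu\nu,\sigma} = \frac{1}{2}g^{\gamma\beta}(g_{\beta\mu,\nu\sigma} + g_{\beta\nu,\mu\sigma} - g_{\mu\nu,\beta\sigma}) + \frac{1}{2}g^{\gamma\beta}{}_{,\sigma}(g_{\beta\mu,\nu} + g_{\beta\nu,\mu} - g_{\mu\nu,\beta}) \tag{6.43}$$

We might note that to arrive at Eq. (6.64), we would eliminate the terms that are zero by Eq. (6.5) and the results of Exerc. 17(a), i.e. $g^{\gamma\beta}{}_{,\sigma} = 0$, leading to:

$$\Gamma^{\gamma}{}_{\mu\nu,\sigma} = \frac{1}{2}g^{\gamma\beta}(g_{\beta\mu,\nu\sigma} + g_{\beta\nu,\mu\sigma} - g_{\mu\nu,\beta\sigma}) \tag{6.44}$$

But since we are going to differentiate this again, to be on the safe side, let's keep the terms that we eliminated from (6.43) because of $g^{\gamma\beta}{}_{,\sigma} = 0$. Now differentiate (6.43) with respect to $x^{\lambda}$ to arrive at:

$$\begin{aligned}
\Gamma^{\gamma}{}_{\mu\nu,\sigma\lambda} &= \frac{1}{2}g^{\gamma\beta}(g_{\beta\mu,\nu\sigma\lambda} + g_{\beta\nu,\mu\sigma\lambda} - g_{\mu\nu,\beta\sigma\lambda}) + \frac{1}{2}g^{\gamma\beta}{}_{,\lambda}(g_{\beta\mu,\nu\sigma} + g_{\beta\nu,\mu\sigma} - g_{\mu\nu,\beta\sigma}) \\
&\quad + \frac{1}{2}g^{\gamma\beta}{}_{,\sigma\lambda}(g_{\beta\mu,\nu} + g_{\beta\nu,\mu} - g_{\mu\nu,\beta}) + \frac{1}{2}g^{\gamma\beta}{}_{,\sigma}(g_{\beta\mu,\nu\lambda} + g_{\beta\nu,\mu\lambda} - g_{\mu\nu,\beta\lambda}) \\
&= \frac{1}{2}g^{\gamma\beta}(g_{\beta\mu,\nu\sigma\lambda} + g_{\beta\nu,\mu\sigma\lambda} - g_{\mu\nu,\beta\sigma\lambda})
\end{aligned} \tag{6.45}$$

All terms after the first didn't contribute because each contains a factor with a single derivative of the metric or of its inverse, which vanishes. So being careful didn't amount to anything here: we could have differentiated Eq. (6.64) right away. Armed with this 2nd derivative of the Christoffel symbol, we can substitute this into (6.42), giving:

$$\begin{aligned}
R_{\alpha\beta\mu\nu,\lambda} &= g_{\alpha\gamma}\frac{1}{2}g^{\gamma\sigma}[g_{\sigma\beta,\nu\mu\lambda} + g_{\sigma\nu,\beta\mu\lambda} - g_{\beta\nu,\sigma\mu\lambda} - (g_{\sigma\beta,\mu\nu\lambda} + g_{\sigma\mu,\beta\nu\lambda} - g_{\beta\mu,\sigma\nu\lambda})] \\
&= \frac{1}{2}g^{\sigma}{}_{\alpha}[g_{\sigma\nu,\beta\mu\lambda} - g_{\beta\nu,\sigma\mu\lambda} - (g_{\sigma\mu,\beta\nu\lambda} - g_{\beta\mu,\sigma\nu\lambda})] \\
&= \frac{1}{2}\delta^{\sigma}{}_{\alpha}[g_{\sigma\nu,\beta\mu\lambda} - g_{\beta\nu,\sigma\mu\lambda} - (g_{\sigma\mu,\beta\nu\lambda} - g_{\beta\mu,\sigma\nu\lambda})] \\
&= \frac{1}{2}[g_{\alpha\nu,\beta\mu\lambda} - g_{\beta\nu,\alpha\mu\lambda} - g_{\alpha\mu,\beta\nu\lambda} + g_{\beta\mu,\alpha\nu\lambda}]
\end{aligned} \tag{6.46}$$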

which is Eq. (6.88).


24. Establish Eq. (6.89) from Eq. (6.88).

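The solution follows from straightforward substitution of Eq. (6.88) into Eq. (6.89). For the 2nd term, make the substitutions:

$$\mu \to \lambda, \qquad \nu \to \mu, \qquad \lambda \to \nu. \tag{6.47}$$

For the 3rd term, make the substitutions:

$$\mu \to \nu, \qquad \nu \to \lambda, \qquad \lambda \to \mu. \tag{6.48}$$

Matching up terms that cancel is simply a matter of finding the terms with the same first two indices, since g is symmetric. These pairs of terms will cancel because the remaining indices (the 3 after the comma) will necessarily correspond, and again order doesn't matter because partial derivatives commute.

$$\begin{aligned}
&2R_{\alpha\beta\mu\nu,\lambda} + 2R_{\alpha\beta\lambda\mu,\nu} + 2R_{\alpha\beta\nu\lambda,\mu} \\
&\quad = [g_{\alpha\nu,\beta\mu\lambda} - g_{\beta\nu,\alpha\mu\lambda} - g_{\alpha\mu,\beta\nu\lambda} + g_{\beta\mu,\alpha\nu\lambda}] + [g_{\alpha\mu,\beta\lambda\nu} - g_{\beta\mu,\alpha\lambda\nu} - g_{\alpha\lambda,\beta\mu\nu} + g_{\beta\lambda,\alpha\mu\nu}] \\
&\qquad + [g_{\alpha\lambda,\beta\nu\mu} - g_{\beta\lambda,\alpha\nu\mu} - g_{\alpha\nu,\beta\lambda\mu} + g_{\beta\nu,\alpha\lambda\mu}] \\
&\quad = 0.
\end{aligned} \tag{6.49}$$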
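The term matching in (6.49), and the analogous matching used for Eq. (6.70) in Exer. 18(a), is pure bookkeeping and can be automated. The following sketch is my own illustration: each term $\pm g_{ab,cd\ldots}$ is stored under a key in which the two metric indices and the derivative indices are each sorted, which encodes the symmetry of g and the commuting of partial derivatives. A sum of Riemann components vanishes if every key ends up with coefficient zero. The helper names are hypothetical.

```python
from collections import Counter

def term(sign, a, b, *derivs):
    """Key and sign for sign * g_{ab,derivs}, using g_{ab} = g_{ba} and
    the fact that partial derivatives commute."""
    return (tuple(sorted((a, b))), tuple(sorted(derivs))), sign

def riemann_lowered(a, b, m, n, *extra):
    """The four metric-derivative terms of 2 R_{abmn} (Eq. (6.68)), or of
    2 R_{abmn,extra} (Eq. (6.88)) when one extra derivative index is given."""
    return [term(+1, a, n, b, m, *extra),
            term(-1, a, m, b, n, *extra),
            term(+1, b, m, a, n, *extra),
            term(-1, b, n, a, m, *extra)]

def total(*term_lists):
    """Add up all terms; return only the keys with nonzero net coefficient."""
    acc = Counter()
    for terms in term_lists:
        for key, sign in terms:
            acc[key] += sign
    return {k: v for k, v in acc.items() if v != 0}

a, b, m, n, l = "a", "b", "m", "n", "l"  # symbolic index labels

# Cyclic identity, Eq. (6.70): R_abmn + R_anbm + R_amnb
cyclic = total(riemann_lowered(a, b, m, n),
               riemann_lowered(a, n, b, m),
               riemann_lowered(a, m, n, b))

# Bianchi identity in a locally inertial frame, Eq. (6.89):
# R_abmn,l + R_ablm,n + R_abnl,m
bianchi = total(riemann_lowered(a, b, m, n, l),
                riemann_lowered(a, b, l, m, n),
                riemann_lowered(a, b, n, l, m))

print(cyclic, bianchi)
```

Both results should be empty dictionaries, meaning that every term has found a partner of opposite sign.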

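25. (a) Prove that the Ricci tensor $R^{\mu}{}_{\alpha\mu\beta}$ is the only independent contraction of $R^{\alpha}{}_{\beta\mu\nu}$ since all others are multiples of it [or they are zero as pointed out in the text].

First we need the result that

$$R^{\alpha}{}_{\beta\mu\nu} = -R_{\beta}{}^{\alpha}{}_{\mu\nu}$$

as was proven in supplementary problem SP.4 above. It follows that

$$R^{\alpha}{}_{\alpha\mu\nu} = -R_{\alpha}{}^{\alpha}{}_{\mu\nu} = 0, \ \forall\,\mu, \nu,$$

since zero is the only number equal to its own negative. Similarly,

$$R_{\alpha\beta}{}^{\mu}{}_{\mu} = 0, \ \forall\,\alpha, \beta.$$

It remains to consider $R^{\alpha}{}_{\mu\nu\alpha}$, $R_{\alpha}{}^{\mu}{}_{\mu\beta}$, $R_{\alpha}{}^{\mu}{}_{\beta\mu}$. These candidates were identified by stepping through the possibilities systematically: 1st and 2nd, 1st and 3rd, 1st and 4th (that's it for those involving the first index), 2nd and 3rd (1st and 2nd already considered), 2nd and 4th, 3rd and 4th. That's all.

$$\begin{aligned}
-R^{\mu}{}_{\alpha\beta\mu} &= -g^{\sigma\mu}R_{\sigma\alpha\beta\mu} \\
&= g^{\sigma\mu}R_{\sigma\alpha\mu\beta}, && \text{by Eq. (6.69)} \\
&= R^{\mu}{}_{\alpha\mu\beta} \\
&= R_{\alpha\beta}, && \text{i.e. the Ricci tensor.}
\end{aligned} \tag{6.50}$$

$$\begin{aligned}
-R_{\alpha}{}^{\mu}{}_{\mu\beta} &= R^{\mu}{}_{\alpha\mu\beta}, && \text{using results of SP.4} \\
&= R_{\alpha\beta}, && \text{i.e. the Ricci tensor.}
\end{aligned} \tag{6.51}$$

$$\begin{aligned}
R_{\alpha}{}^{\mu}{}_{\beta\mu} &= -R^{\mu}{}_{\alpha\beta\mu}, && \text{using results of SP.4} \\
&= R^{\mu}{}_{\alpha\mu\beta} \\
&= R_{\alpha\beta}, && \text{i.e. the Ricci tensor.}
\end{aligned} \tag{6.52}$$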

25. (b) Show the Ricci tensor is symmetric.

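$$\begin{aligned}
R_{\alpha\beta} &= g^{\sigma\mu}R_{\sigma\alpha\mu\beta} \\
&= g^{\sigma\mu}R_{\mu\beta\sigma\alpha}, && \text{by Eq. (6.69)} \\
&= R^{\sigma}{}_{\beta\sigma\alpha}, \\
&= R_{\beta\alpha}.
\end{aligned} \tag{6.53}$$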

26. Use Exer. 17(a) to prove Eq. (6.94).


In Exer. 17(a) we proved that $g^{\alpha\beta}{}_{,\mu}(\mathcal{P}) = 0$ at some event $\mathcal{P}$. If we choose a local inertial reference frame, then $\Gamma^{\sigma}{}_{\alpha\mu}(\mathcal{P}) = 0$, so

$$g^{\alpha\beta}{}_{,\mu}(\mathcal{P}) = g^{\alpha\beta}{}_{;\mu}(\mathcal{P}) = 0$$

just as in the un-numbered equation on p. 151 between Eqs. (6.30) and (6.31). The 2nd equality is a valid tensor equation, which is then valid in all reference frames. So, just as for Eq. (6.31), we also have:

$$g^{\alpha\beta}{}_{;\mu}(\mathcal{P}) = 0, \quad \text{in any basis.}$$

27. Fill in the steps needed to establish Eqs. (6.95), (6.97), and (6.99).

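The 2nd term in Eq. (6.95) follows from the 2nd term in Eq. (6.93):

$$\begin{aligned}
g^{\alpha\mu}R_{\alpha\beta\lambda\mu;\nu} &= (g^{\alpha\mu}R_{\alpha\beta\lambda\mu})_{;\nu} && \text{using Eq. (6.94)} \\
&= (-g^{\alpha\mu}R_{\alpha\beta\mu\lambda})_{;\nu} && \text{using Eq. (6.69)} \\
&= (-R^{\mu}{}_{\beta\mu\lambda})_{;\nu} \\
&= (-R_{\beta\lambda})_{;\nu} = -R_{\beta\lambda;\nu}
\end{aligned} \tag{6.54}$$

Although we were not asked for this, note that the first and 3rd terms in Eq. (6.95) follow immediately from multiplication by the inverse metric tensor, so we have established Eq. (6.95) in full.

To establish Eq. (6.97) we need Eq. (6.96), which is the contraction of Eq. (6.95) with the inverse metric:

$$\begin{aligned}
0 &= g^{\beta\nu}[R_{\beta\nu;\lambda} - R_{\beta\lambda;\nu} + R^{\mu}{}_{\beta\nu\lambda;\mu}] \\
&= R_{;\lambda} + g^{\beta\nu}[-R_{\beta\lambda;\nu} + R^{\mu}{}_{\beta\nu\lambda;\mu}] && \text{used Eqs. (6.92), (6.94)} \\
&= R_{;\lambda} - R^{\mu}{}_{\lambda;\mu} + g^{\beta\nu}R^{\mu}{}_{\beta\nu\lambda;\mu} && \text{relabeled } \nu \to \mu
\end{aligned} \tag{6.55}$$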

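The 3rd term is more involved. I found it easiest to work backwards, starting at the result in Eq. (6.96):

$$\begin{aligned}
-R^{\mu}{}_{\lambda;\mu} &= -g^{\sigma\mu}R_{\sigma\lambda;\mu} && \text{used Eqs. (6.92), (6.94)} \\
&= -g^{\sigma\mu}R^{\nu}{}_{\sigma\nu\lambda;\mu} && \text{used Eq. (6.91)} \\
&= -g^{\sigma\mu}g^{\beta\nu}R_{\beta\sigma\nu\lambda;\mu} \\
&= g^{\sigma\mu}g^{\beta\nu}R_{\sigma\beta\nu\lambda;\mu} && \text{used Eq. (6.69)} \\
&= g^{\beta\nu}R^{\mu}{}_{\beta\nu\lambda;\mu}
\end{aligned} \tag{6.56}$$

Substituting this for the 3rd term in (6.55) gives Eq. (6.96):

$$0 = R_{;\lambda} - R^{\mu}{}_{\lambda;\mu} - R^{\mu}{}_{\lambda;\mu} \tag{6.57}$$

Multiplying by −1 gives:

$$2R^{\mu}{}_{\lambda;\mu} - R_{;\lambda} = 0 \tag{6.58}$$

The Ricci scalar R is a scalar and the components $\delta^{\mu}{}_{\lambda}$ are constants whose covariant derivative vanishes. So we can write

$$\begin{aligned}
R_{;\lambda} &= \delta^{\mu}{}_{\lambda}R_{;\mu} \\
&= (\delta^{\mu}{}_{\lambda}R)_{;\mu}
\end{aligned} \tag{6.59}$$

Substitution into (6.58) gives Eq. (6.97):

$$(2R^{\mu}{}_{\lambda} - \delta^{\mu}{}_{\lambda}R)_{;\mu} = 0 \tag{6.60}$$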

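The first equality in Eq. (6.98) is a definition (for some reason he's changed notation from := to ≡, both being standard nomenclature). The 2nd equality follows from the symmetry of $R_{\alpha\beta}$ as follows:

$$\begin{aligned}
R^{\alpha\beta} &= R_{\mu\nu}\,g^{\alpha\mu}g^{\beta\nu} \\
&= R_{\nu\mu}\,g^{\alpha\mu}g^{\beta\nu} && \text{c.f. Eq. (6.91)} \\
&= R_{\nu\mu}\,g^{\mu\alpha}g^{\nu\beta} && \text{symmetry of (inverse) metric tensor} \\
&= R^{\beta\alpha}
\end{aligned} \tag{6.61}$$

Working backwards again, replace α → λ in Eq. (6.99) and multiply by $2g_{\beta\lambda}$, giving:

$$\begin{aligned}
2g_{\beta\lambda}G^{\beta\mu}{}_{;\mu} &= 2g_{\beta\lambda}G^{\mu\beta}{}_{;\mu} && \text{symmetry of } G^{\mu\beta} \\
&= 2g_{\beta\lambda}(R^{\mu\beta} - \tfrac{1}{2}g^{\mu\beta}R)_{;\mu} && \text{using def. in Eq. (6.98)} \\
&= (2R^{\mu}{}_{\lambda} - g^{\mu}{}_{\lambda}R)_{;\mu} \\
&= (2R^{\mu}{}_{\lambda} - \delta^{\mu}{}_{\lambda}R)_{;\mu}
\end{aligned} \tag{6.62}$$

which is Eq. (6.97).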

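28.(a) Derive Eq. (6.19) by using the usual coordinate transformation from Cartesian to spherical polars. [On the one hand these problems working with familiar coordinates like spherical polar may help build physical intuition. But on the other hand, you lose generality by working with a specific coordinate system.]

First one needs the coordinate transformation equations:

$$\begin{aligned}
x &= r\sin\theta\cos\phi \\
y &= r\sin\theta\sin\phi \\
z &= r\cos\theta
\end{aligned} \tag{6.63}$$

We'll need the transformation matrix,

$$\frac{\partial x^{\alpha}}{\partial x^{\beta'}}$$

where the $x^{\beta'}$ refer to the spherical-polar coordinates (r, θ, φ) in that order.

$$\begin{array}{lll}
\dfrac{\partial x}{\partial r} = \sin\theta\cos\phi & \dfrac{\partial x}{\partial\theta} = r\cos\theta\cos\phi & \dfrac{\partial x}{\partial\phi} = -r\sin\theta\sin\phi \\[2ex]
\dfrac{\partial y}{\partial r} = \sin\theta\sin\phi & \dfrac{\partial y}{\partial\theta} = r\cos\theta\sin\phi & \dfrac{\partial y}{\partial\phi} = r\sin\theta\cos\phi \\[2ex]
\dfrac{\partial z}{\partial r} = \cos\theta & \dfrac{\partial z}{\partial\theta} = -r\sin\theta & \dfrac{\partial z}{\partial\phi} = 0
\end{array}$$

The metric tensor for Euclidean space in Cartesian coordinates is simply

$$g_{ij} = \delta_{ij} \qquad \text{Eq. (5.29)}$$

which is immediately clear from considering the dot product of the Cartesian coordinate basis vectors in 3D Euclidean space. So we can obtain the metric of 3D Euclidean space in spherical-polar coordinates through the transformation

$$\begin{aligned}
g_{i'j'} &= \left(\frac{\partial x^{i}}{\partial x^{i'}}\right)\left(\frac{\partial x^{j}}{\partial x^{j'}}\right)g_{ij} \\
&= \left(\frac{\partial x^{i}}{\partial x^{i'}}\right)\left(\frac{\partial x^{i}}{\partial x^{j'}}\right)
\end{aligned} \tag{6.64}$$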


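$$\begin{aligned}
g_{rr} &= \left(\frac{\partial x^{i}}{\partial r}\right)\left(\frac{\partial x^{j}}{\partial r}\right)g_{ij} \\
&= \left(\frac{\partial x^{i}}{\partial r}\right)^{2}, \quad \text{with summation over } i \\
&= \left(\frac{\partial x}{\partial r}\right)^{2} + \left(\frac{\partial y}{\partial r}\right)^{2} + \left(\frac{\partial z}{\partial r}\right)^{2} \\
&= (\sin\theta\cos\phi)^{2} + (\sin\theta\sin\phi)^{2} + (\cos\theta)^{2} \\
&= 1.
\end{aligned} \tag{6.65}$$

$$\begin{aligned}
g_{r\theta} &= \left(\frac{\partial x^{i}}{\partial r}\right)\left(\frac{\partial x^{j}}{\partial\theta}\right)g_{ij} \\
&= \left(\frac{\partial x^{i}}{\partial r}\right)\left(\frac{\partial x^{i}}{\partial\theta}\right), \quad \text{with summation over } i \\
&= \left(\frac{\partial x}{\partial r}\right)\left(\frac{\partial x}{\partial\theta}\right) + \left(\frac{\partial y}{\partial r}\right)\left(\frac{\partial y}{\partial\theta}\right) + \left(\frac{\partial z}{\partial r}\right)\left(\frac{\partial z}{\partial\theta}\right) \\
&= (\sin\theta\cos\phi)(r\cos\theta\cos\phi) + (\sin\theta\sin\phi)(r\cos\theta\sin\phi) + (\cos\theta)(-r\sin\theta) \\
&= 0.
\end{aligned} \tag{6.66}$$

$$\begin{aligned}
g_{r\phi} &= \left(\frac{\partial x^{i}}{\partial r}\right)\left(\frac{\partial x^{j}}{\partial\phi}\right)g_{ij} \\
&= \left(\frac{\partial x^{i}}{\partial r}\right)\left(\frac{\partial x^{i}}{\partial\phi}\right), \quad \text{with summation over } i \\
&= \left(\frac{\partial x}{\partial r}\right)\left(\frac{\partial x}{\partial\phi}\right) + \left(\frac{\partial y}{\partial r}\right)\left(\frac{\partial y}{\partial\phi}\right) + \left(\frac{\partial z}{\partial r}\right)\left(\frac{\partial z}{\partial\phi}\right) \\
&= (\sin\theta\cos\phi)(-r\sin\theta\sin\phi) + (\sin\theta\sin\phi)(r\sin\theta\cos\phi) + (\cos\theta)(0) \\
&= 0.
\end{aligned} \tag{6.67}$$

$$g_{\theta r} = g_{r\theta}, \quad \text{all metrics are symmetric} \tag{6.68}$$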


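$$\begin{aligned}
g_{\theta\theta} &= \left(\frac{\partial x^{i}}{\partial\theta}\right)\left(\frac{\partial x^{j}}{\partial\theta}\right)g_{ij} \\
&= \left(\frac{\partial x^{i}}{\partial\theta}\right)^{2}, \quad \text{with summation over } i \\
&= \left(\frac{\partial x}{\partial\theta}\right)^{2} + \left(\frac{\partial y}{\partial\theta}\right)^{2} + \left(\frac{\partial z}{\partial\theta}\right)^{2} \\
&= (r\cos\theta\cos\phi)^{2} + (r\cos\theta\sin\phi)^{2} + (-r\sin\theta)^{2} \\
&= r^{2}.
\end{aligned} \tag{6.69}$$

$$\begin{aligned}
g_{\theta\phi} &= \left(\frac{\partial x^{i}}{\partial\theta}\right)\left(\frac{\partial x^{j}}{\partial\phi}\right)g_{ij} \\
&= \left(\frac{\partial x^{i}}{\partial\theta}\right)\left(\frac{\partial x^{i}}{\partial\phi}\right), \quad \text{with summation over } i \\
&= \left(\frac{\partial x}{\partial\theta}\right)\left(\frac{\partial x}{\partial\phi}\right) + \left(\frac{\partial y}{\partial\theta}\right)\left(\frac{\partial y}{\partial\phi}\right) + \left(\frac{\partial z}{\partial\theta}\right)\left(\frac{\partial z}{\partial\phi}\right) \\
&= (r\cos\theta\cos\phi)(-r\sin\theta\sin\phi) + (r\cos\theta\sin\phi)(r\sin\theta\cos\phi) + 0 \\
&= 0.
\end{aligned} \tag{6.70}$$

$$g_{\phi r} = g_{r\phi}, \quad \text{all metrics are symmetric} \tag{6.71}$$

$$g_{\phi\theta} = g_{\theta\phi}, \quad \text{all metrics are symmetric} \tag{6.72}$$

$$\begin{aligned}
g_{\phi\phi} &= \left(\frac{\partial x^{i}}{\partial\phi}\right)\left(\frac{\partial x^{j}}{\partial\phi}\right)g_{ij} \\
&= \left(\frac{\partial x^{i}}{\partial\phi}\right)^{2}, \quad \text{with summation over } i \\
&= \left(\frac{\partial x}{\partial\phi}\right)^{2} + \left(\frac{\partial y}{\partial\phi}\right)^{2} + \left(\frac{\partial z}{\partial\phi}\right)^{2} \\
&= (r\sin\theta\sin\phi)^{2} + (r\sin\theta\cos\phi)^{2} + 0 \\
&= r^{2}\sin^{2}\theta.
\end{aligned} \tag{6.73}$$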


The above metric is consistent with that presented in matrix form in Eq. (6.19).
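The component-by-component calculation above amounts to forming $J^{T}J$, where J is the matrix of partial derivatives of (x, y, z) with respect to (r, θ, φ). A minimal numerical sketch of that product follows; the function names and the sample point are my own choices, and the Jacobian entries are the partial derivatives listed in part (a).

```python
import math

def jacobian(r, th, ph):
    """Rows are x, y, z; columns are derivatives with respect to r, theta, phi."""
    st, ct = math.sin(th), math.cos(th)
    sp, cp = math.sin(ph), math.cos(ph)
    return [[st * cp, r * ct * cp, -r * st * sp],
            [st * sp, r * ct * sp,  r * st * cp],
            [ct,      -r * st,      0.0]]

def metric_spherical(r, th, ph):
    """g_{i'j'} = sum over i of (dx^i/dx^{i'}) (dx^i/dx^{j'}), i.e. J^T J."""
    J = jacobian(r, th, ph)
    return [[sum(J[i][a] * J[i][b] for i in range(3)) for b in range(3)]
            for a in range(3)]

r, th, ph = 1.3, 0.9, 2.1  # arbitrary sample point
g = metric_spherical(r, th, ph)
expected = [[1.0, 0.0, 0.0],
            [0.0, r ** 2, 0.0],
            [0.0, 0.0, (r * math.sin(th)) ** 2]]
for row in g:
    print(row)
```

Up to rounding, the printed matrix should be diagonal with entries 1, $r^{2}$ and $r^{2}\sin^{2}\theta$, matching Eqs. (6.65) to (6.73).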

28.(b) Deduce from Eq. (6.19) that the metric of the surface of a sphere of radius r has components

$$g_{\theta\theta} = r^{2}, \quad g_{\phi\phi} = r^{2}\sin^{2}\theta, \quad g_{\theta\phi} = 0$$

in spherical coordinates.

On the surface of a sphere, the variable r is held fixed at the radius, so dr = 0 and the line element becomes

$$dl^{2} = r^{2}d\theta^{2} + r^{2}\sin^{2}\theta\,d\phi^{2}$$

consistent with the metric given:

$$g_{\theta\theta} = r^{2}, \quad g_{\phi\phi} = r^{2}\sin^{2}\theta, \quad g_{\theta\phi} = 0$$

28.(c) Find the components of $g^{\alpha\beta}$ for the sphere.

Recall from elementary linear algebra, or the results of Exer. 3(c), that the inverse of a diagonal matrix is the diagonal matrix formed from the reciprocals of the diagonal elements:

$$(D^{-1})_{ij} = \frac{1}{D_{ij}} \ \text{ when } i = j, \text{ and } 0 \text{ otherwise.}$$

Applying that here we find

$$g^{\theta\theta} = r^{-2}, \quad g^{\phi\phi} = r^{-2}\sin^{-2}\theta, \quad g^{\theta\phi} = 0$$

29. Find the Riemann curvature tensor on the surface of a sphere of radius r = 1 in polar coordinates. There's only one independent component, $R_{\theta\phi\theta\phi}$; see Exer. 18(b) for explanation.

Even though there's only one component, and we have the metric from Exer. 28, this question still involves a lot of work. I suggest we keep r as a variable since it's no extra effort and it gains us a more general result. Let's break it into three steps.

(i) Find the basis vectors $\vec{e}_{\phi}$ and $\vec{e}_{\theta}$.

30. Boring.

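31. Show that covariant differentiation obeys the usual product rule, e.g.

$$(V^{\alpha\beta}W_{\beta\gamma})_{;\mu} = V^{\alpha\beta}{}_{;\mu}W_{\beta\gamma} + V^{\alpha\beta}W_{\beta\gamma;\mu}$$

Hint: Use a locally inertial frame.

In a locally inertial frame, the Christoffel symbol vanishes and covariant derivatives equal partial derivatives, so

$$\begin{aligned}
(V^{\alpha\beta}W_{\beta\gamma})_{;\mu} &= (V^{\alpha\beta}W_{\beta\gamma})_{,\mu} \\
&= \frac{\partial}{\partial x^{\mu}}\Big(\sum_{\beta}V^{\alpha\beta}W_{\beta\gamma}\Big) && \text{from here forward suspend usual summation convention} \\
&= \sum_{\beta}\frac{\partial}{\partial x^{\mu}}\left(V^{\alpha\beta}W_{\beta\gamma}\right) \\
&= \sum_{\beta}\Big(W_{\beta\gamma}\frac{\partial}{\partial x^{\mu}}V^{\alpha\beta} + V^{\alpha\beta}\frac{\partial}{\partial x^{\mu}}W_{\beta\gamma}\Big) \\
&= \sum_{\beta}\left(W_{\beta\gamma}V^{\alpha\beta}{}_{,\mu} + V^{\alpha\beta}W_{\beta\gamma,\mu}\right) \\
&= W_{\beta\gamma}V^{\alpha\beta}{}_{,\mu} + V^{\alpha\beta}W_{\beta\gamma,\mu} && \text{reinvoke usual summation convention} \\
&= W_{\beta\gamma}V^{\alpha\beta}{}_{;\mu} + V^{\alpha\beta}W_{\beta\gamma;\mu} && \text{because we're in a locally inertial frame.}
\end{aligned} \tag{6.74}$$

The last equality is a valid tensor equation, valid in all reference frames.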

32. A 4D manifold has coordinates (u, v, w, p) in which the metric has components

$$g_{uv} = 1, \quad g_{ww} = 1, \quad g_{pp} = 1 \qquad (6.75)$$

and all other independent components vanishing (emphasis my own).

(a) Show that the manifold is flat and the signature is +2.

By § 6.7 point (6), a flat space has Riemann curvature identically zero, $R^\alpha{}_{\beta\mu\nu} = 0$. The Riemann curvature tensor was expressed in terms of the metric in Eq. (6.68). The important point is that it depends only upon derivatives of the metric (2nd derivatives in Eq. (6.68)). But the metric here is given as a constant, independent of event or point on the manifold. So the derivatives vanish and the Riemann curvature tensor is identically zero, $R^\alpha{}_{\beta\mu\nu} = 0$.

To find the signature we only need to count the number of positive and negative eigenvalues. But since in (b) we will need to diagonalize the metric tensor, we might as well do it here. Then it’s trivial to find the eigenvalues. Simply by playing around, I found the following transformation worked:

$$H = \begin{pmatrix} -\frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2} & 0 & 0 \\ +\frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$

Then

$$H^T (g) H = (\eta) \qquad (6.76)$$

with $(\eta) = \mathrm{diag}(-1, 1, 1, 1)$. Since H is orthogonal, these diagonal entries are the eigenvalues of (g): one negative and three positive, so the signature is −1 + 3 = +2.
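The diagonalization in Eq. (6.76) is easy to check numerically. The following is a minimal sketch in plain Python (the helper names `matmul` and `transpose` are my own, not from the text); it multiplies out $H^T (g) H$ for the metric of Eq. (6.75) and sums the signs of the diagonal entries to get the signature.

```python
import math

a = math.sqrt(2) / 2

# Metric in (u, v, w, p): g_uv = g_vu = 1, g_ww = g_pp = 1, all else zero.
g = [[0, 1, 0, 0],
     [1, 0, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1]]

# Transformation matrix H quoted in the solution.
H = [[-a, -a, 0, 0],
     [ a, -a, 0, 0],
     [ 0,  0, 1, 0],
     [ 0,  0, 0, 1]]

def matmul(A, B):
    """Plain nested-list matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

# eta should come out as diag(-1, 1, 1, 1) up to rounding.
eta = matmul(transpose(H), matmul(g, H))

# Signature: (+1 per positive diagonal entry) + (-1 per negative one).
signature = sum(1 if eta[i][i] > 0 else -1 for i in range(4))
```

Only floating-point rounding separates the computed `eta` from the Minkowski metric, which is why any comparison should use a small tolerance rather than exact equality.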

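(b) Find the transformation to the usual coordinates t, x, y, z.

We found the transformation matrix in (a) to be:

$$H = \begin{pmatrix} -\frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2} & 0 & 0 \\ +\frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$

This means that

$$g_{\bar\alpha\bar\beta} = \eta_{\bar\alpha\bar\beta} = \Lambda^{\alpha}{}_{\bar\alpha}\, g_{\alpha\beta}\, \Lambda^{\beta}{}_{\bar\beta}$$

where $\bar\alpha, \bar\beta$ are indices on the coordinates in (t, x, y, z) and $\alpha, \beta$ are indices on the coordinates in (u, v, w, p), with say $u^0 = u$, $u^1 = v$, . . . So,

$$H = (\Lambda^{\beta}{}_{\bar\beta}) = \frac{\partial u^\beta}{\partial x^{\bar\beta}}$$

Thus, integrating these (constant) derivatives, we finally arrive at:

$$\begin{aligned}
u &= -\frac{\sqrt{2}}{2}\,t - \frac{\sqrt{2}}{2}\,x, \\
v &= +\frac{\sqrt{2}}{2}\,t - \frac{\sqrt{2}}{2}\,x, \\
w &= y, \\
p &= z.
\end{aligned} \qquad (6.77)$$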

33. A 3-sphere is the 3D surface in a 4D Euclidean space (coordinates x, y, z, w), given by the equation

$$x^2 + y^2 + z^2 + w^2 = r^2$$

where r is the radius of the three-sphere.

(a) Define new coordinates (r, θ, φ, χ) by the equations

$$\begin{aligned}
w &= r\cos(\chi), \\
z &= r\sin(\chi)\cos(\theta), \\
y &= r\sin(\chi)\sin(\theta)\sin(\phi), \\
x &= r\sin(\chi)\sin(\theta)\cos(\phi).
\end{aligned} \qquad (6.78)$$

Show that (θ, φ, χ) are coordinates for the sphere.

If we simply substitute these equations (6.78) into the equation for the 3-sphere, we find that the equation is satisfied for fixed r for all values of (θ, φ, χ). So these coordinates can vary and we “stay on the 3-sphere”. But to show that these are truly coordinates, I believe we must also show that the transformation defined by (6.78) is not singular, cf. Eq. (5.6). After a whole lot of algebra, I found the Jacobian determinant

$$\det\begin{pmatrix}
\frac{\partial w}{\partial r} & \frac{\partial w}{\partial \theta} & \frac{\partial w}{\partial \phi} & \frac{\partial w}{\partial \chi} \\
\frac{\partial z}{\partial r} & \frac{\partial z}{\partial \theta} & \cdots & \\
\frac{\partial y}{\partial r} & \cdots & & \\
\frac{\partial x}{\partial r} & \cdots & & \frac{\partial x}{\partial \chi}
\end{pmatrix} = -r^3 \sin^2(\chi)\,\sin(\theta).$$

So just as in spherical-polar coordinates there are singular points (where sin χ = 0 or sin θ = 0), but the transformation is generally non-singular.
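Because the algebra behind this determinant is long, a numerical spot check is reassuring. The sketch below is my own illustration, not part of the original solution: it writes out the 4 × 4 Jacobian of Eq. (6.78) by hand (rows w, z, y, x; columns r, θ, φ, χ), evaluates its determinant by Laplace expansion at an arbitrarily chosen sample point, and compares the result with $-r^3\sin^2(\chi)\sin(\theta)$.

```python
import math

def jacobian(r, theta, phi, chi):
    """Rows: w, z, y, x. Columns: derivatives w.r.t. r, theta, phi, chi."""
    sc, cc = math.sin(chi), math.cos(chi)
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    return [
        [cc,           0.0,               0.0,              -r * sc],
        [sc * ct,     -r * sc * st,       0.0,               r * cc * ct],
        [sc * st * sp, r * sc * ct * sp,  r * sc * st * cp,  r * cc * st * sp],
        [sc * st * cp, r * sc * ct * cp, -r * sc * st * sp,  r * cc * st * cp],
    ]

def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

# Arbitrary sample point away from the coordinate singularities.
r, theta, phi, chi = 2.0, 0.7, 1.1, 0.4
numeric = det(jacobian(r, theta, phi, chi))
closed_form = -r**3 * math.sin(chi)**2 * math.sin(theta)
```

The overall sign depends on the ordering chosen for rows and columns; only the vanishing of the determinant (at sin χ = 0 or sin θ = 0) matters for the singularity argument.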

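(b) Find the metric tensor of the three-sphere of radius r in these coordinates. (Use method of Exer. 28).

The off-diagonal terms of the metric tensor are zero if the basis vectors are orthogonal. In spherical coordinates this was obvious but it’s not a priori obvious for the three-sphere (at least for me). So I simply calculated all the terms of the metric tensor. There are only 6 independent terms (because of symmetry, $g_{\bar\alpha\bar\beta} = g_{\bar\beta\bar\alpha}$). Let’s use an overbar to indicate indices on the bases in θ, φ, χ, with

$$x^{\bar 1} = \theta, \quad x^{\bar 2} = \phi, \quad x^{\bar 3} = \chi. \qquad (6.79)$$

And indices without overbar indicate the original coordinates in x, y, z, w. Then in general

$$g_{\bar\alpha\bar\beta} = g_{\alpha\beta}\, \Lambda^{\alpha}{}_{\bar\alpha}\, \Lambda^{\beta}{}_{\bar\beta} \qquad (6.80)$$

where

$$\Lambda^{\alpha}{}_{\bar\alpha} = \frac{\partial x^{\alpha}}{\partial x^{\bar\alpha}} \qquad (6.81)$$

The calculus is tedious but straightforward. For instance,

$$\begin{aligned}
g_{\bar 1 \bar 1} = g_{\theta\theta} &= g_{xx}\left(\frac{\partial x}{\partial\theta}\right)^2 + g_{yy}\left(\frac{\partial y}{\partial\theta}\right)^2 + g_{zz}\left(\frac{\partial z}{\partial\theta}\right)^2 + g_{ww}\left(\frac{\partial w}{\partial\theta}\right)^2 \\
&= r^2\sin^2\chi\cos^2\theta\cos^2\phi + r^2\sin^2\chi\cos^2\theta\sin^2\phi + r^2\sin^2\chi\sin^2\theta + 0 \\
&= r^2\sin^2\chi.
\end{aligned} \qquad (6.82)$$

The only question for me was “what’s the metric tensor in the 4D space”? To get the right answer one must use the Euclidean metric of the embedding space,

$$g_{\alpha\beta} = \begin{cases} +1 & \text{if } \alpha = \beta \\ 0 & \text{if } \alpha \neq \beta \end{cases} \qquad (6.83)$$

In a similar manner one can easily show the off-diagonal terms are zero.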
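For completeness, here is a sketch of the two remaining diagonal components, worked the same way as Eq. (6.82) with the Euclidean metric of Eq. (6.83). These lines are my own working rather than part of the original solution, so treat them as a check to be verified:

```latex
\begin{aligned}
g_{\phi\phi} &= \left(\frac{\partial x}{\partial\phi}\right)^2 + \left(\frac{\partial y}{\partial\phi}\right)^2
  = r^2\sin^2\chi\,\sin^2\theta\,(\sin^2\phi + \cos^2\phi)
  = r^2\sin^2\chi\,\sin^2\theta, \\
g_{\chi\chi} &= r^2\cos^2\chi\,\bigl(\sin^2\theta\cos^2\phi + \sin^2\theta\sin^2\phi + \cos^2\theta\bigr) + r^2\sin^2\chi
  = r^2 .
\end{aligned}
```

With vanishing off-diagonal terms, the line element on the three-sphere would then read $dl^2 = r^2\left[d\chi^2 + \sin^2\chi\,(d\theta^2 + \sin^2\theta\,d\phi^2)\right]$.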

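34. Establish the following identities for a general metric tensor in a general coordinate system. Eqs. (6.39) and (6.40) are sometimes helpful.

In my humble opinion, these problems provide some nice practice with tensor calculus, but are not essential for understanding the material of Chapter 6.

(a)

$$\Gamma^\mu{}_{\mu\nu} = \frac{1}{2}(\ln|g|)_{,\nu}$$

$$\begin{aligned}
\Gamma^\mu{}_{\mu\nu} &= \Gamma^\mu{}_{\nu\mu} && \text{cf. Exer. 5} \\
&= \frac{1}{2}\, g^{\alpha\beta}\, g_{\alpha\beta,\nu} && \text{cf. Eq. (6.38)} \\
&= \frac{1}{2}\,\frac{g_{,\nu}}{g} && \text{using Eq. (6.39)} \\
&= \frac{1}{2}(\ln|g|)_{,\nu} && \text{chain rule.}
\end{aligned} \qquad (6.84)$$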
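As a quick sanity check of identity (a), consider the surface of the sphere from Exer. 28. This is a sketch of my own; it assumes the standard sphere result $\Gamma^\phi{}_{\phi\theta} = \cot\theta$ and that $\Gamma^\theta{}_{\theta\theta}$, $\Gamma^\theta{}_{\theta\phi}$ and $\Gamma^\phi{}_{\phi\phi}$ vanish, none of which is derived in this section:

```latex
\begin{aligned}
g &= \det(g_{\alpha\beta}) = r^2 \cdot r^2\sin^2\theta = r^4\sin^2\theta, \\
\tfrac{1}{2}(\ln|g|)_{,\theta} &= \tfrac{1}{2}\,\frac{2\sin\theta\cos\theta}{\sin^2\theta} = \cot\theta
  = \Gamma^\theta{}_{\theta\theta} + \Gamma^\phi{}_{\phi\theta}, \\
\tfrac{1}{2}(\ln|g|)_{,\phi} &= 0 = \Gamma^\theta{}_{\theta\phi} + \Gamma^\phi{}_{\phi\phi}.
\end{aligned}
```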


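(b) Personally I found this by far the hardest of these 5 identities. If you have trouble with this one, try (d) first.

$$g^{\mu\nu}\,\Gamma^\alpha{}_{\mu\nu} = -\frac{(g^{\alpha\beta}\sqrt{-g})_{,\beta}}{\sqrt{-g}}$$

Expand the RHS using the product rule of differential calculus,

$$\begin{aligned}
-\frac{(g^{\alpha\beta}\sqrt{-g})_{,\beta}}{\sqrt{-g}} &= -g^{\alpha\beta}\,\frac{(\sqrt{-g})_{,\beta}}{\sqrt{-g}} - g^{\alpha\beta}{}_{,\beta} \\
&= -g^{\alpha\beta}\,\Gamma^\sigma{}_{\beta\sigma} - g^{\alpha\beta}{}_{,\beta} && \text{using Eq. (6.40)} \\
&= -g^{\alpha\beta}\,\frac{1}{2}\,g^{\mu\nu}g_{\mu\nu,\beta} - g^{\alpha\beta}{}_{,\beta} && \text{using Eq. (6.38)}
\end{aligned} \qquad (6.85)$$

Expand the LHS using Eq. (6.32):

$$\begin{aligned}
g^{\mu\nu}\,\Gamma^\alpha{}_{\mu\nu} &= g^{\mu\nu}\left[\frac{1}{2}\,g^{\alpha\beta}(g_{\beta\mu,\nu} + g_{\beta\nu,\mu} - g_{\mu\nu,\beta})\right] \\
&= -g^{\alpha\beta}\,\frac{1}{2}\,g^{\mu\nu}g_{\mu\nu,\beta} + g^{\mu\nu}\,\frac{1}{2}\,g^{\alpha\beta}(g_{\beta\mu,\nu} + g_{\beta\nu,\mu}) && \text{rearranging.}
\end{aligned} \qquad (6.86)$$

Subtracting these two results we find it remains to prove only that

$$-g^{\alpha\beta}{}_{,\beta} = g^{\mu\nu}\,\frac{1}{2}\,g^{\alpha\beta}(g_{\beta\mu,\nu} + g_{\beta\nu,\mu}) \qquad (6.87)$$

This follows easily using the result from (d):

$$\begin{aligned}
g^{\mu\nu}\,\frac{1}{2}\,g^{\alpha\beta}g_{\beta\mu,\nu} &= -g^{\mu\nu}\,\frac{1}{2}\,g^{\alpha\beta}{}_{,\nu}\,g_{\beta\mu} \\
&= -g^{\nu}{}_{\beta}\,\frac{1}{2}\,g^{\alpha\beta}{}_{,\nu} \\
&= -\frac{1}{2}\,g^{\alpha\beta}{}_{,\beta}.
\end{aligned} \qquad (6.88)$$

Similarly for the other term, giving the result we required.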


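(c) Suppose $F^{\mu\nu}$ is antisymmetric.

$$F^{\mu\nu}{}_{;\nu} = \frac{(F^{\mu\nu}\sqrt{-g})_{,\nu}}{\sqrt{-g}}$$

Expand the RHS using the product rule of differential calculus,

$$\begin{aligned}
\frac{(F^{\mu\nu}\sqrt{-g})_{,\nu}}{\sqrt{-g}} &= F^{\mu\nu}\,\frac{(\sqrt{-g})_{,\nu}}{\sqrt{-g}} + F^{\mu\nu}{}_{,\nu} \\
&= F^{\mu\nu}\,\Gamma^\alpha{}_{\nu\alpha} + F^{\mu\nu}{}_{,\nu} && \text{using Eq. (6.40)}
\end{aligned} \qquad (6.89)$$

Expand the LHS using Eq. (6.35):

$$\begin{aligned}
F^{\mu\nu}{}_{;\nu} &= F^{\mu\nu}{}_{,\nu} + F^{\mu\sigma}\,\Gamma^\nu{}_{\sigma\nu} + F^{\sigma\nu}\,\Gamma^\mu{}_{\sigma\nu} \\
&= F^{\mu\nu}{}_{,\nu} + F^{\mu\nu}\,\Gamma^\alpha{}_{\nu\alpha} && \text{relabelling dummy indices.}
\end{aligned} \qquad (6.90)$$

The 3rd term vanished because $F^{\sigma\nu}$ is antisymmetric (given) and $\Gamma^\mu{}_{\sigma\nu}$ is symmetric in σν and they are contracted on these indices (vanishes by the results of Exer. 26 of § 3.10).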

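(d) Prove that

$$g^{\alpha\sigma}\,g_{\sigma\beta,\gamma} = -g_{\sigma\beta}\,g^{\alpha\sigma}{}_{,\gamma} \qquad (6.91)$$

in all bases.

Solution:

$$g^{\alpha\sigma}\,g_{\sigma\beta} = g^{\alpha}{}_{\beta} = \delta^{\alpha}{}_{\beta}$$

in all bases because the inverse metric tensor is, by definition, the inverse of the metric tensor. Simply differentiate this formula:

$$\begin{aligned}
(g^{\alpha\sigma}\,g_{\sigma\beta})_{,\gamma} &= \delta^{\alpha}{}_{\beta,\gamma} \\
g^{\alpha\sigma}\,g_{\sigma\beta,\gamma} + g^{\alpha\sigma}{}_{,\gamma}\,g_{\sigma\beta} &= 0 \\
g^{\alpha\sigma}\,g_{\sigma\beta,\gamma} &= -g_{\sigma\beta}\,g^{\alpha\sigma}{}_{,\gamma}.
\end{aligned} \qquad (6.92)$$

(e) Prove

$$g^{\mu\nu}{}_{,\alpha} = -\Gamma^\mu{}_{\beta\alpha}\,g^{\beta\nu} - \Gamma^\nu{}_{\beta\alpha}\,g^{\mu\beta}$$

This is a one-liner that follows immediately from Eq. (6.31) and (6.35).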

35. Given the line element

$$ds^2 = -\exp(2\Phi(r))\,dt^2 + \exp(2\Lambda(r))\,dr^2 + r^2\,d\theta^2 + r^2\sin^2\theta\,d\phi^2$$

find the Riemann curvature tensor.

I found this problem instructive on several levels. It should be done after (or concurrently with) Exer. 18, for it helps clarify the symmetry relations and the implied reduction in degrees of freedom of the Riemann curvature tensor. It also helps reveal how much information is packed in the line element equation!

Later we’ll learn that the given form of the metric leads to the Schwarzschild metric, which represents the simplest solution of the Einstein equations. It is therefore extremely useful to know.

(i) The coordinates are t, r, θ, φ. This is clear because these form the differential variables of the line element. The metric tensor is

$$(g_{\alpha\beta}) = \begin{pmatrix} -\exp(2\Phi) & 0 & 0 & 0 \\ 0 & \exp(2\Lambda) & 0 & 0 \\ 0 & 0 & r^2 & 0 \\ 0 & 0 & 0 & r^2\sin^2\theta \end{pmatrix}$$

A fair question is “why is the metric tensor diagonal”? The answer is that there are no cross-terms in the line element,

$$dl = |g_{\alpha\beta}\,dx^\alpha\,dx^\beta|^{1/2}$$

which was given just before Eq. (6.6).

(ii) The inverse of the metric tensor:

$$(g^{\alpha\beta}) = \begin{pmatrix} -\exp(-2\Phi) & 0 & 0 & 0 \\ 0 & \exp(-2\Lambda) & 0 & 0 \\ 0 & 0 & r^{-2} & 0 \\ 0 & 0 & 0 & r^{-2}\sin^{-2}\theta \end{pmatrix}$$

See Exer. 3 for computing the inverse of a diagonal matrix.


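The Christoffel symbols can be computed from the metric tensor using Eq. (6.32). One needs the first derivatives of the metric tensor:

$$\begin{aligned}
g_{tt,r} &= \frac{\partial}{\partial r}[-\exp(2\Phi)] = -2\exp(2\Phi)\,\Phi', \\
g_{rr,r} &= \frac{\partial}{\partial r}[\exp(2\Lambda)] = 2\exp(2\Lambda)\,\Lambda', \\
g_{\theta\theta,r} &= \frac{\partial}{\partial r}[r^2] = 2r, \\
g_{\phi\phi,r} &= \frac{\partial}{\partial r}[r^2\sin^2(\theta)] = 2r\sin^2(\theta), \\
g_{\phi\phi,\theta} &= \frac{\partial}{\partial\theta}[r^2\sin^2(\theta)] = 2r^2\sin(\theta)\cos(\theta) = r^2\sin(2\theta).
\end{aligned} \qquad (6.93)$$

All other first derivatives of the metric tensor are zero.

Here are the nonzero Christoffel symbols (with indices 0, 1, 2, 3 for t, r, θ, φ):

$$\begin{aligned}
\Gamma^0{}_{01} &= \Phi' \\
\Gamma^1{}_{00} &= \exp(-2\Lambda)\exp(2\Phi)\,\Phi' \\
\Gamma^1{}_{11} &= \Lambda' \\
\Gamma^1{}_{22} &= -\exp(-2\Lambda)\,r \\
\Gamma^1{}_{33} &= -\exp(-2\Lambda)\,r\sin^2(\theta) \\
\Gamma^2{}_{12} &= \frac{1}{r} \\
\Gamma^2{}_{33} &= -\sin(\theta)\cos(\theta) \\
\Gamma^3{}_{13} &= \frac{1}{r} \\
\Gamma^3{}_{23} &= \frac{\cos(\theta)}{\sin(\theta)}
\end{aligned} \qquad (6.94)$$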
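A list like (6.94) is easy to get wrong by a sign, so a numerical cross-check can help. The sketch below is my own illustration: it evaluates Eq. (6.32) for a diagonal metric using central finite differences and can be compared against the closed forms above. The functions `Phi` and `Lam` are arbitrary test choices (hypothetical, not from Schutz), as are the step size and the sample point.

```python
import math

def Phi(r):
    return 0.3 * math.log(r)      # hypothetical test function, Phi' = 0.3 / r

def Lam(r):
    return 0.1 * r * r            # hypothetical test function, Lam' = 0.2 * r

def metric_diag(x):
    """Diagonal entries (g_tt, g_rr, g_thth, g_phph) at x = (t, r, theta, phi)."""
    t, r, th, ph = x
    return [-math.exp(2 * Phi(r)), math.exp(2 * Lam(r)),
            r * r, (r * math.sin(th)) ** 2]

def d_metric(x, k, h=1e-6):
    """Central-difference derivative of the diagonal entries w.r.t. x^k."""
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    return [(a - b) / (2 * h) for a, b in zip(metric_diag(xp), metric_diag(xm))]

def christoffel(x):
    """G[a][m][n] = Gamma^a_{mn} from Eq. (6.32), specialised to a diagonal metric."""
    g = metric_diag(x)
    dg = [d_metric(x, k) for k in range(4)]   # dg[k][a] = d g_aa / d x^k
    G = [[[0.0] * 4 for _ in range(4)] for _ in range(4)]
    for a in range(4):
        for m in range(4):
            for n in range(4):
                s = 0.0
                if a == m:
                    s += dg[n][a]     # g_{am,n}
                if a == n:
                    s += dg[m][a]     # g_{an,m}
                if m == n:
                    s -= dg[a][m]     # -g_{mn,a}
                G[a][m][n] = 0.5 * s / g[a]
    return G

x = (0.0, 2.0, 1.0, 0.5)              # sample point (t, r, theta, phi)
G = christoffel(x)
```

For example, `G[0][0][1]` should be close to Φ′ = 0.15 at r = 2, and `G[3][2][3]` close to cos θ / sin θ at θ = 1.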

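(iii) Deciding the 20 terms of $R_{\alpha\beta\mu\nu}$ to calculate. This is not as simple as it might sound. Here it helps tremendously if one has solved Exer. 18.

Recall $R_{\alpha\alpha\mu\nu} = 0$, as does $R_{\alpha\beta\nu\nu}$, because of Eq. (6.69) (see Exer. 18). We organize the terms as recommended in the hint for Exer. 18: we choose pairs of α ≠ β (there are 6 of them accounting for the fact that order doesn’t matter), and similarly there are 6 pairs of µ ≠ ν. These would give 6 × 6 = 36 elements, but, because of the symmetry $R_{\alpha\beta\mu\nu} = R_{\mu\nu\alpha\beta}$, we must divide the number of off-diagonal elements by two to get 5 × 6/2 + 6 = 21. (We’ll deal with the reduction to 20 by Eq. (6.70) in a minute.)

I found it too difficult to attempt to write down these terms immediately based on the above prescription. Instead it was much easier to write down all 6 × 6 = 36 terms and eliminate the lower triangle:

$$\begin{array}{cccccc}
R_{trtr} & R_{trt\theta} & R_{trt\phi} & R_{trr\theta} & R_{trr\phi} & R_{tr\theta\phi} \\
R_{t\theta tr} & R_{t\theta t\theta} & R_{t\theta t\phi} & R_{t\theta r\theta} & R_{t\theta r\phi} & R_{t\theta\theta\phi} \\
R_{t\phi tr} & R_{t\phi t\theta} & R_{t\phi t\phi} & R_{t\phi r\theta} & R_{t\phi r\phi} & R_{t\phi\theta\phi} \\
R_{r\theta tr} & R_{r\theta t\theta} & R_{r\theta t\phi} & R_{r\theta r\theta} & R_{r\theta r\phi} & R_{r\theta\theta\phi} \\
R_{r\phi tr} & R_{r\phi t\theta} & R_{r\phi t\phi} & R_{r\phi r\theta} & R_{r\phi r\phi} & R_{r\phi\theta\phi} \\
R_{\theta\phi tr} & R_{\theta\phi t\theta} & R_{\theta\phi t\phi} & R_{\theta\phi r\theta} & R_{\theta\phi r\phi} & R_{\theta\phi\theta\phi}
\end{array} \qquad (6.95)$$

Note that we’re not writing them down randomly. Instead, we step the 2nd pair of indices, i.e. µν, systematically by increasing the ν most rapidly with increasing column, µ more slowly with increasing column. Similarly we increase the first pair of indices, i.e. αβ, with row, and β more rapidly than α. These were arbitrary choices of course, but having a system and sticking to it makes it easy.

Recall only the upper triangle is necessary to determine the tensor because of the symmetry $R_{\alpha\beta\mu\nu} = R_{\mu\nu\alpha\beta}$:

$$\begin{array}{cccccc}
R_{trtr} & R_{trt\theta} & R_{trt\phi} & R_{trr\theta} & R_{trr\phi} & R_{tr\theta\phi} \\
 & R_{t\theta t\theta} & R_{t\theta t\phi} & R_{t\theta r\theta} & R_{t\theta r\phi} & R_{t\theta\theta\phi} \\
 & & R_{t\phi t\phi} & R_{t\phi r\theta} & R_{t\phi r\phi} & R_{t\phi\theta\phi} \\
 & & & R_{r\theta r\theta} & R_{r\theta r\phi} & R_{r\theta\theta\phi} \\
 & & & & R_{r\phi r\phi} & R_{r\phi\theta\phi} \\
 & & & & & R_{\theta\phi\theta\phi}
\end{array} \qquad (6.96)$$

Now we must also impose the condition in Eq. (6.70), which we explained in Exer. 18 only applies to the case when none of the indices are equal. There are three such terms in (6.96): $R_{tr\theta\phi}$, $R_{t\theta r\phi}$ and $R_{t\phi r\theta}$. One of these can be determined from the other two.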

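Let’s evaluate a few of these in full detail. It’s important to use Eq. (6.63), which is true in all coordinate bases, and not Eq. (6.68), which is only true in a local inertial frame. From Eq. (6.63) and the Christoffel symbols above (6.94) we find for

$$\begin{aligned}
R_{trtr} &= g_{tt}\,R^t{}_{rtr} \\
&= -\exp(2\Phi)\left[-\Gamma^0{}_{01,r} + \Gamma^0{}_{10}\Gamma^1{}_{11} - \Gamma^0{}_{01}\Gamma^0{}_{01}\right] \\
&= \exp(2\Phi)\left[(\Phi')^2 + \Phi'' - \Phi'\,\Lambda'\right]
\end{aligned} \qquad (6.97)$$

It’s important to note that if one were to use Eq. (6.68):

$$R_{\alpha\beta\mu\nu} = \frac{1}{2}\left(g_{\alpha\nu,\beta\mu} - g_{\alpha\mu,\beta\nu} + g_{\beta\mu,\alpha\nu} - g_{\beta\nu,\alpha\mu}\right)$$

one would miss the cross term $-\Phi'\Lambda'$ (and get a different coefficient on $(\Phi')^2$):

$$\begin{aligned}
R_{trtr} &= \frac{1}{2}\left(g_{tr,rt} - g_{tt,rr} + g_{rt,tr} - g_{rr,tt}\right) \\
&= \frac{1}{2}\left[0 - \left(-4\exp(2\Phi)(\Phi')^2 - 2\exp(2\Phi)\Phi''\right) + 0 - 0\right] \\
&= 2\exp(2\Phi)(\Phi')^2 + \exp(2\Phi)\Phi''
\end{aligned} \qquad (6.98)$$

I’m finding my answers disagree with those provided by Schutz, so it’s important to work these out in more detail. For the next one, $R_{t\theta t\theta}$, I found,

$$\begin{aligned}
R_{t\theta t\theta} &= g_{tt}\left[\Gamma^0{}_{22,t} - \Gamma^0{}_{20,\theta} + \Gamma^0{}_{\sigma 0}\Gamma^\sigma{}_{22} - \Gamma^0{}_{\sigma 2}\Gamma^\sigma{}_{20}\right] \\
&= g_{tt}\left[\Gamma^0{}_{\sigma 0}\Gamma^\sigma{}_{22}\right] \\
&= g_{tt}\left[\Gamma^0{}_{10}\Gamma^1{}_{22}\right] \\
&= -\exp(2\Phi)\left[\Phi'\,(-r\exp(-2\Lambda))\right] \\
&= +r\,\Phi'\exp(2\Phi - 2\Lambda)
\end{aligned} \qquad (6.99)$$

After a lot of algebra one finds that only the diagonal elements of (6.96) are nonzero:

$$(R_{\alpha\beta\mu\nu}) = \begin{pmatrix}
R_{trtr} & 0 & 0 & 0 & 0 & 0 \\
0 & R_{t\theta t\theta} & 0 & 0 & 0 & 0 \\
0 & 0 & R_{t\phi t\phi} & 0 & 0 & 0 \\
0 & 0 & 0 & R_{r\theta r\theta} & 0 & 0 \\
0 & 0 & 0 & 0 & R_{r\phi r\phi} & 0 \\
0 & 0 & 0 & 0 & 0 & R_{\theta\phi\theta\phi}
\end{pmatrix} \qquad (6.100)$$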


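and the other 256 − 36 = 220 terms are determined by the symmetry relations in Eq. (6.69). These 6 nonzero terms are

$$\begin{aligned}
R_{trtr} &= \exp(2\Phi)\left[(\Phi')^2 + \Phi'' - \Phi'\,\Lambda'\right] \\
R_{t\theta t\theta} &= r\,\Phi'\exp(2\Phi)\exp(-2\Lambda) \\
R_{t\phi t\phi} &= \sin^2(\theta)\,r\,\Phi'\exp(2\Phi)\exp(-2\Lambda) \\
R_{r\theta r\theta} &= r\,\Lambda' \\
R_{r\phi r\phi} &= r\,\Lambda'\sin^2(\theta) \\
R_{\theta\phi\theta\phi} &= -r^2\left(\cos^2(\theta) - 1 + \exp(-2\Lambda) - \cos^2(\theta)\exp(-2\Lambda)\right) \\
&= r^2\sin^2(\theta)\left(1 - \exp(-2\Lambda)\right)
\end{aligned} \qquad (6.101)$$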

36. A four-dimensional manifold has coordinates (t, x, y, z) and a line element:

$$ds^2 = -(1 + 2\phi)\,dt^2 + (1 - 2\phi)(dx^2 + dy^2 + dz^2),$$

with |φ(t, x, y, z)| ≪ 1 everywhere. At any point P with coordinates (t0, x0, y0, z0), find a coordinate transformation to a locally inertial coordinate system, to first order in φ. At what rate does such a frame accelerate with respect to the original coordinates, again to first order in φ?

I’ve noticed in the other chapters that the questions near the end of the Exercises section generally anticipate results we’ll need later. This is probably of that genre since I don’t immediately see the point of this (i.e. it’s probably important). (Indeed this metric will reappear in Chapter 7, Eq. (7.8).) First we need the metric, which we can find as we did in Exer. 35. Here the metric is

$$(g_{\alpha\beta}) = \begin{pmatrix} -(1 + 2\phi) & 0 & 0 & 0 \\ 0 & (1 - 2\phi) & 0 & 0 \\ 0 & 0 & (1 - 2\phi) & 0 \\ 0 & 0 & 0 & (1 - 2\phi) \end{pmatrix}$$

So we seek a transformation $\Lambda^{\alpha}{}_{\bar\alpha}$ such that,

$$\eta_{\bar\alpha\bar\beta} = \Lambda^{\alpha}{}_{\bar\alpha}\, g_{\alpha\beta}\, \Lambda^{\beta}{}_{\bar\beta}$$

By inspection (which was aided by solving Exer. 3), we see that:

$$(\Lambda^{\alpha}{}_{\bar\alpha}) = \begin{pmatrix} (1 + 2\phi_P)^{-1/2} & 0 & 0 & 0 \\ 0 & (1 - 2\phi_P)^{-1/2} & 0 & 0 \\ 0 & 0 & (1 - 2\phi_P)^{-1/2} & 0 \\ 0 & 0 & 0 & (1 - 2\phi_P)^{-1/2} \end{pmatrix}$$

where $\phi_P = \phi(P) = \phi(t_0, x_0, y_0, z_0)$. The question specifically asks for the “coordinate transformation to a locally inertial coordinate system, to first order in φ”. I guess that means Schutz wants us to approximate this transformation, perhaps using the binomial theorem:

$$(1 + 2\phi_P)^{-1/2} \approx (1 - \phi_P) + O(\phi_P^2).$$

I’m not sure because I don’t see the point of this. In any case one would get:

$$(\Lambda^{\alpha}{}_{\bar\alpha}) \approx \begin{pmatrix} (1 - \phi_P) & 0 & 0 & 0 \\ 0 & (1 + \phi_P) & 0 & 0 \\ 0 & 0 & (1 + \phi_P) & 0 \\ 0 & 0 & 0 & (1 + \phi_P) \end{pmatrix}$$
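To see what is lost in the binomial step, one can compare the exact and first-order factors numerically. This is a minimal sketch with an arbitrarily chosen small value of $\phi_P$ (the value 10⁻³ is my assumption, purely for illustration); the neglected term is about $\tfrac{3}{2}\phi_P^2$, and the transformed time-time component then differs from −1 only at second order.

```python
phi_P = 1.0e-3                           # hypothetical small potential at P

exact = (1 + 2 * phi_P) ** -0.5          # exact diagonal entry of the transformation
first_order = 1 - phi_P                  # binomial approximation
error = abs(exact - first_order)         # roughly 1.5 * phi_P**2

# Time-time component of the transformed metric at P using the approximate factor;
# it should equal -1 up to terms of order phi_P**2.
g_tt_transformed = -(1 + 2 * phi_P) * first_order ** 2
```

So at the event P the approximate transformation reproduces the Minkowski metric to first order in φ, which is all the exercise asks for.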

I confess I don’t know what is meant by “the rate such a frame accelerates with respect to the original coordinates”. But it is clear that this transformation works exactly only at the event P, and for the binomial approximation it only applies approximately there as well. And one can write down how the metric would depart from the local inertial metric, η, by simply applying the approximation:

$$\begin{aligned}
\eta_{\bar\alpha\bar\beta} &\approx \mathrm{diag}\left(-(1 + 2\phi)(1 - \phi_P)^2,\ (1 - 2\phi)(1 + \phi_P)^2,\ (1 - 2\phi)(1 + \phi_P)^2,\ (1 - 2\phi)(1 + \phi_P)^2\right) \\
&\approx \mathrm{diag}\left(-(1 + 2\phi)(1 - 2\phi_P),\ (1 - 2\phi)(1 + 2\phi_P),\ (1 - 2\phi)(1 + 2\phi_P),\ (1 - 2\phi)(1 + 2\phi_P)\right) \\
&\approx \mathrm{diag}\left(-(1 + 2(\phi - \phi_P)),\ (1 - 2(\phi - \phi_P)),\ (1 - 2(\phi - \phi_P)),\ (1 - 2(\phi - \phi_P))\right).
\end{aligned} \qquad (6.102)$$


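37. (a) ‘Proper volume’ of a 2D manifold is usually called ‘proper area’. Using the metric of Exer. 28, integrate Eq. (6.18) to find the proper area of a sphere of radius r.

$$\begin{aligned}
\iint dx^1\,dx^2 &= \int_0^{\pi}\!\!\int_0^{2\pi} g^{1/2}\,d\phi\,d\theta, && \text{changed sign of } \det|g|, \text{ cf. Eq. (6.19)} \\
&= \int_0^{\pi}\!\!\int_0^{2\pi} (r^2\, r^2\sin^2\theta)^{1/2}\,d\phi\,d\theta \\
&= \int_0^{\pi}\!\!\int_0^{2\pi} (r^2\sin\theta)\,d\phi\,d\theta \\
&= r^2\, 2\pi \int_0^{\pi} \sin\theta\,d\theta \\
&= -r^2\, 2\pi\,[\cos\theta]_0^{\pi} \\
&= 4\pi r^2.
\end{aligned} \qquad (6.103)$$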
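The same integral can be checked numerically. The following sketch is my own addition; the function name `sphere_area` and the number of midpoint-rule panels are arbitrary choices. It integrates $\sqrt{g} = r^2\sin\theta$ over θ and multiplies by 2π for the trivial φ integral.

```python
import math

def sphere_area(r, n=2000):
    """Midpoint-rule estimate of the proper area of a sphere of radius r."""
    dtheta = math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * dtheta
        # sqrt(det g) with g_thth = r^2 and g_phph = r^2 sin^2(theta)
        sqrt_g = math.sqrt(r**2 * r**2 * math.sin(theta) ** 2)
        total += sqrt_g * dtheta
    # The integrand does not depend on phi, so the phi integral gives 2*pi.
    return 2 * math.pi * total

area = sphere_area(1.5)
expected = 4 * math.pi * 1.5 ** 2
```

With 2000 panels the estimate should agree with 4πr² to much better than one part in 10⁵.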

37. (b) Analogous problem for Exer. 33, the three-sphere.

I don’t see that we learn anything new here.

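39. Defines the Lie bracket in Eq. (6.100):

$$[\vec{U}, \vec{V}]^\alpha = U^\beta\,\nabla_\beta V^\alpha - V^\beta\,\nabla_\beta U^\alpha$$

(a) That,

$$[\vec{U}, \vec{V}]^\alpha = -[\vec{V}, \vec{U}]^\alpha$$

follows immediately from the definition in Eq. (6.100).

Show that,

$$[\vec{U}, \vec{V}]^\alpha = U^\beta\,V^\alpha{}_{,\beta} - V^\beta\,U^\alpha{}_{,\beta}$$

We start with the definition of Eq. (6.100):

$$\begin{aligned}
[\vec{U}, \vec{V}]^\alpha &= U^\beta\,\nabla_\beta V^\alpha - V^\beta\,\nabla_\beta U^\alpha \\
&= U^\beta\,V^\alpha{}_{;\beta} - V^\beta\,U^\alpha{}_{;\beta} && \text{notation change only} \\
&= U^\beta\,(V^\alpha{}_{,\beta} + \Gamma^\alpha{}_{\mu\beta}V^\mu) - V^\beta\,(U^\alpha{}_{,\beta} + \Gamma^\alpha{}_{\mu\beta}U^\mu) && \text{used Eq. (6.33)} \\
&= U^\beta\,V^\alpha{}_{,\beta} - V^\beta\,U^\alpha{}_{,\beta} + U^\beta\,\Gamma^\alpha{}_{\mu\beta}V^\mu - V^\beta\,\Gamma^\alpha{}_{\mu\beta}U^\mu && \text{rearranging only}
\end{aligned} \qquad (6.104)$$

where the first two terms match what we are required to prove. So we only need to show that the last two terms, those containing the Christoffel symbols, vanish.

$$\begin{aligned}
U^\beta\,\Gamma^\alpha{}_{\mu\beta}V^\mu - V^\beta\,\Gamma^\alpha{}_{\mu\beta}U^\mu &= \Gamma^\alpha{}_{\beta\mu}V^\beta U^\mu - V^\beta\,\Gamma^\alpha{}_{\mu\beta}U^\mu && \text{relabelled dummy indices on first term} \\
&= 0,
\end{aligned} \qquad (6.105)$$

because in any coordinate system $\Gamma^\alpha{}_{\mu\beta} = \Gamma^\alpha{}_{\beta\mu}$, see Exer. 5.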

6.10 Rob’s supplementary problems

SP.1 What are the numerical values of the elements of the Riemann curvature tensor, $R_{\alpha\alpha\mu\nu}$ and $R_{\alpha\beta\mu\mu}$, with summation not implied? (Hint: Think about the implications of the symmetry relations contained in Eq. (6.69).)

I recommend doing this problem before attempting Exerc. 18(b) in § 6.9.

Solution: $R_{\alpha\alpha\mu\nu} = 0$ and $R_{\alpha\beta\mu\mu} = 0$ because we must have

$$R_{\alpha\beta\mu\nu} = -R_{\beta\alpha\mu\nu}$$

For the case where α = β we must have that this element has a numerical value equal to its additive inverse (its negative). The only number equal to its negative is zero. The same argument applies to the second pair of indices.

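SP.2 Generalize Eq. (6.73) to the case of a $\binom{1}{1}$ tensor, $F^\mu{}_\nu$.

Solution: This is a straightforward generalization of the argument leading to Eq. (6.73). See also the solution to Schutz’s Exerc. 20.

The covariant derivative of a $\binom{1}{1}$ tensor can be inferred from Eqs. (6.34) and (6.35).

$$\begin{aligned}
\nabla_\beta F^\mu{}_\nu &= F^\mu{}_{\nu;\beta} \\
&= F^\mu{}_{\nu,\beta} + \Gamma^\mu{}_{\sigma\beta}F^\sigma{}_\nu - \Gamma^\sigma{}_{\nu\beta}F^\mu{}_\sigma
\end{aligned} \qquad (6.106)$$

Now we simply apply the gradient operator another time, initially for compactness without expanding $F^\mu{}_{\nu;\beta}$:

$$\begin{aligned}
\nabla_\alpha(\nabla_\beta F^\mu{}_\nu) &= (F^\mu{}_{\nu;\beta})_{;\alpha} \\
&= (F^\mu{}_{\nu;\beta})_{,\alpha} + \Gamma^\mu{}_{\sigma\alpha}F^\sigma{}_{\nu;\beta} - \Gamma^\sigma{}_{\nu\alpha}F^\mu{}_{\sigma;\beta} - \Gamma^\sigma{}_{\beta\alpha}F^\mu{}_{\nu;\sigma} \\
&= (F^\mu{}_{\nu;\beta})_{,\alpha}
\end{aligned} \qquad (6.107)$$

where the last step used the fact that Christoffel symbols are zero in local inertial coordinates. But their gradients are not, so

$$\begin{aligned}
\nabla_\alpha(\nabla_\beta F^\mu{}_\nu) &= (F^\mu{}_{\nu;\beta})_{,\alpha} \\
&= F^\mu{}_{\nu,\beta\alpha} + (\Gamma^\mu{}_{\sigma\beta}F^\sigma{}_\nu - \Gamma^\sigma{}_{\nu\beta}F^\mu{}_\sigma)_{,\alpha} && \text{using (6.106) above} \\
&= F^\mu{}_{\nu,\beta\alpha} + \Gamma^\mu{}_{\sigma\beta,\alpha}F^\sigma{}_\nu - \Gamma^\sigma{}_{\nu\beta,\alpha}F^\mu{}_\sigma
\end{aligned} \qquad (6.108)$$

which generalizes Eq. (6.73).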

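SP.3 Derive Eq. (6.78) in a manner analogous to the derivation of Eq. (6.77). Use the results of SP2 above. I posed this problem because I couldn’t find the solution to Exerc. 21, wherein one is to explain the positive signs in Eq. (6.78).

Solution: Using results (6.108) from SP2 above, we start by changing the order of the derivatives by changing the order of the indices α and β. Taking the difference between the two derivatives we obtain (the second-derivative terms $F^\mu{}_{\nu,\beta\alpha}$ and $F^\mu{}_{\nu,\alpha\beta}$ cancel because partial derivatives commute)

$$\begin{aligned}
[\nabla_\alpha, \nabla_\beta]F^\mu{}_\nu &\equiv \nabla_\alpha(\nabla_\beta F^\mu{}_\nu) - \nabla_\beta(\nabla_\alpha F^\mu{}_\nu) \\
&= F^\mu{}_{\nu,\beta\alpha} + \Gamma^\mu{}_{\sigma\beta,\alpha}F^\sigma{}_\nu - \Gamma^\sigma{}_{\nu\beta,\alpha}F^\mu{}_\sigma - (F^\mu{}_{\nu,\alpha\beta} + \Gamma^\mu{}_{\sigma\alpha,\beta}F^\sigma{}_\nu - \Gamma^\sigma{}_{\nu\alpha,\beta}F^\mu{}_\sigma) \\
&= (\Gamma^\mu{}_{\sigma\beta,\alpha} - \Gamma^\mu{}_{\sigma\alpha,\beta})F^\sigma{}_\nu - (\Gamma^\sigma{}_{\nu\beta,\alpha} - \Gamma^\sigma{}_{\nu\alpha,\beta})F^\mu{}_\sigma && \text{collecting common factors}
\end{aligned} \qquad (6.109)$$

Use Eq. (6.63), but in a reference frame where the Christoffel symbols all vanish,

$$[\nabla_\alpha, \nabla_\beta]F^\mu{}_\nu = R^\mu{}_{\sigma\alpha\beta}F^\sigma{}_\nu - R^\sigma{}_{\nu\alpha\beta}F^\mu{}_\sigma \qquad (6.110)$$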

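SP.4 Do the symmetry relations in Eq. (6.69) apply also when an index is raised? Prove that $R^\alpha{}_{\beta\mu\nu} = -R_\beta{}^\alpha{}_{\mu\nu}$. This result will be useful for Exerc. 25(a).

Yes.

$$\begin{aligned}
-R_\alpha{}^\beta{}_{\mu\nu} &= -g^{\sigma\beta}\,R_{\alpha\sigma\mu\nu} \\
&= g^{\sigma\beta}\,R_{\sigma\alpha\mu\nu}, && \text{by Eq. (6.69)} \\
&= R^\beta{}_{\alpha\mu\nu}.
\end{aligned} \qquad (6.111)$$

This is the stated result with the labels α and β interchanged.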

SP.5 What’s the Ricci tensor and Ricci scalar of the metric of Exer. 35 (which turns out to be of the form of the Schwarzschild metric)?

SP.6 Why is it generally important to use Eq. (6.63) for the computation of the Riemann tensor components and not the one in terms of the metric, given by Eq. (6.68)?


Chapter 7

Physics in curved spacetime


7.2 Physics in slightly curved spacetimes

Eq. (7.14) should be

$$\Gamma^0{}_{00} \approx (1 - 2\phi)\,\phi_{,t} = \phi_{,t} + O(\phi).$$

p. 177, $[g^{\alpha\beta}]$ is apparently the matrix associated with the inverse metric tensor. In earlier chapters the notation $(g^{\alpha\beta})$ was used, an annoying inconsistency.

7.6 Exercises

It might help to tackle my supplementary problems first, see § 7.7 below.

1. (i) If Eq. (7.3) were the correct generalization of Eq. (7.1) to a curved spacetime, how would you interpret it? (ii) What would happen to the number of particles in a comoving volume of fluid, as time evolves? (iii) In principle, can we distinguish experimentally between Eqs. (7.2) and (7.3)?

(i) Recall the hypothetical Eq. (7.3) was

$$(nU^\alpha)_{;\alpha} = q\,R$$

where q was some constant and R the Ricci scalar. Based on kinematics, this equation states that, for q > 0, there is a source of particles in positively curved space, R > 0. This follows from Exer. 20(a) of § 4.10, which showed that the divergence of the 4-vector $\vec{N} = n\vec{U}$ is the rate of generation of particles per unit volume. In Chapter 4 we were working in flat spacetime, but on a curved manifold the divergence would include terms from the rate of change of the basis vectors:

$$\begin{aligned}
N^\alpha{}_{;\alpha} &= \varepsilon \\
&= (nU^\alpha)_{;\alpha} && \text{using definitions Eq. (4.1) and Eq. (4.4)}
\end{aligned} \qquad (7.1)$$

If q R were negative there would be a sink of particles. The fact that, by Eq. (7.1),

$$(nU^\alpha)_{,\alpha} = 0$$

while

$$(nU^\alpha)_{;\alpha} = q\,R \neq 0$$

would mean, at least mathematically, that the nonzero source results from the derivatives of the basis vectors:

$$\begin{aligned}
(nU^\alpha)_{;\alpha} &= N^\alpha{}_{;\alpha} \\
&= N^\alpha{}_{,\alpha} + \Gamma^\alpha{}_{\mu\alpha}N^\mu && \text{Eq. (6.36)} \\
&= \Gamma^\alpha{}_{\mu\alpha}N^\mu && \text{using Eq. (7.1)} \\
&= N^\alpha\,(\ln\sqrt{-g})_{,\alpha} && \text{using Eq. (6.41)} \\
&= q\,R && \text{by the hypothetical Eq. (7.3).}
\end{aligned} \qquad (7.2)$$

I suppose physically we could interpret this as a source of particles, per unit volume per unit time, (q R), resulting from the flux of particles $N^\alpha$ up the gradient of the natural logarithm of the magnitude of the determinant of the metric, whatever that would mean! I’m not sure how to understand or interpret this more completely. Of course this would be admittedly strange because it would mean that there would be a source of particles in some frames but not in the locally inertial frame (where by Eq. (6.5) we had vanishing first derivatives of all components of the metric).

(ii) What would happen to the number of particles in a co-moving volume of fluid, as time evolves?

Let’s recall the solution to Exer. 20 of § 4.10, where we derived (4.44):

$$\frac{\partial (nU^0)}{\partial t} = -\frac{\partial (nU^x)}{\partial x} - \frac{\partial (nU^y)}{\partial y} - \frac{\partial (nU^z)}{\partial z} + \varepsilon.$$

In the co-moving frame, $U^0$ is always unity, such that:

$$\frac{\partial n}{\partial t} = -\frac{\partial (nU^x)}{\partial x} - \frac{\partial (nU^y)}{\partial y} - \frac{\partial (nU^z)}{\partial z} + \varepsilon. \qquad (7.3)$$

How the number of particles per unit volume n evolves in time depends upon the spatial convergence of the number flux, $-N^i{}_{;i}$, and the source term, ε = qR.

(iii) Could we ever distinguish experimentally between Eqs. (7.2) and (7.3)? Recall Eq. (7.2) had

$$(nU^\alpha)_{;\alpha} = 0.$$

So “yes”, I believe one could, in principle, measure the terms on the RHS of (7.3) and thereby distinguish between Eqs. (7.2) and (7.3).

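2. To first order in φ, compute $g^{\alpha\beta}$ for [the line element given by] Eq. (7.8).

See Exers. 35 and 36 from § 6.9 for how to calculate the metric from the line element. Here the metric is

$$(g_{\alpha\beta}) = \begin{pmatrix} -(1 + 2\phi) & 0 & 0 & 0 \\ 0 & (1 - 2\phi) & 0 & 0 \\ 0 & 0 & (1 - 2\phi) & 0 \\ 0 & 0 & 0 & (1 - 2\phi) \end{pmatrix}$$

and in fact, it’s exactly the same as in Exer. 36 of § 6.9. Now we are asked for the inverse, which because it’s diagonal follows immediately as the inverse of the terms on the diagonal of the metric:

$$\begin{aligned}
(g^{\alpha\beta}) &= \begin{pmatrix} -(1 + 2\phi)^{-1} & 0 & 0 & 0 \\ 0 & (1 - 2\phi)^{-1} & 0 & 0 \\ 0 & 0 & (1 - 2\phi)^{-1} & 0 \\ 0 & 0 & 0 & (1 - 2\phi)^{-1} \end{pmatrix} \\
&\approx \begin{pmatrix} -(1 - 2\phi) & 0 & 0 & 0 \\ 0 & (1 + 2\phi) & 0 & 0 \\ 0 & 0 & (1 + 2\phi) & 0 \\ 0 & 0 & 0 & (1 + 2\phi) \end{pmatrix}
\end{aligned} \qquad (7.4)$$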

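3. Calculate all the Christoffel symbols for the metric given by Eq. (7.8), to first order in φ(t, x, y, z).

This question requires tons of algebra yet I found I didn’t learn anything. On the other hand, given the importance of this metric, perhaps a complete set of Christoffel symbols will come in handy later. First let’s count the number of independent Christoffel symbols, $\Gamma^\alpha{}_{\mu\nu}$, to calculate. For each α there are only ten independent terms because $\Gamma^\alpha{}_{\mu\nu} = \Gamma^\alpha{}_{\nu\mu}$ in any basis. Given the metric and inverse metric, see Exer. 2, we can calculate the Christoffel symbols using Eq. (6.32). The calculation simplifies tremendously because $(g^{\alpha\beta})$ is diagonal. Thus we need only consider the β = α contribution in Eq. (6.32). Recall from Exer. 2 that $g^{tt} \approx -(1 - 2\phi)$ while $g^{xx} = g^{yy} = g^{zz} \approx (1 + 2\phi)$, and that $g_{tt,\mu} = -2\phi_{,\mu}$ and $g_{xx,\mu} = g_{yy,\mu} = g_{zz,\mu} = -2\phi_{,\mu}$.

$$\Gamma^0{}_{00} = \Gamma^t{}_{tt} = \tfrac{1}{2}\,g^{tt}g_{tt,t} = (1 - 2\phi)\,\phi_{,t} \qquad (7.5)$$

$$\Gamma^0{}_{01} = \Gamma^t{}_{tx} = \tfrac{1}{2}\,g^{tt}g_{tt,x} = (1 - 2\phi)\,\phi_{,x} \qquad (7.6)$$

$$\Gamma^0{}_{02} = \Gamma^t{}_{ty} = \tfrac{1}{2}\,g^{tt}g_{tt,y} = (1 - 2\phi)\,\phi_{,y} \qquad (7.7)$$

$$\Gamma^0{}_{03} = \Gamma^t{}_{tz} = \tfrac{1}{2}\,g^{tt}g_{tt,z} = (1 - 2\phi)\,\phi_{,z} \qquad (7.8)$$

$$\Gamma^0{}_{10} = \Gamma^0{}_{01} = \Gamma^t{}_{tx} = (1 - 2\phi)\,\phi_{,x} \qquad (7.9)$$

but let’s not bother with redundant ones any more.

$$\Gamma^0{}_{11} = \Gamma^t{}_{xx} = \tfrac{1}{2}\,g^{tt}(-g_{xx,t}) = -(1 - 2\phi)\,\phi_{,t} \qquad (7.10)$$

$$\Gamma^0{}_{12} = \Gamma^t{}_{xy} = \tfrac{1}{2}\,g^{tt}(0) = 0. \qquad (7.11)$$

and in general when all the indices are different, the result is nil.

$$\Gamma^0{}_{13} = \Gamma^t{}_{xz} = 0. \qquad (7.12)$$

$$\Gamma^0{}_{22} = \Gamma^t{}_{yy} = \tfrac{1}{2}\,g^{tt}(-g_{yy,t}) = -(1 - 2\phi)\,\phi_{,t} \qquad (7.13)$$

$$\Gamma^0{}_{23} = \Gamma^t{}_{yz} = 0 \qquad (7.14)$$

$$\Gamma^0{}_{33} = \Gamma^t{}_{zz} = \tfrac{1}{2}\,g^{tt}(-g_{zz,t}) = -(1 - 2\phi)\,\phi_{,t} \qquad (7.15)$$

$$\Gamma^1{}_{00} = \Gamma^x{}_{tt} = \tfrac{1}{2}\,g^{xx}(-g_{tt,x}) = (1 + 2\phi)\,\phi_{,x} \qquad (7.16)$$

$$\Gamma^1{}_{01} = \Gamma^x{}_{tx} = \tfrac{1}{2}\,g^{xx}(g_{xx,t}) = -(1 + 2\phi)\,\phi_{,t} \qquad (7.17)$$

$$\Gamma^1{}_{02} = \Gamma^x{}_{ty} = 0. \qquad (7.18)$$

$$\Gamma^1{}_{03} = \Gamma^x{}_{tz} = 0. \qquad (7.19)$$

$$\Gamma^1{}_{11} = \Gamma^x{}_{xx} = \tfrac{1}{2}\,g^{xx}(g_{xx,x}) = -(1 + 2\phi)\,\phi_{,x} \qquad (7.20)$$

$$\Gamma^1{}_{12} = \Gamma^x{}_{xy} = \tfrac{1}{2}\,g^{xx}(g_{xx,y}) = -(1 + 2\phi)\,\phi_{,y} \qquad (7.21)$$

$$\Gamma^1{}_{13} = \Gamma^x{}_{xz} = \tfrac{1}{2}\,g^{xx}(g_{xx,z}) = -(1 + 2\phi)\,\phi_{,z} \qquad (7.22)$$

$$\Gamma^1{}_{22} = \Gamma^x{}_{yy} = \tfrac{1}{2}\,g^{xx}(-g_{yy,x}) = (1 + 2\phi)\,\phi_{,x} \qquad (7.23)$$

$$\Gamma^1{}_{23} = \Gamma^x{}_{yz} = 0. \qquad (7.24)$$

$$\Gamma^1{}_{33} = \Gamma^x{}_{zz} = \tfrac{1}{2}\,g^{xx}(-g_{zz,x}) = (1 + 2\phi)\,\phi_{,x} \qquad (7.25)$$

$$\Gamma^2{}_{00} = \Gamma^y{}_{tt} = \tfrac{1}{2}\,g^{yy}(-g_{tt,y}) = (1 + 2\phi)\,\phi_{,y} \qquad (7.26)$$

$$\Gamma^2{}_{01} = \Gamma^y{}_{tx} = 0. \qquad (7.27)$$

$$\Gamma^2{}_{02} = \Gamma^y{}_{ty} = \tfrac{1}{2}\,g^{yy}(g_{yy,t}) = -(1 + 2\phi)\,\phi_{,t} \qquad (7.28)$$

$$\Gamma^2{}_{03} = \Gamma^y{}_{tz} = 0. \qquad (7.29)$$

$$\Gamma^2{}_{11} = \Gamma^y{}_{xx} = \tfrac{1}{2}\,g^{yy}(-g_{xx,y}) = (1 + 2\phi)\,\phi_{,y} \qquad (7.30)$$

$$\Gamma^2{}_{12} = \Gamma^y{}_{xy} = \tfrac{1}{2}\,g^{yy}(g_{yy,x}) = -(1 + 2\phi)\,\phi_{,x} \qquad (7.31)$$

$$\Gamma^2{}_{13} = \Gamma^y{}_{xz} = 0. \qquad (7.32)$$

$$\Gamma^2{}_{22} = \Gamma^y{}_{yy} = \tfrac{1}{2}\,g^{yy}(g_{yy,y}) = -(1 + 2\phi)\,\phi_{,y} \qquad (7.33)$$

$$\Gamma^2{}_{23} = \Gamma^y{}_{yz} = \tfrac{1}{2}\,g^{yy}(g_{yy,z}) = -(1 + 2\phi)\,\phi_{,z} \qquad (7.34)$$

$$\Gamma^2{}_{33} = \Gamma^y{}_{zz} = \tfrac{1}{2}\,g^{yy}(-g_{zz,y}) = (1 + 2\phi)\,\phi_{,y} \qquad (7.35)$$

$$\Gamma^3{}_{00} = \Gamma^z{}_{tt} = \tfrac{1}{2}\,g^{zz}(-g_{tt,z}) = (1 + 2\phi)\,\phi_{,z} \qquad (7.36)$$

$$\Gamma^3{}_{01} = \Gamma^z{}_{tx} = 0. \qquad (7.37)$$

$$\Gamma^3{}_{02} = \Gamma^z{}_{ty} = 0. \qquad (7.38)$$

$$\Gamma^3{}_{03} = \Gamma^z{}_{tz} = \tfrac{1}{2}\,g^{zz}(g_{zz,t}) = -(1 + 2\phi)\,\phi_{,t} \qquad (7.39)$$

$$\Gamma^3{}_{11} = \Gamma^z{}_{xx} = \tfrac{1}{2}\,g^{zz}(-g_{xx,z}) = (1 + 2\phi)\,\phi_{,z} \qquad (7.40)$$

$$\Gamma^3{}_{12} = \Gamma^z{}_{xy} = 0. \qquad (7.41)$$

$$\Gamma^3{}_{13} = \Gamma^z{}_{xz} = \tfrac{1}{2}\,g^{zz}(g_{zz,x}) = -(1 + 2\phi)\,\phi_{,x} \qquad (7.42)$$

$$\Gamma^3{}_{22} = \Gamma^z{}_{yy} = \tfrac{1}{2}\,g^{zz}(-g_{yy,z}) = (1 + 2\phi)\,\phi_{,z} \qquad (7.43)$$

$$\Gamma^3{}_{23} = \Gamma^z{}_{yz} = \tfrac{1}{2}\,g^{zz}(g_{zz,y}) = -(1 + 2\phi)\,\phi_{,y} \qquad (7.44)$$

$$\Gamma^3{}_{33} = \Gamma^z{}_{zz} = \tfrac{1}{2}\,g^{zz}(g_{zz,z}) = -(1 + 2\phi)\,\phi_{,z} \qquad (7.45)$$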
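Strictly to first order in φ, the factors (1 ± 2φ) can be dropped, and the whole list collapses into a handful of compact expressions. The summary below is my own condensation, offered as a sketch; it assumes $g^{tt} \approx -(1 - 2\phi)$ and $g^{ii} \approx (1 + 2\phi)$ as in Exer. 2, with Latin indices running over x, y, z:

```latex
\begin{aligned}
\Gamma^0{}_{00} &\approx \phi_{,t}, &
\Gamma^0{}_{0i} &\approx \phi_{,i}, &
\Gamma^0{}_{ij} &\approx -\delta_{ij}\,\phi_{,t}, \\
\Gamma^i{}_{00} &\approx \phi_{,i}, &
\Gamma^i{}_{0j} &\approx -\delta^i{}_j\,\phi_{,t}, &
\Gamma^i{}_{jk} &\approx \delta_{jk}\,\phi_{,i} - \delta^i{}_j\,\phi_{,k} - \delta^i{}_k\,\phi_{,j}.
\end{aligned}
```

For instance, setting i = j = k = x in the last expression gives $-\phi_{,x}$, and i = x, j = k = y gives $+\phi_{,x}$.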


4. Verify that the results of Eqs. (7.15) and (7.24) depended only on $g_{00}$; the form of $g_{xx}$ doesn’t affect them, as long as it is (1 + O(φ)).

It’s not clear where we’re expected to start. I’ll go back to Eq. (7.9), since this is clearly the equation for a geodesic.

Let λ = τ/m, where τ is proper time and m is the rest mass. Let $\vec{x}$ be the position of the particle, so the tangent vector

$$\vec{U} = \frac{d\vec{x}}{d\tau}$$

is also the four-velocity of the particle. The momentum $\vec{p}$ is

$$\begin{aligned}
\vec{p} &= m\,\vec{U} \\
&= m\,\frac{d\vec{x}}{d\tau} \\
&= \frac{d\vec{x}}{d\lambda}.
\end{aligned} \qquad (7.46)$$

And thus the momentum is the tangent vector to the same path parameterized by λ. So the particle’s path also satisfies Eq. (7.10), the equation for a geodesic written in terms of its tangent vector, cf. Eq. (6.49).

To justify Eq. (7.11) from Eq. (7.10), we write out Eq. (7.10) using the notation used in Eq. (6.47):

$$\begin{aligned}
\nabla_{\vec{p}}\,\vec{p} &= \frac{d\vec{p}}{d\lambda} \\
&= \frac{d}{d\lambda}(p^\alpha\,\vec{e}_\alpha) \\
&= \vec{e}_\alpha\,\frac{d}{d\lambda}p^\alpha + p^\alpha\,\frac{d}{d\lambda}\vec{e}_\alpha \\
&= \vec{e}_\alpha\,\frac{d}{d\lambda}p^\alpha + p^\alpha\,\frac{\partial\vec{e}_\alpha}{\partial x^\beta}\,\frac{dx^\beta}{d\lambda} \\
&= \vec{e}_\alpha\,\frac{d}{d\lambda}p^\alpha + p^\alpha\,\Gamma^\mu{}_{\alpha\beta}\,\vec{e}_\mu\,\frac{dx^\beta}{d\lambda} && \text{using definition of Christoffel symbol, cf. Eq. (5.44)} \\
&= \vec{e}_\alpha\,\frac{d}{d\lambda}p^\alpha + p^\alpha\,\Gamma^\mu{}_{\alpha\beta}\,\vec{e}_\mu\,p^\beta && \text{using (7.46)} \\
&= \vec{e}_\mu\left(\frac{d}{d\lambda}p^\mu + p^\alpha\,\Gamma^\mu{}_{\alpha\beta}\,p^\beta\right) && \text{relabelling dummy indices to allow factoring} \\
&= 0.
\end{aligned} \qquad (7.47)$$

The quantity in parentheses is a scalar for each µ and the basis vectors $\vec{e}_\mu$ are all orthogonal, and hence linearly independent, since the line element in Eq. (7.8) does not contain off-diagonal terms (see discussion in solution of Exer. 35 of § 6.9). That implies that each component of Eq. (7.10) must vanish (something that was not immediately obvious). So we can consider just the µ = 0 (i.e. time) component, which gives Eq. (7.11). Recall the 4-momentum is

$$\vec{p} \rightarrow_O (E, p^1, p^2, p^3) = (m\gamma, m\gamma v^1, m\gamma v^2, m\gamma v^3)$$

where $v^i$ is the ith component of the 3-velocity and

$$\gamma \equiv \frac{1}{\sqrt{1 - v^2/c^2}}.$$

When the magnitude of the three-velocity v ≪ c = 1, we have $p^0 \gg p^i$, and clearly Eq. (7.11) simplifies to Eq. (7.12). The $\Gamma^0{}_{00}$ for this metric was computed in Exer. 3 and agrees with that given just before Eq. (7.14). But I would get, instead of Eq. (7.14),

$$\Gamma^0{}_{00} \approx (1 - 2\phi)\,\phi_{,t} = \phi_{,t} + O(\phi).$$

Furthermore, $p^0 = E$, the total energy,

$$E = m\gamma = m\,\frac{1}{\sqrt{1 - v^2}} \approx m,$$

see bottom p. 42. Substitution into Eq. (7.11) gives Eq. (7.15).

5. (a) For a perfect fluid, verify that the spatial components of Eq. (7.6) in the Newtonian limit reduce to the Euler equations (Eq. 7.38) for the metric Eq. (7.8).

The stress-energy tensor $T^{\mu\nu}$ for a perfect fluid was given in Eq. (7.7); the only modification from that given by Eq. (4.37) in Chapter 4 on fluids in SR is the more general metric tensor. Substituting this into Eq. (7.6) we get:

$$\begin{aligned}
T^{\mu\nu}{}_{;\nu} &= [(\rho + p)U^\mu U^\nu]_{;\nu} + [p\,g^{\mu\nu}]_{;\nu} = 0 \\
&= [(\rho + p)U^\mu U^\nu]_{;\nu} + p_{;\nu}\,g^{\mu\nu} && \text{because } g^{\mu\nu}{}_{;\nu} = 0 \\
&\approx [\rho\,U^\mu U^\nu]_{;\nu} + p_{,\nu}\,g^{\mu\nu} && \text{because } p \text{ is a scalar}
\end{aligned} \qquad (7.48)$$

and because p ≪ ρ. This follows because the pressure arises from the random motion of the particles, which provides them with negligible kinetic energy relative to the rest mass energy in the non-relativistic limit. Furthermore, ρ ≈ ρ0 = nm, again because the rest mass dominates the energy in the non-relativistic limit. (See Table 4.1 for a definition of symbols.)

$$\begin{aligned}
T^{\mu\nu}{}_{;\nu} &= m\,[nU^\mu U^\nu]_{;\nu} + p_{,\nu}\,g^{\mu\nu} \\
&= mn\,U^\nu\,[U^\mu]_{;\nu} + p_{,\nu}\,g^{\mu\nu} && \text{using Eq. (7.2)} \\
&= mn\,U^\nu\,[U^\mu{}_{,\nu} + \Gamma^\mu{}_{\sigma\nu}U^\sigma] + p_{,\nu}\,g^{\mu\nu} && \text{using Eq. (6.33)} \\
&= \rho_0\,U^\nu\,[U^\mu{}_{,\nu} + \Gamma^\mu{}_{\sigma\nu}U^\sigma] + p_{,\nu}\,g^{\mu\nu}
\end{aligned} \qquad (7.49)$$

where mn = ρ0, the rest mass density. Let’s now restrict attention to the µ = i (spatial) components.

$$T^{i\nu}{}_{;\nu} = \rho_0\,U^\nu\,[U^i{}_{,\nu} + \Gamma^i{}_{\sigma\nu}U^\sigma] + p_{,\nu}\,g^{i\nu} \qquad (7.50)$$

Let’s look at the terms in (7.50) one at a time. The first term is the time derivative, in particular the Eulerian part when ν = 0 and the advective part when ν = j > 0 (by convention). To see this, expand the first term when ν = 0:

$$\begin{aligned}
\rho_0\,U^0\,U^i{}_{,0} &= \rho_0\,U^0\,U^i{}_{,t} \\
&\approx \rho_0\,\gamma^2\,v^i{}_{,t} && \text{neglecting } \gamma_{,t}
\end{aligned} \qquad (7.51)$$

where $\gamma = 1/\sqrt{1 - v^2}$, $v^i$ is the ith component of the 3-velocity and v is its magnitude.

$$\begin{aligned}
\rho_0\,\gamma^2\,v^i{}_{,t} &\approx \rho_0\,v^i{}_{,t} && \text{Newtonian limit} \\
&= \rho_0\,\frac{\partial\mathbf{v}}{\partial t} && \text{changing to vector notation.}
\end{aligned} \qquad (7.52)$$

And the advective part corresponds to this term with ν = j:

$$\begin{aligned}
\rho_0\,\gamma^2\,v^j\,v^i{}_{,j} &\approx \rho_0\,v^j\,v^i{}_{,j} && \text{Newtonian limit} \\
&= \rho_0\,(\mathbf{v}\cdot\nabla)\mathbf{v} && \text{changing to vector notation.}
\end{aligned} \qquad (7.53)$$

The next term in (7.50) contains the Christoffel symbol and can be written:

$$\begin{aligned}
\rho_0\,\Gamma^i{}_{\sigma\nu}U^\sigma U^\nu &\approx \rho_0\,\Gamma^i{}_{00}U^0 U^0 && \text{in Newtonian limit } U^0 \gg U^i \\
&= \rho_0\,\phi_{,i}\,U^0 U^0 && \text{Eq. (7.23)} \\
&= \rho_0\,\phi_{,i}\,\gamma^2 \\
&\approx \rho_0\,\phi_{,i} && \text{in Newtonian limit} \\
&= \rho_0\,\nabla\phi
\end{aligned} \qquad (7.54)$$

which is the gravitational force per unit volume of fluid. The final term in (7.50) is the pressure gradient force per unit volume,

$$\begin{aligned}
p_{,\nu}\,g^{i\nu} &= p_{,i}\,(1 - 2\phi)^{-1} && \text{Eq. (7.20)} \\
&\approx p_{,i}\,(1 + 2\phi) && \text{because } |\phi| \ll 1 \\
&\approx p_{,i} && \text{because } |\phi| \ll 1 \\
&= \nabla p.
\end{aligned} \qquad (7.55)$$

And that’s the Euler equation.
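To make the final statement explicit, the pieces (7.52) to (7.55) can be collected into a single equation. This assembly is my own summary of the terms above (a sketch, to be compared against Schutz’s Eq. (7.38)):

```latex
\rho_0\left[\frac{\partial\mathbf{v}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{v}\right]
  + \rho_0\,\nabla\phi + \nabla p = 0
\quad\Longleftrightarrow\quad
\rho_0\,\frac{D\mathbf{v}}{Dt} = -\nabla p - \rho_0\,\nabla\phi ,
```

where $D/Dt \equiv \partial/\partial t + \mathbf{v}\cdot\nabla$ is the material derivative (notation introduced here only for compactness).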

5. (b) Examine the time component of Eq. (7.6) under the Newtonianlimit, and interpret each term.

We go back to (7.49) and restrict attention to the µ = 0 (time) compo-nent:

T µν ;ν = ρ0 Uν [Uµ

,ν + ΓµσνUσ] + p,ν g

µν

T 0ν;ν = ρ0 U

ν [U0,ν + Γ0

σνUσ] + p,ν g

(7.56)

Let’s look at the terms in (7.56) one-at-a-time. The first term is the timederivative. To see this, expand the first term when ν = 0:

ρ0 U0 U0

,0 = ρ0 U0 U0

,t

= ρ0 γ γ,t

≈ ρ0∂

∂t

(1

2v2

)(7.57)

Page 252: FirstCourseGR_notes_on_Schutz2009.pdf

252

which the Eulerian time rate of change of the kinetic energy density. Thefinal approximation comes about by expanding γ and using the binomialapproximation that applies in the Newtonian limit:

γ =1√

1− v2

≈ 1 +v2

2. (7.58)

And the advective part corresponds to this term with ν = j:

ρ0 Uj U0

,j = ρ0 γ vj γ ,j

≈ ρ0 vj

(1

2v2

),j

Newtonian limit

= ρ0 (v · ∇)

(1

2v2

)changing to vector notation, (7.59)

which is the advection of the kinetic energy per unit volume. The next termin in (7.56) contains the Christoffel symbol and can be written:

ρ0 Γ0σνU

σUν ≈ ρ0Γ00νU

0Uν in Newtonian limit U0 U i

= ρ0U0Uν 1

2g0β(gβν,0 + gβ0,ν − g0ν,β)

= ρ0U0Uν 1

2g00(g0ν,0 + g00,ν − g0ν,0) (inverse) metric is diagonal

= ρ0U0Uν 1

2g00(g00,ν)

= ρ0U0Uν 1

2[−(1 + 2φ)−1][−(1 + 2φ),ν ]

≈ ρ0U0Uν (1− 2φ)φ,ν binomial approximation

≈ ρ0U0Uν φ,ν

= ρ0γ2vν φ,ν (7.60)

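When $\nu = 0$ this gives the tidal forcing:

\begin{equation}
\rho_0\, \Gamma^0{}_{\sigma 0} U^\sigma U^0 \approx \rho_0\, \phi_{,t} \tag{7.61}
\end{equation}

When $\nu = i$ this gives the work done by potential forces (or change in potential energy):

\begin{equation}
\rho_0\, \Gamma^0{}_{\sigma i} U^\sigma U^i \approx \rho_0\, v^i \phi_{,i} \tag{7.62}
\end{equation}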


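The final term in (7.56) involves the pressure,

\begin{align}
p_{,\nu}\, g^{0\nu} &= p_{,0}\, g^{00} && \text{because the inverse metric is diagonal} \nonumber \\
&= p_{,t}\, [-(1 + 2\phi)^{-1}] && \text{Eq. (7.8), take the inverse of diagonal matrix} \nonumber \\
&\approx p_{,t}\, [-(1 - 2\phi)] && \text{binomial approximation} \nonumber \\
&\approx -p_{,t} \tag{7.63}
\end{align}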

This looks like a change from internal energy to kinetic energy, but I’m confused because this should result from divergence of the velocity (compression/expansion of fluid parcels working against ambient pressure).

5. (c) Derive the relativistic hydrostatic balance, given by Eq. (7.40), from Eq. (7.6). [I recommend you try my supplementary problem R1 above before tackling this one.]

We start again with the divergence of the stress-energy tensor given above in (7.48):

\begin{align}
T^{\mu\nu}{}_{;\nu} &= [(\rho + p) U^\mu U^\nu]_{;\nu} + [p\, g^{\mu\nu}]_{;\nu} = 0 \nonumber \\
&= [(\rho + p) U^\mu]\, U^\nu{}_{;\nu} + [(\rho + p) U^\nu]\, U^\mu{}_{;\nu} + [U^\mu U^\nu](\rho + p)_{;\nu} + p_{;\nu}\, g^{\mu\nu} + p\, g^{\mu\nu}{}_{;\nu} \tag{7.64}
\end{align}

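Let’s step through the terms of (7.64) starting with the last, applying the static condition when appropriate. The final term vanishes:

\begin{equation*}
p\, g^{\mu\nu}{}_{;\nu} = 0
\end{equation*}

as we have used many times, see § 6.9 Exer. 17(a) and Eq. (6.31). The 4th term is:

\begin{align}
p_{;\nu}\, g^{\mu\nu} &= p_{,\nu}\, g^{\mu\nu} && \text{because } p \text{ is a scalar,} \nonumber \\
&= p_{,i}\, g^{\mu i} && \text{static metric.} \tag{7.65}
\end{align}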

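The 3rd term vanishes:

\begin{align}
[U^\mu U^\nu](\rho + p)_{;\nu} &= [U^\mu U^\nu](\rho + p)_{,\nu} && \text{because } \rho + p \text{ is a scalar} \nonumber \\
&= [U^0 U^0](\rho + p)_{,0} && \text{static fluid} \nonumber \\
&= [U^0 U^0](\rho + p)_{,t} \nonumber \\
&= 0 && \text{static fluid.} \tag{7.66}
\end{align}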


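The 2nd term is

\begin{align}
[(\rho + p) U^\nu]\, U^\mu{}_{;\nu} &= [(\rho + p) U^0]\, U^\mu{}_{;0} && \text{static fluid} \nonumber \\
&= [(\rho + p) U^0](U^\mu{}_{,t} + \Gamma^\mu{}_{\sigma t} U^\sigma) && \text{Eq. (6.33)} \nonumber \\
&= (\rho + p)\, U^0\, \Gamma^\mu{}_{\sigma t}\, U^\sigma && \text{static fluid} \nonumber \\
&= (\rho + p)\, U^0\, \Gamma^\mu{}_{00}\, U^0 && \text{static fluid} \tag{7.67}
\end{align}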

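The first term vanishes:

\begin{align}
(\rho + p)\, U^\alpha\, U^\beta{}_{;\beta} &= (\rho + p)\, U^\alpha\, [U^\beta{}_{,\beta} + U^\beta (\ln\sqrt{-g})_{,\beta}] \nonumber \\
&= (\rho + p)\, U^\alpha\, [U^0{}_{,t} + U^0 (\ln\sqrt{-g})_{,t}] && \text{static fluid} \nonumber \\
&= 0 && \text{static fluid} \tag{7.68}
\end{align}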

So there’s a (relativistic hydrostatic) balance between the 4th and 2nd terms:

\begin{equation}
(\rho + p)\, U^0\, \Gamma^\mu{}_{00}\, U^0 + p_{,i}\, g^{\mu i} = 0 \tag{7.69}
\end{equation}

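Now let’s simplify the Christoffel symbol,

\begin{align}
\Gamma^\mu{}_{00} &= \tfrac{1}{2} g^{\mu\beta}(-g_{00,\beta}) && \text{using Eq. (5.75) \& static metric} \nonumber \\
&= \tfrac{1}{2} g^{\mu i}(-g_{00,i}) && \text{static metric.} \tag{7.70}
\end{align}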

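Substitution in our hydrostatic balance equation gives:

\begin{align}
0 &= (\rho + p)\, U^0\, \Gamma^\mu{}_{00}\, U^0 + p_{,i}\, g^{\mu i} && \text{recall above} \nonumber \\
&= (\rho + p)\, \tfrac{1}{2} g^{\mu i}(-g_{00,i})\, U^0 U^0 + p_{,i}\, g^{\mu i} \tag{7.71}
\end{align}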

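And now for the tricky bit! Recall we learned in SR that the four-velocity of a stationary particle was the speed of light in the direction of time, c.f. § 2.2, so that

\begin{align}
\vec{U} \cdot \vec{U} &= g_{\alpha\beta} U^\alpha U^\beta \nonumber \\
&= U^0 U^0 g_{00} \nonumber \\
&= 1 \cdot 1 \cdot (-1) = -1, && \text{Eq. (2.28).} \tag{7.72}
\end{align}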


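Now in GR the metric has changed, but do we keep the magnitude $\vec{U} \cdot \vec{U} = -1$ and change the components of $U^\alpha$ accordingly? Yes, we do. [Personally I think this wasn’t explained clearly by Schutz and it’s one of my few criticisms of this text.] Once one sees this small but important step, the result follows easily. One either considers the $\mu = 0$ component or factors $g^{\mu i}$ out:

\begin{align}
0 &= (\rho + p)\, \tfrac{1}{2} g^{\mu i}(-g_{00,i})\, U^0 U^0 + p_{,i}\, g^{\mu i} \nonumber \\
&= (\rho + p)\, \tfrac{1}{2} g^{\mu i}(-g_{00,i}) \left(\frac{1}{-g_{00}}\right) + p_{,i}\, g^{\mu i} \nonumber \\
&= g^{\mu i} \left[ (\rho + p)\, \tfrac{1}{2} (-g_{00,i}) \left(\frac{1}{-g_{00}}\right) + p_{,i} \right] \nonumber \\
0 &= (\rho + p)\, \tfrac{1}{2} (-g_{00,i}) \left(\frac{1}{-g_{00}}\right) + p_{,i} \nonumber \\
&= (\rho + p)\, \tfrac{1}{2} [\ln(-g_{00})]_{,i} + p_{,i} && \text{chain rule of differential calculus} \tag{7.73}
\end{align}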
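The replacement of $U^0 U^0$ by $1/(-g_{00})$ in (7.73) can be spelled out as follows. This is a short sketch, assuming the fluid is at rest ($U^i = 0$) so that only the $g_{00}$ term survives in the normalization:

\begin{align*}
-1 = \vec{U} \cdot \vec{U} = g_{\alpha\beta} U^\alpha U^\beta = g_{00}\, U^0 U^0
\quad &\Longrightarrow \quad U^0 U^0 = \frac{1}{-g_{00}}, \\
\frac{1}{2} \frac{(-g_{00})_{,i}}{-g_{00}} &= \frac{1}{2} [\ln(-g_{00})]_{,i} .
\end{align*}

The second line is just the chain rule applied to $\ln(-g_{00})$, which is well defined because $-g_{00} > 0$.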

5. (d) There appears to be a close relation between $g_{00}$ and $-\exp(2\phi)$, where $\phi$ is the Newtonian potential. Show that Eq. (7.8) and Exer. 4 are consistent with this.

First we have to make sense of “there is a close relation”. Let’s check if they are approximately equal in some situations (such as the Newtonian limit). If $g_{00} = -\exp(2\phi)$, then the term

\begin{equation*}
\tfrac{1}{2} [\ln(-g_{00})]_{,i} = \tfrac{1}{2} [\ln(\exp(2\phi))]_{,i} = \phi_{,i}
\end{equation*}

is the Newtonian gravitational force per unit mass that appears in the Newtonian fluid hydrostatic balance (except that we then have to assume $p \ll \rho$ and ignore $p$). So yes, approximate equality (in some situations) appears to be what the relation is.

Let’s assume that the Newtonian potential is small, $|\phi| \ll 1$ in non-dimensional units. Then

\begin{align}
-\exp(2\phi) &= -(1 + 2\phi + O(\phi^2)) && \text{Taylor series about } \phi = 0 \nonumber \\
&\approx -(1 + 2\phi) && \text{if } |\phi| \ll 1 \nonumber \\
&= g_{00} && \text{as in Eq. (7.8).} \tag{7.74}
\end{align}
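To get a feel for how good the weak-field identification in (7.74) is, the following sketch evaluates the difference between $-\exp(2\phi)$ and $-(1 + 2\phi)$ for a few small potentials. The function name is hypothetical and the sample values of $\phi$ are only illustrative; the last is roughly the solar surface value.

```python
import math

def weak_field_error(phi):
    """Difference between -exp(2*phi) and the weak-field form -(1 + 2*phi)."""
    return -math.exp(2.0 * phi) + (1.0 + 2.0 * phi)

for phi in (-1e-2, -1e-4, -2.12e-6):
    # The leading neglected term is -2*phi**2.
    print(f"phi = {phi:g}: error = {weak_field_error(phi):.3e}, -2 phi^2 = {-2 * phi**2:.3e}")
```

The error is second order in $\phi$, consistent with the Taylor series used above.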

And in Exer. 4 we were required to show that in the Newtonian limit of Eq. (7.10), which was Eq. (7.15) and Eq. (7.24), there was only a dependence on $g_{00}$, and not the other components of the metric. These equations correspond to the energy and momentum of a particle in a time-varying gravitational field. In the Newtonian limit we expect classical mechanics to apply, of course, from which we know that the energy depends upon the tidal forcing (time variation of the gravitational potential), and the momentum is altered by the gravitational force, which is the gradient of the gravitational potential. So $g_{00} \approx -\exp(2\phi)$ is consistent with this.

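6. Deduce Eq. (7.25) from Eq. (7.10).

Eq. (7.10) was the equation for a geodesic,

\begin{align}
\nabla_{\vec{p}}\, \vec{p} &= 0 \nonumber \\
g_{\beta\mu} [\nabla_{\vec{p}}\, \vec{p}\,]^\mu &= 0 \nonumber \\
&= g_{\beta\mu} [p^\alpha p^\mu{}_{;\alpha}] \nonumber \\
&= p^\alpha [g_{\beta\mu} p^\mu]_{;\alpha} && \text{using Eq. (6.31)} \nonumber \\
&= p^\alpha p_{\beta;\alpha} \tag{7.75}
\end{align}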

7. We’re given four expressions for the line elements corresponding to four different metrics (i) . . . (iv). (i) is the Minkowski metric.

(a) For each metric, find as many conserved components $p_\alpha$ of a freely falling particle’s four-momentum as possible.

(i) All four components of $\vec{p}$, i.e. $p_\alpha$, are conserved.


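(ii) The conserved components of $\vec{p}$ are:

\begin{equation}
p_t, \quad p_\phi \tag{7.76}
\end{equation}

(iii) The conserved components of $\vec{p}$ are:

\begin{equation}
p_t, \quad p_\phi \tag{7.77}
\end{equation}

(iv) The conserved components of $\vec{p}$ are:

\begin{equation}
p_\phi \tag{7.78}
\end{equation}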

(b) Use the results of Exer. 28 from § 6.9 to transform the Minkowski metric (i) to the form

\begin{equation*}
ds'^2 = -dt^2 + dr^2 + r^2 (d\theta^2 + \sin^2\theta\, d\phi^2)
\end{equation*}

Use this to argue that (ii) and (iv) are spherically symmetric. Does this increase the number of conserved components of $p_\alpha$?

Recall Exer. 28 from § 6.9 gave us the metric on the surface of a sphere in 3D Cartesian space. But the 3 spatial coordinates of Minkowski space look like those of 3D Cartesian space. So we can represent them in spherical polar coordinates with the same metric, while keeping the time component. I suppose one also needs to know that $dr^2$ gives the radial contribution to the line element in spherical polar coordinates, yet this was not provided in Exer. 28.

I’m not sure about the correct answer to the remainder of this question, but I’ll have a stab at it.

It’s clear for the Schwarzschild metric (ii) that for fixed $r$ (so on the surface of a 3-sphere manifold in this 4-space), the Schwarzschild metric has the same form (to within constants) as the metric for Minkowski space written in spherical coordinates. We know all components of $p_\alpha$ are conserved for the Minkowski metric. This indicates that linear combinations of components of $p_\alpha$ on the surface of the 3-sphere manifold in the Schwarzschild space are conserved. There is a linear transformation that converts this metric to the Minkowski metric everywhere on the 3-sphere.

For the Robertson-Walker metric (iv) for fixed $r$ and $t$, the metric has the same form to within constants as the Minkowski metric written in spherical coordinates. So again this indicates that linear combinations of components of $p_\alpha$ on this manifold would be conserved.

(c) For metrics (i’), (ii), (iii), and (iv), a geodesic that begins tangent to the equatorial plane stays on the equatorial plane (i.e. starts with $\theta = \pi/2$ and $p^\theta = 0$ and keeps $\theta = \pi/2$ and $p^\theta = 0$). For cases (i’), (ii), and (iii), use the equation $\vec{p} \cdot \vec{p} = -m^2$ to solve for $p^r$ in terms of $m$, other conserved quantities, and known functions of position.

For (i’),

\begin{align}
-m^2 &= \vec{p} \cdot \vec{p} = p^\alpha p^\beta g_{\alpha\beta} \nonumber \\
&= -(p^t)^2 + (p^r)^2 + r^2 [(p^\theta)^2 + \sin^2\theta\, (p^\phi)^2] && \text{writing out terms with metric (i’)} \nonumber \\
&= -(p^t)^2 + (p^r)^2 + r^2 (p^\phi)^2 && \text{tangent to equatorial plane} \nonumber \\
p^r &= \pm\sqrt{(p^t)^2 - r^2 (p^\phi)^2 - m^2} \tag{7.79}
\end{align}

We’re almost there, but remember it’s $p_t$ and $p_\phi$ that are conserved. So we need to relate $p^t$ and $p^\phi$ to them to fulfill the requirement of “other conserved quantities”. Because $(g_{\alpha\beta})$ is diagonal for metrics (i’) and (ii), a single factor relates each component:

\begin{align}
p_t &= p^\alpha g_{\alpha t} = p^t g_{tt} = p^t (-1) \nonumber \\
p_\phi &= p^\alpha g_{\alpha\phi} = p^\phi g_{\phi\phi} = p^\phi r^2 \sin^2\theta = p^\phi r^2 \tag{7.80}
\end{align}

Substitution in the above equation gives:

\begin{equation}
p^r = \pm\sqrt{(-p_t)^2 - (p_\phi)^2 / r^2 - m^2} \tag{7.81}
\end{equation}


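For (ii),

\begin{align}
-m^2 &= p^\alpha p^\beta g_{\alpha\beta} \nonumber \\
&= -[1 - 2M/r]\, (p^t)^2 + \frac{1}{1 - 2M/r} (p^r)^2 + r^2 [(p^\theta)^2 + \sin^2\theta\, (p^\phi)^2] && \text{expanding} \nonumber \\
&= -[1 - 2M/r]\, (p^t)^2 + \frac{1}{1 - 2M/r} (p^r)^2 + r^2 (p^\phi)^2 && \text{tangent to equatorial plane} \nonumber \\
p^r &= \pm\sqrt{1 - 2M/r}\, \sqrt{[1 - 2M/r]\, (p^t)^2 - r^2 (p^\phi)^2 - m^2} \tag{7.82}
\end{align}

Again we must find the corresponding conserved quantities:

\begin{align}
p_t &= p^t g_{tt} = p^t (-1) [1 - 2M/r] \nonumber \\
p_\phi &= p^\phi g_{\phi\phi} = p^\phi r^2 \tag{7.83}
\end{align}

Substitution in the above equation gives:

\begin{equation}
p^r = \pm\sqrt{1 - 2M/r}\, \sqrt{(p_t)^2 / [1 - 2M/r] - (p_\phi)^2 / r^2 - m^2} \tag{7.84}
\end{equation}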

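For (iii), it’s instructive to note immediately that on the equatorial plane $\rho^2 = r^2 + a^2 \cos^2\theta = r^2$, so that

\begin{align}
-m^2 &= p^\alpha p^\beta g_{\alpha\beta} \nonumber \\
&= -\frac{\Delta - a^2}{r^2} (p^t)^2 - \frac{4aM}{r} (p^t p^\phi) + \frac{(r^2 + a^2)^2 - a^2 \Delta}{r^2} (p^\phi)^2 + \frac{r^2}{\Delta} (p^r)^2 \tag{7.85}
\end{align}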

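The above equation involves known functions of position, and $p^t$ and $p^\phi$, which are not conserved. Because the Kerr metric is not diagonal, it’s more involved to find the corresponding conserved quantities:

\begin{align}
p_t &= p^\sigma g_{t\sigma} = g_{tt}\, p^t + g_{t\phi}\, p^\phi \nonumber \\
p_\phi &= p^\sigma g_{\phi\sigma} = g_{\phi\phi}\, p^\phi + g_{t\phi}\, p^t \tag{7.86}
\end{align}

We solve this $2 \times 2$ system for $p^t$ and $p^\phi$ to find:

\begin{align}
p^t &= -p_t \frac{g_{\phi\phi}}{g_{\phi t}^2 - g_{tt}\, g_{\phi\phi}} + p_\phi \frac{g_{t\phi}}{g_{\phi t}^2 - g_{tt}\, g_{\phi\phi}} \nonumber \\
p^\phi &= p_t \frac{g_{t\phi}}{g_{\phi t}^2 - g_{tt}\, g_{\phi\phi}} - p_\phi \frac{g_{tt}}{g_{\phi t}^2 - g_{tt}\, g_{\phi\phi}} \tag{7.87}
\end{align}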
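The inversion in (7.87) is easy to check numerically. The sketch below reads the equatorial metric components off (7.85) and assumes the standard Kerr definition $\Delta = r^2 - 2Mr + a^2$, which is not restated in these notes; the function names and sample numbers are mine.

```python
def kerr_equatorial_metric(r, M, a):
    """Covariant g_tt, g_tphi, g_phiphi at theta = pi/2, read off Eq. (7.85).

    Assumes Delta = r^2 - 2 M r + a^2 (standard Kerr definition).
    """
    delta = r * r - 2.0 * M * r + a * a
    g_tt = -(delta - a * a) / r**2
    g_tphi = -2.0 * a * M / r          # the cross term in (7.85) is 2 * g_tphi
    g_phiphi = ((r * r + a * a) ** 2 - a * a * delta) / r**2
    return g_tt, g_tphi, g_phiphi

def raise_indices(p_t, p_phi, g_tt, g_tphi, g_phiphi):
    """Contravariant p^t, p^phi from the conserved p_t, p_phi, Eq. (7.87)."""
    det = g_tphi**2 - g_tt * g_phiphi
    up_t = (-p_t * g_phiphi + p_phi * g_tphi) / det
    up_phi = (p_t * g_tphi - p_phi * g_tt) / det
    return up_t, up_phi

def lower_indices(up_t, up_phi, g_tt, g_tphi, g_phiphi):
    """Covariant components from contravariant ones, Eq. (7.86)."""
    return g_tt * up_t + g_tphi * up_phi, g_phiphi * up_phi + g_tphi * up_t

g = kerr_equatorial_metric(r=10.0, M=1.0, a=0.5)
up = raise_indices(-1.2, 3.0, *g)
print(lower_indices(*up, *g))  # should reproduce the inputs p_t = -1.2, p_phi = 3.0
```

Raising and then lowering returns the starting values, which confirms the signs in (7.87).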


Substitution in the above equation would give the required equation, but it would be very messy. We simply note that we have met the requirement in principle because $g_{tt}$, $g_{\phi t}$, $g_{\phi\phi}$ are known functions of position.

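(d) For (iv), spherical symmetry implies that if a geodesic begins with $p^\theta = p^\phi = 0$, these remain zero. Use this to show from Eq. (7.29) that when $k = 0$, $p_r$ is conserved.

When $k = 0$ the Robertson-Walker metric simplifies to:

\begin{equation*}
(g_{\alpha\beta}) =
\begin{pmatrix}
-1 & 0 & 0 & 0 \\
0 & R^2(t) & 0 & 0 \\
0 & 0 & r^2 R^2(t) & 0 \\
0 & 0 & 0 & r^2 \sin^2\theta\, R^2(t)
\end{pmatrix}
\end{equation*}

Writing out Eq. (7.29) for $p_r$ and this metric we get:

\begin{align}
m \frac{d}{d\tau} p_\beta &= \tfrac{1}{2} g_{\nu\alpha,\beta}\, p^\nu p^\alpha && \text{Eq. (7.29)} \nonumber \\
m \frac{d}{d\tau} p_r &= \tfrac{1}{2} g_{\nu\alpha,r}\, p^\nu p^\alpha \nonumber \\
&= \tfrac{1}{2} [g_{tt,r}\, (p^t)^2 + g_{rr,r}\, (p^r)^2 + g_{\theta\theta,r}\, (p^\theta)^2 + g_{\phi\phi,r}\, (p^\phi)^2] \nonumber \\
&= \tfrac{1}{2} [g_{\theta\theta,r}\, (p^\theta)^2 + g_{\phi\phi,r}\, (p^\phi)^2] && \text{clear from metric} \nonumber \\
&= 0 && \text{clear from given initial conditions and “spherical symmetry”.} \tag{7.88}
\end{align}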

8. Suppose that in some coordinate system the components of the metric $g_{\alpha\beta}$ are independent of some coordinate $x^\mu$.

(a) Show that the conservation law $T^\nu{}_{\mu;\nu} = 0$ for any stress-energy tensor becomes that given by Eq. (7.41).

If it’s not obvious where this conservation law came from, have a go at supplementary problem SP2 above.


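Let’s start with expanding Eq. (7.41) to see what’s really there.

\begin{align}
0 &= \frac{1}{\sqrt{-g}} \left(\sqrt{-g}\, T^\nu{}_\mu\right)_{,\nu} \nonumber \\
&= T^\nu{}_{\mu,\nu} + T^\nu{}_\mu \frac{(\sqrt{-g})_{,\nu}}{\sqrt{-g}} && \text{product rule of differential calculus} \nonumber \\
&= T^\nu{}_{\mu,\nu} + T^\nu{}_\mu\, \Gamma^\alpha{}_{\nu\alpha} && \text{Eq. (6.40).} \tag{7.89}
\end{align}

This now looks somewhat like the given conservation law, for if we expand it using Eqs. (6.34) and (6.35), we obtain

\begin{equation}
T^\nu{}_{\mu;\nu} = T^\nu{}_{\mu,\nu} + T^\nu{}_\mu\, \Gamma^\alpha{}_{\nu\alpha} - T^\nu{}_\sigma\, \Gamma^\sigma{}_{\mu\nu} \tag{7.90}
\end{equation}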

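Comparing (7.89) and (7.90) we see that we merely have to show that under the given conditions on the metric, the final term in (7.90) vanishes. Let’s start with the most general expression relating the Christoffel symbol to the metric tensor, and then apply the restriction given here:

\begin{align}
T^\nu{}_\sigma\, \Gamma^\sigma{}_{\mu\nu} &= T^\nu{}_\sigma\, \tfrac{1}{2} g^{\sigma\beta} [g_{\beta\mu,\nu} + g_{\beta\nu,\mu} - g_{\mu\nu,\beta}] && \text{Eq. (6.32)} \nonumber \\
&= T^\nu{}_\sigma\, \tfrac{1}{2} g^{\sigma\beta} [g_{\beta\mu,\nu} - g_{\mu\nu,\beta}] && \text{when } g \text{ independent of } x^\mu \nonumber \\
&= T^{\nu\beta}\, \tfrac{1}{2} [g_{\beta\mu,\nu} - g_{\mu\nu,\beta}] && \text{just raised index} \nonumber \\
&= 0, \tag{7.91}
\end{align}

because $T^{\nu\beta}$ is symmetric on $\nu$, $\beta$ while the quantity in [ ] is antisymmetric on these indices, cf. Exer. 26 of § 3.10.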
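The vanishing of a symmetric tensor fully contracted with an antisymmetric one, used in the last step of (7.91), can be illustrated numerically. This is only a sketch with randomly generated 4 × 4 arrays standing in for $T^{\nu\beta}$ and the bracketed quantity; all names are hypothetical.

```python
import random

def random_symmetric(n, rng):
    """Random n x n array S with S[i][j] == S[j][i]."""
    m = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
    return [[0.5 * (m[i][j] + m[j][i]) for j in range(n)] for i in range(n)]

def random_antisymmetric(n, rng):
    """Random n x n array A with A[i][j] == -A[j][i]."""
    m = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
    return [[0.5 * (m[i][j] - m[j][i]) for j in range(n)] for i in range(n)]

def full_contraction(S, A):
    """Sum over both indices of S[i][j] * A[i][j]."""
    n = len(S)
    return sum(S[i][j] * A[i][j] for i in range(n) for j in range(n))

rng = random.Random(0)
S = random_symmetric(4, rng)
A = random_antisymmetric(4, rng)
print(full_contraction(S, A))  # zero up to floating-point rounding
```

Each off-diagonal pair $(i, j)$ and $(j, i)$ cancels, and the diagonal of the antisymmetric array is zero.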

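(b) Suppose that in these coordinates $T^{\alpha\beta} \neq 0$ only in some bounded region of each spacelike hypersurface $x^0 = $ const. Show that Eq. (7.41) implies

\begin{equation*}
\int_{x^0 = \text{const.}} T^\nu{}_\mu\, \sqrt{-g}\; \eta_\nu\, d^3x
\end{equation*}

is independent of time $x^0$, if $\eta_\nu$ is the unit normal to the hypersurface.

I believe this is a typo and should be written with $\nu = 0$, i.e.:

\begin{equation*}
\int_{x^0 = \text{const.}} T^0{}_\mu\, \sqrt{-g}\; \eta_0\, d^3x
\end{equation*}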


Starting with Eq. (7.41) we integrate both sides, using the proper volume element, $\sqrt{-g}\, d^4x$ (c.f. Eq. (6.18)):

\begin{align}
\int \frac{1}{\sqrt{-g}} \left(\sqrt{-g}\, T^\nu{}_\mu\right)_{,\nu} \sqrt{-g}\; d^4x &= 0 \nonumber \\
\int \left(\sqrt{-g}\, T^\nu{}_\mu\right)_{,\nu} d^4x &= 0 \nonumber \\
\oint \left(\sqrt{-g}\, T^\nu{}_\mu\right) n_\nu\, d^3S &= 0 && \text{Eq. (6.44)} \tag{7.92}
\end{align}

I found it helpful at this point to picture using Gauss’s law in a simple setting of 2D Cartesian space, with some field being nonzero only in some region of finite extent. In this simple case one would choose the bounding limits to be straight lines $x = x_1$ and $x = x_2$ and $y = y_1$ and $y = y_2$ that lie outside the region of nonzero field. By analogy, we expand the LHS of (7.92)

7.7 Rob’s supplementary problems

SP 1. Recall we learned in SR that the four-velocity of a stationary particle was the speed of light in the direction of time, c.f. § 2.2, so that

\begin{align}
\vec{U} \cdot \vec{U} &= g_{\alpha\beta} U^\alpha U^\beta \nonumber \\
&= U^0 U^0 g_{00} \nonumber \\
&= 1 \cdot 1 \cdot (-1) = -1, && \text{Eq. (2.28).} \tag{7.93}
\end{align}

Now in GR the metric has changed, but do we keep the magnitude $\vec{U} \cdot \vec{U} = -1$ and change the components of $U^\alpha$ accordingly? I recommend you do this problem before tackling Exer. 5(c).

Yes, we do. [Personally I think this wasn’t explained clearly by Schutz up to this point in the text and it’s one of my few criticisms.] See Eq. (10.18).

SP 2. Show that $T^\nu{}_{\mu;\nu} = 0$ can be derived from the conservation law Eq. (7.6).

We simply multiply Eq. (7.6) by the metric tensor to lower the index.

\begin{align}
0 &= T^{\sigma\nu}{}_{;\nu} && \text{Eq. (7.6)} \nonumber \\
&= T^{\nu\sigma}{}_{;\nu} && \text{symmetry of the stress-energy tensor} \nonumber \\
0 &= g_{\mu\sigma}\, T^{\nu\sigma}{}_{;\nu} && \text{multiplying both sides by } g \nonumber \\
&= (g_{\mu\sigma}\, T^{\nu\sigma})_{;\nu} && \text{using Eq. (6.31)} \nonumber \\
&= T^\nu{}_{\mu;\nu} \tag{7.94}
\end{align}

et voilà!


Chapter 8

The Einstein Field Equations


8.1 Purpose and justification of the field equations

Re-reading this long after Chapter 7, I found it strange that he was justifying Eq. (7.6) based upon the Einstein equivalence principle, see discussion between Eqs. (8.4) and (8.5) on p. 185. But re-reading Chapter 7 it is clear that he’s just referring to the application of the “comma goes to semi-colon” rule. That is, Eq. (7.5), the conservation of four-momentum in SR, generalizes (by the Einstein equivalence principle) to Eq. (7.6).

8.2 Einstein’s equations

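8.3 Einstein’s equations for weak gravitational fields

It’s perhaps helpful to go through the derivation of Eq. (8.22).

\begin{align}
g_{\alpha'\beta'} &= \Lambda^\mu{}_{\alpha'} \Lambda^\nu{}_{\beta'}\, g_{\mu\nu} && \text{transformation of a 2nd rank tensor, like Eq. (8.16)} \nonumber \\
&= \Lambda^\mu{}_{\alpha'} \Lambda^\nu{}_{\beta'} (\eta_{\mu\nu} + h_{\mu\nu}) && \text{substitute Eq. (8.12)} \nonumber \\
&= [\delta^\mu{}_\alpha - \varepsilon^\mu{}_{,\alpha}][\delta^\nu{}_\beta - \varepsilon^\nu{}_{,\beta}](\eta_{\mu\nu} + h_{\mu\nu}) && \text{substitute Eq. (8.21)} \nonumber \\
&= \eta_{\alpha\beta} + h_{\alpha\beta} - \eta_{\alpha\nu}\, \varepsilon^\nu{}_{,\beta} - \eta_{\beta\mu}\, \varepsilon^\mu{}_{,\alpha} && \text{keeping just the largest terms} \nonumber \\
&= \eta_{\alpha\beta} + h_{\alpha\beta} - \varepsilon_{\alpha,\beta} - \varepsilon_{\beta,\alpha} \tag{8.1}
\end{align}

which gives Eq. (8.22). It’s tempting to justify the final step above by the metric tensor lowering the indices $\mu$ and $\nu$. But note that Schutz is careful not to say this, but instead says that we define

\begin{equation*}
\varepsilon_\alpha \equiv \eta_{\alpha\beta}\, \varepsilon^\beta
\end{equation*}

see his Eq. (8.23). I guess this is because $\eta$ is not the full metric, but only the dominant piece of it.

When $g_{\alpha\beta}$ is given by Eq. (8.12) then Eq. (8.25) follows immediately from Eq. (6.67), which is $R^\alpha{}_{\beta\mu\nu}$ in a local inertial frame. All one has to do is substitute Eq. (8.12) and lower the index. Note the typo in Eq. (6.67).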


8.6 Exercises

1. Show that Eq. (8.2) is a solution to Eq. (8.1) using the method suggested.

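Gauss’ law in 3D space is

\begin{equation*}
\int_\Omega \nabla \cdot (\nabla\phi)\, dV = \int_{\partial\Omega} \mathbf{n} \cdot (\nabla\phi)\, dA
\end{equation*}

where $\Omega$ is a volume, $dV$ a differential volume element, $\partial\Omega$ is the surface bounding $\Omega$, $\mathbf{n}$ is the outward pointing unit normal vector, and $dA$ is an area element on the bounding surface.

As suggested, we consider a point particle at the origin of a spherical coordinate system. Then the left-hand integral is trivial:

\begin{align}
\int_\Omega \nabla \cdot (\nabla\phi)\, dV &= \int_\Omega 4\pi G \rho\, dV, && \text{using Eq. (8.1)} \nonumber \\
&= 4\pi G \int_\Omega \rho\, dV, \nonumber \\
&= 4\pi G m, \tag{8.2}
\end{align}

where $m$ is the mass of the particle. We have assumed that the particle is in a vacuum so that $\phi$ is only due to this particle.

The right-hand side gives,

\begin{align}
\int_{\partial\Omega} \mathbf{n} \cdot (\nabla\phi)\, dA &= \int_{\partial\Omega} \frac{d\phi}{dr}\, dA, && \text{using surface of sphere for } \partial\Omega \nonumber \\
&= \frac{d\phi}{dr}\, 4\pi R^2, \tag{8.3}
\end{align}

where $R$ is the radius of the sphere and by spherical symmetry the integrand is constant on the surface of the sphere. Combining these two results gives the 1st order ODE:

\begin{equation*}
\frac{d\phi}{dr} = \frac{Gm}{r^2}
\end{equation*}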


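To solve this we integrate both sides, imposing the BC $\phi(\infty) = 0$,

\begin{align}
\int_R^\infty \frac{d\phi}{dr}\, dr &= \int_R^\infty \frac{Gm}{r^2}\, dr \nonumber \\
[\phi]_R^\infty &= \left[-\frac{Gm}{r}\right]_R^\infty \nonumber \\
\phi(R) &= -\frac{Gm}{R} \tag{8.4}
\end{align}

which is Eq. (8.2).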

2 (a). Derive the two given conversion factors.

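One simply plugs in the given constants with values in SI units to calculate the numerical value. One should also convince oneself that the units are correct. The SI values and units were given in Table 8.1. Personally I find it easy to remember the formulae that the constants are used in and figure out the units from there.

\begin{align}
\frac{G_{SI}\, [\mathrm{m^3\, kg^{-1}\, s^{-2}}]}{c_{SI}^2\, [\mathrm{m\, s^{-1}}]^2} &= \frac{G_{SI}}{c_{SI}^2} \frac{[\mathrm{m^3\, kg^{-1}\, s^{-2}}]}{[\mathrm{m^2\, s^{-2}}]} \nonumber \\
&= \frac{G_{SI}}{c_{SI}^2} \frac{[\mathrm{m}]}{[\mathrm{kg}]} \nonumber \\
&= \frac{6.674 \times 10^{-11}}{(2.998 \times 10^8)^2}\, [\mathrm{m\, kg^{-1}}] \nonumber \\
&= 7.425 \times 10^{-28}\, [\mathrm{m\, kg^{-1}}] \tag{8.5}
\end{align}

\begin{align}
\frac{c_{SI}^5\, [\mathrm{m\, s^{-1}}]^5}{G_{SI}\, [\mathrm{m^3\, kg^{-1}\, s^{-2}}]} &= \frac{c_{SI}^5}{G_{SI}} \frac{[\mathrm{m^5\, s^{-5}}]}{[\mathrm{m^3\, kg^{-1}\, s^{-2}}]} \nonumber \\
&= \frac{c_{SI}^5}{G_{SI}} \frac{[\mathrm{m^2\, kg}]}{[\mathrm{s^3}]} \nonumber \\
&= \frac{(2.998 \times 10^8)^5}{6.674 \times 10^{-11}}\, [\mathrm{N\, m\, s^{-1}}] \nonumber \\
&= 3.629 \times 10^{52}\, [\mathrm{J\, s^{-1}}] \tag{8.6}
\end{align}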
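The two conversion factors can be reproduced in a couple of lines. The sketch below uses the same rounded values of $G$ and $c$ as in (8.5) and (8.6); the variable names are mine.

```python
G_SI = 6.674e-11   # gravitational constant [m^3 kg^-1 s^-2], value used in (8.5)
C_SI = 2.998e8     # speed of light [m s^-1], value used in (8.5)

# Mass-to-length conversion factor G/c^2, Eq. (8.5), in [m kg^-1].
mass_to_length = G_SI / C_SI**2

# Power (luminosity) unit c^5/G, Eq. (8.6), in [J s^-1].
power_unit = C_SI**5 / G_SI

print(f"G/c^2 = {mass_to_length:.3e} m/kg")
print(f"c^5/G = {power_unit:.3e} J/s")
```

Multiplying an SI mass by `mass_to_length` gives its geometrized value in metres, and dividing an SI luminosity by `power_unit` gives its unitless geometrized value.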


2 (b). Derive the constants in Table 8.1 in geometrized units.

Here I’ll just get you started. The numerical computations were performed in the corresponding Maple™ script for Chapter 8.

Table 8.1: Conversion between SI and geometrized units

Constant    Geometrized units    Value in terms of constants in SI
$c$         unitless             $c_{SI}\, (c_{SI})^{-1}$
$G$         unitless             $G_{SI}\, G_{SI}^{-1}$
$\hbar$     m$^2$                $\hbar_{SI}\, G_{SI}\, c_{SI}^{-3}$
$m_e$       m                    $m_{e\,SI}\, G_{SI}\, c_{SI}^{-2}$
$M$         m                    $M_{SI}\, G_{SI}\, c_{SI}^{-2}$
$L$         unitless             $L_{SI}\, G_{SI}\, c_{SI}^{-5}$

2 (c) Express the following in geometrized units.

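The trick is again to find the right combination of $c$ and $G$ that gives the right units. By “right units” we mean that you’re not allowed to have [kg] or [s] or units derived from these, like [N]. But you are allowed to have [m].

(i) Density of a neutron star. A mass density in [kg m$^{-3}$] is converted with $G/c^2$ [m kg$^{-1}$]:

\begin{equation}
\rho_{SI}\, G_{SI}\, c_{SI}^{-2}\ [\mathrm{m^{-2}}] \tag{8.7}
\end{equation}

(ii) Pressure in a neutron star. A pressure in [kg m$^{-1}$ s$^{-2}$] is converted with $G/c^4$ [s$^2$ kg$^{-1}$ m$^{-1}$]:

\begin{equation}
p_{SI}\, G_{SI}\, c_{SI}^{-4}\ [\mathrm{m^{-2}}] \tag{8.8}
\end{equation}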


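(iii) Acceleration.

\begin{equation}
g_{SI}\, c_{SI}^{-2}\ [\mathrm{m^{-1}}] \tag{8.9}
\end{equation}

(iv) Luminosity of a neutron star.

\begin{equation}
L_{SI}\, G_{SI}\, c_{SI}^{-5}\ [\text{unitless}] \tag{8.10}
\end{equation}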

2 (d). Find the Planck length, mass, and time all in SI units.

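Now the goal is to find the right combination of $c$ and $G$ and $\hbar$ such that the quantity has dimensions of length [m], mass [kg], and time [s] respectively. Computations were performed in the accompanying Maple™ worksheet for Chapter 8.

(i) Planck length.

\begin{align}
L_P &= \left(\hbar_{SI}\, G_{SI}\, c_{SI}^{-3}\right)^{1/2} [\mathrm{m}] \nonumber \\
&= 1.62 \times 10^{-35}\ [\mathrm{m}] \tag{8.11}
\end{align}

(ii) Planck mass.

\begin{align}
M_P &= \left(\hbar_{SI}\, G_{SI}^{-1}\, c_{SI}\right)^{1/2} [\mathrm{kg}] \nonumber \\
&= 2.18 \times 10^{-8}\ [\mathrm{kg}] \tag{8.12}
\end{align}

(iii) Planck time.

\begin{align}
T_P &= \left(\hbar_{SI}\, G_{SI}\, c_{SI}^{-5}\right)^{1/2} [\mathrm{s}] \nonumber \\
&= 5.39 \times 10^{-44}\ [\mathrm{s}] \tag{8.13}
\end{align}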
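The Planck scales in (8.11) to (8.13) can be recomputed without Maple. The sketch below is not the author's worksheet; it assumes a rounded value $\hbar = 1.0546 \times 10^{-34}$ J s, which is my assumption rather than a number quoted in these notes, together with the $G$ and $c$ values used in Exer. 2(a).

```python
import math

HBAR_SI = 1.0546e-34  # reduced Planck constant [J s]; assumed rounded value
G_SI = 6.674e-11      # gravitational constant [m^3 kg^-1 s^-2]
C_SI = 2.998e8        # speed of light [m s^-1]

planck_length = math.sqrt(HBAR_SI * G_SI / C_SI**3)  # Eq. (8.11), [m]
planck_mass = math.sqrt(HBAR_SI * C_SI / G_SI)       # Eq. (8.12), [kg]
planck_time = math.sqrt(HBAR_SI * G_SI / C_SI**5)    # Eq. (8.13), [s]

print(f"L_P = {planck_length:.3e} m")
print(f"M_P = {planck_mass:.3e} kg")
print(f"T_P = {planck_time:.3e} s")
```

As a consistency check, `planck_time` equals `planck_length / C_SI`, the light-crossing time of a Planck length.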

Now we are to compare these three scales with elementary particles. Elementary particles are considered “point particles”, with no inherent radius (Perkins, 2009). Experiments in high-energy particle physics explore scales down to $10^{-17}$ m, or 100 times smaller than the charge radius of a proton (Perkins, 2009, p. 3), which is still much much larger than the Planck scale. The heavier leptons, i.e. the muon and the tauon, are unstable and decay with mean lifetimes of $2.2 \times 10^{-6}$ and $2.9 \times 10^{-13}$ [s], much much longer than the Planck time. In contrast, the Planck mass appears to be rather large, much heavier than the electron.

3 (a). Calculate the following in geometrized units:

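(i) Newtonian potential of Sun at the surface of the Sun.

\begin{equation}
\phi = -\frac{GM}{r} = -\frac{1 \cdot 1476\ \mathrm{m}}{6.96 \times 10^8\ \mathrm{m}} = -2.12 \times 10^{-6} \tag{8.14}
\end{equation}

(ii) Newtonian potential of Sun at the radius of the Earth’s orbit.

\begin{equation}
\phi = -\frac{GM}{r} = -\frac{1 \cdot 1476\ \mathrm{m}}{1.496 \times 10^{11}\ \mathrm{m}} = -9.87 \times 10^{-9} \tag{8.15}
\end{equation}

(iii) Newtonian potential of Earth at the surface of the Earth.

\begin{equation}
\phi = -\frac{GM}{r} = -\frac{1 \cdot 4.434 \times 10^{-3}\ \mathrm{m}}{6.371 \times 10^6\ \mathrm{m}} = -6.96 \times 10^{-10} \tag{8.16}
\end{equation}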


3 (b). Why is the Sun’s potential at the Earth’s radius greater than the Earth’s own potential there, yet we feel the Earth’s gravitational pull more?

The force of gravity per unit mass is determined by the gradient of thegravitational potential. While the Sun’s potential is larger at the surface ofthe Earth, its potential gradient is much less than that of the Earth’s owngravitational potential.
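To put numbers on the answers to 3(a) and 3(b), the sketch below recomputes the three potentials from the geometrized masses and radii quoted in (8.14) to (8.16), and then compares the potential gradients $GM/r^2$ of the Sun and the Earth at the Earth. Function and variable names are illustrative only.

```python
M_SUN = 1476.0        # geometrized solar mass [m], from Eq. (8.14)
M_EARTH = 4.434e-3    # geometrized Earth mass [m], from Eq. (8.16)
R_SUN = 6.96e8        # solar radius [m]
R_ORBIT = 1.496e11    # radius of Earth's orbit [m]
R_EARTH = 6.371e6     # Earth's radius [m]

def potential(mass, r):
    """Newtonian potential -M/r in geometrized units (G = c = 1), unitless."""
    return -mass / r

def gradient(mass, r):
    """Magnitude of the potential gradient M/r^2 in geometrized units [m^-1]."""
    return mass / r**2

print(potential(M_SUN, R_SUN), potential(M_SUN, R_ORBIT), potential(M_EARTH, R_EARTH))
ratio = gradient(M_EARTH, R_EARTH) / gradient(M_SUN, R_ORBIT)
print(f"Earth/Sun gradient ratio at the Earth: {ratio:.0f}")
```

With these inputs the Earth's gradient at its surface exceeds the Sun's gradient there by a factor of order a thousand, even though the Sun's potential is about fourteen times larger in magnitude.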

3 (c). Show that a circular orbit around a body of mass $M$ has an orbital velocity, in Newtonian theory, of $v^2 = -\phi$, where $\phi$ is the Newtonian potential.

The centripetal acceleration of a body in a circular orbit of radius $r$ is

\begin{equation}
\vec{\Omega} \times (\vec{\Omega} \times r\, \vec{e}_r) = -\frac{v^2}{r}\, \vec{e}_r \tag{8.17}
\end{equation}

as can be found in elementary texts on mechanics (or see Lesson 2 in my course notes http://stockage.univ-brest.fr/~scott/GFD1/2012/index_gfd1.html).
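The remaining step is short. As a sketch, assuming the only force is Newtonian gravity with potential $\phi = -GM/r$ as in Eq. (8.2), we equate the centripetal acceleration (8.17) to the gravitational acceleration $-\nabla\phi$:

\begin{align*}
-\frac{v^2}{r}\, \vec{e}_r &= -\nabla\phi = -\frac{GM}{r^2}\, \vec{e}_r \\
v^2 &= \frac{GM}{r} = -\phi .
\end{align*}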

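4 (a). Let $A$ be an $n \times n$ matrix whose entries are all very small, $|A_{i,j}| \ll 1/n$, and let $I$ be the unit matrix. Show that

\begin{equation*}
(I + A)^{-1} = I - A + A^2 - A^3 \ldots
\end{equation*}

by first showing that
(i) the series on the RHS converges absolutely.
(ii) $(I + A)$ times the RHS gives $I$.

(i) A series $\sum_{p=0}^\infty a_p$ is absolutely convergent if

\begin{equation*}
\sum_{p=0}^\infty |a_p| < \infty
\end{equation*}

see http://en.wikipedia.org/wiki/Absolute_convergence. Let’s say that the largest entry is

\begin{equation*}
\max |A_{i,j}| = \varepsilon \frac{1}{n}, \quad \varepsilon \ll 1,
\end{equation*}

where the maximum is over all $i$ and $j$. Then consider an arbitrary entry of the RHS,

\begin{equation*}
R_{i,j} = \sum_{p=0}^\infty a_p,
\end{equation*}

with $a_0 = 1$ from $I$. The next contribution from the series, $p = 1$, is $-A_{i,j}$, which makes the contribution

\begin{equation*}
|a_1| = |-A_{i,j}| \le \varepsilon \frac{1}{n}
\end{equation*}

The next term in the series, $a_2$, has magnitude

\begin{equation*}
|a_2| = |A_{i,k} A_{k,j}| \le n \times \left(\varepsilon \frac{1}{n}\right)^2 = \varepsilon^2 \frac{1}{n}
\end{equation*}

The next term in the series, $a_3$, has magnitude

\begin{equation*}
|a_3| = |A_{i,k} A_{k,l} A_{l,j}| \le n \times n \times \left(\varepsilon \frac{1}{n}\right)^3 = \varepsilon^3 \frac{1}{n}
\end{equation*}

So we see that for $p \ge 1$

\begin{equation*}
|a_p| \le \varepsilon^p \frac{1}{n}
\end{equation*}

which is $1/n$ times the term of the geometric series from 1 to $\infty$ with $|\varepsilon| < 1$, for which the sum is finite. Thus

\begin{equation*}
\sum_{p=0}^\infty |a_p| \le 1 + \frac{1}{n} \frac{\varepsilon}{1 - \varepsilon}
\end{equation*}

which proves that the RHS is absolutely convergent.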

(ii) The next step is actually easier. Multiplying the RHS by $(I + A)$ gives two sets of terms. The first set comes from $I$ and is of course just the RHS again. The next set of terms is like the RHS but each term has $A$ to one higher power. But because the terms alternate in sign, the second set cancels all the terms in the first set except the $I$!
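Both parts of 4(a) can be illustrated with a small numerical experiment. The sketch below, in plain Python with hypothetical helper names, sums the series $I - A + A^2 - \ldots$ for a $3 \times 3$ matrix whose entries satisfy $|A_{i,j}| \le \varepsilon/n$ with $\varepsilon = 0.1$, and checks that $(I + A)$ times the partial sum is the identity to rounding error.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(X, Y):
    n = len(X)
    return [[X[i][j] + Y[i][j] for j in range(n)] for i in range(n)]

def neumann_inverse(A, terms=30):
    """Partial sum of I - A + A^2 - A^3 + ... with the given number of terms."""
    n = len(A)
    total = identity(n)
    power = identity(n)
    for _ in range(1, terms):
        # Update power from (-A)^(p-1) to (-A)^p.
        power = [[-x for x in row] for row in matmul(power, A)]
        total = matadd(total, power)
    return total

# Sample matrix with all |A_ij| <= 0.1 / 3 (illustrative values only).
A = [[0.03, -0.02, 0.01],
     [0.00, 0.03, -0.03],
     [0.02, 0.01, -0.01]]
S = neumann_inverse(A)
product = matmul(matadd(identity(3), A), S)
residual = max(abs(product[i][j] - identity(3)[i][j])
               for i in range(3) for j in range(3))
print(residual)
```

The residual is at the level of floating-point rounding, since the first omitted term is of order $\varepsilon^{30}$.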


4 (b). Use results from (a) to establish Eq. (8.21) from Eq. (8.20).

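First we must identify Eq. (8.20) with the matrix equation:

\begin{align}
(\Lambda^{\alpha'}{}_\beta) &= I + A \nonumber \\
(\delta^{\alpha'}{}_\beta) &= I \nonumber \\
(\varepsilon^{\alpha'}{}_\beta) &= A \tag{8.18}
\end{align}

And of course we know $\Lambda^{\alpha'}{}_\beta$ and $\Lambda^\alpha{}_{\beta'}$ are inverse transforms. Alternatively it’s also clear from basic calculus that (Dirac, 1996):

\begin{equation*}
\frac{\partial x^{\alpha'}}{\partial x^\beta} \frac{\partial x^\beta}{\partial x^{\gamma'}} = \frac{\partial x^{\alpha'}}{\partial x^{\gamma'}} = \delta^{\alpha'}{}_{\gamma'}
\end{equation*}

So then Eq. (8.21) can be written in matrix form:

\begin{equation}
(I + A)^{-1} = I - A + O(A^2) \tag{8.19}
\end{equation}

and of course this is a straightforward application of the result in exercise 4(a).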


Chapter 10

Spherical solutions for stars


10.2 Static spherically symmetric spacetimes

10.2.1 The metric

10.2.2 Physical interpretation of metric terms

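Typo p. 259, just before Eq. (10.11), has $\vec{U} \cdot \vec{U} = 1$ but of course this should be

\begin{equation*}
\vec{U} \cdot \vec{U} = -1
\end{equation*}

c.f. Eq. (2.28) or Eq. (10.18). It really is just a typo. He wants to say

\begin{align}
\vec{U} \cdot \vec{U} &= -1 \nonumber \\
&= U^\alpha U^\beta g_{\alpha\beta} \nonumber \\
&= U^0 U^0 g_{00} && \text{because she’s in the MCRF} \nonumber \\
&= U^0 U^0 (-e^{2\Phi}) && \text{using metric term from Eq. (10.7)} \tag{10.1}
\end{align}

which implies, as Schutz concludes, $U^0 = \exp(-\Phi)$.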

10.7 Realistic stars and gravitational collapse

White dwarfs

Typo: p. 273. “. . . the Fermi momentum rises (Eq. (10.5))”. I believe this should be Eq. (10.75), which shows that the Fermi momentum $p_f$ increases like $n^{1/3}$ where $n$ is the number density.

10.9 Exercises

4. Calculate the diagonal components of the Einstein tensor, c.f. Eqs. (10.14 . . . 10.17), for the spherically symmetric static metric Eq. (10.7). For this we can use the results of Problem 35 § 6.9 wherein we found the Riemann tensor for a metric of this form. However, I found my Riemann tensor disagreed with that of Schutz’s solutions. For that reason I’ll show my intermediate results as well.


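First the Ricci tensor obtained by contracting the Riemann tensor (see Eq. 6.91):

\begin{align}
R_{tt} &= R^\sigma{}_{t\sigma t} \nonumber \\
&= g^{rr} R_{rtrt} + g^{\theta\theta} R_{\theta t\theta t} + g^{\phi\phi} R_{\phi t\phi t} \nonumber \\
&= \exp(-2\Lambda) \exp(2\Phi) \left[(\Phi')^2 + \Phi'' - \Phi' \Lambda'\right] + r^{-2}\, r \Phi' \exp(2\Phi) \exp(-2\Lambda) \nonumber \\
&\quad + r^{-2} \sin^{-2}(\theta)\, \sin^2(\theta)\, r \Phi' \exp(2\Phi) \exp(-2\Lambda) \nonumber \\
&= \exp(2\Phi - 2\Lambda) \left[(\Phi')^2 + \Phi'' - \Phi' \Lambda' + 2 \frac{\Phi'}{r}\right] \tag{10.2}
\end{align}

\begin{align}
R_{rr} &= R^\sigma{}_{r\sigma r} \nonumber \\
&= g^{tt} R_{trtr} + g^{\theta\theta} R_{\theta r\theta r} + g^{\phi\phi} R_{\phi r\phi r} \nonumber \\
&= -\exp(-2\Phi) \exp(2\Phi) \left[(\Phi')^2 + \Phi'' - \Phi' \Lambda'\right] + r^{-2}\, r \Lambda' + r^{-2} \sin^{-2}(\theta)\, r \Lambda' \sin^2(\theta) \nonumber \\
&= -(\Phi')^2 - \Phi'' + \Phi' \Lambda' + 2 \frac{\Lambda'}{r} \tag{10.3}
\end{align}

\begin{align}
R_{\theta\theta} &= R^\sigma{}_{\theta\sigma\theta} \nonumber \\
&= g^{tt} R_{t\theta t\theta} + g^{rr} R_{\theta r\theta r} + g^{\phi\phi} R_{\phi\theta\phi\theta} \nonumber \\
&= -\exp(-2\Phi) [r \Phi' \exp(2\Phi) \exp(-2\Lambda)] + \exp(-2\Lambda) [r \Lambda'] \nonumber \\
&\quad + r^{-2} \sin^{-2}(\theta) [r^2 \sin^2(\theta) (1 - \exp(-2\Lambda))] \nonumber \\
&= -r \Phi' \exp(-2\Lambda) + r \Lambda' \exp(-2\Lambda) + (1 - \exp(-2\Lambda)) \tag{10.4}
\end{align}

\begin{align}
R_{\phi\phi} &= R^\sigma{}_{\phi\sigma\phi} \nonumber \\
&= g^{tt} R_{t\phi t\phi} + g^{rr} R_{\phi r\phi r} + g^{\theta\theta} R_{\phi\theta\phi\theta} \nonumber \\
&= -\exp(-2\Phi) [\sin^2(\theta)\, r \Phi' \exp(2\Phi) \exp(-2\Lambda)] + \exp(-2\Lambda) [r \Lambda' \sin^2(\theta)] \nonumber \\
&\quad + r^{-2} [r^2 \sin^2(\theta) (1 - \exp(-2\Lambda))] \nonumber \\
&= \exp(-2\Lambda) \sin^2(\theta) [-r \Phi' + \exp(2\Lambda) - 1 + r \Lambda'] \tag{10.5}
\end{align}

Second the Ricci scalar obtained by contracting the Ricci tensor (see Eq. 6.92):

\begin{align}
R &= g^{\alpha\beta} R_{\alpha\beta} \nonumber \\
&= 2 \exp(-2\Lambda) \left[-2 \frac{\Phi'}{r} + 2 \frac{\Lambda'}{r} - \frac{1}{r^2} + \frac{\exp(2\Lambda)}{r^2} + \Phi' \Lambda' - \Phi'^2 - \Phi''\right] \tag{10.6}
\end{align}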


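Finally we obtain the Einstein tensor via Eq. (6.98). But notice that for Eqs. (10.14 . . . 10.17) we want the covariant components. So we use

\begin{equation}
G_{\alpha\beta} = R_{\alpha\beta} - \tfrac{1}{2} g_{\alpha\beta} R \tag{10.7}
\end{equation}

We find:

\begin{align}
G_{00} &= R_{tt} - \tfrac{1}{2} g_{00} R \nonumber \\
&= \exp(2\Phi - 2\Lambda) \left[(\Phi')^2 + \Phi'' - \Phi' \Lambda' + 2 \frac{\Phi'}{r}\right] \nonumber \\
&\quad - \tfrac{1}{2} [-\exp(2\Phi)]\, 2 \exp(-2\Lambda) \left[-2 \frac{\Phi'}{r} + 2 \frac{\Lambda'}{r} - \frac{1}{r^2} + \frac{\exp(2\Lambda)}{r^2} + \Phi' \Lambda' - \Phi'^2 - \Phi''\right] \nonumber \\
&= \exp(2\Phi - 2\Lambda) \left[2 \frac{\Lambda'}{r} - \frac{1}{r^2} + \frac{\exp(2\Lambda)}{r^2}\right] \nonumber \\
&= \frac{1}{r^2} \exp(2\Phi) \frac{d}{dr} [r (1 - \exp(-2\Lambda))], && \text{the form of Eq. (10.14).} \tag{10.8}
\end{align}

\begin{align}
G_{11} &= R_{rr} - \tfrac{1}{2} g_{rr} R \nonumber \\
&= -(\Phi')^2 - \Phi'' + \Phi' \Lambda' + 2 \frac{\Lambda'}{r} \nonumber \\
&\quad - \tfrac{1}{2} \exp(2\Lambda)\, 2 \exp(-2\Lambda) \left[-2 \frac{\Phi'}{r} + 2 \frac{\Lambda'}{r} - \frac{1}{r^2} + \frac{\exp(2\Lambda)}{r^2} + \Phi' \Lambda' - \Phi'^2 - \Phi''\right] \nonumber \\
&= -\frac{\exp(2\Lambda)}{r^2} [1 - \exp(-2\Lambda)] + 2 \frac{\Phi'}{r}, && \text{the form of Eq. (10.15).} \tag{10.9}
\end{align}

\begin{align}
G_{22} &= R_{\theta\theta} - \tfrac{1}{2} g_{\theta\theta} R \nonumber \\
&= -r \Phi' \exp(-2\Lambda) + r \Lambda' \exp(-2\Lambda) + (1 - \exp(-2\Lambda)) \nonumber \\
&\quad - \tfrac{1}{2} r^2\, 2 \exp(-2\Lambda) \left[-2 \frac{\Phi'}{r} + 2 \frac{\Lambda'}{r} - \frac{1}{r^2} + \frac{\exp(2\Lambda)}{r^2} + \Phi' \Lambda' - \Phi'^2 - \Phi''\right] \nonumber \\
&= r^2 \exp(-2\Lambda) \left[\Phi'' + \Phi'^2 + \frac{\Phi'}{r} - \Phi' \Lambda' - \frac{\Lambda'}{r}\right], && \text{the form of Eq. (10.16).} \tag{10.10}
\end{align}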


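And finally for the $\phi$ component we need not do any further computations if we note the relationship with the corresponding $\theta$ components of the Ricci tensor and metric tensor:

\begin{align}
G_{33} &= R_{\phi\phi} - \tfrac{1}{2} g_{\phi\phi} R \nonumber \\
&= \sin^2(\theta) R_{\theta\theta} - \tfrac{1}{2} \sin^2(\theta)\, g_{\theta\theta} R \nonumber \\
&= \sin^2(\theta)\, G_{22}. \tag{10.11}
\end{align}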
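Since the algebra above is long, a numerical spot check of one step is reassuring. The sketch below tests the last equality in (10.8), i.e. that $e^{2\Phi - 2\Lambda}[2\Lambda'/r - 1/r^2 + e^{2\Lambda}/r^2]$ equals $r^{-2} e^{2\Phi}\, d[r(1 - e^{-2\Lambda})]/dr$. The profiles $\Phi(r)$ and $\Lambda(r)$ are arbitrary smooth functions chosen only for the test, not a physical stellar model, and the derivative on the compact side is taken by central finite differences.

```python
import math

def Phi(r):
    """Arbitrary smooth test profile (hypothetical, not a stellar model)."""
    return 0.2 * math.cos(r)

def Lam(r):
    """Arbitrary smooth test profile (hypothetical, not a stellar model)."""
    return 0.1 * math.sin(r)

def dLam(r):
    """Analytic derivative of Lam."""
    return 0.1 * math.cos(r)

def g00_expanded(r):
    """Third line of Eq. (10.8)."""
    bracket = 2.0 * dLam(r) / r - 1.0 / r**2 + math.exp(2.0 * Lam(r)) / r**2
    return math.exp(2.0 * Phi(r) - 2.0 * Lam(r)) * bracket

def g00_compact(r, h=1e-5):
    """Last line of Eq. (10.8), with d/dr taken by central differences."""
    f = lambda x: x * (1.0 - math.exp(-2.0 * Lam(x)))
    dfdr = (f(r + h) - f(r - h)) / (2.0 * h)
    return math.exp(2.0 * Phi(r)) * dfdr / r**2

for r in (1.0, 2.0, 5.0):
    print(r, g00_expanded(r), g00_compact(r))
```

The two columns agree to within the finite-difference error at each sample radius.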

10.10 Rob’s supplementary problems

SP. 1


Chapter 11

Schwarzschild geometry and black holes

11.1 Trajectories in the Schwarzschild spacetime

Conserved quantities

Eqs. (11.5) and (11.6) look strange at first, since they appear to contradict the definitions of p. 42. Why is $E = -p_0$ here, yet $E = p^0$ on p. 42? Why is $L = p_\phi$? I found it instructive to note that $p^\alpha = m U^\alpha = m\, dx^\alpha/d\tau$, and $x^\alpha$ can be a coordinate like $\phi$ in spherical coordinates. And $d\phi/d\tau$ is an angular velocity, so it doesn’t have units of velocity. But $p_\phi = r^2 \sin^2\theta\; m\, d\phi/d\tau$, which includes the velocity $r \sin\theta\, d\phi/d\tau$ and the moment arm length $r \sin\theta$. We can also just accept $p_0 = -E$ and $p_\phi = L$ as convenient definitions of quantities that are conserved in the Schwarzschild metric, and certainly $E$ is “energy like” and $L$ is “angular momentum like”, which I guess is how Schutz meant us to take them.

Types of orbits

Typo: p. 286, first sentence of page, too many “(”.

Regarding Eq. (11.19), Schutz refers to the “minimum” radius of a particle’s circular orbit. But it is important to note that this is the minimum only for the stable, larger orbit obtained from the positive root in Eq. (11.17). For the unstable, smaller orbit obtained from the negative root in Eq. (11.17), $r = 6M$ is the maximum radius. In the latter case, the minimum is obtained by taking the limit $L \to \infty$, which gives

\begin{equation*}
r \to 3M
\end{equation*}

See also the solution to Exercise 4 in § 11.7.

Perihelion shift

Typo: p. 288, 3rd line, “These opposite of ‘peri’ is ‘ap’ ” should be “The opposite of ‘peri’ is ‘ap’ ”.
Typo: p. 290, line above Eq. (11.42), “Each orbit take . . . ” should be “Each orbit takes . . . ”.
Typo: p. 291, last line of 1st paragraph “predict” should be “predicts”.


[I sent these notes to Schutz in Dec. 2011 and he didn’t reply, so I’m nolonger bothering with cosmetic typos.]

11.3 General black holes

Ergoregion

Typo in Eq. (11.78), the τ should be a t obviously, since it came from theprevious line.

11.7 Exercises

1.

(a). A particle or photon is in an orbit [careful, it’s not necessarily a closed orbit] in the Schwarzschild metric with a certain $E$ and $L$, at a radius $r \gg M$. Show that if space-time were really flat, the particle [or photon] would travel on a straight line that would pass a distance $b \equiv L/\sqrt{E^2 - m^2}$ from the centre of coordinates $r = 0$. This ratio $b$ is called the impact parameter.

(b). Show also that photon orbits that follow Eq. (11.12) depend only on $b$.

(a) If we rotate the coordinate system such that $\theta = \pi/2$ then $p^\theta = 0$ always. For a particle or a photon $p^\alpha p_\alpha = -m^2$ where $m$ is the rest mass, which is nil for a photon, c.f. Eqs. (2.33) and (2.40). At the point where $r$ is a minimum

\begin{equation*}
U^r = \frac{dr}{d\tau} = 0
\end{equation*}

because it’s an extremum of the particle path. The Schwarzschild metric is diagonal so that this implies $p_r = 0$ at $r = b$. So we have only two


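non-zero components of the momentum:

\begin{align}
-m^2 &= p^\alpha p_\alpha && \text{in general} \nonumber \\
&= p^t p_t + p^\phi p_\phi \nonumber \\
&= g^{0\alpha} p_\alpha p_t + g^{\phi\alpha} p_\alpha p_\phi \nonumber \\
&= g^{00} p_t p_t + g^{\phi\phi} p_\phi p_\phi && \text{diagonal metric} \nonumber \\
&= -\left(1 - \frac{2M}{r}\right)^{-1} E^2 + r^{-2} L^2, && \text{used Eqs. (11.5), (11.6)} \nonumber \\
&= -E^2 + r^{-2} L^2 && \text{flat space approximation} \nonumber \\
&= -E^2 + b^{-2} L^2 && r = b \tag{11.1}
\end{align}

Solve for $b$ and choose the positive root since $b$ is the spatial distance:

\begin{equation*}
b = \frac{L}{\sqrt{E^2 - m^2}}.
\end{equation*}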

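(b) For a photon $m = 0$, Eq. (2.40),

\begin{equation*}
b = \frac{L}{E}.
\end{equation*}

Note that we can rewrite Eq. (11.12) by absorbing the $E$ into the parameter $\lambda$, defining for instance

\begin{equation*}
\gamma = E \lambda
\end{equation*}

which is consistent with Eq. (6.52), so we retain an affine parameter $\gamma$ so the paths are still geodesics,

\begin{align}
\left(\frac{dr}{d\lambda}\right)^2 &= E^2 - \left(1 - \frac{2M}{r}\right) \frac{L^2}{r^2} \nonumber \\
&= E^2 - \left(1 - \frac{2M}{r}\right) \frac{b^2 E^2}{r^2} && \text{substituted } L = bE \nonumber \\
&= E^2 \left[1 - \left(1 - \frac{2M}{r}\right) \frac{b^2}{r^2}\right] \nonumber \\
\left(\frac{dr}{d\gamma}\right)^2 &= 1 - \left(1 - \frac{2M}{r}\right) \frac{b^2}{r^2} \tag{11.2}
\end{align}

so besides $M$, the only parameter is $b$.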

2. Prove Eqs. (11.17) and (11.18).

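For the particle orbit, start with

\begin{equation}
0 = \frac{d}{dr} \left[\left(1 - \frac{2M}{r}\right) \left(1 + \frac{L^2}{r^2}\right)\right] \tag{11.3}
\end{equation}

If one does the obvious thing, and simply integrates this, one runs into trouble. One can show that the integration constant is $E^2$. But then one has a cubic to solve. Even knowing the solution, it’s not obvious that it is a solution to the cubic.

However, if one first differentiates then everything works out easily, with only a quadratic to solve,

\begin{align}
0 &= \frac{d}{dr} \left[\left(1 - \frac{2M}{r}\right) \left(1 + \frac{L^2}{r^2}\right)\right] \nonumber \\
&= \frac{2M}{r^2} \left(1 + \frac{L^2}{r^2}\right) + \left(1 - \frac{2M}{r}\right) \left(-2 \frac{L^2}{r^3}\right) \nonumber \\
&= r^2 - r \frac{L^2}{M} + 3 L^2 && \text{after multiplying by } \frac{r^4}{2M} \tag{11.4}
\end{align}

The quadratic formula gives the solution:

\begin{equation*}
r = \frac{L^2}{2M} \left(1 \pm \sqrt{1 - \frac{12 M^2}{L^2}}\right)
\end{equation*}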
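The two roots can be checked directly against the derivative condition in (11.4). The sketch below evaluates both circular-orbit radii for a given $L$ and $M$ and confirms that the derivative of $(1 - 2M/r)(1 + L^2/r^2)$ vanishes there; the function names are mine. It also illustrates the limits discussed in my notes on § 11.1: the roots merge at $r = 6M$ when $L^2 = 12M^2$, and the inner root tends to $3M$ as $L \to \infty$.

```python
import math

def circular_orbit_radii(L, M=1.0):
    """Inner (unstable) and outer (stable) circular-orbit radii from Eq. (11.17).

    Returns None when L^2 < 12 M^2, in which case no circular orbit exists.
    """
    disc = 1.0 - 12.0 * M * M / (L * L)
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    scale = L * L / (2.0 * M)
    return scale * (1.0 - root), scale * (1.0 + root)

def d_potential_dr(r, L, M=1.0):
    """Derivative of (1 - 2M/r)(1 + L^2/r^2), second line of Eq. (11.4)."""
    return (2.0 * M / r**2) * (1.0 + L * L / r**2) \
        + (1.0 - 2.0 * M / r) * (-2.0 * L * L / r**3)

inner, outer = circular_orbit_radii(L=4.0)
print(inner, outer, d_potential_dr(inner, 4.0), d_potential_dr(outer, 4.0))
print(circular_orbit_radii(L=math.sqrt(12.0)))   # both roots at r = 6M
print(circular_orbit_radii(L=1000.0)[0])         # inner root approaches 3M
```

For $L = 4M$ the radii are $4M$ and $12M$, on either side of $6M$ as expected.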

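For the photon orbit, start with

\begin{equation}
0 = \frac{d}{dr} \left[\left(1 - \frac{2M}{r}\right) \left(\frac{L^2}{r^2}\right)\right] \tag{11.5}
\end{equation}

Again, it’s much easier, perhaps essential, to start by differentiating,

\begin{align}
0 &= \frac{d}{dr} \left[\left(1 - \frac{2M}{r}\right) \left(\frac{L^2}{r^2}\right)\right] \nonumber \\
&= \frac{2M}{r^2} \left(\frac{L^2}{r^2}\right) + \left(1 - \frac{2M}{r}\right) \left(-2 \frac{L^2}{r^3}\right) \nonumber \\
r &= 3M && \text{after dividing by } L^2 \tag{11.6}
\end{align}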

4. What kind of orbits are possible outside a star of radius (a) R = 2.5M ,(b) R = 4M , (c) R = 10M ,

See the last paragraph of § 11.1, subsection Types of orbits, whereSchutz points out that if the radius of the star exceeds the radius of theorbit, the orbit is not possible, simply because the star is in the way. In mynotes above I qualify in what sense rMIN = 6M is a minimum in Eq. (11.19).The plot below is useful for interpreting this question.

(a) Note that R < 3M, which is the photon’s circular orbit radius, so a circular, unstable, photon orbit is possible. This is point A in Fig. 11.2.

Note that R < 6M = $r_{MIN}$, so a circular, unstable, particle orbit is possible, like point A in Fig. 11.1. Because R < 3M, which is the minimum unstable orbit radius (see my notes on section 11.1), the unstable orbit is allowed for all L.

The larger, stable, circular orbit is also allowed for all L, point B in Fig. 11.1.

(b) Note that R > 3M, which is the photon’s circular orbit radius, so a circular, unstable, photon orbit is not possible.

Note that R = 4M < 6M = $r_{MIN}$, so a circular, unstable, particle orbit is possible, like point A in Fig. 11.1. Because R > 3M, which is the minimum unstable orbit radius (see my notes on section 11.1), the unstable orbit is only possible for a finite range of L, see my Fig. 11.1 above.

The larger, stable, circular orbit is allowed for all L, point B in Fig. 11.1.

(c) Note that R = 10M > 3M, which is the photon’s circular orbit radius, so a circular, unstable, photon orbit is not possible.


Figure 11.1: Radius of circular orbit r vs. L for M = 1 from Eq. (11.17). Green line from taking the positive root, red line from taking the negative root.

Note that R > 6M, which is the particle’s maximum unstable circular orbit radius, so a circular, unstable, particle orbit is not possible for any L.

The larger, stable, circular orbit is allowed for a range of L. Let L become large in Eq. (11.17) and it’s clear that the larger orbit can be at arbitrarily large r, in particular such that r > R = 10M, so this orbit must exist. See also the green line in my Fig. 11.1 above.
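The case analysis for (a), (b) and (c) can be reproduced with a short Python sketch that simply compares the two roots of Eq. (11.17) with the stellar radius R. The helper names and the sample angular momenta (L = 3.6 and L = 6 with M = 1) are hypothetical choices made for illustration.

```python
import math

def circular_orbit_radii(M, L):
    """(unstable, stable) radii from Eq. (11.17), or None if L^2 < 12 M^2."""
    disc = 1.0 - 12.0 * M**2 / L**2
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    pref = L**2 / (2.0 * M)
    return pref * (1.0 - root), pref * (1.0 + root)

def allowed_particle_orbits(R, M, L_values):
    """For each L, list the circular particle orbits lying outside a star of radius R."""
    result = {}
    for L in L_values:
        radii = circular_orbit_radii(M, L)
        labels = []
        if radii is not None:
            r_unstable, r_stable = radii
            if r_unstable > R:
                labels.append("unstable")
            if r_stable > R:
                labels.append("stable")
        result[L] = labels
    return result

def photon_orbit_allowed(R, M):
    """The circular photon orbit sits at r = 3M, so it needs R < 3M."""
    return R < 3.0 * M

M = 1.0
for R in (2.5, 4.0, 10.0):
    print(R, photon_orbit_allowed(R, M), allowed_particle_orbits(R, M, [3.6, 6.0]))
```

With these sample values, R = 2.5M admits both orbits for both L, R = 4M admits the unstable orbit only for the smaller L, and R = 10M admits only the stable orbit and only for the larger L, in line with the discussion above.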

5 (a). Find the radius $R_{0.01}$ at which $-g_{00}$ differs from the ‘Newtonian’ value 1 − 2M/R by only 1%.

This question doesn’t make sense because

$$
-g_{00} = 1 - \frac{2M}{R}
$$

in the Schwarzschild metric without approximation. Please see my supplementary problem below.


5 (b). How many normal [Sun-like] stars can fit in the region between $R_{0.01}$ and the radius 2M?

This question does make sense, once one answers 5(a). It is a simple geometry question, no tricks.

11.8 Rob’s Supplementary Problems

SP. 1 Recall from Exercise 5d of § 7.6 that $g_{00}$ is “closely related to” −exp(2φ), where φ is the Newtonian potential for a similar [non-relativistic] situation. This was derived for a hydrostatic fluid in Exercise 5d, but use this fact here to find at what distance R from a black hole of mass $10^6 M_\odot$ the Schwarzschild $g_{00}$ differs from the Newtonian potential by 1%.

Solution: The Schwarzschild metric has, precisely,

$$
-g_{00} = 1 - \frac{2M}{R}
$$

with G = 1 of course. When the weak gravity limit applies, the Newtonian potential φ = −M/R for a similar [non-relativistic] situation obeys |φ| ≪ 1 and we have, from Exercise 5d of § 7.6,

$$
-g_{00} \approx \exp(2\phi)
\tag{11.7}
$$

Use a Taylor series about φ = 0 to approximate the exponential function,

$$
\begin{aligned}
-g_{00} \approx \exp(2\phi) &\approx 1 + 2\phi + 2\phi^2 + \frac{4}{3}\phi^3 + \ldots \\
1 + 2\phi &\approx 1 + 2\phi + 2\phi^2 + \frac{4}{3}\phi^3 + \ldots
\end{aligned}
\tag{11.8}
$$

For |φ| ≪ 1 the LHS differs from the RHS by approximately $2\phi^2$. Setting this difference to 1% of $-g_{00} \approx 1$ we solve for the radius when $M = 10^6 M_\odot$,

$$
2\phi^2 = 2\left(-\frac{M}{R}\right)^2 = \frac{1}{100}
$$

so

$$
R = 10\sqrt{2}\,M = 14.1 \times 10^6 \times 1.5\ \text{km} \approx 2.1 \times 10^7\ \text{km}, \qquad \text{see Table 8.1}
$$
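To see how good the leading-order estimate is, the sketch below compares it with the radius at which the full difference exp(2φ) − (1 + 2φ) equals 0.01, found by bisection. This is only an illustrative check. The function names are mine, and the 1.5 km per solar mass is the rounded value used above.

```python
import math

M_SUN_KM = 1.5  # rounded solar mass in geometrized units (km), as used in the text

def approx_ratio(tol=0.01):
    """R/M from the leading-order condition 2 phi^2 = tol with phi = -M/R."""
    return math.sqrt(2.0 / tol)

def exact_ratio(tol=0.01):
    """R/M where exp(2 phi) - (1 + 2 phi) = tol, by bisection on x = M/R."""
    def f(x):
        return math.exp(-2.0 * x) - (1.0 - 2.0 * x) - tol
    lo, hi = 1e-6, 0.4  # f(lo) < 0 < f(hi), and f increases with x
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 1.0 / (0.5 * (lo + hi))

r_approx_km = approx_ratio() * 1e6 * M_SUN_KM
r_exact_km = exact_ratio() * 1e6 * M_SUN_KM
print(approx_ratio(), exact_ratio(), r_approx_km, r_exact_km)
```

The bisection result is a few percent smaller than 10√2 M, because the cubic term in the Taylor series is not quite negligible at this radius.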


Chapter 12

Cosmology


12.2 Cosmological kinematics: observing the expanding universe

Three types of universe

Typo: in Eq. (12.15), the square on dΩ is missing.

Eq. (12.21) uses z for redshift. This was defined in Eq. (10.12) and also in Eq. (12.37).

12.6 Exercises

1. Use the metric of the 2-sphere to prove the statement associated with Fig. 12.1 that the rate of increase of the distance between any two points as the sphere expands (as measured on the sphere) is proportional to the distance between them.

The metric of the 2-sphere is apparent from the line element

$$
dl^2 = r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right)
$$

Here we assume that r = r(t) and pick two points on the sphere, p and q. We can always rotate the coordinates such that the two points are on the equator, θ = π/2, so the distance is

$$
\int dl = l = \int r(t)\, d\phi = r(t)(\phi_p - \phi_q)
$$

Differentiating with respect to time one obtains the rate of increase of their separation

$$
\dot{l} = \dot{r}(\phi_p - \phi_q) = l\,\frac{\dot{r}}{r}
\tag{12.1}
$$

So for a given r and rate of change of r, the rate of change of the distance l is proportional to l.

2. Find the parsec in meters given the radius of the Earth’s orbit as about $10^{11}$ m and the definition of the parsec, which is the parallax of 1 second of arc.


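This question is of course quite trivial. But it’s nice to go through it once, since one then feels more comfortable with the parsec and gains appreciation of the astronomers who measure positions of stars to within an arc second 6 months apart.

The distance from the Sun to the star l is

$$
\begin{aligned}
l &= \frac{r}{\sin(1\,[\text{arc second}])} \\
&\approx \frac{r}{\frac{\pi}{180}\,[\text{rad per degree}]\ \frac{1}{60}\,[\text{degree per arc minute}]\ \frac{1}{60}\,[\text{arc minute per arc second}]} \\
&= \frac{10^{11} \times 180 \times 60 \times 60}{\pi}\ \text{m} \approx 2 \times 10^{16}\ \text{m}
\end{aligned}
\tag{12.2}
$$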
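The arithmetic in Eq. (12.2) is easy to reproduce. The sketch below evaluates it for the rough orbital radius given in the exercise and, as an aside that is not part of the exercise, for the more precise value of 1.496 × 10^11 m for the astronomical unit. The function name is my own.

```python
import math

# One arc second expressed in radians.
ARCSEC_IN_RAD = math.pi / (180.0 * 60.0 * 60.0)

def parsec_in_metres(orbit_radius_m):
    """Distance at which the given orbital radius subtends 1 arc second."""
    return orbit_radius_m / math.sin(ARCSEC_IN_RAD)

rough = parsec_in_metres(1.0e11)     # radius as given in the exercise
better = parsec_in_metres(1.496e11)  # assumption: more precise astronomical unit
print(f"{rough:.3e} m, {better:.3e} m")
```

The small-angle approximation used in Eq. (12.2) is excellent here, since sin(x) differs from x by a relative amount of order x², about 10^-11 for one arc second.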

3. Newtonian cosmology.

3(a). Apply Newton’s law of gravity to the study of cosmology by showing that the general solution of ∇²φ = 4πρ for constant ρ is a quadratic polynomial in Cartesian coordinates that is not necessarily isotropic.

Newton’s law is a Poisson equation with constant coefficient. By inspection it is immediately clear that

$$
\Phi = \Phi_0\left(a^2x^2 + b^2y^2 + c^2z^2\right)
$$

is a solution with $\Phi_0$ some constant and

$$
2\left(a^2 + b^2 + c^2\right) = \frac{4\pi\rho}{\Phi_0}
$$

But to this one can add an arbitrary solution to Laplace’s equation, so our solution is by no means a completely general solution! It is clear that it is not isotropic because we do not require a = b = c.
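A quick numerical check of this solution, as a sketch only: the code below picks hypothetical values of Φ0, a, b, c, fixes ρ from the constraint 2Φ0(a² + b² + c²) = 4πρ that follows from substituting the quadratic form into ∇²φ = 4πρ, and evaluates the Laplacian by second-order central differences (which are exact for a quadratic, up to rounding).

```python
import math

def potential(x, y, z, phi0, a, b, c):
    """Anisotropic quadratic potential Phi0 (a^2 x^2 + b^2 y^2 + c^2 z^2)."""
    return phi0 * (a**2 * x**2 + b**2 * y**2 + c**2 * z**2)

def second_derivative(f, point, axis, h=1e-3):
    """Central second difference of f along one Cartesian axis."""
    plus, minus = list(point), list(point)
    plus[axis] += h
    minus[axis] -= h
    return (f(*plus) - 2.0 * f(*point) + f(*minus)) / h**2

# Hypothetical sample values, chosen only for illustration.
phi0, a, b, c = 1.0, 1.0, 1.5, 2.0
rho = 2.0 * phi0 * (a**2 + b**2 + c**2) / (4.0 * math.pi)

def phi(x, y, z):
    return potential(x, y, z, phi0, a, b, c)

point = (0.3, -0.2, 0.5)
d2 = [second_derivative(phi, point, axis) for axis in range(3)]
laplacian = sum(d2)
print(d2, laplacian, 4.0 * math.pi * rho)
```

The three second derivatives differ from one another whenever a, b and c differ, which is the anisotropy referred to in the text, while their sum still equals 4πρ.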

3 (b) Show that if the universe consists of a region where ρ is constant, outside of which there is a vacuum, then, if the boundary is not spherical, the field will not be isotropic. The field will show significant deviations from sphericity throughout the interior, even at the centre.


Suppose we stretch our Cartesian coordinates such that:

$$
x' = ax; \qquad y' = by; \qquad z' = cz
$$

Then our “general solution” is

$$
\Phi = \Phi_0\left(x'^2 + y'^2 + z'^2\right) = \Phi_0 r'^2
$$

and is spherically symmetric in this stretched coordinate system. Outside the region of nonzero ρ, so in the vacuum, we know the solution must be (see Eq. (8.2) and Exercise 1 of § 8.6)

$$
\Phi = -\frac{4\pi\rho}{3}\frac{1}{r'}
$$

And this solution must match the solution $\Phi_0 r'^2$ at the boundary. But this proves that the boundary must be a surface of constant r′ = R, for otherwise it would be impossible to match the interior and exterior solutions. And this gives us the result we required. For the boundary is affecting the shape of the solution everywhere, including the whole interior and even its central point r′ = 0. And if the boundary r′ = R is not a true sphere (i.e. a sphere in the original unstretched coordinates), so that a ≠ b ≠ c, then the anisotropy is apparent, after transforming back to the original coordinates, for instance in the second derivatives:

$$
\Phi_{,xx} = 2\Phi_0 a^2 \;\neq\; \Phi_{,yy} = 2\Phi_0 b^2 \;\neq\; \Phi_{,zz} = 2\Phi_0 c^2
\tag{12.3}
$$

3 (c) Show that an experiment done locally could determine the shape of the boundary.

One could measure the gravitational acceleration on a test particle in 3 orthogonal directions as a function of position. Under the assumption of ρ = constant [or possibly correcting for local inhomogeneities, though I’m not sure how feasible that would be in practice] one would obtain

$$
\Phi_{,xx} = 2\Phi_0 a^2
$$

from the x-direction gradient of the x-direction gravitational acceleration, and similarly for the other two directions. This gives the relative sizes of a, b, c, which would determine the shape of an ellipsoidal boundary.


For a boundary shape more general than ellipsoidal, it is not immediately clear how to proceed. The problem lies in our solution in terms of quadratic terms in x, y, z not being truly a general solution. I guess the mathematics will become complicated rather quickly and it’s not clear it’s worth the effort. For we have already shown that the boundary effects of an ellipsoidal shaped Newtonian universe would in principle be observable throughout the interior, and clearly we don’t observe such effects.

4. Show that if $h_{ij}(t_1) \neq f(t_1, t_0)\,h_{ij}(t_0)$ for all i and j in Eq. (12.3), then distances between galaxies would increase anisotropically: the Hubble law would have to be written as

$$
v^i = H^i{}_j\, x^j
$$

for a matrix $H^i{}_j$ not proportional to the identity.

We replace

$$
h_{ij}(t_1) = f(t_1, t_0)\,h_{ij}(t_0)
$$

by a more general expression

$$
h_{ij}(t_1) = \left(a^i{}_j\right)^2(t_1, t_0)\,h_{ij}(t_0)
$$

with no summation over the double i or j indices on the RHS. Consider a galaxy at some distance $l_0$ at $t = t_0$. We can orient our Cartesian coordinates such that we are at the centre and the galaxy is in the x-direction.

$$
\begin{aligned}
dl^2 &= h_{ij}(t_0)\,dx^i\,dx^j \\
&= h_{xx}\,(dx)^2 \\
dl &= \sqrt{h_{xx}(t_0)}\,dx \\
\int dl &= \int \sqrt{h_{xx}(t_0)}\,dx = l_0
\end{aligned}
\tag{12.4}
$$


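Then at a later time $t_1$

$$
\begin{aligned}
dl^2 &= h_{ij}(t_1)\,dx^i\,dx^j \\
&= \left(a^i{}_j\right)^2 h_{ij}(t_0)\,dx^i\,dx^j \\
&= \left(a^x{}_x\right)^2 h_{xx}\,(dx)^2 \\
dl &= a^x{}_x \sqrt{h_{xx}(t_0)}\,dx \\
\int dl &= a^x{}_x \int \sqrt{h_{xx}(t_0)}\,dx = a^x{}_x\, l_0
\end{aligned}
\tag{12.5}
$$

So the rate of change of position of this galaxy observed at Earth is

$$
\dot{l} = v = \dot{a}^x{}_x\, l_0
\tag{12.6}
$$

But if our $\left(a^i{}_j\right)^2$ is not just a constant times $\delta^i{}_j$ then we measure a different Hubble parameter in each of the 3 orthogonal directions:

$$
\begin{aligned}
v^x &= \dot{a}^x{}_x\, l^x_0 \\
v^y &= \dot{a}^y{}_y\, l^y_0 \\
v^z &= \dot{a}^z{}_z\, l^z_0
\end{aligned}
\tag{12.7}
$$

And performing a rotation to orient our Cartesian coordinates to an arbitrary direction we obtain the generalized Hubble law

$$
v^j = H^j{}_i\, x^i
$$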
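One observable consequence of a Hubble matrix that is not proportional to the identity is that recession velocities are no longer purely radial, except along the principal axes. The sketch below illustrates this with made-up diagonal matrices. The numerical entries and all function names are hypothetical, chosen only to show the effect.

```python
def hubble_velocity(H, x):
    """Generalized Hubble law v^j = H^j_i x^i for a 3x3 matrix H (list of rows)."""
    return [sum(H[j][i] * x[i] for i in range(3)) for j in range(3)]

def cross(u, v):
    """Cartesian cross product of two 3-vectors."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def is_radial(v, x, tol=1e-12):
    """True if v is parallel to x, i.e. the galaxy recedes along the line of sight."""
    return all(abs(component) < tol for component in cross(v, x))

# Hypothetical Hubble matrices in arbitrary units.
H_iso = [[0.07, 0.0, 0.0], [0.0, 0.07, 0.0], [0.0, 0.0, 0.07]]
H_aniso = [[0.07, 0.0, 0.0], [0.0, 0.08, 0.0], [0.0, 0.0, 0.09]]

galaxy_on_axis = [1.0, 0.0, 0.0]
galaxy_off_axis = [1.0, 1.0, 0.0]

for H in (H_iso, H_aniso):
    for x in (galaxy_on_axis, galaxy_off_axis):
        v = hubble_velocity(H, x)
        print(v, is_radial(v, x))
```

In the anisotropic case the galaxy off the principal axes acquires a velocity component transverse to the line of sight, which the isotropic law never produces.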

6. (a) Prove the statement leading to Eq. (12.8), that we can deduce $G_{ij}$ of our three-spaces by setting Φ = 0 in Eqs. (10.15)-(10.17).

I’m going to be a bit deconstructionist for this question since I don’t like how it’s posed. Recall that the Einstein tensor is defined in Eq. (6.98), which, after lowering the indices, is

$$
G_{\alpha\beta} = R_{\alpha\beta} - \frac{1}{2}\,g_{\alpha\beta}R
$$

So we want the spatial part of that, $G_{ij}$. But this involves the temporal dimension through the Riemann tensor. Recall the Ricci tensor was defined by Eq. (6.91), which after fixing the typo, is

$$
R_{\alpha\beta} = R^{\mu}{}_{\alpha\mu\beta}
$$


So even when we restrict ourselves to $G_{ij}$ we have contributions from the temporal dimension through μ = 0 in the contraction of the Riemann tensor.

So I think it’s better to say we want to ensure that the metric in Eq. (12.6) is homogeneous at a given point in time, $t = t_0$. Then $R^2(t_0)$ is a constant, and without loss of generality it can be rescaled to $R^2 = 1$:

$$
ds^2 = -dt^2 + R^2(t)\,h_{ij}\,dx^i\,dx^j = -dt^2 + \left(e^{2\Lambda(r)}\,dr^2 + r^2 d\Omega^2\right)
$$

If we set Φ = 0 in Eq. (10.7) then we obtain the same metric. So of course it will have the same Riemann tensor and all that follows (Ricci tensor and scalar and Einstein tensor).

6. (b) Derive Eq. (12.9).

To obtain Eq. (12.9) we substitute Eq. (12.8) for the diagonal components of the Einstein tensor into the equation for the trace,

$$
G = G_{ij}\,g^{ij}
$$

and the rest is just algebra. There’s nothing to derive.

7. Show the metric in Eq. (12.7) is only flat at r = 0 if A = 0 in Eq. (12.11).

For the metric to be flat we require the Riemann tensor to be zero, see Eq. (6.71),

$$
R^{\alpha}{}_{\beta\mu\nu} = 0
$$

Careful, it’s not correct to show that

$$
R_{\alpha\beta\mu\nu} = 0
$$

I’ve learned from experience one can waste hours working with this incorrect equation! Our metric was given by Eq. (12.7):

$$
dl^2 = \exp(2\Lambda(r))\,dr^2 + r^2 d\Omega^2
$$

which has the spatial part the same as metric Eq. (10.7). For the Riemann tensor, we can pull off the spatial part without fear of contamination from the temporal part of the metric since our metric is diagonal and there is no contraction involved in the definition of the Riemann tensor

Rαβµν vα = vα;

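Using the results from Exer. 35 of § 6.9, we see there are only three non-zero terms to consider (and the ones obtainable from these 3 by symmetry relations), see result (6.101),

$$
\begin{aligned}
R_{r\theta r\theta} &= r\Lambda' \\
R_{r\phi r\phi} &= r\Lambda' \sin^2(\theta) \\
R_{\theta\phi\theta\phi} &= r^2 \sin^2(\theta)\left(1 - \exp(-2\Lambda)\right)
\end{aligned}
\tag{12.8}
$$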

We’ll be using Eq. (12.11)

$$
g_{rr} = \exp(2\Lambda) = \frac{1}{1 + \frac{1}{3}\kappa r^2 - \frac{A}{r}}
$$

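Consider case A ≠ 0. Then

$$
\begin{aligned}
R^{r}{}_{\theta r\theta} &= g^{r\alpha}R_{\alpha\theta r\theta} \\
&= \frac{1}{g_{rr}}R_{r\theta r\theta}, && \text{for diagonal metric} \\
&= \frac{1}{g_{rr}}\,r\Lambda', && \text{substitute from result (12.8)} \\
&= \frac{r}{2(g_{rr})^2}\frac{dg_{rr}}{dr}, && \text{elementary operations} \\
&= -\frac{1}{2}\left(\frac{2}{3}\kappa r^2 + \frac{A}{r}\right)
\end{aligned}
\tag{12.9}
$$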

which is nonzero (in fact singular) as r → 0.

On the other hand, if A = 0 then $R^{r}{}_{\theta r\theta} = 0$ at r = 0. But we still need to check the other two non-zero components of the Riemann tensor when A = 0 to confirm that space is flat there. Note that

$$
R^{r}{}_{\phi r\phi} = R^{r}{}_{\theta r\theta}\,\sin^2(\theta) = 0 \quad \text{at } r = 0.
$$


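$$
\begin{aligned}
R^{\theta}{}_{r\theta r} &= g^{\theta\alpha}R_{\alpha r\theta r} \\
&= \frac{1}{g_{\theta\theta}}R_{\theta r\theta r}, && \text{for diagonal metric} \\
&= \frac{1}{g_{\theta\theta}}R_{r\theta r\theta}, && \text{symmetry} \\
&= \frac{1}{r^2}\,r\Lambda', && \text{substitute from result (12.8)} \\
&= \frac{-\kappa}{3 + \kappa r^2}, && \text{elementary operations}
\end{aligned}
\tag{12.10}
$$

$$
\begin{aligned}
R^{\theta}{}_{\phi\theta\phi} &= g^{\theta\alpha}R_{\alpha\phi\theta\phi} \\
&= \frac{1}{g_{\theta\theta}}R_{\theta\phi\theta\phi}, && \text{for diagonal metric} \\
&= \frac{1}{r^2}\,r^2\sin^2(\theta)\left(1 - \exp(-2\Lambda)\right), && \text{substitute from result (12.8)} \\
&= 0 \quad \text{at } r = 0.
\end{aligned}
\tag{12.11}
$$

$$
\begin{aligned}
R^{\phi}{}_{\theta\phi\theta} &= g^{\phi\alpha}R_{\alpha\theta\phi\theta} \\
&= \frac{1}{g_{\phi\phi}}R_{\phi\theta\phi\theta}, && \text{for diagonal metric} \\
&= \frac{1}{g_{\phi\phi}}R_{\theta\phi\theta\phi}, && \text{symmetry} \\
&= \frac{1}{r^2\sin^2\theta}R_{\theta\phi\theta\phi} = \frac{1}{\sin^2\theta}R^{\theta}{}_{\phi\theta\phi} \\
&= 0 \quad \text{at } r = 0.
\end{aligned}
\tag{12.12}
$$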
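The closed form in Eq. (12.9) and its behaviour as r → 0 can be checked numerically. The sketch below computes rΛ′/g_rr with Λ′ obtained by a central finite difference from Eq. (12.11), and compares it with −½(⅔κr² + A/r). The sample values κ = 1 and A = −0.5 are hypothetical, and A is taken negative only so that g_rr stays positive near r = 0 and the logarithm is defined.

```python
import math

def g_rr(r, kappa, A):
    """Eq. (12.11): g_rr = exp(2 Lambda) = 1 / (1 + kappa r^2 / 3 - A / r)."""
    return 1.0 / (1.0 + kappa * r**2 / 3.0 - A / r)

def lam(r, kappa, A):
    """Lambda(r) = (1/2) ln g_rr (requires g_rr > 0)."""
    return 0.5 * math.log(g_rr(r, kappa, A))

def riemann_numeric(r, kappa, A):
    """r Lambda' / g_rr, with Lambda' from a central finite difference."""
    h = 1e-6 * r
    dlam = (lam(r + h, kappa, A) - lam(r - h, kappa, A)) / (2.0 * h)
    return r * dlam / g_rr(r, kappa, A)

def riemann_closed_form(r, kappa, A):
    """Right-hand side of Eq. (12.9)."""
    return -0.5 * (2.0 * kappa * r**2 / 3.0 + A / r)

kappa = 1.0  # hypothetical sample value
for A in (0.0, -0.5):
    for r in (0.5, 0.05, 0.005):
        print(A, r, riemann_numeric(r, kappa, A), riemann_closed_form(r, kappa, A))
```

For A = 0 the component shrinks like r² as r → 0, while for A ≠ 0 it grows like 1/r, which is the singular behaviour noted after Eq. (12.9).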

8. Find the coordinate transform leading to Eq. (12.19).

It’s immediately clear by inspection of the term in $d\Omega^2$ that

$$
r = \sinh\chi
$$


is the transform. To confirm this works we substitute it in Eq. (12.13) when k = −1. We’ll use the identity

$$
\cosh^2\chi - \sinh^2\chi = 1
$$

and

$$
d\sinh\chi = \cosh\chi\, d\chi
$$

These identities all follow easily from the definitions of the hyperbolic functions, http://en.wikipedia.org/wiki/Hyperbolic_function. Omitting the temporal part we find

$$
\begin{aligned}
dl^2 &= R^2(t_0)\left(\frac{dr^2}{1 + r^2} + r^2 d\Omega^2\right), && \text{spatial part of Eq. (12.13)} \\
&= R^2(t_0)\left(\frac{\cosh^2(\chi)\,d\chi^2}{1 + \sinh^2(\chi)} + \sinh^2(\chi)\,d\Omega^2\right), && \text{sub transform} \\
&= R^2(t_0)\left(d\chi^2 + \sinh^2(\chi)\,d\Omega^2\right), && \text{use hyperbolic identities}
\end{aligned}
\tag{12.13}
$$
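The substitution can also be confirmed numerically. In the sketch below, the coefficient of dχ² that results from putting r = sinh χ into dr²/(1 + r²) is evaluated at a few sample values of χ and should equal 1. As a second check, the proper radial distance ∫dr/√(1 + r²) = asinh(r) should return χ. The function names are my own.

```python
import math

def radial_coefficient_in_chi(chi):
    """Coefficient of d(chi)^2 after substituting r = sinh(chi) in dr^2 / (1 + r^2)."""
    r = math.sinh(chi)
    dr_dchi = math.cosh(chi)
    return dr_dchi**2 / (1.0 + r**2)

def proper_radial_distance(r):
    """Integral of dr / sqrt(1 + r^2) from 0 to r, which is asinh(r)."""
    return math.asinh(r)

for chi in (0.0, 0.5, 2.0, 5.0):
    print(chi, radial_coefficient_in_chi(chi), proper_radial_distance(math.sinh(chi)))
```

So χ is simply the proper radial distance in units of R(t0), which is why the radial part of the metric becomes dχ² with unit coefficient.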

12.7 Rob’s supplementary problems

SP. 1 Let’s get a feel for the order of magnitude of the terms in the expanding universe.

SP. 1(a) At the current rate of expansion of the universe, how long will it take for a meter stick to double in size? For the universe to double in size?

SP. 1(b) Plate tectonics is causing the Atlantic ocean to spread at a rate of about 25 mm/year, see http://en.wikipedia.org/wiki/Mid-ocean_ridge. So stationed in Brest, France one observes New York City to recede at v = 25 mm/year. Assuming Hubble’s law applies locally, compare this to the rate at which New York recedes from Brest due to expansion of the universe.


Bibliography

Batchelor, G. K., 1967: An Introduction to Fluid Dynamics. Cambridge University Press, 615 pp.

Boas, M. L., 1983: Mathematical methods in the physical sciences. John Wiley and Sons, 793 pp.

Dirac, P. A. M., 1996: The General Theory of Relativity. Princeton University Press, 71 pp.

Faber, R. L., 1983: Differential geometry and relativity theory: An Introduction. Marcel Dekker, 255 + X pp.

Hobson, M., G. Efstathiou, and A. Lasenby, 2009: General Relativity: An introduction for physicists. Cambridge, 572 + XVIII pp.

Misner, C. W., K. S. Thorne, and J. A. Wheeler, 1973: Gravitation. W. H. Freeman and Company, 1279 + XXVI pp.

Perkins, D., 2009: Particle Astrophysics. Oxford University Press, 2nd ed., 339 + IX pp.

Schutz, B., 2009: A first course in General Relativity. Cambridge University Press, 2nd ed., 393 + XV pp.
