Strong Typed Tensor Calculus

Riemannian Calculus of Variationsusing Strongly Typed Tensor Calculus

Victor Dods

2012.12.11

Abstract

In this paper, the notion of strongly typed language will be borrowed from the field of computerprogramming to introduce a calculational framework for linear algebra and tensor calculus for the purposeof detecting errors resulting from inherent misuse of objects and for finding natural formulations of variousobjects. A tensor bundle formalism, crucially relying on the notion of pullback bundle, will be used tocreate a rich type system with which to distinguish objects. The type system and relevant notationis designed to “telescope” to accomodate a level of detail appropriate to a set of calculations. Varioustechniques using this formalism will be developed and demonstrated with the goal of providing a relativelycomplete and uniform method of coordinate-free computation.

The calculus of variations pertaining to maps between Riemannian manifolds will be formulated usingthe strongly typed tensor formalism and associated techniques. Energy functionals defined in terms offirst order Lagrangians are the focus of the second half of this paper, in which the first variation, theEuler-Lagrange equations, and the second variation of such functionals will be derived.

IntroductionMany important differential equations have a variational origin, being derived as the Euler-Lagrange equa-tions for a particular functional on some space of functions. The variational approach lends itself particularlyto physics, in which conservation of energy or minimization of action is a central concept. The naturality ofsuch formulations can’t be understated, as solutions to such problems often depend critically on the inherentgeometry of the underlying objects. For example, solutions to Laplace’s equation for a real valued function(e.g. modeling steady-state heat flow) on a Riemannian manifold depend qualitatively on the topology of themanifold (e.g. harmonic functions on a closed Riemannian manifold are necessarily constant, which makessense geometrically because there is no boundary through which heat can escape).

A central concept in the field of software design is that of information hiding [16], in which a com-puter program is organized into modules, each presenting an abstract public interface. Other parts of theprogram can interact only through the presented interface, and the details of how each module works arehidden, thereby preventing interference in the implementation details which are not required by the inherentstructure of the module. This concept has clear usefulness in the field of mathematics as well. For example,there are several formulations of the real numbers (e.g. equivalence classes of Cauchy sequences of rationalnumbers, Dedekind cuts, decimal expansions, etc), but their particulars are instances of what are known asimplementation details, and the details of each particular implementation are irrelevant in most areas ofmathematics, which only use the inherent properties of the real numbers as a complete, totally ordered field.Of course, at certain levels, it is useful or necessary to “open up the box” [go past the public interface] andwork with a particular representation of the real numbers.

Information hiding is characteristic of abstract mathematics, in which general results are proved aboutabstract mathematical objects without using any particular implementation of said objects. These resultscan then be used modularly in other proofs, just as the functionality of a computer program is organized intomodularized objects and functions. For example, a fixed point theorem for contractive mappings on closed

1

arX

iv:1

212.

2376

v1 [

mat

h.D

G]

11

Dec

201

2

sets in Banach spaces, but a particular application of this theorem renders an existence and uniquenesstheorem for first order ODEs [19, pgs. 59, 62].

A loose conceptual analogy for modularity is that of diagonalizing a linear operator. A basis of eigen-vectors are chosen so that the action of the linear operator on each eigenspace has a particularly simpleexpression, and distinct eigenspaces do not interact with respect to the operator’s action. In this analogy,the eigenvectors then correspond to individual lemmas, and the linear operator corresponds to a large theo-rem which uses each lemma. Decomposing the proof of the main result in terms of non-interacting lemmassimplifies the proof considerably, just as it simplifies the quantification of the linear operator. The term“orthogonal” has been borrowed by software design to describe two program modules whose functionality isindependent [18, Chapter 4, Section 2]. Orthogonality in software design is highly desirable as it generallyeases program implementation and program correctness verification, as the human designers are only capableof keeping track of a certain finite number of details simultaneously [13]. The scope of each detail level ofthe design is limited in complexity, making the overall design easier to comprehend.

This technique in software design carries over directly to proof design, where it is desirable (elegant) towrite proofs and do calculations without introducing extraneous details, such as choice of bases in vectorspaces or local coordinates in manifolds. Because such choices are generally non-unique, they can oftenobscure the inherent structure of the relevant objects by introducing artifacts arising from properties of theparticular details used to implement said objects. For example, the choice of a particular local coordinatechart on a manifold artificially imposes an additive structure on a neighborhood of the manifold, but sucha structure has nothing to do with the inherent geometry of the manifold. Furthermore, the descent to this“lower level” of calculation discards some type information, representing points in a manifold as Euclideanvectors, thereby losing the ability to distinguish points from different manifolds, or even different localitiesin the same manifold.

This paper makes a particular emphasis on natural formulations and calculations in order to exposethe underlying geometric structures rather than relying on coordinate-based expressions. The constructionof the “full” direct sum and “full” tensor product bundles are used in combination with induced covariantderivatives to this end.

Notation and ConventionsLet all vector spaces, manifolds and [fiber] bundles be real and finite-dimensional unless otherwise noted (thisallows the canonical identification V ∗∗ ∼= V for a vector space or vector bundle V ), and let all tensor productsbe over R. The unqualified term “bundle” will mean “fiber bundle”. The Einstein summation convention willbe assumed in situations when indexed tensors are used for computation.

Unary operators are understood to have higher binding precedence than binary operators, and super andsubscripts are understood to have the highest binding precedence. For example, the expression ∇X,M ◦ φwould be parenthesized as (∇ (X,M )) ◦ φ.

Apart from the obvious purpose of providing a concise and central reference for the notation in thispaper, the following notation index serves to illustrate the use of telescoping notation (see Section 2). Thehigh-level (terse notation which requires the reader to do more work in type inference but is more agile),mid-level, and low-level (completely type-specified, requiring little work on the part of the reader) notationsare presented side-by-side with their definitions.

Let I ⊆ R be a neighborhood of 0, let ε, i each be coordinates on I, let A,A1, . . . , An, B be sets, letM,N be manifolds, let φ ∈ C∞ (M,N), let πAM : A → M and πHN : H → N and be vector bundles, whereA = E,F, F1, . . . , Fn, G, let U, V, V1, . . . , Vn,W be vector spaces, and let ci ∈ Γ (Fi ⊗M T ∗M) such that

c1 ⊕M · · · ⊕M cn ∈ Γ ((F1 ⊕M · · · ⊕M Fn)⊗M T ∗M)

is a vector bundle isomorphism.

2

High- Mid- Low-level Description

Variations; variational derivatives; tangent vectors.mε m Variation of a point in M ; I 3 ε 7→ mε ∈M ; m : I →M .δ δε Variational derivative; δε := ∂

∂ε|ε=0.

δmε δεm Tangent vector; linearization of a variation;δmε ∈ Tm0M ; δεm ∈ Tm(0)M .

Projection maps; canonical isomorphisms; bundle-related maps and spaces.pr pri prA1×···×An

i Set-theoretic projection onto ith factor or named factor;prAi prA1×···×An

AiprA1×···×Ani : A1 × · · · ×An → Ai.

ι ιB , ιA ιAB Canonical isomorphism; ιAB : A→ B; ιBA :=(ιAB)−1.

π πM , πF πFM Bundle projection map; πFM : F →M .ρ ρH , ρφ

∗H ρφ∗HH Pullback bundle fiber projection map; ρφ

∗HH : φ∗H → H.

Trivial bundle constructions and projection maps.M ×N → N M ×�N → N Trivial bundle over N ; M ×�N := M ×N ;

πM×�NN : M ×�N → N, (m,n) 7→ n.

M ×N →M M ×�N →M Trivial bundle over M ; M ×�N := M ×N ;πM×�NM : M ×�N →M, (m,n) 7→ m.

Shared base-space bundle constructions and projection maps.E × F →M E ×M F →M Direct product; E ×M F :=

∐m∈M Em × Fm;

πE×MFM (e, f) := πEM (e) ≡ πFM (f).

E ⊕ F →M E ⊕M F →M Whitney sum; E ⊕M F :=∐m∈M Em ⊕ Fm;

πE⊕MFM (e⊕ f) := πEM (e) ≡ πFM (f).

E ⊗ F →M E ⊗M F →M Tensor product; E ⊗M F :=∐m∈M Em ⊗ Fm;

πE⊗MFM

(cijei ⊗ fj

):= πEM (ek) ≡ πFM (f`) (for any k, `).

Separate base-space bundle constructions and projection maps.E ×H →M ×N E ×M×N H →M ×N Direct product; E ×M×N H :=

∐(m,n)∈M×N Em ×Hn.

πE×M×NHM×N (e, h) :=

(πEM (e) , πHN (h)

).

E ⊕H →M ×N E ⊕M×N H →M ×N Whitney sum; E ⊕M×N H :=∐

(m,n)∈M×N Em ⊕Hn.πE⊕M×NHM×N (e⊕ h) :=

(πEM (e) , πHN (h)

).

E ⊗H →M ×N E ⊗M×N H →M ×N Tensor product; E ⊗M×N H :=∐

(m,n)∈M×N Em ⊗Hn.πE⊗M×NHM×N

(cijei ⊗ hj

):=(πEM (ek) , πHN (h`)

)(for any k, `).

Trace; natural pairing; tensor/tensor field contraction. Simple tensor expressions are extended linearly.tr trV Trace on V ; trV : V ∗ ⊗ V → R, α⊗ v 7→ α (v).α · v α ·V v Natural pairing; ·V : V ∗ × V → R, (α, v) 7→ α (v).A ·B A ·V B Tensor contraction; ·V : (U ⊗ V ∗)× (V ⊗W )→ U ⊗W,

(u⊗ α) ·V (v ⊗ w) := u⊗ (α ·V v)⊗ w ≡ α (v) u⊗ w.S ·n T S ·V1⊗···⊗Vn T Alternate for ·V , where V = V1 ⊗ · · · ⊗ Vn.

S : T , S ·2 T S ·V1⊗V2 T Special notation for n = 2.tr trF Trace on F →M ; trF : Γ (F ∗ ⊗M F )→ C∞ (M,R) ,

[trF (σ ⊗M f)] (m) := σ (m) ·Fm f (m) for m ∈M .σ · f σ ·F f Natural pairing; ·F : Γ (F ∗)× Γ (F )→ C∞ (M,R),

(σ ·F f) (m) := σ (m) ·Fm f (m) for m ∈M .A · f A ·F f Natural pairing; ·F : Γ (E ⊗M F ∗)× F → E,

(e⊗M σ) ·F f := e (m) (σ (m) ·Fm f) ∈ Em; m := πFM (f).S · T S ·F T Tensor field contraction; pointwise tensor contraction;

·F : Γ (E ⊗M F ∗)× Γ (F ⊗M G)→ Γ (E ⊗M G) ,

[(e⊗ σ) ·F (f ⊗ g)] (m) := (σ (m) ·Fm f (m)) e (m)⊗ g (m).S ·n T S ·F1⊗M ···⊗MFn T Alternate for ·F , where F = F1 ⊗M · · · ⊗M Fn.

S : T , S ·2 T S ·F1⊗MF2 T Special notation for n = 2.

3

High- Mid- Low-level Description

Permutations of tensors and tensor fields.Aσ, A ·n σ A ·V ∗1 ⊗···⊗V ∗n σ Right-action of permutations on n-tensors/n-tensor fields;

(v1 ⊗ · · · ⊗ vn)σ := vσ−1(1) ⊗ · · · ⊗ vσ−1(n); (Aσ)τ = Aστ .Spaces of sections of bundles.

Γ (H), Γ(πHN)

Space of smooth sections of the bundle πHN ;Γ (H) :=

{h ∈ C∞ (N,H) | πHN ◦ h = IdN

}.

Γφ (H), Γφ(πHN)

Space of smooth sections of πHN along φ;Γφ (H) :=

{h ∈ C∞ (M,H) | πHN ◦ h = φ

}.

Vertical bundle, pullback bundle, projection maps, pullback of sections.V E → E Vertical bundle over E →M ; V E := kerTπEM ≤ TE.

projection map πV EE := πTEE |V E .φ∗H →M Pullback bundle; φ∗H := {(m,h) ∈M ×H | φ (m) = π (h)}.

πφ∗HM (m,h) := m; ρφ

∗HH (m,h) := h.

φ∗h Pullback of section h ∈ Γ (H); φ∗h ∈ Γ (φ∗H)

defined by ρφ∗HH ◦ φ∗h = h; h ∈ Γ (H).

Covariant derivatives; partial covariant derivatives.∇L ∇p L ∇p M→RL Natural linear covariant derivative; differential of functions;

∇M→RL ∇p M→RL := dL ∈ Γ (T ∗M), where L ∈ C∞ (M,R).∇X ∇p X ∇p EX Linear covariant derivative on vector bundle E →M ;

∇EX ∇EX ∈ Γ (E ⊗M T ∗M), where X ∈ Γ (E).∇φ ∇◦ φ ∇◦ M→Nφ Tangent map as tensor field;

∇M→Nφ ∇◦ M→Nφ ∈ Γ (φ∗TN ⊗M T ∗M), where φ ∈ C∞ (M,N).∇σ ∇φσ ∇φ

∗Hσ Pullback covariant derivative; σ ∈ Γ (φ∗H);defined by ∇φ

∗Hφ∗h = φ∗∇Hh ·φ∗TN ∇◦ M→Nφ; h ∈ Γ (H).L,c1 , . . . , L,cn Partial differential of functions;

L,ci ∈ Γ (F ∗i ), defined by ∇p M→RL =∑ni=1 L,ci ·Fi ci.

X,c1 , . . . , X,cn Partial linear covariant derivative;X,ci ∈ Γ (E ⊗M F ∗i ), defined by ∇p EX =

∑ni=1X,ci ·Fi ci.

φ,M1 , . . . , φ,Mn Partial derivative decomposition of tangent map;φ,Mi ∈ Γ (φ∗TN ⊗M pr∗i T

∗Mi),where M = M1 × · · · ×Mn, pri := prMi , and∇◦ M→Nφ =

∑ni=1 φ,Mi ·pr∗i TMi ∇◦

M→Mi pri.

Covariant Hessians.∇2L ∇T

∗M∇M→RL Covariant Hessian of functions;∇p ∇p L ∇p T

∗M ∇p M→RL ∇2L ∈ Γ (T ∗M ⊗ T ∗M); L ∈ C∞ (M,R).∇2X ∇E⊗T

∗M∇EX Covariant Hessian on vector bundle E →M ;∇p ∇p X ∇p E⊗MT∗M ∇p EX ∇2X ∈ Γ (E ⊗ T ∗M ⊗ T ∗M); X ∈ Γ (E).∇2φ ∇φ

∗TN⊗T∗M∇M→Nφ Covariant Hessian of maps;∇p ∇◦ φ ∇p φ

∗TN⊗MT∗M ∇◦ M→Nφ ∇2φ ∈ Γ (φ∗TN ⊗ T ∗M ⊗ T ∗M). φ ∈ C∞ (M,N).

Derivative conventions.∇Xe Directional derivative notation; ∇Xe := ∇e ·TM X.

∇ne ·n (X1 ⊗ · · · ⊗Xn−1 ⊗Xn) Iterated covariant derivative convention;defined by

(∇Xn∇n−1e

)·n−1 (X1 ⊗ · · · ⊗Xn−1).

R (X,Y ) := −∇X∇Y +∇Y∇X +∇[X,Y ] Curvature operator; R (X,Y ) e = ∇2e : (X ⊗ Y − Y ⊗X).z : M →M × I, m 7→ (m, 0) Evaluation-at-zero map.

z∗∂i = δi Pullback formulation of derivative-at-zero.

For more on relevant introductory theory on manifolds, bundles and Riemannian geometry, see [10], [8],[12], [9].

4

Part I

Mathematical Setting1 Using Strong Typing to Error-Check CalculationsLinear algebra is an excellent setting for discussion of the strong typing [1] of a language, a concept usedin the design of computer programming languages. The idea is that when the human-readable source codeof a program is compiled (translated into machine-readable instructions), the compiler (the program whichperforms this translation) or runtime (the software which executes the code) verifies that the program objectsare being used in a well-defined way, producing an error for each operation that is not well-defined. Forexample, a vector-type value would not be allowed to be added to a permutation-type value, even thoughtuples of unsigned integers (i.e. bytes) are used by the computer to represent both, and the computer’sprocessing unit could add together their byte-valued representations. However, such an operation would bemeaningless with respect to the types of the operands. The result of the operation would depend on the non-canonical choice of representation for each object. Strong type checking has the advantage of catching manyprogramming errors, including most importantly those resulting from an inherent misuse of the program’sobjects. Within this paper, certain type-explicit notations will be used to provide forms of type awarenessconducive to error-checking.

An important example of semi-strong typing in math is Penrose’s abstract index notation [17], modeledon Einstein’s summation convention, in which linear algebra and tensor calculus are implemented usingindexed objects (tensors) having a certain number and order of “up” and “down” indices (an abstraction ofthe genuine basis/coordinate expressions in which the indexed objects are arrays of scalars/functions). Anon-indexed tensor is a scalar value, a tensor having a single up or down index is a vector or covector valuerespectively, a tensor having an up and a down index is an endomorphism, and so forth. The tensors arecontracted by pairing a certain number of up indices with the same number of down indices, resulting in anobject having as indices the uncontracted indices.

For example, given a finite-dimensional inner product space (V, g), where g is a(

02

)-tensor (having the

form gij , i.e. two down indices), a vector v ∈ V is a(

10

)-tensor, and the length of v is

√vigijvj . If dimV > 1,

then∧2

V has positive dimension, its vectors each being(

20

)-tensors, and Gijk` := gikgj`− gi`gjk is an inner

product on∧2

V (which must be a(

04

)-tensor in order to contract with two

(20

)-tensors).

Certain type errors are detected by use of abstract index notation in the form of index mismatch. Forexample, with (V, g) as above, if α ∈ V ∗, then α is a

(01

)-tensor. Because of the repeated j down indices, the

expression gijαj typically indicates a type error; gij can’t contract with αj because of incompatible valence(valence being the number of up and down indices). Furthermore, multiplying a

(02

)-tensor with a

(01

)-tensor

without contraction should result in a(

03

)-tensor, which should be denoted using three indices, as in gijαk.

The only explicit type information provided by abstract index notation is that of valence. The “semi”qualifier mentioned earlier is earned by the lack of distinction between the different spaces in which thetensors reside. For example, if U, V,W are finite-dimensional vector spaces, then linear maps A : U → Vand B : V → W can be written as

(11

)-tensors, and their composition B ◦ A : U → W is written as the

tensor contraction (B ◦A)ij = BikA

kj . However, while the expression AikB

kj makes sense in terms of valence

compatibility (i.e. grammatically), the composition “A◦B” that it should represent is not well-defined. Thusthis form of type error is not caught by abstract index notation, since the domains/codomains of the linearmaps must be checked separately.

The use of dimensional analysis (the abstract use of units such as kilograms, seconds, etc) in Physics isan important precedent of strong typing. Each quantity has an associated “dimension” (this is a differentmeaning from the “dimension” of linear algebra) which is expressed as a fraction of powers of formal symbols.The ordinary algebraic rules for fractions and formal symbols are used for the dimensional components, with

5

the further requirement that addition and equality may only occur between quantities having the samedimension.

For example, if E, M and C represent the dimensions of energy, mass and cost, respectively, and if theenergy storage density ρE/M of a battery manufacturing process is known (having dimensions energy permass) and the manufacturing weight yield wM/C of the battery is known (having dimensions mass per cost),then under the algebraic convensions of dimensional analysis, calculating the energy storage per cost (whichshould have dimensions energy per cost) is simple;(

ρEM

)(w

MC

)= ρw

EMMC

= ρwEC

(the M symbols cancel in the fraction). Here, both ρ and w are real numbers, and besides using the well-definedness of real multiplication, no type-checking is done in the expression ρw.

A contrasting example is the quantity ρ/w, having dimensions EC/M2. However, these dimensions maybe considered to be meaningless in the given context. The quantity’s type adds meaning to the real-valuedquantity, and while the quantity is well-defined as a real number, the uselessness of the type may indicatethat an error has been made in the calculations. For example, a type mismatch between the two sides of anequation is a strong indication of error.

This is also a convenient way to think about the chain rule of calculus. If z (y), y (x), and x measurereal-valued quantities, then z (y (x)) measures the quantity z with respect to quantity x. Using Z, Y and Xfor the dimensions of the quantities z, y and x respectively, the derivative dz

dx has units Z/X. When workedout, the dimensions for the quantities on either side of the equation dz

dx = dzdy

dydx will match exactly, having a

non-coincidental similarity to the calculation in the battery product example.

2 Telescoping Notation (aka Don’t Fear the Verbosity)Many of the computations developed in this paper will appear to be overly pedantic, owing to the decoration-heavy notation that will be introduced in Section 3. This decoration is largely for the purpose of tracking themyriad of types in the type system and to assist the human reader or writer in making sense of and error-checking the expressions involved. The pedantry in this paper plays the role of introducing the technique.The notation is designed to telescope1, meaning that there is a spectrum of notational decoration; from

• pedantically type-specified, verbose, and decoration-heavy, where [almost] no types must be inferredfrom context and there is little work or expertise required on the part of the reader, to

• somewhat decorated but more compact, where the reader must do a little bit of thinking to infer sometypes, all the way to

• tersely notated with minimal type decoration, where [almost] all types must be inferred from contextand the reader must either do a lot of thinking or be relatively experienced.

Additionally, some of the chosen symbols are meant to obey the same telescoping range of specifity. Forexample, compare n-fold tensor contraction ·n with type-specified ·V1⊗···⊗Vn as discussed in Section 3, or thesymbols ∇, ∇◦ , and ∇p as discussed in Section 10. Tersely notated computations can be seen in Section 10,while fully-verbose computations abound in the careful exposition of Part II.

3 Strongly-Typed Linear Algebra via Tensor ProductsA fully strongly typed formulation of linear algebra will now be developed which enjoys a level of abstractionand flexibility similar to that of Penrose’s abstract index notation. Emphasis will be placed on notational

1Credit for the notion of telescoping notation is due in part to David DeConde, during one of many enjoyable and insightfulconversations.

6

and conceptual regularity via a tensor formalism, coupled with a notion of “untangled” expression whichexploits and notationally depicts the associativity of linear composition.

If V denotes a finite-dimensional vector space, then let

·V : V ∗ × V → R, (α, v) 7→ α (v)

denote the natural pairing on V , and denote ·V (α, v) using the infix notation α ·V v. The natural pairingis a nondegenerate bilinear form and its bilinearity gives the expression α ·V v multiplicative semantics(distributivity and commutativity with scalar multiplication), thereby justifying the use of the infix · operatornormally reserved for multiplication. The natural pairing subscript V is seemingly pedantic, but will proveto be an invaluable tool for articulating and navigating the rich type system of the linear algebraic and vectorbundle constructions used in this paper. When clear from context, the subscript V may be omitted.

Because V is finite-dimensional, it is reflexive (i.e. the canonical injection V → V ∗∗, v 7→ (α 7→ α (v)) isa linear isomorphism). Thus the natural pairing ·V ∗ on V ∗ can be written naturally as

·V ∗ : V × V ∗ → R, (v, α) 7→ α (v) .

Note that α ·V v = v ·V ∗ α. Though subtle, the distinction between ·V and ·V ∗ is important within the typesystem used in this paper.

Through a universal mapping property of multilinear maps, the bilinear forms ·V and ·V ∗ descend to thenatural trace maps

trV : V ∗ ⊗ V → R, α⊗ v 7→ α (v) , andtrV ∗ : V ⊗ V ∗ → R, v ⊗ α 7→ α (v) ,

each extended linearly to non-simple tensors. These operations can also be called tensor contraction.Noting that (V ∗ ⊗ V )

∗ and (V ⊗ V ∗)∗ are canonically isomorphic to V ⊗ V ∗ and V ∗ ⊗ V respectively, thenfor each A ∈ V ∗ ⊗ V and B ∈ V ⊗ V ∗, it follows that trV (A) = IdV ∗ ·V ∗⊗VA and trV ∗ (B) = IdV ·V⊗V ∗B.

Definition 3.1 (Linear maps as tensors). Let V and W be finite-dimensional vector spaces, and letHom (V,W ) denote the space of vector space morphisms from V to W (i.e. linear maps). The linearisomorphism

W ⊗ V ∗ → Hom (V,W ) ,

w ⊗ α 7→ (V →W, v 7→ w (α ·V v))

(extended linearly to general tensors) will play a central conceptual role in the calculations employed in thispaper, as it will facilitate constructions which would otherwise be awkward or difficult to express. Linearmaps and appropriately typed tensor products will be identified via this isomorphism.

Given bases v1, . . . , vm ∈ V and w1, . . . , wn ∈W , and dual bases v1, . . . , vm ∈ V ∗ and w1, . . . , wn ∈W ∗,a linear map A : V →W can be written under the identification in (3.1) as

A = Aij wi ⊗ vj ,

where Aij = wi ·W A ·V vj ∈ R, and in fact[Aij]∈Mn×m (R) is the matrix representation of A with respect

to the bases v1, . . . , vm ∈ V and w1, . . . , wn ∈ W , noting that the i and j indices denote the “output” and“input” components of A respectively. Tensors are therefore the strongly typed analog of matrices, wherethe W ⊗ V ∗ type information is carried by the wi ⊗ vj component.

One clarifying example of the tensor formulation is the adjoint operation of the natural pairing, alsoknown as forming the dual of a linear map. It is straightforward to show that

∗ : W ⊗ V ∗ → V ∗ ⊗W,w ⊗ α 7→ α⊗ w,

7

(where the map is extended linearly to general tensors). This is literally the tensor abstraction of the matrixtranspose operation; if A = Aij wi ⊗ αj , then the dual A is A∗ = Aji α

i ⊗ wj . The matrix of A∗ is preciselythe transpose of the matrix of A with respect to the relevant bases. The map ∗ itself can be written as a4-tensor ∗ ∈ V ∗ ⊗W ⊗W ∗ ⊗ V , where A∗ = ∗ ·W⊗V ∗ A.

There is a notion of the natural pairing of tensor products, which implements composition and evaluationof linear maps, and can be thought of as a natural generalization of scalar multiplication in a field. If U, V,and W are each finite-dimensional vector spaces, then the bilinear form

(U ⊗ V ∗)× (V ⊗W ) → U ⊗ R⊗W ∼= U ⊗W,(u⊗ α, v ⊗ w) 7→ u⊗ (α ·V v)⊗ w = (α ·V v)u⊗ w

will be denoted also by the infix notation ·V (i.e. (u⊗ α) ·V (v ⊗ w) = (α ·V v)u⊗w). If V itself is a tensorproduct of n factors which are clear from context, then ·V may be denoted by ·n (think an n-fold tensorcontraction). If n = 2, then typically : is used in place of ·2. For example, from above, A∗ = ∗·W⊗V ∗A = ∗ : A.

Given a permutation σ ∈ Sn, define a right-action by σ : V1⊗· · ·⊗Vn → Vσ−1(1)⊗· · ·⊗Vσ−1(n), mappingelements in the obvious way. For example, (2 3 4) acting on v1⊗v2⊗v3⊗v4 puts the second factor in the thirdposition, the third factor in the fourth position, and the fourth factor in the second, giving v1⊗ v4⊗ v2⊗ v3.This permutation is itself a linear map and of course can be written as a tensor. However, because it isdefined in terms of a right action, the “domain factors” will come on the left. Thus σ is written as a tensorof the form V ∗1 ⊗ · · · ⊗ V ∗n ⊗ Vσ−1(1) ⊗ · · · ⊗ Vσ−1(n) (i.e. as a 2n-tensor). Certain tensor constructions areconducive to using such permutations. In the above example, ∗ can be written as (1 2) ∈W ∗⊗V ⊗V ∗⊗W .

The permutation right-action also works naturally when notated using superscripts. For example, ifB ∈ U ⊗ V ⊗W , then

B(1 2) := B ·U∗⊗V ∗⊗W∗ (1 2) ∈ V ⊗ U ⊗W

and so (B(1 2)

)(2 3)

= (B ·U∗⊗V ∗⊗W∗ (1 2)) ·V ∗⊗U∗⊗W∗ (2 3)

= B ·U∗⊗V ∗⊗W∗ ((1 2) ·V ∗⊗U∗⊗W∗ (2 3))

= B ·U∗⊗V ∗⊗W∗ (1 2) (2 3)

= B ·U∗⊗V ∗⊗W∗ (1 3 2) ∈ V ⊗W ⊗ U.

When multiplying the permutations (1 2) and (2 3) in the third line, it is important to note that they areread left-to-right, since they are acting on B on the right.

The inline cycle notation is somewhat ambiguous in isolation because the number of factors in thedomain/codomain is not specified, let alone their types. This information can sometimes be inferred fromcontext, such as from the natural pairing subscripts, as in the following examples.

Example 3.2 (Linearizing the inversion map). Let i : GL (V ) → GL (V ) , A 7→ A−1, i.e. the linear mapinversion operator, whereGL (V ) is an open submanifold of V⊗V ∗ via the isomorphism V⊗V ∗ ∼= Hom (V, V ).Its linearization (derivative) Di : GL (V )→ V ⊗ V ∗ ⊗ (V ⊗ V ∗)∗ ∼= V ⊗ V ∗ ⊗ V ∗ ⊗ V at A ∈ GL (V ) in the

8

direction B ∈ TA (GL (V )) ∼= V ⊗ V ∗ is

Di (A) ·V⊗V ∗ B = Di ·V⊗V ∗ δ (A+ εB)

= δ (i (A+ εB))

= δ(

(A+ εB)−1)

= δ(((

1 + εBA−1)A)−1)

= δ(A−1

(1 + εBA−1

)−1)

= δ

(A−1

∞∑n=0

(−εBA−1

)n)(∣∣−εBA−1

∣∣is taken arbitrarily small due to the derivative δ := ddε |ε=0

being evaluated in an arbitrarily small neighborhood of ε = 0)

= δ(A−1 − εA−1BA−1 +O

(ε2))

= −A−1 ·V B ·V A−1.

In order to “move” the B parameter out so that it plays the same syntactical role as in the original expressionDi (A) ·B, via adjacent natural pairing, some simple tensor manipulations can be done. The process is easilyand accurately expressed via diagram. The following sequence of diagrams is a sequence of equalities. Thediagram should be self-explanatory, but for reference, the number of boxes for a particular label denotesthe rank of the tensor, with each box labeled with its type. The lines connecting various boxes are naturalpairings, and the circles represent the unpaired “slots”, which comprise the type of the resulting expression.

V V ∗ V ∗ V

Di(A)

V V ∗

B

V V ∗

−A−1

V V ∗

B

V V ∗

A−1

The following step is nothing but moving the boxes for B out; the natural pairings still apply to the sameslots, hence the cables dangling below.

V V ∗

−A−1

V V ∗

A−1

V V ∗

B

In this setting, a tensor product amounts to flippantly gluing boxes together.

V V ∗ V V ∗

−A−1 ⊗A−1

V V ∗

B

9

In order for B to be naturally paired in the same adjacent manner as in the original expression Di (A) ·B,the slots of −A−1 ⊗A−1 must be permuted; the second moves to the third, the third to the fourth, and thefourth to the second.

V V ∗ V ∗ V

−(A−1 ⊗A−1

)(2 3 4)

V V ∗

B

The first diagram equals the last one, thus Di (A) ·V⊗V ∗ B = −(A−1 ⊗A−1

)(2 3 4) ·V⊗V ∗ B, and by the

nondegeneracy of the natural pairing on V ⊗ V ∗, this implies that Di (A) = −(A−1 ⊗A−1

)(2 3 4), notingthat the statement of this expression does not require the direction vector B. The permutation exponent(2 3 4) can be calculated easily using simple tensors, if not by the above diagrammatic manipulations;

(a1 ⊗ a2) · (b1 ⊗ b2) · (a3 ⊗ a4) = (a1 ⊗ a4 ⊗ a2 ⊗ a3) : (b1 ⊗ b2) = (a1 ⊗ a2 ⊗ a3 ⊗ a4)(2 3 4)

: (b1 ⊗ b2) .

Here, the expression (a1 ⊗ a2) · (b1 ⊗ b2) · (a3 ⊗ a4) represents the expression A−1 ·B ·A−1.

The next example will later be extended to the setting of Riemannian manifolds and their metric tensorfields, and put to use to formulate what are known as harmonic maps (see (12.7)). But first, a new tensoroperation must be defined.

Definition 3.3 (Parallel tensor product). If U, V,W,X are vector spaces and A ∈ U ⊗ V and B ∈W ⊗X,then define their parallel tensor product A�B by

A�B := (A⊗B)(2 3) ∈ (U ⊗W )⊗ (V ⊗X) .

The parentheses in the type specification are unnecessary, but hint at what the tensor decomposition for thequantity A�B should be, if used as an operand to � again (see below).

If A and B represent linear maps, then A�B ∈ (U ⊗W )⊗ (V ⊗X) represents their tensor product aslinear maps (the parentheses are unnecessary but hint at what the domain and codomain are, and for use ofA � B as an operand in another parallel tensor product), which is a “parallel” composition; if α ∈ V ∗ andβ ∈ X∗, then (A�B) ·V ∗⊗X∗ (α⊗ β) = (A ·V ∗ α)⊗ (B ·X∗ β).

There is a slight ambiguity in the notation coming from a lack of specification on how the tensor productof the operands is decomposed in the case when there is more than one such decomposition. Notationexplicitly resolving this ambiguity will not be needed in this paper as the relevant tensor product is usuallyclear from context.

The parallel tensor product is associative; if Y and Z are also vector spaces and C ∈ Y ⊗ Z, then

(A�B) � C = A� (B � C) ∈ (U ⊗W ⊗ Y )⊗ (V ⊗X ⊗ Z) ,

allowing multiply-parallel tensor products.

Example 3.4 (Tensor product of inner product spaces). If (V, g) and (W,h) are inner product spaces (notingthat g ∈ V ∗ ⊗ V ∗ and h ∈ W ∗ ⊗W ∗ are symmetric, i.e. literally invariant under (1 2)), then W ⊗ V ∗ isan inner product space having induced inner product k (A,B) := trV

(g−1 ·V ∗ A∗ ·W∗ h ·W B

). Here, the

“inputs” of A and B (the V ∗ factors) are being paired using g−1 ∈ V ⊗V , while the “outputs” (theW factors)are being paired using h ∈ W ∗ ⊗W ∗, and the trace is used to “complete the cycle” by plugging the outputinto the input, thereby producing a real number. The expression k (A,B) can be written in a more naturalway, which takes advantage of the linear composition, as A : k : B (or, pedantically, A ·W∗⊗V k ·W⊗V ∗ B),instead of the more common but awkward trace expression mentioned earlier. In the tensor formalism, theinner product k should have type W ∗⊗V ⊗W ∗⊗V . Permuting the middle two components of the 4-tensor

10

h ⊗ g−1 ∈ W ∗ ⊗W ∗ ⊗ V ⊗ V gives the correct type. In fact, k = h � g−1. A further advantage to thisformulation is that if any or all of A, k,B are functions, there is a clear product rule for derivatives of theexpression A : k : B. This is something that is used critically in Riemannian geometry in the form ofcovariant derivatives of tensor fields (see (8.2)).

In this paper, the main use of the tensor formulation of linear maps is twofold: to facilitate linear algebraicconstructions which would otherwise be difficult or awkward (this includes the ability to express derivatives of[possibly vector or manifold-valued] maps without needing to “plug in” the derivative’s directional argument),and to make clear the product-rule behavior of many important differentiable constructions.

4 Bundle ConstructionsIn order to use the calculus of variations involving Lagrangians depending tangent maps of maps betweensmooth manifolds, it suffices to consider Lagrangians defined on smooth vector bundle morphisms. Con-tinuing in the style of the previous section, a “full” tensor product of smooth vector bundles (4.4) will beformulated which will then allow expression of smooth vector bundle morphisms as tensor fields, sometimescalled two-point tensor fields [11, pg. 70]. The full arsenal of tensor calculus can then be used to considerableadvantage.

First, some definitions and simpler bundle constructions will be introduced. A smooth [fiber] bundle(hereafter refered to simply as a smooth bundle) is a 4-tuple (E , E, π,N) where E , E and N are smoothmanifolds and π : E → N is locally trivial, i.e. N is covered by open sets {Uα} such that π−1 (Uα) ∼= Uα×Eas smooth manifolds. The manifolds E , E and N are called the typical fiber, the total space, and thebase space respectively. The map π is called the bundle projection. The full 4-tuple specifying a bundlecan be recovered from the bundle projection map, so a locally trivial smooth map can be said to define asmooth bundle. The dimension of the typical fiber of a bundle will be called its rank, and will be denotedby rankπ or rankE when the bundle is understood from context.

The space of smooth sections of a smooth bundle defined by π : E → N is

Γ (π) := {σ ∈ C∞ (N,E) | π ◦ σ = IdN} ,

and may also be denoted by Γ (E), if the bundle is clear from context. If nonempty, Γ (π) is generally aninfinite-dimensional manifold (the exception being when the base space N is finite).

Proposition 4.1 (Trivial bundle). Let M and N be smooth manifolds. With M ×�N := M ×N and

πM×�N := prM×N2 : M ×�N → N

defines a smooth bundle(M,M ×�N, πM×�N , N

), called a trivial bundle. Similarly, withM ×�N := M×N

and πM×�N := prM×N1 : M ×�N →M ,(N,M ×�N, πM×�N ,M

)is a trivial bundle.

No proof is deemed necessary for (4.1), as each bundle projection trivializes globally in the obvious way.The ×� symbol is a composite of × (indicating direct product) and � or � (indicating the base space).

IfM and N are smooth manifolds as in (4.1), then there are two particularly useful natural identifications.

C∞ (M,N) ∼= Γ (M ×�N) C∞ (M,N) ∼= Γ (N ×�M)

φ 7→ IdM ×Mφ φ 7→ φ×M IdM

prM×N2 ◦Φ←[ Φ prN×M1 ◦Φ← [ Φ

These identifications can be thought of identifying a map φ ∈ C∞ (M,N) with its graph in M × N andN ×M respectively. Furthermore, this allows bundle theory to be applied to reasoning about spaces ofmaps. The symbols M ×� N and N ×� M now carry a significant amount of meaning. Generally N ×� M

11

will be used in this paper, for consistency with the Hom (V,W ) ∼= W ⊗ V ∗ convention discussed in Section3. The symbols ×� and ×� are examples of telescoping notation, as they are built notationally on ×, andconceptually on the direct product, which is what is denoted by ×. The arrow portion of the symbols canbe discarded when type-specificity is not needed.

Proposition 4.2 (Direct product bundle). Let(E , E, πE ,M

)and

(F , F, πF , N

)be smooth bundles. Then

πE × πF : E × F →M ×N, (e, f) 7→(πE (e) , πF (f)

)defines a smooth bundle

(E × F , E × F, πE × πF ,M ×N

). This bundle is called the direct product of πE

and πF , and is not necessarily a trivial bundle.

Proof. Let ΨE :(πE)−1

(U) → U × E and ΨF :(πF)−1

(V ) → V × F trivialize πE and πF over open setsU ⊆M and V ⊆ N respectively. Then

ΨE ×ΨF :(πE)−1

(U)×(πF)−1

(V )→ U × E × V ×F

has inverse(ΨE)−1 ×

(ΨF)−1. Note that(

πE)−1

(U)×(πF)−1

(V ) ={

(e, f) ∈ E × F | πE (e) ∈ U, πF (f) ∈ V}

={

(e, f) ∈ E × F |(πE × πF

)(e, f) ∈ U × V

}=(πE × πF

)−1(U × V ) ,

and thatP : (U × E)× (V ×F)→ (U × V )× (E × F) , ((u, e) , (v, f)) 7→ ((u, v) , (e, f))

defines a diffeomorphism. Then

ΨE×F := P ◦(ΨE ×ΨF

):(πE × πF

)−1(U × V )→ (U × V )× (E × F)

defines a diffeomorphism, and

pr(U×V )×(E×F)1 ◦ΨE×F (e, f)

= pr(U×V )×(E×F)1 ◦P ◦

(ΨE ×ΨF

)(e, f)

= pr(U×V )×(E×F)1 ◦P

(ΨE (e) ,ΨF (f)

)= pr

(U×V )×(E×F)1 ◦P

(ΨE (e) ,ΨF (f)

)= pr

(U×V )×(E×F)1 ◦P

((prU×E1 ◦ΨE (e) ,prU×E2 ◦ΨE (e)

),(prV×F1 ◦ΨF (f) ,prV×F2 ◦ΨF (f)

))= pr

(U×V )×(E×F)1

((prU×E1 ◦ΨE (e) ,prV×F1 ◦ΨF (f)

),(prU×E2 ◦ΨE (e) ,prV×F2 ◦ΨF (f)

))=(prU×E1 ◦ΨE (e) ,prV×F1 ◦ΨF (f)

)=(πE (e) , πF (f)

)=(πE × πF

)(e, f) ,

showing that ΨE×F trivializes πE×πF over U×V ⊆M×N . SinceM×N can be covered by such trivializingsets, this establishes that πE × πF defines a smooth bundle. The typical fiber of πE × πF is E × F .

A smooth vector bundle is a fiber bundle whose typical fiber is a vector space and whose localtrivializations are linear isomorphisms when restricted to each fiber. If (E , E, π,M) is a smooth vectorbundle, then its dual vector bundle (E∗, E∗, π∗,M) is a smooth vector bundle defined in the followingway.

E∗ :=∐p∈M

(Ep)∗, π∗ : E∗ →M, ηp 7→ p.

12

Because E is a vector space, the notation E∗ is already defined. In analogy with Section 3, there are naturalpairings on a vector bundle and its dual, defined simply by evaluation. If p ∈ M , η ∈ E∗p and e ∈ Ep, thenη ·E e := η ·Ep e and e ·E η := e ·Ep η. Both expressions evaluate to η (e). Natural traces and n-fold tensorcontraction can be defined analogously. Again, while seemingly pedantic, the subscripted natural pairingnotation will prove to be a valuable tool in articulating and error-checking calculations involving vectorbundles. To generalize the rest of Section 3 will require the definition of additional structures.

For the remainder of this section, let(E , E, πE ,M

)and

(F , F, πF , N

)now be smooth vector bundles.

The following construction is essentially an alternate notation for πE × πF : E × F → M × N , but is onethat takes advantage of the fact that πE and πF are vector bundles, and encodes in the notation the factthat the resulting construction is also a vector bundle. This is analogous to how V ×W is a vector spacewith a natural structure if V and W are vector spaces, except that this is usually denoted by V ⊕W .

Proposition 4.3 (“Full” direct sum vector bundle). If

E ⊕M×N F := E × F,

ThenπE ⊕M×N πF := πE × πF : E ⊕M×N F →M ×N

defines a smooth vector bundle(E ⊕ F , E ⊕M×N F, πE ⊕M×N πF ,M ×N

), called the full direct sum of

πE and πF .For each (p, q) ∈M ×N , the vector space structure on

(πE ⊕M×N πF

)−1(p, q) is given in the following

way. Let α ∈ R and (e1, f1) , (e2, f2) ∈(πE ⊕M×N πF

)−1(p, q). Then

α (e1, f1) + (e2, f2) = (αe1 + e2, αf1 + f2) .

It is critical to see (4.5) for remarks on notation.

Proof. Let U , V , E , F , P , ΨE , ΨF and ΨE×F be as in the proof of (4.2), and define ΨE⊕M×NF := ΨE×F .Noting that ΨE⊕M×NF is a smooth bundle isomorphism over IdU×V , so to show that ΨE⊕M×NF is a linearisomorphism in each fiber, it suffices to show that it is linear in each fiber. Let α ∈ R, (p, q) ∈ U × V and(e1, f1) , (e2, f2) ∈

(πE ⊕M×N πF

)−1(p, q). Then

ΨE⊕M×NF (αe1 + e2, αf1 + f2) = P ◦(ΨE ×ΨF

)(αe1 + e2, αf1 + f2)

= P(ΨE (αe1 + e2) ,ΨF (αf1 + f2)

)= P

(αΨE (e1) + ΨE (e2) , αΨF (f1) + ΨF (f2)

)(by trivial vector bundle structures on U × Eand V ×F)

= αP(ΨE (e1) ,ΨF (f1)

)+ P

(ΨE (e2) ,ΨF (f2)

)(by trivial vector bundle structure on (U × V )× (E × F))

= αP ◦(ΨE ×ΨF

)(e1, f1) + P ◦

(ΨE ×ΨF

)(e2, f2)

= αΨE⊕M×NF (e1, f1) + ΨE⊕M×NF (e2, f2) .

Thus ΨE⊕M×NF is linear in each fiber, and because it is invertible, it is a linear isomorphism in each fiber.In particular, ΨE⊕M×NF is a smooth vector bundle isomorphism over IdU×V . Applying

(ΨE⊕M×NF

)−1 tothe above equation gives

(αe1 + e2, αf1 + f2) = α (e1, f1) + (e2, f2) ,

as desired.

This construction differs from theWhitney sum of two vector bundles, as the base spaces of the bundles arekept separate, and aren’t even required to be the same. This allows the identification of T (M ×N)→M×N

13

as TM ⊕M×N TN → M × N , which may be done without comment later in this paper. Some importantrelated structures are pr∗1 π

TMM : pr∗1 TM →M ×N and pr∗2 π

TNN : pr∗2 TN →M ×N , where pri := prM×Ni .

The next construction is what will be used in the implementation of smooth vector bundle morphismsas tensor fields.

Proposition 4.4 (“Full” tensor product bundle). If

E ⊗M×N F :=∐

(p,q)∈M×N

Ep ⊗ Fq (disjoint union),

ThenπE ⊗M×N πF : E ⊗M×N F →M ×N, αijei ⊗ fj 7→

(πE (e1) , πF (f1)

)(here, αij ∈ R)

defines a smooth vector bundle(E ⊗ F , E ⊗M×N F, πE ⊗M×N πF ,M ×N

), called the full tensor product2

of πE and πF .It is critical to see (4.5) for remarks on notation.

Proof. Since the argument αijei ⊗ fj in the definition of πE ⊗M×N πF is not necessarily unique, the well-definedness of πE ⊗M×N πF must be shown. Let αije1

i ⊗ f1j = βije2

i ⊗ f2j . Then in particular, αije1

i ⊗f1j , β

ije2i ⊗ f2

j ∈ Ep ⊗ Fq for some (p, q) ∈M ×N , and therefore e1i , e

2i ∈ Ep and f1

j , f2j ∈ Fq for each index

i and j. Thus πE(e1

1

)= p = πE

(e2

1

)and πF

(f1

1

)= q = πF

(f2

1

), so the expression defining πE ⊗M×N πF

is well-defined.The set E ⊗M×N F does not have an a priori global smooth manifold structure, as it is defined as the

disjoint union of vector spaces. A smooth manifold structure compatible with that of the constituent vectorspaces will now be defined.

Let ΨE :(πE)−1

(U)→ U×E and ΨF :(πF)−1

(V )→ V ×F trivialize πE and πF over open sets U ⊆Mand V ⊆ N respectively, such that ΨE and ΨF are each linear in each fiber. Define

ΨE⊗M×NF :(πE ⊗M×N πF

)−1(U × V ) → (U × V )× (E ⊗ F)

X 7→((πE ⊗M×N πF

)(X) ,

((prU×E2 ◦ΨE

)⊗(prV×F2 ◦ΨF

))(X)

).

The map ΨE⊗M×NF is well-defined and smooth in each fiber by construction, since for each (p, q) ∈ U × V ,(prU×E2 ◦ΨE

)⊗(prV×F2 ◦ΨF

)|Ep⊗Eq : Ep ⊗ Eq → E ⊗F

is a linear isomorphism by construction. Additionally, ΨE⊗M×NF has been constructed so that

pr(U×V )×(E⊗M×NF)1 ◦ΨE⊗M×NF = πE ⊗M×N πF

on(πE ⊗M×N πF

)−1(U × V ). Define the smooth structure on

(πE ⊗M×N πF

)−1(U × V ) ⊆ E ⊗M×N F

by declaring ΨE⊗M×NF to be a diffeomorphism. The map πE ⊗M×N πF is trivialized over U × V . The setE ⊗M×N F can be covered by such trivializing open sets. Thus E ⊗M×N F has been shown to be locallydiffeomorphic to the direct product of smooth manifolds, and therefore it has been shown to be a smoothmanifold. With respect to the smooth structure on E ⊗M×N F , the map πE ⊗M×N πF is smooth, and hastherefore been shown to define a smooth vector bundle.

Remark 4.5 (Notation regarding base space). The “full” direct sum (4.3) and “full” tensor product (4.4)bundle constructions allow direct sums and tensor products to be taken of vector bundles when the basespaces differ. If the base spaces are the same, then the construction “joins” them, producing a vector bundleover that shared base space. For example, if E and F are vector bundles over M , then E ⊗M×M F has basespace M ×M , while E⊗F has base space M . The base space can be specified in either case as a notationalaide; the latter example would be written as E ⊗M F . If no subscript is provided on the ⊗ symbol, then

2This construction is alluded to in [7, pg. 121], but is not defined or discussed.

14

the base spaces are “joined” if possible (if they are the same space), otherwise they are kept separate, as inthe “full” tensor product construction. This notational convention conforms to the standard Whitney sumand tensor product bundle notation, and uses the notion of telescoping notation to provide more specificitywhen necessary.

Given a fiber bundle, a natural vector bundle can be constructed “on top” of it, essentially quantifyingthe variations of bundle elements along each fiber. This is known as the vertical bundle, and it plays acritical role in the development of Ehresmann connections, which provide the “horizontal complement” tothe vertical bundle.

Proposition 4.6 (Vertical bundle). Let πE : E → M define a smooth [fiber] bundle. If V E := kerTπE ≤TE, then πV E := πTEE |V E : V E → E defines a smooth vector bundle subbundle of πTEE : TE → E, calledthe vertical bundle over E. Furthermore, the fiber over e ∈ E is VeE = TeEπE(e) ≤ TeE.

Proof. Because πE is a smooth surjective submersion, V E → E is a subbundle of TE → E having corankdimM and therefore rank equal to that of E. Furthermore, if e ∈ E and ε 7→ eε ∈ EπE(e), then δeε representsan arbitrary element of TeEπE(e), and TπE (δeε) = δ

(πE (eε)

)= δ (π (e)) = 0, showing that δeε ∈ kerTπE ,

and therefore that δeε ∈ VeE. This shows that TeEπE(e) ⊆ VeE. Because dimTeEπE(e) = rankE, this showsthat TeEπE(e) = VeE.

Given the extra structure that a vector bundle provides over a [fiber] bundle, there is a canonical smoothvector bundle isomorphism which adds significant value to the pullback bundle formalism used throughoutthis paper. This can be seen put to greatest use in Part II, for example, in development of the first variation(see (12.1)).

Proposition 4.7 (Vertical bundle as pullback). If π : E →M defines a smooth vector bundle, then

ιπ∗EV E : π∗E → V E,

(x, y) 7→ δε (x+ εy)

is a smooth vector bundle isomorphism over IdE, called the vertical lift, having inverse

ιV Eπ∗E : δeε 7→(e0, lim

ε→0

eε − e0

ε

),

where, without loss of generality, eε is an E-valued variation which lies entirely in a single fiber.

Proof. It is clear that ιπ∗EV E is linear and injective on each fiber. By a dimension counting argument, it is

therefore an isomorphism on each fiber. Because it preserves the basepoint, it is a vector bundle isomorphismover IdE . Because the map (x, y, ε) 7→ x + εy is smooth, so is the defining expression for ιπ

∗EV E , thereby

establishing smoothness. That ιV Eπ∗E inverts ιπ∗EV E is a trivial calculation.

5 Strongly-Typed Tensor Field OperationsBecause vector bundles and the related operations can be thought of conceptually as “sheaves of linear al-gebra”, the constructions in Section 3, generalized earlier in this section, can be further generalized to thesetting of sections of vector bundles.

If E,F,G are smooth vector bundles over M , then define the natural pairing of a tensor field with avector:

·F : Γ (E ⊗M F ∗)× F → E,

(e⊗M φ, f) 7→ e(πF (f)

) [φ(πF (f)

)·F f

],

15

extending linearly to general tensor fields. Further, define the natural pairing of tensor fields:

·F : Γ (E ⊗M F ∗)× Γ (F ⊗M G) → Γ (E ⊗M G) ,

(e⊗M φ, f ⊗M g) 7→(p 7→ e (p)⊗M

(φ (p) ·Fp f (p)

)⊗M g (p)

)=(p 7→

(φ (p) ·Fp f (p)

)(e⊗M g) (p)

),

extending linearly to general tensor fields. This multiple use of the ·F symbol is a concept known as opera-tor overloading in computer programming. No ambiguity is caused by this overloading, as the particularuse can be inferred from the types of the operands. As before, the subscript F may be optionally omittedwhen clear from context.

The permutations defined in Section 3 are generalized as tensor fields. If F1, . . . , Fn are smooth vectorbundles over M , and σ ∈ Sn is a permutation, then σ can act on F1⊗M · · · ⊗M Fn by permuting its factors,and therefore can be identified with a tensor field

σ ∈ Γ(F ∗1 ⊗M · · · ⊗M F ∗n ⊗M Fσ−1(1) ⊗M · · · ⊗M Fσ−1(n)

)defined by

(f1 ⊗M · · · ⊗M fn) ·F∗1⊗M ···⊗MF∗n σ := fσ−1(1) ⊗M · · · ⊗M fσ−1(n).

An important feature of such permutation tensor fields is that they are parallel with respect to covariantderivatives on the factors F1, . . . , Fn (see (8.12) for more on this).

6 Pullback BundlesThe pullback bundle, defined below, is a crucial building block for many important bundle constructions, asit enriches the type system dramatically, and allows the tensor formulation of linear algebra to be extendedto the vector bundle setting. In particular, the abstract, global formulation of the space of smooth vectorbundle morphisms over a map φ : M → N is achieved quite cleanly using a pullback bundle. Furthermore,the use of pullback bundles and pullback covariant derivatives simplifies what would otherwise be local co-ordinate calculations, thereby giving more insight into the geometric structure of the problem.

For the duration of this section, let (F , F, π,N) be a smooth bundle having rank r.

Proposition 6.1 (Pullback bundle). Let M and N be smooth manifolds and let φ : M → N be smooth. If

φ∗F := {(m, f) ∈M × F | φ (m) = π (f)} ,

andπφ∗F := prM×F1 |φ∗F : φ∗F →M, (m, f) 7→ m,

then(F , φ∗F, πφ∗F ,M

)defines a smooth bundle. In particular, φ∗F is a smooth manifold having dimension

dimM + rankπ. The bundle defined by πφ∗F is called the pullback of π by φ.

Proof. Recalling that F denotes the typical fiber of π, let Ψ: π−1 (U) → U × F trivialize π over open setU ⊆ N . Define

Ψφ : φ∗(π−1 (U)

)→ φ−1 (U)×F , (m, f) 7→

(m, prU×F2 ◦Ψ (f)

)and

Ψ−1φ : φ−1 (U)×F → φ∗

(π−1 (U)

), (m, f) 7→

(m,Ψ−1 (φ (m) , f)

).

Claim (1): Ψφ and Ψ−1φ are smooth. Proof: φ∗

(π−1 (U)

)⊆ φ−1 (U)×π−1 (U), and Ψφ is clearly smooth

as a map defined on the larger manifold. Therefore it restricts to a smooth map on φ∗(π−1 (U)

). An

analogous argument shows that Ψ−1φ is smooth. Claim (1) proved.

16

Claim (2): Ψ−1φ inverts Ψφ. Proof: Let (m, f) ∈ φ∗

(π−1 (U)

). Then

Ψ−1φ ◦Ψφ (m, f) = Ψ−1

φ

(m, prU×F2 ◦Ψ (f)

)=(m,Ψ−1

(φ (m) ,prU×F2 ◦Ψ (f)

))=(m,Ψ−1

(π (f) ,prU×F2 ◦Ψ (f)

))(since φ (m) = π (f))

=(m,Ψ−1

(prU×F1 ◦Ψ (f) ,prU×F2 ◦Ψ (f)

))=(m,Ψ−1 ◦Ψ (f)

)= (m, f) .

With g ∈ F ,

Ψφ ◦Ψ−1φ (m, g) = Ψφ

(m,Ψ−1 (φ (m) , g)

)=(m,prU×F2 ◦Ψ ◦Ψ−1 (φ (m) , g)

)=(m,prU×F2 (φ (m) , g)

)= (m, g) ,

proving Claim (2).Claim (3): Ψφ trivializes πφ

∗F over φ−1 (U) ⊆M . Proof: Let (m, f) ∈ φ∗(π−1 (U)

). Then

prφ−1(U)×F1 ◦Ψφ (m, f) = pr

φ−1(U)×F1 ◦

(m,prU×F2 ◦Ψ (f)

)= m = πφ

∗F (m, f) ,

and by claims (1) and (2), Ψφ is a diffeomorphism, so Ψφ trivializes πφ∗F over φ−1 (U) ⊆ M . Claim (3)

proved.SinceM can be covered with sets as in claim (3) and since the typical fiber of πφ

∗F is diffeomorphic to F ,this shows that πφ

∗F defines a smooth bundle(F , φ∗F, πφ∗F ,M

). Because φ∗F is locally diffeomorphic to

the product of an open subset of M with F , φ∗F has been shown to be a smooth manifold having dimensiondimM + dimF = dimM + rankπ.

While the pullback bundle is constructed as a submanifold of a direct product, there is a natural bundlemorphism into the pulled-back bundle, which serves as an interface to maps defined on the pulled-back bundle.Usually this morphism is notationally suppressed, just as naturally isomorphic spaces can be identifiedwithout explicit notation.

Corollary 6.2 (Pullback fiber projection bundle morphism). If φ : M → N is smooth, then

ρφ∗FF : φ∗F → F,

(m, f) 7→ f

is a smooth bundle morphism over φ which is an isomorphism when restricted to any fiber of φ∗F .

Because ρφ∗FF is the projection prM×FF |φ∗F , its tangent map is also just the projection prTM⊕TFTF |Tφ∗F .

Proposition 6.3 (Bundle pullback is a contravariant functor). The map of categories

Pullback : Manifold → {Bundle (M) |M ∈ Manifold} ,M 7→ Bundle (M) ,

(φ : M → N) 7→(Bundle (N)→ Bundle (M) , (F , F, π,N) 7→

(F , φ∗F, πφ

∗F ,M))

is a contravariant functor. Here, naturally isomorphic bundles in Bundle (M), for each manifold M , areidentified (along with the corresponding morphisms).

17

Proof. Noting thatId∗N F = {(n, f) ∈ N × F | IdN (n) = π (f)} ∼= F

and that

(Id∗N π) (n, f) =(prN×F1 |Id∗N F

)(n, f) = n = π (f)

=⇒ Id∗N π∼= π,

it follows that Pullback (IdN ) = IdBundle(N) = IdPullback(N), i.e. Pullback satisfies the identity axiom offunctoriality.

For the contravariance axiom, let φ : M → N and ψ : L → M be smooth manifold morphisms and let(F , F, π,N) be a smooth bundle. Then

ψ∗φ∗F ={

(`, p) ∈ L× φ∗F | ψ (`) = πφ∗F (p)

}={

(`, (m, f)) ∈ L× (M × F ) | ψ (`) = πφ∗F (m, f) and φ (m) = π (f)

}= {(`, (m, f)) ∈ L× (M × F ) | ψ (`) = m and φ (m) = π (f)}∼= {(`, f) ∈ L× F | φ ◦ ψ (`) = π (f)}= (φ ◦ ψ)

∗F

and

πψ∗φ∗F (`, (m, f)) =

(prL×φ

∗F1 |ψ∗φ∗F

)(`, (m, f)) = ` and

π(φ◦ψ)∗F (`, f) =(prL×F1 |(φ◦ψ)∗F

)(`, f) = `,

showing that πψ∗φ∗F ∼= π(φ◦ψ)∗F , and therefore

Pullback (ψ) ◦ Pullback (φ) = Pullback (φ ◦ ψ) ,

establishing Pullback as a contravariant functor.

The space of sections of a pullback bundle is easily quantified.

Γ (φ∗F ) ={σ ∈ C∞ (M,φ∗F ) | πφ

∗F ◦ σ = IdM

}.

This space will be central in the theory developed in the rest of this paper. Furthermore, it is naturallyidentified with the space of sections along the pullback map;

Γφ (F ) :={

Σ ∈ C∞ (M,F ) | πF ◦ Σ = φ}.

These spaces are naturally isomorphic to one another, and therefore an identification can be made whenconvenient. While the former space is more correct from a strongly typed standpoint, the latter space is aconvenient and intuitive representational form. The particular correspondence depends heavily on the factthat φ∗F is a submanifold of M × F .

Γ (φ∗F ) ∼= Γφ (F )

σ 7→ prM×F2 ◦σ,IdM ×MΣ ← [ Σ.

Furthermore, if f ∈ Γ (F ), then f ◦ φ ∈ Γφ (F ). Note that it is not true that any σ ∈ Γφ (F ) can be writtenas f ◦ φ for some f ∈ Γ (F ), for example when there exists some distinct p, q ∈ M such that φ (p) = φ (q)and σ (p) 6= σ (q). Furthermore, the representation f ◦ φ is generally non-unique, for example when φ is notsurjective, sections f1, f2 ∈ Γ (F ) which differ only away from the image of φ will still give f1 ◦ φ = f2 ◦ φ.Before developing the notion of a linear connection on a pullback bundle, it will be necessary to addressthese features which, while inconvenient, provide the strength of the pullback bundle and pullback covariantderivative (see (8.8)).

18

Lemma 6.4 (Local representation of Γφ (F ) elements). Recall that r denotes the rank of smooth bundleF . If σ ∈ Γφ (F ) then each point p ∈ M has some neighborhood U in which σ can be written locally asσ |U= σi fi ◦ φ |U , where f1, . . . , fr ∈ Γ

(F |φ(U)

)is a frame for F |φ(U), and σ1, . . . , σr ∈ C∞ (U,R) are

defined by σi =(f i ◦ φ |U

)·F σ |U .

Proof. Let p ∈ M , let V ⊆ N be a neighborhood of φ (p) over which F |V is trivial, and let U = φ−1 (V ),so that U is a neighborhood of p. Let f1, . . . , fr ∈ Γ (F |V ) be a frame for F |V (i.e. F |φ(U)), and letf1, . . . , fr ∈ Γ

((F |V )

∗) be the corresponding coframe (i.e. the unique f1, . . . , fr such that f i ·F fj = δij foreach i, j). Define σi ∈ C∞ (M,R) by σi =

(f i ◦ φ |U

)·F σ |U . Then

σi fi ◦ φ |U =(f i ◦ φ |U

)·F σ |U fi ◦ φ |U

=((fi ◦ φ |U )⊗U

(f i ◦ φ |U

))·F σ |U

=((fi ⊗V f i

)◦ φ |U

)·F σ |U

=(IdF |V ◦φ |U

)·F σ |U

= σ |U ,

as desired.

Some literature uses expressions of the form f ◦ φ ∈ Γφ (F ) along with an implicit use of the section-identifying isomorphism to write down particular sections of pullback bundles. In most cases, this tacitidentification of spaces is harmless, but certain highly involved calculations may suffer from it. The sectionthat f ◦ φ corresponds to under said isomorphism is IdM ×M (f ◦ φ) ∈ Γ (φ∗F ). However, because thisexpression is unwieldy and therefore a more compact and contextually meaningful expression is called for.

Definition 6.5 (Pullback section). If f ∈ Γ (F ) and φ : M → N is smooth, then define

φ∗f := IdM ×M (f ◦ φ) ∈ Γ (φ∗F ) .

This is known as a pullback section.

The pullback section is deservedly named. If φ : M → N and ψ : L → M are smooth, then ψ∗φ∗f ∼=(φ ◦ ψ)

∗f in the sense of the proof of (6.3).

Proposition 6.6 (Bundle pullback commutes with tensor product). If E and F are smooth vector bundlesover manifold N and φ : M → N is smooth, then the map

φ∗E ⊗M φ∗F → φ∗ (E ⊗N F ) ,

(m, e)⊗M (m, f) 7→ (m, e⊗N f)

(extended linearly to general tensors) is a smooth vector bundle isomorphism.

Proof. Let c denote the above map. The well-definedness of c comes from the universal mapping property onmultilinear forms which induces a linear map on a corresponding tensor product. If c ((m, e)⊗M (m, f)) = 0,then e ⊗N f = 0, which implies that e = 0 or f = 0, and therefore that (m, e) ⊗M (m, f) = 0. Becausethere exists a basis for (φ∗E ⊗M φ∗F )m consisting only of simple tensors, this implies that c is injective, andby a dimensionality argument, that c is an isomorphism. The map is clearly smooth and respects the fiberstructures of its domain and codomain. Thus c is a smooth vector bundle isomorphism.

The contravariance of pullback and its naturality with respect to tensor product are two essential prop-erties which provide some of the flexibility and precision of the strongly typed tensor formalism described inthis paper. This will become quite apparent in Part II.

19

Remark 6.7 (Tensor field formulation of smooth vector bundle morphisms). A particularly useful applicationof pullback bundles is in forming a rich type system for smooth vector bundle morphisms. This approach wasinspired by [20, pg. 11]. Let πE : E →M and πF : F → N be smooth vector bundles, and let φ : M → N besmooth. Consider Homφ (E,F ), i.e. the space of smooth vector bundle morphisms over the map φ. There isa natural identification with another space which lets the base map φ play a more direct role in the space’stype. In particular,

Homφ (E,F ) ∼= HomIdM (E, φ∗F ) ,

A 7→ πE ×E A,prM×F2 ◦B ←[ B.

This particular identification of smooth vector bundle morphisms over φ can now be directly translated intothe tensor field formalism, analogously to (3.1).

Γ (φ∗F ⊗M E∗) → HomIdM (E, φ∗F ) ,

A 7→ (e 7→ A ·E e) .

The inverse image of B ∈ HomIdM (E, φ∗F ) is given locally; let (ei) and (fi) denote local frames for E andF in neighborhoods U ⊆ M and V ⊆ N respectively, with φ (U) ⊆ V , and let

(ei)and

(f i)denote their

dual coframes. Then the tensor field corresponding to B is given locally in U by Bij φ∗fi ⊗M ej , where

Bij := φ∗df i ◦B ◦ ej ∈ C∞ (U,R).Quantifying smooth vector bundle morphisms as the tensor fields lends itself naturally to doing calculus

on vector and tensor bundles, as the relevant derivatives (covariant derivatives) take the form of tensor fields.The type information for a particular vector bundle morphism is encoded in the relevant tensor bundle.

7 Tangent Map as a Tensor FieldThis section deals specifically with the tangent map operator by using concepts from Section 5 and Section6 to place it in a strongly typed setting and to prepare to unify a few seemingly disparate concepts andnotation for some tangible benefit (in particular, see Section 10).

Given a smooth map φ : M → N , its tangent map Tφ : TM → TN is a smooth vector bundle morphismover φ, so by (6.7), is naturally identified with a tensor field

∇◦ M→Nφ ∈ Γ (φ∗TN ⊗M T ∗M) ,

which may be denoted by ∇◦ φ where type pedantry is deemed unnecessary. This construction is known as atwo-point tensor field [11, pg. 70]. The inscribed ◦ symbol in ∇◦ is used to denote that this is a nonlinearderivative, thereby distinguishing it from a linear covariant derivative.

Remark 7.1 (Generalized covariant derivative). The well-known one-to-one correspondence between linearconnections and linear covariant derivatives [8, pg. 520] generalizes to a one-to-one correspondence betweenEhresmann connections and a generalized notion of covariant derivative. To give a partial definition for thepurposes of utility, a generalized covariant derivative on a smooth [fiber] bundle F → N is a map ∇ onΓ (F ) such that ∇σ ∈ Γ (σ∗V F ⊗N T ∗N) for each σ ∈ Γ (F ). The space of maps C∞ (M,N) is naturallyidentified as Γ (N ×�M), and there is a natural Ehresmann connection on the bundle N ×� M , whose cor-responding covariant derivative is the tangent map operator. This is the subject of another of the author’spapers and will not be discussed here further. This is mentioned here to incorporate linear covariant deriva-tives (to be introduced and discussed in Section 8) and the tangent map operator (a nonlinear covariantderivative) under the single category “covariant derivative”.

20

There is a subtle issue regarding construction of the cotangent map of φ which is handled easily by thetensor field construction. In particular, while the cotangent map T ∗φ is the pointwise adjoint of the tangentmap Tφ, i.e. for each p ∈M , Tpφ : TpM → Tφ(p)N is linear and T ∗p φ : T ∗φ(p)N → T ∗pM is the adjoint of Tpφ,it does not follow that T ∗φ ∈ Hom (T ∗N,T ∗M), being some sort of “total adjoint” of Tφ ∈ Hom (TM, TN).The obstruction is due to the fact that φ may not be surjective, so there may be some fiber T ∗qN that is not ofthe form T ∗φ(p)N , and therefore the domain could not be all of T ∗N . Furthermore, even if φ were surjective,if it weren’t also injective, say φ (p0) = φ (p1) for some distinct p0, p1 ∈ M , then T ∗φ(p0)N = T ∗φ(p1)N , andTp0M 6= Tp1M , so the action on the fiber T ∗φ(p0)N is not well-defined.

In the tensor field parlance, the cotangent map T ∗φ simply takes the form

(∇◦ φ)(1 2) ∈ Γ (T ∗M ⊗M φ∗TN) .

The permutation superscript (1 2) is used here instead of ∗ to distinguish it notationally from pullback nota-tion, which will be necessary in later calculations. The key concept is that the tensor field (∇◦ φ)

(1 2) encodesthe base map φ; the basepoint p ∈M is part of the domain φ∗T ∗N itself.

The chain rule in the tensor field formalism makes use of the bundle pullback. If ψ : L→ M is smooth,then

∇◦ L→N (φ ◦ ψ) = ψ∗∇◦ M→Nφ ·ψ∗TM ∇◦ L→Mψ.

Because ∇◦ ψ ∈ Γ (ψ∗TM ⊗L T ∗L), to form a well-defined natural pairing, the use of the pullback

ψ∗∇◦ φ ∈ Γ (ψ∗ (φ∗TN ⊗M T ∗M)) = Γ (ψ∗φ∗TN ⊗L ψ∗T ∗M) = Γ((φ ◦ ψ)

∗TN ⊗L ψ∗T ∗M

)is necessary (instead of just ∇◦ φ ∈ Γ (φ∗TN ⊗M T ∗M)).

Sometimes it is useful to discard some type information and write ∇◦ φ ∈ Γφ×M IdM (TN ⊗N×M T ∗M),i.e. ∇◦ φ : M → TN ⊗N×M T ∗M such that

(πTNN ⊗N×M πT

∗MM

)◦ ∇◦ φ = φ ×M IdM . This is easily

done by the canonical fiber projeciton available to all pullback bundle constructions; φ∗TN ⊗M T ∗M ∼=(φ×M IdM )

∗(TN ⊗N×M T ∗M), and the canonical fiber projection is

ρ(φ×M IdM )∗(TN⊗N×MT∗M)TN⊗N×MT∗M : (φ×M IdM )

∗(TN ⊗N×M T ∗M)→ TN ⊗N×M T ∗M,

as defined in (6.2). The granularity of the type system should reflect the weight of the calculations beingperformed. For demonstration of contrasting situations, see the discussion at the beginning of Section 8 andthe computation of the first variation in (12.1).

It is important to have notation which makes the distinction between the smooth vector bundle morphismformalism and the tensor field formalism, because it may sometimes be necessary to mix the two, thoughthis paper will not need this. An added benefit to the tensor field formulation of tangent maps is that certainnotions regarding derivatives can be conceptually and notationally combined, for example in Section 10.

8 Linear Covariant DerivativesAs will be shown in the following discussion, a linear covariant derivative (commonly referred to in thestandard literature without the “linear” qualifier) provides a way to generalize the notion in elementary cal-culus of the differential of a vector-valued function. The linear covariant derivative interacts naturally withthe notion of the pullback bundle, and this interaction leads naturally to what could be called a covariantderivative chain rule, which provides a crucial tool for the tensor calculus computations seen later.

Let V andW be finite-dimensional vector spaces let U ⊆ V be open, and let φ : U →W be differentiable.Recall from elementary calculus the differential Dφ : U → W ⊗ V ∗ (essentially matrix-valued). There

21

is no base map information encoded in Dφ (i.e. φ can’t be recovered from Dφ alone), it contains onlyderivative information. The vector space structure of V and W allows the trivializations TU ∼= V ×� U andTW ∼= W ×�W , where the first factors are the base spaces and the second factors are the fibers (see (4.1)).The tangent map ∇◦ U→Wφ : U → TW ⊗W×U T ∗U (see Section 7) has a codomain that can be trivializedsimilarly;

TW ⊗W×U T ∗U ∼= (W ×�W )⊗W×U (V ∗×�U) ∼= (W ⊗ V ∗) ×� (W × U) .

Because (W ⊗ V ∗) ×� (W × U), as a set, is a direct product, it can be decomposed into two factors. Lettingpr1 and pr2 be the projections onto the first and second factors respectively,

pr1 ◦∇◦ φ : U →W ⊗ V ∗ and pr2 ◦∇◦ φ : U →W × U.

The map pr2 ◦∇◦ φ is the element of Γ (W ×�U) identified with the base map φ itself; prW×UW ◦ pr2 ◦∇◦ φ = φ.This base map information is discarded in defining the differential of φ as Dφ := pr1 ◦∇◦ φ; the fiber portionof ∇◦ φ. This construction relies critically on the natural isomorphism TW ∼= W ×�W for a vector space W .

An analogous construction shows that the differential Dφ of a map φ is well-defined even when its domainis a manifold. However, when the codomain of a map φ is only a manifold, there does not in general exist anatural trivialization of its tangent bundle (in contrast to the vector space case), and therefore Dφ can’t bedefined without additional structure. A linear covariant derivative provides the missing structure.

For the remainder of this section, let π : E → N define a smooth vector bundle having rank r.

A linear covariant derivative on E provides a means of taking derivatives of sections of E (i.e. mapsσ : N → E such that π ◦ σ = IdN ) without passing to a higher tangent bundle as would happen under thetangent map functor (i.e. if σ ∈ Γ (E) then Tσ : TN → TE and ∇◦ N→Eσ : N → TE ⊗E×N T ∗N). A linearcovariant derivative provides an effective “trivialization” of TE analogous to the trivialization TW ∼= W ×�Was discussed above, discarding all but the “fiber” portion of the derivative, allowing the construction of anobject known as the total linear covariant derivative analogous to the differential Dφ as discussed above.

The notion of a linear covariant derivative on a vector bundle is arguably the crucial element of differentialgeometry3. In particular, this operator implements the product rule property common to anything that canbe called a derivation – a property which is particularly conducive to the operation of tensor calculus. Thetotal linear covariant derivative of a vector field (i.e. section of a vector bundle) allows the generalizationof many constructions in elementary calculus to the setting of smooth vector bundles equipped with linearcovariant derivatives. For example, the divergence divX := trDX of a vector field X on Rn generalizes tothe divergence divX := tr∇X of a vector field X on N , which has an analogous divergence theorem amongother qualitative similarities.

Remark 8.1 (Natural linear covariant derivative on trivial line bundle). Before making the general definitionfor the linear covariant derivative, a natural linear covariant derivative will be introduced. With N denotinga smooth manifold as before, if f ∈ C∞ (N,R), then df ∈ Γ (T ∗N) is the differential of f . Let

∇p N→Rf := df.

Because C∞ (N,R) is naturally identified with Γ (R×�N), this is essentially the natural linear covariantderivative on the trivial line bundle R×�N . Note that there is an associated product rule; if f, g ∈ C∞ (N,R),then fg ∈ C∞ (N,R), and

∇p N→R (fg) = d (fg) = g df + f dg = g∇p N→Rf + f∇p N→Rg.

When clear from context, the superscript decoration can be omitted and the derivative denoted as ∇p f .3The Fundamental Lemma of Riemannian Geometry establishes the existence of the Levi-Civita connection[9, pg. 68],

which is a linear covariant derivative satisfying certain naturality properties.

22

Definition 8.2 (Linear covariant derivative). A linear covariant derivative on a vector bundle definedby π : E → N is an R-linear map ∇p E : Γ (E)→ Γ (E ⊗N T ∗N) satisfying the product rule

∇p E (f ⊗N σ) = σ ⊗N ∇p N→Rf + f ⊗N ∇p Eσ, (8.1)

where f ∈ C∞ (N,R) and σ ∈ Γ (E). The switch in order in the first term of the expression is necessary toform a tensor field of the correct type, Γ (E ⊗N T ∗N). If σ ∈ Γ (E), then the expression ∇p Eσ is knownas the total [linear] covariant derivative of σ. If ∇p Eσ = 0 [in a subset U ⊆ N ], then σ is said to beparallel [on U ]. The “linear” qualifier is implied in standard literature and is therefore often omitted.

The inscribed p in ∇p is to indicate that the covariant derivative is linear, and can be omitted when clearfrom context, or when it is unnecessary to distinguish it from the nonlinear tangent map operator whosedecorated symbol is ∇◦ . For the remainder of this section, this distinction will not be necessary, so anundecorated ∇ will be used.

For V ∈ Γ (TN), it is customary to denote∇Eσ·V by∇EV σ, where V indicates the “directional” componentof the derivative. Following this convention, the product rule can be written in a form where the productrule is more obvious;

∇EV (f ⊗N σ) = ∇N→RV f ⊗N σ + f ⊗N ∇EV σ.

A covariant derivative is a local operator with respect to the base space N ; if p ∈ N , then(∇Eσ

)(p)

depends only on the restriction of σ to an arbitrarily small neighborhood of p [9, pg. 50], and thereforethe restriction ∇E|U : ΓU (E)→ ΓU (E ⊗N T ∗N) makes sense, allowing calculations using local expressions.Furthermore, a covariant derivative can be constructed locally and glued together under certain conditions.See [8, pg. 503] for more on this, and as a reference for general theory on bundles, covariant derivatives, andconnections.

Linear covariant derivatives on several vector bundle constructions will now be developed. In analogyto defining a linear map by its action on a generating subset (e.g. a basis or a dense subspace) and thenextending using the linear structure, Lemma (8.6) allows a covariant derivative to be defined on a generatingsubset (which can be chosen to make the defining expression particularly natural) and then extending. Inthis case, the relevant space is the space of sections of the vector bundle, which is a module over the ringof smooth functions on a manifold, and the extension process is done via linearity and the product rule(see (8.2)). This approach will allow the local trivialization implementation details to be hidden within theproof of Lemma (8.6) – an example of information hiding – so that constructions of covariant derivativescan proceed clearly by focusing only on the natural properties of the relevant objects and then invoking thelemma to do the “dirty” work (see (8.7) and (8.9)).

A bit of useful notation will be introduced to simplify the next definition. If G ⊆ Γ is a subset of aC∞ (N,R)-module Γ whose elements are functions on N (and therefore have a notion of restriction to asubset) and U ⊆ N is open, then let GU denote the set of restrictions of the elements of G to the set U .Note that GU ⊆ ΓU by construction.

Definition 8.3 (Finitely generating subset). Say that a subset of a module finitely generates the moduleif the subset contains a finite set of generators for the module.

Definition 8.4 (Locally finitely generating subset). If Γ is a C∞ (N,R)-module and G ⊆ Γ, then G is saidto be a locally finitely generating subset of Γ if each point q ∈ N has a neighborhood U ⊆ N for whichGU finitely generates ΓU .

The space of sections of a vector bundle is the archetype for the above definition. The locally trivialnature of π : E → N allows local frames to be chosen in a neighborhood of each point of N , from which globalsmooth sections (though not necessarily a global frame) can be made using a partition of unity subordinateto the trivializing neighborhoods. The set of such global sections forms a locally finite generating subset ofΓ (E).

23

Lemma 8.5. If G is a locally finitely generating subset of Γ (E), then each point in N has a neighborhoodU ⊆ N and e1, . . . , er ∈ GU such that e1, . . . , er forms a frame for ΓU (E). In other words, a local frame canbe chosen out of G near each point in N .

Proof. Let q ∈ N and let V ⊆ N be a neighborhood of q for which GV = {g1, . . . , g`} finitely generatesΓV (E) (here, ` ≥ r, recalling that r = rankE). Without loss of generality, let g1 (q) , . . . , gr (q) be lin-early independent (this is possible because {g1 (q) , . . . , g` (q)} spans the vector space Eq). Because gi iscontinuous for each i and the linear independence of the sections g1, . . . , gr is an open condition (defined byL−1 (R\ {0}) where L : N →

∧rEq, p 7→ g1 (p) ∧ · · · ∧ gr (p)), there is a neighborhood U ⊆ V of q for which

{g1 (p) , . . . , gr (p)} is a linearly independent set for each p ∈ U . Finally, letting ei := gi |U for i ∈ {1, . . . , r},the sections e1, . . . , er ∈ GU form a frame for ΓU (E).

The following lemma shows that defining a covariant derivative on a locally finitely generating subset ofthe space of sections of a vector bundle is sufficient to uniquely define a covariant derivative on the wholespace. The particular generating subset can be chosen so the covariant derivative has a particularly naturalexpression within that subset.

Lemma 8.6 (Linear covariant derivative construction). Let G be a locally finite generating subset of Γ (E).If ∇G : G → Γ (E ⊗N T ∗N) satisfies the linear covariant derivative axioms4, then there is a unique linearcovariant derivative ∇E : Γ (E)→ Γ (E ⊗N T ∗N) whose restriction to G is ∇G.

Proof. If q ∈ N , then by (8.5) there exists a neighborhood U ⊆ N of q for which there are e1, . . . , er ∈ GUforming a frame for E |U . If σ ∈ Γ (E), then σ |U= σiei for some σ1, . . . , σr ∈ C∞ (U,R) (specifically,σi = ei ·E σ |U , where e1, . . . , er ∈ ΓU (E∗) denotes the dual coframe of e1, . . . , er). Define ∇E : Γ (E) →Γ (E ⊗N T ∗N) locally on ΓU (E) so as to satisfy the product rule

∇E (σ |U ) := ei ⊗N ∇N→Rσi + σi ⊗N ∇Gei.

To show well-definedness, let f1, . . . , fr ∈ GU be another frame for E |U . Then σ = τ ifi for some τ1, . . . , τ r ∈C∞ (U,R). Let Ψ: ΓU (E)→ ΓU (E) be the unique smooth vector bundle isomorphism such that fi = Ψ·E ei.Writing Ψ and Ψ−1 with respect to the frame (ei) as Ψi

jei ⊗ ej and(Ψ−1

)ijei ⊗ ej respectively, it follows

that fi = Ψjiej and τ

i = σj(Ψ−1

)ij. Then

∇E(τ ifi

)= fi ⊗N ∇N→Rτ i + τ i ⊗N ∇Gfi

= Ψjiej ⊗N ∇

N→R(σk(Ψ−1

)ik

)+ σj

(Ψ−1

)ij⊗N ∇G

(Ψki ek)

= Ψjiej(Ψ−1

)ik⊗N ∇N→Rσk + Ψj

iejσk ⊗N ∇N×R

(Ψ−1

)ik

+ σj(Ψ−1

)ijek ⊗N ∇N→RΨk

i + σj(Ψ−1

)ij

Ψki ⊗∇Gek

= δjkej ⊗N ∇N→Rσk + σjδkj ⊗∇Gek + σ`ek ⊗N ∇N→R

(Ψki

(Ψ−1

)i`

)= ∇E

(σiei

).

The last equality follows because Ψki

(Ψ−1

)i`

= δk` , which is a constant function, so ∇N→R(

Ψki

(Ψ−1

)i`

)= 0.

Thus the expression defining ∇E doesn’t depend on the choice of local frame. This establishes the well-definedness of ∇E .

Clearly the restriction of ∇E to G is ∇G. This establishes the claim of existence. Uniqueness followsfrom the fact that ∇E is defined in terms of the maps ∇N→R and ∇G.

Lemma (8.6) is used in the proof of the following proposition to allow a natural formulation of thepullback covariant derivative with respect to a natural locally finite generating subset of Γ (φ∗E), in whichthe relevant derivative has a natural chain rule.

4What is meant by this is that the product rule must only be satisfied on λ⊗N g if λg ∈ G, where λ ∈ C∞ (N,R) and g ∈ G.

24

Proposition 8.7 (Pullback covariant derivative). If φ : M → N is smooth and ∇E is a covariant derivativeon E, then there is a unique covariant derivative ∇φ∗E on φ∗E satisfying the chain rule

∇φ∗Eφ∗e = φ∗∇Ee ·φ∗TN ∇◦ M→Nφ

for all e ∈ Γ (E).

Proof. Let G := {σ ∈ Γ (φ∗E) | σ = φ∗e for some e ∈ Γ (E)}, noting that a local frame e1, . . . , erankE ∈ΓU (E) over open set U ⊆ N induces a local frame φ∗e1, . . . , φ

∗erankE ∈ Γφ−1(U) (φ∗E), so G is a locallyfinite generating subset of Γ (φ∗E). Define

∇G : G → Γ (φ∗E ⊗M T ∗N) ,

φ∗e 7→ φ∗∇Ee ·φ∗TN ∇◦ M→Nφ.

The well-definedness and R-linearity of ∇G comes from that of ∇E . For the product rule, if λ ∈ C∞ (M,R)and e ∈ Γ (E), then the product λ⊗M φ∗e is an element of G if and only if λ = φ∗µ for some µ ∈ C∞ (N,R),in which case, λ⊗M φ∗e = φ∗µ⊗M φ∗e = φ∗ (µ⊗N e). Then it follows that

∇G (λ⊗M φ∗e) = ∇Gφ∗ (µ⊗N e)

= φ∗∇E (µ⊗N e) ·φ∗TN ∇◦ M→Nφ= φ∗

(e⊗N ∇N→Rµ+ µ⊗N ∇Ee

)·φ∗TN ∇◦ M→Nφ

= φ∗(e⊗N ∇N→Rµ

)·φ∗TN ∇◦ M→Nφ+ φ∗

(µ⊗N ∇Ee

)·φ∗TN ∇◦ M→Nφ

= φ∗e⊗M(φ∗∇N→Rµ ·φ∗TN ∇◦ M→Nφ

)+ φ∗µ⊗M

(φ∗∇Ee ·φ∗TN ∇◦ M→Nφ

)= φ∗e⊗M ∇M→Rφ∗µ+ φ∗µ⊗M ∇Gφ∗e= φ∗e⊗M ∇M→Rλ+ λ⊗M ∇Gφ∗e,

which is exactly the required product rule. By (8.6), there exists a unique covariant derivative ∇φ∗E on φ∗Ewhose restriction to G is ∇G.

The full notation ∇φ∗E is often cumbersome, so it may be denoted by ∇φ when the pulled-back bundleis clear from context.

Remark 8.8. There is an important feature of a pullback covariant derivative in the case that pullback map isnot an immersion; the pullback covariant derivative may be nonzero even where the pullback map is singular.This fact can be obscured by a certain abuse of notation which often comes in the expression of the geodesicequations in differential geometry (see (12.8)). An example will illustrate this point.

Let ∇TM be a covariant derivative on πTMM : TM → M . Let Θ: R → TM be a unit-length vector fieldwhich describes the location of a person (the basepoint) and direction s/he is looking (the fiber portion)with respect to time (let R have standard coordinate t). Define θ : R → M by θ := πTMM ◦ Θ, so that θ isthe base map of Θ, i.e. θ has discarded the direction information and only encodes the location information.Say that for some closed interval I ⊆ R, dθ

dt |I is identically zero (and so is not an immersion), but thatdΘdt |I is nonvanishing; see Figure 8.1. Mathematically, this means that during this time, Θ is varying onlywithin a single fiber of TM . Physically, this means that during this time, the person is standing still butthe direction s/he is looking is changing. Passing to a higher tangent space is often undesirable (note thatdΘdt takes values in TTM), so to avoid this, a covariant derivative is used. In order to be meaningful, thecovariant derivative must capture this fiber-only variation.

Because Θ is a vector field along θ, it can be written as Θ ∈ Γ (θ∗TM), and the covariant derivative onTM induces a pullback covariant derivative on θ∗TM , which has base space R. In other words, θ∗TM isparameterized by time. Then ∇θ∗TMd

dt

Θ ∈ Γ (θ∗TM) is the desired covariant derivative of Θ with respect totime. A coordinate-based calculation will be made to make completely obvious why this pullback covariantderivative captures the desired information. Let

(xi)be local coordinates on M and, for simplicity, assume

25

θ

θ′(t)

Θ(t)

M

Θ(t) varies for t ∈ I.

θ(t)

θ(t) is constant for t ∈ I,θ′(t) = 0 for t ∈ I.

Figure 8.1: A picture of the manifoldM , path θ, and vector fields θ′ and Θ. The blue dots represent θ (t) atcertain points t ∈ R, while the green and red arrows represent θ′ (t) and Θ (t) at at these points respectively.Note that Θ is a unit-length vector field along θ and varies within I, whereas θ′ is a vector field along θ thatvanishes within I.

that the image of θ lies entirely within this coordinate chart. Because (∂i) is a local frame for TM , (θ∗∂i)is a local frame for θ∗TM , by (6.4) and Θ ∈ Γ (θ∗TM) can be written locally as Θ (t) = Θi (t) (θ∗∂i) (t) forsome functions

(Θi : R→ R

). Then

∇θ∗TMddt

Θ = ∇θ∗TMddt

(Θi θ∗∂i

)=(∇ d

dtΘi)θ∗∂i + Θi∇θ

∗TMddt

θ∗∂i

=dΘi

dtθ∗∂i + Θi θ∗∇TM∂i ·θ∗TM ∇◦ R→Mθ ·TR

d

dt

=dΘi

dtθ∗∂i + Θi θ∗∇TM∂i ·θ∗TM

dθ

dt.

Note that ∇◦ R→Mθ ∈ Γ (θ∗TM). Within the interval I, dθdt vanishes, so the second term vanishes on I.

However, because Θ is varying in a fiber-only direction within I, the basepoint is not changing and dΘi

dt θ∗∂i

can be identified with an elementary vector space derivative (the fiber is a vector space and so an elementaryderivative is well-defined there). This fiber-direction derivative is nonvanishing by assumption, so ∇θ∗TMd

dt

Θ

is nonvanishing on I as desired.

Introducing a bit of natural notation which will be helpful for the next result, if X ∈ Γ (E) and Y ∈ Γ (F ),then define X ⊕ Y ≡ X ⊕M×N Y ∈ Γ (E ⊕M×N F ) and X ⊗ Y ≡ X ⊗M×N Y ∈ Γ (E ⊗M×N F ) by

(X ⊕M×N Y ) (p, q) := X (p)⊕ Y (q) and (X ⊗M×N Y ) (p, q) := X (p)⊗ Y (q)

for each (p, q) ∈M ×N .

26

Proposition 8.9 (Induced covariant derivatives on E⊕M×NF and E⊗M×NF ). If ∇E and ∇F are covariantderivatives on E and F respectively, then there are unique covariant derivatives

∇E⊕M×NF : Γ (E ⊕M×N F )→ Γ ((E ⊕M×N F )⊗M×N (T ∗M ⊕M×N T ∗N))

and∇E⊗M×NF : Γ (E ⊗M×N F )→ Γ ((E ⊗M×N F )⊗M×N (T ∗M ⊕M×N T ∗N))

on E ⊕ F and E ⊗ F respectively, satisfying the sum rule

∇E⊕Fu⊕v (X ⊕ Y ) = ∇EuX ⊕∇Fv Y

and the product rule∇E⊗Fu⊕v (X ⊗ Y ) = ∇EuX ⊗ Y +X ⊗∇Fv Y,

respectively, where X ∈ Γ (E), Y ∈ Γ (F ), and u ⊕ v ∈ TM ⊕ TN . Here, TM ⊕ TN → M × N (and itsdual) is used instead of the isomorphic vector bundle T (M ×N)→M ×N (and its dual).

Proof. Suppressing the pedantic use of the M ×N subscript to avoid unnecessary notational overload, theset G := {e⊕ f | e ∈ Γ (E) , f ∈ Γ (F )} is a locally finite generator of Γ (E ⊕ F ), since local frames for E⊕Ftake the form {ei ⊕ 0, 0⊕ fj}, where {ei} and {fj} are local frames for E and F respectively. Define

∇G : G → Γ ((E ⊕ F )⊗ (T ∗M ⊕ T ∗N)) ,

X ⊕ Y 7→(u⊕ v 7→ ∇EuX ⊕∇Fv Y

), where u⊕ v ∈ TM ⊕ TN.

This map is well-defined and R-linear by construction, since the connections ∇E and ∇F are well-definedand R-linear. If λ ∈ C∞ (M ×N,R), X ∈ Γ (E), and Y ∈ Γ (F ), then the product λ⊗ (X ⊕ Y ) is in G (i.e.has the form X ⊕ Y for some X ∈ Γ (E) and Y ∈ Γ (F )) if and only if λ is constant. Thus the product rule(restricted to elements of G) reduces to R-linearity, which is already satisfied. By (8.6), there exists a uniqueconnection ∇E⊕F on E ⊕ F whose restriction to G is ∇G.

Similarly, the set H := {e⊗ f | e ∈ Γ (E) , f ∈ Γ (F )} is a locally finite generator of Γ (E ⊗ F ), since localframes for E ⊗ F take the form {ei ⊗ fj}, where {ei} and {fj} are local frames for E and F respectively.Define

∇H : H → Γ ((E ⊗ F )⊗ (T ∗M ⊕ T ∗N)) ,

X ⊗ Y 7→(u⊕ v 7→ ∇EuX ⊗ Y +X ⊗∇Fv Y

), where u⊕ v ∈ TM ⊕ TN.

This map is well-defined and R-linear by construction, since the connections ∇E and ∇F are well-defined andR-linear. For the product rule, with λ ∈ C∞ (M ×N,R), X ∈ Γ (E), and Y ∈ Γ (F ), the product λ⊗(X ⊗ Y )is in H if and only if there exist µ ∈ C∞ (M,R) and ν ∈ C∞ (N,R) such that λ = µ⊗ν ∈ (R×�M)⊗(R×�N)(noting that then λ⊗M×N (X ⊗ Y ) = (µ⊗ ν)⊗M×N (X ⊗ Y ) = (µ⊗M X)⊗ (ν ⊗N Y )). In this case, withu⊕ v ∈ TM ⊕ TN ,

∇Hu⊕v (λ⊗M×N (X ⊗ Y ))

= ∇Hu⊕v ((µ⊗ ν)⊗M×N (X ⊗ Y ))

= ∇Hu⊕v ((µ⊗M X)⊗ (ν ⊗N Y ))

= ∇Eu (µ⊗M X)⊗ (ν ⊗N Y ) + (µ⊗M X)⊗∇Fv (ν ⊗N Y )

=(∇M→Ru µ⊗M X

)⊗ (ν ⊗N Y ) +

(µ⊗M ∇EuX

)⊗ (ν ⊗N Y )

+ (µ⊗M X)⊗(∇N→Rv ν ⊗N Y

)+ (µ⊗M X)⊗

(ν ⊗N ∇Fv Y

)=(∇M→Ru µ⊗ ν + µ⊗∇N→R

v ν)⊗M×N (X ⊗ Y ) + λ⊗M×N

(∇EuX ⊗ Y +X ⊗∇Fv Y

)= ∇M×N→R

u⊕v λ⊗M×N (X ⊗ Y ) + λ⊗M×N ∇Hu⊕v (X ⊗ Y ) ,

which is exactly the required product rule. By (8.6), there exists a unique connection ∇E⊗F on E⊗F whoserestriction to H is ∇H .

27

Remark 8.10 (Naturality of the covariant derivatives on E⊕M×N F and E⊗M×N F ). Letting pri := prM×Ni

(i ∈ {1, 2}) for brevity, the maps

ξ : E ⊕M×N F → pr∗1 E ⊕M×N pr∗2 F,

e⊕ f 7→((πE ⊕ πF

)(e⊕ f) , e

)⊕M×N

((πE ⊕ πF

)(e⊕ f) , f

)and

ψ : E ⊗M×N F → pr∗1 E ⊗M×N pr∗2 F,

e⊗ f 7→((πE ⊗ πF

)(e⊗ f) , e

)⊗M×N

((πE ⊗ πF

)(e⊗ f) , f

),

each extended linearly to the rest of their domains, are easily shown to be smooth vector bundle isomorphismsover IdM×N . Then

∇E⊕Fz (X ⊕ Y ) = ξ−1(∇pr∗1 E⊕M×Npr∗2 Fz ξ (X ⊕ Y )

)and

∇E⊗Fz (X ⊗ Y ) = ψ−1(∇pr∗1 E⊗M×Npr∗2 Fz ψ (X ⊗ Y )

)for all X ∈ Γ (E), Y ∈ Γ (F ), and z ∈ T (M ×N), showing that the connections on E ⊕ F and E ⊗ F are ξand ψ-related to the naturally induced connections on pr∗1 E⊕pr∗2 F and pr∗1 E⊗M×N pr∗2 F respectively, andare therefore in this sense natural. The sum X⊕Y ∈ Γ (E ⊕ F ) and product X⊗Y ∈ Γ (E ⊗ F ) correspondto pr∗1 X ⊕M×N pr∗2 Y and pr∗1 X ⊗M×N pr∗2 Y ∈ Γ (pr∗1 E ⊗M×N pr∗2 F ) under ξmathnψ respectively.

Many important tensor constructions involve permutations. An extremely useful property of these per-mutations is that they commute with the covariant derivatives induced by the covariant derivatives on thetensor bundle factors, making them natural operators in the setting of covariant tensor calculus.

Proposition 8.11 (Transposition tensor fields are parallel). Let E1, E2, E3, E4 be smooth vector bundlesover M having covariant derivatives ∇E1 ,∇E2 ,∇E3 ,∇E4 respectively, let A := E1⊗M E2⊗M E3⊗M E4 andB := E1 ⊗M E3 ⊗M E2 ⊗M E4, and let ∇A and ∇B denote the induced covariant derivatives.

If (2 3) ∈ Γ (A∗ ⊗M B) denotes the tensor field which maps e1⊗M ⊗e2⊗M e3⊗M e4 ∈ A to e1⊗M e3⊗Me2 ⊗M e4 ∈ B (i.e. (2 3) transposes the second and third factors), then (2 3) is a parallel tensor field withrespect to the covariant derivative induced on the vector bundle A∗ ⊗M B →M , i.e. ∇A∗⊗MB (2 3) = 0.

Proof. Let X ∈ Γ (TM). Then

(e1 ⊗M e2 ⊗M e3 ⊗M e4) ·A∗ ∇A∗⊗MB

X (2 3)

= ∇BX ((e1 ⊗M e2 ⊗M e3 ⊗M e4) ·A∗ (2 3))−∇AX (e1 ⊗M e2 ⊗M e3 ⊗M e4) ·A∗ (2 3)

= ∇BX (e1 ⊗M e3 ⊗M e3 ⊗M e4)

−∇E1

X e1 ⊗M e3 ⊗M e2 ⊗M e4

− e1 ⊗M ∇E3

X e3 ⊗M e2 ⊗M e4

− e1 ⊗M e3 ⊗M ∇E2

X e2 ⊗M e4

− e1 ⊗M e3 ⊗M e2 ⊗M ∇E4

X e4

= ∇BX (e1 ⊗M e3 ⊗M e3 ⊗M e4)−∇BX (e1 ⊗M e3 ⊗M e3 ⊗M e4)

= 0.

Because X is arbitrary, this shows that (e1 ⊗M e2 ⊗M e3 ⊗M e4) ·A∗∇A∗⊗MB (2 3) = 0. This extends linearly

to general tensors, so ∇A∗⊗MB (2 3) = 0, as desired.

The fact that all transposition tensor fields are parallel implies that all permutation tensor fields areparallel, since every permutation is just the product of transpositions. This gives as an easy corollary that acovariant derivative operation commutes with a permutation operation, which has quite a succinct statementusing the permutation superscript notation.

28

Corollary 8.12 (Permutation tensor fields are parallel). Let E1, . . . , Ek be smooth vector bundles over Meach having a covariant derivative, and let A := E1⊗M · · ·⊗MEk and B := Eσ−1(1)⊗M · · ·⊗MEσ−1(k). If σ ∈Sk is interpreted as the tensor field in Γ (A∗ ⊗M B) which maps e1⊗M · · ·⊗M ek to eσ−1(1)⊗M · · ·⊗M eσ−1(k),then σ is a parallel tensor field. Stated using the superscript notation, with X ∈ Γ (TM) and a ∈ Γ (A),

∇BXaσ =(∇AXa

)σ.

Proof. This follows from the fact that σ can be written as the product of transpositions; ∇Xσ = 0 becauseof the product rule and because each transposition is parallel. The claim regarding commutation with thesuperscript permutation follows easily from its definition.

∇BXaσ = ∇BX (a ·A∗ σ) = a ·A∗ ∇A∗⊗MB

X σ +∇AXa ·A∗ σ =(∇AXa

)σ,

using the fact that ∇A∗⊗MB

X σ = 0, since σ is a parallel tensor field.

9 Decomposition of πTEE : TE → E

In using the calculus of variations on a manifold M where the Lagrangian is a function of TM (this formof Lagrangian is ubiquitous in mechanics), taking the first variation involves passing to TTM . Without away to decompose variations into more tractable components, the standard integration-by-parts trick [6, pg.16] can’t be applied. The notion of a local trivialization of TTM via choice of coordinates on M is one wayto provide such a decomposition. A coordinate chart (U, φ : U → Rn) on M establishes a locally trivializingdiffeomorphism TTU ∼= φ (U)× Rn × Rn × Rn. However such a trivialization imposes an artificial additivestructure on TTU depending on the [non-canonical] choice of coordinates, only gives a local formulationof the relevant objects, and the ensuing coordinate calculations don’t give clear insight into the geometricstructure of the problem. The notion of the linear connection remedies this.

A linear connection on the vector bundle π : E → M is a subbundle H → E of πTEE : TE → E suchthat TE = H ⊕E V E and Tλa ·Hx = Hax for all a ∈ R\ {0} and x ∈ E, where λa : E → E, e 7→ ae is thescalar multiplication action of a on E [8, pg. 512]. The bundle H → E may also be called a horizontalspace of the vector bundle πTEE : TE → E (“a” is used instead of “the” because a choice of H → E is generallynon-unique). For convenience, define h := ∇◦ π ∈ Γ (π∗TM ⊗E T ∗E) , noting then that V E = kerh.

A linear connection can equivalently be specified by what is known as a connection map; essentially aprojection onto the vertical bundle. This is a slightly more active formulation than just the specification ofa horizontal space, as a covariant derivative can be defined directly in terms of the connection map – see [8,pg. 518], [4, pg. 128], [5, pg. 173], and [12, pg. 208].

Proposition 9.1 (Connection map formulation of a linear connection). If v ∈ Γ (π∗E ⊗E T ∗E) (i.e.v : TE → E is a smooth vector bundle morphism over π) is a left-inverse for ιπ

∗EV E ∈ Γ (V E ⊗E π∗E∗)

that is equivariant with respect to Tλa and λa (i.e. v · Tλa = π∗λa · v) [12, pg. 245], then H := ker v ≤ TEdefines a linear connection on the vector bundle π : E → M . Such a map v is called the connection mapassociated to H. Conversely, given a linear connection H, there is exactly one connection map defining Hin the stated sense.

Proof. That v is a left-inverse for ιπ∗EV E implies that v has full rank, so H := ker v defines a subbundle of

πTEE : TE → E having the same rank as TM . Because v is smooth, H is a smooth subbundle. Furthermore,the condition implies that VeE∩He = {0} for each e ∈ E, and therefore TE = H⊕E V E by a rank-countingargument.

If x ∈ TE and a ∈ R\ {0}, then v · Tλa · x = π∗λa · v · x, which equals zero if and only if v · x = 0, i.e. ifand only if x ∈ H. Thus Tλa ·H = H. This establishes H → E as a linear connection.

Conversely, if H is a linear connection and v1 and v2 are connection maps for H, then v1 ·TE ιπ∗EV E =

Idπ∗E = v2 ·TE ιπ∗EV E . Then because the image of ιπ

∗EV E is all of V E, it follows that v1 |V E= v2 |V E . Since

v1 |H= 0 = v2 |H by definition, and since TE = H ⊕E V E, this shows that v1 = v2. Uniqueness of

29

E

ZE

T0pE

TepE

ep

0p

VepE

Hep

V0pE

H0p

c

Ep

Figure 9.1: A diagram representing the decomposition of TE → E into horizontal and vertical subbundles.The vertical lines represent individual fibers of E, while p ∈ M , ep ∈ Ep, 0p ∈ Ep denotes the zero vectorof Ep, and ZE denotes the zero subbundle of E; ZE ∼= M . By the equivariance property of the linearconnection, ZE is a submanifold of E which is entirely horizontal (its tangent space is entirely composed ofhorizontal vectors). The tangent spaces T0pE and TepE are drawn; green arrows representing the verticalsubspaces (“along” the fibers), red arrows representing the horizontal subspaces. Finally, c is a horizontalcurve passing through ep.

connection maps has been established. To show existence, define v := ιV Eπ∗E ·V E prV E ∈ Γ (π∗E ⊗E T ∗E),where prV E : H⊕E V E → V E be the canonical projection, recalling that H⊕E V E = TE. It is easily shownthat v is a connection map for H.

Proposition 9.2 (Decomposing πTEE : TE → E). If v ∈ Γ (π∗E ⊗E T ∗E) is a connection map, then

h⊕E v : TE → π∗TM ⊕E π∗E (9.1)

is a smooth vector bundle isomorphism over IdE. See Figure 9.1.

Proof. Because TE = H ⊕E V E, and H = ker v and V E = kerh, the fiber-wise restriction

h⊕E v |TeE : TeE → (π∗TM ⊕E π∗E)e∼= Tπ(e)M ⊕ Eπ(e)

is a linear isomorphism for each e ∈ E. The map is a smooth vector bundle morphism over IdE by construc-tion. It is therefore a smooth vector bundle isomorphism over IdE .

30

Remark 9.3 (Linear connection/covariant derivative correspondence). Given a covariant derivative ∇E on asmooth vector bundle π : E →M , there is a naturally induced linear connection, defined via the connectionmap

v : TE → E, (9.2)

δεΘ 7→ ∇(π◦Θ)∗Eδε

Θ,

where Θ: I → E is a variation of θ ∈ E. Here, ∇(π◦Θ)∗E denotes the pullback of the covariant derivative∇E through the map π ◦ Θ (see (8.7)). Conceptually, all v does is replace an ordinary derivative (δε) withthe corresponding covariant one (∇(π◦Θ)∗E

δε).

Conversely, given a connection map v ∈ Γ (π∗E ⊗E T ∗E) for a linear connection H → E, there is anaturally induced covariant derivative ∇E on the smooth vector bundle π : E →M , defined by

∇E : Γ (E) → Γ (E ⊗M T ∗M) ,

σ 7→ σ∗v ·σ∗TE ∇◦ M→Eσ.

The scaling equivariance of v is critical for showing that this map actually defines a covariant derivative.Full type safety should be observed here; by the contravariance of the pullback of bundles (see (6.3)),σ∗π∗E ∼= (π ◦ σ)

∗E = Id∗M E ∼= E, so

σ∗v ∈ Γ (σ∗ (π∗E ⊗E T ∗E)) ∼= Γ (σ∗π∗E ⊗M σ∗T ∗E) ∼= Γ (E ⊗M σ∗T ∗E) ,

and therefore σ∗v · ∇◦ σ ∈ Γ (E ⊗M T ∗M) as desired. This connection map construction of a covariantderivative gives (8.7) as an immediate consequence via the chain rule for the tangent map.

The following construction is an abstraction of taking partial derivatives of a function, inspired by [11, pg.277]. Instead of taking partial derivatives with respect to individual coordinates, partial covariant derivativesalong distributions over the base manifold are formed, where the distributions (subbundles) decompose thebase manifold’s tangent bundle into a direct sum. Such a construction conveniently captures the geometryof maps with respect to the geometry of its domain.

Proposition 9.4 (Partial covariant derivatives). Let L ∈ C∞ (M,R), and for each i ∈ {1, . . . , n} let Fi →Mbe a smooth vector bundle. If, for each i ∈ {1, . . . , n}, ci ∈ Γ (Fi ⊗M T ∗M) such that c1 ⊕M · · · ⊕M cn ∈Γ ((F1 ⊕M · · · ⊕M Fn)⊗M T ∗M) is a smooth vector bundle isomorphism over IdE, then there exist uniquesections L,ci ∈ Γ (F ∗i ) for each i ∈ {1, . . . , n} such that

∇M→RL = L,c1 ·F1 c1 + · · ·+ L,cn ·Fn cn.

This decomposition of ∇L provides what will be called partial covariant derivatives of L (with respect tothe given decomposition).

Proof. The following equivalences provide a formula for directly defining L,c1 , . . . , L,cn .

∇L = L,c1 ·F1 c1 + · · ·+ L,cn ·Fn cn⇐⇒ ∇L = (L,c1 ⊕M · · · ⊕M L,cn) ·F1⊕M ···⊕MFn (c1 ⊕M · · · ⊕M cn)

⇐⇒ ∇L ·TM (c1 ⊕M · · · ⊕M cn)−1

= L,c1 ⊕M · · · ⊕M L,cn .

Existence and uniqueness is therefore proven.

Corollary 9.5 (Horizontal/vertical derivatives). Let h := ∇◦ π ∈ Γ (π∗TM ⊗E T ∗E) as before. If v ∈Γ (π∗E ⊗E T ∗E) is a connection map, and if L : E → R is smooth, then there exist unique L,h ∈ Γ (π∗T ∗M)and L,v ∈ Γ (π∗E∗) such that ∇L = L,h ·π∗TM h+ L,v ·π∗E v.

31

It should be noted that the basepoint-preserving issue discussed in Section 7 plays a role in choosing touse the tensor field formulation of h : TE → TM and v : TE → E. In particular, without preserving thebasepoint (via the π-pullback of TM and E to form h ∈ Γ (π∗TM ⊗E T ∗E) and v ∈ Γ (π∗E ⊗E T ∗E)), themap h⊕E v would not be a smooth bundle isomorphism, and the horizontal and vertical derivatives wouldbe maps of the form L,h : E → T ∗M and L,v : E → E∗, but that, critically, are not sections of smooth vectorbundles, and can only claim to be smooth [fiber] bundle morphisms. Derivative trivializations will be centralin calculating the first and second variations of an energy functional having Lagrangian L (see (12.1) and(13.1)).

10 Curvature and Commutation of DerivativesA ubiquitous consideration in mathematics is to determine when two operations commute. In the setting oftensor calculus, this often manifests itself in determining the commutativity (or lack thereof) of two covariantderivatives. Here, “covariant derivatives” may refer to both linear covariant derivatives and the tangent mapoperator (see (7.1)). This unified categorization of derivatives will now be leveraged to show that certainfiber bundles are flat (in a sense analogous to the vanishing of a curvature endomorphism) with respect toparticular covariant derivatives. This reduces the work often done showing commutativity of derivatives inthe derivation of the first variation of a function in the calculus of variations to the simple statement that aparticular tensor field is symmetric, which is comes as a corollary to the aforementioned flatness.

In this section, the symbol ∇ may denote ∇◦ or ∇p , depending on context. This eases the expression ofrepeated covariant derivatives, such as the covariant Hessian of a section (see below), and is an example oftelescoping notation as discussed in Section 3.

If π : E → M defines a smooth [fiber] bundle whose space of sections Γ (E) has two repeated covariantderivatives defined and if ∇TM is a symmetric linear covariant derivative (meaning ∇XY −∇YX = [X,Y ]for X,Y ∈ Γ (TM)), then the tensor contraction

∇2σ : (X ⊗M Y − Y ⊗M X)

is an expression measuring the non-commutativity of the X and Y derivatives of σ. The quantity ∇2σ willbe called the covariant Hessian of σ, because it generalizes the Hessian of elementary calculus; it containsonly second-derivative information, and in the special case seen below, it is symmetric in the argumentcomponents. It should be noted that if F → M is the vector bundle such that ∇σ ∈ Γ (F ⊗M T ∗M), then∇2σ ∈ Γ (F ⊗M T ∗M ⊗M T ∗M). Intentionally leaving the ∇ and · symbols undecorated in preference ofcontextual interpretation, unwinding the expression above gives

∇2σ : (X ⊗M Y − Y ⊗M X) = ∇Y∇σ ·X −∇X∇σ · Y= ∇Y∇Xσ −∇σ · ∇YX −∇X∇Y σ +∇σ · ∇XY= −∇X∇Y σ +∇Y∇Xσ +∇σ · [X,Y ]

= −∇X∇Y σ +∇Y∇Xσ +∇[X,Y ]σ,

which is syntactically identical to the common definition for the [Riemannian] curvature endomorphismR (X,Y )σ. In the traditional setting, where ∇E is a linear covariant derivative on vector bundle E, thecurvature endomorphism takes the form of a tensor field RE ∈ Γ (E ⊗M E∗ ⊗M T ∗M ⊗M T ∗M). In thissetting however, because ∇E may be nonlinear, such a tensorial formulation doesn’t generally exist. Instead,

RE (X,Y ) := −∇X∇EY +∇Y∇EX +∇E[X,Y ]

defines a second-order covariant differential operator (“covariant” meaning tensorial in the X and Y compo-nents). Put differently,

RE (X,Y )σ = ∇2σ : (X ⊗M Y − Y ⊗M X) ,

32

which will be called the (possibly nonlinear) curvature operator, measures the non-commutativity of theX and Y derivatives of σ. If RE is identically zero, then the bundle E is said to be flat with respect to therelevant connections/covariant derivatives.

There are two particularly important instances of flat bundles. The first is the trivial line bundle definedby πR×�S (whose space of smooth sections, as discussed in Section 4, is naturally identified with C∞ (S,R)).In this case, ∇S→Rf ∈ Γ (T ∗S), and ∇2f ≡ ∇p T∗S ∇p S→Rf ∈ Γ (T ∗S ⊗S T ∗S) is the object referred to inmost literature as the covariant Hessian of f . Here, RS→R (X,Y ) f is a real-valued function on S.

Proposition 10.1 (Symmetry of covariant Hessian on functions). Let S be a smooth manifold and let ∇TSbe a symmetric covariant derivative. If f ∈ C∞ (S,R), then ∇2f ∈ Γ (T ∗S ⊗S T ∗S) is a symmetric tensorfield (i.e. it has a (1 2) symmetry). Here, the covariant derivative on C∞ (S,R) is ∇S→R as defined above.

Proof. Let X,Y ∈ Γ (TS). Recall that ∇f ≡ df ∈ Γ (T ∗S). Then

∇2f : (X ⊗S Y − Y ⊗S X)

= ∇Y∇f ·X −∇X∇f · Y= ∇Y (∇f ·X)−∇f · ∇YX −∇X (∇f · Y ) +∇f · ∇XY= ∇ (∇f ·X) · Y −∇ (∇f · Y ) ·X +∇f · [X,Y ] (by symmetry of ∇TS)= −∇f · [X,Y ] +∇f · [X,Y ] (by definition of [X,Y ])= 0.

Because X⊗S Y is pointwise-arbitrary in TS⊗S TS, this shows that ∇2f is symmetric. Equivalently stated,RS→R is identically zero, and therefore the relevant bundle is flat.

The second important case involves the nonlinear covariant derivative ∇M→S on C∞ (M,S). Here, ifφ ∈ C∞ (M,S), then

∇2φ ≡ ∇p φ∗TS⊗MT∗M ∇◦ M→Sφ ∈ Γ (φ∗TS ⊗M T ∗M ⊗M T ∗M) ,

so RM→S (X,Y )φ ∈ Γ (φ∗TS).

Proposition 10.2 (Symmetry of covariant Hessian on maps). LetM and S be smooth manifolds and let ∇TMand ∇TS be symmetric covariant derivatives. If φ ∈ C∞ (M,S), then ∇2φ ∈ Γ (φ∗TS ⊗M T ∗M ⊗M T ∗M)is a tensor field which is symmetric in the two T ∗M components (i.e. it has a (2 3) symmetry). Here, thecovariant derivative on C∞ (M,S) is ∇◦ M→S as defined above.

Proof. Let X,Y ∈ Γ (TM) and f ∈ C∞ (S,R), so that φ∗∇f ∈ Γ (φ∗TS). Then

φ∗∇f ·φ∗TS RM→S (X,Y )φ = φ∗∇f ·(−∇X∇Y φ+∇Y∇Xφ+∇[X,Y ]φ

)= −∇X (φ∗∇f · ∇φ · Y ) +∇Xφ∗∇f · ∇φ · Y

+∇Y (φ∗∇f · ∇φ ·X)−∇Y φ∗∇f · ∇φ ·X+ φ∗∇f · ∇φ · [X,Y ]

= −∇X (∇φ∗f · Y ) +∇Y (∇φ∗f ·X) +∇φ∗f · [X,Y ]

+(φ∗∇2f · ∇φ ·X

)· ∇φ · Y −

(φ∗∇2f · ∇φ · Y

)· ∇φ ·X

= −∇ (∇φ∗f · Y ) ·X +∇ (∇φ∗f ·X) · Y +∇φ∗f · [X,Y ]

− φ∗∇2f : ((∇φ ·X)⊗M (∇φ · Y )− (∇φ · Y )⊗M (∇φ ·X)) .

By definition, −∇ (∇φ∗f · Y ) ·X +∇ (∇φ∗f ·X) · Y = −∇φ∗f · [X,Y ], which cancels out the other term.By (10.1), ∇2f is symmetric, so the final term is zero. Because φ∗∇f is pointwise-arbitrary in φ∗T ∗S andX and Y are pointwise-arbitrary in TM , this shows that RM→S is identically zero, so the bundle definedby πS×MM : S ×M → M , whose space of sections is identified with C∞ (M,S), is flat, and therefore ∇2φ issymmetric in its two T ∗M components.

33

The construction used in (9.4) can be applied to nonlinear as well as linear covariant derivatives toconsiderable advantage. For example, if ψ : M × N → L, where M,N,L are smooth manifolds and pM :=prM×N1 and pN := prM×N2 , then define ψ,M ∈ Γ (ψ∗TL⊗M×N p∗MT

∗M) and ψ,N ∈ Γ (ψ∗TL⊗M×N p∗NT∗N)

by∇◦ ψ = ∇◦ M×N→Lψ = ψ,M ·p∗MTM ∇◦ pM + ψ,N ·p∗NTN ∇◦ pN .

This gives a convenient way to express partial covariant derivatives, which will be used heavily in Part II incalculating the first and second variations of an energy functional. Note that in this parlance, ψ,(M×N) isthe full tangent map ∇◦ ψ.

Defining second partial covariant derivatives ψ,MM , ψ,MN , ψ,NM and ψ,NN by

∇ψ,M = ψ,MM · ∇◦ pM + ψ,MN · ∇◦ pN and∇ψ,N = ψ,NM · ∇◦ pM + ψ,NN · ∇◦ pN ,

the symmetry of the covariant Hessian of ψ can be used to show various symmetries these second derivatives.

Proposition 10.3 (Symmetries of partial covariant derivatives). With ψ and its second partial covariantderivatives as above,

ψ,MM ∈ Γ (ψ∗TL⊗M×N p∗MT∗M ⊗M×N p∗MT

∗M)

and ψ,NN (having analogous type) are (2 3)-symmetric (i.e. (ψ,MM )(2 3)

= ψ,MM and (ψ,NN )(2 3)

= ψ,NN )and the mixed, second partial covariant derivatives

ψ,MN ∈ Γ (ψ∗TL⊗M×N p∗MT∗N ⊗M×N p∗NT

∗N) andψ,NM ∈ Γ (ψ∗TL⊗M×N p∗NT

∗N ⊗M×N p∗MT∗M)

are mutually (2 3)-symmetric (i.e. ψ,MN = (ψ,NM )(2 3)).

Proof. Let X,Y ∈ Γ (TM ⊕ TN). If TpN ·X = 0 and TpM · Y = 0, then

0 = ∇2ψ : (X ⊗M×N Y − Y ⊗M×N X) (by (10.2))= ψ,MM : (∇◦ pM ·X ⊗M×N ∇◦ pM · Y − ∇◦ pM · Y ⊗M×N ∇◦ pM ·X)

+ ψ,MN : (∇◦ pM ·X ⊗M×N ∇◦ pN · Y − ∇◦ pM · Y ⊗M×N ∇◦ pN ·X)

+ ψ,NM : (∇◦ pN ·X ⊗M×N ∇◦ pM · Y − ∇◦ pN · Y ⊗M×N ∇◦ pM ·X)

+ ψ,NN : (∇◦ pN ·X ⊗M×N ∇◦ pN · Y − ∇◦ pN · Y ⊗M×N ∇◦ pN ·X)

= ψ,MN : (∇◦ pM ·X ⊗M×N ∇◦ pN · Y − ∇◦ pN · Y ⊗M×N ∇◦ pM ·X)

=(ψ,MN − (ψ,NM )

(2 3))

: (∇◦ pM ·X ⊗M×N ∇◦ pN · Y ) .

Because ∇◦ pM ·X and ∇◦ pN ·Y are pointwise-arbitrary in p∗MTM and p∗NTN respectively, this implies thatψ,MN = (ψ,NM )

(2 3). Analogous calculations (setting ∇◦ pM ·X = 0 and ∇◦ pM · Y = 0 and then separatelysetting ∇◦ pN ·X = 0 and ∇◦ pN · Y = 0) show that ψ,MM = (ψ,MM )

(2 3) and ψ,NN = (ψ,NN )(2 3).

There are two final results regarding the second covariant derivative that will be especially useful in thecalculation of the first and second variations of an energy functional (see (12.1) and (13.2)).

Proposition 10.4 (Chain rule for covariant Hessian). Let π : E → N define a bundle having a first andsecond covariant derivative (i.e. a section of E can be covariantly differentiated twice). If φ : M → N ande ∈ Γ (E), then

∇2φ∗e = φ∗∇2e :φ∗TN (∇◦ φ�M ∇◦ φ) + φ∗∇e ·φ∗TN ∇∇◦ φ.

34

Proof. Let X ∈ Γ (TM). Then

∇2φ∗e ·X = ∇X∇φ∗Eφ∗e

= ∇X(φ∗∇Ee ·φ∗TN ∇◦ φ

)= ∇X (φ∗∇e) ·φ∗TN ∇◦ φ+ φ∗∇e ·φ∗TN ∇X∇◦ φ=(φ∗∇2e ·φ∗TN ∇◦ φ ·X

)·φ∗TN ∇◦ φ+ φ∗∇e ·φ∗TN ∇X∇◦ φ

=[φ∗∇2e :φ∗TN (∇◦ φ�M ∇◦ φ) + φ∗∇e ·φ∗TN ∇∇◦ φ

]·X.

Because X is pointwise-arbitrary in TM , this establishes the desired equality.

Proposition 10.5 (Pullback curvature endomorphism). Let π : E → N define a vector bundle having firstand second covariant derivatives. If φ : M → N , then Rφ

∗TN = φ∗RTN :φ∗TN (∇◦ φ�M ∇◦ φ).

Proof. Note that Rφ∗TN ∈ Γ (φ∗TN ⊗M φ∗T ∗N ⊗M T ∗M ⊗M T ∗M). Let X,Y ∈ Γ (TM) and let Z ∈

Γ (TN), so that φ∗Z ∈ Γ (φ∗TN). Then

(Idφ∗TN ⊗Mφ∗Z) ·φ∗TN⊗Mφ∗T∗N Rφ∗TN :TM (X ⊗M Y )

= Rφ∗TN (X,Y ) (φ∗Z)

= ∇2φ∗Z :TM (X ∧M Y )

= φ∗∇2Z :φ∗TN (∇◦ φ�M ∇◦ φ) :TM (X ∧M Y )

+ φ∗∇Z ·φ∗TN ∇∇◦ φ :TM (X ∧M Y ) (by (10.4))

= φ∗∇2Z :φ∗TN ((∇◦ φ ·X) ∧M (∇◦ φ · Y )) + φ∗∇Z ·φ∗TN 0 (by symmetry of ∇∇◦ φ)=(φ∗((IdTN ⊗NZ) ·TN⊗NT∗N RTN

)):φ∗TN ((∇◦ φ ·X)⊗M (∇◦ φ · Y ))

= (Idφ∗TN ⊗Mφ∗Z) ·φ∗TN⊗Mφ∗T∗N φ∗RTN :φ∗TN (∇◦ φ�M ∇◦ φ) :TM (X ⊗M Y ) ,

and because X,Y and φ∗Z are pointwise-arbitrary in their respective spaces, this establishes the desiredequality.

A common operation is to evaluate a covariant derivative along a single tangent vector. One can express asingle tangent vector as a section of a particular pullback bundle, the map being the constant map evaluatingto the basepoint of the vector. This allows the richly-typed formalism of pullback bundles to be used toevaluate derivatives at a point, particularly noting that this safely deals with the overloading of the naturalpairing operator · (see Section 5).

Proposition 10.6 (Evaluation commutes with non-involved derivatives). Let A and B be smooth manifoldsand let σ ∈ Γ (E) for some smooth bundle E → A× B having a covariant derivative ∇E. If b ∈ B and themap z : A→ A×B, a 7→ (a, b) represents evaluation at b, then

z∗ (σ,A) = (z∗σ),A ,

i.e. evaluation in B commutes with a derivative along A.

Proof. Let X ∈ Γ (TA), and let pA := prA×B1 and pB := prA×B2 . Then

(z∗σ),A ·X = ∇z∗Ez∗σ ·X

= z∗∇Eσ ·z∗(TA⊕TB) ∇◦ z ·TA X= z∗

(∇Eσ ·TA⊕TB p∗AX

)= z∗

(σ,A ·p∗ATA ∇◦ pA ·TA⊕TB p

∗AX)

(since ∇◦ pB ·TA⊕TB p∗AX = 0)

= z∗σ,A ·TA X (since z∗p∗AX = (pA ◦ z)∗X = Id∗AX = X),

and because X is pointwise-arbitrary in TM , this implies that z∗σ,A = (z∗σ),A as desired.

35

Proposition 10.7. Let A,B,C be smooth manifolds, let ψ : A × B → C be smooth, let pA := prA×B1 andpB := prA×B2 , and let X,Y ∈ Γ (TA⊕ TB). If ∇◦ pB ·X = 0 and ∇◦ pA · Y = 0, then

ψ,AB : ((∇◦ pA ·X)⊗A×B (∇◦ pB · Y )) = ∇ψ∗TC

Y ∇A×B→CX ψ.

Proof. The conditions ∇◦ pB · X = 0 and ∇◦ pA · Y = 0 imply that ∇YX = 0 in the product covariantderivative. Then since pA ×A×B pB = IdA×B , it follows that

∇∇◦ pA ⊕A×B ∇∇◦ pB = ∇ (∇◦ pA ⊕A×B ∇◦ pB) = ∇∇◦ (pA ×A×B pB) = ∇ IdTA⊕TB = 0,

and therefore∇Y (∇◦ pA ·X) = ∇Y ∇◦ pA ·X + ∇◦ pA · ∇YX = 0 ·X + ∇◦ pA · 0 = 0.

For the main calculation,

ψ,AB : ((∇◦ pA ·X)⊗A×B (∇◦ pB · Y ))

=(ψ,AB ·p∗BTB ∇◦ pB ·TA⊕TB Y

)·p∗ATA ∇◦ pA ·TA⊕TB X

=(∇ψ×A×BpAY ψ,A

)·p∗ATA ∇◦ pA ·TA⊕TB X (since ∇◦ pA · Y = 0)

= ∇ψY (ψ,A · ∇◦ pA ·X)− ψ,A · ∇pAY (∇◦ pA ·X) (by reverse product rule)

= ∇ψ∗TC

Y ∇A×B→CX ψ (since ∇Y (∇◦ pA ·X) = 0),

as desired.

36

Part II

Riemannian Calculus of VariationsThe use of the Calculus of Variations in the Riemannian setting to develop the geodesic equations and tostudy harmonic maps is quite well-established. A more general formulation is required for more specificapplications, such as continuum mechanics in Riemannian manifolds. The tools developed in Part I will nowbe used to formulate the first and second variations and Euler-Lagrange equations of an energy functionalcorresponding to a first-order Lagrangian. In particular, the bundle decomposition discussed in Section 9will be needed to employ the standard integration-by-parts trick seen in the formulation of the analogousparts of the elementary Calculus of Variations. The seemingly heavy and pedantic formalism built up thusfar will now show its usefulness.

In this part, let (M, g) and (S, h) be Riemannian manifolds with M compact. Calculations will be doneformally in the space C∞ (M,S), noting that its completion under various norms will give various Sobolevspaces of maps from M to S, which are ultimately the spaces which must be considered when finding criticalpoints of the relevant energy functionals. See [4, 5] for details on the analytical issues. Let dVg denotethe Riemannian volume form corresponding to metric g, and let dV g be the induced volume form on ∂M .Let ι : ∂M → M be the inclusion, and let ν ∈ Γ (ι∗T ∗M) be the unit normal covector field on ∂M . LetE := TS ⊗S×M T ∗M and π := πTSS ⊗S×M πT

∗MM , making π : E → S ×M a vector bundle.

The energy functionals in this section will be assumed to have the form

L : C∞ (M,S) → R,

φ 7→∫M

L ◦ ∇◦ φdVg,

where L : E → R, referred to as the Lagrangian of the functional, is smooth. Here, ∇◦ φ could be understoodto take values either in E = TS⊗S×M T ∗M or φ∗TS⊗M T ∗M . In the former case, the composition L ◦ ∇◦ φis literal, while in the latter case, there is an implicit conversion from φ∗TS⊗M T ∗M to TS⊗S×M T ∗M viaa fiber projection bundle morphism (see (6.2)). Either way, L ◦ ∇◦ φ : M → R. Let ∇TS and ∇TM denotethe respective Levi-Civita connections, which induce a covariant derivative ∇E on E (see (8.9)). Define theconnection map v ∈ Γ (π∗E ⊗E T ∗E) using ∇E as in (9.2). For convenience, the S ×M subscript will besuppressed on the “full” tensor product defining E from here forward.

11 Critical Points and VariationsOne of the most pertinent properties of an energy functional is its set of critical points. Often, the solution toa problem in physics will take the form of minimizing a particular energy functional. Lagrangian mechanicsis the quintessintial example of this. This section will deal with some of the main considerations regardingsuch critical points.

Because the domain of a [real-valued] functional L may be a nonlinear space, the relevant first derivativeis the [real-valued] differential dL, which is paired with the linearized variation of a map φ ∈ C∞ (M,S). Inparticular, a one-parameter variation of φ is a smooth map Φ: M × I → S, where the I component isthe variational parameter. Letting i denote the standard coordinate on I, the linearized variation is thenδiΦ: M → TS, recalling that δi := ∂

∂i |i=0. Because πTSS ◦δiΦ = φ, it follows that δiΦ ∈ Γ (φ∗TS), i.e. δiΦ isa vector field along φ. The object δiΦ will be called a linearized variation. Call the elements of Γ (φ∗TS)linear variations.

Proposition 11.1 (Each linear variation is a linearized variation). Let exp: U → S denote the exponentialmap associated to ∇TS, where U ⊆ TS is a neighborhood of the zero bundle in TS on which exp is defined,and let λ : TS ×R→ TS, (s, ε) 7→ εs denote the scalar multiplication structure on TS. If A ∈ Γ (φ∗TS) and

37

if Φ: U → S is defined by Φ := exp ◦λ ◦ (A× IdI) |U , then δiΦ = A. In other words, every vector field overφ is realized as the linearization of a one-parameter variation of φ.

Proof. The map Φ is well-defined and smooth by construction. Let p ∈M . Then

(δiΦ) (p) = δi (Φ (p, i))

= δi (exp ◦λ ◦ (A× IdI) (p, i))

= ∇◦ exp ·δi (λ (A (p) , i))

= ∇◦ exp ·δi (iA (p))

= ∇◦ exp ·(ιπ∗EV E |Z(π∗E)

)·A (p)

= A (p) ,

where Z (π∗E) denotes the zero subbundle of π∗E. The last equality follows from a naturality property ofthe exponential map [10, pg. 523].

Thus each linear variation is a linearized variation, establishing a natural identification of Tφ (C∞ (M,S))with Γ (φ∗TS), which will be useful when calculating the differential of a functional on C∞ (M,S). In fact,the exponential map construction in (11.1) is a way to construct charts for the infinite dimensional manifoldC∞ (M,S) [5, Theorem 5.2].

12 First VariationThis section is devoted to calculating the first variation of the previously defined energy functional. Hereis where the full richness of the type system of the objects developed earlier in the paper will really showtheir power (and arguably, necessity). While the type-specifying notation may appear overly decorated andpedantic, subtle usage errors can be detected and avoided by keeping track of the myriad of types of therelevant objects through the sub/superscripts on covariant derivatives and natural pairings; extremely com-plex constructions can be made and navigated without much trouble. By contrast, performing the ensuingcalculations in coordinate trivializations would result in an intractible proliferation of Christoffel symbolsand indexed expressions which would prove difficult to read and would be highly prone to error.

Because the Lagrangian L : E → R is defined on a vector bundle π : E → S ×M over the product spaceS ×M , the decomposition in (9.5) can be slightly refined. The projection π can be decomposed into thefactors πS := prS×MS ◦π and πM := prS×MM ◦π, so that π = πS ×E πM . Then h = ∇◦ π = ∇◦ πS ⊕E ∇◦ πM .Let

σ := ∇◦ πS ∈ Γ (π∗STS ⊗E T ∗E) and µ := ∇◦ πM ∈ Γ (π∗MTM ⊗E T ∗E) .

The letters sigma and mu have been chosen to reflect the fact that L,σ ∈ Γ (π∗ST∗S) and L,µ ∈ Γ (π∗MT

∗M)give the “S component” (spatial) and “M component” (material) of the derivative ∇E→RL ∈ Γ (T ∗E). Theconnection map v will be retained as is, giving L,v ∈ Γ (π∗E∗), the “E component” (fiber) of ∇E→RL. See(12.5) for a discussion of how the quantities L,µ, L,σ, L,v generalize the analogous structures in the elementarytreatment of the calculus of variations.

Because a one-parameter variation of φ ∈ C∞ (M,S) has the form Φ: M × I → S but the energyfunctional L involves only the M derivative of its argument, the partial tangent map must be used here. Forthe purposes of calculating the first and second variations, L must be written as

L (φ) :=

∫M

L ◦ φ,M dVg.

Theorem 12.1 (First variation of L). Let L, L, σ, µ, v and ν all be defined as above. If φ ∈ C∞ (M,S)and A ∈ Γ (φ∗TS), then

dL (φ) ·A =

∫M

A ·φ∗T∗S(φ∗,ML,σ − divM

(φ∗,ML,v

))dVg +

∫∂M

A ·φ∗T∗S φ∗,ML,v ·T∗M ν dV g.

38

The expression above is often called the first variation of L. A type analysis here gives φ∗,ML,σ ∈Γ (φ∗T ∗S) and φ∗,ML,v ∈ Γ (φ∗T ∗S ⊗M TM). Recall that because the domain of φ is M , ∇◦ φ ≡ φ,M .

Proof. Supporting calculations will be made below in lemmas. Let Φ: M × I → S be as in (11.1), so thatδiΦ = A. For tidiness, let L,σ := φ∗,ML,σ and L,v := φ∗,ML,v. Then

dL (φ) ·A = dL (φ) · δiΦ= δi (L (Φ))

=

∫M

δi (L ◦ Φ,M ) dVg

=

∫M

L,σ ·φ∗TS A+ L,v ·φ∗TS⊗MT∗M ∇φ∗TSAdVg (by (12.2))

=

∫M

A ·φ∗T∗S (L,σ − divM L,V ) + divM (A ·φ∗T∗S L,V ) dVg (by (12.2))

=

∫M

A ·φ∗T∗S (L,σ − divM L,V ) dVg +

∫∂M

A ·φ∗T∗S L,V ·T∗M ν dV g (divergence theorem),

as desired.As for the types of φ∗,ML,σ and φ∗,ML,v, the contravariance of bundle pullback allows significant simplifi-

cation. Because L,σ ∈ Γ (π∗ST∗S) and L,v ∈ Γ (π∗E∗),

φ∗,ML,σ ∈ Γ(φ∗,Mπ

∗ST∗S)

= Γ((πS ◦ φ,M )

∗T ∗S

)= Γ (φ∗T ∗S) and

φ∗,ML,v ∈ Γ(φ∗,Mπ

∗E)

= Γ((π ◦ φ,M )

∗(T ∗S ⊗ TM)

)= Γ (φ∗T ∗S ⊗M TM) .

The supporting calculations follow. Define z : M →M×I, m 7→ (m, 0) for purposes of evaluation of i = 0via precomposition as in (10.6). Then δi is a section of a pullback bundle; δi = z∗∂i ∈ Γ (z∗ (TM ⊕ TI)). Itshould be noted that Φ ◦ z = φ by definition, and that z∗Φ,M = (z∗Φ),M = φ,M by (10.6).

Lemma 12.2. Let L, Φ, A, σ, and v be as in Theorem 12.1. The variational derivative of L◦Φ,M decomposesin terms of the partial covariant derivatives L,σ and L,v and the linearized variation A;

δi (L ◦ Φ,M ) = φ∗,ML,σ ·φ∗TS δiΦ + φ∗,ML,v ·(φ×M IdM )∗E ∇φδiΦ.

The integration-by-parts trick as in the derivation of the first variation in elementary calculus of variationsgeneralizes to the covariant setting;

L,σ ·φ∗TS A+ L,v ·φ∗TS⊗MT∗M ∇φA = A ·φ∗T∗S (L,σ − divM L,v) + divM (A ·φ∗T∗S L,v) .

Proof. A wonderful string of equalities follows.

δi (L ◦ Φ,M )

= z∗∇M×I→R (L ◦ Φ,M ) ·z∗(TM⊕TI) δi (here, δi = z∗∂i)

= z∗Φ∗,M∇E→RL ·z∗Φ∗,MTE z∗∇◦ M×I→EΦ,M ·z∗(TM⊕TI) δi (chain rule)

= φ∗,M(L,σ ·π∗STS σ + L,µ ·π∗MTM µ+ L,v ·π∗E v

)·φ∗,MTE δiΦ,M (by (9.4) and because Φ,M ◦ z = φ,M )

= φ∗,ML,σ ·φ∗,Mπ∗STS φ∗,Mσ ·φ∗,MTE δiΦ,M

+ φ∗,ML,µ ·φ∗,Mπ∗MTM φ∗,Mµ ·φ∗,MTE δiΦ,M+ φ∗,ML,v ·φ∗,Mπ∗E φ

∗,Mv ·φ∗,MTE δiΦ,M

= φ∗,ML,σ ·φ∗TS δiΦ + φ∗,ML,v ·(φ×M IdM )∗E ∇φδiΦ (by (12.3))

39

Note that by (6.3), Φ∗,Mπ∗STS = (πS ◦ Φ,M )

∗TS = Φ∗TS, Φ∗,Mπ

∗MTM = (πM ◦ Φ,M )

∗TM =

(prM×IM

)∗TM

and Φ∗,Mπ∗E = (π ◦ Φ,M )

∗E =

(Φ×M×I prM×IM

)∗E. Replacing δiΦ with A gives

δi (L ◦ Φ,M ) = L,σ ·φ∗TS A+ L,v ·φ∗TS⊗MT∗M ∇φA,

establishing the first equality.For the second,

L,σ ·φ∗TS A+ L,v ·φ∗TS⊗MT∗M ∇φA= L,σ ·φ∗TS A+ trT∗M

(L,v ·φ∗TS ∇φA

)(tracing TMseparately)

= A ·φ∗T∗S L,σ + trT∗M(∇ (L,v ·φ∗TS A)−

(∇φ×M IdML,v

)·φ∗TS A

)(reverse product rule)

= A ·φ∗T∗S L,σ −A ·φ∗T∗S trT∗M ∇φ×M IdML,v + trT∗M ∇ (A ·φ∗T∗S L,v) (·φ∗TScommutes with trT∗M )= A ·φ∗T∗S (L,σ − divM L,v) + divM (A ·φ∗T∗S L,v) (definition of divM ).

Note that L,v ∈ Γ (φ∗T ∗S ⊗M TM), so divM L,v ∈ Γ (φ∗T ∗S) and A ·φ∗T∗S L,v ∈ Γ (TM).

Lemma 12.3. The variation δiΦ,M decomposes as follows.

φ∗,Mσ ·φ∗,MTE δiΦ,M = δiΦ ∈ Γ (φ∗TS) ,

φ∗,Mµ ·φ∗,MTE δiΦ,M = 0 ∈ Γ (TM) ,

φ∗,Mv ·φ∗,MTE δiΦ,M = ∇φ∗TSδiΦ ∈ Γ (φ∗TS ⊗M T ∗M) .

Proof. This calculation determines the σ component of δiΦ,M .

φ∗,Mσ ·φ∗,MTE δiΦ,M = φ∗,M ∇◦ πS ·φ∗,MTE δiΦ,M= δi (πS ◦ Φ,M )

= δi(prS×MS ◦π ◦ Φ,M

)= δiΦ ∈ Γ (z∗Φ∗TS) ∼= Γ (φ∗TS) .

This calculation determines the µ component of δiΦ,M .

φ∗,Mµ ·φ∗,MTE δiΦ,M = φ∗,M ∇◦ πM ·φ∗,MTE δiΦ,M= δi (πM ◦ Φ,M )

= δi(prS×MM ◦π ◦ Φ,M

)= δi prM×IM

= 0 ∈ Γ(z∗(prM×IM

)∗TM

)∼= Γ (TM) .

The last equality follows from the fact that prM×IM does not depend on the i coordinate.This calculation determines the v component of δiΦ,M . Let pM := prM×IM and pI := prM×II . The

left-hand side of the third equality claimed in the lemma will be examined before evaluating at i = 0;

Φ∗,Mv ·Φ∗,MTE ∂iΦ,M = ∇(π◦Φ,M )∗Ep∗I∂i

Φ,M = ∇(Φ×M×IpM )∗(TS⊗T∗M)p∗I∂i

Φ,M ∈ Γ (Φ∗TS ⊗M×I p∗MT ∗M) .

40

Let Y ∈ Γ (TM), noting that p∗MY ∈ Γ (p∗MTM). Then(Φ∗,Mv ·Φ∗,MTE ∂iΦ,M

)·p∗MTM p∗MY

= ∇Φ∗TS⊗M×Ip∗MT∗M

p∗I∂iΦ,M ·p∗MTM p∗MY

=(Φ,MI ·p∗ITI p

∗I∂i)·p∗MTM p∗MY

=(Φ,IM ·p∗MTM p∗MY

)·p∗ITI p

∗I∂i (by (10.3))

=(Φ,I ·p∗ITI p

∗I∂i),M·p∗MTM p∗MY − Φ,I ·p∗ITI

((p∗I∂i),M ·p∗MTM p∗MY

)= (∂iΦ),M ·p∗MTM p∗MY (since p∗I∂idoesn’t depend on M).

Recall that IdM = pM ◦ z and that the pullback of bundles is contravariant. Then evaluating at i = 0 viapullback by z renders(

φ∗,Mv ·φ∗,MTE δiΦ,M)·TM Y

=((Φ,M ◦ z)∗ v ·(Φ,M◦z)∗TE z

∗∂iΦ,M)·(pM◦z)∗TM (pM ◦ z)∗ Y

=(z∗Φ∗,Mv ·z∗Φ∗,MTE z

∗∂iΦ,M

)·z∗p∗MTM z∗p∗MY

= z∗((

Φ∗,Mv ·Φ∗,MTE ∂iΦ,M)·p∗MTM p∗MY

)= z∗

((∂iΦ),M ·p∗MTM p∗MY

)= z∗ (∂iΦ),M ·z∗p∗MTM z∗p∗MY

= (z∗∂iΦ),M ·(pM◦z)∗TM (pM ◦ z)∗ Y (by (10.6))

= (δiΦ),M ·TM Y

= ∇φ∗TSδiΦ ·TM Y

The last equality is because δiΦ ∈ Γ (φ∗TS), which is a bundle over M , and therefore (δiΦ),M is thetotal covariant derivative. Because Y is pointwise-arbitrary in TM , this implies that φ∗,M ·φ∗,MTI δiΦ,M =

∇φ∗TSδiΦ, i.e. the variational derivative δi commutes with the first material derivative, just as in theanalogous situation in elementary calculus of variations.

Corollary 12.4 (Euler-Lagrange equations). If φ ∈ C∞ (M,S) is a critical point of L (i.e. if dL (φ) ·A = 0for all A ∈ Γ (φ∗TS)), then

φ∗,ML,σ − divM(φ∗,ML,v

)= 0 on M,

φ∗,ML,v ·TM ν = 0 on ∂M.

These are called the Euler-Lagrange equations for the energy functional L. Recall that because the domainof φ is M , ∇◦ φ ≡ φ,M .

Proof. This follows trivially from (12.1) and the Fundamental Lemma of the Calculus of Variations [6, pg.16].

It should be noted that the boundary Euler-Lagrange equation is due to the fact that the admissiblevariations are entirely unrestricted. If, for example, the class of maps being considered had fixed boundarydata, then any variation would vanish at the boundary, and there would be no boundary Euler-Lagrangeequation; this is typically how geodesics and harmonic maps are formulated.

41

Remark 12.5 (Analogs in elementary calculus of variations). The quantities L,µ, L,σ, L,v generalize the quan-tities ∂L

∂x ,∂L∂z ,

∂L∂p respectively of the elementary treatment of the calculus of variations for energy functional

(f : U → Rn) 7→∫U

L (x, f (x) , Df (x)) dx,

where U ⊂ Rm is compact and U × Rn × Rm×n 3 (x, z, p) 7→ L (x, z, p) is the Lagrangian. Here, ∂L∂x : U ×Rn × Rm×n → Rm, ∂L∂z : U × Rn × Rm×n → Rn, and ∂L

∂p : U × Rn × Rm×n → Rm×n decompose the totalderivative dL and are defined by the relation

dL (x, z, p) · (u, v, w) =∂L

∂x(x, z, p) · u+

∂L

∂z(x, z, p) · v +

∂L

∂p(x, z, p) : w

for u ∈ Rm, v ∈ Rn, and w ∈ Rm×n. The Euler-Lagrange equation in this setting is(∂L

∂z− divU

∂L

∂p

)(x, f (x) , Df (x)) = 0 for x ∈ U,

noting that the left hand side of the equation takes values in Rn.

In most situations involving simpler calculations, it is desirable and acceptable to dispense with thehighly decorated notation and use trimmed-town, context-dependent notation, leaving off type-specifyingsub/superscripts when clear from context.

Proposition 12.6 (Conserved quantity). IfM is a real interval, φ ∈ C∞ (M,S) satisfies the Euler-Lagrangeequation, and L,µ = 0, then

H := (∇◦ φ)∗L,v ·φ∗TS⊗MT∗M ∇◦ φ− (∇◦ φ)

∗L ∈ C∞ (M,R)

is constant. If L is kinetic minus potential energy, then H is kinetic plus potential energy (the total energy),and is referred to as the Hamiltonian.

Proof. Let t be the standard real coordinate. Note that because M is a real interval, ∇◦ φ = φ′ ⊗M dt.Terms appearing in the derivative of H can be simplified as follows. Note the repeated ∇◦ derivatives;∇◦ φ : M → φ∗TS ⊗M T ∗M but ∇◦ ∇◦ φ : M → (∇◦ φ)

∗T (φ∗TS ⊗M T ∗M) lands in a higher tangent space.

(∇◦ φ)∗σ · ∇◦ ∇◦ φ · d

dt= (∇◦ φ)

∗ ∇◦ πS · ∇◦ ∇◦ φ ·d

dt=

d

dt(πS ◦ ∇◦ φ) = φ′,

(∇◦ φ)∗v · ∇◦ ∇◦ φ · d

dt= (∇◦ φ)

∗v · d

dt∇◦ φ = ∇ d

dt∇◦ φ,

∇ ddt

(∇◦ φ)∗L = (∇◦ φ)

∗∇L · ∇◦ ∇◦ φ · ddt

= (∇◦ φ)∗L,σ · φ′ + (∇◦ φ)

∗L,v · ∇ d

dt∇◦ φ,

∇ ddt

((∇◦ φ)

∗L,v : ∇◦ φ

)= ∇ d

dt(∇◦ φ)

∗L,v : (φ′ ⊗M dt) + (∇◦ φ)

∗L,v : ∇ d

dt∇◦ φ

=(∇ d

dt(∇◦ φ)

∗L,v · dt

)· φ′ + (∇◦ φ)

∗L,v : ∇ d

dt∇◦ φ.

Again, because M is a real interval, the divergence is just the derivative, so the Euler-Lagrange equation is

0 = (∇◦ φ)∗L,σ − divM

((∇◦ φ)

∗L,v)

= (∇◦ φ)∗L,σ −∇ (∇◦ φ)

∗L,v :

(dt⊗M

d

dt

)= (∇◦ φ)

∗L,σ −∇ d

dt(∇◦ φ)

∗L,v · dt,

42

and therefore ∇ ddt

(∇◦ φ)∗L,v · dt = (∇◦ φ)

∗L,σ. Thus

∇ ddtH = ∇ d

dt

((∇◦ φ)

∗L,v : ∇◦ φ− (∇◦ φ)

∗L)

=(∇ d

dt(∇◦ φ)

∗L,v · dt− (∇◦ φ)

∗L,σ

)· φ′

which is zero because φ satisfies the Euler-Lagrange equation. This shows that H is constant along solutionsof the Euler-Lagrange equation, and is therefore a conserved quantity. It should be noted that this proofrelies on the fact that the divergence takes a particularly simple form when the domain M is a real interval;the result does not necessarily hold for a general choice of M .

Example 12.7 (Harmonic maps). Define a metric

k ∈ Γ (E∗ ⊗S×M E∗) ∼= Γ ((TS ⊗ T ∗M)⊗S×M (TS ⊗ T ∗M))

in a manner analogous to that in (3.4);k := h� g−1.

To clarify, h ⊗ g−1 ∈ Γ ((T ∗S ⊗S T ∗S)⊗ (TM ⊗M TM)), so permuting the middle two components (as inthe definition of h � g−1) gives the correct type, including the necessary metric symmetry condition. IfA ∈ E, then |A|2k is the quantity obtained by raising/lowering the indices of A and pairing it naturally withA. A useful fact is that ∇k = 0; if u ⊕ v ∈ TS ⊕ TM , then permutation commutativity (8.12) and theproduct rule gives

∇u⊕vk = ∇u⊕v(h� g−1

)= ∇uh� g−1 + h�∇vg−1,

which equals zero because h and g−1 are parallel with respect to ∇TS and ∇T∗M respectively.With Lagrangian

L : E → R, A 7→ 1

2|A|2k

and energy functional

E (φ) :=

∫M

L ◦ ∇◦ φdVg

(E (φ) is called the energy of φ), the resulting Euler-Lagrange equations can be written down after calculatingL,σ and L,v. It is worthwhile to note that L is a quadratic form A 7→ A : 1

2k : A on E, which will automaticallyimply that L,v (A) = A : k. However, the calculation showing this will be carried out for demonstrationpurposes.

Let A,B ∈ TS ⊗ T ∗M . Then ε 7→ A+ εB is a vertical variation of A, since h (δ (A+ εB)) = 0, so

L,v (A) : B = L,v (A) : v · δε (A+ εB)

= δε (L (A+ εB))

= δε

((A+ εB) :

1

2(π∗k (A+ εB)) : (A+ εB)

).

The product rule gives three terms. The middle term is zero because π (A+ εB) = π (A), and therefore doesnot depend on ε. The basepoint evaluation notation for π∗k (A) will be suppressed for brevity (see Section5). Thus

L,v (A) : B = B :1

2k : A+A :

1

2k : B = A : k : B,

where the last equality results from the symmetry of k. By the nondegeneracy of the natural pairing onTS ⊗ T ∗M (which is denoted here by :), this implies that L,v (A) = A : k.

To calculate L,σ, it is sufficient (and can be easier) to calculate L,h, as h = ∇◦ π, π = πS ×E πM , soh = σ ⊕E µ. Let A (ε) be a horizontal curve in E = TS ⊗ T ∗M ; this means that v · ddεA = 0. Recall that

43

v · ddεA is defined by ∇(π◦A)∗Eddε

A. Then

L,h (A) ·π∗(TS⊕TM) h ·TE δεA =(L,h ·π∗(TS⊕TM) h+ L,v ·π∗E v

)·TE δεA

= ∇L ·TE δεA= δε (L ◦A)

= δε

(A :

1

2A∗π∗k : A

).

As before, the product rule gives three terms. Using the contravariance of bundle pullback, the middle termis

1

2∇(π◦A)∗(E∗⊗S×ME∗)δε

(π ◦A)∗k =

1

2(π ◦A)

∗∇E∗⊗S×ME∗k · δε (π ◦A) ,

which equals zero because ∇k = 0. Thus

L,h (A) · h · δεA = ∇δεA :1

2k : A+A :

1

2k : ∇δεA,

which equals zero because ∇δεA = v · δεA = 0. The quantity h · δεA can take any value in π∗ (TS ⊕ TM),showing that L,h = 0. Finally, h = σ ⊕E µ implies that L,σ = 0 and L,µ = 0. This can be understood fromthe fact that L depends only on the fiber values of A, and has no explicit dependence on the basepoint; thisrelies crucially on the fact that ∇k = 0.

Finally, the Euler-Lagrange equations can be written down. Recalling that the natural trace of a tensor(used in the divergence term in the Euler-Lagrange equation) is contraction with the appropriate identitytensor, let (ei) be a local frame for TM and let

(ei)be its dual coframe, so that ei⊗M ei is a local expression5

for IdTM ∈ Γ (TM ⊗M T ∗M). The type-subscripted notation will be minimized except to help clarify. OnM :

0 = (∇◦ φ)∗L,σ − divM

((∇◦ φ)

∗L,v)

= − tr∇ (∇◦ φ : k)

= −∇ei (∇◦ φ : k) ·T∗M ei

= −∇ei∇◦ φ : k · ei − ∇◦ φ : ∇eik · ei.

The second term vanishes because ∇k = 0. Unraveling the definition of k gives ∇ei∇◦ φ : k = φ∗h · ∇ei∇◦ φ ·g−1. Contracting both sides of the above equation with −φ∗h−1 gives

0 = ∇ei∇◦ φ · g−1 · ei = ∇ei∇◦ φ · ei = trg∇2φ ∈ Γ (φ∗TS) .

The quantity trg∇2φ is the g-trace of the covariant Hessian of φ and can rightfully be called the covariantLaplacian of φ and denoted by ∆gφ (this is also referred to as the tension field of φ in other literature[20, pg. 13], which is denoted τ (φ)). Note that ∆gφ is a vector field along φ. This makes sense because φis not necessarily a scalar function; it takes values in S. In the case S = R, ∆gφ is the ordinary covariantLaplacian on scalar functions.

A harmonic map is defined as a critical point of the energy functional E (φ) :=∫M

12 |∇◦ φ|

2k dVM .

Assuming a fixed boundary (so that the variations vanish on the boundary) eliminates the boundary Euler-Lagrange equation, the remaining equation is

∆gφ = 0 on the interior of M,

which is the generalization of Laplace’s equation. Satisfying Laplace’s equation is a sufficient condition for amap to be a critical point of the energy functional. There is an abundance of literature concerning harmonicmaps and the analysis thereof [2, 6, 14, 20].

5It should be noted that while IdTM is being written as the local expression ei ⊗M ei, no inherently local property is beingused; this tensor decomposition is only used so that the product rule can be used in the following calculations in a clear way.

44

Example 12.8 (The geodesic equation). A fundamental problem in differential geometry is determininglength-minimizing curves between given points. If M is a bounded, real interval, and t denotes the standardreal coordinate, then the length functional on curves φ : M → S is L (φ) :=

∫M|φ′|g dt. A topological metric

d : M ×M → R on M can be defined as

d (p, q) := inf {L (φ) | φ joins pto q} .

It can be shown that the length functional L (φ) :=∫M|φ′|h dt and the energy functional E (φ) :=

∫M

12 |φ′|2h dt

have identical minimizers. Note that φ′ ∈ Γ (φ∗TS). It is therefore sufficient to consider the analyticallypreferable energy functional.

In this case, the metric g on M is just scalar multiplication on R. Because M is one-dimensional and tis the standard real coordinate, d

dt is a global, parallel orthonormal frame for TM , and the g-trace of ∇2φ(i.e. ∆gφ) has a single term. The Euler-Lagrange equation, on the interior of M , is

0 = ∆gφ = trg∇2φ = ∇∇◦ φ :

(d

dt,d

dt

)= ∇ d

dt∇◦ φ · d

dt= ∇ d

dt

(∇◦ φ · d

dt

)− ∇◦ φ · ∇ d

dt

d

dt.

But ∇◦ φ · ddt = φ′ and ∇ ddt

ddt = 0, giving the geodesic equation

∇φ∗TSddt

φ′ = 0 on the interior of M.

This is the covariant way to state that the acceleration of φ is identically zero. The geodesic equation iscommonly notated as 0 = ∇φ′φ′, though such notation is inaccurate because φ′ is not a vector field on S,but a vector field along φ, and therefore use of the pullback covariant derivative ∇φ∗TS is correct (see (8.8)).

While formulated using fixed boundary conditions (φ has p and q as its endpoints), the geodesic equationis a second order ODE for which initial tangent vector conditions are sufficient to uniquely determine asolution.

13 Second VariationA further consideration after finding critical points of the energy functional L is determining which criticalpoints are extrema. This will involve calculating the second derivative of L. Let C := C∞ (M,S), notingthat TφC ∼= Γ (φ∗TS) for φ ∈ C. The first derivative of L is ∇C→RL := dL, as seen in the previous section.The second derivative is the covariant Hessian ∇T∗C∇C→RL, where the covariant derivative ∇T∗C is inducedby ∇TS [5, Theorem 5.4].

For the remainder of this section, let I, J ⊆ R be neighborhoods of zero, let i and j be their respective stan-dard coordinates, and extend the existing δ-style derivative-at-a-point notation by defining δi := ∂

∂i |i=j=0,δj := ∂

∂j |i=j=0, and evaluation map z : M →M × I×J, m 7→ (m, 0, 0). Then δi = z∗∂i and δj = z∗∂j ; thesewill be used as in the calculation of the first variation.

Theorem 13.1 (Second variation of L). Let L, L, σ, µ, v and ν all be defined as above. If φ ∈ C∞ (M,S)is a critical point of L and A,B ∈ TφC ∼= Γ (φ∗TS), then the covariant Hessian of L is

∇2L (φ) :TφC (A⊗B)

=

∫M

A ·φ∗T∗S φ∗,ML,σσ ·φ∗TS B +A ·φ∗T∗S φ∗,ML,σv ·φ∗TS⊗MT∗M ∇φ∗TSB

+∇φ∗TSA ·φ∗T∗S⊗MTM φ∗,ML,vσ ·φ∗TS B +∇φ

∗TSA ·φ∗T∗S⊗MTM φ∗,ML,vv ·φ∗TS⊗MT∗M ∇φ∗TSB

−A ·φ∗T∗S(φ∗,ML,v ·φ∗TS⊗MT∗M

(φ∗RTS ·φ∗TS φ,M

))·φ∗TS B dVg.

This is often called the second variation of L. Here, RTS ∈ Γ (TS ⊗S T ∗S ⊗S T ∗S ⊗S T ∗S) denotes theRiemannian curvature endomorphism tensor for the Levi-Civita connection on TS.

45

Proof. Let Φ: M×I×J → S be a two-parameter variation such that δiΦ = A and δjΦ = B (e.g. Φ (m, i, j) :=exp (iA (m) + jB (m))). The variation Φ can be naturally identified with a variation Φ: I × J → C, (i, j) 7→(m 7→ Φ (m, i, j)) which is more conducive to the use of C as a manifold. The tensor products in the generallyinfinite-dimensional TC are taken formally. Let z := (0, 0) ∈ I × J .

By (10.4), taking the algebra formally in the case of infinite-rank vector bundles,

∇2(L ◦ Φ

)= Φ

∗∇2L :Φ∗TC(∇◦ Φ �I×J ∇◦ Φ

)+ Φ

∗∇L ·Φ∗TC ∇∇◦ Φ,

so (∇2L ◦C φ

):TφC (A⊗B)

=(∇2L ◦C φ

):TφC

(δiΦ⊗ δjΦ

)+ (∇L ◦C φ) ·TφC ∇Φ

∗TC

δj ∂iΦ (since ∇L ◦C φ = 0)

=(∇2L ◦C Φ ◦I×J z

):z∗Φ∗TC

(δiΦ⊗ δjΦ

)+(∇L ◦C Φ ◦I×J z

)·z∗Φ∗TC ∇

Φ∗TC

δj ∂iΦ

= ∇2(L ◦C Φ

):TI⊕TJ (δi ⊗I×J δj) (by above)

= δj∂i(L ◦C Φ

)=

∫M

δj∂i (L ◦ Φ,M ) dVg

=

∫M

∇2 (L ◦ Φ,M ) :TM⊕TI⊕TJ (δi ⊗M×I×J δj) dVg

=

∫M

A ·φ∗T∗S φ∗,ML,σσ ·φ∗TS B +A ·φ∗T∗S φ∗,ML,σv ·φ∗TS⊗MT∗M ∇φ∗TSB

+∇φ∗TSA ·φ∗T∗S⊗MTM φ∗,ML,vσ ·φ∗TS B

+∇φ∗TSA ·φ∗T∗S⊗MTM φ∗,ML,vv ·φ∗TS⊗MT∗M ∇φ

∗TSB

−A ·φ∗T∗S(φ∗,ML,v ·φ∗TS⊗MT∗M


))·φ∗TS B dVg (by Calculation (1)).

Supporting calculations follow.

Calculation (1): Abbreviate φ∗,ML,xy by L,xy. By (10.4),

∇2 (L ◦ Φ,M ) :TM⊕TI⊕TJ (δi ⊗M×I×J δj)

=([

Φ∗,M∇2L :Φ∗,MTE (∇◦ Φ,M �M×I×J ∇◦ Φ,M ) + Φ∗,M∇L ·Φ∗,MTE ∇∇◦ Φ,M

]◦ z)

:z∗(TM⊕TI⊕TJ) (δi ⊗M δj)

= z∗Φ∗,M∇2L :z∗Φ∗,MTE (δiΦ,M ⊗M δjΦ,M ) + z∗Φ∗,M∇L ·z∗Φ∗,MTE ∇Φ∗,MTE

δj∂iΦ,M

(by Calculation (2))

= L,σσ :φ∗,Mπ∗STS (δiΦ⊗M δjΦ) + L,σv ·φ∗,Mπ∗STS⊗Mφ∗,Mπ∗E(δiΦ⊗M ∇φ

∗TSδjΦ)

+ L,vσ ·φ∗,Mπ∗E⊗Mφ∗,Mπ∗STS(∇φ∗TSδiΦ⊗M δjΦ

)+ L,vv :φ∗,Mπ∗E

(∇φ∗TSδiΦ⊗M ∇φ

∗TSδjΦ)

+ L,σ ·φ∗,Mπ∗STS ∇Φ∗TSδj ∂iΦ + L,v ·φ∗,Mπ∗E ∇

φ∗TS∇Φ∗TSδj ∂iΦ

+ L,v ·φ∗,Mπ∗E((Idφ∗TS ⊗MδiΦ) ·φ∗TS⊗Mφ∗T∗S

(φ∗RTS ·φ∗TS δjΦ

)·φ∗TS φ,M

)(by Calculation (3)).

Note that ∇Φ∗TSδj

∂iΦ ∈ Γ (φ∗TS), and since φ is a critical point of L,∫M

L,σ ·φ∗,Mπ∗STS ∇Φ∗TSδj ∂iΦ + L,v ·φ∗,Mπ∗E ∇

φ∗TS∇Φ∗TSδj ∂iΦ dVg = 0.

46

Thus ∫M

∇2 (L ◦ Φ,M ) :TM⊕TI⊕TJ (δi ⊗M×I×J δj) dVg

=

∫M

L,σσ :φ∗,Mπ∗STS (δiΦ⊗M δjΦ) + L,σv ·φ∗,Mπ∗STS⊗Mφ∗,Mπ∗E(δiΦ⊗M ∇φ

∗TSδjΦ)

+ L,vσ ·φ∗,Mπ∗E⊗Mφ∗,Mπ∗STS(∇φ∗TSδiΦ⊗M δjΦ

)+ L,vv :φ∗,Mπ∗E

(∇φ∗TSδiΦ⊗M ∇φ

∗TSδjΦ)

+ δiΦ ·φ∗T∗S(L,v ·φ∗,Mπ∗E

((φ∗RTS ·φ∗TS δjΦ

)·φ∗TS φ,M

))dVg

=

∫M

A ·φ∗T∗S L,σσ ·φ∗TS B +A ·φ∗T∗S L,σv ·φ∗TS⊗MT∗M ∇φ∗TSB

+∇φ∗TSA ·φ∗T∗S⊗MTM L,vσ ·φ∗TS B +∇φ

∗TSA ·φ∗T∗S⊗MTM L,vv ·φ∗TS⊗MT∗M ∇φ∗TSB

−A ·φ∗T∗S(L,v ·φ∗TS⊗MT∗M


))·φ∗TS B dVg

(by antisymmetry of curvature tensor).

Calculation (2):

z∗∇∇◦ Φ,M :z∗(TM⊕TI⊕TJ) (δi ⊗M δj)

= z∗∇Φ∗,MTE⊗M×I×J (T∗M⊕T∗I⊕T∗J)∇◦ Φ,M :z∗(TM⊕TI⊕TJ) z∗ (∂i ⊗M×I×J ∂j)

= z∗(∇Φ∗,MTE⊗M×I×J (T∗M⊕T∗I⊕T∗J)∇◦ Φ,M :TM⊕TI⊕TJ (∂i ⊗M×I×J ∂j)

)= z∗

(∇Φ∗,MTE⊗M×I×J (T∗M⊕T∗I⊕T∗J)

∂j∇◦ Φ,M ·TM⊕TI⊕TJ ∂i

)= z∗∇Φ∗,MTE

∂j(∇◦ Φ,M ·TM⊕TI⊕TJ ∂i) (since ∇TM⊕TI⊕TJ∂j

∂i = 0)

= ∇Φ∗,MTE

δj∂iΦ,M .

Calculation (3): As calculated in the proof of (12.1),

φ∗,Mσ ·φ∗,MTE δiΦ,M = δiΦ ∈ Γ (φ∗TS) ,

φ∗,Mµ ·φ∗,MTE δiΦ,M = 0 ∈ Γ (TM) ,

φ∗,Mv ·φ∗,MTE δiΦ,M = ∇φ∗TSδiΦ ∈ Γ (φ∗TS ⊗M T ∗M) .

Furthermore, letting P := prM×I×JM for brevity and noting that P ◦ z = IdM ,

φ∗,Mσ ·φ∗,MTE ∇Φ∗,MTE

δj∂iΦ,M

= z∗∇Φ∗,Mπ∗STS

∂j

(Φ∗,Mσ ·Φ∗,MTE ∂iΦ,M

)(since ∇σ = 0)

= z∗∇(πS◦Φ,M )∗TS∂j

∂iΦ (using calculation from (12.1))

= ∇Φ∗TSδj ∂iΦ ∈ Γ (z∗Φ∗TS) ∼= Γ (φ∗TS) ,

47

φ∗,Mµ ·φ∗,MTE ∇Φ∗,MTE

δj∂iΦ,M

= z∗∇Φ∗,Mπ∗MTM

∂j

(Φ∗,Mµ ·Φ∗,MTE ∂iΦ,M

)(since ∇µ = 0)

= z∗∇(πM◦Φ,M )∗TM∂j

0 (using calculation from (12.1))

= 0 ∈ Γ(z∗ (πM ◦ Φ,M )

∗TM

) ∼= Γ (z∗P ∗TM) ∼= Γ (TM) ,

φ∗,Mv ·φ∗,MTE ∇Φ∗,MTE

δj∂iΦ,M

= z∗∇Φ∗,Mπ∗E

∂j

(Φ∗,Mv ·Φ∗,MTE ∂iΦ,M

)(since ∇v = 0)

= z∗∇(π◦Φ,M )∗E∂j

(∂iΦ),M (using calculation from (12.1)).

Note thatφ∗,Mv ∈ Γ

(φ∗,Mπ

∗E ⊗M φ∗,MT∗E) ∼= Γ

((φ×M IdM )

∗E ⊗M φ∗,MT

∗E),

and therefore


δj∂iΦ,M ∈ Γ

((φ×M IdM )

∗E) ∼= Γ (φ∗TS ⊗M T ∗M) ,

so it suffices to examine its natural pairing with TM elements. Let X ∈ Γ (TM), noting that X = Id∗M X =z∗P ∗X and that P ∗X = TP · (X ⊕ 0TI ⊕ 0TJ) ∈ Γ (P ∗TM). Then(


δj∂iΦ,M

)·TM X

= z∗∇Φ∗TS⊗M×I×JP∗T∗M∂j

(∂iΦ),M ·z∗P∗TM z∗P ∗X

= z∗∇Φ∗TS∂j

((∂iΦ),M ·P∗TM ∇◦ P ·TM⊕TI⊕TJ (X ⊕ 0TI ⊕ 0TJ)

)− z∗

((∂iΦ),M · ∇

P∗TM∂j (∇◦ P ·TM⊕TI⊕TJ · (X ⊕ 0TI ⊕ 0TJ))

)= z∗

(∇Φ∗TS∂j ∇Φ∗TS

X⊕0TI⊕0TJ∂iΦ− (∂iΦ),M · 0P∗TM)

= z∗(∇Φ∗TSX⊕0TI⊕0TJ∇

Φ∗TS∂j ∂iΦ +∇Φ∗TS

[∂j ,X⊕0TI⊕0TJ ]∂iΦ−RΦ∗TS (∂j , X ⊕ 0TI ⊕ 0TJ) ∂iΦ

)= z∗

(∇Φ∗TSX⊕0TI⊕0TJ∇


0 ∂iΦ)

− z∗(

(IdΦ∗TS ⊗M×I×J∂iΦ) ·Φ∗TS⊗M×I×JΦ∗T∗S RΦ∗TS :TM⊕TI⊕TJ (∂j ⊗M×I×J (X ⊕ 0TI ⊕ 0TJ))

)=[∇φ∗TS∇Φ∗TS

δj ∂iΦ + (Idφ∗TS ⊗MδiΦ) ·φ∗TS⊗Mφ∗T∗S(φ∗RTS ·φ∗TS δjΦ

)·φ∗TS φ,M

]·TM X,

where the last equality follows from Calculations (4) and (5). Because X is pointwise-arbitrary in TM , thisshows that

φ∗,Mv ·φ∗,MTE∇Φ∗,MTE

δj∂iΦ,M =

(∇Φ∗TSδj ∂iΦ

),M

+(Idφ∗TS ⊗MδiΦ)·φ∗TS⊗Mφ∗T∗S(φ∗RTS ·φ∗TS δjΦ

)·φ∗TSφ,M .

Calculation (4):

z∗(∇Φ∗TSX⊕0TI⊕0TJ∇


0 ∂iΦ)

= z∗(∇Φ∗TS∂j ∂iΦ

),M·z∗P∗TM z∗P ∗X

=(∇Φ∗TSδj ∂iΦ

),M·TM X (by (10.6))

= ∇φ∗TS∇Φ∗TS

δj ∂iΦ ·TM X (because ∇Φ∗TSδj

∂iΦ ∈ Γ (z∗Φ∗TS) ∼= Γ (φ∗TS)).

48

Calculation (5):

− z∗(RΦ∗TS :TM⊕TI⊕TJ (∂j ⊗M×I×J (X ⊕ 0TI ⊕ 0TJ))

)= z∗

(RΦ∗TS :TM⊕TI⊕TJ ((X ⊕ 0TI ⊕ 0TJ)⊗M×I×J ∂j)

)(antisymmetry of RΦ∗TS)

= z∗(Φ∗RTS :Φ∗TS (∇◦ Φ �M×I×J ∇◦ Φ) :TM⊕TI⊕TJ ((X ⊕ 0TI ⊕ 0TJ)⊗M×I×J ∂j)

)(by (10.5))

= z∗(Φ∗RTS :Φ∗TS ((Φ,M ·P∗TM P ∗X)⊗M×I×J ∂jΦ)

)= z∗

((Φ∗RTS ·Φ∗TS ∂jΦ

)·Φ∗TS Φ,M ·P∗TM P ∗X

)=(z∗Φ∗RTS ·z∗Φ∗TS z∗∂jΦ

)·z∗Φ∗TS z∗Φ,M ·z∗P∗TM z∗P ∗X

=(φ∗RTS ·φ∗TS δjΦ

)·φ∗TS φ,M ·TM X.

Theorem 13.2 (Second variation of L (alternate form)). Let L, L, σ, µ, v and ν all be defined as above.If φ ∈ C∞ (M,S) is a critical point of L and A,B ∈ Γ (φ∗TS), then

∇2L (φ) :TφC (A⊗B)

=

∫M

A ·φ∗T∗S L,σσ ·φ∗TS B +A ·φ∗T∗S L,σv ·φ∗TS⊗MT∗M ∇φ∗TSB

−A ·φ∗T∗S divM L,vσ ·φ∗TS B −A ·φ∗T∗S L,vσ ·T∗M⊗Mφ∗TS(∇φ∗TSB

)(1 2)

−A ·φ∗T∗S divM L,vv ·φ∗TS⊗MT∗M ∇φ∗TSB

−A ·φ∗T∗S L,vv ·T∗M⊗Mφ∗TS⊗MT∗M(∇φ∗TS⊗MT∗M∇φ

∗TSB)(1 2 3)

−A ·φ∗T∗S(L,v ·φ∗TS⊗MT∗M


))·φ∗TS B dVg

+

∫∂M

(A ·φ∗T∗S L,vσ ·φ∗TS B) ·T∗M ν +(A ·φ∗T∗S L,vv ·φ∗TS⊗MT∗M ∇φ

∗TSB)·T∗M ν dV g

Proof. This result follows essentially from (13.1) via several instances of integration by parts to express theintegrand(s) entirely in terms of A and not its covariant derivatives. Abbreviate φ∗,ML,xy by L,xy. Then,integrating by parts allows the covariant derivatives of A to be flipped across the natural pairings over φ∗TS.∫

M

∇φ∗TSA ·φ∗T∗S⊗MTM L,vσ ·φ∗TS B dVg

=

∫M

trTM

((∇φ∗TSA

)(1 2)

·φ∗TS L,vσ ·φ∗TS B)dVg (TMtrace is taken separately)

=

∫M

trTM(∇TM (A ·φ∗T∗S L,vσ ·φ∗TS B)

)− trTM

(A ·φ∗T∗S ∇φ

∗T∗S⊗MTM⊗Mφ∗TSL,vσ ·φ∗TS B)

− trTM

(A ·φ∗T∗S L,vσ ·φ∗TS ∇φ

∗TSB)dVg (reverse product rule)

=

∫M

−A ·φ∗T∗S divM L,vσ ·φ∗TS B (definition of divergence)

−A ·φ∗T∗S L,vσ ·T∗M⊗Mφ∗TS(∇φ∗TSB

)(1 2)

dVg

+

∫∂M

(A ·φ∗T∗S L,vσ ·φ∗TS B) ·T∗M ν dV g (divergence theorem).

49

Similiarly, ∫M

∇φ∗TSA ·φ∗T∗S⊗MTM L,vv ·φ∗TS⊗MT∗M ∇φ

∗TSB dVg

=

∫M

trTM

((∇φ∗TSA

)(1 2)

·φ∗T∗S L,vv ·φ∗TS⊗MT∗M ∇φ∗TSB

)dVg

=

∫M

trTM

(∇TM

(A ·φ∗T∗S L,vv ·φ∗TS⊗MT∗M ∇φ

∗TSB))

− trTM

(A ·φ∗T∗S ∇φ

∗T∗S⊗MTM⊗Mφ∗T∗S⊗MTML,vv ·φ∗TS⊗MT∗M ∇φ∗TSB

)− trTM


∗TS⊗MT∗M∇φ∗TSB

)dVg

=

∫M

−A ·φ∗T∗S divM L,vv ·φ∗TS⊗MT∗M ∇φ∗TSB

−A ·φ∗T∗S L,vv ·T∗M⊗Mφ∗TS⊗MT∗M(∇φ∗TS⊗MT∗M∇φ

∗TSB)(1 2 3)

dVg

+

∫∂M


∗TSB)·T∗M ν dV g.

Together with (13.1), this gives the desired result.

14 Questions and Future WorkThis paper is a first pass at the development of a strongly-typed tensor calculus formalism. The details ofits workings are by no means complete or fully polished, and its landscape is riddled with many temptingrabbit holes which would certainly produce useful results upon exploration, but which were out of the scopeof a first exposition. Here is a list of some topics which the author considers worthwhile to pursue, andwhich will likely be the subject of his future work. Hopefully some of these topics will be inspiring to othermathematicians, and ideally will start a conversation on the subject.

• There refinements to be made to the type system used in this paper in order to achieve better error-checking and possibly more insight into the relevant objects. There are still implicit type identificationsbeing done (mostly the canonical identifications between different pullback bundles).

• The calculations done in this paper are not in an optimally polished and refined state. With experi-ence, certain common operations can be identified, abstract computational rules generated for theseoperations, and the relevant calculations simplified.

• The language of Category Theory can be used to address the implicit/explicit handling of natural typeidentifications, for example, the identification used in showing the contravariance of bundle pullback;ψ∗φ∗F ∼= (φ ◦ ψ)

∗F .

• The details of the particular implementation of the pullback bundle φ∗F as a submanifold of the directproduct M × F are used in this paper, but there is no reason to “open up the box” like this. For mostpurposes, the categorical definition of pullback bundle suffices; the pullback bundle can be workedexclusively using its projection maps πφ

∗FM and ρφ

∗FF . In the author’s experience (which occurred too

late to be incorporated into this paper), using this abstract interface cleans up calculations involvingpullback bundles significantly.

• The type system used for any particular problem or calculation can be enriched or simplified to adjustto the level of detail appropriate for the situation. For example, if γ ∈ C∞ (R,M), then ∇◦ γ ∈Γ (γ∗TM ⊗R T

∗R), but if t is the standard coordinate on R, then ∇◦ γ = γ′⊗Rdt, where γ′ ∈ Γ (γ∗TM)is given by ∇◦ γ · ddt . This “primed” derivative has a simpler type than the total derivative, and would

50

presumably lead to simplier calculations (e.g. in (12.6). This “primed” derivative could also be used inthe derivation of the first and second variations. While this would simplify the type system, it woulddiversify the notation and make the computational system less regularized. However, some situationsmay benefit overall from this.

• The notion of strong typing comes from computer programming languages. The human-driven type-checking which is facilitated by the pedantically decorated notation in this paper can be done bycomputer by implementing the objects and operations of this tensor calculus formalism in a stronglytyped language such as Haskell. This would be a step toward automated calculation checking, andcould be considered a step toward automated proof checking from the top down (as opposed to fromthe bottom up, using a system such as the Coq Proof Assistant).

• Is there some sort of completeness result about the calculational tools and type system in this paper?In other words, is it possible to accomplish “everything” in a global, coordinate-free way using a certainset of tools, such as pullback bundles, covariant derivatives, chain rules, permutations, evaluation-by-pullback?

• The alternate form of the second variation (see (13.2)) can be used to form a generalized Jacobi fieldequation for a particular energy functional. Analysis of this equation and its solutions may give insightsanalogous to the standard (geodesic-based) Jacobi field equation.

AcknowledgementsI would like to express my gratitude to the ARCS (Achievement Rewards for College Scientists) Founda-tion for their having awarded me a 2011-2012 ARCS Fellowship, and for their generous efforts to promoteexcellence in young scientists. I would like to thank my advisor Debra Lewis for trusting in my abilitiesand providing me with the freedom in which the creative endeavor that this paper required could flourish. Iwould like to thank David DeConde for the invaluable conversations at the Octagon in which imagination,creativity, and exploration were gladly fostered. Thanks to Chris Shelley for showing me how to create thetensor diagrams using Tikz. Finally, I would like to thank both Debra and David for their help in editingthis paper.

References[1] Luca Cardelli. Typeful programming. 1991 (revised 1993). Available online at

ftp://gatekeeper.research.compaq.com/pub/DEC/SRC/research-reports/SRC-045.pdf.

[2] Conference Board of the Mathematical Sciences Regional Conference Series in Mathematics. SelectedTopics in Harmonic Maps, number 50. American Mathematical Society, 1983.

[3] C.T.J. Dodson and M.S. Radivoiovici. Second-order tangent structures. International Journal of The-oretical Physics, 21(2):151–161, 1982.

[4] David G. Ebin and Jerrold E. Marsden. Groups of diffeomorphisms and the motion of an incompressiblefluid. The Annals of Mathematics, Second Series, 92(1):102–163, 1970.

[5] Halldor I. Eliasson. Geometry of manifolds of maps. J. Differential Geometry, 1(2), 1967.

[6] Mariano Giaquinta and Stefan Hildebrandt. Calculus of Variations I. Springer-Verlag, 1996.

[7] Ivan Kolár, Peter W. Michor, and Jan Slovák. Natural Operations in Differential Geometry, volume 434.Springer Verlag, 1993. This is an online book which can be found at http://www.mat.univie.ac.at/ mi-chor/listpubl.html.

51

[8] Jeffrey M. Lee. Manifolds and Differential Geometry, volume 107. American Mathematical Society,2009.

[9] John M. Lee. Riemannian Manifolds: An Introduction to Curvature, volume 176. Springer Verlag, 1997.

[10] John M. Lee. Introduction to Smooth Manifolds, volume 218. Springer Verlag, 2006.

[11] Jerrold E. Marsden and Thomas J. R. Hughes. Mathematical Foundations of Elasticity. Prentice Hall,Inc., 1983.

[12] Peter W. Michor. Topics in Differential Geometry, volume 93. American Mathematical Society, 2008.

[13] George A. Miller. The magical number seven, plus or minus two: Some limits on our capacity forprocessing information. American Psychological Association, 101(2):343–352, 1955.

[14] Seiki Nishikawa. Vartiational Problems in Geometry, volume 205. American Mathematical Society,2002.

[15] Richard S. Palais. Foundations of Global Non-Linear Analysis. W.A. Benjamin, Inc., 1968.

[16] David Parnas. On the criteria to be used in decomposing systems into modules. Communications ofthe ACM, 15(12):1053–1058, 1972.

[17] Roger Penrose. The Road to Reality. Vintage Books, 2004.

[18] Eric S. Raymond. The Art of Unix Programming. Pearson Education, Inc., 2003. This is an online bookwhich can be found at http://www.faqs.org/docs/artu/index.html.

[19] Wolfgang Walter. Ordinary Differential Equations, volume 182. Springer Verlag, 1998.

[20] Yuanlong Xin. Geometry of Harmonic Maps, volume 23. Birkhäuser, 1996.

52

Documents

Strong Typed Tensor Calculus