A significant portion of this text has been adapted from the typed notes of Sean Sather-Wagstaff, from a course given by Benedict Gross in Utah in 1999, which in turn used many ideas of Tate's unpublished notes.

Martin Hillel Weissman

Number Theory

Copyright © 2009 Martin Hillel Weissman

typeset with tufte-latex

For academic use only. You may not reproduce or distribute without permission of the author.

First printing, May 2009

Contents

1 Lattices

2 Integers and Number Fields

3 The geometry of numbers

4 Ideals

5 Units

6 Cyclotomic Fields

7 Valued fields

8 P-adic fields

9 Galois Theory

Index

Bibliography

1 Lattices

This chapter introduces lattices, which can be thought of as integral structures on rational vector spaces. Most importantly, we study lattices in vector spaces endowed with a nondegenerate bilinear form, and introduce the discriminant of such lattices.

The study of lattices prepares us for the study of “rings of integers” in number fields. A more general discussion, including function fields, is left for the exercises.

1.1 Lattices

Definition 1.1 Let V be a finite-dimensional vector space over Q. A lattice in V is a free abelian subgroup L ⊂ V such that span_Q(L) = V.

Note that the vector space V is not yet endowed with a bilinear form. Other authors use the word lattice only when the ambient vector space is endowed with a nondegenerate symmetric or skew-symmetric bilinear form.

Given a vector space V over Q of finite dimension n and a subgroup L of V, we find that L is torsion-free, since V is torsion-free. For L to be a lattice, it is necessary and sufficient¹ that L be a free abelian group, isomorphic to Z^n. Assume that L is a lattice in V in what follows. Giving an isomorphism ι : Z^n → L is equivalent to giving an ordered n-tuple (v_1, …, v_n) of vectors such that:

L = Zv_1 ⊕ Zv_2 ⊕ ⋯ ⊕ Zv_n.

¹ This is an exercise for the reader. See the exercises at the end of the chapter.

Such an ordered tuple of vectors (v_1, …, v_n) is called an ordered² basis of the lattice L.

² Many authors neglect the difference between bases and ordered bases. I recommend taking care to distinguish these notions.

A basis of the lattice L is also a basis of the vector space V. Indeed, since L spans V, we immediately find that a basis of L spans V. Moreover, if there is a Q-linear relation:

∑_{i=1}^{n} c_i v_i = 0,

then after multiplying through by a “common denominator” D, one finds a Z-linear relation:

∑_{i=1}^{n} D c_i v_i = 0.

Since L = Zv_1 ⊕ ⋯ ⊕ Zv_n, we find that Dc_i = 0, and hence c_i = 0, for all 1 ≤ i ≤ n.

Fix an isomorphism ι : Z^n → L, corresponding to an ordered basis (v_1, …, v_n) as above. If j : Z^n → L is another isomorphism, corresponding to another ordered basis (w_1, …, w_n), then we find that ι^{-1} ∘ j is an automorphism of Z^n, i.e., an element of GL_n(Z). This provides a map:

B : {ordered bases of L} → GL_n(Z).

This map is bijective, and depends upon an initial choice of ordered basis, i.e., the chosen initial isomorphism ι. A better way to think about this is the following:

Proposition 1.2 The set of ordered bases of L is a torsor³ for GL_n(Z). In other words, GL_n(Z) acts simply transitively on the set of ordered bases of L.

³ The word “torsor” seems to inspire fear in students, perhaps because of its common usage in very difficult mathematics. But, given a group G, a torsor for G is a G-set X, such that G acts transitively on X and the stabilizer of any element x ∈ X is trivial. A torsor is also called a principal homogeneous space by many authors. Fixing a “base point” x ∈ X determines a bijection g ↦ gx from G to X. But no choice of base point is necessarily preferred in a torsor.

Proof: Suppose that (v_1, …, v_n) is an ordered basis of L, and let g = (g_ij) be an element of GL_n(Z). Then a new ordered basis of L is given by (w_1, …, w_n), where

w_i = ∑_j g_ij v_j.

We leave it to the reader to check that this defines a simply transitive action of GL_n(Z) on the set of ordered bases of L.
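The action in this proof can be made concrete with a small computation. The following Python sketch is illustrative only: the helper names `det` and `change_basis` and the sample matrices are assumptions of this example, not part of the text. It applies an integer matrix g of determinant 1 to the standard ordered basis of Z^2, and then applies the inverse of g to recover the original ordered basis.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (fine for small n)."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def change_basis(g, basis):
    """Return (w_1, ..., w_n) with w_i = sum_j g[i][j] * v_j."""
    n = len(basis)
    dim = len(basis[0])
    return [
        tuple(sum(g[i][j] * basis[j][k] for j in range(n)) for k in range(dim))
        for i in range(n)
    ]


basis = [(1, 0), (0, 1)]      # an ordered basis of L = Z^2 (sample choice)
g = [[2, 1], [1, 1]]          # integer entries and determinant 1, so g is in GL_2(Z)
g_inv = [[1, -1], [-1, 2]]    # the inverse of g, again with integer entries

new_basis = change_basis(g, basis)          # another ordered basis of L
recovered = change_basis(g_inv, new_basis)  # acting by the inverse undoes the change
```

Because the inverse of g also has integer entries, the old basis vectors are integer combinations of the new ones, which is why the new tuple spans the same lattice.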

Similarly, an isomorphism ι : Q^n → V corresponds to an ordered basis of V. The set of ordered bases of V (as a vector space over Q) is naturally a torsor for the group GL_n(Q). Perhaps the most familiar elementary consequence is that given an invertible matrix g ∈ GL_n(Q), the rows (or columns) form a basis of Q^n.

More significant is the action of GL(V) on the set of lattices. Observe that if L is a lattice in V as before, and g ∈ GL(V), then gL is also a lattice.⁴

⁴ Indeed, linear algebra implies that gL is a subgroup of V. Furthermore, the function v ↦ gv gives a group isomorphism from L to gL. It follows that gL is free of the appropriate rank.

Proposition 1.3 Let Lat(V) be the set of lattices in V, and fix a lattice L ⊂ V. The action of GL(V) on Lat(V) given above yields a bijection:

Lat(V) ↔ GL(V)/Aut_Z(L).

Proof: Suppose that g ∈ GL(V). We find that gL = L if and only if g ∈ Aut_Z(L), the group of Z-linear automorphisms of L. Indeed, if gL = L, then the map v ↦ gv is a Z-linear automorphism of L. On the other hand, a Z-linear automorphism of L extends uniquely to a Q-linear automorphism of V, by considering a basis of L (which is also a basis of V), for example. It follows that GL(V) acts on Lat(V), and the stabilizer of L is Aut_Z(L).

It remains to prove that the action of GL(V) on Lat(V) is transitive. If L and M are two lattices in V, then we may choose ordered bases (v_1, …, v_n) of L and (w_1, …, w_n) of M. Furthermore, since these bases are both bases of V, there exists a “change of basis matrix” g ∈ GL(V) such that g(v_1, …, v_n) = (w_1, …, w_n). It follows that gL = M.

A special case of the action of GL(V) on Lat(V) is given by homotheties. Namely, if α ∈ Q^×, and L ∈ Lat(V), then αL is also a lattice. When two lattices are related by such a scaling, they are called homothetic.

Proposition 1.4 If L and M are lattices in V, then there exists a nonzero integer α such that:

αM ⊂ L.

Proof: By the previous proposition, there exists g ∈ GL(V) such that g(L) = M. Express g as a matrix (g_ij) ∈ GL_n(Q), with respect to a basis (v_1, …, v_n) of L. Let α be the (nonzero integer) common denominator of the rational numbers g_ij. Then, we find that αgL = αM. Moreover, if v ∈ L, then αgv ∈ L, since the entries of the matrix (αg_ij) are integers. Hence αM ⊂ L.
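The choice of α in this proof is easy to carry out by machine. The sketch below is a minimal illustration using Python's `fractions` module; the function name `common_denominator` and the sample matrix are assumptions made for this example. It computes the least common denominator of the entries of a rational matrix and checks that the scaled matrix has integer entries.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def common_denominator(g):
    """Least common multiple of the denominators of the entries of g."""
    denominators = [entry.denominator for row in g for entry in row]
    return reduce(lambda a, b: a * b // gcd(a, b), denominators, 1)


# Matrix of g with respect to a basis of L (an arbitrary invertible sample).
g = [[Fraction(1, 2), Fraction(0)], [Fraction(1, 3), Fraction(1)]]

alpha = common_denominator(g)
scaled = [[alpha * entry for entry in row] for row in g]
is_integral = all(entry.denominator == 1 for row in scaled for entry in row)
```

Any nonzero integer multiple of this α works equally well; the least common denominator is simply the smallest positive choice obtained this way.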

We finish this section by discussing important facts about pairs of lattices. Namely, we are often led to consider a pair L ⊂ M of lattices in a vector space V over Q. We write [M : L] for the index of L in M, both viewed as abelian groups.

Proposition 1.5 Suppose that M is a lattice in a vector space V over Q and assume that L is a subgroup of M. Then L is a lattice in V if and only if [M : L] is finite.

Proof: First, suppose that [M : L] is finite. Then L, as a subgroup of a free finitely-generated abelian group, is again a free finitely-generated abelian group. Since [M : L] is finite, we find that the rank of L equals the rank of M. It follows that span_Q(L) = span_Q(M) = V. Hence L is a lattice in V.

Conversely, assume that L ⊂ M is a pair of lattices in V. By Proposition 1.4, there exists a nonzero integer α such that αM ⊂ L. Since homothety preserves inclusions and indices of pairs of lattices, we find that:

αL ⊂ αM ⊂ L, and

[L : αL] = [L : αM][αM : αL] = [L : αM][M : L].

Since L ≅ Z^n as an abelian group, we find that [L : αL] = |α|^n < ∞. Since all terms above are cardinal numbers, it follows that⁵

[M : L] ≤ [L : αL] = |α|^n < ∞.

⁵ This actually provides a practical way to estimate, though coarsely, the index [M : L]. Squeeze a multiple αM into L, and the index [M : L] is bounded by |α|^n.

A deeper analysis demonstrates that, given a pair of lattices L ⊂ M, one may choose bases of these lattices which are compatible in a useful way.

Theorem 1.6 Suppose that L ⊂ M is a pair of lattices in a vector space V over Q. Then, there exist positive integers a_1, …, a_n, and an ordered basis (w_1, …, w_n) of M, such that (a_1w_1, …, a_nw_n) is an ordered basis of L.⁶

⁶ Closely related to this theorem is the fundamental property of buildings of p-adic groups: for any two points in such a building, there exists an apartment containing both points.

Proof: This theorem is usually stated as an immediate consequence of the “theory of elementary divisors”. We review the relevant results here. Since L ⊂ M is a pair of lattices in V, there exist isomorphisms λ : Z^n → L and μ : Z^n → M. Let ι : L → M denote the inclusion. There results an injective homomorphism of abelian groups:

φ = μ^{-1} ∘ ι ∘ λ : Z^n → Z^n.

As an endomorphism of Z^n, we view φ as an n by n matrix with integer entries.

There is an algorithm⁷ which constructs two matrices g, h ∈ GL_n(Z) such that g^{-1}φh is a diagonal matrix. The diagonal entries of g^{-1}φh are nonzero, since φ (and hence g^{-1}φh) is an injective endomorphism of Z^n.

⁷ This algorithm brings the matrix φ into “Smith normal form”. The entries of the resulting diagonal matrix are called the elementary divisors of φ. They are uniquely determined, up to re-ordering and sign. The algorithm works over any principal ideal domain, after which the elementary divisors are uniquely determined up to re-ordering and scaling by units. Algorithms are implemented in every major mathematical software package, such as SAGE, Maple, PARI, etc.

Let a_1, …, a_n denote the nonzero diagonal entries of this matrix; after replacing g by gD, for a suitable diagonal matrix D with entries ±1, we may assume that they are positive. Let (e_1, …, e_n) denote the standard basis of Z^n, so that for all 1 ≤ i ≤ n,

[g^{-1}φh](e_i) = a_i e_i.

For all 1 ≤ i ≤ n, define w_i = [μg](e_i) ∈ M and v_i = [λh](e_i) ∈ L. Then (w_1, …, w_n) is an ordered basis of M and (v_1, …, v_n) is an ordered basis of L. We now compute:

ι(v_i) = [μφλ^{-1}](v_i)
= [μφh](e_i)
= [μg](a_i e_i)
= a_i w_i.

The basis (w_1, …, w_n) satisfies the desired condition.
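For small matrices the elementary divisors can be found without carrying out the full reduction to Smith normal form. The sketch below does not implement the algorithm mentioned in the proof; it relies instead on the standard characterization that the product of the first k elementary divisors (ordered so that each divides the next) equals the greatest common divisor of all k by k minors. The names `det` and `elementary_divisors` and the sample matrix are assumptions of this example.

```python
from itertools import combinations
from math import gcd


def det(m):
    """Determinant by Laplace expansion along the first row (fine for small n)."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def elementary_divisors(phi):
    """Positive elementary divisors of a nonsingular integer matrix, via gcds of minors."""
    n = len(phi)
    divisors, previous = [], 1
    for k in range(1, n + 1):
        d_k = 0  # gcd of all k-by-k minors of phi
        for rows in combinations(range(n), k):
            for cols in combinations(range(n), k):
                d_k = gcd(d_k, det([[phi[r][c] for c in cols] for r in rows]))
        divisors.append(d_k // previous)
        previous = d_k
    return divisors


# Sample: the vectors (1, 1) and (1, -1) span the sublattice L of M = Z^2
# consisting of the points whose coordinate sum is even.
phi = [[1, 1], [1, -1]]
divisors = elementary_divisors(phi)
index = divisors[0] * divisors[1]  # product of the a_i, i.e. the index [M : L]
```

This approach is exponential in n and is meant only to make the statement of the theorem tangible; for serious computations one would use the Smith normal form routines of a computer algebra system.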


1.2 Bilinear forms

The study of lattices in a vector space becomes much more interesting when the ambient vector space is endowed with a nondegenerate bilinear form. In this section, we will always consider a lattice L in a finite-dimensional vector space V over Q, endowed with a bilinear form ⟨·, ·⟩.

Definition 1.7 A bilinear form ⟨·, ·⟩ on V is called symmetric if ⟨v, w⟩ = ⟨w, v⟩ for all v, w ∈ V. It is called skew-symmetric if ⟨v, w⟩ = −⟨w, v⟩ for all v, w ∈ V.

We will always work with symmetric or skew-symmetric bilinear forms.

By working with either symmetric or skew-symmetric forms throughout, we avoid having to constantly distinguish between “left-nondegenerate” and “right-nondegenerate”, or “left-dual” and “right-dual”, etc.

Definition 1.8 Suppose that ⟨·, ·⟩ is a symmetric or skew-symmetric bilinear form on V. We say that this bilinear form is nondegenerate if for every linear functional f ∈ V* = Hom_Q(V, Q), there exists a unique w ∈ V such that f(v) = ⟨w, v⟩ for all v ∈ V.

A classical pairing on V is a nondegenerate, symmetric or skew-symmetric, bilinear form on V.

We work in this section with lattices L in a vector space V over Q endowed with a classical pairing.

The notation for “dual lattices” varies greatly from one author to the next. We prefer to use the rare notation L♯ to avoid confusion with more common notions of duality. In particular, we believe that the notation L* should be avoided, since L* would naturally be a lattice in V*, while L♯ is a lattice in V itself. In addition, L♯ should not be confused with the dual Z-module Hom(L, Z).

Definition 1.9 Given a lattice L ⊂ V, and a classical pairing on V, the dual lattice L♯ is the subgroup of V given by:

L♯ = {v ∈ V : ⟨v, w⟩ ∈ Z for all w ∈ L}.

Proposition 1.10 With the assumptions of the previous definition, the dual lattice L♯ is a lattice in V, and L♯♯ = L.

Proof: Suppose that (v_1, …, v_n) is an ordered basis of L. Let v♯_1, …, v♯_n denote the dual ordered basis of V, in the sense that ⟨v♯_i, v_j⟩ = δ_ij (the Kronecker symbol⁸). From the definition of L♯, it follows that v♯_1, …, v♯_n are elements of L♯.

⁸ It is defined that δ_ij = 1 if i = j, and δ_ij = 0 otherwise.

Conversely, if w ∈ L♯, then there exist integers c_1, …, c_n such that ⟨w, v_i⟩ = c_i for all 1 ≤ i ≤ n. It follows⁹ that w = ∑ c_i v♯_i. It follows now that L♯ is the lattice in V given by:

L♯ = Zv♯_1 ⊕ ⋯ ⊕ Zv♯_n.

⁹ This utilizes nondegeneracy of the bilinear form. The reader should verify this if it is not clear.

Furthermore, the definition of “dual basis” is symmetric or skew-symmetric (depending upon the classical pairing) in that (±v_1, …, ±v_n) is the ordered basis dual to (v♯_1, …, v♯_n). It follows that

L♯♯ = Zv_1 ⊕ ⋯ ⊕ Zv_n = L.
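For the dot product on Q^n, the dual ordered basis used in this proof can be computed by inverting a matrix. The sketch below assumes that setting (the dot product is only one example of a classical pairing); the helper names `inverse`, `dual_basis` and `dot` and the sample basis are hypothetical. If the rows of a matrix B are the basis vectors, the dual basis vectors are the rows of the transpose of the inverse of B.

```python
from fractions import Fraction


def inverse(m):
    """Invert a square rational matrix by Gauss-Jordan elimination."""
    n = len(m)
    aug = [
        [Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
        for i, row in enumerate(m)
    ]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        lead = aug[col][col]
        aug[col] = [x / lead for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def dual_basis(basis):
    """Dual ordered basis for the dot product: rows of the inverse transpose."""
    inv = inverse(basis)
    n = len(basis)
    return [tuple(inv[k][i] for k in range(n)) for i in range(n)]


def dot(v, w):
    return sum(a * b for a, b in zip(v, w))


basis = [(1, 1), (1, -1)]  # sample ordered basis of a lattice in Q^2
dual = dual_basis(basis)
pairings = [[dot(d, v) for v in basis] for d in dual]  # should be the identity matrix
```

The table `pairings` records the values of the pairing between dual basis vectors and basis vectors, so it reproduces the Kronecker symbol.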


Definition 1.11 Let (v_1, …, v_n) be an ordered basis of L. Let λ_ij = ⟨v_i, v_j⟩. The discriminant of L (with respect to this ordered basis) is the determinant of the matrix λ = (λ_ij).

Suppose that (v_1, …, v_n) and (w_1, …, w_n) are two ordered bases of L. Let λ_ij = ⟨v_i, v_j⟩ and μ_ij = ⟨w_i, w_j⟩. Then, by Proposition 1.2, there exists g ∈ GL_n(Z) such that w_i = ∑_j g_ij v_j. Writing ᵗg for the transpose of g, it follows that:

Det(μ) = Det(g λ ᵗg) = Det(g)² Det(λ).

But, observe that if g ∈ GL_n(Z), then Det(g) ∈ Z^× = {±1}. It follows that Det(g)² = 1. Hence we have proven:

Proposition 1.12 The discriminant of a lattice L (in a vector space V with classical pairing) depends only upon the lattice and not upon the choice of ordered basis of L. Thus, we find an invariant:

Disc : Lat(V) → Q.
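The independence of the discriminant from the ordered basis can be checked numerically in a small case. The following sketch assumes the dot product on Q^2; the helpers `det` and `gram` and the sample data are assumptions of this example. It computes the determinant of the matrix of pairings for one ordered basis and for a second ordered basis obtained by applying an element of GL_2(Z).

```python
def det(m):
    """Determinant by Laplace expansion along the first row (fine for small n)."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def gram(basis):
    """Matrix of pairings <v_i, v_j>, here for the dot product."""
    return [[sum(a * b for a, b in zip(v, w)) for w in basis] for v in basis]


basis = [(1, 1), (1, -1)]  # sample ordered basis of a lattice L in Q^2
g = [[2, 1], [1, 1]]       # determinant 1, so g lies in GL_2(Z)

# w_i = sum_j g[i][j] * v_j
new_basis = [
    tuple(sum(g[i][j] * basis[j][k] for j in range(2)) for k in range(2))
    for i in range(2)
]

disc_old = det(gram(basis))
disc_new = det(gram(new_basis))
```

The two matrices of pairings differ entry by entry, but their determinants agree, as the proposition predicts.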

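Proposition 1.13 Suppose that M = αL are homothetic lattices in a vector space V of dimension n. Then Disc(M) = α^{2n} Disc(L).

Proof: If (v_1, …, v_n) is an ordered basis of L, then (αv_1, …, αv_n) is an ordered basis of M. Each entry ⟨αv_i, αv_j⟩ = α²⟨v_i, v_j⟩ of the associated n by n matrix is scaled by α², so its determinant is scaled by α^{2n}. The proposition now follows directly from the definition of the discriminant.

By applying a homothety, one can scale a lattice in such a way that its discriminant is an integer; in fact, one arrives at the following useful fact: every lattice L is homothetic to a unique lattice M such that Disc(M) is an integer which is not divisible by the 2n-th power of any prime. When n = 1, this says that Disc(M) is a square-free integer.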

More generally, using elementary divisors, one can prove

Proposition 1.14 Suppose that L ⊂ M are lattices. Then Disc(L) = [M : L]² Disc(M).

Proof: By Theorem 1.6, there exists an ordered basis (v_1, …, v_n) of M and positive integers a_1, …, a_n, such that (a_1v_1, …, a_nv_n) is an ordered basis of L. Let λ_ij = ⟨v_i, v_j⟩. Let A denote the diagonal matrix with diagonal entries a_1, …, a_n. Then we find that

Disc(L) = Det(AλA) = Det(A)² Det(λ) = Det(A)² Disc(M).

Furthermore, the index of L in M is precisely the product ∏_i a_i = Det(A). The proposition follows.
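The computation in this proof can be replayed on a concrete pair of lattices. The sketch below again assumes the dot product; the helper names and the sample basis and integers a_i are assumptions made for illustration. It builds a sublattice L from a basis of M as in Theorem 1.6 and compares the two discriminants.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (fine for small n)."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def gram(basis):
    """Matrix of pairings <v_i, v_j>, here for the dot product."""
    return [[sum(x * y for x, y in zip(v, w)) for w in basis] for v in basis]


basis_M = [(1, 1), (1, -1)]  # sample ordered basis (v_1, v_2) of M
a = [1, 3]                   # sample positive integers a_1, a_2

# Ordered basis (a_1 v_1, a_2 v_2) of the sublattice L
basis_L = [tuple(a_i * x for x in v) for a_i, v in zip(a, basis_M)]

index = a[0] * a[1]          # the index [M : L]
disc_M = det(gram(basis_M))
disc_L = det(gram(basis_L))
```

Here the discriminant of L is the discriminant of M multiplied by the square of the index, in agreement with the proposition.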


Two methods are now available to classify and describe a lattice L. First, one may describe a lattice L by describing properties of its discriminant. Second, one may describe a lattice L by describing properties of the bilinear form, evaluated on L. The following definitions fall within these frameworks.

Definition 1.15 A lattice L is called integral if one of the following equivalent conditions holds:

1. L ⊂ L♯.

2. ⟨v, w⟩ ∈ Z for all v, w ∈ L.

Proposition 1.16 Suppose that L is an integral lattice. Then [L♯ : L] = ±Disc(L). In particular, Disc(L) ∈ Z.

Proof: Since L is an integral lattice, L ⊂ L♯ and by Theorem 1.6, there exists an ordered basis (v♯_1, …, v♯_n) of L♯ and positive integers a_1, …, a_n, such that (a_1v♯_1, …, a_nv♯_n) is an ordered basis of L. Using this ordered basis, we find that

[L♯ : L] = ∏_{i=1}^{n} a_i.

Since L and L♯ are dual lattices, consider the basis (v_1, …, v_n) of L dual to the basis (v♯_1, …, v♯_n) of L♯. We now consider the discriminant of L; let λ = (λ_ij) where λ_ij = ⟨v_i, v_j⟩. Thus Disc(L) = Det(λ). Since (v_1, …, v_n) and (a_1v♯_1, …, a_nv♯_n) are both ordered bases of L, there exists g ∈ GL_n(Z) such that a_iv♯_i = ∑_j g_ij v_j. Writing ᵗg for the transpose of g, it follows that:

λ ᵗg = (⟨v_i, a_j v♯_j⟩).

Thus λ ᵗg is a diagonal matrix with diagonal entries a_1, …, a_n (up to a common sign, in the skew-symmetric case). Since Det(g) = ±1, we find that:

Disc(L) = Det(λ) = ±Det(λ ᵗg) = ±∏_{i=1}^{n} a_i.

1.3 Examples and Properties of Lattices

In this section, we consider lattices L in a fixed vector space V over Q, endowed with a classical pairing ⟨·, ·⟩.

Definition 1.17 An integral lattice L is called unimodular if L = L♯, or equivalently (by Proposition 1.16), if Disc(L) = ±1.


Example 1.18 The standard lattice Z^n in Q^n is unimodular, when Q^n is endowed with the usual “dot product”:

⟨x, y⟩ = x_1y_1 + ⋯ + x_ny_n.

The discriminant of this lattice is equal to 1.

If L is an integral lattice, then L ⊂ L♯. It follows that the classical pairing on V restricts to a Q-valued bilinear form on L♯, which descends to a bilinear form:

L♯/L × L♯/L → Q/Z.

Proposition 1.19 If L is an integral lattice, then the bilinear form above is a perfect pairing on the finite abelian group L♯/L. In other words, it induces an isomorphism:

L♯/L → Hom_Z(L♯/L, Q/Z).

Proof: Since the group L♯/L is finite, and Hom_Z(L♯/L, Q/Z) has the same order, it suffices to prove injectivity of the map L♯/L → Hom_Z(L♯/L, Q/Z). An element of the kernel lifts to an element v ∈ L♯ for which ⟨v, w⟩ ∈ Z for all w ∈ L♯. Such an element v lies in L♯♯ = L, by Proposition 1.10. It follows that the aforementioned map is injective.

An important consequence of this proposition is that it reduces the hunt for unimodular lattices to a problem about finite abelian groups.

Proposition 1.20 Suppose that L is an integral lattice, and let A denote the finite abelian group L♯/L, endowed with the perfect pairing of the previous proposition. If B is a subgroup of A, define

B^⊥ = {a ∈ A such that ⟨b, a⟩ = 0 for all b ∈ B}.

If M is a lattice and L ⊂ M ⊂ L♯, then M is unimodular if and only if M̄ = M̄^⊥, where M̄ = M/L ⊂ A.

Proof: Suppose that M is a lattice and L ⊂ M ⊂ L♯. We find that:

L = L♯♯ ⊂ M♯ ⊂ L♯.

Hence M and M♯ are uniquely determined by their images M̄ = M/L and M♯/L in A = L♯/L. One may directly compute that M♯/L = M̄^⊥, from which the proposition follows.

Definition 1.21 An integral¹⁰ lattice L is called even if ⟨v, v⟩ ∈ 2Z for all v ∈ L.

¹⁰ If the classical pairing on V is symmetric, then the condition ⟨v, v⟩ ∈ 2Z for all v ∈ L implies that L is integral.

Example 1.22 Let Z^n denote the standard lattice in Q^n, endowed with the “dot product” bilinear form. Consider the sublattice L ⊂ Z^n, given by

L = {(x_1, …, x_n) ∈ Z^n such that ∑_{i=1}^{n} x_i ∈ 2Z}.

Then, we find that [Z^n : L] = 2, since L is the kernel of the “sum mod 2” homomorphism from Z^n to Z/2Z. It follows that Disc(L) = 2² Disc(Z^n) = 4. The lattice L is even, since for all v = (x_1, …, x_n) ∈ L, we find that

⟨v, v⟩ = ∑_{i=1}^{n} x_i² ≡ ∑_{i=1}^{n} x_i ≡ 0 (mod 2).

Figure 1.1: The lattice L and its dual L♯, satisfying L ⊂ Z² ⊂ L♯. Elements of L are the red dots. Elements of L♯ are blue dots. Observe that L♯/L is isomorphic to Z/2Z ⊕ Z/2Z.

Observe that L ⊂ Z^n ⊂ L♯, and [L♯ : L] = 4. It follows that L♯/L is a finite abelian group isomorphic to Z/2Z × Z/2Z, or to Z/4Z. These cases can be distinguished by considering the element η = ½(1, …, 1) ∈ Q^n. Then η ∈ L♯ and η ∉ Z^n. If n is odd, then 2η ∉ L. Hence, if n is odd, then L♯/L is not a 2-torsion group, and so L♯/L is isomorphic to Z/4Z. On the other hand, if n is even, and v = (x_1, …, x_n) ∈ L♯, then 2η ∈ L and

⟨v, 2η⟩ = x_1 + ⋯ + x_n ∈ Z.

It follows that 2x_1 + ⋯ + 2x_n ∈ 2Z. Moreover, each 2x_i = ⟨v, 2e_i⟩ is an integer, since 2e_i ∈ L, where e_i denotes the i-th standard basis vector. Hence 2v ∈ L. Therefore, the group L♯/L is a 2-torsion group. We have found that if n is even then L♯/L ≅ Z/2Z × Z/2Z.
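The parity dichotomy in this example can be observed directly. The sketch below is an illustration under stated assumptions: it tests membership in L♯ against the vectors e_i − e_{i+1} (for i < n) together with 2e_n, which form a basis of L chosen for this sketch (they lie in L and span a subgroup of index 2 in Z^n), and all function names are hypothetical. It then finds the smallest positive multiple of η which lies in L, that is, the order of the class of η in L♯/L.

```python
from fractions import Fraction


def in_L(v):
    """Membership in L: integer coordinates whose sum is even."""
    return all(Fraction(x).denominator == 1 for x in v) and sum(v) % 2 == 0


def basis_of_L(n):
    """The vectors e_i - e_{i+1} for i < n, together with 2 e_n."""
    vectors = []
    for i in range(n - 1):
        vectors.append([1 if k == i else -1 if k == i + 1 else 0 for k in range(n)])
    vectors.append([2 if k == n - 1 else 0 for k in range(n)])
    return vectors


def in_dual(v, n):
    """Membership in the dual lattice: integer dot product with each basis vector of L."""
    return all(
        Fraction(sum(a * b for a, b in zip(v, w))).denominator == 1
        for w in basis_of_L(n)
    )


def order_of_eta(n):
    """Smallest k >= 1 such that k * (1/2, ..., 1/2) lies in L."""
    eta = [Fraction(1, 2)] * n
    k = 1
    while not in_L([k * x for x in eta]):
        k += 1
    return k


orders = {n: order_of_eta(n) for n in range(2, 7)}
```

An element of order 4 can only occur in Z/4Z, so the odd cases are settled by this computation alone; in the even cases the computation is consistent with the 2-torsion argument given in the example.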

1.4 Generalizations

Many definitions and results of this chapter, suitably modified, hold when Z is replaced by a principal ideal domain R, and Q is replaced by the field K of fractions of R. The definition of a lattice generalizes as follows:

Definition 1.23 If V is a K-vector space of finite dimension n, then an R-lattice in V is an R-submodule L of V, such that L is free of rank n as an R-module and K · L = V.

The following generalizations are straightforward:

• The set of ordered R-bases of an R-lattice L is a torsor for GL_n(R).

• Given an R-lattice L, there is a natural bijection between the set Lat_R(V) of R-lattices in V and the quotient GL(V)/Aut_R(L).

• If L and M are two R-lattices in V, then there exists a nonzero element α ∈ R such that αM ⊂ L.

• If L ⊂ M are two R-lattices in V, then there exist nonzero elements a_1, …, a_n ∈ R and an ordered R-basis (w_1, …, w_n) of M, such that (a_1w_1, …, a_nw_n) is an ordered R-basis of L.

Given a nondegenerate symmetric or skew-symmetric K-bilinear form on V, the other definitions and results generalize as well. If L is an R-lattice in V, we write

L♯ = {v ∈ V such that ⟨v, w⟩ ∈ R for all w ∈ L}.

• If L ⊂ M are lattices in V, then M♯ ⊂ L♯.

• If L is a lattice in V, then L♯♯ = L.

• The discriminant Disc(L) of an R-lattice is well-defined, as an element of K^×/(R^×)², where (R^×)² denotes the group of squares of units of R.

Beyond these basic properties, it is difficult and sometimes impossible to generalize results connecting the discriminant to indices of lattices, for example. The primary obstruction is that the elements of K or R no longer have any necessary relation to integers, and one cannot expect them to relate to indices and orders of finite abelian groups.

1.5 Exercises

Exercise 1.1 Suppose that V is a finite-dimensional vector space over Q with classical pairing. Prove that every lattice L is homothetic to an integral lattice.

Exercise 1.2 Suppose that V is a finite-dimensional vector space over Q with classical pairing. Suppose that L is a lattice in V. Prove that if α ∈ Q^×, then (αL)♯ = α^{-1}(L♯).

Exercise 1.3 Suppose that V and W are finite-dimensional vector spaces over Q endowed with classical pairings. Let L and M be lattices in V and W, respectively. Then L ⊕ M is a lattice in V ⊕ W. How is the discriminant of L ⊕ M related to the discriminant of L and the discriminant of M?

Exercise 1.4 Suppose that V is an n-dimensional vector space over Q. Suppose that L is a subgroup of V, and L is isomorphic to Z^n. Prove that span(L) = V. What if Q is replaced by the field R of real numbers? Can you think of a counterexample?

Exercise 1.5 Suppose that V is an n-dimensional vector space over Q, and M is a lattice in V. Let p be a prime number.

(a) Prove that if L is a lattice in V, and L ⊂ M and [M : L] = p, then p · M ⊂ L.

(b) Prove that there are 1 + p + ⋯ + p^{n−1} lattices L in V, such that L ⊂ M and [M : L] = p. Hint: Find a correspondence between these lattices and hyperplanes (subspaces of codimension 1) in the F_p-vector space M/pM.

(c) Prove that Aut_Z(M) acts transitively on the set of lattices of index p in M.

Exercise 1.6 Let R = Z[i] and K = Q(i); recall that R is a principal ideal domain. Let V be a finite-dimensional K-vector space. Fix a nondegenerate Hermitian form¹¹ Φ on V.

¹¹ Recall that this means that Φ : V × V → K is a Q-bilinear form which satisfies the following axioms:

• Φ(zv, w) = Φ(v, z̄w) = zΦ(v, w), for all v, w ∈ V and all z ∈ K. Here, z̄ denotes the complex conjugate of z.

• Φ(v, w) is the complex conjugate of Φ(w, v), for all v, w ∈ V. In particular, Φ(v, v) ∈ Q for all v ∈ V.

• The Q-linear map, sending w ∈ V to the K-linear form Φ(•, w) ∈ Hom_K(V, K), is an isomorphism of vector spaces over Q. (It is conjugate-linear, as a map of K-vector spaces.)

(a) Let L be an R-lattice in V. Define the discriminant of L, relative to an ordered R-basis of L. In what sense is it independent of the choice of basis?

(b) Consider the “standard” Hermitian space V = K^n, endowed with the Hermitian form:

Φ((x_1, …, x_n), (y_1, …, y_n)) = x_1ȳ_1 + ⋯ + x_nȳ_n.

Let L = R^n ⊂ K^n. What is the discriminant of L?

(c) Given an R-lattice L in V, we may also view L as a Z-lattice in V when V is viewed as a vector space over Q. Consider the Q-bilinear form on V, given by:

⟨v, w⟩ = 2 Re(Φ(v, w)).

Find a formula relating the discriminant of L as a Z-lattice in V (endowed with the bilinear form ⟨·, ·⟩) to the discriminant of L as an R-lattice in V (endowed with the Hermitian form Φ).

2 Integers and Number Fields

A number field is a field F which is a finite extension of Q, i.e., a field of characteristic zero, which is finite-dimensional as a vector space over Q. If α is an element of a number field F, then the countably infinite set {1, α, α², …} cannot be linearly independent over Q. A Q-linear relation among these elements is precisely a polynomial with α as a root:

P(α) = a_nα^n + ⋯ + a_1α + a_0 = 0,

for some a_0, …, a_n ∈ Q, not all zero. In this way, every element of F is algebraic over Q.¹

¹ On the other hand, there exist fields F/Q, in which every element is algebraic, but [F : Q] = ∞. Indeed, any algebraic closure of Q has infinite dimension, as a Q-vector space.

Given a number field F, we focus in this chapter on the subset O_F of “integers” in F. Viewing F as a finite-dimensional vector space over Q, we will find that O_F is not only a ring, but also a lattice in F. Moreover, there is a natural classical pairing on F, from which O_F can be further studied using the techniques of the previous chapter. In particular, we arrive at invariants such as the “discriminant of F”, defined as the discriminant of the lattice O_F.

In this chapter, we assume some basic field theory, but we recall some important definitions along the way.

2.1 Algebraic and integral elements

Suppose that F is a field extension² of Q. Recall that an element α of F is called algebraic (over Q) if there exists a polynomial³ P ∈ Q[X] such that P ≠ 0 and P(α) = 0. But it is convenient to think about algebraicity in a few different ways.

² This just means that F is a field which contains Q.

³ If such a polynomial exists, then one may divide through by the leading coefficient to make it monic.

Proposition 2.1 Suppose that F is a field extension of Q and α ∈ F. The following conditions are equivalent.

1. α is algebraic.

2. The field⁴ Q(α) is a finite extension of Q.

3. There exists a subfield K of F, such that K is a finite extension of Q, and K contains α.

⁴ This is the smallest subfield of F containing α, i.e., the intersection of all subfields of F containing α.

Proof:

(1) implies (2) Suppose that α is algebraic, and let I denote the set of polynomials P in Q[X] such that P(α) = 0. Since α is algebraic, I ≠ 0. Furthermore, I is an ideal in Q[X], which is a principal ideal domain. Let g_α be a generator of I. Evaluation at α induces a ring homomorphism:

Ev_α : Q[X]/I → Q(α), given by P ↦ P(α).

Since Q(α) is a domain, I is a nonzero prime ideal in Q[X]. Since Q[X] is a PID, this implies that I is a maximal ideal and so Im(Ev_α) is a subfield of Q(α). Since Q(α) is the smallest subfield of F containing α, and Im(Ev_α) contains α, we find that Im(Ev_α) = Q(α). Hence Ev_α is a field isomorphism.

The dimension of Q[X]/I, as a Q-vector space, is equal to the degree of g_α. Thus Q(α) is a finite extension of Q of degree Deg(g_α).

(2) implies (3) The field Q(α) is a finite extension of Q containing α, by (2).

(3) implies (1) Suppose that K is a finite extension of Q, and K contains α. Let m_α denote “multiplication by α”, viewed as a Q-linear endomorphism of the Q-vector space K. The characteristic polynomial⁵ of m_α is the polynomial

P_α = X^n − Tr(m_α)X^{n−1} + ⋯ ± Det(m_α).

⁵ The argument given here is a bit overpowered for the modest needs of this proposition. One may instead observe that the infinite set {1, α, α², …} cannot be linearly independent, in the finite-dimensional vector space K. A Q-linear dependence among these powers of α yields a polynomial with α as a root. However, our stronger argument gives a specific polynomial of interest.

By the Cayley-Hamilton theorem, we find that P_α(m_α) = 0 in End_Q(K). Since K is a field, the map α ↦ m_α is an injective homomorphism from K into the Q-algebra End_Q(K). Thus, the identity P_α(m_α) = 0 in End_Q(K) implies the identity P_α(α) = 0 in K. Since P_α ≠ 0, it follows that α is algebraic.

The previous proposition allows us to easily prove the algebraicity of numbers, even when we cannot easily write down polynomials of which they are roots.

Corollary 2.2 Suppose that F is a field extension of Q. Let K denote the set of elements of F which are algebraic over Q. Then K is a subfield of F.

Proof: It suffices to prove that, given (nonzero) algebraic elements α, β ∈ F, the elements α ± β, αβ, and α/β are algebraic⁶. Let Q(α, β) denote the smallest⁷ subfield of F containing α and β. Foundational results in field theory imply that

[Q(α, β) : Q] = [Q(α, β) : Q(α)][Q(α) : Q], and

[Q(α, β) : Q(α)] ≤ [Q(β) : Q].

⁶ Finding explicit polynomials is more difficult. For example, can you find a nonzero polynomial P with rational coefficients, such that P(√2 + √3) = 0? The previous results imply that such a polynomial P exists; in fact, one can find P of degree 4. But finding such a polynomial, especially in more general circumstances, can be difficult.

⁷ The smallest such subfield can be obtained by intersecting all such subfields.

By algebraicity of α and β, we find that [Q(α) : Q] < ∞ and [Q(β) : Q] < ∞. Hence

[Q(α, β) : Q] ≤ [Q(α) : Q][Q(β) : Q] < ∞.

Since α ± β, αβ and α/β are contained in Q(α, β), the third criterion of the previous proposition implies their algebraicity.
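The characteristic polynomial argument of Proposition 2.1 can be carried out by machine for the element √2 + √3. The sketch below is one possible approach, not a method prescribed by the text: it writes multiplication by √2 + √3 as a matrix with respect to the Q-basis (1, √2, √3, √6) of Q(√2, √3), and computes the characteristic polynomial with the Faddeev-LeVerrier recursion. The basis, the function names, and the choice of recursion are assumptions of this example.

```python
from fractions import Fraction


def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def char_poly(a):
    """Coefficients of Det(X - a), highest degree first (Faddeev-LeVerrier recursion)."""
    n = len(a)
    coeffs = [Fraction(1)]
    m = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        # m <- a*m + c*I, where c is the most recently computed coefficient
        m = mat_mul(a, m)
        for i in range(n):
            m[i][i] += coeffs[-1]
        am = mat_mul(a, m)
        coeffs.append(-sum(am[i][i] for i in range(n)) / k)
    return coeffs


# Column j holds the coordinates of (sqrt2 + sqrt3) * b_j in the basis
# b = (1, sqrt2, sqrt3, sqrt6); for instance (sqrt2 + sqrt3) * sqrt2 = 2 + sqrt6.
m_alpha = [
    [0, 2, 3, 0],
    [1, 0, 0, 3],
    [1, 0, 0, 2],
    [0, 1, 1, 0],
]

coefficients = char_poly(m_alpha)  # coefficients of a degree 4 polynomial with root sqrt2 + sqrt3

# Floating-point sanity check that sqrt2 + sqrt3 is (numerically) a root.
alpha = 2 ** 0.5 + 3 ** 0.5
residual = sum(float(c) * alpha ** (4 - i) for i, c in enumerate(coefficients))
```

The resulting polynomial is X^4 − 10X^2 + 1, which has degree 4 as the footnote predicts. Readers who wish to attempt the footnote's question unaided should do so before running the code.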

Definition 2.3 The algebraic numbers, denoted Q̄, are defined to be the set of elements of C which are algebraic over Q. By the previous result, Q̄ is a subfield of C.

A very similar line of reasoning can be used to study algebraic integers in a field F containing Q. Recall that an element α of F is called integral (over Z) if there exists a monic⁸ polynomial P ∈ Z[X] such that P(α) = 0. Observe that integrality implies algebraicity: the condition defining integrality is stronger than the condition defining algebraicity.

⁸ Unlike working over Q, the monic condition is crucial.

Proposition 2.4 Suppose that F is a field extension of Q and α ∈ F. The following conditions are equivalent.

1. α is integral over Z.

2. The ring⁹ Z[α] is a finitely generated Z-module.

3. There exists a subring M of F, such that M is finitely generated as a Z-module and M contains α.

⁹ This is the smallest subring of F containing α, i.e., the intersection of all subrings of F containing α.

Proof:

(1) implies (2) Since α is integral over Z, there exists a nontrivial monic polynomial with integer coefficients having α as a root:

α^n + a_{n−1}α^{n−1} + ⋯ + a_1α + a_0 = 0.

It follows that α^n may be expressed in terms of lower powers of α. Inductively, one may show that all powers of α may be expressed in terms of powers of α between α^0 and α^{n−1}. It follows that every element of Z[α] can be expressed as an integer linear combination of the elements α^0, …, α^{n−1}. Hence Z[α] is finitely generated, as a Z-module.

(2) implies (3) Trivial.

(3) implies (1) The same argument as in Proposition 2.1 applies. Namely, multiplication by α is a Z-endomorphism m_α of the finitely generated Z-module M. Note that since M ⊂ F, and F is a field, M is torsion-free and hence free. Thus m_α is represented by a matrix of integers, with respect to any Z-basis of M. As in Proposition 2.1, α is a root of the characteristic polynomial P_α of m_α:

P_α = X^n − Tr(m_α)X^{n−1} + ⋯ ± Det(m_α).

Indeed, m_α is also the matrix representing multiplication by α in the Q-vector space Q · M spanned by M. Since the entries of the matrix representing m_α are integers, it follows that every coefficient of P_α is an integer as well. Thus P_α(α) = 0, and P_α ≠ 0 ∈ Z[X]. Since P_α is monic, we find that α is integral over Z.
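A two-dimensional case shows the argument at work. The sketch below takes the number (1 + √5)/2 and the subring M = Z + Z·(1 + √5)/2 of the real numbers, both chosen here purely for illustration, and reads off the characteristic polynomial of the integer matrix of multiplication from its trace and determinant.

```python
# Multiplication by phi = (1 + sqrt(5))/2 on M = Z + Z*phi, in the Z-basis (1, phi):
# phi * 1 = phi and phi * phi = 1 + phi, so the columns are (0, 1) and (1, 1).
m_phi = [[0, 1], [1, 1]]

trace = m_phi[0][0] + m_phi[1][1]
determinant = m_phi[0][0] * m_phi[1][1] - m_phi[0][1] * m_phi[1][0]

# For a 2 by 2 matrix the characteristic polynomial is X^2 - Tr*X + Det.
char_poly = [1, -trace, determinant]

# Floating-point sanity check that phi is (numerically) a root.
phi = (1 + 5 ** 0.5) / 2
residual = phi ** 2 - trace * phi + determinant
```

The polynomial obtained is X^2 − X − 1, which is monic with integer coefficients even though the number itself involves a denominator of 2.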

The previous proposition allows us to easily prove the integralityof numbers, even when we cannot easily write down monic polyno-mials with integer coefficients, for which they are roots.

Corollary 2.5 Suppose that F is a field extension of Q. Let A denote the setof elements of F which are integral over Z. Then A is a subring of F.

Proof: It suffices to prove that, given integral elements $\alpha, \beta \in F$, the elements $\alpha \pm \beta$ and $\alpha\beta$ are integral. Consider the ring $\mathbb{Z}[\alpha, \beta] \subset F$. By the integrality of $\alpha, \beta$, there exist positive integers $d, e$, such that every element of $\mathbb{Z}[\alpha, \beta]$ can be expressed as a polynomial in $\alpha$ and $\beta$, whose $\alpha$-degree is at most $d$, and whose $\beta$-degree is at most $e$.[10] It follows that $\mathbb{Z}[\alpha, \beta]$ has finite rank as a $\mathbb{Z}$-module. By condition (3) of Proposition 2.4, the elements $\alpha \pm \beta$, $\alpha\beta$ of $\mathbb{Z}[\alpha, \beta]$ are integral over $\mathbb{Z}$.

[10] This is a direct generalization of the argument used to prove (1) implies (2) in Proposition 2.4.
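The criterion can be made concrete with a small computation. The sketch below is an illustration only: the choice $\alpha = \sqrt{2}$, $\beta = \sqrt{3}$, the basis $(1, \sqrt{2}, \sqrt{3}, \sqrt{6})$ of $\mathbb{Z}[\sqrt{2}, \sqrt{3}]$, and the helper names `mat_mul` and `char_poly` are assumptions made here, not taken from the text. It writes down the integer matrix of multiplication by $\alpha + \beta$ and computes its characteristic polynomial (by the Faddeev-LeVerrier recursion), which is a monic integer polynomial having $\alpha + \beta$ as a root.

```python
def mat_mul(A, B):
    """Product of two square integer matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def char_poly(A):
    """Coefficients [1, c_{n-1}, ..., c_0] of det(X*I - A) (Faddeev-LeVerrier)."""
    n = len(A)
    M = [[int(i == j) for j in range(n)] for i in range(n)]  # M_1 = identity
    coeffs = [1]
    for k in range(1, n + 1):
        AM = mat_mul(A, M)
        trace = sum(AM[i][i] for i in range(n))
        c, remainder = divmod(-trace, k)
        assert remainder == 0  # the division is exact for integer matrices
        coeffs.append(c)
        # M_{k+1} = A * M_k + c * I
        M = [[AM[i][j] + (c if i == j else 0) for j in range(n)]
             for i in range(n)]
    return coeffs


# Column j holds the coordinates of (sqrt2 + sqrt3) * (j-th basis element),
# in the basis (1, sqrt2, sqrt3, sqrt6).
M_ALPHA_PLUS_BETA = [
    [0, 2, 3, 0],
    [1, 0, 0, 3],
    [1, 0, 0, 2],
    [0, 1, 1, 0],
]

print(char_poly(M_ALPHA_PLUS_BETA))  # [1, 0, -10, 0, 1]: X^4 - 10 X^2 + 1
```

One can check by hand that $\sqrt{2} + \sqrt{3}$ is a root of $X^4 - 10X^2 + 1$, without having guessed this polynomial in advance.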

Definition 2.6 The algebraic integers, denoted[11] $\overline{\mathbb{Z}}$, are defined to be the set of elements of $\mathbb{C}$ which are integral over $\mathbb{Z}$. By the previous result, $\overline{\mathbb{Z}}$ is a subring of $\mathbb{C}$.

[11] This is not a completely standard notation, but we find it natural.

Definition 2.7 Suppose that $F$ is an algebraic extension of $\mathbb{Q}$. The ring of integers in $F$, denoted $\mathcal{O}_F$, is the set of elements of $F$ which are integral over $\mathbb{Z}$. Again, $\mathcal{O}_F$ is a subring of $F$.

For example, the ring of integers in $\mathbb{Q}$ is precisely $\mathbb{Z}$. The following lemma describes two basic properties of rings of integers.


Lemma 2.8 Suppose that $F$ is an algebraic extension of $\mathbb{Q}$. Then $\mathcal{O}_F \cap \mathbb{Q} = \mathbb{Z}$ and $F/\mathcal{O}_F$ is a torsion[12] $\mathbb{Z}$-module.

[12] In other words, for every $z \in F$, there exists a nonzero $N \in \mathbb{Z}$, such that $Nz \in \mathcal{O}_F$.

Proof: First, suppose that $\alpha \in \mathcal{O}_F \cap \mathbb{Q}$. Thus, $\alpha = u/v$ for two relatively prime integers $u, v$, and moreover $\alpha$ is a root of a monic polynomial with integer coefficients:

\[ \frac{u^n}{v^n} + a_{n-1}\frac{u^{n-1}}{v^{n-1}} + \cdots + a_0 = 0. \]

Multiplying through by $v^n$, one finds that

\[ u^n = -\left( a_{n-1}u^{n-1}v + \cdots + a_0 v^n \right). \]

Any prime number dividing $v$ divides the right side, and hence divides the left side, and hence divides $u$. Since we assume that $u$ and $v$ are relatively prime, no prime divides $v$; thus $v = \pm 1$, and $\alpha = u/v$ is an integer.

Now, in order to see that $F/\mathcal{O}_F$ is a torsion $\mathbb{Z}$-module, consider an element $z \in F$. Since $F$ is an algebraic extension of $\mathbb{Q}$, there exists a nonzero polynomial with rational coefficients having $z$ as a root; by dividing through by the leading coefficient, one may assume this polynomial is monic:

\[ z^n + a_{n-1}z^{n-1} + \cdots + a_0 = 0, \quad \text{for some } a_0, \ldots, a_{n-1} \in \mathbb{Q}. \]

Choose a nonzero integer $N$, such that $a_{n-i}N^i \in \mathbb{Z}$, for $i = 1, \ldots, n$.[13] Then, multiplying the identity above by $N^n$, we find that $Nz$ satisfies the following identity:

\[ (Nz)^n + a_{n-1}N(Nz)^{n-1} + a_{n-2}N^2(Nz)^{n-2} + \cdots + N^n a_0 = 0. \]

Thus $Nz$ is a root of a monic polynomial with integer coefficients. Hence $Nz \in \mathcal{O}_F$. It follows that $F/\mathcal{O}_F$ is a torsion $\mathbb{Z}$-module.

[13] For example, one may take $N$ to be the product of the denominators of the $a_0, \ldots, a_{n-1}$. But such a choice is far larger than necessary, in many practical instances.
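The denominator-clearing step can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function name `clear_denominators` and the sample element $z = (1 + \sqrt{3})/2$, a root of $X^2 - X - 1/2$, are chosen here and do not come from the text. The search finds the least positive $N$ for which every $a_{n-i}N^i$ is an integer, which is often much smaller than the product of the denominators.

```python
from fractions import Fraction


def clear_denominators(coeffs):
    """Given [a_{n-1}, ..., a_0] (rationals) of a monic polynomial with root z,
    return (N, [a_{n-1} N, a_{n-2} N^2, ..., a_0 N^n]) for the least N >= 1
    making all of these integers; the second entry lists the lower
    coefficients of a monic integer polynomial with root N*z."""
    N = 1
    while True:
        # coeffs[0] = a_{n-1} is scaled by N^1, ..., coeffs[-1] = a_0 by N^n.
        scaled = [a * N**i for i, a in enumerate(coeffs, start=1)]
        if all(s.denominator == 1 for s in scaled):
            return N, [int(s) for s in scaled]
        N += 1


# z = (1 + sqrt(3))/2 is a root of X^2 - X - 1/2.
print(clear_denominators([Fraction(-1), Fraction(-1, 2)]))  # (2, [-2, -2])
```

Here $N = 2$, and $2z = 1 + \sqrt{3}$ is a root of $X^2 - 2X - 2$.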

2.2 The lattice of integers

Consider a number field $F$, with ring of integers $\mathcal{O}_F$. By proving that $\mathcal{O}_F$ is a $\mathbb{Z}$-lattice in the finite-dimensional $\mathbb{Q}$-vector space $F$, we can apply the results of the previous chapter to study the integers in a number field. We begin with a few definitions, and a lemma.

Definition 2.9 Suppose that $F$ is a number field, and $\alpha \in F$. The trace of $\alpha$, denoted $\operatorname{Tr}(\alpha)$, or $\operatorname{Tr}_{F/\mathbb{Q}}(\alpha)$ if more clarity[14] is required, is the trace of the “multiplication by $\alpha$” map $m_\alpha : F \to F$, viewed as a $\mathbb{Q}$-linear endomorphism of the finite-dimensional $\mathbb{Q}$-vector space $F$.

The norm of $\alpha$, denoted $\operatorname{N}(\alpha)$ or $\operatorname{N}_{F/\mathbb{Q}}(\alpha)$, is the determinant of $m_\alpha$.

[14] And often, more clarity is required. It is crucial to remember that the “trace of $\alpha$” depends on the choice of number field containing $\alpha$. For example $\operatorname{Tr}_{\mathbb{Q}/\mathbb{Q}}(2) = 2$, but $\operatorname{Tr}_{\mathbb{Q}(i)/\mathbb{Q}}(2) = 4$.

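Proposition 2.10 Suppose that $F$ is a number field. Then,

1. The function $\operatorname{Tr}_{F/\mathbb{Q}} : F \to \mathbb{Q}$ is a $\mathbb{Q}$-linear map.

2. The function $\operatorname{N}_{F/\mathbb{Q}} : F^\times \to \mathbb{Q}^\times$ is a group homomorphism.

3. The trace yields a symmetric $\mathbb{Q}$-bilinear form $\langle \cdot, \cdot \rangle_{F/\mathbb{Q}}$:

\[ \langle \alpha, \beta \rangle = \operatorname{Tr}(\alpha\beta). \]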

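Proof: If $\alpha \in F$, recall that both the trace and norm are determined by the multiplication by $\alpha$ map $m_\alpha \in \operatorname{End}_{\mathbb{Q}}(F)$. The distributive law implies that $m_{\alpha+\beta} = m_\alpha + m_\beta$. The associative law for multiplication implies that $m_{\alpha\beta} = m_\alpha m_\beta$. From this, and the basic properties of $\operatorname{Tr}$ and $\operatorname{Det}$ from linear algebra, we find that

(1) The function $\operatorname{Tr}$ is a $\mathbb{Q}$-linear map, using the following two identities.

\[ \operatorname{Tr}(\alpha + \beta) = \operatorname{Tr}(m_{\alpha+\beta}) = \operatorname{Tr}(m_\alpha + m_\beta) = \operatorname{Tr}(\alpha) + \operatorname{Tr}(\beta), \quad \text{for all } \alpha, \beta \in F. \]

\[ \operatorname{Tr}(q\alpha) = \operatorname{Tr}(m_{q\alpha}) = \operatorname{Tr}(m_q m_\alpha) = \operatorname{Tr}(q \cdot m_\alpha) = q \operatorname{Tr}(\alpha), \quad \text{for all } q \in \mathbb{Q}, \alpha \in F. \]

(2) The norm is a group homomorphism from $F^\times$ to $\mathbb{Q}^\times$, because of the following identity:

\[ \operatorname{N}(\alpha\beta) = \operatorname{Det}(m_{\alpha\beta}) = \operatorname{Det}(m_\alpha m_\beta) = \operatorname{Det}(m_\alpha)\operatorname{Det}(m_\beta) = \operatorname{N}(\alpha)\operatorname{N}(\beta), \quad \text{for all } \alpha, \beta \in F. \]

(3) The trace yields a symmetric[15] $\mathbb{Q}$-bilinear form, by using the following identity, valid for all $q \in \mathbb{Q}$, and all $\alpha, \beta, \gamma \in F$:

\[ \langle q\alpha + \beta, \gamma \rangle = \operatorname{Tr}(m_{q\alpha\gamma + \beta\gamma}) = \operatorname{Tr}(q \cdot m_\alpha m_\gamma + m_\beta m_\gamma) = q \operatorname{Tr}(m_\alpha m_\gamma) + \operatorname{Tr}(m_\beta m_\gamma) = q\langle \alpha, \gamma \rangle + \langle \beta, \gamma \rangle. \]

[15] Symmetry is obvious, by the commutativity of multiplication. Symmetry also implies that the single identity below suffices to prove bilinearity.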
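For a quadratic field the definitions can be tested directly. The following sketch is an illustration, not part of the text: it assumes the basis $(1, \sqrt{D})$ of $\mathbb{Q}(\sqrt{D})$, represents $a + b\sqrt{D}$ by the pair `(a, b)`, and uses helper names (`mult_matrix`, `trace`, `norm`, `multiply`) invented here.

```python
from fractions import Fraction


def mult_matrix(a, b, D):
    """Matrix of multiplication by a + b*sqrt(D) on Q(sqrt(D)) in the basis
    (1, sqrt(D)); its columns are the coordinates of alpha*1 and alpha*sqrt(D)."""
    a, b = Fraction(a), Fraction(b)
    return [[a, b * D], [b, a]]


def trace(a, b, D):
    m = mult_matrix(a, b, D)
    return m[0][0] + m[1][1]


def norm(a, b, D):
    m = mult_matrix(a, b, D)
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def multiply(x, y, D):
    """Product of x = (a, b) and y = (c, d), i.e. (a + b sqrt(D))(c + d sqrt(D))."""
    (a, b), (c, d) = x, y
    return (a * c + b * d * D, a * d + b * c)


# The trace of 2 computed in Q(i) is 4, as in the margin note on Definition 2.9.
print(trace(2, 0, -1))  # 4

# Multiplicativity of the norm in Q(i): N(1 + 2i) * N(3 - i) = N((1 + 2i)(3 - i)).
x, y = (1, 2), (3, -1)
print(norm(*x, -1) * norm(*y, -1), norm(*multiply(x, y, -1), -1))  # 50 50
```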

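Lemma 2.11 Suppose that $F$ is a number field, and $\alpha \in \mathcal{O}_F$ is an integer in $F$. Then $\operatorname{Tr}_{F/\mathbb{Q}}(\alpha) \in \mathbb{Z}$ and $\operatorname{N}_{F/\mathbb{Q}}(\alpha) \in \mathbb{Z}$.

Proof: Consider the field extensions $F \supset \mathbb{Q}(\alpha) \supset \mathbb{Q}$. Let $P_\alpha$ denote the minimal polynomial of $\alpha$ over $\mathbb{Q}$; $P_\alpha$ is a monic polynomial with integer coefficients:

\[ P_\alpha(X) = X^d + a_{d-1}X^{d-1} + \cdots + a_0. \]

As we have seen before, $\mathbb{Q}(\alpha)$ is naturally isomorphic to $\mathbb{Q}[X]/\langle P_\alpha \rangle$. Here $d = \operatorname{Deg}(P_\alpha)$, so that a basis of $\mathbb{Q}(\alpha)$ as a $\mathbb{Q}$-vector space is given by $1, \alpha, \ldots, \alpha^{d-1}$.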


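Let $e = [F : \mathbb{Q}(\alpha)]$, so that $n = [F : \mathbb{Q}] = de$. Let $\beta_1, \ldots, \beta_e$ be a basis for $F$ as a $\mathbb{Q}(\alpha)$-vector space. A basis[16] for $F$ as a $\mathbb{Q}$-vector space is given by

\[ \{ \alpha^i \beta_j : 1 \leq j \leq e, \ 0 \leq i \leq d-1 \}. \]

[16] In what follows, the basis is ordered “lexicographically”.

The “multiplication by $\alpha$” endomorphism of $\mathbb{Q}(\alpha)$ is represented[17] by the matrix:

\[ T = \begin{pmatrix} 0 & 0 & \cdots & 0 & -a_0 \\ 1 & 0 & \cdots & 0 & -a_1 \\ 0 & 1 & \cdots & 0 & -a_2 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & -a_{d-1} \end{pmatrix}. \]

[17] Represented, with respect to the basis $1, \alpha, \alpha^2, \ldots, \alpha^{d-1}$, that is.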

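Since $F$ can be identified with $\mathbb{Q}(\alpha)^e$, as a $\mathbb{Q}(\alpha)$-module, we find that the “multiplication by $\alpha$” endomorphism of $F$ is represented by the block-diagonal matrix:

\[ m_\alpha = \begin{pmatrix} T & 0 & \cdots & 0 \\ 0 & T & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & T \end{pmatrix}. \]

We find the following formula for the trace of $\alpha$:

\[ \operatorname{Tr}_{F/\mathbb{Q}}(\alpha) = -e\,a_{d-1} = e \operatorname{Tr}_{\mathbb{Q}(\alpha)/\mathbb{Q}}(\alpha). \]

In particular, since $a_{d-1} \in \mathbb{Z}$, we find that $\operatorname{Tr}_{F/\mathbb{Q}}(\alpha)$ is an integer. Similarly, we find a formula for the norm of $\alpha$:

\[ \operatorname{N}_{F/\mathbb{Q}}(\alpha) = \operatorname{Det}(T)^e = (-1)^{de} a_0^e = \operatorname{N}_{\mathbb{Q}(\alpha)/\mathbb{Q}}(\alpha)^e. \]

Since $a_0 \in \mathbb{Z}$, we find that $\operatorname{N}_{F/\mathbb{Q}}(\alpha) \in \mathbb{Z}$ as well.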
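As a quick check of the two formulas in this proof, here is a worked instance. The field $F = \mathbb{Q}(\sqrt{2}, \sqrt{3})$ and the element $\alpha = \sqrt{2}$ are choices made for illustration; they do not appear in the text above.

```latex
% Assumed example: F = Q(sqrt 2, sqrt 3), alpha = sqrt 2, so d = 2, e = 2, n = de = 4.
\begin{align*}
P_\alpha(X) &= X^2 - 2, \qquad a_1 = 0, \quad a_0 = -2, \qquad
T = \begin{pmatrix} 0 & 2 \\ 1 & 0 \end{pmatrix}, \\
\operatorname{Tr}_{F/\mathbb{Q}}(\alpha) &= -e\,a_{d-1} = -2 \cdot 0 = 0, \\
\operatorname{N}_{F/\mathbb{Q}}(\alpha) &= (-1)^{de} a_0^{e} = (-1)^{4}(-2)^{2} = 4.
\end{align*}
```

Both values agree with a direct computation on the block-diagonal matrix with two copies of $T$: its trace is $0 + 0$, and its determinant is $\operatorname{Det}(T)^2 = (-2)^2 = 4$.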

Now we demonstrate a fundamental result:

Theorem 2.12 Let $F$ be a number field. Then $\mathcal{O}_F$ is a $\mathbb{Z}$-lattice in the $\mathbb{Q}$-vector space $F$. Endowing $F$ with the trace bilinear form, the lattice $\mathcal{O}_F$ is an integral lattice.

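Proof: We prove this theorem by a squeeze argument; we find a lattice $L \subset F$, such that $L \subset \mathcal{O}_F \subset L^\sharp$. If we can find such a lattice $L$, then $\mathcal{O}_F$ has finite index[18] in the lattice $L^\sharp$, which implies that $\mathcal{O}_F$ is a lattice itself.

[18] Index bounded above by $|\operatorname{Disc}(L)|$, in fact.

Now, to construct the lattice $L$, begin by choosing an ordered basis $v_1, \ldots, v_n$ of $F$ as a $\mathbb{Q}$-vector space. By Lemma 2.8, $F/\mathcal{O}_F$ is a torsion $\mathbb{Z}$-module; it follows that there exist nonzero integers $N_1, \ldots, N_n$ such that $N_1v_1, \ldots, N_nv_n \in \mathcal{O}_F$. Let $L = N_1v_1\mathbb{Z} + \cdots + N_nv_n\mathbb{Z}$.

We find that $L$ is a lattice, since $L$ is a finitely generated $\mathbb{Z}$-module, and

\[ \operatorname{Span}_{\mathbb{Q}}(L) = \operatorname{Span}_{\mathbb{Q}}(v_1, \ldots, v_n) = F. \]

Hence $L$ is a lattice and $L \subset \mathcal{O}_F$. It remains to show that $\mathcal{O}_F \subset L^\sharp$, where $L^\sharp$ is the dual lattice with respect to the trace bilinear form. So suppose that $\alpha \in \mathcal{O}_F$. Then, for every $\beta \in L \subset \mathcal{O}_F$, we find that $\alpha\beta \in \mathcal{O}_F$ (by Corollary 2.5, $\mathcal{O}_F$ is a subring of $F$). By Lemma 2.11,

\[ \langle \alpha, \beta \rangle = \operatorname{Tr}_{F/\mathbb{Q}}(\alpha\beta) \in \mathbb{Z}. \]

Hence $\alpha \in L^\sharp$.

We have proven that $L \subset \mathcal{O}_F \subset L^\sharp$. Hence $\mathcal{O}_F$ is a lattice. Integrality follows from Lemma 2.11 again; for any $\alpha, \beta \in \mathcal{O}_F$,

\[ \langle \alpha, \beta \rangle = \operatorname{Tr}_{F/\mathbb{Q}}(\alpha\beta) \in \mathbb{Z}. \]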

This result allows us to study a number field $F$, by applying the machinery of the first chapter to the lattice $\mathcal{O}_F$, in the finite-dimensional $\mathbb{Q}$-vector space $F$, which is endowed with the trace bilinear form. In particular, the discriminant gives an invariant of $F$:

Definition 2.13 Let $F$ be a number field. The discriminant of $F$, often denoted[19] $\operatorname{Disc}(F)$, is defined to be the discriminant of the lattice $\mathcal{O}_F$ in the $\mathbb{Q}$-vector space $F$, with respect to the trace bilinear form. Since the lattice $\mathcal{O}_F$ is integral, $\operatorname{Disc}(F) \in \mathbb{Z}$.

[19] This is slightly abusive notation. It would be more consistent with the previous chapter to write $\operatorname{Disc}(\mathcal{O}_F)$. But we follow common conventions here, and think about the discriminant as an invariant of a number field.

We will later see that $\operatorname{Disc}(F) = \pm 1$, if and only if $F = \mathbb{Q}$ (in which case, $\operatorname{Disc}(F) = 1$).

Definition 2.14 Let $F$ be a number field. The inverse different[20] of $F$, usually denoted $\mathfrak{d}_F^{-1}$, is the lattice $\mathcal{O}_F^\sharp$ dual to $\mathcal{O}_F$, with respect to the trace bilinear form.

[20] The inverse different arises more naturally than the different itself. We will define the different $\mathfrak{d}_F$ later, as the inverse of the inverse different, once we have the theory of fractional ideals.

2.3 Applications of field theory

Suppose that $F$ is a number field. Since $F$ is a finite extension of $\mathbb{Q}$, we may apply some results of Galois theory, even though $F$ is not assumed to be a Galois extension of $\mathbb{Q}$. When $F$ is a Galois extension of $\mathbb{Q}$, these results simplify further.

Define $\operatorname{Emb}(F, \mathbb{C})$ to be the set of embeddings of $F$ as a subfield of $\mathbb{C}$, i.e., the set of field homomorphisms from $F$ to $\mathbb{C}$. Since $F$ is algebraic, the image of any such embedding is contained in $\overline{\mathbb{Q}}$, the set of algebraic numbers. Furthermore, if $\eta \in \operatorname{Emb}(F, \mathbb{C})$, then $\eta(\mathcal{O}_F) \subset \overline{\mathbb{Z}}$; the condition of integrality is preserved under such embeddings.

Recall the primitive element theorem[21] of field theory; we may (and do) choose an element $\varphi \in F$, such that $F = \mathbb{Q}(\varphi)$. Without loss of generality, we may assume that $\varphi \in \mathcal{O}_F$, by multiplying $\varphi$ by a suitable nonzero integer. Let $P_\varphi$ denote the minimal polynomial of $\varphi$. Thus $P_\varphi$ is a monic irreducible polynomial, with integer coefficients, of degree $n = [F : \mathbb{Q}]$. The following is a basic result of field theory:

[21] If $F/K$ is a finite separable field extension, then there exists an element $\alpha \in F$ such that $F = K(\alpha)$.

Proposition 2.15 The number of embeddings $\#\operatorname{Emb}(F, \mathbb{C})$ is equal to the degree $[F : \mathbb{Q}]$ of the field extension $F/\mathbb{Q}$.

Proof: Let $r_1, \ldots, r_n$ denote the roots of $P_\varphi$ in $\mathbb{C}$, which are distinct by separability and irreducibility. Note that there are exactly $n = [F : \mathbb{Q}] = \operatorname{Deg}(P_\varphi)$ such roots, since $\mathbb{C}$ is algebraically closed, so that $P_\varphi$ splits into linear factors over $\mathbb{C}$. An embedding of $F$ in $\mathbb{C}$ is uniquely determined by where it sends $\varphi$. Furthermore, an embedding of $F$ in $\mathbb{C}$ can be given by sending $\varphi$ to any one of the roots $r_1, \ldots, r_n$. Hence there are $n$ such embeddings.

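Now, we may interpret the trace, norm, and trace pairing, using embeddings of $F$ into $\mathbb{C}$.

Proposition 2.16 Suppose that $\alpha, \beta \in F$. Then we have

1. $\operatorname{Tr}_{F/\mathbb{Q}}(\alpha) = \sum_{\sigma \in \operatorname{Emb}(F, \mathbb{C})} \sigma(\alpha)$.

2. $\operatorname{N}_{F/\mathbb{Q}}(\alpha) = \prod_{\sigma \in \operatorname{Emb}(F, \mathbb{C})} \sigma(\alpha)$.

3. $\langle \alpha, \beta \rangle = \sum_{\sigma \in \operatorname{Emb}(F, \mathbb{C})} \sigma(\alpha)\sigma(\beta)$.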

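Proof: Let $P_\alpha$ denote the (monic) minimal polynomial of $\alpha$, of degree $d$, and let $e = [F : \mathbb{Q}(\alpha)]$. Let $s_1, \ldots, s_d$ denote the roots of $P_\alpha$ in $\mathbb{C}$. From the proof of Lemma 2.11, we find that:

\[ \operatorname{Tr}_{F/\mathbb{Q}}(\alpha) = e \operatorname{Tr}_{\mathbb{Q}(\alpha)/\mathbb{Q}}(\alpha), \quad \text{and} \quad \operatorname{N}_{F/\mathbb{Q}}(\alpha) = \operatorname{N}_{\mathbb{Q}(\alpha)/\mathbb{Q}}(\alpha)^e. \]

Moreover, these quantities are directly related to the polynomial $P_\alpha$:

\[ \operatorname{Tr}_{\mathbb{Q}(\alpha)/\mathbb{Q}}(\alpha) = \sum_{i=1}^{d} s_i, \quad \text{and} \quad \operatorname{N}_{\mathbb{Q}(\alpha)/\mathbb{Q}}(\alpha) = \prod_{i=1}^{d} s_i. \]

Putting this together, we find that

\[ \operatorname{Tr}_{F/\mathbb{Q}}(\alpha) = e \sum_{i=1}^{d} s_i, \quad \text{and} \quad \operatorname{N}_{F/\mathbb{Q}}(\alpha) = \left( \prod_{i=1}^{d} s_i \right)^{e}. \]

On the other hand, the elements $s_1, \ldots, s_d$ are precisely the images $\sigma(\alpha)$ as $\sigma$ ranges over embeddings of $\mathbb{Q}(\alpha)$ into $\mathbb{C}$. As one varies $\sigma$ over embeddings of $F$ into $\mathbb{C}$, each embedding of $\mathbb{Q}(\alpha)$ into $\mathbb{C}$ occurs $e = [F : \mathbb{Q}(\alpha)]$ times. It follows that

\[ \operatorname{Tr}_{F/\mathbb{Q}}(\alpha) = e \sum_{i=1}^{d} s_i = \sum_{\sigma \in \operatorname{Emb}(F, \mathbb{C})} \sigma(\alpha). \]

\[ \operatorname{N}_{F/\mathbb{Q}}(\alpha) = \left( \prod_{i=1}^{d} s_i \right)^{e} = \prod_{\sigma \in \operatorname{Emb}(F, \mathbb{C})} \sigma(\alpha). \]

The identity for the trace pairing follows directly from the identity for the trace above, applied to the product $\alpha\beta$, since each embedding $\sigma$ is multiplicative.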

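Using this proposition, we find a new way of computing the discriminant.

Corollary 2.17 Let $F$ be a number field, and let $(\alpha_1, \ldots, \alpha_n)$ be an ordered $\mathbb{Z}$-basis for the lattice $\mathcal{O}_F$ (or any other lattice $L$ in the $\mathbb{Q}$-vector space $F$). Let $(\sigma_1, \ldots, \sigma_n)$ be (an ordering on) the set of embeddings of $F$ into $\mathbb{C}$. Let $B = (B_{ij})$ be the matrix with (complex) entries $B_{ij} = \sigma_j(\alpha_i)$. Then,

\[ \operatorname{Disc}(F) = \operatorname{Disc}(\mathcal{O}_F) = \operatorname{Det}(B)^2. \]

Proof: By the definition of discriminant, we are led to consider the matrix $A = (A_{ij}) = (\langle \alpha_i, \alpha_j \rangle)$, which satisfies

\[ \operatorname{Disc}(F) = \operatorname{Disc}(\mathcal{O}_F) = \operatorname{Det}(A). \]

Observe that for all $i, j$, we have:

\[ A_{ij} = \langle \alpha_i, \alpha_j \rangle = \operatorname{Tr}(\alpha_i\alpha_j) = \sum_{k=1}^{n} \sigma_k(\alpha_i)\sigma_k(\alpha_j). \]

Hence $A_{ij} = \sum_k B_{ik}B_{jk}$. In terms of matrix multiplication, $A = BB^t$. Therefore we find that

\[ \operatorname{Disc}(F) = \operatorname{Det}(A) = \operatorname{Det}(BB^t) = \operatorname{Det}(B)^2. \]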
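The formula $\operatorname{Det}(B)^2$ is easy to try numerically in rank 2. The sketch below is an illustration under stated assumptions: the helper `disc_from_embeddings` is a name invented here, the three lattices are sample choices, and the real embeddings are evaluated in floating point, so the results are only approximately integers.

```python
import math


def disc_from_embeddings(rows):
    """rows[i] = (sigma_1(alpha_i), sigma_2(alpha_i)) for a basis (alpha_1, alpha_2)
    of a lattice in a quadratic field; returns Det(B)^2."""
    (a, b), (c, d) = rows
    det = a * d - b * c
    return det * det


s5 = math.sqrt(5)
gamma, gamma_conj = (1 + s5) / 2, (1 - s5) / 2

# Basis (1, (1 + sqrt 5)/2) in Q(sqrt 5).
disc_gamma = disc_from_embeddings([(1, 1), (gamma, gamma_conj)])
# Basis (1, sqrt 5) of the sublattice Z[sqrt 5].
disc_sqrt5 = disc_from_embeddings([(1, 1), (s5, -s5)])
# Basis (1, i) in Q(i); the two embeddings send i to i and to -i.
disc_gauss = disc_from_embeddings([(1, 1), (1j, -1j)])

print(round(disc_gamma), round(disc_sqrt5), disc_gauss)  # 5 20 (-4+0j)
```

The values 5 and 20 differ by the factor $2^2$, the square of the index of $\mathbb{Z}[\sqrt{5}]$ in the larger lattice.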

Of course, in practice, number fields often arise as subfields of $\mathbb{C}$ already. If in addition, $F$ is a subfield of $\mathbb{C}$, which is a finite Galois extension of $\mathbb{Q}$, then one may work with automorphisms of $F/\mathbb{Q}$, rather than embeddings of $F$ into $\mathbb{C}$; in this case, it is safe to replace $\operatorname{Emb}(F, \mathbb{C})$ by $\operatorname{Gal}(F/\mathbb{Q})$.

An important result, obtained from this new formulation of the discriminant, is the following

Theorem 2.18 Suppose that $F$ is a number field. Then $\operatorname{Disc}(F)$ is congruent to 0 or 1 mod 4.


Proof: With the same notation as in Corollary 2.17, let $B_{ij} = \sigma_j(\alpha_i)$. Since every element $\alpha_i \in \mathcal{O}_F$ is integral over $\mathbb{Z}$, and field embeddings preserve integrality, we find that $B_{ij} \in \overline{\mathbb{Z}} \subset \mathbb{C}$. The determinant of $B$ can be expressed as a familiar sum of products, which we further partition based on signs of permutations[22]. Define, for every $\pi \in S_n$, the following contributor to the determinant:

\[ T_\pi = \prod_{j=1}^{n} \sigma_j(\alpha_{\pi(j)}). \]

[22] We write $S_n$ for the symmetric group acting upon $\{1, \ldots, n\}$, and $A_n$ for the alternating group.

Aggregating these terms, define $T_{even} = \sum_{\pi \in A_n} T_\pi$ and $T_{odd} = \sum_{\pi \in S_n - A_n} T_\pi$. Then, we can express the discriminant of $F$ as follows:

\[ \operatorname{Disc}(F) = \operatorname{Det}(B)^2 = \left( \sum_{\pi \in S_n} \operatorname{sgn}(\pi) T_\pi \right)^2 = (T_{even} - T_{odd})^2. \]

As each entry $B_{ij}$ is an algebraic integer, we find that $T_{even}, T_{odd} \in \overline{\mathbb{Z}}$ as well.

Since for all $i, j$, $B_{ij}$ is algebraic over $\mathbb{Q}$, there exists a finite Galois extension $K/\mathbb{Q}$, such that $K \subset \mathbb{C}$ and $B_{ij} \in K$ for all $i, j$. It follows that $T_{even}, T_{odd} \in K$ as well. There exists an action of $\operatorname{Gal}(K/\mathbb{Q})$ on $\operatorname{Emb}(F, \mathbb{C})$, given by composition (since the image of every embedding of $F$ into $\mathbb{C}$ lands inside of $K$). It follows that for all $g \in \operatorname{Gal}(K/\mathbb{Q})$, there exists a permutation $\gamma_g \in S_n$ such that for all $i, j$,

\[ g(B_{ij}) = g(\sigma_j(\alpha_i)) = \sigma_{\gamma_g(j)}(\alpha_i) = B_{i\gamma_g(j)}. \]

Hence, for every $g \in \operatorname{Gal}(K/\mathbb{Q})$, and every $\pi \in S_n$, we find that:

\[ g(\sigma_j(\alpha_{\pi(j)})) = \sigma_{\gamma_g(j)}(\alpha_{\pi\gamma_g^{-1}\gamma_g(j)}). \]

Therefore, we find that $g$ permutes the terms $T_\pi$:

\[ g(T_\pi) = T_{\pi\gamma_g^{-1}}. \]

Depending on whether $\gamma_g$ is even or odd, one of the following two facts holds (respectively):

1. $gT_{even} = T_{even}$ and $gT_{odd} = T_{odd}$, or

2. $gT_{even} = T_{odd}$ and $gT_{odd} = T_{even}$.

In both cases, $g(T_{even} + T_{odd}) = (T_{even} + T_{odd})$ and $g(T_{even}T_{odd}) = T_{even}T_{odd}$. We have proven that

\[ (T_{even} + T_{odd}), \ T_{even}T_{odd} \in K^{\operatorname{Gal}(K/\mathbb{Q})} = \mathbb{Q}. \]

Since $T_{even}, T_{odd} \in \overline{\mathbb{Z}}$, it follows from Lemma 2.8 that

\[ (T_{even} + T_{odd}), \ T_{even}T_{odd} \in \overline{\mathbb{Z}} \cap \mathbb{Q} = \mathbb{Z}. \]

Finally, we find that

\[ \operatorname{Disc}(F) = \operatorname{Det}(B)^2 = (T_{even} - T_{odd})^2 = (T_{even} + T_{odd})^2 - 4T_{even}T_{odd}. \]

Since all squares of integers are congruent to 0 or 1, modulo 4, the theorem follows.
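The proof can be traced by hand in the smallest case. The sketch below takes $n = 2$ and the basis $(\alpha_1, \alpha_2) = (1, \sqrt{D})$, with embeddings $\sigma_1(\sqrt{D}) = \sqrt{D}$ and $\sigma_2(\sqrt{D}) = -\sqrt{D}$. This choice is made here for illustration; the lattice it spans need not be the full ring of integers, but the argument only uses that the $\alpha_i$ are algebraic integers.

```latex
% Assumed example: n = 2, (alpha_1, alpha_2) = (1, sqrt D), sigma_2 = conjugation.
\begin{align*}
T_{even} &= T_{\mathrm{id}} = \sigma_1(\alpha_1)\,\sigma_2(\alpha_2) = 1 \cdot (-\sqrt{D}) = -\sqrt{D}, \\
T_{odd} &= T_{(1\,2)} = \sigma_1(\alpha_2)\,\sigma_2(\alpha_1) = \sqrt{D} \cdot 1 = \sqrt{D}, \\
T_{even} + T_{odd} &= 0 \in \mathbb{Z}, \qquad T_{even}T_{odd} = -D \in \mathbb{Z}, \\
\operatorname{Det}(B)^2 &= (T_{even} + T_{odd})^2 - 4\,T_{even}T_{odd} = 0 + 4D = 4D \equiv 0 \pmod{4}.
\end{align*}
```

Neither $T_{even}$ nor $T_{odd}$ is rational here, but their sum and product are integers, which is all the proof requires.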

2.4 Quadratic Fields

A quadratic field is a field $F$, which contains $\mathbb{Q}$, and which satisfies $[F : \mathbb{Q}] = 2$. Basic algebra can be used to see the following:

Proposition 2.19 If $F$ is a quadratic field, then there exists an element $\alpha \in F$, such that $F = \mathbb{Q}(\alpha)$ and $\alpha^2$ is a square-free integer.

By reason of this proposition[23], we write $\mathbb{Q}(\sqrt{D})$ for the field $\mathbb{Q}[X]/\langle X^2 - D \rangle$, and in this circumstance always assume that $D$ is a square-free integer. Every quadratic field is isomorphic to such a field. We write $\sqrt{D}$ for the canonical image of $X$ in $\mathbb{Q}(\sqrt{D})$, the latter viewed as a quotient of the polynomial ring $\mathbb{Q}[X]$.

[23] Perhaps this paragraph seems excessive. The point is that we would rather not fix an embedding of $\mathbb{Q}(\sqrt{D})$ in $\mathbb{C}$, i.e., we would rather not choose a complex square root of $D$.

In this way, the field $\mathbb{Q}(\sqrt{D})$ has a natural basis as a $\mathbb{Q}$-vector space, given by the elements $1$ and $\sqrt{D}$. Note that $1$ and $\sqrt{D}$ are both integral over $\mathbb{Z}$. Define $\mathcal{O}_D$ to be the ring of integers of $\mathbb{Q}(\sqrt{D})$; then we find an inclusion of lattices:

\[ \mathbb{Z}[\sqrt{D}] = \mathbb{Z} \oplus \mathbb{Z}\sqrt{D} \subset \mathcal{O}_D. \]

The squeeze method, which allowed us to prove that $\mathcal{O}_F$ is a lattice in Theorem 2.12, is effective, allowing us to determine $\mathcal{O}_D$ explicitly.

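Proposition 2.20 Let $\Delta = \operatorname{Disc}(\mathbb{Q}(\sqrt{D}))$. Then, a $\mathbb{Z}$-basis of the lattice $\mathcal{O}_D$ is given by the pair of elements $1$ and $(\Delta + \sqrt{\Delta})/2$. There are two cases.

1. If $D$ is congruent to 1, mod 4, then $\Delta = D$ and $\mathcal{O}_D = \mathbb{Z} + \mathbb{Z}(1 + \sqrt{D})/2$.

2. If $D$ is congruent[24] to 2 or 3, mod 4, then $\Delta = 4D$ and $\mathcal{O}_D = \mathbb{Z} + \mathbb{Z}\sqrt{D} = \mathbb{Z}[\sqrt{D}]$.

[24] Notice, in this case, the discriminant $\Delta$ is congruent to zero, mod 4.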

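Proof: We first consider the chain of lattices

\[ \mathbb{Z}[\sqrt{D}] \subset \mathcal{O}_D \subset \mathbb{Z}[\sqrt{D}]^\sharp. \]

The discriminant of $\mathbb{Z}[\sqrt{D}]$ can be computed directly, using the $\mathbb{Z}$-basis $(1, \sqrt{D})$. Indeed, $\operatorname{Tr}(1) = 2$, $\operatorname{Tr}(\sqrt{D}) = 0$, and $\operatorname{Tr}(\sqrt{D} \cdot \sqrt{D}) = \operatorname{Tr}(D) = 2D$, and we find that

\[ \operatorname{Disc}(\mathbb{Z}[\sqrt{D}]) = \operatorname{Det} \begin{pmatrix} 2 & 0 \\ 0 & 2D \end{pmatrix} = 4D. \]

Let $N = [\mathcal{O}_D : \mathbb{Z}[\sqrt{D}]]$. Then we find that

\[ N^2\Delta = N^2 \operatorname{Disc}(\mathcal{O}_D) = \operatorname{Disc}(\mathbb{Z}[\sqrt{D}]) = 4D. \]

Since $D$ is square-free, we find that $N^2 = 1$ or $N^2 = 4$. Hence we have proven that

\[ \Delta = 4D \quad \text{or} \quad \Delta = D. \]

If $D$ is congruent to 2 or 3, modulo 4, then $\Delta \neq D$ since $\Delta$ is congruent to 0 or 1, modulo 4 by Theorem 2.18. On the other hand, if $D$ is congruent to 1, modulo 4, then we find that the element $z = (1 + \sqrt{D})/2$ satisfies the monic polynomial identity

\[ z^2 - z - \frac{D - 1}{4} = 0. \]

Thus $z \in \mathcal{O}_D$. Since $z \notin \mathbb{Z}[\sqrt{D}]$, we find that $N \neq 1$, so $N = 2$ and $\Delta = D$.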

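We have now reduced our study to two cases.

$N^2 = 1$: In this case, $\mathcal{O}_D = \mathbb{Z}[\sqrt{D}]$, $\Delta = 4D$, and $D$ is congruent to 2 or 3, modulo 4. Now, it follows from the fact that $\Delta = 4D$ (so that $\sqrt{\Delta} = 2\sqrt{D}$) that

\[ \frac{\Delta + \sqrt{\Delta}}{2} = \frac{4D + 2\sqrt{D}}{2} = 2D + \sqrt{D}. \]

Hence we find that

\[ \mathbb{Z}[\sqrt{D}] = \mathbb{Z} + \mathbb{Z}\sqrt{D} = \mathbb{Z} + \mathbb{Z}\,\frac{\Delta + \sqrt{\Delta}}{2}. \]

The conclusion of the proposition is verified in this case.

$N^2 = 4$: In this case $\mathcal{O}_D$ contains $\mathbb{Z}[\sqrt{D}]$ as an index two sublattice, $\Delta = D$, and $D$ is congruent to 1, modulo 4. Since we have seen that $\mathcal{O}_D$ contains the element $z = (1 + \sqrt{D})/2 \notin \mathbb{Z}[\sqrt{D}]$, and $\mathcal{O}_D$ contains $\mathbb{Z}[\sqrt{D}]$ with index two, it follows that

\[ \mathcal{O}_D = \mathbb{Z} + \mathbb{Z}\,\frac{1 + \sqrt{D}}{2} = \mathbb{Z} + \mathbb{Z}\,\frac{D + \sqrt{D}}{2}. \]

The conclusion of the proposition is verified in this case.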
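The case distinction in Proposition 2.20 translates directly into a short routine. This is a sketch under stated assumptions: the function names are invented here, and the trial-division test for square-freeness is only meant for small inputs. The second function returns the trace and norm of the generator $\omega = (\Delta + \sqrt{\Delta})/2$, so that $\omega$ is a root of the monic integer polynomial $X^2 - \Delta X + (\Delta^2 - \Delta)/4$.

```python
def is_squarefree(d):
    """True if no square of a prime divides the nonzero integer d."""
    d = abs(d)
    if d == 0:
        return False
    p = 2
    while p * p <= d:
        if d % (p * p) == 0:
            return False
        p += 1
    return True


def quadratic_discriminant(D):
    """Discriminant of Q(sqrt(D)) for a square-free integer D other than 0, 1."""
    if D in (0, 1) or not is_squarefree(D):
        raise ValueError("D must be a square-free integer other than 0 and 1")
    # Python's % returns a value in {0, 1, 2, 3} even for negative D.
    return D if D % 4 == 1 else 4 * D


def generator_trace_norm(D):
    """(Delta, Tr(omega), N(omega)) for omega = (Delta + sqrt(Delta))/2."""
    delta = quadratic_discriminant(D)
    return delta, delta, (delta * delta - delta) // 4


for D in (-3, -1, 2, 3, 5):
    print(D, generator_trace_norm(D))
```

For $D = -1$ this gives $\Delta = -4$ and $\omega = -2 + i$, of trace $-4$ and norm $5$; every discriminant produced is congruent to 0 or 1 modulo 4, as Theorem 2.18 requires.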

By examining the list of discriminants of quadratic fields, we find

Corollary 2.21 Suppose that $\Delta$ is an integer, and $\Delta$ is congruent to 0 or 1, modulo 4. Suppose, moreover, that $\Delta$ is square-free, except perhaps for a factor of 4. In other words, suppose that if $p$ is an odd prime number dividing $\Delta$, then $p^2$ does not divide $\Delta$, and that 16 does not divide $\Delta$. Then, there exists a unique quadratic field of discriminant $\Delta$. The ring of integers in this quadratic field is

\[ \mathcal{O} = \mathbb{Z} + \mathbb{Z} \left( \frac{\Delta + \sqrt{\Delta}}{2} \right). \]


In fact, a stronger result of Stickelberger describes all “quadratic rings” – rings which are isomorphic to $\mathbb{Z}^2$ as a $\mathbb{Z}$-module. Such rings are all presentable as $\mathcal{O}$ is presented here, but all values $\Delta$ congruent to 0 or 1, modulo 4 are allowed (without the square-free condition). Most of these quadratic rings arise as “orders” in quadratic fields.

Definition 2.22 Let $F$ be a number field. An order in $F$ is a subring $A \subset F$, such that $A$ is a $\mathbb{Z}$-lattice in the $\mathbb{Q}$-vector space $F$.

For example, the ring $\mathbb{Z}[\sqrt{D}]$ is always an order in $\mathbb{Q}(\sqrt{D})$; however it may or may not be the maximal order, i.e.,[25] the ring of integers $\mathcal{O}_D \subset \mathbb{Q}(\sqrt{D})$.

[25] It is a fact that the ring of integers in a number field can be alternatively characterized as the (unique) maximal order in a number field.

Consider a quadratic field $\mathbb{Q}(\sqrt{D})$ of discriminant $\Delta$ (recall $\Delta = D$ or $4D$, according to whether $D$ is congruent to 1, or to 2 or 3, modulo 4). The following gives a construction of orders in $\mathbb{Q}(\sqrt{D})$:

Proposition 2.23 Let $f$ be a positive integer. Then, there is a unique order $B$ of index $f$ (equivalently, of discriminant $\Delta f^2$) in $\mathcal{O}_D$, and $B = \mathbb{Z} + f \cdot \mathcal{O}_D$.

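Proof: First, it is not difficult to see that $B = \mathbb{Z} + f \cdot \mathcal{O}_D$ is a subring of $\mathcal{O}_D$. Furthermore, since $\mathcal{O}_D$ can be described as a lattice $\mathbb{Z} + \mathbb{Z}\,\frac{\Delta + \sqrt{\Delta}}{2}$, we find that

\[ B = \mathbb{Z} + f\,\frac{\Delta + \sqrt{\Delta}}{2} \cdot \mathbb{Z}. \]

This is a sublattice of index $f$ in $\mathcal{O}_D$, equivalently of discriminant $f^2\Delta$.

On the other hand, suppose that $B'$ is an order in $\mathbb{Q}(\sqrt{D})$, and $B'$ has index $f$ in $\mathcal{O}_D$. Since the quotient group $\mathcal{O}_D/B'$ has order $f$, it is annihilated by $f$, and we find that

\[ f\mathcal{O}_D \subset B' \subset \mathcal{O}_D. \]

Since $B'$ is a ring, $\mathbb{Z} \subset B'$. It follows that $B \subset B' \subset \mathcal{O}_D$. By a squeeze argument – both $B$ and $B'$ have the same index in $\mathcal{O}_D$ – we find that $B = B'$.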
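The claim that the order of index $f$ has discriminant $f^2\Delta$ can be checked by computing a Gram determinant. The sketch below is an illustration with names invented here: it uses the basis $(1, f\omega)$ of $\mathbb{Z} + f \cdot \mathcal{O}_D$, where $\omega = (\Delta + \sqrt{\Delta})/2$, together with the relation $\omega^2 = \Delta\omega - (\Delta^2 - \Delta)/4$ to evaluate traces.

```python
def order_discriminant(delta, f):
    """Gram determinant of the trace form on the basis (1, f*omega) of the
    order Z + f*O_D, where omega = (delta + sqrt(delta))/2."""
    if delta % 4 not in (0, 1):
        raise ValueError("delta must be congruent to 0 or 1 modulo 4")
    n = (delta * delta - delta) // 4        # N(omega), an integer
    tr_omega = delta                        # Tr(omega)
    # omega^2 = delta*omega - n, hence Tr(omega^2) = delta*Tr(omega) - n*Tr(1).
    tr_omega_sq = delta * tr_omega - 2 * n
    gram = [[2, f * tr_omega],
            [f * tr_omega, f * f * tr_omega_sq]]
    return gram[0][0] * gram[1][1] - gram[0][1] * gram[1][0]


print(order_discriminant(5, 1), order_discriminant(5, 3), order_discriminant(-4, 2))
# 5 45 -16
```

In each case the result is $f^2\Delta$; for instance $\Delta = -4$, $f = 2$ gives the order $\mathbb{Z}[2i]$ of discriminant $-16$.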

2.5 Exercises

Exercise 2.1 Consider the cubic field $F = \mathbb{Q}(\sqrt[3]{2})$, by which we mean the field $\mathbb{Q}[X]/\langle X^3 - 2 \rangle$, and $\sqrt[3]{2}$ refers to the canonical image of $X$ in $F$.

(a) Compute $\operatorname{Tr}_{F/\mathbb{Q}}(\sqrt[3]{2})$ in two ways: by directly writing the matrix for multiplication by this element, and by using the three embeddings of $F$ into $\mathbb{C}$.

(b) Compute the discriminant of the lattice $\mathbb{Z}[\sqrt[3]{2}]$. Bound the index of $\mathbb{Z}[\sqrt[3]{2}]$ as a sublattice of $\mathcal{O}_F$.

(c) Challenge: Find (a basis for) $\mathcal{O}_F$.

Exercise 2.2 Let $A$ be a quadratic ring, i.e., a commutative unital ring which is isomorphic to $\mathbb{Z}^2$ as a $\mathbb{Z}$-module. Let $V = A \otimes_{\mathbb{Z}} \mathbb{Q}$ be the resulting 2-dimensional $\mathbb{Q}$-vector space, in which $A$ sits as a lattice. Define an algebra structure on $V$, by extension of scalars. The trace and norm can be defined, for elements of $V$, using the trace and determinant of the “multiplication by $\alpha$” matrices, for all $\alpha \in V$. This leads to the appropriate definition of the discriminant of $A$ (viewed as a lattice in $V$).

(a) Prove that if $\operatorname{Disc}(A) = 0$, then $A$ is isomorphic to $\mathbb{Z}[X]/\langle X^2 \rangle$.

(b) Prove that if $\operatorname{Disc}(A) = 1$, then $A$ is isomorphic to $\mathbb{Z} \times \mathbb{Z}$ (the Cartesian product of the two rings).

(c) Challenge: Prove that if $\operatorname{Disc}(A) \neq 0, 1$, then $A$ is an integral domain.

Exercise 2.3 Let $F = \mathbb{Q}(\sqrt{D})$ be a quadratic field (so we assume $D$ is square-free). Describe explicitly the inverse different $\mathfrak{d}_D^{-1}$ of $F$.

Exercise 2.4 Let $F = \mathbb{Q}(\zeta)$, where $\zeta = e^{2\pi i/8}$ is a primitive eighth root of unity in $\mathbb{C}$.

(a) Describe Gal(F/Q), and its subgroups.

(b) Given that OF = Z[ζ], compute the discriminant of F.

3 The geometry of numbers

In this chapter, we discuss the “geometry of numbers”, in which we consider lattices in real vector spaces, to which we can apply classical geometric methods. In particular, this method allows us to estimate the discriminant, in terms of volumes of regions in $\mathbb{R}^n$. Later, as this method applies to lattices related to $\mathcal{O}_F$ (the ring of integers in a number field), it will allow us to study the class number of a number field.

3.1 Lattices, algebra, and measures

Consider, for now, the general situation of a lattice $L$ in a finite-dimensional vector space $V$ over $\mathbb{Q}$. In this circumstance, define $V_{\mathbb{R}} = V \otimes_{\mathbb{Q}} \mathbb{R}$; thus $L$ is isomorphic to $\mathbb{Z}^n$ (as a group) and $V_{\mathbb{R}}$ is isomorphic to $\mathbb{R}^n$ (as an $\mathbb{R}$-vector space). Moreover, $L$ sits naturally[1] as a subgroup of $V_{\mathbb{R}}$, and $L$ spans $V_{\mathbb{R}}$ as an $\mathbb{R}$-vector space[2].

[1] Compose the inclusion of $L$ in $V$, with the inclusion $v \mapsto v \otimes 1$ of $V$ in $V_{\mathbb{R}}$.

[2] Any $\mathbb{Z}$-basis of $L$ will be a basis of $V_{\mathbb{R}}$ as an $\mathbb{R}$-vector space.

Proposition 3.1 If $L$ is a lattice in a finite-dimensional $\mathbb{Q}$-vector space $V$, then $L$ is a discrete subgroup of $V_{\mathbb{R}}$, and the quotient $V_{\mathbb{R}}/L$ is a compact[3] Hausdorff space.

[3] When $V_{\mathbb{R}}/L$ is compact, it is said that $L$ is cocompact in $V_{\mathbb{R}}$.

Margin note: It is obvious that $L$ is isomorphic to $\mathbb{Z}^n$ and $V_{\mathbb{R}}$ is isomorphic to $\mathbb{R}^n$. However, this does not suffice to imply that $L$ is discrete and cocompact in $\mathbb{R}^n$.

Proof: Choose a basis $v_1, \ldots, v_n$ of the lattice $L$ in $V$; this is also a basis of $V_{\mathbb{R}}$ as an $\mathbb{R}$-vector space. Around the point $0 \in L$, consider the open set

\[ U_0 = \{ v = a_1v_1 + \cdots + a_nv_n \in V_{\mathbb{R}} \text{ such that } \max |a_i| < 1/2 \}. \]

Then $U_0$ is an open subset of $V_{\mathbb{R}}$, and $U_0 \cap L = \{0\}$. Translating $U_0$ by any element $v$ of $L$ yields an open subset $U_v \subset V_{\mathbb{R}}$ such that $U_v \cap L = \{v\}$. It follows that $L$ is a discrete subset of $V_{\mathbb{R}}$.

As $L$ is a closed subgroup of $V_{\mathbb{R}}$, it follows that $V_{\mathbb{R}}/L$ is a Hausdorff topological space. To prove compactness, consider the closure $\overline{D} \subset V_{\mathbb{R}}$ of a fundamental domain[4] $D$, given by:

\[ \overline{D} = \{ v = a_1v_1 + \cdots + a_nv_n \in V_{\mathbb{R}} : 0 \leq a_i \leq 1 \text{ for all } 1 \leq i \leq n \}. \]

Then under the canonical projection from $V_{\mathbb{R}}$ onto $V_{\mathbb{R}}/L$, $\overline{D}$ surjects onto $V_{\mathbb{R}}/L$. Since $\overline{D}$ is compact (homeomorphic to a product of closed intervals), and projection is continuous, it follows that $V_{\mathbb{R}}/L$ is compact.

[4] If $G$ is a group acting upon a set $X$, then a fundamental domain for this action is a subset $D \subset X$, such that:

\[ \forall x \in X, \ \exists!\, d \in D, \ \exists g \in G, \ gx = d. \]

In our context, we consider the action of $L$ by translation on the vector space $V_{\mathbb{R}}$. A fundamental domain is given by

\[ D = \{ a_1v_1 + \cdots + a_nv_n : 0 \leq a_i < 1 \}. \]

Most interesting is when the finite-dimensional $\mathbb{Q}$-vector space arises as a number field. When $F$ is a number field, the vector space $F_{\mathbb{R}} = F \otimes_{\mathbb{Q}} \mathbb{R}$ is endowed with a natural $\mathbb{R}$-algebra structure. To describe this, we define the set of real and complex “places” of $F$:

Definition 3.2 A real place of a number field $F$ is an embedding $\sigma : F \to \mathbb{R}$. A complex place of a number field is an (unordered) pair of embeddings $\sigma, \overline{\sigma}$ of $F$ into $\mathbb{C}$, such that $\operatorname{Im}(\sigma) \not\subset \mathbb{R}$, and $\overline{\sigma}(x) = \overline{\sigma(x)}$ for all $x \in F$. Typically, one writes $r_1$ for the number of real places, and $r_2$ for the number of complex places of a given number field $F$. An archimedean place of a number field is a place which is either real or complex.[5]

[5] Later, we will discuss nonarchimedean places, arising from the p-adic fields.

The first observation about archimedean places is the following:

Proposition 3.3 Let $F$ be a number field, with $[F : \mathbb{Q}] = n$. Let $r_1$ be the number of real places, and $r_2$ be the number of complex places of $F$. Then $n = r_1 + 2r_2$.

Proof: The number of embeddings of $F$ into $\mathbb{C}$ equals $n$. Each such embedding has image contained in $\mathbb{R}$, and hence corresponds to a real place of $F$, or else has image not contained in $\mathbb{R}$. As those embeddings whose image is not contained in $\mathbb{R}$ come in conjugate pairs, each pair corresponding to a complex place, it follows that $n = r_1 + 2r_2$.

The structure of the $\mathbb{R}$-algebra $F_{\mathbb{R}}$ is related to the archimedean places as follows:

Proposition 3.4 There is an isomorphism of $\mathbb{R}$-algebras $F_{\mathbb{R}} \cong \mathbb{R}^{r_1} \times \mathbb{C}^{r_2}$, where $r_1$ denotes the number of real places of $F$, and $r_2$ denotes the number of complex places of $F$.

Proof: By the theorem of the primitive element, there exists $\alpha \in F$ such that $F = \mathbb{Q}(\alpha)$. Let $P$ denote the minimal polynomial of $\alpha$, i.e., the irreducible monic polynomial, with rational coefficients, of minimal degree having $\alpha$ as a root. There is a field isomorphism from $F$ to $\mathbb{Q}[X]/\langle P \rangle$, which extends naturally to an algebra isomorphism from $F_{\mathbb{R}}$ to $\mathbb{R}[X]/\langle P \rangle$. On the other hand, the polynomial $P$ factors over $\mathbb{R}$ into linear and quadratic terms. Each linear term $(X - x_i)$ corresponds to a real place of $F$, sending $\alpha$ to $x_i$. Each quadratic term $(X^2 - t_jX + n_j)$ corresponds to a complex place of $F$, sending $\alpha$ to either of the conjugate roots $z_j, \overline{z_j}$ of this quadratic polynomial. In this way, we find a factorization of $P$ in $\mathbb{R}[X]$:

\[ P = \prod_{i=1}^{r_1} (X - x_i) \cdot \prod_{j=1}^{r_2} (X^2 - t_jX + n_j). \]

By the Chinese remainder theorem (applied to the principal ideal domain $\mathbb{R}[X]$, in which these factors are pairwise coprime because the roots of $P$ are distinct), we find an isomorphism of $\mathbb{R}$-algebras

\[ F_{\mathbb{R}} \cong \frac{\mathbb{R}[X]}{\langle P \rangle} \cong \prod_{i=1}^{r_1} \frac{\mathbb{R}[X]}{\langle X - x_i \rangle} \times \prod_{j=1}^{r_2} \frac{\mathbb{R}[X]}{\langle X^2 - t_jX + n_j \rangle}. \]

It follows immediately that $F_{\mathbb{R}}$ is isomorphic, as an $\mathbb{R}$-algebra, to $\mathbb{R}^{r_1} \times \mathbb{C}^{r_2}$.
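For a concrete instance of this factorization, one may take the cubic field of Exercise 2.1. The worked example below is a sketch; the symbol $\theta$, for the real cube root of 2, is notation introduced here.

```latex
% Assumed example: F = Q[X]/<X^3 - 2>, theta = the real cube root of 2.
\begin{align*}
P = X^3 - 2 &= (X - \theta)\,(X^2 + \theta X + \theta^2), \\
\text{discriminant of the quadratic factor} &= \theta^2 - 4\theta^2 = -3\theta^2 < 0, \\
F_{\mathbb{R}} \cong \frac{\mathbb{R}[X]}{\langle X - \theta \rangle} \times
\frac{\mathbb{R}[X]}{\langle X^2 + \theta X + \theta^2 \rangle}
&\cong \mathbb{R} \times \mathbb{C}, \qquad r_1 = 1, \quad r_2 = 1, \quad n = r_1 + 2r_2 = 3.
\end{align*}
```

In the notation of the proof, $x_1 = \theta$, $t_1 = -\theta$ and $n_1 = \theta^2$; the quadratic factor has no real root, so it contributes one complex place.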

One subtle issue with the previous proposition is that there is not a unique (or natural) isomorphism from $F_{\mathbb{R}}$ to $\mathbb{R}^{r_1} \times \mathbb{C}^{r_2}$; indeed, the construction of such an isomorphism relied on an initial choice of “primitive element”, for which there are an infinitude of choices. Nor is it true that the resulting isomorphism is independent of choices. Instead, we must be content to prove that there are “not too many” such isomorphisms.

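Proposition 3.5 Suppose that $F$ is a number field, and $\iota_1, \iota_2$ are $\mathbb{R}$-algebra isomorphisms from $F_{\mathbb{R}}$ to $\mathbb{R}^{r_1} \times \mathbb{C}^{r_2}$. Let $c \in \operatorname{Gal}(\mathbb{C}/\mathbb{R})$ denote complex conjugation. Then, there exist permutations $\sigma \in S_{r_1}$ and $\tau \in S_{r_2}$, and a “conjugation datum” $(e_1, \ldots, e_{r_2}) \in \{0, 1\}^{r_2}$, such that for all $x \in F_{\mathbb{R}}$,

\[ \iota_1(x) = (x_1, \ldots, x_{r_1}, z_1, \ldots, z_{r_2}) \quad \text{if and only if} \quad \iota_2(x) = (x_{\sigma(1)}, \ldots, x_{\sigma(r_1)}, c^{e_1}z_{\tau(1)}, \ldots, c^{e_{r_2}}z_{\tau(r_2)}). \]

In other words, there is a unique isomorphism from $F_{\mathbb{R}}$ to $\mathbb{R}^{r_1} \times \mathbb{C}^{r_2}$, up to reordering and complex conjugation.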

Proof: Let $A = \mathbb{R}^{r_1} \times \mathbb{C}^{r_2}$, to abbreviate. If $\iota_1, \iota_2$ are $\mathbb{R}$-algebra isomorphisms from $F_{\mathbb{R}}$ to $A$, then $\iota = \iota_2 \circ \iota_1^{-1}$ is an $\mathbb{R}$-algebra automorphism of $A$. Thus, it suffices to prove that every such automorphism arises from reordering and complex conjugations.

Recall that an idempotent in a commutative ring is an element $e$ such that $e^2 = e$. The set of idempotents forms a partially ordered set, by defining $e \leq f$ if $ef = e$, for idempotents $e, f$. A minimal idempotent is a minimal nonzero idempotent, for this partial order.

The idempotents in the ring $A$ are precisely those expressions $(x_1, \ldots, x_{r_1}, z_1, \ldots, z_{r_2})$ such that for all $i, j$, $x_i \in \{0, 1\}$ and $z_j \in \{0, 1\}$. Among these $2^{r_1 + r_2}$ idempotents, there are $r_1 + r_2$ minimal idempotents in the ring $A$, given by

\[ (1, 0, \ldots, 0), \ \ldots, \ (0, \ldots, 0, 1). \]

We call a minimal idempotent $e \in A$ real or complex, depending on whether $eA$ is isomorphic to $\mathbb{R}$ or to $\mathbb{C}$, respectively.

As the property of “being a minimal idempotent” is certainly preserved by ring automorphisms, we find that for any minimal idempotent $e \in A$, there exists a minimal idempotent $f \in A$ such that $\iota(e) = f$. We find that $\iota$ induces a ring isomorphism[6] from $eA$ to $fA$. For our ring $A$, and a minimal idempotent $e$, we find that $eA$ is isomorphic to $\mathbb{R}$ or to $\mathbb{C}$. As $\iota$ induces a ring isomorphism from $eA$ to $fA$, we find that $\iota$ sends real idempotents to real idempotents, and complex idempotents to complex idempotents.

[6] When $e$ is an idempotent in a ring $A$, the set $eA$ forms a commutative unital ring, with $e$ as its unit element.

Let $\sigma$ and $\tau$ denote the resulting permutations. Since the only $\mathbb{R}$-algebra isomorphism from $\mathbb{R}$ to $\mathbb{R}$ is the identity, and the only $\mathbb{R}$-algebra isomorphisms from $\mathbb{C}$ to $\mathbb{C}$ are the identity and complex conjugation, the proposition follows immediately.

When $F$ is a number field, the real vector space $F_{\mathbb{R}}$ is endowed with an $\mathbb{R}$-algebra structure. From the previous proposition, this algebra structure is quite “rigid” – there are few automorphisms, all arising from reorderings and complex conjugations. It follows that $F_{\mathbb{R}}$ is endowed with a canonical lax basis, i.e., a basis as an $\mathbb{R}$-vector space, unordered, and only defined up to sign. Namely, if $r_1$ and $r_2$ denote the number of real and complex places, and $F_{\mathbb{R}} \cong \mathbb{R}^{r_1} \times \mathbb{C}^{r_2}$ is a fixed isomorphism, then we may consider the real idempotents $e_1, \ldots, e_{r_1}$ and complex idempotents $f_1, \ldots, f_{r_2}$. A basis for $F_{\mathbb{R}}$ as a real vector space is given by the set:

\[ \{ e_1, \ldots, e_{r_1}, f_1, \ldots, f_{r_2}, f_1\sqrt{-1}, \ldots, f_{r_2}\sqrt{-1} \}. \]

Changing the algebra isomorphism $F_{\mathbb{R}} \cong \mathbb{R}^{r_1} \times \mathbb{C}^{r_2}$ will only reorder these basis elements, and perhaps change some of the signs (changing one square root of $-1$ to the other in $\mathbb{C}$).

From these observations, the real vector space $F_{\mathbb{R}} \cong \mathbb{R}^{r_1} \times \mathbb{C}^{r_2}$ is also endowed with a natural measure, given by the absolute value of a top-degree differential form $dV$:

\[ dV = dx_1 \wedge \cdots \wedge dx_{r_1} \wedge du_1 \wedge dv_1 \wedge \cdots \wedge du_{r_2} \wedge dv_{r_2}. \]

Here, $z_i = u_i + v_i\sqrt{-1}$, for each coordinate $z_i$ on $\mathbb{C}^{r_2}$. Observe that, upon reordering the coordinates $x_1, \ldots, x_{r_1}$, and the coordinates $z_1, \ldots, z_{r_2}$, and upon conjugating various complex coordinates, the volume form $dV$ only changes in sign, and the measure $|dV|$ does not change.

It now follows that, given only the data of a number field $F$ over $\mathbb{Q}$, one arrives at a lattice $\mathcal{O}_F$ in the $\mathbb{Q}$-vector space $F$ (which in turn is endowed with the trace pairing), as well as a discrete cocompact subgroup $\mathcal{O}_F$ in the $\mathbb{R}$-vector space $F_{\mathbb{R}}$ (which in turn is endowed with a canonical measure). The previous chapter applied the theory of $\mathbb{Z}$-lattices in $\mathbb{Q}$-vector spaces with classical pairings to the special case $\mathcal{O}_F$ in $F$. This chapter applies the theory of discrete cocompact subgroups of $\mathbb{R}$-vector spaces with measures to the special case $\mathcal{O}_F$ in $F_{\mathbb{R}}$.

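Example 3.6 Consider first the fields $G = \mathbb{Q}(\sqrt{-1})$ and $E = \mathbb{Q}(\sqrt{-3})$. Let $i$ denote a complex square root of $-1$, and let $\omega = e^{2\pi i/3}$ denote a complex cube root of 1. Then $\mathcal{O}_G = \mathbb{Z}[i]$ and $\mathcal{O}_E = \mathbb{Z}[\omega]$ by the results of the previous chapter[7]. Both of these fields are totally complex, meaning that all archimedean places are complex. We have $G_{\mathbb{R}} \cong \mathbb{C} \cong E_{\mathbb{R}}$. The resulting measure on $\mathbb{C}$ is the usual measure. In this way, we find the covolume (i.e., the volume of the quotient) to be:

\[ \operatorname{Vol}(\mathbb{C}/\mathcal{O}_G) = 1, \quad \text{and} \quad \operatorname{Vol}(\mathbb{C}/\mathcal{O}_E) = \frac{\sqrt{3}}{2}. \]

[7] Note that $\omega = (-1 + \sqrt{-3})/2$.

Figure 3.1: The lattice $\mathcal{O}_G = \mathbb{Z}[i]$, represented by blue dots in $\mathbb{C} \cong G_{\mathbb{R}}$. The lattice $\mathcal{O}_E = \mathbb{Z}[\omega]$, represented by red dots in $\mathbb{C} \cong E_{\mathbb{R}}$.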
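The two covolumes in Example 3.6 are areas of parallelograms, and can be recomputed in a couple of lines. This is a small sketch with a helper name, `covolume`, invented here; it identifies $\mathbb{C}$ with $\mathbb{R}^2$ via the coordinates $u + v\sqrt{-1}$ and uses floating-point arithmetic.

```python
import math


def covolume(z1, z2):
    """Area of the parallelogram spanned by the complex numbers z1 and z2,
    viewing C as R^2 with coordinates (real part, imaginary part)."""
    return abs(z1.real * z2.imag - z1.imag * z2.real)


omega = complex(-0.5, math.sqrt(3) / 2)  # a primitive cube root of 1

print(covolume(1 + 0j, 1j))     # 1.0, for the basis (1, i) of Z[i]
print(covolume(1 + 0j, omega))  # about 0.866 = sqrt(3)/2, for the basis (1, omega)
```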

Example 3.7 Consider the field F = Q(√5), with ring of integers OF = Z[γ], where γ = 1/2(1 + √5). This field is totally real, meaning that all archimedean places are real. We have FR ≅ $\mathbb{R}^2$. One embeds F (and OF) in $\mathbb{R}^2$ by sending an element a + b√5 ∈ Q(√5) to the pair of real numbers (a + b√5, a − b√5) (for √5 a fixed square root of 5 in R).

In this way, the Z-basis 1, γ of the lattice OF becomes identified with the vectors (1, 1) and 1/2(1 + √5, 1 − √5) in the real vector space $\mathbb{R}^2$.


Figure 3.2: The lattice OF = Z[γ], represented by red dots in $\mathbb{R}^2$ ≅ FR.
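As a quick numerical companion to Example 3.7, the sketch below embeds the basis 1, γ in R^2 and computes the area of the resulting parallelogram. The function name `embed` is hypothetical; the resulting covolume √5 squares to 5, the discriminant of Q(√5).

```python
import math

SQRT5 = math.sqrt(5)

def embed(a, b):
    """Image of a + b*sqrt(5) in R^2 under the two real embeddings."""
    return (a + b * SQRT5, a - b * SQRT5)

one = embed(1, 0)          # the vector (1, 1)
gamma = embed(0.5, 0.5)    # gamma = (1 + sqrt 5)/2

# Covolume of O_F = Z[gamma]: absolute determinant of the two basis vectors.
covol = abs(one[0] * gamma[1] - one[1] * gamma[0])
print(one, gamma, covol)
```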

3.2 Geometry and applications

Consider here a number field F, with r1, r2 the numbers of real and complex places as before. Fix an identification FR ≅ $\mathbb{R}^{r_1} \times \mathbb{C}^{r_2}$, although the results of this section will be independent of this choice. Also, fix a lattice L in the Q-vector space F; for example, L might be OF, but other lattices will be important as well. Then L is discrete and cocompact in FR.

Consider a Z-basis (v1, . . . , vn) of the lattice L. This is also a basis of F, as a Q-vector space, and of FR, as an R-vector space. This basis also yields a fundamental domain D for the action of L on FR by translation:

$$D = \{a_1 v_1 + \cdots + a_n v_n : 0 \le a_i < 1 \text{ for all } i\}.$$

Its closure $\bar{D}$ is a closed parallelotope, i.e., the image of a closed n-cube under an invertible linear map.

Since the measure dV on FR is translation invariant, it descends to a measure on the compact Hausdorff space FR/L. Explicitly, if π : FR → FR/L is the projection map, and X is a locally closed subset of FR/L, then the measure of X is given by

$$\mathrm{Vol}(X) = \mathrm{Vol}(\pi^{-1}(X) \cap D).$$

In particular, Vol(FR/L) = Vol(D) is the measure of the parallelotope D. This can be related to the discriminant of L by the following proposition.

Proposition 3.8 The discriminant Disc(L) is equal to $(-4)^{r_2} \cdot \mathrm{Vol}(D)^2$.

Proof: Consider the set of embeddings Emb(F, C), partitioned into the subset $\{\sigma_1, \ldots, \sigma_{r_1}\}$ of embeddings into R, and the subset $\{\tau_1, \bar\tau_1, \ldots, \tau_{r_2}, \bar\tau_{r_2}\}$ of embeddings into C whose image is not contained in R, and which come in conjugate pairs.

Recall that (v1, . . . , vn) is a basis of L, and n = r1 + 2r2. We have seen previously, in Proposition 2.17, that the discriminant of L can be computed using the embeddings of F into C:

$$\mathrm{Disc}(L) = \mathrm{Det}(B^t B) = \mathrm{Det}(B)^2,$$

where $B = (B_{ij})$ is the matrix whose ith row has the form

$$(\sigma_1(v_i), \ldots, \sigma_{r_1}(v_i), \tau_1(v_i), \bar\tau_1(v_i), \ldots, \tau_{r_2}(v_i), \bar\tau_{r_2}(v_i)).$$

For example, consider the lattice L spanned by 1 and i in F = Q(i). The discriminant can be computed via embeddings:

$$\mathrm{Disc}(L) = \mathrm{Det}\begin{pmatrix} 1 & 1 \\ i & -i \end{pmatrix}^2 = (-2i)^2 = -4.$$

On the other hand, the (signed) volume of the parallelotope corresponding to the basis v1, . . . , vn ∈ FR can be computed as Det(C), where $C = (C_{ij})$ is the matrix whose ith row has the form

$$(\sigma_1(v_i), \ldots, \sigma_{r_1}(v_i), \Re(\tau_1(v_i)), \Im(\tau_1(v_i)), \ldots, \Re(\tau_{r_2}(v_i)), \Im(\tau_{r_2}(v_i))).$$

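Thus, it remains to relate the matrices B and C. Let $\varphi : \mathbb{C}^2 \to \mathbb{C}^2$ be the map defined by:

$$\varphi(z, w) = \tfrac{1}{2}\,(z + w,\; -i(z - w)).$$

In particular, when $w = \bar z$, we find that

$$\varphi(z, \bar z) = (\Re(z), \Im(z)).$$

Let $\Phi = \begin{pmatrix} 1/2 & -i/2 \\ 1/2 & i/2 \end{pmatrix}$ denote the corresponding invertible matrix, acting on row vectors by $(z, w) \mapsto (z, w)\,\Phi$.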

Let ΦF denote the n by n matrix, given in blocks down the diagonal: the first r1 blocks are 1 × 1 blocks with 1 in each entry; the last r2 blocks are 2 × 2 blocks with Φ in each block. Then we find that B · ΦF = C, and so

$$\mathrm{Det}(B)\,\mathrm{Det}(\Phi_F) = \mathrm{Det}(C).$$

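On the other hand, we can compute Det(ΦF):

$$\mathrm{Det}(\Phi_F) = \mathrm{Det}(\Phi)^{r_2} = (i/2)^{r_2}.$$

It follows that

$$\mathrm{Det}(B) = (-2i)^{r_2}\,\mathrm{Det}(C) = \pm(-2i)^{r_2} \cdot \mathrm{Vol}(D).$$

Since $\mathrm{Disc}(L) = \mathrm{Det}(B)^2$, we get

$$\mathrm{Disc}(L) = \mathrm{Det}(B)^2 = (-4)^{r_2}\,\mathrm{Vol}(D)^2.$$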
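Proposition 3.8 can be sanity-checked on a field with both real and complex places. The sketch below is an illustration under stated assumptions: it takes F = Q(θ) with θ^3 = 2 (so r1 = r2 = 1) and the lattice L = Z[θ] with basis 1, θ, θ^2, whose discriminant is the polynomial discriminant −108 of x^3 − 2. The helper `det3` and the variable names are illustrative choices.

```python
import cmath

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, k) = m
    return a * (e * k - f * h) - b * (d * k - f * g) + c * (d * h - e * g)

theta_real = 2 ** (1 / 3)                               # image of theta under sigma_1
theta_cplx = theta_real * cmath.exp(2j * cmath.pi / 3)  # image of theta under tau_1

powers = [0, 1, 2]   # the basis 1, theta, theta^2 of L

# Rows of B: (sigma_1(v), tau_1(v), conj(tau_1)(v)).
B = [[theta_real ** k, theta_cplx ** k, theta_cplx.conjugate() ** k] for k in powers]
# Rows of C: (sigma_1(v), Re tau_1(v), Im tau_1(v)).
C = [[theta_real ** k, (theta_cplx ** k).real, (theta_cplx ** k).imag] for k in powers]

r2 = 1
disc = det3(B) ** 2      # Disc(L) computed via embeddings
vol = abs(det3(C))       # Vol(D), the covolume of L in F_R
print(disc, (-4) ** r2 * vol ** 2)
```

Both printed quantities agree with −108 up to rounding, so Vol(D)^2 = 27 in this example.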

In particular, the sign of the discriminant is uniquely determined by the number of complex places:

Corollary 3.9 If L is a Z-lattice in a number field F, then the sign of Disc(L) is $(-1)^{r_2}$, where r2 is the number of complex places of F.

Our next geometric result is crucial for estimates involving discriminants and class numbers. We begin with a definition.


Definition 3.10 Let P be a subset of a real vector space V. Then P is called convex if for all p, q ∈ P, and all t ∈ [0, 1] (the closed unit interval in R), the point tp + (1 − t)q is also in P. In other words, P is convex if for any two points p, q of P, the line segment joining p and q is also fully contained in P.

P is called centrally symmetric if for all p ∈ P, −p is also a point of P.

Theorem 3.11 Suppose that L ≅ $\mathbb{Z}^n$ is embedded as a discrete cocompact subgroup of a real vector space V, endowed with a “usual measure” (arising from an identification with Euclidean space). Let P be a compact, convex, centrally symmetric subset of V, and suppose that $\mathrm{Vol}(P) \ge 2^n\,\mathrm{Vol}(V/L)$. Then P contains a nonzero8 point of L.

8 If P is a subset of V, and P is centrally symmetric, convex, and nonempty, then 0 ∈ P.

Proof: Suppose first that $\mathrm{Vol}(P) > 2^n\,\mathrm{Vol}(V/L)$ (a strict inequality, unlike the nonstrict inequality given in the theorem). Let H = 1/2 P, so that $\mathrm{Vol}(H) = 2^{-n}\,\mathrm{Vol}(P) > \mathrm{Vol}(V/L)$. Consider the projection π : V → V/L. For every non-negative integer k, define measurable subsets of V/L by:

$$M_k = \{v \in V/L : \pi^{-1}(v) \cap H \text{ has cardinality } k\}.$$

The compactness (the boundedness, in fact) of H implies that

$$V/L = \bigsqcup_{k=0}^{\infty} M_k.$$

Namely, the cardinality of $\pi^{-1}(v) \cap H$ is always finite, since otherwise we would find an infinite discrete subset of H. We can use this partition to compute the volume of H in a new way9:

$$\mathrm{Vol}(H) = \sum_{k=0}^{\infty} k \cdot \mathrm{Vol}(M_k).$$

9 Essentially, we are partitioning H into subsets Hk (0 ≤ k < ∞), in such a way that the projection map from Hk to Mk is a k to 1 map. Since the volume form on Hk is precisely the pullback of the volume form on Mk, we have Vol(Hk) = k · Vol(Mk).

In particular, since 0 · Vol(M0) = 0, and 1 · Vol(M1) = Vol(M1) ≤ Vol(V/L), and Vol(H) > Vol(V/L), we find that there exists k ≥ 2 such that Mk ≠ ∅.

Hence, there exists v ∈ π(H) such that $\pi^{-1}(v) \cap H$ has cardinality at least two. In other words, there exist distinct points p, q ∈ H such that π(p) = π(q); that is, there exist p, q ∈ H, and a nonzero v ∈ L, such that p − q = v.

Since H is centrally symmetric, we find that p and −q are elements of H, and so 2p and −2q are elements of P. It follows from convexity of P that

$$v = p - q = \tfrac{1}{2}\,(2p + (-2q)) \in (L \cap P) \setminus \{0\}.$$

Hence P contains a nonzero point of the lattice L.

Now, we prove the theorem in the case of the nonstrict inequality. Thus, we suppose that $\mathrm{Vol}(P) \ge 2^n\,\mathrm{Vol}(V/L)$. Consider the family $P_t = t \cdot P$, for 0 < t ∈ R. For t > 1, we find that $P_t \cap L$ contains a nonzero lattice point, since $\mathrm{Vol}(P_t) > \mathrm{Vol}(P) \ge 2^n\,\mathrm{Vol}(V/L)$ in this case.

Thus, we find a sequence of nonzero lattice points v1, v2, . . ., such that $v_i \in P_{1+1/i}$ for all i ≥ 1. By the compactness of $P_2$, this sequence has a convergent subsequence; by the discreteness of L, we find that there exists a single nonzero lattice point v such that $v \in P_{1+1/i}$ for all i sufficiently large. It follows that v (the limit of this sequence) lies in the compact set $P = P_1$.
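Theorem 3.11 can be illustrated in the plane by brute force. The sketch below is only an example, with an arbitrarily chosen lattice basis and search window: it takes P to be the closed disc whose area is exactly 2^2 times the covolume, and confirms that a shortest nonzero lattice vector lies inside it.

```python
import math
from itertools import product

def short_lattice_point(v1, v2, search=10):
    """Shortest nonzero point a*v1 + b*v2 with |a|, |b| <= search (brute force)."""
    best = None
    for a, b in product(range(-search, search + 1), repeat=2):
        if (a, b) == (0, 0):
            continue
        p = (a * v1[0] + b * v2[0], a * v1[1] + b * v2[1])
        if best is None or math.hypot(*p) < math.hypot(*best):
            best = p
    return best

v1, v2 = (2, 1), (1, 3)                          # an illustrative basis of a lattice in R^2
covol = abs(v1[0] * v2[1] - v1[1] * v2[0])       # covolume 5
radius = math.sqrt(4 * covol / math.pi)          # disc of area exactly 2^2 * covol
point = short_lattice_point(v1, v2)
print(point, math.hypot(*point), radius)
```

Here the shortest nonzero vector has length √5 ≈ 2.24, while the disc has radius about 2.52, as the theorem predicts.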

In the next theorem, we choose a compact, convex, centrally symmetric subset of FR ≅ $\mathbb{R}^{r_1} \times \mathbb{C}^{r_2}$ in such a way that we can prove the existence of a “short vector” in a lattice.

Theorem 3.12 Suppose that L is a Z-lattice in a number field F with n = [F : Q]. Then, there exists a nonzero element α ∈ L, such that

$$|N_{F/\mathbb{Q}}(\alpha)| \le \frac{n!}{n^n}\left(\frac{4}{\pi}\right)^{r_2} \cdot \sqrt{|\mathrm{Disc}(L)|}.$$

Proof: For all t > 0, consider the compact, convex, centrally symmetric subset of FR given by

$$P_t = \Big\{x \in F_{\mathbb{R}} : \sum_{\sigma \in \mathrm{Emb}(F, \mathbb{C})} |\sigma(x)| \le t\Big\}.$$

Here, we are using the fact that any embedding σ of F into C extends uniquely to an R-algebra homomorphism from FR to C, by sending x ⊗ r to r · σ(x) for all x ∈ F and r ∈ R.

The fact that Pt is compact follows from the observation that Pt is closed and (almost by definition) bounded. The fact that Pt is centrally symmetric follows from the observation that its definition is symmetric with respect to the transformation v ↦ −v of FR. The fact that Pt is convex follows from the triangle inequality.

Note that the definition of Pt essentially depends only upon r1, r2, and t (and not otherwise on the number field F); thus we write Pt(r1, r2). Furthermore, since Pt(r1, r2) is obtained by scaling P1(r1, r2) uniformly by a factor of t, there is a function V(r1, r2) = Vol(P1(r1, r2)), such that:

$$\mathrm{Vol}(P_t(r_1, r_2)) = V(r_1, r_2)\,t^n.$$

We compute the function V(r1, r2), recursively with respect to r2.

r2 = 0: When r2 = 0, the region P1 is an “orthoplex”:

$$P_1 = \{(x_1, \ldots, x_n) \in \mathbb{R}^n : \textstyle\sum |x_i| \le 1\}.$$

Figure 3.3: The orthoplex P1 in $\mathbb{R}^2$, dissected into $2^2$ right triangles (one of which is shaded).

This orthoplex can be dissected into $2^n$ congruent pieces, based on the signs of the coordinates:

$$V(n, 0) = 2^n\,\mathrm{Vol}\{(x_1, \ldots, x_n) \in \mathbb{R}^n : \textstyle\sum x_i \le 1 \text{ and } x_i > 0 \text{ for all } i\}.$$

Each of these pieces is a right pyramid, and the volume can be computed without too much trouble:

$$V(n, 0) = \frac{2^n}{n!}.$$

r2 > 0: When r2 > 0, we may express V(r1, r2) as an iterated integral, by slicing along one complex coordinate, which we express in polar coordinates ρ and θ:

$$V(r_1, r_2) = \int_0^{1/2} \int_0^{2\pi} V(r_1, r_2 - 1)\,(1 - 2\rho)^{n-2}\,\rho\,d\theta\,d\rho.$$

Figure 3.4: V(1, 1) is computed by integrating over the disc of radius 1/2. The blue line segment is a 1-dimensional orthoplex, with length $V(1, 0)(1 - 2 \cdot 1/4)^{3-2}$.

Computing this integral, using integration by parts,

$$\begin{aligned}
V(r_1, r_2) &= 2\pi\,V(r_1, r_2 - 1) \int_0^{1/2} (1 - 2\rho)^{n-2}\,\rho\,d\rho \\
&= 2\pi\,V(r_1, r_2 - 1) \int_0^{1/2} \frac{1}{2(n-1)}\,(1 - 2\rho)^{n-1}\,d\rho \\
&= 2\pi\,V(r_1, r_2 - 1) \left[\frac{-1}{4n(n-1)}\,(1 - 2\rho)^n\right]_0^{1/2} \\
&= V(r_1, r_2 - 1)\,\frac{\pi}{2n(n-1)}.
\end{aligned}$$

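The recursive relation $V(r_1, r_2) = V(r_1, r_2 - 1) \cdot \pi/(2n(n-1))$ implies that

$$V(r_1, r_2) = V(r_1, 0)\left(\frac{\pi}{2}\right)^{r_2} \frac{1}{n(n-1) \cdots (n - 2r_2 + 1)}.$$

From our previous result on V(r1, 0) (and as usual, recalling that n = r1 + 2r2), we find that

$$V(r_1, r_2) = 2^{r_1}\left(\frac{\pi}{2}\right)^{r_2} \frac{1}{n!}.$$

In particular, we find that

$$\mathrm{Vol}(P_t) = \mathrm{Vol}(P_t(r_1, r_2)) = 2^{r_1}\left(\frac{\pi}{2}\right)^{r_2} \frac{t^n}{n!}.$$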
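The recursion and the closed form for V(r1, r2) can be compared numerically. This is a small sketch; the function names `volume_recursive` and `volume_closed` are illustrative, and n = r1 + 2·r2 is recomputed at each step of the recursion.

```python
import math

def volume_recursive(r1, r2):
    """V(r1, r2) from V(n, 0) = 2^n / n! and the recursion
    V(r1, r2) = V(r1, r2 - 1) * pi / (2 n (n - 1)), with n = r1 + 2*r2."""
    if r2 == 0:
        return 2 ** r1 / math.factorial(r1)
    n = r1 + 2 * r2
    return volume_recursive(r1, r2 - 1) * math.pi / (2 * n * (n - 1))

def volume_closed(r1, r2):
    """Closed form 2^r1 * (pi/2)^r2 / n!."""
    n = r1 + 2 * r2
    return 2 ** r1 * (math.pi / 2) ** r2 / math.factorial(n)

for r1 in range(5):
    for r2 in range(4):
        print(r1, r2, volume_recursive(r1, r2), volume_closed(r1, r2))
```

For (r1, r2) = (0, 1) both give π/4, the area of the disc of radius 1/2.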

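Choose10 t such that $\mathrm{Vol}(P_t) = 2^n\,\mathrm{Vol}(F_{\mathbb{R}}/L)$. By Proposition 3.8, and a bit of algebra,

$$\mathrm{Vol}(P_t) = 2^n\,\mathrm{Vol}(F_{\mathbb{R}}/L) = 2^{r_1 + r_2} \sqrt{|\mathrm{Disc}(L)|}.$$

10 Explicitly, we are choosing t so that

$$t^n = 2^{r_1 + r_2}\,2^{-r_1}\,n!\,2^{r_2}\,\pi^{-r_2} \sqrt{|\mathrm{Disc}(L)|}.$$

To simplify,

$$t^n = n!\left(\frac{4}{\pi}\right)^{r_2} \sqrt{|\mathrm{Disc}(L)|}.$$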

Since Pt satisfies the hypotheses of the previous Theorem 3.11, there exists a nonzero element α ∈ L ∩ Pt. The norm of α can be computed via embeddings, and estimated using the fact that the geometric mean is bounded by the arithmetic mean:

$$|N(\alpha)|^{1/n} = \Big(\prod_{i=1}^{n} |\sigma_i(\alpha)|\Big)^{1/n} \le \frac{1}{n} \sum_{i=1}^{n} |\sigma_i(\alpha)| \le \frac{t}{n}.$$

It follows that $|N(\alpha)| \le t^n/n^n$. Hence we find that

$$|N(\alpha)| \le \frac{t^n}{n^n} = \frac{n!}{n^n}\left(\frac{4}{\pi}\right)^{r_2} \sqrt{|\mathrm{Disc}(L)|}.$$
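To see the bound of Theorem 3.12 in action, the following sketch works in F = Q(√−5) (so n = 2, r2 = 1) with the lattice L having Z-basis 2 and 1 + √−5. The choice of field and lattice, the hard-coded trace-form Gram matrix, and the finite search window are all assumptions made for illustration.

```python
import math

def minkowski_bound(n, r2, disc):
    """(n!/n^n) * (4/pi)^r2 * sqrt(|disc|), the bound of Theorem 3.12."""
    return math.factorial(n) / n ** n * (4 / math.pi) ** r2 * math.sqrt(abs(disc))

def norm(x, y):
    """Norm of x + y*sqrt(-5) from Q(sqrt(-5)) down to Q."""
    return x * x + 5 * y * y

# With v1 = 2 and v2 = 1 + sqrt(-5), the traces Tr(v_i v_j) form the Gram
# matrix [[8, 4], [4, -8]], whose determinant is Disc(L).
disc_L = 8 * (-8) - 4 * 4
bound = minkowski_bound(2, 1, disc_L)

# Smallest norm of a nonzero element a*v1 + b*v2 = (2a + b) + b*sqrt(-5).
smallest = min(
    norm(2 * a + b, b)
    for a in range(-5, 6)
    for b in range(-5, 6)
    if (a, b) != (0, 0)
)
print(disc_L, bound, smallest)
```

The smallest norm found is 4 (attained by α = 2), which is below the bound of roughly 5.69.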

In the particular case when the lattice is the ring of integers in a number field, we find the following.

Corollary 3.13 If OF is the ring of integers in a number field, so Disc(F) = Disc(OF), then we find that

$$\sqrt{|\mathrm{Disc}(F)|} \ge \frac{n^n}{n!}\left(\frac{\pi}{4}\right)^{r_2}.$$

Proof: Applying the previous theorem with L = OF, there exists a nonzero element α ∈ OF, such that

$$|N(\alpha)| \le \frac{n!}{n^n}\left(\frac{4}{\pi}\right)^{r_2} \sqrt{|\mathrm{Disc}(F)|}.$$

However, for a nonzero element α ∈ OF, we also find that 0 ≠ N(α) ∈ Z. It follows that |N(α)| ≥ 1. Hence we have

$$\frac{n!}{n^n}\left(\frac{4}{\pi}\right)^{r_2} \sqrt{|\mathrm{Disc}(F)|} \ge 1.$$

The result follows immediately.

Finally, we achieve the theorem of Minkowski:

Corollary 3.14 If F is a number field, and Disc(F) = ±1, then Disc(F) = 1 and F = Q.

Proof: Consider the function

$$\mu(n, r_2) = \frac{n^n}{n!}\left(\frac{\pi}{4}\right)^{r_2}.$$

Since n ≥ 2r2, we find that

$$\mu(n, r_2) \ge \frac{n^n}{n!}\left(\frac{\pi}{4}\right)^{n/2}.$$

The right side of the above inequality is strictly increasing as a function of n ≥ 1 (as can be seen by induction), and it already equals π/2 > 1 when n = 2. Hence $\sqrt{|\mathrm{Disc}(F)|} = 1 \ge \mu(n, r_2)$ implies that n = 1, so F = Q.
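The monotonicity used in this proof is easy to confirm for small n. The sketch below tabulates the lower bound $(n^n/n!)(\pi/4)^{n/2}$; the function name `mu_lower` is an illustrative choice.

```python
import math

def mu_lower(n):
    """Lower bound (n^n / n!) * (pi/4)^(n/2) for mu(n, r2), valid since n >= 2*r2."""
    return n ** n / math.factorial(n) * (math.pi / 4) ** (n / 2)

values = [mu_lower(n) for n in range(1, 11)]
for n, value in enumerate(values, start=1):
    print(n, round(value, 4))
```

The first value is below 1, and every later value exceeds 1, consistent with the conclusion that only n = 1 is compatible with |Disc(F)| = 1.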


Consider the quantity occurring in the Minkowski bound:

$$\mu(n, r_2) = \frac{n^n}{n!}\left(\frac{\pi}{4}\right)^{r_2}.$$

This can be analytically approximated, using a “double-inequality” version of Stirling’s formula, due to H. Robbins11. Specifically, Robbins proved that

$$n! = \sqrt{2\pi}\,n^{n + 1/2}\,e^{-n}\,e^{r_n} = \sqrt{2\pi n}\,n^n\,e^{-n + r_n},$$

where for all n ≥ 1,

$$\frac{1}{12n + 1} < r_n < \frac{1}{12n}.$$

11 Herbert Robbins. A remark on Stirling’s formula. Amer. Math. Monthly, 62:26–29, 1955. ISSN 0002-9890

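From this formula, we deduce that

$$\mu(n, r_2) = \frac{e^{n - r_n}}{\sqrt{2\pi n}}\left(\frac{\pi}{4}\right)^{r_2}.$$

As $\sqrt{|\mathrm{Disc}(F)|} \ge \mu(n, r_2)$, raising both sides of this inequality to the 2/n power yields

$$|\mathrm{Disc}(F)|^{1/n} \ge e^2\left(\frac{\pi}{4}\right)^{2r_2/n}\left(\frac{e^{-2r_n/n}}{\sqrt[n]{2\pi n}}\right).$$

Now we may expand

$$e^2 = e^{2n/n} = (e^2)^{\frac{r_1}{n} + \frac{2r_2}{n}},$$

and we find

$$\sqrt[n]{|\mathrm{Disc}(F)|} \ge (e^2)^{r_1/n}\left(e^2\pi/4\right)^{2r_2/n}\left(\frac{e^{-2r_n/n}}{\sqrt[n]{2\pi n}}\right).$$

The final term approaches 1 as n approaches infinity. It follows that

$$\sqrt[n]{|\mathrm{Disc}(F)|} \gtrsim (7.3)^{r_1/n}\,(5.8)^{2r_2/n},$$

where the symbol ≳ denotes that the left side is greater than the right side, for all sufficiently large values of n.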
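The constants in this estimate can be reproduced directly. The following sketch checks Robbins' double inequality for small n using the log-gamma function, then evaluates e^2 and e^2·π/4; the range of n tested is an arbitrary choice kept small to stay well within floating-point accuracy.

```python
import math

def robbins_r(n):
    """r_n defined by n! = sqrt(2*pi*n) * n^n * exp(-n + r_n)."""
    stirling_log = 0.5 * math.log(2 * math.pi * n) + n * math.log(n) - n
    return math.lgamma(n + 1) - stirling_log

checks = [1 / (12 * n + 1) < robbins_r(n) < 1 / (12 * n) for n in range(1, 50)]

real_const = math.e ** 2                    # base attached to the real places
complex_const = math.e ** 2 * math.pi / 4   # base attached to the complex places
print(all(checks), real_const, complex_const)
```

The two constants come out near 7.39 and 5.80, matching the rounded values 7.3 and 5.8 in the asymptotic bound.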

3.3 Exercises

Exercise 3.1 Suppose that F is a number field, and Disc(F) = ±4. Describe the possible values of r1, r2, and the degree n, using the “Minkowski bounds”. How about Disc(F) = 5, 8? How about Disc(F) = −7, −8?


Exercise 3.2 Suppose that L is an even unimodular lattice, in an 8-dimensional vector space endowed with a nondegenerate, positive-definite, symmetric bilinear form. Prove that L contains a vector v such that 〈v, v〉 = 2. Hint: you might as well assume that L sits in the Euclidean space $\mathbb{R}^8$. Compute (or look up) the volume of an 8-dimensional sphere of radius ρ for some values of ρ < 4.

Exercise 3.3 Consider the number field F = Q(√2, √−3). Compute the discriminant of F. Find OF. Recompute the discriminant of OF by computing the volume of a parallelotope in $\mathbb{C}^2$.

4 Ideals

One of the gems of number theory, from the second half of the nineteenth century, is the theory of algebraic integers and ideals. While germs of ideal theory can be found in the work of Kummer and Kronecker, the modern theory owes much to Dedekind’s 1877 work1 (translated into English with an historical introduction by Stillwell2).

1 Richard Dedekind. Sur la théorie des nombres entiers algébriques. Darboux Bull. (2), I:17–41; 69–92; 114–164; 207–248, 1877

2 Richard Dedekind. Theory of algebraic integers. Cambridge Mathematical Library. Cambridge University Press, Cambridge, 1996. ISBN 0-521-56518-9. Translated from the 1877 French original and with an introduction by John Stillwell

We develop the theory of “ideal factorization” in this chapter. As the theory of unique factorization of integers into primes fails in most rings of integers (such as $\mathbb{Z}[e^{2\pi i/23}]$, for example), one replaces this by the unique factorization of ideals into prime ideals (valid in any “Dedekind domain”).

4.1 Ideals and orders

Let F be a number field, with n = [F : Q]. Let OF be the ring of integers in F. We have studied Z-lattices in F quite extensively, and proven that OF is a Z-lattice in F. It is important not only to study OF, but other orders in F:

Definition 4.1 An order in F is a Z-lattice O in F, such that O is a subring of F.

It turns out that every order O in F is contained in the ring of integers OF; for that reason, OF is often referred to as the maximal order in F.

Proposition 4.2 Suppose that O is an order in the number field F. Then every element of O is integral over Z, i.e., O ⊂ OF. Hence OF is the unique maximal order in F.

Proof: Let α be an element of O. Since O is a Z-lattice and a ring, we may consider the “multiplication by α” endomorphism mα of the Z-module O. Then mα, as an endomorphism of a free finite-rank Z-module, has a characteristic polynomial which is monic with integer coefficients. By the Cayley-Hamilton theorem, we find that mα is a root of this polynomial, from which we deduce that α is a root of this polynomial in O. Thus α is integral over Z.

Within an order O in F, other arithmetically important lattices in F are given by ideals in O:

Proposition 4.3 Suppose that I is a nonzero ideal in the ring O. Then [O : I] is finite3, and I is a Z-lattice in F. Furthermore, the index [O : I] is bounded above by $|N_{F/\mathbb{Q}}(\alpha)|^n$, for every nonzero α ∈ I.

3 By [O : I], we mean the index of I in O, as a group under addition.

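Proof: Suppose that α ∈ I and α ≠ 0. Then $0 \ne N_{\mathbb{Q}(\alpha)/\mathbb{Q}}(\alpha) \in \mathbb{Z}$. Moreover, writing d = [Q(α) : Q], the identity

$$\alpha^d - \mathrm{Tr}_{\mathbb{Q}(\alpha)/\mathbb{Q}}(\alpha)\,\alpha^{d-1} + \cdots \pm N_{\mathbb{Q}(\alpha)/\mathbb{Q}}(\alpha) = 0$$

implies that $N_{\mathbb{Q}(\alpha)/\mathbb{Q}}(\alpha)$ is a multiple of α in O. It follows that

$$N = N_{\mathbb{Q}(\alpha)/\mathbb{Q}}(\alpha) \in I \subset O.$$

It follows that NO ⊂ I ⊂ O. Therefore, we find a bound on [O : I]:

$$[O : I] \text{ divides } [O : NO] = |N|^n.$$

It follows that I is a Z-lattice in F.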

Corollary 4.4 Suppose that P is a nonzero prime ideal in the order O. Then P is a maximal ideal.4

4 We say that O is one-dimensional since any maximal chain of prime ideals has length 1: (0) ⊂ P for some nonzero prime ideal P which is maximal.

Proof: By the previous proposition, we find that kP = O/P is a finite integral domain. Suppose that 0 ≠ α ∈ kP; let mα denote the multiplication by α map from kP to kP. Then mα is injective, since kP is an integral domain. By the finiteness of kP, we find that mα is surjective; hence there exists β ∈ kP such that αβ = 1 ∈ kP. Thus kP is a field.

Corollary 4.5 Suppose that I1 ⊂ I2 ⊂ · · · is a chain of ideals in the ring O. Then the chain stabilizes, i.e., there exists a natural number m such that Ii = Ij for all i, j ≥ m. In other words, the ring O is Noetherian.

Proof: Discarding any zero ideals at the start of the chain, we may assume that every Ii is nonzero. We have seen that the indices [O : Ii] are then finite for all i; moreover, the inclusions imply divisibility:

$$[O : I_i] \text{ divides } [O : I_j], \text{ if } i > j.$$

Since any decreasing chain of natural numbers eventually stabilizes, we find that these indices stabilize, and hence the chain of ideals stabilizes.


Definition 4.6 Suppose that I is a nonzero ideal in O. The norm of I (in O), denoted N(I), is defined to be the index [O : I], or equivalently, the order of the finite ring O/I.

For principal ideals, the norm coincides with our previous definition:

Proposition 4.7 Suppose that α is a nonzero element of O. Then $|N_{F/\mathbb{Q}}(\alpha)| = N((\alpha))$, where (α) denotes the principal ideal generated by α.

Proof: By definition, N((α)) = [O : (α)] = [O : αO]. But the index of αO in O is precisely the absolute value of the determinant of the endomorphism mα of O, which is precisely the absolute value of $N_{F/\mathbb{Q}}(\alpha)$.
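Proposition 4.7 can be tested by brute force in the order Z[i]. The sketch below counts cosets of the principal ideal (a + bi) and compares the count with the determinant of the multiplication matrix. It relies on one assumption worth stating: two Gaussian integers are congruent modulo α exactly when their products with the conjugate of α agree modulo N = a^2 + b^2, which holds because α times its conjugate equals N. The function names are illustrative.

```python
def count_residues(a, b):
    """Number of cosets of the ideal (a + bi) in Z[i], found by brute force."""
    n = a * a + b * b          # N = alpha * conj(alpha) lies in the ideal (alpha)
    classes = set()
    for x in range(n):
        for y in range(n):
            # (x + yi) * (a - bi) = (ax + by) + (ay - bx) i, reduced modulo N
            classes.add(((a * x + b * y) % n, (a * y - b * x) % n))
    return len(classes)

def mult_matrix_det(a, b):
    """Determinant of multiplication by a + bi in the basis (1, i): det [[a, -b], [b, a]]."""
    return a * a - (-b) * b

for a, b in [(1, 1), (2, 1), (3, 0), (2, 3)]:
    print((a, b), count_residues(a, b), abs(mult_matrix_det(a, b)))
```

In each case the number of residue classes equals the absolute value of the determinant, i.e., the norm of α.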

4.2 Operations on lattices and ideals

Fix an order O in F, for this section. We introduce some operations on ideals which are valid more generally for lattices in F.

Definition 4.8 Suppose that L and M are Z-lattices in F. The sum of lattices L + M is defined to be the smallest Z-submodule of F containing both L and M, or equivalently, the Z-module whose elements are all sums of elements of L with elements of M. The product of lattices L · M is defined to be the smallest Z-submodule of F containing all elements v · w, whenever v ∈ L and w ∈ M. Equivalently, L · M is the lattice whose elements are finite sums of the form ∑ vi · wi, where vi, wi are elements of L and M, respectively.

Proposition 4.9 Given two Z-lattices L and M in F, L + M is a lattice in F and L · M is a lattice in F. Furthermore, if L and M are stable under the action of O, i.e., OL = L and OM = M, then the same is true for L + M and L · M: O · (L + M) = (L + M) and O · (L · M) = L · M.

Proof: The fact that L + M and L · M are Z-submodules of F follows directly from their definition. To demonstrate that these are lattices, we use a typical squeeze argument. There exists a nonzero integer ℓ such that ℓL ⊂ M by Proposition 1.4. It follows that

$$L + M \subset \ell^{-1}M + M \subset \ell^{-1}M, \quad \text{and} \quad \ell L \subset L + M.$$

Hence L + M is a lattice. There also exist nonzero integers ℓ, m, such that ℓL ⊂ O and mM ⊂ O. It follows that

$$L \cdot M \subset (\ell m)^{-1} O.$$

Also, there exist nonzero integers ℓ′, m′ such that ℓ′O ⊂ L and m′O ⊂ M. Hence

$$O \cdot O = O \subset (\ell' m')^{-1} L M,$$

so that (ℓ′m′)O ⊂ L · M. Hence L · M is a lattice.

Now, if L and M are stable for the action of O, then certainly O · (L + M) = (L + M) and O · (L · M) = (L · M), by the distributive and associative properties of multiplication.

In particular, when L and M are ideals in O, we find that L + M and L · M are ideals in O. There are two inclusions to remember in this case:

Proposition 4.10 Suppose that L and M are ideals in O. Then

$$(L \cup M) \subset L + M, \quad \text{and} \quad L \cdot M \subset L \cap M.$$

Proof: The first inclusion, L ∪ M ⊂ L + M, is obvious from the definition of L + M (and is true for lattices, and not just ideals). The second inclusion follows directly from the definition of an ideal.

Next, we discuss the “lattice quotient”, in a number field F.

Definition 4.11 Suppose that L and M are lattices in a number field F. Define the lattice quotient (L ÷ M) to be the Z-submodule of F given by

$$L \div M = \{\alpha \in F \text{ such that } \alpha\beta \in L \text{ for all } \beta \in M\}.$$

Proposition 4.12 Suppose that L and M are lattices in a number field F. Then (L ÷ M) is also a lattice in F and M · (L ÷ M) ⊂ L. If L is O-stable, then (L ÷ M) is O-stable.

Proof: Let (m1, . . . , mn) be a Z-basis of M. Then

$$(L \div M) = \bigcap_{i=1}^{n} L \cdot m_i^{-1}.$$

Therefore (L ÷ M) is the intersection of a finite number of lattices, and so it is a lattice. The fact that M · (L ÷ M) ⊂ L follows directly from the definition of (L ÷ M). If OL = L, then one can check directly that O(L ÷ M) = (L ÷ M).

Next, we discuss endomorphisms of lattices. Suppose that L is a lattice in F. Then, we may consider the (noncommutative, unless n = 1) ring of endomorphisms5 of L as a Z-module:

$$\mathrm{End}_{\mathbb{Z}}(L) = \{\mathbb{Z}\text{-linear maps } e : L \to L\}.$$

5 Since L is isomorphic to $\mathbb{Z}^n$ as a Z-module, EndZ(L) is isomorphic to Mn(Z), the algebra of n by n integer matrices.

When OL = L, so that L is naturally an O-module, we may consider the subring of O-linear endomorphisms:

$$\mathrm{End}_{O}(L) = \{O\text{-linear maps } e : L \to L\}.$$

Since any O-linear endomorphism of L extends uniquely to an O-linear endomorphism of F = Q · L, on which it is an F-linear endomorphism, we find an inclusion EndO(L) ↪ EndF(F) = F. In other words, every element e ∈ EndO(L) has the form e(v) = α · v for some α ∈ F, uniquely determined by e (and depending only on e and not on v). In this way, we identify EndO(L) with its canonical image in F. Another characterization of its image is the following:

$$\mathrm{End}_{O}(L) = (L \div L) = \{\alpha \in F : \alpha L \subset L\}.$$

Proposition 4.13 Suppose that L is a lattice in F, and OL = L. Then EndO(L) is an order in F containing O.

Proof: We have seen that EndO(L) is a subring of F, so it remains to prove that EndO(L) is a lattice in F. Since OL = L, we have O ⊂ EndO(L). On the other hand, EndO(L) = (L ÷ L) is a lattice, and hence is an order in F.

Corollary 4.14 Suppose that L is a lattice in F, and OF L = L. Then EndOF(L) = (L ÷ L) = OF.

Proof: Since EndOF(L) is an order containing OF, and OF is a maximal order, EndOF(L) = OF.

Proposition 4.15 Suppose that L = αO is a nonzero principal ideal in the order O. Then EndO(L) = O.

Proof: The previous proposition implies that EndO(L) contains O. Furthermore, if β ∈ EndO(L), then β · α ∈ αO. It follows that β ∈ O.


4.3 Prime ideals in orders

While the factorization of ideals into prime ideals does not behave well in general orders, the following lemma holds in this level of generality.

Lemma 4.16 Suppose that I is a nonzero ideal in an order O. Then there exist nonzero prime ideals P1, . . . , Pt of O such that I contains P1 · · · Pt and N(I) = ∏ N(Pi). Furthermore, if P is a prime ideal containing I, then P occurs among the prime ideals P1, . . . , Pt.

Proof: Consider the quotient M = O/I, as an O-module. There exists a filtration

$$M = M_0 \supset M_1 \supset \cdots \supset M_t = 0$$

of this module, such that the successive quotients $M_i/M_{i+1}$ are simple O-modules. As each submodule of M corresponds to an ideal containing I, we find a sequence of ideals

$$I = I_0 \subset I_1 \subset \cdots \subset I_t = O,$$

for which $M_i = I_{t-i}/I$, and so each quotient $I_i/I_{i-1}$ (for 1 ≤ i ≤ t) is a simple O-module.

As a nonzero, finite (in cardinality!), simple O-module, each quotient $I_i/I_{i-1}$ is cyclic, generated by some (any nonzero) element $e_i \in I_i/I_{i-1}$. Define

$$P_i = \mathrm{Ann}(e_i) = \{\alpha \in O : \alpha e_i = 0\}.$$

The map $O/P_i \to I_i/I_{i-1}$, sending α to $\alpha e_i$, is an isomorphism of O-modules by the simplicity of the target. Furthermore, the simplicity implies that Pi is a maximal ideal. By analyzing indices, we see that

$$\begin{aligned}
N(I) = [O : I] &= [I_t : I_{t-1}] \cdots [I_1 : I_0] \\
&= [O : P_t] \cdots [O : P_1] \\
&= N(P_t) \cdots N(P_1).
\end{aligned}$$

It now follows that there are nonzero prime (maximal) ideals P1, . . . , Pt such that $P_i \cdot I_i \subset I_{i-1}$. Hence $P_1 \cdots P_t = P_1 \cdots P_t \cdot I_t \subset I_0 = I$.

Furthermore, if P is any prime ideal containing I, then we claim that P contains some ideal Pi. Indeed, if P contains none of the ideals Pi, then there exist elements αi ∈ Pi for all i such that αi ∉ P. However, since I contains P1 · · · Pt, we find that ∏ αi ∈ I ⊂ P. Since P is prime, it follows that one of the factors αi must lie in P, a contradiction. It follows that P contains some ideal Pi; since both are nonzero, both are maximal, and it follows that P = Pi.

To begin our approach to factorization, we have


Theorem 4.17 Suppose that O is an order in F, and P a nonzero prime ideal in O. Then (O ÷ P) contains, but is not equal to, O.

Proof: It is clear from the definition that (O ÷ P) contains O. To see that equality does not hold, let α be a nonzero element of P, and consider I = αO ⊂ P ⊂ O. By the previous lemma, there are prime ideals P1, . . . , Pt such that I ⊃ PP1 · · · Pt. If I contains P1 · · · Pt, then P = Pi for some 1 ≤ i ≤ t; assuming that i = t in this case, we have I ⊃ PP1 · · · Pt−1. Continuing in this fashion, we may “cancel” as many factors of P as possible until

$$P_1 \cdots P_u \not\subset I \quad \text{and} \quad P P_1 \cdots P_u \subset I.$$

Fix β ∈ (P1 · · · Pu) \ I. Let γ = β/α ∈ F, so that γ ∉ O (since β ∉ I = αO). Then we find that γP ⊂ O; hence γ ∈ (O ÷ P).

Corollary 4.18 Suppose that O is an order in F and P is a nonzero prime ideal in O. Then either P · (O ÷ P) = O, or else (P ÷ P) contains but does not equal O.

Proof: In general, we have P = O · P ⊂ (O ÷ P) · P ⊂ O. If (O ÷ P) · P ≠ O, then we find that (O ÷ P) · P is a proper ideal in O (since it is a sub-O-module of O). Furthermore, (O ÷ P) · P contains P, a maximal ideal, and hence (O ÷ P) · P = P. By the previous theorem, there exists γ ∈ (O ÷ P), such that γ ∉ O. But then γP ⊂ P, so that γ ∈ (P ÷ P). Hence (P ÷ P) contains, but does not equal, O.

4.4 Factorization in the maximal order

We may characterize the nonzero ideals in any order O precisely as the sublattices of O which are stable under multiplication by O. Such a characterization leads directly to a generalization:

Definition 4.19 A fractional ideal in F is a Z-lattice L ⊂ F, such that6 OF · L = L.

6 Note that the definition of a fractional ideal references the maximal order OF, not an arbitrary order O.

In particular, a nonzero ideal in OF is simply a fractional ideal which is contained in OF. Just as one may construct a principal ideal (α), for every element α ∈ OF, one may construct a principal fractional ideal (α) for every nonzero element α ∈ F:

$$(\alpha) = \mathcal{O}_F\,\alpha = \{z \in F : z = \beta\alpha \text{ for some } \beta \in \mathcal{O}_F\}.$$


Proposition 4.20 For any nonzero element α ∈ F, (α) is a fractional ideal.

Proof: To see that (α) is a fractional ideal, it is clear that (α) is a Z-submodule of F and stable under multiplication by OF. To see that (α) is a lattice, recall that by Proposition 2.8 there exists a nonzero N ∈ Z such that Nα ∈ OF. Similarly, there exists a nonzero M ∈ Z such that $M\alpha^{-1} \in \mathcal{O}_F$. It follows that if z ∈ M OF, then $z\alpha^{-1} \in M\alpha^{-1}\mathcal{O}_F$, so $z\alpha^{-1} \in \mathcal{O}_F$ and z ∈ αOF. We have proven that

$$M\,\mathcal{O}_F \subset (\alpha) \subset \frac{1}{N}\,\mathcal{O}_F.$$

It follows that (α) is a Z-lattice in F, so (α) is a fractional ideal.

A direct consequence of Proposition 4.9 and Proposition 4.12 is the following

Proposition 4.21 If L and M are fractional ideals in F, then L + M and L · M are fractional ideals in F. Also, L ÷ M is a fractional ideal in F.

Inversion of fractional ideals leads to a very important cancellation principle; we begin by proving this for prime ideals.

Lemma 4.22 Suppose that P is a nonzero prime ideal in OF. Then, the fractional ideal (OF ÷ P) satisfies (OF ÷ P) · P = OF.

Proof: If (OF ÷ P) · P ≠ OF, then by Corollary 4.18, we find that End(P) = (P ÷ P) properly contains OF; but this contradicts the maximality of the order OF.

The most important implication of this lemma is the following “prime cancellation” lemma:

Lemma 4.23 Suppose that P is a nonzero prime ideal in OF. Then, for any fractional ideals I, J in F, I = J if and only if I · P = J · P.

Proof: If I = J, then clearly I · P = J · P. On the other hand, if I · P = J · P, then I · P · (OF ÷ P) = J · P · (OF ÷ P). Hence I · OF = J · OF. As I and J are fractional ideals, this implies that I = J.

With this lemma in place, it makes sense to define

Definition 4.24 Suppose that I is a fractional ideal. Define the inverse fractional ideal $I^{-1} = (\mathcal{O}_F \div I)$.

In particular, $P^{-1} \cdot P = \mathcal{O}_F$, for every nonzero prime ideal P of OF. We will see that this generalizes to any fractional ideal.


Theorem 4.25 (Dedekind) Every nonzero ideal I in OF can be expressed as the (possibly empty) product P1 · · · Pt of prime ideals, and this factorization is unique up to order.

Proof: If I is a nonzero proper ideal in OF, then we have seen that there is a chain of ideals I = I0 ⊂ · · · ⊂ It = OF for which the successive quotients $I_i/I_{i-1}$ are simple OF-modules isomorphic to quotients OF/Pi for some maximal ideals Pi, and the data P1, . . . , Pt is uniquely determined (up to order) by I, using the Jordan-Hölder theorem. The number t is called the length of the module OF/I.

We prove that I = P1 · · · Pt in this circumstance, by induction on the length of OF/I.

When t = 1, we find that the only prime ideal containing I is P1, and P1 ⊂ I ⊂ P1 ⊂ OF, which implies that I = P1. Now, we assume that t > 1, and that the result has been proven for smaller lengths. In particular, we find that

$$P_1 \cdots P_t \subset I = I_0 \subset I_1 \subset \cdots \subset I_t = \mathcal{O}_F.$$

As I1/I0 is isomorphic to OF/P1, we find that the Jordan-Hölder components for OF/I1 are precisely P2, . . . , Pt. It follows from the inductive hypothesis that

$$P_2 \cdots P_t = I_1 \subset \cdots \subset I_t = \mathcal{O}_F.$$

Thus we find that

$$P_1 I_1 = P_1 P_2 \cdots P_t \subset I \subset I_1 = P_2 \cdots P_t.$$

Multiplying by $P_2^{-1} \cdots P_t^{-1}$ yields

$$P_1 \subset I\,P_2^{-1} \cdots P_t^{-1} \subset \mathcal{O}_F.$$

If $\mathcal{O}_F = I\,P_2^{-1} \cdots P_t^{-1}$, then we find that I = P2 · · · Pt. This implies that OF/I has a Jordan-Hölder series of length t − 1, a contradiction. Hence, the maximality of P1 implies that

$$P_1 = I\,P_2^{-1} \cdots P_t^{-1}.$$

Therefore, I = P1 · · · Pt.

For uniqueness, one may apply the uniqueness of the simple factors in the Jordan-Hölder series of OF/I.
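A standard illustration of Theorem 4.25 is the factorization of the ideal (6) in Z[√−5], the maximal order of Q(√−5). The sketch below is an assumption-laden toy, not a general algorithm: elements x + y√−5 are stored as integer pairs (x, y), an ideal is stored by a Hermite-style Z-basis ((d, 0), (e, g)), and all function names are hypothetical. It checks that (6) equals the product of the squares and conjugates of the primes above 2 and 3.

```python
from math import gcd

def mul(u, v):
    """Product of x + y*sqrt(-5), stored as integer pairs (x, y)."""
    return (u[0] * v[0] - 5 * u[1] * v[1], u[0] * v[1] + u[1] * v[0])

def hnf(vectors):
    """Basis ((d, 0), (e, g)) of the Z-span of integer vectors in Z^2 (rank 2 assumed)."""
    px, py = 0, 0    # running vector whose second coordinate is the gcd so far
    d = 0            # gcd of first coordinates of the vectors reduced to (x, 0)
    for x, y in vectors:
        while y != 0:                      # Euclid's algorithm on second coordinates
            q = py // y
            px, py, x, y = x, y, px - q * x, py - q * y
        d = gcd(d, x)
    if py < 0:
        px, py = -px, -py
    return (d, 0), (px % d, py)

def ideal(*gens):
    """Ideal of Z[sqrt(-5)] generated by gens, as a Z-lattice basis."""
    return hnf([w for g in gens for w in (g, mul(g, (0, 1)))])

def ideal_product(I, J):
    return ideal(*[mul(u, v) for u in I for v in J])

def ideal_norm(I):
    (d, _), (_, g) = I
    return d * g     # index of the lattice in Z^2

P2 = ideal((2, 0), (1, 1))     # (2, 1 + sqrt(-5))
P3 = ideal((3, 0), (1, 1))     # (3, 1 + sqrt(-5))
P3c = ideal((3, 0), (1, -1))   # (3, 1 - sqrt(-5))

six = ideal_product(ideal_product(P2, P2), ideal_product(P3, P3c))
print(six, ideal((6, 0)), ideal_norm(P2), ideal_norm(P3), ideal_norm(P3c))
```

The norms 2, 2, 3, 3 multiply to 36 = N((6)), in line with the multiplicativity of the norm on products of prime ideals.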

Lemma 4.26 Suppose that I, J are fractional ideals, with I ⊂ J. Then, there exists an ideal M in OF, such that J · M = I.

Proof: There exists a nonzero integer n, such that n · I ⊂ n · J ⊂ OF. It follows that both nI and nJ are ideals in OF, and hence have factorizations into prime ideals:

$$n \cdot I = P_1 \cdots P_s, \quad \text{and} \quad n \cdot J = Q_1 \cdots Q_t.$$

The containment nI ⊂ nJ implies that Q1 (as a Jordan-Hölder component of OF/nJ) also occurs in the factorization of nI (as a Jordan-Hölder component of OF/nI). Without loss of generality, we find that Q1 = P1. Thus we find that $nI\,P_1^{-1} \subset nJ\,P_1^{-1}$. Continuing this process, we see that

$$nI = P_1 \cdots P_s = Q_1 \cdots Q_t\,P_{t+1} \cdots P_s = nJ\,P_{t+1} \cdots P_s.$$

It follows that $I = J \cdot (P_{t+1} \cdots P_s)$.

Proposition 4.27 Suppose that J is a fractional ideal in F. Then there exist ideals I, M ⊂ OF, such that J · M = I.

Proof: If J is a fractional ideal in F, then I = J ∩ OF is an ideal in OF. Applying the previous lemma to the inclusion I ⊂ J, we find that there exists an ideal M in OF such that J · M = I.

Theorem 4.28 The set I(F) of fractional ideals in F forms a group underthe operation (I, J) 7→ I · J. The identity element is furnished by OF, andthe inverse is furnished by I 7→ I−1 = (OF ÷ I). This group is a free abeliangroup, generated by the set of nonzero prime ideals in OF.

Proof: The operation I, J 7→ I · J is certainly commutative andassociative, and OF · I = I for all fractional ideals I. Thus I(F) is acommutative monoid.7 It remains to check that I · I−1 = OF for all 7 A monoid is an algebraic structure

given by a set with a binary composi-tion and identity element, satisfying allof the group axioms with the exceptionof the axiom of inverses.

fractional ideals I. If I is an ideal in OF, then I = P1 · · · Pt for someprime ideals. I−1 is a fractional ideal which contains OF. It followsthat there exists an ideal J in OF, such that I−1 · J = OF. The ideal Jhas a factorization into prime ideals J = Q1 · · ·Qs. Since I · I−1 ⊂ OF,we find that

I−1 · P1 · · · Pt ⊂ I−1 ·Q1 · · ·Qs.

Multiplying through by J (which satisfies J I−1 = OF we find that

P1 · · · Pt ⊂ Q1 · · ·Qs.

It follows that each prime Q1, . . . , Qs occurs among the primesP1, . . . , Pt. After reordering, if necessary, we find that

P1 = Q1, . . . , Ps = Qs.

Hence, we find that

\[ I^{-1} = Q_1^{-1} \cdots Q_s^{-1} = P_1^{-1} \cdots P_s^{-1} \subset P_1^{-1} \cdots P_t^{-1} \subset I^{-1}, \]

where the last inclusion follows from the fact that I = P1 · · · Pt. It follows by squeezing that

\[ P_1^{-1} \cdots P_s^{-1} = P_1^{-1} \cdots P_t^{-1}. \]

Therefore, Ps+1 · · · Pt = OF.

Hence s = t, $I^{-1} = P_1^{-1} \cdots P_t^{-1}$, and I · I−1 = OF. We have proven that I(F) is an abelian group.

This group is certainly generated by the nonzero prime ideals in OF, since it is generated by the monoid of ideals in OF, each of which may be factored into prime ideals. If there were any relation among these prime ideals:

\[ P_1^{e_1} \cdots P_s^{e_s} = \mathcal{O}_F, \]

for integers e1, . . . , es, then one may multiply both sides by powers of prime ideals in order to find such a relation in positive powers of the prime ideals Pi. No nontrivial relation among such powers exists, by the uniqueness of prime factorization for ideals in OF. Hence the group I(F) is a free abelian group generated by the nonzero prime ideals in OF.

Corollary 4.29 The norm, I ↦ N(I), initially defined for ideals in the maximal order OF, extends uniquely to an abelian group homomorphism from I(F) to R×pos (the group of positive real numbers, under multiplication).

Proof: For any ideal I in OF, we have seen that I has a prime factorization I = P1 · · · Pt, and N(I) equals ∏ N(Pi). From the uniqueness of factorization, we find that if J is another ideal in OF, with factorization J = Q1 · · · Qs, then

N(I J) = N(P1) · · · N(Pt) N(Q1) · · · N(Qs) = N(I) N(J).

Since I(F) is generated by the ideals in OF, it follows that N extends uniquely to a group homomorphism from I(F) to R×pos.

According to this corollary, we may define the norm of a fractional ideal I, by expressing I as a quotient I = (J ÷ K) of ideals in OF, and defining N(I) = N(J) N(K)−1. Alternatively, given a fractional ideal I, there exists a nonzero integer N such that N · I is an ideal in OF. It follows that

|N(N)| ·N(I) = N(N · I) = [OF : NI].
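To make Theorem 4.28 and Corollary 4.29 concrete, here is a minimal Python sketch in the simplest case F = Q, where OF = Z, the nonzero prime ideals are pZ for prime numbers p, and N(pZ) = p. A fractional ideal qZ is stored as a dictionary of prime exponents, so that multiplication adds exponents and inversion negates them. The function names (`factor`, `ideal`, `multiply`, `inverse`, `norm`) are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction

def factor(n):
    """Prime factorization of a positive integer as {prime: exponent}."""
    exps, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            exps[p] = exps.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        exps[n] = exps.get(n, 0) + 1
    return exps

def ideal(q):
    """Exponent vector of the fractional ideal qZ, for a nonzero rational q."""
    q = Fraction(q)
    exps = factor(abs(q.numerator))
    for p, e in factor(q.denominator).items():
        exps[p] = exps.get(p, 0) - e
    return exps

def multiply(I, J):
    """Product of fractional ideals: add exponents, drop zero entries."""
    exps = dict(I)
    for p, e in J.items():
        exps[p] = exps.get(p, 0) + e
    return {p: e for p, e in exps.items() if e != 0}

def inverse(I):
    """Inverse fractional ideal: negate every exponent."""
    return {p: -e for p, e in I.items()}

def norm(I):
    """Norm extended multiplicatively, using N(pZ) = p."""
    result = Fraction(1)
    for p, e in I.items():
        result *= Fraction(p) ** e
    return result
```

In this encoding the identity element OF = Z is the empty dictionary, and `multiply(I, inverse(I))` returns it for every I, mirroring I · I−1 = OF.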

Proposition 4.30 If J is a fractional ideal in F, then $\mathrm{Disc}(J) = \mathrm{Disc}(F)\, N(J)^2$.

Proof: We have seen this result, when J is an ideal in OF. When J is a fractional ideal, let N be a nonzero integer such that N · J is an ideal in OF. Since scaling a lattice of rank n by N multiplies its discriminant by $N^{2n} = N(N)^2$, we find that

\[ N(N)^2\, \mathrm{Disc}(J) = \mathrm{Disc}(N \cdot J) = \mathrm{Disc}(F)\, N(N \cdot J)^2 = \mathrm{Disc}(F)\, N(N)^2\, N(J)^2, \]

from which the result follows.

4.5 The class group

Within the group I(F) of fractional ideals in F, a natural subgroup is provided by the set P(F) of principal ideals αOF for various α ∈ F×. Two elements α and β of F× generate the same principal ideal if and only if αβ−1 ∈ O×F. In this way, we may identify groups

P(F) = F×/O×F.

As the multiplication on F× is compatible with the multiplication of ideals, we find a subgroup embedded in I(F):

F×/O×F = P(F) ↪ I(F).

Definition 4.31 The ideal class group of F, denoted H(F), is the quotient H(F) = I(F)/P(F). The class number of F, denoted h(F), is the cardinality of the group H(F).

The class group fits into a very important 4-term exact sequence of abelian groups:

1 → O×F → F× → I(F) → H(F) → 1.

In particular, we find that H(F) is the trivial group if and only if OF is a principal ideal domain.

Theorem 4.32 The ideal class group of F is a finite abelian group. Furthermore, every ideal class in H(F) is represented by an ideal in OF, whose norm is bounded by

\[ \eta(n, r_2, \mathrm{Disc}(\mathcal{O}_F)) = \frac{n!}{n^n} \left( \frac{4}{\pi} \right)^{r_2} \sqrt{|\mathrm{Disc}(\mathcal{O}_F)|}. \]

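Proof: Suppose that J ∈ I(F) is a fractional ideal, and consider the inverse J−1. By the Minkowski estimates of the previous chapter, there exists a nonzero element α of the lattice J−1 such that

\[ |N(\alpha)| \le \frac{n!}{n^n} \left( \frac{4}{\pi} \right)^{r_2} \sqrt{|\mathrm{Disc}(J^{-1})|} = \frac{n!}{n^n} \left( \frac{4}{\pi} \right)^{r_2} \frac{\sqrt{|\mathrm{Disc}(\mathcal{O}_F)|}}{N(J)}. \]

Above, we use the fact that J · J−1 = OF, and so N(J−1) = N(J)−1, and thus

\[ \mathrm{Disc}(J^{-1}) = \mathrm{Disc}(\mathcal{O}_F)\, N(J^{-1})^2 = \mathrm{Disc}(\mathcal{O}_F)\, N(J)^{-2}. \]

Since α ∈ J−1, we have αOF ⊂ J−1, and I := (α) · J ⊂ OF. Note that I and J are in the same class (in the ideal class group). We find that

\[ N(I) = |N(\alpha)| \cdot N(J) \le \frac{n!}{n^n} \left( \frac{4}{\pi} \right)^{r_2} \sqrt{|\mathrm{Disc}(\mathcal{O}_F)|}. \]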

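It follows that every fractional ideal J is in the same ideal class as an ideal I ⊂ OF, whose norm is bounded by a constant only depending on the number field F (in fact, on n, Disc(F) = Disc(OF), and r2). Since OF has only finitely many ideals of any given norm, there are only finitely many ideal classes, and H(F) is finite.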
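The bound of Theorem 4.32 is easy to evaluate numerically. The following Python sketch computes η(n, r2, Disc) for a few quadratic fields; the function name `minkowski_bound` is hypothetical, and the discriminants −20, −19 and 13 are the standard values for these fields, assumed here rather than derived.

```python
from math import factorial, pi, sqrt

def minkowski_bound(n, r2, disc):
    """eta(n, r2, Disc) = (n!/n^n) * (4/pi)^r2 * sqrt(|Disc|)."""
    return factorial(n) / n**n * (4 / pi) ** r2 * sqrt(abs(disc))

# Quadratic examples (n = 2); the discriminants are assumed standard values.
for label, r2, disc in [("Q(sqrt(-5))", 1, -20),
                        ("Q(sqrt(-19))", 1, -19),
                        ("Q(sqrt(13))", 0, 13)]:
    print(label, round(minkowski_bound(2, r2, disc), 3))
```

In each of these three cases the bound is below 3, so every ideal class is represented by an ideal of norm 1 or 2.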

4.6 Exercises

Exercise 4.1 Suppose that F is a number field, and OF is its ring of integers. Let P be an ideal in OF. Prove that if N(P) is a prime number (a prime in Z, that is), then P is a prime ideal.

Exercise 4.2 Let $F = \mathbb{Q}(\sqrt{-5})$. The ring $\mathcal{O}_F = \mathbb{Z}[\sqrt{-5}]$ is not a principal ideal domain; for example $6 = 2 \cdot 3 = (1 + \sqrt{-5})(1 - \sqrt{-5})$ is a non-unique factorization of 6 into irreducible elements.

(a) Let $P = (2, 1 + \sqrt{-5})$. Prove that P is a prime ideal.

(b) Factor the principal ideal (6) into prime ideals.

(c) Prove that H(F) is a group of order two.

(d) Suppose that p is a prime number (a prime in Z, that is). Prove that the principal ideal pOF is either prime, or else factors as $P \cdot \bar{P}$, where P is a prime ideal in OF, and $\bar{P}$ is its Galois conjugate. Extra: prove that $P = \bar{P}$, in this case, if and only if p = 2 or p = 5 (i.e., p divides Disc(OF)).

Exercise 4.3 Prove that $h(\mathbb{Q}(\sqrt{13})) = 1$.

Exercise 4.4 Prove that $h(\mathbb{Q}(\sqrt{-19})) = 1$. Hint: use the fact that any ideal class can be represented by an ideal in OF of norm 1 or 2. Prove that no prime ideal in OF has norm 2, and use multiplicativity of the norm.

Exercise 4.5 Compute h(F), where $F = \mathbb{Q}(\sqrt[3]{2})$.

5 Units

In this chapter, F will always denote a number field, and OF its ring of integers. Occasionally, we discuss an arbitrary order O ⊂ OF ⊂ F. We have seen in the previous chapter that there is an exact sequence of abelian groups

1 → O×F → F× → I(F) → H(F) → 1.

In addition, we have proved that the group H(F) is a finite abelian group. The subject of the current chapter is the group O×F, called the unit group of the number field F.

Even in simple cases, such as when F is a real quadratic field, understanding the group O×F has significant Diophantine implications. We will completely describe the structure of the group O×F, as an abelian group, and discuss a related invariant called the regulator of F.

5.1 The norm, and quadratic fields

Recall that N = NF/Q : F× → Q× is a group homomorphism; it can be computed in practice either by considering the “multiplication by α” endomorphisms of F (for α ∈ F×), or by considering the product of the images of α, under all field embeddings F → C. Furthermore, we have seen that if α ∈ OF, then N(α) ∈ Z. A consequence is the following:

Proposition 5.1 Suppose that α ∈ OF. Then α ∈ O×F if and only if N(α) = ±1.

Proof: First, suppose that α ∈ O×F, so that α−1 ∈ OF as well. Then we find that

N(α) N(α−1) = N(αα−1) = N(1) = 1.

It follows that N(α) ∈ Z× = {±1}.

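Conversely, suppose that α ∈ OF and N(α) = ±1. Recall that if e = [F : Q(α)], then $N(\alpha) = N_{F/\mathbb{Q}}(\alpha) = N_{\mathbb{Q}(\alpha)/\mathbb{Q}}(\alpha)^e$ and all of these quantities are in Z. It follows that $N_{\mathbb{Q}(\alpha)/\mathbb{Q}}(\alpha) = \pm 1$, since the only integer roots of ±1 are ±1.

Considering the minimal monic polynomial of α, of degree d = [Q(α) : Q], we find that

\[ \alpha^d - \mathrm{Tr}_{\mathbb{Q}(\alpha)/\mathbb{Q}}(\alpha)\, \alpha^{d-1} + \cdots \pm N_{\mathbb{Q}(\alpha)/\mathbb{Q}}(\alpha) = 0. \]

We find an identity in OF:

\[ \pm 1 = \alpha \cdot \left( \alpha^{d-1} - \mathrm{Tr}_{\mathbb{Q}(\alpha)/\mathbb{Q}}(\alpha)\, \alpha^{d-2} + \cdots \right). \]

It follows that α ∈ O×F (and we have in fact found a formula for α−1).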

Example 5.2 Suppose that $F = \mathbb{Q}(\sqrt{D})$ is a quadratic field, with D a square-free integer. If $\alpha = a + b\sqrt{D} \in F$, then $N(\alpha) = \alpha \bar{\alpha}$, where $\bar{\alpha} = a - b\sqrt{D}$ is the Galois conjugate of α. Thus we find that

\[ N(\alpha) = (a + b\sqrt{D})(a - b\sqrt{D}) = a^2 - Db^2. \]

If α is in OF, then we find that α ∈ O×F if and only if

\[ a^2 - Db^2 = \pm 1. \]

Recall that, depending on whether D is congruent to 1, or to 2 or 3, modulo 4, there are two possibilities for OF.

D ≡ 1, mod 4: In this case, $\mathcal{O}_F = \mathbb{Z}[\tfrac{1}{2}(1 + \sqrt{D})]$. Thus α ∈ OF if and only if

\[ \alpha = \frac{a}{2} + \frac{b}{2}\sqrt{D}, \quad \text{with } a, b \in \mathbb{Z} \text{ and } a - b \text{ even}. \]

We find that the units O×F correspond precisely to pairs of integers $(a, b) \in \mathbb{Z}^2$ such that $a^2 - Db^2 = \pm 4$ (note that this condition implies that a − b is even, since D is congruent to 1 modulo 4).

D ≡ 2, 3, mod 4: In this case, $\mathcal{O}_F = \mathbb{Z}[\sqrt{D}]$. Thus α ∈ OF if and only if $\alpha = a + b\sqrt{D}$ for some integers a, b. Thus the units of OF correspond precisely to pairs of integers $(a, b) \in \mathbb{Z}^2$ such that $a^2 - Db^2 = \pm 1$.

In particular, when D is a square-free positive integer, we find that a complete understanding of O×F is equivalent to a complete solution of “Pell’s equation”.1

1 Euler is responsible for naming such Diophantine equations after Pell, and historians (reference needed!) have suggested that Euler confused Brouncker and Pell in this attribution. In any case, there was great progress on Pell’s equation by Brahmagupta around 630 A.D., and later by Bhaskara II around 1150 A.D.
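For small D, a unit of infinite order can be found by brute force. The sketch below searches for the least b ≥ 1 such that a² − Db² = ±1 has an integer solution; the function name `smallest_unit` and the cutoff `max_b` are hypothetical choices for this illustration. Note that it only looks at elements of Z[√D], so when D ≡ 1 mod 4 it may miss smaller units with half-integer coordinates.

```python
from math import isqrt

def smallest_unit(D, max_b=10**6):
    """Search for the least b >= 1 with a^2 - D*b^2 = -1 or +1 and a >= 1.

    Returns (a, b, norm), or None if no solution has b <= max_b.
    Assumes D is a positive non-square integer.
    """
    for b in range(1, max_b + 1):
        for norm in (-1, 1):
            a2 = D * b * b + norm
            a = isqrt(a2)
            if a >= 1 and a * a == a2:
                return a, b, norm
    return None
```

For example, `smallest_unit(2)` finds 1 + √2, of norm −1. This naive search is only practical when the smallest solution is small; it is not an efficient method for Pell's equation.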

One particular case can be analyzed now by explicit methods.

Proposition 5.3 Suppose that $F = \mathbb{Q}(\sqrt{-D})$ is a quadratic field, where D is a positive square-free integer (the case r1 = 0, r2 = 1).2 Then O×F is a finite cyclic group, whose order is 4 or 6 or 2, corresponding to the cases when D = 1, D = 3, and all other cases, respectively.

2 We say that F is an imaginary quadratic field in this case.

Proof: Observe that the norm is a positive-definite quadratic form in this case: $N(a + b\sqrt{-D}) = a^2 + Db^2$. In particular the equation N(α) = ±1 implies N(α) = 1. We consider two cases now.

−D ≡ 1, mod 4: In this case, recall that the units correspond to integer solutions of $a^2 + Db^2 = 4$. If −D = −3, then we seek solutions to $a^2 + 3b^2 = 4$. There are precisely six solutions:

a = ±1, b = ±1, and a = ±2, b = 0.

If −D ≤ −7, then there are only two integer solutions to $a^2 + Db^2 = 4$: a = ±2, b = 0. We have found that when −D = −3, O×F is a cyclic group, consisting of all sixth roots of unity. When −D ≤ −7, O×F = {±1} is a cyclic group of order two.

−D ≡ 2, 3, mod 4: In this case, recall that the units correspond to integer solutions of $a^2 + Db^2 = 1$. If −D = −1, then we seek integer solutions to $a^2 + b^2 = 1$. There are precisely four solutions:

a = ±1, b = 0, and a = 0, b = ±1.

If −D < −1, then we seek integer solutions to $a^2 + Db^2 = 1$, and there are only two solutions:

a = ±1, b = 0.

We have found that when −D = −1, O×F is a cyclic group, consisting of all fourth roots of unity. When −D < −1, O×F = {±1} is a cyclic group of order two.
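The case analysis in this proof can be checked by enumeration. The following Python sketch counts the integer solutions of the relevant norm equation for a given D; the function name `count_units` is hypothetical, and the code assumes that D is a positive square-free integer.

```python
from math import isqrt

def count_units(D):
    """Number of units in the ring of integers of Q(sqrt(-D)).

    Assumes D is a positive square-free integer.
    """
    # -D = 1 mod 4: units are (a + b*sqrt(-D))/2 with a^2 + D*b^2 = 4.
    # Otherwise:    units are  a + b*sqrt(-D)    with a^2 + D*b^2 = 1.
    target = 4 if (-D) % 4 == 1 else 1
    bound = isqrt(target)  # |a| and |b| cannot exceed sqrt(target)
    return sum(1
               for a in range(-bound, bound + 1)
               for b in range(-bound, bound + 1)
               if a * a + D * b * b == target)
```

Running this for D = 1, 3 and a few other values reproduces the orders 4, 6 and 2 of Proposition 5.3.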

5.2 Geometry of units

We have seen that understanding the units O×F in a number field is equivalent to finding all solutions to N(α) = ±1 in OF; if n = [F : Q], then every element of OF can be expressed as a Z-linear combination α = a1v1 + · · · + anvn of a Z-basis of OF. The condition N(α) = ±1 can be rephrased as a Diophantine equation in the integer variables a1, . . . , an; however, the degree of this Diophantine equation is n, and solving general Diophantine equations of degree n in n variables is intractable.

However, the specific Diophantine equations involved in finding O×F are tractable, largely because the solution set O×F is a group. We will prove that O×F is a finitely-generated group, and thus every unit can be expressed as a product involving a finite number of “fundamental units” (generators).

Recall that a real place of F is a field embedding F → R, and a complex place of F is a conjugate pair of field embeddings F → C. We define VR to be the set of real places and VC to be the set of complex places of F. We write Varc for the resulting set of archimedean places, called infinite places by many authors:

\[ V_{arc} = V_{\mathbb{R}} \sqcup V_{\mathbb{C}}, \quad \text{satisfying } \# V_{arc} = r_1 + r_2. \]

To each archimedean place of F, we associate an “absolute value”; if v ∈ VR, and α ∈ F, we define |α|v = |v(α)|, where the latter denotes the usual real absolute value. For a complex place w, we use the square of the usual complex absolute value:

\[ |\alpha|_w = |w(\alpha)|^2 = w(\alpha)\, \overline{w(\alpha)}. \]

The absolute value of the norm NF/Q(α) can now be expressed using absolute values:

\[ |N(\alpha)| = \prod_{v \in V_{arc}} |\alpha|_v. \]

Thus, we can study the units O×F using the following characterization:

\[ \mathcal{O}_F^\times = \Big\{ \alpha \in \mathcal{O}_F : \prod_{v \in V_{arc}} |\alpha|_v = 1 \Big\}. \]

While this characterization is highly nonlinear, we may linearize via the “logarithmic modulus” map; let R×pos denote the group of positive real numbers, under multiplication. Then the natural logarithm provides a continuous group isomorphism

\[ \log : \mathbb{R}^\times_{pos} \to \mathbb{R}. \]

Recall that there is a unique, up to reordering and conjugation, algebra isomorphism

\[ F_{\mathbb{R}} \cong \mathbb{R}^{r_1} \times \mathbb{C}^{r_2}. \]

The logarithm, composed with the absolute value, yields the logarithmic modulus homomorphism:

\[ \lambda : F_{\mathbb{R}}^\times = (\mathbb{R}^\times)^{r_1} \times (\mathbb{C}^\times)^{r_2} \to \mathbb{R}^{r_1 + r_2}, \]

\[ \lambda(x_1, \ldots, x_{r_1}, z_1, \ldots, z_{r_2}) = (\log |x_1|, \ldots, \log |x_{r_1}|, 2 \log |z_1|, \ldots, 2 \log |z_{r_2}|). \]

The coefficient 2, at each complex place, arises since $\log |z|^2 = 2 \log |z|$, and we use the square of the usual complex absolute value. In particular, the logarithmic modulus on elements α ∈ F can be computed:

\[ \lambda(\alpha) = (\log |\alpha|_v)_{v \in V_{arc}}. \]

Let $H \subset \mathbb{R}^{r_1 + r_2}$ be the hyperplane (subspace of codimension 1), cut out by the condition:

\[ H = \Big\{ (t_1, \ldots, t_{r_1 + r_2}) : \sum_{i=1}^{r_1 + r_2} t_i = 0 \Big\}. \]

Thus H is a real vector space of dimension r1 + r2 − 1.

Proposition 5.4 The group O×F, viewed as a subset of the real vector space FR, can be identified as the intersection of the lattice OF with the set λ−1(H). In other words,

\[ \mathcal{O}_F^\times = \{ \alpha \in \mathcal{O}_F : \lambda(\alpha) \in H \}. \]

Proof: Given that α ∈ OF, we find that α ∈ O×F if and only if N(α) = ±1. This occurs if and only if |N(α)| = 1. This occurs if and only if

\[ \prod_{v \in V_{arc}} |\alpha|_v = 1. \]

Applying the logarithm to both sides, we see that this occurs if and only if

\[ \sum_{v \in V_{arc}} \log |\alpha|_v = 0. \]

This is equivalent to the condition that λ(α) ∈ H.

Figure 5.1: The lattice OF in blue dots, and the curve λ(α) ∈ H in red. Here $F = \mathbb{Q}(\sqrt{-1})$.

Figure 5.2: The lattice OF in blue dots, and the curve λ(α) ∈ H in red. Here $F = \mathbb{Q}(\sqrt{5})$.

Geometrically, the condition λ(α) ∈ H may describe a somewhat complicated region in the real vector space FR. When F is an imaginary quadratic field, this describes the unit circle in C. But when F is a real quadratic field, the condition λ(α) ∈ H describes a union of two hyperbolas.
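For a real quadratic field the logarithmic modulus is easy to compute by hand, and a short numerical sketch shows the hyperplane condition at work. The helper names `log_modulus` and `lies_on_H`, and the tolerance `tol`, are hypothetical; the two real places are taken to send √D to +√D and −√D.

```python
from math import log, sqrt

def log_modulus(a, b, D):
    """lambda(a + b*sqrt(D)) for the real quadratic field Q(sqrt(D)), D > 0.

    The two real places send sqrt(D) to +sqrt(D) and -sqrt(D).
    """
    s = sqrt(D)
    return (log(abs(a + b * s)), log(abs(a - b * s)))

def lies_on_H(a, b, D, tol=1e-9):
    """Check numerically whether the coordinates of lambda sum to zero."""
    return abs(sum(log_modulus(a, b, D))) < tol
```

The element 1 + √2 has norm −1, so `lies_on_H(1, 1, 2)` holds, while 3 + √2 has norm 7 and its image misses H. Because the check uses floating-point arithmetic, it is only a sanity test, not a proof that an element is a unit.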

5.3 Torsion units

Lemma 5.5 The function $\lambda : F_{\mathbb{R}}^\times \to \mathbb{R}^{r_1 + r_2}$ is “proper”; in other words, for every element $(t_1, \ldots, t_{r_1 + r_2}) \in \mathbb{R}^{r_1 + r_2}$, the preimage $\lambda^{-1}(t_1, \ldots, t_{r_1 + r_2})$ is a compact subset of $F_{\mathbb{R}}^\times$.

Proof: Since λ is a homomorphism, it suffices to prove that $\lambda^{-1}(0, \ldots, 0)$ is compact. This preimage can be described as a direct product:

\[ \lambda^{-1}(0, \ldots, 0) = \Big( \prod_{i=1}^{r_1} \{ \pm 1 \} \Big) \times \Big( \prod_{j=1}^{r_2} \{ z \in \mathbb{C} : |z| = 1 \} \Big). \]

As the finite product of compact spaces, this is compact.

For any field K, we write µ(K) for the torsion subgroup of K×. When F is a number field, µ(F) ⊂ O×F, since every torsion element of F× is a root of a monic polynomial of the form $X^n - 1$ for some n ≥ 1.

We will study the group O×F of units by studying its image Λ = λ(O×F), which satisfies

\[ \Lambda \subset H \subset \mathbb{R}^{r_1 + r_2}. \]

Proposition 5.6 There is a short exact sequence

1 → µ(F) → O×F → Λ → 1,

and the group µ(F) is finite.

Proof: The map O×F → Λ is surjective, by definition, since Λ = λ(O×F). Its kernel consists of all units ε satisfying λ(ε) = (0, . . . , 0). Such units lie in the intersection of a compact set (the set $\lambda^{-1}(0, \ldots, 0)$) and a discrete set (the set OF of integers in F) in the vector space FR. Hence the kernel is a finite subgroup of O×F, and so every element of the kernel is a torsion element of O×F, i.e., a root of unity.

Conversely, if ζ is a root of unity in F, then $\zeta^n = 1$ for some positive integer n. It follows that ζ is integral, a unit, and the image of ζ is a root of unity in any embedding of F in C. It follows that |ζ|v = 1 for all v ∈ Varc. Thus ζ is in the kernel of O×F → Λ.

We write w = w(F) = #µ(F) for the number of roots of unity in a number field F. If r1 > 0, then we immediately find that w = 2 since R only has two roots of unity. In general, we find the following

Proposition 5.7 If F is a number field, then µ(F) is a finite cyclic group.3

3 In fact, if F is any field, and C is a finite subgroup of F×, then C is cyclic. This is proven, for example, in J. P. Serre’s “A Course in Arithmetic”.

Proof: Consider any embedding F → C. This embedding induces an isomorphism from µ(F) to a finite subgroup of the group µ(C) of complex roots of unity. Thus it suffices to prove that every finite group of complex roots of unity is cyclic.

Let e : Q → µ(C) denote the group homomorphism given by

\[ e(q) = e^{2 \pi i q}. \]

Then e yields a short exact sequence

0 → Z → Q → µ(C) → 0.

In this way, every4 finite subgroup of µ(C) is the image of a finitely-generated subgroup of Q containing Z. Every finitely-generated subgroup of Q is a Z-lattice in Q, hence isomorphic to Z; in other words, every finitely-generated subgroup of Q is cyclic. Hence every finite subgroup of µ(C) is the image of a cyclic group, and hence is cyclic.

4 To see that every complex root of unity occurs in the image of e, one may observe that e(j/n) produces n distinct nth roots of unity, as j ranges over 1 ≤ j ≤ n. Thus the image of the homomorphism e contains all nth roots of unity, for all positive integers n.
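The key step, that a finitely-generated subgroup of Q is cyclic, can be made explicit: over a common denominator d the generators become integers divided by d, and their greatest common divisor gives a single generator. The following Python sketch implements this; the function name `generator` is hypothetical.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def generator(fractions):
    """A generator of the subgroup of Q generated by the given rationals.

    Over a common denominator d, the generators are integers a_i / d,
    and the subgroup they generate is generated by gcd(a_i) / d.
    """
    fractions = [Fraction(q) for q in fractions]
    # Least common multiple of the denominators.
    d = reduce(lambda x, y: x * y // gcd(x, y),
               (q.denominator for q in fractions), 1)
    numerators = [int(q * d) for q in fractions]
    return Fraction(reduce(gcd, numerators, 0), d)
```

For instance, the subgroup generated by 1, 1/4 and 1/6 is generated by 1/12, which matches the fact that a fourth root and a sixth root of unity together generate the group of twelfth roots of unity.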

5.4 The unit theorem

We have found that the unit group O×F fits into a short exact sequence

1 → µ(F) → O×F → Λ → 1,

where µ(F) = (O×F)tor is a finite cyclic group. In this section, we prove the unit theorem, which describes the quotient Λ, which is a subgroup of the real vector space $H \subset \mathbb{R}^{r_1 + r_2}$.

The discreteness of OF in FR immediately yields an “upper bound” for Λ, using the following geometric lemma:

Lemma 5.8 Suppose that L is a discrete subgroup of a finite-dimensional real vector space V, with dim(V) = d. Then L is isomorphic, as a group, to $\mathbb{Z}^e$ for some integer e ≤ d.

Proof: As a subgroup of a torsion-free group, we know that L is torsion-free. Let U = spanR(L) denote the span of L; thus L is a discrete subgroup spanning the real vector space U, with dimR(U) = e ≤ d. Since L spans U, there exists a basis v1, . . . , ve of U such that v1, . . . , ve ∈ L.

Let D be a fundamental parallelotope given by this basis, and $\bar{D}$ its closure. Let L0 ⊂ L denote the Z-span of v1, . . . , ve. Since L is discrete in V, and $\bar{D}$ is compact, the intersection $L \cap \bar{D}$ is finite. Since every element of L can be translated, via L0, to an element of D, we find that L is generated by L0 together with $L \cap \bar{D}$.

Hence L is a finitely-generated torsion-free abelian group; L is isomorphic to $\mathbb{Z}^f$ for some integer f, and f ≥ e since L spans the real vector space U of dimension e. Let $u_1, \ldots, u_f$ denote a Z-basis of L. After reordering, we may assume that $u_1, \ldots, u_e$ is a maximal R-linearly independent subset of this Z-basis, and let L′ denote the Z-span of $u_1, \ldots, u_e$. In particular, U/L′ is a compact Hausdorff space. If f > e, then observe that the set $\mathbb{Z} u_f$ projects onto an infinite subset of U/L′ since the set $u_1, \ldots, u_f$ is Z-linearly independent.

Hence the projection of $\mathbb{Z} u_f$ onto U/L′ has an accumulation point. Lifting the projection of $\mathbb{Z} u_f$ to the fundamental parallelotope for L′ in U, we find an accumulation point in the lattice L, contradicting discreteness.

Hence f = e, and so L is isomorphic to $\mathbb{Z}^e$.

Proposition 5.9 The group Λ is a discrete subgroup of $H \cong \mathbb{R}^{r_1 + r_2 - 1}$. It follows that Λ is a free abelian group of rank bounded by r1 + r2 − 1.

Proof: We find that Λ is discrete in H, since O×F is discrete in F×R, and the logarithmic modulus map $\lambda : F_{\mathbb{R}}^\times \to \mathbb{R}^{r_1 + r_2}$ is proper.

As a discrete subgroup of $H \cong \mathbb{R}^{r_1 + r_2 - 1}$, we find that $\Lambda \cong \mathbb{Z}^e$ for some e ≤ r1 + r2 − 1.

Since H has dimension r1 + r2 − 1, Λ is trivial in two basic cases. If r1 = 1 and r2 = 0, then F = Q, O×F = {±1}, and Λ = 0. If r1 = 0 and r2 = 1, then F is a quadratic imaginary field, O×F = µ(F) is a cyclic group of order 2, 4, or 6, and again Λ = 0. Therefore, it suffices to consider the cases when r1 + r2 ≥ 2.

We will soon prove a much more difficult result, that Λ is a lattice in H, i.e., that Λ is isomorphic to $\mathbb{Z}^{r_1 + r_2 - 1}$, embedded discretely in H. Our approach is based on that of Anthony Knapp5, beginning with the following:

5 Anthony W. Knapp. Advanced algebra. Cornerstones. Birkhäuser Boston Inc., Boston, MA, 2007. ISBN 978-0-8176-4522-9

Proposition 5.10 Fix an archimedean place w ∈ Varc. There exists an infinite sequence $\alpha^{(1)}, \alpha^{(2)}, \ldots$ in OF, satisfying the following three properties:

1. $|N(\alpha^{(i)})| \le 2^n \mathrm{Vol}(F_{\mathbb{R}} / \mathcal{O}_F)$ for all i.

2. For every archimedean place v ≠ w, $\lim_{i \to \infty} |\alpha^{(i)}|_v = 0$.

3. At the archimedean place w, $\lim_{i \to \infty} |\alpha^{(i)}|_w = \infty$.

Proof: In order to construct such elements of OF, we use our results on existence of lattice points within compact convex centrally symmetric regions of sufficient volume. First, for all i ≥ 1, we define a compact convex centrally symmetric subset of FR. Note that each absolute value | · |v (for v ∈ Varc) defines a gauge on FR; in particular, for all v ∈ Varc, and all positive real numbers ρ, the following is a closed, convex, centrally symmetric subset of FR:

\[ D_v(\rho) = \{ \alpha \in F_{\mathbb{R}} : |\alpha|_v \le \rho \}. \]

Furthermore, if one is given a positive real number ρv for each archimedean place v, then

\[ D(\vec{\rho}) = \bigcap_{v \in V_{arc}} D_v(\rho_v) \]

is a compact, convex, centrally symmetric subset of FR. It is a product of intervals and discs, centered at the origin.

Define, for all i ≥ 1, a collection $\vec{\rho}^{(i)}$ of radii, by:

1. $\rho^{(i)}_v = i^{-1}$, for all archimedean places v ≠ w.

2. If w is real, then $\rho^{(i)}_w = 2^{n - r_1} \pi^{-r_2} i^{n-1} \mathrm{Vol}(F_{\mathbb{R}} / \mathcal{O}_F)$.

3. If w is complex, then $\rho^{(i)}_w = \sqrt{2^{n - r_1} \pi^{-r_2} i^{n-2} \mathrm{Vol}(F_{\mathbb{R}} / \mathcal{O}_F)}$.

In this way, we get a sequence of compact, convex, centrally symmetric subsets $D(\vec{\rho}^{(i)})$; the volume of each may be computed directly, and

\[ \mathrm{Vol}(D(\vec{\rho}^{(i)})) = 2^n \mathrm{Vol}(F_{\mathbb{R}} / \mathcal{O}_F). \]

Thus, we find that there exists a sequence of nonzero lattice points $\alpha^{(i)}$, such that

\[ \alpha^{(i)} \in D(\vec{\rho}^{(i)}) \cap \mathcal{O}_F. \]

For all i ≥ 1, we find that

\[ |N(\alpha^{(i)})| = \prod_{v \in V_{arc}} |\alpha^{(i)}|_v \le 2^n \mathrm{Vol}(F_{\mathbb{R}} / \mathcal{O}_F)\, 2^{-r_1} 2^{-r_2} \le 2^n \mathrm{Vol}(F_{\mathbb{R}} / \mathcal{O}_F). \]

We find directly that, for all places v ≠ w, $|\alpha^{(i)}|_v \le i^{-1}$ approaches zero as i approaches infinity. Since $\alpha^{(i)}$ is a nonzero element of OF, the product

\[ |N(\alpha^{(i)})| = \prod_{v \in V_{arc}} |\alpha^{(i)}|_v \]

is a positive integer, and hence is at least 1. This lower bound, together with the decay of all but one term in the product, implies the growth of the single term $|\alpha^{(i)}|_w$.

Proposition 5.11 Fix an archimedean place w ∈ Varc. Then, there exists a sequence $\varepsilon^{(i)}$ of units such that

1. For all places v ≠ w, $\lim_{i \to \infty} |\varepsilon^{(i)}|_v = 0$.

2. At the place w, $\lim_{i \to \infty} |\varepsilon^{(i)}|_w = \infty$.

Proof: Let $\alpha^{(i)}$ be a sequence of elements of OF, constructed to satisfy the conditions of the previous proposition. Thus, the sequence $|N(\alpha^{(i)})|$ is a bounded sequence of integers. By restricting to a subsequence, we may assume that $N(\alpha^{(i)})$ is a constant integer, M. Furthermore, since OF/MOF is a finite set, we may restrict further to a subsequence on which $\alpha^{(i)}$ lies in a single residue class modulo MOF.

Define now $\varepsilon^{(i)} = \alpha^{(i)} / \alpha^{(1)}$. Since $\alpha^{(i)}$ and $\alpha^{(1)}$ are nonzero elements of OF, of the same norm, we find that $\varepsilon^{(i)}$ is an element of F× of norm 1. Since $\alpha^{(i)}$ and $\alpha^{(1)}$ are in the same residue class, modulo MOF, we find that

\[ (\alpha^{(i)} - \alpha^{(1)}) \in M \mathcal{O}_F. \]

Since $M = N_{F/\mathbb{Q}}(\alpha^{(1)})$, we find by a now familiar argument that

\[ M \in \alpha^{(1)} \mathcal{O}_F. \]

Putting these observations together, we find that

\[ (\alpha^{(i)} - \alpha^{(1)}) \in \alpha^{(1)} \mathcal{O}_F. \]

We deduce that

\[ \varepsilon^{(i)} = 1 + \frac{\alpha^{(i)} - \alpha^{(1)}}{\alpha^{(1)}} \in \mathcal{O}_F. \]

We have found that $\varepsilon^{(i)} \in \mathcal{O}_F$ and $|N(\varepsilon^{(i)})| = 1$, so $\varepsilon^{(i)} \in \mathcal{O}_F^\times$.

The decay of $|\varepsilon^{(i)}|_v$ (for v ≠ w), and the growth of $|\varepsilon^{(i)}|_w$ now follow from the corresponding conditions on $\alpha^{(i)}$.

We have found, for every archimedean place w, a sequence of units $\varepsilon^{(i)}_w$; by choosing i sufficiently large, we find units εw for all places w, satisfying the following conditions:

1. For all archimedean places v ≠ w, log |εw|v < 0.

2. At the place w, log |εw|w > 0.

3. Since εw is a unit,

\[ \sum_{v \in V_{arc}} \log |\varepsilon_w|_v = 0. \]

Proposition 5.12 Fix an archimedean place w. Define for all u, v ∈ Varc, with u ≠ w, v ≠ w,

\[ \lambda_{u,v} = \log |\varepsilon_u|_v. \]

Then the matrix (λu,v) is a nonsingular square matrix with r1 + r2 − 1 rows and columns.

Proof: We find that the diagonal entries of this matrix are positive, and the off-diagonal entries of this matrix are negative. Moreover, since the sum of log |εu|v over all archimedean places v is zero, and the omitted term log |εu|w is negative, we find within each row that

\[ \sum_{v \ne u} |\lambda_{u,v}| < \lambda_{u,u}. \]

Roughly speaking, the entries of the matrix are “concentrated” on the diagonal. To prove invertibility, suppose to the contrary that there was a linear relation among the columns; suppose that there exist real numbers cv, not all zero, such that

\[ \sum_{v} \lambda_{u,v} c_v = 0, \text{ for all } u. \]

Here, and below, all sums range over archimedean places different from w. Let u be a place at which |cu| is maximal. Then we find that

\[ |c_u \lambda_{u,u}| = |c_u| \lambda_{u,u} > |c_u| \sum_{v \ne u} |\lambda_{u,v}| \ge \sum_{v \ne u} |c_v \lambda_{u,v}| \ge \Big| \sum_{v \ne u} c_v \lambda_{u,v} \Big|. \]

But the strictness of this inequality contradicts the assumption that

\[ \sum_{v} \lambda_{u,v} c_v = 0. \]

Hence the matrix (λu,v) is non-singular.
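The argument above is an instance of the general fact that a strictly diagonally dominant matrix is nonsingular. As a sanity check, the following Python sketch tests the sign pattern of Proposition 5.12 on a made-up integer matrix (not derived from any number field) and computes its determinant exactly. The names `determinant`, `has_unit_sign_pattern` and `example` are hypothetical.

```python
from fractions import Fraction

def determinant(rows):
    """Determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    n, det = len(m), Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return det

def has_unit_sign_pattern(rows):
    """Positive diagonal, negative off-diagonal, strictly dominant rows."""
    n = len(rows)
    return all(
        rows[u][u] > 0
        and all(rows[u][v] < 0 for v in range(n) if v != u)
        and sum(-rows[u][v] for v in range(n) if v != u) < rows[u][u]
        for u in range(n))

# A made-up matrix with the sign pattern of Proposition 5.12.
example = [[5, -2, -1],
           [-1, 4, -2],
           [-3, -1, 6]]
```

Here `has_unit_sign_pattern(example)` holds and `determinant(example)` is nonzero, as the proposition predicts for any matrix with this pattern.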

Finally, we have proven

Theorem 5.13 (Dirichlet unit theorem) The discrete subgroup $\Lambda \subset H \subset \mathbb{R}^{r_1 + r_2}$ is a lattice in the real vector space H of dimension r1 + r2 − 1.

Proof: We have seen already that Λ is a discrete subgroup of $H \cong \mathbb{R}^{r_1 + r_2 - 1}$. Thus, it only remains to prove that Λ is a “full-rank” subgroup, i.e., $\Lambda \cong \mathbb{Z}^{r_1 + r_2 - 1}$. Let Λ′ ⊂ Λ be the subgroup generated by the logarithmic modulus of the units εv for all v ≠ w, discussed in the previous proposition. The matrix (λu,v) describes precisely the projection of $\Lambda' \subset \mathbb{R}^{r_1 + r_2}$ onto a coordinate hyperplane corresponding to all archimedean places but one. The nonsingularity of the matrix (λu,v) implies that Λ′ is a full rank lattice in $\mathbb{R}^{r_1 + r_2 - 1}$. Since Λ contains Λ′, we find that the rank of Λ is at least r1 + r2 − 1. Since earlier results imply that the rank of Λ is at most r1 + r2 − 1, we have precisely determined the rank of Λ.

From Dirichlet’s unit theorem, we find that there exists a system of fundamental units $\varepsilon_1, \ldots, \varepsilon_{r_1 + r_2 - 1}$, such that every element ε of O×F has a unique expression of the form:

\[ \varepsilon = \zeta \cdot \varepsilon_1^{e_1} \cdots \varepsilon_{r_1 + r_2 - 1}^{e_{r_1 + r_2 - 1}}, \]

for a root of unity ζ and integers $e_1, \ldots, e_{r_1 + r_2 - 1}$. Observe that there is not a canonical splitting of the short exact sequence

1 → µ(F) → O×F → Λ → 1,

but giving such a system of fundamental units yields such a splitting (as well as an ordered Z-basis of Λ).

We finish with a definition, which follows from the Dirichlet unit theorem. We have seen that Λ is a lattice in the real vector space $H \subset \mathbb{R}^{r_1 + r_2}$. As H is a hyperplane in the real vector space $\mathbb{R}^{r_1 + r_2}$, and the latter has a natural basis (up to reordering), H is endowed with a canonical measure. The regulator of F is the volume Vol(H/Λ), divided by $\sqrt{r_1 + r_2}$; this can be explicitly computed by the following, which we take as our working definition:

Definition 5.14 Let $\varepsilon_1, \ldots, \varepsilon_{r_1 + r_2 - 1}$ be a system of fundamental units for the number field F. Choose an archimedean place w. Then, the regulator of F is defined to be the absolute value of the determinant of the r1 + r2 − 1 by r1 + r2 − 1 square matrix (λi,v), where

\[ \lambda_{i,v} = \log |\varepsilon_i|_v, \text{ for all } v \ne w. \]

On the surface, it seems that the regulator depends on the choice of a system of fundamental units, as well as the choice of an archimedean place w. It is not difficult to see that changing a system of fundamental units corresponds to multiplying the matrix (λi,v) by an invertible integer matrix, which has determinant ±1. Thus the choice of system of fundamental units does not affect the regulator.

On the other hand, it is not as clear that the choice of archimedean place does not affect the regulator. But this follows from a result in linear algebra:

Proposition 5.15 Suppose that (mij) is a t by t + 1 matrix of real numbers, for some positive integer t. Suppose that every row sum equals zero:

\[ \sum_{j=1}^{t+1} m_{ij} = 0, \text{ for all } 1 \le i \le t. \]

For all 1 ≤ j ≤ t + 1, let Mj denote the matrix obtained by deleting the jth column of (mij). Then the absolute value of the determinant |Det(Mj)| is independent of j.

Since the sums ∑v log |ε|v equal zero, for any unit ε, the previous proposition implies that the regulator is well-defined, regardless of the choice of a specific “fixed archimedean place”.
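Proposition 5.15 can be checked on a small example. The Python sketch below deletes each column in turn from a made-up 2 by 3 matrix whose rows sum to zero, and computes the absolute values of the resulting determinants exactly. The names `det`, `minors_after_deleting_columns` and `rows` are hypothetical, and the matrix is not derived from any number field.

```python
from fractions import Fraction

def det(m):
    """Determinant by cofactor expansion along the first row."""
    if not m:
        return Fraction(1)  # determinant of the empty matrix
    return sum((-1) ** j * m[0][j]
               * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def minors_after_deleting_columns(rows):
    """|det(M_j)| for each j, where M_j drops column j of a t x (t+1) matrix."""
    width = len(rows[0])
    return [abs(det([row[:j] + row[j + 1:] for row in rows]))
            for j in range(width)]

# A made-up 2 x 3 matrix whose rows sum to zero.
rows = [[Fraction(3), Fraction(-1), Fraction(-2)],
        [Fraction(-4), Fraction(5), Fraction(-1)]]
```

Calling `minors_after_deleting_columns(rows)` returns three equal values, illustrating why the regulator does not depend on which archimedean place is omitted.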

5.5 Concluding remarks

At this point, we have seen a number of “invariants” of a number field F; these are numbers which we can associate in a natural way to a number field. These invariants occur together in the Dirichlet class number formula; like Euler’s formula $e^{i\pi} + 1 = 0$, the Dirichlet class number formula unites these disparate invariants in a surprisingly clean way, relating them to the residue of the Dedekind zeta function.

The Riemann zeta function refers to the function

\[ \zeta(s) = \sum_{n=1}^{\infty} n^{-s} = \prod_{p} (1 - p^{-s})^{-1}, \]

which converges (both the sum and the “Euler product” converge and are equal) when ℜ(s) > 1. The function ζ(s) is holomorphic in the open subset of C given by ℜ(s) > 1; moreover, there is a unique meromorphic extension of ζ(s) to the complex plane, whose only pole is a simple pole at s = 1.

The Riemann zeta function generalizes to the Dedekind zeta function, associated to a number field F. It is also defined by a Dirichlet series:

\[ \zeta_F(s) = \sum_{I \subset \mathcal{O}_F} N(I)^{-s} = \prod_{P \subset \mathcal{O}_F} (1 - N(P)^{-s})^{-1}, \]

where the sum ranges over all nonzero ideals I of OF and the product over all nonzero prime ideals P of OF. Once again, it can be proven that ζF(s), while initially defined and holomorphic in a right half-plane, extends uniquely to a meromorphic function in the complex plane, whose only pole is a simple pole at s = 1.

We have now defined all of the terms occurring in the following

Theorem 5.16 (Class number formula) The Dedekind zeta function ζF(s) has meromorphic continuation to the complex plane. At s = 1, ζF has a simple pole and

\[ \lim_{s \to 1} (s - 1) \zeta_F(s) = \frac{2^{r_1} (2\pi)^{r_2} h(F) R(F)}{w(F) \sqrt{|\mathrm{Disc}(F)|}}. \]

This formula brings together all of the invariants of a number field that we have encountered. To recall these invariants we have:

Discriminant Disc(F) = Disc(OF) arises from viewing OF as a Z-lattice in the Q-vector space F, which is endowed with a nondegenerate symmetric bilinear form given by the trace pairing.

Places r1 is the number of real places, and r2 is the number of complex places. These arise naturally, in that $F_{\mathbb{R}} \cong \mathbb{R}^{r_1} \times \mathbb{C}^{r_2}$ as an R-algebra.

Class number h(F) is the class number of F, i.e., the cardinality of the finite group H(F) = I(F)/P(F). It measures the extent to which OF might fail to be a principal ideal domain.

Roots of unity w(F) is the cardinality of the finite group µ(F) = (O×F)tor of roots of unity in F.

Regulator R(F) is the regulator of F, given as the covolume of the lattice Λ ≅ O×F/µ(F) in the real vector space $H \cong \mathbb{R}^{r_1 + r_2 - 1}$.

Zeta function The Dedekind zeta function ζF(s) is defined above.
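As a numerical sanity check of the class number formula, consider F = Q(i). The sketch below relies on several facts that are not proven in this text and are assumed here: the invariants r1 = 0, r2 = 1, h = 1, w = 4, Disc = −4; the convention that the regulator is 1 when r1 + r2 − 1 = 0 (an empty determinant); and the standard factorization of the Dedekind zeta function of Q(i) as ζ(s) times the Dirichlet L-function of the nontrivial character modulo 4, so that the residue at s = 1 equals 1 − 1/3 + 1/5 − · · ·. The function names are hypothetical.

```python
from math import pi, sqrt

def class_number_formula_rhs(r1, r2, h, R, w, disc):
    """2^r1 (2 pi)^r2 h R / (w sqrt(|Disc|)), the predicted residue at s = 1."""
    return 2**r1 * (2 * pi) ** r2 * h * R / (w * sqrt(abs(disc)))

def leibniz_partial_sum(terms):
    """Partial sum of 1 - 1/3 + 1/5 - ..., averaged over two consecutive sums."""
    total, previous = 0.0, 0.0
    for k in range(terms):
        previous = total
        total += (-1) ** k / (2 * k + 1)
    return (total + previous) / 2

# Invariants of Q(i), assumed from standard tables (see the lead-in).
predicted = class_number_formula_rhs(0, 1, 1, 1, 4, -4)
observed = leibniz_partial_sum(100_000)
```

Both `predicted` and `observed` approximate π/4, which is the agreement the theorem predicts in this case.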

5.6 Exercises

Exercise 5.1 Describe explicitly the Diophantine equation which corresponds to finding the units in the ring $\mathbb{Z}[\sqrt[3]{2}]$. What does the unit theorem imply in this case? Can you find a non-torsion unit?

Exercise 5.2 Suppose that p is a prime number, and F is a number field which is a Galois extension of Q. Suppose that p is totally ramified in F, which means that $p\mathcal{O}_F = P^n$ for some prime ideal P of OF.

(a) Prove that if σ ∈ Gal(F/Q), then σ(P) = P.

(b) Suppose that P is a principal ideal, P = α · OF. Prove that α/σ(α) ∈ O×F.

Exercise 5.3 A CM-field is a number field F, which contains a subfield K, such that K is totally real (all archimedean places are real), F is totally complex (all archimedean places are complex), and [F : K] = 2.

(a) Suppose that F is a CM field. Prove that there is a unique subfield K such that K is totally real and [F : K] = 2. Hint: prove that if K1, K2 are totally real subfields of F, then K1K2 is a totally real subfield of F. When F is a CM field, we write F+ for this uniquely determined totally real subfield satisfying [F : F+] = 2.

(b) Suppose that K ⊂ F is any proper inclusion of number fields. Prove that F is a CM field with K = F+ if and only if O×F/O×K is a finite group. Hint: analyze how r1 and r2 change in an extension of number fields, and apply Dirichlet’s unit theorem.

6 Cyclotomic Fields

We have developed a great deal of theory, covering the most important basic invariants of a number field. However, our examples have been quite limited, covering quadratic fields and occasionally (in the exercises) cubic or biquadratic quartic fields.

Historically and today, cyclotomic fields are some of the most important number fields. In the nineteenth century, they motivated much of the development of number theory due to their relevance to Fermat’s Last Theorem. Later, with the Kronecker-Weber theorem, it was realized that cyclotomic fields play an important role in understanding the abelian Galois extensions of Q. They play a basic role in Iwasawa theory, developed in the late twentieth century.

There are many books on cyclotomic fields; we mention Washington and Lang as two that are well-known. In this chapter, we use cyclotomic fields to review all we have studied in the previous chapters, and to prove an important case of Fermat’s Last Theorem.

6.1 Cyclotomic polynomials and fields

Let $m$ be a positive integer, and let $F$ be a number field. An $m$th root of unity in $F$ is an element $\zeta \in F$ satisfying $\zeta^m = 1$. A primitive $m$th root of unity is an element $\zeta \in F$ satisfying $\zeta^m = 1$, and not satisfying $\zeta^n = 1$ for any positive integer $n < m$. One writes $\mu_m(F)$ for the cyclic³ group of $m$th roots of unity in $F$. The generators of this cyclic group are precisely the primitive $m$th roots of unity.

³ We have proven that this group is cyclic when $F$ is a number field, in Proposition 5.7.

Lemma 6.1 Suppose that $\zeta$ is a primitive $m$th root of unity in a number field $F$. Let $\Phi_m$ be the minimal polynomial of $\zeta$, called the $m$th cyclotomic polynomial. If $p$ is a prime number not dividing $m$, and $r$ is a root of $\Phi_m$, then $\Phi_m(r^p) = 0$.

Proof: Note that since $\zeta^m = 1$, $\zeta$ is an algebraic integer, and $\Phi_m$ is a monic polynomial with integer coefficients. There exists a monic polynomial $\Psi_m \in \mathbb{Z}[X]$ such that

\[ X^m - 1 = \Phi_m(X) \cdot \Psi_m(X). \]

Observe that $\mu_m(F)$ is a cyclic group of order $m$, and therefore the roots of $X^m - 1$ are all distinct; it follows that every root of $X^m - 1$ is either a root of $\Phi_m$ or a root of $\Psi_m$ (but never both).

Now consider a root $r$ of $\Phi_m$ and a prime number $p$ not dividing $m$. Since $r^m = 1$, we immediately know that $r^p$ is a root of $X^m - 1$, and hence $r^p$ is a root of $\Phi_m$ or of $\Psi_m$. We seek to prove that $r^p$ is a root of $\Phi_m$.

Let $\mathcal{O}$ denote the ring of integers in $\mathbb{Q}(\zeta)$; it contains $r$, since every root of $X^m - 1$ is a power of $\zeta$. Let $P$ be a prime ideal of $\mathcal{O}$ occurring in the factorization of the principal ideal $p\mathcal{O}$. Since $\Phi_m(r) = 0$ and the coefficients of $\Phi_m$ are integers, we may reduce mod $P$ to obtain:

\[ \Phi_m(r) \equiv 0 \pmod{P}. \]

Note that $\mathcal{O}/P$ is a field of characteristic $p$, since $p \in P$. Hence, the Frobenius map sending every element to its $p$th power is a field automorphism of $\mathcal{O}/P$, and it fixes the reductions of the integer coefficients of $\Phi_m$. We find that

\[ \Phi_m(r^p) \equiv \Phi_m(r)^p \equiv 0 \pmod{P}. \]

Note that, since $p$ does not divide $m$, the polynomial $X^m - 1$ is relatively prime⁴ to its derivative $mX^{m-1}$, not only in $\mathbb{Z}[X]$, but also in $(\mathcal{O}/P)[X]$. It follows that $X^m - 1$ has no repeated roots in the field $\mathcal{O}/P$. In particular, since $\Phi_m(r^p) \equiv 0$ modulo $P$, we find that $\Psi_m(r^p) \not\equiv 0$ modulo $P$. Therefore, $\Psi_m(r^p) \neq 0$.

⁴ In the Euclidean domain $(\mathcal{O}/P)[X]$, one may check that these two polynomials are relatively prime, since polynomial division leaves a nonzero constant remainder.

We have demonstrated that $r^p$ is a root of $X^m - 1$, but not a root of $\Psi_m$. Hence $r^p$ is a root of $\Phi_m$.

Proposition 6.2 Let $\zeta$ be a primitive $m$th root of unity in a number field $F$. The roots of the cyclotomic polynomial $\Phi_m$ are precisely the primitive $m$th roots of unity. In particular, $\mathbb{Q}(\zeta)$ is a Galois extension of $\mathbb{Q}$, and $\mathrm{Gal}(\mathbb{Q}(\zeta)/\mathbb{Q})$ is isomorphic to the group $(\mathbb{Z}/m\mathbb{Z})^\times$.

Proof: Every root of $\Phi_m$ is a primitive $m$th root of unity, since $\Phi_m$ is the minimal polynomial of $\zeta$. Furthermore, every primitive $m$th root of unity can be expressed as $\zeta^e$, for some positive exponent $e$ relatively prime to $m$. Factoring $e$ into primes, we have $e = q_1 \cdots q_t$, for some sequence of prime numbers $q_1, \ldots, q_t$, none of which divide $m$.

Since $\zeta$ is a root of $\Phi_m$, the previous lemma shows that $\zeta^{q_1}$ is a root of $\Phi_m$. Applying the previous lemma to $\zeta^{q_1}$, we find that $(\zeta^{q_1})^{q_2} = \zeta^{q_1 q_2}$ is a root of $\Phi_m$. Continuing inductively with the previous lemma, we find that $\zeta^{q_1 \cdots q_t} = \zeta^e$ is a root of $\Phi_m$. Thus every primitive $m$th root of unity is a root of $\Phi_m$. The roots of $\Phi_m$ are precisely all primitive $m$th roots of unity, i.e., the generators of $\mu_m(F)$.

As the polynomial $\Phi_m$ is irreducible, and all roots of $\Phi_m$ are powers of $\zeta$, we find that $\mathbb{Q}(\zeta)$ is a Galois extension of $\mathbb{Q}$. An element $g \in \mathrm{Gal}(\mathbb{Q}(\zeta)/\mathbb{Q})$ is uniquely determined by $g(\zeta)$, since the other roots of $\Phi_m$ have the form $\zeta^e$ for some integer $e$. Furthermore, $g(\zeta) = \zeta^{f(g)}$, for some integer $f(g)$ relatively prime to $m$, unique modulo $m$, since $g(\zeta)$ is again a primitive $m$th root of unity.

If $g, g' \in \mathrm{Gal}(\mathbb{Q}(\zeta)/\mathbb{Q})$, then we find that:

\[ \zeta^{f(gg')} = [gg'](\zeta) = g(\zeta^{f(g')}) = \zeta^{f(g) f(g')}. \]

It follows that $f$ is a well-defined injective homomorphism from $\mathrm{Gal}(\mathbb{Q}(\zeta)/\mathbb{Q})$ to the group $(\mathbb{Z}/m\mathbb{Z})^\times$. Since $\mathrm{Gal}(\mathbb{Q}(\zeta)/\mathbb{Q})$ must act transitively on the roots of $\Phi_m$, a set with cardinality equal to the cardinality⁵ of $(\mathbb{Z}/m\mathbb{Z})^\times$, we find that this homomorphism $f$ is surjective.

⁵ The cardinality of $(\mathbb{Z}/m\mathbb{Z})^\times$ is the number of integers $k$ satisfying $1 \le k \le m$ and relatively prime to $m$. This cardinality is called the (Euler's) totient of $m$, and denoted $\phi(m)$. Note that $\phi(m)$ is the degree of the cyclotomic polynomial $\Phi_m$.
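As an illustration of Proposition 6.2, the following sketch computes $\Phi_m$ with integer arithmetic and compares its degree with $\phi(m)$. It relies on the standard consequence of the proposition that $X^m - 1 = \prod_{d \mid m} \Phi_d(X)$ (every $m$th root of unity is a primitive $d$th root of unity for exactly one divisor $d$ of $m$). The function names are hypothetical and chosen only for this example.

```python
from math import gcd

def poly_divexact(num, den):
    """Exact division of integer polynomials given as coefficient lists
    (lowest degree first). Assumes den is monic and divides num."""
    num = num[:]
    quot = [0] * (len(num) - len(den) + 1)
    for k in range(len(quot) - 1, -1, -1):
        c = num[k + len(den) - 1]          # current leading coefficient
        quot[k] = c
        for j, d in enumerate(den):
            num[k + j] -= c * d
    assert all(c == 0 for c in num), "division was not exact"
    return quot

def cyclotomic(m):
    """Coefficients of Phi_m (lowest degree first), obtained by dividing
    X^m - 1 by Phi_d for every proper divisor d of m."""
    poly = [-1] + [0] * (m - 1) + [1]      # X^m - 1
    for d in range(1, m):
        if m % d == 0:
            poly = poly_divexact(poly, cyclotomic(d))
    return poly

def totient(m):
    """Euler's totient, by direct counting."""
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)
```

For instance, `cyclotomic(8)` returns `[1, 0, 0, 0, 1]`, the coefficients of $X^4 + 1$, whose degree is $\phi(8) = 4$.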

The subject of this chapter is the following:

Definition 6.3 A cyclotomic field is a number field $F$ such that $F = \mathbb{Q}(\zeta)$ for some root of unity $\zeta \in F$. Without loss of generality, a cyclotomic field can also be thought of as a field $F = \mathbb{Q}(\zeta)$ for some primitive root of unity $\zeta$.

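Let $\phi(m)$ denote the degree of $\Phi_m$, i.e., the totient⁶ of $m$, or the cardinality of $(\mathbb{Z}/m\mathbb{Z})^\times$. By the Chinese remainder theorem, a prime factorization $m = p_1^{e_1} \cdots p_t^{e_t}$ yields a decomposition of rings:

\[ \frac{\mathbb{Z}}{m\mathbb{Z}} \cong \frac{\mathbb{Z}}{p_1^{e_1}\mathbb{Z}} \times \cdots \times \frac{\mathbb{Z}}{p_t^{e_t}\mathbb{Z}}. \]

⁶ Euler's totient $\phi(m)$ may be described as "the number of integers between $1$ and $m$ (inclusive), which are relatively prime to $m$".

This yields a decomposition of unit groups:

\[ \left(\frac{\mathbb{Z}}{m\mathbb{Z}}\right)^\times \cong \left(\frac{\mathbb{Z}}{p_1^{e_1}\mathbb{Z}}\right)^\times \times \cdots \times \left(\frac{\mathbb{Z}}{p_t^{e_t}\mathbb{Z}}\right)^\times. \]

Observe that, if $p$ is a prime number, and $e$ is a positive integer, one may identify:

\[ \left(\frac{\mathbb{Z}}{p^e\mathbb{Z}}\right)^\times = \frac{\mathbb{Z}}{p^e\mathbb{Z}} \setminus \frac{p\mathbb{Z}}{p^e\mathbb{Z}}, \]

from which one finds that $\phi(p^e) = p^e - p^{e-1}$. Using the factorization above, we find more generally that

\[ \phi(m) = \prod_{i=1}^{t} \phi(p_i^{e_i}) = \prod_{i=1}^{t} \left(p_i^{e_i} - p_i^{e_i - 1}\right). \]

The above product has at least one even factor, unless $m = 1$ or $m = 2$ (in which cases $\phi(m) = 1$). Of course, when $m = 1$ or $m = 2$, the $m$th roots of unity are already contained in $\mathbb{Q}$, so this is not a very interesting case to study.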
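The product formula for $\phi(m)$ can be checked against the counting description of the totient. The following is a minimal sketch; the helper names are hypothetical, and trial division is used only because it is short, not because it is efficient.

```python
from math import gcd

def prime_factorization(m):
    """Return {p: e} with m equal to the product of p**e (trial division)."""
    factors = {}
    p = 2
    while p * p <= m:
        while m % p == 0:
            factors[p] = factors.get(p, 0) + 1
            m //= p
        p += 1
    if m > 1:
        factors[m] = factors.get(m, 0) + 1
    return factors

def totient_from_factorization(m):
    """phi(m) as the product of p^e - p^(e-1) over prime powers p^e dividing m exactly."""
    result = 1
    for p, e in prime_factorization(m).items():
        result *= p**e - p**(e - 1)
    return result

def totient_by_counting(m):
    """phi(m) as the number of 1 <= k <= m with gcd(k, m) = 1."""
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)
```

Both functions agree on every input, and the value is even as soon as $m > 2$, in line with the remark about the even factor.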

Observe that the only roots of unity in $\mathbb{R}$ are $\pm 1$. It follows that if $m > 2$ and $F = \mathbb{Q}(\zeta)$ for a primitive $m$th root of unity $\zeta$, then $r_1 = 0$, $[F : \mathbb{Q}] = \phi(m)$, and the number of complex places of F is given by $r_2 = \tfrac{1}{2}\phi(m)$.

From considering the degrees of field extensions, we may prove the following

Proposition 6.4 Suppose that $F = \mathbb{Q}(\zeta)$ is a cyclotomic field, where $\zeta$ is a primitive $m$th root of unity. Let $\mu(F)$ denote the group of all roots of unity in $F$ (the group law given by multiplication, of course). If $m$ is even, then $\mu(F)$ is generated by $\zeta$, and has order $m$. If $m$ is odd, then $\mu(F)$ is generated by $-\zeta$, and has order $2m$.

Proof: We have seen that $[F : \mathbb{Q}] = \phi(m)$. Furthermore, $\mu(F)$ is a finite subgroup of $\mathcal{O}_F^\times$, hence cyclic. Let $\eta$ be a generator of $\mu(F)$. Then, we find that $\eta$ is a primitive $n$th root of unity, and $m$ divides $n$. Hence $F = \mathbb{Q}(\zeta) = \mathbb{Q}(\eta)$, and it must be the case that

\[ [F : \mathbb{Q}] = \phi(m) = \phi(n). \]

The totients $\phi(m)$ and $\phi(n)$ may be computed from the prime factorizations of $m$ and $n$. Let $M$ be the set of prime numbers dividing $m$, and $N$ the set of prime numbers dividing $n$. The prime factorizations of $m$ and $n$ may now be expressed as:

\[ m = \prod_{p \in M} p^{e_p}, \quad\text{and}\quad n = \prod_{p \in N} p^{f_p}, \]

where $e_p, f_p$ are positive integers. Note that $M \subset N$, and $e_p \le f_p$ for all $p \in M$, since $m$ divides $n$. We find that

\[ \phi(m) = \prod_{p \in M} (p - 1)p^{e_p - 1} = \prod_{p \in N} (p - 1)p^{f_p - 1} = \phi(n). \]

Observe that if $p \in M$, and $f_p > e_p$, then $\phi(n) > \phi(m)$, a contradiction. If $p \in N$ but $p \notin M$, then we again find a contradiction, unless $p = 2$ and $f_p = 1$.

Thus, we find that $\phi(m) = \phi(n)$ implies that $m = n$, or else $n = 2m$ and $m$ is odd. The converse is not difficult to check by direct computation in the above formula.

Thus we find that $\mu(F)$ is generated by $\zeta$ if $m$ is even. If $m$ is odd, then we find that $\mu(F)$ is a cyclic group of order $2m$. The element $-\zeta$ satisfies $(-\zeta)^{2m} = 1$, and $(-\zeta)^m = -1$. It follows directly that $-\zeta$ generates $\mu(F)$.
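The arithmetic heart of the proof above is the claim that, for $m$ dividing $n$, the equality $\phi(m) = \phi(n)$ forces $n = m$, or $n = 2m$ with $m$ odd. A brute-force sketch can confirm this on a finite range; the search bound `max_factor` and the function names are assumptions made only for this illustration.

```python
from math import gcd

def totient(m):
    """Euler's totient, by direct counting."""
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)

def same_totient_multiples(m, max_factor=30):
    """Multiples n = k*m with 1 <= k <= max_factor and totient(n) == totient(m)."""
    target = totient(m)
    return [k * m for k in range(1, max_factor + 1) if totient(k * m) == target]

def predicted_order_of_roots_of_unity(m):
    """Order of mu(Q(zeta_m)) as stated in Proposition 6.4."""
    return m if m % 2 == 0 else 2 * m
```

On the searched range, the largest multiple returned by `same_totient_multiples(m)` coincides with `predicted_order_of_roots_of_unity(m)`.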


6.2 Integers

Our first task, in the spirit of the second chapter of this text, is to determine the ring of integers in the number field $F = \mathbb{Q}(\zeta)$. We focus hereafter on the case when $\zeta$ is a primitive $(\ell^d)$th root of unity, where $\ell$ is a prime number and $d \ge 1$. In this case, the cyclotomic polynomial $\Phi_{\ell^d}$ has degree

\[ n = \ell^d - \ell^{d-1} = \ell^{d-1}(\ell - 1), \]

and it has the simple form⁷

\[ \Phi_{\ell^d}(X) = \frac{X^{\ell^d} - 1}{X^{\ell^{d-1}} - 1} = (X^{\ell^{d-1}})^{\ell-1} + (X^{\ell^{d-1}})^{\ell-2} + \cdots + (X^{\ell^{d-1}})^{1} + 1. \]

⁷ One may see this by observing that the primitive $(\ell^d)$th roots of unity are all $(\ell^d)$th roots of unity, excluding the $(\ell^{d-1})$th roots of unity.

In particular, when $d = 1$,

\[ \Phi_\ell(X) = X^{\ell-1} + X^{\ell-2} + \cdots + X + 1. \]

Let $\pi = \zeta - 1$ and define $P_\pi(X) = \Phi_{\ell^d}(X + 1)$. The irreducibility of $\Phi_{\ell^d}$ implies the irreducibility of $P_\pi$. Observing that $P_\pi(\pi) = 0$, it follows that $P_\pi$ is the minimal polynomial of $\pi$, and $F = \mathbb{Q}(\zeta) = \mathbb{Q}(\pi)$.

Proposition 6.5 The element $\pi \in \mathcal{O}_F$ has norm $(-1)^n \cdot \ell$, and hence the principal ideal $\pi\mathcal{O}_F$ is prime.

Proof: The norm of $\pi$ equals the constant term of its minimal polynomial $P_\pi$, times $(-1)^{\mathrm{Deg}(P_\pi)} = (-1)^n$. The proposition now follows from the computation⁸

\[ P_\pi(0) = \Phi_{\ell^d}(1) = \ell. \]

⁸ This is not really a computation; simply observe that $\Phi_{\ell^d}$ is a polynomial with $\ell$ terms, each of which is a power of $X$.
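The polynomial $P_\pi(X) = \Phi_{\ell^d}(X + 1)$ is easy to write down explicitly. The sketch below builds $\Phi_{\ell^d}$ from its $\ell$ monomials and expands the shift with binomial coefficients; the function names are hypothetical.

```python
from math import comb

def phi_prime_power(l, d):
    """Coefficients (lowest degree first) of Phi_{l^d}(X),
    the sum of X^(k * l^(d-1)) for k = 0, ..., l-1."""
    step = l ** (d - 1)
    coeffs = [0] * (step * (l - 1) + 1)
    for k in range(l):
        coeffs[k * step] = 1
    return coeffs

def shift_by_one(coeffs):
    """Coefficients of P(X + 1), expanding each (X + 1)^k binomially."""
    out = [0] * len(coeffs)
    for k, c in enumerate(coeffs):
        for j in range(k + 1):
            out[j] += c * comb(k, j)
    return out

def minimal_polynomial_of_pi(l, d):
    """Coefficients of P_pi(X) = Phi_{l^d}(X + 1)."""
    return shift_by_one(phi_prime_power(l, d))
```

For example, `minimal_polynomial_of_pi(3, 1)` returns `[3, 3, 1]`, i.e. $X^2 + 3X + 3$, whose constant term is $\ell = 3$ as in the proof above.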

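In what follows we frequently use the following "indexing set":

\[ I = \{ i \in \mathbb{Z} : 1 \le i \le \ell^d, \text{ and } \ell \nmid i \}. \]

Thus, $I$ is an ordered set of representatives for the finite group $(\mathbb{Z}/\ell^d\mathbb{Z})^\times$. Now we may factor $\ell$ in the ring $\mathcal{O}_F$:

\[ \ell = \prod_{i \in I} (\zeta^i - 1), \]

since the terms in this product are precisely the Galois conjugates of $\pi$, whose product is the norm $(-1)^n \ell$. (The sign $(-1)^n$ equals $1$ except in the degenerate case $\ell^d = 2$, where the product is $-\ell$; the sign is a unit and plays no role below.)

On the other hand, we may prove

Proposition 6.6 For all $i \in I$, the element $\frac{1 - \zeta^i}{1 - \zeta}$ is an element of $\mathcal{O}_F^\times$.

Proof: There exists $g \in G = \mathrm{Gal}(F/\mathbb{Q})$, such that $g(1 - \zeta) = 1 - \zeta^i$, and so these elements have the same norm. Thus $\frac{1 - \zeta^i}{1 - \zeta}$ has norm $1$. Furthermore, we may expand a partial geometric series:

\[ \frac{1 - \zeta^i}{1 - \zeta} = 1 + \zeta + \zeta^2 + \cdots + \zeta^{i-1}. \]

It follows that $\frac{1 - \zeta^i}{1 - \zeta}$ is an element of $\mathcal{O}_F$. An algebraic integer of norm $1$ is a unit, and the proposition follows.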
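A floating-point experiment is consistent with Proposition 6.6: multiplying the Galois conjugates of $\frac{1 - \zeta^i}{1 - \zeta}$ gives a value numerically indistinguishable from $1$. The sketch below models $\zeta$ as $e^{2\pi i/\ell^d}$ in $\mathbb{C}$; this embedding and the function names are assumptions of the illustration, and a numerical check is of course not a proof.

```python
import cmath

def conjugates_of_cyclotomic_unit(l, d, i):
    """Complex values of (1 - zeta^(a*i)) / (1 - zeta^a), where zeta^a runs
    over the primitive (l^d)-th roots of unity, i.e. over a not divisible by l."""
    m = l ** d
    values = []
    for a in range(1, m + 1):
        if a % l == 0:
            continue
        z = cmath.exp(2j * cmath.pi * a / m)   # image of zeta under g_a
        values.append((1 - z ** i) / (1 - z))
    return values

def norm(values):
    """Product of all conjugates."""
    prod = 1
    for v in values:
        prod *= v
    return prod
```

For instance, `norm(conjugates_of_cyclotomic_unit(5, 1, 2))` agrees with $1$ up to rounding error.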

The units of the form $\frac{1 - \zeta^i}{1 - \zeta}$ are called cyclotomic units⁹ in $\mathcal{O}_F$. Since these are units, we find the following

⁹ The cyclotomic units provide a quite large supply of units in $\mathcal{O}_F^\times$. In fact, the subgroup generated by the cyclotomic units in $\mathcal{O}_F^\times$ (modulo torsion units) has finite index. This index is equal to the class number of the maximal totally real subfield $F^+$ in $F$, discussed later.

Corollary 6.7 In the ring $\mathcal{O}_F$, there exists a unit $\varepsilon$ such that $\ell = \varepsilon\pi^n$. Moreover, the principal ideal $\ell\mathcal{O}_F$ factors into prime ideals as:

\[ \ell\mathcal{O}_F = (\pi\mathcal{O}_F)^n. \]

Reminder: $n = \ell^d - \ell^{d-1} = \ell^{d-1}(\ell - 1)$, and $I = \{ i \in \mathbb{Z} : 1 \le i \le \ell^d, \text{ and } \ell \nmid i \}$.

Proof: Recalling that $\pi = \zeta - 1$, we have

\[ \ell = \prod_{i \in I} (\zeta^i - 1) = \prod_{i \in I} \frac{\zeta^i - 1}{\zeta - 1}(\zeta - 1) = \varepsilon \cdot \pi^n, \]

for a suitable product $\varepsilon$ of cyclotomic units. This corresponds to a factorization of the principal ideal $\ell\mathcal{O}_F$ as a power $(\pi\mathcal{O}_F)^n$ of the principal ideal $\pi\mathcal{O}_F$. Since $|N(\pi)| = \ell$ is a prime number, we see that $\pi\mathcal{O}_F$ is a prime ideal in $\mathcal{O}_F$. Hence, this is the (unique) factorization of the principal ideal $\ell\mathcal{O}_F$ into prime ideals in $\mathcal{O}_F$.

Consider the order $\mathcal{O} = \mathbb{Z}[\zeta] \subset \mathcal{O}_F$. A natural $\mathbb{Z}$-basis of $\mathcal{O}$ is given by the set of powers $1, \zeta, \zeta^2, \ldots, \zeta^{n-1}$. On the other hand, the Galois conjugates of $\zeta$ are precisely $\zeta^i$ for $i \in I$. Thus we are led to consider the square matrix

\[ V = \left( \zeta^{ij} \right)_{i \in I,\ 0 \le j < n}. \]

The discriminant of $\mathcal{O}$ can now be expressed as

\[ \mathrm{Disc}(\mathcal{O}) = \mathrm{Det}(V)^2. \]

In order to evaluate this determinant, we apply the following result of independent interest

Proposition 6.8 Consider the following Vandermonde determinant as an element of the polynomial ring $\mathbb{Z}[X_1, \ldots, X_n]$:

\[
v(X_1, \ldots, X_n) = \mathrm{Det}
\begin{pmatrix}
1 & X_1 & X_1^2 & \cdots & X_1^{n-1} \\
1 & X_2 & X_2^2 & \cdots & X_2^{n-1} \\
1 & X_3 & X_3^2 & \cdots & X_3^{n-1} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & X_n & X_n^2 & \cdots & X_n^{n-1}
\end{pmatrix}.
\]

Then the following is an equality in the ring $\mathbb{Z}[X_1, \ldots, X_n]$:

\[ v(X_1, \ldots, X_n) = \prod_{1 \le i < j \le n} (X_j - X_i). \]

Proof: If one expands the determinant, each summand is, up to sign, a product of the form $\prod_{i=1}^{n} X_{\sigma(i)}^{i-1}$, for some permutation $\sigma$ of $\{1, \ldots, n\}$. It follows that the determinant $v(X_1, \ldots, X_n)$ is a homogeneous polynomial of total degree $0 + 1 + \cdots + (n-1) = n(n-1)/2$.

Suppose for a moment that $1 \le i < j \le n$. Consider the quotient ring $\mathbb{Z}[X_1, \ldots, X_n]/\langle X_j - X_i \rangle$. In this quotient, the images of $X_i$ and $X_j$ are equal; it follows that the determinant $v(X_1, \ldots, X_n)$ equals zero in this quotient ring, since two rows of the Vandermonde matrix are equal. Hence we find that the polynomial $v(X_1, \ldots, X_n)$ is a multiple of $(X_j - X_i)$ in the ring $\mathbb{Z}[X_1, \ldots, X_n]$.

We find that $v(X_1, \ldots, X_n)$ is a polynomial of degree $n(n-1)/2$ in the unique factorization domain $\mathbb{Z}[X_1, \ldots, X_n]$, and the irreducible polynomials $(X_j - X_i)$ are factors of $v$ for all $n(n-1)/2$ possible pairs $1 \le i < j \le n$. It follows now that there exists a constant $r \in \mathbb{Z}$ such that

\[ v(X_1, \ldots, X_n) = r \cdot \prod_{1 \le i < j \le n} (X_j - X_i). \]

Consider the coefficient of the term

\[ X_1^0 \cdot X_2^1 \cdots X_n^{n-1}. \]

In the Vandermonde determinant, this term occurs only on the main diagonal, where it has coefficient $1$. In the product $\prod (X_j - X_i)$, this term occurs only once, by choosing the first summand $X_j$ in each factor of the product. Hence the coefficient $r$ equals $1$.
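Proposition 6.8 is an identity of polynomials, so it can be spot-checked by substituting integers for the variables. The sketch below evaluates the determinant by its permutation expansion (the same expansion used in the proof) and compares it with the product of differences; the function names are hypothetical, and the expansion is only practical for small $n$.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation given as a tuple of indices."""
    inversions = sum(
        1
        for a in range(len(perm))
        for b in range(a + 1, len(perm))
        if perm[a] > perm[b]
    )
    return -1 if inversions % 2 else 1

def det(matrix):
    """Exact integer determinant via the sum over all permutations."""
    n = len(matrix)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for row, col in enumerate(perm):
            term *= matrix[row][col]
        total += term
    return total

def vandermonde(xs):
    """Matrix whose row for x is (1, x, x^2, ..., x^(n-1))."""
    return [[x ** j for j in range(len(xs))] for x in xs]

def difference_product(xs):
    """Product of xs[j] - xs[i] over all pairs i < j."""
    prod = 1
    for j in range(len(xs)):
        for i in range(j):
            prod *= xs[j] - xs[i]
    return prod
```

For example, with `xs = [1, 2, 3]` both `det(vandermonde(xs))` and `difference_product(xs)` equal `2`.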

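This proposition allows one to evaluate the Vandermonde determinant "generically"; we are interested in the special case where the variables $X_i$ are replaced by the distinct powers $\zeta^i$ for $i \in I$. We find that, up to a sign which disappears upon squaring,

\[ \mathrm{Det}(V) = \pm \prod_{\substack{i < j \\ i, j \in I}} (\zeta^i - \zeta^j). \]

With the aid of cyclotomic units, each term can be expressed as

\[ (\zeta^i - \zeta^j) = \zeta^i (1 - \zeta^{j-i}) = \varepsilon_{ij} \cdot \pi, \]

for some units $\varepsilon_{ij}$, provided that $\ell$ does not divide $j - i$. When $d = 1$ this always holds, since then $0 < j - i < \ell$. When $d > 1$ it can fail (for instance $i = 1$, $j = \ell + 1$), and then $\zeta^i - \zeta^j$ is a unit multiple of a higher power of $\pi$. We therefore carry out the computation for $d = 1$, where $n = \ell - 1$, and arrive at:

\begin{align*}
\mathrm{Disc}(\mathcal{O}) &= \mathrm{Det}(V)^2 \\
&= \prod_{i<j} (\zeta^i - \zeta^j)^2 \\
&= \prod_{i<j} \varepsilon_{ij}^2 \pi^2 \\
&= \Bigl( \prod_{i<j} \varepsilon_{ij}^2 \Bigr) \cdot \pi^{n(n-1)} \\
&= u\,\ell^{n-1},
\end{align*}

for some unit $u \in \mathcal{O}_F^\times$. Note that we use our previous factorization, $\ell = \varepsilon\pi^n$, in the last step above. Since the discriminant $\mathrm{Disc}(\mathcal{O})$ is necessarily an integer, we find that, for $d = 1$,

\[ \mathrm{Disc}(\mathcal{O}) = \pm \ell^{n-1} = \pm \ell^{\ell-2}.^{10} \]

For $d > 1$ the same method applies once the higher powers of $\pi$ are taken into account, and the exponent $n - 1$ is replaced by $\ell^{d-1}(d\ell - d - 1)$.

¹⁰ The sign of the discriminant, we recall, can be obtained from the number of complex places.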
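For $d = 1$ the discriminant of $\mathbb{Z}[\zeta]$ can be computed exactly without any complex arithmetic. Since $\mathrm{Det}(V)^2 = \mathrm{Det}(V^{T} V)$ and the $(j, k)$ entry of $V^{T} V$ is $\sum_{i \in I} \zeta^{i(j+k)} = \mathrm{Tr}(\zeta^{j+k})$, it suffices to know the traces of powers of $\zeta$. The sketch below uses the standard fact (not derived in the text) that for a prime $\ell$, $\mathrm{Tr}(\zeta^k) = \ell - 1$ if $\ell$ divides $k$ and $-1$ otherwise; the function names are hypothetical.

```python
from fractions import Fraction

def trace_of_power(l, k):
    """Tr(zeta^k) for a primitive l-th root of unity zeta, with l prime."""
    return l - 1 if k % l == 0 else -1

def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return 0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det                      # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return int(det)

def disc_of_Z_zeta(l):
    """Discriminant of Z[zeta] for a prime l (the case d = 1), computed as
    the determinant of the matrix of traces Tr(zeta^(j+k)), 0 <= j, k < l-1."""
    n = l - 1
    gram = [[trace_of_power(l, j + k) for k in range(n)] for j in range(n)]
    return determinant(gram)
```

For example, `disc_of_Z_zeta(5)` returns `125`, which is $5^{3} = \ell^{\ell - 2}$, and `disc_of_Z_zeta(3)` returns `-3`.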

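We have one more lemma to prove, before the main theorem of this section:

Lemma 6.9 $\pi/\ell \in \mathcal{O}_F^\sharp$. In other words, for every $\alpha \in \mathcal{O}_F$, $\mathrm{Tr}(\pi\alpha/\ell) \in \mathbb{Z}$.

Proof: To prove that $\mathrm{Tr}(\pi\alpha/\ell) \in \mathbb{Z}$ for all $\alpha \in \mathcal{O}_F$, it suffices to prove that $\mathrm{Tr}(\pi\alpha) \in \ell\mathbb{Z}$ for all $\alpha \in \mathcal{O}_F$. Recall that $G = \mathrm{Gal}(F/\mathbb{Q})$; for each $i \in I$, let $g_i \in G$ be the unique element for which $g_i(\zeta) = \zeta^i$. Now we compute

\begin{align*}
\mathrm{Tr}(\pi\alpha) &= \mathrm{Tr}((\zeta - 1)\alpha) \\
&= \sum_{i \in I} (\zeta^i - 1) g_i(\alpha) \\
&= \sum_{i \in I} (\zeta - 1)(\zeta^{i-1} + \cdots + \zeta + 1) g_i(\alpha) \\
&\in \pi\mathcal{O}_F \cap \mathbb{Z}.
\end{align*}

Thus, it suffices to prove that $\pi\mathcal{O}_F \cap \mathbb{Z} = \ell\mathbb{Z}$. On the one hand, $\pi\mathcal{O}_F \cap \mathbb{Z}$ is a prime ideal in $\mathbb{Z}$, since the quotient $\mathbb{Z}/(\pi\mathcal{O}_F \cap \mathbb{Z})$ is naturally a subring of the integral domain $\mathcal{O}_F/\pi\mathcal{O}_F$. Furthermore, $\ell \in \pi\mathcal{O}_F \cap \mathbb{Z}$, and $\ell$ is a prime number. Thus $\pi\mathcal{O}_F \cap \mathbb{Z}$ contains the maximal ideal $\ell\mathbb{Z}$, and hence must equal $\ell\mathbb{Z}$.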

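Theorem 6.10 The ring of integers $\mathcal{O}_F$ in $F = \mathbb{Q}(\zeta)$ is equal to $\mathcal{O} = \mathbb{Z}[\zeta]$.

Proof: We give the argument when $|\mathrm{Disc}(\mathbb{Z}[\zeta])| = \ell^{n-1}$, which is the case $d = 1$ used in the remainder of this chapter. We have found that $\mathbb{Z}[\zeta] = \mathcal{O} \subset \mathcal{O}_F$. Consider the following sequence of lattices and inclusions:

\[ \mathcal{O} \subset \mathcal{O}_F \subset \mathcal{O}_F^\sharp \subset \mathcal{O}^\sharp. \]

By multiplicativity of the index, we find that

\[ \ell^{n-1} = |\mathrm{Disc}(\mathcal{O})| = [\mathcal{O}^\sharp : \mathcal{O}] = [\mathcal{O}^\sharp : \mathcal{O}_F^\sharp]\,[\mathcal{O}_F^\sharp : \mathcal{O}_F]\,[\mathcal{O}_F : \mathcal{O}]. \]

Therefore, in order to prove that $\mathcal{O}_F = \mathcal{O}$, it suffices to prove that $[\mathcal{O}_F : \mathcal{O}] = 1$, for which it suffices to prove that $[\mathcal{O}_F^\sharp : \mathcal{O}_F] \ge \ell^{n-1}$.

Consider now the inclusions $\ell\mathcal{O}_F \subset \pi\mathcal{O}_F \subset \mathcal{O}_F$. Since the absolute value of the norm of $\pi$ is $\ell$, and the norm of $\ell$ is $\ell^n$, we find that $[\pi\mathcal{O}_F : \ell\mathcal{O}_F] = \ell^{n-1}$. Dividing through by $\ell$, and using Lemma 6.9 for the second inclusion, we have an inclusion of lattices:

\[ \mathcal{O}_F \subset \frac{\pi}{\ell}\mathcal{O}_F \subset \mathcal{O}_F^\sharp. \]

Since the first inclusion has index $\ell^{n-1}$, we find that $[\mathcal{O}_F^\sharp : \mathcal{O}_F] \ge \ell^{n-1}$. This completes the squeeze.¹¹ For $d > 1$ the theorem remains true, but $|\mathrm{Disc}(\mathbb{Z}[\zeta])|$ is a larger power of $\ell$, so the bound obtained from Lemma 6.9 alone does not close the squeeze.

¹¹ In fact, we have additionally proven that $\mathcal{O}_F^\sharp = (\pi/\ell) \cdot \mathcal{O}_F$.

Corollary 6.11 When $d = 1$, the discriminant of the number field $F = \mathbb{Q}(\zeta)$ is equal to $\pm\ell^{n-1} = \pm\ell^{\ell-2}$.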

6.3 The totally real subfield

Observe that if $\zeta$ is a primitive $(\ell^d)$th root of unity, then $\zeta^{-1}$ is also a primitive $(\ell^d)$th root of unity. Thus, there is a canonical element of $G = \mathrm{Gal}(\mathbb{Q}(\zeta)/\mathbb{Q})$, which sends $\zeta$ to $\zeta^{-1}$. If $\alpha \in \mathbb{Q}(\zeta)$, we write $\bar\alpha$ for the image of $\alpha$ under this field automorphism; in particular, $\bar\zeta = \zeta^{-1}$. Of course $\bar{\bar\alpha} = \alpha$; the field automorphism is an involution.

We define $F^+$ to be the fixed subfield of $F = \mathbb{Q}(\zeta)$, under this involution:

\[ F^+ = \mathbb{Q}(\zeta)^+ = \{ \alpha \in F : \bar\alpha = \alpha \}. \]

The cyclotomic field $F$ is a quadratic extension of $F^+$ (we assume $\ell^d > 2$, so that the involution is nontrivial). Observe that if $v$ is any embedding of $F$ into $\mathbb{C}$, then $\zeta$ and $\bar\zeta$ get mapped, via $v$, to complex conjugate elements of $\mathbb{C}$. It follows that $v(F^+) \subset \mathbb{R}$. Since every embedding of $F^+$ in $\mathbb{C}$ extends to an embedding of $F$ into $\mathbb{C}$, we find that every embedding of $F^+$ in $\mathbb{C}$ lands inside of $\mathbb{R}$. Therefore $F^+$ is a totally real number field, and $F$ is a totally complex quadratic extension of $F^+$.

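Proposition 6.12 The unit group $\mathcal{O}_F^\times$ contains the unit group $\mathcal{O}_{F^+}^\times$ with finite index.

Proof: The field $F$ has $n/2$ complex places and $0$ real places, and degree $n$ over $\mathbb{Q}$. On the other hand, $F^+$ has zero complex places, degree $n/2$ over $\mathbb{Q}$, and hence has $n/2$ real places.

It follows from the unit theorem that the free rank¹² of $\mathcal{O}_F^\times$ and the free rank of $\mathcal{O}_{F^+}^\times$ are the same:

\[ \mathrm{Rank}(\mathcal{O}_F^\times) = \mathrm{Rank}(\mathcal{O}_{F^+}^\times) = \frac{n}{2} - 1. \]

Since $\mathcal{O}_{F^+}^\times$ is a subgroup of the finitely generated group $\mathcal{O}_F^\times$ with the same free rank, it has finite index.

¹² If $A$ is a finitely generated abelian group, then $A$ is isomorphic to $A_{tor} \times \mathbb{Z}^r$, for some torsion abelian group $A_{tor}$ and some non-negative integer $r$. The integer $r$ is called the free rank of $A$.

In fact, we can go further and express the units in $\mathcal{O}_F^\times$ in terms of units from $\mathcal{O}_{F^+}^\times$ and roots of unity. In what follows, we write $\mu(F)$ for the group of torsion units in $\mathcal{O}_F^\times$, i.e., roots of unity. Note that, for all $\varepsilon \in \mu(F)$, $\bar\varepsilon = \varepsilon^{-1}$. If $\ell$ is an odd prime, then

\[ \mu(F) = \{ \pm\zeta^r : 0 \le r < \ell^d \}, \]

and $\mu(F)^2$ is the cyclic subgroup generated by $\zeta$. On the other hand, if $\ell = 2$, then $\mu(F)$ is generated by $\zeta$, and $\mu(F)^2$ is generated by $\zeta^2$.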

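Proposition 6.13 Suppose that $u \in \mathcal{O}_F^\times$. Then, there exists a unit $v \in \mathcal{O}_{F^+}^\times$, and a root of unity $\varepsilon \in \mu(F)$ such that $u = \varepsilon \cdot v$.

Proof: Some parts of the following proof are loosely based on Proposition 5.12 of Milne's text¹³.

¹³ James S. Milne. Algebraic number theory (v3.02), 2009. Available at www.jmilne.org/math/

Consider the map $u \mapsto u/\bar u$ from $\mathcal{O}_F^\times$ to $\mathcal{O}_F^\times$. It is a group homomorphism, whose kernel is $\mathcal{O}_{F^+}^\times$.

Since complex conjugation preserves every archimedean absolute value, we find that $\lambda(u/\bar u) = 0$, where $\lambda$ denotes the logarithmic modulus map of the previous chapter. Thus $u/\bar u$ is a torsion unit, i.e., $u/\bar u \in \mu(F)$.

Furthermore, if $u/\bar u \in \mu(F)^2$, then $u = \bar u\,\varepsilon^2$ for some $\varepsilon \in \mu(F)$. It follows (since $\bar\varepsilon = \varepsilon^{-1}$) that $(u\bar\varepsilon)/\overline{(u\bar\varepsilon)} = 1$. Thus $u\bar\varepsilon \in \mathcal{O}_{F^+}^\times$, and we find that $u \in \mu(F) \cdot \mathcal{O}_{F^+}^\times$.

We now have the following commutative diagram of abelian groups and homomorphisms, whose columns arise from inclusion, equality, and projection, from left to right:

\[
\begin{array}{ccccccccc}
1 & \to & \mathcal{O}_{F^+}^\times & \to & \mathcal{O}_F^\times & \xrightarrow{\;u \mapsto u/\bar u\;} & \mu(F) & \to & 1 \\
 & & \downarrow & & \| & & \downarrow & & \\
1 & \to & \mu(F)\,\mathcal{O}_{F^+}^\times & \to & \mathcal{O}_F^\times & \to & \mu(F)/\mu(F)^2 & \to & 1
\end{array}
\]

Now, in order to prove that $\mathcal{O}_F^\times = \mu(F)\,\mathcal{O}_{F^+}^\times$, it suffices to prove that $u/\bar u$ is always a square in $\mu(F)$. There are two cases to consider:

Much of this reasoning goes through, without changes, to the case of a CM field and its totally real subfield. In that generality, one finds always that $\mathcal{O}_F^\times$ contains $\mathcal{O}_{F^+}^\times \mu(F)$ with index 1 or 2.

$\ell$ odd: When $\ell$ is odd, $\mu(F)^2$ is generated by $\zeta$. Observe that if $u \in \mathcal{O}_F^\times$, then there exist integers $a_0, \ldots, a_{n-1}$ such that

\[ u = a_0 + a_1\zeta + \cdots + a_{n-1}\zeta^{n-1}. \]

Since integers are fixed under the involution, we find that

\[ \bar u = a_0 + a_1\zeta^{-1} + \cdots + a_{n-1}\zeta^{-(n-1)}. \]

Modulo $\pi = \zeta - 1$ we have $\zeta \equiv \zeta^{-1} \equiv 1$, so we find that $u \equiv \bar u$. If $u/\bar u$ is not in $\mu(F)^2$, then $u/\bar u = -\zeta^r$ for some integer $r$, so that $u = -\zeta^r \bar u$, and

\[ u \equiv -\zeta^r \bar u \equiv -u \pmod{\pi}. \]

Hence $2u \equiv 0$, modulo $\pi$. Note that $2 \not\equiv 0$, modulo $\pi$, since $\ell$ is odd; it follows that $u \equiv 0$, modulo $\pi$, contradicting the fact that $u$ is a unit. Therefore, we find that $u/\bar u \in \mu(F)^2$.

$\ell = 2$: When $\ell = 2$ (and $d > 1$, so $F \neq \mathbb{Q}$, of course), we find that $\mu(F)$ is generated by $\zeta$, and $\mu(F)^2$ is generated by $\zeta^2$. Thus, if $u/\bar u$ is not in $\mu(F)^2$, then $u/\bar u = \zeta \cdot \zeta^{2r}$ for some integer $r$. In this case, we work modulo $\nu = 1 - \zeta^2$; note that $\pi$ is not a multiple of $\nu$. Working modulo $\nu$ we find that $u \equiv \zeta \bar u$. Note that $\zeta^{-1} \equiv \zeta$, modulo $\nu$, and hence $\bar u \equiv u$, modulo $\nu$. Thus $u \equiv \zeta u$, modulo $\nu$. It follows that $\pi u$ is a multiple of $\nu$. But this is a contradiction, since $u$ is a unit and $\pi$ is not a multiple of $\nu$. Hence $u/\bar u \in \mu(F)^2$.

Thus $u/\bar u \in \mu(F)^2$ for all $u \in \mathcal{O}_F^\times$. The result follows.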

6.4 The first case of Fermat’s Last Theorem

Fermat's Last Theorem, proven by Andrew Wiles, is the following

Theorem 6.14 Suppose that $n$ is an integer, and $n \ge 3$. Suppose that $x, y, z \in \mathbb{Z}$ and $x^n + y^n = z^n$. Then $xyz = 0$. In other words, there are no solutions to the equation $X^n + Y^n = Z^n$, except for the "trivial solutions" in which at least one coordinate vanishes.

It was long ago realized that it suffices to prove Fermat's Last Theorem for odd prime exponents, together with the exponent $4$. Indeed, any nontrivial solution $(x, y, z)$ to $X^n + Y^n = Z^n$ yields a nontrivial solution

\[ (x^{n/\ell})^\ell + (y^{n/\ell})^\ell = (z^{n/\ell})^\ell \]

to the equation $X^\ell + Y^\ell = Z^\ell$, if $\ell$ is a divisor of $n$; and every $n \ge 3$ is divisible by an odd prime or by $4$. The exponent $4$ having been settled by Fermat himself, it is natural to restrict to the case of Fermat's Last Theorem with odd prime exponent.

Historically, Fermat's Last Theorem has been approached in two cases.

Case 1: Suppose that $x, y, z \in \mathbb{Z}$ and $x^\ell + y^\ell = z^\ell$. Then $\ell$ must divide $xyz$.

Case 2: Suppose that $x, y, z \in \mathbb{Z}$ and $\ell \mid (xyz)$. Then if $x^\ell + y^\ell = z^\ell$, then $xyz = 0$.

In other words, the first case of Fermat's Last Theorem states that there are no solutions $(x, y, z)$ to $X^\ell + Y^\ell = Z^\ell$ in which $\ell$ does not divide any of $x, y, z$. The second case begins with the assumption that $\ell$ divides one of $x, y, z$.

In this section, we prove the first case of Fermat's Last Theorem, for "regular primes" $\ell$.

Definition 6.15 A prime number $\ell$ is a regular prime if $\ell$ does not divide the class number $h(\mathbb{Q}(\zeta))$ where $\zeta$ is a primitive $\ell$th root of unity. Otherwise, $\ell$ is called an irregular prime.¹⁴

¹⁴ While "irregular" primes appear rare among small prime numbers, it has not been proven that irregular primes occur less often than regular primes, asymptotically. In fact, it has been proven that there are infinitely many irregular primes, and it has not been proven that there are infinitely many regular primes!

Hereafter, fix an odd prime number $\ell$, and a primitive $\ell$th root of unity $\zeta$. Let $F = \mathbb{Q}(\zeta)$ denote the resulting cyclotomic field. Let $\pi = \zeta - 1$. Recall that $n = [F : \mathbb{Q}] = \ell - 1$, and $\pi^{\ell-1}$ is a unit multiple of $\ell$. Also, the discriminant of $F$ is equal to $\pm\ell^{\ell-2}$, and $\mathcal{O}_F = \mathbb{Z}[\zeta]$.

The relevance of cyclotomic fields to Fermat's Last Theorem is that the equation $X^\ell - Y^\ell = Z^\ell$ can be factored in the ring $\mathcal{O}_F[X, Y, Z]$:

\[ X^\ell - Y^\ell = \prod_{i=0}^{\ell-1} (X - \zeta^i Y) = Z^\ell. \]

Note that the switch in sign, $Y^\ell \mapsto -Y^\ell$, does not affect the study of Fermat's Last Theorem, since we assume that $\ell$ is an odd prime.
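The factorization of $X^\ell - Y^\ell$ into linear factors can be checked numerically for integer values of $X$ and $Y$. The sketch below models $\zeta$ as $e^{2\pi i/\ell}$ in $\mathbb{C}$, which is an assumption of this illustration; the function name is hypothetical, and the comparison is only up to floating-point rounding.

```python
import cmath

def product_of_linear_factors(x, y, l):
    """Product of (x - zeta^i * y) for i = 0, ..., l-1,
    where zeta = exp(2*pi*i/l) is a primitive l-th root of unity in C."""
    zeta = cmath.exp(2j * cmath.pi / l)
    prod = 1
    for i in range(l):
        prod *= x - zeta ** i * y
    return prod
```

For example, `product_of_linear_factors(3, 2, 5)` agrees with $3^5 - 2^5 = 211$ up to rounding, and replacing $y$ by $-y$ gives $3^5 + 2^5 = 275$, which illustrates why the sign switch is harmless for odd $\ell$.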

Theorem 6.16 (Case 1, Regular Primes) Suppose that $\ell > 5$. Suppose that $\ell$ does not divide the order $h(F)$ of the ideal class group of $F$, i.e., $\ell$ is a regular prime. Then if $x, y, z \in \mathbb{Z}$ and $x^\ell - y^\ell = z^\ell$, then $\ell \mid (xyz)$.

Proof: First, note that it suffices to assume that $\gcd(x, y, z) = 1$, i.e., $x$, $y$, and $z$ do not have any common integer factors except for $\pm 1$. Indeed, if $d$ were a common factor of $x$, $y$, and $z$, then a new "smaller" solution to the Fermat equation would be obtained by dividing through by $d$.

So hereafter, we assume that $x, y, z$ generate the unit ideal in $\mathbb{Z}$. It follows that $x, y, z$ generate the unit ideal in $\mathcal{O}_F$ as well. We work in the ring $\mathcal{O}_F$ throughout the rest of the proof.

Given that $x^\ell - y^\ell = z^\ell$, for $x, y, z \in \mathcal{O}_F$, we find an identity in $\mathcal{O}_F$:

\[ \prod_{i=0}^{\ell-1} (x - \zeta^i y) = z^\ell. \]

This may also be interpreted as an equality of principal ideals in $\mathcal{O}_F$:

\[ \prod_{i=0}^{\ell-1} (x - \zeta^i y)\mathcal{O}_F = (z\mathcal{O}_F)^\ell. \]

Now, suppose that two ideals $(x - \zeta^i y)\mathcal{O}_F$ and $(x - \zeta^j y)\mathcal{O}_F$, with $i \neq j$, have a common prime ideal factor $P$. Thus $(x - \zeta^i y) \in P$ and $(x - \zeta^j y) \in P$. It follows (from subtracting) that $(\zeta^i - \zeta^j)y \in P$. Since $P$ is prime, we find that $(\zeta^i - \zeta^j) \in P$ or $y \in P$. Since $\zeta^i - \zeta^j$ is a unit multiple of $\pi$, we have $\pi \in P$ or $y \in P$.

Similarly, by considering $\zeta^{-i}(x - \zeta^i y) - \zeta^{-j}(x - \zeta^j y)$, we find that $(\zeta^{-i} - \zeta^{-j})x \in P$. It follows that $\pi \in P$ or $x \in P$. To summarize, either $\pi \in P$ or $x, y \in P$. Of course, if $x, y \in P$, then $z \in P$. This contradicts our assumption that $x, y, z$ generate the unit ideal in $\mathcal{O}_F$.

We have proven that, if two ideals $(x - \zeta^i y)\mathcal{O}_F$ and $(x - \zeta^j y)\mathcal{O}_F$ have a common prime ideal factor $P$, then $\pi \in P$; it follows that $P = \pi\mathcal{O}_F$. In this case, we find that $P = \pi\mathcal{O}_F$ is a prime factor of $z\mathcal{O}_F$. Therefore $z \in (\pi\mathcal{O}_F) \cap \mathbb{Z} = \ell\mathbb{Z}$, and we are done.

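It remains to prove the theorem when the ideals $(x - \zeta^i y)\mathcal{O}_F$ (for $0 \le i \le \ell - 1$) are pairwise relatively prime. In this case, let the prime factorization of $z\mathcal{O}_F$ be given by

\[ z\mathcal{O}_F = P_1^{e_1} \cdots P_t^{e_t}, \]

for distinct prime ideals $P_1, \ldots, P_t$ in $\mathcal{O}_F$. Then, for each fixed $i$ and after reordering, we find that the prime factorization of $(x - \zeta^i y)\mathcal{O}_F$ is given by

\[ (x - \zeta^i y)\mathcal{O}_F = P_1^{\ell e_1} \cdots P_s^{\ell e_s}, \]

for some $0 \le s \le t$. Let $I = P_1^{e_1} \cdots P_s^{e_s}$. Then we find that $I^\ell$ is a principal ideal in $\mathcal{O}_F$. Since $\ell$ does not divide the order of the group $H(F)$, this implies that $I$ itself is principal: $I = \omega_i\mathcal{O}_F$ for some $\omega_i \in \mathcal{O}_F$. We find that for all $0 \le i \le \ell - 1$, there exists $\omega_i \in \mathcal{O}_F$ such that:

\[ (x - \zeta^i y)\mathcal{O}_F = \omega_i^\ell\,\mathcal{O}_F. \]

Hence there exist units $u_i$ such that

\[ x - \zeta^i y = u_i\,\omega_i^\ell. \]

We now "reduce mod $\ell$". Observe that every element of $\mathcal{O}_F$ is congruent to an element of $\mathbb{Z}$, modulo $\pi = \zeta - 1$; it follows that every $\ell$th power in $\mathcal{O}_F$ is congruent to an element of $\mathbb{Z}$, modulo $\ell\mathcal{O}_F$. (If $\alpha = a + \pi\beta$ with $a \in \mathbb{Z}$, then $\alpha^\ell \equiv a^\ell + \pi^\ell\beta^\ell \equiv a^\ell$ modulo $\ell\mathcal{O}_F$, since $\pi^\ell \in \ell\mathcal{O}_F$.) In particular, $\omega_i^\ell$ is congruent to an integer $w_i$, modulo $\ell\mathcal{O}_F$:

\[ x - \zeta^i y \equiv u_i w_i \pmod{\ell\mathcal{O}_F}. \]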

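Recall now that if $u$ is a unit in $\mathcal{O}_F^\times$, then $u = \zeta^r v$ for some integer $0 \le r \le \ell - 1$ and some unit $v \in \mathcal{O}_{F^+}^\times$ (Proposition 6.13; a sign may be absorbed into $v$). Considering $i = 1$, and omitting subscripts accordingly, we find that:

\[ x - \zeta y \equiv \zeta^r v w, \quad\text{and}\quad x - \zeta^{-1} y \equiv \zeta^{-r} v w \pmod{\ell\mathcal{O}_F}, \]

the second congruence being the image of the first under the involution $\alpha \mapsto \bar\alpha$, which fixes $v$ and $w$. Identifying $vw$ in both congruences, we find that

\[ \zeta^{-r} x - \zeta^{1-r} y \equiv \zeta^r x - \zeta^{r-1} y \pmod{\ell\mathcal{O}_F}. \]

Simplifying, by multiplying through by $\zeta^r$,

\[ x - \zeta y - \zeta^{2r} x + \zeta^{2r-1} y \equiv 0 \pmod{\ell\mathcal{O}_F}. \]

Now, we have several possibilities to consider, depending on whether the coefficients $1, \zeta, \zeta^{2r}, \zeta^{2r-1}$ are distinct elements of $\mathcal{O}_F$:

All distinct: If the elements $1, \zeta, \zeta^{2r}, \zeta^{2r-1}$ are all distinct, then since $\ell > 5$, we find that the set $\{1, \zeta, \zeta^{2r}, \zeta^{2r-1}\}$ is $\mathbb{Z}$-linearly independent in $\mathcal{O}_F = \mathbb{Z}[\zeta]$, and extends to a $\mathbb{Z}$-basis of $\mathcal{O}_F$. Since $x, y \in \mathbb{Z}$, the congruence forces $\ell$ to divide the coefficients $x$ and $y$. Hence $\ell$ divides $xyz$ (and in fact this contradicts the assumption that $x, y, z$ have no common factor).

$1 = \zeta^{2r}$: If $1 = \zeta^{2r}$, then we find that $(\zeta^{-1} - \zeta)y \equiv 0$, modulo $\ell\mathcal{O}_F$. Since $\zeta^{-1} - \zeta$ is a unit multiple of $\pi$ and $\ell\mathcal{O}_F = \pi^{\ell-1}\mathcal{O}_F$, it follows that $y \in \pi\mathcal{O}_F \cap \mathbb{Z} = \ell\mathbb{Z}$. Hence $\ell$ divides $y$.

$1 = \zeta^{2r-1}$: If $1 = \zeta^{2r-1}$, then we find that $(1 - \zeta)(x + y) \equiv 0$, modulo $\ell\mathcal{O}_F$. It follows that $x + y \in \pi\mathcal{O}_F \cap \mathbb{Z} = \ell\mathbb{Z}$. Hence $x$ is congruent to $-y$, modulo $\ell$. This does not yet give $\ell \mid xyz$, so we apply the entire argument to the rearranged equation $x^\ell - z^\ell = y^\ell$: either $\ell$ divides $xyz$, or the same exceptional case occurs and $x$ is congruent to $-z$, modulo $\ell$. In the latter case, reducing $x^\ell - y^\ell = z^\ell$ modulo $\ell$ and using $a^\ell \equiv a$ gives $x - y \equiv z$, that is, $2x \equiv -x$. Thus $\ell$ divides $3x$, and since $\ell > 5$, $\ell$ divides $x$.

$\zeta = \zeta^{2r-1}$: If $\zeta = \zeta^{2r-1}$, then $(1 - \zeta^2)x \equiv 0$, modulo $\ell\mathcal{O}_F$. Hence $x \in \pi\mathcal{O}_F \cap \mathbb{Z} = \ell\mathbb{Z}$. Hence $\ell$ divides $x$.

As such coincidences include all possible equalities among the elements $1, \zeta, \zeta^{2r}, \zeta^{2r-1}$, we find in all possible cases that $\ell$ divides $xyz$.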

6.5 Exercises

Exercise 6.1 Write down the cyclotomic polynomial $\Phi_{12}(X)$.

Exercise 6.2 Let $\zeta$ be a primitive 7th root of unity and let $F = \mathbb{Q}(\zeta)$. What is the minimal polynomial of $\zeta + \zeta^{-1}$? Can you determine $\mathcal{O}_{F^+}$?

Exercise 6.3 Suppose that $\zeta$ is a primitive $\ell$th root of unity, for an odd prime number $\ell$. Prove that the cyclotomic field $\mathbb{Q}(\zeta)$ has a unique subfield $K$ such that $[K : \mathbb{Q}] = 2$. Challenge: identify this field $K$ explicitly.

Exercise 6.4 Suppose that $\ell$ is an odd prime number, and $\zeta$ is a primitive $(\ell^d)$th root of unity. Let $G = \mathrm{Gal}(\mathbb{Q}(\zeta)/\mathbb{Q})$. Let $H = \{ g \in G : g^{\ell-1} = 1 \}$. Prove that $H$ has order $\ell - 1$, and thus $\mathbb{Q}(\zeta)$ contains a subfield $M$ such that $[M : \mathbb{Q}] = \ell^{d-1}$. Challenge: prove that if $C$ is a finite cyclic group, then there exists a Galois extension of $\mathbb{Q}$ with Galois group isomorphic to $C$.

Exercise 6.5 Prove the first case of Fermat's last theorem, for $\ell = 3$ and $\ell = 5$. In other words, prove that if $x^\ell + y^\ell = z^\ell$, then $\ell$ divides $xyz$. Hint: Consider the cubes, modulo 9, and the fifth powers, modulo 25.

7 Valued fields

Much of our analysis of number fields has followed from embedding the number field into the real or complex numbers. After this embedding, we may apply our knowledge of geometry and analysis (computing volumes, for example), to study the number field within a real or complex vector space.

There are other fields which, like the real or complex numbers, possess topological and analytic structure; the most important ones for number theory are the p-adic fields. By embedding into p-adic fields, we will be able to exploit geometry and analysis in these fields to obtain results about number fields.

We present the general theory of valuations on fields in this chapter, a theory which encompasses both real numbers and p-adic numbers. By classifying these valuations on number fields, we justify the central role of p-adic valuations in later chapters.

Much of the material in this chapter, including of course Ostrowski's theorem, owes a great deal to the original work of Ostrowski from 1916¹.

¹ Alexander Ostrowski. Über einige Lösungen der Funktionalgleichung ψ(x) · ψ(x) = ψ(xy). Acta Math., 41(1):271–284, 1916. ISSN 0001-5962.

7.1 Valuations

Let $F$ be any field, for the moment. The following is an abstraction of the idea of "absolute value" on $F$:

Definition 7.1 A valuation² on $F$ is a function $|\cdot| : F \to \mathbb{R}$, which is

Multiplicative $|xy| = |x| \cdot |y|$, for all $x, y \in F$.

Positive-definite $|x| \ge 0$, for all $x \in F$, and $|x| = 0$ if and only if $x = 0$.

Metric $|x + y| \le |x| + |y|$, for all $x, y \in F$.

A pair consisting of a field $F$, together with a valuation $|\cdot|$ on $F$, is called a valued field.³ Often one just says that $F$ is a valued field, if the valuation is to be understood, and written with the standard notation.

² The term valuation is used differently by different authors, and in different contexts. In commutative ring theory, one often works with "valuation rings". The valuations discussed there are different from our valuations; in some cases, valuations on rings are logarithms of the valuations we study here.

³ Some authors use the term "valued field" only to refer to nonarchimedean valued fields, which we introduce momentarily.

An immediate consequence of this definition is the following:

Proposition 7.2 Let $|\cdot|$ be a valuation on a field $F$. Then the valuation restricts to a group homomorphism from $F^\times$ to the group $\mathbb{R}^\times_{pos}$ of positive real numbers under multiplication. For every $\varepsilon \in \mu(F)$, $|\varepsilon| = 1$.

Proof: The fact that $|xy| = |x| \cdot |y|$ implies that the valuation defines a group homomorphism from $F^\times$ to the group $\mathbb{R}^\times$. Since all valuations are non-negative, its image lies in $\mathbb{R}^\times_{pos}$.

If $\varepsilon \in \mu(F)$ is a root of unity, then we find that $\varepsilon^n = 1$ for some positive integer $n$. It follows that $|\varepsilon|^n = 1$ in $\mathbb{R}^\times_{pos}$. Since the only positive root of unity in $\mathbb{R}$ is $1$, we find that $|\varepsilon| = 1$.

There is a surprising dichotomy among valuations, suggested by the following

Definition 7.3 Let $|\cdot|$ be a valuation on a field $F$. It is called archimedean if for all $x, y \in F^\times$, there exists an integer⁴ $n$ such that $|nx| > |y|$. The valuation is called nonarchimedean if for all $x, y \in F$, the following ultrametric triangle inequality is satisfied:

\[ |x + y| \le \max\{|x|, |y|\}. \]

⁴ Here, we are viewing any integer $n$ as an element of $F$, via the unique "characteristic" ring homomorphism from $\mathbb{Z}$ to $F$.

Lemma 7.4 If $F$ is an archimedean valued field, then the set $\{|n| : n \in \mathbb{Z}\}$ is unbounded in $\mathbb{R}$. If $F$ is a nonarchimedean valued field, then the set $\{|n| : n \in \mathbb{Z}\}$ is bounded by $1$. In particular, a valued field cannot be both archimedean and nonarchimedean.

Proof: First, suppose that $F$ is an archimedean valued field. Then (taking $x = y = 1$ in the definition) there exists an integer $n$ such that

\[ |n| = |n \cdot 1| > |1| = 1. \]

It follows that the sequence $|n^k| = |n|^k$ is unbounded, for positive integers $k$.

Next, suppose that $F$ is a nonarchimedean valued field. Then the ultrametric triangle inequality implies that for every positive integer $n$,

\[ |n| = |1 + (n - 1)| \le \max\{1, |n - 1|\}. \]

By induction, we find that $|n| \le 1$ for all positive integers $n$. Since $|-1| = 1$ and $|0| = 0$, we find that $|n| \le 1$ for all integers $n$.

With this lemma in place, we can prove the following dichotomy:

Proposition 7.5 Suppose that $F$ is a field with valuation $|\cdot|$. Then the valuation is either archimedean or nonarchimedean.

Proof: Suppose first that $|\cdot|$ is not archimedean⁵. It follows that there exist $x, y \in F^\times$, such that $|nx| \le |y|$, for all $n \in \mathbb{Z}$. It follows that $|n| \le |x^{-1}y|$ for all $n \in \mathbb{Z}$. In other words, the absolute values $|1|, |2|, \ldots$ are bounded by some nonzero constant $M$.

⁵ Note here that the condition "not archimedean" does not yet imply "nonarchimedean". That is the conclusion of the proposition.

Now let $x, y \in F$ be arbitrary and suppose, without loss of generality, that $|x| \ge |y|$. Each binomial coefficient is an integer, so it has absolute value at most $M$, and $|x^{n-k}y^k| \le |x|^n$ for all $0 \le k \le n$. Then we find that

\begin{align*}
|x + y| &= \sqrt[n]{|x + y|^n} \\
&= \sqrt[n]{|(x + y)^n|} \\
&= \sqrt[n]{|x^n + n x^{n-1} y + \cdots + n x y^{n-1} + y^n|} \\
&\le \sqrt[n]{|x^n| + |y^n| + M(n - 1)|x|^n} \\
&\le \sqrt[n]{(2 + M(n - 1))|x|^n} \\
&= \sqrt[n]{2 + M(n - 1)} \cdot |x|.
\end{align*}

We find that $|x + y|$ is a real number, bounded above by $\sqrt[n]{2 + M(n - 1)} \cdot |x|$, for all positive integers $n$. Since the sequence $\sqrt[n]{2 + M(n - 1)}$ approaches $1$, as $n$ grows large, we find that $|x + y| \le (1 + \varepsilon)|x|$ for all positive real numbers $\varepsilon$. Hence

\[ |x + y| \le |x| = \max\{|x|, |y|\}. \]

With the previous lemma in mind, we find that

Corollary 7.6 A valued field is archimedean if and only if the set $\{|n| : n \in \mathbb{Z}\}$ is unbounded.
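The dichotomy can be made concrete on the field $\mathbb{Q}$. The sketch below implements the $p$-adic absolute value $|x|_p = p^{-v}$, where $p^v$ is the exact power of $p$ in $x$; this normalization is the usual one but is an assumption here, since the $p$-adic valuation has not yet been defined in the text, and the function names are hypothetical. Exact rational arithmetic is used so that the inequalities can be tested without rounding.

```python
from fractions import Fraction

def p_adic_abs(x, p):
    """The p-adic absolute value |x|_p = p^(-v) on Q, where x = p^v * a/b
    with a and b not divisible by p, and |0|_p = 0."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return Fraction(1, p) ** v

def satisfies_ultrametric(x, y, p):
    """Check |x + y|_p <= max(|x|_p, |y|_p) for one pair of rationals."""
    return p_adic_abs(x + y, p) <= max(p_adic_abs(x, p), p_adic_abs(y, p))

def integers_bounded_by_one(p, limit):
    """Check |n|_p <= 1 for all integers n with |n| <= limit."""
    return all(p_adic_abs(n, p) <= 1 for n in range(-limit, limit + 1))
```

For example, `p_adic_abs(12, 2)` is `Fraction(1, 4)`. In line with Corollary 7.6, the integers are bounded by $1$ for $|\cdot|_p$, whereas for the ordinary absolute value they are unbounded.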

7.2 Completion

When $F$ is a valued field, it has a natural Hausdorff topology induced by the metric $d(x, y) = |x - y|$. If $x \in F$ and $0 < r \in \mathbb{R}$, we write $D(x, r)$ for the “open disc” in $F$:

\[ D(x, r) = \{ y \in F : |x - y| < r \}. \]

Such discs form a basis for the topology on $F$.

In order to see that the topology on $F$ is “compatible” with the field structure, it is important to see the effect of addition, subtraction, multiplication, and (in a field) inversion (the map $x \mapsto x^{-1}$ on $F^\times$) on discs.

Lemma 7.7 Suppose that $x_1, x_2 \in F$, and $0 < r_1, r_2 \in \mathbb{R}$. Then

\[ D(x_1, r_1) \pm D(x_2, r_2) \subset D(x_1 \pm x_2, r_1 + r_2). \]


Proof: If $y_1 \in D(x_1, r_1)$ and $y_2 \in D(x_2, r_2)$, then we find that

\begin{align*}
|(y_1 \pm y_2) - (x_1 \pm x_2)| &= |(y_1 - x_1) \pm (y_2 - x_2)| \\
&\le |y_1 - x_1| + |y_2 - x_2| \\
&< r_1 + r_2.
\end{align*}

Lemma 7.8 Suppose that $x_1, x_2 \in F$, and $0 < r_1, r_2 \in \mathbb{R}$, and $|x_1|, |x_2| < R$. Then

\[ D(x_1, r_1) \cdot D(x_2, r_2) \subset D(x_1 x_2, R(r_1 + r_2) + r_1 r_2). \]

Proof: If $y_1 \in D(x_1, r_1)$ and $y_2 \in D(x_2, r_2)$, then we find that

\begin{align*}
|(y_1 \cdot y_2) - (x_1 \cdot x_2)| &= |y_1 y_2 - y_1 x_2 + y_1 x_2 - x_1 x_2| \\
&= |y_1 (y_2 - x_2) + x_2 (y_1 - x_1)| \\
&\le |y_1||y_2 - x_2| + |x_2||y_1 - x_1| \\
&< (R + r_1)(r_2) + (R)(r_1) \\
&= R(r_1 + r_2) + r_1 r_2.
\end{align*}

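Lemma 7.9 Suppose that $x \in F^\times$ and $0 < r \in \mathbb{R}$, and $R_{\min} < |x|$. Also, suppose that $r < R_{\min}$. Then

\[ D(x, r)^{-1} \subset D\big(x^{-1}, \, r R_{\min}^{-1} (R_{\min} - r)^{-1}\big). \]

Proof: If $y \in D(x, r)$, then observe first that

\[ |y| + |x - y| \ge |x| > R_{\min}. \]

It follows that $|y| > R_{\min} - r > 0$, and so $y \in F^\times$.

Continuing, we find that

\begin{align*}
|y^{-1} - x^{-1}| &= |x - y| \, |x^{-1} y^{-1}| \\
&< r \cdot |x|^{-1} \cdot |y|^{-1} \\
&< r \cdot R_{\min}^{-1} \cdot |y|^{-1} \\
&< r \cdot R_{\min}^{-1} \cdot (R_{\min} - r)^{-1}.
\end{align*}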

One construction of the real numbers uses Cauchy sequences. For a general valuation on a field $F$, we define these by

Definition 7.10 Suppose that $(a_i)_{i=0}^{\infty}$ is a sequence of elements of a valued field $F$. It is called a Cauchy sequence if for every real⁶ $\varepsilon > 0$, there exists $N$ such that $n, m \ge N$ implies that $|a_n - a_m| < \varepsilon$.

A valued field $F$ is called complete if every Cauchy sequence in $F$ converges to some element of $F$.

⁶ If one wishes to use Cauchy sequences to define the real numbers, this definition is circular, at least on the surface. However, one may restrict to $0 < \varepsilon \in \mathbb{Q}$, and only consider rational-valued valuations as well, if one wishes to use Cauchy sequences to define the real numbers from the rationals.


An important first observation about Cauchy sequences in a valued field is the following

Lemma 7.11 If $(a_i)$ is a Cauchy sequence in a valued field $F$, then the sequence $(|a_i|)$ is a Cauchy sequence in $\mathbb{R}$. It follows that there exists a real number $\alpha$ such that $|a_i| \to \alpha$.

Proof: For all $\varepsilon > 0$, there exists $N$ such that $n, m \ge N$ implies that $|a_n - a_m| < \varepsilon$. It follows that for all $n, m \ge N$,

\begin{align*}
|a_n| - |a_m| &= |a_n - a_m + a_m| - |a_m| \\
&\le |a_n - a_m| + |a_m| - |a_m| \\
&< \varepsilon.
\end{align*}

Exchanging the roles of $n$ and $m$ gives $\big| |a_n| - |a_m| \big| < \varepsilon$. Thus the sequence $(|a_i|)$ is a Cauchy sequence in $\mathbb{R}$.

Fortunately, in a valued field, sums, products, and often inverses of Cauchy sequences are again Cauchy sequences.

Lemma 7.12 Suppose that $(a_i), (b_i)$ are Cauchy sequences in a valued field $F$. Then $(a_i \pm b_i)$ and $(a_i b_i)$ are Cauchy sequences in $F$.

Proof: Let $\varepsilon$ be a positive real number. Let $N$ be sufficiently large so that

\[ |a_n - a_m| < \varepsilon/2 \quad \text{and} \quad |b_n - b_m| < \varepsilon/2, \]

for all $n, m \ge N$. Then we find that

\[ a_n \pm b_n \in D(a_m, \varepsilon/2) \pm D(b_m, \varepsilon/2) \subset D(a_m \pm b_m, \varepsilon). \]

It follows directly that $(a_i \pm b_i)$ is a Cauchy sequence in $F$.

Now consider the elements $\alpha, \beta \in \mathbb{R}$ such that $|a_i| \to \alpha$ and $|b_i| \to \beta$. Let $R$ be a real number greater than $\max\{\alpha, \beta\}$. Let $\delta$ be any positive real number satisfying

\[ 2R\delta + \delta^2 < \varepsilon. \]

Let $M$ be sufficiently large so that

\[ |a_n - a_m| < \delta, \quad \text{and} \quad |b_n - b_m| < \delta, \]

and $|a_n| < R$ and $|b_n| < R$, for all $n, m \ge M$. Then we find that

\[ a_n \cdot b_n \in D(a_m, \delta) \cdot D(b_m, \delta) \subset D(a_m b_m, 2R\delta + \delta^2) \subset D(a_m b_m, \varepsilon). \]

It follows directly that $(a_i \cdot b_i)$ is a Cauchy sequence in $F$.


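Lemma 7.13 Suppose that $(a_i)$ is a Cauchy sequence in a valued field $F$. Suppose moreover that $a_i \ne 0$ for all $i$ and $|a_i|$ does not approach 0, as $i \to \infty$. Then $(a_i^{-1})$ is a Cauchy sequence in $F$.

Proof: Let $\varepsilon$ be a positive real number, and let $\alpha$ be the positive real number such that $|a_i| \to \alpha$. Let $R_{\min}$ be any real number satisfying $0 < R_{\min} < \alpha$. Let $\delta$ be a positive real number satisfying $\delta < R_{\min}$ (as required by Lemma 7.9) and

\[ \delta R_{\min}^{-1} (R_{\min} - \delta)^{-1} < \varepsilon. \]

Let $N$ be sufficiently large so that

\[ |a_n - a_m| < \delta, \quad \text{and} \quad R_{\min} < |a_n|, \]

for all $m, n \ge N$. Then we find that

\[ a_n^{-1} \in D(a_m, \delta)^{-1} \subset D\big(a_m^{-1}, \, \delta R_{\min}^{-1} (R_{\min} - \delta)^{-1}\big) \subset D(a_m^{-1}, \varepsilon). \]

It follows that $(a_i^{-1})$ is a Cauchy sequence in $F$.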

Just as one may construct the real numbers as the set of equivalence classes of Cauchy sequences of rational numbers, one may “complete” any valued field by considering Cauchy sequences modulo “null sequences”.

Definition 7.14 If $F$ is a valued field, then a null sequence in $F$ is a Cauchy sequence $(a_i)$ in $F$ such that $|a_i| \to 0$.

We are now prepared to prove

Proposition 7.15 Suppose that $F$ is a valued field. Then, there exists a complete valued field $\hat{F}$ containing $F$ (the valuation on $\hat{F}$ restricting to that on $F$), such that $F$ is dense in $\hat{F}$. Such a complete valued field $\hat{F}$ is called a completion of $F$.

Proof: Let $R$ be the set of all Cauchy sequences $(a_i)$ of elements of $F$. This set forms a ring, by termwise addition and multiplication; this assertion requires the previous analytic lemmas to prove that the sum, difference, and product of Cauchy sequences is a Cauchy sequence.

Let $N$ be the subset of $R$ consisting of null sequences. We claim that the ideal $N$ is maximal; indeed, suppose that $(a_i) \in R \setminus N$. Then $|a_i| \to \alpha$ for some real number $\alpha > 0$. Since $|a_i|$ converges to $\alpha > 0$, there exists an integer $M$ such that $a_i \ne 0$ for all $i \ge M$.

Let $a_i' = a_i$ if $i \ge M$, and let $a_i' = 1$ if $0 \le i < M$. Thus, we find that $(a_i')$ is a Cauchy sequence, not a null sequence, and every term is invertible. It follows (from Lemma 7.13) that $(b_i) = ((a_i')^{-1})$ is a Cauchy sequence. Furthermore, in the ring $R$,

\[ (a_i) \cdot (b_i) - (1) \in N. \]

Indeed, $a_i b_i - 1 = 0$ for all $i \ge M$.

Thus, every sequence in $R \setminus N$ is invertible, modulo $N$. Hence $N$ is a maximal ideal. Define $\hat{F} = R/N$.

We now claim that $\hat{F}$ is a complete valued field containing $F$ as a dense subfield. Indeed, $\hat{F}$ is a field, defined as the quotient of a ring by a maximal ideal. $F$ embeds as a subfield of $\hat{F}$, by associating to an element $x \in F$ the constant sequence $(x)$. If $(a_i) \in R$, then we may define

\[ |(a_i)| = \lim_{i \to \infty} |a_i|, \]

since this limit converges by the completeness of $\mathbb{R}$. We leave it to the reader to check that this makes $\hat{F}$ a valued field, containing $F$ as a valued subfield (i.e., the valuation on $F$ is the restriction of that on $\hat{F}$).

The completeness of $\hat{F}$ follows from a “diagonal argument”, using a Cauchy sequence of Cauchy sequences. The density of $F$ in $\hat{F}$ follows from the fact that every Cauchy sequence in $F$ converges to an element of $\hat{F}$.

The completion $\hat{F}$ satisfies the following universal property.

Proposition 7.16 Let $F$ be a valued field with completion $\hat{F}$, and $K$ be any complete valued field. Then, any continuous field homomorphism from $F$ to $K$ factors uniquely through $\hat{F}$.

Proof: In fact, by the universal property of completions of metric spaces, there is a unique continuous map from $\hat{F}$ to $K$ which restricts to the given map on $F$. One may check directly that this unique map is a field homomorphism, given that $F \to K$ is a field homomorphism.

7.3 Valuations on the rational numbers

In this section, we prove Ostrowski’s theorem, which classifies all of the valuations on $\mathbb{Q}$. We have found a dichotomy among these valuations. Some are archimedean, and some are nonarchimedean, depending on whether $|\mathbb{Z}|$ is unbounded or bounded, respectively.

It will be useful to prove strong statements about archimedean and nonarchimedean valuations. These stronger statements rely on the following analytic lemma:


Lemma 7.17 Suppose that $|\cdot|$ is any valuation on $\mathbb{Q}$, and $m, n$ are positive integers with $n > 1$. Then,

\[ |m| \le \max\{1, |n|\}^{\log(m)/\log(n)}. \]

Proof: Consider the expression of $m$, “base $n$”:

\[ m = a_0 + a_1 n + \cdots + a_d n^d, \]

where

\[ d = \lfloor \log_n(m) \rfloor = \left\lfloor \frac{\log(m)}{\log(n)} \right\rfloor \]

and $0 \le a_i < n$ for all $0 \le i \le d$. The triangle inequality implies⁷ that $|a_i| \le a_i < n$, for all $0 \le i \le d$. It follows that

\begin{align*}
|m| &= |a_0 + a_1 n + \cdots + a_d n^d| \\
&\le (d + 1)\, n \max\{1, |n|\}^d \\
&\le \left( 1 + \frac{\log(m)}{\log(n)} \right) n \max\{1, |n|\}^{\log(m)/\log(n)}.
\end{align*}

⁷ Here, we use the fact that $|1 + \cdots + 1| \le |1| + \cdots + |1|$, with the same number of terms on both sides.

Carrying out this estimate with $m^t$ instead of $m$, for $t$ a positive integer, yields

\[ |m^t| \le \left( 1 + t\,\frac{\log(m)}{\log(n)} \right) n \max\{1, |n|\}^{t \log(m)/\log(n)}. \]

Taking $t$th roots of both sides yields

\[ |m| \le \sqrt[t]{n \left( 1 + t\,\frac{\log(m)}{\log(n)} \right)} \, \max\{1, |n|\}^{\log(m)/\log(n)}. \]

As $t$ approaches infinity, the quantity still under the $t$th root above approaches 1. It follows that

\[ |m| \le \max\{1, |n|\}^{\log(m)/\log(n)}. \]
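The inequality of Lemma 7.17 can be spot-checked numerically. The sketch below tests it on positive integers for two sample valuations: the standard 3-adic absolute value (defined later in this section) and the square root of the usual absolute value. The function names, the chosen ranges, and the floating-point tolerance are assumptions made for illustration.

```python
import math


def ord_p(n: int, p: int) -> int:
    """Exponent of the prime p in the positive integer n."""
    e = 0
    while n % p == 0:
        n //= p
        e += 1
    return e


def abs_p(n: int, p: int) -> float:
    """Standard p-adic absolute value of a positive integer."""
    return float(p) ** (-ord_p(n, p))


def abs_real(n: int, alpha: float) -> float:
    """Power of the usual absolute value, |n|_R ** alpha."""
    return abs(n) ** alpha


def lemma_holds(val, m: int, n: int) -> bool:
    """Check |m| <= max{1, |n|} ** (log m / log n), allowing for rounding error."""
    bound = max(1.0, val(n)) ** (math.log(m) / math.log(n))
    return val(m) <= bound * (1 + 1e-9)


valuations = {
    "3-adic": lambda k: abs_p(k, 3),
    "real, alpha = 1/2": lambda k: abs_real(k, 0.5),
}
results = {
    name: all(lemma_holds(val, m, n) for m in range(1, 200) for n in range(2, 50))
    for name, val in valuations.items()
}
print(results)
```

For powers of the usual absolute value the bound is attained with equality, which is why a small relative tolerance is used in the comparison.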

Corollary 7.18 Suppose that $|\cdot|$ is a valuation on $\mathbb{Q}$, $n > 1$ is an integer, and $|n| \le 1$. Then the valuation is nonarchimedean.

Proof: Suppose that $m > 1$ is an integer. Then, the previous lemma implies that

\[ |m| \le \max\{1, |n|\}^{\log(m)/\log(n)} = 1^{\log(m)/\log(n)} = 1. \]

It follows that the valuation is bounded on $\mathbb{Z}$, and hence nonarchimedean.


Proposition 7.19 Suppose that $|\cdot|$ is an archimedean valuation on $\mathbb{Q}$. Let $|\cdot|_\mathbb{R}$ denote the usual absolute value on $\mathbb{Q}$. Then there exists a real number $0 < \alpha \le 1$ such that $|x| = |x|_\mathbb{R}^\alpha$ for all $x \in \mathbb{Q}$.

Proof: Since $|\cdot|$ is an archimedean valuation, it follows from Corollary 7.18 that $|n| > 1$ for all integers $n > 1$. For any integers $m, n > 1$, Lemma 7.17 then gives

\[ |m| \le |n|^{\log(m)/\log(n)}, \quad \text{and} \quad |n| \le |m|^{\log(n)/\log(m)}. \]

It follows that

\[ |m|^{1/\log(m)} \le |n|^{1/\log(n)} \quad \text{and} \quad |n|^{1/\log(n)} \le |m|^{1/\log(m)}, \]

so that $|n|^{1/\log(n)} = |m|^{1/\log(m)}$.

Hence we find that there exists a real constant $\beta > 1$ such that $|m|^{1/\log(m)} = \beta$ for all integers $m > 1$. It follows that

\[ |m| = \beta^{\log(m)} = |m|_\mathbb{R}^\alpha, \]

where $\alpha = \log(\beta) > 0$. Since $\mathbb{Q}^\times$ is generated by $-1$ and the set of integers $m > 1$, and $|-1| = 1$, it follows that for all $q \in \mathbb{Q}$,

\[ |q| = |q|_\mathbb{R}^\alpha. \]

Finally, the triangle inequality implies that

\[ 2^\alpha = |1 + 1| \le |1| + |1| = 2. \]

Hence $0 < \alpha \le 1$.
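The key step of this proof, that $|m|^{1/\log(m)}$ does not depend on $m$, can be illustrated numerically. The sketch below starts from a valuation of the form $|\cdot|_\mathbb{R}^\alpha$ and recovers $\alpha$ as $\log(\beta)$. The exponent `alpha = 0.3` and the variable names are assumptions chosen for illustration.

```python
import math

alpha = 0.3  # hypothetical exponent with 0 < alpha <= 1


def val(m: int) -> float:
    """The archimedean valuation |m| = |m|_R ** alpha."""
    return abs(m) ** alpha


# beta = |m| ** (1 / log m) should be the same number for every integer m > 1.
betas = [val(m) ** (1.0 / math.log(m)) for m in range(2, 100)]

# The proof sets alpha = log(beta); here that recovers the exponent we started with.
recovered_alpha = math.log(betas[0])
print(min(betas), max(betas), recovered_alpha)
```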

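This proposition classifies all of the archimedean valuations on $\mathbb{Q}$; any such valuation must equal $|\cdot|_\mathbb{R}^\alpha$ for some $0 < \alpha \le 1$. Conversely, we have

Proposition 7.20 If $0 < \alpha \le 1$, then $|\cdot|_\mathbb{R}^\alpha$ is an archimedean valuation on $\mathbb{Q}$.

Proof: The facts that $|\cdot|_\mathbb{R}^\alpha$ is multiplicative, and positive-definite, follow quickly from the corresponding facts about $|\cdot|_\mathbb{R}$. For the triangle inequality, observe that the concavity of the function $x \mapsto x^\alpha$ on the non-negative reals (which gives $(a + b)^\alpha \le a^\alpha + b^\alpha$ for $a, b \ge 0$), together with the usual triangle inequality for $|\cdot|_\mathbb{R}$, implies that

\[ |x + y|_\mathbb{R}^\alpha \le |x|_\mathbb{R}^\alpha + |y|_\mathbb{R}^\alpha, \]

for all $x, y \in \mathbb{Q}$.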
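The restriction $0 < \alpha \le 1$ matters: for $\alpha > 1$ the triangle inequality already fails at $x = y = 1$, as in the estimate $2^\alpha \le 2$ above. The following sketch checks the triangle inequality on a small grid of sample points. The grid `samples`, the tolerance, and the function name `triangle_holds` are illustrative assumptions.

```python
from itertools import product


def triangle_holds(alpha: float, points) -> bool:
    """Test |x + y|**alpha <= |x|**alpha + |y|**alpha on all pairs of sample points."""
    return all(
        abs(x + y) ** alpha <= abs(x) ** alpha + abs(y) ** alpha + 1e-12
        for x, y in product(points, repeat=2)
    )


samples = [k / 4 for k in range(-20, 21)]  # quarter-integers between -5 and 5

print(triangle_holds(0.5, samples))  # exponent inside the allowed range
print(triangle_holds(2.0, samples))  # exponent outside the allowed range
```

A finite grid cannot prove the inequality, of course; it only illustrates where it holds and exhibits a counterexample when $\alpha > 1$.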

From this proposition, we may classify all of the continuous valuations on $\mathbb{R}$ and $\mathbb{C}$:


Proposition 7.21 Suppose that $|\cdot|$ is a continuous valuation on $\mathbb{R}$ or $\mathbb{C}$. Then, there exists a real number $\alpha$ such that $0 < \alpha \le 1$, and $|x| = |x|_\mathbb{R}^\alpha$ or $|x|_\mathbb{C}^\alpha$ accordingly.

Proof: Given that $|\cdot|$ is a continuous valuation on $\mathbb{R}$, it is completely determined by its values on the dense subset $\mathbb{Q}$. We claim that $|\cdot|$ is an archimedean valuation. It cannot be the trivial valuation, since this valuation is clearly not continuous⁸ as a function from $\mathbb{R}$ to $\mathbb{R}$.

⁸ Recall that the trivial valuation “jumps” from 1 to 0 at 0.

Indeed, if it were nonarchimedean and nontrivial, then there would exist a positive integer $n$ such that $|n| < 1$. In this case, we find that the sequence $|n^{-k}|$ is unbounded. However, the sequence $n^{-k}$ approaches zero in the usual topology of $\mathbb{R}$, and hence $|n^{-k}|$ must approach $|0| = 0$ in $\mathbb{R}$, a contradiction.

Thus the continuous valuation on $\mathbb{R}$ restricts to an archimedean valuation on $\mathbb{Q}$, which must be $|\cdot|_\mathbb{R}^\alpha$ for some $0 < \alpha \le 1$. As the continuous valuation on $\mathbb{R}$ is uniquely determined by its restriction to the dense set $\mathbb{Q}$, this classifies the continuous valuations on $\mathbb{R}$.

Now, if $|\cdot|$ is a continuous valuation on $\mathbb{C}$, then it restricts to a continuous valuation on $\mathbb{R}$. Hence $|x| = |x|_\mathbb{R}^\alpha$ for some $0 < \alpha \le 1$ and all $x \in \mathbb{R}$. Furthermore, it can be checked that the set $\mu(\mathbb{C}) \cdot \mathbb{R}$ is dense in $\mathbb{C}$. Since, for any valuation on $\mathbb{C}$, $|\mu(\mathbb{C})| = 1$, we find that $|x| = |x|_\mathbb{C}^\alpha$ for all $x \in \mathbb{C}$.

The nonarchimedean valuations cannot be related directly to the usual absolute value. But we begin their study with the following

Lemma 7.22 Suppose that $|\cdot|$ is a nonarchimedean valuation on $\mathbb{Q}$. Let $I$ be the set of integers $n$ for which $|n| < 1$. Then $I$ is a prime ideal in $\mathbb{Z}$.

Proof: Since $|\cdot|$ is a nonarchimedean valuation on $\mathbb{Q}$, we have seen that $|\mathbb{Z}|$ is bounded by 1. It follows directly that $I$ is an ideal in $\mathbb{Z}$. Furthermore, if $a, b \in \mathbb{Z}$, and $ab \in I$, then we find that $|ab| = |a| \cdot |b| < 1$. It follows directly that $a \in I$ or $b \in I$. Hence $I$ is a prime ideal.

This lemma suggests that we classify nonarchimedean valuations on $\mathbb{Q}$ by their associated prime ideals in $\mathbb{Z}$. If this associated prime ideal is the zero ideal, then $|n| = 1$ for all nonzero integers $n$. This implies that $|q| = 1$ for all nonzero $q \in \mathbb{Q}$. Thus, if the associated prime ideal is the zero ideal, then the valuation is the trivial valuation on $\mathbb{Q}$.

If, on the other hand, the associated prime ideal is a nonzero ideal, then it is a prime ideal $(p)$ for some prime number $p$. In this case, we say that the valuation is a $p$-adic valuation. Such valuations may also be completely described.


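Recall that any nonzero rational number $x$ has a unique decomposition into powers of primes:

\[ x = \pm \prod_p p^{e_p}, \]

where $p$ ranges over the set of prime numbers, $e_p \in \mathbb{Z}$, and all but finitely many $e_p$ are equal to zero. The constant $e_p$ is also called $\mathrm{ord}_p(x)$, the order of $x$ at $p$. The quantity $p^{\mathrm{ord}_p(x)}$ is called the $p$-part of $x$. If $x = 0$, we define $\mathrm{ord}_p(x) = +\infty$ and take the $p$-part of 0 to be 0, so that a formula of the shape $|x| = \beta^{\mathrm{ord}_p(x)}$ with $0 < \beta < 1$ gives $|0| = 0$.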
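The order at $p$ is straightforward to compute for a nonzero rational number. The following is a minimal sketch using Python's exact `Fraction` type; the function names `ord_p` and `p_part`, and the sample value of `x`, are illustrative choices rather than notation from the text.

```python
from fractions import Fraction


def ord_p(x, p: int) -> int:
    """Order of the nonzero rational number x at the prime p."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("ord_p(0) is not a finite integer")
    num, den, e = abs(x.numerator), x.denominator, 0
    while num % p == 0:   # powers of p in the numerator count positively
        num //= p
        e += 1
    while den % p == 0:   # powers of p in the denominator count negatively
        den //= p
        e -= 1
    return e


def p_part(x, p: int) -> Fraction:
    """The p-part p ** ord_p(x) of a nonzero rational number x."""
    return Fraction(p) ** ord_p(x, p)


x = Fraction(-360, 7)  # -(2^3 * 3^2 * 5) / 7
orders = {p: ord_p(x, p) for p in (2, 3, 5, 7, 11)}
print(orders)
```

Multiplying the $p$-parts over the primes dividing the numerator or denominator recovers $x$ up to sign, in line with the decomposition above.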

Proposition 7.23 Suppose that $|\cdot|$ is a $p$-adic valuation. Then there exists a real number $\alpha$ such that $0 < \alpha < \infty$, and such that

\[ |x| = p^{-\alpha\, \mathrm{ord}_p(x)}. \]

Proof: Since $|\cdot|$ is a $p$-adic valuation, we find that $|q| = 1$ for all prime numbers $q \ne p$. It follows that, if $x$ is a rational number (decomposed into prime powers), then the valuation of $x$ is equal to the valuation of its $p$-part:

\[ |x| = |p^{\mathrm{ord}_p(x)}|. \]

Furthermore, this valuation depends only upon $|p|$, which must be a real number $\beta$ such that $0 < \beta < 1$:

\[ |x| = \beta^{\mathrm{ord}_p(x)}. \]

Letting $\alpha = -\log_p(\beta) = -\log(\beta)/\log(p)$, we find that

\[ |x| = p^{-\alpha\, \mathrm{ord}_p(x)} \]

and $0 < \alpha < \infty$.

Setting $\alpha = 1$ in the previous proposition, we are led to consider the standard $p$-adic valuation:

\[ |x|_p = p^{-\mathrm{ord}_p(x)}, \]

for which every $p$-adic valuation $|\cdot|$ satisfies $|x| = |x|_p^\alpha$ for some $0 < \alpha < \infty$.

At this point, it is perhaps important to observe that

Proposition 7.24 For every positive⁹ real number $\alpha$, the function $|\cdot|_p^\alpha$ is a valuation on $\mathbb{Q}$.

⁹ This should be contrasted with the archimedean case, where only powers of the standard valuation between 0 and 1 occur in valuations.

Proof: The facts that $|\cdot|_p^\alpha$ is multiplicative, and positive-definite, follow quickly from the corresponding facts about $|\cdot|_p$. For the triangle inequality, observe that $|\cdot|_p$ satisfies the stronger ultrametric triangle inequality:

\[ |x + y|_p \le \max\{|x|_p, |y|_p\}, \]

for all $x$ and $y$ in $\mathbb{Q}$. Indeed, if $\mathrm{ord}_p(x) = m$ and $\mathrm{ord}_p(y) = n$, then $x = p^m x_0$ and $y = p^n y_0$ for rational numbers $x_0, y_0$ which are “$p$-free”, i.e., have no $p$ in their numerator or denominator. Without loss of generality, suppose that $m \le n$. Then we find that

\[ x + y = p^m (x_0 + p^e y_0), \]

for the non-negative integer $e = n - m$. Since $x_0 + p^e y_0$ has no $p$ in its denominator, we find that

\[ \mathrm{ord}_p(x + y) \ge \min\{\mathrm{ord}_p(x), \mathrm{ord}_p(y)\}. \]

The ultrametric triangle inequality follows:

\[ |x + y|_p = p^{-\mathrm{ord}_p(x + y)} \le \max\{p^{-\mathrm{ord}_p(x)}, p^{-\mathrm{ord}_p(y)}\} = \max\{|x|_p, |y|_p\}. \]

Now, after taking $\alpha$ powers, we find that

\[ |x + y|_p^\alpha \le \max\{|x|_p^\alpha, |y|_p^\alpha\}. \]

Since the ultrametric triangle inequality is satisfied, we find that $|\cdot|_p^\alpha$ is a valuation.
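The ultrametric inequality for the standard $p$-adic absolute value can be tested on sample rationals. This sketch uses exact arithmetic with `Fraction`; the sample set and the function names are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product


def ord_p(x, p: int) -> int:
    """Order of the nonzero rational number x at the prime p."""
    x = Fraction(x)
    num, den, e = abs(x.numerator), x.denominator, 0
    while num % p == 0:
        num //= p
        e += 1
    while den % p == 0:
        den //= p
        e -= 1
    return e


def abs_p(x, p: int) -> Fraction:
    """Standard p-adic absolute value p ** (-ord_p(x)), with |0|_p = 0."""
    x = Fraction(x)
    return Fraction(0) if x == 0 else Fraction(p) ** (-ord_p(x, p))


def ultrametric_holds(p: int, points) -> bool:
    """Test |x + y|_p <= max(|x|_p, |y|_p) on all pairs of sample points."""
    return all(
        abs_p(x + y, p) <= max(abs_p(x, p), abs_p(y, p))
        for x, y in product(points, repeat=2)
    )


samples = [Fraction(a, b) for a in range(-12, 13) for b in (1, 2, 3, 5, 9)]
print(ultrametric_holds(2, samples), ultrametric_holds(3, samples))
```

Since $t \mapsto t^\alpha$ is increasing on the non-negative reals for every $\alpha > 0$, the same check passes unchanged for $|\cdot|_p^\alpha$.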

We have now proven the following

Theorem 7.25 (Ostrowski) Suppose that $|\cdot|$ is a valuation on $\mathbb{Q}$. Then, this valuation is one of the following:

Trivial: $|x| = 1$ for all nonzero $x \in \mathbb{Q}$, and $|0| = 0$.

p-adic: There exists a prime number $p$, and a positive real number $\alpha$, such that

\[ |x| = |x|_p^\alpha = p^{-\alpha\, \mathrm{ord}_p(x)}. \]

Real: There exists a real number $\alpha$, such that $0 < \alpha \le 1$ and

\[ |x| = |x|_\mathbb{R}^\alpha. \]

Within each family of valuations on $\mathbb{Q}$, we have chosen a representative

\[ |x|_p = p^{-\mathrm{ord}_p(x)}, \quad \text{and} \quad |x|_\mathbb{R}, \]

which is normalized in a somewhat special way. One motivation for this normalization is the following:


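Proposition 7.26 (Product Formula) For any nonzero $x \in \mathbb{Q}$,

\[ |x|_\mathbb{R} \cdot \prod_p |x|_p = 1. \]

Proof: The proof is a quick computation. If $x \in \mathbb{Q}^\times$, then $x$ has a canonical decomposition

\[ x = \pm \prod_p p^{\mathrm{ord}_p(x)} = \pm \prod_p |x|_p^{-1}. \]

It follows that

\[ |x|_\mathbb{R} = \prod_p |x|_p^{-1}. \]

The product formula follows directly.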
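The product formula is easy to confirm in exact arithmetic, since only the primes dividing the numerator or denominator of $x$ contribute a factor different from 1. The sketch below uses trial division to find those primes; the helper names are illustrative assumptions.

```python
from fractions import Fraction


def prime_factors(n: int) -> set:
    """Set of primes dividing the nonzero integer n, found by trial division."""
    n, primes, d = abs(n), set(), 2
    while d * d <= n:
        while n % d == 0:
            primes.add(d)
            n //= d
        d += 1
    if n > 1:
        primes.add(n)
    return primes


def abs_p(x: Fraction, p: int) -> Fraction:
    """Standard p-adic absolute value of a nonzero rational number."""
    num, den, e = abs(x.numerator), x.denominator, 0
    while num % p == 0:
        num //= p
        e += 1
    while den % p == 0:
        den //= p
        e -= 1
    return Fraction(p) ** (-e)


def product_of_absolute_values(x) -> Fraction:
    """|x|_R times the product of |x|_p over all primes p (x must be nonzero)."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("the product formula is stated for nonzero x")
    total = abs(x)
    # Primes not dividing numerator or denominator contribute |x|_p = 1.
    for p in prime_factors(x.numerator) | prime_factors(x.denominator):
        total *= abs_p(x, p)
    return total


print(product_of_absolute_values(Fraction(-360, 7)))
```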

7.4 Valuations on number fields

In the last section, we classified valuations on $\mathbb{Q}$. From this classification, one may classify valuations on any number field $F$. For the purposes of classification, one should note that any valuation on $F$ restricts to one on $\mathbb{Q}$, on which it is trivial, $p$-adic, or real.

We begin with the archimedean valuations on F.

Proposition 7.27 Suppose that $F$ is a number field. If $|\cdot|$ is an archimedean valuation on $F$, then there exists an embedding $\sigma : F \to \mathbb{C}$, and a real number $0 < \alpha \le 1$, such that:

\[ |x| = |\sigma(x)|_\mathbb{C}^\alpha, \quad \text{for all } x \in F. \]

Conversely, the above formula defines a valuation on $F$, for all such $\sigma$ and $\alpha$.

Proof: If $|\cdot|$ is an archimedean valuation on $F$, then $|\cdot|$ restricts to an archimedean valuation on $\mathbb{Q}$; by the results of the previous section, we find that there exists $0 < \alpha \le 1$ such that

\[ |x| = |x|_\mathbb{R}^\alpha, \]

for all $x \in \mathbb{Q}$. Let $\hat{F}$ be the completion of $F$ with respect to the given archimedean valuation. By the universal property of completion, the inclusion of $\mathbb{Q}$ into $\hat{F}$ extends uniquely to a continuous inclusion of $\hat{\mathbb{Q}} = \mathbb{R}$ into $\hat{F}$. In this way, we find a canonical ring homomorphism:

\[ F_\mathbb{R} = F \otimes_\mathbb{Q} \mathbb{R} \to \hat{F}. \]

Any ring homomorphism from $F_\mathbb{R} \cong \mathbb{R}^{r_1} \times \mathbb{C}^{r_2}$ to the field $\hat{F}$ factors through a projection onto $\mathbb{R}$ or $\mathbb{C}$. In this way, we find a sequence of rings and homomorphisms:

\[ F \to F_\mathbb{R} \to K \to \hat{F}, \]

where $K = \mathbb{R}$ or $K = \mathbb{C}$. It follows that the archimedean valuation on $F$, after extending to $\hat{F}$, restricts to a continuous archimedean valuation on $\mathbb{R}$ or $\mathbb{C}$. Such valuations are all given by powers of the usual absolute value. Since $|x| = |x|_\mathbb{R}^\alpha$ for $x \in \mathbb{Q}$, we find that

\[ |x| = |x|_K^\alpha, \]

for all $x \in K$. Restricting from $K$ to $F$ yields the desired result.

From this proposition, we find that the archimedean places of $F$ correspond bijectively to archimedean valuations on $F$, up to the choice of the exponent $\alpha$. If $v$ is an archimedean place of $F$ (so $v$ is an embedding of $F$ into $\mathbb{R}$ or a pair of conjugate embeddings of $F$ into $\mathbb{C}$), we write $|\cdot|_v$ for the resulting archimedean valuation.

For the nonarchimedean valuations, we begin with the following

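Proposition 7.28 Let $F$ be a number field, and $|\cdot|$ a nonarchimedean valuation on $F$. Then $|x| \le 1$ for all $x \in \mathcal{O}_F$.

Proof: We have seen already that $|x| \le 1$ for all $x \in \mathbb{Z} \subset F$. On the other hand, $\mathcal{O}_F$ is a lattice, and we may choose a $\mathbb{Z}$-basis $\alpha_1, \ldots, \alpha_d$ of $\mathcal{O}_F$ as such. We find that, for all $c_1, \ldots, c_d \in \mathbb{Z}$,

\begin{align*}
\Big| \sum_i c_i \alpha_i \Big| &\le \max_i |c_i \alpha_i| \\
&\le \max_i \, 1 \cdot |\alpha_i| \\
&= \max_i |\alpha_i|.
\end{align*}

In this way, we find that $|\mathcal{O}_F|$ is bounded by the positive real constant $C = \max_i |\alpha_i|$. If $C > 1$, then choose $i$ such that $|\alpha_i| = C > 1$. But then $|\alpha_i^2| = C^2 > C$, and $\alpha_i^2 \in \mathcal{O}_F$, a contradiction. Thus $C \le 1$.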

From this proposition, we may easily prove the following

Corollary 7.29 Let $F$ be a number field, and $|\cdot|$ a nonarchimedean valuation on $F$. Let $I$ be the set of elements $x$ of $\mathcal{O}_F$ satisfying $|x| < 1$. Then $I$ is a prime ideal in $\mathcal{O}_F$.

Proof: The proof is precisely the same as for $\mathbb{Z}$.

It follows that, to every nonarchimedean valuation on $F$, one may associate a prime ideal $P$. If $P = (0)$, then the valuation is trivial. Otherwise, we call the valuation a $P$-adic valuation. The $P$-adic valuations on $F$ can be described in much the same way as $p$-adic valuations on $\mathbb{Q}$, using the factorization theory for fractional ideals.


Consider an element $x \in F^\times$, and the resulting principal fractional ideal $x\mathcal{O}_F$. We have seen that this fractional ideal can be factored, in a unique way, into integer powers of prime ideals

\[ (x) = \prod_P P^{e_P}. \]

The resulting exponent of $P$ is called the order of $x$ at $P$: $\mathrm{ord}_P(x) = e_P$ in the above decomposition. If $x = 0$, then we define $\mathrm{ord}_P(x) = +\infty$, consistent with $|0| = 0$. The following lemma is influenced by the notes of Keith Conrad.

Lemma 7.30 Let $P$ be a prime ideal in $\mathcal{O}_F$. If $x \in F^\times$ and $\mathrm{ord}_P(x) \ge 0$, then there exist $\alpha, \beta \in \mathcal{O}_F$ such that $x = \alpha/\beta$, $\mathrm{ord}_P(\alpha) = \mathrm{ord}_P(x)$, and $\mathrm{ord}_P(\beta) = 0$.

Proof: There exist ideals $I, J \subset \mathcal{O}_F$, such that $x\mathcal{O}_F = I \cdot J^{-1}$, and $I$ and $J$ have no common prime ideal factors. Moreover, since $\mathrm{ord}_P(x) \ge 0$, we find that $P$ is not a prime factor of $J$, i.e., $J \not\subset P$. Observe that $I$ and $J$ are in the same ideal class, since $I = (x) \cdot J$.

Choose an element $j \in \mathcal{O}_F$ such that $j \in J$ and $j \notin P$. It follows that $(j) \subset J$, so $(j) = JK$ for some integral ideal $K \subset \mathcal{O}_F$. Furthermore, $P$ is not a factor of $K$, since otherwise $j \in JK \subset P$, a contradiction. Since $I$ and $J$ are in the same ideal class, and $JK = (j)$ is principal, we find that $IK$ is principal: $IK = (i)$ for some $i \in \mathcal{O}_F$.

We find that

\[ (x) = I \cdot J^{-1} = IK (JK)^{-1} = (i) \cdot (j)^{-1}. \]

Thus $x = \alpha/\beta$, where $\alpha$ and $\beta$ are unit multiples of $i$ and $j$. Since $j \notin P$, we find that $\beta \notin P$, so $\mathrm{ord}_P(\beta) = 0$.

Lemma 7.31 Let $P$ be a prime ideal in $\mathcal{O}_F$. If $x \in F^\times$ and $\mathrm{ord}_P(x) = 0$, then $|x| = 1$ for any $P$-adic valuation.

Proof: Since $\mathrm{ord}_P(x) = 0$, there exist $\alpha, \beta \in \mathcal{O}_F$ such that $x = \alpha/\beta$, $\mathrm{ord}_P(\alpha) = 0$ and $\mathrm{ord}_P(\beta) = 0$. It follows that $|\alpha| = |\beta| = 1$. Hence $|x| = 1$.

Proposition 7.32 Suppose that $|\cdot|$ is a $P$-adic valuation on a number field $F$. Then there exists a real number $\alpha$ such that $0 < \alpha < \infty$ and such that

\[ |x| = N(P)^{-\alpha\, \mathrm{ord}_P(x)}. \]


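Proof: Begin by fixing an element $v \in P \setminus P^2$ (sometimes called a uniformizing element). Thus $|v| < 1$. Then, for any nonzero $x \in \mathcal{O}_F$, we find that

\[ \mathrm{ord}_P\left( \frac{x}{v^{\mathrm{ord}_P(x)}} \right) = 0, \quad \text{and} \quad \left| \frac{x}{v^{\mathrm{ord}_P(x)}} \right| = 1. \]

It follows that

\[ |x| = |v|^{\mathrm{ord}_P(x)}. \]

Letting $\alpha = -\log_{N(P)}(|v|)$, we find that $0 < \alpha \in \mathbb{R}$ and

\[ |x| = N(P)^{-\alpha\, \mathrm{ord}_P(x)}. \]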

We have now proven the analogue of Ostrowski’s theorem fornumber fields:

Theorem 7.33 Suppose that $|\cdot|$ is a valuation on a number field $F$. Then, this valuation is one of the following:

Trivial: $|x| = 1$ for all nonzero $x \in F$, and $|0| = 0$.

P-adic: There exists a prime ideal $P$, and a positive real number $\alpha$, such that

\[ |x| = |x|_P^\alpha = N(P)^{-\alpha\, \mathrm{ord}_P(x)}. \]

Archimedean: There exists a real archimedean place $\sigma$ or complex archimedean place $\{\tau, \bar{\tau}\}$ and a real number $\alpha$, such that $0 < \alpha \le 1$ and

\[ |x| = |\sigma(x)|_\mathbb{R}^\alpha \quad \text{or} \quad |x| = |\tau(x)|_\mathbb{C}^\alpha. \]

7.5 Exercises

Exercise 7.1 Let $F = \mathbb{Q}(i)$. Let $\lambda = 1 + i \in F$. Prove that $(\lambda)$ is a prime ideal in $\mathcal{O}_F$. Let $|\cdot|_\lambda$ denote the associated valuation. Compute $|2|$. Compute $|3|$. Compute $|1 - i|$.

Exercise 7.2 Let $\mathbb{C}((X))$ be the field of Laurent series with coefficients in $\mathbb{C}$. These are formal series

\[ f = \sum_{i=N}^{\infty} c_i X^i, \]

where $N$ is allowed to be any integer, positive or negative. Define $\mathrm{ord}(f)$ to be the smallest integer $i$ for which $c_i$ is nonzero. Prove that the function $|f| = e^{-\alpha\, \mathrm{ord}(f)}$ defines a nonarchimedean valuation on $\mathbb{C}((X))$ for all positive real numbers $\alpha$.


Exercise 7.3 Suppose that $p$ is a prime number, $F$ is a number field, and $(p) = P_1 \cdots P_t$ is the factorization of $(p)$ in $\mathcal{O}_F$. Suppose moreover that the prime ideals $P_1, \ldots, P_t$ are distinct. Suppose that $|\cdot|$ is a valuation on $F$ which restricts to $|\cdot|_p$ on $\mathbb{Q}$. Prove that $|\cdot| = |\cdot|_{P_i}$ for some $1 \le i \le t$.

Exercise 7.4 Suppose that $p$ is a prime number. Let $x, y \in \mathbb{Q}$. Prove that for all $\varepsilon > 0$, there exists an element $z \in \mathbb{Q}$ such that (simultaneously!):

\[ |z - x|_\mathbb{R} < \varepsilon \quad \text{and} \quad |z - y|_p < \varepsilon. \]

Challenge: Generalize this to simultaneous approximation at any finite number of primes and $\mathbb{R}$.

8 P-adic fields

Let $F$ be a number field, and let $P$ be a prime ideal in $\mathcal{O}_F$. Recall that if $x \in F^\times$, then $(x)$ denotes the principal fractional ideal $x\mathcal{O}_F \subset F$. The integer $\mathrm{ord}_P(x)$ is the power of $P$ occurring in the factorization of $(x)$. The resulting “$P$-adic absolute value” is defined by

\[ |x|_P = N(P)^{-\mathrm{ord}_P(x)}. \]

This is a nonarchimedean valuation on $F$.

Recall that $P \cap \mathbb{Z} = p\mathbb{Z}$ is a nonzero prime ideal in $\mathbb{Z}$. It follows that $\mathcal{O}_F/P$ is a finite field extension of $\mathbb{F}_p := \mathbb{Z}/p\mathbb{Z}$. Thus there exists a positive integer $f$, such that $\mathcal{O}_F/P = \mathbb{F}_q$ is a finite field with $q := p^f$ elements; simply put, $q = N(P)$.

For this reason, we can rewrite the $P$-adic absolute value

\[ |x|_P = q^{-\mathrm{ord}_P(x)}. \]

In this chapter, we will study the field obtained by completing $F$ with respect to this valuation.

Definition 8.1 A $p$-adic field is a field which arises as the completion of a number field $F$ with respect to a nonarchimedean valuation associated to a prime ideal $P \subset \mathcal{O}_F$ whose intersection with $\mathbb{Z}$ is $p\mathbb{Z}$.

The most basic example of a $p$-adic field is “the field of $p$-adic numbers”, denoted $\mathbb{Q}_p$, which arises by completing $\mathbb{Q}$ with respect to the $p$-adic absolute value $|x|_p = p^{-\mathrm{ord}_p(x)}$. Though we do not prove it here, the previous definition has an alternative characterization: a $p$-adic field is a field which is a finite extension of $\mathbb{Q}_p$.

8.1 Systems of Congruences

Let K be the completion of F with respect to the valuation

\[ |x|_P = q^{-\mathrm{ord}_P(x)}. \]


An element of $K$ (by definition) is an equivalence class of Cauchy sequences from $F$, modulo null sequences. $K$ is a field, and the valuation on $F$ extends uniquely to a continuous valuation on $K$. Since the valuation on $F^\times$ has discrete image in $\mathbb{R}$, we find that the valuation on $K^\times$ also has discrete image. If $x \in K^\times$, then $|x| \in q^{\mathbb{Z}}$.

Definition 8.2 A uniformizing element in $K$ is some element $v \in K^\times$, such that $|v| < 1$, and such that if $x \in K^\times$ satisfies $|x| < 1$ then $|x| \le |v|$.

Lemma 8.3 There exists a uniformizing element in $K$.

Proof: If $v \in P$, and $v \notin P^2$, then we find that $\mathrm{ord}_P(v) = 1$, so $|v| = q^{-1}$. Since $|x| \in q^{\mathbb{Z}}$ for every $x \in K^\times$, any $x$ with $|x| < 1$ satisfies $|x| \le q^{-1} = |v|$, and we find that $v$ satisfies the required property.

Recall that $F$ is dense in $K$ (with respect to the topology arising from the $P$-adic valuation). Let $A$ be the closure of $\mathcal{O}_F$ in $K$. On the one hand, if $x \in \mathcal{O}_F$, then $|x| \le 1$. A stronger result is the following

Lemma 8.4 $A$ is precisely the closed unit disc in $K$:

\[ A = \{ x \in K \text{ such that } |x| \le 1 \}. \]

Proof: Suppose that $x$ is contained in the closed unit disc in $K$: $|x| \le 1$. Then, since $F$ is dense in $K$, we find that for any $0 < \varepsilon \le 1$ there exists $y \in F$ such that $|y - x| < \varepsilon$. The ultrametric triangle inequality implies that

\[ |y| = |y - x + x| \le \max\{\varepsilon, |x|\} \le 1. \]

Thus, by Lemma 7.30, there exist $\alpha, \beta \in \mathcal{O}_F$ such that $y = \alpha/\beta$ and $\beta \notin P$ (so $|\beta| = 1$).

It remains to find an element $\gamma \in \mathcal{O}_F$ such that $|\gamma - y| < \varepsilon$, since then $|\gamma - x| < \varepsilon$ by the ultrametric triangle inequality. Equivalently, it remains to find $\gamma \in \mathcal{O}_F$ such that $|\alpha - \gamma\beta| < \varepsilon$.

For this, it suffices to solve the congruence:

\[ \gamma\beta \equiv \alpha, \text{ modulo } P^N, \]

for arbitrarily large $N$. Observe that $P$ (modulo $P^N$) is the unique maximal ideal¹ in $\mathcal{O}_F/P^N$, since $P$ is a maximal ideal in $\mathcal{O}_F$. Since $\beta \notin P$, we find that $\beta$ is invertible² in $\mathcal{O}_F/P^N$. Thus there exists an element $\delta \in \mathcal{O}_F$ such that $\beta\delta \equiv 1$, modulo $P^N$. Letting $\gamma = \delta\alpha$ solves the required congruence.

¹ The maximal ideals of $\mathcal{O}_F/P^N$ correspond to the maximal ideals of $\mathcal{O}_F$ containing $P^N$. These correspond to the nonzero prime ideals of $\mathcal{O}_F$ occurring in the factorization of $P^N$. By the uniqueness of factorization of ideals, the only such prime ideal is $P$ itself.

² Either $\beta$ is contained in a maximal ideal or $\beta$ is invertible. But $P$ is the unique maximal ideal in $\mathcal{O}_F/P^N$.
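In the simplest case $F = \mathbb{Q}$, $P = (p)$, the congruence in this proof can be solved with a modular inverse. The sketch below approximates the rational number $y = 2/3$ by an integer $\gamma$ in the 5-adic metric. The function name `approximate` and the sample values are illustrative assumptions; the three-argument `pow` with exponent `-1` requires Python 3.8 or later.

```python
def approximate(alpha: int, beta: int, p: int, N: int) -> int:
    """Return an integer gamma with gamma * beta ≡ alpha (mod p**N).

    Requires that p does not divide beta, so that beta is a unit modulo p**N.
    """
    if beta % p == 0:
        raise ValueError("beta must not be divisible by p")
    modulus = p ** N
    delta = pow(beta, -1, modulus)   # beta * delta ≡ 1 (mod p**N)
    return (delta * alpha) % modulus


p, N = 5, 6
gamma = approximate(2, 3, p, N)      # integer approximation to y = 2/3
print(gamma, (3 * gamma - 2) % p ** N)
```

Since $\gamma - 2/3 = (3\gamma - 2)/3$ and $p^N$ divides $3\gamma - 2$, the integer $\gamma$ lies within $p^{-N}$ of $2/3$ in the $p$-adic metric.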


The closed unit disc $A \subset K$ is thus a subring of $K$. Its ideals can be characterized as discs of varying radii:

Theorem 8.5 There is a natural order-preserving bijection between the nonzero ideals in $A$ and the non-negative integers $k = 0, 1, 2, \ldots$, given by:

\[ I_k = \{ x \in A : |x| \le q^{-k} \}. \]

Every ideal $I_k$ in $A$ is principal. If $x \in A$ and $|x| = q^{-k}$, then $I_k = (x)$.

Proof: Suppose first that $I$ is a nonzero ideal in $A$. Since, for every nonzero element $x \in I$, $|x| = q^{-m}$ for some non-negative integer $m$, there exists $x \in I$ such that $|x| \ge |y|$ for all $y \in I$. We find that for any such element $x$, $xA \subset I$. Also, if $y \in I$ then $|y/x| \le 1$. Thus $y/x \in A$ (since $A$ is the closed unit disc in $K$). Hence $y \in xA$. Furthermore, this argument demonstrates that if $|y| \le |x|$ then $y \in I = xA$.

In this way, we find that every nonzero ideal in $A$ is principal, and can be described as a closed disc. The multiplicativity of the absolute value implies that every such closed disc is an ideal in $A$.

An important special case of the above classification of ideals is thefollowing:

Corollary 8.6 Let $v$ be any uniformizing element of $K$, e.g., $v \in P$ and $v \notin P^2$. Then $vA$ is the unique maximal ideal in the ring $A$. Furthermore,

\[ vA = \{ x \in A \text{ such that } |x| < 1 \}. \]

Proof: Given a uniformizing element $v$ of $K$, we find that $|v|$ is maximal among absolute values occurring which are less than 1. It immediately follows that $vA$ is the unique maximal ideal in $A$. The previous theorem also implies that $vA$ is characterized as:

\[ vA = \{ x \in A \text{ such that } |x| \le |v| \} = \{ x \in A \text{ such that } |x| < 1 \}. \]

Using a uniformizing element $v$, we can describe all of the ideals in $A$:

\[ A \supset vA \supset v^2 A \supset v^3 A \supset \cdots. \]

These ideals can also be described as closed discs of radii $q^{-k}$ for $k \ge 0$. On the other hand, these closed discs are also open discs, since the metric is discretely-valued. In this way, every ideal in $A$ is simultaneously closed and open (sometimes called clopen).

Since $vA$ is the unique maximal ideal in $A$, it follows that

\[ A^\times = A \setminus vA. \]


Geometrically, this can be viewed as an annulus:

\[ A^\times = \{ x \in K : q^{-1} < |x| \le 1 \}. \]

Now we are ready to consider a new perspective on the ring A:

Theorem 8.7 The inclusion $\mathcal{O}_F \to A$ induces a ring isomorphism $\mathcal{O}_F/P^k \cong A/v^k A$ for all $k \ge 1$. Furthermore, there is a continuous ring isomorphism

\[ A \cong \varprojlim_k \frac{\mathcal{O}_F}{P^k}, \]

in which the right side is given the subspace topology from the infinite product $\prod_k \mathcal{O}_F/P^k$.

Proof: Consider the map $\mathcal{O}_F \to A/v^k A$, given by composing the inclusion $\mathcal{O}_F \to A$ with the projection map. Recall that $v^k A$ is an open subset of $A$ for all $k \ge 1$. It follows that for all $a \in A$, the coset $a + v^k A$ is an open subset³ of $A$. Since $\mathcal{O}_F$ is dense in $A$, there exists $\alpha \in \mathcal{O}_F$ such that

\[ \alpha \in a + v^k A. \]

³ Translation by $a$ is a homeomorphism from $K$ to $K$, by the continuity of addition.

It follows that the map $\mathcal{O}_F \to A/v^k A$ is surjective. The kernel is precisely $v^k A \cap \mathcal{O}_F$. This is precisely the set

\[ \{ x \in \mathcal{O}_F : |x| \le q^{-k} \} = \{ x \in \mathcal{O}_F : \mathrm{ord}_P(x) \ge k \} = P^k. \]

It follows that the map $\mathcal{O}_F \to A/v^k A$ descends to a ring isomorphism

\[ \frac{\mathcal{O}_F}{P^k} \to \frac{A}{v^k A}. \]

Now consider the family of continuous⁴ projections $A \to A/v^k A$; by the universal property of inverse limits, this factors through the inverse limit via a continuous ring homomorphism:

\[ A \to \varprojlim_k \frac{A}{v^k A}. \]

⁴ Continuity follows from the fact that $v^k A$ is closed and open, so the quotient topology on $A/v^k A$ is precisely the discrete topology.

Here, the right side is given the inverse limit topology. The inverse limit can be thought of as the set of sequences $(a_k)$, where for each $k \ge 1$, $a_k \in A/v^k A$, and for each pair $k \ge \ell \ge 1$, $a_k \equiv a_\ell$, modulo $v^\ell A$. In other words, the inverse limit is the set of compatible systems of congruences, modulo all powers of $v$. As a set of sequences, the natural topology is the subspace topology from the infinite product $\prod_k (A/v^k A)$.

Now that we have a continuous ring homomorphism from $A$ to $\varprojlim_k A/v^k A$, and $A/v^k A$ is isomorphic to $\mathcal{O}_F/P^k$, it remains to show that this continuous ring homomorphism is bijective.

The kernel of this ring homomorphism is an ideal in $A$. If it were not injective, then the kernel would contain $v^k A$ for some $k \ge 0$ by our classification of ideals in $A$. But certainly $v^k$ does not map to zero in $A/v^{k+1} A$, and hence does not map to zero in the projective limit. Thus we find that $A$ maps injectively to $\varprojlim_k A/v^k A$.

For surjectivity, consider a compatible system of congruences $(\alpha_k)$ where for each $k \ge 1$, $\alpha_k \in \mathcal{O}_F/P^k$. Choosing a representative in $\mathcal{O}_F$ for each $\alpha_k$, we find that the sequence $(\alpha_k)$ is a Cauchy sequence in $\mathcal{O}_F$, and hence converges to an element $a \in A$, which maps to the given system. In this way, we find that $A$ maps surjectively onto $\varprojlim_k A/v^k A$.

The above theorem should be thought of as a second perspective on $p$-adic fields (or rather the unit discs therein). An element of $A$ can be thought of as a compatible sequence of congruences, modulo all powers of $P$. When $K = \mathbb{Q}_p$, the unit disc is called $\mathbb{Z}_p$, and $\mathbb{Z}$ is dense in $\mathbb{Z}_p$. An element of $\mathbb{Z}_p$ can be thought of as a compatible system of congruences, modulo $p\mathbb{Z}$, $p^2\mathbb{Z}$, $p^3\mathbb{Z}$, etc. In this way, “working $p$-adically” allows us to work simultaneously with an infinite tower of congruences.

For example, to say that $5 \in (\mathbb{Z}/3\mathbb{Z})^\times$ means that there exists $x \in \mathbb{Z}$ such that $5x \equiv 1$, modulo 3 (for example, $x = 2$). To say that $5 \in (\mathbb{Z}/9\mathbb{Z})^\times$ means that there exists $x \in \mathbb{Z}$ such that $5x \equiv 1$, modulo 9 (for example $x = 2$ again). To say that $5 \in (\mathbb{Z}/27\mathbb{Z})^\times$ means that there exists $x \in \mathbb{Z}$ such that $5x \equiv 1$, modulo 27 (for example $x = 11$).

To say that $5 \in \mathbb{Z}_3^\times$ means that one may find a sequence $(x_k)$ of integers, such that

\[ 5 \cdot x_k \equiv 1 \text{ modulo } 3^k, \text{ for all } k \ge 1. \]

In this way, a 3-adic number (i.e., an element of $\mathbb{Z}_3$) encodes an infinite system of congruences.
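The tower of congruences for the inverse of 5 can be generated directly. The sketch below computes the least non-negative solution of $5 x_k \equiv 1$ modulo $3^k$ for the first few $k$ and checks that consecutive solutions are compatible. The variable names and the cutoff `k < 9` are illustrative assumptions; the modular inverse via `pow(5, -1, m)` requires Python 3.8 or later.

```python
# x_k is the least non-negative integer with 5 * x_k ≡ 1 (mod 3**k), for k = 1, ..., 8.
inverses = [pow(5, -1, 3 ** k) for k in range(1, 9)]

# Compatibility: reducing x_{k+1} modulo 3**k must give back x_k.
# (inverses[i] holds x_{i+1}, so it is reduced modulo 3**i.)
compatible = all(
    inverses[i] % 3 ** i == inverses[i - 1] for i in range(1, len(inverses))
)

print(inverses[:3], compatible)
```

The first three entries agree with the values $x = 2, 2, 11$ quoted in the text, and the whole list is one truncated description of the 3-adic integer $5^{-1}$.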

As another example, suppose that $x \in \mathbb{Z}_3$ and $x^2 = 7$. Then we find a sequence $(x_k)$ of integers, such that

\[ x_k^2 \equiv 7 \text{ modulo } 3^k, \text{ for all } k \geq 1. \]

From this perspective, the p-adic fields are most useful, since solving equations p-adically (where analytic methods apply) allows for the solution of infinite families of congruences.

8.2 Analysis of series

In complete nonarchimedean fields, one may consider “special functions” defined by infinite series in one or more variables. We begin with an analytic lemma.


Lemma 8.8 Suppose that $(a_i)$ is a sequence of elements of $K$. Then $\sum a_i$ converges in $K$ (i.e., the sequence of partial sums is a Cauchy sequence in $K$) if and only if $(a_i)$ is a null sequence.

Proof: Let $s_i = \sum_{j=0}^{i} a_j$ denote the sequence of partial sums. If $(s_i)$ is a Cauchy sequence, then for any $\varepsilon > 0$, there exists a positive integer $N$ such that

\[ |s_i - s_{i-1}| = |a_i| < \varepsilon, \]

for all $i > N$. It follows that $(a_i)$ is a null sequence.

Conversely, suppose that $(a_i)$ is a null sequence. Let $\varepsilon$ be a positive real number, and let $N$ be a positive integer such that for all $i > N$, $|a_i| < \varepsilon$. Then, if $j \geq i > N$, we find that

\[ |s_j - s_i| = \Big| \sum_{k=i+1}^{j} a_k \Big| \leq \max\{ |a_k| : i + 1 \leq k \leq j \} < \varepsilon. \]

Thus the sequence $(s_i)$ is a Cauchy sequence, converging to an element of $K$.

This result sharply contrasts with our knowledge about archimedean complete fields such as $\mathbb{R}$ and $\mathbb{C}$, in which many divergent sums have terms approaching zero. Convergence in nonarchimedean complete fields is far simpler!

As a first example, we may consider geometric series. Suppose that $x \in K$ and $|x| < 1$; in other words, $x \in vA$. Then we find that $|x^k| \to 0$, and so the sum

\[ \sum_{k=0}^{\infty} x^k \text{ converges.} \]

The usual argument from analysis⁵ implies that

\[ (1 - x) \cdot \sum_{k=0}^{\infty} x^k = 1. \]

⁵ We are using the interchange of limits and the basic arithmetic operations in $K$.

Hence the usual formula for geometric series holds in the ring $A$:

\[ \sum_{k=0}^{\infty} x^k = \frac{1}{1 - x}, \text{ if } |x| < 1. \]

Example 8.9 If $p$ is a prime number, then $|p| = p^{-1} < 1$ in the p-adic field $\mathbb{Q}_p$. It follows that:

\[ \frac{1}{1 - p} = \sum_{k=0}^{\infty} p^k \in \mathbb{Z}_p. \]
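This identity can be checked one congruence at a time: modulo $p^N$, only the first $N$ terms of the series contribute. The following is a small sketch (Python 3.8 or later); the choice $p = 5$, $N = 8$ and the function name are illustrative assumptions.

```python
def geometric_partial_sum(p, terms):
    """Return 1 + p + p^2 + ... + p^(terms - 1)."""
    return sum(p**k for k in range(terms))

p, N = 5, 8
s = geometric_partial_sum(p, N)

# (1 - p) * s = 1 - p^N, which is congruent to 1 modulo p^N.
residue = ((1 - p) * s) % p**N

# The same residue class, computed as a modular inverse of 1 - p.
inverse = pow(1 - p, -1, p**N)

print(residue, s == inverse)
```

The partial sum agrees with the inverse of $1 - p$ in $\mathbb{Z}/p^N\mathbb{Z}$, which is the $N$-th entry of the compatible system describing $1/(1-p) \in \mathbb{Z}_p$.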


As another example, we can consider the exponential and logarithm maps, viewed as power series. We begin with the “formal logarithm”:

\[ \log(1 - x) = -\left( x + \frac{x^2}{2} + \frac{x^3}{3} + \cdots \right). \]

Suppose that $|x| = r < 1$ in a p-adic field $K$. Since the $P$-adic absolute value restricts to a p-adic valuation on $\mathbb{Q}$, we find that

\[ |k| = |p|^{\mathrm{ord}_p(k)} = q^{-e \, \mathrm{ord}_p(k)}, \text{ for all positive } k \in \mathbb{Z}. \]

Since $\mathrm{ord}_p(k) \leq \log_p(k)$, and $q = p^f$, we find that

\[ |k| \geq k^{-ef}. \]

Then we find that

\[ \left| \frac{x^k}{k} \right| = r^k |k|^{-1} \leq r^k k^{ef}, \]

for all $k > 0$. Since $r^k k^{ef} \to 0$ (since $0 < r < 1$ and $0 < ef < \infty$), we find that the series defining $\log(1 - x)$ converges for all $|x| < 1$.

The “formal exponential” is more difficult to analyze:

\[ \exp(x) = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots. \]

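Suppose again that $|x| = r < 1$ in a p-adic field $K$. We can estimate the $p$-adic valuation of factorials using the following⁶, in which we write $m = \lfloor \log_p(k) \rfloor$:

\[
\mathrm{ord}_p(k!) = \sum_{i=1}^{m} \left\lfloor \frac{k}{p^i} \right\rfloor
\leq \sum_{i=1}^{m} \frac{k}{p^i}
= k p^{-1} \cdot \frac{p^{-m} - 1}{p^{-1} - 1}
\leq k p^{-1} \cdot \frac{k^{-1} - 1}{p^{-1} - 1}
= \frac{k - 1}{p - 1}.
\]

The last inequality uses $p^m \leq k$, i.e., $p^{-m} \geq k^{-1}$.

⁶ In order to determine $\mathrm{ord}_p(k!)$, we count multiples. There are $\lfloor k/p \rfloor$ multiples of $p$ occurring among $1, \ldots, k$. There are $\lfloor k/p^2 \rfloor$ multiples of $p^2$ occurring among $1, \ldots, k$. Continuing, there are $\lfloor k/p^i \rfloor$ multiples of $p^i$ among $1, \ldots, k$. This number becomes zero as soon as $p^i$ exceeds $k$, i.e., as soon as $i$ exceeds $\lfloor \log_p(k) \rfloor$.

From this, we find that

\[ \left| \frac{1}{k!} \right| = p^{ef \, \mathrm{ord}_p(k!)} \leq p^{ef \frac{k - 1}{p - 1}}. \]

It follows that

\[ \left| \frac{x^k}{k!} \right| \leq r^k p^{ef \frac{k - 1}{p - 1}} = \left( r \cdot p^{\frac{ef}{p-1}} \right)^k \cdot p^{-\frac{ef}{p-1}}. \]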

From this we find that


Proposition 8.10 The exponential function $\exp(x)$ converges for all $|x| \leq r$, as long as

\[ r < p^{-\frac{ef}{p-1}} = |p|^{1/(p-1)}. \]
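The counting formula for $\mathrm{ord}_p(k!)$ that drives this estimate is easy to test numerically. The sketch below compares the sum of floors with a direct computation from $k!$, and also checks the bound $\mathrm{ord}_p(k!) \leq (k-1)/(p-1)$; the function names and the tested ranges are our own choices.

```python
from math import factorial

def ord_p(n, p):
    """p-adic valuation of a nonzero integer n."""
    count = 0
    while n % p == 0:
        n //= p
        count += 1
    return count

def legendre(k, p):
    """ord_p(k!) computed as the sum of floor(k / p^i) over i >= 1."""
    total, power = 0, p
    while power <= k:
        total += k // power
        power *= p
    return total

# For each (p, k): does the formula match, and does the bound hold?
checks = [
    (legendre(k, p) == ord_p(factorial(k), p),
     (p - 1) * legendre(k, p) <= k - 1)
    for p in (2, 3, 5, 7)
    for k in range(1, 200)
]

print(all(formula and bound for formula, bound in checks))
```

For instance, `legendre(10, 3)` counts three multiples of 3 and one multiple of 9 among $1, \ldots, 10$.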

Example 8.11 Let $U_1 = 1 + p\mathbb{Z}_p$, i.e.,

\[ U_1 = \{ x \in \mathbb{Q}_p : |x - 1| < 1 \}. \]

Then $U_1$ is a subgroup of $\mathbb{Z}_p^\times$, i.e., $U_1$ is closed under multiplication and inversion. In fact, inverses may be easily computed using geometric series.

The open disc $p\mathbb{Z}_p = \{ x \in \mathbb{Q}_p : |x| < 1 \}$ is an additive subgroup of $\mathbb{Q}_p$, i.e., it is closed under addition and subtraction. The convergence of $\log(1 - x)$ for $|x| < 1$ implies that $\log$ defines a function:

\[ \log : U_1 \to p\mathbb{Z}_p. \]

On the other hand, if $x \in p\mathbb{Z}_p$, and $p \neq 2$, then $1 > (p - 1)^{-1}$ and

\[ |x| \leq p^{-1} < p^{-1/(p-1)}. \]

Thus when $p \neq 2$, the exponential series defines a function:

\[ \exp : p\mathbb{Z}_p \to U_1. \]

The reader may apply analytic methods to verify that $\log$ and $\exp$ define continuous group homomorphisms, within their radii of convergence, and are inverses when convergence is satisfied. In that way, we find a continuous group isomorphism, when $p \neq 2$:

\[ U_1 \cong p\mathbb{Z}_p. \]
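The homomorphism property of $\log$ can be observed with exact rational arithmetic. The sketch below truncates the series for $\log(1 + x)$ (the series above with $x$ replaced by $-x$) and measures how far $\log((1+x)(1+y))$ is from $\log(1+x) + \log(1+y)$ in the 3-adic valuation. The sample values $x = 3$, $y = 6$, the precision $N = 6$, and the generous truncation length $3N$ are all assumptions made for illustration.

```python
from fractions import Fraction

def val(x, p):
    """p-adic valuation of a rational number (infinity for zero)."""
    x = Fraction(x)
    if x == 0:
        return float("inf")

    def ord_int(n):
        n, count = abs(n), 0
        while n % p == 0:
            n //= p
            count += 1
        return count

    return ord_int(x.numerator) - ord_int(x.denominator)

def log1p_truncated(x, terms):
    """Partial sum of log(1 + x) = x - x^2/2 + x^3/3 - ..."""
    x = Fraction(x)
    return sum((-1) ** (k + 1) * x**k / k for k in range(1, terms + 1))

p, N = 3, 6
x, y = 3, 6                      # both lie in 3 * Z_3
z = (1 + x) * (1 + y) - 1        # (1 + x)(1 + y) = 1 + z
K = 3 * N                        # omitted terms have valuation >= k - ord_p(k) >= N

defect = log1p_truncated(z, K) - log1p_truncated(x, K) - log1p_truncated(y, K)

print(val(defect, p) >= N, val(log1p_truncated(x, K), p))
```

The defect vanishes modulo $3^N$, as the homomorphism property predicts, and $\log(1 + 3)$ has valuation 1, so it lies in $3\mathbb{Z}_3$.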

8.3 Hensel’s Lemma

We have already seen the interpretation of $A \subset K$ as the set of compatible systems of congruences:

\[ A \cong \varprojlim_k \frac{\mathcal{O}_F}{P^k}. \]

In this way, an element of $A$ encodes an infinite sequence of congruences. Such an encoding is not worthwhile unless we can, in the end, carry out only a finite number of computations.

Hensel’s lemma is a “lifting theorem”, which approximately says that once a sufficiently long chain of congruences has been solved (modulo $P, P^2, P^3, \ldots, P^t$), there exists an infinite sequence of solutions (modulo $P^k$ for all $k > 0$). For example, in order to prove that the equation $x^2 = 7$ has a solution in $\mathbb{Z}_3$, it suffices to prove that $x^2 = 7$ has a solution in $\mathbb{Z}/3\mathbb{Z}$; Hensel’s lemma takes care of the rest.

There are many other interpretations, reformulations, and generalizations of Hensel’s lemma. A good exposition and historical treatment of Hensel’s lemma can be found in the article of Brink⁷ or the article of Ribenboim⁸.

⁷ David Brink. New light on Hensel’s lemma. Expo. Math., 24(4):291–306, 2006. ISSN 0723-0869.

⁸ Paulo Ribenboim. Equivalent forms of Hensel’s lemma. Exposition. Math., 3(1):3–24, 1985. ISSN 0723-0869.

Consider a polynomial $F \in A[X]$. We are interested foremost in the question of finding roots to the equation $F(x) = 0$, in the ring $A$. In other words, we are interested in finding a compatible sequence of congruences $(x_k)$, with $x_k \in \mathcal{O}_F/P^k$, such that for all $k \geq 1$, $F(x_k) \equiv 0$ modulo $P^k$.

Let $F' \in A[X]$ denote the derivative of $F$. Using the binomial theorem, one may check directly that

\[ F(x_0 + \delta) - F(x_0) = \delta F'(x_0) + \delta^2 \alpha, \]

for some $\alpha \in A$. Thus in nonarchimedean fields, we see that (at least for polynomials) results from calculus on linear approximation are still valid.

One application of linear approximation (of polynomials) is Newton’s method for finding roots of polynomials. Hensel’s lemma demonstrates that Newton’s method applies in the ring $A$ as well:

Theorem 8.12 Suppose that $F \in A[X]$, $a \in A$, and $F(a)$ is close to zero in the precise sense that

\[ |F(a)| < |F'(a)|^2. \]

Then there exists $x \in A$ such that $F(x) = 0$ and

\[ |x - a| \leq \left| \frac{F(a)}{F'(a)^2} \right|. \]

Proof: Our proof follows Theorem 6.28 of Knapp’s text⁹. Suppose that $|F(a)| < |F'(a)|^2$. Then $F'(a) \neq 0$, since otherwise we would have $|F(a)| < 0$.

⁹ Anthony W. Knapp. Advanced algebra. Cornerstones. Birkhäuser Boston Inc., Boston, MA, 2007. ISBN 978-0-8176-4522-9.

Consider the sequence defined recursively¹⁰ by $a_0 = a$ and

\[ a_{i+1} = a_i - \frac{F(a_i)}{F'(a_i)}. \]

¹⁰ This recursive process is known as Newton’s method for finding roots of a polynomial.

In order to prove that this sequence converges, it is natural to consider the associated sequences:

\[ \delta_i = a_{i+1} - a_i = -\frac{F(a_i)}{F'(a_i)}, \text{ and } \rho_i = |\delta_i|. \]

Also define

\[ C = \frac{\rho_0}{|F'(a_0)|} = \frac{|F(a_0)|}{|F'(a_0)|^2}. \]


Thus $C < 1$ and $\rho_0 = C \cdot |F'(a_0)| < 1$ as well.

We prove by induction that for all $i \geq 0$ the following statements hold:

($1_i$) $a_i$ is an element of $A$.

($2_i$) $|F'(a_i)| = |F'(a_0)| \neq 0$.

($3_i$) $\rho_i \leq C^{2^i} |F'(a_0)|$.

When $i = 0$, every one of the above statements is trivially true. Assume now that Statements $1_i$, $2_i$, $3_i$ are true. Then by Statement $2_i$, we find that

\[ a_{i+1} = a_i - \frac{F(a_i)}{F'(a_i)} \]

is well-defined. By Statement $3_i$, we find that

\[ |a_{i+1} - a_i| = \frac{|F(a_i)|}{|F'(a_i)|} = \rho_i \leq C^{2^i} |F'(a_0)|. \]

Since $C < 1$ and $|F'(a_0)| \leq 1$, the right side is less than 1. Since $a_i \in A$ (by Statement $1_i$), we find that $a_{i+1} \in A$ by the ultrametric triangle inequality. Thus we have proven Statement $1_{i+1}$.

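Now, by linear approximation, we find that

\[ F(a_{i+1}) = F(a_i + \delta_i) = F(a_i) + F'(a_i)\delta_i + \delta_i^2 \alpha, \]

\[ F'(a_{i+1}) = F'(a_i + \delta_i) = F'(a_i) + \delta_i \beta, \]

for some $\alpha, \beta \in A$. It follows that

\[ |F'(a_{i+1})| = |F'(a_i) + \delta_i \beta|. \]

By Statements $2_i$ and $3_i$, we have

\[ |\delta_i| = \rho_i \leq C^{2^i} |F'(a_0)| = C^{2^i} |F'(a_i)| < |F'(a_i)|, \]

so that $|\delta_i \beta| < |F'(a_i)|$, and so

\[ |F'(a_{i+1})| = |F'(a_i)|. \]

Thus we have proven Statement $2_{i+1}$.

Finally, we are left to consider

\[ \rho_{i+1} = \left| \frac{F(a_{i+1})}{F'(a_{i+1})} \right|. \]

Note that since $(a_{i+1} - a_i) F'(a_i) = -F(a_i)$,

\[ F(a_{i+1}) = F(a_i) + F'(a_i)\delta_i + \delta_i^2 \alpha = \delta_i^2 \alpha. \]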


By Statements $2_{i+1}$ and $2_i$, together with $F(a_{i+1}) = \delta_i^2 \alpha$ and Statement $3_i$, we find that

\[
\frac{\rho_{i+1}}{|F'(a_0)|} = \frac{|F(a_{i+1})|}{|F'(a_0)|^2}
= \frac{|\delta_i^2 \alpha|}{|F'(a_0)|^2}
\leq \left( \frac{|\delta_i|}{|F'(a_0)|} \right)^2
\leq \left( C^{2^i} \right)^2
= C^{2^{i+1}}.
\]

Statement $3_{i+1}$ follows immediately.

Now that we have a sequence $(a_i)$ in $A$, satisfying Statements $1_i$, $2_i$, $3_i$, we find that the sequence is a Cauchy sequence by Statement $3_i$ (and basic results on geometric series). Thus we consider the limit $x = \lim(a_i)$, which is an element of $A$. We find that $|x - a_0| \leq C$ as well by Statements $3_i$.

Finally, we find that

\[ |F(a_i)| = |a_{i+1} - a_i| \cdot |F'(a_i)| = \rho_i \cdot |F'(a_i)| = \rho_i |F'(a_0)| \to 0, \]

using Statements $2_i$ and $3_i$. Since $|F(a_i)|$ approaches zero, we find (by the continuity of polynomials) that $F(x) = 0$.
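The quadratic convergence in this proof is visible in a case where $F'(a)$ is not a unit. As a sketch under our own choice of data, take $F(X) = X^2 - 17$ over $\mathbb{Q}_2$ with $a = 1$: then $|F(1)| = 2^{-4} < 2^{-2} = |F'(1)|^2$, so the theorem applies. The code runs Newton's method with exact rationals and records the 2-adic valuation of $F(a_i)$.

```python
from fractions import Fraction

def val2(x):
    """2-adic valuation of a nonzero rational number."""
    n, d, count = abs(x.numerator), x.denominator, 0
    while n % 2 == 0:
        n //= 2
        count += 1
    while d % 2 == 0:
        d //= 2
        count -= 1
    return count

def F(t):
    return t * t - 17

def dF(t):
    return 2 * t

a = Fraction(1)
valuations = []
for _ in range(4):
    valuations.append(val2(F(a)))
    a = a - F(a) / dF(a)       # Newton step: a_{i+1} = a_i - F(a_i)/F'(a_i)

print(valuations)
```

For a quadratic polynomial, $F(a_{i+1}) = \delta_i^2$ exactly, so each valuation is twice the previous one minus 2; the iterates therefore converge 2-adically to a square root of 17 in $\mathbb{Z}_2$.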

The previous theorem is one of the more powerful of many equivalent statements of Hensel’s lemma. In practice, a very important case arises when $F'(a)$ is a unit in $A$:

Corollary 8.13 Suppose that $F \in A[X]$, $a \in A$, $|F'(a)| = 1$, and $|F(a)| < 1$. Then there exists $x \in A$ such that $F(x) = 0$ and $|x - a| \leq |F(a)|$.

One important case of this is the following:

Corollary 8.14 Let $C$ be an integer, and $n$ be a positive integer. Let $p$ be a prime number. Then, if $p$ does not divide $n$, and $p$ does not divide $C$, and there exists $a \in \mathbb{Z}$ such that $a^n \equiv C$ modulo $p$, then there exists $x \in \mathbb{Z}_p$ such that $x^n = C$. Thus there exists a sequence of integers $(x_k)$, such that for every $k \geq 1$,

\[ x_k^n \equiv C \text{ modulo } p^k. \]

Proof: Let $F \in \mathbb{Z}[X]$ be the polynomial $F(X) = X^n - C$. Since $a^n \equiv C$ modulo $p$, we find that $|F(a)| < 1$. Moreover, $|a|^n = |C| = 1$ since $p$ does not divide $C$, and hence $|a| = 1$ as well. If we consider the derivative, we find that

\[ |F'(a)| = |n| \cdot |a|^{n-1} = 1, \]

since $p$ does not divide $n$.

Thus by Hensel’s lemma, there exists $x \in \mathbb{Z}_p$ such that $F(x) = 0$. The result follows from our interpretation of $\mathbb{Z}_p$ as the ring of compatible congruences, modulo powers of $p$.

Example 8.15 There exists a sequence of integers $(x_k)$ such that $x_k^2 \equiv 7$ modulo $3^k$. Indeed, $7 \equiv 1$ modulo 3, so 7 is a “quadratic residue” modulo 3. Since 3 divides neither 2 nor 7, the hypotheses of the previous corollary hold.
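The sequence $(x_k)$ in this example can be produced by running Newton's method inside $\mathbb{Z}/3^N\mathbb{Z}$. The following is a minimal sketch, assuming Python 3.8 or later; the function name `hensel_sqrt` and the precision $N = 10$ are illustrative choices, and the routine assumes $p$ odd with $p$ not dividing $c$ or the starting root.

```python
def hensel_sqrt(c, p, a, N):
    """Lift a root a of X^2 - c modulo p to a root modulo p^N.

    Assumes p is odd, p does not divide c, and a*a ≡ c (mod p).
    """
    modulus = p**N
    x = a % modulus
    # Each Newton step doubles the number of correct p-adic digits,
    # so N.bit_length() steps suffice (2^bit_length(N) > N).
    for _ in range(N.bit_length()):
        fx = (x * x - c) % modulus
        dfx = (2 * x) % modulus           # a unit modulo p^N
        x = (x - fx * pow(dfx, -1, modulus)) % modulus
    return x

root = hensel_sqrt(7, 3, 1, 10)
print((root * root - 7) % 3**10, root % 3)
```

Reducing `root` modulo $3^k$ for $k \leq 10$ gives integers $x_k$ with $x_k^2 \equiv 7$ modulo $3^k$, all congruent to the starting root 1 modulo 3.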

8.4 Another Hensel’s Lemma

Hensel’s lemma immediately implies the following statement:

Proposition 8.16 Suppose that $k = A/vA$ is the residue field of $A$, and for $f \in A[X]$ write $\bar{f}$ for its reduction in $k[X]$. Suppose that $f \in A[X]$ is a monic polynomial. Then if $\bar{f} = \bar{g}\bar{h}$ for a linear monic polynomial $\bar{g} \in k[X]$ and a monic polynomial $\bar{h} \in k[X]$ such that $\bar{g}$ and $\bar{h}$ are coprime, then there exist monic polynomials $g, h \in A[X]$ which reduce to $\bar{g}, \bar{h}$, and which satisfy $f = gh$.

Proof: Given the decomposition $\bar{f} = \bar{g}\bar{h}$, with $\bar{g}$ monic and linear, we find that $\bar{f}$ has a root $\bar{r}$ in $k = A/vA$. Let $r \in A$ be any lift of $\bar{r}$. Thus we find that $\bar{f}(\bar{r}) = 0$, so $f(r) \in vA$. Furthermore, since $\bar{g}$ and $\bar{h}$ are coprime, we find that $\bar{r}$ is not a repeated root of $\bar{f}$. Thus $\bar{f}'(\bar{r}) \neq 0$. Therefore $f'(r) \notin vA$, and so $|f'(r)| = 1$.

It follows that $|f(r)| < |f'(r)|^2$, and Hensel’s lemma applies. We may find a root $x$ of $f$ which reduces to $\bar{r}$, modulo $vA$. Thus $f$ has a monic linear factor $g$ which reduces to $\bar{g}$, and the quotient $h = f/g$ reduces to $\bar{h}$ as desired.

The following theorem generalizes the above proposition, removing assumptions on the degree of $\bar{g}$ and $\bar{h}$, and it is often called Hensel’s lemma as well. While the following theorem is stronger than the previous proposition, it is somewhat weaker than Hensel’s lemma since it does not allow one to find roots of polynomials $f$ whose reduction has multiple roots.

Theorem 8.17 Let $k = A/vA$ be the residue field of $A$, and for $f \in A[X]$ write $\bar{f}$ for its reduction in $k[X]$. Suppose that $f \in A[X]$ is a monic polynomial. Then if $\bar{f} = \bar{g}\bar{h}$ for two monic coprime (in $k[X]$) polynomials $\bar{g}$ and $\bar{h}$, then there exist monic polynomials $g, h \in A[X]$ which reduce to $\bar{g}, \bar{h}$, and satisfy $f = gh$.


Proof: Our proof follows the proof of Theorem 7.33 in Milne’s excellent notes.¹¹

We begin with $f \in A[X]$ monic, and $\bar{g}, \bar{h} \in k[X]$ monic and coprime, such that $\bar{f} = \bar{g}\bar{h}$. We can immediately find monic $g_1, h_1$ lifting¹² $\bar{g}, \bar{h}$, from which we find that

\[ f - g_1 h_1 \in vA[X], \text{ and } \mathrm{Deg}(f) = \mathrm{Deg}(g_1 h_1). \]

¹² We say that $g$ lifts $\bar{g}$ if $\bar{g} \in k[X]$ and $g \in A[X]$, and $g$ reduces (mod $vA$) to $\bar{g}$.

We prove, by induction, that there exists a sequence of pairs $(g_n, h_n) \in A[X]^2$ such that

• The polynomials $g_n, h_n$ are monic and reduce to $\bar{g}, \bar{h}$.

• $\mathrm{Deg}(g_n) \leq \mathrm{Deg}(g_1)$ and $\mathrm{Deg}(h_n) \leq \mathrm{Deg}(h_1)$.

• $f - g_n h_n \in v^n A[X]$.

The base step of induction has already been demonstrated. So suppose that a pair $(g_n, h_n)$ has been found satisfying the above conditions. It suffices to find $u, w \in A[X]$ with $\mathrm{Deg}(u) < \mathrm{Deg}(g_1)$ and $\mathrm{Deg}(w) < \mathrm{Deg}(h_1)$ such that

\[ f - (g_n + v^n u)(h_n + v^n w) \in v^{n+1} A[X]. \]

Indeed, we could then define $g_{n+1} = g_n + v^n u$ and $h_{n+1} = h_n + v^n w$ to achieve the next step in the induction. Observe that the degrees of $g_{n+1}$ and $h_{n+1}$ remain bounded above by $\mathrm{Deg}(g_1)$ and $\mathrm{Deg}(h_1)$. Moreover, $g_{n+1}$ and $h_{n+1}$ remain monic, since they only differ from $g_n$ and $h_n$ in terms below the dominant term.

Simplifying the conditions on $u$ and $w$, it suffices to find $u, w \in A[X]$ of degrees bounded as before, such that

\[ u h_n + w g_n \equiv \frac{f - g_n h_n}{v^n} \text{ modulo } vA[X]. \]

For this, consider the $A[X]$-module $M = A[X]/(g_n, h_n)$. This is finitely generated as an $A$-module since $g_n$ (and $h_n$) is monic¹³. Since $\bar{g}, \bar{h}$ are coprime, we find that $M/vM = 0$. It follows by Nakayama’s lemma¹⁴ that $M = 0$.

¹³ $M$ is generated as an $A$-module by $1, X, X^2, \ldots, X^{\mathrm{Deg}(g_n) - 1}$, since higher powers of $X$ can be expressed in terms of these powers.

¹⁴ Nakayama’s lemma: If $R$ is a local ring, i.e., a ring with a unique maximal ideal $\mathfrak{m}$, and $M$ is a finitely generated $R$-module, and $M/\mathfrak{m}M = 0$, then $M = 0$.

Hence $(g_n, h_n) = A[X]$, i.e., $g_n$ and $h_n$ are coprime in $A[X]$. It follows that there exist $u, w \in A[X]$ such that

\[ u h_n + w g_n = \frac{f - g_n h_n}{v^n}. \]

If $\mathrm{Deg}(w) \geq \mathrm{Deg}(h_n)$, we may divide polynomials¹⁵ with remainder to find $w = q h_n + r$ for polynomials $q, r \in A[X]$ with $\mathrm{Deg}(r) < \mathrm{Deg}(h_n)$. In this case we find

\[ (u + q g_n) h_n + r g_n = \frac{f - g_n h_n}{v^n}. \]

¹⁵ Division of monic polynomials with coefficients in a PID has the expected properties.

Not only is $\mathrm{Deg}(r) < \mathrm{Deg}(h_n)$, but moreover $\mathrm{Deg}(u + q g_n) < \mathrm{Deg}(g_n)$ since

\[ \mathrm{Deg}(u + q g_n) + \mathrm{Deg}(h_n) = \mathrm{Deg}\left( \frac{f - g_n h_n}{v^n} - r g_n \right) < \mathrm{Deg}(h_n) + \mathrm{Deg}(g_n). \]

In this way, replacing $(u, w)$ by $(u + q g_n, r)$, we find by division with remainder that there exist $u, w$ in $A[X]$ such that

• $u h_n + w g_n = \dfrac{f - g_n h_n}{v^n}$.

• $\mathrm{Deg}(u) < \mathrm{Deg}(g_n)$ and $\mathrm{Deg}(w) < \mathrm{Deg}(h_n)$.

• $f - (g_n + v^n u)(h_n + v^n w) \in v^{n+1} A[X]$.

Thus we find a sequence of polynomials $g_n, h_n$ satisfying the desired three conditions. Observe that the coefficients of these polynomials form Cauchy sequences, and the degrees of these polynomials are bounded. Hence, there are limiting polynomials $g, h$ such that:

• $g$ and $h$ are monic and reduce to $\bar{g}$ and $\bar{h}$.

• $\mathrm{Deg}(g) \leq \mathrm{Deg}(g_1)$ and $\mathrm{Deg}(h) \leq \mathrm{Deg}(h_1)$.

• $f = gh$.

Example 8.18 An important application of the above formulation of Hensel’s lemma is given by the Teichmüller representatives. Recall that $q = N(P)$ is the cardinality of the field $k = A/vA$. Thus the polynomial $f(X) = X^q - X$ has $q$ distinct roots in $k$. The resulting factorization of $\bar{f}(X)$ in $k[X]$, into distinct monic linear factors, lifts by the previous theorem to a factorization of $f(X)$ in $A[X]$ into distinct monic linear factors. In this way, we find a canonical injective function (not a homomorphism!) $\theta : k \to A$. One may check directly that $\theta$ does restrict to an injective group homomorphism from $k^\times$ into $A^\times$.
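For $A = \mathbb{Z}_p$ (so $q = p$), the Teichmüller representatives can be approximated modulo $p^N$. The sketch below does not lift the factorization as in the example; instead it uses the standard shortcut that $a^{p^{N-1}}$ is a root of $X^p - X$ modulo $p^N$ reducing to $a$ modulo $p$. The function name and the choice $p = 5$, $N = 4$ are assumptions for illustration.

```python
def teichmuller(a, p, N):
    """Root of X^p - X modulo p^N that reduces to a modulo p."""
    return pow(a, p ** (N - 1), p**N)

p, N = 5, 4
modulus = p**N
reps = {a: teichmuller(a, p, N) for a in range(p)}

# Each representative is a root of X^p - X and lifts its residue class.
roots_ok = all(pow(t, p, modulus) == t and t % p == a for a, t in reps.items())

# Multiplicative: theta(2) * theta(3) = theta(6 mod 5) = theta(1).
multiplicative = (reps[2] * reps[3]) % modulus == reps[1]

# Not additive: theta(1) + theta(1) differs from theta(2).
additive = (reps[1] + reps[1]) % modulus == reps[2]

print(roots_ok, multiplicative, additive)
```

The output illustrates the remark in the example: $\theta$ respects multiplication on $k^\times$ but is not a ring homomorphism.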

8.5 Extension of the valuation

Suppose that $K$ is a p-adic field, and $K \subset L$ is a finite extension of fields. Thus $L$ is a finite-dimensional $K$-vector space, endowed with the product topology from that on $K$. Recall that $A$ is the closed unit disc in $K$, also called the valuation ring of $K$. Fix a uniformizer $v$ for $K$, and recall that the nonzero ideals of $A$ are precisely the principal ideals $v^k A$ for $0 \leq k \in \mathbb{Z}$. In particular, $A$ is a PID and UFD.

Since $A$ is a PID, Proposition 2.4 holds, replacing $\mathbb{Z}$ by $A$ everywhere:

Proposition 8.19 Suppose that $L$ is a field extension of $K$ and $\alpha \in L$. The following conditions are equivalent.


1. $\alpha$ is integral over $A$.

2. $A[\alpha]$ is a finite rank $A$-module.

3. There exists a subring $M$ of $L$, such that $M$ is finitely generated as an $A$-module and $M$ contains $\alpha$.

Consider the set $B$ of elements of $L$ which are integral over $A$ (also called the integral closure of $A$ in $L$). An immediate corollary of the previous proposition, following Corollary 2.5 for $\mathbb{Z}$, is that $B$ is a subring of $L$.

Let $M = vA$ be the unique maximal ideal in $A$; we consider the prime ideals of $B$ containing $M$.

Lemma 8.20 There exists a unique (and hence maximal) prime ideal in $B$ containing $M$.

Proof: Suppose that $M_1, M_2$ are distinct prime ideals of $B$ containing $M$. Since $M_1 \neq M_2$, we may choose $\beta \in M_1$ such that $\beta \notin M_2$ (without loss of generality). Let $f \in A[X]$ be the minimal polynomial of $\beta$; thus $f$ is monic and irreducible, since every element of $B$ is integral over $A$. Then we find that $A[\beta]$ is isomorphic to $A[X]/\langle f \rangle$.

Let $k = A/M$ be the residue field of $A$, and consider the reduction $\bar{f} \in k[X]$. If $\bar{f}$ had two distinct irreducible factors in $k[X]$, this factorization would lift (by the previous Hensel’s lemma) to a factorization of $f \in A[X]$, contradicting irreducibility. Thus $\bar{f}$ is a power of an irreducible polynomial $\bar{g}$ in $k[X]$. Therefore, we find that:

\[ A[\beta]/vA[\beta] \cong k[X]/\langle \bar{f} \rangle = k[X]/\langle \bar{g}^d \rangle. \]

It follows that $A[\beta]/vA[\beta]$ has only one prime ideal, contradicting our assumption that there were two prime ideals $M_1, M_2$ containing $M$ such that $M_1 \cap A[\beta] \neq M_2 \cap A[\beta]$.

Much of the first four chapters of this text goes through easily, replacing $\mathbb{Z}$ by $A$ and $\mathbb{Q}$ by $K$. In particular, we find that the “ring of integers” $B$ is an $A$-lattice in the $K$-vector space $L$. Moreover, $B$ is a UFD, and ideals in $B$ factor uniquely into powers of the unique nonzero prime ideal $N \subset B$. Just as in the proof of Ostrowski’s theorem, every nonarchimedean valuation on $B$ arises as an “$N$-adic valuation”:

\[ |x| = e^{-\alpha \, \mathrm{ord}_N(x)}. \]

Theorem 8.21 The $P$-adic valuation on $K$ extends uniquely to a valuation on $L$. $L$ is complete with respect to this valuation.


Proof: Observe that $L$ is the fraction field of $B$ (by the same arguments as in Chapter 2), and so any valuation of $L$ is uniquely determined by its restriction to $B$. In this way, it suffices to classify the valuations on the ring $B$ which extend the $P$-adic valuation on $A$.

Let $N$ be the maximal ideal of $B$ containing the maximal ideal $M$ of $A$. Observe that the ideal $MB$ factors as $N^e$, as ideals in $B$. Moreover, $B/N$ is a finite extension of the field $A/M$, since $B$ is a finitely generated $A$-module. Recalling that $[A : M] = q$, a power of $p$,

\[ [B : N] = q^f, \text{ for some } 0 < f \in \mathbb{Z}. \]

Now, if $| \cdot |$ is a valuation on $L$ extending the $P$-adic valuation on $K$, we find that

\[ |x| = q^{-f \alpha \, \mathrm{ord}_N(x)}, \]

for some $\alpha \in \mathbb{R}$. Since it extends the $P$-adic valuation on $K$, and $\mathrm{ord}_N(v) = e$, we find that

\[ |v| = q^{-f \alpha \, \mathrm{ord}_N(v)} = q^{-f e \alpha} = q^{-1}. \]

Thus $\alpha = (ef)^{-1}$. This proves the uniqueness of the extension of the $P$-adic valuation. For existence, one may simply define

\[ |x| = q^{-\mathrm{ord}_N(x)/e}. \]

For the completeness of $L$, observe that $L$ is isometric to the vector space $K^n$, where $n = [L : K]$. One can directly prove completeness from this fact.

Corollary 8.22 If $L$ is a finite extension of $K$, then $L$ also satisfies Hensel’s Lemma in all its variants. There is a natural Teichmüller lift from the residue field $\ell = B/N$ to $L$, extending the Teichmüller lift from $k = A/M$ to $K$.

8.6 Exercises

Exercise 8.1 Prove that if $p$ is a prime number, and $p$ is congruent to 1 mod 4, then there exists $x \in \mathbb{Z}_p$ such that $x^2 = -1$.

Exercise 8.2 Let $K$ be a p-adic field, with valuation ring $A$. Let $v$ be a uniformizing element, and let $k = A/vA$ be the residue field. Let $\theta : k \to A$ denote the Teichmüller map.

Prove that every element of $A$ can be expressed as a convergent series of the form:

\[ \sum_{i=0}^{\infty} \theta(a_i) v^i, \]

where for all $i$, $a_i \in k$.


Exercise 8.3 Suppose that $K$ is a p-adic field, and $L$ is a finite Galois extension of $K$. Prove that if $\gamma \in \mathrm{Gal}(L/K)$, and $L$ is given the topology from the unique valuation extending that on $K$, then $\gamma$ is a continuous map from $L$ to $L$.

9 Galois Theory

In this chapter, we study the Galois theory of extensions of number fields, and the related theory of extensions of p-adic fields. This will culminate in the statements of global and local class field theory.

9.1 Extensions of p-adic fields

Suppose that $K$ is a p-adic field, and $L$ is a finite extension of $K$. In particular, the $P$-adic valuation on $K$ extends uniquely to a valuation on $L$, and we fix this valuation on $L$ throughout. Let $A$ be the unit disc in $K$, and $B$ the unit disc in $L$ (also the integral closure of $A$ in $L$). Let $M = vA$ be the unique maximal ideal of $A$, and $N = \nu B$ the unique maximal ideal of $B$.

We have seen previously that the extension $L/K$ has two numerical invariants, defined as follows:

Definition 9.1 The ramification degree of $L$ over $K$ is the positive integer $e$ which satisfies $MB = N^e$. The inertial degree of $L$ over $K$ is the positive integer $f$ which satisfies $[B : N] = [A : M]^f$. $L/K$ is called an unramified extension if $e = 1$. $L/K$ is called a totally ramified extension if $f = 1$.

These two invariants are connected to the degree of the field extension $L/K$ by the following:

Proposition 9.2 Let $n = [L : K]$. Then $n = ef$, the product of the ramification degree and the inertial degree.

Proof: The theory of $A$-lattices demonstrates that $n$ is equal to the rank of the $A$-lattice $B$. Moreover, since $B$ is a free $A$-module of rank $n$, we find that:

\[ [B : vB] = [A : vA]^n. \]

Now, we have $vB = \nu^e B$, and so

\[ [B : vB] = [B : \nu B]^e = [A : vA]^{fe}. \]

Thus $n = ef$.


The extension $L/K$ yields an extension of residue fields $k = A/M \subset \ell = B/N$ in a canonical way. The extension $L/K$ is unramified if $e = 1$ (so $n = f$). In this case, $n$ equals not only $[L : K]$, but also the degree of the extension of residue fields $[\ell : k]$.

The ramification of an extension can be detected numerically by the discriminant:

Proposition 9.3 The extension $L/K$ is ramified if and only if $v$ divides the discriminant $\mathrm{Disc}(B)$ of $B$ as an $A$-lattice.

Proof: Let $\beta_1, \ldots, \beta_n$ be an $A$-basis of $B$; thus $\mathrm{Disc}(B)$ is the determinant of the matrix $(\mathrm{Tr}(\beta_i \beta_j))$, and $\mathrm{Disc}(B) \in A$.

On the other hand, consider the $k$-algebra

\[ B_k := B \otimes_A k = \frac{B}{vB} = \frac{B}{\nu^e B}. \]

The discriminant of $B_k$ is (defined to be) the determinant of the matrix $(\mathrm{Tr}(\bar{\beta}_i \bar{\beta}_j))$, where here the trace denotes the trace of the resulting matrix with entries in $k$. We find that $\mathrm{Disc}(B_k)$ is precisely the reduction, mod $v$, of $\mathrm{Disc}(B)$. Thus $v$ divides $\mathrm{Disc}(B)$ if and only if $\mathrm{Disc}(B_k) = 0$.

It is a general fact about finite-dimensional $k$-algebras¹ $B_k$ that $\mathrm{Disc}(B_k) = 0$ if and only if $B_k$ has a nonzero nilpotent element. Since $B_k = B/\nu^e B$, and $\nu B$ is a maximal ideal in $B$, we find that $B_k$ has a nonzero nilpotent element if and only if $e > 1$.

¹ The only assumption is that $k$ is a perfect field and $B_k$ is a finite-dimensional associative $k$-algebra. See, for example, Lemma 3.38 of Milne’s notes.
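As an illustration of this criterion, consider quadratic extensions $L = \mathbb{Q}_p(\sqrt{d})$ with $p$ odd and $d$ a squarefree integer that is not a square in $\mathbb{Q}_p$. In that setting we take as given (it is not proven here) that $B = \mathbb{Z}_p[\sqrt{d}]$ with $A$-basis $1, \sqrt{d}$. The sketch below computes the determinant of the trace form on that basis; the pair encoding of $a + b\sqrt{d}$ and all function names are illustrative.

```python
def mul(x, y, d):
    """Multiply a + b*sqrt(d) and c + e*sqrt(d), encoded as pairs."""
    (a, b), (c, e) = x, y
    return (a * c + b * e * d, a * e + b * c)

def trace(x):
    """Trace of a + b*sqrt(d) down to the base field: 2a."""
    a, _ = x
    return 2 * a

def disc(d):
    """Determinant of the matrix (Tr(b_i * b_j)) for the basis 1, sqrt(d)."""
    basis = [(1, 0), (0, 1)]
    m = [[trace(mul(u, w, d)) for w in basis] for u in basis]
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

p = 3
print(disc(3), disc(3) % p == 0)   # Q_3(sqrt(3)): 3 divides the discriminant
print(disc(2), disc(2) % p == 0)   # Q_3(sqrt(2)): 3 does not divide it
```

The discriminant is $4d$, so for odd $p$ it is divisible by $p$ exactly when $p$ divides $d$: $\mathbb{Q}_3(\sqrt{3})$ is ramified, while $\mathbb{Q}_3(\sqrt{2})$ is unramified.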

Consider an unramified extension $L/K$, so that $n = f = [L : K] = [\ell : k]$. Let $q$ be the cardinality of $k$, so $q^f$ is the cardinality of $\ell$. The extension $\ell/k$ can be defined as the splitting field of an irreducible monic polynomial $\bar{\Phi}_f$ of degree $f$ in $k[X]$. We may choose a monic lift $\Phi_f$ of degree $f$ in $A[X]$ as well, which is then irreducible in $A[X]$ and in $K[X]$. By Hensel’s lemma, the distinct roots of $\bar{\Phi}_f$ in $\ell$ lift to distinct roots of $\Phi_f$ in $L$. In this way, we see that $L$ is the splitting field of an irreducible polynomial in $K[X]$. In particular, we find that

Proposition 9.4 Every unramified extension $L/K$ is a Galois extension.

A much stronger result is the following:

Proposition 9.5 There is a canonical bijection (really, an equivalence of categories) between the finite unramified extensions of $K$ and the finite extensions of $k$, which preserves inclusions and Galois groups.

Proof: Every finite unramified extension $L/K$ yields a finite extension $\ell/k$ of the same degree, as we have already seen. Furthermore, any inclusion $L_1 \hookrightarrow L_2$ of unramified extensions of $K$ must be compatible with valuations, by the uniqueness of extension of valuations. Thus any such inclusion induces an inclusion of discs of any radius, and hence induces an inclusion of residue fields $\ell_1 \hookrightarrow \ell_2$.

Conversely, any finite extension $\ell/k$ arises as $\ell = k[z]$ for some element $z \in \ell$. Letting $\bar{P}_z$ be the minimal polynomial of $z$ in $k[X]$, one may lift $\bar{P}_z$ to a monic irreducible polynomial $P_z$ of the same degree in $A[X]$ (which is then also irreducible in $K[X]$). Let $L$ be a splitting field of $P_z$, which is then a finite extension of $K$. Then, there is a unique valuation on $L$ extending that on $K$. If $r$ is any root of $P_z$ in $L$, then we find that $r \in B$ (the valuation ring of $L$) since $P_z$ is monic and irreducible in $A[X]$. It follows that the reduction $\bar{r}$ is a root of $\bar{P}_z$ in $B/N$ (the residue field of $L$). It follows that $B/N$ is a field containing $k$ and a root of $\bar{P}_z$; thus $B/N$ is isomorphic to $\ell$. Note that $L/K$ is unramified, since its degree equals the degree of $P_z$, which equals the degree of $\ell$ as an extension of $k$.

Thus, we find a correspondence between finite unramified extensions of $K$ and finite extensions $\ell/k$. To be precise, one may consider the category Unr($K$) of finite unramified field extensions of $K$ and $K$-algebra homomorphisms, and the category Fie($k$) of finite field extensions of $k$ and $k$-algebra homomorphisms. Our work so far has constructed a functor from Unr($K$) to Fie($k$), and we have demonstrated that this functor is essentially surjective (every object of Fie($k$) is isomorphic to the image of an object of Unr($K$)).

To finish the proof², it suffices to show that the induced maps $\mathrm{Gal}(L/K) \to \mathrm{Gal}(\ell/k)$ are isomorphisms for any finite unramified extension $L/K$ with residue field $\ell$. For this, we may realize $L = K[\beta]$ and $\ell = k[\bar{\beta}]$, for some $\beta \in B$ such that the Galois conjugates of $\beta$ have distinct reductions in $\ell$. Every automorphism in $\mathrm{Gal}(L/K)$ is determined by where it sends $\beta$, and this is determined by where $\bar{\beta}$ is sent. By such an argument, one verifies that $\mathrm{Gal}(L/K)$ injects into $\mathrm{Gal}(\ell/k)$. Surjectivity follows from the fact that both groups have order $f$.

² To prove that a functor defines an equivalence of categories, one must show that it is essentially surjective, full, and faithful. We have, so far, constructed an essentially surjective functor. The morphism sets are either empty, or are torsors for Galois groups. For this reason, the remaining argument is sufficient.

Recall that any finite extension of finite fields $\ell/k$ is cyclic, generated by a Frobenius automorphism³ $\mathrm{Fr} : \ell \to \ell$ given by:

\[ \mathrm{Fr}(x) = x^q. \]

³ This is often called the arithmetic Frobenius automorphism. Its inverse is called the geometric Frobenius automorphism.
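A small case makes the Frobenius automorphism concrete. The sketch below models $\ell = \mathbb{F}_9$ as $\mathbb{F}_3[i]$ with $i^2 = -1$ (a choice of model made here for illustration, valid since $X^2 + 1$ is irreducible modulo 3), over $k = \mathbb{F}_3$, so $q = 3$ and $f = 2$. Elements $a + bi$ are encoded as pairs $(a, b)$.

```python
from itertools import product

P = 3  # characteristic; elements of F_9 are pairs (a, b) meaning a + b*i

def mul(x, y):
    """Multiply a + b*i and c + d*i using i^2 = -1, reducing modulo P."""
    (a, b), (c, d) = x, y
    return ((a * c - b * d) % P, (a * d + b * c) % P)

def power(x, n):
    """Compute x^n by repeated multiplication."""
    result = (1, 0)
    for _ in range(n):
        result = mul(result, x)
    return result

def frobenius(x):
    """Fr(x) = x^q with q = 3."""
    return power(x, P)

field = list(product(range(P), repeat=2))

order_two = all(frobenius(frobenius(x)) == x for x in field)
nontrivial = any(frobenius(x) != x for x in field)
multiplicative = all(
    frobenius(mul(x, y)) == mul(frobenius(x), frobenius(y))
    for x in field for y in field
)
fixed = [x for x in field if frobenius(x) == x]

print(order_two, nontrivial, multiplicative, fixed)
```

The Frobenius has order $f = 2$, and its fixed elements are exactly the three elements of $k = \mathbb{F}_3$, consistent with $\mathrm{Gal}(\ell/k)$ being cyclic of order 2.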

By the previous proposition, we find:

Corollary 9.6 Every finite unramified extension $L/K$ is cyclic, generated by a Frobenius⁴ automorphism $\mathrm{Fr} \in \mathrm{Gal}(L/K)$ which reduces to the usual Frobenius automorphism of the residue fields $\ell/k$.

⁴ In other words, the phrase “Frobenius automorphism” is used both to denote the automorphism of finite fields given by $x \mapsto x^q$ and to denote the unique automorphism of $L$ reducing to this automorphism.


In addition, we find

Corollary 9.7 If $L_1, L_2$ are finite unramified extensions of $K$, contained in some larger finite extension $L/K$, then $L_1 L_2$ is a finite unramified extension of $K$.⁵

⁵ This corollary can also be proven using the discriminant. The discriminant is multiplicative in towers, in the sense that if $K \subset L \subset L'$ with valuation rings $A \subset B \subset B'$, then

\[ \mathrm{Disc}_A(B) \cdot N_{L/K} \mathrm{Disc}_B(B') = \mathrm{Disc}_A(B'). \]

The corollary follows quickly from this fact.

Proof: Let $\ell_1, \ell_2$ denote the residue fields of $L_1, L_2$, respectively, and $\ell$ the residue field of $L$. Let $L_0 = L_1 \cap L_2$, which is then an unramified extension of $K$ by the previous proposition. Let $\ell_0$ be the residue field of $L_0$.

Consider the compositum $\ell' = \ell_1 \ell_2$ of finite fields, in $\ell$. Then, there exists an unramified extension $L'$ of $K$, with residue field $\ell_1 \ell_2$, such that the inclusions among the fields $K, L_0, L_1, L_2, L'$ and among the fields $k, \ell_0, \ell_1, \ell_2, \ell'$ are compatible. As one may construct $L'$ as the splitting field of a polynomial with roots in $\ell_1 \ell_2 \subset \ell$, one may construct $L'$ as a subfield of $L$ containing $L_1$ and $L_2$. It follows, by looking at degrees, that $L' = L_1 L_2$.

From this, we find

Proposition 9.8 Let $L$ be a finite extension of a p-adic field $K$. Then there exists a unique intermediate field $K \subset L_u \subset L$, such that $L_u/K$ is unramified and $L/L_u$ is totally ramified:

\[ e = [L : L_u], \text{ and } f = [L_u : K]. \]

Proof: Let $L_u$ be the compositum of all unramified extensions of $K$ in $L$.

9.2 Local class field theory

The unramified extensions of $K$ provide a source of cyclic Galois extensions of $K$. In this section, we state the main results of local class field theory, describing, to some extent, the abelian Galois extensions of $K$. First, we mention some infinite Galois theory.

Let $L/K$ be an algebraic extension of fields, not necessarily finite. We say that $L$ is a Galois extension if every polynomial in $K[X]$ which has a root in $L$ splits completely in $L$. Note that, whether or not $L$ is finite, $L$ is the compositum of all of its subfields $L'$ such that $L'/K$ is a finite Galois extension. Using inverse limits, which we have seen previously in “compatible systems of congruences”, we may define $\mathrm{Gal}(L/K)$ when $L/K$ is an infinite Galois extension:

Definition 9.9 Let $L/K$ be a (finite or infinite) Galois extension of fields. Let $S$ be the set of intermediate extensions $L'/K$ such that $L' \subset L$, $L'/K$ is Galois, and $[L' : K] < \infty$. Then $\mathrm{Gal}(L/K)$ is the group whose elements are collections $(\gamma_{L'})$, where for each $L' \in S$, $\gamma_{L'} \in \mathrm{Gal}(L'/K)$, and where for any pair $L' \subset L''$ of elements of $S$,

\[ \mathrm{pr}(\gamma_{L''}) = \gamma_{L'}, \]

where $\mathrm{pr} : \mathrm{Gal}(L''/K) \to \mathrm{Gal}(L'/K)$ is the canonical projection homomorphism. The group law on $\mathrm{Gal}(L/K)$ is obtained “component-wise” from the Galois groups $\mathrm{Gal}(L'/K)$. $\mathrm{Gal}(L/K)$ is given the subspace topology, from the direct product topology on $\prod_{L' \in S} \mathrm{Gal}(L'/K)$; in this way $\mathrm{Gal}(L/K)$ is a compact topological group.

In other words, $\mathrm{Gal}(L/K)$ is the group of “compatible systems” of elements of Galois groups of intermediate extensions $L/L'/K$ with $[L' : K]$ finite. As a first example we consider the absolute Galois group of a finite field:

Example 9.10 Let $k$ be a finite field (for example, the residue field of $K$) with $q$ elements. Let $\bar{k}$ be an algebraic closure of $k$. Then, it is a standard fact of field theory that for every positive integer $f$, there exists a unique subfield $\ell_f$ of $\bar{k}$ with $q^f$ elements, and $\mathrm{Gal}(\ell_f/k)$ is cyclic of order $f$. Moreover, inclusions correspond to divisibility of these positive integers $f$: if $f$ divides $f'$, then $\ell_f \subset \ell_{f'}$, and the resulting map on Galois groups corresponds to the reduction map of cyclic groups:

\[ \mathrm{pr} : \frac{\mathbb{Z}}{f'\mathbb{Z}} \to \frac{\mathbb{Z}}{f\mathbb{Z}}. \]

Here, the generator 1 of the cyclic groups $\mathbb{Z}/f\mathbb{Z}$ and $\mathbb{Z}/f'\mathbb{Z}$ corresponds to the Frobenius automorphism $x \mapsto x^q$ of fields.

Since all finite extensions of $k$ arise as fields $\ell_f$, one for each $f$, we find that $\mathrm{Gal}(\bar{k}/k)$ is isomorphic to $\hat{\mathbb{Z}}$, the group of compatible systems of congruences, one for each modulus $1 \leq f \in \mathbb{Z}$. By the Chinese remainder theorem, such a compatible system of congruences is determined by a compatible system of congruences mod powers of $p$ for every prime number $p$:

\[ \hat{\mathbb{Z}} \cong \prod_p \mathbb{Z}_p. \]
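At a finite level, the Chinese remainder theorem step says that a residue modulo $f$ is determined by its residues modulo the prime powers dividing $f$. The following sketch (Python 3.8 or later, for `math.prod` and modular inverses) checks this for the modulus $360 = 8 \cdot 9 \cdot 5$; the function name `crt` and the chosen modulus are illustrative.

```python
from math import prod

def crt(residues, moduli):
    """Solve x ≡ r_i (mod m_i) for pairwise coprime moduli m_i."""
    M = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        # Mi * (Mi^{-1} mod m) is 1 modulo m and 0 modulo the other moduli.
        x += r * Mi * pow(Mi, -1, m)
    return x % M

# Z/360Z is recovered from Z/8Z x Z/9Z x Z/5Z.
parts = [8, 9, 5]
roundtrip = all(crt([a % m for m in parts], parts) == a for a in range(360))

print(crt([1, 2], [4, 3]), roundtrip)
```

Passing to the limit over all moduli $f$ turns these finite isomorphisms into the isomorphism between $\hat{\mathbb{Z}}$ and $\prod_p \mathbb{Z}_p$ stated above.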

Let $\bar{K}$ be any algebraic closure of $K$. Let $K^{unr}$ be the maximal unramified extension of $K$ in $\bar{K}$, i.e., the compositum of all finite unramified extensions of $K$ in $\bar{K}$. Then, we find a short exact sequence of topological groups (and continuous homomorphisms, in fact):

\[ 1 \to \mathrm{Gal}(\bar{K}/K^{unr}) \to \mathrm{Gal}(\bar{K}/K) \to \mathrm{Gal}(K^{unr}/K) \to 1. \]

By the previous example, we find

\[ 1 \to \mathrm{Gal}(\bar{K}/K^{unr}) \to \mathrm{Gal}(\bar{K}/K) \to \hat{\mathbb{Z}} \to 1. \]


If $G$ is a topological group, let $G^{ab}$ denote the quotient $G/\overline{[G, G]}$ of $G$ by the closure of its commutator subgroup. (The quotient of a topological group by a closed subgroup is again a Hausdorff topological group. With a non-closed subgroup, one ends up with a non-Hausdorff quotient.) The group $G^{ab}$ is called the (topological) abelianization of $G$, and it satisfies the following important property:

Proposition 9.11 Every continuous homomorphism from $G$ to a topological abelian group $A$ factors uniquely through the quotient $G^{ab}$.
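For a finite group with the discrete topology, every subgroup is closed, so $G^{ab}$ is simply $G/[G, G]$. The following Python sketch (not from the original notes; all function names are illustrative) computes the commutator subgroup of the symmetric group $S_3$ by brute force. It then checks that the sign character, a homomorphism to the abelian group $\{\pm 1\}$, is trivial on the commutator subgroup, as the proposition requires.

```python
from itertools import permutations


def compose(s, t):
    """Composition (s after t) of permutations stored as tuples."""
    return tuple(s[t[i]] for i in range(len(t)))


def inverse(s):
    inv = [0] * len(s)
    for i, image in enumerate(s):
        inv[image] = i
    return tuple(inv)


def sign(s):
    """Sign of a permutation, computed by counting inversions."""
    inversions = sum(1 for i in range(len(s))
                     for j in range(i + 1, len(s)) if s[i] > s[j])
    return -1 if inversions % 2 else 1


def generated_subgroup(generators, identity):
    """Subgroup generated by a set of elements of a finite permutation group."""
    elements = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for h in generators:
            product = compose(g, h)
            if product not in elements:
                elements.add(product)
                frontier.append(product)
    return elements


G = list(permutations(range(3)))      # the symmetric group S_3
identity = tuple(range(3))
commutators = {compose(compose(g, h), compose(inverse(g), inverse(h)))
               for g in G for h in G}
derived = generated_subgroup(commutators, identity)   # the subgroup [G, G]

# The sign character kills [G, G], so it factors through G / [G, G].
assert all(sign(g) == 1 for g in derived)
print(len(G), len(derived), len(G) // len(derived))   # prints: 6 3 2
```

Here $[G, G]$ is the alternating group of order 3, so the abelianization of $S_3$ has order 2, and the sign character is the only nontrivial character of $S_3$ with values in an abelian group.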

For Galois extensions with Galois group $G$, the group $G^{ab}$ plays an important role:

Proposition 9.12 Let $L/K$ be a (finite or infinite) Galois extension of fields, with Galois group $G$. Let $K^{ab}$ be the fixed field of the closure $\overline{[G, G]}$ of the commutator subgroup, so $\mathrm{Gal}(K^{ab}/K) \cong G^{ab}$. Then $K^{ab}$ is the maximal abelian Galois extension of $K$ in $L$: every abelian Galois extension of $K$ in $L$ is contained in $K^{ab}$.

The following is the main theorem of local class field theory:

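Theorem 9.13 Fix an algebraic closure $\bar{K}$ of $K$, and let $K^{ab}$ be the maximal abelian extension of $K$ in $\bar{K}$. There is a canonical embedding of groups:

$$I : K^\times \to \mathrm{Gal}(\bar{K}/K)^{ab} = \mathrm{Gal}(K^{ab}/K),$$

which makes $K^\times$ a dense subgroup of $\mathrm{Gal}(K^{ab}/K)$, and for which the following diagram commutes:

$$\begin{array}{ccccccccc}
1 & \to & A^\times & \to & K^\times & \to & \mathbb{Z} & \to & 1 \\
 & & \downarrow {\scriptstyle \sim} & & \downarrow {\scriptstyle I} & & \downarrow & & \\
1 & \to & \mathrm{Gal}(K^{ab}/K^{unr}) & \to & \mathrm{Gal}(K^{ab}/K) & \to & \hat{\mathbb{Z}} & \to & 1.
\end{array}$$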

Closely related to the above theorem is the following: there is a canonical inclusion

$$\mathrm{Hom}_{cont}(\mathrm{Gal}(\bar{K}/K), \mathbb{C}^\times) \hookrightarrow \mathrm{Hom}_{cont}(K^\times, \mathbb{C}^\times).$$

In other words, the continuous characters of $\mathrm{Gal}(\bar{K}/K)$ correspond to continuous characters of $K^\times$ (specifically, those whose image is contained in a subgroup of $\mathbb{C}^\times$ of finite order).

This is an important step towards understanding $G = \mathrm{Gal}(\bar{K}/K)$. Indeed, since $G$ is a compact topological group, Tannaka duality implies that $G$ is determined by the category of continuous finite-dimensional complex representations, i.e. the category whose objects are elements of $\mathrm{Hom}_{cont}(G, GL_n(\mathbb{C}))$ for various $n$. Local class field theory describes these objects when $n = 1$. The recent proofs by Harris-Taylor and Henniart of the “local Langlands conjectures for $GL_n$” describe $\mathrm{Hom}_{cont}(G, GL_n(\mathbb{C}))$ as well.


9.3 Global Fields

Let us return, finally, to a number field $F/\mathbb{Q}$, and assume that $F$ is a Galois extension of $\mathbb{Q}$ with $\Gamma = \mathrm{Gal}(F/\mathbb{Q})$. Consider a prime number $p \in \mathbb{Z}$. By the factorization theory of ideals in $\mathcal{O}_F$, we find that there are distinct prime ideals $P_1, \ldots, P_t$ in $\mathcal{O}_F$, and positive integers $e_1, \ldots, e_t$, such that:

$$p\mathcal{O}_F = P_1^{e_1} \cdots P_t^{e_t}.$$

Furthermore, since $p\mathcal{O}_F$ is a $\Gamma$-fixed ideal in $\mathcal{O}_F$, we find that $\Gamma$ acts on the set $\{P_1, \ldots, P_t\}$ of primes of $\mathcal{O}_F$ which lie above $p$. In fact, we have

Proposition 9.14 $\Gamma$ acts transitively on the set of primes of $\mathcal{O}_F$ which lie above $p$, and there exists a positive integer $e$ such that

$$p\mathcal{O}_F = P_1^{e} \cdots P_t^{e}.$$

Proof: For each prime ideal $P$ of $\mathcal{O}_F$ lying above $p$, let $F_P$ be the completion of $F$ with respect to the $P$-adic valuation. One can check, using the universal property of completion, that $F_P$ embeds topologically, and as an $F$-algebra, in $\mathbb{Q}_p \otimes_{\mathbb{Q}} F$. In particular, we find that $F_P$ is an algebraic extension of $\mathbb{Q}_p$.

Conversely, every embedding of $F$ into an algebraic closure of $\mathbb{Q}_p$ yields a valuation on $F$ which restricts to the $p$-adic valuation on $\mathbb{Q}$, and hence yields a prime ideal $P$ of $\mathcal{O}_F$ lying above $p$. In this way, we find a surjective function from the set of embeddings of $F$ into a fixed algebraic closure of $\mathbb{Q}_p$ to the set of prime ideals of $\mathcal{O}_F$ lying above $p$. Moreover, this function is compatible with the natural action of $\Gamma$ on each set; since $\Gamma$ acts transitively on the set of embeddings of $F$ into any algebraically closed field, we find that $\Gamma$ acts transitively on the set of prime ideals of $\mathcal{O}_F$ lying above $p$.

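Finally, since $\Gamma$ acts transitively on the set $\{P_1, \ldots, P_t\}$, and since for every $\gamma \in \Gamma$ we have

$$P_1^{e_1} \cdots P_t^{e_t} = p\mathcal{O}_F = \gamma(p\mathcal{O}_F) = P_{\gamma(1)}^{e_1} \cdots P_{\gamma(t)}^{e_t},$$

we find, by uniqueness of the factorization into prime ideals, that $e_1 = e_2 = \cdots = e_t$.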

Definition 9.15 Suppose that $p$ is a prime number, and $P$ is a prime ideal of $\mathcal{O}_F$ lying above $p$. The decomposition group $\Gamma_P$ at $P$ is the subgroup of $\Gamma = \mathrm{Gal}(F/\mathbb{Q})$ which fixes $P$.

In particular, since $\Gamma$ acts transitively on the primes $P_1, \ldots, P_t$ above $p$, we find that

$$[\Gamma : \Gamma_P] = t.$$
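The index formula can be tested numerically in a cyclotomic field. The sketch below is an illustration, not part of the original notes, and it rests on a standard fact that is assumed here rather than proved: for $F = \mathbb{Q}(\zeta_m)$ and a prime $p$ not dividing $m$, the group $\Gamma$ is identified with $(\mathbb{Z}/m\mathbb{Z})^\times$, and $\Gamma_P$ with the cyclic subgroup generated by $p$ mod $m$. The function names are hypothetical.

```python
from math import gcd


def unit_group(m):
    """Elements of (Z/mZ)^x, for m >= 2."""
    return [a for a in range(1, m) if gcd(a, m) == 1]


def decomposition_group(p, m):
    """Cyclic subgroup of (Z/mZ)^x generated by p (assumes gcd(p, m) == 1)."""
    if gcd(p, m) != 1:
        raise ValueError("p must not divide m")
    subgroup, x = [], 1
    while True:
        x = (x * p) % m
        subgroup.append(x)
        if x == 1:
            break
    return sorted(subgroup)


def number_of_primes_above(p, m):
    """t = [Gamma : Gamma_P], under the identification assumed above."""
    return len(unit_group(m)) // len(decomposition_group(p, m))


# In Q(zeta_5): 11 = 1 mod 5 splits into 4 primes, while 2 generates
# (Z/5Z)^x, so there is a single prime above 2.
assert number_of_primes_above(11, 5) == 4
assert number_of_primes_above(2, 5) == 1
assert number_of_primes_above(19, 5) == 2
```

Since $\Gamma$ is abelian in this example, the decomposition group does not depend on the choice of $P$ above $p$.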


Since $\Gamma_P$ fixes $P$, $\Gamma_P$ preserves the $P$-adic valuation on $F$. It follows directly that $\Gamma_P$ maps injectively to the automorphism group $\mathrm{Aut}(F_P/\mathbb{Q}_p)$, where $F_P$ is the finite extension of $\mathbb{Q}_p$ obtained by completing $F$.

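On the other hand, we find that:

$$F \otimes_{\mathbb{Q}} \mathbb{Q}_p \cong F_{P_1} \times \cdots \times F_{P_t}.$$

It follows that (for any prime $P$ lying above $p$), writing $n = [F : \mathbb{Q}] = \#\Gamma$:

$$n = t \cdot [F_P : \mathbb{Q}_p] = t \cdot \#\Gamma_P.$$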

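Thus $F_P$ is a Galois extension of $\mathbb{Q}_p$, and $\Gamma_P$ is isomorphic to $\mathrm{Gal}(F_P/\mathbb{Q}_p)$.

Once we can identify $\Gamma_P$ with a Galois group of an extension $F_P/\mathbb{Q}_p$ of $p$-adic fields, we find a canonical subgroup $\Gamma_P^{unr}$ whose fixed field is $F_P^{u}$, the maximal unramified extension of $\mathbb{Q}_p$ in $F_P$. This is called the inertia subgroup of $\Gamma$ at $P$. We find a tower of subgroups and subfields:

$$\Gamma \supset \Gamma_P \supset \Gamma_P^{unr} \supset 1,$$

$$\mathbb{Q} \subset F^{\Gamma_P} \subset F^{\Gamma_P^{unr}} \subset F.$$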
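The tower can be made concrete for $F = \mathbb{Q}(i)$, where $n = 2$. The following Python sketch is an illustration only, under two assumptions not established in this section: that $\mathcal{O}_F = \mathbb{Z}[i]$, and that the factorization of $p\mathcal{O}_F$ mirrors the factorization of $x^2 + 1$ modulo $p$ (the Dedekind-Kummer criterion). The function name is hypothetical. It returns the ramification degree $e$, the inertial degree $f$, and the number $t$ of primes above $p$; the decomposition group then has order $n/t = ef$, and the inertia subgroup has order $e$.

```python
def splitting_type_gaussian(p):
    """Return (e, f, t) for the prime p in Z[i], read off from x^2 + 1 mod p."""
    roots = [x for x in range(p) if (x * x + 1) % p == 0]
    if len(roots) == 2:      # two distinct linear factors: p splits
        return (1, 1, 2)
    if len(roots) == 1:      # a repeated root: p ramifies (this happens only for p = 2)
        return (2, 1, 1)
    return (1, 2, 1)         # x^2 + 1 is irreducible mod p: p is inert


for p in [2, 3, 5, 7, 13]:
    e, f, t = splitting_type_gaussian(p)
    assert e * f * t == 2    # n = t * [F_P : Q_p], with [F_P : Q_p] = e * f
    print(p, "decomposition group order:", e * f, "inertia group order:", e)
```

For $p = 5$ the decomposition group is trivial and $F^{\Gamma_P} = F$; for $p = 3$ the decomposition group is all of $\Gamma$ and the inertia subgroup is trivial; for $p = 2$ both equal $\Gamma$.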

A corollary of our results is the following:

Proposition 9.16 A prime $p$ ramifies in $F$ (i.e., $e > 1$) if and only if $p$ divides $\mathrm{Disc}(F)$.

Proof: The discriminant of $F$, i.e., the discriminant of $\mathcal{O}_F$ as a $\mathbb{Z}$-lattice with respect to the trace form, can also be computed “locally”, i.e., in the valuation ring $\mathbb{Z}_p$ for any prime $p$. We have already seen that $p$ divides this locally computed discriminant if and only if the extension of $p$-adic fields ramifies. The previous proposition demonstrates that this local ramification is equivalent to global ramification at $p$.

Corollary 9.17 If $F$ is a number field, and $F \neq \mathbb{Q}$, let $\mathrm{Ram}(F)$ be the set of prime numbers which ramify in $\mathcal{O}_F$. Then $\mathrm{Ram}(F)$ is finite and non-empty.

Proof: If $F \neq \mathbb{Q}$, then we have seen that $\mathrm{Disc}(F) \neq \pm 1$. Thus $\mathrm{Disc}(F)$ has a prime factor $p$, which then ramifies. On the other hand, $\mathrm{Disc}(F)$ is an integer, and hence has only finitely many prime factors. Thus $\mathrm{Ram}(F)$ is a finite set.
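For quadratic fields the set $\mathrm{Ram}(F)$ can be listed directly. The sketch below is an illustration and not part of the original notes. It assumes the standard formula for the discriminant of $\mathbb{Q}(\sqrt{d})$ with $d$ squarefree: $d$ if $d \equiv 1$ mod 4, and $4d$ otherwise. The function names are hypothetical, and the input is not checked for squarefreeness.

```python
def quadratic_discriminant(d):
    """Discriminant of Q(sqrt(d)), for a squarefree integer d other than 0 and 1."""
    return d if d % 4 == 1 else 4 * d


def prime_factors(n):
    """Distinct prime factors of a nonzero integer, in increasing order."""
    n = abs(n)
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes


def ramified_primes(d):
    """Ram(F) for F = Q(sqrt(d)): the primes dividing Disc(F)."""
    return prime_factors(quadratic_discriminant(d))


for d in [-5, -3, -1, 2, 5, 6]:
    print(d, quadratic_discriminant(d), ramified_primes(d))
```

In each case the list is finite and non-empty, as the corollary predicts: for instance $\mathbb{Q}(i)$ has discriminant $-4$ and ramifies only at 2, while $\mathbb{Q}(\sqrt{-5})$ has discriminant $-20$ and ramifies at 2 and 5.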

On the other hand, if $F$ is a number field, it is often possible to find a larger number field $H \supset F$ such that no prime ideals of $\mathcal{O}_F$ ramify when factored in $\mathcal{O}_H$. The following result is quite deep, and belongs in a course on class field theory:


Theorem 9.18 Let $F$ be a number field. There exists a finite extension $H \supset F$, which is maximal among extensions of $F$ in which no prime ideals of $\mathcal{O}_F$ ramify. Furthermore, the field $H$, called the Hilbert class field of $F$, is an abelian Galois extension of $F$, and $\mathrm{Gal}(H/F)$ is canonically isomorphic to the ideal class group $H(F)$.

We finish by mentioning one more deep result. Suppose that $p$ is a prime number, which does not ramify in $\mathcal{O}_F$. Thus we find that $F_P/\mathbb{Q}_p$ is an unramified extension of $p$-adic fields, for any prime $P$ of $\mathcal{O}_F$ lying above $p$. In particular, there is a Frobenius element $\mathrm{Fr}_P \in \Gamma_P$ for any such prime $P$. As all primes $P$ above $p$ are $\Gamma$-conjugate, we find that there is a well-defined conjugacy class $\mathrm{Fr}_p$ of Frobenius elements “at $p$”: $\mathrm{Fr}_p$ is the conjugacy class in $\Gamma$, containing all of the elements $\mathrm{Fr}_P$, for all primes $P$ above $p$.

Theorem 9.19 (Tchebotarev density) Let $\mathrm{Unr}(F)$ be the set of prime numbers which are unramified in the number field $F$, with $\Gamma = \mathrm{Gal}(F/\mathbb{Q})$ as before. Let $C$ be a conjugacy class in $\Gamma$. Let $\mathrm{Unr}_C(F)$ be the subset of $\mathrm{Unr}(F)$ consisting of prime numbers $p$ for which $\mathrm{Fr}_p = C$. Then $\mathrm{Unr}_C(F)$ is a subset of $\mathrm{Unr}(F)$ with Dirichlet density $\#C/\#\Gamma$.

It turns out that for any prime number $p$ with $(p, n) = 1$, the conjugacy class of $\mathrm{Fr}_p$ in $\Gamma = \mathrm{Gal}(\mathbb{Q}(\zeta_n)/\mathbb{Q}) = (\mathbb{Z}/n\mathbb{Z})^\times$ is determined by the congruence class of $p$ modulo $n$. Tchebotarev density then implies that the prime numbers are equally distributed among the (possible) congruence classes modulo $n$, i.e., those coprime to $n$.
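This equidistribution can be observed numerically. The following Python sketch is an illustration, not part of the original notes, and its function names are hypothetical. It counts the primes up to a bound by their residue class modulo $n$, which by the remark above is the class of $\mathrm{Fr}_p$ in $(\mathbb{Z}/n\mathbb{Z})^\times$. Counting primes up to a bound reflects natural density rather than Dirichlet density, so the output illustrates the theorem without verifying it.

```python
from math import gcd, isqrt


def primes_up_to(limit):
    """Sieve of Eratosthenes, for limit >= 1."""
    sieve = [True] * (limit + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, isqrt(limit) + 1):
        if sieve[i]:
            for j in range(i * i, limit + 1, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]


def frobenius_class_counts(n, limit):
    """Count primes p <= limit not dividing n, by the class of p mod n."""
    counts = {a: 0 for a in range(n) if gcd(a, n) == 1}
    for p in primes_up_to(limit):
        if n % p != 0:
            counts[p % n] += 1
    return counts


# Each of the four classes in (Z/8Z)^x should receive about a quarter
# of the primes up to the bound.
counts = frobenius_class_counts(8, 100000)
total = sum(counts.values())
for a, c in sorted(counts.items()):
    print(a, c, round(c / total, 4))
```

With $n = 4$ and bound 100, the counts are 11 primes congruent to 1 and 13 primes congruent to 3 modulo 4; the proportions approach $1/2$ each only as the bound grows.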



