
Rolf Berndt

Representations of Linear Groups

An Introduction Based on Examples from Physics and Number Theory

First edition, July 2007

All rights reserved
© Friedr. Vieweg & Sohn Verlag | GWV Fachverlage GmbH, Wiesbaden 2007

Editorial Office: Ulrike Schmickler-Hirzebruch | Susanne Jahnel

Vieweg is a company in the specialist publishing group Springer Science+Business Media.
www.vieweg.de

No part of this publication may be reproduced, stored in a retrieval system or transmitted, mechanical, photocopying or otherwise without prior permission of the copyright holder.

Cover design: Ulrike Weigel, www.CorporateDesignGroup.de
Printing and binding: MercedesDruck, Berlin
Printed on acid-free paper
Printed in Germany

ISBN 978-3-8348-0319-1

Bibliographic information published by Die Deutsche Nationalbibliothek
Die Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data is available on the Internet at <http://dnb.d-nb.de>.

Prof. Dr. Rolf Berndt
Department of Mathematics
University of Hamburg
Bundesstraße 55
D-20146 Hamburg
Germany


Mathematics Subject Classification: 20G05, 22E45, 20C25, 20C35, 11F70

Preface

There are already many good books on representation theory for all kinds of groups. Two of the best (in this author’s opinion) are the one by A.W. Knapp: “Representation Theory for Semisimple Groups. An Overview based on Examples” [Kn1] and by G.W. Mackey: “Induced Representations in Physics, Probability and Number Theory” [Ma1]. The title of this text is a mixture of both these titles, and our text is meant as a very elementary introduction to these and, moreover, to the whole topic of group representations, even infinite-dimensional ones. As is evident from the work of Bargmann [Ba], Weyl [Wey] and Wigner [Wi], group representations are fundamental for the theory of atomic spectra and elementary physics. But representation theory has proven to be an unavoidable ingredient in other fields as well, particularly in number theory, as in the theory of theta functions, automorphic forms, Galois representations and, finally, the Langlands program. Hence, we present an approach as elementary as possible, having in particular these applications in mind.

This book is written as a summary of several courses given in Hamburg for students of Mathematics and Physics from the fifth semester on. Thus, some knowledge of linear and multilinear algebra, calculus and analysis in several variables is taken for granted. Assuming these prerequisites, several groups of particular interest for the applications in physics and number theory are presented and discussed, including the symmetric group $S_n$ as the leading example for a finite group, the groups SO(2), SO(3), SU(2), and SU(3) as examples of compact groups, the Heisenberg groups and SL(2,R), SL(2,C), resp. the Lorentz group SO(3, 1) as examples for noncompact groups, and the Euclidean groups $E(n) = \mathrm{SO}(n) \ltimes \mathbf{R}^n$ and the Poincaré group $P = \mathrm{SO}(3,1)^{+} \ltimes \mathbf{R}^4$ as examples for semidirect products.

This text would not have been possible without the assistance of my students and colleagues; it is a pleasure for me to thank them all. In particular, D. Bahns, S. Bocherer, O. v. Grudzinski, M. Hohmann, H. Knorr, J. Michalicek, H. Muller, B. Richter, R. Schmidt, and Chr. Schweigert helped in many ways, from giving valuable hints to indicating several mistakes. Part of the material was treated in a joint seminar with Peter Slodowy. I hope that a little bit of his way of thinking is still felt in this text and that it is apt to participate in keeping alive his memory. Finally, I am grateful to U. Schmickler-Hirzebruch and S. Jahnel from the Vieweg Verlag for encouragement and good advice.

Contents

Introduction

0 Prologue: Some Groups and their Actions
    0.1 Several Matrix Groups
    0.2 Group Actions
    0.3 The Symmetric Group

1 Basic Algebraic Concepts
    1.1 Linear Representations
    1.2 Equivalent Representations
    1.3 First Examples
    1.4 Basic Construction Principles
        1.4.1 Sum of Representations
        1.4.2 Tensor Product of Representations
        1.4.3 The Contragredient Representation
        1.4.4 The Factor Representation
    1.5 Decompositions
    1.6 Characters

2 Representations of Finite Groups
    2.1 Characters as Orthonormal Systems
    2.2 The Regular Representation
    2.3 Characters as Orthonormal Bases

3 Continuous Representations
    3.1 Topological and Linear Groups
    3.2 The Continuity Condition
    3.3 Invariant Measures
    3.4 Examples

4 Representations of Compact Groups
    4.1 Basic Facts
    4.2 The Example G = SU(2)
    4.3 The Example G = SO(3)

5 Representations of Abelian Groups
    5.1 Characters and the Pontrjagin Dual
    5.2 Continuous Decompositions

6 The Infinitesimal Method
    6.1 Lie Algebras and their Representations
    6.2 The Lie Algebra of a Linear Group
    6.3 Derived Representations
    6.4 Unitarily Integrable Representations of sl(2,R)
    6.5 The Examples su(2) and heis(R)
    6.6 Some Structure Theory
        6.6.1 Specifications of Groups and Lie Algebras
        6.6.2 Structure Theory for Complex Semisimple Lie Algebras
        6.6.3 Structure Theory for Compact Real Lie Algebras
        6.6.4 Structure Theory for Noncompact Real Lie Algebras
        6.6.5 Representations of Highest Weight
    6.7 The Example su(3)

7 Induced Representations
    7.1 The Principle of Induction
        7.1.1 Preliminary Approach
        7.1.2 Mackey’s Approach
        7.1.3 Final Approach
        7.1.4 Some Questions and two Easy Examples
    7.2 Unitary Representations of SL(2,R)
    7.3 Unitary Representations of SL(2,C) and of the Lorentz Group
    7.4 Unitary Representations of Semidirect Products
    7.5 Unitary Representations of the Poincaré Group
    7.6 Induced Representations and Vector Bundles

8 Geometric Quantization and the Orbit Method
    8.1 The Hamiltonian Formalism and its Quantization
    8.2 Coadjoint Orbits and Representations
        8.2.1 Prequantization
        8.2.2 Example: Construction of Line Bundles over M = P^1(C)
        8.2.3 Quantization
        8.2.4 Coadjoint Orbits and Hamiltonian G-spaces
        8.2.5 Construction of an Irreducible Unitary Representation by an Orbit
    8.3 The Examples SU(2) and SL(2,R)
    8.4 The Example Heis(R)
    8.5 Some Hints Concerning the Jacobi Group

9 Epilogue: Outlook to Number Theory
    9.1 Theta Functions and the Heisenberg Group
    9.2 Modular Forms and SL(2,R)
    9.3 Theta Functions and the Jacobi Group
    9.4 Hecke’s Theory of L-Functions Associated to Modular Forms
    9.5 Elements of Algebraic Number Theory and Hecke L-Functions
    9.6 Arithmetic L-Functions
    9.7 Summary and Final Reflections

Bibliography

Index

Introduction

In this book, the groups enumerated in the Preface are introduced and treated as matrix groups to avoid as much as possible the machinery of manifolds, Lie groups and bundles (though some of it soon creeps in through the backdoor as the theory is further developed). Parallel to information about the structure of our groups we shall introduce and develop elements of the representation theory necessary to classify the unitary representations and to construct concrete models for these representations. As the main tool for the classification we use the infinitesimal method, which linearizes the representations of a group by studying those of the Lie algebra of the group. And as the main tools for the construction of models for the representations we use

– tensor products of the natural representation,
– representations given by smooth functions (in particular polynomials) living on a space provided with an action of the group, and
– the machinery of induced representations.

Moreover, because of the growing importance in physics and the success in deriving branching relations, the procedure of geometric quantization and the orbit method, developed and propagated by Kirillov, Kostant, Duflo and many others, shall be explained via its application to some of the examples above.

Besides the sources already mentioned, the author was largely influenced by the now classical book of Kirillov: “Elements of the Theory of Representations” [Ki] and the more recent “Introduction to the Orbit Method” [Ki1]. Other sources were the books by Barut and Raczka: “Theory of Group Representations and Applications” [BR], S. Lang: “SL(2,R)” [La], and, certainly, Serre: “Linear Representations of Finite Groups” [Se]. There is also the book by Hein: “Einführung in die Struktur- und Darstellungstheorie der klassischen Gruppen” [Hei], which follows the same principle as our text, namely to do as much as possible for matrix groups, but does not go into the infinite-dimensional representations necessary for important applications. Whoever is further interested in the history of the introduction of representation theory into the theory of automorphic forms and its development is referred to the classical book by Gelfand, Graev and Pyatetskii-Shapiro: “Representation Theory and Automorphic Forms” [GGP], Gelbart’s: “Automorphic Forms on Adele Groups” [Ge], and Bump’s: “Automorphic Forms and Representations” [Bu]. More references will be given at the appropriate places in our text.

As already said, we shall start using material only from linear algebra and analysis. But as we proceed, more and more elements from topology, functional analysis, complex function theory, differential and symplectic geometry will be needed. We will try to introduce these as gently as possible but often will have to be very rudimentary and will have to cite the hard facts without the proofs, which the reader can find in the more refined sources.

To sum up, this text is prima facie about real and complex matrices and the nice and sometimes advanced things one can do with them by elementary means starting from a certain point of view: Representation theory associates to each matrix from a given group G another matrix or, in the infinite-dimensional case, an operator acting on a Hilbert space. One may want to ask: why study these representations by generally more complicated matrices or operators if the group is already given by possibly rather simple matrices? An answer to this question is a bit like the one for the pudding: the proof is in the eating. And we hope our text will give an answer. To the more impatient reader who wants an answer right away in order to decide whether to read on or not, we offer the following rough explanation. Certain groups G appear in nature as symmetry groups leaving invariant a physical or dynamical system. For example, the orthogonal group O(3) is the symmetry group for the description of the motion of a particle in a centrally symmetric force field, and the Poincaré group P is the symmetry group for the motion of a free particle in Minkowski space. Then the irreducible unitary representations of G classify indivisible intrinsic descriptions of the system and, boldly speaking, can be viewed as “elementary particles” for the given situation. Following Wigner and his contemporaries, the parameters classifying the representations are interpreted as quantum numbers of these elementary particles...

The importance of representations for number theory is even more difficult to put into a nutshell. In the Galois theory of algebraic number fields (of finite degree) Galois groups appear as symmetry groups G. Important invariants of the fields are introduced via certain zeta- or L-functions, which are constructed using finite-dimensional representations of these Galois groups. Another aspect comes from considering smooth (holomorphic resp. meromorphic) functions in several variables which are periodic or have a more general covariant transformation property under the action of a given discrete subgroup of a continuous group G, like for instance G = SL(2,R) or G a Heisenberg or a symplectic group. Then these functions with preassigned types, e.g., theta functions or modular forms, generate representation spaces for (infinite-dimensional) representations of the respective group G.

Finally, we will give an overview of the contents of our text: In a prologue we will fix some notation concerning the groups and their actions that we later use as our first examples, namely, the general and special linear groups over the real and complex numbers and the orthogonal and unitary groups. Moreover, we present the symmetric group $S_n$ of permutations of n elements and some facts about its structure. We stay on the level of very elementary algebra and stick to the principle to introduce more general notions and details from group theory only when needed in our development of the representation theory. We follow this principle in the first chapter where we introduce the concept of linear representations using only tools from linear algebra. We define and discuss the fundamental notions of equivalence, irreducibility, unitarity, direct sums, tensor product, characters, and give some first examples.

The theory developed thus far is applied in the second chapter to the representations of finite groups, closely following Serre’s exposition [Se]. We find out that all irreducible representations may be unitarized and are contained in the regular representation.

In the next step we move on to compact groups. To do this we have to leave the purely algebraic ground and take in topological considerations. Hence, in the third chapter, we define the notion of a topological and of a (real or complex) linear group, the central notion for our text. Following this, we refine the definition of a group representation by adding the usual continuity condition. Then we adequately modify the general concepts of the first chapter. We try to take over as much as possible from finite to compact groups. This requires the introduction of invariant measures on spaces with a (from now on) continuous group action, and a concept of integration with respect to these measures. In the fourth chapter we concentrate on compact groups and prove that the irreducible representations are again unitarizable, finite-dimensional, fixed by their characters and contained in the regular representation. But their number is in general not finite, in contrast to the situation for finite groups. We state, but do not prove, the Peter-Weyl Theorem. But to get a (we hope) convincing picture, we illustrate it by reproducing Wigner’s discussion of the representations of SU(2) and SO(3). We use and prove that SU(2) is a double cover of SO(3). Angular momentum, magnetic and spin quantum numbers make an appearance, but for further application to the theory of atomic spectra we refer to [Wi] and the physics literature.

In a very short fifth chapter, we assemble some material about the representations of locally compact abelian groups. We easily get the result that every unitary irreducible representation is one-dimensional. But as can be seen from the example G = R, their number need not be denumerable. More functional analysis than we can offer at this stage is needed to decompose a given reducible representation into a direct integral of irreducibles, a notion we do not consider here.

Before starting the discussion of representations of other noncompact groups, we present in chapter 6 an important tool for the classification of representations, the infinitesimal method. Here, at first, we have to explain what a Lie algebra is and how to associate one to a given linear group. Our main ingredient is the matrix exponential function and its properties. We also reflect briefly on the notion of representations of Lie algebras. Here again we are on purely algebraic and, at least in examples, easily accessible ground. We start giving examples by defining the derived representation dπ of a given group representation π. We do this for the Schrödinger representation of the Heisenberg group and the standard representation $\pi_1$ of SU(2). Then we concentrate on the classification of all unitary irreducible representations of SL(2,R) via a description of all (integrable) representations of its Lie algebra. Having done this, we consider again the examples su(2) and heis(R) (relating them to the theory of the harmonic oscillator) and give some hints concerning the general structure theory of semisimple Lie algebras. The way a general classification theory works is explained to some extent by considering Lie SU(3); we will see how quarks show up.

Chapters 7 and 8 are the core of our book. In the seventh chapter we introduce the concept of induced representations, which allows for the construction of (sometimes infinite-dimensional) representations of a given group G starting from a (possibly one-dimensional) representation of a subgroup H of G. To make this work we need again a bit more Hilbert space theory and have to introduce quasi-invariant measures on spaces with group action. We illustrate this by considering the examples of the Heisenberg group and G = SU(2), where we rediscover the representations which we already know. Then we use the induction process to construct models for the unitary representations of SL(2,R) and SL(2,C). In particular, we show how holomorphic induction arises in the discussion of the discrete series of SL(2,R) (here we touch complex function theory). We insert a brief discussion of the Lorentz group $G^L = \mathrm{SO}(3,1)^0$ and prove that SL(2,C) is a double cover of $G^L$. To get a framework for the discussion of the representations of the Poincaré group $G^P$, which is a semidirect product of the Lorentz group with $\mathbf{R}^4$, we define semidirect products and treat Mackey’s theory in a rudimentary form. We outline a recipe to classify and construct irreducible representations of semidirect products if one factor is abelian. We do not prove the general validity of this procedure as Mackey’s Imprimitivity Theorem is beyond the scope of our book, but we apply it to determine the unitary irreducible representations of the Euclidean and the Poincaré group, which are fundamental for the classification of elementary particles.

Under the heading of Geometric Quantization, in the eighth chapter we take an alternative approach to some material from chapter 7 by constructing representations via the orbit method. Here we have to recall (or introduce) more concepts from higher analysis: manifolds and bundles, vector fields, differential forms, and in particular the notion of a symplectic form. We can again use the information and knowledge we already have of our examples G = SL(2,R), SU(2) and the Heisenberg group to get a feeling for what should be done here. We identify certain spheres and hyperboloids as coadjoint orbits of the respective groups, and we construct line bundles on these orbits and representation spaces consisting of polarized sections of the bundles.

Finally, in the ninth and last chapter, we give a brief outlook on some examples where representations show up in number theory. We present the notion of an automorphic representation (in a rudimentary form) and explain its relation with theta functions and automorphic forms. We take a glimpse at Hecke’s and Artin’s L-functions and mention the Artin conjecture.

We hope that some of the exercises and/or omitted proofs may give a starting point for a bachelor’s thesis, and also that this text motivates further studies in a master’s program in theoretical physics, algebra or number theory.

¡Libro, afán de estar en todas partes, en soledad!
(Book: the longing to be everywhere, in solitude!)

J. R. Jiménez

Chapter 0

Prologue: Some Groups and their Actions

This text is mainly on groups which in some way or another come from physics and/or number theory and which can be described in the form of a real or complex matrix group.

0.1 Several Matrix Groups

We will use the following notation:

The letter K indicates a field. The reader is invited to think of the field R of real or C of complex numbers. Most of the things we do at the beginning of our text are valid also for more general fields, at least if they are algebraically closed and of characteristic zero, but as this is only an introduction, for lack of space we will not go into this in greater depth.

$M_{m,n}(K)$ denotes the K-vector space of $m \times n$ matrices $A = (a_{ij})$ with $a_{ij} \in K$ ($i = 1, \dots, m$, $j = 1, \dots, n$), and $M_n(K)$ stands for $M_{n,n}(K)$.

Our groups will be (for some n) subgroups of the general linear group of invertible $n \times n$ matrices

\[ \mathrm{GL}(n,K) := \{ A \in M_n(K);\ \det A \neq 0 \}. \]

As usual, we will denote the special linear group by

\[ \mathrm{SL}(n,K) := \{ A \in M_n(K);\ \det A = 1 \}, \]

the orthogonal group by

\[ \mathrm{O}(n) := \{ A \in M_n(\mathbf{R});\ {}^tA A = E_n \}, \]

resp. for n = p + q

\[ \mathrm{O}(p,q) := \{ A \in M_n(\mathbf{R});\ {}^tA D_{p,q} A = D_{p,q} \}, \]

where $D_{p,q}$ is the diagonal matrix having p times 1 and q times −1 in its diagonal, and the unitary group

\[ \mathrm{U}(n) := \{ A \in M_n(\mathbf{C});\ {}^t\bar{A} A = E_n \}, \]

resp. for n = p + q

\[ \mathrm{U}(p,q) := \{ A \in M_n(\mathbf{C});\ {}^t\bar{A} D_{p,q} A = D_{p,q} \}. \]

Again, addition of the letter S to the symbol of the group indicates that we take only matrices with determinant 1, e.g.

\[ \mathrm{SO}(n) := \{ A \in \mathrm{O}(n);\ \det A = 1 \}. \]

These groups together with some other families of groups, in particular the symplectic groups showing up later, are known as classical groups.
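The defining conditions above can be tested numerically. The following is a minimal sketch, not part of the original text: the helper names (`d_pq`, `in_o_pq`, `det2`), the sample angle and the tolerance are all illustrative assumptions. It checks that a plane rotation satisfies the condition for O(2) = O(2, 0), and that a hyperbolic rotation satisfies the condition for O(1, 1) but not the one for O(2).

```python
import math

def transpose(a):
    return [list(row) for row in zip(*a)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def d_pq(p, q):
    """Diagonal matrix D_{p,q} with p entries 1 followed by q entries -1."""
    n = p + q
    return [[(1.0 if i < p else -1.0) if i == j else 0.0 for j in range(n)]
            for i in range(n)]

def is_close(a, b, tol=1e-12):
    return all(abs(x - y) <= tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def in_o_pq(a, p, q):
    """Test the condition tA D_{p,q} A = D_{p,q} up to rounding errors."""
    d = d_pq(p, q)
    return is_close(matmul(matmul(transpose(a), d), a), d)

def det2(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

t = 0.7  # arbitrary sample parameter
rotation = [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]
boost = [[math.cosh(t), math.sinh(t)], [math.sinh(t), math.cosh(t)]]

rotation_in_o2 = in_o_pq(rotation, 2, 0)   # O(2) is the case p = 2, q = 0
boost_in_o11 = in_o_pq(boost, 1, 1)
boost_in_o2 = in_o_pq(boost, 2, 0)
```

Both sample matrices have determinant 1 (check with `det2`), so they lie in SO(2) and SO(1, 1) respectively. Since floating-point arithmetic is used, membership is only tested up to the chosen tolerance.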

Later on, we will often use subgroups consisting of certain types of block matrices, e.g. the group of diagonal matrices

\[ A_n := \{ D(a_1, \dots, a_n);\ a_1, \dots, a_n \in K^* \}, \]

where $D(a_1, \dots, a_n)$ denotes the diagonal matrix with the elements $a_1, \dots, a_n$ in the diagonal, or the group of upper triangular matrices (or standard Borel group) $B_n$ consisting of matrices with zeros below the diagonal and the standard unipotent group $N_n$, the subgroup of $B_n$ where all diagonal elements are 1.

In view of the importance for applications, moreover, we distinguish several types of Heisenberg groups: Thus the group $N_3$ we just defined is mostly written as

\[ \mathrm{Heis}'(K) := \left\{ g = \begin{pmatrix} 1 & x & z \\ & 1 & y \\ & & 1 \end{pmatrix};\ x, y, z \in K \right\}. \]

In the later application to theta functions it will become clear that, though it may seem more complicated, we had better use the following description for the Heisenberg group:

\[ \mathrm{Heis}(K) := \left\{ g = (\lambda, \mu, \kappa) := \begin{pmatrix} 1 & 0 & 0 & \mu \\ \lambda & 1 & \mu & \kappa \\ 0 & 0 & 1 & -\lambda \\ 0 & 0 & 0 & 1 \end{pmatrix};\ \mu, \lambda, \kappa \in K \right\}, \]

and the “higher dimensional” groups, which for typographical reasons we do not write here as matrix groups,

\[ \mathrm{Heis}(K^n) := \{ g = (x, y, z);\ x, y \in K^n,\ z \in K \} \]

with the multiplication law given by

\[ gg' = (x + x',\ y + y',\ z + z' + {}^t x y' - {}^t y x'). \]

We suggest writing elements of $K^n$ as columns.
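To see how the two descriptions fit together, one can compare the product of the 4 × 4 matrices with the multiplication law for n = 1. The sketch below is illustrative only: the function names are assumptions, the matrix rows are read as (1, 0, 0, µ), (λ, 1, µ, κ), (0, 0, 1, −λ), (0, 0, 0, 1), and only a small sample of integer parameters is tested.

```python
from itertools import product

def heis_matrix(lam, mu, kappa):
    """4x4 matrix attached to (lam, mu, kappa) in the description of Heis(K)."""
    return [
        [1,   0, 0,  mu],
        [lam, 1, mu, kappa],
        [0,   0, 1,  -lam],
        [0,   0, 0,  1],
    ]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def heis_law(g, h):
    """Multiplication law of Heis(K^n) for n = 1: z + z' + x y' - y x'."""
    (x, y, z), (x2, y2, z2) = g, h
    return (x + x2, y + y2, z + z2 + x * y2 - y * x2)

# Small integer sample; a check on samples, not a proof.
samples = list(product(range(-1, 3), repeat=3))
law_matches = all(
    matmul(heis_matrix(*g), heis_matrix(*h)) == heis_matrix(*heis_law(g, h))
    for g in samples for h in samples
)

# The group is not commutative: the two products differ in the last entry.
gh = heis_law((1, 0, 0), (0, 1, 0))
hg = heis_law((0, 1, 0), (1, 0, 0))
```

On this sample the matrix product of (λ, µ, κ) and (λ′, µ′, κ′) agrees with (λ + λ′, µ + µ′, κ + κ′ + λµ′ − µλ′), which is the stated law with x = λ, y = µ, z = κ.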

Exercise 0.1: Write $\mathrm{Heis}(K^n)$ in matrix form. Show that $\mathrm{Heis}(K^n)$ for n = 1 and the other two Heisenberg groups above are isomorphic.

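Exercise 0.2: Verify that the matrices

\[ \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \qquad \begin{pmatrix} z & 0 \\ 0 & \bar{z} \end{pmatrix}, \quad z \in \mathbf{C}^* := \mathbf{C} \setminus \{0\}, \]

generate (by taking all possible finite products) a non-abelian subgroup of GL(2,C). This is called the Weil group of R and plays an important role in the epilogue at the end of our text.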
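A quick numerical look at the two generators may help before attempting the exercise. The sketch below assumes that the second generator is the diagonal matrix with entries z and its complex conjugate (with two equal entries the matrix would be scalar and the generated group abelian); the names `W`, `W_INV`, `d` and the sample value of z are illustrative.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def d(z):
    """Diagonal generator diag(z, conj(z)) for a nonzero complex number z."""
    return [[z, 0], [0, z.conjugate()]]

W = [[0, 1], [-1, 0]]        # first generator
W_INV = [[0, -1], [1, 0]]    # its inverse

z = complex(1, 2)            # sample value with nonzero imaginary part

# Conjugation by W swaps z and its complex conjugate.
conjugation_ok = matmul(matmul(W, d(z)), W_INV) == d(z.conjugate())
# W squared is the diagonal generator for z = -1.
square_ok = matmul(W, W) == d(complex(-1, 0))
# For non-real z the two generators do not commute.
commutes = matmul(W, d(z)) == matmul(d(z), W)
```

These relations suggest that every finite product can be brought into the form d(z) or W·d(z), which is one possible starting point for the exercise; the code checks single sample values and does not replace the proof.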


0.2 Group Actions

Most groups “appear in nature” as transformation groups acting on some set or space. This motivates the introduction of the following concepts.

Let G be a group with neutral element e and let X be a set.

Definition 0.1: G acts on X from the left iff a map

G ×X −→ X , (g, x) −→ g · x

is given, which satisfies the conditions

g · (g′ · x) = (gg′) · x, e · x = x

for all g, g′ ∈ G and x ∈ X .

Remark 0.1: If Aut X denotes the group of all bijections of X onto itself, the definition says that we have a group homomorphism G −→ Aut X associating to every g ∈ G the transformation x −→ g · x.

In this case the set X is also called a left G-set.

The group action is called effective iff no element except the neutral element e acts as the identity, i.e. the homomorphism G −→ Aut X is faithful.

The group action is called transitive iff for every pair x, x′ ∈ X there is a g ∈ G with x′ = g · x.

For $x_0 \in X$ we call the subset of X

\[ G \cdot x_0 := \{ g \cdot x_0;\ g \in G \} \]

an orbit of G (through $x_0$) and the subgroup

\[ G_{x_0} := \{ g \in G;\ g \cdot x_0 = x_0 \} \]

the isotropy group or the stabilizing group of $x_0$.

Example 0.1: G = GL(n,K) and its subgroups act on $X = K^n$ from the left by matrix multiplication

\[ (A, x) \longmapsto Ax \]

for $x \in K^n$ (viewed as a column).
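The notions of orbit and isotropy group can be made concrete with a small finite subgroup. The following sketch uses a hypothetical choice not taken from the text: the cyclic group of order 4 generated by the rotation by 90 degrees, acting on integer column vectors by matrix multiplication. All names are illustrative.

```python
def apply(a, x):
    """Left action (A, x) -> Ax of a 2x2 integer matrix on a column vector."""
    return (a[0][0] * x[0] + a[0][1] * x[1], a[1][0] * x[0] + a[1][1] * x[1])

def matmul(a, b):
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

E = ((1, 0), (0, 1))
R = ((0, -1), (1, 0))        # rotation by 90 degrees, an element of SO(2)

# The cyclic subgroup {E, R, R^2, R^3} generated by R.
G = [E]
while matmul(R, G[-1]) != E:
    G.append(matmul(R, G[-1]))

def orbit(x0):
    return {apply(g, x0) for g in G}

def stabilizer(x0):
    return [g for g in G if apply(g, x0) == x0]

# Action axiom g.(h.x) = (gh).x, checked on two sample vectors.
is_action = all(
    apply(g, apply(h, x)) == apply(matmul(g, h), x)
    for g in G for h in G for x in [(1, 0), (2, -3)]
)
```

Here the orbit of (1, 0) consists of four points and its isotropy group is trivial, while the origin is a fixed point whose isotropy group is the whole group. The action is therefore effective but not transitive.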

Exercise 0.3: Assure yourself that GL(n,K) acts transitively on $X = K^n \setminus \{0\}$. Describe the orbits of SO(n).


Example 0.2: A group acts on itself from the left in three ways:
a) by left translation $(g, g_0) \longmapsto g \cdot g_0 = g g_0 =: \lambda_g g_0$,
b) by (the inverse of) right translation $(g, g_0) \longmapsto g \cdot g_0 = g_0 g^{-1} =: \rho_{g^{-1}} g_0$,
c) by conjugation $(g, g_0) \longmapsto g \cdot g_0 = g g_0 g^{-1} =: \kappa_g g_0$.

Left and right translations are obviously transitive actions. For a given group, the determination of its conjugacy classes, i.e. the classification of the orbits under conjugation, is a highly interesting question, as we shall see later.

Exercise 0.4: Determine a family of matrices containing exactly one representative for each conjugacy class of $G_1 = \mathrm{GL}(2,\mathbf{C})$ and $G_2 = \mathrm{GL}(2,\mathbf{R})$.

Definition 0.2: A set X is called a homogeneous space iff there is a group G acting transitively on X.

Remark 0.2: In this case one has X = G · x for each x ∈ X and any two stabilizing groups $G_x$ and $G_{x'}$ are conjugate. (Prove this as Exercise 0.5.)

Definition 0.3: For any G-set X we shall denote by $X/G$ or $X_G$ the set of G-orbits in X and by $X^G$ the set of fixed points for G, i.e. those points x ∈ X for which one has g · x = x for all g ∈ G.

If the set X has some structure, for instance, if X is a vector space, we will (tacitly) additionally require that a group action preserves this structure, i.e. in this case that x −→ g · x is a linear map for each g ∈ G (or later on, is continuous if X is a topological space). It is a fundamental question whether the orbit space X/G inherits the properties that the space X may have. For instance, if X is a manifold, is this true also for X/G? We will come back to this question several times later. Here let us only look at the following situation: Let X be a homogeneous G-space, $x_0 \in X$, and $H = G_{x_0}$ the isotropy group. Then we denote by G/H the set of left cosets $gH := \{ gh;\ h \in H \}$. These are also to be seen as H-orbits H · g where H acts on G by right translation. G/H is a G-set, G acting by $(g, g_0 H) \longmapsto g g_0 H$. Then we shall identify G/H and X via the map

\[ G/H \longrightarrow X, \qquad gH \longmapsto g \cdot x_0. \]

This map is an example for the following general notion.

Definition 0.4: Let X and X′ be G-sets and f : X −→ X′ be a map. The map f is called G-equivariant or a G-morphism iff one has for every g ∈ G and x ∈ X

\[ g \cdot f(x) = f(g \cdot x). \]
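The identification of G/H with X and its equivariance can be checked by brute force in a small case. The sketch below is an illustration under assumptions of our own: G is the symmetric group on three letters, represented as tuples acting on X = {0, 1, 2}, and the base point is 0.

```python
from itertools import permutations

# Permutations of {0, 1, 2} as tuples s, where s[i] is the image of i.
G = list(permutations(range(3)))
X = range(3)

def compose(s, t):
    """(s t)(i) = s(t(i)), so that (s t) . x = s . (t . x)."""
    return tuple(s[t[i]] for i in range(3))

def act(s, x):
    return s[x]

x0 = 0
H = [h for h in G if act(h, x0) == x0]      # isotropy group of x0

def coset(g):
    """The left coset gH as a frozenset of group elements."""
    return frozenset(compose(g, h) for h in H)

cosets = {coset(g) for g in G}

# f(gH) := g . x0; collecting all values per coset shows it is well defined.
f = {c: {act(g, x0) for g in c} for c in cosets}
well_defined = all(len(values) == 1 for values in f.values())
f_point = {c: next(iter(values)) for c, values in f.items()}

bijective = sorted(f_point.values()) == list(X)
equivariant = all(
    f_point[coset(compose(g, k))] == act(g, f_point[coset(k)])
    for g in G for k in G
)
```

In this example H has two elements and there are three cosets, one for each point of X. The last check is the condition of Definition 0.4 for the map gH ↦ g · x0, with G acting on G/H by left multiplication.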

Now we come back to the question raised above.

Exercise 0.6: Prove that G/H has the structure of a group iff H is not only a subgroup but a normal subgroup, i.e. one has $g h g^{-1} \in H$ for all g ∈ G, h ∈ H.

We mention here another useful fact: If H is a subgroup of G, one defines the normalizer of H in G as

\[ N_G(H) := \{ g \in G;\ g H g^{-1} = H \}. \]

It is clear that $N_G(H)$ is the maximal subgroup in G that has H as a normal subgroup. Then the group Aut(G/H) of G-equivariant bijections of G/H is isomorphic to $N_G(H)/H$. (Prove this as Exercise 0.7.)


Parallel to the normalizer, one defines the centralizer

\[ C_G(H) := \{ g \in G;\ g h g^{-1} = h \text{ for all } h \in H \}. \]

In particular for H = G, the centralizer specializes to the center

\[ C_G(G) = \{ g \in G;\ gh = hg \text{ for all } h \in G \} =: C(G). \]

As well as left actions one defines and studies right actions of a group G on a set or a space X:

Definition 0.5: G acts from the right iff a map

G ×X −→ X , (g, x) −→ x · g,

is given that satisfies the conditions

(x · g) · g′ = x · (gg′), x · e = x

for all g, g′ ∈ G and x ∈ X .

Remark 0.3: Obviously it is easy to switch from right action to left action and vice versa using the antiautomorphism $g \longmapsto g^{-1}$ of G. Namely, on any right G-space X one can naturally define a left action by $g \cdot x := x \cdot g^{-1}$.
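The role of the inverse in this remark can be seen in a small experiment. The sketch below uses an illustrative right action chosen for this purpose (it is not from the text): permutations of {0, 1, 2} act from the right on three-letter words by permuting positions. It then checks that g · x := x · g⁻¹ satisfies the left action law, while omitting the inverse does not.

```python
from itertools import permutations

G = list(permutations(range(3)))           # permutations of {0, 1, 2}
WORDS = list(permutations("abc"))          # the set X of three-letter words

def compose(g, h):
    """(g h)(i) = g(h(i))."""
    return tuple(g[h[i]] for i in range(3))

def inverse(g):
    inv = [0] * 3
    for i, gi in enumerate(g):
        inv[gi] = i
    return tuple(inv)

def right_act(x, g):
    """x . g: the word whose i-th letter is the g(i)-th letter of x."""
    return tuple(x[g[i]] for i in range(3))

def left_act(g, x):
    """g . x := x . g^{-1}, as in Remark 0.3."""
    return right_act(x, inverse(g))

is_right_action = all(
    right_act(right_act(x, g), h) == right_act(x, compose(g, h))
    for x in WORDS for g in G for h in G
)
is_left_action = all(
    left_act(g, left_act(h, x)) == left_act(compose(g, h), x)
    for x in WORDS for g in G for h in G
)
# Without the inverse, g . x := x . g fails the left action law.
naive_is_left_action = all(
    right_act(right_act(x, h), g) == right_act(x, compose(g, h))
    for x in WORDS for g in G for h in G
)
```

The failure of the naive definition comes from the order of the factors: applying h and then g from the right yields x · (hg), not x · (gh), and these differ as soon as g and h do not commute.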

Right actions often show up in number theory and they are written there as $x^g := x \cdot g$ (for instance in the action of the Galois group on a number field).

0.3 The Symmetric Group

Though this text is meant to treat mainly continuous groups like the matrix groups set up in Section 0.1, some discrete groups as, for instance, Z, SL(n,Z) (where the elements are matrices with determinant one and integers as entries), and in particular the symmetric group $S_n$, are unavoidable.

Definition 0.6: The symmetric group $S_n$ is the group of permutations, i.e. bijections, of a set of n elements.

We already introduced Aut X as the group of bijections of a set X, so we can take $X_n := \{1, \dots, n\}$ and have $S_n = \mathrm{Aut}\, X_n$. We emphasize that (with the notions from Section 0.2) we treat permutations as left actions on the set $X_n$.

There are several fundamental facts concerning this group, which the reader is supposed to know from Linear Algebra:

Remark 0.4: The number of elements of the symmetric group $S_n$ is $\# S_n = n!$

Remark 0.5: Each permutation $\sigma \in S_n$ can be written as a product of transpositions, i.e. permutations which interchange exactly two elements.


For every $\sigma \in S_n$ one defines the signum of σ by

\[ \operatorname{sgn}(\sigma) := \varepsilon(\sigma) := \prod_{1 \le i < j \le n} \frac{\sigma(i) - \sigma(j)}{i - j}. \]

Remark 0.6: The map $\sigma \longmapsto \varepsilon(\sigma)$ is a group homomorphism of $S_n$ to the subgroup $\{1, -1\}$ of the multiplicative group $\mathbf{R}^* = \mathbf{R} \setminus \{0\}$. Thus, the number of transpositions in a representation of σ as a product of transpositions is either always even or always odd.

The kernel of ε is usually written as $A_n$. It is a normal subgroup and called the alternating group.
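The product formula for the signum can be evaluated directly. The following sketch is illustrative (names are our own; indices are shifted to 0, ..., n − 1, which does not change the differences): it computes ε with exact rational arithmetic and checks the homomorphism property on all pairs of permutations of four elements.

```python
from fractions import Fraction
from itertools import permutations

def sgn(s):
    """Product of (s(i) - s(j)) / (i - j) over all pairs i < j."""
    n = len(s)
    result = Fraction(1)
    for i in range(n):
        for j in range(i + 1, n):
            result *= Fraction(s[i] - s[j], i - j)
    return int(result)

def compose(s, t):
    """(s t)(i) = s(t(i))."""
    return tuple(s[t[i]] for i in range(len(s)))

S4 = list(permutations(range(4)))

is_homomorphism = all(
    sgn(compose(s, t)) == sgn(s) * sgn(t) for s in S4 for t in S4
)
transposition_sign = sgn((1, 0, 2, 3))     # interchanges the first two elements
kernel = [s for s in S4 if sgn(s) == 1]    # the alternating group on 4 elements
```

Exact fractions are used so that the product is literally ±1 and no rounding is involved. In this example the kernel has 12 elements, half of the 24 permutations.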

Permutations are often written as products of cycles, i.e. permutations (i_1, . . . , i_r) with i_j −→ i_{j+1} for j < r and i_r −→ i_1 if r > 1, and the identity for r = 1. More precisely, one has

Remark 0.7: Each permutation is (up to order) a unique product of disjoint cycles.

From here it is not far to the fundamental fact:

Theorem 0.1: The number of conjugacy classes of Sn is equal to the number p(n) of partitions of n.

A partition of n is a sequence (n1, . . . , nr) of natural numbers ni ∈ N with ni ≥ nj for i < j and Σ ni = n.

Example 0.3: The partitions of n = 3 are (1,1,1), (2,1), and (3). Correspondingly, the conjugacy classes of S3 are

– {id},

– the three transpositions (1, 2), (2, 3), (3, 1), each of which generates a two-element subgroup,

– the two 3-cycles (1, 2, 3) and (1, 3, 2), which together with id form a three-element subgroup.
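For small n, Theorem 0.1 can be confirmed by enumeration. This is an illustrative sketch, not an efficient algorithm; the helper names `conjugacy_classes` and `count_partitions` are our own, and permutations are again stored 0-indexed as tuples.

```python
from itertools import permutations

def compose(p, q):
    """(p o q)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def conjugacy_classes(n):
    """Conjugacy classes of S_n, each returned as a set of permutations."""
    group = list(permutations(range(n)))
    seen, classes = set(), []
    for g in group:
        if g in seen:
            continue
        cls = {compose(compose(h, g), inverse(h)) for h in group}
        seen |= cls
        classes.append(cls)
    return classes

def count_partitions(n, largest=None):
    """Number of partitions of n into parts of size at most `largest`."""
    if largest is None:
        largest = n
    if n == 0:
        return 1
    return sum(count_partitions(n - k, k) for k in range(1, min(n, largest) + 1))

for n in range(1, 6):
    print(n, len(conjugacy_classes(n)), count_partitions(n))
```

For n = 3 the class sizes are 1, 3 and 2, matching the three classes listed in Example 0.3.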

Exercise 0.8: Determine the conjugacy classes of A4.

Sn can be realized as a matrix group: we associate to every σ ∈ Sn the matrix A = A(σ) = (aij) ∈ GL(n,Z) with aij = δ_{i,σ(j)}, so that A(σ)ej = e_{σ(j)} for the canonical basis vectors ej.

A matrix like this is called a permutation matrix.

Exercise 0.9: Write down the matrices A(σ) for σ ∈ S3 and verify that the map σ −→ A(σ) is in fact an isomorphism from S3 to the subgroup of GL(3,R) consisting of the permutation matrices.
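As a numerical sanity check (not a substitute for the exercise), one can build the permutation matrices and test the homomorphism property by machine. The sketch below assumes the convention A(σ)ej = e_{σ(j)}, which reproduces the matrix A(τ) displayed in Example 1.3; the function names are ours and indices are 0-based.

```python
from itertools import permutations

def compose(p, q):
    """(p o q)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(len(q)))

def perm_matrix(p):
    """Matrix with entry 1 at (i, j) exactly when i = p(j), i.e. A e_j = e_{p(j)}."""
    n = len(p)
    return [[1 if i == p[j] else 0 for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

S3 = list(permutations(range(3)))

# A(p o q) = A(p) A(q) for all p, q in S3.
is_homomorphism = all(
    perm_matrix(compose(p, q)) == matmul(perm_matrix(p), perm_matrix(q))
    for p in S3 for q in S3
)

# Distinct permutations give distinct matrices.
is_injective = len({tuple(map(tuple, perm_matrix(p))) for p in S3}) == len(S3)

tau = (1, 2, 0)  # the 3-cycle (1, 2, 3) written 0-indexed
print(is_homomorphism, is_injective, perm_matrix(tau))
```

With the opposite convention (entry 1 where j = p(i)) one obtains the transposed matrices, and the map σ −→ A(σ) would reverse the order of products.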

Chapter 1

Basic Algebraic Concepts for Group Representations

We first collect the basic definitions and concepts concerning representations of groups, using only algebraic methods. As is common, we introduce the concept of a linear representation, which is completely adequate and sufficient for the study of representations of finite groups. In a later chapter we will enrich the discussion with elements from topology to refine the notion of a representation to that of a continuous representation.

We start by stating the essential definitions and only afterwards illustrate these by some examples and constructions.

1.1 Linear Representations

Let G be a group and V a K-vector space. Here, in principle, we mean K = C, but most things go through as well for an algebraically closed field of characteristic zero. There are interesting phenomena in the other cases, which are beyond the scope of this introduction. For an interested reader wanting some impression of this, we recommend Chapter 12 ff. of Serre’s book [Se], whose first chapters together with §7-11 of Kirillov’s book [Ki] are guidelines for this and our next section.

Definition 1.1: π is a linear representation of G in V iff π is a homomorphism of G into AutV , i.e. iff we have a map

π : G −→ Aut V, g −→ π(g)

with π(gg′) = π(g)π(g′) for all g, g′ ∈ G.

AutV is equivalently denoted by GL(V ) and stands for the group of all (linear) isomorphisms of V .

In the case of a finite-dimensional vector space V , say of dimension n, one says that π is of degree n or that π is an n-dimensional representation.

Let B = (v1, . . . , vn) be a basis of V . Then every F ∈ AutV is represented with respect to B by an invertible n × n-matrix A := MB(F ), and we have an isomorphism of vector spaces V ≅ Kn and of groups AutV ≅ GL(n,K). Essentially equivalent to the definition above, but perhaps more down-to-earth for some readers, is the following definition:

An n-dimensional linear representation of a group G is a prescription π associating to each g ∈ G a matrix π(g) = A(g) ∈ GL(n,K) such that

A(gg′) = A(g)A(g′)

holds for all g, g′ ∈ G.

As every group homomorphism maps the neutral element to the neutral element, it is clear that this prescription associates the unit matrix En to the neutral element e of the group G, and that we have π(e) = idV in the situation of the general definition above.

If G is a matrix group, G ⊂ GL(n,C), as in Section 0.1, we obviously have the natural representation π0 given by

π0(A) = A

for each A ∈ G. Another even more ubiquitous representation is the trivial representation π = id, associating the identity idV to each g ∈ G.

Using the notion of group action introduced in Section 0.2, we can also say that a linear representation of G in V is the same thing as a left action of G on the vector space V by linear maps.

At first sight, the main problem of representation theory appears to be to determine all representations of a given group. To make this a more sensible and accessible problem, one has to find some building blocks or elementary particles from which to construct general representations, and, moreover, to decide when representations are really different.

We begin by searching for the building blocks.

Definition 1.2: Let π be a linear representation of G in V . π is called irreducible, iff there is no genuine π-invariant subspace V0 in V , i.e. none other than {0} and V .

A subspace V0 ⊂ V is π-invariant iff we have π(g)v0 ∈ V0 for all g ∈ G and v0 ∈ V0. If this is the case, π |V0 is a representation of G in V0, and this is called a subrepresentation.

So we can also say that π is irreducible, iff π has no genuine subrepresentation.

Next, we restrict our objects still more, in particular in view of the applications in physics.

We assume now that V is a unitary complex vector space, i.e. V is equipped with a scalar product

< ., . >: V × V −→ C, (v, v′) −→ < v, v′ >

which is

– linear in the second variable and antilinear in the first,

– hermitian: for every v, v′ ∈ V one has $\langle v, v' \rangle = \overline{\langle v', v \rangle}$, and

– positive definite: for every v ∈ V one has < v, v > ≥ 0, with equality exactly for v = 0.

For V = Cn we use the standard scalar product

\[ \langle x, y \rangle := \sum_{i=1}^{n} \bar{x}_i\, y_i \quad \text{for all } x, y \in \mathbb{C}^n. \]

Definition 1.3: A representation π of G in V is unitary iff each π(g) is unitary, i.e. iff one has for every v, v′ ∈ V and every g ∈ G

< π(g)v, π(g)v′ >= < v, v′ > .

1.2 Equivalent Representations

If two representations π and π′ of G in K-vector spaces V resp. V ′ are given, it is natural to look for G-equivariant maps F : V −→ V ′:

Definition 1.4: A K-linear map F : V −→ V ′ is called an intertwining operator between π and π′ iff one has for every g ∈ G

Fπ(g) = π′(g)F,

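i.e. iff the following diagram commutes

\[ \begin{array}{ccc} V & \xrightarrow{\;F\;} & V' \\ {\scriptstyle \pi(g)}\downarrow & & \downarrow{\scriptstyle \pi'(g)} \\ V & \xrightarrow{\;F\;} & V'. \end{array} \]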

π and π′ are called equivalent iff there is an isomorphism

F : V −→ V ′

intertwining π and π′. In this case we write π ∼ π′.

Remark 1.1: The space of intertwining operators between π and π′ is again a vector space. It is denoted by HomG(V, V ′) or C(V, V ′). Moreover, we use the notation C(V ) := C(V, V ) and c(π, π′) = c(V, V ′) = dim C(V, V ′). c(π, π′) is also called the multiplicity of π in π′ and denoted by mult(π, π′).

Representations π and π′ with c(π, π′) = c(π′, π) = 0 are called disjoint.

At this point it is possible to state as our principal task:

Determine the unitary (linear) dual Ĝ of a given group G, i.e. the set of equivalence classes of unitary irreducible representations of G.


1.3 First Examples

We hope the reader gets the feeling that the following special cases prepare the ground for the later more general considerations.

Example 1.1: In Section 1.1 we already mentioned that every matrix group G has its natural representation π0, i.e. for each real or complex matrix group G ⊂ GL(n,K) one has the representation in V = Cn associating to every A ∈ G the matrix A itself. These natural representations are obviously unitary for G = SO(n) or SU(n) but in general neither unitary nor irreducible, as the following example shows:

Example 1.2: Let g be an element of the group G = SO(2). Then we usually write

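\[ g = r(\vartheta) := \begin{pmatrix} \alpha & \beta \\ -\beta & \alpha \end{pmatrix}, \quad \alpha = \cos\vartheta,\ \beta = \sin\vartheta,\ \vartheta \in \mathbb{R}. \]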

For each k ∈ Z we put

χk(r(ϑ)) := exp(ikϑ).

Hence π = χk is a one-dimensional unitary representation in V = C, as, by the addition theorems for the trigonometric functions, we have

χk(r(ϑ)r(ϑ′)) = χk(r(ϑ + ϑ′)) = exp(ik(ϑ + ϑ′)) = χk(r(ϑ))χk(r(ϑ′)).

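For $U = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & i \\ i & 1 \end{pmatrix}$ we have the relation

\[ U\, r(\vartheta)\, U^{-1} = \begin{pmatrix} \alpha - i\beta & 0 \\ 0 & \alpha + i\beta \end{pmatrix} = \begin{pmatrix} e^{-i\vartheta} & 0 \\ 0 & e^{i\vartheta} \end{pmatrix}. \]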

This can be interpreted as follows. The matrix U is the matrix for an intertwining operator F intertwining the natural representation π0 with the representation π1 on V = C2 given by

\[ \pi_1(r(\vartheta)) = \begin{pmatrix} e^{-i\vartheta} & 0 \\ 0 & e^{i\vartheta} \end{pmatrix}. \]

This representation is reducible and we see that the natural representation is equivalent to a reducible one. Or we realize that the vectors

\[ e_1' := U e_1 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ i \end{pmatrix}, \qquad e_2' := U e_2 = \frac{1}{\sqrt{2}} \begin{pmatrix} i \\ 1 \end{pmatrix} \]

span nontrivial π0-invariant subspaces of V = C2. But if we look at π0 as a real representation, i.e. in the R-vector space V0 = R2, π0 is in fact irreducible.
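The relation U r(ϑ) U−1 = diag(e^{−iϑ}, e^{iϑ}) is easy to confirm numerically. The following sketch uses only the standard library; the helper names and the sample angle are arbitrary choices of ours.

```python
import cmath
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def r(theta):
    """Rotation matrix in the convention of Example 1.2."""
    a, b = math.cos(theta), math.sin(theta)
    return [[a, b], [-b, a]]

def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol
               for i in range(len(A)) for j in range(len(A[0])))

s = 1 / math.sqrt(2)
U = [[s, s * 1j], [s * 1j, s]]
U_inv = [[s, -s * 1j], [-s * 1j, s]]  # U is unitary: inverse = conjugate transpose

theta = 0.7  # arbitrary sample angle
D = matmul(matmul(U, r(theta)), U_inv)
expected = [[cmath.exp(-1j * theta), 0], [0, cmath.exp(1j * theta)]]
print(close(D, expected))

# e1' = U e1 is an eigenvector of r(theta), so it spans an invariant line.
e1p = [U[0][0], U[1][0]]
image = [sum(r(theta)[i][k] * e1p[k] for k in range(2)) for i in range(2)]
eigenvalue = image[0] / e1p[0]
is_eigenvector = all(abs(image[i] - eigenvalue * e1p[i]) < 1e-12 for i in range(2))
print(is_eigenvector)
```

Over R no such eigenvector exists for ϑ not a multiple of π, which is the content of the closing remark on irreducibility of the real representation.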

Example 1.3: In Section 0.3 we introduced the symmetric group Sn. G = S3 consists of the elements

id = (1), (1, 2) =: σ, (1, 3), (2, 3), (1, 2, 3) =: τ, (1, 3, 2).

As there are relations like σ2 = id, τ3 = id and

στσ = τ2 = (1, 3, 2), στ = (2, 3), τσ = (1, 3),

we see that each element g ∈ S3 can be written as a product of (in general) several factors σ and τ . We indicate this here (and later on in a similar fashion in other cases) by the notation

S3 = < σ, τ > .

We easily find one-dimensional representations in V = C, namely the trivial representation

π1(g) = 1 for all g ∈ S3

and the signum representation

π2(g) = sgn g ∈ {±1} for all g ∈ S3.

One also has a natural three-dimensional representation π0 given in V = C3 by the permutation matrices from Exercise 0.9. For instance, one has

\[ \pi_0(\tau) = A(\tau) = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}. \]

This representation is also called the permutation representation. It can equivalently be described as follows. We write

\[ V = \mathbb{C}^3 = \sum_{i=1}^{3} e_i \mathbb{C} \ni w = e_1 z_1 + e_2 z_2 + e_3 z_3 \]

with the canonical basis vectors e1 = t(1, 0, 0), . . . , e3 = t(0, 0, 1) and z1, z2, z3 ∈ C. Then π0 is given by

\[ \pi_0(g) w = \sum_i e_{g(i)} z_i = \sum_i e_i z_{g^{-1}(i)}. \]

Exercise 1.1: Verify that this is true.

π0 is unitary (verify this also) but not irreducible: V1 := (e1 + e2 + e3)C is an invariant subspace, and as we have

π0(g)(e1 + e2 + e3) = eg(1) + eg(2) + eg(3) = e1 + e2 + e3,

π0 |V1 is the trivial representation π1. As one has for $w = \sum_i z_i e_i$

\[ \langle e_1 + e_2 + e_3, w \rangle = \sum_i z_i, \]

$V_3 := \{ w,\ \sum_i z_i = 0 \}$ is the orthogonal complement to V1, and as $\sum_i z_i = \sum_i z_{g^{-1}(i)}$ holds for all g ∈ S3, V3 is an invariant subspace.

Exercise 1.2: Show that a := e1ζ + e2 + e3ζ^2 and b := e1 + e2ζ + e3ζ^2 form a basis of V3, if ζ := e^{2πi/3}. Determine the matrices of π3 := π0 |V3 with respect to this basis and show that π3 is irreducible.

Remark 1.2: By the general methods to be developed later, it will be clear that all irreducible representations of S3 are equivalent to π1, π2, or π3, i.e. the equivalence classes of πi (i=1,2,3) constitute the unitary dual of S3.


Exercise 1.3: Prove this in a direct elementary way (see [FH] §1.3).

Example 1.4: A rather general procedure to construct representations is hidden in the following simple example.

We take G ⊂ GL(2,C) and as vector space V the space Vm = C [u, v]m of homogeneous polynomials P of degree m in two variables u, v. By counting, we see dim Vm = m + 1. There are two essentially equivalent ways to define a representation of G on Vm:

For P ∈ Vm and $g = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in G$ resp. $g^{-1} = \begin{pmatrix} a' & b' \\ c' & d' \end{pmatrix}$ we put

(λm(g)P )(u, v) := P (a′u + b′v, c′u + d′v)

or

(ρm(g)P )(u, v) := P (au + cv, bu + dv).

Obviously λm(g)P and ρm(g)P are both again homogeneous polynomials of degree m. The functional equations λm(gg′) = λm(g)λm(g′) and ρm(gg′) = ρm(g)ρm(g′) can be verified directly. They also become clear from the general discussion in the following Example 1.5.

We remark that λm and ρm are irreducible unitary representations of G = SU(2), if Vm is provided with a suitable scalar product. This will come out later in Section 4.2.

Example 1.5: Let X be a G-set with a left G-action x −→ g · x and V = F(X ) a vector space of complex functions f : X −→ C such that for f ∈ V we also have f^g ∈ V , where f^g is defined by

f^g(x) = f(g−1x).

Remark 1.3: The prescription

(λ(g)f)(x) := f(g−1x)

defines a representation λ of G on V .

Proof: One has (gg′) · x = g · g′ · x and hence

λ(gg′)f(x) = f((gg′)−1 · x) = f(g′−1g−1 · x) = f(g′−1 · g−1 · x)

and

λ(g)λ(g′)f(x) = λ(g)f^{g′}(x) = f^{g′}(g−1 · x) = f(g′−1 · g−1 · x).
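The role of the inverse in (λ(g)f)(x) := f(g−1x) can be seen on a small finite example. In this sketch, which is ours and not part of the text, G = S3 acts on X = {0, 1, 2}, a function f on X is stored as a tuple of its values, and `naive` is the hypothetical prescription without the inverse.

```python
from itertools import permutations

def compose(p, q):
    """(p o q)(x) = p(q(x))."""
    return tuple(p[q[x]] for x in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for x, image in enumerate(p):
        inv[image] = x
    return tuple(inv)

def lam(g, f):
    """(lam(g) f)(x) = f(g^{-1} x)."""
    g_inv = inverse(g)
    return tuple(f[g_inv[x]] for x in range(len(f)))

def naive(g, f):
    """Prescription without the inverse: (naive(g) f)(x) = f(g x)."""
    return tuple(f[g[x]] for x in range(len(f)))

S3 = list(permutations(range(3)))
f = (10, 20, 30)  # a test function with three distinct values

lam_is_representation = all(
    lam(compose(g, h), f) == lam(g, lam(h, f)) for g in S3 for h in S3
)
naive_is_representation = all(
    naive(compose(g, h), f) == naive(g, naive(h, f)) for g in S3 for h in S3
)
print(lam_is_representation, naive_is_representation)
```

Without the inverse one obtains naive(g) naive(h) = naive(hg), an antihomomorphism, which for a nonabelian group such as S3 is not a representation.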

An important special case is given by X = Cn and G ⊂ GL(n,C) acting by matrix multiplication g · x = gx (as usual x is here a column) on V , a space of continuous functions in n variables or, say, a subspace of homogeneous polynomials of fixed degree (for n = 2 we then have the last example).

Analogously, one treats a set X with a right G-action x −→ x · g and a vector space V = F(X ) of complex functions f on X closed under f −→ f^g, where f^g is defined by f^g(x) = f(x · g).

Remark 1.4: The prescription

(ρ(g)f)(x) := f(x · g)

defines a representation ρ of G on V .

Proof: One has x · (gg′) = x · g · g′ and hence

ρ(gg′)f(x) = f(x · (gg′)) = f(x · g · g′)

and

ρ(g)ρ(g′)f(x) = ρ(g)f^{g′}(x) = f^{g′}(x · g) = f(x · g · g′).

For X = Cn and V again a space of functions in n variables, this time viewed as a row, and G ⊂ GL(n,C), we have again the action x −→ x · g = xg by matrix multiplication. For n = 2 this leads naturally to the second version of Example 1.4.

Another (as will be seen later, crucial) application of this procedure is given by making G act on itself, i.e. X = G, by left or right translation and V = L2(G), the space of square integrable functions with respect to a left resp. right invariant Haar measure (these notions will be explained later). We then get the left (right) regular representation of G.

Example 1.6: Let G = Heis(R) be the Heisenberg group from Section 0.1, in the form of the group of triples g = (λ, µ, κ), λ, µ, κ ∈ R with the multiplication law

gg′ = (λ + λ′, µ + µ′, κ + κ′ + λµ′ − λ′µ),

and V = L2(R) the space of square-integrable functions f on X = R with respect to the Lebesgue measure (with norm $\| f \|_2 = \bigl( \int_{\mathbb{R}} \overline{f(x)}\, f(x)\, dx \bigr)^{1/2}$). Then for m ∈ R∗, π = πm given by

\[ \pi_m(g) f(x) := e^{2\pi i m (\kappa + (\lambda + 2x)\mu)} f(x + \lambda) \]

is a unitary irreducible representation of Heis(R). We call it the Schrödinger representation.

Exercise 1.4: Prove that πm is a unitary representation. (The proof of the irreducibility will become easier later when we have prepared some more tools.)
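Before attempting the proof, one can test the functional equation πm(gg′) = πm(g)πm(g′) numerically at a few sample points. This is only a plausibility check under our own choices: the group elements, the value of m, the Gaussian test function and the sample points are arbitrary, and the helper names are hypothetical.

```python
import cmath
import math

def multiply(g, h):
    """Multiplication law of Heis(R) in the coordinates (lambda, mu, kappa)."""
    (l1, m1, k1), (l2, m2, k2) = g, h
    return (l1 + l2, m1 + m2, k1 + k2 + l1 * m2 - l2 * m1)

def pi_m(m, g):
    """Return the operator pi_m(g), acting on functions f: R -> C."""
    lam, mu, kappa = g
    def operator(f):
        def transformed(x):
            phase = cmath.exp(2j * math.pi * m * (kappa + (lam + 2 * x) * mu))
            return phase * f(x + lam)
        return transformed
    return operator

def f(x):
    """Gaussian test function (an arbitrary choice)."""
    return math.exp(-x * x)

g, h, m = (0.3, -1.1, 0.5), (0.7, 0.4, -0.2), 1.5
xs = [-1.0, 0.0, 0.25, 2.0]

lhs = [pi_m(m, multiply(g, h))(f)(x) for x in xs]
rhs = [pi_m(m, g)(pi_m(m, h)(f))(x) for x in xs]
max_error = max(abs(a - b) for a, b in zip(lhs, rhs))

# The phase factor has modulus one, so |pi_m(g)f(x)| = |f(x + lambda)| pointwise.
modulus_error = max(abs(abs(pi_m(m, g)(f)(x)) - abs(f(x + g[0]))) for x in xs)
print(max_error, modulus_error)
```

The pointwise modulus identity, together with the translation invariance of the Lebesgue measure, is the key to unitarity in Exercise 1.4.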

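Example 1.7: Prove (as Exercise 1.5) that the Weil group of R, WR from Exercise 0.2 in Section 0.1, has the irreducible representations π = π0 and π = πm, m ∈ Z \ {0}, given by

\[ \pi_0\!\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} := \det \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} = 1, \qquad \pi_0\!\begin{pmatrix} z & 0 \\ 0 & \bar{z} \end{pmatrix} := \det \begin{pmatrix} z & 0 \\ 0 & \bar{z} \end{pmatrix} = z\bar{z}, \]

and for m ≠ 0

\[ \pi_m\!\begin{pmatrix} z & 0 \\ 0 & \bar{z} \end{pmatrix} := \begin{pmatrix} z^m & 0 \\ 0 & \bar{z}^m \end{pmatrix}, \qquad \pi_m\!\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} := \begin{pmatrix} 0 & 1 \\ (-1)^m & 0 \end{pmatrix}. \]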


1.4 Basic Construction Principles

Proceeding from a given representation (π, V ) of a group G one can construct several new representations using standard notions from (multi-)linear algebra. We suppose that the reader has some knowledge of direct sums, tensor products and dual spaces. If this is not the case, we hope that one can follow nevertheless, as we stay on a very elementary level and do enough paraphrasing to convey some understanding. But sooner or later the reader should consult a book where these notions appear, for instance Fischer’s “Lineare Algebra” [Fi], Lang’s “Algebra” [La], or the appendices to [Ki1] or [Be].

1.4.1 Sum of Representations

Let (π, V ) and (π′, V ′) be (linear) representations of the same group G. Then we define the direct sum π ⊕ π′ of π and π′ by

(π ⊕ π′)(g)(v ⊕ v′) := π(g)v ⊕ π′(g)v′ for all v ⊕ v′ ∈ V ⊕ V ′.

For V = Kn, V ′ = Km and π(g) = A(g) ∈ GL(n,K), π′(g) = A′(g) ∈ GL(m,K) in matrix form, the sum is given by the block diagonal matrix

\[ (\pi \oplus \pi')(g) = \begin{pmatrix} A(g) & 0 \\ 0 & A'(g) \end{pmatrix} \in \mathrm{GL}(n + m, K). \]

1.4.2 Tensor Product of Representations

Let (π, V ) and (π′, V ′) be as above and V ⊗ V ′ the tensor product of V and V ′. Then we define the tensor product π ⊗ π′ of π and π′ by

(π ⊗ π′)(g)(v ⊗ v′) := π(g)v ⊗ π′(g)v′ for all v ⊗ v′ ∈ V ⊗ V ′.

For V = Kn, V ′ = Km and π(g) = A(g) = (a_{i,j}) ∈ GL(n,K), π′(g) = A′(g) ∈ GL(m,K), the tensor product is given by the Kronecker product of the matrices A(g) and A′(g)

\[ (\pi \otimes \pi')(g) = \begin{pmatrix} a_{1,1} A'(g) & \dots & a_{1,n} A'(g) \\ \vdots & & \vdots \\ a_{n,1} A'(g) & \dots & a_{n,n} A'(g) \end{pmatrix} \in \mathrm{GL}(nm, K). \]
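That the Kronecker product turns two matrix representations into a representation rests on the mixed-product rule (A ⊗ B)(C ⊗ D) = (AC) ⊗ (BD). The sketch below checks this rule, and the multiplicativity of the trace, on sample integer matrices; the function names and the sample matrices are our own choices.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def kron(A, B):
    """Kronecker product: the block in position (i, j) is A[i][j] * B."""
    rows_b, cols_b = len(B), len(B[0])
    return [[A[i // rows_b][j // cols_b] * B[i % rows_b][j % cols_b]
             for j in range(len(A[0]) * cols_b)]
            for i in range(len(A) * rows_b)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

# Arbitrary sample matrices (2 x 2 and 3 x 3).
A = [[1, 2], [3, 4]]
C = [[0, 1], [1, 1]]
B = [[2, 0, 1], [1, 1, 0], [0, 3, 1]]
D = [[1, 1, 0], [0, 1, 2], [1, 0, 1]]

mixed_product_ok = kron(matmul(A, C), matmul(B, D)) == matmul(kron(A, B), kron(C, D))
trace_ok = trace(kron(A, B)) == trace(A) * trace(B)
print(mixed_product_ok, trace_ok)
```

Applied to A = A(g), C = A(g′), B = A′(g), D = A′(g′), the mixed-product rule is exactly the statement (π ⊗ π′)(gg′) = (π ⊗ π′)(g)(π ⊗ π′)(g′).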

We will not go into the definition of the tensor product (to be found in the aforementioned sources) but simply recall the following. If V has a basis (ei)i∈I and V ′ a basis (fj)j∈J , then V ⊗ V ′ has a basis (ei ⊗ fj)(i,j)∈I×J .

In V ⊗ V ′ we have the rules

\[ \begin{aligned} (v_1 + v_2) \otimes v' &= v_1 \otimes v' + v_2 \otimes v', \\ v \otimes (v_1' + v_2') &= v \otimes v_1' + v \otimes v_2', \\ \lambda (v \otimes v') &= (\lambda v) \otimes v' = v \otimes (\lambda v') \end{aligned} \]

for all v, v1, v2 ∈ V , v′, v′1, v′2 ∈ V ′, λ ∈ K.

It is clear that one can also define tensor products of more than two factors, and in particular tensor powers like ⊗3V = V ⊗ V ⊗ V . (It is not difficult to see that these products are associative.)

Moreover, there are subspaces of tensor powers with a preassigned symmetry property. In particular we are interested in p-fold symmetric tensors SpV and p-fold antisymmetric tensors ∧pV .

For instance, if V is three-dimensional with basis (e1, e2, e3),

• ⊗2V has dimension 9 and a basis

(e1 ⊗ e1, e1 ⊗ e2, e1 ⊗ e3, e2 ⊗ e1, . . . , e3 ⊗ e3),

• S2V has dimension 6 and a basis

e1 ⊗ e1, e1 ⊗ e2 + e2 ⊗ e1, e1 ⊗ e3 + e3 ⊗ e1, e2 ⊗ e2, e2 ⊗ e3 + e3 ⊗ e2, e3 ⊗ e3.

• ∧2V has dimension 3 and a basis

e1 ∧ e2 := e1 ⊗ e2 − e2 ⊗ e1, e1 ∧ e3 := e1 ⊗ e3 − e3 ⊗ e1, e2 ∧ e3 := e2 ⊗ e3 − e3 ⊗ e2.

More generally, one has symmetric and antisymmetric subspaces in ⊗pV with bases

\[ e_{i_1} \cdot \ldots \cdot e_{i_p} := \sum_{g \in S_p} e_{i_{g(1)}} \otimes \cdots \otimes e_{i_{g(p)}}, \quad i_1 \le \cdots \le i_p, \]

resp.

\[ e_{i_1} \wedge \cdots \wedge e_{i_p} := \sum_{g \in S_p} \operatorname{sgn} g \; e_{i_{g(1)}} \otimes \cdots \otimes e_{i_{g(p)}}, \quad i_1 < \cdots < i_p. \]

For instance, if V is three-dimensional, SpV can be identified with the space K [u, v, w]p of homogeneous polynomials of degree p in three variables. And if π is a representation of G on V , the prescription ei −→ π(g)ei defines by linear extension representations Spπ and ∧pπ on SpV resp. ∧pV .

It is one of the most important construction principles for finite-dimensional representations to take the natural representation π0 and to get (other) irreducible representations by taking tensor products and reducing these to sums of irreducibles. This will be analysed later.

1.4.3 The Contragredient Representation

We use the notation V ∗ to denote the dual space of a given K-vector space V . This space consists of the K-linear maps ϕ : V −→ K, is again a K-vector space (of the same dimension as V if the dimension is finite), and is written as

V ∗ = Hom(V,K) = {ϕ : V −→ K, ϕ K-linear}.

We shall also use the notation

ϕ(v) =: < ϕ, v > for all ϕ ∈ V ∗, v ∈ V

and, if dim V = n with basis (e1, . . . , en), we write (e∗1, . . . , e∗n) for the dual basis of V ∗, i.e. with < e∗i , ej > = δi,j (= 1 for i = j and = 0 for i ≠ j).


If π is a representation of G on V , we define the contragredient representation π∗ on V ∗ by

(π∗(g)ϕ)(v) := ϕ(π(g−1)v) for all ϕ ∈ V ∗, v ∈ V,

or equivalently by the condition

< π∗(g)ϕ, π(g)v > = < ϕ, v > for all ϕ ∈ V ∗, v ∈ V.

The proof that we really get a representation is already contained in Example 1.5 in Section 1.3.

It is clear from linear algebra that in matrix form we have A∗(g) = tA(g−1).

Exercise 1.6: Verify this.

1.4.4 The Factor Representation

If (π1, V1) is a subrepresentation of the representation (π, V ) of G, one has also an associated representation on the factor space V/V1, called the factor representation.

There is still another essential construction principle, namely the concept of induced representations. It goes back to Frobenius and is also helpful in our present algebraic context. But we leave it aside here and postpone it for a more general treatment in Chapter 7.

1.5 Decompositions

We now have at hand procedures to construct representations and we have the notion of irreducible representations, which we want to look at as the building blocks for general representations. To be able to do this, we still have to clarify some notions.

We defined in Section 1.1 that (π, V ) is irreducible, iff there is no nontrivial subrepresentation. Now we add:

Definition 1.5: (π, V ) is decomposable, iff there exists an invariant subspace V1 ⊂ V which admits an invariant complementary subspace V2, i.e. with V = V1 ⊕ V2. We then write π = π1 + π2.

Representations for which every nontrivial subrepresentation admits an invariant complement are called completely reducible.

Complete reducibility is desirable as it leads (at least in the case of finite-dimensional representations) immediately to the possibility to decompose a given representation into irreducible representations. Therefore facts like the following are very important.

Theorem 1.1: Let (π, V ) be a representation of a finite group G and (π1, V1) a subrepresentation. Then there exists an invariant complementary subspace V2 .

Proof: Let <,>′ be any scalar product for V (such a thing exists, as is obvious at least for finite-dimensional V ≅ Kn). We can find a G-invariant scalar product <,> , namely the one defined by

\[ \langle v, v' \rangle := \sum_{g \in G} \langle \pi(g)v, \pi(g)v' \rangle' \quad \text{for all } v, v' \in V. \]

Then V2 := {v ∈ V, < v, v1 > = 0 for all v1 ∈ V1} is a complementary subspace to V1 which is π-invariant:

For v ∈ V2, g ∈ G one has, as <,> is in particular π(g−1)-invariant,

< π(g)v, v1 > = < π(g−1)π(g)v, π(g−1)v1 > = < v, π(g−1)v1 > = 0

if v1 ∈ V1 (as then π(g−1)v1 ∈ V1), and hence also π(g)v ∈ V2.
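The averaging trick in the proof can be watched at work on a tiny example. The sketch below is our own illustration: it takes the group of order two represented on R2 by the non-orthogonal matrix S (an arbitrary choice with S·S = E), works over the reals so that transposes replace adjoints, and encodes the averaged scalar product by its Gram matrix H.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def apply(A, v):
    return [sum(A[i][k] * v[k] for k in range(len(v))) for i in range(len(A))]

def parallel(v, w):
    """True if the plane vectors v and w are linearly dependent."""
    return v[0] * w[1] - v[1] * w[0] == 0

E = [[1, 0], [0, 1]]
S = [[1, 1], [0, -1]]  # S * S = E, so {E, S} represents the group of order 2
group = [E, S]

# Averaged product: <v, w> = sum over g of (A(g)v)^T (A(g)w) = v^T H w.
H = [[sum(matmul(transpose(A), A)[i][j] for A in group) for j in range(2)]
     for i in range(2)]

# Invariance of <,> under each A(g) is the matrix identity A^T H A = H.
invariant = all(matmul(matmul(transpose(A), H), A) == H for A in group)

# V1 = span(e1) is invariant since S e1 = e1. Its complement with respect to
# <,> is spanned by v2 with e1^T H v2 = 0.
v2 = [H[0][1], -H[0][0]]
complement_invariant = parallel(v2, apply(S, v2))

# The complement with respect to the standard product, span(e2), is not invariant.
standard_complement_invariant = parallel([0, 1], apply(S, [0, 1]))
print(H, invariant, complement_invariant, standard_complement_invariant)
```

The contrast between the last two checks shows why the scalar product has to be averaged first: orthogonality with respect to an arbitrary product does not produce an invariant complement.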

Remark 1.5: If G is any group and (π, V ) a unitary representation of G, again every subrepresentation (π1, V1) has an invariant complement (π2, V2) given as in the proof above by V2 := V1⊥ = {v ∈ V, < v, v1 > = 0 for all v1 ∈ V1}, because V is already provided with a π-invariant scalar product.

Remark 1.6: If G is compact, the theorem holds as well, as the sum in the definition of the invariant scalar product can be replaced by an integral. This will be done later in Chapter 4.

Remark 1.7: As an immediate consequence we have that finite-dimensional unitary representations of an arbitrary group are completely reducible.

Remark 1.8: Every finite-dimensional representation of a finite (or compact) group is unitarizable, as the representation space can be given a π-invariant scalar product as in the proof of Theorem 1.1 above.

Example 1.7: We have already seen an example for a decomposition of a representation in Example 1.3 in Section 1.3, namely the permutation representation of G = S3. The standard example showing that not every representation is completely reducible, and not even decomposable, is the following.

Let G = R be the additive group of the real numbers and π a two-dimensional representation of G in V = C2 given by

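\[ \mathbb{R} \ni b \longmapsto \begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix} =: A(b). \]

Here V1 := e1C is an invariant subspace: we have A(b)e1 = e1 for all b ∈ R. If

\[ V_2 := v\,\mathbb{C} = \begin{pmatrix} x \\ y \end{pmatrix} \mathbb{C} \]

is an invariant complementary subspace, we have

\[ A(b)v = \begin{pmatrix} x + by \\ y \end{pmatrix} = \lambda \begin{pmatrix} x \\ y \end{pmatrix} \]

for a certain λ ∈ C, i.e. x + by = λx and y = λy. As y must be nonzero, we have λ = 1. This leads to by = 0 for all b and hence to a contradiction.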


The following considerations are important in general representation theory. But as we will not use them much here, they are only mentioned (for proofs see for instance [Ki] p.115 f).

Every representation (π, V ) is either indecomposable or the sum of two representations π = π1 + π2. The same alternative applies to the representations π1 and π2 and so on. In the general case this process can continue infinitely. We say that π is finite, if every family of π-invariant subspaces Vi of V that is strictly monotone with respect to inclusion is finite. In this case there exists a strictly monotone finite collection of invariant subspaces

V = V0 ⊃ V1 ⊃ · · · ⊃ Vn = {0}

such that the representations πi appearing on Vi/Vi+1 are irreducible. One has the Jordan-Hölder Theorem: The length of such a chain of subspaces and the equivalence classes of the representations πi (up to order) are uniquely defined by the equivalence class of π.

If we want to discuss the question whether a given representation (π, V ) decomposes into a direct sum of irreducibles, we need criteria for irreducibility. A central criterion is a consequence of the following famous statement.

Theorem 1.2 (Schur’s Lemma): If two linear representations (π, V ) and (π′, V ′) are irreducible, then every intertwining operator F ∈ C(π, π′) is either zero or invertible. If V = V ′, π = π′ and dim V = n, then F is a homothety, i.e. F = λ id, λ ∈ C.

Proof: i) As defined in Section 1.2 an intertwining operator for π and π′ is a linear map F : V −→ V ′ with

(1.1) π′(g)F (v) = F (π(g)v) for all g ∈ G, v ∈ V.

Then Ker F = {v ∈ V, F (v) = 0} is a π-invariant subspace of V because for v ∈ Ker F we have F (v) = 0 and by (1.1) F (π(g)v) = π′(g)F (v) = 0, i.e. π(g)v ∈ Ker F . By a similar reasoning it is clear that Im F = {F (v), v ∈ V } is an invariant subspace of V ′. The irreducibility of π implies Ker F = {0} or Ker F = V , and by the irreducibility of π′ we have Im F = {0} or Im F = V ′. Hence F is the zero map or an isomorphism.

ii) If V = V ′, the endomorphism F : V −→ V has an eigenvalue λ ∈ C (as the characteristic polynomial PF (t) := det(F − tE) ∈ C[t] has a zero λ ∈ C by the fundamental theorem of algebra). Hence F ′ := F − λE is an intertwining operator with Ker F ′ ≠ {0}. From part i) we then deduce F ′ = 0, i.e. F = λE.

Remark 1.9: The first part of the Theorem is also expressed as follows: Two irreducible representations of the same group are either equivalent or disjoint.

The second part can be written as C(π) = C.

There are several more refined versions of Schur’s Lemma which come up if one sharpens the notion of a linear representation by adding a continuity requirement, as we shall do later on. For the moment, though it is somewhat redundant, we add another version which we take over directly from Wigner ([Wi] p. 75), because it contains an irreducibility criterion used in many physics papers and its proof is a very instructive example of matrix calculus.

Theorem 1.3 (the finite-dimensional unitary Schur): Let (π,Cn) be an irreducible unitary matrix representation of a group G, i.e. with π(g) = A(g) ∈ U(n). Let M ∈ GL(n,C) be a matrix commuting with all A(g), i.e.

(1.2) MA(g) = A(g)M for all g ∈ G.

Then M is a scalar multiple of the unit matrix, M = λEn, λ ∈ C.

Proof: i) First, we show that we may assume M is a Hermitian matrix: We take the adjoint (with $A \mapsto A^* := {}^t\bar{A}$) of the commutation relation (1.2) and get

A(g)∗M∗ = M∗A(g)∗,

multiply this from both sides by A(g) and, using the unitarity A(g)A(g)∗ = En, get

M∗A(g) = A(g)M∗.

Hence, if M commutes with all A(g), then so does M∗, and so do the matrices

H1 := M + M∗, H2 := i(M − M∗),

which are Hermitian. It is therefore sufficient to show that every Hermitian matrix which commutes with all A(g) is a scalar matrix, since if H1 and H2 are multiples of En, so must be 2M = H1 − iH2.

ii) If M is Hermitian, one of the fundamental theorems from linear algebra (see for instance [Ko], p.256 or [Fi] Kapitel 4) tells us that M can be diagonalized, i.e. conjugated into a diagonal matrix D by a matrix U ∈ U(n)

UMU−1 = D.

Then we put Ã(g) := UA(g)U−1. Ã(g) is unitary, as can be easily verified. From the commutation relation (1.2), we have DÃ(g) = Ã(g)D. With D = (di,j), di,j = 0 for i ≠ j, and Ã(g) = (ãi,j), this leads to

di,i ãi,j = ãi,j dj,j .

Hence for di,i ≠ dj,j we have ãi,j = ãj,i = 0, that is, all Ã(g) must have zeros at all intersections of rows and columns where the diagonal elements are different. This would mean that g −→ Ã(g) resp. g −→ A(g) are reducible representations. Hence, as this is not the case, all diagonal elements are equal and D is a scalar matrix λEn, λ ∈ C, and so is M = U−1DU .

The proof shows that we can sharpen the statement of the Theorem to the irreducibility criterion:

Corollary: If there exists a nonscalar matrix which commutes with all matrices of a finite-dimensional unitary representation π, then the representation is reducible. If there exists none, it is irreducible.

This Corollary will help to prove the irreducibility in many cases. Here we show the usefulness of Schur’s Lemma by proving another fundamental fact (later we shall generalize this to statements valid also in the infinite-dimensional cases):


Theorem 1.4: Any finite-dimensional irreducible representation of an abelian group is one-dimensional.

Proof: Let (π, V ) be a finite-dimensional irreducible representation of the abelian group G. Then

F : V −→ V, v −→ π(g0)v

is an intertwining operator for each g0 ∈ G, as we have

π(g)F (v) = π(g)π(g0)v

= π(gg0)v, as π is a representation,

= π(g0g)v, as G is abelian,

= π(g0)π(g)v,

= F (π(g)v).

The second part of Schur’s Lemma says that F is a homothety, i.e. we have a λ ∈ C with

F (v) = π(g0)v = λv for all v ∈ V.

As g0 ∈ G was chosen arbitrarily, we see that V0 = vC is an invariant subspace of V for every v ≠ 0. For an irreducible representation π this implies V = V0, i.e. V is one-dimensional.

To finish our statements on decompositions in this section, we state here without proofs (for these see [Ki] p.121ff) some standard facts about the kind of uniqueness one can expect for a decomposition of a completely reducible representation. By the way, this theorem is a reflection of a central theorem from Algebra about the structure of modules over a ring (a representation (π, V ) can also be seen as a module over the group ring C[G]).

Recalling the notions of disjoint representations from Section 1.2 and of finite representations from above, we add the following notion.

Definition 1.6: A representation π is said to be primary, iff it cannot be represented as a sum of two disjoint representations.

As an example one may think here of a representation of the form π^k, k ∈ N, the sum of k copies of a representation (π, V ), given on the k-fold sum V^k = V ⊕ · · · ⊕ V .

The central fact can be stated like this:

Theorem 1.5: Let π be a completely reducible finite representation in the space V . Then there exists a unique decomposition of V into a sum of invariant subspaces Wj , for which the representations πj = π |Wj are primary and pairwise disjoint. Every invariant subspace V ′ ⊂ V has the property V ′ = ⊕j(V ′ ∩ Wj).

Among other things, the proof uses the notion of projection operators, known from linear algebra. As we have to use these here later on, we recall the following concept.

Definition 1.7: Suppose that the space V of a representation π is decomposed into a direct sum of invariant subspaces Vi, i.e., $V = \oplus_{i=1}^{m} V_i$. Then the projection operator Pi onto Vi parallel to the remaining Vj is defined by Pi v = vi if v = (v1, . . . , vi, . . . , vm), vi ∈ Vi, i = 1, . . . , m.

Pi has the following properties

- a) $P_i^2 = P_i$,

- b) $P_i P_j = P_j P_i = 0$ for i ≠ j,

- c) $\sum_{i=1}^{m} P_i = 1$,

- d) Pi ∈ C(π).

One can prove that every collection of operators (i.e. linear maps) in V that enjoys these properties produces a decomposition of V into a sum of invariant subspaces Vi = PiV .
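Properties a) to d) can be verified explicitly for the decomposition C3 = V1 ⊕ V3 of the permutation representation of S3 from Example 1.3. The sketch below is ours: it uses exact rational arithmetic, 0-indexed permutations, and the convention A(g)ej = e_{g(j)} for the permutation matrices.

```python
from fractions import Fraction
from itertools import permutations

def perm_matrix(p):
    """Permutation matrix with A e_j = e_{p(j)}, entries as Fractions."""
    n = len(p)
    return [[Fraction(1 if i == p[j] else 0) for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

n = 3
identity = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
zero = [[Fraction(0)] * n for _ in range(n)]

# P1 projects onto V1 = (e1 + e2 + e3)C along V3 = {w : z1 + z2 + z3 = 0}.
P1 = [[Fraction(1, n)] * n for _ in range(n)]
# P3 = 1 - P1 projects onto V3 along V1.
P3 = [[identity[i][j] - P1[i][j] for j in range(n)] for i in range(n)]

idempotent = matmul(P1, P1) == P1 and matmul(P3, P3) == P3            # a)
orthogonal = matmul(P1, P3) == zero and matmul(P3, P1) == zero        # b)
complete = [[P1[i][j] + P3[i][j] for j in range(n)]
            for i in range(n)] == identity                            # c)
intertwining = all(
    matmul(P, perm_matrix(g)) == matmul(perm_matrix(g), P)
    for P in (P1, P3) for g in permutations(range(n))
)                                                                     # d)
print(idempotent, orthogonal, complete, intertwining)
```

Here P1 is the average of all permutation matrices A(g), g ∈ S3, which explains why it commutes with each of them.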

Later on, we shall treat explicit examples of decompositions.

1.6 Characters of Finite-dimensional Representations

The following notions need the finite-dimensionality of a representation. For the infinite-dimensional cases there are generalizations using distributions and/or infinitesimal methods which we only touch briefly later on because they are beyond the scope of this book.

So in this Section (π, V ) is always a finite-dimensional complex representation of the group G with dim V = n.

Definition 1.8: We call character of π the complex function χπ (or equivalently χV ) on G given by

χπ(g) := Trπ(g) for all g ∈ G.

As usual Tr denotes the trace. If π is given by the matrices A(g) = (aij(g)) ∈ GL(n,C), one has

\[ \chi_\pi(g) = \sum_{i=1}^{n} a_{ii}(g). \]

Conjugate matrices A ∼ A′, i.e. with A′ = TAT−1, T ∈ GL(n,C), have the same trace. Therefore the trace of a representation is well defined independently of the choice of the matrix representation. And for conjugate g, g′ ∈ G we have

χπ(g) = χπ(g′).

This is also expressed by saying the character is a class function.

Obviously one has for the neutral element e of G

χπ(e) = n = dim V.

We assume moreover that π is unitary (we shall see soon that this is not really a restriction if the group is finite or compact). Then π(g) is in matrix form A(g) conjugate to a diagonal matrix D = D(λ1, . . . , λn), where λ1, . . . , λn are the eigenvalues of A(g). Hence:

Remark 1.10: $\chi_\pi(g) = \sum_{i=1}^{n} \lambda_i$.

If G is finite, each element g has finite order and so has π(g). Thus the eigenvalues λi must have | λi | = 1 and we have $\lambda_i^{-1} = \bar{\lambda}_i$.

Using this, from Remark 1.10 we deduce an important fact.

Remark 1.11: $\chi_\pi(g^{-1}) = \overline{\chi_\pi(g)}$.

Finally we state two very useful formulae.

Theorem 1.6: Let (π, V ) and (π′, V ) be two finite-dimensional representations of a group G. Then we have

$$\chi_{\pi \oplus \pi'} = \chi_\pi + \chi_{\pi'}, \tag{1.3}$$

$$\chi_{\pi \otimes \pi'} = \chi_\pi \times \chi_{\pi'}. \tag{1.4}$$

Proof: The first equation is simply the expression of the fact that the trace of a block matrix $\begin{pmatrix} A & \\ & A' \end{pmatrix}$ is the sum of the traces of the two matrices A and A′. The second relation can be seen by inspection of the trace of the Kronecker product in 1.4.2.
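The two trace identities behind (1.3) and (1.4) can be checked numerically. The following is a minimal sketch in plain Python; the helper names `trace`, `direct_sum` and `kronecker` and the sample matrices are illustrative assumptions, not notation from the text.

```python
def trace(M):
    """Sum of the diagonal entries of a square matrix given as a list of rows."""
    return sum(M[i][i] for i in range(len(M)))

def direct_sum(A, B):
    """Block diagonal matrix with blocks A and B (matrix of pi + pi')."""
    n, k = len(A), len(B)
    out = [[0] * (n + k) for _ in range(n + k)]
    for i in range(n):
        for j in range(n):
            out[i][j] = A[i][j]
    for i in range(k):
        for j in range(k):
            out[n + i][n + j] = B[i][j]
    return out

def kronecker(A, B):
    """Kronecker product of A and B (matrix of the tensor product)."""
    n, k = len(A), len(B)
    return [[A[i // k][j // k] * B[i % k][j % k] for j in range(n * k)]
            for i in range(n * k)]

# Arbitrary sample matrices, chosen only for illustration
A = [[1, 2], [3, 4]]
B = [[0, 1, 2], [5, -1, 0], [7, 0, 3]]

print(trace(direct_sum(A, B)), trace(A) + trace(B))
print(trace(kronecker(A, B)), trace(A) * trace(B))
```

Since the identities hold for every pair of square matrices, they hold in particular for A = A(g) and A′ = A′(g) at each group element g.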

Exercise 1.7: Let (π, V ) be a finite-dimensional representation of G. Show that for every g ∈ G one has the equations

$$\chi_{\wedge^2 \pi}(g) = \tfrac{1}{2}\left(\chi_\pi(g)^2 - \chi_\pi(g^2)\right),$$

$$\chi_{S^2 \pi}(g) = \tfrac{1}{2}\left(\chi_\pi(g)^2 + \chi_\pi(g^2)\right).$$

As we shall find out in the next chapters, the irreducible representations of finite and compact groups (where in the latter case we have to add a continuity condition) are all finite-dimensional. Then the characters of these representations already contain enough information to fix these representations in a way which we will discuss more precisely now.

Chapter 2

Representations of Finite Groups

As we are mainly interested in continuous groups like SO(3) etc., this topic plays no central role in our text. But here we can learn a lot about how representation theory works. Thus we present the basic facts. We will not do it by using the abstract notions offered by modern algebra (as for instance in [FH]), where eventually the whole theory can be put into the nutshell of one page. Instead we follow very closely the classic presentation from the first pages of Serre’s book [Se].

In this chapter G is always a finite group of order #G = m.

2.1 Characters as Orthonormal Systems

We introduce the group ring (or group algebra): Let K be a field and G a group with #G = m. Then we denote the space of maps from G to K by

$$KG := \{u : G \longrightarrow K\}.$$

KG is a K-vector space as we have an addition u + u′ defined by

$$(u + u')(g) := u(g) + u'(g) \quad \text{for all } g \in G$$

and scalar multiplication λu by (λu)(g) := λu(g) for u ∈ KG and λ ∈ K. But we also have a multiplication uu′ defined by

$$(uu')(g) := \sum_{a, b \in G,\ ab = g} u(a)\,u'(b).$$

Remark 2.1: One can show that KG is an associative K-algebra (for this notion see Section 6.1).

There is an alternative way to describe KG, namely as a set of formal sums $\sum_{g \in G} u(g)g$ built from the values u(g) of u in g ∈ G, i.e.

$$KG := \Big\{\sum_{g \in G} u(g)\,g,\ u(g) \in K\Big\}.$$


Here we have addition

$$\sum u(g)\,g + \sum u'(g)\,g := \sum \big(u(g) + u'(g)\big)\,g$$

and multiplication given by

$$\sum u(g)\,g \cdot \sum u'(g)\,g := \sum v(g)\,g, \quad \text{with } v(g) := \sum_{ab = g} u(a)\,u'(b).$$
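The multiplication in KG is a convolution over the group. As a small sketch, the following Python code realizes it for G = S3 with integer coefficients; representing group elements as tuples and functions u as dictionaries is an assumption made only for this illustration.

```python
from itertools import permutations

# The six elements of S3, each written as a tuple (g(0), g(1), g(2))
G = list(permutations(range(3)))

def compose(a, b):
    """Group product ab: apply b first, then a."""
    return tuple(a[b[i]] for i in range(3))

def multiply(u, v):
    """(uv)(g) = sum of u(a) v(b) over all pairs with ab = g."""
    out = {g: 0 for g in G}
    for a in G:
        for b in G:
            out[compose(a, b)] += u[a] * v[b]
    return out

def delta(g):
    """The function that is 1 at g and 0 elsewhere, i.e. the formal sum 1*g."""
    return {t: (1 if t == g else 0) for t in G}

# Three arbitrary elements of ZG, used to test associativity
u = {g: i + 1 for i, g in enumerate(G)}
v = {g: 2 * i - 3 for i, g in enumerate(G)}
w = {g: i * i for i, g in enumerate(G)}

print(multiply(multiply(u, v), w) == multiply(u, multiply(v, w)))
```

The functions `delta(g)` correspond to the group elements themselves in the description by formal sums, and their products reproduce the group law.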

As we restrict our interest in this text to complex representations, in the sequel we write $\mathcal{H} = \mathbf{C}G$ and $\mathcal{H}_0 = \mathbf{C}_{cl}G$ for the algebra of class functions, i.e. with u(g) = u(g′) for g ∼ g′. If (π, V ) is a representation of G, by Remark 1.8 in 1.5 we may assume that it is unitary. A character takes the same value on conjugate elements of G. Hence the character $u = \chi_\pi$ is not only an element in $\mathcal{H}$ but also in $\mathcal{H}_0$.

Our first aim is to prove that, for a complete system $\pi_1, \dots, \pi_h$ of irreducible representations, the characters $\chi_{\pi_1}, \dots, \chi_{\pi_h}$ are an orthonormal system (=: ON-system) in $\mathcal{H}$. Here the scalar product <,> is given by

$$\langle u, v \rangle := \frac{1}{m} \sum_{t \in G} \overline{u(t)}\, v(t) \quad \text{for all } u, v \in \mathcal{H}.$$

(It is clear that this is a scalar product.) We will also use the following bilinear form

$$(u, v) := \frac{1}{m} \sum_{t \in G} u(t^{-1})\, v(t).$$

If $u = \chi_\pi$ is a character, by Remark 1.11 in 1.6 we have $\overline{u(t)} = u(t^{-1})$ and hence $(\chi_\pi, v) = \langle \chi_\pi, v \rangle$ for $v \in \mathcal{H}$.

Theorem 2.1: Let π and π′ be irreducible representations of G and $\chi = \chi_\pi$ as well as $\chi' = \chi_{\pi'}$ their characters. Then we have

$$\langle \chi, \chi' \rangle = 0 \ \text{ if } \pi \not\sim \pi', \qquad \langle \chi, \chi \rangle = 1.$$

This is also written as $\langle \chi_\pi, \chi_{\pi'} \rangle = \delta_{\pi, \pi'}$.

The proof will be a consequence of two propositions.

Proposition 2.1: Let π, π′ be irreducible, F : V −→ V ′ a linear map, and

$$F^0 := \frac{1}{m} \sum_{g \in G} \pi'(g^{-1})\, F\, \pi(g).$$

Then one has

1) $F^0 = 0$, if $\pi \not\sim \pi'$ (“case 1”),
2) $F^0 = \lambda\, \mathrm{id}$, if V = V ′, π = π′, where $\lambda = (1/\dim V) \operatorname{Tr} F$ (“case 2”).


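Proof: $F^0$ is an intertwining operator for π and π′: We have

$$\pi'(g)^{-1} F^0 \pi(g) = \frac{1}{m} \sum_{t \in G} \pi'(g)^{-1} \pi'(t)^{-1} F \pi(t) \pi(g) = \frac{1}{m} \sum_{t \in G} \pi'(tg)^{-1} F \pi(tg) = F^0.$$

By application of Schur’s Lemma, we have $F^0 = 0$ or $F^0 = \lambda\, \mathrm{id}$ in the respective cases, and because of

$$\operatorname{Tr} F^0 = \frac{1}{m} \sum_{t \in G} \operatorname{Tr}\big(\pi(t^{-1}) F \pi(t)\big) = \operatorname{Tr} F$$

and Tr id = n = dim V, we get λ = (1/n) Tr F.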

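Proposition 2.2: For π and π′ as in Proposition 2.1 given in matrix form by π(g) = A(g) and π′(g) = B(g), we have

$$\frac{1}{m} \sum_{t \in G} B_{ij}(t^{-1}) A_{kl}(t) = 0 \quad \text{for all } i, j, k, l \text{ in case 1},$$

$$\frac{1}{m} \sum_{t \in G} A_{ij}(t^{-1}) A_{kl}(t) = \frac{1}{n}\, \delta_{il} \delta_{jk} \quad \text{for all } i, j, k, l \text{ in case 2}.$$

Proof: With $F = (F_{jk})$, Proposition 2.1 expressed in matrix form

$$\frac{1}{m} \sum_{t \in G} \sum_{j,k} B_{ij}(t^{-1}) F_{jk} A_{kl}(t) = F^0_{il}$$

states

$$\frac{1}{m} \sum_{t \in G} \sum_{j,k} B_{ij}(t^{-1}) F_{jk} A_{kl}(t) = 0 \quad \text{for all } i, l \text{ in case 1},$$

and, since F is arbitrary, this requires that all the coefficients of all $F_{jk}$ are zero. Hence, we have the first relation. In the second case we have

$$\frac{1}{m} \sum_{t \in G} \sum_{j,k} A_{ij}(t^{-1}) F_{jk} A_{kl}(t) = \lambda \delta_{il} = \frac{1}{n} \sum_{j,k} F_{jk} \delta_{jk} \delta_{il}.$$

Comparing the coefficients of $F_{jk}$ on both sides leads to the second relation.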

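Proof of the Theorem: We have $\chi'(t) = \sum_i B_{ii}(t)$ and $\chi(t) = \sum_i A_{ii}(t)$ and hence

$$\langle \chi, \chi \rangle = \frac{1}{m} \sum_{t \in G} \chi(t^{-1}) \chi(t) = \frac{1}{m} \sum_{t \in G} \sum_{i,j} A_{ii}(t^{-1}) A_{jj}(t) = \frac{1}{n} \sum_{i,j} \delta_{ij} \delta_{ij} = 1$$

by the second relation in Proposition 2.2, and

$$\langle \chi, \chi' \rangle = \frac{1}{m} \sum_{t \in G} \chi'(t^{-1}) \chi(t) = \frac{1}{m} \sum_{t \in G} \sum_{i,j} B_{ii}(t^{-1}) A_{jj}(t) = 0$$

by the first relation in Proposition 2.2.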


Corollary 1: Let (π, V ) be a finite-dimensional representation of G with character $\chi := \chi_\pi$ and $(\pi_i, V_i)$ irreducible representations with characters $\chi_i := \chi_{\pi_i}$ and

$$V = V_1 \oplus \cdots \oplus V_k.$$

Let moreover (π′, V ′) be an irreducible representation with character $\chi' = \chi_{\pi'}$. Then we have

$$\#\{V_i;\ V_i \simeq V'\} = \langle \chi', \chi \rangle. \tag{2.1}$$

From 1.2 we know that this number is also recognized as mult(π′, π).

Proof: By Theorem 1.6 in 1.6 we have $\chi = \sum_i \chi_i$. Theorem 2.1 in 2.1 says that the summands in

$$\langle \chi', \chi \rangle = \sum_i \langle \chi', \chi_i \rangle$$

are either 0 or 1, according to whether $\pi' \not\sim \pi_i$ or $\pi' \sim \pi_i$. Hence $\langle \chi', \chi \rangle$ counts the number of irreducible components $\pi_i$ contained in π which are equivalent to π′.

Corollary 2: mult(π′, π) is independent of the decomposition as in Corollary 1.

Proof: $\langle \chi', \chi \rangle$ does not depend on the decomposition.

Corollary 3: If two representations π and π′ have the same character $\chi_\pi = \chi_{\pi'}$, they are equivalent.

Proof: By Corollary 1 the multiplicities are equal for all irreducible components of π and π′ and so both representations are equivalent.

Corollary 4: If π decomposes into the irreducible representations $\pi_1, \dots, \pi_h$ with respective multiplicities $m_i$, i.e. one has

$$V = m_1 V_1 \oplus \cdots \oplus m_h V_h,$$

or, as we also write

$$V = V_1^{m_1} \oplus \cdots \oplus V_h^{m_h},$$

one has

$$\langle \chi, \chi \rangle = \sum_{i=1}^{h} m_i^2.$$

Proof: This follows immediately from Theorem 2.1 in 2.1, as $\chi = \sum m_i \chi_i$ and the $\chi_i$ are an ON-system.

This Corollary specializes to the following important irreducibility criterion.

Corollary 5: π is irreducible exactly if < χ, χ > = 1.
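Corollaries 1, 4 and 5 turn decomposition questions into arithmetic with character values. The sketch below illustrates this for G = S3, using the character values of π1, π2, π3 that are derived for this group later in the chapter, grouped by the conjugacy classes of e, σ, τ. As an assumption for illustration, the character `chi_perm` (number of fixed points) is taken to be that of the permutation representation of S3 on C^3.

```python
from fractions import Fraction

class_sizes = [1, 3, 2]      # sizes of the classes of e, sigma, tau in S3
m = sum(class_sizes)         # group order, here 6

chi = {
    "chi1": [1, 1, 1],       # trivial representation
    "chi2": [1, -1, 1],      # sign representation
    "chi3": [2, 0, -1],      # two-dimensional representation
}

def inner(u, v):
    """<u, v> for class functions given by their values on the classes.

    All values used here are real, so the complex conjugation in the
    scalar product can be omitted.
    """
    return Fraction(sum(c * a * b for c, a, b in zip(class_sizes, u, v)), m)

# Assumed example: character of the permutation representation on C^3
chi_perm = [3, 1, 0]

mults = {name: inner(x, chi_perm) for name, x in chi.items()}
print(mults)                      # multiplicities as in Corollary 1
print(inner(chi_perm, chi_perm))  # sum of squared multiplicities (Corollary 4)
```

A value of `inner(x, x)` equal to 1 signals irreducibility by Corollary 5, while a larger value shows that the representation is reducible.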


2.2 Regular Representation and its Decomposition

The regular representation λ of the finite group G with #G = m is defined on the vector space V spanned by a basis $(e_t)$ indexed by the elements t of G, i.e. $V = \langle e_t \rangle_{t \in G}$, by the prescription

$$\lambda(g) e_t := e_{gt},$$

i.e. we have $\lambda(g) v = \sum_{t \in G} z_{g^{-1}t}\, e_t$ for $v = \sum_{t \in G} z_t e_t$, $z_t \in \mathbf{C}$. V can be identified with $\mathcal{H}$ concerning the vector space structure. Then we have

$$\lambda(g) u = \sum_{t \in G} u(g^{-1}t)\, t \quad \text{for all } u = \sum_{t \in G} u(t)\, t \in \mathcal{H}.$$

If g ≠ e, we have gt ≠ t for all t ∈ G, which shows that the diagonal elements of λ(g) are zero. In particular we have Tr λ(g) = 0 and $\operatorname{Tr} \lambda(e) = \operatorname{Tr} E_m = m$. So we have proven:

Proposition 2.3: The character $\chi_\lambda$ of the regular representation λ of G is given by

$$\chi_\lambda(e) = \#G = m,$$

$$\chi_\lambda(g) = 0 \quad \text{for all } g \neq e.$$

Proposition 2.4: Every irreducible representation $\pi_i$ is contained in the regular representation with multiplicity equal to its dimension $n_i$, i.e. $\operatorname{mult}(\pi_i, \lambda) = n_i = \dim V_i$.

Proof: According to Corollary 1 in 2.1 we have

$$\operatorname{mult}(\pi_i, \lambda) = \langle \chi_{\pi_i}, \chi_\lambda \rangle$$

and by the definition of the scalar product also

$$\operatorname{mult}(\pi_i, \lambda) = \frac{1}{m} \sum_{t \in G} \chi_\lambda(t^{-1})\, \chi_{\pi_i}(t).$$

By Proposition 2.3 and the general fact that the character of the neutral element equals the dimension, we finally get

$$\operatorname{mult}(\pi_i, \lambda) = \frac{1}{m} \cdot m \cdot \chi_{\pi_i}(e) = n_i.$$
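Propositions 2.3 and 2.4 can be verified by brute force for a small group. The following sketch does this for G = S3; realizing S3 by tuples and taking the two-dimensional character as "number of fixed points minus one" are assumptions made for this illustration.

```python
from itertools import permutations

G = list(permutations(range(3)))   # S3 as tuples (g(0), g(1), g(2))
e = (0, 1, 2)

def compose(a, b):
    """Group product ab: apply b first, then a."""
    return tuple(a[b[i]] for i in range(3))

def chi_regular(g):
    """Trace of lambda(g): lambda(g) e_t = e_{gt}, so the diagonal entry
    belonging to t equals 1 exactly when gt = t."""
    return sum(1 for t in G if compose(g, t) == t)

def chi3(g):
    """Character of the two-dimensional representation of S3
    (assumed here in the form: fixed points of g minus 1)."""
    return sum(1 for i in range(3) if g[i] == i) - 1

# chi_regular vanishes off e, so the sum over t reduces to the term t = e
mult3 = sum(chi_regular(t) * chi3(t) for t in G) // len(G)
print([chi_regular(g) for g in G], mult3)
```

The multiplicity obtained for the two-dimensional representation is its dimension, in agreement with Proposition 2.4.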

Proposition 2.5: Let $\pi_1, \dots, \pi_h$ be a complete system representing the equivalence classes of irreducible representations of G. Then the dimensions $n_i = \dim V_i$ satisfy the relations

a) $\sum_{i=1}^{h} n_i^2 = m$,

and, if t ∈ G, t ≠ e,

b) $\sum_{i=1}^{h} n_i \chi_{\pi_i}(t) = 0$.

Proof: By Proposition 2.4 we have $\chi_\lambda(t) = \sum n_i \chi_i(t)$ for all t ∈ G. Taking t = e we obtain a) and for t ≠ e we obtain b).

This result is very useful for the determination of the irreducible representations of a finite group: suppose one has constructed some mutually nonequivalent irreducible representations of dimensions $n_1, \dots, n_h$. In order that they be all irreducible representations (up to equivalence) it is necessary and sufficient that one has $n_1^2 + \cdots + n_h^2 = m$.

In 1.3 we already discussed the example $G = S_3$. Here we have m = 6. And we singled out the two one-dimensional representations $\pi_1 = 1$ and $\pi_2 = \operatorname{sgn}$ and the two-dimensional representation $\pi_3$. As we have here $1^2 + 1^2 + 2^2 = 6 = m$, we see that we got all irreducible representations. Moreover, we have a simple way to check the irreducibility of $\pi_3$ by using our Corollary 5 in 2.1. As an easy Exercise 2.1, we recommend to do this and to determine the characters of all the representations $\pi_1, \pi_2, \pi_3$, and $\pi_0$ introduced there. One can see that one has $\pi_0 = \pi_1 + \pi_3$, consistent with our Corollary 4 in 2.1.

One can prove more facts in this context, for instance the dimensions $n_i$ all divide the order m of G (see [Se] p.53).

2.3 Characters as Orthonormal Bases and Number of Irreducible Representations

We sharpen Theorem 2.1 from 2.1 to the central fact that the characters $\chi_1, \dots, \chi_h$ belonging to a complete set of equivalence classes of irreducible representations of G form an orthonormal basis of the space of class functions $\mathcal{H}_0$. We recall that a function f defined on G is a class function if one has $f(g) = f(tgt^{-1})$ for all g, t ∈ G.

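Proposition 2.6: Let f be a class function on G and (π, V ) a representation of G. Let $\pi_f$ be the endomorphism of V defined by

$$\pi_f := \sum_{t \in G} f(t)\, \pi(t).$$

If π is irreducible with dim V = n and character χ, then $\pi_f$ is a scalar multiple of the identity, i.e. $\pi_f = \lambda\, \mathrm{id}_V$, with

$$\lambda = \frac{1}{n} \sum_{t \in G} \chi(t) f(t) = \frac{m}{n} \langle \bar{\chi}, f \rangle.$$

Proof: $\pi_f$ is an intertwining operator for π: We have

$$\pi(s)^{-1} \pi_f\, \pi(s) = \sum_{t \in G} f(t)\, \pi(s)^{-1} \pi(t) \pi(s) = \sum_{t \in G} f(t)\, \pi(s^{-1}ts).$$

As f is a class function, we get by the substitution $t \longmapsto s^{-1}ts = t'$

$$\pi(s)^{-1} \pi_f\, \pi(s) = \sum_{t' \in G} f(t')\, \pi(t') = \pi_f.$$

By the second part of Schur’s Lemma, $\pi_f = \lambda\, \mathrm{id}$ is a scalar multiple of the identity. We have Tr λ id = nλ and

$$\operatorname{Tr} \pi_f = \sum_{t} f(t) \operatorname{Tr} \pi(t) = \sum_{t} f(t) \chi(t).$$

Hence

$$\lambda = \frac{1}{n} \sum_{t} f(t) \chi(t) = \frac{m}{n} \langle \bar{\chi}, f \rangle.$$

As the characters are class functions, they are elements of $\mathcal{H}_0$.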


Theorem 2.2: The characters $\chi_1, \dots, \chi_h$ form an ON-basis of $\mathcal{H}_0$.

Proof: Theorem 2.1 from Section 2.1 says that the $(\chi_i)$ form an ON-system in $\mathcal{H}_0$. We have to prove that they generate $\mathcal{H}_0$. That is, we have to show that every $f \in \mathcal{H}_0$, orthogonal to all $\chi_i$, is zero. Let f be such an element. For each representation π of G, we put $\pi_f = \sum f(t) \pi(t)$ as above. Proposition 2.6 shows that $\pi_f$ is zero if π is irreducible (note that the complex conjugate of an irreducible character is again an irreducible character, so f is orthogonal to it as well). But as each π may be decomposed into irreducible representations, $\pi_f$ is always zero. We apply this to the regular representation, i.e. to π = λ, and compute for the basis vector $e_1$ of the representation space for λ that belongs to the neutral element of G

$$\pi_f e_1 = \sum_{t} f(t)\, \lambda(t) e_1 = \sum_{t} f(t)\, e_t.$$

Since $\pi_f$ is zero, we have $\pi_f e_1 = 0$ and this requires f(t) = 0 for all t ∈ G. Hence f is zero.

As each element of $\mathcal{H}_0$ is fixed by associating a complex number to each conjugacy class $g^\sim$ of G, the dimension of $\mathcal{H}_0$ as a C-vector space equals the number of conjugacy classes. Hence we have:

Corollary 1: The number of irreducible representations of G (up to equivalence) is equal to the number of conjugacy classes.

Corollary 2: Let g be an element of G and $c(g) := \#\{g';\ g' \sim g\}$ the number of elements in the conjugacy class of g. Then we have

a) $\sum_{i=1}^{h} \overline{\chi_i(g)}\, \chi_i(g') = m/c(g)$ if g′ = g,

b) $\sum_{i=1}^{h} \overline{\chi_i(g)}\, \chi_i(g') = 0$ if $g' \not\sim g$.

Proof: Let $f_g$ be the characteristic function of the class of g, i.e. $f_g(g') = 1$ if g′ ∼ g and = 0 else. Since we have $f_g \in \mathcal{H}_0$, by the Theorem, we can write

$$f_g = \sum_{i=1}^{h} \lambda_i \chi_i \quad \text{with } \lambda_i = \langle \chi_i, f_g \rangle = \frac{c(g)}{m}\, \overline{\chi_i(g)}.$$

For each t ∈ G we then have

$$f_g(t) = \frac{c(g)}{m} \sum_{i=1}^{h} \overline{\chi_i(g)}\, \chi_i(t).$$

And because of $f_g(t) = 1$ for t ∼ g and = 0 for $t \not\sim g$, we get a) and b).
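Corollary 2 states that the columns of the character table are orthogonal as well. A brief sketch for G = S3, again with the character values of π1, π2, π3 on the classes of e, σ, τ that are derived for this group later in the chapter (the dictionary layout is an assumption for illustration; all values are real, so no conjugation is needed):

```python
m = 6                                        # order of S3
class_size = {"e": 1, "sigma": 3, "tau": 2}  # c(g) for the three classes

# Columns of the character table: (chi1(g), chi2(g), chi3(g))
column = {
    "e": [1, 1, 2],
    "sigma": [1, -1, 0],
    "tau": [1, 1, -1],
}

def column_product(g, g2):
    """Sum over i of chi_i(g) * chi_i(g2)."""
    return sum(a * b for a, b in zip(column[g], column[g2]))

for g in column:
    for g2 in column:
        print(g, g2, column_product(g, g2))
```

On the diagonal the products equal m/c(g), and they vanish for elements from different classes.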

Remark 2.2: We recall the statement from 0.3 that the number of conjugacy classes of the symmetric group $S_n$ equals the number p(n) of partitions of n. It is a nice exercise to redo the example of $G = S_3$ in the light of the general facts now at hand: We have p(3) = 3 and thus know of three irreducible representations $\pi_1, \pi_2, \pi_3$ with respective characters $\chi_1, \chi_2, \chi_3$. Moreover we know that $G = S_3$ can be generated by the transposition σ = (1, 2) and the cycle τ = (1, 2, 3). The three conjugacy classes are represented by e = id, σ, and τ. We have

$$\sigma^2 = e, \quad \tau^3 = e, \quad \sigma\tau = \tau^2\sigma.$$

Hence, for each representation π the equation $\pi(\sigma)^2 = \mathrm{id}$ holds. So there are two one-dimensional representations, namely $\pi_1 = 1$ and $\pi_2 = \operatorname{sgn}$ with

$$\chi_1(e) = 1, \quad \chi_1(\sigma) = 1, \quad \chi_1(\tau) = 1,$$

$$\chi_2(e) = 1, \quad \chi_2(\sigma) = -1, \quad \chi_2(\tau) = 1.$$

The third representation $\pi_3$ must have $\dim \pi_3 = n$ with $1^2 + 1^2 + n^2 = 6$, i.e. n = 2, as we already concluded at the end of 2.2. The value of $\chi_3$ can be deduced (Proposition 2.4 in 2.2) from the relation with the character $\chi_\lambda$ of the regular representation λ

$$\chi_1 + \chi_2 + 2\chi_3 = \chi_\lambda.$$

As we have $\chi_\lambda(e) = 6$ and $\chi_\lambda(g) = 0$ for g ≠ e, we conclude

$$\chi_3(e) = 2, \quad \chi_3(\sigma) = 0, \quad \chi_3(\tau) = -1.$$

These formulae can also be read off from the results of Exercise 2.1.

There is a lot more to be said about the representations of finite groups in general and of the symmetric group in particular. Here, we only mention that the irreducible representations of $S_n$ are classified by the Young tableaus realizing a partition of n. We also state the following theorem about the canonical decomposition of a representation of a finite group, which is a special case of our Theorem 1.5 in 1.5 and can be proved rather easily (see [Se] p. 21):

Let $\pi_1, \dots, \pi_h$ be, as before, irreducible representations representing each exactly one equivalence class and let (π, V ) be any representation. Let $V = U_1 \oplus \cdots \oplus U_k$ be a direct sum decomposition of V into spaces belonging to irreducible representations. For i = 1, . . . , h denote by $W_i$ the direct sum of those of the $U_1, \dots, U_k$ which are isomorphic to $V_i$ (the $W_i$ belong to the primary representations introduced at the end of 1.5). Then we have

$$V = W_1 \oplus \cdots \oplus W_h,$$

which is called the canonical decomposition. Its properties are as follows:

Theorem 2.3: i) The decomposition $V = W_1 \oplus \cdots \oplus W_h$ does not depend on the initially chosen decomposition of V into irreducible subspaces.

ii) The projection $P_i$ of V onto $W_i$ associated to the decomposition is given by

$$P_i = \frac{n_i}{m} \sum_{t \in G} \overline{\chi_i(t)}\, \pi(t), \quad n_i = \dim V_i.$$
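The projection formula of Theorem 2.3 can be tried out on a concrete representation. The sketch below uses, as an assumed example, the permutation representation of S3 on C^3 and computes the three operators P_i with exact rational arithmetic. The characters of S3 are real, so the complex conjugation in the formula plays no role here; the helper names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

G = list(permutations(range(3)))   # S3 as tuples (g(0), g(1), g(2))
m = len(G)

def perm_matrix(g):
    """Matrix of pi(g) with pi(g) e_j = e_{g(j)}."""
    return [[1 if g[j] == i else 0 for j in range(3)] for i in range(3)]

def fixed_points(g):
    return sum(1 for i in range(3) if g[i] == i)

def sign(g):
    inversions = sum(1 for i in range(3) for j in range(i + 1, 3) if g[i] > g[j])
    return -1 if inversions % 2 else 1

# (dimension n_i, character chi_i) of the three irreducible representations
characters = {
    "trivial": (1, lambda g: 1),
    "sign": (1, sign),
    "standard": (2, lambda g: fixed_points(g) - 1),
}

def projection(n_i, chi_i):
    """P_i = (n_i / m) * sum over t of chi_i(t) pi(t)."""
    P = [[Fraction(0)] * 3 for _ in range(3)]
    for g in G:
        M = perm_matrix(g)
        for i in range(3):
            for j in range(3):
                P[i][j] += Fraction(n_i * chi_i(g), m) * M[i][j]
    return P

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

P = {name: projection(n, chi) for name, (n, chi) in characters.items()}
for name, matrix in P.items():
    print(name, matrix)
```

In this example the projection for the sign character vanishes, reflecting that the sign representation does not occur, and the other two projections add up to the identity.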

We stop our treatment of finite groups here and propose that the reader treat the following examples.

Exercise 2.2: Determine the irreducible representations of the cyclic group $C_n$ and of the dihedral group $D_n$ (this is the group of rotations and reflections of the plane which preserve a regular polygon with n vertices; it is generated by a rotation r and a reflection s fulfilling the relations $r^n = e$, $s^2 = e$, $srs = r^{-1}$).

Exercise 2.3: Determine the characters of $S_4$.

Chapter 3

Continuous Representations

Among the topological groups, compact groups are the easiest ones to handle. Since we would like to treat their representations next, it is quite natural that the appearance of topology leads to an additional requirement for an appropriate definition, namely a continuity requirement. We will describe this first and indicate the necessary changes for the general concepts from Sections 1.1 to 1.5 (to be used in the whole text later). In the next chapter we specialize to compact groups. We will find that their representation theory has a lot in common with that of the finite groups, in particular the fact that all irreducible representations are finite-dimensional and contained in the regular representation (the famous Theorem of Peter and Weyl). But there is an important difference: the number of equivalence classes of irreducible representations may be infinite.

We do not prove all the modifications in the general theorems when linear representations are specialized to continuous representations, but concentrate in the next chapters on an explicit description of the representation theory for SU(2) resp. SO(3) and the other examples mentioned in the introduction.

3.1 Topological and Linear Groups

We follow [Ki] p.22:

Definition 3.1: A topological group is a set that is simultaneously a group and a topological space in which the group and topological structures are connected by the following condition:

The map

$$G \times G \longrightarrow G, \quad (a, b) \longmapsto ab^{-1}$$

is continuous.

Remark 3.1: This requirement is equivalent to the following three conditions, which are more convenient for checking:

1.) $(a, b) \longmapsto ab$ is continuous in a and in b,
2.) $a \longmapsto a^{-1}$ is continuous at the point a = e,
3.) $(a, b) \longmapsto ab$ is continuous in both variables together at the point (e, e).


Every abstract group can be regarded as a topological group if one gives it the discrete topology where every subset is open.

All the matrix groups presented in Section 0.1 and later on are topological groups: As they are subsets of $M_n(\mathbf{R}) \simeq \mathbf{R}^{n^2}$, they inherit the (standard) topology from $\mathbf{R}^{n^2}$. Open balls in $M_n(\mathbf{R})$ (and analogously in $M_n(\mathbf{C})$) are

$$K_\rho(A) := \{B \in M_n(\mathbf{R}),\ \|B - A\| < \rho\}, \quad \rho \in \mathbf{R}_{>0},\ A \in M_n(\mathbf{R}),$$

where

$$\|A\| := \sqrt{\langle A, A \rangle}, \quad \langle A, B \rangle := \operatorname{Re} \operatorname{Tr}\, {}^t\!AB. \tag{3.1}$$

This topology is consistent with the group structure: multiplication of two matrices gives a matrix where the coefficients of the product matrix are polynomials in the coefficients of the factors and thus the two continuity requirements 1.) and 3.) in the remark above are fulfilled. Similarly with condition 2.), as the inverse of a matrix has coefficients which are polynomials in the coefficients of the matrix divided by the determinant, another polynomial, here not equal to zero, i.e. we have again a continuous function.

For A ∈ U(n) we have ${}^t\!\bar{A}A = E$, i.e. $\|A\|^2 = \operatorname{Tr}\, {}^t\!\bar{A}A = n$, so U(n) is bounded and we have an example of a compact group. Heis(R), SL(2,R) and SO(3, 1) are not compact, but these groups together with all others in this text are locally compact, i.e. each element admits a neighbourhood with compact closure (as $\mathbf{R}^n$ does). Moreover, they are closed subgroups of some GL(N,R). This, in principle, has to be verified in each case. A practical tool is the following standard criterion.

Remark 3.2: G is closed if for every convergent sequence $(A_j)$, $A_j \in G$ with $\lim A_j \in GL(n, \mathbf{R})$, we have $\lim A_j \in G$.

An example of a different procedure is the following: SL(n,R) is a closed subgroup of GL(n,R) as det : GL(n,R) −→ R∗ is a continuous map and SL(n,R) is the inverse image of the closed subset {1} ⊂ R∗.

Now we have the possibility to define the class of groups we will restrict to in this text:

Definition 3.2: A group G is called a linear group if there is an n ∈ N such that G is isomorphic (as abstract group) to a closed subgroup of GL(n,R) or GL(n,C).

Remark 3.3: More generally, one also calls linear those groups which are isomorphic to a closed subgroup of GL(n,H), H the quaternion skewfield, which in this text will only appear in a later application.

Remark 3.4: In another kind of generality one treats Lie groups, which are topological groups and at the same time differentiable manifolds such that

$$G \times G \longrightarrow G,\ (a, b) \longmapsto ab \quad \text{and} \quad G \longrightarrow G,\ a \longmapsto a^{-1},$$

are differentiable maps. As we try to work without the (important) notion of a manifold as long as possible, it is quite adequate to stay with the concept of linear groups as defined above (which comprises all the groups we are interested in here).


While treating topological notions, let us recall that a group G is called (path-)connected if for any two a, b ∈ G there is a continuous map of a real interval into G mapping one endpoint of the interval to a and the other to b.

Exercise 3.1: Show that SU(2) and SL(2,R) are connected and GL(2,R) is not.

Before we come to the definition of continuous representations, let us adapt the notion of a G-set from Section 0.2 to the case that G is a topological group.

Definition 3.3: A set X is called a (left-)topological G-space iff it is a left G-set as in 0.2 and X is also provided with a topology such that the map

$$G \times X \longrightarrow X, \quad (g, x) \longmapsto gx,$$

is continuous.

In the sequel, G-space will abbreviate the content of this definition.

If the space X is Hausdorff (as all our examples will be), then the stabilizing subgroup $G_x \subset G$ of each point x ∈ X is a closed subgroup of G. Conversely, if H is a closed subgroup of a Hausdorff group G, then the space G/H of left cosets gH, g ∈ G, is a homogeneous Hausdorff G-space when given the usual quotient space topology. If X is any other topological G-space with the same stabilizing group $G_x = H$ for x ∈ X, then the natural map

$$\Phi : G/H \longrightarrow X, \quad gH \longmapsto gx$$

is 1-1 and continuous. If G and X are locally compact, then Φ is a homeomorphism and allows an identification of both spaces.

3.2 The Continuity Condition

From now on a group will mean a locally compact topological group. Then an appropriate representation space V should be a linear topological space. To make things easier for us, we will suppose that $V = \mathcal{H}$ is a separable complex Hilbert space (i.e. having a denumerable Hilbert space basis $(e_j)_{j \in I}$, $I \simeq \mathbf{N}$), as for instance $L^2(S^1)$ with basis given by the functions $e_j(t) := \exp(2\pi i j t)$, j ∈ Z. For the finite-dimensional case this is nothing new: we have as before $V = \mathbf{C}^n$.

If we write again GL(V ), in the finite-dimensional case, we have the same set as before provided with the topology given by an isomorphism $GL(V) \simeq GL(n, \mathbf{C})$. In the infinite-dimensional case, $V = \mathcal{H}$, $GL(V) = GL(\mathcal{H})$ is meant as the group $L(\mathcal{H})$ of linear bounded operators in $\mathcal{H}$. We recall here the following fundamental notions relevant for the infinite-dimensional cases (from e.g. [BR] p.641f):

• An operator T with domain $D(T) \subset \mathcal{H}$ is said to be continuous at a point $v_0 \in D(T)$ iff for every ε > 0 there exists a δ = δ(ε) > 0 such that from $\|v - v_0\| < \delta$ with v ∈ D(T) we can conclude $\|Tv - Tv_0\| < \varepsilon$.

• An operator T is said to be bounded iff there exists a constant C such that one has $\|Tv\| \leq C \|v\|$ for all v ∈ D(T).

• The norm $\|T\|$ of a bounded operator T is defined as the smallest C such that

$$\|Tv\| \leq C \|v\| \quad \text{for all } \|v\| \leq 1.$$

Remark 3.5: A bounded linear operator is uniformly continuous. Conversely, if a linear operator T is continuous at a point $v_0$ (e.g. $v_0 = 0$), then T is bounded.

Exercise 3.2: Prove this.

• Let A be a linear operator in $\mathcal{H}$ with dense domain $D(A) \subset \mathcal{H}$. Then one can prove that there are $v, v' \in \mathcal{H}$ such that $\langle Au, v \rangle = \langle u, v' \rangle$ holds for all u ∈ D(A). One sets $v' =: A^*v$, verifies that $A^*$ is a linear operator, and calls $A^*$ the adjoint of A.

• For operators A and B in $\mathcal{H}$ one writes B ⊃ A iff one has D(B) ⊃ D(A) and Bu = Au for all u ∈ D(A). Then B is called an extension of A.

• A linear operator A with dense domain D(A) is called symmetric iff one has $A^* \supset A$ and

$$\langle Au, v \rangle = \langle u, Av \rangle \quad \text{for all } u, v \in D(A).$$

• A linear operator A with dense domain is called self-adjoint iff one has $A^* = A$. One can prove that a linear symmetric operator A with $D(A) = \mathcal{H}$ is bounded and self-adjoint.

If not mentioned otherwise, $L(\mathcal{H})$ is provided with the strong operator topology ([BR]), i.e. we have as basis of open neighbourhoods

$$B_{\rho,u}(T_0) := \{T \in L(\mathcal{H});\ \|Tu - T_0u\| < \rho\}, \quad \rho \in \mathbf{R}_{>0},\ u \in \mathcal{H},\ T_0 \in L(\mathcal{H}).$$

In the sequel a representation of G shall mean the following.

Definition 3.4: A linear representation $(\pi, \mathcal{H})$ is said to be continuous iff π is strongly continuous, i.e. the map

$$G \longrightarrow \mathcal{H}, \quad g \longmapsto \pi(g)v$$

is continuous for each $v \in \mathcal{H}$.

Remark 3.6: It is quite obvious that one could also ask for the following continuity requirements:

a) The map $G \times \mathcal{H} \longrightarrow \mathcal{H}$, $(g, v) \longmapsto \pi(g)v$ is continuous.

b) The map $G \longrightarrow GL(\mathcal{H})$, $g \longmapsto \pi(g)$ is continuous.

c) For each $v \in \mathcal{H}$, the map $G \longrightarrow \mathcal{H}$, $g \longmapsto \pi(g)v$ is continuous in g = e and there is a uniform bound for $\|\pi(g)\|$ in some neighbourhood of e ([Kn] p.10).

Definition 3.5: A representation (π, V ) is bounded iff we have

$$\sup_{g \in G} \|\pi(g)\| < \infty.$$

In all examples in this section and in most cases of the following sections all these conditions are fulfilled. We will not discuss the hypotheses necessary so that all these conditions are equivalent and refer for this to [Ki] p.111, [Kn] p.10 and [BR] p.134. But we will show for some examples how the continuity of a representation comes out. Later we will not examine continuity and leave this to the reader’s diligence. (To check this and the question whether a representation from the examples to be treated is bounded or not may even be (part of) the subject of a bachelor thesis.)

Example 3.1: The linear representation $\pi = \chi_k$, k ∈ Z, of G = SO(2) in V = C given by

$$\chi_k(r(\vartheta)) := e^{ik\vartheta} \quad \text{for all } g = r(\vartheta) = \begin{pmatrix} \cos\vartheta & \sin\vartheta \\ -\sin\vartheta & \cos\vartheta \end{pmatrix},\ \vartheta \in \mathbf{R},$$

is obviously continuous: The map

$$(g, z) \longmapsto e^{ik\vartheta} z$$

is continuous in g = r(ϑ) for each z ∈ C, or even continuous in both variables g and z, because the exponential function is (in particular) continuous and an open neighborhood of e ∈ SO(2) is homeomorphic to an open interval in R containing 0.

Later we will modify the notions of reducibility and irreducibility for continuous representations. But here we can already prepare the way for this and prove the following:

Remark 3.7: Each irreducible continuous representation π of G = SO(2) is equivalent to some $\chi_k$, k ∈ Z.

Proof: We look at (π, V ) only as a linear representation and apply Schur’s Lemma from 1.5. As G is abelian, we can conclude by Theorem 1.4, following from Schur’s Lemma in 1.5, that V = C. Hence we can assume π : SO(2) −→ C∗. As SO(2) is homeomorphic to R/2πZ, we have a continuous map r : R −→ SO(2), t ⟼ r(t). Now we use the assumption that π is continuous and we have a sequence of continuous maps

$$\mathbf{R} \longrightarrow SO(2) \longrightarrow \mathbf{C}^*, \quad t \longmapsto r(t) \longmapsto \pi(r(t)),$$

and in consequence a continuous map

$$\mathbf{R} \ni t \longmapsto \pi(r(t)) =: \chi(t) \in \mathbf{C}^*.$$

As π is a linear representation, χ has the property

$$\chi(t_1 + t_2) = \chi(t_1)\chi(t_2), \quad \chi(t) = 1 \text{ for all } t \in 2\pi\mathbf{Z}.$$

Here we have to recall a fact from standard calculus (see for instance [Fo] p.75): Every continuous function χ : R −→ C∗ obeying the functional equation $\chi(t_1 + t_2) = \chi(t_1)\chi(t_2)$ is an exponential function, i.e. χ(t) = c exp(tζ) for c ∈ C∗, ζ ∈ C. The functional equation gives χ(0) = 1, i.e. c = 1, and the condition χ(2πn) = 1 for all n ∈ Z fixes ζ = ki for some k ∈ Z.
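The two conditions used in this proof, the functional equation and the periodicity, are easy to test numerically. The following sketch uses Python's `cmath`; the function names `chi` and `is_close`, the tolerance and the sample angles are illustrative assumptions.

```python
import cmath
import math

def chi(k, theta):
    """Candidate character t -> exp(i k t), evaluated at t = theta."""
    return cmath.exp(1j * k * theta)

def is_close(a, b, tol=1e-12):
    return abs(a - b) < tol

t1, t2 = 0.7, 1.9

# The functional equation holds for every exponent, integer or not
print(is_close(chi(3, t1 + t2), chi(3, t1) * chi(3, t2)))
print(is_close(chi(0.5, t1 + t2), chi(0.5, t1) * chi(0.5, t2)))

# Periodicity chi(2 pi) = 1 singles out the integers k
print(is_close(chi(3, 2 * math.pi), 1))
print(is_close(chi(0.5, 2 * math.pi), 1))
```

The exponent 0.5 satisfies the functional equation on R but does not descend to SO(2), which is the numerical counterpart of the condition k ∈ Z.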

By the way, already here, we can anticipate a generalization of Proposition 2.4 in 2.2 from finite to compact groups: $L^2(SO(2))$ can be identified with a space of periodic functions. And from the theory of Fourier series one knows that this space has as a Hilbert space basis the trigonometric functions $e_j$, j ∈ Z, with

$$e_j(t) := e^{2\pi i j t}, \quad t \in \mathbf{R}.$$

This can be translated into the statement that each irreducible continuous representation of SO(2) is contained (with multiplicity one) in the left- (or right-)regular representation λ resp. ρ of SO(2) and λ and ρ are direct sums of the $\chi_j$, j ∈ Z.

Example 3.2: Let G be SO(3) and, for fixed m ∈ N, let $\pi_m$ be the representation of G given on the space $V_m$ of all homogeneous polynomials of degree m in 3 variables, i.e.

$$V_m := \{P \in \mathbf{C}[x_1, x_2, x_3];\ P \text{ homogeneous},\ \deg P = m\},$$

by

$$\pi_m(g)P(x) := P(g^{-1}x), \quad x = {}^t(x_1, x_2, x_3) \text{ (a column)}.$$

$\pi_m$ is a linear representation as is already clear from the general consideration in 1.3. The continuity condition is fulfilled since SO(3) inherits the topology from GL(3,R), matrix multiplication $x \longmapsto g^{-1}x$ is a continuous map, and polynomials P are continuous functions in their arguments.

This example is constructed similarly to the example $\pi = \pi_j$ for G = SU(2) introduced in 1.3, which is continuous by the same reasoning, as are also the other examples in 1.3 and later on. So we will now take the continuity of our representations for granted.

Moreover, let us announce here that we shall have to discuss the relationship of the two examples G = SO(3) and G = SU(2) very thoroughly later. The representations $\pi_j$ of SU(2) will come out as irreducible, but here we see already that for instance for m = 2, $(\pi_m, V_m)$ has an invariant subspace $V_0 := (x_1^2 + x_2^2 + x_3^2)\,\mathbf{C}$, since each element g ∈ SO(3) acts on $\mathbf{R}^3$ leaving invariant the spheres given by $\sum x_i^2 = \rho^2$.
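The invariance of the line spanned by the sum of squares can be observed numerically. The sketch below implements the action (π(g)P)(x) = P(g⁻¹x) for polynomials given as Python functions. As a simplifying assumption it only uses rotations about the third axis, not a general element of SO(3), and the names `rotation_z`, `act`, `Q` and `R` are illustrative.

```python
import math

def rotation_z(theta):
    """Rotation by theta about the x3-axis, an element of SO(3)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def transpose(g):
    return [[g[j][i] for j in range(3)] for i in range(3)]

def apply(g, x):
    """Matrix times column vector."""
    return [sum(g[i][j] * x[j] for j in range(3)) for i in range(3)]

def act(g, P):
    """(pi(g)P)(x) = P(g^{-1} x); for g in SO(3) the inverse is the transpose."""
    g_inv = transpose(g)
    return lambda x: P(apply(g_inv, x))

def Q(x):
    """x1^2 + x2^2 + x3^2, the polynomial spanning V0."""
    return sum(t * t for t in x)

def R(x):
    """x1^2, another element of V2, used for comparison."""
    return x[0] * x[0]

g = rotation_z(0.8)
x = [0.3, -1.2, 2.0]
print(act(g, Q)(x), Q(x))   # equal up to rounding
print(act(g, R)(x), R(x))   # different: R is not fixed by g
```

The comparison with R shows that the invariance is a special property of the sum of squares and not of every element of the space of quadratic polynomials.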

The addition of the continuity condition to the definition of the representation has natural consequences for the notions related to the definition of the linear representation in Sections 1.1 and 1.2: Here only closed subspaces are relevant. Hence in the future, we use definitions modified as follows.

• Suppose that there is a closed subspace $V_1$ of the space V of a representation (π, V ) of G, invariant under all π(g), g ∈ G. Then $\pi_1 := \pi|_{V_1}$ is called a (topological) subrepresentation of π.

• The representation $\pi_2$ on the factor (or quotient) space $V/V_1$ is called a (topological) factor representation.

The representation (π, V ) is called

• (topologically) irreducible iff it admits no nontrivial subrepresentation,

• (topologically) decomposable iff there exist closed invariant subspaces $V_1$ and $V_2$ in V such that $V = V_1 \oplus V_2$. In this case we write $\pi = \pi_1 + \pi_2$, where $\pi_i := \pi|_{V_i}$,

• completely reducible (or discretely decomposable) iff it can be expressed as a direct sum of irreducible subrepresentations,

• unitary iff V is a Hilbert space $\mathcal{H}$ and every π(g) respects the scalar product in $\mathcal{H}$.

• If (π, V ) and (π′, V ′) are continuous representations of G, then a bounded linear map (or operator) F : V −→ V ′ is called an intertwining operator for π and π′ iff one has

$$F\pi(g) = \pi'(g)F \quad \text{for all } g \in G.$$

Then Schur’s Lemma here appears in the following form.

Theorem 3.1: If π and π′ are unitary irreducible representations in the Hilbert spaces $\mathcal{H}$ resp. $\mathcal{H}'$ and $F : \mathcal{H} \longrightarrow \mathcal{H}'$ is an intertwining operator, then F is either an isomorphism of Hilbert spaces or F = 0.

This theorem has as a consequence the following irreducibility criterion, which is a generalization to the infinite-dimensional case of the similar one we proved for the finite-dimensional linear representations without the continuity condition in 1.5. We will call it the unitary Schur:

Theorem 3.2: A unitary representation $(\pi, \mathcal{H})$ is irreducible iff the only operators commuting with all the π(g) are multiples of the identity.

The proofs of these theorems (see e.g. [BR] p. 143/4 or [Kn] p.12) use some tools from functional analysis (the spectral theorem) which we do not touch at this stage.

We indicate another very useful notion (in particular concerning unitary representations).

Definition 3.6: A representation π of G in V is said to be cyclic iff there is a v ∈ V (called a cyclic vector for V ) such that the closure of the linear span of all π(g)v is V itself.

Theorem 3.3: Every unitary representation $(\pi, \mathcal{H})$ of G is the direct sum of cyclic representations.

As the proof is rather elementary and gives some indication of how to work in the infinite-dimensional case and how the continuity condition works, we give it here (following [BR] p.146):

Let $H_{v_1}$ be the closure of the linear span of all $\pi(g)v_1$, $g \in G$, for any $0 \neq v_1 \in H$. Then $H_{v_1}$ is π-invariant: Indeed, let $H'_{v_1}$ be the linear span of all $\pi(g)v_1$; then for each $u \in H_{v_1}$ we have a sequence $(u_n)$, $u_n \in H'_{v_1}$, which converges to $u$. Obviously, one has $\pi(g)u_n \in H'_{v_1}$. The continuity of each $\pi(g)$ implies $\pi(g)u_n \longrightarrow \pi(g)u$ and hence we have $\pi(g)u \in H_{v_1}$ and $H_{v_1}$ is invariant. Thus the subrepresentation $\pi_1 = \pi|_{H_{v_1}}$ is cyclic with cyclic vector $v_1$.

If $H_{v_1} \neq H$, choose $0 \neq v_2 \in H_{v_1}^{\perp}$ (the orthogonal complement of $H_{v_1}$ in $H$), consider the closed linear span $H_{v_2}$, which is π-invariant and orthogonal to $H_{v_1}$, and continue like this if $H \neq H_{v_1} \oplus H_{v_2}$.

Let ξ denote the family of all collections $\{H_{v_i}\}$, each composed of a sequence of mutually orthogonal, invariant and cyclic subspaces. We order the family by means of the inclusion relation. Then ξ is an ordered set to which Zorn's Lemma applies: It assures the existence of a maximal collection $\{H_{v_i}\}_{\max}$. By the separability of $H$, there can be at most a countable number of subspaces in $\{H_{v_i}\}_{\max}$, and their direct sum, by the maximality of $\{H_{v_i}\}_{\max}$, must coincide with $H$.

As a consequence of this theorem, we have a very convenient irreducibility criterion for unitary representations:

Corollary: A unitary representation (π,H) of G is irreducible iff every nonzero u ∈ H is cyclic for π.

Proof: i) If π is irreducible, every 0 ≠ u ∈ H is cyclic, as can be seen from the first part of the proof above: the closure of the linear span of all π(g)u is a nonzero closed invariant subspace and hence equals H.

ii) Conversely, let every nonzero vector be cyclic and suppose that H1 ⊂ H is a nontrivial closed invariant subspace of H and 0 ≠ u ∈ H1. Due to the invariance of H1 we have π(g)u ∈ H1. Moreover, the closure of the linear span of all π(g)u is contained in H1 but by assumption is also equal to H. Hence, we have a contradiction and π is irreducible.

To close these general considerations let us recall the following. If H and H′ are Hilbert spaces with scalar products <,> resp. <,>′, then we can define a scalar product in H ⊗ H′ by the formula

< v ⊗ v′, w ⊗ w′ > := < v, w > < v′, w′ >′.

If either H or H′ is finite-dimensional, then the space H ⊗ H′ equipped with this scalar product is complete. If both H and H′ are infinite-dimensional, we complete H ⊗ H′ with respect to the norm defined by the scalar product above and denote the completion again by H ⊗ H′.

3.3 Invariant Measures

We want to extend the results from finite groups to compact groups. The tool to do this is given by replacing the finite sum $\sum_{g\in G}$ by an integration $\int_G$. To be a bit more precise, one has to recall some notions from measure and integration theory, which moreover are inevitable in all the following sections. For the whole topic we recommend the classical books Measure Theory by Halmos ([Ha]) and Abstract Harmonic Analysis by Hewitt-Ross ([HR]) or, for summaries, the appropriate chapters in [Ki] p.129ff and [BR] p.67ff.

If X is a topological space with the family $\mathcal V = \{O_i\}_{i\in I}$ of open sets, it is also a Borel space with a family B of Borel sets B, where these sets are obtained from open and closed sets by the operations of countable unions, countable intersections and complementation. A little more generally, a Borel space is a set with a family B of subsets B such that B is a σ-algebra, i.e. one has, for every countable index set I,

$$\mathcal X \setminus B \in \mathcal B \ \text{for } B \in \mathcal B, \qquad \bigcap_{i\in I} B_i \in \mathcal B \ \text{for } B_i \in \mathcal B, \qquad \bigcup_{i\in I} B_i \in \mathcal B \ \text{for } B_i \in \mathcal B.$$

Let (X, B) and (X′, B′) be two Borel spaces. A map F : X −→ X′ is called a Borel map iff one has F−1(B′) ∈ B for all B′ ∈ B′.


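Definition 3.7: We call µ a measure on the Borel space (X, B) if µ is a map

$$\mu : \mathcal B \longrightarrow [0,\infty] := \mathbb R_{\geq 0} \cup \{\infty\},$$

where µ is σ-additive, i.e.

i) $\mu(\bigcup_i B_i) = \sum_i \mu(B_i)$ if the $B_i$ are pairwise disjoint (i ∈ I, I countable), and

ii) there is a covering of X by sets $B_i$ of finite measure, i.e. we have $B_i \in \mathcal B$, i ∈ I (I countable), with $\mu(B_i) < \infty$ and $\bigcup_i B_i = \mathcal X$.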

The most obvious example is X = R provided with the Lebesgue measure µ =: λ, which is defined by

λ([a, b)) := b − a for a, b ∈ R, a ≤ b.

As we know from elementary integration theory (or see immediately), λ is translation invariant. This is a special instance of the following general machinery.

Let X be a space with a family V of open sets and a family B of Borel sets B and let µ be a measure on X. We then say (X, B, µ) is a measure space. If we have a right G-action

(G ×X ) −→ X , (g, x) −→ xg,

we can define a measure µg by

µg(B) = µ(Bg) for all B ∈ B.

Then the measure µ is called right-invariant iff one has µg = µ for all g ∈ G. Similarly µ is called left-invariant iff we have a left action and gµ = µ for all g ∈ G, where gµ(B) = µ(gB) for B ∈ B.

Now we take X = G, a locally compact group. Then a right- or left-invariant measure on G is called a right- or left-Haar measure. And we have as central result ([Ki] p.130):

Theorem 3.4 (Haar): On every locally compact group with countable basis for the topology, there exists a nonzero left-invariant (and similarly a right-invariant) measure. It is defined uniquely up to a numerical factor.

We will not go into the proof of the Theorem (see for instance [HR] Ch.IV, 15) but give later on several concrete examples for its validity. The most agreeable case is the following:

Definition 3.8: G is called unimodular iff it has a measure which is simultaneously left- and right-invariant. Such a measure is also called biinvariant or simply invariant.

To make these so far only abstract concepts work in our context, we have to recall a more practical aspect of general integration theory:

Let (X, B, µ) be a measure space and f : X −→ C a measurable function, i.e. f is the sum f = f1 − f2 + i(f3 − f4) of four real nonnegative functions fj (j = 1, .., 4) having the property

$$\{x \in \mathcal X;\ f_j(x) > a\} \in \mathcal B \quad \text{for all } a > 0.$$

For measurable nonnegative functions f general integration theory provides a linear map

$$f \longmapsto \mu(f) = \int_{\mathcal X} f(x)\,d\mu(x) \in \mathbb C \cup \{\infty\}.$$

We call a measurable function f integrable iff $\int_{\mathcal X} |f(x)|\,d\mu(x) < \infty$.

If X = G as above, one usually writes

$$d\mu(g) =: \begin{cases} d_r g & \text{if } \mu \text{ is right-invariant},\\ d_l g & \text{if } \mu \text{ is left-invariant},\\ dg & \text{if } \mu \text{ is biinvariant}. \end{cases}$$

The following fact is central for applications to representation theory.

Proposition 3.1: Let G be a locally compact group. Then there is a continuous homomorphism $\Delta_G$, called the modular function of the group G, into the multiplicative group of positive real numbers for which the following equalities hold:

$$d_r(gg') = \Delta_G(g)\,d_r g', \qquad d_l(gg') = \Delta_G(g')^{-1}\,d_l g,$$

$$d_r g = \mathrm{const}\cdot\Delta_G(g)\,d_l g, \qquad d_l g = \mathrm{const}\cdot\Delta_G(g)^{-1}\,d_r g,$$

$$d_r(g^{-1}) = \Delta_G(g)^{-1}\,d_r g = \mathrm{const}\cdot d_l g, \qquad d_l(g^{-1}) = \Delta_G(g)\,d_l g.$$

As Kirillov does in [Ki] p.130, we propose the proof as an exercise based on the application of Haar's Theorem. There is a proof in [BR] p.68/9, but the modular function there is the inverse of our function, which here is defined following Kirillov.

Proposition 3.2: If G is compact, then the Haar measure is finite and biinvariant.

Proof: The image of a compact group under the continuous homomorphism $\Delta_G$ is again a compact group. The multiplicative group $\mathbb R_{>0}$ has only {1} as a compact subgroup. So by Proposition 3.1, we have a biinvariant measure which can be normalized to

$$\int_G dg = 1.$$

3.4 Examples

Most of the noncompact groups to be treated later on are also unimodular, e.g. the group SL(2,R) or the Heisenberg group Heis(R). We will show here how this can be proved.

Example 3.3: $G = \mathrm{Heis}(\mathbb R) = \{g = (\lambda, \mu, \kappa);\ \lambda, \mu, \kappa \in \mathbb R\}$ with

$$gg' = (\lambda + \lambda',\ \mu + \mu',\ \kappa + \kappa' + \lambda\mu' - \lambda'\mu).$$

As Heis(R) is isomorphic to R3 as a vector space, we try to use the usual Lebesgue measure for R3. So we take

$$dg := d\lambda\, d\mu\, d\kappa.$$

The left translation $\lambda_{g_0} : g \longmapsto g_0 g =: g'$ acts as

$$(\lambda, \mu, \kappa) \longmapsto (\lambda' := \lambda_0 + \lambda,\ \mu' := \mu_0 + \mu,\ \kappa' := \kappa_0 + \kappa + \lambda_0\mu - \lambda\mu_0).$$


The Jacobian of this map is

$$J_{g_0}(g) = \begin{pmatrix} 1 & & -\mu_0 \\ & 1 & \lambda_0 \\ & & 1 \end{pmatrix}$$

with $\det J_{g_0}(g) = 1$. Hence we see that dg is left-invariant. Similarly we prove the right-invariance.
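The determinant computation of Example 3.3 can also be checked numerically. The following is only a sketch: the helper names `heis_mul` and `jacobian_det` and the sample points are illustrative choices, not notation from the text. It approximates the Jacobians of left and right translation by central differences (exact up to rounding here, since both maps are affine in g).

```python
def heis_mul(g, h):
    """Group law of Heis(R) from Example 3.3: g = (lambda, mu, kappa)."""
    l1, m1, k1 = g
    l2, m2, k2 = h
    return (l1 + l2, m1 + m2, k1 + k2 + l1 * m2 - l2 * m1)


def jacobian_det(f, point, eps=1e-6):
    """Determinant of the 3x3 Jacobian of f at `point` (central differences)."""
    cols = []
    for i in range(3):
        plus, minus = list(point), list(point)
        plus[i] += eps
        minus[i] -= eps
        fp, fm = f(tuple(plus)), f(tuple(minus))
        # cols[i][r] approximates d f_r / d x_i
        cols.append([(fp[r] - fm[r]) / (2 * eps) for r in range(3)])
    (a, b, c), (d, e, f3), (p, q, r) = cols
    # The determinant does not change under transposition.
    return a * (e * r - f3 * q) - b * (d * r - f3 * p) + c * (d * q - e * p)


g0 = (0.7, -1.3, 2.1)   # fixed group element (arbitrary sample)
g = (0.4, 0.9, -0.5)    # point at which the Jacobian is evaluated

left_det = jacobian_det(lambda x: heis_mul(g0, x), g)
right_det = jacobian_det(lambda x: heis_mul(x, g0), g)
print(left_det, right_det)
```

Both values should agree with 1 up to rounding, in line with the biinvariance of dλ dµ dκ.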

Remark 3.8: Without going into the general theory of integration on manifolds, we indicate that, for calculations like this, it is very convenient to associate the infinitesimal measure element with an appropriate exterior or alternating differential form and then check whether this form stays invariant. In the example above we take

$$\omega' := d\lambda' \wedge d\mu' \wedge d\kappa'$$

and calculate using the general rules for alternating differential forms (we shall say a bit more about this later in section 8.1), in particular

$$d\mu \wedge d\lambda = -\,d\lambda \wedge d\mu, \qquad d\mu \wedge d\mu = 0,$$

as follows:

$$\omega' \circ \lambda_{g_0} = d\lambda \wedge d\mu \wedge (d\kappa + \lambda_0\, d\mu - \mu_0\, d\lambda) = d\lambda \wedge d\mu \wedge d\kappa = \omega.$$

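Example 3.4: G = SO(2)

We introduced as standard notation

$$SO(2) \ni r(\vartheta) = \begin{pmatrix} \cos\vartheta & \sin\vartheta \\ -\sin\vartheta & \cos\vartheta \end{pmatrix}.$$

Here we have dg = dϑ as biinvariant measure with

$$\int_G dg = \int_0^{2\pi} d\vartheta = 2\pi.$$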

Example 3.5: G = SU(2)

This group will be our main example for the representation theory of compact groups. As standard parametrization we use the same as Wigner does in [Wi] p.158: The general form of a two-dimensional unitary matrix

$$g = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$

of determinant one is

$$g = \begin{pmatrix} a & b \\ -\bar b & \bar a \end{pmatrix}, \quad a, b \in \mathbb C \text{ with } |a|^2 + |b|^2 = 1.$$

With

$$s(\alpha) := \begin{pmatrix} e^{i\alpha} & \\ & e^{-i\alpha} \end{pmatrix} \quad \text{and} \quad r(\vartheta) = \begin{pmatrix} \cos\vartheta & \sin\vartheta \\ -\sin\vartheta & \cos\vartheta \end{pmatrix}$$

one has a decomposition

$$SU(2) \ni g = \begin{pmatrix} a & b \\ -\bar b & \bar a \end{pmatrix} = s(-\alpha/2)\, r(-\beta/2)\, s(-\gamma/2), \quad \alpha, \gamma \in [0, 2\pi],\ \beta \in [0, \pi].$$

In particular, one can see that SU(2) is generated by the matrices s(α), r(β).

Later on we shall see that the angles α, β, γ correspond to the Euler angles for the three-dimensional rotation introduced by Wigner in [Wi] p.152. In Hein [He] one finds all these formulas, but unfortunately in another normalization, namely one has $z_{Hein} = a$, $u_{Hein} = -b$, $\alpha_{Hein} = \beta$, $\beta_{Hein} = -\alpha$, $\gamma_{Hein} = -\gamma$. Our parametrization is one-to-one for α, γ ∈ [0, 2π], β ∈ [0, π] up to a set of measure zero. An invariant normalized measure is given by

$$dg := \frac{1}{16\pi^2}\, \sin\beta \; d\alpha\, d\beta\, d\gamma.$$


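Example 3.6: An example of a group G which is not unimodular is the following:

$$G = \Bigl\{ g = \begin{pmatrix} a & b \\ & 1 \end{pmatrix};\ a \in \mathbb R^*,\ b \in \mathbb R \Bigr\}.$$

We have $gg' = \begin{pmatrix} aa' & ab' + b \\ & 1 \end{pmatrix} =: g''$, hence (for fixed g′)

$$da'' \wedge db'' = a'\,da \wedge (b'\,da + db) = a'\,da \wedge db,$$

so that $d_r g = a^{-1}\,da\,db$ and, from $g'g = \begin{pmatrix} a'a & a'b + b' \\ & 1 \end{pmatrix} =: g^*$ and

$$da^* \wedge db^* = a'^2\, da \wedge db,$$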

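we see $d_l g = a^{-2}\,da\,db$. Hence $d_r g = a\, d_l g$, and comparing with the relation $d_r g = \mathrm{const}\cdot\Delta_G(g)\,d_l g$ from Proposition 3.1 we have here $\Delta_G(g) = |a|$ (with the inverse convention of [BR] mentioned above one gets $|a|^{-1}$).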
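The invariance claims of this example can be tested numerically. This is a sketch under stated assumptions: a density ρ(g) da db is invariant under a map T exactly when ρ(T g)·|det DT(g)| = ρ(g), and the names `mul`, `jac_det`, `is_invariant`, `rho_r`, `rho_l` as well as the sample points are hypothetical helpers introduced only for this check.

```python
def mul(g, h):
    """(a, b) stands for the matrix [[a, b], [0, 1]]."""
    a, b = g
    a2, b2 = h
    return (a * a2, a * b2 + b)


def jac_det(f, g, eps=1e-6):
    """Determinant of the 2x2 Jacobian of f at g (central differences)."""
    a, b = g
    fa_p, fa_m = f((a + eps, b)), f((a - eps, b))
    fb_p, fb_m = f((a, b + eps)), f((a, b - eps))
    d00 = (fa_p[0] - fa_m[0]) / (2 * eps)
    d10 = (fa_p[1] - fa_m[1]) / (2 * eps)
    d01 = (fb_p[0] - fb_m[0]) / (2 * eps)
    d11 = (fb_p[1] - fb_m[1]) / (2 * eps)
    return d00 * d11 - d01 * d10


def is_invariant(density, translate, g, tol=1e-6):
    """Check density(T g) * |det DT(g)| == density(g) at the point g."""
    lhs = density(translate(g)) * abs(jac_det(translate, g))
    return abs(lhs - density(g)) < tol


rho_r = lambda g: 1.0 / abs(g[0])   # density of a^{-1} da db
rho_l = lambda g: 1.0 / g[0] ** 2   # density of a^{-2} da db

g0 = (2.5, -0.7)                    # fixed element g'
g = (1.3, 0.4)                      # test point
right = lambda x: mul(x, g0)        # g -> g g'
left = lambda x: mul(g0, x)         # g -> g' g

print(is_invariant(rho_r, right, g), is_invariant(rho_l, left, g))
print(is_invariant(rho_r, left, g), is_invariant(rho_l, right, g))
```

The first line should print `True True` and the second `False False`: each density is invariant under exactly one of the two translations, and the quotient of the two densities is |a|.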

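Exercise 3.5: Determine $\Delta_G$ for

$$G = \Bigl\{ g = \begin{pmatrix} 1 & x \\ & 1 \end{pmatrix} \begin{pmatrix} y^{1/2} & \\ & y^{-1/2} \end{pmatrix};\ y > 0,\ x \in \mathbb R \Bigr\}.$$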

Chapter 4

Representations of Compact Groups

As already mentioned, the representation theory of compact groups generalizes the theory of finite groups. We shall closely follow the presentation in [BR], and cite from [BR] p.166: “The representation theory of compact groups forms a bridge between the relatively simple representation theory of finite groups and that of noncompact groups. Most of the theorems for the representations of finite groups have direct analogues for compact groups and these results in turn serve as the starting point for the representation theory of noncompact groups.”

4.1 Basic Facts

Let G be a compact group provided with an invariant measure µ normalized such that $\int_G dg = 1$ and let π be a representation of G in a Hilbert space H. We recall that all our representations are meant to be linear and continuous. The following statements are easy to prove for the finite-dimensional case dim H < ∞. Most times this is quite sufficient because it will turn out that all irreducible representations of compact groups are finite-dimensional. To prepare a feeling for the noncompact cases where infinite-dimensionality is essential, we state the results here for general H. But we will report on the proofs only when not too much general Hilbert space theory is needed.

Proposition 4.1: Let (π,H) be a representation of G in the Hilbert space H with scalar product <,>′. Then there exists a new scalar product <,> defining a norm equivalent to the initial one, relative to which the map g −→ π(g) defines a unitary representation of G.

Proof: i) As is to be expected from the case of finite groups, we define <,> by

$$\langle u, v \rangle := \int_G \langle \pi(g)u, \pi(g)v \rangle'\, dg.$$

Sesquilinearity and hermiticity of <,> are obvious. Let us check that we can conclude u = 0 from

$$\langle u, u \rangle = \int_G \langle \pi(g)u, \pi(g)u \rangle'\, dg = 0:$$

From $\int_G \langle \pi(g)u, \pi(g)u \rangle'\, dg = 0$ one deduces that $\langle \pi(g)u, \pi(g)u \rangle' = 0$ almost everywhere. If g ∈ G is such that π(g)u = 0, then $\pi(g)^{-1}\pi(g)u = u = 0$.

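ii) <,> is π-invariant:

$$\langle \pi(g')u, \pi(g')v \rangle = \int_G \langle \pi(g)\pi(g')u, \pi(g)\pi(g')v \rangle'\, dg = \int_G \langle \pi(gg')u, \pi(gg')v \rangle'\, dg = \langle u, v \rangle,$$

because dg is right-invariant (recall that dg is biinvariant since G is compact).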

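iii) The norms ‖ . ‖ and ‖ . ‖′ induced by <,> resp. <,>′ are equivalent: We have

$$\| u \|^2 = \int_G \langle \pi(g)u, \pi(g)u \rangle'\, dg \leq \sup_{g\in G} \| \pi(g) \|^2 \int_G \langle u, u \rangle'\, dg = N^2 \| u \|'^2$$

with $N := \sup_{g\in G} \| \pi(g) \|$. And from

$$\| u \|'^2 = \langle \pi(g^{-1})\pi(g)u, \pi(g^{-1})\pi(g)u \rangle' \leq \sup_{g\in G} \| \pi(g^{-1}) \|^2\, \langle \pi(g)u, \pi(g)u \rangle' = N^2 \| \pi(g)u \|'^2$$

it follows that

$$\| u \|'^2 = \int_G \langle u, u \rangle'\, dg \leq N^2 \int_G \langle \pi(g)u, \pi(g)u \rangle'\, dg = N^2 \langle u, u \rangle = N^2 \| u \|^2.$$

Hence we have $N^{-1} \| u \|' \leq \| u \| \leq N \| u \|'$ and ‖ . ‖ and ‖ . ‖′ are equivalent.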

iv) Equivalent norms define the same families of open balls and thus equivalent topologies, i.e. if g −→ π(g)v, v ∈ H, is continuous for the topology belonging to ‖ . ‖′, it is also continuous for the topology deduced from ‖ . ‖.
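The averaging argument is easiest to see in the finite analogue that the proof refers to, where the integral over G becomes a normalized sum. The sketch below is an assumption-laden illustration, not part of the text: it takes the cyclic group of order 3 acting on R² by a rotation conjugated with a non-orthogonal matrix S, forms the averaged Gram matrix M (so that the new scalar product is u ↦ uᵗMv), and checks that every group element preserves it. All names (`S`, `reps`, `gram_after`, `M`) are hypothetical.

```python
import math


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]


def rot(t):
    return [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]


# A representation of Z/3 that is NOT orthogonal for the standard scalar product.
S = [[1.0, 1.0], [0.0, 1.0]]
S_inv = [[1.0, -1.0], [0.0, 1.0]]
n = 3
reps = [matmul(matmul(S, rot(2 * math.pi * k / n)), S_inv) for k in range(n)]


def gram_after(P, M):
    """Gram matrix of the form (u, v) -> <P u, P v>_M, i.e. P^t M P."""
    return matmul(matmul(transpose(P), M), P)


identity = [[1.0, 0.0], [0.0, 1.0]]
# Averaged scalar product: M = (1/n) * sum_k pi(k)^t pi(k)
M = [[sum(gram_after(P, identity)[i][j] for P in reps) / n for j in range(2)]
     for i in range(2)]


def close(A, B, tol=1e-9):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))


print(all(close(gram_after(P, M), M) for P in reps))      # invariance of the new form
print(close(gram_after(reps[1], identity), identity))     # the old form is not invariant
```

The first check mirrors step ii) of the proof: translating the summation index plays the role of the invariance of dg.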

We now turn to the already announced result that every unitary irreducible representation is finite-dimensional. In its proof we use an interesting tool, namely the Weyl operator $K_u$ defined for elements u, v from a Hilbert space H with π-invariant scalar product by

$$K_u v := \int_G \langle \pi(g)u, v \rangle\, \pi(g)u \, dg.$$

Remark 4.1: The Weyl operator has the following properties:

i) $K_u$ is bounded,

ii) $K_u$ commutes with every π(g), i.e. $K_u \pi(g) = \pi(g) K_u$.


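Proof: We perform the standard calculations using the Cauchy–Schwarz inequality.

i) $$\| K_u v \| \leq \int_G |\langle \pi(g)u, v \rangle|\, \| \pi(g)u \|\, dg \leq \int_G \| \pi(g)u \|^2 \| v \|\, dg = \| u \|^2 \| v \|,$$

ii) $$\begin{aligned} \pi(g')K_u v &= \int_G \langle \pi(g)u, v \rangle\, \pi(g'g)u \, dg \\ &= \int_G \langle \pi(g'g)u, \pi(g')v \rangle\, \pi(g'g)u \, dg \\ &= \int_G \langle \pi(g)u, \pi(g')v \rangle\, \pi(g)u \, dg = K_u \pi(g')v \end{aligned}$$

for all g′ ∈ G and v ∈ H.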

Theorem 4.1: Every irreducible unitary representation π of G in a Hilbert space H is finite-dimensional.

Proof: By the last remark, $K_u$ is an intertwining operator for π. Hence by Schur's Lemma, $K_u$ is a homothety, i.e. $K_u = \lambda(u)\,\mathrm{id}_H$ and, in consequence,

$$\langle K_u v, v \rangle = \int_G \langle v, \pi(g)u \rangle \langle \pi(g)u, v \rangle\, dg = \int_G |\langle \pi(g)u, v \rangle|^2\, dg = \lambda(u) \| v \|^2. \tag{4.1}$$

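By interchanging the roles of u and v and using the equality (which is equivalent to the unimodularity of G)

$$\int f(g^{-1})\, dg = \int f(g)\, dg,$$

we get

$$\begin{aligned} \lambda(v) \| u \|^2 &= \int |\langle \pi(g)v, u \rangle|^2\, dg = \int |\langle u, \pi(g)v \rangle|^2\, dg \\ &= \int |\langle \pi(g^{-1})v, u \rangle|^2\, dg = \int |\langle \pi(g)u, v \rangle|^2\, dg \\ &= \lambda(u) \| v \|^2. \end{aligned}$$

Hence we have $\lambda(u) = c \| u \|^2$ for all u ∈ H with a constant c ∈ C. In (4.1) we put u = v, ‖ v ‖ = 1 and get

$$\int |\langle \pi(g)u, u \rangle|^2\, dg = \lambda(u) = c.$$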

As the nonnegative continuous function $g \longmapsto |\langle \pi(g)u, u \rangle|$ assumes the value $\| u \|^2 = 1$ at g = e, we must have c > 0.

Let $\{e_i\}_{i=1,\dots,n}$ be a set of ON-vectors in H. In (4.1) we now put $u = e_k$ and $v = e_1$ to obtain

$$\int |\langle \pi(g)e_k, e_1 \rangle|^2\, dg = \lambda(e_k) \| e_1 \|^2 = c$$

and

$$nc = \sum_{k=1}^{n} \int_G |\langle \pi(g)e_k, e_1 \rangle|^2\, dg = \int_G \sum_{k=1}^{n} |\langle \pi(g)e_k, e_1 \rangle|^2\, dg \leq \int_G \| e_1 \|^2\, dg = 1.$$

The last inequality is a special case of Bessel's inequality (the inequality form of Parseval's relation), saying that for a unitary matrix representation g −→ A(g) one has $\sum_k |A_{k1}(g)|^2 \leq 1$. So finally, one has n ≤ 1/c, i.e. the dimension of H must stay finite.

We already proved that a finite-dimensional unitary representation is completely reducible. This result can be sharpened for compact groups G:

Theorem 4.2: Every unitary representation π of G is a direct sum of irreducible finite-dimensional unitary subrepresentations.

The proof uses some more tools from functional analysis and so we skip it here (see [BR] p.169/170). But we mention the appearance of the important notion of a Hilbert–Schmidt operator, which is an operator A in H such that for an arbitrary orthonormal basis $\{e_i\}_{i\in I}$ of H we have

$$\sum_{i\in I} \| Ae_i \|^2 < \infty.$$

Again, as in the theory of finite groups, we have ON-relations for the matrix elements of an irreducible representation.

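Theorem 4.3: Let π and π′ be two irreducible unitary representations of G and A(g) resp. A′(g) their matrices with respect to a basis $\{e_i\}$ of H and $\{e'_k\}$ of H′. Then one has the relations

$$\int_G A_{ij}(g)\,\overline{A'_{kl}(g)}\, dg = \begin{cases} 0 & \text{if } \pi \not\sim \pi', \\ (1/n)\,\delta_{ik}\delta_{jl} & \text{if } \pi \sim \pi' \text{ and } n := \dim H. \end{cases}$$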

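Proof: As is to be expected, one applies Schur's Lemma (Theorem 3.1): We introduce a matrix $f_{ij}$ with entries $(f_{ij})_{kl} = \delta_{ik}\delta_{jl}$ and an operator

$$F_{ij} := \int_G \pi(g)\, f_{ij}\, \pi'(g^{-1})\, dg.$$

This is an operator intertwining the representations π′ and π since we have

$$\pi(g')F_{ij} = \int_G \pi(g'g)\, f_{ij}\, \pi'(g^{-1})\, dg = \int_G \pi(g)\, f_{ij}\, \pi'(g^{-1}g')\, dg = F_{ij}\,\pi'(g').$$

Hence, if $\pi \not\sim \pi'$, we have $F_{ij} = 0$ or in matrix form for all (r, t)

$$(F_{ij})_{rt} = \int A_{ri}(g)\, A'_{jt}(g^{-1})\, dg = \int A_{ri}(g)\, \overline{A'_{tj}(g)}\, dg = 0.$$

For π = π′, Schur's Lemma requires $F_{ij} = \lambda_{ij}\,\mathrm{id}$. Hence for $(r, t) \neq (i, j)$, the orthogonality relations just obtained are still satisfied. For (r, t) = (i, j), we have

$$(F_{ij})_{ij} = \int A_{ii}(g)\,\overline{A_{jj}(g)}\, dg = \lambda_{ij} = 0 \ \text{ if } i \neq j, \qquad (F_{ii})_{ii} = \int |A_{ii}(g)|^2\, dg = \lambda_{ii} \ \text{ if } i = j.$$

We have $\mathrm{Tr}\, F_{ii} = \mathrm{Tr}(\lambda_{ii}\,\mathrm{id}) = n\lambda_{ii}$ and from the definition of $F_{ij}$,

$$\mathrm{Tr}\, F_{ii} = \int_G \mathrm{Tr}\bigl(A(g)\, f_{ii}\, A(g^{-1})\bigr)\, dg = \mathrm{Tr}\, f_{ii} = 1$$

and hence $\lambda_{ii} = 1/n$.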

Similar to the case of finite groups we have a nice fact:

Theorem 4.4: Every irreducible unitary representation π of G is equivalent to a subrepresentation of the right regular representation.

Proof: Let $A(g) = (A_{jk}(g))_{j,k=1,\dots,n}$ be a matrix form of π(g) and let $H_\pi$ be the subspace of $L^2(G)$ spanned by the vectors $e_k$ with $e_k(g) := \sqrt{n}\, A_{1k}(g)$, k = 1, . . . , n (these are an ON-system by the preceding theorem). We have

$$\rho(g_0)e_k(g) = e_k(gg_0) = \sqrt{n}\, A_{1k}(gg_0) = \sqrt{n}\, \sum_j A_{1j}(g)A_{jk}(g_0) = \sum_j e_j(g)\, A_{jk}(g_0).$$

Hence ρ restricted to $H_\pi$ is a subrepresentation equivalent to π.

Exercise 4.1: Show the same fact for the left regular representation.

Remark 4.2: With the same reasoning as in the case of finite groups, the characters $\chi_\pi$ of finite-dimensional irreducible unitary representations π, defined by $\chi_\pi(g) := \mathrm{Tr}\,\pi(g)$ resp. $\chi_\pi(g) := \sum_i A_{ii}(g) = \sum_i \langle A(g)e_i, e_i \rangle$ for a matrix form with respect to a basis $e_1, \dots, e_n$, have the following properties.

1) The characters are class functions, i.e. $\chi(g_0 g g_0^{-1}) = \chi(g)$ for all $g, g_0 \in G$.

2) We have $\chi(g^{-1}) = \overline{\chi(g)}$.

3) If π ∼ π′, then $\chi_\pi = \chi_{\pi'}$.

4) $\int \chi_\pi(g)\,\overline{\chi_{\pi'}(g)}\, dg = 0$ if $\pi \not\sim \pi'$ and $= 1$ if π ∼ π′.

If π is any finite-dimensional representation of G, then we can decompose it into irreducible representations $\pi_i$ appearing with multiplicities $m_i$ and we have

$$\chi_\pi = \sum_{i=1}^{h} m_i\, \chi_{\pi_i}.$$

Due to the ON-relation 4) in Remark 4.2 above, we obtain

$$m_i = \int_G \chi_\pi(g)\,\overline{\chi_{\pi_i}(g)}\, dg$$

and

$$\sum_{i=1}^{h} m_i^2 = \int_G \chi_\pi(g)\,\overline{\chi_\pi(g)}\, dg.$$

This formula specializes to the very useful irreducibility criterion:

Corollary: The finite-dimensional representation π is irreducible iff

$$\int_G \chi_\pi(g)\,\overline{\chi_\pi(g)}\, dg = 1.$$
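As a small illustration of the criterion, one can take G = SO(2) with the normalized measure dϑ/2π and evaluate the integral numerically. This is a sketch only: `character_norm` is a hypothetical helper, the midpoint rule stands in for the integral, and the two characters used (the trace 2 cos ϑ of the matrix r(ϑ) acting on C², and the one-dimensional character e^{2iϑ}) are chosen as examples.

```python
import cmath
import math


def character_norm(chi, steps=2000):
    """Midpoint-rule approximation of (1/2pi) * int_0^{2pi} |chi(t)|^2 dt."""
    h = 2 * math.pi / steps
    total = sum(abs(chi((n + 0.5) * h)) ** 2 for n in range(steps))
    return total * h / (2 * math.pi)


chi_rotation = lambda t: 2 * math.cos(t)   # Tr r(t) for the 2x2 rotation matrices
chi_k = lambda t: cmath.exp(2j * t)        # a one-dimensional character of SO(2)

norm_rotation = character_norm(chi_rotation)
norm_k = character_norm(chi_k)
print(norm_rotation, norm_k)
```

The first value comes out close to 2 = 1² + 1², reflecting that the two-dimensional rotation representation splits over C into two inequivalent one-dimensional pieces, while the second is close to 1 as the corollary predicts for an irreducible representation.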

The big difference to the theory of finite groups is that for compact groups the number of equivalence classes of irreducible representations is no longer necessarily finite. We already know this from the example G = SO(2) in section 3.2. The central fact is now the famous Peter–Weyl Theorem:

Theorem 4.5: Let $\hat G = \{\pi_i\}_{i\in I}$ be the unitary dual of G, i.e. a complete set of representatives of the equivalence classes of irreducible unitary representations of G. For every i ∈ I let $A^i(g) = (A^i_{jk}(g))_{j,k=1,\dots,n_i}$ be a matrix form of $\pi_i$ and

$$Y^i_{jk}(g) := \sqrt{n_i}\; A^i_{jk}(g).$$

Then we have

1) The functions $Y^i_{jk}$, i ∈ I, j, k = 1, . . . , $n_i$, form a complete ON-system in $L^2(G)$.

2) Every irreducible unitary representation π of G occurs in the decomposition of the right regular representation with a multiplicity equal to the dimension of π, i.e.

$$\mathrm{mult}\,(\pi_i, \rho) = n_i.$$

3) Every C-valued continuous function f on G can be uniformly approximated by a linear combination of the $(Y^i_{jk})$.

4) The characters $(\chi_{\pi_i})_{i\in I}$ generate a dense subspace in the space of continuous class functions on G.

The proof of these facts uses standard techniques from higher analysis. We refer to [BR] p. 173 - 176 or [BtD] p.134.

4.2 The Example G = SU(2)

Representation theory owes a lot to the work of H. Weyl and E. Wigner. As we have the feeling that we cannot do better, we will discuss the example G = SU(2) following Wigner's presentation in [Wi] p.163 - 166:

We put

$$G = SU(2) \ni g = \begin{pmatrix} a & b \\ -\bar b & \bar a \end{pmatrix}, \quad |a|^2 + |b|^2 = 1,$$

and for j a half integral nonnegative number, we let $V^{(j)}$ be the space $\mathbb C[x, y]_{2j}$ of homogeneous polynomials of degree 2j in two variables. Thus we have $\dim V^{(j)} = 2j + 1$. We choose as a basis of $V^{(j)}$ monomials $f_p$, which are conveniently normalized by

$$f_p(x, y) := \frac{x^{j+p}}{\sqrt{(j+p)!}}\, \frac{y^{j-p}}{\sqrt{(j-p)!}}, \quad p = -j, -j+1, \dots, j.$$

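We define the representation $\pi_j$ by the action

$$\Bigl(g, \begin{pmatrix} x \\ y \end{pmatrix}\Bigr) \longmapsto g^{-1} \begin{pmatrix} x \\ y \end{pmatrix},$$

i.e. we put

$$\pi_j(g) f_p(x, y) := f_p(\bar a x - b y,\ \bar b x + a y),$$

and get by a straightforward computation

$$\pi_j(g) f_p(x, y) = \sum_{p'} f_{p'}(x, y)\, A^j_{p'p}(g)$$

with

$$A^j_{p'p}(g) = \sum_{k=0}^{j+p} (-1)^k\, \frac{\sqrt{(j+p)!\,(j-p)!\,(j+p')!\,(j-p')!}}{(j-p'-k)!\,(j+p-k)!\,k!\,(k+p'-p)!}\; a^{j-p'-k}\, \bar a^{\,j+p-k}\, b^k\, \bar b^{\,k+p'-p}$$

(terms with a factorial of a negative number in the denominator are omitted), in particular

$$A^j_{jp}(g) = \sqrt{\frac{(2j)!}{(j+p)!\,(j-p)!}}\; \bar a^{\,j+p}\, \bar b^{\,j-p}. \tag{4.2}$$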
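The closed formula for the matrix elements can be evaluated directly and tested against the two properties one expects from it, namely unitarity and multiplicativity A(gh) = A(g)A(h). The following is a sketch under stated assumptions: `wigner_matrix` and the small matrix helpers are illustrative names, the argument `two_j` stands for 2j, the rows and columns are ordered by p′, p = −j, …, j, and the sum over k runs over those k for which all factorial arguments are nonnegative.

```python
import cmath
import math
from fractions import Fraction


def wigner_matrix(two_j, a, b):
    """Matrix (A^j_{p'p}(g)) for g = [[a, b], [-conj(b), conj(a)]], j = two_j / 2."""
    j = Fraction(two_j, 2)
    ps = [-j + n for n in range(two_j + 1)]
    fact = lambda q: math.factorial(int(q))   # q is always a nonnegative integer here
    rows = []
    for pp in ps:                             # pp plays the role of p'
        row = []
        for p in ps:
            total = 0j
            k_min = max(0, int(p - pp))
            k_max = int(min(j - pp, j + p))
            for k in range(k_min, k_max + 1):
                coeff = math.sqrt(fact(j + p) * fact(j - p) * fact(j + pp) * fact(j - pp))
                coeff /= fact(j - pp - k) * fact(j + p - k) * fact(k) * fact(k + pp - p)
                total += ((-1) ** k * coeff
                          * a ** int(j - pp - k) * a.conjugate() ** int(j + p - k)
                          * b ** k * b.conjugate() ** int(k + pp - p))
            row.append(total)
        rows.append(row)
    return rows


def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def dagger(A):
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]


def max_dev(A, B):
    n = len(A)
    return max(abs(A[i][j] - B[i][j]) for i in range(n) for j in range(n))


# Two sample elements of SU(2), given by their first rows (a, b) and (c, d).
a, b = math.cos(0.3) * cmath.exp(0.7j), math.sin(0.3) * cmath.exp(-1.1j)
c, d = math.cos(1.2) * cmath.exp(-0.4j), math.sin(1.2) * cmath.exp(2.0j)
# First row of the product g h.
ac, bd = a * c - b * d.conjugate(), a * d + b * c.conjugate()

A1 = wigner_matrix(2, a, b)   # j = 1
unit = [[1 if i == k else 0 for k in range(3)] for i in range(3)]
print(max_dev(mat_mul(A1, dagger(A1)), unit))
print(max_dev(wigner_matrix(2, ac, bd), mat_mul(A1, wigner_matrix(2, c, d))))
```

Both printed deviations should be at the level of rounding errors. For j = 1/2 the routine returns the matrix with rows (a, −b) and (b̄, ā), which can be read off by hand from π(g)x = āx − by and π(g)y = b̄x + ay.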

Exercise 4.2: Verify these formulae (at least) for j = 1.

Theorem 4.6: These representations $\pi_j$, j ∈ (1/2)N0, are unitary, irreducible, and, up to equivalence, there are no other such representations of SU(2).

Proof: i) Unitarity

Wigner's proof relies on the fact that the polynomials $f_p$ are normalized so that we have

$$\sum_{p=-j}^{j} f_p \bar f_p = \sum_{p=-j}^{j} \frac{1}{(j+p)!\,(j-p)!}\, |x|^{2(j+p)}\, |y|^{2(j-p)} = \frac{(|x|^2 + |y|^2)^{2j}}{(2j)!}.$$

By a similar computation, we conclude from the definition of $\pi_j$

$$\sum_{p=-j}^{j} |\pi_j(g) f_p(x, y)|^2 = \sum_{p=-j}^{j} \frac{|\bar a x - b y|^{2(j+p)}\, |\bar b x + a y|^{2(j-p)}}{(j+p)!\,(j-p)!} = \frac{(|\bar a x - b y|^2 + |\bar b x + a y|^2)^{2j}}{(2j)!} = \frac{(|x|^2 + |y|^2)^{2j}}{(2j)!},$$

where the last equality follows from the unitarity of g. So we have the invariance

$$\sum_p |\pi(g) f_p|^2 = \sum_p |f_p|^2.$$

Substituting here $\pi(g) f_p = \sum_{p'} f_{p'} A^j_{p'p}(g)$, we get

$$\sum_p \Bigl( \sum_{p'} f_{p'} A^j_{p'p}(g) \Bigr) \overline{\Bigl( \sum_{p''} f_{p''} A^j_{p''p}(g) \Bigr)} = \sum_p f_p \bar f_p.$$

If we know that the $(2j + 1)^2$ functions $f_{p'} \bar f_{p''}$ are linearly independent, we see that we have

$$\sum_p A^j_{p'p}(g)\, \overline{A^j_{p''p}(g)} = \sum_p A^j_{p'p}(g)\; {}^t\!\bar A^j_{pp''}(g) = \delta_{p'p''},$$

i.e. the unitarity of $A^j(g)$. Hence we still have to show the linear independence of the functions $f_{p'} \bar f_{p''}$: We look at a relation

$$\sum_{p',p''} c_{p'p''}\; x^{j+p'} y^{j-p'}\, \bar x^{\,j+p''} \bar y^{\,j-p''} = 0, \quad c_{p'p''} \in \mathbb C.$$

Then in particular for x ∈ R, the coefficient of $x^q$ with q = 2j + p′ + p′′ has to vanish. After division by $y^j \bar y^{\,3j-q}$ this requires

$$\sum_{p'} c_{p',\,q-2j-p'}\, (\bar y / y)^{p'} = 0.$$

But this condition implies that $c_{p',\,q-2j-p'} = 0$ (and thus the linear independence of the $f_{p'} \bar f_{p''}$), because we can write $\bar y / y = e^{it}$, t ∈ R, and a relation

$$\sum_{p'} c_{p',\,q-2j-p'}\, e^{itp'} = 0 \quad \text{for all } t \in \mathbb R$$

requires that all the coefficients c must vanish.

ii) Irreducibility

Wigner's proof of the irreducibility of $\pi_j$ is by direct application of the criterion which here appears as Corollary to the finite-dimensional unitary Schur (Theorem 1.3) in Section 1.5, namely by showing that any matrix M which commutes with all $A^j(g)$ must necessarily be a constant matrix: We take

$$g = s(-\alpha/2) = \begin{pmatrix} e^{-i\alpha/2} & \\ & e^{i\alpha/2} \end{pmatrix}.$$

Then the matrix $A^j(g)$ specializes to the diagonal matrix

$$A^j_{p'p}(s(-\alpha/2)) = \delta_{p'p}\, e^{ip\alpha}. \tag{4.3}$$

It is a standard fact that each matrix M commuting with all such matrices must be diagonal. Just after the introduction of the matrices $A^j(g)$ we remarked (in (4.2)) that no element in the last row of $A^j(g)$ vanishes identically. Then by equating the elements of the j-th row of $A^j(g)M$ and $MA^j(g)$ we conclude that

$$A^j_{jk}(g)\, M_{kk} = M_{jj}\, A^j_{jk}(g), \quad \text{i.e. } M_{jj} = M_{kk},$$

and M is a scalar multiple of the unit matrix.

One can also find a proof by application of the criterion given in the Corollary at the end of Section 4.1 using the character of $\pi_j$. We will see this as a byproduct of the following third part.

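iii) Completeness

Wigner's proof that there are, up to equivalence, no other irreducible representations of SU(2) than the $\pi_j$ can be seen as a direct proof of (part of) the Peter–Weyl Theorem from 4.1 in this special case:

Characters are class functions. Since each unitary matrix can be diagonalized by conjugation with a unitary matrix, in each conjugacy class of SU(2) we find a diagonal matrix of the type

$$s(\alpha) = \begin{pmatrix} e^{i\alpha} & \\ & e^{-i\alpha} \end{pmatrix}, \quad \alpha \in [0, 2\pi).$$

Conjugation by $w = \begin{pmatrix} & 1 \\ -1 & \end{pmatrix}$ transforms

$$\begin{pmatrix} & 1 \\ -1 & \end{pmatrix} \begin{pmatrix} \zeta^{-1} & \\ & \zeta \end{pmatrix} \begin{pmatrix} & -1 \\ 1 & \end{pmatrix} = \begin{pmatrix} \zeta & \\ & \zeta^{-1} \end{pmatrix}.$$

Hence every conjugacy class is exactly represented by a matrix

$$s(\alpha) := \begin{pmatrix} e^{i\alpha} & \\ & e^{-i\alpha} \end{pmatrix}, \quad \alpha \in [0, \pi), \quad \text{resp. } s(\alpha/2),\ \alpha \in [0, 2\pi).$$

As we have (see (4.3) in ii))

$$A^j_{p'p}(s(-\alpha/2)) = \delta_{p'p}\, e^{ip\alpha},$$

its trace is

$$\chi_{\pi_j}(s(-\alpha/2)) = \sum_{p=-j}^{j} e^{ip\alpha} =: \xi_j(\alpha).$$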

So we have

ξ0(α) = 1,

ξ1/2(α) = e−iα/2 + eiα/2 = 2 cos(α/2),

ξ1(α) = e−iα + 1 + eiα = 2 cos α + 1,

ξ3/2(α) − ξ1/2(α) = 2 cos(3α/2), etc.

It is now evident that SU(2) can have no other representations than the $\pi_j$, j ∈ (1/2)N0: For the character of such a representation must, after multiplication by a weighting function, be orthogonal to all $\xi_j$, and therefore to $\xi_0, \xi_{1/2}, \xi_1 - \xi_0, \xi_{3/2} - \xi_{1/2}, \dots$. But a function which is orthogonal to 1, 2 cos(α/2), 2 cos α, 2 cos(3α/2), . . . in [0, 2π) must vanish according to Fourier's Theorem.
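The weighting function is not written out in the argument above. A common choice, assumed here and not taken from the text, is the Weyl integration formula for SU(2), which for a class function f reads ∫ f dg = (1/π) ∫₀^{2π} f(s(α/2)) sin²(α/2) dα. Under this assumption the orthonormality of the ξ_j can be checked numerically; `xi` and `inner` are hypothetical helper names and `two_j` stands for 2j.

```python
import math


def xi(two_j, alpha):
    """xi_j(alpha) = sum_{p=-j}^{j} e^{i p alpha}; the imaginary parts cancel."""
    return sum(math.cos((n - two_j / 2) * alpha) for n in range(two_j + 1))


def inner(two_j, two_k, steps=4000):
    """(1/pi) * int_0^{2pi} xi_j * xi_k * sin^2(alpha/2) d alpha (midpoint rule)."""
    h = 2 * math.pi / steps
    total = 0.0
    for n in range(steps):
        al = (n + 0.5) * h
        total += xi(two_j, al) * xi(two_k, al) * math.sin(al / 2) ** 2
    return total * h / math.pi


for two_j in range(4):
    print([round(inner(two_j, two_k), 6) for two_k in range(4)])
```

The printed rows should form the 4×4 unit matrix up to rounding: the diagonal entries reproduce the irreducibility criterion from Section 4.1 for π_0, π_{1/2}, π_1, π_{3/2}, and the off-diagonal zeros are the orthogonality used in the completeness argument.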

Exercise 4.3: Verify the irreducibility of $\pi_j$ using the character criterion (Corollary to Remark 4.2 in Section 4.1).

The explicit determination of the characters $\chi_{\pi_j} = \xi_j$ of the irreducible representations $\pi_j$ is very useful if one wants to decompose a given representation π.

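Example 4.1: For $\pi = \pi_1 \otimes \pi_1$, by Theorem 1.6 in Section 1.6, we have

$$\chi_\pi(s(\alpha/2)) = \chi_{\pi_1}(s(\alpha/2))^2 = (e^{i\alpha} + 1 + e^{-i\alpha})^2 = \xi_2(\alpha) + \xi_1(\alpha) + \xi_0(\alpha),$$

i.e.

$$\pi_1 \otimes \pi_1 = \pi_0 + \pi_1 + \pi_2.$$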
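The bookkeeping in this example is purely combinatorial: one multiplies the two characters as sums of exponentials e^{ipα} and then repeatedly splits off the ξ_ℓ belonging to the largest remaining exponent. The following sketch does exactly this; the function names `weights` and `decompose_product` are illustrative and not from the text.

```python
from collections import Counter
from fractions import Fraction


def weights(j):
    """Exponents p occurring in xi_j = sum_p e^{i p alpha}: p = -j, -j+1, ..., j."""
    j = Fraction(j)
    return [-j + n for n in range(int(2 * j) + 1)]


def decompose_product(j1, j2):
    """Multiplicities m_l with xi_{j1} * xi_{j2} = sum_l m_l * xi_l."""
    # Exponents of the product character, counted with multiplicity.
    mult = Counter(p + q for p in weights(j1) for q in weights(j2))
    result = {}
    while any(count > 0 for count in mult.values()):
        top = max(w for w, count in mult.items() if count > 0)
        result[top] = result.get(top, 0) + 1
        for w in weights(top):          # remove one copy of xi_top
            mult[w] -= 1
    return result


print(decompose_product(1, 1))
print(decompose_product(Fraction(1, 2), 1))
```

For j1 = j2 = 1 the result contains ℓ = 0, 1, 2 with multiplicity one each, as in Example 4.1; for (1/2, 1) it contains ℓ = 1/2 and ℓ = 3/2.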

Exercise 4.4: Verify the decomposition ([Wi] p.185)

$$\pi_j \otimes \pi_{j'} = \sum_{\ell=|j-j'|}^{j+j'} \pi_\ell.$$


4.3 The Example G = SO(3)

One can construct representations of SO(3) by a procedure similar to the one used in the case of SU(2), namely by using homogeneous polynomials in three variables. But as we already remarked in Example 3.2 in Section 3.2, we do not get irreducible representations without some further considerations. So let us discuss this a bit later and first follow a method suggested by H. Weyl, deducing the representation theory of SO(3) from the result just obtained in 4.2 and the fact that SU(2) is a double cover of SO(3):

The elements g of SO(3) are real 3×3-matrices consisting of three ON-rows (or columns) $(a_1, a_2, a_3)$, i.e. we have six conditions $\langle a_j, a_k \rangle = \delta_{jk}$ for the nine entries in g. Hence we expect three free parameters. It is a classical fact that the elements of SO(3) are parametrized by three Euler angles α, β, γ. Again we follow Wigner's notation ([Wi] p.152):

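Proposition 4.2: For every g ∈ SO(3) there exist angles α, β, γ ∈ R, such that

$$g = S_3(\alpha)\, S_2(\beta)\, S_3(\gamma).$$

Here $S_3(\alpha)$ and $S_3(\gamma)$ are rotations about the z-axis (or $e_3 \in \mathbb R^3$) through α resp. γ and $S_2(\beta)$ is a rotation about the y-axis (or $e_2 \in \mathbb R^3$), i.e. we put

$$S_3(\alpha) := \begin{pmatrix} \cos\alpha & -\sin\alpha & \\ \sin\alpha & \cos\alpha & \\ & & 1 \end{pmatrix}, \quad S_2(\alpha) := \begin{pmatrix} \cos\alpha & & -\sin\alpha \\ & 1 & \\ \sin\alpha & & \cos\alpha \end{pmatrix}, \quad S_1(\alpha) := \begin{pmatrix} 1 & & \\ & \cos\alpha & -\sin\alpha \\ & \sin\alpha & \cos\alpha \end{pmatrix}.$$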


Remark 4.3: i) There are several notational conventions for the Euler angles in common use. In Hein [He] p.56 we find a statement (with proof) analogous to our Proposition 4.2 with $S_2(\beta)$ replaced by $S_1(\beta)$. For a parametrization of SO(3) one needs only α, γ ∈ [0, 2π) and β ∈ [0, π]. This parametrization is not everywhere injective: for β = 0, α and γ are only fixed up to their sum. Goldstein [Go] p.145-148 gives an overview of other descriptions.

ii) SO(3) is generated by rotations $S_3(t)$ and $S_2(t)$ or by rotations of type $S_3(t)$ and $S_1(t)$.

The key fact for our discussion here is given by the following theorem (we shall later see that this is a special case of an analogous statement about SL(2,C) covering the (restricted) Lorentz group SO(3, 1)+):


Theorem 4.7: There is a surjective homomorphism

$$\varphi : SU(2) \longrightarrow SO(3)$$

with $\ker \varphi = \{\pm E_2\}$. This homomorphism can be chosen such that

$$\varphi(s(\alpha/2)) = S_3(\alpha) \quad \text{and} \quad \varphi(r(-\beta/2)) = S_2(\beta).$$

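Proof: As in [Wi] p.158 we consider the Pauli matrices

$$s_x := \begin{pmatrix} & 1 \\ 1 & \end{pmatrix}, \quad s_y := \begin{pmatrix} & i \\ -i & \end{pmatrix}, \quad s_z := \begin{pmatrix} -1 & \\ & 1 \end{pmatrix}.$$

The three-dimensional R-vector space V of hermitian 2 × 2-matrices H with trace zero is the linear span of the Pauli matrices

$$V := \{H \in M_2(\mathbb C);\ {}^t\bar H = H,\ \mathrm{Tr}\, H = 0\} = \Bigl\{H = \begin{pmatrix} -z & x + iy \\ x - iy & z \end{pmatrix} = x s_x + y s_y + z s_z;\ x, y, z \in \mathbb R\Bigr\}.$$

G = SU(2) acts on V by conjugation, i.e. we have a map

$$G \times V \longrightarrow V, \quad (g, H) \longmapsto gHg^{-1} = gH\,{}^t\bar g =: H',$$

which for H as above and $g = \begin{pmatrix} a & b \\ -\bar b & \bar a \end{pmatrix}$ gives by a straightforward computation

$$H' = \begin{pmatrix} -z' & x' + iy' \\ x' - iy' & z' \end{pmatrix}$$

with

$$\begin{aligned} x' &= (1/2)(a^2 + \bar a^2 - b^2 - \bar b^2)\,x + (i/2)(a^2 - \bar a^2 + b^2 - \bar b^2)\,y + (ab + \bar a \bar b)\,z, \\ y' &= (i/2)(\bar a^2 - a^2 + b^2 - \bar b^2)\,x + (1/2)(a^2 + \bar a^2 + b^2 + \bar b^2)\,y + i(\bar a \bar b - ab)\,z, \\ z' &= -(a\bar b + \bar a b)\,x + i(\bar a b - a \bar b)\,y + (a\bar a - b\bar b)\,z. \end{aligned} \tag{4.4}$$

E.g. from the fact that the map det is multiplicative, we deduce

$$-\det H = x^2 + y^2 + z^2 = -\det H' = x'^2 + y'^2 + z'^2$$

and if we write (4.4) as a matrix relation with column ${}^t(x, y, z) = \mathbf x$,

$$\mathbf x' = A(g)\,\mathbf x,$$

we see that multiplication with A(g) respects the euclidean quadratic form, i.e. we have A(g) ∈ O(3). Moreover, by a continuity argument, we have A(g) ∈ SO(3): as every g ∈ SU(2) can be deformed continuously to $g = E_2$ and det is a continuous function, we cannot have det A(g) = −1.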


From the fact that conjugation is a group action, we see that

$$\varphi : g \longmapsto A(g) \in SO(3)$$

is a group homomorphism. As $\ker \varphi = \{g \in SU(2);\ gH\,{}^t\bar g = H \text{ for all } H \in V\}$, we can deduce easily that $\ker \varphi = \{\pm E_2\}$ (by choosing appropriate matrices H).

The fact that ϕ is surjective is more delicate: If g = s(α/2), i.e. $a = e^{i\alpha/2}$, b = 0, we see from (4.4) that we have

$$A(s(\alpha/2)) = S_3(\alpha).$$

If g = r(β/2), we see from (4.4) that we have

$$A(r(\beta/2)) = S_2(-\beta).$$

We already know that SO(3) is generated by matrices of type $S_2$ and $S_3$ and so we are done.

Now we use this result to describe representations of SO(3) :

If a representation π : SU(2) −→ GL(V) factorizes through SO(3), i.e. one has a homomorphism π′ : SO(3) −→ GL(V) with π = π′ ∘ ϕ resp. a commutative diagram

    SU(2) ----π----> GL(V)
       \              ^
      ϕ \            / π′
         v          /
          SO(3) ---

then, this way, π induces the representation π′ of SO(3). Obviously this is the case for a representation π with $\pi(-E_2) = \mathrm{id}$, where we get a unique prescription by putting

π′(A) := π(g) if A = ϕ(g).

For $\pi(-E_2) = -\mathrm{id}$ one gets a double valued representation by putting

π′(A) := ±π(g) if A = ϕ(g) = ϕ(−g).

(This is a special case of a projective representation, which will be discussed below.)

In consequence, we know the unitary dual of SO(3): The representations $\pi_j$ of SU(2) with j ∈ N0 are exactly those with $\pi(-E_2) = \pi(E_2) = \mathrm{id}$ and thus induce (up to equivalence) all irreducible unitary representations π′ of SO(3): If there were any π′ not in this family, its composition with ϕ would produce a representation of SU(2) which by our completeness result for SU(2) would be equivalent to a $\pi_j$.

For g ∈ SU(2) and ϕ(g) = g′, Wigner ([Wi] p.167) conjugates the $\pi'(g') := \pi_j(g) = A^j(g)$ from 4.2 by the diagonal matrix $M = ((i)^{-2k}\delta_{kl})$ and calls the result $D^j(\alpha, \beta, \gamma)$ with α, β, γ representing the Euler angles of g′ ∈ SO(3) in the parametrization chosen above. Hence we have from the explicit form of the matrices $A^j(g)$ obtained in 4.2 the explicit form

$$D^j(\alpha, \beta, \gamma)_{p'p} = \sum_k (-1)^k\, \frac{\sqrt{(j+p)!\,(j-p)!\,(j+p')!\,(j-p')!}}{(j-p'-k)!\,(j+p-k)!\,k!\,(k+p'-p)!} \times e^{ip'\alpha}\, (\cos(\beta/2))^{2j+p-p'-2k}\, (\sin(\beta/2))^{2k+p'-p}\, e^{ip\gamma}.$$


The characters of these (projective) representations are

$$\chi_{\pi'_j}(S_3(\alpha)) = \sum_{p=-j}^{j} e^{ip\alpha} = \begin{cases} 1 + 2\cos\alpha + \dots + 2\cos(j\alpha), & j \text{ integral}, \\ 2\cos(\alpha/2) + 2\cos(3\alpha/2) + \dots + 2\cos(j\alpha), & j \text{ half-integral}. \end{cases}$$

The double valued representations $\pi'_j$, j half-integral, led to H. Weyl's explanation of the spin of an electron. As said above, they are special examples of projective representations, which are very natural for reasons from physics: In quantum theory the state of a system is not so much described by an element v of a Hilbert space H but by an element $v_\sim$ of the associated projective space P(H) consisting of the one-dimensional subspaces of H. In general, one writes for a K-vector space V $P(V) := \{v_\sim;\ v \in V,\ v \neq 0\}$ where $v_\sim$ is the class of v for the equivalence v ∼ v′ if there is a λ ∈ K∗ such that v′ = λv. Having this in mind, it is appropriate to introduce the following.

Definition 4.1: A projective representation π of G in the K-vector space V is a map

π : G −→ GL(V),

which enjoys the property

π(gg′) = c(g, g′)π(g)π(g′),

where

c : G × G −→ K∗

with

c(g, g′)c(gg′, g′′) = c(g, g′g′′)c(g′, g′′) for all g, g′, g′′ ∈ G.

c is also called a system of multipliers.

Remark 4.4: The functional equation for c is equivalent to the associativity

π(g(g′g′′)) = π((gg′)g′′).


Every linear representation is also a projective representation. And analogously to the case of linear representations, one also adds a continuity condition and discusses in general these representations.

In the second part of his book [Wi], Wigner describes the application of these results to the theory of spin and atomic spectra. Though we are tempted to do more, from this we only take over that, roughly, an irreducible unitary (projective) representation $\pi'_j$ of SO(3) corresponds to a particle having three-dimensional rotational symmetry. The number j specifying the representation then is its angular momentum quantum number or its spin, and the indices numbering a basis of the representation space are interpreted as magnetic quantum numbers. We strongly recommend that the reader interested in this topic look into Wigner's book or any text from physics.


To finish this chapter, we briefly indicate another approach to the determination of the representations of G = SO(3), which is a first example for the large theory of eigenspace representations promoted in particular by Helgason.

From analysis it is well known that the three-dimensional euclidean Laplace operator

$$\Delta = \partial_x^2 + \partial_y^2 + \partial_z^2$$

is SO(3) invariant, i.e. for f ∈ C∞(R3) we have

$$(\Delta f)(g\mathbf x) = \Delta\bigl(f(g\mathbf x)\bigr) \quad \text{for all } g \in G,\ \mathbf x = {}^t(x, y, z) \in \mathbb R^3.$$

Hence the space V_ℓ of homogeneous polynomials of degree ℓ in three variables, which satisfy the Laplace equation

∆f = 0,

is invariant under the action f −→ f^g with f^g(x) = f(g^{−1}x) of G = SO(3). We denote by D^ℓ(g) the matrix form of the representation π of G on V_ℓ given by π(g)f = f^g. To solve the Laplace equation, one usually introduces spherical coordinates (r, ϕ, ϑ) with

x = r sin ϑ cos ϕ,

y = r sin ϑ sin ϕ,

z = r cos ϑ.

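Homogeneous polynomials f of degree ℓ in x, y, z have the form f = r^ℓ Y(ϑ, ϕ). If this f is introduced into the Laplace equation written in the spherical coordinates

\[ \Bigl(\partial_r^2 + \frac{2}{r}\,\partial_r + \frac{1}{r^2}\Bigl(\partial_\vartheta^2 + \cot\vartheta\,\partial_\vartheta + \frac{1}{(\sin\vartheta)^2}\,\partial_\varphi^2\Bigr)\Bigr) f = 0, \]

r drops out and the differential equation

\[ \Bigl(\ell(\ell+1) + \partial_\vartheta^2 + \cot\vartheta\,\partial_\vartheta + \frac{1}{(\sin\vartheta)^2}\,\partial_\varphi^2\Bigr) Y = 0 \]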

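in the variables (ϑ, ϕ) results. The (2ℓ+1) linearly independent solutions Y = Y_{ℓm} (with m = −ℓ, −ℓ+1, . . . , ℓ−1, ℓ) are known as spherical harmonics of the ℓ-th degree. They have the form

\[ Y_{\ell m}(\vartheta, \varphi) = \Phi_m(\varphi)\,\Theta_{\ell m}(\vartheta) \]

where

\[ \Phi_m(\varphi) := \frac{1}{\sqrt{2\pi}}\, e^{im\varphi}, \]

\[ \Theta_{\ell m}(\vartheta) := (-1)^m \Bigl(\frac{2\ell+1}{2}\,\frac{(\ell-m)!}{(\ell+m)!}\Bigr)^{1/2} (\sin\vartheta)^m\, \frac{d^m}{d(\cos\vartheta)^m}\, P_\ell(\cos\vartheta), \]

\[ \Theta_{\ell,-m}(\vartheta) := \Bigl(\frac{2\ell+1}{2}\,\frac{(\ell-m)!}{(\ell+m)!}\Bigr)^{1/2} (\sin\vartheta)^m\, \frac{d^m}{d(\cos\vartheta)^m}\, P_\ell(\cos\vartheta), \]

with m ≥ 0 in the last two equations. The P_ℓ are the Legendre polynomials defined by

\[ P_\ell(x) := \frac{1}{2^\ell\, \ell!}\, \frac{d^\ell}{dx^\ell}\,(x^2 - 1)^\ell. \]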


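This leads to the matrix form D^ℓ for π′_ℓ with

\[ \pi'_\ell(g)\, r^\ell Y_{\ell m}(\vartheta, \varphi) = \sum_{m'=-\ell}^{\ell} r^\ell Y_{\ell m'}(\vartheta, \varphi)\, D^\ell_{m'm}(g). \]

As one easily deduces, we have in particular

\[ D^\ell_{m'm}(S_3(\alpha)) = e^{im\alpha}\,\delta_{m'm}. \]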

By applying Schur’s commutation criterion, one can prove here, as we did above, that D^ℓ is irreducible for all ℓ = 0, 1, 2, . . . .
The conjugacy classes of SO(3) are parametrized by the angle α, α ∈ [0, π). For the characters we obtain the formula

\[ \chi^{(\ell)}(S_3(\alpha)) = \sum_{m=-\ell}^{\ell} e^{im\alpha} = 1 + 2\cos\alpha + \cdots + 2\cos(\ell\alpha). \]

Again we can see that we got all irreducible representations, as (cos nϕ)_{n∈N0} is a complete system in L2([0, π)) by Fourier’s Theorem.
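The two expressions for the character can be compared numerically. The following is a minimal sketch using only the standard library; the function names `character` and `character_cosines` are illustrative.

```python
import cmath
import math

def character(l, alpha):
    """chi^(l)(S_3(alpha)) as the sum of e^{i m alpha} for m = -l, ..., l."""
    return sum(cmath.exp(1j * m * alpha) for m in range(-l, l + 1))

def character_cosines(l, alpha):
    """The same character written as 1 + 2 cos(alpha) + ... + 2 cos(l alpha)."""
    return 1 + 2 * sum(math.cos(n * alpha) for n in range(1, l + 1))

# Compare both expressions for a few degrees l and angles alpha.
for l in range(4):
    for alpha in (0.1, 1.0, 2.5):
        difference = abs(character(l, alpha) - character_cosines(l, alpha))
        print(l, alpha, difference)
```

At α = 0 both expressions reduce to the dimension 2ℓ + 1 of the representation.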

Chapter 5

Representations of Abelian Groups

We have seen that
– finite groups have (up to equivalence) finitely many irreducible representations and these are all finite-dimensional, and
– compact groups have (up to equivalence) denumerably many irreducible representations and these are also all finite-dimensional.
Abelian topological groups G (with prototype G = R or = C) are something of an extreme in the opposite direction. Here we know that unitary irreducible representations are all one-dimensional (we proved this as a consequence of the “Unitary Schur” in 1.5). But in general the unitary dual is not denumerable. Since in this text we are mainly interested in groups like SL(2,R) and Heis(R), where unitary irreducible representations will turn out to be infinite-dimensional, we discuss here only briefly some notions of general interest, following [BR] p.159 - 165 and [Ki] p.167 - 173.

5.1 Characters and the Pontrjagin Dual

Definition 5.1: A character of an abelian locally compact group G is a continuous function χ : G −→ C, which satisfies

| χ(g) |= 1 and χ(gg′) = χ(g)χ(g′) for all g, g′ ∈ G,

i.e. a character is a (one-dimensional) continuous irreducible unitary representation of G.

Remark 5.1: We already defined the unitary dual Ĝ of any group G as the set of equivalence classes of unitary irreducible representations of G. For G abelian, Ĝ consists just of all characters of G. It is also called the Pontrjagin dual of G.

Remark 5.2: For G abelian, Ĝ is also a group: We define the composition of two characters χ and χ′ by

(χχ′)(g) := χ(g)χ′(g) for all g ∈ G.

If G is a topological group, Ĝ is also. More precisely, if G is locally compact (resp. compact, resp. discrete), Ĝ is locally compact (resp. discrete, resp. compact). One has G ≅ (Ĝ)ˆ, i.e. G is isomorphic to the dual of its dual.


Example 5.1: For G = Rn every character χ has the form

χ(x) = exp(i(x̂1x1 + · · · + x̂nxn)) = exp(i < x̂, x >), with x̂ ∈ Rn, for x ∈ Rn,

i.e. we have Ĝ = G.

Example 5.2: For G = S1 = {ζ ∈ C; |ζ| = 1} every character has the form

χ(ζ) = ζ^k, k ∈ Z,

as we discussed already in 3.2. I.e., here we have Ĝ = Z.

Example 5.3: For G = Z, every character has the form

χ(m) = e^{imϕ}, ϕ ∈ [0, 2π),

i.e. we have Ĝ = S1 (and, hence, by Example 5.2, the dual of Ĝ is again Z).

5.2 Continuous Decompositions

While studying representation theory for compact groups, we generalized the Fourier series

\[ f(t) = \sum_{k\in\mathbb{Z}} c(k)\,e^{ikt}, \qquad c(k) := \frac{1}{2\pi}\int_0^{2\pi} e^{-ikt} f(t)\,dt \]

for G = SO(2) ≅ S1 to other compact groups. Now, the analogue of Fourier series for G = R is the Fourier transform associating to a sufficiently nice function f its Fourier transform

\[ Ff(k) = \hat f(k) := \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} \chi_k(t)\,f(t)\,dt, \qquad \chi_k(t) := e^{ikt}. \tag{5.1} \]
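As a numerical illustration of (5.1), the following minimal sketch approximates the integral by a midpoint rule and applies it to the Gaussian f(t) = e^{−t²/2}, whose transform with this normalization is again e^{−k²/2}. The truncation interval, the step count and the names `fourier_transform` and `gaussian` are assumptions made for the demonstration.

```python
import cmath
import math

def fourier_transform(f, k, half_width=12.0, steps=4800):
    """Midpoint-rule approximation of (1/sqrt(2 pi)) * integral of e^{ikt} f(t) dt
    over [-half_width, half_width]; adequate only for rapidly decaying f."""
    h = 2 * half_width / steps
    total = 0j
    for n in range(steps):
        t = -half_width + (n + 0.5) * h
        total += cmath.exp(1j * k * t) * f(t)
    return total * h / math.sqrt(2 * math.pi)

def gaussian(t):
    """Test function f(t) = exp(-t^2 / 2)."""
    return math.exp(-t * t / 2)

for k in (0.0, 0.5, 1.0, 2.0):
    print(k, fourier_transform(gaussian, k), math.exp(-k * k / 2))
```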

This may be interpreted in the following way: In the case of compact groups we had a (denumerable) direct sum decomposition of the regular representation ρ into irreducible subrepresentations πj

\[ \rho = \sum_{j\in J} \pi_j . \]

For G = R we analogously have the regular representation ρ given on H = L2(R, dx) by

ρ(t)f(x) = f(x + t) for all x, t ∈ R.

Suppose H1 were the space of a one-dimensional subrepresentation. Then we would have for every f1 ∈ H1

ρ(t)f1(x) = f1(x + t) = λ(t)f1(x) for all x, t ∈ R.

Hence f1 has to be an exponential function and we get a contradiction, because L2(R) contains no exponential function ≠ 0. But (5.1) indicates that ρ can be decomposed into a direct integral of the irreducible one-dimensional representations χk, k ∈ R. For a proper generalization one needs here more tools from functional analysis, in particular the replacement of the usual measure (as recalled in 3.3) by the notion of a spectral measure E(.), i.e. a measure which takes operators as values and allows one to decompose (linear) operators into a direct integral. Following [BR] p.649 resp. [Ki] p.57, we present them briefly as they are also essential for subsequent chapters.

Spectral Measure

Let [a, b] ⊂ R be a finite or infinite interval and E be a function on [a, b] with values E(λ) for λ ∈ [a, b], which are operators in a Hilbert space H. E is called a spectral function if it satisfies the following conditions:
i) E(λ) is self-adjoint for each λ,
ii) E(λ)E(µ) = E(min(λ, µ)),
iii) the operator function E is strongly right continuous, i.e. one has

\[ \lim_{\varepsilon\to 0+} E(\lambda + \varepsilon)u = E(\lambda)u \quad\text{for all } u \in H, \]

iv) E(−∞) = 0, E(∞) = Id, i.e. one has

\[ \lim_{\lambda\to-\infty} E(\lambda)u = 0, \qquad \lim_{\lambda\to\infty} E(\lambda)u = u \quad\text{for all } u \in H. \]

Conditions i) and ii) mean that E(λ), λ ∈ [a, b], are (refining Definition 1.7) orthogonal projection operators. For an interval ∆ = [λ1, λ2] ⊂ [a, b] one denotes the difference E(λ2) − E(λ1) by E(∆).
For ∆1, ∆2 ⊂ [a, b] one has

E(∆1)E(∆2) = E(∆1 ∩ ∆2),

in particular

E(∆1)E(∆2) = 0 if ∆1 ∩ ∆2 = ∅,

i.e. the subspaces H1 = E(∆1)H and H2 = E(∆2)H are orthogonal.

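Example 5.4: We take G = R, H = L2(R) and π(a)u(x) := u(x + a) for u ∈ H. Then

\[ \mathbb{R} \ni \lambda \longmapsto E(\lambda) := (2\pi)^{-1}\int_{-\infty}^{\lambda} d\lambda' \int_{-\infty}^{\infty} e^{-i\lambda' a}\,\pi(a)\,da \]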

is a spectral function.

The properties of the spectral function imply that, for any u ∈ H, the function

σu(λ) := < E(λ)u, u >

is a right continuous, non-decreasing function of bounded variation with

σu(−∞) = 0, σu(∞) = ‖u‖².

Moreover it is denumerably additive: For a pairwise disjoint decomposition (∆n) with ∆ = ∪_{n=1}^{∞} ∆n, one has

\[ E(\Delta) = \sum_{n=1}^{\infty} E(\Delta_n) \quad\text{and}\quad \sigma_u(\Delta) = \sum_{n=1}^{\infty} \sigma_u(\Delta_n). \]

The function σu is called the spectral measure. We now can state the fundamental theorem of spectral decomposition theory as follows:


Theorem 5.1: Every self-adjoint operator A in H has the representation

\[ A = \int_{-\infty}^{\infty} \lambda\, dE(\lambda). \]
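In finite dimensions the spectral function is a step function and the integral in Theorem 5.1 reduces to a finite sum over its jumps. The following is a minimal sketch for the symmetric matrix A = (2 1; 1 2), with eigenvalues 1 and 3; the matrix and all names (`E`, `JUMPS`, `integral_lambda_dE`) are illustrative assumptions, not taken from the text.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def scale(s, a):
    return [[s * v for v in row] for row in a]

ZERO = [[0.0, 0.0], [0.0, 0.0]]
# Orthogonal projections onto the eigenlines of A = [[2, 1], [1, 2]]:
P1 = [[0.5, -0.5], [-0.5, 0.5]]   # eigenvalue 1, eigenvector (1, -1)
P3 = [[0.5, 0.5], [0.5, 0.5]]     # eigenvalue 3, eigenvector (1, 1)
JUMPS = [(1.0, P1), (3.0, P3)]

def E(lam):
    """Spectral function: sum of the projections with eigenvalue <= lam."""
    result = ZERO
    for eigenvalue, proj in JUMPS:
        if eigenvalue <= lam:
            result = add(result, proj)
    return result

def integral_lambda_dE():
    """Stieltjes integral of lambda dE(lambda), here a finite sum over the jumps of E."""
    result = ZERO
    for eigenvalue, proj in JUMPS:
        result = add(result, scale(eigenvalue, proj))
    return result

print(integral_lambda_dE())  # recovers the matrix A
```

The function `E` is right continuous by construction, since a jump at an eigenvalue is already included at that value of λ.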

Continuous Sums of Hilbert Spaces

The operation of direct sum of Hilbert spaces admits a generalization: Let there be given a set X with measure µ and a family (Hx)_{x∈X} of Hilbert spaces. It is natural to try to define a new space

\[ H = \int_X H_x\, d\mu(x), \]

the elements of which are to be functions f on X, assuming values in Hx for all x ∈ X, and the scalar product being defined by the formula

\[ \langle f_1, f_2 \rangle := \int_X \langle f_1(x), f_2(x) \rangle_{H_x}\, d\mu(x). \]

The difficulty consists in the fact that the integrand in this expression may be non-measurable. In the case where all of the spaces Hx are separable, one can introduce an appropriate definition of measurable vector-functions f, which will guarantee this measurability of the numerical function x −→ < f1(x), f2(x) >: In the special example that all Hx coincide with a fixed separable H0 with basis (ej), one requires that the numerical functions x −→ < f(x), ej > are measurable (for more general cases see [Ki] p.57). Once an appropriate concept of measurable vector-functions has been adopted one defines the continuous sum (or direct integral) H = ∫_X Hx dµ(x) as the set of equivalence classes of measurable vector-functions f with summable square of norms.

An operator A in H = ∫_X Hx dµ(x) is called decomposable iff there exists a family A(x) of operators in the spaces Hx such that

(Af)(x) = A(x)f(x) almost everywhere on X.

By a suitable generalization of the notion of spectral measure to the dual Ĝ (which for an abelian locally compact group G can be given the structure of a topological space), one has the general statement:

Theorem 5.2: Let π be a unitary continuous representation of an abelian locally compact group G in a Hilbert space H. Then there exists a spectral measure E(.) on the character group Ĝ such that

\[ \pi(g) = \int_{\hat G} \chi(g)\, dE(\chi). \]

This theorem is called the SNAG theorem, as it goes back to Stone, Naimark, Ambrose, and Godement. One can find a proof in [BR] p.160ff. We also recommend the chapter on locally compact commutative groups in [Ma] p.37ff, which is centered on the notion of projection valued measure. This is a function E −→ PE associating a projection operator PE in H to each Borel set E from a Borel space S with the properties
i) P∅ = 0, PS = Id,
ii) PE∩F = PEPF for all E and F,
iii) PE = Σ_{j∈N} PEj for every disjoint decomposition E = ∪_{j∈N} Ej.

Chapter 6

The Infinitesimal Method

Up to now, with the exception of the Schrödinger representation of the Heisenberg group, all irreducible representations treated here were finite-dimensional. Before we go further into the construction of infinite-dimensional representations, we discuss a method which allows a linearization of our objects and hence simplifies the task of determining the structure of the representations and classifying them. As we shall see, this is helpful for compact and noncompact groups alike. The idea is to associate to the linear topological group G a linear object, its Lie algebra g = Lie G, and to study representations π̂ of these algebras, which are open to purely algebraic and combinatorial studies. One can associate to each representation π of G an infinitesimal representation dπ of g. Conversely, one can ask which representations π̂ of g may be integrated to a unitary representation π of G, i.e., which are of the form π̂ = dπ, π ∈ Ĝ. As we will see in several examples, this method allows us to classify the (unitary) irreducible representations and, hence, gives a parametrization of Ĝ. It will further prove to be helpful for the construction of explicit models for representations (π, H).

6.1 Lie Algebras and their Representations

We introduce some algebraic notions, which will become important in the sequel. The interested reader can learn more about these in any text book on Algebra (e.g. Lang’s Algebra [La]).

Definition 6.1: Let K be a field. A K-algebra is a K-vector space A provided with a K-bilinear map (multiplication)

A × A −→ A, (x, y) −→ xy.

A is called

– associative iff one has

(xy)z = x(yz) for all x, y, z ∈ A,

– commutative iff one has

xy = yx for all x, y ∈ A,


– a Lie algebra iff the multiplication xy, usually written as xy =: [x, y] (the Lie bracket), is anti-symmetric, i.e.

[x, y] = −[y, x] for all x, y ∈ A,

and fulfills the Jacobi identity, i.e. one has

[x, [y, z]] + [y, [z, x]] + [z, [x, y]] = 0 for all x, y, z ∈ A.

– a Jordan algebra iff one has

xy = yx and x²(xy) = x(x²y) for all x, y ∈ A,

– a Poisson algebra iff A is a commutative ring with multiplication

A × A −→ A, (x, y) −→ xy

and one has a bilinear map

A × A −→ A, (x, y) −→ {x, y}

(the Poisson bracket) fulfilling the identities

{x, y} = −{y, x},
{x, yz} = {x, y}z + y{x, z},
{x, {y, z}} = {{x, y}, z} + {y, {x, z}} for all x, y, z ∈ A.

The dimension of A as a K-vector space is called the dimension of the algebra.

A map ϕ : A −→ A′ is an algebra homomorphism iff it is a K-linear map and respects the respective composition, i.e. if one has

ϕ(xy) = ϕ(x)ϕ(y) for all x, y ∈ A.

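Exercise 6.1: Show that C∞(R2n) is a Poisson algebra with

\[ \{f, g\} := \sum_{i=1}^{n} \Bigl( \frac{\partial f}{\partial q_i}\frac{\partial g}{\partial p_i} - \frac{\partial f}{\partial p_i}\frac{\partial g}{\partial q_i} \Bigr) \quad\text{for all } f, g \in C^\infty(\mathbb{R}^{2n}), \]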

where the coordinates are denoted by (q1, . . . , qn, p1, . . . , pn) ∈ R2n.

Exercise 6.2: Verify that each Poisson algebra is also a Lie algebra.

In the following, we are mainly interested in Lie algebras over K = R or K = C, which we usually denote by the letter g. We recommend as background any text book on this topic, for instance Humphreys’ Introduction to Lie Algebras and Representation Theory [Hu].

Example 6.1: Every Mn(K) is an associative noncommutative algebra with composition XY of two elements X, Y ∈ Mn(K) given by matrix multiplication.


Example 6.2: Every associative algebra A is a Lie algebra with

[X,Y ] = XY − Y X for all X, Y ∈ A,

the commutator of X and Y . In particular, one denotes

gl(n,K) := Mn(K).

Example 6.3: If V is any K-vector space, the space End V of linear maps F from V to V is a Lie algebra with

[F,G] := FG − GF for all F,G ∈ EndV.

Example 6.4: If g is a Lie algebra and g0 is a subspace with [X, Y] ∈ g0 for all X, Y ∈ g0, then g0 is again a Lie algebra. One easily verifies this and that

sl(n,K) := {X ∈ gl(n,K); Tr X = 0},
so(n) := {X ∈ gl(n,R); X = −ᵗX},
su(n) := {X ∈ gl(n,C); X = −ᵗX̄, Tr X = 0}

are examples of Lie algebras.

Example 6.5: If A is a K-algebra,

Der A := {D : A −→ A K-linear; D(ab) = a Db + Da b for all a, b ∈ A}

is a Lie algebra, the algebra of derivations.

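Exercise 6.3: Verify that the matrices

\[ F := \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad G := \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \quad H := \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \]

are a basis of sl2(K) with the relations

[H, F] = 2F, [H, G] = −2G, [F, G] = H.

Exercise 6.4: Verify that the matrices

\[ X_1 := \frac{1}{2}\begin{pmatrix} 0 & -i \\ -i & 0 \end{pmatrix}, \quad X_2 := \frac{1}{2}\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \quad X_3 := \frac{1}{2}\begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix} \]

are a basis of su(2) with the relations

[Xi, Xj] = −Xk, (i, j, k) = (1, 2, 3), (2, 3, 1), (3, 1, 2).

Exercise 6.5: Verify that the matrices

\[ X_1 := \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}, \quad X_2 := \begin{pmatrix} 0 & 0 & -1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}, \quad X_3 := \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \]

are a basis of so(3) with the relations

[Xi, Xj] = −Xk, (i, j, k) = (1, 2, 3), (2, 3, 1), (3, 1, 2).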
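The commutation relations of these three exercises can be checked mechanically. The following is a minimal sketch with hand-written matrix helpers; the su(2) basis is named X1, X2, X3 as in the text, while the so(3) basis is renamed S1, S2, S3 here only to avoid a clash of names.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(a, b):
    """Commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def scale(s, a):
    return [[s * v for v in row] for row in a]

# Basis of sl(2)
F = [[0, 1], [0, 0]]
G = [[0, 0], [1, 0]]
H = [[1, 0], [0, -1]]

# Basis of su(2)
X1 = scale(0.5, [[0, -1j], [-1j, 0]])
X2 = scale(0.5, [[0, -1], [1, 0]])
X3 = scale(0.5, [[1j, 0], [0, -1j]])

# Basis of so(3), renamed S1, S2, S3 for this sketch
S1 = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
S2 = [[0, 0, -1], [0, 0, 0], [1, 0, 0]]
S3 = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]

print(bracket(H, F), bracket(F, G))      # 2F and H
print(bracket(X1, X2), scale(-1, X3))    # both equal -X3
print(bracket(S1, S2), scale(-1, S3))    # both equal -S3
```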

Exercise 6.6: Show that g = R3 provided with the usual vector product as composition is a Lie algebra isomorphic to so(3).

Exercise 6.7: Show that g = R3 provided with the composition

((p, q, r), (p′, q′, r′)) −→ (0, 0, 2(pq′ − p′q))

is a Lie algebra, the Heisenberg algebra heis(R), with basis

P = (1, 0, 0), Q = (0, 1, 0), R = (0, 0, 1)

and relations

[R, P] = [R, Q] = 0, [P, Q] = 2R.
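For the Heisenberg algebra the Lie algebra axioms can be tested directly on sample vectors. The following is a minimal sketch with integer triples; the helper names `heis_bracket`, `vec_add` and `jacobi` are illustrative. It is a spot check on a few vectors and does not replace the proof asked for in Exercise 6.7.

```python
import itertools

def heis_bracket(u, v):
    """Composition on R^3 from Exercise 6.7:
    ((p, q, r), (p', q', r')) -> (0, 0, 2(pq' - p'q))."""
    return (0, 0, 2 * (u[0] * v[1] - v[0] * u[1]))

def vec_add(*vectors):
    """Componentwise sum of several triples."""
    return tuple(sum(coords) for coords in zip(*vectors))

def jacobi(x, y, z):
    """[x,[y,z]] + [y,[z,x]] + [z,[x,y]], which should be the zero vector."""
    return vec_add(heis_bracket(x, heis_bracket(y, z)),
                   heis_bracket(y, heis_bracket(z, x)),
                   heis_bracket(z, heis_bracket(x, y)))

P, Q, R = (1, 0, 0), (0, 1, 0), (0, 0, 1)
print(heis_bracket(P, Q))                      # (0, 0, 2), i.e. 2R
print(heis_bracket(R, P), heis_bracket(R, Q))  # both zero
print(jacobi((1, 2, 3), (-4, 0, 5), (2, -1, 7)))
```

Since every bracket has vanishing first and second component, all iterated brackets are zero, which is why the Jacobi identity holds trivially here.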

Remark 6.1: It will soon be clear that gl(n,R), sl(n,R), so(n), su(n), and heis(R) are associated to the groups GL(n,R), SL(n,R), SO(n), SU(n), resp. Heis(R).

In analogy with the concept of the linear representation of a group G in a vector space V, one introduces the following notion.

Definition 6.2: A representation of the Lie algebra g in V is a Lie algebra homomorphism π : g −→ End V, i.e. π is K-linear and one has

π([X,Y ]) = [π(X), π(Y )] for all X,Y ∈ g.

Here we have no topology and, hence, no continuity condition.

Each Lie algebra has a trivial representation π = π0 with π0(X) = 0 for all X ∈ g andan adjoint representation π = ad given by

adX(Y ) := [X, Y ] for all X,Y ∈ g.


In analogy with the respective notions for group representations, we have the following concepts.
– (π0, V0) is a subrepresentation of (π, V) iff V0 ⊂ V is a π-invariant subspace, i.e. π(X)v0 ∈ V0 for all X ∈ g and v0 ∈ V0, and π0 = π|V0.
– π is irreducible iff π has no nontrivial subrepresentation.
If π1 and π2 are representations of g, we have
– the direct sum π1 ⊕ π2 as the representation on V1 ⊕ V2 given by

(π1 ⊕ π2)(X)(v1, v2) := (π1(X)v1, π2(X)v2) for all v1 ∈ V1, v2 ∈ V2,

and
– the tensor product π1 ⊗ π2 as the representation on V1 ⊗ V2 given by

(π1 ⊗ π2)(X)(v1 ⊗ v2) := π1(X)v1 ⊗ v2 + v1 ⊗ π2(X)v2 for all v1 ∈ V1, v2 ∈ V2.


Exercise 6.9: Verify that these are in fact representations.

Exercise 6.10: Find the prescription to define a contragredient representation (π∗, V∗) to a given representation (π, V) of g.

There is a lot of material about the structure and representations of Lie algebras, much of it going back to Élie and Henri Cartan. As already indicated, a good reference for this is the book by Humphreys [Hu]. We will later need some of this and develop parts of it. Here we only mention that one comes up with facts like this: If g is semisimple (i.e. has no nonzero abelian ideals), then each finite-dimensional representation π of g is completely reducible.
Before we discuss more examples, we establish the relation between linear groups and their Lie algebras.

6.2 The Lie Algebra of a Linear Group

In Section 3.1 we introduced the concept of a linear group G as a closed subgroup of a matrix group GL(n,R) or GL(n,C) for a suitable n ∈ N. To such a group we associate a Lie algebra g = Lie G using as essential tool the exponential function for matrices. We rely again on some experience from calculus in several complex variables and we define the exponential function exp by

\[ \exp X := \sum_{k=0}^{\infty} \frac{1}{k!}\, X^k \quad\text{for all } X \in M_n(\mathbb{C}). \]

Remark 6.2: One has the following facts:

1. The series exp X is absolutely convergent for each X ∈ Mn(C).
2. exp : Mn(C) −→ Mn(C) is continuously differentiable.
3. There is an open neighbourhood U of 0 ∈ Mn(R), which exp maps diffeomorphically onto an open neighbourhood V of E ∈ GL(n,R).
4. log X := Σ_{k≥1} ((−1)^{k+1}/k)(X − E)^k converges for ‖X − E‖ < 1 and we have

exp log X = X for ‖X − E‖ < 1,
log exp X = X for ‖exp X − E‖ < 1.

5. One has

i) exp(X + Y) = exp X exp Y for commuting X, Y ∈ Mn(C),
ii) (exp X)^{−1} = exp(−X) for X ∈ Mn(C),
iii) A exp X A^{−1} = exp(AXA^{−1}) for X ∈ Mn(C), A ∈ GL(n,C),
iv) det(exp X) = exp Tr X for X ∈ Mn(C).

The conscientious reader should prove all this as Exercise 6.11. She/he should remember that we introduced in (3.1) in Section 3.1

< X, Y > = Re Tr ᵗX̄Y, ‖X‖ = < X, X >^{1/2} for all X, Y ∈ Mn(C),

and that one has the fundamental inequalities

‖X + Y‖ ≤ ‖X‖ + ‖Y‖, the triangle inequality,
|< X, Y >| ≤ ‖X‖‖Y‖, the Cauchy-Schwarz inequality.


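Everyone interested in concrete examples could do the following Exercise 6.12 preparing the ground for the next definition: Verify

\[ \exp(tX) = \begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix} \quad\text{for } X = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \]

\[ \exp(tX) = \begin{pmatrix} \cos t & \sin t \\ -\sin t & \cos t \end{pmatrix} \quad\text{for } X = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \]

\[ \exp(tX) = \begin{pmatrix} e^{t} & 0 \\ 0 & e^{-t} \end{pmatrix} \quad\text{for } X = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \]

\[ \exp(tX) = \begin{pmatrix} \cos t & \sin t & 0 \\ -\sin t & \cos t & 0 \\ 0 & 0 & 1 \end{pmatrix} \quad\text{for } X = \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \]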
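The formulas of Exercise 6.12 and property iv) of Remark 6.2 can be tested numerically by summing the exponential series. The following is a minimal sketch, in which the truncation at 40 terms and the sample value t = 0.8 are assumptions chosen for small matrices; the helper names are illustrative.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(x, terms=40):
    """Partial sum of the series exp X = sum_k X^k / k! (first `terms` terms)."""
    n = len(x)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]                              # X^0 / 0!
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, x)]   # X^k / k!
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

t = 0.8
unipotent = expm([[0.0, t], [0.0, 0.0]])      # expected (1 t; 0 1)
rotation = expm([[0.0, t], [-t, 0.0]])        # expected (cos t, sin t; -sin t, cos t)
hyperbolic = expm([[t, 0.0], [0.0, -t]])      # expected diag(e^t, e^-t)
print(unipotent)
print(rotation)
print(hyperbolic)
```

For a general matrix the determinant of `expm(X)` can be compared with `math.exp` of the trace of X in the same way.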

Exercise 6.13: Verify that one has the following relations

{exp(tX); X ∈ so(3), t ∈ R} = SO(3),
{exp(tX); X ∈ su(2), t ∈ R} = SU(2),
{exp(tX); X ∈ sl(2,R), t ∈ R} ≠ SL(2,R).

This exercise shows that there are groups which are of exponential type, i.e. images of their Lie algebras under the exponential map. For G = SL(2,R) one has

{exp(tX); X ∈ sl(2,R), t ∈ R} ≠ SL(2,R),

but the matrices exp(tX), X ∈ sl(2,R), t ∈ R, generate the identity component of G = SL(2,R).

We take this as motivation to associate a Lie algebra to each linear group by a kind of inversion of the exponential map:

Definition 6.3: Let G be a linear group contained in GL(n,K), K = R or K = C. Then we denote

g = Lie G := {X ∈ Mn(K); exp(tX) ∈ G for all t ∈ R}.

Theorem 6.1: g is a real Lie algebra.

A hurried or less curious reader may simply believe this and waive the proof. (See for instance [He] p.114: it is, in principle, not too difficult but lengthy. It shows why (in our approach) we had to restrict ourselves to closed matrix groups.) We emphasize that it is essentially the heart of the whole theory that not only a linear group but also every Lie group has an associated Lie algebra. This is based on another important notion, which we now introduce because it gives us a tool to determine explicitly the Lie algebras for those groups we are particularly interested in (even if we have skipped the proof of the central theorem above).

Definition 6.4: A one-parameter subgroup γ of a topological group G is a continuous homomorphism

γ : R −→ G, t −→ γ(t).

This terminology is a bit misleading: to be precise, it is the image G0 = γ(R) that is a subgroup of G. γ is often abbreviated as a 1-PUG (from the German term “Untergruppe”).


Remark 6.3: Each X ∈ g produces a 1-PUG γ, namely the subgroup given by

γ(t) := exp(tX).

This is obvious since exp(tX) is continuous (and even differentiable) in t and fulfills the functional equation i) in part 5 of Remark 6.2 above.

If the 1-PUG γ = γ(t) is differentiable in t, we assign to it an infinitesimal generator X = Xγ by

\[ X := \frac{d}{dt}\,\gamma(t)\Big|_{t=0}. \]

If γ is given as γ(t) = exp(tX), we have X as infinitesimal generator.

This suggests the following procedure to determine the Lie algebra g of a given matrix group G (which is also the background of our Exercise 6.13 above): Look for “sufficiently many independent” one-parameter subgroups γ1, . . . , γd and determine their generators X1, . . . , Xd. Then g will come out as the R-vector space generated by these matrices Xi. Obviously, one needs some explanation:
– Two 1-PUGs γ and γ′ are independent iff their infinitesimal generators are linearly independent over R as matrices in Mn(C).
– The number d of the sufficiently many independent (γi) is just the dimension of our group G as a topological manifold: Up to now we could work without this notion, but later on we will even have to consider differentiable manifolds. Therefore, let us recall (or introduce) that the notion of a real manifold M of dimension d includes the condition that each point m ∈ M has a neighbourhood which is homeomorphic to an open set in Rd. This amounts to the practical recipe that the dimension of a group G is the number of real parameters needed to describe the elements of G.
For instance for G = SU(2) and SO(3) we have as parameters the three Euler angles α, β, γ, which we introduced in 4.3 and 4.2. Hence, here we have d = 3, as also in the next example.

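Example 6.6: For G = SL(2,R) we take as a standard parametrization (a special instance of the important Iwasawa decomposition)

\[ \mathrm{SL}(2,\mathbb{R}) \ni g = \begin{pmatrix} a & b \\ c & d \end{pmatrix} = n(x)\,t(y)\,r(\vartheta) \tag{6.1} \]

with

\[ n(x) := \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix}, \quad t(y) := \begin{pmatrix} y^{1/2} & 0 \\ 0 & y^{-1/2} \end{pmatrix}, \quad r(\vartheta) := \begin{pmatrix} \cos\vartheta & \sin\vartheta \\ -\sin\vartheta & \cos\vartheta \end{pmatrix}, \]

where x ∈ R, y > 0, ϑ ∈ [0, 2π). Therefore, we take

\[ \gamma_1(t) := \begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix}, \quad \gamma_2(t) := \begin{pmatrix} e^{t} & 0 \\ 0 & e^{-t} \end{pmatrix}, \quad \gamma_3(t) := r(t), \]

and get by differentiating and then putting t = 0

\[ X_1 = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad X_2 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad X_3 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}. \]

Obviously, we have with the notation introduced in Exercise 6.3 in 6.1

Lie SL(2,R) = < X1, X2, X3 > = < F, G, H > = sl2(R).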
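The parametrization (6.1) can be made explicit: multiplying out n(x)t(y)r(ϑ) gives c = −y^{−1/2} sin ϑ and d = y^{−1/2} cos ϑ, from which y, ϑ and then x can be read off. The following is a minimal sketch of this computation; the function names `iwasawa` and `compose` and the sample matrix are illustrative assumptions.

```python
import math

def iwasawa(a, b, c, d):
    """Return (x, y, theta) with g = n(x) t(y) r(theta), assuming ad - bc = 1."""
    norm2 = c * c + d * d
    x = (a * c + b * d) / norm2
    y = 1.0 / norm2
    theta = math.atan2(-c, d) % (2 * math.pi)   # angle normalized to [0, 2 pi)
    return x, y, theta

def compose(x, y, theta):
    """Multiply out n(x) t(y) r(theta) and return the entries (a, b, c, d)."""
    s, co = math.sin(theta), math.cos(theta)
    ry, iry = math.sqrt(y), 1.0 / math.sqrt(y)
    a = ry * co - x * iry * s
    b = ry * s + x * iry * co
    c = -iry * s
    d = iry * co
    return a, b, c, d

# Sample element of SL(2,R): g = (2 1; 1 1) has determinant 1.
x, y, theta = iwasawa(2.0, 1.0, 1.0, 1.0)
print(x, y, theta)
print(compose(x, y, theta))   # reproduces the entries of g up to rounding
```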


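Example 6.7: For

\[ G = \mathrm{Heis}'(\mathbb{R}) := \Bigl\{ \begin{pmatrix} 1 & x & z \\ 0 & 1 & y \\ 0 & 0 & 1 \end{pmatrix};\ x, y, z \in \mathbb{R} \Bigr\} \]

we take

\[ \gamma_1(t) := \begin{pmatrix} 1 & t & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad \gamma_2(t) := \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & t \\ 0 & 0 & 1 \end{pmatrix}, \quad \gamma_3(t) := \begin{pmatrix} 1 & 0 & t \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \]

and get

\[ X_1 = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad X_2 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \quad X_3 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \]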

By an easy computation of the commutators we get the Heisenberg commutation relations

[X1, X2] = X3, [X1, X3] = [X2, X3] = 0.

Exercise 6.14: Repeat this for G = Heis(R) as a subgroup of GL(4,R).

Exercise 6.15: Determine Lie G for G = {g = n(x)t(y); x ∈ R, y > 0}.

6.3 Derived Representations

Now that we have associated a Lie algebra g = Lie G to a given linear group (using a differentiation procedure), we want to associate an algebra representation π̂ of g to a given representation (π, H) of G. Here we will have to use differentiation again. To make this possible, we have to modify the notion of a group representation:

Definition 6.5: Let (π, H) be a continuous representation of a linear group G. Then its associated smooth representation (π, H∞) is the restriction of π to the space H∞ of smooth vectors, i.e. those v ∈ H for which the map

G ∋ g −→ π(g)v ∈ H

is differentiable.

This definition makes sense if we accept that a linear group is a differentiable manifold and extend the notion of differentiability to vector valued functions. In our examples things are fairly easy as we have functions depending on the parameters of the group where differentiability is not difficult to check.

Definition 6.6: Let (π, H) be a continuous representation of a linear group G. Then its associated derived representation is the representation dπ of g given on the space H∞ of smooth vectors by the prescription

\[ d\pi(X)v := \frac{d}{dt}\,\pi(\exp(tX))v\,\Big|_{t=0} \quad\text{for all } v \in H^\infty. \]


If it is clear which representation π is meant, one abbreviates Xv := dπ(X)v.
It is not difficult to verify that dπ(X)v stays in H∞ and rather obvious that dπ(X) is linear. But it takes some more trouble to prove (see [Kn] p.53, [BR] p.320, or [FH] p.16) that one has

[dπ(X), dπ(Y )] = dπ([X,Y ]) for all X,Y ∈ g.

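Example 6.8: We take G = SU(2) and π = π1 given by

π(g)f(x, y) := f(āx − by, b̄x + ay)

for

f ∈ V^(1) = < f−1, f0, f1 >

with

f−1(x, y) = (1/√2)y², f0(x, y) = xy, f1(x, y) = (1/√2)x².

As we did in Exercise 6.4 in 6.1, we take as basis of su(2)

\[ X_1 := \frac{1}{2}\begin{pmatrix} 0 & -i \\ -i & 0 \end{pmatrix}, \quad X_2 := \frac{1}{2}\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \quad X_3 := \frac{1}{2}\begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}. \]

Then we have the one-parameter subgroups

\[ \exp(tX_1) = \begin{pmatrix} \cos(t/2) & -i\sin(t/2) \\ -i\sin(t/2) & \cos(t/2) \end{pmatrix}, \]

\[ \exp(tX_2) = \begin{pmatrix} \cos(t/2) & -\sin(t/2) \\ \sin(t/2) & \cos(t/2) \end{pmatrix}, \]

\[ \exp(tX_3) = \begin{pmatrix} e^{it/2} & 0 \\ 0 & e^{-it/2} \end{pmatrix}. \]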

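It is clear that V^(1) consists of smooth vectors and hence we can compute

\[ \begin{aligned} X_1 f_{-1} &= \tfrac{d}{dt}\,\pi(\exp tX_1) f_{-1}\,\big|_{t=0} \\ &= \tfrac{d}{dt}\,(1/\sqrt{2})\,(i\sin(t/2)\,x + \cos(t/2)\,y)^2\,\big|_{t=0} \\ &= (i/\sqrt{2})\,xy = (i/\sqrt{2})\,f_0, \end{aligned} \]

and

\[ \begin{aligned} X_1 f_0 &= \tfrac{d}{dt}\,\pi(\exp tX_1) f_0\,\big|_{t=0} \\ &= \tfrac{d}{dt}\,(\cos(t/2)\,x + i\sin(t/2)\,y)(i\sin(t/2)\,x + \cos(t/2)\,y)\,\big|_{t=0} \\ &= (i/2)(x^2 + y^2) = (i/\sqrt{2})(f_{-1} + f_1), \end{aligned} \]

\[ \begin{aligned} X_1 f_1 &= \tfrac{d}{dt}\,\pi(\exp tX_1) f_1\,\big|_{t=0} \\ &= \tfrac{d}{dt}\,(1/\sqrt{2})\,(\cos(t/2)\,x + i\sin(t/2)\,y)^2\,\big|_{t=0} \\ &= (i/\sqrt{2})\,xy = (i/\sqrt{2})\,f_0. \end{aligned} \]

Similarly we get

\[ X_2 f_{-1} = -(1/\sqrt{2})\,f_0, \quad X_2 f_0 = (1/\sqrt{2})(f_{-1} - f_1), \quad X_2 f_1 = (1/\sqrt{2})\,f_0, \]

\[ X_3 f_{-1} = i f_{-1}, \quad X_3 f_0 = 0, \quad X_3 f_1 = -i f_1. \]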
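These formulas can be cross-checked by writing the three operators as matrices in the basis (f−1, f0, f1) and testing the su(2) relations [Xi, Xj] = −Xk. The following is a minimal sketch, using X3 f1 = −i f1, the value obtained by differentiating (1/√2)e^{−it}x² at t = 0; the names M1, M2, M3, X0, X_PLUS and the helpers are illustrative.

```python
import math

S = 1 / math.sqrt(2)

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(len(a))] for i in range(len(a))]

def scale(s, a):
    return [[s * v for v in row] for row in a]

def bracket(a, b):
    return add(matmul(a, b), scale(-1, matmul(b, a)))

# Column j holds the image of the j-th basis vector of (f_{-1}, f_0, f_1).
M1 = [[0, 1j * S, 0], [1j * S, 0, 1j * S], [0, 1j * S, 0]]   # action of X1
M2 = [[0, S, 0], [-S, 0, S], [0, -S, 0]]                     # action of X2
M3 = [[1j, 0, 0], [0, 0, 0], [0, 0, -1j]]                    # action of X3

# Operators of Exercise 6.16 i): X0 = i X3 and X+ = (1/sqrt 2)(-i X1 - X2).
X0 = scale(1j, M3)
X_PLUS = scale(S, add(scale(-1j, M1), scale(-1, M2)))

print(bracket(M1, M2))   # equals -M3
print(X0)                # diagonal with entries -1, 0, 1
print(X_PLUS)            # shifts f_{-1} -> f_0 -> f_1 -> 0
```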


Exercise 6.16: i) Verify this and show that for

X0 := iX3, X± := (1/√2)(−iX1 ∓ X2)

one has

X0fp = pfp, X±fp = fp±1, p = −1, 0, 1 (f2 = f−2 = 0).

ii) Do the same for π = πj, j > 1.

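Example 6.9: We take the Heisenberg group G = Heis(R) ⊂ GL(4,R) and the Schrödinger representation π = πm, m ∈ R \ {0}, given by

π(g)f(x) = e^m(κ + (2x + λ)µ)f(x + λ)

with

g = (λ, µ, κ) ∈ G, x ∈ R, f ∈ H = L2(R), e^m(u) := e^{2πimu}.

Here H∞ is the (dense) subspace of differentiable functions in L2(R). We realize the Lie algebra g = heis(R) by the space spanned by the four-by-four matrices

\[ P = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \quad Q = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \quad R = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}. \]

Then we have the Heisenberg commutation relations

[P, Q] = 2R, [R, P] = [R, Q] = 0

and (using again the notation introduced in 0.1)

exp tP = (t, 0, 0), exp tQ = (0, t, 0), exp tR = (0, 0, t).

For differentiable f we compute

\[ \hat P f = \tfrac{d}{dt}\,\pi(t, 0, 0) f\,\big|_{t=0} = \tfrac{d}{dt}\, f(x + t)\,\big|_{t=0} = \partial_x f, \]

\[ \hat Q f = \tfrac{d}{dt}\,\pi(0, t, 0) f\,\big|_{t=0} = \tfrac{d}{dt}\, e^m(2xt) f(x)\,\big|_{t=0} = 4\pi i m x\, f(x), \]

\[ \hat R f = \tfrac{d}{dt}\,\pi(0, 0, t) f\,\big|_{t=0} = \tfrac{d}{dt}\, e^m(t) f(x)\,\big|_{t=0} = 2\pi i m\, f(x), \]

i.e. we have

P̂ = ∂x, Q̂ = 4πimx, R̂ = 2πim.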
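The commutation relations of the three operators ∂x, 4πimx and 2πim can be tested formally. The following is a minimal sketch that lets them act on polynomials stored as coefficient lists; polynomials are not elements of L2(R) and serve here only as a formal test space, and the sample value of m as well as the names `P_hat`, `Q_hat`, `R_hat` are assumptions made for the demonstration.

```python
import math

M_PARAM = 1.5  # sample value of the parameter m, chosen arbitrarily

def P_hat(coeffs):
    """d/dx on a polynomial given by its coefficient list [c0, c1, ...]."""
    return [k * coeffs[k] for k in range(1, len(coeffs))] or [0]

def Q_hat(coeffs):
    """Multiplication by 4 pi i m x: shifts all coefficients up by one degree."""
    factor = 4j * math.pi * M_PARAM
    return [0] + [factor * c for c in coeffs]

def R_hat(coeffs):
    """Multiplication by the constant 2 pi i m."""
    factor = 2j * math.pi * M_PARAM
    return [factor * c for c in coeffs]

def sub(p, q):
    """Difference of two polynomials, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a - b for a, b in zip(p, q)]

def commutator(A, B, coeffs):
    """Coefficients of (AB - BA) applied to the polynomial."""
    return sub(A(B(coeffs)), B(A(coeffs)))

f = [1, 2, 0, -3]                       # the polynomial 1 + 2x - 3x^3
print(commutator(P_hat, Q_hat, f))      # equals 2 * R_hat(f)
print([2 * c for c in R_hat(f)])
```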

Exercise 6.17: i) Verify that these operators really fulfill the same commutation relations as P, Q, R.
ii) Take f0(x) := e^{−πmx²} and

Y0 := −iR̂, Y± := (1/2)(P̂ ± iQ̂)

and determine

fp := Y_+^p f0, Y−fp and Y0fp for p = 0, 1, 2, . . . .

These examples show how to associate a Lie algebra representation to a given group representation. Now the following question arises naturally: Is there always a nontrivial subspace H∞ of smooth vectors in a given representation space H? We would even want it to be a dense subspace. The answer to this question is positive. This is easy in our examples but rather delicate in general and one has to rely on work by Gårding, Nelson and Harish-Chandra. It is not very surprising that equivalent group representations are infinitesimally equivalent, i.e. their derived representations are equivalent algebra representations. The converse is not true in general. But as stated in [Kn] p.209, at least for semisimple groups one has that infinitesimally equivalent irreducible unitary representations are unitarily equivalent.
Taking these facts for granted, a discussion of the unitary irreducible representations of a given linear group G can proceed as follows: We try to classify all irreducible representations of the Lie algebra g = Lie G of G and determine those which can be realized by derivation from a unitary representation of the group. In general, this requires some more information about the structure of the Lie algebra at hand. Before going a bit into this structure theory, we discuss in extenso three fundamental examples whose understanding prepares the ground for the more general cases.

6.4 Unitarily Integrable Representations of sl(2,R)

A basic reference for this example is the book “SL(2,R)” [La1] by Serge Lang, but in some way or other our discussion can be found in nearly any text treating the representation theory of semisimple algebras or groups.
As we already know from Exercise 6.3 in 6.1 and Example 6.6 in 6.2, g = sl(2,R) is the Lie algebra of G = SL(2,R) and we have

g = sl(2,R) = {X ∈ M2(R); Tr X = 0} = < F, G, H >

with

\[ F = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad G = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \quad H = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \]

and the relations

[H, F] = 2F, [H, G] = −2G, [F, G] = H.

The exercises at the end of the examples in the previous section should give a feeling that it is convenient to complexify the Lie algebra, i.e. here we go over to

gc := g ⊗_R C = < F, G, H >_C = < X+, X−, Z >_C

where

\[ X_\pm := \frac{1}{2}\bigl(H \pm i(F + G)\bigr) = \frac{1}{2}\begin{pmatrix} 1 & \pm i \\ \pm i & -1 \end{pmatrix}, \qquad Z := -i(F - G) = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \tag{6.2} \]

with

[Z, X±] = ±2X±, [X+, X−] = Z.
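The relations for X± and Z follow from those of F, G, H and can be confirmed by direct matrix multiplication. The following is a minimal sketch building the matrices of (6.2) from F, G, H; the helper names are illustrative.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def bracket(a, b):
    """Commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

def scale(s, a):
    return [[s * v for v in row] for row in a]

F = [[0, 1], [0, 0]]
G = [[0, 0], [1, 0]]
H = [[1, 0], [0, -1]]

# X+- = (1/2)(H +- i(F + G)) and Z = -i(F - G), as in (6.2)
X_PLUS = [[0.5 * (H[i][j] + 1j * (F[i][j] + G[i][j])) for j in range(2)]
          for i in range(2)]
X_MINUS = [[0.5 * (H[i][j] - 1j * (F[i][j] + G[i][j])) for j in range(2)]
           for i in range(2)]
Z = [[-1j * (F[i][j] - G[i][j]) for j in range(2)] for i in range(2)]

print(bracket(Z, X_PLUS), scale(2, X_PLUS))     # [Z, X+] = 2 X+
print(bracket(Z, X_MINUS), scale(-2, X_MINUS))  # [Z, X-] = -2 X-
print(bracket(X_PLUS, X_MINUS), Z)              # [X+, X-] = Z
```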

This normalization will soon prove to be adequate and the operators X± will get a meaning as ladder operators: We consider a representation π̂ of gc = < X±, Z > on a C-vector space V = < vi >_{i∈I} where we abbreviate

Xv := π̂(X)v for all v ∈ V, X ∈ gc.

We are looking for representations (π̂, V) of gc which may be integrated, i.e. which are complexifications of derived representations dπ for a representation π of SL(2,R).


This has the following consequence: π|K is a representation of K := SO(2). From Example 3.1 in 3.2 we know that all irreducible unitary representations of SO(2) are given by

χk(r(ϑ)) = e^{ikϑ}, k ∈ Z,

hence one has the derived representation dχk with

\[ d\chi_k(Y) = ik \quad\text{for } Y = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} = iZ. \]

This motivates the idea to look for each k ∈ Z at the subspace of V consisting of the eigenvectors with eigenvalue k with respect to the action of Z, namely

Vk := {v ∈ V; Zv = kv}.

This space Vk is called a K-isotypic subspace of V. Each v ∈ Vk is said to have weight k. Now we can explain the meaning of the term ladder operator: The isotypic spaces Vk may be seen as rungs of a ladder numbered by the weight k and the operators X± change the weights in a controlled way:

Remark 6.4: We have

X±Vk ⊂ Vk±2.

Proof: For v ∈ Vk we put v′ := X±v and check that v′ is again a Z-eigenvector of weight k ± 2:

Zv′ = ZX±v = (X±Z ± 2X±)v = X±(k ± 2)v = (k ± 2)v′.

Here we used the Lie algebra relation [Z, X±] = ±2X±, which for the operators acting on V is transformed into the relation

ZX± − X±Z = ±2X±.

This remark has immediate consequences:

i) If V = ΣVk is the space of an irreducible gc-representation, the k with nontrivial Vk are either all odd or all even.
ii) If there is a Vk0 = {0}, then all Vk with k ≥ k0 or all Vk with k ≤ k0 have to be zero.
Hence, there can be only four different configurations for our irreducible gc-representations. The nontrivial weights k, i.e. those for which Vk ≠ {0}, compose
– case 1: a finite interval

I_{nm} := {m ≤ k ≤ n; m ≡ k ≡ n mod 2}

of even or odd integers,
– case 2: a half line with an upper bound

I_n := {k ≤ n; k ≡ n mod 2}

consisting of even or odd integers,
– case 3: a half line with lower bound

I_m := {m ≤ k; m ≡ k mod 2}

consisting of even or odd integers,
– case 4: a full chain of even or odd integers

I+ := {k ∈ 2Z} or I− := {k ∈ 2Z + 1}.


[Figure: the four possible weight configurations, case 1 to case 4, drawn as ladders on which X+ and X− act.]

It is quite natural to call an element 0 ≠ v ∈ Vk with k = n in configuration 1 or 2 a vector of highest weight and similarly for k = m in configuration 1 or 3 a vector of lowest weight.

The following statement offers the decisive point for the reasoning that we really get all representations by our procedure. It is a special case of a more general theorem by Godement on spherical functions. It also is a first example for the famous and important multiplicity-one statements.

Lemma: If π is irreducible, we have dim Vk ≤ 1.

For a proof we refer to Lang [La1] p.24 or the article by van Dijk [vD].

We look at the configurations 1 or 2: Let 0 ≠ v_n ∈ V_n be a vector of highest weight, i.e. with

Z v_n = n v_n, X+ v_n = 0,

and put

v_{n−2k} := X−^k v_n.

As we want an irreducible representation and have agreed to take the multiplicity-one statement from the Lemma for granted, we can take this v_{n−2k} as a generator for V_{n−2k}, with v_{n−2k} ≠ 0 for all k ∈ N0 in configuration 2 and for all k with n − 2k ≥ m in configuration 1. We check what X+ does to v_{n−2k}:

Remark 6.5: For k as fixed above, we have

X+ v_{n−2k} = a_k v_{n−2k+2} with a_k := kn − k(k − 1).

Proof: Since X+ v_n = 0, we know that a_0 = 0. As in the proof of Remark 6.4, for the operators realizing the commutation relation [X+, X−] = Z we calculate

X+ v_{n−2} = X+X− v_n = (X−X+ + Z) v_n = n v_n.

Hence we get a_1 = n. Then we verify by induction

X+ v_{n−2(k+1)} = X+X− v_{n−2k} = (X−X+ + Z) v_{n−2k}
= X−((kn − k(k − 1)) v_{n−2k+2}) + (n − 2k) v_{n−2k}
= ((k + 1)n − k(k + 1)) v_{n−2k} = a_{k+1} v_{n−2k}.
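The formula for a_k can be tested by writing down matrices. The sketch below assumes the finite chain of configuration 1 with highest weight n and lowest weight −n, uses the basis v_n, v_{n−2}, . . . , v_{−n}, and checks all three commutation relations; the function names are ours and only illustrate the computation.

```python
def sigma_n(n):
    """Return (Z, X_plus, X_minus) as (n+1)x(n+1) integer matrices.

    Column k holds the image of the basis vector v_{n-2k}, k = 0, ..., n.
    """
    d = n + 1
    Z = [[0] * d for _ in range(d)]
    XP = [[0] * d for _ in range(d)]
    XM = [[0] * d for _ in range(d)]
    for k in range(d):
        Z[k][k] = n - 2 * k                      # Z v_{n-2k} = (n - 2k) v_{n-2k}
        if k + 1 < d:
            XM[k + 1][k] = 1                     # X- v_{n-2k} = v_{n-2(k+1)}
        if k >= 1:
            XP[k - 1][k] = k * n - k * (k - 1)   # X+ v_{n-2k} = a_k v_{n-2k+2}
    return Z, XP, XM

def mul(a, b):
    d = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(d)) for j in range(d)]
            for i in range(d)]

def bracket(a, b):
    ab, ba = mul(a, b), mul(b, a)
    d = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(d)] for i in range(d)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def is_sl2_representation(n):
    """True if the matrices satisfy [Z,X+-] = +-2X+- and [X+,X-] = Z."""
    Z, XP, XM = sigma_n(n)
    return (bracket(Z, XP) == scale(2, XP)
            and bracket(Z, XM) == scale(-2, XM)
            and bracket(XP, XM) == Z)
```

For instance, is_sl2_representation(4) checks the five-dimensional case with weights 4, 2, 0, −2, −4.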

We see that the irreducibility of π forces a_k to be nonzero for all k ∈ N in configuration 2 and for 2k ≤ n − m in configuration 1. Since a_k = k(n − (k − 1)), for k ∈ N we have a_k = 0 iff n − (k − 1) = 0. We conclude that n has to be negative in case 2 and m = −n with n ∈ N in case 1.

Just as well, we can start to treat the configurations 1 and 3 with a vector 0 ≠ v_m ∈ V_m of lowest weight. Here we put v_{m+2k} := X+^k v_m. In parallel to Remark 6.5 we get:

Remark 6.6: For k ∈ N0 in configuration 3 and m + 2k ≤ n in configuration 1, we have

X− v_{m+2k} = b_k v_{m+2k−2} with b_k = −(km + k(k − 1)).

As above, we see that we have m = −n with negative m in configuration 1 and that m has to be positive in configuration 3. For instance, the following are possible configurations.

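[Figure: examples of possible configurations. Case 1: weights from m = −2 to n = 2. Case 2: highest weight n = −2, followed by −4, −6, . . . Case 3: lowest weight m = 2, followed by 4, 6, . . . The operators X+ and X− move between the rungs.]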


In configuration 4 we have no distinguished vector of highest or lowest weight. In this case we choose for each even (or odd) integer n a nonzero v_n ∈ V_n, i.e. with Z v_n = n v_n, such that they are related as follows:

Remark 6.7: For all even (resp. odd) n ∈ Z we have

X± v_n = a_n^± v_{n±2} with a_n^± = (1/2)(s + 1 ± n),

where s ∈ C is such that a_n^± ≠ 0 for all n ∈ 2Z (resp. 2Z + 1), i.e. s ∉ 2Z + 1 for n ∈ 2Z and s ∉ 2Z for n ∈ 2Z + 1.

Proof: From irreducibility and multiplicity-one we conclude that each X± v_n has to be a nonzero multiple of v_{n±2}. We verify that the choice of a_n^± is consistent with the commutation relations:

[Z,X±] v_n = (ZX± − X±Z) v_n = Z a_n^± v_{n±2} − a_n^± n v_{n±2} = ±2 a_n^± v_{n±2} = ±2X± v_n,

and

[X+, X−] v_n = Z v_n = n v_n.

Verify this as Exercise 6.19.

The conditions for s are obvious since we have a_n^± = 0 iff s + 1 = ∓n.

We chose a symmetric procedure for our configuration 4. But we could just as well have chosen any nonzero v_n ∈ V_n, and then v_{n+2j} := X+^j v_n, v_{n−2j} := X−^j v_n for j ∈ N would constitute an equivalent representation.

Exercise 6.20: Check this.

We denote the representations obtained above as follows:

– π_{s,+} or π_{s,−} for the even resp. odd case in configuration 4, with s ∈ C, s ∉ 2Z + 1 for π_{s,+} and s ∉ 2Z for π_{s,−},
– π_n^+ for configuration 3 with lowest weight n, n ∈ N,
– π_n^− for configuration 2 with highest weight −n, n ∈ N,
– σ_n for configuration 1 with highest weight n and lowest weight −n, n ∈ N0.

Remark 6.8: If in π_{s,+} and π_{s,−} we do not exclude the values of s as above, we get all representations as sub- resp. quotient representations of these π_{s,±}.

Proposition 6.1: We have the following equivalences

πs,+ ∼ π−s,+ and πs,− ∼ π−s,−.

All other representations listed above are inequivalent.

Proof: i) Let V = < v_n > and V′ = < v′_n > be representation spaces for π_{s,+} resp. π_{s′,+} and F : V −→ V′ an intertwining operator. Because of Z v_n = n v_n, we have for each n

Z F v_n = F Z v_n = n F v_n.

Hence one has F v_n ∈ V′_n, i.e. F v_n = b_n v′_n with b_n ∈ C. Moreover, from

X± F v_n = F X± v_n

we deduce

(s′ + 1 ± n) b_n v′_{n±2} = b_{n±2} (s + 1 ± n) v′_{n±2},

i.e.

b_n / b_{n−2} = (s + 1 − n)/(s′ + 1 − n)

and

b_{n+2} / b_n = (s′ + 1 + n)/(s + 1 + n), resp. b_n / b_{n−2} = (s′ + n − 1)/(s + n − 1).

Comparing the two expressions for b_n / b_{n−2}, we see that we have

(s + 1 − n)(s + n − 1) = (s′ + n − 1)(s′ + 1 − n), i.e. s′ = ±s.

The same conclusions can be made for π_{s,−}.

ii) The inequivalence of the σ_n's among themselves and with respect to all the other representations is clear because the dimensions are different. The other inequivalences have to be checked individually. As an example let V = < v_n > and V′ = < v′_n > be spaces of representations π_k^+ and π_{s,+} and F : V −→ V′ an intertwining operator. As above, for n ≥ k we have the relation F v_n = b_n v′_n with b_n ∈ C. Hence F is not surjective and the two representations are not equivalent. The same reasoning goes through in all the other cases.

Exercise 6.21: Realize this.

Classification of the Unitarily Integrable Representations of sl(2,R)

At first, we derive a necessary condition for a general representation π̂ of a Lie algebra g = Lie G to be integrable to a unitary representation π of G.

Proposition 6.2: If (π,H) is a unitary representation of G and X ∈ g = Lie G, the operator X̂ := dπ(X) on H∞ is skew-Hermitian, i.e. one has X̂ = −X̂∗.

Proof: For v, w ∈ H∞, unitarity of π implies

< v, π(g)w > = < π(g^{−1})v, w > for all g ∈ G

and hence, for t ∈ R, t ≠ 0,

< v, i (π(exp(tX))w − w)/t > = < i (π(exp(−tX))v − v)/(−t), w >.

Recalling the definition

dπ(X)v := d/dt π(exp(tX))v |_{t=0} = lim_{t→0} (π(exp tX)v − v)/t,

we get from the last equation in the limit t −→ 0

< v, i dπ(X)w > = < i dπ(X)v, w >.

Since < i dπ(X)v, w > = < v, (i dπ(X))∗w > = − < v, i dπ(X)∗w >, we therefore have

< v, i dπ(X)w > = − < v, i dπ(X)∗w >,

i.e. X̂ = dπ(X) is skew-Hermitian.

Now, we apply this to g = sl(2,R). The Proposition leads to the condition:

Remark 6.9: We have dπ(X±)∗ = −dπ(X∓).

Proof: Using the notation from the beginning of this section we have

\[ X_\pm = \tfrac12\,(H \pm i(F + G)) = \tfrac12 \begin{pmatrix} 1 & \pm i \\ \pm i & -1 \end{pmatrix} =: \tfrac12\,(X_1 \pm iX_2) \]

with X1 = H, X2 = F + G ∈ sl(2,R). Hence, the skew-Hermiticity of the X̂_i = dπ(X_i) leads to

(X̂1 ± iX̂2)∗ = −(X̂1 ∓ iX̂2).

Now, if we ask for unitarity, we have to check whether we find a scalar product <,> defined on the representation space V such that we get

< X±v, w > = − < v, X∓w > for all v, w ∈ V.

For V = < v_j >_{j∈J} this condition implies

(6.3) < X+ v_j, v_{j+2} > = − < v_j, X− v_{j+2} > for all j ∈ J.

i) In configurations 1 and 2 we have

X− v_{j+2} = v_j

and from Remark 6.5 for j = n − 2k

X+ v_j = a_k v_{j+2} with a_k = k(n − k + 1),

so that (6.3) reads

a_k ‖v_{j+2}‖² = − ‖v_j‖².

An equation like this is only possible for a_k < 0, which is true for the a_k in configuration 2 but not for configuration 1, where we have seen that all a_k are positive.

Similarly Remark 6.6 shows that in configuration 3 the condition (6.3) is no obstacle.


ii) Configuration 4 is more delicate: From Remark 6.7 we have

X± v_j = a_j^± v_{j±2} with a_j^± = (1/2)(s + 1 ± j),

i.e. the condition (6.3) has the form

(6.4) (s + 1 + j) ‖v_{j+2}‖² = −(s̄ − 1 − j) ‖v_j‖².

At first we treat the even case V = < v_j >_{j∈2Z}. Condition (6.4) requires for j = 0

(s + 1) ‖v_2‖² = −(s̄ − 1) ‖v_0‖².

As the norm has to be positive, a relation like this is only possible if one has

(1 − s̄)/(s + 1) > 0, i.e. (1 − s)(1 + s) > 0.

We put s = σ + iτ, σ, τ ∈ R, and get the condition

s² = σ² − τ² + 2στ i < 1,

i.e. unitarity demands

σ = 0 and τ ∈ R arbitrary

or

τ = 0 and σ² < 1.

In the odd case V = < v_j >_{j∈2Z+1} we get for j = −1 from (6.4)

s ‖v_1‖² = −s̄ ‖v_{−1}‖².

This is possible only for

−s̄/s > 0, i.e. s̄² = σ² − τ² − 2στ i < 0.

Therefore, here we have only one possibility, namely

σ = 0 and τ ∈ R.
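The text tests condition (6.4) only at j = 0 resp. j = −1, which gives necessary conditions. As a hedged numerical illustration, the following sketch evaluates the ratio ‖v_{j+2}‖²/‖v_j‖² forced by (6.4), read with the complex conjugate of s on the right-hand side, over a finite range of j and reports whether it is always real and positive. The function name, the range and the tolerance are our own choices.

```python
def passes_unitarity_test(s, parity, j_max=20, tol=1e-12):
    """Check (6.4) for j = parity mod 2 with -j_max <= j < j_max.

    parity = 0 is the even case, parity = 1 the odd case.
    The ratio ||v_{j+2}||^2 / ||v_j||^2 = -(conj(s) - 1 - j) / (s + 1 + j)
    must be a positive real number for every j.
    """
    s = complex(s)
    for j in range(-j_max + parity, j_max, 2):
        denom = s + 1 + j
        if denom == 0:
            return False          # a_j^+ vanishes: excluded value of s
        ratio = -(s.conjugate() - 1 - j) / denom
        if abs(ratio.imag) > tol or ratio.real <= 0:
            return False
    return True
```

In this finite test, purely imaginary s pass in both cases (with s ≠ 0 in the odd case), real s with s² < 1 pass only in the even case, and a value such as s = 2 fails, in agreement with the discussion above.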

Putting all this together, we have proved the following statement.

Theorem 6.2: Unitary irreducible representations π of G = SL(2,R) can exist only if their derived representation π̂ = dπ is among the following list:

– π̂ the trivial representation,
– π̂ = π_k^±, k ∈ N,
– π̂ = π_{is,±}, s ∈ R (s ≠ 0 for π_{is,−}),
– π̂ = π_s, s ∈ R, 0 < s² < 1.

In the next chapter we will prove that, in fact, all these representations π̂ can be integrated to unitary representations π of G. They get the following names:

– π = π_k^∓ discrete series representation of highest (lowest) weight ∓k,
– π = π_{is,±} even (odd) principal series representation,
– π = π_s complementary (or supplementary) series representation.

Because of the equivalences stated in Proposition 6.1, we have for the unitary dual of G = SL(2,R)

Ĝ = {π_0; π_k^±, k ∈ N; π_{is,±}, s ∈ R>0; π_s, 0 < s < 1}.

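We will study this more closely in the next chapter but already indicate a kind of visualization.

[Figure: schematic picture of the unitary dual: the discrete series π_k^− and π_k^+ at the integers −1, −2, −3, . . . and 1, 2, 3, . . . , the principal series π_{is,+} and π_{is,−} along iR, and the complementary series π_s.]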

We can observe a very nice fact, which will become important later and is the starting point for a general procedure in the construction of representations:

Exercise 6.23: Show that the operator

Ω := X+X− + X−X+ + (1/2)Z²

acts as a scalar for the three types of representations in Theorem 6.2 and determine these scalars.

Our infinitesimal considerations showed that there is no nontrivial finite-dimensional unitary representation of SL(2,R). There are other proofs for this fact. Because it is so elegant and instructive, we repeat the one from [KT] p.16.

Theorem 6.3: Every finite-dimensional unitary representation π of G = SL(2,R) is trivial.

Proof: If π has an n-dimensional representation space, we have a homomorphism

π : SL(2,R) −→ U(n).

Since

\[ \begin{pmatrix} a & 0 \\ 0 & a^{-1} \end{pmatrix} \begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix} \begin{pmatrix} a & 0 \\ 0 & a^{-1} \end{pmatrix}^{-1} = \begin{pmatrix} 1 & a^2 b \\ 0 & 1 \end{pmatrix}, \]

all \( n(b) = \begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix} \) with b > 0 are conjugate. So all π(n(b)) with b > 0 are conjugate. It is a not too difficult topological statement that in a compact group all conjugacy classes are closed. As U(n) is compact, we can deduce

lim_{b→0+} π(n(b)) = π(E2) = En,

and therefore π(n(b)) = En for all b > 0. But we can repeat this conclusion for b < 0 and as well for \( \bar n(b) = \begin{pmatrix} 1 & 0 \\ b & 1 \end{pmatrix} \) and get

π(n(b)) = π(n̄(b)) = En for all b ∈ R.

Now the proof is finished if we recall the following

Lemma: SL(2,R) is generated by the elements n(b) and n̄(b), b ∈ R.
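The conjugation identity at the start of the proof is easy to check with exact rational arithmetic. The following short sketch (helper names are ours) conjugates the upper triangular matrix with entry b by the diagonal matrix with entries a and 1/a, and shows how an element with small b is conjugate to one with b = 1.

```python
from fractions import Fraction

def mul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def n_upper(b):
    """Upper triangular unipotent matrix with off-diagonal entry b."""
    return [[1, b], [0, 1]]

def diag(a):
    """Diagonal matrix with entries a and 1/a (a a nonzero Fraction)."""
    return [[a, 0], [0, 1 / a]]

def conjugate_n(a, b):
    """Return diag(a) * n_upper(b) * diag(a)^(-1); note diag(a)^(-1) = diag(1/a)."""
    return mul(mul(diag(a), n_upper(b)), diag(1 / a))
```

For example, conjugate_n(Fraction(3), Fraction(1, 9)) is the matrix with off-diagonal entry 1, since a²b = 9 · (1/9) = 1.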


6.5 The Examples su(2) and heis(R)

In section 6.3 we computed derived representations for G = SU(2) and Heis(R). We look at this again on the ground of what we did in the last section.

We refer to Exercise 6.4 in 6.1 and Example 6.8 in 6.3 and (having in mind later applications) change the basis (X_j) of su(2) we used there to

\[ (6.5)\quad H_1 = \begin{pmatrix} -i & 0 \\ 0 & i \end{pmatrix},\quad H_2 = \begin{pmatrix} 0 & -i \\ -i & 0 \end{pmatrix},\quad H_3 = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \]

with

[H_j, H_k] = 2H_l, (j, k, l) = (1, 2, 3), (2, 3, 1), (3, 1, 2).

su(2) and sl(2,R) have the same complexifications. Hence we can express the matrices Z, X± from (6.2) in the previous section as

Z = iH3, X± = (1/2)(iH1 ∓ H2)

and repeat the discussion of the representation space V = < v_j >_{j∈J} to get the configurations 1 to 4. Let us look at configuration 1: We take a vector v_n ∈ V of highest weight n ∈ N, put

v_{n−2k} = X−^k v_n, X+ v_{n−2k} =: a_k v_{n−2k+2},

get a_k = kn − k(k − 1) and m = −n for the lowest weight. But in this case the skew-Hermiticity of the H_j changes things and Remark 6.9 in 6.4 is to be replaced by:

Remark 6.10: For a unitary representation π of SU(2) we must have

dπ(X±)∗ = dπ(X∓).

Hence, the discussion in 6.4 shows that for unitarity in this case the a_k have to be positive as, in fact, they are for n ≥ k ≥ 1. We see that our discussion rediscovers the representation space V = V^(j) from section 4.2 with n = 2j. By the way, we get here for free another proof that the representation π_j from 4.2 is irreducible.

Configurations 2 to 4 are not unitarizable.
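The basis change can be verified directly. The sketch below assumes the matrices H1 = diag(−i, i), H2 = (0 −i; −i 0), H3 = (0 −1; 1 0) for (6.5) and checks that they are skew-Hermitian, satisfy [H_j, H_k] = 2H_l for cyclic (j, k, l), and that (1/2)(iH1 ∓ H2) gives the matrices (1/2)(1 ±i; ±i −1) appearing in the proof of Remark 6.9. The helper names are ours.

```python
def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def bracket(a, b):
    return add(mul(a, b), scale(-1, mul(b, a)))

def conj_transpose(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

# Assumed form of the basis (6.5) of su(2).
H1 = [[-1j, 0], [0, 1j]]
H2 = [[0, -1j], [-1j, 0]]
H3 = [[0, -1], [1, 0]]

def is_skew_hermitian(a):
    return conj_transpose(a) == scale(-1, a)

def cyclic_relations_hold():
    return (bracket(H1, H2) == scale(2, H3)
            and bracket(H2, H3) == scale(2, H1)
            and bracket(H3, H1) == scale(2, H2))

# Z = i H3 and X+- = (1/2)(i H1 -+ H2)
Z = scale(1j, H3)
X_PLUS = scale(0.5, add(scale(1j, H1), scale(-1, H2)))
X_MINUS = scale(0.5, add(scale(1j, H1), H2))
```

With these definitions, X_PLUS equals (1/2)(1 i; i −1) and bracket(X_PLUS, X_MINUS) equals Z.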

For G = Heis(R), as in Example 6.8 in 6.3, we have

g = heis(R) = < P,Q, R >


with the Heisenberg commutation relations

[P, Q] = 2R, [R, P ] = [R,Q] = 0.

We complexify g and take as a C-basis

Y0 := iR, Y± := (1/2)(P ± iQ)

to get the relations [Y+, Y−] = Y0, [Y0, Y±] = 0.

We want to construct a representation π̂ of g on a space V = < v_j >_{j∈J} (which we continue linearly to a representation of gc) such that π̂ can be unitarily integrated, i.e. we can find a unitary representation π of Heis(R) with dπ = π̂. In every group G one has a distinguished subgroup, the center C(G), defined by

C(G) := {g ∈ G; gg0 = g0g for all g0 ∈ G}.

Obviously for G = Heis(R) we have

C := C(Heis(R)) = {(0, 0, κ); κ ∈ R} ≃ R.

Hence, a representation π of our G restricted to the center C is built up (see Chapter 5) from the characters of R. We write here

ψ(κ) := ψ_m(κ) := exp(2πimκ) = e^m(κ), m ∈ R \ {0},

with derived representation dψ(κ) = 2πim. Having this in mind (and the discussion of Example 6.9 in 6.3), we propose the following construction of a representation (π̂, V), where we abbreviate again π̂(Y)v =: Y v.

We suppose that we have a vacuum vector, i.e. an element v_0 ∈ V with

Y0 v_0 = µ v_0, Y− v_0 = 0, µ = 2πm.

The first relation is inspired by differentiation of the prescription of the Schrödinger representation

π((0, 0, κ))f = ψ(κ)f

where we have (0, 0, κ) = exp κR and Y0 = iR.

We put v_j := Y+^j v_0. As Y0 commutes with Y±, we have Y0 v_j = µ v_j. Then we try to realize a relation

Y− v_j = a_j v_{j−1}:

From the commutation relation [Y+, Y−] = Y0 we deduce

Y− v_1 = Y−Y+ v_0 = (Y+Y− − Y0) v_0 = −µ v_0,
Y− v_j = Y−Y+ v_{j−1} = (Y+Y− − Y0) v_{j−1} = (a_{j−1} − µ) v_{j−1} = a_j v_{j−1},

and, hence, by induction

a_j = −jµ.

Thus we have V spanned by v_0, v_1, v_2, . . . with

Y0 v_j = µ v_j, Y+ v_j = v_{j+1}, Y− v_j = −jµ v_{j−1}.


We check whether this representation can be unitarized: We look at the necessary condition in Proposition 6.2 in 6.4. The skew-Hermiticity of P, Q, R leads to

Y±∗ = dπ(Y±)∗ = (1/2)(P∗ ∓ iQ∗) = −(1/2)(P ∓ iQ) = −Y∓.

Hence, as in the sl-discussion, we get the necessary condition for a scalar product <, > on V

< Y±v, w > = − < v, Y∓w >,

in particular, for v = v_{j−1}, w = v_j, we have

‖v_j‖² = jµ ‖v_{j−1}‖².

We see that unitarity is possible iff µ = 2πm > 0. A model, i.e. a realization of this representation π̂ by integration, is given by the Schrödinger representation π_m, which we discussed already on several occasions (see Example 6.9 and Exercise 6.17 in 6.3).
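The recursion behind a_j = −jµ and the resulting norms can be traced step by step. The following sketch uses only the relations stated above, namely a_j = a_{j−1} − µ with a_0 = 0 and ‖v_j‖² = −a_j ‖v_{j−1}‖², with the normalization ‖v_0‖² = 1 as our own assumption; the function name is ours.

```python
from fractions import Fraction

def ladder_data(mu, j_max):
    """Return the lists (a_0..a_jmax, ||v_0||^2..||v_jmax||^2).

    a_j is defined by Y- v_j = a_j v_{j-1}; the norms follow from
    ||v_j||^2 = j * mu * ||v_{j-1}||^2 = -a_j * ||v_{j-1}||^2.
    """
    mu = Fraction(mu)
    a = [Fraction(0)]        # a_0 = 0 because Y- v_0 = 0
    norms = [Fraction(1)]    # assumed normalization ||v_0||^2 = 1
    for j in range(1, j_max + 1):
        a.append(a[j - 1] - mu)
        norms.append(-a[j] * norms[j - 1])
    return a, norms

def norms_are_positive(mu, j_max=10):
    """True if all squared norms up to j_max are positive."""
    _, norms = ladder_data(mu, j_max)
    return all(x > 0 for x in norms)
```

For µ > 0 one finds ‖v_j‖² = j! µ^j, while for µ < 0 already ‖v_1‖² is negative, which is the obstruction to unitarity described above.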

Remark 6.11: Our infinitesimal considerations show that πm is irreducible.

Remark 6.12: The discussion above is already a good part of the way to a proof of the famous Stone-von Neumann Theorem, stating that up to equivalence there is no other irreducible unitary representation π of Heis(R) with π|_C = ψ_m for m ≠ 0 (for a complete proof see e.g. [LV] p.19ff).

Remark 6.13: What we did here is essentially what appears in the physics literature under the heading of harmonic oscillator. Our v_0 = f_0 with f_0(x) = e^{−2πmx²} describes the vacuum, and Y+ and Y− are creation resp. annihilation operators producing higher resp. lower excitations.

Exercise 6.25: Use Exercise 6.17 in 6.3 to verify the equation

(Y+Y− + Y−Y+)f = (1/2)(f ′′ − (4πmx)2f) = −2πmf

and show that f0 is a solution. Recover the Hermite polynomials.

6.6 Roots, Weights and some Structure Theory

In the examples discussed above in 6.4 and 6.5 we realized the representations of the complexification gc of the given Lie algebra g = Lie G by ladder operators X± resp. Y± changing generators v_j of our representation space V by going to v_{j±2} resp. v_{j±1} in a controlled way. For G = SL(2,R) and SU(2) the indices of the generators were related to the weights, i.e. the eigenvalues of the operator Z. For G = Heis(R) we observe a different behaviour: We have an operator Y0 producing the same eigenvalue when applied to all generators of our representation space. This is an expression of the fact that sl(2,R) and su(2) resp. the groups SL(2,R) and SU(2) on the one hand and heis(R) resp. Heis(R) on the other are examples of two different types, for which we have to expect different ways of generalization. Let us already guess that we expect to generalize the machinery of the weights in the first case (the semisimple one) by assuming to have a greater number of commuting operators whose eigenvalues constitute the weights and to have two different kinds of operators raising or lowering the weights (ordered, say, lexicographically).


We have to introduce some more notions to get the tools for an appropriate structure theory. The reader interested mainly in special examples and eager to see the representation theory of the other groups mentioned in the introduction may want to skip this section. Otherwise she or he is strongly advised to study in parallel a more detailed source like [Hu], [HN] or [Ja]. Here we follow [Ki] p.92f, [KT], and [Kn1], which do not give proofs either (for these see [Kn] p.113ff, and/or [Kn2] Ch.I - VI).

6.6.1 Specifications of Groups and Lie Algebras

In this section we mainly treat Lie algebras. But we start with some definitions for groups, which often correspond to related definitions for algebras.

The reports [Kn1] and [KT] take as their central objects real or complex linear connected reductive groups. The definition of a reductive group is not the same with all authors, but the following definition ([Kn] p.3, [KT] p.25) is very comprehensive and practical for our purposes.

Definition 6.7: A linear connected reductive group is a closed connected group of real or complex matrices that is stable under conjugate transpose, i.e. under the Cartan involution

Θ : Mn(C) −→ Mn(C), X −→ (^tX̄)^{−1}.

A linear connected semisimple group is a linear connected reductive group with finite center.

More standard is the following definition: A group G is called simple iff it is non-trivial and has no normal subgroups other than {e} and G itself.

Example 6.10: The following groups are reductive:

a) G = GL(n,C),
b) G = SL(n,C),
c) G = SO(n,C),
d) G = Sp(n,C) := {g ∈ SL(2n,C); ^tg J g = J}, where \( J := \begin{pmatrix} 0 & E_n \\ -E_n & 0 \end{pmatrix} \).

The center of GL(n,C) is isomorphic to C∗, so GL(n,C) is not semisimple. The other groups are semisimple (with the exception n = 2 for c)). More examples come up by the groups of real matrices in the above complex groups. GL(n,R) is disconnected and therefore not reductive in the sense of the above definition; however, its identity component is. We will come back to a classification scheme behind these examples in the discussion of the corresponding notions for Lie algebras.

For the moment we follow another approach (as in [Ki] p.17). Non-commutative groups can be classified according to the degree of their non-commutativity: Given two subsets A and B of the group G, let [A,B] denote the set of all elements of the form aba^{−1}b^{−1} as a runs through A and b runs through B. We define two sequences of subgroups:

Let G^0 := G and for n ∈ N let G^n be the subgroup generated by the set [G^{n−1}, G^{n−1}]. G^n is called the n-th derived group of G.

Let G_0 := G and for n ∈ N let G_n be the subgroup generated by the set [G, G_{n−1}].

We obviously obtain the following inclusions

G = G^0 ⊃ G^1 ⊃ · · · ⊃ G^n ⊃ . . . ,

G = G_0 ⊃ G_1 ⊃ · · · ⊃ G_n ⊃ . . . .

For a commutative group these sequences are trivial, we have G^n = G_n = {e} for all n ∈ N.

Definition 6.8: G is called solvable of class k iff G^n = {e} beginning with n = k.
G is called nilpotent of class k iff G_n = {e} beginning with n = k.

Exercise 6.26: Show that Heis(R) is nilpotent and solvable of class 1.

Now we pass to the corresponding concepts for Lie algebras (following [Ki] p.89).

Definition 6.9: A linear subspace g0 in a Lie algebra g is called a subalgebra iff

[X,Y] ∈ g0 for all X, Y ∈ g0.

A linear subspace a in a Lie algebra g is called an ideal iff

[X,Y] ∈ a for all X ∈ g and Y ∈ a.

If a is an ideal in g, then the factor space g/a is provided in a natural way with the structure of a Lie algebra, which is called the factor algebra of g by a.

In every Lie algebra g we can define two sequences of subspaces, where here the expression [a, b] denotes the linear hull of all [X,Y], X ∈ a, Y ∈ b:

g_1 := [g, g], g_2 := [g, g_1], . . . , g_{n+1} := [g, g_n],
g^1 := [g, g], g^2 := [g^1, g^1], . . . , g^{n+1} := [g^n, g^n].

It is clear that one has the following inclusions

g_n ⊃ g_{n+1}, g^n ⊃ g^{n+1}, g_n ⊃ g^n, n ∈ N.

Exercise 6.27: Prove that all g_n and g^n are ideals in g and that g_n/g_{n+1} and g^n/g^{n+1} are commutative.

For dim g < ∞ the sequences (g_n) and (g^n) must stabilize: beginning with a certain n we have g_n = g_{n+1} = · · · = g_∞ and g^n = g^{n+1} = · · · = g^∞.

Definition 6.10: g is called solvable resp. nilpotent iff g^∞ = 0 resp. g_∞ = 0.

Obviously every nilpotent algebra is solvable.

Example 6.11: heis(R) is nilpotent, and hence solvable, as we have more generally

b_n(K) := {X = (x_ij) ∈ Mn(K); x_ij = 0 for i > j} is solvable,
n_n(K) := {X = (x_ij) ∈ Mn(K); x_ij = 0 for i ≥ j} is nilpotent.

It is clear that every X ∈ n_n is nilpotent, i.e. has the property X^n = 0. This is generalized by the following statement justifying the name nilpotent algebra:

Theorem 6.4 (Engel): g is nilpotent iff the operator ad X is nilpotent for all X ∈ g, i.e. for every X there is an n ∈ N with (ad X)^n = 0.

We recall that ad X is defined by ad X(Y) := [X, Y] for all Y ∈ g.
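Both notions of nilpotency can be seen concretely on strictly upper triangular 3×3 matrices, which form the algebra n_3(K) of Example 6.11. The sketch below is only an illustration with integer entries and helper names of our own: it checks X³ = 0 as a matrix and (ad X)² Y = 0 for X, Y in this algebra.

```python
ZERO = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]

def mul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def sub(x, y):
    return [[x[i][j] - y[i][j] for j in range(3)] for i in range(3)]

def strictly_upper(a, b, c):
    """Strictly upper triangular matrix with entries a, b next to the diagonal and c in the corner."""
    return [[0, a, c], [0, 0, b], [0, 0, 0]]

def ad(x, y):
    """ad X (Y) = [X, Y] = XY - YX."""
    return sub(mul(x, y), mul(y, x))

def cube_is_zero(x):
    return mul(mul(x, x), x) == ZERO

def ad_squared_is_zero(x, y):
    return ad(x, ad(x, y)) == ZERO
```

Note that X² need not vanish, so the matrix X is nilpotent of order 3 while ad X restricted to n_3 is already nilpotent of order 2.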

Now we pass to the other side of Lie algebra structure theory. As in [Kn1] p.1 ff we restrict ourselves to the treatment of finite dimensional real or complex algebras g.

Definition 6.11: g is said to be simple iff g is non-abelian and g has no proper non-zero ideals.

In this case one has [g, g] = g, which shows that we are as far from solvability as possible.

Definition 6.12: g is said to be semisimple iff g has no non-zero abelian ideal.

There are other (equivalent) definitions, for instance g is said to be semisimple iff one has rad g = 0. In this definition the radical rad g is the sum of all solvable ideals of g.

Semisimple and simple algebras are related as follows.

Theorem 6.5: g is semisimple iff g is the sum of simple ideals. In this case there are no other simple ideals, the direct sum decomposition is unique up to the order of the summands, and every ideal is the sum of some of the simple ideals. Also in this case [g, g] = g.

Finally, from [KT] p.29 we take over

Definition 6.13: A reductive Lie algebra is a Lie algebra that is the direct sum of two ideals, one equal to a semisimple algebra and the other to an abelian Lie algebra.

We have a practical criterion:

Theorem 6.6: If g is a real Lie algebra of real or complex (even quaternion) matrices closed under conjugate transposition, then g is reductive. If moreover the center of g is trivial, i.e. Zg := {X ∈ g; [X, Y] = 0 for all Y ∈ g} = 0, then g is semisimple.

Reductive Lie algebras have a very convenient property:

Proposition 6.3: A Lie algebra g is reductive iff each ideal a in g has a complementary ideal, i.e. an ideal b with g = a ⊕ b.

As is to be expected, to some extent the notions just defined for groups and algebras fit together and one has statements like the following:

Proposition 6.4 ([KT] p.29): If G is a linear connected semisimple group, then g = Lie G is semisimple. More generally, if G is linear connected reductive, then g is reductive with g = Zg ⊕ [g, g] as a direct sum of ideals. Here Zg denotes the center of g, and the commutator ideal is semisimple.

Example 6.12: gl(n,R) = scalars ⊕ sl(n,R).


The main tool for the structure theory of semisimple Lie algebras is the following bilinear form that was first defined by Killing and then extensively used by E. Cartan.

Definition 6.14: The Killing form B is the symmetric bilinear form on g defined by

B(X,Y) := Tr(ad X ad Y) for all X, Y ∈ g.

Remark 6.14: B is invariant in the sense that

B([X, Y], Z) = B(X, [Y,Z]) for all X, Y, Z ∈ g.

Exercise 6.28: Determine the matrix of B with respect to the basis
a) F, G, H (from Exercise 6.3 in 6.1) for g = sl(2,R),
b) P, Q, R (from Exercise 6.7 in 6.1) for g = heis(R).

The starting point for structure theory of semisimple Lie algebras is Cartan's Criterion for Semisimplicity.

Theorem 6.7: g is semisimple if and only if the Killing form is nondegenerate, that is B(X, Y) = 0 for all Y ∈ g implies X = 0.

The proof uses the remarkable fact that ker B is an ideal in g.

There is another important and perhaps a bit more accessible bilinear form on g: The trace form B0 is given by

B0(X, Y ) := Tr (XY ) for all X, Y ∈ g.

The trace form is invariant in the same sense as the Killing form. Both forms are related (see e.g. [Fog] IV.4):

Remark 6.15: We have

B(X,Y) = 2n Tr(XY) for g = sl(n,K) and n ≥ 2,
B(X,Y) = (n − 2) Tr(XY) for g = so(n,K) and n ≥ 3,
B(X,Y) = 2(n + 1) Tr(XY) for g = sp(2n,K) and n ≥ 1.
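As an illustration of the first line of Remark 6.15 for n = 2, the sketch below computes the Killing form on su(2) from the adjoint matrices and compares it with 4 Tr(XY). It assumes the basis H1 = diag(−i, i), H2 = (0 −i; −i 0), H3 = (0 −1; 1 0) of su(2), and it uses the fact (checked by hand for this basis) that Tr(H_j H_k) = −2 δ_jk to read off coordinates. All helper names are ours.

```python
# Assumed basis of su(2); su(2) is a real form of sl(2,C), so n = 2 here.
H1 = [[-1j, 0], [0, 1j]]
H2 = [[0, -1j], [-1j, 0]]
H3 = [[0, -1], [1, 0]]
BASIS = [H1, H2, H3]

def mul(a, b):
    d = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(d)) for j in range(d)]
            for i in range(d)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def bracket(a, b):
    ab, ba = mul(a, b), mul(b, a)
    d = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(d)] for i in range(d)]

def coords(x):
    """Coordinates of x in BASIS, using Tr(H_j H_k) = -2 delta_jk."""
    return [trace(mul(x, h)) / (-2) for h in BASIS]

def ad_matrix(x):
    """3x3 matrix of ad x; column k holds the coordinates of [x, H_k]."""
    cols = [coords(bracket(x, h)) for h in BASIS]
    return [[cols[k][j] for k in range(3)] for j in range(3)]

def killing(x, y):
    return trace(mul(ad_matrix(x), ad_matrix(y)))

KILLING = [[killing(a, b) for b in BASIS] for a in BASIS]
TRACE_FORM = [[4 * trace(mul(a, b)) for b in BASIS] for a in BASIS]
```

Both matrices come out as −8 times the unit matrix, so the Killing form of su(2) is negative definite in this basis.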

A variant of the trace form already appeared in our definition of topology in matrix spaces. We introduced for X, Y ∈ Mn(C)

< X,Y > := Re Tr(^tX̄ Y) and ‖X‖² := Re Tr(^tX̄ X).

As infinitesimal object corresponding to the Cartan involution for our matrix groups G

Θ : GL(n,C) −→ GL(n,C), g −→ (^tḡ)^{−1},

we have the map for the Lie algebra g = Lie G

θ : Mn(C) −→ Mn(C), X −→ −^tX̄.

Hence we have

< X, Y > = −Re B0(X, θY)

as a scalar product on g as a real vector space. This will become important later on.


6.6.2 Structure Theory for Complex Semisimple Lie Algebras

A complete classification exists for semisimple Lie algebras, as one knows their building blocks, the simple algebras: Over C there exist four infinite series of classical simple Lie algebras and five exceptional simple Lie algebras. Over R we have 12 infinite series and 23 exceptional simple algebras. From all these, here we reproduce only the following standard list of classical complex simple algebras:

1) A_n := {X ∈ M_{n+1}(C); Tr X = 0}, n = 1, 2, 3, . . . ,
2) B_n := {X ∈ M_{2n+1}(C); X = −^tX}, n = 2, 3, . . . ,
3) C_n := {X ∈ M_{2n}(C); ^tX J_{2n} + J_{2n} X = 0}, n = 3, 4, . . . , where \( J_{2n} := \begin{pmatrix} 0 & E_n \\ -E_n & 0 \end{pmatrix} \),
4) D_n := {X ∈ M_{2n}(C); X = −^tX}, n = 4, 5, . . . .

We have the isomorphisms

B1 ≃ A1 ≃ C1, C2 ≃ B2, D2 ≃ A1 ⊕ A1, D3 ≃ A3

and D1 is commutative. In the sequel we shall in most cases return to our previous notation and write

A_n = sl(n + 1,C), B_n = so(2n + 1,C), C_n = sp(n,C), D_n = so(2n,C).

All these algebras are also simple algebras over R. The remaining real classical algebras remain simple upon complexification. For a complete list see for instance [Ki] p.92/3.

By the way, we introduce here the following concept, which is also important otherwise:

Definition 6.15: Given a complex Lie algebra g, a real algebra g0 is called a real form of g iff (g0)c := g0 ⊗ C = g.

Example 6.13: g = sl(2,C) has just two real forms g0 = sl(2,R) and su(2). More about this is to be found in [Kn1] p.15.

The main ingredients for a classification scheme are the root and Dynkin diagrams. We give a sketch, starting with the case of complex semisimple algebras, where the theory is most accessible, and then go to compact real and finally briefly to noncompact real algebras. The central tool to define roots and their diagrams is a certain distinguished abelian subalgebra.

Definition 6.16: Let g be a complex semisimple Lie algebra. A Cartan subalgebra h is a maximal abelian subspace of g in which every ad H, H ∈ h, is diagonalizable.

There are other equivalent ways to characterize a Cartan subalgebra h, for instance (see [Kn1] p.2): h is a nilpotent subalgebra whose normalizer Ng(h) satisfies

Ng(h) := {X ∈ g; [X, H] ∈ h for all H ∈ h} = h.

Each semisimple complex algebra has a Cartan subalgebra. Any two are conjugate via Int g ([Kn1] p.24), where Int g is a certain analytic subgroup of GL(g) with Lie algebra ad g. We refer to [Kn2] p.69/70 for the notion of an analytic subgroup, which comes up quite naturally when one analyzes the relation between Lie algebras and Lie groups.


The theory of Cartan subalgebras for the complex semisimple case extends to a complex reductive Lie algebra g by just saying that the center of g is to be adjoined to a Cartan subalgebra of the semisimple part of g.

Example 6.14: For g = sl(n,C) we have as Cartan subalgebra

h := {diagonal matrices in g}.

Using our Cartan subalgebra h, we generalize the decomposition we introduced at the beginning of Section 6.4 for g = sl(2,C)

g = < Z0 > + < X+ > + < X− >,

[Z0, X±] = ±2X±, [X+, X−] = Z0,

to the root space decomposition

g = h ⊕ ⊕_{α∈∆} g_α

as follows: For all H ∈ h the maps ad H are diagonalizable, and, as the elements of h all commute, they are simultaneously diagonalizable (by the finite-dimensional spectral theorem from Linear Algebra). Let V_1, . . . , V_ℓ be the eigenspaces in g for the different systems of eigenvalue tuples. If h := < H_1, . . . , H_r > and ad H_i acts as λ_ij · id on V_j, define a linear functional λ_j on h by λ_j(H_i) = λ_ij. If H := Σ c_i H_i, then ad H acts on V_j by multiplication with

Σ_i c_i λ_ij = Σ_i c_i λ_j(H_i) =: λ_j(H).

In other words, ad h acts in simultaneously diagonal fashion on g and the simultaneous eigenvalues are members of the dual vector space h∗ := Hom_C(h,C). There are finitely many such simultaneous eigenvalues and we write

g_λ := {X ∈ g; [H,X] = λ(H)X for all H ∈ h}

for the eigenspace corresponding to λ ∈ h∗. The nonzero such λ are called roots and the corresponding g_λ a root space. The (finite) set of all roots is denoted by ∆.

The following are some elementary properties of root space decompositions (for proofs see for instance [Kn2] II.4):

Proposition 6.5: We have

a) [g_α, g_β] ⊂ g_{α+β}.
b) If α, β ∈ ∆ \ {0} and α + β ≠ 0, then B(g_α, g_β) = 0, i.e. root spaces are orthogonal with respect to the Killing form.
c) B is nonsingular on g_α × g_{−α} if α ∈ ∆.
d) −α ∈ ∆ if α ∈ ∆.
e) B|_{h×h} is nondegenerate. We define H_α to be the element of h paired with α.
f) ∆ spans h∗.

Some deeper properties of root space decompositions are assembled in the following statement (see again [Kn2] II.4).

Theorem 6.8: Root space decompositions have the following properties:

a) dim g_α = 1 for α ∈ ∆.
b) nα ∉ ∆ for α ∈ ∆ and any integer n ≥ 2.
c) [g_α, g_β] = g_{α+β} for α + β ≠ 0.
d) The real subspace h0 of h on which all roots are real is a real form of h and B|_{h0×h0} is an inner product.

We transfer B|_{h0×h0} to the real span h_0^∗ of the roots, obtaining a scalar product <,> and a norm ‖ . ‖. It is not too difficult to see that for α ∈ ∆ there is an orthogonal transformation s_α given on h_0^∗ by

s_α(ϕ) := ϕ − (2 < ϕ, α > / ‖α‖²) α for ϕ ∈ h_0^∗.

s_α is called a root reflection in α and the hyperplane α⊥ a mirror.

The analysis of the root space shows that it has very nice geometrical and combinatorial properties. The abstraction of these is the following:

Definition 6.17: An abstract root system is a finite set ∆ of nonzero elements in a real inner product space V such that

a) ∆ spans V,
b) all reflections s_α for α ∈ ∆ carry ∆ to itself,
c) 2 < β, α > / ‖α‖² ∈ Z for all α, β ∈ ∆.

The abstract root system is called reduced iff α ∈ ∆ implies 2α ∉ ∆. And it is called reducible iff ∆ = ∆′ ∪ ∆′′ with ∆′ ⊥ ∆′′; otherwise it is irreducible.

The root system of a complex semisimple Lie algebra g with respect to a Cartan subalgebra h forms a reduced abstract root system in h_0^∗. And a semisimple Lie algebra g is simple if the corresponding root system is irreducible.
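Conditions b) and c) of Definition 6.17 and the notion of a reduced system can be tested mechanically on small examples. The sketch below works with integer vectors and the standard inner product; it takes as examples the six vectors e_i − e_j in Z³ (a standard model of the rank 2 system of type A2), a standard model of type B2, and a non-reduced rank 1 system. These models and all names are our own choices for illustration, and condition a) is not checked.

```python
from itertools import permutations

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def reflect(phi, alpha):
    """s_alpha(phi) = phi - (2<phi,alpha>/|alpha|^2) alpha.

    Assumes the coefficient is an integer, as condition c) demands.
    """
    c = (2 * dot(phi, alpha)) // dot(alpha, alpha)
    return tuple(p - c * a for p, a in zip(phi, alpha))

def check_root_system(roots):
    """Test conditions b), c) of Definition 6.17 and reducedness."""
    roots = set(roots)
    integral = all((2 * dot(b, a)) % dot(a, a) == 0
                   for a in roots for b in roots)
    stable = integral and all(reflect(b, a) in roots
                              for a in roots for b in roots)
    reduced = all(tuple(2 * x for x in a) not in roots for a in roots)
    return {"integral": integral, "stable": stable, "reduced": reduced}

# Type A2: all e_i - e_j with i != j in Z^3.
A2 = [tuple((k == i) - (k == j) for k in range(3))
      for i, j in permutations(range(3), 2)]
# Type B2: +-e1, +-e2, +-e1 +- e2 in Z^2.
B2 = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, 1), (1, -1), (-1, 1), (-1, -1)]
# A non-reduced system of rank 1.
BC1 = [(1,), (-1,), (2,), (-2,)]
```

Here check_root_system(A2) and check_root_system(B2) report all three properties as satisfied, while BC1 satisfies b) and c) but is not reduced.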

Definition 6.18: The dimension of the underlying space V of an abstract root system ∆ is called its rank, and if ∆ is the root system of a semisimple Lie algebra g, we also refer to r = dim h as the rank of g.

To give an illustration, we sketch the reduced root systems of rank 2.

[Figure: the reduced root systems of rank 2 (case 1 to case 4).]


Ordering of the Roots, Cartan Matrices, and Dynkin Diagrams

We fix an ordered basis α1, . . . , αr of h∗0 and define λ = Σciαi to be positive if the first

nonzero ci is positive. The ordering comes from saying λ > µ if λ − µ is positive. Let∆+ be the set of positive members in ∆.

A root α is called simple if α > 0 and α does not decompose as α = β1 + β2 with β1 andβ2 positive roots. And a root α is called reduced if (1/2)α is not a root.Relative to a given simple system α1, . . . , αr, the Cartan matrix C is the r × r-matrixwith entries

cij = 2 < αi, αj > / ‖ αi ‖2 .

It has the following propertiesa) cij ∈ Z for all i, j,b) cii = 2 for all i,c) cij ≤ 0 for all i = j,d) cij = 0 iff cji = 0,e) there exists a diagonal matrix D with positive diagonal entries such that DCD−1 issymmetric positive definite.

An abstract Cartan matrix is a square matrix satisfying properties a) through e) as above.To such a matrix we associate a diagram usually called a Dynkin diagram (historicallymore correct is the term CDW-diagram, C indicating Coxeter and W pointing to Witt):To the elements α1, . . . , αr of a basis of a root system correspond bijectively r points ofa plane, which are also called α1, . . . , αr. For i = j one joins the point αi to αj by cijcji

lines, which do not touch any αk, k = i, j. For cij = cji the lines are drawn as arrowspointing in the direction to αj if cij < cji.

The main facts are that a Cartan matrix and equivalently the CDW-diagram determine the Lie algebra uniquely (up to isomorphism). In particular this correspondence does not depend on the choice of a basis of the root system.

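Example 6.15: For g = sl(r + 1,C), r ≥ 1, and its standard Cartan subalgebra

$$\mathfrak{h} := \{ D(d_1, \dots, d_{r+1});\ d_1, \dots, d_{r+1} \in \mathbf{C},\ \textstyle\sum d_i = 0 \}$$

we have the r × r Cartan matrix

$$C := \begin{pmatrix} 2 & -1 & & & \\ -1 & 2 & \ddots & & \\ & \ddots & \ddots & \ddots & \\ & & \ddots & 2 & -1 \\ & & & -1 & 2 \end{pmatrix}$$

and the corresponding CDW-diagram

Ar, r ≥ 1:  α1 --- α2 --- . . . --- αr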
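The matrix of Example 6.15 can be reproduced from the usual simple roots of sl(r + 1,C). The sketch below assumes the standard choice αi = ei − ei+1 (the ei being the coordinate functionals on diagonal matrices) and reads off the number cijcji of lines between two points of the diagram; both function names are hypothetical.

```python
def cartan_matrix_A(r):
    """Cartan matrix of type A_r from the simple roots e_i - e_{i+1} in R^{r+1}."""
    roots = []
    for i in range(r):
        v = [0] * (r + 1)
        v[i], v[i + 1] = 1, -1
        roots.append(v)

    def inner(u, w):
        return sum(a * b for a, b in zip(u, w))

    # All simple roots have squared length 2, so the division is exact.
    return [[2 * inner(a, b) // inner(a, a) for b in roots] for a in roots]


def dynkin_edges(C):
    """Number of lines c_ij * c_ji joining alpha_i and alpha_j for i < j."""
    r = len(C)
    return {(i + 1, j + 1): C[i][j] * C[j][i]
            for i in range(r) for j in range(i + 1, r)
            if C[i][j] * C[j][i] != 0}
```

For r = 3 the edges are a single line between α1 and α2 and a single line between α2 and α3, i.e. the chain drawn above.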


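The diagrams for the other classical algebras are the following:

[Diagram Br, r ≥ 2: the chain α1, α2, . . . , αr−1, αr with an arrowed double line between αr−1 and αr.]

[Diagram Cr, r ≥ 3: the chain α1, α2, . . . , αr−1, αr with an arrowed double line between αr−1 and αr.]

[Diagram Dr, r ≥ 4: the chain α1, α2, . . . , αr−2 with both αr−1 and αr joined to αr−2.]

The remaining exceptional graphs of indecomposable reduced root systems are the following:

[Diagrams E6, E7, E8, F4, G2.]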

6.6.3 Structure Theory for Compact Real Lie Algebras

Up to now, we treated complex Lie algebras. We go over to a real Lie algebra g0 and (following [Kn1] p.16) we call a subalgebra h0 of g0 a Cartan subalgebra if its complexification is a Cartan subalgebra of the complex algebra g = g0 ⊗ C. At first we look at the compact case: If g0 is the Lie algebra of a compact Lie group G and if t0 is a maximal abelian subspace of g0, then t0 is a Cartan subalgebra. We have already discussed the example g0 = su(2) in 4.2 and we will discuss g0 = su(3) in the next section and see that we have dim t0 = 2 in this case. To illustrate the general setting, we collect some more remarks about the background notions (from [Kn1] p.15):


Theorem 6.9: If g0 is semisimple, then the following conditions are equivalent:
a) g0 is the Lie algebra of some compact Lie group.
b) Int g0 is compact.
c) The Killing form of g0 is negative definite.

If g is semisimple complex, a real form g0 of g is said to be compact if the equivalent conditions of the theorem hold. Our main example su(n) is a compact form of sl(n,C). The fundamental result is here:

Theorem 6.10: Each complex semisimple Lie algebra has a compact real form.

Another important topic is maximal tori.

Definition 6.19: Let G be a compact connected linear group. A maximal torus in G is a subgroup T maximal with respect to the property of being compact connected abelian.

From [Kn2] Proposition 4.30, Theorem 4.34 and 4.36 we take over:

Theorem 6.11: If G is a compact connected linear group, then
a) the maximal tori in G are exactly the analytic subgroups corresponding to the maximal abelian subalgebras of g0 = Lie G;
b) any two maximal abelian subalgebras of g0 are conjugate via Ad G and hence any two maximal tori are conjugate via G.

Theorem 6.12: If G is compact connected and T a maximal torus, then each element of G is conjugate to a member of T.

Example 6.16:
1. For G = SU(n) one has as a maximal torus, its Lie algebra and its complexified Lie algebra

$$T = \{ D(e^{i\vartheta_1}, \dots, e^{i\vartheta_n});\ \textstyle\sum \vartheta_j = 0 \},$$
$$\mathfrak{t}_0 = \{ D(i\vartheta_1, \dots, i\vartheta_n);\ \textstyle\sum \vartheta_j = 0 \},$$
$$\mathfrak{t} = \{ D(z_1, \dots, z_n);\ \textstyle\sum z_j = 0 \}.$$

2. For G = SO(2n) and SO(2n + 1) one has as a maximal torus T the diagonal block matrices with 2 × 2-blocks r(ϑj), j = 1, . . . , n, in the diagonal, resp. additionally 1 in the second case.

We go back to the general case and use the notation we have introduced. In this setting, we can form a root-space decomposition

$$\mathfrak{g} = \mathfrak{t} \oplus \bigoplus_{\alpha \in \Delta} \mathfrak{g}_\alpha.$$

Each root is the complexified differential of a multiplicative character χα of the maximal torus T that corresponds to t0, with

Ad(t)X = χα(t)X for all X ∈ gα.


Another central concept is the one of the Weyl group. We keep the notation used above: G is compact connected, g0 = Lie G, T a maximal torus, t0 = Lie T, t = t0 ⊗ C, ∆(g, t) is the set of roots, and B is the negative of a G-invariant scalar product on g0. We define tR = it0. As roots are real on tR, they are in t∗R. The form B, when extended to be complex bilinear, is positive definite on tR, yielding a scalar product <,> on t∗R. Now, the Weyl group W = W(∆(g, t)) is in this context defined as the group generated by the root reflections sα, α ∈ ∆(g, t), given (as already fixed above) on t∗R by

$$s_\alpha(\lambda) = \lambda - \frac{2\langle \lambda, \alpha \rangle}{|\alpha|^2}\,\alpha.$$

This is a finite group, which also can be characterized like this: One defines W(G, T) as the quotient of the normalizer by the centralizer

W(G, T) := NG(T)/ZG(T).

By Corollary 4.52 in [Kn2] p.260 one has ZG(T) = T and hence also the formula W(G, T) = NG(T)/T.

Theorem 6.13: The group W(G, T), when considered as acting on t∗R, coincides with W(∆(g, t)).
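The finiteness of the group generated by the root reflections can be made concrete. On the simple roots a reflection acts by sαp(αq) = αq − cpqαp with the Cartan matrix entries cpq = 2 < αp, αq > / |αp|², so each generator is an integer matrix in the basis of simple roots. The following sketch closes the set of generators under multiplication; the three rank 2 Cartan matrices used as input are assumed standard data for the types A2, B2 and G2, and the function names are hypothetical.

```python
def reflection(C, p):
    """Matrix of s_{alpha_p} in the basis of simple roots.

    Column q holds the coordinates of alpha_q - c_pq * alpha_p.
    """
    r = len(C)
    return tuple(tuple((1 if i == q else 0) - (C[p][q] if i == p else 0)
                       for q in range(r)) for i in range(r))


def matmul(A, B):
    r = len(A)
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(r))
                       for j in range(r)) for i in range(r))


def weyl_group(C):
    """All elements of the group generated by the simple reflections."""
    r = len(C)
    generators = [reflection(C, p) for p in range(r)]
    identity = tuple(tuple(int(i == j) for j in range(r)) for i in range(r))
    group, frontier = {identity}, [identity]
    while frontier:
        w = frontier.pop()
        for s in generators:
            sw = matmul(s, w)
            if sw not in group:
                group.add(sw)
                frontier.append(sw)
    return group


CARTAN_A2 = [[2, -1], [-1, 2]]
CARTAN_B2 = [[2, -1], [-2, 2]]
CARTAN_G2 = [[2, -3], [-1, 2]]
```

The loop terminates only because the group is finite; for the three inputs it returns sets with 6, 8 and 12 elements.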

6.6.4 Structure Theory for Noncompact Real Lie Algebras

Now, in the final step, we briefly treat the general case of a real Lie algebra g0 of a noncompact semisimple group G. We already introduced the Killing form B, the Cartan involution θ with θX = −ᵗX for a matrix X, and the fact that if g0 consists of real, complex or quaternion matrices and is closed under conjugate transpose, then it is reductive. More generally we call here a Cartan involution θ any involution of g0 such that the symmetric bilinear form

Bθ(X, Y ) := −B(X, θY )

is positive definite. Then we have the Cartan decomposition

g0 = k0 ⊕ p0, k0 := {X ∈ g0; θX = X}, p0 := {X ∈ g0; θX = −X}

with the bracket relations

[k0, k0] ⊆ k0, [k0, p0] ⊆ p0, [p0, p0] ⊆ k0.
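The bracket relations can be verified by hand in the smallest example. The sketch below takes g0 = sl(2,R) with θX = −ᵗX, so that k0 consists of the skew-symmetric and p0 of the symmetric traceless matrices; the basis names K, H, S are chosen for this illustration only.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def bracket(X, Y):
    """Commutator [X, Y] = XY - YX."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[a - b for a, b in zip(r1, r2)] for r1, r2 in zip(XY, YX)]


def theta(X):
    """Cartan involution theta(X) = -X^T for real matrices."""
    n = len(X)
    return [[-X[j][i] for j in range(n)] for i in range(n)]


def in_k0(X):
    return theta(X) == X


def in_p0(X):
    return theta(X) == [[-x for x in row] for row in X]


K = [[0, 1], [-1, 0]]   # spans k0
H = [[1, 0], [0, -1]]   # H and S span p0
S = [[0, 1], [1, 0]]
```

With these matrices one finds [K, H] = −2S and [K, S] = 2H in p0, and [H, S] = 2K in k0, in agreement with the three inclusions.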

One has the following useful facts ([Kn2] VI.2):
a) g0 has a Cartan involution.
b) Any two Cartan involutions of g0 are conjugate via Int g0.
c) If g is a complex semisimple Lie algebra, then any two compact real forms of g are conjugate via Int g.
d) If g is a complex semisimple Lie algebra and is considered as a real Lie algebra, then the only Cartan involutions of g are the conjugations with respect to the compact real forms of g.

The Cartan decomposition of Lie algebras has a global counterpart (see for instance [Kn1] Theorem 4.3). Its most rudimentary form is the following:

Proposition 6.6: For every A ∈ GL(n,R) resp. GL(n,C), there are S1, S2 ∈ O(n) resp. U(n) and a diagonal matrix D with real positive elements in the diagonal such that A = S1DS2. This decomposition is not unique.

Proof as Exercise 6.29 (use the fact that to each positive definite matrix B there is a unique positive definite matrix P with B = P² and/or see [He] p.73).

Restricted Roots

We want to find a way to describe Cartan subalgebras in our general situation: Let G be a semisimple linear group with Lie G = g0, g = g0 ⊗ C, θ a Cartan involution of g0, and g0 = k0 ⊕ p0 the corresponding Cartan decomposition. Let B be the Killing form (or more generally any nondegenerate symmetric invariant bilinear form on g0 with B(θX, θY) = B(X, Y) such that Bθ(X, Y) = −B(X, θY) is positive definite).

Definition 6.20: Let a0 be a maximal abelian subspace of p0. Restricted roots are the nonzero λ ∈ a∗0 such that

(g0)λ := {X ∈ g0; (ad H)X = λ(H)X for all H ∈ a0} ≠ 0.

Let Σ be the set of restricted roots and m0 := Zk0(a0). We fix a basis of a0 and an associated lexicographic ordering in a∗0 and define Σ+ as the set of positive restricted roots. Then

$$\mathfrak{n}_0 = \bigoplus_{\lambda \in \Sigma^+} (\mathfrak{g}_0)_\lambda$$

is a nilpotent Lie subalgebra, and we have the following Iwasawa decomposition:

Theorem 6.14: The semisimple Lie algebra g0 is a vector space direct sum

g0 = k0 ⊕ a0 ⊕ n0.

Here a0 is abelian, n0 is nilpotent, a0 ⊕ n0 is a solvable subalgebra of g0, and a0 ⊕ n0 has [a0 ⊕ n0, a0 ⊕ n0] = n0.

For a proof see for instance [Kn2] Proposition 6.43. There is also a global version ([Kn2] Theorem 6.46) stating (roughly) that one has a diffeomorphism of the corresponding groups K × A × N −→ G given by (k, a, n) −→ kan. We already know all this in our standard example G = SL(2,R): If we take θX = −ᵗX, we have

g0 = sl(2,R) = k0 ⊕ a0 ⊕ n0,

with

$$\mathfrak{k}_0 = \Big\langle \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \Big\rangle, \quad \mathfrak{a}_0 = \Big\langle \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \Big\rangle, \quad \mathfrak{n}_0 = \Big\langle \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \Big\rangle$$

and as in Example 6.6 in 6.2

G = KAN with K = SO(2), A = {t(y); y > 0}, N = {n(x); x ∈ R}.
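For SL(2,R) the factors of g = kan can be computed by a Gram-Schmidt step on the first column of g. The sketch below is an illustration under the assumption that A is parametrized as diag(r, 1/r) with r > 0 and N as the upper triangular matrices with ones on the diagonal; the exact normalization of t(y) and n(x) in Example 6.6 is not reproduced here, and the function names are hypothetical.

```python
import math


def matmul2(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def iwasawa(g):
    """Decompose g in SL(2, R) as g = k * a * n.

    k is a rotation, a = diag(r, 1/r) with r > 0, n = [[1, x], [0, 1]].
    The formulas use det g = 1.
    """
    (a, b), (c, d) = g
    r = math.hypot(a, c)                    # length of the first column
    k = [[a / r, -c / r], [c / r, a / r]]   # rotation mapping e1 to the first column / r
    diag = [[r, 0.0], [0.0, 1.0 / r]]
    x = (a * b + c * d) / (r * r)
    n = [[1.0, x], [0.0, 1.0]]
    return k, diag, n


g = [[2.0, 1.0], [3.0, 2.0]]                # determinant 1
k, diag, n = iwasawa(g)
product = matmul2(matmul2(k, diag), n)      # reproduces g up to rounding
```

That the product reproduces g follows from ad − bc = 1; for matrices of other determinant the second diagonal entry of the A-factor would have to be adjusted.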


As is to be expected, roots and restricted roots are related to each other. If t0 is a maximal abelian subspace of m0 := Zk0(a0), then h0 := a0 ⊕ t0 is a Cartan subalgebra of g0 in the sense we defined at the beginning ([Kn2] Proposition 6.47). Roots are real valued on a0 and imaginary valued on t0. The nonzero restrictions to a0 of the roots turn out to be restricted roots. Roots and restricted roots can be ordered compatibly by taking a0 before it0. Cartan subalgebras in this setting are not always unique up to conjugacy.

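Exercise 6.30: Show that

$$\mathfrak{h}_0 := \Big\langle \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \Big\rangle \quad \text{and} \quad \mathfrak{h}_0' := \Big\langle \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \Big\rangle$$

are Cartan subalgebras of g0 = sl(2,R) and determine the corresponding Iwasawa decomposition.

Every Cartan subalgebra of g0 is conjugate (via Int g0) to this h′0 or the h0 above.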

6.6.5 Representations of Highest Weight

After this excursion into structure theory of complex and real Lie algebras, we finally come back to show a bit more of the general representation theory, which is behind the examples discussed in our sections 6.4 and 6.5 and which we shall apply in our next section to the example g0 = su(3). We start by following the presentation in [Kn1] p.8. Let at first g be a complex Lie algebra, h a Cartan subalgebra, ∆ = ∆(g, h) the set of roots, h0 the real form of h where roots are real valued, B the Killing form (or a more general form as explained above), and Hλ ∈ h0 corresponding to λ ∈ h∗0.
Let π : g −→ End V be a representation. For λ ∈ h∗ we put

Vλ := {v ∈ V ; (π(H) − λ(H)1)ⁿv = 0 for all H ∈ h and some n = n(H, V) ∈ N}.

If Vλ ≠ 0, Vλ is called a generalized weight space and λ a weight. If V is finite dimensional, V is the direct sum of its generalized weight spaces. This is a generalization of the fact from linear algebra about eigenspace decompositions of a linear transformation on a finite-dimensional vector space. If λ is a weight, then the subspace

$$V_\lambda^0 := \{ v \in V;\ \pi(H)v = \lambda(H)v \text{ for all } H \in \mathfrak{h} \}$$

is nonzero and called the weight space corresponding to λ. One introduces a lexicographic ordering among the weights and hence has the notion of highest and lowest weights. The set of weights belonging to a representation π is denoted by Γ(π). For dim h = 1 we have h ≅ h∗ ≅ C and so the weights are simply the complex numbers we met in our examples in 6.4 and 6.5, for instance for g = sl(2,C) we found representations σN with dim VN = N + 1 and weights −N, −N + 2, . . . , N − 2, N. In continuation of this, we treat a more general example.

Example 6.17: Let G = SU(n) and, hence, g = su(n)c = sl(n,C) and the Cartan subalgebra the diagonal subalgebra h := < H; Tr H = 0 >. We choose as generators of h the matrices Hj := Ejj − Ej+1,j+1 and (slightly misusing) we write also Hj = ej − ej+1, where ej denote the canonical basis vectors of Cⁿ⁻¹ ≅ h. We have three natural types of representations:
At first, let V be the space of homogeneous polynomials P of degree N in z1, . . . , zn and their conjugates and take the action of g ∈ SL(n,C) given by

$$(\pi(g)P)(z, \bar z) := P(g^{-1}z, \overline{g^{-1}z}), \quad z = {}^t(z_1, \dots, z_n).$$

Hence for H = D(iϑ1, . . . , iϑn) with ϑj ∈ R, Σϑj = 0, we come up with

$$(d\pi(H)P)(z, \bar z) = \sum_{j=1}^{n} (-i\vartheta_j z_j)\,\partial_{z_j} P(z, \bar z) + \sum_{j=1}^{n} \overline{(-i\vartheta_j z_j)}\,\partial_{\bar z_j} P(z, \bar z).$$

If P is a monomial

$$P(z, \bar z) = z_1^{k_1} \cdots z_n^{k_n} \cdot \bar z_1^{l_1} \cdots \bar z_n^{l_n} \quad \text{with} \quad \sum_{j=1}^{n} (k_j + l_j) = N,$$

then we get

$$d\pi(H)P = \sum_{j=1}^{n} (l_j - k_j)(i\vartheta_j)P.$$
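The last formula says that every monomial is a weight vector and that its weight is encoded by the integer vector (l1 − k1, . . . , ln − kn). The following sketch simply enumerates the monomials of degree N and collects these vectors with multiplicities; it assumes nothing beyond the formula above, and the function name is hypothetical. Two vectors that differ by a multiple of (1, . . . , 1) describe the same functional on the diagonal matrices of trace zero, which the sketch does not identify.

```python
from collections import Counter
from itertools import product


def monomial_weights(n, N):
    """Multiplicities of the vectors (l_j - k_j) over all monomials of degree N.

    A monomial is stored by its exponents (k_1..k_n) of the z_j followed by
    the exponents (l_1..l_n) of the conjugates.
    """
    weights = Counter()
    for exponents in product(range(N + 1), repeat=2 * n):
        if sum(exponents) == N:
            k, l = exponents[:n], exponents[n:]
            weights[tuple(lj - kj for kj, lj in zip(k, l))] += 1
    return weights


weights_n2_N1 = monomial_weights(2, 1)
```

For n = 2 and N = 1 the four monomials z1, z2 and their conjugates give four different vectors, each with multiplicity one.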

Exercise 6.31: Describe the weights for this representation with respect to the lexicographic ordering for the basis elements of h0 = < iH1, . . . , iHn−1 >R. Do this also for the two other “natural” representations, namely on the subspaces V1 of holomorphic and V2 of antiholomorphic polynomials in V (i.e. polynomials only in z1, . . . , zn resp. z̄1, . . . , z̄n). We will come back to this for the case n = 3 in the next section.

In our examples g0 = sl(2,R) and g0 = su(2) in 6.4 resp. 6.5 we saw that the finite-dimensionality of the representation space V was equivalent to an integrality condition for the weight. Now we want to look at the general case where the weights are l-tuples if one fixes a basis of a Cartan subalgebra h resp. its dual h∗ and the real form h∗0.

Definition 6.21: λ ∈ h∗ is said to be algebraically integral if

2 < λ, α > / |α|² ∈ Z for all α ∈ ∆.

Then as a generalization of what we saw in 6.4 and 6.5, we have the elementary properties of the weights for a finite-dimensional representation π on a vector space V:

Proposition 6.7:
a) π(h) acts diagonally on V, so that every generalized weight vector is a weight vector and V is the direct sum of all weight spaces.
b) Every weight is real valued on h0 and algebraically integral.
c) Roots and weights are related by π(gα)Vλ ⊂ Vλ+α.

We fix a lexicographical ordering and take ∆+ to be the set of positive roots with Π = {α1, . . . , αl} as the corresponding simple system of roots. We say that λ ∈ h∗0 is dominant if < λ, αj > ≥ 0 for all αj ∈ Π. The central fact is here the beautiful Theorem of the Highest Weight ([Kn2] Th. 5.5):

Theorem 6.15: The irreducible finite-dimensional representations π of g stand (up to equivalence) in one-one correspondence with the algebraically integral dominant linear functionals λ on h, the correspondence being that λ is the highest weight of πλ.
The highest weight λ of πλ has these additional properties:
a) λ depends only on the simple system Π and not on the ordering used to define Π.
b) The weight space Vλ for λ is one-dimensional.
c) Each root vector Eα for arbitrary α ∈ ∆+ annihilates the members of Vλ, and the members of Vλ are the only vectors with this property.
d) Every weight of πλ is of the form $\lambda - \sum_{i=1}^{l} n_i \alpha_i$ with ni ∈ N0 and αi ∈ Π.
e) Each weight space Vµ for πλ has dim Vwµ = dim Vµ for all w in the Weyl group W(∆), and each weight µ has |λ| ≥ |µ| with equality only if µ is in the orbit W(∆)λ.

We already introduced in 3.2 the concept of complete reducibility. Here we can state the following fact ([Kn2] Th. 5.29).

Theorem 6.16: Let π be a complex linear representation of g on a finite-dimensional complex vector space V. Then V is completely reducible in the sense that there exist invariant subspaces U1, . . . , Ur of V such that V = U1 ⊕ · · · ⊕ Ur and such that the restriction of the representation to each Ui is irreducible.

The proofs of these two theorems use three tools, which are useful also in other contexts:
– Universal enveloping algebras,
– Casimir elements,
– Verma modules.
Again following [Kn1] p.10f, we briefly present these as they will help us to a better understanding in our later chapters.

The Universal Enveloping Algebra

This is a general and far reaching concept applicable for any complex Lie algebra g:

We take the tensor algebra

T (g) := C ⊕ g ⊕ (g ⊗ g) ⊕ . . . .

and the two-sided ideal a in T(g) generated by all

X ⊗ Y − Y ⊗ X − [X, Y], X, Y ∈ T¹(g).

Here T¹(g) denotes the space of first order tensors.

Then the universal enveloping algebra is the associative algebra with identity given by

U(g) := T(g)/a.

This formal definition has the practical consequence that U(g) consists of sums of monomials usually written (slightly misusing) as

$$a_{(j)} X_1^{j_1} \cdots X_n^{j_n}, \quad a_{(j)} \in \mathbf{C},\ X_1, \dots, X_n \in \mathfrak{g},$$

with the usual addition and a multiplication coming up from the Lie algebra relations by XiXj = XjXi + [Xi, Xj]. More carefully and formally correct, one has the following: Let ι : g −→ U(g) be the composition of natural maps

$$\iota : \mathfrak{g} \cong T^1(\mathfrak{g}) \hookrightarrow T(\mathfrak{g}) \longrightarrow U(\mathfrak{g}),$$

so that

ι([X, Y]) = ι(X)ι(Y) − ι(Y)ι(X).


ι is in fact injective as a consequence of the fundamental Poincaré-Birkhoff-Witt Theorem:

Theorem 6.17: Let {Xi}i∈I be a basis of g, and suppose that a simple ordering has been imposed on the index set I. Then the set of all monomials

$$(\iota X_{i_1})^{j_1} \cdots (\iota X_{i_n})^{j_n}$$

with i1 < · · · < in and with all jk ≥ 0, is a basis of U(g). In particular the canonical map ι : g −→ U(g) is one-to-one.
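The rule XiXj = XjXi + [Xi, Xj] can be used as a rewriting procedure that brings any word into the ordered form of the theorem. The sketch below does this for g = sl(2,C), assuming the standard basis F, H, E with [H, E] = 2E, [H, F] = −2F, [E, F] = H and the ordering F < H < E; the encoding of monomials as strings and the function name are choices made for this illustration.

```python
ORDER = {"F": 0, "H": 1, "E": 2}

# Brackets [a, b] for the out-of-order pairs (a after b in the ordering),
# written as linear combinations of basis letters.
BRACKET = {
    ("H", "F"): {"F": -2},   # [H, F] = -2F
    ("E", "F"): {"H": 1},    # [E, F] = H
    ("E", "H"): {"E": -2},   # [E, H] = -2E
}


def normal_order(word, coeff=1, result=None):
    """Rewrite a word in F, H, E as a sum of ordered monomials F^p H^k E^q.

    Returns a dict mapping ordered words to integer coefficients.
    """
    if result is None:
        result = {}
    for i in range(len(word) - 1):
        a, b = word[i], word[i + 1]
        if ORDER[a] > ORDER[b]:
            # ab = ba + [a, b]
            normal_order(word[:i] + b + a + word[i + 2:], coeff, result)
            for letter, c in BRACKET[(a, b)].items():
                normal_order(word[:i] + letter + word[i + 2:], coeff * c, result)
            return result
    result[word] = result.get(word, 0) + coeff
    return result
```

For instance, EF is rewritten as FE + H and EFF as FFE + 2FH − 2F. The procedure terminates because each step either removes an inversion or shortens the word.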

The name universal enveloping algebra stems from the following universal mapping property.

Theorem 6.18: Whenever A is a complex associative algebra with identity and we have a linear mapping ϕ : g −→ A such that

ϕ(X)ϕ(Y) − ϕ(Y)ϕ(X) = ϕ([X, Y]) for all X, Y ∈ g,

then there exists a unique algebra homomorphism ϕ̃ : U(g) −→ A such that ϕ̃(1) = 1 and ϕ = ϕ̃ ◦ ι.

The map ϕ̃ from the theorem may be thought of as an extension of ϕ from g to all of U(g). This leads to the following useful statement.

Theorem 6.19: Representations of g on complex vector spaces stand in one-one correspondence with left U(g) modules in which 1 acts as the identity.

This fact is essential for the construction of representations and implicit in our examples in 6.4 and 6.5.

We now come back to the case that g is semisimple and to the notation introduced above. We enumerate the positive roots as α1, . . . , αm and we let H1, . . . , Hl be a basis of h. Then for the construction of highest weight representations it is appropriate to use the ordered basis

$$E_{-\alpha_1}, \dots, E_{-\alpha_m}, H_1, \dots, H_l, E_{\alpha_1}, \dots, E_{\alpha_m}$$

in the Poincaré-Birkhoff-Witt Theorem. The theorem says that the monomials

$$E_{-\alpha_1}^{p_1} \cdots E_{-\alpha_m}^{p_m} H_1^{k_1} \cdots H_l^{k_l} E_{\alpha_1}^{q_1} \cdots E_{\alpha_m}^{q_m}$$

form a basis of U(g). If one applies members of this basis to a nonzero highest weight vector v0 of V, one gets control of a general member of U(g)v0: $E_{\alpha_1}^{q_1} \cdots E_{\alpha_m}^{q_m}$ will act as 0 if q1 + · · · + qm > 0, and $H_1^{k_1} \cdots H_l^{k_l}$ will act as a scalar. Thus one has only to sort out the effect of $E_{-\alpha_1}^{p_1} \cdots E_{-\alpha_m}^{p_m}$, and most of the conclusions of the Theorem of the Highest Weight follow readily. If one looks at our examples in 6.4 and 6.5, one can get an idea how this works even in the case of representations which are not finite-dimensional, and how the integrality of the weight leads to finite-dimensionality.


The Casimir Element

For a complex semisimple Lie algebra g with Killing form B, the Casimir element Ω0 is the element

$$\Omega_0 := \sum_{i,j} B(X_i, X_j)\, \tilde X_i \tilde X_j$$

of U(g), where (Xi) is a basis of g and (X̃i) is the dual basis relative to B. One can show that Ω0 is defined independently of the basis (Xi) and is an element of the center Z(g) of U(g).

Exercise 6.32: Check this for the case g = sl(2,C) and determine the Casimir element.

In the general case one has the following statement.

Theorem 6.20: Let Ω0 be the Casimir element, (Hi)i=1,...,l an orthonormal basis of h0 relative to B, and choose root vectors Eα so that B(Eα, E−α) = 1 for all roots α. Then
a) $\Omega_0 = \sum_{i=1}^{l} H_i^2 + \sum_{\alpha \in \Delta} E_\alpha E_{-\alpha}$.
b) Ω0 operates by the scalar |λ|² + 2 < λ, δ > = |λ + δ|² − |δ|² in an irreducible finite-dimensional representation of g of highest weight λ, where δ is half the sum of the positive roots.
c) The scalar by which Ω0 operates in an irreducible finite-dimensional representation of g is nonzero if the representation is not trivial.
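Part b) can be evaluated explicitly once the scalar products of the simple roots are known. The sketch below uses the data of the example su(3) treated in Section 6.7, namely < α1, α1 > = < α2, α2 > = 1/3 and < α1, α2 > = −1/6, the fundamental weights Λ1 = (2/3)α1 + (1/3)α2 and Λ2 = (1/3)α1 + (2/3)α2, and δ = α1 + α2. All vectors are stored by their coefficients with respect to α1, α2, and the function names are hypothetical.

```python
from fractions import Fraction as F

# Scalar products of the simple roots of su(3) induced by the Killing form.
GRAM = [[F(1, 3), F(-1, 6)], [F(-1, 6), F(1, 3)]]
DELTA = (F(1), F(1))  # half the sum of the positive roots, equal to alpha_1 + alpha_2


def inner(u, v):
    return sum(u[i] * GRAM[i][j] * v[j] for i in range(2) for j in range(2))


def highest_weight(n1, n2):
    """n1 * Lambda_1 + n2 * Lambda_2 in coordinates with respect to alpha_1, alpha_2."""
    return (F(2 * n1 + n2, 3), F(n1 + 2 * n2, 3))


def casimir_scalar(n1, n2):
    """|lambda|^2 + 2 <lambda, delta>."""
    lam = highest_weight(n1, n2)
    return inner(lam, lam) + 2 * inner(lam, DELTA)


def casimir_scalar_shifted(n1, n2):
    """|lambda + delta|^2 - |delta|^2, the second form of the same scalar."""
    lam = highest_weight(n1, n2)
    shifted = (lam[0] + DELTA[0], lam[1] + DELTA[1])
    return inner(shifted, shifted) - inner(DELTA, DELTA)
```

The two expressions agree, the trivial representation gives 0, and the representation with highest weight Λ1 + Λ2 (the adjoint representation) gives 1, as it should for a Casimir element normalized by the Killing form.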

The main point is that ker Ω0 is an invariant subspace of V if V is not irreducible.

Remark 6.16: The center of U(g) is important also in the context of the determination of infinite-dimensional representations. We observed in Exercise 6.23 in 6.4 that Ω, a multiple of the Casimir element Ω0, acts as a scalar for the unitary irreducible representations of sl(2,R). For more general information we recommend [Kn] p.214 where as a special case of Corollary 8.14 one finds the statement: If π is unitary, then each member of the center Z(gc) of U(gc) acts as a scalar operator on the K-finite vectors of π. We will come back to this in 7.2 while explicitly constructing representations of SL(2,R).

The Verma Module

We fix a lexicographic ordering and introduce

$$\mathfrak{b} := \mathfrak{h} \oplus \bigoplus_{\alpha > 0} \mathfrak{g}_\alpha.$$

For ν ∈ h∗, make C into a one-dimensional U(b) module Cν by defining the action of H ∈ h by Hz = ν(H)z for z ∈ C and the action of ⊕α>0 gα by zero. For µ ∈ h∗, we define the Verma module V(µ) by

$$V(\mu) := U(\mathfrak{g}) \otimes_{U(\mathfrak{b})} \mathbf{C}_{\mu - \delta},$$

where δ is again half the sum of the positive roots and this term is introduced to simplify calculations with the Weyl group.


Verma modules are essential for the construction of representations. They have the following elementary properties:
a) V(µ) ≠ 0.
b) V(µ) is a universal highest weight module for highest weight modules of U(g) with highest weight µ − δ.
c) Each weight space of V(µ) is finite-dimensional.
d) V(µ) has a unique irreducible quotient L(µ).

If λ is dominant and algebraically integral, then L(λ + δ) is the irreducible representation of highest weight λ looked for in the Theorem of the Highest Weight (Theorem 6.15).

In our treatment of finite and compact groups we already saw the effectiveness of the theory of characters of representations. As for the moment we look at finite-dimensional representations we can use characters here too. To allow for more generalization, we treat them for now as formal exponential sums (again following [Kn1] p.12/3):
Let again g be a semisimple Lie algebra, h a Cartan subalgebra, ∆ a set of roots provided with a lexicographic ordering, α1, . . . , αl the simple roots, and W(∆) the Weyl group. We regard the set $\mathbf{Z}^{\mathfrak{h}^*}$ of functions f from h∗ to Z as an abelian group under pointwise addition. We write

$$f = \sum_{\lambda \in \mathfrak{h}^*} f(\lambda) e^{\lambda}.$$

The support of f is defined to be the set of λ ∈ h∗ for which f(λ) ≠ 0. Within $\mathbf{Z}^{\mathfrak{h}^*}$, let Z[h∗] be the subgroup of all f of finite support. The subgroup Z[h∗] has a natural commutative ring structure, which is determined by $e^{\lambda} e^{\mu} = e^{\lambda + \mu}$. Moreover, we introduce a larger ring Z < h∗ > with

$$\mathbf{Z}[\mathfrak{h}^*] \subseteq \mathbf{Z}\langle \mathfrak{h}^* \rangle \subseteq \mathbf{Z}^{\mathfrak{h}^*}$$

consisting of all $f \in \mathbf{Z}^{\mathfrak{h}^*}$ whose support is contained in the union of finitely many sets νi − Q+, νi ∈ h∗, where

$$Q^+ := \Big\{ \sum_{i=1}^{l} n_i \alpha_i;\ n_i \in \mathbf{N}_0 \Big\}.$$

Multiplication in Z < h∗ > is given by

$$\Big( \sum_{\lambda \in \mathfrak{h}^*} c_\lambda e^{\lambda} \Big) \Big( \sum_{\mu \in \mathfrak{h}^*} \tilde c_\mu e^{\mu} \Big) := \sum_{\nu \in \mathfrak{h}^*} \Big( \sum_{\lambda + \mu = \nu} c_\lambda \tilde c_\mu \Big) e^{\nu}.$$

If V is a representation of g (not necessarily finite-dimensional), one says that V has a character if V is the direct sum of its weight spaces under h, i.e., V = ⊕µ∈h∗ Vµ, and if dim Vµ < ∞ for µ ∈ h∗. In this case the character is

$$\mathrm{char}(V) := \sum_{\mu \in \mathfrak{h}^*} (\dim V_\mu)\, e^{\mu}$$

as an element of $\mathbf{Z}^{\mathfrak{h}^*}$. This definition is meaningful if V is finite-dimensional or if V is a Verma module.


We have two more important notions:
The Weyl denominator is the element d ∈ Z[h∗] given by

$$d := e^{\delta} \prod_{\alpha \in \Delta^+} (1 - e^{-\alpha}).$$

Here δ is again half the sum of the positive roots.
The Kostant partition function P is the function from Q+ to N that tells the number of ways, apart from order, that a member of Q+ can be written as the sum of positive roots. We put P(0) = 1 and define $K := \sum_{\gamma \in Q^+} P(\gamma) e^{-\gamma} \in \mathbf{Z}\langle \mathfrak{h}^* \rangle$. Then one can prove that one has $K e^{-\delta} d = 1$ in the ring Z < h∗ >, hence d⁻¹ ∈ Z < h∗ >. Then we have as the last main theorem in this context the famous Weyl Character Formula:

Theorem 6.21: Let (π, V) be an irreducible finite-dimensional representation of the complex semisimple Lie algebra g with highest weight λ. Then

$$\mathrm{char}(V) = d^{-1} \sum_{w \in W(\Delta)} (\det w)\, e^{w(\lambda + \delta)}.$$
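In the simplest case g = sl(2,C) the formula can be checked by multiplying formal exponential sums. The sketch below identifies a weight with its value on H, so that the positive root is α = 2 and δ = 1, and uses the weights −N, −N + 2, . . . , N of σN quoted earlier in this section. Formal sums are stored as dictionaries from weights to integer coefficients; this encoding and the function names are assumptions of the illustration.

```python
def multiply(f, g):
    """Product of two formal exponential sums with finite support."""
    h = {}
    for lam, a in f.items():
        for mu, b in g.items():
            h[lam + mu] = h.get(lam + mu, 0) + a * b
    return {nu: c for nu, c in h.items() if c != 0}


def character(N):
    """char of sigma_N: weights N, N-2, ..., -N, each with multiplicity one."""
    return {N - 2 * k: 1 for k in range(N + 1)}


def weyl_numerator(N):
    """Sum over the Weyl group {1, -1} of det(w) e^{w(lambda + delta)}, lambda = N."""
    return {N + 1: 1, -(N + 1): -1}


# Weyl denominator d = e^delta (1 - e^{-alpha}) with delta = 1 and alpha = 2.
d = {1: 1, -1: -1}
```

Multiplying `d` with `character(N)` produces a telescoping sum that leaves exactly the two terms of `weyl_numerator(N)`, which is the character formula in the form d · char(V) = Σ (det w) e^{w(λ+δ)}.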

Now we leave the treatment of the complex semisimple case and go over to the following situation: G is compact connected, g0 := Lie G, g the complexification of g0, T a maximal torus, t0 := Lie T, ∆(g, t) the set of roots, and B the negative of a G-invariant inner product on g0, and tR := it0. As we know, roots are real on tR, hence are in t∗R. The form B, when extended to be complex bilinear, is positive definite on tR, yielding an inner product <,> on t∗R. W(∆(g, t)) is the Weyl group generated by the root reflections sα for α ∈ ∆(g, t). Besides the notion of algebraic integrality already exploited in the complex semisimple case above, we have here still another notion of integrality:
We say that λ ∈ t∗ is analytically integral if the following equivalent conditions hold:
1) Whenever H ∈ t0 satisfies exp H = 1, then λ(H) is in 2πiZ.
2) There is a multiplicative character ψλ of T with $\psi_\lambda(\exp H) = e^{\lambda(H)}$ for all H ∈ t0.
In [Kn1] p.18 one finds a list of properties of these notions. We cite part of it:
a) Weights of finite-dimensional representations of G are analytically integral. In particular every root is analytically integral.
b) Analytically integral implies algebraically integral.
c) If G is simply connected and semisimple, then algebraically integral implies analytically integral.

For instance, the half sum δ of positive roots is algebraically integral but not analytically integral if G = SO(3).

In our situation the Theorem 6.15 (Theorem of the Highest Weight) comes in the following form:

Theorem 6.22: Let G be a compact connected Lie group with complexified Lie algebra g, let T be a maximal torus with complexified Lie algebra t, and let ∆+(g, t) be a positive system for the roots. Apart from equivalence the irreducible finite-dimensional representations π of G stand in one-one correspondence with the dominant analytically integral linear functionals λ on t, the correspondence being that λ is the highest weight of π.


And we restate Theorem 6.21 (Weyl’s Character Formula) in the form:

Theorem 6.23: The character χλ of the irreducible finite-dimensional representation of G with highest weight λ is given by

$$\chi_\lambda(t) = \frac{\sum_{w \in W} (\det w)\, \psi_{w(\lambda + \delta) - \delta}(t)}{\prod_{\alpha \in \Delta^+} (1 - \psi_{-\alpha}(t))}$$

at every t ∈ T where no ψα takes the value 1 on t.

In the next section we shall illustrate this by treating an example, which was a milestone in elementary particle physics.

6.7 The Example su(3)

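In 1962 Gell-Mann proposed in a seminal paper [Gel] a symmetry scheme for the description of hadrons, i.e. certain elementary particles (defined by taking part in the strong interaction), which had a great influence in elementary particle physics and beyond, as it led to the notion of quarks and the eightfold way. We cannot go too much into the physical content but discuss the mathematical background to give another example of the general theory. Respecting the historical context, we adopt the notation introduced by Gell-Mann and used in Cornwell’s presentation in [Co] vol II, p.502ff: As su(3) consists of traceless skew hermitian three-by-three matrices, one can use as a basis

$$A_1 := \begin{pmatrix} 0 & i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad A_2 := \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad A_3 := \begin{pmatrix} i & 0 & 0 \\ 0 & -i & 0 \\ 0 & 0 & 0 \end{pmatrix},$$

$$A_4 := \begin{pmatrix} 0 & 0 & i \\ 0 & 0 & 0 \\ i & 0 & 0 \end{pmatrix}, \quad A_5 := \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}, \quad A_6 := \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & i \\ 0 & i & 0 \end{pmatrix},$$

$$A_7 := \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{pmatrix}, \quad A_8 := \frac{1}{\sqrt{3}} \begin{pmatrix} i & 0 & 0 \\ 0 & i & 0 \\ 0 & 0 & -2i \end{pmatrix}.$$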

This basis is also a basis for the complexification su(3)c = sl(3,C). One may be tempted to use the elementary matrices Eij with entries (Eij)kl = δikδjl and take as a basis

Eij, i ≠ j (i, j = 1, 2, 3), H1 := E11 − E22, H2 := E22 − E33.

One has the commutation relations [Eij, Ekl] = δjkEil − δilEkj. For a diagonal matrix H := D(h1, h2, h3) we get

[H, Eij] = (ei(H) − ej(H))Eij, ei(H) = hi,

and hence

[H1, E12] = 2E12, [H2, E12] = −E12,
[H1, E13] = E13, [H2, E13] = E13,
[H1, E23] = −E23, [H2, E23] = 2E23,

and

[E12, E21] = H1, [E23, E32] = H2, [E13, E31] = E11 − E33 =: H3.
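These relations are finite matrix identities and can be verified exhaustively. The following sketch builds the elementary matrices as integer 3 × 3 matrices and checks the general relation [Eij, Ekl] = δjkEil − δilEkj for all index combinations; the helper names are hypothetical.

```python
def E(i, j, n=3):
    """Elementary matrix E_ij (1-based indices) of size n x n."""
    return [[1 if (r, c) == (i - 1, j - 1) else 0 for c in range(n)]
            for r in range(n)]


def matmul(A, B):
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def comb(a, A, b, B):
    """Linear combination a*A + b*B."""
    return [[a * x + b * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def bracket(A, B):
    return comb(1, matmul(A, B), -1, matmul(B, A))


def delta(i, j):
    return 1 if i == j else 0


def relation_holds(i, j, k, l):
    """Check [E_ij, E_kl] = delta_jk E_il - delta_il E_kj."""
    lhs = bracket(E(i, j), E(k, l))
    rhs = comb(delta(j, k), E(i, l), -delta(i, l), E(k, j))
    return lhs == rhs


H1 = comb(1, E(1, 1), -1, E(2, 2))
H2 = comb(1, E(2, 2), -1, E(3, 3))
indices = range(1, 4)
all_relations = all(relation_holds(i, j, k, l)
                    for i in indices for j in indices
                    for k in indices for l in indices)
```

The same helpers reproduce the listed special cases, for example `bracket(H1, E(1, 2))` equals 2E12 and `bracket(E(1, 3), E(3, 1))` equals E11 − E33.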


We see that h := < H1, H2 > is a Cartan subalgebra,

gα1 := < E12 >, gα2 := < E13 >, gα3 := < E23 >

are root spaces, i.e. nontrivial eigenspaces for the roots α1, α2, α3 ∈ h∗ given by

α1 := (α1(H1), α1(H2)) = (2, −1),
α2 := (α2(H1), α2(H2)) = (1, 1),
α3 := (α3(H1), α3(H2)) = (−1, 2).

(In the physics literature these tuples αj themselves sometimes are called roots.)
We choose these roots αj as positive roots and then get in this coordinatization a slightly unsymmetric picture:

[Figure: the roots α1 = (2, −1), α2 = (1, 1), α3 = (−1, 2) drawn in the coordinate plane.]

Hence we better use Gell-Mann’s matrices and change to Xj := −iAj. Then we get H1 = X3, H′2 := X8. Moreover, for X± := (1/2)(X1 ± iX2), Y± := (1/2)(X6 ± iX7), and Z± := (1/2)(X4 ± iX5), we have X+ = E12, X− = E21 etc. and the commutation relations

[H1, X±] = ±2X±, [H′2, X±] = 0,
[H1, Y±] = ∓Y±, [H′2, Y±] = ±√3 Y±,
[H1, Z±] = ±Z±, [H′2, Z±] = ±√3 Z±.

Now the Cartan subalgebra is h = < H1, H′2 > and the root spaces are

g±α1 = < X± >, g±α2 = < Y± >, g±α3 = < Z± >

with positive roots given by

α1 := (α1(H1), α1(H′2)) = (2, 0),
α2 := (α2(H1), α2(H′2)) = (−1, √3),
α3 := (α3(H1), α3(H′2)) = (1, √3).

α1 and α2 are simple roots, we have α1 + α2 = α3 and we get a picture with hexagonal symmetry:

[Figure: the roots α1 = (2, 0), α2 = (−1, √3), α3 = (1, √3) drawn in the coordinate plane.]


In the literature (see for instance [Co] p.518) we find still another normalization, which is motivated like this: The Killing form B for su(3) is given by diagonal matrices

B(Ap, Aq) = −12δpq, resp. B(Xp, Xq) = 12δpq.
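Both the commutation relations of the Gell-Mann basis and this normalization can be checked numerically. The sketch below works with the hermitian matrices Xj = −iAj and assumes the standard fact that the Killing form of sl(3,C) is B(X, Y) = 6 Tr(XY); this trace formula is not stated in the text and is used here as an assumption. Floating point numbers are compared up to a small tolerance.

```python
import math

S3 = 1 / math.sqrt(3)

# Gell-Mann matrices X_j = -i A_j.
X = {
    1: [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    2: [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
    3: [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    4: [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    5: [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
    6: [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    7: [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
    8: [[S3, 0, 0], [0, S3, 0], [0, 0, -2 * S3]],
}


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def comb(a, A, b, B):
    """Linear combination a*A + b*B."""
    return [[a * x + b * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def bracket(A, B):
    return comb(1, matmul(A, B), -1, matmul(B, A))


def close(A, B, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(A, B) for x, y in zip(ra, rb))


def killing(A, B):
    """Assumed normalization B(X, Y) = 6 Tr(XY) on sl(3, C)."""
    return 6 * sum(matmul(A, B)[i][i] for i in range(3))


H1, H2p = X[3], X[8]
Xp = comb(0.5, X[1], 0.5j, X[2])   # X_+ = (X1 + i X2)/2 = E12
Yp = comb(0.5, X[6], 0.5j, X[7])   # Y_+ = (X6 + i X7)/2 = E23
Zp = comb(0.5, X[4], 0.5j, X[5])   # Z_+ = (X4 + i X5)/2 = E13
```

With these definitions `killing(X[p], X[q])` is 12 for p = q and 0 otherwise, and since Ap = iXp the values for the Ap carry the opposite sign.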


One uses B to introduce a nondegenerate symmetric bilinear form <,> on h∗ as follows: We define a map

h∗ ⊃ ∆ ∋ α −→ Hα ∈ h by B(Hα, H) = α(H) for all H ∈ h,

and then

< α, β > := B(Hα, Hβ) for all α, β ∈ ∆.

This leads to inner products on hR = < Hα1, . . . , Hαl >R (α1, . . . , αl simple positive roots) and its dual space. For su(3) we have l = 2 and with α1, α2 from above

< α1, α1 > = < α2, α2 > = 1/3, < α1, α2 > = < α2, α1 > = −1/6.

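It is convenient to introduce an orthonormal basis H̃1, . . . , H̃l of h, i.e. with B(H̃p, H̃q) = δpq. Then we have the simple rule

$$\langle \alpha, \beta \rangle = \sum_j \tilde\alpha_j \tilde\beta_j, \quad \tilde\alpha_j = \alpha(\tilde H_j),\ \tilde\beta_j = \beta(\tilde H_j).$$

In our case this leads to

$$\tilde H_1 = \frac{1}{2\sqrt{3}} H_1, \quad \tilde H_2 = \frac{1}{2\sqrt{3}} H_2' = \frac{1}{2\sqrt{3}} X_8$$

and

$$\tilde\alpha_1 = (\sqrt{3}/6)\,\alpha_1 = (1/\sqrt{3},\ 0), \quad \tilde\alpha_2 = (\sqrt{3}/6)\,\alpha_2 = (-1/(2\sqrt{3}),\ 1/2).$$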

We see that, for the Cartan matrix C = (Cpq) = (2 < αp, αq > / < αp, αp >) from 6.6.2, we have in this case

$$C = \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}.$$

The Weyl group W (see the end of 6.6.3) is generated by the reflections

sαp(αq) = αq − Cpqαp

at the planes in h∗R ≅ Rˡ orthogonal to the αp, with l = 2 and p, q = 1, 2 in our case. For g0 = su(3) we have 6 elements in W.

Exercise 6.34: Determine these explicitly.

Now we proceed to the discussion of the representations (π, V) of su(3). The general theory tells us that we can assume that π(H) is a diagonal matrix for each H ∈ h. The diagonal elements π(H)jj =: λj(H) fix the weights λj, i.e. linear functionals on h. We write

λ = (λ(H̃1), . . . , λ(H̃l))

for an ON-basis H̃1, . . . , H̃l of h. By the Theorem of the Highest Weight (Theorem 6.15 and Theorem 6.22), an irreducible representation is determined by its highest weight Λ and this highest weight is simple, i.e. has multiplicity one. Moreover the general theory says ([Co] p.568) that these highest weights are linear combinations with nonnegative integer coefficients of fundamental weights Λ1, . . . , Λl defined by

$$\Lambda_j(H) = \sum_{k=1}^{l} (C^{-1})_{kj}\, \alpha_k(H)$$

where α1, . . . , αl are the positive simple roots, fixed at the beginning.
And with δ = (1/2)Σα∈∆+ α the half sum of the positive roots, Weyl’s dimensionality formula says ([Co] p.570) that the irreducible finite-dimensional representation (π, V) with highest weight Λ has dimension

$$d = \prod_{\alpha \in \Delta^+} \frac{\langle \Lambda + \delta, \alpha \rangle}{\langle \delta, \alpha \rangle}.$$

In our case we have

$$C^{-1} = \begin{pmatrix} 2/3 & 1/3 \\ 1/3 & 2/3 \end{pmatrix}$$

and the fundamental weights

$$\Lambda_1 = (2/3)\alpha_1 + (1/3)\alpha_2 = (1/6)(\sqrt{3},\ 1), \quad \Lambda_2 = (1/3)\alpha_1 + (2/3)\alpha_2 = (1/6)(0,\ 2).$$

We write π(n1, n2) for the representation with highest weight Λ = n1Λ1 + n2Λ2 and get by Weyl’s formula for its dimension

(6.6) d = (n1 + 1)(n2 + 1)((1/2)(n1 + n2) + 1).
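Formula (6.6) can be recovered from the general dimension formula by taking the product over the three positive roots α1, α2 and α1 + α2 of su(3). The sketch below does this with exact fractions, storing all vectors by their coefficients with respect to α1, α2 and using the scalar products < α1, α1 > = < α2, α2 > = 1/3, < α1, α2 > = −1/6 from above; the function names are hypothetical.

```python
from fractions import Fraction as F

GRAM = [[F(1, 3), F(-1, 6)], [F(-1, 6), F(1, 3)]]
POSITIVE_ROOTS = [(1, 0), (0, 1), (1, 1)]               # alpha_1, alpha_2, alpha_1 + alpha_2
FUNDAMENTAL = [(F(2, 3), F(1, 3)), (F(1, 3), F(2, 3))]  # Lambda_1, Lambda_2
DELTA = (1, 1)                                          # half the sum of the positive roots


def inner(u, v):
    return sum(u[i] * GRAM[i][j] * v[j] for i in range(2) for j in range(2))


def weyl_dimension(n1, n2):
    """Product over the positive roots of <Lambda + delta, alpha> / <delta, alpha>."""
    lam = tuple(n1 * a + n2 * b for a, b in zip(*FUNDAMENTAL))
    shifted = tuple(l + d for l, d in zip(lam, DELTA))
    dim = F(1)
    for alpha in POSITIVE_ROOTS:
        dim *= inner(shifted, alpha) / inner(DELTA, alpha)
    return dim


def formula_6_6(n1, n2):
    return F((n1 + 1) * (n2 + 1) * (n1 + n2 + 2), 2)


labels = [(0, 0), (1, 0), (0, 1), (2, 0), (0, 2), (1, 1), (3, 0)]
dimensions = [weyl_dimension(n1, n2) for n1, n2 in labels]
```

The three factors are n1 + 1, n2 + 1 and (n1 + n2 + 2)/2, so the product agrees with (6.6); the list `dimensions` contains 1, 3, 3, 6, 6, 8, 10.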

In the physics literature one often denotes the representation by its dimension. Then one has

a) π(0, 0) = 1, the trivial representation, which has only one weight, namely the highest weight Λ = 0,
b) π(1, 0) = 3 with highest weight Λ = Λ1 and dimension 3,
c) π(0, 1) = 3∗ with highest weight Λ = Λ2 and dimension 3,
d) π(2, 0) = 6 with highest weight Λ = 2Λ1 and dimension 6,
e) π(0, 2) = 6∗ with highest weight Λ = 2Λ2 and dimension 6,
f) π(1, 1) = 8 with highest weight Λ = Λ1 + Λ2 and dimension 8,
g) π(3, 0) = 10 with highest weight Λ = 3Λ1 and dimension 10,

and so on. The weights appearing in the representations can be determined by the fact that they are of the form

λ = Λ − m1α1 − m2α2, m1, m2 ∈ N0,

and that the Weyl group transforms weights into weights (preserving the multiplicity). We sketch some of the weight diagrams and later discuss an elementary method to get these diagrams, which generalizes our discussion in 6.4 and 6.5.
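The two facts just mentioned suffice to list the weights of π(n1, n2), though not their multiplicities. The sketch below relies in addition on the standard fact that the weights are exactly the Weyl group orbits of the dominant weights of the form Λ − m1α1 − m2α2; this fact is an assumption of the example and is not proved in the text. Weights are written in coordinates with respect to the fundamental weights Λ1, Λ2, in which α1 = (2, −1) and α2 = (−1, 2), and all function names are hypothetical.

```python
ALPHA = [(2, -1), (-1, 2)]  # simple roots in coordinates with respect to Lambda_1, Lambda_2


def reflect(mu, i):
    """Simple reflection: mu - mu_i * alpha_i, mu_i being the i-th coordinate."""
    return tuple(m - mu[i] * a for m, a in zip(mu, ALPHA[i]))


def orbit(mu):
    """Orbit of mu under the group generated by the two simple reflections."""
    seen, todo = {mu}, [mu]
    while todo:
        w = todo.pop()
        for i in range(2):
            v = reflect(w, i)
            if v not in seen:
                seen.add(v)
                todo.append(v)
    return seen


def weights(n1, n2):
    """Set of weights (without multiplicities) of pi(n1, n2)."""
    result = set()
    bound = n1 + n2  # a dominant Lambda - m1*alpha_1 - m2*alpha_2 has m1 + m2 <= n1 + n2
    for m1 in range(bound + 1):
        for m2 in range(bound + 1):
            mu = (n1 - 2 * m1 + m2, n2 + m1 - 2 * m2)
            if mu[0] >= 0 and mu[1] >= 0:
                result |= orbit(mu)
    return result
```

For 3, 6 and 10 the number of weights equals the dimension, while for 8 one gets only 7 different weights, so that at least one weight there must have multiplicity greater than one.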


[Weight diagrams in the (λ(H̃1), λ(H̃2))-plane; the weights marked in each diagram are listed below.]

π(1, 0) = 3:
Λ = (1/6)(√3, 1) = (2/3)α1 + (1/3)α2,
sα1Λ = (1/6)(−√3, 1) = −(1/3)α1 + (1/3)α2,
sα2sα1Λ = (1/6)(0, −2) = −(1/3)α1 − (2/3)α2.

π(0, 1) = 3∗:
Λ = (1/6)(0, 2) = (1/3)α1 + (2/3)α2,
sα2Λ = (1/6)(√3, −1) = (1/3)α1 − (1/3)α2,
sα1sα2Λ = (1/6)(−√3, −1) = −(2/3)α1 − (1/3)α2.

π(2, 0) = 6:
Λ = (1/6)(2√3, 2) = (4/3)α1 + (2/3)α2,
Λ − α1 = (1/6)(0, 2) = (1/3)α1 + (2/3)α2,
sα1Λ = −(2/3)α1 + (2/3)α2,
sα1sα2(Λ − α1) = −(2/3)α1 − (1/3)α2,
sα2(Λ − α1) = (1/6)(√3, −1) = (1/3)α1 − (1/3)α2,
sα2sα1Λ = (1/6)(0, −4) = −(2/3)α1 − (4/3)α2.

π(1, 1) = 8:
Λ = (1/6)(√3, 3) = α1 + α2,
sα1Λ = α2, sα2Λ = α1,
sα1sα2Λ = −α1, sα2sα1Λ = (1/6)(√3, −3) = −α2,
sα1sα2sα1Λ = −α1 − α2,
0 = (0, 0).

π(3, 0) = 10:
Λ = (1/6)(3√3, 3) = 2α1 + α2, α1 + α2, α2, −α1 + α2,
α1, 0 = (0, 0), −α1,
−α2, −α1 − α2,
−α1 − 2α2.

In 8 we find a first example of a weight which does not have multiplicity one, namely the weight (0, 0).


Similar to our discussion for G = SU(2) and g = su(2) in 4.2, one has here the problem of the explicit decomposition of representations into irreducible components. This goes under the heading of Clebsch-Gordan series and can be found for instance in [Co] p.611ff. As examples we cite

3 ⊗ 3∗ ≃ 8 ⊕ 1 and 3 ⊗ 3 ⊗ 3 ≃ 10 ⊕ 8 ⊕ 8 ⊕ 1.
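A quick consistency check of such decompositions is to compare dimensions. The sketch below assumes the standard Weyl dimension formula for SU(3), dim π(p, q) = (p + 1)(q + 1)(p + q + 2)/2, which is not stated in the text above but reproduces the dimensions listed in a) to g). Matching dimensions are of course only a necessary condition, not a proof of the decomposition.

```python
def dim(p, q):
    """Dimension of the irreducible su(3) representation pi(p, q)
    (Weyl dimension formula, assumed here)."""
    return (p + 1) * (q + 1) * (p + q + 2) // 2

# Dimensions of the representations listed in the text.
listed = {(0, 0): 1, (1, 0): 3, (0, 1): 3, (2, 0): 6,
          (0, 2): 6, (1, 1): 8, (3, 0): 10}
for (p, q), expected in listed.items():
    print((p, q), dim(p, q), expected)

# Dimension count for the two Clebsch-Gordan series.
print(dim(1, 0) * dim(0, 1), dim(1, 1) + dim(0, 0))
print(dim(1, 0) ** 3, dim(3, 0) + 2 * dim(1, 1) + dim(0, 0))
```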

The proofs of these formulae need some skill or at least patience. But it is very tempting to imagine the bigger weight diagrams in our examples as compositions of the triangles 3 and 3∗. This leads to an important interpretation in the theory of elementary particles: We already remarked that the discussion of the irreducible representations of SU(2) resp. SO(3) is useful in the description of particles subjected to SO(3)-symmetry (in particular an electron in a hydrogen atom), namely a representation πj, j = 0, 1/2, 1, 3/2, . . . describes a multiplet of 2j + 1 states of the particle with angular momentum resp. spin j, and the members of the multiplet are distinguished by a magnetic quantum number m ∈ Z with m = −2j, −2j + 2, . . . , 2j. As one tried to describe and classify the heavier elementary particles, one observed patterns, which could be related to the representation theory of certain compact groups, in particular G = SU(3): In Gell-Mann's eightfold way we take SU(3) as an internal symmetry group to get the ingredients of an atom, the proton p and the neutron n, as members of a multiplet of eight states, which correspond to our representation 8. To be a little bit more precise, one looks at a set of hadrons, i.e. particles characterized by an experimentally fixed behaviour (having strong interaction), and describes these by certain quantum numbers, namely electric charge Q, hypercharge Y, baryon number B, strangeness S, and isospin I with third component I3. The baryon number is B = 1 for baryons like the proton or the neutron, −1 for antibaryons, and zero for other particles like the π-mesons. In this context, one has the following relations

Y = B + S, Q = I3 + (1/2)Y

and one relates particle multiplets to representations of su(3) by assigning an I3-axis to the axis measuring the weight λ(H1) and a Y-axis to λ(H2). This way, the lowest dimensional nontrivial (three-dimensional) representations correspond to two triples (u, d, s) and (ū, d̄, s̄), named quarks resp. antiquarks (with u for up, d for down, and s for nonzero strangeness). We give a list of the quantum numbers characterizing the quarks

quark   B     I     I3     Y      S     Q
u       1/3   1/2   1/2    1/3    0     2/3
d       1/3   1/2   −1/2   1/3    0     −1/3
s       1/3   0     0      −2/3   −1    −1/3

and for more information refer to p. 50 f of [FS]. In this picture the weight diagrams look like this:


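[Figure: the weight diagrams 3 =: quark, with the triple (u, d, s), and 3∗ =: antiquark, with the triple (ū, d̄, s̄), drawn in the (I3, Y)-plane. Axis ticks: I3 = −1/2, 1/2 in both panels; Y = 1/3, −2/3 for the quarks and Y = −1/3, 2/3 for the antiquarks.]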

Up to now the experimental evidence for the existence of the quarks (as far as we know) is still indirect: they are thought of as building blocks of the more observable particles like the proton and the neutron. The following diagram shows the quantum numbers of the baryon octet 8. By a simple addition process (comparing the (I3, Y)-coordinates) one can guess the quark content of the neutron as n = (udd) and of the proton as p = (uud). One can do the same for the other particles showing up in the diagram.

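[Figure: the baryon octet π(1, 1) = 8 in the (I3, Y)-plane, with axis ticks I3 = −1/2, 1/2 and Y = 1, −1. The particles n and p sit at Y = 1, the particles Σ−, Σ0, Λ0, Σ+ at Y = 0 (Σ0 and Λ0 both at the origin), and Ξ−, Ξ0 at Y = −1.]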
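The addition process mentioned above can be sketched in code. The example assumes the standard quark assignments I3(u) = +1/2 and I3(d) = −1/2 (the sign for which Q = I3 + Y/2 reproduces the charges 2/3 and −1/3) and uses the relations Y = B + S and Q = I3 + (1/2)Y from the text. The dictionary layout and the function name are hypothetical.

```python
from fractions import Fraction as F

# Additive quantum numbers of the three quarks (assumed standard values).
QUARKS = {
    "u": {"B": F(1, 3), "I3": F(1, 2), "S": 0},
    "d": {"B": F(1, 3), "I3": F(-1, 2), "S": 0},
    "s": {"B": F(1, 3), "I3": F(0), "S": -1},
}

def quantum_numbers(content):
    """Add up B, I3, S over the quarks in `content`, then derive Y and Q."""
    B = sum(QUARKS[q]["B"] for q in content)
    I3 = sum(QUARKS[q]["I3"] for q in content)
    S = sum(QUARKS[q]["S"] for q in content)
    Y = B + S                 # hypercharge
    Q = I3 + Y / 2            # electric charge
    return {"B": B, "I3": I3, "S": S, "Y": Y, "Q": Q}

for name, content in [("p", "uud"), ("n", "udd"), ("Omega-", "sss")]:
    print(name, quantum_numbers(content))
```

With these assumptions the proton (uud) comes out with Y = 1 and Q = 1, the neutron (udd) with Y = 1 and Q = 0, and the combination (sss) with Y = −2 and Q = −1.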


We shall come back to these interpretations of representations later on when we discuss the Euclidean groups and the Poincaré group. A broader discussion of the physical content of these schemes is beyond the scope of this book. But we still reproduce the baryon decuplet 10. It became famous in history because it predicted the existence of an Ω−-particle (the bottom of the diagram) with fixed properties, which after this prediction really was found experimentally.

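[Figure: the baryon decuplet π(3, 0) = 10 in the (I3, Y)-plane, with axis ticks I3 = −3/2, −1/2, 1/2, 3/2 and Y = 1, −1, −2. Row Y = 1: ∆++, ∆+, ∆0, ∆−; row Y = 0: Σ+, Σ0, Σ−; row Y = −1: Ξ−, Ξ0; bottom Y = −2: Ω−.]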


To finish this section, let us look at what happens when we apply to our example the procedure that was successful in our constructions of the representations of su(2) and sl(2,R) in 6.4 and 6.5. We stay in the unnormalized picture with

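$$\mathfrak{g} = \langle H_1, H_2', X_\pm, Y_\pm, Z_\pm \rangle$$

with the commutation relations

$$
\begin{aligned}
{}[X_+, X_-] &= H_1, & [Y_+, Y_-] &= \tfrac12(\sqrt3\,H_2' - H_1), & [Z_+, Z_-] &= \tfrac12(\sqrt3\,H_2' + H_1),\\
[X_+, Y_+] &= Z_+, & [X_+, Z_-] &= -Y_-, & &\\
[X_-, Z_+] &= Y_+, & [X_-, Y_-] &= -Z_-, & &\\
[Y_+, Z_-] &= X_-, & [Y_-, Z_+] &= -X_+, & &\\
[H_1, X_\pm] &= \pm 2X_\pm, & [H_2', X_\pm] &= 0, & &\\
[H_1, Y_\pm] &= \mp Y_\pm, & [H_2', Y_\pm] &= \pm\sqrt3\,Y_\pm, & &\\
[H_1, Z_\pm] &= \pm Z_\pm, & [H_2', Z_\pm] &= \pm\sqrt3\,Z_\pm &&
\end{aligned}
$$

(the others are zero) and the roots with coordinates

$$\alpha_1 = (2, 0), \quad \alpha_2 = (-1, \sqrt3), \quad \alpha_3 = (1, \sqrt3).$$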

(We then get the weights from our diagrams above by multiplication with √3/6.) And as we prefer to think positively, we start our construction of a representation space V by taking a vector v ∈ V, v ≠ 0, of lowest weight Λ = (k, l), i.e. with

$$H_1 v = kv, \quad H_2' v = lv, \quad X_- v = Y_- v = Z_- v = 0.$$

We try to construct V by applying the plus or creation operators X+, Y+, Z+ to v, i.e. we try

$$V = \sum_{r,s,t} v_{rst}\,\mathbb{C}, \qquad v_{rst} := X_+^r Y_+^s Z_+^t v.$$

From the PBW-Theorem we recall that we have to be careful with the order in which we apply the creation operators. For instance, one has

v′110 := Y+X+v = (X+Y+ + [Y+, X+])v = v110 − v001

since [X+, Y+] = Z+, but Z+X+v = X+Z+v = v101 as X+ and Z+ commute.
From the commutation relations it is clear that vrst has weight Λ + rα1 + sα2 + tα3, so, in particular, the elements v100, v010, v001 of the first shell have the respective weights coordinatized by (k + 2, l), (k − 1, l + √3), (k + 1, l + √3). If we apply the minus or annihilation operators X−, Y−, Z−, using the commutation relations, we get

$$X_- v_{100} = X_- X_+ v = ([X_-, X_+] + X_+ X_-)v = -H_1 v + 0 = -kv,$$

and in the same way

$$
\begin{aligned}
Y_- v_{100} &= 0, & Z_- v_{100} &= 0,\\
X_- v_{010} &= 0, & Y_- v_{010} &= -\tfrac12(l\sqrt3 - k)\,v, & Z_- v_{010} &= 0,\\
X_- v_{001} &= v_{010}, & Y_- v_{001} &= -v_{100}, & Z_- v_{001} &= -\tfrac12(l\sqrt3 + k)\,v.
\end{aligned}
$$


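Before we interpret this and run the risk of losing track in the next shells, let us see how the natural representation of su(3) looks in this picture. Here we have

$$V = \sum_{j=1}^{3} e_j\,\mathbb{C}, \qquad e_1 = {}^t(1, 0, 0),\ e_2 = {}^t(0, 1, 0),\ e_3 = {}^t(0, 0, 1),$$

and hence

$$
H_1 e_1 = \begin{pmatrix} 1 & & \\ & -1 & \\ & & 0 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = e_1,
\qquad
H_2' e_1 = \frac{1}{\sqrt3}\begin{pmatrix} 1 & & \\ & 1 & \\ & & -2 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = \frac{1}{\sqrt3}\,e_1.
$$

In the same way we get

$$
\begin{aligned}
H_1 e_2 &= -e_2, & H_2' e_2 &= \tfrac{1}{\sqrt3}\,e_2,\\
H_1 e_3 &= 0, & H_2' e_3 &= -\tfrac{2}{\sqrt3}\,e_3.
\end{aligned}
$$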

Recalling X+ = E12, X− = E21 and the analogous relations for the other matrices, one has

X+e1 = 0, Y+e1 = 0, Z+e1 = 0,
X−e1 = e2, Y−e1 = 0, Z−e1 = e3,
X+e2 = e1, Y+e2 = 0, Z+e2 = 0,
X−e2 = 0, Y−e2 = e3, Z−e2 = 0,
X+e3 = 0, Y+e3 = e2, Z+e3 = e1,
X−e3 = 0, Y−e3 = 0, Z−e3 = 0.
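The commutation relations can be checked numerically in the natural representation. The sketch below assumes the matrix realization read off from the actions just listed, namely X+ = E12, X− = E21, Y+ = E23, Y− = E32, Z+ = E13, Z− = E31, together with H1 = diag(1, −1, 0) and H2′ = (1/√3) diag(1, 1, −2). It uses plain Python lists, and the helper names are hypothetical.

```python
from math import sqrt, isclose

def E(i, j):
    """Elementary 3x3 matrix E_ij (1-based indices) as a list of rows."""
    return [[1.0 if (r, c) == (i - 1, j - 1) else 0.0 for c in range(3)]
            for r in range(3)]

def diag(a, b, c):
    return [[a, 0.0, 0.0], [0.0, b, 0.0], [0.0, 0.0, c]]

def mul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]

def lin(a, A, b, B):
    """Linear combination a*A + b*B."""
    return [[a * A[r][c] + b * B[r][c] for c in range(3)] for r in range(3)]

def comm(A, B):
    """Commutator [A, B] = AB - BA."""
    return lin(1.0, mul(A, B), -1.0, mul(B, A))

def same(A, B):
    return all(isclose(A[r][c], B[r][c], abs_tol=1e-12)
               for r in range(3) for c in range(3))

ZERO = diag(0.0, 0.0, 0.0)
H1 = diag(1.0, -1.0, 0.0)
H2 = diag(1 / sqrt(3), 1 / sqrt(3), -2 / sqrt(3))
Xp, Xm = E(1, 2), E(2, 1)
Yp, Ym = E(2, 3), E(3, 2)
Zp, Zm = E(1, 3), E(3, 1)

checks = {
    "[X+,X-] = H1": same(comm(Xp, Xm), H1),
    "[Y+,Y-] = (sqrt3 H2' - H1)/2": same(comm(Yp, Ym), lin(sqrt(3) / 2, H2, -0.5, H1)),
    "[Z+,Z-] = (sqrt3 H2' + H1)/2": same(comm(Zp, Zm), lin(sqrt(3) / 2, H2, 0.5, H1)),
    "[X+,Y+] = Z+": same(comm(Xp, Yp), Zp),
    "[X+,Z-] = -Y-": same(comm(Xp, Zm), lin(-1.0, Ym, 0.0, ZERO)),
    "[Y+,Z-] = X-": same(comm(Yp, Zm), Xm),
    "[Y+,Z+] = 0": same(comm(Yp, Zp), ZERO),
    "[H1,Y+] = -Y+": same(comm(H1, Yp), lin(-1.0, Yp, 0.0, ZERO)),
    "[H2',Z+] = sqrt3 Z+": same(comm(H2, Zp), lin(sqrt(3), Zp, 0.0, ZERO)),
}
for name, ok in checks.items():
    print(name, ok)
```

A check in one faithful representation confirms the structure constants, since the natural representation of su(3) is injective on the Lie algebra.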

These relations show that, in the coordinates fixed above, e1 is a highest weight vector of weight (1, 1/√3), e2 has weight (−1, 1/√3), and e3 is a lowest weight vector with weight (0, −2/√3). Thus, as to be expected, we find the weight diagram of 3. This fits into the construction we started above as follows: we put v1 := v = e3 and get in the first shell, translating the equations just obtained,

X+v = v100 = 0, Y+v = v010 = e2 := v2, Z+v = v001 = e1 := v3.

Moreover, the equations translate to the fact that the creation operators applied to the highest weight vector v3 and to v2 give zero with the exception X+v2 = v3. The only non-trivial actions of the annihilation operators are Y−v2 = v1, X−v3 = v2 and Z−v3 = v1. We compare this to the general form for the action obtained at the beginning (now with v100 = 0, v010 = v2, v001 = v3) and find as necessary condition for a three-dimensional representation in our general scheme the condition k = 0, l = −2/√3, just as it should be.


To show how to come to higher dimensional representations, one has to go to higher shells: We have v200 = X+²v, v110 = X+Y+v, v101 = X+Z+v and so on. Here we remember to pay attention to the order if the creation operators do not commute: As already remarked, we have v′110 := Y+X+v = v110 − v001 since one has [X+, Y+] = Z+. Though it is a bit lengthy, we give the list of the actions of the annihilation operators (to be obtained parallel to the case X−v100 above and using the actions of the minus operators on the first shell)

$$
\begin{aligned}
X_- v_{110} &= -(k-1)\,v_{010}, & Y_- v_{110} &= -\tfrac12(l\sqrt3 - k)\,v_{100}, & Z_- v_{110} &= -\tfrac12(l\sqrt3 - k)\,v,\\
X_- v_{101} &= -(k+1)\,v_{001} + v_{110}, & Y_- v_{101} &= -v_{200}, & Z_- v_{101} &= -\tfrac12(l\sqrt3 + k + 2)\,v_{100},\\
X_- v_{011} &= v_{020}, & Y_- v_{011} &= -\tfrac12(l\sqrt3 - k)\,v_{001} - v_{110}, & Z_- v_{011} &= -\tfrac12(l\sqrt3 + k + 2)\,v_{010},\\
X_- v_{200} &= -2(k+1)\,v_{100}, & Y_- v_{200} &= 0, & Z_- v_{200} &= 0,\\
X_- v_{020} &= 0, & Y_- v_{020} &= -(l\sqrt3 - k + 2)\,v_{010}, & Z_- v_{020} &= 0,\\
X_- v_{002} &= 2\,v_{011}, & Y_- v_{002} &= -2\,v_{101}, & Z_- v_{002} &= -(l\sqrt3 + k + 2)\,v_{001}.
\end{aligned}
$$
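As an illustration of how the entries of this list arise, here is one of them worked out step by step. It is only a sketch of the bookkeeping, using the first-shell relation $Y_- v_{001} = -v_{100}$, the weight $(k+1,\, l+\sqrt3)$ of $v_{001}$, and $v'_{110} = Y_+X_+v = v_{110} - v_{001}$.

```latex
\begin{aligned}
Y_- v_{011} &= Y_- Y_+ Z_+ v = [Y_-, Y_+]\, Z_+ v + Y_+ Y_- Z_+ v \\
            &= -\tfrac12\bigl(\sqrt3\, H_2' - H_1\bigr) v_{001} - Y_+ v_{100} \\
            &= -\tfrac12\bigl(\sqrt3\,(l + \sqrt3) - (k + 1)\bigr) v_{001} - v'_{110} \\
            &= -\tfrac12\bigl(l\sqrt3 - k + 2\bigr) v_{001} - \bigl(v_{110} - v_{001}\bigr) \\
            &= -\tfrac12\bigl(l\sqrt3 - k\bigr) v_{001} - v_{110}.
\end{aligned}
```

The other entries follow the same pattern: commute the annihilation operator to the right, evaluate the Cartan elements on the known weight, and reduce to the first shell.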

Just as well one can determine the action of the minus operators in the next shell and so on. As an example, we show how the eight-dimensional representation 8 (with highest weight Λ = α1 + α2) comes up in this picture: As above we start with a lowest weight vector v1 := v, which has in our coordinates the weight Λ0 = (k, l). In the first shell we have

v2 := X+v = v100, v3 := Y+v = v010, v4 := Z+v = v001

of respective weights Λ0 + α1, Λ0 + α2, Λ0 + α1 + α2. For the second shell we put (secretly looking at the weight diagram we already constructed using the results from the general theory, otherwise the reasoning would take a bit more time)

$$X_+^2 v = v_{200} = 0, \qquad Y_+X_+ v = v'_{110} =: v_5, \qquad Z_+X_+ v =: v_6,$$

and, using the relation v′110 = v110 − v001 of weight Λ0 + α1 + α2 obtained at the beginning,

$$X_+Y_+ v = v_{110} = v_5 + v_4, \qquad Y_+^2 v = 0, \qquad Z_+Y_+ v = v_{011},$$

and, since X+ and Z+ commute,

$$X_+Z_+ v = v_{101} = v_6, \qquad Y_+Z_+ v =: v_7, \qquad Z_+^2 v = v_{002} =: v_8.$$

If we look at the relations coming from the application of the minus operators to the elements of the second shell listed above, we find in particular the equations

$$X_- v_{200} = -2(k+1)\,v_{100}, \qquad Y_- v_{020} = -(l\sqrt3 - k + 2)\,v_{010}.$$


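Since we have fixed v200 = v020 = 0, these equations lead to k = −1 and l√3 − k + 2 = 0, i.e. l = −√3, so that the lowest weight is Λ0 = (−1, −√3) = −(α1 + α2). One verifies easily that all the other relations are fulfilled, and the weight diagram for the eightfold way from above in this context leads to the following picture, where only part of the relations between the operators and generators are made explicit:

[Figure: the weight diagram π(1, 1) = 8 in the unnormalized coordinates, with horizontal axis I3 and vertical axis Y. The eight basis vectors and their weights are
v1 at (−1, −√3), v2 = X+v1 at (1, −√3),
v3 = Y+v1 at (−2, 0), v4 = Z+v1 and v5 = Y+v2 at (0, 0), v6 = Z+v2 at (2, 0),
v7 = Y+v4 at (−1, √3), v8 = Z+v4 at (1, √3).]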

Exercise 6.35: Do the same to construct the representations 6 and 10.

We hope that these examples help to get a feeling for how the discreteness of the weights is related to the finite-dimensionality of the representations. And, as this procedure gets more and more complicated, one gets perhaps some motivation to look into the proofs of the general theorems in the previous section about the integrality of the weights, the existence of fundamental weights, the action of the Weyl group etc.

Chapter 7

Induced Representations

Induction is in this context a method to construct representations of a group starting from a representation of a subgroup. As we shall see, in its ultimate specialization one rediscovers Example 1.5 in our section 1.3, where we constructed representations in function spaces on G-homogeneous spaces. Already in 1898, Frobenius worked with this method for finite groups. Later on, Wigner, Bargmann, and Gelfand-Neumark used it to construct representations of special groups, in particular the Poincaré group. But it was Mackey who, from 1950 on, developed a systematic treatment using essentially elements of functional analysis. Because it is beyond the scope of this book, we do not describe these here as carefully as we should. In most places we simply give recipes so that we can construct the representations for the groups we are striving for. We refer to Mackey [Ma], Kirillov [Ki] p.157ff, Barut-Raczka [BR] p.473ff, or Warner [Wa] p.365ff for the necessary background.

7.1 The Principle of Induction

Let G be a group, H a subgroup, and π0 a representation of H in a space $\mathcal{H}_0$, all this with certain properties to be specified soon. From these data one can construct representations of G in several ways. The guiding principle is the following. We look at a space $\mathcal{H}$ of functions φ : G −→ $\mathcal{H}_0$, which satisfy some additional conditions (also to be specified below) and the fundamental functional equation

$$\phi(hg) = \delta(h)^{1/2}\,\pi_0(h)\,\phi(g) \quad \text{for all } h \in H,\ g \in G. \tag{7.1}$$

Here δ : H −→ R>0 is a normalizing function, which has several different descriptions. It is identically one in many important cases, so we do not bother with it for the moment (if δ is left out, one has unnormalized induction, which later on will appear in connection with an interpretation via line bundles). If $\mathcal{H}$ is closed under right translation, this space is the representation space for the representation

$$\pi = \mathrm{ind}_H^G\,\pi_0$$

given by right translation π(g0)φ(g) = φ(gg0). Before we make all this more precise, in the hope to add some motivation, we treat as an example a representation that we already treated as one of our first examples in 1.3.


Example 7.1: Let G be the Heisenberg group Heis(R) = {g := (λ, µ, κ); λ, µ, κ ∈ R}, H the subgroup consisting of the elements h := (0, µ, κ), and π0 the one-dimensional representation of H, which for a fixed m ∈ R∗ is given by

$$\pi_0(h) := e^m(\kappa) = e^{2\pi i m \kappa}.$$

As we know from 3.3 that the Heisenberg group and the subgroup H are unimodular, one has δ ≡ 1 (we shall see this below in 7.1.2) and we have to deal with the functional equation (7.1)

$$\phi(hg) = \pi_0(h)\,\phi(g) = e^m(\kappa)\,\phi(g) \quad \text{for all } h \in H,\ g \in G.$$

By this functional equation the complex function φ(g) = φ(λ, µ, κ) is forced to have the form

$$\phi(\lambda, \mu, \kappa) = e^m(\kappa + \lambda\mu)\,f(\lambda) \tag{7.2}$$

where f is a one-variable function, since every g ∈ Heis(R) has a decomposition

$$g = (\lambda, \mu, \kappa) = (0, \mu, \kappa + \lambda\mu)(\lambda, 0, 0) =: h_0\,s(\lambda)$$

with h0 ∈ H and s(λ) = (λ, 0, 0) ∈ G. Application of the right translation with g0 = (λ0, µ0, κ0) ∈ G (via the multiplication law in the Heisenberg group) leads to

$$
\begin{aligned}
(\pi(g_0)\phi)(g) = \phi(gg_0) &= \phi(\lambda + \lambda_0,\ \mu + \mu_0,\ \kappa + \kappa_0 + \lambda\mu_0 - \lambda_0\mu)\\
&= e^m(\kappa + \kappa_0 + \lambda\mu + \lambda_0\mu_0 + 2\lambda\mu_0)\,f(\lambda + \lambda_0)\\
&= e^m(\kappa + \lambda\mu)\,e^m(\kappa_0 + (\lambda_0 + 2\lambda)\mu_0)\,f(\lambda + \lambda_0).
\end{aligned}
$$

Comparing with (7.2) above, we rediscover the Schrödinger representation from Section 1.3: If we choose $\mathcal{H}$ such that in the decomposition of the elements φ above we have f ∈ L2(R), then

$$f \longmapsto \pi(g_0)f \quad \text{with} \quad \pi(g_0)f(x) = e^m(\kappa_0 + (\lambda_0 + 2x)\mu_0)\,f(x + \lambda_0)$$

is the prescription for the unitary representation from Example 1.6 in 1.3. The representation given by right translation on the space of functions φ on G of the form (7.2) is called the Heisenberg representation.
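The two computations of this example can be checked numerically. The following sketch assumes the multiplication law (λ, µ, κ)(λ0, µ0, κ0) = (λ + λ0, µ + µ0, κ + κ0 + λµ0 − λ0µ) used above; the value of m, the Gaussian f, and the sample group elements are arbitrary choices made only for illustration.

```python
import cmath

M = 1.5  # sample value of the parameter m (arbitrary choice)

def e_m(t):
    """The character t -> exp(2*pi*i*m*t)."""
    return cmath.exp(2j * cmath.pi * M * t)

def mult(g, g0):
    """Multiplication law of Heis(R) as used in the text."""
    (l, mu, k), (l0, mu0, k0) = g, g0
    return (l + l0, mu + mu0, k + k0 + l * mu0 - l0 * mu)

def f(x):
    """A sample square-integrable function (hypothetical choice)."""
    return cmath.exp(-x * x)

def phi(g):
    """Function on the group of the form (7.2)."""
    l, mu, k = g
    return e_m(k + l * mu) * f(l)

def schroedinger(g0, x):
    """(pi(g0) f)(x) in the Schroedinger picture."""
    l0, mu0, k0 = g0
    return e_m(k0 + (l0 + 2 * x) * mu0) * f(x + l0)

g = (0.3, -1.2, 0.7)
g0 = (0.5, 0.4, -0.2)
h = (0.0, 2.0, 0.9)

# Functional equation phi(hg) = e^m(kappa_h) phi(g):
print(abs(phi(mult(h, g)) - e_m(h[2]) * phi(g)))
# Right translation reproduces the Schroedinger prescription:
print(abs(phi(mult(g, g0)) - e_m(g[2] + g[0] * g[1]) * schroedinger(g0, g[0])))
```

Both printed differences should vanish up to rounding errors.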

7.1.1 Preliminary Approach

Now we want to see what is behind this example and try a more systematic treatment. At first we do not consider the most general case but (following [La] p.43f) one sufficient for several applications. Let
• G be a unimodular connected linear group,
• H be a closed subgroup,
• K be a unimodular closed subgroup such that the map

H × K −→ G, (h, k) −→ hk

is a topological isomorphism,
• δ = ∆H be the modular function of H, and
• π0 a representation of H in $\mathcal{H}_0$.


Then the homogeneous space of right H-cosets X = H\G can be identified with K (provided with an action of K from the right). Let $\mathcal{H}^\pi$ be the space of functions

$$\phi : G \longrightarrow \mathcal{H}_0$$

satisfying the functional condition (7.1)

$$\phi(hg) = \delta(h)^{1/2}\,\pi_0(h)\,\phi(g) \quad \text{for all } h \in H,\ g \in G$$

and the finiteness condition that the restriction φ|K =: f is in L2(K). We define a norm for $\mathcal{H}^\pi$ by

$$\|\phi\|^2_{\mathcal{H}^\pi} := \|f\|^2_{L^2(K)} = \int_K |f(k)|^2_{\mathcal{H}_0}\,dk, \tag{7.3}$$

and, with $\tilde\phi|_K =: \tilde f$, the scalar product by

$$\langle \phi, \tilde\phi \rangle := \langle f, \tilde f \rangle.$$

We denote $\pi = \mathrm{ind}_H^G\,\pi_0$ for π given by right translation (π(g0)φ)(g) = φ(gg0) and call this the representation of G induced by π0. We have to show that this really makes sense and shall see at the same time why the strange normalizing function δ has to be introduced.

Theorem 7.1: If π0 is bounded, π defines a bounded representation of G on $\mathcal{H}^\pi$. π is unitary if π0 is.

Proof: We have to show that $\phi^{g_0}$ given by $\phi^{g_0}(g) := \phi(gg_0)$ is in $\mathcal{H}^\pi$. One has

$$\phi^{g_0}(hg) = \phi(hgg_0) = \delta(h)^{1/2}\pi_0(h)\,\phi(gg_0) = \delta(h)^{1/2}\pi_0(h)\,\phi^{g_0}(g),$$

i.e. $\phi^{g_0}$ satisfies the functional equation. Now to the finiteness condition: We write g = hk and $kg_0 =: h'_k k'$ with $h, h'_k \in H$, $k, k' \in K$. Then we have (with δ = ∆H)

$$\phi^{g_0}(k) = \phi(kg_0) = \phi(h'_k k') = \delta(h'_k)^{1/2}\,\pi_0(h'_k)\,\phi(k'),$$

and, since π0 is bounded, $\|\phi^{g_0}\|^2_{\mathcal{H}^\pi}$ is bounded by a constant times

$$\int_K \Delta_H(h'_k)\,|\phi(k')|^2\,dk,$$

and equality holds if π0 is unitary. The proof is done if we can show that for f ∈ Cc(K) one has

$$\int_K f(k)\,dk = \int_K \Delta_H(h'_k)\,f(k')\,dk.$$

We take φ ∈ Cc(G) and use Fubini's Theorem. Since G is unimodular, we get

$$
\begin{aligned}
\int_H \int_K \phi(hk)\,dk\,dh = \int_G \phi(g)\,dg &= \int_G \phi(gg_0)\,dg\\
&= \int_H \int_K \phi(hkg_0)\,dk\,dh\\
&= \int_K \int_H \phi(hh'_k k')\,dh\,dk\\
&= \int_K \int_H \phi(hk')\,\Delta_H(h'_k)\,dh\,dk\\
&= \int_H \int_K \phi(hk')\,\Delta_H(h'_k)\,dk\,dh.
\end{aligned}
$$

Here we take φ such that

$$\phi(hk) = \varphi(h)f(k) \quad \text{with } f \in C_c(K),\ \varphi \in C_c(H),\ \int_H \varphi(h)\,dh = 1,$$

and get the desired result.


Exercise 7.1: Verify that the case of the Heisenberg group and its Schrödinger representation is covered by this theorem.

Remark 7.1: Using the notions from our section 6.6 where we introduced a bit more structure theory, one can also state a useful variant of Theorem 7.1 (as in Flicker [Fl] p.39): If G has an Iwasawa decomposition G = NAK, K a maximal compact subgroup, A the maximal torus in H = NA, N the unipotent radical, and χ a character of H, the representation space consists of smooth functions φ : G −→ C with

$$\phi(nak) = (\delta^{1/2}\chi)(a)\,\phi(k) \quad \text{for all } a \in A,\ n \in N,\ k \in K,$$

where the normalizing function can be described by

$$\delta(a) = |\det(\mathrm{Ad}\,a|_{\mathrm{Lie}\,N})|.$$

The representation is given again by right translation.

7.1.2 Mackey’s Approach

We shall afterwards come back to more easily accessible special cases but now go (following Mackey) to a more general situation (see [Ki] p.187ff, [BR] p.473) where we have to intensify the measure theoretic material from section 3.3. Let
• G be a connected linear group,
• H be a closed subgroup,
• π0 a unitary representation of H in $\mathcal{H}_0$ (later on, in some cases, we will also treat non-unitary representations),
• δ := ∆H/∆G where ∆H and ∆G are the modular functions of H resp. G introduced in 3.3,
• X the space of right H-cosets X = H\G,
• p : G −→ X the natural projection g −→ Hg =: x ∈ X.

Theorem 7.2: The normalized induced representation

$$\pi := \mathrm{ind}_H^G\,\pi_0$$

is given by right translation on the space $\mathcal{H}^\pi$, which is defined as the completion of its dense subspace spanned by the continuous functions φ : G −→ $\mathcal{H}_0$ satisfying the functional equation (7.1)

$$\phi(hg) = \delta(h)^{1/2}\,\pi_0(h)\,\phi(g) \quad \text{for all } h \in H,\ g \in G$$

and the finite norm condition

$$\|\phi\|^2_{\mathcal{H}^\pi} := \int_X \|\phi(s(x))\|^2_{\mathcal{H}_0}\,d\mu_s(x) < \infty. \tag{7.4}$$

Before we elaborate on the fact that we really get a unitary representation, this condition introducing the norm $\|\cdot\|_{\mathcal{H}^\pi}$ needs some explanation:


Quasi-invariant Measures, Master Equation and the Mackey Decomposition

As above, we denote by X the space of right H-cosets, X = H\G, and by p : G −→ X the natural projection given by g −→ Hg =: x ∈ X. Then it is a general fact that the homogeneous space X admits quasi-invariant measures µ. These are defined by the condition that, for every g ∈ G, µ and µg (with µg(B) := µ(Bg) for all Borel sets B) have the same null sets, i.e. µ(B) = 0 is equivalent to µg(B) = 0. Again we do not prove this general statement but later on give enough examples to illuminate the situation (we hope). For f ∈ Cc(X) we have

$$\mu_g(f) = \int_X f(xg)\,d\mu(x) = \int_X f(x)\,d\mu(xg^{-1}) \quad \text{with } d\mu(xg^{-1}) =: d\mu_g(x),$$

and for g ∈ G we have a density function ρg, which may be understood as the quotient of dµg and dµ (in the general theory this goes under the name of a Radon-Nikodym derivative). The construction of quasi-invariant measures can be done as follows. We take over from [Ma1], part I, Lemma 1.1 (see [BR] p.70):

Theorem 7.3 (Mackey's Decomposition Theorem): Let G be a separable locally compact group and let H be a closed subgroup of G. Then there exists a Borel set B in G such that every element g ∈ G can be uniquely represented in the form

g = hs, h ∈ H, s ∈ B.

In our situation where we have

p : G −→ X, g −→ Hg =: x,

this leads to the existence of a Borel section s : X −→ G, i.e. a map preserving Borel sets with p ∘ s = idX. Hence for g ∈ G, we have a unique Mackey decomposition

$$G \ni g = h(x)\,s(x), \quad h(x) \in H,\ x \in X. \tag{7.5}$$

Then the invariant measures drh and drg on H resp. G are related to a quasi-invariant measure dµs associated to the section s by

$$d_r g =: \frac{\Delta_G(h)}{\Delta_H(h)}\,d_r h\,d\mu_s(x)$$

(by a reasoning which is a refinement of the last part of the proof of Theorem 7.1, or by solving the problems in [Ki] p.132) and we have the quasi-measure relation

$$\frac{d\mu_s(xg)}{d\mu_s(x)} = \rho_g(x) = \frac{\Delta_H(h(g, x))}{\Delta_G(h(g, x))} = \delta(h(g, x)). \tag{7.6}$$

To anyone who thinks this is complicated, we recommend looking at the case of the Heisenberg group we treated above: We have the section s : X −→ G given by X ∋ x −→ s(x) = (x, 0, 0) and the Mackey decomposition

$$\mathrm{Heis}(\mathbb{R}) \ni g = (\lambda, \mu, \kappa) = h(x)\,s(x)$$

with

$$h(x) = (0, \mu, \kappa + \lambda\mu), \quad s(x) = (\lambda, 0, 0), \quad x = \lambda.$$

And, since the groups G and H are both unimodular, we simply have

$$d_r g = d\lambda\,d\mu\,d\kappa, \quad d_r h = d\mu\,d\kappa, \quad d\mu_s(x) = dx.$$

In this special case, one easily verifies directly that dx is not only quasi-invariant but invariant. And the general finite norm condition above reduces to the condition that f is an L2-function on X ≃ R, i.e. the condition we found above while rediscovering the Schrödinger representation.

The First Realization

In the general case we will need the explicit form of the action of G on X in the framework of the Mackey decomposition: We apply the Mackey decomposition (7.5) to s(x)g0 and get a consistent picture by choosing xg0 such that, for h∗ := h(xg0) ∈ H (which, for reasons to be seen later, we also write as h(g0, x)), we have the Master Equation (the name is from [Ki1] p.372)

$$s(x)\,g_0 = h^*\,s(xg_0). \tag{7.7}$$

Now, let us verify that the space $\mathcal{H}^\pi$, as defined above, is in fact the space for our induced representation with $\pi(g_0)\phi = \phi^{g_0}$: If φ satisfies the functional equation (7.1), by the same reasoning as in the proof of Theorem 7.1, so does $\phi^{g_0}$ for every g0 ∈ G. The main point is the norm condition. The more experienced reader will see that, in principle, we do the same as in the proof of Theorem 7.1: Using (7.7) above, the functional equation (7.1) and the unitarity of π0, one has from (7.4)

$$
\begin{aligned}
\|\phi^{g_0}\|^2_{\mathcal{H}^\pi} &= \int_X \|\phi^{g_0}(s(x))\|^2_{\mathcal{H}_0}\,d\mu_s(x)\\
&= \int_X \|\phi(s(x)g_0)\|^2_{\mathcal{H}_0}\,d\mu_s(x)\\
&= \int_X \|\phi(h^* s(xg_0))\|^2_{\mathcal{H}_0}\,d\mu_s(x)\\
&= \int_X \|\phi(s(xg_0))\|^2_{\mathcal{H}_0}\,\delta(h(g_0, x))\,d\mu_s(x).
\end{aligned}
$$

Replacing x by $xg_0^{-1}$, we see that translation does not change the norm

$$
\begin{aligned}
\|\phi^{g_0}\|^2_{\mathcal{H}^\pi} &= \int_X \|\phi(s(x))\|^2_{\mathcal{H}_0}\,\delta(h(g_0, xg_0^{-1}))\,d\mu_s(xg_0^{-1})\\
&= \int_X \|\phi(s(x))\|^2_{\mathcal{H}_0}\,d\mu_s(x) = \|\phi\|^2_{\mathcal{H}^\pi},
\end{aligned}
$$

since from the Master Equation (7.7) (as expression of the associativity of the group operation) one deduces the cocycle condition

$$h(g_1, x)\,h(g_2, xg_1) = h(g_1g_2, x). \tag{7.8}$$

Via the chain rule, the quasi-measure relation (7.6), $d\mu_s(xg_0)/d\mu_s(x) = \rho_{g_0}(x) = \delta(h(g_0, x))$, provides for

$$\delta(h(g_0, xg_0^{-1}))\,d\mu_s(xg_0^{-1}) = d\mu_s(x).$$

It is now evident that π is unitary if we put

$$\langle \phi, \psi \rangle := \int_X \langle \phi(s(x)), \psi(s(x)) \rangle_{\mathcal{H}_0}\,d\mu_s(x).$$
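For the Heisenberg group, the Master Equation and the cocycle condition can be verified by exact arithmetic. The sketch below assumes the section s(x) = (x, 0, 0) and the decomposition g = (0, µ, κ + λµ)(λ, 0, 0) from Example 7.1; the function names are hypothetical. With these choices one finds xg0 = x + λ0 and h(g0, x) = (0, µ0, κ0 + (2x + λ0)µ0).

```python
def mult(g, g0):
    """Multiplication law of the Heisenberg group as used in Example 7.1."""
    (l, mu, k), (l0, mu0, k0) = g, g0
    return (l + l0, mu + mu0, k + k0 + l * mu0 - l0 * mu)

def section(x):
    """The section s : X -> G, x -> (x, 0, 0)."""
    return (x, 0, 0)

def decompose(g):
    """Mackey decomposition g = h * s(x); returns the pair (h, x)."""
    l, mu, k = g
    return (0, mu, k + l * mu), l

def master(g0, x):
    """Solve s(x) g0 = h(g0, x) s(x g0); returns (h(g0, x), x g0)."""
    return decompose(mult(section(x), g0))

# Sample data with integer entries, so that all checks are exact.
x, g1, g2 = 3, (2, -1, 5), (-4, 7, 1)

h1, x1 = master(g1, x)
print(mult(h1, section(x1)) == mult(section(x), g1))       # Master Equation

h2, _ = master(g2, x1)
h12, _ = master(mult(g1, g2), x)
print(mult(h1, h2) == h12)                                   # cocycle condition (7.8)
```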


Thus we have the essential tools to extend the validity of Theorem 7.1 to the more general hypothesis stated above (we omit the proof that changing the section s leads to equivalent representations). This realization of the induced representation by functions living on the group is usually called the first realization or induced picture.

Before we go over to another realization, let us point out that the representation by right translation has an equivalent version where G acts by left translation, i.e. we have $\pi'(g_0)\phi(g) = \phi(g_0^{-1}g)$, where here we have the space $\mathcal{H}'^{\pi}$ consisting of functions φ fulfilling the fundamental functional equation

$$\phi(gh) = \delta(h^{-1})^{1/2}\,\pi_0(h^{-1})\,\phi(g) \quad \text{for all } h \in H,\ g \in G, \tag{7.9}$$

Exercise 7.2: Fill in missing details.

The Second Realization

We stay with the same hypotheses as above and turn to functions living on the homogeneous space X = H\G. We denote by

$$\mathcal{H}_\pi := L^2(X, d\mu_s, \mathcal{H}_0)$$

the space of $\mathcal{H}_0$-valued functions f on X, which are square integrable with respect to the quasi-invariant measure dµs. In this context we have as fundamental fact the following

Lemma: The map

$$\psi : \mathcal{H}^\pi \longrightarrow \mathcal{H}_\pi, \quad \phi \longmapsto f$$

given by f(x) := φ(s(x)) for x ∈ X is bijective and an isometry.

Proof: ψ is linear and has as its inverse the map

$$\varphi : \mathcal{H}_\pi \longrightarrow \mathcal{H}^\pi, \quad f \longmapsto \phi_f$$

given, for g = h(x)s(x), x ∈ X, by $\phi_f(g) := \delta(h(x))^{1/2}\,\pi_0(h(x))\,f(x)$. One has

$$\|\phi_f\|^2_{\mathcal{H}^\pi} = \int_X \|\phi_f(s(x))\|^2_{\mathcal{H}_0}\,d\mu_s(x) = \int_X \|f(x)\|^2_{\mathcal{H}_0}\,d\mu_s(x) = \|f\|^2_{\mathcal{H}_\pi}.$$

Hence the maps preserve the norm. Both spaces have compatible Hilbert space structure since in a complex inner product space V for u1, u2 ∈ V one has the relation

$$4\langle u_1, u_2 \rangle = \|u_1 + u_2\|^2 - \|u_1 - u_2\|^2 - i\,\|u_1 + iu_2\|^2 + i\,\|u_1 - iu_2\|^2.$$


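Using this, we can carry over the induced representation from $\mathcal{H}^\pi$ to $\mathcal{H}_\pi$ and from our Theorem 7.2 come to an equivalent picture as follows.

Theorem 7.4 (Second Realization): $\pi = \mathrm{ind}_H^G\,\pi_0$ is realized on $\mathcal{H}_\pi$ by π with

$$\pi(g_0)f(x) := A(g_0, x)\,f(xg_0) \quad \text{for all } f \in \mathcal{H}_\pi$$

with

$$A(g_0, x) := \delta(h(g_0, x))^{1/2}\,\pi_0(h(g_0, x))$$

for s(x)g0 = h(g0, x)s(xg0), x ∈ X, g0 ∈ G.

Proof: We evaluate the commutative diagram:

[Commutative diagram: the square formed by π(g0) acting on $\mathcal{H}^\pi$, π(g0) acting on $\mathcal{H}_\pi$, and the maps ϕ and ψ between the two spaces.]

• f ∈ $\mathcal{H}_\pi$ is mapped by ϕ onto φ ∈ $\mathcal{H}^\pi$ with

$$\phi(g) = \phi(h(x)s(x)) = \delta(h(x))^{1/2}\,\pi_0(h(x))\,f(x) \quad \text{for } g = h(x)s(x) \in G,\ x \in X.$$

• π(g0) maps φ to $\phi^{g_0}$ with $\phi^{g_0}(g) = \phi(gg_0)$.
• ψ maps $\phi^{g_0}$ to $\tilde f$ with

$$\tilde f(x) = \phi^{g_0}(s(x)) = \phi(s(x)g_0).$$

By the Master Equation s(x)g0 = h(g0, x)s(xg0) and the definition of φ this can be written as

$$\tilde f(x) = \phi(h(g_0, x)\,s(xg_0)) = \delta(h(g_0, x))^{1/2}\,\pi_0(h(g_0, x))\,f(xg_0).$$

Hence, π(g0) transports f to $\tilde f = \pi(g_0)f$ with $\tilde f$ as given in the Theorem.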
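That the prescription of Theorem 7.4 really defines a homomorphism can be tested numerically in the Heisenberg case. The sketch below assumes the data of Example 7.1, i.e. δ ≡ 1, xg0 = x + λ0 and A(g0, x) = e^m(κ0 + (λ0 + 2x)µ0); the value of m, the test function and the group elements are arbitrary sample choices.

```python
import cmath

M = 0.75  # sample value of the parameter m (arbitrary choice)

def e_m(t):
    return cmath.exp(2j * cmath.pi * M * t)

def mult(g, g0):
    """Multiplication law of the Heisenberg group as used in Example 7.1."""
    (l, mu, k), (l0, mu0, k0) = g, g0
    return (l + l0, mu + mu0, k + k0 + l * mu0 - l0 * mu)

def rep(g0):
    """Second realization: (pi(g0) f)(x) = A(g0, x) f(x g0)."""
    l0, mu0, k0 = g0
    def operator(f):
        return lambda x: e_m(k0 + (l0 + 2 * x) * mu0) * f(x + l0)
    return operator

f = lambda x: cmath.exp(-x * x)      # sample function on X = R
g1, g2 = (0.4, -1.1, 0.3), (-0.7, 0.5, 1.2)
x = 0.25

lhs = rep(g1)(rep(g2)(f))(x)         # pi(g1) pi(g2) f
rhs = rep(mult(g1, g2))(f)(x)        # pi(g1 g2) f
print(abs(lhs - rhs))
```

The equality of both sides is exactly the multiplier relation A(g1g2, x) = A(g1, x)A(g2, xg1) evaluated on a test function.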

Exercise 7.4: Repeat this discussion to get a representation space consisting of functions f living on the space of left cosets X′ = G/H by starting from the induced representation π′ given by left translation on the space $\mathcal{H}'^{\pi}$. Deduce the representation prescription

$$\pi(g_0)f(y) := \delta(h(g_0, y))^{-1/2}\,\pi_0(h(g_0, y)^{-1})\,f(g_0^{-1}y) \tag{7.10}$$

based on the Master Equation

$$g_0^{-1}\,s(y) = s(g_0^{-1}y)\,h(g_0, y).$$

Remark 7.2: In the special case that π0 is the trivial one-dimensional representation with π0(h) = 1 for all h ∈ H, we fall back to the situation of our Example 1.5 in Section 1.3 where we constructed a representation on a space of functions on a homogeneous space X = H\G. This may be seen as the starting point of the construction of induced representations.


Remark 7.3: From the cocycle condition (7.8) for h = h(g, x), one can deduce that the multiplier system A in Theorem 7.4 fulfills this relation as well, i.e. we have

A(g1g2, x) = A(g1, x)A(g2, xg1) for all g1, g2 ∈ G, x ∈ X.

And, vice versa, one can take such a system as a tool to construct a representation on a homogeneous space X with a right G-action (see for instance [Ki1] p.388).

A proof that essentially all this does not depend on the choice of the section s can be found, for instance, in [BR] p.478. In most practical cases one has a more or less natural decomposition of the given group into subgroups as at the beginning of this section, so one can avoid talking about different sections. We already discussed the Heisenberg group as an example for this. Here the Schrödinger representation came out as the Second Realization with π = πm acting on $\mathcal{H}_\pi$ = L2(R). The First Realization by functions living on the group is called the Heisenberg representation.

7.1.3 Final Approach

Before we give more concrete examples, we reproduce some general considerations (which may be skipped at first reading) linked to the report on structure theory in section 6.6 and take over the following situation from [Kn] p.168: Let G be (not uniquely) decomposed as G = KH and H in Langlands decomposition H = MAN. We do not give all the details for a proper definition as to be found in [Kn] p.132. But it may suffice to say that one has a corresponding direct sum decomposition of the Lie algebras g = k ⊕ m ⊕ a ⊕ n where a is abelian, n nilpotent, and m and a normalize n. (Our standard example is G = SL(2,R). Here we have as in section 6.6.4 the (unique) Iwasawa decomposition G = KAN with K = SO(2), A the positive diagonal elements, N = {n(x); x ∈ R}, and H = MAN with M = {±E2}.) Let ∆+ denote the roots of (g, a) positive for N and $\rho := (1/2)\sum_{\alpha \in \Delta^+} (\dim \mathfrak{g}_\alpha)\,\alpha$. Moreover, let σ be an irreducible unitary representation of M on $V^\sigma$, and ν ∈ (a∗)C, i.e. a complex linear functional on a. Then in this context, the induced representation is written as

$$\pi = \mathrm{ind}_{MAN}^{G}(\sigma \otimes \exp\nu \otimes 1)$$

and in the induced picture given by left translation $\pi(g_0)\phi(g) = \phi(g_0^{-1}g)$ on the representation space, which is given by the dense subspace of continuous functions φ : G −→ $V^\sigma$ with functional equation

$$\phi(gman) = e^{-(\nu + \rho)\log a}\,\sigma(m)^{-1}\,\phi(g)$$

and the norm

$$\|\phi\|^2 := \int_K |\phi(k)|^2\,dk.$$

By restriction to the compact group K one gets the compact picture. A dense subspace is given by continuous functions f : K −→ $V^\sigma$ with

$$f(km) = \sigma(m)^{-1}f(k)$$

and the same norm as above. Here the representation π is given by

$$\pi(g_0)f(k) := e^{-(\nu + \rho)H(g_0^{-1}k)}\,\sigma(\mu(g_0^{-1}k))^{-1}\,f(\kappa(g_0^{-1}k))$$

where g decomposes under G = KMAN as $g = \kappa(g)\,\mu(g)\,e^{H(g)}\,n$.

In [Kn] p.169 there is a third picture, the noncompact picture, given by restriction to $\bar N = \Theta N$. Here the representation π is given by

$$\pi(g_0)f(\bar n) := e^{-(\nu + \rho)\log a(g_0^{-1}\bar n)}\,\sigma(m(g_0^{-1}\bar n))^{-1}\,f(\bar n(g_0^{-1}\bar n))$$

if one takes account of the fact that $\bar N MAN$ exhausts G except for a lower dimensional set and that in general one has a decomposition $g = \bar n(g)\,m(g)\,a(g)\,n$. The representation space is here $L^2(\bar N, e^{2\,\mathrm{Re}\,\nu\,H(\bar n)}\,d\bar n)$.

7.1.4 Some Questions and two Easy Examples

Several questions arise immediately:

i) How to construct non-trivial functions in $\mathcal{H}^\pi$ resp. $\mathcal{H}_\pi$?
ii) When is an induced representation π irreducible?
iii) Which representations can be constructed by induction?
iv) If one restricts an induced representation to the subgroup, does one get back the representation of the subgroup?
v) What happens by iteration of the induction procedure?
vi) What about the induced representation if the inducing representation is the sum or tensor product of two representations?

Here we cannot give satisfying answers to these questions (but hope that our examples later on will be sufficiently illuminating). We only mention some remarks.

Ad i) One can prove ([BR] p.477) the following statement.
Proposition 7.1: Let ϕ be a continuous $\mathcal{H}_0$-valued function on G with compact support. Define φϕ by

$$\phi_\varphi(g) := \int_H \pi_0(h^{-1})\,\varphi(hg)\,d_r h.$$

Then φϕ is a continuous function on G whose support goes to a compact set by the projection p : G −→ X and is an element of $\mathcal{H}^\pi$. The set

$$C^\pi := \{\phi_\varphi;\ \varphi(g) = \xi(g)v,\ \xi \in C_c(G),\ v \in \mathcal{H}_0\}$$

is a dense set in $\mathcal{H}^\pi$.

Ad ii) We have seen that the Schrödinger representation of the Heisenberg group is an irreducible induced representation. In general, this is not the case and in our examples below we will have to discuss this. We shall find conditions to be added to the definition of our space if one wants to come to an irreducible representation (e.g. holomorphic induction). In [BR] p.499, there are general criteria to get irreducibility (using the concept of a canonical system of imprimitivity).

Ad iii) This concept of canonical system of imprimitivity is also used to give a criterion of inducibility stating under which conditions a unitary representation is equivalent to an induced representation. One finds versions of this in [BR] p.495, [Ki] p.191, and in [Ki1] p.389 (Mackey Inducibility Criterion). We do not reproduce these because the statements need more new notions.


Ad iv) Induction and restriction are adjoint functors. We do not try to make this more precise. At least for finite groups, one finds statements of the Frobenius Reciprocity as in [Se] p.57 or [Ki] p.185, which, using our notation (see section 1.2), for a representation π of G can be written as

$$c(\pi, \mathrm{ind}_H^G\,\pi_0) = c(\pi|_H, \pi_0).$$
Ad v) One has the concept of induction in stages, which symbolically can be written as

indGHindH

K indGK

if we have a subgroup chain G ⊃ H ⊃ K ([Ki] p.184, [BR] p.489).

Ad vi) If π0 and π′0 are representations of the subgroup H of G the operations of taking

direct sum and induction are interchangeable

indGH (π0 ⊕ π′

0) indGHπ0 ⊕ indG

Hπ′0

([BR] p.488). And if π1 and π2 are unitary representations of the closed subgroups H1

and H2 of the separable, locally compact groups G1 resp. G2, then one has ([BR] p.491)

indG1×G2H1×H2

(π1 ⊗ π2) indG1H1

π1 ⊗ indG2H2

π2.

Up to now, we treated only the example of the Heisenberg group. Before we do more serious work with G = SL(2,R), we discuss another example we already looked at, namely G = SU(2), and the two-dimensional form of the Poincaré group, which later on will be discussed more thoroughly in the four-dimensional form in the context of Mackey's method for semidirect products.

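Example 7.2: Let

$$ G = \Big\{ g = \begin{pmatrix} A & n \\ 0 & 1 \end{pmatrix} \in SL(3,\mathbb{R});\ A \in SO(1,1),\ n = \begin{pmatrix} a \\ t \end{pmatrix} \in \mathbb{R}^2 \Big\} $$

be the two-dimensional Poincaré group. We abbreviate g =: (A, n), so that matrix multiplication leads to the composition law

$$ g g' = (AA', An' + n). $$

In analogy to the case of SO(2), we write the elements of SO(1, 1) in the form

$$ A = \begin{pmatrix} \cosh\vartheta & \sinh\vartheta \\ \sinh\vartheta & \cosh\vartheta \end{pmatrix} =: rh(\vartheta), \quad \vartheta \in \mathbb{R}. $$

From the addition theorem we have $AA' = rh(\vartheta)\,rh(\vartheta') = rh(\vartheta + \vartheta')$, and dϑ is a bi-invariant measure on SO(1, 1). We take the subgroup

$$ H := \Big\{ h = \begin{pmatrix} E_2 & n \\ 0 & 1 \end{pmatrix};\ n \in \mathbb{R}^2 \Big\} $$

and the one-dimensional representation $\pi_0$ defined by

$$ \pi_0(h) := e^{\pi i (a\alpha + t\beta)} = e^{\pi i\, {}^t\hat{n} n} =: \chi_{\hat{n}}(n), \quad \hat{n} := \begin{pmatrix} \alpha \\ \beta \end{pmatrix} \in \mathbb{R}^2. $$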


Moreover, we denote by K the image of the injection of SO(1, 1) into G

$$ K := \Big\{ k = \begin{pmatrix} A & 0 \\ 0 & 1 \end{pmatrix};\ A \in SO(1,1) \Big\}. $$

Then every g ∈ G has a Mackey decomposition g = (A, n) = (E, n)(A, 0) and we have

$$ G \longrightarrow X = H\backslash G \simeq K \simeq \mathbb{R}, \quad g = (A, n) \longmapsto A = rh(\vartheta) \longmapsto \vartheta. $$

The Master Equation $s(x)g_0 = h(g_0, x)\,s(xg_0)$ in this case simply is

$$ (A, 0)(A_0, n_0) = (AA_0, An_0) = (E, An_0)(AA_0, 0). $$
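The composition law, the addition theorem for rh(ϑ), and the Master Equation of Example 7.2 can be checked numerically. The following is only an illustrative sketch: the helper names (`rh`, `compose`, `close`) and the sample values are assumptions made for this check and are not notation from the text.

```python
import math

def rh(theta):
    """Hyperbolic rotation rh(theta) in SO(1,1) as a 2x2 tuple of rows."""
    c, s = math.cosh(theta), math.sinh(theta)
    return ((c, s), (s, c))

def matmul(A, B):
    """Product of two 2x2 matrices."""
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def matvec(A, v):
    """Apply a 2x2 matrix to a vector in R^2."""
    return tuple(A[i][0] * v[0] + A[i][1] * v[1] for i in range(2))

def compose(g, gp):
    """Composition law (A, n)(A', n') = (AA', A n' + n)."""
    (A, n), (Ap, npr) = g, gp
    An = matvec(A, npr)
    return (matmul(A, Ap), (An[0] + n[0], An[1] + n[1]))

def flat(g):
    A, n = g
    return [A[0][0], A[0][1], A[1][0], A[1][1], n[0], n[1]]

def close(g, gp, tol=1e-9):
    """Entrywise comparison of two group elements."""
    return all(abs(u - v) < tol for u, v in zip(flat(g), flat(gp)))

E = ((1.0, 0.0), (0.0, 1.0))
ZERO = (0.0, 0.0)
A, A0, n0 = rh(0.7), rh(-1.2), (0.5, 2.0)   # arbitrary sample values

# Master Equation: (A, 0)(A0, n0) = (E, A n0)(A A0, 0)
lhs = compose((A, ZERO), (A0, n0))
rhs = compose((E, matvec(A, n0)), (matmul(A, A0), ZERO))

# Addition theorem: rh(0.7) rh(-1.2) = rh(-0.5)
addition_ok = close((matmul(rh(0.7), rh(-1.2)), ZERO), (rh(-0.5), ZERO))
print(close(lhs, rhs), addition_ok)
```

Both comparisons hold up to rounding, which reflects that the H-component of s(x)g₀ is (E, An₀) and the new base point is ϑ + ϑ₀.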

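We see that dϑ is not only quasi-invariant but invariant, and we have a trivial normalizing function δ(h) = 1 for all h ∈ H. Hence, the induced representation $\pi = \mathrm{ind}_H^G \chi_{\hat{n}}$ is given by right translation on the space $\mathcal{H}^\pi$, which is the completion of the space of continuous functions

$$ \phi : G \longrightarrow \mathbb{C}, \quad \phi(hg) = \chi_{\hat{n}}(h)\phi(g), \quad \int_{\mathbb{R}} |\phi((rh(\vartheta), 0))|^2\, d\vartheta < \infty. $$

The induced representation has its Second Realization on the space $\mathcal{H}_\pi = L^2(X, d\mu(x)) = L^2(\mathbb{R}, d\vartheta)$. We have the isomorphism given by

$$ \mathcal{H}^\pi \ni \phi \longmapsto f \in \mathcal{H}_\pi \quad \text{with} \quad f(\vartheta) := \phi((rh(\vartheta), 0)), $$

and the prescription for the representation from Theorem 7.4, namely

$$ \pi(g_0)f(x) = A(g_0, x) f(xg_0), $$

in this case leads to

$$ \pi((rh(\vartheta_0), n_0)) f(\vartheta) = \chi_{\hat{n}}(rh(\vartheta) n_0) f(\vartheta + \vartheta_0) = e^{\pi i\, {}^t n_0\, rh(\vartheta)\, \hat{n}} f(\vartheta + \vartheta_0). $$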

The representation is irreducible. This is a byproduct of Mackey's general theory for representations of semidirect products, upon which we will report later in section 7.3. The reader is invited to find a direct proof using the infinitesimal method analogous to the example in 6.5 (Exercise 7.5). In the next example the induced representation will turn out to be reducible.

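Example 7.3: For the group

$$ G = SU(2) = \Big\{ g = \begin{pmatrix} a & b \\ -\bar{b} & \bar{a} \end{pmatrix};\ a, b \in \mathbb{C},\ |a|^2 + |b|^2 = 1 \Big\} $$

we obtained in 4.2 the result that every irreducible representation of G is equivalent to a representation $\pi_j$, $j \in (1/2)\mathbb{N}_0$, given on the space $V^{(j)} = \mathbb{C}[x, y]_{2j}$ of homogeneous polynomials P of degree 2j by the prescription

$$ \pi_j(g_0)P(x, y) := P(\bar{a}_0 x - b_0 y,\ \bar{b}_0 x + a_0 y). $$

This formula is based on the action on functions produced by

$$ \begin{pmatrix} x \\ y \end{pmatrix} \longmapsto g_0^{-1} \begin{pmatrix} x \\ y \end{pmatrix} = {}^t\bar{g}_0 \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \bar{a}_0 x - b_0 y \\ \bar{b}_0 x + a_0 y \end{pmatrix}. $$

We want to see how this fits into our scheme of construction of induced representations: It is very natural to take as subgroup H the group of diagonal matrices

$$ H := \Big\{ h = \begin{pmatrix} a & \\ & \bar{a} \end{pmatrix};\ a = e^{i\varphi},\ \varphi \in \mathbb{R} \Big\} \simeq U(1) \simeq SO(2) $$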


and as its representation $\pi_0$ the one given by $\pi_0(h) = e^{ik\varphi} =: \chi_k(h)$, $k \in \mathbb{N}_0$. Since in $\pi_j$ we have a left action, we deal here also with an action of G by left translation, i.e.

$$ \pi(g_0)\phi(g) = \phi(g_0^{-1} g) $$

for functions φ from a representation space $\mathcal{H}'^{\pi}$, which fulfill the functional equation

$$ \phi(gh) = \pi_0(h^{-1})\phi(g) = \lambda^k \phi(g), \quad \lambda = e^{-i\varphi}. $$

(We know that G and H are compact and, hence, unimodular so that the normalizing factor δ is trivial.) The homogeneous space X′ = G/H showing up here is one of the most fundamental examples of homogeneous spaces: One has

$$ SU(2)/H \simeq \mathbb{P}^1(\mathbb{C}), $$

the one-dimensional complex projective space. We already met projective spaces in 4.3 while introducing the concept of projective representations. Here we use the following notation: G = SU(2) acts on $V = \mathbb{C}^2$ by

$$ (g, v) \longmapsto gv, \quad v := \begin{pmatrix} x \\ y \end{pmatrix}. $$

We have $\mathbb{P}^1(\mathbb{C}) = (V \setminus \{0\})/\!\sim$, where for v, v′ ∈ V one has v ∼ v′ iff there exists a $\lambda \in \mathbb{C}^*$ with v′ = λv. The action of G on V induces an action on $\mathbb{P}^1(\mathbb{C})$ given by $g(v_\sim) := (gv)_\sim$. We write $(x : y) = {}^t(x, y)_\sim$. For $v_0 = {}^t(0, 1)$ we have $gv_0 = {}^t(b, \bar{a})$, i.e. $g(0 : 1) = (b : \bar{a})$, and the stabilizing group of $(v_0)_\sim = (0 : 1)$ is just H ≃ U(1). Hence we have

$$ SU(2)/H \overset{\sim}{\longrightarrow} \mathbb{P}^1(\mathbb{C}), \quad g = \begin{pmatrix} a & b \\ -\bar{b} & \bar{a} \end{pmatrix} \longmapsto (b : \bar{a}). $$

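A homogeneous function in two variables F(x, y) induces via f(x : y) := F(x, y) a function f on $\mathbb{P}^1(\mathbb{C})$, and this in turn induces a function $\phi_F$ living on G given by $\phi_F(g) := F({}^t(g\, {}^t(0, 1))) = F((0, 1)\, {}^t g)$. Hence we associate the function $\phi_P$ to a homogeneous polynomial P of degree k = 2j from our representation $\pi_j$, namely

$$ \phi_P(g) := P(b, \bar{a}) \quad \text{for } g = \begin{pmatrix} a & b \\ -\bar{b} & \bar{a} \end{pmatrix}. $$

The fact that P is homogeneous of degree k translates to the functional equation

$$ \phi_P(gh) = P((0, 1)\, {}^t(gh)) = P((0, e^{-i\varphi})\, {}^t g) = e^{-i\varphi k} P((0, 1)\, {}^t g) = \lambda^k \phi_P(g) $$

where $h = \begin{pmatrix} e^{i\varphi} & \\ & e^{-i\varphi} \end{pmatrix}$ and $\lambda = e^{-i\varphi}$ as above. One has

$$ g_0^{-1} g = \begin{pmatrix} \bar{a}_0 & -b_0 \\ \bar{b}_0 & a_0 \end{pmatrix} \begin{pmatrix} a & b \\ -\bar{b} & \bar{a} \end{pmatrix} = \begin{pmatrix} \bar{a}_0 a + b_0 \bar{b} & \bar{a}_0 b - b_0 \bar{a} \\ \bar{b}_0 a - a_0 \bar{b} & \bar{b}_0 b + a_0 \bar{a} \end{pmatrix} $$

and hence

$$ \phi_P(g_0^{-1} g) = P(\bar{a}_0 b - b_0 \bar{a},\ \bar{b}_0 b + a_0 \bar{a}) = \phi_{\pi(g_0)P}(g). $$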
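As a sanity check of the last identity, the following sketch evaluates both sides numerically for one sample polynomial. It assumes the conjugate placements $g = \begin{pmatrix} a & b \\ -\bar b & \bar a \end{pmatrix}$, $\pi_j(g_0)P(x,y) = P(\bar a_0 x - b_0 y, \bar b_0 x + a_0 y)$ and $\phi_P(g) = P(b, \bar a)$; the pair encoding `(a, b)` of a group element and all helper names are hypothetical choices made only for this illustration.

```python
import cmath
import math

def su2(t, u, v):
    """Pair (a, b) with |a|^2 + |b|^2 = 1, standing for (a b; -conj(b) conj(a))."""
    return (math.cos(t) * cmath.exp(1j * u), math.sin(t) * cmath.exp(1j * v))

def mul(g, h):
    """Matrix product in the (a, b) encoding (first row of the product)."""
    a, b = g
    c, d = h
    return (a * c - b * d.conjugate(), a * d + b * c.conjugate())

def inv(g):
    """Inverse (conj(a), -b), i.e. the conjugate transpose."""
    a, b = g
    return (a.conjugate(), -b)

def P(x, y):
    """Sample homogeneous polynomial of degree k = 3."""
    return x**2 * y + 2 * y**3

def pi(g0, Q):
    """pi_j(g0) Q (x, y) = Q(conj(a0) x - b0 y, conj(b0) x + a0 y)."""
    a0, b0 = g0
    return lambda x, y: Q(a0.conjugate() * x - b0 * y, b0.conjugate() * x + a0 * y)

def phi(Q):
    """Lift phi_Q(g) = Q(b, conj(a))."""
    return lambda g: Q(g[1], g[0].conjugate())

g0, g = su2(0.3, 1.1, -0.4), su2(1.2, 0.5, 2.0)

# phi_P(g0^{-1} g) versus phi_{pi(g0) P}(g)
left = phi(P)(mul(inv(g0), g))
right = phi(pi(g0, P))(g)

# Functional equation phi_P(g h) = lambda^k phi_P(g) with lambda = e^{-i angle}, k = 3
angle = 0.8
h = (cmath.exp(1j * angle), 0j)
ratio = phi(P)(mul(g, h)) / phi(P)(g)
print(abs(left - right), abs(ratio - cmath.exp(-3j * angle)))
```

Both printed deviations are at rounding level, in line with the statement that $P \mapsto \phi_P$ intertwines $\pi_j$ with left translation.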

We see that $\pi_j$ is a candidate for a subrepresentation of the induced representation $\pi = \mathrm{ind}_H^G \chi_k$. We propose to show as Exercise 7.6 that the finite norm condition is fulfilled and to search for further conditions to be imposed such that we get a subspace of $\mathcal{H}^\pi$ which is isomorphic to the space $V^{(j)}$ of $\pi_j$. We shall return to this question and the example in Section 7.6 in the framework of the more sophisticated and elegant concept of an interpretation of the induction process via line bundles.


7.2 Unitary Representations of SL(2,R)

Serge Lang dedicated a whole book, namely "SL(2,R)" ([La1]), to this group and its representations. Also, there is a lot of refined material concerning the representations of SL(2,R) throughout Knapp's book [Kn]. Summaries presenting models for the different representations can be found in many places, for instance in [KT] p.1–24 or in [Do]. The most classical reference is Bargmann's article [Ba] from 1947. We shall need the representations of G = SL(2,R) later in the discussion of the representations of the Poincaré group and most essentially in the Outlook to Number Theory.

Using the Infinitesimal Method, in Theorem 6.2 in section 6.4, we obtained a list fixing the possible equivalence classes of irreducible unitary representations of G = SL(2,R). Here we will show that these representations really exist by presenting concrete realizations or, as one also says, models. On most occasions it is important to have different models, depending on the context in which the representation shows up. The method we follow here is just the application of the induction procedure that we developed in the preceding section. Thus, we have to start by choosing a subgroup H of our G. Already on several occasions in this text, we specified subgroups of G = SL(2,R). We recall (for instance from 6.6.4) the notation

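$$ K := SO(2) = \Big\{ r(\vartheta) = \begin{pmatrix} \cos\vartheta & \sin\vartheta \\ -\sin\vartheta & \cos\vartheta \end{pmatrix};\ \vartheta \in \mathbb{R} \Big\}, $$

$$ A := \Big\{ t(y) = \begin{pmatrix} y^{1/2} & \\ & y^{-1/2} \end{pmatrix};\ y \in \mathbb{R},\ y > 0 \Big\}, $$

$$ N := \Big\{ n(x) = \begin{pmatrix} 1 & x \\ & 1 \end{pmatrix};\ x \in \mathbb{R} \Big\}, \qquad \bar{N} := \Big\{ \bar{n}(x) = \begin{pmatrix} 1 & \\ x & 1 \end{pmatrix};\ x \in \mathbb{R} \Big\}, $$

$$ M := \{\pm E_2\}, \qquad B := MAN = \Big\{ h = \begin{pmatrix} a & b \\ & a^{-1} \end{pmatrix};\ a, b \in \mathbb{R},\ a \neq 0 \Big\}, \qquad \bar{B} := MA\bar{N}. $$

The subgroup B of upper triangular matrices is a special example of a parabolic subgroup. It is also called the standard Borel group. One has the disjoint Iwasawa decomposition G = KAN = ANK and G = KB = BK with $B \cap K = \{\pm E_2\}$. It makes sense to start the induction procedure from all these subgroups, but the best thing to do is parabolic induction: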

1. The Principal Series

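We take H := B and $\pi_0$ the one-dimensional representation, which for fixed $s \in \mathbb{R}$ and $\varepsilon \in \{0, 1\}$ is given by

$$ \pi_0(h) := |a|^{is} (\mathrm{sgn}\, a)^{\varepsilon}. $$

We put X := H\G and take the Mackey decomposition (7.5)

$$ g = \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} d^{-1} & b \\ & d \end{pmatrix} \begin{pmatrix} 1 & \\ c/d & 1 \end{pmatrix} = h(x)s(x) \quad \text{for } d \neq 0, $$

$$ \phantom{g} = \begin{pmatrix} b & -a \\ & b^{-1} \end{pmatrix} \begin{pmatrix} & 1 \\ -1 & \end{pmatrix} = h(x)s(x) \quad \text{for } d = 0. $$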


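Hence X is in bijection to the set $S := \big\{ \begin{pmatrix} 1 & \\ x & 1 \end{pmatrix};\ x \in \mathbb{R} \big\} \cup \big\{ \begin{pmatrix} & 1 \\ -1 & \end{pmatrix} \big\}$. As in our context a point has measure zero, we can disregard the element $\begin{pmatrix} & 1 \\ -1 & \end{pmatrix}$ and work with

$$ G \longrightarrow X = H\backslash G \supset X' \simeq \mathbb{R}, \quad g \longmapsto x := c/d, \quad d \neq 0, $$

with section $s : X' \longrightarrow G$, $x \longmapsto s(x) = \begin{pmatrix} 1 & \\ x & 1 \end{pmatrix}$. For $g_0 = \begin{pmatrix} a_0 & b_0 \\ c_0 & d_0 \end{pmatrix}$ our Master Equation (7.7)

$$ s(x) g_0 = h(g_0, x)\, s(x g_0) $$

is realized by

$$ h(g_0, x) = \begin{pmatrix} (b_0 x + d_0)^{-1} & b_0 \\ 0 & b_0 x + d_0 \end{pmatrix}, \quad s(x g_0) = \begin{pmatrix} 1 & \\ x g_0 & 1 \end{pmatrix}, \quad x g_0 = \frac{a_0 x + c_0}{b_0 x + d_0}. $$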

We see that dµ(x) = dx is a quasi-invariant measure on X′ ≃ R with the quasi-measure relation (7.6)

$$ \frac{d\mu(x g_0)}{d\mu(x)} = \frac{1}{(b_0 x + d_0)^2} =: \rho_{g_0}(x) = \delta(h(g_0, x)). $$

Hence, we have $\delta(h) = a^2$ and the induced representation $\pi = \mathrm{ind}_H^G \pi_0$ is given in the First Realization by right translation on the space $\mathcal{H}^\pi$, which is the completion of the space of smooth functions φ : G −→ C with

$$ \phi(hg) = (\delta(h))^{1/2} \pi_0(h) \phi(g) \quad \text{for all } h \in H,\ g \in G \tag{7.11} $$

and

$$ \|\phi\|^2 := \int_{\mathbb{R}} |\phi(s(x))|^2\, dx < \infty. $$

As we saw in our discussion of the representations of the Heisenberg group, it is more illuminating to look at the Second Realization. This is given by the representation space $\mathcal{H}_\pi = L^2(X, d\mu(x)) = L^2(\mathbb{R}, dx)$. We see from Theorem 7.4 (or by a direct calculation) that the representation acts by

$$ \pi(g_0) f(x) = |b_0 x + d_0|^{-is-1} (\mathrm{sgn}(b_0 x + d_0))^{\varepsilon} f\Big( \frac{a_0 x + c_0}{b_0 x + d_0} \Big) \quad \text{for } f \in L^2(\mathbb{R}, dx). $$
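The Master Equation and the homomorphism property π(g₀g₁) = π(g₀)π(g₁) of this prescription can be tested numerically. The sketch below is an illustration under stated assumptions: matrices are 2×2 tuples of rows, and the names `act`, `multiplier`, `pi`, `section`, `h` as well as the sample matrices and the test function are chosen here and do not come from the text.

```python
import math

def matmul(A, B):
    """Product of two 2x2 matrices given as tuples of rows."""
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def act(x, g):
    """Right action x -> x g = (a x + c) / (b x + d) on X' = R."""
    (a, b), (c, d) = g
    return (a * x + c) / (b * x + d)

def multiplier(g, x, s, eps):
    """|b x + d|^(-is-1) * sgn(b x + d)^eps."""
    (a, b), (c, d) = g
    w = b * x + d
    return abs(w) ** (-1 - 1j * s) * math.copysign(1.0, w) ** eps

def pi(g, f, s, eps):
    """Second Realization: (pi(g) f)(x) = multiplier(g, x) * f(x g)."""
    return lambda x: multiplier(g, x, s, eps) * f(act(x, g))

def section(x):
    return ((1.0, 0.0), (x, 1.0))

def h(g, x):
    """H-component h(g0, x) of s(x) g0."""
    (a, b), (c, d) = g
    w = b * x + d
    return ((1.0 / w, b), (0.0, w))

g0 = ((2.0, 1.0), (3.0, 2.0))     # determinant 1
g1 = ((1.0, -1.0), (2.0, -1.0))   # determinant 1
x0, s, eps = 0.3, 1.5, 1
f = lambda x: math.exp(-x * x)    # sample test function

# Master Equation s(x) g0 = h(g0, x) s(x g0)
lhs_m = matmul(section(x0), g0)
rhs_m = matmul(h(g0, x0), section(act(x0, g0)))
master_err = max(abs(lhs_m[i][j] - rhs_m[i][j]) for i in range(2) for j in range(2))

# Homomorphism property at the sample point x0
hom_err = abs(pi(matmul(g0, g1), f, s, eps)(x0) - pi(g0, pi(g1, f, s, eps), s, eps)(x0))
print(master_err, hom_err)
```

The homomorphism property rests on the identity $b_{01}x + d_{01} = (b_0 x + d_0)(b_1 (x g_0) + d_1)$ for the entries of g₀g₁, which is the cocycle relation behind the multiplier.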

Exercise 7.7: Determine the derived representation and verify that this induced representation is a model for the up to now only hypothetical representation $\pi_{is,\pm}$ from Theorem 6.2 in 6.4.

Remark 7.4: This exercise provides as a byproduct a proof that this induced representation is irreducible. There are more direct ways to see this (see for instance [Kn] p.36).

Exercise 7.8: We propose to verify in a special situation that the construction essentially does not depend on the choice of the quasi-invariant measure: As in [BR] p.483, repeat the former discussion for the quasi-invariant measure $d\tilde{\mu}(x) = \varphi(x)\,dx$ with $\varphi(x) = (1 + x^2)^{-2}$ to get a representation space $\tilde{\mathcal{H}}_\pi$ linked to our $\mathcal{H}_\pi$ by the intertwining operator $f \longmapsto \tilde{f}$, $\tilde{f}(x) = f(x)/(1 + x^2)^{1/2}$.


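To get more familiarity with these constructions and because of the general importance of the result, we also discuss the realization of our induced representation by left translation, such that, for functions φ living on G, we have a representation given by $\pi'(g_0)\phi(g) = \phi(g_0^{-1} g)$.

We keep H and $\pi_0$ as above and now put X′ = G/H. For $g = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ with a ≠ 0, we have a Mackey decomposition

$$ g = s(y)h(y), \quad s(y) = \begin{pmatrix} 1 & \\ y & 1 \end{pmatrix}, \quad y = c/a, \quad h(y) = \begin{pmatrix} a & b \\ & a^{-1} \end{pmatrix}, $$

i.e. a map p : G −→ G/H = X′, g −→ y = c/a, defined for a ≠ 0 and identifying X′ with R up to a point of measure zero, with section $y \longmapsto s(y) = \begin{pmatrix} 1 & \\ y & 1 \end{pmatrix}$. The Master Equation is in this case

$$ g_0^{-1} s(y) = s(g_0^{-1} y)\, h(y, g_0) $$

with

$$ s(g_0^{-1} y) = \begin{pmatrix} 1 & \\ g_0^{-1} y & 1 \end{pmatrix}, \quad g_0^{-1} y = \frac{a_0 y - c_0}{-b_0 y + d_0}, $$

$$ h(y, g_0) = \begin{pmatrix} d_0 - b_0 y & -b_0 \\ & (d_0 - b_0 y)^{-1} \end{pmatrix} =: h^*. $$

We take dµ(y) = dy as quasi-invariant measure on X′ ≃ R and have the quasi-measure relation

$$ \frac{d\mu(g_0^{-1} y)}{d\mu(y)} = \frac{1}{(-b_0 y + d_0)^2} = \rho_{g_0}(y). $$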

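To be on the safe side and to provide some more exercise in the handling of these notions, we verify that left translation preserves the norm: We take a function φ with

$$ \phi(gh) = (\delta(h))^{-1/2} \pi_0(h^{-1}) \phi(g) \quad \text{for all } h \in H,\ g \in G, $$

put $\phi_{g_0}(g) := \phi(g_0^{-1} g)$, and calculate, using the Master Equation $g_0^{-1} s(y) = s(g_0^{-1} y) h^*$ and the unitarity of $\pi_0$,

$$ \|\phi_{g_0}\|^2 = \int_{\mathbb{R}} |\phi_{g_0}(s(y))|^2\, d\mu(y) = \int_{\mathbb{R}} |\phi(s(g_0^{-1} y) h^*)|^2\, d\mu(y) $$

$$ \phantom{\|\phi_{g_0}\|^2} = \int_{\mathbb{R}} |\phi(s(g_0^{-1} y))|^2\, \delta(h^*)^{-1}\, d\mu(y) = \int_{\mathbb{R}} |\phi(s(\tilde{y}))|^2\, d\mu(\tilde{y}), $$

since, with $\tilde{y} := g_0^{-1} y$, one has by the quasi-measure relation

$$ \frac{d\mu(\tilde{y})}{d\mu(y)} = \frac{d\mu(g_0^{-1} y)}{d\mu(y)} = \frac{1}{(-b_0 y + d_0)^2} = \delta(h^*)^{-1}. $$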

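So we have as the First Realization the induced representation defined by functions φ on G with the functional equation

$$ \phi(gh) = (\delta(h))^{-1/2} \pi_0(h^{-1}) \phi(g) \quad \text{for all } h \in H,\ g \in G, $$

and the norm condition

$$ \|\phi\|^2 = \int_{\mathbb{R}} |\phi(s(y))|^2\, d\mu(y) < \infty. $$

From here, by the map $\phi \longmapsto f$, $f(y) := \phi\big(\begin{pmatrix} 1 & \\ y & 1 \end{pmatrix}\big)$, we come to the Second Realization given on the space $\mathcal{H}_\pi = L^2(\mathbb{R}, dy)$ by the prescription

$$ \pi(g_0) f(y) = (\delta(h(y, g_0)))^{-1/2}\, \pi_0(h(y, g_0)^{-1})\, f(g_0^{-1} y) = |d_0 - b_0 y|^{-is-1} (\mathrm{sgn}(d_0 - b_0 y))^{\varepsilon} f\Big( \frac{a_0 y - c_0}{-b_0 y + d_0} \Big). $$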

This is the same formula as in [KT] p.17 and [Kn] p.36 and p.167, where this is an example for a noncompact picture. We remark that Knapp comes to this formula from his induced picture, which is prima facie different from our First Realization as he does not use the definition of the norm we used above but the same definition of the norm as in (7.3)

$$ \|\phi\|^2 = \int_K |\phi(k)|^2\, dk. $$

In [Kn] p.168, one finds a proof that this norm and the one we used above are equal. We propose to search here for a direct proof (Exercise 7.9).

Moreover, in [Kn] p.37, there is a proof of the irreducibility of these induced representations $\pi_{is,\pm}$ except for $\pi_{0,-}$, which is based on the euclidean Fourier transform.

2. The Discrete Series

We will discuss several models, which realize the discrete series representations.

The Upper Half Plane Model

We take H = K = SO(2) and $\pi_0(h) = \chi_k(r(\vartheta)) = e^{ik\vartheta}$, $k \in \mathbb{Z}$, for h = r(ϑ) ∈ SO(2). It is customary to construct the representations $\pi_k^{\pm}$ from our list in Theorem 6.2 in 6.4 by functions living on the space of left cosets G/K = SL(2,R)/SO(2). This space is again an eminent example of a homogeneous space. It is called the Poincaré upper half plane and we denote it by $\mathbb{H} := \{\tau = x + iy \in \mathbb{C};\ y > 0\}$. Since this is important for our constructions here but also as a standard example in complex function theory, we assemble some related standard notions and facts.

For $g = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2,\mathbb{R})$ and $\tau \in \mathbb{H}$, we put

$$ g(\tau) := \frac{a\tau + b}{c\tau + d} $$

and call this a linear fractional transformation.

Exercise 7.10: Prove that this is well defined and provides a transitive left action of G = SL(2,R) (even of GL(2,R)) on $\mathbb{H}$. Verify that H = SO(2) is the stabilizing group of τ = i. Show that the automorphic factor

$$ j(g, \tau) := (c\tau + d)^{-1} $$

fulfills the cocycle relation (7.8) (in the version for the action from the left)

$$ j(gg', \tau) = j(g, g'(\tau))\, j(g', \tau). $$


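One has a map

$$ p : G = SL(2,\mathbb{R}) \longrightarrow SL(2,\mathbb{R})/SO(2) \simeq \mathbb{H}, \quad g \longmapsto g(i) =: \tau = x + iy $$

for the Iwasawa decomposition g = n(x)t(y)r(ϑ), which here can be interpreted as a Mackey decomposition g = s(τ)h, s(τ) = n(x)t(y), h = r(ϑ). Moreover, one easily verifies the following formulae

$$ y = \frac{1}{c^2 + d^2}, \quad x = \frac{ac + bd}{c^2 + d^2}, \quad e^{i\vartheta} = \frac{d - ic}{\sqrt{c^2 + d^2}} \quad \text{for } g = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \tag{7.12} $$

and, for g = r(ϑ′)t(y′)n(x′),

$$ \cos\vartheta' = \frac{a}{\sqrt{a^2 + c^2}}, \quad \sin\vartheta' = \frac{-c}{\sqrt{a^2 + c^2}}, \quad y' = a^2 + c^2, \quad x' = \frac{ab + cd}{a^2 + c^2}. \tag{7.13} $$

(In (7.13) the first column of $r(\vartheta')t(y')n(x')$ is $y'^{1/2}\,{}^t(\cos\vartheta', -\sin\vartheta')$, which gives $y' = a^2 + c^2$ for the normalization $t(y) = \mathrm{diag}(y^{1/2}, y^{-1/2})$ used here.)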
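The formulae (7.12) and the cocycle relation from Exercise 7.10 lend themselves to a quick numerical check. The sketch below is only an illustration: the function names and the sample matrices are assumptions, matrices are 2×2 tuples of rows, and ϑ is recovered with `atan2` from $e^{i\vartheta} = (d - ic)/\sqrt{c^2 + d^2}$.

```python
import math

def matmul(A, B):
    """Product of two 2x2 matrices given as tuples of rows."""
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def mobius(g, tau):
    """Linear fractional transformation g(tau) = (a tau + b) / (c tau + d)."""
    (a, b), (c, d) = g
    return (a * tau + b) / (c * tau + d)

def iwasawa(g):
    """Coordinates (x, y, theta) of g = n(x) t(y) r(theta) according to (7.12)."""
    (a, b), (c, d) = g
    r2 = c * c + d * d
    return (a * c + b * d) / r2, 1.0 / r2, math.atan2(-c, d)

def from_iwasawa(x, y, theta):
    """Rebuild n(x) t(y) r(theta)."""
    n = ((1.0, x), (0.0, 1.0))
    t = ((math.sqrt(y), 0.0), (0.0, 1.0 / math.sqrt(y)))
    r = ((math.cos(theta), math.sin(theta)), (-math.sin(theta), math.cos(theta)))
    return matmul(matmul(n, t), r)

def j(g, tau):
    """Automorphic factor j(g, tau) = (c tau + d)^(-1)."""
    (a, b), (c, d) = g
    return 1.0 / (c * tau + d)

g = ((2.0, 1.0), (3.0, 2.0))      # determinant 1
gp = ((1.0, -1.0), (2.0, -1.0))   # determinant 1
x, y, theta = iwasawa(g)
rebuilt = from_iwasawa(x, y, theta)

rebuild_err = max(abs(rebuilt[i][k] - g[i][k]) for i in range(2) for k in range(2))
base_point_err = abs(mobius(g, 1j) - complex(x, y))   # g(i) = x + i y
tau = 0.4 + 1.3j
cocycle_err = abs(j(matmul(g, gp), tau) - j(g, mobius(gp, tau)) * j(gp, tau))
print(rebuild_err, base_point_err, cocycle_err)
```

All three deviations are at rounding level for these sample values.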

Exercise 7.11: Verify

$$ \frac{d(g(\tau))}{d\tau} = \frac{1}{(c\tau + d)^2} $$

and show that $d\mu(\tau) = y^{-2}\,dx\,dy$ is a G-invariant measure on $\mathbb{H}$.

This is also an occasion to practise again the calculation of infinitesimal operations initiated in 6.2 and 6.3: For a smooth complex function φ on G and $U \in \mathfrak{g}$, we get a right invariant differential operator $R_U$ defined by

$$ R_U \phi(g) := \frac{d}{dt} \phi((\exp tU)^{-1} g) \Big|_{t=0} $$

and a left invariant operator $L_U$ by

$$ L_U \phi(g) := \frac{d}{dt} \phi(g\, (\exp tU)) \Big|_{t=0}. $$

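From 6.4 we recall that we have

$$ \mathfrak{g} = \mathfrak{sl}(2,\mathbb{R}) = \{ X \in M_2(\mathbb{R});\ \mathrm{Tr}\, X = 0 \} = \langle F, G, H \rangle $$

with

$$ F = \begin{pmatrix} & 1 \\ & \end{pmatrix}, \quad G = \begin{pmatrix} & \\ 1 & \end{pmatrix}, \quad H = \begin{pmatrix} 1 & \\ & -1 \end{pmatrix} $$

and

$$ \mathfrak{g}_c := \mathfrak{g} \otimes_{\mathbb{R}} \mathbb{C} = \langle F, G, H \rangle_{\mathbb{C}} = \langle X_+, X_-, Z \rangle_{\mathbb{C}} $$

where

$$ X_{\pm} = (1/2)(H \pm i(F + G)) = (1/2)\begin{pmatrix} 1 & \pm i \\ \pm i & -1 \end{pmatrix}, \quad Z = -i(F - G) = \begin{pmatrix} & -i \\ i & \end{pmatrix} $$

with

$$ [Z, X_{\pm}] = \pm 2X_{\pm}, \quad [X_+, X_-] = Z. $$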
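The commutation relations can be confirmed directly with 2×2 complex matrices. The following lines are a small illustrative sketch; the helper names (`matmul`, `lin`, `bracket`, `dist`) are chosen here and are not notation from the text.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as tuples of rows."""
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def lin(coeffs, mats):
    """Linear combination sum_k coeffs[k] * mats[k] of 2x2 matrices."""
    return tuple(tuple(sum(c * M[i][j] for c, M in zip(coeffs, mats)) for j in range(2))
                 for i in range(2))

def bracket(A, B):
    """Commutator [A, B] = AB - BA."""
    return lin((1, -1), (matmul(A, B), matmul(B, A)))

def dist(A, B):
    """Largest entrywise deviation between two 2x2 matrices."""
    return max(abs(A[i][j] - B[i][j]) for i in range(2) for j in range(2))

F = ((0, 1), (0, 0))
G = ((0, 0), (1, 0))
H = ((1, 0), (0, -1))

X_plus = lin((0.5, 0.5j, 0.5j), (H, F, G))      # (1/2)(H + i(F + G))
X_minus = lin((0.5, -0.5j, -0.5j), (H, F, G))   # (1/2)(H - i(F + G))
Z = lin((-1j, 1j), (F, G))                      # -i(F - G)

err_plus = dist(bracket(Z, X_plus), lin((2,), (X_plus,)))
err_minus = dist(bracket(Z, X_minus), lin((-2,), (X_minus,)))
err_z = dist(bracket(X_plus, X_minus), Z)
print(err_plus, err_minus, err_z)
```

In this basis Z plays the role of the weight operator, and X₊ and X₋ shift its eigenvalues by +2 and −2 respectively.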

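Using the coordinates x, y, ϑ of the Iwasawa decomposition as parameters for g and the commutation rule t(y)n(x) = n(yx)t(y), one obtains

$$ R_H \phi(g) = \frac{d}{dt} \phi((\exp tH)^{-1} g) \Big|_{t=0} = \frac{d}{dt} \phi\Big( \begin{pmatrix} e^{-t} & \\ & e^{t} \end{pmatrix} n(x)t(y)r(\vartheta) \Big) \Big|_{t=0} $$

$$ \phantom{R_H \phi(g)} = \frac{d}{dt} \phi(n(e^{-2t}x)\, t(e^{-2t}y)\, r(\vartheta)) \Big|_{t=0} = -(2x\partial_x + 2y\partial_y)\phi(g). $$

In the same way one has

$$ R_F \phi(g) = \frac{d}{dt} \phi((\exp tF)^{-1} g) \Big|_{t=0} = \frac{d}{dt} \phi\Big( \begin{pmatrix} 1 & -t \\ & 1 \end{pmatrix} n(x)t(y)r(\vartheta) \Big) \Big|_{t=0} = \frac{d}{dt} \phi(n(x - t)t(y)r(\vartheta)) \Big|_{t=0} = -\partial_x \phi(g), $$

and

$$ R_G \phi(g) = \frac{d}{dt} \phi((\exp tG)^{-1} g) \Big|_{t=0} = \frac{d}{dt} \phi\Big( \begin{pmatrix} 1 & \\ -t & 1 \end{pmatrix} n(x)t(y)r(\vartheta) \Big) \Big|_{t=0} $$

$$ \phantom{R_G \phi(g)} = \frac{d}{dt} \phi(n(x')t(y')r(\vartheta')) \Big|_{t=0} = ((x^2 - y^2)\partial_x + 2xy\partial_y + y\partial_\vartheta)\phi(g), $$

where in the last formula we used (7.12) to get

$$ x' = \frac{x - t(x^2 + y^2)}{t^2 y^2 + (1 - tx)^2}, \quad y' = \frac{y}{t^2 y^2 + (1 - tx)^2}, \quad e^{i\vartheta'} = e^{i\vartheta}\, \frac{1 - t(x - iy)}{\sqrt{t^2 y^2 + (1 - tx)^2}}. $$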
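The closed formulas for x′, y′ and $e^{i\vartheta'}$ can be compared with a direct Iwasawa decomposition of $(\exp tG)^{-1} g$. The sketch below is illustrative only; function names and the sample values are assumptions, and a crude difference quotient is used to recover the coefficients of $R_G$.

```python
import cmath
import math

def matmul(A, B):
    """Product of two 2x2 matrices given as tuples of rows."""
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def from_iwasawa(x, y, theta):
    """g = n(x) t(y) r(theta)."""
    n = ((1.0, x), (0.0, 1.0))
    t = ((math.sqrt(y), 0.0), (0.0, 1.0 / math.sqrt(y)))
    r = ((math.cos(theta), math.sin(theta)), (-math.sin(theta), math.cos(theta)))
    return matmul(matmul(n, t), r)

def iwasawa(g):
    """(x, y, e^{i theta}) from (7.12)."""
    (a, b), (c, d) = g
    r2 = c * c + d * d
    return (a * c + b * d) / r2, 1.0 / r2, complex(d, -c) / math.sqrt(r2)

def predicted(x, y, theta, t):
    """Closed formulas for x', y', e^{i theta'} stated in the text."""
    den = t * t * y * y + (1 - t * x) ** 2
    return ((x - t * (x * x + y * y)) / den,
            y / den,
            cmath.exp(1j * theta) * (1 - t * complex(x, -y)) / math.sqrt(den))

def shifted(x, y, theta, t):
    """Iwasawa coordinates of (exp tG)^{-1} g = (1 0; -t 1) g."""
    return iwasawa(matmul(((1.0, 0.0), (-t, 1.0)), from_iwasawa(x, y, theta)))

x, y, theta, t = 0.4, 1.7, 0.9, 0.3   # arbitrary sample values
formula_err = max(abs(u - v) for u, v in zip(shifted(x, y, theta, t),
                                             predicted(x, y, theta, t)))

# Difference quotients at t = 0 approximate the coefficients of R_G
eps = 1e-6
xs, ys, es = shifted(x, y, theta, eps)
dx, dy = (xs - x) / eps, (ys - y) / eps
dtheta = (cmath.phase(es / cmath.exp(1j * theta))) / eps
print(formula_err, dx - (x * x - y * y), dy - 2 * x * y, dtheta - y)
```

The difference quotients reproduce $x^2 - y^2$, $2xy$ and $y$ up to the step size, matching the operator $R_G$ above.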

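Hence we have the differential operators

$$ R_F = -\partial_x, \quad R_G = (x^2 - y^2)\partial_x + 2xy\partial_y + y\partial_\vartheta, \quad R_H = -2x\partial_x - 2y\partial_y, \tag{7.14} $$

which combine to

$$ R_Z = i((1 + x^2 - y^2)\partial_x + 2xy\partial_y + y\partial_\vartheta), \quad R_{X_\pm} = \pm(i/2)(((x \pm i)^2 - y^2)\partial_x + 2(x \pm i)y\partial_y + y\partial_\vartheta). \tag{7.15} $$

A similar calculation (Exercise 7.12) leads to

$$ L_Z = -i\partial_\vartheta, \quad L_{X_\pm} = \pm(i/2)e^{\pm 2i\vartheta}(2y(\partial_x \mp i\partial_y) - \partial_\vartheta). \tag{7.16} $$

In 6.4 we already met the (multiple of the Casimir) element

$$ \Omega := X_+X_- + X_-X_+ + (1/2)Z^2, $$

which, by the way, is also (Exercise 7.13) $\Omega = FG + GF + (1/2)H^2$. By the natural extension of the definition of the differential operators from $\mathfrak{g}$ to $U(\mathfrak{g}_c)$ (via $R_{UU'} := R_U R_{U'}$) and a small calculation we get the SL(2,R)-Laplacian

$$ \Delta := R_\Omega = L_\Omega = 2y^2(\partial_x^2 + \partial_y^2) - 2y\partial_x\partial_\vartheta. \tag{7.17} $$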


After these general preparations, we finally come to the discrete series representations. There are a lot of different models in the literature. We present the most customary one and then discuss how it fits into the scheme of the induction procedure.

For instance in [La1] p.181, we find the representation $\pi_m$ with the representation space $\mathcal{H}_m := L^2_{hol}(\mathbb{H}, \mu_m)$, m ≥ 2. This space consists of holomorphic (i.e. complex differentiable) functions f on the upper half plane $\mathbb{H}$ which have a finite norm $\|f\|_m < \infty$, i.e. with

$$ \|f\|_m^2 := \int_{\mathbb{H}} |f(x, y)|^2\, y^m\, \frac{dx\,dy}{y^2} < \infty, $$

and one has a scalar product defined by

$$ \langle f_1, f_2 \rangle := \int_{\mathbb{H}} f_1(x, y)\, \overline{f_2(x, y)}\, y^m\, \frac{dx\,dy}{y^2}. $$

As will be apparent soon (even for those readers who are not so familiar with complex function theory), the main point is here reduced to the fact that holomorphic functions f are essentially smooth complex valued functions characterized by the Cauchy Riemann differential equation

$$ (\partial_x + i\partial_y) f = 0. $$

We write f(τ) := f(x, y) and put $g^{-1} =: \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ (this is a trick to get nicer formulae but, if not remembered carefully, sometimes leads to errors). Then the representation $\pi_m$ is in the half plane model given by

$$ \pi_m(g) f(\tau) := f(g^{-1}(\tau))\, j_m(g^{-1}, \tau) = f(g^{-1}(\tau))(c\tau + d)^{-m}. \tag{7.18} $$

Here we have used the important notation of the automorphic factor

$$ j_m(g, \tau) := j(g, \tau)^m $$

extending the one introduced in Exercise 7.10.
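Since the convention $g^{-1} =: \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is easy to get wrong, a numerical test of the algebraic relation $\pi_m(gg') = \pi_m(g)\pi_m(g')$ for (7.18) may be helpful. The sketch below is illustrative: the helper names, the sample matrices and the test function $f(\tau) = (\tau + i)^{-m}$ are assumptions made for this check.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as tuples of rows."""
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def inverse(g):
    """Inverse of a matrix of determinant 1."""
    (a, b), (c, d) = g
    return ((d, -b), (-c, a))

def mobius(g, tau):
    (a, b), (c, d) = g
    return (a * tau + b) / (c * tau + d)

def pi_m(g, f, m):
    """(7.18): pi_m(g) f(tau) = f(g^{-1}(tau)) (c tau + d)^(-m), with (a b; c d) = g^{-1}."""
    gi = inverse(g)
    (a, b), (c, d) = gi
    return lambda tau: f(mobius(gi, tau)) * (c * tau + d) ** (-m)

m = 4
f = lambda tau: (tau + 1j) ** (-m)   # holomorphic on the upper half plane

g = ((2.0, 1.0), (3.0, 2.0))         # determinant 1
gp = ((1.0, -1.0), (2.0, -1.0))      # determinant 1
tau = 0.4 + 1.3j

hom_err = abs(pi_m(matmul(g, gp), f, m)(tau) - pi_m(g, pi_m(gp, f, m), m)(tau))
identity_err = abs(pi_m(((1.0, 0.0), (0.0, 1.0)), f, m)(tau) - f(tau))
print(hom_err, identity_err)
```

The agreement reflects the cocycle relation for j applied to $(gg')^{-1} = g'^{-1} g^{-1}$.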

Remark 7.5: In [Kn] p.35, one finds a representation given on the same space by the prescription

$$ \pi_m\Big( \begin{pmatrix} a & b \\ c & d \end{pmatrix} \Big) f(\tau) := (-b\tau + d)^{-m} f\Big( \frac{a\tau - c}{-b\tau + d} \Big). $$

And in [GGP] p.53, the authors define a representation $T_m$ given by

$$ T_m\Big( \begin{pmatrix} a & b \\ c & d \end{pmatrix} \Big) f(\tau) := (b\tau + d)^{-m} f\Big( \frac{a\tau + c}{b\tau + d} \Big). $$

The reader is invited to compare this to our definition. We stick to it because we want to take over directly some proofs from Lang's presentation in [La1] p.181ff. Namely, one has to verify several things:

– $\mathcal{H}_m$ is complete and non-trivial.
– π fulfills the algebraic relation π(gg′) = π(g)π(g′).
– π is continuous and unitary.


The algebraic relation is an immediate consequence of the facts stated as Exercise 7.10. We verify the unitarity using Exercise 7.11:

$$ \|\pi(g)f\|_2^2 = \int_0^\infty \int_{-\infty}^\infty |f(g^{-1}\tau)|^2\, |c\tau + d|^{-2m}\, y^{m-2}\, dx\,dy = \int_0^\infty \int_{-\infty}^\infty |f(\tau)|^2\, y^{m-2}\, dx\,dy = \|f\|_2^2. $$

For the other statements we refer to Lang. The idea is to transfer the situation from the upper half plane to the unit disc where the function theory is more lucid. Since this is an important technique, we report on some facts:

The Unit Disc Model

By the Cayley transform c the Poincaré half plane $\mathbb{H}$ is (biholomorphically) equivalent to the unit disc

$$ D := \{ \zeta \in \mathbb{C};\ |\zeta| < 1 \}. $$

In all cases, this bijective map sends the point i to 0 but otherwise is not uniquely fixed in the literature. We take the map

$$ c : \mathbb{H} \longrightarrow D, \quad \tau \longmapsto \zeta := \frac{\tau - i}{\tau + i} = C(\tau) \quad \text{for } C = \begin{pmatrix} 1 & -i \\ 1 & i \end{pmatrix}, $$

which has as its inverse the map given by

$$ \zeta \longmapsto \tau = -i\, \frac{\zeta + 1}{\zeta - 1} = C^{-1}(\zeta) \quad \text{with } C^{-1} = (1/(2i)) \begin{pmatrix} i & i \\ -1 & 1 \end{pmatrix}. $$

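It is a standard fact that G = SL(2,R) is isomorphic to

$$ G' := SU(1, 1) = \Big\{ g' = \begin{pmatrix} \alpha & \beta \\ \bar{\beta} & \bar{\alpha} \end{pmatrix};\ |\alpha|^2 - |\beta|^2 = 1 \Big\}. $$

We take

$$ c : SL(2,\mathbb{R}) \longrightarrow SU(1, 1), \quad g = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \longmapsto g' := CgC^{-1} = \begin{pmatrix} \alpha & \beta \\ \bar{\beta} & \bar{\alpha} \end{pmatrix} $$

with

$$ \alpha = \alpha(g) = (1/2)(a + d + i(b - c)), \quad \beta = \beta(g) = (1/2)(a - d - i(b + c)). \tag{7.19} $$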
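Formula (7.19) and the shape of $CgC^{-1}$ can be confirmed by direct matrix multiplication. The following sketch is only an illustration; the names `to_su11`, `alpha_beta`, `cayley` and the sample matrix are assumptions introduced here.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as tuples of rows."""
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

C = ((1, -1j), (1, 1j))
C_INV = tuple(tuple(z / 2j for z in row) for row in ((1j, 1j), (-1, 1)))

def cayley(tau):
    """Cayley transform tau -> (tau - i) / (tau + i)."""
    return (tau - 1j) / (tau + 1j)

def to_su11(g):
    """Conjugation g -> C g C^{-1}."""
    return matmul(matmul(C, g), C_INV)

def alpha_beta(g):
    """Entries alpha(g), beta(g) according to (7.19)."""
    (a, b), (c, d) = g
    return 0.5 * (a + d + 1j * (b - c)), 0.5 * (a - d - 1j * (b + c))

g = ((2.0, 1.0), (3.0, 2.0))   # determinant 1
gp = to_su11(g)
al, be = alpha_beta(g)

shape_err = max(abs(gp[0][0] - al), abs(gp[0][1] - be),
                abs(gp[1][0] - be.conjugate()), abs(gp[1][1] - al.conjugate()))
det_err = abs(abs(al) ** 2 - abs(be) ** 2 - 1)
inverse_err = max(abs(z - w) for row, ref in zip(matmul(C, C_INV), ((1, 0), (0, 1)))
                  for z, w in zip(row, ref))
print(shape_err, det_err, inverse_err, abs(cayley(1j)), abs(cayley(0.4 + 1.3j)) < 1)
```

For the sample matrix one gets α = 2 − i and β = −2i, so that |α|² − |β|² = 5 − 4 = 1.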

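Then we have

$$ r(\vartheta) := \begin{pmatrix} \cos\vartheta & \sin\vartheta \\ -\sin\vartheta & \cos\vartheta \end{pmatrix} \longmapsto \begin{pmatrix} e^{i\vartheta} & \\ & e^{-i\vartheta} \end{pmatrix} =: s(\vartheta), $$

i.e. $c(SO(2)) = \{ s(\vartheta);\ \vartheta \in \mathbb{R} \} = K'$, which we identify with U(1). By the linear fractional transformation $G' \times D \longrightarrow D$, $(g', \zeta) \longmapsto g'(\zeta)$, G′ acts (transitively) on D. The stabilizing group of ζ = 0 is U(1) = K′. Hence in analogy to the map

$$ p : G \longrightarrow G/SO(2), \quad g \longmapsto g(i) = \tau = x + iy $$

with the Iwasawa decomposition g = n(x)t(y)r(ϑ), we have the map

$$ p' : G' \longrightarrow G'/K' = D, \quad g' \longmapsto g'(0) = \zeta = u + iv $$

and the commutative diagram

$$ \begin{array}{ccc} G = SL(2,\mathbb{R}) & \xrightarrow{\ c\ } & G' = SU(1,1) \\ \downarrow {\scriptstyle p} & & \downarrow {\scriptstyle p'} \\ \mathbb{H} = SL(2,\mathbb{R})/SO(2) & \xrightarrow{\ c\ } & D = SU(1,1)/U(1) \end{array} $$

with

$$ c(n(x)t(y)) = (1/2)\, y^{-1/2} \begin{pmatrix} 1 + i\bar{\tau} & -1 - i\tau \\ -1 + i\bar{\tau} & 1 - i\tau \end{pmatrix} $$

and $c(\tau) = \zeta = \frac{\tau - i}{\tau + i}$.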

Now, it is a rather easily acceptable fact that the space of holomorphic functions on D is spanned by the monomials $\zeta^n$, $n \in \mathbb{N}_0$. And some calculation shows that

$$ d\nu(\zeta) := \frac{du\,dv}{(1 - |\zeta|^2)^2}, \quad d\nu_m(\zeta) := 4^{1-m}(1 - |\zeta|^2)^m\, \frac{du\,dv}{(1 - |\zeta|^2)^2}, $$

is a G′-invariant measure resp. a quasi-invariant measure on D, and one has an isomorphism between the corresponding spaces of holomorphic functions on $\mathbb{H}$ and D

$$ C_m : \mathcal{H}_m = L^2_{hol}(\mathbb{H}, d\mu_m) \longrightarrow L^2_{hol}(D, d\nu_m), \quad f \longmapsto C_m f := \tilde{f} $$

given by

$$ \tilde{f}(\zeta) = f\Big( -i\, \frac{\zeta + 1}{\zeta - 1} \Big) \Big( \frac{-2i}{\zeta - 1} \Big)^m. $$

Using a convenient extension of the group action given by (7.18), this can also be written as

$$ \tilde{f}(\zeta) = f(C^{-1}(\zeta))\, j_m(C^{-1}, \zeta). $$

The map $C_m$ intertwines the representation $\pi_m$ from the half plane model given by (7.18) with the equivalent representation $\tilde{\pi}_m$ on the space of functions $\tilde{f} \in L^2_{hol}(D, d\nu_m)$ given by the prescription

$$ \tilde{\pi}_m(g)\tilde{f}(\zeta) := \tilde{f}(c(g)^{-1}(\zeta))\, j_m(c(g)^{-1}, \zeta). $$

We call this the unit disc model. By the way, if one replaces c(g) by g′ ∈ G′, one also has a representation of G′ = SU(1, 1).

Already above, we stated the (plausible) fact that $L^2_{hol}(D, d\nu_m)$ is spanned by functions $\tilde{f}_n$ with $\tilde{f}_n(\zeta) = \zeta^n$, $n \in \mathbb{N}_0$. As is easily seen, via $C_m$, these functions correspond to functions $f_n$ living on $\mathbb{H}$ with

$$ f_n(\tau) := \Big( \frac{\tau - i}{\tau + i} \Big)^n (\tau + i)^{-m}. $$


The Group Model

Intertwined with these two equivalent models, there is a third one, which we will call the group model because its representation space consists of functions living on the group G = SL(2,R). We go back to the map c : G −→ G′. In the formulae (7.19) one has the entries of c(g) ∈ G′ as functions of g ∈ G. We introduce the parameters x, y, ϑ describing g in the Iwasawa decomposition as in (7.12) and get by a small calculation

$$ \alpha(g) = (1/2)(a + d + i(b - c)) = (1/2)\, y^{-1/2} e^{i\vartheta}\, i(\bar{\tau} - i), $$

$$ \beta(g) = (1/2)(a - d - i(b + c)) = (1/2)\, y^{-1/2} e^{-i\vartheta} (-i)(\tau - i), $$

and by another one

$$ \phi_n(g) := \beta(g)^n\, \overline{\alpha(g)}^{-(n+m)} (1/(2i))^m = y^{m/2} e^{im\vartheta} f_n(\tau). $$
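The identity for $\phi_n$ can be tested numerically. The sketch below takes $\phi_n$ in the form $\beta(g)^n\, \overline{\alpha(g)}^{-(n+m)} (2i)^{-m}$, that is, with the complex conjugate of α; this placement of the bar is an assumption on the reading of the formula, justified by $2i\beta = (b + c) + i(a - d)$ and $2i\bar{\alpha} = (b - c) + i(a + d)$. All function names and sample values are hypothetical.

```python
import cmath
import math

def alpha_beta(g):
    """Entries alpha(g), beta(g) of c(g) according to (7.19)."""
    (a, b), (c, d) = g
    return 0.5 * (a + d + 1j * (b - c)), 0.5 * (a - d - 1j * (b + c))

def iwasawa(g):
    """(x, y, theta) from (7.12)."""
    (a, b), (c, d) = g
    r2 = c * c + d * d
    return (a * c + b * d) / r2, 1.0 / r2, math.atan2(-c, d)

def f_n(tau, n, m):
    """Half plane function f_n(tau) = ((tau - i)/(tau + i))^n (tau + i)^(-m)."""
    return ((tau - 1j) / (tau + 1j)) ** n * (tau + 1j) ** (-m)

def phi_n(g, n, m):
    """Group model function beta^n * conj(alpha)^(-(n+m)) * (2i)^(-m)."""
    al, be = alpha_beta(g)
    return be ** n * al.conjugate() ** (-(n + m)) * (2j) ** (-m)

def lifted(g, n, m):
    """Right-hand side y^(m/2) e^{i m theta} f_n(tau)."""
    x, y, theta = iwasawa(g)
    return y ** (m / 2) * cmath.exp(1j * m * theta) * f_n(complex(x, y), n, m)

g = ((2.0, 1.0), (3.0, 2.0))   # determinant 1
n, m = 2, 3
lift_err = abs(phi_n(g, n, m) - lifted(g, n, m))

# Weight check: phi_n(r(t)^{-1} g) = e^{-(2n+m) i t} phi_n(g)
t = 0.7
r_inv = ((math.cos(t), -math.sin(t)), (math.sin(t), math.cos(t)))
rg = tuple(tuple(sum(r_inv[i][k] * g[k][j] for k in range(2)) for j in range(2))
           for i in range(2))
weight_err = abs(phi_n(rg, n, m) - cmath.exp(-1j * (2 * n + m) * t) * phi_n(g, n, m))
print(lift_err, weight_err)
```

The second check anticipates the weight computation: left translation by $r(\vartheta)^{-1}$ multiplies $\phi_n$ by $e^{-(2n+m)i\vartheta}$.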

Inspired by this formula, we define a lift $\varphi_m$, which associates functions φ living on G to functions f on $\mathbb{H}$ via

$$ f(\tau) \longmapsto \phi_f(g) = (\varphi_m f)(g) := f(g(i))\, j_m(g, i) = f(\tau)\, y^{m/2} e^{im\vartheta}. $$

Here we use the fact that one has $j_m(g, i) = (ci + d)^{-m} = y^{m/2} e^{im\vartheta}$ if g is fixed by our Iwasawa parametrization (7.12). With the help of the cocycle condition for the automorphic factor (or multiplier) $j_m$, we deduce for these lifted functions

$$ \phi(g_0^{-1} g) = f(g_0^{-1} g(i))\, j_m(g_0^{-1} g, i) = f(g_0^{-1}\tau)\, j_m(g_0^{-1}, g(i))\, j_m(g, i) = (\pi_m(g_0) f)(g(i))\, j_m(g, i). $$

This shows that the lift $\varphi_m$ intertwines (algebraically) the representation $\pi_m$ on the space $L^2_{hol}(\mathbb{H}, d\mu_m)$ with the left regular representation λ on the space of functions φ living on G and arising from the lift $\varphi_m$. In [La1] p.180 one finds a proof that the functions $\phi_n$ are square integrable with respect to the (biinvariant) measure dg on G given (again in our Iwasawa parameters) by

$$ dg = y^{-2}\, dx\, dy\, d\vartheta. $$

We call this representation, given by left translation on the space spanned by the functions $\phi_n$ living on the group, the group model.

Already earlier, while introducing the other two models, we could have identified the class of our representation with the one of the representation $\pi_m^-$ of highest weight −m from the list of (possible) irreducible unitary representations obtained by infinitesimal considerations in 6.4. We will do this now. The reader is invited to watch very closely because Lang in [La1] p.184 (erroneously if our reasoning is correct) identifies it as a lowest weight representation.

There are several ways to proceed: For the isomorphism c : G = SL(2,R) −→ SU(1, 1) one has $c(r(\vartheta)) = s(\vartheta)$, $c(r(\vartheta)^{-1} g) = s(-\vartheta)c(g)$. Hence we deduce

$$ \alpha(r(\vartheta)^{-1} g) = e^{-i\vartheta}\alpha(g), \quad \beta(r(\vartheta)^{-1} g) = e^{-i\vartheta}\beta(g), $$

and

$$ \phi_n(r(\vartheta)^{-1} g) = e^{-(2n+m)i\vartheta}\phi_n(g), $$

i.e. our representation has the weights −m, −(m + 2), −(m + 4), . . . . We compare this with our results from 6.4 and conclude that we have a representation equivalent to the highest weight representation $\pi_m^-$. But we can do more: In (7.14) and (7.15) we determined the operators $R_F, R_G, R_H$ and $R_Z, R_{X_\pm}$. These operators are just the operators producing the derived representation of the left regular representation λ. Thus, by a small calculation (Exercise 7.14), one can verify the formulae

$$ R_Z\phi_n = -(2n + m)\phi_n, \quad R_{X_+}\phi_n = -n\phi_{n-1}, \quad R_{X_-}\phi_n = (n + m)\phi_{n+1}. $$

(Since $[Z, X_+] = 2X_+$, the operator $R_{X_+}$ raises the weight −(2n + m) by 2 and therefore maps $\phi_n$ to a multiple of $\phi_{n-1}$, in agreement with Exercise 7.15.)

We can get corresponding formulae by determining the derived representations of the other models. We propose the following

Exercise 7.15: Verify the formulae

$$ R_Z^m = \pi_m(Z) = i((1 + \tau^2)\partial_\tau + m\tau), \quad R_{X_\pm}^m = \pi_m(X_\pm) = \pm(i/2)((i \pm \tau)^2\partial_\tau + m(\tau \pm i)), $$

and

$$ R_Z^m f_n = -(m + 2n) f_n, \quad R_{X_+}^m f_n = -n f_{n-1}, \quad R_{X_-}^m f_n = (n + m) f_{n+1}. $$

Similarly the representations $\pi_m^+$ of lowest weight can be realized by holomorphic functions on the lower half plane $\bar{\mathbb{H}} = \{\tau = x + iy \in \mathbb{C};\ y < 0\}$ or by antiholomorphic functions f, i.e. with $(\partial_x - i\partial_y) f = 0$.

Exercise 7.16: Determine the class of the representation $\pi_m$ taken over from Knapp in Remark 7.5 above.

The Holomorphically Induced Model

Finally, we want to show how these realizations of the discrete series representations fit into the discussion of the induction procedure. Namely, one can obtain them as subrepresentations of suitable induced representations:

We start with the subgroup H = K = SO(2) and its representation $\pi_0$ given by

$$ \pi_0(r(\vartheta)) = \chi_k(r(\vartheta)) = e^{ik\vartheta}, \quad k \in \mathbb{Z}. $$

The Iwasawa decomposition can be understood as a Mackey decomposition

$$ g = s(\tau)r(\vartheta), \quad s(\tau) := n(x)t(y), $$

and we have our standard projection $p : G \longrightarrow G/SO(2) = \mathbb{H}$, $g \longmapsto g(i) = x + iy = \tau$ with section $s : \mathbb{H} \longrightarrow G$, $\tau \longmapsto s(\tau)$. Then the induced representation $\pi_k^{ind} := \mathrm{ind}_{SO(2)}^G \chi_k$ is given by (inverse) left translation λ on the completion $\mathcal{H}^{\chi_k}$ of the space of smooth functions φ on G with functional equation

$$ \phi(g\, r(\vartheta)) = e^{-ik\vartheta}\phi(g) \quad \text{for all } g \in G,\ r(\vartheta) \in SO(2) $$

and the norm condition

$$ \|\phi\|_2^2 = \int_{\mathbb{H}} |\phi(s(\tau))|^2\, y^{-2}\, dx\,dy < \infty. $$


From the functional equation we learn that the functions φ in the representation space are all of the type

$$ \phi(g) = e^{-ik\vartheta} F(x, y) $$

with a function F depending only on x, y if g is given in our Iwasawa parametrization. Comparing this to the functions establishing the group model above, we see that we have to find a condition which further restricts these functions F. Here we remember from the infinitesimal considerations in 6.4 that functions spanning the space of a discrete series representation $\pi_k^{\pm}$ have to be eigenfunctions of the Casimir element

$$ \Omega = X_+X_- + X_-X_+ + (1/2)Z^2. $$

In (7.17) we already determined that Ω acts on smooth functions φ living on G as the differential operator

$$ \Delta = R_\Omega = L_\Omega = 4y^2(\partial_x^2 + \partial_y^2) - 4y\partial_x\partial_\vartheta. $$

With the preceding discussion in mind, we try the ansatz

$$ \phi(g) = e^{-ik\vartheta} y^{-k/2} f(x, y). $$

We get

$$ \Delta\phi = [(4y^2(\partial_x^2 + \partial_y^2) + 4yik(\partial_x + i\partial_y) + (k/2)((k/2) + 1)) f]\, y^{-k/2} e^{-ik\vartheta}, $$

i.e. we have

$$ \Delta\phi = (k/2)((k/2) + 1)\phi $$

if

$$ \Delta_k f := [4y^2(\partial_x^2 + \partial_y^2) + 4yik(\partial_x + i\partial_y)] f = 0, $$

and this is obviously the case if f fulfills the equation

$$ \partial_{\bar{\tau}} f := (1/2)(\partial_x + i\partial_y) f = 0, $$

i.e. if f is holomorphic. Hence for k = −m, we have recovered the lift of the representation $\pi_m$ as a subrepresentation of our induced representation $\pi_k^{ind}$.

There is a shorter way to single out in $\mathcal{H}^{\chi_k}$ a subspace carrying $\pi_m$: To the two conditions characterizing the space of the induced representation, namely the functional equation and the norm condition, we simply add the condition

$$ L_{X_-}\phi = 0. $$

This comes out as follows: From (7.16) we have

$$ L_{X_-} = -(i/2)e^{-2i\vartheta}(2y(\partial_x + i\partial_y) - \partial_\vartheta). $$

For $\phi(g) = e^{-ik\vartheta} y^{-k/2} f(x, y)$ this again leads to the holomorphicity condition $\partial_{\bar{\tau}} f = 0$.

Remark 7.6: This is a special example of a general procedure propagated in particular by W. Schmid and called holomorphic induction.

Remark 7.7: Here we began by stating the prescription for the representation $\pi_m$ in a space of functions living on the homogeneous space $\mathbb{H} = G/SO(2)$. It is a nice exercise to rediscover this formula by our general procedure to change from the First to the Second Realization of an induced representation:

We start with Mackey's decomposition as above, g = s(τ)r(ϑ), and for $g_0^{-1} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ the Master Equation, using s(τ) = n(x)t(y), states

$$ g_0^{-1} s(\tau) = s(g_0^{-1}(\tau))\, r(\vartheta^*) $$

with

$$ e^{i\vartheta^*} = \sqrt{\frac{c\bar{\tau} + d}{c\tau + d}}. $$

Moreover one has the standard formulae

$$ \frac{d\mu(g_0^{-1}(\tau))}{d\mu(\tau)} = \frac{1}{(c\tau + d)^2}, \quad \mathrm{Im}\,(g_0^{-1}(\tau)) = \frac{y}{|c\tau + d|^2}, $$

such that the left regular representation $\pi(g_0)\phi(g) = \phi(g_0^{-1} g)$ in the Second Realization with F(τ) = φ(s(τ)) translates to the prescription

$$ \pi(g_0)F(\tau) = \Big( \frac{d\mu(g_0^{-1}(\tau))}{d\mu(\tau)} \Big)^{1/2} \pi_0^{-1}(r(\vartheta^*))\, F(g_0^{-1}(\tau)). $$

With $\pi_0 = \chi_k$ and $F(\tau) = y^{-k/2} f(\tau)$ for k = −m, by a small calculation this leads to the prescription (7.18)

$$ \pi_m(g_0) f(\tau) = f(g_0^{-1}(\tau))(c\tau + d)^{-m}. $$

Remark 7.8: The functions φ in the representation space of the First Realization and the matrix coefficients < π(g)φ, φ′ > are square-integrable for m ≥ 2. This is a general criterion to distinguish discrete series representations from the other representations.

The representations $\pi_1^{\pm}$ are called mock discrete series representations or limits of discrete series. They are realized by the same prescriptions as the $\pi_m^{\pm}$ with m ≥ 0, but here the space of holomorphic functions f on $\mathbb{H}$ for $\pi_1^-$ gets the new norm

$$ \|f\|^2 = \sup_{y > 0} \int_{\mathbb{R}} |f(x + iy)|^2\, dx, $$

and analogously for the other case.

The courageous reader may be tempted to apply this principle of holomorphic induction to reconstruct as Exercise 7.17 the irreducible unitary representations of G = SU(2).

3. The Complementary Series

For the sake of completeness, we indicate here also a model, but only briefly, as done in most sources. (The best idea is to go back to Bargmann's original paper [Ba].) We take up again the prescriptions as in the construction of the principal series, but to get a model for $\pi_s$, 0 < s < 1, we change the norm on the space of complex valued functions on R and introduce

$$ \|f\|_s^2 := \int_{\mathbb{R}} \int_{\mathbb{R}} \frac{f(x)\overline{f(y)}}{|x - y|^{1-s}}\, dx\,dy. $$


For $g^{-1} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ the prescription is here

$$ \pi_s(g) f(x) = |cx + d|^{-1-s} f(g^{-1}(x)). $$

Some more models can be found in [vD] p.465 and [GGP] p.36.

A Final General Remark

We close this section by pointing to the Subrepresentation Theorem, which goes back to Harish–Chandra and roughly states (for a precise formulation and proof see [Kn] p.238) that a certain class of representations, the admissible representations, may be realized as subrepresentations of parabolically induced representations, starting possibly from non-unitary representations of the subgroup. Admissible representations are very important, in particular in number theory. An admissible representation (π, V) of G is (in first approximation) characterized as follows ([Ge] p.55): Let K be a maximal compact subgroup of G. For an irreducible representation χ of K, denote by V(χ) the subspace of vectors in V which transform according to some multiple of χ. Then one has a decomposition

$$ V = \bigoplus_\chi V(\chi) $$

where each V(χ) is finite dimensional.

Irreducible unitary representations of reductive groups are admissible. So all representations in this text have this property (with the exception of the representations of the (non-reductive) Jacobi group appearing later).

The interested reader can exemplify the statement of the subrepresentation theorem in the case of G = SL(2,R) by constructing as Exercise 7.18 another model of the discrete series representation by starting the induction procedure from

$$ H = \{ h = n(x)t(y);\ x, y \in \mathbb{R},\ y > 0 \}, \quad \pi_0(h) = y^s, \quad s \in \mathbb{C}. $$

7.3 Unitary Representations of SL(2,C) and of the Lorentz Group

The representations of the group G = SL(2,C) are of particular importance for physicists because this group is the double cover of the identity component of the complete homogeneous Lorentz group in four dimensions (to which we will restrict ourselves in this section). At first, we will discuss the representations of SL(2,C) and then the homomorphism into the Lorentz group.

As central fact we have:

Theorem 7.5: For G = SL(2,C) there are (up to equivalence) only two series of irreducible unitary representations:
– the unitary principal series $\pi_{k,iv}$ depending on two parameters $k \in \mathbb{Z}$, $v \in \mathbb{R}$, and
– the complementary series $\pi_{0,u}$ depending on a real parameter u with 0 < u < 2.
Among these the only equivalences are $\pi_{k,iv} \simeq \pi_{-k,-iv}$.


Thus, in contrast to the smaller group $SL(2,\mathbb{R})$, we have no discrete series representations. We reproduce models for these representations from [Kn] p.33/4 and discuss their construction via the induction procedure afterwards.

As representation space we take the space $\mathcal{H} = L^2(\mathbb{C})$ of square-integrable complex functions $f$ on $\mathbb{C}$ with measure $dz\,d\bar z$. For $z \in \mathbb{C}$, $f \in \mathcal{H}$, the prescription for the unitary principal series representation is given by

$$\pi_{k,iv}\begin{pmatrix} a & b \\ c & d \end{pmatrix} f(z) := |-bz+d|^{-2-iv} \left(\frac{-bz+d}{|-bz+d|}\right)^{-k} f\!\left(\frac{az-c}{-bz+d}\right). \tag{7.20}$$

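Here, the notion of the fractional linear transformation used in the last section as a map $SL(2,\mathbb{R}) \times \mathfrak{H} \to \mathfrak{H}$, $(g, \tau) \mapsto g(\tau)$, is naturally extended to a map

$$SL(2,\mathbb{C}) \times (\mathbb{C} \cup \{\infty\}) \to \mathbb{C} \cup \{\infty\}$$

with

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}(z) := \begin{cases} \dfrac{az+b}{cz+d} & \text{for } z \in \mathbb{C},\ cz + d \neq 0, \\ \infty & \text{for } z \in \mathbb{C},\ cz + d = 0, \\ a/c & \text{for } z = \infty. \end{cases}$$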

In a similar way, for $k \in \mathbb{Z}$, $w = u + iv \in \mathbb{C}$, there is a non-unitary principal series $\pi_{k,w}$ given by the prescription

$$\pi_{k,w}\begin{pmatrix} a & b \\ c & d \end{pmatrix} f(z) := |-bz+d|^{-2-w} \left(\frac{-bz+d}{|-bz+d|}\right)^{-k} f\!\left(\frac{az-c}{-bz+d}\right)$$

for functions $f$ in the space $L^2(\mathbb{C}, d\mu_w)$ with $d\mu_w(z) = (1+|z|^2)^u\,dz\,d\bar z$. However, in the special case $k = 0$, $w = u \in \mathbb{R}$, $0 < u < 2$, the representation $\pi_u := \pi_{0,u}$ is unitary if the representation space is provided with the inner product

$$\langle f, f \rangle := \int_{\mathbb{C}} \int_{\mathbb{C}} \frac{f(z)\,\overline{f(\zeta)}}{|z-\zeta|^{2-u}}\,dz\,d\zeta.$$

This is our model for the complementary series.

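As to the construction, these representations are induced representations where one takes the subgroup

$$H := \left\{h = \begin{pmatrix} \alpha & \beta \\ 0 & \alpha^{-1} \end{pmatrix};\ \alpha, \beta \in \mathbb{C},\ \alpha \neq 0\right\}$$

and the representation $\pi_0$ with $\pi_0(h) := |\alpha|^w (\alpha/|\alpha|)^k$. We have the Mackey decomposition

$$g = \begin{pmatrix} a & b \\ c & d \end{pmatrix} = s(z)h$$

where (as in our discussion of the principal series for $SL(2,\mathbb{R})$ in the previous section) it suffices to treat the case $a \neq 0$, such that one has

$$s(z) = \begin{pmatrix} 1 & 0 \\ z & 1 \end{pmatrix},\ z := c/a, \quad \text{and} \quad h = \begin{pmatrix} a & b \\ 0 & a^{-1} \end{pmatrix},$$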


and the Master Equation $g^{-1}s(z) = s(z^*)h^*$ with

$$z^* = \frac{az-c}{-bz+d}, \quad h^* = \begin{pmatrix} -bz+d & -b \\ 0 & (-bz+d)^{-1} \end{pmatrix}.$$

Moreover, we have

$$dz^*\,d\bar z^* = |-bz+d|^{-4}\,dz\,d\bar z.$$

Hence, since the induced representation in left regular version in its Second Realization on the space $\mathcal{H} = L^2(\mathbb{C})$ has the general prescription

$$\pi(g)f(z) = \sqrt{\frac{d\mu(z^*)}{d\mu(z)}}\;\pi_0(h^*)\,f(z^*), \quad \text{for all } f \in \mathcal{H},$$

with $w = iv$ this leads to the formula (7.20).
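The Master Equation above can be checked numerically. The following is only a minimal sketch, assuming the lower triangular form $s(z) = \begin{pmatrix} 1 & 0 \\ z & 1 \end{pmatrix}$ of the section; the helper names (`matmul`, `inverse_sl2`, `section`, `master_equation`, `max_diff`) and the sample values are illustrative choices and not part of the text.

```python
def matmul(p, q):
    """Product of two 2x2 matrices given as ((a, b), (c, d))."""
    (a, b), (c, d) = p
    (e, f), (g, h) = q
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def inverse_sl2(g):
    """Inverse of a 2x2 matrix of determinant one."""
    (a, b), (c, d) = g
    return ((d, -b), (-c, a))


def section(z):
    """The section s(z) = ((1, 0), (z, 1)) (assumed lower triangular form)."""
    return ((1, 0), (z, 1))


def master_equation(g, z):
    """Return (z_star, h_star) as stated in the text for g^{-1} s(z)."""
    (a, b), (c, d) = g
    alpha = -b * z + d
    z_star = (a * z - c) / alpha
    h_star = ((alpha, -b), (0, 1 / alpha))
    return z_star, h_star


def max_diff(p, q):
    """Largest entrywise distance between two 2x2 matrices."""
    return max(abs(p[i][j] - q[i][j]) for i in range(2) for j in range(2))


# Hypothetical sample data: a, b, c are arbitrary, d is fixed by det g = 1.
a, b, c = 1 + 2j, 0.5j, -0.3 + 1j
g = ((a, b), (c, (1 + b * c) / a))
z = 0.3 - 0.7j

z_star, h_star = master_equation(g, z)
lhs = matmul(inverse_sl2(g), section(z))   # g^{-1} s(z)
rhs = matmul(section(z_star), h_star)      # s(z*) h*
print(max_diff(lhs, rhs))                  # should vanish up to rounding
```

Both sides agree entry by entry because $ad - bc = 1$; the same identity gives $dz^*/dz = (-bz+d)^{-2}$, which is the source of the factor $|-bz+d|^{-4}$ in the transformation of the measure.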

In [Kn] p.33/4 Knapp proves the irreducibility of $\pi_{k,iv}$ using Schur's Lemma in the unitary version cited in 3.2. One could also do this (and verify that we really have only the above mentioned irreducible unitary representations) by infinitesimal considerations as in our discussion of $SL(2,\mathbb{R})$. But since $\mathfrak{sl}(2,\mathbb{C})$ has real dimension 6, this leads to considerable non-trivial computations (to be found for instance in Naimark [Na] or Cornwell [Co] p.668ff in the framework of the discussion of the Lorentz group).

The interest in the representations of $G = SL(2,\mathbb{C})$ is increased by the fact that they immediately lead to projective representations of the proper orthochronous Lorentz group, and those which are trivial on $-E_2$ lead to genuine representations, in the same way as in our discussion of $SU(2)$ and $SO(3)$ in 4.3. Before we describe the relation between $SL(2,\mathbb{C})$ and the Lorentz group we have to fix some notation.

We write

$$G^L := \{A \in GL(4,\mathbb{R});\ {}^tA\,D_{3,1}\,A = D_{3,1}\}, \quad D_{3,1} := \begin{pmatrix} 1 & & & \\ & 1 & & \\ & & 1 & \\ & & & -1 \end{pmatrix},$$

for the complete homogeneous Lorentz group, i.e. the group of linear transformations which leave invariant the Minkowski metric from the theory of special relativity

$$ds^2 = dx_1^2 + dx_2^2 + dx_3^2 - dx_4^2.$$

One may be tempted to replace $(x_1, x_2, x_3, x_4)$ by $(x, y, z, ct)$ ($c$ the velocity of light) and put

$$ds^2 = c^2dt^2 - dx^2 - dy^2 - dz^2,$$

as often done in the physics literature, but we will not do this here. In our former notation from 0.1 we also have $G^L = O(3,1)$ and $G^L_+ = SO(3,1)$ for the proper homogeneous Lorentz group, i.e. the invariant subgroup consisting of matrices $A$ with $\det A = 1$. We write the matrices in block form

$$A = \begin{pmatrix} B & x \\ {}^ty & a \end{pmatrix}, \quad B \in GL(3,\mathbb{R}),\ x, y \in \mathbb{R}^3,\ a \in \mathbb{R}.$$


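$G^L_+$ has two connected components distinguished by $a > 0$ and $a < 0$ since one has

$$\begin{pmatrix} {}^tB & y \\ {}^tx & a \end{pmatrix} \begin{pmatrix} E_3 & \\ & -1 \end{pmatrix} \begin{pmatrix} B & x \\ {}^ty & a \end{pmatrix} = \begin{pmatrix} {}^tBB - y\,{}^ty & {}^tBx - ya \\ {}^txB - a\,{}^ty & {}^txx - a^2 \end{pmatrix},$$

and the condition that this should be equal to $\begin{pmatrix} E_3 & \\ & -1 \end{pmatrix}$ asks for

$${}^tBB - y\,{}^ty = E_3, \quad {}^tBx - ya = 0, \quad {}^txx - a^2 = -1, \tag{7.21}$$

and, hence, in particular $a^2 \geq 1$.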

The subgroup $G^L_{++}$ with $a > 0$ is called the proper orthochronous homogeneous Lorentz group. It is the connected component of the identity in $SO(3,1)$. We will abbreviate this here to Lorentz group and write $G_0 := SO(3,1)_0$ for it. As is easily seen, one can embed $SO(3)$ into $G_0$ by putting

$$SO(3) \ni B \mapsto \tilde B := \begin{pmatrix} B & \\ & 1 \end{pmatrix}.$$

These elements are called Lorentz rotations. Moreover, we introduce the Lorentz boosts, which are of the type

$$B(t) = \begin{pmatrix} \cosh t & & & \sinh t \\ & 1 & & \\ & & 1 & \\ \sinh t & & & \cosh t \end{pmatrix}, \quad t \in \mathbb{R}.$$
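As a small plausibility check, one can verify numerically that the boosts satisfy the defining condition ${}^tA\,D_{3,1}\,A = D_{3,1}$, that they form a one-parameter group, and that their lower right entry is at least 1, as required by (7.21). This is a minimal sketch; the function names `mat_mul`, `transpose`, `boost` and `is_lorentz` are illustrative and not taken from the text.

```python
import math

# The matrix D_{3,1} = diag(1, 1, 1, -1) of the Minkowski form.
D31 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]


def mat_mul(p, q):
    """Product of two square matrices given as lists of rows."""
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(p):
    """Transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*p)]


def boost(t):
    """Lorentz boost B(t) mixing the first and the fourth coordinate."""
    ch, sh = math.cosh(t), math.sinh(t)
    return [[ch, 0, 0, sh], [0, 1, 0, 0], [0, 0, 1, 0], [sh, 0, 0, ch]]


def is_lorentz(a, tol=1e-9):
    """Check the defining condition tA D A = D of O(3,1)."""
    lhs = mat_mul(mat_mul(transpose(a), D31), a)
    return all(abs(lhs[i][j] - D31[i][j]) < tol
               for i in range(4) for j in range(4))


print(is_lorentz(boost(0.7)))
```

The relation $B(s)B(t) = B(s+t)$ follows from the addition theorems for $\cosh$ and $\sinh$ and can be tested in the same way.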

The structure of G0 is clarified by the following remark.

Proposition 7.2: For every A ∈ G0 there are R,S ∈ SO(3) and t ∈ R such that

A = RB(t)S,

i.e. G0 is generated by Lorentz rotations and Lorentz boosts.

A proof can be found in any source treating the Lorentz group, for instance in [He] p.75. Our main fact in this context is the next theorem.

Theorem 7.6: There is a surjective homomorphism

$$\rho : G = SL(2,\mathbb{C}) \to G_0 = SO(3,1)_0$$

with $\ker \rho = \{\pm E_2\}$.

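Proof (sketch): We put

$$V := \{X \in M_2(\mathbb{C});\ X = {}^t\bar X =: X^*\} = \left\{\begin{pmatrix} \alpha & z \\ \bar z & \beta \end{pmatrix};\ \alpha, \beta \in \mathbb{R},\ z \in \mathbb{C}\right\}$$

and look at the map

$$\varphi : \mathbb{R}^4 \to V, \quad v = {}^t(a, b, c, d) \mapsto \begin{pmatrix} -a+d & b+ic \\ b-ic & a+d \end{pmatrix}.$$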


For $X = \varphi(v)$, $Y = \varphi(w)$ this map transforms the Minkowski (quasi-)inner product

$$\langle v, w \rangle = aa' + bb' + cc' - dd'$$

into the corresponding one defined on $V$ by

$$\sigma : V \times V \to \mathbb{R}, \quad (X, Y) \mapsto -(1/2)(\det(X+Y) - \det X - \det Y)$$

as is easily verified by a small calculation. Hence our $\varphi$ induces a bijection between $O(3,1)$ and $O(V, \sigma)$, which can be recognized as an isomorphism, and we use this to identify both sets.

We define a map

$$\rho : SL(2,\mathbb{C}) \to GL(V), \quad A \mapsto \rho(A) \ \text{ with } \ \rho(A)X := AXA^* \ \text{ for } X \in V,$$

which obviously is a homomorphism. We have

$$\ker \rho = \{A;\ AXA^* = X \text{ for all } X \in V\}$$

and deduce $\ker \rho = \{\pm E_2\}$ by evaluating the defining condition for $X = E_2$, $\begin{pmatrix} 1 & \\ & -1 \end{pmatrix}$, and $\begin{pmatrix} & 1 \\ 1 & \end{pmatrix}$. The main task is to prove the surjectivity of $\rho$. We go back to the proof of Theorem 4.7 in 4.3 and use the fact that $G_0$ is generated in $GL(V)$ by elements $S_2(\alpha)$, $S_3(\alpha)$ and $B(t)$. By some computation, one verifies that these are images under $\rho$ of respectively $r(\alpha/2)$, $s(\alpha/2)$ and $s(-it/2)$.
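The two computational claims of the proof sketch, namely that $\varphi$ carries the Minkowski form to $\sigma$ and that $X \mapsto AXA^*$ preserves it, can be tested numerically. The following is a minimal sketch under the conventions of the proof; all function names and the sample matrix `A` are illustrative assumptions.

```python
def matmul(p, q):
    """Product of two 2x2 matrices given as ((a, b), (c, d))."""
    (a, b), (c, d) = p
    (e, f), (g, h) = q
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def conj_transpose(m):
    """The adjoint A* (conjugate transpose) of a 2x2 matrix."""
    (p, q), (r, s) = m
    return ((complex(p).conjugate(), complex(r).conjugate()),
            (complex(q).conjugate(), complex(s).conjugate()))


def det(m):
    (p, q), (r, s) = m
    return p * s - q * r


def phi(v):
    """The map phi: R^4 -> V (hermitian 2x2 matrices) from the proof."""
    a, b, c, d = v
    return ((-a + d, complex(b, c)), (complex(b, -c), a + d))


def phi_inverse(x):
    """Recover (a, b, c, d) from a hermitian matrix x = phi(v)."""
    (x11, x12), (x21, x22) = x
    return ((x22 - x11).real / 2, x12.real, x12.imag, (x11 + x22).real / 2)


def minkowski(v, w):
    """Minkowski form aa' + bb' + cc' - dd'."""
    return v[0] * w[0] + v[1] * w[1] + v[2] * w[2] - v[3] * w[3]


def sigma(x, y):
    """The form sigma(X, Y) = -(1/2)(det(X+Y) - det X - det Y)."""
    total = ((x[0][0] + y[0][0], x[0][1] + y[0][1]),
             (x[1][0] + y[1][0], x[1][1] + y[1][1]))
    return -0.5 * complex(det(total) - det(x) - det(y)).real


def rho(a, v):
    """Action of A in SL(2,C) on R^4 transported via phi: X -> A X A*."""
    return phi_inverse(matmul(matmul(a, phi(v)), conj_transpose(a)))


# Hypothetical sample data; A has determinant (1+i)*1 - 2*(i/2) = 1.
A = ((1 + 1j, 2), (0.5j, 1))
v = (1.0, 2.0, -0.5, 3.0)
w = (0.3, -1.0, 2.0, 0.7)
print(sigma(phi(v), phi(w)) - minkowski(v, w))           # close to 0
print(minkowski(rho(A, v), rho(A, w)) - minkowski(v, w))  # close to 0
```

Since $\det(AXA^*) = |\det A|^2 \det X = \det X$ and $-\det \varphi(v) = \langle v, v \rangle$, the invariance is immediate; the code only makes the bookkeeping explicit. Replacing `A` by its negative gives the same map, in line with $\ker \rho = \{\pm E_2\}$.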

This theorem justifies the statement made above that the representations $\pi$ of $SL(2,\mathbb{C})$ with $\pi(-E_2) = \mathrm{id}$ yield the representations of $SO(3,1)_0$. E.g., in [Co] p.671ff one finds a discussion of the finite-dimensional (non-unitary) representations of the Lorentz group.

7.4 Unitary Representations of Semidirect Products

The semidirect products form a class of groups whose importance is enhanced by applications in physics: the Euclidean and Poincaré groups are prominent examples, which fit into the following general concept:

We take two groups $G_0$ and $N$ and a (left) action of $G_0$ on $N$ given by

$$\rho : G_0 \times N \to N, \quad (a, n) \mapsto \rho(a)n$$

with the additional condition that every $\rho(a)$ is an automorphism of $N$, i.e. $\rho(a) \in \mathrm{Aut}\,N$ for all $a \in G_0$. Then we define the associated semidirect product $G := G_0 \ltimes N$ as the set of pairs $g = (a, n)$, $a \in G_0$, $n \in N$, provided with the composition law

$$gg' := (aa', n\rho(a)n'). \tag{7.22}$$

$G$ comes out as a group with neutral element $e$ consisting of the pair of neutral elements in $G_0$ and in $N$. The inverse to $g = (a, n)$ is $g^{-1} := (a^{-1}, \rho(a^{-1})n^{-1})$. (Verification of the associative law as Exercise 7.19.)


Examples:

Example 7.4: The Euclidean group $G^E(n)$ in $n$ dimensions: We take $G_0 = SO(n)$ and $N = \mathbb{R}^n$. For $A \in SO(n)$, $x \in \mathbb{R}^n$, the action $\rho$ is given by matrix multiplication $(A, x) \mapsto Ax$. We denote the semidirect product $SO(n) \ltimes \mathbb{R}^n$ by $G^E(n)$ and write this again as a matrix group

$$G^E(n) = \left\{g = (A, a) = \begin{pmatrix} A & a \\ & 1 \end{pmatrix};\ A \in SO(n),\ a \in \mathbb{R}^n\right\}.$$

Here the matrix multiplication in $M_{n+1}(\mathbb{R})$

$$gg' = \begin{pmatrix} A & a \\ & 1 \end{pmatrix} \begin{pmatrix} A' & a' \\ & 1 \end{pmatrix} = \begin{pmatrix} AA' & a + Aa' \\ & 1 \end{pmatrix}$$

reflects exactly the composition law (7.22), leading in this case to $gg' = (AA', a + Aa')$. This law comes out very naturally if one asks for the composition of the affine transformations $x \mapsto Ax + a$, which preserve the Euclidean norm $\|x\|^2 = x_1^2 + \cdots + x_n^2$.
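For $n = 2$ the agreement between the pair law $gg' = (AA', a + Aa')$ and the matrix multiplication in $M_3(\mathbb{R})$ can be checked directly, together with the additive form $g^{-1} = (A^{-1}, -A^{-1}a)$ of the inverse. The following is a minimal sketch; rotations are encoded by their angle, and all names are illustrative assumptions.

```python
import math


def rotation(theta):
    """The rotation matrix r(theta) in SO(2)."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s), (s, c))


def apply(rot, v):
    """Apply a 2x2 matrix to a vector of R^2."""
    return (rot[0][0] * v[0] + rot[0][1] * v[1],
            rot[1][0] * v[0] + rot[1][1] * v[1])


def compose(g, h):
    """Semidirect product law (A, a)(A', a') = (AA', a + A a')."""
    (theta, a), (theta2, a2) = g, h
    moved = apply(rotation(theta), a2)
    return (theta + theta2, (a[0] + moved[0], a[1] + moved[1]))


def inverse(g):
    """Inverse (A, a)^{-1} = (A^{-1}, -A^{-1} a) in additive notation."""
    theta, a = g
    back = apply(rotation(-theta), a)
    return (-theta, (-back[0], -back[1]))


def as_matrix(g):
    """The 3x3 matrix ((A, a), (0, 1)) representing g = (A, a)."""
    theta, a = g
    r = rotation(theta)
    return [[r[0][0], r[0][1], a[0]], [r[1][0], r[1][1], a[1]], [0.0, 0.0, 1.0]]


def mat_mul(p, q):
    """Product of two square matrices given as lists of rows."""
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def max_diff(p, q):
    """Largest entrywise distance between two matrices."""
    return max(abs(x - y) for rp, rq in zip(p, q) for x, y in zip(rp, rq))


# Hypothetical sample elements of G^E(2).
g = (0.4, (1.0, -2.0))
h = (-1.1, (0.5, 3.0))
print(max_diff(as_matrix(compose(g, h)), mat_mul(as_matrix(g), as_matrix(h))))
```

Because matrix multiplication is associative, this identification also settles the associativity of the pair law in this example.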

Example 7.5: The Poincaré group $G^P(n)$ in $n$ dimensions: Parallel to the case above we take $G_0 = SO(n-1, 1)$ and $N = \mathbb{R}^n$ and get

$$G^P(n) = \left\{g = (A, a) = \begin{pmatrix} A & a \\ & 1 \end{pmatrix};\ A \in SO(n-1, 1),\ a \in \mathbb{R}^n\right\}$$

with the same composition law. This is the invariance group of the affine transformations preserving the Minkowski (pseudo-)norm $\|x\|^2 = x_1^2 + \cdots + x_{n-1}^2 - x_n^2$.

Example 7.6: Interpret the group $B := \left\{g = \begin{pmatrix} a & b \\ & a^{-1} \end{pmatrix};\ a, b \in \mathbb{R},\ a > 0\right\}$ as a semidirect product (Exercise 7.20).

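Example 7.7: Interpret the Heisenberg group

$$\mathrm{Heis}'(\mathbb{R}) := \left\{g = \begin{pmatrix} 1 & x & z \\ & 1 & y \\ & & 1 \end{pmatrix};\ x, y, z \in \mathbb{R}\right\}$$

as semidirect product of $G_0 = \mathbb{R}$ acting on $N = \mathbb{R}^2$. Can you do the same with $\mathrm{Heis}(\mathbb{R})$ as realized in 0.3? (Exercise 7.21)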

Example 7.8: The Jacobi group $G^J$ is the semidirect product of $G_0 = SL(2,\mathbb{R})$ acting on the Heisenberg group $N = \mathrm{Heis}(\mathbb{R})$ via

$$\left(\begin{pmatrix} a & b \\ c & d \end{pmatrix}, (\lambda, \mu, \kappa)\right) \mapsto (d\lambda - c\mu,\ -b\lambda + a\mu,\ \kappa).$$

This group can be presented in several ways as a matrix group (see our Section 8.5 or for instance [EZ] or [BeS] p.1).

With the exception of the last one, in all these examples the group $N$ is abelian. In the sequel we will restrict our treatment to this case, as things become easier under this hypothesis, and use additive notation in the second item of the pair $g = (a, n)$ (as we already did in the examples). Whoever is interested in more general cases is referred to Mackey's original papers or the treatment of the representation theory of the Jacobi group in [BeS].

Remark 7.9: We use the following embeddings

$$G_0 \to G = G_0 \ltimes N, \quad a \mapsto (a, 0) =: a$$

and

$$N \to G = G_0 \ltimes N, \quad n \mapsto (e, n) =: n.$$

Here one has to be a bit careful because we have $n \cdot a = (e, n)(a, 0) = (a, n) = g$ but $a \cdot n = (a, 0)(e, n) = (a, \rho(a)n) \neq (a, n) = g$ in general. There is the useful relation

$$(a, 0)(e, n)(a^{-1}, 0) = (e, \rho(a)n),$$

which, using our embeddings, can be stated as $ana^{-1} = \rho(a)n$.

Example 7.9: We take

$$G := \mathrm{Heis}'(\mathbb{R}) = \{g = (x, y, z);\ x, y, z \in \mathbb{R}\}, \quad G_0 := \{a = (x, 0, 0);\ x \in \mathbb{R}\},$$

and

$$N := \{n = (0, y, z);\ y, z \in \mathbb{R}\}.$$

Using the multiplication law $gg' = (x + x', y + y', z + z' + xy')$, one calculates

$$ana^{-1} = (0, y, z + xy) = \rho(a)n$$

and this is consistent with the usual semidirect product composition law (7.22) given by $gg' = (a, n)(a', n') = (aa', n\rho(a)n')$.

Example 7.10: For a presentation as semidirect product of the group

$$B := \left\{g = \begin{pmatrix} a & b \\ & a^{-1} \end{pmatrix} =: (a, b);\ a, b \in \mathbb{R},\ a > 0\right\}$$

one may be tempted to try $G_0 := \{a = (a, 0);\ a > 0\}$ and $N := \{n = (1, b);\ b \in \mathbb{R}\}$. From matrix multiplication we have $gg' = (aa', ab' + ba'^{-1})$ and (using the embedding introduced above) $ana^{-1} = (1, a^2b) = \rho(a)n$. But then the composition law would give $gg' = (a, b)(a', b') = (aa', n\rho(a)n') = (aa', b + a^2b')$, which is not the multiplication law in this group. To make things consistent (and thus solve the problem from Exercise 7.20), one constructs the semidirect product $G = \mathbb{R}_{>0} \ltimes \mathbb{R} = \{[a, x];\ a, x \in \mathbb{R},\ a > 0\}$ with $\rho(a)x = a^2x$ and defines an isomorphism

$$\varphi : B \to G, \quad (a, b) \mapsto [a, ab].$$
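That $\varphi$ is compatible with the two multiplication laws, while the naive identification is not, can be seen on concrete numbers. This is a minimal sketch; the function names `mult_b`, `mult_semidirect` and `phi_iso` are illustrative and do not appear in the text.

```python
def mult_b(g, h):
    """Matrix multiplication in B in the coordinates (a, b)."""
    (a, b), (a2, b2) = g, h
    return (a * a2, a * b2 + b / a2)


def mult_semidirect(g, h):
    """Composition law [a, x][a', x'] = [aa', x + a^2 x'] with rho(a)x = a^2 x."""
    (a, x), (a2, x2) = g, h
    return (a * a2, x + a * a * x2)


def phi_iso(g):
    """The map (a, b) -> [a, ab]."""
    a, b = g
    return (a, a * b)


# Hypothetical sample elements of B.
g, h = (2.0, 3.0), (0.5, -1.0)
print(phi_iso(mult_b(g, h)))                   # (1.0, 4.0)
print(mult_semidirect(phi_iso(g), phi_iso(h)))  # (1.0, 4.0)
print(mult_semidirect(g, h))                   # (1.0, -1.0): the naive law differs
```

In general one has $\varphi(gg') = [aa', a^2a'b' + ab]$, and this is exactly $[a, ab][a', a'b'] = [aa', ab + a^2 \cdot a'b']$.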

The agreeable thing about these semidirect products $G = G_0 \ltimes N$, $N$ abelian, is the fact that Mackey's Theory provides an explicit way to construct their irreducible unitary representations as induced representations. We describe this method without going into all the intricacies of the proofs, which rely on some serious functional analysis:


We start with an easy but important fact relating the representations of $G = G_0 \ltimes N$, with $G_0$ acting on $N$ by $\rho$, to those of its constituents $G_0$ and $N$.

Proposition 7.3: Let $\pi$ be a representation of $G$ and $\pi'$, $\pi''$ the restrictions of $\pi$ to $N$ resp. $G_0$, i.e. $\pi'(n) = \pi((e, n))$ for $n \in N$ and $\pi''(a) = \pi((a, 0))$ for $a \in G_0$. Then $\pi'$ and $\pi''$ completely determine $\pi$ and are related by the condition

$$\pi'(\rho(a)n) = \pi''(a)\pi'(n)\pi''(a^{-1}) \quad \text{for all } a \in G_0,\ n \in N. \tag{7.23}$$

Proof: $\pi$ is determined by $\pi'$ and $\pi''$ as one has

$$\pi((a, n)) = \pi((e, n)(a, 0)) = \pi((e, n))\pi((a, 0)) = \pi'(n)\pi''(a).$$

And from the composition law in $G$

$$(a_1, n_1)(a_2, n_2) = (a_1a_2, n_1 + \rho(a_1)n_2)$$

we deduce

$$\pi'(n_1)\pi''(a_1)\pi'(n_2)\pi''(a_2) = \pi'(n_1)\pi'(\rho(a_1)n_2)\pi''(a_1)\pi''(a_2)$$

and with $a_1 = a$, $n_2 = n$ this leads immediately to (7.23).
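For readers who want the last step of the proof spelled out, here is a short derivation. It uses only that $\pi$, $\pi'$ and $\pi''$ are homomorphisms and that $\pi((a, n)) = \pi'(n)\pi''(a)$, as stated in the proof.

```latex
\begin{align*}
\pi\big((a_1,n_1)(a_2,n_2)\big)
  &= \pi'(n_1)\,\pi''(a_1)\,\pi'(n_2)\,\pi''(a_2), \\
\pi\big((a_1a_2,\; n_1+\rho(a_1)n_2)\big)
  &= \pi'\big(n_1+\rho(a_1)n_2\big)\,\pi''(a_1a_2)
   = \pi'(n_1)\,\pi'\big(\rho(a_1)n_2\big)\,\pi''(a_1)\,\pi''(a_2).
\end{align*}
% Both left hand sides agree by the composition law. Cancelling
% \pi'(n_1) on the left and \pi''(a_2) on the right gives
\begin{equation*}
\pi''(a_1)\,\pi'(n_2) = \pi'\big(\rho(a_1)n_2\big)\,\pi''(a_1),
\end{equation*}
% and multiplying from the right by \pi''(a_1)^{-1} = \pi''(a_1^{-1})
% yields (7.23) with a_1 = a and n_2 = n.
```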

Hence to find representations of $G$, one has to find representations of $G_0$ and $N$ fulfilling (7.23). This condition is important in the context of Mackey's system of imprimitivity, which is the main tool in the proof that the construction we describe in the sequel yields all irreducible unitary representations of semidirect products. We essentially follow [BR] p.503ff and [Ki] p.195ff (where the more general case of representations of group extensions is treated). A decisive element in the construction comes from the fact that the action $\rho$ of $G_0$ on $N$ leads also to an action on the unitary dual $\hat N$ of $N$, which in our abelian case consists of (classes of) characters $\hat n$ of $N$. Namely, we have

$$G_0 \times \hat N \to \hat N, \quad (a, \hat n) \mapsto a\hat n$$

given by $a\hat n(n) := \hat n(\rho(a^{-1})n)$. Since the (abelian) group $N$ acts (trivially) on $\hat N$ by conjugation, i.e. via $n_0\hat n(n) := \hat n(n_0^{-1}nn_0)$, one has an action of the semidirect product $G$ on $\hat N$. The $G$-orbit of $\hat n \in \hat N$ is written as

$$O_{\hat n} := G\hat n.$$

Mackey's construction works under the assumption that the semidirect product is regular. All the groups we treat as examples are of this type. Hence, since we will not give complete proofs anyway, a less ambitious reader may at first skip the following rather technical condition (from [BR] p.506).

Definition 7.1: We say that $G$ is a regular semidirect product of $N$ and $G_0$ if $\hat N$ contains a countable family $Z_1, Z_2, \ldots$ of Borel subsets, each a union of $G$-orbits, such that every orbit in $\hat N$ is the intersection of the members of a subfamily $Z_{n_1}, Z_{n_2}, \ldots$ containing that orbit. Without loss of generality, we can suppose that the intersection of a finite number of $Z_i$ is an element of the family $(Z_i)$. This is equivalent to the assumption that any orbit is the limit of a decreasing subsequence of $(Z_i)$.

One also has a slightly different looking condition for regularity used by Mackey ([Ma] p.77): There is a subset $J \subset \hat N$, which meets each $G$-orbit $O$ exactly once and is an analytic subset.

Now as our central prescription for the construction of representations in this section, we take over from [BR] Theorem 4 on p.508:

Theorem 7.7: Let $G = G_0 \ltimes N$ be a regular semidirect product of separable, locally compact groups $G_0$ and $N$, $N$ abelian. Then every irreducible unitary representation of $G$ is unitarily equivalent to an induced representation $\pi = \mathrm{ind}_H^G \pi_0$, which can be constructed as follows.
– Step 1: Determine the set $\hat N$ of characters $\hat n$ of $N$, i.e. the dual group of $N$.
– Step 2: Classify and determine the $G_0$-orbits $O$ in $\hat N$.
– Step 3: Choose an element $\hat n_0$ in each orbit $O$ and determine the stabilizing group $G_{0\hat n_0} =: H_0$. (This group is called the little group in the physics literature.)
– Step 4: Determine the unitary dual $\hat H_0$ of $H_0$.
– Step 5: From every element of $\hat H_0$ take a representation $\pi_1$ of $H_0$ and extend it to a representation $\pi_0$ of $H := H_0 \ltimes N$ by

$$\pi_0((a, n)) := \hat n_0(n)\pi_1(a) \quad \text{for all } a \in H_0,\ n \in N.$$

– Step 6: Take $H$ and $\pi_0$ from the last step and proceed to (normalized) induction $\pi = \mathrm{ind}_H^G \pi_0$.

It is quite evident that this recipe produces unitary representations. But it is far from evident that these are irreducible and that we really get the unitary dual of the semidirect product and, hence, have a classification scheme at hand. As already said, we cannot go into the proofs here (which essentially use Mackey's imprimitivity theorem) and refer to Chapters 16 and 17 of [BR] or §13 of [Ki] and Section 10 of [Ma], where slightly different approaches are adopted. Now, we discuss some examples to show how the method works. In these examples one can verify directly the validity of the procedure encapsulated in Theorem 7.7, for instance with the help of the infinitesimal method from our Chapter 6.

Examples:

Example 7.11: $G := \mathrm{Heis}'(\mathbb{R}) = \{g = (x, y, z);\ x, y, z \in \mathbb{R}\}$ with

$$gg' = (x + x', y + y', z + z' + xy')$$

is isomorphic to the semidirect product of $G_0 := \{a = (x, 0, 0);\ x \in \mathbb{R}\}$ acting on $N := \{n = (0, y, z);\ y, z \in \mathbb{R}\}$ via $\rho(a)n := (0, y, z + xy)$ as seen in Example 7.9 above.
– Step 1: The dual $\hat N$ of $N \simeq \mathbb{R}^2$ is again to be identified with $\mathbb{R}^2$. With $(l, m) \in \mathbb{R}^2$ we write

$$\hat n(n) = e^{i(ly + mz)} =: \chi_{(l,m)}(y, z).$$

– Step 2: $a \in G_0 \simeq \mathbb{R}$ acts on $\hat n \in \hat N \simeq \mathbb{R}^2$ by

$$a\hat n(n) := \hat n(\rho(a^{-1})n) = e^{i(ly + m(z - xy))} = \chi_{(l - xm, m)}(y, z).$$


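Hence $\hat N$ is disjointly decomposed into exactly two types of $G_0$-orbits:

• Type 1: For $m \neq 0$ and $\hat n_0 = (0, m)$ one has the line $O_{\hat n_0} = \{(-xm, m);\ x \in \mathbb{R}\}$,
• Type 2: For $m = 0$ and $\hat n_0 = (l, 0)$ there is the point $O_{\hat n_0} = \{(l, 0)\}$.

We get the following picture:

[Figure: the $(l, m)$-plane showing a line orbit $O_{(0,m)}$ and a point orbit $O_{(l,0)}$.]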

One can see that $G$ is a regular semidirect product: The set $\{(l, m) \in \mathbb{R}^2;\ lm = 0\}$ is an analytic set, which meets each orbit exactly once.
– Step 3: The stabilizing group of $\hat n_0 = (0, m)$, $m \neq 0$, is $H_0 = G_{0(0,m)} = \{0\}$ and that of $\hat n_0 = (l, 0)$, $l \in \mathbb{R}$, is $H_0 = G_{0(l,0)} = G_0 \simeq \mathbb{R}$.
– Step 4: In the first case the unitary dual of $H_0$ simply consists of the trivial representation, and in the second case we have $\hat H_0 \simeq \mathbb{R}$ with representatives given by $\pi_1(a) = e^{ijx} =: \chi_j(x)$, $j \in \mathbb{R}$.
– Step 5 and 6: In the second case, the type 2 orbits, the product of the stabilizing group $H_0 = G_0$ with $N$ already exhausts the whole group $G$. Hence no induction is necessary and one has for $(j, l) \in \mathbb{R}^2$ a one-dimensional representation $\pi$ of $G = G_0 \ltimes N$ given by

$$\pi((x, y, z)) = e^{ijx}e^{ily}.$$

In the first case, belonging to the type 1 orbits, one has $H = H_0 \ltimes N \simeq N \simeq \mathbb{R}^2$ with a representation $\pi_0$ given by $\pi_0(h) = e^{imz}$ for all $h = (0, y, z)$. Here the induction makes sense and we have a representation $\pi = \mathrm{ind}_H^G \pi_0$ (for instance) given by right translation on the completion $\mathcal{H}$ of the space of smooth functions $\phi : G \to \mathbb{C}$ with functional equation

$$\phi(hg) = e^{imz}\phi(g) \quad \text{for all } h = (0, y, z) \in H,\ g \in G,$$

and $\|\phi\|^2 = \int_{\mathbb{R}} |\phi((x, 0, 0))|^2\,dx < \infty$. As the composition law says

$$gg_0 = (x + x_0, y + y_0, z + z_0 + xy_0) \quad \text{and} \quad (0, y, z)(x, 0, 0) = (x, y, z),$$

an element $\phi$ in $\mathcal{H}$ is of the form $\phi(g) = e^{imz}f(x)$ with $f \in L^2(\mathbb{R})$ and one has

$$\phi(gg_0) = e^{imz}e^{im(z_0 + xy_0)}f(x + x_0).$$

So we have a second realization of this representation given by

$$\pi(g_0)f(x) = e^{im(z_0 + xy_0)}f(x + x_0) \quad \text{for } f \in L^2(\mathbb{R}).$$

This is the Schrödinger representation $\pi_m$ from our Example 1.6 in 1.3 translated to $\mathrm{Heis}'(\mathbb{R})$ by the isomorphism

$$\varphi : \mathrm{Heis}'(\mathbb{R}) \to \mathrm{Heis}(\mathbb{R}), \quad g = (x, y, z) \mapsto (\lambda, \mu, \kappa) = (x, y, 2z - xy),$$

from Exercise 0.1. Hence the unitary dual of the Heisenberg group $G$ is

$$\hat G = \mathbb{R}^2 \sqcup (\mathbb{R} \setminus \{0\}).$$
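One can test numerically that the prescription $\pi(g_0)f(x) = e^{im(z_0 + xy_0)}f(x + x_0)$ is compatible with the multiplication law of $\mathrm{Heis}'(\mathbb{R})$, i.e. that $\pi(g_0)\pi(g_1) = \pi(g_0g_1)$. This is a minimal sketch; the names `multiply`, `schroedinger` and `gaussian`, as well as the sample values, are illustrative assumptions.

```python
import cmath
import math


def multiply(g, h):
    """Multiplication law (x, y, z)(x', y', z') = (x+x', y+y', z+z'+xy')."""
    (x, y, z), (x2, y2, z2) = g, h
    return (x + x2, y + y2, z + z2 + x * y2)


def schroedinger(m, g0):
    """Return the operator f -> pi(g0) f of the second realization."""
    x0, y0, z0 = g0

    def operator(f):
        return lambda x: cmath.exp(1j * m * (z0 + x * y0)) * f(x + x0)

    return operator


def gaussian(x):
    """A sample square-integrable function on R."""
    return math.exp(-x * x) * (1 + x)


# Hypothetical sample data.
m = 1.7
g0, g1 = (0.3, -1.2, 0.5), (-0.8, 0.4, 2.0)
x = 0.25

lhs = schroedinger(m, g0)(schroedinger(m, g1)(gaussian))(x)
rhs = schroedinger(m, multiply(g0, g1))(gaussian)(x)
print(abs(lhs - rhs))   # should vanish up to rounding
```

The agreement reflects the identity $z_0 + xy_0 + z_1 + (x + x_0)y_1 = (z_0 + z_1 + x_0y_1) + x(y_0 + y_1)$ in the exponent.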


Example 7.12: $G := G^E(2) = SO(2) \ltimes \mathbb{R}^2$ has $G_0 = SO(2)$ and $N = \mathbb{R}^2$. We write $g = (a, n)$ with $a = r(\vartheta)$ and $n = x := {}^t(x_1, x_2)$. One has $\rho(a)n := r(\vartheta)x$.
– Step 1: As in Example 7.11 we have $\hat N = \mathbb{R}^2$. This time, for $y = (y_1, y_2) \in \mathbb{R}^2$ we write

$$\hat n(n) = e^{i(y_1x_1 + y_2x_2)} = \chi_{(y_1,y_2)}(x_1, x_2).$$

– Step 2: $a = r(\vartheta) \in G_0 = SO(2)$ acts on $\hat n \in \hat N$ by

$$a\hat n(n) := \hat n(\rho(a^{-1})n) = e^{iyr(-\vartheta)x} = \chi_{yr(-\vartheta)}(x).$$

Hence $\hat N$ is again decomposed into two types of orbits:
• Type 1: For $\hat n_0 = (r, 0)$, $r > 0$, the circle $O_{\hat n_0} = \{y \in \mathbb{R}^2;\ \|y\|^2 = r^2\}$.
• Type 2: For $\hat n_0 = (0, 0)$, the point $O_{\hat n_0} = \{(0, 0)\}$.

We can verify the regularity of the semidirect product: One has a countable family of Borel subsets of $\hat N$ consisting of the following sets: $Z_{0,0}$ := the point $O_{\hat n_0} = \{(0, 0)\}$ and, for any two positive rational numbers $r_1 < r_2$, $Z_{r_1,r_2}$ := the union of all orbits of type 1 with $r_1 < r < r_2$. Then each orbit is the intersection of the members of the subfamily which contain the orbit.
– Step 3: For the type 1 orbits the stabilizing group is $G_{0(r,0)} = \{E_2\}$ and for the type 2 orbit one has $G_{0(0,0)} = G_0$.
– Step 4: The unitary dual of $G_0 = SO(2)$ is $\mathbb{Z}$.
– Step 5 and 6: In the second case, the type 2 orbit, the product of $N$ and the stabilizing group is $G$. Hence for every $k \in \mathbb{Z}$, we have a one-dimensional representation $\pi$ of $G$ given by

$$\pi((r(\vartheta), x)) = e^{ik\vartheta}.$$

In the first case, the type 1 orbit, one has for $r > 0$ the genuine subgroup $H \simeq N$ and its one-dimensional representation $\pi_0$ given by

$$\pi_0((E_2, x)) = \hat n_0(n) = e^{irx_1}.$$

Hence, for every $r > 0$, we have an induced representation $\pi = \mathrm{ind}_H^G \pi_0$.

In consequence, the unitary dual of $G = G^E(2)$ is

$$\hat G = \mathbb{R}_{>0} \sqcup \mathbb{Z}.$$

Example 7.13: As Exercise 7.22, apply our construction recipe to construct representations of

$$G = B := \left\{g = \begin{pmatrix} a & b \\ & a^{-1} \end{pmatrix} =: (a, b);\ a, b \in \mathbb{R},\ a > 0\right\}$$

and

$$G = B' := \left\{g = \begin{pmatrix} a & b \\ & a^{-1} \end{pmatrix} =: (a, b);\ a, b \in \mathbb{R},\ a \neq 0\right\}$$

and determine the unitary dual.

Exercise 7.23: Do the same for the Euclidean group in three dimensions $G^E(3)$.


7.5 Unitary Representations of the Poincaré Group

The Poincaré or inhomogeneous Lorentz group is generally known as the symmetry group of space-time. In continuation of our discussion of the Lorentz group in 7.3, we put

$$G = G^P(4) := G_0 \ltimes N \quad \text{with } G_0 := SO(3,1)_0,\ N := \mathbb{R}^4.$$

Sometimes it is necessary to take more generally $G_0 = SO(3,1)$ or even $G_0 = O(3,1)$, which also preserve the Minkowski (pseudo)norm $|\cdot|$ given by

$$|x|^2 := x_1^2 + x_2^2 + x_3^2 - x_4^2 = {}^tx \begin{pmatrix} E_3 & \\ & -1 \end{pmatrix} x, \quad \text{for all } x = {}^t(x_1, x_2, x_3, x_4) \in \mathbb{R}^4.$$

We restrict our treatment to a discussion of the simplest case (it should not be too difficult to extend our results to the more general cases). It should also not be difficult to translate our procedure to the situation apparently preferred by the physics literature ([Co] p.677ff, [BR] p.513ff), namely that the coordinates are $x_0, x_1, x_2, x_3$ and the norm is given by $|x|^2 = x_0^2 - x_1^2 - x_2^2 - x_3^2$. Moreover, there are reasons from physics (see the hints in 4.3 while discussing the relation of the representations of $SO(3)$ and $SU(2)$) and mathematics to define and treat the semidirect product

$$\tilde G := SL(2,\mathbb{C}) \ltimes \mathbb{R}^4.$$

For the definition of this product, one goes back to the proof of Theorem 7.6 in 7.3 where we fixed the surjection $\rho : SL(2,\mathbb{C}) \to SO(3,1)_0$: There we had an identification $\varphi : \mathbb{R}^4 \to V = \mathrm{Herm}_2(\mathbb{C})$ and defined an action of $SL(2,\mathbb{C})$ on $V$ (and hence on $\mathbb{R}^4$) by

$$\rho(A)X = AXA^* \quad \text{for all } X \in V,\ A \in SL(2,\mathbb{C}).$$

We start with the discussion of $G = G_0 \ltimes N$, $G_0 = SO(3,1)_0$, $N = \mathbb{R}^4$ (say, for aesthetic reasons) but soon shall be led to include also $\tilde G$. We follow the construction procedure from Theorem 7.7 in the last section.

Step 1: The dual $\hat N$ of $N = \mathbb{R}^4$ is again to be identified with $\mathbb{R}^4$. We treat $N$ as consisting of (coordinate) columns $x \in \mathbb{R}^4$ and $\hat N$ as consisting of (momentum) rows ${}^tp$, $p = (p_1, p_2, p_3, p_4) \in \mathbb{R}^4$. Hence we write

$$\hat n(n) = e^{i\sum p_jx_j} = e^{i\,{}^tpx} =: \chi_p(x).$$

(In the physics literature often minus signs appear in this identification of $N$ and its dual, but we want to avoid the discussion of co- and contravariant vectors at this stage.)

Step 2: $a \in G_0$, here given by $A \in SO(3,1)_0$, acts on $\hat n \in \hat N$, given by $p \in \mathbb{R}^4$, by

$$a\hat n(n) := \hat n(\rho(A^{-1})n) = \chi_{pA^{-1}}(x),$$

i.e. $A$ transforms $p$ into $pA^{-1}$. It is elementary calculus that the orbits for this action are subsets of the three types of quadrics

• Type 1: the two-sheeted hyperboloid given by

$$-|p|^2 = p_4^2 - p_3^2 - p_2^2 - p_1^2 = m^2 \quad \text{for } m \in \mathbb{R},\ m \neq 0,$$


• Type 2: the one-sheeted hyperboloid given by

$$-|p|^2 = p_4^2 - p_3^2 - p_2^2 - p_1^2 = (im)^2 \quad \text{for } m \in \mathbb{R},\ m \neq 0,$$

• Type 3: the light cone given by

$$-|p|^2 = p_4^2 - p_3^2 - p_2^2 - p_1^2 = 0.$$

More precisely, one verifies using the explicit description of $SO(3,1)$ in 7.3 that $\mathbb{R}^4$ decomposes exactly into the following orbits:

• Type $1^+$: $O_m^+ := O_{\hat n_0} = G_0\hat n_0$ for $\hat n_0 = (0, 0, 0, m)$,
• Type $1^-$: $O_m^- := O_{\hat n_0} = G_0\hat n_0$ for $\hat n_0 = (0, 0, 0, -m)$,
• Type 2: $O_{im} := O_{\hat n_0} = G_0\hat n_0$ for $\hat n_0 = (m, 0, 0, 0)$,
• Type $3^+$: $O_0^+ := O_{\hat n_0} = G_0\hat n_0$ for $\hat n_0 = (-1, 0, 0, 1)$,
• Type $3^-$: $O_0^- := O_{\hat n_0} = G_0\hat n_0$ for $\hat n_0 = (-1, 0, 0, -1)$,
• Type $3^0$: $O_0^0 := O_{\hat n_0} = G_0\hat n_0 = \{(0, 0, 0, 0)\}$ for $\hat n_0 = (0, 0, 0, 0)$.

The type $1^+$ orbit $O_m^+$ is called the positive mass shell, $O_m^-$ the negative mass shell and $O_0^{\pm}$ the forward resp. backward light cone. The situation is sketched in the picture:

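[Figure: the $(p_1, p_4)$-plane with the points $(0, m)$ and $(m, 0)$ marked and the orbits $O_{im}$, $O_m^+$, $O_m^-$, $O_0^+$, $O_0^-$ indicated.]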

The regularity of the semidirect product can be verified analogously as in our example 2 in the last section (see [BR] p.517/8).
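The orbit decomposition can be turned into a small classification routine: the value of $-|p|^2$ and the sign of $p_4$ decide to which orbit type a momentum $p$ belongs. The following is only a minimal sketch; the function name `orbit_type`, the string labels and the tolerance parameter are illustrative assumptions and not notation from the text.

```python
def orbit_type(p, tol=1e-12):
    """Return the orbit type of p = (p1, p2, p3, p4) as a short label."""
    p1, p2, p3, p4 = p
    q = p4 * p4 - p3 * p3 - p2 * p2 - p1 * p1   # this is -|p|^2
    if all(abs(c) <= tol for c in p):
        return "3^0"                            # the origin
    if q > tol:
        return "1+" if p4 > 0 else "1-"         # two-sheeted hyperboloid
    if q < -tol:
        return "2"                              # one-sheeted hyperboloid
    return "3+" if p4 > 0 else "3-"             # light cone without origin


# The representatives chosen in the text, here with m = 2.
for rep in [(0, 0, 0, 2), (0, 0, 0, -2), (2, 0, 0, 0),
            (-1, 0, 0, 1), (-1, 0, 0, -1), (0, 0, 0, 0)]:
    print(rep, orbit_type(rep))
```

On the type 1 and type 3 quadrics the sign of $p_4$ is constant along each orbit of the connected group $SO(3,1)_0$, which is why it can serve as the distinguishing invariant here.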

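Now, for each orbit $O$, we have to determine its little group, i.e. the stabilizing group $H_0 := G_{0\hat n_0}$.

Step 3: We get the following four cases.

$$H_0 := G_{0\hat n_0} \simeq \begin{cases} SO(3) & \text{for the type } 1^{\pm}, \text{ i.e. } \hat n_0 = (0, 0, 0, \pm m), \\ SO(2,1) & \text{for the type } 2, \text{ i.e. } \hat n_0 = (m, 0, 0, 0), \\ G^E(2) & \text{for the type } 3^{\pm}, \text{ i.e. } \hat n_0 = (-1, 0, 0, \pm 1), \\ SO(3,1)_0 & \text{for the type } 3^0, \text{ i.e. } \hat n_0 = (0, 0, 0, 0). \end{cases}$$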


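Proof: i) As in 7.3 we write the elements of $G_0 \subset SO(3,1)$ as

$$A = \begin{pmatrix} B & x \\ {}^ty & a \end{pmatrix}, \quad B \in GL(3,\mathbb{R}),\ x, y \in \mathbb{R}^3,\ a \in \mathbb{R}.$$

For $m \neq 0$, the condition

$$(0, 0, 0, m)A = (0, 0, 0, m)$$

enforces $y = 0$ and $a = 1$. Moreover, from $A \in SO(3,1)$ one deduces (via (7.21)) ${}^tBB = E_3$ and $x = 0$ and, hence, the first assertion $G_{0\hat n_0} \simeq SO(3)$.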

ii) The other assertions can be proved similarly by direct computations. But as announced above, it is more agreeable to use the fact from 7.3 that one has $SL(2,\mathbb{C}) =: \tilde G_0$ via the surjection $\rho : \tilde G_0 \to G_0$ as a double cover of the Lorentz group. In the proof of Theorem 7.6 in 7.3 we identified $\mathbb{R}^4$ via a map $\varphi$ with the space $V$ of hermitian $2 \times 2$-matrices

$${}^t(a, b, c, d) \mapsto \begin{pmatrix} -a+d & b+ic \\ b-ic & a+d \end{pmatrix},$$

and defined the map

$$\rho : SL(2,\mathbb{C}) \to GL(V), \quad A \mapsto \rho(A) \ \text{ with } \ \rho(A)X = AXA^* \ \text{ for } X \in V.$$

Hence for $m \neq 0$, the condition

$$(m, 0, 0, 0)A = (m, 0, 0, 0), \quad A \in G_0$$

can be translated into the condition

$$\rho(A)X_0 = AX_0A^* = X_0 \quad \text{for } X_0 := \varphi(m, 0, 0, 0) = \begin{pmatrix} -m & \\ & m \end{pmatrix},$$

i.e. $A \in SU(1,1)$. It is not difficult to see (Exercise 7.24) that the homomorphism $\rho : SL(2,\mathbb{C}) \to SO(3,1)_0$ restricts to a surjection $SU(1,1) \to SO(2,1)_0$.

It will be useful later to take also

$$X_0 := \varphi(0, 0, m, 0) = \begin{pmatrix} & mi \\ -mi & \end{pmatrix}.$$

By a small calculation, one verifies that the stabilizing group $\{A;\ AX_0A^* = X_0\}$ here is $SL(2,\mathbb{R})$: For $\sigma = \begin{pmatrix} & i \\ -i & \end{pmatrix}$ one has $\sigma = \sigma^{-1}$ and ${}^tA^{-1} = \sigma A\sigma^{-1}$. Hence from $A\sigma A^* = \sigma$ we deduce $\sigma^{-1}A\sigma = A^{*-1} = {}^t\bar A^{-1} = \sigma\bar A\sigma^{-1}$, i.e. $A = \bar A$.

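iii) One has $\varphi(-1, 0, 0, 1) = \begin{pmatrix} 2 & \\ & 0 \end{pmatrix} =: X_0$, and $AX_0A^* = X_0$ calls for

$$A = \begin{pmatrix} a & b \\ & a^{-1} \end{pmatrix}, \quad \text{with } a, b \in \mathbb{C},\ |a| = 1.$$

We put $a =: e^{i\vartheta/2}$, $b =: e^{-i\vartheta/2}z$, $\vartheta \in [0, 4\pi]$, $z \in \mathbb{C}$, and get the multiplication law

$$AA' = \begin{pmatrix} a'' & b'' \\ & a''^{-1} \end{pmatrix}, \quad \text{with } a'' = e^{i(\vartheta + \vartheta')/2},\ b'' = e^{-i(\vartheta + \vartheta')/2}(z + e^{i\vartheta}z').$$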


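This shows that in this case the stabilizing group $\tilde H_0 \subset \tilde G_0 = SL(2,\mathbb{C})$ is isomorphic to the semidirect product of a rotation group $\{e^{i\vartheta/2}\}$ and a translation group $\mathbb{R}^2 \simeq \mathbb{C}$ with the composition law

$$(e^{i\vartheta/2}, z)(e^{i\vartheta'/2}, z') = (e^{i(\vartheta + \vartheta')/2}, z + e^{i\vartheta}z').$$

The homomorphism $e^{i\vartheta/2} \mapsto r(\vartheta)$ makes $S^1 := \{e^{i\vartheta/2};\ 0 \leq \vartheta < 4\pi\}$ a double cover of $SO(2)$ and this extends to a surjection of $\tilde H_0$ to the Euclidean group $G^E(2)$.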
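The composition law of this little group can be compared with plain matrix multiplication, and the stabilizer property $AX_0A^* = X_0$ can be tested for $X_0 = \varphi(-1, 0, 0, 1)$. This is a minimal sketch; the names `little_group_element`, `compose_pairs`, `matmul`, `conj_transpose` and `max_diff` and the sample values are illustrative assumptions.

```python
import cmath


def matmul(p, q):
    """Product of two 2x2 matrices given as ((a, b), (c, d))."""
    (a, b), (c, d) = p
    (e, f), (g, h) = q
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def conj_transpose(m):
    """The adjoint A* of a 2x2 matrix."""
    (p, q), (r, s) = m
    return ((complex(p).conjugate(), complex(r).conjugate()),
            (complex(q).conjugate(), complex(s).conjugate()))


def little_group_element(theta, z):
    """The matrix ((a, b), (0, 1/a)) with a = e^{i theta/2}, b = e^{-i theta/2} z."""
    a = cmath.exp(1j * theta / 2)
    return ((a, cmath.exp(-1j * theta / 2) * z), (0, 1 / a))


def compose_pairs(g, h):
    """Composition law (theta, z)(theta', z') = (theta + theta', z + e^{i theta} z')."""
    (theta, z), (theta2, z2) = g, h
    return (theta + theta2, z + cmath.exp(1j * theta) * z2)


def max_diff(p, q):
    """Largest entrywise distance between two 2x2 matrices."""
    return max(abs(p[i][j] - q[i][j]) for i in range(2) for j in range(2))


# Hypothetical sample parameters.
g, h = (0.9, 1 - 2j), (2.3, -0.5 + 0.25j)
product = matmul(little_group_element(*g), little_group_element(*h))
print(max_diff(product, little_group_element(*compose_pairs(g, h))))

X0 = ((2, 0), (0, 0))   # phi(-1, 0, 0, 1)
A = little_group_element(*g)
print(max_diff(matmul(matmul(A, X0), conj_transpose(A)), X0))
```

Both printed differences should vanish up to rounding: the first encodes $b'' = ab' + ba'^{-1}$, the second uses $|a| = 1$ and the vanishing lower left entry of $A$.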

iv) In the last case there is nothing to prove.

Step 4: We already determined the unitary dual of covering groups $\tilde H_0$ of all the little groups coming in here:

i) In the first case (type $1^{\pm}$), for $\tilde H_0 = SU(2)$, the elements of the unitary dual are represented by the representations $\pi_j$, $j \in (1/2)\mathbb{N}_0$, discussed in 4.2.

ii) In the second case (type 2), one has $\tilde H_0 = SU(1,1)$ and from 6.2 we know that $SU(1,1)$ is isomorphic to $SL(2,\mathbb{R})$ and that the representations appear in three series $\pi_k^{\pm}$, $k \in \mathbb{N}$; $\pi_{is,\pm}$, $s \in \mathbb{R}$; $\pi_s$, $0 < s < 1$.

iii) For the third case (type $3^{\pm}$), as Example 7.12 to our construction procedure in 7.4 we elaborated that for the two-dimensional Euclidean group one has one-dimensional representations $\chi_k$, $k \in \mathbb{Z}$ (hence $\chi_j$, $j \in (1/2)\mathbb{Z}$ for the double cover) and infinite-dimensional representations $\pi_r$, $r > 0$.

iv) For the type $3^0$, one has the representations of the Lorentz group resp. its covering $SL(2,\mathbb{C})$ discussed in 7.3, which come in two series $\pi_{k,iv}$, $k \in \mathbb{Z}$, $v \in \mathbb{R}$; $\pi_{0,u}$, $0 < u < 2$.

Step 5 and 6: Finally, we have to extend the representations of the little group to its product with $N = \mathbb{R}^4$ and, if this product is not the whole group, proceed to an induced representation. This way we get the following representations (as usual, we list those of the covering group $\tilde G$ of the Poincaré group):

The Unitary Dual of the Poincaré Group

i) In the first case, one has $H = SU(2) \ltimes \mathbb{R}^4$. For the type $1^+$ with $m > 0$, we have for this group the representation $\pi_0$ given by

$$\pi_0((B, x)) = \pi_j(B)e^{imx_4} \quad \text{for } B \in SU(2),\ x = {}^t(x_1, x_2, x_3, x_4) \in \mathbb{R}^4.$$

We write for the induced representation

$$\mathrm{ind}_H^G \pi_0 =: \pi_{m,+;j}.$$

Similarly, the type $1^-$ leads to

$$\pi_0((B, x)) = \pi_j(B)e^{-imx_4} \quad \text{for } B \in SU(2),\ x = {}^t(x_1, x_2, x_3, x_4) \in \mathbb{R}^4$$

and we denote the induced representation by

$$\pi_{m,-;j}, \quad m > 0,\ j \in (1/2)\mathbb{N}_0.$$


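ii) In the second case, one has

$$H = SL(2,\mathbb{R}) \ltimes \mathbb{R}^4$$

and its representation

$$\pi_0((A, x)) = \pi(A)e^{imx_3} \quad \text{for } A \in SL(2,\mathbb{R}),\ x = {}^t(x_1, x_2, x_3, x_4) \in \mathbb{R}^4$$

where $\pi$ here is one of the representations of $SL(2,\mathbb{R})$ listed above in part ii) of Step 4 resp. in Section 7.2. For the corresponding induced representation we write

$$\mathrm{ind}_H^G \pi_0 =: \begin{cases} \pi_{im;k,\pm} & \text{for } \pi = \pi_k^{\pm},\ k \in \mathbb{N};\ m > 0, \\ \pi_{im;is,\pm} & \text{for } \pi = \pi_{is,\pm},\ s \in \mathbb{R};\ m > 0, \\ \pi_{im;s} & \text{for } \pi = \pi_s,\ s \in \mathbb{R},\ 0 < s < 1;\ m > 0. \end{cases}$$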

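iii) In the third case, for type $3^{\pm}$, one has

$$H = (S^1 \ltimes \mathbb{R}^2) \ltimes \mathbb{R}^4$$

and its representation

$$\pi_0((A, x)) = \pi(A)e^{i(x_1 \pm x_4)} \quad \text{for } A \in (S^1 \ltimes \mathbb{R}^2),\ x = {}^t(x_1, x_2, x_3, x_4) \in \mathbb{R}^4$$

where $\pi$ here is one of the representations of $(S^1 \ltimes \mathbb{R}^2)$ indicated in Step 4. For the corresponding induced representation we write

$$\mathrm{ind}_H^G \pi_0 =: \begin{cases} \pi_{0,\pm;j} & \text{for } \pi = \chi_j,\ j \in (1/2)\mathbb{Z}, \\ \pi_{0,\pm;r} & \text{for } \pi = \pi_r,\ r \in \mathbb{R},\ r > 0. \end{cases}$$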

iv) In the fourth case (for type $3^0$) the little group is the whole group and we get the representations of $\tilde G$ by the trivial extension of the representations $\pi$ of $SL(2,\mathbb{C})$, i.e. one has $\pi((A, x)) = \pi(A)$ for $(A, x) \in \tilde G$. For the corresponding representation we write

$$\pi =: \begin{cases} \pi_{0;j,iv,\pm} & \text{for } \pi = \pi_{j,iv,\pm},\ v \in \mathbb{R}, \\ \pi_{0;0,u} & \text{for } \pi = \pi_{0,u},\ u \in \mathbb{R},\ 0 < u < 2. \end{cases}$$

Among these, the representations $\pi_{m,+;j}$, $m > 0$, $j \in (1/2)\mathbb{N}_0$, coming from the type $1^+$ orbit and $\pi_{0,\pm;j}$, $j \in (1/2)\mathbb{Z}$, coming from the type $3^{\pm}$ orbit have a meaning for physics since they are interpreted as describing elementary particles of spin $j$ and mass $m$ resp. of zero mass.

Example 7.14: As an illustration we discuss an explicit model for the representation $\pi_{m,+;j}$ following [BR] p.522:

From the preceding discussion we take over

$$\tilde G = \tilde G_0 \ltimes N, \quad \tilde G_0 = SL(2,\mathbb{C}),\ N = \mathbb{R}^4$$

and write $\tilde G \ni g = (A, x)$, $A \in SL(2,\mathbb{C})$, $x \in \mathbb{R}^4$ (a column). A bit more explicitly than at the beginning of this section, the action of $\tilde G_0$ on $N$ is described as follows. We have the identification

$$\varphi : \mathbb{R}^4 \to V = \mathrm{Herm}_2(\mathbb{C}), \quad x = {}^t(x_1, x_2, x_3, x_4) \mapsto X = \begin{pmatrix} x_4 - x_1 & x_2 + ix_3 \\ x_2 - ix_3 & x_4 + x_1 \end{pmatrix}.$$


Via $\varphi$ the action of $\tilde G_0$ on $V$ given by $(A, X) \mapsto \rho(A)X = AXA^*$ is translated into the action of $\tilde G_0$ on $\mathbb{R}^4$ given by

$$\rho(A)x := \varphi^{-1}(\rho(A)\varphi(x)).$$

By a slight abuse of notation, as we already did above, we also take $\rho(A)$ to be a matrix from $G_0 = SO(3,1)_0$ and understand $\rho(A)x$ as multiplication of the column $x$ by the matrix $\rho(A)$. We identify the unitary dual $\hat N$ with $\mathbb{R}^4$ by writing $\hat n(n) = e^{i\,{}^tpx}$, i.e. $\hat n$ is coordinatized by (the row) ${}^tp$, $p \in \mathbb{R}^4$. Then the action of $\tilde G_0$ on $\hat N$ is simply given by the multiplication of the row ${}^tp$ by the matrix $\rho(A^{-1})$, i.e.

$${}^tp \mapsto {}^tp\,\rho(A^{-1}) =: A \cdot p.$$

The construction of the representation $\pi_{m,+;j}$ is based on the type $1^+$ orbit

$$O_m^+ = \tilde G_0 \cdot p_0, \quad p_0 = (0, 0, 0, m),\ m > 0.$$

As we remarked already, $p_0$ corresponds to $X_0 = mE_2$ and has the stabilizing group $\tilde H_0 = SU(2)$. We take the representation $\pi_j$ of $SU(2)$ and extend it to $\tilde H = \tilde H_0 \ltimes N$ to get $\pi_0$ given by

$$\pi_0((R, x)) = e^{imx_4}\pi_j(R) = e^{i\,{}^tp_0x}\pi_j(R) \quad \text{for all } R \in SU(2),\ x \in \mathbb{R}^4.$$

To get some more practice and since it is important for applications in physics, we work out the "Second Realization" of the induced representation $\mathrm{ind}_{\tilde H}^{\tilde G}\pi_0$, i.e. with a representation space consisting of functions $u$ living on the homogeneous space

$$\tilde G/\tilde H \simeq \tilde G_0/\tilde H_0 \simeq O_m^+.$$

(At this moment, one usually does not bother with any finiteness or integrability condition and simply takes smooth functions. We shall have occasion to do better in the next chapter.)

The general theory of the Iwasawa decomposition or a direct calculation shows that we have a decomposition

$$\tilde G_0 = SL(2,\mathbb{C}) \ni A = A_pR_p, \quad R_p \in SU(2),\ A_p = \begin{pmatrix} \lambda & z \\ & \lambda^{-1} \end{pmatrix},\ \lambda \in \mathbb{R}^*,\ z \in \mathbb{C}.$$

Here it is customary in physics to coordinatize $A_p$ by coordinates ${}^tp = (p_1, p_2, p_3, p_4)$ such that $p = A_p \cdot p_0$. Then one has the Mackey decomposition

$$\tilde G \ni g = s(p)h, \quad s(p) = (A_p, 0),\ h = (R_p, x) \in \tilde H$$

and the Master Equation

$$g_0^{-1}s(p) = s(g_0^{-1}(p))h^*, \quad h^* = (R^*, x^*) \in \tilde H$$

with

$$\begin{aligned} g_0^{-1}s(p) &= (A_0^{-1}, -\rho(A_0^{-1})x_0)(A_p, 0) \\ &= (A_0^{-1}A_p, -\rho(A_0^{-1})x_0) \\ &= (A_{g_0^{-1}(p)}, 0)\big((A_{g_0^{-1}(p)})^{-1}A_0^{-1}A_p,\ -\rho((A_{g_0^{-1}(p)})^{-1}A_0^{-1})x_0\big). \end{aligned}$$


This shows that one has

$$R^* = (A_{g_0^{-1}(p)})^{-1}A_0^{-1}A_p, \quad x^* = -\rho((A_{g_0^{-1}(p)})^{-1}A_0^{-1})x_0.$$

One can verify that $d\mu(p) = p_4^{-1}dp_1dp_2dp_3$ is an invariant measure on $O_m^+$ and that the Radon-Nikodym derivative is $d\mu(g_0^{-1}(p))/d\mu(p) = 1$. Hence by (7.10), in this case the representation is given by the prescription

$$\pi(g_0)u(p) = \pi_0(h^{*-1})u(g_0^{-1}p)$$

with

$$\pi_0(h^{*-1}) = e^{i\,{}^tpx_0}\pi_j(R^{*-1})$$

as one can verify by a small calculation starting from $\pi_0((R, x)) = e^{i\,{}^tp_0x}\pi_j(R)$: Namely, one has

$$h^{*-1} = (R^{*-1}, \tilde x) \quad \text{with } \tilde x = -\rho(R^{*-1})x^*$$

and hence

$${}^tp_0\tilde x = -{}^tp_0\,\rho(R^{*-1})x^* = {}^tp_0\,\rho\big(R^{*-1}(A_{g_0^{-1}(p)})^{-1}A_0^{-1}\big)x_0,$$

i.e.

$${}^tp_0\tilde x = {}^tp_0\,\rho(A_p^{-1})x_0 = {}^tpx_0$$

since we have $R^* = (A_{g_0^{-1}(p)})^{-1}A_0^{-1}A_p$.

The functions u we are treating here are vector valued functions

$$u : O_m^+ \longrightarrow V_j \simeq \mathbb{C}^{2j+1},$$

i.e. can be written as $u(p) = (u_n(p))_{n=-j,-j+1,\dots,j}$. Here we write $D^j_{nn'}$ for the matrix elements of the representation πj from our discussion in 4.2 and then come to

$$(\pi((A_0, x_0))u)_n(p) = \sum_{n'} e^{i\,{}^t p x_0}\,D^j_{nn'}(R^{*-1})\,u_{n'}(A_0^{-1}\cdot p). \tag{7.24}$$

From here one derives the interpretation that we have a free particle of spin j and mass m:

In the rest system p = p0 and under rotations (R, 0), R ∈ SU(2), equation (7.24) restricts to

$$(\pi((R, 0))u)_n(p_0) = \sum_{n'} D^j_{nn'}(R)\,u_{n'}(p_0),$$

i.e. the elementary expression of SU(2)-symmetry belonging to the representation πj. And for the infinitesimal generators Pµ = ∂µ, µ = 1, ..., 4, of the translations in the µ-th direction $x \longmapsto x^\mu(t) := x + (t\,\delta_{\mu\mu'})_{\mu'}$ one obtains from (7.24)

Pµun(p) = pµun(p).

Hence we have Mun = mun, i.e. un is an eigenfunction with eigenvalue m for the mass operator

$$M := \sqrt{P_4^2 - P_3^2 - P_2^2 - P_1^2}.$$
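The eigenvalue statement reduces to arithmetic on the orbit: replacing each Pµ by its eigenvalue pµ, the square root must return m at every point of the orbit. The following minimal sketch checks this numerically. It assumes that the orbit is the upper sheet p4² − p1² − p2² − p3² = m², p4 > 0, which matches the signature of the mass operator above; the function names are hypothetical.

```python
import math

def on_shell_momentum(m, p1, p2, p3):
    """Point of the assumed orbit p4^2 - p1^2 - p2^2 - p3^2 = m^2 with p4 > 0."""
    p4 = math.sqrt(m * m + p1 * p1 + p2 * p2 + p3 * p3)
    return (p1, p2, p3, p4)

def mass(p):
    """Value of sqrt(P4^2 - P3^2 - P2^2 - P1^2) when each P_mu acts as multiplication by p_mu."""
    p1, p2, p3, p4 = p
    return math.sqrt(p4 * p4 - p3 * p3 - p2 * p2 - p1 * p1)

# The rest momentum p0 = (0, 0, 0, m) and a boosted momentum give the same mass.
rest = on_shell_momentum(2.0, 0.0, 0.0, 0.0)
moving = on_shell_momentum(2.0, 0.3, -1.2, 0.7)
```

Both `mass(rest)` and `mass(moving)` agree with m = 2.0 up to rounding, which is the numerical shadow of Mun = mun.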


7.6 Induced Representations and Vector Bundles

The classic approach to the inducing mechanism that we have followed up to now has in the meantime found a more geometric version using the notion of bundles. As Warner remarks in [Wa] p.376, this does not simplify in any essential way the underlying analysis but has the advantage that important generalizations of the entire process immediately suggest themselves (e.g. cohomological induction). Bundles play an important role in most parts of contemporary mathematics and we shall have to use them also in our next chapter concerning the beautiful orbit method. Hence, we present here some rudiments of the theory, which nowadays appears in most books on differential geometry, analytic and/or algebraic geometry etc. For instance, for a thorough study we recommend the book [AM] by Abraham and Marsden or 5.4 in [Ki] or, more briefly, the appendices in [Ki1] or [Be]. Whoever is in a hurry to get to the applications in Number Theory may skip this at first reading. Before we try to shed some light on the notion of line, vector and Hilbert bundles, we look again, but from a different angle, at the construction of the representations of SU(2) from 4.2 and Example 7.3 in 7.1.5. We do this in the hope of creating some motivation for the bundle approach.

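We take G := SU(2) ⊃ H ≃ U(1) and $\pi_0(h) := \chi_k(\zeta) = \zeta^k$ for $h = \begin{pmatrix} \zeta & \\ & \bar\zeta \end{pmatrix} \in H$, k ∈ N0.

As we discussed in Example 7.3 in 7.1.5, we have the homogeneous space

$$X = G/H \simeq P^1(\mathbb{C}), \quad g = \begin{pmatrix} a & b \\ -\bar b & \bar a \end{pmatrix} \longmapsto (b : a)$$

and the unnormalized induction $\pi^u_k = \mathrm{ind}^{u}{}_H^G\,\chi_k$ given by left (inverse) translation on the space Hπ of smooth functions φ : G −→ C with functional equation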

φ(gh) = χk(h−1)φ(g) for all g ∈ G,h ∈ H.

(In this example everything is unimodular, hence it does not matter whether we treat unnormalized or normalized induction; in general several authors stick to unnormalized induction, though one can also treat normalized induction, as we shall see below.) For V = C we introduce as our first example of a line bundle

$$B_k := G \times_H V := \{[g, t] := (g, t)_\sim;\ g \in G,\ t \in V\},$$

where the equivalence ∼ of the pairs is defined by

(gh, t) ∼ (g, π0(h)t) for g ∈ G,h ∈ H, t ∈ V.

G acts on Bk by

g0[g, t] := [g0g, t] for all g0 ∈ G, [g, t] ∈ Bk.

One has a projection

pr : Bk −→ G/H = X , [g, t] −→ x := gH.

Obviously the projection and the action are well defined and one has the fiber over x

$$(B_k)_x := \mathrm{pr}^{-1}(x) = \{[g, t];\ gH = x\} \simeq V \simeq \mathbb{C}.$$

This leads to the picture that Bk is a bundle standing over the base space X and consisting of the fibers (Bk)x over all base points x ∈ X.

Our main object is the space Γ(Bk) of sections of Bk, i.e. maps

ϕ : X −→ Bk with pr ∘ ϕ = idX.

These sections are of the form ϕ(x) = [g, φ(g)] for x = gH with a function φ : G −→ V. Since the coset gH may be represented also by gh, h ∈ H, the well-definedness of the section asks for

[g, φ(g)] = [gh, φ(gh)] = [gh, π0(h−1)φ(g)]

where the last equation is an expression of the equivalence introduced in the definition of the bundle Bk. This leads to the fact that the function φ has to fulfill the functional equation

φ(gh) = π0(h−1)φ(g) for all h ∈ H, g ∈ G,

which we know from the construction of the representation space for the induced representation. We define an action of G on the space Γ(Bk) of sections by

$$(g_0 \cdot \phi)(gH) := g_0\,\phi(g_0^{-1}gH) = [g, \varphi(g_0^{-1}g)] \quad\text{for all } g_0 \in G,\ \phi \in \Gamma(B_k),$$

and this leads to an action of G on the space H of functions φ : G −→ V (satisfying the functional equation) defined by

$$(g_0 \cdot \varphi)(g) := \varphi(g_0^{-1}g).$$

Hence at least formally, we recover the prescription for the definition of the induced representation π. This can be completed to a precise statement if we provide our bundle with an appropriate topology. And this appears to be largely independent of the special example G = SU(2), which we proposed at the beginning. In the following, we will try to show how this works by giving some rudiments of the theory of bundles. We start by following the treatment of Warner [Wa] p.377, which is adapted to the application in representation theory, and proceed to another more general approach afterwards.
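The formal recovery of the induced representation rests on one fact: left translation preserves the functional equation φ(gh) = χk(h⁻¹)φ(g), because left and right multiplication commute. The sketch below tests this numerically for G = SU(2). It is an illustration under assumptions, not the construction of the text: elements of G are written with first row (a, b) and second row (−b̄, ā), elements of H as diagonal matrices with entries ζ and ζ̄, and the test functions are the monomials ā^(k−p) b^p, chosen here only because they visibly satisfy the functional equation. All function names are hypothetical.

```python
import cmath

def su2(a, b):
    """SU(2) matrix ((a, b), (-conj(b), conj(a))) after normalising |a|^2 + |b|^2 = 1."""
    a, b = complex(a), complex(b)
    n = (abs(a) ** 2 + abs(b) ** 2) ** 0.5
    a, b = a / n, b / n
    return ((a, b), (-b.conjugate(), a.conjugate()))

def mul(g, h):
    """Product of two 2x2 complex matrices given as nested tuples."""
    return tuple(tuple(sum(g[i][l] * h[l][j] for l in range(2)) for j in range(2))
                 for i in range(2))

def inv(g):
    """Inverse of a unitary matrix, i.e. its conjugate transpose."""
    return tuple(tuple(g[j][i].conjugate() for j in range(2)) for i in range(2))

def torus(theta):
    """Element diag(zeta, conj(zeta)) of H with zeta = exp(i*theta)."""
    z = cmath.exp(1j * theta)
    return ((z, 0j), (0j, z.conjugate()))

def chi(k, h):
    """Character chi_k(h) = zeta^k for h = diag(zeta, conj(zeta))."""
    return h[0][0] ** k

def monomial(k, p):
    """Illustrative test function g -> conj(a)^(k-p) * b^p (an assumption of this sketch)."""
    def f(g):
        a, b = g[0][0], g[0][1]
        return a.conjugate() ** (k - p) * b ** p
    return f

def translate(g0, f):
    """Left translation (g0 . f)(g) = f(g0^{-1} g)."""
    g0_inv = inv(g0)
    return lambda g: f(mul(g0_inv, g))

def defect(k, f, g, h):
    """|f(gh) - chi_k(h^{-1}) f(g)|, which vanishes iff the functional equation holds at (g, h)."""
    return abs(f(mul(g, h)) - chi(k, inv(h)) * f(g))

k, p = 3, 1
g = su2(0.3 + 0.4j, 0.5 - 0.2j)
g0 = su2(1 + 2j, -0.7j)
h = torus(0.9)
phi = monomial(k, p)
```

With these values both `defect(k, phi, g, h)` and `defect(k, translate(g0, phi), g, h)` vanish up to rounding, so the translated function lies in the same space.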

Bundles, First Approach

Definition 7.2: Let X be a fixed Hausdorff topological space. A bundle B over X is a pair (B, p) where B is a Hausdorff topological space and p is a continuous open map of B onto X.

One calls X the base space, B the bundle space, and p the bundle projection. By abuse of notation the pair (B, p) is often abbreviated by the letter B to indicate the bundle.

For x ∈ X, the set $p^{-1}(x)$ is called the fiber over x and often denoted by Bx. A cross section or simply section of the bundle is a map s : X −→ B with p ∘ s = idX.

If b ∈ B and s(p(b)) = b, then we say that s passes through b. If to every b ∈ B there is a continuous cross section s passing through b, then the bundle is said to have enough cross sections.


A Hilbert bundle B over X is a bundle (B, p) over X together with operations and a norm making each fiber Bx, x ∈ X, into a complex Hilbert space and satisfying the following conditions:

1.) The map B −→ R, b −→ ‖b‖ is continuous,
2.) The operation of addition is a continuous map of {(b1, b2) ∈ B × B; p(b1) = p(b2)} into B,
3.) The operation of scalar multiplication is a continuous map of C × B into B,
4.) The topology of B is such that its restriction to each fiber Bx is the norm topology of the Hilbert space Bx. (This condition is formulated more precisely in [Wa] p.377.)

Example 7.15: Let E be any Hilbert space. The trivial bundle with constant fiber E is given by B := X × E equipped with the product topology and the projection p given by p(x, a) = x for x ∈ X, a ∈ E. The cross sections are given by the functions s : X −→ E.

Let B = (B, p) be a Hilbert bundle over X and Γ(B) the set of continuous cross sections of B. Then it is clear that Γ(B) is a complex vector space under pointwise addition and scalar multiplication. In the application to representation theory this space shall provide the representation space of a representation. So we need a topology on it. And here things start to be a bit technical. Following [Wa], we assume that X is locally compact and for Γ(B) define the uniform on compacta topology. For our purposes it is sufficient that in particular this leads to the consequence that a sequence (sn) in Γ(B) converges to a continuous cross section s iff for each compact subset C ⊂ X one has $\sup_{x \in C}\|s_n(x) - s(x)\| \longrightarrow 0$ for n −→ ∞. (In [Wa] the sequence is replaced by the more general notion of net but we will not go into this here.)

The integration theory in this context is still more delicate. So for details, we also refer to [Wa]. For our purpose it will suffice to fix a measure µ on X and think of $L^q(B, \mu)$ as the space of sections s of B with $\|s\|_q := \bigl(\int_X \|s(x)\|^q\,d\mu(x)\bigr)^{1/q} < \infty$. Then the inner product on the space of sections is given by

$$\langle s, t\rangle := \int_X \langle s(x), t(x)\rangle\,d\mu(x) \quad\text{for all } s, t \in L^2(B, \mu).$$

Let G be a linear group, H a closed subgroup, δ the quotient of the modular functions of G and H treated in 7.1 (normalized by the condition δ(1) = 1) and π0 a unitary representation of H on the Hilbert space V.

We define a continuous right action of H on the product space G × V as follows:

$$(g, a)h := (gh,\ \delta(h)^{1/2}\pi_0(h^{-1})a) \quad\text{for all } g \in G,\ a \in V.$$

The orbit under H of a pair (g, a) ∈ G × V will be denoted by [g, a] := (g, a)H. The space of all such orbits equipped with the quotient topology will be called E(G, H, π0) or simply E. This space is Hausdorff and the natural projection map

p : E −→ G/H =: X , [g, a] = (g, a)H −→ gH =: x

is a continuous and open surjection. Therefore E := (E, p) is a bundle over X = G/H, which becomes a Hilbert bundle by the following natural assignments:

We define addition and scalar multiplication in each fiber $E_x := \{[g, a];\ a \in V\}$ by

c1[g, a1] + c2[g, a2] := [g, c1a1 + c2a2] for all c1, c2 ∈ C

and the scalar product by

$$\langle [g, a_1], [g, a_2]\rangle_{E_x} := \delta(g)^{-1}\,\langle a_1, a_2\rangle.$$

Obviously, there are some verifications to be done. Whoever does not want to do these on their own can look into [Wa] p.380/1. The bundle is called the Hilbert bundle induced by π0 (and δ) and denoted by E(π0) or again E. G acts continuously from the left on E by

g0[g, a] := [g0g, a].

Finally, we rediscover the normalized induced representation $\pi = \mathrm{ind}_H^G\,\pi_0$ of G as a representation by unitary operators on the space of square integrable sections of E: To this end, one first observes that there exists a one-to-one correspondence between cross sections ϕ for E = E(π0) and functions φ : G −→ V fulfilling the functional equation

$$\varphi(gh) = \delta(h)^{1/2}\pi_0(h^{-1})\varphi(g) \quad\text{for all } g \in G,\ h \in H.$$

The correspondence is given by

ϕ(gH) = (g, φ(g))H.

If φ and ϕ are connected in this way, one says that φ is the unitary form of ϕ and ϕ is the bundle form of φ. Evidently a cross section ϕ is continuous on X = G/H iff its unitary form φ is continuous on G. It is a bit less evident but true (see [Wa] p.381) that E(π0) admits enough cross sections. Therefore it makes sense to consider the Hilbert space

Hπ0 := L²(E, µ)

where µ is the quasi-invariant measure on X associated with δ and the ρ-function on X as in 7.1. (It can be shown that the construction is independent of the choice of the function δ.) By a (small) abuse of notation, one denotes the space of unitary forms φ of elements ϕ in the space of sections as well by Hπ0. This space inherits the Hilbert space structure and has as scalar product

$$\langle \varphi, \tilde\varphi\rangle := \int_X \delta(s(x))^{-1}\,\langle \varphi(s(x)), \tilde\varphi(s(x))\rangle\,d\mu(x)$$

where s denotes the Mackey section we introduced in 7.1.

The bundle theoretic definition of the unitarily induced representation π is as follows: For g0 ∈ G and ϕ ∈ Hπ0, put

$$\pi(g_0)\phi(gH) := g_0\,\phi(g_0^{-1}gH)$$

and correspondingly for the unitary form φ of ϕ

$$\pi(g_0)\varphi(g) = \varphi(g_0^{-1}g).$$

Some routine verifications parallel to those we did in 7.1 (see [Wa] p.382) show that we have got a unitary representation equivalent to the one constructed in 7.1. And one should get the feeling that all this is realized by the example of the (complex!) bundle E = Bk over X = P1(C) from the beginning of this section. We will say more when we come back to this at the end of the section.

As Warner remarks ([Wa] p.382), the preceding bundle-theoretic construction makes no use of the local triviality of the constructed bundle. Moreover, “the definition of Hilbert bundle is, strictly speaking, not in accord with the customary version - however, it has the advantage of allowing to place Hilbert space structures on the fibers without having to worry about transition functions”. As we shall need this later, we present some elements of this other way to treat bundles.

Bundles, Second Approach

Here we need the notion of a differentiable manifold, which already was alluded to in 6.2 and is to be found in most text books on advanced calculus. We recommend in particular the Appendix II in [Ki1] (or the Appendix A in [Be]).

Definition 7.3: A p-manifold M (or a p-dimensional manifold) is a Hausdorff topological space that admits a covering by open sets Ui, i ∈ I, endowed with one-to-one continuous maps ϕi : Ui −→ Vi ⊂ Rp.
M is called separable if it can be covered by a countable system of such open sets.
A smooth resp. analytic p-manifold is a separable p-manifold so that all maps

$$\phi_{i,j} := \phi_i \circ \phi_j^{-1}\,|_{\phi_j(U_i \cap U_j)}$$

are infinitely differentiable (resp. analytic).

We use the following terminology:

– the pairs (Ui, ϕi) (and often also the sets Ui) are called charts;

– for m ∈ Ui =: U and ϕi =: ϕ the members x1, . . . , xp of the p-tuple ϕ(m) =: x ∈ Rp are called local coordinates;

– the functions ϕi,j are called transition functions;

– the collection (Ui, ϕi)i∈I is called an atlas on M.

It has to be clarified which atlases are really different: We say that two atlases (Ui)i∈I and (U′j)j∈J are equivalent if the transition functions from any chart of the first atlas to any chart of the second one are smooth (resp. analytic). The structure of a smooth (resp. analytic) manifold on M is an equivalence class of atlases.

When dealing with a smooth manifold, we use only one atlas keeping in mind that wecan always replace it by an equivalent one.

In parallel to the notions just given for real manifolds, one defines a smooth (or holomorphic) complex p-manifold where Rp is replaced by Cp and the transition functions are required to be holomorphic, i.e. complex differentiable.


Examples:

Example 7.16: M = Rp and open subsets are real p−manifolds.

Example 7.17: M ⊂ Rq is a p-dimensional submanifold if for any m ∈ M there are an open set V ⊂ Rq and differentiable functions fi : V −→ R, i = 1, . . . , q − p, such that

– a) $M \cap V = \{x \in V;\ f_1(x) = \dots = f_{q-p}(x) = 0\}$,
– b) the functional matrix $(\partial f/\partial x)$ has rank q − p for all x.

The hyperboloids appearing as orbits in our discussion of the representations of the Poincaré group in 7.5 are three-dimensional smooth submanifolds of R4.

Example 7.18: M = P1(C) (as in Example 7.3 in 7.1.5 and the beginning of this section) is a smooth one-dimensional complex manifold:
We cover P1(C) by two open subsets $U := \{(u : v);\ u, v \in \mathbb{C},\ v \neq 0\}$ and $U' := \{(u : v);\ u, v \in \mathbb{C},\ u \neq 0\}$. We have the homeomorphisms

$$\begin{aligned}
\phi &: U \longrightarrow \mathbb{C}, & (u : v) &\longmapsto u/v =: w \in \mathbb{C}, \\
\phi' &: U' \longrightarrow \mathbb{C}, & (u : v) &\longmapsto v/u =: w' \in \mathbb{C}.
\end{aligned} \tag{7.25}$$

And the map $\phi' \circ \phi^{-1}\,|_{w \neq 0}\,(w) = 1/w = w'$ is complex differentiable. So here one has an atlas consisting of two charts U and U′.

Certainly, at the same time, P1(C) is an example for a two-dimensional real manifold.
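The two charts of (7.25) and their transition map are easy to experiment with. The sketch below represents a point (u : v) of P1(C) by a pair of complex numbers and checks two things: the chart values do not depend on the chosen representative, and the transition map is w −→ 1/w. The function names are hypothetical and the sample point is arbitrary.

```python
def chart(point):
    """phi: U -> C, (u : v) -> u / v, defined where v != 0."""
    u, v = point
    return u / v

def chart_prime(point):
    """phi': U' -> C, (u : v) -> v / u, defined where u != 0."""
    u, v = point
    return v / u

def transition(w):
    """phi' o phi^{-1} for w != 0, using the representative (w : 1) of phi^{-1}(w)."""
    return chart_prime((w, 1))

point = (2 + 1j, 1 - 3j)                       # a point of U and U' at the same time
lam = 0.5 - 2j                                 # any nonzero scalar
scaled = (lam * point[0], lam * point[1])      # another representative of the same point
w = chart(point)
```

Here `chart(scaled)` equals `chart(point)` up to rounding, and `transition(w)` agrees both with `1 / w` and with `chart_prime(point)`.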

Example 7.19: Show as Exercise 7.25 that M is a smooth p-dimensional (real) manifold where

$$M = P^p(\mathbb{R}) := \{[u_1 : \dots : u_{p+1}] = (u_1, \dots, u_{p+1})_\sim;\ (u_1, \dots, u_{p+1}) \in \mathbb{R}^{p+1}\setminus\{0\}\}$$

with $(u_1, \dots, u_{p+1}) \sim (u'_1, \dots, u'_{p+1})$ iff there exists a λ ∈ R∗ with $u'_i = \lambda u_i$ for all i.

If M is a smooth p-manifold and M′ a smooth p′-manifold, there is a natural notion of a smooth map F : M −→ M′. Namely, for any point m ∈ M with F(m) = m′ and any charts U ∋ m and U′ ∋ F(m) = m′, the local coordinates of m′ must be smooth functions of the local coordinates of m ∈ U.
A smooth map from one p-manifold to another that has a smooth inverse map is called a diffeomorphism. Two manifolds are called diffeomorphic if there is a diffeomorphism from one to the other.

A manifold is called oriented iff it has an oriented atlas, i.e. an atlas where for any two charts with coordinates x and y the determinant of the functional matrix $(\partial x/\partial y)$ is positive whenever it is defined.

As already said, there is much more to be discussed in this context (and we refer again to e.g. [Ki1]) but, we hope, this rudimentary presentation suffices to gain an understanding of our second approach to bundles:


Definition 7.4: E is called a (fiber) bundle over the base X with fiber F iff E, F and X are smooth manifolds and there is given a smooth surjective map p : E −→ X such that locally E is a direct product of a part of the base and the fiber F.

More precisely, we require that any point x ∈ X has a neighbourhood Ux such that $E_{U_x} := p^{-1}(U_x)$ can be identified with Ux × F via a smooth map hx so that, for $m \in E_{U_x}$, one has hx(m) = (x′, v) if x′ = p(m), i.e. the following diagram (where p1 denotes the projection to the first factor) is commutative

$$\begin{array}{ccc}
p^{-1}(U_x) & \xrightarrow{\ h_x\ } & U_x \times F \\
\downarrow p & & \downarrow p_1 \\
U_x & \xrightarrow{\ \sim\ } & U_x.
\end{array}$$

From the definition it follows that all sets $E_x := p^{-1}(x)$, x ∈ X, called fibers, are smooth submanifolds diffeomorphic to F.
One often writes $E \xrightarrow{F} X$ to denote a bundle with fiber F and base X.
There is the natural notion of a map between bundles: Let $E \xrightarrow{F} X$ and $E' \xrightarrow{F'} X'$ be two bundles. Then a bundle map is given by the pair of smooth maps

Φ : E −→ E ′, f : X −→ X ′

such that the following diagram is commutative

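$$\begin{array}{ccc}
E & \xrightarrow{\ \Phi\ } & E' \\
p \downarrow & & \downarrow p' \\
X & \xrightarrow{\ f\ } & X'.
\end{array}$$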

Two bundles over the same base X are called equivalent if there is a bundle map with f = idX and Φ a diffeomorphism.

A bundle $E \xrightarrow{F} X$ is called a trivial bundle if it is equivalent to the bundle $X \times F \xrightarrow{F} X$.

We are mainly interested in vector or line bundles where the fiber is F := V a vector space (with dim V = n = 1 for the case of a line bundle) and the maps hx are linear in each fiber $p^{-1}(x')$, x′ ∈ Ux.
Primarily, we think of real vector spaces and real manifolds but we also will treat complex cases.

In any case, the dimension of the fiber as a vector space is called the rank of the vector bundle.


To make this more explicit and show a way to construct bundles, we take an atlas of the real manifold X given by open sets Ui (i ∈ I), and V = Rn. Then we have homeomorphisms

$$h := h_i : E_{U_i} = p^{-1}(U_i) \longrightarrow U_i \times \mathbb{R}^n,$$

which a) respect the fibers, i.e. with

$$h(m) = (x, v) \quad\text{for all } m \in E_x = p^{-1}(x),\ x \in U_i,$$

and b) restrict to vector space isomorphisms

$$h\,|_{E_x} : E_x \longrightarrow \mathbb{R}^n.$$

We denote by

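$$h_{i,j} := h_i \circ h_j^{-1}\,|_{(U_i \cap U_j) \times \mathbb{R}^n} : (U_i \cap U_j) \times \mathbb{R}^n \longrightarrow (U_i \cap U_j) \times \mathbb{R}^n$$

the bundle transition maps and introduce (differentiable) functions, which are called bundle transition functions

$$\psi_{i,j} : U_i \cap U_j \longrightarrow GL(n, \mathbb{R}) \simeq \mathrm{Aut}(V)$$

by

$$h_{i,j}(x, v) = (x, \psi_{i,j}(x)v) \quad\text{for all } (x, v) \in (U_i \cap U_j) \times \mathbb{R}^n.$$

It is not difficult to see that these ψi,j are uniquely determined and for i, j, k with

$$U_i \cap U_j \cap U_k \neq \emptyset$$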

fulfill the following cocycle condition

(7.26) ψi,j ψj,k = ψi,k.
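The cocycle condition follows in one line from the definition of the bundle transition maps. A short check, using only the definitions of hi,j and ψi,j given above:

```latex
% On (U_i \cap U_j \cap U_k) \times \mathbb{R}^n:
h_{i,j}\circ h_{j,k}
  = (h_i\circ h_j^{-1})\circ(h_j\circ h_k^{-1})
  = h_i\circ h_k^{-1}
  = h_{i,k},
\qquad\text{hence}\qquad
\bigl(x,\ \psi_{i,j}(x)\,\psi_{j,k}(x)\,v\bigr) = \bigl(x,\ \psi_{i,k}(x)\,v\bigr)
\quad\text{for all } v \in \mathbb{R}^n .
```

Taking i = j = k gives ψi,i = 1, and taking k = i then gives ψi,j ψj,i = 1.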

Here, our main point is that we can construct bundles with fiber Rn over a manifold with atlas (Ui)i∈I by pasting together columns Ui × Rn via such a given cocycle system ψi,j. We have the following construction principle, whose validity is not too difficult to prove.

Proposition 7.4: Let X be a manifold with atlas (Ui)i∈I and ψi,j : Ui ∩ Uj −→ GL(n, R) be a system of smooth functions with

ψi,i = 1 in Ui,

ψi,j ψj,i = 1 in Ui ∩ Uj ,

ψi,j ψj,k ψk,i = 1 in Ui ∩ Uj ∩ Uk.

Let Ẽ be the union of the manifolds Ui × Rn. One says that points (xi, vi) ∈ Ui × Rn and (xj, vj) ∈ Uj × Rn are equivalent if xi = xj and vi = ψi,j(xj)vj. Then the factor space E of Ẽ defined by this equivalence relation is a vector bundle with fiber Rn over X with projection p induced by the natural projections (xi, vi) −→ xi. And each such fiber bundle is equivalent to a bundle of this type.


Proposition 7.5: Let E and E′ be vector bundles of the same rank n over X defined by the systems of bundle transition functions ψi,j and ψ′i,j (without loss of generality, by a refinement argument, one can assume that both bundles are defined by the same atlas (Ui)i∈I). Then both bundles are equivalent iff there is a family of matrix functions fi : Ui −→ GL(n, R) such that one has

$$\psi'_{i,j} = f_i^{-1}\,\psi_{i,j}\,f_j \quad\text{for all } i, j \in I.$$
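As a quick consistency check of this criterion (not a proof of the proposition), one can verify that a family ψ′i,j obtained from a cocycle ψi,j by the formula above satisfies the cocycle condition (7.26) again:

```latex
\psi'_{i,j}\,\psi'_{j,k}
  = (f_i^{-1}\,\psi_{i,j}\,f_j)(f_j^{-1}\,\psi_{j,k}\,f_k)
  = f_i^{-1}\,\psi_{i,j}\,\psi_{j,k}\,f_k
  = f_i^{-1}\,\psi_{i,k}\,f_k
  = \psi'_{i,k}
\quad\text{on } U_i \cap U_j \cap U_k .
```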

As in our first approach, for open sets U ⊂ X we introduce (vector) spaces Γ(U, E) of sections s over U, i.e. smooth maps s : U −→ E with p ∘ s = idU. For U = X we talk of global sections.
Parallel to our treatment of real bundles, one can look at the complex case by replacing R by C appropriately.

Examples:

Example 7.20: The most prominent example, which led to the introduction of the notion of bundles, is the tangent bundle TM, which comes about if for an n-dimensional manifold M with charts (Ui, ϕi), i ∈ I, as matrix of bundle transition functions one takes the Jacobian J of the manifold transition functions ϕi,j

$$\psi_{i,j}(m) := J_{\phi_{i,j}}(\phi_j(m)).$$

Then TM can be thought of as consisting of classes of pairs (m, a)∼, m ∈ M, a ∈ Rn, with (m, a) ∼ (m, b) iff a and b are related by the contravariant transformation property

$$b_i = \sum_{j=1}^n \frac{\partial y_i}{\partial x_j}(y)\,a_j$$

when m appears in the chart (U, ϕ) with coordinates x = (x1, . . . , xn) and simultaneously in the chart (U′, ϕ′) with coordinates y = (y1, . . . , yn). This has a geometrical background: in first approximation, we take the manifold M and to each point m ∈ M associate as a fiber Em the tangent space TmM to M at this point m. From higher calculus one knows that there are different ways to describe the elements Xm ∈ TmM, i.e. the tangent vectors to a manifold. We follow the one already prepared in 6.2 while discussing the definition of the Lie algebra associated to a linear group and think of the tangent space of M at m as the space of all tangent vectors of all smooth curves γ in M passing through m. More precisely, a tangent vector Xm to M at the point m is defined as an equivalence class of smooth parametrized curves γ = γ(t) passing through m. Two curves γ and γ̃ in M with γ(0) = γ̃(0) = m are equivalent iff their tangent vectors in m coincide in the coordinates of a chart (U, ϕ) with m ∈ U, i.e. one has γ ∼ γ̃ if there is a chart (U, ϕ) with m ∈ U and coordinates ϕ(m) = x such that

$$\frac{d}{dt}\phi(\gamma(t))\,\Big|_{t=0} = \frac{d}{dt}\phi(\tilde\gamma(t))\,\Big|_{t=0}.$$

With respect to this chart, a tangent vector can be symbolized by the n-tuple of real numbers ai where ai is just the i-th component of $\frac{d}{dt}\phi(\gamma(t))|_{t=0}$ or by the differential operator $X_m = \sum_{i=1}^n a_i\,\partial_{x_i}$ acting on smooth functions f in the coordinates x. Addition and scalar multiplication of these operators resp. tuples define a vector space structure, which can and will be carried over to the first given description of the space TmM.


If (U′, ϕ′) with coordinates y is another chart for m and (b1, . . . , bn) are the components of the tangent vector to γ expressed in this coordinate system, by the chain rule we have with $\phi'(\phi^{-1}(x)) = y(x)$

$$b_i = \Bigl(\frac{d}{dt}\phi'(\gamma(t))\,\Big|_{t=0}\Bigr)_i = \Bigl(\frac{d}{dt}\phi'(\phi^{-1}(\phi(\gamma(t))))\,\Big|_{t=0}\Bigr)_i = \sum_j \frac{\partial y_i}{\partial x_j}\,a_j,$$

i.e. the contravariant transformation property from our general construction above.
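The chain-rule computation can be tested on a concrete pair of charts. The sketch below is an illustration under assumptions chosen here, not taken from the text: the manifold is the punctured plane, the x-chart uses polar coordinates (r, θ), the y-chart uses Cartesian coordinates, and the curve is given in the x-chart as t −→ (1 + t, 0.5 + 2t). Derivatives at t = 0 are approximated by central differences.

```python
import math

def to_cartesian(x):
    """Coordinate change y(x) from polar x = (r, theta) to Cartesian y = (y1, y2)."""
    r, theta = x
    return (r * math.cos(theta), r * math.sin(theta))

def jacobian(x):
    """Matrix (dy_i / dx_j) of the coordinate change at the point with polar coordinates x."""
    r, theta = x
    return ((math.cos(theta), -r * math.sin(theta)),
            (math.sin(theta), r * math.cos(theta)))

def curve_x(t):
    """The curve in the polar chart; this particular curve is an arbitrary choice."""
    return (1.0 + t, 0.5 + 2.0 * t)

def velocity(curve, t=0.0, h=1e-6):
    """Central-difference approximation of d/dt curve(t)."""
    plus, minus = curve(t + h), curve(t - h)
    return tuple((u - v) / (2 * h) for u, v in zip(plus, minus))

a = velocity(curve_x)                                   # components a_j in the x-chart
b_direct = velocity(lambda t: to_cartesian(curve_x(t)))  # components b_i in the y-chart
J = jacobian(curve_x(0.0))
b_chain = tuple(sum(J[i][j] * a[j] for j in range(2)) for i in range(2))
```

The tuples `b_direct` and `b_chain` agree up to the discretization error, which is the contravariant transformation property for this example.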

Definition 7.5: Global sections X : M −→ TM of the tangent bundle TM are called vector fields on M.

Let (U, ϕ) be a chart with coordinates x = (x1, . . . , xn). With a (now usual) slight misuse of notation we write

$$X\,|_U =: \sum_{i=1}^n a_i(x)\,\partial_{x_i}$$

where the coefficients ai are smooth functions defined in U. For a chart (U′, ϕ′) with coordinates y = (y1, . . . , yn), one has as well

$$X\,|_{U'} = \sum_{j=1}^n b_j(y)\,\partial_{y_j}$$

and on U ∩ U′ ≠ ∅ we have the contravariant transformation property

$$b_j(y) = \sum_i \frac{\partial y_j}{\partial x_i}(y)\,a_i(x(y)).$$

V (M) denotes the vector space of smooth vector fields X on M .

Example 7.21: The cotangent bundle to M has as its fibers the cotangent spaces $T_m^*M$, i.e. the duals to the tangent spaces TmM. It turns up if, in the same situation as above in Example 7.20, as matrix of bundle transition functions we take the transpose of the Jacobian of the map $\phi_j \circ \phi_i^{-1} =: \phi_{j,i}$, i.e.

$$\psi_{i,j}(m) = {}^t J_{\phi_{j,i}}(\phi_i(m)).$$

The global sections α : M −→ T∗M of the cotangent bundle T∗M are called differential forms of first degree or simply one-forms on M. As a parallel to the notation for vector fields above, with functions $a_i^*$ and $b_j^*$ defined in U resp. U′ we write

$$\alpha\,|_U = \sum_i a_i^*(x)\,dx_i, \quad\text{resp.}\quad \alpha\,|_{U'} = \sum_j b_j^*(y)\,dy_j,$$

and here have the covariant transformation property

$$b_j^*(y) = \sum_i \frac{\partial x_i}{\partial y_j}(x(y))\,a_i^*(x(y)). \tag{7.27}$$

Ω1(M) denotes the vector space of one-forms α on M .


Example 7.22: The first two examples concerned real bundles but similarly one can treat the complex case. Here we take the one-dimensional complex manifold M = P1(C) covered by the two charts (U, ϕ) and (U′, ϕ′) fixed above in (7.25) while discussing M as a manifold. For k ∈ Z we get a line bundle Lk by choosing $\psi(w) = w^{-k}$ as bundle transition function between these two charts. Let us work out the construction of this bundle along the line of the general construction principle given above: We have two “columns”, one over U and one over U′, namely

$$L := \{(w, t);\ w, t \in \mathbb{C}\}, \qquad L' := \{(w', t');\ w', t' \in \mathbb{C}\}.$$

Using ψ we paste them together over U ∩ U′ = C∗ by identifying (w, t) ∼ (w′, t′) iff

$$w' = 1/w, \qquad t' = w^{-k}t.$$

Remark 7.9: It can be shown that up to (holomorphic) equivalence there are no other holomorphic line bundles over P1(C) than these Lk.

Proposition 7.6: For the space of holomorphic sections one has

$$\dim_{\mathbb{C}} \Gamma(L_k) = \begin{cases} k + 1 & \text{for } k \geq 0, \\ 0 & \text{for } k < 0. \end{cases}$$

Proof: Let s be a section of Lk with

$$s : (u : v) \longmapsto s((u : v)) \in L_k,$$

where h(s((u : v))) = (w, t) ∈ L for w = u/v, v ≠ 0, and h′(s((u : v))) = (w′, t′) ∈ L′ for w′ = v/u, u ≠ 0. The section is holomorphic iff t is given by a holomorphic function in w and t′ by a holomorphic function in w′ and we have the compatibility condition $t' = w^{-k}t$. As examples and prototypes for such holomorphic functions we take the monomials $t = w^p$ and $t' = w'^{\,p'}$ with p, p′ ∈ N0. The compatibility condition says for w ≠ 0

$$t'(w') = w'^{\,p'} = w^{-k}\,t(w) = w^{-k+p}$$

and, since one has w′ = 1/w, we get $w^{-p'} = w^{p-k}$, i.e. p′ = k − p, and see that everything is consistent for p = 0, 1, . . . , k if k is not negative. And the holomorphicity of the relation breaks down for negative k. Hence we have k + 1 essentially different holomorphic sections for non-negative k and none for negative k. One needs a bit more complex function theory to be sure that there are no more (see for instance [Ki] p.80).
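The counting in this proof can be replayed mechanically. The sketch below lists the pairs of exponents (p, p′) for which the monomials t = w^p and t′ = w′^p′ are compatible, and checks the compatibility condition t′ = w^(−k) t numerically at a sample point. It covers only the monomial sections considered in the proof; the function names and the sample point are hypothetical choices.

```python
def monomial_sections(k):
    """Exponent pairs (p, p') with p, p' >= 0 and p' = k - p; empty for negative k."""
    return [(p, k - p) for p in range(0, k + 1)]

def compatible(k, p, p_prime, w):
    """Check t'(w') = w^(-k) * t(w) at w != 0 for t = w^p, t' = w'^p', w' = 1/w."""
    w_prime = 1 / w
    return abs(w_prime ** p_prime - w ** (-k) * w ** p) < 1e-9

sample = 0.7 + 0.4j
sections_k3 = monomial_sections(3)      # four pairs, matching dim = k + 1 = 4
sections_neg = monomial_sections(-2)    # no pairs for negative k
```

For k = 3 every listed pair passes `compatible(3, p, p_prime, sample)`, whereas a pair with p + p′ ≠ k, such as (1, 1), fails.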

Bundles and Representations

After this short digression into the world of bundles, which will be useful later, we return to the problem of the construction of group representations. As we know from our first approach to bundles, the space of sections is meant to provide a representation space and, hence, should get the structure of a Hilbert space. In general this needs some work, which we shall attack in the next chapter. But if the space of sections is a finite-dimensional vector space, this is no problem. To get an idea we return to our example G = SU(2) ⊃ H ≃ U(1) and the unitary representation $\pi_j = \mathrm{ind}_H^G\,\pi_0$ induced from the representation π0 = χk, k = 2j ≥ 0, of H ≃ U(1). In our first approach we constructed the bundle space $B_k = \{[g, t];\ g \in G,\ t \in \mathbb{C}\}$ with projection pr onto X = G/H = P1(C) given by

$$\mathrm{pr}([g, t]) := gH = (u : v) = (b : a) \quad\text{for}\quad g = \begin{pmatrix} a & b \\ -\bar b & \bar a \end{pmatrix}.$$

We define a map of Bk to the line bundle Lk constructed by pasting together the sets $L = \{(w, t);\ w, t \in \mathbb{C}\}$ and $L' = \{(w', t');\ w', t' \in \mathbb{C}\}$ via the bundle transition function $\psi(w) = w^{-k}$ as follows: For g with a ≠ 0 we put

$$B_k \ni [g, t] \longmapsto (w = b/a,\ b^k t) \in L$$

and for g with b ≠ 0

$$B_k \ni [g, t] \longmapsto (w' := a/b,\ a^k t) \in L'.$$

Both maps are well defined since for $h = \begin{pmatrix} \zeta & \\ & \zeta^{-1} \end{pmatrix}$ one has

$$gh^{-1} = \begin{pmatrix} a\zeta^{-1} & b\zeta \\ -\bar b\zeta^{-1} & \bar a\zeta \end{pmatrix}$$

and

$$\pi_0(h) = \zeta^k.$$

Hence application of the mapping prescription to the element $[gh^{-1}, \pi_0(h)t]$, which as well represents [g, t], leads to the same images. And the images in L and L′ are consistent with the pasting condition $(w', t') = (1/w, w^{-k}t)$ for w ≠ 0, since with w = b/a one has $w^{-k}(b^k t) = a^k t$. Thus both maps induce a map of Bk to Lk, which obviously is a bijection.

Here we have a complex line bundle over the one-dimensional complex manifold P1(C) whose space of holomorphic sections is a complex vector space and the representation space for the representation πj, 2j = k, of G = SU(2). In principle, the base spaces for the bundles in our first approach are real manifolds and the section spaces real vector spaces. This example shows that we may get the wanted irreducible representation by some more restrictions, namely we put more structure on the real construction to get the more restrictive complex case (as in holomorphic induction). We shall see more of this under an angle provided from the necessities of physics in the next chapter.

Chapter 8

Geometric Quantization and the Orbit Method

The orbit method is an object of pure mathematics, but it got a lot of impetus from physics under the heading of geometric quantization. Historically it was proposed by Kirillov in [Ki2] for the description of the unitary dual of nilpotent Lie groups. By further work of Kirillov and many others, in particular B. Kostant, M. Duflo, M. Vergne and D. Vogan, it grew into an important tool for explicit constructions of representations of various types of groups. The construction mechanism uses elements that also appear in theoretical physics when describing the transition from classical mechanics to quantum mechanics. We therefore start by recalling some of this background, even though we cannot give all the necessary definitions.

8.1 The Hamiltonian Formalism and its Quantization

The purpose of theoretical mechanics is to describe the time development of the state of a physical system. In classical mechanics such a state is identified with a point in an n-dimensional manifold Q, the configuration space. The point is usually described by local coordinates q1, . . . , qn, called position variables. The time development of the system then is described by a curve γ given in local coordinates as t −→ (qi(t)), which is fixed by an ordinary differential equation of second order. In the Hamiltonian formalism one adds to the position variables q = (q1, . . . , qn) the momentum variables p = (p1, . . . , pn) (related to the velocity of the particle or system, we do not need to be more exact here) and the time development is described by the hypothesis that the curve t −→ (q(t), p(t)) =: γ(t) has to fulfill Hamilton’s equations, i.e. the first order system of ordinary differential equations

$$\dot q_i := \frac{dq_i}{dt} = \frac{\partial H}{\partial p_i}, \qquad \dot p_i := \frac{dp_i}{dt} = -\frac{\partial H}{\partial q_i} \tag{8.1}$$

where the function H = H(q, p, t) is the Hamiltonian of the system, i.e. encodes all the relevant information about the system. For instance, if our system is simply a mass one (point) particle in R3 moving under the influence of a force with a radially symmetric potential $V(q) = 1/r$, $r = \sqrt{q_1^2 + q_2^2 + q_3^2}$, one has $H = p_1^2 + p_2^2 + p_3^2 + V(q)$.
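To see Hamilton's equations (8.1) at work, one can integrate them numerically. The sketch below is an illustration under assumptions chosen here: instead of the 1/r potential of the text it uses the one-dimensional harmonic oscillator H = (p² + q²)/2, whose exact solution through (q, p) = (1, 0) is (cos t, −sin t), and it uses the leapfrog (Störmer-Verlet) scheme, which is one common choice for separable Hamiltonians H = T(p) + V(q). The function names are hypothetical.

```python
import math

def leapfrog(q, p, dV_dq, dT_dp, dt, steps):
    """Integrate q' = dT/dp, p' = -dV/dq for a separable H(q, p) = T(p) + V(q)."""
    for _ in range(steps):
        p -= 0.5 * dt * dV_dq(q)   # half step in p:  p' = -dH/dq
        q += dt * dT_dp(p)         # full step in q:  q' = +dH/dp
        p -= 0.5 * dt * dV_dq(q)   # second half step in p
    return q, p

def energy(q, p):
    """Assumed illustrative Hamiltonian H = (p^2 + q^2) / 2."""
    return 0.5 * (p * p + q * q)

steps = 10_000
dt = 2 * math.pi / steps           # integrate over one full period
q_end, p_end = leapfrog(1.0, 0.0, lambda q: q, lambda p: p, dt, steps)
```

After one period the numerical curve returns close to its starting point (1, 0), and `energy(q_end, p_end)` stays close to the initial value 0.5, as expected for a curve t −→ (q(t), p(t)) fulfilling (8.1) with a time-independent H.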


This Hamilton formalism has several interpretations:

1. The Geometric Picture

A curve γ = γ(t) is called an integral curve for the vector field

$$X = a(q, p)\,\partial_q + b(q, p)\,\partial_p := a_1(q, p)\,\partial_{q_1} + \dots + a_n(q, p)\,\partial_{q_n} + b_1(q, p)\,\partial_{p_1} + \dots + b_n(q, p)\,\partial_{p_n}$$

iff at each point of the curve its tangent vector is a vector from the vector field. We write this as

$$\dot\gamma(t) = X(\gamma(t)).$$

The statement that γ is an integral curve for the Hamiltonian vector field

$$X = X_H \quad\text{with}\quad a_i = \frac{\partial H}{\partial p_i}, \quad b_i = -\frac{\partial H}{\partial q_i},$$

is an expression of the assertion that γ fulfills Hamilton’s equations (8.1).

2. The Symplectic Interpretation

To state this, we have to provide some more general notions about differential forms, which we will need anyway in the following sections. Hence we enlarge our considerations about the tangent and cotangent bundle and their sections from the last section: We look again at a (real) smooth n-manifold M.
– A vector field X is defined as a section of the tangent bundle TM and in local coordinates x = (x1, . . . , xn) written as

$$X = a(x)\,\partial_x := a_1(x)\,\partial_{x_1} + \dots + a_n(x)\,\partial_{x_n}.$$

– A differential form of degree one or one-form α is defined as a section of the cotangent bundle T∗M and in local coordinates written as

$$\alpha = a(x)\,dx := a_1(x)\,dx_1 + \dots + a_n(x)\,dx_n.$$

As one knows from calculus, there also are higher order differential forms of different types. We shall need the symmetric and the antisymmetric or exterior forms.
– A symmetric r-form β is written in the local coordinates from above as

$$\beta := \sum b_{i_1 \dots i_r}(x)\,dx_{i_1} \cdot \ldots \cdot dx_{i_r}$$

where the coefficients b are (smooth) functions of the coordinates x and the differentials dxi are commuting symbols, i.e. with dxi · dxj = dxj · dxi for all i, j, the sum is meant over all r-tuples i1, . . . , ir ∈ {1, . . . , n} and the coefficients b are invariant under all permutations of the indices. For instance, if n = 2, one has the two-form

$$\beta = b_{11}\,dx_1^2 + b_{12}\,dx_1 dx_2 + b_{21}\,dx_2 dx_1 + b_{22}\,dx_2^2, \quad b_{12} = b_{21}.$$

Certainly, there is the possibility to introduce a reduced form where one restricts the index-tuples by taking exactly one representative from each orbit of the symmetric group Sr acting on the index-tuples. Hence, as reduced form one has in the example above

$$\beta = b_{11}\,dx_1^2 + \tilde b_{12}\,dx_1 dx_2 + b_{22}\,dx_2^2, \quad \tilde b_{12} = 2b_{12}.$$


If one wants to be more sophisticated (and precise), one says that a symmetric $r$-form is a (smooth) section of the $r$-th symmetric power of the cotangent bundle. But for our treatment here and for the antisymmetric case below the naive description above should be sufficient.

An antisymmetric or exterior $r$-form $\alpha$ or form of degree $r$ is given in local coordinates by

$$\alpha = \sum a_i(x)\,dx_{i_1}\wedge\dots\wedge dx_{i_r}$$

where here the differentials are anticommuting, i.e. $dx_i\wedge dx_j = -dx_j\wedge dx_i$ for all $i,j$. The summation is again over all $r$-tuples $i = (i_1,\dots,i_r)$ and the coefficients have the special property of antisymmetry

$$a_i = \operatorname{sgn}\sigma\; a_{\sigma i} \quad\text{for all } i = (i_1,\dots,i_r),\ \sigma\in S_r,$$

hence, in particular, one has $a_i = 0$ if two of the indices coincide. If we take in our example $n = r = 2$, we get

$$\alpha = a_{12}\,dx_1\wedge dx_2 + a_{21}\,dx_2\wedge dx_1,\qquad a_{12} = -a_{21}.$$

It is very convenient to introduce here the reduced form where the sum is taken only over all $r$-tuples $(i) = (i_1,\dots,i_r)$ with $1\le i_1 < \dots < i_r \le n$. Hence the reduced form of our example is

$$\alpha = \tilde a_{12}\,dx_1\wedge dx_2,\qquad \tilde a_{12} = 2a_{12}.$$

After changing coordinates the differential forms are transformed according to an appropriate generalization of the transformation property (7.27) in Example 7.21 in 7.6, and the complete definition is again given by characterizing exterior $r$-forms as smooth sections of the $r$-th exterior power of the cotangent bundle.

The set of these exterior $r$-forms on $U\subset M$ is denoted by $\Omega^r(U)$. The operations on the coefficients give a natural vector space structure on $\Omega^r(U)$, and one can also do scalar multiplication with functions. Moreover one has an associative multiplication if one multiplies a $p$-form with a $q$-form distributively, regarding the anticommutativity of the $dx_i$. It is decisive for the application of differential forms that we have still two more operations, the differentiation $d$ augmenting the degree by one and the inner multiplication $i(X)$ with a vector field $X$ lowering the degree by one (moreover there is the Lie derivative, which we will not consider for the moment). We will have to be content again with the description in local coordinates.

For $\alpha = \sum a_{(i)}\,dx_{i_1}\wedge\dots\wedge dx_{i_r}$ we define the exterior differentiation $d\alpha$ by

$$d\alpha := \sum da_{(i)}\wedge dx_{i_1}\wedge\dots\wedge dx_{i_r}$$

where $da$ denotes the usual total differential of a function $a$, i.e. is given by

$$da := \sum_{i=1}^n \frac{\partial a}{\partial x_i}\,dx_i.$$

To handle the exterior differential, one usually reorders it to its reduced form.


A simple but most important example appears if we look at the coordinates given by $x = (q_1,\dots,q_n,p_1,\dots,p_n)$ and take the one-form $\alpha = \sum_{i=1}^n q_i\,dp_i$. We get

$$\omega_0 := d\alpha = \sum_{i=1}^n dq_i\wedge dp_i \tag{8.2}$$

and, obviously, have $d\omega_0 = 0$. We note that the relation $dd\alpha = 0$ holds in general for all $\alpha$.

A form $\alpha$ is called closed iff one has $d\alpha = 0$ and it is called exact iff there is a form $\beta$ such that $\alpha = d\beta$.

Moreover, we point to the fact that exterior differentiation commutes with substitution of the variables and, if $\Phi: M \to M'$ is a smooth map of manifolds and $\alpha'$ a form on $M'$, then one can pull back $\alpha'$ to a form $\Phi^*\alpha'$ on $M$. Without being more precise here, for practical purposes this comes out by substitution of the local coordinates $x \mapsto y = y(x)$ and using the well known formula $dy_i = \sum_j \frac{\partial y_i}{\partial x_j}\,dx_j$. We do not have enough space to go into this more deeply, but later on we shall see in examples how this works.

For $\alpha = \sum a_{(i)}(x)\,dx_{i_1}\wedge\dots\wedge dx_{i_r}$ and the vector field $X = \sum b_j(x)\,\partial_j$, we have an inner product $i(X)\alpha$, which comes out (this certainly is not the most elegant definition, but avoids more generalities) by applying successively the prescriptions

$$i(X)\,dx_i = b_i(x)$$

and

$$i(X)(\alpha\wedge\beta) = (i(X)\alpha)\wedge\beta + (-1)^r\,\alpha\wedge(i(X)\beta) \quad\text{for } \alpha\in\Omega^r,\ \beta\in\Omega^q.$$

To show how this works, we give an example, which is the most interesting for us: As above, we take coordinates $x = (q_1,\dots,q_n,p_1,\dots,p_n)$ and $\alpha = \omega = \sum_{i=1}^n dq_i\wedge dp_i$ and $X = \sum_{i=1}^n (a_i(q,p)\,\partial_{q_i} + b_i(q,p)\,\partial_{p_i})$. Then we have

$$i(X)\omega = \sum_{i=1}^n (a_i\,dp_i - b_i\,dq_i).$$

Finally, we use this for the symplectic formulation of the Hamilton formalism: We go from the configuration space $Q$ with coordinates $q$ to its cotangent bundle $M = T^*Q$, called the phase space, with coordinates $(q,p)$. We provide this space with the two-form $\omega = \sum_{i=1}^n dq_i\wedge dp_i$ (and hence get the prototype of a symplectic manifold, which will be analysed later more thoroughly). Now we can say that a curve $\gamma$ provides a solution to Hamilton's equations (8.1) if it is an integral curve of the vector field $X$ which fulfills the equation

$$dH = i(X)\omega. \tag{8.3}$$

Namely, comparing the coefficients of $dq_i$ and $dp_i$ on both sides, in this case one has

$$\frac{\partial H}{\partial q_i} = -b_i,\qquad \frac{\partial H}{\partial p_i} = a_i,$$

i.e. $X$ is a Hamiltonian vector field $X_H$.
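
As a quick plausibility check of (8.3) for n = 1, the following minimal sketch approximates the partial derivatives of a sample Hamiltonian by central differences and compares dH with i(X)ω coefficient by coefficient. The oscillator Hamiltonian H = (p² + q²)/2, the step size and all function names are assumptions chosen for illustration; they are not taken from the text.

```python
# Finite-difference check of dH = i(X) omega for n = 1 and omega = dq ^ dp.
# Assumption: H(q, p) = (p**2 + q**2) / 2 is only a sample Hamiltonian.

def hamiltonian(q, p):
    return 0.5 * (p * p + q * q)


def partials(func, q, p, h=1e-6):
    """Central-difference approximations of (dfunc/dq, dfunc/dp)."""
    d_q = (func(q + h, p) - func(q - h, p)) / (2 * h)
    d_p = (func(q, p + h) - func(q, p - h)) / (2 * h)
    return d_q, d_p


def hamiltonian_vector_field(func, q, p):
    """Coefficients (a, b) of X_H = a d/dq + b d/dp, a = dH/dp, b = -dH/dq."""
    d_q, d_p = partials(func, q, p)
    return d_p, -d_q


def contract_with_omega(a, b):
    """i(X) omega = a dp - b dq, returned as (coefficient of dq, coefficient of dp)."""
    return -b, a


q0, p0 = 0.3, -1.2
a, b = hamiltonian_vector_field(hamiltonian, q0, p0)
dH = partials(hamiltonian, q0, p0)
iX_omega = contract_with_omega(a, b)
print("X_H coefficients:", a, b)
print("dH:", dH, " i(X)omega:", iX_omega)
```

For this sample Hamiltonian the vector field should come out close to a = p and b = −q, which is the familiar rotation in the (q, p)-plane.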


3. The Poisson Interpretation

In 6.1 we introduced the Poisson bracket for two smooth functions $f$ and $g$ with coordinates $(q,p)$

$$\{f,g\} := \sum_{i=1}^n\left(\frac{\partial f}{\partial q_i}\frac{\partial g}{\partial p_i} - \frac{\partial f}{\partial p_i}\frac{\partial g}{\partial q_i}\right).$$

Hence Hamilton's equations (8.1) can also be expressed as

$$\dot q = \{q,H\},\qquad \dot p = \{p,H\}.$$

And more generally, if one replaces the special observables positions $q_i$ and momenta $p_i$ by a more general observable $f$, i.e. a smooth function of $(q,p)$, as equation of time development one has

$$\dot f = \{f,H\}. \tag{8.4}$$

Here we find an appropriate starting point to do quantization, i.e. to go over to a quantum mechanical description of the system. A quantum mechanical system is modeled by a complex Hilbert space $\mathcal H$. The state of the system is represented by a one-dimensional subspace $L = \langle\psi\rangle$ of $\mathcal H$, $\psi\in\mathcal H\setminus\{0\}$, which can be seen as a point $\psi_\sim$ in the projective space

$$P(\mathcal H) := \{\psi_\sim;\ \psi\in\mathcal H\setminus\{0\}\},\qquad \psi\sim\psi' \text{ iff } \psi' = \lambda\psi \text{ for a } \lambda\in\mathbb C^*.$$

Then the time development is reflected in a curve $\gamma(t) = \psi(t)_\sim$ in $P(\mathcal H)$ and the physical law determining this development is encoded into a one-parameter group $G$ of unitary operators $U(t)$, $t\in\mathbb R$, such that $\gamma$ is a $G$-orbit in $P(\mathcal H)$. A physical observable $f$ should correspond to an operator $B = B_f$ acting on or in $\mathcal H$ (as there are subtle problems, we will not be more precise here) and $\langle B\psi,\psi\rangle$ provides the information about the probability to measure the observable in the state $\langle\psi\rangle$. Similar to our treatment of infinitesimal generators of one-parameter subgroups in section 6.2, one can look for a skew-adjoint operator $A$ such that (in some precise sense) one has $U(t) = \exp tA$ and, hence, the counterpart of the classical equation of motion will be

$$\frac{d\psi}{dt} = A\psi.$$

In the physics literature, usually one introduces the (self-adjoint) Hamilton operator $H$ by another normalization, namely

$$U(t) =: e^{itH}$$

(we use units such that Planck's constant is one) and then has the Schrödinger equation

$$\frac{d\psi}{dt} = iH\psi.$$

In our standard example of a system consisting of a particle with mass $m$ moving in $\mathbb R^3$ in a central force field with potential $V$, one translates the position variables $q_j$ from the classical world into operators $B_{q_j}$ acting by multiplication with $iq_j$, the momentum variables $p_j$ go to the differential operators $B_{p_j} = \partial_{q_j}$ acting on (a subspace of smooth functions in) the Hilbert space $\mathcal H = L^2(\mathbb R^3)$, and one has the Schrödinger equation

$$\frac{d\psi}{dt} = iH\psi = i\Big(\frac{1}{2m}\sum_{j=1}^3 \partial_{q_j}^2 - V(q)\Big)\psi.$$


In general, it can be a serious problem to find a Hilbert space $\mathcal H$ and operators corresponding to the symplectic space and the observables from the classical description. For some discussion of this we refer to Chapter 5 in [Be]. For our purposes here, we retain the following observation even though it is somewhat nebulous:

In important cases the behaviour of a physical system is guided by its symmetry group $G$, i.e. a group of transformations which do not change the physical content of the system. If we determine unitary representations of $G$ we will be provided with a Hilbert space and a lot of unitary operators.

In the next sections we shall see that important groups, like the Heisenberg group, the Euclidean and the Poincaré group, SU(2) and SL(2,R), give rise to symplectic manifolds and line bundles living on these such that the spaces of sections of these bundles are representation spaces of unitary representations of $G$.

8.2 Coadjoint Orbits and Representations

We start by analysing the construction of bundles on symplectic manifolds. Besides the already mentioned sources [Ki] and [Ki1], for further studies we recommend the book by Woodhouse [Wo]. We return to the realm of mathematics though we still use some terms from physics, which should help to give some motivation.

8.2.1 Prequantization

In Section 8.1 we already established the cotangent bundle $T^*Q$ of a manifold $Q$ and in particular the space $\mathbb R^{2n}$ with coordinates $(q_1,\dots,q_n,p_1,\dots,p_n)$ and provided with the standard two-form (8.2)

$$\omega_0 := \sum_{j=1}^n dq_j\wedge dp_j$$

as prototypes of a symplectic manifold. In general, one has the fundamental notion:

Definition 8.1: A smooth (real) $p$-manifold $M$ is called a symplectic manifold if it is provided with a closed nondegenerate two-form $\omega$ (the symplectic form) defined on $M$, that is an $\omega\in\Omega^2(M)$, such that

– i) $d\omega = 0$, and
– ii) on each tangent space $T_mM$, $m\in M$, the condition holds: if for $X\in T_mM$ one has

$$\omega_m(X,Y) = 0 \quad\text{for all } Y\in T_mM,$$

then $X = 0$.


Slightly paraphrasing the condition ii), we remark that $M$ necessarily has to have even dimension $p = 2n$ and that for $\omega$ in $m\in M$ with local coordinates $(q_1,\dots,q_n,p_1,\dots,p_n)$ given by (8.2)

$$\omega_0 = \sum_{j=1}^n dq_j\wedge dp_j$$

and for vector fields $X, Y$ given by

$$X = \sum_{j=1}^n \big(a'_j(q,p)\,\partial_{q_j} + a''_j(q,p)\,\partial_{p_j}\big),\qquad Y = \sum_{j=1}^n \big(b'_j(q,p)\,\partial_{q_j} + b''_j(q,p)\,\partial_{p_j}\big),$$

from the general duality formula $dx_i(\partial_{x_j}) = \delta_{ij}$ and the antisymmetry of the forms one has the relation

$$\omega_0(X,Y) = \sum_{j=1}^n \big(a'_j(q,p)\,b''_j(q,p) - a''_j(q,p)\,b'_j(q,p)\big).$$

At first sight, this form $\omega_0$ may look very special. But there is a beautiful theorem stating that this is really the general situation:

Theorem 8.1 (Darboux): For every point $m$ of a symplectic manifold $(M,\omega)$ of dimension $2n$, there is an open neighbourhood $U$ of $m$ and a smooth map

$$\varphi: U \to \mathbb R^{2n} \quad\text{with}\quad \varphi^*\omega_0 = \omega|_U,$$

where $\omega_0$ is the standard symplectic form (8.2) on $\mathbb R^{2n}$.

The examples of symplectic manifolds which are of interest for us are the coadjoint orbits belonging to the action of a group $G$ on the dual $\mathfrak g^*$ of its Lie algebra $\mathfrak g$. For instance, the projective space $P^1(\mathbb C)$ will come out as such an orbit. Before we go into this, we elaborate a bit on the way to construct bundles on symplectic manifolds. As we see it, this construction essentially goes back to Weil's book Variétés Kählériennes [We].

Definition 8.2: A symplectic manifold $(M,\omega)$ is called quantizable if $\omega$ is integral.

The integrality of $\omega$ can be defined in several equivalent ways. Perhaps the most elementary one is to require (as in [Wo] p.159):

(I.) The integral of $\omega$ over any closed oriented surface in $M$ is an integral multiple of $2\pi$.

More elegant is the requirement

(II.) $\omega$ determines a class $[\omega]$ in $H^2(M,\mathbb Z)$.

To understand (II.), one has to have at hand the notions of the second cohomology group $H^2(M,\mathbb R)$ and its integral subgroup $H^2(M,\mathbb Z)$. Though we are not too far from the possibility to define these, we refrain from doing so here and instead discuss the construction of the prequantum line bundle, where we can see how this condition works (later we shall be led to a more practical criterion to decide the quantizability of the manifolds showing up as coadjoint orbits):


Recalling our definitions from 7.6, this bundle can be constructed as follows. We take a contractible open cover $\mathcal U = (U_j)$ of $M$. Then Poincaré's Lemma assures that on each $U_j$ there exists a real one-form $\beta_j$ such that

$$\omega = d\beta_j.$$

In the same way, one knows that on each contractible $U_j\cap U_k \neq \emptyset$ a smooth real function $f_{jk} = -f_{kj}$ exists such that

$$df_{jk} = \beta_j - \beta_k$$

and, for each contractible $U_j\cap U_k\cap U_\ell \neq \emptyset$, there is a constant $a_{jk\ell}$ such that

$$2\pi a_{jk\ell} = f_{jk} + f_{k\ell} + f_{\ell j} \quad\text{on } U_j\cap U_k\cap U_\ell.$$

It is the decisive point that the integrality of $\omega$ says that the $a_{jk\ell}$ may be assumed to be integers (this is really the essence of the definition above). If we put

$$c_{jk} := \exp(if_{jk}),$$

we have

$$dc_{jk} = ic_{jk}(\beta_j - \beta_k)$$

and for $a_{jk\ell}\in\mathbb Z$ on each non-empty $U_j\cap U_k\cap U_\ell$

$$c_{jk}c_{k\ell}c_{\ell j} = \exp(2\pi i a_{jk\ell}) = 1.$$

Hence Proposition 7.4 in 7.6 shows that the $c_{jk}$ are transition functions for a line bundle $B$ over $M$. This bundle has a compatible Hermitian structure $\langle\,,\rangle$ (see [Wo] p.264), which we define below in 8.2.3. Since in Poincaré's Lemma a form $\beta$ with $d\beta = \alpha$ is not unique (it can be changed by an exact form), this construction is not unique. The general statement is that these constructions of prequantum bundles $B$ are parametrized by $H^1(M,\mathbb R)/H^1(M,\mathbb Z)$ ([Wo] p.286).

As it is our goal to construct unitary group representations, we need a Hilbert space. The Hilbert space $\mathcal H$ in our prequantization construction (the prequantum Hilbert space) is the space of all $s\in\Gamma(M,B)$, i.e. smooth global sections of $B$, for which in some sense an integral of $(s,s)\,\varepsilon_\omega$ over $M$ exists and is finite. Here the measure $\varepsilon_\omega$ is associated to the Liouville form

$$(-1)^{n(n-1)/2}\,\frac{1}{n!}\,\omega^n$$

known from calculus. For $s_1, s_2\in\mathcal H$ the inner product on $\mathcal H$ is

$$\langle s_1,s_2\rangle := \frac{1}{(2\pi)^n}\int_M (s_1,s_2)\,\varepsilon_\omega.$$


8.2.2 Example: Construction of Line Bundles over M = P1(C)

The topic of line bundles on projective space is treated in each text book on algebraic geometry (for instance, see Griffiths-Harris [GH] p.144) and we already constructed holomorphic line bundles over $P^1(\mathbb C)$ at the end of section 7.6. But to illustrate the procedure proposed above, for those who, like the author, enjoy explicit descriptions, we add here the following very elementary constructions (to be skipped if one is in a hurry).

1. Real Analytic Bundles B −→ P1(C)

We look at $P^1(\mathbb C)$ as a real two-dimensional manifold and take the covering $\mathcal U = (U_j)_{j=1,\dots,6}$ with

$$U_1 = \{(z:1);\ z = x+iy\in\mathbb C;\ |z| < 1\},$$
$$U_2 = \{(1:w);\ w = u+iv\in\mathbb C;\ |w| < 1\},\quad w = 1/z = (x-iy)/(x^2+y^2),$$
$$U_{3,4} = \{(z:1);\ z\in\mathbb C;\ \operatorname{Re} z = x \gtrless 0\},$$
$$U_{5,6} = \{(z:1);\ z\in\mathbb C;\ \operatorname{Im} z = y \gtrless 0\}.$$

This covering is chosen such that all intersections of the covering neighborhoods are simply connected (this is not the case for the covering by $U$ and $U'$, which we used in Example 7.18 in 7.6, as one has $U\cap U' \simeq \mathbb C^*$). With $c = \ell/\pi\in\mathbb R$ we take as a symplectic form on $M$

$$\omega := c\,\frac{dx\wedge dy}{(1+x^2+y^2)^2} = c\,\frac{i}{2}\,\frac{dz\wedge d\bar z}{(1+|z|^2)^2} \quad\text{for } (z:1)\in U_j,\ j\neq 2,$$

$$\omega := c\,\frac{du\wedge dv}{(1+u^2+v^2)^2} = c\,\frac{i}{2}\,\frac{dw\wedge d\bar w}{(1+|w|^2)^2} \quad\text{for } (1:w)\in U_2.$$

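Exercise 8.1: Verify that this two-form $\omega$ in the appropriate coordinates has potential forms $\theta_z$ and $\theta_w$ with $d\theta_z = d\theta_w = \omega$ given by

$$\theta_z = \frac{c}{2}\,\frac{x\,dy - y\,dx}{1+x^2+y^2} = -c\,\frac{i}{4}\left(\frac{\bar z\,dz - z\,d\bar z}{1+|z|^2}\right) \quad\text{for } (z:1)\in U_j,\ j\neq 2,$$

$$\theta_w = \frac{c}{2}\,\frac{u\,dv - v\,du}{1+u^2+v^2} = -c\,\frac{i}{4}\left(\frac{\bar w\,dw - w\,d\bar w}{1+|w|^2}\right) \quad\text{for } (1:w)\in U_2.$$

On $U_j\cap U_2$ ($j\neq 2$) we have

$$\theta_z - \theta_w = -c\,\frac{i}{4}\left(\frac{\bar z\,dz - z\,d\bar z}{1+|z|^2} - \frac{1}{1+|z|^2}\Big(\frac{d\bar z}{\bar z} - \frac{dz}{z}\Big)\right) = -c\,\frac{i}{4}\left(\frac{\bar z\,dz - z\,d\bar z}{|z|^2}\right) = \frac{c}{2}\,\frac{x\,dy - y\,dx}{x^2+y^2} = \frac{c}{2}\,df_{2j}$$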

with a potential function $f_{2j}$. If we fix the tangent as a function on $(-\pi/2,\pi/2)$ and $f = \arctan$ as its inverse, we can take

$$-f_{32}(x,y) = f(y/x) = \arctan(y/x)$$

and moreover, for instance,

$$-f_{52}(x,y) = -f(x/y) + \pi/2 = -\arctan(x/y) + \pi/2 \quad\text{and}\quad -f_{42}(x,y) = \pi + \arctan(y/x).$$

Then, on $U_2\cap U_4\cap U_5$, we have

$$f_{24} + f_{45} + f_{52} = -\pi - \arctan(y/x) + \pi/2 - \arctan(x/y) = -\pi,$$

and similarly on the other intersections. Thus, we see that $\omega$ is integral in the sense

i) $\omega = d\theta_j$ on $U_j$

ii) $\theta_j - \theta_k = df_{jk}$ on $U_j\cap U_k$

iii) $f_{jk} + f_{k\ell} + f_{\ell j} = a_{jk\ell}\in 2\pi\mathbb Z$ in $U_j\cap U_k\cap U_\ell$ exactly for $c\in 4\mathbb Z$.

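This result coincides with the usual volume computation

$$\int_{P^1(\mathbb C)}\omega = c\int\frac{dx\wedge dy}{(1+x^2+y^2)^2} = c\int_0^\infty\!\!\int_0^{2\pi}\frac{r\,dr\,d\delta}{(1+r^2)^2} = 2\pi c\int_0^\infty\frac{r\,dr}{(1+r^2)^2} = \pi c\int_1^\infty\frac{du}{u^2} = \pi c,$$

telling that the volume is an integral multiple of $4\pi = \operatorname{vol}(S^2)$ exactly for $c = 4n$, $n\in\mathbb Z$.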
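
The radial integral in this volume computation can also be checked numerically. The sketch below is only an illustration under stated assumptions: it uses the substitution r = tan t, which turns r dr/(1 + r²)² into sin t cos t dt on [0, π/2], and a plain midpoint rule; the function names and the number of steps are hypothetical choices, not part of the text.

```python
import math


def radial_integral(steps=100_000):
    """Midpoint rule for int_0^{pi/2} sin(t) cos(t) dt.

    This equals int_0^infinity r dr / (1 + r^2)^2 after substituting r = tan(t).
    """
    width = (math.pi / 2) / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * width
        total += math.sin(t) * math.cos(t)
    return total * width


def volume(c):
    """Volume of P^1(C) for the form c dx^dy / (1 + x^2 + y^2)^2."""
    return 2 * math.pi * c * radial_integral()


print("radial integral:", radial_integral())
print("volume for c = 4:", volume(4), " 4*pi:", 4 * math.pi)
```

The radial integral should be close to 1/2, so that the volume is close to πc, in agreement with the chain of equalities above.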

Moreover, we see that we can construct the $C^\infty$-prequantum bundle

$$B \xrightarrow{\ \pi\ } M = P^1(\mathbb C)$$

in the standard way as

$$B = \{(m,v,j)_\sim;\ m\in M,\ v\in\mathbb C,\ j\in I\},$$

where here $I = \{1,\dots,6\}$ and

$$(m',v',j') \sim (m,v,j)$$

exactly for $m' = m$, $v' = \psi_{j'j}(m)\,v$.

The bundle transition functions ψjk are fixed (in the charts with coordinates x, y) by

ψjk(x, y) = cjkeifjk(x,y),

i.e. the different trivializations are glued together by

ei argz

.

By the way, the transition function z comes out if one does not take real potential forms as above but complex forms like θ′z = (1 + |z|2)−1zdz.

2. Holomorphic Bundles E −→ P1(C)

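Following Weil ([We] p. 90), the same construction can be repeated by looking at $P^1(\mathbb C)$ as a one-dimensional holomorphic manifold and searching for holomorphic transition functions

$$F_{jk}: U_j\cap U_k \to \mathbb C^*.$$

Here we take the same covering $\mathcal U = (U_j)_{j=1,\dots,6}$ as above and the symplectic form

$$\omega := \frac{\ell}{2\pi i}\,\frac{dz\wedge d\bar z}{(1+|z|^2)^2} = \frac{\ell}{2\pi i}\,\partial\bar\partial\Phi_j \quad\text{for } j\neq 2,$$

with

$$\Phi_j(z) := \log(1+|z|^2),$$

resp. for $j = 2$

$$\omega := \frac{\ell}{2\pi i}\,\frac{dw\wedge d\bar w}{(1+|w|^2)^2} = \frac{\ell}{2\pi i}\,\partial\bar\partial\Phi_2$$

with

$$\Phi_2(w) := \log(1+|w|^2).$$

On $U_j\cap U_2$, $j\neq 2$, we have

$$\Phi_j(z) - \Phi_2(1/z) = \log(|z|^2).$$

Fixing appropriate branches of the complex log-function, we take for $j\neq 2$

$$f_{2j} := \frac{\ell}{2\pi i}\,\log z.$$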

Then we haveΦj(z) − Φ2(1/z) = −2πi(f2j − f2j)

andηj = −

2πidΦj =

2πid log(1 + |z|2) =

2πi

z dz

1 + |z|2with

ω =

2πi

dz ∧ dz

(1 + |z|2)2 .

Using Weil's notation, for the construction of a holomorphic bundle $E = E_\ell$ we have transition functions

$$F_{jk}(z) = \exp(2\pi i f_{jk}(z)) = z^\ell$$

exactly if $\ell\in\mathbb Z$. We see immediately that for $\ell\in\mathbb N_0$ the holomorphic sections are given in the $z$-variable by

$$s(z) = z^\lambda,\quad \lambda = 0,1,\dots,\ell$$

(as exactly in these cases we have $(1/w)^\lambda w^\ell$ holomorphic in $w$), i.e. we find

$$\Gamma(E_\ell) \cong \mathbb C^{\ell+1}.$$

As described in a general procedure below, this space consists of the polarized sections of the prequantum bundle $B = B_\ell$, where the polarization is given by assigning to the point $m$ with coordinate $z$ the complex subspace spanned by $\partial_{\bar z}$ in the complexification

$$T_mP^1(\mathbb C)\otimes\mathbb C \simeq \mathbb C\,\partial_z \oplus \mathbb C\,\partial_{\bar z}$$

of the tangent space $T_mP^1(\mathbb C)$ for each $m\in P^1(\mathbb C)$, i.e. $\Gamma(E_\ell)$ can be identified with the space of those sections of $B$ which are constant "along the direction of the $\bar z$-coordinate".

We try to make this a bit more comprehensible. But to do it, again we need more general notions from complex and algebraic geometry. As seen above, these geometric notions can and shall be made rather tangible later by replacing them by algebraic ones in our concrete examples (and, hence, may be skipped at first try).


8.2.3 Quantization

The prequantization Hilbert space $\mathcal H$ built from $C^\infty$-sections of the prequantum bundle $B$ over $M$, as we just have seen, often turns out to be too large to be useful: This is to be seen again in the case of the coadjoint orbits, as the representations of the group $G$ on $\mathcal H$ as a representation space are in general far from being irreducible. From representation theory procedures are known to construct appropriate representation spaces, for instance by parabolic or holomorphic induction. But in this approach here, it is "natural" to use a notion from Kähler geometry to single out a subspace of $\mathcal H$, namely the notion of the Kähler polarization. We take over from [Wo] p.92 ff:

Definition 8.3: A complex polarization on a $2n$-dimensional symplectic manifold $(M,\omega)$ is a complex distribution $P$ with

(CP1) For each $m\in M$, $P_m$ is a complex Lagrangian subspace of $T^{\mathbb C}_mM$, i.e. with

$$P_m^\perp = P_m,$$

(CP2) $D_m := P_m\cap\bar P_m\cap T_mM$ has constant dimension for all $m\in M$,

(CP3) $P$ is integrable.

The polarization $P$ is of type $(r,s)$ if the Hermitian form $\langle\,,\rangle$ on $P_m$ defined by

$$\langle Z,W\rangle := -4i\,\omega(Z,\bar W) \quad\text{for } Z,W\in P_m$$

is of signature $(r,s)$ for all $m\in M$.
$P$ is a Kähler polarization if it is of type $(r,s)$ with $r+s = n$. Such a polarization induces a complex structure $J$ on $M$, and $(M,\omega,J)$ is a Kähler manifold (in the broad sense that the real symmetric tensor $g$ determined by

$$g(X,Y) = 2\omega(X,JY),\quad X,Y\in V(M)$$

may be indefinite, see the "note 1" to p.93 in [Wo]).

Perhaps the reader wants some more explanations. We follow [Wo] p.269:

– A real distribution on a real manifold $M$ is a subbundle of the tangent bundle $TM$. The fiber $P_m$ at $m\in M$ is a subspace of $T_mM$, which varies smoothly with $m$.
– A vector field $X$ is tangent to $P$ if $X(m)\in P_m$ for every $m$. The space of vector fields tangent to $P$ is denoted by $V_P(M)$. A smooth function $f$ is constant along $P$ if $df$ vanishes on restriction to $P$. The space of smooth functions constant along $P$ is denoted by $C^\infty_P(M)$.
– An immersed submanifold $N\subset M$ is an integral manifold of $P$ if $P_m = T_mN$ for every $m\in N$.
– A distribution is integrable if $[X,Y]$ is tangent to $P$ whenever $X$ and $Y$ are tangent to $P$.
– A complex distribution is similarly defined to be a subbundle of the complexified tangent bundle $T^{\mathbb C}M$. A complex distribution $P$ is integrable if in some neighbourhood of each point, one has smooth complex functions $f_{k+1},\dots,f_l$ with gradients that are independent and annihilate all complex vector fields tangent to $P$, where $l-k$ is the fiber dimension of $P$.
– A complex structure on a real vector space $V$ is a linear transformation $J: V\to V$ such that $J^2 = -1$. Complex structures exist only in spaces of even dimension. On a real manifold a complex structure $J$ is given if one has a complex structure $J_m$ on each tangent space $T_mM$ and if these $J_m$ vary smoothly with $m$, i.e. the complex distribution $P$ spanned by the vector fields $X - iJX$, $X\in V(M)$, is integrable. A real manifold with a complex structure comes out as a complex manifold.
– A connection $\nabla$ on a vector bundle $B$ is an operator that assigns a one-form $\nabla s$ with values in $B$ to each (smooth) section $s$ of $B$, i.e.,

$$\nabla: \Gamma(B) \to \Gamma(T^*M\otimes B),$$

such that for any function $f$ and sections $s,s'\in C^\infty_B(M)$ one has

$$\nabla(s+s') = \nabla s + \nabla s' \quad\text{and}\quad \nabla(fs) = (df)\,s + f\,\nabla s.$$

Vice versa, a given Kähler manifold $(M,\omega,J)$, i.e. a manifold which is simultaneously in a compatible manner a symplectic and a complex manifold, carries two Kähler polarizations: the holomorphic polarization $P$ spanned at each point by the vectors $(\partial_{\bar z_a})$ and its complex conjugate, the anti-holomorphic polarization $\bar P$ spanned by the $(\partial_{z_a})$'s. Locally it is possible to find a real smooth function $f$ such that

$$\omega = i\,\partial\bar\partial f$$

and

$$\Theta = -i\,\partial f \quad\text{resp.}\quad \bar\Theta = i\,\bar\partial f$$

is a symplectic potential adapted to $P$ resp. $\bar P$.
– A Hermitian structure on a complex vector bundle $B$ is a Hermitian inner product $(\cdot,\cdot)$ on the fibers, which is smooth in the sense that $B\to\mathbb C$, $v\mapsto(v,v)$ is a smooth function. It is compatible with the connection if for all sections $s,s'$ and all real vector fields $X$

$$i(X)\,d(s,s') = (\nabla_Xs,s') + (s,\nabla_Xs'),\qquad \nabla_Xs := i(X)\nabla s.$$

Now, let a Kähler polarization $P$ be given on the quantizable $(M,\omega)$ with prequantum line bundle $B$ over $M$ with connection $\nabla$, Hermitian structure $\langle\,,\rangle$ and Hilbert space $\mathcal H$ of square integrable sections of $B$. Then we define a space of polarized sections

$$C^\infty_B(M,P) := \{s\in C^\infty_B(M);\ \nabla_Xs = 0 \text{ for all } X\in V_{\mathbb C}(M,P)\}.$$

Here, for $U\subset M$ open,

$$V_{\mathbb C}(U,P) := \{X\in V_{\mathbb C}(U);\ X_m\in P_m \text{ for all } m\in U\}$$

denotes the vector fields tangent to $P$ on $U$. Then

$$\mathcal H_P := \mathcal H\cap C^\infty_B(M,P)$$

comes out as our "new" Hilbert space. As remarked above, it consists of all integrable holomorphic sections of $B$.
Alternatively, it can be said that the prequantum bundle $B$ (via $P$) is given the structure of a complex line bundle. Anyway, in important cases, this $\mathcal H_P$ is the representation space of a discrete series representation.


We leave aside the problem of which classical observables have a selfadjoint counterpart acting in $\mathcal H_P$, discussed at some length in [Wo], as well as the modification by the introduction of half-densities, and go directly to the application in the construction of unitary representations via line bundles on the coadjoint orbits.

8.2.4 Coadjoint Orbits and Hamiltonian G-spaces

This is the central topic of this chapter. For background information we recommend the standard sources by Kirillov [Ki] §15, Kostant [Kos], (at least) the first pages of Kirillov's book [Ki1], or, for a comprehensive overview, the article by Vogan [Vo].

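Let $G$ be a (real) linear group with Lie algebra $\mathfrak g$ and its dual space $\mathfrak g^*$. As usual, we realize all this by matrices and then have the adjoint representation Ad of $G$ on $\mathfrak g$ given by

$$\operatorname{Ad}(g)X := gXg^{-1}.$$

(This is the derived map of the inner automorphism $\kappa_g: G\to G$, $g_0\mapsto gg_0g^{-1}$.)

Following our preparation in section 1.4.3, we can construct the contragredient representation $\operatorname{Ad}^*$ defined on the dual $\mathfrak g^*$ to $\mathfrak g$ by

$$\operatorname{Ad}^*(g) := \operatorname{Ad}(g^{-1})^*.$$

The asterisk on the right-hand side indicates a dual operator: If $\langle\eta,X\rangle$ denotes the value $\eta(X)$ of the linear functional $\eta$ applied to the vector $X$ and $A$ is an operator on $\mathfrak g$, the dual $A^*$ is defined by

$$\langle A^*\eta,X\rangle := \langle\eta,AX\rangle \quad\text{for all } X\in\mathfrak g,\ \eta\in\mathfrak g^*.$$

Hence we have as our central prescription

$$\langle\operatorname{Ad}^*(g)\eta,X\rangle = \langle\eta,\operatorname{Ad}(g^{-1})X\rangle \quad\text{for all } g\in G,\ X\in\mathfrak g,\ \eta\in\mathfrak g^*.$$

This representation $\operatorname{Ad}^*$ is called the coadjoint representation. Often we will abbreviate $g\cdot\eta := \operatorname{Ad}^*(g)\eta$.

In 6.6.1 we already worked with the trace form for $A,B\in M_n(\mathbb C)$ given by

$$(A,B)\mapsto \operatorname{Re}\operatorname{Tr}(AB) =: \langle A,B\rangle.$$

(Here, unfortunately, one has a double meaning of $\langle\cdot,\cdot\rangle$.) If $\mathfrak g$ is semisimple or reductive, one can use this bracket to identify $\mathfrak g^*$ with $\mathfrak g$ ($G$-equivariantly) via

$$\mathfrak g^*\ni\eta\mapsto X_\eta\in\mathfrak g, \quad\text{defined by } \eta(Y) = \langle X_\eta,Y\rangle \text{ for all } Y\in\mathfrak g.$$

Then the coadjoint representation is fixed by the simple prescription $X_\eta\mapsto gX_\eta g^{-1}$.

Now we are prepared to define the coadjoint orbit $\mathcal O_\eta$ of $\eta\in\mathfrak g^*$ by

$$\mathcal O_\eta := G\cdot\eta.$$

If $G_\eta$ denotes the stabilizing group $G_\eta := \{g\in G;\ g\cdot\eta = \eta\}$, one has $\mathcal O_\eta\simeq G/G_\eta$, and if we can identify $\mathfrak g^*$ with a matrix space as proposed above, we have

$$\mathcal O_\eta = \{gX_\eta g^{-1};\ g\in G\}.$$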


It is an outstanding and most wonderful fact that our coadjoint orbits are symplectic manifolds. We cite from [Vo] p.187:

Theorem 8.2: Suppose $M = \mathcal O_\eta$ is the coadjoint orbit of $\eta\in\mathfrak g^*$ and $\mathfrak g_\eta := \operatorname{Lie}G_\eta$. Then one has
1.) The tangent space to $M$ at $\eta$ is

$$T_\eta M\simeq\mathfrak g/\mathfrak g_\eta.$$

2.) The skew-symmetric bilinear form

$$\omega_\eta(X,Y) := \eta([X,Y]) \quad\text{for all } X,Y\in\mathfrak g$$

on $\mathfrak g$ has radical exactly $\mathfrak g_\eta$ and so defines a symplectic form on $T_\eta M$.
3.) The form $\omega_\eta$ makes $M$ into a symplectic manifold.

The bilinear form $\omega_\eta$ is called Kirillov-Kostant form, sometimes also Kirillov-Kostant-Souriau form or any permutation of these names. It is also written as ([Ki] p.230 or [Wo] p.52)

$$\omega_\eta(\eta')(\xi_X(\eta'),\xi_Y(\eta')) = \langle\eta',[X,Y]\rangle \quad\text{for all } X,Y\in\mathfrak g,\ \eta'\in\mathcal O_\eta \tag{8.5}$$

where $\xi_X$ denotes the vector field associated to $X$ in the form of a differential operator acting on smooth functions $f$ on $M$ by the definition

$$(\xi_Xf)(\eta) := \frac{d}{dt}\big(f(\exp tX\cdot\eta)\big)\Big|_{t=0}. \tag{8.6}$$

For a proof we refer to [Ki] 15.2 or [Kos] Theorem 5.4.1. It is obvious that $\omega_\eta$ is skew symmetric. Also it is, in principle, not deep that a form given for the tangent space of one point can be transported to the whole space if the space is homogeneous. But it may be an intricate task to give an explicit description of $\omega$ in local coordinates. We shall give an example soon of how (8.5) can be used. The fact that $\omega$ comes out as a closed form can be proved by direct calculation or by more sophisticated means as in [Ki1] p.5-10.

The coadjoint orbits still have more power, they are Hamiltonian G-spaces:

Definition 8.4: Suppose $(M,\omega)$ is a symplectic manifold and $f\in C^\infty(M)$ is a smooth function with its Hamiltonian vector field $X_f$. Moreover suppose that $G$ is a linear group endowed with a smooth action on $M$ which respects the form $\omega$. We say that $M$ is a Hamiltonian $G$-space if there is a linear map

$$\mu: \mathfrak g\to C^\infty(M)$$

with the following properties:
– i) $\mu$ intertwines the adjoint action of $G$ on $\mathfrak g$ with its action on $C^\infty(M)$.
– ii) For each $Y\in\mathfrak g$, $X_{\mu(Y)}$ is the vector field by which $Y$ acts on $M$.
– iii) $\mu$ is a Lie algebra homomorphism.

$\mu$ can be reinterpreted as a moment map

$$\tilde\mu: M\to\mathfrak g^*,\qquad \tilde\mu(m)(Y) := \mu(Y)(m) \quad\text{for all } m\in M,\ Y\in\mathfrak g.$$


Then the highlight of this theory is the following result, to be found also in the sources cited above. We shall not have occasion to use it here but state it to give a more complete picture.

Theorem 8.3: The homogeneous Hamiltonian $G$-spaces are the covering spaces of coadjoint orbits. More precisely, suppose $M$ is such a space, with map $\mu: \mathfrak g\to C^\infty(M)$ and moment map $\tilde\mu$. Then $\tilde\mu$ is a $G$-equivariant local diffeomorphism onto a coadjoint orbit, and the Hamiltonian $G$-space structure on $M$ is pulled back from that on the orbit by the map $\tilde\mu$.

If $G$ is reductive and we have identified $\mathfrak g$ with its dual $\mathfrak g^*$ as $n\times n$-matrices, one has the following classification of the coadjoint orbits, which later will be reflected in a mechanism to construct representations of $G$.

Definition 8.5: An element $X\in\mathfrak g$ is called nilpotent if it is nilpotent as a matrix, i.e. if $X^N = 0$ for $N$ large enough.
$X$ is called semisimple if $X$ regarded as a complex matrix is diagonalizable.
$X$ is called hyperbolic resp. elliptic if it is semisimple and all eigenvalues are real resp. purely imaginary.
An orbit $\mathcal O = G\cdot\eta$ inherits the name from $X = X_\eta$.

For instance, the real matrix

$$X = \begin{pmatrix} 0 & 1\\ -1 & 0\end{pmatrix}$$

is elliptic, since over $\mathbb C$ it is conjugate to $\begin{pmatrix} i & 0\\ 0 & -i\end{pmatrix}$. Now, this finally is the moment to discuss an example.

Coadjoint Orbits of SL(2,R)

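We take

$$G = SL(2,\mathbb R)\supset K = SO(2)$$

and

$$\mathfrak g^*\simeq\mathfrak g = \operatorname{Lie}G = \Big\{U(x,y,z) := \begin{pmatrix} x & y+z\\ y-z & -x\end{pmatrix};\ x,y,z\in\mathbb R\Big\} = \langle X,Y,Z\rangle_{\mathbb R}$$

with

$$X = \begin{pmatrix} 1 & 0\\ 0 & -1\end{pmatrix},\quad Y = \begin{pmatrix} 0 & 1\\ 1 & 0\end{pmatrix},\quad Z = \begin{pmatrix} 0 & 1\\ -1 & 0\end{pmatrix},$$

$$[X,Y] = 2Z,\quad [X,Z] = 2Y,\quad [Y,Z] = -2X$$

and

$$X^2 + Y^2 - Z^2 = 3I_2.$$

Obviously, $X$ and $Y$ are hyperbolic, $Z$ is elliptic, and $F = \begin{pmatrix} 0 & 1\\ 0 & 0\end{pmatrix}$ as well as $G = \begin{pmatrix} 0 & 0\\ 1 & 0\end{pmatrix}$ are nilpotent.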
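
The commutation relations and the classification of these five matrices are easy to confirm by machine. The following minimal sketch works with plain nested tuples; the helper names are illustrative assumptions. The `classify` function relies on the fact that a nonzero traceless real 2×2 matrix A has eigenvalues ±√(−det A), so that the sign of the determinant decides between the three types.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def bracket(A, B):
    """Commutator [A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    return tuple(tuple(AB[i][j] - BA[i][j] for j in range(2)) for i in range(2))


def scale(s, A):
    return tuple(tuple(s * entry for entry in row) for row in A)


def classify(A):
    """Type of a nonzero traceless real 2x2 matrix (eigenvalues +-sqrt(-det A))."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    if det < 0:
        return "hyperbolic"  # two distinct real eigenvalues
    if det > 0:
        return "elliptic"    # two purely imaginary eigenvalues
    return "nilpotent"       # A squared is zero


X = ((1, 0), (0, -1))
Y = ((0, 1), (1, 0))
Z = ((0, 1), (-1, 0))
F = ((0, 1), (0, 0))
G = ((0, 0), (1, 0))

for name, matrix in (("X", X), ("Y", Y), ("Z", Z), ("F", F), ("G", G)):
    print(name, classify(matrix))
```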


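One can guess the form of the orbits generated by these matrices, but to be on the safe side we calculate:

$$gU(x,y,z)g^{-1} =: U(\tilde x,\tilde y,\tilde z) \quad\text{for } g = \begin{pmatrix} a & b\\ c & d\end{pmatrix}$$

is given by

$$\begin{aligned}
\tilde x &= (ad+bc)\,x + (bd-ac)\,y - (bd+ac)\,z,\\
\tilde y &= (dc-ab)\,x + \tfrac12(a^2+d^2-b^2-c^2)\,y + \tfrac12(a^2+b^2-c^2-d^2)\,z,\\
\tilde z &= -(ab+cd)\,x + \tfrac12(a^2-b^2+c^2-d^2)\,y + \tfrac12(a^2+b^2+c^2+d^2)\,z.
\end{aligned} \tag{8.7}$$

Hence, for $U = \alpha X$, $\alpha\neq 0$, as a hyperbolic orbit $\mathcal O_{\alpha X}$ we have the one-sheeted hyperboloid in $\mathbb R^3$ given by

$$x^2 + y^2 - z^2 = \alpha^2.$$

For $U = \alpha Z$, $\alpha\gtrless 0$, as elliptic orbits $\mathcal O_{\alpha Z}$ we have the upper resp. lower half of the two-sheeted hyperboloid given by

$$x^2 + y^2 - z^2 = -\alpha^2,\quad z\gtrless 0.$$

For $U = F$ and $U = G$, as nilpotent orbits $\mathcal O_F$ and $\mathcal O_G$ we have the upper resp. lower half of the cone given by

$$x^2 + y^2 - z^2 = 0,\quad z\gtrless 0.$$

Finally, for $U = 0$, there is the nilpotent orbit $\mathcal O_0$ consisting only of the origin in $\mathbb R^3$. Hence we have a disjoint decomposition of $\mathbb R^3$ into the orbits in their realization as subsets of $\mathbb R^3$

$$\mathbb R^3 = \coprod_{\alpha>0}\mathcal O_{\alpha X}\ \sqcup\ \coprod_{\alpha\gtrless 0}\mathcal O_{\alpha Z}\ \sqcup\ (\mathcal O_F\cup\{0\}\cup\mathcal O_G).$$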
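
The reason all orbits lie on level sets of x² + y² − z² is that this quantity equals −det U(x, y, z), which conjugation preserves. The sketch below, with an arbitrarily chosen sample element g of determinant one and a sample point (both assumptions for illustration), compares formula (8.7) with an explicit matrix conjugation and checks the invariance numerically.

```python
def U(x, y, z):
    """The matrix U(x, y, z) = [[x, y + z], [y - z, -x]]."""
    return ((x, y + z), (y - z, -x))


def coords(M):
    """Read (x, y, z) back off a traceless 2x2 matrix."""
    return M[0][0], (M[0][1] + M[1][0]) / 2, (M[0][1] - M[1][0]) / 2


def matmul(A, B):
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def conjugate(g, M):
    """g M g^{-1} for g in SL(2, R); the inverse formula uses det g = 1."""
    (a, b), (c, d) = g
    g_inv = ((d, -b), (-c, a))
    return matmul(matmul(g, M), g_inv)


def formula_8_7(g, x, y, z):
    """The coordinates of g U(x, y, z) g^{-1} as stated in (8.7)."""
    (a, b), (c, d) = g
    xt = (a * d + b * c) * x + (b * d - a * c) * y - (b * d + a * c) * z
    yt = ((d * c - a * b) * x + 0.5 * (a * a + d * d - b * b - c * c) * y
          + 0.5 * (a * a + b * b - c * c - d * d) * z)
    zt = (-(a * b + c * d) * x + 0.5 * (a * a - b * b + c * c - d * d) * y
          + 0.5 * (a * a + b * b + c * c + d * d) * z)
    return xt, yt, zt


def quadratic_form(x, y, z):
    return x * x + y * y - z * z


g = ((2.0, 3.0), (1.0, 2.0))   # sample element, det = 4 - 3 = 1
point = (0.5, -1.0, 2.0)       # sample point, here on a two-sheeted hyperboloid
by_matrices = coords(conjugate(g, U(*point)))
by_formula = formula_8_7(g, *point)
print(by_matrices, by_formula)
print(quadratic_form(*point), quadratic_form(*by_formula))
```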

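It is an easy exercise to determine the stabilizing groups of the elements we chose to generate our orbits. We get (again with $\alpha\neq 0$)

$$G_{\alpha X} = \Big\{\begin{pmatrix} a & 0\\ 0 & 1/a\end{pmatrix};\ a\in\mathbb R^*\Big\},$$

$$G_{\alpha Z} = SO(2),$$

$$G_F = \Big\{\begin{pmatrix} \pm1 & b\\ 0 & \pm1\end{pmatrix};\ b\in\mathbb R\Big\},$$

$$G_G = \Big\{\begin{pmatrix} \pm1 & 0\\ c & \pm1\end{pmatrix};\ c\in\mathbb R\Big\}.$$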

From the general theory we have the prediction that the coadjoint orbits are symplecticmanifolds. We can verify this here directly:


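Let us take the elliptic orbit

$$\mathcal O_{\alpha Z} = G\cdot(\alpha Z)\simeq G/SO(2),\quad \alpha > 0.$$

We want to give an explicit expression for the Kirillov-Kostant form $\omega$ in this case. Here one has to be rather careful. We used the trace form for matrices $\langle A,B\rangle = \operatorname{Tr}AB$ to identify $\mathfrak g^*$ with $\mathfrak g$ via $\eta\mapsto X_\eta$ with $\langle X_\eta,U\rangle = \eta(U)$ for all $U\in\mathfrak g$. Now, we write $\mathfrak g^* = \langle X_*,Y_*,Z_*\rangle_{\mathbb R}$ with

$$X_* = \begin{pmatrix} 1 & 0\\ 0 & -1\end{pmatrix},\quad Y_* = \begin{pmatrix} 0 & 1\\ 1 & 0\end{pmatrix},\quad Z_* = \begin{pmatrix} 0 & 1\\ -1 & 0\end{pmatrix},$$

and then have as non-vanishing duality relations

$$\langle X_*,X\rangle = 2,\quad \langle Y_*,Y\rangle = 2,\quad \langle Z_*,Z\rangle = -2.$$

We put

$$\mathfrak g^*\ni\eta = x_*X_* + y_*Y_* + z_*Z_*$$

and

$$X = xX + yY + zZ,\quad Y = x'X + y'Y + z'Z\in\mathfrak g.$$

Hence, by the commutation relations for $X, Y, Z$, we have

$$[X,Y] = 2\big((xy'-x'y)Z + (xz'-x'z)Y - (yz'-y'z)X\big).$$

From the Kirillov-Kostant prescription and the duality relations we have

$$\omega_\eta(X,Y) = \langle\eta,[X,Y]\rangle = -4(yz'-y'z)\,x_* + 4(xz'-x'z)\,y_* - 4(xy'-x'y)\,z_*,$$

i.e. in particular for the elliptic orbit through $\eta = kZ_*$

$$\omega_{kZ_*}(X,Y) = -4(xy'-x'y)\,k.$$

There are several ways to go from here to an explicit differential two-form $\omega$ on

$$M := G\cdot kZ_* = \{(x_*,y_*,z_*)\in\mathbb R^3;\ x_*^2 + y_*^2 - z_*^2 = -k^2\}.$$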

It is perhaps not the shortest one but rather instructive to proceed like this: One knows that

$$\omega_1 = \frac{dx\wedge dy}{y^2}$$

is a $G$-invariant symplectic form on $M' := \mathbb H\simeq G/SO(2)$ where

$$\tau = x + iy := g(i)$$

with

$$x = \frac{bd+ac}{c^2+d^2},\quad y = \frac{1}{c^2+d^2} \quad\text{for } g = \begin{pmatrix} a & b\\ c & d\end{pmatrix}.$$


The coadjoint $G$-action on $M$ and the usual action of $G$ on $M' = \mathbb H$ are equivariant (Exercise 8.2: Verify this). From (8.7) we have the description of the elements $(x_*,y_*,z_*)\in M$ as

$$x_* = -(ac+bd)\,k,$$

$$y_* = \tfrac12(a^2+b^2-c^2-d^2)\,k,$$

$$z_* = \tfrac12(a^2+b^2+c^2+d^2)\,k,$$

and, hence, a $G$-equivariant diffeomorphism

$$\psi: M\to M' = \mathbb H,\quad (x_*,y_*,z_*)\mapsto(x,y)$$

given by

$$x = \frac{x_*}{y_*-z_*},\quad y = \frac{k}{z_*-y_*}.$$

We use $\psi$ to pull back $\omega_1$ to a symplectic form $\omega_0$, which in local coordinates $(x_*,y_*)$ with $z_* = \sqrt{x_*^2+y_*^2+k^2}$, i.e. $z_*\,dz_* = x_*\,dx_* + y_*\,dy_*$, comes out as

$$\omega_0 = \psi^*\omega_1 = \frac{dx_*\wedge dy_*}{kz_*}.$$

(Verify this as Exercise 8.3.)

Now we have to compare this with the Kirillov-Kostant prescription above and determine the right scalar factor to get the Kirillov-Kostant symplectic form ω. We shall see that one has

$$\omega = -\frac{dx_* \wedge dy_*}{z_*}.$$

Perhaps the more experienced reader can directly read this off from the Kirillov-Kostant prescription. As we feel this is a nice occasion to get some more practice in handling the notions we introduced here, we propose to evaluate the general formula (8.5)

$$\omega_\eta(\eta')(\xi_X(\eta'), \xi_Y(\eta')) := \langle \eta', [X, Y] \rangle \quad \text{for all } X, Y \in \mathfrak{g},\ \eta' \in \mathcal{O}_\eta$$

in this situation. For the vector fields ξX acting on functions f on M we have the general formula

$$(\xi_X f)(\eta) = \frac{d}{dt} f((\exp tX) \cdot \eta)\Big|_{t=0}.$$

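Putting $g_X(t) := \exp tX$ one has

$$g_X(t) = \begin{pmatrix} e^t & 0 \\ 0 & e^{-t} \end{pmatrix}, \quad g_F(t) = \begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix}, \quad g_G(t) = \begin{pmatrix} 1 & 0 \\ t & 1 \end{pmatrix}.$$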


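And, putting A(g) for the matrix of the coadjoint action η ↦ g · η, one deduces from (8.7)

$$A(g_X(t)) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & (1/2)(e^{2t} + e^{-2t}) & (1/2)(e^{2t} - e^{-2t}) \\ 0 & (1/2)(e^{2t} - e^{-2t}) & (1/2)(e^{2t} + e^{-2t}) \end{pmatrix},$$

$$A(g_F(t)) = \begin{pmatrix} 1 & t & -t \\ -t & (1/2)(2 - t^2) & (1/2)t^2 \\ -t & (1/2)t^2 & (1/2)(2 + t^2) \end{pmatrix},$$

$$A(g_G(t)) = \begin{pmatrix} 1 & -t & -t \\ t & (1/2)(2 - t^2) & -(1/2)t^2 \\ -t & (1/2)t^2 & (1/2)(2 + t^2) \end{pmatrix}.$$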

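From (8.6) we get the differential operators (with Y = F + G, Z = F − G)

$$\xi_X = 2(z_*\partial_{y_*} + y_*\partial_{z_*}), \quad \xi_Y = -2(z_*\partial_{x_*} + x_*\partial_{z_*}), \quad \xi_Z = 2(y_*\partial_{x_*} - x_*\partial_{y_*}).$$

When we treat the functions f as functions in the two local coordinates (x∗, y∗) with $z_*^2 = x_*^2 + y_*^2 + k^2$, these operators reduce to

$$\xi_X = 2z_*\partial_{y_*}, \quad \xi_Y = -2z_*\partial_{x_*}, \quad \xi_Z = 2(y_*\partial_{x_*} - x_*\partial_{y_*}).$$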

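And for $\hat{X} = xX + yY + zZ$ the associated vector field in these local coordinates comes as

$$\xi_{\hat{X}} = \alpha\partial_{x_*} + \beta\partial_{y_*} = x\xi_X + y\xi_Y + z\xi_Z$$

with

$$\alpha = 2(zy_* - yz_*), \quad \beta = 2(xz_* - zx_*)$$

and analogously for $\hat{Y} = x'X + y'Y + z'Z$

$$\xi_{\hat{Y}} = \alpha'\partial_{x_*} + \beta'\partial_{y_*} = x'\xi_X + y'\xi_Y + z'\xi_Z.$$

One has

$$\alpha\beta' - \alpha'\beta = 4\big((zx' - xz')y_*z_* + (xy' - yx')z_*^2 + (yz' - y'z)x_*z_*\big)$$

and hence in (8.5) for ω = (1/z∗)dx∗ ∧ dy∗ the left hand side is

$$(1/z_*)\,dx_* \wedge dy_*(\xi_{\hat{X}}, \xi_{\hat{Y}}) = (1/z_*)\,dx_* \wedge dy_*(\alpha\partial_{x_*} + \beta\partial_{y_*},\ \alpha'\partial_{x_*} + \beta'\partial_{y_*}) = 4\big((zx' - xz')y_* + (xy' - yx')z_* + (yz' - y'z)x_*\big).$$

And this is exactly the negative of the outcome of the right hand side evaluated for η′ = x∗X∗ + y∗Y∗ + z∗Z∗

$$\langle \eta', [\hat{X}, \hat{Y}] \rangle = -4(xy' - x'y)z_* + 4(xz' - x'z)y_* - 4(yz' - y'z)x_*.$$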
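The sign comparison can be cross-checked numerically. This is a small sketch, assuming the trace form pairing and the matrices X = diag(1, −1), Y = (0 1; 1 0), Z = (0 1; −1 0), with η′ identified with the matrix x∗X + y∗Y + z∗Z; the names `pairing` and `lhs` are hypothetical.

```python
import math

def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def comb(x, y, z):
    """Matrix of xX + yY + zZ with X = diag(1,-1), Y = (0 1; 1 0), Z = (0 1; -1 0)."""
    return [[x, y + z], [y - z, -x]]

def pairing(eta, u, v):
    """Tr(eta' [X^, Y^]) for coordinate triples eta = (x*, y*, z*), u = (x, y, z), v = (x', y', z')."""
    E, U, V = comb(*eta), comb(*u), comb(*v)
    UV, VU = mul(U, V), mul(V, U)
    C = [[UV[i][j] - VU[i][j] for j in range(2)] for i in range(2)]
    EC = mul(E, C)
    return EC[0][0] + EC[1][1]

def lhs(eta, u, v):
    """(alpha*beta' - alpha'*beta)/z*, i.e. (1/z*) dx* ^ dy* on the two vector fields."""
    xs, ys, zs = eta
    (x, y, z), (x1, y1, z1) = u, v
    al, be = 2 * (z * ys - y * zs), 2 * (x * zs - z * xs)
    al1, be1 = 2 * (z1 * ys - y1 * zs), 2 * (x1 * zs - z1 * xs)
    return (al * be1 - al1 * be) / zs

eta, u, v = (2.0, -1.0, 3.0), (1.0, 2.0, 3.0), (-2.0, 5.0, 7.0)
print(lhs(eta, u, v), pairing(eta, u, v))  # the two numbers should be negatives of each other
```

The identity is purely algebraic, so it holds for any η′ with z∗ ≠ 0, not only for points on the orbit.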


Coadjoint Orbits for SU(2)

As already done several times, we take

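$$G := SU(2) = \Big\{ g = \begin{pmatrix} a & b \\ -\bar{b} & \bar{a} \end{pmatrix};\ a, b \in \mathbb{C},\ |a|^2 + |b|^2 = 1 \Big\}.$$

Thus, as a (real) manifold, G is the unit 3-sphere S³ ⊂ C². One has

$$\mathfrak{g} = \mathfrak{su}(2) = \{X \in M_2(\mathbb{C});\ X = -{}^t\bar{X},\ \mathrm{Tr}\,X = 0\} = \Big\{X = (1/2)\sum_{j=1}^{3} a_j H_j;\ \mathbf{a} = (a_1, a_2, a_3) \in \mathbb{R}^3\Big\}$$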

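with (see (6.5) in Section 6.5)

$$H_1 = \begin{pmatrix} -i & 0 \\ 0 & i \end{pmatrix}, \quad H_2 = \begin{pmatrix} 0 & -i \\ -i & 0 \end{pmatrix}, \quad H_3 = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix},$$

and

$$[H_1, H_2] = 2H_3, \quad [H_2, H_3] = 2H_1, \quad [H_3, H_1] = 2H_2.$$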
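As a quick sanity check of these relations, here is a minimal sketch using Python's built-in complex numbers; the matrices are the ones listed above, while the helper names are illustrative.

```python
I = 1j
H1 = ((-I, 0), (0, I))
H2 = ((0, -I), (-I, 0))
H3 = ((0, -1), (1, 0))

def mul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def scale(c, A):
    return tuple(tuple(c * A[i][j] for j in range(2)) for i in range(2))

def bracket(A, B):
    """Matrix commutator [A, B] = AB - BA."""
    AB, BA = mul(A, B), mul(B, A)
    return tuple(tuple(AB[i][j] - BA[i][j] for j in range(2)) for i in range(2))

def is_skew_hermitian(A):
    """Check A = -conj(transpose(A)), the defining condition of u(2)."""
    return all(A[i][j] == -complex(A[j][i]).conjugate()
               for i in range(2) for j in range(2))

print(bracket(H1, H2) == scale(2, H3),
      bracket(H2, H3) == scale(2, H1),
      bracket(H3, H1) == scale(2, H2))
```

All entries are Gaussian integers, so the floating point comparisons are exact here.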

su(2) is isomorphic to so(3) (see the exercises 6.4, 6.5 and 6.6 in Section 6.1) and, as usual, we identify X ∈ su(2) and a = (a1, a2, a3) ∈ R³ with

$$A = \begin{pmatrix} 0 & -a_3 & a_2 \\ a_3 & 0 & -a_1 \\ -a_2 & a_1 & 0 \end{pmatrix} =: a_1A_1 + a_2A_2 + a_3A_3 \in \mathfrak{so}(3).$$

Using this identification the Lie bracket corresponds to the vector product in R3

[A,B] = a ∧ b

and the trace form <,> in M3(R) to the scalar product (·, ·) in R3

< A, B >= −2(a,b) = −2 tab.
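Both statements can be confirmed with a few lines of integer arithmetic. The sketch below assumes the identification of a = (a1, a2, a3) with the skew-symmetric matrix A given above; `hat`, `cross` and `trace_form` are illustrative names.

```python
def hat(a):
    """Skew-symmetric matrix A attached to a = (a1, a2, a3)."""
    return [[0, -a[2], a[1]],
            [a[2], 0, -a[0]],
            [-a[1], a[0], 0]]

def cross(a, b):
    """Vector product a ^ b in R^3."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def mul3(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def bracket(A, B):
    AB, BA = mul3(A, B), mul3(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(3)] for i in range(3)]

def trace_form(A, B):
    """<A, B> = Tr(AB)."""
    AB = mul3(A, B)
    return AB[0][0] + AB[1][1] + AB[2][2]

a, b = (1, 2, 3), (-4, 5, 6)
print(bracket(hat(a), hat(b)) == hat(cross(a, b)))
print(trace_form(hat(a), hat(b)), -2 * sum(x * y for x, y in zip(a, b)))
```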

Since ⟨ , ⟩ is proportional to the euclidean scalar product on R³, we can use the trace form to identify so(3)∗ with so(3) via the prescription

(8.8) $$\mathfrak{so}(3)^* \ni \eta \longmapsto A_\eta \quad \text{such that } \eta(B) = \langle A_\eta, B \rangle \text{ for all } B \in \mathfrak{so}(3).$$

We choose a basis $(A_1^*, A_2^*, A_3^*)$ with $\langle A_i^*, A_j \rangle = \delta_{ij}$ and for an element η ∈ so(3)∗ we write $\eta \equiv A_\eta = a_1^*A_1^* + a_2^*A_2^* + a_3^*A_3^*$.

As is well known (see the proof of the fact that SU(2) is a twofold covering of SO(3) in section 4.3), using these identifications the coadjoint operations of SU(2) on su(2)∗ resp. so(3)∗ are simply the rotations of R³. The orbits are the spheres of radius s, $M_s := G \cdot \eta_s$ for $\eta_s := sA_3^*$ with s > 0, centered at the origin, and the origin itself. One has a symplectic structure on the sphere Ms of radius s given by the volume form, which is given by restriction to Ms of the 2-form

$$\omega_0 := (1/s) \sum_{i,j,k \text{ cyclic}} a_i^*\, da_j^* \wedge da_k^*.$$


As local coordinates we use $(a_1^*, a_2^*)$ such that Ms is given by the parametrization

$$\psi : J := \{(a_1^*, a_2^*);\ (a_1^*)^2 + (a_2^*)^2 \le s^2\} \longrightarrow M_s, \quad (a_1^*, a_2^*) \longmapsto (a_1^*, a_2^*, a_3^*), \quad (a_3^*)^2 = s^2 - (a_1^*)^2 - (a_2^*)^2.$$

Exercise 8.4: Verify that one has

$$\omega_0 = (s/a_3^*)\, da_1^* \wedge da_2^*$$

and

$$\mathrm{vol}_2(M_s) = 2\int_J \psi^*\omega_0 = 4\pi s^2.$$

We want to compare this with the Kirillov-Kostant recipe

$$\omega_\eta(A, B) = \eta([A, B]),$$

resp. (8.5)

$$\omega(\xi_A, \xi_B) = \langle \eta', [A, B] \rangle.$$

We have

$$A = a_1A_1 + a_2A_2 + a_3A_3, \quad B = b_1A_1 + b_2A_2 + b_3A_3,$$

i.e.

$$[A, B] = (a_1b_2 - a_2b_1)A_3 + (a_2b_3 - a_3b_2)A_1 + (a_3b_1 - a_1b_3)A_2,$$

and with $\eta = a_1^*A_1^* + a_2^*A_2^* + a_3^*A_3^*$

(8.9) $$\omega_\eta(A, B) = (a_1b_2 - a_2b_1)a_3^* + (a_2b_3 - a_3b_2)a_1^* + (a_3b_1 - a_1b_3)a_2^*.$$

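As in the preceding example we determine the associated vector fields $\xi_{A_j}$. Namely, using here $g_j(t) = \exp tA_j$, j = 1, 2, 3,

$$\xi_{A_j} f(\eta) = \frac{d}{dt} f(g_j(t) \cdot \eta)\Big|_{t=0}$$

leads to

$$\xi_{A_1} = -a_3^*\partial_{a_2^*} + a_2^*\partial_{a_3^*}, \quad \xi_{A_2} = a_3^*\partial_{a_1^*} - a_1^*\partial_{a_3^*}, \quad \xi_{A_3} = -a_2^*\partial_{a_1^*} + a_1^*\partial_{a_2^*},$$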

and hence in the local coordinates fixed above

$$\xi_A = a_1\xi_{A_1} + a_2\xi_{A_2} + a_3\xi_{A_3} = \alpha\partial_{a_1^*} + \beta\partial_{a_2^*},$$

with

$$\alpha = a_2a_3^* - a_3a_2^*, \quad \beta = a_3a_1^* - a_1a_3^*$$

and the corresponding expressions α′, β′ for ξB, replacing the aj by bj. We get

$$\alpha\beta' - \alpha'\beta = (a_2b_3 - a_3b_2)a_1^*a_3^* + (a_1b_2 - a_2b_1)a_3^*a_3^* + (a_3b_1 - a_1b_3)a_2^*a_3^*.$$
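The last identity says that αβ′ − α′β equals a∗3 times the right hand side of (8.9). A minimal sketch of a machine check, with hypothetical function names and integer test data chosen arbitrarily:

```python
def wedge_coefficient(a, b, e):
    """alpha*beta' - alpha'*beta for coefficient triples a, b and a point e = (a1*, a2*, a3*)."""
    al = a[1] * e[2] - a[2] * e[1]
    be = a[2] * e[0] - a[0] * e[2]
    al1 = b[1] * e[2] - b[2] * e[1]
    be1 = b[2] * e[0] - b[0] * e[2]
    return al * be1 - al1 * be

def formula_8_9(a, b, e):
    """Right hand side of (8.9): the pairing of eta with [A, B]."""
    return ((a[0] * b[1] - a[1] * b[0]) * e[2]
            + (a[1] * b[2] - a[2] * b[1]) * e[0]
            + (a[2] * b[0] - a[0] * b[2]) * e[1])

a, b, e = (1, 2, 3), (-4, 5, 6), (2, -3, 7)
print(wedge_coefficient(a, b, e), e[2] * formula_8_9(a, b, e))
```

The check is independent of whether the point e lies on a particular sphere, since the identity is polynomial in all nine variables.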


By evaluation at $\eta = sA_3^*$ one obtains

$$\omega_0(\xi_A, \xi_B) = (s/a_3^*)(\alpha\beta' - \alpha'\beta) = s^2(a_1b_2 - a_2b_1).$$

We compare this with (8.9) and see that the Kirillov-Kostant form in this case is

$$\omega = s\omega_0 = (s^2/a_3^*)\, da_1^* \wedge da_2^*.$$

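The isotropy group of η is

$$G_\eta = \Big\{ h = \begin{pmatrix} e^{it} & 0 \\ 0 & e^{-it} \end{pmatrix},\ t \in \mathbb{R} \Big\} \simeq U(1).$$

G/Gη is identified with the sphere Ms ≃ S² via the important Hopf map

$$\psi : G = SU(2) \longrightarrow M_s$$

given by

$$(a, b) \longmapsto s(\bar{b}a + \bar{a}b,\ i\bar{b}a - i\bar{a}b,\ a\bar{a} - b\bar{b}).$$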
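A numerical illustration of the Hopf map may help here. The sketch assumes that the three components are read as s·2Re(āb), s·2Im(āb) and s(|a|² − |b|²); the placement of the complex conjugation bars is an assumption on our part, since they are not recoverable with certainty from the printed formula. It checks that the image lies on the sphere of radius s and that a common phase factor on (a, b) does not change the image.

```python
import cmath
import math

def hopf(a, b, s):
    """Hopf map (a, b) -> s * (2 Re(conj(a) b), 2 Im(conj(a) b), |a|^2 - |b|^2)."""
    w = a.conjugate() * b
    return (s * 2 * w.real, s * 2 * w.imag, s * (abs(a) ** 2 - abs(b) ** 2))

# A point of SU(2), i.e. |a|^2 + |b|^2 = 1.
a = 0.6 * cmath.exp(0.3j)
b = 0.8 * cmath.exp(-1.1j)
s = 2.5

p = hopf(a, b, s)
print(math.sqrt(sum(c * c for c in p)))      # radius of the image point

u = cmath.exp(0.9j)                          # element of U(1)
q = hopf(u * a, u * b, s)
print(max(abs(x - y) for x, y in zip(p, q))) # distance between the two images
```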

By some more calculation using the relation

$$\bar{a}\,da + a\,d\bar{a} + \bar{b}\,db + b\,d\bar{b} = 0,$$

which is a consequence of $a\bar{a} + b\bar{b} = 1$, we get for the pullback of the Kirillov-Kostant form by the Hopf map

$$\psi^*\big((s/a_3^*)\, da_1^* \wedge da_2^*\big) = 2is(da \wedge d\bar{a} + db \wedge d\bar{b}).$$

As we already constructed bundles on P¹(C) (in 7.6 Example 7.19 and in 8.2.2) it is quite fruitful to compare this description of the coadjoint SU(2)-orbits with the following alternative: We discussed (in 7.1 Example 7.3) the identification

$$SU(2)/U(1) \simeq \mathbf{P}^1(\mathbb{C}) = \{(u : v);\ (u, v) \in \mathbb{C}^2 \setminus \{(0, 0)\}\}$$

essentially given by

g −→ z = u/v = b/a,

which is a consequence of g acting by multiplication from the left on the column t(u, v). The same way, one has an identification ψ via

(8.10) g −→ z = v/u = b/a

coming from multiplication from the right to the row (u, v). If we take the symplectic form ω studied in 8.2.2

$$\omega = 2ni\, \frac{dz \wedge d\bar{z}}{(1 + |z|^2)^2}, \quad n \in \mathbb{Z},$$

and via ψ pull it back to SU(2), we get again the form obtained above

$$2ni(da \wedge d\bar{a} + db \wedge d\bar{b})$$

and see that we have quantizable orbits for half integral s resp. n.

Now that we have examples for coadjoint orbits and bundles living on them, we continue in the program to use the sections of these bundles as representation spaces, and we have to discuss how the general polarization business works in practical cases.


8.2.5 Construction of an Irreducible Unitary Representation by an Orbit

If we want to construct representations from a coadjoint orbit of a group, the orbit has to be admissible, i.e. “integral” and provided with a “nice” polarization. Then one can construct a complex line bundle whose polarized sections lead to the space for a unitary (and perhaps even irreducible) representation of the group. Or, in another formulation, this representation can be obtained by an induction procedure from a subgroup arising from the polarized orbit. Up to now, all this is not in a final form if one strives for maximal generality (which we do not do in this text). Though there are now several more refined procedures using and discussing Duflo’s version of admissibility, as for instance in Vogan [Vo] p.193 ff or in Torasso [To], we will follow the procedure designed by Kirillov [Ki] p.235 ff and [Ki1] (but in some places adapting our notation to the one used by Vogan). Similar presentations can be found in [GS1] p.235 and [Wo] p.103.

Step 1. For each orbit Oη ⊂ g∗ we look for real subalgebras n ⊂ g which are subordinate to η. This means that the condition

$$\langle \eta, [X, Y] \rangle = 0 \quad \text{for all } X, Y \in \mathfrak{n}$$

holds or, equivalently (this looks quite different but is not too hard to prove), the map

$$Y \longmapsto 2\pi i \langle \eta, Y \rangle$$

defines a one-dimensional representation ϱ = ϱη of n.

Step 2. We say that n is a real algebraic polarization of η if in addition the condition

$$2 \dim(\mathfrak{g}/\mathfrak{n}) = \dim \mathfrak{g} - \dim \mathfrak{g}_\eta$$

is satisfied, i.e. the codimension of n in g is half the dimension of the orbit. The notion of a complex algebraic polarization is defined in the same way: we extend η to gc = g ⊗ C by complex linearity and consider complex subalgebras n ⊂ gc that satisfy the corresponding conditions as in the real case.

An algebraic polarization is called admissible if it is invariant under Ad Gη. In the sequel, we only look for admissible algebraic polarizations.

The relation of these “algebraic” polarizations to the “geometric” ones defined earlier is explained in [Ki1] p.28/9 and will be made explicit in a simple example later.

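Step 3. Moreover, we suppose (as in [Ki] p.238) that in the complex case $\mathfrak{m}_c = \mathfrak{n} + \bar{\mathfrak{n}}$ is a subalgebra of gc such that we have closed subgroups Q and M of G with Lie algebras q and m fulfilling the relations

$$L := G_\eta \subset Q = G_\eta Q^0 \subset M, \quad \mathfrak{n} \cap \bar{\mathfrak{n}} = \mathfrak{q}_c, \quad \mathfrak{n} + \bar{\mathfrak{n}} = \mathfrak{m}_c.$$

In the real case, we have q = n.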


Step 4. Finally, and this is our practical criterion to guarantee the integrality or quantizability of the orbit, we suppose that ϱη can be integrated to a unitary character χQ of Q with dχQ = ϱη. Then we get a unitary representation π of our group G on (the completion of) the space F+ := C∞(G, n, Q, ϱ, χQ) of smooth C-valued functions φ on G with

$$\phi(gl) = \Delta_Q(l)^{-1/2} \chi_Q(l^{-1}) \phi(g) \quad \text{for all } l \in L,\ g \in G$$

and

$$L_Y \phi = 0 \quad \text{for all } Y \in \mathfrak{u}, \quad \mathfrak{n} =: \mathfrak{g}_\eta + \mathfrak{u},$$

where ΔQ is the usual modular function and $L_Y\phi(g) = \frac{d}{dt}\phi(g \exp tY)|_{t=0}$.

(Here we follow [Vo] p.194 and p.199. For an essentially equivalent approach see [Ki] p.199ff.)

Though this looks (and is) rather intricate, we shall see that the scheme can easily be realized in our simple examples, in particular for the Heisenberg group in the next but one section. As already said, there are a lot of subtle questions hidden here. For instance (see [Ki] p.236), as another condition Pukanszky’s condition, i.e. the additional property η + n⊥ ⊂ Oη, comes in, where n⊥ denotes the annihilator of n in g∗. And, as discussed in [Ki] p.239, the fundamental group π1(O) of the orbit plays an important role in the integration of the representation ϱη of the Lie algebra q to a representation of the group Q. This leads to the notion of rigged orbits, i.e. orbits in g∗rigg, the set of pairs (η, χ) where η ∈ g∗ and χ is here a one-dimensional representation of Gη such that dχ = 2πiη|gη (see [Ki1] p.123 (and Vogan’s criticism of Kirillov’s book on his homepage)). Here we only want to give an impression, and (from [Ki] p.241) we cite a theorem by Kostant and Auslander:

Theorem 8.4: Assume G to be a connected and simply connected solvable Lie group. Then representations that correspond to different orbits or different characters of the fundamental group of the orbit are necessarily inequivalent.

And (a special version of) the famous Borel-Weil-Bott theorem says:

Theorem 8.5: All irreducible representations of a compact, connected, and simply connected Lie group G correspond to integral G-orbits of maximal dimension in g∗.

In more detail, in our special examples the result of this procedure is as follows.

8.3 The Examples SU(2) and SL(2,R)

1. G = SU(2)

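We begin by taking up again the example G = SU(2), which has already been treated here several times. Using (6.5) in Section 6.5, we take

$$\mathfrak{g} = \{X = \textstyle\sum a_jH_j;\ a_1, a_2, a_3 \in \mathbb{R}\}$$

and, similar to (8.8) in 8.2.4, we choose $(H_j^*)$ such that $\langle H_j^*, H_k \rangle = \delta_{jk}$ and

$$\mathfrak{g}^* = \{\eta = \textstyle\sum a_j^*H_j^*;\ a_1^*, a_2^*, a_3^* \in \mathbb{R}\}.$$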


From (8.9) for the elliptic orbit Oη ≃ SU(2)/U(1) passing through $\eta = sH_1^*$ we have the Kirillov-Kostant prescription

$$\omega_\eta(X, Y) = s(a_2b_3 - a_3b_2).$$

One has gη = Lie U(1) ≃ ⟨H1⟩. As g has no two-dimensional subalgebras n, there is no real polarization. But one has two complex polarizations given by the subalgebras n+ = ⟨H0, H+⟩ and n− = ⟨H0, H−⟩ of the complexification gc = ⟨H0, H±⟩ with

$$H_0 = iH_1, \quad H_\pm = H_2 \pm iH_3.$$

As one easily verifies, both algebras are subordinate to η in the sense defined above in Subsection 8.2.5 and, for s = k ∈ N0, the character of gη

$$\mathfrak{g}_\eta \ni Y \longmapsto 2\pi i \langle \eta, Y \rangle$$

integrates to a unitary character χk of H = U(1) identified as a subgroup of G = SU(2). In this case the space F+ is the space belonging to the induced representation $\mathrm{ind}_H^G \chi_k$ restricted by the polarization condition

$$L_{H_\pm}\phi = 0$$

since one has n+ = gη + ⟨H+⟩ or n− = gη + ⟨H−⟩. And this space can be understood exactly as the subspace of holomorphic resp. antiholomorphic smooth sections of the real bundle B over P¹(C) from 8.2 and the end of 7.6: To show this, we determine the appropriate differential operators acting on the smooth sections f of B resp. the functions φ on G with φ(gh) = χ2s(h⁻¹)φ(g), where we denote φ(g) =: φ(a, b) =: φ(α, α′, β, β′), decomposing our standard coordinates for g into real coordinates

$$a =: \alpha + i\alpha', \quad b =: \beta + i\beta'.$$

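We put $g_j(t) := \exp tH_j$ and get

$$g_1(t) = \begin{pmatrix} e^{-it} & 0 \\ 0 & e^{it} \end{pmatrix}, \quad g_2(t) = \begin{pmatrix} \cos t & -i\sin t \\ -i\sin t & \cos t \end{pmatrix}, \quad g_3(t) = \begin{pmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{pmatrix}.$$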

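Application of the prescription $L_{H_j}\phi(g) := \frac{d}{dt}\phi(g g_j(t))|_{t=0}$ to φ as a function of the four real variables α, α′, β, β′ leads to the operators

$$L_{H_1} = \alpha'\partial_\alpha - \alpha\partial_{\alpha'} - \beta'\partial_\beta + \beta\partial_{\beta'},$$

$$L_{H_2} = \beta'\partial_\alpha - \beta\partial_{\alpha'} + \alpha'\partial_\beta - \alpha\partial_{\beta'},$$

$$L_{H_3} = \beta\partial_\alpha + \beta'\partial_{\alpha'} - \alpha\partial_\beta - \alpha'\partial_{\beta'},$$

and hence, using the Wirtinger relations $\partial_a = (1/2)(\partial_\alpha - i\partial_{\alpha'})$, $\partial_{\bar{a}} = (1/2)(\partial_\alpha + i\partial_{\alpha'})$ (and analogously for b), we get the operators

$$L_{H_2 + iH_3} = 2i(\bar{b}\partial_{\bar{a}} - a\partial_b), \quad L_{H_2 - iH_3} = -2i(b\partial_a - \bar{a}\partial_{\bar{b}})$$

acting on the functions φ here written in the coordinates $a, \bar{a}, b, \bar{b}$. If φ comes from a section f of the bundle B over P¹(C), for b ≠ 0 it is of the form (see (8.10) in 8.2)

φ(a, ā, b, b̄) = f(z, z̄), z = a/b.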


Hence we get

LH2+iH3φ = (2i/b2)fz, LH2−iH3φ = (2i/b2)fz,

and we see that the polarization by n± provides that the section f has to be holomorphic resp. antiholomorphic.

Summary: For G = SU(2) (and SO(3)) all irreducible unitary representations can be constructed by the orbit method.
Each integral orbit Oη carries a line bundle and has two complex polarizations.
The corresponding polarized sections provide equivalent representations.

2. G = SL(2,R)

We continue by completing the results obtained for the orbits from the SL(2)-theory in the last section. The orbits are all realized as subsets of

$$M_\beta := \{(x_*, y_*, z_*) \in \mathbb{R}^3;\ x_*^2 + y_*^2 - z_*^2 = \beta\}$$

with convenient β ∈ R.

2.1. Hyperbolic Orbits

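For the hyperbolic orbit $\mathcal{O}_\eta = G \cdot \alpha X_* = M_{\alpha^2} \simeq G/L$, L = Gη = MA, α ≠ 0, we have Ad L-invariant real polarizations given by

$$\mathfrak{q}_1 := \langle X, Y + Z \rangle \quad \text{and} \quad \mathfrak{q}_2 := \langle X, Y - Z \rangle.$$

The associated characters ρ1 and ρ2 of q1 resp. q2, given by

$$\rho_1(\hat{Y}) = 2\pi i \langle \alpha X_*, xX + y(Y + Z) \rangle = 4\pi i \alpha x \quad \text{for } \hat{Y} = xX + y(Y + Z)$$

and

$$\rho_2(\hat{Y}) = 2\pi i \langle \alpha X_*, xX + z(Y - Z) \rangle = 4\pi i \alpha x \quad \text{for } \hat{Y} = xX + z(Y - Z),$$

integrate uniquely to unitary characters χj,s (j = 1, 2) of the respective groups

$$Q_1 = MNA \quad \text{and} \quad Q_2 = M\bar{N}A,$$

which are trivial on Qj ∩ K. For

$$b = \varepsilon\, n(x)\, t(y) \quad \text{resp.} \quad b = \varepsilon\, \bar{n}(x)\, t(y)$$

these characters are given by

$$\chi_{j,s}(b) := (y^{1/2})^{is} \quad \text{with } s = 4\pi\alpha.$$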


Then by inducing these characters from Qj to G one gets irreducible unitary representations. This coincides with the fact that one has the principal series representations $P^{\pm,is} := \pi_{is,\pm}$, which we constructed in 7.2 and which have models consisting of (smooth) functions φ = φ(g) on G with

$$\phi(m\, n(x)\, t(y)\, g) = \gamma(m)\, (y^{1/2})^{(is+1)}\, \phi(g), \quad \gamma\begin{pmatrix} \varepsilon & 0 \\ 0 & \varepsilon \end{pmatrix} = \begin{cases} \varepsilon & \text{for } - \\ 1 & \text{for } + , \end{cases}$$

i.e. in a space spanned by functions

$$\phi_{s,2k},\ k \in \mathbb{Z} \quad \text{for } +$$

and

$$\phi_{s,2k+1},\ k \in \mathbb{Z} \quad \text{for } -$$

with

(8.11) $$\phi_{s,j}(g) = y^{(is+1)/2} e^{ij\theta}, \quad j \in \mathbb{Z}.$$

The unitary equivalence $P^{\pm,is} \simeq P^{\pm,-is}$ can be translated into the fact that both polarizations produce equivalent representations (see the discussion in [GS1] p. 299 f). Thus, by the orbit method, the construction above yields only one half of the representations. This arises from the fact that the orbit $M_{\alpha^2}$ is not simply connected but has here a fundamental group $\pi_1(M_{\alpha^2}) \simeq \mathbb{Z}$, so that we have one representation (the even one) belonging to id ∈ $\pi_1(M_{\alpha^2})$ and the other one to a generator of $\pi_1(M_{\alpha^2})$ (see the discussion in [Ki3] p.463). This may serve as motivation to introduce the notion of rigged orbits already mentioned at the end of our Subsection 8.2.5.

2.2. Elliptic Orbits

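For the elliptic orbits $M^\pm_{-\alpha^2} := \mathcal{O}_\eta = G \cdot \alpha Z_* \simeq G/SO(2)$, α ≠ 0, consisting of the upper half $M^+_{-\alpha^2}$ (i.e. with z∗ > 0) of the two-sheeted hyperboloid $M_{-\alpha^2}$ for α > 0 and the lower half $M^-_{-\alpha^2}$ (with z∗ < 0) for α < 0, we have two purely complex polarizations: We take

$$Z_1 = -iZ = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad X_\pm = (1/2)(X \pm iY) = (1/2)\begin{pmatrix} 1 & \pm i \\ \pm i & -1 \end{pmatrix}$$

with

$$[Z_1, X_\pm] = \pm 2X_\pm, \quad [X_+, X_-] = Z_1,$$

and have

$$\mathfrak{g}_c = \langle Z_1, X_+, X_- \rangle.$$

This establishes the complex bilinear form

$$B_\eta([\hat{X}, \hat{Y}]) = i\alpha(a_+b_- - a_-b_+)$$

for

$$\hat{X} := a_0Z_1 + a_+X_+ + a_-X_-, \quad \hat{Y} := b_0Z_1 + b_+X_+ + b_-X_-.$$

We have Bη ≡ 0 on the Ad Gη-invariant subalgebras

$$\mathfrak{n} := \mathfrak{n}_\pm = \langle Z_1, X_\pm \rangle_{\mathbb{C}}$$

of gc and

$$\mathfrak{n} + \bar{\mathfrak{n}} = \mathfrak{g}_c, \quad \mathfrak{n} \cap \bar{\mathfrak{n}} = \mathfrak{k}_c = \langle Z_1 \rangle, \quad \mathfrak{n}_\pm \cap \mathfrak{g}_1 = \langle Y - Z \rangle_{\mathbb{R}} = \mathfrak{k}.$$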

At first we see that a one-dimensional representation of k is defined by

$$\rho(a_0Z_1) := i \Big\langle \begin{pmatrix} 0 & \alpha \\ -\alpha & 0 \end{pmatrix}, a_0Z_1 \Big\rangle = i\, 2\alpha\, a_0$$

and this integrates to a representation χk of SO(2) (with $\chi_k(r(\theta)) = e^{ik\theta}$) exactly for 2α =: k ∈ Z \ {0}. Thus the elliptic orbits are quantizable for 2α = k ∈ Z \ {0}. Then we use n+ and n− to define a complex structure on $M = M^\pm_{-\alpha^2} \simeq G_1/SO(2) \simeq \mathbb{H}^\pm$. And by the usual procedure we get the discrete series representations $\pi_k^\pm$, k ∈ N, realized on the L²-spaces of holomorphic sections of the holomorphic line bundles L on $M^\pm_{-\alpha^2}$. With similar calculations as we did for SU(2), these can also be interpreted as spaces of polarized sections φ of the bundle belonging to χk on $M^\pm_{-\alpha^2}$, i.e. with

$$L_{X_+}\phi = 0 \quad \text{resp.} \quad L_{X_-}\phi = 0.$$

These representations are also representations of K = SO(2) and we have

$$\mathrm{mult}(\chi_{k'}, \pi_k^+) = 1 \quad \text{for } k' = k + 2l,\ l \in \mathbb{N}_0, \quad \text{and } = 0 \text{ else.}$$

2.3. Nilpotent Orbits

In the nilpotent case $\mathcal{O}_\eta = G \cdot (Y_* + Z_*) = M_0^+ = G/N$ we have one real polarization. The form $\omega_{Y_* + Z_*}$ is zero on the Ad N-invariant subalgebra of g given by

$$\mathfrak{n} = \langle X, Y - Z \rangle.$$

The standard representation ρ of n is trivial

$$\rho(xX + y(Y - Z)) = \langle Z_*, xX + y(Y - Z) \rangle = 0.$$

The group $B = MNA = \{\pm\begin{pmatrix} a & b \\ 0 & a^{-1} \end{pmatrix};\ a > 0,\ b \in \mathbb{R}\}$ has Lie B = n. We can (at least formally) proceed as follows. If χ = id denotes the trivial representation of B, one has here the representation $\mathrm{ind}_B^G\, \mathrm{id}$ given by right translation on the space of functions φ : G −→ C with

$$\phi(bg) = y^{1/2}\phi(g) \quad \text{for } b \in B,\ g \in G,$$

(spanned by $\phi_\ell(g) = y^{1/2}e^{i\ell\theta}$, ℓ ∈ 2Z), i.e. the representation $P^{+,0}$. As in the previous case, here we have a fundamental group π1 ≃ Z, so there is a second representation $P^{-,0}$ consisting of odd functions and spanned by $\phi_\ell(g) = y^{1/2}e^{i\ell\theta}$, ℓ ∈ 1 + 2Z, which decomposes into irreducible halves

$$P^{-,0} = \pi_1^+ \oplus \pi_1^-$$

(see for instance [Kn] p.36). The lower half of the cone $M_0^- = G_1 \cdot Y_*$ generates an equivalent situation.


All this is a small item in the big and important theme of how to attach representations to nilpotent orbits (see in particular the discussion in Example 12.3 of Vogan [Vo1] p.386).

Summary

For the representations of G = SL(2,R), we get a correspondence between
– elliptic orbits and discrete series representations,
– (rigged) hyperbolic orbits and principal series representations.
We do not get the complementary series.
For the nilpotent orbits in the SL(2,R)-theory, it still seems not to be clear how the correspondence is to be fixed (see the discussion in [Vo1] p.386). To me it seems most probable (see in particular [Re] p.I.110f) to associate to the nilpotent cone the principal series P+,0 and the (decomposing) P−,0 = D1 ⊕ D−1.

Let us finally point to a remark by Kirillov in [Ki3] 8.4 and [Ki1] 6.4 that he sees the possibility to extend the correspondence framework such that even the complementary series fits in. This should be further elucidated. Up to now one has here an example of an imperfect matching between the unitary dual and the set of coadjoint orbits. But there is a conjecture that by the orbit method one gets all representations appearing in the decomposition of the regular representation of a group ([Ki1] p 204): “Indeed, according to the ideology of the orbit method, the partition of g∗ into coadjoint orbits corresponds to the decomposition of the regular representation into irreducible components”.

8.4 The Example Heis(R)

The treatment of the Heisenberg group is even easier than that of the two previous examples. But one has one additional complication, as one can no longer Ad G-equivariantly identify the Lie algebra with its dual via the trace or Killing form. So we use the original definition of the coadjoint representation to determine the set O(G) of coadjoint orbits and compare it with the unitary dual Ĝ of the Heisenberg group G = Heis(R), which we determined in Example 7.11 in 7.4 as the disjoint union of the plane R² with a real line without its origin R \ {0}. We shall get a perfect matching between Ĝ and O(G).

As the formulae are a bit smoother, we work with the realization of the Heisenberg group by three-by-three matrices, i.e. we take

$$G := \mathrm{Heis}'(\mathbb{R}) = \Big\{ g = \begin{pmatrix} 1 & a & c \\ 0 & 1 & b \\ 0 & 0 & 1 \end{pmatrix};\ a, b, c \in \mathbb{R} \Big\}$$

and

$$\mathfrak{g} := \Big\{ \hat{X} = \begin{pmatrix} 0 & x & z \\ 0 & 0 & y \\ 0 & 0 & 0 \end{pmatrix} = xX + yY + zZ;\ x, y, z \in \mathbb{R} \Big\}.$$


The Coadjoint Action

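As in 8.2, we have the adjoint representation Ad of G given on g by

$$(\mathrm{Ad}\, g)\hat{X} = g\hat{X}g^{-1} = \begin{pmatrix} 1 & a & c \\ 0 & 1 & b \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 0 & x & z \\ 0 & 0 & y \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} 1 & -a & -c + ab \\ 0 & 1 & -b \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & x & z + ay - bx \\ 0 & 0 & y \\ 0 & 0 & 0 \end{pmatrix}.$$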
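The matrix product above is short enough to check by hand, but a machine check costs little. The following sketch multiplies the three 3×3 matrices with integer entries; the helper names (`heis`, `lie`, `heis_inv`, `Ad`) are illustrative and not taken from the text.

```python
def heis(a, b, c):
    """Group element g = (a, b, c) of Heis'(R) as an upper triangular matrix."""
    return [[1, a, c], [0, 1, b], [0, 0, 1]]

def lie(x, y, z):
    """Lie algebra element xX + yY + zZ."""
    return [[0, x, z], [0, 0, y], [0, 0, 0]]

def mul3(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def heis_inv(a, b, c):
    """Inverse of g = (a, b, c), namely (-a, -b, -c + ab)."""
    return heis(-a, -b, -c + a * b)

def Ad(g, X):
    """Adjoint action g X g^{-1} for coordinate triples g = (a, b, c), X = (x, y, z)."""
    return mul3(mul3(heis(*g), lie(*X)), heis_inv(*g))

g, X = (2, 3, 5), (7, 11, 13)
a, b, c = g
x, y, z = X
print(Ad(g, X) == lie(x, y, z + a * y - b * x))
```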

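The coadjoint representation Ad∗ is the contragredient of Ad (see 1.4.3) and given on g∗ by Ad∗g = (Ad g⁻¹)∗, i.e. if we write

$$\langle \eta, \hat{X} \rangle := \eta(\hat{X})$$

for all η ∈ g∗ and X̂ ∈ g, then Ad∗g is fixed by the relation

$$\langle \mathrm{Ad}^*g\, \eta, \hat{X} \rangle = \langle \eta, (\mathrm{Ad}\, g^{-1})\hat{X} \rangle \quad \text{for all } \hat{X} \in \mathfrak{g}.$$

We have

$$(\mathrm{Ad}\, g^{-1})\hat{X} = \begin{pmatrix} 0 & x & z - ay + bx \\ 0 & 0 & y \\ 0 & 0 & 0 \end{pmatrix}.$$

And if we take X∗, Y∗, Z∗ as a basis for g∗, dual to X, Y, Z with respect to ⟨ , ⟩, and write

$$\mathfrak{g}^* \ni \eta = x_*X_* + y_*Y_* + z_*Z_*,$$

we have

$$\mathrm{Ad}^*g\, \eta = \tilde{x}_*X_* + \tilde{y}_*Y_* + \tilde{z}_*Z_*$$

with

(8.12) $$\tilde{x}_* = x_* + bz_*, \quad \tilde{y}_* = y_* - az_*, \quad \tilde{z}_* = z_*.$$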
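Formula (8.12) can be tested against the defining relation of the contragredient representation. In this minimal sketch (all function names hypothetical) the adjoint action of g⁻¹ is computed by matrix multiplication, and the pairing uses the basis X∗, Y∗, Z∗ dual to X, Y, Z.

```python
def heis(a, b, c):
    return [[1, a, c], [0, 1, b], [0, 0, 1]]

def lie(x, y, z):
    return [[0, x, z], [0, 0, y], [0, 0, 0]]

def mul3(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def pair(eta, M):
    """<eta, M> for eta = (x*, y*, z*) and a strictly upper triangular matrix M."""
    xs, ys, zs = eta
    return xs * M[0][1] + ys * M[1][2] + zs * M[0][2]

def coadjoint(g, eta):
    """Formula (8.12): (x*, y*, z*) -> (x* + b z*, y* - a z*, z*)."""
    a, b, c = g
    xs, ys, zs = eta
    return (xs + b * zs, ys - a * zs, zs)

def ad_inverse(g, X):
    """(Ad g^{-1}) X = g^{-1} X g, computed with matrices."""
    a, b, c = g
    g_inv = heis(-a, -b, -c + a * b)
    return mul3(mul3(g_inv, lie(*X)), heis(a, b, c))

g, eta, X = (2, 3, 5), (7, -1, 4), (1, 6, -2)
print(pair(coadjoint(g, eta), lie(*X)), pair(eta, ad_inverse(g, X)))
```

Note that the third coordinate z∗ is never changed by the action, which is what produces the two types of orbits discussed next.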

There is a more elegant approach to this result: In [Ki1] p.61 Kirillov uses the trace form on M3(R) to identify g∗ with the space of lower-triangular matrices of the form

$$\eta = \begin{pmatrix} * & * & * \\ x & * & * \\ z & y & * \end{pmatrix}.$$

Here the stars remind us that one actually considers the quotient space of M3(R) by the subspace g⊥ of upper-triangular matrices (including the diagonal). This way, one gets

$$\mathrm{Ad}^*g\, \eta = g \begin{pmatrix} * & * & * \\ x & * & * \\ z & y & * \end{pmatrix} g^{-1} = \begin{pmatrix} * & * & * \\ x + bz & * & * \\ z & y - az & * \end{pmatrix}.$$


Exercise 8.5: Do the same for the realization of the Heisenberg group and its Lie algebra by four-by-four matrices as in 0.1 and 6.3. Verify that Ad∗g is given by

$$(p_*, q_*, r_*) \longmapsto (p_* + 2\mu r_*,\ q_* - 2\lambda r_*,\ r_*)$$

for g = (λ, µ, κ) and η = p∗P∗ + q∗Q∗ + r∗R∗, where P∗, Q∗, R∗ is a basis of g∗ dual to P, Q, R.

The Coadjoint Orbits

From (8.12) it is easy to see that one has just two types of coadjoint orbits Oη = G · η ≃ G/Gη.

i) For η = mZ∗, m ≠ 0, we have a two-dimensional orbit

$$\mathcal{O}_m := \mathcal{O}_{mZ_*} = \{(x_*, y_*, m);\ x_*, y_* \in \mathbb{R}\} \simeq \mathbb{R}^2.$$

ii) For η = rX∗ + sY∗, r, s ∈ R, as orbit we have the point

$$\mathcal{O}_{rs} := \mathcal{O}_{rX_* + sY_*} = \{(r, s, 0)\}.$$

Obviously g∗ ≃ R³ is the disjoint union of all these orbits, and the set of orbits is in one-to-one correspondence with the unitary dual Ĝ described above. Now, we construct representations following the general procedure outlined in 8.2:

Orbits as Symplectic Manifolds

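For M := Om ≃ R², m ≠ 0, we have as symplectic forms all multiples of ω0 = dx∗ ∧ dy∗. As it is a nice exercise in the handling of our notions, we determine the factor for the Kirillov-Kostant form ω: Again we use (8.5)

$$\omega_\eta(\eta')(\xi_{\hat{X}}(\eta'), \xi_{\hat{Y}}(\eta')) = \langle \eta', [\hat{X}, \hat{Y}] \rangle \quad \text{for all } \hat{X}, \hat{Y} \in \mathfrak{g},\ \eta' \in \mathcal{O}_\eta.$$

We put

$$\hat{X} := xX + yY + zZ, \quad \hat{Y} := x'X + y'Y + z'Z,$$

and have

$$[\hat{X}, \hat{Y}] = (xy' - x'y)Z$$

and

$$\langle \eta', [\hat{X}, \hat{Y}] \rangle = (xy' - x'y)z_* \quad \text{for } \eta' = x_*X_* + y_*Y_* + z_*Z_*.$$

We determine the vector fields ξ using again (8.6)

$$(\xi_X f)(\eta) = \frac{d}{dt}\big(f(\exp tX \cdot \eta)\big)\Big|_{t=0}.$$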
One has

gX(t) := exp tX =

⎛⎝1 t

11

⎞⎠ ,

gY (t) := exp tY =

⎛⎝1

1 t1

⎞⎠ ,(8.13)

gZ(t) := exp tZ =

⎛⎝1 t

11

⎞⎠ ,


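and, by (8.12),

$$g_X(t) \cdot \eta = x_*X_* + (y_* - tz_*)Y_* + z_*Z_*,$$
$$g_Y(t) \cdot \eta = (x_* + tz_*)X_* + y_*Y_* + z_*Z_*,$$
$$g_Z(t) \cdot \eta = x_*X_* + y_*Y_* + z_*Z_*,$$

and, hence,

$$\xi_X = -z_*\partial_{y_*}, \quad \xi_Y = z_*\partial_{x_*}, \quad \xi_Z = 0,$$

that is

$$\xi_{\hat{X}} = x\xi_X + y\xi_Y + z\xi_Z = \alpha\partial_{x_*} + \beta\partial_{y_*}$$

with

$$\alpha = yz_*, \quad \beta = -xz_*.$$

Thus we have

$$\omega_0(\xi_{\hat{X}}, \xi_{\hat{Y}}) = \alpha\beta' - \alpha'\beta = z_*^2(xy' - x'y).$$

If we compare this with the Kirillov-Kostant prescription above, we see that one has the symplectic form ω on M = Om given by

$$\omega = (1/m)\, dx_* \wedge dy_*.$$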
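The factor 1/m can be confirmed with exact integer arithmetic. The sketch below assumes formula (8.12) for the coadjoint action and uses that the curve t ↦ exp(tX̂) · η is affine in t (the third group coordinate does not enter (8.12)), so that the tangent vector equals the difference between the points at t = 1 and t = 0. The function names are illustrative.

```python
def coadjoint(g, eta):
    """Formula (8.12): (x*, y*, z*) -> (x* + b z*, y* - a z*, z*) for g = (a, b, c)."""
    a, b, c = g
    xs, ys, zs = eta
    return (xs + b * zs, ys - a * zs, zs)

def xi(u, eta):
    """Components (alpha, beta) of the vector field attached to u = (x, y, z) at eta.

    exp(t(xX + yY + zZ)) has first two coordinates (tx, ty); the action is affine
    in t, so the derivative at t = 0 is the increment from t = 0 to t = 1.
    """
    x, y, z = u
    moved = coadjoint((x, y, 0), eta)
    return (moved[0] - eta[0], moved[1] - eta[1])

def omega0(u, v, eta):
    """dx* ^ dy* evaluated on the two vector fields."""
    al, be = xi(u, eta)
    al1, be1 = xi(v, eta)
    return al * be1 - al1 * be

def kirillov_kostant(u, v, eta):
    """<eta, [X^, Y^]> = (xy' - x'y) z*."""
    return (u[0] * v[1] - v[0] * u[1]) * eta[2]

eta = (5, -3, 2)                      # a point of the orbit O_m with m = 2
u, v = (1, 2, 3), (4, -1, 6)
print(omega0(u, v, eta), kirillov_kostant(u, v, eta))   # first value is m times the second
```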

Polarizations of the Orbits Om

The algebraic approach to real polarizations asks for subalgebras n of g = ⟨X, Y, Z⟩ subordinate to η = mZ∗. Obviously one has the (abelian) subalgebras n1 = ⟨X, Z⟩ and n2 = ⟨Y, Z⟩ and more generally, for α, β ∈ R with αβ ≠ 0, n = ⟨αX + βY, Z⟩. In the geometric picture these correspond to a polarization given by a splitting of Om ≃ R² into the union of parallel lines αx∗ + βy∗ = const., which are the orbits of Ad∗Q, where Q is the subgroup of G with Lie Q = n, i.e. Q = {(αt, βt, c); t, c ∈ R}.

A complex polarization of Om, which is translation invariant, is generated by a constant vector field ξ = ∂x∗ + τ∂y∗, τ ∈ C \ R, in particular τ = i. The functions F satisfying ξF = 0 are simply holomorphic functions in the variable w := x∗ + τy∗. In the algebraic version the complex polarizations are given by the subalgebra n = ⟨(X + τY), Z⟩.

Construction of Representations

We want to follow the scheme outlined at the end of 8.2. As up to now our representations of the Heisenberg group were realized by right translations and the recipe in 8.2.5 is given for left translations, we have to make the appropriate changes. For the zero-dimensional orbits Or,s, r, s ∈ R, there is not much to be done: One has η = rX∗ + sY∗ and Gη = G and hence the one-dimensional representation π given by π(a, b, c) = exp(2πi(ra + sb)). For the two-dimensional orbits Om, m ≠ 0, we have η = mZ∗, Gη = {(0, 0, c); c ∈ R} = L, and for each polarization given by n the representation Ŷ ↦ 2πi⟨η, Ŷ⟩ = 2πimz integrates to a character χm of the group Q belonging to n.


Hence each orbit Om is integral and we get a representation space consisting of sections of the real line bundle over M = Om (constructed using the character χm of Gη with χm((0, 0, c)) = exp(2πimc) = e(mc)) which are constant along the directions fixed by the polarization n. This leads us to consider the following functions: We look at φ : G −→ C with

$$\phi(lg) = \chi_m(l)\phi(g),$$

i.e. φ is of the form φ(a, b, c) = e(mc)F(a, b) with a smooth function F : R² −→ C restricted by the polarization condition, which in our case for n = ⟨U, Z⟩ is given by

$$R_U\phi = 0$$

where RU is the right-invariant differential operator

$$R_U\phi(g) = \frac{d}{dt}\phi(g_U(t)^{-1}g)\Big|_{t=0}, \quad g_U(t) := \exp tU.$$

We have

$$g_X(t) = (t, 0, 0), \quad g_X(t)^{-1}g = (a - t,\ b,\ c - tb),$$
$$g_Y(t) = (0, t, 0), \quad g_Y(t)^{-1}g = (a,\ b - t,\ c),$$
$$g_Z(t) = (0, 0, t), \quad g_Z(t)^{-1}g = (a,\ b,\ c - t),$$

and hence

(8.14) $$R_X = -\partial_a - b\partial_c, \quad R_Y = -\partial_b, \quad R_Z = -\partial_c.$$
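The three products g_U(t)⁻¹g listed above follow from the group law of Heis′(R) in coordinates. A minimal sketch, assuming the law (a, b, c)(a′, b′, c′) = (a + a′, b + b′, c + c′ + ab′) read off from the matrix realization; the names `mul` and `inv` are illustrative.

```python
def mul(g, h):
    """Group law of Heis'(R) in the coordinates (a, b, c) of the 3x3 realization."""
    a, b, c = g
    a1, b1, c1 = h
    return (a + a1, b + b1, c + c1 + a * b1)

def inv(g):
    """Inverse element: (a, b, c)^{-1} = (-a, -b, -c + ab)."""
    a, b, c = g
    return (-a, -b, -c + a * b)

g, t = (2, 3, 5), 7
a, b, c = g
print(mul(inv((t, 0, 0)), g) == (a - t, b, c - t * b))
print(mul(inv((0, t, 0)), g) == (a, b - t, c))
print(mul(inv((0, 0, t)), g) == (a, b, c - t))
```

Differentiating the three right hand sides with respect to t at t = 0 gives the coefficients of the operators in (8.14).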

– i) The polarization given by n = ⟨Y, Z⟩ leads to

$$0 = R_Y\phi = -\partial_b\phi = -F_b\, e(mc),$$

i.e. φ is of the form φ(a, b, c) = f(a)e(mc). As is to be expected, via

$$\pi(g_0)\phi(g) = \phi(gg_0) = f(a + a_0)\, e(m(c + c_0 + ab_0)) = e(mc)\, e(m(c_0 + ab_0))\, f(a + a_0),$$

we recover the Schrödinger representation πm =: πS

$$f(t) \longmapsto \pi_m(g_0)f(t) = e(m(c_0 + b_0t))\, f(t + a_0).$$
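That this formula defines a homomorphism, πm(g1)πm(g2) = πm(g1g2), can be tested numerically on a sample function. The sketch assumes the group law (a, b, c)(a′, b′, c′) = (a + a′, b + b′, c + c′ + ab′) of Heis′(R); the test function and the names `e`, `mul`, `schroedinger` are hypothetical choices for illustration.

```python
import cmath
import math

def e(u):
    """The abbreviation e(u) = exp(2*pi*i*u) used in the text."""
    return cmath.exp(2j * math.pi * u)

def mul(g, h):
    """Group law of Heis'(R) in the coordinates (a, b, c)."""
    return (g[0] + h[0], g[1] + h[1], g[2] + h[2] + g[0] * h[1])

def schroedinger(m, g0, f):
    """Return the function pi_m(g0) f, with (pi_m(g0) f)(t) = e(m(c0 + b0 t)) f(t + a0)."""
    a0, b0, c0 = g0
    return lambda t: e(m * (c0 + b0 * t)) * f(t + a0)

def sample(t):
    """An arbitrary rapidly decreasing test function."""
    return math.exp(-t * t) * (1 + 0.5j * t)

m, t = 1.5, 0.6
g1, g2 = (0.3, -0.7, 1.1), (-1.2, 0.4, 0.25)
lhs = schroedinger(m, g1, schroedinger(m, g2, sample))(t)
rhs = schroedinger(m, mul(g1, g2), sample)(t)
print(abs(lhs - rhs))   # expected to vanish up to rounding
```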

– ii) n = ⟨X, Z⟩ leads to

$$0 = R_X\phi = -(\partial_a + b\partial_c)\phi = -(F_a + 2\pi imbF)\, e(mc),$$

i.e. φ is of the form φ(a, b, c) = e(−mab)f(b)e(mc). We get the representation

$$f(t) \longmapsto e(m(c_0 - a_0(b_0 + t)))\, f(t + b_0) =: \pi'_m(g_0)f(t).$$

From the Stone-von Neumann Theorem one knows that this representation (and any other constructed this way via another polarization) is equivalent to the Schrödinger representation. In our context we can see this directly using the Fourier transform as an intertwining operator:


We abbreviate again e(u) := exp(2πiu) and write as Fourier transform

$$\hat{f}(b) := \int_{\mathbb{R}} f(t)\, e(mbt)\, dt.$$

We get

$$\pi'_m(g_0)\hat{f}(b) = e(m(c_0 - a_0(b + b_0))) \int_{\mathbb{R}} f(t)\, e(mt(b + b_0))\, dt$$

and see that this coincides with the Fourier transform of the application of the Schrödinger representation πm(g0)f, namely with a + a0 =: t we get

$$(\pi_m(g_0)f)^{\wedge}(b) = \int e(m(c_0 + ab_0))\, f(a + a_0)\, e(mab)\, da = e(mc_0)\, e(-m(a_0(b + b_0))) \int e(mtb_0)\, f(t)\, e(mbt)\, dt.$$

– iii) As another example of this procedure we use a complex polarization to construct the Fock representation, which (among other things) is fundamental for the construction of the theta functions. The Fock representation has as its representation space the space Hm of holomorphic functions f on C subjected to the condition

$$\|f\|^2 = \int_{\mathbb{C}} |f(z)|^2\, d\mu(z) < \infty, \quad d\mu(z) := (1/(2i))\, d\bar{z} \wedge dz.$$

To adapt our presentation to the more general one given in Igusa [Ig] p.31ff, we introduce complex coordinates for our Heisenberg group

$$z := -ia + b, \quad \bar{z} = ia + b,$$

i.e.

$$a = (\bar{z} - z)/(2i), \quad b = (z + \bar{z})/2$$

and

$$\partial_z = (1/2)(i\partial_a + \partial_b), \quad \partial_{\bar{z}} = (1/2)(-i\partial_a + \partial_b).$$

We use the complexification gc = ⟨Y±, Z0⟩ with Y± := ±iX + Y, Z0 = 2iZ and the complex polarization given by n = ⟨Y−, Z0⟩. Then, in order to construct a representation space following our general procedure, we come to look at smooth complex functions φ on G, which are of the form

$$\phi(a, b, c) = F(z, \bar{z})\, e(mc)$$

and subjected to the polarization condition

$$R_{Y_-}\phi = 0.$$

From (8.14) we have RX = −(∂a + b∂c), RY = −∂b. We deduce

$$R_{Y_-} = -iR_X + R_Y = -2\partial_{\bar{z}} + ib\partial_c$$

and, hence, that one has $0 = R_{Y_-}\phi = (-2F_{\bar{z}} - 2\pi mbF)\, e(mc)$, i.e.

$$\partial_{\bar{z}} \log F = -\pi mb = -\pi m(z + \bar{z})/2.$$

This shows that F is of the form

$$F(z, \bar{z}) = e^{-(\pi m/2)(z\bar{z} + \bar{z}^2/2)} f(z)$$

with a holomorphic function f. As we soon will see, to get a unitary representation on the space Hm introduced above, we have to choose here $f(z) = \exp(\pi mz^2/4)\tilde{f}(z)$ with $\tilde{f} \in H_m$, i.e. our φ is of the form

$$\phi(a, b, c) = e(mc)\, e^{-(\pi m/2)(z\bar{z} + \bar{z}^2/2 - z^2/2)}\, \tilde{f}(z).$$

By a small computation we see that the representation by right translation

$$\pi(g_0)\phi(g) = \phi(gg_0)$$

leads to the formula of the Fock representation πF

(8.15) $$f(z) \longmapsto (\pi_F(g_0)f)(z) = e(mc_0)\, e^{-\pi m((z + z_0/2)\bar{z}_0 + (\bar{z}_0^2 - z_0^2)/4)}\, f(z + z_0)$$

with g0 = (a0, b0, c0), z0 = b0 − ia0. Now it is easy to verify that πF acts unitarily in Hm (Exercise 8.6).

Perhaps it is not a complete waste of time and energy to show here how this representation comes up also (as it must) by the induction procedure we outlined in 7.1: We have the standard projection

$$G = \mathrm{Heis}'(\mathbb{R}) \longrightarrow H \backslash G \simeq \mathcal{X}, \quad g = (a, b, c) \longmapsto Hg = x = (a, b) \leftrightarrow (z, \bar{z})$$

with the Mackey section s given by s(x) = (a, b, 0), i.e. g = (0, 0, c)(a, b, 0), and the master equation (7.7)

$$s(x)g_0 = (a, b, 0)(a_0, b_0, c_0) = h^*s(xg_0)$$

with

$$s(xg_0) = (a + a_0, b + b_0, 0), \quad h^* = (0, 0, c_0 + ab_0) = (0, 0, c_0 + (\bar{z}z_0 - zz_0 + \bar{z}\bar{z}_0 - z\bar{z}_0)/(4i)).$$

dµ(x) = dµ(z) is a quasi-invariant measure on X = C with Radon-Nikodym derivative

$$\frac{d\mu(xg_0)}{d\mu(x)} = e^{-m\pi(z\bar{z}_0 + \bar{z}z_0 + |z_0|^2)} = \delta(h^*).$$

Hence, injecting these data into the formula from Theorem 7.4 in 7.1.2 for the representation in the second realization

$$\pi(g_0)f(x) = \delta(h^*)^{1/2}\pi_0(h^*)f(xg_0)$$

provides the formula (8.15) for the Fock representation obtained above.

We know that the Fock representation πF is equivalent to the Schrödinger representation πS, but it is perhaps not so evident what the intertwining operator looks like.


Proposition 8.1: The prescription

$$f(t) \longmapsto F(z) = If(z) := \int_{\mathbb{R}} k(t, z)f(t)\, dt$$

with the kernel

$$k(t, z) = 2^{1/4}\, e^{-\pi t^2}\, e(tz)\, e^{(\pi/2)z^2}$$

provides an isometry I between the spaces L²(R) and Hm for m = 1 and intertwines the corresponding Schrödinger and Fock representations.

The fact that I is an intertwining operator is verified by a straightforward computation (Exercise 8.7). For the verification of the isometry we refer to [Ig] p.31-35 where one goes back to explicit Hilbert bases for the spaces.

Summary

For the Heisenberg group (as for each solvable group), one has a perfect matching between coadjoint orbits and irreducible unitary representations.
Everything goes through as well for the higher-dimensional Heisenberg groups if one changes the notation appropriately.

8.5 Some Hints Concerning the Jacobi Group

In Kirillov’s book [Ki1] one can find a lot of examples and a thorough discussion of the merits and demerits of the orbit method. As orbits already turned up naturally in our presentation of Mackey’s method in 7.1, where we applied it to Euclidean groups and the Poincaré group, it is a natural topic to inspect the coadjoint orbits for semidirect products, in particular for the Euclidean groups.

Exercise 8.8: Determine the coadjoint orbits of the Euclidean group

$$G^E(3) = SO(3) \ltimes \mathbb{R}^3.$$

The reader will find material for this in [GS] p.124ff. As the author of this text is particularly fond of the Jacobi group GJ, i.e. (in its simplest version) a semidirect product of SL(2,R) with Heis(R), as another exercise we propose here to treat some questions in this direction concerning the Jacobi group. Most answers to these can be found in [BeS] and [Be1], where we collected some elements of the representation theory of GJ and the application of the orbit method to GJ. But at a closer look there are enough rather easily accessible open problems to do some original new work:

The Jacobi group is in general the semidirect product of the symplectic group with an appropriate Heisenberg group. Here we look at

$$G^J(\mathbb{R}) := SL(2, \mathbb{R}) \ltimes \mathrm{Heis}(\mathbb{R}).$$


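In this case we fix the multiplication law by the embedding into the symplectic group Sp(2,R) given by

$$\mathrm{Heis}(\mathbb{R}) \ni (\lambda,\mu,\kappa) \longmapsto \begin{pmatrix} 1 & 0 & 0 & \mu \\ \lambda & 1 & \mu & \kappa \\ 0 & 0 & 1 & -\lambda \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad SL(2,\mathbb{R}) \ni M = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \longmapsto \begin{pmatrix} a & 0 & b & 0 \\ 0 & 1 & 0 & 0 \\ c & 0 & d & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$

We write g = (p, q, κ)M or g = M(λ, µ, κ) ∈ $G^J(\mathbb{R})$.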
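The embedding can be checked mechanically. The following minimal sketch (the helper names `heis_matrix`, `sl2_matrix`, `matmul` and `heis_product` are illustrative and not taken from the text) multiplies two embedded Heisenberg elements with exact rational arithmetic and compares the result with the composition law (λ, µ, κ)(λ′, µ′, κ′) = (λ + λ′, µ + µ′, κ + κ′ + λµ′ − λ′µ) used for Heis(R) in 9.1. It also forms the product M(λ, µ, κ), whose last column reappears in the Lemma on the coadjoint action below.

```python
from fractions import Fraction

def heis_matrix(lam, mu, kappa):
    """4x4 matrix of (lam, mu, kappa) under the embedding into Sp(2,R)."""
    return [
        [1,   0, 0,  mu],
        [lam, 1, mu, kappa],
        [0,   0, 1,  -lam],
        [0,   0, 0,  1],
    ]

def sl2_matrix(a, b, c, d):
    """4x4 matrix of M = (a b; c d) under the embedding into Sp(2,R)."""
    return [
        [a, 0, b, 0],
        [0, 1, 0, 0],
        [c, 0, d, 0],
        [0, 0, 0, 1],
    ]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def heis_product(g, h):
    """Composition law of Heis(R) in the coordinates (lam, mu, kappa)."""
    (l1, m1, k1), (l2, m2, k2) = g, h
    return (l1 + l2, m1 + m2, k1 + k2 + l1 * m2 - l2 * m1)

# Two sample elements (arbitrary rational test values).
g = (Fraction(1, 2), Fraction(-3), Fraction(2, 7))
h = (Fraction(5), Fraction(1, 3), Fraction(-1))
embedding_is_homomorphism = (
    matmul(heis_matrix(*g), heis_matrix(*h)) == heis_matrix(*heis_product(g, h))
)

# The product M (lam, mu, kappa) for a sample M with ad - bc = 1.
a, b, c, d = 2, 3, 1, 2
lam, mu, kappa = g
product = matmul(sl2_matrix(a, b, c, d), heis_matrix(lam, mu, kappa))
last_column = [row[3] for row in product]
```

Exact fractions are used so that the comparison is an identity of matrices rather than a floating-point approximation.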

As in [BeS] or [Ya], we can describe the Lie algebra $\mathfrak{g}^J$ as a subalgebra of $\mathfrak{g}_2 = \mathfrak{sp}(2,\mathbb{R})$ by

$$G(x,y,z,p,q,r) := \begin{pmatrix} x & 0 & y & q \\ p & 0 & q & r \\ z & 0 & -x & -p \\ 0 & 0 & 0 & 0 \end{pmatrix}$$

and denote X := G(1, 0, . . . , 0), . . . , R := G(0, . . . , 0, 1).

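We get the commutators

$$[X,Y] = 2Y, \quad [X,Z] = -2Z, \quad [Y,Z] = X,$$
$$[X,P] = -P, \quad [X,Q] = Q, \quad [P,Q] = 2R,$$
$$[Y,P] = -Q, \quad [Z,Q] = -P,$$

all others are zero. Hence, we have the complexified Lie algebra given by

$$\mathfrak{g}^J_c = \langle Z_1, X_\pm, Y_\pm, Z_0 \rangle$$

where as in [BeS] p.12

$$Z_1 := -i(Y - Z), \qquad Z_0 := -iR,$$

$$X_\pm := (1/2)(X \pm i(Y + Z)), \qquad Y_\pm := (1/2)(P \pm iQ)$$

with the commutation relations

$$[Z_1, X_\pm] = \pm 2X_\pm, \qquad [Z_1, Y_\pm] = \pm Y_\pm, \quad \text{etc.}$$

(Note that $Z_0 = -iR$ is central, so it commutes with $Y_\pm$.)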
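These relations lend themselves to a direct machine check. The sketch below is only an illustration: the helpers `E`, `lin`, `mul` and `bracket` are ad hoc names, and the matrices are read off from G(x, y, z, p, q, r). It verifies the eight real commutators and then the weight relations for $Z_1 = -i(Y - Z)$. Since $Z_0 = -iR$ commutes with everything, the relation for $Y_\pm$ is tested against $Z_1$, and the bracket with $Z_0$ is checked to vanish.

```python
def E(i, j):
    """Elementary 4x4 matrix with entry 1 in row i, column j (1-based)."""
    return [[1 if (r, c) == (i - 1, j - 1) else 0 for c in range(4)]
            for r in range(4)]

def lin(s, A, t=0, B=None):
    """Return s*A + t*B entrywise (B is optional)."""
    B = B if B is not None else [[0] * 4 for _ in range(4)]
    return [[s * A[r][c] + t * B[r][c] for c in range(4)] for r in range(4)]

def mul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]

def bracket(A, B):
    """Matrix commutator [A, B] = AB - BA."""
    return lin(1, mul(A, B), -1, mul(B, A))

# Basis of the Lie algebra, read off from G(x, y, z, p, q, r).
X = lin(1, E(1, 1), -1, E(3, 3))
Y, Z = E(1, 3), E(3, 1)
P = lin(1, E(2, 1), -1, E(3, 4))
Q = lin(1, E(1, 4), 1, E(2, 3))
R = E(2, 4)

real_relations = {
    "[X,Y]=2Y": bracket(X, Y) == lin(2, Y),
    "[X,Z]=-2Z": bracket(X, Z) == lin(-2, Z),
    "[Y,Z]=X": bracket(Y, Z) == X,
    "[X,P]=-P": bracket(X, P) == lin(-1, P),
    "[X,Q]=Q": bracket(X, Q) == Q,
    "[P,Q]=2R": bracket(P, Q) == lin(2, R),
    "[Y,P]=-Q": bracket(Y, P) == lin(-1, Q),
    "[Z,Q]=-P": bracket(Z, Q) == lin(-1, P),
}

Z1 = lin(-1j, Y, 1j, Z)   # Z_1 = -i(Y - Z)
Z0 = lin(-1j, R)          # Z_0 = -iR
complex_relations = {}
for sign in (+1, -1):
    X_pm = lin(0.5, X, 0.5j * sign, lin(1, Y, 1, Z))   # (1/2)(X +- i(Y + Z))
    Y_pm = lin(0.5, P, 0.5j * sign, Q)                 # (1/2)(P +- iQ)
    complex_relations[sign] = (
        bracket(Z1, X_pm) == lin(2 * sign, X_pm),
        bracket(Z1, Y_pm) == lin(sign, Y_pm),
        bracket(Z0, Y_pm) == lin(0, Y_pm),             # Z_0 is central
    )
```

All entries are small multiples of 1/2, so the floating-point complex arithmetic is exact here and the comparisons with `==` are legitimate.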

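Exercise 8.9: Verify all this and compute left invariant differential operators

$$L_{Z_0} = i\partial_\kappa$$

$$L_{Y_\pm} = (1/2)\,y^{-1/2}e^{\pm i\theta}\big(\partial_p - (x \pm iy)\partial_q - (p(x + iy) + q)\partial_\kappa\big)$$

$$L_{X_\pm} = \pm(i/2)\,e^{\pm 2i\theta}\big(2y(\partial_x \mp i\partial_y) - \partial_\theta\big)$$

$$L_{Z_1} = -i\partial_\theta$$

acting on differentiable functions φ = φ(g) with the coordinates coming from

$$g = (p, q, \kappa)\,n(x)\,t(y)\,r(\theta)$$

where

$$n(x) = \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix}, \qquad t(y) = \begin{pmatrix} y^{1/2} & 0 \\ 0 & y^{-1/2} \end{pmatrix}, \qquad r(\theta) = \begin{pmatrix} \alpha & \beta \\ -\beta & \alpha \end{pmatrix}$$

with α = cos θ, β = sin θ, and

$$g(i, 0) = (\tau, z) = (x + iy,\ p\tau + q).$$

As usual, we put

$$N := \{n(x);\ x \in \mathbb{R}\}, \quad A := \{t(a);\ a \in \mathbb{R}_{>0}\}, \quad K := SO(2) = \{r(\theta);\ \theta \in \mathbb{R}\}, \quad M := \{\pm E\}.$$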

Elements of the representation theory of $G^J(\mathbb{R})$ can be found in the work of Pyatetskii–Shapiro, Satake, Kirillov, Howe, Guillemin, Sternberg, Igusa, Mumford and many others. To a large part, they are summed up in [BeS], the essential outcome being that the representations π of $G^J(\mathbb{R})$ with nontrivial central character are essentially products of projective representations of SL(2,R) with a fundamental (projective) representation $\pi^m_{SW}$ of $G^J$ called the Schrödinger–Weil representation in [BeS], as it is in turn the product of the Schrödinger representation $\pi_S$ of the Heisenberg group Heis(R) and the (Segal–Shale–)Weil or oscillator representation $\pi_W$ of SL(2,R). This representation $\pi^m_{SW}$ is a representation of lowest weight k = 1/2 and index m. It is characterized by the fact that all its $K^J$-types are one-dimensional ($K^J := SO(2) \times C(\mathrm{Heis}(\mathbb{R})) \simeq SO(2) \times \mathbb{R}$), which in turn is characterized by the fact that the lowest weight vector for $\pi^m_{SW}$ satisfies the heat equation. To be a bit more explicit, the irreducible unitary representations π of $G^J(\mathbb{R})$ with central character $\psi^m$ (with $\psi^m(x) := e^m(x)$) for m ≠ 0 are infinitesimally equivalent to

$$\pi_{m,s,\nu},\ s \in i\mathbb{R},\ \nu = \pm 1 \quad (\pi_{m,s,\nu} \simeq \pi_{m,-s,\nu}),$$
$$\pi_{m,s,\nu},\ s \in \mathbb{R},\ s^2 < 1/4,$$
$$\pi^+_{m,k},\ k \in \mathbb{N},$$
$$\pi^-_{m,k},\ k \in \mathbb{N}.$$

Here and in the following, we restrict our treatment to m > 0, but there is a “mirror image” for m < 0 which, to save space, we will not discuss. These infinitesimal representations are of the form

$$\pi_{m,s,\nu} = \pi^m_{SW} \otimes \pi_{s,\nu},$$

$$\pi^\pm_{m,k} = \pi^m_{SW} \otimes \pi^\pm_{k_0}, \qquad k = k_0 + 1/2.$$

The Schrödinger–Weil representation $\pi^m_{SW}$ is given on the space

$$V^{1/2}_m := \langle v_j \rangle_{j \in \mathbb{N}_0}$$

by (see [BeS] p.33)

$$Zv_j := (1/2 + j)v_j, \quad Z_0v_j := \mu v_j, \quad Y_+v_j := v_{j+1}, \quad X_+v_j := -\frac{1}{2\mu}\,v_{j+2}, \ \ldots$$


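And, for instance, the discrete series representation $\pi^+_{k-1/2}$ of sl(2) is given on

$$W_{k-1/2} := \langle w_l \rangle_{l \in 2\mathbb{N}_0}$$

by

$$Zw_l := (k - 1/2 + l)w_l, \quad X_+w_l := w_{l+2}, \quad X_-w_l := (l/2)(k - 3/2 + l)\,w_{l-2}.$$

That is, we have as a space for $\pi^+_{m,k}$

$$V^+_{m,k} := V^{1/2}_m \otimes W_{k-1/2} = \langle v_j \otimes w_l \rangle_{j \in \mathbb{N}_0,\ l \in 2\mathbb{N}_0}$$

with

$$Z_0(v_j \otimes w_l) = \mu(v_j \otimes w_l),$$
$$Z(v_j \otimes w_l) = (k + j + l)(v_j \otimes w_l),$$
$$Y_+(v_j \otimes w_l) = v_{j+1} \otimes w_l,$$
$$X_+(v_j \otimes w_l) = -\frac{1}{2\mu}\,v_{j+2} \otimes w_l + v_j \otimes w_{l+2}.$$

In particular, for the spaces of Z-weight k + λ, λ ≥ 0,

$$V^{(\lambda)} := \{v \in V^+_{m,k};\ Zv = (k + \lambda)v\}$$

we have

$$\dim V^{(0)} = 1,\ \dim V^{(1)} = 1,\ \dim V^{(2)} = 2,\ \dim V^{(3)} = 2,\ \dim V^{(4)} = 3,\ \ldots$$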
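The dimensions above simply count the basis vectors $v_j \otimes w_l$ with j ∈ N_0, l ∈ 2N_0 and j + l = λ. A minimal sketch of this count (the function name `weight_space_dimension` is illustrative only):

```python
def weight_space_dimension(lam):
    """Number of pairs (j, l) with j >= 0, l >= 0 even and j + l = lam."""
    # For every even l between 0 and lam there is exactly one j = lam - l >= 0.
    return sum(1 for l in range(0, lam + 1, 2))

dimensions = [weight_space_dimension(lam) for lam in range(5)]
```

The count equals ⌊λ/2⌋ + 1, which is the pattern 1, 1, 2, 2, 3, . . . stated above.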

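If we denote by $\rho_{m,k'}$ the (one-dimensional) representation of $K^J$ given by

$$\rho_{m,k'}(r(\theta), (0, 0, \kappa)) = e^{ik'\theta + 2\pi i m\kappa}, \qquad \theta \in [0, 2\pi],\ \kappa \in \mathbb{R},$$

we have

$$\mathrm{mult}(\rho_{m,k'}, \pi^+_{m,k}) = \begin{cases} 1 & \text{for } k' = k, \\ 1 + l & \text{for } k' = k + 2l \text{ or } k' = k + 2l + 1,\ l \in \mathbb{N}_0, \\ 0 & \text{for } k' < k, \end{cases}$$

and

$$\mathrm{mult}(\rho_{m,k'}, \pi) = \infty$$

in the other cases listed above. Now, as can be seen for instance in [BeS] p.28ff, a lowest weight or vacuum vector $\phi_0$ for a discrete series representation $\pi^+_{m,k}$ of $G^J(\mathbb{R})$, realized by right translation in a space $H^+_{m,k}$ of smooth functions φ = φ(g) on $G^J(\mathbb{R})$, is characterized by the equations

$$L_{X_-}\phi_0 = L_{Y_-}\phi_0 = 0, \qquad L_{Z_0}\phi_0 = 2\pi m\,\phi_0, \qquad L_Z\phi_0 = k\,\phi_0.$$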

The Jacobi group is not semisimple (and indeed a prominent example of a non-reductive group). We describe its coadjoint orbits using our embedding of $G^J$ into Sp(2,R):

We identify the dual $\mathfrak{sp}^*$ of $\mathfrak{sp}$ by the Sp(2,R)-invariant isomorphism $\eta \longmapsto X_\eta$ given by

$$\eta(Y) = \mathrm{Tr}(X_\eta Y) =: \langle X_\eta, Y \rangle \quad \text{for all } Y \in \mathfrak{sp}.$$


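$\eta_1, \ldots, \eta_6$ denotes a basis of $(\mathfrak{g}^J)^*$ dual to

$$(X_1, \ldots, X_6) := (X, \ldots, R).$$

We realize $(\mathfrak{g}^J)^*$ as a subspace of $\mathfrak{sp}$ by the matrices

$$M(x,y,z,p,q,r) := \begin{pmatrix} x & p & z & 0 \\ 0 & 0 & 0 & 0 \\ y & q & -x & 0 \\ q & r & -p & 0 \end{pmatrix}, \qquad x, \ldots, r \in \mathbb{R},$$

and put $X_* := M(1, 0, \ldots, 0), \ldots, R_* := M(0, \ldots, 0, 1)$.

Then $X_{\eta_1} = X_*, \ldots, X_{\eta_6} = R_*$ is a basis of $(\mathfrak{g}^J)^*$ with

$$\langle X_*, X \rangle = 2, \quad \langle Y_*, Y \rangle = 1, \quad \langle Z_*, Z \rangle = 1,$$
$$\langle P_*, P \rangle = 2, \quad \langle Q_*, Q \rangle = 2, \quad \langle R_*, R \rangle = 1.$$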

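By a straightforward computation one obtains the following result:

Lemma: If $g^{-1}$ is denoted as

$$g^{-1} = \begin{pmatrix} a & 0 & b & a\mu - b\lambda \\ \lambda & 1 & \mu & \kappa \\ c & 0 & d & c\mu - d\lambda \\ 0 & 0 & 0 & 1 \end{pmatrix},$$

the coadjoint action of g on M(x, . . . , r) is given by

$$\mathrm{Ad}^*(g)M(x, \ldots, r) = M(\tilde x, \ldots, \tilde r)$$

with

$$\begin{aligned}
\tilde x &= (ad + bc)x + bd\,y - ac\,z + (2ac\mu - (ad + bc)\lambda)p \\
&\quad + ((ad + bc)\mu - 2bd\lambda)q + r(a\mu - b\lambda)(c\mu - d\lambda), \\
\tilde y &= 2dc\,x + d^2y - c^2z + 2(c\mu - d\lambda)(cp + dq) + r(c\mu - d\lambda)^2, \\
\tilde z &= -2ab\,x - b^2y + a^2z - 2(a\mu - b\lambda)(ap + bq) - r(a\mu - b\lambda)^2, \\
\tilde p &= ap + bq + r(a\mu - b\lambda), \\
\tilde q &= cp + dq + r(c\mu - d\lambda), \\
\tilde r &= r.
\end{aligned} \tag{8.16}$$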


Exercise 8.10: Verify this and determine the coadjoint orbits. For instance, show that for

$$U_* = mR_* + \alpha X_*, \qquad m \neq 0,\ \alpha \neq 0,$$

one has the 4-dimensional hyperbolic orbit $O_{U_*}$ contained in

$$M_{m,-\alpha^2} := \{(x, y, z, p, q, r) \in \mathbb{R}^6;\ m(x^2 + yz - \alpha^2) = 2pqx - p^2y + q^2z,\ r = m\}.$$

Do you get $O_{U_*} = M_{m,-\alpha^2}$?
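As a hint for the containment part of this exercise, one can test numerically that the polynomial r(x² + yz) − (2pqx − p²y + q²z) does not change under the transformation (8.16) when ad − bc = 1. The sketch below is only a sanity check under that reading of (8.16): the function names `coadjoint` and `invariant` are illustrative, the group elements are arbitrary rational samples, and nothing is said about whether the orbit fills all of the set defined above.

```python
from fractions import Fraction

def coadjoint(g, point):
    """Apply formula (8.16); g = (a, b, c, d, lam, mu) with ad - bc = 1."""
    a, b, c, d, lam, mu = g
    x, y, z, p, q, r = point
    u = a * mu - b * lam
    v = c * mu - d * lam
    return (
        (a * d + b * c) * x + b * d * y - a * c * z
        + (2 * a * c * mu - (a * d + b * c) * lam) * p
        + ((a * d + b * c) * mu - 2 * b * d * lam) * q + r * u * v,
        2 * d * c * x + d * d * y - c * c * z + 2 * v * (c * p + d * q) + r * v * v,
        -2 * a * b * x - b * b * y + a * a * z - 2 * u * (a * p + b * q) - r * u * u,
        a * p + b * q + r * u,
        c * p + d * q + r * v,
        r,
    )

def invariant(point):
    """r(x^2 + yz) - (2pqx - p^2 y + q^2 z)."""
    x, y, z, p, q, r = point
    return r * (x * x + y * z) - (2 * p * q * x - p * p * y + q * q * z)

# Base point U_* = m R_* + alpha X_*, i.e. x = alpha, r = m, all else zero.
m, alpha = Fraction(3), Fraction(2, 5)
U = (alpha, 0, 0, 0, 0, m)

# Sample elements (a, b, c, d, lam, mu), each with ad - bc = 1.
elements = [
    (Fraction(3, 2), Fraction(-2), Fraction(5), Fraction(-6), Fraction(1, 3), Fraction(-7, 4)),
    (Fraction(1), Fraction(4), Fraction(0), Fraction(1), Fraction(2), Fraction(1, 2)),
    (Fraction(0), Fraction(-1), Fraction(1), Fraction(0), Fraction(-1, 5), Fraction(3)),
]
orbit_points = [coadjoint(g, U) for g in elements]
# Applying a second element must stay on the same level set.
orbit_points += [coadjoint(elements[0], pt) for pt in orbit_points]
```

At the base point the polynomial takes the value mα², which is exactly the defining equation m(x² + yz − α²) = 2pqx − p²y + q²z together with r = m.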

It is natural to try to apply the general procedure outlined in this section to realize representations of $G^J$ by bundles carried by these orbits and to ask which representations one obtains in this way and, for those one does not obtain, why not. As already said above, there are some answers in [Be1] and [Ya], but far from all.

Chapter 9

Epilogue: Outlook to Number Theory

Representations of groups show up in many places in Number Theory, for instance as representations of Galois groups. There seems to be no doubt that the relationship of the two topics culminates in the Langlands Program, which (roughly speaking) seeks to establish a correspondence between Galois and automorphic representations. Many eminent mathematicians have worked and work on this program, and it is now even interesting for physicists; see Frenkel’s Lectures on the Langlands Program and Conformal Field Theory [Fr]. We cannot dare to go into this here, but we will try to introduce at least some initial elements by presenting some representation spaces for the special groups we treated in our examples, consisting of, or at least containing, theta functions, and modular and automorphic forms. And we will also introduce the notions of zeta and L-functions, which ultimately are the foundations of the bridge between the Galois and automorphic representations. There are many useful books available. We can only cite some of them, which by now are classic: Representation Theory and Automorphic Functions by Gelfand, Graev and Pyatetskii-Shapiro [GGP], Automorphic Forms on GL(2) by Jacquet and Langlands [JL], Automorphic Forms on Adele Groups by Gelbart [Ge], Analytic Properties of Automorphic L-Functions by Gelbart and Shahidi [GS], Automorphic Forms and Representations by Bump [Bu], Theta Functions by Igusa [Ig], The Weil representation, Maslov index and Theta Series by Lion and Vergne [LV], Fourier Analysis on Number Fields by Ramakrishnan and Valenza [RV], and Mumford’s Tata Lectures on Theta I, II, and III [Mu].

The goal of these last sections will be to lead the way to automorphic representations: While we already encountered the decomposition of the regular representation on the space $L^2(G)$, we shall here take a discrete subgroup Γ of G and ask for the representations appearing in the decomposition of $L^2(\Gamma \backslash G)$, i.e., we consider representation spaces consisting of functions with a certain periodicity, invariance or covariance property. All this is most easily understood for the case of the Heisenberg group and leads to the notion of the theta functions. So we will start by treating this case, and afterwards will look at the group SL(2,R) and introduce modular forms as examples for more general automorphic forms. Finally, we shall consider several kinds of L-functions appearing in connection with representation theory.


Since this is an epilogue and an outlook and the material has such a great extension and depth, the reader will excuse (we hope) that we shall not be able to give more than some indications, and even fewer proofs than in the preceding chapters.

9.1 Theta Functions and the Heisenberg Group

Whatever we do in this section can easily be extended to higher dimensional cases by conveniently adapting the notation. As already done before, we denote again z ∈ C and τ ∈ H and moreover, as usual in this context,

$$q := e(\tau) = e^{2\pi i\tau}, \qquad \varepsilon := e(z) = e^{2\pi iz}.$$

Then one has the classic Jacobi theta function ϑ given by

$$\vartheta(z, \tau) := \sum_{n \in \mathbb{Z}} e^{\pi i(n^2\tau + 2nz)} = \sum_{n \in \mathbb{Z}} q^{n^2/2}\varepsilon^n \tag{9.1}$$

going back to Jacobi in 1828. One has to verify convergence:

Lemma: ϑ converges absolutely and uniformly on compact parts of C × H.

Proof: Exercise 9.1.

Moreover, today, one has variants, which unfortunately are differently normalized by different authors. For instance in [Ig] p.V, one finds the theta function with characteristics $m = (m', m'') \in \mathbb{R}^2$

$$\theta_m(\tau, z) := \sum_{n \in \mathbb{Z}} e\big((1/2)(n + m')^2\tau + (n + m')(z + m'')\big).$$

In our text, we shall follow the notation of [EZ] p.58 where, for 2m ∈ N, µ = 0, 1, . . . , 2m − 1, one has

$$\theta_{m,\mu}(\tau, z) := \sum_{r \in \mathbb{Z},\ r \equiv \mu \bmod 2m} q^{r^2/(4m)}\varepsilon^r = \sum_{n \in \mathbb{Z}} e^m\big((n + \mu/(2m))^2\tau + 2(n + \mu/(2m))z\big). \tag{9.2}$$

Obviously, we have $\vartheta(z, \tau) = \theta_{1/2,0}(\tau, z)$. Theta functions have a rather evident quasiperiodicity property concerning the complex variable z and a deeper modular property concerning the variable τ ∈ H, which makes them most interesting for applications in number theory and algebraic geometry, and – concerning both variables – ϑ satisfies the “one-dimensional” heat equation

$$4\pi i\,\vartheta_\tau = \vartheta_{zz}.$$
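Both statements are easy to test numerically from the series (9.1). The sketch below is an illustration with a truncated sum (the cut-off `terms=30` and the sample point are arbitrary choices, and the function names are not from the text). It differentiates the series term by term to check the heat equation, and it checks the periodicity ϑ(z + 1, τ) = ϑ(z, τ) together with the quasiperiodicity ϑ(z + τ, τ) = e^{−πi(τ + 2z)} ϑ(z, τ), which is the case m = 1/2 of the functional equation for Θ(m) discussed below.

```python
import cmath

def theta_terms(z, tau, terms=30):
    """Yield (n, summand) of the series (9.1), truncated to |n| <= terms."""
    for n in range(-terms, terms + 1):
        yield n, cmath.exp(1j * cmath.pi * (n * n * tau + 2 * n * z))

def theta(z, tau):
    return sum(t for _, t in theta_terms(z, tau))

def theta_tau(z, tau):
    """Termwise derivative with respect to tau."""
    return sum(1j * cmath.pi * n * n * t for n, t in theta_terms(z, tau))

def theta_zz(z, tau):
    """Termwise second derivative with respect to z."""
    return sum((2j * cmath.pi * n) ** 2 * t for n, t in theta_terms(z, tau))

tau = 0.3 + 1.1j      # sample point in the upper half plane
z = 0.2 - 0.4j        # sample point in C

heat_defect = abs(4j * cmath.pi * theta_tau(z, tau) - theta_zz(z, tau))
period_defect = abs(theta(z + 1, tau) - theta(z, tau))
quasi_defect = abs(
    theta(z + tau, tau) - cmath.exp(-1j * cmath.pi * (tau + 2 * z)) * theta(z, tau)
)
```

All three defects should vanish up to rounding, because the summands decay like exp(−π n² Im τ) and the truncation error is far below machine precision at this sample point.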

At first, we treat the dependence on the complex variable where, as we shall see, the Heisenberg group comes in. But to give already here at least a tiny hint: theta functions are useful, for instance, if one asks for the number of ways a given natural number can be written as a sum of, say, four squares of integers (see [Ma] p.354 or our Example 9.3 in 9.2).


As we keep in mind that the group really behind theta functions is the Jacobi group introduced in 8.5, we take the version of the Heisenberg group which comes from its realization by four-by-four matrices, though in other sources different coordinatizations are used (as we did in the last chapter). Thus, we take

$$G := \mathrm{Heis}(\mathbb{R}) = \{g = (\lambda, \mu, \kappa) \in \mathbb{R}^3\} \quad \text{with} \quad gg' := (\lambda + \lambda',\ \mu + \mu',\ \kappa + \kappa' + \lambda\mu' - \lambda'\mu).$$

For fixed τ ∈ H, G = Heis(R) acts on C by

$$\mathbb{C} \ni z \longmapsto g(z) := z + \lambda\tau + \mu, \qquad g = (\lambda, \mu, \kappa) \in G.$$

Obviously, the stabilizer of z ∈ C is the center $C(\mathbb{R}) = \{(0, 0, \kappa);\ \kappa \in \mathbb{R}\} \simeq \mathbb{R}$ of the Heisenberg group and one has

$$X := G(\mathbb{R})/C(\mathbb{R}) \simeq \mathbb{C}.$$

This action induces an action on functions F on C in the usual way

$$F \longmapsto F^g \quad \text{with} \quad F^g(z) := F(g(z)) = F(z + \lambda\tau + \mu).$$

Recalling the cocycle condition (7.8) in 7.1 and the first example of an automorphic factor in 7.2, we refine this action by introducing for m ∈ R as an automorphic factor for the Heisenberg group

$$j_m(g, z) := e^m(\lambda^2\tau + 2\lambda z + \lambda\mu + \kappa).$$

Exercise 9.2: Verify the relation

jm(gg′, z) = jm(g, g′(z))jm(g′, z).
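Before proving the relation by hand, one can test it numerically. The following sketch uses the composition law and the action of Heis(R) stated above. The function names and the sample values are illustrative choices, and $e^m(w)$ is read as $e^{2\pi i m w}$.

```python
import cmath

def e_m(m, w):
    """e^m(w) = exp(2 pi i m w)."""
    return cmath.exp(2j * cmath.pi * m * w)

def heis_product(g, h):
    (l1, m1, k1), (l2, m2, k2) = g, h
    return (l1 + l2, m1 + m2, k1 + k2 + l1 * m2 - l2 * m1)

def act(g, z, tau):
    """Action of g = (lam, mu, kappa) on z in C for fixed tau."""
    lam, mu, _ = g
    return z + lam * tau + mu

def j(m, g, z, tau):
    """Automorphic factor j_m(g, z) for the Heisenberg group."""
    lam, mu, kappa = g
    return e_m(m, lam * lam * tau + 2 * lam * z + lam * mu + kappa)

# Arbitrary sample data.
m = 1
tau = 0.2 + 0.9j
z = 0.1 + 0.3j
g = (0.5, -0.25, 0.3)
g_prime = (-0.75, 0.4, 1.1)

lhs = j(m, heis_product(g, g_prime), z, tau)
rhs = j(m, g, act(g_prime, z, tau), tau) * j(m, g_prime, z, tau)
cocycle_defect = abs(lhs - rhs) / max(1.0, abs(lhs))
```

The key cancellation is that the term λµ′ − λ′µ from the group law combines with (λ + λ′)(µ + µ′) to give λµ + 2λµ′ + λ′µ′, which is exactly what the factor 2λ g′(z) contributes on the right-hand side.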

Then we define F|[g] by

$$F|[g](z) := j_m(g, z)F(g(z))$$

and introduce as our first example of an automorphic form a holomorphic function F on C with

$$F|[\gamma] = F \quad \text{for all } \gamma = (r, s, t) \in \Gamma = \mathrm{Heis}(\mathbb{Z}) = \{\gamma = (r, s, t) \in \mathbb{Z}^3\}. \tag{9.3}$$

We denote by Θ(m) the vector space of all these functions F.

Thus, an entire function F (i.e. holomorphic on C) is an element of Θ(m) iff

$$F(z + s) = F(z) \quad \text{and} \quad F(z + r\tau) = e^{-m}(r^2\tau + 2rz)F(z) \quad \text{for all } r, s \in \mathbb{Z}.$$

If we relate this to the theta functions defined above, we are led to the following central result, which is not too difficult to prove (using the periodicity and the functional equation (9.3)).

Theorem 9.1: For 2m ∈ N, Θ(m) is spanned by the theta functions

$$\theta_{m,\alpha}, \qquad \alpha = 0, 1, \ldots, 2m - 1$$

and hence has dimension 2m.


To any function F on C we associate a lifted function $\phi = \phi_F$ defined on G = Heis(R) by

$$\phi_F(g) := (F|[g])(0) = F(\lambda\tau + \mu)\,e^m(\lambda^2\tau + \lambda\mu + \kappa).$$

This prescription is to be seen against the background of the relation between the First and the Second Realization in the induction procedure (see 7.1). Then we come to understand immediately the first two items in the

Proposition 9.1: Under the map $F \longmapsto \phi_F$, the space Θ(m) is isomorphic to the space H(m) of functions φ : G → C satisfying

i) φ(γg) = φ(g) for all γ ∈ Γ,

ii) $\phi(g\kappa) = \phi(g)e^m(\kappa)$ for all κ := (0, 0, κ) ∈ C(R),

iii) φ is smooth and satisfies $L_{Y_-}\phi = 0$.

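The third statement is the result of a discussion parallel to the one of the Fock representation in 8.4: here we use $Y_- := P - \tau Q$ to express the holomorphicity of F by a differential equation for φ: For

$$g_P(t) = (t, 0, 0), \quad g_Q(t) = (0, t, 0), \quad g_R(t) = (0, 0, t)$$

one has

$$gg_P(t) = (\lambda + t, \mu, \kappa - \mu t), \quad gg_Q(t) = (\lambda, \mu + t, \kappa + \lambda t), \quad gg_R(t) = (\lambda, \mu, \kappa + t)$$

and for the left invariant operators $L_U$ acting on functions φ living on G defined by

$$L_U\phi(g) := \frac{d}{dt}\phi(gg_U(t))\Big|_{t=0},$$

we get

$$L_P = \partial_\lambda - \mu\partial_\kappa, \qquad L_Q = \partial_\mu + \lambda\partial_\kappa, \qquad L_R = \partial_\kappa,$$

and

$$L_{Y_-} = L_{P - \tau Q} = \partial_\lambda - \tau\partial_\mu - (\lambda\tau + \mu)\partial_\kappa.$$

In the prescription for the lifting from C to G we assume $z = \lambda\tau + \mu$, $\bar z = \lambda\bar\tau + \mu$ and (via Wirtinger calculus) deduce

$$\partial_{\bar z} = (\tau\partial_\mu - \partial_\lambda)/(\tau - \bar\tau).$$

If we have

$$\phi(\lambda, \mu, \kappa) = F(z, \bar z)\,e^m(\kappa + \lambda^2\tau + \lambda\mu),$$

a tiny computation shows that the condition $L_{Y_-}\phi = 0$ translates into the holomorphicity condition $\partial_{\bar z}F = 0$.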


This leads us directly to the following representation theoretic interpretation. The conditions i) and ii) of the space H(m) show up if one looks at still another standard representation of the Heisenberg group, namely the lattice representation:

We take the subgroup

$$H := \{h = (r, s, t);\ r, s \in \mathbb{Z},\ t \in \mathbb{R}\}$$

and, for $m \in \mathbb{Z} \setminus \{0\}$, its character given by

$$\pi_0(h) := e^m(t).$$

Then we have $X = H \backslash G \simeq \mathbb{Z}^2 \backslash \mathbb{R}^2$. The action of G on C given by

$$z \longmapsto z + \lambda\tau + \mu$$

restricts to the action of H on 0 ∈ C producing the points rτ + s forming a lattice L in C. By our general procedure from Mackey’s approach in 7.1, we get a representation by right translation on the space of functions φ on G, which satisfy the functional equation

$$\phi(hg) = e^m(t)\phi(g) \quad \text{for all } h = (r, s, t),\ r, s \in \mathbb{Z},\ t \in \mathbb{R},\ g \in G.$$

(As X is compact, we have no problem with the condition of the finite norm integral.) This induced representation is not irreducible and we proceed as follows. For Γ = Heis(Z) we denote by $L^2(\Gamma \backslash G)$ the Hilbert space of Γ-invariant measurable functions G → C with the scalar product

$$\langle \phi, \psi \rangle := \int_{\Gamma \backslash G} \phi(g)\overline{\psi(g)}\,dg.$$

This is a Hilbert space direct sum

$$L^2(\Gamma \backslash G) = \oplus_{m \in \mathbb{Z}}\, L^2(\Gamma \backslash G)_m$$

where $L^2(\Gamma \backslash G)_m$ denotes the subspace of functions φ that satisfy

$$\phi(g\kappa) = e^m(\kappa)\phi(g) \quad \text{for all } \kappa = (0, 0, \kappa) \in Z(\mathbb{R}),\ g \in G.$$

Now for m > 1/2, this induced representation on $L^2(\Gamma \backslash G)_m$ is not irreducible. We find here the first example of an important general statement.

Theorem 9.2 (Duality Theorem for the Heisenberg group): For m ∈ N, the multiplicity of the Schrödinger representation $\pi_m$ in $L^2(\Gamma \backslash G)$ is equal to the dimension of the space Θ(m).

Proof: There are several different approaches. In our context, we can argue like this: The Schrödinger representation $\pi_m$ transforms a function $f \in L^2(\mathbb{R})$ into

$$\pi_m(g_0)f(x) = e^m(\kappa_0 + (2x + \lambda_0)\mu_0)f(x + \lambda_0)$$

and is intertwined with the Heisenberg representation (see Example 7.1 at the beginning of 7.1) given by right translation on functions φ on G via

$$f(x) \longmapsto e^m(\kappa + \lambda\mu)f(\lambda) = \phi_f(g).$$


Obviously, one has $\phi(\kappa g) = e^m(\kappa)\phi(g)$ but, as the Heisenberg representation is induced by the two-dimensional subgroup {(0, µ, κ); µ, κ ∈ R} of G, one also has invariance concerning the µ-variable. Now, to produce a subspace of the lattice representation, one has to create in- or covariance concerning the λ-variable: We define for n ∈ Z and α = 0, 1, . . . , 2m − 1

$$\lambda_{n,\alpha} := (n + \alpha/(2m), 0, 0)$$

and get

$$\lambda_{n,\alpha}g = (\lambda + n + \alpha/(2m),\ \mu,\ \kappa + \mu(n + \alpha/(2m))).$$

Hence for every α, we have a map $\vartheta_\alpha : f \longmapsto \phi_{f,\alpha}$, called theta transform, with

$$\phi_{f,\alpha}(g) := \sum_{n \in \mathbb{Z}} \phi_f(\lambda_{n,\alpha}g) = \sum_{n \in \mathbb{Z}} e^m\big(\kappa + \mu(n + \alpha/(2m)) + (\lambda + n + \alpha/(2m))\mu\big)\,f(\lambda + n + \alpha/(2m)).$$

It is not difficult to verify that one has for all γ ∈ Γ and κ ∈ R

$$\phi_{f,\alpha}(\gamma g) = \phi_{f,\alpha}(g) \quad \text{and} \quad \phi_{f,\alpha}(g\kappa) = e^m(\kappa)\,\phi_{f,\alpha}(g).$$

Thus we see that for α = 0, 1, . . . , 2m − 1 every $\vartheta_\alpha$ intertwines the Schrödinger representation with the lattice representation. We still have to show that these are all different and that there are no others. To do so, we realize the Schrödinger representation $\pi_m$ by choosing as vacuum vector $f_0$ not $f_0(x) = \exp(-2\pi mx^2)$, as in Remark 6.13 in 6.5, but

$$f_0(x) = e^m(\tau x^2).$$

Then this function is annihilated by the differential operator $Y_- = \partial_x - (4\pi im)\tau x$ from the derived representation $d\pi_m$ (as in Example 6.9 in 6.3) belonging to $Y_- = P - \tau Q$ introduced above. If one applies the theta transform $\vartheta_\alpha$ to $f_0$ one gets

$$\vartheta_\alpha f_0(g) = e^m(\kappa + \tau\lambda^2 + \lambda\mu)\,\theta_{m,\alpha}(\tau, \lambda\tau + \mu),$$

i.e. exactly the theta basis of the space H(m) appearing in Proposition 9.1 above, which characterizes the lifting of Θ(m) to functions on G.

The reader could try to find another version of the proof using the bundle approach to induced representations. One can prove (Theorem of Appell-Humbert) that every line bundle on the torus $X := L \backslash \mathbb{C}$, $L = \mathbb{Z}\tau + \mathbb{Z}$, is of the form L(H, β), i.e. the quotient of the trivial bundle C × C → C by the action of L given by

$$(\ell, (v, z)) \longmapsto \big(\beta(\ell)\,e^{iH(z,\ell) + (1/2)H(\ell,\ell)}\,v,\ z + \ell\big) \quad \text{for all } \ell \in L,\ v, z \in \mathbb{C}.$$

Here H : C × C → C is a hermitian form such that E = Im(H) is integer valued on L and $\beta : L \to S^1$ a map with

$$\beta(\ell_1 + \ell_2) = e^{i\pi E(\ell_1, \ell_2)}\beta(\ell_1)\beta(\ell_2).$$

A computation shows that these hermitian forms are exactly the forms $H_m$, m ∈ Z, given by

$$H_m(z, w) = m\,z\bar w/\mathrm{Im}(\tau).$$

Up to now, we only treated the theta functions as quasiperiodic functions in the complex variable z. There is also a remarkable modular behaviour concerning the variable τ ∈ H. Before we discuss this, we return to some more general considerations.


9.2 Modular Forms and SL(2,R)

Besides theta functions, and strongly intertwined with them, modular forms are another topic relating classical function theory to number theory and algebraic and analytic geometry. There are many competent introductions. A very brief one is given in Serre [Se1]. Moreover we recommend the books by Schoeneberg [Sch] and Shimura [Sh], which contain more of the relevant details.

Classical Theory (Rudiments)

We already met several times the action of G = SL(2,R) on the upper half plane X = H := {τ = x + iy ∈ C; y > 0} given by

$$\Big(g = \begin{pmatrix} a & b \\ c & d \end{pmatrix},\ \tau\Big) \longmapsto g(\tau) := \frac{a\tau + b}{c\tau + d}.$$

As g and −g produce the same action, it is more convenient to treat the group

$$\bar G = PSL(2,\mathbb{R}) := SL(2,\mathbb{R})/\{\pm E_2\}.$$

But since we worked so far with G and its representation theory can be rather easily adapted to $\bar G$, we stay with G.

For k ∈ R (but mainly k ∈ Z) we have (see 7.2.2) the automorphic factor

$$j(g, \tau) := (c\tau + d)^{-1} \quad \text{resp.} \quad j_k(g, \tau) := (c\tau + d)^{-k}$$

fulfilling the cocycle relation (7.8) $j(gg', \tau) = j(g, g'(\tau))\,j(g', \tau)$. Let f be a smooth function on H. Then one writes

$$f|_k[g](\tau) := f(g(\tau))\,j_k(g, \tau)$$

(so that $\pi_k(g)f(\tau) := f(g^{-1}(\tau))\,j_k(g^{-1}, \tau)$ at least formally is the prescription of a representation).

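There are several important types of discrete subgroups Γ of G and the theory becomes more and more beautiful if one treats more and more of these. For our purpose of giving an introduction, it is sufficient to restrict ourselves to the (full) modular group

$$\Gamma = SL(2,\mathbb{Z})$$

and occasionally to the main congruence group of level N

$$\Gamma = \Gamma(N) := \{\gamma \in SL(2,\mathbb{Z});\ \gamma \equiv E_2 \bmod N\}, \quad N \in \mathbb{N},$$

Hecke’s special congruence subgroup

$$\Gamma = \Gamma_0(N) := \Big\{\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2,\mathbb{Z});\ c \equiv 0 \bmod N\Big\}, \quad N \in \mathbb{N},$$

and the theta group $\Gamma_\vartheta$ generated by $T^2$ and S, i.e.

$$\Gamma = \Gamma_\vartheta := \langle T^2, S \rangle \quad \text{with} \quad S = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \quad \text{and} \quad T = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}.$$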


There is a lot to be said about these groups. We mention only two items:

Remark 9.1: Γ = SL(2,Z) is generated by just two elements, namely S and T.

Exercise 9.3: Prove this and realize that T acts on H as a translation τ ⟼ T(τ) = τ + 1 and S as a reflection at the unit circle τ ⟼ S(τ) = −1/τ. Determine the fixed points for the action of Γ and their stabilizing groups.

Remark 9.2: Γ = SL(2,Z) has the following standard fundamental domain

$$F := \{\tau \in H;\ |\tau| > 1,\ -1/2 < \mathrm{Re}\,\tau < 1/2\}.$$

Here we use the definition from [Sh] p.15: For any discrete subgroup Γ of G = SL(2,R), we call F a fundamental domain for Γ \ H (or simply for Γ) if
– i) F is a connected open subset of H,
– ii) no two points of F are equivalent under Γ,
– iii) every point of H is equivalent to some point of the closure of F under Γ.
(Other authors sometimes use different definitions.)

Exercise 9.4: Verify this and show that the set of Γ-orbits on H is in bijection to

$$F' := F \cup \{\tau \in \mathbb{C};\ |\tau| \geq 1,\ \mathrm{Re}\,\tau = -1/2\} \cup \{\tau \in \mathbb{C};\ |\tau| = 1,\ -1/2 \leq \mathrm{Re}\,\tau \leq 0\}.$$

If you have trouble, search for help in [Sh] p.16 or any other book on modular forms.
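Property iii) of the fundamental domain can be made concrete by the usual reduction procedure: translate by a power of T until |Re τ| ≤ 1/2, apply S whenever |τ| < 1, and repeat. The sketch below is a floating-point illustration of this standard algorithm, not a proof. The function name and the step bound `max_steps` are assumptions of the sketch, and boundary points are not normalized to the set F′.

```python
import math

def reduce_to_fundamental_domain(tau, max_steps=1000):
    """Return (tau', (a, b, c, d)) with tau' = (a tau + b)/(c tau + d),
    ad - bc = 1, -1/2 <= Re tau' < 1/2 and |tau'| >= 1 (up to rounding)."""
    if tau.imag <= 0:
        raise ValueError("tau must lie in the upper half plane")
    a, b, c, d = 1, 0, 0, 1                 # matrix accumulated so far
    for _ in range(max_steps):
        n = math.floor(tau.real + 0.5)
        tau -= n                            # apply T^(-n)
        a, b = a - n * c, b - n * d
        if abs(tau) >= 1:
            return tau, (a, b, c, d)
        tau = -1 / tau                      # apply S
        a, b, c, d = -c, -d, a, b
    raise RuntimeError("no convergence within max_steps")

tau0 = complex(0.37, 0.02)
reduced, (a, b, c, d) = reduce_to_fundamental_domain(tau0)
moebius_image = (a * tau0 + b) / (c * tau0 + d)
```

The accumulated integer matrix is a word in S and T, so the sketch also illustrates Remark 9.1 in a concrete instance: the element of Γ moving τ into the closure of F is produced as a product of the two generators.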

We look at functions f : H → C with a certain invariance property with respect to the discrete subgroup. Here we take Γ = SL(2,Z) and k ∈ Z.

Definition 9.1: f is called (elliptic) modular form of weight k iff one has

– i) $f|_k[\gamma] = f$ for all γ ∈ Γ,
– ii) f is holomorphic,
– iii) for Im τ >> 0, f has a Fourier expansion (or q-development)

$$f(\tau) = \sum_{n \geq 0} c(n)q^n, \qquad q = e(\tau) = e^{2\pi i\tau},\ c(n) \in \mathbb{C}.$$

f is called cusp form (forme parabolique in French) if one has moreover c(0) = 0. We denote by $A_k(\Gamma)$ the space of all elliptic modular forms of weight k and by $S_k(\Gamma)$ the space of cusp forms. Both are C vector spaces and their dimensions are important characteristic numbers attached to Γ. From i) taken for $\gamma = -E_2$, one sees immediately that for odd k there are no non-trivial forms. For even k = 2ℓ one has

$$\dim A_{2\ell}(\Gamma) = \begin{cases} [\ell/6] & \text{if } \ell \equiv 1 \bmod 6,\ \ell \geq 0, \\ [\ell/6] + 1 & \text{if } \ell \not\equiv 1 \bmod 6,\ \ell \geq 0, \end{cases}$$

where [x] denotes the integral part of x, i.e. the largest integer n such that n ≤ x. For a proof see for instance [Se] p.88 or (with even more general information) [Sh] p.46.
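For orientation, the dimension formula is easily tabulated. The following sketch covers weights k ≥ 0 for Γ = SL(2,Z) only (the function name `dim_modular_forms` is illustrative):

```python
def dim_modular_forms(k):
    """Dimension of A_k(SL(2,Z)) for an integer weight k >= 0."""
    if k < 0:
        raise ValueError("this sketch only covers weights k >= 0")
    if k % 2 == 1:
        return 0                      # no non-trivial forms of odd weight
    l = k // 2                        # k = 2l
    return l // 6 if l % 6 == 1 else l // 6 + 1

table = {k: dim_modular_forms(k) for k in range(0, 26, 2)}
```

In particular there is no non-trivial form of weight 2, the spaces of weights 4, 6, 8, 10 and 14 are one-dimensional, and weight 12 is the first case of dimension 2, which makes room for the cusp form ∆ of Example 9.1 next to the Eisenstein series of Example 9.2.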

Remark 9.3: One also studies meromorphic modular forms where in ii) holomorphic is replaced by meromorphic and in iii) one has a q-development $f(\tau) = \sum_{n > -\infty} c(n)q^n$.


As Γ is generated by S and T, it is sufficient to demand the two conditions

$$f(\tau + 1) = f(\tau), \qquad f(-1/\tau) = \tau^kf(\tau).$$

Hence a modular form of weight k is given by a series $f(\tau) = \sum_{n \geq 0} c(n)q^n$ converging for |q| < 1 and fulfilling the condition

$$f(-1/\tau) = \tau^kf(\tau).$$

Remark 9.4: The condition iii) concerning the Fourier expansion can also be understood in several different ways, for instance as a certain boundedness condition for y → ∞ (in particular for $f \in S_k(SL(2,\mathbb{Z}))$, $|f(\tau)|y^{k/2}$ is bounded for all τ ∈ H), or, if k is even, as holomorphicity of the differential form $\alpha = f(\tau)\,d\tau^{k/2}$ at ∞: one can compactify the upper half plane H by adjoining as cusps the rational numbers and the “point” ∞, i.e. one takes $H^* := H \cup \mathbb{Q} \cup \{\infty\}$. Then one extends the action of SL(2,R) on H to an action on $H^*$ and identifies $\Gamma \backslash H^*$ with the Riemann sphere or $P^1(\mathbb{C})$. In this sense modular forms are understood as holomorphic k/2-forms on $P^1(\mathbb{C})$.

All this is made precise in books on the theory of modular forms. Here, for our purpose, the essential point is that, similar to the theta functions in the previous section, modular forms are realizations of dominant weight vectors of discrete series representations of G = SL(2,R). Before we go into this, let us at least give some examples of modular forms and the way they can be constructed.

Example 9.1: The best-known elliptic modular form is the ∆-function given by its q-expansion

$$(2\pi)^{-12}\Delta(q) := q\prod_{n=1}^{\infty}(1 - q^n)^{24} = \sum_{n=1}^{\infty}\tau(n)q^n = q - 24q^2 + 252q^3 - 1472q^4 + \ldots$$

This is a cusp form of weight 12. The function n ⟼ τ(n) is called the Ramanujan function. It has many interesting properties, for instance there is the Ramanujan conjecture

$$\tau(n) = O(n^{(11/2)+\varepsilon}), \qquad \varepsilon > 0,$$

which has been proved by Deligne in 1974 (as a consequence of his proof of the much more general Weil conjecture concerning Zeta functions of algebraic varieties (see Section 9.6)). But it is still an open question whether all τ(n) are non-zero.
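The first coefficients τ(n) can be recovered directly from the product expansion by multiplying out truncated power series with integer coefficients. A minimal sketch (the function name `ramanujan_tau` and the truncation parameter are illustrative):

```python
def ramanujan_tau(limit):
    """Return {n: tau(n)} for 1 <= n <= limit from q * prod (1 - q^n)^24."""
    # coeffs[i] is the coefficient of q^i in prod_{n>=1} (1 - q^n)^24,
    # truncated to degrees < limit; factors with n >= limit do not matter.
    coeffs = [0] * limit
    coeffs[0] = 1
    for n in range(1, limit):
        for _ in range(24):
            # Multiply in place by (1 - q^n), going from high to low degree
            # so that coeffs[i - n] still holds the old value.
            for i in range(limit - 1, n - 1, -1):
                coeffs[i] -= coeffs[i - n]
    # The extra factor q shifts every index by one.
    return {n: coeffs[n - 1] for n in range(1, limit + 1)}

tau = ramanujan_tau(12)
first_values = [tau[n] for n in range(1, 5)]
```

Besides reproducing the coefficients 1, −24, 252, −1472 displayed above, the table illustrates the multiplicativity τ(mn) = τ(m)τ(n) for coprime m, n, for example τ(6) = τ(2)τ(3). This property is a classical fact about ∆ that is not used in the text here.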

A modular form in a slightly more refined sense is the η-function, also appearing in the physics literature in several contexts,

$$\eta(\tau) := q^{1/24}\prod_{n=1}^{\infty}(1 - q^n).$$

Example 9.2: A more comprehensive construction principle leads to the Eisenstein series $E_k$ resp. $G_k$. For k = 4, 6, . . . , $G_k$ is defined by

$$G_k(\tau) := {\sum_{m,n}}'\,(m\tau + n)^{-k}$$

where the sum is taken over all m, n ∈ Z with the exception of (m, n) = (0, 0).

It is not too much trouble to verify that this is really a modular form of weight k and has a Fourier expansion (see [Sh] p.32)

$$G_k(\tau) = 2\zeta(k) + 2\,\frac{(2\pi i)^k}{(k - 1)!}\sum_{n=1}^{\infty}\sigma_{k-1}(n)q^n$$

where ζ is Riemann’s Zeta function defined for complex s with Re s > 1 by

$$\zeta(s) := \sum_{n=1}^{\infty} n^{-s}$$

and $\sigma_k(n)$ denotes the sum of $d^k$ for all positive divisors d of n. These functions appear naturally in the discussion of lattices L = τZ + Z in C. They can also be introduced by a far reaching averaging concept similar to the theta transform we used in the previous section to understand the construction of theta functions:

Every function ϕ on H formally gives rise to an object satisfying the covariance property i) of a modular form by the averaging

$$\sum_{\gamma \in \Gamma}\varphi|_k[\gamma].$$

One has the problem whether this series converges (and has the properties ii) and iii)). In general this is not to be expected. But one can use the following fact: For $g_0 = T^\ell$, ℓ ∈ Z, our automorphic factor has the property $j(g_0, \tau) = 1$ and, hence, one has

$$j(g_0g, \tau) = j(g, \tau) = (c\tau + d)^{-1}.$$

We see that for a constant function $\varphi_0$, say for $\varphi_0(\tau) = 1$, we have

$$\varphi_0|_k[T] = \varphi_0.$$

Thus to get a covariant expression, one has only to sum up

$$E_k(\tau) := \sum_{\gamma \in \Gamma_\infty \backslash \Gamma}(\varphi_0|_k[\gamma])(\tau)$$

with $\Gamma_\infty := \{T^\ell;\ \ell \in \mathbb{Z}\}$. The map

$$\Gamma \longrightarrow \mathbb{Z}^2, \qquad g = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \longmapsto (c, d)$$

induces a bijection

$$\Gamma_\infty \backslash \Gamma \simeq \{(c, d) \in \mathbb{Z}^2;\ (c, d) = 1\}$$

where here (c, d) denotes the greatest common divisor. Hence, one can write

$$E_k(\tau) = \sum_{c,d \in \mathbb{Z},\ (c,d)=1}(c\tau + d)^{-k}.$$

This $E_k$ has the properties we wanted and is related to $G_k$ by

$$\zeta(k)E_k(\tau) = G_k(\tau).$$

The same procedure goes through for every $\Gamma_\infty$-invariant ϕ. For instance, ϕ(τ) = e(mτ) leads to the so-called Poincaré series, which are used to construct bases for spaces of cusp forms.

Example 9.3: As already mentioned in 9.1, Jacobi’s theta function ϑ has a modular behaviour concerning the variable τ. Here we restrict the function to its value at z = 0

$$\vartheta(\tau) := \vartheta(0, \tau) = \sum_{n \in \mathbb{Z}} e^{\pi in^2\tau}.$$

One has the relations

$$\vartheta(\tau + 2) = \vartheta(\tau), \qquad \vartheta(-1/\tau) = (\tau/i)^{1/2}\vartheta(\tau).$$

The first one is easy to see and the second one requires some work (see for instance [FB] p.344/5). These relations entail that this ϑ is a modular form with respect to the theta group $\Gamma_\vartheta$ if one generalizes the definition of modular forms appropriately. There are more general theta series associated to quadratic forms resp. lattices leading directly to modular forms:

Let $S \in M_n(\mathbb{Z})$ be a symmetric positive matrix, i.e. with

$$S[x] := {}^txSx > 0 \quad \text{for all } 0 \neq x \in \mathbb{R}^n,$$

then one defines an associated theta series by

$$\vartheta(S; \tau) := \sum_{x \in \mathbb{Z}^n} e^{\pi iS[x]\tau}.$$

We get a function with period 2 whose Fourier development

$$\vartheta(S; \tau) = \sum_{m=0}^{\infty} a(S, m)e^{\pi im\tau}$$

encodes the numbers

$$a(S, m) := \#\{x \in \mathbb{Z}^n;\ S[x] = m\}$$

of representations of the natural number m by the quadratic form which belongs to S. If we assume that S is moreover unimodular, i.e. S is invertible with $S^{-1} \in M_n(\mathbb{Z})$, even, i.e. S[x] ∈ 2Z for all $x \in \mathbb{Z}^n$, and if n is divisible by 8, then ϑ(S; τ) is an elliptic modular form of weight n/2. For a proof we refer to [FB] p.352 or [Se1] p.53.

In particular, for

$$a_n(m) := a(E_n, m) = \#\Big\{x \in \mathbb{Z}^n;\ \sum_{j=1}^{n} x_j^2 = m\Big\}$$

one has results obtained by Jacobi in 1829 resp. 1828, namely

$$a_8(m) = 16\sum_{d|m}(-1)^{m-d}d^3 \quad \text{and} \quad a_4(m) = 8\sum_{d|m,\ 4 \nmid d} d.$$
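Jacobi’s two formulas can be confirmed for small m by brute-force counting. In the sketch below (the function names are illustrative), the representation numbers are obtained by repeatedly convolving the counting sequence for a single square, which amounts to taking powers of the theta series ϑ(τ) coefficientwise.

```python
def representation_numbers(n_squares, bound):
    """a_n(m) for 0 <= m <= bound: number of x in Z^n with sum of x_j^2 = m."""
    one_square = [0] * (bound + 1)
    x = 0
    while x * x <= bound:
        one_square[x * x] += 1 if x == 0 else 2     # x and -x
        x += 1
    counts = [1] + [0] * bound                       # empty sum represents 0
    for _ in range(n_squares):
        counts = [sum(counts[i] * one_square[m - i] for i in range(m + 1))
                  for m in range(bound + 1)]
    return counts

def jacobi_four(m):
    """8 times the sum of the divisors d of m with 4 not dividing d."""
    return 8 * sum(d for d in range(1, m + 1) if m % d == 0 and d % 4 != 0)

def jacobi_eight(m):
    """16 times the sum of (-1)^(m-d) d^3 over the divisors d of m."""
    return 16 * sum((-1) ** (m - d) * d ** 3
                    for d in range(1, m + 1) if m % d == 0)

BOUND = 40
a4 = representation_numbers(4, BOUND)
a8 = representation_numbers(8, BOUND)
four_ok = all(a4[m] == jacobi_four(m) for m in range(1, BOUND + 1))
eight_ok = all(a8[m] == jacobi_eight(m) for m in range(1, BOUND + 1))
```

For example, m = 1 has the 8 representations (±1, 0, 0, 0) and permutations as a sum of four squares, in agreement with 8 · 1.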


Proofs can be found for instance in [FB] or [Mu] I. The second relation can be verified using the nice identity

$$\vartheta^4(\tau) = \pi^{-2}(4G_2(2\tau) - G_2(\tau/2)).$$

Example 9.4: As a final example we present the j-invariant or modular invariant, which is a meromorphic modular form: We put

$$g_2(\tau) := 60G_4(\tau), \qquad g_3(\tau) := 140G_6(\tau).$$

Then, we have $\Delta(\tau) = g_2(\tau)^3 - 27g_3(\tau)^2$ and the j-function is defined by

$$j(\tau) := 12^3g_2(\tau)^3/\Delta(\tau).$$

j is a modular function with Fourier expansion (see for instance [Sh] p.33)

$$j(\tau) = q^{-1}\Big(1 + \sum_{n=1}^{\infty} c_nq^n\Big), \qquad c_n \in \mathbb{Z}.$$

This function is of special importance because it induces a bijection $\Gamma \backslash H^* \simeq P^1(\mathbb{C})$. And the coefficients $c_n$ hide some deep mysteries (essentially assembled under the heading of monstrous moonshine (see [CN], [FLM])).

Modular Forms as Functions on SL(2,R)

Parallel to the relation between the Second and First Realization in the induction procedure, one defines a lifting of functions f from the homogeneous space $X = G/K \simeq Gx_0$ with $K = G_{x_0}$ to the group, using the automorphic factor j(g, τ) satisfying the cocycle condition (7.8), by $f \longmapsto \phi_f$ where we put

$$\phi_f(g) := f(g(x_0))\,j(g, x_0) = f|[g](x_0).$$

Our aim is to translate the conditions i) to iii) from the definition of modular forms f to appropriate conditions characterizing the lifts of these forms among the functions φ on G. Classic sources giving all details for this are [GGP], [Ge] and [Bu].

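Remark 9.5: If for a subgroup Γ of G and a function f on X one has

$$(f|[\gamma])(x) = f(\gamma(x))\,j(\gamma, x) = f(x) \quad \text{for all } \gamma \in \Gamma, \tag{9.4}$$

the lifted function $\phi = \phi_f$ satisfies

ia) φ(γg) = φ(g) for all γ ∈ Γ

and

ib) $\phi(g\kappa) = \phi(g)\,j(\kappa, x_0)$ for all κ ∈ K.

Proof: By definition, (9.4), and (7.8), we have

$$\begin{aligned}
\phi(\gamma g) &= f(\gamma g(x_0))\,j(\gamma g, x_0) \\
&= f(g(x_0))\,j(\gamma, g(x_0))^{-1}\,j(\gamma g, x_0) \\
&= f(g(x_0))\,j(g, x_0) = \phi(g).
\end{aligned}$$

And by definition, (7.8), and $K = G_{x_0}$, we have as well

$$\begin{aligned}
\phi(g\kappa) &= f(g\kappa(x_0))\,j(g\kappa, x_0) \\
&= f(g(x_0))\,j(g, x_0)\,j(\kappa, x_0) \\
&= \phi(g)\,j(\kappa, x_0).
\end{aligned}$$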

Remark 9.6: In our case G = SL(2,R), $x_0 = i \in X = H = G/K$, K = SO(2) and $j(g, \tau) = (c\tau + d)^{-k}$, for the lifted function we have

$$\phi(g) = f(g(i))(ci + d)^{-k} = f(\tau)\,y^{k/2}e^{ik\vartheta}$$

if g is given by the Iwasawa decomposition (7.12)

$$g = n(x)t(y)r(\vartheta).$$

Proof: In this case one has

$$ci + d = y^{-1/2}(-i\sin\vartheta + \cos\vartheta) = y^{-1/2}e^{-i\vartheta}.$$
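The computation in this proof can be replayed numerically. The sketch below multiplies out n(x)t(y)r(ϑ) with the matrices n, t, r as given in 8.5 and checks g(i) = x + iy, the formula for ci + d, and the resulting expression for the lift. The sample values and the test function f are arbitrary choices of the sketch, as is the name `iwasawa_matrix`.

```python
import cmath
import math

def iwasawa_matrix(x, y, theta):
    """Entries (a, b, c, d) of n(x) t(y) r(theta)."""
    root = math.sqrt(y)
    alpha, beta = math.cos(theta), math.sin(theta)
    # n(x) t(y) = [[root, x/root], [0, 1/root]], then multiply by r(theta).
    a = root * alpha - (x / root) * beta
    b = root * beta + (x / root) * alpha
    c = -beta / root
    d = alpha / root
    return a, b, c, d

x, y, theta, k = 0.7, 1.8, 0.4, 12
a, b, c, d = iwasawa_matrix(x, y, theta)

image_of_i = (a * 1j + b) / (c * 1j + d)            # should be x + iy
factor = c * 1j + d                                  # should be y^(-1/2) e^(-i theta)
expected_factor = cmath.exp(-1j * theta) / math.sqrt(y)

def f(tau):
    """Arbitrary smooth test function on H (not a modular form)."""
    return cmath.exp(2j * cmath.pi * tau)

lift = f(image_of_i) * factor ** (-k)
expected_lift = f(complex(x, y)) * y ** (k / 2) * cmath.exp(1j * k * theta)
```

Note that the identity for the lift holds for any function f on H. Modularity of f enters only through the invariance property ia) of Remark 9.5.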

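The translation of the holomorphicity condition ii) to a condition restricting a function φ on G follows the scheme we already used several times: For $\mathfrak{sl}(2,\mathbb{C})_c = \langle Z, X_\pm \rangle$ one has the left invariant differential operators (7.16)

$$L_{X_\pm} = \pm(i/2)\,e^{\pm 2i\vartheta}\big(2y(\partial_x \mp i\partial_y) - \partial_\vartheta\big), \qquad L_Z = -i\partial_\vartheta,$$

and, hence, for a function $\phi = \phi_f$ with $\phi_f(g) = c_k(g)f(x, y)$, $c_k(g) := y^{k/2}e^{ik\vartheta}$,

$$L_Z\phi = k\phi, \qquad L_{X_\pm}\phi = c_{k\pm 2}\,D_\pm f$$

where

$$D_+ = i(\partial_x - i\partial_y - ik/y), \qquad D_- = -iy^2(\partial_x + i\partial_y).$$

Furthermore, using Wirtinger calculus with

$$\partial_\tau = (1/2)(\partial_x - i\partial_y), \qquad \partial_{\bar\tau} = (1/2)(\partial_x + i\partial_y)$$

we get

$$L_Z\phi = k\phi, \qquad L_{X_\pm}\phi = c_{k\pm 2}\,D_\pm f$$

where

$$D_+ = 2i(\partial_\tau + k/(\tau - \bar\tau)), \qquad D_- = (i/2)(\tau - \bar\tau)^2\partial_{\bar\tau}.$$

Hence in particular, we have found:

Remark 9.7: Let $\phi = \phi_f$ be the lift of f. The condition ii) that f is holomorphic on H is equivalent to

ii’) $L_{X_-}\phi = 0$.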


The last condition iii) is easily seen to be equivalent to the boundedness condition

|f(τ)| < M for Im τ >> 0.

With a bit more work, one translates the cuspidality of a modular form, i.e. $f \in S_k(\Gamma)$, to the following

Lemma: $f \in A_k(\Gamma)$ is in $S_k(\Gamma)$ iff for every $y_0 > 0$ there is an M > 0 such that

$$y^{k/2}|f(\tau)| < M \quad \text{for all } y = \operatorname{Im}\tau > y_0.$$

Hence one has for the lifted functions $\phi = \phi_f$ the following property.

Remark 9.8: The condition iii) translates to

$$|\phi(x, y, \vartheta)\, y^{-k/2}| < M \quad \text{for } y \gg 0$$

and the cusp condition c(0) = 0 to

|φ(g)| < M for all g ∈ G.

Remarks 9.5 to 9.8 together prove a fundamental statement.

Theorem 9.3: Via the lifting $\varphi_k$, the space $A_k(\Gamma)$ of modular forms on H is isomorphic to the space $\mathcal{A}_k$ of smooth functions φ on G with

ia) φ(γg) = φ(g) for all γ ∈ Γ = SL(2,Z),
ib) $\phi(g r(\vartheta)) = \phi(g) e^{ik\vartheta}$ for all r(ϑ) ∈ K = SO(2),
ii’) $L_{X_-}\phi = 0$,
iii’) $|\phi(g)\, y^{-k/2}| < M$ for all y >> 0.

Moreover, the space of cusp forms $S_k(\Gamma)$ is isomorphic to

$$\mathcal{A}_k^0 = \{\phi \in \mathcal{A}_k;\ |\phi| < M\}.$$

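Remark 9.9: The cusp condition in Remark 9.8 and in the Theorem above is equivalent to

$$\text{(cusp)}\qquad \int_{\Gamma\cap N\backslash N} \phi(n(x)g)\,dx = 0 \quad \text{for all } g \in G.$$

This follows by identifying N = {n(x); x ∈ R} with R and computing

$$\int_{\Gamma\cap N\backslash N} \phi(n(x)g)\,dx = \int_0^1 \phi(n(x)g)\,dx = \int_0^1 f(\tau + x)\, e^{ik\vartheta} y^{k/2}\,dx = e^{ik\vartheta} y^{k/2} c(0).$$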

Now we can refine Theorem 9.3 above:


Theorem 9.4: The space of cusp forms $\mathcal{A}_k^0$ is characterized by the following conditions

ia) φ(γg) = φ(g) for all γ ∈ Γ = SL(2,Z),
ib) $\phi(g r(\vartheta)) = \phi(g) e^{ik\vartheta}$ for all r(ϑ) ∈ K = SO(2),
ii) $\Delta\phi = L_\Omega\phi = k((k/2) - 1)\phi$,
iii) φ is bounded,
iv) $\int_{\Gamma\cap N\backslash N} \phi(n(x)g)\,dx = 0$ for all g ∈ G.

Here

$$\Delta = L_\Omega = 2y^2(\partial_x^2 + \partial_y^2) - 2y\,\partial_x\partial_\vartheta$$

is the Laplacian (7.17) from 7.2 and, by a small calculation, it is easy to see that the lift $\phi_f$ of a cusp form $f \in S_k(\Gamma)$ satisfies the condition ii) and hence all the conditions from the theorem. The proof of the fact that the lifting map $\varphi_k$ is a surjection onto the space given by the conditions ia) to iv) reduces to the verification that for $\phi \in \mathcal{A}_k^0$ the function f with

$$(9.5)\qquad f(\tau) := \phi(g)\, y^{-k/2} e^{-ik\vartheta}$$

is well defined for each g with g(i) = τ (which is quite obvious) and that f is a cusp form. This assertion is based on a considerably deeper theorem, the Theorem of the discrete spectrum, which we shall state below.

Decomposition of L2(Γ \ G)

For G = SL(2,R) ⊃ Γ = SL(2,Z), we put $H = L^2(\Gamma \backslash G)$ with respect to the biinvariant measure dg on X = Γ \ G fixed by

$$dg = \frac{1}{2\pi}\, \frac{dx\,dy}{y^2}\, d\vartheta.$$

We want to discuss the decomposition of the representation ρ of G given by right translation on H and show how the cusp forms come in here. A first indication is given by the

Remark 9.10: For $f \in S_k(\Gamma)$, the lift $\phi_f = \varphi_k(f)$ is in H.

This follows from

$$\int_X |\phi_f(g)|^2\,dg = \int_F |f(\tau)|^2 y^{k-2}\,dx\,dy < \infty$$

by the boundedness condition iii’) for the cusp form f.

We need another tool from functional analysis (for background see for instance [La1] p.389ff):

Remark 9.11: The minimal closed extension of the Laplacian ∆ is a selfadjoint operator $\hat\Delta$. As ∆ does by construction, the extension $\hat\Delta$ also commutes with the right translation and hence, by Schur’s Lemma (“the unitary Schur” mentioned in 3.2), the restriction of $\hat\Delta$ to $H_0$ is scalar for each G-invariant irreducible subspace $H_0$ of H. From functional analysis we take over that $\hat\Delta$ and ρ have a discrete spectrum only for the subspace of cuspidal functions

$$H^0 := \Big\{\phi \in L^2(\Gamma \backslash G);\ \int_{\Gamma\cap N\backslash N} \phi(n(x)g)\,dx = 0 \ \text{for almost all } g \in G\Big\}.$$


The central fact is as follows.

Theorem 9.5 (Theorem of the discrete spectrum): The right regular representation ρ of SL(2,R) on $H^0$ is completely reducible

$$H^0 = \bigoplus_{j\in\mathbb{Z}} H_j$$

and each irreducible component has finite multiplicity.

As this leads too much into (most fascinating) functional analysis, we can not go into the proof of this theorem and refer to [La1] and [GGP] p.94ff. A main point is a multiplicity-one statement going back to Godement, which we already mentioned (without proof) in 6.4. But we try to give an indication of the power of this theorem: For the proof of the characterization of cusp forms by the Laplacian in the theorem above, we have to show that the function f from (9.5)

$$f(\tau) = \phi(g)\, y^{-k/2} e^{-ik\vartheta}$$

is holomorphic. If one decomposes φ as in Theorem 9.5

$$\phi = \sum_j \phi_j,$$

the components $\phi_j$ have to fulfill the same relation ib) as φ does. And one has

$$\hat\Delta\phi_j = \lambda_k\phi_j, \qquad \lambda_k = k((k/2) - 1),$$

because by the last remark above, $\hat\Delta$ is scalar on $H_j$, i.e. one has a $\lambda_j$ with

$$\hat\Delta\phi_j = \lambda_j\phi_j.$$

And then using the inner product in H one has

$$\langle \phi, \hat\Delta\phi_j \rangle = \lambda_j \langle \phi, \phi_j \rangle$$

and, as $\hat\Delta$ is self adjoint,

$$\langle \phi, \hat\Delta\phi_j \rangle = \langle \hat\Delta\phi, \phi_j \rangle = \lambda_k \langle \phi, \phi_j \rangle$$

and therefore $\lambda_j = \lambda_k$. Hence, for the restriction of ρ to its irreducible subrepresentations the components $\phi_j$ are vectors of lowest weight k for the discrete series representation $\pi_k^+$. Such vectors are annihilated by $L_{X_-}$ and, hence, f is holomorphic.

As another consequence we come to the statement parallel to the one we proved in 9.1 for theta functions.

Theorem 9.6 (Duality Theorem for the discrete series of SL(2,R)): The multiplicity of $\pi_k^-$ in the right regular representation ρ on the space $H^0$ of cuspidal functions is equal to the number of linearly independent cusp forms of weight k

$$\mathrm{mult}(\pi_k^-, \rho) = \dim S_k(\Gamma).$$


Proof: There is a nice and comprehensive classic proof in [GGP] p.53-57. As we already have at hand the necessary tools, we reproduce its highlights, changing only the parametrization of G = SL(2,R) used by these authors to the one used here all the time.

a) At first, we show that $m_k := \mathrm{mult}(\pi_k^-, \rho)$ is equal to the number $m_k'$ of linearly independent functions Ψ on X = Γ \ G in $H^0$ with

$$(9.6)\qquad \Delta\Psi = \lambda_k\Psi, \quad \lambda_k := k((k/2) - 1), \quad \rho(r(\varphi))\Psi = e^{-ik\varphi}\Psi \ \text{ for all } \varphi \in \mathbb{R}.$$

ai) For each irreducible subspace $H_j$ of $H^0$, which is a model for $\pi_k^-$, there is, up to a nonzero factor, exactly one highest weight vector, where ρ restricted to $H_j$ has highest weight −k and, hence, ∆ has eigenvalue $\lambda_k$.
aii) Every $\Psi \in H^0$ satisfying (9.6) is a linear combination of vectors of highest weight from irreducible subspaces of $H^0$ equivalent to a representation space of $\pi_k^-$.

b) In the second step, we show that the number of these functions Ψ is equal to the number of linearly independent cusp forms: The space of functions Ψ as above is in bijection to the space of functions Φ on G with

$$(9.7)\qquad \Delta\Phi = \lambda_k\Phi, \quad \Phi(g r(\varphi)) = e^{-ik\varphi}\Phi(g), \quad \Phi(\gamma^{-1}g) = \Phi(g) \ \text{ for all } \varphi \in \mathbb{R},\ \gamma \in \Gamma.$$

We describe G by our parameters τ, ϑ with $\tau = g(i)$, $e^{-i\vartheta} = (ci + d)/|ci + d|$.

Hence, for $g_0^{-1}$ with components $a_0, b_0, c_0, d_0$, the matrix $g_0^{-1}g$ has parameters $\tau_1, \vartheta_1$ with $\tau_1 = g_0^{-1}(\tau)$, $\vartheta_1 = \vartheta - \arg(c_0\tau + d_0)$. And $g r(\varphi)$ has the parameters τ, ϑ + ϕ. Thus the second equation in (9.7) shows that Φ(g) is a function of type $f_1(\tau)e^{-ik\vartheta}$. Guided by the proof of our Remark 9.7, in order to also fulfill the other conditions in (9.7) we try the ansatz

$$\Phi(g) = e^{-ik\vartheta} y^{k/2} f(\tau).$$

Using our previous notation, this can also be written as $\Phi(g) = j(g, i)^{-k} f(\tau)$. We see that using this ansatz, the third condition in (9.7) is equivalent to the functional equation for modular forms

$$f(\gamma^{-1}(\tau))\, j(\gamma^{-1}, \tau)^{-k} = f(\tau) \quad \text{for all } \gamma \in \Gamma.$$

For the real coordinates x, y of τ the first equation in (9.7) comes down to

$$y^{2+(k/2)} e^{-ik\vartheta}(f_{xx} + f_{yy}) + ik\, y^{1+(k/2)} e^{-ik\vartheta}(f_x - if_y) = 0$$

resp.

$$y(f_{xx} + f_{yy}) - ik(f_x + if_y) = 0.$$

If f is a modular form, ergo holomorphic, one has $f_{\bar\tau} = 0$ and this equation is fulfilled. On the other hand, if Φ fulfills (9.7), by the Decomposition Theorem, it is a linear combination of highest weight vectors with highest weight −k, which are annihilated by $L_{X_+} = e^{2i\vartheta}(2y(\partial_x - i\partial_y) - \partial_\vartheta)$. And this again leads just to the condition $f_{\bar\tau} = 0$, showing that Φ corresponds to a linear combination of cusp forms.

The reader again is invited to watch very closely whether all signs are correct. It has to be emphasized that the notions highest and lowest weight depend on our parametrization of K = SO(2) and interchange if one uses r(−ϑ) in place of our r(ϑ) (as Gelbart does in [Ge]). [GGP] wisely use the term vector of dominant weight.


Remark 9.12: It is a natural question to ask under which conditions the lift $\phi = \phi_f$ of a cusp form f is a cyclic vector, i.e. generates an irreducible representation. There is a nice answer to this: this is the case if f is an eigenfunction of the Hecke operators T(p). We will introduce these operators later in the context of associating L-functions to modular forms and here only mention that the space of cusp forms has a basis consisting of such eigenfunctions.

The big difference between this duality statement and the one for the Heisenberg group in 9.1 is that in this case we have no way to an explicit construction of the modular forms starting from their property of being a dominant weight vector. Apparently, this reflects the fact that the structure of the group at hand is considerably more complicated than that of the Heisenberg group. Some progress here should give a deep insight into the modular group.

The characterization theorem is the starting point for several generalizations leading to the notion of automorphic forms. It can be extended to other groups, even adelic ones, as to be found for instance in [Ge]. There it is also pointed out that the right habitat for the modern theory of modular forms is not SL(2) but GL(2). Here we can not be too ambitious and we stay with G = SL(2,R), K = SO(2) and Γ = SL(2,Z) and only treat the following generalization.

Definition 9.2: The smooth function φ on G is called an automorphic form for Γ iff φ satisfies the following conditions

ia) φ(γg) = φ(g) for all γ ∈ Γ,
ib) φ is right K-finite, i.e. dim 〈φ(gr(ϑ)); r(ϑ) ∈ K〉 < ∞,
ii) φ is an eigenfunction of ∆,
iii) φ fulfills the growth condition that there are C, M with $|\phi(g)| \le C y^M$ for y −→ ∞.

φ is called cusp form if moreover one has

$$\text{(cusp)}\qquad \int_{\Gamma\cap N\backslash N} \phi(n(x)g)\,dx = 0.$$

Besides the cusp forms realizing highest weight vectors of $\pi_k^-$ we just treated, as their counterparts there come in forms realizing the lowest weight vectors of $\pi_k^+$, corresponding to antiholomorphic modular forms on H, and the famous Maass wave forms: these are bounded right K-invariant eigenfunctions of the Laplacian, $\Delta\phi = \lambda_s\phi$, with

$$\lambda_s = -(1 - s^2)/2, \qquad s \in i\mathbb{R}.$$

In analogy to the duality theorem stated above, one has here the fact that the dimension of the space $W_s(\Gamma)$ of these wave forms is equal to the multiplicity of the principal series representation $\pi_{is}$ in ρ (for this and in general more precise information see [GGP], in particular p.50). In [Ge] p.29 we find the statement: one knows that $W_s(\Gamma)$ is non-trivial for infinitely many values of s, but almost nothing is known concerning the specific values of s for which $W_s(\Gamma)$ is non-trivial. Still now, Maass wave forms are an interesting subject. For information on more recent research we recommend the paper by Lewis and Zagier [LZ].

We finish this section by some remarks concerning the continuous spectrum of the Laplacian, resp. the orthocomplement $H_c$ of the subspace $H_d$ of $H = L^2(\Gamma \backslash G)$ consisting of the cuspidal and the constant functions. One has rather good control about this space as it can be spanned by the so called non-holomorphic Eisenstein series. As sources for this we go back to [Ku] and [La1] p.239ff.

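The classical object is fixed as follows. One takes

$$f_\mu(\tau) := y^\mu, \qquad \mu \in \mathbb{C},$$

and for $k \in \mathbb{N}_0$

$$E_k(\tau, \mu) := \sum_{\gamma\in\Gamma_\infty\backslash\Gamma} \big(f_{(\mu-k)/2}|_k[\gamma]\big)(\tau) = \sum_{c,d\in\mathbb{Z},\ (c,d)=1} \frac{y^{(\mu-k)/2}}{|c\tau + d|^{\mu-k}}\, \frac{1}{(c\tau + d)^k}$$

with

$$\Gamma_\infty = \Big\{\gamma = \begin{pmatrix} 1 & b \\ & 1 \end{pmatrix};\ b \in \mathbb{Z}\Big\}.$$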

This prescription formally has the transformation property of a modular form of weight k and it can be verified that for µ with Re µ > 2 it defines a (real analytic) function on H, which is non-cuspidal. Using the standard lift $\varphi_k$, this function can be lifted to a function on G

$$E_k(g, \mu) := \varphi_k E_k(\tau, \mu) = y^{\mu/2} e^{ik\vartheta} \sum_{c,d\in\mathbb{Z},\ (c,d)=1} \Big(\frac{c\bar\tau + d}{c\tau + d}\Big)^{k/2} \frac{1}{|c\tau + d|^{\mu}} = \sum_{\gamma\in\Gamma_\infty\backslash\Gamma} \phi_{s,k}(\gamma g),$$

where for µ = is + 1 the functions (8.11) from 8.3 $\phi_{s,k}$, k ∈ Z, with

$$\phi_{s,k}(g) = y^{(is+1)/2} e^{ik\vartheta}$$

come in, which span the space of the principal series representations. We have

$$\Delta E_k(\cdot, \mu) = -((s^2 + 1)/2)\, E_k(\cdot, \mu)$$

and these functions are further examples for the definition of automorphic forms above.

Already in our (brief) chapter on abelian groups, we met with a continuous decomposition of the regular representation for G = R by throwing in elements of Fourier analysis without really explaining the background from functional analysis. Now, this gets worse here: There is the central statement ([Ge] p.33):

Theorem 9.7: The restriction $\rho_c$ of ρ to $H_c$ is the continuous sum of the principal series representations $\pi_{is,+}$, i.e.

$$\rho_c(g) = \int_0^\infty \pi_{is,+}(g)\,ds.$$


Using the inner product for functions $f, \tilde f$ on H given by

$$(f, \tilde f) := \int_F f(\tau)\,\overline{\tilde f(\tau)}\, y^{-2}\,dx\,dy$$

one can say that any slowly increasing function f on H, which is orthogonal to all cusp forms, may be expressed by a generalized Fourier integral

$$f(\tau) = \int_0^\infty \hat f(s)\, E(\tau, s)\,ds, \qquad \hat f(s) := (f, E(\cdot, s)), \qquad E(\tau, s) := \sum_{c,d\in\mathbb{Z};\ (c,d)=1} \frac{y^{(is)/2}}{|c\tau + d|^{2s}}.$$

Though we will not try to get to the bottom of the proofs, we indicate the following also otherwise useful tools:

The Theta Transform θ

We already used the averaging procedure while introducing Eisenstein series: If ϕ is a function on N \ G, one gets, at least formally, a function θϕ on Γ \ G by the prescription

$$\theta\varphi(g) := \sum_{\gamma\in\Gamma_\infty\backslash\Gamma} \varphi(\gamma g).$$

It is part of nice analysis to discuss under which conditions this leads to convergent objects and it is no wonder that it works at least if ϕ is a Schwartz function.

The Constant Term θ∗

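Let φ be an element of $H := L^2(\Gamma \backslash G)$. To this φ we associate θ∗φ (again at least formally) by

$$\theta^*\phi(g) := \int_{\Gamma_\infty\backslash N} \phi(ng)\,dn.$$

As φ can be assigned a periodic function, which one denotes again by φ, θ∗φ may be understood as the constant term in a Fourier expansion

$$\phi(n(x)g) = \sum_j \phi_j(g)\, e(jx).$$

θ and θ∗ are adjoint operators if we introduce an inner product

$$\langle \varphi, \tilde\varphi \rangle_{N\backslash G} := \int_{N\backslash G} \varphi(g)\,\overline{\tilde\varphi(g)}\,dg,$$

i.e. we have for functions assuring the convergence of our prescriptions the relation

$$\langle \theta\varphi, \phi \rangle_{\Gamma\backslash G} = \langle \varphi, \theta^*\phi \rangle_{N\backslash G}.$$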



The Zeta Transform Z

This is a prescription to produce elements in the representation space $H_{is,\pm}$ of a principal series representation $\pi_{is,\pm}$: For µ ∈ C and ϕ a function on N \ G, we put (again at least formally)

$$Z(\varphi, g, \mu) := \int_0^\infty \varphi(t(y)g)\, y^{-\mu}\, \frac{dy}{y}.$$

For µ = (is + 1)/2, we get

$$Z(\varphi, n(x_0)t(y_0)g, \mu) = \int_0^\infty \varphi(t(y_0 y)g)\, y^{-\mu}\, \frac{dy}{y} = y_0^{(is+1)/2}\, Z(\varphi, g, \mu),$$

i.e. an element fulfilling the equation (7.11) in 7.2 characterizing the elements of $H_{is,+}$.

The following statements are easy to prove.

Remark 9.13: For ϕ in the Schwartz space S(N \ G) and Re µ > 0, the integral defining Z(ϕ, g, µ) converges absolutely and Z(ϕ, g, µ) is an entire function in µ if moreover ϕ has finite support.

With some more work composing the Zeta and the Theta transform, one obtains a Γ-invariant function:

Remark 9.14: For ϕ ∈ S(N \ G) and Re µ > 1, the Eisenstein series

$$E(\varphi, g, \mu) := \theta Z(\varphi, g, \mu) = \sum_{\gamma\in\Gamma_\infty\backslash\Gamma} Z(\varphi, \gamma g, \mu)$$

is absolutely convergent.

By a small computation, we get the relation between this Eisenstein series and the one we defined above.

Remark 9.15: For

$$\varphi(g) := \varphi_{\ell,k}(g) = y^{-\ell} e^{-\pi/y} e^{ik\vartheta}, \qquad \ell \in \mathbb{N}_0,\ k \in \mathbb{Z},$$

one has

$$E(\varphi, g, \mu) = \pi^{-(\ell+\mu)}\, \Gamma(\ell + \mu)\, E_k(g, \mu).$$

The functions $\varphi_{\ell,k}$, $\ell \in \mathbb{N}_0$, k ∈ Z, are a Hilbert basis of the space $L^2(N \backslash G)$ with respect to the measure $d\nu = y^{-2}\,dy\,d\vartheta$. For every fixed ℓ and µ = is + 1, the images of these $\varphi_{\ell,k}$ by the Zeta transform are a basis for the representation spaces $H_{is,\pm}$ if one takes all even resp. odd k ∈ Z.

There is still much to be said here. We shall come back to the Zeta transform in the context of a discussion of ζ- and L-functions. But before doing this, we shall finish our outlook to automorphic forms by returning to theta functions.


9.3 Theta Functions and the Jacobi Group

In 9.1 we introduced theta functions and studied their behaviour as functions of the complex variable z. But as we already indicated and partially explored, there also is an important dependence on the modular variable τ ∈ H. This leads to the ultimate interpretation of (lifts of) theta functions as automorphic forms for the Jacobi group $G^J$, which we introduced in 8.5. We defined

$$q := e(\tau) = e^{2\pi i\tau}, \qquad \varepsilon := e(z) = e^{2\pi i z},$$

and the classic Jacobi theta function ϑ (9.1) given by

$$\vartheta(z, \tau) := \sum_{n\in\mathbb{Z}} e^{\pi i(n^2\tau + 2nz)} = \sum_{n\in\mathbb{Z}} q^{n^2/2}\varepsilon^n,$$

the theta function with characteristics $m = (m', m'') \in \mathbb{R}^2$

$$\theta_m(\tau, z) := \sum_{n\in\mathbb{Z}} e\big((1/2)(n + m')^2\tau + (n + m')(z + m'')\big),$$

and its variant (9.2) (following the notation of [EZ] p.58) with 2m ∈ N, µ = 0, 1, . . . , 2m − 1

$$\theta_{m,\mu}(\tau, z) := \sum_{r\in\mathbb{Z},\ r\equiv\mu \bmod 2m} q^{r^2/(4m)}\varepsilon^r = \sum_{n\in\mathbb{Z}} e^m\big((n + \mu/(2m))^2\tau + 2(n + \mu/(2m))z\big).$$

One has $\vartheta(z, \tau) = \theta_{1/2,0}(\tau, z)$. We already explored the quasiperiodicity property concerning the complex variable z

$$\vartheta(z + r\tau + s, \tau) = e^{-\pi i(r^2\tau + 2rz)}\,\vartheta(z, \tau)$$

and (in Example 9.3 in 9.2) the modular property of the value ϑ(τ) := ϑ(0, τ) for z = 0 (also called by the German word Thetanullwert)

$$\vartheta(\tau + 2) = \vartheta(\tau), \qquad \vartheta(-1/\tau) = (\tau/i)^{1/2}\,\vartheta(\tau).$$
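These identities are easy to test numerically with a truncated series. The following is a minimal sketch, assuming r, s ∈ Z in the quasiperiodicity relation and the principal branch of the square root for (τ/i)^{1/2}; the function name `jacobi_theta` and the truncation bound are illustrative choices, not part of the text.

```python
import cmath

def jacobi_theta(z, tau, terms=60):
    """Truncated series for theta(z, tau) = sum_n exp(pi i (n^2 tau + 2 n z))."""
    return sum(cmath.exp(cmath.pi * 1j * (n * n * tau + 2 * n * z))
               for n in range(-terms, terms + 1))

tau = 0.3 + 0.8j
z = 0.1 + 0.05j
r, s = 1, 2

# quasiperiodicity in z
lhs = jacobi_theta(z + r * tau + s, tau)
rhs = cmath.exp(-cmath.pi * 1j * (r * r * tau + 2 * r * z)) * jacobi_theta(z, tau)
print(abs(lhs - rhs))

# modular behaviour of the Thetanullwert
print(abs(jacobi_theta(0, tau + 2) - jacobi_theta(0, tau)))
print(abs(jacobi_theta(0, -1 / tau) - cmath.sqrt(tau / 1j) * jacobi_theta(0, tau)))
```

All three differences are at the level of rounding error, since the terms of the series decay like exp(−π n² Im τ).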

To be consistent with [EZ] and the usual Jacobi Theory, we have to change (from Jacobi’s and Mumford’s) notation and put

θ(τ, z) := ϑ(z, τ).

Then one has the fundamental modular transformation relation.

Theorem 9.8: For a, b, c, d ∈ Z with ad − bc = 1 and ab, cd ∈ 2Z, there is an eighth root of unity ζ such that one has

$$(9.8)\qquad \theta\Big(\frac{a\tau + b}{c\tau + d}, \frac{z}{c\tau + d}\Big) = (c\tau + d)^{1/2}\, \zeta\, e\big((1/2)cz^2/(c\tau + d)\big)\, \theta(\tau, z).$$

The unit root ζ can be determined as follows:

We may assume c > 0 or c = 0 and d > 0 (otherwise one would multiply the matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ by −1) and hence have $(c\tau + d)^{1/2}$ in the first quadrant, i.e. with real and imaginary part > 0. We take the Jacobi symbol $(\frac{a}{b})$, i.e. the multiplicative extension of the Legendre symbol, which for a, p ∈ Z, p prime, is given by

$$\Big(\frac{a}{p}\Big) = \begin{cases} 0 & \text{if } p \mid a, \\ 1 & \text{if } p \nmid a \text{ and there is an } x \in \mathbb{Z} \text{ with } x^2 \equiv a \bmod p, \\ -1 & \text{if } p \nmid a \text{ and there is no } x \in \mathbb{Z} \text{ with } x^2 \equiv a \bmod p. \end{cases}$$

Then one has

$$\zeta = \begin{cases} i^{(d-1)/2}\big(\frac{c}{|d|}\big) & \text{for } c \text{ even and } d \text{ odd}, \\ e^{-\pi i c/4}\big(\frac{d}{c}\big) & \text{for } c \text{ odd and } d \text{ even}. \end{cases}$$

The proof of this theorem is a very nice piece of analysis but too lengthy for this text. So we refer to the books of Eichler [Ei], in particular p.59-62, Lion-Vergne [LV] p.145ff, or Mumford [Mu] I.
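Although the proof is out of reach here, the recipe for ζ can at least be tried out numerically. The following is a minimal sketch, assuming the condition ad − bc = 1 with ab, cd even, the normalization c > 0 or c = 0, d > 0, the convention that the Jacobi symbol with denominator 1 equals 1, and the principal branch for (cτ + d)^{1/2}. The helper names `jacobi`, `zeta8`, `theta` and `defect` are hypothetical.

```python
import cmath

def jacobi(a, n):
    """Jacobi symbol (a/n) for odd n > 0, by the usual reciprocity algorithm."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be odd and positive")
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def zeta8(c, d):
    """Eighth root of unity from the case distinction following Theorem 9.8."""
    if c % 2 == 0 and d % 2 == 1:
        return (1, 1j, -1, -1j)[((d - 1) // 2) % 4] * jacobi(c, abs(d))
    if c % 2 == 1 and d % 2 == 0:
        return cmath.exp(-cmath.pi * 1j * c / 4) * jacobi(d, c)
    raise ValueError("c and d must have opposite parity")

def theta(tau, z, terms=60):
    """Truncated series for theta(tau, z) = sum_n exp(pi i (n^2 tau + 2 n z))."""
    return sum(cmath.exp(cmath.pi * 1j * (n * n * tau + 2 * n * z))
               for n in range(-terms, terms + 1))

def defect(a, b, c, d, tau, z):
    """Absolute difference of the two sides of (9.8)."""
    w = c * tau + d
    lhs = theta((a * tau + b) / w, z / w)
    rhs = cmath.sqrt(w) * zeta8(c, d) * cmath.exp(cmath.pi * 1j * c * z * z / w) * theta(tau, z)
    return abs(lhs - rhs)

for m in [(0, -1, 1, 0), (1, 2, 0, 1), (1, 0, 2, 1)]:
    print(m, defect(*m, 0.3 + 0.8j, 0.1 + 0.05j))
```

For the three sample matrices the defect is at rounding level; the first one reproduces the inversion formula with ζ = e^{−πi/4}.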

Another most remarkable fact is the following characterization of θ, which comes about as an exercise in complex function theory.

Theorem 9.9: Let

f : H × C −→ C

be a holomorphic function satisfying

a) f(τ, z + 1) = f(τ, z),
b) f(τ, z + τ) = f(τ, z) e(−(1/2)τ − z),
c) f(τ + 1, z + 1/2) = f(τ, z),
d) $f(-1/\tau, z/\tau) = f(\tau, z)\,(-i\tau)^{1/2}\, e((1/2)z^2/\tau)$ for all (z, τ) ∈ C × H,
e) $\lim_{\operatorname{Im}\tau \to \infty} f(\tau, z) = 1$.

Then one has f = θ.

By a closer look one sees that the quasi-periodicity and the modular transformation property of θ can be combined to a transformation property under an action of the Jacobi group $G^J$, which is a semi-direct product of SL(2,R) and Heis(R) and in 8.5 was realized as a subgroup of Sp(2,R) given by four-by-four matrices. As in 8.5, we use two notations

$$G^J \ni g = \begin{pmatrix} a & b \\ c & d \end{pmatrix}(\lambda, \mu, \kappa) = (p, q, r)\begin{pmatrix} a & b \\ c & d \end{pmatrix}.$$

By a short calculation one can verify that for (τ, z) ∈ H × C

$$(\tau, z) \longmapsto g(\tau, z) := \Big(\frac{a\tau + b}{c\tau + d}, \frac{z + \lambda\tau + \mu}{c\tau + d}\Big)$$

defines an action of $G^J$ on H × C. The stabilizing group of (i, 0) is $K^J := SO(2) \times C(\mathbb{R})$, C(R) the center of the Heisenberg group. One has

$$G^J/K^J \simeq H \times \mathbb{C}, \qquad g(i, 0) = (\tau = x + iy,\ z = p\tau + q) \ \text{ where } g = (p, q, r)\,n(x)t(y)r(\vartheta).$$


And for k, m ∈ Q one has (see the first pages of [EZ]) an automorphic factor

$$(9.9)\qquad j_{k,m}(g; (\tau, z)) := (c\tau + d)^{-k}\, e^m\Big(-\frac{c(z + \lambda\tau + \mu)^2}{c\tau + d} + \lambda^2\tau + 2\lambda z + \lambda\mu + \kappa\Big).$$

This leads to an action of $G^J$ on functions f living on H × C via

$$f \longmapsto f|_{k,m}[g] \quad \text{with} \quad f|_{k,m}[g](\tau, z) := f(g(\tau, z))\, j_{k,m}(g; (\tau, z)).$$

Similar to modular forms and theta functions as automorphic forms for SL(2,R) resp. Heis(R), for the Jacobi group $G^J$, its discrete subgroup $\Gamma^J := SL(2,\mathbb{Z}) \ltimes \mathbb{Z}^3$, and $k, m \in \mathbb{N}_0$ one defines (as in [EZ]):

Definition 9.3: A function f : H × C −→ C is called a Jacobi form of weight k and index m if one has

– i) $f|_{k,m}[\gamma](\tau, z) = f(\tau, z)$ for all $\gamma \in \Gamma^J$,

– ii) f is holomorphic,

– iii) for Im τ >> 0, f has a Fourier development

$$f(\tau, z) = \sum_{n,r\in\mathbb{Z},\ 4mn - r^2 \ge 0} c(n, r)\, q^n\varepsilon^r, \qquad q = e(\tau),\ \varepsilon = e(z),\ c(n, r) \in \mathbb{C}.$$

f is called a cusp form if it satisfies moreover

– iii’) c(n, r) = 0 unless $4mn > r^2$.

The vector spaces of all such functions f are denoted by $J_{k,m}$ resp. $J_{k,m}^{cusp}$. They are finite dimensional by Theorem 1.1 of [EZ]. As usual, one lifts a function f on H × C to a function $\phi = \phi_f$ on $G^J$ by

$$\phi_f(g) := f(g(i, 0))\, j_{k,m}(g; (i, 0)) = f(x, y, p, q)\, e^m(\kappa + pz)\, e^{ik\vartheta} y^{k/2}, \qquad g = (p, q, \kappa)\,n(x)t(y)r(\vartheta).$$

Similarly to the analysis in the former cases, one can characterize the images of Jacobi forms.

Proposition 9.2: $J_{k,m}$ is isomorphic to the space $\mathcal{A}_{k,m}$ of complex functions $\phi \in C^\infty(G^J)$ with

– i) φ(γg) = φ(g) for all $\gamma \in \Gamma^J$,

– ii) $\phi(g\, r(\vartheta)(0, 0, \kappa)) = \phi(g)\, e^m(\kappa)\, e^{ik\vartheta}$,

– iii) $L_{X_-}\phi = L_{Y_-}\phi = 0$,

– iv) $\phi(g)\, y^{-k/2}$ is bounded in domains of type $y > y_0$.

$J_{k,m}^{cusp}$ is isomorphic to the subspace $\mathcal{A}_{k,m}^0$ of $\mathcal{A}_{k,m}$ with

– iv’) φ(g) is bounded.

For a proof of this and the following statement see [BeB] or [BeS] p. 76ff.


Theorem 9.10 (Duality Theorem for the Jacobi Group): For m, k ∈ N, $\dim J_{k,m}^{cusp}$ is equal to the multiplicity of the appropriate discrete series representation of $G^J$ in the right regular representation ρ on the space $H_m^0$ of cuspidal functions of type m, which consists of the elements $\phi \in L^2(\Gamma^J \backslash G^J)$ satisfying the functional equation

φ(g(0, 0, κ)) = e(mκ)φ(g) for all κ ∈ R

and fulfilling the cusp condition

$$\int_{N^J\cap\Gamma^J\backslash N^J} \phi(g_{\lambda/(2m)}\, g)\,dn = 0 \quad \text{for almost all } g \in G^J \text{ and } \lambda = 0, \ldots, 2m - 1$$

where $N^J := \{(0, q, 0)\,n(x);\ q, x \in \mathbb{R}\}$ and $g_\lambda := (\lambda, 0, 0)$.

Further information concerning the continuous part of the decomposition of $L^2(\Gamma^J \backslash G^J)$ is collected in [BeS].

At a closer look, one can see that the classic ϑ has the transformation property of a Jacobi form of weight and index 1/2 for a certain subgroup of $\Gamma^J$. As already the infinitesimal considerations indicated, one has to multiply with modular forms of half-integral weight to get Jacobi forms in the sense of the definition above. This is very nicely exploited in [EZ]. The appearance of the half-integral weights can be explained by the fact that, somewhat similar to the relation of the representation theory of SO(3) and SU(2) resp. SO(3, 1) and SL(2,C), here one has coming in a double cover of SL(2,R) called the metaplectic group Mp(2,R). But this group is no longer a linear group and hence, unfortunately, outside the scope of this book.

9.4 Hecke’s Theory of L−Functions of Modular Forms

It is a classical topic of analysis to study the relation of a power series $f := \sum_{n=1}^\infty c(n)q^n$ to a Dirichlet series

$$L_f(s) := \sum_{n=1}^\infty \frac{c(n)}{n^s}.$$

In our context, we take a cusp form $f \in S_k(\Gamma)$ and get an associated Dirichlet series convergent for s with Re s > k, as the coefficients c(n) of the Fourier expansion of f fulfill an appropriate growth condition. The most important Dirichlet series have an Euler product similar to the prototype of a Dirichlet series, namely the Riemann Zeta series

$$\zeta(s) := \sum_{n=1}^\infty n^{-s},$$

which is convergent for s with Re s > 1 and has the Euler product (taken over all prime numbers p)

$$\zeta(s) = \prod_p \frac{1}{1 - p^{-s}}.$$

Moreover, it is known that ζ fulfills a functional equation under s −→ 1 − s and can be continued to a meromorphic function on the whole plane C with a simple pole at s = 1.
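The agreement of series and Euler product can be seen numerically at s = 2, where the classical value ζ(2) = π²/6 is available for comparison. The following is a minimal sketch; the truncation bounds and function names are arbitrary illustrative choices, and both quantities are only partial approximations.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: list of all primes <= n."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            for q in range(p * p, n + 1, p):
                sieve[q] = False
    return [p for p, is_prime in enumerate(sieve) if is_prime]

def zeta_partial_sum(s, terms):
    """Partial sum of the Dirichlet series sum_{n <= terms} n^{-s}."""
    return sum(n ** (-s) for n in range(1, terms + 1))

def zeta_partial_product(s, bound):
    """Partial Euler product over all primes p <= bound."""
    prod = 1.0
    for p in primes_up_to(bound):
        prod *= 1.0 / (1.0 - p ** (-s))
    return prod

print(zeta_partial_sum(2, 100000), zeta_partial_product(2, 10000), math.pi ** 2 / 6)
```

Both truncations approach π²/6 from below, the product because every omitted factor is larger than 1.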


Roughly said, Hecke’s theory analyses how these and more general properties are related to properties of modular forms. The main tools are the following operators.

Hecke Operators

Definition 9.4: For a natural number n one introduces the Hecke operator T(n) as an operator acting on functions f on the upper half plane H by

$$(9.10)\qquad T(n)f(\tau) := n^{k-1} \sum_{a\ge 1,\ ad=n,\ 0\le b<d} d^{-k} f\Big(\frac{a\tau + b}{d}\Big).$$

The origin of this prescription can be understood in several ways. There is the more general background asking for the decomposition of double cosets into cosets for Γ: For instance, for a prime p one has

$$\Gamma\begin{pmatrix} 1 & \\ & p \end{pmatrix}\Gamma = \bigsqcup_{0\le b<p} \Gamma\begin{pmatrix} 1 & b \\ & p \end{pmatrix} \ \sqcup\ \Gamma\begin{pmatrix} p & \\ & 1 \end{pmatrix}$$

(as exploited in [Sh] p.51ff). But we follow the presentation in [Se] p.98 based on the notion of correspondences: Let E be a set and let $G_E$ be the free abelian group generated by E. A correspondence on E with integer coefficients is a homomorphism T of $G_E$ to itself. We can describe T by its values on the elements x of E:

$$T(x) = \sum_{y\in E} n_y(x)\, y, \qquad n_y(x) \in \mathbb{Z},\ n_y(x) = 0 \text{ for almost all } y.$$

A C-valued function F on E can be extended linearly to a function on $G_E$, which is again denoted by F. The transform of F by T, denoted TF, is the restriction to E of the function F ∘ T. With the notation introduced above, it is given by

$$TF(x) = F(T(x)) = \sum_{y\in E} n_y(x)\, F(y).$$

Let R be the set of lattices L in C and n ∈ N. We denote by T(n) the correspondence on R which transforms a lattice to the sum (in $G_R$) of its sublattices $L_n$ of index n. Thus we have for L ∈ R

$$T(n)L = \sum_{[L:L_n]=n} L_n.$$

The sum on the right side is finite. If n is prime, one sees that one has n + 1 such sublattices. For the general case one has

Lemma: There is a bijection between the set of matrices

$$M = \begin{pmatrix} a & b \\ 0 & d \end{pmatrix}; \qquad ad = n,\ a \ge 1,\ 0 \le b < d,$$

and the set of lattices $L_n$ of index n in L. For L with basis $(\tau_1, \tau_2)$ this bijection is given by associating to the matrix M the lattice with basis $\tau_1' = a\tau_1 + b\tau_2$, $\tau_2' = d\tau_2$.
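The Lemma turns the counting of sublattices into a finite enumeration. The following is a minimal sketch listing the triples (a, b, d) of the Lemma; the function names are hypothetical, and the comparison with the divisor sum σ₁(n) is a standard consequence (each divisor d of n contributes d values of b) rather than a statement taken from the text.

```python
def hecke_matrices(n):
    """All triples (a, b, d) with a * d = n, a >= 1 and 0 <= b < d."""
    return [(a, b, n // a)
            for a in range(1, n + 1) if n % a == 0
            for b in range(n // a)]

def sigma(n, k=1):
    """Divisor sum sigma_k(n) = sum of d^k over all positive divisors d of n."""
    return sum(d ** k for d in range(1, n + 1) if n % d == 0)

for n in (2, 3, 4, 6, 12):
    print(n, len(hecke_matrices(n)), sigma(n))
```

For a prime p the list has p + 1 entries, namely (1, b, p) for 0 ≤ b < p together with (p, 0, 1), in agreement with the count of sublattices of prime index.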



One also uses homothety operators $R_\lambda$, λ ∈ C∗, defined by

$$R_\lambda L = \lambda L.$$

It makes sense to compose the correspondences T(n) and $R_\lambda$, since they are endomorphisms of the abelian group $G_R$.

Proposition 9.3: The correspondences T(n) and $R_\lambda$ verify the identities

$$R_\lambda R_\mu = R_{\lambda\mu}, \qquad \lambda, \mu \in \mathbb{C}^*,$$

$$R_\lambda T(n) = T(n) R_\lambda, \qquad n \in \mathbb{N},\ \lambda \in \mathbb{C}^*,$$

$$T(m)T(n) = T(mn), \qquad \text{if } (m, n) = 1,$$

$$T(p^n)T(p) = T(p^{n+1}) + p\,T(p^{n-1})R_p, \qquad p \text{ prime},\ n \in \mathbb{N}.$$

Proof: We refer to [Se] p.98.

Corollary 1: The $T(p^n)$ are polynomials in T(p) and $R_p$.
Corollary 2: The algebra generated by the $R_\lambda$ and the T(p), p prime, is commutative; it contains all the T(n).

For the description of the action of these operators on modular forms of weight k, we associate to f a corresponding function F defined on the set R of lattices by

$$F(L) := \tau_2^{-k} f(\tau_1/\tau_2)$$

if $(\tau_1, \tau_2)$ is a basis of L. We define T(n)f as the function on H associated to the function $n^{k-1}T(n)F$ on R, i.e.

$$T(n)f(\tau) := n^{k-1}\, T(n)F(L(\tau, 1))$$

where L(τ, 1) denotes the lattice with basis $(\tau := \tau_1/\tau_2, 1)$ (it can be arranged and is silently understood that the description of the lattices is chosen such that this τ is in H). Using the Lemma above, we get the formula (9.10) in the definition at the beginning: the sublattice with basis (aτ + b, d) contributes $d^{-k} f((a\tau + b)/d)$. Certainly, one has to and can verify that this T(n) transforms the spaces of modular forms of weight k into itself (and similarly for cusp forms). We leave this to the reader and only note the relations, which are immediate consequences of Proposition 9.3

$$(9.11)\qquad T(m)T(n)f = T(mn)f, \qquad \text{if } (m, n) = 1,$$

$$(9.12)\qquad T(p^n)T(p)f = T(p^{n+1})f + p^{k-1}\,T(p^{n-1})f, \qquad p \text{ prime},\ n \in \mathbb{N}.$$

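The central point of all this is the action of the Hecke operators expressed by the coefficients of the q-development $f(\tau) = \sum c(n)q^n$:

Proposition 9.4: We have

$$T(n)f(\tau) = \sum_{m\in\mathbb{Z}} \gamma(m)\, q^m$$

with

$$(9.13)\qquad \gamma(m) = \sum_{a\in\mathbb{N},\ a|(n,m)} a^{k-1}\, c(nm/a^2).$$

Using the relations obtained earlier, the proof is straightforward. Again we refer to [Se] p.101. In particular, we come to

$$\gamma(0) = \sigma_{k-1}(n)\,c(0), \qquad \sigma_k(n) := \sum_{d\in\mathbb{N},\ d|n} d^k,$$

$$\gamma(1) = c(n),$$

$$\gamma(m) = \begin{cases} c(pm) & \text{if } n = p \text{ prime and } m \not\equiv 0 \bmod p, \\ c(pm) + p^{k-1}c(m/p) & \text{if } n = p \text{ prime and } m \equiv 0 \bmod p. \end{cases}$$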
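Formula (9.13) is directly computable on truncated q-expansions. The following is a minimal sketch that applies it to the coefficients τ(n) of the weight 12 cusp form q∏(1 − qⁿ)²⁴ (the normalized discriminant) and compares T(n)f with c(n)·f coefficient by coefficient; the truncation at degree 60 and the function names are illustrative assumptions.

```python
from math import gcd

def delta_coeffs(N):
    """Coefficients c[0..N] of q * prod_{n>=1} (1 - q^n)^24, truncated at degree N."""
    poly = [1] + [0] * N              # running product, truncated at degree N
    for n in range(1, N + 1):
        for _ in range(24):           # multiply by (1 - q^n), 24 times
            for i in range(N, n - 1, -1):
                poly[i] -= poly[i - n]
    return [0] + poly[:N]             # the extra factor q shifts every index by one

def hecke_coeffs(c, n, k, M):
    """gamma(0..M) according to (9.13); requires the coefficients c[0..n*M]."""
    gam = []
    for m in range(M + 1):
        g = gcd(n, m)                 # note gcd(n, 0) = n
        gam.append(sum(a ** (k - 1) * c[n * m // (a * a)]
                       for a in range(1, g + 1) if g % a == 0))
    return gam

c = delta_coeffs(60)
print(c[1:7])                          # [1, -24, 252, -1472, 4830, -6048]
for n in (2, 3, 4, 5, 6):
    M = 60 // n
    print(n, hecke_coeffs(c, n, 12, M) == [c[n] * c[m] for m in range(M + 1)])
```

In each case the comparison prints True, i.e. on the computed range the form behaves as an eigenfunction of T(n) with eigenvalue c(n).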

Since the Hecke operators commute, one can hope for simultaneous diagonalizability.

Hecke Eigenforms

Let f be a Hecke eigenform of weight k, i.e. a modular form of weight k, which is not identically zero and an eigenfunction for all T(n), n ∈ N,

T(n)f(τ) = λ(n)f(τ).

Examples are the Eisenstein series $G_k$ and the discriminant ∆. One has

$$T(n)G_k = \sigma_{k-1}(n)\,G_k, \qquad T(n)\Delta = \tau(n)\,\Delta, \qquad \text{for all } n \in \mathbb{N},$$

where τ(n) denotes the n-th Fourier coefficient of $(2\pi)^{-12}\Delta = q\prod_{n\ge 1}(1 - q^n)^{24}$.

The verification is non-trivial but standard ([Se] p.104).

Our central statement is as follows:

Theorem 9.11: The coefficient c(1) of a Hecke eigenform f is non-zero. If f is normalized by c(1) = 1, one has

λ(n) = c(n) for all n ∈ N.

Proof: The formula (9.13) in Proposition 9.4 above shows that the coefficient of q in T(n)f is c(n). On the other hand, as f is a T(n)-eigenfunction, it is also λ(n)c(1). Thus we have c(n) = λ(n)c(1). If c(1) were zero, all the c(n), n > 0, would be zero, and f would be a constant, which is absurd.

Applying the first statement of the Theorem to the difference of two normalized Hecke eigenforms, we get

Corollary 1: Two modular forms of weight k, k > 0, which are both normalized Hecke eigenforms with the same eigenvalues, coincide.

By the second statement of the Theorem the relations (9.11) and (9.12) for the Hecke operators translate immediately to relations for the Fourier coefficients. We get

Corollary 2: For the Fourier coefficients of a normalized Hecke eigenform f of weight k one has the relations

$$(9.14)\qquad c(m)c(n) = c(mn), \qquad \text{if } (m, n) = 1,$$

$$(9.15)\qquad c(p^n)c(p) = c(p^{n+1}) + p^{k-1}c(p^{n-1}), \qquad \text{if } p \text{ is prime and } n \in \mathbb{N}.$$


Now, these relations for the Fourier coefficients of a Hecke eigenform $f = \sum c(n)q^n$ translate to the statement that the associated Dirichlet series $L_f(s) = \sum c(n)n^{-s}$ (convergent, as already mentioned, for Re s > k) has an Euler product.

Corollary 3: The Dirichlet series $L_f$ of a normalized Hecke eigenform f of weight k has the Euler product

$$(9.16)\qquad L_f(s) = \prod_{p\in\mathbb{P}} \frac{1}{1 - c(p)p^{-s} + p^{k-1-2s}}.$$

Proof: From Corollary 2 we know that n −→ c(n) is a multiplicative function, so with $t := p^{-s}$ we can write

$$L_f(s) = \prod_{p\in\mathbb{P}} \Big(\sum_{n=0}^\infty c(p^n)t^n\Big).$$

Hence it is sufficient to show that we have

$$\sum_{n=0}^\infty c(p^n)t^n = L_{f,p}(t) \quad \text{for} \quad L_{f,p}(t) := (1 - c(p)t + p^{k-1}t^2)^{-1}.$$

We compute

$$\psi(t) := \Big(\sum_{n=0}^\infty c(p^n)t^n\Big)(1 - c(p)t + p^{k-1}t^2) =: \sum d(n)t^n$$

and as coefficient of t get d(1) = c(p) − c(p) = 0 and, for n ≥ 1, as coefficient of $t^{n+1}$

$$d(n + 1) = c(p^{n+1}) - c(p)c(p^n) + p^{k-1}c(p^{n-1}) = 0$$

in consequence of the second relation in Corollary 2. I.e., we have ψ(t) = c(1) = 1.
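The power series identity ψ(t) = 1 from the proof can be checked on actual coefficients. The following is a minimal sketch for the weight 12 form q∏(1 − qⁿ)²⁴, whose coefficients are computed from the product itself and not from the recursion; the truncation orders and function names are illustrative assumptions.

```python
def delta_coeffs(N):
    """Coefficients c[0..N] of q * prod_{n>=1} (1 - q^n)^24, truncated at degree N."""
    poly = [1] + [0] * N
    for n in range(1, N + 1):
        for _ in range(24):                       # multiply by (1 - q^n), 24 times
            for i in range(N, n - 1, -1):
                poly[i] -= poly[i - n]
    return [0] + poly[:N]

def psi_coeffs(c, p, k, terms):
    """Coefficients d(0..terms) of (sum_n c(p^n) t^n) * (1 - c(p) t + p^(k-1) t^2)."""
    series = [c[p ** n] for n in range(terms + 1)]
    quad = [1, -c[p], p ** (k - 1)]
    d = [0] * (terms + 1)
    for i, u in enumerate(series):
        for j, v in enumerate(quad):
            if i + j <= terms:                    # drop terms beyond the truncation order
                d[i + j] += u * v
    return d

c = delta_coeffs(32)
print(psi_coeffs(c, 2, 12, 5))   # [1, 0, 0, 0, 0, 0]
print(psi_coeffs(c, 3, 12, 3))   # [1, 0, 0, 0]
```

Up to the truncation order the product is the constant series 1, which is the local statement behind the Euler factor in (9.16).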

Remark 9.16: The statement that $L_f$ has an Euler product as in Corollary 3 is equivalent to the fact that the Fourier coefficients of f fulfill the relations in Corollary 2.

Remark 9.17: Hecke proved that $L_f$ can be analytically continued to a meromorphic function on the whole complex plane, resp. to a holomorphic function if f is a cusp form, and that the completed L-function

$$L_f^*(s) := (2\pi)^{-s}\,\Gamma(s)\,L_f(s)$$

satisfies the functional equation

$$(9.17)\qquad L_f^*(s) = (-1)^{k/2}\, L_f^*(k - s).$$

The proof is based on the classical Mellin transform: Let f be a normalized Hecke cusp form of weight k, i.e. we have

$$f(\tau) = \sum_{n=1}^\infty c(n)q^n \quad \text{with} \quad f(-1/\tau) = \tau^k f(\tau).$$


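Using the Γ-function

$$\Gamma(s) := \int_0^\infty e^{-t}t^{s-1}\,dt$$

we determine the Mellin transform of f, namely

$$\int_0^\infty f(iy)\,y^{s-1}\,dy = \int_0^\infty \Big(\sum c(n)\,e(niy)\Big) y^s\, \frac{dy}{y} = \sum c(n) \int_0^\infty e^{-2\pi ny}\,y^{s-1}\,dy = \sum c(n)(2\pi n)^{-s}\,\Gamma(s) = (2\pi)^{-s}\,\Gamma(s)\,L_f(s) = L_f^*(s).$$

In the integral on the left hand side, we put u = 1/y and use the transformation formula for f to get

$$\int_0^\infty f(iy)\,y^{s-1}\,dy = -\int_\infty^0 f(i/u)\,u^{-s}\, \frac{du}{u} = \int_0^\infty f(iu)\, i^k u^{k-s}\, \frac{du}{u} = i^k\, L_f^*(k - s).$$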

Remark 9.18: The whole theory has been extended to other groups than Γ = SL(2,Z), in particular to $\Gamma_0(N)$ (see for instance [Ge]). There is also a converse to the statement in Remark 9.17: Every Dirichlet series L, which fulfills a functional equation of the type just treated and has certain regularity and growth properties, comes from a modular form f of weight k. This was first remarked by Hecke and then refined considerably by Weil in [We1] and is now cited under the heading of the famous Converse Theorem.

Remark 9.19: The Hecke operators are hermitian with respect to the Petersson scalar product for forms $f_1, f_2$ of weight k given by

$$\langle f_1, f_2 \rangle := \int_F f_1(\tau)\,\overline{f_2(\tau)}\, y^{k-2}\,dx\,dy, \qquad F \text{ as in Remark 9.2},$$

i.e. one has

$$\langle T(n)f_1, f_2 \rangle = \langle f_1, T(n)f_2 \rangle.$$

Arithmetic Modular Forms

Though it does not directly belong to our topic, we can not resist the temptation to complete this section by indicating a concept of arithmetically distinguished modular forms, which has been exploited by Shimura and many others (see [Sh]): We look at modular forms (for the modular group Γ, but this is open to wide generalizations) with integral Fourier coefficients

$$A_k(\Gamma, \mathbb{Z}) := \Big\{f = \sum c(n)q^n \in A_k(\Gamma);\ c(n) \in \mathbb{Z}\Big\}.$$

One can show that one can find a Z-basis of $A_k(\Gamma, \mathbb{Z})$, which is also a C-basis of $A_k(\Gamma)$. Prototypes of these forms are, up to factors, forms we already met with, namely

$$F(\tau) := q\prod_{n=1}^\infty (1 - q^n)^{24},$$

$$E_4(\tau) := 1 + 240\sum_{n=1}^\infty \sigma_3(n)q^n,$$

$$E_6(\tau) := 1 - 504\sum_{n=1}^\infty \sigma_5(n)q^n.$$

For k ∈ 4N, $A_k(\Gamma, \mathbb{Z})$ has as Z-basis (see [Se1] p.105)

$$E_4^\alpha F^\beta, \qquad \alpha, \beta \in \mathbb{N}_0 \text{ with } \alpha + 3\beta = k/4,$$

and for k ∈ 2(2N + 1)

$$E_6 E_4^\alpha F^\beta, \qquad \alpha, \beta \in \mathbb{N}_0 \text{ with } \alpha + 3\beta = ((k/2) - 3)/2.$$
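The two index conditions can be enumerated mechanically. The following is a minimal sketch listing the exponent pairs (α, β) for a given even weight and comparing their number with the classical dimension formula for modular forms for SL(2,Z) (dimension ⌊k/12⌋ if k ≡ 2 mod 12, and ⌊k/12⌋ + 1 otherwise, for even k ≥ 0). This formula is a standard fact brought in here as an assumption, not something stated in the passage above, and the function names are hypothetical.

```python
def basis_exponents(k):
    """Pairs (alpha, beta) from the two cases above; k = 2 mod 4 carries an extra factor E6."""
    if k < 0 or k % 2:
        return []
    target = k // 4 if k % 4 == 0 else (k // 2 - 3) // 2
    if target < 0:
        return []
    return [(target - 3 * beta, beta) for beta in range(target // 3 + 1)]

def dim_modular_forms(k):
    """Classical dimension of the space of modular forms of weight k for SL(2, Z)."""
    if k < 0 or k % 2:
        return 0
    return k // 12 if k % 12 == 2 else k // 12 + 1

for k in (4, 6, 12, 14, 24, 26):
    print(k, basis_exponents(k), dim_modular_forms(k))
```

For k = 12 this gives the two pairs (3, 0) and (0, 1), i.e. the basis E₄³, F, matching the dimension 2.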

Our analysis of the action of the Hecke operators above shows that $A_k(\Gamma, \mathbb{Z})$ is stable under all T(n). Hence the coefficients of the characteristic polynomial of T(n) acting on $A_k(\Gamma, \mathbb{Z})$ are integers and one can deduce that the eigenvalues are algebraic integers. There are explicit formulas for the traces of the T(n), the Eichler-Selberg trace formulas, starting with [Ei1] and [Sel].

Summary

In this section we looked at L-functions $L(s) = L_f(s)$ coming from modular cusp forms f. Via a functional equation these functions have analytic continuation to the whole complex plane. This functional equation is a consequence of the covariant transformation property of the modular form. The modular form is related to an (infinite-dimensional) automorphic representation of SL(2,R) (or GL(2,R)). We call such L-functions automorphic L-functions. If f is a Hecke eigenform, $L_f$ has an Euler product with factors of degree 2, i.e. with

\[ (1 - c(p)p^{-s} + p^{k-1-2s})^{-1} = (1 - c(p)t + p^{k-1}t^2)^{-1}, \qquad t := p^{-s}. \]

There is a general procedure to associate such functions to automorphic representations of other groups leading to Euler products with other degrees. In particular, one has a functional equation and factors of degree 1 in the theory of Hecke’s L-series with grossencharacters (from the German word “Grossencharakter”). This can be understood fairly easily ([Ge] p.99f) as a theory of automorphic L-functions for the group GL(1) if one uses the more functional analytic idelic approach initiated by Tate’s thesis (reproduced in [CF] or described in [La2] and, provided with the necessary analytic background, in [RV]). Hecke’s classic approach is carefully displayed in [Ne] p.515ff. Here we shall indicate the definition of this function later when we have more number theory at hand. For an introduction to more general automorphic L-functions we refer to [Ge] and [GeS].


9.5 Elements of Algebraic Number Theory and Hecke L-Functions

Any book on number theory is good as a background to the following survey. In particular, we recommend the books by Lang [La2] and by Neukirch [Ne] (now also in English).

Number Fields and their Galois Groups

Definition 9.5: An algebraic number field is a finite extension K of the field Q of rational numbers, i.e. a field containing Q, which is a finite-dimensional vector space over Q.

Such a field is obtained by adjoining to Q roots of polynomials with coefficients in Q. For example, the field

\[ \mathbb{Q}(i) := \{ a + bi;\ a, b \in \mathbb{Q} \} \]

is obtained by adjoining the roots of the polynomial $x^2 + 1$, denoted as usual by i and −i. We obtain a field, which has dimension 2 as a vector space over Q, the field of Gauss numbers. This is the most elementary example of an imaginary quadratic field, which has the general form $K = \mathbb{Q}(\sqrt{-d})$, d ∈ N, d squarefree.

Similarly, for N ∈ N, adjoining to Q a primitive N-th root of unity $\zeta_N$ we obtain the N-th cyclotomic field $\mathbb{Q}_N := \mathbb{Q}(\zeta_N)$. Its dimension as a vector space over Q is ϕ(N), the Euler function of N, i.e.

\[ \varphi(N) := \#\{ m \in \mathbb{N};\ 1 \le m < N,\ (m,N) = 1 \}. \]

We can embed $\mathbb{Q}(\zeta_N)$ into C in such a way that $\zeta_N \mapsto e(1/N)$. But this is not the only possible embedding of $\mathbb{Q}_N$ into C, since we could also send $\zeta_N \mapsto e(m/N)$ where (m,N) = 1.

Suppose now that K is an algebraic number field and F a finite extension of K, i.e. another field containing K, which has a finite dimension as a vector space over K. This dimension is called the degree of F over K and denoted by $\deg_K F$, in particular deg F for K = Q.

Definition 9.6: The group of all field automorphisms σ of F, preserving the field structure and such that σ(x) = x for all x ∈ K, is called the Galois group of F over K and denoted by Gal(F/K), in particular Gal(F) for K = Q.

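Example 9.5: The Galois group $\mathrm{Gal}(\mathbb{Q}_N)$ is naturally identified with the group

\[ (\mathbb{Z}/N\mathbb{Z})^{\times} := \{ \bar{n} := n + N\mathbb{Z} \in \mathbb{Z}/N\mathbb{Z};\ (n,N) = 1 \} \]

with respect to multiplication. The element $\bar{n} \in (\mathbb{Z}/N\mathbb{Z})^{\times}$ gives rise to the automorphism of $\mathbb{Q}_N$ sending $\zeta_N$ to $\zeta_N^{n}$, and hence $\zeta_N^{m}$ to $\zeta_N^{mn}$ for all m. If M divides N, then $\mathbb{Q}_M$ is contained in $\mathbb{Q}_N$, and the corresponding homomorphism of the Galois groups $\mathrm{Gal}(\mathbb{Q}_N) \to \mathrm{Gal}(\mathbb{Q}_M)$ coincides under the above identification with the natural surjective homomorphism

\[ (\mathbb{Z}/N\mathbb{Z})^{\times} \to (\mathbb{Z}/M\mathbb{Z})^{\times}; \qquad \bar{n} \mapsto n \bmod M. \]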
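The identification of Example 9.5 lends itself to a small experiment. The sketch below lists the units mod N, computes ϕ(N) as their number, and checks that reduction mod M is surjective onto the units mod M when M divides N. The function names are hypothetical and chosen only for this illustration.

```python
from math import gcd

def units(N):
    """Representatives 1 <= n <= N of the classes in (Z/NZ)^x."""
    return [n for n in range(1, N + 1) if gcd(n, N) == 1]

def phi(N):
    """Euler function: the order of (Z/NZ)^x, i.e. the degree of the N-th cyclotomic field."""
    return len(units(N))

def reduction_is_surjective(N, M):
    """Check that n mod N -> n mod M maps (Z/NZ)^x onto (Z/MZ)^x when M divides N."""
    if N % M != 0:
        raise ValueError("M must divide N")
    image = {n % M for n in units(N)}
    return image == {u % M for u in units(M)}

def is_closed_under_multiplication(N):
    """The units form a group: products of units are again units mod N."""
    us = set(u % N for u in units(N))
    return all((a * b) % N in us for a in us for b in us)
```

For N = 12 one finds the four classes 1, 5, 7, 11, in accordance with the degree ϕ(12) = 4 of the cyclotomic field $\mathbb{Q}_{12}$ over Q.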

A central notion from number theory is the notion of a Galois extension:


Definition 9.7: An algebraic field extension F/K is called a Galois extension iff the order of the Galois group G = Gal(F/K) is equal to the relative degree $\deg_K F$.

It is a main part of the theory to give different versions of this definition. Then, Galois theory consists in establishing for a Galois extension a (contravariant) bijection between subgroups of the Galois group and subfields of F containing K.

Next, one has class field theory, which tries to collect information about a Galois field extension F/K, in particular the structure of the Galois group, only from the structure of the base field K. We shall need some of this and have to recall some more notions and facts from algebra:

The Ideal Class Group

Let R be an integral domain, i.e. a commutative ring with unit e and without zero-divisors. A subring $\mathfrak{a}$ in R is an ideal in R iff one has $ra \in \mathfrak{a}$ for all r ∈ R and $a \in \mathfrak{a}$. An ideal $\mathfrak{a}$ is a principal ideal iff it can be generated by one element, i.e. is of the form $\mathfrak{a} = aR$ for an element a ∈ R. An ideal $\mathfrak{p}$ in R is a prime ideal iff from $ab \in \mathfrak{p}$ one can conclude $a \in \mathfrak{p}$ or $b \in \mathfrak{p}$, or, equivalently, iff the factor ring $R/\mathfrak{p}$ has no zero-divisors.

For instance, every normalized irreducible polynomial f of degree n in the polynomial ring R = Q[x] generates a prime ideal $\mathfrak{p} = fR$. This ideal is maximal and the factor ring $R/\mathfrak{p}$ is an algebraic number field of degree n. Every algebraic number field of degree n can be described like this.

If for two ideals we have $\mathfrak{a} \supset \mathfrak{b}$, we say $\mathfrak{a}$ divides $\mathfrak{b}$. And $(\mathfrak{a}, \mathfrak{b})$ denotes the greatest common divisor of $\mathfrak{a}$ and $\mathfrak{b}$ (corresponding to the usual notions for integers if R = Z and $\mathfrak{a} = a\mathbb{Z}$, $\mathfrak{b} = b\mathbb{Z}$).

From two ideals $\mathfrak{a}$ and $\mathfrak{b}$ one can build new ideals, the sum $\mathfrak{a} + \mathfrak{b}$ and the product $\mathfrak{a} \cdot \mathfrak{b}$ (the last one consisting of finite sums of products ab, $a \in \mathfrak{a}$, $b \in \mathfrak{b}$).

In number theory, the main occasion to use these notions is the maximal order of an algebraic number field K, i.e. the ring $\mathfrak{o}$ of integral elements of K: an element a ∈ K is called integral iff it is the root of a normalized polynomial f ∈ Z[x] (one has to show that the set of these elements is a ring). The determination of the maximal orders is a fundamental task of number theory. For instance, for the field K = Q(i) of Gauss numbers one has the Gauss integers $\mathfrak{o} = \mathbb{Z}[i] = \{ a + bi;\ a, b \in \mathbb{Z} \}$ (this is not difficult, try to prove it as Exercise 9.7). For the cyclotomic field $K = \mathbb{Q}(\zeta_N)$ one has also the nice result $\mathfrak{o} = \mathbb{Z}[\zeta_N]$.

Main tools in this context are the maps norm and trace: For x ∈ F one has

λx : F −→ F, a −→ xa for all a ∈ F,

the (additive) trace map

TrF/K : F −→ K, x −→ Trλx

and the (multiplicative) norm map

NF/K : F −→ K, x −→ detλx.
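For the Gauss field Q(i) over Q these maps can be written down by hand. The sketch below, whose function names are chosen for this illustration only, represents multiplication by x = a + bi in the Q-basis (1, i) as a 2×2 matrix and reads off trace and norm as its trace and determinant.

```python
from fractions import Fraction

def multiplication_matrix(a, b):
    """Matrix of lambda_x for x = a + b*i in the basis (1, i) of Q(i) over Q.

    The columns are the images x*1 = a + b*i and x*i = -b + a*i.
    """
    return [[a, -b], [b, a]]

def matrix_trace(m):
    return m[0][0] + m[1][1]

def matrix_det(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def field_trace(a, b):
    """Tr_{Q(i)/Q}(a + b*i) = 2a."""
    return matrix_trace(multiplication_matrix(a, b))

def field_norm(a, b):
    """N_{Q(i)/Q}(a + b*i) = a^2 + b^2."""
    return matrix_det(multiplication_matrix(a, b))

half = Fraction(1, 2)
print(field_trace(3, 4), field_norm(3, 4), field_norm(half, half))
```

Since the determinant is multiplicative, so is the norm: (1 + 2i)(3 + 4i) = −5 + 10i has norm 125 = 5 · 25.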


An ideal in $\mathfrak{o}$ is also called an ideal of K and this notion is extended to the notion of a fractional ideal of K: This is an $\mathfrak{o}$-submodule $\mathfrak{a} \neq 0$ of K, which has a common denominator, i.e. there is an element $0 \neq d \in \mathfrak{o}$ with $d\mathfrak{a} \subseteq \mathfrak{o}$.

All this is done with the intention to associate another finite group to an algebraic number field: It is not difficult to see that the set $J_K$ of fractional ideals provided with multiplication as composition has the structure of an abelian group. It is called the ideal group of K. The (almost unique) decomposition of an integer in Z into a product of powers of primes is extended to the statement (see for instance [Ne] p.23):

Every fractional ideal $\mathfrak{a} \in J_K$ has a unique representation as a product of powers of prime ideals

\[ \mathfrak{a} = \prod_{\mathfrak{p}} \mathfrak{p}^{\nu_{\mathfrak{p}}}, \qquad \nu_{\mathfrak{p}} \in \mathbb{Z}, \text{ almost all } \nu_{\mathfrak{p}} = 0. \]

The principal fractional ideals $P_K := \{ (a) = a\mathfrak{o};\ a \in K^{*} \}$ form a subgroup of $J_K$ and, finally, one introduces the factor group

\[ Cl_K := J_K/P_K. \]

This comes out as a finite abelian group and is called the ideal class group or simply class group of K.

Class field theory tries to relate the Galois group of a field extension to the ideal class group of the ground field. An extension F/K with abelian Galois group G is called a Hilbert class field iff one has

\[ G(F/K) = Cl_K. \]

Example 9.6: The cyclotomic field $\mathbb{Q}_N$ is a Hilbert class field over Q. We have ClQN ≅ (Z/NZ)×.

Dirichlet Characters and Hecke’s L-Functions

For $0 \neq m \in \mathbb{Z}$, the classical Dirichlet character mod m is a character of the group $(\mathbb{Z}/m\mathbb{Z})^{\times}$

\[ \chi : (\mathbb{Z}/m\mathbb{Z})^{\times} \to S^1. \]

The character χ is called primitive if there is no genuine divisor m′ of m such that χ factorizes via a character of $(\mathbb{Z}/m'\mathbb{Z})^{\times}$. Given such a χ, we get a (multiplicative) function for all natural numbers n, which we denote again by χ, by mapping n to $\chi(\bar{n})$ if (n,m) = 1, where $\bar{n}$ is the class of n mod m, and to zero if (n,m) ≠ 1. Hence, to such a primitive character χ, one can associate an L-series

\[ L(\chi, s) := \sum_{n=1}^{\infty} \frac{\chi(n)}{n^{s}}. \]

Obviously, for the trivial character $\chi_0$ mod 1 with $\chi_0(n) = 1$ for all n, we obtain the Riemann Zeta series. Hecke proved that these series have an Euler product (with degree one factors)

\[ L(\chi, s) = \prod_{p \in P} \frac{1}{1 - \chi(p)p^{-s}}, \]

analytic continuation to C (resp. $\mathbb{C} \setminus \{1\}$ for $\chi_0$), and a functional equation for $s \mapsto 1 - s$. (The proof relies again on the transformation formula of a (generalized) theta series.)
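The equality of Dirichlet series and Euler product can be watched numerically in the region of absolute convergence. The sketch below uses the nontrivial character mod 4 as an illustrative choice (it is not singled out at this point of the text); the truncation bounds and all names are assumptions of the example.

```python
def chi4(n):
    """Nontrivial Dirichlet character mod 4, extended by 0 to even n."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1

def primes_up_to(limit):
    """Sieve of Eratosthenes."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            for multiple in range(p * p, limit + 1, p):
                sieve[multiple] = False
    return [p for p, is_prime in enumerate(sieve) if is_prime]

def dirichlet_series(chi, s, terms):
    """Partial sum of sum_n chi(n) / n^s."""
    return sum(chi(n) / n ** s for n in range(1, terms + 1))

def euler_product(chi, s, prime_bound):
    """Partial product of prod_p 1 / (1 - chi(p) p^(-s)) over primes p <= prime_bound."""
    product = 1.0
    for p in primes_up_to(prime_bound):
        product /= 1 - chi(p) * p ** (-s)
    return product

series_value = dirichlet_series(chi4, 2, 100_000)
product_value = euler_product(chi4, 2, 10_000)
print(series_value, product_value)
```

Both truncations approximate the same number; at s = 2 it is Catalan's constant 0.9159..., a classical value quoted here only as a plausibility check.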


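These are examples of Hecke’s L-functions, which can be defined more generally: Let K be an algebraic number field with maximal order $\mathfrak{o}$, $\mathfrak{m} \subset \mathfrak{o}$ an ideal, and $J^{\mathfrak{m}}$ the group of fractional ideals $\mathfrak{a}$ in K, which are prime to $\mathfrak{m}$

\[ J^{\mathfrak{m}} := \{ \mathfrak{a} \subset K;\ (\mathfrak{a}, \mathfrak{m}) = 1 \}. \]

Let $\chi : J^{\mathfrak{m}} \to S^1$ be a character. Then we define an L-series

\[ L(\chi, s) := \sum_{\mathfrak{a} \subset \mathfrak{o}} \frac{\chi(\mathfrak{a})}{N(\mathfrak{a})^{s}}. \]

Here the sum is taken over all ideals $\mathfrak{a}$ in $\mathfrak{o}$, $N(\mathfrak{a})$ denotes the number of elements of the residue class ring $\mathfrak{o}/\mathfrak{a}$, and again we put $\chi(\mathfrak{a}) = 0$ for $(\mathfrak{a}, \mathfrak{m}) \neq 1$. The assumption that χ is a character translates easily to the existence of an Euler product taken over all prime ideals $\mathfrak{p}$ in $\mathfrak{o}$

\[ L(\chi, s) = \prod_{\mathfrak{p}} \frac{1}{1 - \chi(\mathfrak{p})N(\mathfrak{p})^{-s}}. \]

If we take $\chi = \chi_0$ with $\chi_0(\mathfrak{a}) = 1$ for all $\mathfrak{a}$, we get the usual Dedekind Zeta function of the number field K

\[ L(\chi_0, s) = \sum_{\mathfrak{a} \subset \mathfrak{o}} N(\mathfrak{a})^{-s} =: \zeta_K(s). \]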

The question asking for the biggest class of characters, for which one can prove a functional equation (and hence has analytic continuation) leads to Hecke’s definition of the notion “Grossencharakter”:

Definition 9.8: A character $\chi : J^{\mathfrak{m}} \to S^1$ is called a grossencharacter mod $\mathfrak{m}$ iff there is a pair of characters

\[ \chi_f : (\mathfrak{o}/\mathfrak{m})^{*} \to S^1, \qquad \chi_{\infty} : R^{*} \to S^1 \]

such that

\[ \chi((a)) = \chi_f(a)\chi_{\infty}(a) \quad \text{for all } a \in \mathfrak{o} \text{ with } ((a), \mathfrak{m}) = 1. \]

We have remarked (in 5.1) that the set of characters of the additive group R is simply in bijection to R. A sign that multiplicative structure is more intricate than additive structure is again given by the fact that the set of these characters $\chi_{\infty}$ of $R^{*}$ is more complicated (for a complete statement see [Ne] p.497).

As already said, a more elegant way to understand these constructions is based on the theory of adeles and ideles introduced into number theory by Tate and then pursued by many others. Here one has the central notion of a Hecke character as a character of the idele class group. The relation to the notions presented here is also to be found in [Ne] p.501. There are further refinements using ray class groups and their characters but we will stop here and change the topic to get again more concrete examples.


9.6 Arithmetic L-Functions

In classical number theory and in arithmetic geometry there are many occasions to introduce Zeta- and L-functions as Dirichlet series encoding diophantine or arithmetic information. By their nature as Dirichlet series, these series converge in right half planes. Then, as the main task, one has to find a way to extend the function given by such a series to a function on the whole plane, possibly with some poles, and study the values of this function at special, in particular, integral points. This may turn out to be the expression of a mysterious regularity property inherent in the arithmetic or diophantine problem we started with. A way to prove this comes up if this arithmetic L-function can be seen as an automorphic L-function. For instance the proof of the Fermat conjecture by Wiles (and Taylor) incorporates a proof that the L-function of an elliptic curve can be interpreted as an L-function of a modular form. Here we are far from making this sufficiently precise but to give at least some flavour and define the L-functions for number fields Emil Artin discussed around 1923, we borrow heavily from an expository article on the role of representation theory in number theory by Langlands [L].

Two Examples:

We look at diophantine equations, i.e. polynomial equations with integral coefficients to which integral solutions are sought. A famous example is the Fermat equation

\[ x^m + y^m = z^m. \]

For m = 2 there are infinitely many solutions, the Pythagorean triples, e.g. (3, 4, 5), and for m > 2, as Wiles (and Taylor) proved, there are no non-trivial integral solutions. But if there are no integral solutions to a diophantine problem, one can look for solutions mod p, p prime, try to count their number and encode these numbers into a function.

Example 9.7: Probably the simplest example is the equation

\[ x^2 + 1 = 0. \tag{9.18} \]

The primes p for which the congruence $x^2 + 1 \equiv 0 \bmod p$ can be solved are 2 and the primes 5, 13, . . . , all of which leave the remainder 1 upon division by 4, whereas primes like 7, 19, 23, . . . that leave the remainder 3 upon division by 4 never do so. In 9.3 we already presented the Legendre symbol

\[ \left(\frac{a}{p}\right) = \pm 1 \quad \text{for all } a \in \mathbb{Z} \text{ with } (p, a) = 1, \]

which indicates whether the congruence $x^2 \equiv a \bmod p$ is solvable or not. Having at hand the notion of a Dirichlet character, we see that here we have an example with a multiplicative function χ on Z given by

\[ \chi(p) = \left(\frac{-1}{p}\right) = \begin{cases} 1 & \text{for } p \equiv 1 \bmod 4, \\ -1 & \text{for } p \equiv 3 \bmod 4, \\ 0 & \text{for } (p, 4) \neq 1, \end{cases} \]

i.e. with χ(1) = 1, χ(2) = 0, χ(3) = −1, χ(4) = 0, . . . .


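We have

\[ L(\chi, s) = \prod_{p} \Big(1 - \frac{\chi(p)}{p^{s}}\Big)^{-1} = \sum_{n=1}^{\infty} \frac{\chi(n)}{n^{s}} = 1 - \frac{1}{3^{s}} + \frac{1}{5^{s}} - \frac{1}{7^{s}} + \frac{1}{9^{s}} + \dots . \]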

Like every Dirichlet series, this series defines a function in a right half plane. It is an expression of the mysterious symmetry of our problem that one can bring into play a theta function, which, by its transformation property, produces a function valid for all s ∈ C.

Now, having prepared some material from algebraic number theory, we can look at the function L(χ, s) from another point of view: The polynomial $f(x) = x^2 + 1$ defines the Gauss field

\[ K = \mathbb{Q}(i) = \{ a + bi;\ a, b \in \mathbb{Q} \} \]

and this field has the Galois group G = G(K) of automorphisms elementwise fixing Q consisting just of the identity map and the map ϕ, uniquely fixed by changing the root $\theta_1 := \sqrt{-1} = i$ into $\theta_2 := -i$. Hence, one has $G \cong \{\pm 1\}$ and this group has exactly two one-dimensional representations, the trivial one and a representation π with π(ϕ) = −1. One can use this to define another type of L-function, the Artin L-function. To explain this (following Langlands’ article), we go back some steps:

The solvability of the congruence $f(x) = x^2 + 1 \equiv 0 \bmod p$ is equivalent to the possibility to factorize f mod p into a product of two linear factors, e.g.

\[ x^2 + 1 \equiv x^2 + 2x + 1 = (x + 1)^2 \bmod 2, \qquad x^2 + 1 = x^2 - 4 + 5 \equiv (x - 2)(x + 2) \bmod 5. \]

For the primes p = 7 and 11 and, in general, for all primes for which the congruence has no solutions, the polynomial $x^2 + 1$ stays irreducible mod p.

Example 9.8: Another famous example is

\[ f(x) = x^5 + 10x^3 - 10x^2 + 35x - 18. \tag{9.19} \]

The polynomial is irreducible mod p for p = 7, 13, 19, 29, 43, 47, 59, . . . and factorizes into linear factors mod p for p = 2063, 2213, 2953, 3631, . . . .
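Counting roots mod p is easy to try out for both examples. The sketch below is a brute-force illustration with hypothetical names; it only counts roots, so for the quintic it can confirm a splitting into linear factors (five distinct roots) or rule out linear factors, but it does not by itself prove irreducibility.

```python
def count_roots(coefficients, p):
    """Number of x in Z/pZ with f(x) = 0, f given by its coefficients, highest degree first."""
    count = 0
    for x in range(p):
        value = 0
        for c in coefficients:  # Horner evaluation mod p
            value = (value * x + c) % p
        if value == 0:
            count += 1
    return count

def chi(p):
    """The character of Example 9.7: 0 for even p, +1 / -1 for p = 1 / 3 mod 4."""
    if p % 2 == 0:
        return 0
    return 1 if p % 4 == 1 else -1

X2_PLUS_1 = [1, 0, 1]                 # x^2 + 1, equation (9.18)
QUINTIC = [1, 0, 10, -10, 35, -18]    # equation (9.19)

for p in (2, 3, 5, 7, 11, 13):
    print(p, count_roots(X2_PLUS_1, p), 1 + chi(p))
print(count_roots(QUINTIC, 7))
```

For $x^2 + 1$ the number of roots mod p equals 1 + χ(p) for every prime p, which is how the character χ encodes the solution counts. Calling `count_roots(QUINTIC, p)` for the primes listed in Example 9.8 lets the reader compare with the statements made there.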

In general, we can look at a polynomial with integral coefficients

\[ f(x) = x^n + a_{n-1}x^{n-1} + \dots + a_1x + a_0 \]

with roots $\alpha_1, \dots, \alpha_n$, which we assume to be pairwise different. There will be various relations between these roots with coefficients that are rational,

\[ F(\alpha_1, \dots, \alpha_n) = 0. \]

For example, the roots of $x^3 - 1 = 0$ are $\alpha_1 = 1$, $\alpha_2 = (-1 + \sqrt{-3})/2$, $\alpha_3 = (-1 - \sqrt{-3})/2$ and two of the valid relations for these roots are

\[ \alpha_1 = 1, \qquad \alpha_2\alpha_3 = \alpha_1. \]

To the equation f(x) = 0 we can associate the group of all permutations of its roots that preserve all valid relations. In the last example the sole possibility in addition to the trivial permutation is the permutation that fixes $\alpha_1$ and interchanges $\alpha_2$ and $\alpha_3$. The group G that we get is called the Galois group of the equation f(x) = 0. It is identical with the Galois group of the algebraic number field K defined by the polynomial f.

We introduce the discriminant ∆ of the equation (resp. the number field) as defined by the equation

\[ \Delta := \prod_{i \neq j} (\alpha_i - \alpha_j). \]

It is an integer and one of the most important characteristic numbers to be associated to a number field. Now, one can attach to any prime p that does not divide ∆ an element $F_p \in G$, called the Frobenius automorphism, that determines among other things how the equation factors mod p. More precisely, it is the conjugacy class of $F_p$ within G that is determined (we will indicate below how this is done). To construct an L-series we need a prescription associating numerical information to the primes p. We can come to this by choosing a (finite-dimensional) representation ρ of G and taking the trace or the determinant of the matrix $\rho(F_p)$. Things are particularly easy if we have a one-dimensional representation ρ as for instance for the group G in our first example (9.18) above: We can define an Euler product

\[ L(\rho, s) := \prod_{p} (1 - \rho(F_p)p^{-s})^{-1} \]

where the primes dividing ∆ are omitted. By closer analysis, one can realize that in the example we get just the Hecke L-function L(χ, s) if we take for ρ the non-trivial character χ of $G \cong \{\pm 1\}$. This is another expression of (a special case of) the main theorem of abelian class field theory.
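The discriminant introduced above, taken as the product over all ordered pairs i ≠ j of root differences, can be evaluated numerically for the two small examples. This is only a floating point sketch with names chosen for the illustration; the product over ordered pairs differs from the product of squared differences over i < j by the sign $(-1)^{n(n-1)/2}$.

```python
import cmath

def root_difference_product(roots):
    """Product of (a_i - a_j) over all ordered pairs i != j."""
    product = 1
    for i, a in enumerate(roots):
        for j, b in enumerate(roots):
            if i != j:
                product *= a - b
    return product

roots_x2_plus_1 = [1j, -1j]                                             # roots of x^2 + 1
roots_x3_minus_1 = [cmath.exp(2j * cmath.pi * k / 3) for k in range(3)]  # roots of x^3 - 1

print(root_difference_product(roots_x2_plus_1))   # close to 4
print(root_difference_product(roots_x3_minus_1))  # close to 27
```

For $x^2 + 1$ the value 4 shows that only the prime 2 is omitted from the Euler product, in accordance with χ(2) = 0 in Example 9.7.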

In Chapter 2 we studied very briefly representations ρ of finite groups. Now, here, we find an occasion to expand the subject slightly. Beyond one-dimensional representations, the next possibility is that ρ is a two-dimensional representation, which we may suppose to be unitary. We know that SU(2) is a double cover of SO(3). It is customary to classify finite subgroups of the unitary group by their image in the group of rotations. Taking for an irreducible ρ the finite subgroup to be ρ(G), we obtain dihedral, tetrahedral, octahedral, and icosahedral representations if ρ(G) has as its image in SO(3) the corresponding group. Dihedral representations can be treated by the classical theory, and also the other representations were treated at the time of Langlands’ article with the exception of a complete treatment of the icosahedral case. An example of an equation with an icosahedral representation ρ is given by our equation (9.19). For all primes, which do not divide 800, the conductor (related to the discriminant in a way we do not explain here), one can form the matrix $\rho(F_p)$, which has two eigenvalues, $\lambda_p$ and $\mu_p$. Artin’s L-function is then

\[ L(\rho, s) := \prod_{p} \frac{1}{1 - \lambda_p p^{-s}} \cdot \frac{1}{1 - \mu_p p^{-s}}, \]

the primes dividing the conductor being omitted from the product. This product converges for Re s > 1 and it is a special case of the famous Artin Conjecture from 1923 to show that it can be continued to an analytic function for all s ∈ C. For the equation (9.19) this was done in 1970 by J. Buhler. The general case is still open.


Decomposition Theory and Artin’s L-Function

To start with, we look again at the example $K = \mathbb{Q}_N = \mathbb{Q}(\zeta_N)$: Here we have the abelian Galois group $G = \mathrm{Gal}(\mathbb{Q}_N) \cong (\mathbb{Z}/N\mathbb{Z})^{\times}$. This isomorphism relates every irreducible representation ρ of G uniquely to a character χ of $(\mathbb{Z}/N\mathbb{Z})^{\times}$. To each prime number p, which does not divide N, we associate the class $\bar{p} := p \bmod N$ in $(\mathbb{Z}/N\mathbb{Z})^{\times}$ and then the Frobenius automorphism $F_p \in G$ given by $\zeta_N \mapsto \zeta_N^{p}$. Now we can define the Artin L-function

\[ L(\rho, s) := \prod_{p \nmid N} \frac{1}{1 - \rho(F_p)p^{-s}} \]

and it is clear that it coincides with the Hecke L-function

\[ L(\chi, s) := \prod_{p \nmid N} \frac{1}{1 - \chi(p)p^{-s}}. \]

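For the general case, we look at an algebraic field extension F/K of degree $n = \deg_K F$ with Galois group G = Gal(F/K) and rings of integral elements $\mathfrak{O} \supset \mathfrak{o}$. For a prime ideal $\mathfrak{P}$ in $\mathfrak{O}$ and a prime ideal $\mathfrak{p}$ in $\mathfrak{o}$ we say that $\mathfrak{P}$ lies over $\mathfrak{p}$ if $\mathfrak{P}$ contains $\mathfrak{p}$ and hence $\mathfrak{P} \cap \mathfrak{o} = \mathfrak{p}$. We say that $\mathfrak{p}$ decomposes into the primes $\mathfrak{P}_i$ if $\mathfrak{p}$ generates in $\mathfrak{O}$ an ideal $(\mathfrak{p})$, which has the decomposition

\[ (\mathfrak{p}) = \prod_{i=1}^{h} \mathfrak{P}_i^{e_i} \]

into the primes $\mathfrak{P}_i$ lying over $\mathfrak{p}$ with the ramification indices $e_i \in \mathbb{N}$. We denote by $k(\mathfrak{P})$ the residue field $\mathfrak{O}/\mathfrak{P}$ and similarly $k(\mathfrak{p}) := \mathfrak{o}/\mathfrak{p}$. Both fields are finite fields. If $\mathfrak{P}_i$ lies over $\mathfrak{p}$, $k(\mathfrak{P}_i)$ is a finite extension of $k(\mathfrak{p})$ and we denote $f_i := \deg_{k(\mathfrak{p})} k(\mathfrak{P}_i)$. These degrees and the ramification indices are related by the central formula of Hilbert’s ramification theory

\[ \sum_{i=1}^{h} e_i f_i = n. \]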

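From now on, we assume that the extension F/K is a Galois extension. Then the decomposition simplifies to

\[ (\mathfrak{p}) = \Big( \prod_{i=1}^{h} \mathfrak{P}_i \Big)^{e} \]

and all degrees $f_i$ coincide (=: f). The Galois group G acts on $\mathfrak{O}$ and moreover even transitively on the primes $\mathfrak{P}_i$ lying over a given $\mathfrak{p} \subset \mathfrak{o}$. We introduce the decomposition group

\[ G_{\mathfrak{P}} := \{ \sigma \in G;\ \sigma\mathfrak{P} = \mathfrak{P} \}. \]

Since G permutes the h primes over $\mathfrak{p}$ transitively, the index of the decomposition group in G is equal to the number of these primes, $h = [G : G_{\mathfrak{P}}]$. One has the two extreme cases:

i) If $G_{\mathfrak{P}}$ contains only the identity, we say that $\mathfrak{p}$ is totally decomposed, and
ii) $G_{\mathfrak{P}} = G$ where $\mathfrak{p}$ stays indecomposed.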
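For the cyclotomic field $\mathbb{Q}_N$ over Q the numbers e, f, h can be computed by elementary means. The sketch below relies on a standard fact that is not proved in this text: for a prime p not dividing N one has e = 1, f is the multiplicative order of p mod N, and h = ϕ(N)/f. Function names are hypothetical.

```python
from math import gcd

def phi(N):
    """Euler function, the degree of the N-th cyclotomic field over Q."""
    return sum(1 for m in range(1, N + 1) if gcd(m, N) == 1)

def multiplicative_order(p, N):
    """Smallest k >= 1 with p^k = 1 mod N (requires N >= 2 and gcd(p, N) = 1)."""
    if N < 2 or gcd(p, N) != 1:
        raise ValueError("need N >= 2 and p prime to N")
    k, power = 1, p % N
    while power != 1:
        power = (power * p) % N
        k += 1
    return k

def splitting_type(p, N):
    """(e, f, h) for a prime p not dividing N in the N-th cyclotomic field (assumed fact)."""
    f = multiplicative_order(p, N)
    return 1, f, phi(N) // f

for p in (2, 3, 11):
    e, f, h = splitting_type(p, 5)
    print(p, (e, f, h), e * f * h == phi(5))
```

In $\mathbb{Q}_5$ the prime 11 is totally decomposed (h = 4, trivial decomposition group), whereas 2 stays indecomposed (h = 1, decomposition group equal to G).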


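Every $\sigma \in G_{\mathfrak{P}}$ induces an automorphism of the residue class field $k(\mathfrak{P}) := \mathfrak{O}/\mathfrak{P}$ by

\[ a \bmod \mathfrak{P} \mapsto \sigma a \bmod \mathfrak{P}. \]

Since this automorphism preserves the elements of the subfield $k(\mathfrak{p})$, we get a homomorphism

\[ \varphi_{\mathfrak{P}} : G_{\mathfrak{P}} \to \mathrm{Gal}(k(\mathfrak{P})/k(\mathfrak{p})), \]

which is surjective. We define the inertia group as its kernel, $I_{\mathfrak{P}} := \ker \varphi_{\mathfrak{P}}$. If $\mathfrak{p}$ is unramified, $I_{\mathfrak{P}}$ consists only of the identity, one has $G_{\mathfrak{P}} \cong \mathrm{Gal}(k(\mathfrak{P})/k(\mathfrak{p}))$ and, hence, this group may be seen as a subgroup of G. In this case we can find exactly one automorphism $F_{\mathfrak{P}} \in G$ such that

\[ F_{\mathfrak{P}}\,a \equiv a^{q} \bmod \mathfrak{P} \quad \text{for all } a \in \mathfrak{O} \]

with $q := N(\mathfrak{p})$. This is the Frobenius automorphism; it generates the cyclic group $G_{\mathfrak{P}}$. If $I_{\mathfrak{P}}$ is not trivial, we take as Frobenius $F_{\mathfrak{P}}$ an element of G such that its image in $\mathrm{Gal}(k(\mathfrak{P})/k(\mathfrak{p}))$ has again the action $a \bmod \mathfrak{P} \mapsto a^{q} \bmod \mathfrak{P}$.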
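The defining congruence of the Frobenius automorphism can be checked by hand in the Gauss integers Z[i] over Z. The sketch below, with names chosen for this illustration, computes $(a + bi)^p$ with coefficients reduced mod p. For a prime p ≡ 3 mod 4 (which stays prime in Z[i]) the result is a − bi, i.e. Frobenius is complex conjugation; for p ≡ 1 mod 4 the result is a + bi, i.e. Frobenius is the identity.

```python
def gauss_mul(x, y, p):
    """Product of Gauss integers x = (a, b) ~ a + b*i and y, coefficients reduced mod p."""
    (a, b), (c, d) = x, y
    return ((a * c - b * d) % p, (a * d + b * c) % p)

def gauss_pow(x, exponent, p):
    """x^exponent in Z[i] with coefficients reduced mod p."""
    result = (1 % p, 0)
    for _ in range(exponent):
        result = gauss_mul(result, x, p)
    return result

def conjugate(x, p):
    a, b = x
    return (a % p, (-b) % p)

x = (2, 3)  # the Gauss integer 2 + 3i
print(gauss_pow(x, 7, 7), conjugate(x, 7))  # p = 7 is inert: x^7 = conjugate of x mod 7
print(gauss_pow(x, 5, 5))                   # p = 5 splits: x^5 = x mod 5
```

This matches the description of $G \cong \{\pm 1\}$ for the Gauss field: the Frobenius element is the non-trivial automorphism exactly for the primes p ≡ 3 mod 4.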

Now we are ready to define Artin’s L-function for this general situation. We take a representation ρ of G in the finite-dimensional vector space V and denote by $V^{I_{\mathfrak{P}}}$ the subspace of elements fixed by $\rho(I_{\mathfrak{P}})$. The characteristic polynomial

\[ \det(E - \rho(F_{\mathfrak{P}})t;\ V^{I_{\mathfrak{P}}}) \]

for the action of $\rho(F_{\mathfrak{P}})$ on $V^{I_{\mathfrak{P}}}$ depends only on the prime $\mathfrak{p}$ and not on the prime $\mathfrak{P}$ lying over $\mathfrak{p}$.

Definition 9.9: Let F/K be a Galois extension of algebraic number fields and (ρ, V) a representation of its Galois group G. Then Artin’s L-series for ρ is defined by

\[ L(\rho, s) := L(F/K, \rho, s) := \prod_{\mathfrak{p}} \frac{1}{\det(1 - \rho(F_{\mathfrak{P}})N(\mathfrak{p})^{-s};\ V^{I_{\mathfrak{P}}})} \]

where the product is taken over all prime ideals $\mathfrak{p}$ in K (i.e. in $\mathfrak{o}$).

This series converges in the right half plane Re s > 1 to an analytic function. For the trivial one-dimensional representation $\rho_0$ with $\rho_0(\sigma) = 1$ for all σ ∈ G, every Euler factor is $(1 - N(\mathfrak{p})^{-s})^{-1}$, so Artin’s L-function is just the Dedekind Zeta-function of the base field K, one has

\[ L(F/K, \rho_0, s) = \zeta_K(s), \]

which for K = Q is the Riemann Zeta-function.

In [Ne] Chapter VII, we find a lot of results concerning the Artin L-function. To finish with this topic, we only mention Theorem (10.6) in [Ne], clarifying the relation between Artin’s and Hecke’s L-function in the abelian case.

Theorem 9.12: Let F/K be an abelian extension, $\mathfrak{f}$ the conductor of F/K, ρ a nontrivial character of Gal(F/K) and χ the associated grossencharacter mod $\mathfrak{f}$. Then we have the relation

\[ L(F/K, \rho, s) = \prod_{\mathfrak{p} \in S} \frac{1}{1 - \rho(F_{\mathfrak{P}})N(\mathfrak{p})^{-s}} \; L(\chi, s), \qquad S := \{ \mathfrak{p} \mid \mathfrak{f};\ \rho(I_{\mathfrak{P}}) = 1 \}. \]

Here we should still explain the notion of a conductor, but again we have to refer to [Ne].


L-Functions of Elliptic Curves

Again we follow Langlands’ article [L] and look at another diophantine problem: We ask for integral or at least rational solutions of the equation

\[ y^2 = x^3 + Dx, \tag{9.20} \]

with D ∈ Z, D ≠ 0, say D = −1. This is a special case of an affine equation defining an elliptic curve $E = E_D$. More generally, one looks at a smooth projective curve defined by the homogeneous equation

\[ y^2z = x^3 + axz^2 + bz^3, \qquad a, b \in \mathbb{Q},\ 4a^3 + 27b^2 \neq 0. \]

A curve like this is distinguished among all smooth curves because it can be given the structure of an abelian group. Elliptic curves now are very popular even outside of number theory because of their importance for cryptography. On the way to define an L-function via Euler factors for the primes p, it seems a good idea to count again the number $N_p$ of solutions of (9.20) mod p (or of the associated homogeneous equation, but we stay here with Langlands in his article). For instance, for D = −1 and p = 5, we get $N_5 = 7$. We define $\alpha_p$ and $\beta_p$ by the conditions

\[ N_p = p + \alpha_p + \beta_p; \qquad \beta_p = \overline{\alpha_p}; \qquad \alpha_p\beta_p = p. \]

(These conditions will define $\alpha_p$ and $\beta_p$ only if $|N_p - p| \le 2\sqrt{p}$, but it can be proven that this is true.) Then one should be tempted to define as L-function for the elliptic curve

\[ L(E_D, s) = \prod_{p} \frac{1}{1 - \alpha_p p^{-s}} \cdot \frac{1}{1 - \beta_p p^{-s}} \]

omitting perhaps some primes p related to D.
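The numbers entering these Euler factors can be computed by brute force for small primes. The sketch below counts the affine solutions of (9.20) mod p and solves the two conditions for $\alpha_p$, $\beta_p$ as the roots of $t^2 - (N_p - p)t + p$; the function names are chosen for this illustration.

```python
import cmath

def point_count(D, p):
    """Number N_p of pairs (x, y) mod p with y^2 = x^3 + D*x mod p."""
    return sum(1 for x in range(p) for y in range(p)
               if (y * y - x ** 3 - D * x) % p == 0)

def alpha_beta(D, p):
    """Solve alpha + beta = N_p - p, alpha * beta = p (complex conjugate pair when |N_p - p| < 2 sqrt(p))."""
    t = point_count(D, p) - p
    root = cmath.sqrt(t * t - 4 * p)
    return (t + root) / 2, (t - root) / 2

print(point_count(-1, 5))   # the value N_5 = 7 quoted in the text
print(alpha_beta(-1, 5))    # alpha_5 = 1 + 2i, beta_5 = 1 - 2i
```

For D = −1 and p = 5 one finds $\alpha_5 + \beta_5 = 2$ and $\alpha_5\beta_5 = 5$, so that the bound $|N_p - p| \le 2\sqrt{p}$ is visibly satisfied.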

For $D = (6577)^2$, one omits p = 2 and 3 and gets a function for any s and not only for Re s > 3/2 where the Euler product converges.

At this point it is to be emphasized that the value of an elliptic L-function at s = 1 has a particular significance: It is a (special case of a) conjecture of Birch and Swinnerton-Dyer that the equation (9.20) must have a solution if $L(E_D, 1) = 0$. This conjecture from 1965 is based on numerical calculations and in general is still open.

The L-functions of elliptic curves are special cases of the Hasse-Weil Zeta functions associated to projective varieties over number fields. Starting with work by Hasse, Weil, and Dwork, and up to work by Deligne, Langlands and many others, there are deep theorems and conjectures for these relating them again to automorphic L-functions when the varieties are Shimura varieties, i.e. come as homogeneous spaces from a linear group, as for instance (roughly said) the completion of the modular curve Γ\H = Γ\SL(2,R)/SO(2). There was a famous conjecture, connected mainly with the names of Taniyama, Shimura, and A. Weil, indicating (again roughly) that each elliptic L-function is an automorphic L-function belonging to some modular form, which has been proved by Wiles and Taylor in the framework of their proof of the Fermat conjecture already mentioned at the beginning of this section.


9.7 Summary and Final Reflections

There are still many more L- and Zeta functions than we have mentioned up to now (Dwork once remarked to the author that he had the impression that every mathematician feels obliged to define his own one). As in this text we restricted our treatment to the discussion of real and complex groups, we have to leave aside the true story, which happens using the groups $G(\mathbb{Q}_p)$ defined over $\mathbb{Q}_p$, the non-archimedian completions of Q, and the adelic groups G(A), which come up as the restricted direct product of the groups over all archimedian and non-archimedian completions of Q and/or as groups of matrices with adelic entries.

Hence we finish our story by giving a rudimentary survey of what Langlands says in [L] concerning the archimedian case, in particular as several notions, which we treated in our chapters 6 and 7, reappear in a language using intuition from elementary particle physics.

On the Way to (Archimedian) Langlands Parameters

We look at representations π of a subgroup G of GL(n,R), in particular G = GL(n,R). There are essentially two ways to analyse or construct representations:

i) the infinitesimal method using the Lie algebra of G, as exploited in Chapter 6,

and

ii) the method, which came up several times in Chapter 4, 7, and 8 and which starts with the action of G on a manifold M, passes to an associated action on functions living on the manifold, and finally decomposes this action into irreducibles.

If G is compact, as for instance G = SO(n), one takes M = G and decomposes the right-regular representation on $H = L^2(G)$. This way, one comes to representations π with square integrable matrix coefficients

\[ \langle \pi(g)v, w \rangle, \qquad v, w \in H, \]

and discrete parameters characterizing π.

If G is not compact, the situation changes considerably as we saw in our discussion of the example SL(2,R) in 7.2. Langlands proposes that the system being treated is best compared to a Schrödinger equation for which both asymptotically independent and bound states appear (for instance, in the case of the hydrogen atom, for the electron one has discrete orbits and a continuous spectrum):

Since every element g ∈ GL(n,R) can be written as a product $k_1ak_2$ where $k_1$ and $k_2$ are orthogonal matrices and $a := D(\alpha_1, \dots, \alpha_n)$ a diagonal matrix with positive eigenvalues $\alpha_j$, the simplest representations should have n freely assignable parameters, and they are analogous to a system of n interacting but asymptotically independent particles to which arbitrary momenta can be assigned. In addition, the presence of the factors $k_1$ and $k_2$ may entail the presence of discrete quantum numbers. We exemplified this in 7.2.1 while constructing the principal series representations of SL(2,R) where we had the purely imaginary “momentum” $is$ and the discrete parameter $\varepsilon \in \{0, 1\}$. In the GL(n,R) case, the general induction procedure from 7.1 proposes to look at first at the group B of


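superdiagonal matrices

\[ b = \begin{pmatrix} \alpha_1 & * & * & \dots & * \\ & \alpha_2 & * & \dots & * \\ & & \alpha_3 & \dots & * \\ & & & \ddots & \\ & & & & \alpha_n \end{pmatrix}. \tag{9.21} \]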

Given n real parameters $s_1, \dots, s_n$ and n numbers $\varepsilon_k = \pm 1$ (the supplementary discrete quantum numbers), one introduces the characters

\[ \chi_k : \alpha \mapsto \mathrm{sgn}(\alpha)^{\varepsilon_k}|\alpha|^{is_k} \]

of $\mathbb{R}^{*}$, as well as the character χ of B that sends the matrix b to

\[ \prod_{k=1}^{n} \chi_k(\alpha_k)|\alpha_k|^{\delta - k}. \]

Here we have δ = (n + 1)/2 and the supplementary exponent comes from the modular function and makes things unitary: Similar to our treatment in the special case in 7.2 we associate to χ the induced representation $\mathrm{ind}_B^G\chi$ given by right translation on the space of functions φ on G that satisfy the functional equation

\[ \phi(bg) = \chi(b)\phi(g) \quad \text{for all } b \in B,\ g \in G. \]

The representations associated to two sequences of parameters $\chi_k$ are equivalent iff one sequence is a permutation of the other.
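The characters $\chi_k$ and the character χ of B are simple enough to be evaluated numerically. The following sketch, with function names invented for this illustration, takes the exponents $\varepsilon_k$ as integers and the diagonal entries of b as nonzero real numbers; it only evaluates the formulas and makes no claim about the induced representation itself.

```python
def chi_k(alpha, s_k, eps_k):
    """Character alpha -> sgn(alpha)^eps_k * |alpha|^(i*s_k) of the multiplicative group of R."""
    sign = 1 if alpha > 0 else -1
    return sign ** eps_k * abs(alpha) ** (1j * s_k)

def chi_B(diagonal, s, eps):
    """Value of the character of B on a matrix b with the given diagonal entries alpha_1..alpha_n."""
    n = len(diagonal)
    delta = (n + 1) / 2
    value = 1
    for k, alpha in enumerate(diagonal, start=1):
        value *= chi_k(alpha, s[k - 1], eps[k - 1]) * abs(alpha) ** (delta - k)
    return value

a, b = -2.0, 3.0
print(abs(chi_k(a, 0.7, 1)))                                  # modulus 1: chi_k is unitary
print(chi_k(a * b, 0.7, 1), chi_k(a, 0.7, 1) * chi_k(b, 0.7, 1))  # multiplicativity
print(chi_B((4.0, 1.0), (0.0, 0.0), (1, 1)))                  # n = 2: |alpha_1/alpha_2|^(1/2) = 2
```

Since the diagonal of a product of two superdiagonal matrices is the product of the diagonals, χ is indeed multiplicative on B. For n = 2 the supplementary factor is $|\alpha_1/\alpha_2|^{1/2}$.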

If one replaces the exponents $is_k$ by arbitrary complex numbers, one gets representations, which in general will be neither irreducible nor unitary. But there is a well-determined process for choosing a specific irreducible factor of the representation that is then taken as the representation associated to the characters $\chi_1, \dots, \chi_n$. In the intuitive language, this means that one allows complex momenta.

In our study of SL(2,R) in 7.2 we met with the discrete series representations. Keeping this in mind, one can deduce that for GL(2,R) there are representations associated to one continuous parameter (or, if the analogy is pursued, one momentum) and one discrete parameter (or quantum number).

In the treatment of GL(n,R) the case n = 2 plays a special role because there is just one algebraic extension of the field R, namely C, and this is of degree 2 over R.

For the group GL(n,C) of complex matrices, the representation theory is simpler. One has no discrete series and therefore no bound states. In Langlands’ language, for K = R or C the general irreducible representation of GL(n,K) is analogous to r interacting but asymptotically bound states with $n_1, \dots, n_r$ particles, $\sum n_k = n$ and the $n_k$ being subject to constraints appropriate to the field. To construct the representation, one introduces the group P of matrices of the form (9.21) where n is replaced by r, each $\alpha_k$ is a square $n_k \times n_k$-matrix, and the asterisks represent block matrices of the appropriate size. From here, we can again apply the induction process but we have to replace the character χ by the tensor product of (in general infinite-dimensional) representations $\chi_k$ of $GL(n_k, K)$, so that the functions φ with their functional equation as above take their values in an infinite-dimensional space. If the momenta are all real, then the representation is irreducible. Otherwise it is again necessary to pass to a specific factor. As before, the order of the $\chi_k$ is irrelevant.

After some more consideration, finally one can conclude that the classification of irreducible representations of GL(n,K) has an extreme formal simplicity. For K = C, to specify representations we specify the n-dimensional representations of $GL(1,\mathbb{C}) = \mathbb{C}^{\times}$. As Langlands emphasizes, here we compare two very different objects: on one hand, irreducible and in general infinite-dimensional representations of the group GL(n,C), and on the other, finite-dimensional but in general reducible representations of $GL(1,\mathbb{C}) = \mathbb{C}^{\times}$.

A similar statement for K = R requires a group that is not commutative but whose irreducible representations are of degree at most two, in order to accommodate the existence of two particle bound states. The appropriate group is the Weil group of R, $W_{\mathbb{R}}$. It is obtained by adjoining to the group $\mathbb{C}^{\times}$ an element w such that

\[ w^2 = -1, \quad \text{and} \quad wz = \bar{z}w \ \text{ for all } z \in \mathbb{C}^{\times}. \]

This is a kind of dihedral group and, in our text, it already appeared as an example in Chapters 0 and 1. It has as non-trivial irreducible representations the character π_0 given by

$$z \longmapsto z\bar{z}, \qquad w \longmapsto -1,$$

and the two-dimensional representations π_m, for 0 ≠ m ∈ Z, given by

$$z \longmapsto \begin{pmatrix} z^m & 0 \\ 0 & \bar{z}^m \end{pmatrix}, \qquad w \longmapsto \begin{pmatrix} 0 & 1 \\ (-1)^m & 0 \end{pmatrix}.$$
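The defining relations of the Weil group, w² = −1 and wz = z̄w (with z̄ the complex conjugate of z), can be checked numerically for the two-dimensional representations π_m, which send z to the diagonal matrix with entries z^m and z̄^m. The following is only a minimal sketch in plain Python; the helper names `mat_mul`, `pi_m_of_z`, `pi_m_of_w` and `check_relations` are ours and do not appear in the text.

```python
def mat_mul(a, b):
    """Multiply two 2x2 complex matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def pi_m_of_z(m, z):
    """pi_m(z) = diag(z^m, conj(z)^m) for a non-zero complex number z."""
    return ((z ** m, 0), (0, z.conjugate() ** m))

def pi_m_of_w(m):
    """pi_m(w) = [[0, 1], [(-1)^m, 0]]."""
    return ((0, 1), ((-1) ** m, 0))

def close(a, b, tol=1e-9):
    """Entrywise comparison of two 2x2 matrices up to a tolerance."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

def check_relations(m, z):
    """Check that pi_m respects w^2 = -1 and w z = conj(z) w."""
    w = pi_m_of_w(m)
    # w^2 = -1 means pi_m(w)^2 must equal pi_m(-1).
    rel_square = close(mat_mul(w, w), pi_m_of_z(m, complex(-1)))
    # w z = conj(z) w means pi_m(w) pi_m(z) = pi_m(conj(z)) pi_m(w).
    rel_twist = close(
        mat_mul(w, pi_m_of_z(m, z)),
        mat_mul(pi_m_of_z(m, z.conjugate()), w),
    )
    return rel_square and rel_twist

print(check_relations(3, 1 + 2j))
```

The second relation is what forces the conjugate in the lower diagonal entry: with z^m in both diagonal positions the matrices on the two sides would differ whenever z^m is not real.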

The Weil group W_C of the complex number field C is just C×. One has the nice result that for K = R and C the irreducible representations of GL(n,K) are parametrized by n-dimensional representations of the Weil group W_K. The Weil group can be attached to many of the fields appearing in number theory. It can be rather complicated because, as Langlands assures, it incorporates many of the deepest facts about the theory of equations known at the present.

To get some more flavour, we add a remark concerning orthogonal subgroups G of GL(2n + 1,R) defined as stability groups of quadratic forms. As we know, there are several types of groups depending on the index of the quadratic form, and the representation theory of these will be quite different. But one can get a unified point of view by a parametrization of the representations via homomorphisms of the Weil group W_R into the L-group ^LG of G. The general definition of the L-group is given in terms of the theory of equations and of root systems. For G = GL(n,R) it is simply ^LG = GL(n,C), and for G = SO(2n + 1) it is the symplectic group of 2n × 2n-matrices, ^LG = Sp(n,C).

The principle that the irreducible representations of the real group G are classified by the (continuous) homomorphisms of W_R into ^LG is valid, but it needs some qualification. A reducible n-dimensional representation of W_R is a homomorphism of W_R into GL(n,C) whose image commutes with a subgroup of GL(n,C) that is isomorphic to C× and not contained in the center of GL(n,C). Hence the notion of irreducibility has an obvious analogue for the homomorphisms of W_R into ^LG, and it can be verified that the homomorphisms which are irreducible in this sense correspond to representations whose matrix coefficients are square-integrable over the group. If the quadratic form chosen is Euclidean, then G is compact and all representations have this property, so that any homomorphism of W_R into ^LG that is not irreducible in this sense will have to be excluded from the list of parameters. A similar but less restrictive condition must also be imposed for the groups with other indices.

It also turns out that the classification provided by W_R and ^LG is coarse. Some homomorphisms correspond to several irreducible representations of G. This finally leads to the notion of L-packets, which collect representations with the same L-functions.

L-Groups and Functoriality

For Langlands the introduction of the L-group is an essential step towards his famous principle of functoriality. A homomorphism from a group H to a group G does not provide any way to transfer irreducible representations from one of these groups to the other, unless G is abelian, nor are homomorphisms between H and G usually related to homomorphisms between ^LH and ^LG. On the other hand, a homomorphism from ^LH to ^LG yields, by composition

$$W_{\mathbb{R}} \longrightarrow {}^{L}H \longrightarrow {}^{L}G,$$

a map from the set of parameters for the irreducible representations of H to the set of parameters for the irreducible representations of G, and thus implicitly a way to pass from irreducible representations of H to irreducible representations of G. Langlands calls this passage the principle of functoriality in the L-group. Realizing this principle in general would have the deepest consequences.

The Standard L-Function

In 9.6 we already stressed the possibility of associating an L-function to a given problem or object via an Euler product, by encoding information about the object in such a way that one gets a reasonable “Euler polynomial” in the variable t = p^{−s}, which can be put in the denominator of an Euler factor. While constructing representations, say of GL(n,K), via induction in Chapter 7, the main idea was that at least some representations are fixed by choosing characters for the group H of diagonal matrices D(α_1, . . . , α_n), which can be understood as distinguishing matrices

$$D(\alpha^{i s_1}, \dots, \alpha^{i s_n}), \qquad s_1, \dots, s_n \in \mathbb{R}.$$

Even if we still avoid going into the representation theory of the non-archimedean field Q_p, it should not be too hard to accept that the analogue of the matrix above in the p-adic theory is the matrix

$$A(\pi_p) := D(p^{i s_1}, \dots, p^{i s_n}), \qquad s_1, \dots, s_n \in \mathbb{R}.$$

To this matrix one can attach the function

$$L(\pi_p, s) := \frac{1}{\det(E_n - A(\pi_p)\,p^{-s})}$$

and then form the Euler product

$$L(\pi, s) := \prod_p L(\pi_p, s).$$


In this product perhaps some “special” primes are left out or are represented by special factors, and one has to find an appropriate “factor at ∞” associated to the archimedean representation. Such a product converges for, say, Re s > a + 1, provided that the eigenvalues of each A(π_p) are less than p^a in absolute value. The outcome is called a standard L-function. Whoever wants to see that this really happens (and even more...) should be motivated to take a look into the non-archimedean theory, as for instance in [Ge] or [GeS].
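To make the Euler product concrete, here is a small numerical sketch in Python. It assumes that every local matrix is diagonal with entries p^{i s_k} for one fixed tuple of real numbers s_1, . . . , s_n, it ignores special primes and the factor at ∞, and it truncates the product at a finite bound. The function names `primes_up_to`, `local_factor` and `truncated_L` are ours. In the simplest case n = 1, s_1 = 0 the product is the Euler product of the Riemann zeta function, which gives a sanity check.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: all primes p <= n (n >= 2 assumed)."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = [False] * len(sieve[i * i :: i])
    return [p for p, is_prime in enumerate(sieve) if is_prime]

def local_factor(p, s, exponents):
    """1 / det(E_n - A p^{-s}) for the diagonal matrix A = D(p^{i s_1}, ..., p^{i s_n}).

    For a diagonal matrix the determinant is the product over the diagonal.
    """
    det = 1.0 + 0.0j
    for s_k in exponents:
        eigenvalue = p ** (1j * s_k)  # p^{i s_k}, of absolute value 1
        det *= 1.0 - eigenvalue * p ** (-s)
    return 1.0 / det

def truncated_L(s, exponents, bound=10000):
    """Product of the local factors over all primes p <= bound."""
    value = 1.0 + 0.0j
    for p in primes_up_to(bound):
        value *= local_factor(p, s, exponents)
    return value

# n = 1, s_1 = 0: the truncated product should be close to zeta(2) = pi^2 / 6.
print(abs(truncated_L(2.0, [0.0])), math.pi ** 2 / 6)
```

Here all eigenvalues have absolute value 1, so in the notation of the text one can take a = 0 and expect convergence for Re s > 1; the truncation bound only controls how closely the partial product approximates the limit.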

L-Groups and Automorphic L-Functions

To finish our fantastic and magical story (the German word Märchen appears in the title of one of Langlands’ articles ([L1])), we look at an (automorphic) representation π of a group G (in particular G = GL(n)) and take a finite-dimensional representation ρ of the L-group ^LG belonging to G. Then we have again a finite Euler polynomial given by det(E − ρ(A(π_p)) t), and one can introduce as automorphic L-function

$$L(\pi, \rho, s) := \prod_p \frac{1}{\det(E - \rho(A(\pi_p))\,p^{-s})}$$

where again for some primes p we have to put in special factors, which we do not discuss here. In particular, one has to find a completing factor belonging to the archimedean theory. The final goal is to generalize what we sketched in the last section by showing that to given π and ρ there is an automorphic representation Π such that

$$A(\Pi_p) = \rho(A(\pi_p))$$

for almost all p. Then one would have

$$L(\pi, \rho, s) = L(\Pi, s)$$

and could transfer properties of one type of function to the other.
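As a worked illustration (our choice of example, not taken from the text), let G = GL(2), so that the L-group is GL(2,C), and let ρ be the symmetric square representation of GL(2,C) on the three-dimensional space of quadratic forms in two variables. Writing the local matrix as A(π_p) = D(α_p, β_p), where α_p and β_p are simply names for its two diagonal entries, one gets

$$\rho(A(\pi_p)) = D(\alpha_p^2,\ \alpha_p\beta_p,\ \beta_p^2),$$

so the Euler polynomial in t = p^{-s} is

$$\det(E - \rho(A(\pi_p))\,t) = (1 - \alpha_p^2 t)(1 - \alpha_p\beta_p t)(1 - \beta_p^2 t),$$

and the local factor of the automorphic L-function is the reciprocal of this cubic polynomial. For ρ the identity representation of GL(2,C) the same recipe gives back the quadratic polynomial (1 − α_p t)(1 − β_p t) of the standard L-function.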

There are many other important topics coming up in this context. We only mention the topic of the trace formula but cannot touch on it here. The only theme whose formulation is within our reach is the following: Given a representation π of a group G = GL(n) and a representation π′ of G′ = GL(n′), by the methods from our text one can construct representations π + π′ of GL(n + n′) and π ⊗ π′ of GL(nn′). It is natural to ask for a similar construction for automorphic representations. This is related to functoriality and up to now there are very few results.
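On the level of the diagonal matrices attached to π and π′ at a prime p, one may picture the two constructions as follows. This is a sketch under our own assumption, not a statement proved in the text: the sum corresponds to the block diagonal matrix of size n + n′ and the tensor product to the Kronecker product of size nn′. The Python names below are ours, and the sample entries are hypothetical.

```python
def direct_sum(a, b):
    """Diagonal entries of the block matrix diag(A, A'), of size n + n'."""
    return list(a) + list(b)

def tensor_product(a, b):
    """Diagonal entries of the Kronecker product of A and A', of size n * n'."""
    return [x * y for x in a for y in b]

def euler_factor(diagonal, t):
    """1 / det(E - A t) for a diagonal matrix A with the given diagonal entries."""
    det = 1.0 + 0.0j
    for eigenvalue in diagonal:
        det *= 1.0 - eigenvalue * t
    return 1.0 / det

# Hypothetical sample data at p = 5: diagonal entries of the form p^{i s_k}.
p = 5
a = [p ** (1j * 0.3), p ** (-1j * 0.3)]              # n = 2
b = [p ** (1j * 0.7), 1.0 + 0.0j, p ** (-1j * 0.7)]  # n' = 3
t = p ** (-2.0)                                       # t = p^{-s} with s = 2

# The Euler factor of the sum is the product of the two Euler factors.
lhs = euler_factor(direct_sum(a, b), t)
rhs = euler_factor(a, t) * euler_factor(b, t)
print(abs(lhs - rhs) < 1e-12, len(tensor_product(a, b)))
```

Under this assumption the L-function attached to the sum factors as a product of the two standard L-functions, while the tensor product gives a genuinely new Euler polynomial of degree nn′.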

We repeat our hope that the vagueness of the presentation of some of the material in the last sections will make the reader curious to know more about all this and stimulate more intense study of this most fascinating part of mathematics. All the time, we were driven by J. R. Jiménez’ words:

¡Voz mía, canta, canta;
que mientras haya algo
que no hayas dicho tú,
tú nada has dicho!

Bibliography

[AM] Abraham, R., Marsden, J.E.: Foundations of Mechanics. Benjamin/Cummings, Reading 1978.

[BR] Barut, A.O., Raczka, R.: Theory of Group Representations and Applications. PWN Polish Scientific Publishers, Warszawa 1980.

[Ba] Bargmann, V.: Irreducible Unitary Representations of the Lorentz Group. Annals of Math. 48 (1947) 568–640.

[Be] Berndt, R.: Einführung in die Symplektische Geometrie. Vieweg, Braunschweig/Wiesbaden 1998. Now also translated: An Introduction to Symplectic Geometry. GSM 26, AMS 2001.

[Be1] Berndt, R.: The Heat Equation and Representations of the Jacobi Group. Contemporary Mathematics 389 (2006) 47–68.

[BeS] Berndt, R., Schmidt, R.: Elements of the Representation Theory of the Jacobi Group. PM 163, Birkhäuser, Basel 1998.

[BR] Bratteli, O., Robinson, D.W.: Operator Algebras and Quantum Statistical Mechanics. Springer, New York 1979.

[BtD] Bröcker, T., tom Dieck, T.: Representations of Compact Lie Groups. Springer, New York 1985.

[Bu] Bump, D.: Automorphic Forms and Representations. Cambridge University Press, Cambridge 1997.

[CF] Cassels, J.W.S., Fröhlich, A.: Algebraic Number Theory. Thompson, Washington, D.C. 1967.

[CN] Conway, J.H., Norton, S.P.: Monstrous Moonshine. Bull. London Math. Soc. 11 (1979) 308–339.

[Co] Cornwell, J.F.: Group Theory in Physics. Academic Press, London 1984.

[Do] Donley, R.W.: Irreducible Representations of SL(2,R). p. 51–59 in: Representation Theory and Automorphic Forms (Bailey, T.N., Knapp, A.W., eds.), PSPM Vol. 61, AMS 1997.

[Du] Duflo, M.: Théorie de Mackey pour les groupes de Lie algébriques. Acta Math. 149 (1982) 153–213.


[Ei] Eichler, M.: Einführung in die Theorie der algebraischen Zahlen und Funktionen. Birkhäuser, Basel 1963.

[Ei1] Eichler, M.: Einige Anwendungen der Spurformel im Bereich der Modularkorrespondenzen. Math. Ann. 168 (1967) 128–137.

[EZ] Eichler, M., Zagier, D.: The Theory of Jacobi Forms. Birkhäuser, Boston 1985.

[Fi] Fischer, G.: Lineare Algebra. Vieweg, Wiesbaden 2005.

[Fl] Flicker, Y.Z.: Automorphic Forms and Shimura Varieties of PGSp(2). World Scientific, New Jersey 2005.

[Fog] Fogarty, J.: Invariant Theory. Benjamin, New York 1969.

[Fo] Forster, O.: Analysis 1. Vieweg, Wiesbaden 2006.

[FB] Freitag, E., Busam, R.: Funktionentheorie. Springer, Berlin 1993.

[Fr] Frenkel, E.: Lectures on the Langlands Program and Conformal Field Theory. arXiv:hep-th/0512172v1, 15 Dec 2005.

[FLM] Frenkel, I., Lepowsky, J., Meurman, A.: Vertex Operator Algebras and the Monster. Academic Press, Boston 1988.

[FS] Fuchs, J., Schweigert, Chr.: Symmetries, Lie Algebras and Representations. Cambridge University Press 1997.

[FH] Fulton, W., Harris, J.: Representation Theory. GTM 129, Springer, New York 1991.

[Ge] Gelbart, S.: Automorphic Forms on Adele Groups. Annals of Math. Studies 83, Princeton University Press 1975.

[GeS] Gelbart, S., Shahidi, F.: Analytic Properties of Automorphic L-Functions. Academic Press, Boston 1988.

[Gel] Gell-Mann, M.: Phys. Rev. 125 (1962) 1067–1084.

[GH] Griffiths, P., Harris, J.: Principles of Algebraic Geometry. Wiley, New York 1978.

[GGP] Gelfand, I., Graev, M., Pyatetskii-Shapiro, I.: Representation Theory and Automorphic Functions. W.B. Saunders, Philadelphia 1963.

[Go] Goldstein, H.: Classical Mechanics. Addison-Wesley, Reading 1980.

[GS] Guillemin, V., Sternberg, S.: Symplectic Techniques in Physics. Cambridge University Press 1984.

[GS1] Guillemin, V., Sternberg, S.: Geometric Asymptotics. Math. Surv. and Monographs 14, AMS 1990.

[Ha] Halmos, P.: Measure Theory. Van Nostrand, New York 1950.

[He] Hein, W.: Struktur- und Darstellungstheorie der klassischen Gruppen. Springer HT, Berlin 1990.


[HN] Hilgert, J., Neeb, K.-H.: Lie-Gruppen und Lie-Algebren. Vieweg, Braunschweig/Wiesbaden 1991.

[HR] Hewitt, E., Ross, K.A.: Abstract Harmonic Analysis I. Springer, Berlin 1963.

[Hu] Humphreys, J.E.: Introduction to Lie Algebras and Representation Theory. GTM 9, Springer, New York 1972.

[Ig] Igusa, J.: Theta Functions. Springer, Berlin 1972.

[JL] Jacquet, H., Langlands, R.P.: Automorphic Forms on GL(2). LNM 114, Springer, New York 1970.

[Ja] Jacobson, N.: Lie Algebras. Interscience, New York 1962.

[Ki] Kirillov, A.A.: Elements of the Theory of Representations. Springer, Berlin 1976.

[Ki1] Kirillov, A.A.: Lectures on the Orbit Method. GSM 64, AMS 2004.

[Ki2] Kirillov, A.A.: Unitary Representations of Nilpotent Lie Groups. Uspekhi Mat. Nauk 17 (1962) 57–110; English transl. in Russian Math. Surveys 17 (1962).

[Ki3] Kirillov, A.A.: Merits and Demerits of the Orbit Method. Bulletin of the AMS 36 (1999) 433–488.

[Ko] Koecher, M.: Lineare Algebra. Springer, Berlin 1983.

[Kos] Kostant, B.: Quantization and Unitary Representations. In: Lectures in Modern Analysis III (Taam, C.T., ed.), LNM 170, Springer, Berlin 1970.

[Kn] Knapp, A.W.: Representation Theory of Semisimple Groups. An Overview Based on Examples. Princeton University Press 1986.

[Kn1] Knapp, A.W.: Structure Theory of Semisimple Lie Groups. p. 1–27 in: Representation Theory and Automorphic Forms (Bailey, T.N., Knapp, A.W., eds.), PSPM Vol. 61, AMS 1997.

[Kn2] Knapp, A.W.: Lie Groups Beyond an Introduction. PM 140, Birkhäuser, Boston 1996.

[KT] Knapp, A.W., Trapa, P.E.: Representations of Semisimple Lie Groups. p. 7–87 in: Representation Theory of Lie Groups (Adams, J., Vogan, D., eds.), IAS/Park City Math. Series 8, AMS 2000.

[Ku] Kubota, T.: Elementary Theory of Eisenstein Series. Halsted Press, New York 1973.

[La] Lang, S.: Algebra. Addison-Wesley, Reading, Mass. 1965.

[La1] Lang, S.: SL(2,R). Springer, New York 1985.

[La2] Lang, S.: Algebraic Number Theory. Addison-Wesley, Reading, Mass. 1970.

[L] Langlands, R.P.: Representation Theory: Its Rise and Its Role in Number Theory. p. 181–210 in: Proceedings of the Gibbs Symposium, Yale University 1989, AMS 1990.


[L1] Langlands, R.P.: Automorphic Representations, Shimura Varieties, and Motives. Ein Märchen. p. 205–246 in: PSPM Vol. 33, Part 2, AMS 1979.

[LV] Lion, G., Vergne, M.: The Weil representation, Maslov index and Theta series. Birkhäuser, Boston 1980.

[LZ] Lewis, J., Zagier, D.: Period functions for Maass wave forms I. Ann. Math. 153 (2001) 191–253.

[Ma] Mackey, G.W.: Unitary Group Representations in Physics, Probability, and Number Theory. Benjamin/Cummings Publishing Co., Reading, Mass. 1978.

[Ma1] Mackey, G.W.: Induced Representations of Locally Compact Groups I. Ann. of Math. 55 (1952) 101–139.

[Mu] Mumford, D.: Tata Lectures on Theta I, II, III. PM 28, 43, 97, Birkhäuser, Boston 1983, 1984, 1991.

[Na] Naimark, M.A.: Linear Representations of the Lorentz Group. Pergamon Press, London 1964.

[Ne] Neukirch, J.: Algebraische Zahlentheorie. Springer, Berlin 2002.

[RV] Ramakrishnan, D., Valenza, R.J.: Fourier Analysis on Number Fields. GTM 186, Springer, New York 1999.

[Re] Renouard, P.: Variétés symplectiques et quantification. Thèse. Orsay 1969.

[Sch] Schoeneberg, B.: Elliptic Modular Functions. Springer, Berlin 1974.

[Se] Serre, J.P.: Linear Representations of Finite Groups. Springer, New York 1977.

[Se1] Serre, J.P.: A Course in Arithmetic. GTM 7, Springer, New York 1973.

[Sel] Selberg, A.: Harmonic analysis and discontinuous groups in weakly symmetric Riemannian spaces with application to Dirichlet series. J. Indian Math. Soc. 20 (1956) 47–87.

[Sh] Shimura, G.: Introduction to the Arithmetic Theory of Automorphic Functions. Iwanami Shoten and Princeton University Press 1971.

[To] Torasso, P.: Méthode des orbites de Kirillov–Duflo et représentations minimales des groupes simples sur un corps local de caractéristique nulle. Duke Math. J. 90 (1997) 261–377.

[vD] van Dijk, G.: The irreducible Unitary Representations of SL(2,R). In: Representations of Locally Compact Groups with Applications (Koornwinder, T.H., ed.), Mathematisch Centrum, Amsterdam 1979.

[Ve] Vergne, M.: Geometric Quantization and Equivariant Cohomology. First European Congress of Mathematics, Vol. I, p. 249–298. PM 119, Birkhäuser, Basel 1994.

[Ve1] Vergne, M.: Quantification géométrique et réduction symplectique. Séminaire Bourbaki 888 (2001).


[Vo] Vogan, D.A.: The Method of Coadjoint Orbits for Real Reductive Groups. p. 179–238 in: Representation Theory of Lie Groups (Adams, J., Vogan, D., eds.), IAS/Park City Math. Series 8, AMS 2000.

[Vo1] Vogan, D.A.: Associated Varieties and Unipotent Representations. p. 315–388 in: Harmonic Analysis on Reductive Lie Groups (Barker, W., Sally, P., eds.), Birkhäuser, Boston 1991.

[Vo2] Vogan, D.A.: Cohomology and group representations. PSPM Vol. 61 (1997) 219–243.

[Wa] Warner, G.: Harmonic Analysis on Semi-Simple Lie Groups. Springer, Berlin 1972.

[We] Weil, A.: Variétés Kählériennes. Hermann, Paris 1957.

[We1] Weil, A.: Über die Bestimmung Dirichletscher Reihen durch ihre Funktionalgleichung. Math. Ann. 168 (1967) 149–156.

[Wo] Woodhouse, N.: Geometric Quantization. (Second Edition) Oxford University Press 1991.

[Ya] Yang, Y.-H.: The Method of Orbits for Real Lie Groups. Kyungpook Math. J. 42 (2002) 199–272.

