
An Introduction to Invariant Theory

Harm Derksen, University of Michigan

Optimization, Complexity and Invariant Theory
Institute for Advanced Study, June 4, 2018


Plan of the Talk

- applications of invariants
- a classical, motivating example: binary forms
- polynomial rings and ideals
- group representations and invariant rings
- Hilbert’s Finiteness Theorem
- the null cone and the Hilbert-Mumford criterion
- degree bounds for invariants
- polarization of invariants and Weyl’s Theorem
- invariant theory for other fields



Applications of Invariants

Definition

an invariant is a quantity or expression that stays the same under certain operations

the total energy in a physical system is an invariant as the system evolves over time

loop invariants can be used to prove the correctness of an algorithm: although the number of iterations in a loop may vary, the loop invariant tells us something about the variables after the iterations
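As a small illustration of a loop invariant, here is a minimal Python sketch. The function name `sum_of_squares` and the choice of example are assumptions made for illustration and do not come from the talk.

```python
def sum_of_squares(values):
    """Return the sum of squares of `values`, checking a loop invariant."""
    total = 0
    for i, v in enumerate(values):
        # Loop invariant: total equals the sum of squares of values[:i].
        assert total == sum(x * x for x in values[:i])
        total += v * v
    # The invariant for i == len(values) is exactly the postcondition.
    assert total == sum(x * x for x in values)
    return total


print(sum_of_squares([1, 2, 3]))  # 1 + 4 + 9 = 14
```

The invariant holds however many iterations are executed, which is why it says something about `total` once the loop has finished.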




knot invariants (such as the Jones polynomial) can be used to distinguish knots

knot invariants remain unchanged under Reidemeister moves

(co-)homology groups are invariants of topological manifolds





Invariant Theory

in invariant theory we restrict ourselves to

- invariants that are polynomial functions on a vector space
- invariants that remain unchanged under group symmetries such as rotations, permutations, etc.

we start with a motivating example from 19th century invariant theory



Classical Invariant Theory: Binary Forms

a binary form of degree 2 is a polynomial

$$p(z, w) = p_1 z^2 + p_2 zw + p_3 w^2$$

with $p_1, p_2, p_3 \in \mathbb{C}$

$$\mathrm{SL}_2 = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} : ad - bc = 1 \right\}$$

is the group of $2 \times 2$ matrices with determinant one

a matrix $A \in \mathrm{SL}_2$ gives a linear change of coordinates in $\mathbb{C}^2$

the group $\mathrm{SL}_2$ acts on (the coefficients of) binary forms: we make the substitution $(z, w) \mapsto (az + cw, bz + dw)$ and get another polynomial

$$p'(z, w) = p(az + cw, bz + dw) = p'_1 z^2 + p'_2 zw + p'_3 w^2$$







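where

$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}_2, \qquad \begin{pmatrix} p'_1 \\ p'_2 \\ p'_3 \end{pmatrix} = M_A \begin{pmatrix} p_1 \\ p_2 \\ p_3 \end{pmatrix}$$

and, expanding $p(az + cw, bz + dw)$ and collecting the coefficients of $z^2$, $zw$ and $w^2$,

$$M_A = \begin{pmatrix} a^2 & ab & b^2 \\ 2ac & ad + bc & 2bd \\ c^2 & cd & d^2 \end{pmatrix}$$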



the polynomial $f(x_1, x_2, x_3) = x_2^2 - 4 x_1 x_3 \in \mathbb{C}[x_1, x_2, x_3]$ (the discriminant) can be viewed as a function from $\mathbb{C}^3$ to $\mathbb{C}$, and an easy calculation shows that

$$f\begin{pmatrix} p_1 \\ p_2 \\ p_3 \end{pmatrix} = p_2^2 - 4 p_1 p_3 = (p'_2)^2 - 4 p'_1 p'_3 = f\begin{pmatrix} p'_1 \\ p'_2 \\ p'_3 \end{pmatrix}$$

we say that $f(x_1, x_2, x_3)$ is an invariant under the action of $\mathrm{SL}_2$

$f(x_1, x_2, x_3)$ is a fundamental invariant that generates all invariants: if $h(x_1, x_2, x_3)$ is another polynomial invariant, then there exists a polynomial $q(y)$ such that $h(x_1, x_2, x_3) = q(f(x_1, x_2, x_3))$
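The invariance of the discriminant can be checked on a concrete example. The Python sketch below expands the substitution $(z, w) \mapsto (az + cw, bz + dw)$ directly for one integer matrix of determinant one. The helper names `substitute` and `discriminant` and the sample values are assumptions chosen for illustration.

```python
def substitute(p, A):
    """Coefficients of p(az + cw, bz + dw) for p = p1 z^2 + p2 zw + p3 w^2."""
    (a, b), (c, d) = A
    p1, p2, p3 = p
    return (
        p1 * a * a + p2 * a * b + p3 * b * b,                    # coefficient of z^2
        2 * p1 * a * c + p2 * (a * d + b * c) + 2 * p3 * b * d,  # coefficient of zw
        p1 * c * c + p2 * c * d + p3 * d * d,                    # coefficient of w^2
    )


def discriminant(p):
    """The discriminant p2^2 - 4 p1 p3 of a binary quadratic form."""
    p1, p2, p3 = p
    return p2 * p2 - 4 * p1 * p3


A = ((2, 3), (1, 2))      # ad - bc = 2*2 - 3*1 = 1, so A lies in SL_2
p = (1, 5, 7)             # the form z^2 + 5zw + 7w^2
p_new = substitute(p, A)

print(p_new)                                 # (97, 123, 39)
print(discriminant(p), discriminant(p_new))  # -3 -3
```

A single numerical check is of course not a proof; it only illustrates the identity on one example.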









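we may identify binary forms of degree $n$ with vectors in $\mathbb{C}^{n+1}$:

$$p_1 z^n + p_2 z^{n-1} w + \cdots + p_{n+1} w^n \leftrightarrow \begin{pmatrix} p_1 \\ p_2 \\ \vdots \\ p_{n+1} \end{pmatrix}$$

the vector space of binary forms of degree $n$ is an $(n + 1)$-dimensional representation of $\mathrm{SL}_2$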






polynomial invariants for binary forms of arbitrary degree were extensively studied in the 19th century by mathematicians like Boole, Sylvester, Cayley, Aronhold, Hermite, Eisenstein, Clebsch, Gordan, Lie, Klein, Capelli, etc.

Theorem (Gordan 1868)

for binary forms of degree $d$ there exists a finite system of fundamental invariants that generate all invariants (i.e., every invariant is a polynomial expression in the fundamental invariants)

one of the main objectives was to find an explicit system of fundamental invariants for binary forms of degree $d$

(currently known for $d \le 10$)
















The Polynomial Ring

$x_1, x_2, \dots, x_n$ coordinate functions on $V = \mathbb{C}^n$

a polynomial $f(x_1, \dots, x_n)$ can be viewed as a function from $V$ to $\mathbb{C}$

$\mathbb{C}[x] = \mathbb{C}[x_1, \dots, x_n]$ graded ring of polynomial functions

Definition (Ideal)

a subset $I \subseteq \mathbb{C}[x]$ is an ideal if

1. $0 \in I$;
2. $f(x), g(x) \in I \Rightarrow f(x) + g(x) \in I$;
3. $f(x) \in \mathbb{C}[x],\ g(x) \in I \Rightarrow f(x) g(x) \in I$.




Hilbert’s Basis Theorem

the ideal $(S)$ generated by a subset $S \subseteq \mathbb{C}[x]$ is

$$\{ a_1(x) f_1(x) + \cdots + a_r(x) f_r(x) \mid r \in \mathbb{N},\ \forall i\ a_i(x) \in \mathbb{C}[x],\ f_i(x) \in S \}$$

Theorem (Hilbert 1890)

every ideal $I \subseteq \mathbb{C}[x]$ is generated by a finite set ($\mathbb{C}[x]$ is noetherian)

if $S \subseteq \mathbb{C}[x]$, then $(S) = (T)$ for some finite subset $T \subseteq S$

Hilbert used this theorem to prove his Finiteness Theorem in Invariant Theory (discussed later)


























Action of a Group G

suppose $V = \mathbb{C}^n$ is a representation of a group $G$

this means that every $g \in G$ acts by some $n \times n$ matrix $M_g : V \to V$ (so $g \cdot v = M_g v$)

and we have $M_e = I$ and $M_{gh} = M_g M_h$

this also implies $M_{g^{-1}} = (M_g)^{-1}$

if $f(x) \in \mathbb{C}[x]$ and $M = (m_{i,j})$ is an $n \times n$ matrix, then $v \mapsto f(Mv)$ is a polynomial function given by the formula

$$f\Big( \sum_{j=1}^{n} m_{1,j} x_j, \dots, \sum_{j=1}^{n} m_{n,j} x_j \Big)$$




Action of a Group G

$G$ acts on $\mathbb{C}[x]$ as follows:

if $g \in G$ and $f(x) \in \mathbb{C}[x]$ then define $(g \cdot f)(x) \in \mathbb{C}[x]$ by $(g \cdot f)(v) = f(M_{g^{-1}} v)$

(we use $M_{g^{-1}}$ instead of $M_g$ to make it a left action)

$\mathbb{C}[x]$ is an $\infty$-dimensional $\mathbb{C}$-vector space

the monomials form a basis

$G$ acts by linear transformations on $\mathbb{C}[x]$

$\mathbb{C}[x]$ is an $\infty$-dimensional representation of $G$
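To see why the inverse is needed, the following Python sketch compares the rule $(g \cdot f)(v) = f(M_{g^{-1}} v)$ with the naive rule $v \mapsto f(M_g v)$ for two integer matrices of determinant one. All helper names (`act`, `act_without_inverse`, and so on), the matrices and the test polynomial are assumptions chosen for illustration.

```python
def matmul(A, B):
    """Product of two 2 x 2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def inverse(A):
    """Inverse of a 2 x 2 integer matrix with determinant 1."""
    (a, b), (c, d) = A
    assert a * d - b * c == 1
    return ((d, -b), (-c, a))


def apply(A, v):
    """Matrix-vector product A v."""
    return tuple(sum(A[i][j] * v[j] for j in range(2)) for i in range(2))


def act(g, f):
    """Return g . f, defined by (g . f)(v) = f(M_{g^-1} v)."""
    g_inv = inverse(g)
    return lambda v: f(apply(g_inv, v))


def act_without_inverse(g, f):
    """The naive rule v -> f(M_g v), which reverses the order of products."""
    return lambda v: f(apply(g, v))


g = ((1, 1), (0, 1))
h = ((1, 0), (1, 1))
f = lambda v: v[0] ** 2 + 3 * v[1]   # an arbitrary test polynomial
v = (2, 5)

# Left action: g . (h . f) agrees with (gh) . f.
left_ok = act(g, act(h, f))(v) == act(matmul(g, h), f)(v)

# Without the inverse the same identity fails for this non-commuting pair.
naive_ok = (
    act_without_inverse(g, act_without_inverse(h, f))(v)
    == act_without_inverse(matmul(g, h), f)(v)
)

print(left_ok, naive_ok)  # True False
```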




The Invariant Ring

$f(x) \in \mathbb{C}[x]$ is $G$-invariant if $(g \cdot f)(x) = f(x)$ for all $g \in G$

$f(x) \in \mathbb{C}[x]$ is $G$-invariant if and only if it is constant on all $G$-orbits in $V$

Definition

$\mathbb{C}[x]^G$ is the set of all $G$-invariant polynomials in $\mathbb{C}[x]$

$\mathbb{C}[x]^G$ is a subalgebra, i.e., it contains $\mathbb{C}$ and is closed under addition, subtraction and multiplication

if $f_1(x), \dots, f_r(x) \in \mathbb{C}[x]$ then

$$\mathbb{C}[f_1(x), \dots, f_r(x)] := \{ p(f_1(x), \dots, f_r(x)) \mid p(y_1, \dots, y_r) \in \mathbb{C}[y_1, \dots, y_r] \}$$

is the subalgebra of $\mathbb{C}[x]$ generated by $f_1(x), \dots, f_r(x)$.





The Symmetric Group

$G = S_n$ acts on $V = \mathbb{C}^n$ by permuting the coordinates

for $\sigma \in S_n$, $M_\sigma$ is the corresponding permutation matrix

$S_n$ acts on $\mathbb{C}[x]$ as

$$(\sigma \cdot f)(x_1, \dots, x_n) = f(x_{\sigma(1)}, \dots, x_{\sigma(n)})$$

define the $k$-th elementary symmetric function as

$$e_k(x) = \sum_{1 \le i_1 < i_2 < \cdots < i_k \le n} x_{i_1} x_{i_2} \cdots x_{i_k}$$

for example $e_1 = x_1 + x_2 + \cdots + x_n$ and $e_n = x_1 x_2 \cdots x_n$

Theorem

$$\mathbb{C}[x]^{S_n} = \mathbb{C}[e_1(x), \dots, e_n(x)]$$
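A minimal Python sketch of the elementary symmetric functions for $n = 3$, evaluated at one sample point. It checks that each $e_k$ is unchanged by permuting the coordinates, and that the symmetric polynomial $x_1^2 + x_2^2 + x_3^2$ agrees with the polynomial expression $e_1^2 - 2 e_2$, as the theorem predicts some such expression must exist. The function name and the sample point are assumptions for illustration.

```python
from itertools import combinations, permutations
from math import prod


def elementary_symmetric(k, xs):
    """e_k(xs): sum of all products of k entries with increasing indices."""
    return sum(prod(c) for c in combinations(xs, k))


xs = (2, -1, 3)
e1, e2, e3 = (elementary_symmetric(k, xs) for k in (1, 2, 3))

# Each e_k takes the same value on every permutation of the coordinates.
symmetric = all(
    elementary_symmetric(k, perm) == elementary_symmetric(k, xs)
    for k in (1, 2, 3)
    for perm in permutations(xs)
)

# The invariant x1^2 + x2^2 + x3^2 written in terms of e1 and e2.
power_sum = sum(x * x for x in xs)

print(e1, e2, e3)                   # 4 1 -6
print(symmetric)                    # True
print(power_sum, e1 ** 2 - 2 * e2)  # 14 14
```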




Hilbert’s Finiteness Theorem

assume that $G$ is (linearly) reductive, which means that every representation of $G$ is a direct sum of irreducible representations

examples are $\mathrm{GL}_n$, $\mathrm{SL}_n$, $\mathrm{O}_n$, finite groups

Theorem (Hilbert 1890)

$\mathbb{C}[x]^G$ is a finitely generated algebra, i.e., $\mathbb{C}[x]^G = \mathbb{C}[f_1(x), \dots, f_r(x)]$ for some $r < \infty$ and $f_1(x), \dots, f_r(x) \in \mathbb{C}[x]^G$

proof sketch:

$J \subseteq \mathbb{C}[x]$ ideal generated by all homogeneous, non-constant $f(x) \in \mathbb{C}[x]^G$ ($\infty$ many!)

Basis Theorem: $J = (f_1(x), \dots, f_r(x))$ for some $r < \infty$ and homogeneous $f_1(x), \dots, f_r(x) \in \mathbb{C}[x]^G$

by induction on the degree, using that $G$ is reductive, one shows that $\mathbb{C}[x]^G = \mathbb{C}[f_1(x), \dots, f_r(x)]$






























Degree Bounds

Definition

$\beta(\mathbb{C}[x]^G)$ is the smallest $d$ such that $\mathbb{C}[x]^G$ is generated by polynomials of degree $\le d$

Theorem (Jordan 1876)

for binary forms of degree $d$ we have $\beta(\mathbb{C}[x_1, \dots, x_{d+1}]^{\mathrm{SL}_2}) \le d^6$

Theorem (Emmy Noether 1916)

if $G$ is finite then $\beta(\mathbb{C}[x]^G) \le |G|$
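Noether's bound can be observed on a small example that is not from the talk: a cyclic group of order $m$ whose generator multiplies $x_i$ by $\zeta^{w_i}$, with $\zeta$ a primitive $m$-th root of unity. For such a diagonal action the invariant ring is spanned by the invariant monomials, so generators can be found by listing invariant monomials that are not products of smaller ones. The sketch below searches only up to degree $m = |G|$, which relies on Noether's bound rather than proving it. The function names and the weights $(1, 2)$ are assumptions for illustration.

```python
from itertools import product


def invariant_exponents(weights, m):
    """Exponent vectors e with 0 < |e| <= m whose monomial x^e is invariant.

    x^e is multiplied by zeta^(sum of weights[i] * e[i]), so it is invariant
    exactly when that sum is divisible by m.
    """
    n = len(weights)
    return {
        e
        for e in product(range(m + 1), repeat=n)
        if 0 < sum(e) <= m and sum(w * k for w, k in zip(weights, e)) % m == 0
    }


def minimal_generators(weights, m):
    """Invariant monomials of degree <= m that are not products of smaller ones."""
    inv = invariant_exponents(weights, m)

    def properly_divides(g, e):
        # If an invariant monomial x^g divides x^e, the quotient is invariant too.
        return g != e and all(gi <= ei for gi, ei in zip(g, e))

    return sorted(e for e in inv if not any(properly_divides(g, e) for g in inv))


gens = minimal_generators((1, 2), 3)
print(gens)                       # [(0, 3), (1, 1), (3, 0)]
print(max(sum(e) for e in gens))  # 3, which equals |G|
```

Here the generators are $x_2^3$, $x_1 x_2$ and $x_1^3$, so the bound $|G| = 3$ is attained in this example.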









A Constructive Proof

the proof of Hilbert’s finiteness theorem does not give an algorithm for finding generators, nor does it give an upper bound for $\beta(\mathbb{C}[x]^G)$ for arbitrary $G$

so Hilbert gave another, more constructive proof of his Finiteness Theorem in 1893, using his notion of the null cone




Hilbert’s Null cone

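for $v \in V$, $G \cdot v = \{ g \cdot v \mid g \in G \}$ is the orbit of $v$

$\overline{G \cdot v} \subseteq V$ is the closure of the orbit

Theorem

$$\overline{G \cdot v} \cap \overline{G \cdot w} \neq \emptyset \iff f(v) = f(w) \text{ for all } f(x) \in \mathbb{C}[x]^G$$

$\Rightarrow$: $f \in \mathbb{C}[x]^G$ is constant on $\overline{G \cdot v}$ and $\overline{G \cdot w}$

Definition

Hilbert’s Null cone:

$$\mathcal{N} := \{ v \in V \mid 0 \in \overline{G \cdot v} \} = \{ v \in V \mid f(v) = f(0) \text{ for all } f(x) \in \mathbb{C}[x]^G \}$$

if $\mathbb{C}[x]^G = \mathbb{C}[f_1(x), \dots, f_r(x)]$ with $f_1(x), \dots, f_r(x)$ homogeneous, non-constant, then $\mathcal{N} = \{ v \in V \mid f_1(v) = \cdots = f_r(v) = 0 \}$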








Example: Multiplicative Group

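$G = \mathbb{C}^\star$, $V = \mathbb{C}^4$

for $t \in \mathbb{C}^\star$, define

$$M_t = \begin{pmatrix} t & 0 & 0 & 0 \\ 0 & t & 0 & 0 \\ 0 & 0 & t^{-1} & 0 \\ 0 & 0 & 0 & t^{-1} \end{pmatrix}$$

$$t \cdot \begin{pmatrix} v_1 \\ v_2 \\ v_3 \\ v_4 \end{pmatrix} = M_t \begin{pmatrix} v_1 \\ v_2 \\ v_3 \\ v_4 \end{pmatrix} = \begin{pmatrix} t v_1 \\ t v_2 \\ t^{-1} v_3 \\ t^{-1} v_4 \end{pmatrix}$$

$$\mathcal{N} = \{ v_1 = v_2 = 0 \} \cup \{ v_3 = v_4 = 0 \}$$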


Example: Multiplicative Group

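$$\mathbb{C}[x_1, x_2, x_3, x_4]^{\mathbb{C}^\star} = \mathbb{C}[x_1 x_3, x_1 x_4, x_2 x_3, x_2 x_4]$$

$$\mathcal{N} = \{ v_1 v_3 = v_1 v_4 = v_2 v_3 = v_2 v_4 = 0 \} = \{ v_1 = v_2 = 0 \} \cup \{ v_3 = v_4 = 0 \}$$

note that in this case, there is an algebraic relation between the generators, namely

$$(x_1 x_3)(x_2 x_4) = (x_1 x_4)(x_2 x_3)$$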
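The statements in this example are easy to test numerically. The Python sketch below uses exact rational arithmetic to check that the four generators are unchanged by the action, that they satisfy the stated relation, and that the null cone is detected by their common vanishing. The function names and sample values are assumptions for illustration.

```python
from fractions import Fraction


def act(t, v):
    """t . v = (t v1, t v2, v3 / t, v4 / t) for a nonzero scalar t."""
    return (t * v[0], t * v[1], v[2] / t, v[3] / t)


def generators(v):
    """Values of the four generators x1x3, x1x4, x2x3, x2x4 at v."""
    return (v[0] * v[2], v[0] * v[3], v[1] * v[2], v[1] * v[3])


def in_null_cone(v):
    """v lies in the null cone exactly when all four generators vanish at v."""
    return all(value == 0 for value in generators(v))


v = tuple(Fraction(x) for x in (2, 3, 5, 7))
t = Fraction(3, 4)

invariant = generators(act(t, v)) == generators(v)
g = generators(v)
relation = g[0] * g[3] == g[1] * g[2]

print(invariant, relation)         # True True
print(in_null_cone((1, 2, 0, 0)))  # True:  v3 = v4 = 0
print(in_null_cone((0, 0, 5, 7)))  # True:  v1 = v2 = 0
print(in_null_cone((1, 0, 0, 1)))  # False: x1x4 does not vanish
```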


Hilbert-Mumford criterion

Definition

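a one parameter subgroup (1-PSG) is a homomorphism of algebraic groups $\lambda : \mathbb{C}^\star \to G$

Theorem (Hilbert-Mumford criterion)

if $v \in V = \mathbb{C}^n$, then

$$v \in \mathcal{N} \iff \text{there exists a 1-PSG } \lambda : \mathbb{C}^\star \to G \text{ with } \lim_{t \to 0} \lambda(t) \cdot v = 0$$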




Conjugation of n × n Matrices

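$V = \mathrm{Mat}_{n,n}$, the space of $n \times n$ matrices

$G = \mathrm{GL}_n$ (the group of invertible $n \times n$ matrices) acts by conjugation: if $A = (a_{i,j}) \in V$ and $g \in G$ then $g \cdot A = g A g^{-1}$

if

$$(\star) \qquad \lambda(t) = \begin{pmatrix} t^{k_1} & & \\ & \ddots & \\ & & t^{k_n} \end{pmatrix}$$

with $k_1 \ge k_2 \ge \cdots \ge k_n$, then

$$\lambda(t) \cdot A = \lambda(t) A \lambda(t)^{-1} = (t^{k_i - k_j} a_{i,j}).$$

so if $\lim_{t \to 0} \lambda(t) \cdot A = 0$ then $A$ is strictly upper triangular, and conversely every strictly upper triangular $A$ has limit $0$ when $k_1 > k_2 > \cdots > k_n$

every 1-PSG is of the form $(\star)$ after a base change, so

$$A \in \mathcal{N} \iff A \text{ is conjugate to a strictly upper triangular matrix} \iff A \text{ is nilpotent}$$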
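The scaling of the entries under a diagonal one parameter subgroup can be watched on a small example. The sketch below, in exact arithmetic, applies $\lambda(t) = \mathrm{diag}(t^2, t, 1)$ to a strictly upper triangular $3 \times 3$ matrix and records the largest entry as $t$ shrinks. The helper names, the matrix and the exponents are assumptions for illustration.

```python
from fractions import Fraction


def one_psg_action(ks, A, t):
    """Entries of lambda(t) A lambda(t)^-1 for lambda(t) = diag(t^k1, ..., t^kn)."""
    n = len(A)
    return [[t ** (ks[i] - ks[j]) * A[i][j] for j in range(n)] for i in range(n)]


def max_abs_entry(A):
    return max(abs(x) for row in A for x in row)


def is_nilpotent(A):
    """Check A^n == 0 by repeated multiplication."""
    n = len(A)
    P = A
    for _ in range(n - 1):
        P = [
            [sum(P[i][k] * A[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)
        ]
    return all(x == 0 for row in P for x in row)


A = [[0, 2, 5], [0, 0, 3], [0, 0, 0]]   # strictly upper triangular
ks = (2, 1, 0)                          # k1 > k2 > k3

# Largest entry of lambda(t) . A for t = 1/10, 1/100, 1/1000.
sizes = [
    max_abs_entry(one_psg_action(ks, A, Fraction(1, 10 ** e))) for e in (1, 2, 3)
]

print(sizes)            # [Fraction(3, 10), Fraction(3, 100), Fraction(3, 1000)]
print(is_nilpotent(A))  # True
```

The entries shrink with $t$, consistent with $A$ lying in the null cone. A matrix with a nonzero diagonal entry, such as `[[1, 0], [0, 0]]`, keeps that entry under every diagonal $\lambda(t)$ and is not nilpotent.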











$X = (x_{i,j})$ where $x_{i,j}$ are indeterminates

$$\det(tI - X) = t^n - f_1(x) t^{n-1} + \cdots + (-1)^n f_n(x)$$

where $x = x_{1,1}, x_{1,2}, \dots, x_{n,n}$

$f_1(A) = \operatorname{trace}(A)$, $f_n(A) = \det(A)$

Theorem

$$\mathbb{C}[x]^G = \mathbb{C}[f_1(x), \dots, f_n(x)]$$

$$A \in \mathcal{N} \iff f_1(A) = \cdots = f_n(A) = 0 \iff \det(tI - A) = t^n \iff A \text{ nilpotent}$$
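The invariants $f_1, \dots, f_n$ can be computed for a concrete matrix. The sketch below uses the Faddeev-LeVerrier recursion, a standard method that is not mentioned in the talk, to obtain the coefficients of $\det(tI - A)$ in exact arithmetic, and then checks on one example that conjugation leaves them unchanged. The function names and sample matrices are assumptions for illustration.

```python
from fractions import Fraction


def matmul(A, B):
    n = len(A)
    return [
        [sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def char_poly_invariants(A):
    """Return [f_1(A), ..., f_n(A)] with
    det(tI - A) = t^n - f_1 t^(n-1) + ... + (-1)^n f_n."""
    n = len(A)
    M = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]  # M_1 = I
    invariants = []
    for k in range(1, n + 1):
        AM = matmul(A, M)
        # c is the coefficient of t^(n-k) in det(tI - A).
        c = Fraction(-sum(AM[i][i] for i in range(n)), k)
        invariants.append((-1) ** k * c)
        # Next matrix in the recursion: M_(k+1) = A M_k + c I.
        M = [[AM[i][j] + (c if i == j else 0) for j in range(n)] for i in range(n)]
    return invariants


A = [[1, 2], [3, 4]]
g = [[1, 1], [0, 1]]
g_inv = [[1, -1], [0, 1]]          # inverse of g, written out by hand
conjugate = matmul(matmul(g, A), g_inv)

print(char_poly_invariants(A) == [5, -2])                 # trace 5, determinant -2
print(char_poly_invariants(conjugate) == [5, -2])         # unchanged by conjugation
print(char_poly_invariants([[0, 1], [0, 0]]) == [0, 0])   # nilpotent matrix
```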





Degree Bounds


suppose $f_1(x), \dots, f_r(x) \in \mathbb{C}[x]^G$ are homogeneous and $\mathcal{N} = \{ v \mid f_1(v) = \cdots = f_r(v) = 0 \}$

then there exist finitely many homogeneous invariants $h_1(x), \dots, h_s(x)$ such that every invariant $p(x) \in \mathbb{C}[x]^G$ is of the form

$$p(x) = a_1(x) h_1(x) + \cdots + a_s(x) h_s(x)$$

for some $a_1(x), \dots, a_s(x) \in \mathbb{C}[f_1(x), \dots, f_r(x)]$





Degree Bounds

Theorem (Popov 1980)

suppose $f_1(x), \dots, f_r(x) \in \mathbb{C}[x]^G$ are homogeneous of the same degree $d$ and $\mathcal{N} = \{ v \mid f_1(v) = \cdots = f_r(v) = 0 \}$

then there exist finitely many homogeneous invariants $h_1(x), \dots, h_s(x)$ of degree at most $n(d - 1)$ such that every invariant $p(x) \in \mathbb{C}[x]^G$ is of the form

$$p(x) = a_1(x) h_1(x) + \cdots + a_s(x) h_s(x)$$

for some $a_1(x), \dots, a_s(x) \in \mathbb{C}[f_1(x), \dots, f_r(x)]$

in particular, $\beta(\mathbb{C}[x]^G) \le \max\{ d, n(d - 1) \} \le nd$ (because $\mathbb{C}[x]^G = \mathbb{C}[f_1(x), \dots, f_r(x), h_1(x), \dots, h_s(x)]$)






Polynomial Degree Bounds

Theorem (D.)

suppose $f_1(x), \dots, f_r(x) \in \mathbb{C}[x]^G$ are homogeneous of degree at most $d$ and $\mathcal{N} = \{ v \mid f_1(v) = \cdots = f_r(v) = 0 \}$

then we have

$$\beta(\mathbb{C}[x]^G) \le \tfrac{3}{8} n d^2$$

for binary forms of degree $n$, the null cone is defined by homogeneous invariants of degree $\le 2n^3$, and we get $\beta(\mathbb{C}[x]^G) \le \tfrac{3}{8} (n + 1)(2n^3)^2$ (which is slightly worse than Jordan’s bound)

if $G$ is fixed then the bound on $\beta(\mathbb{C}[x]^G)$ is polynomial in $n$ (the dimension of $V$) and the largest euclidean length among the weights appearing in the representation
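To compare the two bounds for binary forms, the arithmetic can be spelled out: $\tfrac{3}{8}(n + 1)(2n^3)^2 = \tfrac{3}{2}(n + 1) n^6$, against Jordan's $n^6$. A minimal Python sketch with hypothetical function names:

```python
from fractions import Fraction


def jordan_bound(d):
    """Jordan's bound d^6 for binary forms of degree d."""
    return d ** 6


def null_cone_bound(dim, d):
    """The bound (3/8) * dim * d^2, where dim is the dimension of V and d bounds
    the degrees of invariants defining the null cone."""
    return Fraction(3, 8) * dim * d ** 2


def binary_form_bound(n):
    """The bound for binary forms of degree n: dim V = n + 1 and d = 2 n^3."""
    return null_cone_bound(n + 1, 2 * n ** 3)


for n in (2, 3, 4):
    print(n, jordan_bound(n), binary_form_bound(n))
# 2 64 288
# 3 729 4374
# 4 4096 30720
```

The ratio of the two bounds is $\tfrac{3}{2}(n + 1)$, so the second bound loses only a factor linear in $n$.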








Polarization

$V$ representation of $G$

$\mathbb{C}[x] = \mathbb{C}[x_1, \dots, x_n]$ ring of polynomial functions on $V = \mathbb{C}^n$

$\mathbb{C}[x, y]$ ring of polynomial functions on $V \oplus V = \mathbb{C}^{2n}$

for $f(x) \in \mathbb{C}[x]^G$ we can write

$$f(x_1 + t y_1, \dots, x_n + t y_n) = f_0(x, y) + f_1(x, y) t + \cdots + f_d(x, y) t^d$$

where $f_0(x, y), \dots, f_d(x, y) \in \mathbb{C}[x, y]^G$
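Polarization can be carried out mechanically by evaluating $f$ on inputs that are polynomials in $t$. The sketch below does this for the $\mathrm{O}_2$-invariant $f(x) = x_1^2 + x_2^2$, using a small helper class `TPoly` for polynomials in $t$. The class, the function names and the sample vectors are all assumptions for illustration. The coefficients come out as $f_0 = x \cdot x$, $f_1 = 2\, x \cdot y$ and $f_2 = y \cdot y$ evaluated at the sample point, and they are unchanged when $x$ and $y$ are rotated simultaneously.

```python
class TPoly:
    """Polynomial in t stored as a coefficient list [c0, c1, c2, ...]."""

    def __init__(self, coeffs):
        self.coeffs = list(coeffs)

    @staticmethod
    def lift(value):
        return value if isinstance(value, TPoly) else TPoly([value])

    def __add__(self, other):
        other = TPoly.lift(other)
        n = max(len(self.coeffs), len(other.coeffs))
        a = self.coeffs + [0] * (n - len(self.coeffs))
        b = other.coeffs + [0] * (n - len(other.coeffs))
        return TPoly([x + y for x, y in zip(a, b)])

    __radd__ = __add__

    def __mul__(self, other):
        other = TPoly.lift(other)
        out = [0] * (len(self.coeffs) + len(other.coeffs) - 1)
        for i, x in enumerate(self.coeffs):
            for j, y in enumerate(other.coeffs):
                out[i + j] += x * y
        return TPoly(out)

    __rmul__ = __mul__


def polarize(f, x, y):
    """Coefficients [f_0(x, y), ..., f_d(x, y)] of f(x + t y) as a polynomial in t."""
    return f([TPoly([xi, yi]) for xi, yi in zip(x, y)]).coeffs


def norm_squared(v):
    return v[0] * v[0] + v[1] * v[1]


def rotate(v):
    """Rotation by 90 degrees, an element of the orthogonal group O_2."""
    return (-v[1], v[0])


x, y = (1, 2), (3, 4)
coeffs = polarize(norm_squared, x, y)
rotated = polarize(norm_squared, rotate(x), rotate(y))

print(coeffs)   # [5, 22, 25]
print(rotated)  # [5, 22, 25]
```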


Polarization

$R[m] := \mathbb{C}[x_{1,1}, \dots, x_{n,1}, \dots, x_{1,m}, \dots, x_{n,m}]$ is the ring of polynomial functions on $V^m \cong \mathrm{Mat}_{n,m}$

if $m < s$ then we can polarize $f(x) \in R[m]^G$ to get invariants in $R[s]^G$

Theorem (Weyl)

if $s > n = \dim V$ then polarizing generators from $R[n]^G$ gives generators of $R[s]^G$.

in particular, $\beta(R[s]^G) = \beta(R[n]^G)$






Other Fields

instead of $\mathbb{C}$, we can take any algebraically closed field of characteristic 0

the degree bounds are valid for arbitrary fields of characteristic 0

we need “algebraically closed” to make geometric statements about the null cone, orbits, etc.

most statements are either false or more difficult to prove in positive characteristic







Other Fields

Weyl’s theorem is false in positive characteristic

Noether’s bound ($\beta(\mathbb{C}[x]^G) \le |G|$ for finite $G$) fails in general in positive characteristic

invariant rings of reductive groups are also finitely generated in positive characteristic, but the proof is harder (using theorems of Nagata and Haboush), and many of the geometric statements about the null cone etc. are still true

