Combinatorics - Colorado State Universityhulpke/lectures/m501/notes.pdf · is are lecture notes I prepared for a graduate Combinatorics course which I taught in Fallat Colorado State

CombinatoricsCourse Notes – MATH

Alexander HulpkeSpring

LATEX-ed on January ,

Alexander HulpkeDepartment of MathematicsColorado State University Campus DeliveryFort Collins, CO,

ese notes are accompanying my course MATH /, Combinatorics, held Fall ,Spring at Colorado State University.

© Alexander Hulpke. Copying for personal use is permitted.

Contents

Contents iii

I Basic Counting I. Basic Counting of Sequences and Sets . . . . . . . . . . . . . . . . . I. Bijections and Double Counting . . . . . . . . . . . . . . . . . . . . . I. Stirling’s Estimate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . I. e Twelvefold Way . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Some further counting tasks . . . . . . . . . . . . . . . . . . . . . . . e twelvefold way theorem . . . . . . . . . . . . . . . . . . . . . . .

II Recurrence and Generating Functions II. Power Series – A Review of Calculus . . . . . . . . . . . . . . . . . .

Generating Functions and Recursion . . . . . . . . . . . . . . . . . . II. Linear recursion with constant coecients . . . . . . . . . . . . . . .

Another example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . II. Nested Recursions: Domino Tilings . . . . . . . . . . . . . . . . . . . II. Catalan Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . II. Index-dependent coecients and Exponential generating functions

Bell Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Splitting and Products . . . . . . . . . . . . . . . . . . . . . . . . . . . Involutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

III Inclusion, Incidence, and Inversion III. e Principle of Inclusion and Exclusion . . . . . . . . . . . . . . . . III. Partially Ordered Sets and Lattices . . . . . . . . . . . . . . . . . . .

Linear extension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Lattices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

iii

iv Contents

Product of posets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . III. Distributive Lattices . . . . . . . . . . . . . . . . . . . . . . . . . . . . III. Chains and Extremal Set eory . . . . . . . . . . . . . . . . . . . . . III. Incidence Algebras and Mobius Functions . . . . . . . . . . . . . . .

Mobius inversion . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

IV Connections IV. Halls’ Marriage eorem . . . . . . . . . . . . . . . . . . . . . . . . . IV. Konig’s eorems – Matchings . . . . . . . . . . . . . . . . . . . . . .

Stable Matchings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . IV. Menger’s theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . IV. Max-ow/Min-cut . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Braess’ Paradox . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

V Partitions and Tableaux V. Partitions and their Diagrams . . . . . . . . . . . . . . . . . . . . . . V. Some partition counts . . . . . . . . . . . . . . . . . . . . . . . . . . . V. Pentagonal Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . V. Tableaux . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

e Hook Formula . . . . . . . . . . . . . . . . . . . . . . . . . . . . . V. Symmetric Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . V. Base Changes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Complete Homogeneous Symmetric Functions . . . . . . . . . . . . V. e Robinson-Schensted-Knuth Correspondence . . . . . . . . . . .

Proof of the RSK correspondence . . . . . . . . . . . . . . . . . . . .

VI Symmetry VI. Automorphisms and Group Actions . . . . . . . . . . . . . . . . . .

Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Orbits and Stabilizers . . . . . . . . . . . . . . . . . . . . . . . . . . .

VI. Cayley Graphs and Graph Automorphisms . . . . . . . . . . . . . . VI. Permutation Group Decompositions . . . . . . . . . . . . . . . . . .

Block systems and Wreath Products . . . . . . . . . . . . . . . . . . . VI. Primitivity and Synchronization . . . . . . . . . . . . . . . . . . . . .

Synchronization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . VI. Enumeration up to Group Action . . . . . . . . . . . . . . . . . . . . VI. Polya Enumeration eory . . . . . . . . . . . . . . . . . . . . . . . .

e Cycle Index eorem . . . . . . . . . . . . . . . . . . . . . . . . .

Bibliography

Index

Preface

Counting is innate to man.

History of IndiaA R M A A-B

is are lecture notes I prepared for a graduate Combinatorics course which Itaught in Fall at Colorado State University.

ey are based extensively on

• Peter Cameron, Combinatorics [Cam]• Peter Cameron, J.H. van Lint, Designs, Graphs, Codes and their Links [CvL]• Chris Godsil, Gordon Royle, Algebraic Graph eory [GR]• Rudolf Lidl, Harald Niederreiter, Introduction to Finite Fields and their Ap-

plications [LN]• Richard Stanley, Enumerative Combinatorics I&II [Sta, Sta]• J.H. van Lint, W.M. Wilson, A Course in Combinatorics [vLW]• Marshall Hall, Combinatorial eory [Hal]• Ronald Graham, Donald Knuth, Oren Patashnik, Concrete Mathematics [GKP]• Donald Knuth, e Art of Computer Programming, Volume [Knu]• Philip Reichmeider, e equivalence of some combinatorial matching theo-

rems [Rei]

Fort Collins, Spring Alexander Hulpke

[email protected]

v

C

I

Basic Counting

One of the basic tasks of combinatorics is to determine the cardinality of (nite)classes of objects. Beyond basic applicability of such a number – for example toestimate probabilities – the actual process of counting may be of interest, as it givesfurther insight into the problem:

• If we cannot count a class of objects, we cannot claim that we know it.

• e process of enumeration might – for example by giving a bijection be-tween classes of dierent sets – uncover a relation between dierent classesof objects.

• e process of counting might lend itself to become constructive, that is al-low an actual construction of (or iteration through) all objects in the class.

e class of objects might have a clear mathematical description – e.g. all sub-sets of the numbers , . . . , . In other situations the description itself needs to betranslated into proper mathematical language:

D I. (Derangements): Given n letters and n addressed envelopes, a de-rangement is an assignment of letters to envelopes that no letter is in the correctenvelope. How many derangement of n exist?

is particular problem will be solved in section II.We will start in this chapter by considering the enumeration of some basic con-

structs – sets and sequences. More interesting problems, such as the derangementshere, arise later if further conditions restrict to sub-classes, or if obvious descrip-tions could have multiple sequences describe the same object.

CHAPTER I. BASIC COUNTING

I. Basic Counting of Sequences and Sets

I am the sea of permutation.I live beyond interpretation.I scramble all the names and the combinations.I penetrate the walls of explanation.

Lay My LoveB E

A sequence of length k is simply an element of the k-fold cartesian product.Entries are chosen independently, that is if the rst entry has a choices and thesecond entry b, there are a ⋅ b possible choices for a length two sequence. us,if we consider sequences of length k, entries chosen from a set A of cardinality n,there are nk such sequences.

is allows for duplication of entries, but in some cases – arranging objectsin sequence – this is not desired. In this case we can still choose n entries in therst position, but in the second position need to avoid the entry already chosen inthe rst position, giving n − options. (e number of options is always the same,the actual set of options of course depends on the choice in the rst position.) enumber of sequences of length k thus is (n)k = n ⋅ (n− ) ⋅ (n−) ⋅ ⋅ (n− k+ ) =

n!(n−k)! , called “n lower factorial k”.is could be continued up to a sequence of length n (aer which all n element

choices have been exhausted). Such a sequence is called a permutation of A, thereare n! = n ⋅ (n − ) ⋅ ⋅ ⋅ such permutations.

Next we consider sets. While every duplicate-free sequence describes a set, se-quences that have the same elements arranged in dierent order describe the sameset. Every set of k elements from n thus will be described by k! dierent duplicate-free sequences. To enumerate sets, we therefore need to divide by this factor, and getfor the number of k-element sets from n the count given by the binomial coecient

nk = (n)k

k!= n!(n − k)!k!

Note (using a convention of ! = ) we get that n = nn = . It also can be conve-nient to dene nk = for k < or k > a.

Binomial coecients also allow us to count the number of compositions of anumber n into k parts, that is to write n as a sum of exactly k positive integers withordering being relevant. For example = + = + + + are the possiblecompositions of into parts.

To get a formula of the number of possibilities, write the maximum composi-tion

n = + + + + Warning: e notation (n)k has dierent meaning in other areas of mathematics!

I.. BIJECTIONS AND DOUBLE COUNTING

which has n − plus-signs. We obtain the possible compositions into k parts bygrouping summands together to only have k summands. at is, we designate k− plus signs from the given n− possible ones as giving us the separation. e numberof possibilities thus is n−

k−.If we also want to allow summands of when writing n as a sum of k terms, we

can simply assume that we temporarily add to each summand. is guaranteesthat each summand is positive, but adds k to the sum. We thus count the numberof ways to express n + k as a sum of k summands which is by the previous formulan+k−

k− = n+k−n .

Example: Check this for n = and k = .is count is particular relevant as it also counts multisets, that is sets of n ele-

ments from k, in which we allow the same element to appear multiple times. Eachsuch set is described as a sum of k terms expressing n, the i-th term indicating howoen the i-th element is in the set.

Swapping the role of n and k, We denote by nk = n+k−k the number of

k-element multisets chosen from n possibilities.

e results of the previous paragraphs are summarized in the following theo-rem:

T I.: e number of ways to select k objects from a set of n is given by thefollowing table:

Repetition No RepetitionOrder signicant nk (n)k

(sequences)

Order not signicant n + k − k = n

k n

k

(sets)

N I.: Instead of using the word signicant some books talk about ordered orunordered sequences. I nd this confusing, as the use of ordered is opposite that ofthe word sorted which has the same meaning in common language. We thereforeuse the language of signicant.

I. Bijections and Double Counting

As long as there is no double counting, Section(a) adopts the principle of the recent casesallowing recovery of both a complainants actuallosses and a misappropriator’s unjust benet. . .

Dra of Uniform Trade Secrets ActA B A


Figure I.: A × grid

In this section we consider two further important counting principles that canbe used to build on the basic constructs.

Instead of counting a set A of objects directly, it might be easiest to establisha bijection to another set B, that is a function f ∶A → B which is one-to-one (alsocalled injective) and onto (also called surjective). Once such a function has beenestablished we know that A = B and if we know B we thus have counted A.

As an example, consider the following problem: We have an n×n grid of pointswith horizontal and vertical connections (depicted in gure I. for × ) and wantto count the number of dierent paths from the bottom le, to the top right corner,that only go right or up.

Each such path thus has exactly n− right steps, and n− up steps. We thus (thisalready could be considered as one bijection) could count instead sequences (is right, is up) of length n − that contain exactly n − ones (and zeros). Denotethe set of such sequences by A.

To determine A, we observe that each sequence is determined uniquely by thepositions of the ones and there are exactly n− of them. us let B the set of all n−element subsets of , . . . , n − .

We dene f ∶A→ B to assign to a sequence the positions of the ones, f ∶ a , . . . , an− i ai = .As every sequence has exactly n − ones, indeed f goes from A to B. As the

positions of the ones dene the sequence, f is injective. And as we clearly can con-struct a sequence which has ones in exactly n − given positions, f is surjective aswell. us f is a bijection.

We know that B = n−n− , this is also the cardinality of A.

Example: Check this for n = , n = .

e second useful technique is to “double counting”, counting the same set intwo dierent ways (which is a bijection from a set to itself). Both counts must givethe same result, which can oen give rise to interesting identities. e followinglemma is an example of this paradigm:

L I. (Handshaking Lemma): At a convention (where not everyone greetseveryone else but no pair greets twice), the number of delegates who shake handsan odd number of times is even.

I.. BIJECTIONS AND DOUBLE COUNTING

Proof: we assume without loss of generality that the delegates are given by , . . . , n.Consider the set of handshakes

S = (i , j) i and j shake hands. .

We know that if (i , j) is in S, so is ( j, i). is means that S is even, S = y wherey is the total number of handshakes occurring.

On the other hand, let xi be the number of pairs with i in the rst position. Wethus get∑ xi = S = y. If a sum of numbers is even, there must be an even numberof odd summands.

But xi is also the number of times that i shakes hands, proving the result.

About binomial theorem I’m teeming with a lot o’ news,With many cheerful facts about the square of the hypotenuse.

e Pirates of PenzanceW.S. G

e combinatorial interpretation of binomial coecients and double countingallows us to easily prove some identities for binomial coecients (which typicallyare proven by induction in undergraduate classes):

P I.: Let n, k be nonnegative integers with k ≤ n. en:

a) nk = n

n − k.b) kn

k = nn −

k − .

c) n + k = n

k − + n

k (Pascal’s triangle).

d)n

k=nk = n .

e)n

k=nk = n

n.

f) ( + t)n = nk=nktk (Binomial eorem).

Proof: a) Instead of selecting a subset of k elements we could select the n − k ele-ments not in the set.b) Suppose we want to count committees of k people (out of n) with a designatedchair. We can do so by either choosing rst the nk committees and then for each


team the k possible chairs out of the committee members. Or we choose rst the npossible chairs and then the remaining k − committee members out of the n − remaining persons.c) Suppose that I am part of a group that contains n+ persons and we want to de-termine subsets of this group that contain k people. ese either include me (andk − further persons from the n others), or do not include me and thus k from then other people.d) We count the total number of subsets of a set with n elements. Each subset canbe described by a sequence of length n, indicating whether the i-th element isin the set.e) Suppose we have n men and n women and we want to select groups of n persons.is is the right hand side. e le hand side enumerates separately the options withexactly k men, which is nk n

n−k = nk by a).f) Clearly (+ t)n is a polynomial of degree n. e coecient for tk gives the num-ber of possibilities to choose the t-summand when multiplying out the product

( + t)( + t) ⋅ ⋅ ( + t)of n factors so that there are k such summands overall. is is simply the numberof k-subsets, nk. Example: Prove the theorem using induction. Compare the eort. Which methodgives you more insight?

I. Stirling’s Estimate

Since the factorial function is somewhat unhandy in estimates, it can be useful tohave an approximation in terms of elementary functions. e most prominent ofsuch estimates if probably given by Stirling’s formula:

T I.:n! ≈√πn n

en + O

n . (I.)

Here the factor + O(n) means that the quotient of estimate and real valueis bounded by ± n. Figure I. shows a plot of the ratio of the estimate to n!.Proof: Consider the natural logarithm of the factorial:

log(n!) = log + log + + log n

If we dene a step function L(x) = log(x + ), we thus have that log(n!) is anintegral of L(x). We also know that

I(x) = ∫ x

log(t)dt = x log x − x .

all logarithms in this book are natural, unless stated dierently

I.. STIRLING’S ESTIMATE

Figure I.: Plot of√

πn ne n n!

We thus need to consider the integral over L(x) − log(x). To avoid divergenceissues, we consider this in two parts. Let

ak =

log k − ∫ k

k−

log xdx = ∫ k

k−

log(kx)dx ,

andbk = ∫ k+

klog xdx −

logk = ∫ l+

klog(xk)dx .

en

Sn = a − b + a − b + + an = log n! −

log n + I(n) − I( ).

A substitution gives

ak = ∫

log

− (tk)dt, bk = ∫

log( + tk)dt,

from which we see that ak > bk > ak+ > . By the Leibniz’ criterion thus Snconverges to a value S and we have that

log n! − (n + ) log n + n → S − I(

).

Taking the exponential function we get that

n!→ eC√n n

en

with C = ∑∞k=(ak − bk) − I( ).


Using standard analysis techniques, one can show that eC = √π, see [Fel].Alternatively for concrete calculations, we could simply approximate the value toany accuracy desired.

e appearance of e might seem surprising, but the following example showsthat it arises naturally in this context:

P I.: e number sn of all sequences (of arbitrary length) without rep-etition that can be formed from n objects is e ⋅ n!.Proof: If we has a sequence of length k, there are nk choices of objects each ink! possible arrangements, thus k!nk = n!(n−k)! such sequences. Summing over allvalues of n − k we get

sn = nk=

n!k!= n!

nk=

k!

.

Using the Taylor series for ex we see that

e ⋅ n! − sn = n +

+ (n + )(n + ) +

< n +

+ (n + ) +

= n< .

is is an example (if we ignore language meaning) of the popular problem

how many words could be formed from the letters of a given word, if no letters areduplicate.Example: If we allow duplicate letters the situation gets harder. For example, con-sider words (ignoring meaning) made from the letters of the word COLORADO.e letter O occurs thrice, the other ve letters only once. If a word contains at mostone O, the formula from above gives∑

k=!k! = ++++++ =

such words.If the word contains two O, and k other letters there are

k options to selectthese letters and (k + )! possibility to arrange the letters (the denominator making up for the fact that both O cannot be distinguished). us we get!+ ⋅ !

+ ⋅ !

+ ⋅ !

+ ⋅ !

+ !

= + + + + + =

such words.If the word contains three O, and k other letters we get a similar formula, but

with a cofactor (k + )!, that is!+ ⋅ !

+ ⋅ !

+ ⋅ !

+ ⋅ !

+ !

= + + + ++ =

I.. THE TWELVEFOLDWAY

possibilities, summing up to possibilities in total.Lucky we do not live in MISSISSIPPI!

I. e Twelvefold Way

A generalization of counting sets and sequences is given by considering functionsbetween nite sets. We shall consider functions f ∶N → X with N = n and X = x.ese functions could be arbitrary, injective, or surjective. Furthermore, we couldconsider the elements of N and of X as distinguishable or as indistinguishable.

Note that “indistinguishable” does not mean equal. It just means that if we ex-change the role of two elements the result is considered the same.

What does indistinguishable mean formally: In this case we actually count equiv-alence classes of functions. Two functions f , g∶N → X are N-equivalent if there isa bijection u∶N → N such that f (u(a)) = g(a) for all a ∈ N . If we say the elementsof N are indistinguishable, we are counting functions up to this N-equivalence.

Similarly, we dene an X-equivalence of f and g if there is v∶X → X such thatv( f (a)) = g(a) for all a ∈ N . If we say the elements of X are indistinguishable, wecount functions up to X-equivalence.

We can combine both equivalences to get even larger equivalence classes, whichare the case of elements of both N and X being indistinguishable.

(e reader might feel this to be insuciently stringent, or wonder about thecase of dierent classes of equivalent objects. We will treat such situations in Sec-tion VI. under the framework of group actions.)

is gives in total ⋅ ⋅ = possible classes of functions. is set of countingproblems is called the Twelvefold Way in [Sta, Section .].Example: For each of the categories, give an example of a concrete counting prob-lem in common language.

Determining explicit formulae for the dierent counting problems is not eveneasy, indeed we will be able to do so aer substantial further work in the followingchapters.

Some further counting tasks

A partition of a set A is a collection Ai of subsets (called parts or cells) = Ai ⊂ Asuch that for all i:

• iAi = A

• Ai ∩ Aj = for j = i.Note that a partition of a set gives an equivalence relation and that any equiva-

lence relation on a set denes a partition into equivalence classes.


D I.: We denote the number of partitions of , . . . , n into k (non-empty) parts by S(n, k). It is called the Stirling number of the second kind OEIS Ah .e total number of partitions of , . . . , n is given by theBell number OEIS A

Bn = nk=

S(n, k).Example: ere are B = partitions of the set , , .

Again we might want to set S(n, k) = unless ≤ k ≤ n.We will give a formula for S(n, k) in Lemma II. and study B(n) in section II..

In some cases we shall care not which numbers are in which parts of a partition,but only the size of the parts. We denote the number of partitions of n into k partsby pk(n), the total number of partitions by p(n) OEIS A .

Again we study the function p(n) later, however here we shall not achieve aclosed formula for the value.

e twelvefold way theorem

We now extend the table of theorem I.:

T I.: if N , X are nite sets with N = n and X = x, the number of(equivalence classes of) functions f ∶N → X is given by the following table. (In therst two columns, d/i indicates whether elements are considered distinguishable orindistinguishable. e boxed numbers refer to the explanations in the proof.):

N X f arbitrary f injective f surjective

d d xn (x)n x!S(n, x)i d x

n x

n n −

x − = x

n − x

d i xk=

S(n, k) if n ≤ x if n > x S(n, x)

i i xk=

pk(n) if n ≤ x if n > x px(n)

Proof: If N is distinguishable, we can simply write the elements of N in a row andconsider a function on N as a sequence of values. In ), we thus have sequences

ere also is a Stirling number of the rst kind

I.. THE TWELVEFOLDWAY

of length n with x possible values, in ) such sequences without repetition, bothformulas we already know.

If such a sequence takes exactly x values, each value f (n) can be taken to in-dicate the cell of a partition into x parts, into which n is placed. As we consider apartition as a set of parts, it does not distinguish the elements of X, that shows thatthe value in ) has to be S(n, x). If we distinguish the elements of X we need toaccount for the x! possible arrangements of cells, yielding the value in ).

Similarly to ), if we do not require f to be surjective, the number of dierentvalues of f gives us the number of parts. Up to x dierent parts are possible, thuswe need to add the values of the Stirling numbers.

To get ) from ) and ) from ) we notice that making the elements of Nindistinguishable simply means that we only care about the sizes of the parts, notwhich number is in which part. is means that the Stirling number S(n, x) getsreplaced by the partition (shape) count px(n).

If we again start at ) but now consider the elements of N as indistinguishable,we go from sequences to sets. If f is injective we have ordinary sets, in the generalcase multisets, and have already established the results of ) and ).

For ), we interpret the x distinct values of f to separate the elements of Ninto x parts. is is a composition of n into x parts, for which the count has beenestablished.

In ) and ), nally, injectivity demands that we assign all elements of n to dif-ferent values which is only possible if n ≤ x. As we do not distinguish the elementsof X it does not matter what the actual values are, thus there is only one such func-tion up to equivalence.

C

II

Recurrence and GeneratingFunctions

Finding a close formula for a combinatorial counting function can be hard. It oenis much easier to establish a recursion, based on a reduction of the problem. Such areduction oen is the principal tool when constructing all objects in the respectiveclass.

An easy example of such a situation is given by the number of partitions of n,given by the Bell number Bn :

L II.: For n ≥ we have:

Bn = nk=n − k − Bn−k

Proof: Consider a partition of , . . . , n. Being a partition, it must have in onecell. We group the partitions according to how many points are in the cell contain-ing . If there are k elements in this cell, there are n−

k− options for the other pointsin this cell. And the rest of the partition is simply a partition of the remaining n− kpoints.

II. Power Series – A Review of Calculus

A powerful technique for working with recurrence relations is that of generatingfunctions. e denition is easy, for a counting function f ∶Z≥ → Q dened onthe nonnegative integers we dene the associated generating function as the powerseries

F(t) = ∞n=

f (n)tn ∈ R[[t]]

CHAPTER II. RECURRENCE AND GENERATING FUNCTIONS

Here R[[t]] is the ring of formal power series in t, that is the set of all formal sums∑∞n= an tn i with an ∈ R.

When writing down such objects, we ignore the question whether theseries converges, i.e. whether F can be interpreted as a function on asubset of the real numbers.

Addition and multiplication are as one would expect: If F(t) = ∑n≥ f (n)tnand G(t) = ∑n≥ g(n)tn we dene new power series F +G and F ⋅G by:

(F +G)(t) ∶= n≥( f (n) + g(n))tn

(F ⋅G)(t) ∶= n≥ a+b=n

f (a)g(b) tn .

With this arithmetic R[[t]] becomes a commutative ring. (ere is no convergenceissue, as the sum over a + b = n is always nite.)

We also dene two operators, called dierentiation and integration on R[[t]]by

ddtn≥

an tn =n≥(nan)tn−

and

∫ n≥

an tn = n≥

ann +

tn+ ,

Two power series are equal if and only if all coecients are equal, an identityinvolving a power series as a variable is called a functional equation.

As a shorthand we dene the following names for particular power series:

exp(t) = n≥

tn

n!

log( + t) = n≥

(−)n− tn

n

and note that the usual functional identities exp(a+b) = exp(a) exp(b), log(ab) =log(a) + log(b), log(exp(t)) = t and dierential identities d

dt exp(t) = exp(t),ddt log(t) = t hold.

All of this can either be proven directly for power series, or we choose t in the interval of conver-gence, use results from Calculus, and obtain the result from the uniqueness of power series representa-tions.

II.. POWER SERIES – A REVIEW OF CALCULUS

A telescoping argument shows that for r ∈ R we have that (−rt) (∑n rn tn) = ,that is we can use the geometric series identity

nrn tn =

− rtto embed rational functions (whose denominators factor completely; this can betaken as given if we allow for complex coecients) into the ring of power series.

For a real number r, we dene

rn = r(r − )(r − n + )

n!tn

as well as( + t)r =

n≥rn.

For obvious reasons we call this denition the binomial formula. Note that for inte-gral r this agrees with the usual denition of exponents and binomial coecients,so there is no conict with the traditional denition of exponents.

We also notice that for arbitrary r, s we have that ( + t)r( + t)s = ( + t)r+sand d

dt ( + t)r = r( + t)r−.

A power series may (if t is chosen in the domain of convergence, but that is notsomething for us to worry about) be used to dene a function on the real num-bers and an identity amongst power series, involving dierentiation, then could beinterpreted as a dierential equation. e uniqueness of solutions for dierentialequations then implies that the power series for a solution of the dierential equa-tion must be the unique power series satisfying this equation. Or, in other words:

Wecanuse dierential equations and their tabulated solutions as a cannedtool for equations amongst power series.

Generating Functions and Recursion

Up to this point, generating functions seem to be a formality for formalities sake.e crucial observation now is that an index shi can be accomplished by multi-plication with t. A recurrence relation thus becomes a functional equation for thegenerating function. In many important cases we can solve this functional equationusing knowledge from Analysis, and then determine a formula for the coecientsof the generating function using Taylor’s theorem.

A recursion with coecients depending on the index, similarly leads to a dif-ferential equation, and is treated in the same way.

Instead of dening this formally, it is probably easiest to consider an exampleto describe a general method:

Unless r is an integer, this is a denition in the ring of formal power series.


We dene a function recursively by setting f () = and f (n + ) = f (n).(is recursion comes from the number of subsets of a set of cardinality n – x oneelement and distinguish between subsets containing this element and those thatdon’t.) e associated generating function is F(t) = ∑n≥ f (n)tn . We now observethat

tF(t) = n≥

f (n)tn+ = n≥

f (n + )tn+ = F(t) − .

We solve this functional equation as

F(t) = − t

and (geometric series!) obtain the power series representation

F(t) = n≥

n tn .

this allows us to conclude that f (n) = n , solving the recursion (and giving us thecombinatorial result we knew already that a set of cardinality n has n subsets.)

II. Linear recursion with constant coecients

en, at the age of forty, you sit,theologians without Jehovah,hairless and sick of altitude,in weathered suits,before an empty desk,burned out, oh Fibonacci,oh Kummer, oh Godel, oh Mandelbrotin the purgatory of recursion.

also means: “grief ”

Dann, mit vierzig, sitzt ihr,o eologen ohne Jehova,haarlos und hohenkrankin verwitterten Anzugenvor dem leeren Schreibtisch,ausgebrannt, o Fibonacci,o Kummer, o Godel, o Mandelbrot,im Fegefeuer der Rekursion.

Die MathematikerH M E

Suppose that f (n) satises a recursion of the form

f (n) = a f (n − ) + a f (n − ) + + ak f (n − k).at is, there is a xed number of recursion terms and each term is just a scalarmultiple of a prior value (we shall see below that easy cases of index dependencealso can be treated this way). We also assume that k initial values f (),, f (k− )have been established.

e most prominent case of this are clearly the Fibonacci numbers OEIS Awith k = , f () = f () = , and we shall use these as an example.

II.. LINEAR RECURSIONWITH CONSTANT COEFFICIENTS

Step : Get the functional equation Using the recursion, expand the coecientf (n) in the generating function F(t) = ∑n f (n)tn with terms of lower index. Notethat for n < k the recursion does not hold, you will need to look at the initial valuesto see whether the given formula suces, or if you need to add explicit multiplesof powers of t to get equality.

Separate summands into dierent sums, factor out powers of t to get f (n) com-bined with tn .

Replace all expressions∑ f (n)tn back with the generating function F(t). ewhole expression also must be equal to F(t), this is the functional equation.Example: in the case of the Fibonacci numbers, the recursion is f (n) = f (n − ) +f (n − ) for n > . us we get (using the initial values f () = f () = that

n≥

f (n)tn = n>( f (n − )tn + f (n − )tn) + f ()t + f ()

= n>

f (n − )tn +n>

f (n − )tn + t +

= tn>

f (n − )tn− + tn>

f (n − )tn− + t +

= tn>

f (n)tn + t n≥

f (n)tn + t +

= t(n≥

f (n)tn − f ()) + t n≥

f (n)tn + t +

= t ⋅ F(t) − t ⋅ f () + tF(t) + t + = t ⋅ F(t) + tF(t) + .

e functional equation is thus F(t) = t ⋅ F(t) + tF(t) + .

Step : Partial Fractions We can solve the functional equation to express F(t)as a rational function in t. Using partial fractions (Calculus ) we can write this asa sum of terms of the form

ai(t − α i)ei .

Example: We solve the functional equation to give us F(t) = − t − t . For a partial

fraction decomposition, let α = −+√ , β = −−√

the roots of the equation − t −t = . en

F(t) = − t − t = a

t − α +b

t − βWe solve this as a = −√, b = ().Step :Use knownpower series to express each summandas apower series egeometric series gives us that

at − α = n≥

−aαn+ t

n


If there are multiple roots, denominators could arise in powers. For this we noticethat

(t − α) =

n≥

(n + )αn+ tn

and for integral c ≥ that

(t − )c = −c

n≥c + n −

ntn

Using these formulae, we can write each summand of the partial fraction de-composition as an innite series.Example: In the example we get

F(t) = −√

t − α +

√

t − β

= −√n≥

−αn+ t

n + √n≥

−βn+ t

n

Step: Combine toone sum, and reado coecients We now take this (unique!)power series expression and read o the coecients. e coecient of tn will bef (n), which gives us an explicit formula.Example: Continuing the calculation above, we get

F(t) = n≥ √

⋅ αn++ −√

⋅ βn+ tn

and thus a closed formula for the Fibonacci numbers:

f (n) = √ ⋅ αn+

+ −√ ⋅ βn+

= √ ⋅ −+√

n+ + −√ ⋅ −−√

n+

= √

√ − n+ + √

+ n+

We notice that√

− > √

+ > , thus asymptotically

f (n + ) f (n) = √ −

= ≈ .

the value of the golden ratio.

II.. NESTED RECURSIONS: DOMINO TILINGS

Another example

We try another example. Take the (somewhat random) recursion given by

g() = g() = g(n) = g(n − ) + ∗ g(n − ) + (−)n , for n ≥

We get for the generating function

G(t) = ng(n)tn =

n≥g(n − )tn +

n≥g(n − )tn +

n≥ (−)nzn + z

= tG(t) + tG(t) + + t + t.

(you should verify that the addition of z was all that was required to resolve theinitial value settings.)

We solve this functional equation as

G(t) = + t + t

( − t)( + t)

and get the partial fraction decomposition

G(t) = −(t −

) −

(t + ) +

(t + ) .

We can read o the power series representation

G(t) = n n+ tn +

n (−)n+ tn + n −n(n + )tn

= n

n −

(−)n +

(−)n(n + ) tn

= n

n +

n +

(−)n tn ,

solving the recursion as g(n) = n +

n + (−)n .

II. Nested Recursions: Domino Tilings

e Domino eory had become conventionalwisdom and was rarely challenged.

DiplomacyH K

We apply the method of generating functions to some counting problems.


Figure II.: A × domino tiling

Figure II.: A tiling pattern that has no vertical cut

Suppose we have tiles that have dimensions × (in your favorite units) and wewant to tile a corridor. Let f (n) the number of possible tilings of a corridor that hasdimensions × n. We can start on the le with a vertical domino (thus leaving tothe right of it a tiling of a corridor of length n− ) or with two horizontal dominoes(leaving to the right of it a corridor of length n − ). this gives us the recursion

d(n) = d(n − ) + d(n − ), n >

with d() = and d() = (and thus d() = to t the recursion). is is justagain the Fibonacci numbers we have already worked out.

If we assume that the tiles are not symmetric, there are actually ways to placea horizontal tile and ways to place a vertical tile. We thus get a recursion withdierent coecients,

d(n) = d(n − ) + d(n − ), n >

with d() = , d() = (and thus d() = ).

If we expand the corridor to dimensions ×n , the recursion seems to be prob-lematic – it is possible to have patterns or arbitrary length that do not reduce to ashorter length, see gure II..

We thus instead look only at the le end of the tiling and persuade ourselves(going back to the assumption of symmetric tiles) that every tiling has to start withone of the patterns depicted in the top row of gure II.. We thus consider corridorsthat not only start with a straight line, but also corridors that start with the top le(respectively bottom le) tile missing. e count of these tilings will be given bya function e(n). (Note that both have the same counting function because we canip top/bottom.)

is gives us the recursion

d(n) = d(n − ) + ⋅ e(n − )

II.. NESTED RECURSIONS: DOMINO TILINGS

Figure II.: Variants for a × n tiling

In the bottom row of gure II., we enumerate the possible ways how a tilingwith missing corner can start, giving us the recursion

e(n) = d(n − ) + e(n − )We use the initial values d() = , d() = , d() = , e() = , e() = ,

e() = , and thus get the following identities for the associate generating functions

D(t) = nd(n)tn =

n>(d(n − ) + e(n − )) tn + d() + d()t

= tn>

d(n − )tn−n + tn>

e(n − )tn− +

= t n≥

d(n)tn + t(n≥

e(n)tn − e()t) +

= tD(t) + tE(t) +

and

E(t) = ne(n)tn =

n>(d(n − ) + e(n − )) tn + e() + e()t

= t(D(t) − d()) + tE(t) + t = tD(t) + tE(t)We solve this second equation as

E(t) = t − t D(t)

and substitute into the rst equation, obtaining the functional equation

D(t) = tD(t) + t

− t D(t) +

which we solve forD(t) = − t

− t + t


is is a function in t, indicating that d(n) = for odd n (indeed, this must be,as in this case × n is odd and cannot be tiled). We thus can consider instead thefunction

R(t) = − t − t + t =

nr(n)tn

with d(n) = r(n). Partial fraction decomposition, geometric series, and nal col-lection of coecients gives us the formula

d(n) = r(n) = ( +√

)n −√

+ ( −√

)n +√

and the sequence of r(n) given by OEIS A

, , , , , , , , , , , . . .

II. Catalan Numbers

e induction I used was pretty tedious,but I do not doubt that this result couldbe obtained much easier. Concerning theprogression of the numbers , , , , ,, etc. . . .

Die Induction aber, so ich gebraucht,war ziemlich muhsam, doch zweie ichnicht, dass diese Sach nicht sollte weitleichter entwickelt werden konnen.Ueber die Progression der Zahlen , ,, , , , etc. . . .

Letter to GoldbachSeptember , L E

We next look at an example of a recursion which is not just linear.

D II.: e n-th Catalan number Cn OEIS A is dened as thenumber of ways a sum of n+ variables can be evaluated by inserting parentheses.

E II.: We have C = C = , C = : (a+ b)+ c and a+ (b+ c), and C = :

((a + b) + c) + d(a + (b + c)) + da + ((b + c) + d)a + (b + (c + d))(a + b) + (c + d)

Named in honor of Eugene Charles Catalan (-) who rst stated the standard formula.e naming aer Catalan only stems from a book, see http://www.math.ucla.edu/

~

pak/

papers/cathist4.pdf. Catalan himself attributed them to Segner, though Euler’s work is even earlier.Careful, some books use a shied index, starting at only!

II.. CATALAN NUMBERS

To get a recursion, we consider the “outermost” addition, assume thus comesaer k+ of the variables. is addition combines two proper parenthetical expres-sions, the one on the le on k+ variables, the one on the right on (n+)−(k+) =n − k variables. We thus get the recursion

Cn = n−k=

CkCn−k− , if n > , C = .

in which we sum over the products of lower terms.is pattern is in fact typical for the way we reduced:We classify the set of objects according to a parameter (the outermost addition

at k), and then recurse to two parts. More generally, if we had a count d(n) forwhich we can split the objects into two parts, according to a smaller parameter k,where one is counted by a(n) and the other by b(n), we would get a formula

d(n + ) = nk=

a(k)b(n − k)is looks suspiciously like the formula for the multiplication of polynomials (orpower series) and indeed, if the associated generating functions are D(t), A(t) andB(t), we would get that D(t) = A(t)B(t). (Initial values might make the recursiononly work starting from a minimum value of n, and so there might be fudge factorsto consider.)

Indeed this is what happens here – we get the functional equation

C(t) = t ⋅ C(t) + .

(which is easiest seen by writing out the expression for tC(t) and collecting terms).e factor t is due to the way we index and is due to initial values.

is is a quadratic equation in the variable C(t) and we get:

C(t) = −√ − tt

,

where we choose the solution for which for t = we get the desired value of C.e binomial series then gives us that

√ − t =

k≥k(−t)k = +

k≥

k−k − (−t)k .

We conclude that

−√ − tt

= k≥

k−k − (−t)k−

= n≥nn tn

n + ,


in other words:Cn = nn

n +

.

Catalan numbers have many other combinatorial interpretations, and we shallencounter some in the exercises. Exercise . in [Sta] (and its algebraic contin-uation ., as well as an online supplement) contain hundreds of combinatorialinterpretations.

II. Index-dependent coecients and Exponential generatingfunctions

A recursion does not necessarily have constant coecient, but might have a coe-cient that is a polynomial in n. In this situation we can use (formal) dierentiation,which will convert a term f (n)tn into n f (n)tn−. e second derivative will givea term n(n − ) f (n)tn−; rst and second derivative thus allow us to construct acoecient n f (n) and so on for higher order polynomials.

e functional equation for the generating function then becomes a dierentialequation, and we might hope that a solution for it can be found in the extensiveliterature on dierential equations.

Alternatively, (that is we use the special form of derivatives for a typical sum-mand), such a situation oen can be translated immediately to a generating func-tion by using the power series

( − t)k+ =

nn + k

ntn .

For an example of variable coecients, we take the case of counting derange-ments OEIS A , that is permutations that leave no point xed. We denoteby d(n) the number of derangements on , . . . , n.

To build a recursion formula, suppose that π is a derangement on , . . . , n.en nπ = i < n. We now distinguish two cases, depending on how i is mapped byn:a) If iπ = n, then π swaps i and n and is a derangement of the remaining n − points, thus there are d(n − ) derangements that swap i and n. As there are n − choices for i, there are (n − )d(n − ) derangements that swap n with a smallerpoint.b) Suppose there is j = i such that jπ = n. In this case we can “bend” π into anotherpermutation ψ, by setting

kψ = kπ if k = ji if k = j .

We notice that ψ is a derangement of the points , . . . , n − .

II.. INDEX-DEPENDENT COEFFICIENTS AND EXPONENTIALGENERATING FUNCTIONS

Vice versa, if ψ is a derangement on , . . . , n− , and we choose a point j < n,we can dene π on , . . . , n by

kπ =

kψ if k = i , nn if k = jjψ if k = n .

We again notice that π is a derangement, and that dierent choices ofψ and i resultin dierent π’s. Furthermore, the two constructions are mutually inverse, that isevery derangement that is not in class a) is obtained by this construction.

ere are d(n− ) possible derangements ψ and n− choices for j, so there are(n − )d(n − ) derangements in this second class.

We thus obtain a recursion

d(n) = (n − ) (d(n − ) + d(n − ))and hand-calculate the initial values d() = and d() = .

We now could build a dierential equation for the generating function, butthere is a problem: Because of the factor n in the recursion, the coecients d(n)grow roughly like n!, but a series whose coecients grow that quickly will haveconvergence radius , making it unlikely that the solution exists in the literature ondierential equations.

We therefore introduce the exponential generating function which is denedsimply by dividing the i-th coecient by a factor of i!, thus keeping coecientgrowth bounded.

In our example, we get

D(t) =n

d(n)n!

tn

and thusddt

D(t) =n≥

n ⋅ d(n)n!

tn− =n≥

d(n)(n − )! tn− =

n≥

d(n + )n!

tn .

We also have

t ⋅ D(t) = n≥

d(n) tn+

n!=

n≥(n + )d(n) tn+

(n + )!=

n≥n ⋅ d(n − ) tn

n!=

n≥n ⋅ d(n − ) tn

n!

t ⋅ D′(t) = n≥

n ⋅ d(n)n!

tn

e reader might have an issue with the choice of d() = , as it is unclear what a derangementon no points is. By going backwards to the recursion (using d() and d() to calculate d()), it turnsout that this is the right number to make the recursion work in case d() is referred to.


From this, the recursion (in the form d(n + ) = n(d(n) + d(n − ))) gives us:

t ⋅ D(t) + t ⋅ D′(t) = n≥(n ⋅ d(n − ) + n ⋅ d(n)) tn

n!

= n≥

d(n + ) tnn!= D′(t),

and thus the separable dierential equation

D′(t)D(t) =

t − t .

with D() = d() = .Standard techniques from Calculus give us the solution

D(t) = e−t − t .

Looking up this function for Taylor coecients (respectively determining the for-mula by induction) shows that

D(t) = n≥ ni=

(−)ii! tn

and thus (introducing a factor n! to make up for the denominator in the generatingfunction) that

d(n) = n! ni=

(−)ii! .

is is simply n! multiplied with a Taylor approximation of e−. Indeed, if weconsider the dierence, the theory for alternating series gives us for n ≥ that:

d(n) − n!e = n! ∞

i=n+

(−)ii!

< n! (−)n+

(n + )! =

n + ≤

We have proven:

L II.: d(n) is the integer closest to n!e.

at is asymptotically, if we put letters into envelopes, the probability is e thatno letter is in the correct envelope.


Bell Numbers

∫√

zdz × cos π

= log( √e)

e integral z-squared dz,From one to the cube root of three,Times the cosine,Of three pi over nineEquals log of the cube root of e.

A.

A second use of generating functions is that in some cases it might be possible togive a reasonable (exponential) generating function for a counting function whichitself has no closed form expression.

We go back to the Bell numbers Bn , dened as giving the total number of par-titions of , . . . , n. e recursion is built by what now is a standard argument:

Given a partition X, consider the cell containing . It has k elements in total,these are chosen from the n − numbers > . e remaining cells form a partitionof the n − k numbers outside the cell. us, for n ≥ , we have:

Bn = nk=n − k − Bn−k

Let F(t) = ∑n Bn tnn! be the exponential generating function of the Bn . en

ddt

F(t) = n≥

Bntn−

(n − )! =n≥ nk=n − k − Bn−k tn−

(n − )!=

n≥

nk=

tk−

(k − )!Bn−k tn−k(n − k)!

= i≥

t i

i! ⋅ j≥

Bjt j

j! = exp(t)F(t).

(using a variable substitution i = k − , j = n − k and switching the summationorder).

is is again a separable dierential equation, its solution is

F(t) = c ⋅ exp(exp(t))for some constant c. As F() = we solve for c = exp(−) and thus get the expo-nential generating function

n

Bn tn

n!= exp(exp(t) − ).


Figure II.: e Genji-mon

ja.wikipedia,org

ere is no nice way to express the power series coecients of this function inclosed form, a Taylor approximation is (with denominators being deliberately keptin the form of n! to allow reading o the Bell numbers):

+ t + t

!+ t

!+ t

!+ t

!+ t

!+ t

!+ t

!+ t

!+ t

!.

One, somewhat surprising application of Bell numbers is to consider rhymeschemes. Given a sequence of n lines, the lines which rhyme form the cells of apartition of , . . . , n. For example, the partition , , , , is the schemeused by Limericks.

We can read o that B = . e classic th century Japanese novel Genjimonogatari (e Tale of Genji) has chapters of which rst and last are consid-ered “extra”. e remaining chapters are introduced each with a poem in one ofthe possible rhyme schemes and a symbol illustrating the scheme. ese sym-bols, see gure II., theGenji-mon, have been used extensively in Art. See http://viewingjapaneseprints.net/texts/topictexts/faq/faq_genjimon.html.

Splitting and Products

Exponential generating functions have one more trick up their sleeve, and arguablythis is their most important characteristic. ink back to the case of the Catalannumbers, where a split to to smaller parts gave a product of generating functions.

Suppose we are counting objects where parts have identifying labels (these wouldbe called labelled objects). In the example of the Catalan numbers this would be if


we cared not only about the parentheses placement, but also about the symbols,that is (a + b) + c would be dierent from (c + a) + b.

e recursion formula then will include a binomial factor for choosing k outof n objects for the le side, that is

d(n) = nk=nka(k)b(n − k).

We can write this as

d(n)n!= n

k=

a(k)k!

b(n − k)(n − k)! ,

that is the exponential generating functions multiply!

Lets look at this in a pathetic example, the number of functions from N =, . . . , n to , . . . , r (which we know already well as rn).Let a(n) count the number of constant functions on an n-element set, that is

a(n) = . e associated exponential generating function thus is

A(t) =n

tn

n!= exp(t)

(which, incidentally, shows why these are called “exponential” generating func-tions).

If we take an arbitrary function f on N , we can partition N into r (possiblyempty) sets N , . . . , Nr , such that f is constant on Ni and the Ni are maximal withthis property.

We get all possible functions f by combining constant functions on the possi-ble Ni ’s for all possible partitions of N . Note that the ordering of the partitions issignicant – they indicate the actual values.

We are thus exactly in the situation described, and get as exponential generatingfunction (start with r = , then use induction for larger r) the r-fold product of theexponential generating functions for the number of constant functions:

D(t) = exp(t) ⋅ exp(t)r factors

= exp(rt)

e coecient for tn in the power series for exp(rt) of course is simply rnn! , that is

the counting function is rn , as expected.

Indeed, using this approach we could have determined the exponential generat-ing function for the Bell numbers without the need to use recursion and dierentialequations.

For this, lets consider rst the Stirling numbers of the second kind, S(n, k)denoting the number of partitions of n into k (non-empty) parts. According to I.,part ) there are k!S(n, k) order-signicant partitions of n into k parts.


We denote the associated exponential generating function (for order-signicantpartitions) by

Sk(t) =nk!S(n, k) tn

n!.

We also know that there is – apart from the empty set – exactly one partition intoone cell. at is

S(t) = t + t

!+ t

!+ ⋅ ⋅ ⋅ = exp(t) −

If we have a partition into k-parts, we can x the rst cell and then partition therest further. us, for k > we have that

Sk(t) = S(t) ⋅ Sk−(t),which immediately gives that

Sk(t) = (exp(t) − )kWe deduce that the Stirling numbers of the second kind have the exponential gen-erating function

nS(n, k) tn

n!= (exp(t) − )k

k!.

Using the fact that Bn = ∑nk= S(n, k), we thus get the exponential generating func-

tion for the Bell numbers as

nBn

tn

n!=

knS(n, k) tn

n!

= k

(exp(t) − )kk!

= exp(exp(t) − )in agreement with the above result.

As we have the exponential generating function for the Stirling numbers, wemight as well deduce a coecient formula for them:

L II.:

S(n, k) = k!

ki=(−)k−ik

iin .

Proof: We rst note that – as n = – the i-sum could start at or alternativelywithout changing the result.


en multiplying through with k!, we know by the binomial formula that

k!nS(n, k) tn

n!= (exp(t) − )k = k

i=ki exp(t)i(−)k−i

= ki=ki exp(i ⋅ t)(−)k−i = k

i=ki

n

(it)nn! (−)k−i

= n ki=ki(−)k−i in tn

n!.

and we read o the coecients.

Involutions

Finally, lets use this in a new situation:

D II.: A permutation π on , . . . , n is called an involution OEIS Aif π = , that is (iπ)π = i for all i.

We want to determine the number s(n) of involutions on n points.e reduction we shall use is to consider the number of cycles of the involu-

tion, also counting -cycles. Let sr(n) the number of involutions that have exactly rcycles. Clearly s() = , s() = , s() = , s() = which gives the exponentialgenerating function S(t) = t + t

.When considering an arbitrary involution, we can split o a cycle, seemingly

leading to a formula

t + t

r

for Sr(t). However doing so distinguishes the cycles in the order they were splito, that is we need to divide by r! to avoid overcounting. us

Sr(t) = r!t + t

r

and thus for the exponential generating function of the number of involutions asum over all possible r:

S(t) = ∞r=

Sr(t) =r

r!t + t

r = exp(t + t

) = exp(t) exp t

.

Group theorists oen exclude the identity, but it is convenient to allow it here.


We easily write down power series for the two factors

exp(t) = n

tn

n!

exp t

=

n

tn

n ⋅ n!

and multiply out, yielding

S(t) = n

tn

n ⋅ n! ⋅

n

tn

n!

= m

mk=

tk tm−k

k k!(m − k)!and thus (again introducing a factor of n! for making up for the exponential gen-erating function)

s(n) = mk=

n!k k!(n − k)! .

C

III

Inclusion, Incidence, andInversion

It is calculus exam week and, as every time, a number of students were reported torequire alternate accommodations:

• students are sick.• students are scheduled to play for the university Quiddich team at the

county tournament.• students are planning to go on the restaurant excursion for the food ap-

preciation class.• students on the Quiddich team are sick (having been hit by balls).• students are scheduled for the excursion and the tournament.• students of the food appreciation class are sick (with food poisoning), and• of these students also planned to go to the tournament, i.e. have all three

excuses.

e course coordinator wonders how many alternate exams need to be provided.Using a Venn diagram and some trial-and-error, it is not hard to come up with

the diagram in gure III., showing that there are alternate exams to oer.

III. e Principle of Inclusion and Exclusion

By the method of exclusion, I had arrived atthis result, for no other hypothesis would meetthe facts.

A Study in ScarletA C D

CHAPTER III. INCLUSION, INCIDENCE, AND INVERSION

2

2

3

1

58

7

Sick

Excursion

Quiddich

Figure III.: An example of Inclusion/Exclusion

e Principle of Inclusion and Exclusion (PIE) formalizes this process for anarbitrary number of sets:

Let X be a set and A , . . . , An a family of subsets. For any subset I ⊂ , . . . , nwe dene

AI =i∈I Ai ,

using that A = X. en

L III.: e number of elements that lie in none of the subsets Ai is given by

I⊂, . . . ,n

(−)I AI .Proof: Take x ∈ X and consider the contribution of this element to the given sum.If x ∈ Ai for any i, it only is counted for I = , that is contributes .

Otherwise let J = ≤ a ≤ n x ∈ Aa and let j = J. We have that x ∈ AI ifand only if I ⊂ J. us x contributes

I⊂J(−)I = j

i= ji(−)i = ( − ) j =

As a rst application we determine the number of derangements on n points in

an dierent way:Let X = Sn be the set of all permutations of degree n, and let Ai be the set of

all permutations π with iπ = i. en Sn i Ai is exactly the set of derangements,there are ni possibilities to intersect i of the Ai ’s, and the formula gives us:

d(n) = ni=(−)in

i(n − i)! = n!

ni=

(−)ii!

.

III.. THE PRINCIPLE OF INCLUSION AND EXCLUSION

For a second example, we calculate the number of surjective mappings from ann-set to a k-set (which we know already from I. to be k!S(n, k)):

Let X be the set of all mappings from , . . . , n to , . . . , k, then X = kn .Let Ai be the set of those mappings f , such that i is not in the image of f , so Ai =(k−)n . More generally, if I ⊂ , . . . kwe have that AI = (k−I)n . e surjectivemappings are exactly those in X outside any of the Ai , thus the formula gives usthe count

ki=(−)ik

i(k − i)n ,

using again that there are ki possible sets I of cardinality i.

e factor (−)i in a formula oen is a good indication that inclusion/exclusionis to be used.

L III.:

ni=(−)in

im + n − i

k − i = mk if m ≥ k,

if m < k.

Proof: To use PIE, the sets Ai need to involve choosing from an n, set, and aerchoosing i of these we must choose from a set of size m + n − i.

Consider a bucket lled with n blue balls, labeled with , . . . , n, and m red balls.How many selections of k balls only involve red balls? Clearly the answer is the righthand side of the formula.

Let X be the set of all k-subsets of balls and Ai those subsets that contain blueball number i, then PIE gives the le side of the formula.

We nish this section with an application from number theory. e Euler func-tion φ(n) counts the number of integers ≤ k ≤ n with gcd(k, n) = .

Suppose that n = ∏ri= p

eii , X = , . . . , n and Ai the integers in X that are

multiples of pi . en (inclusion/exclusion)

φ(n) = n − ri=

npi+

≤i , j≤rn

pi p j− = n −

pi .

with the second identity obtained by multiplying out the product,We also note – exercise ?? – that ∑

dn φ(d) = n. at is, the sum over one

function over a nice index set is another (easier) function. We will put this intolarger context in later sections.


a) b) c) d) e) f)

g) h) i) j) k) l)

Figure III.: Hasse Diagrams of Small Posets and Lattices

III. Partially Ordered Sets and Lattices

e doors are open; and the surfeited groomsDo mock their charge with snores:I have drugg’d their possets,at death and nature do contend about them

Macbeth, Act II, Scene IIW S

A poset or partially ordered set is a set A with a relation R ⊂ A × A on theelements of Awhich we will typically write as a ≤ b instead of (a, b) ∈ R, such thatfor all a, b, c ∈ A:

(reexive) a ≤ a.

(antisymmetric) a ≤ b and b ≤ a imply that a = b.

(transitive) a ≤ b and b ≤ c imply that a ≤ c.

For example, A could be the set of subsets of a particular set, and ≤ with be the“subset or equal” relation.

A convenient way do describe a poset for a nite set A is by its Hasse-diagram.Say that a covers b if a ≥ b, a = b and there is no a = c == b with a ≥ c ≥ b. eHasse diagram of the poset is a graph in the plane which connects two vertices aand b only if a covers b, and in this case the edge from b to a goes upwards.

Because of transitivity, we have that a ≤ b if and only if one can go up alongedges from a to reach b.

Figure III. gives a number of examples of posets, given by their Hasse dia-grams, including all posets on elements.

An isomorphism of posets is a bijection that preserves the ≤ relation.

An element in a poset is called maximal if there is no larger (wrt. ≤) element,minimal is dened in the same way. Posets might have multiple maximal and min-imal elements.

III.. PARTIALLY ORDERED SETS AND LATTICES

Linear extension

My scheme of Order gave me the most trouble

AutobiographyB F

A partial order is called a total order, if for every pair a, b ∈ A of elements wehave that a ≤ b or b ≤ a.

While this is not part of our denition, we can always embed a partial orderinto a total order.

P III.: Let R ⊂ X × X be a partial order on X. en there exists a totalorder (called a linear extension) T ⊂ X × X such that R ⊂ T .

To avoid set acrobatics we shall prove this only in the case of a nite set X.Note that in Computer science the process of nding such an embedding is calleda topological sorting.Proof: We proceed by induction over the number of pairs a, b that are incompara-ble. In the base case we already have a total order.

Otherwise, let a, b such an incomparable pair. We set (arbitrarily) that a < b.Now let

L = x ∈ X x ≤R a,U = x ∈ X b ≤R xWe claim that S = R ∪ (l , u) l ∈ L, u ∈ u is a partial order. As (a, b) ∈ S it hasfewer incomparable pairs, this shows by induction that there exists a total orderT ⊃ S ⊃ R, proving the theorem.

Since R is reexive, S is. For antisymmetry, suppose that for x = y we have that(x , y), (y, x) ∈ S. Since R is a partial order, not both can be in R. Suppose that

(x , y), (y, x) ∈ S R = (l , u) l ∈ L, u ∈ u.is implies that x ≤R a and b ≤R x, thus by transitivity b ≤R a, contradicting theincomparability.

If (x , y) ∈ R, (y, x) ∈ SR, we have that b ≤R x ≤R y ≤R a, again contradictingincomparability.

For transitivity, suppose that (x , y) ∈ S R and (y, z) ∈ S. en (y, z) ∈ R, asotherwise b ≤R y ≤R a. But then b ≤R y ≤R z, implying that (x , z) ∈ S. e othercase is analog.

is theorem implies that we can always label the elements of a countable posetwith positive integers, such that the poset ordering implies the integer ordering (butthis is in general not unique).

Lattices

D III.: Let A be a poset and a, b ∈ A.


• A greatest lower bound of a and b is an element c ≤ a, b which is maximal inthe set of elements with this property.

• A least upper bound of a and b is an element c ≥ a, b which is minimal in theset of elements with this property.

A is a lattice if any pair a, b ∈ A have a unique greatest lower bound, called themeet and denoted by a∧ b; as well as unique least upper bound, called the join anddenoted by a ∨ b.

Amongst the Hasse diagrams in gure III., e,j,k,l) are lattices, while the othersare not. Lattices always have unique maximal and minimal elements, sometimesdenoted by (minimal) and (maximal).

Other examples of lattices are:

. Given a set X, let A = P(X) = Y ⊆ X the power set of X with ≤ denedby inclusion. Meet is the intersection, join the union of subsets.

. Given an integer n, let A be the set of divisors of n with ≤ given by “divides”.Meet and join are gcd, respectively lcm.

. For an algebraic structure S, let A be the set of all substructures (e.g. groupand subgroups) of S and ≤ given by inclusion. Meet is the intersection, jointhe substructure spanned by the two constituents.

. For particular algebraic structures there might be classes of substructuresthat are closed under meet and join, e.g. normal subgroups. ese then forma (sub)lattice.

Using meet and join as binary operations, we can axiomatize the structure of alattice:

P III.: Let X be a set with two binary operations ∧ and ∨ and twodistinguished elements , ∈ X. en (X ,∧,∨, , ) is a lattice if and only if thefollowing axioms are satises for all x , y, z ∈ X:

Associativity: x ∧ (y ∧ z) = (x ∧ y) ∧ z and x ∨ (y ∨ z) = (x ∨ y) ∨ z;

Commutativity: x ∧ y = y ∧ x and x ∨ y = y ∨ x;

Idempotence: x ∧ x = x and x ∨ x = x;

Inclusion: (x ∨ y) ∧ x = x = (x ∧ y) ∨ x;

Maximality: x ∧ = and x ∨ = .

Proof: e verication that these axioms hold for a lattice is le as exercise to thereader.

III.. PARTIALLY ORDERED SETS AND LATTICES

Vice versa, assume that these axioms hold. We need to produce a poset structureand thus dene that x ≤ y i x ∧ y = x. Using commutativity and inclusion thisimplies the dual property that x ∨ y = (x ∧ y) ∨ y = y.

To show that ≤ is a partial order, idempotence shows reexivity. If x ≤ y andy ≤ x then x = x ∧ y = y∧ x = y and thus antisymmetry. Finally suppose that x ≤ yand y ≤ z, that is x = x ∧ y and y = y ∧ z. en

x ∧ z = (x ∧ y) ∧ z) = x ∧ (y ∧ z) = x ∧ y = xand thus x ≤ z. Associativity gives us that x ∧ y ≤ x , y if also z ≤ x , y then

z ∧ (x ∧ y) = (z ∧ x) ∧ y = z ∧ y = zand thus z ≤ x ∧ y, thus x ∧ y is the unique greatest lower bound. e least upperbound is proven in the same way and the last axiom shows that is the uniqueminimal and the unique maximal element. D III.: An element x of a lattice L is join-irreducible (JI) if x = and ifx = y ∨ z implies that x = y or x = z.

For example, gure III. shows a lattice in which the black vertices are JI, theothers not.

When representing elements of a nite lattices, it is possible to do so by storingthe JI elements once and representing every element based on the JI elements thatare below. is is used for example in one of the algorithms for calculating thesubgroups of a group.

Product of posets

e cartesian product provides a way to construct new posets (or lattices) from oldones: Suppose that X ,Y are posets with orderings ≤X , ≤Y , we dene a partial orderon X × Y by setting

(x , y) ≤ (x , y) if and only if x ≤X x and y ≤Y≤ y .

P III.: is is a partial ordering, so X × Y is a poset. If furthermoreboth X and Y are lattices, then so is X × Y .

e proof of this is exercise ??.is allows us to describe two familiar lattices as constructed from smaller

pieces (with a proof also delegated to the exercises):

P III.: a) Let A = n andP(A) the power-set lattice (that is the subsetsof A, sorted by inclusion). en P(A) is (isomorphic to) the direct product of ncopies of the two element lattice , .b) For an integer n =∏r

i= peii > written as a product of powers of distinct primes,

letD(n) be the lattice of divisors of n. enD(n) ≅ D(pe ) ××D(perr ).


Figure III.: Order-ideal lattice for the “N” poset.

III. Distributive Lattices

A lattice L is distributive, if for any x , y, z ∈ L one (and thus also the other) of thetwo following laws hold:

x ∨ (y ∧ z) = (x ∨ y) ∧ (x ∨ z)x ∧ (y ∨ z) = (x ∧ y) ∨ (x ∧ z)

Example: ese laws clearly hold for the lattice of subsets of a set or the lattice ofdivisors of an integer n.

Lattices of substructures of algebraic structures are typically not distributive,the easiest example (diagram l) in gure III.) is the lattice of subgroups of C ×Cwhich also is the lattice of subspaces of F

.

If P = (X , ≤) is a poset, a subset Y ≤ X is an order ideal, if for any y ∈ Y andz ∈ X we have that z ≤ y implies z ∈ Y .

L III.: e set of order ideals is closed under union and intersection.

Proof: Let A, B be order ideals and y ∈ A ∪ B and z ≤ y. en y ∈ A or y ∈ B. Inthe rst case we have that z ∈ A, in the second case that z ∈ B, and thus alwaysz ∈ A∪ B. e same argument also works for intersections.

is implies:

L III.: e set of order ideals of P, denoted by J(P) is a lattice under inter-section and union.

As a sublattice of the lattice of subsets, J(P) is clearly distributive.For example, if P is the poset on elements with a Hasse diagram given by the

letter N (gure III., g) then gure III. describes the lattice J(P).In fact, any nite distributive lattice can be obtained this way

III.. DISTRIBUTIVE LATTICES

T III. (Fundamental eorem for Finite Distributive Lattices, B):Let L be a nite distributive lattice. en there is a unique (up to isomorphism) -nite poset P, such that L ≅ J(P).

To prove this theorem we use the following denition:

D III.: For any element x of the poset P, let ↓ x = y y ≤ x be theprincipal order ideal generated by x.

L III.: An order ideal of a nite poset P is join irreducible in J(P) if andonly it is principal.

Proof: First consider a principal order ideal ↓ x and suppose that ↓ x = b ∨ c withb and c being order ideals. en x ∈ b or x ∈ c, which by the order ideal propertyimplies that ↓ x ⊂ b or ≤ x ⊂ c.

Vice versa, suppose that a is a join irreducible order ideal and assume thata is not principal. en for any x ∈ a, ↓ x is a proper subset of a. But clearlya = x∈a ↓ x.

C III.: Given a nite poset P, the set of join-irreducibles of J(P), con-sidered as a subposet of J(P), is isomorphic to P.

Proof: Consider the map that maps x to ↓ x. It maps P bijective to the set of join-irreducibles, and clearly preserves inclusion. We now can prove theorem III.Proof: Given a distributive lattice L, let X be the set of join-irreducible elements ofL and P be the subposet of L formed by them. By corollary III., this is the onlyoption for P up to isomorphism, which will show uniqueness.

Let ∶ L → J(P) be dened by (a) = x ∈ X x ≤ a, that is it assigns to everyelement of L the JI elements below it. (Note that indeed (a) is an order ideal.).We want to show that is an isomorphism of lattices.

Step: Clearly we have that a = x∈(a) x for any a ∈ L (using the join over theempty set equal to ). us is injective.

Step : To show that is surjective, let Y ∈ J(P) be an order ideal of P, and leta = y∈Y y. We aim to show that (a) = Y : Clearly every y ∈ Y also has y ≤ a, soY ⊂ (a). Next take a join irreducible x ∈ (a), that is x ≤ a. en x ≤ y∈Y y andthus

x = x ∧ y∈Y y = y∈Y(x ∧ y)

by the distributive law. Because x is JI, we must have that x = x ∧ y for some y ∈ Y ,implying that x ≤ y. But asY is an order ideal this implies that x ∈ Y . us (a) ⊂ Yand thus equality, showing that is surjective.


Step : We nally need to show that maps the lattice operations: Let x ∈ X.en x ≤ a ∧ b if and only if x ≤ a and x ≤ b. us (a ∧ b) = (a) ∩ (b).

For the join, take x ∈ (a) ∪ (b). en x ∈ (a), implying x ≤ a, or (sameargument) x ≤ b; therefore x ≤ a ∨ b. Vice versa, suppose that x ∈ (a ∨ b), sox ≤ a ∨ b and thus

x = x ∧ (a ∨ b) = (x ∧ a) ∨ (x ∧ b).Because x is JI that implies x = x ∧ a, respectively x = x ∧ b.

In the rst case this gives x ≤ a and thus x ∈ (a); the second case similarlygives x ∈ (b). Example: If we take the lattice of subsets of a set, the join-irreducibles are the -element sets. If we take divisors of n, the join-irreducibles are prime powers.

III. Chains and Extremal Seteory

Man is born free;and everywhere he is in chains

e Social ContractJ-J R

D III.: A chain in a poset is a subset such that any two elements of itare comparable. (at is, restricted to the chain the order is total.)

An antichain is a subset, such that any two (dierent) elements are incompara-ble.

We shall talk about a partition of a poset into a collection of chains (or an-tichains) if the set of elements is partitioned.

Clearly a chain C and antichain A can intersect in at most one element. isgives the following duality:

L III.: Let P be a poset.a) If P has a chain of size r, then it cannot be partitioned in fewer than r antichains.b) If P has an antchain of size r, then it cannot be partitioned in fewer than r chains.

A stronger version of this goes usually under the name of D’s theo-rem.

T III. (D, ): e minimum number m of chains in a par-tition of a nite poset P is equal to the maximum number M of elements in anantichain.

proven earlier by G and M

III.. CHAINS AND EXTREMAL SET THEORY

Proof: e previous lemma shows that m ≥ M, so we only need to show that we canpartition P into M chains. We use induction on P, in the base case P = nothingneeds to be shown.

Consider a chain C in P of maximal size. If every antichain in P C containsat most M − elements, we apply induction and partition P C into M − chainsand are done.

us assume now that a , . . . , aM was an antichain in P C. Let

S− = x ∈ P x ≤ ai for some iS+ = x ∈ P x ≥ ai for some i

en S− ∪ S+ = P, as there otherwise would be an element we could add to theantichain and increase its size.

As C is of maximal size, the largest element of C cannot be in S−, and thuswe can apply induction to S−. As there is an antichain of cardinality M in S−, wepartition S− into M disjoint chains.

Similarly we partition S+ into M disjoint chains. But each ai is maximal ele-ment of exactly one chain in S− and minimal element of exactly one chain of S+.We can combine these chains at the ai ’s and thus partition P into M chains. C III.: If P is a poset with nm + elements, it has a chain of size n + or an antichain of size m + .

Proof: Suppose not, then every antichain has at most m elements and by D-’s theorem we can partition P into m chains of size ≤ n each, so P ≤ mn. C III. (E-S, ): Every sequence of nm + distinct in-tegers contains an increasing subsequence of length at least n + , or a decreasingsubsequence of at length least m + .

Proof: Suppose the sequence is a , . . . , aN with N = nm + . We construct a poseton N elements x , . . . , xN by dening xi ≤ x j if and only if i ≤ j and ai ≤ a j . (Verifythat it is a partial order!)

e theorems of this section in fact belong into a bigger context that has its ownchapter, chapter IV, devoted to.

A similar argument applies in the following two famous theorema:

T III. (S, ): Let N = , . . . , n and A , . . . Am ⊂ N , such thatAi ⊂ Aj if i = j. en m ≤ nn.Proof: Consider the poset of subsets of N and letA = A , . . . , Am. enA is anantichain.


A maximal chain C in this poset will consist of sets that iteratively add one newpoint, so there are n! maximal chains, and k!(n − k)! maximal chains that involvea particular k-subset of N .

We now count the pairs (A, C) such that A ∈ A and C is a maximal chain withA ∈ C. As a chain can contain at most one element of an antichain this is at mostn!.

On the other hand, denoting by ak the number of sets Ai with Ai = k, weknow there are

n! ≥ nk=

k!(n − k)!ak = n!

nk=

aknk

such pairs. As nk is maximal for k = n we get

nn ≥

nn

nk=

aknk ≥n

k=ak = m.

We note that equality is achieved ifA is the set of all n-subsets of N .

T III. (E-K-R, ): LetA = A , . . . , Am a collection of mdistinct k-subsets of N = , . . . , n, where k ≤ n, such that any two subsets havenonempty intersection. en m ≤ n−

k−.Proof: Consider “cyclic k-sequences” F = F , . . . , Fn with Fi = i , i + , . . . , i +k − , taken “modulo n” (that is each number should be ((x − )mod n) + ).

Note that A ∩ F ≤ k, since if some Fi equals Aj , then any only other Fl ∈ Amust intersect Fi , so we only need to consider (again considering indices modulon) Fl for i − k + ≤ l ≤ i + k − . But Fl will not intersect Fl+k , allowing at most fora set of k subsequent Fi ’s to be inA.

As this holds for an arbitrary A, the result remains true aer applying any ar-bitrary permutation π to the numbers in F . us

z ∶= π∈SnA ∩F π ≤ k ⋅ n!

We now calculate the sum z by xing Aj ∈ A, Fi ∈ F and observe that there arek!(n − k)! permutations π such that Fπ

i = Aj . us z = m ⋅ n ⋅ k!(n − k)!, provingthe theorem.

III. Incidence Algebras and Mobius Functions

A common tool in mathematics is to consider instead of a set S the set of functionsdened on S. To use this paradigm for (nite) posets, dene an interval on a poset

III.. INCIDENCE ALGEBRAS ANDMOBIUS FUNCTIONS

P as a set of elements z such that x ≤ z ≤ y for a given pair x ≤ y, and denote byInt(P) the set of all intervals.

For a eld K, we shall consider the set of functions on the intervals:

I(P) = I(P, K) = f ∶ Int(P)→ Kand call it the incidence algebra of P. is set of functions is obviously a K-vectorspace under pointwise operations. We shall denote intervals by their end pointsx , y and thus write f (x , y) ∈ I(P).

We also dene a multiplication on I(P) by dening, for f , g ∈ I(P) a functionf g by ( f g)(x , y) =

x≤z≤y f (x , z)g(z, y)In exercise ??we will show that with this denition I(P) becomes an associative

K-algebra with a one, given by

δ(x , y) = x = y x = y .

We could consider I(P) as the set of formal K-linear combinations of intervals[x , y] and a product dened by

[x , y][a, b] = [x , b] a = y a = y ,

and extended bilinearily.If P is nite, we can, by theorem III., arrange the elements of P as x , . . . , xn

where xi ≤ x j implies that i ≤ j. en I(P) is, by exercise ?? isomorphic to thealgebra of upper triangular matrices M = (mi , j) where mi , j = if xi ≤ x j .

L III.: Let f ∈ I(P). en f has a (two-sided) inverse if and only if f (x , x) = for all x ∈ P.

Proof: e property f g = δ is equivalent to:

f (x , x)g(x , x) = for all x ∈ P,

(implying the necessity of f (x , x) = ) and

g(x , y) = − f (x , x)− x<z≤y f (x , z)g(z, y).

If f (x , x) = the second formula will dene the values f − uniquely, dependingonly on the interval [x , y]. Reverting the roles of f and g shows the existence of a

An algebra is a structure that is both a vector space and a ring, such that vector space and ringoperations interact as one would expect. e prototype is the set of matrices over a eld.


le inverse and standard algebra shows that both have to be equal.

e zeta function of P is the characteristic function of the underlying relation,that is ζ(x , y) = if and only if x ≤ y (and otherwise).

is implies that

ζ(x , y) = x≤z≤y ζ(x , z)ζ(z, y) =

x≤z≤y = z z ≤ z ≤ y .By Lemma III., ζ is invertible. e inverse µ = ζ− is called the Mobius func-

tion of the lattice P. e identities

µ(x , x) = (III.)µ(x , y) = −

x≤z<y µ(x , z) (III.)

follow from µζ = δ and allow for a recursive computation of values of µ and implythat µ is integer-valued.

For illustration, we shall compute the Mobius function for a number of com-mon posets.

L III.: Let P be the total order on the numbers , . . . , n. en for anyx , y ∈ P we have:

µ(x , y) =

if x = y− if x + = y otherwise

Proof: e case of x = y is trivial. If x + = y, the sum in III. has only onesummand, and the result follows. us assume that x ≤ y but y ∈ x , x + . en

µ(x , y) = −µ(x , x) − µ(x , x + ) − x+≤z<y µ(x , z) = −

x+≤z<y µ(x , z)and the result follows by induction on y − x.

L III.: If P, Q are posets, the Mobius function on P × Q satises

µ((x , y), (x , y)) = µP(x , x)µQ(y , y)Proof: It is sucient to verify that the right hand side of the equation satises III..

Together with eorem III. and Lemma III. we get

III.. INCIDENCE ALGEBRAS ANDMOBIUS FUNCTIONS

C III.: a) For X ,Y ∈ P(A), we have that µ(X ,Y) = −Y −X if X ⊆ Y ,and otherwise.b) If x , y are divisors of n, then inD(n) we have that µ(x , y) = (−)d if yx is theproduct of d dierent primes, and otherwise.

Part b) explains the name: µ(, n) is the value of the classical number theoreticMobius function.

Part a) connects us back to section III.: e Mobius function gives the co-ecients for inclusion/exclusion over an arbitrary poset. We will investigate andclarify this further in the rest of this section.

Mobius inversion

e property of being inverse of the incidence function can be used to invert sum-mation formulas with the aid of the Mobius function:

T III. (Mobius inversion formula): Let P be a nite poset, and f , g∶ P →K, where K is a eld. en

g(t) =s≤t f (s) for all t ∈ P

is equivalent tof (t) =

s≤t g(s)µ(s, t) for all t ∈ P.

Proof: Let KP be the K-vector space of functions P → K. e incidence algebraI(P) acts linearly on this vector space by

( f ξ)(t) =s≤t f (s)ξ(s, t).

e two equations thus become

g = f ζ , respectively f = gµ,

and their equivalence follows from the fact that µ is the inverse of ζ . e classical Mobius inversion formula from Number eory follows as a specialcase of this.

If we consider the linear poset , . . . , n with total ordering, Lemma III.gives us that (assuming f () = g())

g(n) = ni=

f (i)is (unsurprisingly) equivalent to

f (n) = g(n) − g(n − ),


the nite dierence analog of the fundamental theorem of Calculus!

Going back to the start of the chapter, consider n subsets Ai of a set X. We takethe the poset P(, . . . , n) and dene, for I ⊂ , . . . , n

f (I) = −I X i∈I Ai

g(I) = i∈I Ai .

en g(I) = ∑J⊂I f (J) by Inclusion/Exclusion over the complements X Ai ,while

f (I) =J⊂I

g(J)µ(J , I) =J−I−J

j∈J A j

= −I X Ai is the ordinary inclusion/exclusion formula.

C

IV

Connections

Mathematicians are like Frenchmen.When you talk to them, they translateit into their own language,and then it is something quite dierent.

Die Mathematiker sind eine Art Franzosen;redet man zu ihnen, so ubersetzen sie esin ihre Sprache, und dann ist esalsobald ganz etwas anders.

Maximen und Reexionen:Uber Natur und NaturwissenschaJ W G

With so many joints and connections, leakswere plentiful. As the magazine e Builderremarked, in : “e fate of a theater is to beburned. It seems simply a question of time.”

ConnectionsJ B

In this chapter we will look at connections – both in an applied sense of model-ing situations of connected objects, in the abstract sense of connecting mathemat-ical theorems that initially seem to be unrelated, and in connecting concepts thatmight seem to be hopelessly abstract to practical applications. One of the joys ofmathematics is to discover such connections and see the unity of the discipline.

is practical relevance of the results places much of this chapter also in closecontact to the realm of (discrete) optimization.

We shall investigate a gaggle of theorems (which each in their area are funda-mental), but which turn out to be equivalent in the sense that we can deduce each

CHAPTER IV. CONNECTIONS

from each other. Furthermore this kind of derivation is oen easier than to derivethe theorem from scratch. e description owes much to [Rei]

Many theorems in this chapter are formulated in the language of graphs: Wetypically denote a graph by Γ = (V , E) with vertioces V and edges E being -element sets of vertices. A digraph is a graph with directed edges (that is we con-sider edges as elements of V × V instead of -element subsets of V ). In a digraph,we would allow distinct edges (x , y) and (y, x). Weighted edges means we have aweight function w∶ E → R.

Our start will be D’s eorem III. that we have proven already:

e minimum number m of chains in a part ion of a nite poset P isequal to the maximum number M of elements in an antichain.

IV. Halls’ Marriageeorem

He was, methinks, an understandingfellow who said, ’twas a happy marriagebetwixt a blind wife and a deaf husband.

Celuy la s’y entendoit, ce me semble, quidict qu’un bon mariage se dressoit d’unefemme aveugle avec un mary sourd.

Essais, Livre IIIM M

Consider a family of subsets A , . . . , An ⊆ X. A system of distinct representatives(SDR) for these sets is an n-tuple of distinct elements x , . . . , xn such that xi ∈ Ai .Example: SDRs do not have to exist, for example consider A = A = .

We dene, for a subset J ⊆ , . . . , n of indices, a set

A(J) =j∈J A j .

We immediately see a reason for the above example failing: For an SDR to exist, bynecessity A(J) ≥ J for any such subset J, since there are otherwise not sucientlymany elements available to pick distinct representatives. e following eorem

shows this condition is not only necessary, but also sucient.

T IV. (Halls’ Marriage eorem): e family A , . . . , An of nite setshas a system of distinct representatives if and only if

A(J) ≥ J for all J ⊆ , . . . , n. (IV.)

e name “Marriage eorem” comes from the following interpretation: Sup-pose we have a set of m men and n women. We let Ai be the set of men that women

Proven by the British Mathematician Philip Hall and extended by the unrelated American Math-ematician Marshall Hall for the innite case. us the apostrophe placement in the section title.

IV.. KONIG’S THEOREMS – MATCHINGS

i would consider for a potential marriage. en every women can marry a suit-able man if and only if every group of k women together considers at least k menas suitable.Proof:[D⇒H] As the necessity of the condition is trivial, we show suf-ciency:

Given sets Ai satisfying condition (IV.), Let Y = y , . . . , yn be n symbolsrepresenting the sets. We create a poset P in the following way: e elements of Pis the disjoint union X ∪ Y . e only relations are that x ≤ yi i x ∈ Ai .

Clearly X is an antichain in P. Suppose S is another arbitrary antichain and letJ = ≤ j ≤ n y j ∈ S. e antichain criterion imposes that A(J) ∩ S = , so

S ≤ J + (X − A(J)) ≤ Xbecause of (IV.). at means X is a maximal antichain, and by D’s the-orem, P can be partitioned into X chains. As a chain cannot contain more thanone point from X or more than one point from Y , the pigeonhole principle impliesthat each chain contains exactly one point from X and at most one point from Y .Suppose that xi ∈ X is the element that is together with yi ∈ Y in a chain. enxi ≤ yi , and thus xi ∈ Ai , so x , . . . , xn is a SDR.

As an example of the use of this theorem we consider the following theoremdue to G. B. A matrix M ∈ Zn×n≥ with nonnegative entries is called doublystatistic if every row and every column of M has the same row/column sum.

C IV.: A doubly statistic matrix M = (mi j)with row/column sum l canbe written as the sum of l permutation matrices.

Proof: We dene sets Ai , corresponding to the row indices by Ai = j mi j > .For any k-tupleK of indices, the sum of the corresponding rows of M is kl . As everycolumn of M has column sum l , this means that these k rows must have nonzeroentries in at least k columns, that is i∈K Ai ≥ k. us IV. is satised and thereis a SDR. We set pi , j = if j is the chosen representative for Ai (and pi , j = other-wise). en P = (pi , j) is a permutation matrix and M − P is doubly statistic withsum l − . e statement follows by induction on l .

IV. Konig’seorems – Matchings

Eventually everything connects - people, ideas,objects. . . the quality of the connections is thekey to quality per se.

C Ee theorem goes back to times of more restrictive social mores. e concerned reader might

want to consider k applicants for n jobs and Ai being the set of candidates that satisfy the conditionsfor job i.


Let A be an m × n matrix with entries. A line of A is a row or column. Aset of lines covers A, if every nonzero entry of A lies on at least one of the lines.Nonzero entries are called independent if no two lie on a common line. e termrank of A is the maximum number of independent entries of A.

e following theorem is reminiscent of the properties of a basis in linear alge-bra:

T IV. (K-E): e minimum number of lines coveringAequalsthe term rank of A.

Example: Let

A =

en A can be covered with lines, and has an independent set of cardinality .Proof:[H⇒K-E] Given A ∈ , m×n , let p be the term rank ofA and q the minimum number of lines covering A. We have to show that p = q.

We rst show that p ≤ q: A cover of l lines can cover at most l independent ’s(no line covers more than one) but we can cover all ones with q lines, so p ≤ q.

To see that p ≥ q, without loss of generality permute the rows and columns ofA such that a minimal cover involves the rst r rows and the last s columns of thematrix. We aim to nd an independent set that has r entries in the rst r rows andin columns , . . . , n − s, and s entries in the last s columns in rows r + , . . . , m:

For a row index ≤ i ≤ r let Ni = ≤ j ≤ n − s Ai , j = . en theunion of any k of these Ni ’s contains at least k column indices – otherwise wecould replace these k rows with < k columns in a minimal cover. By Hall’s theoremwe thus have an SDR x , . . . , xr. By denition xi ∈ Ni implies that Ai ,xi = . LetS = (i , xi) i = , . . . , r. Since the xi are distinct, this is an independent set.

A dual argument for the last s columns gives an independent set T of positionsin the last s columns and rows r+ , . . . , n. Since no position in S shares row or col-umn index with any position in T , S ∪ T also is an independent set, of cardinalityr + s = q.

We reformulate this theorem in the language of graphs. A graph is called bipar-tite, if its vertex V set can be written as a disjoint union V = V ∪ V, so that novertex in Vi has a neighbor in Vi (but only neighbors in V−i). A vertex cover in agraph is a set of vertices that every edge is incident to at least one vertex in the cov-er. A matching in a graph is a set of edges such that no two edges are incident to thesame vertex. A matching is called maximal, if no matching with a larger number ofedges exists.

We assume that the graphs we consider here have no isolated vertices.

T IV. (Konig): In a bipartite graph without isolated vertices, the numberof edges in a maximum matching equals the number of vertices in a minimum


Figure IV.: A bipartite graph with maximum matching and minimum vertex cover.

vertex cover.

Example: Figure IV. shows a bipartite graph with a maximum matching (bold).Proof:[K-E⇒K] Let W ,U ⊂ V be the two parts of the biparti-tion of vertices. Suppose W = m, U = n. We describe the adjacency in an m × nmatrix A with entries, Ai , j = i wi is adjacent to u j . (Note that the examplesfor this and the previous theorem are illustrating such a situation.)

A matching in the graph corresponds to an independent set — edges sharingno common vertices. A vertex cover — vertices such that every edge is incident —correspond to a line cover of the matrix. e result follows immediately from theK-E theorem.

Matchings in bipartite graphs can be interpreted as creating assignments be-tween tasks and operators, between customers and producers, between men andwomen, etc., and thus clearly have practical implications (Compare the Marriageeorem interpretation of Hall’s theorem!)

Due to this practical relevance, there is obvious interest in an algorithm fornding maximum matchings in a bipartite graph, the Assignment problem. erst algorithm published uses K’s theorem as a criterion whether an existingmatching can be improved. Due to the origin of the theorem authors, it has beennamed the Hungarian Method. (It turns out that this method had been discoveredbefore independently by Jacobi [Jac].)

Let M ⊂ E be a (not necessarily maximal) matching in a graph Γ = (V , E). Wecall an edge matched if it is in M, and free otherwise. Similarly, vertices incident tomatched edges are called matched, vertices not incident to any matched edge arefree.

An augmenting path for M is a path whose end points are free and whose edgesare alternatingly matched and free. is means all vertices but the end points arematched.

If the cardinality of M is that of a vertex cover C, no augmenting paths canexist: Every edge of M will be incident with exactly one vertex in C, leaving one ofthe two free edges at the end of the path without cover. Otherwise there must be


v1

v2

vw

w1

v2

w

w1

vw

x x

Case 1: Case 2:

Figure IV.: e two reduction in Corollary IV..

an augmenting path:

C IV.: If M < C for a vertex cover C, there either is a vertex cover ofsmaller size, or there is an augmenting path for M.

Proof: We prove the statement by induction over the number of vertices V , thebase case being trivial.

Case : Assume that two vertices v ,w ∈ C are connected by an edge in M. enboth v and w must be incident to other edges, as we otherwise could drop one ofthem from C and obtain a strictly smaller matching.

Furthermore one such edge v , vmust be to a vertex v ∈ C (as we otherwisecould drop v from the vertex cover). Clearly v , v is free. We similarly nd afree edge w ,w to a vertex w ∈ C. If both v and w are free, then v , v ,w ,wis an augmenting path. Otherwise at least one vertex, WLOG v, is matched thusincident to an edge v , v ∈ M. As we assumed that v ∈ C this implies that v ∈ C,as the graph is bipartite we know that v = w. Now consider the smaller graph Γ′obtained from Γ by deleting v and v (and the adjacent edges) and adding an edgev ,w if it did not yet exist. en M′ = (M − v ,w, v , v) ∪ v ,w isa matching in Γ′ and C′ = C − v a vertex cover in Γ′ (see Figure IV., le). Byinduction, we can either obtain a strictly smaller vertex cover D for Γ′ (in whichcase D ∪ v or D ∪ v is a strictly smaller cover for Γ), or an augmenting pathfor Γ′. If this path uses the new matching edge v ,w, we replace this edge by theedge sequence v , v, v , v, v ,w and obtain an augmenting path in Γ. iscompletes case .

In Case , we now may assume that no two vertices of the cover are connectedby an edge in the matching, as M ≤ C there must be an unmatched vertex v ∈ C.

If v is adjacent to any other unmatched vertex this is an augmented path oflength .

Otherwise v is adjacent to a matched vertex w ∈ C (if all neighbors are in C, wecould remove v from C). By a matched edge, w is adjacent to another vertex x, andto cover that edge we must have x ∈ C. We furthermore may assume that w was


chosen so that x is also incident to an edge not in the matching, as we otherwisecould replace all x’s with the respective w’s in the cover and remove v from C,reducing the cover size.

Now consider the graph Γ′, obtained by removing the vertices v andw and theirincident edges and set M′ = M x ,w and C′ = C v (Figure IV., right).

en C′ is a vertex cover of Γ′ (all edges that v covered are gone). Also M′ is amatching with M′ < C′. By induction, either Γ′ has a smaller cover D (in whichcase D ∪ v is a smaller cover for Γ), or there is an augmenting path. If this pathimvolves x, we extend it by xwv and thus obtain an augmenting path for Γ.

e existence of an augmenting path allows us to replace M with a larger match-ing by replacing the matched edges of an augmenting path in M with the free edgesin the path.

To nd a maximal matching thus simply start with some matching and searchesfor augmenting paths.

It is not hard to generalize this approach to (complete) bipartite graphs withweighted edges to obtain a (perfect: all vertices are matched) matching of maximalweight (prot):

Given an arbitrary matching with weighted edges, replace an edge a, b with(assume: integral) weightw by a set of edges a, b, b , a, a , b, . . . , aw , bwith newly introduced vertices ai , bi that are not connected in any other way. Se-lecting the edge a, b in the original graph thus now allows selection of w edges.us the maximum weight of a matching in the original graph corresponds to thecardinality of a maximum matching in the new graph.

Stable Matchings

. . . to Alvin E. Roth and Lloyd S. Shapley “forthe theory of stable allocations and the practiceof market design”.

Citation for the Sveriges Riksbank Prize inEconomic Sciences in Memory of Alfred Nobel

Also of practical interest is the concept of a stable matching (or stable marriage).Imagine a matching in a bipartite graph represents an assignment of students tocollege places. Every student has a personal preference ranking of colleges, everycollege (place – assume for simplicity that every college just has one student) has apersonal ranking of students. A matching is instable (otherwise: stable), if there isa student s and a college c such that s ranks c higher than their current college, andc ranks s higher than their current student.

A priori it is unclear that stable matchings exist, the following algorithm notonly proves that but also gives a way of producing one:


A IV. (Deferred Acceptance, G, S): Every student has atemporarily assigned college, colleges may have a temporarily assigned student.Once a college has been assigned a student, it will replace it only by one itranks higher. A student may be moved down from higher ranked to lower rankedcolleges.

For bookkeeping, students carry a rejected label that will change status asthe algorithm goes on, and also and carry for each college an indicator whetherthey have been rejected by this college.

1. [Initialize] Label every student as rejected

2. [Complete] If no student is rejected, terminate.

3. [Student Choice] Every student marked as rejected applies to the collegeshe ranks highest amongst those who have not yet rejected her.

4. [College Choice] Every college tentatively picks from the students whochose it the one it ranks highest (and removes that students rejectionstatus). It rejects all other students who have selected it,even if they hadbeen picked tentatively before.

5. Go back to step 2.

Proof: e only way the process terminates is if no student is rejected, that is theywere tentatively accepted by a college and this college has no student applying itwould rank higher. All colleges which the student ranks higher have rejected herbecause they were able to accept a student they rank higher, so the matching isstable.

In every round, students select from colleges and are either tentatively accept-ed, or are le with colleges of lower preference. As there is a nite set of studentsand preferences, this process must terminate.

is algorithm is used for example in the US to match graduating medical doc-tors to hospital training positions, www.nrmp.org. It also was instrumental for theaward of the Nobel prize in Economics to S and R.

IV. Menger’s theorem

Live in fragments no longer. Only connect, andthe beast and the monk, robbed of the isolationthat is life to either, will die.

Howards EndE. M. F

IV.. MENGER’S THEOREM

K’s aim in proving the eorem IV. was as a tool towards a proof for amore general theorem by M that deals with connections in a graph (and getsus in statement and proof back close to D’s theorem), but whose proof,when rst published, turned out to not work in the case of bipartite graphs.

Let u,w be nonadjacent vertices in a graph Γ. A uw-vertex path in Γ is a pathfrom u to w. A collection of uw-vertex paths is called independent, if the paths arepairwise disjoint except for u and w. A set S of vertices, excluding u and w, is uw-vertex separating, if every uw path must contain a vertex of S. Clearly the minimumcardinality of a uw vertex separating set must be at least as large as that of a numberof uw independent paths. It turns out that they are equal:

T IV. (M): Let u,w be nonadjacent vertices in a graph. en themaximum number of independent uw vertex paths equals the minimum cardinal-ity of a uw vertex separating set.

As K’s proof is comparatively long, we will instead give a direct proof dueto G. D. Note the similarity to the proof of Dilworth’s theorem!Proof: Let m be the minimum cardinality of a separating set, and M the maximumnumber of independent paths. We have seen that m ≥ M.

Assume that the theorem is false and Γ = (V , E) be a graph with a minimumnumber of edges for which the theorem fails, that is we have two nonadjacent ver-tices u and w and fewer than m independent vertex paths.

If e is an edge not incident to u or w, we can remove e and obtain a smallergraph ∆ for which the theorem holds. Since Γ, and thus ∆ has at most m − uw-independent paths this means that in ∆ there is an uw-separating set T of sizem − . If we add one of the vertices incident to e to this set T , we obtain a set Sof cardinality m which is uw separating – any path connecting in Γ but not in ∆clearly would have to use the edge e. We thus may assume that S contains a vertexnot adjacent to u or w.

Next, we note that there is no path usw with s ∈ S of length – if there was wecould remove s and have a graph with fewer edges and m − separating verticesand less than m − independent paths, contradicting the minimality of Γ.

For the nal contradiction, denote by Pu the set of vertex paths between u andexactly one vertex in S and Pw ditto for w. e paths in Pu and Pw have only ver-tices of S in common (otherwise we could circumvent S, contradicting that it isseparating).

We claim that all vertices of S are adjacent to u, or that all vertices in S areadjacent to w. is will contradict the choice of S and prove the theorem:

Consider the graph consisting of Pu , w and edges sw for s ∈ S. If w is not ad-jacent to all vertices in s, this graph has fewer edges than G, so by assumption hasm independent paths from u to w. If we leave out the sw-step from these paths weobtain a set Ru of m independent paths from u to the m elements of S. Similarly,we get a set of independent paths Rw fromw to the elements of S. Combining thesepaths at the elements of s yields m independent paths from u tow, contradiction.


a bc

d

e

ff

e

d

c

ba

Figure IV.: A graph and its line graph

A dual version of the theorem holds for disjoint paths:

T IV. (M’s theorem, edge version): e maximum number of edge-disjoint paths from u to w equals the minimum number of edges, whose removalwould put u and v into disjoint components.

e proof of the theorem is by the ordinary version, applied to the line graphL(Γ) of Γ. e vertices of L(Γ) are the edges of Γ, two vertices are adjacent if theedges in Γ are incident to the same vertex. See gure IV. for an example.

M’s theorem also generalizes in the obvious way to directed graphs, thoughwe shall not give a statement or proof here.

IV. Max-ow/Min-cut

And he thought of himself oating on his backdown a river, or striking out from one island toanother, and he felt that that was really the lifefor a Tigger.

e house at Pooh cornerA. A. M

A network is a digraph (V , E)with weighted edges with two distinguished ver-tices, the source s and the target t. We assume that the weights (which we call capac-ities) c∶ E → R≥ are nonnegative. (Below we shall assume that weights are in factrational – clearly we can aproximate real weights to arbitrary precision. We shallinvestigate the issue of irrational capacities in the exercises.) An example is givenin Figure IV.. For a (directed) edge e = (a, b) we denote by ι(e) = a the initial,and by τ(e) = b the terminal vertex of this edge.

e reader might want to consider this as a network of roads with weights beingthe maximal capacity. (It is possible to have dierent capacities in dierent direc-tions on the same road.) We want to transport vehicles from s to t (and implicitlyask how many we can transport through the network).

IV.. MAX-FLOW/MIN-CUT

A C F H

B E

D G

10

5

15

4

4

30

6

8

15

9

15

15

10

10

10

Figure IV.: A network

Note that we are looking for a steady state in a contnuous transport situation,that is we do not want to transport once, but forever and ask for the transport pertime unit.

A ow is a function f dened on the edges (indicating the number of units wetransport along that edge) such that

• ≤ f (e) ≤ c(e)∀e ∈ E (do not exceed capacity).

• ∑ι(e)=v f (e) = ∑τ(e)=v f (e)∀v ∈ V s, t (No intermediate vertex canbuer capacity or create new ow. is is K’s law in Physics.)

e value of a ow is the net transport out of the source:

val( f ) = ι(e)=s

f (e) − τ(e)=s

f (e).is denition might seem asymmetric in focussing on the source, however Lem-ma IV. will show that this is a somewhat arbitrary choice.

e question we ask thus is for the maximal value of a ow. e main tool forthis is to cosider the ow passing “by a point” in the network:

A cut C ⊂ V in a network is a set of vertices such that s ∈ C, t ∈ C. We do notrequire the cut to be connected, though in practice it usually will be.

We note that the value of any ow is equal to the net ow out of a cut. (isimplies that the value of a ow also is equal to the net transport into the target):

L IV.: Let C be a cut. We denote the edges crossing the cut by

Co = e ∈ E ι(e) ∈ C , τ(e) ∈ C ,Ci = e ∈ E τ(e) ∈ C , ι(e) ∈ C .

en for any ow f we have that

val( f ) = e∈Co

f (e) − e∈Ci

f (e).


Proof: We prove the statement by induction on the number of vertices in C. ForC = we have that C = s and the statement is the denition of the value of aow.

Now suppose that C > , let s = v ∈ C and D = C v. By induction, thestatement holds for D.

We now consider the following disjoint sets of edges:

Po = e ∈ E ι(e) ∈ D, τ(e) ∈ C ,Pi = e ∈ E τ(e) ∈ D, ι(e) ∈ C ,Qo = e ∈ E ι(e) ∈ D, τ(e) = v ,Qi = e ∈ E τ(e) ∈ D, ι(e) = v ,Ro = e ∈ E ι(e) = v , τ(e) ∈ C ,Ri = e ∈ E τ(e) = v , ι(e) ∈ C .

Any edge that starts at v must be in Qi or in Ro , any edge ending at v must be inQo or in Ri . Kirchhos law thus gives us that

e∈Qi

f (e) + e∈Ro

f (e) = e∈Qo

f (e) + e∈Ri

f (e).It is easily seen that Co = Po ∪ R, Ci = Pi ∪ Ri and that Do = Po ∪ Qo andDi = Pi ∪ Qi . erefore, using the previous expression to replace a dierence ofQ-sums by a dierence of R-sums:

val( f ) = e∈Do

f (e) − e∈Di

f (e)=

e∈Pof (e) −

e∈Pif (e) +

e∈Qof (e) −

e∈Qif (e)

=∑e∈Ro f (e)−∑e∈Ri f (e)= e∈Co

f (e) − e∈Ci

f (e).e claim follows by induction.

We dene the capacity of a cut as the sum of the capacity of the edges leavingthe cut:

cap(C) = e∈Co

c(e).Lemma IV. thus implies that the value of any ow must be bounded by the ca-pacity of any cut and thus (unsurprisingly) the maximal ow value is bounded bythe minimum cut capacity – a chain is only as strong as its weakest link.

Similar to the dualities we considered before, equality is attained:


T IV. (Max-Flow Min-Cut, integer version): Suppose the capacity func-tion c in integer valued. en the maximum value of a ow in a network is equalto the minimum capacity of a cut. Furthermore, there is a maximum ow f that isinteger valued.

N IV.: If the capacity function is rational valued, we can simply scale with thelcm of the denominators and obtain an integer valued capacity function. In the caseof irrational capacities, the theorem holds by a boring approximation argument.

Proof:[M⇒Max-ow] Given a directed network, replace every arc with in-tegral weight c by c disjoint directed edges.

Clearly, if one edge of a c-fold multi-edge set is in a minimum separating set,the other c − edges have to be.

us the cardinality of a minimum edge-separating set is a minimum cut, whilek independent paths give a ow of k. By the edge version of M’s theorem,the minimum cardinality of an edge-separating set equals the maximum numberof independent edge paths, proving the theorem.

e algorithm of F and F nds a maximal ow for a given net-work. It start with a valid ow (say f (e) = for all edges) and then improves thisow if possible, using the following main step:

A IV. (increase ow): Given a flow f , return a flow with higher value,or show that no such flow exists.

1. Let A ∶= s. (A is a cut of vertices that can be supplied at higher rate

from the source). I ∶= . (I is a set of edges that are not yet at capacity.)

R ∶= . (R is a set of edges whose flow should be reduced.)

2. If t ∈ A, go to step 6.

3. If there is an edge e ∈ Ao with f (e) < c(e), set A ∶= A∪τ(e), I ∶= I∪e.Go to step 2.

4. If there is an edge e ∈ Ai with f (e) > (This is flow into A which we

could reduce), set A ∶= A∪ ι(e), R ∶= R ∪ e. Go to step 2.

5. If neither of these two properties hold, terminate with a message thatthe flow is maximal.

6. By tracing back how t got added to A, we construct an (undirected) path(the augmenting path) P = (e , e , . . .) from s to t. Let

d =minmine∈P∩I(c(e) − f (e)), min

e∈P∩R f (e) .

Increase the flow on P ∩ I by d, reduce the flow on P ∩ R by d. (This

satisfies Kirchho↵s law.) Return the larger flow.

Or edges→ → with intermediate vertices to ensure disjointness


Proof: We shall show termination in the case of integral capacities: Every time weadjust a ow, we may assume that the ow on one edge increases by an integralvalue. is can only happen a nite number of times.

Once the algorithm ends in step , we have a cut A such that ∑e∈Ai f (e) = ,∑e∈Ao f (e) = ∑e∈Ao c(e), thus val( f ) = cap(A) and the ow must be maximal byeorem IV..

Example: We illustrate the algorithm in the example from above. Figure IV., a)shows some ow (in blue). e total ow is .

We start building the list of undersupplied vertices and mark edges on whichto increase, respectively reduce the ow:

Vertices Increase ReducesB sBA ABD ADt DT

We nd an augmenting path sBADt (red in Figure b) and note that we can increasethe ow by (the limiting factor being edge sB) to a total of .

Starting again, we obtain (Figure c) the augmenting path sCFBAEt, on whichwe can increase the ow by again, resulting is a ow of .

Finally we build the undersupplied cities as C,F,B and then cannot increase theset. us (Figure d), shaded) s,C , F , B is a cut whose capacity of is reached,proving that we have a maximal ow.

N IV.: As given, the algorithm is not necessarily in polynomial time, as theremight be a huge number of small incremental steps. Take the network in Figure IV..Starting with a ow of , we choose the augmenting path sABt on which we canincrease the ow by . Next, we take the augmenting path sBAt, reducing the owon AB to again and have a total ow of . We can iterate this, increasing the owby each step until we get the maximum ow of .

Polynomial time in this algoriuthm however can be acheived by a more carefulchoice of the augmenting path (e.g. a shortest path, Edmonds-Karp algorithm),that analysis however is beyond the scope of these notes.

N IV.: To close the circle, observe that the Max-ow/Min-cut theorem im-plies D’s theorem: Given a poset, we model it as a directed network withow from bottom to top and arc capacity one. us all theorems mentioned in thissection (and in fact a few more we have not mentioned) are in a sense mutuallyequivalent and form the fundamental theorem of discrete optimization.


a)

10

3

11

6

0

8

1

11

6

8

10

4

0

0

0

s B E t

A D

C F

10

5

15

4

4

30

6

8

15

9

15

15

10

10

10

b)

10

5

11

8

0

8

1

11

8

8

10

2

0

Flow: 26

0

0

s B E t

A D

C F

10

5

15

4

4

30

6

8

15

9

15

15

10

10

10

c)

10

5

13

8

2

8

3

13

8

10

10

0

0

Flow: 28

0

0

s B E t

A D

C F

10

5

15

4

4

30

6

8

15

9

15

15

10

10

10

d)

10

5

13

8

2

8

3

13

8

10

10

0

0

Flow: 28

0

0

s B E t

A D

C F

10

5

15

4

4

30

6

8

15

9

15

15

10

10

10

Figure IV.: Increasing a given ow to maximum

s

B

100

A

t

100 100

100

1

Figure IV.: Example for bad augmenting paths

Braess’ Paradox

On Earth Day this year, New York City’sTransportation Commissioner decided to closed Street, which as every New Yorker knows isalways congested. ”Many predicted it would bedoomsday,” said the Commissioner [...] But toeveryone’s surprise [...] Trac ow actuallyimproved when d Street was closed.

New York Times, //, p.G K


A B C D A B C D

Figure IV.: B’ paradox: Before and aer a new road is built.

From the time of Say and Ricardo the classicaleconomists have taught that supply creates itsown demand

e General eory of Employment, Interest,and Money

J M K

Oen the question of maximal ow turns up not just for an existing network,but already at the stage of network design, aiming to maximize ow. B’ para-dox shows that this can be very nonintuitive.

While we shall give a theoretical example, concrete instances of this eect havebeen observed in the real world in several cities (New York; Stuttgart, Germany;Winnipeg, Canada;. . . ) when roads had been temporarily blocked because of build-ing work, or aer new roads were built. e interested reader might observe obvi-ous implications to society and politics.

Suppose we have four cities, A, B, C, D with connecting roads as shown ingure IV., le. e reader may observe that this conguration is similar to theone in Note IV..

Cities A and B as well as C and D are connected by a minor road which easilyclogs up. e driving time thus depends strongly on the number of cars, drivingtime, with x thousand drivers on the road is

tAB = tCD = x .

An obstacle blocks direct connections from B to C.

High capacity roads have been build to connect A and C, as well as B and D.While overall time is longer, the impact for extra cars is less, time with x thousanddrivers on the road is

tAC = tBD = + x .

Assume drivers want to travel from A to D. A symmetry argument shows thathalf () travel via B and half via C. eir individual travel time is ⋅++ =. (It is not hard to show that this is a stable equilibrium, no traveler can gain bychanging their route.)


Aer time, a new connection is built from B to C through the obstacle (rightimage). (For simplicity of the argument we shall assume that it is one-way, but thatis not crucial for the paradox.) It has high capacity and is short, so travel time onthis route is

tBC = + x .

Assuming perfect information, drivers will move towards a distribution in whichtravel time along all possible routes is equal. In the example this is driversusing ABD, drivers ACD and drivers using ABCD. (is means that drivers are traveling AB and CD, respectively, travel BC.) eir traveltimes are (again in a stable equilibrium)

tABD = ⋅ + + + ⋅ = = ⋅ + + tABD = tACD ,

and thus more than before.

C

V

Partitions and Tableaux

Some of the hardest, but also most intriguing, counting problems are those of par-titions.

V. Partitions and their Diagrams

An order-irrelevant partition (from now on simply: partition) of n into k parts isan expression

n = x + x + + xk , x ≥ x ≥ xk ≥

with all xi being integers. We will typically denote partitions by small Greek lettersand write λ n to mean that λ is a partition of n.

As in I., we denote by pk(n) the number of partitions of n into k parts and byp(n) the total number of partitions of n (into any number of parts). OEIS AFor example

= = + = + = + + = + + + ,so p() = and

= + + = + + = + + = + + ,

giving p() = . Instead of writing a sum, we might give a count of the part sizes,that is instead of = + + we would write .

A convenient way to depict a partition is graphically by what is called a Youngdiagram (boxes) or Ferrers’ diagram (dots): Each part is depicted by a row of dots orof empty squares, in descending order. Figure V. gives two versions of the diagramfor the partition :.

this is the convention in English speaking countries. e French build instead from the bottomup.

CHAPTER V. PARTITIONS AND TABLEAUX

Figure V.: e Ferrers and Young diagram for

e conjugate λ∗ of a partition λ is the partition with the “transposed” diagram,for example if λ = , then λ∗ = . Clearly λ∗∗ = λ.

V. Some partition counts

e number pk(n) is clearly equal to the number of ways to write n−k = y++ykwith y ≥ ≥ yk ≥ . If exactly s of these integers yi are nonzero, there will beps(n − k) of this type. is gives the recursion

pk(n) = ks=

ps(n − k)which, together with the initial values pk(n) = for n < k and pk(k) = gives usa way to determine pk(n) recursively.

We start by giving a generating function for the partition function p(n):P V.:

np(n)tn =

i≥( − t i)−

Proof: Expanding the right hand side, and using that ( − t i)− = ( + t i + ti +,we get

i( − t i)− = ( + t + t +)( + t + t +)

In this product, tn is obtained as a product of tx from the rst factor, tx from thesecond, and so on, such that x x n. Vice versa, every partition of n gives adierent way to form tn , that is p(n) is the coecient of tn . is innite product does not have a nice closed form, but we will use it below inCorollary V. to obtain a recursion formula for p(n).

If we consider pk(n) for a xed k, we only need to consider a partial productin the expansion. is can be used to calculate the value directly:

L V.: p(n) is the integer closest to n

.

Proof: Let a(n) = p(n + ) the number of solutions of n = x + x + x withx ≥ x ≥ x ≥ . We set y = x, y = x − x and y = x − x and thus get that

V.. SOME PARTITION COUNTS

a(n) is the number of solutions of n = y + y + y. As in the previous proof, weget a generating function

na(n)tn = ( − t)−( − t)−( − t)− .

A partial fraction decomposition gives us

na(n)tn =

( − t)− +

( − t)− +

( − t)−

+ ( + t)− +

( − ζ t)− +

(t − ζ t)−

with ζ a primitive -rd root of unity. Using the series expression

( − t)−a− =ja + j

jt j ,

collecting and telescoping, we get

a(n) = (n + ) −

+ (−)n

+

(ζn + ζn).

We write this in the form

a(n) − (n + ) ≤

+

+

<

giving the desired result. More generally, we have the following approximation:

P V.: For xed k, we have that

pk(n) ∼ nk−

k!(k − )! (n →∞).Proof: Suppose that n = x + + xk with x ≥ ≥ xk ≥ . If we permute thevariables xi , we get (not necessarily dierent) compositions of n into k parts andeach composition can be obtained by a permutation of a partition. By Section I.there are n−

k− such compositions, giving us

k!pk(n) ≥ n − k − .

We now set yi = xi + (k − i) for ≤ i ≤ k. en the yi are all dierent, as

xi + k − i = yi = y j = x j + k − j


implies that xi + j = x j + i. Assuming WLOG that i < j we have that xi ≤ x j ,contradiction. Clearly y ≥ y ≥ yk .

e summation formula over i gives (aer an index swap) that y + + yk =n + k(k−)

is a partition of n + k(k−) . As the parts are dierent, every permutation

gives a dierent composition, but we do not necessarily obtain all compositionsthis way. us

k!pk(n) ≤ n + k(k−) −

k − .

Combining the two inequalities, and dividing by k! gives (with ak = k(k−) only

dependent on k)

(n − )(n − )(n − k + )k!(k − )! ≤ pk(n) ≤ (n + ak − )(n + ak − k + )

k!(k − )! .

As k (and thus ak) are both xed, both numerators approach nk− if n →∞.

V. Pentagonal Numbers

Our chief weapon is surprise...surprise and fear...fear and surprise...Our two weapons are fear and surprise...and ruthless eciency...

e Spanish InquisitionM P

For reasons we shall see soon, we divert in a surprising way.A number is pentagonal if it is of the form k(k − ) or generalized pentag-

onal OEIS A if it is of the form k(k + ) for some positive integer k.Alternatively, one can simply use k(k − ) for arbitrary integers k. e rst fewpentagonal numbers for positive index are ( ⋅ − ) = , ( ⋅ − ) = ,( ⋅ − ) = , ( ⋅ − ) = as depicted in Figure V., respectively( ⋅−) = , −( ⋅(−)−) = , −( ⋅(−)−) = , −( ⋅(−)−) = for other indices.

e pentagonal numbers give the following extraordinary theorem.

T V. (Euler’s Pentagonal Numbers eorem): a) If n is not a pentagonalnumber, the number of partitions of n into an even and an odd number of distinctparts are equal.

as in the title: une loi tout extraordinaire, [Eul]

V.. PENTAGONAL NUMBERS

Figure V.: e rst few pentagonal numbers: ,,,

b) If n = k(k−) for k ∈ Z even, then the number of partitions of n into an evennumber of distinct parts exceeds the number of partitions into an odd number ofdistinct parts by one.c) If n = k(k − ) for k ∈ Z odd, then the number of partitions of n into aneven number of distinct parts is one less than the number of partitions into an oddnumber of distinct parts.

Example: ere are partitions of into an even number of distinct parts and intoan odd number of distinct parts. ere are partitions of into an even number ofdistinct parts, and into an odd number of distinct parts. ere are partitions of into even parts, and into odd parts. Proof: e proof will consist of constructinga bijection between partitions with an even number of distinct parts and partitionswith an odd number of distinct parts. In the case that n is pentagonal this will haveleave out exactly one partition.

For a partition λ n into dierent parts, we consider two subsets of the cells,as depicted in Figure V.:

e base is the bottom row of cells (the shortest row), the slope is the cells start-ing at the end of the top row and descending straight down-le as far as possible(this might be only of length ).

Figure V.: Base (dashed) and Slope of two partitions

Base and slope might intersect in one cell, we call those cells of the base thatare not in the slope the pure base; similarly the pure slope are those slope cells thatare not also in the base.


We now divide the partitions of n with distinct parts into three classes:

Class are those partitions for which the pure base contains more cells than theslope.

Class are those partitions for which the pure slope is at least as long as the base.Class are all remaining partitions.

For a partition λ in Class , we create a new partition µ by removing the slope(possibly including the base point) and placing these cells in a new row at the bot-tom of the diagram. As the pure base is larger than the slope this results in a parti-tion into dierent parts. Furthermore, the slope of µ is at least as long as the slopeof λ (which is the base of µ) if base and slope of µ are distinct; if they intersect theslope of µ is strictly larger. is means µ is in Class .

Vice versa, if µ is in Class , we remove the base (possibly including the low-est point of the slope) and attach it from top right as new slope. is results in apartition λ into dierent parts which lies in Class .

For example, the le partition in Figure V. is in Class and will be transformedinto the partition on the right side in Class , and vice versa.

It is easily seen that these two operations are mutually inverse, that is Class and Class contain the same number of partitions. Furthermore the operationschange the number of parts by exactly one. at means that in Class ∪ Class ,there are exactly as many partitions with an even number of parts, as with an oddnumber of parts.

Finally consider Class : A partition in this class must have base and slope in-tersecting, and the base contains the same number of cells, or exactly one more,than the slope, see Figure V..

Figure V.: Possibilities for partitions in Class

is means that, if there are k parts, it consists of a k × k rectangle and a, , . . . , k triangle that might or might not overlap. In the rst case there are k +k(k − ) = k(k − ) parts, respectively k + k(k + ) = k(k + ) =(−k)((−k) − ) parts. For a given n only one of these two options is possible,and it implies that n is a pentagonal number. is proves the theorem.

Classications become easy if we allow an “all the rest” category

V.. PENTAGONAL NUMBERS

To simplify notation we write ω(k) = k(k−) for k ∈ Z to state the followingconsequence for the inverse of the generating function of partitions.

C V.:

n≥( − tn) = ∞

k=−∞(−)k tω(k) = +

k>(−)k tω(k) + tω(−k) .

Proof: e second equality simply relies on the now familiar two ways to describegeneralized pentagonal numbers.

Let even(n) be the number of partitions of n into an even number of dis-tinct parts, odd(n) similarly for an odd number of parts. By the Pentagonal Num-bers eorem, the right hand side of the equation is the generating function foreven(n) − odd(n). We aim to show that this is true also for the le hand side:

e coecient for tn will be made up by contributions from terms of the formtni , all distinct, such that n = n ++ nl . is partition contributes (−)l . at is,every partition into an even number of distinct parts contributes , every partitioninto an odd number of distinct parts contributes −, which was to be shown.

is now permits us to formulate a recurrence formula for p(n):C V.:

p(n) = k>(−)k− (p(n − ω(k)) + p(n − ω(−k)))

= p(n − ) + p(n − ) − p(n − ) − p(n − ) + p(n − ) +Proof: We know that∑n p(n)tn =∏( − tn)−, and therefore

np(n)tn ⋅ +

k>(−)k tω(k) + tω(−k) = .

We now consider the coecient of tn is this product (which is zero for n > ). isgives

= p(n) +k>(−)k (p(n − ω(k)) + p(n − ω(−k))) .

By using that p(n) = for n < and substituting values for the pentagonal numberswe get the explicit recursion. is formula can be used for example to calculate values for p(n) eectively, or toestimate the growth of the function p(n).

While it might not seems so we have by now wandered deep into Number the-ory. In this area we get for example R’s famous congruence identities

p(n + ) ≡ (mod )p(n + ) ≡ (mod )p(n + ) ≡ (mod )


that recently have been generalized by O and collaborators.Euler’s theorem and its consequence are a special case of the Jacobi triple product

identity ∞n=( − qn)( + qn− t)( + qn− t−) = ∞

r=−∞ qrtr

which has applications in the theory of theta-functions – the territory of Numbereory – and Physics.

V. Tableaux

It’s like the story of the drawing on theblackboard, remember?

C’est comme pour l’histoire du dessinsur le tableau noir, tu te rappelles?

Histoires inedites du Petit Nicolas,Tome

R G

Drawing a partition as Young diagram with boxes allows us to place numbersin them.

D V.: Let λ n. A (standard) Young tableau is a Young diagram ofshape λ into whose boxes the numbers , . . . , n have been entered, so that everyrow and every column is (strictly) increasing.

Beyond their intrinsic combinatorial interest, tableaux are a crucial tool forstudying permutations, and for the representation theory of symmetric groups.

An obvious question with a surprising answer is to count the number fλ ofYoung tableaux of shape λ. In the case that λ = n as partition of n, we observe,exercise ??, that fλ = Cn+, the counts fλ thus can be considered as generalizingCatalan numbers.

Our aim is to derive the famous hook formula (eorem V.) for fλ . While thisformula is inherently combinatorial, the proof we shall give is mainly algebraic –indeed, until surprisingly recently [GNW, Zei] no purely combinatorial proofwas known.

We start with a recursively dened function f on m-tuples of natural numbers(for arbitrary m). (As it will turn out that this function is fλ we use the same name.)

. f (n , . . . , nm) = , unless n ≥ n ≥ ≥ .

French: “(black)board”. Plural: tableaux

V.. TABLEAUX

. f (n , . . . , nm , ) = f (n , . . . , nm).. f (n , . . . , nm) = f (n − , n , . . . , nm) + f (n , n − , . . . , nm) ++ f (n , nm . . . , nm − ), if n ≥ n ≥ nm ≥ .

. f (n) = if n ≥ .

Properties i) and iv) are base cases, property ii) allows for an arbitrary number ofarguments, Property iii) nally gives a recursion that will reduce the values of thearguments; thus f is uniquely dened.

L V.: If λ = (n , . . . , nm), then fλ = f (n , . . . , nm) is the number of Youngtableaux of shape λ.

Proof: Conditions i), ii) and iv) are obviously satised by fλ . For condition iii), takea tableau of shape (n , . . . , nm) and consider the location of n. It needs to be atthe end of a row (say row j) and the bottom of a column, and if we remove thatentry, we obtain a valid tableau with n − entries and of shape (n , . . . , ni− , ni −, ni+ , . . . , nm). (Note that some of the terms could be zero if ni − < ni+.)

Next we shall need an indentity in the polynomial ring in variables x , . . . , xm , y.

D V.: e root of the discriminant of x , . . . , xn is dened as

∆(x , . . . , xn) ∶= ≤i< j≤m

(xi − x j)e eagle-eyed reader will recognize the right hand side as a Vandermonde

determinant. e next, auxiliary lemma is simply an identity about a particularmultivariate polynomial.

L V.: Dene a polynomial g ∈ Q[x , . . . , xm , y] as

g(x , . . . , xm ; y) ∶= x∆(x + y, x , . . . , xm)+ x∆(x , x + y, . . . , xm) + + xm∆(x , x , . . . , xm + y).en

g(x , . . . , xm ; y) = x + + xm + m y∆(x , . . . , xm).Proof: Clearly g is a homogeneous (every monomial in g has the same total degree)polynomial of degree + deg ∆(x , . . . , xm). If we interchange xi and x j , then gchanges its sign, thus, if we set xi = x j the polynomial evaluates to . us (considerg as univariate in xi with all other variables part of a rational function eld) xi − x jdivides g. Hence ∆(x , . . . , xm) divides g.

We now concentrate instead on the variable y. e degree of g as a polynomialin y is at most one, and for y = the statement is obvious. We therefore only needto show that the coecient of y in g is m ∆.


If we expand g, the terms in y are xi yxi−x j

∆(x , . . . , xm) and − x j yxi−x j

∆(x , . . . , xm)for pairs i , j with ≤ i < j ≤ m. ese two terms add up to y∆, thus the m pairs(i < j) yield in sum m ∆(x , . . . , xm).

Using the dening recurrences, we can now verify an (otherwise unmotivated)formula for fλ :

L V.: If λ = (n , . . . , nm), then

fλ = ∆(n +m − , n +m − , . . . , nm)n!(n +m − )!(n +m − )!nm!

.

In fact this formula satises properties i-iv) as long as n+m− ≥ n+m− ≥ nm .

Proof: We shall show that the right hand side satises conditions i)-iv). To makethe recursion in iii) work we need to include the case that ni+ = ni + , but in thatcase ni+ +m − i + = ni + +m − i + = ni +m − i, that is two arguments of ∆become equal and the right hand side evaluates to .

With this, properties i) and iv) become trivially true.Concerning property ii), if we replace m by m + and have nm+ = , the de-

nominator changes by a factor (n +m + + )(nm + ) which is the same factorby which the numerator changes.

Concerning property iii), this is the statement of V. for xi = ni + m − i andy = −.

e Hook Formula

us, although the term “hooker” did notoriginate during the Civil War, it certainlybecame popular [. . . ] in tribute to theproclivities of General Joseph Hooker [. . . ]

Studies in Etymology and EtiologyD G

A much nicer way to describe the values of fλ is given by the following concept:

D V.: Given a cell in a Young diagram, its hook consists of the cell itselfas well as the cells to the right in the same row, and those in the same column below.e hook length is the number of cells in the hook.

T V. (Hook formula, F, R, T ()): e num-ber of tableaux of shape λ n is given by n!, divided by the product of all hooklengths.

V.. SYMMETRIC FUNCTIONS

Example: Consider λ = . e numbers in the diagram for λ indicate hooklengths:

usfλ = !

⋅ ⋅ ⋅ ⋅ ⋅ ⋅ ⋅ =

= .

Proof: Suppose that λ = (nm . . . , nm) n. en the hook length in the top lecorner is n + m − . e hook length of the second cell is n + m − , and so on,seemingly down to hook length in the last cell. As in the example above, certainlengths are skipped. is happens if a lower row is shorter than the top row. is isindicated by the -dots in the above diagram.

We are thus skipping (n +m− )−(nm), in the column aer the last row ends,(n +m − ) − (nm− + ) aer the second last row ends and so on.Going to the second row, we get hook lengths n + m − , n + m − , and so

on, but excluding (n +m − ) − (nm) for the last row, (n +m − ) − (nm− + )for the second last row, etc.

Combining this, we get as product of the hook lengths a fraction with numera-tor (ni+m−i)! for the i-th row, and the denominator combining the canceled termsto ∆(n+m−, n+m−, . . . , nm), proving the theorem with use of Lemma V..

We close this section with the remark that the numbers fλ are also the degreesof the irreducible (ordinary) representations of the symmetric group. Proving thisrequires a substantial amount of abstract algebra and cannot be done here..

A consequence is the nice identity that∑λn f λ = n!, which we will also prove

below in V.

V. Symmetric Functions

Reality favors symmetries and slightanachronisms

A la realidad le gustan las simetrıas ylos leves anacronismos

El SurJ L B

For a commutative ring R, consider the ring A = R[x , . . . , xN] of polynomialsin N variables

is is the posh way of dening this set of polynomials. e reader will be perfectly well suited ifshe simply sets R = Q and considers A as the set of rational polynomials in n variables.


For writing polynomials, it will be convenient to introduce the following no-tation. A composition (see I.) α = (n , n ,, nk) of a number M is a sequenceof numbers such that M = n + + nk . We also write α i = ni to denote the i-thpart of a composition α. We can consider partitions of M to be a special case ofcompositions. We shall use a vector notation for monomials, writing

xα ∶= xn xn

xnkk ,

this way an arbitrary polynomial may be written as∑α cαxα .

D V.: A polynomial f ∈ A is a symmetric function, if it stays the sameaer any permutation of the arguments: f (x , . . . xn) = f (xπ , . . . , xnπ) for π ∈ Sn .

Some literature instead talks about symmetric polynomials, meaning the sameproperty.

e set of symmetric functions clearly forms a subring of A, we denote thissubring by Λ.

N V.: One could, in the spirit of chapter VI consider this as an action of Snon the polynomial ring and ask what happens if we restrict to a subgroup. is isthe topic of Invarianteory – a beautiful subject but not the one we study here.

As a permutation of the indices leaves the degree of each term invariant wecan consider a symmetric function as a sum of homogeneous (that is each termhas the same total degree) parts. It therefore is sucient to consider homogeneoussymmetric functions of degree n, say.

As a vector space, we can consider Λ = Λ⊕Λ⊕ as a direct sum of subspacesΛ i consisting of homogeneous symmetric functions of degree i.

e following denition describes some particular classes of symmetric func-tions:

D V.: Let n > and λ = n + n + + nk n.

a) e Monomial Symmetric Function mλ is the sum∑α xα where α runs overall distinct permutations of (the entries of) λ.

b) e Elementary Symmetric Function em is the sum over all products of m(distinct) indeterminates, that is em = mm . We dene eλ = en ⋅ enenk .

c) eComplete Homogeneous Symmetric Function hm is the sum over all prod-ucts of m (possibly repeated) indeterminates: hn = ∑λn mλ . We denehλ = hn ⋅ hnhnk .

d) e Power Sum Function pm is the sum xm + xm + + xmN over the m-thpower of all N indeterminates. We dene pλ = pn ⋅ pnpnk .

It is easily seen that all of these functions are indeed symmetric.

V.. BASE CHANGES

Example: Suppose that N = and λ = + . en

mλ = x x + x

x + x x + x

x + xx + x

x ,eλ = (xx + xx + xx)(x + x + x),pλ = (x

+ x + x

)(x + x + x),hλ = eλ + pλ

N V.: It can be convenient to consider N arbitrary large or even innite – theidentities amongst symmetric functions still hold.

Each of these classes of symmetric functions allows to generate all:

T V.: Suppose that N ≥ n and f a homogeneous symmetric function ofdegree n in x , . . . , xN . en

a) f = ∑λn cλmλ for suitable coecients cλ .

b) f = ∑λn cλeλ for suitable coecients cλ .

c) f = ∑λn cλhλ for suitable coecients cλ .

d) f = ∑λn cλpλ for suitable coecients cλ .

In each case the coecients cλ are unique, furthermore in cases a,b,c), if f hasintegral coecients, then all cλ are integral.

C V.: Any symmetric function f (x , . . . , xN) can be written as a poly-nomial g(z , . . . , zN), where z stands for one of the symbols e,h, p.

N V.: For algebraists we remark that one can show that the ring of symmetricpolynomials in N variables has transcendence degree N and that the ei, the hiand the pi each form a transcendence basis of this ring.

Proof:[of eorem V., part a)] If f = ∑α cαxα is homogeneous of degree n, thenf = ∑λn cλmλ . C V.: e set mλλn is a vector space basis of Λn , thus dim(Λn) =p(n).

We shall see below that each of the other sets, eλ, hλ, pλ, also forms abasis.

V. Base Changes

e proof we shall give for the remaining parts of eorem V. will be based onlinear algebra: We shall consider the coecients of expressing one set of symmetric


functions in terms of another set, and obtain a combinatorial interpretation of thesecoecients. From this we shall deduce that the matrix of coecients is invertible(and possibly has an integral inverse). is implies that the other sets also form abasis and thus the remaining parts of the theorem.

As we will argue with matrices, it involves an arrangement of basis vectors, wethus need to dene an ordering on partitions:

D V.: Let λ = (λ , λ , . . . , λk) and Let µ = (µ , µ , . . . , µl) be parti-tions of n. We assume WLOG that k = l by allowing cells of size .

We say that µ ≤ λ in the natural order, (also called dominance order or ma-jorization order) if µ = λ and for any i ≥ we have that

µ + µ + + µi ≤ λ + λ + + λ i .Following eorem III., this partial order has a total order as a linear exten-

sion, one such total order is given in exercise ??.

D V.: Let A = (ai j) a matrix. We consider row sums ri = ∑ j ai j andcolumn sums c j = ∑i ai j and call row(A) = (r , r , . . .) the row sum vector, respec-tively col(A) = (c , c , . . .) the column sum vector.

Since the mλ form a basis there are integral coecients Mλµ , such that

eλ = µn Mλµmµ .

In fact Mλµ is simply the coecient of xµ in eλ .

L V.: Mλµ equals the number of (, )-matrices (that is, matrices whoseentries are either or ) A = (ai j) satisfying row(A) = λ and col(A) = µ.

Proof: Let λ = (λ , λ , . . .). Any term of eλ is a product of a term of eλ , eλ , etc. andany term of eλ i is the product of i dierent variables. We shall describe any termusing a matrix. Let

X = x x x x x x ⋮ ⋮

en each monomial xα of eλ is the product of λ entries from the rst row, λentries from the second row and so on. We describe the selection of the factors bya (, )-matrix A with indicating the selected factors.

e products contributing to a particular monomial xα correspond exactly tothe matrices with row sums λ and column sums α. e statement follows.

We denote by M = (Mλµ) the matrix of these coecients.As the transpose of a matrix exchanges rows and columns we get

C V.: e matrix M is symmetric, that is Mλµ = Mµλ .

V.. BASE CHANGES

We now establish the fact that the matrix M is, for a suitable arrangement ofthe partitions, triangular with diagonal entries . is shows that detM = , thatis M is invertible over Z. is in turn implies eorem V., b).

P V.: Let λ, µ n. en Mλµ = , unless µ ≤ λ∗. Also Mλλ∗ = .us, if the λ are arranged in (a linear extension of) the natural ordering, and theµ according to ordering of its duals, then M is upper triangular with diagonal .

Proof: Suppose that Mλµ = , so by Lemma V. there is a (, )-matrix A withrow(A) = λ and col(A) = µ. Let A∗ be the matrix with row(A∗) = λ and the ’sle justied, that is A∗i j = if and only if ≤ j ≤ λ i . By denition of the dual, wehave that col(A∗) = λ∗. On the other hand, for any j the number of ’s in the rst jcolumns of A is not less that the number of ’s in the rst j columns of A, so in thenatural order of partitions we have that

λ∗ = col(A∗) ≥ col(A) = µ.

To see that Mλλ∗ = we observe that A∗ is the only matrix with row(A∗) = λ andcol(A∗) = λ∗. N V.: is approach of proving that the eλ form a basis — expressing the eλin terms of the mµ , giving a combinatorial interpretation of the coecients in thisexpression, and deducing from there that the coecient matrix must be invertible— could also be applied to the other two bases, hλ and pλ . We shall instead giveother proofs, in part because they illustrate further aspects of symmetric functions.

For the reader who feels that we have moved far away from enumerating com-binatorial structures, we close this section with the following observation (whoseproof is le as exercise ??):

Suppose we have n balls in total, of which λ i balls are labeled with the numberi. We also have boxes , ,. . . . en Mλµ is the number of ways to place the ballsinto the boxes such that box j contains exactly µ j balls, but no box contains morethan one ball with the same label.

Complete Homogeneous Symmetric Functions

For the complete homogeneous symmetric functions hλ we get the following ana-log to Lemma V.:

L V.: Dene Nλµ as the coecient of xµ in hλ , that is

hλ = µn Nλµmµ .

en Nλµ equals the number of N-matrices A = (ai j) satisfying row(A) = λ andcol(A) = µ.

A consequence of the Cayley-Hamilton theorem is that the inverse of a matrix A is a polynomialin A with denominators being divisors of det(A).


Proof: Exercise ??

We now establish a duality between the e-polynomials and the h-polynomials.

D V.: We dene a map ω∶Λ → Λ by setting

ω∶ en → hneλ → hλ

and extending linearly.

As the eλλn for a basis of Λn this denes ω uniquely and shows that ω is alinear map. Since the ei are also algebraically independent (a fact we shall take asgiven), ω is also an algebra endomorphism, that is it preserves products.

P V.: e endomorphism ω is an involution, that is ω is the identity.In particular ω(hn) = en and ω(hλ) = eλ .

We note that eorem V., c) is an immediate consequence of this proposition.Proof: We go in the ring Λ[[t]] of formal power series over Λ and set

E(t) ∶= n≥

en tn ,

H(t) ∶= n≥

hntn .

We have that

E(t) = Nr=( + xr t),

H(t) = r( + xr t + x

r t +) = N

r=( − xr t)− ,

as can be seen by expanding the products on the right hand side. is implies thatH(t)E(−t) = . Comparing coecients of tn on both sides yields

= ni=(−)i ei hn−i , n ≥ .

We now apply ω, use ω(ei) = hi , and reindex and thus get

= ni=(−)i hiω(hn−i) and = n

i=(−)i hn−iω(hi)

We consider this as a linear system of equations with the ω(hi) as variables. ecoecient matrix of this system is lower triangular with diagonal entries h = ,thus has full rank. e solution therefore must be unique, but we know already

V.. BASE CHANGES

that the ei form a solution, proving the theorem.

We nally come to the power sum functions pλ . Dene

P(t) ∶=n≥

pn tn−

P V.: ddt H(t) = P(t)H(t) and d

dt E(t) = P(−t)E(t).Proof:

P(t) = r≥

pr tr− =r≥

Ni=

xri tr−

= Ni=

xi − xi t =

ddt

Ni=

log( − xi t)−

= ddt

log Ni=( − xi t)− .

e result for H follows by logarithmic dierentiation. e argument for E is ana-log.

By considering the coecients of the power series, we can write this result inthe form nhn = ∑n

r= prhn−r . is allows us to express the hi in terms of the pi ,albeit at the cost of introducing denominators. For example, h =

(p + p).

erefore the pλ generate as vector space and we get eorem V. d); albeitnot necessarily with integral coecients.

We close this section with the mention of another basis of symmetric polyno-mials which is of signicant practical relevance, but more complicated to dene.Given a partition λ = (λ , λ , . . .), dene

snumλ(x , . . . , xn) = det

xλ+n− xλ+n−

xλ+n−n

xλ+n− xλ+n−

xλ+n−n⋮ ⋮ ⋮

xλn xλn

xλnn

.

By the properties of the determinant, snum is invariant under all even permuta-tions of the variables, but is not symmetric. It thus must be divisible by the Vander-monde determinant∏i< j(xi − x j), and the quotient

sλ(x , . . . , xn) = snumλ(x , . . . , xn)∏i< j(xi − x j)


is symmetric. We call the set of sλ the Schur polynomials. ey form another basisof Λn , but we will not prove this here.

Amongst the many interesting properties of Schur functions, one can show,for example, that in the expression sλ = ∑µ Kλµmµ , the coecient Kλµ (calleda Kostka Number) is given by the number of semistandard tableaux (that is, weallow duplicate entry and enforce strict increase in columns, but allow equality orincrease in rows) of shape λ. Also the base change matrix from pλ to sλ is thecharacter table of the symmetric group Sn .

V. e Robinson-Schensted-Knuth Correspondence

From ghoulies and ghostiesAnd long-leggedy beastiesAnd things that go bump in the night,Good Lord, deliver us!

Scottish PrayerT

We now investigate a miraculous connection between tableaux and permuta-tions. For this we relax the denition of a standard tableau to not requires entriesfrom , . . . , n, but allow larger entries (thus skipping some smaller numbers). Italso will be convenient to consider permutations as sequences of images.

e crucial step is an insert routine that takes a tableau and a number and in-serts the number in the tableau, adding one cell and moving some entries, so thatthe tableau property is preserved.

e routine proceeds by inserting the entry a in the rst row. Let x be the small-est entry in this row for which x ≥ a. If no such element exists, add a (in a new cell)at the end of the row. Otherwise a “bumps out” x and takes its place in this row.en x is inserted into by the same process into the second row, possibly bumpingout another number which then is inserted into the third row and so on.Example: Suppose we want to insert into the tableau

e smallest entry of the rst row larger than is , which gets bumped out,resulting in the row , , , , , and entry being inserted into the second row.

ere is the smallest entry larger than and gets bumped out, resulting inrow , , , .

V.. THE ROBINSON-SCHENSTED-KNUTH CORRESPONDENCE

Inserting into row three bumps out . Finally is added at the end of rowfour, obtaining the tableau

To be able to prove statements, we now describe this process in a formal wayfor a tableau T = (Ti j). For convenience we assume that the tableau is bordered by’s to the top and le, and by∞ to the bottom and right, so that Ti j is dened forall i , j ≥ .

We also dene a relation on entries of T by

a b if and only if a < b or a = b = or a = b =∞,

with this the property of being a tableau is given by the following characterization

Ti j = if and only if i = or j = ; (V.)Ti j Ti( j+) and Ti j T(i+) j , for all i , j ≥ . (V.)

We write x ∈ T if x =∞ or x = Ti j for all i , j ≥ .

A V. (Insert): Let T = (Ti , j) a tableau and x ∈ T a positive integer.This algorithm transforms T into another tableau that contains x in additionto the elements of T and adds one cell in a position (s, t) determined by thealgorithm.

1. [Input x.] Set i ∶= , x ∶= x, and j smallest such that T j =∞.

2. [Find xi+.] (At this point T(i−) j < xi < Ti j and xi ∈ T.) If xi < Ti( j−)then decrease j and repeat the step. Otherwise set xi+ ∶= Ti j and ri ∶= j.

3. [Replace by xi .] (Now Ti( j−) < xi < xi+ = Ti j Ti( j+), T(i−) j < xi <xi+ = Ti j T(i+) j, and ri = j.) Set Ti j ∶= xi .

4. [At end?] (Now Ti( j−) < Ti j = xi < xi+ Ti( j+), T(i−) j < Ti j = xi <xi+ T(i+) j, ri = j, and xi+ i nT.) If xi+ =∞ then increase i by andreturn to step 2.

5. return s ∶= i, t ∶= j and terminate. (At this point Tst = ∞ and T(s+)t =Ts(t+) =∞.)

Proof: e parenthetical remarks in the algorithm ensure that T remains a tableauat each step.


e algorithm implicitly denes a “bumping sequence”

x = x < x < < xs < xs+ =∞,

as well as column indicesr ≥ r ≥ ≥ rs = t.

We now consider a slight generalization of permutations, namely two-line ar-rays of integers A=

q q qnp p pn

(V.)

such that q < q < qn and that all pi are dierent. Permutations thus are thespecial case of qi = i and p , p , . . . , pn = , . . . , n.

e following algorithm takes such an array and produces a set of two tableauxof the same shape

A V. (Robinson-Schensted-Knuth): Given a two-line array of theform (V.35), the following algorithm produces two tableaux P, Q of the sameshape with entries from the pi , respectively the qi :

1. Initialize P and Q to the empty tableau. Let i ∶= .

2. Insert pi into P, obtaining a position (s, t).3. Place qi into a new cell at position (s, t) in Q.

4. Increment i. If i ≤ n go to step 2. Otherwise terminate and return P andQ.

Proof: e proof of Algorithm V. shows that P is a tableau. Furthermore, theposition s, t returned is always a valid position to add a cell to the Young diagram.As the entries of qi are increasing this shows that Q always is a tableau as well. We abbreviate this process as RSK and note that if the array is permutation both Pand Q are standard tableaux with entries , . . . , n.Example: We illustrate the RSK process with the example of the array

.Insert and P = , Q =

Insert and P = , Q =





Amazingly this process is a proper bijection

T V. (RSK Correspondence): ere is a bijection between the set of per-mutations of , . . . n and the set of ordered pairs (P, Q) of tableaux of the sameshape, formed from , . . . , n. Under this bijection, if g corresponds to the part(P, Q), then g− corresponds to (Q , P).Example: e permutation (, , , , , ) ∈ S has an array with second row , , , , , and gives the tableaux

P = , Q =

.

Before proving this theorem we notice a number of consequences concerningthe number fλ of tableaux:

C V.: λn

f λ = n!

Example: For n = there are partitions and respective tableau counts of f = ,f = , f = , f = , f = . Also + + + + = = !.

C V.: e number of involutions (permutations that are self-inverse,see II. for another formula) is given by∑λn fλ .

Proof: A permutation g is an involution if and only if g = g−. at means thatinvolutions correspond to pairs with P = Q. Example: S has one identity, three elements of shape (, )(, ) and

= ele-ments of shape (, ), thus = + + + + involutions.

Proof of the RSK correspondence

e unusual nature of these coincidencesmight lead us to suspect that some sortof witchcra is operating behind the scenes.

e Art of Computer Programming,Section ..: Tableaux.

D E. K

e rst part of the proof is to observe that Algorithm V. can be reversed:We say that a cell of a tableau is “at the edge”, if there is no cell to the right or

below. e insertion algorithm adds a cell which is on the edge of the new tableau.


Now suppose we have a tableau into which an element x has been inserted,resulting in a new cell at the border of the resulting tableau P, say at coordinates(s, t). If we are in row s = , the entry of the cell is simply the element x.

Otherwise the entry xs must have been bumped out of the previous row. It wasbumped out by an element xs− and xs was the smallest number in the row thatwas larger than xs−. at is, we can identify xs− in P as the largest number in rows − that is smaller than xs . We thus can simply put xs back into its old positionand now have to deal with xs−. If s − = it was the element x inserted into thetableau, otherwise we repeat the process for s − and so on.

We can consider this process as an “uninsert” algorithm that takes an (arbitrary)tableau P, a position (s, t) and returns a tableau P′ missing a cell in position s, t)as well as an element x. (It is easily see that at all steps the tableau properties aremaintained.)

e description of the uninsert process also ensures that, aer uninserting, wecould obtain P back by inserting x into P′ – insert and uninsert are mutual inverses.Proof:(of eorem V., part ) To establish a bijection, we describe an inverse pro-cess to Algorithm V.: Given two tableaux P, Q, we produce a two-line array A.

. Let A be the empty two-line array.

. Identify the position (s, t) of the largest entry q of Q. ((s,t) must be at theedge of Q.) Remove that entry q from Q.

. Apply the uninsert process to P for position (s, t), resulting in the removalof an element x.(and reducing P to a tableau as the same shape as the reducedQ.)

. Add the column qx at the front of A.

. Unless P and Q are empty, go back to step .

e result of this process is a two-line array A with the rst row consisting of thenumbers , . . . , n in canonical order, and the second row containing each of thesenumbers exactly once, that is a permutation.

It is easily seen that this process and the RSK algorithm are mutually inverse,thus establishing the bijection.

To prove the second part, inversion corresponding to swap of the tableaux, weneed further analysis of the RSK process:

If we consider the rst row of the P-tableau, the construction process insertssome numbers, and bumps some out. Once a number has been bumped it will neveraect row again.

Furthermore, the lower rows of P and Q simply are tableaux correspondingto the “bumped” two-line arrays, that is two-line arrays that consist of the entries


bumped out of the rst row and the associated q-entries. (is bumped array is thereason why we do not require the rst row to be the canonical sequence , , . . . .)

In the example of the permutation used above, we inserted

, , in the top row and bumped and , resulting in the two-line array Abump =

describing the second row and below.

is shows that the RSK algorithm can be split into two parts – constructingthe rst row only, and constructing the bumped two-line array. We consider bothparts:

Construction of the rst row To study the construction of the rst row – re-member that every entry of P is inserted rst into this to row – we dene:

D V.: Consider a two-line array as given in (V.). A column (qi , pi)of this array is in class t, if aer inserting p , . . . , pi into an empty tableau, pi is incolumn t of the tableau.

Example: Looking at the example above, (, ) and (, ) are in class ; (, ) is inclass ; and (, ) and (, ) are in class . We can easily characterize the class of acolumn:

L V.: e column (qi , pi) belongs to class t if and only if t is the length ofthe longest increasing subsequence pi < pi < . . . < pit = pi ending at pi .

Proof: For pi to be in class t, that is inserted into column t, there must be t − elements in the rst row that are smaller. ese elements must be p j ’s for j < ishowing that these p j ’s together with pi give an increasing subsequence of lengtht. We thus need to show the converse and will do so by induction on the length tof the longest increasing subsequence ending at pi :

For t = , pi is the smallest element and clearly will be inserted in column .Now assume that t > and that the theorem holds for class t − . Let p j the prede-cessor of pi in an increasing subsequence of length j. en by induction p j belongsto class t − , that is p j was inserted into the t − -st column. at means that at thepoint of inserting pi , the entry in this position is not larger than p j , implying thatpi must be inserted in column t or higher. But if it was higher there would be anincreasing subsequence of length > t, contradiction.

C V.: e length of row of the tableau is the length of a longest in-creasing subsequence of pi.Classes and bumped tableaux We observe that the tableaux resulting from theRSK process are completely determined by the columns of the two-line array andtheir classes, regardless of the order in which these columns are given:


First observe, that if (qi , pi), . . . , (qik , pik) form the set of columns in oneparticular class, the way the insertion process and bumping works means that wecan assume without loss of generality that the indices have been arranged so thatqi < qi < < qik and simultaneously pi > pi > > pik .

If we consider the rst row of tableau P, the entry in column t must be the p-partof a column (q, p) in class t. Because of bumping, it will be the (smallest) value pik .Similarly the corresponding entry in Q must be the q-part of such a column, but itis the q-part used when the entry is created for the rst time, that is the minimalentry qi . is describes the rst row of P and Q fully.

For the further rows consider the bumped tableaux. To describe these it is suf-cient to see what becomes of columns within class t (as entries from dierentclasses never bump one another). e entry pi gets bumped by pi which comeswith q-part qi , resulting in a column (qi , pi). Similarly pi gets paired with qiand so on, giving the (partial) two-line array arising from class t as

Abump = qi qi qikpi pi pik−

. (V.)

We then proceed from the bumped array anew to obtain new classes and from thesethe second rows of P and Q and so on.

With this observation, we are now able to prove the second part of the theo-rem, concerning the swap of P and Q. As the two tableaux are constructed by verydierent methods, this is a most surprising symmetry.Proof:(of eorem V., part ) On the level of two-line arrays, inversion means toswap the two rows and sort the columns according to the entries in the rst row.Denote this array by

A− = p′ p′ p′nq′ q′ q′n . (V.)

If we ignore the arrangement of columns, Lemma V. states that a column (qi , pi)is in class t if and only if t is the maximal size of an index set i , . . . , it such that

qi < qi < < qit andpi < pi < < pit .

is means that (q, p) is of class t in (V.), if and only if (p, q) is of class t in(V.).

at implies that the RSK algorithm applied to A− produces tableaux whoserst rows are exactly swapped from the tableaux produced for the original two-linearray A.

Furthermore, following the formula (V.) for the bumped array – within oneclass, remove the smallest p and the smallest q entries and “shi the entries togeth-er” – we have that A−bump is obtained from A−bump by swapping the two rows andsorting according to the rst row.


Applying the same argument again thus gives us that the second rows etc. alsoare swapped, as was to be proven.

ere is much more to the RSK construction. For example reverting the per-mutation (as a sequence of numbers) results in a transposition of P (but a some-what more complicated transformation ofQ namely the transpose of a transformedtableau.)

C

VI

Symmetry

Tyger Tyger, burning bright,In the forests of the night;What immortal hand or eye,Could frame thy fearful symmetry?

e TygerW B

e use of symmetries is one of the tools of science that crosses into manydisciplines. So far we have only touched upon this, in a rather perfunctory way,by considering objects as identical or non-identical, or by counting labeled versusnon-labeled objects. In this chapter we shall go into more detail. In combinatorics,there are at least three uses of symmetry:

• We might want to consider objects only up to symmetry, thereby reducingthe total number of objects. For example when considering bracelets madefrom green and one golden pearl, we want to consider this is one bracelet,rather than possible positions of the golden pearl. We will look at countingup to symmetry in Section VI..

• Symmetries can be used to classify what is sometimes a huge number of ob-jects into equivalence classes. For example, relabeling points in a permuta-tion preserves the cycle structure.

• Finally, in a related way, we might want to use symmetries as a proxy for ob-jects being interesting. When combining parts, there might be a huge num-ber of possible objects. For example there are

OEIS A

CHAPTER VI. SYMMETRY

Figure VI.: Two graphs with dierent symmetries

dierent graphs on vertices. But, be it graphs or other objects, it turns outthat many of them are just stuck together from smaller objects and do notamount to more than the sum of their parts. On the other hand symmetriesthat do not keep the parts xed oen indicate something interesting. Fig-ure VI. shows two graphs with vertices and edges. Clearly the graph onthe right (which has nontrivial symmetries) is more interesting than thele one (with only trivial symmetries).

Of the formal setting in mathematics for symmetries is a group. Instead of goingall abstract, we will consider rst, what we want symmetries to be, and then get theformal axioms as an aerthought. So, what are symmetries? If we have an object(which can be a set of smaller objects), a symmetry is a map from the object to itself,may be mixing up the constituent parts, that preserves some structural properties.

VI. Automorphisms and Group Actions

Nothing will make me believe that a man whoarranged all the rest of that room with thatexaggerated symmetry le that one feature of itlopsided.

e Worst Crime in the WorldG. K. C

We call such a map an automorphism, from the greek autos for self and morphefor shape. ere always is at least one such map, namely the identity which does-nt do anything. In an abuse of notation we shall write for the identity map. Bypreserving the larger collection, the maps have to be bijective, and thus aord aninverse. Finally we can compose these maps, that is apply one aer the other. As al-ways, composition of maps is associative. we thus get the formal axioms of a group,though this axiomatic theory shall be of little concern to us. Following conventionsintroduced in [Wie], we denote groups by upper case letters, group elements bylower case letters, and objects the group acts on by greek letters. If the objects weact on have numbered parts or are themselves numbered, it can be convenient to

VI.. AUTOMORPHISMS AND GROUP ACTIONS

represent maps as permutations. Indeed, if we act on a set of objects with numberedparts, permutations of these numbers yield automorphisms.

Formally, we can distinguish maps and the permutations induced by them, inthis case an action of a group G on n numbered objects yields a group homomor-phism G → Sn , assigning to every map the permutation induced by it.

We shall write the image of a point ω under a map g as ωg . at is our groupsact on the right. at way the image of Omega under the product of g with h issimply gh. Similarly permutations multiply as (, , ) ⋅ (, ) = (, ). (While thisis clearly the natural way for languages that are written from le to right, the readerwho had abstract algebra might notice that in some classes the convention used isan action from the le, which then requires the introduction of a special productoperation for composition of maps.)

In some situations we might want to consider what maps do to parts, collec-tions, or structures derived from the original objects. For this, it is most convenientto not consider new maps, but a new action of the original maps. e rules for thisare simply that the identity must act trivially, and that products act from le toright, that is

ω = ω ∀ω ∈ Ω, and ω(gh) = (ωg)h ∀ω ∈ Ω∀g , h ∈ GSuch an action, given by a group G, a domain Ω, and a map Ω ×G → Ω, (ω, g)ωg , is called a group action. Group actions are the basic objects we shall consider.

Examples

Before delving into further details of the theory, lets consider a few examples.e set of all permutations of n points form a group, called the symmetric group

Sn . It acts on the points , . . . n, but we also can have it act on opther structuresthat involve these numbers, for example all subsets of size k, sequences of length k,or partitions with k cells.

If we consider a bracelet with pearls as depicted on the le in Figure VI.,then automorphisms are given for example by a rotation along the bracelet (whichwe shall denote by the permutation (, , , , , )), or by ips along the dashedreection axes. For example, the ip along the axis going between and is givenby the permutation (, )(, )(, ). Together with the trivial rotation this gives++ = symmetries. We also observe that there are positions where can go,and then positions where goes with which an automorphism is fully determined.e whole symmetry group thus has order . (For those who had abstract algebra,it is the dihedral group of order .)

Next consider symmetries of a cube that are physically possible without dis-assembling, that is only rotations and no reections. (However in other cases onemight make another choice, indeed when talking about automorphisms one needsto decide what exacty is to be preserved.) Besides the identity there are rotationseach for the three pairs of opposite faces, rotations for each of the diagonal axes,


1

5

4 3

2

6

1

7

8

4 3

9

105

6

2

Figure VI.: A circle graph and the Petersen graph

as well as rotation on axes that go through the middle points of pairs of oppositefaces, for a total of +++ = symmetries. (Again, the images of neighboringfaces gives us at most ⋅ = possibilities, thus we found all.)

e set of automorphisms of an object X is called its automorphism group anddenoted by Aut(X). (As noted that might depend on what structure is to be pre-served. e automorphism group of the faces would be S if we did not want topreserve the cube structure.)

It is not always that easy to determine the number of automorphisms (alsocalled the order of the automorphism group). Take the graph on the right side ofFigure VI. (which is called the Petersen-graph). An automorphism of a graph is apermutation of the vertices that preserves adjacency. By checking this property, it iseasy to verify that the permutation (, , , , )(, , , , ) is an automorphism,as is (with a bit more eort) (, )(, ), (, ). But is is not trivial to show thatthere are automorphisms in total, and that any automorphism can be formed asa suitable product of the two permutations given. We will look at this in Exercise ??.

For an example with a larger set of objects, consider the order-signifcant parti-tions of n into k parts and Sk acting by permuting the parts.

Finally, we consider the set of all possible graphs on three vertices, labelled with, , , , as depticted in Figure VI.. ere are

= possible edges and thus = dierent graphs. e groul S acts on this set of graphs by permuting the vertices.(When doing so, the number of edges (and more — On four vertices we cannot mapa triangle to a Y) is preserved, so not every graph may be mapped to any other.

In general, for two given graphs on n (labelled) vertices, a permutation in Snthat maps one graph to the other is called a graph isomorphism, if such a map existsthe graphs are called isomorphic.


1 2 3

1 3 2

2 1 3

1 2 3

1 3 2

2 3 1

1

2

3

1 2

3

Figure VI.: e labelled graphs on vertices.

12

1

7

8

4 3

9

105

6

2

1

7

8

4

3

9

10

5

6

2

11

1

7

8

4

3

9 10

5

6

2

Figure VI.: ree nonisomorphic graphs that are regular of degree .

N VI.: e question of determining graph isomorphisms is a prominent [?]and hard problem. While local properties — number of edges or degrees of vertices— can eliminate the existence of automorphisms, one can show that for any setof local properties, there are graphs in dierent orbits which agree on all of theinvariants.

Figure VI. shows three graphs, each of them regular of degree . e le graphhas trivial automorphism group, the middle graph an automorphism group of or-der . e right graph is in fact, with the given labelling, isomorphic to the Pe-tersen graph (with automorphism group of order ), though this isomorphism isnot obvious without the labelling – there are ! possible maps, only of them,that is in induce an isomorphism.

We close this section with the denition of an equivalence of actions:

D VI.: Let G be a group acting on a set Ω and H a group acting on aset ∆. We call these actions equivalent (actions), if there is a group isomorphismα∶G → H and a bijection ψ∶Ω → ∆, such that

ψ(ωg) = ψ(ω)α(g)at means if a group G acts on a set Ω of size n, inducing a homomorphism

φ∶G → Sn , the action of G on Ω and the action of φ(G) on , . . . , n are equiva-lent.


Orbits and Stabilizers

Ever he would wander, selfcompelled, to theextreme limit of his cometary orbit.

UlyssesJ J

e two main concepts for working with group actions are orbits and stabiliz-ers.

D VI.: Let G act on Ω and ω ∈ Ω.a) e Orbit of ω is the set of all possible images:

ωG = ωG g ∈ G.b) e stabilizer of ω is

StabG(ω) = g ∈ G ωG = ω.Since group elements are invertible, the orbits of a group form equivalence

classes. ey represent “equivalence up to symmetry”. If we want to describe ob-jects up to symmetry, we really ask to describe the orbits of an appropriate symme-try group.Example: Determine orbits in the examples of the previous section?

D VI.: e action of G on Ω is transitive if there is only one orbit, thatis Ω = ωG for (arbitrary) ω ∈ Ω. An element ω ∈ Ω, that is the only one in its orbit— i.e. ωG = ω— is called a xed point of G.

e stabilizer of a point is closed under products and inverses and thus alwaysforms a group itself. Indeed,the automorphism group of an object can be consid-ered as a stabilizer of the object in a larger “full” symmetry group: In the case ofthe bracelet we can consider the set of all permutations of the pearls and want on-ly those that preserve the neighbor relations amongst pearls. e automorphismgroup of a cube (centered at the origin) is the stabilizer of the cube in the orthog-onal group of -dimensional space.

e key theorem about orbits and stabilizers is the bijection between orbit ele-ments and stabilizer cosets. Two elements g , h map ω in the same way if and onlyif their quotient lies in the stabilizer:

ωg = ωh ⇔ ω = ω(g⋅g−) = (ωg)g− = (ωh)g− = ωhg ⇔ hg ∈ StabG(ω).is means that if g is one element such that ωg = δ then the set of all elementsthat map ω in the same way is (what is called a coset):

StabG(g) ⋅ g = s ⋅ g s ∈ StabG(ω).For any orbit element δ ∈ ωG there therefore are exactly StabG(ω) elements of Gthat map ω to δ. We therefore have


T VI. (Orbit-Stabilizer): ere is a bijection between elements of ωG andcosets of StabG(ω). In particular:

G = ωG ⋅ StabG(ω) .In particular, this means that the length of an orbit must divide the order of the

acting group.Example: A group of order , acting on points thus must have xed points.

We note some consequences from this bijection. ey show that the possibletransitive actions of a group G are prescribed by its structure:

L VI.: Let G act on Ω and ω, δ ∈ Ω with g ∈ G such that ωg = δ. en

a) StabG(ω)g = StabG(δ).b) e set of elements of G mapping ω to δ is

StabG(ω) ⋅ g = g ⋅ StabG(δ) = StabG(ω) ⋅ g ⋅ StabG(δ).c) e action of G on the cosets of S = StabG(ω) by (Sx , g) S(xg) is com-

patible with the bijection: If ωx = δ, ωy = γ and δ g = γ then S(xg) = Sy.

In cases such as c), of a bijection between sets being compatible with groupactions (maybe even involving an isomorphism of groups) we talk about an equiv-alence (or isomorphism) of group actions.

We now dene a directed graph that describes the action of certain elementsA ⊂ G on Ω. (Typically A will be a generating set for G, that is every element of Gis a product of elements of A and their inverses.)

D VI.: Let SchA(Ω) = (Ω, E) the digraph with vertex set Ω and edgesE ⊂ Ω × Ω labelled by elements of A: We have that (ω, δ) ∈ E if and only if thereexists an a ∈ A such that ωa = δ. In this case we label the edge (ω, δ) with a. Wecall SchA(Ω) the Schreier graph for the action of A on Ω.

For example, Figure VI. gives the Schreier graph for the action of S, generatedby the permutations a = (, ) and b = (, , , ) on sets of order (which formjust one orbit).

By tracing paths between vertices and multiplying up the elements of A on theedges (inverses, if we go an arrow in reverse), we obtain a factorization of an ele-ment, mapping one point to another, as a word in A. In the example above, we canread o that b will map , to , , as would aba, b−, bab, and so on.

Puzzles, such as Rubik’s cube are essentially just about nding such factoria-tions, in examples in which the group is far too large to draw the Schreier graph.

aer O S, -


1,2

1,3

1,4

2,3

2,4

3,4

b

b

b

b

a

a

a

a

b

Figure VI.: Schreier graph for the action of a = (, ), b = (, , , ) on pairs.

We can determine the orbit of a point as the connected component of the Schreiergraph, starting from a vertex and continuing to add images under group genera-tors, until no new images are found.images under group generators, until no newimages are found. (One needs to store only vertices encountered this way, not theedges.) A spanning tree of a connected Schreier graph is called a Schreier tree, it iseasy to see that oen there will be many possible Schreier trees, of dierent depth.

We note that loops in the Schreier graph images yield stabilizer elements, that isif ωg = δ, δa = γ, and ωh = γ, then ga and h are in the same coset of S = StabG(ω)and gah ∈ S. S’s theorem (which we shall not prove here) shows that ifwe do so systematically for all loops that get created from elements of A, we obtaina generating set for StabG(ω).

is in fact provides the tool — called the Schreier-Sims algorithm — by whichcomputers calculate the order of a permutation group: ey calculate the orbit ofa point and generators for its stabilizer, then recurse to the stabilizer (and anotherpoint). By the Orbit-Stabilizer theorem the group order then is the product of theorbit lengths.Example: An icosahedron has faces that are triangle shaped. Rotations can mapany face to any other face, yielding an orbit of length . e stabilizer of a face canclearly have only rotations, yielding a rotational automorphism group of order. Similarly we obtain a rotation/refelction automorphism group of order .

VI. Cayley Graphs and Graph Automorphisms

Spread oot, lads, and pretend ye’re enjoying thecailey

e Wee Free MenT P

D VI.: An action of G on Ω is semiregular if the stabilizer of any ω ∈ Ωis trivial. If the action is also transitive, it is called regular.

VI.. CAYLEY GRAPHS AND GRAPH AUTOMORPHISMS

(1,3)

()

(2,3) (1,2)

(1,2,3) (1,3,2)

(1,3)

b a

a b

b a

()

(2,3) (1,2)

(1,2,3) (1,3,2)

y

x xx

y

x xx

y

Figure VI.: Two Cayley graphs for S.

By Lemma VI., a regular action ofG is equivalent to the action on the cosets ofthe trivial subgroup, that is the action of G on its elements by right multiplication:(x , g) xg. If we have a generating set A of G, the Schreier graph for this actiongets a special name, it is called the Cayley graph of G (with respect to A).

Figure VI. shows the Cayley graphs for S for the generating set a = (, ),b = (, ), as well as for the generating set x = (, , ), y = (, ).

We notice (Exercise ??) that G induces automorphisms of the Cayley graphCayA(G). To be consistent with the way we dened actions and with the labellingof edges, the action of g ∈ G is dened by (v , g) g−v for a vertex v ∈ G. Wedenote this automorphism by µg .

However Cayley graphs can, as graphs that ignore labels and directions, haveother automorphisms. For example, the graph in Figure VI., le, has a dihedralgroup of order of symmetries, no all of which make sense for S (a rotation forexample would change element orders).

Such extra automorphisms however will not preserve edge labels:

L VI.: Let Γ = CayA(G) and α ∈ Aut(Γ). en α preserves the labels of alledges, if and only if there exists g ∈ G such that α = µg , that is vα = g−v for allv ∈ G.

Proof: If x ⋅ a = y then g− ⋅(x ⋅ a) = g− ⋅ y. us all µg preserve the edge labels. Let be the identity of G and g ∶= α . en α ⋅ µg is an automorphism that maps to .Furthermore (as µg preserves edge labels), α ⋅µG preserves edge labels if and only ifα does. We thus may assume, without loss of generality, that α = . (We shall showthat in this case α must be the identity map that is µ. As α ⋅ µg = µ we get thatα = µg− .)

By denition, for every a ∈ A, there exists a unique vertex (namely the vertexa) that is connected by an edge from with label a. at means that this edge mustbe xed by α and thus aα = a. e same argument for edges to shows that a−

must be xed by α for any a ∈ A. is shows that every vertex at distance one from


must be xed by α.e same argument now can be used by induction to show that all elements at

distance , . . . from must be xed by α. As A is a generating set of G, this is allelements, thus α must be the identity, completing the proof.

is result implies that all nite groups can be considered as the automorphismgroup of a suitable graph.

T VI. (F): Let G be a nite group. en there exists a nite graphΓ such that Aut(Γ) ≅ G.

Proof: We have seen already that every group is the automorphism group of a di-rected graph with labelled edges, for example its Cayley graph. What we will simplydo is to simulate direction and labelling by extra vertices and edges. First for direc-tion: For each directed edge a → b, we introduce three new vertices x , y, z andreplace the directed edge by undirected edges a − x − y − b, as well as y − z. atis, a and b are still connected, and z indicates the arrow direction. e z-verticesof this new graph are the only ones of degree , so the automorphisms must mapa z-vertex to a z-vertex, and thus an y-vertex to a y-vertex. We also add a ”uni-versal” vertex X which we connect to all x-vertices. en (ensured if necessary byadding further vertices we connect to X) the vertex X is the only vertex of its degreeand thus must remain xed under any automorphism, implying that the x-verticesmust be permuted amongst each other as well. us the “original” vertices of thegraph cannot be mixed up with any of the newly introduced vertices representingedges. Also distance arguments show that each group of xyz vertices must staytogether as a group. e automorphisms of this new undirected graph thus are thesame as the automorphisms of the original directed graph.

To indicate edge label number i, we insert i − extra vertices into the y − zedges, replacing y − z by y − z − z − c ⋅ ⋅ ⋅ − zi− − z. Automorphisms then need topreserve these distances and thus are restricted to preserving the prior labelling. (Frucht in fact proved a stronger theorem, namely that every nite group is theautomorphism group of a -regular graph, i.e. a graph in which every vertex hasdegree .)

VI. Permutation Group Decompositions

Permutation:e theory of how hairdos evolved

e New Uxbridge DictionaryB-T ..

Finding the automorphism group of an object can be hard. In many practicallyrelevant cases, however the objects have substructures that need to be preserved

VI.. PERMUTATION GROUP DECOMPOSITIONS

by automorphisms. In this case we can relate automorphisms of the structure withautomorphisms of the substructures in a meaningful way.

In this section we shall look at two kinds of decompositions that lend them-selves to nice group theoretic structures.

By labelling relevant parts of the object acted on by G we can assume withoutloss of generality that the automorphism group group on a set Ω.

We start with the situation that Ω can be partitioned into two subsets that mustbe both preserved as sets, that is Ω = ∆ ∪Λ with ∆ ∩Λ = . is could be distinctobjects — say vertices and edges of a graph — or objects that structurally cannotbe mapped to each other – for example verticed of a graph of dierent degree. (epartition could be into more than two cells, in which case one could take a unionof orbits and iterate, taking the case of two as base case.)

In this case G must permute the elements of Ω amongst themselves, as well asthe elements of Λ. We thus can write every element of G as a product of a permu-tation on ∆ and a permutation on Λ. Since the sets are disjoint, these two permu-tations commute.

Group theoretically, this means that we get a homomorphismφ∶G → S∆ , as wellas a homomorphism ψ∶G → SΛ . e intersection Kernφ∩Kernψ is clearly trivial,as elements of G are determined uniquely by their action on Ω = ∆ ∪ Λ. We thuscan combine these maps to an embedding (an injective group homomorphisms)

ι∶G → S∆ × SΛ , g (φ(g),ψ(g))If we consider the direct product as acting on the disjoint union of ∆ and Λ, thismap is simply the identity map on permutations.

e images of φ and ψ are typically not the whole symmetric groups. us,setting A = φ(G) and B = ψ(G), we can consider G as a subgroup of A × B, andcall it a subdirect product of A and B. All intransitive permutation groups are suchsubdirect products.

Note that a subdirect product does not need to be a direct product itself, butcan be a proper subgroup, see Exercise ??. ere is a description on possible groupsarising this way, based on isomorphic factor groups. Doing this however requiresa bit more of group theory than we are prepared to use here.

We also note that it is not hard to classify subdirect products abstractly, andthat this indeed has been done to enumerate intransitive permutation groups.

Block systems andWreath Products

e second kind of decomposition we shall investigate is the case of a group G thatis transitive on Ω, but permutes a partition into subsets (which then need to be ofequal size).

For example, in the two graphs in Figure VI., one such partition would be giv-en by vertices that are diagonally opposed. Neither diagonal is xed, but a diagonalmust be mapped to a diagonal again.


Figure VI.: Two graphs with an imprimitive automorphism group.

In the graph on the right, constructed from four triangles that are connectedin all possible ways with the two neighbors, such a partition would be given by thetriangles.

We formalize this in the following denition:

D VI.: Let G act transitively on Ω. A block system (or system of imprim-itivity) for G on Ω is a partition B of Ω that is invariant under the action of G.

Example: For example, takingG = GLn(F) for a nite eld F, acting on the nonzerovectors of Fn , the sect of vectors that span the same -dimensional space, that isthose that are equivalent under multiplication by F∗, form a block system.

We note a few basic properties of block systems, the proof of which is le as anexercise.

L VI.: Let G act transitively on Ω with Ω = n.

a) ere are two trivial block systems, B = ωω∈Ω , as well as B∞ = Ω.b) All blocks in a block system must have the same size.

c) If B is a block system, consisting of a blocks of size b, then n = ab.

d) If B is a block system, then G acts transitively on the blocks in B.

e) A block system is determined uniquely by one of its blocks.

We note — See section ?? — that part a) of the lemma is the best possible –there are groups which only aord the trivial block systems.

A connection between blocks and group structure is given by the followingproposition which should be seen as an extension of eorem VI.:

P VI.: Let G act transitively on Ω and let ω ∈ Ω with S = StabG(ω).ere is a bijection between block systems of G on Ω and subgroups S ≤ U ≤ G.

Proof: We will establish the bijection by representing each block system by the blockB containing ω. For a block B with ω ∈ B, let StabG(B) be the set-wise stabilizer

VI.. PERMUTATION GROUP DECOMPOSITIONS

of B. We have that S ≤ StabG(B), as S maps ω to ω and thus must x the block B.Since G is transitive on Ω there are elements in G that map ω to an arbitary δ ∈ B,as B is a block this means that these elements must lie in StabG(B). is showsthat StabG(B) acts transitively on B and that B = ωStabG(B). e map fro blocks tosubgroups therefore is injective.

Vice versa, for a subgroup U ≥ S, let B = ωU . Clearly B is, as a set, stabilizedby U . We note that in fact U = StabG(B), as any element x ∈ StabG(B)must mapω to ωx inB and by denition there is u ∈ U such that ω = (ωx)u = ωxu , so xu ∈S ≤ U and therefore x ∈ U . us the map from subgroups containing S to subsetscontaining ω is injective.

We claim that B is the block in a block system. Since G is transitive, the imagesof B clearly cover all of Ω. We just need to show that they form a partition. Forthat, assume that Bg ∩ Bh . is means that δ g

−, δh

− ∈ B = ωU and thus thereexists ug , uh ∈ U such that ωug g = δ = ωuh h . But then ug g(uhh)− = ug gh−u−

h ∈StabG(ω) ≤ U , the gh− ∈ U and (as U = StabG(B)) we have that Bg = Bh . isshows that the images of B form a partition of Ω.

e properties shown together establish the bijection.

C VI.: e blocks systems for G on Ω form a lattice under the “subset”(that is blocks are either subsets of each other or intersect trivially) relation. Itsmaximal element is B∞, its minimal element is B.

We call a transitive permutation group imprimitive, if it aords a nontrivialblock system on Ω. We want to ontain an embedding theorem for imprimitivegroups, similar to what we did for direct products. at is, we want to describea “universal” group into which imprimitive groups embed.

e construction for this is called the wreath product”

D VI.: Let A be a group, b an integer, and B ≤ Sb a permutation group.e wreath product A B is the semidirect product of N = Ab = A×× A

b

with B

where B acting on N by permuting the components.

If A ≤ Sa is also a permutation group, we can represent A B as an imprimitivegroup on a ⋅b points: Consider the numbers , . . . , ab, arranged as in the followingdiagram.

aa + a + a⋮

a(b − ) + a(b − ) + ab

en the direct product N = Ab can be represented as permutations of these num-bers, the i-th copy of A acting on the i-th row. Now represent the permutations ofB by acting simultaneously on the columns, permuting the b rows. e resultinggroup W is the wreath product A B. Points in the same row of the diagram will


be mapped by W to points in the same row, thus W acts imprimitively on , . . . , abwith blocks according to the rows of the diagram. W is therefore called the imprim-itive action of the wreath product.

L VI.: Let G act imprimitively on , . . . n = ab with b blocks of size a. enG can be embedded into a wreath product Sa Sb in

Proof: By renumbering the points we may assume without loss of generality thatthe blocks of G are exactly , . . . , a, a + , . . . , a and so on, as given by therows of the above diagram. Let g ∈ G. en g will permute the blocks accordingto a permutation b ∈ Sb . By considering Sb as embedded into Sa Sb , we have thatgb xes all rows of the diagram as sets and thus is in N = (Sa)b . (Again, it is possible — for example under the name of induced representations —to give a better description, reducing to a wreath product of the block stabilizersaction on its block with the groups action on all the blocks, but doing so requires abit more work.)

VI. Primitivity and Synchronization

In action be primitive; in foresight, astrategist. Agir en primitif et prevoir en stratege.

Feuillets d’Hypnos R C

A permutation group G ≤ SΩ is called primitive, if the only block systems itaords are the trivial ones B and B∞. Since block sized need to divide the degree,every transitive group of prime degree is primitive.

Example in arbitrary degree are given by symmetric groups:Example: For n > , the symmetric group Sn is primitive: Assume that B is a blockin a nontrivial block system, then ≤ B ≤ n − . us there are ω, δ ∈ B andγ ∈ Ω B. But there is g ∈ Sn such that ωg = ω and δ g = γ, thus B is mapped byg to a set that has proper, nontrivial, intersection with B contradicting the blockproperty.

is argument in fact only requires that the action on pairs of points is transitive— map ω, δ to ω, γ, giving the following denition and lemma.

D VI.: Let G be a permutation group that is transitive on Ω.a) If G acts transitively on all k-tuples of distinct elements of Ω, then G is calledk-transitive.b) IfG acts transitively on all k-sets of elements of Ω, thenG is called k-homogeneous.

L VI.: a) If G is k-transitive for k > then G is also k − -transitive

Bibliography

[Cam] Peter J. Cameron, Combinatorics: topics, techniques, algorithms, Cam-bridge University Press, Cambridge, .

[CQRD] Hannah J. Coutts, Martyn Quick, and Colva M. Roney-Dougal, eprimitive permutation groups of degree less than , Comm. Algebra (), no. , –.

[CvL] P. J. Cameron and J. H. van Lint, Designs, graphs, codes and their links,London Mathematical Society Student Texts, vol. , Cambridge Uni-versity Press, Cambridge, .

[Eul] Leonard Euler, Decouverte d’une loi tout extraordinaire des nombres parrapport a la somme de leurs diviseurs, Opera Omnia, I. (FerdinandRudio, ed.), Birkhauser, , pp. –.

[Fel] William Feller,A direct proof of Stirling’s formula, Amer. Math. Monthly (), –.

[GKP] Ronald L. Graham, Donald E. Knuth, and Oren Patashnik, Concretemathematics, second ed., Addison-Wesley Publishing Company, Read-ing, MA, .

[GNW] Curtis Greene, Albert Nijenhuis, and Herbert S. Wilf, A probabilisticproof of a formula for the number of Young tableaux of a given shape,Adv. in Math. (), no. , –.

[GR] Chris Godsil and Gordon Royle,Algebraic graph theory, Graduate Textsin Mathematics, vol. , Springer-Verlag, New York, .

BIBLIOGRAPHY

[Hal] Marshall Hall, Jr.,Combinatorial theory, second ed., Wiley-InterscienceSeries in Discrete Mathematics, John Wiley & Sons, Inc., New York,, A Wiley-Interscience Publication.

[Jac] C.G.J. Jacobi, De investigando ordine systematis aequationum dieren-tialum vulgarium cujuscunque, Gesammelte Werke, Funer Band (KarlWeierstrass, ed.), G. Reimer, Berlin, , pp. –.

[Knu] Donald E. Knuth, e art of computer programming. Vol. , Addison-Wesley, Reading, MA, , Sorting and searching, Second edition [ofMR].

[LN] Rudolf Lidl and Harald Niederreiter, Introduction to nite elds andtheir applications, Cambridge University Press, Cambridge, .

[LTS] C. W. H. Lam, L. iel, and S. Swiercz, e nonexistence of nite projec-tive planes of order , Canad. J. Math. (), no. , –.

[Rei] Philip F. Reichmeider, e equivalence of some combinatorial matchingtheorems, Polygonal Publ. House, Washington, NJ, .

[Sta] Richard P. Stanley, Enumerative combinatorics. Vol. , Cambridge Stud-ies in Advanced Mathematics, vol. , Cambridge University Press,Cambridge, , With a foreword by Gian-Carlo Rota and appendix by Sergey Fomin.

[Sta] , Enumerative combinatorics. Volume , second ed., CambridgeStudies in Advanced Mathematics, vol. , Cambridge University Press,Cambridge, .

[vLW] J. H. van Lint and R. M. Wilson, A course in combinatorics, second ed.,Cambridge University Press, Cambridge, .

[Wie] Helmut Wielandt, Finite permutation groups, Academic Press, .

[Zei] Doron Zeilberger, Garsia and Milne’s bijective proof of the inclusion-exclusion principle, Discrete Math. (), no. , –.

Index

algebra, antichain, Assignment problem, at the edge, augmenting path, ,

base, Bell number, binomial coecient, binomial formula, bipartite,

capacities, capacity, Catalan number, cells, chain, column sum vector, Complete Homogeneous Symmetric

Function hm , compositions, conjugate, covers, , cut,

derangement, derangements, dierentiation,

digraph, distributive, dominance order,

Elementary Symmetric Function em ,

Euler function, exponential generating function,

Ferrers’ diagram, ow, functional equation,

geometric series, greatest lower bound,

Hasse-diagram, homogeneous, hook, hook length, Hungarian Method,

in class t, incidence algebra, independent, , injective, instable, integration, interval,

INDEX

Invariant eory, involution,

Jacobi triple product identity, join, join-irreducible,

Kostka Number,

labelled, lattice, least upper bound, line, line graph, linear extension, lower factorial,

Mobius function, majorization order, matching, maximal, meet, minimal, Monomial Symmetric Function,

natural order, network,

order ideal, order-irrelevant partition,

partially ordered set, partition, parts, pentagonal, permutation, poset, Power Sum Function pm , Principle of Inclusion and Exclusion,

pure base, pure slope,

root of the discriminant, row sum vector,

Schur polynomials, slope, source, stable, Stirling number of the second kind,

surjective, symmetric function, symmetric polynomials, system of distinct representatives,

target, term rank, topological sorting, total order, two-line arrays,

value of a ow, vertex cover, vertex path, vertex separating,

Young diagram,

zeta function,

Some Counting Sequences

Bell OEIS A, page B=, B=, B=, B=, B=, B=, B=, B=, B=, B=, B=

Catalan OEIS A, page C=,C=,C=,C=,C=,C=,C=,C=,C=,C=,C=

Derangements OEIS A, page d()=, d()=, d()=, d()=, d()=, d()=, d()=, d()=, d()=

Involutions OEIS A, page s()=, s()=, s()=, s()=, s()=, s()=, s()=, s()=, s()=

Fibonacci OEIS A, page F=, F=, F=, F=, F=, F=, F=, F=, F=, F=, F=, F=

Partitions OEIS A, page p()=, p()=, p()=, p()=, p()=, p()=, p()=, p()=, p()=, p()=

Documents

Combinatorics - Colorado State Universityhulpke/lectures/m501/notes.pdf · is are lecture notes I prepared for a graduate Combinatorics course which I taught in Fallat Colorado State