
Algebraic Combinatorics Ulrich Dempwolff


  • Contents

    1 Basic Counting Principles
    2 Binomial Coefficients
    3 Permutations
    4 Principle of Inclusion-Exclusion
    5 Formal Power Series
    6 Generating Functions
    7 Exponential Generating Functions
    8 Integer Partitions
    9 Rogers-Ramanujan Identities
    10 Enumeration with Groups
    11 Characteristic Polynomials of Graphs
    12 Strongly Regular Graphs
    13 Designs

  • 1 Basic Counting Principles

    In this section we formalize some counting methods. Although rather trivial, these methods are nevertheless useful. In complicated situations it helps to recall these formal statements in order to see how to count.

    Definition By $[n]$ we denote the set $\{1, \dots, n\}$, by $|T|$ the size of a finite set $T$. We call $T$ an $n$-set iff $|T| = n$. For sets $T, S$ the symbol $T^S$ denotes the set of all maps from $S$ to $T$. By $2^T$ we denote the power set of $T$, i.e. the set of all subsets of $T$. Any subset $X$ of $T$ is uniquely determined by its characteristic function $\chi_X : T \to \{0, 1\}$ where $\chi_X(y) = 1$ if $y \in X$ and $\chi_X(y) = 0$ if $y \notin X$. This explains the notation for the power set of $T$. With $\binom{T}{k}$ we denote the set of all $k$-subsets of $T$. A $k$-subset $P = \{T_1, \dots, T_k\}$ of sets is called an (unordered) partition of $T$ iff $T = \bigcup_{i=1}^{k} T_i$ and $T_i \cap T_j = \emptyset$ for $1 \le i < j \le k$. In this case we also write

    $$T = \bigsqcup_{i=1}^{k} T_i.$$

    Obviously we have:

    Theorem 1.1 Let $P$ be a partition of the finite set $T$. Then

    $$|T| = \sum_{X \in P} |X|.$$

    Definition Let $P, B$ be sets. In combinatorics a relation $I \subseteq P \times B$ is also called an incidence structure. One writes $x I y$ (and says $x$ is incident with $y$) iff $(x, y) \in I$. We also use symbols like $I = (P, B, \ast)$ for an incidence structure, where $x \ast y$ means that $(x, y) \in I$. Traditionally one calls $P$ the set of points and $B$ the set of blocks. $I$ is called finite iff $|P|, |B| < \infty$.

  • Definition Let $I = (P, B)$ be an incidence structure with $P = \{x_1, \dots, x_m\}$ and $B = \{y_1, \dots, y_n\}$. Define the incidence matrix $A = (a_{ij}) \in \{0, 1\}^{n \times m}$ of $I$ by

    $$a_{ij} = \begin{cases} 1, & x_j I y_i, \\ 0, & \text{otherwise}. \end{cases}$$

    Then $k_i = \sum_{j=1}^{m} a_{ij} = |\{x_j \in P \mid x_j I y_i\}|$ is the $i$-th row sum and $r_j = \sum_{i=1}^{n} a_{ij} = |\{y_i \in B \mid x_j I y_i\}|$ is the $j$-th column sum. Summing over the entries of $A$ we obtain

    $$\sum_{i=1}^{n} k_i = \sum_{i=1}^{n} \Big( \sum_{j=1}^{m} a_{ij} \Big) = \sum_{j=1}^{m} \Big( \sum_{i=1}^{n} a_{ij} \Big) = \sum_{j=1}^{m} r_j$$

    and we recover Theorem 1.2. This result is therefore a consequence of interchanging the order of summation in a double sum, a trick which we have encountered countless times.
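The equality of the row-sum total and the column-sum total can be checked on a small example. The following is a minimal sketch; the function name `row_and_column_sums` and the divisibility relation used as incidence are illustrative assumptions, not part of the notes.

```python
def row_and_column_sums(points, blocks, incident):
    """Build the incidence matrix (rows = blocks, columns = points)
    and return it together with its row sums and column sums."""
    matrix = [[1 if incident(x, y) else 0 for x in points] for y in blocks]
    row_sums = [sum(row) for row in matrix]        # k_i: points on block y_i
    col_sums = [sum(col) for col in zip(*matrix)]  # r_j: blocks through point x_j
    return matrix, row_sums, col_sums


# Hypothetical incidence structure: the point x is incident with the block y iff y divides x.
points = [1, 2, 3, 4, 5, 6]
blocks = [1, 2, 3]
matrix, k, r = row_and_column_sums(points, blocks, lambda x, y: x % y == 0)

# Both totals count the number of incident pairs.
print(sum(k), sum(r))
```

Both sums count the same set of incident pairs, once block by block and once point by point.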

    Applications of Theorem 1.2 are often called double counting (counting in two ways). As a first illustration we apply Theorem 1.2 to obtain an assertion which is obvious anyway.

    Corollary 1.3 Let $T_1, \dots, T_k$ be finite sets. Then

    $$|T_1 \times \cdots \times T_k| = |T_1| \cdot |T_2| \cdots |T_k|.$$

    In particular $|T^k| = n^k$ if $T$ is an $n$-set.

    Proof. We prove the statement by induction on $k$. The case $k = 1$ is trivial.
    $k - 1 \Rightarrow k$. Define $P = T_1 \times \cdots \times T_{k-1}$, $B = T_k$ and an incidence structure by $I = P \times B$. So if $(t_1, \dots, t_{k-1}) \in P$ then $r_{(t_1, \dots, t_{k-1})} = |B| = |T_k|$. Hence by induction and Theorem 1.2

    $$|I| = |P| \cdot |B| = (|T_1| \cdots |T_{k-1}|) \cdot |T_k| = \prod_{i=1}^{k} |T_i|.$$

    □

    Definition An injective map from a $k$-set into an $n$-set is called a $k$-permutation of $n$ elements. Clearly, the number of $k$-permutations of $[n]$ is the size of the set of sequences

    $$\{(x_1, \dots, x_k) \in [n]^k \mid x_i \ne x_j \text{ for } i \ne j\}.$$

    An $n$-permutation of the $n$-set $T$ is called a permutation of $T$. We recall that the set of permutations of $T$ forms a group $\mathrm{Sym}(T)$, called the symmetric group on $T$, with respect to the composition of maps. The number $n = |T|$ is called the degree of the group $\mathrm{Sym}(T)$.

  • Theorem 1.4 The number of $k$-permutations of an $n$-set is

    $$P(n, k) = n (n - 1) \cdots (n - k + 1).$$

    In particular $|\mathrm{Sym}(T)| = n!$ for an $n$-set $T$.

    Proof. Without loss of generality we take $T = [n]$. If $k > n$ then there exist no $k$-permutations of an $n$-set, which agrees with our formula since one of the factors is zero. Hence we assume $k \le n$ and induct on $k$. The case $k = 1$ is clear.
    $k - 1 \Rightarrow k$. Let $1 < k \le n$ and denote by $P$ the set of $(k-1)$-permutations of $[n]$. We set $B = [n]$ and define an incidence structure $I = (P, B, \ast)$ where the incidence is defined by $(x_1, \dots, x_{k-1}) \ast y$ (for $(x_1, \dots, x_{k-1}) \in P$, $y \in B$) iff $y \in B \setminus \{x_1, \dots, x_{k-1}\}$. Hence $r = r_{(x_1, \dots, x_{k-1})} = n - k + 1$ for all $(x_1, \dots, x_{k-1}) \in P$. Obviously $|I| = P(n, k)$. By Theorem 1.2 and induction

    $$P(n, k) = |P| \cdot r = (n (n - 1) \cdots (n - k + 2)) \cdot (n - k + 1).$$

    □
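As a sanity check of Theorem 1.4, one can enumerate the $k$-permutations of $[n]$ for small parameters and compare with the product formula. This is a minimal sketch; the helper names `falling_factorial` and `count_k_permutations` are illustrative and not taken from the notes.

```python
from itertools import permutations
from math import perm


def falling_factorial(n, k):
    """Return n (n - 1) ... (n - k + 1); the empty product (k = 0) is 1."""
    result = 1
    for i in range(k):
        result *= n - i
    return result


def count_k_permutations(n, k):
    """Count the sequences of length k over [n] with pairwise distinct entries."""
    return sum(1 for _ in permutations(range(1, n + 1), k))


for n in range(6):
    for k in range(1, 7):
        # For k > n both sides are 0, as noted in the proof.
        assert count_k_permutations(n, k) == falling_factorial(n, k)
```

The standard library function `math.perm(n, k)` returns the same number and can replace the hand-written product.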

    The pigeonhole principle asserts that if one distributes more than $n$ objects into $n$ containers, then at least one container contains at least two objects. Although trivial, this observation has many useful applications. More formally we have:

    Theorem 1.5 (Pigeonhole Principle) Let $f : A \to B$ be a mapping of finite sets. Then the following holds:

    (a) There exists $b_1 \in B$ with

    $$|f^{-1}(\{b_1\})| \ge \frac{|A|}{|B|}.$$

    (b) There exists $b_2 \in B$ with

    $$|f^{-1}(\{b_2\})| \le \frac{|A|}{|B|}.$$

    Proof. Consider the partition

    $$A = \bigsqcup_{b \in B} f^{-1}(\{b\}).$$

    Choose $b_1, b_2 \in B$ with $|f^{-1}(\{b_1\})| = \max\{|f^{-1}(\{b\})| \mid b \in B\}$ and $|f^{-1}(\{b_2\})| = \min\{|f^{-1}(\{b\})| \mid b \in B\}$. Then

    $$|B| \cdot |f^{-1}(\{b_2\})| \le \sum_{b \in B} |f^{-1}(\{b\})| = |A| \le |B| \cdot |f^{-1}(\{b_1\})|$$

    and (a) and (b) follow. □

  • We give three applications in graph theory.

    Definition Let $V$ be a finite set and $E$ a subset of $\binom{V}{2}$. Then we call the pair $\Gamma = (V, E)$ a graph; $V$ is the set of vertices and $E$ the set of edges. We say the vertex $x$ is adjacent to the vertex $y$ and write $x \sim y$ if $\{x, y\} \in E$. We call $N(x) = \{y \in V \mid x \sim y\}$ the set of neighbors of $x$ and call $d(x) = |N(x)|$ the degree of $x$. Clearly, the degree of $x$ is also $|\{e \in E \mid x \in e\}|$, the number of edges containing $x$. Let $E_1 \subseteq E$ and $V_1 \subseteq V$. Then $\Gamma_1 = (V_1, E_1)$ is a subgraph of $\Gamma$ if $E_1 \subseteq \binom{V_1}{2}$.

    Examples (a) Set $V = [n]$ and $E = \emptyset$. Then $(V, E)$ is the null graph on $n$ vertices.

    (b) Set $V = [n]$ and $E = \binom{V}{2}$. The graph $K_n = (V, E)$ is called the complete graph of size $n$.

    (c) Set $V = [n]$ and $E = \{\{i, i + 1\} \mid i \in [n - 1]\}$. The graph $P_n = (V, E)$ is called the path of length $n$.

    (d) Set $V = [n]$ and $E = \{\{i, i + 1\} \mid i \in [n - 1]\} \cup \{\{1, n\}\}$. The graph $C_n = (V, E)$ is called the cycle of length $n$.

    Theorem 1.6 In any graph with at least two vertices there exist two vertices of the same degree.

    Proof. Let the graph have $n$ vertices. Denote by $V_k$ the set of vertices of degree $k$. This leads to the partition $V = V_0 \sqcup V_1 \sqcup \cdots \sqcup V_{n-1}$ of the vertex set. We claim that at least one $V_k$ contains at least two elements. Assume the converse, i.e. $|V_k| \le 1$ for all $k$. On the other hand $n = |V| = \sum_{k=0}^{n-1} |V_k| \le n$, which implies $|V_k| = 1$ for all $k$. Let $x$ be the element in $V_0$ and $y$ the element in $V_{n-1}$. Then $y$ is adjacent to every other vertex, in particular $x \sim y$, which contradicts $d(x) = 0$. □

    Definition A walk in a graph is a sequence of vertices in which consecutive vertices are adjacent. A graph is connected if every two vertices are connected by a walk.

    Theorem 1.7 Let $\Gamma = (V, E)$ be a graph, $|V| = n$. Assume that every vertex has degree at least $(n - 1)/2$. Then $\Gamma$ is connected.

    Proof. Assume $x, y$ are nonadjacent vertices. Then $N(x), N(y) \subseteq V \setminus \{x, y\}$ and $|V \setminus \{x, y\}| = n - 2$. Since $|N(x)| + |N(y)| \ge n - 1 > n - 2$ by assumption, we have $N(x) \cap N(y) \ne \emptyset$. Let $z \in N(x) \cap N(y)$. Then $(x, z, y)$ is a walk which connects $x$ and $y$. □

    Example. Let $\Gamma$ be the disjoint union of two complete graphs of size $m$. This graph is not connected. $\Gamma$ has $n = 2m$ vertices and each vertex has degree $m - 1 = (n - 2)/2$. Hence the degree bound in the above theorem cannot be lowered.

    Definition A triangle in a graph is a set of three pairwise adjacent vertices.

    Theorem 1.8 (Mantel) Let $\Gamma = (V, E)$ be a graph with $|V| = 2n$ and $|E| \ge n^2 + 1$. Then $\Gamma$ contains a triangle.

    Proof. We prove the theorem by induction on $n$.
    For $n = 1$ we have $|E| \le 1 < 1^2 + 1 = 2$ and the assertion is true (as the assumptions are not met).
    $n \Rightarrow n + 1$: Let $\Gamma = (V, E)$ be a graph with $2(n + 1)$ vertices and $|E| \ge (n + 1)^2 + 1$ edges. Let $x, y$ be two adjacent vertices. Define a graph $\Gamma_0 = (V_0, E_0)$ by $V_0 = V \setminus \{x, y\}$ and $E_0 = \{e \in E \mid x, y \notin e\}$. If $|E_0| \ge n^2 + 1$ then $\Gamma_0$ and thus $\Gamma$ contains a triangle. So assume $|E_0| \le n^2$. Then $|E \setminus E_0| \ge 2n + 2$. Apart from the edge $\{x, y\}$, the set $E \setminus E_0$ therefore contains at least $2n + 1$ edges connecting $x$ or $y$ with points in $V_0$. Since $|V_0| = 2n$, the pigeonhole principle yields a vertex $z \in N(x) \cap N(y) \cap V_0$. Then $\{x, y, z\}$ is a triangle. □
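For very small $n$ Mantel's theorem can be confirmed by exhaustive search. The sketch below is only a brute-force check, not a proof technique from the notes; the names `has_triangle`, `mantel_holds` and `complete_bipartite` are illustrative assumptions.

```python
from itertools import combinations


def has_triangle(edges):
    """Edges are pairs (a, b) with a < b; test all vertex triples a < b < c."""
    edge_set = set(edges)
    vertices = sorted({v for e in edge_set for v in e})
    return any(
        (a, b) in edge_set and (a, c) in edge_set and (b, c) in edge_set
        for a, b, c in combinations(vertices, 3)
    )


def mantel_holds(n):
    """Check every graph on 2n vertices with at least n^2 + 1 edges for a triangle."""
    all_edges = list(combinations(range(2 * n), 2))
    for size in range(n * n + 1, len(all_edges) + 1):
        for edges in combinations(all_edges, size):
            if not has_triangle(edges):
                return False
    return True


def complete_bipartite(n):
    """Edges of K_{n,n}: n^2 edges and no triangle, so the bound is sharp."""
    return [(i, j) for i in range(n) for j in range(n, 2 * n)]


print(mantel_holds(2), mantel_holds(3), has_triangle(complete_bipartite(3)))
```

The complete bipartite graph shows that the hypothesis $|E| \ge n^2 + 1$ cannot be weakened to $|E| \ge n^2$.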

    2 Binomial Coefficients

    Binomial numbers count how many $k$-subsets can be chosen in an $n$-set. Multinomial numbers count how many partitions with prescribed sizes can be chosen in an $n$-set. Binomial and multinomial numbers are omnipresent in all parts of mathematics.

    Definition We denote by $\binom{X}{k}$ the set of $k$-subsets of $X$. If $X$ is an $n$-set the size of $\binom{X}{k}$ is denoted by $\binom{n}{k}$, i.e. $\left|\binom{[n]}{k}\right| = \binom{n}{k}$. This number is called binomial coefficient or binomial number.

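    Theorem 2.1

    (a) $\binom{n}{0} = \binom{n}{n} = 1$, $\binom{n}{1} = \binom{n}{n-1} = n$, and $\binom{n}{k} = 0$ for $k > n$.

    (b) $\sum_{i=0}^{n} \binom{n}{i} = 2^n = |2^{[n]}|$.

    (c) $\binom{n}{k} = \binom{n}{n-k}$ for $0 \le k \le n$.

    (d) $\sum_{i=0}^{n} (-1)^i \binom{n}{i} = 0$ for $n \ge 1$.

    (e) $\binom{n+1}{k} = \binom{n}{k} + \binom{n}{k-1}$ for $0 < k$.

    (f) $\binom{n}{k} = \sum_{i=k-1}^{n-1} \binom{i}{k-1}$ for $0 < k \le n$.

    (g) $\binom{n}{k} = \sum_{i=0}^{k} \binom{n-i-1}{k-i}$ for $0 \le k < n$.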

    (h)(nk

    )(nk

    )as(

    nk1)> 0. 2

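    Corollary 2.2 For $0 \le k \le n$

    $$\binom{n}{k} = \frac{n!}{k! (n - k)!} = \frac{n (n - 1) \cdots (n - k + 1)}{k!}.$$

    Proof. Let $A$ be the set of $k$-permutations of $[n]$. By Theorem 1.4, $|A| = n (n - 1) \cdots (n - k + 1)$. Set $B = \binom{[n]}{k} \times \mathrm{Sym}(k)$ and define $f : A \to B$ by $f(x_1, \dots, x_k) = (X, \pi)$ where $X = \{x_1, \dots, x_k\} = \{y_1, \dots, y_k\}$, $y_1 < \cdots < y_k$, and $y_{\pi(i)} = x_i$ for $1 \le i \le k$. Define further $g : B \to A$ by $g(X, \pi) = (x_{\pi(1)}, \dots, x_{\pi(k)})$ for $X = \{x_1, \dots, x_k\}$, $x_1 < \cdots < x_k$. Then $f^{-1} = g$. This shows (using Corollary 1.3)

    $$|B| = \binom{n}{k} \cdot k! = n (n - 1) \cdots (n - k + 1).$$

    □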

    Remark. The proofs of Theorem 2.1 and Corollary 2.2 are examples of bijective or combinatorial proofs. One computes the size of a set $A$ by finding an explicit bijection $f : A \to B$ where $|B|$ is known. Bijective proofs leave the impression that one understands why the set $A$ has a particular size. Usually bijective proofs are not easy to find. Note that all assertions of 2.1 can easily be verified by computation (induction) if one defines $\binom{n}{k} = \frac{n!}{k!(n-k)!}$. But the above proofs are more illuminating than computational proofs.

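  • Theorem 2.3 The number of monotone increasing (decreasing) maps $f : [r] \to [n]$ is

    $$\binom{n + r - 1}{r}.$$

    Proof. Clearly it suffices to consider the increasing case. Let $\mathcal{F}$ be the set of increasing maps $f : [r] \to [n]$, i.e. $f(i) \le f(i + 1)$ for $1 \le i < r$. Define $\varphi : \mathcal{F} \to \binom{[n+r-1]}{r}$ by

    $$\varphi(f) = \{f(i) + i - 1 \mid 1 \le i \le r\}.$$

    Indeed $\varphi(f) \in \binom{[n+r-1]}{r}$ as $f(i) + i - 1 \le f(i + 1) + i - 1 < f(i + 1) + (i + 1) - 1 \le n + r - 1$.
    $\varphi$ is injective: If $\varphi(f) = \varphi(f')$ then $f(i) + i - 1 = f'(i) + i - 1$ for all $i$, since both sets are listed in increasing order. Thus $f(i) = f'(i)$ for all $i$ and hence $f = f'$.
    $\varphi$ is surjective: Pick $X = \{x_1, \dots, x_r\} \in \binom{[n+r-1]}{r}$ with $x_1 < \cdots < x_r$. Define $f : [r] \to [n]$ by $f(i) = x_i - i + 1$. Then $f \in \mathcal{F}$ and $\varphi(f) = X$. Hence

    $$|\mathcal{F}| = \binom{n + r - 1}{r}.$$

    □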
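The bijection used in this proof is easy to test by machine. The following is a minimal sketch, assuming that an increasing map is stored as the tuple of its values $(f(1), \dots, f(r))$; the names `phi` and `check_bijection` are illustrative.

```python
from itertools import combinations, combinations_with_replacement
from math import comb


def phi(f):
    """Send the weakly increasing tuple (f(1), ..., f(r)) to the set {f(i) + i - 1}.

    With 0-based positions the shift i - 1 is just the index of the entry.
    """
    return frozenset(value + index for index, value in enumerate(f))


def check_bijection(n, r):
    """Verify that phi maps the increasing maps [r] -> [n] bijectively onto the r-subsets of [n + r - 1]."""
    # combinations_with_replacement yields exactly the weakly increasing tuples.
    maps = list(combinations_with_replacement(range(1, n + 1), r))
    images = {phi(f) for f in maps}
    targets = {frozenset(c) for c in combinations(range(1, n + r), r)}
    injective = len(images) == len(maps)
    surjective = images == targets
    return injective and surjective and len(maps) == comb(n + r - 1, r)


print(all(check_bijection(n, r) for n in range(1, 6) for r in range(1, 5)))
```

The weakly increasing tuples enumerated here are also a convenient encoding of the multisets of size $r$ of $[n]$.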

    Definition A multiset is a pair $M = (X, \mu)$ with a finite set $X$ and a map $\mu : X \to \mathbb{N}$ into the non-negative integers. We call $|M| = \sum_{x \in X} \mu(x)$ the size of $M$.

    Example Let $X = \{a, b, c\}$ and $\mu(a) = 2$, $\mu(b) = 0$, and $\mu(c) = 3$. Then we think of the multiset $M = (X, \mu)$ as a set which contains the elements $a, b, c$ repeatedly according to their multiplicity $\mu$. Hence $M = \{a, a, c, c, c\}$ consists of two identical objects of type $a$ and of three identical objects of type $c$.

    Remark. A multiset $M = (X, \mu)$ with $|M| = r$ is also called an $r$-selection of $X$. The number $\mu(x)$ is the multiplicity of $x$ in $M$.

    Theorem 2.4 The number of multisets of size $r$ of an $n$-set is

    $$\binom{n + r - 1}{r} = \binom{n + r - 1}{n - 1}.$$

    Proof. Let $\mathcal{F}$ again be the set of increasing mappings $f : [r] \to [n]$. For $f \in \mathcal{F}$ define a multiset $\Phi(f) = M = ([n], \mu)$ by $\mu(x) = |f^{-1}(\{x\})|$ for $x \in [n]$. Clearly, $|M| = r$.
    Conversely, given a multiset $M = ([n], \mu)$ of size $r$ we define $f = \Psi(M) \in \mathcal{F}$ as follows: Let $X = \{x_1, \dots, x_s\}$, $x_1 < \cdots < x_s$, be the set of elements in $[n]$ where $\mu$ does not vanish. Set $f(i) = x_j$ if $\mu(x_1) + \cdots + \mu(x_{j-1}) < i \le \mu(x_1) + \cdots + \mu(x_j)$. We see that $\Psi^{-1} = \Phi$. So $|\mathcal{F}|$ is the number of multisets of size $r$ of $[n]$. The assertion follows from Theorem 2.3. □

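    Definition Let $T$ be a nonempty set. A sequence $(T_1, \dots, T_k)$, $T_i \in 2^T$, is called an ordered partition of $T$ of length $k$ iff $T = T_1 \sqcup \cdots \sqcup T_k$. If $|T_i| = r_i$ we call $(T_1, \dots, T_k)$ an $(r_1, \dots, r_k)$-partition. We allow the possibility $T_i = \emptyset$, i.e. $r_i = 0$.

    Lemma 2.5 Let $x_{ij}$, $1 \le i \le m$, $1 \le j \le n$, be elements of a commutative ring $R$ (with identity). Then

    $$\prod_{i=1}^{m} \Big( \sum_{j=1}^{n} x_{ij} \Big) = \sum_{(T_1, \dots, T_n)} \prod_{a \in T_1} x_{a,1} \prod_{b \in T_2} x_{b,2} \cdots \prod_{c \in T_n} x_{c,n}$$

    where $(T_1, \dots, T_n)$ ranges over the ordered partitions of $[m]$ of length $n$. In particular for $x_1, \dots, x_m, y_1, \dots, y_m \in R$ we have

    $$(x_1 + y_1) \cdots (x_m + y_m) = \sum_{T \in 2^{[m]}} \prod_{a \in T} x_a \prod_{b \in [m] \setminus T} y_b.$$

    Proof. We induct on $m$ and we use the common convention that the empty product is 1. The case $m = 1$ is clear.
    $m - 1 \Rightarrow m$: Now (with $(T_1, \dots, T_n)$ ranging over the ordered partitions of $[m - 1]$ of length $n$):

    $$\prod_{i=1}^{m} \Big( \sum_{j=1}^{n} x_{ij} \Big) = \Big( \sum_{(T_1, \dots, T_n)} \prod_{a \in T_1} x_{a,1} \cdots \prod_{c \in T_n} x_{c,n} \Big) \Big( \sum_{j=1}^{n} x_{mj} \Big)$$

    $$= \sum_{j=1}^{n} \sum_{(T_1, \dots, T_n)} \Big( \prod_{a \in T_1} x_{a,1} \cdots \prod_{c \in T_n} x_{c,n} \Big) x_{mj}$$

    $$= \sum_{(T_1, \dots, T_n)} \Big( \Big( x_{m1} \prod_{a \in T_1} x_{a,1} \Big) \prod_{b \in T_2} x_{b,2} \cdots \prod_{c \in T_n} x_{c,n} + \cdots + \prod_{a \in T_1} x_{a,1} \prod_{b \in T_2} x_{b,2} \cdots \Big( x_{mn} \prod_{c \in T_n} x_{c,n} \Big) \Big)$$

    $$= \sum_{(S_1, \dots, S_n)} \prod_{a \in S_1} x_{a,1} \cdots \prod_{c \in S_n} x_{c,n}$$

    where $(S_1, \dots, S_n)$ ranges over the ordered partitions of $[m]$ of length $n$. Here we use that the mapping $(S_1, \dots, S_n) \mapsto (S_1 \setminus \{m\}, \dots, S_n \setminus \{m\})$ maps the ordered partitions of $[m]$ of length $n$ surjectively onto the set of ordered partitions of $[m - 1]$ of length $n$ and that every ordered partition $(T_1, \dots, T_n)$ of $[m - 1]$ of length $n$ has precisely the $n$ preimages

    $$(T_1 \cup \{m\}, \dots, T_n), \ \dots, \ (T_1, \dots, T_n \cup \{m\}).$$

    □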

    The following theorem explains the name binomial coefficient.

    Theorem 2.6 (Binomial Theorem) Let $R$ be a commutative ring (with identity) and $x, y \in R$. Then

    $$(x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k}.$$

    Proof. By Lemma 2.5 we have in the polynomial ring $R[X_1, \dots, X_n, Y_1, \dots, Y_n]$ the identity

    $$(X_1 + Y_1) \cdots (X_n + Y_n) = \sum_{T \in 2^{[n]}} \prod_{a \in T} X_a \prod_{b \in [n] \setminus T} Y_b.$$

    Substitute every $X_i$ by $x$ and every $Y_i$ by $y$. We obtain

    $$(x + y)^n = \sum_{T \in 2^{[n]}} x^{|T|} y^{|[n] \setminus T|} = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k}.$$

    □

    Example. Let $R = \mathbb{Q}[X, Y]$ and substitute $X = Y = 1$ in the polynomial $f(X, Y) = (X + Y)^n = \sum_{k} \binom{n}{k} X^k Y^{n-k}$. We obtain (b) of 2.1:

    $$2^n = f(1, 1) = \sum_{k=0}^{n} \binom{n}{k}.$$

    If we specialize $X = -1$, $Y = 1$ we obtain (d) of 2.1 (for $n \ge 1$):

    $$0 = f(-1, 1) = \sum_{k=0}^{n} \binom{n}{k} (-1)^k.$$

    These arguments are typical algebraic proofs: We associate with a combinatorial object (the binomial number $\binom{n}{k}$) an algebraic object (the polynomial $(X + Y)^n$) and use algebraic manipulations (substitution of $X, Y$ by $\pm 1$) to obtain a combinatorial theorem.

    Theorem 2.7 (Multinomial Theorem)

    (a) The number of $(r_1, \dots, r_k)$-partitions of an $n$-set is

    $$\binom{n}{r_1, \dots, r_k} = \frac{n!}{r_1! \cdots r_k!}.$$

    (b) Let $x_1, \dots, x_k$ be elements of a commutative ring $R$ (with identity). Then

    $$(x_1 + \cdots + x_k)^n = \sum_{r_1 + \cdots + r_k = n} \binom{n}{r_1, \dots, r_k} x_1^{r_1} \cdots x_k^{r_k}.$$

    Definition The numbers $\binom{n}{r_1, \dots, r_k}$ are called multinomial coefficients.

    Proof. (a) We prove the assertion by induction on $k$. The case $k = 1$ is clear.
    $k - 1 \Rightarrow k$: If one of the $r_i$ is 0 the assertion follows by induction (and by the convention $0! = 1$). So we assume that all $r_i > 0$. We define an incidence structure $I = (P, B, \ast)$ as follows: $P = \binom{[n]}{r_1}$ and $B$ is the set of sequences $(X_2, \dots, X_k)$ with $X_i \in \binom{[n]}{r_i}$ such that $(X_2, \dots, X_k)$ is an $(r_2, \dots, r_k)$-partition of $\bigcup_{i=2}^{k} X_i$. The incidence relation is defined by $X_1 \ast (X_2, \dots, X_k)$ iff $(X_1, \dots, X_k)$ is an $(r_1, \dots, r_k)$-partition of $[n]$.
    Claim: $|I| = \frac{n!}{r_1! \cdots r_k!}$.
    For $X_1 \in P$ the parameter $r_{X_1}$ is the number of $(r_2, \dots, r_k)$-partitions of $[n] \setminus X_1$. By induction

    $$r = r_{X_1} = \frac{(n - r_1)!}{r_2! \cdots r_k!}.$$

    Hence

    $$|I| = |P| \cdot r = \binom{n}{r_1} \cdot \frac{(n - r_1)!}{r_2! \cdots r_k!} = \frac{n (n - 1) \cdots (n - r_1 + 1)}{r_1!} \cdot \frac{(n - r_1)!}{r_2! \cdots r_k!} = \frac{n!}{r_1! \cdots r_k!}.$$

    (b) This assertion is proved in a similar manner as the binomial theorem. □
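Part (a) can be checked by brute force for small parameters. In the sketch below an ordered partition of $[n]$ of length $k$ is encoded as a labelling of the $n$ elements by block indices; this encoding and the function names `multinomial` and `count_ordered_partitions` are assumptions made for illustration.

```python
from itertools import product
from math import factorial


def multinomial(parts):
    """Return n! / (r_1! ... r_k!) with n = r_1 + ... + r_k."""
    result = factorial(sum(parts))
    for r in parts:
        result //= factorial(r)  # every intermediate quotient is an integer
    return result


def count_ordered_partitions(parts):
    """Count the labellings of [n] by block indices 0..k-1 whose block sizes equal `parts`."""
    n, k = sum(parts), len(parts)
    target = tuple(parts)
    count = 0
    for labels in product(range(k), repeat=n):
        sizes = tuple(labels.count(j) for j in range(k))
        if sizes == target:
            count += 1
    return count


print(multinomial((2, 1, 1)), count_ordered_partitions((2, 1, 1)))
```

Empty blocks are allowed, matching the convention $r_i = 0$ in the definition of ordered partitions.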

    Definition A sequence $(n_1, \dots, n_k) \in \mathbb{Z}^k$, $n_i > 0$, is called a composition of the integer $n$ if $n = n_1 + \cdots + n_k$. Denote by $c_k(n)$ the number of compositions of $n$ with $k$ parts.

    Theorem 2.8 (a) $c_k(n) = \binom{n-1}{k-1}$.

    (b) $2^{n-1}$ is the number of all compositions of $n$.

    We leave the proof of this theorem as an exercise. As another illustration of various proof strategies, the following proposition is verified by an inductive, a bijective, and an algebraic proof.

    Proposition 2.9 Let $0 < a, b, n \in \mathbb{Z}$, $n \le a + b$. Then

    $$\binom{a + b}{n} = \sum_{i=0}^{n} \binom{a}{i} \binom{b}{n - i}.$$

    Proof (inductive). Use Theorem 2.1 (e) and induction on $a + b$.

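  • Proof (algebraic). In $\mathbb{Q}[X]$ we deduce from the binomial theorem:

    $$\sum_{n=0}^{a+b} \binom{a + b}{n} X^n = (X + 1)^{a+b} = (X + 1)^a (X + 1)^b$$

    $$= \Big( \sum_{i=0}^{a} \binom{a}{i} X^i \Big) \Big( \sum_{j=0}^{b} \binom{b}{j} X^j \Big) = \sum_{n=0}^{a+b} \Big( \sum_{i=0}^{n} \binom{a}{i} \binom{b}{n - i} \Big) X^n.$$

    Compare the coefficients.

    Proof (bijective). Consider the partition $[a + b] = X_1 \sqcup X_2$ with $X_1 = [a]$ and $X_2 = [a + b] \setminus [a]$. Then

    $$\binom{[a + b]}{n} = \bigsqcup_{k=0}^{n} U_k$$

    where $U_k = \{Y \in \binom{[a+b]}{n} \mid |Y \cap X_1| = k\}$. Clearly,

    $$\binom{X_1}{k} \times \binom{X_2}{n - k} \ni (A, B) \mapsto A \sqcup B \in U_k$$

    is a bijection. Thus

    $$\binom{a + b}{n} = \sum_{k=0}^{n} |U_k| = \sum_{k=0}^{n} \binom{a}{k} \binom{b}{n - k}.$$

    □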
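The decomposition into the classes $U_k$ from the bijective proof can be reproduced numerically. This is a minimal sketch; the function name `class_sizes` is an illustrative choice.

```python
from itertools import combinations
from math import comb


def class_sizes(a, b, n):
    """Return [|U_0|, ..., |U_n|], where U_k consists of the n-subsets of [a + b]
    that meet X_1 = [a] in exactly k elements."""
    first_block = set(range(1, a + 1))
    sizes = [0] * (n + 1)
    for subset in combinations(range(1, a + b + 1), n):
        sizes[len(first_block.intersection(subset))] += 1
    return sizes


a, b, n = 4, 3, 3
sizes = class_sizes(a, b, n)
# Each class has size binom(a, k) * binom(b, n - k), and the classes partition all n-subsets.
print(sizes, sum(sizes), comb(a + b, n))
```

For $a = 4$, $b = 3$, $n = 3$ the class sizes are the products $\binom{4}{k}\binom{3}{3-k}$, which sum to $\binom{7}{3}$.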

    Induction is the workhorse of combinatorics. If one wants to prove a combinatorial statement, the most obvious approach is induction. Usually such a proof is neither very illuminating nor easy. If an algebraic proof is available it will usually be easy and short. It is more difficult to see in which situations algebraic proofs can be found. As already mentioned, bijective proofs are illuminating but can be difficult to find and may be complicated.

    Although the emphasis of this text lies on algebraic methods we will often use induction or bijective arguments. An exclusive use of algebraic methods would lead to a rather sterile style.

    3 Permutations

    The investigation of permutations is a particularly rich subject of combinatorial enumeration. A basic reason for this is the many ways in which one can interpret a permutation. For instance the permutation $\pi : [n] \to [n]$ can be seen as a map but also as a word $a_1 a_2 \cdots a_n$ (where $\pi(i) = a_i$) in the alphabet $[n]$. We will learn more interesting aspects of permutations.

    Definition Let $\pi$ be a permutation of $[n]$. For $k \in [n]$ we denote by $c_k(\pi)$ the number of cycles of length $k$ in a cycle decomposition of $\pi$ and by $c(\pi)$ we denote the number of cycles in the cycle decomposition. The $n$-tuple $(c_1(\pi), \dots, c_n(\pi))$ is called the cycle type of $\pi$.

    Clearly, we have $c(\pi) = c_1(\pi) + \cdots + c_n(\pi)$ and $n = \sum_{k} k \, c_k(\pi)$. Recall that the cycle decomposition of a permutation can be written in many ways: In a cycle representation one can reorder the cycles in an arbitrary way. Also a cycle of length $m$, say $(a_1, \dots, a_m)$, can be expressed in $m$ different ways as one can permute the entries of the $m$-tuple cyclically. For instance $(123)(45)(67) = (76)(312)(54)$. We call the different expressions the cycle representations of $\pi$.

    Lemma 3.1 Let $\pi \in \mathrm{Sym}(n)$ be a permutation with cycle type $(c_1, \dots, c_n)$. The number of cycle representations with an increasing length of the cycles is

    $$\prod_{k=1}^{n} k^{c_k} \prod_{k=1}^{n} c_k! = 1^{c_1} c_1! \, 2^{c_2} c_2! \cdots n^{c_n} c_n!.$$

    Proof. Consider two permutations $\sigma$ and $\tau$ which are the product of $m$ disjoint cycles of length $k$, say

    $$\sigma = (x^1_0, \dots, x^1_{k-1}) \cdots (x^m_0, \dots, x^m_{k-1}), \qquad \tau = (y^1_0, \dots, y^1_{k-1}) \cdots (y^m_0, \dots, y^m_{k-1}).$$

    Set $X_i = \{x^i_0, \dots, x^i_{k-1}\}$, $1 \le i \le m$, and define the sets $Y_i$ similarly. Then $\sigma = \tau$ iff there exists a permutation $\rho \in \mathrm{Sym}(m)$ with $Y_i = X_{\rho(i)}$ for $1 \le i \le m$ and if the cycles $(y^i_0, \dots, y^i_{k-1})$ and $(x^{\rho(i)}_0, \dots, x^{\rho(i)}_{k-1})$ coincide. But the last equation holds iff $y^i_j = x^{\rho(i)}_{j+s-1}$ for some $s \in [k]$ (where we read the index $j + s - 1$ modulo $k$). Hence there are exactly $m! \, k^m$ ways to represent $\sigma$ as the product of $m$ disjoint $k$-cycles. This shows the assertion of the lemma. □

    Theorem 3.2 The number of permutations of cycle type $(c_1, \dots, c_n)$ in $\mathrm{Sym}(n)$ is

    $$\frac{n!}{\prod_{k=1}^{n} k^{c_k} \prod_{k=1}^{n} c_k!} = \frac{n!}{1^{c_1} c_1! \, 2^{c_2} c_2! \cdots n^{c_n} c_n!}.$$

    Proof. We describe a permutation $\pi$ by the sequence $(x_1, \dots, x_n)$ of its images, i.e. $\pi(k) = x_k$. We turn this $n$-tuple into a permutation of cycle type $(c_1, \dots, c_n)$ with increasing length of cycles and count the preimages under this map:

    $$(\underbrace{a, \dots, b}_{c_1}, \underbrace{c, d, \dots, e, f}_{2 c_2}, \dots) \mapsto \underbrace{(a) \cdots (b)}_{c_1} \underbrace{(c, d) \cdots (e, f)}_{c_2} \cdots$$

    More formally: We define the numbers $x^{\ell}_{ij}$, $i \in [n]$, $1 \le \ell \le c_i$, and $j \in [i]$ by $(x_1, \dots, x_n) = (x^{\ell}_{ij})_{i \in [n], 1 \le \ell \le c_i, j \in [i]}$, that is:

    $$(x_1, \dots, x_n) = (\dots, x^{c_{k-1}}_{k-1,k-1}, x^{1}_{k1}, \dots, x^{1}_{kk}, x^{2}_{k1}, \dots, x^{2}_{kk}, \dots, x^{c_k}_{k1}, \dots, x^{c_k}_{kk}, x^{1}_{k+1,1}, \dots)$$

    We define a permutation $f(\pi)$ of cycle type $(c_1, \dots, c_n)$ by

    $$f(\pi) = \prod_{k=1}^{n} (x^{1}_{k1}, \dots, x^{1}_{kk}) (x^{2}_{k1}, \dots, x^{2}_{kk}) \cdots (x^{c_k}_{k1}, \dots, x^{c_k}_{kk})$$

    where the product is ordered by the increasing index $k$. Then $f$ maps $\mathrm{Sym}(n)$ surjectively onto the set $S(c_1, \dots, c_n)$ of permutations with cycle type $(c_1, \dots, c_n)$. By Lemma 3.1 each permutation in this set has $\prod_k k^{c_k} \prod_k c_k!$ preimages. This shows

    $$|S(c_1, \dots, c_n)| = \frac{n!}{\prod_{k=1}^{n} k^{c_k} \prod_{k=1}^{n} c_k!}.$$

    □
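The formula of Theorem 3.2 can be compared with a direct census of $\mathrm{Sym}(5)$. The sketch below assumes 0-based permutations stored as tuples of images; the names `cycle_type`, `class_size` and `census` are illustrative.

```python
from collections import Counter
from itertools import permutations
from math import factorial


def cycle_type(perm):
    """perm[i] is the image of i (0-based); return the cycle type (c_1, ..., c_n)."""
    n = len(perm)
    seen = [False] * n
    counts = [0] * n
    for start in range(n):
        if not seen[start]:
            length, current = 0, start
            while not seen[current]:
                seen[current] = True
                current = perm[current]
                length += 1
            counts[length - 1] += 1
    return tuple(counts)


def class_size(ctype):
    """Evaluate n! / (prod_k k^{c_k} c_k!) for the cycle type (c_1, ..., c_n)."""
    n = sum(k * c for k, c in enumerate(ctype, start=1))
    denominator = 1
    for k, c in enumerate(ctype, start=1):
        denominator *= k ** c * factorial(c)
    return factorial(n) // denominator


# Count how often each cycle type occurs among the 120 permutations of 5 symbols.
census = Counter(cycle_type(p) for p in permutations(range(5)))
print(all(census[t] == class_size(t) for t in census))
```

The number of distinct keys in `census` is the number of cycle types that occur in $\mathrm{Sym}(5)$.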

    Remark. It is well known that two permutations in $\mathrm{Sym}(n)$ are conjugate iff they have the same cycle type. So the number of conjugacy classes is the number of representations of $n$ of the form $n = n_1 + \cdots + n_k$ with $k \in [n]$ and $n_1 \ge n_2 \ge \cdots \ge n_k \ge 1$. The section on integer partitions studies these numbers. For a given cycle type $(c_1, \dots, c_n)$ the above theorem tells us the size of the associated conjugacy class.

    Corollary 3.3 The number of unordered partitions of $[n]$ of type $(c_1, \dots, c_n)$ is

    $$\frac{n!}{\prod_{k=1}^{n} (k!)^{c_k} \prod_{k=1}^{n} c_k!}.$$

    Proof. We map a permutation $\pi$ of cycle type $(c_1, \dots, c_n)$ with cycles $\pi_1, \dots, \pi_m$ to $P(\pi) = \{\mathrm{support}(\pi_1), \dots, \mathrm{support}(\pi_m)\}$, which is a partition of $[n]$. We know by 3.2 that $\mathrm{Sym}(n)$ contains $(n - 1)!$ cycles of length $n$. Therefore any partition $[n] = S_1 \sqcup \cdots \sqcup S_m$ of type $(c_1, \dots, c_n)$ has precisely $((1 - 1)!)^{c_1} ((2 - 1)!)^{c_2} \cdots ((n - 1)!)^{c_n}$ preimages and the assertion follows by 3.2. □

    Definition Define $c(n, k)$ to be the number of $\pi \in \mathrm{Sym}(n)$ with exactly $k$ cycles. The numbers $s(n, k) = (-1)^{n-k} c(n, k)$ are the Stirling numbers of the first kind and $c(n, k)$ is called a signless Stirling number of the first kind. Clearly, $c(n, n) = 1$ for $n \ge 1$ and $c(n, k) = 0$ for $n < k$. We set $c(n, k) = 0$ if $n \le 0$ or $k \le 0$, except $c(0, 0) = 1$.

    Theorem 3.4 The numbers $c(n, k)$ satisfy the relation

    $$c(n, k) = (n - 1) c(n - 1, k) + c(n - 1, k - 1), \quad n, k \ge 1.$$

    Proof. Let $S$ be the set of permutations in $\mathrm{Sym}(n)$ with precisely $k$ cycles. Define $S_0 = \{\pi \in S \mid \pi(n) = n\}$ and $S_1 = \{\pi \in S \mid \pi(n) \ne n\}$ so that we obtain the partition $S = S_0 \sqcup S_1$.
    Then $|S_0| = c(n - 1, k - 1)$ as any permutation in $\mathrm{Sym}(n - 1)$ with $k - 1$ cycles corresponds to a permutation in $S_0$: just extend the permutation with the cycle $(n)$.
    We claim $|S_1| = (n - 1) c(n - 1, k)$. Once the claim is shown the theorem follows. Let $\pi = \gamma_1 \cdots \gamma_k$ be a permutation in $S_1$ with the cycles $\gamma_i$. Deleting $n$ in the cycle which contains $n$ we obtain a permutation $f(\pi)$ in $\mathrm{Sym}(n - 1)$ with $k$ cycles. If on the other hand $\pi' \in \mathrm{Sym}(n - 1)$ has $k$ cycles we can place the symbol $n$ after any of the elements $1, 2, \dots, n - 1$ in its cycle and we obtain $n - 1$ permutations in $S_1$. Since $f$ maps $S_1$ surjectively onto the permutations in $\mathrm{Sym}(n - 1)$ with $k$ cycles and every such permutation has exactly $n - 1$ preimages, the claim follows. □

    Theorem 3.5 For $0 \le n \in \mathbb{Z}$ define

    $$F_n(X) = X (X + 1) \cdots (X + n - 1) \in \mathbb{R}[X].$$

    Then

    $$F_n(X) = \sum_{k=0}^{n} c(n, k) X^k.$$

    Proof. Write $F_n(X) = \sum_{k=0}^{n} b(n, k) X^k$. As $F_0(X) = 1$ we have $b(0, 0) = 1$ and $b(n, 0) = 0$ for $n \ge 1$. From

    $$F_n(X) = (X + n - 1) F_{n-1}(X) = \sum_{k=1}^{n} b(n - 1, k - 1) X^k + (n - 1) \sum_{k=0}^{n-1} b(n - 1, k) X^k$$

    we obtain

    $$b(n, k) = (n - 1) b(n - 1, k) + b(n - 1, k - 1).$$

    Therefore the numbers $b(n, k)$ satisfy the same relations as the $c(n, k)$. As the initial conditions agree, we get $b(n, k) = c(n, k)$ for all $k, n \ge 0$ and we are done. □
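Theorems 3.4 and 3.5 can be cross-checked for small $n$: multiplying out $F_n(X)$ factor by factor is the recurrence in disguise, and the resulting coefficients should match a direct count of permutations by number of cycles. The function names below are illustrative assumptions, and polynomials are stored as coefficient lists with the constant term first.

```python
from itertools import permutations


def count_cycles(perm):
    """Number of cycles of the permutation i -> perm[i] (0-based)."""
    seen = [False] * len(perm)
    cycles = 0
    for start in range(len(perm)):
        if not seen[start]:
            cycles += 1
            current = start
            while not seen[current]:
                seen[current] = True
                current = perm[current]
    return cycles


def stirling_first_row(n):
    """Coefficients [c(n, 0), ..., c(n, n)] of F_n(X) = X (X + 1) ... (X + n - 1)."""
    coeffs = [1]  # F_0(X) = 1
    for m in range(n):
        shifted = [0] + coeffs                  # X * F_m(X)
        scaled = [m * c for c in coeffs] + [0]  # m * F_m(X)
        coeffs = [s + t for s, t in zip(shifted, scaled)]
    return coeffs


def stirling_first_row_bruteforce(n):
    """Count the permutations of n symbols by their number of cycles."""
    row = [0] * (n + 1)
    for perm in permutations(range(n)):
        row[count_cycles(perm)] += 1
    return row


print(stirling_first_row(4), stirling_first_row_bruteforce(4))
```

Each pass through the loop in `stirling_first_row` multiplies by $(X + m)$, which is the step from $F_m$ to $F_{m+1}$ in the proof.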

    A maximal chain in $[n]$ is a subset $X = \{X_0, \dots, X_n\} \subseteq 2^{[n]}$ such that $X_{i-1} \subset X_i$ for $i \in [n]$. Let $a = (a_1, \dots, a_n) \in [n]^n$ be a sequence with distinct entries. We call $a_i$ a left-to-right maximum or LRM if $a_i > a_j$ for all $j < i$. For instance $a_1$ is always an LRM. Let $(a_{i_1}, \dots, a_{i_k})$ be the sequence of LRMs. We define a permutation $\pi_a$ by taking

    $$(a_{i_1}, \dots, a_{i_2 - 1}) \cdots (a_{i_{k-1}}, \dots, a_{i_k - 1}) (a_{i_k}, \dots, a_n)$$

    as its cycle decomposition. For instance if $a = (2, 1, 8, 4, 5, 7, 9, 3, 6)$ then $\pi_a = (2, 1)(8, 4, 5, 7)(9, 3, 6)$. Permutations are particularly interesting since they have many representations (interpretations). We record a few of them below:

    Proposition 3.6 Each of the following sets has size $n!$.

    (a) $\mathrm{Sym}(n)$.

    (b) The set of linear orderings of $[n]$.

    (c) The set of sequences $a = (a_1, \dots, a_n) \in [n]^n$ with distinct entries.

    (d) The set of permutations $\pi_a$, $a = (a_1, \dots, a_n) \in [n]^n$ a sequence with distinct entries.

    (e) The set of maximal chains of $[n]$.

    Proof. (a) is Theorem 1.4.
    Let $\pi$ be a permutation. Then $a_\pi = (\pi(1), \dots, \pi(n))$ is a sequence with distinct entries, and if $a = (a_1, \dots, a_n) \in [n]^n$ is a sequence with distinct entries then the map $i \mapsto a_i$ is a permutation in $\mathrm{Sym}(n)$. This shows that (c) holds.
    The sequence $a = (a_1, \dots, a_n)$ with distinct entries defines a linear ordering $\prec_a$ of $[n]$ by $a_1 \prec_a a_2 \prec_a \cdots \prec_a a_n$ and each linear ordering defines a sequence with distinct entries by reversing this process. This shows (b).
    The cycle representation of a permutation can be chosen such that each cycle starts with its largest element and that the first elements of the cycles form an increasing sequence. This shows that the map $a \mapsto \pi_a$ is surjective and (d) follows.
    For a chain $X = \{X_0, \dots, X_n\}$ define a sequence $a_X = (a_1, \dots, a_n)$ where $X_i \setminus X_{i-1} = \{a_i\}$. It is clear that the map $X \mapsto a_X$ is injective and that the entries of $a_X$ are distinct. (e) follows. □

    Definition The representation of a permutation $\pi$ as a sequence $a = (a_1, \dots, a_n)$ with distinct entries such that $\pi = \pi_a$ is called the standard representation of $\pi$. The standard representation contains the full information about the cycle decomposition and is therefore an ideal tool to store permutations.
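The passage from a sequence $a$ to the permutation $\pi_a$ via left-to-right maxima can be sketched in a few lines. The function names are illustrative assumptions, and a permutation is returned as a dictionary of images.

```python
from itertools import permutations


def cycles_from_sequence(a):
    """Cut a sequence with distinct entries into cycles, opening a new cycle
    at every left-to-right maximum."""
    cycles = []
    running_max = None
    for entry in a:
        if running_max is None or entry > running_max:
            cycles.append([entry])
            running_max = entry
        else:
            cycles[-1].append(entry)
    return [tuple(c) for c in cycles]


def permutation_from_cycles(cycles):
    """Return the map x -> image of x described by the given disjoint cycles."""
    images = {}
    for cycle in cycles:
        for position, entry in enumerate(cycle):
            images[entry] = cycle[(position + 1) % len(cycle)]
    return images


def distinct_permutations(n):
    """Number of different permutations pi_a obtained from all sequences a over [n]."""
    results = set()
    for a in permutations(range(1, n + 1)):
        images = permutation_from_cycles(cycles_from_sequence(a))
        results.add(tuple(sorted(images.items())))
    return len(results)


print(cycles_from_sequence((2, 1, 8, 4, 5, 7, 9, 3, 6)))
```

Applied to the example sequence of the text, `cycles_from_sequence` returns the cycles $(2, 1)$, $(8, 4, 5, 7)$ and $(9, 3, 6)$; `distinct_permutations(n)` counts how many different permutations arise, which relates to part (d) of Proposition 3.6.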

    Notes Much more can be said about the combinatorial properties of permutations. A good introduction to this subject is chapter 1.2 in Stanley's book on permutation statistics. A particularly interesting subject is integer partitions, which connect combinatorics with number theory and the theory of symmetric groups. Chapters 7 and 9 of our notes treat integer partitions. A comprehensive book on symmetric groups is G. James, A. Kerber, The representation theory of the symmetric groups, Reading, 1981.

    4 Principle of Inclusion-Exclusion

    The Principle of Inclusion-Exclusion (PIE), roughly speaking, determines the size of a set $S$ by starting with a set $T \supseteq S$ where $|T|$ is known. Then one subtracts from $T$ the unwanted elements in a fashion which keeps track of the diminishing size until one obtains $|S|$. Abstractly this method is nothing more than the inversion of a triangular matrix, which is a simple task in linear algebra. However, the great value of the PIE lies in its wide applicability. This section contains a number of such applications. Here is the simplest form of this method:

    Let $A_1, A_2, \dots$ be finite sets. Clearly,

    $$|A_1 \cup A_2| = |A_1| + |A_2| - |A_1 \cap A_2|.$$

    It is also not hard to see that

    $$|A_1 \cup A_2 \cup A_3| = |A_1| + |A_2| + |A_3| - |A_1 \cap A_2| - |A_1 \cap A_3| - |A_2 \cap A_3| + |A_1 \cap A_2 \cap A_3|$$

    holds. More generally the following formula, called the Principle of Inclusion-Exclusion or PIE for short, holds:

    Theorem 4.1 (PIE) Let $A_1, \dots, A_n$ be finite sets. Then:

    $$\Big| \bigcup_{i=1}^{n} A_i \Big| = \sum_{i=1}^{n} (-1)^{i-1} \sum_{T \in \binom{[n]}{i}} \Big| \bigcap_{j \in T} A_j \Big|.$$

    Proof. This theorem can be verified by induction. We give a bijective argument. It is enough to show:
    Claim: The RHS counts each element of $\bigcup_{i=1}^{n} A_i$ precisely once.
    By relabeling if necessary, an element $x \in \bigcup_i A_i$ lies in $A_1, \dots, A_p$ but does not lie in $A_{p+1}, \dots, A_n$. Then $x$ is counted by the term $|\bigcap_{j \in T} A_j|$, $T \in \binom{[n]}{i}$, iff $T \in \binom{[p]}{i}$. Hence we obtain for $x$ the contribution

    $$\sum_{i=1}^{p} (-1)^{i-1} \binom{p}{i} = \sum_{i=0}^{p} (-1)^{i-1} \binom{p}{i} + 1 = 1.$$

    □
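Theorem 4.1 translates directly into a short computation over all non-empty index sets. The following is a minimal sketch; the function name `union_size_by_pie` and the example sets (multiples of 2, 3 and 5 up to 30) are illustrative assumptions.

```python
from itertools import combinations


def union_size_by_pie(sets):
    """Evaluate the right-hand side of the PIE for a list of finite sets."""
    total = 0
    for i in range(1, len(sets) + 1):
        sign = (-1) ** (i - 1)
        for chosen in combinations(sets, i):
            total += sign * len(set.intersection(*chosen))
    return total


# Hypothetical example: numbers in 1..30 divisible by 2, by 3, or by 5.
universe = range(1, 31)
sets = [{x for x in universe if x % d == 0} for d in (2, 3, 5)]
print(union_size_by_pie(sets), len(set.union(*sets)))
```

The alternating sum has $2^n - 1$ terms, so this direct evaluation is only practical for a small number of sets.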

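    The next, more general form of the PIE looks very different.

    Theorem 4.2 (general PIE) Let $S$ be a finite set, $G$ an abelian group, and $f, g$ mappings from $2^S$ into $G$.

    (a) Suppose $f(A) = \sum_{B \subseteq A} g(B)$ for $A \in 2^S$. Then $g(A) = \sum_{B \subseteq A} (-1)^{|A \setminus B|} f(B)$.

    (b) Suppose $f(A) = \sum_{A \subseteq B} g(B)$ for $A \in 2^S$. Then $g(A) = \sum_{A \subseteq B} (-1)^{|B \setminus A|} f(B)$.

    Proof. (a) We have:

    $$\sum_{B \subseteq A} (-1)^{|A \setminus B|} f(B) = \sum_{B \subseteq A} (-1)^{|A \setminus B|} \sum_{C \subseteq B} g(C) = \sum_{C \subseteq B \subseteq A} g(C) (-1)^{|A \setminus B|} = \sum_{C \subseteq A} g(C) \sum_{C \subseteq B \subseteq A} (-1)^{|A \setminus B|}.$$

    For $A, C$ fixed assume $|A \setminus C| = m$. Then

    $$\sum_{C \subseteq B \subseteq A} (-1)^{|A \setminus B|} = \sum_{T \in 2^{A \setminus C}} (-1)^{|T|} = \sum_{i=0}^{m} (-1)^i \binom{m}{i} = \begin{cases} 1, & m = 0, \\ 0, & m > 0. \end{cases}$$

    Thus

    $$\sum_{B \subseteq A} (-1)^{|A \setminus B|} f(B) = \sum_{C = A} g(C) (-1)^0 = g(A).$$

    The verification of (b) is similar. □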

    We deduce the most common form of the PIE, which is also called the sieve formula. It generalizes 4.1 by allowing a weight function on the elements.

    Let $S$ be an $n$-set and $w : S \to G$ be a mapping, called weight, into an abelian group $G$. Let $(P_1, \dots, P_N)$ be a sequence of elements from $2^S$. We imagine that each $P_i$ represents a property and $x \in S$ has property $P_i$ iff $x \in P_i$. For the subset $A = \{i_1, \dots, i_r\}$ of size $r$ of $[N]$ we let $W(A) = W(P_{i_1}, \dots, P_{i_r})$ be the sum of the weights of those elements of $S$ which satisfy each of the properties $P_{i_1}, \dots, P_{i_r}$, and we denote by $E(A) = E(P_{i_1}, \dots, P_{i_r})$ the sum of the weights of those elements of $S$ which satisfy each of the properties $P_{i_1}, \dots, P_{i_r}$ but no others. We would like to compute the sum of the weights of those elements which have precisely $m$ properties.

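    Theorem 4.3 (Sieve Formula) Define $W(m) = \sum_{A \in \binom{[N]}{m}} W(A)$ and $E(m) = \sum_{A \in \binom{[N]}{m}} E(A)$. Then

    $$E(m) = \sum_{i=m}^{N} (-1)^{i-m} \binom{i}{m} W(i).$$

    In particular

    $$E(0) = W(0) - W(1) + W(2) - \cdots + (-1)^N W(N).$$

    Proof. For $A \in \binom{[N]}{r}$ we have

    $$W(A) = \sum_{A \subseteq B} E(B).$$

    By Theorem 4.2 we get

    $$E(A) = \sum_{A \subseteq B} (-1)^{|B \setminus A|} W(B).$$

    Hence

    $$E(m) = \sum_{|A| = m} E(A) = \sum_{|A| = m} \sum_{A \subseteq B} (-1)^{|B \setminus A|} W(B) = \sum_{i \ge m} (-1)^{i-m} \sum_{A \subseteq B, |A| = m, |B| = i} W(B)$$

    $$= \sum_{i \ge m} (-1)^{i-m} \sum_{|B| = i} W(B) \sum_{A \in \binom{B}{m}} 1 = \sum_{i \ge m} (-1)^{i-m} \binom{i}{m} W(i).$$

    □

    The sieve formula is useful in situations where it is hard to see how many elements have exactly $n$ properties but it is easy to see how many elements have at least $m \ge n$ properties. We now give various applications of the sieve formula. Our first example shows that Theorem 4.3 implies Theorem 4.1.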

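    Example. Let $A_1, \dots, A_n$ be subsets of the finite set $A$. We say that $x \in A$ has property $i$ iff $x \in A_i$. For $T \subseteq [n]$

    $$A_T = \bigcap_{i \in T} A_i$$

    is the set of elements that have at least the properties in $T$. In particular $A_\emptyset$ is the set of elements with no requirements, i.e. $A_\emptyset = A$. Define finally the weight function by $w(x) = 1$, $x \in A$. Then $|A| = \sum_{x \in A} w(x)$ and $W(T) = \sum_{x \in A_T} 1 = |A_T| = |\bigcap_{i \in T} A_i|$ and hence

    $$W(m) = \sum_{|T| = m} W(T) = \sum_{|T| = m} \Big| \bigcap_{i \in T} A_i \Big|.$$

    The elements which have none of the properties in $[n]$ form the set $A \setminus \bigcup_{i=1}^{n} A_i$. Then by 4.3

    $$\Big| A \setminus \bigcup_{i=1}^{n} A_i \Big| = E(0) = W(0) - W(1) + W(2) - \cdots + (-1)^n W(n).$$

    Now $W(0) = |A_\emptyset| = |A|$. So if $A = A_1 \cup \cdots \cup A_n$ the LHS is 0 and we get 4.1:

    $$|A| = W(0) = \sum_{i=1}^{n} (-1)^{i-1} W(i) = \sum_{i=1}^{n} (-1)^{i-1} \sum_{|T| = i} \Big| \bigcap_{j \in T} A_j \Big|.$$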

    Theorem 4.4 The number of surjections of an $n$-set onto a $k$-set is

    \[ \sum_{j=0}^{k} (-1)^j \binom{k}{j} (k-j)^n. \]

    Proof. Let $X$ denote the set of maps from $[n]$ to $[k]$. If $i \notin f([n])$ we say that $f$ has property $i$. We have to compute the number $E(0)$ of maps in $X$ which have none of the properties in $[k]$. Clearly, $X_i = \{f \in X \mid i \notin f([n])\}$ has size $(k-1)^n$, and if $T \subseteq [k]$ the set $X_T = \{f \in X \mid f([n]) \subseteq [k] \setminus T\}$ has size $(k-|T|)^n$. Then by the sieve formula

    \[ E(0) = \sum_{j=0}^{k} (-1)^j \binom{k}{j} (k-j)^n. \]

    □

    Definition One denotes by $S(n,k)$ the number of (unordered) partitions of size $k$ of an $n$-set. These numbers are the Stirling numbers of the second kind. By $B(n)$ one denotes the number of all partitions of an $n$-set. The numbers $B(n)$ are called Bell numbers. Of course:

    \[ B(n) = \sum_{k=0}^{n} S(n,k). \]

    Theorem 4.5

    \[ S(n,k) = \frac{1}{k!} \sum_{i=0}^{k} (-1)^i \binom{k}{i} (k-i)^n \]

    for $1 \le k \le n$.

    Proof. A surjective map $f : [n] \to [k]$ defines the partition $P_f := \biguplus_{i=1}^{k} f^{-1}(\{i\})$. Define on the set $F$ of surjections from $[n]$ to $[k]$ an equivalence relation by $f \sim g$ iff $P_f = P_g$. Then $f \sim g$ iff there exists a permutation $\pi \in \mathrm{Sym}(k)$ with $g = \pi \circ f$. So every equivalence class has size $k!$, and the maps of one class produce the same partition while nonequivalent maps define different partitions. Then by 4.4:

    \[ S(n,k) = \frac{1}{k!} \sum_{j=0}^{k} (-1)^j \binom{k}{j} (k-j)^n. \]

    □

    Remarks. (a) It is easy to see that the Stirling numbers of the second kind obey the recursive law

    \[ S(n,k) = S(n-1,k-1) + k\,S(n-1,k). \]

    Using $S(n,1) = S(n,n) = 1$ one obtains an algorithm which computes the Stirling numbers of the second kind faster than the sieve formula 4.5.

    (b) The theorem yields for the Bell numbers the formula:

    \[ B(n) = \sum_{k=0}^{n} \sum_{i=0}^{k} \frac{(-1)^i}{k!} \binom{k}{i} (k-i)^n. \]

    This can be used to derive the formula of Dobinski:

    \[ B(n+1) = e^{-1} \sum_{i=0}^{\infty} \frac{(i+1)^n}{i!}. \]
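The two ways of computing Stirling numbers mentioned in the remarks can be compared directly. The sketch below is only an illustration: the function names are made up here, and the table uses the convention $S(0,0) = 1$, $S(n,0) = 0$ for $n > 0$, which the notes introduce later in Section 6.

```python
from math import comb, factorial


def surjections(n, k):
    """Number of surjections from an n-set onto a k-set (Theorem 4.4)."""
    return sum((-1) ** j * comb(k, j) * (k - j) ** n for j in range(k + 1))


def stirling2_sieve(n, k):
    """S(n, k) via the sieve formula of Theorem 4.5."""
    return surjections(n, k) // factorial(k)


def stirling2_table(n_max):
    """All S(n, k) for 0 <= k <= n <= n_max via the recurrence of remark (a)."""
    S = [[0] * (n_max + 1) for _ in range(n_max + 1)]
    S[0][0] = 1  # convention S(0, 0) = 1
    for n in range(1, n_max + 1):
        for k in range(1, n + 1):
            S[n][k] = S[n - 1][k - 1] + k * S[n - 1][k]
    return S


def bell(n):
    """Bell number B(n) as the row sum of the Stirling table."""
    return sum(stirling2_table(n)[n])


table = stirling2_table(6)
print(table[4][2], stirling2_sieve(4, 2))
print([bell(n) for n in range(7)])
```

The recurrence fills the whole table with one addition and one multiplication per entry, whereas the sieve formula needs a sum of $k+1$ large terms for each single entry.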

    Definition A permutation $\pi \in \mathrm{Sym}(n)$ is a derangement if $\pi(i) \ne i$ for $1 \le i \le n$. The number of derangements in $\mathrm{Sym}(n)$ is denoted by $D(n)$. For instance $D(1) = 0$, $D(2) = 1$, $D(3) = 2$, and $D(4) = 9$.

    Theorem 4.6

    \[ D(n) = n! \sum_{i=0}^{n} \frac{(-1)^i}{i!}. \]

    Proof. For $T \subseteq [n]$ let $W(T) = |\{\pi \in \mathrm{Sym}(n) \mid \pi(i) = i,\ i \in T\}|$ be the number of permutations which fix all elements of $T$. Clearly, $W(T) = |\mathrm{Sym}([n] \setminus T)| = (n-|T|)!$ and $W(i) = \sum_{|T|=i} W(T) = \binom{n}{i}(n-i)!$. We obtain by 4.3

    \[ D(n) = E(0) = \sum_{i=0}^{n} (-1)^i W(i) = \sum_{i=0}^{n} (-1)^i \binom{n}{i} (n-i)! = n! \sum_{i=0}^{n} \frac{(-1)^i}{i!}. \]

    □

    Remark. For large $n$ the number $n!\,e^{-1}$ is a good approximation of $D(n)$: We have $e^{-1} = \sum_{i=0}^{\infty} (-1)^i / i!$. Thus

    \[ |D(n) - n!\,e^{-1}| = \Bigl| n! \sum_{i=n+1}^{\infty} (-1)^i / i! \Bigr| < 1/(n+1) \]

    by an estimate from the proof of the Leibniz criterion for alternating series.
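Since the error bound $1/(n+1)$ is at most $1/2$ for $n \ge 1$, rounding $n!/e$ to the nearest integer should recover $D(n)$. The following sketch (function names are illustrative; floating-point rounding is only trustworthy for moderate $n$) checks this against the exact sieve expression from the proof of Theorem 4.6.

```python
from math import comb, e, factorial


def derangements(n):
    """D(n) from the sieve expression sum (-1)^i C(n, i) (n - i)!."""
    return sum((-1) ** i * comb(n, i) * factorial(n - i) for i in range(n + 1))


def derangements_rounded(n):
    """Nearest integer to n!/e; agrees with D(n) for n >= 1 (moderate n only,
    because this uses floating-point arithmetic)."""
    return round(factorial(n) / e)


for n in range(1, 9):
    print(n, derangements(n), derangements_rounded(n))
```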

    We come to two applications of the PIE to number theory.

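    Theorem 4.7 Let $n$ be a positive integer and $a_1, \ldots, a_N$ positive integers which are pairwise coprime. Then the number of integers $k \in [n]$ which are not divisible by any $a_i$, $1 \le i \le N$, is:

    \[ n - \sum_{1 \le i \le N} \Bigl\lfloor \frac{n}{a_i} \Bigr\rfloor + \sum_{1 \le i < j \le N} \Bigl\lfloor \frac{n}{a_i a_j} \Bigr\rfloor - \cdots + (-1)^N \Bigl\lfloor \frac{n}{a_1 \cdots a_N} \Bigr\rfloor. \]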

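    Definition Let $\mathbb{Z}^+$ be the set of positive integers. The Euler function $\varphi : \mathbb{Z}^+ \to \mathbb{Z}$ is defined by $\varphi(n) = |\{k \in [n] \mid (k,n) = 1\}|$.

    Theorem 4.8 Let $n$ be a positive integer. Then

    \[ \varphi(n) = n \prod_{p} \Bigl(1 - \frac{1}{p}\Bigr) \]

    where $p$ ranges over the prime divisors of $n$.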

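    Proof. Let $\{p_1, \ldots, p_N\}$ be the set of prime divisors of $n$. Clearly, $k$ is coprime to $n$ iff $k$ is not divisible by any of $p_1, \ldots, p_N$. By theorem 4.7 we have

    \[ \varphi(n) = n - \sum_{1 \le i \le N} \Bigl\lfloor \frac{n}{p_i} \Bigr\rfloor + \sum_{1 \le i < j \le N} \Bigl\lfloor \frac{n}{p_i p_j} \Bigr\rfloor - \cdots \]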

    The summation extends over all $m$-permutations $(i_1, \ldots, i_m)$ of $[n]$. In the most important case $m = n$ this formula is

    \[ \mathrm{per}(A) = \sum_{\pi \in \mathrm{Sym}(n)} a_{1,\pi(1)} \cdots a_{n,\pi(n)}. \]

    So, loosely speaking, the permanent is a signless determinant.

    Properties of the permanent
    (a) $\mathrm{per}(A)$ remains invariant under arbitrary permutations of the rows and the columns.
    (b) Multiplication of a row by $a \in R$ replaces $\mathrm{per}(A)$ by $a\,\mathrm{per}(A)$.
    (c) Let $m = n$. Then $\mathrm{per}(A)$ is invariant under transposition:

    \[ \mathrm{per}(A) = \mathrm{per}(A^t). \]

    The Laplace expansion for determinants also has a signless analogue for permanents. However:

    Remark The multiplicative law

    \[ \det(AB) = \det(A)\det(B) \]

    is false for permanents. Also the addition of a multiple of one row of $A$ to another does not leave $\mathrm{per}(A)$ invariant. This makes the evaluation of permanents difficult. The PIE however provides a means for the evaluation of permanents:

    Theorem 4.9 (Ryser) Let $A$ be an $m \times n$ matrix over a commutative ring, $m \le n$. For an $m \times (n-r)$ submatrix $B$ of $A$ we denote by $P(B)$ the product over all row sums of $B$. Finally we denote by $S(r) = \sum P(B)$ the sum of all $P(B)$ where $B$ ranges over all $m \times (n-r)$ submatrices. Then

    \[ \mathrm{per}(A) = \sum_{i=0}^{m-1} \binom{n-m+i}{i} (-1)^i S(n-m+i). \]

    Proof. Set $S = [n]^m$ and define the weight of the element $(j_1, \ldots, j_m)$ of $S$ by

    \[ a_{1 j_1} a_{2 j_2} \cdots a_{m j_m}. \]

    We say that the sequence $(j_1, \ldots, j_m)$ has property $P_i$ if $i$ does not occur in the sequence. Suppose that the submatrix $B$ is obtained by deleting the columns $i_1, \ldots, i_r$. Set $I = \{i_1, \ldots, i_r\}$. Then

    \[ P(B) = \prod_{i=1}^{m} \Bigl( \sum_{k \notin I} a_{ik} \Bigr) = \sum_{(j_1, \ldots, j_m)} a_{1,j_1} \cdots a_{m,j_m} = W(I). \]

    Here $(j_1, \ldots, j_m)$ ranges over the sequences which do not contain elements from $I$. Hence $W(r) = S(r)$. The element $\prod_i a_{i,j_i}$ occurs in the defining sum of the permanent iff the sequence $(j_1, \ldots, j_m)$ is an $m$-permutation. Now all entries in $(j_1, \ldots, j_m)$ are distinct iff this sequence has precisely $n-m$ of the properties $P_i$. By the sieve formula:

    \[ \mathrm{per}(A) = E(n-m) \]
    \[ = W(n-m) - \binom{n-m+1}{1} W(n-m+1) + \binom{n-m+2}{2} W(n-m+2) - \cdots + (-1)^{m-1} \binom{n-1}{m-1} W(n-1) \]
    \[ = \sum_{i=0}^{m-1} \binom{n-m+i}{i} (-1)^i S(n-m+i). \]

    □

    Corollary 4.10 Let $A \in R^{n \times n}$, $R$ a commutative ring. Then

    \[ \mathrm{per}(A) = \sum_{i=0}^{n-1} (-1)^i S(i). \]
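Corollary 4.10 translates almost literally into code. The sketch below is a naive illustration for square integer matrices: it enumerates all column subsets, so it is not an optimized implementation of Ryser's method, and the function names are assumptions made here. It is checked against the defining sum over permutations.

```python
from itertools import combinations, permutations
from math import prod


def permanent_ryser(A):
    """per(A) = sum_{r=0}^{n-1} (-1)^r S(r) for a square matrix A, where S(r)
    sums the products of row sums over all ways to delete r columns."""
    n = len(A)
    total = 0
    for r in range(n):
        S_r = 0
        for kept in combinations(range(n), n - r):
            S_r += prod(sum(row[j] for j in kept) for row in A)
        total += (-1) ** r * S_r
    return total


def permanent_bruteforce(A):
    """Defining sum over all permutations, used only as a cross-check."""
    n = len(A)
    return sum(prod(A[i][p[i]] for i in range(n)) for p in permutations(range(n)))


A = [[1, 2], [3, 4]]
print(permanent_ryser(A), permanent_bruteforce(A))
```

The sieve version sums over the $2^n - 1$ nonempty column subsets instead of $n!$ permutations, which is where the practical advantage comes from for larger $n$.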

    The next theorem of Gaschütz uses the PIE in a clever, implicit way.

    Theorem 4.11 (Gaschütz) Let $G$ be a finite group which can be generated by $n$ elements. Let $N \trianglelefteq G$ be a normal subgroup and $G/N = \langle a_1 N, \ldots, a_n N \rangle$. Then there exist $b_i \in a_i N$, $1 \le i \le n$, such that $G = \langle b_1, \ldots, b_n \rangle$.

    Proof. Set

    \[ \mathcal{X} = \{ B = \{b_1, \ldots, b_n\} \mid (b_1, \ldots, b_n) \in a_1 N \times \cdots \times a_n N \}. \]

    Then $|\mathcal{X}| = |N|^n$. Let $\mathcal{Y}$ be the subset of those $B \in \mathcal{X}$ which do not generate $G$, i.e. $\langle B \rangle < G$, and let $M_1, \ldots, M_r$ be the set of maximal subgroups with $M_i N = G$. Note that we can assume that such maximal subgroups exist: We can assume that $U = \langle a_1, \ldots, a_n \rangle$ is a proper subgroup of $G$. Choose a subgroup $U < L$ of minimal order. If $L = G$ then $U$ is one of the $M_i$'s. If however $L \le M < G$ with a maximal subgroup $M$ then $G = UN \le MN$ and $M$ is one of the $M_i$'s.

    For a subgroup $U \le G$ we define

    \[ \delta_U = \begin{cases} 1, & G = NU, \\ 0, & G > NU. \end{cases} \]

    If $G = NU$ then $a_i = n_i b_i$ with suitable $n_i \in N$, $b_i \in U$, and $a_i = n_i' b_i'$, $n_i' \in N$, $b_i' \in U$ iff $n_i' = n_i x$, $b_i' = x^{-1} b_i$ for some $x \in N \cap U$. Moreover $\{b_1, \ldots, b_n\} \in \mathcal{X}$. Therefore for any subgroup $U$ the number of $B \in \mathcal{X}$ with $B \subseteq U$ is $\delta_U |N \cap U|^n$.

    We say that $B \in \mathcal{X}$ has property $i \in [r]$ if $B \subseteq M_i$. If $B$ has none of the properties in $[r]$ then $\langle B \rangle = G$: Otherwise $\langle B \rangle \le M$ with a maximal subgroup $M$ with $NM < G$. But $G = \langle a_1, \ldots, a_n, N \rangle = \langle B, N \rangle \le MN < G$, a contradiction. Thus $|\mathcal{X} \setminus \mathcal{Y}|$ is the number of $B \in \mathcal{X}$ which generate $G$. By the sieve formula:

    |X Y | =r

    k=0

    (1)k

    1j1 0.

    2

    Notes A poset (partially ordered set) is a pair $(S, \le)$ where $S$ is a set and $\le$ is an order relation on $S$. For example, the positive integers with divisibility as the ordering form a poset. In this general setting one can define the Möbius function and Möbius inversion, and one obtains many interesting applications. For this generalization and other generalizations of the sieve methods we refer to Chapter 2 in the book of Stanley and Chapter 5 in the book of Aigner.

    5 Formal Power Series

    Formal power series rings are a notion of commutative algebra. They form an important class of commutative rings which behave, in a certain sense, in a less complicated way than polynomial rings. Power series (under the name generating functions) form a useful tool in combinatorics. For combinatorial applications one may either consider a power series as an entity from complex analysis or as a formal power series, which means that the usual convergence questions play no role. Both viewpoints occur in combinatorics and each has its advantages and disadvantages. We will use the formal power series view, which seems to be the more natural standpoint in the combinatorial environment.

    Definition Let $R$ be a ring with identity and $\mathbb{N} = \{z \in \mathbb{Z} \mid z \ge 0\}$ the set of non-negative integers. A formal power series is a map $f : \mathbb{N} \to R$. For $n \in \mathbb{N}$ we call $f(n)$ the coefficient of $f$ at $n$. $f(0)$ is also called the constant term. For power series $f, g$ we define the sum $f + g$ by

    \[ (f+g)(n) = f(n) + g(n) \]

    and the product $fg$ by

    \[ (fg)(n) = \sum_{k=0}^{n} f(k)\,g(n-k). \]

    A power series $f$ is a polynomial if there exists an $N \in \mathbb{N}$ such that $f(n) = 0$ for $n > N$. The conventional way to describe power series uses infinite, formal sums:

    \[ f = \sum_{n=0}^{\infty} f(n) X^n. \]

    Proposition 5.1 The formal power series over $R$ form a ring with identity. This ring contains $R$ and the polynomial ring $R[X]$ as subrings.

    The obvious verification is left to the reader. Of course $f$ is the zero element iff $f(n) = 0$ for all $n$, and $X^0$ is the identity.

    Definition Let $R$ be a ring with identity. The ring in proposition 5.1 is called the ring of formal power series in one variable and is denoted by the symbol $R[[X]]$. For $0 \ne f \in R[[X]]$ we define the order $\mathrm{ord}(f)$ as the smallest integer $n$ such that $f(n) \ne 0$. Formally we set $\mathrm{ord}(0) = \infty$. We denote the $n$-th coefficient of a power series $f$ by $[X^n]f$. So if $f = \sum f(n)X^n$ then $[X^n]f = f(n)$.

    Lemma 5.2 Let $f, g \in R[[X]]$. Then:
    (a) $\mathrm{ord}(f+g) \ge \min\{\mathrm{ord}(f), \mathrm{ord}(g)\}$.
    (b) $\mathrm{ord}(fg) \ge \mathrm{ord}(f) + \mathrm{ord}(g)$. If $R$ is an integral domain then $\mathrm{ord}(fg) = \mathrm{ord}(f) + \mathrm{ord}(g)$.

    Proof. The first assertion is trivial. Let $f = \sum_{m \ge M} f(m)X^m$ and $g = \sum_{n \ge N} g(n)X^n$ where $\mathrm{ord}(f) = M$ and $\mathrm{ord}(g) = N$. Then

    \[ (fg)(M+N) = \sum_{k=0}^{M+N} f(k)\,g(M+N-k) = f(M)\,g(N) \]

    as for $k \ne M$ either $k < M$ or $M+N-k < N$ (i.e. $k > M$), so that $f(k)\,g(M+N-k) = 0$. Similarly $(fg)(n) = 0$ for $n < M+N$. □

    Definition Let $R$ be a ring with identity and $0 \ne f \in R[[X]]$. Then we define the norm of $f$ by

    \[ \|f\| = 2^{-\mathrm{ord}(f)}. \]

    Further we set $\|0\| = 0$. The distance of $f, g$ is $\|f - g\|$. Note that norms lie between 0 and 1. A direct consequence of lemma 5.2 is:

    Proposition 5.3 Let $R$ be a ring with identity and $f, g \in R[[X]]$.
    (a) $\|f\| = 0$ iff $f = 0$.
    (b) $\|fg\| \le \|f\|\,\|g\|$, and if $R$ is an integral domain we have equality.
    (c) $\|f + g\| \le \max\{\|f\|, \|g\|\}$.

    Assertion (c) implies the usual triangle inequality and is called the ultrametric inequality. Therefore $(R[[X]], \|\cdot\|)$ is a metric space.

    Theorem 5.4 Let $R$ be a ring with identity.

    (a) The metric space $(R[[X]], \|\cdot\|)$ is complete, i.e. every Cauchy sequence converges.

    (b) Every power series is a limit of polynomials.

    (c) Let $(f^{(m)})$ be a sequence of power series. We define the series $\sum_{m=0}^{\infty} f^{(m)}$ associated with $(f^{(m)})$ to be the sequence of partial sums $\bigl(\sum_{k=0}^{m} f^{(k)}\bigr)_{m \in \mathbb{N}}$. Then $\sum_{m=0}^{\infty} f^{(m)}$ converges iff $\lim_{m \to \infty} f^{(m)} = 0$.

    Proof. (a) It follows from the definition of the norm that $(f^{(m)})$ is a null sequence iff for each $n$ the sequence $(f^{(m)}(n))_{m \in \mathbb{N}}$ is identically 0 after finitely many steps. So if $(f^{(m)})$ is Cauchy then for any $n$ there exists some $M_n$ such that $f^{(m)}(n) = c_n$ for $m \ge M_n$. Then $(f^{(m)})$ converges to $\sum_{n=0}^{\infty} c_n X^n$.
    (b) For $f \in R[[X]]$ and $m \in \mathbb{N}$ define the polynomial $p_m$ by $p_m = \sum_{k=0}^{m} f(k)X^k$. Then $\lim_{m \to \infty} p_m = f$.
    (c) Suppose $\lim_{m \to \infty} f^{(m)} = 0$ and for $\varepsilon > 0$ choose $N$ such that $\|f^{(m)}\| < \varepsilon$ for $m \ge N$. Then for the difference of two partial sums we get by the ultrametric inequality

    \[ \Bigl\| \sum_{k=m}^{m+n} f^{(k)} \Bigr\| \le \max\{ \|f^{(k)}\| \mid m \le k \le m+n \} < \varepsilon \]

    and $\sum_{m=0}^{\infty} f^{(m)}$ converges by (a). □

    Theorem 5.5 A formal power series over the ring $R$ is invertible iff the constant term is invertible in $R$.

    Proof. Consider a power series $f$ with constant term zero. Then $\|f\| \le 2^{-1}$ and hence $\lim_{m \to \infty} \|f^m\| = 0$, which implies $\lim_{m \to \infty} f^m = 0$. By theorem 5.4 the series $\sum_{m \ge 0} f^m$ converges. We denote the limit by $(1-f)^{-1}$. Indeed

    \[ \Bigl\| \Bigl( (1-f) \sum_{k=0}^{m} f^k \Bigr) - 1 \Bigr\| = \|f^{m+1}\| \le 2^{-m-1} \]

    which shows that $(1-f)^{-1}$ is the inverse of $1-f$. So each power series with constant term 1 is invertible. If $f$ has an invertible constant term then $f(0)^{-1} f$ has constant term 1 and is therefore invertible, and hence $f$ is invertible.

    If on the other hand $f$ is invertible and $g$ is the inverse then $1 = f(0)g(0)$ so that $f(0)$ is invertible. □
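The geometric series argument in the proof of Theorem 5.5 can be tried out on truncated series. In the sketch below a power series is represented by the list of its first $N$ coefficients (as `Fraction` values, so the ring is assumed to be $\mathbb{Q}$); the helper names `mul` and `inverse` are illustrative choices, not notation from the text.

```python
from fractions import Fraction


def mul(f, g, N):
    """First N coefficients of the product fg (Cauchy product)."""
    h = [Fraction(0)] * N
    for i, a in enumerate(f[:N]):
        if a:
            for j, b in enumerate(g[:N - i]):
                h[i + j] += a * b
    return h


def inverse(f, N):
    """First N coefficients of 1/f, assuming f[0] is invertible (nonzero in Q).

    Write f = c (1 - h) with h(0) = 0 and sum the geometric series
    1 + h + h^2 + ...; the power h^m has order >= m, so N terms suffice.
    """
    c = Fraction(f[0])
    h = [Fraction(0)] + [-Fraction(a) / c for a in f[1:N]]
    h += [Fraction(0)] * (N - len(h))
    total = [Fraction(1)] + [Fraction(0)] * (N - 1)
    power = total[:]
    for _ in range(1, N):
        power = mul(power, h, N)
        total = [s + p for s, p in zip(total, power)]
    return [t / c for t in total]


f = [1, -1, -1]  # the polynomial 1 - X - X^2
g = inverse(f, 8)
print([int(c) for c in g])
print(mul(f, g, 8))
```

The truncation is harmless precisely because of the norm estimate in the proof: terms of order at least $N$ cannot influence the first $N$ coefficients.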

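    Definition Let $R$ be a ring with identity and $f, g \in R[[X]]$, and assume that $g$ has constant term zero (or equivalently $\|g\| < 1$). Then $(f(n)g^n)_{n \in \mathbb{N}}$ is a null sequence and $\sum_n f(n)g^n$ converges by theorem 5.4. We call this series the composition of $f$ and $g$ and denote it by $f \circ g$.

    Proposition 5.6 Let $g \in R[[X]]$ be a power series with $\|g\| < 1$. Then $f \circ g \in R[[X]]$ is defined for every $f \in R[[X]]$. Let $(g^{(n)})$ be a sequence of power series with $\|g^{(n)}\| < 1$ which converges to $g$. Then

    \[ \lim_{n \to \infty} f \circ g^{(n)} = f \circ g. \]

    Proof. The first assertion follows from the definition. From the definition of the composition we get $\|f \circ g - f \circ h\| \le \|g - h\|$ and the second assertion follows. □

    Proposition 5.7 Let $(f^{(n)})_{n \ge 1}$ be a sequence of formal power series with $f^{(n)}(0) = 1$. Set $p_n(X) = \prod_{k=1}^{n} f^{(k)}(X)$. Then $(p_n(X))_{n \ge 1}$ is convergent iff $\lim_{n \to \infty} \|f^{(n)} - 1\| = 0$.

    Proof. We have

    \[ \Bigl(1 + \sum_{n \ge K} a_n X^n\Bigr)\Bigl(1 + \sum_{n \ge M} b_n X^n\Bigr) = 1 + a_K X^K + b_M X^M + R(X) \]

    where $R(X)$ contains only terms of degree $> \min\{K, M\}$. Hence

    \[ \Bigl\| \Bigl(1 + \sum_{n \ge K} a_n X^n\Bigr)\Bigl(1 + \sum_{n \ge M} b_n X^n\Bigr) - 1 \Bigr\| \le \max\{2^{-K}, 2^{-M}\}. \]

    By a straightforward induction one gets:

    \[ \Bigl\| \Bigl(1 + \sum_{n \ge K} a_n X^n\Bigr)\Bigl(1 + \sum_{n \ge M} b_n X^n\Bigr) \cdots \Bigl(1 + \sum_{n \ge N} c_n X^n\Bigr) - 1 \Bigr\| \le \max\{2^{-K}, 2^{-M}, \ldots, 2^{-N}\}. \]

    This shows

    \[ \|p_{n+k}(X) - p_n(X)\| \le \|p_n(X)\| \, \Bigl\| \prod_{j=1}^{k} f^{(n+j)}(X) - 1 \Bigr\| \le \max\{ \|f^{(n+j)} - 1\| \mid j \in [k] \}. \]

    Therefore $(p_n)$ is Cauchy iff $\lim_{n \to \infty} \|f^{(n)} - 1\| = 0$. □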

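    Definition Let $(f^{(n)})_{n \ge 1}$ be a sequence of formal power series with $f^{(n)}(0) = 1$ and assume $\lim_{n \to \infty} \|f^{(n)} - 1\| = 0$. By proposition 5.7 $\lim_{n \to \infty} \bigl( \prod_{k=1}^{n} f^{(k)} \bigr)$ exists. This limit is denoted by

    \[ \prod_{n=1}^{\infty} f^{(n)}(X) \]

    and called an infinite product.

    Remark Let $f(X) = \prod_{k=1}^{\infty} f^{(k)}(X)$ be an infinite product. For a given $n$ there exists an $M$ with $\|f^{(m)} - 1\| \le 2^{-n-1}$ for $m > M$. Then

    \[ \Bigl\| f(X) - \prod_{k=1}^{M} f^{(k)}(X) \Bigr\| \le \Bigl\| \prod_{k=1}^{M} f^{(k)}(X) \Bigr\| \, \Bigl\| \prod_{k=M+1}^{\infty} f^{(k)}(X) - 1 \Bigr\| \le 2^{-n-1}. \]

    This shows that the first $M$ terms of the infinite product already determine the coefficient of $X^n$:

    \[ [X^n] f(X) = [X^n] \prod_{k=1}^{M} f^{(k)}(X). \]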
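The remark can be observed on a concrete product. The example $\prod_{k \ge 1} (1 + X^k)$ is chosen here for illustration and does not appear in the text at this point: since $\|(1 + X^k) - 1\| = 2^{-k}$, one may take $M = n$, so the first $n$ factors already fix the coefficient of $X^n$. The function name below is hypothetical.

```python
def product_prefix(M, N):
    """Coefficients of X^0, ..., X^(N-1) of the finite product
    (1 + X)(1 + X^2) ... (1 + X^M)."""
    coeffs = [1] + [0] * (N - 1)
    for k in range(1, M + 1):
        # Multiply by (1 + X^k): new[i] = old[i] + old[i - k].
        # Going downwards keeps the old values available in place.
        for i in range(N - 1, k - 1, -1):
            coeffs[i] += coeffs[i - k]
    return coeffs


# Factors with k >= 10 cannot change the coefficients of X^0, ..., X^9.
print(product_prefix(9, 10))
print(product_prefix(50, 10))
```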

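    Proposition 5.8 Let $\prod_{n=1}^{\infty} f^{(n)}(X)$ and $\prod_{n=1}^{\infty} g^{(n)}(X)$ be infinite products of power series.

    (a) $\lim_{n \to \infty} \bigl( \prod_{k=1}^{n} f^{(k)} g^{(k)} \bigr)$ converges and

    \[ \Bigl( \prod_{n=1}^{\infty} f^{(n)}(X) \Bigr) \Bigl( \prod_{n=1}^{\infty} g^{(n)}(X) \Bigr) = \prod_{n=1}^{\infty} \bigl( f^{(n)}(X)\, g^{(n)}(X) \bigr). \]

    (b)

    \[ \prod_{n=1}^{\infty} (f^{(n)})^{-1}(X) = \Bigl( \prod_{n=1}^{\infty} f^{(n)}(X) \Bigr)^{-1} \]

    (c)

    \[ \prod_{n=1}^{\infty} f^{(n)}(X) \, \Bigl( \prod_{n=1}^{\infty} g^{(n)}(X) \Bigr)^{-1} = \prod_{n=1}^{\infty} \frac{f^{(n)}(X)}{g^{(n)}(X)} \]

    Proof. (a) We have seen $\|f^{(n)} g^{(n)} - 1\| \le \max\{\|f^{(n)} - 1\|, \|g^{(n)} - 1\|\}$, which implies that the RHS converges. Moreover by the remark there exists for a given $n$ a number $M$ such that only the factors up to $M$ on both sides make a non-trivial contribution to the coefficient of $X^n$. Then

    \[ [X^n] \Bigl( \prod_{k=1}^{\infty} f^{(k)}(X) \Bigr) \Bigl( \prod_{k=1}^{\infty} g^{(k)}(X) \Bigr) = [X^n] \Bigl( \prod_{k=1}^{M} f^{(k)}(X) \Bigr) \Bigl( \prod_{k=1}^{M} g^{(k)}(X) \Bigr) \]
    \[ = [X^n] \prod_{k=1}^{M} \bigl( f^{(k)}(X)\, g^{(k)}(X) \bigr) \]
    \[ = [X^n] \prod_{k=1}^{\infty} \bigl( f^{(k)}(X)\, g^{(k)}(X) \bigr) \]

    and we are done.
    (b) and (c) follow from (a). □

    For our purposes $R$ will be a field, usually $\mathbb{Q}$, $\mathbb{R}$, or $\mathbb{C}$. But in Godsil [8] one finds applications of formal power series over noncommutative rings to combinatorial problems. The following theorem is a consequence of the preceding discussion.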

    Theorem 5.9 Let $K$ be a field. Then $K[[X]]$ is a principal ideal ring. The nonzero ideals in $K[[X]]$ are $(X^n)$, $0 \le n \in \mathbb{Z}$.

    Certain formal power series have roots:

    Theorem 5.10 Let $K$ be a field of characteristic 0 and $f \in K[[X]]$ with constant term 1. For $0 < n \in \mathbb{Z}$ there exists a unique power series $g$ with $g(0) = 1$ such that $g^n = f$.

    Proof. Let $h = 1 + \sum_{k \ge 1} h(k)X^k \in K[[X]]$. Then $h^n = 1 + \sum_{k \ge 1} h_n(k)X^k$ with $h_n(1) = n\,h(1)$. For $k \ge 2$ there exist polynomials $p_{nk}(X_1, \ldots, X_{k-1})$ such that $h_n(k) = n\,h(k) + p_{nk}(h(1), \ldots, h(k-1))$. This can be established by a straightforward induction.
    Using this observation we see that the coefficients of $g$ are uniquely determined by the equations

    \[ g(0) = 1, \quad n\,g(1) = f(1), \quad n\,g(k) = f(k) - p_{nk}(g(1), \ldots, g(k-1)), \quad k \ge 2. \]

    □
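The recursive determination of the coefficients of the root can be carried out mechanically. The sketch below works over $\mathbb{Q}$ with truncated series; it uses the observation that, while $g(k)$ is still set to 0, the coefficient $[X^k] g^n$ equals $p_{nk}(g(1), \ldots, g(k-1))$. The helper names are assumptions made for this illustration, and the code favours clarity over speed.

```python
from fractions import Fraction


def mul(f, g, N):
    """First N coefficients of the product fg."""
    h = [Fraction(0)] * N
    for i, a in enumerate(f[:N]):
        if a:
            for j, b in enumerate(g[:N - i]):
                h[i + j] += a * b
    return h


def series_power(g, n, N):
    """First N coefficients of g^n for an integer n >= 0."""
    result = [Fraction(1)] + [Fraction(0)] * (N - 1)
    for _ in range(n):
        result = mul(result, g, N)
    return result


def nth_root(f, n, N):
    """First N coefficients of the unique g with g(0) = 1 and g^n = f,
    assuming f[0] == 1 and n >= 1."""
    g = [Fraction(1)] + [Fraction(0)] * (N - 1)
    for k in range(1, N):
        # With g(k) still 0, [X^k] g^n is the term p_nk(g(1), ..., g(k-1)).
        p = series_power(g, n, N)[k]
        fk = Fraction(f[k]) if k < len(f) else Fraction(0)
        g[k] = (fk - p) / n
    return g


root = nth_root([1, 1], 2, 5)  # square root of 1 + X
print(root)
print(series_power(root, 2, 5))
```

For $f = 1 + X$ and $n = 2$ the output coefficients are the binomial coefficients $\binom{1/2}{k}$, in line with the binomial series defined later in this section.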

    Theorem 5.11 Let $f, g \in \mathbb{R}[[X]]$ with $f^n = g^n$. Then $f = g$ if $n$ is odd and $f = \pm g$ if $n$ is even.

    Proof. We may assume $f \ne 0 \ne g$ as otherwise by theorem 5.9 $f = g = 0$. Let $\zeta \in \mathbb{C}$ be a primitive $n$-th root of unity. As $X^n - 1 = \prod_{j=0}^{n-1} (X - \zeta^j)$ we obtain in $\mathbb{C}[X,Y]$ (substitute $X$ by $X/Y$) the equation $X^n - Y^n = \prod_{j=0}^{n-1} (X - \zeta^j Y)$. Substituting $X$ by $f$ and $Y$ by $g$ we get in $\mathbb{C}[[X]]$

    \[ 0 = f^n - g^n = \prod_{j=0}^{n-1} (f - \zeta^j g). \]

    If $\zeta^j$ is not real, then $f - \zeta^j g \ne 0$. So if $n$ is odd then only the factor for $j = 0$ can be trivial, which shows $f = g$. For $n$ even only the factors for $j = 0$ and $j = n/2$ can be trivial, showing $f = \pm g$. □

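    Definition Let $K$ be $\mathbb{R}$ or $\mathbb{C}$.
    (a) For $f \in K[[X]]$ we define the formal derivative by

    \[ D(f) = \sum_{n \ge 1} n f(n) X^{n-1}. \]

    (b) Define the exponential series as

    \[ \exp X = \sum_{n \ge 0} \frac{X^n}{n!}, \]

    the logarithm by

    \[ \log(1+X) = \sum_{n \ge 1} \frac{(-1)^{n-1}}{n} X^n, \]

    and for $a \in \mathbb{R}$ the binomial series by

    \[ (1+X)^a = \sum_{n \ge 0} \binom{a}{n} X^n, \]

    where $\binom{a}{n} = a(a-1) \cdots (a-n+1)/n!$.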

    Proposition 5.12 Use the notation of the definition.

    (a) $D(f+g) = D(f) + D(g)$ and $D(fg) = D(f)g + fD(g)$ for all $f, g$. If $f$ is invertible, $D(f^{-1}) = -D(f)/f^2$. If $\mathrm{ord}(g) > 0$ then $D(f \circ g) = (D(f) \circ g)\,D(g)$.
    (b) $D(f) = 0$ iff $f = cX^0$, $c \in K$.
    (c) $D(f) = f$ iff $f = c \exp X$, $c \in K$.
    (d) Assume $\|f\| < 1$. Then $D(\exp f) = \exp(f)\,D(f)$.
    (e) Assume $\|f\| < 1$. Then $D(\log(1+f)) = (1+f)^{-1} D(f)$.
    (f) Assume $\|f\| < 1$ and $a \in \mathbb{R}$. Then $D((1+f)^a) = a(1+f)^{a-1} D(f)$.
    (g) Assume $\|f\| < 1$. Then $\exp(\log(1+f)) = 1+f$ and $\log(\exp f) = f$.
    (h) Assume $\|f\|, \|g\| < 1$. Then $\exp(f+g) = \exp(f)\exp(g)$.

    Proof. (Sketch) (a) follows from the definition of $D$. (b) is trivial, and the assertion of (c) implies $(n+1)f(n+1) = f(n)$, which forces the assertion.
    (d) holds for monomials $X^n$ and hence also for polynomials. Also $\exp$ is a continuous function on the power series with constant term zero. Thus theorem 5.4 implies the assertion for all $f$ with $\|f\| < 1$.
    (e)-(h) are proved in a similar way. □

    Remarks (a) The universal property of polynomials allows us to substitute the variables by anything which makes sense. This property explains the fundamental role of polynomial rings. A substitution of the variable for formal power series is only possible in very restricted situations, namely if one has a metric to define infinite sums:

    Universal Property of formal power series Let $R$ be a commutative ring with identity and $S$ an associative algebra over $R$, $I$ an ideal in $S$ such that $S$ is complete with respect to the $I$-adic topology. Let $x \in I$. Then there exists a unique algebra homomorphism $\varphi : R[[X]] \to S$ such that $\varphi$ is continuous and $\varphi(X) = x$.

    The $I$-adic topology is defined similarly to the topology for $R[[X]]$: Assume $\bigcap_{n=1}^{\infty} I^n = 0$. Define for $0 \ne x \in S$ the norm of $x$ by $\|x\| = 2^{-n}$ if $0 \le n$ is the largest integer with $x \in I^n$, and set $\|0\| = 0$. This induces the $I$-adic metric on $S$. The universal property has a natural generalization for power series in arbitrarily many variables.

    (b) To $f = f(X) \in \mathbb{C}[[X]]$ associate the ordinary power series $f(z)$ which is defined by $f(z) = \sum_{n \ge 0} f(n) z^n$. We consider $f(z)$ as a function defined in its circle of convergence. Suppose that $\Phi$ is an operation with ordinary power series such that an identity $\Phi(f_1(z), \ldots, f_n(z)) = 0$ holds. Does this identity also hold for formal power series, i.e. do we have $\Phi(f_1(X), \ldots, f_n(X)) = 0$? The answer is yes if $\Phi$ fulfills some mild assumptions. This tool allows one to transfer theorems about analytic functions to theorems about formal power series. See E. A. Bender, A lifting theorem for formal power series, Proc. Amer. Math. Soc. 42 (1974), 16-22, for a detailed discussion.

    (c) Usually such identities can be shown by a recipe we have used before: Suppose that the identity can be verified at least for (formal) polynomials $f(X)$ and assume that the operations used in the identity are continuous with respect to the metric of $\mathbb{C}[[X]]$. As the polynomials lie dense in $\mathbb{C}[[X]]$ the identity holds in general.

    Definition Let $R$ be a commutative ring with identity and $R((X))$ the set of maps $f : \mathbb{Z} \to R$ such that there exists an $M = M_f \in \mathbb{Z}$ with $f(n) = 0$ for $n < M$. As in $R[[X]]$ we define for $f, g \in R((X))$ the sum by $(f+g)(n) = f(n) + g(n)$ and the product by

    \[ (fg)(n) = \sum_{k=-\infty}^{\infty} f(k)\,g(n-k) = \sum_{k=M}^{n-N} f(k)\,g(n-k) \]

    if $g(n) = 0$ for $n < N$. Then $R((X))$ is the ring of formal Laurent series, which contains $R[[X]]$ as a subring. The elements are denoted by $f = f(X) = \sum_{n=-\infty}^{\infty} f(n)X^n$ or by $f(X) = \sum_{n=M}^{\infty} f(n)X^n$ if $f(n) = 0$ for $n < M$. If $f(M) \ne 0$ (in the latter case) we again call $M$ the order of $f$. Also $f(-1)$ is called the residue of $f(X)$ and we write $\mathrm{Res}\,f(X) = f(-1)$. The derivative is defined as for power series, i.e. $D(f)(X) = \sum_{n=-\infty}^{\infty} n f(n) X^{n-1}$. The usual rules for derivatives also hold for the derivative of Laurent series.

    Proposition 5.13 Let $K$ be a field.

    (a) $K((X))$ is a field.

    (b) Let $f \in K((X))$ and $g \in K[[X]]$ with $g(0) = 0$. Then $f \circ g$ is defined and lies in $K((X))$.

    Proof. (a) Let $f(X) = \sum_{n=M}^{\infty} f(n)X^n$ be of order $M$. Then $X^{-M} f(X) = f(M) + f(M+1)X + f(M+2)X^2 + \cdots$ is invertible. Let $h(X)$ be the inverse. Then $f^{-1}(X) = X^{-M} h(X)$.
    (b) We may assume $g \ne 0$. Then $g^{-1}$ exists by (a) and therefore $g^m$ is defined for all $m \in \mathbb{Z}$. Then

    \[ f \circ g(X) = \sum_{n=M}^{\infty} f(n)\,g(X)^n = \sum_{n=M}^{-1} f(n)\,g(X)^n + \sum_{n \ge 0} f(n)\,g(X)^n \]

    and as the second term converges, $f \circ g(X)$ converges as well. □

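    Lemma 5.14 Let $K$ be a field and $0 \ne f(X) \in K((X))$.
    (a) $\mathrm{Res}(D(f)(X)) = 0$.

    (b) $\mathrm{Res}(f^{-1}(X)\,D(f)(X))$ is the order of $f$.

    Proof. (a) follows from the definition of the derivative.
    (b) Let $f(X) = \sum_{n=M}^{\infty} f(n)X^n$ be of order $M$. Writing $\ast$ for unspecified coefficients, we have $f^{-1}(X) = f(M)^{-1} X^{-M} + \ast X^{-M+1} + \ast X^{-M+2} + \cdots$ and $D(f)(X) = f(M)\,M X^{M-1} + \ast X^{M} + \ast X^{M+1} + \cdots$. Hence $f^{-1}(X)\,D(f)(X) = M X^{-1} + \ast + \ast X + \cdots$ and we are done. □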

    Proposition 5.15 Let $K$ be a field and $f \in K[[X]]$ with $f(0) = 0 \ne f(1)$.
    (a) Then there exists a unique $g \in K[[X]]$ with $g(0) = 0 \ne g(1)$ and $f(g(X)) = X$. Moreover $g(f(X)) = X$.

    (b) Assume $\mathrm{Char}(K) = 0$. Then

    \[ f(n) = \mathrm{Res}\Bigl( \frac{1}{n\,g(X)^n} \Bigr). \]

    Proof. (a) We prove the claim by induction on $n$. As $[X]f(g(X)) = f(1)g(1)$ we have $g(1) = f(1)^{-1}$.
    $n \to n+1$: Suppose $g(1), \ldots, g(n)$ are already (uniquely) defined such that $[X^k]f(g(X)) = \delta_{k,1}$ for $k \le n$. For the coefficient of $X^{n+1}$ in $\sum_{k \ge 1} f(k)\,g(X)^k$ only the terms $f(1)g(X), \ldots, f(n+1)g(X)^{n+1}$ give a contribution. Moreover for $k \ge 2$ the contribution from $g(X)^k$ can only involve coefficients $g(\ell)$, $\ell < n+1$. This shows

    \[ [X^{n+1}]f(g(X)) = [X^{n+1}]f(1)g(X) + \cdots + [X^{n+1}]f(n+1)g(X)^{n+1} = f(1)g(n+1) + P \]

    with a polynomial $P$ in $f(2), \ldots, f(n+1)$ and $g(1), \ldots, g(n)$. Then we are forced to define

    \[ g(n+1) = -f(1)^{-1} P. \]

    This verifies the induction step and we are done.
    Finally, we know that there is a series $h(X)$ with $h(0) = 0 \ne h(1)$ and $g(h(X)) = X$. Then $f(X) = f(g \circ h(X)) = f \circ g(h(X)) = h(X)$. Here we use the identity $(f \circ g) \circ h = f \circ (g \circ h)$, which is obvious for polynomials and thus is true for power series by (5.6).
    (b) We take the derivative of $f(g(X)) = X$ and obtain

    \[ 1 = D(f(g(X))) = \sum_{k \ge 1} k f(k)\,g(X)^{k-1} D(g)(X). \]

    Divide by $n\,g(X)^n$ so that

    \[ \frac{1}{n\,g(X)^n} = \sum_{k \ge 1} k n^{-1} f(k)\,g(X)^{k-n-1} D(g)(X). \]

    For $k \ne n$ we see that the $k$-th term on the RHS

    \[ g(X)^{k-n-1} D(g)(X) = \frac{D(g(X)^{k-n})}{k-n} \]

    is a derivative and $\mathrm{Res}(g(X)^{k-n-1} D(g)(X)) = 0$ by lemma 5.14. As the residue is a continuous linear functional and as $g(1) = f(1)^{-1}$ we get again with lemma 5.14

    \[ \mathrm{Res}\Bigl( \frac{1}{n\,g(X)^n} \Bigr) = \mathrm{Res}\Bigl( \sum_{k \ge 1} k n^{-1} f(k)\,g(X)^{k-n-1} D(g)(X) \Bigr) = \mathrm{Res}\bigl( f(n)\,g^{-1}(X)\,D(g)(X) \bigr) = f(n). \]

    □

    We can now prove the formal power series version of a theorem from complex analysis.

    Theorem 5.16 (Lagrange inversion) Let $K$ be a field of characteristic 0 and $f \in K[[X]]$ with $f(0) \ne 0$. Then there exists a unique $g \in K[[X]]$ with $g(X) = X f(g(X))$ and

    \[ [X^n]g(X) = \frac{1}{n} [X^{n-1}] f(X)^n, \quad n \ge 1. \]

    Proof. Set $h(X) = X/f(X)$. Then $h(0) = 0 \ne h(1)$. By 5.15 there is a unique $g \in K[[X]]$ with $g(0) = 0 \ne g(1)$ and $g(h(X)) = X$. Also $g(X)/f(g(X)) = h(g(X)) = X$ so that $X f(g(X)) = g(X)$. By (b) of proposition 5.15

    \[ [X^n]g(X) = \mathrm{Res}\Bigl( \frac{1}{n\,h(X)^n} \Bigr) = \mathrm{Res}\Bigl( \frac{f(X)^n}{n X^n} \Bigr) = \frac{1}{n} [X^{n-1}] f(X)^n. \]

    □
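Lagrange inversion can be checked numerically on truncated series. The sketch below solves $g = X f(g)$ by fixed-point iteration (each pass fixes one more coefficient) and compares the result with $\frac{1}{n}[X^{n-1}] f^n$. The test case $f = (1+X)^2$ is chosen here for illustration; for it the coefficients of $g$ should be the Catalan numbers $1, 2, 5, 14, \ldots$. All helper names are assumptions.

```python
from fractions import Fraction


def mul(f, g, N):
    """First N coefficients of the product fg."""
    h = [Fraction(0)] * N
    for i, a in enumerate(f[:N]):
        if a:
            for j, b in enumerate(g[:N - i]):
                h[i + j] += a * b
    return h


def compose(f, g, N):
    """First N coefficients of f(g(X)) by Horner's rule, assuming g(0) = 0."""
    result = [Fraction(0)] * N
    for a in reversed(f[:N]):
        result = mul(result, g, N)
        result[0] += a
    return result


def solve_fixed_point(f, N):
    """First N coefficients of the g with g = X f(g), found by iteration."""
    g = [Fraction(0)] * N
    for _ in range(N):
        fg = compose(f, g, N)
        g = [Fraction(0)] + fg[:N - 1]  # multiplication by X shifts coefficients
    return g


def lagrange_coefficient(f, n):
    """(1/n) [X^(n-1)] f(X)^n for n >= 1."""
    fn = [Fraction(1)] + [Fraction(0)] * (n - 1)
    for _ in range(n):
        fn = mul(fn, f, n)
    return fn[n - 1] / n


f = [1, 2, 1]  # (1 + X)^2
g = solve_fixed_point(f, 6)
print(g)
print([lagrange_coefficient(f, n) for n in range(1, 6)])
```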

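    We illustrate the use of generating functions by verifying two old identities:

    Example The following Euler identities hold in $\mathbb{Z}[[X,Z]]$:

    \[ \text{(E1)} \quad \prod_{n=0}^{\infty} (1 + X^n Z) = \sum_{n=0}^{\infty} \frac{X^{n(n-1)/2} Z^n}{(1-X)(1-X^2) \cdots (1-X^n)} \]

    \[ \text{(E2)} \quad \prod_{n=0}^{\infty} (1 + X^n Z)^{-1} = \sum_{n=0}^{\infty} \frac{(-1)^n Z^n}{(1-X)(1-X^2) \cdots (1-X^n)} \]

    We denote the LHS of (E1) by

    \[ K(X,Z) = (1+Z)(1+XZ)(1+X^2 Z) \cdots = 1 + c_1 Z + c_2 Z^2 + \cdots = \sum_{n=0}^{\infty} c_n Z^n \]

    with $c_n \in \mathbb{Z}[[X]]$ and $c_0 = 1$. As $(1+XZ)(1+X^2Z)(1+X^3Z) \cdots = (1+X^0(XZ))(1+X^1(XZ))(1+X^2(XZ)) \cdots$ we have

    \[ K(X,Z) = (1+Z)\,K(X,XZ) \]

    which implies

    \[ 1 + c_1 Z + c_2 Z^2 + \cdots = (1+Z)(1 + c_1 XZ + c_2 (XZ)^2 + \cdots). \]

    Comparing coefficients we obtain $c_n = c_n X^n + c_{n-1} X^{n-1}$ or

    \[ c_n = \frac{X^{n-1} c_{n-1}}{1-X^n} = \frac{X^{n-1} X^{n-2} c_{n-2}}{(1-X^n)(1-X^{n-1})} = \cdots = \frac{X^{n(n-1)/2}}{(1-X)(1-X^2) \cdots (1-X^n)}. \]

    Now (E1) follows.
    Next we denote the LHS of (E2) by

    \[ L(X,Z) = (1+Z)^{-1}(1+XZ)^{-1}(1+X^2Z)^{-1} \cdots = 1 + d_1 Z + d_2 Z^2 + \cdots = \sum_{n=0}^{\infty} d_n Z^n \]

    with $d_n \in \mathbb{Z}[[X]]$ and $d_0 = 1$. This time we have

    \[ L(X,Z) = (1+Z)^{-1} L(X,XZ) \]

    which implies

    \[ (1+Z)(1 + d_1 Z + d_2 Z^2 + \cdots) = 1 + d_1 XZ + d_2 (XZ)^2 + \cdots. \]

    Comparing coefficients we obtain $d_n + d_{n-1} = d_n X^n$ or

    \[ d_n = \frac{-d_{n-1}}{1-X^n} = \frac{(-1)^2 d_{n-2}}{(1-X^n)(1-X^{n-1})} = \cdots = \frac{(-1)^n}{(1-X)(1-X^2) \cdots (1-X^n)} \]

    and (E2) follows.
    Note that verifying directly, by taking the product, that the right hand sides of (E1) and (E2) are inverse to each other is much more elaborate.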

    Notes. The ring of formal power series $R[[X_1, \ldots, X_n]]$ in $n$ variables over $R$ is defined as the set of mappings $f : \mathbb{N}^n \to R$. The addition is again defined component-wise and multiplication is expressed by

    \[ (fg)(\gamma) = \sum_{\alpha + \beta = \gamma} f(\alpha)\,g(\beta) \]

    with $\alpha, \beta, \gamma \in \mathbb{N}^n$ and the component-wise addition on $\mathbb{N}^n$. One writes

    \[ f = \sum_{\alpha \in \mathbb{N}^n} f(\alpha) X^{\alpha} \]

    where $X^{\alpha} = X_1^{\alpha_1} \cdots X_n^{\alpha_n}$, $\alpha = (\alpha_1, \ldots, \alpha_n)$. Order and norm are defined similarly as before and again $R[[X_1, \ldots, X_n]]$ is a complete metric space. Most results like 5.5, 5.6, 5.7, 5.8 carry over from $R[[X]]$ to $R[[X_1, \ldots, X_n]]$. Godsil [8], Chap. 3 and Goulden-Jackson [9], Chap. 1, contain the basic results on power series in arbitrarily many variables. As usual N. Bourbaki (Algebra II, Chap. IV, Sec. 4) treats this subject in a comprehensive and very general form. An approach with particular emphasis on combinatorial requirements is given by Tutte, On elementary calculus and the Good formula, J. Comb. Theory (B) 18 (1975), 97-137. This article is somewhat difficult to read as the author uses a rather unconventional notation. Another useful survey article is Niven, Formal power series, Amer. Math. Month. 76 (1969), 871-889.

    6 Generating Functions and Recurrences

    The theme of this section is to turn number sequences into formal power series.Then one manipulates the power series to obtain more information about thenumber sequence.

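    Definition Let $(a_n)_{n \ge 0}$ be a complex sequence. The ordinary generating function of $(a_n)$, abbreviated OGF, is the formal power series

    \[ \sum_{n=0}^{\infty} a_n X^n \]

    and the exponential generating function of $(a_n)$, abbreviated EGF, is the formal power series

    \[ \sum_{n=0}^{\infty} \frac{a_n}{n!} X^n. \]

    Examples (a) Set $a_n = 1$ for $n \ge 0$. The OGF is

    \[ \sum_{n=0}^{\infty} X^n = \frac{1}{1-X} \]

    and the EGF is

    \[ \sum_{n=0}^{\infty} \frac{1}{n!} X^n = \exp X. \]

    (b) The sequence $\bigl(\binom{n}{k}\bigr)_{k \ge 0}$ has the OGF

    \[ \sum_{k=0}^{\infty} \binom{n}{k} X^k = \sum_{k=0}^{n} \binom{n}{k} X^k = (1+X)^n \]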

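    which is the binomial formula.
    (c) The Fibonacci numbers $(F_n)$ are defined by $F_0 = F_1 = 1$ and $F_{n+2} = F_{n+1} + F_n$ for $n \ge 0$. The OGF is $f = \sum_{n \ge 0} F_n X^n$. Multiply the above recurrence by $X^n$ and sum over $n$. We get an identity of power series

    \[ F_{n+2} X^n = F_{n+1} X^n + F_n X^n, \]

    \[ \sum_{n \ge 0} F_{n+2} X^n = \sum_{n \ge 0} F_{n+1} X^n + \sum_{n \ge 0} F_n X^n. \]

    But $\sum_{n \ge 0} F_{n+1} X^n = (f - F_0)/X$ and $\sum_{n \ge 0} F_{n+2} X^n = (f - F_0 - F_1 X)/X^2$. Since $F_0 = F_1 = 1$ we get

    \[ \frac{f - 1 - X}{X^2} = \frac{f - 1}{X} + f. \]

    Multiplying by $X^2$ and solving for $f$ we have

    \[ f = \frac{1}{1 - X - X^2}. \]

    We know that the RHS can be expressed as a formal power series which will give us the numbers $F_n$. To facilitate the calculations we use the method of partial fractions. Set $1 - X - X^2 = (1 - \alpha X)(1 - \beta X)$, i.e. $\alpha\beta = -1$, $\alpha + \beta = 1$. This shows:

    \[ \alpha = \frac{1 + \sqrt{5}}{2}, \quad \beta = \frac{1 - \sqrt{5}}{2}. \]

    Since the two factors are coprime we know that (by the main result on partial fractions) there exist $A, B \in \mathbb{C}$ with

    \[ \frac{1}{1 - X - X^2} = \frac{A}{1 - \alpha X} + \frac{B}{1 - \beta X} = \frac{A + B - (A\beta + B\alpha)X}{(1 - \alpha X)(1 - \beta X)}. \]

    Hence $A + B = 1$ and $A\beta + B\alpha = 0$. The solutions are $A = \alpha/(\alpha - \beta)$ and $B = -\beta/(\alpha - \beta)$ and therefore

    \[ f = \frac{1}{\sqrt{5}} \Bigl( \frac{\alpha}{1 - \alpha X} - \frac{\beta}{1 - \beta X} \Bigr). \]

    On the other hand $(1 - aX)^{-1} = \sum_{n \ge 0} a^n X^n$, so that we finally obtain

    \[ f = \frac{1}{\sqrt{5}} \Bigl( \sum_{n \ge 0} (\alpha^{n+1} - \beta^{n+1}) X^n \Bigr). \]

    We have reached the famous formula

    \[ F_n = \frac{1}{\sqrt{5}} \Bigl( \Bigl( \frac{1 + \sqrt{5}}{2} \Bigr)^{n+1} - \Bigl( \frac{1 - \sqrt{5}}{2} \Bigr)^{n+1} \Bigr). \]

    With increasing $n$ the term $\bigl( \frac{1 - \sqrt{5}}{2} \bigr)^{n+1}$ rapidly becomes small. Approximately we have

    \[ F_n \approx \frac{1}{\sqrt{5}} \Bigl( \frac{1 + \sqrt{5}}{2} \Bigr)^{n+1}. \]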
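The Fibonacci computation can be replayed numerically. The sketch below expands $1/(1 - X - X^2)$ coefficient by coefficient and compares the result with the closed formula, using the indexing $F_0 = F_1 = 1$ of this example. The function names are illustrative, and the closed form is evaluated in floating point, so the comparison is only meaningful for moderate $n$.

```python
from math import sqrt


def fib_from_ogf(N):
    """First N coefficients of 1/(1 - X - X^2).

    From (1 - X - X^2) f = 1 one reads off F_0 = 1 and
    F_n = F_(n-1) + F_(n-2) for n >= 1 (with F_(-1) = 0).
    """
    F = []
    for n in range(N):
        value = 1 if n == 0 else 0
        if n >= 1:
            value += F[n - 1]
        if n >= 2:
            value += F[n - 2]
        F.append(value)
    return F


def fib_closed_form(n):
    """(alpha^(n+1) - beta^(n+1)) / sqrt(5), evaluated in floating point."""
    alpha = (1 + sqrt(5)) / 2
    beta = (1 - sqrt(5)) / 2
    return (alpha ** (n + 1) - beta ** (n + 1)) / sqrt(5)


def fib_approx(n):
    """The approximation alpha^(n+1) / sqrt(5), rounded to the nearest integer."""
    alpha = (1 + sqrt(5)) / 2
    return round(alpha ** (n + 1) / sqrt(5))


F = fib_from_ogf(10)
print(F)
print([round(fib_closed_form(n)) for n in range(10)])
print([fib_approx(n) for n in range(10)])
```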

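    Remark In combinatorial applications one often has the following situation: A set $\mathcal{X}$ together with a function $w : \mathcal{X} \to \mathbb{N}$ is given such that $|w^{-1}(n)|$ is finite for all $n \in \mathbb{N}$. One calls $w$ a weight function. The ordinary generating function of the pair $(\mathcal{X}, w)$ is the formal power series

    \[ f = \sum_{n \ge 0} |w^{-1}(n)| X^n. \]

    Let $(\mathcal{Y}, v)$ be another pair with generating function $g$. If $\mathcal{X}$ and $\mathcal{Y}$ are disjoint define $u : \mathcal{X} \sqcup \mathcal{Y} \to \mathbb{N}$ by

    \[ u(z) = \begin{cases} w(z), & z \in \mathcal{X}, \\ v(z), & z \in \mathcal{Y}. \end{cases} \]

    We observe that $f + g$ is the OGF of $(\mathcal{X} \sqcup \mathcal{Y}, u)$. Define further the function $u : \mathcal{X} \times \mathcal{Y} \to \mathbb{N}$ by $u(x,y) = w(x) + v(y)$. Then $fg$ is the OGF of the pair $(\mathcal{X} \times \mathcal{Y}, u)$:

    \[ |u^{-1}(n)| = |\{(x,y) \in \mathcal{X} \times \mathcal{Y} \mid w(x) + v(y) = n\}| \]
    \[ = \Bigl| \biguplus_{k=0}^{n} \{(x,y) \in \mathcal{X} \times \mathcal{Y} \mid w(x) = k,\ v(y) = n-k\} \Bigr| \]
    \[ = \sum_{k=0}^{n} |\{(x,y) \in \mathcal{X} \times \mathcal{Y} \mid x \in w^{-1}(k),\ y \in v^{-1}(n-k)\}| \]
    \[ = \sum_{k=0}^{n} |w^{-1}(k)| \, |v^{-1}(n-k)|. \]

    These rules indicate why generating functions are useful.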

    Definition (a) A recurrence of order $k$ for the complex sequence $(a_n)_{n \ge 0}$ is a mapping $\varphi : \mathbb{C}^k \to \mathbb{C}$ such that

    \[ a_{n+k} = \varphi(a_n, a_{n+1}, \ldots, a_{n+k-1}) \]

    for $n \ge 0$. The elements $a_0, a_1, \ldots, a_{k-1}$ are called the initial conditions of the recurrence. The recurrence is linear if $\varphi$ is a linear functional. In this case $\varphi(x_1, \ldots, x_k) = b_1 x_1 + \cdots + b_k x_k$ with $b_1, \ldots, b_k \in \mathbb{C}$.
    (b) A generating function is called rational if it is the quotient of two polynomials.

    The next result shows the connection between linear recurrences and rational generating functions.

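    Theorem 6.1 Let $b_1, \ldots, b_k \in \mathbb{C}$, $k \ge 1$, and $b_k \ne 0$. Set $Q(X) = 1 + \sum_{i=1}^{k} b_i X^i$ and let $Q$ be factorized in $\mathbb{C}[X]$ as

    \[ Q(X) = \prod_{i=1}^{r} (1 - \gamma_i X)^{d_i} \]

    such that the $\gamma_i$'s are distinct. For a mapping (sequence) $f : \mathbb{N} \to \mathbb{C}$ the following assertions are equivalent.

    (a) \[ \sum_{n \ge 0} f(n) X^n = \frac{P(X)}{Q(X)} \] and $\deg(P) < k$.

    (b) For all $n \ge 0$ \[ f(n+k) + b_1 f(n+k-1) + \cdots + b_k f(n) = 0. \]

    (c) For all $n \ge 0$ \[ f(n) = \sum_{i=1}^{r} P_i(n)\,\gamma_i^n \] where $P_i(n)$ is a polynomial in $n$ of degree $< d_i$.

    (d) \[ \sum_{n \ge 0} f(n) X^n = \sum_{i=1}^{r} G_i(X)(1 - \gamma_i X)^{-d_i} \] with polynomials $G_i$ of degree $< d_i$.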

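    Proof (Stanley). We define four vector spaces:

    $V_1 = \{f \in \mathbb{C}^{\mathbb{N}} \mid f \text{ satisfies (a)}\}$, $V_2 = \{f \in \mathbb{C}^{\mathbb{N}} \mid f \text{ satisfies (b)}\}$, $V_3 = \{f \in \mathbb{C}^{\mathbb{N}} \mid f \text{ satisfies (c)}\}$, and $V_4 = \{f \in \mathbb{C}^{\mathbb{N}} \mid f \text{ satisfies (d)}\}$.

    We will show that these vector spaces are equal, which implies the theorem. Clearly, $\dim V_1 = \deg(Q) = k$. Also $\dim V_2 = k$ as $f(0), \ldots, f(k-1)$ can be chosen arbitrarily while the other values of $f$ are determined by the initial conditions. We observe $\sum_{i=1}^{r} d_i = k$ and for each polynomial $P_i$ one can choose the coefficients of $X^0, X, \ldots, X^{d_i - 1}$ arbitrarily. So $\dim V_3 = k$ too. Similarly in $V_4$ the coefficients of the $G_i$'s can be chosen arbitrarily, which forces $\dim V_4 = k$.
    For $f \in V_1$ we have, writing $\ast$ for unspecified coefficients,

    \[ P(X) = Q(X) \sum_{n \ge 0} f(n) X^n = \sum_{n \ge 0} f(n) X^n + \sum_{i=1}^{k} \sum_{n \ge 0} b_i f(n) X^{n+i} \]
    \[ = \ast + \ast X + \cdots + \ast X^{k-1} + \sum_{n \ge 0} \Bigl( f(n+k) + \sum_{i=1}^{k} b_i f(n+k-i) \Bigr) X^{n+k}. \]

    This shows $f \in V_2$, i.e. $V_1 \subseteq V_2$. So even $V_1 = V_2$ as $\dim V_1 = \dim V_2$.
    Considering $V_4$ we observe

    \[ \sum_{i=1}^{r} G_i(X)(1 - \gamma_i X)^{-d_i} = \frac{\sum_{i=1}^{r} G_i(X)\,Q(X)(1 - \gamma_i X)^{-d_i}}{Q(X)} \]

    which shows $V_4 \subseteq V_1$. Again the dimension argument yields $V_1 = V_4$.
    Elements in $V_4$ are linear combinations of terms of the form $X^j (1 - \gamma X)^{-c}$ with $0 \le j < c$ and $0 \ne \gamma \in \mathbb{C}$. Using the binomial series (proposition 5.12) we have

    \[ X^j (1 - \gamma X)^{-c} = X^j \sum_{n \ge 0} \binom{-c}{n} (-\gamma)^n X^n = \sum_{m \ge j} \binom{-c}{m-j} (-1)^{m-j} \gamma^{m-j} X^m. \]

    We rewrite the coefficients

    \[ \binom{-c}{m-j} (-1)^{m-j} = \frac{(-c)(-c-1) \cdots (-c-m+j+1)}{(m-j)!} (-1)^{m-j} \]
    \[ = \frac{(c+m-j-1)(c+m-j-2) \cdots (c+1)c}{(m-j)!} = \binom{m+c-j-1}{m-j} = \binom{m+c-j-1}{c-1}. \]

    Writing $n$ instead of $m$ we obtain

    \[ X^j (1 - \gamma X)^{-c} = \sum_{n \ge j} \binom{n+c-j-1}{c-1} \gamma^{-j} \gamma^n X^n = \sum_{n \ge 0} \binom{n+c-j-1}{c-1} \gamma^{-j} \gamma^n X^n \]

    as $\binom{n+c-j-1}{c-1} = 0$ for $0 \le n < j$. Now $\binom{n+c-j-1}{c-1}$ is a polynomial in $n$ of degree $c-1$. This shows $V_4 \subseteq V_3$. But as all dimensions are $k$ we finally have $V_1 = V_2 = V_3 = V_4$. □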

    Remark Let $f : \mathbb{N} \to \mathbb{C}$ satisfy the linear recurrence (b) of theorem 6.1. Then the OGF has the form $f = P(X)/Q(X)$ with $Q(X) = 1 + \sum_{i=1}^{k} b_i X^i$. This polynomial is called the characteristic polynomial of the recurrence. The coefficients of the polynomial $P(X) = \sum_{j=0}^{k-1} c_j X^j$ are determined by the initial conditions $f(0), \ldots, f(k-1)$. Since $P = f \cdot Q$,

    $$c_j = \sum_{i=0}^{j} b_i f(j-i), \quad 0 \le j \le k-1, \quad b_0 = 1.$$

    For the evaluation of $P(X)/Q(X)$ one then decomposes this quotient into partial fractions. The additive terms of the form $A X^j/(1 - \gamma X)^c$, $j < c$, can finally be replaced with the help of the binomial series.
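The passage from a linear recurrence to its rational OGF can be checked numerically. The following Python sketch is only an illustration; the function names `numerator_coeffs` and `quotient_series` are hypothetical and not part of the notes. It computes the numerator coefficients $c_j$ from the initial conditions and then expands $P(X)/Q(X)$ again as a power series.

```python
def numerator_coeffs(b, init):
    """b = [b_1, ..., b_k], init = [f(0), ..., f(k-1)].
    Returns [c_0, ..., c_{k-1}] with c_j = sum_{i=0}^{j} b_i f(j-i), b_0 = 1."""
    k = len(b)
    bb = [1] + list(b)  # prepend b_0 = 1
    return [sum(bb[i] * init[j - i] for i in range(j + 1)) for j in range(k)]


def quotient_series(P, b, terms):
    """First `terms` coefficients of P(X)/Q(X), where Q = 1 + sum b_i X^i.
    Uses f * Q = P, i.e. f(n) = p_n - sum_{i>=1} b_i f(n-i)."""
    bb = [1] + list(b)
    f = []
    for n in range(terms):
        p_n = P[n] if n < len(P) else 0
        s = sum(bb[i] * f[n - i] for i in range(1, min(n, len(b)) + 1))
        f.append(p_n - s)
    return f


# Example: f(n+2) - f(n+1) - f(n) = 0 with f(0) = 0, f(1) = 1
b = [-1, -1]
P = numerator_coeffs(b, [0, 1])
print(P, quotient_series(P, b, 8))
```

For the example the numerator is $P(X) = X$ and the denominator is $Q(X) = 1 - X - X^2$, so the expansion reproduces the sequence defined by the recurrence.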

  • In the remainder of this section we investigate nonlinear recurrences and see that the OGF method has wider applications.

    Example We examine the Stirling numbers of the second kind (see the definition before 4.5) again from the viewpoint of generating functions. We defined $S(n,k) = 0$ if $n < k$ or $k < 0$ or $n < 0$, except $S(0,0) = 1$. With these conventions one has:

    (1) $S(n,k) = S(n-1,k-1) + kS(n-1,k)$ for $(n,k) \ne (0,0)$.

    This is easy to see if one counts the $k$-partitions of $[n]$ containing the set $\{n\}$ and the remaining ones separately. The generating function approach tells us to multiply the recurrence with powers of an independent variable and sum up. As the relation involves two variables there are three candidates:

    $$f_k(X) = \sum_{n\ge 0} S(n,k)X^n, \quad g_n(Y) = \sum_{k\ge 0} S(n,k)Y^k, \quad h(X,Y) = \sum_{k,n\ge 0} S(n,k)X^nY^k.$$

    For simplicity we consider only the one-variable case and have to choose between the $f_k$ and the $g_n$. But (1) is linear only for $f_k$, so we inspect this case. See [17] for how information is obtained from the series $g_n$ and $h$. (1) translates into the relation:

    $$f_k(X) = Xf_{k-1}(X) + kXf_k(X), \quad k \ge 1, \quad f_0(X) = 1.$$

    Dividing by $1 - kX$ we get

    $$f_k(X) = \frac{X}{1 - kX}\, f_{k-1}(X), \quad k \ge 1, \quad f_0(X) = 1.$$

    This leads finally to

    $$f_k(X) = \frac{X^k}{(1-X)(1-2X)\cdots(1-kX)}, \quad k \ge 1.$$

    The expansion in partial fractions has the form

    $$\frac{1}{(1-X)(1-2X)\cdots(1-kX)} = \sum_{r=1}^{k} \frac{\alpha_r}{1 - rX}.$$

    To compute the $\alpha_r$ we multiply this equation with $1 - rX$ and specialize afterwards $X = 1/r$. This results in

    $$\alpha_r = (-1)^{k-r}\frac{r^{k-1}}{(r-1)!\,(k-r)!}, \quad r \in [k].$$

    As $(1 - rX)^{-1} = \sum_{m\ge 0} r^mX^m$ we get

    $$f_k(X) = X^k\sum_{r=1}^{k}\frac{\alpha_r}{1 - rX} = X^k\sum_{r=1}^{k}\sum_{m\ge 0}\alpha_r r^mX^m = \sum_{m\ge 0}\Bigl(\sum_{r=1}^{k}\alpha_r r^m\Bigr)X^{m+k}.$$

    For $n \ge k$ we now get

    $$\begin{aligned}
    S(n,k) = [X^n]f_k(X) &= [X^n]\sum_{m\ge 0}\Bigl(\sum_{r=1}^{k}\alpha_r r^m\Bigr)X^{m+k} \\
    &= \sum_{r=1}^{k}\alpha_r r^{n-k} \\
    &= \sum_{r=1}^{k}(-1)^{k-r}\frac{r^{k-1}}{(r-1)!\,(k-r)!}\, r^{n-k} \\
    &= \sum_{r=1}^{k}(-1)^{k-r}\frac{r^{n}}{r!\,(k-r)!},
    \end{aligned}$$

    which agrees with theorem 4.5.
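As a plausibility check, the explicit formula can be compared with the recurrence (1). The sketch below is illustrative only; the helper names `stirling_rec` and `stirling_explicit` are assumptions, and exact rational arithmetic is used so that no rounding occurs.

```python
from fractions import Fraction
from math import factorial


def stirling_rec(n, k):
    """S(n, k) from recurrence (1) with the conventions of the example."""
    if n == 0 and k == 0:
        return 1
    if n <= 0 or k <= 0 or k > n:
        return 0
    return stirling_rec(n - 1, k - 1) + k * stirling_rec(n - 1, k)


def stirling_explicit(n, k):
    """S(n, k) = sum_{r=1}^{k} (-1)^(k-r) r^n / (r! (k-r)!), for n >= k >= 1."""
    return sum(
        Fraction((-1) ** (k - r) * r ** n, factorial(r) * factorial(k - r))
        for r in range(1, k + 1)
    )


for n in range(1, 8):
    for k in range(1, n + 1):
        assert stirling_explicit(n, k) == stirling_rec(n, k)
print(stirling_rec(5, 3))
```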

    Definition (a) Let $A$ be a finite set called an alphabet and

    $$W(A) = \{w_\emptyset\} \sqcup \bigsqcup_{n\ge 1} A^n.$$

    The set $W(A)$ is called the set of words over $A$. The symbol $w_\emptyset$ is called the empty word. One writes $w = a_1 \ldots a_n$ for a word $w \in A^n$ instead of $w = (a_1, \ldots, a_n)$. In this case $|w| = n$ is the length of $w$. One defines $|w_\emptyset| = 0$.
    (b) Let $w \in W(A)$ be of length $k \ge 1$. Define $W_w(n)$ to be the set of words $v$ of length $n$ such that $v$ does not contain $w$ as a subword. Set $a_n = |W_w(n)|$ and define the generating function of $w$-avoiding words as

    $$A_w = \sum_{n\ge 0} a_nX^n.$$

    For $0 \le i \le k$ define the prefix $p_i$ and the suffix $q_i$ of $w$ of length $i$ by $w = p_iv = v'q_i$. The correlation polynomial of $w$ is defined as:

    $$C_w(X) = \sum_{i=0}^{k-1} c_w(i)X^i, \qquad c_w(i) = \begin{cases} 1, & p_{k-i} = q_{k-i}, \\ 0, & p_{k-i} \ne q_{k-i}, \end{cases} \quad 0 \le i \le k-1.$$

    Theorem 6.2 Let $w$ be a word of length $k$ in a binary alphabet. The generating function of $w$-avoiding words is

    $$A_w(X) = \frac{C_w(X)}{X^k + (1 - 2X)\,C_w(X)}.$$

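    Proof. Let $U(n)$ be the set of words of length $n$ of the form $wv$ such that $w$ occurs precisely once as a subword (namely as the beginning of the word). Set $b_n = |U(n)|$ and define $B(X) = \sum_{n\ge 0} b_nX^n$. For $d \in \{0,1\}$ (the alphabet) and $dv \in W_w(n+1) \cup U(n+1)$ we have (by definition of $W_w(\cdot)$ and $U(\cdot)$) that $v \in W_w(n)$. Conversely, for $v \in W_w(n)$ the word $dv$ either avoids $w$ or contains $w$ exactly once, namely as its beginning. Hence:

    $$2a_n = a_{n+1} + b_{n+1}.$$

    Multiply with $X^{n+1}$ and sum over $n$ to obtain (we write $A = A_w$):

    $$\begin{aligned}
    2XA(X) &= \sum_{n\ge 0} a_{n+1}X^{n+1} + \sum_{n\ge 0} b_{n+1}X^{n+1} \\
    &= (A(X) - a_0) + (B(X) - b_0) = A(X) - 1 + B(X),
    \end{aligned}$$

    as $a_0 = 1$ and $b_0 = 0$. Hence

    $$B(X) = (2X - 1)A(X) + 1.$$

    Consider the set $L = \{wv \mid v \in W_w(n)\}$. Then $|L| = a_n$. Of course $L$ contains $U(n+k)$, but a word from $L$ may also contain $w$ later. We write $o(wv) = i$ for $wv \in L$ if $wv = xwv'$ with $|x| = i$ and $wv' \in U(n+k-i)$, and let $L_i$ be the set of elements from $L$ with $o(wv) = i$. As $v$ does not contain $w$ we get the partition

    $$L = L_0 \sqcup \cdots \sqcup L_{k-1}.$$

    Let $wv = xwv' \in L_j$, $|x| = j$. Then $w_1 = w_{j+1}$, $w_2 = w_{j+2}$, $\ldots$, $w_{k-j} = w_k$, where $w = w_1 \ldots w_k$, i.e. $p_{k-j} = q_{k-j}$. This shows $L_j = \emptyset$ if $c_w(j) = 0$. If $c_w(j) = 1$, i.e. $p_{k-j} = q_{k-j}$, choose $wv' \in U(n+k-j)$ arbitrarily. Then $wv'$ can be extended uniquely to a word $wv = xwv' \in L$. This implies $|L_j| = c_w(j)\,b_{n+k-j}$ and we get

    $$a_n = \sum_{j=0}^{k-1} c_w(j)\,b_{n+k-j}.$$

    Multiply with $X^{n+k}$ and sum over $n$. Using $b_m = 0$ for $m < k$ we get:

    $$\begin{aligned}
    X^kA(X) &= \sum_{n\ge 0} a_nX^{n+k} \\
    &= \sum_{n\ge 0}\Bigl(\sum_{j=0}^{k-1} c_w(j)\,b_{n+k-j}\Bigr)X^{n+k} \\
    &= \Bigl(\sum_{j=0}^{k-1} c_w(j)X^j\Bigr)\Bigl(\sum_{n\ge 0} b_nX^n\Bigr) \\
    &= C_w(X)B(X).
    \end{aligned}$$

    Substitute $B$ by $(2X - 1)A + 1$ and solve for $A$ to obtain the desired representation of $A$. □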
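The formula of theorem 6.2 can be tested against a brute-force count. The following Python sketch is an illustration under the assumption that words are represented as strings over the alphabet `"01"`; the function names are hypothetical. It builds the correlation polynomial, expands the rational function as a power series, and compares the coefficients with a direct enumeration.

```python
from itertools import product


def correlation(w):
    """[c_w(0), ..., c_w(k-1)]: c_w(i) = 1 iff the prefix and the suffix
    of w of length k - i coincide."""
    k = len(w)
    return [1 if w[:k - i] == w[i:] else 0 for i in range(k)]


def avoiding_series(w, terms):
    """First `terms` coefficients of C_w(X) / (X^k + (1 - 2X) C_w(X))."""
    k = len(w)
    c = correlation(w)
    D = [0] * (k + 1)                 # denominator coefficients
    for i, ci in enumerate(c):
        D[i] += ci                    # contribution of C_w(X)
        D[i + 1] -= 2 * ci            # contribution of -2X C_w(X)
    D[k] += 1                         # contribution of X^k
    a = []
    for n in range(terms):            # power series division, D[0] = 1
        num = c[n] if n < k else 0
        s = sum(D[i] * a[n - i] for i in range(1, min(n, k) + 1))
        a.append(num - s)
    return a


def brute_force(w, terms):
    """Number of binary words of length n = 0, ..., terms-1 avoiding w."""
    return [
        sum(1 for v in product("01", repeat=n) if w not in "".join(v))
        for n in range(terms)
    ]


for w in ["11", "10", "101", "1100"]:
    assert avoiding_series(w, 10) == brute_force(w, 10)
print(avoiding_series("11", 6))
```

For $w = 11$ the correlation polynomial is $1 + X$ and the denominator becomes $1 - X - X^2$, so the counts follow a Fibonacci-type recurrence.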

    Remark One can choose an alphabet of any size and take a set $S$ of several words. The generating function of $S$-avoiding words is also known: see D. Zeilberger, Enumeration of words by their number of mistakes, Disc. Math. 34 (1981), 89-91.

    Consider the following problems:

    1. Given a regular $n$-gon, we want to determine the number of its triangulations. A triangulation is a decomposition of the $n$-gon into triangles by diagonals. Note that we do not identify triangulations obtained from each other by rotations or reflections.

    2. Consider the paths on the lattice $\mathbb{Z} \times \mathbb{Z}$ which connect $(0,0)$ with $(n,n)$ such that only the moves $(1,0)$ or $(0,1)$ are allowed to move from one lattice point to the next, i.e. a path always has $2n$ moves. Determine the number of paths which do not cross the diagonal connecting $(0,0)$ and $(n,n)$.

    3. Let $a_1 \ldots a_n$ be a free word of length $n$ (the letters are distinct). Using brackets one can form legal expressions in a free monoid. We call the procedure associating. For $n = 3$ associating gives the two expressions $a_1(a_2a_3)$ and $(a_1a_2)a_3$. Determine the number of expressions one can obtain by association.

    It turns out that all problems yield the same numbers. We do not prove this but determine these numbers in the last case.

    Definition Denote by $C_n$ the number of ways of associating a free word of length $n$. The numbers $C_n$ are called the Catalan numbers.

    Theorem 6.3

    $$C_n = \frac{(2n-2)!}{n!\,(n-1)!} = \frac{1}{n}\binom{2n-2}{n-1}, \quad n \ge 1.$$

    Proof. We know $C_1 = C_2 = 1$ and $C_3 = 2$. Starting an association one has precisely $n - 1$ ways to place the first generation of brackets:

    $$a_1(a_2 \cdots a_n),\ (a_1a_2)(a_3 \cdots a_n),\ (a_1a_2a_3)(a_4 \cdots a_n),\ \ldots,\ (a_1 \cdots a_{n-1})a_n.$$

    A first generation expression $(a_1 \cdots a_k)(a_{k+1} \cdots a_n)$ can be refined in precisely $C_kC_{n-k}$ ways to a legal expression. This shows:

    $$C_n = C_1C_{n-1} + C_2C_{n-2} + \cdots + C_{n-1}C_1 = \sum_{i=1}^{n-1} C_iC_{n-i}, \quad n \ge 2, \quad C_1 = 1.$$

    Set $C_0 = 0$ and denote by $C(X) = \sum_{n} C_nX^n$ the OGF of the Catalan numbers. Multiplying the above equation with $X^n$ and summing over $n$ we obtain

    $$C(X) - X = \sum_{n>1} C_nX^n = \sum_{n>1}\Bigl(\sum_{i=1}^{n-1} C_iX^i\,C_{n-i}X^{n-i}\Bigr) = C(X)^2$$

    or $C^2 - C + X = 0$. As a quadratic equation in $C = C(X)$ we get the formal solutions

    $$F_\pm = \frac{1}{2}\bigl(1 \pm (1 - 4X)^{1/2}\bigr).$$

    We observe that indeed $C = F_+$ or $C = F_-$ must hold: We have $F_+^2 - F_+ + X = 0$ and hence $0 = (C^2 - F_+^2) - (C - F_+) = (C - F_+)(C + F_+ - 1)$. As $\mathbb{C}[[X]]$ is an integral domain, either $C(X) = F_+(X)$ or $C(X) = 1 - F_+(X) = F_-(X)$ must hold.

    Using the binomial series we have

    $$(1 - 4X)^{1/2} = \sum_{n\ge 0}\binom{1/2}{n}(-4)^nX^n = 1 - 2X + \cdots,$$

    which implies $F_+ = 1 - X + \cdots$ and $F_- = X + \cdots$. As $C_0 = 0$ and $C_1 = 1$ we get $C = F_-$. Now we simplify the coefficient of $X^n$ in $(1 - 4X)^{1/2}$ for $n \ge 1$:

    $$\begin{aligned}
    \binom{1/2}{n}(-4)^n &= \frac{\prod_{i=0}^{n-1}(\frac{1}{2} - i)}{n!}(-4)^n = \frac{2^n\prod_{i=0}^{n-1}(\frac{1}{2} - i)}{2^n\,n!}(-1)^n2^{2n} \\
    &= \frac{\prod_{i=0}^{n-1}(2i - 1)}{n!}\,2^n = -\frac{(2n-2)!}{n!\,2^{n-1}(n-1)!}\,2^n \\
    &= -2\,\frac{(2n-2)!}{n!\,(n-1)!}.
    \end{aligned}$$

    Hence

    $$(1 - 4X)^{1/2} = 1 - \sum_{n\ge 1} 2\,\frac{(2n-2)!}{n!\,(n-1)!}X^n.$$

    Inserting this power series in the expression for $C = F_-$ and comparing coefficients we finally get:

    $$C_n = \frac{(2n-2)!}{n!\,(n-1)!}.$$

    □
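The closed formula can be compared with the convolution recurrence used in the proof. This is a small Python sketch for illustration; the names `catalan_rec` and `catalan_closed` are assumptions, and the indexing follows the notes ($C_0 = 0$, $C_1 = 1$).

```python
from math import factorial


def catalan_rec(N):
    """[C_0, ..., C_N] for N >= 1 via C_n = sum_{i=1}^{n-1} C_i C_{n-i}."""
    C = [0, 1]
    for n in range(2, N + 1):
        C.append(sum(C[i] * C[n - i] for i in range(1, n)))
    return C


def catalan_closed(n):
    """C_n = (2n-2)! / (n! (n-1)!) for n >= 1."""
    return factorial(2 * n - 2) // (factorial(n) * factorial(n - 1))


C = catalan_rec(10)
assert all(C[n] == catalan_closed(n) for n in range(1, 11))
print(C[1:8])
```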

    Evaluating parameter dependent sums.

    Consider the following:

    Problem Evaluate a sum $s_n$ which depends on a parameter $n$. However this parameter shall not appear explicitly in the summation range, say a sum of the form $s_n = \sum_{k\ge 0} c_k(n)$.

    In such situations a method called the free parameter method or snake oil method (H. Wilf) can sometimes be helpful. According to Wilf one either observes quickly that this method fails or the method works easily. Here is a somewhat vague description of the snake oil method.

    1. Form the power series $S(X) = \sum_{n\ge 0} s_nX^n$.

    2. Create a double sum $S(X) = \sum_{k\ge 0} a_kX^k\bigl(\sum_{n\ge 0} b_nX^{n-k}\bigr)$ such that the inner sum has a useful description in a closed form $B(X) = \sum_{n\ge 0} b_nX^{n-k}$.

    3. Write $S(X) = \sum_{k\ge 0} a_kX^kB(X)$.

    4. If one is lucky the coefficients of $S$ can be evaluated.

    Examples will be helpful to explain the strategy.

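    Examples (a) Evaluate:

    $$s_n = \sum_{k\ge 0}\binom{k}{n-k}, \quad n \in \mathbb{N}.$$

    According to step 1 we consider:

    $$S(X) = \sum_{n\ge 0} X^n\Bigl(\sum_{k\ge 0}\binom{k}{n-k}\Bigr) = \sum_{k\ge 0}\sum_{n\ge 0}\binom{k}{n-k}X^n.$$

    For the second step we recall that $\binom{k}{n-k} = 0$ for $n < k$ or $n > 2k$. So writing

    $$S(X) = \sum_{k\ge 0} X^k\sum_{n\ge 0}\binom{k}{n-k}X^{n-k}$$

    we get for the inner sum

    $$\sum_{n\ge 0}\binom{k}{n-k}X^{n-k} = \sum_{i=0}^{k}\binom{k}{i}X^i = (1 + X)^k.$$

    So (step 3)

    $$S(X) = \sum_{k\ge 0} X^k(1 + X)^k.$$

    In this case the final step is feasible:

    $$S(X) = \sum_{k\ge 0}(X + X^2)^k = \frac{1}{1 - X - X^2}.$$

    We recall that this power series is the generating function of the Fibonacci numbers. This shows

    $$F_n = s_n = \sum_{k\ge 0}\binom{k}{n-k}.$$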
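The result of example (a) is easy to confirm numerically. The sketch below is illustrative; the function names are hypothetical. It evaluates the sum directly and compares it with the coefficients of $1/(1 - X - X^2)$, which satisfy $F_0 = F_1 = 1$ and $F_n = F_{n-1} + F_{n-2}$.

```python
from math import comb


def s(n):
    """s_n = sum_k binom(k, n-k); only k = 0, ..., n can contribute."""
    return sum(comb(k, n - k) for k in range(n + 1))


def fib_series(terms):
    """Coefficients of 1/(1 - X - X^2)."""
    F = []
    for n in range(terms):
        if n < 2:
            F.append(1)
        else:
            F.append(F[n - 1] + F[n - 2])
    return F


assert [s(n) for n in range(15)] == fib_series(15)
print([s(n) for n in range(8)])
```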

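    (b) Problem: Show the identity

    $$\sum_{k\ge 0}\binom{m}{k}\binom{n+k}{m} = \sum_{k\ge 0}\binom{m}{k}\binom{n}{k}2^k, \quad m, n \ge 0.$$

    We multiply both sides by $X^n$. Summing over $n$ we have to verify the identity $L(X) = R(X)$ of power series with

    $$L(X) = \sum_{n\ge 0} X^n\sum_{k\ge 0}\binom{m}{k}\binom{n+k}{m}, \qquad R(X) = \sum_{n\ge 0} X^n\sum_{k\ge 0}\binom{m}{k}\binom{n}{k}2^k.$$

    We first rearrange the LHS. Using $1/(1 - X)^{k+1} = \sum_{n\ge 0}\binom{n+k}{n}X^n = \sum_{n\ge 0}\binom{n+k}{k}X^n$ we get

    $$\begin{aligned}
    L(X) &= \sum_{k\ge 0}\binom{m}{k}X^{-k}\sum_{n\ge 0}\binom{n+k}{m}X^{n+k} \\
    &= \sum_{k\ge 0}\binom{m}{k}X^{-k}\,\frac{X^m}{(1 - X)^{m+1}} \\
    &= \frac{X^m}{(1 - X)^{m+1}}\Bigl(1 + \frac{1}{X}\Bigr)^m \\
    &= \frac{(1 + X)^m}{(1 - X)^{m+1}}.
    \end{aligned}$$

    In a similar way we treat the RHS:

    $$\begin{aligned}
    R(X) &= \sum_{k\ge 0}\binom{m}{k}2^k\sum_{n\ge 0}\binom{n}{k}X^n \\
    &= \frac{1}{1 - X}\sum_{k\ge 0}\binom{m}{k}\Bigl(\frac{2X}{1 - X}\Bigr)^k = \frac{1}{1 - X}\Bigl(1 + \frac{2X}{1 - X}\Bigr)^m \\
    &= \frac{(1 + X)^m}{(1 - X)^{m+1}}.
    \end{aligned}$$

    Hence $L(X) = R(X)$ and the identity follows.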
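A quick numerical check of the identity in (b) for small parameters may be reassuring. The following sketch is only an illustration; the names `lhs` and `rhs` are hypothetical.

```python
from math import comb


def lhs(m, n):
    """sum_k binom(m, k) binom(n + k, m); terms with k > m vanish."""
    return sum(comb(m, k) * comb(n + k, m) for k in range(m + 1))


def rhs(m, n):
    """sum_k binom(m, k) binom(n, k) 2^k; terms with k > min(m, n) vanish."""
    return sum(comb(m, k) * comb(n, k) * 2 ** k for k in range(min(m, n) + 1))


assert all(lhs(m, n) == rhs(m, n) for m in range(10) for n in range(10))
print(lhs(3, 4), rhs(3, 4))
```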

    7 Exponential Generating Functions

    In this section we illustrate applications of exponential generating functions. We start with an example.

    Example Let $i_n$ be the number of permutations $\pi$ in $\mathrm{Sym}(n)$ such that $\pi^2 = 1$. Clearly, $i_1 = 1$, $i_2 = 2$. We set $i_0 = 1$ and claim:

    (1) $i_{n+2} = i_{n+1} + (n+1)\,i_n$ for $n \ge 0$.

    Set $I = \{\pi \in \mathrm{Sym}(n+2) \mid \pi^2 = 1\}$. Then $I = I' \sqcup I''$ where $I' = \{\pi \in I \mid \pi(n+2) \ne n+2\}$ and $I'' = \{\pi \in I \mid \pi(n+2) = n+2\}$. Clearly, $|I''| = i_{n+1}$. We decompose $I'$ further:

    $$I' = \bigsqcup_{j=1}^{n+1} I'_j, \qquad I'_j = \{\pi \in I' \mid \pi(n+2) = j\}.$$

    Then $|I'_j| = i_n$: Every element $\sigma \in \mathrm{Sym}([n+1] - \{j\})$ $(\simeq \mathrm{Sym}(n))$ with $\sigma^2 = 1$ is extended with the transposition $(j, n+2)$ to an element in $I'_j$, and the elements in $I'_j$ map onto the $\sigma$'s if one deletes this transposition. (1) follows.

    Unfortunately the recurrence (1) is not linear, so that theorem 6.1 cannot be applied. However it turns out that we can determine the EGF (exponential generating function)

    $$i(X) = \sum_{n\ge 0}\frac{i_n}{n!}X^n.$$

    We have

    $$D(i)(X) = \sum_{n\ge 1}\frac{i_n}{(n-1)!}X^{n-1} = \sum_{n\ge 0}\frac{i_{n+1}}{n!}X^n$$

    and

    $$Xi(X) = \sum_{n\ge 0}\frac{i_n}{n!}X^{n+1} = \sum_{n\ge 0}\frac{n\,i_{n-1}}{n!}X^n.$$

    By (1) $D(i)(X) = (1 + X)\,i(X)$, and as $i(X)$ is invertible

    $$\frac{D(i)(X)}{i(X)} = 1 + X.$$

    Thus with proposition 5.12

    $$D(\log(i(X))) = D(\log(i(X) - 1 + 1)) = \frac{D(i(X) - 1)}{i(X)} = \frac{D(i(X))}{i(X)} = 1 + X.$$

    So there is a complex number $c$ with

    $$\log(i(X)) = c + X + \frac{X^2}{2}.$$

    Apply the exponential series to obtain

    $$i(X) = e^c\exp\Bigl(X + \frac{X^2}{2}\Bigr).$$

    Also $c = 0$ as $i_0 = 1$, so that finally

    $$i(X) = \exp\Bigl(X + \frac{X^2}{2}\Bigr).$$
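The three descriptions of the numbers $i_n$ (recurrence (1), direct enumeration, and the EGF) can be compared for small $n$. The Python sketch below is illustrative only; the helper names, including the truncated exponential `exp_series`, are assumptions and not part of the notes. Exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def involutions_rec(N):
    """[i_0, ..., i_N] via i_{n+2} = i_{n+1} + (n+1) i_n, i_0 = i_1 = 1."""
    i = [1, 1]
    for n in range(N - 1):
        i.append(i[n + 1] + (n + 1) * i[n])
    return i[:N + 1]


def involutions_brute(n):
    """Number of permutations p of {0, ..., n-1} with p(p(x)) = x for all x."""
    return sum(
        1 for p in permutations(range(n)) if all(p[p[x]] == x for x in range(n))
    )


def exp_series(poly, terms):
    """Coefficients of exp(poly) up to X^(terms-1), as sum_k poly^k / k!.
    Requires poly[0] == 0, so that poly^k contributes nothing below X^k."""
    result = [Fraction(0)] * terms
    power = [Fraction(1)] + [Fraction(0)] * (terms - 1)   # poly^0
    for k in range(terms):
        for n in range(terms):
            result[n] += power[n] / factorial(k)
        new = [Fraction(0)] * terms                       # power * poly, truncated
        for a, pa in enumerate(power):
            if pa:
                for b, qb in enumerate(poly):
                    if a + b < terms:
                        new[a + b] += pa * qb
        power = new
    return result


N = 6
rec = involutions_rec(N)
egf = exp_series([Fraction(0), Fraction(1), Fraction(1, 2)], N + 1)
assert rec == [involutions_brute(n) for n in range(N + 1)]
assert rec == [egf[n] * factorial(n) for n in range(N + 1)]
print(rec)
```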

    The following lemma is basic for applications of EGFs.

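  • Lemma 7.1 Define for the functions $f, g : \mathbb{N} \to \mathbb{C}$ the EGFs $F(X) = \sum_{n\ge 0}\frac{f(n)}{n!}X^n$, $G(X) = \sum_{n\ge 0}\frac{g(n)}{n!}X^n$. Define further $h : \mathbb{N} \to \mathbb{C}$ by

    $$h(n) = \sum_{(S,T)} f(|S|)\,g(|T|)$$

    where $(S,T)$ ranges over the ordered partitions of $[n]$ of size 2 (here $S$ or $T$ may be empty), and let $H(X) = \sum_{n\ge 0}\frac{h(n)}{n!}X^n$ be the associated EGF. Then $H(X) = F(X)G(X)$.

    Proof. There exist precisely $\binom{n}{k}$ pairs $(S,T)$ with $|S| = k$. Thus:

    $$h(n) = \sum_{k=0}^{n}\binom{n}{k}f(k)\,g(n-k),$$

    and

    $$\begin{aligned}
    F(X)G(X) &= \sum_{n\ge 0}\Bigl(\sum_{k=0}^{n}\frac{f(k)}{k!}\,\frac{g(n-k)}{(n-k)!}\Bigr)X^n \\
    &= \sum_{n\ge 0}\frac{1}{n!}\Bigl(\sum_{k=0}^{n}\binom{n}{k}f(k)\,g(n-k)\Bigr)X^n = H(X)
    \end{aligned}$$

    follows. □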

    Remark Let $f_1, \ldots, f_k : \mathbb{N} \to \mathbb{C}$ be functions with EGFs $F_j(X) = \sum_{n\ge 0}\frac{f_j(n)}{n!}X^n$. Define $h : \mathbb{N} \to \mathbb{C}$ by

    $$h(n) = \sum_{(T_1,\ldots,T_k)} f_1(|T_1|)\cdots f_k(|T_k|)$$

    where $(T_1, \ldots, T_k)$ ranges over the ordered partitions of size $k$ of $[n]$, and let $H(X) = \sum_{n\ge 0}\frac{h(n)}{n!}X^n$ be the associated EGF. Then

    $$F_1(X)\cdots F_k(X) = H(X).$$

    This follows from lemma 7.1 by an obvious induction. With the next theorem we compute the composition of EGFs.

    Theorem 7.2 (Composition formula for EGFs) Define for the functions $f, g : \mathbb{N} \to \mathbb{C}$, $f(0) = 0$, the EGFs $F(X) = \sum_{n\ge 0}\frac{f(n)}{n!}X^n$, $G(X) = \sum_{n\ge 0}\frac{g(n)}{n!}X^n$. Define further $h : \mathbb{N} \to \mathbb{C}$ by $h(0) = g(0)$ and

    $$h(n) = \sum_{k\ge 1}\Bigl(\sum_{\{S_1,\ldots,S_k\}} f(|S_1|)\cdots f(|S_k|)\Bigr)g(k), \quad n \ge 1,$$

    where $\{S_1, \ldots, S_k\}$ ranges over the unordered partitions of $[n]$, and let $H(X) = \sum_{n\ge 0}\frac{h(n)}{n!}X^n$ be the associated EGF. Then $H(X) = G(F(X))$.

    Proof. By definition $G(F(X)) = \sum_{k\ge 0}\frac{g(k)}{k!}F(X)^k$, and by the lemma and the remark

    $$\frac{g(k)}{k!}F(X)^k = \frac{g(k)}{k!}\Bigl(\sum_{n\ge 0}\frac{1}{n!}\Bigl(\sum_{(T_1,\ldots,T_k)} f(|T_1|)\cdots f(|T_k|)\Bigr)X^n\Bigr).$$

    An unordered partition of size $k$ defines $k!$ ordered partitions of length $k$, and all these $k!$ partitions produce the same number $f(|T_1|)\cdots f(|T_k|)$. Note that ordered partitions can contain the empty set while unordered partitions only have nontrivial parts. But as $f(0) = 0$ we range with $(T_1, \ldots, T_k)$ only over ordered partitions with nontrivial parts, and our consideration shows

    $$\frac{1}{k!}\sum_{(T_1,\ldots,T_k)} f(|T_1|)\cdots f(|T_k|) = \sum_{\{S_1,\ldots,S_k\}} f(|S_1|)\cdots f(|S_k|)$$

    where $(T_1, \ldots, T_k)$ ranges over ordered and $\{S_1, \ldots, S_k\}$ over unordered partitions of $[n]$. Define for $k \ge 1$, $n \ge 1$

    $$h_k(n) = \sum_{\{S_1,\ldots,S_k\}} f(|S_1|)\cdots f(|S_k|)\,g(k)$$

    and set $H_k(X) = \sum_{n\ge 1}\frac{h_k(n)}{n!}X^n$. Then

    $$H_k(X) = \sum_{n\ge 1}\frac{1}{n!}\Bigl(\sum_{\{S_1,\ldots,S_k\}} f(|S_1|)\cdots f(|S_k|)\,g(k)\Bigr)X^n = \frac{g(k)}{k!}F(X)^k.$$

    Define $H_0(X) = g(0)$. Then

    $$\begin{aligned}
    G(F(X)) &= \sum_{k\ge 0} H_k(X) = g(0) + \sum_{k\ge 1}\frac{g(k)}{k!}F(X)^k \\
    &= g(0) + \sum_{k\ge 1}\sum_{n\ge 1}\frac{1}{n!}\Bigl(\sum_{\{S_1,\ldots,S_k\}} f(|S_1|)\cdots f(|S_k|)\,g(k)\Bigr)X^n \\
    &= g(0) + \sum_{n\ge 1}\frac{1}{n!}\sum_{k\ge 1}\Bigl(\sum_{\{S_1,\ldots,S_k\}} f(|S_1|)\cdots f(|S_k|)\,g(k)\Bigr)X^n \\
    &= H(X).
    \end{aligned}$$

    □

    Corollary 7.3 (Exponential formula) Define for the function $f : \mathbb{N} \to \mathbb{C}$, $f(0) = 0$, the EGF $F(X) = \sum_{n\ge 0}\frac{f(n)}{n!}X^n$ and define $h : \mathbb{N} \to \mathbb{C}$ by $h(0) = 1$ and

    $$h(n) = \sum_{k\ge 1}\Bigl(\sum_{\{S_1,\ldots,S_k\}} f(|S_1|)\cdots f(|S_k|)\Bigr), \quad n \ge 1,$$

    where $\{S_1, \ldots, S_k\}$ ranges over the unordered partitions of $[n]$. Let $H(X) = \sum_{n\ge 0}\frac{h(n)}{n!}X^n$ be the associated EGF. Then $\exp(F(X)) = H(X)$.

    Proof. Take $g : \mathbb{N} \to \mathbb{C}$ as the function which is constant $= 1$. The EGF of $g$ is the exponential series. Apply the theorem. □

    Already very simple choices of f and g lead to interesting applications.

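    Example Define $f : \mathbb{N} \to \mathbb{C}$ by $f(0) = 0$ and $f(n) = 1$ for $n > 0$, define $g_k : \mathbb{N} \to \mathbb{C}$ by $g_k(n) = \delta_{kn}$, and let $F(X)$ and $G_k(X)$ be the associated EGFs. Then $F(X) = \exp(X) - 1$ and $G_k(X) = \frac{X^k}{k!}$. Define $h(n)$ as in the composition formula by

    $$h(n) = \sum_{\ell\ge 1}\sum_{\{S_1,\ldots,S_\ell\}} f(|S_1|)f(|S_2|)\cdots f(|S_\ell|)\,g_k(\ell) = \sum_{\{S_1,\ldots,S_k\}} 1 = S(n,k),$$

    i.e. we obtain the Stirling number $S(n,k)$. We denote the EGF of the Stirling numbers of the second kind for fixed $k$ by $S_k(X) = \sum_{n\ge 0}\frac{S(n,k)}{n!}X^n$. Thus

    $$S_k(X) = G_k(F(X)) = \frac{1}{k!}(\exp(X) - 1)^k.$$

    Recall that the Bell numbers are defined as $B(n) = \sum_{k=0}^{n} S(n,k) = \sum_{k\ge 0} S(n,k)$ (as $S(n,k) = 0$ for $k > n$). Denote by

    $$B(X) = \sum_{n\ge 0}\frac{B(n)}{n!}X^n$$

    the EGF of the Bell numbers. Then

    $$\begin{aligned}
    B(X) &= \sum_{n\ge 0}\frac{1}{n!}\Bigl(\sum_{k\ge 0} S(n,k)\Bigr)X^n = \sum_{k\ge 0}\Bigl(\sum_{n\ge 0}\frac{S(n,k)}{n!}X^n\Bigr) \\
    &= \sum_{k\ge 0} S_k(X) = \sum_{k\ge 0}\frac{1}{k!}(\exp(X) - 1)^k = \exp(\exp(X) - 1).
    \end{aligned}$$

    □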

    Remark Although no explicit formula for the Bell numbers is known, we have a simple, explicit generating function for them. In cases where the generating function of a number sequence is explicitly known one can often find recurrences for these numbers following a recipe described by H. Wilf in his book (p. 22) as follows:

    1. Take the logarithm of both sides of the equation.

    2. Differentiate both sides and multiply through by $X$.

    3. Clear the equation of fractions.

    4. For each $n$, find the coefficient of $X^n$ on both sides of the equation and equate them.

    We apply this method in the case of the Bell numbers.

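    Proposition 7.4 For all $n \ge 1$

    $$B(n) = \sum_{k=0}^{n-1}\binom{n-1}{k}B(k).$$

    Proof. The formula $B(X) = \sum_{n\ge 0}\frac{1}{n!}B(n)X^n = \exp(\exp(X) - 1)$ transforms after applying the logarithm to:

    $$\log\Bigl(\sum_{n\ge 0}\frac{1}{n!}B(n)X^n\Bigr) = \exp(X) - 1.$$

    We differentiate and multiply by $X$ to get

    $$\frac{\sum_{n\ge 1}\frac{n}{n!}B(n)X^n}{\sum_{n\ge 0}\frac{1}{n!}B(n)X^n} = X\exp X.$$

    Multiply with the denominator of the LHS and we obtain with the help of lemma 7.1

    $$\sum_{n\ge 1}\frac{1}{(n-1)!}B(n)X^n = X\exp X\sum_{n\ge 0}\frac{1}{n!}B(n)X^n = X\Bigl(\sum_{n\ge 0}\frac{1}{n!}\Bigl(\sum_{k=0}^{n}\binom{n}{k}B(k)\Bigr)X^n\Bigr).$$

    Hence for $n \ge 1$

    $$\begin{aligned}
    \frac{B(n)}{(n-1)!} &= [X^n]\sum_{n\ge 1}\frac{n}{n!}B(n)X^n \\
    &= [X^n]\sum_{n\ge 0}\frac{1}{n!}\Bigl(\sum_{k=0}^{n}\binom{n}{k}B(k)\Bigr)X^{n+1} \\
    &= \frac{1}{(n-1)!}\sum_{k=0}^{n-1}\binom{n-1}{k}B(k)
    \end{aligned}$$

    and the proposition is verified. □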
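The recurrence of proposition 7.4 can be compared with the definition $B(n) = \sum_k S(n,k)$. This Python sketch is only illustrative; the function names are hypothetical, and the Stirling numbers are tabulated with the recurrence $S(n,k) = S(n-1,k-1) + kS(n-1,k)$ from section 6.

```python
from math import comb


def bell_by_recurrence(N):
    """[B(0), ..., B(N)] via B(n) = sum_{k=0}^{n-1} binom(n-1, k) B(k), B(0) = 1."""
    B = [1]
    for n in range(1, N + 1):
        B.append(sum(comb(n - 1, k) * B[k] for k in range(n)))
    return B


def bell_by_stirling(N):
    """[B(0), ..., B(N)] as row sums of the table of Stirling numbers S(n, k)."""
    S = [[0] * (N + 1) for _ in range(N + 1)]
    S[0][0] = 1
    for n in range(1, N + 1):
        for k in range(1, n + 1):
            S[n][k] = S[n - 1][k - 1] + k * S[n - 1][k]
    return [sum(row) for row in S]


assert bell_by_recurrence(12) == bell_by_stirling(12)
print(bell_by_recurrence(7))
```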

    Here is a version of the composition formula using permutations.

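    Theorem 7.5 Define for the functions $f, g : \mathbb{N} \to \mathbb{C}$, $f(0) = 0$, the EGFs $F(X) = \sum_{n\ge 1}\frac{f(n)}{n!}X^n$, $G(X) = \sum_{n\ge 0}\frac{g(n)}{n!}X^n$. Define further $h : \mathbb{N} \to \mathbb{C}$ by $h(0) = g(0)$ and

    $$h(n) = \sum_{k\ge 1}\ \sum_{\pi \in \mathrm{Sym}(n),\ c(\pi) = k} f(|S_1|)\cdots f(|S_k|)\,g(k), \quad n \ge 1,$$

    where $\{S_1, \ldots, S_k\}$ are the supports of the cycles of $\pi$ with $c(\pi) = k$. Let $H(X) = \sum_{n\ge 0}\frac{h(n)}{n!}X^n$ be the associated EGF. Then

    $$H(X) = G\Bigl(\sum_{n\ge 1}\frac{f(n)}{n}X^n\Bigr).$$

    Proof. The number of $k$-cycles of a $k$-set is $(k-1)!$. So if $\Pi = \{S_1, \ldots, S_k\}$ is a partition of $[n]$ we have precisely $(|S_1| - 1)!\cdots(|S_k| - 1)!$ permutations whose supports of the cycles induce the partition $\Pi$. Thus

    $$h(n) = \sum_{k\ge 1}\ \sum_{\{S_1,\ldots,S_k\}} (|S_1| - 1)!\,f(|S_1|)\cdots(|S_k| - 1)!\,f(|S_k|)\,g(k)$$

    where $\{S_1, \ldots, S_k\}$ ranges over the unordered partitions of $[n]$ of size $k$. Define $\bar f(n) = (n-1)!\,f(n)$; then we rewrite the formula as

    $$h(n) = \sum_{k\ge 1}\ \sum_{\{S_1,\ldots,S_k\}} \bar f(|S_1|)\cdots\bar f(|S_k|)\,g(k).$$

    Defining $\bar F(X) = \sum_{n\ge 1}\frac{\bar f(n)}{n!}X^n$ we get with the composition formula

    $$H(X) = G(\bar F(X)) = G\Bigl(\sum_{n\ge 1}\frac{f(n)}{n}X^n\Bigr).$$

    □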

    If we specialize g(n) = 1 for all n, i.e. G(X) = exp(X), we get:

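    Corollary 7.6 Define for $f : \mathbb{N} \to \mathbb{C}$, $f(0) = 0$, the EGF $F(X) = \sum_{n\ge 1}\frac{f(n)}{n!}X^n$ and define further $h : \mathbb{N} \to \mathbb{C}$ by $h(0) = 1$ and

    $$h(n) = \sum_{k\ge 1}\ \sum_{\pi \in \mathrm{Sym}(n),\ c(\pi) = k} f(|S_1|)\cdots f(|S_k|), \quad n \ge 1,$$

    where $\{S_1, \ldots, S_k\}$ are the supports of the cycles of $\pi$ with $c(\pi) = k$. Let $H(X) = \sum_{n\ge 0}\frac{h(n)}{n!}X^n$ be the associated EGF. Then

    $$H(X) = \exp\Bigl(\sum_{n\ge 1}\frac{f(n)}{n}X^n\Bigr).$$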

    The corollary provides the EGFs of elements of some particular order in the symmetric groups:

    Theorem 7.7 Let $m \ge 1$ be an integer. The EGF of the numbers of elements in $\mathrm{Sym}(n)$ whose orders divide $m$, i.e. of the numbers $h(n) = |\{\pi \in \mathrm{Sym}(n) \mid \pi^m = 1\}|$, is

    $$\exp\Bigl(\sum_{d \mid m}\frac{X^d}{d}\Bigr).$$

    Proof. Set $f(0) = 0$ and for $d \ge 1$ set

    $$f(d) = \begin{cases} 1, & d \mid m, \\ 0, & \text{else.} \end{cases}$$

    The $h(n)$ in the corollary,

    $$h(n) = \sum_{k\ge 1}\ \sum_{\pi \in \mathrm{Sym}(n),\ c(\pi) = k} f(|S_1|)\cdots f(|S_k|),$$

    has the term $f(|S_1|)\cdots f(|S_k|) = 1$ if all $|S_i|$ divide $m$, and otherwise this product is 0. Thus a $\pi \in \mathrm{Sym}(n)$ gives the nontrivial contribution 1 iff $\pi^m = 1$. Hence $h(n) = |\{\pi \in \mathrm{Sym}(n) \mid \pi^m = 1\}|$. By the corollary we obtain the EGF

    $$\sum_{n\ge 0}\frac{h(n)}{n!}X^n = \exp\Bigl(\sum_{d \mid m}\frac{X^d}{d}\Bigr).$$

    □

    Example. The EGF for elements of order 1 and 2 is therefore $\exp(X + \frac{X^2}{2})$, which agrees with the first example of this section.
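Theorem 7.7 can be checked for small $n$ by comparing a brute-force count with the coefficients of the EGF. The following Python sketch is illustrative; the function names are assumptions, and the exponential is expanded as a truncated sum $\sum_k P^k/k!$ with exact fractions.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def count_brute(n, m):
    """Number of permutations p of {0, ..., n-1} with p^m = identity."""
    identity = tuple(range(n))

    def power_is_identity(p):
        q = identity
        for _ in range(m):
            q = tuple(p[x] for x in q)   # compose with p once more
        return q == identity

    return sum(1 for p in permutations(range(n)) if power_is_identity(p))


def count_from_egf(N, m):
    """[h(0), ..., h(N)] read off from exp(sum_{d | m} X^d / d)."""
    poly = [Fraction(0)] * (N + 1)
    for d in range(1, N + 1):
        if m % d == 0:
            poly[d] = Fraction(1, d)     # divisors d > N cannot contribute
    result = [Fraction(0)] * (N + 1)
    power = [Fraction(1)] + [Fraction(0)] * N
    for k in range(N + 1):
        for n in range(N + 1):
            result[n] += power[n] / factorial(k)
        new = [Fraction(0)] * (N + 1)    # power * poly, truncated after X^N
        for a, pa in enumerate(power):
            if pa:
                for b, qb in enumerate(poly):
                    if qb and a + b <= N:
                        new[a + b] += pa * qb
        power = new
    return [int(result[n] * factorial(n)) for n in range(N + 1)]


for m in (2, 3, 4, 6):
    assert count_from_egf(5, m) == [count_brute(n, m) for n in range(6)]
print(count_from_egf(6, 3))
```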

    Definition A connected graph which contains no cycle as a subgraph is called a tree. A graph whose connected components are trees is called a forest. A graph with $n$ vertices which are denoted by the elements of $[n]$ is called a labeled graph.

    Example There exists only one tree with 3 vertices but 3 labeled trees with 3 vertices.

    [Figure: the unlabeled tree on 3 vertices and the three labeled trees on the vertex set {1, 2, 3}.]

    Theorem 7.8 (Cayley) There are precisely $n^{n-2}$ labeled trees on $n$ vertices.
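For small $n$ Cayley's formula can be confirmed by enumeration. The sketch below is only an illustration and not the method of the proof: it runs through all sets of $n - 1$ edges on the vertex set $\{0, \ldots, n-1\}$ and counts those without a cycle (such a set is automatically connected). The function name `count_labeled_trees` is hypothetical.

```python
from itertools import combinations


def count_labeled_trees(n):
    """Number of labeled trees on n vertices, by brute force."""
    edges = list(combinations(range(n), 2))
    count = 0
    for chosen in combinations(edges, n - 1):
        parent = list(range(n))          # union-find forest

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        acyclic = True
        for u, v in chosen:
            ru, rv = find(u), find(v)
            if ru == rv:                 # edge would close a cycle
                acyclic = False
                break
            parent[ru] = rv
        if acyclic:                      # n-1 edges without a cycle form a tree
            count += 1
    return count


print([count_labeled_trees(n) for n in range(1, 7)])
```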

    Proof. Let $T_n$ be the number of labeled trees on $n$ vertices. We know already $T_1 = T_2 = 1$, $T_3 = 3$. In the sequel it is convenient to distinguish a particular vertex in a labeled tree, which we call the root; the tree is then called a rooted tree. Let $t_n$ be the number of rooted trees. Since we can take any vertex of a labeled tree as a root we have

    $$t_n = nT_n.$$

    Let $f_n$ be the number of rooted forests on $n$ vertices. We claim

    $$t_{n+1} = (n+1)f_n. \tag{1}$$

  • Let $\Gamma$ be a rooted forest with connected components $\Gamma_1, \ldots, \Gamma_k$ and roots $a_1 \in V(\Gamma_1), \ldots, a_k \in V(\Gamma_k)$. We define a tree $\Gamma_0$ on $[n+1]$ by adding $n+1$ to the vertex set of $\Gamma$ and the edges $\{a_1, n+1\}, \ldots, \{a_k, n+1\}$ to the set of edges. If on the other hand $\Gamma$ is a tree on $[n+1]$ and $N(n+1) = \{a_1, \ldots, a_k\}$, we delete from $\Gamma$ the vertex $n+1$ and the edges $\{a_1, n+1\}, \ldots, \{a_k, n+1\}$ and obtain a rooted forest $\Gamma_0$ on $[n]$ with $k$ connected components. Of course we take the $a_i$ as roots and observe that the components are characterized by the $a_i$. Note that we can obtain from two trees isomorphic forests if we neglect the labeling. However using the labeling the two lab