


Chapter 1

Cutting Problems

In this chapter we consider some problems concerning the cutting of various edible items.

1. We want to cut a cube of cheese that is three inches on each side into 27 1 × 1 × 1 inch cubes. You are allowed to rearrange the pieces after each cut and to cut more than one piece at once, but each cut must be a single planar slice. What is the least number of cuts necessary?

Solution. The “standard” solution is to note that each of the six faces of the central 1 × 1 × 1 cube requires a separate cut, so six cuts is a lower bound. On the other hand, it is easy to see that six cuts suffice, even without rearranging pieces.

The cheese cutting problem for a 4 × 4 × 4 cube, where six cuts is still achievable, appears in Gardner [58, pp. 51–52].

2. The previous solution suffers from the defect that it doesn’t easily generalize. Find the minimum number of cuts necessary to cut an n1 × n2 × ⋯ × nd brick of d-dimensional cheese into n1n2⋯nd unit bricks. Again you are allowed to rearrange pieces before each cut, and each cut must be along a hyperplane (of dimension d − 1).



Solution. The minimum number of cuts is

f(n1, . . . , nd) := ∑_{i=1}^{d} ⌈log2(ni)⌉. (1.1)

It requires ⌈log2(ni)⌉ cuts to break up a line of ni bricks in the ith coordinate direction. Since each cut can cut a given piece in only one direction at a time, we see that f(n1, . . . , nd) is a lower bound on the number of cuts. It’s not hard to achieve this bound just by applying the strategy for d = 1 one direction at a time.

This problem is discussed by Gardner [59, p. 34]. In particular, equation (1.1) is due to Eugene Putzer and R. Lowen [121] in 1958.
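Equation (1.1) is easy to put to work; here is a minimal Python sketch (the function name min_cuts is ours):

```python
from math import ceil, log2

def min_cuts(*dims):
    """Minimum number of hyperplane cuts needed to reduce an
    n1 x n2 x ... x nd brick to unit bricks, allowing pieces to be
    rearranged and stacked between cuts -- equation (1.1)."""
    return sum(ceil(log2(n)) for n in dims)

# The 3 x 3 x 3 cheese cube of Problem 1 needs six cuts:
print(min_cuts(3, 3, 3))  # -> 6
```

Note that min_cuts(4, 4, 4) is also 6, matching the Gardner remark above.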

3. Let us turn to some mathematical games involving cake cutting. We have two players Alice and Bob (the traditional names for the players of two-player math games). This time we are cutting chocolate bars rather than cheese. Suppose we have such a bar divided by indentations into an m × n array of squares. Alice begins and (unless m = n = 1) breaks the bar into two pieces by cutting along one of the grid lines. The players take turns choosing one of the pieces and cutting it into two along one of the grid lines, if this piece is not just a 1 × 1 square. The first player unable to move loses. This situation will occur when we reach mn 1 × 1 squares. For each value of m and n, who will win under optimal play? What is the correct strategy for the winner?

Solution. The number of pieces increases by one after each turn. Therefore the game will end after mn − 1 turns. Thus Alice wins if and only if mn is even. It doesn’t matter how either Alice or Bob plays.

This game is called impartial cutcake, “impartial” because at any stage of the game, the moves available to each player are the same. Winkler [164, p. 93] says “This ridiculously easy puzzle has been known to stump some very high-powered mathematicians for as much as a full day, until the light finally dawns amid groans and beatings of heads against walls.”
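Since the length of the game is forced, no strategy is needed at all, only parity; a sketch in Python (names ours):

```python
def impartial_cutcake_winner(m, n):
    """Every break increases the piece count by one, so the game always
    lasts exactly m*n - 1 turns regardless of play.  Alice (the first
    player) makes the last move exactly when m*n - 1 is odd."""
    return "Alice" if (m * n) % 2 == 0 else "Bob"

print(impartial_cutcake_winner(2, 3))  # -> Alice
print(impartial_cutcake_winner(3, 3))  # -> Bob
```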

4. Let’s move to a more interesting variant of impartial cutcake. Rather than have the same moves available for each player, let each cut be available to just one of the players. To be specific, regard our m × n chocolate bar as an array with m rows and n columns. Alice can only make vertical cuts separating two adjacent columns (so on her first turn she has n − 1 choices), and similarly Bob for rows. Alice goes first as before, and the first player unable to move loses. This game is called simply cutcake. Who wins with optimal play?

Solution. Cutcake is a partizan game: the moves available to the two players differ. Moreover, in cutcake it is always disadvantageous to move: any cut a player makes to a rectangle R simply removes that option for him- or herself while doubling the number of options for the other player on the two pieces into which R has been cut. It turns out that for any such partizan game G (once the definition of “partizan game” has been made more precise) we can assign a real number ν(G) called the value of the game. We should think of ν(G) as the number of moves by which Alice is ahead of Bob. For instance, a 1 × n chocolate bar An−1 has value ν(An−1) = n − 1. If we have a set S = {R1, . . . , Rk} of disjoint chocolate bars, then a move on S consists of cutting some Ri in an allowed way. We denote this situation by writing S = R1 + ⋯ + Rk. If the 1 × n bar An−1 is part of the collection S, then Alice has n − 1 possible moves which she can take on An−1 at her leisure, and Bob can do nothing to disrupt these n − 1 moves. Similarly, an m × 1 bar A−(m−1) has value −(m − 1), since now Alice is m − 1 moves behind. A 1 × 1 rectangle has value 0, meaning that the mover loses.

For any situation S = R1 + ⋯ + Rk, we say that ν(S) = n if mover loses on the game S + A−n. Since mover clearly loses on An + A−n, we get ν(An) = n, in agreement with our previous definition of ν(An). It is easy to show that if S = R1 + ⋯ + Rk then

ν(S) = ν(R1) + ⋯ + ν(Rk).

In particular, ν(S) = 0 if and only if mover loses (with optimal play). This is quite clear when all the Ri’s are of sizes mj × 1 and 1 × nj: the players’ moves are completely independent, so it is just a question of counting how many moves each player has. To understand cutcake completely, it suffices to determine the value ν(R) of any rectangle.

Example. Let us write Ra,b for the game with one a × b rectangle, and kRa,b for Ra,b + ⋯ + Ra,b (k times). A tedious analysis of all possibilities shows that if we begin with R3,8 + R4,1 then mover loses. Since ν(R4,1) = −3, it follows that ν(R3,8) = 3. One can also show that ν(R5,9) = 1. Note that in general ν(Ra,b) = −ν(Rb,a), since on Ra,b + Rb,a the second player can always win by an obvious mimicking strategy. Thus ν(R3,8 + 3R9,5) = 3 − 3 = 0, i.e., mover loses on R3,8 + 3R9,5. This simple example illustrates the power of the value function ν.

Figure 1.1: Cutcake values (a table of ν(Rm,n) for m, n ≤ 15; the table itself did not survive extraction)

We need to give a simple formula for ν(Rm,n), or at least a simple method to compute this number. One way to state the answer is as follows: if m ≤ n and m has k + 1 binary digits, then ν(Rm,n) = v if and only if (v + 1)2^k ≤ n ≤ (v + 2)2^k − 1. In particular, ν(Rm,n) = 0 if and only if m and n have the same number of binary digits. Figure 1.1 gives a table of ν(Rm,n) for m, n ≤ 15. Once the value of ν(Rm,n) is guessed, its validity is straightforward to prove by induction. We leave the details to the reader.
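The rule above translates into a few lines of Python (an illustrative sketch; the function name is ours, and int.bit_length counts binary digits):

```python
def cutcake_value(m, n):
    """Value nu(R_{m,n}) of an m x n cutcake rectangle: with m <= n and
    2**k <= m < 2**(k+1), the stated rule gives nu = floor(n / 2**k) - 1;
    for m > n use the antisymmetry nu(R_{m,n}) = -nu(R_{n,m})."""
    if m > n:
        return -cutcake_value(n, m)
    k = m.bit_length() - 1          # m has k+1 binary digits
    return n // 2**k - 1

print(cutcake_value(3, 8))   # -> 3
print(cutcake_value(5, 9))   # -> 1
print(cutcake_value(3, 8) + 3 * cutcake_value(9, 5))  # -> 0, mover loses
```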

Cutcake is analyzed in volume 1 [16, pp. 24–26] of the monumental opus (four volumes in the A K Peters edition) by Berlekamp, Conway, and Guy¹.

Note. Cutcake happens to be an especially simple partizan game because the value of every position is an integer. In more complicated partizan games we can get fractional values. For nonpartizan games the situation becomes much more complicated. All this and much more is thoroughly covered in the book by Berlekamp, Conway, and Guy cited above.

By way of illustration, the next problem gives a rather contrived example of a game with a fractional value.

5. Let R be a 1 × 2 piece of chocolate. As in cutcake, Alice can cut R into two 1 × 1 pieces. However, if it is Bob’s turn to move, he is allowed to make a partial cut which Alice can complete when it is her turn to move. (Once Bob has made a partial cut, he is not allowed to make another partial cut at a later turn.) What is the “correct” value ν(R)?

Solution. Since Alice wins whoever goes first, we should have ν(R) > 0. However, the game is not as advantageous for Alice as a 1 × 2 rectangle in cutcake, since Bob is able to make a move prior to Alice. This suggests that ν(R) < 1. To get the precise value, it is easy to check that mover loses in the game R + R + A−1. (Recall that A−1 is a 2 × 1 piece of chocolate in ordinary cutcake.) Since ν(A−1) = −1, we should define ν(R) = 1/2.

Another aspect of cutting is the theory of “fair cake-cutting.” We begin with the simple and well-known procedure for two persons.

6. Two persons each have different “valuation functions” which tell them how much they like each piece of a cake. The valuation functions must satisfy some reasonable properties for fair cake-cutting to make sense. We regard a valuation as a finitely additive nonatomic probability distribution V. Thus the value of the entire cake is 1, each individual point has value 0, and if A, B are disjoint pieces of cake then V(A ∪ B) = V(A) + V(B). The goal is to share the cake so each person thinks they got at least their fair share (called proportionality) and neither person thinks the other got more than their fair share (called envy-freeness). It is easy to see that envy-freeness implies proportionality. Describe an envy-free procedure for sharing a cake between two persons.

¹Richard Guy is the world’s oldest living mathematician at the time of this writing (26 May 2019), according to [94].

Solution. One person divides the cake into two pieces which he or she regards as being of equal value, and then the second person chooses one of the pieces.

7. Describe an envy-free algorithm for n persons, n ≥ 3. Is there such an algorithm which has a bounded number of slices for each n?

Solution. For three persons, an envy-free algorithm was devised in 1960 by John Selfridge, and later independently found by John Conway. It requires as many as five slices. In 1995 Steven Brams and Alan Taylor found a finite procedure that works for any number n of persons, but the number of slices for sufficiently large fixed n is an unbounded function of the valuations. Finally, in 2018 Haris Aziz and Simon Mackenzie described an envy-free algorithm that has a bounded number of slices for fixed n. If f(n) is the least number of slices which suffices for any n valuations, then it is known that f(n) ≥ cn², while the upper bound is an exponential tower of six n’s, that is,

f(n) ≤ n^n^n^n^n^n.

Not such a practical algorithm for a classroom full of students. Even 4^4^4 is far greater than the number of atoms in the visible universe.

Note. For a different kind of fair allocation problem, see Problem 3.??.

8. Our next cutting problem is at first sight a very unintuitive result. We have a cylindrical birthday cake with frosting on the top. Let 0 < θ < 2π.


Cut out a wedge that makes an angle θ, remove it from the cake, turn it upside down, and insert it back into the cake. Now cut another wedge of angle θ adjacent (say counterclockwise) to the first one and do the same procedure. Continue in this way, always removing a wedge adjacent in the counterclockwise direction to the wedge just removed and inserted, and then turning it upside down and inserting it back into the cake. For what angles θ will we have all the frosting back on top after a finite positive number of moves?

Solution. At first it may seem that θ must be a rational multiple of 2π, since otherwise each cut will be in the interior of some previously intact piece P. After we turn over the new wedge, the piece P will have one part with frosting on top and one without. What is often overlooked is that when we turn over a piece of cake prior to reinserting it, it reverses the pattern of the frosted and unfrosted regions. This reversal destroys the argument that θ/π must be rational.

Let n = ⌈2π/θ⌉. Clearly if 2π/θ = n then after n flips all the frosting will be on the bottom, and after n more it will be back on top. Thus we can assume that 2π/θ ∉ Z. When we make the kth cut, consider the set Sk of clockwise angles between that cut and all the radii that separate a frosted region from an unfrosted region.

For example, suppose that θ = 2. After one cut the region between the angles 0 and θ will be unfrosted, so S1 = {0, θ}. After the second cut, the unfrosted region is between 0 and 2θ, so S2 = {0, 2θ}. Similarly S3 = {0, 3θ}. The next cut at the angle 4θ will be inside the first piece that was reinserted. When we flip, the unfrosted region from 0 to 4θ (counterclockwise) becomes a frosted region from 3θ to 7θ, while the frosted region from 3θ to 0 becomes an unfrosted region from 7θ to 4θ. (We reduce all angles ψ modulo 2π so that 0 ≤ ψ < 2π.) Thus after the flip the region from 3θ to 7θ is frosted, while the remaining cake from 7θ to 3θ is unfrosted. The radii between frosted and unfrosted regions are at 3θ and 7θ. Measuring these angles from the position 4θ gives S4 = {−θ, 3θ}, as illustrated below.


[Figure omitted: the cake after the fourth flip, with the angles 0 and θ marked.]

Once the idea of looking at the sets Sk is found, the remainder of the proof is fairly straightforward. One shows by induction (left to the reader) that for any k, Sk is a subset of

{−(n − 1)θ, −(n − 2)θ, . . . , −θ, 0, θ, 2θ, . . . , nθ}. (1.2)

Since there are only finitely many possible Sk, and since given Sk we can reverse the procedure to get back to Sk−1, it follows immediately that after finitely many steps we will get back to the frosting being on top. A crude upper bound for the number of steps is 2^{2n}, the number of subsets of the set (1.2).

The origin of this problem is uncertain. For an exposition see Winkler [165, pp. 111, 115–118].
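The procedure is also easy to simulate directly. Below is an illustrative Python sketch (the names and the floating-point tolerance are ours) that tracks the frosted arcs and counts flips until the top is fully frosted again; for θ = 2π/n it returns 2n, in agreement with the solution.

```python
from math import pi

def flip_count(theta, max_steps=100000):
    """Simulate the cake-flipping procedure.  The frosting state is a
    list of (length, frosted) arcs read counterclockwise from the
    current cut position; returns the number of flips until all the
    frosting is back on top."""
    arcs = [(2 * pi, True)]                 # start with the whole top frosted
    for step in range(1, max_steps + 1):
        # carve off the wedge [0, theta) measured from the current cut
        wedge, rest, need = [], [], theta
        for length, frosted in arcs:
            if need <= 1e-12:
                rest.append((length, frosted))
            elif length <= need + 1e-12:
                wedge.append((length, frosted))
                need -= length
            else:
                wedge.append((need, frosted))
                rest.append((length - need, frosted))
                need = 0.0
        # turning the wedge over reverses its arcs and swaps top and
        # bottom frosting (each radial slab is frosted on exactly one side)
        wedge = [(l, not f) for (l, f) in reversed(wedge)]
        arcs = rest + wedge                 # next cut starts where the wedge ends
        if all(f for _, f in arcs):
            return step
    return None

print(flip_count(2 * pi / 3))   # -> 6 (three flips to all-bottom, three back)
```

For θ = 2 the simulation returns after finitely many steps, illustrating the theorem (and Problem 9 below predicts exactly 2n(n − 1) = 24 steps).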

9. * Show that if θ ≠ 2π/n, then the least number of steps to get the frosting back on top is 2n(n − 1). Moreover, we will never get all the frosting on the bottom, as happens in the θ = 2π/n case.

Chapter 2

Polynomials

The theory of polynomials is a vast mathematical edifice. In fact, the entire subject of algebraic geometry is about these creatures, but in general this subject requires too much background to be suitable for this book. We will look at some properties of polynomials that make interesting problems or are entertaining stand-alone facts.

We will consider primarily polynomials in one variable over a field K. The set (in fact, a ring) of such polynomials is denoted K[x]. The notation extends in an obvious way to K[x1, . . . , xn]. A zero of a polynomial f(x) ∈ K[x] is an element α of some extension field of K for which f(α) = 0. Often the word “root” is used as a synonym for “zero,” but to be precise, polynomials f(x) have zeros, while polynomial equations f(x) = 0 have roots.

1. We begin with polynomials of degree one, say f(x) = ax + b, where a ≠ 0. Find a necessary and sufficient condition for such a polynomial to have a zero in K.

Solution. Of course this is a trivial warmup problem. There always exists the unique zero x = −b/a.

2. We continue to quadratic polynomials, say f(x) = ax² + bx + c ∈ K[x] with a ≠ 0. Is there a nice condition for f(x) to have two zeros (always counted with multiplicity) in K?



Solution. This problem has a small trap. It is not correct that a necessary and sufficient condition for f(x) to have two zeros in K is that b² − 4ac is a square in K. This condition is correct only when char K ≠ 2, in which case the proof follows from the quadratic formula. When char K = 2 there is no longer such a simple condition. For an example of what can be done in this case, see Cherly, Gallardo, Vaserstein, and Wheland [34].

As an immediate corollary, we see that a polynomial f(x) = ax² + bx + c ∈ R[x] has only real zeros if and only if b² − 4ac ≥ 0.

The solution to the next problem uses a basic result from the theory of symmetric functions. Recall that a polynomial f(x1, . . . , xn) ∈ K[x1, . . . , xn] is said to be symmetric if it is invariant under any permutation of the variables. In symbols, if w is any permutation of [n] then

f(x1, . . . , xn) = f(xw(1), . . . , xw(n)).

In particular, the elementary symmetric functions

ek = ek(x1, . . . , xn) := ∑_{1≤i1<i2<⋯<ik≤n} xi1 xi2 ⋯ xik,  1 ≤ k ≤ n,

are symmetric polynomials. The fundamental theorem of symmetric functions asserts that any symmetric polynomial f ∈ K[x1, . . . , xn] can be uniquely written as a polynomial in e1, . . . , en. In particular, the power sums

pd = x1^d + x2^d + ⋯ + xn^d,  d ≥ 1,

can be written (uniquely) as polynomials in e1, . . . , en.

Let f(x) ∈ R[x] be a real polynomial of degree n. By the fundamental theorem of algebra, we can write

f(x) = c ∏_{j=1}^{n} (x − αj)

for complex numbers α1, . . . , αn (unique up to order) and a real number c ≠ 0 (the leading coefficient of f(x)). Thus if ek is the kth elementary symmetric function of α1, . . . , αn, then

f(x) = c(x^n − e1 x^{n−1} + e2 x^{n−2} − ⋯ + (−1)^n en). (2.1)


Write

pd = pd(α1, . . . , αn) = α1^d + ⋯ + αn^d.

By the fundamental theorem of symmetric functions and equation (2.1), pd is a polynomial (depending only on n) in the ej’s, so in particular pd ∈ R.

Given f(x) ∈ R[x], define the n × n real matrix A(f) by

A(f) = [ n        p1     p2       ⋯  p_{n−1}
         p1       p2     p3       ⋯  p_n
         ⋮                            ⋮
         p_{n−1}  p_n    p_{n+1}  ⋯  p_{2n−2} ]. (2.2)

3. Show that if the zeros of f(x) are real and distinct, then the matrix A(f) is positive definite, i.e., A(f) is a real symmetric matrix (and hence has real eigenvalues) for which every eigenvalue is positive.

Solution. Given f(x) = c ∏_{j=1}^{n} (x − αj), define the matrix

V = [ 1  α1  α1²  ⋯  α1^{n−1}
      1  α2  α2²  ⋯  α2^{n−1}
      ⋮                  ⋮
      1  αn  αn²  ⋯  αn^{n−1} ].

Note that

det V = ∏_{1≤i<j≤n} (αj − αi) ≠ 0, (2.3)

the well-known evaluation of the Vandermonde determinant.

If B is any n × n nonsingular real matrix, then B^t B (where t denotes transpose) is positive definite. Now note that V^t V = A(f), and the result follows.

A leading principal minor of an n × n matrix B is the determinant of the submatrix formed from the first k rows and first k columns, 1 ≤ k ≤ n. Recall from linear algebra that an n × n real symmetric matrix B is positive definite if and only if its n leading principal minors are positive. Let Lk(f) denote the kth leading principal minor of the matrix A(f). Thus L1(f) = n > 0. Hence to show that A(f) is positive definite, it suffices to verify the n − 1 inequalities Lk(f) > 0 for 2 ≤ k ≤ n.
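As a concrete check of the criterion, one can build A(f) from a set of rational zeros and compute its leading principal minors exactly; a Python sketch using exact fractions (the helper names are ours):

```python
from fractions import Fraction

def det(M):
    """Determinant by exact fraction Gaussian elimination."""
    M = [[Fraction(x) for x in row] for row in M]
    n, sign, d = len(M), 1, Fraction(1)
    for c in range(n):
        piv = next((r for r in range(c, n) if M[r][c] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != c:
            M[c], M[piv] = M[piv], M[c]
            sign = -sign
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= f * M[c][k]
    return sign * d

def leading_minors(zeros):
    """L_1(f), ..., L_n(f) for the monic f with the given zeros, where
    A(f) = [p_{i+j}] is the matrix of equation (2.2) built from the
    power sums p_d = alpha_1^d + ... + alpha_n^d (and p_0 = n)."""
    n = len(zeros)
    p = [sum(Fraction(a) ** d for a in zeros) for d in range(2 * n - 1)]
    A = [[p[i + j] for j in range(n)] for i in range(n)]
    return [det([row[:k] for row in A[:k]]) for k in range(1, n + 1)]

# f(x) = (x-1)(x-2)(x-3): zeros real and distinct, all minors positive
print([int(m) for m in leading_minors([1, 2, 3])])  # -> [3, 6, 4]
# f(x) = (x-1)^2 (x-2): a repeated zero forces the last minor to vanish
print([int(m) for m in leading_minors([1, 1, 2])])  # -> [3, 2, 0]
```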


4. Show that the necessary conditions Lk(f) > 0, 2 ≤ k ≤ n, for f(x) to have all its zeros real and distinct are also sufficient.

Solution. The proof requires a knowledge of the central result about real quadratic forms, which we now state. Let A = [aij], 0 ≤ i, j ≤ n − 1, be a real nonsingular symmetric n × n matrix. The quadratic form associated with A is the polynomial

SA(x) = ∑_{i,j=0}^{n−1} aij xi xj.

We can then write SA(x) in the form

SA(x) = ∑_{k=1}^{q} Zk² − ∑_{k=q+1}^{n} Zk², (2.4)

where each Zk is a (nonzero) real linear form, i.e.,

Zk = ∑_{h=0}^{n−1} bhk xh,  bhk ∈ R.

This representation is not unique, but the integer q is unique and isequal to the number of positive eigenvalues of A.

Returning to the matrix A = A(f), we compute that

SA(x) = ∑_{i,j=0}^{n−1} (α1^{i+j} + ⋯ + αn^{i+j}) xi xj = ∑_{k=1}^{n} (x0 + αk x1 + αk² x2 + ⋯ + αk^{n−1} x_{n−1})².

Let Zk = x0 + αk x1 + αk² x2 + ⋯ + αk^{n−1} x_{n−1}. If αk ∈ R, then Zk² contributes a positive term to the sum in equation (2.4). Otherwise let an overhead bar denote complex conjugation. The pair αk, ᾱk yields

Zk = Pk + iQk,  Z̄k = Pk − iQk,

where Pk and Qk are real linear forms. Note that

Zk² + Z̄k² = 2Pk² − 2Qk²,

thus contributing one positive and one negative term to (2.4). Hence if f(x) has exactly q real zeros, the number of positive eigenvalues of A is q + (n − q)/2 = (n + q)/2. In particular, A is positive definite (equivalently, all eigenvalues are positive) if and only if q = n, and the proof follows.

This argument is taken from Gantmacher [56, Chap. XV, §9.1].

Note. By a continuity argument, or a slightly more complicated version of the previous proof, we get that f(x) has only real zeros if and only if Lk(f) ≥ 0 for 2 ≤ k ≤ n.

5. For a monic polynomial f(x) = x^n + ax^{n−1} + bx^{n−2} + cx^{n−3} + dx^{n−4} + ⋯ of degree n, compute L2(f) and L3(f).

Solution. The answer is

L2(f) = (n − 1)a² − 2nb
L3(f) = −2(n − 1)a³c + (n − 2)a²b² − 4(n − 1)a²d + 2(5n − 6)abc − 4(n − 2)b³ + 8nbd − 9nc².

In particular, L2(f) has two terms, and L3(f) has seven terms.

6. How many terms does Lk(f) have for a monic polynomial f of sufficiently large degree?

Solution. Let t(k) be the number of terms. One can compute that t(4) = 34, t(5) = 204, and t(6) = 1409. The problem of computing t(k) in general has received little attention and is open. It is similar to finding the number of terms in the discriminant (discussed below) of a generic polynomial of degree n, which is also an open problem. This sequence begins (starting at n = 1) 1, 2, 5, 16, 59, 246, 1103, 5247, 26059.

The discriminant ∆(f) of f(x) = ∏_{j=1}^{n} (x − αj) is defined by

∆(f) = ∏_{1≤i<j≤n} (αj − αi)².


Since ∆(f) is a symmetric function of the αi’s, it can be written as a polynomial in the coefficients of f. More generally, if f(x) has leading coefficient a, then one should define

∆(f) = a^{2n−2} ∏_{1≤i<j≤n} (αj − αi)²

in order for ∆(f) to be a polynomial, i.e., to avoid negative powers of a. For instance,

∆(ax² + bx + c) = b² − 4ac
∆(ax³ + bx² + cx + d) = b²c² − 4ac³ − 4b³d − 27a²d² + 18abcd.

Note that it follows from equation (2.3) and the equation V^t V = A(f) that when f is monic we have Ln(f) = ∆(f).

7. Let a, b ∈ C and n ≥ 2. Find ∆(x^n + ax + b).

Solution. First note that if f(x) = ∏_{i=1}^{n} (x − αi), then

f′(x) = f(x) ∑_{i=1}^{n} 1/(x − αi).

It follows that

f′(αj) = ∏_{i≠j} (αj − αi).

Since ∆(f) = ∏_{1≤i<j≤n} (αi − αj)², we have

∆(f) = (−1)^{\binom{n}{2}} f′(α1)⋯f′(αn). (2.5)

Now if f(x) = x^n + ax + b then f′(x) = nx^{n−1} + a. Hence

αj f′(αj) = nαj^n + aαj = n(−aαj − b) + aαj = −(n − 1)aαj − bn.

Plugging into equation (2.5) and using

α1α2⋯αn = (−1)^n b

gives

∆(f) = (−1)^{\binom{n}{2}} (α1⋯αn)^{−1} ∏_{j=1}^{n} (−bn − (n − 1)aαj)
     = (−1)^{\binom{n}{2}+n} b^{−1} (n − 1)^n a^n ∏_{j=1}^{n} (−bn/((n − 1)a) − αj)
     = (−1)^{\binom{n}{2}+n} b^{−1} (n − 1)^n a^n [(−bn/((n − 1)a))^n − bn/(n − 1) + b]
     = (−1)^{\binom{n}{2}+n} b^{−1} (n − 1)^n a^n [(−bn/((n − 1)a))^n − b/(n − 1)]
     = (−1)^{\binom{n}{2}} n^n b^{n−1} + (−1)^{\binom{n−1}{2}} (n − 1)^{n−1} a^n.

8. More generally, show that for 0 < k < n we have

∆(x^n + ax^k + b) = (−1)^{\binom{n}{2}} b^{k−1} [n^N b^{N−K} − (−1)^N (n − k)^{N−K} k^K a^N]^d,

where d = gcd(n, k), N = n/d, and K = k/d.

Solution. The origins of the formula for ∆(x^n + ax^k + b) are obscure. Proofs were given by Swan [151] and by Greenfield and Drucker [72], based on the fact that the discriminant of a polynomial f(x) is essentially the resultant of f(x) and f′(x). See also Gelfand, Kapranov, and Zelevinsky [63, pp. 406–407] for another argument.
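The closed form of Problem 7 is easy to check against the quadratic and cubic discriminants displayed earlier; a small Python sketch (the function name is ours):

```python
from math import comb

def disc_trinomial(n, a, b):
    """The closed form Delta(x^n + a*x + b) derived in Problem 7."""
    return ((-1) ** comb(n, 2) * n ** n * b ** (n - 1)
            + (-1) ** comb(n - 1, 2) * (n - 1) ** (n - 1) * a ** n)

# Agrees with the displayed discriminants of ax^2+bx+c and ax^3+bx^2+cx+d
# specialized to x^2 + ax + b and x^3 + ax + b:
print(disc_trinomial(2, 5, 3))    # a^2 - 4b = 25 - 12 -> 13
print(disc_trinomial(3, -1, 1))   # -4a^3 - 27b^2 = 4 - 27 -> -23
```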

9. For n ≥ 1 let

fn(x) = ∑_{i=0}^{n} x^i/i!.

What is ∆(fn(x))?

Solution. Let fn(x) = (1/n!) ∏_{i=1}^{n} (x − αi). Then

f′n(α1)⋯f′n(αn) = (1/n!^n) ∏_{i≠j} (αj − αi). (2.6)

Hence

∆(fn) = ((−1)^{\binom{n}{2}} / n!^{n−2}) f′n(α1)⋯f′n(αn).

Now

f′n(x) = fn−1(x) = fn(x) − x^n/n!,

so f′n(αi) = −αi^n/n!. From equation (2.6) there follows

∆(fn) = ((−1)^{\binom{n}{2}} / n!^{n−2}) (−α1^n/n!)⋯(−αn^n/n!).

But α1⋯αn = (−1)^n n!, so we get

∆(fn) = (−1)^{\binom{n}{2}} / n!^{n−2}.

This result goes back at least to a more general result of Hilbert [77].

10. Show that

∆(∑_{i=0}^{n} \binom{n+α}{n−i} x^i/i!) = (1/n!^{2n−2}) ∏_{j=2}^{n} j^j (α + j)^{j−1}, (2.7)

where α ∈ C (or we could take α to be an indeterminate).

Solution. The polynomial of equation (2.7) is a generalized Laguerre polynomial. Its discriminant was found by Schur [132]. Sometimes a different but equivalent formula is given because of different conventions in defining the discriminant.

There is a completely different necessary and sufficient condition for a polynomial to have only real zeros which is useless for specific examples but of great theoretical interest. Given a real polynomial f(x) = a0 + a1x + ⋯ + an x^n, define an infinite matrix M(f) whose rows and columns are indexed by positive integers, with

M(f)ij = aj−i. (2.8)

We set ak = 0 if k < 0 or k > n. The matrix M(f) is an example of a Toeplitz matrix, i.e., a matrix with constant diagonals (parallel to the main diagonal).

11. Show that all the zeros of f(x) are real and nonpositive if and only if every minor of M(f) is nonnegative.

Solution. This remarkable result is due to Aissen, Schoenberg, and Whitney [4]. It is useless for specific examples because even for quadratic polynomials ax² + bx + c it is necessary to check infinitely many minors; there is no single minor equal to a positive multiple of b² − 4ac. Things really get interesting when we look at power series rather than polynomials, but this topic is beyond the scope of this book. It is part of the fascinating theory of total positivity. For further information see Fomin [48] and the references therein.

A useful necessary condition for only real zeros is due to Isaac Newton. Write a real polynomial of degree n in the form f(x) = ∑_{i=0}^{n} \binom{n}{i} ai x^i.

12. Show that if all the zeros of f(x) are real, then ai² ≥ ai−1 ai+1 for 1 ≤ i ≤ n − 1. In other words, the coefficients a0, a1, . . . , an are logarithmically concave, or log concave (sometimes written log-concave) for short.

Solution. The proof is based on the fact that if all the zeros of a real polynomial f(x) are real, then the same is true for its derivative f′(x). This is an immediate consequence of Rolle’s theorem, which implies that there is a zero of f′(x) between two consecutive zeros of f(x), together with the limiting case which states that if f(x) has a zero of order k at x = α, then f′(x) has a zero of order k − 1 at x = α. By the Fundamental Theorem of Algebra, f(x) has n zeros, counting multiplicity.

Let Q(x) = (d^{i−1}/dx^{i−1}) f(x). Thus Q(x) is a polynomial of degree n − i + 1 with only real zeros. Let R(x) = x^{n−i+1} Q(1/x), a polynomial of degree at most n − i + 1. The zeros of R(x) are just the reciprocals of those zeros of Q(x) not equal to 0, with possible new zeros at 0. At any rate, all zeros of R(x) are real. Now let S(x) = (d^{n−i−1}/dx^{n−i−1}) R(x), a polynomial of degree at most two. Then every zero of S(x) is real. An explicit computation yields

S(x) = (n!/2)(ai−1 x² + 2ai x + ai+1).

If ai−1 = 0 then trivially ai² ≥ ai−1 ai+1. Otherwise S(x) is a quadratic polynomial. Since it has real zeros, its discriminant ∆ is nonnegative. But

∆ = (2ai)² − 4ai−1 ai+1 = 4(ai² − ai−1 ai+1) ≥ 0,

so the sequence a0, a1, . . . , an is log-concave as claimed.

For information on Newton’s log-concavity theorem, see [76, p. 52].
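Newton’s inequalities give a quick computable test; a Python sketch (the function name is ours):

```python
from math import comb

def newton_log_concave(b):
    """Newton's necessary condition: for f(x) = sum b_i x^i of degree n,
    write b_i = C(n,i) a_i and check a_i^2 >= a_{i-1} a_{i+1} for all i.
    A real-rooted f must pass; passing does not guarantee real zeros."""
    n = len(b) - 1
    a = [b[i] / comb(n, i) for i in range(n + 1)]
    return all(a[i] ** 2 >= a[i - 1] * a[i + 1] for i in range(1, n))

# (x-1)(x-2)(x-3)(x-4) = 24 - 50x + 35x^2 - 10x^3 + x^4: all zeros real
print(newton_log_concave([24, -50, 35, -10, 1]))   # -> True
# x^2 + x + 1 has no real zeros, and indeed fails Newton's inequalities
print(newton_log_concave([1, 1, 1]))               # -> False
```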


13. Deduce from Problem 12 that if ∑_{i=0}^{n} bi x^i is a polynomial with only real zeros, then

bi² ≥ bi−1 bi+1.

Solution. From Problem 12 we have

bi²/\binom{n}{i}² ≥ (bi−1/\binom{n}{i−1}) (bi+1/\binom{n}{i+1}).

This simplifies to

bi² ≥ (1 + 1/i)(1 + 1/(n − i)) bi−1 bi+1, (2.9)

and the proof follows. Equation (2.9) is sometimes called the strong log-concavity of the sequence b0, b1, . . . , bn.

14. Suppose that A(x) = ∑i ai x^i and B(x) = ∑i bi x^i are polynomials with positive log-concave coefficients, so ai² ≥ ai−1 ai+1 and similarly for bi. Show that the product A(x)B(x) also has log-concave coefficients.

Solution. There are numerous ways to solve this problem. One could try a brute force computational approach, but this gets quite messy. We give an elegant proof based on linear algebra. The key result from linear algebra is the Cauchy-Binet formula (also called the Binet-Cauchy formula), as follows.

Let C be an m × n matrix and D an n × m matrix over a field K. We want to compute det CD. If m > n then it should be clear by a rank argument that det CD = 0, so assume m ≤ n. If S ⊆ [n], then let C[S] be the submatrix of C consisting of the columns of C indexed by elements of S, and similarly D⟨S⟩ for the rows of D. Then the Cauchy-Binet formula asserts that

det CD = ∑_S det C[S] ⋅ det D⟨S⟩,

where S ranges over all m-element subsets of [n].

Given a polynomial f(x) = ∑ ai x^i, consider the infinite matrix from equation (2.8) (or we could take a sufficiently large finite matrix)

M(f) = [ a0  a1  a2  a3  ⋯
         0   a0  a1  a2  ⋯
         0   0   a0  a1  ⋯
         0   0   0   a0  ⋯
         ⋮                 ].

It is easy to check that M(f)M(g) = M(fg). In fact, the map f(x) ↦ M(f) is an isomorphism from the polynomial ring R[x] to the ring of upper triangular real Toeplitz matrices [aj−i], i, j ≥ 0, with finitely many nonzero elements in each row.

We need to show that every 2 × 2 consecutive submatrix B (that is, the entries of B come from consecutive rows i, i + 1 and columns j, j + 1 of M(fg)) has a nonnegative determinant. Let M(f)[i, i + 1] be the submatrix of M(f) consisting of rows i and i + 1, and similarly M(g)⟨j, j + 1⟩ for the columns j and j + 1 of M(g). Then

B = M(f)[i, i + 1] ⋅ M(g)⟨j, j + 1⟩ (matrix multiplication).

By the Cauchy-Binet formula,

det B = ∑_{r<s} det C(r, s) ⋅ det D(r, s),

where C(r, s) is the 2 × 2 submatrix of M(f)[i, i + 1] consisting of columns r and s, while D(r, s) is the 2 × 2 submatrix of M(g)⟨j, j + 1⟩ consisting of rows r and s. Now from ai² ≥ ai−1 ai+1 and ai > 0 for all i it follows easily that for all i ≤ j and s ≥ 0 we have ai aj ≥ ai−s aj+s. Hence det C(r, s) ≥ 0 and det D(r, s) ≥ 0, and the proof follows.
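The statement itself is easy to test numerically before proving it; a Python sketch (the names are ours):

```python
def convolve(a, b):
    """Coefficient sequence of the product A(x)B(x)."""
    c = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            c[i + j] += x * y
    return c

def is_log_concave(c):
    return all(c[i] ** 2 >= c[i - 1] * c[i + 1] for i in range(1, len(c) - 1))

a = [1, 3, 4, 2]          # positive and log-concave (9 >= 4, 16 >= 6)
b = [2, 5, 6, 3]          # positive and log-concave (25 >= 12, 36 >= 15)
print(convolve(a, b))                  # -> [2, 11, 29, 45, 43, 24, 6]
print(is_log_concave(convolve(a, b)))  # -> True, as the problem asserts
```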

There are a myriad of special tricks and techniques for dealing with the zeros of polynomials in one variable. We give one such example here.

15. Let f(x) be a complex polynomial such that every zero has the same real part γ. Let u ∈ C with |u| = 1. Show that every zero of f(x + 1) + uf(x) has real part γ − 1/2.


Solution. We may assume that f(x) is monic. Then we can write

f(x) = ∏_j (x − γ − δj i),  δj ∈ R,

where i² = −1. If f(w + 1) + uf(w) = 0, then |f(w)| = |f(w + 1)|. Suppose that w = γ − 1/2 + η + θi, where η, θ ∈ R. Thus

|∏_j (−1/2 + η + (θ − δj)i)| = |∏_j (1/2 + η + (θ − δj)i)|. (2.10)

Note that

|−1/2 + η + (θ − δj)i|² = 1/4 − η + η² + (θ − δj)²
|1/2 + η + (θ − δj)i|² = 1/4 + η + η² + (θ − δj)².

Hence if η > 0 then

|−1/2 + η + (θ − δj)i| < |1/2 + η + (θ − δj)i|

for all j, while if η < 0 then the inequality goes the other way. Therefore equation (2.10) holds only if η = 0, and the proof follows.

16. For positive integers n and k, define

Pk,n(x) = ∑_{j=0}^{k} (−1)^{k−j} \binom{k}{j} (x + j)^n. (2.11)

Show that Pk,n(x) has positive coefficients.

Solution. We claim that every zero of Pk,n(x) has real part −k/2. It would then follow that Pk,n is a product of linear polynomials x + k/2 and quadratic polynomials

(x + k/2 + βi)(x + k/2 − βi) = x² + kx + k²/4 + β²,  β ∈ R.

Thus Pk,n(x) has positive coefficients, as desired.


To prove the claim, let E be the operator on polynomials P(x) defined by EP(x) = P(x + 1). Then

(E − 1)^k = ∑_{j=0}^{k} (−1)^{k−j} \binom{k}{j} E^j,

so

Pk,n(x) = (E − 1)^k x^n.

Every zero of x^n has real part 0. By what we have just shown (the case u = −1 of Problem 15), each time we apply E − 1 we lower the real part of every zero by 1/2. Thus every zero of Pk,n(x) has real part −k/2, and the proof follows.
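The coefficients of Pk,n(x) can be computed exactly from the defining sum (2.11); a Python sketch (the function name is ours):

```python
from math import comb

def finite_difference_poly(k, n):
    """Coefficients (low order first) of P_{k,n}(x) = (E-1)^k x^n =
    sum_j (-1)^(k-j) C(k,j) (x+j)^n, with trailing zero coefficients
    removed; the result has degree n - k."""
    c = [0] * (n + 1)
    for j in range(k + 1):
        s = (-1) ** (k - j) * comb(k, j)
        for i in range(n + 1):        # expand (x+j)^n by the binomial theorem
            c[i] += s * comb(n, i) * j ** (n - i)
    while len(c) > 1 and c[-1] == 0:
        c.pop()
    return c

print(finite_difference_poly(2, 4))                       # -> [14, 24, 12]
print(all(x > 0 for x in finite_difference_poly(3, 7)))   # -> True
```

For example, P_{2,4}(x) = 12x² + 24x + 14, whose zeros −1 ± i√(2/3)⋅… indeed have real part −k/2 = −1.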


Chapter 3

Base Mathematics

In this chapter we consider mostly problems related to writing numbers in some base b ≥ 2. We start with a well-known trifle.

1. What is the missing number below?

10, 11, 12, 13, 14, 15, 16, 17, 20, 22, 24, 31, 100, ?, 10000

Solution. In fact, the sequence is constant; all the terms are equal to 16. They are just written in different bases, beginning with base 16, then 15, 14, down to base 2. So the missing number is 16 in base 3, which is 121. This sequence appears in Clessa [36, puzzle 51].
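A few lines of Python reproduce the whole sequence (the function name is ours):

```python
def to_base(m, b):
    """Write m >= 1 in base b >= 2 (all digits here stay below 10)."""
    digits = ""
    while m:
        digits = str(m % b) + digits
        m //= b
    return digits

# 16 written in bases 16, 15, ..., 2 reproduces the puzzle sequence:
print([to_base(16, b) for b in range(16, 1, -1)])
# -> ['10', '11', '12', '13', '14', '15', '16', '17', '20', '22',
#     '24', '31', '100', '121', '10000']
```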

We make a short digression with two other "missing term" puzzles before returning to base mathematics.

2. Find the missing term (indicated by a question mark):

    [the terms, drawn as symbols in the original, are not reproduced]   ?

Hint. Note that each term is symmetrical about a vertical line. See [58, p. 161] and [60, Problem 1].


3. Consider the sequence

1, 1, 1, ?, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, . . . .

Given that the missing number f(4) is not 1, what is it?

Solution. Of course there are highly contrived answers like the sequence 1 + δ_{4n}, where δ_{ij} is the Kronecker delta, but this is hardly a satisfactory solution.

To obtain a mathematically interesting answer, let f(n) be the number of inequivalent differentiable structures that can be put on ℝ^n. To understand what this means, we need the definition of a differentiable manifold. Very roughly, it is a space M on which we can do calculus. More precisely, we want to cover M with open sets U_α such that we are given maps φ_α : U_α → ℝ^n that are homeomorphisms onto an open subset of ℝ^n, where the transition maps between the induced coordinate systems on an intersection U_α ∩ U_β are infinitely differentiable. The precise definition can be found in the literature. Similarly we can define what it means for two differentiable manifolds M and N to have equivalent (or diffeomorphic) differentiable structures: there should be a bijection f : M → N such that f and f⁻¹ are infinitely differentiable. Again the precise definition is easily found in the literature.

We can now return to the function f(n). By a result of Stallings, f(n) = 1 if n ≠ 4. Freedman showed that f(4) > 1, and Taubes showed that in fact f(4) = c, the cardinality of the continuum (or of the real numbers). Thus the missing number is c.

The upshot of these results is that there is only one way to do calculus on ℝ^n for n ≠ 4, but lots of ways for n = 4. Thus there is a good opportunity to write quite a few textbooks on four-dimensional calculus.

Let us consider some frivolous mathematics related to decimal expansions.

4. Note that

    100/9899 = 0.01010203050813213455⋯.

Taking the digits after the decimal point two at a time generates the Fibonacci numbers 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, . . . . Why?

Solution. We can write

    100/9899 = (1/100) / (1 − 1/100 − 1/10000).

If F_n is the nth Fibonacci number then

    ∑_{n≥0} F_{n+1} x^n = 1/(1 − x − x²).   (3.1)

Put x = 1/100 and divide by 100 to get

    ∑_{n≥1} F_n/100^n = 100/9899.

When F_n has at most two digits then F_n/100^n has the decimal expansion, starting after the decimal point, consisting of 2n − 2 0's followed by the digits of F_n (with a leading 0 if F_n has just one digit), so adding all of these up will give what you want, until there is some spillover from the first three-digit Fibonacci number.

Let us also note that the left-hand side of equation (3.1) converges for |x| < (−1 + √5)/2 = 0.618⋯, so there is no problem with convergence when x = 1/100.
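The pattern, and where the spillover breaks it, can be checked with exact long division. A sketch (ours):

```python
# Extract 20 decimal digits of 100/9899 by long division and read them in pairs.
num, den = 100, 9899
digits, r = [], num
for _ in range(20):
    r *= 10
    digits.append(r // den)
    r %= den
pairs = [10 * digits[i] + digits[i + 1] for i in range(0, 20, 2)]
print(pairs)

# The first ten Fibonacci numbers, for comparison.
fib = [1, 1]
while len(fib) < 10:
    fib.append(fib[-1] + fib[-2])
print(fib)
```

The eleventh pair would already be 90 rather than 89, the spillover from the three-digit Fibonacci number 144 mentioned above.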

5. In the same vein, we have

    5000 − 700√51 = 1.00010002000500140042013204291430⋯.   (3.2)

Looking at the digits in groups of four gives the sequence 1, 1, 2, 5, 14, 42, 132, 429, 1430, . . . . These are the ubiquitous Catalan numbers C_n = (1/(n+1)) (2n choose n). There are 214 combinatorial interpretations of these numbers in Stanley [145]. In fact, the OEIS¹ entry on Catalan numbers (A000108) states that "(t)his is probably the longest entry in the OEIS, and rightly so."

What accounts for equation (3.2)?

¹The Online Encyclopedia of Integer Sequences, oeis.org


Solution. Let C(x) = ∑_{n≥0} C_n x^n = 1 + x + 2x² + 5x³ + ⋯. It is well known that

    C(x) = (1 − √(1 − 4x)) / (2x).

Now note that

    C(1/10000) = 5000 − 700√51,

and argue as in the previous problem. It is a happy accident that the numerator of 1 − 4/10000, namely 2499, is divisible by 7², so C(1/10000) can be written using √51 rather than √2499.

We now take a slight detour in order to evaluate the sum ∑_{n≥0} 1/C_n.

6. Show that

    cos(t sin⁻¹ x) = ∑_{n≥0} (−1)^n t²(t² − 2²)(t² − 4²)⋯(t² − (2n−2)²) x^{2n}/(2n)!.   (3.3)

Solution. Observe that the coefficient of x^{2n}/(2n)! in cos(t sin⁻¹ x) is an even polynomial P_n(t) of degree 2n and leading coefficient (−1)^n. If k ∈ ℤ, then cos 2kθ is an even polynomial in cos θ of degree 2k. Moreover, cos²(sin⁻¹ x) = 1 − x². Hence cos(2k sin⁻¹ x) is an even polynomial in x of degree 2k. For instance,

    cos(4 sin⁻¹ x) = 8x⁴ − 8x² + 1.

It follows that P_n(±2k) = 0 for |k| < n. We now have sufficient information to determine P_n(t) uniquely.

7. Show that

    (sin⁻¹ x)² = ∑_{n≥1} [2^{2n−1} / (n² (2n choose n))] x^{2n}.   (3.4)

Solution. Consider the coefficient of t² in equation (3.3).

For another solution, square the much easier series

    sin⁻¹ x = ∑_{n≥0} (1/2^{2n}) (2n choose n) x^{2n+1}/(2n + 1)

and prove a suitable combinatorial identity (left as an exercise).


8. We now come to our Catalanic digression. What is the value of the sum

    S := ∑_{n≥0} 1/C_n ?

As a hint, the first four terms are

    1 + 1 + 1/2 + 1/5 = 2.7.

Solution. The obvious conjecture is

    S = 2 + 4√3 π/27 = 2.806133⋯.

To see this, apply the operator 2 (d/dx) x² (d/dx) x (d/dx) to equation (3.4). After simplification we obtain

    ∑_{n≥0} x^n/C_n = 2(x + 8)/(4 − x)² + 24√x sin⁻¹(√x/2) / (4 − x)^{5/2}.

Now set x = 1.

9. * Evaluate the sum ∑_{n≥0} (4 − 3n)/C_n.

There isn't much connection between base b representations of real numbers and "serious" mathematics. However, there are some results in this direction, of which we now give a sample. In these problems, p always denotes a prime number.

10. Let n = ∑_{i≥0} a_i p^i and k = ∑_{i≥0} b_i p^i be the base p expansions of the positive integers n and k (so 0 ≤ a_i < p and 0 ≤ b_i < p). Show that

    (n choose k) ≡ ∏_{i≥0} (a_i choose b_i)   (mod p).

Solution. If f(x) and g(x) are polynomials with integer coefficients, write f(x) ≡ g(x) (mod p), or simply f(x) ≡ g(x), to mean that all coefficients of f(x) − g(x) are divisible by p. Since (1 + x)^p ≡ 1 + x^p (mod p) (the so-called "freshman's dream"), we have

    ∑_{k≥0} (n choose k) x^k = (1 + x)^n
                             = (1 + x)^{∑ a_i p^i}
                             ≡ ∏_{i≥0} (1 + x^{p^i})^{a_i}
                             ≡ ∏_{i≥0} (∑_{b_i≥0} (a_i choose b_i) x^{b_i p^i}).

Taking the coefficient of x^k on both sides and using the uniqueness of the base p expansion k = ∑ b_i p^i gives the result.

This result is known as Lucas's theorem; it first appeared in an 1878 paper of Édouard Lucas.
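Lucas's theorem is easy to test exhaustively for small parameters. A sketch (ours):

```python
from math import comb

def lucas_rhs(n, k, p):
    """Product of binom(a_i, b_i) over the base-p digits of n and k."""
    prod = 1
    while n or k:
        prod *= comb(n % p, k % p)  # comb returns 0 when the digit of k exceeds that of n
        n //= p
        k //= p
    return prod

# Check binom(n, k) = prod binom(a_i, b_i) (mod p) for small n, k and primes p.
for p in (2, 3, 5, 7):
    for n in range(60):
        for k in range(n + 1):
            assert comb(n, k) % p == lucas_rhs(n, k, p) % p
print("Lucas's theorem verified for p = 2, 3, 5, 7 and n < 60")
```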

11. How many numbers in the nth row of Pascal's triangle, i.e., the numbers (n choose k) for k ≥ 0, are not divisible by p?

Solution. If n = ∑ a_i p^i as before, then the number is ∏_i (1 + a_i). By Problem 10, (n choose k) is nonzero modulo p exactly when each digit b_i of k satisfies b_i ≤ a_i, so there are a_i + 1 choices for each b_i.

12. Show that the largest power of p dividing (n choose k) is equal to the number of carries in adding k and n − k in base p using the usual addition algorithm.

Solution. Recall de Polignac's formula (also known as Legendre's formula), after Alphonse de Polignac (18??) and Adrien-Marie Legendre (1830): the exponent of the largest power of p dividing n! is equal to

    ⌊n/p⌋ + ⌊n/p²⌋ + ⌊n/p³⌋ + ⋯.

Using this formula it is not difficult to complete the proof, keeping careful track of the largest power of p dividing n!, k!, and (n − k)!.

This result is known as Kummer's theorem, after Ernst Kummer (1852).
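Kummer's theorem is also easy to verify exhaustively for small parameters. A sketch (ours):

```python
from math import comb

def vp(m, p):
    """Exponent of the largest power of p dividing m (m >= 1)."""
    e = 0
    while m % p == 0:
        m //= p
        e += 1
    return e

def carries(a, b, p):
    """Number of carries when adding a and b in base p."""
    count, carry = 0, 0
    while a or b or carry:
        carry = 1 if a % p + b % p + carry >= p else 0
        count += carry
        a //= p
        b //= p
    return count

# Kummer: v_p(binom(n, k)) equals the carries in k + (n - k), base p.
for p in (2, 3, 5):
    for n in range(1, 80):
        for k in range(n + 1):
            assert vp(comb(n, k), p) == carries(k, n - k, p)
print("Kummer's theorem verified for p = 2, 3, 5 and n < 80")
```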

13. For a positive integer c let ν_p(c) denote the largest integer d such that p^d divides c. Let

    H_m = ∏_{i=0}^{m} ∏_{j=0}^{m} (i + j choose i).

Show that for n ≥ 1 we have

    ν_p(H_{p^n−1}) = (1/2) ((n − 1/(p − 1)) p^{2n} + p^n/(p − 1)).

Solution. There is a solution which directly uses de Polignac's formula for ν_p(m!), but instead we use Problem 12. We thus want the total number of carries when adding all n-digit numbers i and j in base p, allowing 0 as a leading digit.

Let i = ∑_{k=0}^{n−1} a_k p^k and j = ∑_{k=0}^{n−1} b_k p^k be the base p expansions of i and j. There are two ways we can have a carry in position k (where a_k and b_k are in position k). The first is that a_k + b_k ≥ p. There are then (p choose 2) choices for the pair (a_k, b_k) and p^{2(n−1)} choices for the remaining digits. There are n choices for k, so the total number of such carries is X = n (p choose 2) p^{2(n−1)}.

The second way we can have a carry in position i is that i ≥ 1 and for some j < i we have a_i + b_i = a_{i−1} + b_{i−1} = ⋯ = a_{j+1} + b_{j+1} = p − 1 and a_j + b_j ≥ p. There are p^{i−j} choices for positions j + 1, . . . , i. There are (p choose 2) choices for position j. For the remaining n − i + j − 1 positions there are p^{2(n−i+j−1)} choices. Hence the total number of carries in this case is

    Y = ∑_{i=1}^{n−1} ∑_{j=0}^{i−1} p^{i−j} (p choose 2) p^{2(n−i+j−1)}.

It is now routine to compute that

    ν_p(H_{p^n−1}) = X + Y = (1/2) ((n − 1/(p − 1)) p^{2n} + p^n/(p − 1)).

This result is due to R. Stanley [?].

Note. Let G_m = ∏_{k=0}^{m} (m choose k). The numbers ν_p(G_m) are discussed by J. C. Lagarias and H. Mehta [92]. There don't seem to be such explicit formulas for G_{p^n−1} as for H_{p^n−1}.

14. Define a sequence a_0, a_1, . . . of nonnegative integers recursively as follows: a_n is the least nonnegative integer greater than a_{n−1} such that no three of the numbers a_0, a_1, . . . , a_n form an arithmetic progression. Thus a_0 = 0 and a_1 = 1. Since 0, 1, 2 form an arithmetic progression, we have a_2 = 3. Similarly a_3 = 4. Since we have the arithmetic progressions 1, 3, 5; 0, 3, 6; 1, 4, 7; 0, 4, 8, we have a_4 = 9. What is a_n, e.g., a_1000000?

Solution. Write n in binary and read it in ternary! For instance,

    1000000 = 2^19 + 2^18 + 2^17 + 2^16 + 2^14 + 2^9 + 2^6,

so

    a_1000000 = 3^19 + 3^18 + 3^17 + 3^16 + 3^14 + 3^9 + 3^6.

It's easy to prove this result by induction once it's guessed.
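Both the greedy definition and the binary-to-ternary description are easy to implement, so the guess can be tested before proving it. A sketch (ours):

```python
def greedy_no_3ap(m):
    """First m terms of the greedy sequence with no 3-term arithmetic progression."""
    seq = []
    t = 0
    while len(seq) < m:
        chosen = set(seq)
        # t is allowed unless t - i and t - 2i are both already chosen for some i.
        if all(not (t - i in chosen and t - 2 * i in chosen)
               for i in range(1, t // 2 + 1)):
            seq.append(t)
        t += 1
    return seq

def binary_read_in_ternary(n):
    """Write n in binary and read the result in base 3."""
    return int(bin(n)[2:], 3)

print(greedy_no_3ap(12))
print([binary_read_in_ternary(n) for n in range(12)])
```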

There are lots of known generalizations. For instance, instead of starting with a_0 = 0, we could instead let k > 0 and start with a_0 = 0 and a_1 = k. For n ≥ 2 we define a_n to be the least positive integer greater than a_{n−1} so that no three of a_0, a_1, . . . , a_n are in arithmetic progression. A curious dichotomy then arises. When k is of the form 3^j or 2·3^j, which we call the regular values of k, there is a nice explicit description of a_n analogous to the case k = 1 (the case we just considered). For regular k the rate of growth of a_n is given by

    1/2 = liminf_{n→∞} a_n/n^α < limsup_{n→∞} a_n/n^α = 1,   (3.5)

where α = log 3/log 2 = 1.58496⋯.

For other (irregular) values of k, the sequence appears to be very unpredictable. Recall the notation f(n) ∼ g(n), read "f(n) is asymptotic to g(n)," if

    lim_{n→∞} f(n)/g(n) = 1.

There is a simple heuristic argument about the rate of growth of a_n if the sequence behaves sufficiently randomly. Think of p_n as the "probability" that n appears in the sequence. Now n will appear in the sequence if and only if there is no i with 1 ≤ i ≤ n/2 for which n − i and n − 2i both appear. If these events are independent then we obtain

    p_n = ∏_{i=1}^{⌊n/2⌋} (1 − p_{n−i} p_{n−2i}),   n ≥ 2.   (3.6)

For the recurrence (3.6) with the initial conditions p_0 = p_1 = .5, which corresponds to a "random" start, one can show that

    p_1 + p_2 + ⋯ + p_n ∼ c√(n log n).

Hence when k is irregular we would expect that n ∼ √(a_n log a_n), or

    a_n ∼ c′ n²/log n.   (3.7)

This estimate agrees quite well with the numerical evidence, but nothing has been proved. Note that equation (3.7) yields a faster rate of growth than (3.5).

For the greedy construction of sequences containing no three terms in arithmetic progression, see Odlyzko and Stanley [107]. For some generalizations and further work, see [42], [93], [103], [104] and the references therein.

Note. Let us mention the original motivation for the first sequence (corresponding to k = 1) 0, 1, 3, 4, 9, 10, . . . . We can ask for the largest subset of [n] containing no three terms in arithmetic progression. Let r_3(n) denote the number of elements of this subset. A natural conjecture is that we can construct this subset using a greedy algorithm, which leads to our first sequence. However, this conjecture is false. The best current bounds are

    n·2^{−√(8 log n)} < r_3(n) < c n (log log n)⁴/log n,   (3.8)

for some constant c > 0.

15. For any 0 ≤ k ≤ 9 there are infinitely many primes that don't have the digit k in their decimal expansion.

Solution. This is a deep result of James Maynard [98]. The proof is a very intricate argument using many tools from analytic number theory. Maynard's result is probably quite far from the strongest result along these lines, since for instance it is plausible on probabilistic grounds, related to the fact that the sum of the reciprocals of the primes diverges, that there are infinitely many primes of the form (10^n − 1)/9, i.e., with only the digit 1 in their decimal expansion.

16. * Are there infinitely many primes whose decimal expansion uses only the digits 0, 2, 4, 6, 8?

17. Give a simple proof (using only elementary calculus) that ∑_p 1/p diverges, where p ranges over all primes.

Solution. Consider the product

    f(x) = ∏_{p<x} 1/(1 − 1/p) = ∏_{p<x} (1 + 1/p + 1/p² + ⋯),

where p is prime. When we expand this product, we obtain a sum of terms 1/n, including all n < x, since all prime factors of such n are also less than x. Thus

    f(x) > ∑_{1≤n<x} 1/n → ∞ as x → ∞.

Hence the product ∏_p 1/(1 − 1/p) diverges to ∞, so ∏_p (1 − 1/p) diverges to 0.

By a theorem of elementary calculus, if a_1, a_2, . . . is a sequence of positive real numbers converging to 0, then ∏_k (1 − a_k) diverges to 0 if and only if ∑_k a_k diverges to ∞. Hence ∑_p 1/p diverges.

This proof is essentially the limiting s = 1 case of Euler's famous product formula for the Riemann zeta function:

    ζ(s) := ∑_{n≥1} 1/n^s = ∏_p 1/(1 − p^{−s}),

which converges for s > 1.

18. Find an algorithm for computing the nth binary digit of π without computing any of the previous digits.

Solution. It seems rather remarkable that such an algorithm exists, but Bailey, Borwein, and Plouffe [14] found a so-called spigot algorithm which does the trick. It is based on the formula

    π = ∑_{k≥0} (1/16^k) (4/(8k+1) − 2/(8k+4) − 1/(8k+5) − 1/(8k+6)),

discovered by Plouffe in 1995.
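We do not reproduce the digit-extraction algorithm here (it combines this formula with modular exponentiation), but the formula itself is easy to check numerically; each term contributes roughly one more hexadecimal digit. A sketch (ours):

```python
import math

# Partial sums of the Bailey-Borwein-Plouffe series for pi.
s = 0.0
for k in range(12):
    s += (1 / 16 ** k) * (4 / (8 * k + 1) - 2 / (8 * k + 4)
                          - 1 / (8 * k + 5) - 1 / (8 * k + 6))
print(s, math.pi)
```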


We now come to a couple of practical applications of base mathematics, other than such obvious ones as doing arithmetic. The first application is based on the Morse-Hedlund sequence, also called the Prouhet-Thue-Morse sequence. One way to define it is recursively. Set t_0 = 0. Once we have defined t_0, t_1, . . . , t_{2^n−1}, define the next 2^n terms to be the complement of the first 2^n terms, i.e., t_{2^n+j} = 1 − t_j for 0 ≤ j ≤ 2^n − 1. Thus we build up the sequence as 0, then 01, then 0110, then 01101001, etc. A good survey on the Morse-Hedlund sequence is by Jean-Paul Allouche and Jeffrey Shallit [6].

19. Give a simple description of tn.

Solution. The term t_n is congruent modulo 2 to the number of 1's in the binary expansion of n. Once this result is guessed, it is straightforward to prove.
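Both descriptions are one-liners to compare; a sketch (ours), using Python's `bin` to count binary 1's:

```python
def thue_morse(nbits):
    """First 2**nbits terms by the doubling construction: append the complement."""
    t = [0]
    for _ in range(nbits):
        t += [1 - x for x in t]
    return t

# t_n is the parity of the number of 1's in the binary expansion of n.
t = thue_morse(10)
assert all(t[n] == bin(n).count("1") % 2 for n in range(len(t)))
print("".join(map(str, t[:16])))  # 0110100110010110
```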

20. * Show that the Morse-Hedlund sequence is cubefree, i.e., we cannot find a factor (a sequence of consecutive terms) of the form www, where w is any nonempty word (sequence) in the letters 0 and 1.

21. Does there exist an infinite binary sequence that is in fact squarefree, i.e., has no factor ww, where w is nonempty?

Solution. If we begin with 0, say, then the next term must be 1, then 0, and then we are stuck: 0100 contains the square 00, and 0101 is itself a square. Thus no such sequence exists.

22. Does there exist an infinite sequence of 0's, 1's, and 2's that is squarefree?

Solution. Let v_n be the number of 1's between the nth and (n+1)st occurrence of 0 in the Morse-Hedlund sequence t_0, t_1, . . . , so (v_0, v_1, . . . ) = (2, 1, 0, 2, 0, 1, 2, 1, 0, 1, 2, . . . ). It is a nice exercise to show that this sequence is squarefree. There are many other solutions.

23. Our "practical application" of the Morse-Hedlund sequence could be called a "psychological" property. It is another fair allocation problem similar to the fair cake-cutting we discussed in Chapter 1. Suppose a very generous mother (perhaps too generous) brings home sixteen roughly comparable gifts for her two young children. She flips a coin to see who chooses the first gift, and then the children sequentially choose their favorite remaining gift. What is the fairest way for the children to choose eight gifts each, that is, in what order should the choosing be done? Obviously it would be very unfair to let one child have the first eight choices, for instance.

Solution. If there were just two gifts, then there is no choice. One child chooses first, and the other takes the remaining gift. With four gifts one child (denoted 0) chooses first, and the other (denoted 1) should choose second. Clearly it would be more fair for child 1 to choose next. Otherwise, if the children valued the gifts equally, the gifts can be put in pairs such that each child receives one gift in each pair, and 0 always has the better gift in each pair. Therefore the fairest sequence is 0110. Now if there were eight gifts, it seems reasonable that the same procedure for the second four gifts should be followed as for the first four, but with the roles of 0 and 1 reversed. Thus we get 01101001. By the same reasoning, for sixteen gifts we get 0110100110010110. For 2^n gifts we are using the first 2^n terms of the Morse-Hedlund sequence.

Note. We have not really proved anything since we haven't defined what it means for one sequence of choices to be fairer than another. We could imagine other possibilities. For instance, suppose that for any positive integer m there are 2m gifts. If one child chooses at times a_1, . . . , a_m and the other at b_1, . . . , b_m (so {a_1, . . . , a_m, b_1, . . . , b_m} = [2m]), then we could try to maximize k such that ∑_i a_i^j = ∑_i b_i^j for all 1 ≤ j ≤ k.

24. * Let P(x) = ∑_i x^{a_i} − ∑_i x^{b_i}, and let k be the largest nonnegative integer for which ∑_i a_i^j = ∑_i b_i^j for all 1 ≤ j ≤ k. Show that (x − 1)^{k+1} is the largest power of x − 1 dividing P(x).

25. What is the polynomial P(x) for the first 2^n terms of the Morse-Hedlund sequence, and what is the largest power of x − 1 dividing this polynomial?

Solution. We have P(x) = (1 − x)(1 − x²)(1 − x⁴)⋯(1 − x^{2^{n−1}}), which has x = 1 as a zero of multiplicity n. The proof is straightforward.

26. Can P(x) be divisible by (x − 1)^{n+1} for some sequence of ±1's of length 2^n?

Solution. Somewhat surprisingly, this problem is open for n ≥ 8. Moreover, it is known that there exists a polynomial p(x) of degree 47 with coefficients ±1 that is divisible by (x − 1)⁶. See [140].
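Problems 24 and 25 together say that for 2^n gifts the two children's choosing times have equal sums of jth powers for all j ≤ n − 1, a classical fact usually attributed to Prouhet. A sketch (ours) checking this for n = 4:

```python
def thue_morse_split(nbits):
    """Times 1..2**nbits split by the Thue-Morse sequence (child 0 vs child 1)."""
    t = [0]
    for _ in range(nbits):
        t += [1 - x for x in t]
    a = [i + 1 for i, x in enumerate(t) if x == 0]
    b = [i + 1 for i, x in enumerate(t) if x == 1]
    return a, b

n = 4
a, b = thue_morse_split(n)
# Equal power sums for j = 1, ..., n - 1 ...
for j in range(1, n):
    assert sum(x ** j for x in a) == sum(x ** j for x in b)
# ... but not for j = n, consistent with (x - 1)^n exactly dividing P(x).
assert sum(x ** n for x in a) != sum(x ** n for x in b)
print(a, b)
```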


27. As a kind of "opposite" problem to Problem 22, let w be an infinite binary sequence b_1 b_2 ⋯ with the property that for all n sufficiently large, the prefix b_1 b_2 ⋯ b_n ends in a square, i.e., has the form uv², where u, v are binary words and v is nonempty. Does it follow that w is eventually periodic? What if we also require that the v's have bounded length?

Solution. The answer is no, even if the v's have bounded length. Set w_0 = 0 and w_1 = 01. Then recursively define

    w_n = w_{n−1} w_{n−2},   n ≥ 2,

where w_{n−1}w_{n−2} denotes the product (concatenation) of words. Define the Fibonacci word

    w = lim_{n→∞} w_n = 01001010010010100101001⋯.

It is easy to see that w is not eventually periodic. By analyzing a few cases, it is not hard to show that every prefix of length at least six can be written as uv², where v is a nonempty word of length at most five. The Fibonacci word also has the interesting characterization of being the unique nonempty binary word which is invariant under replacing each 0 with 01 and each 1 with 0.
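The square-tail property is easy to check by machine. A sketch (ours) that builds a long prefix of w via the substitution 0 → 01, 1 → 0 and verifies the claim up to length 200:

```python
def fibonacci_word(nchars):
    """Prefix of the Fibonacci word via the substitution 0 -> 01, 1 -> 0."""
    w = "0"
    while len(w) < nchars:
        w = "".join("01" if c == "0" else "0" for c in w)
    return w[:nchars]

def shortest_square_tail(prefix):
    """Length of the shortest nonempty v with prefix = u v v, or None."""
    for v_len in range(1, len(prefix) // 2 + 1):
        if prefix[-v_len:] == prefix[-2 * v_len:-v_len]:
            return v_len
    return None

w = fibonacci_word(200)
assert w.startswith("01001010010010100101001")
# Every prefix of length >= 6 ends in a square with 1 <= |v| <= 5.
assert all(1 <= shortest_square_tail(w[:n]) <= 5 for n in range(6, 201))
print("square-tail property verified for prefixes up to length 200")
```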

28. Find the average length of v (as defined above), where we always take v to be as short as possible. More precisely, if γ_n is the prefix of w of length n, then define a_n for n ≥ 6 to be the length of the shortest nonempty word v for which γ_n = uv². For instance,

    (a_6, a_7, . . . ) = (3, 2, 2, 1, 5, 3, 1, 3, 3, 2, 2, 1, 5, 3, . . . ).

Find

    L := lim_{n→∞} (1/n)(a_6 + a_7 + ⋯ + a_n).

Solution. The answer is

    L = 7 − 2√5 = 2.527864⋯.

[to be completed]


29. Finally we come to the second practical application of number bases, namely, to the problem of obesity. Explain how base mathematics can contribute to this problem.

Solution. A mathematician can assist a person wishing to lose (or gain) weight by choosing the appropriate base in which to write the weight. For instance, if a person weighed 215 pounds and had an ideal weight of 118 pounds, the mathematician would determine that the best base is 14. The weight would instantly become 115 pounds with no diet or exercise! Adjustments would have to be made if a digit greater than 9 were involved. For instance, a 300 pound person would weigh 150 pounds in base 15. If he or she could get down to 290 pounds by ordinary means, then they would weigh only 122 pounds in base 16.

The motivation for this revolutionary weight loss method came from women's clothing sizes. A size 16 dress in 1958 would be around size 8 today.

Chapter 4

Three Triangles

The triangles we consider in this chapter are not the geometric variety, but rather triangular arrays of integers. The mother of all such triangles is Pascal's arithmetic triangle P, or just Pascal's triangle or the arithmetic triangle, or less commonly Pingala's Meruprastara, after the Indian mathematician Acharya Pingala from the 3rd/2nd century BC. The first six rows of this triangle (beginning with row 0) look like

                 1
               1   1
             1   2   1
           1   3   3   1
         1   4   6   4   1
       1   5  10  10   5   1

The kth entry in row n, calling the leftmost entry in each row the 0th entry, is denoted (n choose k) and of course is called a binomial coefficient. If k > n then (n choose k) = 0. The defining recurrence of Pascal's triangle is

    (n choose k) = (n−1 choose k−1) + (n−1 choose k).

Note. The number of digits needed to write the nth row in decimal notation is about n²/(2 ln 10), so if each digit is written the same size then the array is not really a triangle. We get a triangle if each entry (n choose k) takes up the same amount of space, but then the entries are going to be very hard to read for large n.

Anyway, Pascal's triangle has a lot of interesting properties, such as the row sums ∑_{k=0}^{n} (n choose k) being equal to 2^n and certain diagonal sums ∑_{k=0}^{n} (n−k choose k) being equal to the Fibonacci number F_{n+1}. There are a myriad of further identities like

    ∑_{i=0}^{k} (a choose i) (b choose k−i) = (a+b choose k),

which includes as a special case, using the symmetry (n choose k) = (n choose n−k),

    ∑_{i=0}^{n} (n choose i)² = (2n choose n).

We can extend the definition to (α choose k) for any complex number (or indeterminate) α and k ∈ ℕ by

    (α choose k) = α(α − 1)⋯(α − k + 1)/k!.   (4.1)

We then have the identity (Taylor series expansion)

    ∑_{n≥0} (α choose n) x^n = (1 + x)^α,   (4.2)

called Newton's binomial theorem. When α ∈ ℕ then the sum on the left-hand side of equation (4.2) terminates, and we obtain the familiar binomial theorem:

    ∑_{k=0}^{m} (m choose k) x^k = (1 + x)^m.

Some other identities are

    ∑_{i=0}^{2n} (−1)^i (2n choose i)³ = (−1)^n (3n)!/n!³

    ∑_{k=0}^{n} (α+k choose k) = (α+n+1 choose n)

    ∑_{i=0}^{n} (2i choose i) (2(n−i) choose n−i) = 4^n   (4.3)

    ∑_{k=0}^{min(a,b)} (α+β+k choose k) (β choose a−k) (α choose b−k) = (α+a choose b) (β+b choose a).

On the other hand, there is no simple formula known (and it is highly unlikely that one exists) for b₃(n) = ∑_{i=0}^{n} (n choose i)³.

1. Show that

    ∑_{n≥0} (2n choose n) x^n = 1/√(1 − 4x).   (4.4)

Solution. ??

2. (a) Prove (4.3) using equation (4.4).

   (b) Give a combinatorial proof.

Solution. (a) Square both sides of equation (4.4) and take the coefficient of x^n.

(b) It is surprisingly difficult to give a combinatorial proof of this simple identity. For a discussion see ??.
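Identity (4.3) is at least easy to spot-check by machine. A one-loop sketch (ours):

```python
from math import comb

# Identity (4.3): sum_i binom(2i, i) * binom(2(n-i), n-i) = 4**n.
for n in range(20):
    assert sum(comb(2 * i, i) * comb(2 * (n - i), n - i)
               for i in range(n + 1)) == 4 ** n
print("identity (4.3) verified for n < 20")
```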

Binomial coefficients also have a lot of congruence and divisibility properties, as we saw when we discussed Lucas' theorem and Kummer's theorem in Chapter 3.

We now consider three further triangular arrays. The first is a multiplicative analogue of Pascal's triangle, which we call the multiplicative triangle, denoted B. The initial conditions are that every row begins and ends with a one, just like Pascal's triangle. Using the same indexing conventions as before, denote the kth entry in row n of B by ⟦n,k⟧. Then we have the defining recurrence

    ⟦n,k⟧ = ⟦n−1,k−1⟧ · ⟦n−1,k⟧,   0 < k < n.

The first six rows (beginning with row 0) look like

          1
         1 1
        1 1 1
       1 1 1 1
      1 1 1 1 1
     1 1 1 1 1 1

3. * Show that

    ∑_{k=0}^{n} ⟦n,k⟧ = n + 1

    ∑_{k=0}^{n} ⟦n,k⟧² = n + 1.

4. * What can be said about ∑_{k=0}^{n} ⟦n,k⟧^r for r ≥ 2?

Rather than dwell on further amazing properties of the multiplicative triangle, let us turn to another triangular array that may prove to be more of a challenge.

We have the same initial conditions as in Pascal's triangle, that is, row 0 consists of one 1, and in each subsequent row we place a 1 at the beginning and end. We add two adjacent numbers just as in Pascal's triangle, but we also take each number in row n and place a copy directly below it in row n + 1. It would take too much space to write down the entire array, but here are the first few rows.

    1
    1 1 1
    1 1 2 1 2 1 1
    1 1 2 1 3 2 3 1 3 2 3 1 2 1 1

5. How many entries are in row n?

Solution. 2^{n+1} − 1.

We call this array Stern's triangle for a reason to be explained later, and denote it by S. The nomenclature "triangle" is even more problematical than for Pascal's triangle, but we will use it anyway. Even if we adjust the font size of each entry so all entries take up the same amount of space, the two boundary edges are not straight lines as in Pascal's triangle, but rather exponential curves.


6. Find the sum u₁(n) of the elements in row n.

Solution. Each entry contributes three times to the next row: once to the left, once to the right, and once directly below. Hence u₁(n+1) = 3u₁(n). Using the initial condition u₁(0) = 1 we get u₁(n) = 3^n.

7. Find the largest entry g(n) in row n.

Solution. The sequence g(0), g(1), . . . begins 1, 1, 2, 3, 5, 8, 13, . . . , so there is the obvious conjecture g(n) = F_{n+1}, a Fibonacci number, defined by the initial conditions F₁ = F₂ = 1 and the recurrence F_{n+1} = F_n + F_{n−1}.

For the proof, note that of any two consecutive entries in a row, one was brought down from the previous row, so g(n+1) ≤ g(n) + g(n−1). If g(n−1) and g(n) do appear consecutively in row n, then g(n−1) + g(n) and g(n) appear consecutively in row n + 1. Since g(0) and g(1) are consecutive in row 1, we have by induction that g(n+1) = g(n) + g(n−1). It remains only to check the initial conditions g(0) = F₁ = 1 and g(1) = F₂ = 1.

8. Let ⟨n,k⟩ be the kth entry in row n of Stern's triangle, beginning with the 0th entry, so ⟨n,0⟩ = ⟨n, 2^{n+1}−2⟩ = 1. Find a simple formula for the polynomial

    U_n(x) = ∑_{k=0}^{2^{n+1}−2} ⟨n,k⟩ x^k.

This will be the "Stern analogue" of the binomial theorem.

Solution. The numbers in row n + 1 brought straight down from row n contribute xU_n(x²) to U_{n+1}(x). The numbers brought down to the left contribute U_n(x²), and to the right contribute x²U_n(x²). Hence

    U_{n+1}(x) = (1 + x + x²) U_n(x²).

Using the initial condition U₀(x) = 1, we obtain

    U_n(x) = ∏_{i=0}^{n−1} (1 + x^{2^i} + x^{2·2^i}),   n ≥ 1.   (4.5)
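Equation (4.5) can be checked directly against the triangle. A sketch (ours), where `stern_rows` implements the construction described above:

```python
def stern_rows(m):
    """Rows 0..m of Stern's triangle."""
    rows = [[1]]
    for _ in range(m):
        prev = rows[-1]
        new = [1]
        for i, a in enumerate(prev):
            new.append(a)  # the copy brought straight down
            # sum with the right neighbor, or the trailing boundary 1
            new.append(a + prev[i + 1] if i + 1 < len(prev) else 1)
        rows.append(new)
    return rows

def product_formula(n):
    """Coefficient list of prod_{i<n} (1 + x**(2**i) + x**(2*2**i))."""
    coeffs = [1]
    for i in range(n):
        step = [0] * (len(coeffs) + 2 * 2 ** i)
        for j, c in enumerate(coeffs):
            step[j] += c
            step[j + 2 ** i] += c
            step[j + 2 * 2 ** i] += c
        coeffs = step
    return coeffs

rows = stern_rows(6)
for n in range(1, 7):
    assert rows[n] == product_formula(n)
print("U_n(x) matches rows 1..6 of Stern's triangle")
```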

Equation (4.5) is a good analogue of the binomial theorem, because it is a nice product formula. It is more traditional to use a similar array that begins with two 1's in the first row. It looks as follows:

    1 1
    1 2 1
    1 3 2 3 1
    1 4 3 5 2 5 3 4 1
    1 5 4 7 3 8 5 7 2 7 5 8 3 7 4 5 1

This array is called Stern's diatomic array. It is essentially equivalent to Stern's triangle S, but S is preferable because of its more elegant properties such as the analogue of the binomial theorem. All the properties of S discussed below can be easily carried over to Stern's diatomic array.

9. Find the alternating sum ∑_k (−1)^k ⟨n,k⟩ of the entries in row n of S.

Solution. Put x = −1 in the product formula (4.5). We get 3^{n−1} for n ≥ 1. This argument is completely analogous to showing that ∑_k (−1)^k (n choose k) = 0 by setting x = −1 in the binomial theorem (x + 1)^n = ∑_k (n choose k) x^k.

10. Find

    ∑_k ω^k ⟨n,k⟩,

where ω = e^{2πi/3}, a primitive cube root of unity.

Solution. We get 0 if n ≥ 1 and 1 if n = 0, by letting x = ω in U_n(x) for n ≥ 1 and using 1 + ω + ω² = 0. We also need that 2^i and 2·2^i are incongruent and nonzero modulo 3.

11. Find the sum

    ⟨n,0⟩ + ⟨n,1⟩ − ⟨n,2⟩ + ⟨n,3⟩ + ⟨n,4⟩ − ⟨n,5⟩ + ⋯.

The signs go + + − + + − + + − + + − ⋯.

Solution. The sequence 1, 1, −1, 1, 1, −1, . . . has period three. Any sequence a₀, a₁, . . . that is periodic of period three is a linear combination of the sequences b_n = 1, c_n = ω^n, and d_n = ω^{2n}. To find the coefficients we need to solve the linear equations

    a + b + c = 1
    a + ωb + ω²c = 1
    a + ω²b + ωc = −1.

Then the answer will be

    aU_n(1) + bU_n(ω) + cU_n(ω²).

But clearly U_n(1) = 3^n and U_n(ω) = U_n(ω²) = 0 (for n ≥ 1), so the answer is a·3^n. We can add the three equations to get a = 1/3, so the sum is 3^{n−1} for n ≥ 1. Curiously we get the answer 3^{n−1}, the same as for the alternating sum ⟨n,0⟩ − ⟨n,1⟩ + ⟨n,2⟩ − ⟨n,3⟩ + ⋯.

12. Define

    u₂(n) := ∑_k ⟨n,k⟩².

Show that

    u₂(n+1) = 5u₂(n) − 2u₂(n−1),   (4.6)

with the initial conditions u₂(0) = 1 and u₂(1) = 3.

Solution. Note that the elegant proof by generating functions that ∑_k (n choose k)² = (2n choose n) (Problem 1) does not extend, since U_m(x)U_n(x) ≠ U_{m+n}(x). Define the auxiliary function

    u_{1,1}(n) = ∑_i ⟨n,i⟩⟨n,i+1⟩.

Each element in row n + 1 is either equal to the element above it or to the sum of the two neighboring elements above it, so

    u₂(n+1) = ∑_i ⟨n,i⟩² + ∑_i (⟨n,i⟩ + ⟨n,i+1⟩)²
            = 3u₂(n) + 2u_{1,1}(n).   (4.7)

Similarly,

    u_{1,1}(n+1) = ∑_i ⟨n,i⟩(⟨n,i⟩ + ⟨n,i+1⟩) + ∑_i (⟨n,i⟩ + ⟨n,i+1⟩)⟨n,i+1⟩
                 = 2u₂(n) + 2u_{1,1}(n).   (4.8)

We can restate the recurrences (4.7) and (4.8) in the matrix form (writing [a b; c d] for a 2×2 matrix and [a; b] for a column vector)

    [u₂(n+1); u_{1,1}(n+1)] = [3 2; 2 2] [u₂(n); u_{1,1}(n)],

so

    [u₂(n+1); u_{1,1}(n+1)] = [3 2; 2 2]^n [u₂(1); u_{1,1}(1)].

The minimal polynomial of the matrix A = [3 2; 2 2] is x² − 5x + 2, so A² − 5A + 2I = 0. Therefore A^{n−1}(A² − 5A + 2I) = 0. Apply this to the vector

    [u₂(1); u_{1,1}(1)] = [3; 2],

and we get the desired recurrence.
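The recurrence is easy to confirm by brute force; a self-contained sketch (ours) that rebuilds the rows and sums the squared entries:

```python
def stern_rows(m):
    """Rows 0..m of Stern's triangle."""
    rows = [[1]]
    for _ in range(m):
        prev = rows[-1]
        new = [1]
        for i, a in enumerate(prev):
            new.append(a)  # copy brought straight down
            new.append(a + prev[i + 1] if i + 1 < len(prev) else 1)
        rows.append(new)
    return rows

u2 = [sum(a * a for a in row) for row in stern_rows(10)]
print(u2[:4])  # [1, 3, 13, 59]
assert all(u2[n + 1] == 5 * u2[n] - 2 * u2[n - 1] for n in range(1, 10))
```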

Note that this result is simpler than the corresponding one for Pascal's triangle (equation (4.4)). For Pascal's triangle we get an algebraic function of x, i.e., it satisfies a nonzero polynomial equation whose coefficients are polynomials in x. Here the equation is (1 − 4x)y² − 1 = 0. On the other hand, since u₂(n) satisfies a linear recurrence with constant coefficients, the generating function will be rational (a quotient of two polynomials), a special case of algebraic functions which is much simpler and more tractable. More specifically, we have

    ∑_{n≥0} u₂(n) x^n = (1 − 2x)/(1 − 5x + 2x²).

By standard techniques [143, Chap. 4] involving rational generating functions and linear recurrences with constant coefficients, we get

    u₂(n) = (1/2 + √17/34) ((5 + √17)/2)^n + (1/2 − √17/34) ((5 − √17)/2)^n.

Thus u₂(n) grows like ((5 + √17)/2)^n = (4.56155⋯)^n, while u₁(n) = ∑_k ⟨n,k⟩ = 3^n.

13. Find a formula for

    u₃(n) = ∑_k ⟨n,k⟩³.

Solution. This time one needs the auxiliary function

    u_{2,1}(n) = ∑_k ⟨n,k⟩²⟨n,k+1⟩

and similarly u_{1,2}(n). However, because of the symmetry

    ⟨n,k⟩ = ⟨n, 2^{n+1} − k − 2⟩

we have u_{2,1}(n) = u_{1,2}(n). It is then easy to check that for n ≥ 1 we have

    u₃(n) = 3u₃(n−1) + 6u_{2,1}(n−1)
    u_{2,1}(n) = 2u₃(n−1) + 4u_{2,1}(n−1).

The coefficient matrix [3 6; 2 4] has eigenvalues 7 and 0, so u₃(n) = c·7^n for some constant c, for n ≥ 1. Putting n = 1 gives the remarkably simple formula u₃(n) = 3·7^{n−1}. This formula fails for n = 0 because of the eigenvalue 0; that is, the minimal polynomial of the matrix [3 6; 2 4] is x(x − 7), not x − 7.

14. There is no counterpart of the formula for u₃(n) for ∑_i (n choose i)³. Show in fact that the generating function for ∑_i (n choose i)³ is not even algebraic.

Solution. See [143, ??]. [cont??]

15. * Show that the method of Problems 12 and 13 can be applied to u_r(n) := ∑_k ⟨n,k⟩^r for any positive integer r. The key point is that we end up with only finitely many linear recurrences. In particular,

    u₄(n) = 10u₄(n−1) + 9u₄(n−2) − 2u₄(n−3)
    u₅(n) = 14u₅(n−1) + 47u₅(n−2)
    u₆(n) = 20u₆(n−1) + 161u₆(n−2) + 40u₆(n−3) − 4u₆(n−4).

16. The previous problem can be greatly generalized. As one random example, let

    F_n(x) = ∏_{i=0}^{n−1} (1 + 2x^{3^i} − x^{2·3^i} + x^{3·3^i}),

and let v_r(n) be the sum of the rth powers of the coefficients of F_n(x). Show that

    v₂(n) = 8v₂(n−1) − 9v₂(n−2)
    v₃(n) = 15v₃(n−1) − 59v₃(n−2) + 183v₃(n−3)
    v₄(n) = 32v₄(n−1) + 278v₄(n−2) − 1600v₄(n−3) − 867v₄(n−4).

Next we consider a completely unexpected property of Stern's triangle. For convenience set ⟨n,k⟩ = 0 if k > 2^{n+1} − 2. First note that the rows of the Stern triangle are stable, that is, for any k ≥ 0 the sequence ⟨0,k⟩, ⟨1,k⟩, ⟨2,k⟩, . . . eventually becomes constant, say with value b_{k+1} (rather than b_k, in order to agree with established notation). In fact, the (n + 1)-st row begins with the first half of the nth row.

The first problem below should be clear.

17. * Show that

∑_{n≥0} b_{n+1} x^n = lim_{m→∞} U_m(x) = ∏_{i≥0} (1 + x^{2^i} + x^{2·2^i}).

We are taking the limit coefficientwise. Since for fixed k the coefficient of x^k becomes constant as m → ∞, there is no problem with convergence in taking the limit.

Set b_0 = 0. The sequence b_0, b_1, . . . is called Stern's diatomic series or Stern's diatomic sequence (or the Stern-Brocot sequence), after an 1858 paper of Moritz Abraham Stern. It begins as follows:

0, 1, 1, 2, 1, 3, 2, 3, 1, 4, 3, 5, 2, 5, 3, 4, 1, 5, 4, 7, 3, 8, 5, 7, 2, 7, 5, 8, . . . .

Its most amazing property is the following.

18. Show that every nonnegative rational number occurs exactly once as a ratio b_i/b_{i+1} of two consecutive terms, and that this fraction is in lowest terms.

                            1/1
               1/2                       2/1
         1/3         3/2           2/3         3/1
      1/4   4/3   3/5   5/3     2/5   5/2   3/4   4/1

             Figure 4.1: The Calkin-Wilf tree

Solution. This result can be proved directly by induction using the recurrence b_{2n} = b_n and b_{2n+1} = b_n + b_{n+1} (essentially equivalent to the definition of Stern's triangle), but it is more easily understood by introducing an intermediate object known as the Calkin-Wilf tree T. This is an infinite binary tree with root labeled by the fraction 1/1. We then label the other vertices recursively using the following rule: if a vertex is labeled a/b, then its left child is labeled a/(a + b) and its right child (a + b)/b. Figure 4.1 shows the first few levels.

It's not difficult to prove the following three facts, which immediately imply the desired result.

• If we read the numerators of the labels of T in the usual reading order then we obtain the sequence b_1, b_2, . . . , i.e., the Stern diatomic sequence except for the first term 0.

• If we read the denominators of the labels of T in the same reading order then we obtain the sequence b_2, b_3, . . . .

• Every positive rational number occurs exactly once as a label.
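All three facts are easy to test on an initial segment of the tree. The sketch below generates b_n from the recurrence b_{2n} = b_n, b_{2n+1} = b_n + b_{n+1} and the tree labels breadth first; the cutoff of 2000 vertices is an arbitrary choice.

```python
from fractions import Fraction
from math import gcd

# Stern's diatomic sequence via b(2m) = b(m), b(2m+1) = b(m) + b(m+1).
N = 1 << 12
b = [0, 1]
for m in range(2, N):
    b.append(b[m // 2] if m % 2 == 0 else b[m // 2] + b[m // 2 + 1])

# Breadth-first labels of the Calkin-Wilf tree: a/b has children a/(a+b), (a+b)/b.
labels = [(1, 1)]
for v in range(1, 2048):
    p, q = labels[v - 1]
    labels += [(p, p + q), (p + q, q)]
labels = labels[:2000]

# Facts 1 and 2: vertex m (1-indexed, reading order) is labeled b_m / b_{m+1},
# and the fraction is in lowest terms.
for m, (p, q) in enumerate(labels, start=1):
    assert (p, q) == (b[m], b[m + 1]) and gcd(p, q) == 1

# Fact 3 (finite check): the labels seen so far are pairwise distinct rationals.
assert len({Fraction(p, q) for p, q in labels}) == len(labels)
```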

Note. The Calkin-Wilf tree is named after Neil Calkin and Herbert Wilf, who published their paper on this subject in 2000. However, the earliest explicit appearance of the Calkin-Wilf tree seems to be a 1997 paper of Jean Berstel and Aldo de Luca, who called it the Raney tree. Similar trees, moreover, go back to Stern and even to Kepler in 1619.


The terminology "Calkin-Wilf tree" is a good example of Stigler's law of eponymy, which asserts that no scientific discovery is named after its original discoverer. This law is named after Stephen Stigler, who wrote about it in 1980. Note, however, that Stigler's law of eponymy implies that Stigler's law of eponymy was not originally discovered by Stigler.

19. What is the sum f(n) of the elements at height n? We define the top to have height 1, so f(1) = 1, f(2) = 1/2 + 2 = 5/2, etc.

Solution. In all rows but the first, the fractions come in pairs a/b and b/a. Their children sum to

a/(a+b) + (a+b)/b + b/(a+b) + (a+b)/a = 3 + a/b + b/a.

There are 2^{n−2} such pairs at height n, so we get

f(n + 1) = f(n) + 3 · 2^{n−2}.

There are many ways to solve this recurrence, and we get

f(n) = (1/2)(3 · 2^{n−1} − 1).
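The closed form can be confirmed with exact arithmetic by summing the tree level by level; note that a label x = a/b has children x/(1 + x) = a/(a+b) and 1 + x = (a+b)/b.

```python
from fractions import Fraction

# Sum the Calkin-Wilf labels at each height; the root 1/1 has height 1.
level = [Fraction(1, 1)]
f = {}
for n in range(1, 11):
    f[n] = sum(level)
    assert f[n] == Fraction(3 * 2 ** (n - 1) - 1, 2)
    # Replace each label x by its two children x/(1+x) and 1+x.
    level = [c for x in level for c in (x / (1 + x), 1 + x)]
```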

20. * Let f_r(n) denote the sum of the rth powers of the labels in row n of the Calkin-Wilf tree. Show that f_{2s+1}(n) is a rational linear combination (independent of n) of f_0(n), f_2(n), f_4(n), . . . , f_{2s}(n), where s ∈ N.

Chapter 7

Hidden Independence and Uniformity

In Chapter 5, Problem 4 we gave an example of the phenomenon "hidden independence and uniformity." In this chapter we consider further problems of a similar flavor, beginning with a very simple problem just to get the ball rolling.

1. The numbers 1, 2, . . . , 100 are written on separate slips of paper and placed in a hat. An eager volunteer removes five numbers from the hat, one at a time. What is the probability that they are in increasing order?

Solution. Whatever set of five numbers is chosen, there is a probability 1/5! = 1/120 that they will be removed in increasing order.

The "hidden independence" arises from the fact that the probability that the numbers are increasing is independent of which set of five numbers is chosen. [ref??]

2. Given positive integers n and k, how many k-tuples (S_1, . . . , S_k) of subsets of {1, 2, . . . , n} are there such that S_1 ∩ S_2 ∩ ⋯ ∩ S_k = ∅?

Solution. Each element i can be in any subset of the S_j's except all of them. There are 2^k − 1 allowable subsets, so there are (2^k − 1)^n such k-tuples (S_1, . . . , S_k).

The key idea is that the choice of which S_j's contain i is independent of i, so we can multiply the possibilities for each i. Some similar problems of this nature appear in Stanley [142, Exer. 1.32].
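A brute-force count over all k-tuples of subsets, with subsets encoded as bitmasks, confirms the product formula for small n and k:

```python
from functools import reduce
from itertools import product

def count_empty_intersection(n, k):
    # Subsets of {1,...,n} as n-bit masks; intersection is bitwise AND.
    return sum(1 for t in product(range(1 << n), repeat=k)
               if reduce(lambda a, b: a & b, t) == 0)

for n in range(1, 4):
    for k in range(1, 4):
        assert count_empty_intersection(n, k) == (2 ** k - 1) ** n
```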

3. Let p be a prime number and 1 ≤ k ≤ p − 1. How many k-element subsets {a_1, . . . , a_k} of {1, 2, . . . , p} are there such that a_1 + ⋯ + a_k ≡ 0 (mod p)?

Solution. Define the subset {a_1, . . . , a_k} to be equivalent to all subsets {a_1 + j, . . . , a_k + j}, where 0 ≤ j < p and we compute a_i + j modulo p, i.e., we work in the ring Z/pZ. Clearly this is an equivalence relation. Because p is prime and 1 ≤ k ≤ p − 1, each equivalence class contains exactly p elements. If a_1 + ⋯ + a_k ≡ d (mod p) and b_i = a_i + j, then b_1 + ⋯ + b_k ≡ d + jk (mod p). The equation d + jk ≡ 0 (mod p) has exactly one solution j modulo p, since k and p are relatively prime. Thus we have divided the (p choose k) k-element subsets into equivalence classes, where each class contains p elements, exactly one of which satisfies the condition we want. Therefore the total number of subsets satisfying the condition is (1/p)(p choose k).
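The count (1/p)(p choose k) is easy to confirm by enumerating subsets directly:

```python
from itertools import combinations
from math import comb

def count_zero_sum(p, k):
    # k-element subsets of {1,...,p} whose sum is divisible by p.
    return sum(1 for s in combinations(range(1, p + 1), k) if sum(s) % p == 0)

for p in (5, 7, 11):
    for k in range(1, p):
        assert count_zero_sum(p, k) * p == comb(p, k)
```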

4. The previous problem becomes more interesting when we work modulo any positive integer n rather than a prime. In particular, show that the total number f(n) of subsets S of {1, 2, . . . , n} satisfying ∑_{i∈S} i ≡ 0 (mod n) is given by

f(n) = (1/n) ∑_{d|n, d odd} φ(d) 2^{n/d},        (7.1)

where φ denotes the Euler φ-function.

Solution. There is an elegant argument using generating functions, and a complicated nonbijective combinatorial proof is known when n is odd. When n is odd, f(n) is equal to the number of necklaces (up to cyclic rotation) with n beads, each bead colored either red or blue (a simple special case of the Frobenius-Burnside lemma for counting the number of orbits of a finite group action), but no bijection is known between the subsets and the necklaces. For further information see Stanley [142, Exer. 1.105].
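Formula (7.1) can be checked against a direct enumeration (the empty set counts, since its sum 0 is divisible by n):

```python
from math import gcd

def phi(d):
    # Euler's phi-function by direct count.
    return sum(1 for j in range(1, d + 1) if gcd(j, d) == 1)

def f_brute(n):
    # Subsets of {1,...,n}, encoded as bitmasks, with sum ≡ 0 (mod n).
    return sum(1 for mask in range(1 << n)
               if sum(i for i in range(1, n + 1) if mask >> (i - 1) & 1) % n == 0)

def f_formula(n):
    # (1/n) * sum over odd divisors d of n of phi(d) * 2^(n/d).
    total = sum(phi(d) * 2 ** (n // d)
                for d in range(1, n + 1) if n % d == 0 and d % 2 == 1)
    assert total % n == 0
    return total // n

for n in range(1, 13):
    assert f_brute(n) == f_formula(n)
```

For example f(4) = 4, counting ∅, {4}, {1,3}, and {1,3,4}.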


5. Let w be a random permutation (uniform distribution) of 1, 2, . . . , n, and fix 1 ≤ k ≤ n. What is the probability that in the disjoint cycle decomposition of w, the length of the cycle containing 1 is k? In other words, what is the probability that k is the least positive integer for which w^k(1) = 1?

Solution. What might be called the "obvious" proof is as follows. The number of ways to choose a k-cycle containing 1 is (n−1)(n−2)⋯(n−k+1), since there are n − 1 ways to choose w(1), then n − 2 ways to choose w²(1), etc. There are (n − k)! choices for the rest of the permutation w, so the desired probability is

(n−1)(n−2)⋯(n−k+1) · (n−k)! / n! = 1/n,

independent of k.

Such a simple answer suggests there might be a more elegant proof. There is more than one way to write the disjoint cycle decomposition of a permutation. For instance, (1,2,3)(4,5)(6,7) = (5,4)(6,7)(3,1,2). Define the standard representation of a permutation w of [n] by requiring that (a) each cycle is written with its largest element first, and (b) the cycles are written in increasing order of their largest elements. For instance, the standard form of (1,4)(2)(3,7,5)(6) is (2)(4,1)(6)(7,5,3). Define ŵ to be the permutation obtained by writing the standard form of w and erasing the parentheses, thereby obtaining a word (sequence) a_1 a_2 ⋯ a_n, which we interpret as usual as the permutation ŵ defined by ŵ(i) = a_i. For instance, if w = (1,4)(2)(3,7,5)(6) then ŵ = 2416753. The key observation is that the operation w ↦ ŵ is a bijection on the symmetric group S_n of all permutations of [n]. The easy proof follows from observing that the left-to-right maxima of ŵ (the elements a_j for which a_j > a_i for all i < j) are the first elements of the cycles of w written in standard form.

Now we can see why, in a random permutation w ∈ S_n, the probability that the length of the cycle containing 1 equals k does not depend on k. Instead of the cycle containing 1, we can equivalently look at the cycle containing n. If ŵ = a_1 a_2 ⋯ a_n, then the length of the cycle of w containing n is just n + 1 − ŵ^{−1}(n), the position of n in ŵ counted from the right. Since ŵ^{−1}(n) is equally likely to take any value 1, 2, . . . , n, the proof follows.
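The uniform distribution of the cycle length is easy to verify exhaustively for a small n (n = 6 here, an arbitrary choice):

```python
from itertools import permutations

def cycle_length_containing(w, x):
    # w is a tuple with w[i-1] the image of i; follow x until it returns.
    length, y = 1, w[x - 1]
    while y != x:
        length, y = length + 1, w[y - 1]
    return length

n = 6
perms = list(permutations(range(1, n + 1)))
for k in range(1, n + 1):
    hits = sum(1 for w in perms if cycle_length_containing(w, 1) == k)
    assert hits * n == len(perms)   # probability is exactly 1/n for every k
```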


6. Given a random permutation w ∈ S_n, what is the probability that 1 and 2 are in the same cycle?

Solution. Rather than 1 and 2, we can equivalently ask for the probability that n − 1 and n are in the same cycle. This is just the probability that n − 1 follows n in ŵ, so the probability is 1/2.

The nice bijection w ↦ ŵ is sometimes called the fundamental bijection. It was first used by Alfréd Rényi [126] and first systematically developed by Dominique Foata and Marcel-Paul Schützenberger [45]. There is also a short discussion in [142, §1.3].

7. Choose n real numbers x_1, . . . , x_n uniformly and independently from the interval [0,1]. What is the expected value of min_i x_i, the minimum value of x_1, . . . , x_n? Give a noncomputational proof.

Solution. Choose points x_1, . . . , x_n uniformly and independently on a circle of circumference 1. Then choose an additional point y on the circle. Cut the circle at y and then "straighten it out" into a unit interval [0,1], where for definiteness say that moving from 0 to 1 corresponds to moving clockwise on the circle. What we obtain are n points x_1, . . . , x_n uniformly and independently chosen from [0,1]. Now by symmetry on the circle the expected distance between any two consecutive points among x_1, . . . , x_n, y is 1/(n+1). Hence the expected distance between y and the first x_i clockwise from y is 1/(n+1), and this distance is just min x_i. Thus the "hidden" independence is among n + 1 points, not the original n points.

8. Given integers m, n ≥ 0, evaluate the integral

I = ∫₀¹ x^m (1 − x)^n dx.

Again a noncomputational proof is wanted.

Solution. This integral is by definition the beta function B(m + 1, n + 1) and can be evaluated straightforwardly using integration by parts and induction. The idea behind a more elegant conceptual proof is to interpret the integral probabilistically.

Think of x as a number chosen uniformly in [0,1]. Then x^m is the probability that m other numbers u_1, . . . , u_m in [0,1] are all less than x, and (1 − x)^n is the probability that n other numbers v_1, . . . , v_n in [0,1] are all greater than x.

We can choose all m + n + 1 numbers u_i, v_j, x at once. There are (m + n + 1)! ways to order them. The number of orderings for which x is preceded by u_1, . . . , u_m and followed by v_1, . . . , v_n is m! n!. Hence

∫₀¹ x^m (1 − x)^n dx = m! n! / (m + n + 1)!.
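The counting step can be verified by enumerating the orderings directly and comparing with the closed form:

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def ratio(m, n):
    # Fraction of orderings of u_1..u_m, x, v_1..v_n in which every u
    # precedes x and every v follows x.
    items = [('u', i) for i in range(m)] + [('x', 0)] + [('v', i) for i in range(n)]
    good = 0
    for order in permutations(items):
        pos = order.index(('x', 0))
        if all(t == 'u' for t, _ in order[:pos]) and \
           all(t == 'v' for t, _ in order[pos + 1:]):
            good += 1
    return Fraction(good, factorial(m + n + 1))

for m in range(3):
    for n in range(3):
        assert ratio(m, n) == Fraction(factorial(m) * factorial(n),
                                       factorial(m + n + 1))
```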

9. * Evaluate the integral

∫∫∫_R x¹ y⁹ z⁸ w⁴ dx dy dz,        (7.2)

where R is the region of R³ defined by x, y, z ≥ 0 and x + y + z ≤ 1, and where w = 1 − x − y − z. This was a problem on the 1984 William Lowell Putnam Mathematical Competition [79, ??].

10. (a) For n, k ∈ P define the integral

I_{n,k} := ∫₀¹ ⋯ ∫₀¹ ∏_{1≤i<j≤n} (x_i − x_j)^{2k} dx_1 ⋯ dx_n.        (7.3)

Find a probabilistic interpretation of I_{n,k}.

(b) Evaluate the integral.

Solution.

(a) The integral is the probability that a sequence consisting of n x's and 2k y_{ij}'s (for all 1 ≤ i < j ≤ n), chosen from the uniform distribution on all such sequences, has all the y_{ij}'s occurring between the ith and jth x.

(b) The integral I_{n,k} is a special case of Selberg's integral [8, Ch. 8][51][82], first evaluated by Atle Selberg in 1944. It is given by

I_{n,k} = ((kn)! / k!^n) ∏_{j=0}^{n−1} (jk)!³ / (1 + k(n − 1 + j))!.

Several proofs are known, including an elementary induction argument. It is surprising, however, that no combinatorial proof is known of this simple probability.
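The product formula can be sanity-checked against an exact computation of the n = 2 case, where the integral is ∫₀¹∫₀¹ (x − y)^{2k} dx dy and can be expanded term by term:

```python
from fractions import Fraction
from math import comb, factorial

def selberg(n, k):
    # The displayed product formula for I_{n,k}.
    val = Fraction(factorial(k * n), factorial(k) ** n)
    for j in range(n):
        val *= Fraction(factorial(j * k) ** 3,
                        factorial(1 + k * (n - 1 + j)))
    return val

def exact_n2(k):
    # Expand (x-y)^{2k} by the binomial theorem and integrate term by term:
    # the x^i y^{2k-i} term contributes C(2k,i)(-1)^i / ((i+1)(2k-i+1)).
    return sum(Fraction(comb(2 * k, i) * (-1) ** i, (i + 1) * (2 * k - i + 1))
               for i in range(2 * k + 1))

assert selberg(1, 1) == 1          # ∫ dx = 1
for k in range(1, 5):
    assert selberg(2, k) == exact_n2(k)
```

For instance I_{2,1} = ∫∫ (x − y)² dx dy = 1/6, matching the formula.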


11. We next consider some geometric problems. Choose n points at random (uniformly and independently) on the circumference of a circle. Find the probability p_n that all the points lie on a semicircle. For instance, p_1 = p_2 = 1.

Solution. If y is a point on the circle, let ȳ denote its antipode. Suppose that we choose our n points by first choosing n points y_1, . . . , y_n and then choosing either y_i or ȳ_i for each i. In order for the chosen points to lie on a semicircle, they must appear consecutively among the 2n points y_i, ȳ_i. There are 2n ways for this to happen, so the probability that they lie on a semicircle is 2n/2^n = n/2^{n−1}. Since this is independent of the points y_1, . . . , y_n, this is also the probability when we choose any n points at random.
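A Monte Carlo check (seed and sample size arbitrary) using the fact that n points lie on a semicircle iff some gap between circularly consecutive points is at least π:

```python
import math
import random

def in_semicircle(angles):
    # True iff some gap between consecutive points (going around) is >= pi.
    a = sorted(angles)
    gaps = [b - x for x, b in zip(a, a[1:])] + [2 * math.pi - a[-1] + a[0]]
    return max(gaps) >= math.pi

random.seed(0)
n, trials = 4, 100_000
hits = sum(in_semicircle([random.uniform(0, 2 * math.pi) for _ in range(n)])
           for _ in range(trials))
# Exact answer n / 2^(n-1) = 1/2 for n = 4.
assert abs(hits / trials - 0.5) < 0.01
```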

12. * Generalize the previous problem as follows. For any 0 < θ < 2π, choose n points uniformly and independently as before. What is the probability that they are contained in an arc subtending an angle θ?

13. Choose four points at random (uniformly and independently) on the surface of a sphere. What is the probability that the center of the sphere is contained in the convex hull of the four points? Recall that the convex hull of a subset X of R^n is the intersection of all convex sets containing X.

Solution. Assume that the center of the sphere is (0,0,0). Four points x_1, x_2, x_3, x_4 on the sphere have probability one of being affinely independent (that is, they don't lie in a plane), so we can assume this. There is then, up to scalar multiplication, a unique nontrivial linear dependence relation

a_1 x_1 + a_2 x_2 + a_3 x_3 + a_4 x_4 = (0,0,0),   a_i ∈ R.

The origin will be in the convex hull of the four points if and only if all the a_i's have the same sign. There are sixteen choices for the signs (that is, for choosing either x_i or its antipode −x_i), and two of these choices have all signs the same. Thus the probability is 1/8.
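A Monte Carlo check (seed and sample size arbitrary): the dependence coefficients a_i can be computed as signed 3×3 minors, and the origin lies in the hull iff they all share a sign.

```python
import random

def det3(m):
    # 3x3 determinant by cofactor expansion along the first row.
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def origin_in_hull(pts):
    # Kernel of the 3x4 matrix with columns x_1..x_4: a_i is (up to a global
    # sign) (-1)^i times the minor obtained by deleting point i.
    a = [(-1) ** i * det3(pts[:i] + pts[i + 1:]) for i in range(4)]
    return all(x > 0 for x in a) or all(x < 0 for x in a)

def random_sphere_point():
    # Normalized Gaussian vector: uniform on the sphere.
    while True:
        v = [random.gauss(0, 1) for _ in range(3)]
        r = sum(x * x for x in v) ** 0.5
        if r > 1e-9:
            return [x / r for x in v]

random.seed(0)
trials = 100_000
hits = sum(origin_in_hull([random_sphere_point() for _ in range(4)])
           for _ in range(trials))
assert abs(hits / trials - 0.125) < 0.01
```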

This problem was on the 1992 Putnam Competition [79, ??]. Sixteen contestants received eight or more points out of a maximum of ten points, and no one received 2–7 points.


14. Show that if we choose n points uniformly at random in a square, then the probability that they are in convex position (i.e., no point is in the convex hull of the other points) is given by

p_n = [ (1/n!) (2n−2 choose n−1) ]².        (7.4)

Solution. While this problem has the same flavor as the previous three, no comparably simple proof is known. A difficult proof using induction is due to Valtr [160].

15. The last problem of this chapter has a more combinatorial nature. Passengers P_1, . . . , P_n enter a plane with n seats. Each passenger has a different assigned seat. The first passenger ignores the assignment and sits in a uniformly random seat. Thereafter, each passenger sits in his or her own seat if it is unoccupied, and otherwise sits in a random unoccupied seat. What is the probability that the last passenger P_n sits in his or her own seat?

Solution. When P_n boards, only one seat remains, and it must be either P_n's own seat or the seat assigned to P_1; any other seat is taken by its owner if it is still free when the owner boards. Every passenger who makes a random choice is just as likely to choose P_1's assigned seat as P_n's, so the last remaining seat is equally likely to be either one, and the probability is 1/2.
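The answer can be verified exactly for small n by a probability-weighted recursion over the boarding process (a sketch, assuming the random-seat reading of the problem as stated above):

```python
from fractions import Fraction

def last_gets_own_seat(n):
    # Exact probability: P1 takes a uniformly random seat; every later
    # passenger takes their own seat if free, else a uniform free seat.
    def board(p, free):
        if p == n:
            return Fraction(1) if n in free else Fraction(0)
        if p != 1 and p in free:
            return board(p + 1, free - {p})
        seats = sorted(free)
        return sum(board(p + 1, free - {s}) for s in seats) / len(seats)
    return board(1, frozenset(range(1, n + 1)))

for n in range(2, 7):
    assert last_gets_own_seat(n) == Fraction(1, 2)
```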

This problem can be found in Winkler [164, pp. 35–37]. The originalsource is unknown.

Index

Aissen, Michael, 16
algebraic (generating function), 44
Allouche, Jean-Paul Simon, 33
arithmetic triangle, 37
asymptotic, 30
Aziz, Haris, 6

Bailey, David Harold, 32
Berlekamp, Elwyn Ralph, 5
Berstel, Jean, 47
beta function, 68
Binet-Cauchy formula, 18
binomial coefficient, 37
binomial theorem, 38
  Newton's, 38
Borwein, Jonathan Michael, 32
Brams, Steven J., 6

Calkin, Neil James, 47
Calkin-Wilf tree, 47
Catalan number, 25
Cauchy-Binet formula, 18, 19
Cherly, Jørgen, 10
Clessa, J. J., 23
convex hull, 70
Conway, John Horton, 5, 6
cubefree (word), 33
cutcake, 3
  impartial, 2

diffeomorphic, 24
differentiable manifold, 24
discriminant, 13
Drucker, Daniel Stephen, 15

elementary symmetric function, 10
envy-freeness, 6

fair cake-cutting, 5
Fibonacci number, 41
Fibonacci word, 35
Foata, Dominique, 68
Fomin, Sergey Vladimirovich, 17
Frobenius-Burnside lemma, 66
fundamental bijection, 68
fundamental theorem of symmetric functions, 10

Gallardo, Luis H., 10
Gantmacher, Felix Ruvimovich, 13
Gardner, Martin, 1
Gelfand, Israel Moiseevich, 15
Greenfield, Gary Robert, 15
Guy, Richard Kenneth, 5

Hilbert, David, 16

Kapranov, Mikhail Mikhailovich, 15
Kepler, Johannes, 47
Kummer's theorem, 28
Kummer, Ernst Eduard, 28

Lagarias, Jeffrey Clark, 29
Laguerre polynomial, generalized, 16
leading principal minor, 11
left-to-right maximum (of a permutation), 67
Legendre's formula, 28
Legendre, Adrien-Marie, 28
log concave, 17
log-concave
  strong, 18
logarithmically concave, 17
Lowen, Robert W., 2
de Luca, Aldo, 47
Lucas's theorem, 28
Lucas, François Édouard Anatole, 28

Mackenzie, Simon, 6
Maynard, James, 31
Mehta, Harsh, 29
Morse-Hedlund sequence, 33
multiplicative triangle, 39

Newton, Isaac, 17

Odlyzko, Andrew Michael, 31
OEIS, 25

partizan game, 3
Pascal's triangle, 37
Pingal's Meruprastar, 37
Pingala, Acharya, 37
Plouffe, Simon, 32
de Polignac's formula, 28
de Polignac, Alphonse Armand Charles Georges Marie, 28
positive definite matrix, 11
power sum, 10
proportionality, 6
Prouhet-Thue-Morse sequence, 33
Putzer, Eugene J., 2

quadratic form, 12

Raney tree, 47
rational (generating function), 44
Rényi, Alfréd, 68

Schoenberg, Isaac Jacob, 16
Schur, Issai, 16
Schützenberger, Marcel-Paul, 68
Selberg's integral, 69
Selberg, Atle, 69
Selfridge, John Lewis, 6
Shallit, Jeffrey Outlaw, 33
sophomore's dream, 28
spigot algorithm, 32
squarefree (word), 33
standard representation (of a permutation), 67
Stanley, Richard Peter, 25, 31, 66
Stern's diatomic array, 42
Stern's diatomic sequence, 46
Stern's diatomic series, 46
Stern's triangle, 40
Stern, Moritz Abraham, 46, 47
Stern-Brocot sequence, 46
Stigler's law of eponymy, 48
Stigler, Stephen Mack, 48
Swan, Richard Gordon, 15
symmetric polynomial, 10

Taylor, Alan Dana, 6
Toeplitz matrix, 16
total positivity, 17

Valtr, Pavel, 71
valuation (of a cake), 5
value (of a partizan game), 3
Vandermonde determinant, 11
Vaserstein, Leonid N., 10

Wheland, Ethel R., 10
Whitney, Anne, 16
Wilf, Herbert Saul, 47
Winkler, Peter Mann, 2, 8, 71

Zelevinsky, Andrei Vladlenovich, 15
zero (of a polynomial), 9