Discrete Math and Probability Theory



CS 70 Discrete Mathematics and Probability Theory

Fall 2013 Vazirani Note 0

Review of Sets and Mathematical Notation

A set is a well defined collection of objects. These objects are called elements or members of the set, and they can be anything, including numbers, letters, people, cities, and even other sets. By convention, sets are usually denoted by capital letters and can be described or defined by listing their elements and surrounding the list by curly braces. For example, we can describe the set $A$ to be the set whose members are the first five prime numbers, or we can explicitly write: $A = \{2, 3, 5, 7, 11\}$. If $x$ is an element of $A$, we write $x \in A$. Similarly, if $y$ is not an element of $A$, then we write $y \notin A$. Two sets $A$ and $B$ are said to be equal, written as $A = B$, if they have the same elements. The order and repetition of elements do not matter, so {red, white, blue} = {blue, white, red} = {red, white, white, blue}.

Sometimes, more complicated sets can be defined by using a different notation. For example, the set of all rational numbers, denoted by $\mathbb{Q}$, can be written as $\{\frac{a}{b} \mid a, b \text{ are integers}, b \neq 0\}$. In English, this is read as "the set of all fractions such that the numerator is an integer and the denominator is a non-zero integer."

Cardinality

We can also talk about the size of a set, or its cardinality. If $A = \{1, 2, 3, 4\}$, then the cardinality of $A$, denoted by $|A|$, is 4. It is possible for the cardinality of a set to be 0. This set is called the empty set, denoted by the symbol $\emptyset$. A set can also have an infinite number of elements, such as the set of all integers, prime numbers, or odd numbers.

Subsets and Proper Subsets

If every element of a set $A$ is also in a set $B$, then we say that $A$ is a subset of $B$, written $A \subseteq B$. Equivalently we can write $B \supseteq A$, or $B$ is a superset of $A$. A proper subset is a set $A$ that is strictly contained in $B$, written as $A \subset B$, meaning that $A$ excludes at least one element of $B$. For example, consider the set $B = \{1, 2, 3, 4, 5\}$. Then $\{1, 2, 3\}$ is both a subset and a proper subset of $B$, while $\{1, 2, 3, 4, 5\}$ is a subset but not a proper subset of $B$. Here are a few basic properties regarding subsets:

The empty set is a proper subset of any nonempty set $A$: $\emptyset \subset A$.

The empty set is a subset of every set $B$: $\emptyset \subseteq B$.

Every set $A$ is a subset of itself: $A \subseteq A$.

Intersections and Unions

The intersection of a set $A$ with a set $B$, written as $A \cap B$, is a set containing all elements which are in both $A$ and $B$. Two sets are said to be disjoint if $A \cap B = \emptyset$. The union of a set $A$ with a set $B$, written as $A \cup B$, is a set of all elements which are in either $A$ or $B$ or both. For example, if $A$ is the set of all positive even numbers, and $B$ is the set of all positive odd numbers, then $A \cap B = \emptyset$, and $A \cup B = \mathbb{Z}^+$, or the set of all positive integers. Here are a few properties of intersections and unions:

$A \cup B = B \cup A$

CS 70, Fall 2013, Note 0 1



$A \cup \emptyset = A$

$A \cap B = B \cap A$

$A \cap \emptyset = \emptyset$

Complements

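If $A$ and $B$ are two sets, then the relative complement of $A$ in $B$, written as $B - A$ or $B \setminus A$, is the set of elements in $B$, but not in $A$: $B \setminus A = \{x \in B \mid x \notin A\}$. For example, if $B = \{1, 2, 3\}$ and $A = \{3, 4, 5\}$, then $B \setminus A = \{1, 2\}$. For another example, if $\mathbb{R}$ is the set of real numbers and $\mathbb{Q}$ is the set of rational numbers, then $\mathbb{R} \setminus \mathbb{Q}$ is the set of irrational numbers. Here are some important properties of complements:

$A \setminus A = \emptyset$

$A \setminus \emptyset = A$

$\emptyset \setminus A = \emptyset$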

Significant Sets

In mathematics, some sets are referred to so commonly that they are denoted by special symbols. Some of

these numerical sets include:

$\mathbb{N}$ denotes the set of all natural numbers: $\{0, 1, 2, 3, \ldots\}$.

$\mathbb{Z}$ denotes the set of all integer numbers: $\{\ldots, -2, -1, 0, 1, 2, \ldots\}$.

$\mathbb{Q}$ denotes the set of all rational numbers: $\{\frac{a}{b} \mid a, b \in \mathbb{Z}, b \neq 0\}$.

$\mathbb{R}$ denotes the set of all real numbers.

$\mathbb{C}$ denotes the set of all complex numbers.

In addition, the Cartesian product (also called the cross product) of two sets $A$ and $B$, written as $A \times B$, is the set of all pairs whose first component is an element of $A$ and whose second component is an element of $B$. In set notation, $A \times B = \{(a, b) \mid a \in A, b \in B\}$. For example, if $A = \{1, 2, 3\}$ and $B = \{u, v\}$, then $A \times B = \{(1, u), (1, v), (2, u), (2, v), (3, u), (3, v)\}$. Given a set $S$, another significant set is the power set of $S$, denoted by $\mathcal{P}(S)$, which is the set of all subsets of $S$: $\{T \mid T \subseteq S\}$. For example, if $S = \{1, 2, 3\}$, then the power set of $S$ is: $\mathcal{P}(S) = \{\{\}, \{1\}, \{2\}, \{3\}, \{1, 2\}, \{1, 3\}, \{2, 3\}, \{1, 2, 3\}\}$. It is interesting to note that, if $|S| = k$, then $|\mathcal{P}(S)| = 2^k$.
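The Cartesian product and the power set can be built mechanically for small finite sets. The following is a minimal Python sketch, not part of the original note; the helper name `power_set` is an illustrative assumption, and subsets are stored as `frozenset` so that they can themselves be members of a set.

```python
from itertools import chain, combinations, product

def power_set(s):
    """Return the power set of s as a set of frozensets."""
    items = list(s)
    # Collect the subsets of every size r = 0, 1, ..., |s|.
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return {frozenset(t) for t in subsets}

A = {1, 2, 3}
B = {"u", "v"}

cartesian = set(product(A, B))   # all pairs (a, b) with a in A and b in B
P = power_set(A)

print(len(cartesian))  # |A x B| = |A| * |B|
print(len(P))          # |P(A)| = 2 ** |A|
```

Counting the results for a few small sets is a quick way to convince yourself of the $2^k$ rule before proving it.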

Mathematical notation:

Sums and Products:

There is a compact notation for writing sums or products of large numbers of items. For example, to write $1 + 2 + \cdots + n$, without having to say "dot dot dot", we write it as $\sum_{i=1}^{n} i$. More generally we can write the sum $f(m) + f(m+1) + \cdots + f(n)$ as $\sum_{i=m}^{n} f(i)$. Thus, $\sum_{i=5}^{n} i^2 = 5^2 + 6^2 + \cdots + n^2$.

To write the product $f(m) f(m+1) \cdots f(n)$, we use the notation $\prod_{i=m}^{n} f(i)$. For example, $\prod_{i=1}^{n} i = 1 \cdot 2 \cdots n$.
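For readers who program, the sum and product notations correspond directly to loops over an index range. Here is a small Python sketch under that reading; the value `n = 10` and the function `f` are arbitrary choices for illustration.

```python
import math

n = 10

def f(i):
    """Example summand: f(i) = i squared."""
    return i * i

# Sum of i for i = 1..n
total = sum(i for i in range(1, n + 1))

# Sum of f(i) for i = 5..n, i.e. 5^2 + 6^2 + ... + n^2
squares = sum(f(i) for i in range(5, n + 1))

# Product of i for i = 1..n, i.e. 1 * 2 * ... * n
factorial = math.prod(range(1, n + 1))

print(total, squares, factorial)
```

Note that `range(m, n + 1)` is used because the upper limit of the mathematical notation is inclusive, while Python's `range` excludes its end point.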


Universal and existential quantifiers:

Consider the statement: "For all natural numbers $n$, $n^2 + n + 41$ is prime." Here, $n$ is quantified to any element of the set $\mathbb{N}$ of natural numbers. In notation, we write $(\forall n \in \mathbb{N})(n^2 + n + 41 \text{ is prime})$. Here we have used the universal quantifier $\forall$ ("for all"). Is the statement true? If you try to substitute small values of $n$, you will notice that $n^2 + n + 41$ is indeed prime for those values. But if you think harder, you can find larger values of $n$ for which it is not prime. Can you find one? So the statement $(\forall n \in \mathbb{N})(n^2 + n + 41 \text{ is prime})$ is false.
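A universally quantified statement is refuted by a single counterexample, so a brute-force search is one way to settle the question. The sketch below is an illustration only (the names `is_prime` and `smallest_counterexample` and the search limit are assumptions); try the exercise by hand before running it.

```python
def is_prime(m):
    """Trial division; adequate for the small values tested here."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def smallest_counterexample(limit=1000):
    """Return the least n < limit with n^2 + n + 41 composite, or None."""
    for n in range(limit):
        if not is_prime(n * n + n + 41):
            return n
    return None

n = smallest_counterexample()
print(n, n * n + n + 41)
```

The value found factors as $41 \cdot 41$, which explains why the pattern breaks exactly there.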

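The existential quantifier $\exists$ ("there exists") asserts that at least one element has a given property. We can also write statements which use both kinds of quantifiers:

1. $(\forall x \in \mathbb{Z})(\exists y \in \mathbb{Z})(y > x)$

2. $(\exists y \in \mathbb{Z})(\forall x \in \mathbb{Z})(y > x)$

The first statement says that, given an integer, I can find a larger one. The second statement says something very different: that there is a largest integer! The first statement is true, the second is not.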


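Inductive Step: Prove that it also holds for $n = k + 1$, i.e. $\sum_{i=0}^{k+1} i = \frac{(k+1)(k+2)}{2}$:

\[
\begin{aligned}
\sum_{i=0}^{k+1} i &= \Big(\sum_{i=0}^{k} i\Big) + (k+1) \\
&= \frac{k(k+1)}{2} + (k+1) \quad \text{(by the inductive hypothesis)} \\
&= (k+1)\Big(\frac{k}{2} + 1\Big) \\
&= \frac{(k+1)(k+2)}{2}.
\end{aligned}
\]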

Hence, by the principle of induction, the theorem holds.
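A numerical check is no substitute for the proof, but it is a cheap way to catch an algebra slip in a closed form before trying to prove it. A minimal Python sketch (the function name `formula_holds` is illustrative):

```python
def formula_holds(n):
    """Compare 0 + 1 + ... + n against the closed form n(n+1)/2."""
    return sum(range(n + 1)) == n * (n + 1) // 2

# Test the first few hundred cases; induction is what covers all n.
print(all(formula_holds(n) for n in range(300)))
```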

Let's step back and look at the general form of such a proof, and also why it makes sense. Let us denote by $P(n)$ the statement $\sum_{i=0}^{n} i = \frac{n(n+1)}{2}$. So we wish to prove that $\forall n \in \mathbb{N}, P(n)$. The principle of induction asserts that you can prove $P(n)$ is true $\forall n \in \mathbb{N}$ by following these three steps:

Base Case: Prove that $P(0)$ is true.

Inductive Hypothesis: Assume that $P(k)$ is true.

Inductive Step: Show that it follows that $P(k+1)$ is true.

To understand why induction works, think of the statements $P(n)$ as represented by a sequence of dominoes, numbered from $0, 1, 2, \ldots, n$, such that $P(0)$ corresponds to the 0th domino, $P(1)$ corresponds to the 1st domino, and so on. The dominoes are lined up so that if the $k$th domino is knocked over, then it in turn knocks over the $(k+1)$st. Knocking over the $k$th domino corresponds to proving $P(k)$ is true. And the induction step corresponds to the placement of the dominoes to ensure that if the $k$th domino falls, in turn it knocks over the $(k+1)$st domino. The base case ($n = 0$) knocks over the 0th domino, setting off a chain reaction that knocks down all the dominoes.

It is worth examining more closely the induction proof example above. To prove $P(k+1)$, we find within it the statement $P(k)$: $\sum_{i=0}^{k+1} i = \big(\sum_{i=0}^{k} i\big) + (k+1)$. This is the key to the induction step.

We will now look at another proof by induction, but first we will introduce some notation and a definition for divisibility. Given integers $a$ and $b$, we say that $a$ divides $b$ (or $b$ is divisible by $a$), written as $a \mid b$, if and only if for some integer $q$, $b = aq$. In mathematical notation, $\forall a, b \in \mathbb{Z}$, $a \mid b$ iff $\exists q \in \mathbb{Z} : b = aq$.

CS 70, Fall 2013, Note 1 2



Theorem: $\forall n \in \mathbb{N}$, $n^3 - n$ is divisible by 3.

Proof (by induction over $n$): Let $P(n)$ denote the statement "$n^3 - n$ is divisible by 3."

Base Case: $P(0)$ asserts that $3 \mid (0^3 - 0)$ or $3 \mid 0$, which is true since every non-zero integer divides 0. (In this case, $0 = 3 \cdot 0$.)

Inductive Hypothesis: Assume $P(k)$ is true. That is, $3 \mid (k^3 - k)$, or $\exists q \in \mathbb{Z}$, $k^3 - k = 3q$.

Inductive Step: We must show that $P(k+1)$ is true, which asserts that $3 \mid ((k+1)^3 - (k+1))$. Let us expand this out:

\[
\begin{aligned}
(k+1)^3 - (k+1) &= k^3 + 3k^2 + 3k + 1 - (k+1) \\
&= (k^3 - k) + 3k^2 + 3k \\
&= 3q + 3(k^2 + k), \quad q \in \mathbb{Z} \quad \text{(by the inductive hypothesis)} \\
&= 3(q + k^2 + k)
\end{aligned}
\]

So $3 \mid ((k+1)^3 - (k+1))$.

Hence, by the principle of induction, $\forall n \in \mathbb{N}$, $3 \mid (n^3 - n)$.

There is a clever direct proof without any induction for the above statement. Can you see it?

Two Color Theorem: There is a famous theorem called the four color theorem. It states that any map

can be colored with four colors such that any two adjacent countries (which share a border, but not just

a point) must have different colors. The four color theorem is very difficult to prove, and several bogus

proofs were claimed since the problem was first posed in 1852. It was not until 1976 that the theorem was

finally proved (with the aid of a computer) by Appel and Haken. (For an interesting history of the problem,

and a state-of-the-art proof, which is nonetheless still very challenging, see www.math.gatech.edu/$\sim$thomas/FC/fourcolor.html). We consider a simpler scenario, where we divide the plane

into regions by drawing lines, where each line divides the plane into two regions (i.e. it extends to infinity).

We want to know if we can color this map using no more than two colors (say, red and blue) such that no

two regions that share a boundary have the same color. Here is an example of a two-colored map:

We will prove this "two color theorem" by induction on $n$, the number of lines:

Base Case: Prove that $P(0)$ is true, which is the proposition that a map with $n = 0$ lines can be colored using no more than two colors. But this is easy, since we can just color the entire plane using one color.


Inductive Hypothesis: Assume $P(n)$. That is, a map with $n$ lines can be two-colored.

Inductive Step: Prove $P(n+1)$. We are given a map with $n + 1$ lines and wish to show that it can be two-colored. Let's see what happens if we remove a line. With only $n$ lines on the plane, we know we can two-color the map (by the inductive hypothesis). Let us make the following observation: if we swap red $\leftrightarrow$ blue, we still have a two-coloring. With this in mind, let us place back the line we removed, and leave colors on one side of the line unchanged. On the other side of the line, swap red $\leftrightarrow$ blue. We claim that this is a valid two-coloring for the map with $n + 1$ lines.

Why does this work? Any border of a region either consists of a part of one of the original $n$ lines or a piece of the $(n+1)$-st line. If it is a part of one of the original $n$ lines, then the two regions on either side are both on the same side of the $(n+1)$-st line, and the colors of the regions must be distinct, by the induction hypothesis. On the other hand, if the border is part of the $(n+1)$-st line, then the two regions were created by dividing a single region from the induction hypothesis, and by construction we reversed colors on one side of the line, and so they have opposite colors.
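The inductive construction can be unrolled into a direct rule: start with the whole plane red, and let each line flip the colors on one of its sides. A point's final color then depends only on the parity of the number of lines that flipped it. The Python sketch below illustrates this under assumptions not in the note: each line is stored as a triple `(a, b, c)` meaning $ax + by + c = 0$, the "flipped" side is taken to be where $ax + by + c > 0$, and points lying exactly on a line are ignored.

```python
def on_flipped_side(line, point):
    """True if the point lies on the side of the line whose colors are swapped."""
    a, b, c = line
    x, y = point
    return a * x + b * y + c > 0

def color(lines, point):
    """Color of the region containing the point (point must not lie on a line)."""
    flips = sum(on_flipped_side(line, point) for line in lines)
    return "red" if flips % 2 == 0 else "blue"

# Three example lines: x = 0, y = 0 and x + y = 1.
lines = [(1, 0, 0), (0, 1, 0), (1, 1, -1)]

print(color(lines, (0.2, 0.2)))    # inside the small triangle
print(color(lines, (-0.2, 0.2)))   # across the line x = 0
print(color(lines, (1.0, 1.0)))    # across the line x + y = 1
```

Two regions that share a border differ in their side of exactly one line, so their flip counts differ by one and their colors differ, which is the same argument as in the inductive step.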

Induction is a very powerful technique. But you will need to exercise care while using it, since even small

errors can lead to proving ridiculously false statements. Here is a dramatic example: in the middle of the

last century, a colloquial expression in common use was "that is a horse of a different color", referring to

something that is quite different from normal or common expectation. The famous mathematician George

Polya (who was also a great expositor of mathematics for the lay public) gave the following proof to show

that there is no horse of a different color!

Theorem: All horses are the same color.

Proof (by induction on the number of horses):

Base Case: $P(1)$ is certainly true, since if you have a set containing just one horse, all horses in the set have the same color.

Inductive Hypothesis: Assume $P(n)$, which is the statement that in any set of $n$ horses, they all have the same color.

Inductive Step: Given a set of $n + 1$ horses $\{h_1, h_2, \ldots, h_{n+1}\}$, we can exclude the last horse in the set and apply the inductive hypothesis just to the first $n$ horses $\{h_1, \ldots, h_n\}$, deducing that they all have the same color. Similarly, we can conclude that the last $n$ horses $\{h_2, \ldots, h_{n+1}\}$ all have the same color. But now the "middle" horses $\{h_2, \ldots, h_n\}$ (i.e., all but the first and the last) belong to both of these sets, so they have the same color as horse $h_1$ and horse $h_{n+1}$. It follows, therefore, that all $n + 1$ horses have the same color. Thus, by the principle of induction, all horses have the same color.

Clearly, it is not true that all horses are of the same color, so where did we go wrong in our induction proof?

It is tempting to blame the induction hypothesis which is clearly false. But the whole point of induction


is that if the base case is true (which it is in this case), and assuming the induction hypothesis for any $n$ we can prove the case $n + 1$, then the statement is true for all $n$. So what we are looking for is a flaw in the reasoning!

What makes the flaw in this proof a little tricky to spot is that the induction step is valid for a "typical" value of $n$, say, $n = 3$. The flaw, however, is in the induction step when $n = 1$. In this case, for $n + 1 = 2$ horses, there are no "middle" horses, and so the argument completely breaks down!

Strengthening the Inductive Hypothesis

Let us prove by induction the following proposition:

Theorem: $\forall n \geq 1$, the sum of the first $n$ odd numbers is a perfect square.

Proof: By induction on $n$.

Base Case: $n = 1$. The first odd number is 1, which is a perfect square.

Inductive Hypothesis: Assume that the sum of the first $k$ odd numbers is a perfect square, say $m^2$.

Inductive Step: The $(k+1)$-th odd number is $2k + 1$, so by the induction hypothesis, the sum of the first $k + 1$ odd numbers is $m^2 + 2k + 1$. But now we are stuck. Why should $m^2 + 2k + 1$ be a perfect square?

Well, let's just take a detour and compute the values of the first few cases. Maybe we will identify another pattern.

$n = 1$: $1 = 1^2$ is a perfect square.

$n = 2$: $1 + 3 = 4 = 2^2$ is a perfect square.

$n = 3$: $1 + 3 + 5 = 9 = 3^2$ is a perfect square.

$n = 4$: $1 + 3 + 5 + 7 = 16 = 4^2$ is a perfect square.
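This kind of detour is easy to automate. The short Python sketch below tabulates a few more cases, using the fact that the $i$-th odd number is $2i - 1$; the function name `sum_first_odds` is illustrative.

```python
def sum_first_odds(n):
    """Sum of the first n odd numbers: 1 + 3 + ... + (2n - 1)."""
    return sum(2 * i - 1 for i in range(1, n + 1))

# Tabulate a few cases to look for a pattern.
for n in range(1, 9):
    print(n, sum_first_odds(n))
```

A table like this only suggests a conjecture; the proof still has to come from the induction.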

Wait, isn't there a pattern where the sum of the first $n$ odd numbers is just $n^2$? Here is an idea: let us show something stronger!

Theorem: For all $n \geq 1$, the sum of the first $n$ odd numbers is $n^2$.

Base Case: $n = 1$. The first odd number is 1, which is $1^2$.

Inductive Hypothesis: Assume that the sum of the first $k$ odd numbers is $k^2$.


Inductive Step: The $(k+1)$-st odd number is $2k + 1$, so by the induction hypothesis the sum of the first $k + 1$ odd numbers is $k^2 + (2k + 1) = (k+1)^2$. Thus by the principle of induction the theorem holds.

See if you can understand what happened here. We could not prove a proposition, so we proved a harder

proposition instead! Can you see why that can sometimes be easier when you are doing a proof by induction?

When you are trying to prove a stronger statement by induction, you have to show something harder in the induction step, but you also get to assume something stronger in the induction hypothesis. Sometimes the

stronger assumption helps you reach just that much further...

Here is another example:

Imagine that we are given L-shaped tiles (i.e., a $2 \times 2$ square tile with a missing $1 \times 1$ square), and we want to know if we can tile a $2^n \times 2^n$ courtyard with a missing $1 \times 1$ square in the middle. Here is an example of a successful tiling in the case that $n = 2$:

Let us try to prove the proposition by induction on $n$.

Base Case: Prove $P(1)$. This is the proposition that a $2 \times 2$ courtyard can be tiled with L-shaped tiles with a missing $1 \times 1$ square in the middle. But this is easy:

Inductive Hypothesis: Assume $P(n)$ is true. That is, we can tile a $2^n \times 2^n$ courtyard with a missing $1 \times 1$ square in the middle.

Inductive Step: We want to show that we can tile a $2^{n+1} \times 2^{n+1}$ courtyard with a missing $1 \times 1$ square in the middle. Let's try to reduce this problem so we can apply our inductive hypothesis. A $2^{n+1} \times 2^{n+1}$ courtyard can be broken up into four smaller courtyards of size $2^n \times 2^n$, each with a missing $1 \times 1$ square as follows:

But the holes are not in the middle of each $2^n \times 2^n$ courtyard, so the inductive hypothesis does not help! How should we proceed? We should strengthen our inductive hypothesis!

What we are about to do is completely counter-intuitive. It's like attempting to lift 100 pounds, failing, and then saying "I couldn't lift 100 pounds. Let me try to lift 200," and then succeeding! Instead of proving that


we can tile a $2^n \times 2^n$ courtyard with a hole in the middle, we will try to prove something stronger: that we can tile the courtyard with the hole being anywhere we choose. It is a trade-off: we have to prove more, but we also get to assume a stronger hypothesis. The base case is the same, so we will just work on the inductive hypothesis and step.

Inductive Hypothesis (second attempt): Assume $P(n)$ is true, so that we can tile a $2^n \times 2^n$ courtyard with a missing $1 \times 1$ square anywhere.

Inductive Step (second attempt): As before, we can break up the $2^{n+1} \times 2^{n+1}$ courtyard as follows.

By placing the first tile as shown, we get four $2^n \times 2^n$ courtyards, each with a $1 \times 1$ hole; three of these courtyards have the hole in one corner, while the fourth has the hole in a position determined by the hole in the $2^{n+1} \times 2^{n+1}$ courtyard. The stronger inductive hypothesis now applies to each of these four courtyards, so that each of them can be successfully tiled. Thus, we have proven that we can tile a $2^{n+1} \times 2^{n+1}$ courtyard with a hole anywhere! Hence, by the induction principle, we have proved the (stronger) theorem.
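The strengthened proof is constructive, so it translates into a recursive tiling procedure. The Python sketch below is one possible rendering, not the note's own code: the function name `tile`, the grid representation (0 for the hole, a positive id for each L-shaped tile), and the choice to recurse down to $1 \times 1$ boards rather than stopping at $2 \times 2$ are all assumptions made for brevity.

```python
def tile(n, hole):
    """Tile a 2^n x 2^n courtyard whose missing square is hole = (row, col).

    Returns a grid in which the hole is 0 and the three squares of each
    L-shaped tile share one positive id.
    """
    size = 2 ** n
    grid = [[0] * size for _ in range(size)]
    counter = [0]  # number of tiles placed so far

    def solve(top, left, side_len, hole_r, hole_c):
        if side_len == 1:
            return  # a 1 x 1 board consisting only of its hole
        counter[0] += 1
        tile_id = counter[0]
        half = side_len // 2
        mid_r, mid_c = top + half, left + half
        # Each quadrant paired with its square adjacent to the centre.
        quadrants = [
            (top, left, mid_r - 1, mid_c - 1),
            (top, mid_c, mid_r - 1, mid_c),
            (mid_r, left, mid_r, mid_c - 1),
            (mid_r, mid_c, mid_r, mid_c),
        ]
        for q_top, q_left, c_r, c_c in quadrants:
            has_hole = (q_top <= hole_r < q_top + half
                        and q_left <= hole_c < q_left + half)
            if has_hole:
                # This quadrant already has a hole somewhere inside it.
                solve(q_top, q_left, half, hole_r, hole_c)
            else:
                # The central tile covers this corner, which then acts as
                # the quadrant's hole in the recursive call.
                grid[c_r][c_c] = tile_id
                solve(q_top, q_left, half, c_r, c_c)

    solve(0, 0, size, hole[0], hole[1])
    return grid

for row in tile(2, (1, 1)):
    print(row)
```

The recursive calls are legitimate only because the hypothesis allows the hole anywhere: three of the four sub-courtyards have their hole in a corner, not in the middle.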

Strong Induction

Strong induction is very similar to simple induction, with the exception of the inductive hypothesis. With strong induction, instead of just assuming $P(k)$ is true, you assume the stronger statement that $P(0), P(1), \ldots,$ and $P(k)$ are all true (i.e., $P(0) \land P(1) \land \cdots \land P(k)$ is true, or in more compact notation $\bigwedge_{i=0}^{k} P(i)$ is true).

Strong induction sometimes makes the proof of the inductive step much easier since we get to assume a stronger statement, as illustrated in the next example.

Theorem: Every natural number $n > 1$ can be written as a product of primes.

Recall that a number $n \geq 2$ is prime if 1 and $n$ are its only divisors. Let $P(n)$ be the proposition that $n$ can be written as a product of primes. We will prove that $P(n)$ is true for all $n \geq 2$.

Base Case: We start at $n = 2$. Clearly $P(2)$ holds, since 2 is a prime number.

Inductive Hypothesis: Assume $P(k)$ is true for $2 \leq k \leq n$: i.e., every number $k$ with $2 \leq k \leq n$ can be written as a product of primes.

Inductive Step: We must show that $n + 1$ can be written as a product of primes. We have two cases: either $n + 1$ is a prime number, or it is not. For the first case, if $n + 1$ is a prime number, then we are done. For the second case, if $n + 1$ is not a prime number, then by definition $n + 1 = xy$, where $x, y \in \mathbb{Z}^+$ and $1 < x, y < n + 1$. By the inductive hypothesis, $x$ and $y$ can each be written as a product of primes, and therefore so can $n + 1$.



Why does this proof fail if we were to use simple induction? If we only assume $P(n)$ is true, then we cannot apply our inductive hypothesis to $x$ and $y$. For example, if we were trying to prove $P(42)$, we might write $42 = 6 \times 7$, and then it is useful to know that $P(6)$ and $P(7)$ are true. However, with simple induction, we could only assume $P(41)$, i.e., that 41 can be written as a product of primes, a fact that is not useful in establishing $P(42)$.
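The structure of this strong induction proof matches a recursive procedure: to factor a number, either it is prime, or it splits as $xy$ and we recurse on both factors, which may be much smaller than the number itself. The Python sketch below is an illustration under that reading; the function name `prime_factors` and the use of trial division to find a split are assumptions.

```python
import math

def prime_factors(n):
    """Return a list of primes whose product is n, for n >= 2."""
    for x in range(2, math.isqrt(n) + 1):
        if n % x == 0:
            y = n // x
            # n = x * y with 1 < x, y < n: recurse on both factors,
            # just as the proof applies the hypothesis to x and y.
            return prime_factors(x) + prime_factors(y)
    return [n]  # no divisor found, so n itself is prime

print(prime_factors(42))
```

The recursive calls are on arbitrary smaller numbers, not on $n - 1$, which is exactly why the proof needs the hypothesis for every $k \leq n$.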

To understand why strong induction works, let's think about our domino analogy. By the time we are ready for the $(k+1)$-st domino to fall, dominoes numbered 0 through $k$ have already been knocked over. But this is exactly what strong induction assumes: to prove $P(k+1)$, we can assume we already know that $P(0)$ through $P(k)$ are true.

Simple Induction vs. Strong Induction

We have seen that strong induction makes certain proofs easy when simple induction seems to fail. A natural

question to ask then, is whether the strong induction axiom is logically stronger than the simple induction

axiom. In fact, the two methods of induction are logically equivalent. Clearly anything that can be proven by

simple induction can also be proven by strong induction (convince yourself of this!). For the other direction,

suppose we can prove by strong induction that $\forall n\, P(n)$. Let $Q(k) = P(0) \land \cdots \land P(k)$. Let us prove $\forall k\, Q(k)$ by simple induction. The proof is modeled after the strong induction proof of $\forall n\, P(n)$. That is, we want to show $Q(k) \Rightarrow Q(k+1)$, or equivalently $P(0) \land \cdots \land P(k) \Rightarrow P(0) \land \cdots \land P(k) \land P(k+1)$. But this is true iff $P(0) \land \cdots \land P(k) \Rightarrow P(k+1)$. This is exactly what the strong induction proof of $\forall n\, P(n)$ establishes! Therefore, we can establish $\forall n\, Q(n)$ by simple induction. And clearly, proving $\forall n\, Q(n)$ also proves $\forall n\, P(n)$.

Well Ordering Principle

In the context of proving statements about algorithms or programs, it is often convenient to formulate an induction proof in a different way. We start by asking how the statement $\forall n \in \mathbb{N}, P(n)$ could fail. Well, it means that there must be some values of $n$ for which $P(n)$ is false. Let $m$ be the smallest such natural number. We know that $m$ must be greater than 0 since $P(0)$ is true (base case), which indicates $m - 1 \in \mathbb{N}$. Since $m$ is the smallest input that makes $P(m)$ false, $P(m-1)$ must be true. But $P(m-1) \Rightarrow P(m)$, which is a contradiction.

We assumed something when defining $m$ that is usually taken for granted: that we can actually find a smallest number in any subset of natural numbers. This property does not hold for, say, the real numbers; to see why, consider the set $\{x \in \mathbb{R} : 0 < x < 1\}$, which has no smallest element. The well ordering principle states that every nonempty subset of the natural numbers has a smallest element.



Let's look at an example.

Round robin tournament: Suppose that, in a round robin tournament, we have a set of $k$ players $\{p_1, p_2, \ldots, p_k\}$ such that $p_1$ beats $p_2$, $p_2$ beats $p_3$, $\ldots$, $p_{k-1}$ beats $p_k$, and $p_k$ beats $p_1$. This is called a cycle in the tournament:

(A round robin tournament is a tournament where each participant plays every other contestant exactly once. Thus, if there are $n$ players, there will be exactly $\frac{n(n-1)}{2}$ matches. Also, we are assuming that every match ends in either a win or a loss; no ties.)

Claim: If there exists a cycle in a tournament, then there exists a cycle of length 3.

Proof: For the base case, notice that we cannot have a cycle of length less than 3, and if there is a cycle of

length 3 then the proposition is true.

Assume for contradiction that the smallest cycle is:

with $n > 3$. Let us look at the game between $p_1$ and $p_3$. We have two cases: either $p_3$ beats $p_1$, or $p_1$ beats $p_3$. In the first case (where $p_3$ beats $p_1$), we are done because $p_1, p_2, p_3$ form a 3-cycle. In the second case (where $p_1$ beats $p_3$), we have a shorter cycle $\{p_1, p_3, p_4, \ldots, p_n\}$ and thus a contradiction. Therefore, if there exists a cycle, then there must exist a 3-cycle as well.

Can we prove this claim using more traditional induction? Let us start with the base case of $n = 3$ players and proceed from there.

Base Case: As above.

Inductive Hypothesis: If a round-robin tournament has a cycle of length $k$ then it has a cycle of length 3.


Inductive Step: Given a round-robin tournament with a cycle of length $k + 1$, we wish to show there must be a 3-cycle. Assume wlog that the cycle involves players $p_1$ through $p_{k+1}$ in that order. Consider the outcome of the match between $p_1$ and $p_3$. If $p_3$ beats $p_1$ then we have a 3-cycle. If $p_1$ beats $p_3$, there is a $k$-cycle that goes directly from $p_1$ to $p_3$ and continues as before. Applying the induction hypothesis, we conclude that there must be a 3-cycle in the tournament.
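Both versions of the argument describe the same procedure: repeatedly inspect the match between $p_1$ and $p_3$, and either stop with a 3-cycle or drop $p_2$ to shorten the cycle. Here is a minimal Python sketch of that procedure; the function name `find_3_cycle`, the representation of results as a set of `(winner, loser)` pairs, and the five-player example are all assumptions for illustration.

```python
def find_3_cycle(beats, cycle):
    """Shrink a cycle in a round robin tournament down to a 3-cycle.

    beats(a, b) is True when a beat b. cycle lists players so that each
    beats the next, and the last beats the first.
    """
    cycle = list(cycle)
    while len(cycle) > 3:
        p1, p2, p3 = cycle[0], cycle[1], cycle[2]
        if beats(p3, p1):
            return [p1, p2, p3]   # p1 -> p2 -> p3 -> p1
        # Otherwise p1 beat p3, so skipping p2 gives a shorter cycle.
        del cycle[1]
    return cycle

# Example tournament on players 0..4 containing the cycle 0, 1, 2, 3, 4.
results = {(0, 1), (1, 2), (2, 3), (3, 4), (4, 0),
           (0, 2), (0, 3), (1, 3), (1, 4), (2, 4)}

def beats(a, b):
    return (a, b) in results

print(find_3_cycle(beats, [0, 1, 2, 3, 4]))
```

Each pass either returns or shortens the cycle by one player, so the loop runs at most $k - 3$ times on a cycle of length $k$.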

Induction and Recursion

There is an intimate connection between induction and recursion in mathematics and computer science. A

recursive definition of a function over the natural numbers specifies the value of the function at small values

of $n$, and defines the value of $f(n)$ for a general $n$ in terms of the value of $f(m)$ for $m < n$.



Can you figure out how long this program takes to compute $F(n)$? This is a very inefficient way to compute the $n$-th Fibonacci number. A much faster way is to turn this into an iterative algorithm (this should be a familiar example of turning a tail-recursion into an iterative algorithm):

function F2(n)
  if n=0 then return 0
  if n=1 then return 1
  a = 1
  b = 0
  for k = 2 to n do
    temp = a
    a = a + b
    b = temp
  return a

Can you show by induction that this new function $F2(n) = F(n)$? How long does this program take to compute $F(n)$?
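For experimentation, the two approaches can be written in Python as follows. This is a sketch assuming the standard definition $F(0) = 0$, $F(1) = 1$, $F(n) = F(n-1) + F(n-2)$; the names `fib_recursive` and `fib_iterative` are illustrative, and `fib_iterative` mirrors the pseudocode for F2 line by line.

```python
def fib_recursive(n):
    """Direct translation of the recursive definition (slow for large n)."""
    if n < 2:
        return n
    return fib_recursive(n - 1) + fib_recursive(n - 2)

def fib_iterative(n):
    """Iterative version: a holds F(k) and b holds F(k-1) after step k."""
    if n == 0:
        return 0
    a, b = 1, 0
    for _ in range(2, n + 1):
        a, b = a + b, a
    return a

print([fib_iterative(n) for n in range(10)])
```

The comment in `fib_iterative` states the loop invariant, which is the natural induction hypothesis for proving that the two functions agree.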

Clearly, induction and recursion are closely related. In fact, proofs involving a recursively-defined concept,

e.g. factorial, are often best done using induction. Formally, the factorial of a nonnegative number n is

defined recursively as $n! = n(n-1)(n-2) \cdots 1$, with a base case $0! = 1$, whereas exponentiation is defined recursively as $x^n = x^{n-1} \cdot x$. In this next example, we will look at an inequality between two functions of $n$. Such inequalities are useful in computer science when showing that one algorithm is more efficient than

another.

Notice that for this example, we have chosen as our base case $n = 2$ rather than $n = 0$. This is because the statement is trivially true for n1= n!



Practice Problems

1. Prove for any natural number $n$ that $1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{1}{6} n(n+1)(2n+1)$.

2. Prove that $3^n > 2^n$ for all natural numbers $n \geq 1$.

3. In real analysis, Bernoulli's Inequality is an inequality which approximates the exponentiations of $1 + x$. Prove this inequality, which states that $(1 + x)^n \geq 1 + nx$ if $n$ is a natural number and $1 + x > 0$.




The Stable Marriage Problem: Induction Proofs in Algorithms

A dating agency must match up $n$ men and $n$ women. Each man has an ordered preference list of the $n$ women, and each woman has a similar list of the $n$ men. Is there a good algorithm to pair them up?

Consider for example $n = 3$ men represented by numbers 1, 2, and 3 and 3 women A, B, and C, with the following preference lists:

Men   Women
1     A B C
2     B A C
3     A B C

Women   Men
A       2 1 3
B       1 2 3
C       1 2 3

There are many possible pairings for this example, two of which are {(1,A), (2,B), (3,C)} and {(1,B), (2,C),

(3,A)}. How do we decide which pairing to choose? Let us look at an algorithm for this problem that is

simple, fast, and widely-used.

The Stable Marriage Algorithm 1

Every Morning: Each man goes to the first woman on his list not yet crossed off and proposes to her.

Every Afternoon: Each woman says "maybe, come back tomorrow" to the man she likes best among the proposals (she now has him on a string) and "never" to all the rest.

Every Evening: Each rejected suitor crosses off the woman who rejected him from his list.

The above loop is repeated each successive day until there are no more rejected suitors. On this day, each woman marries the man she has on a string.
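The three daily phases map onto a simple loop. The following Python sketch simulates the algorithm day by day; the function name `stable_marriage`, the dictionary representation of the preference lists, and the returned `(pairing, days)` tuple are assumptions made for illustration, and the sketch assumes equal numbers of men and women with complete lists.

```python
def stable_marriage(men_prefs, women_prefs):
    """Simulate the morning/afternoon/evening loop.

    men_prefs[m] and women_prefs[w] list partners from most to least
    preferred. Returns (pairing as a dict woman -> man, number of days).
    """
    rank = {w: {m: i for i, m in enumerate(prefs)}
            for w, prefs in women_prefs.items()}
    next_choice = {m: 0 for m in men_prefs}  # first woman not crossed off
    days = 0
    while True:
        days += 1
        # Morning: each man proposes to the first woman not crossed off.
        proposals = {}
        for m, prefs in men_prefs.items():
            proposals.setdefault(prefs[next_choice[m]], []).append(m)
        # Afternoon: each woman keeps her favourite suitor on a string.
        on_string, rejected = {}, []
        for w, suitors in proposals.items():
            best = min(suitors, key=lambda m: rank[w][m])
            on_string[w] = best
            rejected.extend(m for m in suitors if m != best)
        if not rejected:
            return on_string, days
        # Evening: each rejected man crosses off the woman who rejected him.
        for m in rejected:
            next_choice[m] += 1

men = {1: ["A", "B", "C"], 2: ["B", "A", "C"], 3: ["A", "B", "C"]}
women = {"A": [2, 1, 3], "B": [1, 2, 3], "C": [1, 2, 3]}
pairing, days = stable_marriage(men, women)
print(pairing, days)
```

Running it on the three-by-three example above lets you trace who proposes to whom on each day.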

How is this algorithm used in the real world?

1 This algorithm is based on a 1950s model of dating where the men propose to the women, and the women accept or reject these proposals.

CS 70, Fall 2013, Note 2 1



The Residency Match

Perhaps the most well-known application of the Stable Marriage Algorithm is the residency match program,

which pairs medical school graduates and residency slots (internships) at teaching hospitals. Graduates

and hospitals submit their ordered preference lists, and the stable pairing produced by a computer matches

students with residency programs.

The road to the residency match program was long and twisted. Medical residency programs were first

introduced about a century ago. Since interns offered a source of cheap labor for hospitals, soon the number

of residency slots exceeded the number of medical graduates, resulting in fierce competition. Hospitals tried

to outdo each other by making their residency offers earlier and earlier. By the mid-40s, offers for residency

were being made by the beginning of junior year of medical school, and some hospitals were contemplating

even earlier offers to sophomores! The American Medical Association finally stepped in and prohibited

medical schools from releasing student transcripts and reference letters until their senior year. This sparked

a new problem, with hospitals now making "short fuse" offers to make sure that if their offer was rejected

they could still find alternate interns to fill the slot. Once again the competition between hospitals led to an

unacceptable situation, with students being given only a few hours to decide whether they would accept an

offer.

Finally, in the early 50s, this unsustainable situation led to the centralized system called the National Residency Matching Program (N.R.M.P.) in which the hospitals ranked the residents and the residents ranked

the hospitals. The N.R.M.P. then produced a pairing between the applicants and the hospitals, though at

first this pairing was not stable. It was not until 1952 that the N.R.M.P. switched to the Stable Marriage

Algorithm, resulting in a stable pairing.

Most recently, Lloyd Shapley and Alvin Roth won the 2012 Nobel Prize in Economic Sciences for extending the Stable Marriage Algorithm we study in this lecture!2

Properties of the Stable Marriage Algorithm

We wish to show that the stable marriage algorithm is fast and finds a good pairing. But first, we must show

that it halts. Here is a simple argument: on each day that the algorithm does not halt, at least one man must

eliminate some woman from his list (otherwise the halting condition for the algorithm would be invoked).

Since each list has n elements, and there are n lists, this means that the algorithm must terminate in at most

n2 days. Next, we need to show that the Stable Marriage Algorithm finds a good pairing. Before we do this,

we should discuss what we consider to be a good pairing.

Stability

What properties should a good pairing have? One possible criterion for a good pairing is one in which the

number of first ranked choices is maximized. Another possibility is to minimize the number of last ranked

choices. Or perhaps minimizing the sum of the ranks of the choices, which may be thought of as maximizing

the average happiness.

2 See http://www.nobelprize.org/nobel_prizes/economic-sciences/laureates/2012/popular-economicsciences2012.pdf for more details.


In this lecture we will focus on a very basic criterion: stability. A pairing is unstable if there is a man and a

woman who prefer each other to their current partners. We will call such a pair a rogue couple. So a pairing

of $n$ men and $n$ women is stable if it has no rogue couples.

An unstable pairing from the example given in the beginning is: {(1,C), (2,B), (3,A)}. The reason is that 1 and B form a rogue couple, since 1 would rather be with B than C (his current partner), and since B would rather be with 1 than 2 (her current partner).

An example of a stable pairing is: {(2,A), (1,B), (3,C)}. Note that (1,A) is not a rogue couple. It is true that man 1 would rather be with woman A than his current partner. Unfortunately for him, she would rather be with her current partner than with him. Note also that both 3 and C are paired with their least favorite choice in this matching. Nonetheless, it is a stable pairing, since none of their preferred choices would rather be with them.
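Checking stability is mechanical: for every man, look at each woman he prefers to his partner and ask whether she also prefers him to hers. A minimal Python sketch follows; the function name `rogue_couples` and the data layout are illustrative assumptions, and the preference lists are those of the three-by-three example.

```python
def rogue_couples(pairing, men_prefs, women_prefs):
    """Return all rogue couples (m, w) in a pairing given as (man, woman) pairs."""
    wife = {m: w for m, w in pairing}
    husband = {w: m for m, w in pairing}
    rogues = []
    for m, prefs in men_prefs.items():
        for w in prefs:
            if w == wife[m]:
                break  # every later woman ranks below his current partner
            w_list = women_prefs[w]
            if w_list.index(m) < w_list.index(husband[w]):
                rogues.append((m, w))  # w also prefers m to her partner
    return rogues

men = {1: ["A", "B", "C"], 2: ["B", "A", "C"], 3: ["A", "B", "C"]}
women = {"A": [2, 1, 3], "B": [1, 2, 3], "C": [1, 2, 3]}

print(rogue_couples({(1, "C"), (2, "B"), (3, "A")}, men, women))
print(rogue_couples({(2, "A"), (1, "B"), (3, "C")}, men, women))
```

A pairing is stable exactly when the returned list is empty; an unstable pairing may contain more than one rogue couple.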

Before we discuss how to find a stable pairing, let us ask a more basic question: do stable pairings always

exist? Surely the answer is yes, since we could start with any pairing and make it more and more stable as

follows: if there is a rogue couple, modify the current pairing so that they are together. Repeat. Surely this

procedure must result in a stable pairing! Unfortunately this reasoning is not sound. To demonstrate this,

let us consider a slightly different scenario, the roommates problem. Here we have $2n$ people who must be paired up to be roommates (the difference being that unlike the dating scenario, a person can be paired with any of the remaining $2n - 1$). The point is that nothing about the intuition about the progress made by the stable marriage algorithm relied on the fact that men can only be paired with women, so the same intuition

should apply to the roommates problem as well. The following counter-example illustrates the fallacy in the

reasoning:


of the Stable Marriage algorithm. We will use the preference lists given earlier, which are duplicated here

for convenience:

Men    Women
1      A  B  C
2      B  A  C
3      A  B  C

Women  Men
A      2  1  3
B      1  2  3
C      1  2  3

The following table shows which men propose to which women on the given day (the circled men are the

ones who were told maybe):

Thus, the stable pairing which the algorithm outputs is: {(1,A), (2,B), (3,C)}.
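As a rough check of this run, here is a minimal Python sketch of a day-by-day propose-and-reject procedure. It assumes the usual rules (each morning every man proposes to the first woman on his list who has not yet rejected him; each woman says "maybe" to her favorite suitor and rejects the rest). The function name propose_and_reject and the dictionary layout are our own illustrative choices, not part of the note.

```python
def propose_and_reject(men_prefs, women_prefs):
    """Return the final pairing as a set of (man, woman) tuples."""
    # rank[w][m] = position of man m on woman w's list (smaller means preferred)
    rank = {w: {m: i for i, m in enumerate(prefs)} for w, prefs in women_prefs.items()}
    crossed = {m: 0 for m in men_prefs}  # number of women who have rejected m so far
    while True:
        # Morning: each man proposes to the first woman who has not rejected him.
        proposals = {}
        for m, prefs in men_prefs.items():
            proposals.setdefault(prefs[crossed[m]], []).append(m)
        # Afternoon: each woman keeps her favorite suitor and rejects the others.
        rejected = []
        for w, suitors in proposals.items():
            best = min(suitors, key=rank[w].get)
            rejected += [m for m in suitors if m != best]
        if not rejected:
            # No rejections today: every woman holds exactly one proposal.
            return {(suitors[0], w) for w, suitors in proposals.items()}
        for m in rejected:
            crossed[m] += 1


# Preference lists from the tables above, most preferred first.
men = {1: ["A", "B", "C"], 2: ["B", "A", "C"], 3: ["A", "B", "C"]}
women = {"A": [2, 1, 3], "B": [1, 2, 3], "C": [1, 2, 3]}
pairing = propose_and_reject(men, women)
```

On these lists the sketch stops on the third day, after man 3 has been rejected by A and then by B.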

Theorem: The pairing produced by the algorithm is always stable.

Proof: We will show that no man M can be involved in a rogue couple. Consider any couple (M,W) in

the pairing and suppose that M prefers some woman W* to W. We will argue that W* prefers her partner

to M, so that (M,W*) cannot be a rogue couple. Since W* occurs before W in M's list, he must have

proposed to her before he proposed to W. Therefore, according to the algorithm, W* must have rejected him

for somebody she prefers. By the Improvement Lemma, W* likes her final partner at least as much, and therefore prefers him to M. Thus no man M can be involved in a rogue couple, and the pairing is stable.


Optimality

Consider the situation in which there are 4 men and 4 women with the following preference lists:

Men    Women
1      A  B  C  D
2      A  D  C  B
3      A  C  B  D
4      A  B  C  D

Women  Men
A      1  3  2  4
B      4  3  2  1
C      2  3  1  4
D      3  4  2  1

For these preference lists, there are exactly two stable pairings: S = {(1,A), (2,D), (3,C), (4,B)} and T = {(1,A), (2,C), (3,D), (4,B)}. The fact that there is more than one stable pairing brings up an interesting

question. What is the best possible partner for each person, say man 2 for example?

The trivial answer is his first choice (i.e., woman A), but that is just not a realistic possibility for him. Pairing

man 2 with woman A would simply not be stable, since he is so low on her preference list. And indeed there

is no stable pairing in which 2 is paired with A. Examining the two stable pairings, we can see that the best

possible realistic outcome for man 2 is to be matched to woman D.

Let us make some definitions to better express these ideas: we say the optimal woman for a man is the

highest woman on his list whom he is paired with in any stable pairing. In other words, the optimal woman

is the best that a man can do under the condition of stability. In the above example, woman D is the optimal

woman for man 2. So the best that each man can hope for is to be paired with his optimal woman. But can

they achieve this optimality simultaneously? I.e., is there a stable pairing such that each man is paired with

his optimal woman? If such a pairing exists, we will call it a male optimal pairing. Turning to the example

above, S is a male optimal pairing since A is 1's optimal woman, D is 2's optimal woman, C is 3's optimal

woman, and B is 4's optimal woman. Similarly, we can define a female optimal pairing, which is the pairing

in which each woman is paired with her optimal man. [Exercise: Check that T is a female optimal pairing.]
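These definitions can be checked by brute force on the 4-by-4 example. The sketch below tries every possible pairing, keeps the stable ones, and reads off a man's optimal woman. The names MEN, WOMEN, is_stable, stable_pairings and optimal_woman are hypothetical helpers introduced only for this illustration, and exhaustive search is only sensible for tiny instances.

```python
from itertools import permutations

# Preference lists from the example above, most preferred first.
MEN = {1: "ABCD", 2: "ADCB", 3: "ACBD", 4: "ABCD"}
WOMEN = {"A": [1, 3, 2, 4], "B": [4, 3, 2, 1], "C": [2, 3, 1, 4], "D": [3, 4, 2, 1]}


def is_stable(wife):
    """wife maps each man to his partner; True if the pairing has no rogue couple."""
    husband = {w: m for m, w in wife.items()}
    for m, prefs in MEN.items():
        # Look at every woman whom m strictly prefers to his current partner.
        for w in prefs[:prefs.index(wife[m])]:
            if WOMEN[w].index(m) < WOMEN[w].index(husband[w]):
                return False  # (m, w) is a rogue couple
    return True


def stable_pairings():
    """List every stable pairing, each as a dict from man to woman."""
    men = sorted(MEN)
    candidates = (dict(zip(men, ws)) for ws in permutations("ABCD"))
    return [pairing for pairing in candidates if is_stable(pairing)]


def optimal_woman(m):
    """Highest-ranked woman on m's list among his partners in stable pairings."""
    partners = {pairing[m] for pairing in stable_pairings()}
    return min(partners, key=MEN[m].index)
```

Calling stable_pairings() recovers the pairings S and T, and optimal_woman(2) gives D, in agreement with the discussion above.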

We can also go in the opposite direction and define the pessimal woman for a man to be the lowest ranked

woman whom he is ever paired with in some stable pairing. This leads naturally to the notion of a male pessimal pairing. Can you define it, and also a female pessimal pairing?

Now, a natural question to ask is: Who is better off in the Stable Marriage Algorithm: men or women?

Think about this before you read on...

Theorem: The pairing output by the Stable Marriage algorithm is male optimal.

Proof: Suppose for the sake of contradiction that the pairing is not male optimal. Assume the first day when a man got rejected by his optimal woman was day k. On this day, M was rejected by W* (his optimal mate) in favor of M*, who proposed to her. By definition of optimal woman, there must exist a stable pairing T in which M and W* are paired together. Suppose T looks like this: {. . . , (M,W*), . . . , (M*,W'), . . .}. We will argue that (M*,W*) is a rogue couple in T, thus contradicting stability.

First, it is clear that W* prefers M* to M, since she rejected M in favor of M* during the execution of the stable marriage algorithm. Moreover, since day k was the first day when some man got rejected by his optimal woman, on day k M* had not yet been rejected by his optimal woman. Since he proposed to W* on the k-th day, this implies that M* likes W* at least as much as his optimal woman, and therefore at least as much as W'. Therefore, (M*,W*) form a rogue couple in T, and so T is not stable. Contradiction. Thus, our assumption was wrong and the pairing is male optimal.


What proof techniques did we use to prove this theorem? Again it is a proof by induction, structured as an

application of the well-ordering principle. How do we see it as a regular induction proof? This is a bit subtle

to figure out. See if you can do so before reading on. As a hint, the proof is really showing by induction on k

the following statement: for every k, no man gets rejected by his optimal woman on the kth day. [Exercise:

can you complete the induction?]

So men appear to fare very well by following this algorithm. How about the women? The following theorem

confirms the sad truth:

Theorem: If a pairing is male optimal, then it is also female pessimal.

Proof: Let T = {. . . , (M, W), . . . } be the male optimal pairing (which we know is output by the algorithm).

Suppose for the sake of contradiction that there exists a stable pairing S = {. . . , (M*, W), . . . , (M, W'), . . . }

such that M* is lower on W's list than M (i.e., M is not her pessimal man). We will argue that S cannot

possibly be stable by showing that (M,W) is a rogue couple in S. By assumption, W prefers M to M* since

M* is lower on her list. And M prefers W to his partner W' in S because W is his partner in the male optimal

pairing T. Contradiction. Therefore, the male optimal pairing is female pessimal.

All this seems a bit unfair to the women! Are there any lessons to be learned from this? Make the first move!

Back to the National Residency Matching Program, until recently the algorithm was run with the hospitals

doing the proposing, and so the pairings produced were hospital optimal. In the nineties, the roles were

reversed so that the medical students were proposing to the hospitals. More recently, there were other

improvements made to the algorithm which the N.R.M.P. used. For example, the pairing takes into account

preferences for married students for positions at the same or nearby hospitals.

Further reading (optional!)

Though it was in use 10 years earlier, the propose-and-reject algorithm was first properly analyzed by Gale and Shapley, in a famous paper dating back to 1962 that still stands as one of the great achievements in the

analysis of algorithms. The full reference is:

D. Gale and L.S. Shapley, "College Admissions and the Stability of Marriage," American Mathematical Monthly 69 (1962), pp. 9–14.

Stable marriage and its numerous variants remain an active topic of research in computer science. Although

it is by now twenty years old, the following very readable book covers many of the interesting developments

since Gale and Shapley's algorithm:

D. Gusfield and R.W. Irving, The Stable Marriage Problem: Structure and Algorithms, MIT Press, 1989.




Modular Arithmetic

In several settings, such as error-correcting codes and cryptography, we sometimes wish to work over a smaller range of numbers. Modular arithmetic is useful in these settings, since it limits numbers to a predefined range {0, 1, . . . , N − 1}, and wraps around whenever you try to leave this range, like the hand of a clock (where N = 12) or the days of the week (where N = 7).

Example: Calculating the time. When you calculate the time, you automatically use modular arithmetic.

For example, if you are asked what time it will be 13 hours from 1 pm, you say 2 am rather than 14.

Let's assume our clock displays 12 as 0. This is limiting numbers to a predefined range, {0, 1, 2, . . . , 11}. Whenever you add two numbers in this setting, you divide by 12 and provide the remainder as the answer.

If we wanted to know what the time would be 24 hours from 2 pm, the answer is easy. It would be 2 pm.

This is true not just for 24 hours, but for any multiple of 12 hours. What about 25 hours from 2 pm? Since

the time 24 hours from 2 pm is still 2 pm, 25 hours later it would be 3 pm. Another way to say this is that

we add 1 hour, which is the remainder when we divide 25 by 12.

This example shows that under certain circumstances it makes sense to do arithmetic within the confines

of a particular number (12 in this example). That is, we only keep track of the remainder when we divide

by 12, and when we need to add two numbers, instead we just add the remainders. This method is quite

efficient in the sense of keeping intermediate values as small as possible, and we shall see in later notes how

useful it can be.

More generally we can define x mod m (in words, x modulo m) to be the remainder r when we divide x by m. That is, if x mod m = r, then x = mq + r where 0 ≤ r ≤ m − 1 and q is an integer. Thus 5 = 29 mod 12 and 3 = 13 mod 5.

Computation

If we wish to calculate x + y mod m, we would first add x + y and then calculate the remainder when we divide the result by m. For example, if x = 14 and y = 25 and m = 12, we would compute the remainder when we divide x + y = 14 + 25 = 39 by 12, to get the answer 3. Notice that we would get the same answer if we first computed 2 = x mod 12 and 1 = y mod 12 and added the results modulo 12 to get 3. The same holds for subtraction: x − y mod 12 is −11 mod 12, which is 1. Again, we could have directly obtained this as 2 − 1 by first simplifying x mod 12 and y mod 12.

This is even more convenient if we are trying to multiply: to compute xy mod 12, we could first compute xy = 14 · 25 = 350 and then compute the remainder when we divide by 12, which is 2. Notice that we get the same answer if we first compute 2 = x mod 12 and 1 = y mod 12 and simply multiply the results modulo 12.

More generally, while carrying out any sequence of additions, subtractions or multiplications mod m, we get the same answer even if we reduce any intermediate results mod m. This can considerably simplify the calculations.
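The worked numbers above (x = 14, y = 25, m = 12) can be replayed in a few lines of Python. This is only a sketch: the variable names are ours, and it relies on the fact that Python's % operator returns a non-negative remainder for a positive modulus, which matches the convention used in this note.

```python
x, y, m = 14, 25, 12

# Reduce only at the end.
direct_sum = (x + y) % m       # 39 mod 12
direct_diff = (x - y) % m      # -11 mod 12
direct_prod = (x * y) % m      # 350 mod 12

# Reduce the operands first (x mod 12 = 2, y mod 12 = 1), then combine.
rx, ry = x % m, y % m
reduced_sum = (rx + ry) % m
reduced_diff = (rx - ry) % m
reduced_prod = (rx * ry) % m
```

Both routes give 3 for the sum, 1 for the difference and 2 for the product.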

Set Representation

There is an alternate view of modular arithmetic which helps understand all this better. For any integer m we say that x and y are congruent modulo m if they differ by a multiple of m, or in symbols,

x = y mod m ⟺ m divides (x − y).

Note that you may also see this written as x ≡ y mod m. For example, 29 and 5 are congruent modulo 12 because 12 divides 29 − 5. We can also write 22 = −2 mod 12. Notice that x and y are congruent modulo m iff they have the same remainder modulo m.

What is the set of numbers that are congruent to 0 mod 12? These are all the multiples of 12:

{. . . , −36, −24, −12, 0, 12, 24, 36, . . .}. What about the set of numbers that are congruent to 1 mod 12? These are all the numbers that give a remainder 1 when divided by 12: {. . . , −35, −23, −11, 1, 13, 25, 37, . . .}. Similarly the set of numbers congruent to 2 mod 12 is {. . . , −34, −22, −10, 2, 14, 26, 38, . . .}. Notice in this way we get 12 such sets of integers, and every integer belongs to one and only one of these sets.

In general if we work modulo m, then we get m such disjoint sets whose union is the set of all integers. We can think of each set as represented by the unique element it contains in the range (0, . . . , m − 1). The set represented by element i would be all numbers z such that z = mx + i for some integer x. Observe that all of these numbers have remainder i when divided by m; they are therefore congruent modulo m.

We can understand the operations of addition, subtraction and multiplication in terms of these sets. When

we add two numbers, say x = 2 mod 12 and y = 1 mod 12, it does not matter which x and y we pick from the two sets, since the result is always an element of the set that contains 3. The same is true about subtraction

and multiplication. It should now be clear that the elements of each set are interchangeable when computing

modulo m, and this is why we can reduce any intermediate results modulo m.

Here is a more formal way of stating this observation:

Theorem 3.1: If a = c mod m and b = d mod m, then a + b = c + d mod m and a · b = c · d mod m.

Proof: We know that c = a + k · m and d = b + ℓ · m for some integers k and ℓ, so c + d = a + k · m + b + ℓ · m = a + b + (k + ℓ) · m, which means that a + b = c + d mod m. The proof for multiplication is similar and left as an exercise.

What this theorem tells us is that we can always reduce any arithmetic expression modulo m into a natural

number smaller than m. As an example, consider the expression (13 + 11) · 18 mod 7. Using the above Theorem several times we can write:

(13 + 11) · 18 = (6 + 4) · 4 mod 7
               = 10 · 4 mod 7
               = 3 · 4 mod 7
               = 12 mod 7
               = 5 mod 7.

In summary, we can always do calculations modulo m by reducing intermediate results modulo m.

Inverses

We have so far discussed addition, multiplication and subtraction. What about division? This is a bit harder. Over the reals dividing by a number x is the same as multiplying by y = 1/x. Here y is that number such that x · y = 1. Of course we have to be careful when x = 0, since such a y does not exist. Similarly, when we wish to divide by x mod m, we need to find y mod m such that x · y = 1 mod m; then dividing by x modulo m will be the same as multiplying by y modulo m. Such a y is called the multiplicative inverse of x modulo m. In our present setting of modular arithmetic, can we be sure that x has an inverse mod m, and if so, is it unique (modulo m) and can we compute it?

As a first example, take x = 8 and m = 15. Then 2x = 16 = 1 mod 15, so 2 is a multiplicative inverse of 8 mod 15. As a second example, take x = 12 and m = 15. Then the sequence {ax mod m : a = 0, 1, 2, . . .} is periodic, and takes on the values {0, 12, 9, 6, 3} (check this!). Thus 12 has no multiplicative inverse mod 15.
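Both examples can be checked by exhaustive search. The sketch below simply tries every a in {0, . . . , m − 1}; it is meant as a sanity check rather than an efficient method, and the function names inverse_by_search and multiples are our own.

```python
def inverse_by_search(x, m):
    """Return the a in {0, ..., m-1} with a*x = 1 mod m, or None if none exists."""
    for a in range(m):
        if (a * x) % m == 1:
            return a
    return None


def multiples(x, m):
    """The sequence a*x mod m for a = 0, 1, ..., m-1."""
    return [(a * x) % m for a in range(m)]


inv_8 = inverse_by_search(8, 15)           # the first example
inv_12 = inverse_by_search(12, 15)         # the second example: no inverse
values_12 = sorted(set(multiples(12, 15)))
```

Here values_12 contains only the multiples of 3 below 15, so the value 1 is never reached.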

So when does x have a multiplicative inverse modulo m? The answer is: iff the greatest common divisor of m and x is 1. Moreover, when the inverse exists it is unique. Recall that the greatest common divisor of two natural numbers x and y, denoted gcd(x,y), is the largest natural number that divides them both. For example, gcd(30,24) = 6. If gcd(x,y) is 1, it means that x and y share no common factors (except 1). This is often expressed by saying that x and y are relatively prime.

Theorem 3.2: Let m, x be positive integers such that gcd(m,x) = 1. Then x has a multiplicative inverse modulo m, and it is unique (modulo m).

Proof: Consider the sequence of m numbers 0, x, 2x, . . . , (m − 1)x. We claim that these are all distinct modulo m. Since there are only m distinct values modulo m, it must then be the case that ax = 1 mod m for exactly one a (modulo m). This a is the unique multiplicative inverse.

To verify the above claim, suppose that ax = bx mod m for two distinct values a, b in the range 0 ≤ b ≤ a ≤ m − 1. Then we would have (a − b)x = 0 mod m, or equivalently, (a − b)x = km for some integer k (possibly zero or negative).

However, x and m are relatively prime, so x cannot share any factors with m. This implies that a − b must be an integer multiple of m. This is not possible, since a − b ranges between 1 and m − 1.

Actually it turns out that gcd(m,x) = 1 is also a necessary condition for the existence of an inverse: i.e., if gcd(m,x) > 1 then x has no multiplicative inverse modulo m. You might like to try to prove this using a similar idea to that in the above proof.

Since we know that multiplicative inverses are unique when gcd(m,x) = 1, we shall write the inverse of x as x^{−1} mod m. Being able to compute the multiplicative inverse of a number is crucial to many applications, so ideally the algorithm used should be efficient. It turns out that we can use an extended version of Euclid's algorithm, which computes the gcd of two numbers, to compute the multiplicative inverse.

Computing the Multiplicative Inverse

Let us first discuss how computing the multiplicative inverse of x modulo m is related to finding gcd(x,m). For any pair of numbers x, y, suppose we could not only compute gcd(x,y), but also find integers a, b such that

d = gcd(x,y) = ax + by. (1)

(Note that this is not a modular equation; and the integers a, b could be zero or negative.) For example, we can write 1 = gcd(35,12) = −1 · 35 + 3 · 12, so here a = −1 and b = 3 are possible values for a, b.

If we could do this then we'd be able to compute inverses, as follows. We first find the integers a and b such that

1 = gcd(m,x) = am + bx.

But this means that bx = 1 mod m, so b is the multiplicative inverse of x modulo m. Reducing b modulo m gives us the unique inverse we are looking for. In the above example, we see that 3 is the multiplicative inverse of 12 mod 35. So, we have reduced the problem of computing inverses to that of finding integers a, b that satisfy equation (1). Remarkably, Euclid's algorithm for computing gcds also allows us to find the integers a and b described above. So computing the multiplicative inverse of x modulo m is as simple as running Euclid's gcd algorithm on input x and m!

Euclid's Algorithm

If we wish to compute the gcd of two numbers x and y, how would we proceed? If x or y is 0, then computing the gcd is easy; it is simply the other number, since 0 is divisible by everything (although of course it divides nothing). The algorithm for computing gcd(x,y) uses the following theorem to eventually reduce to the case where one of the numbers is 0:

Theorem 3.3: Let x ≥ y and let q, r be natural numbers such that x = yq + r and r < y. Then gcd(x,y) = gcd(r,y).

Let's go through a quick example of this recursive implementation of Euclid's algorithm. We wish to

compute gcd(32,10):

gcd(32,10) = gcd(10,2)

= gcd(2,0)

= 2
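The chain of equalities above can be reproduced with a short recursive function. This is a sketch of the standard recursive form of Euclid's algorithm, which replaces the pair (x, y) by (y, x mod y) until the second argument is 0; the Python function name gcd is our own choice, and the sketch assumes natural-number inputs.

```python
def gcd(x, y):
    """Greatest common divisor of natural numbers x and y via Euclid's algorithm."""
    if y == 0:
        return x               # gcd(x, 0) = x, since everything divides 0
    return gcd(y, x % y)       # gcd(x, y) = gcd(y, x mod y)


example = gcd(32, 10)          # follows gcd(32,10) = gcd(10,2) = gcd(2,0)
```

If the arguments are given in the "wrong" order (x < y), the first recursive call simply swaps them, so the function still terminates.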


Extended Euclid's Algorithm

In order to compute the multiplicative inverse, we need an algorithm which also returns integers a and b such that:

gcd(x,y) = ax + by.

Now since this problem is a generalization of the basic gcd, it is perhaps not too surprising that we can solve it with a fairly straightforward extension of Euclid's algorithm.

Examples

Let's first see how we would compute such numbers for x = 6 and y = 4. We'll need the equations from our example above, copied here for reference:

16 = 10 · 1 + 6
10 = 6 · 1 + 4
6 = 4 · 1 + 2
4 = 2 · 2 + 0

From the last two equations it follows that gcd(6,4) = 2. But now the second last equation gives us the numbers a, b, since we just rearrange that equation to say 2 = 6 · 1 − 4 · 1. So a = 1 and b = −1.

What if we started with x = 10 and y = 6? Now we would write the last three equations to determine that gcd(10,6) = 2. But how do we find a, b? Start as above and write 2 = 6 · 1 − 4 · 1. But we want 10 and 6 on the right hand side, not 6 and 4. But notice that the third from the last equation allows us to write 4 as a linear combination of 6 and 10 and so we can just back substitute: we rewrite that equation as 4 = 10 · 1 − 6 · 1 and substitute to get:

2 = 6 · 1 − 4 · 1 = 6 · 1 − (10 · 1 − 6 · 1) = 6 · 2 − 10 · 1.

If we started with x = 16 and y = 10 we would back substitute again using the first equation rewritten as 6 = 16 − 10 to get: 2 = 6 · 2 − 10 · 1 = (16 − 10) · 2 − 10 = 16 · 2 − 10 · 3. So a = 2 and b = −3.

Algorithm

The following recursive algorithm extended-gcd implements the idea used in the examples above. It takes as input a pair of natural numbers x ≥ y as in Euclid's algorithm, and returns a triple of integers (d,a,b) such that d = gcd(x,y) and d = ax + by:

algorithm extended-gcd(x,y)
  if y = 0 then return(x, 1, 0)
  else
    (d, a, b) := extended-gcd(y, x mod y)
    return((d, b, a - (x div y) * b))

Note that this algorithm has the same form as the basic gcd algorithm we saw earlier; the only difference is that we now carry around in addition the required values a, b. You should hand-turn the algorithm on the input (x,y) = (16,10) from our earlier example, and check that it delivers correct values for a, b.
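For readers who prefer to run the algorithm rather than hand-turn it, here is a direct Python transcription of extended-gcd, together with a small helper that uses it to compute inverses. The helper mod_inverse and its error handling are our own additions for illustration; x div y is written as integer division x // y.

```python
def extended_gcd(x, y):
    """Return (d, a, b) with d = gcd(x, y) and d = a*x + b*y."""
    if y == 0:
        return (x, 1, 0)
    d, a, b = extended_gcd(y, x % y)
    return (d, b, a - (x // y) * b)


def mod_inverse(x, m):
    """Multiplicative inverse of x modulo m, reduced to the range 0..m-1."""
    d, a, _ = extended_gcd(x, m)
    if d != 1:
        raise ValueError("x has no inverse modulo m because gcd(x, m) > 1")
    return a % m               # a*x + b*m = 1, so a*x = 1 mod m


triple = extended_gcd(16, 10)  # the example from the text
inv = mod_inverse(12, 35)
```

On the input (16, 10) the triple agrees with the back substitution carried out earlier, and mod_inverse(12, 35) recovers the inverse 3 found from 1 = −1 · 35 + 3 · 12.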


You'll see a full analysis of this algorithm in CS 170, including correctness and efficiency (the running time is O(n^3) on n-bit inputs). Let us understand intuitively why the numbers a and b returned by the algorithm should give us what we are looking for. We just need to generalize the back substitution method we used in the example above. The algorithm reduces finding gcd(x,y) to finding gcd(y, x mod y). Once the algorithm finds gcd(y, x mod y), it returns values a and b such that:

d = ay + b(x mod y). (2)

Now we need to update these values of a and b, say to A and B, such that

d = Ax + By. (3)

To figure out what A and B should be, we need to rearrange equation (2), as follows:

d = ay + b(x mod y)
  = ay + b(x − ⌊x/y⌋y)
  = bx + (a − ⌊x/y⌋b)y.

(In the second line here, we have used the fact that x mod y = x − ⌊x/y⌋y; check this!) Comparing this last equation with equation (3), we see that we need to take A = b and B = a − ⌊x/y⌋b. This is exactly what the algorithm does. Of course we have not fully proved correctness, but you should be able to see why the algorithm works.




This note is partly based on Section 1.4 of "Algorithms," by S. Dasgupta, C. Papadimitriou and U. Vazirani, McGraw-Hill, 2007.

An online draft of the book is available at http://www.cs.berkeley.edu/~vazirani/algorithms.html

Public Key Cryptography

Bijections

This note introduces the fundamental concept of a function, as well as a famous function called RSA that forms the basis of public-key cryptography. A function is a mapping from a set of inputs A to a set of outputs B: for input x ∈ A, f(x) must be in the set B. To denote such a function, we write f : A → B.

Consider the following examples of functions, where both functions map {0, . . . , m − 1} to itself:

f(x) = x + 1 mod m

g(x) = 2x mod m

A bijection is a function for which every b ∈ B has a unique pre-image a ∈ A such that f(a) = b. Note that this consists of two conditions:

1. f is onto: every b ∈ B has a pre-image a ∈ A.

2. f is one-to-one: for all a, a' ∈ A, if f(a) = f(a') then a = a'.

Looking back at our examples, we can see that f is a bijection; the unique pre-image of y is y − 1 mod m. However, g is only a bijection if m is odd. Otherwise, it is neither one-to-one nor onto. The following lemma can be used to prove that a function is a bijection:

Lemma: For a finite set A, f : A → A is a bijection if there is an inverse function g : A → A such that ∀x ∈ A, g(f(x)) = x.

Proof: If f(x) = f(x'), then x = g(f(x)) = g(f(x')) = x'. Therefore, f is one-to-one. Since f is one-to-one, there must be |A| elements in the range of f. This implies that f is also onto.
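The claims about the two example functions can be tested directly for small m. The sketch below checks whether a function maps {0, . . . , m − 1} onto itself without collisions; the helper names is_bijection_on_range, make_f and make_g are hypothetical and exist only for this illustration.

```python
def is_bijection_on_range(func, m):
    """True if func maps {0, ..., m-1} onto itself with no two inputs colliding."""
    return sorted(func(x) for x in range(m)) == list(range(m))


def make_f(m):
    """f(x) = x + 1 mod m"""
    return lambda x: (x + 1) % m


def make_g(m):
    """g(x) = 2x mod m"""
    return lambda x: (2 * x) % m


f_ok = is_bijection_on_range(make_f(6), 6)       # f is a bijection for any m
g_odd = is_bijection_on_range(make_g(7), 7)      # m odd
g_even = is_bijection_on_range(make_g(6), 6)     # m even: 0 and 3 both map to 0
```

For m = 6 the function g sends both 0 and 3 to 0 and never produces an odd value, so it is neither one-to-one nor onto.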

RSA

One of the most useful bijections is the RSA function, named after its inventors Ronald Rivest, Adi Shamir and Leonard Adleman:

E(x) ≡ x^e mod N

where N = pq (p and q are two large primes), E : {0, . . . , N − 1} → {0, . . . , N − 1} and e is relatively prime to (p − 1)(q − 1). The inverse of the RSA function is:

D(x) ≡ x^d mod N

where d is the inverse of e mod (p − 1)(q − 1).

Consider the following setting. Alice and Bob wish to communicate confidentially over some (insecure) link. Eve, an eavesdropper who is listening in, would like to discover what they are saying. Let's assume that Alice wants to transmit a message x (a number between 1 and N − 1) to Bob. She will apply her encryption function E to x and send the encrypted message E(x) (also called the cyphertext) over the link; Bob, upon receipt of E(x), will then apply his decryption function D to it. Since D is the inverse of E, Bob will obtain the original message x. However, how can we be sure that Eve cannot also obtain x?

In order to encrypt the message, Alice only needs Bob's public key (N, e). In order to decrypt the message, Bob needs his private key d. The pair (N, e) can be thought of as a public lock: anyone can place a message in a box and lock it, but only Bob has the key d to open the lock. The idea is that since Eve does not have access to d, she will not be able to gain information about Alice's message.
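To make the roles of (N, e) and d concrete, here is a toy instance in Python. The primes p = 61 and q = 53 and the exponent e = 17 are illustrative assumptions chosen by us; they are far too small to offer any security. The sketch also assumes Python 3.8 or later, where the built-in pow(e, -1, phi) returns a modular inverse.

```python
# Bob's setup (toy parameters, for illustration only).
p, q = 61, 53
N = p * q                      # public modulus
phi = (p - 1) * (q - 1)
e = 17                         # public exponent, relatively prime to phi
d = pow(e, -1, phi)            # private key: the inverse of e mod (p-1)(q-1)


def E(x):
    """Encryption with the public key (N, e)."""
    return pow(x, e, N)


def D(y):
    """Decryption with the private key d."""
    return pow(y, d, N)


message = 65
cyphertext = E(message)        # what Alice sends over the link
recovered = D(cyphertext)      # what Bob computes
```

Alice needs only N and e to compute E; Bob needs d to compute D, and recovered equals the original message.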

We will now prove that D(E(x)) = x (and therefore E(x) is a bijection). We will require a beautiful theorem from number theory known as Fermat's Little Theorem, which is the following:

Theorem 4.1: [Fermat's Little Theorem] For any prime p and any a ∈ {1, 2, . . . , p − 1}, we have a^{p−1} = 1 mod p.

Let S be the nonzero integers modulo p; that is, S = {1, 2, . . . , p − 1}. Define a function f : S → S such that f(x) ≡ ax mod p. Here's the crucial observation: f is simply a bijection from S to S; it permutes the elements of S. For instance, here's a picture of the case a = 3, p = 7:

[Figure 1: Multiplication by (3 mod 7)]

With this intuition, we can now prove Fermat's Little Theorem:

Proof: Our first claim is that f(x) is a bijection. We will then show that this claim implies the theorem.

To show that f is a bijection, we simply need to argue that the numbers a · i mod p are distinct and nonzero. They are distinct because if a · i ≡ a · j (mod p), then dividing both sides by a gives i ≡ j (mod p). They are nonzero because a · i ≡ 0 similarly implies i ≡ 0. (And we can divide by a, because by assumption it is nonzero and therefore relatively prime to p.)

Now we can prove the theorem. Since f is a bijection, we know that the image of f is S. Now if we take the product of all elements in S, it is equal to the product of all elements in the image of f:

(p − 1)! ≡ a^{p−1} · (p − 1)! (mod p).

Dividing by (p − 1)! (which we can do because it is relatively prime to p, since p is assumed prime) then gives the theorem.
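The two steps of the proof can be observed numerically for the case a = 3, p = 7. The sketch below builds the map x ↦ ax mod p on S and then checks the conclusion of the theorem; the function name multiplication_map is our own.

```python
def multiplication_map(a, p):
    """The function f(x) = a*x mod p on S = {1, ..., p-1}, as a dictionary."""
    return {x: (a * x) % p for x in range(1, p)}


a, p = 3, 7
f = multiplication_map(a, p)
image = sorted(f.values())     # a rearrangement of S when f is a bijection
fermat = pow(a, p - 1, p)      # a^(p-1) mod p
```

Here f sends 1, 2, 3, 4, 5, 6 to 3, 6, 2, 5, 1, 4 respectively, which is a permutation of S, and fermat evaluates to 1.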

Let us return to proving that D(E(x)) = x:

Theorem 4.2: Under the above definitions of the encryption and decryption functions E and D, we have D(E(x)) = x mod N for every possible message x ∈ {0, 1, . . . , N − 1}.

The proof of this theorem relies on Fermat's Little Theorem:

Proof of Theorem 4.2: To prove the statement, we have to show that

(x^e)^d = x mod N for every x ∈ {0, 1, . . . , N − 1}. (1)

Let's consider the exponent, which is ed. By definition of d, we know that ed = 1 mod (p − 1)(q − 1); hence we can write ed = 1 + k(p − 1)(q − 1) for some integer k, and therefore

x^{ed} − x = x^{1 + k(p−1)(q−1)} − x = x(x^{k(p−1)(q−1)} − 1). (2)

Looking back at equation (1), our goal is to show that this last expression in equation (2) is equal to 0 mod N for every x.

Now we claim that the expression x(x^{k(p−1)(q−1)} − 1) in (2) is divisible by p. To see this, we consider two cases:

Case 1: x is not a multiple of p. In this case, since x ≠ 0 mod p, we can use Fermat's Little Theorem to deduce that x^{p−1} = 1 mod p. Then (x^{p−1})^{k(q−1)} ≡ 1^{k(q−1)} mod p and hence x^{k(p−1)(q−1)} − 1 = 0 mod p, as required.

Case 2: x is a multiple of p. In this case the expression in (2), which has x as a factor, is clearly divisible by p.

By an entirely symmetrical argument, x(x^{k(p−1)(q−1)} − 1) is also divisible by q. Therefore, it is divisible by both p and q, and since p and q are distinct primes it must be divisible by their product, pq = N. But this implies that the expression is equal to 0 mod N, which is exactly what we wanted to prove.

So we have seen that the RSA protocol is correct, in the sense that Alice can encrypt messages in such a way

that Bob can reliably decrypt them again. But how do we know that it is secure, i.e., that Eve cannot get any

useful information by observing the encrypted messages? The security of RSA hinges upon the following

simple assumption:

Given N, e and y = x^e mod N, there is no efficient algorithm for determining x.

This assumption is quite plausible. How might Eve try to guess x? She could experiment with all possible values of x, each time checking whether x^e = y mod N; but she would have to try on the order of N values of x, which is completely unrealistic if N is a number with (say) 512 bits. This is because N ≈ 2^{512} is larger than estimates for the age of the Universe in femtoseconds! Alternatively, she could try to factor N to retrieve p and q, and then figure out d by computing the inverse of e mod (p − 1)(q − 1); but this approach requires Eve to be able to factor N into its prime factors, a problem which is believed to be impossible to solve efficiently for large values of N. She could try to compute the quantity (p − 1)(q − 1) without factoring N; but it is possible to show that computing (p − 1)(q − 1) is equivalent to factoring N. We should point out that the security of RSA has not been formally proved: it rests on the assumptions that breaking RSA is essentially tantamount to factoring N, and that factoring is hard.


We close this note with a brief discussion of implementation issues for RSA. Since we have argued that breaking RSA is believed to be infeasible because factoring would take a very long time, we should check that the computations that Alice and Bob themselves have to perform are much simpler, and can be done efficiently. There are really only two non-trivial things that Alice and Bob have to do:

1. Bob has to find prime numbers p and q, each having many (say, 512) bits.

2. Both Alice and Bob have to compute exponentials mod N. (Alice has to compute x^e mod N, and Bob has to compute y^d mod N.)

Both of these tasks can be carried out efficiently. The first requires the implementation of an efficient test for primality as well as a rich source of primes. You will learn how to tackle each of these tasks in the algorithms course CS170. The second requires an efficient algorithm for modular exponentiation, which is not very difficult, but will also be discussed in detail in CS170.
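The note does not say which exponentiation algorithm is meant, so the following is an assumption on our part: a sketch of repeated squaring, one common way to compute x^e mod N using a number of multiplications proportional to the number of bits of e, with every intermediate result reduced mod N. The function name mod_exp is ours; Python's built-in three-argument pow does the same job.

```python
def mod_exp(x, e, n):
    """Compute x**e mod n for e >= 0 by repeated squaring."""
    result = 1 % n             # handles the degenerate modulus n = 1
    x %= n
    while e > 0:
        if e & 1:              # lowest bit of e is set: multiply this power in
            result = (result * x) % n
        x = (x * x) % n        # square for the next bit of e
        e >>= 1
    return result


small = mod_exp(2, 10, 1000)   # 1024 mod 1000
```

Because intermediate values never exceed n^2, exponents with hundreds of bits are handled easily.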

To summarize, then, in the RSA protocol Bob need only perform simple calculations such as multiplication, exponentiation and primality testing to implement his digital lock. Similarly, Alice and Bob need only perform simple calculations to lock and unlock the message respectively: operations that any pocket computing device could handle. By contrast, to unlock the message without the key, Eve would have to perform operations like factoring large numbers, which (at least according to widely accepted belief) requires more computational power than all the world's most sophisticated computers combined! This compelling guarantee of security without the need for a pre-shared secret key explains why the RSA cryptosystem is such a revolutionary development in cryptography.




Polynomials

Polynomials constitute a rich class of functions which are both easy to describe and widely applicable in topics ranging from Fourier analysis to computational geometry. In this note, we will discuss properties of polynomials which make them so useful. We will then describe how to take advantage of these properties to develop a secret sharing scheme.

Recall from your high school math that a polynomial in a single variable is of the form p(x) = a_d x^d + a_{d−1} x^{d−1} + . . . + a_0. Here the variable x and the coefficients a_i are usually real numbers. For example, p(x) = 5x^3 + 2x + 1 is a polynomial of degree d = 3. Its coefficients are a_3 = 5, a_2 = 0, a_1 = 2, and a_0 = 1. Polynomials have some remarkably simple, elegant and powerful properties, which we will explore in this note.

First, a definition: we say that a is a root of the polynomial p(x) if p(a) = 0. For example, the degree 2 polynomial p(x) = x^2 − 4 has two roots, namely 2 and −2, since p(2) = p(−2) = 0. If we plot the polynomial p(x) in the x-y plane, then the roots of the polynomial are just the places where the curve crosses the x axis:

We now state two fundamental properties of polynomials that we will prove in due course.

Property 1: A non-zero polynomial of degree d has at most d roots.

Property 2: Given d + 1 pairs (x_1, y_1), . . . , (x_{d+1}, y_{d+1}), with all the x_i distinct, there is a unique polynomial p(x) of degree (at most) d such that p(x_i) = y_i for 1 ≤ i ≤ d + 1.

Let us consider what these two properties say in the case that d = 1. A graph of a linear (degree 1) polynomial y = a_1 x + a_0 is a line. Property 1 says that if a line is not the x-axis (i.e. if the polynomial is not y = 0), then it can intersect the x-axis in at most one point.

Property 2 says that two points uniquely determine a line.

Polynomial Interpolation

Property 2 says that two points uniquely determine a degree 1 polynomial (a line), three points uniquely determine a degree 2 polynomial, four points uniquely determine a degree 3 polynomial, and so on. Given $d+1$ pairs $(x_1, y_1), \dots, (x_{d+1}, y_{d+1})$, how do we determine the polynomial $p(x) = a_d x^d + \dots + a_1 x + a_0$ such that $p(x_i) = y_i$ for $i = 1$ to $d+1$? We will give an efficient algorithm for reconstructing the coefficients $a_0, \dots, a_d$, and therefore the polynomial $p(x)$.

The method is called Lagrange interpolation. Let us start by solving an easier problem. Suppose that we are told that $y_1 = 1$ and $y_j = 0$ for $2 \le j \le d+1$. Now can we reconstruct $p(x)$? Yes, this is easy!

Consider $q(x) = (x - x_2)(x - x_3) \cdots (x - x_{d+1})$. This is a polynomial of degree $d$ (the $x_i$'s are constants, and $x$ appears $d$ times). Also, we clearly have $q(x_j) = 0$ for $2 \le j \le d+1$. But what is $q(x_1)$? Well, $q(x_1) = (x_1 - x_2)(x_1 - x_3) \cdots (x_1 - x_{d+1})$, which is some constant not equal to 0 because the $x_i$ are distinct. Thus if we let $p(x) = q(x)/q(x_1)$ (dividing is ok since $q(x_1) \neq 0$), we have the polynomial we are looking for. For example, suppose you were given the pairs $(1, 1)$, $(2, 0)$, and $(3, 0)$. Then we can construct the degree $d = 2$ polynomial $p(x)$ by letting $q(x) = (x-2)(x-3) = x^2 - 5x + 6$, and $q(x_1) = q(1) = 2$. Thus, we can now construct $p(x) = q(x)/q(x_1) = (x^2 - 5x + 6)/2$.

Of course the problem is no harder if we single out some arbitrary index $i$ instead of 1: i.e. $y_i = 1$ and $y_j = 0$ for $j \neq i$. Let us introduce some notation: let us denote by $\Delta_i(x)$ the degree $d$ polynomial that goes through these $d+1$ points. Then

$$\Delta_i(x) = \frac{\prod_{j \neq i} (x - x_j)}{\prod_{j \neq i} (x_i - x_j)}.$$

Let us now return to the original problem. Given $d+1$ pairs $(x_1, y_1), \dots, (x_{d+1}, y_{d+1})$, we first construct the $d+1$ polynomials $\Delta_1(x), \dots, \Delta_{d+1}(x)$. Now we can write $p(x) = \sum_{i=1}^{d+1} y_i \Delta_i(x)$. Why does this work? First notice that $p(x)$ is a polynomial of degree (at most) $d$ as required, since it is the sum of polynomials of degree $d$. And when it is evaluated at $x_i$, $d$ of the $d+1$ terms in the sum evaluate to 0 and the $i$-th term evaluates to $y_i$ times 1, as required.

As an example, suppose we want to find the degree-2 polynomial $p(x)$ that passes through the three points $(1, 1)$, $(2, 2)$ and $(3, 4)$. Here $d = 2$ and $x_i = i$, so the three polynomials $\Delta_i$ are as follows:

$$\Delta_1(x) = \frac{(x-2)(x-3)}{(1-2)(1-3)} = \frac{(x-2)(x-3)}{2} = \frac{1}{2}x^2 - \frac{5}{2}x + 3;$$

$$\Delta_2(x) = \frac{(x-1)(x-3)}{(2-1)(2-3)} = \frac{(x-1)(x-3)}{-1} = -x^2 + 4x - 3;$$

$$\Delta_3(x) = \frac{(x-1)(x-2)}{(3-1)(3-2)} = \frac{(x-1)(x-2)}{2} = \frac{1}{2}x^2 - \frac{3}{2}x + 1.$$

The polynomial $p(x)$ is therefore given by

$$p(x) = 1 \cdot \Delta_1(x) + 2 \cdot \Delta_2(x) + 4 \cdot \Delta_3(x) = \frac{1}{2}x^2 - \frac{1}{2}x + 1.$$

You should verify that this polynomial does indeed pass through the above three points.
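The construction can also be carried out mechanically. Below is a minimal Python sketch of Lagrange interpolation over the rationals, using exact `Fraction` arithmetic; the function names `poly_mul` and `lagrange_interpolate` and the coefficient ordering (constant term first) are assumptions made for this illustration, not part of the note.

```python
from fractions import Fraction


def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def lagrange_interpolate(points):
    """Return the coefficients (lowest degree first) of the unique polynomial
    of degree at most len(points) - 1 through the given (x, y) pairs."""
    n = len(points)
    result = [Fraction(0)] * n
    for i, (xi, yi) in enumerate(points):
        # Build Delta_i(x) = prod_{j != i} (x - x_j) / (x_i - x_j).
        delta = [Fraction(1)]
        for j, (xj, _) in enumerate(points):
            if j != i:
                delta = poly_mul(delta, [Fraction(-xj), Fraction(1)])
                delta = [c / (xi - xj) for c in delta]
        # Add y_i * Delta_i(x) to the running sum.
        for k, c in enumerate(delta):
            result[k] += yi * c
    return result


coeffs = lagrange_interpolate([(1, 1), (2, 2), (3, 4)])
print(coeffs)  # constant term first
```

For the three points of the example, the returned coefficients are $1$, $-\frac{1}{2}$, $\frac{1}{2}$ (constant term first), which is the polynomial $\frac{1}{2}x^2 - \frac{1}{2}x + 1$ found by hand.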

Proof of Property 2

We would like to prove property 2:

Property 2: Given $d+1$ pairs $(x_1, y_1), \dots, (x_{d+1}, y_{d+1})$, with all the $x_i$ distinct, there is a unique polynomial $p(x)$ of degree (at most) $d$ such that $p(x_i) = y_i$ for $1 \le i \le d+1$.

We have shown how to find a polynomial $p(x)$ such that $p(x_i) = y_i$ for $d+1$ pairs $(x_1, y_1), \dots, (x_{d+1}, y_{d+1})$. This proves part of property 2 (the existence of the polynomial). How do we prove the second part, that the polynomial is unique? Suppose for contradiction that there is another polynomial $q(x)$ of degree at most $d$ such that $q(x_i) = y_i$ for all $d+1$ pairs above. Now consider the polynomial $r(x) = p(x) - q(x)$. Since we are assuming that $q(x)$ and $p(x)$ are different polynomials, $r(x)$ must be a non-zero polynomial of degree at most $d$. Therefore, property 1 implies it can have at most $d$ roots. But on the other hand $r(x_i) = p(x_i) - q(x_i) = 0$ on $d+1$ distinct points. Contradiction. Therefore, $p(x)$ is the unique polynomial that satisfies the $d+1$ conditions.

Polynomial Division

Let's take a short digression to discuss polynomial division, which will be useful in the proof of property 1. If we have a polynomial $p(x)$ of degree $d$, we can divide by a polynomial $q(x)$ of degree at most $d$ by using long division. The result will be:

$$p(x) = q(x)q'(x) + r(x)$$

where $q'(x)$ is the quotient and $r(x)$ is the remainder. The degree of $r(x)$ must be smaller than the degree of the divisor $q(x)$.

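Example. We wish to divide $p(x) = x^3 + x^2 - 1$ by $q(x) = x - 1$:

                 x^2 + 2x + 2
        ----------------------
x - 1 ) x^3 +  x^2        - 1
        x^3 -  x^2
        ----------
              2x^2
              2x^2 - 2x
              ---------
                     2x - 1
                     2x - 2
                     ------
                          1

Now $p(x) = x^3 + x^2 - 1 = (x - 1)(x^2 + 2x + 2) + 1$, $r(x) = 1$ and $q'(x) = x^2 + 2x + 2$.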
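The long division above follows a fixed procedure: cancel the leading term of the running remainder, subtract, and repeat. A minimal Python sketch of that procedure is given here; the name `poly_divmod` and the convention of listing coefficients from highest degree to lowest are assumptions of this illustration.

```python
from fractions import Fraction


def poly_divmod(p, q):
    """Divide p by q, both given as coefficient lists, highest degree first.

    Returns (quotient, remainder) with p = q * quotient + remainder and the
    remainder of lower degree than q. Assumes q has a non-zero leading
    coefficient.
    """
    rem = [Fraction(c) for c in p]
    quotient = []
    # Each pass cancels the current leading term of the remainder.
    while len(rem) >= len(q):
        factor = rem[0] / q[0]
        quotient.append(factor)
        for i, c in enumerate(q):
            rem[i] -= factor * c
        rem.pop(0)  # the leading term is now zero
    return quotient, rem


# x^3 + x^2 + 0x - 1 divided by x - 1
quotient, remainder = poly_divmod([1, 1, 0, -1], [1, -1])
print(quotient, remainder)
```

On the example, the quotient coefficients are 1, 2, 2 and the remainder is the constant 1, in agreement with the hand computation.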


Proof of Property 1

Now let us turn to property 1: a non-zero polynomial of degree $d$ has at most $d$ roots. The idea of the proof is as follows. We will prove the following claims:

Claim 1: If $a$ is a root of a polynomial $p(x)$ with degree $d$, then $p(x) = (x - a)q(x)$ for a polynomial $q(x)$ with degree $d - 1$.

Claim 2: A polynomial $p(x)$ of degree $d$ with distinct roots $a_1, \dots, a_d$ can be written as $p(x) = c(x - a_1) \cdots (x - a_d)$.

Claim 2 implies property 1. Suppose $p(x)$ has $d$ distinct roots $a_1, \dots, a_d$. We must show that any $a \neq a_i$ for $i = 1, \dots, d$ cannot be a root of $p(x)$. But this follows from claim 2, since $p(a) = c(a - a_1) \cdots (a - a_d) \neq 0$ (here $c \neq 0$ because $p(x)$ has degree $d$).

Proof of Claim 1

Dividing $p(x)$ by $(x - a)$ gives $p(x) = (x - a)q(x) + r(x)$, where $q(x)$ is the quotient and $r(x)$ is the remainder. The degree of $r(x)$ is necessarily smaller than the degree of the divisor $(x - a)$. Therefore $r(x)$ must have degree 0 and therefore is some constant $c$. But now substituting $x = a$, we get $p(a) = c$. But since $a$ is a root, $p(a) = 0$. Thus $c = 0$ and therefore $p(x) = (x - a)q(x)$, thus showing that $(x - a) \mid p(x)$.

Claim 1 implies Claim 2

Proof by induction on $d$.

Base Case: We must show that a polynomial $p(x)$ of degree 1 with root $a_1$ can be written as $p(x) = c(x - a_1)$. By Claim 1, we know that $p(x) = (x - a_1)q(x)$, where $q(x)$ has degree 0 and is therefore a constant.

Inductive Hypothesis: A polynomial of degree $d - 1$ with distinct roots $a_1, \dots, a_{d-1}$ can be written as $p(x) = c(x - a_1) \cdots (x - a_{d-1})$.

Inductive Step: Let $p(x)$ be a polynomial of degree $d$ with distinct roots $a_1, \dots, a_d$. By Claim 1, $p(x) = (x - a_d)q(x)$ for some polynomial $q(x)$ of degree $d - 1$. Since $0 = p(a_i) = (a_i - a_d)q(a_i)$ for all $i \neq d$ and $a_i - a_d \neq 0$ in this case, $q(a_i)$ must be equal to 0. Then $q(x)$ is a polynomial of degree $d - 1$ with distinct roots $a_1, \dots, a_{d-1}$. We can now apply the inductive assumption to $q(x)$ to write $q(x) = c(x - a_1) \cdots (x - a_{d-1})$. Substituting in $p(x) = (x - a_d)q(x)$, we finally obtain that $p(x) = c(x - a_1) \cdots (x - a_d)$.

Finite Fields

Both property 1 and property 2 also hold when the values of the coefficients and the variable $x$ are chosen from the complex numbers instead of the real numbers, or even from the rational numbers. They do not hold if the values are restricted to being natural numbers or integers. Let us try to understand this a little more closely. The only properties of numbers that we used in polynomial interpolation and in the proof of property 1 are that we can add, subtract, multiply and divide any pair of numbers as long as we are not dividing by 0. We cannot subtract two natural numbers and guarantee that the result is a natural number. And dividing two integers does not usually result in an integer.


But if we work with numbers modulo a prime $m$, then we can add, subtract, multiply and divide (by any non-zero number modulo $m$). To check this, recall that $x$ has an inverse mod $m$ if $\gcd(m, x) = 1$, so if $m$ is prime all the numbers $\{1, \dots, m-1\}$ have an inverse mod $m$. So both property 1 and property 2 hold if the coefficients and the variable $x$ are restricted to take on values modulo $m$. This remarkable fact that these properties hold even when we restrict ourselves to a finite set of values is the key to several applications that we will presently see.

Let us consider an example of degree $d = 1$ polynomials modulo 5. Let $p(x) = 2x + 3 \pmod 5$. The roots of this polynomial are all values $x$ such that $2x + 3 \equiv 0 \pmod 5$ holds. Solving for $x$, we get that $2x \equiv -3 \equiv 2 \pmod 5$ or $x \equiv 1 \pmod 5$. Note that this is consistent with property 1 since we got only 1 root of a degree 1 polynomial.
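Because there are only finitely many values modulo a prime, roots can be found by simply trying every value. The Python sketch below does this for the example polynomial and also recovers the root by dividing, using the built-in three-argument `pow` to compute a modular inverse (supported for negative exponents in Python 3.8 and later). The helper name `roots_mod` is an assumption of this illustration.

```python
def roots_mod(coeffs, m):
    """Return all x in {0, ..., m-1} with p(x) = 0 (mod m).

    coeffs lists the coefficients from highest degree to lowest.
    """
    found = []
    for x in range(m):
        value = 0
        for c in coeffs:
            value = (value * x + c) % m  # Horner's rule, reduced mod m
        if value == 0:
            found.append(x)
    return found


print(roots_mod([2, 3], 5))   # p(x) = 2x + 3 (mod 5)

# The same root via division: x = -3 * 2^(-1) (mod 5).
inverse_of_2 = pow(2, -1, 5)
print((-3 * inverse_of_2) % 5)
```

Both computations give the single root $x = 1$, consistent with property 1.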

Now consider the polynomials $p(x) = 2x + 3$ and $q(x) = 3x - 2$ with all numbers reduced mod 5. We can plot the value of each polynomial $y$ as a function of $x$ in the $x$-$y$ plane. Since we are working modulo 5, there are only 5 possible choices for $x$, and only 5 possible choices for $y$.

Notice that these two "lines" intersect in exactly one point, even though the picture looks nothing at all like lines in the Euclidean plane! Looking at these graphs it might seem remarkable that both property 1 and property 2 hold when we work modulo $m$ for any prime number $m$. But as we stated above, all that was required for the proofs of property 1 and 2 was the ability to add, subtract, multiply and divide any pair of numbers (as long as we are not dividing by 0), and they hold whenever we work modulo a prime $m$.
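The two "lines" modulo 5 can be tabulated directly. This short Python sketch lists the values of $p(x) = 2x + 3$ and $q(x) = 3x - 2$ for every $x$ modulo 5 and collects the points where they agree; the variable names are chosen for this illustration.

```python
m = 5


def p(x):
    return (2 * x + 3) % m


def q(x):
    return (3 * x - 2) % m


# Tabulate both polynomials at every possible x modulo 5.
for x in range(m):
    print(x, p(x), q(x))

# Points shared by the two "lines".
intersections = [(x, p(x)) for x in range(m) if p(x) == q(x)]
print(intersections)
```

The only shared point is $(0, 3)$. This is what property 1 predicts: the difference $p(x) - q(x)$ is a non-zero polynomial of degree 1 modulo 5, so it has at most one root.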

When we work with numbers modulo a prime $m$, we say that we are working over a finite field, denoted by $F_m$ or $GF(m)$ (for Galois Field). In order for a set to be called a field, it must satisfy certain axioms which are the building blocks that allow for these amazing properties and others to hold. If you would like to learn more about fields and the axioms they satisfy, you can visit Wikipedia's site and read the article on fields: http://en.wikipedia.org/wiki/Field_%28mathematics%29. While you are there, you can also read the article on Galois Fields and learn more about some of their applications and elegant properties which will not be covered in this lecture: http://en.wikipedia.org/wiki/Galois_field.

Counting

How many polynomials of degree (at most) 2 are there modulo $m$? This is easy: there are 3 coefficients, each of which can take on one of $m$ values for a total of $m^3$. Writing $p(x) = a_d x^d + a_{d-1} x^{d-1} + \dots + a_0$ by specifying its $d+1$ coefficients $a_i$ is known as the coefficient representation of $p(x)$. Is there any other way to specify $p(x)$?

Sure, there is! Our polynomial of degree (at most) 2 is uniquely specified by its values at any three points, say $x = 0, 1, 2$. Once again each of these three values can take on one of $m$ values, for a total of $m^3$ possibilities. In general, we can specify a degree $d$ polynomial $p(x)$ by specifying its values at $d+1$ points, say $0, 1, \dots, d$. These $d+1$ values, $(y_0, y_1, \dots, y_d)$ are called the value representation of $p(x)$. The coefficient representation


over such a devastating and destructive weapon. Suppose the U.S. government finally decides that a nuclear strike can be initiated only if at least $k > 1$ major officials agree to it. We want to devise a scheme such that (1) any group of $k$ of these officials can pool their information to figure out the launch code and initiate the strike but (2) no group of $k - 1$ or fewer have any information about the launch code, even if they pool their knowledge. For example, they should not learn whether the secret is odd or even, a prime number, divisible by some number $a$, or the secret's least significant bit. How can we accomplish this?

Suppose that there are $n$ officials indexed from 1 to $n$ and the launch code is some natural number $s$. Let $q$ be a prime number larger than $n$ and $s$. We will work over $GF(q)$ from now on.

Now pick a random polynomial $P(x)$ of degree $k - 1$ such that $P(0) = s$ and give $P(1)$ to the first official, $P(2)$ to the second, ..., $P(n)$ to the $n$th. Then:

• Any $k$ officials, having the values of the polynomial at $k$ points, can use Lagrange interpolation to find $P$, and once they know what $P$ is, they can compute $P(0) = s$ to learn the secret. Another way to say this is that any $k$ officials have between them a value representation of the polynomial, which they can convert to the coefficient representation, which allows them to evaluate $P(0) = s$.

• Any group of $k - 1$ officials has no information about $s$. They know only $k - 1$ points through which $P(x)$, an unknown polynomial of degree $k - 1$, passes. They wish to reconstruct $P(0)$. But by our discussion in the previous section, for each possible value $P(0) = b$, there is a unique polynomial of degree (at most) $k - 1$ that passes through the $k - 1$ points of the $k - 1$ officials as well as through $(0, b)$. Hence the secret could be any of the $q$ possible values $\{0, 1, \dots, q - 1\}$, so the officials have, in a very precise sense, no information about $s$. Another way of saying this is that the information of the officials is consistent with $q$ different value representations, one for each possible value of the secret, and thus the officials have no information about $s$. (Note that this is the main reason we choose to work over finite fields rather than, say, over the real numbers, where the basic secret-sharing scheme would still work. Because there are only finitely many values in our field, we can quantify precisely how many remaining possibilities there are for the value of the secret, and show that this is the same as if the officials had no information at all.)
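The scheme described above can be sketched in a few lines of Python. The function names `make_shares` and `recover_secret` are hypothetical, and the sketch uses the `random` module purely for illustration; it is not a cryptographically secure source of randomness. Since the coefficients are drawn uniformly, the leading one may be zero, so the polynomial has degree at most $k - 1$.

```python
import random


def make_shares(secret, k, n, q):
    """Split `secret` into n shares so that any k of them recover it.

    Works over GF(q); q is assumed to be a prime larger than both n and secret.
    Returns a list of (i, P(i)) pairs for i = 1, ..., n.
    """
    # P(x) = secret + c_1 x + ... + c_{k-1} x^{k-1} with random coefficients.
    coeffs = [secret] + [random.randrange(q) for _ in range(k - 1)]
    shares = []
    for i in range(1, n + 1):
        value = sum(c * pow(i, e, q) for e, c in enumerate(coeffs)) % q
        shares.append((i, value))
    return shares


def recover_secret(shares, q):
    """Recover P(0) from k shares by Lagrange interpolation over GF(q)."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        # Delta_i(0) = prod_{j != i} (0 - x_j) / (x_i - x_j), computed mod q.
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if j != i:
                num = (num * (0 - xj)) % q
                den = (den * (xi - xj)) % q
        secret = (secret + yi * num * pow(den, -1, q)) % q
    return secret


shares = make_shares(secret=1, k=3, n=5, q=7)
print(recover_secret(shares[:3], q=7))
print(recover_secret([shares[4], shares[1], shares[3]], q=7))
```

Any three of the five shares reproduce the secret, regardless of which random polynomial was drawn. Only $P(0)$ is needed, so the sketch evaluates each $\Delta_i$ at $0$ instead of expanding the full polynomial.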

Example. Suppose you are in charge of setting up a secret sharing scheme, with secret $s = 1$, where you want to distribute $n = 5$ shares to 5 people such that any $k = 3$ or more people can figure out the secret, but two or fewer cannot. Let's say we are working over $GF(7)$ (note that $7 > s$ and $7 > n$) and you randomly choose the following polynomial of degree $k - 1 = 2$: $P(x) = 3x^2 + 5x + 1$ (here, $P(0) = 1 = s$, the secret). So you know everything there is to know about the secret and the polynomial, but what about the people that receive the shares? Well, the shares handed out are $P(1) = 2$ to the first official, $P(2) = 2$ to the second, $P(3) = 1$ to the third, $P(4) = 6$ to the fourth, and $P(5) = 3$ to the fifth official. Let's say that officials 3, 4, and 5 get together (we expect them to be able to recover the secret). Using Lagrange interpolation, they compute the following delta functions:

$$\Delta_3(x) = \frac{(x-4)(x-5)}{(3-4)(3-5)} = \frac{(x-4)(x-5)}{2} = 4(x-4)(x-5);$$

$$\Delta_4(x) = \frac{(x-3)(x-5)}{(4-3)(4-5)} = \frac{(x-3)(x-5)}{-1} = 6(x-3)(x-5);$$

$$\Delta_5(x) = \frac{(x-3)(x-4)}{(5-3)(5-4)} = \frac{(x-3)(x-4)}{2} = 4(x-3)(x-4).$$

They then compute the polynomial over $GF(7)$: $P(x) = (1)\Delta_3(x) + (6)\Delta_4(x) + (3)\Delta_5(x) = 3x^2 + 5x + 1$ (verify the computation!). Now they simply compute $P(0)$ and discover that the secret is 1.
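The numbers in this example can be checked with a few lines of Python. The sketch below recomputes the five shares of $P(x) = 3x^2 + 5x + 1$ over $GF(7)$ and then lets officials 3, 4 and 5 interpolate at $x = 0$; the variable names are chosen for this illustration.

```python
q = 7


def P(x):
    return (3 * x * x + 5 * x + 1) % q


# Shares handed to officials 1 through 5.
shares = {i: P(i) for i in range(1, 6)}
print(shares)

# Officials 3, 4 and 5 pool their shares and interpolate at x = 0.
pooled = [(i, shares[i]) for i in (3, 4, 5)]
secret = 0
for xi, yi in pooled:
    num, den = 1, 1
    for xj, _ in pooled:
        if xj != xi:
            num = (num * (0 - xj)) % q
            den = (den * (xi - xj)) % q
    # Dividing by den means multiplying by its inverse modulo q.
    secret = (secret + yi * num * pow(den, -1, q)) % q
print(secret)
```

The shares come out as 2, 2, 1, 6, 3 and the pooled computation returns the secret 1, as in the text.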


Let's see what happens if two officials try to get together, say persons 1 and 5. They both know that the polynomial looks like $P(x) = a_2 x^2 + a_1 x + s$. They also know the following equations:

$$P(1) = a_2 + a_1 + s = 2$$

$$P(5) = 4a_2 + 5a_1 + s = 3$$

But that is all they have: two equations with three unknowns, and thus they cannot find out the secret. This is the case no matter which two officials get together. Notice that since we are working over $GF(7)$, the two people could have guessed the secret ($0 \le s \le 6$) and constructed a unique degree 2 polynomial (by property 2). But the two people combined have the same chance of guessing what the secret is as they do individually. This is important, as it implies that two people have no more information about the secret than one person does.
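To see concretely that officials 1 and 5 learn nothing, one can try every candidate secret and look for polynomials consistent with their two shares. The following Python sketch does this by brute force over $GF(7)$; the variable names are chosen for this illustration.

```python
q = 7
known = [(1, 2), (5, 3)]  # shares held by officials 1 and 5

consistent = {}
for s in range(q):
    # Search for coefficients (a2, a1) such that
    # P(x) = a2 x^2 + a1 x + s matches both known shares.
    matches = [
        (a2, a1)
        for a2 in range(q)
        for a1 in range(q)
        if all((a2 * x * x + a1 * x + s) % q == y for x, y in known)
    ]
    consistent[s] = matches

for s, matches in consistent.items():
    print(s, matches)
```

Every one of the seven candidate secrets is matched by exactly one pair $(a_2, a_1)$, so the two shares rule nothing out. The true secret $s = 1$ corresponds to $(a_2, a_1) = (3, 5)$.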




Error Correcting Codes

We will consider two situations in which we wish to transmit information on an unreliable channel. The first is exemplified by the internet, where the information (say a file) is broken up into packets, and the unreliability is manifest in the fact that some of the packets are lost (or erased) during transmission. Moreover the packets are labeled so that the recipient knows exactly which packets were received and which were dropped. We will refer to such errors as erasure errors. See the figure below:

In the second situation, some of the packets are corrupted during transmission due to channel noise. Now the recipient has no idea which packets were corrupted and which were received unmodified:

In the above example, packets 1 and 4 are corrupted. These types of errors are called general errors. We will discuss methods of encoding messages, called error correcting codes, which are capable of correcting both erasure and general errors.

Assume that the information consists of $n$ packets. We can assume without loss of generality that the contents of each packet is a number modulo $q$ (denoted by $GF(q)$), where $q$ is a prime. For example, the contents of the packet might be a 32-bit string and can therefore be regarded as a number between 0 and $2^{32} - 1$; then we could choose $q$ to be any prime larger than $2^{32}$. The properties of polynomials over $GF(q)$ (i.e., with coefficients and values reduced modulo $q$) are the backbone of both error-correcting schemes. To see this, let us denote the message to be sent by $m_1, \dots, m_n$ and make the following crucial observations:

1) There is a unique polynomial $P(x)$ of degree (at most) $n - 1$ such that $P(i) = m_i$ for $1 \le i \le n$ (i.e., $P(x)$ contains all of the information about the message, and evaluating $P(i)$ gives the contents of the $i$-th packet).

2) The message to be sent is now $m_1 = P(1), \dots, m_n = P(n)$. We can generate additional packets by evaluating $P(x)$ at additional points $n + 1, n + 2, \dots, n + j$ (remember, our transmitted message must be redundant, i.e., it must contain more packets than the original message to account for the lost or corrupted packets). Thus the transmitted message is $c_1 = P(1), c_2 = P(2), \dots, c_{n+j} = P(n + j)$. Since we are working modulo $q$, we must make sure that $n + j \le q$, but this condition does not impose a serious constraint since $q$ is very large.

Erasure Errors

Here we consider the setting of packets being transmitted over the internet. In this setting, the packets are

labeled and so the recipient knows exactly which packets were dropped during transmission. One additional

observation will be useful:

CS 70, Fall 2013, Note 6 1



3) By Property 2 in Note 7, we can uniquely reconstruct $P(x)$ from its values at any $n$ distinct points, since it has degree $n - 1$. This means that $P(x)$ can be reconstructed from any $n$ of the transmitted packets. Evaluating this reconstructed polynomial $P(x)$ at $x = 1, \dots, n$ yields the original message $m_1, \dots, m_n$.

Recall that in our scheme, the transmitted message is $c_1 = P(1), c_2 = P(2), \dots, c_{n+j} = P(n + j)$. Thus, if we hope to be able to correct $k$ errors, we simply need to set $j = k$. The encoded message will then consist of $n + k$ packets.
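Observations 1) to 3) translate into a short encoder and decoder. The Python sketch below is one possible rendering, with hypothetical function names (`interpolate_at`, `encode`, `decode`) and a made-up 4-packet message over $GF(7)$; it assumes $q$ is prime and $n + k \le q$.

```python
def interpolate_at(points, x, q):
    """Evaluate at x the unique polynomial through `points`, working over GF(q)."""
    total = 0
    for xi, yi in points:
        num, den = 1, 1
        for xj, _ in points:
            if xj != xi:
                num = (num * (x - xj)) % q
                den = (den * (xi - xj)) % q
        total = (total + yi * num * pow(den, -1, q)) % q
    return total


def encode(message, k, q):
    """Return the n + k packets c_i = P(i), where P(i) = m_i for i = 1..n."""
    n = len(message)
    points = list(enumerate(message, start=1))
    return [interpolate_at(points, i, q) for i in range(1, n + k + 1)]


def decode(received, n, q):
    """Recover m_1..m_n from any n surviving packets given as {index: value}."""
    points = list(received.items())[:n]
    return [interpolate_at(points, i, q) for i in range(1, n + 1)]


message = [3, 1, 5, 0]          # hypothetical 4-packet message over GF(7)
packets = encode(message, k=2, q=7)
# Suppose packets 2 and 5 are lost in transit.
received = {i: c for i, c in enumerate(packets, start=1) if i not in (2, 5)}
print(decode(received, n=4, q=7))
```

The first $n$ transmitted packets are the message itself, and the decoder recovers the full message from any $n$ surviving packets because the packet labels tell it which evaluation points to use.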

Example

Suppose Alice wants to send Bob a message of $n = 4$ packets and she wants to guard against $k = 2$ lost packets. Then, assuming the packets can be coded up as integers between 0 and 6, Alice can work over
