
Probability - University College Dublinastier/math10240/probabilities-part1.pdf · probability theory we try to make quantitative statements about such situations; we don’t know


Chapter 3

Probability

In probability we try to quantify the likely occurrences of unpredictable events. Probability has a host of applications, including insurance, investments, gambling, population modeling, epidemiology, cryptography, . . .

1. Sample spaces and events

When a fair dice is rolled, the outcome cannot be predicted with any certainty. However, in probability theory we try to make quantitative statements about such situations; we don’t know what number will come up when the fair dice is rolled, but, for example, we can say that the probability of scoring at most two is exactly $\frac{1}{3}$.

Definition 1.1.

(1) An experiment is a process by which an observation is made.

(2) The sample space, Ω, of an experiment is the set of all possible outcomes for the experiment.

(3) A sample point is an element of the sample space, i.e. a possible outcome of the experiment.

(4) An event is a subset of the sample space, i.e. a collection of possible outcomes to the experiment. We say that an event A occurs if the outcome is an element of A.

Example 1.2.

(1) Tossing a coin, sample space = {h, t}.

(2) Tossing a coin twice, sample space = {(h, h), (h, t), (t, h), (t, t)}.


(3) The sample space for throwing a dice is Ω = {1, 2, 3, 4, 5, 6}. The event ‘an odd number is thrown’ is the subset A = {1, 3, 5}. The event ‘a number bigger than 3 is thrown’ is B = {4, 5, 6}. The event ‘an odd number bigger than three is thrown’ is A ∩ B = {5}.

The sample space for a given experiment will often depend on which aspect of the experiment we are interested in.

Example 1.3. A man throws a dice repeatedly. He stops after five throws or as soon as he gets two successive 6s. If we are interested in how many throws there are before the experiment stops, then the sample space is {2, 3, 4, 5}, while if we are interested in the total score when the experiment stops, then the sample space is {5, 6, 7, . . . , 27, 28} (the lower end comes from five 1s, the upper end from three 6s and two 5s).
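The two sample spaces in Example 1.3 can be checked by brute force. The following is a minimal sketch in Python; the helper name `play` is ours and not part of the notes. It enumerates every sequence of five throws, applies the stopping rule, and collects the numbers of throws and the total scores that can actually occur.

```python
from itertools import product

def play(throws):
    """Apply the stopping rule of Example 1.3 to a sequence of five throws.

    Returns (number of throws made, total score when the experiment stops).
    """
    total = 0
    for i, score in enumerate(throws):
        total += score
        # Stop as soon as two successive 6s have been thrown.
        if i >= 1 and throws[i - 1] == 6 and score == 6:
            return i + 1, total
    return len(throws), total

# Every sequence of five throws; throws after the stopping point are ignored.
results = {play(seq) for seq in product(range(1, 7), repeat=5)}
throw_counts = sorted({n for n, _ in results})
totals = sorted({t for _, t in results})

print(throw_counts)           # possible numbers of throws
print(totals[0], totals[-1])  # smallest and largest total score
```

The same experiment therefore yields two different sample spaces, depending on which observation (`throw_counts` or `totals`) we record.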

In this chapter, Ω will always stand for a finite sample space.

The empty set ∅ is called the null event and can be thought of as an impossible outcome to the experiment. The sample space itself is an event, and can be thought of as ‘the experiment has some outcome’. These two are called the trivial events.

Suppose A is an event in Ω (i.e. A ⊆ Ω). If we perform the experiment represented by Ω and the resulting sample point is an element of A then we say A occurs. E.g. in Example 1.2 (3) if we throw the dice and get a 1 then A occurs but B does not occur. An event which contains only one element is called a simple event.

Lemma 1.4. Suppose A and B are events in a sample space Ω. Clearly, A ∪ B, A ∩ B and Ω \ A are also events. Then

(1) A∪ B occurs if and only if A occurs or B occurs,

(2) A∩ B occurs if and only if A occurs and B occurs,

(3) Ω \ A occurs if and only if A does not occur.

Example 1.5. A red dice and a blue dice are thrown once. The sample space for this experiment is

Ω = {(x, y) | x, y ∈ N, x, y ≤ 6} = {(1, 1), (1, 2), . . . , (1, 6), (2, 1), . . . , (6, 6)}.

The pair (x , y) represents the outcome ‘the red dice scores x and the blue dice scores y ’.

Let A be the event ‘odd number on red dice’ and B the event ‘odd number on blue dice’. Then we have the following events:

A∪ B: ‘at least one odd number scored’

A∩ B: ‘both dice are odd’

Ω \ A: ‘red dice is even’

A\ B: ‘red dice is odd and blue dice is even’


(A\ B)∪ (B \ A): ‘one dice is odd and the other even’
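Since events are just subsets of Ω, the combinations listed in Example 1.5 map directly onto set operations. The sketch below, written in Python with variable names of our own choosing, builds the 36-point sample space and counts the outcomes in each combined event.

```python
from itertools import product

# Sample space of Example 1.5: pairs (red score, blue score).
omega = set(product(range(1, 7), repeat=2))

A = {(x, y) for (x, y) in omega if x % 2 == 1}  # odd number on red dice
B = {(x, y) for (x, y) in omega if y % 2 == 1}  # odd number on blue dice

events = {
    "A ∪ B (at least one odd number scored)": A | B,
    "A ∩ B (both dice are odd)": A & B,
    "Ω \\ A (red dice is even)": omega - A,
    "A \\ B (red odd, blue even)": A - B,
    "(A \\ B) ∪ (B \\ A) (one odd, the other even)": (A - B) | (B - A),
}
for name, event in events.items():
    print(f"{name}: {len(event)} outcomes")
```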

Definition 1.6. Two events A and B are called mutually exclusive in Ω if they are disjoint subsets of Ω, that is, A ∩ B = ∅.

That is, two events are mutually exclusive if they cannot both occur at the same time. E.g. when throwing a dice, the events ‘odd number rolled’ and ‘even number rolled’ are mutually exclusive. Of course, for any event A, the events A and Ω \ A are mutually exclusive.

Evidently, in many experiments, some outcomes are more likely than others, and next we introduce a way to measure this.

2. Probability measure

In this section we look at how probability is described and obtain some simple rules for finding the probabilities of some events.

Recall that

[0, 1] = {x | x ∈ R and 0 ≤ x ≤ 1}

denotes the closed interval with endpoints 0 and 1 on the real number line.

Definition 2.1. A probability measure on a finite sample space Ω is a function

p : Ω −→ [0,1]

with the property that

$\sum_{\omega \in \Omega} p(\omega) = 1$.

A probability measure associates with each outcome of an experiment a number between 0 and 1 (inclusive). We interpret p(ω) as the probability that ω will be the outcome of the experiment; this probability ranges from 0 (impossible) to 1 (a certainty). The rule $\sum_{\omega \in \Omega} p(\omega) = 1$ simply means that one of the possible outcomes of the experiment must occur.

Example 2.2.

(1) Ω = {h, t} is the sample space for tossing a fair coin. Define the probability measure p : Ω → [0, 1] by $p(h) = p(t) = \frac{1}{2}$ to reflect the fact that the coin is fair, i.e. heads and tails are equally likely.

(2) Ω = {1, 2, 3, 4, 5, 6} is the sample space for tossing a dice. A fair dice is represented by the probability measure p given by $p(k) = \frac{1}{6}$ for all k ∈ Ω. If the dice is not fair a different probability measure is needed. E.g., suppose getting 6 is twice as likely as getting any other number. Then the right probability measure is $p(1) = p(2) = p(3) = p(4) = p(5) = \frac{1}{7}$ and $p(6) = \frac{2}{7}$.


Definition 2.3. A finite sample space Ω, together with a probability measure p, is called a probability model (Ω, p) of the experiment it represents.

Bear in mind that a single sample space may have many probability measures.

A probability measure gives the ‘probability’ of a sample point (i.e. outcome) in Ω, but whatabout general events?

Definition 2.4. If (Ω, p) is a probability model and A is an event (i.e. A ⊆ Ω) then we define the probability of A to be

$p(A) = \sum_{\omega \in A} p(\omega)$.

Also, p(∅) = 0.

Example 2.5. Consider the unfair dice in Example 2.2 (2). What is the probability that an even number is thrown?

Solution: Here the event is A = {2, 4, 6}. The probability of this event is

$p(A) = \sum_{\omega \in A} p(\omega) = p(2) + p(4) + p(6) = \frac{1}{7} + \frac{1}{7} + \frac{2}{7} = \frac{4}{7}$.
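Definitions 2.1 and 2.4 translate almost word for word into code. The following is a minimal sketch in Python, assuming we store a probability measure as a dictionary from outcomes to exact fractions; the function names `is_probability_measure` and `prob` are our own.

```python
from fractions import Fraction

# Unfair dice of Example 2.2 (2): a 6 is twice as likely as any other score.
p = {k: Fraction(1, 7) for k in range(1, 6)}
p[6] = Fraction(2, 7)

def is_probability_measure(p):
    """Check Definition 2.1: every value lies in [0, 1] and the values sum to 1."""
    return all(0 <= v <= 1 for v in p.values()) and sum(p.values()) == 1

def prob(event, p):
    """Definition 2.4: add up p(ω) over the outcomes ω in the event."""
    return sum((p[w] for w in event), Fraction(0))

even = {2, 4, 6}
print(is_probability_measure(p))
print(prob(even, p))   # probability that an even number is thrown
print(prob(set(), p))  # probability of the null event
```

Using `Fraction` rather than floating-point numbers keeps the check `sum(p.values()) == 1` exact.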

We notice a couple of easy facts.

Fact 2.6.

(1) For any event A, p(A) ≥ 0, because p(A) is the sum of non-negative numbers.

(2) p(Ω) = 1 because $p(\Omega) = \sum_{\omega \in \Omega} p(\omega) = 1$.

The following is very useful when calculating probabilities of events.

Theorem 2.7. If A and B are mutually exclusive events (i.e. A ∩ B = ∅), then

p(A ∪ B) = p(A) + p(B).

More generally, if $A_1, A_2, \ldots, A_n$ are pairwise mutually exclusive (i.e. $A_i \cap A_j = \emptyset$ whenever $i \neq j$) then

$p(A_1 \cup A_2 \cup \cdots \cup A_n) = p(A_1) + p(A_2) + \cdots + p(A_n)$.

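Proof. We prove the first part. By definition,

$p(A \cup B) = \sum_{\omega \in A \cup B} p(\omega)$.

Since A and B are disjoint sets we can split this sum into 2 parts:

$p(A \cup B) = \sum_{\omega \in A \cup B} p(\omega) = \sum_{\omega \in A} p(\omega) + \sum_{\omega \in B} p(\omega) = p(A) + p(B)$.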


However, events are usually not mutually exclusive, and there is still the problem of calculating the probability of the union of two events.

Example 2.8. Throwing a fair dice again: sample space Ω = {1, 2, 3, 4, 5, 6} and probability measure $p(\omega) = \frac{1}{6}$ for all ω ∈ Ω. In future this will be called the probability model for a fair dice.

Let A be the event ‘an even number is rolled’ and B the event ‘a number greater than 3 is rolled’. Then A = {2, 4, 6}, B = {4, 5, 6}, A ∪ B = {2, 4, 5, 6}, and A ∩ B = {4, 6}.

We calculate

$p(A) = \sum_{\omega \in A} p(\omega) = p(2) + p(4) + p(6) = \frac{1}{6} + \frac{1}{6} + \frac{1}{6} = \frac{1}{2}$

and similarly

$p(B) = \frac{1}{2}$, $p(A \cup B) = p(2) + p(4) + p(5) + p(6) = \frac{1}{6} + \frac{1}{6} + \frac{1}{6} + \frac{1}{6} = \frac{2}{3}$, and $p(A \cap B) = p(4) + p(6) = \frac{1}{3}$.

In particular, p(A ∪ B) ≠ p(A) + p(B), i.e. the formula for mutually exclusive events does not hold. However, $\frac{2}{3} = \frac{1}{2} + \frac{1}{2} - \frac{1}{3}$, that is,

p(A ∪ B) = p(A) + p(B) − p(A ∩ B).

Indeed this is the correct formula in all cases.

Theorem 2.9. For any two events A and B in Ω,

p(A∪ B) = p(A) + p(B)− p(A∩ B)

If A and B are mutually exclusive then this formula reduces to that of Theorem 2.7, since then A ∩ B = ∅ and p(∅) = 0.

The following facts can be proved but they are also self-evident.

Theorem 2.10.

(1) For any two events A and B in Ω, if A ⊆ B then p(A) ≤ p(B).

(2) For any event A in Ω, 0 ≤ p(A) ≤ 1.

(3) For any event A in Ω, p(Ω \ A) = 1 − p(A).
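Theorems 2.9 and 2.10 can be checked exhaustively on a small model. The sketch below is in Python with helper names of our own; it uses the unfair dice of Example 2.2 (2) as an illustration, although any probability measure would do. It lists all 64 events and tests the union rule, monotonicity and the complement rule on every one. A check like this is of course not a proof, only a sanity test on one model.

```python
from fractions import Fraction
from itertools import chain, combinations

omega = frozenset(range(1, 7))

# Unfair dice of Example 2.2 (2), chosen only as an illustration.
p = {k: Fraction(1, 7) for k in range(1, 6)}
p[6] = Fraction(2, 7)

def prob(event):
    """Probability of an event: the sum of p(ω) over its outcomes."""
    return sum((p[w] for w in event), Fraction(0))

# All 2**6 = 64 events, i.e. all subsets of omega.
events = [frozenset(c) for c in chain.from_iterable(
    combinations(sorted(omega), r) for r in range(len(omega) + 1))]

union_rule = all(prob(A | B) == prob(A) + prob(B) - prob(A & B)
                 for A in events for B in events)
monotone = all(prob(A) <= prob(B)
               for A in events for B in events if A <= B)
complement_rule = all(prob(omega - A) == 1 - prob(A) for A in events)

print(union_rule, monotone, complement_rule)
```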

3. Equiprobability

Definition 3.1. Let (Ω, p) be the probability model of an experiment (i.e. sample space with a probability measure). We say that the outcomes of the experiment are equiprobable if p(ω) is the same number for every outcome ω ∈ Ω.


E.g. in the probability model for throwing a fair dice (Example 2.8), the outcomes are equiprobable (all equal to $\frac{1}{6}$). Similarly for tossing a fair coin.

Given any finite set E, the number of elements in E is called the cardinality of E and is denoted #E. (Some authors also write |E| instead of #E.)

Remarks 3.2. It is easy to see that if p is a probability measure on an experiment which has n possible outcomes which are equiprobable, then the probability of each outcome is $\frac{1}{n}$. If (Ω, p) is the probability model for this experiment then $p(\omega) = \frac{1}{\#\Omega}$ for every ω in Ω.

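Theorem 3.3. Suppose (Ω, p) is the probability model of an experiment whose outcomes are equiprobable. Then for any event A in Ω,

$p(A) = \frac{\#A}{\#\Omega}$.

Proof. Since the outcomes are equiprobable we have from above that $p(\omega) = \frac{1}{\#\Omega}$ for all ω ∈ Ω. Hence

$p(A) = \sum_{\omega \in A} p(\omega) = \sum_{\omega \in A} \frac{1}{\#\Omega} = \frac{1}{\#\Omega} \sum_{\omega \in A} 1 = \frac{1}{\#\Omega} \cdot \#A = \frac{\#A}{\#\Omega}$.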

Example 3.4. Two fair dice are thrown: one red, one blue. What is the probability that the total score is

(1) exactly seven?

(2) at least nine?

Solution: Let Ω be the sample space for the experiment, i.e.

Ω = {(x, y) | x, y ∈ {1, 2, 3, 4, 5, 6}}

(see Example 1.5). Then #Ω = 36. Each point on the graph below represents an outcome in Ω. For instance, the point (2, 4) represents the outcome ‘the red and blue dice come up 2 and 4, respectively’.


(1) Let A be the event ‘exactly seven is scored’. Then

A = {(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)},

so $p(A) = \frac{\#A}{\#\Omega} = \frac{6}{36} = \frac{1}{6}$.

[Figure: slide ‘Equiprobability: examples’ showing the event A; Lecture 27, MATH10240, April 4, 2014.]

Note. Whenever you roll 2 dice, 7 is the most probable total score. This can easily be seen from the graph: the line corresponding to the event A has the largest number of dots (namely 6), compared to the lines parallel to it, which correspond to the remaining total scores, from 2 (smallest) to 12 (largest).

(2) Let B be the event ‘at least 9 is scored’. Then

B = {(3, 6), (4, 5), (5, 4), (6, 3), (4, 6), (5, 5), (6, 4), (5, 6), (6, 5), (6, 6)}.

By Theorem 3.3, $p(B) = \frac{\#B}{\#\Omega} = \frac{10}{36} = \frac{5}{18}$.

[Figure: slide ‘Equiprobability: examples’ showing the event B; Lecture 27, MATH10240, April 4, 2014.]

Warning 3.5. Remember ‘equiprobable’ means each outcome (sample point) has the same probability. It does not mean that every event has the same probability. E.g. check that the probability of scoring a total of 2 is not equal to the probability of scoring a total of 7 in the experiment of tossing two fair dice.
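The counting in Example 3.4 and the check suggested in Warning 3.5 can be carried out mechanically. The following Python sketch (the helper `prob_total` is a name of our own) counts, for each total score, the number of equiprobable outcomes that produce it, and then applies Theorem 3.3.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# 36 equiprobable outcomes (red score, blue score).
omega = list(product(range(1, 7), repeat=2))

# Number of outcomes giving each total score 2, 3, ..., 12.
counts = Counter(x + y for x, y in omega)

def prob_total(predicate):
    """Theorem 3.3: #A / #Ω for the event 'the total satisfies predicate'."""
    favourable = sum(n for total, n in counts.items() if predicate(total))
    return Fraction(favourable, len(omega))

p_seven = prob_total(lambda t: t == 7)
p_at_least_nine = prob_total(lambda t: t >= 9)
p_two = prob_total(lambda t: t == 2)

print(p_seven, p_at_least_nine, p_two)
print(max(counts, key=counts.get))  # the most probable total score
```

The outcomes are equiprobable, but the events ‘total is 2’ and ‘total is 7’ contain different numbers of outcomes and so have different probabilities.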


4. Conditional Probability

If you toss a fair dice and ask what is the probability of getting 2 then clearly the answer is $\frac{1}{6}$. However, if you are told that the dice shows an odd number then the probability of having got 2 is obviously 0. If you are told that the dice shows an even number then the probability of having thrown 2 is $\frac{1}{3}$. So the different events are not unrelated.

Remarks 4.1. Knowing that an event has occurred may affect the probability of another event. Adding extra information to a situation can change probabilities.

If A and B are 2 events then we write p(B | A) for the probability that B occurs, given that A occurs. E.g.

p(2 rolled | odd number rolled) = 0, and

p(2 rolled | even number rolled) = $\frac{1}{3}$.

Definition 4.2. Suppose we have a probability model (Ω, p) and two events A, B ⊆ Ω with p(A) > 0 (which means that there is some non-zero chance that A will occur). We define the conditional probability of B given A by

$p(B \mid A) = \frac{p(A \cap B)}{p(A)}$

and we say that A and B are independent if p(B | A) = p(B).

Remarks 4.3.

(1) Since A ∩ B ⊆ A, by Theorem 2.10 (1), p(A ∩ B) ≤ p(A), and so $p(B \mid A) = \frac{p(A \cap B)}{p(A)} \le 1$.

(2) $p(A \mid A) = \frac{p(A \cap A)}{p(A)} = \frac{p(A)}{p(A)} = 1$.

(3) If A and B are mutually exclusive then $p(B \mid A) = \frac{p(A \cap B)}{p(A)} = \frac{p(\emptyset)}{p(A)} = 0$. In particular, provided p(Ω \ A) > 0,

p(A | Ω \ A) = 0.

(4) As $p(B \mid A) = \frac{p(A \cap B)}{p(A)}$, we can write

p(A ∩ B) = p(A) p(B | A).

In other words, the probability that A and B both occur is the probability that A occurs multiplied by the conditional probability that B occurs, given that A occurs. This is known as the multiplication rule.

(5) If A and B are independent, the multiplication rule simplifies to

p(A ∩ B) = p(A) p(B).
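Definition 4.2 and the remarks above can be illustrated on the fair dice. The sketch below is in Python; the function names `cond_prob` and `independent` are our own, and `independent` tests the product form p(A ∩ B) = p(A) p(B), which agrees with the definition whenever p(A) > 0.

```python
from fractions import Fraction

omega = set(range(1, 7))  # fair dice, equiprobable outcomes

def prob(event):
    """Theorem 3.3: #A / #Ω for equiprobable outcomes."""
    return Fraction(len(event & omega), len(omega))

def cond_prob(B, A):
    """Conditional probability p(B | A); only defined when p(A) > 0."""
    if prob(A) == 0:
        raise ValueError("p(A) must be positive")
    return prob(A & B) / prob(A)

def independent(A, B):
    """Product form of independence: p(A ∩ B) = p(A) p(B)."""
    return prob(A & B) == prob(A) * prob(B)

two, odd, even = {2}, {1, 3, 5}, {2, 4, 6}
print(cond_prob(two, odd))           # 2 rolled, given an odd number
print(cond_prob(two, even))          # 2 rolled, given an even number
print(independent(even, {1, 2}))     # 'even' and 'at most 2'
print(independent(even, {4, 5, 6}))  # 'even' and 'greater than 3'
```

In this model ‘even’ and ‘at most 2’ turn out to be independent, while ‘even’ and ‘greater than 3’ are not.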


Example 4.4. A biscuit box contains 5 fig rolls and 3 hobnobs. A biscuit is taken at random and eaten. A second biscuit is then taken. Find the probability that (a) a fig roll and then a hobnob are taken and (b) 2 hobnobs are taken (we assume that each biscuit has the same probability of being picked).

Solution:

(a) Let $F_1$ be the event ‘first biscuit is a fig roll’ and $H_2$ the event ‘second biscuit is a hobnob’. Since the first biscuit is chosen at random, $p(F_1) = \frac{5}{8}$. If a fig roll is taken then there are four fig rolls and three hobnobs left, so the probability that the second biscuit is a hobnob given that the first is a fig roll is $p(H_2 \mid F_1) = \frac{3}{7}$. Now the probability that the first is a fig roll and the second is a hobnob is, by the multiplication rule,

$p(F_1 \cap H_2) = p(F_1)\,p(H_2 \mid F_1) = \frac{5}{8} \cdot \frac{3}{7} = \frac{15}{56} \approx 0.26786$.

(b) Using similar notation, let $H_1$ be the event ‘first biscuit is a hobnob’. Then $p(H_1) = \frac{3}{8}$ and $p(H_2 \mid H_1) = \frac{2}{7}$, and so the probability that both are hobnobs is

$p(H_1 \cap H_2) = p(H_1)\,p(H_2 \mid H_1) = \frac{3}{8} \cdot \frac{2}{7} = \frac{6}{56} \approx 0.107143$.

With the same notation for events (and $F_2$ the event ‘second biscuit is a fig roll’), we can calculate

$p(F_1 \cap F_2) = p(F_1)\,p(F_2 \mid F_1) = \frac{5}{8} \cdot \frac{4}{7} = \frac{20}{56}$

and

$p(H_1 \cap F_2) = p(H_1)\,p(F_2 \mid H_1) = \frac{3}{8} \cdot \frac{5}{7} = \frac{15}{56}$.

Notice that

$p(H_1 \cap H_2) + p(H_1 \cap F_2) + p(F_1 \cap F_2) + p(F_1 \cap H_2) = \frac{6}{56} + \frac{15}{56} + \frac{20}{56} + \frac{15}{56} = 1$.

This is expected because exactly 1 of the 4 events must occur.
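The multiplication-rule answers in Example 4.4 can be cross-checked by direct counting. The sketch below is in Python, with names of our own choosing. It treats the eight biscuits as distinguishable, so that the 8 × 7 = 56 ordered draws of two different biscuits are equiprobable, and then applies Theorem 3.3.

```python
from fractions import Fraction
from itertools import permutations

biscuits = ["F"] * 5 + ["H"] * 3  # 5 fig rolls, 3 hobnobs

# Ordered draws of two different biscuits: 8 * 7 = 56 equiprobable outcomes.
draws = list(permutations(range(len(biscuits)), 2))

def prob(first, second):
    """Probability that the first and second biscuits have the given kinds."""
    hits = sum(1 for i, j in draws
               if biscuits[i] == first and biscuits[j] == second)
    return Fraction(hits, len(draws))

print(prob("F", "H"))  # fig roll, then hobnob
print(prob("H", "H"))  # two hobnobs
print(sum(prob(a, b) for a in "FH" for b in "FH"))  # the four cases together
```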

Remarks 4.5. People often find conditional probabilities confusing. For example, in the problem with the hobnobs and fig rolls, when a person is asked “what is the probability of picking two hobnobs”, they will think, “well, it is the probability that the first one is a hobnob, times the probability that the second one is a hobnob”. In formula form, people easily think that

$p(H_1 \cap H_2) = p(H_1)\,p(H_2)$.

This is FALSE however, as the act of taking the first hobnob changes the situation! Namely, there is one hobnob fewer to choose from. Therefore, instead of using $p(H_2)$, we need to use the conditional probability that the second biscuit is a hobnob, given that the first one was already a hobnob, i.e., $p(H_2 \mid H_1)$, in which case we obtain the correct formula

$p(H_1 \cap H_2) = p(H_1)\,p(H_2 \mid H_1)$.

Consider now a different scenario: after you take and eat the first hobnob, someone tops up the biscuit box with a new hobnob, so that the numbers of hobnobs and fig rolls remain unchanged. In this case, taking a second hobnob is independent of taking the first hobnob, and the formula

$p(H_1 \cap H_2) = p(H_1)\,p(H_2)$

is correct!

We can generalise the multiplication rule as follows.

Theorem 4.6. Suppose we have n events $A_1, A_2, \ldots, A_n$ in Ω, each of which has non-zero probability. Then

$p(A_1 \cap A_2 \cap \cdots \cap A_n) = p(A_1) \times p(A_2 \mid A_1) \times p(A_3 \mid A_1 \cap A_2) \times \cdots \times p(A_n \mid A_1 \cap \cdots \cap A_{n-1})$.

Example 4.7 (A birthday problem). 35 people are selected at random. What is the probability that at least 2 of these people have the same birthday?

Solution: The sample space Ω is the set of 35-tuples of dates, e.g. one element of Ω is

(6 Feb, 13 Aug, 1 March, . . .).

Let $A_k$ be the event ‘the kth person has a different birthday from that of any of the previous k − 1 people’. Then the event A, ‘none of the people have a birthday in common’, is represented by

$A = A_1 \cap A_2 \cap \cdots \cap A_{35}$.

So the probability that no two have a common birthday is

$p(A) = p(A_1 \cap A_2 \cap \cdots \cap A_{35}) = p(A_1) \times p(A_2 \mid A_1) \times p(A_3 \mid A_1 \cap A_2) \times \cdots \times p(A_{35} \mid A_1 \cap \cdots \cap A_{34})$.

Now $A_1$ is the event that the 1st person has a birthday different from the previous 0 people. This always occurs, i.e. $A_1 = \Omega$ and $p(A_1) = 1$. Also, $p(A_2 \mid A_1)$, the probability that the 2nd person has a different birthday from the 1st, is $\frac{364}{365}$ (we are ignoring leap years in this example).

If the first two have different birthdays (i.e. $A_1 \cap A_2$ occurs) then the probability that the 3rd person has a different birthday from these two is $\frac{363}{365}$, i.e. $p(A_3 \mid A_1 \cap A_2) = \frac{363}{365}$.

Similarly $p(A_4 \mid A_1 \cap A_2 \cap A_3) = \frac{362}{365}$ and so on, down as far as

$p(A_{35} \mid A_1 \cap A_2 \cap \cdots \cap A_{34}) = \frac{331}{365}$.

This means that

$p(A) = 1 \cdot \frac{364}{365} \cdot \frac{363}{365} \cdots \frac{331}{365} \approx 0.1856$.

Now the probability that at least 2 do have the same birthday is p(Ω \ A) = 1 − p(A) ≈ 0.8144 (by Theorem 2.10 (3)).
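The long product in Example 4.7 is tedious by hand but short in code. The sketch below is in Python; the function name `prob_all_distinct` is our own, and like the example it ignores leap years and assumes all 365 birthdays are equally likely. It multiplies the conditional probabilities from the generalised multiplication rule.

```python
from math import prod

def prob_all_distinct(n, days=365):
    """Probability that n people all have different birthdays.

    The k-th factor (k = 0, ..., n - 1) is the conditional probability that
    the next person avoids the k birthdays already taken.
    """
    return prod((days - k) / days for k in range(n))

p_no_match = prob_all_distinct(35)
print(round(p_no_match, 4))      # no two of the 35 share a birthday
print(round(1 - p_no_match, 4))  # at least two share a birthday
```

Calling `prob_all_distinct` with other values of `n` shows how quickly the probability of a shared birthday grows with the size of the group.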


5. Applied Examples

Example 5.1. The results of the germination of yellow and white pine seeds are summarised in the following table.

              Germinated   Not germinated   Total
Yellow pine       66             12           78
White pine        79             20           99
Total            145             32          177

Compute the following probabilities:

(a) The probability that a seed has germinated;

(b) The probability that a seed is yellow pine;

(c) The probability that a seed has germinated and is yellow pine;

(d) The conditional probability that a seed germinates given it is a yellow pine.

Solution: Let G be the event that a seed has germinated and Y be the event that a seed is yellow pine.

(a) $p(G) = \frac{145}{177} \approx 0.82$

(b) $p(Y) = \frac{78}{177} \approx 0.44$

(c) $p(G \cap Y) = \frac{66}{177} \approx 0.37$

(d) $p(G \mid Y) = \frac{p(G \cap Y)}{p(Y)} = \frac{66/177}{78/177} = \frac{66}{78} \approx 0.85$
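The probabilities in Example 5.1 can be read off the table programmatically. The following is a minimal sketch in Python, assuming the table is stored as a dictionary keyed by (species, result); the names `counts` and `prob` are our own. Working with exact fractions avoids the small error that creeps in when values already rounded to two decimal places are divided.

```python
from fractions import Fraction

# Germination counts from the table in Example 5.1.
counts = {
    ("yellow", "germinated"): 66, ("yellow", "not germinated"): 12,
    ("white", "germinated"): 79, ("white", "not germinated"): 20,
}
total = sum(counts.values())  # 177 seeds in all

def prob(predicate):
    """Proportion of seeds whose (species, result) key satisfies predicate."""
    favourable = sum(n for key, n in counts.items() if predicate(key))
    return Fraction(favourable, total)

p_G = prob(lambda key: key[1] == "germinated")
p_Y = prob(lambda key: key[0] == "yellow")
p_G_and_Y = prob(lambda key: key == ("yellow", "germinated"))
p_G_given_Y = p_G_and_Y / p_Y  # Definition 4.2

for name, value in [("p(G)", p_G), ("p(Y)", p_Y),
                    ("p(G ∩ Y)", p_G_and_Y), ("p(G | Y)", p_G_given_Y)]:
    print(f"{name} = {value} ≈ {float(value):.4f}")
```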

Example 5.2. The table below summarizes the results of a survey of 100 people regarding the relation between smoking and lung disease.

                   Smoker   Non-smoker   Total
Has lung disease     10          3         13
No lung disease      14         73         87
Total                24         76        100

One person is selected at random out of these 100 people. Compute the following probabilities:

(a) The probability that a selected person is a smoker;

(b) The probability that a selected person does not have lung disease;

(c) The probability that a selected person has lung disease given that he/she is a smoker;


(d) The probability that a selected person has lung disease given that he/she is not a smoker.

Solution: Let S be the event that a selected person is a smoker and NS be the event that a selected person is a non-smoker. Let L be the event that a selected person has lung disease and NL be the event that a selected person has no lung disease.

(a) $p(S) = \frac{24}{100} = 0.24$

(b) $p(NL) = \frac{87}{100} = 0.87$

(c) $p(L \mid S) = \frac{p(L \cap S)}{p(S)} = \frac{0.10}{0.24} \approx 0.4167$

(d) $p(L \mid NS) = \frac{p(L \cap NS)}{p(NS)} = \frac{0.03}{0.76} \approx 0.0395$

6. Picking k elements from n: Binomial Coefficients

Notice that Theorem 3.3 implies that finding the probability of an event for an experiment with equiprobable outcomes becomes just a problem of counting, or combinatorics.

The main tool we take from combinatorics is the idea of binomial coefficients.

Consider a set of n elements, e.g., a standard deck of 52 cards. Now pick an integer k between 0 and n inclusive. How many ways can you pick k elements from the original n? E.g. how many ways can you pick 5 cards from a standard deck, i.e. how many poker hands are there? The answer is the binomial coefficient

$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$

where n! denotes the factorial of n, i.e. 1 × 2 × · · · × (n − 1) × n (with 0! = 1).

The factorials n! grow extremely fast:

3! = 1 × 2 × 3 = 6, 5! = 1 × 2 × 3 × 4 × 5 = 120,

10! = 1 × 2 × 3 × · · · × 9 × 10 = 3,628,800

and $59! > 10^{80}$, the latter being an estimate for the number of atoms in the observable universe (i.e. 59! is a large number). Fortunately, to calculate $\binom{n}{k}$, we often don’t have to toil to find these huge numbers. E.g. the number of poker hands is

$\binom{52}{5} = \frac{52!}{5!\,47!} = \frac{1 \times 2 \times \cdots \times 46 \times 47 \times 48 \times 49 \times 50 \times 51 \times 52}{5! \times 1 \times 2 \times \cdots \times 46 \times 47} = \frac{48 \times 49 \times 50 \times 51 \times 52}{120} = 2{,}598{,}960$.
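For readers who want to experiment, binomial coefficients and factorials are available in Python's standard library. The sketch below compares the defining formula (wrapped in a function `binomial`, a name of our own) with the built-in `math.comb`, and confirms the size of 59! quoted above using exact integer arithmetic.

```python
from math import comb, factorial

def binomial(n, k):
    """Binomial coefficient straight from the definition n! / (k! (n - k)!)."""
    return factorial(n) // (factorial(k) * factorial(n - k))

poker_hands = comb(52, 5)  # the built-in avoids forming the huge factorials
print(poker_hands)
print(binomial(52, 5) == poker_hands)

# Python integers are exact, so the comparison with 10**80 is reliable.
print(factorial(10))
print(factorial(59) > 10**80)
```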


Example 6.1. A fair coin is tossed 10 times. What is the probability that exactly 6 heads come up?

Solution: The sample space Ω is the set of all 10-symbol long strings of hs and ts. It has $2^{10} = 1024$ elements. Each outcome has the same probability $2^{-10}$. If A is the event ‘6 heads came up’ then to get #A we need to count how many of those strings contain 6 heads. This equals the number of ways of choosing 6 items from 10, i.e.

$\#A = \binom{10}{6} = \frac{10!}{6!\,4!} = \frac{7 \times 8 \times 9 \times 10}{24} = 210$.

Hence

$p(A) = \frac{210}{1024} = \frac{105}{512} \approx 0.20508$.

[Figure: bar chart showing the probabilities of getting k heads after 10 fair coin tosses, where 0 ≤ k ≤ 10.]

[Figure: second bar chart showing the probabilities of getting k heads after 20 fair coin tosses, where 0 ≤ k ≤ 20.]
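The probabilities plotted in the two bar charts can be regenerated with a few lines of code. The following Python sketch (the function name `heads_distribution` is our own) applies the counting argument of Example 6.1 to every k and prints a rough text version of the 10-toss chart.

```python
from fractions import Fraction
from math import comb

def heads_distribution(n):
    """Probability of exactly k heads in n fair coin tosses, for k = 0..n."""
    return {k: Fraction(comb(n, k), 2 ** n) for k in range(n + 1)}

dist10 = heads_distribution(10)
dist20 = heads_distribution(20)

# Rough text bar chart for 10 tosses: one '#' per percentage point.
for k, pk in dist10.items():
    print(f"{k:2d} {float(pk):.5f} " + "#" * round(float(pk) * 100))
```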

Example 6.2. A farmer has 40 cattle in a field: 25 cows and 15 bullocks. He asks his helper to bring two cows into the parlour for milking. The helper, having been raised in the city and not knowing a cow from a bullock, returns with a random selection of 5 cattle, hoping that at least 2 are cows.

What is the probability that the helper has brought back at least 2 cows?


Solution: Let Ω be the sample space for this experiment (each element of Ω is a set of 5 cattle chosen from 40). The number of elements in Ω is the number of ways of choosing five objects from 40, i.e. $\#\Omega = \binom{40}{5}$.

Let A be the event ‘the 5 cattle contain at least 2 cows’. Then Ω \ A is the event ‘the 5 cattle contain at most 1 cow’, so we can write Ω \ A as the union of two events:

B = ‘the 5 cattle contain no cows’, and

C = ‘the 5 cattle contain exactly 1 cow’.

Now Ω \ A = B ∪ C so p(Ω \ A) = p(B ∪ C) and since B and C are clearly mutually exclusive events,

p(Ω \ A) = p(B ∪ C) = p(B) + p(C).

This experiment has equiprobable outcomes so $p(B) = \frac{\#B}{\#\Omega}$. We can see that $\#B = \binom{15}{5}$ (choosing 5 of the 15 bullocks) and so

$p(B) = \frac{\#B}{\#\Omega} = \frac{\binom{15}{5}}{\binom{40}{5}}$.

Similarly, $\#C = \binom{15}{4}\binom{25}{1} = 25\binom{15}{4}$, so

$p(C) = \frac{\#C}{\#\Omega} = \frac{25\binom{15}{4}}{\binom{40}{5}}$.

This gives

$p(\Omega \setminus A) = p(B) + p(C) = \frac{\binom{15}{5} + 25\binom{15}{4}}{\binom{40}{5}} = \frac{37128}{658008} \approx 0.05642$

and hence $p(A) = 1 - p(\Omega \setminus A) = \frac{1990}{2109} \approx 0.94358$.
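The arithmetic in this solution can be confirmed with exact fractions. The sketch below is in Python; the function `prob_cows` and its parameter names are our own. It counts the selections containing exactly k cows and divides by the total number of selections, then uses the complement rule as in the solution.

```python
from fractions import Fraction
from math import comb

def prob_cows(k, cows=25, bullocks=15, picked=5):
    """Probability that a random selection of `picked` cattle has exactly k cows."""
    favourable = comb(cows, k) * comb(bullocks, picked - k)
    return Fraction(favourable, comb(cows + bullocks, picked))

p_at_most_one = prob_cows(0) + prob_cows(1)  # p(B) + p(C)
p_at_least_two = 1 - p_at_most_one           # complement rule

print(p_at_most_one, float(p_at_most_one))
print(p_at_least_two, float(p_at_least_two))
```

The `picked` parameter can be changed to explore selections of other sizes.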

What is the probability that the helper has 2 cows if he brings back a selection of just 3 cattle?

Example 6.3. A poker hand is dealt. What is the probability that the hand is

(1) 3 of a kind (3 of 1 rank, 1 of a different rank, and 1 of a 3rd rank),

(2) a full house (3 of 1 rank, and 2 of another, different rank),

(3) 2 pair (2 of 1 rank, 2 of another, different rank, and 1 of a 3rd rank),

(4) a flush (5 of the same suit)?

Solution: First of all, each element of the sample space, Ω, is a hand of five cards dealt from a total of 52, so $\#\Omega = \binom{52}{5} = 2{,}598{,}960$.

(1) If A is the event ‘the hand dealt is three of a kind’ then what is #A?


First, choose one of the 13 ranks, next choose 3 of the 4 suits. For the last two cards, we have to choose 2 ranks from the remaining 12 and there is a choice of 4 suits for each of the 2 cards. Therefore, the number of choices available is

$\binom{13}{1}\binom{4}{3}\binom{12}{2}\binom{4}{1}\binom{4}{1} = 13 \times 4 \times 66 \times 4 \times 4 = 54{,}912$.

Thus

$p(A) = \frac{\#A}{\#\Omega} = \frac{54{,}912}{2{,}598{,}960} \approx 0.02113$.

(2) If B is the event ‘a full house’ then first we have a choice of 13 ranks and 3 suits from 4 with which to fill the 3 similar cards. For the last 2, we have to choose 1 rank from the remaining 12 and 2 suits from 4, so that the number of possible full houses is

$\binom{13}{1}\binom{4}{3}\binom{12}{1}\binom{4}{2} = 3{,}744$.

Thus

$p(B) = \frac{\#B}{\#\Omega} = \frac{3{,}744}{2{,}598{,}960} \approx 0.00144$.

Similarly, you can check that the answers to (3) and (4) are approximately 0.04754 and 0.00198, respectively.
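The four counts in Example 6.3 can be reproduced with `math.comb`. In the Python sketch below, the counts for three of a kind and a full house follow the solution above. The counts for two pair and a flush are our own reading of the definitions in the question (the notes leave them as a check), and the flush count includes straight flushes, since the question only asks for 5 cards of the same suit.

```python
from math import comb

hands = comb(52, 5)

# Rank of the triple, its 3 suits, 2 further ranks, one suit for each.
three_of_a_kind = comb(13, 1) * comb(4, 3) * comb(12, 2) * comb(4, 1) * comb(4, 1)

# Rank of the triple, its 3 suits, rank of the pair, its 2 suits.
full_house = comb(13, 1) * comb(4, 3) * comb(12, 1) * comb(4, 2)

# Assumed counting: 2 ranks for the pairs, 2 suits for each pair,
# then 1 of the 11 remaining ranks and a suit for the fifth card.
two_pair = comb(13, 2) * comb(4, 2) * comb(4, 2) * comb(11, 1) * comb(4, 1)

# Assumed counting: a suit, then any 5 of its 13 ranks.
flush = comb(4, 1) * comb(13, 5)

for name, count in [("three of a kind", three_of_a_kind),
                    ("full house", full_house),
                    ("two pair", two_pair),
                    ("flush", flush)]:
    print(f"{name}: {count} hands, probability {count / hands:.5f}")
```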

Example 6.4. A 13-card hand is dealt from a pack of cards. Calculate the probability of a 4-3-3-3 suit pattern and of a 4-4-3-2 suit pattern.

Solution: The sample space consisting of all 13-card hands has $\binom{52}{13}$ elements. Let A be the event ‘4-3-3-3 suit pattern’ and B the event ‘4-4-3-2 suit pattern’.

First, we count the elements of A. There are 4 choices for which suit is to have 4 cards and, following this, there are $\binom{13}{4}$ ways of choosing the ranks of the cards in that suit. For each of the other three suits, there are $\binom{13}{3}$ ways of choosing the ranks of the cards from that suit, and so $\#A = 4\binom{13}{4}\binom{13}{3}\binom{13}{3}\binom{13}{3}$.

It follows that

$p(A) = \frac{\#A}{\#\Omega} = \frac{4\binom{13}{4}\binom{13}{3}^{3}}{\binom{52}{13}} = \frac{66{,}905{,}856{,}160}{635{,}013{,}559{,}600} \approx 0.1054$.

Similarly for B, we count #B. First, we have to choose the 2 suits which will contain 4 cards: $\binom{4}{2}$ choices for this. Now choose the ranks of the cards in these two suits: $\binom{13}{4}\binom{13}{4}$ choices for these. Now choose which suit is to have 3 cards ($\binom{2}{1}$ choices), and the number of choices of the ranks for the last two suits is $\binom{13}{3}\binom{13}{2}$. Multiplying all these choices, we have

$\#B = \binom{4}{2}\binom{2}{1}\binom{13}{4}^{2}\binom{13}{3}\binom{13}{2} = 136{,}852{,}887{,}600$.

It follows that p(B) ≈ 0.2155.

Example 6.5. A lotto game consists of picking 6 numbers from a panel of 42 numbers. Calculate

(1) the probability of winning the game;

(2) the probability of getting five matching numbers;


(3) the probability of getting four matching numbers;

(4) the probability of not matching any numbers.

Solution: The experiment here is choosing a combination of 6 numbers from 42. There are $\binom{42}{6} = 5{,}245{,}786$ such combinations, i.e. #Ω = 5,245,786.

(1) There is only one winning combination. If A is the event ‘our combination wins the game’ then #A = 1 and $p(A) = \frac{\#A}{\#\Omega} = \frac{1}{5{,}245{,}786} \approx 0.0000001906292$.

(2) Let B be the event ‘our combination has 5 matching numbers’. What is #B? In choosing 5 matching numbers we must choose 5 of the 6 winning numbers and also choose 1 of the 36 non-winning numbers. Thus, $\#B = \binom{6}{5}\binom{36}{1} = 6 \times 36 = 216$, and the probability of matching five numbers is $p(B) = \frac{216}{5{,}245{,}786} \approx 0.0000411759$.

(3) Letting C be the event ‘our combination has 4 matching numbers’, the same reasoning gives us that $\#C = \binom{6}{4}\binom{36}{2} = 15 \times 630 = 9450$ and $p(C) = \frac{9450}{5{,}245{,}786} \approx 0.001801446$.

(4) Here D is the event ‘no numbers are matching’. We get $\#D = \binom{6}{0}\binom{36}{6} = 1 \times 1{,}947{,}792$ and p(D) ≈ 0.371306.
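All four parts of this solution use the same counting pattern: choose k of the 6 winning numbers and 6 − k of the 36 non-winning numbers. The sketch below captures it in one Python function; the name `prob_matches` and its parameters are our own.

```python
from fractions import Fraction
from math import comb

def prob_matches(k, drawn=6, panel=42):
    """Probability that a ticket matches exactly k of the winning numbers."""
    favourable = comb(drawn, k) * comb(panel - drawn, drawn - k)
    return Fraction(favourable, comb(panel, drawn))

for k in (6, 5, 4, 0):
    print(k, prob_matches(k), float(prob_matches(k)))
```

Since a ticket matches exactly one number of winning numbers between 0 and 6, the seven probabilities `prob_matches(0)`, ..., `prob_matches(6)` must sum to 1, which makes a convenient check.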

Exercise 6.6. Calculate the probabilities of the other possibilities. That is, the probabilitythat one matches 3, 2 and 1 winning numbers, respectively. Which is most likely?

7. Random variables

In the lotto experiment, each element ω of the sample space is a combination of 6 numbers chosen from 42. However, we were not so interested in ω itself, but rather in the number X(ω), where X(ω) is the number of matching numbers achieved. Notice that X is a function, X : Ω → R. A function which assigns a number to each possible outcome of an experiment is called a random variable. The set of all the values the random variable takes is called the image of the random variable. E.g., X(Ω) for the lotto game is {0, 1, 2, 3, 4, 5, 6}.

Some events can be described in terms of random variables. E.g., the event ‘get 3 matching numbers’ can be represented as {ω ∈ Ω | X(ω) = 3} or X = 3. In Example 6.5 we found p({ω ∈ Ω | X(ω) = k}) (also written p(X = k)) for k = 6, k = 5, k = 4 and k = 0. Finding all the values of p({ω ∈ Ω | X(ω) = k}) (for all possible k) is called finding the probability distribution (or probability density) of X.

Example 7.1. A fair coin is tossed 10 times. Let X be the number of heads that came up (so the image of X is {0, 1, 2, . . . , 10}). What is the probability distribution of the random variable X?

Solution: The case k = 6 was done in Example 6.1, and the same counting argument works for every k:

$p(X = k) = \frac{\#\{X = k\}}{\#\Omega} = \frac{\binom{10}{k}}{1024}$.

The probability distribution of X is represented by the first bar chart in Example 6.1.


Example 7.2. A poker hand is dealt fairly. Let X be the number of clubs in the hand. What is the probability distribution of X?

Solution: The image of X is X(Ω) = {0, 1, 2, 3, 4, 5} (Ω is the set of all possible hands of five cards). We have to calculate p(X = k) for each k ∈ X(Ω). The number of hands containing k clubs is $\binom{13}{k}\binom{39}{5-k}$. Since there are $\binom{52}{5}$ equiprobable hands,

$p(X = k) = \frac{\#\{X = k\}}{\#\Omega} = \frac{\binom{13}{k}\binom{39}{5-k}}{\binom{52}{5}}$.

Working these out on a calculator, we get (roughly)

k    p(X = k)
0    0.2215
1    0.4114
2    0.2743
3    0.0815
4    0.0107
5    0.0005

8. Expectation of a random variable

You can see in Example 7.2 that 1 is the most probable number of clubs that one gets. However, for practical purposes, it is often more useful to know the average number of clubs in each hand if we performed the experiment a large number of times. This leads to the expected value of a random variable.

Definition 8.1. If X : Ω → R is a random variable then the expectation or expected value of X is defined by

$E(X) = \sum_{k \in X(\Omega)} k\,p(X = k)$.

While E(X) may not be an element of X(Ω) (that is, E(X) may not be a possible value of X), we think of E(X) as the value X takes ‘on average’.


Example 8.2. We calculate the expectation of the random variable X in Example 7.2.

$E(X) = \sum_{k \in X(\Omega)} k\,p(X = k) = \sum_{k=0}^{5} k\,p(X = k)$

$= 0 \cdot p(X = 0) + 1 \cdot p(X = 1) + 2 \cdot p(X = 2) + 3 \cdot p(X = 3) + 4 \cdot p(X = 4) + 5 \cdot p(X = 5)$

$\approx 0 + 1 \times 0.4114 + 2 \times 0.2743 + 3 \times 0.0815 + 4 \times 0.0107 + 5 \times 0.0005$

$\approx 1.25$.

The answer here is what one expects: a quarter of the cards in the pack are clubs, so the number of clubs in a hand of five cards averages $\frac{5}{4} = 1.25$.

If you repeated the experiment of dealing 5 cards from a pack of 52 many times, the average number of clubs would be close to 1.25.
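Definition 8.1 is a one-line sum once the distribution is known. The sketch below is in Python; the function name `expectation` and the dictionary `clubs` are our own. It rebuilds the distribution of Example 7.2 with exact fractions and computes the expected number of clubs without any rounding.

```python
from fractions import Fraction
from math import comb

def expectation(dist):
    """Definition 8.1: sum of k * p(X = k) over the image of X."""
    return sum(k * pk for k, pk in dist.items())

# Distribution of the number of clubs in a five-card hand (Example 7.2).
clubs = {k: Fraction(comb(13, k) * comb(39, 5 - k), comb(52, 5))
         for k in range(6)}

for k, pk in clubs.items():
    print(k, f"{float(pk):.4f}")
print(expectation(clubs))
```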

Example 8.3. Suppose the lotto company pays €2 million for a winning ticket, €1,500 for a match 5 and €10 for a match 4 ticket. For each ticket ω, let X(ω) be the number of matching numbers and let Y(ω) be the amount of money ω wins. Like X, Y is a random variable. What is the expectation of Y?

Solution: Here, Y(Ω) = {0, 10, 1500, 2000000}, the set of possible payouts.

$E(Y) = \sum_{k \in Y(\Omega)} k\,p(Y = k)$

$= 0 \cdot p(Y = 0) + 10 \cdot p(Y = 10) + 1500 \cdot p(Y = 1500) + 2000000 \cdot p(Y = 2000000)$

$= 0 \cdot p(X \le 3) + 10 \cdot p(X = 4) + 1500 \cdot p(X = 5) + 2000000 \cdot p(X = 6)$

$\approx 0.46104$, using the numbers from Example 6.5.

Suppose many people play the lotto many times according to these rules. Then a few people may hit the jackpot and be paid €2 million. However, most people will get less or nothing at all, and the average payout per person per game would be close to 46 cent. If each pays €1 for each play then each person will, on average, make a loss of about 54 cent!
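The expected payout can be recomputed from the counts in Example 6.5. The following is a minimal Python sketch, with names of our own choosing and the €1 ticket price taken from the paragraph above. It maps each number of matches to its payout and sums payout × probability.

```python
from fractions import Fraction
from math import comb

def prob_matches(k, drawn=6, panel=42):
    """Probability that a ticket matches exactly k winning numbers."""
    return Fraction(comb(drawn, k) * comb(panel - drawn, drawn - k),
                    comb(panel, drawn))

# Payout in euro for each number of matches; 3 or fewer matches pay nothing.
payout = {6: 2_000_000, 5: 1_500, 4: 10}

expected_payout = sum(payout.get(k, 0) * prob_matches(k) for k in range(7))
ticket_price = 1

print(float(expected_payout))                 # average payout per play
print(float(ticket_price - expected_payout))  # average loss per play
```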