
Binomial Probability Distribution

Here we study a special discrete PD (PD will stand for Probability Distribution) known as the Binomial PD.


Row 0:          1
Row 1:        1   1
Row 2:      1   2   1
Row 3:    1   3   3   1
Row 4:  1   4   6   4   1

What I have here is something called Pascal’s Triangle. Note:
a) Each row starts and ends with a 1.
b) Any number that is not a 1 in the table is found by adding the values in the two blocks directly above it. For example, in row 2 the 2 = 1 + 1. Another example, in row 4 the 6 = 3 + 3.
c) A person could keep adding rows, and the 5th row would have the numbers 1, 5, 10, 10, 5, 1.
d) In any row the sum of the values = 2 raised to the power of the row number. For example, in row 3 the sum is 1 + 3 + 3 + 1 = 8 = 2^3 (the ^ symbol means raise to the power).
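If you want to check the triangle rules yourself, here is a small Python sketch (Python is my choice for illustration, and the function name pascal_rows is made up for this note). It builds each row from the row above it and prints the row sums so they can be compared with powers of 2.

```python
def pascal_rows(last_row):
    """Return rows 0..last_row of Pascal's Triangle as lists of ints."""
    rows = [[1]]  # row 0
    for _ in range(last_row):
        prev = rows[-1]
        # Each interior value is the sum of the two values directly above it.
        middle = [prev[i] + prev[i + 1] for i in range(len(prev) - 1)]
        # Every row starts and ends with a 1.
        rows.append([1] + middle + [1])
    return rows


triangle = pascal_rows(5)
for number, row in enumerate(triangle):
    print(f"Row {number}: {row}  sum = {sum(row)}  2^{number} = {2 ** number}")
```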


I believe Pascal’s Triangle can help us understand the binomial distribution. Now I want to start with a classic example of a binomial distribution where a coin is flipped 4 times.

Since I picked flipping a coin 4 times I would look at row 4 of the triangle. The numbers in the row have meaning for us. When you flip a coin once it can be either heads or tails (on the side you see after it lands). Let’s say heads is our focus of what we can see. Then when we flip a coin 4 times the number of heads could be 0, 1, 2, 3, or 4.

The number of heads in 4 flips:  0  1  2  3  4
Numbers in row 4:                1  4  6  4  1

So, when there are 0 heads in the 4 flips the number 1 below the 0 means there is only 1 way for this to happen. That would be TTTT, or all 4 flips come up tails.


On the previous slide, under 1 head in 4 flips, we see the number 4. This means that there are 4 ways to have 1 head: TTTH, TTHT, THTT, HTTT.

Similarly, 2 heads in 4 flips can happen 6 ways: TTHH, THTH, THHT, HTHT, HTTH, HHTT.

We could do this for 3 heads and 4 heads. (Will you do it?)
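One way to check the lists above (and to do the 3 heads and 4 heads cases) is to let a computer write out all 16 possible sequences. This is only a sketch in Python; the variable names flips and ways_by_heads are mine.

```python
from itertools import product

# Every possible sequence of 4 flips, each flip being T or H (2^4 = 16 sequences).
flips = ["".join(seq) for seq in product("TH", repeat=4)]

# Group the sequences by how many heads they contain.
ways_by_heads = {k: [s for s in flips if s.count("H") == k] for k in range(5)}

for k in range(5):
    print(f"{k} heads: {len(ways_by_heads[k])} ways -> {ways_by_heads[k]}")
```

The counts printed for 0 through 4 heads should line up with row 4 of the triangle.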

Now, check out the formula n!/(k!(n-k)!). n is the number of flips and k is the number of heads, since we said heads was our focal point. The ! symbol means factorial. When you have a number followed by a factorial sign you multiply the number by the next lowest number all the way down to one. For example, 4! = 4 times 3 times 2 times 1 = 4(3)(2)(1) = 24.


0 heads in 4 flips happens 4!/(0!(4-0)!) = 4!/(0!4!) = 1 way (since 0! = 1).

1 head in 4 flips happens 4!/(1!(4-1)!) = 4(3)(2)(1)/[1(3)(2)(1)] = 4 ways.

2 heads in 4 flips happens 4!/(2!(4-2)!) = 24/4 = 6 ways.

3 heads in 4 flips happens 4!/(3!(4-3)!) = 24/6 = 4 ways.

4 heads in 4 flips happens 4!/(4!(4-4)!) = 24/24 = 1 way.
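The same counts can be checked with a few lines of Python. This is a sketch: n_ways is a name I made up, while factorial and comb are functions in Python's standard math module (comb needs Python 3.8 or later).

```python
from math import comb, factorial


def n_ways(n, k):
    """Number of ways to get k events of interest in n trials: n!/(k!(n-k)!)."""
    return factorial(n) // (factorial(k) * factorial(n - k))


counts = [n_ways(4, k) for k in range(5)]
print(counts)

# math.comb computes the same quantity directly.
print([comb(4, k) for k in range(5)])
```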

The next thing we want to do is construct the binomial probability distribution of the number of heads in 4 flips.


Binomial Distribution

When you flip a coin you can get a head or a tail. You could observe heads or tails on several flips of a coin. Say you flip the coin n times (n is a general number of times and when we have a specific problem we usually have a specific value for n).

On each flip we might call heads the “event of interest” and after n flips we might be interested in how many of the n flips gave the event of interest. The possibilities for the number of events of interest take on the discrete values 0, 1, 2 all the way through n. Thus the binomial distribution is really just the distribution of a variable with discrete values from 0 to n. But, certain conditions must hold. I show those soon.

Now, on a coin flip heads has probability .5, but the probability of the event of interest on any one trial in the more general binomial process does not have to be .5.


Properties of a Binomial process.

1) The sample consists of a fixed number of observations, n.

2) Each observation is classified into one of two mutually exclusive and collectively exhaustive categories.

3) The probability of an observation being classified as the event of interest or a success, denoted by p, is constant (does not change) from observation to observation. q = (1 - p) is the probability of an observation being classified as NOT being the event of interest (or a failure) and does not change from observation to observation.

4) The observations are independent.

Recall before we saw that the probability of the intersection of independent events is equal to the product of the probabilities of each event.

Let’s move to an example to put this information into context.


Say I wear a hat and every now and then I ask a person, “do you like my hat?” Say in recent history that 10 percent of the people said yes! So, thinking the future will be like the past (we will go with this idea here), the likelihood any person would say yes (a success) is 0.10.

So, a person saying yes is the event of interest and the probability is .1 that this will happen with any one person.

If a sample of 4 people is asked about my hat, then the binomial variable, the number of folks saying yes, can take on the values 0, 1, 2, 3, and 4.

There are 16 possible ways (2^4) the 4 answers can come in. Let’s see these on the next screen.


Let’s call s the event of interest (a success) and f not an event of interest (a failure).

4 answers   number of events of interest
ssss        4
sssf        3
ssfs        3
ssff        2
sfss        3
sfsf        2
sffs        2
sfff        1
fsss        3
fssf        2
fsfs        2
fsff        1
ffss        2
ffsf        1
fffs        1
ffff        0

On the next slide I have a tree diagram to help you think about all the possible outcomes. On the far left I have the event of interest s and not event of interest f in the first round. The second round shows a new s and f for each from the previous round. Then you just follow each branch to get the 16 different orders shown here.


[Tree diagram: each round branches into S and F, and following the branches through 4 rounds gives the 16 possible orders.]


The random variable again is x = the number of people saying yes. So we can have 0, 1, 2, 3, or 4. You can see that 1 of the 16 possible outcomes has exactly 4 saying yes. Does this mean the probability of exactly 4 saying yes is 1/16 or .0625? Maybe not. Here is why. We have to figure in the probability that a given person will say yes.

All 4 people saying yes is really the intersection of the first saying yes and the second saying yes and the third saying yes and the fourth saying yes. The probability of the intersection of independent events is just the product of the probabilities of each.

P(4) = .1(.1)(.1)(.1) = .0001 (By the way, if we flip a coin 4 times and the prob of a head on any 1 flip is .5, then the prob of 4 heads on 4 flips is .5(.5)(.5)(.5) = .0625 – BUT, here a person saying yes only has prob = .1).


Now, the probability that only three say yes is tricky. I mean tricky only until you see what is next.

You can see from the list that 4 of the 16 outcomes have exactly 3 saying yes. Each one of the 4 has probability .1(.1)(.1)(.9) = .0009. So with 4 possible ways of getting this result we have P(3) = .0036.

6 of the 16 outcomes have exactly 2 saying yes. Each has probability .1(.1)(.9)(.9) = .0081. So with this occurring 6 times P(2) = .0486.

4 of the 16 outcomes have exactly 1 saying yes. Each has probability .1(.9)(.9)(.9) = .0729. So with this occurring 4 times P(1) = .2916.

1 of the 16 outcomes has none saying yes. P(0) = .9(.9)(.9)(.9) = .6561.
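The reasoning above (multiply the probabilities along each outcome, then add up the outcomes with the same number of yes answers) can be sketched in Python. The names p_yes and prob_by_count are mine, and independence between people is assumed, as in the text.

```python
from itertools import product

p_yes = 0.1  # probability any one person says yes (a success, s)

# Total probability for each possible number of yes answers, 0 through 4.
prob_by_count = {k: 0.0 for k in range(5)}

for outcome in product("sf", repeat=4):  # the 16 possible outcomes
    prob = 1.0
    for answer in outcome:
        # Independent events: multiply the probability of each answer.
        prob *= p_yes if answer == "s" else 1 - p_yes
    prob_by_count[outcome.count("s")] += prob

for k in range(5):
    print(f"P({k}) = {prob_by_count[k]:.4f}")
```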


Remember we have n = 4 people here and X = the number of people saying yes to liking my hat. X could be 0, 1, 2, 3, or 4.

In general, the probability, written P(k), of a given k is found by the formula

P(k) = [n! / (k!(n-k)!)] p^k (1-p)^(n-k)

(remember something raised to the 0 power = 1)

Now in our example n = 4, p = .1 and 1-p = .9

When k = 0, the probability P(0) = [4! / (0!4!)] (.1)^0 (.9)^(4-0) = .6561

Note when we found P(0) we had 0! and this equals 1. Plus we had something raised to the 0 power. This always equals 1.
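Here is the general formula written as a small Python function, as a sketch for checking hand calculations. The name binomial_pmf is my own; comb comes from Python's standard math module and gives n!/(k!(n-k)!).

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(k): probability of exactly k events of interest in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)


n, p = 4, 0.1  # the hat example
for k in range(n + 1):
    print(f"P({k}) = {binomial_pmf(k, n, p):.4f}")
```

Note that Python also treats anything raised to the 0 power as 1, so the k = 0 case needs no special handling.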


As you can tell these calculations are quite tedious. The good news is our book has a table that can give us the probabilities we so desire. Table C in the book has some binomial tables. Down the left side of the table you see values of n from 2 through 20. Also on the left you see k, the number of events of interest. For our example we had n = 4, or 4 people asked about my hat, and k represents the number saying yes. So the probability that 0 of the 4 say yes is in the k = 0 row of the n = 4 section. Since the probability of our event of interest on one trial is .1, we have to look in the p = .1 column.

On the next screen I show you the binomial probability distribution with n = 4 and p = .1. I also add the cumulative distribution.

On the next screen I have column P(X) and in the table in the book k represents the values of X.


X   P(X)    cumulative prob
0   .6561   .6561
1   .2916   .9477
2   .0486   .9963
3   .0036   .9999
4   .0001   1.0000

Now, let’s ask a few more questions.

What is the probability 1 or fewer say yes to liking my hat? The cumulative prob tells us the answer is .9477.

What is the probability that more than 2 will say yes? More than 2 is the complement of 2 or fewer, so P(more than 2) = 1 – P(X≤2) = 1 - .9963 = .0037.

The cumulative prob column tells us the probability that X equals the value in a given row or any smaller value. In other words, each entry is P(X ≤ the value under the X column).

P(X≤2), for example, is the probability of 2 or fewer saying yes and equals .9963.
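The cumulative column is just a running total of the P(X) column. A short Python sketch of that idea, for the hat example with n = 4 and p = .1, follows; the variable names are mine.

```python
from itertools import accumulate
from math import comb

n, p = 4, 0.1

# P(X) for X = 0, 1, ..., n
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

# Running total of P(X): cdf[x] is P(X <= x)
cdf = list(accumulate(pmf))

p_at_most_1 = cdf[1]        # P(X <= 1)
p_more_than_2 = 1 - cdf[2]  # complement of P(X <= 2)

for x in range(n + 1):
    print(f"{x}  {pmf[x]:.4f}  {cdf[x]:.4f}")
print(f"P(1 or fewer) = {p_at_most_1:.4f}")
print(f"P(more than 2) = {p_more_than_2:.4f}")
```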


Microsoft Excel and the Binomial PD

On the next slide there is a spreadsheet in Excel. I use a different generic example for you to see how this is similar to the table E.6. Note cell C1 has the value of n = 3 and cell C2 has the value of p = .3. Cells A4:A7 have the values of x.

Cells B4:B7 have Excel formulas typed in. Put the mouse in cell B4 and type “=BINOMDIST(A4, $C$1, $C$2, FALSE)”.

The A4 will mean 0. The $C$1 will mean 3. The $C$2 will mean .3 and the FALSE means we want the f(x). When you type this in hit the enter key. To get the rest of the f(x) values put the mouse back into cell B4 and click once. Then move the mouse to the bottom right corner of the cell, click and drag down to the last cell. In the BINOMDIST function A4 changes to A5 and so on as you drag down. Excel wants to change cell values when you drag functions. The $ signs in $C$1 and $C$2 mean that when you drag, the formula will not leave those cells. If you want a cum prob put TRUE, not FALSE.
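If you do not have Excel handy, the same numbers can be produced with a short Python sketch. The function binomdist below is a hypothetical stand-in that only mirrors the argument order used in the spreadsheet (x, n, p, cumulative flag); it is not Excel's own implementation.

```python
from math import comb


def binomdist(x, n, p, cumulative):
    """Stand-in for the spreadsheet call: f(x) if cumulative is False, else P(X <= x)."""
    def f(k):
        return comb(n, k) * p**k * (1 - p)**(n - k)

    if cumulative:
        return sum(f(k) for k in range(x + 1))
    return f(x)


n, p = 3, 0.3  # the values held in cells C1 and C2
for x in range(n + 1):  # the x values held in cells A4:A7
    print(x, round(binomdist(x, n, p, False), 4), round(binomdist(x, n, p, True), 4))
```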


The expected value for the binomial PD is E(x) = np (a simplification for the binomial case from what we saw previously for a discrete random variable), and the variance is Var(x) = σ^2 = np(1-p) (also a simplification).

The standard deviation is just the square root of the variance.

Consider a binomial experiment with n = 10 and p = 0.1. You can double click inside the spreadsheet on the next screen and copy the Excel file if you want.

a. P(0) is found in the f(x) column as .34867

b. P(2) = .1937

c. P(x≤2) is found in the Cum Prob column as .9298

d. P(x≥1) = 1 – P(x≤0) = 1 - .3487 = .6513
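As a check on the shortcut formulas and on answers a through d, here is a Python sketch for n = 10 and p = 0.1. It computes the mean and variance the long way (from the general discrete random variable definitions) and compares them with np and np(1-p). The variable names are mine.

```python
from math import comb, sqrt

n, p = 10, 0.1

# f[x] is the probability of exactly x events of interest.
f = [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]

# The long way: general definitions for a discrete random variable.
mean_from_definition = sum(x * fx for x, fx in enumerate(f))
var_from_definition = sum((x - mean_from_definition) ** 2 * fx for x, fx in enumerate(f))

# The binomial shortcuts.
mean_shortcut = n * p
var_shortcut = n * p * (1 - p)
st_dev = sqrt(var_shortcut)

p_0 = f[0]               # a. P(0)
p_2 = f[2]               # b. P(2)
p_at_most_2 = sum(f[:3])  # c. P(x <= 2)
p_at_least_1 = 1 - f[0]  # d. P(x >= 1) = 1 - P(x <= 0)

print(mean_from_definition, mean_shortcut)
print(var_from_definition, var_shortcut, st_dev)
print(round(p_0, 4), round(p_2, 4), round(p_at_most_2, 4), round(p_at_least_1, 4))
```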


Number of Trials (n)          10
Probability of Success (p)    0.1

x    f(x)          Cum Prob
0    0.34867844    0.34867844
1    0.387420489   0.736098929
2    0.193710245   0.929809174
3    0.057395628   0.987204802
4    0.011160261   0.998365063
5    0.001488035   0.999853097
6    0.000137781   0.999990878
7    8.748E-06     0.999999626
8    3.645E-07     0.999999991
9    9E-09         1
10   1E-10         1

E(x) = 1
Var(x) = 0.9
St dev. = 0.948683

Note the E notation here. 9E-09 means we have the number 9 but have to move the decimal 9 places to the left because we have E-09. The number is .000000009. An E+ would require a movement of the decimal to the right.
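Python reads the same E notation, so a quick sketch like the one below can expand values from the spreadsheet into ordinary decimals. The helper name expand is mine, and 2.5E+03 is a made-up value included only to show an E+ case.

```python
def expand(text, places=10):
    """Convert a number written in E notation to a plain decimal string."""
    return f"{float(text):.{places}f}"


# The first three values appear in the spreadsheet; the last is hypothetical.
for text in ["9E-09", "8.748E-06", "1E-10", "2.5E+03"]:
    print(text, "=", expand(text))
```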


Note on the previous slide I have an Excel spreadsheet. At the top I typed the labels and numbers

Number of Trials (n)          10
Probability of Success (p)    0.1

in separate cells. The numbers are used in the formulas. You should do this as well when you do a problem because it “dresses up” the output and makes it easier to remember what the heck is going on.

Also note that in my notes when you see a table you can double click on it and see the Excel spreadsheet.


Example flipping two coins


If you flip two coins (or one coin twice) the possible outcomes are HH, HT, TH, TT. So, n = 2. Let’s say the event of interest is heads H. We could have X = 0, 1, or 2. Also say p = .5. From Table C we see

X   P(X)
0   .25
1   .5
2   .25

What is the probability of at least 1 head on the two flips? This would be P(1) + P(2) = .5 + .25 = .75


Special scenario


The binomial table c in the book only has probabilities up to 0.50. But, in the real world probabilities can be up to 1.00.

Say you have a special coin that when flipped has a 0.60 probability of being heads. Let’s flip 3 times. Thus there can be 0, 1, 2, or 3 heads in the 3 flips. Another way to state this is there can be 3, 2, 1, or 0 tails in the 3 flips. The probability of a tail on any flip is .4 in this case.

So, on the next slide I show you how to work with probabilities greater than .5 in the binomial situation.


X = the number of heads and the probability of a head on any 1 flip is 0.6.
Y = the number of tails and the probability of a tail on any 1 flip is 0.4.

X   P(X)    Y   P(Y)
0   .064    3   .064
1   .288    2   .288
2   .432    1   .432
3   .216    0   .216

So, if the probability of a head on any one flip is 0.6, then look up the tails in the table using the 0.4 probability and note that 0 heads is like 3 tails (when you do only 3 flips), 1 head is like 2 tails, and so on.

As a last point, I use the coin flip example a lot, but any time we look at many occurrences where each occurrence has only 2 outcomes (and the properties listed earlier hold), we have the binomial dist.
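The heads and tails bookkeeping above can be checked with a short Python sketch: the probability of k heads with p = 0.6 should match the probability of n - k tails with p = 0.4. The function name binomial_pmf is my own.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k events of interest in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)


n = 3
heads = [binomial_pmf(k, n, 0.6) for k in range(n + 1)]      # X = number of heads
tails = [binomial_pmf(n - k, n, 0.4) for k in range(n + 1)]  # Y = n - X tails

for k in range(n + 1):
    print(f"X = {k}: {heads[k]:.3f}    Y = {n - k}: {tails[k]:.3f}")
```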


The Normal Distribution Approximation to the Binomial Distribution

In our book we have binomial tables that go up to n = 20. But often we have more than n = 20 as our interest. Some folks figured out that in these cases we can use the normal distribution to give us a roughly equal answer.

Remember the normal distribution is characterized by a mean and a standard deviation. We use the Z calculation of (value – mean) / standard deviation, round the result to 2 decimal places, and then go to the z table to get probabilities.

In the binomial situation remember we have n trials and the probability of a success on any 1 trial is p. In the normal approximation the mean = n times p and the standard deviation is the square root of the product of the 3 terms n, p, and (1 – p).


This normal approximation to the binomial can be used when 2 conditions hold: n times p ≥ 10 and n times (1 – p) ≥ 10.

Example from book

5.69 Internet postings. Suppose (as is roughly true) that 20% of all Internet users have posted photos online. A sample survey interviews an SRS of 1555 Internet users.

(a) What is the actual distribution of the number X in the sample who have posted photos online?
The distribution is actually binomial with n = 1555 and p = .2.

(b) What is the probability that 300 or fewer of the people in the sample have posted photos online? (Use software or a suitable approximation.)
To use the normal approximation we have a mean value = 1555 times .2 = 311 and a standard deviation equal to the square root of (1555 times .2 times .8) = 15.77. See the next slide for more.


To get the probability of 300 or fewer we find Z = (300 – 311)/15.77 = -0.70. The area to the left of this Z in the table is what we want, and that is .2420.
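For anyone who wants to compare the approximation with the exact binomial answer, here is a Python sketch. The function names are mine. The normal area comes from erf in Python's math module rather than a printed z table, and Z is not rounded to 2 decimals, so the result will differ slightly from .2420. The exact sum is done on a log scale (using lgamma) because the factorials involved are far too large to handle directly as decimals.

```python
from math import erf, exp, lgamma, log, sqrt


def normal_cdf(z):
    """Area to the left of z under the standard normal curve."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def binomial_cdf(x, n, p):
    """Exact P(X <= x), summing each binomial term computed on a log scale."""
    total = 0.0
    for k in range(x + 1):
        # log of n!/(k!(n-k)!) plus log of p^k (1-p)^(n-k)
        log_term = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
                    + k * log(p) + (n - k) * log(1 - p))
        total += exp(log_term)
    return total


n, p, x = 1555, 0.2, 300

# The 2 conditions for using the approximation.
conditions_ok = n * p >= 10 and n * (1 - p) >= 10

mean = n * p
sd = sqrt(n * p * (1 - p))
z = (x - mean) / sd

approx = normal_cdf(z)
exact = binomial_cdf(x, n, p)

print(conditions_ok, round(mean, 2), round(sd, 2), round(z, 2))
print(f"normal approximation: {approx:.4f}   exact binomial: {exact:.4f}")
```

The two answers should be close but not identical, which is what "roughly equal" means here.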

So, when n is large (more than 20 for us) and the 2 conditions noted are met, we can use the normal approximation to the binomial distribution.