The Binomial Distribution - PCCspot.pcc.edu/~hmesa/math243/lecture/The_Binomial... · Web viewNow for the big question. When is it all right to use a standard normal distribution

The Binomial Distribution

We have seen examples of the binomial distribution in chapter 4. Recall the example involving Shaq's free-throw ability. If we ask what the chance is that Shaq will make 3 out of 5 free throws, the situation can be modeled using a binomial distribution, provided the following assumptions hold.

What are the assumptions?

1. The probability of making a free throw is fixed, and we will denote it by p.
2. The events consisting of making free throws are independent of each other.
3. There is a fixed number of attempts, called trials.
4. Each trial can result in only two outcomes, which we will call success and failure.

For our example we have

1. Shaq has a 40% chance of making a free throw (p = 0.4).
2. Whether Shaq makes or misses the first free throw, the probability of making the next free throw does not change; it continues to be 40%.
3. Shaq will attempt 5 free throws.
4. Shaq can either make or miss each free throw (two possible outcomes).

What we have described is a binomial situation. The random variable X, which in this special situation we will call the binomial random variable, counts the number of successes in n trials. The probability distribution table for this random variable is what we will call the binomial distribution.

Again consider the Shaq example. If we ask Shaq to attempt 5 free throws, the number of trials is 5 (n = 5). So the binomial random variable will count the number of baskets he makes out of the five. We could just as easily have defined the binomial random variable to count the number of misses out of the five attempts.

To create the probability distribution table we need to know the chance that he makes 0 free throws, 1 free throw, 2 free throws, and so on.

The event X = 0 consists of missing 5 shots in a row. Since the events are independent, we just need to multiply P(missing a free throw) = 0.6 by itself five times. P(X = 0) = $(0.6)^5$ = 0.07776.

What is the probability that he makes one free throw out of the five? Let M denote a miss, and let B denote a basket made.

One possible situation is BMMMM, which has a probability of

P(B and M and M and M and M) = (.4)(.6)(.6)(.6)(.6) = 0.05184.

But another way he can make one free throw is MBMMM, which, as you will note, has the same probability as the previous arrangement. P(M and B and M and M and M) = 0.05184.

The question, as you can see, is how many ways can Shaq just make one freethrow? Once I find this out I can multiply it by 0.05184, since this is the probability for every situation in which he can make one freethrow.

Again, by rearranging the letters I can see that all possible ways of making one freethrow are

BMMMM, MBMMM, MMBMM, MMMBM, MMMMB

So far we have

X       0         1            2    3    4    5
P(X)    0.07776   5(0.05184)

How about making 2 free throws?

BBMMM, BMBMM, BMMBM, BMMMB, MBBMM, MBMBM, MBMMB, MMBBM, MMBMB, MMMBB, for a total of 10 possible combinations.

Phew! I hope that is all. You can see that the tough part is finding out how many ways I can have k successes out of n trials. Once I know this, all I have to do to find the probability is calculate the probability of one of the arrangements and multiply it by that count.

P(BBMMM) = $(0.4)^2(0.6)^3$ = 0.03456

So P(X = 2) = $10(0.4)^2(0.6)^3$ = 0.3456.
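Listing arrangements by hand is error-prone, since it is easy to skip one or write one twice. As a check, here is a minimal Python sketch that enumerates them mechanically. The function name `arrangements` is only an illustrative choice; B and M are the basket and miss labels used above.

```python
from itertools import product

def arrangements(n, k):
    """Return every length-n string of B (basket) and M (miss) with exactly k B's."""
    return ["".join(seq) for seq in product("BM", repeat=n) if seq.count("B") == k]

two_baskets = arrangements(5, 2)
print(len(two_baskets), two_baskets)

# Every arrangement has the same probability, so P(X = 2) is
# (number of arrangements) * (probability of one arrangement).
p = 0.4
prob_two = len(two_baskets) * p**2 * (1 - p)**3
print(prob_two)
```

The same function with k = 1 reproduces the five single-basket arrangements listed earlier.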

Can I make this process easier? The answer is yes. There is a formula, called the binomial formula, that calculates the probability of getting X = k successes in n trials.

$P(X = k) = \frac{n!}{k!\,(n-k)!}\, p^k (1-p)^{n-k}$, where p is the probability of success.

The notation n! means the following: if I replace n with 3 it means, 3! = 3(2)(1) = 6. If I replace n with 5 it means, 5! = 5(4)(3)(2)(1) = 120.

The notation n! is read "n factorial."

Thus in general n! = n(n - 1)(n - 2) … (2)(1).

You must also know that by definition 0! = 1 and 1! = 1 as well.

How does this formula help? Let's break down all the factors.

The factor $p^k$ gives the probability of k successes, while $(1-p)^{n-k}$ gives the probability of n - k failures. We can now see that the remaining factor, $\frac{n!}{k!\,(n-k)!}$, counts the number of ways we can get k successes and n - k failures.

If we again look at the last probability we calculated, P(BBMMM) = $(0.4)^2(0.6)^3$, we see that the missing factor is $\frac{5!}{2!\,3!} = \frac{120}{(2)(6)} = 10$.
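To see the counting factor at work, here is a small Python sketch that builds it directly from factorials. The helper name `ways` is an illustrative choice; Python's standard `math.comb` (Python 3.8 and later) computes the same count and is used here only as a cross-check.

```python
from math import comb, factorial

def ways(n, k):
    """Number of ways to place k successes among n trials: n! / (k! (n-k)!)."""
    return factorial(n) // (factorial(k) * factorial(n - k))

print(ways(5, 2))                      # arrangements with 2 baskets in 5 attempts
print(ways(5, 2) == comb(5, 2))        # math.comb gives the same count
print(ways(5, 2) * 0.4**2 * 0.6**3)    # P(X = 2) for p = 0.4
```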

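Let's continue filling in the missing probabilities.

P(X = 3) = $\frac{5!}{3!\,2!}(0.4)^3(0.6)^2 = \frac{120}{(6)(2)}(0.4)^3(0.6)^2 = 10(0.4)^3(0.6)^2 = 0.2304$

P(X = 4) = $\frac{5!}{4!\,1!}(0.4)^4(0.6)^1 = 5(0.4)^4(0.6)^1 = 0.0768$

P(X = 5) = $\frac{5!}{5!\,0!}(0.4)^5(0.6)^0 = (0.4)^5 = 0.01024$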

We can now fill in the rest of our table without too much effort.

X       0         1        2        3        4        5
P(X)    0.07776   0.2592   0.3456   0.2304   0.0768   0.01024
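As a check on the table, the following Python sketch evaluates the binomial formula for every k from 0 to 5 with n = 5 and p = 0.4. The function name `binomial_pmf` is an illustrative choice, not something from the text. A useful sanity test for any probability distribution table is that the probabilities add up to 1.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for n independent trials, each with success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 5, 0.4
table = {k: binomial_pmf(k, n, p) for k in range(n + 1)}

for k, prob in table.items():
    print(f"P(X = {k}) = {prob:.5f}")
print("total:", sum(table.values()))
```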

Approximation to the Binomial Distribution

Suppose that instead of attempting 5 free throws, Shaq had attempted 100 free throws, as was mentioned in the book. Further, suppose the question is, what is the probability that out of 100 free throws, Shaq makes more than 70 free throws?

If you analyze this question, you begin to realize that one way to answer it is to use the formula $P(X = k) = \frac{n!}{k!\,(n-k)!}\, p^k (1-p)^{n-k}$. However, the statement "Shaq makes more than 70 free throws" means that you need to calculate P(X = 71), P(X = 72), P(X = 73), P(X = 74), and so on, until you get to P(X = 100), and then add the results. Not a pleasant thought, even with the formula.

Another way out is to use a computer to do the repeated calculations. This is feasible until you begin to ask questions involving many computations, like what is the chance of making more than 700 free throws out of 10000 attempts. Not a very good or realistic question, but you understand my meaning here. A computer would be able to compute this, but it would take a while. For example, I asked a similar question in another statistics class, and someone attempted to answer it by having their TI-85 go through the iterations. An hour later the calculator was still computing away.
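For the 100-attempt question, the repeated calculation can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and p = 0.4 is an assumption carried over from the 5-attempt example.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for n trials with success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def prob_more_than(a, n, p):
    """P(X > a): add up P(X = k) for k = a + 1, ..., n."""
    return sum(binomial_pmf(k, n, p) for k in range(a + 1, n + 1))

# Assumption: p = 0.4, the same success probability as in the 5-attempt example.
tail = prob_more_than(70, 100, 0.4)
print(tail)
```

This direct form is meant for moderate n such as 100. For very large n the powers and counts involved can underflow or overflow ordinary floating-point numbers, which is one more reason an approximation is attractive.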

Let us look again at the binomial distribution to understand its basic characteristics, which are crucial to this approximation concept.

1. Clearly a binomial situation can only result in two outcomes, success or failure. The outcomes are not quantitative, but rather categorical. You can only succeed or fail, and those are your only possibilities.

2. The binomial random variable X (notice that I am changing the focus) will be defined as the number of successes out of n attempts (trials). This means that the binomial random variable is discrete, not continuous, since X can only equal 0, or 1, or 2, or 3, … or n. By contrast, if X is a continuous variable with, as an example, 0 < X < 10, then X can equal 5.559459320012, or 7.000000000, or 3.1. That is, X can equal any value between 0 and 10.

3. The two parameters that characterize a binomial distribution are, the probability of success, p, (because once we know this we know the probability of failure, 1 - p, the only other outcome), and the number of trials, n, since this process requires a limit to the iteration.

If I were to graph the binomial distribution for p = 0.4 and n = 10, we would get the following graph, which I computed using $P(X = k) = \frac{n!}{k!\,(n-k)!}\, p^k (1-p)^{n-k}$ eleven times, once for each outcome from 0 to 10.

X       0      1      2      3      4      5      6      7      8      9      10
P(X)    .006   .040   .121   .215   .251   .201   .111   .042   .011   .002   .000

The probabilities in the table above have been rounded to the thousandths position.

The graph is shown below. Notice that, unlike the representation in your text, the very narrow lines attempt to depict the fact that our distribution is discrete, not continuous.

If I had used the rectangles your book uses, the graph would seem to indicate that we have a continuous distribution, which we do not. For this distribution P(X = 3.1) = 0 while P(X = 3) = .215. Furthermore, P(2.1 < X < 3.99) = .215, since 3 is the only possible value in that range.

You can see from the graph that the shape of this distribution seems to somewhat look like the normal distribution. You could say that this distribution is like the skeleton and once we fill it in we will get a normal looking curve.

If you look at the actual values, one characteristic that will be apparent is that this distribution is not symmetrical!

Notice that for X = 9 and X = 10 the probability values are very close to zero. As you choose larger n values you will notice that most of the values at one endpoint or both contribute an almost negligible amount to the overall probability. The mean of this distribution is calculated by $\mu = np$ and its standard deviation is $\sigma = \sqrt{np(1-p)}$.
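The formulas for the mean, np, and the standard deviation, the square root of np(1 - p), can be checked against a direct calculation from a distribution table. The Python sketch below does this for the Shaq example (n = 5, p = 0.4); the variable names are illustrative only.

```python
from math import comb, sqrt

def binomial_pmf(k, n, p):
    """P(X = k) for n trials with success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 5, 0.4
probs = [binomial_pmf(k, n, p) for k in range(n + 1)]

# Mean and standard deviation computed directly from the table of probabilities.
mean_from_table = sum(k * pk for k, pk in enumerate(probs))
var_from_table = sum((k - mean_from_table) ** 2 * pk for k, pk in enumerate(probs))
sd_from_table = sqrt(var_from_table)

# The shortcut formulas for a binomial distribution.
mean_formula = n * p
sd_formula = sqrt(n * p * (1 - p))

print(mean_from_table, mean_formula)
print(sd_from_table, sd_formula)
```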

Now for the big question. When is it all right to use a normal distribution to approximate (I cannot emphasize the word approximate enough) the binomial distribution?

The general rule of thumb is that if np > 10 and n(1 - p) > 10 we should get good results. If we do not meet these criteria, the results will not come that close to the actual values you would get if you used the formula $P(X = k) = \frac{n!}{k!\,(n-k)!}\, p^k (1-p)^{n-k}$.

Even if the criteria are met, we are still only approximating. How good is this approximation? If our products are very close to 10, expect results that are satisfactory, probably good to the hundredths place and that is about it. The further above 10 the products are, the better the approximation is.

So for a very large number of trials, like n = 1000, the approximations are very good. Also, if p, the probability of success, is near 0.5, the halfway mark, then even when np and n(1 - p) only barely exceed 10 we would still get much better results than with a p value that is further away from 0.5.
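These claims can be explored numerically. The Python sketch below is an illustration, not a definitive measure of accuracy. It checks the rule of thumb and reports the largest gap between the exact cumulative probability P(X ≤ k) and its normal approximation. The function names and the three test cases are assumptions made for this example. The normal value is evaluated at k + 0.5, which is the continuity correction described in the steps that follow.

```python
from math import comb, sqrt
from statistics import NormalDist

def meets_rule_of_thumb(n, p):
    """True when both np > 10 and n(1 - p) > 10."""
    return n * p > 10 and n * (1 - p) > 10

def max_cdf_error(n, p):
    """Largest gap between the exact P(X <= k) and its normal approximation.

    The normal value is taken at k + 0.5 (a continuity correction).
    """
    normal = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
    exact_cdf, worst = 0.0, 0.0
    for k in range(n + 1):
        exact_cdf += comb(n, k) * p**k * (1 - p)**(n - k)
        worst = max(worst, abs(exact_cdf - normal.cdf(k + 0.5)))
    return worst

# Two cases with np = 12.5 (p near and far from 0.5), then a much larger n.
for n, p in [(25, 0.5), (125, 0.1), (1000, 0.1)]:
    print(n, p, meets_rule_of_thumb(n, p), round(max_cdf_error(n, p), 4))
```

The first two cases both have np = 12.5, so any difference in their errors comes from how far p is from 0.5 rather than from the size of the product.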

How to Approximate Using a Normal Distribution.

1. Make sure that the criteria np > 10 and n(1 - p) > 10 are met.

2. To use the normal approximation you need to know $\mu$ and $\sigma$, so we can use the formula $z = \frac{x - \mu}{\sigma}$. We will use $\mu = np$ and $\sigma = \sqrt{np(1-p)}$.

3. If we want to know P(X = k), or P(a ≤ X ≤ b), or P(X > a), or any other combination of possibilities where X is a binomial random variable with the mean and standard deviation mentioned in step 2, we may need to use the concept of continuity correction, which in the long run gives much better results.

4. What is continuity correction? If we ask for P(X > 4) where n is 10, which means that the largest value of X is 10, notice that we are excluding the number 4 from our count. What we want is P(X > 4) = P(X = 5) + P(X = 6) + P(X = 7) + … + P(X = 10). The problem is that in a continuous distribution a single value has probability zero of occurring; only ranges have positive probability. P(X = 4) = 0 in a continuous setting, while P(3.95 < X < 4.05) will equal some positive number. Thus, we need to take this into account.
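To make the idea in step 4 concrete, here is how the boundary shifts by 0.5 for a few kinds of questions about the value 4. The symbol Y is introduced only for this illustration; it stands for the normal variable with the mean and standard deviation from step 2.

```latex
\begin{align*}
P(X > 4) = P(X \ge 5) &\approx P(Y > 4.5) \\
P(X \ge 4)            &\approx P(Y > 3.5) \\
P(X \le 4)            &\approx P(Y < 4.5) \\
P(X = 4)              &\approx P(3.5 < Y < 4.5)
\end{align*}
```

In each case the endpoint moves halfway toward the neighboring integer, so that every integer included in the binomial question keeps its whole interval and every excluded integer loses its whole interval.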

For the same n value, the approximation is better when p is close to 0.5 and worse when p is closer to 0 or 1.

An Example: A person is taking an exam with 100 multiple choice questions. Each question has 4 possible responses. If the person is merely guessing, this person has a 1/4 chance of correctly answering each question. What is the probability of answering 35 or more questions correctly on the exam if you guess at every question?

Answer: You will note that this is a perfect example of a binomial situation. You have a fixed number of trials, n = 100. You only have two outcomes: you get the answer right or you do not. You have a 25% chance of getting a correct answer, and that does not change from question to question, so the events are independent. Let X equal the number of correct answers out of the 100. This means that X can take any integer value between 0 and 100.

The question posed is, "What does P(X ≥ 35) equal?" I could use the binomial formula 35 times, once for each value from 0 to 34, and then use the complement to answer the question.

Using a computer I arrive at P(X ≥ 35) ≈ 0.016427. I used 1-Binomdist(34,100,0.25,true). That is fewer than two chances in 100 tries on average.
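For readers without a spreadsheet, the same complement calculation can be sketched in Python. The function name `binomial_cdf` is an illustrative choice; it adds up P(X = j) for j = 0 through k, which is the cumulative value the spreadsheet call returns when its last argument is true.

```python
from math import comb

def binomial_cdf(k, n, p):
    """P(X <= k): the sum of P(X = j) for j = 0, 1, ..., k."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

# P(X >= 35) is the complement of P(X <= 34).
prob_35_or_more = 1 - binomial_cdf(34, 100, 0.25)
print(prob_35_or_more)
```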

How could I use an approximation? First, we need to ask, “Do we meet the criteria?”

100(.25) = 25 > 10 is true and so is 100(1 - .25) = 75 > 10.

Secondly, we find $\mu$ and $\sigma$: $\mu = 100(0.25) = 25$, $\sigma = \sqrt{100(0.25)(0.75)} = 4.330127$.

The height of each stick depicts the probability for a particular value of the binomial random variable, while the curve shows a portion of the normal density curve superimposed on the binomial graph. The yellow region represents the area to the right of 35.

If we just use $z = \frac{x - \mu}{\sigma}$ with x = 35 to find the number of standard deviations, we get

Normal Approximation without continuity correction

$P(X \ge 35) \approx P\left(Z > \frac{35 - 25}{4.330127}\right) = P(Z > 2.31) = 0.0104$. You can see this is close to the actual value of 0.016427.

The problem is that in the continuous scenario the single number 35 carries no probability, so we have underestimated the result. To correct this we will create an interval that encloses the number 35. If we chose x = 34, the next lower value of the random variable, that would be too much in the long run. So instead, let us choose a number between 34 and 35, like 34.5.

When we use 34.5 instead of 35 we have the following:

Normal Approximation with continuity correction

$P(X \ge 35) \approx P\left(Z > \frac{34.5 - 25}{4.330127}\right) = P(Z > 2.19) = 0.0143$, which is a better approximation.

Thus continuity correction involves adding 0.5 to, or subtracting 0.5 from, the number or numbers that constitute the endpoints of the range in order to get a better approximation.
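The two approximations in this example can be reproduced with Python's standard `statistics.NormalDist` class (Python 3.8 and later). This is a sketch with illustrative variable names. Because it does not round z to two decimal places the way a table lookup does, its results may differ slightly in the last digit from the values above.

```python
from math import sqrt
from statistics import NormalDist

n, p = 100, 0.25
mu = n * p                       # mean of the binomial distribution
sigma = sqrt(n * p * (1 - p))    # standard deviation of the binomial distribution
normal = NormalDist(mu, sigma)

def upper_tail(x):
    """Area under the normal curve to the right of x."""
    return 1 - normal.cdf(x)

without_correction = upper_tail(35)     # uses x = 35 directly
with_correction = upper_tail(34.5)      # continuity correction: 35 becomes 34.5

print(f"sigma              = {sigma:.6f}")
print(f"without correction = {without_correction:.4f}")
print(f"with correction    = {with_correction:.4f}")
```

Lowering the endpoint from 35 to 34.5 enlarges the shaded area, which moves the estimate toward the exact binomial value of 0.016427.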
