Notes Stats


  • 7/28/2019 Notes Stats

    1/38

    Definitions

    Probability Experiment

    A process which leads to well-defined results called outcomes

    Outcome

    The result of a single trial of a probability experiment

    Sample Space

    Set of all possible outcomes of a probability experiment

    Event

    One or more outcomes of a probability experiment

    Classical Probability

    Uses the sample space to determine the numerical probability that an event will

    happen. Also called theoretical probability.

    Equally Likely Events

    Events which have the same probability of occurring.

    Complement of an Event

    All the events in the sample space except the given events.

    Empirical Probability

    Uses a frequency distribution to determine the numerical probability. An

    empirical probability is a relative frequency.

    Subjective Probability

    Uses probability values based on an educated guess or estimate. It employs

    opinions and inexact information.

    Mutually Exclusive Events

    Two events which cannot happen at the same time.

    Disjoint Events

    Another name for mutually exclusive events.

    Independent Events

    Two events are independent if the occurrence of one does not affect the

    probability of the other occurring.

    Dependent Events

    Two events are dependent if the first event affects the outcome or occurrence of

    the second event in a way the probability is changed.

    Conditional Probability

    The probability of an event occurring given that another event has already occurred.

    Bayes' Theorem

    A formula which allows one to find the probability that an event occurred as

    the result of a particular previous event.

    Factorial


    A positive integer factorial is the product of each natural number up to and

    including the integer.

    Permutation

    An arrangement of objects in a specific order.

    Combination

    A selection of objects without regard to order.

    Tree Diagram

    A graphical device used to list all possibilities of a sequence of events in a

    systematic way.

    Introduction to Probability

    Sample Spaces

    A sample space is the set of all possible outcomes. However, some sample spaces are

    better than others.

    Consider the experiment of flipping two coins. It is possible to get 0 heads, 1 head, or

    2 heads. Thus, the sample space could be { 0, 1, 2 }. Another way to look at it is to

    list each ordered flip: { HH, HT, TH, TT }. The second way is better because each

    outcome is equally likely to occur.

    When writing the sample space, it is highly desirable to have events which are equally

    likely.

    Another example is rolling two dice. The sums are { 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 }.

    However, each of these isn't equally likely. The only way to get a sum of 2 is to roll a 1

    on both dice, but you can get a sum of 4 by rolling a 1-3, 2-2, or 3-1. The following

    table illustrates a better sample space for the sum obtained when rolling two dice.

    Sum of the two dice (first die down the side, second die across the top)

    First Die \ Second Die   1   2   3   4   5   6

    1                        2   3   4   5   6   7

    2                        3   4   5   6   7   8

    3                        4   5   6   7   8   9

    4                        5   6   7   8   9   10

    5                        6   7   8   9   10  11

    6                        7   8   9   10  11  12

    Classical Probability

    The above table lends itself to describing data another way -- using a probability

    distribution. Let's consider the frequency distribution for the above sums.

    Sum  Frequency  Relative Frequency

    2 1 1/36

    3 2 2/36

    4 3 3/36

    5 4 4/36

    6 5 5/36

    7 6 6/36

    8 5 5/36

    9 4 4/36

    10 3 3/36


    11 2 2/36

    12 1 1/36

    If just the first and last columns were written, we would have a probability

    distribution. The relative frequency of a frequency distribution is the probability of the

    event occurring. This is only true, however, if the events are equally likely.

    This gives us the formula for classical probability. The probability of an event

    occurring is the number in the event divided by the number in the sample space.

    Again, this is only true when the events are equally likely. A classical probability is

    the relative frequency of each event in the sample space when each event is equally

    likely.

    P(E) = n(E) / n(S)
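The notes lean on the TI-82 for computation, but the classical probability formula is easy to check by brute force. Here is a short Python sketch (not part of the original notes; the helper name `p` is mine) that enumerates the 36 equally likely dice outcomes:

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely ordered pairs (first die, second die)
space = list(product(range(1, 7), repeat=2))

# Classical probability: P(E) = n(E) / n(S)
def p(event):
    favorable = [outcome for outcome in space if event(outcome)]
    return Fraction(len(favorable), len(space))

print(p(lambda roll: sum(roll) == 2))   # 1/36
print(p(lambda roll: sum(roll) == 7))   # 6/36 = 1/6
```

The printed values match the relative frequency column in the table above.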

    Empirical Probability

    Empirical probability is based on observation. The empirical probability of an event is

    the relative frequency of a frequency distribution based upon observation.

    P(E) = f / n

    Probability Rules

    There are two rules which are very important.

    All probabilities are between 0 and 1 inclusive:

    0 <= P(E) <= 1


    The probability of an event which must occur is 1.

    The probability of the sample space is 1.

    The probability of an event not occurring is one minus the probability of it

    occurring.

    P(E') = 1 - P(E)

    Probability Rules

    "OR" or Unions

    Mutually Exclusive Events

    Two events are mutually exclusive if they cannot occur at the same time. Another

    word that means mutually exclusive is disjoint.

    If two events are disjoint, then the probability of them both occurring at the same time

    is 0.

    Disjoint: P(A and B) = 0

    If two events are mutually exclusive, then the probability of either occurring is the

    sum of the probabilities of each occurring.

    Specific Addition Rule

    Only valid when the events are mutually exclusive.

    P(A or B) = P(A) + P(B)

    Example 1:

    Given: P(A) = 0.20, P(B) = 0.70, A and B are disjoint

    I like to use what's called a joint probability distribution. (Since disjoint means

    nothing in common, joint is what they have in common -- so the values that go on the

    inside portion of the table are the intersections or "and"s of each pair of events).

    "Marginal" is another word for totals -- it's called marginal because they appear in the

    margins.


    B B' Marginal

    A 0.00 0.20 0.20

    A' 0.70 0.10 0.80

    Marginal 0.70 0.30 1.00

    The values given in the problem are the 0.20 and 0.70 marginals and the 0.00 intersection. The grand total is always 1.00. The rest of

    the values are obtained by addition and subtraction.

    Non-Mutually Exclusive Events

    In events which aren't mutually exclusive, there is some overlap. When P(A) and P(B)

    are added, the probability of the intersection (and) is added twice. To compensate for

    that double addition, the intersection needs to be subtracted.

    General Addition Rule

    Always valid.

    P(A or B) = P(A) + P(B) - P(A and B)
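The general addition rule can be checked with a couple of lines of Python (a sketch using the numbers from Example 2; the variable names are mine):

```python
# General addition rule: P(A or B) = P(A) + P(B) - P(A and B)
p_a, p_b, p_a_and_b = 0.20, 0.70, 0.15
p_a_or_b = p_a + p_b - p_a_and_b
print(round(p_a_or_b, 2))  # 0.75
```

Note that the intersection is subtracted exactly once, compensating for it being counted in both P(A) and P(B).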

    Example 2:

    Given P(A) = 0.20, P(B) = 0.70, P(A and B) = 0.15

    B B' Marginal

    A 0.15 0.05 0.20

    A' 0.55 0.25 0.80

    Marginal 0.70 0.30 1.00

    Interpreting the table

    Certain things can be determined from the joint probability distribution. Mutually

    exclusive events will have a probability of zero in the intersection. All inclusive events will have a zero

    opposite the intersection. All inclusive means that there is nothing outside of those

    two events: P(A or B) = 1.

    .        B                               B'                             Marginal

    A        (A and B are Mutually          .                              .
             Exclusive if this value is 0)

    A'       .                              (A and B are All Inclusive     .
                                            if this value is 0)

    Marginal .                              .                              1.00

    "AND" or Intersections

    Independent Events

    Two events are independent if the occurrence of one does not change the probability

    of the other occurring.

    An example would be rolling a 2 on a die and flipping a head on a coin. Rolling the 2

    does not affect the probability of flipping the head.

    If events are independent, then the probability of them both occurring is the product of

    the probabilities of each occurring.

    Specific Multiplication Rule

    Only valid for independent events

    P(A and B) = P(A) * P(B)

    Example 3:

    P(A) = 0.20, P(B) = 0.70, A and B are independent.

    B B' Marginal

    A 0.14 0.06 0.20

    A' 0.56 0.24 0.80

    Marginal 0.70 0.30 1.00

    The 0.14 is because the probability of A and B is the probability of A times the

    probability of B or 0.20 * 0.70 = 0.14.

    Dependent Events


    If the occurrence of one event does affect the probability of the other occurring, then

    the events are dependent.

    Conditional Probability

    The probability of event B occurring given that event A has already occurred is read "the probability of B given A" and is written: P(B|A)

    General Multiplication Rule

    Always works.

    P(A and B) = P(A) * P(B|A)

    Example 4:

    P(A) = 0.20, P(B) = 0.70, P(B|A) = 0.40

    A good way to think of P(B|A) is that 40% of A is B. 40% of the 20% which was in

    event A is 8%, thus the intersection is 0.08.

    B B' Marginal

    A 0.08 0.12 0.20

    A' 0.62 0.18 0.80

    Marginal 0.70 0.30 1.00

    Independence Revisited

    The following four statements are equivalent

    1. A and B are independent events

    2. P(A and B) = P(A) * P(B)

    3. P(A|B) = P(A)

    4. P(B|A) = P(B)

    The last two are because if two events are independent, the occurrence of one doesn't

    change the probability of the occurrence of the other. This means that the probability

    of B occurring, whether A has happened or not, is simply the probability of B

    occurring.


    Conditional Probability

    Conditional Probability

    Recall that the probability of an event occurring given that another event has already

    occurred is called a conditional probability.

    The probability that event B occurs, given that event A has already occurred is

    P(B|A) = P(A and B) / P(A)

    This formula comes from the general multiplication principle and a little bit of

    algebra.

    Since we are given that event A has occurred, we have a reduced sample space.

    Instead of the entire sample space S, we now have a sample space of A since we know

    A has occurred. So the old rule about being the number in the event divided by the

    number in the sample space still applies. It is the number in A and B (must be in A

    since A has occurred) divided by the number in A. If you then divided numerator and

    denominator of the right hand side by the number in the sample space S, then you

    have the probability of A and B divided by the probability of A.

    Examples

    Example 1:

    The question, "Do you smoke?" was asked of 100 people. Results are shown in the

    table.

    . Yes No Total

    Male 19 41 60

    Female 12 28 40

    Total 31 69 100

    http://people.richland.edu/james/lecture/m113/prob_rules.html

    What is the probability of a randomly selected individual being a male who smokes? This is just a joint probability: the number of "Male and Smoke" divided by the total = 19/100 = 0.19

    What is the probability of a randomly selected individual being a male? This is the total for male divided by the total = 60/100 = 0.60. Since no mention is made of smoking or not smoking, it includes all the cases.

    What is the probability of a randomly selected individual smoking? Again, since no mention is made of gender, this is a marginal probability: the total who smoke divided by the total = 31/100 = 0.31.

    What is the probability of a randomly selected male smoking? This time, you're told that you have a male - think of stratified sampling. What is the probability that the male smokes? Well, 19 males smoke out of 60 males, so

    19/60 = 0.31666...

    What is the probability that a randomly selected smoker is male? This time, you're told that you have a smoker and asked to find the probability that the smoker is also male. There are 19 male smokers out of 31 total smokers, so

    19/31 = 0.6129 (approx)
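All five answers come straight from the counts in the table. As a sketch (not part of the original notes; the variable names are mine), the same work in Python with exact fractions:

```python
from fractions import Fraction

# Counts from the smoking survey table
male_yes, male_no = 19, 41
female_yes, female_no = 12, 28
total = male_yes + male_no + female_yes + female_no  # 100

p_male_and_smokes = Fraction(male_yes, total)                    # 19/100
p_male = Fraction(male_yes + male_no, total)                     # 60/100
p_smokes = Fraction(male_yes + female_yes, total)                # 31/100
p_smokes_given_male = Fraction(male_yes, male_yes + male_no)     # 19/60
p_male_given_smokes = Fraction(male_yes, male_yes + female_yes)  # 19/31
print(p_male_given_smokes, float(p_male_given_smokes))
```

Notice how the two conditional probabilities just swap which total goes in the denominator: the reduced sample space.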

    After that last part, you have just worked a Bayes' Theorem problem. I know you

    didn't realize it - that's the beauty of it. A Bayes' problem can be set up so it appears to

    be just another conditional probability. In this class we will treat Bayes' problems as

    another conditional probability and not involve the large messy formula given in the

    text (and every other text).

    Example 2:

    There are three major manufacturing companies that make a product: Aberations,

    Brochmailians, and Chompielians. Aberations has a 50% market share, and

    Brochmailians has a 30% market share. 5% of Aberations' product is defective, 7% of

    Brochmailians' product is defective, and 10% of Chompieliens' product is defective.

    This information can be placed into a joint probability distribution

    Company Good Defective Total

    Aberations 0.50-0.025 = 0.475 0.05(0.50) = 0.025 0.50

    Brochmailians 0.30-0.021 = 0.279 0.07(0.30) = 0.021 0.30


    Chompieliens 0.20-0.020 = 0.180 0.10(0.20) = 0.020 0.20

    Total 0.934 0.066 1.00

    The percent of the market share for Chompieliens wasn't given, but since the

    marginals must add to be 1.00, they have a 20% market share.

    Notice that the 5%, 7%, and 10% defective rates don't go into the table directly. This

    is because they are conditional probabilities and the table is a joint probability table.

    These defective probabilities are conditional upon which company was given. That is,

    the 7% is not P(Defective), but P(Defective|Brochmailians). The joint probability

    P(Defective and Brochmailians) = P(Defective|Brochmailians) * P(Brochmailians).

    The "good" probabilities can be found by subtraction as shown above, or by

    multiplication using conditional probabilities. If 7% of Brochmailians' product is

    defective, then 93% is good: 0.93(0.30) = 0.279.

    What is the probability a randomly selected product is defective? P(Defective) = 0.066

    What is the probability that a defective product came from Brochmailians?

    P(Brochmailians|Defective) = P(Brochmailians and Defective) / P(Defective) =

    0.021/0.066 = 7/22 = 0.318 (approx).

    Are these events independent? No. If they were, then

    P(Brochmailians|Defective) = 0.318 would have to equal

    P(Brochmailians) = 0.30, but it doesn't. Also, the P(Aberations and

    Defective) = 0.025 would have to be P(Aberations)*P(Defective) =

    0.50*0.066 = 0.033, and it doesn't.

    The second question asked above is a Bayes' problem. Again, my point is, you don't

    have to know Bayes formula just to work a Bayes' problem.

    Bayes' Theorem

    However, just for the sake of argument, let's say that you want to know what Bayes'

    formula is.

    Let's use the same example, but shorten each event to its one letter initial, ie: A, B, C,

    and D instead of Aberations, Brochmailians, Chompieliens, and Defective.


    P(D|B) is not a Bayes problem. This is given in the problem. Bayes' formula finds the

    reverse conditional probability P(B|D).

    It is based on the fact that the given event (D) is made of three parts: the part of D in A, the part of D in

    B, and the part of D in C.

    P(B and D)

    P(B|D) = -----------------------------------------

    P(A and D) + P(B and D) + P(C and D)

    Inserting the multiplication rule for each of these joint probabilities gives

    P(D|B)*P(B)

    P(B|D) = -----------------------------------------

    P(D|A)*P(A) + P(D|B)*P(B) + P(D|C)*P(C)

    However, and I hope you agree, it is much easier to take the joint probability divided

    by the marginal probability. The table does the adding for you and makes the

    problems doable without having to memorize the formulas.
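To see that the table method and Bayes' formula agree, here is a small Python sketch (the dictionaries are my own framing of the problem data, not from the notes):

```python
# Bayes' theorem for the manufacturing example: the table approach
# (joint probability divided by marginal) and the formula give the same answer.
priors = {"A": 0.50, "B": 0.30, "C": 0.20}            # market shares
p_defective_given = {"A": 0.05, "B": 0.07, "C": 0.10}  # defective rates

# Joint probabilities P(company and Defective) -- the "Defective" column
joint = {c: priors[c] * p_defective_given[c] for c in priors}
p_defective = sum(joint.values())                      # the 0.066 marginal

p_b_given_d = joint["B"] / p_defective                 # 0.021 / 0.066
print(round(p_b_given_d, 3))  # 0.318
```

The `sum` over the joint probabilities is exactly the denominator of Bayes' formula; the table simply does that addition for you.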

    Counting Techniques

    Fundamental Theorems

    Every branch of mathematics has its fundamental theorem or theorems.

    Fundamental Theorem of Arithmetic

    Every integer greater than one is either prime or can be expressed as a unique

    product of prime numbers.

    Fundamental Theorem of Algebra

    Every polynomial function in one variable of degree n > 0 has at least one real or complex zero.

    Fundamental Theorem of Linear Programming

    If there is a solution to a linear programming problem, then it will occur at a corner

    point or on a boundary between two or more corner points


    Fundamental Counting Principle

    In a sequence of events, the total number of ways all events can be performed is

    the product of the number of ways each individual event can be performed.

    Factorials

    If n is a positive integer, then

    n! = n (n-1) (n-2) ... (3)(2)(1)

    n! = n (n-1)!

    A special case is 0!

    0! = 1

    Permutations

    A permutation is an arrangement of objects without repetition and where order is

    important.

    Another definition of permutation is the number of arrangements that can be formed.

    Permutations using all the objects

    A permutation of n objects, arranged into one group of size n, without repetition, and

    order being important is:

    nPn = P(n,n) = n!

    Example: Find all permutations of the letters "ABC"

    ABC ACB BAC BCA CAB CBA

    Permutations of some of the objects

    A permutation of n objects, arranged in groups of size r, without repetition, and order

    being important is:

    nPr = P(n,r) = n! / (n-r)!

    The calculator can be used to find the number of such permutations. On the TI-82 or

    TI-83, the permutation key is found under the Math, Probability menu.


    Example: Find all two-letter permutations of the letters "ABC"

    AB AC BA BC CA CB
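The notes use the TI-82's permutation key; the same counts can be checked with Python's standard library (a sketch, not part of the original notes):

```python
from itertools import permutations
from math import factorial, perm

# P(n, r) = n! / (n - r)! -- the two-letter arrangements of "ABC"
print(list(permutations("ABC", 2)))  # 6 ordered pairs: AB AC BA BC CA CB

# math.perm computes the count directly
assert perm(3, 2) == factorial(3) // factorial(3 - 2) == 6
```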

    Shortcut formula for finding a permutation

    Assuming that you start at n and count down to 1 in your factorials ...

    P(n,r) = first r factors of n factorial

    Distinguishable Permutations

    Sometimes letters are repeated and all of the permutations aren't distinguishable from

    each other.

    Example: Find all permutations of the letters "BOB"

    To help you distinguish, I'll write the second "B" as "b"

    BOb BbO OBb ObB bBO bOB

    If you just write "B" as "B", however ...

    BOB BBO OBB OBB BBO BOB

    There are really only three distinguishable permutations here.

    BOB BBO OBB

    If a word has N letters, k of which are unique, and you let n1, n2, n3, ..., nk be the

    frequency of each of the k letters, then the total number of distinguishable

    permutations is given by:

    N! / ( n1! * n2! * ... * nk! )

    Consider the word "STATISTICS":

    Here are the frequencies of each letter: S=3, T=3, A=1, I=2, C=1. There are 10 letters

    total.

    Permutations = 10! / (3! 3! 1! 2! 1!) = 3628800 / (6 * 6 * 1 * 2 * 1) = 50400
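The STATISTICS count can be reproduced in a few lines of Python (a sketch; `word` and `denominator` are my names):

```python
from math import factorial

# Distinguishable permutations of "STATISTICS":
# N! divided by the product of the factorial of each letter's frequency
word = "STATISTICS"
denominator = 1
for letter in set(word):           # S=3, T=3, A=1, I=2, C=1
    denominator *= factorial(word.count(letter))

print(factorial(len(word)) // denominator)  # 50400
```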


    You can find distinguishable permutations using the TI-82.

    Combinations

    A combination is an arrangement of objects without repetition and where order is not

    important.

    Note: The difference between a permutation and a combination is not whether there is

    repetition or not - there must not be repetition with either, and if there is repetition,

    you can not use the formulas for permutations or combinations. The only difference

    in the definition of a permutation and a combination is whether order is

    important.

    A combination of n objects, arranged in groups of size r, without repetition, and order

    not being important is:

    nCr = C(n,r) = n! / ( (n-r)! * r! )

    Another way to write a combination of n things, r at a time, is using the binomial

    coefficient notation:

    ( n )
    ( r )

    Example: Find all two-letter combinations of the letters "ABC"

    AB = BA AC = CA BC = CB

    There are only three two-letter combinations.
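As with permutations, Python's standard library can confirm the combination count (a sketch, not from the notes):

```python
from itertools import combinations
from math import comb, factorial

# C(n, r) = n! / ((n - r)! * r!) -- the two-letter selections of "ABC"
print(list(combinations("ABC", 2)))  # AB, AC, BC -- only 3, since order is ignored

# math.comb computes the count directly
assert comb(3, 2) == factorial(3) // (factorial(1) * factorial(2)) == 3
```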

    Shortcut formula for finding a combination

    Assuming that you start at n and count down to 1 in your factorials ...

    C(n,r) = first r factors of n factorial divided by the last r factors of n factorial

    Pascal's Triangle

    Combinations are used in the binomial expansion theorem from algebra to give the

    coefficients of the expansion (a+b)^n. They also form a pattern known as Pascal's

    Triangle.

    1

    1 1

    1 2 1

    1 3 3 1

    http://people.richland.edu/james/ti82/ti-dper.html

    1 4 6 4 1

    1 5 10 10 5 1

    1 6 15 20 15 6 1

    1 7 21 35 35 21 7 1

    Each element in the table is the sum of the two elements directly above it. Each

    element is also a combination. The n value is the number of the row (start counting at zero) and the r value is the element in the row (start counting at zero). That would

    make the 20 in the next to last row C(6,3) -- it's in the row #6 (7th row) and position #3

    (4th element).
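Since each entry is C(n, r), the triangle can be regenerated directly (a Python sketch, not part of the original notes):

```python
from math import comb

# Row n of Pascal's Triangle is C(n, 0) ... C(n, n)
for n in range(8):
    print(" ".join(str(comb(n, r)) for r in range(n + 1)))

assert comb(6, 3) == 20  # the 20 in row #6, position #3
```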

    Symmetry

    Pascal's Triangle illustrates the symmetric nature of a combination. C(n,r) = C(n,n-r)

    Example: C(10,4) = C(10,6) or C(100,99) = C(100,1)

    Shortcut formula for finding a combination

    Since combinations are symmetric, if n-r is smaller than r, then switch the

    combination to its alternative form and then use the shortcut given above.

    C(n,r) = first r factors of n factorial divided by the last r factors of n factorial

    TI-82

    You can use the TI-82 graphing calculator to find factorials, permutations, and

    combinations.

    Tree Diagrams

    Tree diagrams are a graphical way of listing all the possible

    outcomes. The outcomes are listed in an orderly fashion, so listing all of the possible outcomes is easier than just trying

    to make sure that you have them all listed. It is called a tree

    diagram because of the way it looks.

    http://people.richland.edu/james/ti82/ti-count.html

    The first event appears on the left, and then each sequential event is represented as

    branches off of the first event.

    The tree diagram to the right would show the possible ways of flipping two coins. The

    final outcomes are obtained by following each branch to its conclusion: They are from

    top to bottom:

    HH HT TH TT
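The tree's final outcomes are just the Cartesian product of the per-coin outcomes; a Python sketch (not part of the original notes):

```python
from itertools import product

# Each branch of the tree is one sequence of per-coin outcomes
outcomes = ["".join(flips) for flips in product("HT", repeat=2)]
print(outcomes)  # ['HH', 'HT', 'TH', 'TT']
```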

    Probability Distributions

    Definitions

    Random Variable

    Variable whose values are determined by chance

    Probability Distribution

    The values a random variable can assume and the corresponding probabilities

    of each.

    Expected Value

    The theoretical mean of the variable.

    Binomial Experiment

    An experiment with a fixed number of independent trials. Each trial can only

    have two outcomes, or outcomes which can be reduced to two outcomes. The

    probability of each outcome must remain constant from trial to trial.

    Binomial Distribution

    The outcomes of a binomial experiment with their corresponding

    probabilities.

    Multinomial Distribution


    A probability distribution resulting from an experiment with a fixed number of

    independent trials. Each trial has two or more mutually exclusive outcomes.

    The probability of each outcome must remain constant from trial to trial.

    Poisson Distribution

    A probability distribution used when a density of items is distributed over a

    period of time. The sample size needs to be large and the probability of

    success needs to be small.

    Hypergeometric Distribution

    A probability distribution of a variable with two outcomes when sampling is

    done without replacement.

    Probability Distributions

    Probability Functions

    A probability function is a function which assigns probabilities to the values of a random variable.

    All the probabilities must be between 0 and 1 inclusive. The sum of the probabilities of the outcomes must be 1.

    If these two conditions aren't met, then the function isn't a probability function. There

    is no requirement that the values of the random variable only be between 0 and 1, only

    that the probabilities be between 0 and 1.

    Probability Distributions

    A listing of all the values the random variable can assume with their corresponding

    probabilities makes a probability distribution.

    A note about random variables. A random variable does not mean that the values can

    be anything (a random number). Random variables have a well defined set of


    outcomes and well defined probabilities for the occurrence of each outcome. The

    random refers to the fact that the outcomes happen by chance -- that is, you don't

    know which outcome will occur next.

    Here's an example probability distribution that results from the rolling of a single fair

    die.

    x 1 2 3 4 5 6 sum

    p(x) 1/6 1/6 1/6 1/6 1/6 1/6 6/6=1

    Mean, Variance, and Standard Deviation

    Consider the following.

    The definitions for population mean and variance used with an ungrouped frequency

    distribution were:

    mu = ( sum of x*f ) / N        sigma^2 = ( sum of (x - mu)^2 * f ) / N

    Some of you might be confused by only dividing by N. Recall that this is the

    population variance; the sample variance, which is the unbiased estimator for the

    population variance, is the one divided by n-1.

    Using algebra, this is equivalent to:

    mu = sum of [ x * (f/N) ]      sigma^2 = sum of [ x^2 * (f/N) ] - ( sum of [ x * (f/N) ] )^2

    Recall that a probability is a long term relative frequency. So every f/N can be

    replaced by p(x). This simplifies to be:

    mu = sum of [ x * p(x) ]       sigma^2 = sum of [ x^2 * p(x) ] - ( sum of [ x * p(x) ] )^2

    What's even better is that the last portion of the variance is the mean squared. So, the

    two formulas that we will be using are:

    mu = sum of [ x * p(x) ]       sigma^2 = sum of [ x^2 * p(x) ] - mu^2


    Here's the example we were working on earlier.

    x 1 2 3 4 5 6 sum

    p(x) 1/6 1/6 1/6 1/6 1/6 1/6 6/6 = 1

    x p(x) 1/6 2/6 3/6 4/6 5/6 6/6 21/6 = 3.5

    x^2 p(x) 1/6 4/6 9/6 16/6 25/6 36/6 91/6 = 15.1667

    The mean is 7/2 or 3.5

    The variance is 91/6 - (7/2)^2 = 35/12 = 2.916666...

    The standard deviation is the square root of the variance = 1.7078

    Do not use rounded off values in the intermediate calculations. Only round off the

    final answer.
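The same computation can be done in Python with exact fractions, so nothing gets rounded until the very end (a sketch, not part of the original notes):

```python
from fractions import Fraction

# Probability distribution for one fair die
dist = {x: Fraction(1, 6) for x in range(1, 7)}

mean = sum(x * p for x, p in dist.items())        # 21/6 = 7/2
mean_sq = sum(x**2 * p for x, p in dist.items())  # 91/6
variance = mean_sq - mean**2                      # 91/6 - (7/2)^2 = 35/12
std_dev = float(variance) ** 0.5                  # only round at the end

print(mean, variance, round(std_dev, 4))
```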

    You can learn how to find the mean and variance of a probability distribution using

    lists with the TI-82 or using the program called PDIST.

    Binomial Probabilities

    Binomial Experiment

    A binomial experiment is an experiment which satisfies these four conditions:

    A fixed number of trials

    Each trial is independent of the others

    There are only two outcomes

    http://people.richland.edu/james/ti82/ti-list3.html
    http://people.richland.edu/james/ti82/pdist.html

    The probability of each outcome remains constant from trial to trial.

    These can be summarized as: An experiment with a fixed number of independent

    trials, each of which can only have two possible outcomes.

    A binomial experiment has a fixed number of independent trials, each with only two outcomes.

    The fact that each trial is independent actually means that the probabilities remain

    constant.

    Examples of binomial experiments

    Tossing a coin 20 times to see how many tails occur.

    Asking 200 people if they watch ABC news.

    Rolling a die to see if a 5 appears.

    Asking 500 die-hard Republicans if they would vote for the Democratic

    candidate. (Just because something is unlikely, doesn't mean that it isn't

    binomial. The conditions are met - there's a fixed number [500], the trials are

    independent [what one person does doesn't affect the next person], and

    there are only two outcomes [yes or no].)

    Examples which aren't binomial experiments

    Rolling a die until a 6 appears (not a fixed number of trials)

    Asking 20 people how old they are (not two outcomes)

    Drawing 5 cards from a deck for a poker hand (done without replacement, so

    not independent)

    Binomial Probability Function

    Example:

    What is the probability of rolling exactly two sixes in 6 rolls of a die?

    There are five things you need to do to work a binomial story problem.

    1. Define Success first. Success must be for a single trial. Success = "Rolling a 6 on a single die"

    2. Define the probability of success (p): p = 1/6


    3. Find the probability of failure: q = 5/6

    4. Define the number of trials: n = 6

    5. Define the number of successes out of those trials: x = 2

    Anytime a six appears, it is a success (denoted S) and anytime something else appears,

    it is a failure (denoted F). The ways you can get exactly 2 successes in 6 trials are

    given below. The probability of each is written to the right of the way it could occur.

    Because the trials are independent, the probability of the event (all six dice) is the

    product of each probability of each outcome (die)

    1 FFFFSS 5/6 * 5/6 * 5/6 * 5/6 * 1/6 * 1/6 = (1/6)^2 * (5/6)^4

    2 FFFSFS 5/6 * 5/6 * 5/6 * 1/6 * 5/6 * 1/6 = (1/6)^2 * (5/6)^4

    3 FFFSSF 5/6 * 5/6 * 5/6 * 1/6 * 1/6 * 5/6 = (1/6)^2 * (5/6)^4

    4 FFSFFS 5/6 * 5/6 * 1/6 * 5/6 * 5/6 * 1/6 = (1/6)^2 * (5/6)^4

    5 FFSFSF 5/6 * 5/6 * 1/6 * 5/6 * 1/6 * 5/6 = (1/6)^2 * (5/6)^4

    6 FFSSFF 5/6 * 5/6 * 1/6 * 1/6 * 5/6 * 5/6 = (1/6)^2 * (5/6)^4

    7 FSFFFS 5/6 * 1/6 * 5/6 * 5/6 * 5/6 * 1/6 = (1/6)^2 * (5/6)^4

    8 FSFFSF 5/6 * 1/6 * 5/6 * 5/6 * 1/6 * 5/6 = (1/6)^2 * (5/6)^4

    9 FSFSFF 5/6 * 1/6 * 5/6 * 1/6 * 5/6 * 5/6 = (1/6)^2 * (5/6)^4

    10 FSSFFF 5/6 * 1/6 * 1/6 * 5/6 * 5/6 * 5/6 = (1/6)^2 * (5/6)^4

    11 SFFFFS 1/6 * 5/6 * 5/6 * 5/6 * 5/6 * 1/6 = (1/6)^2 * (5/6)^4

    12 SFFFSF 1/6 * 5/6 * 5/6 * 5/6 * 1/6 * 5/6 = (1/6)^2 * (5/6)^4

    13 SFFSFF 1/6 * 5/6 * 5/6 * 1/6 * 5/6 * 5/6 = (1/6)^2 * (5/6)^4

    14 SFSFFF 1/6 * 5/6 * 1/6 * 5/6 * 5/6 * 5/6 = (1/6)^2 * (5/6)^4

    15 SSFFFF 1/6 * 1/6 * 5/6 * 5/6 * 5/6 * 5/6 = (1/6)^2 * (5/6)^4

    Notice that each of the 15 probabilities is exactly the same: (1/6)^2 * (5/6)^4.

    Also, note that the 1/6 is the probability of success and you needed 2 successes. The

    5/6 is the probability of failure, and if 2 of the 6 trials were success, then 4 of the 6

    must be failures. Note that 2 is the value of x and 4 is the value of n-x.

    Further note that there are fifteen ways this can occur. This is the number of ways 2

    successes can occur in 6 trials without repetition and order not being important, or

    a combination of 6 things, 2 at a time.

    The probability of getting exactly x success in n trials, with the probability of

    success on a single trial being p is:

    P(X=x) = nCx * p^x * q^(n-x)
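A Python sketch of the binomial probability function applied to the two-sixes example (exact fractions; the variable names are mine, not from the notes):

```python
from fractions import Fraction
from math import comb

# P(X = x) = nCx * p^x * q^(n-x): exactly two sixes in six rolls of a die
n, x = 6, 2
p = Fraction(1, 6)
q = 1 - p

prob = comb(n, x) * p**x * q**(n - x)  # 15 * (1/6)^2 * (5/6)^4
print(prob, float(prob))
```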

    Example:

    A coin is tossed 10 times. What is the probability that exactly 6 heads will occur.

    1. Success = "A head is flipped on a single coin"


    2. p = 0.5

    3. q = 0.5

    4. n = 10

    5. x = 6

    P(x=6) = 10C6 * 0.5^6 * 0.5^4 = 210 * 0.015625 * 0.0625 = 0.205078125

    Mean, Variance, and Standard Deviation

    The mean, variance, and standard deviation of a binomial distribution are extremely

    easy to find:

    mu = n*p        sigma^2 = n*p*q        sigma = sqrt(n*p*q)

    Another way to remember the variance is mu*q (since the np is mu).

    Example:

    Find the mean, variance, and standard deviation for the number of sixes that appear

    when rolling 30 dice.

    Success = "a six is rolled on a single die". p = 1/6, q = 5/6.

    The mean is 30 * (1/6) = 5. The variance is 30 * (1/6) * (5/6) = 25/6. The standard

    deviation is the square root of the variance = 2.041241452 (approx)
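The dice example can be verified with a few lines of Python, using the mean = np and variance = npq formulas above:

```python
from math import sqrt

n, p = 30, 1/6            # 30 dice, success = "a six is rolled"
q = 1 - p

mean = n * p              # 5.0
variance = n * p * q      # 25/6, about 4.1667
std_dev = sqrt(variance)  # about 2.0412
print(mean, variance, std_dev)
```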

    Other Discrete Distributions

    Multinomial Probabilities

    A multinomial experiment is an extended binomial probability. The difference is that

    in a multinomial experiment, there are more than two possible outcomes. However,

    there are still a fixed number of independent trials, and the probability of each

    outcome must remain constant from trial to trial.


    Instead of using a combination, as in the case of the binomial probability, the number of ways the outcomes can occur is counted using distinguishable permutations.

    An example here will be much more useful than a formula.

    The probability that a person will pass a College Algebra class is 0.55, the probability that a person will withdraw before the class is completed is 0.40, and the probability that a person will fail the class is 0.05. Find the probability that in a class of 30 students, exactly 16 pass, 12 withdraw, and 2 fail.

    Outcome x p(outcome)

    Pass 16 0.55

    Withdraw 12 0.40

    Fail 2 0.05

    Total 30 1.00

    The probability is found using this formula:

              30!
    P = ---------------- * 0.55^16 * 0.40^12 * 0.05^2
       (16!) (12!) (2!)

    You can do this on the TI-82.
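The same formula can be sketched in Python (the notes use a TI-82); the function name is mine:

```python
from math import factorial

def multinomial_prob(counts, probs):
    """n! / (x1! x2! ... xk!) * p1^x1 * p2^x2 * ... * pk^xk"""
    n = sum(counts)
    coef = factorial(n)
    for x in counts:
        coef //= factorial(x)   # exact integer division at each step
    prob = coef
    for x, p in zip(counts, probs):
        prob *= p**x
    return prob

# 16 pass, 12 withdraw, 2 fail out of 30 students
print(multinomial_prob([16, 12, 2], [0.55, 0.40, 0.05]))  # about 0.0389
```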

    The multinomial experiment will be used later when we talk about the chi-square

    goodness of fit test.

    Poisson Probabilities

    Named after the French mathematician Simeon Poisson, Poisson probabilities are

    useful when there are a large number of independent trials with a small probability of success on a single trial and the variables occur over a period of time. It can also be used when a density of items is distributed over a given area or volume.

    http://people.richland.edu/james/ti82/ti-mult.html

    The formula is p(x; lambda) = e^(-lambda) * lambda^x / x!. Lambda in the formula is the mean number of occurrences. If you're approximating a binomial probability using the Poisson, then lambda is the same as mu or n * p.

    Example:

    If there are 500 customers per eight-hour day in a check-out lane, what is the probability that there will be exactly 3 in line during any five-minute period?

    The expected value during any one five minute period would be 500 / 96 =

    5.2083333. The 96 is because there are 96 five-minute periods in eight hours. So, you

    expect about 5.2 customers in 5 minutes and want to know the probability of getting

    exactly 3.

    p(3;500/96) = e^(-500/96) * (500/96)^3 / 3! = 0.1288 (approx)
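The Poisson calculation above, as a minimal Python sketch (function name is mine):

```python
from math import exp, factorial

def poisson_prob(x, lam):
    """p(x; lambda) = e^(-lambda) * lambda^x / x!"""
    return exp(-lam) * lam**x / factorial(x)

lam = 500 / 96   # expected customers per five-minute period
print(poisson_prob(3, lam))  # about 0.1288
```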

    Hypergeometric Probabilities

    Hypergeometric experiments occur when the trials are not independent of each other

    and occur due to sampling without replacement -- as in a five card poker hand.

    Hypergeometric probabilities involve the multiplication of two combinations together

    and then division by the total number of combinations.

    Example:

    What is the probability of selecting 3 men and 4 women when 7 people are chosen from a group of 7 men and 10 women?

    The answer is 7C3 * 10C4 / 17C7 = 7350/19448 = 0.3779 (approx)

    Note that the sum of the numbers in the numerator are the numbers used in the

    combination in the denominator.

    This can be extended to more than two groups and called an extended hypergeometric

    problem.

    You can use the TI-82 to find hypergeometric probabilities.
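In Python, the same hypergeometric computation is a one-liner with `math.comb`:

```python
from math import comb

# P(3 men and 4 women when 7 people are drawn from 7 men and 10 women)
prob = comb(7, 3) * comb(10, 4) / comb(17, 7)
print(prob)  # 7350/19448, about 0.3779
```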

    http://people.richland.edu/james/ti82/hypergeo.html

    Normal Distribution

    Definitions

    Central Limit Theorem

    Theorem which states that as the sample size increases, the sampling distribution of the sample means will become approximately normally distributed.

    Correction for Continuity

    A correction applied to convert a discrete distribution to a continuous

    distribution.

    Finite Population Correction Factor

    A correction applied to the standard error of the means when the sample size

    is more than 5% of the population size and the sampling is done without

    replacement.

    Sampling Distribution of the Sample Means

    Distribution obtained by using the means computed from random samples of

    a specific size.

    Sampling Error

    Difference which occurs between the sample statistic and the population

    parameter due to the fact that the sample isn't a perfect representation of the

    population.

    Standard Error of the Mean


    The standard deviation of the sampling distribution of the sample means. It is

    equal to the standard deviation of the population divided by the square root

    of the sample size.

    Standard Normal Distribution

    A normal distribution in which the mean is 0 and the standard deviation is 1. It

    is denoted by z.

    Z-score

    Also known as z-value. A standardized score in which the mean is zero and the

    standard deviation is 1. The Z score is used to represent the standard normal

    distribution.

    Normal Distributions

    Any Normal Distribution

    - Bell-shaped
    - Symmetric about the mean
    - Continuous
    - Never touches the x-axis
    - Total area under the curve is 1.00
    - Approximately 68% lies within 1 standard deviation of the mean, 95% within 2 standard deviations, and 99.7% within 3 standard deviations of the mean. This is the Empirical Rule mentioned earlier.
    - Data values represented by x, which has mean mu and standard deviation sigma.


    Probability Function given by

    y = e^( -(x - mu)^2 / (2 sigma^2) ) / ( sigma * sqrt(2 pi) )

    Standard Normal Distribution

    Same as a normal distribution, but also ...

    - Mean is zero
    - Variance is one
    - Standard Deviation is one
    - Data values represented by z.

    Probability Function given by

    y = e^( -z^2 / 2 ) / sqrt(2 pi)

    Standard Normal: Mean = 0 and Variance = 1
    Non-Standard Normal: Mean is not 0 or Variance is not 1

    Normal Probabilities

    This table has not been verified against the book, please use the table out of your

    textbook.

    Comprehension of this table is vital to success in the course!

    There is a table which must be used to look up standard normal probabilities. The z-score is broken into two parts: the whole number and tenths are looked up along the left side, and the hundredths are looked up across the top. The value at the intersection of the row and column is the area under the curve between zero and the z-score looked up.

    Because of the symmetry of the normal distribution, look up the absolute value of any

    z-score.

    http://people.richland.edu/james/lecture/m113/z_table.html

    Computing Normal Probabilities

    There are several different situations that can arise when asked to find normal

    probabilities.

    Between zero and any number:
        Look up the area in the table.

    Between two positives, or between two negatives:
        Look up both areas in the table and subtract the smaller from the larger.

    Between a negative and a positive:
        Look up both areas in the table and add them together.

    Less than a negative, or greater than a positive:
        Look up the area in the table and subtract from 0.5000.

    Greater than a negative, or less than a positive:
        Look up the area in the table and add to 0.5000.

    This can be shortened into two rules.

    1. If there is only one z-score given, use 0.5000 for the second area; otherwise look up both z-scores in the table.

    2. If the two numbers are the same sign, then subtract; if they are different signs, then add. If there is only one z-score, then use the inequality to determine the second sign (< is negative, and > is positive).
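These rules can be checked without a printed table. Python's standard library has `statistics.NormalDist` (Python 3.8+), and the table's area P(0 < Z < z) is just cdf(z) - 0.5. A sketch (variable names are mine):

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def table_area(z):
    """Area between zero and z, as printed in the table."""
    return Z.cdf(abs(z)) - 0.5

# Between two positives: subtract the smaller area from the larger
p1 = table_area(1.96) - table_area(1.00)   # P(1.00 < Z < 1.96)

# Between a negative and a positive: add the two areas
p2 = table_area(-1.00) + table_area(1.96)  # P(-1.00 < Z < 1.96)

# Greater than a positive: subtract the area from 0.5000
p3 = 0.5 - table_area(1.96)                # P(Z > 1.96), about 0.0250
print(p1, p2, p3)
```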

    Finding z-scores from probabilities

    This is more difficult, and requires you to use the table inversely. You must look up the area between zero and the value on the inside part of the table, and then read the z-score from the outside. Finally, decide if the z-score should be positive or negative, based on whether it was on the left side or the right side of the mean. Remember, z-scores can be negative, but areas or probabilities cannot be.


    Area between 0 and a value:
        Look up the area in the table.
        Make negative if on the left side.

    Area in one tail:
        Subtract the area from 0.5000.
        Look up the difference in the table.
        Make negative if in the left tail.

    Area including one complete half (less than a positive or greater than a negative):
        Subtract 0.5000 from the area.
        Look up the difference in the table.
        Make negative if on the left side.

    Within z units of the mean:
        Divide the area by 2.
        Look up the quotient in the table.
        Use both the positive and negative z-scores.

    Two tails with equal area (more than z units from the mean):
        Subtract the area from 1.0000.
        Divide the area by 2.
        Look up the quotient in the table.
        Use both the positive and negative z-scores.

    You become proficient with the table through practice, so work lots of the normal probability problems!
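The inverse lookup can also be checked in code. `statistics.NormalDist.inv_cdf` inverts the full cumulative distribution, so add 0.5 to convert a table-style area (measured from zero) into a cumulative probability. A sketch:

```python
from statistics import NormalDist

Z = NormalDist()

# Area of 0.4772 between zero and z (right side): z should be about 2.00
z = Z.inv_cdf(0.5 + 0.4772)
print(round(z, 2))

# Two tails with total area 0.05: subtract from 1, divide by 2, use +/- z
z_tail = Z.inv_cdf(0.5 + (1 - 0.05) / 2)
print(round(z_tail, 2))  # about 1.96
```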

    Standard Normal Probabilities

    z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09

    0.0 0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279 0.0319 0.0359

    0.1 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753

    0.2 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141


    0.3 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517

    0.4 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879

    0.5 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224

    0.6 0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486 0.2517 0.2549

    0.7 0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794 0.2823 0.2852

    0.8 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133

    0.9 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389

    1.0 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621

    1.1 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830

    1.2 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015

    1.3 0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177

    1.4 0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319

    1.5 0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4429 0.4441

    1.6 0.4452 0.4463 0.4474 0.4484 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545

    1.7 0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633

    1.8 0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706

    1.9 0.4713 0.4719 0.4726 0.4732 0.4738 0.4744 0.4750 0.4756 0.4761 0.4767

    2.0 0.4772 0.4778 0.4783 0.4788 0.4793 0.4798 0.4803 0.4808 0.4812 0.4817


    2.1 0.4821 0.4826 0.4830 0.4834 0.4838 0.4842 0.4846 0.4850 0.4854 0.4857

    2.2 0.4861 0.4864 0.4868 0.4871 0.4875 0.4878 0.4881 0.4884 0.4887 0.4890

    2.3 0.4893 0.4896 0.4898 0.4901 0.4904 0.4906 0.4909 0.4911 0.4913 0.4916

    2.4 0.4918 0.4920 0.4922 0.4925 0.4927 0.4929 0.4931 0.4932 0.4934 0.4936

    2.5 0.4938 0.4940 0.4941 0.4943 0.4945 0.4946 0.4948 0.4949 0.4951 0.4952

    2.6 0.4953 0.4955 0.4956 0.4957 0.4959 0.4960 0.4961 0.4962 0.4963 0.4964

    2.7 0.4965 0.4966 0.4967 0.4968 0.4969 0.4970 0.4971 0.4972 0.4973 0.4974

    2.8 0.4974 0.4975 0.4976 0.4977 0.4977 0.4978 0.4979 0.4979 0.4980 0.4981

    2.9 0.4981 0.4982 0.4982 0.4983 0.4984 0.4984 0.4985 0.4985 0.4986 0.4986

    3.0 0.4987 0.4987 0.4987 0.4988 0.4988 0.4989 0.4989 0.4989 0.4990 0.4990

    The values in the table are the areas between zero and the z-score. That is, P(0 < Z < z).


    Example 1: Sampling Distribution of Values (x)

    Consider the case where a single, fair die is rolled.

    Here are the values that are possible and their probabilities.

    Value 1 2 3 4 5 6

    Probability 1/6 1/6 1/6 1/6 1/6 1/6

    Here are the mean, variance, and standard deviation of this probability distribution.

    Mean, mu = sum [ x * p(x) ] = 3.5

    Variance, sigma^2 = sum [ x^2 * p(x) ] - mu^2 = 35/12

    Standard deviation, sigma = sqrt ( variance ) = sqrt ( 35/12 )

    Example 2: Sampling Distribution of Sample Means (x-bar)

    Consider the case where two fair dice are rolled instead of one.

    Here are the sums that are possible and their probabilities.

    Sum 2 3 4 5 6 7 8 9 10 11 12

    Prob 1/36 2/36 3/36 4/36 5/36 6/36 5/36 4/36 3/36 2/36 1/36

    But, we're not interested in the sum of the dice, we're interested in the sample mean.

    We find the sample mean by dividing the sum by the sample size.

    Mean 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0

    Prob 1/36 2/36 3/36 4/36 5/36 6/36 5/36 4/36 3/36 2/36 1/36

    Computing the mean, variance, and standard deviation, we get ...

    Mean, mu = sum [ x * p(x) ] = 3.5

    Variance, sigma^2 = sum [ x^2 * p(x) ] - mu^2 = 35/24

    Standard deviation, sigma = sqrt ( variance ) = sqrt ( 35/24 )
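Both examples can be reproduced exactly with Python's `fractions` module; note how the variance of the sample means is the single-die variance divided by the sample size:

```python
from itertools import product
from fractions import Fraction

faces = range(1, 7)

def stats(values):
    """Mean and variance of a list of equally likely outcomes."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return mean, var

# Example 1: a single die
mean1, var1 = stats([Fraction(v) for v in faces])
print(mean1, var1)   # 7/2 and 35/12

# Example 2: the sample mean of two dice
means = [Fraction(a + b, 2) for a, b in product(faces, repeat=2)]
mean2, var2 = stats(means)
print(mean2, var2)   # 7/2 and 35/24 (the population variance divided by 2)
```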


    Properties of the Sampling Distribution of the Sample Means

    When all of the possible sample means are computed, then the following properties

    are true:

    - The mean of the sample means will be the mean of the population.
    - The variance of the sample means will be the variance of the population divided by the sample size.
    - The standard deviation of the sample means (known as the standard error of the mean) will be smaller than the population standard deviation and will be equal to the standard deviation of the population divided by the square root of the sample size.
    - If the population has a normal distribution, then the sample means will have a normal distribution.
    - If the population is not normally distributed, but the sample size is sufficiently large, then the sample means will have an approximately normal distribution. Some books define sufficiently large as at least 30 and others as at least 31.

    The formula for a z-score when working with the sample means is:

    z = ( x-bar - mu ) / ( sigma / sqrt(n) )

    Finite Population Correction Factor

    If the sample size is more than 5% of the population size and the sampling is done

    without replacement, then a correction needs to be made to the standard error of the

    means.

    In the following, N is the population size and n is the sample size. The adjustment is to multiply the standard error by the square root of the quotient of the difference between the population and sample sizes and one less than the population size; that is, by sqrt( (N - n) / (N - 1) ).

    For the most part, we will be ignoring this in class.
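A small Python sketch of the corrected standard error, under the rule stated above (function name is mine):

```python
from math import sqrt

def standard_error(sigma, n, N=None):
    """Standard error of the mean. The finite population correction
    is applied when N is given, the sampling is without replacement,
    and the sample is more than 5% of the population."""
    se = sigma / sqrt(n)
    if N is not None and n > 0.05 * N:
        se *= sqrt((N - n) / (N - 1))
    return se

print(standard_error(10, 100))        # 1.0 (no correction)
print(standard_error(10, 100, 1000))  # about 0.9492 (10% sample)
```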


    Normal Approximation to Binomial

    Recall that according to the Central Limit Theorem, the sample mean of any distribution will become approximately normal if the sample size is sufficiently large.

    It turns out that the binomial distribution can be approximated using the normal distribution if np and nq are both at least 5. Furthermore, recall that the mean of a binomial distribution is np and the variance of the binomial distribution is npq.

    Continuity Correction Factor

    There is a problem with approximating the binomial with the normal. That problem

    arises because the binomial distribution is a discrete distribution while the normal distribution is a continuous distribution. The basic difference here is that with discrete

    values, we are talking about heights but no widths, and with the continuous

    distribution we are talking about both heights and widths.

    The correction is to either add or subtract 0.5 of a unit from each discrete x-value.

    This fills in the gaps to make it continuous. This is very similar to the expanding of limits to form boundaries that we did with grouped frequency distributions.

    Examples

    Discrete Continuous

    x = 6 5.5 < x < 6.5

    x > 6 x > 6.5

    x >= 6 x > 5.5

    x < 6 x < 5.5

    x <= 6 x < 6.5


    As you can see, whether or not the equal to is included makes a big difference in the

    discrete distribution and the way the conversion is performed. However, for a

    continuous distribution, equality makes no difference.

    Steps to working a normal approximation to the binomial distribution

    1. Identify success, the probability of success, the number of trials, and the desired number of successes. Since this is a binomial problem, these are the same things which were identified when working a binomial problem.

    2. Convert the discrete x to a continuous x. Some people would argue that step 3 should be done before this step, but go ahead and convert the x before you forget about it and miss the problem.

    3. Find the smaller of np or nq. If the smaller one is at least five, then the larger must also be, so the approximation will be considered good. When you find np, you're actually finding the mean, mu, so denote it as such.

    4. Find the standard deviation, sigma = sqrt(npq). It might be easier to find the variance and just stick the square root in the final calculation - that way you don't have to work with all of the decimal places.

    5. Compute the z-score using the standard formula for an individual score (not the one for a sample mean).

    6. Calculate the probability desired.
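The steps above can be sketched in Python for a hypothetical example (20 tosses of a fair coin, exactly 10 heads), comparing the approximation against the exact binomial answer:

```python
from math import comb, sqrt
from statistics import NormalDist

# Normal approximation to P(X = 10) for 20 tosses of a fair coin
n, p = 20, 0.5
q = 1 - p
mu = n * p               # step 3: np = 10 (nq = 10 too, both at least 5)
sigma = sqrt(n * p * q)  # step 4: sqrt(npq), about 2.236

# step 2: continuity correction; discrete x = 10 becomes 9.5 < x < 10.5
approx = NormalDist(mu, sigma).cdf(10.5) - NormalDist(mu, sigma).cdf(9.5)

exact = comb(20, 10) * 0.5**20  # true binomial probability, about 0.1762
print(approx, exact)
```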

    Importance of the Normal Distribution

    Parametric Hypothesis Testing

    All parametric hypothesis testing that we're going to perform requires normality in

    some sense.

    Population Mean

    Either the population was normally distributed, the sample size was large

    enough (so the central limit theorem applied and was approximately normal),

    or the population was approximately normal and the student's t was used.

    Population Proportion


    The binomial distribution (the one that really applies) was approximated using

    the normal as long as np and nq were at least five. That is another way of

    saying the expected frequency of each category (success and failure) is at least

    five.

    Population Variance

    It was required that the population be normally distributed.

    Correlation and Regression

    The pairs of data had to have a bi-variate normal distribution.

    Multinomial Experiment

    The expected frequency of each category had to be at least five. This is

    analogous to approximating the binomial using the normal.

    Independence

    The expected frequency of each cell had to be at least five. This is analogous

    to approximating the binomial using the normal.

    Distributions

    The distributions have normality in them somewhere, too.

    Normal Distribution

    Well, obviously this one requires normality.

    Student's T Distribution

    Had to be approximately normal. As the sample size increases, the student's t

    approaches the normal distribution.

    Chi-squared Distribution

    Required a normal population. There is another interesting relationship

    between the normal and chi-square distributions. If you take a critical value

    from normal distribution and square it, you will get the corresponding chi-


    square value with one degree of freedom, but twice the area in the tails.

    Example: z(0.05)^2 = 1.645^2 = 2.706 = chi-square(1, 0.10)
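This relationship is easy to verify in Python using `statistics.NormalDist`:

```python
from statistics import NormalDist

z = NormalDist().inv_cdf(0.95)       # z with 0.05 in the upper tail
print(round(z, 3), round(z * z, 3))  # 1.645 and 2.706
```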

    F Distribution

    Since F is the ratio of two independent chi-squared variables divided by their

    respective degrees of freedom, and the chi-squares require a normal

    distribution, then the F distribution is also going to require a normal

    distribution.

    Binomial Distribution

    Obviously, the binomial doesn't require a normal population, but it can be

    approximated using a normal distribution if the expected frequency of each

    category is at least five.

    Multinomial Distribution

    Same as with the binomial, the multinomial can be approximated using the

    normal if the expected frequency of each category is at least five.

    As stated in class and in the lecture notes ... your comprehension of the normal

    distribution is vital for success in the class.