Notes Stats


  • 7/28/2019 Notes Stats

    1/38

    Definitions

    Probability Experiment

    A process which leads to well-defined results called outcomes

    Outcome

    The result of a single trial of a probability experiment

    Sample Space

    Set of all possible outcomes of a probability experiment

    Event

    One or more outcomes of a probability experiment

    Classical Probability

    Uses the sample space to determine the numerical probability that an event will

    happen. Also called theoretical probability.

    Equally Likely Events

    Events which have the same probability of occurring.

    Complement of an Event

    All the events in the sample space except the given events.

    Empirical Probability

    Uses a frequency distribution to determine the numerical probability. An

    empirical probability is a relative frequency.

    Subjective Probability

    Uses probability values based on an educated guess or estimate. It employs

    opinions and inexact information.

    Mutually Exclusive Events

    Two events which cannot happen at the same time.

    Disjoint Events

    Another name for mutually exclusive events.

    Independent Events

    Two events are independent if the occurrence of one does not affect the

    probability of the other occurring.

    Dependent Events

    Two events are dependent if the first event affects the outcome or occurrence of

    the second event in a way the probability is changed.

    Conditional Probability

    The probability of an event occurring given that another event has already occurred.

    Bayes' Theorem

    A formula which allows one to find the probability that an event occurred as

    the result of a particular previous event.

    Factorial


    A positive integer factorial is the product of each natural number up to and

    including the integer.

    Permutation

    An arrangement of objects in a specific order.

    Combination

    A selection of objects without regard to order.

    Tree Diagram

    A graphical device used to list all possibilities of a sequence of events in a

    systematic way.

    Introduction to Probability

    Sample Spaces

    A sample space is the set of all possible outcomes. However, some sample spaces are

    better than others.

    Consider the experiment of flipping two coins. It is possible to get 0 heads, 1 head, or

    2 heads. Thus, the sample space could be { 0, 1, 2 }. Another way to look at it is to

    list each ordered flip: { HH, HT, TH, TT }. The second way is better because each

    outcome is equally likely to occur.

    When writing the sample space, it is highly desirable to have events which are equally

    likely.

    Another example is rolling two dice. The sums are { 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 }.

    However, each of these isn't equally likely. The only way to get a sum of 2 is to roll a 1

    on both dice, but you can get a sum of 4 by rolling a 1-3, 2-2, or 3-1. The following

    table illustrates a better sample space for the sum obtained when rolling two dice.

    Sum of the two dice (first die down the side, second die across the top)

    First Die \ Second Die   1   2   3   4   5   6

    1                        2   3   4   5   6   7

    2                        3   4   5   6   7   8

    3                        4   5   6   7   8   9

    4                        5   6   7   8   9   10

    5                        6   7   8   9   10  11

    6                        7   8   9   10  11  12

    Classical Probability

    The above table lends itself to describing data another way -- using a probability

    distribution. Let's consider the frequency distribution for the above sums.

    Sum  Frequency  Relative Frequency

    2 1 1/36

    3 2 2/36

    4 3 3/36

    5 4 4/36

    6 5 5/36

    7 6 6/36

    8 5 5/36

    9 4 4/36

    10 3 3/36


    11 2 2/36

    12 1 1/36

    If just the first and last columns were written, we would have a probability

    distribution. The relative frequency of a frequency distribution is the probability of the

    event occurring. This is only true, however, if the events are equally likely.

    This gives us the formula for classical probability. The probability of an event

    occurring is the number in the event divided by the number in the sample space.

    Again, this is only true when the events are equally likely. A classical probability is

    the relative frequency of each event in the sample space when each event is equally

    likely.

    P(E) = n(E) / n(S)
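The notes lean on the TI-82 for computation, but the classical probability formula is easy to check by brute force. Here is a short Python sketch (not part of the original notes; the helper name `p` is mine) that enumerates the 36 equally likely dice outcomes:

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely ordered pairs (first die, second die)
space = list(product(range(1, 7), repeat=2))

# Classical probability: P(E) = n(E) / n(S)
def p(event):
    favorable = [outcome for outcome in space if event(outcome)]
    return Fraction(len(favorable), len(space))

print(p(lambda roll: sum(roll) == 2))   # 1/36
print(p(lambda roll: sum(roll) == 7))   # 6/36 = 1/6
```

The printed values match the relative frequency column in the table above.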

    Empirical Probability

    Empirical probability is based on observation. The empirical probability of an event is

    the relative frequency of a frequency distribution based upon observation.

    P(E) = f / n

    Probability Rules

    There are two rules which are very important.

    All probabilities are between 0 and 1 inclusive:

    0 <= P(E) <= 1


    The probability of an event which must occur is 1.

    The probability of the sample space is 1.

    The probability of an event not occurring is one minus the probability of it

    occurring.

    P(E') = 1 - P(E)

    Probability Rules

    "OR" or Unions

    Mutually Exclusive Events

    Two events are mutually exclusive if they cannot occur at the same time. Another

    word that means mutually exclusive is disjoint.

    If two events are disjoint, then the probability of them both occurring at the same time

    is 0.

    Disjoint: P(A and B) = 0

    If two events are mutually exclusive, then the probability of either occurring is the

    sum of the probabilities of each occurring.

    Specific Addition Rule

    Only valid when the events are mutually exclusive.

    P(A or B) = P(A) + P(B)

    Example 1:

    Given: P(A) = 0.20, P(B) = 0.70, A and B are disjoint

    I like to use what's called a joint probability distribution. (Since disjoint means

    nothing in common, joint is what they have in common -- so the values that go on the

    inside portion of the table are the intersections or "and"s of each pair of events).

    "Marginal" is another word for totals -- it's called marginal because they appear in the

    margins.


    B B' Marginal

    A 0.00 0.20 0.20

    A' 0.70 0.10 0.80

    Marginal 0.70 0.30 1.00

    The values given in the problem are the 0.20 and 0.70 marginals and the 0.00 intersection. The grand total is always 1.00. The rest of

    the values are obtained by addition and subtraction.

    Non-Mutually Exclusive Events

    In events which aren't mutually exclusive, there is some overlap. When P(A) and P(B)

    are added, the probability of the intersection (and) is added twice. To compensate for

    that double addition, the intersection needs to be subtracted.

    General Addition Rule

    Always valid.

    P(A or B) = P(A) + P(B) - P(A and B)
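The general addition rule can be checked with a couple of lines of Python (a sketch using the numbers from Example 2; the variable names are mine):

```python
# General addition rule: P(A or B) = P(A) + P(B) - P(A and B)
p_a, p_b, p_a_and_b = 0.20, 0.70, 0.15
p_a_or_b = p_a + p_b - p_a_and_b
print(round(p_a_or_b, 2))  # 0.75
```

Note that the intersection is subtracted exactly once, compensating for it being counted in both P(A) and P(B).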

    Example 2:

    Given P(A) = 0.20, P(B) = 0.70, P(A and B) = 0.15

    B B' Marginal

    A 0.15 0.05 0.20

    A' 0.55 0.25 0.80

    Marginal 0.70 0.30 1.00

    Interpreting the table

    Certain things can be determined from the joint probability distribution. Mutually

    exclusive events will have a probability of zero in the intersection. All inclusive events will have a zero

    opposite the intersection. All inclusive means that there is nothing outside of those

    two events: P(A or B) = 1.

    .        B                               B'                             Marginal

    A        (A and B are Mutually          .                              .
             Exclusive if this value is 0)

    A'       .                              (A and B are All Inclusive     .
                                            if this value is 0)

    Marginal .                              .                              1.00

    "AND" or Intersections

    Independent Events

    Two events are independent if the occurrence of one does not change the probability

    of the other occurring.

    An example would be rolling a 2 on a die and flipping a head on a coin. Rolling the 2

    does not affect the probability of flipping the head.

    If events are independent, then the probability of them both occurring is the product of

    the probabilities of each occurring.

    Specific Multiplication Rule

    Only valid for independent events

    P(A and B) = P(A) * P(B)

    Example 3:

    P(A) = 0.20, P(B) = 0.70, A and B are independent.

    B B' Marginal

    A 0.14 0.06 0.20

    A' 0.56 0.24 0.80

    Marginal 0.70 0.30 1.00

    The 0.14 is because the probability of A and B is the probability of A times the

    probability of B or 0.20 * 0.70 = 0.14.

    Dependent Events


    If the occurrence of one event does affect the probability of the other occurring, then

    the events are dependent.

    Conditional Probability

    The probability of event B occurring given that event A has already occurred is read "the probability of B given A" and is written: P(B|A)

    General Multiplication Rule

    Always works.

    P(A and B) = P(A) * P(B|A)

    Example 4:

    P(A) = 0.20, P(B) = 0.70, P(B|A) = 0.40

    A good way to think of P(B|A) is that 40% of A is B. 40% of the 20% which was in

    event A is 8%, thus the intersection is 0.08.

    B B' Marginal

    A 0.08 0.12 0.20

    A' 0.62 0.18 0.80

    Marginal 0.70 0.30 1.00

    Independence Revisited

    The following four statements are equivalent

    1. A and B are independent events

    2. P(A and B) = P(A) * P(B)

    3. P(A|B) = P(A)

    4. P(B|A) = P(B)

    The last two are because if two events are independent, the occurrence of one doesn't

    change the probability of the occurrence of the other. This means that the probability

    of B occurring, whether A has happened or not, is simply the probability of B

    occurring.


    Conditional Probability

    Conditional Probability

    Recall that the probability of an event occurring given that another event has already

    occurred is called a conditional probability.

    The probability that event B occurs, given that event A has already occurred is

    P(B|A) = P(A and B) / P(A)

    This formula comes from the general multiplication principle and a little bit of

    algebra.

    Since we are given that event A has occurred, we have a reduced sample space.

    Instead of the entire sample space S, we now have a sample space of A since we know

    A has occurred. So the old rule about being the number in the event divided by the

    number in the sample space still applies. It is the number in A and B (must be in A

    since A has occurred) divided by the number in A. If you then divided numerator and

    denominator of the right hand side by the number in the sample space S, then you

    have the probability of A and B divided by the probability of A.

    Examples

    Example 1:

    The question, "Do you smoke?" was asked of 100 people. Results are shown in the

    table.

    . Yes No Total

    Male 19 41 60

    Female 12 28 40

    Total 31 69 100

    http://people.richland.edu/james/lecture/m113/prob_rules.html

    What is the probability of a randomly selected individual being a male who smokes? This is just a joint probability: the number of "Male and Smoke" divided by the total = 19/100 = 0.19

    What is the probability of a randomly selected individual being a male? This is the total for male divided by the total = 60/100 = 0.60. Since no mention is made of smoking or not smoking, it includes all the cases.

    What is the probability of a randomly selected individual smoking? Again, since no mention is made of gender, this is a marginal probability: the total who smoke divided by the total = 31/100 = 0.31.

    What is the probability of a randomly selected male smoking? This time, you're told that you have a male - think of stratified sampling. What is the probability that the male smokes? Well, 19 males smoke out of 60 males, so

    19/60 = 0.31666...

    What is the probability that a randomly selected smoker is male? This time, you're told that you have a smoker and asked to find the probability that the smoker is also male. There are 19 male smokers out of 31 total smokers, so

    19/31 = 0.6129 (approx)
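All five answers come straight from the counts in the table. As a sketch (not part of the original notes; the variable names are mine), the same work in Python with exact fractions:

```python
from fractions import Fraction

# Counts from the smoking survey table
male_yes, male_no = 19, 41
female_yes, female_no = 12, 28
total = male_yes + male_no + female_yes + female_no  # 100

p_male_and_smokes = Fraction(male_yes, total)                    # 19/100
p_male = Fraction(male_yes + male_no, total)                     # 60/100
p_smokes = Fraction(male_yes + female_yes, total)                # 31/100
p_smokes_given_male = Fraction(male_yes, male_yes + male_no)     # 19/60
p_male_given_smokes = Fraction(male_yes, male_yes + female_yes)  # 19/31
print(p_male_given_smokes, float(p_male_given_smokes))
```

Notice how the two conditional probabilities just swap which total goes in the denominator: the reduced sample space.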

    After that last part, you have just worked a Bayes' Theorem problem. I know you

    didn't realize it - that's the beauty of it. A Bayes' problem can be set up so it appears to

    be just another conditional probability. In this class we will treat Bayes' problems as

    another conditional probability and not involve the large messy formula given in the

    text (and every other text).

    Example 2:

    There are three major manufacturing companies that make a product: Aberations,

    Brochmailians, and Chompielians. Aberations has a 50% market share, and

    Brochmailians has a 30% market share. 5% of Aberations' product is defective, 7% of

    Brochmailians' product is defective, and 10% of Chompieliens' product is defective.

    This information can be placed into a joint probability distribution

    Company Good Defective Total

    Aberations 0.50-0.025 = 0.475 0.05(0.50) = 0.025 0.50

    Brochmailians 0.30-0.021 = 0.279 0.07(0.30) = 0.021 0.30


    Chompieliens 0.20-0.020 = 0.180 0.10(0.20) = 0.020 0.20

    Total 0.934 0.066 1.00

    The percent of the market share for Chompieliens wasn't given, but since the

    marginals must add to be 1.00, they have a 20% market share.

    Notice that the 5%, 7%, and 10% defective rates don't go into the table directly. This

    is because they are conditional probabilities and the table is a joint probability table.

    These defective probabilities are conditional upon which company was given. That is,

    the 7% is not P(Defective), but P(Defective|Brochmailians). The joint probability

    P(Defective and Brochmailians) = P(Defective|Brochmailians) * P(Brochmailians).

    The "good" probabilities can be found by subtraction as shown above, or by

    multiplication using conditional probabilities. If 7% of Brochmailians' product is

    defective, then 93% is good: 0.93(0.30) = 0.279.

    What is the probability a randomly selected product is defective? P(Defective) = 0.066

    What is the probability that a defective product came from Brochmailians?

    P(Brochmailians|Defective) = P(Brochmailians and Defective) / P(Defective) =

    0.021/0.066 = 7/22 = 0.318 (approx).

    Are these events independent? No. If they were, then

    P(Brochmailians|Defective) = 0.318 would have to equal

    P(Brochmailians) = 0.30, but it doesn't. Also, the P(Aberations and

    Defective) = 0.025 would have to be P(Aberations)*P(Defective) =

    0.50*0.066 = 0.033, and it doesn't.

    The second question asked above is a Bayes' problem. Again, my point is, you don't

    have to know Bayes formula just to work a Bayes' problem.

    Bayes' Theorem

    However, just for the sake of argument, let's say that you want to know what Bayes'

    formula is.

    Let's use the same example, but shorten each event to its one letter initial, ie: A, B, C,

    and D instead of Aberations, Brochmailians, Chompieliens, and Defective.


    P(D|B) is not a Bayes problem. This is given in the problem. Bayes' formula finds the

    reverse conditional probability P(B|D).

    It is based on the fact that the given event (D) is made of three parts: the part of D in A, the part of D in

    B, and the part of D in C.

    P(B and D)

    P(B|D) = -----------------------------------------

    P(A and D) + P(B and D) + P(C and D)

    Inserting the multiplication rule for each of these joint probabilities gives

    P(D|B)*P(B)

    P(B|D) = -----------------------------------------

    P(D|A)*P(A) + P(D|B)*P(B) + P(D|C)*P(C)

    However, and I hope you agree, it is much easier to take the joint probability divided

    by the marginal probability. The table does the adding for you and makes the

    problems doable without having to memorize the formulas.
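To see that the table method and Bayes' formula agree, here is a small Python sketch (the dictionaries are my own framing of the problem data, not from the notes):

```python
# Bayes' theorem for the manufacturing example: the table approach
# (joint probability divided by marginal) and the formula give the same answer.
priors = {"A": 0.50, "B": 0.30, "C": 0.20}            # market shares
p_defective_given = {"A": 0.05, "B": 0.07, "C": 0.10}  # defective rates

# Joint probabilities P(company and Defective) -- the "Defective" column
joint = {c: priors[c] * p_defective_given[c] for c in priors}
p_defective = sum(joint.values())                      # the 0.066 marginal

p_b_given_d = joint["B"] / p_defective                 # 0.021 / 0.066
print(round(p_b_given_d, 3))  # 0.318
```

The `sum` over the joint probabilities is exactly the denominator of Bayes' formula; the table simply does that addition for you.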

    Counting Techniques

    Fundamental Theorems

    Every branch of mathematics has its fundamental theorem or theorems.

    Fundamental Theorem of Arithmetic

    Every integer greater than one is either prime or can be expressed as a unique

    product of prime numbers.

    Fundamental Theorem of Algebra

    Every polynomial function in one variable of degree n > 0 has at least one real or complex zero.

    Fundamental Theorem of Linear Programming

    If there is a solution to a linear programming problem, then it will occur at a corner

    point or on a boundary between two or more corner points


    Fundamental Counting Principle

    In a sequence of events, the total number of ways all events can be performed is

    the product of the number of ways each individual event can be performed.

    Factorials

    If n is a positive integer, then

    n! = n (n-1) (n-2) ... (3)(2)(1)

    n! = n (n-1)!

    A special case is 0!

    0! = 1

    Permutations

    A permutation is an arrangement of objects without repetition and where order is

    important.

    Another definition of permutation is the number of arrangements that can be formed.

    Permutations using all the objects

    A permutation of n objects, arranged into one group of size n, without repetition, and

    order being important is:

    nPn = P(n,n) = n!

    Example: Find all permutations of the letters "ABC"

    ABC ACB BAC BCA CAB CBA

    Permutations of some of the objects

    A permutation of n objects, arranged in groups of size r, without repetition, and order

    being important is:

    nPr = P(n,r) = n! / (n-r)!

    The calculator can be used to find the number of such permutations. On the TI-82 or

    TI-83, the permutation key is found under the Math, Probability menu.


    Example: Find all two-letter permutations of the letters "ABC"

    AB AC BA BC CA CB
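The notes use the TI-82's permutation key; the same counts can be checked with Python's standard library (a sketch, not part of the original notes):

```python
from itertools import permutations
from math import factorial, perm

# P(n, r) = n! / (n - r)! -- the two-letter arrangements of "ABC"
print(list(permutations("ABC", 2)))  # 6 ordered pairs: AB AC BA BC CA CB

# math.perm computes the count directly
assert perm(3, 2) == factorial(3) // factorial(3 - 2) == 6
```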

    Shortcut formula for finding a permutation

    Assuming that you start at n and count down to 1 in your factorials ...

    P(n,r) = first r factors of n factorial

    Distinguishable Permutations

    Sometimes letters are repeated and all of the permutations aren't distinguishable from

    each other.

    Example: Find all permutations of the letters "BOB"

    To help you distinguish, I'll write the second "B" as "b"

    BOb BbO OBb ObB bBO bOB

    If you just write "B" as "B", however ...

    BOB BBO OBB OBB BBO BOB

    There are really only three distinguishable permutations here.

    BOB BBO OBB

    If a word has N letters, k of which are unique, and you let n1, n2, n3, ..., nk be the

    frequency of each of the k letters, then the total number of distinguishable

    permutations is given by:

    N! / ( n1! * n2! * ... * nk! )

    Consider the word "STATISTICS":

    Here are the frequencies of each letter: S=3, T=3, A=1, I=2, C=1. There are 10 letters

    total.

    Permutations = 10! / (3! 3! 1! 2! 1!) = 3628800 / (6 * 6 * 1 * 2 * 1) = 50400
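The STATISTICS count can be reproduced in a few lines of Python (a sketch; `word` and `denominator` are my names):

```python
from math import factorial

# Distinguishable permutations of "STATISTICS":
# N! divided by the product of the factorial of each letter's frequency
word = "STATISTICS"
denominator = 1
for letter in set(word):           # S=3, T=3, A=1, I=2, C=1
    denominator *= factorial(word.count(letter))

print(factorial(len(word)) // denominator)  # 50400
```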


    You can find distinguishable permutations using the TI-82.

    Combinations

    A combination is an arrangement of objects without repetition and where order is not

    important.

    Note: The difference between a permutation and a combination is not whether there is

    repetition or not - there must not be repetition with either, and if there is repetition,

    you can not use the formulas for permutations or combinations. The only difference

    in the definition of a permutation and a combination is whether order is

    important.

    A combination of n objects, arranged in groups of size r, without repetition, and order

    not being important is:

    nCr = C(n,r) = n! / ( (n-r)! * r! )

    Another way to write a combination of n things, r at a time, is using the binomial

    coefficient notation:

    ( n )
    ( r )

    Example: Find all two-letter combinations of the letters "ABC"

    AB = BA AC = CA BC = CB

    There are only three two-letter combinations.
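As with permutations, Python's standard library can confirm the combination count (a sketch, not from the notes):

```python
from itertools import combinations
from math import comb, factorial

# C(n, r) = n! / ((n - r)! * r!) -- the two-letter selections of "ABC"
print(list(combinations("ABC", 2)))  # AB, AC, BC -- only 3, since order is ignored

# math.comb computes the count directly
assert comb(3, 2) == factorial(3) // (factorial(1) * factorial(2)) == 3
```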

    Shortcut formula for finding a combination

    Assuming that you start at n and count down to 1 in your factorials ...

    C(n,r) = first r factors of n factorial divided by the last r factors of n factorial

    Pascal's Triangle

    Combinations are used in the binomial expansion theorem from algebra to give the

    coefficients of the expansion (a+b)^n. They also form a pattern known as Pascal's

    Triangle.

    1

    1 1

    1 2 1

    1 3 3 1

    http://people.richland.edu/james/ti82/ti-dper.html

    1 4 6 4 1

    1 5 10 10 5 1

    1 6 15 20 15 6 1

    1 7 21 35 35 21 7 1

    Each element in the table is the sum of the two elements directly above it. Each

    element is also a combination. The n value is the number of the row (start counting at zero) and the r value is the element in the row (start counting at zero). That would

    make the 20 in the next to last row C(6,3) -- it's in the row #6 (7th row) and position #3

    (4th element).
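Since each entry is C(n, r), the triangle can be regenerated directly (a Python sketch, not part of the original notes):

```python
from math import comb

# Row n of Pascal's Triangle is C(n, 0) ... C(n, n)
for n in range(8):
    print(" ".join(str(comb(n, r)) for r in range(n + 1)))

assert comb(6, 3) == 20  # the 20 in row #6, position #3
```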

    Symmetry

    Pascal's Triangle illustrates the symmetric nature of a combination. C(n,r) = C(n,n-r)

    Example: C(10,4) = C(10,6) or C(100,99) = C(100,1)

    Shortcut formula for finding a combination

    Since combinations are symmetric, if n-r is smaller than r, then switch the

    combination to its alternative form and then use the shortcut given above.

    C(n,r) = first r factors of n factorial divided by the last r factors of n factorial

    TI-82

    You can use the TI-82 graphing calculator to find factorials, permutations, and

    combinations.

    Tree Diagrams

    Tree diagrams are a graphical way of listing all the possible

    outcomes. The outcomes are listed in an orderly fashion, so listing all of the possible outcomes is easier than just trying

    to make sure that you have them all listed. It is called a tree

    diagram because of the way it looks.

    http://people.richland.edu/james/ti82/ti-count.html

    The first event appears on the left, and then each sequential event is represented as

    branches off of the first event.

    The tree diagram to the right would show the possible ways of flipping two coins. The

    final outcomes are obtained by following each branch to its conclusion: They are from

    top to bottom:

    HH HT TH TT
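The tree's final outcomes are just the Cartesian product of the per-coin outcomes; a Python sketch (not part of the original notes):

```python
from itertools import product

# Each branch of the tree is one sequence of per-coin outcomes
outcomes = ["".join(flips) for flips in product("HT", repeat=2)]
print(outcomes)  # ['HH', 'HT', 'TH', 'TT']
```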

    Probability Distributions

    Definitions

    Random Variable

    Variable whose values are determined by chance

    Probability Distribution

    The values a random variable can assume and the corresponding probabilities

    of each.

    Expected Value

    The theoretical mean of the variable.

    Binomial Experiment

    An experiment with a fixed number of independent trials. Each trial can only

    have two outcomes, or outcomes which can be reduced to two outcomes. The

    probability of each outcome must remain constant from trial to trial.

    Binomial Distribution

    The outcomes of a binomial experiment with their corresponding

    probabilities.

    Multinomial Distribution


    A probability distribution resulting from an experiment with a fixed number of

    independent trials. Each trial has two or more mutually exclusive outcomes.

    The probability of each outcome must remain constant from trial to trial.

    Poisson Distribution

    A probability distribution used when a density of items is distributed over a

    period of time. The sample size needs to be large and the probability of

    success needs to be small.

    Hypergeometric Distribution

    A probability distribution of a variable with two outcomes when sampling is

    done without replacement.

    Probability Distributions

    Probability Functions

    A probability function is a function which assigns probabilities to the values of a random variable.

    All the probabilities must be between 0 and 1 inclusive. The sum of the probabilities of the outcomes must be 1.

    If these two conditions aren't met, then the function isn't a probability function. There

    is no requirement that the values of the random variable only be between 0 and 1, only

    that the probabilities be between 0 and 1.

    Probability Distributions

    A listing of all the values the random variable can assume with their corresponding

    probabilities makes a probability distribution.

    A note about random variables. A random variable does not mean that the values can

    be anything (a random number). Random variables have a well defined set of


    outcomes and well defined probabilities for the occurrence of each outcome. The

    random refers to the fact that the outcomes happen by chance -- that is, you don't

    know which outcome will occur next.

    Here's an example probability distribution that results from the rolling of a single fair

    die.

    x 1 2 3 4 5 6 sum

    p(x) 1/6 1/6 1/6 1/6 1/6 1/6 6/6=1

    Mean, Variance, and Standard Deviation

    Consider the following.

    The definitions for population mean and variance used with an ungrouped frequency

    distribution were:

    mu = ( sum of x*f ) / N        sigma^2 = ( sum of (x - mu)^2 * f ) / N

    Some of you might be confused by only dividing by N. Recall that this is the

    population variance; the sample variance, which is the unbiased estimator for the

    population variance, is the one divided by n-1.

    Using algebra, this is equivalent to:

    mu = sum of [ x * (f/N) ]      sigma^2 = sum of [ x^2 * (f/N) ] - ( sum of [ x * (f/N) ] )^2

    Recall that a probability is a long term relative frequency. So every f/N can be

    replaced by p(x). This simplifies to be:

    mu = sum of [ x * p(x) ]       sigma^2 = sum of [ x^2 * p(x) ] - ( sum of [ x * p(x) ] )^2

    What's even better is that the last portion of the variance is the mean squared. So, the

    two formulas that we will be using are:

    mu = sum of [ x * p(x) ]       sigma^2 = sum of [ x^2 * p(x) ] - mu^2


    Here's the example we were working on earlier.

    x 1 2 3 4 5 6 sum

    p(x) 1/6 1/6 1/6 1/6 1/6 1/6 6/6 = 1

    x p(x) 1/6 2/6 3/6 4/6 5/6 6/6 21/6 = 3.5

    x^2 p(x) 1/6 4/6 9/6 16/6 25/6 36/6 91/6 = 15.1667

    The mean is 7/2 or 3.5

    The variance is 91/6 - (7/2)^2 = 35/12 = 2.916666...

    The standard deviation is the square root of the variance = 1.7078

    Do not use rounded off values in the intermediate calculations. Only round off the

    final answer.
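The same computation can be done in Python with exact fractions, so nothing gets rounded until the very end (a sketch, not part of the original notes):

```python
from fractions import Fraction

# Probability distribution for one fair die
dist = {x: Fraction(1, 6) for x in range(1, 7)}

mean = sum(x * p for x, p in dist.items())        # 21/6 = 7/2
mean_sq = sum(x**2 * p for x, p in dist.items())  # 91/6
variance = mean_sq - mean**2                      # 91/6 - (7/2)^2 = 35/12
std_dev = float(variance) ** 0.5                  # only round at the end

print(mean, variance, round(std_dev, 4))
```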

    You can learn how to find the mean and variance of a probability distribution using

    lists with the TI-82 or using the program called PDIST.

    Binomial Probabilities

    Binomial Experiment

    A binomial experiment is an experiment which satisfies these four conditions:

    A fixed number of trials

    Each trial is independent of the others

    There are only two outcomes

    http://people.richland.edu/james/ti82/ti-list3.html
    http://people.richland.edu/james/ti82/pdist.html

    The probability of each outcome remains constant from trial to trial.

    These can be summarized as: An experiment with a fixed number of independent

    trials, each of which can only have two possible outcomes.

    A binomial experiment has a fixed number of independent trials, each with only two outcomes.

    The fact that each trial is independent actually means that the probabilities remain

    constant.

    Examples of binomial experiments

    Tossing a coin 20 times to see how many tails occur.

    Asking 200 people if they watch ABC news.

    Rolling a die to see if a 5 appears.

    Asking 500 die-hard Republicans if they would vote for the Democratic

    candidate. (Just because something is unlikely, doesn't mean that it isn't

    binomial. The conditions are met - there's a fixed number [500], the trials are

    independent [what one person does doesn't affect the next person], and

    there are only two outcomes [yes or no].)

    Examples which aren't binomial experiments

    Rolling a die until a 6 appears (not a fixed number of trials)

    Asking 20 people how old they are (not two outcomes)

    Drawing 5 cards from a deck for a poker hand (done without replacement, so

    not independent)

    Binomial Probability Function

    Example:

    What is the probability of rolling exactly two sixes in 6 rolls of a die?

    There are five things you need to do to work a binomial story problem.

    1. Define Success first. Success must be for a single trial. Success = "Rolling a 6 on a single die"

    2. Define the probability of success (p): p = 1/6


    3. Find the probability of failure: q = 5/6

    4. Define the number of trials: n = 6

    5. Define the number of successes out of those trials: x = 2

    Anytime a six appears, it is a success (denoted S) and anytime something else appears,

    it is a failure (denoted F). The ways you can get exactly 2 successes in 6 trials are

    given below. The probability of each is written to the right of the way it could occur.

    Because the trials are independent, the probability of the event (all six dice) is the

    product of each probability of each outcome (die)

    1 FFFFSS 5/6 * 5/6 * 5/6 * 5/6 * 1/6 * 1/6 = (1/6)^2 * (5/6)^4

    2 FFFSFS 5/6 * 5/6 * 5/6 * 1/6 * 5/6 * 1/6 = (1/6)^2 * (5/6)^4

    3 FFFSSF 5/6 * 5/6 * 5/6 * 1/6 * 1/6 * 5/6 = (1/6)^2 * (5/6)^4

    4 FFSFFS 5/6 * 5/6 * 1/6 * 5/6 * 5/6 * 1/6 = (1/6)^2 * (5/6)^4

    5 FFSFSF 5/6 * 5/6 * 1/6 * 5/6 * 1/6 * 5/6 = (1/6)^2 * (5/6)^4

    6 FFSSFF 5/6 * 5/6 * 1/6 * 1/6 * 5/6 * 5/6 = (1/6)^2 * (5/6)^4

    7 FSFFFS 5/6 * 1/6 * 5/6 * 5/6 * 5/6 * 1/6 = (1/6)^2 * (5/6)^4

    8 FSFFSF 5/6 * 1/6 * 5/6 * 5/6 * 1/6 * 5/6 = (1/6)^2 * (5/6)^4

    9 FSFSFF 5/6 * 1/6 * 5/6 * 1/6 * 5/6 * 5/6 = (1/6)^2 * (5/6)^4

    10 FSSFFF 5/6 * 1/6 * 1/6 * 5/6 * 5/6 * 5/6 = (1/6)^2 * (5/6)^4

    11 SFFFFS 1/6 * 5/6 * 5/6 * 5/6 * 5/6 * 1/6 = (1/6)^2 * (5/6)^4

    12 SFFFSF 1/6 * 5/6 * 5/6 * 5/6 * 1/6 * 5/6 = (1/6)^2 * (5/6)^4

    13 SFFSFF 1/6 * 5/6 * 5/6 * 1/6 * 5/6 * 5/6 = (1/6)^2 * (5/6)^4

    14 SFSFFF 1/6 * 5/6 * 1/6 * 5/6 * 5/6 * 5/6 = (1/6)^2 * (5/6)^4

    15 SSFFFF 1/6 * 1/6 * 5/6 * 5/6 * 5/6 * 5/6 = (1/6)^2 * (5/6)^4

    Notice that each of the 15 probabilities is exactly the same: (1/6)^2 * (5/6)^4.

    Also, note that the 1/6 is the probability of success and you needed 2 successes. The

    5/6 is the probability of failure, and if 2 of the 6 trials were success, then 4 of the 6

    must be failures. Note that 2 is the value of x and 4 is the value of n-x.

    Further note that there are fifteen ways this can occur. This is the number of ways 2

    successes can occur in 6 trials without repetition and order not being important, or

    a combination of 6 things, 2 at a time.

    The probability of getting exactly x success in n trials, with the probability of

    success on a single trial being p is:

    P(X=x) = nCx * p^x * q^(n-x)
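A Python sketch of the binomial probability function applied to the two-sixes example (exact fractions; the variable names are mine, not from the notes):

```python
from fractions import Fraction
from math import comb

# P(X = x) = nCx * p^x * q^(n-x): exactly two sixes in six rolls of a die
n, x = 6, 2
p = Fraction(1, 6)
q = 1 - p

prob = comb(n, x) * p**x * q**(n - x)  # 15 * (1/6)^2 * (5/6)^4
print(prob, float(prob))
```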

    Example:

    A coin is tossed 10 times. What is the probability that exactly 6 heads will occur.

    1. Success = "A head is flipped on a single coin"


    2. p = 0.5

    3. q = 0.5

    4. n = 10

    5. x = 6

    P(x=6) = 10C6 * 0.5^6 * 0.5^4 = 210 * 0.015625 * 0.0625 = 0.205078125

    Mean, Variance, and Standard Deviation

    The mean, variance, and standard deviation of a binomial distribution are extremely

    easy to find:

    mu = n*p        sigma^2 = n*p*q        sigma = sqrt(n*p*q)

    Another way to remember the variance is mu*q (since the np is mu).

    Example:

    Find the mean, variance, and standard deviation for the number of sixes that appear

    when rolling 30 dice.

    Success = "a six is rolled on a single die". p = 1/6, q = 5/6.

    The mean is 30 * (1/6) = 5. The variance is 30 * (1/6) * (5/6) = 25/6. The standard

    deviation is the square root of the variance = 2.041241452 (approx)
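The dice example can be verified with a few lines of Python, using the mean = np and variance = npq formulas above:

```python
from math import sqrt

n, p = 30, 1/6            # 30 dice, success = "a six is rolled"
q = 1 - p

mean = n * p              # 5.0
variance = n * p * q      # 25/6, about 4.1667
std_dev = sqrt(variance)  # about 2.0412
print(mean, variance, std_dev)
```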

    Other Discrete Distributions

    Multinomial Probabilities

    A multinomial experiment is an extended binomial probability. The difference is that

    in a multinomial experiment, there are more than two possible outcomes. However,

    there are still a fixed number of independent trials, and the probability of each

    outcome must remain constant from trial to trial.


    Instead of using a combination, as in the case of the binomial probability, the number of ways the outcomes can occur is counted using distinguishable permutations.

    An example here will be much more useful than a formula.

    The probability that a person will pass a College Algebra class is 0.55, the probability that a person will withdraw before the class is completed is 0.40, and the probability that a person will fail the class is 0.05. Find the probability that in a class of 30 students, exactly 16 pass, 12 withdraw, and 2 fail.

    Outcome x p(outcome)

    Pass 16 0.55

    Withdraw 12 0.40

    Fail 2 0.05

    Total 30 1.00

    The probability is found using this formula:

              30!
    P = ---------------- * 0.55^16 * 0.40^12 * 0.05^2
       (16!) (12!) (2!)

    You can do this on the TI-82.
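The same formula can be sketched in Python (the notes use a TI-82); the function name is mine:

```python
from math import factorial

def multinomial_prob(counts, probs):
    """n! / (x1! x2! ... xk!) * p1^x1 * p2^x2 * ... * pk^xk"""
    n = sum(counts)
    coef = factorial(n)
    for x in counts:
        coef //= factorial(x)   # exact integer division at each step
    prob = coef
    for x, p in zip(counts, probs):
        prob *= p**x
    return prob

# 16 pass, 12 withdraw, 2 fail out of 30 students
print(multinomial_prob([16, 12, 2], [0.55, 0.40, 0.05]))  # about 0.0389
```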

    The multinomial experiment will be used later when we talk about the chi-square

    goodness of fit test.

    Poisson Probabilities

    Named after the French mathematician Simeon Poisson, Poisson probabilities are

    useful when there are a large number of independent trials with a small probability of success on a single trial and the variables occur over a period of time. It can also be used when a density of items is distributed over a given area or volume.

    http://people.richland.edu/james/ti82/ti-mult.html

    The formula is p(x; lambda) = e^(-lambda) * lambda^x / x!. Lambda in the formula is the mean number of occurrences. If you're approximating a binomial probability using the Poisson, then lambda is the same as mu or n * p.

    Example:

    If there are 500 customers per eight-hour day in a check-out lane, what is the probability that there will be exactly 3 in line during any five-minute period?

    The expected value during any one five minute period would be 500 / 96 =

    5.2083333. The 96 is because there are 96 five-minute periods in eight hours. So, you

    expect about 5.2 customers in 5 minutes and want to know the probability of getting

    exactly 3.

    p(3;500/96) = e^(-500/96) * (500/96)^3 / 3! = 0.1288 (approx)
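The Poisson calculation above, as a minimal Python sketch (function name is mine):

```python
from math import exp, factorial

def poisson_prob(x, lam):
    """p(x; lambda) = e^(-lambda) * lambda^x / x!"""
    return exp(-lam) * lam**x / factorial(x)

lam = 500 / 96   # expected customers per five-minute period
print(poisson_prob(3, lam))  # about 0.1288
```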

    Hypergeometric Probabilities

    Hypergeometric experiments occur when the trials are not independent of each other

    and occur due to sampling without replacement -- as in a five card poker hand.

    Hypergeometric probabilities involve the multiplication of two combinations together

    and then division by the total number of combinations.

    Example:

    What is the probability of selecting 3 men and 4 women when 7 people are chosen from a group of 7 men and 10 women?

    The answer is 7C3 * 10C4 / 17C7 = 7350/19448 = 0.3779 (approx)

    Note that the sum of the numbers in the numerator are the numbers used in the

    combination in the denominator.

    This can be extended to more than two groups and called an extended hypergeometric

    problem.

    You can use the TI-82 to find hypergeometric probabilities.
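In Python, the same hypergeometric computation is a one-liner with `math.comb`:

```python
from math import comb

# P(3 men and 4 women when 7 people are drawn from 7 men and 10 women)
prob = comb(7, 3) * comb(10, 4) / comb(17, 7)
print(prob)  # 7350/19448, about 0.3779
```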

    http://people.richland.edu/james/ti82/hypergeo.html

    Normal Distribution

    Definitions

    Central Limit Theorem

    Theorem which states that as the sample size increases, the sampling distribution of the sample means will become approximately normally distributed.

    Correction for Continuity

    A correction applied to convert a discrete distribution to a continuous

    distribution.

    Finite Population Correction Factor

    A correction applied to the standard error of the means when the sample size

    is more than 5% of the population size and the sampling is done without

    replacement.

    Sampling Distribution of the Sample Means

    Distribution obtained by using the means computed from random samples of

    a specific size.

    Sampling Error

    Difference which occurs between the sample statistic and the population

    parameter due to the fact that the sample isn't a perfect representation of the

    population.

    Standard Error of the Mean


    The standard deviation of the sampling distribution of the sample means. It is

    equal to the standard deviation of the population divided by the square root

    of the sample size.

    Standard Normal Distribution

    A normal distribution in which the mean is 0 and the standard deviation is 1. It

    is denoted by z.

    Z-score

    Also known as z-value. A standardized score in which the mean is zero and the

    standard deviation is 1. The Z score is used to represent the standard normal

    distribution.

    Normal Distributions

    Any Normal Distribution

    - Bell-shaped
    - Symmetric about the mean
    - Continuous
    - Never touches the x-axis
    - Total area under the curve is 1.00
    - Approximately 68% lies within 1 standard deviation of the mean, 95% within 2 standard deviations, and 99.7% within 3 standard deviations of the mean. This is the Empirical Rule mentioned earlier.
    - Data values represented by x, which has mean mu and standard deviation sigma.


    Probability Function given by

    y = e^( -(x - mu)^2 / (2 sigma^2) ) / ( sigma * sqrt(2 pi) )

    Standard Normal Distribution

    Same as a normal distribution, but also ...

    - Mean is zero
    - Variance is one
    - Standard Deviation is one
    - Data values represented by z.

    Probability Function given by

    y = e^( -z^2 / 2 ) / sqrt(2 pi)

    Standard Normal: Mean = 0 and Variance = 1
    Non-Standard Normal: Mean is not 0 or Variance is not 1

    Normal Probabilities

    This table has not been verified against the book, please use the table out of your

    textbook.

    Comprehension of this table is vital to success in the course!

    There is a table which must be used to look up standard normal probabilities. The z-score is broken into two parts: the whole number and tenths are looked up along the left side, and the hundredths are looked up across the top. The value at the intersection of the row and column is the area under the curve between zero and the z-score looked up.

    Because of the symmetry of the normal distribution, look up the absolute value of any

    z-score.

    http://people.richland.edu/james/lecture/m113/z_table.html

    Computing Normal Probabilities

    There are several different situations that can arise when asked to find normal

    probabilities.

    Between zero and any number:
        Look up the area in the table.

    Between two positives, or between two negatives:
        Look up both areas in the table and subtract the smaller from the larger.

    Between a negative and a positive:
        Look up both areas in the table and add them together.

    Less than a negative, or greater than a positive:
        Look up the area in the table and subtract from 0.5000.

    Greater than a negative, or less than a positive:
        Look up the area in the table and add to 0.5000.

    This can be shortened into two rules.

    1. If there is only one z-score given, use 0.5000 for the second area; otherwise look up both z-scores in the table.

    2. If the two numbers are the same sign, then subtract; if they are different signs, then add. If there is only one z-score, then use the inequality to determine the second sign (< is negative, and > is positive).
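These rules can be checked without a printed table. Python's standard library has `statistics.NormalDist` (Python 3.8+), and the table's area P(0 < Z < z) is just cdf(z) - 0.5. A sketch (variable names are mine):

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def table_area(z):
    """Area between zero and z, as printed in the table."""
    return Z.cdf(abs(z)) - 0.5

# Between two positives: subtract the smaller area from the larger
p1 = table_area(1.96) - table_area(1.00)   # P(1.00 < Z < 1.96)

# Between a negative and a positive: add the two areas
p2 = table_area(-1.00) + table_area(1.96)  # P(-1.00 < Z < 1.96)

# Greater than a positive: subtract the area from 0.5000
p3 = 0.5 - table_area(1.96)                # P(Z > 1.96), about 0.0250
print(p1, p2, p3)
```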

    Finding z-scores from probabilities

    This is more difficult, and requires you to use the table inversely. You must look up the area between zero and the value on the inside part of the table, and then read the z-score from the outside. Finally, decide if the z-score should be positive or negative, based on whether it was on the left side or the right side of the mean. Remember, z-scores can be negative, but areas or probabilities cannot be.


    Area between 0 and a value:
        Look up the area in the table.
        Make negative if on the left side.

    Area in one tail:
        Subtract the area from 0.5000.
        Look up the difference in the table.
        Make negative if in the left tail.

    Area including one complete half (less than a positive or greater than a negative):
        Subtract 0.5000 from the area.
        Look up the difference in the table.
        Make negative if on the left side.

    Within z units of the mean:
        Divide the area by 2.
        Look up the quotient in the table.
        Use both the positive and negative z-scores.

    Two tails with equal area (more than z units from the mean):
        Subtract the area from 1.0000.
        Divide the area by 2.
        Look up the quotient in the table.
        Use both the positive and negative z-scores.

    You become proficient with the table through practice, so work lots of the normal probability problems!
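The inverse lookup can also be checked in code. `statistics.NormalDist.inv_cdf` inverts the full cumulative distribution, so add 0.5 to convert a table-style area (measured from zero) into a cumulative probability. A sketch:

```python
from statistics import NormalDist

Z = NormalDist()

# Area of 0.4772 between zero and z (right side): z should be about 2.00
z = Z.inv_cdf(0.5 + 0.4772)
print(round(z, 2))

# Two tails with total area 0.05: subtract from 1, divide by 2, use +/- z
z_tail = Z.inv_cdf(0.5 + (1 - 0.05) / 2)
print(round(z_tail, 2))  # about 1.96
```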

    Standard Normal Probabilities

    z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09

    0.0 0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279 0.0319 0.0359

    0.1 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753

    0.2 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141


    0.3 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517

    0.4 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879

    0.5 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224

    0.6 0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486 0.2517 0.2549

    0.7 0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794 0.2823 0.2852

    0.8 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133

    0.9 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389

    1.0 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621

    1.1 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830

    1.2 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015

    1.3 0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177

    1.4 0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319

    1.5 0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4429 0.4441

    1.6 0.4452 0.4463 0.4474 0.4484 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545

    1.7 0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633

    1.8 0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706

    1.9 0.4713 0.4719 0.4726 0.4732 0.4738 0.4744 0.4750 0.4756 0.4761 0.4767

    2.0 0.4772 0.4778 0.4783 0.4788 0.4793 0.4798 0.4803 0.4808 0.4812 0.4817


    2.1 0.4821 0.4826 0.4830 0.4834 0.4838 0.4842 0.4846 0.4850 0.4854 0.4857

    2.2 0.4861 0.4864 0.4868 0.4871 0.4875 0.4878 0.4881 0.4884 0.4887 0.4890

    2.3 0.4893 0.4896 0.4898 0.4901 0.4904 0.4906 0.4909 0.4911 0.4913 0.4916

    2.4 0.4918 0.4920 0.4922 0.4925 0.4927 0.4929 0.4931 0.4932 0.4934 0.4936

    2.5 0.4938 0.4940 0.4941 0.4943 0.4945 0.4946 0.4948 0.4949 0.4951 0.4952

    2.6 0.4953 0.4955 0.4956 0.4957 0.4959 0.4960 0.4961 0.4962 0.4963 0.4964

    2.7 0.4965 0.4966 0.4967 0.4968 0.4969 0.4970 0.4971 0.4972 0.4973 0.4974

    2.8 0.4974 0.4975 0.4976 0.4977 0.4977 0.4978 0.4979 0.4979 0.4980 0.4981

    2.9 0.4981 0.4982 0.4982 0.4983 0.4984 0.4984 0.4985 0.4985 0.4986 0.4986

    3.0 0.4987 0.4987 0.4987 0.4988 0.4988 0.4989 0.4989 0.4989 0.4990 0.4990

    The values in the table are the areas between zero and the z-score. That is, P(0 < Z < z).


    Example 1: Sampling Distribution of Values (x)

    Consider the case where a single, fair die is rolled.

    Here are the values that are possible and their probabilities.

    Value 1 2 3 4 5 6

    Probability 1/6 1/6 1/6 1/6 1/6 1/6

    Here are the mean, variance, and standard deviation of this probability distribution.

    Mean, mu = sum [ x * p(x) ] = 3.5

    Variance, sigma^2 = sum [ x^2 * p(x) ] - mu^2 = 35/12

    Standard deviation, sigma = sqrt ( variance ) = sqrt ( 35/12 )

    Example 2: Sampling Distribution of Sample Means (x-bar)

    Consider the case where two fair dice are rolled instead of one.

    Here are the sums that are possible and their probabilities.

    Sum 2 3 4 5 6 7 8 9 10 11 12

    Prob 1/36 2/36 3/36 4/36 5/36 6/36 5/36 4/36 3/36 2/36 1/36

    But, we're not interested in the sum of the dice, we're interested in the sample mean.

    We find the sample mean by dividing the sum by the sample size.

    Mean 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0

    Prob 1/36 2/36 3/36 4/36 5/36 6/36 5/36 4/36 3/36 2/36 1/36

    Computing the mean, variance, and standard deviation, we get ...

    Mean, mu = sum [ x * p(x) ] = 3.5

    Variance, sigma^2 = sum [ x^2 * p(x) ] - mu^2 = 35/24

    Standard deviation, sigma = sqrt ( variance ) = sqrt ( 35/24 )
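Both examples can be reproduced exactly with Python's `fractions` module; note how the variance of the sample means is the single-die variance divided by the sample size:

```python
from itertools import product
from fractions import Fraction

faces = range(1, 7)

def stats(values):
    """Mean and variance of a list of equally likely outcomes."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return mean, var

# Example 1: a single die
mean1, var1 = stats([Fraction(v) for v in faces])
print(mean1, var1)   # 7/2 and 35/12

# Example 2: the sample mean of two dice
means = [Fraction(a + b, 2) for a, b in product(faces, repeat=2)]
mean2, var2 = stats(means)
print(mean2, var2)   # 7/2 and 35/24 (the population variance divided by 2)
```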


    Properties of the Sampling Distribution of the Sample Means

    When all of the possible sample means are computed, then the following properties

    are true:

    - The mean of the sample means will be the mean of the population.
    - The variance of the sample means will be the variance of the population divided by the sample size.
    - The standard deviation of the sample means (known as the standard error of the mean) will be smaller than the population standard deviation and will be equal to the standard deviation of the population divided by the square root of the sample size.
    - If the population has a normal distribution, then the sample means will have a normal distribution.
    - If the population is not normally distributed, but the sample size is sufficiently large, then the sample means will have an approximately normal distribution. Some books define sufficiently large as at least 30 and others as at least 31.

    The formula for a z-score when working with the sample means is:

    z = ( x-bar - mu ) / ( sigma / sqrt(n) )

    Finite Population Correction Factor

    If the sample size is more than 5% of the population size and the sampling is done

    without replacement, then a correction needs to be made to the standard error of the

    means.

    In the following, N is the population size and n is the sample size. The adjustment is to multiply the standard error by the square root of the quotient of the difference between the population and sample sizes and one less than the population size; that is, by sqrt( (N - n) / (N - 1) ).

    For the most part, we will be ignoring this in class.
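A small Python sketch of the corrected standard error, under the rule stated above (function name is mine):

```python
from math import sqrt

def standard_error(sigma, n, N=None):
    """Standard error of the mean. The finite population correction
    is applied when N is given, the sampling is without replacement,
    and the sample is more than 5% of the population."""
    se = sigma / sqrt(n)
    if N is not None and n > 0.05 * N:
        se *= sqrt((N - n) / (N - 1))
    return se

print(standard_error(10, 100))        # 1.0 (no correction)
print(standard_error(10, 100, 1000))  # about 0.9492 (10% sample)
```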


    Normal Approximation to Binomial

    Recall that according to the Central Limit Theorem, the sample mean of any distribution will become approximately normal if the sample size is sufficiently large.

    It turns out that the binomial distribution can be approximated using the normal distribution if np and nq are both at least 5. Furthermore, recall that the mean of a binomial distribution is np and the variance of the binomial distribution is npq.

    Continuity Correction Factor

    There is a problem with approximating the binomial with the normal. That problem

    arises because the binomial distribution is a discrete distribution while the normal distribution is a continuous distribution. The basic difference here is that with discrete

    values, we are talking about heights but no widths, and with the continuous

    distribution we are talking about both heights and widths.

    The correction is to either add or subtract 0.5 of a unit from each discrete x-value.

    This fills in the gaps to make it continuous. This is very similar to the expanding of limits to form boundaries that we did with grouped frequency distributions.

    Examples

    Discrete Continuous

    x = 6 5.5 < x < 6.5

    x > 6 x > 6.5

    x >= 6 x > 5.5

    x < 6 x < 5.5

    x <= 6 x < 6.5


    As you can see, whether or not the equal to is included makes a big difference in the

    discrete distribution and the way the conversion is performed. However, for a

    continuous distribution, equality makes no difference.

    Steps to working a normal approximation to the binomial distribution

    1. Identify success, the probability of success, the number of trials, and the desired number of successes. Since this is a binomial problem, these are the same things which were identified when working a binomial problem.

    2. Convert the discrete x to a continuous x. Some people would argue that step 3 should be done before this step, but go ahead and convert the x before you forget about it and miss the problem.

    3. Find the smaller of np or nq. If the smaller one is at least five, then the larger must also be, so the approximation will be considered good. When you find np, you're actually finding the mean, mu, so denote it as such.

    4. Find the standard deviation, sigma = sqrt(npq). It might be easier to find the variance and just stick the square root in the final calculation - that way you don't have to work with all of the decimal places.

    5. Compute the z-score using the standard formula for an individual score (not the one for a sample mean).

    6. Calculate the probability desired.
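The steps above can be sketched in Python for a hypothetical example (20 tosses of a fair coin, exactly 10 heads), comparing the approximation against the exact binomial answer:

```python
from math import comb, sqrt
from statistics import NormalDist

# Normal approximation to P(X = 10) for 20 tosses of a fair coin
n, p = 20, 0.5
q = 1 - p
mu = n * p               # step 3: np = 10 (nq = 10 too, both at least 5)
sigma = sqrt(n * p * q)  # step 4: sqrt(npq), about 2.236

# step 2: continuity correction; discrete x = 10 becomes 9.5 < x < 10.5
approx = NormalDist(mu, sigma).cdf(10.5) - NormalDist(mu, sigma).cdf(9.5)

exact = comb(20, 10) * 0.5**20  # true binomial probability, about 0.1762
print(approx, exact)
```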

    Importance of the Normal Distribution

    Parametric Hypothesis Testing

    All parametric hypothesis testing that we're going to perform requires normality in

    some sense.

    Population Mean

    Either the population was normally distributed, the sample size was large

    enough (so the central limit theorem applied and was approximately normal),

    or the population was approximately normal and the student's t was used.

    Population Proportion


    The binomial distribution (the one that really applies) was approximated using

    the normal as long as np and nq were at least five. That is another way of

    saying the expected frequency of each category (success and failure) is at least

    five.

    Population Variance

    It was required that the population be normally distributed.

    Correlation and Regression

    The pairs of data had to have a bi-variate normal distribution.

    Multinomial Experiment

    The expected frequency of each category had to be at least five. This is

    analogous to approximating the binomial using the normal.

    Independence

    The expected frequency of each cell had to be at least five. This is analogous

    to approximating the binomial using the normal.

    Distributions

    The distributions have normality in them somewhere, too.

    Normal Distribution

    Well, obviously this one requires normality.

    Student's T Distribution

    Had to be approximately normal. As the sample size increases, the student's t

    approaches the normal distribution.

    Chi-squared Distribution

    Required a normal population. There is another interesting relationship

    between the normal and chi-square distributions. If you take a critical value

    from normal distribution and square it, you will get the corresponding chi-


    square value with one degree of freedom, but twice the area in the tails.

    Example: z(0.05)^2 = 1.645^2 = 2.706 = chi-square(1, 0.10)
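This relationship is easy to verify in Python using `statistics.NormalDist`:

```python
from statistics import NormalDist

z = NormalDist().inv_cdf(0.95)       # z with 0.05 in the upper tail
print(round(z, 3), round(z * z, 3))  # 1.645 and 2.706
```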

    F Distribution

    Since F is the ratio of two independent chi-squared variables divided by their

    respective degrees of freedom, and the chi-squares require a normal

    distribution, then the F distribution is also going to require a normal

    distribution.

    Binomial Distribution

    Obviously, the binomial doesn't require a normal population, but it can be

    approximated using a normal distribution if the expected frequency of each

    category is at least five.

    Multinomial Distribution

    Same as with the binomial, the multinomial can be approximated using the

    normal if the expected frequency of each category is at least five.

    As stated in class and in the lecture notes ... your comprehension of the normal

    distribution is vital for success in the class.