Recitation 3

Page 1: Recitation 3

+Recitation 3

Page 2: Recitation 3

+The Normal Distribution

Page 3: Recitation 3

+Probability Distributions

A probability distribution is a table or an equation that links each outcome of a statistical experiment with its probability of occurrence.

Page 4: Recitation 3

+Distributions fit with different types of variables:

Discrete variables: take on a countable number of values
     -the number of job classifications in an agency
     -the number of employees in a department
     -the number of training sessions

Continuous variables: take on an uncountable (or extremely large) range of numerical values
     -temperature
     -pressure
     -height, weight, time
     -dollars: budgets, income (not strictly continuous, but they can take so many closely spaced values that you may as well treat them as continuous)

Page 5: Recitation 3

+Visualized Discrete vs. Continuous

The difference: on the discrete histogram you can ask "what is the probability of a 3?" and read the answer straight off the bar (here it is .4). For a continuous variable you have to ask about a range instead, such as between 1.1 and 2.9, and take the area under the curve over that range (integrate).
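A minimal Python sketch of this contrast. The discrete table is made up for illustration (only P(3) = 0.4 comes from the slide), and the continuous variable is assumed to be normal with mean 2 and SD 1, which the slide does not specify.

```python
from statistics import NormalDist

# Hypothetical discrete distribution; only P(3) = 0.4 is taken from the slide.
discrete_probs = {1: 0.1, 2: 0.3, 3: 0.4, 4: 0.2}

# Discrete: the probability of one exact value is read straight off the table.
p_three = discrete_probs[3]

# Continuous: assume (for illustration only) a normal curve with mean 2, SD 1.
curve = NormalDist(mu=2, sigma=1)

# The probability of a range is the area under the curve over that range,
# computed here as a difference of cumulative probabilities.
p_range = curve.cdf(2.9) - curve.cdf(1.1)

print(f"P(X = 3) = {p_three}")
print(f"P(1.1 < X < 2.9) = {p_range:.4f}")
```

For the continuous variable, the probability of any single exact value is zero, which is why a range is always needed.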

Page 6: Recitation 3

+Real life normal distribution

Page 7: Recitation 3

+The Normal Distribution Characteristics

-Continuous variables only

-The bell curve shape is familiar

-Most values cluster around the mean (mu)

-As values fall at a greater distance from the mean, their likelihood of occurring shrinks

-Its shape is completely determined by its mean and its standard deviation

-The height of the curve is greatest at the mean (where values are most likely to occur)

Page 8: Recitation 3

+-68.26% of values fall within one standard deviation of the mean in either direction

-95.44% of values fall within 2 standard deviations of the mean in either direction

-99.74% of values fall within 3 standard deviations of the mean in either direction
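These percentages can be checked with a short Python sketch using the standard library's `statistics.NormalDist`. The helper name `share_within` is made up for this example. The results can differ in the second decimal place from figures built by doubling four-digit z-table entries, because printed tables are rounded.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Share of a normal distribution within k SDs of the mean, in either direction."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k) * 100:.2f}%")
```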

Page 9: Recitation 3

+z scores

• A z score is the number of standard deviations a score of interest lies away from the mean in a normal distribution

• It is used to convert raw data into their associated probability of occurrence with reference to the mean

• The score we are interested in is X. To find the z score of X, subtract the mean (mu) from it, then divide by the standard deviation (sigma): z = (X - mu) / sigma. The result is how many SDs the score is from the mean
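As a small sketch, the calculation can be written as a Python function. The function name `z_score` and the sample numbers are made up for illustration.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

# Hypothetical values chosen only for illustration (mean 70, SD 10).
print(z_score(85, mu=70, sigma=10))   # a score above the mean gives a positive z
print(z_score(55, mu=70, sigma=10))   # a score below the mean gives a negative z
```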

Page 10: Recitation 3

+z scores

• The z score itself equals the number of SDs (sigma) that a score of interest (X) is from the mean (mu) in a normal distribution

• A data value X one standard deviation above the mean has a z score of 1

• A data value X 2 SDs above the mean has a z score of 2

• The probability associated with a z score of 1 is 0.3413: (68.26/2) = 34.13% of the data values lie between the mean and 1 SD above it

Page 11: Recitation 3

+z scores

• The z score for 1 SD below the mean is -1.0; its associated probability has the same magnitude (0.3413) as for a z score of +1.0

• Thus, the region between the mean and a z score of -1.0 contains 34.13% of the data, i.e. just over one third of the data fall between mu and 1 SD below it

Page 12: Recitation 3

+Example: What is the likelihood that a value has a z score of 2.0?

Page 13: Recitation 3

+Example: What is the likelihood that a value has a z score of 2.0?

It is equal to 95.44/2 = 47.72% (meaning just over 47% of the data fall between mu and 2 SD above it)

Page 14: Recitation 3

+The normal distribution table

• Displays the proportion of data values falling between the mean mu and each z score

• The first 2 digits of the z score are in the far left column

• The third digit is on the top row

• The associated probability is where they meet

Page 15: Recitation 3

+Locating z score for 1.0

Page 16: Recitation 3

+Example

What percent of the data lie between mu and 1.33 SD away from it?

Page 17: Recitation 3

+Locate 1.33 on the z table

The answer is that 40.824% of the data fall between the mean and 1.33 SD from the mean
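A z-table lookup can be reproduced in Python as a check. This sketch assumes the table reports the area between the mean and z, as described earlier; the helper name `area_mean_to_z` is made up for this example.

```python
from statistics import NormalDist

def area_mean_to_z(z):
    """Area under the standard normal curve between the mean (z = 0) and z."""
    return abs(NormalDist().cdf(z) - 0.5)

# z scores used so far in this recitation
for z in (1.0, 1.33, 2.0, -1.0):
    print(f"z = {z:+.2f}: {area_mean_to_z(z):.4f}")
```

Because the curve is symmetric, a negative z score gives the same area as the matching positive one.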

Page 18: Recitation 3

+Application Example:

The police chief is reviewing the academy’s exam scores. The police department’s entrance exam has a normal distribution with a mean of 100 and SD of 10. Someone scored 119.2 on the exam. Is this a good score?

Page 19: Recitation 3

+Solution

-Another way of asking this is: what is the probability that a randomly chosen applicant takes the test and scores 119.2 or better?

-If the probability is high, then it is an average or mediocre score; if the probability is low, then it is an exceptional score

-Step 1: convert the test score to a z score using the formula z = (X - mu) / sigma:

(119.2-100)/10=1.92

Page 20: Recitation 3

+Solution

-Step 2: Look up the z score of 1.92 in the z table. (The z score is positive, so the score is 1.92 standard deviations above the mean.)

Page 21: Recitation 3

+Solution

-Step 3: The value in the table is .4726

-But you’re not done. What you just found is the share of scores between the mean and 119.2

-We also need to add in the other half of the curve, that is, all of the scores under the mean (.50)

-(0.5+0.4726=) 0.9726, so 97.26 is the percentile; in other words, 97.26% of the scores fall below this score

-The probability that a randomly selected individual will get this score or better is 1-0.9726=.0274
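The steps of this solution can be checked with a short Python sketch. It uses the mean (100), SD (10), and score (119.2) from the example; the variable names are made up for illustration.

```python
from statistics import NormalDist

exam = NormalDist(mu=100, sigma=10)   # exam scores from the example
score = 119.2

z = (score - exam.mean) / exam.stdev          # Step 1: z score
area_mean_to_score = exam.cdf(score) - 0.5    # Step 2: the area a z table reports
percentile = exam.cdf(score)                  # Step 3: add the half below the mean
p_this_or_better = 1 - percentile             # chance of this score or better

print(f"z = {z:.2f}")
print(f"area between mean and score = {area_mean_to_score:.4f}")
print(f"percentile = {percentile * 100:.2f}")
print(f"P(score or better) = {p_this_or_better:.4f}")
```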

Page 22: Recitation 3

+Tips

• Always draw a picture; it helps you reason through your answer

• The z curve is symmetric, so if your z score were -1.92, the area between it and the mean would still contain ~47.26% of the data

Page 23: Recitation 3

+The Binomial Distribution

The last section, I promise.

Page 24: Recitation 3

+A gem from the reading

Page 25: Recitation 3

+Probability Distributions

A probability distribution is a table or an equation that links each outcome of a statistical experiment with its probability of occurrence.

Page 26: Recitation 3

+Binomial Distribution Definition

• The probability an event will occur a specified number of times within a specified number of trials

• Examples:
     -mail will be delivered before a certain time every day this week
     -equipment in a factory remains operational in a 10 day period

• This is a DISCRETE distribution that deals with the likelihood of observing a certain number of events in a set number of repeated trials

Page 27: Recitation 3

+A Bernoulli process

• The Binomial distribution can be used when the process is Bernoulli

• Bernoulli characteristics:
     -The outcome of a trial is either a success or a failure
     -The outcomes are mutually exclusive
     -The probability of a success is constant from trial to trial
     -One trial’s probability of success is not affected by the trial before it (INDEPENDENCE)

• Examples of independent events: multiple coin tosses; a fire occurring in a community isn’t affected by whether one happened the night before

Page 28: Recitation 3

+When looking at Bernoulli Events

You can calculate their probability with the binomial distribution

• Examples of Bernoulli events:

• A coin flip is either heads or not heads

• A crime is either solved or not solved

Page 29: Recitation 3

+To calculate a probability using the Binomial Distribution you need

• n=number of trials

• r=number of successes

• p=probability that the event will be a success

• q=(1-p), the probability that the event will not be a success

Page 30: Recitation 3

+Breaking down the formula

C(n, r) * p^r * q^(n-r)

C(n, r) is a combination; it is read, “a combination of n THINGS taken r at a time”

The formula is: C(n, r) = n!/(r!(n-r)!)

Page 31: Recitation 3

+Example

We flip a coin three times, and we want to know the probability of getting three heads

Page 32: Recitation 3

+Step 1

Define n, p, r, and q

n (number of trials) = 3

r (successes) = 3 [number of heads]

p (probability of getting a heads on a flip) = 0.5

q (1-p) = 0.5

Now fill in the formula:

(3!/3!0!) * 0.5^3 * 0.5^0 = 0.125

Page 33: Recitation 3

+Important when solving

• 0!=1

• Any nonzero number raised to the power of 0 = 1
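The coin example can be worked in Python as a sketch. It assumes the binomial formula used in this recitation, (n!/r!(n-r)!) * p^r * q^(n-r); the function name `binomial_probability` is made up for this example. `math.comb` handles the combination, including the 0! = 1 case.

```python
from math import comb

def binomial_probability(n, r, p):
    """Probability of exactly r successes in n independent Bernoulli trials."""
    q = 1 - p
    return comb(n, r) * p**r * q**(n - r)

# Three heads in three flips of a fair coin: n = 3, r = 3, p = 0.5.
print(binomial_probability(3, 3, 0.5))   # 0.125
```

Note that q is raised to the power 0 here, which is where the "power of 0 = 1" rule comes in.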

Page 34: Recitation 3

+Example 2

A Public Works department has been charged with discrimination. Last year, 40% of people who passed the civil service exam were minorities (eligible to be hired). From this group, Public Works hired 10 people, and 2 were minorities. What is the probability that, if Public Works DID NOT discriminate, it still would have hired 2 or fewer minorities? (assuming everyone had the same probability of getting hired)

Page 35: Recitation 3

+Step 1

Identify n, p, r, and q

• n (number of trials = number of people hired) = 10

• r (successes) = 2 [number of hired minorities]

• p (probability that a hire is a minority = % of minorities in the pool) = 0.4

• q (1-p) = 0.6

Page 36: Recitation 3

+Step 2

Reason through the problem. It asks the likelihood that Public Works hired 2 or fewer minorities. Thus, we need to calculate the binomial probability for 2, 1, and 0 minority hires, and then add them.

Page 37: Recitation 3

+Step 3

Set up the probability calculations:

Two minorities:

n (number of trials = number of people hired) = 10

r (successes) = 2 [number of hired minorities]

p (probability that a hire is a minority = % of minorities in the pool) = 0.4

q (1-p) = 0.6

(10!/2!8!) * 0.4^2 * 0.6^8 = 0.12

Page 38: Recitation 3

+Formula reminder

Binomial: C(n, r) * p^r * q^(n-r)

Combinations: C(n, r) = n!/(r!(n-r)!)

Page 39: Recitation 3

+Step 4

Repeat for one minority hired

One minority:

n (number of trials = number of people hired) = 10

r (successes) = 1 [number of hired minorities]

p (probability that a hire is a minority = % of minorities in the pool) = 0.4

q (1-p) = 0.6

(10!/1!9!) * 0.4^1 * 0.6^9 = 0.04

Page 40: Recitation 3

+Step 5

Repeat for 0 minorities hired

No minorities at all:

n (number of trials = number of people hired) = 10

r (successes) = 0 [number of hired minorities]

p (probability that a hire is a minority = % of minorities in the pool) = 0.4

q (1-p) = 0.6

(10!/0!10!) * 0.4^0 * 0.6^10 = 0.006

Page 41: Recitation 3

+Step 6

Add these probabilities together: 0.12 + 0.04 + 0.006 = 0.166

The likelihood of hiring 2 or fewer minorities by chance is about 16.6%
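The whole calculation can be checked with a short Python sketch, using n = 10 and p = 0.4 from the example; the variable names are made up for illustration. Because it sums unrounded terms, its total comes out slightly larger than the hand total built from rounded values.

```python
from math import comb

n, p = 10, 0.4   # 10 hires; 40% of the eligible pool were minorities
q = 1 - p

total = 0.0
for r in (2, 1, 0):
    # Binomial probability of exactly r minority hires out of n
    prob = comb(n, r) * p**r * q**(n - r)
    total += prob
    print(f"P({r} minority hires) = {prob:.4f}")

print(f"P(2 or fewer minority hires) = {total:.4f}")
```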