Recitation 3


+The Normal Distribution

+Probability Distributions

A probability distribution is a table or an equation that links each outcome of a statistical experiment with its probability of occurrence.

+Distributions fit with different types of variables:

Discrete variables: take on a countable number of values
-the number of job classifications in an agency
-the number of employees in a department
-the number of training sessions

Continuous variables: take on an uncountable (or extremely large) range of numerical values
-temperature
-pressure
-height, weight, time
-dollars: budgets, income (not strictly continuous, but they can take so many closely spaced values that you may as well treat them as continuous)

+Visualized Discrete vs. Continuous

With a discrete histogram you can ask "what is the probability of a 3?" and read the answer directly off the chart (here, 0.4). For a continuous variable you have to ask about a range instead, such as between 1.1 and 2.9, and take the area under the curve over that range (integrate).
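The contrast can be sketched in a few lines of Python. The numbers are illustrative assumptions, not values from the slides: `pmf` is a made-up discrete distribution in which a 3 has probability 0.4, and the continuous variable is assumed to be normal with mean 2 and SD 1.

```python
from statistics import NormalDist

# Hypothetical discrete distribution: outcome -> probability (illustrative values)
pmf = {1: 0.1, 2: 0.3, 3: 0.4, 4: 0.2}
p_exactly_3 = pmf[3]  # read straight off the table

# Hypothetical continuous variable: normal with mean 2 and SD 1 (an assumption)
dist = NormalDist(mu=2.0, sigma=1.0)

# For a continuous variable, ask about a range and take the area under the curve
p_between = dist.cdf(2.9) - dist.cdf(1.1)

print(f"P(X = 3) = {p_exactly_3}")
print(f"P(1.1 < X < 2.9) = {p_between:.4f}")
```

The cumulative distribution function (`cdf`) does the integration, so subtracting two `cdf` values gives the area between them.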

+Real life normal distribution

+The Normal Distribution Characteristics

-Applies to continuous variables only
-The bell curve shape is familiar
-Most values cluster around the mean mu
-As values fall at a greater distance from the mean, their likelihood of occurring shrinks
-Its shape is completely determined by its mean and its standard deviation
-The height of the curve is greatest at the mean (where probability of occurrence is highest)

+
-68.26% of values fall within one standard deviation of the mean in either direction
-95.44% of values fall within 2 standard deviations of the mean in either direction
-99.72% of values fall within 3 standard deviations of the mean in either direction
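These percentages can be checked numerically. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `share_within` is made up for illustration. Results computed this way may differ from the slide figures in the last digit, since the slide figures appear to come from a rounded z table.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, SD 1

def share_within(k):
    """Share of values lying within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {share_within(k):.2%}")
```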

+z scores

• The number of standard deviations a score of interest lies away from the mean in a normal distribution
• It is used to convert raw data into their associated probability of occurrence with reference to the mean
• The score we are interested in is X. To find the z score of X, subtract the mean (mu) from it, then divide by the standard deviation (sigma): z = (X - mu) / sigma. The result is how many SDs the score is from the mean
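As a minimal sketch, the z score calculation (subtract the mean, divide by the standard deviation) can be written as a small function. The function name `z_score` and the sample numbers are illustrative assumptions.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

# Illustrative values: mean 100, SD 10
print(z_score(110, 100, 10))  # one SD above the mean
print(z_score(85, 100, 10))   # one and a half SDs below the mean
```

A positive result means the score is above the mean; a negative result means it is below.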

+z scores

• The z score itself equals the number of SDs (sigma) that a score of interest (X) is from the mean (mu) in a normal distribution
• A data value X one standard deviation above the mean has a z score of 1
• A data value X 2 SDs above the mean has a z score of 2
• The probability associated with a z score of 1 is 0.3413 (see the blue oval below): 68.26/2 = 34.13% of the data values lie between the mean and 1 SD above it

+z scores

• The z score for 1 SD below the mean is -1.0; the associated probability has the same magnitude (0.3413)
• Thus, the region between the mean and a z score of -1.0 contains 34.13% of the data, i.e. just over one third of the data fall between mu and 1 SD below it

+Example: What is the likelihood that a value has a z score of 2.0?

It is equal to 95.44/2 = 47.72% (meaning just over 47% of the data fall between mu and 2 SD above it)
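The "area between the mean and z" values used in these z score slides can be reproduced with a short sketch. It relies on `statistics.NormalDist` from the Python standard library; the helper name `area_mean_to_z` is an assumption for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, SD 1

def area_mean_to_z(z):
    """Share of the data lying between the mean and a given z score.

    Half of the curve lies below the mean, so subtract 0.5 from the
    cumulative probability; abs() handles negative z scores by symmetry.
    """
    return abs(std_normal.cdf(z) - 0.5)

for z in (1.0, -1.0, 2.0):
    print(f"z = {z:+.1f}: {area_mean_to_z(z):.4f}")
```

Because the curve is symmetric, z = 1.0 and z = -1.0 give the same area.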

+The normal distribution table

• Displays the percentage of data values falling between the mean mu and each z score
• The first 2 digits of the z score are in the far left column
• The third digit is on the top row
• The associated probability is where they meet

+Locating z score for 1.0

+Example: What percent of the data lies between mu and 1.33 SD away?

+Locate 1.33 on the z table

The answer is that 40.824% of the data fall between the mean and 1.33 SD from the mean
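The row-and-column lookup can be mimicked in code. This is only a sketch: instead of a printed table it computes the entry with `statistics.NormalDist`, and the helper name `table_lookup` is made up. It assumes a non-negative z score rounded to two decimals, as on a typical z table.

```python
from statistics import NormalDist

def table_lookup(z):
    """Return (row label, column label, table entry) for a non-negative z score."""
    hundredths = round(z * 100)   # 1.33 -> 133
    row = hundredths // 10 / 10   # first two digits -> 1.3 (far left column)
    col = hundredths % 10 / 100   # third digit -> 0.03 (top row)
    # Table entry: area between the mean and z, to four decimals
    area = round(NormalDist().cdf(hundredths / 100) - 0.5, 4)
    return row, col, area

row, col, area = table_lookup(1.33)
print(f"row {row}, column {col}: {area}")
```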

+Application Example:

The police chief is reviewing the academy’s exam scores. The police department’s entrance exam has a normal distribution with a mean of 100 and SD of 10. Someone scored 119.2 on the exam. Is this a good score?

+Solution

-Another way of asking this is: what is the probability that a randomly chosen applicant takes the test and scores 119.2 or better?
-If the probability is high, then it is an average or mediocre score; if the probability is low, then it is an exceptional score
-Step 1: convert the test score to a z score using the formula z = (X - mu) / sigma:

(119.2-100)/10=1.92

+Solution

-Step 2: Look up the z score of 1.92 in the z table. The z score is positive, so it tells you how many standard deviations the score is above the mean.

+Solution

-Step 3: The value here is .4726
-But you’re not done. What you just found is the share of scores between the mean and 119.2
-We also need to add in the part of the curve shaded in green, i.e. all of the scores under the mean (0.50)

0.5 + 0.4726 = 0.9726, so the score is at the 97.26th percentile; in other words, 97.26% of the scores fall below this score
-The probability that a randomly selected individual will get this score or better is 1 - 0.9726 = 0.0274
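The three steps of the police exam solution can be retraced in a short sketch. It uses `statistics.NormalDist` in place of the printed z table; the variable names are illustrative.

```python
from statistics import NormalDist

exam = NormalDist(mu=100, sigma=10)  # entrance exam: mean 100, SD 10
score = 119.2

# Step 1: convert the raw score to a z score
z = (score - exam.mean) / exam.stdev

# Steps 2 and 3: share of scores below 119.2 (the percentile),
# i.e. 0.5 for everything under the mean plus the area from the mean to z
below = exam.cdf(score)

# Probability of a randomly selected applicant scoring this well or better
at_or_above = 1 - below

print(f"z = {z:.2f}")
print(f"share below = {below:.4f}")
print(f"share at or above = {at_or_above:.4f}")
```

A probability under 3% of doing this well or better suggests the score is exceptional rather than average.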

+Tips

• Always draw a picture; it helps you reason through your answer
• The z curve is symmetric, so if your z score were -1.92, the region between it and the mean would still contain ~47.3% of the data

+

The Binomial Distribution

The last section, I promise.

+A gem from the reading


+Binomial Distribution Definition

• The probability that an event will occur a specified number of times within a specified number of trials
• Examples: mail will be delivered before a certain time every day this week; equipment in a factory remains operational over a 10 day period
• This is a DISCRETE distribution that deals with the likelihood of observing a certain number of events in a set number of repeated trials

+A Bernoulli process

• The Binomial distribution can be used when the process is Bernoulli
• Bernoulli characteristics:
• The outcome of a trial is either a success or a failure
• The outcomes are mutually exclusive
• The probability of a success is constant from trial to trial
• One trial’s probability of success is not affected by the trial before it (INDEPENDENCE)
• Examples of independent events: multiple coin tosses; a fire occurring in a community isn’t affected by whether one happened the night before

+When looking at Bernoulli Events

You can calculate their probability with the binomial distribution
• Examples of Bernoulli events:
• A coin flip is either heads or not heads
• A crime is either solved or not solved

+To calculate a probability using the Binomial Distribution you need

• n = number of trials
• r = number of successes
• p = probability that the event will be a success
• q = (1-p)

+Breaking down the formula

P(r successes in n trials) = C(n, r) * p^r * q^(n-r)

C(n, r) is a combination; it is read, “a combination of n THINGS taken r at a time”

The formula is: C(n, r) = n! / (r! (n-r)!)
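A minimal sketch of the binomial calculation, using the n, r, p, and q defined earlier in this section. The function name `binomial_probability` is an assumption; `math.comb` (Python 3.8+) computes the combination n! / (r! (n-r)!).

```python
from math import comb

def binomial_probability(n, r, p):
    """Probability of exactly r successes in n independent trials."""
    q = 1 - p                    # probability of failure on one trial
    ways = comb(n, r)            # combinations of n things taken r at a time
    return ways * p**r * q**(n - r)

# Illustrative check: 10 things taken 2 at a time
print(comb(10, 2))
# Illustrative use: exactly 1 head in 2 fair coin flips
print(binomial_probability(2, 1, 0.5))
```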

+Example

We flip a coin three times, and we want to know the probability of getting three heads

+Step 1

Define n, p, r, and q

n (number of trials) = 3
r (successes) = 3 [number of heads]
p (probability of getting a heads on a flip) = 0.5
q (1-p) = 0.5

Now fill in the formula:

(3!/3!0!) * 0.5^3 * 0.5^0 = 0.125

+Important when solving

• 0! = 1
• Any number raised to the power of 0 = 1
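The three-heads coin example can be worked through with explicit factorials, which shows where 0! = 1 and the power of 0 come into play. This is a sketch with illustrative variable names.

```python
from math import factorial

n, r, p = 3, 3, 0.5   # three flips, three heads, fair coin
q = 1 - p

# 3! / (3! * 0!) relies on 0! = 1
ways = factorial(n) // (factorial(r) * factorial(n - r))

# q ** 0 relies on "any number raised to the power of 0 = 1"
probability = ways * p**r * q**(n - r)

print(f"0! = {factorial(0)}")
print(f"combinations = {ways}")
print(f"P(three heads) = {probability}")
```

There is only one way to get three heads in three flips, so the answer reduces to 0.5 * 0.5 * 0.5.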

+Example 2

A Public Works department has been charged with discrimination. Last year, 40% of people who passed the civil service exam were minorities (eligible to be hired). From this group, Public Works hired 10 people, and 2 were minorities. What is the probability that if Public Works DID NOT discriminate it still would have hired 2 or fewer minorities? (assuming everyone had the same probability of getting hired)

+Step 1

Identify n, p, r, and q
• n (number of trials = number of people hired) = 10
• r (successes) = 2 [number of hired minorities]
• p (probability that a given hire is a minority = % of minorities in the pool) = 0.4
• q (1-p) = 0.6

+Step 2

Reason through the problem. It asks the likelihood that Public Works hired 2 or fewer minorities. Thus, we need to calculate the binomial for 2 hires, 1 hire, and 0 hires. 

+Step 3

Set up the probability calculations:

Two minorities:
n (number of trials = number of people hired) = 10
r (successes) = 2 [number of hired minorities]
p (probability of getting hired = % of minorities in the pool) = 0.4
q (1-p) = 0.6

(10!/2!8!) * 0.4^2 * 0.6^8 = 0.12

+Formula reminder

Binomial: P(r) = C(n, r) * p^r * q^(n-r)

Combinations: C(n, r) = n! / (r! (n-r)!)

+Step 4

Repeat for one minority hired

One minority:
n (number of trials = number of people hired) = 10
r (successes) = 1 [number of hired minorities]
p (probability of getting hired = % of minorities in the pool) = 0.4
q (1-p) = 0.6

(10!/1!9!) * 0.4^1 * 0.6^9 = 0.04

+Step 5

Repeat for 0 minorities hired

No minorities at all:
n (number of trials = number of people hired) = 10
r (successes) = 0 [number of hired minorities]
p (probability of getting hired = % of minorities in the pool) = 0.4
q (1-p) = 0.6

(10!/0!10!) * 0.4^0 * 0.6^10 = 0.006

+Step 6

Add these probabilities together: 0.12 + 0.04 + 0.006 = 0.166. The likelihood of hiring 2 or fewer minorities by chance is about 16.6%
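The whole Public Works calculation can be checked with a short sketch that sums the binomial probabilities for 0, 1, and 2 minority hires. The variable names are illustrative; `math.comb` requires Python 3.8 or later. The unrounded total comes out slightly higher than 0.166 because the steps above round each term before adding.

```python
from math import comb

n, p = 10, 0.4        # 10 hires; 40% of the eligible pool are minorities
q = 1 - p

# Binomial probability for each count of minority hires from 0 to 2
probs = {r: comb(n, r) * p**r * q**(n - r) for r in range(3)}
total = sum(probs.values())

for r, prob in probs.items():
    print(f"P({r} minority hires) = {prob:.4f}")
print(f"P(2 or fewer) = {total:.4f}")
```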