Normality (Part 4) Notes continued on page 138-139

Page 1: Normality (Part 4) Notes continued on page 138-139

Normality (Part 4)

Notes continued on page 138-139

Page 2: Normality (Part 4) Notes continued on page 138-139

The heights of the female students at RSH are normally distributed with a mean of 65 inches. What is the standard deviation of this distribution if 18.5% of the female students are shorter than 63 inches?

P(X < 63) = .185

What is the z-score for 63?

z = -0.9

-0.9 = (63 - 65)/σ

σ = -2/(-0.9) = 2.22
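The arithmetic for this problem can be checked with a short Python sketch. It assumes the standard-library `statistics.NormalDist` as a stand-in for the calculator's inverse-normal function, and the variable names are made up for the illustration.

```python
from statistics import NormalDist

mean = 65          # mean height in inches (from the problem)
cutoff = 63        # height with 18.5% of the students below it
area_left = 0.185  # P(X < 63)

# z-score with 18.5% of the standard normal curve to its left
z = NormalDist().inv_cdf(area_left)

# Solve z = (cutoff - mean) / sigma for sigma
sigma = (cutoff - mean) / z

print(round(z, 2))      # about -0.9
print(round(sigma, 2))  # about 2.23 with the unrounded z-score
```

Because the sketch keeps the unrounded z-score, its standard deviation comes out slightly above the 2.22 obtained with z = -0.9.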

Page 3: Normality (Part 4) Notes continued on page 138-139

The heights of female teachers at RSH are normally distributed with a mean of 65.5 inches and a standard deviation of 2.25 inches. The heights of male teachers are normally distributed with a mean of 70 inches and a standard deviation of 2.5 inches.

• Describe the distribution of differences of heights (male – female) of teachers.

Normal distribution with μ = 4.5 & σ = 3.3634
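Here is a short sketch of where these two numbers come from. It assumes the male and female heights are independent; the slide does not state this, but the standard deviation it gives only works out under that assumption. Variable names are illustrative.

```python
import math

mean_f, sd_f = 65.5, 2.25   # female teachers
mean_m, sd_m = 70.0, 2.5    # male teachers

# The mean of (male - female) is the difference of the means
mean_diff = mean_m - mean_f

# For independent variables the variances add, even for a difference
sd_diff = math.sqrt(sd_m**2 + sd_f**2)

print(mean_diff)           # 4.5
print(round(sd_diff, 4))   # 3.3634
```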

Page 4: Normality (Part 4) Notes continued on page 138-139

• What is the probability that a randomly selected male teacher is shorter than a randomly selected female teacher?

z = (0 - 4.5)/3.3634 = -1.34

P(X < 0) = P(z < -1.34) = .0901
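The same probability can be estimated in Python. This is a sketch only, assuming `statistics.NormalDist` in place of the calculator and using the mean and standard deviation of the differences from the previous slide.

```python
from statistics import NormalDist

# D = male height - female height, using the values from the previous slide
diff = NormalDist(mu=4.5, sigma=3.3634)

z = (0 - diff.mean) / diff.stdev   # z-score for a difference of 0
p_male_shorter = diff.cdf(0)       # P(D < 0)

print(round(z, 2))                 # -1.34
print(round(p_male_shorter, 4))    # roughly 0.09
```

The result differs slightly in the last digits from .0901 because the sketch does not round the z-score to -1.34 before looking up the area.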

Page 5: Normality (Part 4) Notes continued on page 138-139

Will my calculator do any of this normal stuff?

• Normalpdf – use for graphing ONLY

• Normalcdf – will find probability of area from lower bound to upper bound

• Invnorm (inverse normal) – will find the z-score for a given probability (area to the left)
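For readers without the calculator, the three functions can be imitated in Python. This is a rough sketch assuming the standard-library `statistics.NormalDist`; the function names and argument order below are chosen to resemble the calculator commands and are not part of any library.

```python
from statistics import NormalDist

def normalpdf(x, mu=0.0, sigma=1.0):
    """Height of the normal curve at x (useful for graphing only)."""
    return NormalDist(mu, sigma).pdf(x)

def normalcdf(lower, upper, mu=0.0, sigma=1.0):
    """Area under the normal curve between lower and upper."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(upper) - dist.cdf(lower)

def invnorm(area_left, mu=0.0, sigma=1.0):
    """Value that has the given area to its left."""
    return NormalDist(mu, sigma).inv_cdf(area_left)

print(round(normalcdf(-1, 1), 4))   # 0.6827
print(round(invnorm(0.975), 2))     # 1.96
```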

Page 6: Normality (Part 4) Notes continued on page 138-139

Ways to Assess Normality

• Use graphs (dotplots, boxplots, or histograms)

• Normal probability (quantile) plot

Page 7: Normality (Part 4) Notes continued on page 138-139

To construct a normal probability plot, you can use quantities called normal scores. The values of the normal scores depend on the sample size n. The normal scores when n = 10 are below:

-1.539 -1.001 -0.656 -0.376 -0.123 0.123 0.376 0.656 1.001 1.539

Think of selecting sample after sample of size 10 from a standard normal distribution. Then -1.539 is the average of the smallest observation from each sample, and so on.

Suppose we have the following observations of widths of contact windows in integrated circuit chips:

3.21 2.49 2.94 4.38 4.02 3.62 3.30 2.85 3.34 3.81

Sketch a scatterplot by pairing the smallest normal score with the smallest observation from the data set, and so on.

[Plot: Normal Scores (vertical axis) vs. Widths of Contact Windows (horizontal axis)]

What should happen if our data set is normally distributed?
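The pairing described above can be sketched in Python. The data and normal scores are the ones on the slide; the correlation at the end is an added convenience for judging how straight the plot is, not something the slide asks for, and `pearson_r` is a hypothetical helper written for this illustration.

```python
import math

# Normal scores for n = 10 (from the slide)
normal_scores = [-1.539, -1.001, -0.656, -0.376, -0.123,
                 0.123, 0.376, 0.656, 1.001, 1.539]

# Widths of contact windows (from the slide)
widths = [3.21, 2.49, 2.94, 4.38, 4.02, 3.62, 3.30, 2.85, 3.34, 3.81]

# Pair the smallest width with the smallest normal score, and so on
pairs = list(zip(sorted(widths), normal_scores))

def pearson_r(points):
    """Correlation between the x and y coordinates of the points."""
    xs, ys = zip(*points)
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

for width, score in pairs:
    print(width, score)   # points to plot: (width, normal score)

r = pearson_r(pairs)
print(round(r, 3))        # close to 1 when the plot is nearly a straight line
```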

Page 8: Normality (Part 4) Notes continued on page 138-139

Normal Probability (Quantile) plots

• The observation (x) is plotted against known normal z-scores

• If the points on the quantile plot lie close to a straight line, then the data are approximately normally distributed

• Deviations from a straight line on the quantile plot indicate nonnormal data

• Points far away from the line indicate outliers

• Vertical stacks of points (repeated observations of the same number) are called granularity

Page 9: Normality (Part 4) Notes continued on page 138-139

Are these approximately normally distributed?

50 48 54 47 51 52 46 53 52 51 48 48 54 55 57 45 53 50 47 49 50 56 53 52

Both the histogram & boxplot are approximately symmetrical, so these data are approximately normal.

The normal probability plot is approximately linear, so these data are approximately normal.

What is this called?
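A normal probability plot for these 24 values can be approximated in Python. This sketch makes two assumptions that are not in the slides: the normal scores are approximated with Blom's formula, (i - 0.375)/(n + 0.25), rather than taken from a table, and a correlation is used as a rough measure of how linear the plot is.

```python
import math
from collections import Counter
from statistics import NormalDist

data = [50, 48, 54, 47, 51, 52, 46, 53, 52, 51, 48, 48,
        54, 55, 57, 45, 53, 50, 47, 49, 50, 56, 53, 52]

n = len(data)
ordered = sorted(data)

# Approximate normal scores (Blom's formula; an assumption, not from the slides)
scores = [NormalDist().inv_cdf((i - 0.375) / (n + 0.25))
          for i in range(1, n + 1)]

def pearson_r(xs, ys):
    """Correlation between two equally long lists of numbers."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

r = pearson_r(ordered, scores)
print(round(r, 3))   # near 1 suggests an approximately linear plot

# Values that occur more than once appear as stacks of points on the plot
repeats = {value: count for value, count in Counter(data).items() if count > 1}
print(repeats)
```

The repeated values are what produce the stacked points in the plot.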

Page 10: Normality (Part 4) Notes continued on page 138-139

Normal Approximation to the Binomial

Before widespread use of technology, binomial probability calculations were very tedious. Let’s see how statisticians estimated these calculations in the past!

Page 11: Normality (Part 4) Notes continued on page 138-139

Refresher on Binomial Probability

P(X ≤ 30) = binomialcdf(n,p,30)

Where: n is the number of trials, and p is the probability of success

P(15 ≤ X ≤ 30) = binomialcdf(n,p,30) – binomialcdf(n,p,14)

For example, if n=250 and p=.1, we would enter binomialcdf(250,.1,30) to find P(X ≤ 30)
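What the calculator command computes can be sketched in a few lines of Python. The function below is a hypothetical stand-in that mirrors the calculator's argument order; it simply adds up the binomial probabilities for 0 through x.

```python
from math import comb

def binomialcdf(n, p, x):
    """P(X <= x) for a binomial with n trials and success probability p."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(0, x + 1))

# Small case that is easy to check by hand: P(X <= 1) for 2 fair coin flips
print(binomialcdf(2, 0.5, 1))               # 0.75

# The entry from the slide: P(X <= 30) when n = 250 and p = .1
print(round(binomialcdf(250, 0.1, 30), 4))
```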

Page 12: Normality (Part 4) Notes continued on page 138-139

Premature babies are those born more than 3 weeks early. Newsweek (May 16, 1988) reported that 10% of the live births in the U.S. are premature. Suppose that 250 live births are randomly selected and that the number X of the “preemies” is determined. What is the probability that there are between 15 and 30 preemies, inclusive? (POD, p. 422)

1) Find this probability using the binomial distribution.

2) What are the mean and standard deviation of the above distribution?

P(15 ≤ X ≤ 30) = binomialcdf(250,.1,30) – binomialcdf(250,.1,14)

=.866

μ = np = 25

σ = sqrt((np)(1-p)) = sqrt((25)(1-.1)) = 4.743 (on formula chart)
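These answers can be reproduced with a direct sum of the individual binomial probabilities. The sketch below is illustrative only; `binomialpdf` is a hypothetical helper, and summing the bars from 15 to 30 is equivalent to subtracting the two cumulative values.

```python
from math import comb, sqrt

n, p = 250, 0.1

def binomialpdf(n, p, k):
    """P(X = k) for a binomial with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# P(15 <= X <= 30): add up the probabilities for 15, 16, ..., 30
prob = sum(binomialpdf(n, p, k) for k in range(15, 31))

mean = n * p
sd = sqrt(n * p * (1 - p))

print(round(prob, 3))       # compare with the .866 on the slide
print(round(mean, 3), round(sd, 3))   # 25.0 4.743
```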

Page 13: Normality (Part 4) Notes continued on page 138-139

3) If we were to graph a histogram for the above binomial distribution, what shape do you think it will have?

Since the probability is only 10%, we would expect the histogram to be strongly skewed right.

Page 14: Normality (Part 4) Notes continued on page 138-139

Normal distributions can be used to estimate probabilities for binomial distributions when:

1) the probability of success is close to .5
or
2) n is sufficiently large

Rule: n is large enough if np ≥ 10 & n(1 – p) ≥ 10

Why 10?

See p 144
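The rule of thumb is easy to express as a check. The function name and the `threshold` parameter below are made up for this sketch; the value 10 comes from the rule on the slide.

```python
def normal_approx_ok(n, p, threshold=10):
    """Rule of thumb: np and n(1 - p) are both at least `threshold`."""
    return n * p >= threshold and n * (1 - p) >= threshold

print(normal_approx_ok(250, 0.1))   # True: np = 25 and n(1 - p) = 225
print(normal_approx_ok(50, 0.1))    # False: np is only 5
```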

Page 15: Normality (Part 4) Notes continued on page 138-139

Normal distributions extend infinitely in both directions; however, binomial distributions are between 0 and n. If we use a normal distribution to estimate a binomial distribution, we must cut off the tails of the normal distribution. This is OK if the mean of the normal distribution (for which we use the mean of the binomial, np) is at least three standard deviations (3σ) from 0 and from n. (BVD, p. 334)

Page 16: Normality (Part 4) Notes continued on page 138-139

We require: μ - 3σ > 0

Or: μ > 3σ

As binomial: np > 3 sqrt(np(1 - p))

Square: n^2 p^2 > 9np(1 - p)

Simplify: np > 9(1 - p)

Since (1 - p) < 1: np > 9

And p < 1: n(1 - p) > 9

Therefore, we say that np should be at least 10 and n(1 – p) should be at least 10.
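The three-standard-deviation requirement can be checked numerically. This is a sketch under the assumption that the binomial mean np and standard deviation sqrt(np(1 - p)) are used for the normal curve; `tails_fit` is a hypothetical helper name.

```python
from math import sqrt

def tails_fit(n, p):
    """True if the mean is more than 3 standard deviations from both 0 and n."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    return mu - 3 * sigma > 0 and mu + 3 * sigma < n

# The preemie example: n = 250, p = .1
n, p = 250, 0.1
mu, sigma = n * p, sqrt(n * p * (1 - p))
print(round(mu - 3 * sigma, 2), round(mu + 3 * sigma, 2))  # 10.77 39.23

print(tails_fit(250, 0.1))   # True
print(tails_fit(20, 0.1))    # False: np = 2 is too small
```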

Page 17: Normality (Part 4) Notes continued on page 138-139

Normal distributions can be used to estimate probabilities for binomial distributions when:

1) the probability of success is close to .5
or
2) n is sufficiently large

Rule: n is large enough if np ≥ 10 & n(1 – p) ≥ 10

Since a continuous distribution is used to estimate the probabilities of a discrete distribution, a continuity correction is used to make the discrete values similar to continuous values. (± .5 to discrete values)

Why?

Think about how discrete histograms are made. Each bar is centered over the discrete values. The bar for “1” actually goes from 0.5 to 1.5 & the bar for “2” goes from 1.5 to 2.5.

Therefore, by adding or subtracting .5 from the discrete values, you find the actual width of the bars that you need to estimate with the normal curve.
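To see the continuity correction on a single bar, the sketch below compares the exact binomial probability of one value with the normal-curve area over that bar's width. It borrows n = 250, p = .1, mean 25, and standard deviation 4.743 from the preemie example; the choice of the bar at 25 is arbitrary.

```python
from math import comb
from statistics import NormalDist

n, p = 250, 0.1
approx = NormalDist(mu=25, sigma=4.743)   # binomial mean and sd from the example

k = 25
# Exact P(X = 25): the area of the histogram bar centered at 25
exact_bar = comb(n, k) * p**k * (1 - p)**(n - k)

# Normal-curve area over the same bar, from 24.5 to 25.5
curve_area = approx.cdf(k + 0.5) - approx.cdf(k - 0.5)

print(round(exact_bar, 4), round(curve_area, 4))
```

Without the ± .5, the normal curve would assign zero area to the single value 25, which is why the bar's full width is used.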

Page 18: Normality (Part 4) Notes continued on page 138-139

4) (Back to our example) Since P(preemie) = .1, which is not close to .5, is n large enough?

np = 250(.1) = 25 & n(1-p) = 250(.9) = 225

Yes, OK to use normal to approximate binomial

5) Use a normal distribution with the binomial mean and standard deviation above to estimate the probability that between 15 & 30 preemies, inclusive, are born in the 250 randomly selected babies.

Binomial written as Normal (w/ cont. correction): P(15 ≤ X ≤ 30) becomes

P(14.5 < X < 30.5) = Normalcdf(14.5,30.5,25,4.743) = .8635

6) How does the answer in question 5 compare to the answer in question 1 (Binomial answer = 0.866)?
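The normal estimate and the exact binomial answer can be compared side by side. This sketch assumes `statistics.NormalDist` as a substitute for the calculator's Normalcdf and sums the binomial probabilities directly for the exact value.

```python
from math import comb
from statistics import NormalDist

n, p = 250, 0.1
mu, sigma = 25, 4.743   # binomial mean and standard deviation

# Normal estimate with the continuity correction: area from 14.5 to 30.5
curve = NormalDist(mu, sigma)
normal_estimate = curve.cdf(30.5) - curve.cdf(14.5)

# Exact binomial probability P(15 <= X <= 30)
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(15, 31))

print(round(normal_estimate, 4))               # about 0.8635
print(round(exact, 3))
print(round(abs(exact - normal_estimate), 4))  # size of the approximation error
```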

Page 19: Normality (Part 4) Notes continued on page 138-139

Homework:

• Page 139, #A,B,C

• Finish handout “Graded Assignment 2-2” (all)