06-14-PP CHAPTER 06 Continuous Probability

1

Chapter 6

2

Normal Distribution

The most important type of random variable is the normal, or Gaussian, random variable, which follows a normal distribution. In fact, the binomial distribution can be approximated by the normal distribution.

Note that a normal (Gaussian) random variable is continuous.

3

Graph of Normal Probability Distribution

Let μ and σ be the mean and standard deviation of the given population. Recall that a normal (Gaussian) random variable is continuous. Its distribution function is also continuous on the set of all real numbers. Therefore, its graph is continuous on the entire real line.

The normal (Gaussian) curve is shown on the next slide.

4

The Normal Curve

5

Properties of a Normal Curve

The notable features of a normal distribution (density) curve are as follows:
a. The curve is bell-shaped with the highest point (the mode) at the mean μ.
b. It is symmetrical about the mean.
c. The curve is always above the horizontal axis. In other words, the curve approaches the horizontal axis (asymptote) but never touches or crosses it.
d. It has two inflection points, at μ − σ and μ + σ.
e. The area bounded by the normal curve, the horizontal axis, and two vertical lines is the probability that the normal random variable belongs to the interval determined by the two vertical lines.

6

The Normal Distribution Function

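Moreover, the normal distribution curve, y = f(x), is in fact the normal density function. Its mathematical representation is given by

$$f(x \mid \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}},$$

where $-\infty < x < \infty$, $-\infty < \mu < \infty$, and $\sigma > 0$.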

7

Effect of the mean and variance on the normal curve

[Figure: two panels of normal curves. Left: fixed σ², varying μ. Right: varying σ², fixed μ.]

8

Empirical Rule

For a distribution that is symmetrical and bell shaped (in particular, for a normal distribution):

▪ Approximately 68% of the data fall in the interval (μ − σ, μ + σ)

▪ Approximately 95% of the data fall in the interval (μ − 2σ, μ + 2σ)

▪ Approximately 99.7% of the data fall in the interval (μ − 3σ, μ + 3σ)

9

Empirical Rule

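In fact, from this empirical rule one can easily conclude that the probabilities of the events
i. |x − μ| < σ
ii. |x − μ| < 2σ
iii. |x − μ| < 3σ
are 0.68, 0.95, and 0.997 respectively. That is,

$$P(|x - \mu| < 1\sigma) = 0.68$$

$$P(|x - \mu| < 2\sigma) = 0.95$$

$$P(|x - \mu| < 3\sigma) = 0.997$$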
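The three empirical-rule probabilities can be checked numerically. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `prob_within` is an illustration, not something from the slides.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1).
std_normal = NormalDist(mu=0.0, sigma=1.0)

def prob_within(k: float) -> float:
    """Return P(|x - mu| < k*sigma), which is the same for every normal distribution."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"P(|x - mu| < {k} sigma) = {prob_within(k):.4f}")
```

The printed values round to the 68%, 95%, and 99.7% figures quoted in the rule.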

10

Graphical Representation of the Empirical Rule

11

Control Charts

A control chart is used to examine data over a period of equally spaced time intervals.

For a given random variable X, the control chart is a plot of the observed values of X = x in time sequence order.

12

Procedure for Making Control Chart for Random Variable X

13

Example 1: Graphing Control Charts

14

Inferences about data using control chart

Out-of-Control Signal-I:
▪ One point beyond the three standard deviation level either above or below the center line (mean, μ).

Out-of-Control Signal-II:
▪ A run of nine consecutive points on one side of the center line (mean, μ).

Out-of-Control Signal-III:
▪ At least two of three consecutive points beyond the two standard deviation level on the same side of the center line (mean, μ).
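One way to apply the three signals to a time-ordered series is sketched below. The function name `out_of_control_signals` and its arguments are assumptions made for illustration; μ and σ are taken as known.

```python
from typing import List

def out_of_control_signals(values: List[float], mu: float, sigma: float) -> List[str]:
    """Return which out-of-control signals (I, II, III) occur in a time-ordered series."""
    signals = []

    # Signal I: any point more than three standard deviations from the center line.
    if any(abs(x - mu) > 3 * sigma for x in values):
        signals.append("I")

    # Signal II: a run of nine consecutive points on one side of the center line.
    run, prev_side = 0, 0
    for x in values:
        side = (x > mu) - (x < mu)  # +1 above, -1 below, 0 exactly on the line
        if side != 0 and side == prev_side:
            run += 1
        else:
            run = 1 if side != 0 else 0
        prev_side = side
        if run >= 9:
            signals.append("II")
            break

    # Signal III: at least two of three consecutive points beyond 2 sigma, same side.
    for i in range(len(values) - 2):
        window = values[i:i + 3]
        above = sum(x > mu + 2 * sigma for x in window)
        below = sum(x < mu - 2 * sigma for x in window)
        if above >= 2 or below >= 2:
            signals.append("III")
            break

    return signals

print(out_of_control_signals([12.5, 9.9, 12.2], mu=10, sigma=1))
```

An empty list means none of the three signals was detected.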

15

Graphical Illustration of Out-of-Control Signal I

16

Probability of Out-of-Control Signal I Using the Empirical Rule

[Figure: control chart of sample mean versus trial]

Empirically, $P(|x - \mu| > 3\sigma) = 1 - 0.997 = 0.003$.

17

Graphical Illustration of Out-of-Control Signal II

18

Probability of Out-of-Control Signal II Using the Empirical Rule

$$P(\text{nine on one side of the mean}) = (0.5)^{9} \approx 0.002$$

$$P(\text{nine on either side of the mean}) = 2(0.002) = 0.004$$

[Figure: control chart of sample mean versus trial]

19

Graphical Illustration of Out-of-Control Signal III

20

Probability of Out-of-Control Signal III Using the Empirical Rule

[Figure: control chart of sample mean versus trial]

21

Probability of Signal III (cont)

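Empirically: $P(|x - \mu| > 2\sigma) = 1 - 0.95 = 2(0.025)$

More accurately, above the mean: $P(x - \mu > 2\sigma) = 0.025$

$$P(\text{at least two out of three data values more than two standard deviations above the mean}) = {}_{3}C_{2}(0.025)^{2}(0.975)^{1} + {}_{3}C_{3}(0.025)^{3}(0.975)^{0} = 0.0018$$

$$P(\text{at least two out of three data values more than two standard deviations above or below the mean}) = 2(0.0018) = 0.0036 \approx 0.004$$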

22

Summary of Signals Probabilities

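Using the empirical rule:

$$P(\text{Signal I}) = 1 - 0.997 = 0.003$$

$$P(\text{Signal II}) = 2(0.5)^{9} \approx 0.004$$

$$P(\text{Signal III}) = 2\left[{}_{3}C_{2}(0.025)^{2}(0.975)^{1} + {}_{3}C_{3}(0.025)^{3}(0.975)^{0}\right] \approx 0.004$$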
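The arithmetic behind the three signal probabilities can be reproduced in a few lines. This is a sketch that starts from the empirical-rule figures 0.997 and 0.95; the variable names are illustrative.

```python
from math import comb

# Signal I: one point beyond three standard deviations.
p_signal_1 = 1 - 0.997

# Signal II: nine consecutive points on one side, above or below the mean.
p_signal_2 = 2 * 0.5 ** 9

# Signal III: at least two of three points beyond 2 sigma on the same side.
p_tail = (1 - 0.95) / 2  # probability of exceeding 2 sigma on one given side
p_one_side = sum(
    comb(3, k) * p_tail ** k * (1 - p_tail) ** (3 - k) for k in (2, 3)
)
p_signal_3 = 2 * p_one_side

print(round(p_signal_1, 3), round(p_signal_2, 3), round(p_signal_3, 3))
```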

23

Chebyshev’s Theorem*

For any set of data (population or sample) with sample size greater than 1, regardless of the distribution of the data set, the proportion of the data that must be within k standard deviations on either side of the mean (for k > 1) is at least

$$1 - \frac{1}{k^{2}}.$$

24

Results of Chebyshev’s Theorem

According to Chebyshev’s Theorem for any set of data, the proportion of data (percentage of data) within the given number of standard deviations yields the following results:

▪ At least 75% of the data fall in the interval (μ − 2σ, μ + 2σ)

▪ At least 88.9% of the data fall in the interval (μ − 3σ, μ + 3σ)

▪ At least 93.8% of the data fall in the interval (μ − 4σ, μ + 4σ)
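The three percentages follow directly from the bound 1 − 1/k². A minimal sketch, with the function name `chebyshev_lower_bound` chosen here for illustration:

```python
def chebyshev_lower_bound(k: float) -> float:
    """Minimum proportion of data within k standard deviations of the mean."""
    if k <= 1:
        # For k <= 1 the bound is zero or negative, so it says nothing useful.
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2

for k in (2, 3, 4):
    print(f"k = {k}: at least {chebyshev_lower_bound(k):.1%}")
```

Unlike the empirical rule, these bounds hold for any distribution, which is why they are weaker than the 95% and 99.7% figures for the normal curve.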

25

The Normal Distribution

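Probability Density Function (PDF):

$$f(x \mid \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$$

where the term $\frac{x-\mu}{\sigma}$ in the exponent is the z-score.

Cumulative Probability Distribution Function (CDF):

$$F(x \mid \mu, \sigma) = \int_{-\infty}^{x} f(t \mid \mu, \sigma)\, dt$$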

26

The Z-value (or Z-score)

The z-value or z-score is the deviation of the measurement from the mean per unit standard deviation. It is defined by

$$z = \frac{x - \mu}{\sigma},$$

where x is the original measurement, µ is the mean of the x distribution, and σ is the standard deviation.

27

Remarks

Note that we take the word "average" to mean either the sample mean x̄ or the population mean µ. We further note that the original score x is referred to as the "raw score x".

Knowing the z-score, σ, and µ, the raw score x is determined by

$$x = z\sigma + \mu.$$
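The conversion between raw scores and z-scores, in both directions, can be sketched as two small functions. The names `z_score` and `raw_score` are illustrative; the numbers μ = 25, σ = 5, and x = 28.15 are borrowed from Example 5.

```python
def z_score(x: float, mu: float, sigma: float) -> float:
    """Deviation of x from the mean, measured in standard deviations."""
    return (x - mu) / sigma

def raw_score(z: float, mu: float, sigma: float) -> float:
    """Recover the raw score x from a z-score."""
    return z * sigma + mu

z = z_score(28.15, mu=25, sigma=5)
print(round(z, 2))                             # z-score of the raw score 28.15
print(round(raw_score(z, mu=25, sigma=5), 2))  # converts back to the raw score
```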

28

Standard Normal Distribution

If the original distribution of the x values is normal with mean µ, and standard deviation σ, then the corresponding z values have a normal distribution with mean, µ = 0 and standard deviation, σ = 1.

This transformed normal distribution with mean, µ = 0 and standard deviation, σ = 1 is called the standard normal distribution.

29

Proof of Mean & Variance of Z-score

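Given $E(x) = \mu$ and $V(x) = \sigma^{2}$. Let $y = x - \mu$. Then $E(y) = ?$ and $V(y) = ?$

$$\begin{aligned}
E(y) &= E(x - \mu) = E(x) + (-1)E(\mu) = \mu - \mu \\
E(y) &= 0
\end{aligned}$$

$$\begin{aligned}
V(y) &= V(x - \mu) = V(x) + (-1)^{2}V(\mu) = \sigma^{2} + 0 \\
V(y) &= \sigma^{2}
\end{aligned}$$

Hence, let $z = \dfrac{x - \mu}{\sigma}$. Then $E(z) = ?$ and $V(z) = ?$

$$\begin{aligned}
E(z) &= E\left(\frac{x - \mu}{\sigma}\right) = \frac{1}{\sigma}E(x - \mu) = \frac{1}{\sigma}(0) \\
E(z) &= 0
\end{aligned}$$

$$\begin{aligned}
V(z) &= V\left(\frac{x - \mu}{\sigma}\right) = \frac{1}{\sigma^{2}}V(x - \mu) = \frac{\sigma^{2}}{\sigma^{2}} \\
V(z) &= 1
\end{aligned}$$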

30

The Empirical Rule under the Standard Normal Curve

31

The Standard Normal Table

The textbook uses the left tail and half tail style tables interchangeably to solve problems involving the normal distribution. However, for the sake of uniformity we will focus only on the left tail style table. This style of table provides the cumulative area to the left of a given z-score associated with an original raw score x.

32

The Standard Normal Left Tail Table (Left Tail Z Table)

33

Some Remarks about Normal Probability

▪ The total area under the normal curve is always equal to 1.
▪ The portion of the area under the curve within a given interval represents the probability that a measurement will lie in that interval.
▪ The probability that z equals a certain number is always 0: P (z = a) = 0
▪ Therefore, < and ≤ can be used interchangeably. Similarly, > and ≥ can be used interchangeably:
  P (z < b) = P (z ≤ b)
  P (z > c) = P (z ≥ c)

34

Convention? Argumentative.

Some instructors and books state that:
▪ The area to the left of a z-value smaller than –3.49 is 0.000. It is better to state ≤ 0.0002, from the table.
▪ The area to the left of a z-value greater than 3.49 is 1.000. It is better to state ≥ 0.9998, from the table.

Always avoid absolute statements.

35

Use of the Left Tail Normal Table (looking up area under the curve)

Example 2: $P(z \le 0.63) = 0.7357$

36

Example 3

a. $P(z \ge 2.43) = 1 - 0.9925 = 0.0075$

b. $P(z \le 1.78) = 0.9625$

c. $P(z \le -3.09) = 0.001$

d. $P(z \le 0.227) \approx P(z \le 0.23) = 0.5910$

37

Example 4

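$$P(-2.18 \le z \le 1.34) = P(z \le 1.34) - P(z \le -2.18)$$

$$P(z \le -2.18) = 1 - P(z \le 2.18) = 1 - 0.9854 = 0.0146$$

$$P(z \le 1.34) = 0.9099$$

$$P(-2.18 \le z \le 1.34) = 0.9099 - 0.0146 = 0.8953$$

Table lookups:
z row 1.3, column 0.04: 0.9099
z row 2.1, column 0.08: 0.9854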
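The table lookups in Examples 2 to 4 can be cross-checked against a computed left-tail area. The sketch below uses the standard-library `statistics.NormalDist`; `phi` is just a local alias for the standard normal CDF. Because the printed table is rounded to four decimals, the computed values can differ from the table answers in the last digit.

```python
from statistics import NormalDist

# phi(z) is the area to the left of z under the standard normal curve.
phi = NormalDist().cdf

example_2 = phi(0.63)                 # left-tail area, Example 2
example_3a = 1 - phi(2.43)            # right-tail area via the complement
example_4 = phi(1.34) - phi(-2.18)    # area between two z values

print(f"{example_2:.4f} {example_3a:.4f} {example_4:.4f}")
```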

38

Example 5

Given that the mean is 25 and the standard deviation is 5, what is the probability that the observed data point is at most 28.15?

$$P(x \le 28.15) = P\left(\frac{x - 25}{5} \le \frac{28.15 - 25}{5}\right) = P(z \le 0.63) = 0.7357$$

Example 6

39

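Given $\mu = 4$, $\sigma = 2$:

$$P(3 \le x \le 6) = P\left(\frac{3 - 4}{2} \le \frac{x - 4}{2} \le \frac{6 - 4}{2}\right) = P(-0.50 \le z \le 1.00)$$

By symmetry,

$$P(z \le -0.50) = 1 - P(z \le 0.50) = 1 - 0.6915 = 0.3085$$

$$P(z \le 1.00) = 0.8413$$

$$P(-0.50 \le z \le 1.00) = 0.8413 - 0.3085 = 0.5328$$

Table lookups:
z row 1.0, column 0.00: 0.8413
z row 0.5, column 0.00: 0.6915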
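Examples 5 and 6 can also be computed without standardizing by hand, since `statistics.NormalDist` accepts the mean and standard deviation directly. This is a sketch for checking the table-based answers; the variable names are illustrative.

```python
from statistics import NormalDist

# Example 5: mean 25, standard deviation 5, P(x <= 28.15).
example_5 = NormalDist(mu=25, sigma=5).cdf(28.15)

# Example 6: mean 4, standard deviation 2, P(3 <= x <= 6).
x6 = NormalDist(mu=4, sigma=2)
example_6 = x6.cdf(6) - x6.cdf(3)

print(f"{example_5:.4f} {example_6:.4f}")
```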

40

b.

Example 7

41

c.

Example 7 (continued)

42

Inverse Normal Distribution

Sometimes we may be required to find the z value or raw score x that corresponds to a given area under the normal curve.
▪ To do this, we look up the area associated with the given problem and find the corresponding z value.
▪ Next, the raw score x can be computed as follows:

$$x = z\sigma + \mu$$

43

Example 8: Using the information given in example 7,

44

Example 9: Using the information given in example 7,

45

Example 10

1. Find the z value such that 90% of the area under the standard normal curve lies between –z and z.
2. Find the z value such that 3% of the area under the standard normal curve lies to the right of z.
3. If a random variable X is normally distributed with mean 50 and standard deviation 10, find k so that P (X ≥ k) = 0.99.
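The three parts of Example 10 are inverse lookups: start from an area and find the z value or raw score. A sketch using `NormalDist.inv_cdf` from the standard library, which returns the value with a given area to its left; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()

# 1. 90% between -z and z leaves 5% in each tail, so 95% lies to the left of z.
z_central_90 = std_normal.inv_cdf(0.95)

# 2. 3% to the right of z means 97% lies to the left of z.
z_right_3 = std_normal.inv_cdf(0.97)

# 3. P(X >= k) = 0.99 means only 1% lies to the left of k.
k = NormalDist(mu=50, sigma=10).inv_cdf(0.01)

print(f"{z_central_90:.3f} {z_right_3:.3f} {k:.2f}")
```

Part 3 is equivalent to finding z with left-tail area 0.01 and then applying x = zσ + μ.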

46

Sampling Distribution

A sampling distribution is a probability distribution of a sample statistic based on all possible simple random samples of the same size from the same population.

47

Example 11: Sampling Distribution

An appliance center has six sales representatives at its North Jacksonville outlet. Listed below is the number of refrigerators sold by each last month.

Sales Representative   Number Sold
Zina                   54
Woon                   50
Ernie                  52
Jan                    48
Molly                  50
Rachel                 52

a. Select all possible samples of size 2 and compute the sample mean number sold for each sample.
b. What is the distribution of the sample means?
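Both parts of Example 11 can be worked by brute force, since six representatives give only 15 samples of size 2. The sketch below assumes sampling without replacement where order does not matter; the names `sold`, `samples`, and `distribution` are illustrative.

```python
from collections import Counter
from itertools import combinations
from statistics import fmean

sold = {"Zina": 54, "Woon": 50, "Ernie": 52, "Jan": 48, "Molly": 50, "Rachel": 52}

# a. Every possible sample of two representatives and its sample mean.
samples = list(combinations(sold, 2))
sample_means = [(sold[a] + sold[b]) / 2 for a, b in samples]

# b. Distribution of the sample means: how often each mean occurs.
distribution = Counter(sample_means)
for value in sorted(distribution):
    print(f"mean {value}: {distribution[value]} of {len(samples)} samples")

# The mean of all sample means equals the population mean.
print(fmean(sample_means), fmean(sold.values()))
```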

48

Central Limit Theorem (CLT)

▪ In general, if the data are normally distributed, then regardless of the sample size, the sampling distribution of the sample mean follows a normal probability distribution.
▪ On the other hand, if the data do not follow a normal distribution, then the sampling distribution approaches the normal probability distribution only as the sample size increases.

49

Central Limit Theorem

Regardless of the distribution of the data, as the sample size increases, the sampling distribution approaches normality.

50

Central Limit Theorem

Let $x_1, x_2, \ldots, x_n$ be a random sample from a population with finite mean μ and finite variance σ². Let $\bar{x}$ be the sample mean; that is,

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

Then as the sample size n increases, the probability distribution of the sample mean approaches a normal probability distribution with mean μ and variance σ²/n.

Proof

51

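Given that $E(x_i) = \mu$, $V(x_i) = \sigma^{2}$, and $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$.

Prove: $E(\bar{x}) = \mu$ and $V(\bar{x}) = \frac{\sigma^{2}}{n}$, and therefore the standard error for the sampling distribution is $SE_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$.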

Detailed Proof

52

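$$\begin{aligned}
E(\bar{x}) &= E\left(\frac{1}{n}\sum_{i=1}^{n} x_i\right) = \frac{1}{n}E\left(\sum_{i=1}^{n} x_i\right) \\
&= \frac{1}{n}E(x_1 + x_2 + \cdots + x_n) \\
&= \frac{1}{n}\left[E(x_1) + E(x_2) + \cdots + E(x_n)\right] \\
&= \frac{1}{n}\underbrace{(\mu + \mu + \cdots + \mu)}_{n \text{ times}} \\
&= \frac{1}{n}(n\mu) = \mu
\end{aligned}$$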

Detailed Proof (continued)

53

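Because the observations in a random sample are independent, the variance of their sum is the sum of their variances:

$$\begin{aligned}
V(\bar{x}) &= V\left(\frac{1}{n}\sum_{i=1}^{n} x_i\right) = \frac{1}{n^{2}}V\left(\sum_{i=1}^{n} x_i\right) \\
&= \frac{1}{n^{2}}V(x_1 + x_2 + \cdots + x_n) \\
&= \frac{1}{n^{2}}\left[V(x_1) + V(x_2) + \cdots + V(x_n)\right] \\
&= \frac{1}{n^{2}}\underbrace{(\sigma^{2} + \sigma^{2} + \cdots + \sigma^{2})}_{n \text{ times}} \\
&= \frac{1}{n^{2}}(n\sigma^{2}) = \frac{\sigma^{2}}{n}
\end{aligned}$$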

Detailed Proof (continued)

54

$$SE_{\bar{x}} = \sqrt{V(\bar{x})} = \sqrt{\frac{\sigma^{2}}{n}} = \frac{\sigma}{\sqrt{n}}$$
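The results E(x̄) = μ and SE = σ/√n can be illustrated by simulation. The sketch below rests on assumptions that are not from the slides: the population is uniform on [0, 1] (so μ = 0.5 and σ² = 1/12), the sample size is 25, and 20,000 samples are drawn with a fixed seed.

```python
import random
from statistics import fmean, pstdev

random.seed(0)  # fixed seed so the sketch gives the same output on every run

n = 25          # sample size (assumed for illustration)
reps = 20_000   # number of simulated samples (assumed for illustration)

# Assumed population: uniform on [0, 1], so mu = 0.5 and sigma^2 = 1/12.
mu, sigma = 0.5, (1 / 12) ** 0.5

# Draw many samples of size n and record each sample mean.
means = [fmean(random.random() for _ in range(n)) for _ in range(reps)]

print(f"mean of sample means: {fmean(means):.4f}  (mu = {mu})")
print(f"std of sample means:  {pstdev(means):.4f}  (sigma/sqrt(n) = {sigma / n ** 0.5:.4f})")
```

Even though the uniform population is not bell-shaped, the simulated sample means centre on μ with spread close to σ/√n.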

55

Example 12

Assume that the weights of marbles are normally distributed with mean 172 grams and standard deviation 29 grams.
a. If 4 marbles are selected, find the probability that their mean weight is less than 167 grams.
b. If 25 marbles are selected, find the probability that they have a mean weight of more than 167 grams.
c. If 100 marbles are selected, find the probability that they have a mean weight between 167 grams and 180 grams.
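Example 12 can be set up by giving the sample mean its own normal distribution with standard error σ/√n. The following is a sketch of that setup, not the slides' worked solution; the helper `mean_dist` is a name chosen here.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 172, 29  # population mean and standard deviation, in grams

def mean_dist(n: int) -> NormalDist:
    """Sampling distribution of the mean of n marbles: N(mu, sigma / sqrt(n))."""
    return NormalDist(mu, sigma / sqrt(n))

part_a = mean_dist(4).cdf(167)        # P(mean < 167) with n = 4
part_b = 1 - mean_dist(25).cdf(167)   # P(mean > 167) with n = 25
d100 = mean_dist(100)
part_c = d100.cdf(180) - d100.cdf(167)  # P(167 < mean < 180) with n = 100

print(f"a. {part_a:.4f}  b. {part_b:.4f}  c. {part_c:.4f}")
```

The same cutoff of 167 grams becomes more extreme in z units as n grows, because the standard error shrinks from 14.5 to 5.8 to 2.9 grams.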

56

Normal Approximation to the Binomial Distribution

In the binomial distribution, if the sample size n is very large, finding the probability that r ≥ j for some j, where 1 ≤ j ≤ n, requires tedious and lengthy calculations. In such cases, the problem can be solved by using the normal approximation to the binomial distribution.

Procedure:

Step 1: Identify the binomial distribution with n, r, and p, where
n stands for the total number of trials,
r stands for the number of successes (r = 0, 1, 2, …, n),
p stands for the probability of success in a single trial.

Step 2: Check the criterion for the normal approximation to the binomial distribution: np > 5 and nq > 5 (or np ≥ 5 and nq ≥ 5), where q = 1 − p. If it holds, then r has a binomial distribution that can be approximated by a normal distribution with

µ = np and σ = √(npq)

57

Continuity Correction

58

Correction for Continuity

59

Converting Binomial to Standard Normal without correction for continuity

$$z = \frac{x - \mu}{\sigma} = \frac{x - np}{\sqrt{np(1 - p)}}$$

60

Converting Binomial to Standard Normal with correction for continuity

$$z = \frac{(x \pm 0.5) - \mu}{\sigma} = \frac{(x \pm 0.5) - np}{\sqrt{np(1 - p)}}$$

61

Example 13

The Denver Post stated that 80% of all new products introduced in grocery stores fail (and are taken off the market) within 2 years. Suppose a grocery store chain introduces 75 new products. Using the normal approximation for this binomial distribution and correction for continuity:
a. Verify that the assumption for normal approximation to the binomial is satisfied.
b. What is the probability that within two years, 54 or more will fail?
c. What is the probability that within two years, fewer than 62 will fail?
d. What is the probability that within two years, more than 49 will fail?
e. What is the probability that within two years, 58 or fewer fail?

62

Example 13 (solution)

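a. $np = 75(0.80) = 60 \ge 5$ and $nq = 75(0.20) = 15 \ge 5$

b. Without correction for continuity

$\mu = 60$ and $\sigma = 3.464$

$$P(x \ge 54) = P\left(\frac{x - 60}{3.464} \ge \frac{54 - 60}{3.464}\right) = P(z \ge -1.73) = 0.9582$$

With correction for continuity

$\mu = 60$ and $\sigma = 3.464$

$$P(x \ge 54 - 0.5) = P\left(\frac{x - 60}{3.464} \ge \frac{54 - 0.5 - 60}{3.464}\right) = P(z \ge -1.88) = 0.9699$$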

63

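Example 13 (solution)

c. Without correction for continuity

$\mu = 60$ and $\sigma = 3.464$

$$P(x < 62) = P(x \le 61) = P\left(\frac{x - 60}{3.464} \le \frac{61 - 60}{3.464}\right) = P(z \le 0.29) = 0.6141$$

With correction for continuity

$\mu = 60$ and $\sigma = 3.464$

$$P(x \le 61 + 0.5) = P\left(\frac{x - 60}{3.464} \le \frac{61 + 0.5 - 60}{3.464}\right) = P(z \le 0.43) = 0.6664$$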

64

Example 13 (solution)

d. ▪ Without Correction for Continuity

▪ With Correction for Continuity

65

Example 13 (solution)

e. ▪ Without Correction for Continuity

▪ With Correction for Continuity
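All parts of Example 13 can be computed in one pass and compared with the exact binomial probability. This is a sketch under stated assumptions: the helper names are illustrative, and for parts d and e the events are read as P(x ≥ 50) and P(x ≤ 58). Because z is not rounded to two decimals here, results can differ slightly from table-based answers.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 75, 0.80
q = 1 - p
mu, sigma = n * p, sqrt(n * p * q)   # 60 and about 3.464
phi = NormalDist().cdf

def exact_at_most(k: int) -> float:
    """Exact binomial P(X <= k)."""
    return sum(comb(n, r) * p ** r * q ** (n - r) for r in range(k + 1))

def approx_at_most(k: int, correction: bool = True) -> float:
    """Normal approximation to P(X <= k); the correction extends k to k + 0.5."""
    shift = 0.5 if correction else 0.0
    return phi((k + shift - mu) / sigma)

def approx_at_least(k: int, correction: bool = True) -> float:
    """Normal approximation to P(X >= k); the correction extends k to k - 0.5."""
    shift = 0.5 if correction else 0.0
    return 1 - phi((k - shift - mu) / sigma)

print("a.", n * p >= 5 and n * q >= 5)
print("b.", approx_at_least(54, False), approx_at_least(54), 1 - exact_at_most(53))
print("c.", approx_at_most(61, False), approx_at_most(61), exact_at_most(61))
print("d.", approx_at_least(50, False), approx_at_least(50), 1 - exact_at_most(49))
print("e.", approx_at_most(58, False), approx_at_most(58), exact_at_most(58))
```

Each line prints the approximation without correction, with correction, and the exact binomial value, in that order.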

66

PP & QQ Plots for Testing the Assumption of Normality

[Figure: two panels. Left: PP plot. Right: QQ plot.]

67

Normal Probability Plot

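$$z = \frac{x - \mu}{\sigma} = \frac{1}{\sigma}x - \frac{\mu}{\sigma}$$

Let $m = \frac{1}{\sigma}$ and $b = -\frac{\mu}{\sigma}$. Then

$$z = mx + b$$

Hence, the data are normal if the scatter plot of the data against the corresponding z-scores (matched by percentile) falls on a straight line.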
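The percentile matching behind a normal probability plot can be sketched as follows. The plotting position (i + 0.5)/n is one common convention and is an assumption here, as are the function names; the linear correlation is used only as a rough numerical stand-in for judging straightness by eye.

```python
from statistics import NormalDist, fmean

def normal_plot_points(data):
    """Pair each sorted observation with the z-score at its matching percentile."""
    xs = sorted(data)
    n = len(xs)
    # Assumed plotting position: the i-th smallest value sits at percentile (i + 0.5)/n.
    zs = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(xs, zs))

def linear_correlation(points):
    """Pearson correlation between x and z; values near 1 suggest a straight line."""
    xs, zs = zip(*points)
    mx, mz = fmean(xs), fmean(zs)
    sxz = sum((x - mx) * (z - mz) for x, z in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    szz = sum((z - mz) ** 2 for z in zs)
    return sxz / (sxx * szz) ** 0.5

# Hypothetical data sets, for illustration only.
bell_shaped = [50 + 10 * NormalDist().inv_cdf((i + 0.5) / 20) for i in range(20)]
skewed = [2 ** i for i in range(20)]

print(round(linear_correlation(normal_plot_points(bell_shaped)), 3))
print(round(linear_correlation(normal_plot_points(skewed)), 3))
```

Plotting the returned (x, z) pairs gives the scatter plot described above; the slope of the fitted line estimates 1/σ and the intercept estimates −μ/σ.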

68

Testing the Assumption of Normality using the Probability-Probability Plot (PP Plot)

[Figure: two PP plots. Left: approximately normal. Right: not normal.]

69

Normal Quantile Plot

70

Testing the assumption of Normality using the Quantile-Quantile Plot (QQ Plot)

71

Standardized & Percentage Plots

[Figure: two panels. Left: standardized plot. Right: percentage plot.]

72

Normality

Regardless of the data’s distribution, as the sample size increases, the sampling distribution approaches normality.

Most continuous variables are assumed normal and even the discrete probability distribution, binomial, can be approximated using normality.

The normal probability distribution was developed by Gauss; a Gaussian probability distribution shows normality.

To test for normality we can use the PP plot (a metaphoric t-shirt) or the QQ plot (the t-shirt turned inside-out) .

To force normality or normalize the data, we can use the standardized plot or the percentage change plot.

▪ Central Limit Theorem
▪ Continuous Random Variable
▪ Correction for Continuity
▪ Gaussian Probability Distribution
▪ Normal Approximation
▪ Normal Probability Distribution
▪ Sampling Distribution
▪ Standard Score
▪ PP plot
▪ QQ plot
▪ Standardized plot
▪ Percentage plot

73

Assignment Problems

Section 6.1: # 6.1
Section 6.2: # 6.6
Section 6.3: # 6.15, 6.17, 6.19, 6.27, 6.29
Section 6.4: # 6.31, 6.33, 6.41
Section 6.5: # 6.49, 6.54, 6.60
Section 6.6: # 6.65, 6.68, 6.70

Assignment for Chapter 06

Section 6.1

# 6.1 Determine if the following are continuous or discrete random variables:
a. Number of characters in a document.
b. The amount of time it takes to make dinner.
c. The height of a palm tree.

Section 6.2

# 6.6 Illustrate the following curves indicating the points of inflection.
a. X~N()
b. X~N()
c. X~N()

Section 6.3

# 6.15 Determine the probability that the standard normal random variable Z will assume a single value between -1.42 and 0.75.

# 6.17 The random variable X is normally distributed with mean and . Find the following probabilities:

# 6.19 If random variable X is normally distributed with and , find K so that the .

# 6.27 The amount of solid fuels, X, which assumes values of x metric tons, is normally distributed with mean thousand metric tons (kmt) and standard deviation thousand metric tons.
a. Determine the probability that the amount is between 250 kmt and 320 kmt, that is
b. Find the metric tonnage such that the probability that the tonnage is exceeded is 0.80.

# 6.29 The price of coffee, X, which assumes values of x dollars, is normally distributed with mean and standard deviation .
a. Determine the probability that the cost is between $10 and $15, that is
b. Find the cost such that the probability that this price is exceeded is 0.2.

Section 6.4

# 6.31 The mean height of a group of 500 nonsmoking college students is 74 inches and the standard deviation is 5 inches. What is the probability that in a random sample of 25 students from this group, the average height will be between 73 and 75 inches?

# 6.33 Suppose that the weights from a candy packing machine are distributed about a mean of 16 ounces with a standard deviation of 2 ounces. What is the probability that, if nine packages of candy are weighed, their average weight:
a. Will be less than 14 ounces?
b. Will be more than 16 ounces?

# 6.41 A survey of IQ scores of all the United States senators in the history of this body revealed a mean of and standard deviation . What is the probability that the average IQ score of a random sample of 16 senators:
a. Will be lower than 85?
b. Will exceed 85?
c. Will be between 85 and 130?

Section 6.5

# 6.49 According to Chebyshev's rule, what percentage of the data values will lie within one standard deviation of the mean?

# 6.54 According to the empirical rule, what percentage of data values will lie within three standard deviations of the mean?

# 6.60 Based on the empirical rule, how many data values in a set of size 200 would you expect to lie within three standard deviations of the mean?

Section 6.6

# 6.65 Let the random variable X be binomially distributed with and . Evaluate the following probabilities:

# 6.68 A fair coin is tossed 15 times. Determine the probability that between 6 and 8 heads inclusive will occur:
a. Using the binomial probability distribution.
b. Using the normal approximation without correction for continuity.
c. Using the normal approximation with correction for continuity.

# 6.70 A fair die is rolled 200 times. Using normal approximation to the binomial, what is the probability that an ace (one) will appear between 34 and 36 times?
