Chapter 13 Sampling Distributions




Sampling Distributions

• A summary measure such as x̄, s, R, or a proportion that is calculated from sample data is called a sample statistic.

• To obtain the sampling distribution of a statistic, one must take all possible samples of a given size and calculate the value of the statistic of interest for each sample.


Example 13.1

• A class has 6 students who have just purchased new laptops. They paid the following:
– $1800
– $2100
– $2400
– $1200
– $2100
– $2100

Let x denote the cost of a computer.

μ = $1950.00

σ = $377.49

N = 6

What if we sample 3 students at random? What could we expect?
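As a quick check of the population figures above, the following minimal sketch recomputes the mean and standard deviation from the six prices. It treats the six students as the whole population, so the variance is divided by N rather than N - 1; the variable names are illustrative only.

```python
import math

# Laptop prices paid by the six students in Example 13.1
prices = [1800, 2100, 2400, 1200, 2100, 2100]
N = len(prices)

# Population mean
mu = sum(prices) / N

# Population standard deviation: divide by N, not N - 1,
# because the six students are the entire population here
sigma = math.sqrt(sum((x - mu) ** 2 for x in prices) / N)

print(f"N = {N}, mu = {mu:.2f}, sigma = {sigma:.2f}")
```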


Sample     x1     x2     x3      x̄       R        s
1        1800   2100   2400    2100     600   300.00
2        1800   2100   1200    1700     900   458.26
3        1800   2100   2100    2000     300   173.21
4        1800   2100   2100    2000     300   173.21
5        1800   2400   1200    1800    1200   600.00
6        1800   2400   2100    2100     600   300.00
7        1800   2400   2100    2100     600   300.00
8        1800   1200   2100    1700     900   458.26
9        1800   1200   2100    1700     900   458.26
10       1800   2100   2100    2000     300   173.21
11       2100   2400   1200    1900    1200   624.50
12       2100   2400   2100    2200     300   173.21
13       2100   2400   2100    2200     300   173.21
14       2100   1200   2100    1800     900   519.62
15       2100   1200   2100    1800     900   519.62
16       2100   2100   2100    2100       0     0.00
17       2400   1200   2100    1900    1200   624.50
18       2400   1200   2100    1900    1200   624.50
19       2400   2100   2100    2200     300   173.21
20       1200   2100   2100    1800     900   519.62

Mean                           1950     690   367.32
Std. Dev. (Pop.)             168.82  368.65   189.94
Sigma (population: 377.49)   292.40  407.56   398.69

Estimators of σ used in the Sigma row:

$$\hat{\sigma} = \sigma_{\bar{x}}\sqrt{n} \qquad \hat{\sigma} = \frac{\bar{R}}{d_2} \qquad \hat{\sigma} = \frac{\bar{s}}{c_4}$$
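The table of all 20 possible samples can be reproduced by brute-force enumeration. This is a minimal sketch, assuming samples are drawn without replacement and that the order within a sample does not matter; it uses only the Python standard library, and the variable names are illustrative.

```python
import itertools
import statistics

prices = [1800, 2100, 2400, 1200, 2100, 2100]

# All C(6, 3) = 20 samples of size 3, drawn without replacement
samples = list(itertools.combinations(prices, 3))

means = [statistics.mean(s) for s in samples]      # x-bar for each sample
ranges = [max(s) - min(s) for s in samples]        # R for each sample
stdevs = [statistics.stdev(s) for s in samples]    # s for each sample (n - 1 divisor)

# Summary rows: mean and population standard deviation of each statistic
for name, values in [("x-bar", means), ("R", ranges), ("s", stdevs)]:
    print(name,
          round(statistics.mean(values), 2),
          round(statistics.pstdev(values), 2))
```

The 20 values are treated as the complete sampling distribution, which is why `pstdev` (divisor 20) is used for the summary row rather than `stdev`.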


Sampling Errors

• Different samples selected from the same population will give different results because they contain different elements.

• The difference between the value of a sample statistic and the value of the corresponding population parameter is called the sampling error.

Sampling error = x̄ – μ

• All sampling errors occur because of chance.

• Other errors also occur; these are called nonsampling errors.


Nonsampling Error

• Nonsampling errors occur because of human mistakes, not chance.

• Common causes are:
1. The sample is nonrandom.
2. A question is phrased so that it is not fully understood.
3. Respondents intentionally give false information.
4. The poll taker records an answer incorrectly or keys in the wrong answer.


How do errors look?

Sampling Error
• In our computer example, μ = $1950.00
• Say we got a sample of $2100, $1200, $2100; then x̄ = $1800
• So our sampling error is 1800 – 1950 = –$150

Nonsampling Error
• Same example, but the pollster writes down $2100, $1500, $2100
• Then x̄ = $1900
• Even though this is closer to the population mean, the sampling error is still –$150
• The nonsampling error is the recorded mean minus the correct sample mean: 1900 – 1800 = +$100


Population vs. Sample μ, σ

• Remember that in the whole population:
– μ = $1950.00
– σ = $377.49

• While in our samples:
– μ_x̄ = $1950.00
– σ_x̄ = $168.82

• The mean of the sampling distribution of x̄ (for a specific sample size) is the same as the mean of the population.

• Thus x̄ is called an estimator of the population mean.

• The standard deviation of the sample mean, σ_x̄, is always smaller than the population standard deviation σ, as long as the sample size is greater than 1.


Sample Mean Distribution

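[Figure: distribution of the population compared with the distribution of the sample mean]

$$\mu_{\bar{x}} = \mu$$

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} \quad \text{if } \frac{n}{N} > 0.05$$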


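Sample Mean Distribution

• Take into account the size of the sample vs. the population in calculating σ_x̄.

• In Example 13.3, n/N is more than 5%, so…

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \quad \text{if } \frac{n}{N} \le 0.05$$

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} \quad \text{if } \frac{n}{N} > 0.05$$

The factor $\sqrt{(N-n)/(N-1)}$ is the finite population correction factor.

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = \frac{\$377.49}{\sqrt{3}}\sqrt{\frac{6-3}{6-1}} = (\$217.9440)(0.7746) = \$168.82$$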
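The calculation of σ_x̄ with the finite population correction can be sketched as a small helper. The function name `std_error_of_mean` and its optional `N` argument are illustrative choices, not something defined in the slides; the 5% threshold is the rule stated above.

```python
import math

def std_error_of_mean(sigma, n, N=None):
    """Standard deviation of the sample mean.

    Applies the finite population correction factor sqrt((N - n) / (N - 1))
    only when the population size N is given and n / N exceeds 0.05.
    """
    se = sigma / math.sqrt(n)
    if N is not None and n / N > 0.05:
        se *= math.sqrt((N - n) / (N - 1))
    return se

# Laptop example: sigma = 377.49, n = 3, N = 6, so n/N = 0.5 > 0.05
print(round(std_error_of_mean(377.49, 3, 6), 2))
```

Leaving `N` as `None` models a population that is large relative to the sample, in which case the function reduces to σ/√n.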


Sample Range Distribution

$$\mu_R = \bar{R}$$

$$\hat{\sigma} = \frac{\bar{R}}{d_2}$$

n     d2
2     1.128
3     1.693
4     2.059
5     2.326
6     2.534
7     2.704
8     2.847
9     2.970
10    3.078
11    3.173
12    3.258
13    3.336
14    3.407
15    3.472
20    3.735
25    3.931
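As an illustration of the range-based estimate of σ, the sketch below divides the average range by d2. The d2 values are copied from the table above; the helper name `sigma_from_range` is hypothetical, and the inputs (average range 690, sample size 3) come from the laptop example.

```python
# d2 constants copied from the table above, keyed by sample size n
D2 = {
    2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326, 6: 2.534, 7: 2.704,
    8: 2.847, 9: 2.970, 10: 3.078, 11: 3.173, 12: 3.258, 13: 3.336,
    14: 3.407, 15: 3.472, 20: 3.735, 25: 3.931,
}

def sigma_from_range(r_bar, n):
    """Estimate sigma as R-bar / d2 for samples of size n."""
    return r_bar / D2[n]

# Laptop example: average range 690 over samples of size 3
print(round(sigma_from_range(690, 3), 2))
```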


Sample Std Deviation Distribution

$$\mu_s = \bar{s}$$

$$\hat{\sigma} = \frac{\bar{s}}{c_4}$$

$$\sigma_s = \sigma\sqrt{1 - c_4^2}$$

n     c4
2     0.7979
3     0.8862
4     0.9213
5     0.9400
6     0.9515
7     0.9594
8     0.9650
9     0.9693
10    0.9727
11    0.9754
12    0.9776
13    0.9794
14    0.9810
15    0.9823
20    0.9869
25    0.9896
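The tabulated c4 values do not have to be looked up. Assuming a normally distributed population (an assumption the slides do not state explicitly), c4 has the closed form c4 = sqrt(2/(n-1)) · Γ(n/2) / Γ((n-1)/2). The sketch below evaluates it with the standard library; the function name `c4` is an illustrative choice.

```python
import math

def c4(n):
    """Bias-correction constant for the sample standard deviation,
    assuming samples of size n from a normal population."""
    return math.sqrt(2.0 / (n - 1)) * math.gamma(n / 2) / math.gamma((n - 1) / 2)

# Compare against the first few table entries
for n in (2, 3, 4, 5):
    print(n, round(c4(n), 4))

# Standard deviation of s relative to sigma: sqrt(1 - c4^2)
print(round(math.sqrt(1 - c4(3) ** 2), 4))
```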


Sampling Distribution of a Sample Proportion

• Example: You ask 10 classmates if they have change for a dollar, so you can buy a Jolt Cola before class. 4 people had change for a dollar.

• We denote the sample proportion using the symbol p̂.

p̂ = (number of people with change for a dollar) / (number of people asked) = 4/10 = 0.4

• Unlike x̄, the sampling distribution of p̂ follows the binomial distribution. It must meet the following conditions:
– There are n identical trials, each performed under identical conditions.
– Each trial has two and only two mutually exclusive events (outcomes), usually called a success and a failure.
– The probability of success is denoted by p and the probability of failure by q, with p + q = 1. The probabilities p and q remain constant throughout the trials.
– The trials are independent. The outcome of one trial does not affect the outcome of another.
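To see what a binomial sampling distribution of the sample proportion looks like, the sketch below tabulates it for n = 10 classmates. The true proportion p = 0.4 is an assumption made purely for illustration; the example only reports an observed sample proportion of 0.4, not the population value.

```python
import math

n = 10    # classmates asked, as in the example
p = 0.4   # ASSUMED population proportion (hypothetical, not given in the source)
q = 1 - p

# P(p_hat = k/n) = C(n, k) * p^k * q^(n - k) for k = 0, 1, ..., n
dist = {k / n: math.comb(n, k) * p**k * q**(n - k) for k in range(n + 1)}

# Mean and variance of the sampling distribution of p_hat
mean_p_hat = sum(value * prob for value, prob in dist.items())
var_p_hat = sum((value - mean_p_hat) ** 2 * prob for value, prob in dist.items())

for value, prob in dist.items():
    print(f"p_hat = {value:.1f}  probability = {prob:.4f}")
print(f"mean = {mean_p_hat:.4f}, std dev = {math.sqrt(var_p_hat):.4f}")
```

Under this assumption the mean of the distribution equals p and its variance equals pq/n, which the assertions in a test harness can confirm numerically.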


Sample Proportion Distribution

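[Figure: population distribution of x with the lower and upper specification limits (LSL, USL) marked]

$$\mu_{\hat{p}} = p$$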


Sample Proportion Distribution

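$$\mu_{\hat{p}} = p$$

$$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$$

$$p = \frac{\sum D}{\sum n}$$

$$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}\sqrt{\frac{N-n}{N-1}} \quad \text{if } \frac{n}{N} > 0.05$$

Sample    n     D     p
1         n1    D1    p1
2         n2    D2    p2
.         .     .     .
.         .     .     .
k         nk    Dk    pk
Total     Σn    ΣD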
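A minimal sketch of the proportion calculations, assuming D is the count of items with the attribute of interest in each sample of size n and that p is estimated by pooling all k samples. The data values below are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import math

# HYPOTHETICAL data: k = 4 samples, each of size 50, with D "successes" per sample
n_values = [50, 50, 50, 50]
d_values = [4, 6, 5, 5]

# Per-sample proportions p_i = D_i / n_i
p_i = [d / n for d, n in zip(d_values, n_values)]

# Pooled estimate of p: total successes over total items sampled
p = sum(d_values) / sum(n_values)

# Standard deviation of the sample proportion for one sample of size n
# (no finite population correction, i.e. n/N assumed to be at most 0.05)
n = n_values[0]
sigma_p_hat = math.sqrt(p * (1 - p) / n)

print(p_i)
print(round(p, 4), round(sigma_p_hat, 4))
```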


Central-Limit Theorem (CLT)

• The Central Limit Theorem (CLT) states that, irrespective of the underlying distribution of a population (with mean μ and standard deviation σ), if samples of size n are taken repeatedly from the population, then for large n the distribution of the sample mean is approximately normal with a mean of μ and a standard deviation of σ/√n.

• The normal approximation gets better as the sample size n increases.


Central-Limit Theorem (CLT)

• Central Limit Theorem: For a large sample size (n ≥ 30), the shape of the distribution of the sample mean is approximately normal. Also, the shape of the sampling distribution of p̂ is approximately normal for a sample for which np ≥ 10 and nq ≥ 10.
– The sampling distribution does not become a normal distribution when n reaches 30; instead, it takes on a shape that is close to a normal curve.

http://gaussianwaves.blogspot.com/2010/01/central-limit-theorem.html
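The theorem can be checked empirically with a short simulation. This sketch draws repeated samples of size 30 from a strongly skewed population (an exponential distribution with mean 1 and standard deviation 1, chosen here only as an illustration) and compares the mean and spread of the sample means with μ and σ/√n. The seed and the number of repetitions are arbitrary.

```python
import math
import random
import statistics

random.seed(0)      # fixed seed so the run is repeatable

n = 30              # sample size
reps = 5000         # number of simulated samples (arbitrary)
mu = 1.0            # mean of an exponential population with rate 1
sigma = 1.0         # its standard deviation

# Draw `reps` samples of size n and record each sample mean
sample_means = [
    sum(random.expovariate(1.0) for _ in range(n)) / n
    for _ in range(reps)
]

observed_mean = statistics.mean(sample_means)
observed_sd = statistics.pstdev(sample_means)
expected_sd = sigma / math.sqrt(n)

print(f"mean of sample means: {observed_mean:.3f} (theory: {mu:.3f})")
print(f"sd of sample means:   {observed_sd:.3f} (theory: {expected_sd:.3f})")
```

A histogram of `sample_means` would show a roughly bell-shaped curve even though the population itself is far from normal.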


Central-Limit Theorem


Example 13.9

• A factory produces a new synthetic motor oil for older cars that lasts longer than traditional motor oils.
– The distance the oil lasts follows a distribution with a mean of 4800 miles and a standard deviation of 300 miles
– A random sample of 35 older vehicles was tested

• What is the approximate probability that the average distance traveled between oil changes will exceed 4900 miles?

SOLUTION:

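$$\mu_{\bar{x}} = \mu = 4800 \text{ miles}$$

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{300}{\sqrt{35}} = 50.71 \text{ miles}$$

$$P(\bar{x} > 4900) = P\left(z > \frac{4900 - 4800}{50.71}\right) = P(z > 1.97) = 1 - .9756 = .0244$$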
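The same probability can be computed without a z table. This minimal sketch builds the standard normal CDF from `math.erf`; the helper name `normal_cdf` is illustrative. Because it does not round z to two decimals, the result may differ from the table-based answer in the fourth decimal place.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu, sigma, n = 4800, 300, 35

se = sigma / math.sqrt(n)        # standard deviation of the sample mean
z = (4900 - mu) / se             # standardized value of 4900 miles
prob = 1.0 - normal_cdf(z)       # P(x-bar > 4900)

print(f"se = {se:.2f}, z = {z:.2f}, P(x-bar > 4900) is about {prob:.3f}")
```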