Sampling Distribution & Confidence Interval
A Normal Distribution
Example: The distribution of serum cholesterol levels for 40- to 70-year-old males living in community A has a mean of 211 mg/100 ml and a standard deviation of 46 mg/100 ml. If an individual is selected at random from this population, what is the probability that his serum cholesterol level is higher than 225?
$X \sim N(\mu = 211, \sigma = 46)$
$P(X > 225) = ?$
$z = \dfrac{225 - 211}{46} = 0.30$
$P(X > 225) = P(Z > 0.30) = 0.382$
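The calculation above can be checked numerically. The following is a minimal Python sketch that assumes the cholesterol level is normally distributed, as stated in the example; it uses `statistics.NormalDist` from the standard library, and the variable names are illustrative only.

```python
from statistics import NormalDist

# Population parameters taken from the cholesterol example.
mu, sigma = 211.0, 46.0
cholesterol = NormalDist(mu=mu, sigma=sigma)

# Standardize the cutoff, then take the area to the right of it.
z = (225 - mu) / sigma
p_upper = 1 - cholesterol.cdf(225)

print(f"z = {z:.2f}")
print(f"P(X > 225) = {p_upper:.3f}")
```

With z left unrounded (about 0.304) the tail area comes out near 0.380; rounding z to 0.30 before reading a table gives the value .382 used on the slide.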
Statistical Inference
Inferential Statistics
1. Types of Inference: Estimation, Hypothesis Testing
2. Purpose: Make Decisions about Population Characteristics
Inference Process
1. Identify Population
2. Find Representative Sample
3. Sample Statistic
4. Estimates & Tests
5. Decision & Conclusion
Statistics Used to Estimate Population Parameters
Statistics (Estimators)         Parameters
Sample Mean, $\bar{x}$          µ, population mean
Sample Variance, $s^2$          $\sigma^2$, population variance
Sample Proportion, $\hat{p}$    p, population proportion
Sampling Distribution
Theoretical Probability Distribution of the Sample Statistic.
What is the Shape of this distribution?
What are the values of the parameters such as mean and standard deviation?
Probability Related to Mean
Example: The distribution of serum cholesterol levels for 40- to 70-year-old males living in community A has a mean of 211 mg/100 ml and a standard deviation of 46 mg/100 ml. If a random sample of 100 individuals is taken from this population, what is the probability that the average serum cholesterol level of these 100 individuals is higher than 225?
$P(\bar{X} > 225) = ?$ What is the probability that the mean of the sample is greater than 225?
$\bar{X} \sim\ ?\,(\mu_{\bar{x}} = ?,\ \sigma_{\bar{x}} = ?)$
What is the sampling distribution of the sample mean?
Sampling Distribution of the Mean
If a random sample is taken from a population that has a mean µ and a standard deviation σ, the sampling distribution of the sample mean, $\bar{x}$, will have a mean equal to the population mean and a standard deviation equal to the population standard deviation divided by the square root of the sample size:
$\mu_{\bar{x}} = \mu$, $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$
Sampling Distribution
Population Distribution: µ = 8, σ = 2
Sampling Distribution of Mean (sample size n = 25): $\mu_{\bar{x}} = 8$, $\sigma_{\bar{x}} = \dfrac{2}{\sqrt{25}} = 0.4$
Standard Error of Mean
1. Formula: $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} \approx \dfrac{s}{\sqrt{n}}$
2. Standard Deviation of the Sampling Distribution of the Sample Mean, $\bar{X}$
3. Less Than the Population Standard Deviation: $\dfrac{\sigma}{\sqrt{n}} < \sigma$ (for n > 1)
Distribution Shape
What is the shape of the sampling distribution of mean?
A theorem on the sampling distribution of the mean:
If the population to be sampled is normally distributed, then the sampling distribution of the mean is normally distributed.
$P(\bar{X} > 225) = ?$ Cholesterol level has a mean of 211 and an s.d. of 46.
If the population is normally distributed, the sampling distribution of the mean is normally distributed.
Parameters of the sampling distribution of the mean (n = 100):
$\mu_{\bar{x}} = \mu = 211$
$\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{46}{\sqrt{100}} = 4.6$
$\bar{X} \sim N(\mu_{\bar{x}} = 211,\ \sigma_{\bar{x}} = 4.6)$
Central Limit Theorem
What if the population sampled is not normally distributed?
If a relatively large random sample is taken from a population that has a mean µ and a standard deviation σ, then regardless of the distribution of the population, the distribution of the sample means is approximately normal with
$\mu_{\bar{x}} = \mu$, $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$
As the sample size gets large enough (n ≥ 30), the sampling distribution becomes almost normal.
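The Central Limit Theorem can be illustrated with a small simulation. The sketch below is only an illustration under assumed settings: it draws samples from a right-skewed exponential population with mean 10 and standard deviation 10 (a population chosen here for convenience, not one from the slides), and compares the standard deviation of the simulated sample means with $\sigma/\sqrt{n}$. The function name `simulate_sample_means` and the number of repetitions are hypothetical choices.

```python
import random
from statistics import fmean, stdev

def simulate_sample_means(n, reps=2000, seed=0):
    """Draw `reps` samples of size n from a skewed population; return the sample means."""
    rng = random.Random(seed)
    # expovariate(1/10) has mean 10 and standard deviation 10 (assumed population).
    return [fmean(rng.expovariate(1 / 10) for _ in range(n)) for _ in range(reps)]

results = {n: simulate_sample_means(n) for n in (2, 30, 100)}

for n, means in results.items():
    theoretical_se = 10 / n ** 0.5
    print(f"n={n:3d}  mean of means={fmean(means):6.2f}  "
          f"sd of means={stdev(means):5.2f}  sigma/sqrt(n)={theoretical_se:5.2f}")
```

The mean of the sample means should stay close to 10 for every n, while their standard deviation should shrink roughly like $10/\sqrt{n}$.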
Sampling from Non-Normal Populations
Population Distribution: µ = 50, σ = 10
Mean: $\mu_{\bar{x}} = \mu$
Standard Error: $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$
Sampling Distribution (n = 4): $\mu_{\bar{x}} = 50$, $\sigma_{\bar{x}} = 5$
Sampling Distribution (n = 30): $\mu_{\bar{x}} = 50$, $\sigma_{\bar{x}} = 1.8$
A Random Sample from Population
Population mean = 19.9, standard deviation = 12.6
[Histogram: Random Sample of Size 400 from Population; Mean = 20.7, Std. Dev = 12.92, N = 400]
Simulated Sampling Distribution of Means
[Histograms: 400 simulated sample means for each sample size]
Sample size n    Mean of sample means    Std. Dev of sample means
2                20.3                    8.88
4                19.4                    5.40
10               19.9                    4.32
25               19.84                   2.23
50               19.75                   1.64
100              19.81                   1.20
Probability Related to Mean
Example: The distribution of serum cholesterol levels for 40- to 70-year-old males living in community A has a mean of 211 mg/100 ml and a standard deviation of 46 mg/100 ml. If a random sample of 100 individuals is taken from this population, what is the probability that the average serum cholesterol level of these 100 individuals is higher than 225?
Cholesterol level has a mean of 211 and an s.d. of 46; n = 100.
$\bar{X} \sim N(\mu_{\bar{x}} = 211,\ \sigma_{\bar{x}} = 4.6)$
$z = \dfrac{225 - 211}{4.6} = 3.04$
$P(\bar{X} > 225) = P(Z > 3.04) = 0.001$
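The same probability can be computed for the sample mean by replacing σ with the standard error. This is a minimal Python sketch using `statistics.NormalDist`; it assumes the sampling distribution of the mean is (approximately) normal, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 211.0, 46.0, 100

# Standard error of the mean: sigma / sqrt(n) = 46 / 10 = 4.6
standard_error = sigma / sqrt(n)
sample_mean_dist = NormalDist(mu=mu, sigma=standard_error)

z = (225 - mu) / standard_error
p_upper = 1 - sample_mean_dist.cdf(225)

print(f"standard error = {standard_error:.1f}")
print(f"z = {z:.2f}")
print(f"P(sample mean > 225) = {p_upper:.3f}")
```

Compared with the single-individual case, the cutoff of 225 is now about three standard errors above the mean, so the tail probability drops to roughly 0.001.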
Introduction to Estimation
Confidence Intervals & Sample Size
Disadvantage of Point Estimation
1. Provides a Single Value Based on Observations from 1 Sample. For example, the Sample Mean $\bar{X}$ = 98 Is a Point Estimate of the Unknown Population Mean.
2. Gives No Information about How Close the Value Is to the Unknown Population Parameter
Which of the following statistics do you prefer?
a. 32%
b. 32% with a margin of error of 3%
Estimation
You’re interested in finding the average body temperature of healthy adults in Northeastern Ohio (the population). What would you do?
How can we estimate this average with a measure of reliability?
98 ± 1 °F, 98 ± 0.5 °F, 98 ± 0.2 °F
Interval Estimation
Margin of Error Gives Information about How Close Value Is to the Unknown Population Parameter.
Sampling Error
Sample statistic (point estimate): $\bar{x}$
Sampling Error = $|\mu - \bar{x}|$
Key Elements of Interval Estimation
Sample statistic (point estimate)
Confidence limit (lower)
Confidence limit (upper)
Confidence interval: $\bar{x}$ ± Margin of Error, e.g. 98 ± 1 °F
Confidence Level: the probability that an interval constructed this way contains the population parameter.
Sampling Distribution of the Mean
[Figure: sampling distribution of $\bar{X}$, centered at µ with standard error $\sigma_{\bar{x}}$]
The sampling distribution is normal when sampling from a normally distributed population, and approximately normal when the sample is relatively large.
Within how many standard errors of the mean does 95% of the sampling distribution lie?
From $\mu - ?\,\sigma_{\bar{x}}$ to $\mu + ?\,\sigma_{\bar{x}}$, with area .95 in the middle and .025 in each tail.
A Special Notation
$z_{\alpha}$ = the z score such that the proportion of the standard normal distribution to its right is α.
Upper-tail areas of the standard normal distribution:
Z      .05     .06     .07
1.8    .032    .031    .031
1.9    .026    .025    .024
2.0    .020    .020    .019
2.1    .016    .015    .015
$z_{.025} = 1.96$
$z_{.010} = ?$
The Confidence Interval
Confidence Level: 1 − α = .95, α/2 = .025, $z_{.025} = 1.96$
95% of sample means fall between $\mu - 1.96\,\sigma_{\bar{x}}$ and $\mu + 1.96\,\sigma_{\bar{x}}$.
Confidence Interval: from $\bar{x} - 1.96\,\sigma_{\bar{x}}$ to $\bar{x} + 1.96\,\sigma_{\bar{x}}$
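Confidence Interval for Mean (σ Known)
(1 − α)·100% Confidence Interval Estimate for the mean of a normal population:
$\left(\bar{X} - Z_{\alpha/2}\cdot\dfrac{\sigma}{\sqrt{n}},\ \bar{X} + Z_{\alpha/2}\cdot\dfrac{\sigma}{\sqrt{n}}\right)$
or
$\bar{X} \pm Z_{\alpha/2}\cdot\dfrac{\sigma}{\sqrt{n}}$, where $Z_{\alpha/2}\cdot\dfrac{\sigma}{\sqrt{n}}$ is the Margin of Error.
“σ Known” may mean that we have a very good estimate of σ. It is usually not practical to assume that we know σ.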
Confidence Interval of Mean (σ Unknown and n ≥ 30)
(1 − α)·100% Confidence Interval Estimate for the mean of a population when the sample size is relatively large:
$\left(\bar{X} - Z_{\alpha/2}\cdot\dfrac{s}{\sqrt{n}},\ \bar{X} + Z_{\alpha/2}\cdot\dfrac{s}{\sqrt{n}}\right)$
or
$\bar{X} \pm Z_{\alpha/2}\cdot\dfrac{s}{\sqrt{n}}$
The Confidence Interval
[Figure: 95% confidence intervals $\bar{x} \pm 1.96\,\sigma_{\bar{x}}$ from repeated samples, drawn beneath the sampling distribution of $\bar{X}$; 2.5% of sample means fall in each tail]
95% of intervals contain µ. 5% do not.
Factors Affecting Interval Width
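1. Data Dispersion: Measured by σ
2. Sample Size: Affects the standard error, $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$
3. Level of Confidence (1 − α): Affects $Z_{\alpha/2}$
$\left(\bar{X} - z_{\alpha/2}\cdot\dfrac{\sigma}{\sqrt{n}},\ \bar{X} + z_{\alpha/2}\cdot\dfrac{\sigma}{\sqrt{n}}\right)$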
Size of Interval
90% of samples: $\mu \pm 1.65\,\sigma_{\bar{x}}$
95% of samples: $\mu \pm 1.96\,\sigma_{\bar{x}}$
99% of samples: $\mu \pm 2.58\,\sigma_{\bar{x}}$
Estimation Example Mean (σ Known)
The average weight of a random sample of n = 25 subjects is $\bar{X}$ = 140. Set up a 95% confidence interval estimate for µ if σ = 10. (Assume a normal population.)
$1 - \alpha = .95$, $\alpha = .05$, $\alpha/2 = .025$, $z_{.025} = 1.96$
$\bar{X} \pm Z_{\alpha/2}\cdot\dfrac{\sigma}{\sqrt{n}} = 140 \pm 1.96\cdot\dfrac{10}{\sqrt{25}} = 140 \pm 3.92 \Rightarrow (136.08,\ 143.92)$
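The σ-known interval can be reproduced with a short function. This is a sketch rather than a definitive implementation: the name `z_interval` and its parameters are hypothetical, and the critical value comes from `NormalDist().inv_cdf` instead of a table.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sd, n, confidence=0.95):
    """Return (lower, upper) for xbar +/- z_{alpha/2} * sd / sqrt(n)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    margin = z * sd / sqrt(n)                 # margin of error
    return xbar - margin, xbar + margin

# Weight example: n = 25, sample mean 140, sigma = 10, 95% confidence.
lower, upper = z_interval(140, 10, 25)
print(f"({lower:.2f}, {upper:.2f})")
```

The same function covers the large-sample case with σ unknown by passing the sample standard deviation s as `sd`.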
Interpretation
We can be 95% confident that the population mean is in (136.08, 143.92).
We can be 95% confident that the sampling error from using this interval to estimate the mean is at most 3.92.
Thinking Challenge
Example: A city uses a noise index to monitor noise pollution in a certain area of the city. A random sample of 100 observations, taken around noon on randomly selected days, showed an average index value of $\bar{x}$ = 1.99 and a standard deviation of s = 0.05. Find the 90% confidence interval estimate of the average noise index at noon.
Confidence Interval Solution*
$1 - \alpha = .90$, $\alpha = .10$, $\alpha/2 = .05$, $Z_{\alpha/2} = Z_{.05} = 1.64$
$\bar{X} \pm Z_{\alpha/2}\cdot\dfrac{s}{\sqrt{n}} = 1.99 \pm 1.64\cdot\dfrac{0.05}{\sqrt{100}} = 1.99 \pm 0.008 \Rightarrow (1.982,\ 1.998)$
Interval Estimation for Mean
In a survey of a random sample of 64 individuals who gambled in Las Vegas, the average amount of money won on the day of the survey was –$25.50, with a standard deviation of $100. Find the 95% confidence interval estimate for the average amount of money won by people who gambled in Las Vegas that day.
Finding Sample Sizes for Estimating µ
I don’t want to sample too much or too little!
C.I.: $\bar{x} \pm z_{\alpha/2}\cdot\dfrac{\sigma}{\sqrt{n}}$
Margin of Error: $B = Z_{\alpha/2}\cdot\dfrac{\sigma}{\sqrt{n}}$
Solving for n: $n = \dfrac{z_{\alpha/2}^{2}\,\sigma^{2}}{B^{2}}$
B = Margin of Error or Bound
Sample Size Example
What sample size is needed to be 90% confident of being correct within ± 5? A pilot study suggested that the standard deviation is 45.
$n = \dfrac{Z_{.05}^{2}\,\sigma^{2}}{B^{2}} = \dfrac{(1.645)^{2}(45)^{2}}{5^{2}} = 219.2 \cong 220$
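The sample-size formula for estimating a mean is easy to wrap in a helper that always rounds up. In this sketch the function name `sample_size_for_mean` is hypothetical, and the critical value is computed with `NormalDist().inv_cdf`, so it may differ slightly in the later decimals from a table value such as 1.645.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, bound, confidence):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= bound."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # n = z^2 * sigma^2 / B^2, rounded up to the next whole observation.
    return ceil((z * sigma / bound) ** 2)

# Example from the slide: 90% confidence, bound of 5, pilot sigma of 45.
n_needed = sample_size_for_mean(sigma=45, bound=5, confidence=0.90)
print(n_needed)
```

Rounding up rather than to the nearest integer keeps the margin of error at or below the requested bound.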
Thinking Challenge
You plan to survey residents in your county to find the average health insurance premium that they are paying. You want to be 95% confident that the sample mean is within ± $50 of the population mean. A pilot study showed that σ was about $400. What sample size should you use?
Sample Size Solution*
$n = \dfrac{Z_{.025}^{2}\,\sigma^{2}}{B^{2}} = \dfrac{(1.96)^{2}(400)^{2}}{50^{2}} = 245.86 \cong 246$
Confidence Interval Mean (σ Unknown & n< 30)
1. Assumptions
   Population Standard Deviation Is Unknown
   Population Must Be Normally Distributed
2. Use Student’s t Distribution
3. Confidence Interval Estimate
$\left(\bar{X} - t_{\alpha/2,\,n-1}\cdot\dfrac{S}{\sqrt{n}},\ \bar{X} + t_{\alpha/2,\,n-1}\cdot\dfrac{S}{\sqrt{n}}\right)$
or
$\bar{X} \pm t_{\alpha/2,\,n-1}\cdot\dfrac{S}{\sqrt{n}}$
Student’s t Distribution
$t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$
[Figure: standard normal (Z) curve compared with t curves for df = 13 and df = 5; the t curves are bell-shaped and symmetric, with ‘fatter’ tails]
Student’s t Table
For a 90% C.I. with n = 3: df = n − 1 = 2, α = .10, α/2 = .05, $t_{\alpha/2} = t_{.05,\,2} = 2.920$
Estimation Example Mean (σ Unknown)
A random sample of weights of 25 subjects has a sample mean of 140 and a sample standard deviation of 8. Set up a 95% confidence interval estimate for µ.
$1 - \alpha = .95$, $\alpha = .05$, $\alpha/2 = .025$, $t_{\alpha/2,\,df=24} = t_{.025,\,24} = 2.064$
$\bar{X} \pm t_{\alpha/2,\,n-1}\cdot\dfrac{S}{\sqrt{n}} = 140 \pm 2.064\cdot\dfrac{8}{\sqrt{25}} = 140 \pm 3.30 \Rightarrow (136.70,\ 143.30)$
Thinking Challenge
The number of community hospital beds per 1000 population available in the different regions of the country is normally distributed. A random sample of 6 regions was selected, and the recorded rates of beds per 1000 were
3.6, 4.2, 4.0, 3.5, 3.8, 3.1.
Find the 90% confidence interval estimate of the mean bed-rate in the country.
Confidence Interval Solution*
$\bar{x} = 3.7$, $s = 0.38987$
$\dfrac{s}{\sqrt{n}} = \dfrac{0.38987}{\sqrt{6}} = 0.1592$
n = 6, df = n − 1 = 6 − 1 = 5 (use 90% confidence level)
$t_{.05,\,5} = 2.015$
$\bar{X} \pm t_{\alpha/2,\,n-1}\cdot\dfrac{S}{\sqrt{n}}$: ( 3.7 − (2.015)(0.1592), 3.7 + (2.015)(0.1592) )
( 3.379, 4.021 )
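The hospital-bed solution can be reproduced from the raw data. The following Python sketch uses only the standard library, which has no t quantile function, so the critical value 2.015 is copied from the t table as on the slide; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

beds = [3.6, 4.2, 4.0, 3.5, 3.8, 3.1]

# t_{.05, 5} for a 90% interval with df = 5, taken from the t table (not computed here).
t_crit = 2.015

n = len(beds)
xbar = mean(beds)
s = stdev(beds)                    # sample standard deviation (divides by n - 1)
margin = t_crit * s / sqrt(n)      # t * standard error
lower, upper = xbar - margin, xbar + margin

print(f"xbar = {xbar:.1f}, s = {s:.5f}")
print(f"({lower:.3f}, {upper:.3f})")
```

Note that `statistics.stdev` is the sample standard deviation; `statistics.pstdev` would divide by n and give a slightly narrower, incorrect interval here.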
Confidence interval with z-score: The (1 − α)·100% confidence interval estimate for the population mean:
Assumption: If sampled from a normal population with known standard deviation σ,
$\bar{x} \pm z_{\alpha/2}\cdot\dfrac{\sigma}{\sqrt{n}}$
Assumption: If the sample is large and σ is unknown, s replaces σ,
$\bar{x} \pm z_{\alpha/2}\cdot\dfrac{s}{\sqrt{n}}$
Confidence interval with t-score: The (1 − α)·100% confidence interval estimate for the population mean:
Assumption: If sampled from a normal population with unknown standard deviation σ,
$\bar{x} \pm t_{\alpha/2,\,df=n-1}\cdot\dfrac{s}{\sqrt{n}}$
(If the sample size is large, the normality assumption matters little.)
t → z as the sample becomes large
Average Weight for Ten-Year-Old Female Children in the US
Info. from a random sample: n = 10, $\bar{x}$ = 80 lb, s = 18.05 lb. Assuming weight is normally distributed, find the 95% confidence interval estimate for the average weight.
Data: 73.80 50.00 101.40 67.20 102.20 97.80 81.00 93.40 63.20 70.00
How do we know whether the normality assumption is OK?
Tests of Normality
Weight (pounds) of participant:
Test                      Statistic    df    Sig.
Kolmogorov-Smirnov (a)    .171         10    .200*
Shapiro-Wilk              .930         10    .452
* This is a lower bound of the true significance.
a. Lilliefors Significance Correction.
Both significance values are greater than 0.05, so the normality assumption is acceptable.
$t_{\alpha/2} = t_{.05/2} = t_{.025}$, d.f. = 10 − 1 = 9, $t_{.025,\,9} = 2.262$
$\bar{x} \pm t_{\alpha/2,\,df}\cdot\dfrac{s}{\sqrt{n}} \Rightarrow 80 \pm 2.262\cdot\dfrac{18.05}{\sqrt{10}} \Rightarrow 80 \pm 12.91 \Rightarrow (67.09,\ 92.91)$
Descriptives: Weight (pounds) of participant, by "What is your sex?"
Female                                          Statistic    Std. Error
Mean                                            80.0000      5.70840
95% Confidence Interval for Mean, Lower Bound   67.0867
95% Confidence Interval for Mean, Upper Bound   92.9133
5% Trimmed Mean                                 80.4333
Median                                          77.4000
Variance                                        325.858
Std. Deviation                                  18.05153
Minimum                                         50.00
Maximum                                         102.20
Range                                           52.20
Interquartile Range                             32.5000
Skewness                                        -.148        .687
Kurtosis                                        -1.229       1.334
Male                                            Statistic    Std. Error
Mean                                            86.8600      3.96048
95% Confidence Interval for Mean, Lower Bound   77.9008
95% Confidence Interval for Mean, Upper Bound   95.8192
Weight for Ten Year Old (female): 80 ± 12.91
Confidence Interval Estimate of Proportion
Proportion Estimation
Parameter: Population Proportion p (or π) (e.g., the percentage of people who have no health insurance)
Statistic: Sample Proportion $\hat{p} = \dfrac{x}{n}$
x is the number of successes; n is the sample size
Confidence Interval Proportion
1. Assumptions
   Two Categorical Outcomes
   Normal Approximation Can Be Used If np and n(1 − p) Are Both Greater Than 5.
2. Confidence Interval Estimate (for large sample)
$\left(\hat{p} - z_{\alpha/2}\cdot\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}},\ \hat{p} + z_{\alpha/2}\cdot\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}\right)$
or
$\hat{p} \pm z_{\alpha/2}\cdot\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$
Estimation Example Proportion
A random sample of 400 from a large community showed that 32 have diabetes. Set up a 95% confidence interval estimate for p, the percentage of people that have diabetes.
$\hat{p} = \dfrac{32}{400} = .08$, n = 400, $z_{\alpha/2} = z_{.025} = 1.96$
The 95% C.I. for p, the percentage of people that have diabetes:
$\hat{p} \pm Z_{\alpha/2}\cdot\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}} = .08 \pm 1.96\cdot\sqrt{\dfrac{.08(1-.08)}{400}} = .08 \pm .027$
8% ± 2.7% ⇒ ( .053, .107 )
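The large-sample interval for a proportion follows the same pattern as the interval for a mean, with $\sqrt{\hat{p}(1-\hat{p})/n}$ as the standard error. The sketch below is illustrative only; `proportion_interval` is a hypothetical helper name, and it does not check the normal-approximation condition, which the reader should verify separately.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, confidence=0.95):
    """Return (lower, upper) for p_hat +/- z_{alpha/2} * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Diabetes example: 32 of 400 sampled residents, 95% confidence.
lower, upper = proportion_interval(32, 400)
print(f"({lower:.3f}, {upper:.3f})")
```

Passing `confidence=0.90` switches the critical value to about 1.645 without any other change.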
Thinking Challenge
A member of a health department wishes to see what percentage of people in a community will support an environmental policy. Of the 200 survey forms sent out and returned, 35 indicated support for the policy and the rest did not. Find a 90% confidence interval estimate of the percentage of the population in this community that supports the policy.
Confidence Interval Solution*
$\hat{p} = \dfrac{35}{200} = .175$, n = 200, $z_{\alpha/2} = 1.645$
$\hat{p} \pm z_{\alpha/2}\cdot\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}} = .175 \pm 1.645\cdot\sqrt{\dfrac{.175(.825)}{200}} = .175 \pm .0442$
17.5% ± 4.42% = ( 13.08%, 21.92% )
Example: Researchers wish to estimate the percentage of hospital employees infected by SARS in a certain country. Out of 500 randomly chosen hospital employees, 14 were infected. Find the 95% confidence interval estimate for the percentage of hospital employees infected by SARS in this country.
Sample Size
C.I.: $\hat{p} \pm z_{\alpha/2}\cdot\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$
Margin of Error: $B = Z_{\alpha/2}\cdot\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$
$n = \dfrac{z_{\alpha/2}^{2}}{B^{2}}\cdot\hat{p}(1-\hat{p})$ if a pilot study is done,
or
$n = \dfrac{z_{\alpha/2}^{2}}{B^{2}}\cdot 0.25$ to get the largest sample to achieve the goal.
Sample Size (No prior information on p)
Sample Size Example: If one wishes to do a survey to estimate the population proportion with 95% confidence and a margin of error of 3%, how large a sample is needed?
$Z_{\alpha/2} = 1.96$; B = .03
$n = (1.96^{2}/.03^{2}) \times .25 = 1067.11$
A sample of size 1068 is needed.
Sample Size (With prior information on p)
Sample Size Example: If one wishes to estimate the percentage of people infected with West Nile in a population with 95% confidence and a margin of error of 3%, how large a sample is needed? (A pilot study has been done, and the sample proportion was 6%.)
$Z_{\alpha/2} = 1.96$; B = .03
$n = (1.96^{2}/.03^{2}) \times .06 \times (1 - .06) = 240.7$
A sample of size 241 is needed.
How large a sample was used for pilot study?
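Both sample-size cases for a proportion can be handled by one small function. In this sketch the name `sample_size_for_proportion` and its defaults are assumptions made for illustration: z defaults to the table value 1.96 used on the slides, and p defaults to 0.5, which gives the conservative factor 0.25 when no prior information is available.

```python
from math import ceil

def sample_size_for_proportion(bound, z=1.96, p=0.5):
    """n = z^2 / B^2 * p * (1 - p), rounded up; p = 0.5 gives the largest (safest) n."""
    return ceil(z ** 2 / bound ** 2 * p * (1 - p))

# No prior information on p: use p = 0.5.
n_conservative = sample_size_for_proportion(bound=0.03)

# Pilot study suggests p is about 0.06.
n_with_pilot = sample_size_for_proportion(bound=0.03, p=0.06)

print(n_conservative, n_with_pilot)
```

The gap between the two results shows how much a pilot estimate of p can reduce the required sample when p is far from 0.5.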