8/4/2019 SDA 3E Chapter 4
2007 Pearson Education
Chapter 4: Sampling and Estimation
Need for Sampling
Very large populations
Destructive testing
Continuous production process
The objective of sampling is to draw a valid inference about
a population.
Sample Design
Sampling plan: a description of the approach that will be used to obtain samples from a population
Objectives
Target population
Population frame
Method of sampling
Operational procedures for data collection
Statistical tools for analysis
Sampling Methods
Subjective
Judgment sampling
Convenience sampling
Probabilistic
Simple random sampling: every subset of a given size has an equal chance of being selected
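As a minimal sketch of simple random sampling outside of a spreadsheet, the snippet below draws a sample without replacement using Python's standard library. The population frame of customer IDs, the function name, and the seed are illustrative assumptions, not part of the original slides.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n items without replacement so every subset of size n is equally likely."""
    rng = random.Random(seed)          # seeded only to make the illustration repeatable
    return rng.sample(list(population), n)

# Hypothetical population frame: customer IDs 1..100
frame = range(1, 101)
sample = simple_random_sample(frame, 10, seed=42)
print(sample)
```

Passing a seed is only for reproducibility of the illustration; in practice the seed would usually be left unset.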
PHStat Tool: Random Sample Generator
PHStat menu > Sampling > Random Sample Generator
Enter sample size
Select sampling method
Excel Data Analysis Tool: Sampling
Excel menu > Tools > Data Analysis > Sampling
Specify input range of data
Choose sampling method
Select output option
Other Sampling Methods
Systematic sampling
Stratified sampling
Cluster sampling
Sampling from a continuous process
Errors in Sampling
Nonsampling error
Poor sample design
Sampling (statistical) error
Depends on sample size
Tradeoff between cost of sampling and accuracy of estimates obtained by sampling
Estimation
Estimation: assessing the value of a population parameter using sample data.
Point estimate: a single number used to estimate a population parameter
Confidence interval: a range of values between which a population parameter is believed to lie, along with the probability that the interval correctly estimates the true population parameter
Common Point Estimates
Theoretical Issues
Unbiased estimator: one for which the expected value equals the population parameter it is intended to estimate
The sample variance is an unbiased estimator for the population variance
Sample variance: $s^2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}$
Population variance: $\sigma^2 = \frac{\sum_{i=1}^{N}(x_i - \mu)^2}{N}$
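The unbiasedness claim can be checked on a tiny case. The sketch below assumes a hypothetical four-value population and enumerates every ordered sample of size 2 drawn with replacement; averaging the sample variance (divisor n - 1) over all samples should reproduce the population variance, while the divisor-n version falls short.

```python
from itertools import product
from statistics import mean, pvariance, variance

population = [1, 2, 3, 4]            # hypothetical tiny population
sigma2 = pvariance(population)       # population variance (divide by N)

# Every ordered sample of size 2 drawn with replacement (16 samples).
samples = list(product(population, repeat=2))

# Average of the sample variance with divisor n - 1 over all samples.
avg_unbiased = mean([variance(s) for s in samples])
# Average of the divisor-n version, which underestimates the population variance.
avg_biased = mean([pvariance(s) for s in samples])

print(sigma2, avg_unbiased, avg_biased)
```

The enumeration is exact rather than simulated, so no random seed is involved.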
Interval Estimates
Range within which we believe the true population parameter falls
Example: Gallup poll; percentage of voters favoring a candidate is 56% with a 3% margin of error.
Interval estimate is [53%, 59%]
Confidence Intervals
Confidence interval (CI): an interval estimate that specifies the likelihood that the interval contains the true population parameter
Level of confidence ($1-\alpha$): the probability that the CI contains the true population parameter, usually expressed as a percentage (90%, 95%, 99% are most common).
Sampling Distribution of the
Mean
Interval Estimate Containing the
True Population Mean
Interval Estimate Not Containing
the True Population Mean
Confidence Interval for the Mean, $\sigma$ Known
A $100(1-\alpha)\%$ CI is: $\bar{x} \pm z_{\alpha/2}(\sigma/\sqrt{n})$
$z_{\alpha/2}$ may be found from Table A.1 or using the Excel function NORMSINV($1-\alpha/2$)
Example
Compute a 95 percent confidence interval for the mean number of TV hours/week for the 18-24 age group in the file TV Viewing.xls.
Assume that the population standard deviation is known to be 10.0. The sample mean for the n = 45 observations is computed to be 60.16. For a 95 percent CI, $z_{\alpha/2} = 1.96$. Therefore, the CI is $60.16 \pm 1.96(10/\sqrt{45}) = 60.16 \pm 2.92$, or [57.24, 63.08]
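The arithmetic in this example can be reproduced with a short sketch. It assumes Python's standard-library `NormalDist` as a stand-in for Table A.1 or NORMSINV; the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(xbar, sigma, n, confidence=0.95):
    """CI for the mean when the population standard deviation is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # plays the role of NORMSINV(1 - alpha/2)
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width

# Values from the TV viewing example: mean 60.16, sigma 10.0, n = 45
low, high = z_confidence_interval(60.16, 10.0, 45)
print(f"[{low:.2f}, {high:.2f}]")   # [57.24, 63.08]
```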
Confidence Interval for the Mean, $\sigma$ Unknown
A $100(1-\alpha)\%$ CI is: $\bar{x} \pm t_{\alpha/2,n-1}(s/\sqrt{n})$
$t_{\alpha/2,n-1}$ is the value from a t-distribution with n-1 degrees of freedom, from Table A.2 or the Excel function TINV($\alpha$, n-1)
Relationship Between Normal
Distribution and t-distribution
The t-distribution yields wider confidence intervals than the normal distribution, and the difference is largest for small sample sizes.
Example
Compute a 95 percent confidence interval for the mean number of TV hours/week for the 18-24 age group in the file TV Viewing.xls. Assume that the population standard deviation is not known but is estimated from the sample as 10.095. A 95 percent CI corresponds to $\alpha/2 = 0.025$. With 45 observations, the t-distribution has 45 - 1 = 44 df. Using Table A.2, we find that $t_{0.025,44} = 2.0154$, yielding a 95 percent CI for the mean of $60.16 \pm 2.0154(10.095/\sqrt{45}) = 60.16 \pm 3.03$, or [57.13, 63.19]
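A minimal sketch of the same calculation follows. Because the Python standard library has no t-distribution quantile function, the critical value 2.0154 quoted from Table A.2 is passed in as a parameter; the function name is an assumption for illustration.

```python
from math import sqrt

def t_confidence_interval(xbar, s, n, t_crit):
    """CI for the mean when sigma is estimated by s; t_crit is t_{alpha/2, n-1}."""
    half_width = t_crit * s / sqrt(n)
    return xbar - half_width, xbar + half_width

# Values from the example: mean 60.16, s = 10.095, n = 45, t_{0.025,44} = 2.0154
low, high = t_confidence_interval(60.16, 10.095, 45, t_crit=2.0154)
print(f"[{low:.2f}, {high:.2f}]")   # [57.13, 63.19]
```

Comparing this with the known-sigma interval for the same data shows the slightly wider interval that the t-distribution produces.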
PHStat Tool: Confidence Intervals for the Mean
PHStat menu > Confidence Intervals > Estimate for the mean, sigma known, or Estimate for the mean, sigma unknown
PHStat Tool: Confidence Intervals for the Mean - Dialog
Enter the confidence level
Choose specification of sample statistics
Check Finite Population Correction box if appropriate
Sampling From Finite Populations
When n > 0.05N, use a correction factor in computing the standard error:
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$
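The rule above can be sketched as a small helper that applies the finite population correction only when the sample exceeds 5 percent of the population. The function name and the test values are illustrative assumptions.

```python
from math import sqrt

def standard_error(sigma, n, N=None):
    """Standard error of the mean, with the finite population correction when n > 0.05N."""
    se = sigma / sqrt(n)
    if N is not None and n > 0.05 * N:
        se *= sqrt((N - n) / (N - 1))   # finite population correction factor
    return se

print(standard_error(10.0, 25))            # effectively infinite population
print(standard_error(10.0, 25, N=100))     # n is 25% of N, so the correction applies
print(standard_error(10.0, 25, N=10000))   # n is below 5% of N, so no correction
```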
PHStat Tool: Confidence
Intervals for the Mean - Results
Confidence Intervals for Proportions
Sample proportion: $\hat{p} = x/n$
x = number in sample having desired characteristic
n = sample size
The sampling distribution of $\hat{p}$ has mean $p$ and variance $p(1-p)/n$, where $p$ is the population proportion
When $np$ and $n(1-p)$ are at least 5, the sampling distribution of $\hat{p}$ approaches a normal distribution
Confidence Intervals for Proportions
A $100(1-\alpha)\%$ CI is: $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$
PHStat tool is available under the Confidence Intervals option
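A minimal sketch of the proportion interval, assuming the normal approximation holds. The counts (560 of 1000) are hypothetical and chosen only for illustration; they are not taken from the slides.

```python
from math import sqrt
from statistics import NormalDist

def proportion_confidence_interval(x, n, confidence=0.95):
    """Normal-approximation CI for a population proportion."""
    p_hat = x / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

# Hypothetical poll: 560 of 1000 respondents favor a candidate
low, high = proportion_confidence_interval(560, 1000)
print(f"[{low:.3f}, {high:.3f}]")
```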
Confidence Intervals and Sample Size
CI for the mean, $\sigma$ known
Sample size needed for half-width of at most E is $n \ge (z_{\alpha/2})^2 \sigma^2 / E^2$
CI for a proportion
Sample size needed for half-width of at most E is $n \ge (z_{\alpha/2})^2\, p(1-p) / E^2$
Use $\hat{p}$ as an estimate of $p$, or 0.5 for the most conservative estimate
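The two sample-size rules can be sketched as follows. The inputs (sigma = 10, E = 2 for the mean; E = 0.03 for the proportion) are hypothetical, and the result is rounded up because n must be an integer at least as large as the bound.

```python
from math import ceil
from statistics import NormalDist

def z_value(confidence):
    """z_{alpha/2} for the given confidence level."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def sample_size_for_mean(sigma, E, confidence=0.95):
    """Smallest n giving a half-width of at most E when sigma is known."""
    return ceil(z_value(confidence) ** 2 * sigma ** 2 / E ** 2)

def sample_size_for_proportion(E, p=0.5, confidence=0.95):
    """Smallest n for a proportion; p = 0.5 is the most conservative choice."""
    return ceil(z_value(confidence) ** 2 * p * (1 - p) / E ** 2)

n_mean = sample_size_for_mean(sigma=10.0, E=2.0)
n_prop = sample_size_for_proportion(E=0.03)
print(n_mean, n_prop)   # 97 1068
```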
PHStat Tool: Sample Size Determination
PHStat menu > Sample Size > Determination for the Mean or Determination for the Proportion
Enter s, E, and confidence level
Check Finite Population Correction box if appropriate
Confidence Intervals for Population Total
A $100(1-\alpha)\%$ CI is: $N\bar{x} \pm t_{n-1,\alpha/2}\, N \frac{s}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$
PHStat tool is available under the Confidence Intervals option
Confidence Intervals for Differences Between Means
Quantity             Population 1   Population 2
Mean                 $\mu_1$        $\mu_2$
Standard deviation   $\sigma_1$     $\sigma_2$
Point estimate       $\bar{x}_1$    $\bar{x}_2$
Sample size          $n_1$          $n_2$
Point estimate for the difference in means, $\mu_1 - \mu_2$, is given by $\bar{x}_1 - \bar{x}_2$
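Independent Samples With Unequal Variances
A $100(1-\alpha)\%$ CI is: $\bar{x}_1 - \bar{x}_2 \pm t_{\alpha/2,\,df^*}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
$df^* = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1-1} + \frac{(s_2^2/n_2)^2}{n_2-1}}$
Fractional values of $df^*$ are rounded down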
Example
In the Accounting Professionals.xls worksheet, find a 95 percent confidence interval for the difference in years of service between males and females.
Calculations
$s_1 = 4.39$ and $n_1 = 14$ (females), $s_2 = 8.39$ and $n_2 = 13$ (males)
$df^* = 17.81$, so use 17 as the degrees of freedom
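The degrees-of-freedom value quoted here can be checked with a short sketch of the unequal-variance formula, using the sample standard deviations and sizes given above. The function name is an illustrative assumption.

```python
from math import floor

def welch_df(s1, n1, s2, n2):
    """Approximate degrees of freedom for two independent samples with unequal variances."""
    a = s1 ** 2 / n1
    b = s2 ** 2 / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

# Values from the Calculations slide: females (4.39, 14), males (8.39, 13)
df_star = welch_df(4.39, 14, 8.39, 13)
df_used = floor(df_star)             # fractional values are rounded down
print(round(df_star, 2), df_used)    # 17.81 17
```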
Independent Samples With Equal Variances
A $100(1-\alpha)\%$ CI is: $\bar{x}_1 - \bar{x}_2 \pm t_{\alpha/2,\,n_1+n_2-2}\; s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$
$s_p = \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}}$
where $s_p$ is a common pooled standard deviation. Must assume the variances of the two populations are equal.
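As a minimal sketch, the pooled standard deviation can be computed directly from the two sample standard deviations and sizes. The inputs reuse the accounting-professionals values quoted earlier in the chapter purely as an illustration; whether the equal-variance assumption is reasonable for those data is a separate question.

```python
from math import sqrt

def pooled_std(s1, n1, s2, n2):
    """Pooled standard deviation under the equal-variance assumption."""
    numerator = (n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2
    return sqrt(numerator / (n1 + n2 - 2))

def pooled_half_width(s1, n1, s2, n2, t_crit):
    """Half-width of the CI; t_crit is t_{alpha/2, n1+n2-2} taken from a table."""
    return t_crit * pooled_std(s1, n1, s2, n2) * sqrt(1 / n1 + 1 / n2)

sp = pooled_std(4.39, 14, 8.39, 13)
print(round(sp, 2))
```

The pooled value always lies between the two sample standard deviations, weighted toward the larger sample.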
Example: Accounting
Professionals
Paired Samples
A $100(1-\alpha)\%$ CI is: $\bar{D} \pm t_{n-1,\alpha/2}\; s_D/\sqrt{n}$
$s_D = \sqrt{\frac{\sum_{i=1}^{n}(D_i - \bar{D})^2}{n-1}}$
$D_i$ = difference for each pair of observations
$\bar{D}$ = average of differences
PHStat tool available in the Confidence Intervals menu
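A minimal sketch of the paired-sample interval follows. The five pairs of values are hypothetical, and the critical value $t_{0.025,4} = 2.776$ is supplied from a t table because the standard library has no t quantile function.

```python
from math import sqrt
from statistics import mean, stdev

def paired_confidence_interval(first, second, t_crit):
    """CI for the mean paired difference; t_crit is t_{n-1, alpha/2}."""
    diffs = [a - b for a, b in zip(first, second)]
    d_bar = mean(diffs)
    s_d = stdev(diffs)                       # sample standard deviation, divisor n - 1
    half_width = t_crit * s_d / sqrt(len(diffs))
    return d_bar - half_width, d_bar + half_width

# Hypothetical paired measurements (for example, actual vs. estimated values)
actual = [10, 12, 9, 14, 11]
estimated = [8, 11, 10, 12, 9]
low, high = paired_confidence_interval(actual, estimated, t_crit=2.776)
print(f"[{low:.2f}, {high:.2f}]")
```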
Example: Pile Foundation.xls
A 95% CI for the average difference between the actual and estimated pile lengths is
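Differences Between Proportions
A $100(1-\alpha)\%$ CI is: $\hat{p}_1 - \hat{p}_2 \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$
Applies when $n_i p_i$ and $n_i(1-p_i)$ are greater than 5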
Example
In the Accounting Professionals.xls worksheet, the proportion of females having a CPA is 8/14 = 0.57, while the proportion of males having a CPA is 6/13 = 0.46. A 95 percent confidence interval for the difference in proportions between females and males is
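The interval for this example can be computed with a short sketch of the difference-in-proportions formula, using the counts stated above (8 of 14 females, 6 of 13 males). The function name is an assumption, and the output is whatever the formula yields for these counts rather than a figure quoted from the slides.

```python
from math import sqrt
from statistics import NormalDist

def proportion_difference_ci(x1, n1, x2, n2, confidence=0.95):
    """Normal-approximation CI for the difference between two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - half_width, diff + half_width

low, high = proportion_difference_ci(8, 14, 6, 13)
print(f"[{low:.3f}, {high:.3f}]")
```

With samples this small the interval is wide and includes zero, so the data do not establish a difference between the two groups.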
Sampling Distribution of s
The sample standard deviation, s, is a point estimate for the population standard deviation, $\sigma$
The sampling distribution of s is described by the chi-square ($\chi^2$) distribution: for samples from a normal population, $(n-1)s^2/\sigma^2$ has a $\chi^2$ distribution with n-1 df
See Table A.3
CHIDIST(x, deg_freedom) returns the probability to the right of x
CHIINV(probability, deg_freedom) returns the value of x for a specified right-tail probability
Confidence Intervals for the Variance
A $100(1-\alpha)\%$ CI is: $\left[\frac{(n-1)s^2}{\chi^2_{n-1,\alpha/2}},\ \frac{(n-1)s^2}{\chi^2_{n-1,1-\alpha/2}}\right]$
Note the difference in the denominators!
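The point about the denominators can be seen in a minimal sketch. It assumes a hypothetical sample with n = 11 and s = 2.0, and takes the right-tail chi-square values for 10 df from a table (20.483 for probability 0.025 and 3.247 for probability 0.975), since the standard library has no chi-square quantile function.

```python
def variance_confidence_interval(s, n, chi2_right_small_tail, chi2_right_large_tail):
    """CI for the population variance.

    chi2_right_small_tail: chi-square value with right-tail probability alpha/2 (the larger value)
    chi2_right_large_tail: chi-square value with right-tail probability 1 - alpha/2 (the smaller value)
    """
    sum_sq = (n - 1) * s ** 2
    # The larger chi-square value gives the lower limit, the smaller one the upper limit.
    return sum_sq / chi2_right_small_tail, sum_sq / chi2_right_large_tail

# Hypothetical sample: n = 11, s = 2.0; table values for 10 df at a 95% level
low, high = variance_confidence_interval(2.0, 11, 20.483, 3.247)
print(f"[{low:.2f}, {high:.2f}]")
```

Unlike the intervals for the mean, this interval is not symmetric about the point estimate $s^2$.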
PHStat Tool: Confidence Intervals for Variance - Dialog
PHStat menu > Confidence Intervals > Estimate for the Population Variance
Enter sample size, standard deviation, and confidence level
PHStat Tool: Confidence Intervals for Variance - Results
Time Series Data
Confidence intervals only make sense for stationary time series data
Summary and Conclusions
As the confidence level ($1-\alpha$) increases, the width of the confidence interval also increases.
As the sample size increases, the width of the confidence interval decreases.
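Both conclusions can be illustrated with a small sketch for the known-sigma interval for the mean. The value sigma = 10 and the grids of confidence levels and sample sizes are hypothetical choices for illustration.

```python
from math import sqrt
from statistics import NormalDist

def ci_width(sigma, n, confidence):
    """Full width of the z-based confidence interval for the mean."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return 2 * z * sigma / sqrt(n)

sigma = 10.0  # hypothetical population standard deviation

# Width grows as the confidence level rises (n fixed at 45).
widths_by_confidence = [ci_width(sigma, 45, c) for c in (0.90, 0.95, 0.99)]
# Width shrinks as the sample size grows (confidence fixed at 95%).
widths_by_n = [ci_width(sigma, n, 0.95) for n in (25, 100, 400)]

print([round(w, 2) for w in widths_by_confidence])
print([round(w, 2) for w in widths_by_n])
```

Because the width depends on $1/\sqrt{n}$, quadrupling the sample size halves the width.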
Probability Intervals
A $100(1-\alpha)\%$ probability interval for a random variable X is any interval [a, b] such that $P(a \le X \le b) = 1-\alpha$
Do not confuse a confidence interval with a probability interval; confidence intervals are probability intervals for sampling distributions, not for the distribution of the random variable.