Estimation of Confidence Interval
Dr. Ioannis N. Lagoudis
SAMPLE MEAN
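t-distribution
A t-value indicates the number of standard errors by which a sample mean differs from a population mean.
Confidence interval: Point estimate ± multiple × Standard Error
$\bar{X} - Z\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z\,\frac{\sigma}{\sqrt{n}}$
$\bar{X} - t\,\frac{s}{\sqrt{n}} \le \mu \le \bar{X} + t\,\frac{s}{\sqrt{n}}$
$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$
$t = \frac{\bar{X} - \mu}{s/\sqrt{n}}$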
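As a brief sketch of where the t-based interval comes from: assume the t-multiple $t$ is chosen so that the t statistic falls between $-t$ and $+t$ with the desired confidence. Rearranging that statement for $\mu$ gives the interval around the sample mean.

```latex
\begin{align*}
-t \le \frac{\bar{X}-\mu}{s/\sqrt{n}} \le t
&\iff -t\,\frac{s}{\sqrt{n}} \le \bar{X}-\mu \le t\,\frac{s}{\sqrt{n}} \\
&\iff \bar{X}-t\,\frac{s}{\sqrt{n}} \le \mu \le \bar{X}+t\,\frac{s}{\sqrt{n}}
\end{align*}
```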
Key observations for t-distribution
• A t-value indicates the number of standard errors by which a sample mean differs from a population mean.
• Degrees of freedom (df) determine the shape of the distribution.
• The smaller the sample, the wider the distribution.
• The larger the sample, the closer the t-distribution is to the Normal distribution.
Confidence Interval Levels
Confidence Level    Z value    t value (df = 29)
90%                 1.645      1.699
95%                 1.96       2.045
99%                 2.575      2.756
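The Z column can be reproduced with Python's standard library. This is a minimal sketch, not part of the original slides; the function name `z_multiple` is made up for illustration. The standard library has no t-distribution, so the t column would still come from Excel's =TINV or a statistics package. Note that the computed 99% value rounds to 2.576, which tables often show as 2.575.

```python
from statistics import NormalDist


def z_multiple(confidence: float) -> float:
    """Two-sided z-multiple for a confidence level such as 0.95."""
    # Half of the leftover probability goes in each tail.
    upper_tail = (1.0 - confidence) / 2.0
    return NormalDist().inv_cdf(1.0 - upper_tail)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_multiple(level):.3f}")
```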
Practice – Problem 1 (t Calculations)
• Use Excel formulas =TDIST(x, df, tails) and =TINV(probability, df)
FOR TOTAL
Formulas
Point Estimate: $\hat{T} = \frac{N}{n}\,T_s = N\bar{X}$
Mean: $E(\hat{T}) = T$
Standard Error: $SE(\hat{T}) = N\sigma/\sqrt{n}$
Approximate Standard Error: $SE(\hat{T}) = Ns/\sqrt{n} = N \times SE(\bar{X})$
Example: Tax Refunds The Internal Revenue Service would like to estimate the total net amount of refund due to a particular set of 1,000,000 taxpayers. Each taxpayer will either receive a refund, in which case the net refund is positive, or will have to pay an amount due, in which case the net refund is negative. Therefore the total net amount of refund is a natural quantity of interest; it is the net amount the IRS will have to pay out (or receive, if negative). Find a 95% confidence interval for this total using the refunds from a random sample of 500 taxpayers.
Example: Tax Refunds
Use File: IRS Funds.xlsx
Source: S. Christian Albright, Wayne Winston and Christopher Zappe. Data Analysis and Decision Making, OH: South-Western, Cengage Learning, 2011. 4th Edition. ISBN: 9780538476126.
N = 1,000,000
$\hat{T} = \frac{N}{n}\,T_s = N\bar{X}$
$\bar{X} = \$294.98$
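The total-refund calculation can be sketched as follows. N, n and the sample mean come from the example. The sample standard deviation `s` is a hypothetical placeholder, because the slides do not report it, and the z-multiple 1.96 is assumed as a large-sample stand-in for the t-multiple with 499 degrees of freedom.

```python
import math


def total_confidence_interval(N, n, xbar, s, multiple):
    """CI for a population total: N*xbar +/- multiple * N*s/sqrt(n)."""
    point_estimate = N * xbar                # T-hat = N * X-bar
    standard_error = N * s / math.sqrt(n)    # approximate SE(T-hat)
    half_length = multiple * standard_error
    return point_estimate, point_estimate - half_length, point_estimate + half_length


# s = 500.0 is HYPOTHETICAL; the real value must be computed from the data file.
estimate, lower, upper = total_confidence_interval(
    N=1_000_000, n=500, xbar=294.98, s=500.0, multiple=1.96
)
print(f"point estimate: ${estimate:,.0f}")
print(f"95% CI: ${lower:,.0f} to ${upper:,.0f}")
```

Only the point estimate (about $294.98 million) follows from the slide's numbers; the interval width depends entirely on the assumed `s`.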
FOR PROPORTION
Formulas
Standard Error: $SE(\hat{p}) = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
Confidence Interval: $\hat{p} \pm z\text{-multiple} \times \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
Example: Sandwich The fast-food manager from Example 8.2 has already sampled 40 customers to estimate the population mean rating of the restaurant’s new sandwich. Each rating is on a 1-to-10 scale, 10 being the best. The manager would now like to use the same sample to estimate the proportion of customers who rate the sandwich at least 6.
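Example: Sandwich This confidence interval is based on the assumption of a large sample size. With $p_L$ and $p_U$ the lower and upper limits of the interval, the conditions are:
$n p_L > 5,\quad n(1-p_L) > 5,\quad n p_U > 5,\quad n(1-p_U) > 5$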
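The proportion interval and the large-sample check can be sketched together. The sample size of 40 comes from the example; the count of 25 customers rating the sandwich at least 6 is a hypothetical stand-in for the count in the data file, and both function names are illustrative.

```python
import math


def proportion_confidence_interval(successes, n, z_multiple=1.96):
    """Large-sample CI for a proportion: p-hat +/- z * sqrt(p-hat(1-p-hat)/n)."""
    p_hat = successes / n
    standard_error = math.sqrt(p_hat * (1.0 - p_hat) / n)
    half_length = z_multiple * standard_error
    return p_hat - half_length, p_hat + half_length


def large_sample_ok(n, p_lower, p_upper, threshold=5):
    """True when n*pL, n*(1-pL), n*pU and n*(1-pU) all exceed the threshold."""
    checks = (n * p_lower, n * (1 - p_lower), n * p_upper, n * (1 - p_upper))
    return all(value > threshold for value in checks)


# successes=25 is HYPOTHETICAL; n=40 is from the example.
p_lower, p_upper = proportion_confidence_interval(successes=25, n=40)
print(f"95% CI: {p_lower:.3f} to {p_upper:.3f}")
print("large-sample condition met:", large_sample_ok(40, p_lower, p_upper))
```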
Use File: Satisfaction Ratings.xlsx
FOR A STANDARD DEVIATION
chi-square
=CHIDIST(v,df)
=CHIINV(p,df)
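The slides name the chi-square functions but not the interval itself. A minimal sketch, assuming the standard chi-square interval for a normal population, is shown here. The sample standard deviation is hypothetical, and the two chi-square values are approximately what =CHIINV(0.025,49) and =CHIINV(0.975,49) return for a sample of 50.

```python
import math


def std_dev_confidence_interval(s, n, chi2_upper, chi2_lower):
    """CI for sigma: sqrt((n-1)s^2/chi2_upper) to sqrt((n-1)s^2/chi2_lower).

    chi2_upper and chi2_lower are chi-square values with n-1 degrees of
    freedom that cut off alpha/2 in the right tail and in the left tail.
    """
    sum_of_squares = (n - 1) * s ** 2
    return math.sqrt(sum_of_squares / chi2_upper), math.sqrt(sum_of_squares / chi2_lower)


# s = 0.05 is HYPOTHETICAL; 70.222 and 31.555 are approximate chi-square
# values for df = 49 (right-tail probabilities 0.025 and 0.975).
lower, upper = std_dev_confidence_interval(s=0.05, n=50, chi2_upper=70.222, chi2_lower=31.555)
print(f"95% CI for sigma: {lower:.4f} to {upper:.4f}")
```

Because the chi-square distribution is skewed, the interval is not symmetric around the sample standard deviation.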
Example: Part Diameters A machine produces parts that are supposed to have diameter 10 centimeters. However, due to inherent variability, some diameters are greater than 10 and some are less. The production supervisor is concerned about two things. First, he is concerned that the mean diameter is not what it should be, 10 centimeters. Second, he is worried about the extent of variability in the diameters. Even if the mean is on target, excessive variability implies that many of the parts will fail to meet specifications. To analyze the process, he randomly samples 50 parts during the course of a day and measures the diameter of each part to the nearest millimeter. Should the supervisor be concerned about the results from this sample?
Example: Part Diameters
Use File: Part Diameters.xlsx
Mean
St Dev
=NORMDIST(10-E16,E17,E18,1)+(1-NORMDIST(10+E16,E17,E18,1))
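The Excel formula above adds the two normal tail probabilities outside 10 ± E16. A rough Python equivalent is sketched here, assuming E16 holds a tolerance, E17 the mean and E18 the standard deviation; the numeric inputs are hypothetical.

```python
from statistics import NormalDist


def fraction_out_of_spec(target, tolerance, mean, std_dev):
    """P(X < target - tolerance) + P(X > target + tolerance) for X ~ Normal."""
    diameters = NormalDist(mu=mean, sigma=std_dev)
    below = diameters.cdf(target - tolerance)
    above = 1.0 - diameters.cdf(target + tolerance)
    return below + above


# HYPOTHETICAL inputs: tolerance 0.1 cm, mean 10 cm, standard deviation 0.05 cm.
print(f"{fraction_out_of_spec(10.0, 0.1, 10.0, 0.05):.4f}")
```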
F distributions
• Test for equal variances
• This test is phrased in terms of the ratio of population variances
• If ratio is 1 (equal variances)
• If ratio is not 1 (unequal variances)
$s_1^2 / s_2^2$
=FDIST(v,df1,df2) =FINV(p,df1,df2)
TWO SAMPLE MEANS
Applications
• Men and women shop at a retail clothing store. The manager would like to know how much more (or less), on average, a woman spends on a typical purchase occasion than a man.
• Two airline companies fly similar routes. A consumer organization would like to check how much the average delay differs between the two airlines, where delay is defined as the actual arrival time at the destination minus the scheduled arrival time.
• A supermarket chain mails coupons for various products to a randomly selected subset of its customers in a particular city. Its other customers in this city receive no such coupons. The chain would like to check how much the average amount spent on these products differs between the two sets of customers over the next couple of months.
• A car dealership often deals with husband–wife pairs shopping for cars. To check whether husbands react differently than their wives to the sales presentation, husbands and wives are asked (separately) to rate the quality of the sales presentation. The dealership wants to know how much husbands differ from their wives in terms of average ratings.
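C.I. for two independent sample means
$\bar{X}_1 - \bar{X}_2 \pm t\text{-multiple} \times SE(\bar{X}_1 - \bar{X}_2)$
Common StDev: $s_p = \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}}$
Equal variance: $SE(\bar{X}_1 - \bar{X}_2) = s_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}$
Unequal variance: $SE(\bar{X}_1 - \bar{X}_2) = \sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}$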
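The two standard-error formulas can be compared side by side. This sketch uses hypothetical summary statistics, and the t-multiple of 2.002 is an assumed approximate 95% value for 58 degrees of freedom (roughly what =TINV(0.05,58) returns).

```python
import math


def pooled_std_dev(n1, s1, n2, s2):
    """Common standard deviation s_p, used when variances are assumed equal."""
    return math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))


def se_equal_variance(n1, s1, n2, s2):
    """SE of the difference in means under the equal-variance assumption."""
    return pooled_std_dev(n1, s1, n2, s2) * math.sqrt(1 / n1 + 1 / n2)


def se_unequal_variance(n1, s1, n2, s2):
    """SE of the difference in means without assuming equal variances."""
    return math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)


def difference_ci(xbar1, xbar2, standard_error, t_multiple):
    """CI for mu1 - mu2: (xbar1 - xbar2) +/- t_multiple * SE."""
    diff = xbar1 - xbar2
    return diff - t_multiple * standard_error, diff + t_multiple * standard_error


# HYPOTHETICAL summary statistics for two independent samples.
n1, xbar1, s1 = 30, 52.0, 8.0
n2, xbar2, s2 = 30, 47.5, 6.0
se = se_equal_variance(n1, s1, n2, s2)
lower, upper = difference_ci(xbar1, xbar2, se, t_multiple=2.002)
print(f"95% CI for the difference: {lower:.2f} to {upper:.2f}")
```

When the two sample sizes are equal, as in this illustration, both standard-error formulas give the same value; they differ only when the sample sizes differ.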
Paired Sample Example: Presentation Ratings The Stevens Honda-Buick automobile dealership often sells to husband-wife pairs. The manager would like to check whether the sales presentation is viewed any more or less favorably by the husbands than the wives. If it is, then some new training might be recommended for its salespeople. To check for differences, a random sample of husbands and wives are asked (separately) to rate the sales presentation on a scale of 1 to 10, 10 being the most favorable rating. What can the manager conclude from these data?
Paired Sample Example: Presentation Ratings
Use File: Sales Presentation Ratings.xlsx
One-Sample Analysis of Differences for Sales Presentation Data
Two-Sample Analysis of Sales Presentation Data
Question • How can I know that my sample is paired?
=CORREL(X,Y)
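One way to read the =CORREL(X,Y) hint: a clearly positive correlation between the two columns suggests the observations are linked pair by pair, in which case the differences are analyzed as a single sample. The sketch below assumes that reading and uses hypothetical ratings, not the data in the file.

```python
import math
from statistics import mean, stdev


def pearson_correlation(x, y):
    """Sample correlation, the quantity Excel's =CORREL(X,Y) returns."""
    mx, my = mean(x), mean(y)
    cross = sum((a - mx) * (b - my) for a, b in zip(x, y))
    spread = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return cross / spread


def paired_difference_summary(x, y):
    """Mean and standard error of the pairwise differences x_i - y_i."""
    differences = [a - b for a, b in zip(x, y)]
    return mean(differences), stdev(differences) / math.sqrt(len(differences))


# HYPOTHETICAL ratings for six husband-wife pairs.
husbands = [6, 7, 5, 8, 6, 9]
wives = [7, 8, 5, 9, 8, 9]
r = pearson_correlation(husbands, wives)
mean_diff, se_diff = paired_difference_summary(husbands, wives)
print(f"correlation: {r:.3f}")
print(f"mean difference: {mean_diff:.3f}, standard error: {se_diff:.3f}")
```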
TWO SAMPLE PROPORTIONS
Applications
• When an appliance store is about to have a sale, it sometimes sends selected customers a mailing to notify them of the sale. On other occasions it includes a coupon for 5% off the sale price in these mailings. The store’s manager would like to know whether the inclusion of coupons affects the proportion of customers who respond.
• A manufacturing company has two plants that produce identical products. The company wants to know how much the proportion of out-of-spec products differs across the two plants.
• An advertising agency would like to check whether men are more likely than women to switch TV channels when a commercial comes on. The agency runs an experiment where the channel-switching behavior of randomly chosen men and women can be monitored, and it collects data on the proportion of viewers who switch channels on at least half of the commercial times. The agency then compares these proportions across gender.
C.I. for Difference Between Proportions
$\hat{p}_1 - \hat{p}_2 \pm z\text{-multiple} \times SE(\hat{p}_1 - \hat{p}_2)$
$SE(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$
Example: Coupon Effectiveness An appliance store is about to have a big sale. It selects 300 of its best customers and randomly divides them into two sets of 150 customers each. It then mails a notice of the sale to all 300 customers but includes a coupon for an extra 5% off the sale price to the second set of customers only. As the sale progresses, the store keeps track of which of these customers purchase appliances. What can the store’s manager conclude about the effectiveness of the coupons?
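The interval for the coupon example can be sketched with the difference-of-proportions formula. The group sizes of 150 come from the example; the purchase counts are hypothetical, since the slides do not report them. Group 1 is taken to be the customers who received the coupon.

```python
import math


def proportion_difference_ci(x1, n1, x2, n2, z_multiple=1.96):
    """CI for p1 - p2 using SE = sqrt(p1(1-p1)/n1 + p2(1-p2)/n2)."""
    p1, p2 = x1 / n1, x2 / n2
    standard_error = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_multiple * standard_error, diff + z_multiple * standard_error


# x1=55 and x2=35 purchasers are HYPOTHETICAL counts.
lower, upper = proportion_difference_ci(x1=55, n1=150, x2=35, n2=150)
print(f"95% CI for the difference: {lower:.3f} to {upper:.3f}")
```

With these illustrative counts the whole interval lies above zero, which would suggest the coupon raises the purchase proportion; the real conclusion depends on the actual counts.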
SAMPLE SIZE ESTIMATION
Formula
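Set the half-length of the interval $\bar{X} \pm t\text{-multiple} \times s/\sqrt{n}$ equal to the desired value $B$ and solve for $n$:
$t\text{-multiple} \times s/\sqrt{n} = B$
For a mean: $n = \left(\frac{t\text{-multiple} \times s}{B}\right)^2$
For a proportion: $n = \left(\frac{t\text{-multiple}}{B}\right)^2 p_{est}(1-p_{est})$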
Example: Sandwich The fast-food manager surveyed 40 customers, each of whom rated a new sandwich on a scale of 1 to 10. Based on the data, a 95% confidence interval for the mean rating of all potential customers extended from 5.739 to 6.761, with a half-length of (6.761 - 5.739)/2 = 0.511. The observed sample standard deviation is 1.597. How large a sample would be needed to reduce this half-length to approximately 0.3?
Example: New Sandwich
$n = \left(\frac{t\text{-multiple} \times s}{B}\right)^2 = \left(\frac{1.96 \times 1.597}{0.3}\right)^2 = 108.86 \approx 109$
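The sample-size arithmetic can be checked with a short sketch. The function names are illustrative, and the result is rounded up because a sample size must be a whole number. The inputs for the mean are the example's own values; the proportion inputs are hypothetical.

```python
import math


def sample_size_for_mean(multiple, s, half_length):
    """Smallest n with multiple * s / sqrt(n) <= half_length."""
    return math.ceil((multiple * s / half_length) ** 2)


def sample_size_for_proportion(multiple, p_est, half_length):
    """Smallest n for a proportion interval, given a prior estimate p_est."""
    return math.ceil((multiple / half_length) ** 2 * p_est * (1 - p_est))


# Values from the example: multiple 1.96, s = 1.597, target half-length 0.3.
print(sample_size_for_mean(1.96, 1.597, 0.3))  # 109

# HYPOTHETICAL proportion case: p_est = 0.5, target half-length 0.05.
print(sample_size_for_proportion(1.96, 0.5, 0.05))
```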
Use File: Satisfaction Ratings.xlsx
Thank you!