
Estimation of Confidence Interval

Dr. Ioannis N. Lagoudis

SAMPLE MEAN

t-distribution

A t-value indicates the number of standard errors by which a sample mean differs from a population mean.

Confidence interval

Point estimate ± multiple × Standard Error

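Known population standard deviation (z-multiple):

$$\bar{X} - Z\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z\frac{\sigma}{\sqrt{n}}$$

Standard deviation estimated from the sample (t-multiple):

$$\bar{X} - t\frac{s}{\sqrt{n}} \le \mu \le \bar{X} + t\frac{s}{\sqrt{n}}$$

$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$

$$t = \frac{\bar{X} - \mu}{s / \sqrt{n}}$$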
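The interval and the t-value above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names and the input values (mean 50, standard deviation 10, n = 25) are hypothetical, and the multiple 2.064 is the 95% t-multiple for 24 degrees of freedom.

```python
import math

def mean_confidence_interval(xbar, s, n, multiple):
    """Return (lower, upper) for xbar ± multiple × s / sqrt(n)."""
    standard_error = s / math.sqrt(n)
    half_length = multiple * standard_error
    return xbar - half_length, xbar + half_length

def t_statistic(xbar, mu, s, n):
    """Number of standard errors by which xbar differs from mu."""
    return (xbar - mu) / (s / math.sqrt(n))

# Hypothetical inputs, chosen only for illustration.
lower, upper = mean_confidence_interval(xbar=50.0, s=10.0, n=25, multiple=2.064)
print(f"95% CI: {lower:.3f} to {upper:.3f}")
print(f"t-value for mu = 46: {t_statistic(50.0, 46.0, 10.0, 25):.2f}")
```

The multiple is passed in rather than computed, so the same function serves for both the z-based and the t-based interval.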

Key observations for t-distribution

•  A t-value indicates the number of standard errors by which a sample mean differs from a population mean.

•  Degrees of freedom (df) determine the shape of the distribution.

•  The smaller the sample, the wider the distribution.

•  The larger the sample, the closer the t-distribution is to the normal distribution.

Confidence Interval Levels

Confidence Level    Z value    t value (df = 29)
90%                 1.645      1.699
95%                 1.96       2.045
99%                 2.575      2.756
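The Z column can be reproduced with the Python standard library, as in the sketch below. The helper name `z_multiple` is an assumption made for this example. For 99% the computed value rounds to 2.576; the 2.575 in the table is a common rounding of the same quantity. The t column needs the t-distribution, which the standard library does not provide (SciPy's `scipy.stats.t.ppf` is one option).

```python
from statistics import NormalDist

def z_multiple(confidence_level):
    """Two-sided z-multiple: the (1 + level) / 2 quantile of the standard normal."""
    return NormalDist().inv_cdf((1 + confidence_level) / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_multiple(level):.3f}")
```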

Practice – Problem 1 (t Calculations)

•  Use the Excel formulas =TDIST(x, df, tails) and =TINV(probability, df)

FOR TOTAL

Formulas

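$\hat{T} = \frac{N}{n} T_s = N\bar{X}$   (Point Estimate, where $T_s$ is the sample total)

$E(\hat{T}) = T$   (Mean)

$SE(\hat{T}) = N\sigma / \sqrt{n}$   (Standard Error)

$SE(\hat{T}) = N s / \sqrt{n} = N \times SE(\bar{X})$   (Approximate Standard Error)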

Example: Tax Refunds The Internal Revenue Service would like to estimate the total net amount of refund due to a particular set of 1,000,000 taxpayers. Each taxpayer will either receive a refund, in which case the net refund is positive, or will have to pay an amount due, in which case the net refund is negative. Therefore the total net amount of refund is a natural quantity of interest; it is the net amount the IRS will have to pay out (or receive, if negative). Find a 95% confidence interval for this total using the refunds from a random sample of 500 taxpayers.

Example: Tax Refunds

Use File: IRS Funds.xlsx

Source: S. Christian Albright, Wayne Winston and Christopher Zappe. Data Analysis and Decision Making, 4th Edition. OH: South-Western, Cengage Learning, 2011. ISBN: 9780538476126.

$N = 1{,}000{,}000$

$\hat{T} = \frac{N}{n} T_s = N\bar{X}$

$\bar{X} = \$294.98$
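A rough sketch of the calculation for the total follows. N, n, and the sample mean come from the example; the sample standard deviation is not given in these slides, so `s_hypothetical` is a made-up value used only to show the mechanics, and 1.96 is used as a large-sample approximation to the t-multiple.

```python
import math

def total_confidence_interval(N, xbar, s, n, multiple):
    """Point estimate and interval for a population total: N*xbar ± multiple × N*s/sqrt(n)."""
    point_estimate = N * xbar
    standard_error = N * s / math.sqrt(n)
    half_length = multiple * standard_error
    return point_estimate, point_estimate - half_length, point_estimate + half_length

N = 1_000_000            # population size (from the example)
n = 500                  # sample size (from the example)
xbar = 294.98            # sample mean refund (from the example)
s_hypothetical = 500.0   # NOT from the source; placeholder standard deviation

estimate, lower, upper = total_confidence_interval(N, xbar, s_hypothetical, n, 1.96)
print(f"Point estimate of total: ${estimate:,.0f}")
print(f"Approximate 95% CI: ${lower:,.0f} to ${upper:,.0f}")
```

With the actual data file, `s_hypothetical` would be replaced by the sample standard deviation of the 500 refunds.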

FOR PROPORTION

Formulas

$SE(\hat{p}) = \sqrt{\dfrac{\hat{p}(1 - \hat{p})}{n}}$   (Standard Error)

$\hat{p} \pm \text{z-multiple} \times \sqrt{\dfrac{\hat{p}(1 - \hat{p})}{n}}$   (Confidence Interval)

Example: Sandwich The fast-food manager from Example 8.2 has already sampled 40 customers to estimate the population mean rating of the restaurant’s new sandwich. Each rating is on a 1-to-10 scale, 10 being the best. The manager would now like to use the same sample to estimate the proportion of customers who rate the sandwich at least 6.

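Example: Sandwich This confidence interval is based on the assumption of a large sample size. With $p_L$ and $p_U$ the lower and upper limits of the interval, check that:

$n p_L > 5$,  $n(1 - p_L) > 5$,  $n p_U > 5$,  $n(1 - p_U) > 5$

Use File: Satisfaction Ratings.xlsx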
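The proportion interval and the large-sample check can be sketched as below. The sample size of 40 comes from the example; the count of 25 customers rating the sandwich at least 6 is a hypothetical number, since the data file is not reproduced here. Function names are also assumptions made for this illustration.

```python
import math

def proportion_confidence_interval(successes, n, z_multiple=1.96):
    """Return (p_hat, lower, upper) for p_hat ± z × sqrt(p_hat(1 - p_hat)/n)."""
    p_hat = successes / n
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    half_length = z_multiple * standard_error
    return p_hat, p_hat - half_length, p_hat + half_length

def large_sample_ok(n, p_lower, p_upper, threshold=5):
    """Check n*p > 5 and n*(1 - p) > 5 at both interval limits."""
    return all(n * p > threshold and n * (1 - p) > threshold
               for p in (p_lower, p_upper))

n = 40                        # sample size (from the example)
successes_hypothetical = 25   # NOT from the source; illustrative count

p_hat, p_lower, p_upper = proportion_confidence_interval(successes_hypothetical, n)
print(f"p_hat = {p_hat:.3f}, 95% CI: {p_lower:.3f} to {p_upper:.3f}")
print("Large-sample condition met:", large_sample_ok(n, p_lower, p_upper))
```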


FOR A STANDARD DEVIATION

chi-square

=CHIDIST(v,df)

=CHIINV(p,df)

Example: Part Diameters A machine produces parts that are supposed to have diameter 10 centimeters. However, due to inherent variability, some diameters are greater than 10 and some are less. The production supervisor is concerned about two things. First, he is concerned that the mean diameter is not what it should be, 10 centimeters. Second, he is worried about the extent of variability in the diameters. Even if the mean is on target, excessive variability implies that many of the parts will fail to meet specifications. To analyze the process, he randomly samples 50 parts during the course of a day and measures the diameter of each part to the nearest millimeter. Should the supervisor be concerned about the results from this sample?

Example: Part Diameters

Use File: Part Diameters.xlsx


Mean

St Dev

=NORMDIST(10-E16,E17,E18,1)+(1-NORMDIST(10+E16,E17,E18,1))
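The Excel formula above adds the two normal tail probabilities outside 10 ± a tolerance. A Python equivalent is sketched below, assuming (the slide does not say) that cell E16 holds the tolerance, E17 the mean, and E18 the standard deviation. The numeric inputs are hypothetical.

```python
from statistics import NormalDist

def fraction_out_of_spec(target, tolerance, mean, st_dev):
    """Probability that a normal diameter falls outside target ± tolerance."""
    dist = NormalDist(mu=mean, sigma=st_dev)
    below = dist.cdf(target - tolerance)       # NORMDIST(10-E16, E17, E18, 1)
    above = 1 - dist.cdf(target + tolerance)   # 1 - NORMDIST(10+E16, E17, E18, 1)
    return below + above

# Hypothetical values: process centered on target, tolerance of two standard deviations.
share = fraction_out_of_spec(target=10.0, tolerance=0.1, mean=10.0, st_dev=0.05)
print(f"Estimated share of parts out of spec: {share:.4f}")
```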

F distributions

•  Test for equal variances

•  This test is phrased in terms of the ratio of population variances

•  If the ratio is 1, the variances are equal

•  If the ratio is not 1, the variances are unequal

$$\frac{s_1^2}{s_2^2}$$

=FDIST(v,df1,df2)

=FINV(p,df1,df2)

TWO SAMPLE MEANS

Applications

•  Men and women shop at a retail clothing store. The manager would like to know how much more (or less), on average, a woman spends on a typical purchase occasion than a man.

•  Two airline companies fly similar routes. A consumer organization would like to check how much the average delay differs between the two airlines, where delay is defined as the actual arrival time at the destination minus the scheduled arrival time.

•  A supermarket chain mails coupons for various products to a randomly selected subset of its customers in a particular city. Its other customers in this city receive no such coupons. The chain would like to check how much the average amount spent on these products differs between the two sets of customers over the next couple of months.

•  A car dealership often deals with husband–wife pairs shopping for cars. To check whether husbands react differently than their wives to the sales presentation, husbands and wives are asked (separately) to rate the quality of the sales presentation. The dealership wants to know how much husbands differ from their wives in terms of average ratings.

C.I. for two independent sample means

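$$\bar{X}_1 - \bar{X}_2 \pm \text{t-multiple} \times SE(\bar{X}_1 - \bar{X}_2)$$

Common StDev:

$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$

Equal variance:

$$SE(\bar{X}_1 - \bar{X}_2) = s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

Unequal variance:

$$SE(\bar{X}_1 - \bar{X}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$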
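The two standard-error variants can be compared with a short sketch. It assumes the usual unpooled form sqrt(s1²/n1 + s2²/n2) for the unequal-variance case; all function names and input values are hypothetical.

```python
import math

def pooled_st_dev(s1, n1, s2, n2):
    """Common standard deviation s_p, weighting each variance by its degrees of freedom."""
    return math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

def se_equal_variance(s1, n1, s2, n2):
    """SE of the difference in means when the variances are assumed equal."""
    return pooled_st_dev(s1, n1, s2, n2) * math.sqrt(1 / n1 + 1 / n2)

def se_unequal_variance(s1, n1, s2, n2):
    """SE of the difference in means without pooling the variances."""
    return math.sqrt(s1**2 / n1 + s2**2 / n2)

def difference_ci(xbar1, xbar2, standard_error, multiple):
    """(lower, upper) for xbar1 - xbar2 ± multiple × SE."""
    diff = xbar1 - xbar2
    return diff - multiple * standard_error, diff + multiple * standard_error

# Hypothetical summary statistics for two independent samples.
se_eq = se_equal_variance(s1=4.0, n1=30, s2=6.0, n2=40)
se_uneq = se_unequal_variance(s1=4.0, n1=30, s2=6.0, n2=40)
print(f"SE (equal variance): {se_eq:.4f}")
print(f"SE (unequal variance): {se_uneq:.4f}")
```

When the two sample standard deviations and sample sizes are the same, both formulas give the same standard error.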

Paired Sample Example: Presentation Ratings The Stevens Honda-Buick automobile dealership often sells to husband-wife pairs. The manager would like to check whether the sales presentation is viewed any more or less favorably by the husbands than the wives. If it is, then some new training might be recommended for its salespeople. To check for differences, a random sample of husbands and wives are asked (separately) to rate the sales presentation on a scale of 1 to 10, 10 being the most favorable rating. What can the manager conclude from these data?

Paired Sample Example: Presentation Ratings

Use File: Sales Presentation Ratings.xlsx

One-Sample Analysis of Differences for Sales Presentation Data

Paired Sample Example: Presentation Ratings

Use File: Sales Presentation Ratings.xlsx

Two-Sample Analysis of Sales Presentation Data

Question

•  How can I tell whether my samples are paired?

Check the correlation between the two samples: =CORREL(X,Y)
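A small sketch of the correlation check and the one-sample analysis of differences is given below. The ratings are hypothetical (the data file is not reproduced here), and the helper names are assumptions. `pearson_correlation` computes the same quantity as Excel's CORREL.

```python
import math
from statistics import mean, stdev

def pearson_correlation(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    mx, my = mean(x), mean(y)
    covariance_sum = sum((a - mx) * (b - my) for a, b in zip(x, y))
    spread_x = sum((a - mx) ** 2 for a in x)
    spread_y = sum((b - my) ** 2 for b in y)
    return covariance_sum / math.sqrt(spread_x * spread_y)

def paired_difference_summary(x, y):
    """Mean of the pairwise differences and its standard error."""
    diffs = [a - b for a, b in zip(x, y)]
    return mean(diffs), stdev(diffs) / math.sqrt(len(diffs))

# Hypothetical ratings for eight husband-wife pairs.
husbands = [6, 7, 5, 8, 6, 9, 4, 7]
wives = [7, 8, 5, 9, 8, 9, 5, 8]

r = pearson_correlation(husbands, wives)
d_bar, se_d = paired_difference_summary(husbands, wives)
print(f"Correlation: {r:.3f}")
print(f"Mean difference: {d_bar:.3f}, standard error: {se_d:.3f}")
```

A clearly positive correlation suggests that the pairing carries information, in which case the interval is built from the differences rather than from two independent samples.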

TWO PROPORTION MEANS

Applications

•  When an appliance store is about to have a sale, it sometimes sends selected customers a mailing to notify them of the sale. On other occasions it includes a coupon for 5% off the sale price in these mailings. The store’s manager would like to know whether the inclusion of coupons affects the proportion of customers who respond.

•  A manufacturing company has two plants that produce identical products. The company wants to know how much the proportion of out-of-spec products differs across the two plants.

•  An advertising agency would like to check whether men are more likely than women to switch TV channels when a commercial comes on. The agency runs an experiment where the channel-switching behavior of randomly chosen men and women can be monitored, and it collects data on the proportion of viewers who switch channels on at least half of the commercial times. The agency then compares these proportions across gender.

C.I. for Difference Between Proportions

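$$\hat{p}_1 - \hat{p}_2 \pm \text{z-multiple} \times SE(\hat{p}_1 - \hat{p}_2)$$

$$SE(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$$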
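The interval for a difference between proportions can be sketched as follows. The group sizes of 150 match the coupon example that follows; the purchase counts are hypothetical, and the function name is an assumption made for this illustration.

```python
import math

def difference_of_proportions_ci(x1, n1, x2, n2, z_multiple=1.96):
    """Return (difference, lower, upper) for p1_hat - p2_hat ± z × SE."""
    p1, p2 = x1 / n1, x2 / n2
    standard_error = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    half_length = z_multiple * standard_error
    return diff, diff - half_length, diff + half_length

# Sample 1: customers who received a coupon; sample 2: customers who did not.
# Group sizes are from the example; the purchase counts are NOT from the source.
diff, lower, upper = difference_of_proportions_ci(x1=55, n1=150, x2=35, n2=150)
print(f"Difference: {diff:.3f}, 95% CI: {lower:.3f} to {upper:.3f}")
```

If the whole interval lies above zero, as it does for these made-up counts, the data suggest that the coupon group purchased at a higher rate.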

Example: Coupon Effectiveness An appliance store is about to have a big sale. It selects 300 of its best customers and randomly divides them into two sets of 150 customers each. It then mails a notice of the sale to all 300 customers but includes a coupon for an extra 5% off the sale price to the second set of customers only. As the sale progresses, the store keeps track of which of these customers purchase appliances. What can the store’s manager conclude about the effectiveness of the coupons?

Example: Coupon Effectiveness

SAMPLE SIZE ESTIMATION

Formula

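For a mean, the interval is $\bar{X} \pm \text{t-multiple} \times s / \sqrt{n}$, so the half-length is

$$B = \text{t-multiple} \times \frac{s}{\sqrt{n}}$$

Solving for n:

$$n = \left(\frac{\text{t-multiple} \times s}{B}\right)^2$$

For a proportion, with $p_{est}$ an estimate of the population proportion:

$$n = \left(\frac{\text{z-multiple}}{B}\right)^2 p_{est}(1 - p_{est})$$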

Example: Sandwich The fast-food manager surveyed 40 customers, each of whom rated a new sandwich on a scale of 1 to 10. Based on the data, a 95% confidence interval for the mean rating of all potential customers extended from 5.739 to 6.761, with a half-length of (6.761 - 5.739)/2 = 0.511. The observed sample standard deviation is 1.597. How large a sample would be needed to reduce this half-length to approximately 0.3?

Example: New Sandwich

$$n = \left(\frac{\text{t-multiple} \times s}{B}\right)^2 = \left(\frac{1.96 \times 1.597}{0.3}\right)^2 = 108.86 \approx 109$$
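The sample-size calculation can be checked with a short sketch. The inputs 1.96, 1.597, and 0.3 come from the example; the function names and the proportion inputs (0.5 and 0.03) are hypothetical. The result is rounded up because a sample size must be a whole number.

```python
import math

def sample_size_for_mean(multiple, s, B):
    """Smallest n with multiple × s / sqrt(n) <= B."""
    return math.ceil((multiple * s / B) ** 2)

def sample_size_for_proportion(multiple, p_est, B):
    """Smallest n with multiple × sqrt(p_est(1 - p_est)/n) <= B."""
    return math.ceil((multiple / B) ** 2 * p_est * (1 - p_est))

# Sandwich example: reduce the half-length to about 0.3.
n_required = sample_size_for_mean(multiple=1.96, s=1.597, B=0.3)
print("Required sample size for the mean:", n_required)

# Hypothetical proportion case: p_est = 0.5, half-length 0.03.
print("Required sample size for a proportion:",
      sample_size_for_proportion(multiple=1.96, p_est=0.5, B=0.03))
```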


Use File: Satisfaction Ratings.xlsx

Thank you!
