Estimation of Confidence Interval Dr. Ioannis N. Lagoudis [email protected] [email protected]



  • SAMPLE MEAN

  • t-distribution

    A t-value indicates the number of standard errors by which a sample mean differs from a population mean.

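  • Confidence interval

    Point estimate ± multiple × Standard Error

    $\bar{X} - Z\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z\,\frac{\sigma}{\sqrt{n}}$ (σ known)

    $\bar{X} - t\,\frac{s}{\sqrt{n}} \le \mu \le \bar{X} + t\,\frac{s}{\sqrt{n}}$ (σ estimated by s)

    $Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$

    $t = \frac{\bar{X} - \mu}{s / \sqrt{n}}$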
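    The interval formulas above can be sketched in a few lines of Python. This is an illustration only: the ratings are hypothetical values, the function name is made up for this example, and the t-multiple 2.262 (95% confidence, df = 9) is read from a standard t table rather than computed.

```python
import math
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(xbar, sd, n, multiple):
    """Return (lower, upper) for xbar ± multiple × sd / sqrt(n)."""
    half_length = multiple * sd / math.sqrt(n)
    return xbar - half_length, xbar + half_length

# Hypothetical sample of 1-to-10 ratings (illustrative values only).
ratings = [6, 7, 5, 8, 6, 7, 4, 6, 9, 5]
xbar, s, n = mean(ratings), stdev(ratings), len(ratings)

# z-multiple for 95% confidence: appropriate when sigma is known or n is large.
z_95 = NormalDist().inv_cdf(0.975)
z_interval = mean_confidence_interval(xbar, s, n, z_95)

# t-multiple for 95% confidence with df = n - 1 = 9, taken from a t table.
t_95_df9 = 2.262
t_interval = mean_confidence_interval(xbar, s, n, t_95_df9)

print("z interval:", z_interval)
print("t interval:", t_interval)
```

    With a sample this small the t interval is noticeably wider than the z interval, which is the reason for using the t-multiple when σ is estimated by s.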

  • Key observations for t-distribution

    •  A t-value indicates the number of standard errors by which a sample mean differs from a population mean.

    •  Degrees of freedom (df) determine the shape of the distribution.

    •  The smaller the sample, the wider the distribution.

    •  The larger the sample, the closer the t-distribution is to the Normal distribution.

  • Confidence Interval Levels

    Confidence Level    Z value    t value (df = 29)
    90%                 1.645      1.699
    95%                 1.96       2.045
    99%                 2.575      2.756

  • Practice – Problem 1 (t Calculations)

    •  Use the Excel formulas =TDIST(x, df, tails) and =TINV(p, df)
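    For readers working outside Excel, the two calculations can be approximated with the Python standard library. This is a rough sketch, not Excel's own algorithm: the function names are invented for this example, the CDF is obtained by Simpson's rule, and the inverse by bisection. It assumes the two-tailed convention used by TDIST(x, df, 2) and TINV(p, df), and `math.gamma` limits it to moderate df (it overflows for very large df).

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x) via Simpson's rule on [0, |x|], using symmetry about 0."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_dist_two_tailed(x, df):
    """Two-tailed probability P(|T| > x), in the spirit of TDIST(x, df, 2)."""
    return 2 * (1 - t_cdf(abs(x), df))

def t_inv_two_tailed(p, df):
    """t-multiple whose two-tailed probability is p, in the spirit of TINV(p, df)."""
    lo, hi = 0.0, 100.0
    for _ in range(60):  # bisection: the two-tailed probability decreases in x
        mid = (lo + hi) / 2
        if t_dist_two_tailed(mid, df) > p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# t-multiples for 90%, 95% and 99% confidence with df = 29.
for level in (0.90, 0.95, 0.99):
    print(level, round(t_inv_two_tailed(1 - level, 29), 3))
```

    The printed multiples can be compared against the t column of the Confidence Interval Levels table.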

  • FOR TOTAL

  • Formulas

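    Point Estimate: $\hat{T} = \frac{N}{n} T_s = N\bar{X}$, where $T_s$ is the sample total

    Mean: $E(\hat{T}) = T$

    Standard Error: $SE(\hat{T}) = N\sigma / \sqrt{n}$

    Approximate Standard Error: $SE(\hat{T}) = N s / \sqrt{n} = N \times SE(\bar{X})$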
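    A minimal sketch of these formulas in Python follows. The refund values and the population size of 1,000 are hypothetical, the function name is invented here, and a z-multiple stands in for the t-multiple, which is only reasonable for large samples such as the 500 taxpayers in the example below.

```python
import math
from statistics import NormalDist, mean, stdev

def total_confidence_interval(sample, population_size, confidence=0.95):
    """Return (lower, point_estimate, upper) for a population total."""
    n = len(sample)
    point_estimate = population_size * mean(sample)              # T-hat = N * X-bar
    std_error = population_size * stdev(sample) / math.sqrt(n)   # N * s / sqrt(n)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)               # z used in place of t
    return point_estimate - z * std_error, point_estimate, point_estimate + z * std_error

# Hypothetical net refunds in dollars (negative = amount owed).
refunds = [310.0, -120.0, 450.0, 95.0, 0.0, 720.0, -45.0, 260.0]
lower, estimate, upper = total_confidence_interval(refunds, population_size=1_000)
print(lower, estimate, upper)
```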

  • Example: Tax Refunds The Internal Revenue Service would like to estimate the total net amount of refund due to a particular set of 1,000,000 taxpayers. Each taxpayer will either receive a refund, in which case the net refund is positive, or will have to pay an amount due, in which case the net refund is negative. Therefore the total net amount of refund is a natural quantity of interest; it is the net amount the IRS will have to pay out (or receive, if negative). Find a 95% confidence interval for this total using the refunds from a random sample of 500 taxpayers.

  • Example: Tax Refunds

    Use File: IRS Funds.xlsx

    Source: S. Christian Albright, Wayne Winston and Christopher Zappe. Data Analysis and Decision Making, OH: South-Western, Cengage Learning, 2011. 4th Edition. ISBN: 9780538476126.

    $N = 1{,}000{,}000$

    $\hat{T} = \frac{N}{n} T_s = N\bar{X}$

    $\bar{X} = \$294.98$, so $\hat{T} = 1{,}000{,}000 \times \$294.98 = \$294{,}980{,}000$

  • FOR PROPORTION

  • Formulas

    Standard Error: $SE(\hat{p}) = \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$

    Confidence Interval: $\hat{p} \pm z\text{-multiple} \times \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$

  • Example: Sandwich The fast-food manager from Example 8.2 has already sampled 40 customers to estimate the population mean rating of the restaurant’s new sandwich. Each rating is on a 1-to-10 scale, 10 being the best. The manager would now like to use the same sample to estimate the proportion of customers who rate the sandwich at least 6.

  • Example: Sandwich

    This confidence interval is based on the assumption of a large sample size. With $p_L$ and $p_U$ denoting the lower and upper limits of the interval, the usual check is:

    $n p_L > 5$, $n(1 - p_L) > 5$, $n p_U > 5$, $n(1 - p_U) > 5$

    Use File: Satisfaction Ratings.xlsx

    Source: S. Christian Albright, Wayne Winston and Christopher Zappe. Data Analysis and Decision Making, OH: South-Western, Cengage Learning, 2011. 4th Edition. ISBN: 9780538476126.
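    The proportion interval and the large-sample check can be sketched as follows. The count of 25 customers out of 40 rating the sandwich at least 6 is a hypothetical figure chosen for illustration, not a value taken from the data file, and the function names are invented for this example.

```python
import math
from statistics import NormalDist

def proportion_confidence_interval(successes, n, confidence=0.95):
    """Return (lower, upper) for p-hat ± z × sqrt(p-hat(1 - p-hat)/n)."""
    p_hat = successes / n
    std_error = math.sqrt(p_hat * (1 - p_hat) / n)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return p_hat - z * std_error, p_hat + z * std_error

def large_sample_ok(n, p_lower, p_upper):
    """Check n*p > 5 and n*(1 - p) > 5 at both ends of the interval."""
    return all(n * q > 5 for q in (p_lower, 1 - p_lower, p_upper, 1 - p_upper))

# Hypothetical count: 25 of the 40 sampled customers rate the sandwich >= 6.
p_lower, p_upper = proportion_confidence_interval(25, 40)
valid = large_sample_ok(40, p_lower, p_upper)
print(p_lower, p_upper, valid)
```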

  • FOR A STANDARD DEVIATION

  • chi-square

    =CHIDIST(v,df)  

    =CHIINV(p,df)  
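    The slide lists only the Excel functions. For reference, the standard chi-square interval for a population standard deviation is shown below. It assumes a normally distributed population; $n$ and $s$ are the sample size and sample standard deviation as on the earlier slides, and $\chi^2_{p,\,df}$ is taken here to mean the value with right-tail probability $p$, which is what =CHIINV(p,df) returns.

```latex
\sqrt{\frac{(n-1)\,s^2}{\chi^2_{\alpha/2,\;n-1}}}
\;\le\; \sigma \;\le\;
\sqrt{\frac{(n-1)\,s^2}{\chi^2_{1-\alpha/2,\;n-1}}}
```

    For a 95% interval, $\alpha = 0.05$, so the two multiples would come from =CHIINV(0.025, n-1) and =CHIINV(0.975, n-1).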

  • Example: Part Diameters A machine produces parts that are supposed to have diameter 10 centimeters. However, due to inherent variability, some diameters are greater than 10 and some are less. The production supervisor is concerned about two things. First, he is concerned that the mean diameter is not what it should be, 10 centimeters. Second, he is worried about the extent of variability in the diameters. Even if the mean is on target, excessive variability implies that many of the parts will fail to meet specifications. To analyze the process, he randomly samples 50 parts during the course of a day and measures the diameter of each part to the nearest millimeter. Should the supervisor be concerned about the results from this sample?

  • Example: Part Diameters

    Use File: Part Diameters.xlsx

    Source: S. Christian Albright, Wayne Winston and Christopher Zappe. Data Analysis and Decision Making, OH: South-Western, Cengage Learning, 2011. 4th Edition. ISBN: 9780538476126.

    Mean

    St Dev

    =NORMDIST(10-E16,E17,E18,1)+(1-NORMDIST(10+E16,E17,E18,1))
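    The NORMDIST formula adds the two tail probabilities outside the specification limits. A Python sketch of the same calculation is below. It assumes, without the worksheet to confirm it, that E16 holds the allowed tolerance around the 10 cm target and that E17 and E18 hold the mean and standard deviation; the numeric inputs are hypothetical.

```python
from statistics import NormalDist

def fraction_out_of_spec(target, tolerance, mean, st_dev):
    """P(diameter < target - tolerance) + P(diameter > target + tolerance)."""
    dist = NormalDist(mean, st_dev)
    below = dist.cdf(target - tolerance)      # NORMDIST(10-E16, E17, E18, 1)
    above = 1 - dist.cdf(target + tolerance)  # 1 - NORMDIST(10+E16, E17, E18, 1)
    return below + above

# Hypothetical process: on-target mean, standard deviation 0.1 cm, tolerance 0.2 cm.
share = fraction_out_of_spec(target=10, tolerance=0.2, mean=10.0, st_dev=0.1)
print(round(share, 4))
```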

  • F distributions

    •  Test for equal variances

    •  This test is phrased in terms of the ratio of population variances

    •  If ratio is 1 (equal variances)

    •  If ratio is not 1 (unequal variances)

    $s_1^2 / s_2^2$

    =FDIST(v,df1,df2)  =FINV(p,df1,df2)

  • TWO SAMPLE MEANS

  • Applications

    •  Men and women shop at a retail clothing store. The manager would like to know how much more (or less), on average, a woman spends on a typical purchase occasion than a man.

    •  Two airline companies fly similar routes. A consumer organization would like to check how much the average delay differs between the two airlines, where delay is defined as the actual arrival time at the destination minus the scheduled arrival time.

    •  A supermarket chain mails coupons for various products to a randomly selected subset of its customers in a particular city. Its other customers in this city receive no such coupons. The chain would like to check how much the average amount spent on these products differs between the two sets of customers over the next couple of months.

    •  A car dealership often deals with husband–wife pairs shopping for cars. To check whether husbands react differently than their wives to the sales presentation, husbands and wives are asked (separately) to rate the quality of the sales presentation. The dealership wants to know how much husbands differ from their wives in terms of average ratings.

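  • C.I. for two independent sample means

    $\bar{X}_1 - \bar{X}_2 \pm t\text{-multiple} \times SE(\bar{X}_1 - \bar{X}_2)$

    Common StDev: $s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$

    Equal variance: $SE(\bar{X}_1 - \bar{X}_2) = s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$

    Unequal variance: $SE(\bar{X}_1 - \bar{X}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$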
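    The two standard-error variants can be compared side by side in a short Python sketch. The summary statistics are hypothetical, the function names are invented for this example, and the t-multiple 2.145 (95% confidence, df = n1 + n2 - 2 = 14) is read from a t table.

```python
import math

def pooled_sd(s1, n1, s2, n2):
    """Common standard deviation s_p under the equal-variance assumption."""
    return math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

def se_equal_variance(s1, n1, s2, n2):
    """SE of the difference of means using the pooled standard deviation."""
    return pooled_sd(s1, n1, s2, n2) * math.sqrt(1 / n1 + 1 / n2)

def se_unequal_variance(s1, n1, s2, n2):
    """SE of the difference of means without pooling."""
    return math.sqrt(s1**2 / n1 + s2**2 / n2)

def difference_ci(xbar1, xbar2, std_error, t_multiple):
    """Return (lower, upper) for (xbar1 - xbar2) ± t_multiple × std_error."""
    diff = xbar1 - xbar2
    return diff - t_multiple * std_error, diff + t_multiple * std_error

# Hypothetical summary statistics for two independent samples of 8.
xbar1, s1, n1 = 52.0, 2.0, 8
xbar2, s2, n2 = 47.0, 3.0, 8
t_95_df14 = 2.145  # from a t table, df = 14

ci_equal = difference_ci(xbar1, xbar2, se_equal_variance(s1, n1, s2, n2), t_95_df14)
ci_unequal = difference_ci(xbar1, xbar2, se_unequal_variance(s1, n1, s2, n2), t_95_df14)
print(ci_equal, ci_unequal)
```

    With equal sample sizes the two standard errors coincide; they diverge when the sample sizes and variances both differ. The unequal-variance case normally also uses an adjusted df, which this sketch does not compute.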

  • Paired Sample Example: Presentation Ratings The Stevens Honda-Buick automobile dealership often sells to husband-wife pairs. The manager would like to check whether the sales presentation is viewed any more or less favorably by the husbands than the wives. If it is, then some new training might be recommended for its salespeople. To check for differences, a random sample of husbands and wives are asked (separately) to rate the sales presentation on a scale of 1 to 10, 10 being the most favorable rating. What can the manager conclude from these data?

  • Paired Sample Example: Presentation Ratings

    Use File: Sales Presentation Ratings.xlsx

    Source: S. Christian Albright, Wayne Winston and Christopher Zappe. Data Analysis and Decision Making, OH: South-Western, Cengage Learning, 2011. 4th Edition. ISBN: 9780538476126.

    One-Sample Analysis of Differences for Sales Presentation Data

  • Paired Sample Example: Presentation Ratings

    Use File: Sales Presentation Ratings.xlsx

    Source: S. Christian Albright, Wayne Winston and Christopher Zappe. Data Analysis and Decision Making, OH: South-Western, Cengage Learning, 2011. 4th Edition. ISBN: 9780538476126.

    Two-Sample Analysis of Sales Presentation Data

  • Question

    •  How can I know that my sample is paired?

    =CORREL(X,Y)
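    One reading of this slide is that a clearly positive correlation between the two columns suggests the observations are naturally paired, in which case the interval is built on the pairwise differences. The sketch below illustrates that reading with hypothetical husband and wife ratings; the function names are invented for this example, and the t-multiple 2.365 (95% confidence, df = 7) is read from a t table.

```python
import math
from statistics import mean, stdev

def pearson_correlation(x, y):
    """Sample correlation of two equal-length lists, as =CORREL(X,Y) computes."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def paired_difference_ci(x, y, t_multiple):
    """One-sample interval on the differences x_i - y_i."""
    diffs = [a - b for a, b in zip(x, y)]
    d_bar = mean(diffs)
    std_error = stdev(diffs) / math.sqrt(len(diffs))
    return d_bar - t_multiple * std_error, d_bar + t_multiple * std_error

# Hypothetical ratings from 8 husband-wife pairs (same row = same couple).
husbands = [6, 7, 5, 8, 6, 9, 4, 7]
wives = [7, 8, 5, 9, 8, 9, 6, 8]

r = pearson_correlation(husbands, wives)
lower, upper = paired_difference_ci(husbands, wives, t_multiple=2.365)
print(round(r, 3), lower, upper)
```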

  • TWO PROPORTION MEANS

  • Applications

    •  When an appliance store is about to have a sale, it sometimes sends selected customers a mailing to notify them of the sale. On other occasions it includes a coupon for 5% off the sale price in these mailings. The store’s manager would like to know whether the inclusion of coupons affects the proportion of customers who respond.

    •  A manufacturing company has two plants that produce identical products. The company wants to know how much the proportion of out-of-spec products differs across the two plants.

    •  An advertising agency would like to check whether men are more likely than women to switch TV channels when a commercial comes on. The agency runs an experiment where the channel-switching behavior of randomly chosen men and women can be monitored, and it collects data on the proportion of viewers who switch channels on at least half of the commercial times. The agency then compares these proportions across gender.

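  • C.I. for Difference Between Proportions

    $\hat{p}_1 - \hat{p}_2 \pm z\text{-multiple} \times SE(\hat{p}_1 - \hat{p}_2)$

    $SE(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$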
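    A minimal Python sketch of this interval follows. The purchase counts are hypothetical (35 of 150 without a coupon, 55 of 150 with one) and are not taken from the coupon example; the function name is invented here.

```python
import math
from statistics import NormalDist

def proportion_difference_ci(x1, n1, x2, n2, confidence=0.95):
    """Return (lower, upper) for p1-hat - p2-hat ± z × SE(p1-hat - p2-hat)."""
    p1, p2 = x1 / n1, x2 / n2
    std_error = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    diff = p1 - p2
    return diff - z * std_error, diff + z * std_error

# Hypothetical counts of purchasers: group 1 = no coupon, group 2 = coupon.
lower, upper = proportion_difference_ci(35, 150, 55, 150)
print(lower, upper)
```

    If the whole interval lies below zero, the data suggest a higher response proportion in the second group; an interval containing zero leaves the question open.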

  • Example: Coupon Effectiveness An appliance store is about to have a big sale. It selects 300 of its best customers and randomly divides them into two sets of 150 customers each. It then mails a notice of the sale to all 300 customers but includes a coupon for an extra 5% off the sale price to the second set of customers only. As the sale progresses, the store keeps track of which of these customers purchase appliances. What can the store’s manager conclude about the effectiveness of the coupons?

  • Example: Coupon Effectiveness

  • SAMPLE SIZE ESTIMATION

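  • Formula

    $\bar{X} \pm t\text{-multiple} \times s / \sqrt{n}$, where the half-length is $B = t\text{-multiple} \times s / \sqrt{n}$

    For a mean: $n = \left( \frac{t\text{-multiple} \times s}{B} \right)^2$

    For a proportion: $n = \left( \frac{z\text{-multiple}}{B} \right)^2 p_{est}(1 - p_{est})$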

  • Example: Sandwich The fast-food manager surveyed 40 customers, each of whom rated a new sandwich on a scale of 1 to 10. Based on the data, a 95% confidence interval for the mean rating of all potential customers extended from 5.739 to 6.761, with a half-length of (6.761 - 5.739)/2 = 0.511. The observed sample standard deviation is 1.597. How large a sample would be needed to reduce this half-length to approximately 0.3?

  • Example: New Sandwich

    $n = \left( \frac{t\text{-multiple} \times s}{B} \right)^2 = \left( \frac{1.96 \times 1.597}{0.3} \right)^2 = 108.86 \approx 109$

    Source: S. Christian Albright, Wayne Winston and Christopher Zappe. Data Analysis and Decision Making, OH: South-Western, Cengage Learning, 2011. 4th Edition. ISBN: 9780538476126.

    Use File: Satisfaction Ratings.xlsx
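    The sample-size arithmetic can be checked with a short Python sketch. The function names are invented for this example; the mean case uses the slide's own inputs (multiple 1.96, s = 1.597, B = 0.3), while the proportion case uses hypothetical inputs (B = 0.05, p_est = 0.5). Rounding up is an assumption made here so that the half-length does not exceed B.

```python
import math

def sample_size_for_mean(multiple, s, half_length):
    """n = (multiple × s / B)^2, rounded up to the next whole observation."""
    return math.ceil((multiple * s / half_length) ** 2)

def sample_size_for_proportion(multiple, p_est, half_length):
    """n = (multiple / B)^2 × p_est × (1 - p_est), rounded up."""
    return math.ceil((multiple / half_length) ** 2 * p_est * (1 - p_est))

# Sandwich example: half-length 0.3 with s = 1.597 and multiple 1.96.
n_mean = sample_size_for_mean(1.96, 1.597, 0.3)

# Hypothetical proportion case: half-length 0.05 with the conservative p_est = 0.5.
n_prop = sample_size_for_proportion(1.96, 0.5, 0.05)
print(n_mean, n_prop)
```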

  • Thank you!