
STAT 141 Confidence Intervals and Hypothesis Testing 10/26/04

Today (Chapter 7):

• CI with σ unknown, t-distribution

• CI for proportions

• Two sample CI with σ known or unknown

• Hypothesis Testing, z-test

Confidence Intervals with σ unknown

Last Time: Confidence Interval when σ is known:

A level C, or 100(1 − α)%, confidence interval for µ is

    [ X̄ − z_{α/2} σ/√n ,  X̄ + z_{α/2} σ/√n ]

But to return to reality, we don't know σ. Thus we must estimate the standard deviation of X̄ with:

    SE_{X̄} = s/√n

But s is just a function of our X_i's, and thus is a random variable too – it has a sampling distribution of its own.

Before, if we knew σ, we could say

    P( −z_{α/2} < (X̄ − µ)/(σ/√n) < z_{α/2} ) = 1 − α

which after algebra gave the confidence interval.

[Remember: for any s, z_s is defined as the point such that 1 − 2s of the area falls in (−z_s, z_s). So z_s = qnorm(1 − s) = −qnorm(s) = the 1 − s quantile; i.e. z_s is the positive side.]

Now we want a similar setup, so that:

    P( ?? < (X̄ − µ)/SE_{X̄} < ?? ) = 1 − α

We need to know the probability distribution of T = (X̄ − µ)/SE_{X̄}. T has the Student's t-distribution with n − 1 degrees of freedom. We write this as T ∼ t_{n−1}. The degrees of freedom ν is the only parameter of this distribution.
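To see concretely how much heavier the t tails are, one can compare t and normal quantiles in R (a small illustration, not from the original notes):

```r
# The 97.5% quantile of t_nu is much larger than the normal quantile
# for small df, which is why t intervals are wider.
alpha <- 0.05
for (df in c(1, 5, 10, 50)) {
  cat("df =", df, " qt =", round(qt(1 - alpha/2, df), 3), "\n")
}
cat("normal: qnorm =", round(qnorm(1 - alpha/2), 3), "\n")
```

As df grows, qt(1 − α/2, df) decreases toward qnorm(1 − α/2).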

[book uses t_s for T]


[Figure (tdist1.pdf): four panels overlaying the t density on the N(0,1) density, for df = 1, 5, 10, and 50. The t curves are lower at the center and heavier in the tails, approaching the normal as df increases.]

R Code:

    par(mfrow = c(2, 2))   # tdist1.pdf
    x <- seq(-6, 6, length = 10000)
    plot(x, dnorm(x), type = "l", lty = 3, ylab = "", xlab = "",
         main = "t-dist w/ df=1")
    lines(x, dt(x, df = 1))
    legend(x = 2, y = .4, lty = c(1, 3), legend = c("t-dist, df=1", "N(0,1)"))
    ...

Thus the t-distribution approaches the normal as ν increases, but for small n it gives wider intervals.

Why “degrees of freedom”??


Let y_i = x_i − x̄. We have

    s² = (1/(n − 1)) Σ_{i=1}^{n} y_i²   and   Σ y_i = 0   (*)

Now (*) is one constraint on the n numbers y_1, . . . , y_n, hence the phrase "n − 1 degrees of freedom".
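The constraint (*) is easy to check numerically (a quick sketch, not in the original notes):

```r
# Deviations from the mean always sum to zero (up to rounding),
# so knowing any n-1 of the y_i determines the last one.
x <- c(0.7, -1.6, -0.2, -1.2, -0.1)   # an arbitrary sample
y <- x - mean(x)
sum(y)   # essentially 0
```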

Now that we know the distribution, we can find the "??" from above – these are just the α/2 and 1 − α/2 quantiles of the t-distribution. Let t_{n−1,s} be defined similarly to z_s; it is equal to qt(1 − s, df = n − 1) = −qt(s, df = n − 1). We then have:

    P( −t_{n−1,α/2} < (X̄ − µ)/SE_{X̄} < t_{n−1,α/2} ) = 1 − α

This gives us a confidence interval like before, only we use the quantiles of the t-distribution rather than the normal distribution.
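The recipe can be sketched as a small R helper (the function name is made up for illustration; R's built-in t.test does the same thing):

```r
# Level-C t confidence interval for mu when sigma is unknown:
# mean(x) plus/minus qt(1 - alpha/2, n - 1) * s / sqrt(n)
t_ci <- function(x, conf = 0.95) {
  n  <- length(x)
  se <- sd(x) / sqrt(n)                            # SE of the sample mean
  m  <- qt(1 - (1 - conf) / 2, df = n - 1) * se    # margin of error
  c(lower = mean(x) - m, upper = mean(x) + m)
}
```

This agrees with the interval reported by t.test(x)$conf.int.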

Example. Taken from the original paper on the t-test by W.S. Gosset, 1908. [Gosset was employed by Guinness Breweries, Dublin. A chemist turned statistician; Guinness, fearing the results to be of commercial importance, forbade Gosset to publish under his own name. He chose the pseudonym "Student" out of modesty.]

Two drugs to induce sleep: A = "dextro", B = "laevo". Each of ten patients receives both drugs (presumably in random order). Issue: Is drug B better than drug A? Student's sleep data:


    > data(sleep)
    > sleep
       extra group
    1    0.7     1
    2   -1.6     1
    3   -0.2     1
    4   -1.2     1
    5   -0.1     1
    6    3.4     1
    7    3.7     1
    8    0.8     1
    9    0.0     1
    10   2.0     1
    11   1.9     2
    12   0.8     2
    13   1.1     2
    14   0.1     2
    15  -0.1     2
    16   4.4     2
    17   5.5     2
    18   1.6     2
    19   4.6     2
    20   3.4     2
    > extra1 <- sleep[sleep[,2] == 1, ]
    > extra2 <- sleep[sleep[,2] == 2, ]
    > extradiff <- extra2[,1] - extra1[,1]
    > extradiff
     [1] 1.2 2.4 1.3 1.3 0.0 1.0 1.8 0.8 4.6 1.4
    > mean(extradiff)
    [1] 1.58
    > sqrt(var(extradiff))
    [1] 1.229995
    > sqrt(var(extradiff)/10)
    [1] 0.3889587
    > 1.58/0.38896
    [1] 4.062114
    > qt(.975, 9)
    [1] 2.262157
    > qt(.995, 9)
    [1] 3.249836
    > qnorm(0.975)
    [1] 1.959964
    > qnorm(0.995)
    [1] 2.575829

A level C conf. interval with σ unknown:

• exact if X Normal

• otherwise approximately correct for large n

• Margin of error M in E ± M is

    t_{n−1,α/2} s/√n = t_{n−1,α/2} SE_{X̄}

Remark: the large value 4.6 is a possible outlier, so there is some doubt about the normality assumption here.

What's different?? Since we don't know σ, we pay a penalty with a (slightly) wider interval (e.g. t = 2.262 vs. z = 1.96 for 5% level confidence).

For large sample sizes we can just use the normal distribution quantiles z_{α/2}, since the t-distribution quickly looks like the normal distribution.

Proportions

We saw last time that p̂ is approximately distributed as N(p, p(1 − p)/n). If we want a confidence interval for p we can use this normality to get an approximate confidence interval.


    M = z_{α/2} × SE_{p̂} = z_{α/2} √( p̂(1 − p̂)/n )

The book offers a correction to this using p̃ = (y + 0.5 z²_{α/2}) / (n + z²_{α/2}) and SE_{p̃} = √( p̃(1 − p̃)/(n + z²_{α/2}) ).
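Both versions are easy to compute side by side; a small sketch (the function name and arguments are made up for illustration):

```r
# Approximate CI for a proportion from y successes in n trials:
# plain normal-approximation form, and the adjusted form described above.
prop_ci <- function(y, n, conf = 0.95) {
  z      <- qnorm(1 - (1 - conf) / 2)
  phat   <- y / n
  plain  <- phat + c(-1, 1) * z * sqrt(phat * (1 - phat) / n)
  ptilde <- (y + 0.5 * z^2) / (n + z^2)
  adj    <- ptilde + c(-1, 1) * z * sqrt(ptilde * (1 - ptilde) / (n + z^2))
  list(plain = plain, adjusted = adj)
}
```

The adjusted interval behaves better for small n or for p near 0 or 1.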

Two-samples

One of the most common statistical procedures. Is there a difference? Is it real?? However, because of the preparatory work with one-sample problems, this should seem rather familiar – a case of déjà vu, but with slightly more complex formulas.

What do we mean by “two-samples”?

• Two groups

• Distinct populations [treatment/control, . . . , male/female . . . ]

• Grouping variable: categorical variable with 2 levels.

• Data is independent between groups

Example: (Dalgaard p 87) Energy expenditure: two groups of women, lean and obese. Twenty-four hour energy expenditure in MJ.

    > data(energy)
    > lean <- energy[energy$stature == 'lean', 1]
    > obese <- energy[energy$stature == 'obese', 1]
    > obese
    [1]  9.21 11.51 12.79 11.85  9.97  8.79  9.69  9.68  9.19
    > lean
     [1]  7.53  7.48  8.08  8.09 10.15  8.40 10.88  6.13  7.90  7.05  7.48
    [12]  7.58  8.11
    > plot(expend ~ stature, data = energy)

Beware: some data sets that may look like two-sample problems are really better treated as paired data.

Example: Sleep drugs data from above: 10 patients, Drugs A and B. But since each patient received both A and B, the samples are not really independent (common component of variation due to patient) – better to look at differences. Becomes a one-sample problem. (Will discuss more about pairing/blocking later.)

Notation:

    Population      Variable   Mean   SD
    Population 1    X1         µ1     σ1
    Population 2    X2         µ2     σ2

    SRS from Each Population   Sample Size   Sample Mean   Sample SD
    Sample 1                   n1            X̄1            s1
    Sample 2                   n2            X̄2            s2


Distribution of X̄1 − X̄2

Sample mean difference: X̄1 − X̄2 – all depends on the variability and distribution of this difference!!

Recall in general that if E(V) = µ and E(W) = ν then

    E(V − W) = µ − ν

and if V and W are independent then

    var(V − W) = var(V) + var(W)

So if X̄1 ∼ (µ1, σ1²/n1) and X̄2 ∼ (µ2, σ2²/n2), we will have

    µ_{X̄1−X̄2} = E(X̄1 − X̄2) = µ1 − µ2

and for independent rvs X̄1 and X̄2:

    σ²_{X̄1−X̄2} = σ²_{X̄1} + σ²_{X̄2} = σ1²/n1 + σ2²/n2

We need estimates for µ1 − µ2 and σ²_{X̄1−X̄2}.

Clearly X̄1 − X̄2 is the estimate for µ1 − µ2. Once we have an estimate for σ_{X̄1−X̄2}, we can use a similar method as in the 1-sample case to get a confidence interval.

1. Unequal variances: if σ1² ≠ σ2², then use

    SE²_{X̄1−X̄2} = s1²/n1 + s2²/n2

2. Equal variances: if σ1² = σ2² = σ² is unknown but assumed to be equal, we can use a pooled estimate of the variance σ²:

    s²_pooled = [ (n1 − 1)s1² + (n2 − 1)s2² ] / (n1 + n2 − 2)

i.e. an average with weights equal to the respective degrees of freedom. Then our estimate of σ²_{X̄1−X̄2} is

    SE²_pooled = s²_pooled ( 1/n1 + 1/n2 )

• Good method if the two SDs are close; but if the sample sizes are also moderate to large, there won't be much difference from the unequal variances method (below)

• If the two SDs are different, better to use the unequal variances method.

• We will use this pooled estimate again when we study Analysis of Variance.
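The two standard errors are easy to compare directly (a sketch; the helper names are made up):

```r
# Pooled SE (assumes equal variances) vs. unequal-variances SE
pooled_se <- function(s1, s2, n1, n2) {
  sp2 <- ((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / (n1 + n2 - 2)  # s^2_pooled
  sqrt(sp2 * (1 / n1 + 1 / n2))
}
unequal_se <- function(s1, s2, n1, n2) sqrt(s1^2 / n1 + s2^2 / n2)
```

A bit of algebra shows that when n1 = n2 the two formulas coincide exactly; they differ when the sample sizes are unequal.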

As above, we need the distribution of:

    T = [ X̄1 − X̄2 − µ_{X̄1−X̄2} ] / ( SE of X̄1 − X̄2 )

If X1 ∼ N(µ1, σ1²) and X2 ∼ N(µ2, σ2²) then:


• Equal Variances: if we have equal variances in the two populations, then SE of X̄1 − X̄2 = SE_pooled and T ∼ t_ν with ν = n1 + n2 − 2.

• Unequal Variances: then SE of X̄1 − X̄2 = SE_{X̄1−X̄2} and T is approximately distributed as t_ν. We use one of two values for ν:

1. ν = min(n1 − 1, n2 − 1)

2.

    ν′ = ( s1²/n1 + s2²/n2 )² / [ (1/(n1 − 1))(s1²/n1)² + (1/(n2 − 1))(s2²/n2)² ]

This is known as Welch's formula, which gives fractional degrees of freedom. It is the more accurate formula (generally used by packages, and only on computers!).

You can use either approximation, but say which! Note that one can generally not go too far wrong, since one can show by algebra that

    min(n1 − 1, n2 − 1) ≤ ν′ ≤ n1 + n2 − 2
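Welch's ν′ and its bounds can be checked directly in R (a sketch; welch_df is a made-up name):

```r
# Welch's approximate degrees of freedom for the two-sample t statistic
welch_df <- function(s1, s2, n1, n2) {
  v1 <- s1^2 / n1   # estimated variance of xbar1
  v2 <- s2^2 / n2   # estimated variance of xbar2
  (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
}
```

With equal SDs and equal sample sizes the formula reduces to exactly n1 + n2 − 2, and in general it stays between min(n1 − 1, n2 − 1) and n1 + n2 − 2, as stated above.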

Summary: two-sample confidence intervals for µ1 − µ2 at the 100(1 − α)% level:

    E ± M,   E = X̄1 − X̄2,   M = (z_{α/2} or t_{α/2,ν}) × (appropriate SE)

    Case                 Margin of error M                       ν
    σ's known            z_{α/2} √( σ1²/n1 + σ2²/n2 )            –
    large sample         z_{α/2} √( s1²/n1 + s2²/n2 )            –
    σ unknown, unequal   t_{α/2,ν} √( s1²/n1 + s2²/n2 )          min(n1 − 1, n2 − 1) or ν′
    σ unknown, equal     t_{α/2,ν} s_pooled √( 1/n1 + 1/n2 )     n1 + n2 − 2

where z_{α/2} and t_{α/2,ν} are the same notation as for the one-sample case.

In the energy data above, we can construct a 95% confidence interval for the difference in the true means between obese and lean. n1 = 9, n2 = 13 and X̄1 − X̄2 = 2.23. We'll use the conservative estimate ν = min(9 − 1, 13 − 1) = 8. SE_{X̄1−X̄2} = 0.58, so our M = 2.24 × 0.58 = 1.30. Then a (conservative) 95% confidence interval is [0.93, 3.53]. Computer output for Welch's formula gives [1.00, 3.46].

    > mean(obese) - mean(lean)
    [1] 2.231624
    > qt(.9725, df = 8)
    [1] 2.244938
    > sqrt(var(obese)/length(obese) + var(lean)/length(lean))
    [1] 0.5788152
    > t.test(obese, lean, conf.level = .95)

            Welch Two Sample t-test

    data:  obese and lean
    t = 3.8555, df = 15.919, p-value = 0.001411
    alternative hypothesis: true difference in means is not equal to 0


    95 percent confidence interval:
     1.004081 3.459167
    sample estimates:
    mean of x mean of y
    10.297778  8.066154

Hypothesis Tests

We will generally have some hypotheses about certain parameters of the population (or populations)from which our data arose, and we will be interested in using our data to see whether these hypothesesare consistent with what we have observed.

To do this, we have already calculated confidence intervals for them; now we will be conducting hypothesis tests about the population parameters of interest. We will discuss these two statistical procedures; in general, they are built on the idea that if some theory about the population parameters is true, the observed data should follow admittedly random, but generally predictable, patterns. Thus, if the data do not fall within the likely outcomes under our supposed ideas about the population, we will tend to disbelieve these ideas, as the data do not strongly support them.

We will initially be interested in using our data to make inferences about µ, the population mean. To do this, we will use our estimate of location from the data; namely, the sample mean (average), since it is mathematically nicer than the median. We will do this in the framework of several different data structures, starting with the most basic, the one-sample situation. How can we decide if a given set of data, and in particular its sample mean, is close enough to a hypothesized value for µ for us to believe that the data are consistent with this value? In order to answer such a question, we need to know how a statistic like the sample average behaves, i.e. its distribution.

Now, we have already studied the distribution of the sample average and of sample proportions: when the sample size is large enough, they follow Normal distributions, centered at the expected value and with a spread of the order of the relevant SE.

INFERENCE FOR A SINGLE SAMPLE: Z-DISTRIBUTION

Standard Error of the Sample Mean (σ known)

Example: Testing whether the birthweights of the Secher babies have above average mean.

Standard deviation of the original population: σ = 700 – known.

We would like to test whether µ = 2500, versus the alternative µ > 2500.

We have a sample of n = 107 observations. mean(bwt) gives that X̄ = 2739; we would like to use this data to test µ > 2500.

We have a sample of size 107, and we know that X̄ will be normal with variance σ²/n = 700²/107 = 490000/107.

If it is true that µ = 2500 (this is called the null hypothesis), then under the central limit theorem, X̄ ∼ N(2500, 490000/107) = N(2500, 67.7²), and under the null hypothesis

    P( X̄ ≥ 2739 ) = P( (X̄ − µ)/(σ/√n) ≥ (2739 − 2500)/67.7 ) = P( Z ≥ 3.53 )

What is the probability that a standard normal Z score is as big as 3.53?

    P(Z > 3.53) = 1 − P(Z ≤ 3.53) = 1 − Φ(3.53) = 0.000207


using the R command pnorm(3.53), which returns [1] 0.9997922, so the probability is 1 − 0.9997922 ≈ 0.0002.

This probability is indeed very small: the observed X̄ would be very unlikely if the null hypothesis were true. We reject the null hypothesis.

Let X1, . . . , Xn be a sample of n i.i.d. random variables from a distribution having unknown mean µ and known standard deviation σ. Assume n is large, say n > 30. Suppose interest centers on testing the hypothesis

H0 : µ = µ0,

where µ0 is some fixed, pre-specified value. This will be our null hypothesis; notice that it is a simple one, i.e. it postulates a single hypothesized value for µ. The hypothesis against which the null hypothesis is to be compared, the alternative hypothesis, can take one of three basic forms:

1. HA : µ ≠ µ0

2. HA : µ > µ0

3. HA : µ < µ0

The idea, as we have said, is to assess whether the data support the null hypothesis (H0) or whether they suggest the relevant alternative (HA).

To begin, we assert that the null hypothesis is true (i.e. that the true value of µ is actually µ0). Under this assumption, the Central Limit Theorem implies that the test statistic

    Z = (X̄ − µ0)/(σ/√n),

has a standard normal (N(0, 1)) distribution (notice that the test statistic is just the standardized version of X̄ under the assumption that the true mean is actually equal to µ0). The usual convention applies: if σ is unknown and n is large, then the sample standard deviation, s, is used in place of σ in forming the test statistic. The null hypothesis is supported if the observed value of the test statistic is small (i.e. X̄ is close enough to µ0, the hypothesized value, so that I would believe that the true mean is µ0). On the other hand, if I observe a large value of the test statistic, this suggests that X̄ is far from µ0, which tends to discredit the null hypothesis in favor of the alternative hypothesis HA : µ ≠ µ0.

The real issue is “how large is large?” (or small is small?).

For example, if I observe a Z value of 1, say, can we conclude in favor of H0 over HA, or should we prefer HA over H0? What about a Z value of −2? The answer to these questions lies in considering what the test statistic actually measures. In words, the observed value of Z is just the number of standard errors the observed sample mean is from the hypothesized population mean; i.e.

    Z_obs = number of standard errors X̄ is away from µ0

This is determined by how rare a rare event should be to make us think something other than H0 is going on. This determines what we call the significance level α; most often α is taken to be 5%, sometimes 10%, and sometimes even 0.1% (1/1000).

We compute the P-value, which is the probability of observing a value 'as extreme as' this.

The P-value computation takes either P(|Z| > |Z_obs|), P(Z > Z_obs), or P(Z < Z_obs), depending on what the alternative HA was.
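The three cases can be sketched as a small R helper (the function name is made up for illustration):

```r
# P-value of an observed z statistic under each form of the alternative
z_pvalue <- function(z_obs, alternative = c("two.sided", "greater", "less")) {
  alternative <- match.arg(alternative)
  switch(alternative,
         two.sided = 2 * (1 - pnorm(abs(z_obs))),  # HA: mu != mu0
         greater   = 1 - pnorm(z_obs),             # HA: mu >  mu0
         less      = pnorm(z_obs))                 # HA: mu <  mu0
}
```

For the birthweight example, z_pvalue(3.53, "greater") reproduces the 0.0002 found above.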
