
BMI 541/699: Lecture 11

We have covered:

1. Introduction and Experimental Design

2. Exploratory Data Analysis

3. Probability

4. Distribution of sample statistics

5. Testing hypotheses about the sample mean(s)

- One sample t-test
- Two sample t-test (two sided p-value)
- Relationship between one sample t-test and the confidence interval for the mean
- Confidence interval for the difference of two means
- Paired t-test
- Using R for t-tests and confidence intervals


Review: Assumptions

The hypothesis tests and confidence intervals that we have learned are valid if:

1. All samples are simple random samples.

2. The sampling distribution of the sample mean(s) is approximately normal.

This is true when:

- the distribution of X is not too skewed
- the sample size is large enough for the central limit theorem to apply


Large samples

We have discussed large samples in two settings.

• Samples that are large enough for the central limit theorem to hold. The needed sample size depends on the skewness of the population distribution of our variable (I said n ≥ 30 but it really depends on the data).

• Samples that are large enough so that the normal distribution can be used for confidence intervals and t-tests. n ≥ 50 is sufficient but it’s better to just use the t-distribution for all samples.
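The second point can be checked numerically. The following is a minimal R sketch comparing the upper 2.5% critical value of the t-distribution with that of the standard normal; the degrees of freedom listed are illustrative choices, not values prescribed by the lecture.

```r
# Compare upper 2.5% critical values: t-distribution vs. standard normal.
df <- c(7, 13, 29, 49, 236)      # illustrative degrees of freedom
t_crit <- qt(0.975, df)          # t critical values, one per df
z_crit <- qnorm(0.975)           # normal critical value (about 1.96)

comparison <- data.frame(
  df = df,
  t_crit = round(t_crit, 3),
  excess_over_z = round(t_crit - z_crit, 3)
)
print(comparison)
```

The t critical value is always larger than the normal one and approaches it as the degrees of freedom grow, which is why using the t-distribution for all samples costs nothing when n is large.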


Checking Assumptions for T-based confidence intervals and tests

1. All samples must be simple random samples.

a. Every individual in the population has an equal chance of being in the sample.

b. The fact that one individual is in the sample does not change the probability that any other individual is in the sample (independence).

2. The distribution of the sample mean(s) must be approximately normal. We can check this:

a. Create a histogram of the sample values (one for each sample if doing a 2 sample t-test).

b. Check that it is not too skewed.
   - Small sample sizes require symmetric distributions for the CLT to hold.
   - Large sample sizes can have more skewed distributions and the CLT will still hold.
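A small simulation can illustrate the last two points. This is only a sketch: the Exponential(1) population, the sample sizes 5 and 50, and the helper names `skewness` and `sample_means` are assumptions chosen for illustration, not part of the lecture.

```r
# Simulation sketch: how skewed is the sampling distribution of the mean?
# The Exponential(1) population is an illustrative right-skewed choice.
set.seed(11)

skewness <- function(x) {
  # Moment-based sample skewness (near 0 for a symmetric distribution)
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

sample_means <- function(n, reps = 5000) {
  # Draw `reps` samples of size n and return their sample means
  replicate(reps, mean(rexp(n, rate = 1)))
}

means_small <- sample_means(5)
means_large <- sample_means(50)

cat("skewness of sample means, n = 5: ", round(skewness(means_small), 2), "\n")
cat("skewness of sample means, n = 50:", round(skewness(means_large), 2), "\n")
```

Calling `hist(means_small)` and `hist(means_large)` shows the same thing graphically: with a skewed population, the sample means from the larger samples look much closer to normal.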


Review: Confidence Interval for the population mean.

If

• the above assumptions hold

• the variable X has mean = µ and sd = σ

then the 95% confidence interval for µ is:

$$\left(\bar{x} - t_{0.025,\,n-1}\,\frac{s}{\sqrt{n}},\ \ \bar{x} + t_{0.025,\,n-1}\,\frac{s}{\sqrt{n}}\right)$$

where $t_{0.025,\,n-1}$ is the number $t$ such that $\Pr(T_{n-1} > t) = 0.025$.

[Figure: T(n−1) distribution with area = 0.025 in each tail, below −t0.025, n−1 and above t0.025, n−1]


Connection between Hypothesis Testing and Confidence Intervals

Suppose that we have n = 8 observations from N(µ, σ²).

We observe x̄ = 37.1 and s = 2.51.

We want to test H0 : µ = 35 vs. HA : µ ≠ 35.

If we reject H0 when the P-value is less than 5% then 5% is called the α level or significance level or size of the test.

• Test statistic: t = (37.1 − 35)/(2.51/√8) = 2.368.

• P-value = 2 × Pr(T7 ≥ 2.368) = 2 × .0249 = .0498.

• We reject H0 at the 5% level (but just barely).

We would reject H0 at the 5% level for any test statistic with absolute value greater than t0.025, 7 = 2.365.
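The same calculation can be sketched in R from the summary statistics alone. This assumes only the rounded values x̄ = 37.1 and s = 2.51 given above, so the result may differ in the third decimal place from a calculation on the raw data.

```r
# One-sample t-test from summary statistics (rounded values from the example).
n <- 8; xbar <- 37.1; s <- 2.51; mu0 <- 35

se <- s / sqrt(n)                 # estimated sd of the sample mean
t_stat <- (xbar - mu0) / se       # one-sample t statistic
p_value <- 2 * pt(abs(t_stat), df = n - 1, lower.tail = FALSE)  # two-sided
t_crit <- qt(0.975, df = n - 1)   # critical value t_{0.025, 7}

cat("t =", round(t_stat, 3),
    " p-value =", round(p_value, 4),
    " critical value =", round(t_crit, 3), "\n")
```

The test statistic lands just above the critical value, which is what "rejecting just barely" looks like numerically.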


The Rejection Region for a hypothesis test at the 5% level is the set of possible values of the test statistic which will result in a p-value less than or equal to 0.05.

For our example the rejection region is

(−∞, −2.365) plus (2.365, ∞)

[Figure: distribution of the test statistic; the blue tail areas beyond −2.365 and 2.365 sum to 0.05, and the red line marks the rejection region.]

Values of µ0 for which the observed x̄ produces a test statistic that is not in the rejection region will be in the 95% confidence interval for µ.


In general, we reject the null hypothesis µ = µ0 versus the two-sided alternative at the 0.05 level of significance if and only if a 95% confidence interval for µ does not contain µ0.

In other words:

• if the 95% CI for µ contains µ0, then we fail to reject H0 : µ = µ0 at the 0.05 level (the p-value will be larger than 0.05).

• if the 95% CI for µ does not contain µ0, then we reject H0 : µ = µ0 at the 0.05 level (the p-value will be smaller than 0.05).

In our example the 95% CI for µ is

37.1 ± 2.365 × 2.51/√8 = (35.001, 39.199)

• The 95% CI for µ does not include 35 (but just barely).

• The p-value for the two sided t-test was 0.0498. We rejected the null hypothesis (but just barely).
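The link between the interval and the test can be sketched in R. The code below assumes only the rounded summary statistics from the example; the variable names are illustrative.

```r
# 95% CI for mu from summary statistics, plus a check of the CI/test duality.
n <- 8; xbar <- 37.1; s <- 2.51; mu0 <- 35

t_crit <- qt(0.975, df = n - 1)
margin <- t_crit * s / sqrt(n)
ci <- c(lower = xbar - margin, upper = xbar + margin)
print(round(ci, 3))

# mu0 lies outside the interval exactly when |t| exceeds the critical value.
t_stat <- (xbar - mu0) / (s / sqrt(n))
outside_ci <- mu0 < ci[["lower"]] || mu0 > ci[["upper"]]
rejects <- abs(t_stat) > t_crit
cat("mu0 outside CI:", outside_ci, "  reject H0:", rejects, "\n")
```

Changing `mu0` to any value inside the printed interval flips both answers to FALSE together.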


Confidence intervals for the difference between means from two independent samples

Recall the previous BMD Example: We wish to determine whether maternal cigarette smoking has any effect on bone mineral density in newborns.

• Population 1: Infants of mothers who smoked during pregnancy.

X1 = BMD for infants in population 1.

• Population 2: Infants of mothers who did not smoke during pregnancy.

X2 = BMD for infants in population 2.

H0 : µ1 − µ2 = 0 HA : µ1 − µ2 ≠ 0

n1 = 77 x̄1 = 0.098 s1 = 0.026

n2 = 161 x̄2 = 0.095 s2 = 0.03

Find a 95% CI for the effect of maternal smoking on BMD assuming equal variances. That is, a 95% CI for µ1 − µ2.

We know:

• x̄1 − x̄2 is our best estimate for µ1 − µ2

• sp = the pooled estimate of σ (assuming σ1 = σ2 = σ)

• $\mathrm{sd}(\bar{X}_1 - \bar{X}_2) = s_p\sqrt{1/n_1 + 1/n_2}$

Using the same method we used for a confidence interval for the mean of a single population, we find that the 95% CI for µ1 − µ2 is:

$$(\bar{x}_1 - \bar{x}_2) \pm t_{0.025,\,df}\; s_p\sqrt{1/n_1 + 1/n_2}$$


Last time we calculated

• $s_p^2 = 0.0008278$

• $s_p = 0.0288$

• sp has 77 + 161 − 2 = 236 degrees of freedom.

• The pooled estimate of the standard deviation of X̄1 − X̄2 is

$$s_p\sqrt{1/77 + 1/161} = 0.00399$$


The final thing we need for the CI is t0.025, 236 where

Pr(T236 > t0.025, 236) = 0.025

From R or t-tables t0.025, 236 = 1.97

[Figure: T236 distribution with area = 0.025 to the right of 1.97]

So the 95% CI for µ1 − µ2 (BMD smoking − BMD non smoking) is

$$(\bar{x}_1 - \bar{x}_2) \pm 1.97 \times s_p\sqrt{1/n_1 + 1/n_2} = 0.003 \pm 1.97 \times 0.00399 = 0.003 \pm 0.008 = (-0.005,\ 0.011)$$

[Figure: number line for smoking − non smoking showing the interval (−0.005, 0.011), with 0 and the estimate 0.003 marked inside it]
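The pooled calculation can be sketched in R from the summary statistics. The sketch takes s2 = 0.03, the value consistent with the pooled variance 0.0008278 quoted earlier; the variable names are illustrative.

```r
# Pooled (equal-variance) 95% CI for mu1 - mu2 from summary statistics.
n1 <- 77;  xbar1 <- 0.098; s1 <- 0.026   # infants of mothers who smoked
n2 <- 161; xbar2 <- 0.095; s2 <- 0.030   # infants of mothers who did not smoke

df <- n1 + n2 - 2                                    # 236 degrees of freedom
sp2 <- ((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / df      # pooled variance
sp <- sqrt(sp2)                                      # pooled sd
se_diff <- sp * sqrt(1 / n1 + 1 / n2)                # sd of xbar1 - xbar2
t_crit <- qt(0.975, df)                              # t_{0.025, 236}

diff_hat <- xbar1 - xbar2
ci <- diff_hat + c(-1, 1) * t_crit * se_diff

cat("sp^2 =", signif(sp2, 4), " se =", signif(se_diff, 3),
    " t_crit =", round(t_crit, 2), "\n")
cat("95% CI: (", round(ci[1], 3), ",", round(ci[2], 3), ")\n")
```

Because the interval contains 0, the corresponding two-sided test at the 5% level does not reject H0.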

Just as in the one sample case, the Rejection Region is the set of possible values of the test statistic which will result in a p-value less than or equal to 0.05.

[Figure: distribution of the test statistic; the blue tail areas beyond −1.97 and 1.97 sum to 0.05, and the red line marks the rejection region.]

• Any test statistic (t-statistic) that is in the rejection region will allow us to reject H0.

• Hypothesized values of µ1 − µ2 for which the observed x̄1 − x̄2 produces a test statistic that is not in the rejection region will be in the 95% confidence interval for µ1 − µ2.


Why report confidence intervals?

When we did the 2-sample t-test of differences in BMD we failed to reject H0: the two means are equal.

This can happen for two reasons:

• H0 is true

• We failed to gather enough data to show that H0 is false

The very narrow confidence interval (relative to clinically important changes in BMD) tells us that failing to reject the null hypothesis is probably not due to small sample size.

[Figure: number line for smoking − non smoking showing the 95% CI (−0.005, 0.011), with 0 and the estimate 0.003 marked inside it]

This is the advantage of reporting CIs as well as P-values.

It is more informative to include estimates and confidence intervals in presentations and publications as well as P-values.


Comparing Population Means for Paired Data

Example: A study was conducted to investigate whether oat bran cereal helps to lower serum cholesterol in males with high cholesterol.

• population = men with high cholesterol who do not eat oat bran in their normal diet.

• A simple random sample of size 14 was obtained.

• It is known that LDL cholesterol follows an approximately normal distribution in this population.

For each subject:

day 1 measure LDL cholesterol

next 2 weeks: change breakfast to include oat bran cereal

1st day after diet ends: measure LDL cholesterol


LDL Example continued

Define

X1 = LDL cholesterol before oat bran diet

X2 = LDL cholesterol after oat bran diet

We wish to test: H0 : µ1 − µ2 = 0 versus HA : µ1 − µ2 ≠ 0

At this point the setup looks like a two-sample t-test.

What are the assumptions for a two-sample t-test?

1. Two independent simple random samples, one from each population.

2. Approximately normally distributed sample means.

Are these met?


Here is the data:

LDL Cholesterol

Subject   Before Oat Bran (X1)   After Oat Bran (X2)
1         4.61                   3.84
2         6.42                   5.57
3         5.40                   5.85
4         4.54                   4.80
...       ...                    ...
13        2.25                   1.84
14        4.24                   4.14

Are the two means approximately normally distributed?

Are the two samples independent?


The within-person difference is what we are interested in.

The data again:

Subject   Before   After   D = After − Before
1         4.61     3.84    −0.77
2         6.42     5.57    −0.85
3         5.40     5.85     0.45
4         4.54     4.80     0.26
...       ...      ...      ...
13        2.25     1.84    −0.41
14        4.24     4.14    −0.10

The D values summarize the interesting information in the data.

If we state our null hypothesis in terms of D we can use a one sample t-test since the D measurements are independent (one from each subject).
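As a sketch, the differences can be computed in R for the six subjects whose values are listed. The data frame name `ldl` and its column names are illustrative, and subjects 5 to 12 are not reproduced because their values are elided in the table.

```r
# LDL values for the six subjects listed above (subjects 5-12 are elided
# in the table, so they are not reproduced here).
ldl <- data.frame(
  subject = c(1, 2, 3, 4, 13, 14),
  before  = c(4.61, 6.42, 5.40, 4.54, 2.25, 4.24),
  after   = c(3.84, 5.57, 5.85, 4.80, 1.84, 4.14)
)

# Within-person difference, after - before: one value per subject
ldl$D <- ldl$after - ldl$before
print(ldl)
```

Because eight subjects are missing here, the mean of these six differences is not the D̄ of the full sample; the point is only that each subject contributes a single D value.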


What is our null hypothesis?

We wish to know if eating oat bran changes cholesterol so we want to test

H0 : µD = 0 versus HA : µD ≠ 0

This has exactly the same form as hypotheses for a 1 sample t-test

H0 : µ = µ0 versus HA : µ ≠ µ0

if we set µ0 = 0.

The one sample t-test of the differences is called a paired t-test.


To conduct the test we need D̄ and sD, the sample mean and standard deviation of the differences.

Here are some statistics calculated from the data:

Variable                        Sample Mean    Sample SD    n
X1 = LDL before                 x̄1 = 4.44      s1 = 0.97    n1 = 14
X2 = LDL after                  x̄2 = 4.08      s2 = 1.06    n2 = 14
D = after − before = X2 − X1    D̄ = −0.36      sD = 0.41    nD = 14

Note: D̄ = x̄2 − x̄1 and n1 = n2 = nD

However, sD cannot be calculated from s1 and s2 alone.

We only need the statistics in the third row for our test.


What assumptions are we making?

The assumptions of a one sample t-test applied to the differences.

• The differences are measured on a SRS from the population.

• The distribution of the differences is relatively symmetric.


Recall we are testing:

H0 : µD = 0 versus HA : µD ≠ 0

The test statistic is

$$t = \frac{\bar{D} - 0}{s_D/\sqrt{n_D}} = \frac{-0.36 - 0}{0.41/\sqrt{14}} = -3.29$$

$$\text{P-value} = 2 \times \Pr(T_{14-1} > |-3.29|) = 2 \times \Pr(T_{13} > 3.29) = 2 \times 0.0030 = 0.0060$$

We have strong evidence against H0 and for the claim that oat bran cereal lowers LDL cholesterol.
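The paired test statistic and p-value can be sketched in R using only the third row of summary statistics (D̄, sD, nD). The variable names are illustrative.

```r
# Paired t-test from the summary statistics of the differences.
d_bar <- -0.36; s_d <- 0.41; n_d <- 14; mu0 <- 0

se <- s_d / sqrt(n_d)                  # estimated sd of the mean difference
t_stat <- (d_bar - mu0) / se           # paired t statistic
p_value <- 2 * pt(abs(t_stat), df = n_d - 1, lower.tail = FALSE)  # two-sided

cat("t =", round(t_stat, 2), " df =", n_d - 1,
    " p-value =", round(p_value, 4), "\n")
```

With the raw data available, `t.test(after, before, paired = TRUE)` would carry out the same test directly.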


Paired t-tests summary

Definition: A sample is made up of sampling units (often subjects but can be families, hospitals, etc.)

A paired t-test is used when the measurements are paired because the sampling units in the SRS are measured twice. For example:

• SRS of patients - before and after measurements of each.

• SRS of families - two children measured from each.

• SRS of litters - two mice measured from each.

• SRS of rats - two eyes measured from each.


The key is to figure out:

• What are the sampling units: people, families, litters, rats, . . .

• How many measurements are taken on each sampling unit.

2 SRSs, 1 measurement on each sampling unit → 2 sample t-test.

1 SRS randomized into 2 groups, 1 measurement on each sampling unit → 2 sample t-test.

1 SRS, 2 measurements on each sampling unit → paired t-test.


Confidence interval for µD (the difference between paired variables)

Recall:

Variable                        Sample Mean    Sample SD    n
X1 = LDL before                 x̄1 = 4.44      s1 = 0.97    n1 = 14
X2 = LDL after                  x̄2 = 4.08      s2 = 1.06    n2 = 14
D = after − before = X2 − X1    D̄ = −0.36      sD = 0.41    nD = 14

Just like the paired hypothesis test, the CI for the mean difference requires only the information in the third line.


The 95% CI for a paired sample is the same as a one sample 95% CI for the mean of the variable D.

$$\bar{D} \pm t_{0.025,\,df} \times \frac{s_D}{\sqrt{n_D}}$$

df = 14 − 1 = 13, so t0.025, 13 = 2.16

[Figure: T13 distribution with area = 0.025 to the right of t0.025, 13 = 2.16]


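So the 95% CI for µD (after − before) is

$$\bar{x}_D \pm t_{0.025,\,df}\; s_D/\sqrt{n_D} = -0.36 \pm 2.16 \times 0.41/\sqrt{14} = -0.36 \pm 0.24 = (-0.60,\ -0.12)$$

[Figure: number line for D showing the interval (−0.60, −0.12) around −0.36, with 0 outside the interval]

We have strong evidence against the null hypothesis but a fairly wide confidence interval.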
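The interval, and the effect of sample size on its width, can be sketched in R. The second half of the sketch assumes, purely for illustration, that sD would stay at 0.41 in a larger sample; the sample sizes 28 and 56 and the helper `margin_for_n` are hypothetical.

```r
# 95% CI for mu_D from the summary statistics of the differences.
d_bar <- -0.36; s_d <- 0.41; n_d <- 14

t_crit <- qt(0.975, df = n_d - 1)          # t_{0.025, 13}
margin <- t_crit * s_d / sqrt(n_d)
ci <- c(lower = d_bar - margin, upper = d_bar + margin)
print(round(ci, 2))

# Assumption for illustration only: s_D stays at 0.41 for larger samples.
margin_for_n <- function(n, s = s_d) qt(0.975, df = n - 1) * s / sqrt(n)
n_values <- c(14, 28, 56)
margins <- sapply(n_values, margin_for_n)
print(data.frame(n = n_values, margin = round(margins, 3)))
```

The margin shrinks roughly in proportion to 1/√n, so halving the width of the interval takes about four times as many subjects.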

If we want a smaller confidence interval we need a larger sample.


Summary: t-tests and confidence intervals for the mean or mean difference

Parameter of interest is the Population Mean

Population parameter: Population mean = µ

Sample estimate: Sample mean = x̄

SD of estimate: $\mathrm{sd}(\bar{x}) = \mathrm{sd}(x)/\sqrt{n} = s/\sqrt{n}$

Confidence interval: $\bar{x} \pm t_{\alpha/2,\,n-1}\,\mathrm{sd}(\bar{x})$

Hypothesis test: One sample t-test

Hypotheses: H0 : µ = µ0 vs. HA : µ ≠ µ0

Test Statistic: $t = (\bar{x} - \mu_0)/\mathrm{sd}(\bar{x})$

P-value (2-sided): $2 \times \Pr(T_{n-1} > |t|)$

Assumptions: SRS & X̄ is approximately normally distributed


Parameter of interest is the Difference between two population means (independent samples)

Population parameter: Difference in population means = µ1 − µ2

Sample estimate: Difference in sample means = x̄1 − x̄2

SD of estimate, equal variances: $\mathrm{sd}(\bar{x}_1 - \bar{x}_2) = s_p\sqrt{1/n_1 + 1/n_2}$, where $s_p = \sqrt{[(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2]/(n_1 + n_2 - 2)}$ and $df = n_1 + n_2 - 2$

SD of estimate, unequal variances: $\mathrm{sd}(\bar{x}_1 - \bar{x}_2) = \sqrt{s_1^2/n_1 + s_2^2/n_2}$, where $df = (r_1 + r_2)^2/[r_1^2/(n_1 - 1) + r_2^2/(n_2 - 1)]$, with $r_1 = s_1^2/n_1$ and $r_2 = s_2^2/n_2$

Confidence Interval: $\bar{x}_1 - \bar{x}_2 \pm t_{\alpha/2,\,df}\,\mathrm{sd}(\bar{x}_1 - \bar{x}_2)$

Hypothesis test: Two-sample t-test for independent samples

Hypotheses: H0 : µ1 = µ2 vs. HA : µ1 ≠ µ2

Test Statistic: $t = (\bar{x}_1 - \bar{x}_2)/\mathrm{sd}(\bar{x}_1 - \bar{x}_2)$

P-value (2-sided): $2 \times \Pr(T_{df} > |t|)$

Assumptions: independent SRSs from 2 populations, or one SRS randomized to two groups, & X̄1 and X̄2 are approximately normally distributed
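The unequal-variance formulas can be sketched as two small R functions and compared with `t.test()`. The function names `welch_se` and `welch_df` are illustrative, and the data are simulated for the comparison; they are not data from the lecture.

```r
# Sketch of the unequal-variance (Welch) formulas from the summary above.
welch_se <- function(s1, n1, s2, n2) {
  sqrt(s1^2 / n1 + s2^2 / n2)            # sd of (xbar1 - xbar2)
}

welch_df <- function(s1, n1, s2, n2) {
  r1 <- s1^2 / n1
  r2 <- s2^2 / n2
  (r1 + r2)^2 / (r1^2 / (n1 - 1) + r2^2 / (n2 - 1))
}

# Simulated data (NOT from the lecture) to compare against t.test()
set.seed(699)
x <- rnorm(20, mean = 10, sd = 1)
y <- rnorm(35, mean = 10, sd = 3)

df_manual <- welch_df(sd(x), length(x), sd(y), length(y))
t_manual  <- (mean(x) - mean(y)) / welch_se(sd(x), length(x), sd(y), length(y))
fit <- t.test(x, y, var.equal = FALSE)

cat("df: manual =", df_manual, "  t.test =", fit$parameter, "\n")
cat("t:  manual =", t_manual,  "  t.test =", fit$statistic, "\n")
```

The degrees of freedom are usually not a whole number, and they always fall between the smaller of n1 − 1 and n2 − 1 and the pooled value n1 + n2 − 2.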


Parameter of interest is the Mean of paired differences

Population parameter: Population mean difference = µD

Sample estimate: Sample mean difference = x̄D

SD of estimate: $\mathrm{sd}(\bar{x}_D) = \mathrm{sd}(x_D)/\sqrt{n_D} = s_D/\sqrt{n_D}$

Confidence interval: paired t-interval = $\bar{x}_D \pm t_{\alpha/2,\,n_D-1}\,\mathrm{sd}(\bar{x}_D)$

Hypothesis test: Paired t-test

Hypotheses: H0 : µD = µ0 vs. HA : µD ≠ µ0

Test Statistic: $t = (\bar{x}_D - \mu_0)/\mathrm{sd}(\bar{x}_D)$

P-value (2-sided): $2 \times \Pr(T_{n_D-1} > |t|)$

Assumptions: pairs are a SRS & D̄ is approximately normally distributed

Note: This summary is available on the home page under “Handouts” as “formula summary”.


R Commander: One sample t-test and confidence interval for the population mean

A new data set: fasting.glucose

Glucose blood level (mg/100ml) after a 12 hour fast for a simple random sample of 70 women.

We wish to test the null hypothesis that the mean fasting glucose for women is 75.

H0 : µ = 75 HA : µ ≠ 75


First plot the histogram of glucose levels:

Quite symmetric.


To do the 1-sample t-test:

In Rcmdr:

• Statistics → Means → Single-sample t-test

• Set the “Null hypothesis: mu =” to 75

• set the “Confidence Level:” to .95 (default)

• For a two sided test check the Alternative Hypothesis “Population mean != mu0” (default)


The output from R:

> with(fasting.glucose, (t.test(glucose, alternative='two.sided',

+ mu=75, conf.level=.95)))

One Sample t-test

data: glucose

t = 2.0335, df = 69, p-value = 0.04585

alternative hypothesis: true mean is not equal to 75

95 percent confidence interval:

75.05654 80.91489

sample estimates:

mean of x

77.98571

Moderate evidence against the null hypothesis.


R Commander: Two sample t-test and confidence interval for the difference between two population means

Another new data set: birth.rate

Birth rates (per 1000 residential population) for SRSs of counties in California and Maine.

23 counties in California and 19 in Maine.

Reference: County and City Data Book 12th edition, U.S. Dept. of Commerce

We wish to test whether the birth rate is the same in California and Maine.


First plot stacked histograms (use the grouped option)

Reasonably symmetric. The variances do not look equal.


The hypotheses are

H0 : µC = µM HA : µC ≠ µM

To do the 2-sample t-test:

In Rcmdr:

• Statistics → Means → Independent samples t-test

• Choose the “Groups variable” to be state

• Choose the “Response variable” to be births.per.1000

• Click on the Options tab and

- choose “Two-sided” (default)
- set the “Confidence Level:” to .95 (default)
- “Assume equal variances?” choose “No.” (default)


The output from R:

> t.test(births.per.1000~state, alternative='two.sided',

+ conf.level=.95, var.equal=FALSE, data=birth.rate)

Welch Two Sample t-test

data: births.per.1000 by state

t = 4.3467, df = 27.447, p-value = 0.0001708

alternative hypothesis: true difference in means is not equal to 0

95 percent confidence interval:

1.684221 4.691523

sample estimates:

mean in group California mean in group Maine

17.18261 13.99474

Confidence interval is for:

California mean birth rate - Maine mean birth rate

Very strong evidence against the null hypothesis.

R Commander: Paired t-test and confidence interval for the mean difference

Yet another new data set: platelet data set from Shahbaba.

We have measurements on platelet aggregation before and after smoking on 11 individuals.

H0 : µA − µB = 0 HA : µA − µB ≠ 0

A stands for “After”, B stands for “Before”

We need to plot the data but what should we plot?

What are we assuming?

What plot do we need?


To plot the differences we first have to create them

Data → Manage variables in active data set → Compute new variable...

New variable name: difference

Expression to compute: After - Before

Then create histogram of difference

Looks reasonably symmetric.

To do the paired t-test

In Rcmdr:

• Statistics → Means → Paired t-test

• Choose the “First variable” to be After

• Choose the “Second variable” to be Before

Note that the differences will be calculated as “First variable” − “Second variable”

• Click on the Options tab and

- choose “Two-sided” (default)
- set the “Confidence Level:” to .95 (default)


The output from R:

> t.test(platelet$After, platelet$Before, alternative='two.sided',

+ conf.level=.95, paired=TRUE)

Paired t-test

data: After and Before

t = 4.2716, df = 10, p-value = 0.001633

alternative hypothesis: true difference in means is not equal to 0

95 percent confidence interval:

4.91431 15.63114

sample estimates:

mean of the differences

10.27273

Mean and CI are for the differences.

Strong evidence against the null hypothesis.


Alternate calculation of paired t-test and confidence interval for the mean difference

We can do the paired t-test using the one-sample t-test on the differences

In Rcmdr:

• Statistics → Means → Single-sample t-test

• Choose the “Variable” to be difference

• Click on the Options tab and

- Set the “Null hypothesis: mu =” to 0 (default)

- set the “Confidence Level:” to .95 (default)

- For a two sided test check the Alternative Hypothesis “Population mean != mu0” (default)


The output from R:

> with(platelet, (t.test(difference, alternative='two.sided',

+ mu=0.0, conf.level=.95)))

One Sample t-test

data: difference

t = 4.2716, df = 10, p-value = 0.001633

alternative hypothesis: true mean is not equal to 0

95 percent confidence interval:

4.91431 15.63114

sample estimates:

mean of x

10.27273

Same mean, confidence interval and p-value as the previous paired t-test.
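The equivalence holds for any paired data, which can be checked with a short R sketch. The platelet data are not reproduced in these notes, so the values below are simulated stand-ins (11 pairs, chosen arbitrarily) and will not match the output above.

```r
# Paired t-test vs. one-sample t-test on the differences.
# Simulated stand-in data, NOT the platelet data set.
set.seed(541)
before <- rnorm(11, mean = 40, sd = 15)
after  <- before + rnorm(11, mean = 10, sd = 8)

paired_fit <- t.test(after, before, alternative = "two.sided",
                     conf.level = 0.95, paired = TRUE)
one_sample_fit <- t.test(after - before, alternative = "two.sided",
                         mu = 0, conf.level = 0.95)

same_t  <- isTRUE(all.equal(unname(paired_fit$statistic),
                            unname(one_sample_fit$statistic)))
same_p  <- isTRUE(all.equal(paired_fit$p.value, one_sample_fit$p.value))
same_ci <- isTRUE(all.equal(as.numeric(paired_fit$conf.int),
                            as.numeric(one_sample_fit$conf.int)))
cat("same t:", same_t, " same p-value:", same_p, " same CI:", same_ci, "\n")
```

Note that the order of the two arguments sets the sign: `t.test(after, before, paired = TRUE)` analyzes after − before, matching the “First variable” − “Second variable” convention in Rcmdr.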
