T- and Z-Tests for Hypotheses about the Difference between Two Subsamples

Random samples, partitioned into two independent subsamples (e.g., men and women).

Question: Are the means of some variable (such as salary) significantly different between the two subsamples?

Key: The sampling distribution of all theoretically possible differences between subsample means.

For large samples (i.e., when the Central Limit Theorem holds), this sampling distribution of mean differences is normally shaped; for smaller samples, the sampling distribution takes the shape of one of the Student’s t distributions, identified by degrees of freedom.
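The idea of a sampling distribution of mean differences can be illustrated with a small simulation. This is only a sketch under stated assumptions: the universe is a made-up skewed (exponential) population with mean 1.0, the subsample sizes 90 and 359 are borrowed from the example later in these notes, and the function name `simulate_mean_differences` is hypothetical.

```python
import random
import statistics

def simulate_mean_differences(n2, n1, trials=2000, seed=0):
    """Draw many pairs of subsamples from the SAME universe and
    record the difference between their means each time."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(trials):
        # Assumed universe: exponential with mean 1.0 (deliberately skewed).
        sub2 = [rng.expovariate(1.0) for _ in range(n2)]
        sub1 = [rng.expovariate(1.0) for _ in range(n1)]
        diffs.append(statistics.fmean(sub2) - statistics.fmean(sub1))
    return diffs

diffs = simulate_mean_differences(n2=90, n1=359)
center = statistics.fmean(diffs)   # should sit near 0.0 (the null hypothesis is true here)
spread = statistics.stdev(diffs)   # empirical standard error of the difference
# Share of simulated differences within 1.96 standard errors of the center;
# a normally shaped distribution puts about 95 percent there.
within_196 = sum(abs(d - center) <= 1.96 * spread for d in diffs) / len(diffs)
print(round(center, 3), round(spread, 3), round(within_196, 3))
```

Even though the assumed universe is far from normal, the simulated differences should center on zero and behave much like a normal curve, which is the point of the Central Limit Theorem for large samples.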

The key is: The difference between two means is a single value. In the case of these so-called “means difference tests,” the null hypothesis is that the means in general (i.e., in the universe) do NOT differ. Symbolically,

H0: μ2 - μ1 = 0.0

There are two possible alternate hypotheses:

nondirectional, H1: μ2 - μ1 ≠ 0.0

and directional, either

H1: μ2 - μ1 > 0.0

or H1: μ2 - μ1 < 0.0

In the 1984 General Social Survey, female respondents were asked whether or not their mothers had attended college. Then these female respondents were asked about their own education levels. A reasonable (alternate) hypothesis (H1) would be: Women whose mothers attended college will themselves have more formal education than women whose mothers did not attend college. Thus, our two hypotheses are:

H1: μ2 - μ1 > 0.0

H0: μ2 - μ1 = 0.0

Dividing respondents into women whose mothers attended college and women whose mothers did not led to the calculation of the following statistics for the two subsamples:

            Mothers ATTENDED College    Mothers DID NOT Attend
Mean        Ȳ2 = 15.24                  Ȳ1 = 12.57
Variance    s2² = 6.57                  s1² = 9.82
N           N2 = 90                     N1 = 359

Notice that the sample means DO in fact differ. This is NOT the question.

Because these two subsamples combined exceed 100, we know that the Central Limit Theorem applies. We can convert the difference between the value of the sample mean differences and the presumed value of the mean difference in the universe under the null hypothesis (i.e., 0.0) to z-values by using the estimated standard error of the difference (the standard deviation of the sampling distribution of sample mean differences). Recall that in general standard errors are estimated by dividing the standard deviation of the sample by the square root of the sample size,

$$\hat{\sigma}_{\bar{Y}} = \frac{s_Y}{\sqrt{N}}$$

However, in the case of this means difference test, we have TWO subsamples and thus TWO standard deviations (actually variances in this example), one for each subsample. What do we do? Simply combine the two subsample variances, as follows:

$$\hat{\sigma}_{\bar{Y}_2 - \bar{Y}_1} = \sqrt{\frac{s_2^2}{N_2} + \frac{s_1^2}{N_1}}$$

With the information from above, this means

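$$
\begin{aligned}
\hat{\sigma}_{\bar{Y}_2 - \bar{Y}_1} &= \sqrt{\frac{6.57}{90} + \frac{9.82}{359}} \\
&= \sqrt{0.073 + 0.0274} \\
&= \sqrt{0.1004} \\
&= 0.317
\end{aligned}
$$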

The algorithm for converting mean differences into z-units should look familiar:

$$z = \frac{(\bar{Y}_2 - \bar{Y}_1) - (\mu_2 - \mu_1)}{\hat{\sigma}_{\bar{Y}_2 - \bar{Y}_1}}$$

In this example,

$$
\begin{aligned}
z &= \frac{(15.24 - 12.57) - 0.0}{0.317} \\
&= \frac{2.67}{0.317} \\
&= 8.43
\end{aligned}
$$
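The large-sample calculation can be checked with a few lines of Python. This is a minimal sketch using only the summary statistics reported for the two subsamples; the function name `z_for_mean_difference` is an illustrative choice, not part of the course materials.

```python
import math

def z_for_mean_difference(mean2, var2, n2, mean1, var1, n1, null_diff=0.0):
    """Return the estimated standard error of the difference and the z-value."""
    # Combine the two subsample variances, each divided by its own N.
    se = math.sqrt(var2 / n2 + var1 / n1)
    # Distance of the observed difference from the null value, in standard errors.
    z = ((mean2 - mean1) - null_diff) / se
    return se, z

# Subsample 2: mothers attended college; subsample 1: mothers did not.
se, z = z_for_mean_difference(15.24, 6.57, 90, 12.57, 9.82, 359)
print(f"standard error = {se:.3f}, z = {z:.2f}")
```

Rounded to the same number of places, the output should agree with the hand calculation (0.317 and 8.43).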

Selecting alpha = 0.05 for a one-tailed test and looking for a critical value in Appendix 1 (pp. 540-542), we again interpolate between 0.4495 and 0.4505, making

z = 1.645.

Since 8.43 is GREATER THAN the critical value of 1.645, we REJECT the null hypothesis at the 0.05 level and conclude that IN GENERAL women whose mothers attended college have higher education levels than women whose mothers did not attend college.
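Instead of interpolating in a printed table, the critical value can be obtained from Python's standard library. The sketch below assumes the one-tailed alpha = 0.05 test described above and takes z = 8.43 from the hand calculation; the upper-tail p-value is computed with `math.erfc` so that a very small probability is not rounded to zero.

```python
import math
from statistics import NormalDist

alpha = 0.05
z_observed = 8.43  # value from the worked example

# Critical value that cuts off the upper 5 percent of the standard normal curve.
z_critical = NormalDist().inv_cdf(1 - alpha)

# Upper-tail area beyond z_observed (the one-tailed p-value).
p_value = 0.5 * math.erfc(z_observed / math.sqrt(2))

reject_null = z_observed > z_critical
print(f"critical z = {z_critical:.3f}, reject H0: {reject_null}")
```

Both decision routes agree here: the test statistic exceeds the critical value, and the p-value is far below alpha.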

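[Figure: standard normal sampling distribution marking Z = 0.0, the critical value Z = 1.645, and the test statistic Z = 8.43.]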

For random samples whose combined size is less than 120, we cannot assume that the sampling distribution of mean differences will be normally shaped. This is because the Central Limit Theorem doesn't hold with samples this small. Student's t distributions must be used instead.

The only new wrinkle here is that the standard error of the difference cannot be estimated as above. The sample standard deviations must be pooled in a way that is sensitive to the impact of even slight differences in small numbers.

Consider the following example:

The 63-city data set that we are using this semester has been divided into two subsamples, one consisting of SUNBELT cities and the other of FROSTBELT cities. The question is: Did frostbelt cities lose population at a higher rate than sunbelt cities in the fifteen years between 1960 and 1974? Sample statistics are these:

            Frostbelt Cities    Sunbelt Cities
Mean        Ȳ2 = -4.14          Ȳ1 = 2.84
Variance    s2² = 9.98          s1² = 57.61
N           N2 = 37             N1 = 26

In the sample, frostbelt cities clearly lost population (population change -4.14 percent) at a greater rate than sunbelt cities (which GAINED population, +2.84 percent). The question is, is this sample difference sufficient to infer a similar trend in the universe of American cities? Our alternate hypothesis is that in general frostbelt cities lost population at a higher rate than sunbelt cities, hence

H1: μ2 - μ1 < 0.0

In other words, we expect μ2 to be smaller (more negative) than μ1. Notice again that the presence of the “less than” sign (“<”) dictates a one-tailed test, this time in the left-hand tail where negative mean differences are located.

Our null hypothesis is that there is no difference in the rates of population change, or

H0: μ2 - μ1 = 0.0

Since the subsample sizes are relatively small, we can't simply slam the standard deviations together to estimate the standard error of the difference. We must weight the subsample standard deviations (actually the subsample variances) by the size of the subsamples (actually by the number of degrees of freedom in the subsamples) before estimating the standard error. This is called pooling.

$$s_{pooled} = \sqrt{\frac{(N_2 - 1)s_2^2 + (N_1 - 1)s_1^2}{N_2 + N_1 - 2}}$$

In this example,

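$$
\begin{aligned}
s_{pooled} &= \sqrt{\frac{(37 - 1)(9.98) + (26 - 1)(57.61)}{37 + 26 - 2}} \\
&= \sqrt{\frac{(36)(9.98) + (25)(57.61)}{61}} \\
&= \sqrt{\frac{359.280 + 1440.250}{61}} \\
&= \sqrt{\frac{1799.53}{61}} \\
&= \sqrt{29.500} \\
&= 5.431
\end{aligned}
$$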
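The pooling step can be sketched in Python as a check on the arithmetic. The function name `pooled_sd` is an illustrative assumption; the inputs are the subsample variances and sizes from the frostbelt and sunbelt table.

```python
import math

def pooled_sd(var2, n2, var1, n1):
    """Weight each subsample variance by its degrees of freedom, then pool."""
    df = n2 + n1 - 2
    pooled_var = ((n2 - 1) * var2 + (n1 - 1) * var1) / df
    return math.sqrt(pooled_var), df

# Subsample 2: frostbelt cities; subsample 1: sunbelt cities.
s_pooled, df = pooled_sd(9.98, 37, 57.61, 26)
print(f"pooled standard deviation = {s_pooled:.3f} with {df} degrees of freedom")
```

Because the weights are degrees of freedom rather than raw sizes, the larger sunbelt variance pulls the pooled value well above the frostbelt variance even though the sunbelt subsample is smaller.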

Now we can use this “pooled” constant to estimate the standard error of the difference. The estimation is

$$\hat{\sigma}_{\bar{Y}_2 - \bar{Y}_1} = \sqrt{\frac{s^2}{N_2} + \frac{s^2}{N_1}} = s\sqrt{\frac{1}{N_2} + \frac{1}{N_1}}$$

where s = s_pooled

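$$
\begin{aligned}
\hat{\sigma}_{\bar{Y}_2 - \bar{Y}_1} &= (5.431)\sqrt{\frac{1}{37} + \frac{1}{26}} \\
&= (5.431)\sqrt{0.027 + 0.039} \\
&= (5.431)\sqrt{0.066} \\
&= (5.431)(0.257) \\
&= 1.396
\end{aligned}
$$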

Now we have the value of the standard error of the difference, 1.396; this is our “currency exchange rate” that will allow us to determine the value of the test statistic.

The algorithm should look familiar:

$$t = \frac{(\bar{Y}_2 - \bar{Y}_1) - (\mu_2 - \mu_1)}{\hat{\sigma}_{\bar{Y}_2 - \bar{Y}_1}}$$

In this example,

$$
\begin{aligned}
t &= \frac{[(-4.14) - (2.84)] - 0.0}{1.396} \\
&= \frac{-6.98}{1.396} \\
&= -5.00
\end{aligned}
$$
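The whole small-sample calculation can be sketched in Python from the summary statistics alone. The function name `pooled_t_test` is an illustrative assumption. Note that a computer carries every intermediate value at full precision, so the results come out near 1.390 for the standard error and near -5.02 for t, slightly different from the hand-rounded 1.396 and -5.00; the decision is not affected.

```python
import math

def pooled_t_test(mean2, var2, n2, mean1, var1, n1, null_diff=0.0):
    """Pooled-variance t-test computed from subsample summary statistics."""
    df = n2 + n1 - 2
    # Pool the variances, weighting each by its degrees of freedom.
    s_pooled = math.sqrt(((n2 - 1) * var2 + (n1 - 1) * var1) / df)
    # Standard error of the difference based on the pooled value.
    se = s_pooled * math.sqrt(1 / n2 + 1 / n1)
    t = ((mean2 - mean1) - null_diff) / se
    return se, t, df

# Subsample 2: frostbelt cities; subsample 1: sunbelt cities.
se, t, df = pooled_t_test(-4.14, 9.98, 37, 2.84, 57.61, 26)
print(f"standard error = {se:.3f}, t = {t:.2f}, df = {df}")
```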

We now know that the sample difference of -6.98 on the sampling distribution converts to a t-value of -5.00 on the underlying X-axis. But what is the exact SHAPE of this sampling distribution?

It is the Student’s t distribution with 61 degrees of freedom. We have 63 cities selected randomly, but the cities have been subdivided into two subsamples (frostbelt and sunbelt cities). The 37 frostbelt-city subsample has 36 degrees of freedom (37 - 1 = 36), and the 26 sunbelt-city subsample has 25 degrees of freedom (26 - 1 = 25). More generally, the number of degrees of freedom in the t-test is

df = N2 + N1 - 2

Consulting Appendix 2 (p. 543) in search of the critical value for df = 61, we once again find only critical values for df = 60 and for df = 120.

Let's assume that we want to make our test with alpha = 0.05. We could interpolate by finding the value that is 1/60th of the way from 1.671 (df = 60) to 1.658 (df = 120). The adjustment is extremely small, and the result rounds back to 1.671.

But we need the critical value for the left (negative) tail, and Appendix 2 has only values for the right (positive) tail. Because Student's t distributions are all SYMMETRICAL, we simply add the negative sign to 1.671. Thus, our critical value is t0.05 = –1.671.

Since t = -5.00 is LESS THAN the critical value t = -1.671 (that is, farther out in the left-hand tail), our sample difference lies INSIDE the region of rejection. Thus, we REJECT the null hypothesis at the 0.05 level and conclude that in general frostbelt cities in the U.S. probably did lose population at a greater rate than sunbelt cities between 1960 and 1974.
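The table lookup and the left-tail decision can be sketched as follows. The two table entries are assumptions taken from a standard one-tailed alpha = 0.05 Student's t table (1.671 at df = 60 and 1.658 at df = 120), and the helper name `interpolate_critical_t` is hypothetical.

```python
def interpolate_critical_t(df, df_low, t_low, df_high, t_high):
    """Linear interpolation between two neighboring table entries."""
    fraction = (df - df_low) / (df_high - df_low)
    return t_low + fraction * (t_high - t_low)

# Assumed table entries for a one-tailed test at alpha = 0.05.
t_crit_right = interpolate_critical_t(61, 60, 1.671, 120, 1.658)

# Student's t distributions are symmetrical, so the left-tail
# critical value is the right-tail value with a negative sign.
t_crit_left = -t_crit_right

t_observed = -5.00  # from the worked example
# In a left-tail test, H0 is rejected when t falls BELOW the negative critical value.
reject_null = t_observed < t_crit_left
print(f"critical t = {t_crit_left:.3f}, reject H0: {reject_null}")
```

For df = 61 the interpolated value moves only 1/60th of the way toward the df = 120 entry, which is why it rounds back to the df = 60 value.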

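[Figure: Student's t sampling distribution (df = 61) marking t = 0.0, the critical value t = -1.671, and the test statistic t = -5.00.]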

Using SAS to Produce T-Tests

libname old 'a:\';
libname library 'a:\';
options nonumber nodate ps=66;
proc ttest data=old.cities;
   class agecity2;
   var manufpct;
   title1 'An Example of a T-Test';
   title2 'SAS Version 8.1';
   title3 'PPD 404';
run;

An Example of a T-Test
SAS Version 8.1
PPD 404

The TTEST Procedure

Statistics

                              Lower CL            Upper CL   Lower CL             Upper CL
Variable   Class         N    Mean       Mean     Mean       Std Dev    Std Dev   Std Dev    Std Err
MANUFPCT   Newer        38    22.928     26.526   30.125     8.9262     10.949    14.165     1.7761
MANUFPCT   Older        25    20.1       23.92    27.74      7.2268     9.2553    12.875     1.8511
MANUFPCT   Diff (1-2)         -2.706     2.6063   7.9183     8.7659     10.316    12.536     2.6565

T-Tests

Variable   Method          Variances   DF     t Value   Pr > |t|
MANUFPCT   Pooled          Equal       61     0.98      0.3304
MANUFPCT   Satterthwaite   Unequal     57.1   1.02      0.3139

Equality of Variances

Variable   Method     Num DF   Den DF   F Value   Pr > F
MANUFPCT   Folded F   37       24       1.40      0.3897
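The two t values in the SAS listing can be approximately reproduced from the printed means, standard deviations, and sample sizes. The sketch below is not what SAS runs internally; it simply applies the pooled formula from these notes and, for the "Unequal" line, the usual Satterthwaite approximation for degrees of freedom. The function name `t_tests_from_summary` is hypothetical, and small discrepancies arise because the listing prints rounded inputs.

```python
import math

def t_tests_from_summary(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Pooled and unequal-variance t statistics from summary statistics."""
    diff = mean_a - mean_b
    var_a, var_b = sd_a ** 2, sd_b ** 2

    # Pooled (equal variances assumed).
    df_pooled = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df_pooled
    t_pooled = diff / math.sqrt(pooled_var * (1 / n_a + 1 / n_b))

    # Unequal variances with Satterthwaite degrees of freedom.
    term_a, term_b = var_a / n_a, var_b / n_b
    t_unequal = diff / math.sqrt(term_a + term_b)
    df_unequal = (term_a + term_b) ** 2 / (
        term_a ** 2 / (n_a - 1) + term_b ** 2 / (n_b - 1)
    )
    return t_pooled, df_pooled, t_unequal, df_unequal

# Class "Newer" (N = 38) versus class "Older" (N = 25), values from the listing.
t_pooled, df_pooled, t_unequal, df_unequal = t_tests_from_summary(
    26.526, 10.949, 38, 23.92, 9.2553, 25
)
print(f"pooled: t = {t_pooled:.2f}, df = {df_pooled}")
print(f"unequal: t = {t_unequal:.2f}, df = {df_unequal:.1f}")
```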

Reject the null hypothesis (H0) when either:
1. the value of the statistical test (χ², z, t, F', or F) exceeds the critical value at the chosen α-level; or,
2. the p-value for the statistical test is smaller than the chosen value of α.

Do NOT reject the null hypothesis (H0) when either:
1. the value of the statistical test (χ², z, t, F', or F) is less than the critical value at the chosen α-level; or,
2. the p-value for the statistical test is greater than the chosen value of α.
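The two decision rules can be written as small Python helpers. This is a sketch with hypothetical function names; it assumes the critical value is supplied as a positive number and that "exceeds" means "lies farther out in the tail being tested," which matters for left-tail and two-tailed tests.

```python
def reject_by_critical_value(statistic, critical, tail):
    """Rule 1: compare the test statistic with the critical value.

    `critical` is given as a positive number; `tail` is "right", "left", or "two".
    """
    if tail == "right":
        return statistic > critical
    if tail == "left":
        return statistic < -critical
    if tail == "two":
        return abs(statistic) > critical
    raise ValueError("tail must be 'right', 'left', or 'two'")

def reject_by_p_value(p_value, alpha=0.05):
    """Rule 2: reject when the p-value is smaller than alpha."""
    return p_value < alpha

decisions = {
    "z example (right tail)": reject_by_critical_value(8.43, 1.645, "right"),
    "t example (left tail)": reject_by_critical_value(-5.00, 1.671, "left"),
    "SAS pooled t (p = 0.3304)": reject_by_p_value(0.3304),
}
for label, reject in decisions.items():
    print(label, "->", "reject H0" if reject else "do not reject H0")
```

The first two entries use the worked examples from these notes; the third uses the p-value printed on the "Pooled" line of the SAS listing.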

[Figure: Student's t sampling distribution marking t = 0.0, t = -0.25, the critical value t = -1.671, and t = -5.00.]

Exercise 1

Means Difference Test

In the 1984 General Social Survey (GSS), 234 male respondents with at least some college had a mean occupational prestige score of 47.49 (with a variance of 213.28). In contrast, 351 male respondents with only a high school education or less had an average occupational prestige score of 34.04 (with a variance of 132.30). Test the null hypothesis (H0) that there is no statistically significant difference in occupational prestige between these two groups. Assume that α = 0.05. Perform a two-tailed test. Make your decision regarding the null hypothesis using z-values in Appendix 1 (“Proportions of Area under Standard Normal Curve”), pp. 540-542.

1. What is the value of the standard error?
2. What is the value of Z?
3. What are the values of Z at the 2.5 percent and 97.5 percent areas under the normal curve?
4. Do you reject or accept this null hypothesis?

Exercise 1 Answers

Means Difference Test

1. What is the value of the standard error? 1.135
2. What is the value of Z? 11.850
3. What are the values of Z at the 2.5 percent and 97.5 percent areas under the normal curve? -1.96 and +1.96
4. Do you reject or accept this null hypothesis? Reject
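The Exercise 1 answers can be verified with a short Python sketch that uses only the figures given in the exercise. The variable names are illustrative, and the two-tailed critical value comes from the standard library rather than from Appendix 1.

```python
import math
from statistics import NormalDist

# Some college: N = 234, mean 47.49, variance 213.28.
# High school or less: N = 351, mean 34.04, variance 132.30.
se = math.sqrt(213.28 / 234 + 132.30 / 351)
z = ((47.49 - 34.04) - 0.0) / se

alpha = 0.05
# Two-tailed test: split alpha evenly between the two tails.
z_critical = NormalDist().inv_cdf(1 - alpha / 2)
reject_null = abs(z) > z_critical

print(f"standard error = {se:.3f}, z = {z:.2f}, critical z = +/-{z_critical:.2f}")
print("reject H0" if reject_null else "do not reject H0")
```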

Exercise 2

Two Independent Samples t-test

In an experiment to determine the effects of hunger on hand-eye coordination, the following results, representing the number of tasks completed successfully, were obtained:

        Experimental Group (#1)    Control Group (#2)
        (Hungry)                   (Normal)
Mean    14.0                       19.0
S.D.    2.449                      3.873
N       10                         12

Calculate the estimated standard error of the difference, obtain the value of t, and test the hypothesis that the normally-fed (control) group performed better than did the hungry (experimental) group. Use Student's t distribution (Appendix 2, p. 543), and assume that α = 0.05. Perform a one-tailed test.

Exercise 2 (continued)

Two Independent Samples t-test

1. Expressed symbolically, what is the alternate hypothesis?
2. Expressed symbolically, what is the null hypothesis?
3. What is the value of the standard error?
4. What is the value of t?
5. How many degrees of freedom in this problem?
6. What is the critical value of t_df?
7. Do you reject or accept the null hypothesis?

Exercise 2 Answers

Two Independent Samples t-test

1. Expressed symbolically, what is the alternate hypothesis? μ2 - μ1 > 0.0
2. Expressed symbolically, what is the null hypothesis? μ2 - μ1 = 0.0
3. What is the value of the standard error? 1.416
4. What is the value of t? 3.530
5. How many degrees of freedom in this problem? 20
6. What is the critical value of t_df? +1.725
7. Do you reject or accept the null hypothesis? Reject
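The Exercise 2 answers can be checked the same way. This sketch squares the reported standard deviations to obtain variances and takes the critical value 1.725 from the answer key rather than computing it. Carried at full precision, the standard error and t differ from the key in the third decimal place (about 1.417 and 3.529), which is only a rounding effect.

```python
import math

# Group 1 (hungry): mean 14.0, S.D. 2.449, N = 10.
# Group 2 (normal): mean 19.0, S.D. 3.873, N = 12.
mean1, sd1, n1 = 14.0, 2.449, 10
mean2, sd2, n2 = 19.0, 3.873, 12

df = n2 + n1 - 2
# Pool the variances, weighting each by its degrees of freedom.
s_pooled = math.sqrt(((n2 - 1) * sd2 ** 2 + (n1 - 1) * sd1 ** 2) / df)
se = s_pooled * math.sqrt(1 / n2 + 1 / n1)
t = ((mean2 - mean1) - 0.0) / se

t_critical = 1.725  # one-tailed, alpha = 0.05, df = 20 (from the answer key)
reject_null = t > t_critical

print(f"standard error = {se:.3f}, t = {t:.3f}, df = {df}")
print("reject H0" if reject_null else "do not reject H0")
```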
