
    A Review of the t-test

    The t-test is used for testing differences between two means. In order to use a t-test,

    the same variable must be measured in different groups, at different times, or in

    comparison to a known population mean. Comparing a sample mean to a known

population is an unusual test that appears in statistics books as a transitional step in learning about the t-test. The more common applications of the t-test are testing the

    difference between independent groups or testing the difference between dependent

    groups.

    A t-test for independent groups is useful when the same variable has been measured in

    two independent groups and the researcher wants to know whether the difference

    between group means is statistically significant. "Independent groups" means that the

    groups have different people in them and that the people in the different groups have

    not been matched or paired in any way. A t-test for related samples or a t-test for

dependent means is the appropriate test when the same people have been measured or tested under two different conditions, or when people are put into pairs by matching

    them on some other variable and then placing each member of the pair into one of two

    groups.

    The t-test For Independent Groups on SPSS

    A t-test for independent groups is useful when the researcher's goal is to compare the

difference between means of two groups on the same variable. Groups may be formed in two different ways. First, a preexisting characteristic of the participants may be

    used to divide them into groups. For example, the researcher may wish to compare

    college GPAs of men and women. In this case, the grouping variable is biological

    sex and the two groups would consist of men versus women. Other preexisting

    characteristics that could be used as grouping variables include age (under 21 years

    vs. 21 years and older or some other reasonable division into two groups), athlete

    (plays collegiate varsity sport vs. does not play), type of student (undergraduate vs.

    graduate student), type of faculty member (tenured vs. nontenured), or any other

    variable for which it makes sense to have two categories. Another way to form groups

is to randomly assign participants to one of two experimental conditions, such as a group that listens to music versus a group that experiences a control condition.

    Regardless of how the groups are determined, one of the variables in the SPSS data

    file must contain the information needed to divide participants into the appropriate

    groups. SPSS has very flexible features for accomplishing this task.


    Like all other statistical tests using SPSS, the process begins with data. Consider the

    fictional data on college GPA and weekly hours of studying used in the correlation

    example. First, let's add information about the biological sex of each participant to the

database. This requires a numerical code. For this example, let a "1" designate a

    female and a "2" designate a male. With the new variable added, the data would look

    like this:

Participant       Current GPA   Weekly Study Time   Sex
Participant #01       1.8           15 hrs            2
Participant #02       3.9           38 hrs            1
Participant #03       2.1           10 hrs            2
Participant #04       2.8           24 hrs            1
Participant #05       3.3           36 hrs            .
Participant #06       3.1           15 hrs            2
Participant #07       4.0           45 hrs            1
Participant #08       3.4           28 hrs            1
Participant #09       3.3           35 hrs            1
Participant #10       2.2           10 hrs            2
Participant #11       2.5            6 hrs            2

With this information added to the file, two methods of dividing participants into groups can be illustrated. Note that Participant #05 has just a single dot in the column for sex. This is the standard way that SPSS indicates missing data. Missing data is a common occurrence, especially in survey data, and SPSS has flexible options for handling it. Begin the analysis by entering the new data for sex. Use the arrow keys or mouse to move to the empty third column on the spreadsheet and use the same technique as previously to enter the new data. When data is missing (such as for Participant #05 in this example), press the Enter key while the top line is empty (you may need to delete the previous entry) and a single dot will appear in the variable column. Once the data is entered, click Data > Define Variable and type in the name of the variable, "Sex." Then go to "Value" and type a "1" in the box. For "Value Label," type "Female." Then click on ADD. Repeat the sequence, typing "2" and "Male" in the appropriate boxes, and click ADD again. Finally, click CONTINUE > OK and you will be back to the main SPSS menu.


To request the t-test, click Statistics > Compare Means > Independent-Samples T Test. Use the right-pointing arrow to transfer COLGPA to the "Test Variable(s)" box. Then highlight Sex in the left box and click the bottom arrow (pointing right) to transfer it to the "Grouping Variable" box. Then click Define Groups. Type "1" in the Group 1 box and type "2" in the Group 2 box. Then click Continue. Click Options and you will see that the confidence interval and the method of handling missing data can be changed. Since the default options are fine, click Continue > OK and the results will quickly appear in the output window.
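Clicking Paste instead of OK writes the equivalent command syntax to a Syntax window; a minimal sketch (using the COLGPA and SEX names above, with the confidence-interval and missing-data defaults spelled out):

    * Hypothetical pasted syntax for the independent-samples t-test.
    T-TEST GROUPS=SEX(1 2)
      /VARIABLES=COLGPA
      /CRITERIA=CI(.95)
      /MISSING=ANALYSIS.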

    Results for the example are shown below:

    T-Test

Group Statistics

    SEX            N   Mean     Std. Deviation   Std. Error Mean
    1.00 Female    5   3.4800   .487             .218
    2.00 Male      5   2.3400   .493             .220

Independent Samples Test

Levene's Test for Equality of Variances

    SEX                              F      Sig.
    Equal variances assumed          .002   .962
    Equal variances not assumed


t-test for Equality of Means

    SEX                              t      df     Sig. (2-tailed)   Mean Difference
    Equal variances assumed          3.68   8      .021              1.1400
    Equal variances not assumed      3.68   8.00   .025              1.1400

The output begins with the means and standard deviations for the two groups, which is key information that will need to be included in any related research report. The "Mean Difference" statistic indicates the magnitude of the difference between means. When combined with the confidence interval for the difference, this information can make a valuable contribution to explaining the importance of the results. "Levene's Test for Equality of Variances" is a test of the homogeneity of variance assumption. When the value for F is large and the p-value is less than .05, the variances are heterogeneous, which violates a key assumption of the t-test. The next

    section of the output provides the actual t-test results in two formats. The first format

    for "Equal" variances is the standard t-test taught in introductory statistics. This is the

    test result that should be reported in a research report under most circumstances. The

    second format reports a t-test for "Unequal" variances. This is an alternative way of

    computing the t-test that accounts for heterogeneous variances and provides an

accurate result even when the homogeneity assumption has been violated (as indicated by the Levene test). It is rare that one needs to consider using the "Unequal" variances

    format because, under most circumstances, even when the homogeneity assumption is

    violated, the results are practically indistinguishable. When the "Equal" variances and

    "Unequal" variances formats lead to different conclusions, seek consultation. The

    output for both formats shows the degrees of freedom (df) and probability (2-tailed

    significance). As in all statistical tests, the basic criterion for statistical significance is

    a "2-tailed significance" less than .05. The .021 probability in this example is clearly

    less than .05 so the difference is statistically significant.


When two samples are involved, the samples can come from different individuals who are not matched (the samples are independent of each other), or the samples can come from the same individuals (the samples are paired with each other), in which case the samples are not independent of each other. A third alternative is that the samples can come from different individuals who have been matched on a variable of interest; this type of sample will not be independent. The form of the t-test is slightly different for the independent-samples and dependent-samples types of two-sample tests, and SPSS has separate procedures for performing the two types of tests.
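The dependent-samples case is not walked through in this tutorial, but its command syntax is brief. A minimal sketch, assuming two hypothetical variables GPA1 and GPA2 measured on the same people:

    * Hypothetical paired-samples t-test comparing two measurements of the same people.
    T-TEST PAIRS=GPA1 WITH GPA2 (PAIRED).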

    The Independent Samples t-test can be used to see if two means are different from

    each other when the two samples that the means are based on were taken from

    different individuals who have not been matched. In this example, we will determine

    if the students in sections one and two of PSY 216 have a different number of older

    siblings.

    We will follow our customary steps:

1. Write the null and alternative hypotheses first:

   H0: μ(Section 1) = μ(Section 2)
   H1: μ(Section 1) ≠ μ(Section 2)

   where μ is the mean number of older siblings that the PSY 216 students have.

2. Determine if this is a one-tailed or a two-tailed test. Because the hypothesis involves the phrase "different" and no ordering of the means is specified, this must be a two-tailed test.

3. Specify the α level: α = .05

4. Determine the appropriate statistical test. The variable of interest, older, is on a

    ratio scale, so a z-score test or a t-test might be appropriate. Because the

    population standard deviation is not known, the z-test would be inappropriate.

    Furthermore, there are different students in sections 1 and 2 of PSY 216, and

    they have not been matched. Because of these factors, we will use the

    independent samples t-test.

    5. Calculate the t value, or let SPSS do it for you!

The command for the independent samples t-test is found at Analyze | Compare Means | Independent-Samples T Test (this is shorthand for clicking on the Analyze menu item at the top of the window, then clicking on Compare Means from the drop-down menu, and then Independent-Samples T Test from the pop-up menu). A syntax equivalent of these clicks is sketched after step 6.

    The Independent-Samples t Test dialog box will appear:

    Select the dependent variable(s) that you want to test by clicking on it in the

    left hand pane of the Independent-Samples t Test dialog box. Then click on the

    upper arrow button to move the variable into the Test Variable(s) pane. In this

example, move the Older variable (number of older siblings) into the Test Variable(s) box:

Click on the independent variable (the variable that defines the two groups) in the left hand pane of the Independent-Samples t Test dialog box. Then click on the lower arrow button to move the variable into the Grouping Variable box. In this example, move the Section variable into the Grouping Variable box:

    You need to tell SPSS how to define the two groups. Click on the Define

    Groups button. The Define Groups dialog box appears:


    In the Group 1 text box, type in the value that determines the first group. In this

    example, the value of the 10 AM section is 10. So you would type 10 in the

    Group 1 text box. In the Group 2 text box, type the value that determines the

    second group. In this example, the value of the 11 AM section is 11. So you

    would type 11 in the Group 2 text box:

    Click on the Continue button to close the Define Groups dialog box. Click on

    the OK button in the Independent-Samples t Test dialog box to perform the t-

test. The output viewer will appear with the results of the t-test. The results have two main parts: descriptive statistics and inferential statistics. First, the descriptive statistics:

This gives the descriptive statistics for each of the two groups (as defined by the grouping variable). In this example, there are 14 people in the 10 AM section (N), and they have, on average, 0.86 older siblings, with a standard deviation of 1.027 older siblings. There are 32 people in the 11 AM section (N), and they have, on average, 1.44 older siblings, with a standard deviation of 1.318 older siblings. The last column gives the standard error of the mean for each of the two groups.


    The second part of the output gives the inferential statistics:

The columns labeled "Levene's Test for Equality of Variances" tell us whether an assumption of the t-test has been met. The t-test assumes that the variability of each group is approximately equal. If that assumption isn't met, then a special form of the t-test should be used. Look at the column labeled "Sig." under the heading "Levene's Test for Equality of Variances". In this example, the significance (p value) of Levene's test is .203. If this value is less than or equal to your α level for the test (usually .05), then you can reject the null hypothesis that the variability of the two groups is equal, implying that the variances are unequal. If the p value is less than or equal to the α level, then you should use the bottom row of the output (the row labeled "Equal variances not assumed"). If the p value is greater than your α level, then you should use the middle row of the output (the row labeled "Equal variances assumed"). In this example, .203 is larger than α, so we will assume that the variances are equal and we will use the middle row of the output.

The column labeled "t" gives the observed or calculated t value. In this example, assuming equal variances, the t value is 1.461. (We can ignore the sign of t for a two-tailed t-test.) The column labeled "df" gives the degrees of freedom associated with the t-test. In this example, there are 44 degrees of freedom. The column labeled "Sig. (2-tailed)" gives the two-tailed p value associated with the test. In this example, the p value is .151. If this had been a one-tailed test, we would need to look up the critical t in a table.

6. Decide if we can reject H0: As before, the decision rule is given by: if p ≤ α, then reject H0. In this example, .151 is not less than or equal to .05, so we fail to reject H0. That implies that we failed to observe a difference in the number of older siblings between the two sections of this class.
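As promised in step 5, the same analysis can be run from a Syntax window. A minimal sketch, assuming the variables are named Older and Section as in the dialogs above:

    * Hypothetical syntax for the older-siblings example; sections coded 10 and 11.
    T-TEST GROUPS=Section(10 11)
      /VARIABLES=Older
      /CRITERIA=CI(.95).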


    If we were writing this for publication in an APA journal, we would write it as:

A t-test failed to reveal a statistically reliable difference between the mean number of older siblings that the 10 AM section has (M = 0.86, s = 1.027) and that the 11 AM section has (M = 1.44, s = 1.318), t(44) = 1.461, p = .151, α = .05.

The following working examples refer to the dataset (http://www.soc.qc.edu/QC_Software/downloads/gss/gss93.zip) from the US General Social Survey 1993 (http://www.icpsr.umich.edu/GSS/rnd1998/merged/indx-mod/1993.htm).

1. Analyze -> Compare Means -> Independent-Samples T Test

The "independent samples t-test" is usually adopted to compare means on one variable (e.g., age or GPA score) between two groups defined by a categorical variable in a survey.

If each respondent (i.e., each case) has 2 different scores (i.e., 2 variables) to compare, e.g., GPA of term 1 and GPA of term 2, the "Paired Samples T-Test" should be used.

o It is used when the 2 measures relate to one another.

2. Select and put the interval (or ratio) scale variable in the "Test Variable(s)" box.

3. Select and put the categorical variable in the "Grouping Variable" box.

The categorical variable can involve 2 or more categories; however, the t-test can only compare 2 groups each time.

Define which 2 groups will be included in the comparison by pressing the Define Groups button.

o When there are only 2 categories in the variable, it is still necessary to define groups.


Fill in the codes representing the 2 groups to be compared, then press Continue.

As in other analyses, press OK if you want to get the results immediately, or press Paste to copy out the command syntax, then run it in the Syntax window to get the output.
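For the example below, the pasted syntax would look roughly like this sketch (assuming the GSS variable names AGE and VOTE92 that appear in the output):

    * Hypothetical pasted syntax: mean age by 1992 voting status.
    T-TEST GROUPS=VOTE92(1 2)
      /VARIABLES=AGE
      /CRITERIA=CI(.95).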

4. SPSS Output for T-Test

4.1 We want to know whether the mean age of those who voted was different from those who did not vote in the 1992 election.

SPSS will first produce the following table to show the mean age of the 2 groups in comparison.


On average, those who voted were about 5 years (47.85 - 42.71) older than those who did not. From the sample means, we can draw an initial conclusion that voters were older than non-voters. However, because we are interested in inferring the sample finding to the target population, this conclusion must be tested for statistical significance.

T-Test

Group Statistics

    VOTE92 Voting in 1992 Election      N      Mean    Std. Deviation   Std. Error Mean
    AGE Age of     1 voted              1028   47.85   16.953           .529
    Respondent     2 did not vote        420   42.71   18.010           .879

4.2 Test for significance of difference

The null hypothesis is: voters and non-voters had no difference in age. Two rows of the output contain the same kind of information (t, df, Sig. (2-tailed), and so on):

o Equal variances assumed
o Equal variances not assumed

As you may notice, we have to choose one row of information to believe, but which one: Equal variances assumed or Equal variances not assumed?

o "Variances" here refers to the variance of each group.
o Rule of decision:

The null hypothesis for Levene's test is: the variances of the 2 groups are equal. The significance to examine is the one that corresponds to the F-value of Levene's test.

If that significance level is greater than 0.05, the null hypothesis is accepted, i.e., use the row Equal variances assumed for the information on the t-test.

If that significance level is less than or equal to 0.05, the null hypothesis is rejected, i.e., use the row Equal variances not assumed for the information on the t-test.

o Here the significance level of Levene's test is 0.202, therefore its null hypothesis is accepted.
o We therefore use the row Equal variances assumed for the information on the t-test.

The Sig. (2-tailed) tells us about the level of significance of the t-value:

o The significance shows .000, but this does not mean the probability is zero; it actually means the significance level is less than 0.0005.
o As a convention, we reject the null hypothesis at p ≤ 0.05.

Hence, we may conclude that voters were older than non-voters in our target population.

Independent Samples Test

Levene's Test for Equality of Variances

    AGE Age of Respondent             F       Sig.
    Equal variances assumed           1.631   .202
    Equal variances not assumed

t-test for Equality of Means

                                                                             95% Confidence Interval
                                                  Sig.        Mean   Std. Error  of the Difference
    AGE Age of Respondent          t      df      (2-tailed)  Diff.  Diff.       Lower    Upper
    Equal variances assumed        5.141  1446    .000        5.14   1.000       3.179    7.102
    Equal variances not assumed    5.012  737.803 .000        5.14   1.026       3.127    7.154


Bivariate (Pearson) Correlation

A correlation expresses the strength of linkage or co-occurrence between two variables in a single value between -1 and +1. This value that measures the strength of linkage is called the correlation coefficient, which is typically represented as the letter r.

The correlation coefficient between two continuous-level variables is also called Pearson's r or the Pearson product-moment correlation coefficient. A positive r value expresses a positive relationship between the two variables (the larger A, the larger B) while a negative r value indicates a negative relationship (the larger A, the smaller B). A correlation coefficient of zero indicates no relationship between the variables at all. However, correlations are limited to linear relationships between variables. Even if the correlation coefficient is zero, a non-linear relationship might exist.

Bivariate correlation and regression evaluate the degree of relationship between two quantitative variables. The Pearson correlation (r), the most commonly used bivariate correlation technique, measures the association between two quantitative variables without distinction between the independent and dependent variables (e.g., what is the relationship between SAT scores and freshman college GPA?).

The Output of the Bivariate (Pearson) Correlation

The output is fairly simple and contains only a single table, the correlation matrix. The bivariate correlation analysis computes the Pearson correlation coefficient for a pair of variables. If the analysis is conducted for more than two variables, it creates a larger matrix accordingly. The matrix is symmetrical, since the correlation between A and B is the same as between B and A. Also, the correlation between A and A is always 1.

In this example, the Pearson correlation coefficient is .645, which signifies a medium positive linear correlation. The significance test has the null hypothesis that there is no positive or negative


correlation between the two variables in the universe (r = 0). The results show a very high statistical significance of p < 0.001, thus we can reject the null hypothesis and assume that the Reading and Writing test scores are positively, linearly associated in the general universe.
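A minimal syntax sketch for this analysis, assuming the test-score variables are named READ and WRITE (hypothetical names; the source does not give them):

    * Hypothetical bivariate (Pearson) correlation of two test scores.
    CORRELATIONS
      /VARIABLES=READ WRITE
      /PRINT=TWOTAIL NOSIG
      /MISSING=PAIRWISE.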

Data Analysis & Interpretation

Analysis involves the calculation of a correlation coefficient (i.e., a quantitative measure of a relationship). The most common is the Pearson correlation coefficient (r), the correlation between two interval variables. Numerous other coefficients exist for various combinations of variables. However, all are interpreted in a similar manner; they range from -1.00 to +1.00 (some range from 0.00 to +1.00).

The general rule of thumb for interpretation weighs the value of the coefficient together with its p-value (significance) and the sample size.


Partial Correlations

This feature requires the Statistics Base option.

The Partial Correlations procedure computes partial correlation coefficients that describe the linear relationship between two variables while controlling for the effects of one or more additional variables. Correlations are measures of linear association. Two variables can be perfectly related, but if the relationship is not linear, a correlation coefficient is not an appropriate statistic for measuring their association.

Example. Is there a relationship between healthcare funding and disease rates? Although you might expect any such relationship to be a negative one, a study reports a significant positive correlation: as healthcare funding increases, disease rates appear to increase. Controlling for the rate of visits to healthcare providers, however, virtually eliminates the observed positive correlation. Healthcare funding and disease rates only appear to be positively related because more people have access to healthcare when funding increases, which leads to more diseases being reported by doctors and hospitals.

Partial Correlation

The partial correlation is the same as the Pearson correlation except that it allows you to control for, or remove, the influence of another variable. The influencing variable is typically referred to as a confounding variable. By statistically controlling for or removing the influence of the confounding variable, you can obtain a clearer and more accurate indication of the relationship between your two variables of interest.

For example, let's say you wanted to evaluate the correlation between hours studied and scores on a math test. There may be other factors that also influence test performance, like IQ scores. So, to remove the influence of IQ scores, we would run a partial correlation between hours studied and test score, controlling for IQ scores.
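A minimal syntax sketch of that partial correlation, with hypothetical variable names HOURS, SCORE, and IQ:

    * Hypothetical partial correlation: hours studied with test score, controlling for IQ.
    PARTIAL CORR
      /VARIABLES=HOURS SCORE BY IQ
      /SIGNIFICANCE=TWOTAIL
      /MISSING=LISTWISE.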

All correlations in the partial correlation procedure are Pearson correlation coefficients (r). Just like the bivariate correlation, the Pearson correlation coefficient (r) can only take on values from -1 to +1. The sign indicates whether there is a positive correlation (as one variable increases, the other variable also increases) or a negative relationship (as one variable increases, the other variable decreases).

The size of the correlation coefficient (ignoring the sign) indicates the strength of the relationship. The closer the correlation coefficient (r) gets to 1, either positive or negative,


    the stronger the relationship. On the other hand, the closer the correlation coefficient gets to

    0, the weaker the relationship.

Reliability Analysis

Reliability analysis allows you to study the properties of measurement scales and the items that compose the scales. The Reliability Analysis procedure calculates a number of commonly used measures of scale reliability and also provides information about the relationships between individual items in the scale. Intraclass correlation coefficients can be used to compute inter-rater reliability estimates.

Example. Does my questionnaire measure customer satisfaction in a useful way? Using reliability analysis, you can determine the extent to which the items in your questionnaire are related to each other, you can get an overall index of the repeatability or internal consistency of the scale as a whole, and you can identify problem items that should be excluded from the scale.
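A minimal syntax sketch for such an analysis, assuming ten hypothetical questionnaire items Q1 through Q10 stored next to each other in the data file:

    * Hypothetical reliability analysis (Cronbach's alpha) for a 10-item scale.
    RELIABILITY
      /VARIABLES=Q1 TO Q10
      /SCALE('Satisfaction') ALL
      /MODEL=ALPHA
      /SUMMARY=TOTAL.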

Nominal.

A variable can be treated as nominal when its values represent categories with no intrinsic ranking; for example, the department of the company in which an employee works. Examples of nominal variables include region, zip code, or religious affiliation.

Ordinal.

A variable can be treated as ordinal when its values represent categories with some intrinsic ranking; for example, levels of service satisfaction from highly dissatisfied to highly satisfied. Examples of ordinal variables include attitude scores representing degree of satisfaction or confidence and preference rating scores. For ordinal string variables, the alphabetic order of string values is assumed to reflect the true order of the categories. For example, for a string variable with the values of low, medium, high, the order of the categories is interpreted as high, low, medium, which is not the correct order. In general, it is more reliable to use numeric codes to represent ordinal data.
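A minimal sketch of coding ordinal data numerically, with a hypothetical satisfaction variable SATISF:

    * Hypothetical setup: numeric codes 1-3 preserve the true category order,
    * which the alphabetic order of 'low', 'medium', 'high' would not.
    DATA LIST FREE / SATISF.
    BEGIN DATA
    1 2 3 2 1 3
    END DATA.
    VALUE LABELS SATISF 1 'low' 2 'medium' 3 'high'.
    VARIABLE LEVEL SATISF (ORDINAL).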

Scale.

A variable can be treated as scale when its values represent ordered categories with a meaningful metric, so that distance comparisons between values are appropriate. Examples of scale variables include age in years and income in thousands of dollars.