What is an independent samples t-test?

Independent Samples T-Tests

Another application of the t-test is the independent samples t-test.

An independent samples t-test evaluates whether two means from two samples of the same dependent variable are significantly different from one another.

Example:

Dependent variable (the same for both groups): baby birth weight

Independent variable: two groups of expectant mothers, those who drink
• less than 2 bottles of water per day (<)
• more than 2 bottles of water per day (>)

[Figure: two distributions of baby birth weight, one per group, labeled mean 1 and mean 2.]

Note: any time you run an independent samples t-test you will have two levels of an independent variable. In this case the levels are expectant mothers who drink less than 2 bottles of water per day (level one, <) and those who drink more than 2 bottles of water per day (level two, >).

These levels can either be:
• naturally occurring as 2 categorical groups (females / males), or
• arbitrarily divided into 2 groups from a continuous measure (for example, baby birth weight, with level one = 6 to 7 lbs. and level two = 8 to 9 lbs.).

Note that our research question will be about group differences. For example:

Who is more likely to engage in religious practices?
(1) females (2) males
(1) Mormons (2) Jews
(1) urban dwellers (2) inner city residents
Those who high jump (1) over five feet or (2) under five feet … etc.

You also run an independent samples t-test when there is only 1 dependent variable.


In this example: Baby birth weight is the dependent variable we are measuring across both groups



An independent samples t-test is used only with interval or ratio data, not with nominal or ordinal data.

Interval scales
• assume quantity of the attribute
• have equal intervals
• may have an arbitrary zero or starting point

Ratio scales
• assume quantity of the attribute
• have equal intervals
• have a true zero or starting point (for example, heights of 5’6”, 5’9”, 6’1”, and 6’3”)

Nominal scales
• assume no quantity of the attribute
• have no particular interval

Ordinal scales
• assume quantity of the attribute
• do not have equal intervals

[Figure: ordinal example showing time = 16.1 and time = 17.8.]

Finally, an independent samples t-test should be used when the data are reasonably normally distributed.

[Figure: a reasonably normal distribution contrasted with distributions that are NOT normal.]

So, in summary, an independent samples t-test is appropriate to run when:

1. the research question deals with the difference between two sample means
2. you are working with interval / ratio data
3. the distribution is reasonably normal
4. there is one independent variable (for example, gender) with two levels (female / male)
5. both levels are measured on the same dependent variable (for example, baby birth weight)

As is the case whenever we use inferential statistics to answer a research question, we start with a decision rule. This means stating the null as well as the alternative hypothesis:

The null hypothesis would be, “There is no significant difference between the two groups in terms of the dependent variable.”

The alternative hypothesis would be, “There is a significant difference between the two groups in terms of the dependent variable.”

So what would the null hypothesis be for expectant mothers’ consumption of water and baby birth weight?

“There is no significant difference between expectant mothers who drink more than 2 bottles of water per day and those who drink less than 2 bottles of water per day (the two groups) in terms of baby birth weight (the dependent variable).”

So what would the null hypothesis be for the amount of water consumed by each gender?

“There is no significant difference between males and females (the two groups) in terms of water consumption (the dependent variable).”

And the alternative hypothesis:

“There is a significant difference between males and females (the two groups) in terms of water consumption (the dependent variable).”

The formula for the independent samples t-test is as follows:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{SE_{\text{differences}}}$$

where
• $\bar{x}_1$ is the mean birth weight of babies born to mothers who drink > 2 bottles of water per day,
• $\bar{x}_2$ is the mean birth weight of babies born to mothers who drink < 2 bottles of water per day, and
• the result is the difference between $\bar{x}_1$ and $\bar{x}_2$ measured in standard error units.

It follows the same general form as the single-sample t-test.

Independent samples t-test:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{SE_{\text{differences}}}$$

Here $\bar{x}_1$ is the MEAN birth weight of one SAMPLE of babies.

Single sample t-test:

$$t = \frac{\bar{x} - \mu}{SE_{\text{mean}}}$$

Here one of the sample means is replaced by $\mu$, the POPULATION mean birth weight.

The independent samples t-test represents the difference between the means in standard error units.


So, for example, if
• the average birth weight for babies whose mothers consumed < 2 bottles of water per day was 10 pounds,
• and for babies whose mothers consumed > 2 bottles of water per day was 6 pounds,
• and the standard error of the difference was 2,
• then the t value would be:

$$t = \frac{10\ \text{lb} - 6\ \text{lb}}{2} = \frac{4}{2} = 2$$
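The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name `t_from_standard_error` is made up for this example, and the standard error of 2 is simply taken as given, as it is on the slide.

```python
def t_from_standard_error(mean_1, mean_2, se_difference):
    """Return the difference between two means in standard error units."""
    return (mean_1 - mean_2) / se_difference


# Values from the example: means of 10 lb and 6 lb, standard error of 2.
t_value = t_from_standard_error(10.0, 6.0, 2.0)
print(t_value)  # 2.0
```

The result only says how many standard error units separate the two means. Whether that is statistically significant still depends on the critical value.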

This means that
• the mean weight for babies whose mothers consume < 2 bottles of water per day is 2 standard error units greater than the mean weight for babies whose mothers consume > 2 bottles of water per day.

At this point we do not know if there is a statistically significant difference between the two. Later this t-value will be compared against a standard to determine if such a difference exists.

How did we come up with standard error?

First, there is a theoretical answer and then a practical answer.

THEORETICAL ANSWER

This standard error of the differences represents the standard deviation of the sampling distribution of differences between means from samples of sample sizes n1 and n2.





So here is a way to visually depict this.

Let’s imagine that the first sample has n1 = 20. In other words, the NUMBER of baby birth weights recorded from expectant mothers drinking less than 2 bottles of water per day is 20.

[Figure: the distribution of this one sample, on an axis marked 8, 10, and 12.]

Now imagine that we selected one hundred samples of 20 baby birth weights from expectant mothers drinking less than 2 bottles of water per day.

[Figure: 100 distributions. Each one of these distributions represents a sample of 20 from the < 2 group.]

And we do the same for samples of baby birth weight from mothers drinking more than 2 bottles of water per day.

[Figure: 100 distributions. Each one of these distributions represents a sample of 20 from the > 2 group.]

Then we do something very interesting. We imagine randomly pulling out one sample from each group and subtracting one sample mean from the other.

• First pair of samples randomly pulled out: Mean = 10 minus Mean = 7 gives a difference of 3.
• SECOND pair of samples randomly pulled out: Mean = 11 minus Mean = 7 gives a difference of 4.

This is done hundreds of times until a sampling distribution of the differences emerges.

[Figure: sampling distribution of the differences in mean birth weight between the two groups, with axis marks at +2, +4, and +6 and SD = 2.0.]

The STANDARD DEVIATION of this last distribution is called the STANDARD ERROR of the differences between the first and second sample means.

Look at the following explanation and as you read consider the images you just saw in the previous slides. Return to these slides if necessary until the concepts are clear.

In summary: The standard error of the differences represents the standard deviation of the sampling distribution of differences between means from samples of sample sizes n1 and n2.
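The thought experiment can also be run as a small simulation. The sketch below is illustrative only: it assumes both populations are normal, picks population means of 10 and 7 and a variance of 2 for the sake of the example, and uses made-up function names. It repeatedly draws a sample of 20 from each group, subtracts the sample means, and takes the standard deviation of those differences.

```python
import math
import random
import statistics


def simulate_se_of_difference(mu_1, mu_2, sd_1, sd_2, n_1, n_2,
                              repeats=2000, seed=0):
    """Standard deviation of many simulated differences between sample means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    differences = []
    for _ in range(repeats):
        sample_1 = [rng.gauss(mu_1, sd_1) for _ in range(n_1)]
        sample_2 = [rng.gauss(mu_2, sd_2) for _ in range(n_2)]
        differences.append(statistics.fmean(sample_1) - statistics.fmean(sample_2))
    return statistics.stdev(differences)


def formula_se_of_difference(var_1, var_2, n_1, n_2):
    """Standard error of the difference computed from variances and sample sizes."""
    return math.sqrt(var_1 / n_1 + var_2 / n_2)


# Assumed populations: means 10 and 7, variance 2 in each, samples of 20.
simulated = simulate_se_of_difference(10, 7, math.sqrt(2), math.sqrt(2), 20, 20)
expected = formula_se_of_difference(2.0, 2.0, 20, 20)
print(round(simulated, 2), round(expected, 2))
```

Under these assumptions the two numbers should land close to each other, which is the point of the slides: the standard deviation of the subtracted sampling distribution is what the standard error measures.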


We are now leaving the theoretical or conceptual explanation of standard error and moving on to what we end up doing in real life:

Because we generally do not have the resources or the ability to get hundreds of samples from one group and hundreds of samples from another group, compute the differences, and then take the standard deviation of them all, we simply estimate the standard error using information from just the two samples (of 20 each in this case).

Amazingly, statisticians who have actually taken the hundreds of samples and run the calculations have found that this estimator is very accurate!

There are three computations involved in determining whether two sample means are statistically significantly different from one another.

Computation #1: this computation is used when the two samples are similar in two ways:
1. variances
2. sample size

[Figure: a sample with Mean = 6, Var = 2, Sample size = 20.]

When this is the case, use this formula to compute t:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

Note: statistical software will run this for you. If you were to put the numbers in by hand and compute it you would get an identical result.

Computation #2: this computation is used to calculate standard error when the two samples differ in sample size.

[Figure: one sample with Mean = 6, Var = 10, Sample size = 20 and another with Mean = 10, Var = 10, Sample size = 5.]

This complicated looking formula is used in this case to compute t:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\left[\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}\right]\left[\dfrac{n_1 + n_2}{n_1 n_2}\right]}}$$



Computation #3: this computation is used to calculate the degrees of freedom when the variances are unequal.

• After using the first or second computation to calculate t (depending on the similarity of the sample sizes), a separate formula is used to determine the degrees of freedom.

[Formula: degrees of freedom adjusted for unequal variances.]

• As will be visually depicted shortly, the degrees of freedom determine the critical t value, which in turn is the standard by which you determine whether the difference between the two means is statistically significant.

• The bottom line here is that the critical t value is larger when the variances are different (because the adjusted degrees of freedom are smaller), requiring a greater t value for there to be a statistically significant difference between the two sample means.

Once again, the statistical software will run this calculation.
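The degrees-of-freedom formula itself does not survive in this text. One widely used adjustment for unequal variances is the Welch–Satterthwaite approximation, and the sketch below assumes that is the kind of formula intended; it is not confirmed by the slides. The function name and the sample size of 30 per group are illustrative assumptions.

```python
def welch_degrees_of_freedom(var_1, n_1, var_2, n_2):
    """Welch-Satterthwaite approximation (an assumed choice, see lead-in)."""
    a = var_1 / n_1  # variance of sample 1 scaled by its sample size
    b = var_2 / n_2  # variance of sample 2 scaled by its sample size
    return (a + b) ** 2 / (a ** 2 / (n_1 - 1) + b ** 2 / (n_2 - 1))


# Illustrative values: variances of 5.0 and 2.0, assumed n = 30 in each group.
equal_variance_df = 30 + 30 - 2
adjusted_df = welch_degrees_of_freedom(5.0, 30, 2.0, 30)
print(equal_variance_df, round(adjusted_df, 1))  # 58 49.0
```

Fewer degrees of freedom mean a larger critical t value, so unequal variances raise the bar for significance.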


• So why do we show you the formula?






• We show you the formula in preparation for what you will see in upcoming slides.

• We want you to see what happens in the formula as different means, variances, and sample sizes are used in the calculation.

Let’s begin with Computation #1

Example 1

Sample 1 (> 2 bottles of water): Mean = 6, Var = 1.9
Sample 2 (< 2 bottles of water): Mean = 10, Var = 2.1

Here is a simplified version of the formula to calculate t:

$$t = \frac{\text{mean of sample 1} - \text{mean of sample 2}}{SE_{\text{difference}}}$$

Here is a more specific version of the formula:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

We are going to take this step by step so you will not only know what numbers to plug in but see important patterns that unfold when using this formula with different data.

Here are a couple things to consider:

1. The numerator in this fraction is simply the sample mean of one group minus the sample mean of another group. That’s it!

2. The only other values you need are the sample sizes (in this case 20 for both samples) and the variances (in this case 2 for both samples).













3. As mentioned before, by some magic of nature the formula for estimating standard error,

$$SE_{\text{difference}} = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}},$$

has been shown to be a fairly accurate estimator. In other words, the result of calculating the estimated standard error is very close to the result obtained from the method we showed earlier (selecting 100 or 1000 pairs of samples, subtracting their means from each other, and taking the standard deviation of the differences), which is generally not practical to do.

As seen in previous slides:

[Figure: the sample mean distribution of birth weight of babies from mothers who drink < 2 bottles of water, minus the sample mean distribution for mothers who drink > 2 bottles of water, equals the sample mean distribution of the difference between the first and second sample. Take the standard deviation of this distribution and you have the standard error.]

This conceptual method is estimated by a more feasible / practical method:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

where $\bar{x}_1$ is the mean of sample 1, $\bar{x}_2$ is the mean of sample 2, and the denominator is the estimate of standard error.

Let’s try to understand it conceptually a step at a time.

If you have sample sizes ($N_1$ and $N_2$) of 30 each and variances ($s_1^2$ and $s_2^2$) of 2 each, let’s see what happens.

Let’s imagine that the
• first sample, of birth weights of babies whose mothers consumed < 2 bottles of water, has a mean of 10 pounds with a variance of 2, and
• second sample, of birth weights of babies whose mothers consumed > 2 bottles of water, has a mean of 6 pounds with a variance of 2.

Step 1: subtract the mean of one sample from the mean of the other sample:

$$\text{mean of sample 1} - \text{mean of sample 2} = 10 - 6 = 4$$

This 4 is the raw score difference between the sample means.

Step 2: divide each sample variance by its sample size (the number of observations in that sample):

$$t = \frac{4}{\sqrt{\dfrac{2.0}{30} + \dfrac{2.0}{30}}} = \frac{4}{\sqrt{.067 + .067}}$$

Step 3: add the two results and take the square root of the sum in the denominator:

$$t = \frac{4}{\sqrt{.133}} = \frac{4}{.365}$$

The value .365 is the estimated standard error.

Step 4: divide the difference between the means by the estimated standard error:

$$t = \frac{4}{.365} = 10.95$$

What does that mean?

It means that there are 10.95 units of standard error between the sample mean of 6 pounds and the sample mean of 10 pounds.
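The four steps can be followed in code as well. This is a minimal sketch with an illustrative function name, using the values from the example (means of 10 and 6, variances of 2.0, sample sizes of 30).

```python
import math


def independent_t_steps(mean_1, mean_2, var_1, var_2, n_1, n_2):
    """Walk through the four steps and return the intermediate results."""
    difference = mean_1 - mean_2                      # Step 1: raw score difference
    scaled_1 = var_1 / n_1                            # Step 2: variance / sample size
    scaled_2 = var_2 / n_2
    standard_error = math.sqrt(scaled_1 + scaled_2)   # Step 3: square root of the sum
    t_value = difference / standard_error             # Step 4: difference / standard error
    return difference, standard_error, t_value


difference, standard_error, t_value = independent_t_steps(10, 6, 2.0, 2.0, 30, 30)
print(difference, round(standard_error, 3), round(t_value, 2))  # 4 0.365 10.95
```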


[Figure: Sample 1 (> 2 bottles of water), Mean = 6, Var = 2.0, and Sample 2 (< 2 bottles of water), Mean = 10, Var = 2.0. 10.95 SE values separate the two means.]

Now we look in the back of a statistics book and find the table entitled “t-DISTRIBUTION CRITICAL VALUES.”

Why do we do this?

Because we want to know the critical t-value. Once we know that value, we can determine whether our t-value of 10.95 is less than or greater than the critical t-value.

If it is greater than the critical t-value then we will reject the null hypothesis.

If it is less than the critical t-value then we will fail to reject (or, loosely, accept) the null hypothesis.

To determine the critical t-value we do two things.

• First, we calculate the degrees of freedom. This is done by adding the two sample sizes (30 + 30 = 60 in this case) and subtracting 2 (which comes to 58).

• Second, we determine the alpha value. Essentially, the alpha value is the value you set to indicate what you are willing to accept as a rare occurrence.

– If you choose an alpha of .05 you are essentially saying, “If the chance of that occurring is .05 or less, then I will assume that it is a rare occurrence and reject the null hypothesis.”

So let’s say that in this case you choose .05 as your alpha. Using these two pieces of information we can now determine the critical t value:




First we go to the column at the far left with the heading “df” and trace our finger down to 58 (60 is the closest row in the table). We then go over to the .05 heading. Where the df row of 60 and a probability of .05 intersect we find the value 1.671 (the one-tailed critical value at this alpha).

This is our critical t value: 1.671. So, if our calculated t exceeds this value, we would reject the null hypothesis.

If it does not exceed this value, we would fail to reject (or accept) the null hypothesis.
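The decision rule can be written down compactly. In this sketch the function names are illustrative, and the critical value of 1.671 is typed in by hand from the table rather than computed.

```python
def degrees_of_freedom(n_1, n_2):
    """Add the two sample sizes and subtract 2."""
    return n_1 + n_2 - 2


def decide(t_value, t_critical):
    """Compare the calculated t against the critical t from the table."""
    if t_value > t_critical:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"


df = degrees_of_freedom(30, 30)   # 58, looked up in the table's row for 60
decision = decide(10.95, 1.671)   # critical value read from the table
print(df, decision)  # 58 reject the null hypothesis
```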

So, with a t value of 10.95, which is larger than the critical t of 1.671, we will reject the null hypothesis in favor of the alternative hypothesis, which states:

“The mean weight of babies whose mothers drink less than 2 bottles of water per day is statistically significantly greater than the mean weight of babies whose mothers drink more than 2 bottles of water per day.”

It is important to note that three things could have changed this outcome from a rejection of the null hypothesis to a failure to reject (acceptance of) the null hypothesis:

1. if the means had been closer

BEFORE: Mean = 6, Var = 2.0 and Mean = 10, Var = 2.0
AFTER: Mean = 9, Var = 2.0 and Mean = 10, Var = 2.0

$$\text{mean of sample 1} - \text{mean of sample 2} = 10 - 9 = 1$$

This 1 is the raw score difference between the sample means. The estimate of standard error is the same as before (.365), so

$$t = \frac{1}{.365} = 2.74$$

Notice that as the difference between the two means narrows, the t value decreases as well (in this case from 10.95 to 2.74).

There is a second factor that may impact the t value:

2. When the sample size decreases the t value will decrease as well

Let’s say that instead of sample sizes of 30, we have sample sizes of 5. We’ll keep the means (10 and 6) and the variances (2) the same. Let’s see what happens.

BEFORE:

$$t = \frac{10 - 6}{\sqrt{\dfrac{2.0}{30} + \dfrac{2.0}{30}}} = 10.95$$

AFTER:

$$t = \frac{10 - 6}{\sqrt{\dfrac{2.0}{5} + \dfrac{2.0}{5}}} = \frac{4}{\sqrt{.4 + .4}} = \frac{4}{\sqrt{.8}} = \frac{4}{.894} = 4.47$$

[Figure: AFTER, with Mean = 6, Var = 2.0 and Mean = 10, Var = 2.0. The means are 4.00 raw score units from each other and 4.47 SE values from each other.]

Notice that as the SAMPLE SIZE decreases the t value decreases (from 10.95 to 4.47).


Because we have 8 degrees of freedom (combined sample size of 10 minus 2) and we are using a .05 alpha (meaning that we are willing to call the difference significant if a difference this large would occur less than 5% of the time when the null hypothesis is true), we go to the number 8 in the far left column and move over to the column entitled .05. Here we see the value 1.860. Since our t value is greater than 1.860 (remember, it was 4.47), we would reject the null hypothesis.

Here is the third factor that impacts the size of the t value:

3. When the variance increases the t value will decrease

Let’s imagine that the variance increases from 2.0 to 20.0.

BEFORE:

$$t = \frac{10 - 6}{\sqrt{\dfrac{2}{30} + \dfrac{2}{30}}} = 10.95$$

AFTER:

$$t = \frac{10 - 6}{\sqrt{\dfrac{20}{30} + \dfrac{20}{30}}} = \frac{4}{\sqrt{.67 + .67}} = \frac{4}{\sqrt{1.33}} = \frac{4}{1.15} = 3.47$$

[Figure: AFTER, with Mean = 6, Var = 20 and Mean = 10, Var = 20.]

Because we have 58 degrees of freedom again (combined sample size of 60 minus 2) and we are using a .05 alpha, we go to the number 60 in the far left column and move over to the column entitled .05. Here we see the value 1.671, just like in the first instance. Note that 1.15 is the estimated standard error, not the t value; the t value is 3.47. Since 3.47 is still greater than 1.671, we would still reject the null hypothesis, although the t value is now far closer to the critical value than it was at 10.95.

Had the t value fallen below 1.671, we would have failed to reject (or accepted) the null hypothesis and concluded:

“The mean weight of babies whose mothers drink less than 2 bottles of water per day is NOT statistically SIGNIFICANTLY GREATER than the mean weight of babies whose mothers drink more than 2 bottles of water per day.”
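To see all three factors side by side, the sketch below recomputes t for the baseline example and for each change. The function and scenario names are illustrative. Because the code carries full precision, the larger-variance case prints 3.46 rather than the 3.47 obtained by rounding the standard error to 1.15 first.

```python
import math


def t_statistic(mean_1, mean_2, var_1, var_2, n_1, n_2):
    """t for two samples: mean difference over the estimated standard error."""
    return (mean_1 - mean_2) / math.sqrt(var_1 / n_1 + var_2 / n_2)


# (mean_1, mean_2, var_1, var_2, n_1, n_2) for each scenario in the slides
scenarios = {
    "baseline": (10, 6, 2.0, 2.0, 30, 30),
    "means closer": (10, 9, 2.0, 2.0, 30, 30),
    "smaller samples": (10, 6, 2.0, 2.0, 5, 5),
    "larger variance": (10, 6, 20.0, 20.0, 30, 30),
}

results = {name: t_statistic(*values) for name, values in scenarios.items()}
for name, t_value in results.items():
    print(f"{name}: t = {t_value:.2f}")
```

Each change lowers t relative to the baseline, but in these particular examples every t value still exceeds its critical value.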

The examples you have just seen show what factors decrease the t value. Conversely, depending on their values, these three factors can increase the t value, thus making it more likely that the t value will exceed the critical t value:

1. Larger difference between means (for example, means of 6 and 10 rather than 9 and 10, with Var = 2.0 in each sample)
2. Larger sample size (for example, sample size = 30 rather than sample size = 5)
3. Smaller standard deviation (for example, Var = 2.0 rather than Var = 5.0 in each sample, with means of 10 and 6)

In many cases the sample sizes are not the same. As we explained before, another formula is used that weights each sample's variance by its sample size so that the calculation is more accurate:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\left[\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}\right]\left[\dfrac{n_1 + n_2}{n_1 n_2}\right]}}$$

where $\bar{x}_1$ is the mean of sample 1 and $\bar{x}_2$ is the mean of sample 2.

As mentioned before, this formula is fairly complicated. Let’s try to understand it conceptually a step at a time.

Imagine the sample size for babies from mothers who consume > 2 bottles of water is 5, the sample size for babies from mothers who consume < 2 bottles of water is 30, and the variances ($s_1^2$ and $s_2^2$) are both 2.

Here is the calculation:

$$t = \frac{10 - 6}{\sqrt{\left[\dfrac{(30 - 1)(2) + (5 - 1)(2)}{30 + 5 - 2}\right]\left[\dfrac{30 + 5}{30 \cdot 5}\right]}} = \frac{4}{\sqrt{\left[\dfrac{58 + 8}{33}\right]\left[\dfrac{7}{30}\right]}} = \frac{4}{\sqrt{\dfrac{66}{33} \times .23}}$$

$$= \frac{4}{\sqrt{2 \times .23}} = \frac{4}{\sqrt{.46}} = \frac{4}{.68} \approx 5.90$$

(The 5.90 reflects rounding at the intermediate steps; carrying full precision gives about 5.86.)
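The unequal-sample-size calculation can be checked with a short sketch. The function name is illustrative, and the values are the ones used in this example (means of 10 and 6, variances of 2, sample sizes of 30 and 5). At full precision the result is about 5.86, slightly below the 5.90 obtained when intermediate steps are rounded.

```python
import math


def pooled_t_statistic(mean_1, mean_2, var_1, var_2, n_1, n_2):
    """t using a pooled variance weighted by each sample's degrees of freedom."""
    pooled_variance = ((n_1 - 1) * var_1 + (n_2 - 1) * var_2) / (n_1 + n_2 - 2)
    size_term = (n_1 + n_2) / (n_1 * n_2)
    standard_error = math.sqrt(pooled_variance * size_term)
    return (mean_1 - mean_2) / standard_error


t_value = pooled_t_statistic(10, 6, 2.0, 2.0, 30, 5)
print(round(t_value, 2))  # 5.86
```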

Here is the interpretation. Excuse the repetition, but the more you see it the more it is likely to sink in.


[Figure: Mean = 6, Var = 2.0, N = 5 and Mean = 10, Var = 2.0, N = 30. The two means are 5.90 SE values from each other.]

Now we look in the back of a statistics book and find the table entitled “t-DISTRIBUTION CRITICAL VALUES.”

The degrees of freedom are calculated by adding the two sample sizes and subtracting 2 from the result: 30 + 5 – 2 = 33.

Because we have 33 degrees of freedom and we are using a .05 alpha (meaning we are willing to call the difference significant if a result this extreme would occur less than 5% of the time when the null hypothesis is true), we go to the far left column, find where 33 falls, and scroll over to the column entitled .05. Here we see the value is between 1.684 and 1.697. This is called the critical value, meaning that if our calculated “t” exceeds it, we reject the null hypothesis. If it does not exceed this value, we fail to reject the null hypothesis.



Since our calculated t is 5.90, which is larger than a critical t of a value between 1.684 and 1.697, we will reject the null hypothesis in favor of the alternative hypothesis, which states:

“The mean weight of babies whose mothers drink less than 2 bottles of water per day is statistically SIGNIFICANTLY GREATER than the mean weight of babies whose mothers drink more than 2 bottles of water per day.”
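The decision rule can be sketched as follows. This is a minimal illustration under one assumption not stated in the slides: that the two table values, 1.697 and 1.684, belong to the rows for 30 and 40 degrees of freedom, as in standard one-tailed .05 t-tables. The helper name `interpolate_critical_t` is hypothetical.

```python
def interpolate_critical_t(df, lower_row, upper_row):
    """Linearly interpolate a critical t between two (df, critical t) table rows."""
    (df_lo, t_lo), (df_hi, t_hi) = lower_row, upper_row
    frac = (df - df_lo) / (df_hi - df_lo)
    return t_lo + frac * (t_hi - t_lo)

# Assumed table rows: df = 30 -> 1.697, df = 40 -> 1.684
critical_t = interpolate_critical_t(33, (30, 1.697), (40, 1.684))

t_calculated = 5.90
reject_null = t_calculated > critical_t
print(f"critical t is about {critical_t:.3f}; reject null: {reject_null}")
```

Linear interpolation is only an approximation, but here the calculated t is so far beyond either table value that the conclusion does not depend on it.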

One more very important point when running an independent samples t-test:

When the variances are significantly dissimilar, we do three things:

• First, we determine if they are significantly dissimilar using a test called “Levene’s Test of Variance Inequality.”

• Second, if the sample sizes are similar, we calculate the t value using the original formula:

$$
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{N_1} + \dfrac{s_2^2}{N_2}}}
$$

• Third, we calculate the degrees of freedom using another complicated looking formula:

$$
df = \frac{\left(\dfrac{s_1^2}{N_1} + \dfrac{s_2^2}{N_2}\right)^2}{\dfrac{\left(s_1^2 / N_1\right)^2}{N_1 + 1} + \dfrac{\left(s_2^2 / N_2\right)^2}{N_2 + 1}} - 2
$$

Example 2

Mean = 6, Var = 5.0
Mean = 10, Var = 2.0

Notice that the means are the same for examples 1 & 2, but the variance for the “> 2” group has gotten larger.

Example 1: Mean = 6, Var = 2.0 and Mean = 10, Var = 2.0
Example 2: Mean = 6, Var = 5.0 and Mean = 10, Var = 2.0

What impact is there on the t-test when the variances are significantly different?

The impact occurs in the denominator of the independent samples t-test equation:

$$
t = \frac{\bar{x}_1 - \bar{x}_2}{SE_{\text{differences}}}
$$

We theoretically calculate the SEdifferences as shown before by calculating the difference between the first group’s sampling distribution and the second group’s sampling distribution, and then taking the standard deviation of this resulting distribution.

Subtracting distributions with similar variances yields more stable results.

Subtracting distributions with significantly different variances yields less stable results.

So, the variances must be tested for similarity.

How are the variances tested for similarity?

It just so happens there is a test for this: Levene’s Test for Equality of Variance. When an independent samples t-test is run in SPSS, Levene’s test is automatically run.

As explained in the “Sums of Squares Logic” presentation, this test is computed by putting the larger sample variance in the numerator and the smaller one in the denominator.

$$
F = \frac{2.1}{1.9} \approx 1.1
$$

The F-statistic 1.1 falls within the area of acceptance; therefore we would fail to reject the null hypothesis that the variances are NOT different.

We still use the standard formula for standard error:

$$
t = \frac{\bar{x}_1 - \bar{x}_2}{SE_{\text{differences}}}
$$

Now see what happens when the variances differ:

$$
F = \frac{5.9}{1.9} \approx 3.1
$$

The F-statistic 3.1 falls just outside the area of acceptance and into the area of rejection; therefore we would reject the null hypothesis that the variances are NOT different.
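The variance ratio described above can be sketched as below. This is a minimal illustration of the ratio exactly as the slides describe it; the function name `variance_ratio` is hypothetical. Note that statistical packages typically compute Levene's test from absolute deviations around each group's center rather than from this simple ratio, so the sketch should not be read as a reproduction of the SPSS output.

```python
def variance_ratio(var_a, var_b):
    """Larger sample variance divided by the smaller one, as described in the slides."""
    larger, smaller = max(var_a, var_b), min(var_a, var_b)
    return larger / smaller

# The two cases from the slides
similar = variance_ratio(2.1, 1.9)
dissimilar = variance_ratio(5.9, 1.9)
print(f"F = {similar:.1f} (similar variances), F = {dissimilar:.1f} (dissimilar variances)")
```

Whether a given ratio falls in the area of rejection depends on the critical F value for the groups' degrees of freedom, which the slides do not state and which is therefore left out of the sketch.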

In summary, if Levene’s Test is not significant, we assume reasonable similarity in the dispersion of the two groups.


We would then use the pooled estimate formula for the standard error of differences, because the sample sizes of the two groups are the same.





If Levene’s Test is significant, we do not assume reasonable similarity in the dispersion of the two groups.

Furthermore, the formula for the corrected estimate of the standard error operates on different degrees of freedom, which alters the actual standard error value, the probability density and, by extension, the probability of a Type I error.

Let’s look at an example.

This is identical to an example you saw earlier with the exception that the variances are different.


What you will see in the slides that follow are the same standard error calculations as before. However we will use a special formula to calculate degrees of freedom.



You have sample sizes ($N_1$ & $N_2$) of 30 each and variances ($s_1^2$ & $s_2^2$) of 2 and 30 respectively.

Let’s imagine the

• first sample of baby birth weight, whose mothers consumed < 2 bottles of water, has a mean of 10 pounds with a variance of 2, and

• second sample of baby birth weight, whose mothers consumed > 2 bottles of water, has a mean of 6 pounds with a variance of 30.


Step 1 – subtract the mean of sample 2 from the mean of sample 1:

$$
\bar{x}_1 - \bar{x}_2 = 10 - 6 = 4 \quad \text{(raw score difference between sample means)}
$$

Step 2 – divide each sample variance by its sample size:

$$
\frac{s_1^2}{N_1} = \frac{2}{30} \approx .067, \qquad \frac{s_2^2}{N_2} = \frac{30}{30} = 1
$$

Step 3 – take the square root of the result in the denominator:

$$
\sqrt{.067 + 1} = \sqrt{1.067} \approx 1.033 \quad \text{(estimated standard error)}
$$

$$
t = \frac{4}{1.033} \approx 3.872
$$
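The three steps above can be sketched in Python. This is a minimal illustration; the function name `unpooled_t` is an assumption, not something taken from the slides.

```python
import math

def unpooled_t(mean1, mean2, var1, var2, n1, n2):
    """t value using the separate-variance (unpooled) standard error (hypothetical helper)."""
    mean_difference = mean1 - mean2                # Step 1: raw score difference
    variance_terms = var1 / n1 + var2 / n2         # Step 2: each variance over its sample size
    standard_error = math.sqrt(variance_terms)     # Step 3: square root of the denominator
    return mean_difference / standard_error, standard_error

# Values from the example: means 10 and 6, variances 2 and 30, sizes 30 and 30
t_value, se = unpooled_t(10, 6, 2.0, 30.0, 30, 30)
print(f"SE = {se:.3f}, t = {t_value:.3f}")
```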

If we used degrees of freedom of 29, this is what the critical t value would be:

t critical value = 1.699

However, when the variances are dissimilar, we use the formula previously mentioned:

$$
df = \frac{\left(\dfrac{s_1^2}{N_1} + \dfrac{s_2^2}{N_2}\right)^2}{\dfrac{\left(s_1^2 / N_1\right)^2}{N_1 + 1} + \dfrac{\left(s_2^2 / N_2\right)^2}{N_2 + 1}} - 2
$$

Let’s plug in the numbers and determine the appropriate degrees of freedom for two samples with such different variances.

For the numerator (upper half of the fraction), add the fractions and then square the result:

$$
\left(\frac{2}{30} + \frac{30}{30}\right)^2 = (.067 + 1)^2 = (1.067)^2 \approx 1.138
$$

Now for the denominator (lower half of the fraction). Square each fraction, calculate each sample size plus one, simplify each fraction, and sum the denominator:

$$
\frac{(.067)^2}{30 + 1} + \frac{(1)^2}{30 + 1} = \frac{.004}{31} + \frac{1}{31} \approx .00013 + .0323 \approx .0324
$$

Divide the numerator by the denominator and subtract 2:

$$
df = \frac{1.138}{.0324} - 2 \approx 35.1 - 2 = 33.1
$$

Degrees of freedom are about 33 rather than 58.

Let’s see how the critical value changes with 33 degrees of freedom instead of 58.

[Figures: the t-table lookup before and after the degrees of freedom correction.]
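The degrees of freedom steps can be checked with a short Python sketch. It follows the steps as the slides describe them (square the summed fractions, divide each squared fraction by its sample size plus one, then subtract 2); the function name `corrected_df` is hypothetical. Many textbooks and software packages use a closely related version with each sample size minus one and no final subtraction, so treat this as an illustration of the slides' procedure rather than of any particular package.

```python
def corrected_df(var1, var2, n1, n2):
    """Corrected degrees of freedom for unequal variances, following the slides' steps."""
    term1 = var1 / n1
    term2 = var2 / n2
    numerator = (term1 + term2) ** 2                            # add fractions, then square
    denominator = term1 ** 2 / (n1 + 1) + term2 ** 2 / (n2 + 1)  # squared fractions over n + 1
    return numerator / denominator - 2

# Values from the example: variances 2 and 30, sample sizes 30 and 30
df_corrected = corrected_df(2.0, 30.0, 30, 30)
df_pooled = 30 + 30 - 2
print(f"corrected df = {df_corrected:.1f}, pooled df = {df_pooled}")
```

Evaluated without intermediate rounding, these steps give roughly 33 degrees of freedom, compared with 58 for the pooled formula.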


So let’s summarize:

An independent samples t-test is an inferential statistical analysis that helps researchers determine if the mean of one sample is statistically significantly greater or lesser than the mean of another sample.

If we were just looking at the difference between two means, then we would subtract them. But since we are drawing conclusions about a larger population, we have to set up a null hypothesis and then run an independent samples t-test to determine if the results are statistically significant and, by extension, generalizable to other samples.

The estimated standard error is the value that determines if the distance between two means is significant or not.

If the estimated standard error is small, then a small difference between two means may still be statistically significant; if the estimated standard error is large, then a medium to large difference between two means may not be statistically significant.

The estimated standard error is everything!

The size of the estimated standard error is determined by five factors:

1. How big the difference is between the two means
2. The size of the samples
3. The size of the variance
4. If the sample sizes are similar or different
5. If the variances are similar or different
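Factors 2 and 3 can be illustrated with a minimal Python sketch that holds the mean difference fixed and varies the sample size and variance. The sample sizes and variances below are hypothetical values chosen for illustration, and `standard_error` is an assumed helper name; it uses the separate-variance formula from the earlier example.

```python
import math

def standard_error(var1, n1, var2, n2):
    """Estimated standard error of the difference between two means (separate variances)."""
    return math.sqrt(var1 / n1 + var2 / n2)

mean_difference = 4  # held constant across all scenarios

# Larger samples shrink the standard error, so the same difference yields a larger t
for n in (5, 30, 120):
    se = standard_error(2.0, n, 2.0, n)
    print(f"n = {n:3d} per group: SE = {se:.3f}, t = {mean_difference / se:.2f}")

# Larger variance inflates the standard error, so the same difference yields a smaller t
for variance in (2.0, 30.0):
    se = standard_error(variance, 30, 2.0, 30)
    print(f"variance = {variance:4.1f}: SE = {se:.3f}, t = {mean_difference / se:.2f}")
```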

End of Presentation
