Chapter 15 Paired tests: t and Wilcoxon Our …mei.org.uk/files/pdf/Z3ch15.pdf1 Chapter 15 Paired tests: t and Wilcoxon The Avonford Star Our Athletes cannot run at night! Following

Chapter 15 Paired tests: t and Wilcoxon

The Avonford Star

Our Athletes cannot run at night!

Following a very disappointing performance by Avonford Harriers athletes at the

recent County Athletics Championships, local statistician Sandie Green has come up

with an explanation. Our athletes cannot run at night!

Although many of our athletes got through the first round heats in the morning

session, most of them failed to get through the semi-final stage in the evening. It was

initially thought that our athletes were not good enough but Sandie came up with an

alternative explanation.

“It is clear that Avonford runners perform better in the morning. This is probably

because they do more of their training early in the morning before going to

work” Sandie told us. “Before the next County Championships we must look in more

detail at the training of these athletes”.

Is this a valid explanation or just a feeble excuse for poor performances?

Paired and Unpaired Experiments

In order to test this, the coach conducts a simple experiment. He takes a random

sample of 7 Avonford Harriers athletes, who run 400 metres in both a morning and an

evening session.

The results are shown in the table:

Initials of runner    SD    SC    MM    OC    SR    MW    NF
Morning time (s)      46.2  47.1  46.7  47.4  48.5  49.3  50.1
Evening time (s)      47.2  47.6  46.8  47.4  48.3  50.3  52.1

Note: in this experiment the same 7 athletes were timed for the morning and evening

sessions. This is an example of a paired design.

On other occasions you may not be able to use the same subjects in the two

conditions; for example, some of the athletes may not be available or they may be

injured. If you had chosen marathon runners you would have got very odd results if

you had insisted that they ran a race in the morning and then in the afternoon.

In many experiments it is not possible to use the same subjects for both conditions

and in such cases an unpaired test has to be used. The unpaired design is covered in

the next chapter.

However, attempts are often made to match subjects so that a paired test can be used.

For example, in many social science research projects pairs of identical twins are used.

Terminology

In this example,

• the random sample consists of 7 Avonford Harriers athletes, who run 400 metres in both a morning and an evening session.

• Take population 1 to be the times, X, of all Avonford Harriers athletes over 400 metres in the morning; it has mean $\mu_X$.

- A sample of size 7 has been taken from this: $x_1, x_2, x_3, \ldots, x_7$.

• Now take population 2 to be the times, Y, of all Avonford Harriers athletes over 400 metres in the evening; it has mean $\mu_Y$.

- The equivalent sample of size 7 has been taken from this: $y_1, y_2, y_3, \ldots, y_7$.

• The paired sample has differences $d_1, d_2, \ldots, d_7$, where $d_1 = y_1 - x_1$, $d_2 = y_2 - x_2$, …, $d_7 = y_7 - x_7$, drawn from a population of differences, D, with mean $\mu_D$.

So the Avonford Harriers data can be reduced to the set of values, $d_1, d_2, \ldots, d_7$, of the differences in the morning and evening times, shown in the table below:

Initials of runner               SD    SC    MM    OC    SR    MW    NF
Evening time, $y_i$              47.2  47.6  46.8  47.4  48.3  50.3  52.1
Morning time, $x_i$              46.2  47.1  46.7  47.4  48.5  49.3  50.1
Difference, $d_i = y_i - x_i$    1     0.5   0.1   0     -0.2  1     2

If these figures come from a population with mean zero, then you would expect to see approximately equal numbers and sizes of positive and negative differences in the sample. What you want to decide is whether the 7 differences listed above are, on the whole, so obviously positive that it is unlikely that they came from a population with a mean of zero.

To do this you use a paired sample test. In this chapter two such tests are covered, the

paired sample t test and the Wilcoxon signed rank test for paired samples.

The paired sample t test

SETTING UP THE HYPOTHESIS TEST

H0: $\mu_D = 0$. There is, on average, no difference between the performance of Avonford athletes in the morning and the evening.

H1: $\mu_D > 0$. On average, Avonford athletes are faster in the morning than in the evening (hence the times for mornings are smaller).

1-tail test.

Significance level 5%.

The form of the alternative hypothesis, which claims that Avonford athletes are faster

in the morning rather than just being different, indicates that a one-tail test is

appropriate.

CALCULATING THE TEST STATISTIC

The sample values are 1, 0.5, 0.1, 0, -0.2, 1, 2.

Start by calculating the sample mean and standard deviation of the differences.

$\bar{d} = 0.6286$, $s = 0.7675$

The test statistic, t, is given by

$$t = \frac{\bar{d}}{s/\sqrt{n}} = \frac{0.6286}{0.7675/\sqrt{7}} = 2.167$$

INTERPRETING THE TEST STATISTIC

In this case n = 7 and, as before with a t test, ν = n - 1 = 7 - 1 = 6.

Notes: 5% is a suitable significance level in this situation. The sample values used in the calculation are the differences, d. Use your calculator to check the values of the mean and standard deviation for yourself.

The critical value, at the 5% significance level, for a one-tail t test with 6 degrees of freedom is found in the t tables under ν = 6 and p = 10%. Remember that the tables are constructed to give each tail a probability of ½p%, so a one-tail probability of 5% corresponds to p = 10%.

This gives a critical value of 1.943.

Since 2.167 > 1.943, the null hypothesis is rejected and the alternative hypothesis is

accepted.

The evidence supports, at the 5% significance level, the statement in the Avonford

Star that the athletes are faster in the morning.
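
The calculation above can be checked numerically. The following is a minimal Python sketch using only the standard library. The function name `paired_t_statistic` is an illustrative choice, not something from the chapter, and the critical value 1.943 is copied from the t tables as quoted above rather than computed.

```python
import math
import statistics


def paired_t_statistic(first, second, k=0.0):
    """Return (mean difference, sample sd, t) for the differences second - first.

    k is the hypothesised mean difference under H0 (zero in this example).
    """
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    d_bar = statistics.mean(diffs)
    s = statistics.stdev(diffs)  # sample standard deviation, divisor n - 1
    t = (d_bar - k) / (s / math.sqrt(n))
    return d_bar, s, t


morning = [46.2, 47.1, 46.7, 47.4, 48.5, 49.3, 50.1]
evening = [47.2, 47.6, 46.8, 47.4, 48.3, 50.3, 52.1]

d_bar, s, t = paired_t_statistic(morning, evening)
critical_value = 1.943  # from tables: one-tail test at 5%, 6 degrees of freedom

print(f"mean = {d_bar:.4f}, s = {s:.4f}, t = {t:.3f}")
print("reject H0" if t > critical_value else "do not reject H0")
```

The differences are taken as evening minus morning, matching the definition of $d_i$ used in the text, so a positive t corresponds to slower evening times.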

Rationale

It is useful at this stage to think more clearly about what is happening when you apply

this test.

The values $d_1, d_2, d_3, \ldots, d_7$, where $d_1 = y_1 - x_1$, $d_2 = y_2 - x_2$, …, are a sample of size 7.

In this test, D is assumed to be Normally distributed.

The standard deviation of the parent population is unknown so the standard deviation,

s, of the sample is used as an estimate for it. That is why you use a t test.

The null hypothesis is that the mean of D is zero.

? If the mean of D is zero what does this tell you about the “before” and “after”

populations?

The test statistic, t, is given by

$$t = \frac{\bar{d}}{s/\sqrt{n}}$$

It has a t distribution with n - 1 degrees of freedom.

In the case where n = 7, the test statistic has 6 degrees of freedom and is given by

$$t = \frac{\bar{d}}{s/\sqrt{7}}$$

? Compare the approach to the t test for a single sample. What do you notice?

Testing for a non-zero value of the difference of two means

In the previous example the null hypothesis was

H0: the mean value, $\mu_D$, of the population difference, D, is zero.

Sometimes, as in the next example, you will need to test that it has some other value,

denoted by k.

Example 15.1

Do people tend to marry other people of the same age? A student investigates the

statement that women on average marry men 4 years older than themselves.

She takes a random sample of 10 married couples in England and records their ages to the nearest $\tfrac{1}{10}$ of a year. The results are given in the table below. Carry out the appropriate hypothesis test.

Solution

Define Population 1 as the ages, X, at which men marry in England. It has mean $\mu_X$.

A sample of size 10 has been taken from this: $x_1, x_2, x_3, \ldots, x_{10}$.

Define Population 2 as the ages, Y, at which women marry in England. It has mean $\mu_Y$.

The equivalent sample of size 10 has been taken from this: $y_1, y_2, y_3, \ldots, y_{10}$.

? Why is this a paired sample?

The paired sample has differences $d_1, d_2, \ldots, d_{10}$, where $d_1 = x_1 - y_1$, $d_2 = x_2 - y_2$, …, $d_{10} = x_{10} - y_{10}$, drawn from a population of differences, D, with mean $\mu_D$.

So the data are reduced to the set of values, $d_1, d_2, \ldots, d_{10}$, of the differences in ages, shown in the table below:

Husband, $x_i$                   36.3  42.3  37.4  26.5  21.5  30.8  32.9  56.3  25.2  30.9
Wife, $y_i$                      33.6  35.7  29.8  27.1  20.2  25.2  27.4  45.7  23.6  25.7
Difference, $d_i = x_i - y_i$    2.7   6.6   7.6   -0.6  1.3   5.6   5.5   10.6  1.6   5.2

SETTING UP THE HYPOTHESIS TEST

H0: $\mu_D = 4$, there is a 4 year difference between the mean ages of husbands and wives when they marry.

H1: $\mu_D \neq 4$, the difference between the mean ages of husbands and wives when they marry is not 4 years.

2-tail test.

Significance level 5%. (5% is a suitable level in this situation.)

? Why is this a 2-tail test?

CALCULATING THE TEST STATISTIC

Start by calculating the sample mean and standard deviation of the differences, $d_1, d_2, \ldots, d_{10}$. Use your calculator to check these values for yourself.

$\bar{d} = 4.61$, $s = 3.3617$

The test statistic, t, is given by

$$t = \frac{\bar{d} - k}{s/\sqrt{n}}, \quad \text{where } k = 4$$

$$t = \frac{4.61 - 4}{3.3617/\sqrt{10}} = 0.5738$$

INTERPRETING THE TEST STATISTIC

In this case n = 10 so ν = n-1 = 10 – 1 = 9

The critical value, at the 5% significance level, for a two-tail t test with 9 degrees of

freedom, is 2.262

Since 0.5738 < 2.262, the null hypothesis is accepted.

At the 5% significance level, the evidence is consistent with the statement that there is a 4-year difference between the mean ages of husbands and wives when they marry; there is not enough evidence to reject it.
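
As a check on Example 15.1, the short Python sketch below repeats the calculation with a non-zero hypothesised difference. It uses only the standard library. The variable names are illustrative assumptions, and the critical value 2.262 is copied from the tables as quoted above rather than computed.

```python
import math
import statistics

husband = [36.3, 42.3, 37.4, 26.5, 21.5, 30.8, 32.9, 56.3, 25.2, 30.9]
wife = [33.6, 35.7, 29.8, 27.1, 20.2, 25.2, 27.4, 45.7, 23.6, 25.7]

k = 4.0                 # hypothesised mean difference under H0
critical_value = 2.262  # from tables: two-tail test at 5%, 9 degrees of freedom

diffs = [x - y for x, y in zip(husband, wife)]
n = len(diffs)
d_bar = statistics.mean(diffs)
s = statistics.stdev(diffs)  # sample standard deviation, divisor n - 1
t = (d_bar - k) / (s / math.sqrt(n))

# Two-tail test: compare the size of t, ignoring its sign, with the critical value.
reject_h0 = abs(t) > critical_value

print(f"mean = {d_bar:.2f}, s = {s:.4f}, t = {t:.4f}")
print("reject H0" if reject_h0 else "do not reject H0")
```

Because the alternative hypothesis is two-sided, the comparison uses the absolute value of t.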

Assumptions for the t test

There are two assumptions when carrying out a t test.

1. The sample taken is random.

2. The variable is Normally distributed.

? In the first example a random sample of Avonford Harriers athletes was taken.

Are the differences between morning and evening times Normally distributed?

How could you check whether the model of a Normal distribution fits the differences?

Exercise 15A

1. “It is clear that Avonford runners perform better in June rather than in July when

the main competitions are,” moaned the Avonford athletics coach. This conclusion

was based on the following results. A random sample of 10 athletes was selected for

one distance, the 400 metres. The results are shown in the table below.

Use an appropriate t test, at the 5% significance level, to examine whether the mean difference between the times is zero. Also state the required distributional assumption.

Runner       A     P     C     T     E     F     W     D     J     B
June time    45.2  47.5  46.5  46.4  48.5  49.6  50.1  51.2  49.8  50.3
July time    47.8  47.4  46.9  46.4  49.3  50.1  51.1  52.3  49.2  48.6

2. A company purchases a chemical from a supplier. It is specified that the chemical

should contain no more than 7.5% of impurity. To investigate this, the company

arranges that a random sample of deliveries is checked by the supplier and by the

company itself. The percentages of impurity as found by the supplier and the

company are as follows.

Delivery                        A    B    C    D    E    F    G    H    I    J
Supplier’s determination (%)    7.7  9.4  6.6  5.5  8.1  4.9  5.9  6.9  9.0  7.4
Company’s determination (%)     7.5  9.1  6.8  5.4  8.0  4.7  5.6  6.9  9.3  7.7

Use an appropriate t test, at the 5% level of significance, to examine whether the mean determinations of percentage of impurity by the company and supplier may be assumed to be equal, stating clearly your null and alternative hypotheses and the required distributional assumption. [MEI part]

3. A psychologist is studying the possible effect of hypnosis on dieting and weight

loss. Nine people (who may be considered as a random sample from the population

under study) volunteer to take part in the experiment. Their weights are measured.

Then, under hypnosis, they are told that they will seldom feel hungry and will eat less

than usual. After a month, their weights are measured again. The results in kilograms

are as follows.

Person   Initial weight (kg)   Weight after one month (kg)
A        83.7                  81.5
B        83.9                  80.0
C        68.2                  68.8
D        74.9                  74.1
E        81.0                  82.6
F        72.8                  69.2
G        61.3                  63.4
H        77.9                  74.7
I        69.6                  66.2

(i) Use an appropriate t test to examine whether, overall, the mean weight has

been reduced over the month, at the 5% level of significance.

(ii) What distributional assumption is needed? [MEI]

4. In a steelworks, several skilled technicians are testing two machines that can be

used to cut rods of steel to, approximately, the required lengths. The machines always

cut to slightly above the specified length, so that the rod may be ground down to the

required length. However, this causes a waste of time and material. The purpose of

the test is to find which machine, if either, is better on the whole at cutting very near

to the specified length and thus minimising wastage.

Each technician uses each machine to cut rods to a particular specified length. The

excess length is then carefully measured, and the results, in cm, are as follows.

Technician   Machine A (cm)   Machine B (cm)
1            2.9              1.9
2            1.8              1.4
3            4.7              3.4
4            2.7              3.3
5            2.9              2.0
6            2.4              2.4
7            5.2              3.2
8            2.9              2.1

Use an appropriate t test to examine at the 5% level of significance, whether either

machine is better, stating carefully your null and alternative hypotheses. [MEI]

5. A taxi fleet manager thinks that the fuel consumption might be improved by

adopting a new design of tyre. An experiment is conducted to compare fuel

consumption using this new design and using standard tyres. Ten taxis are selected at

random from the fleet, fitted with the new tyres, and driven for a month in normal

service, each keeping its own driver throughout the trial. The average fuel

consumption over this period is measured for each taxi, and compared with the

average fuel consumption for a previous similar period with the standard tyres. The

results for average fuel consumption, in litres per 100 kilometres, are as follows.

Taxi              1     2     3     4     5     6     7     8     9     10
New tyres         18.2  17.6  19.4  17.9  18.9  17.4  18.5  19.0  18.9  17.2
Standard tyres    19.0  17.1  19.6  19.0  18.8  18.9  18.8  19.7  18.3  18.4

It is desired to examine the null hypothesis that, on the whole, the fuel consumption is

the same with new and standard tyres against the alternative that it is better (i.e. the

result in litres per 100km is smaller) with the new tyres. Making an appropriate

assumption, which should be carefully stated, use a t test to examine the above

hypothesis at the 5% level of significance. [MEI]

6. A therapist is studying the effect of a particular type of therapy on a phobic

reaction. A random sample of 10 patients is available. For each patient, the intensity

of the phobic reaction is measured, on a suitable scale, before and after therapy. The

results are as follows:

Patient   Intensity before therapy   Intensity after therapy
A         72                         66
B         59                         40
C         45                         58
D         87                         56
E         37                         15
F         64                         66
G         7                          15
H         75                         31
I         14                         3
J         50                         36

Use an appropriate t procedure to test, at the 5% level of significance, whether the

therapy on the whole reduces the intensity of the reaction, stating clearly your null

and alternative hypotheses and the required distributional assumption.

7. A fermentation process causes the growth of an enzyme. The amount of an

enzyme present in the mixture after a certain number of hours needs to be

measured accurately. An inspector is comparing two procedures for doing this,

there being a suspicion that the procedures are leading to different results.

Eight samples are therefore taken and each is divided into two sub-samples of

which one is randomly assigned for analysis by the first procedure and the

other by the second. The data (in a convenient unit of concentration) are as

follows:

Sample                  1      2      3      4      5      6      7      8
Result (procedure 1)    214.6  226.2  219.6  208.4  215.1  220.8  218.4  212.3
Result (procedure 2)    211.8  224.7  219.8  205.2  212.6  218.0  219.2  209.7

It is understood that the underlying populations are satisfactorily modelled by Normal

distributions.

Use an appropriate t test to examine these data, stating clearly the null and alternative

hypotheses you are testing. Use a 1% significance level. (MEI part)

The Wilcoxon signed rank test for paired samples.

In the previous section you met the paired sample t test. However, there was one

assumption that is not always applicable. What happens if the differences are not

Normally distributed?

An alternative test is the Wilcoxon signed rank test for paired samples.

You have already met the Wilcoxon signed rank test for single samples and know that

it is a non-parametric or distribution free test. No assumption of underlying Normality

is needed. It is used to test the null hypothesis that the median of the distribution is

equal to some value. It can also be used for data where a numerical scale is

inappropriate but where it is possible to rank the observations.

Look again at the example of the 400 metre runners. Using the t test on it depended on

the assumption that the distribution of the differences is Normal. If you do not believe

this assumption is valid, you need to conduct a different test.

The data are shown in the table:

Athlete   Evening time   Morning time
SD        47.2           46.2
SC        47.6           47.1
MM        46.8           46.7
OC        47.4           47.4
SR        48.3           48.5
MW        50.3           49.3
NF        52.1           50.1

Example 15.2

Carry out the Wilcoxon signed rank test on the data in the above table.

SETTING UP THE HYPOTHESIS TEST

H0 : There is no difference between the median of the times of Avonford athletes in

the evening and the morning.

H1 : There is a positive difference between the median of the times of Avonford

athletes in the evening and the morning.

1-tail test.

Significance level 5%.

The form of the alternative hypothesis, which claims that the times of the Avonford

athletes are greater in the evening, rather than just being different, indicates that a

one-tail test is appropriate.

CALCULATING THE TEST STATISTIC

The Avonford data can be reduced to a set of values, d1, d2, … , d7 of the differences

in the morning and evening times, shown in the table below:

Athlete   Evening time, $y_i$   Morning time, $x_i$   Difference, $d_i = y_i - x_i$
SD        47.2                  46.2                   1
SC        47.6                  47.1                   0.5
MM        46.8                  46.7                   0.1
OC        47.4                  47.4                   0
SR        48.3                  48.5                  -0.2
MW        50.3                  49.3                   1
NF        52.1                  50.1                   2

Note: 5% is a suitable significance level in this situation.

The athlete OC has a difference of 0 so this value is ignored and the value of n is

reduced to 6.

Athlete   $y_i$   $x_i$   $d_i = y_i - x_i$   Rank   +      -
SD        47.2    46.2     1                  4.5    4.5
SC        47.6    47.1     0.5                3      3
MM        46.8    46.7     0.1                1      1
SR        48.3    48.5    -0.2                2             2
MW        50.3    49.3     1                  4.5    4.5
NF        52.1    50.1     2                  6      6
Totals                                               W+ = 19   W- = 2

Calculate W- = 2

Calculate W+ = 1 + 3 + 4.5 + 4.5 + 6 = 19

Check: W+ + W- = 21 and, with n = 6, $\frac{n(n+1)}{2} = 21$.

The test statistic is W- = 2.

INTERPRETING THE TEST STATISTIC

From tables, if you use a 5% significance level, the critical value for a 1-tail test with n = 6 is W = 2.

Remember that the test statistic is the smaller of W+ and W-, and that you reject H0 if the test statistic is less than or equal to the critical value.

Since 2 ≤ 2, you reject the null hypothesis.

There is evidence that the median of the differences between the evening and morning times of Avonford athletes is greater than zero.

At the 5% significance level, the data support the claim that the athletes perform better in the morning.
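
The ranking procedure used in Example 15.2 can be sketched in Python as follows. This is an illustrative sketch using only the standard library, not a definitive implementation: the function name `signed_rank_sums` and the rounding of differences to 6 decimal places (to guard against floating-point noise when spotting zeros and ties) are assumptions made here, and the critical value 2 is copied from the tables as quoted above.

```python
def signed_rank_sums(first, second):
    """Return (W_plus, W_minus, n) for the differences second - first."""
    # Round so that tiny floating-point errors do not hide zeros or ties.
    diffs = [round(b - a, 6) for a, b in zip(first, second)]
    diffs = [d for d in diffs if d != 0]  # zero differences are ignored

    ordered = sorted(diffs, key=abs)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        # Positions i+1 .. j are tied, so they share the average rank.
        ranks[abs(ordered[i])] = (i + 1 + j) / 2
        i = j

    w_plus = sum(ranks[abs(d)] for d in diffs if d > 0)
    w_minus = sum(ranks[abs(d)] for d in diffs if d < 0)
    return w_plus, w_minus, len(diffs)


morning = [46.2, 47.1, 46.7, 47.4, 48.5, 49.3, 50.1]
evening = [47.2, 47.6, 46.8, 47.4, 48.3, 50.3, 52.1]

w_plus, w_minus, n = signed_rank_sums(morning, evening)
assert w_plus + w_minus == n * (n + 1) / 2  # the check described in the text

w = min(w_plus, w_minus)
critical_value = 2  # from tables: one-tail test at 5%, n = 6

print(f"W+ = {w_plus}, W- = {w_minus}, n = {n}")
print("reject H0" if w <= critical_value else "do not reject H0")
```

Note the direction of the comparison: for the Wilcoxon test a small value of W is evidence against the null hypothesis, which is the opposite of the t test.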

Which test to use: Wilcoxon or t ?

Notice in this test the conclusion is the same as for the t test. However, in this case

there has been no assumption that the distribution of the differences is Normal. The

Wilcoxon signed rank test is therefore very useful when there is doubt over whether

this assumption is true.

Example 15.3

A survey is carried out to discover whether background music has any effect on the

recall of students. A sample of 8 students is given a memory test with and without

music.

The results are shown in the table below:

Student   Recall (no music)   Recall (with music)
AB        12                  8
CT        7                   8
GR        8                   3
SH        10                  10
JT        13                  7
GW        8                   9
MB        11                  12
KY        15                  12

Carry out the Wilcoxon signed rank test.

SETTING UP THE HYPOTHESIS TEST

H0 : There is no difference between the median of the performance of students with or without music.

H1 : There is a difference between the median of the performance of students with or without music.

2-tail test.

Significance level 5%.

CALCULATING THE TEST STATISTIC

First calculate the differences and insert in the table as follows:

Student   Recall (no music)   Recall (with music)   Difference
AB        12                  8                      4
CT        7                   8                     -1
GR        8                   3                      5
SH        10                  10                     0
JT        13                  7                      6
GW        8                   9                     -1
MB        11                  12                    -1
KY        15                  12                     3

Ignore student SH (the fourth student), as the difference is 0. This reduces the value of n to 7.

Student   Recall (no music)   Recall (with music)   Difference   Rank     +     -
AB        12                  8                      4           5        5
CT        7                   8                     -1           2              2
GR        8                   3                      5           6        6
SH        10                  10                     0           Ignore
JT        13                  7                      6           7        7
GW        8                   9                     -1           2              2
MB        11                  12                    -1           2              2
KY        15                  12                     3           4        4
Totals                                                                    W+ = 22   W- = 6

Summing the + and – columns gives W+ = 22 and W- = 6.

Carrying out the check, W+ + W- = 28 and, with n = 7, $\frac{n(n+1)}{2} = 28$. Both are the same.

The test statistic is W = 6.

INTERPRETING THE TEST STATISTIC

From tables, for a 5% significance level for a 2-tail test, the critical value for n = 7 is W = 2.

Since 6 > 2 the null hypothesis is accepted.

There is no evidence, at the 5% significance level, of a difference between the median recall of students with and without music.
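
For whole-number data such as these recall scores, the tied ranks can be found by counting positions in the sorted list of absolute differences. The Python sketch below takes that approach for Example 15.3. The helper name `average_rank` is a hypothetical choice made for illustration, and the critical value 2 is copied from the tables as quoted above.

```python
no_music = [12, 7, 8, 10, 13, 8, 11, 15]
with_music = [8, 8, 3, 10, 7, 9, 12, 12]

# Differences (no music minus with music), dropping any that are zero.
diffs = [a - b for a, b in zip(no_music, with_music) if a != b]
sizes = sorted(abs(d) for d in diffs)


def average_rank(size):
    """Average of the 1-based positions that `size` occupies in `sizes`."""
    first = sizes.index(size) + 1
    last = first + sizes.count(size) - 1
    return (first + last) / 2


w_plus = sum(average_rank(abs(d)) for d in diffs if d > 0)
w_minus = sum(average_rank(abs(d)) for d in diffs if d < 0)
n = len(diffs)
assert w_plus + w_minus == n * (n + 1) / 2  # the check described in the text

w = min(w_plus, w_minus)
critical_value = 2  # from tables: two-tail test at 5%, n = 7

print(f"W+ = {w_plus}, W- = {w_minus}, W = {w}")
print("reject H0" if w <= critical_value else "do not reject H0")
```

The three differences of size 1 occupy positions 1 to 3, so each receives rank 2, as in the table above.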

! Sometimes the data are presented in rows rather than columns. Although you can

work in rows using the same headings, differences, + and -, you may find it easier to

work in columns, as in the above examples.

Exercise 15B

1. An experiment is conducted on 10 students to test the recall of students when

there is no background noise or when there is background noise using classical music.

The results are shown in the table below. Use an appropriate Wilcoxon procedure to

test whether the recall improves with the use of classical music. Carry out the test, at

the 5% significance level.

Student   Recall (no noise)   Recall (classical music)
ER        36                  38
HG        37                  39
JE        28                  32
WB        20                  15
PT        43                  47
BG        28                  19
KH        31                  23
TR        25                  32
LK        44                  46
WR        25                  28

2. Avonford council is concerned about the amount of lead in the air in its town

centre. In order to reduce this it is decided to do some traffic calming in some of the

streets with the aim of cutting down the number of vehicles. Opponents to the scheme

feel that this has added to the problem, as cars spend a longer time queuing in these

streets and hence increase the pollution. The council samples the amount of lead in

the air at 11 sites around the town centre. Do the data below show that the council

measures have changed the amount of lead in the air? Carry out the test, at the 5%

significance level.

Amount of lead (in parts per million)

Site      1    2    3    4    5    6    7    8    9    10   11
Before    210  89   56   167  46   67   98   121  144  67   34
After     243  76   53   187  38   63   87   132  160  67   28

3. Records have been kept, during a large number of working days, of the

numbers of heavy lorries per hour travelling eastbound and westbound on a certain

stretch of main road. It is anticipated that there will be some variation from hour to

hour, and it is thought that these variations might not be modelled by a Normal

distribution. Therefore, the Wilcoxon paired sample test is to be used to examine

whether, overall, the distributions of eastbound and westbound numbers can be

assumed to have the same location parameter. The data for a random sample of 12

hours are as follows:

Time    1   2   3   4   5   6   7   8   9   10  11  12
East    89  94  79  70  86  68  73  76  85  75  57  66
West    71  90  58  46  94  55  51  92  84  77  71  73

Carry out the test, at the 5% significance level. [MEI]

4. A psychologist is studying the possible effect of hypnosis on dieting and weight

loss. Nine people (who may be considered as a random sample from the population

under study) volunteer to take part in the experiment. Their weights are measured.

Then, under hypnosis, they are told that they will seldom feel hungry and will eat less

than usual. After a month, their weights are measured again. The results in kilograms

are as follows.

Person   Initial weight (kg)   Weight after one month (kg)
A        83.7                  81.5
B        83.9                  80.0
C        68.2                  68.8
D        74.9                  74.1
E        81.0                  82.6
F        72.8                  69.2
G        61.3                  63.4
H        77.9                  74.7
I        69.6                  66.2

(i) Use an appropriate Wilcoxon test to examine whether, overall, the mean weight has been reduced over the month, at the 5% level of significance.

(ii) Using only the information given in the question, why might the Wilcoxon test be more appropriate than the t test? [MEI adapted]

5. A therapist is studying the effect of a particular type of therapy on a phobic

reaction. A random sample of 10 patients is available. For each patient, the intensity

of the phobic reaction is measured, on a suitable scale, before and after therapy. The

results are as follows:

Patient   Intensity before therapy   Intensity after therapy
A         72                         66
B         59                         40
C         45                         58
D         87                         56
E         37                         15
F         64                         66
G         7                          15
H         75                         31
I         14                         3
J         50                         36

Use an appropriate Wilcoxon procedure to test, at the 5% level of significance,

whether the therapy on the whole reduces the intensity of the reaction, stating clearly

your null and alternative hypotheses. [MEI part]

6. An inspector is examining the lengths of time to complete various routine

tasks taken by employees who have been trained in two different ways. He wants to

examine whether the two methods lead, overall, to the same times. Ten different tasks

have been prepared. Each task is undertaken by a randomly selected employee who

has been trained by method A and by a randomly selected employee who has been

trained by method B. The times to completion, in minutes, are shown in the table.

Task                               1     2     3     4     5     6     7     8     9     10
Time taken by method A employee    18.2  17.6  19.4  17.9  18.9  17.4  18.5  19.0  18.9  17.2
Time taken by method B employee    19.0  17.1  19.6  19.0  18.8  18.9  18.8  19.7  18.3  18.4

(i) Explain why these data should be analysed by a paired samples test.

(ii) Use an appropriate Wilcoxon test to examine these data, stating clearly the

null and alternative hypotheses you are testing. Use a 5% significance level.

(MEI part)

KEY POINTS

• Paired (or matched) data arise when the same two measurements are taken for

the same or equivalent individuals.

• A paired sample test is carried out on the difference, D, between the two

measurements.

• For measurements $x_1, x_2, x_3, \ldots, x_n$ and $y_1, y_2, y_3, \ldots, y_n$, the differences are $d_1 = x_1 - y_1$, $d_2 = x_2 - y_2$, …, $d_n = x_n - y_n$

• For a t test of the null hypothesis that there is no difference in the means of the two populations, i.e. $\mu_D = 0$, the test statistic is $t = \frac{\bar{d}}{s/\sqrt{n}}$

• For the null hypothesis $\mu_D = k$, where there is a suspected difference between the 2 populations, the test statistic is $t = \frac{\bar{d} - k}{s/\sqrt{n}}$

• The t test may be used if the differences are Normally distributed and the

sample is selected at random

• If the differences are not Normally distributed, the Wilcoxon signed rank test

is a possible alternative to the t test

• To calculate the test statistic for the Wilcoxon signed rank test

- Calculate each paired difference, d1, d2, … , dn

- Ignore any items where the difference is 0, and reduce the sample size

accordingly

- Rank the differences, ignoring the signs (assign 1 to the lowest difference, 2 to the next etc.); tied differences share the average of the ranks they occupy

- Label each rank with its sign, according to the sign of the associated

difference

- Calculate W+, the sum of the ranks of the positive differences, and W-,

the sum of the ranks of the negative differences.

- Check that W+ + W- = $\frac{n(n+1)}{2}$

- Use the smaller of W+ and W- as the value of the test statistic, W.

- Look up in tables the critical value and make a comparison and

conclusion.

- The null hypothesis is rejected if the test statistic W is less than or

equal to the appropriate critical value.

Answers

? (Page 5)

This would indicate that the before and after populations have the same mean.

? (Page 5)

The approaches for the single sample t-test and the paired sample t test are very

similar.

In the paired sample t-test the variable used is the difference of the 2 variables that

were actually measured in the sample.

? (Page 6)

The data consist of married pairs.

? (Page 7)

The form of the alternative hypothesis, which claims that the difference between the

mean ages of the husbands and wives when they marry is not 4 years, indicates that a

2-tailed test is appropriate.

? (Page 8)

This is a reasonable assumption but may not always occur.

In order to check that the differences are Normally distributed you can conduct a chi-

squared test to see whether it is a good fit.