
Page 1: Chapter Eighteen

Smith/Davis (c) 2005 Prentice Hall

Chapter Eighteen

Data Transformations and Nonparametric Tests of Significance

PowerPoint Presentation created by Dr. Susan R. Burns, Morningside College

Page 2: Chapter Eighteen


Assumptions of Inferential Statistical Tests and the Fmax Test

The assumptions researchers make when they conduct parametric tests (i.e., tests used to estimate population parameters) are as follows:

– The populations from which you have drawn your samples are normally distributed (i.e., the populations have the shape of a normal curve).

– The populations from which you have drawn your samples have equivalent variances. Most researchers refer to this assumption as homogeneity of variance.

Page 3: Chapter Eighteen


Assumptions of Inferential Statistical Tests and the Fmax Test

If your samples are equal in size, you can use them to test the assumption that the population variances are equivalent (using the Fmax).

For example, assume that you conducted a multi-group experiment and obtained the following variances (n = 11): Group 1 s² = 22.14, Group 2 s² = 42.46, Group 3 s² = 34.82, Group 4 s² = 52.68. Select the smallest and largest variances and plug them into the Fmax formula.

To evaluate this statistic, you use the appropriate F table, entering at the n – 1 row and the k (number of samples in the study) column.
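
The Fmax statistic itself is simply the largest variance divided by the smallest variance. As a quick illustration (a minimal Python sketch, not from the textbook), using the four variances above:

# Hartley's Fmax: ratio of the largest group variance to the smallest.
variances = [22.14, 42.46, 34.82, 52.68]    # Group 1 through Group 4, n = 11 per group
f_max = max(variances) / min(variances)
print(round(f_max, 2))         # 52.68 / 22.14 = 2.38
print(11 - 1, len(variances))  # enter the Fmax table at row n - 1 = 10, column k = 4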

Page 4: Chapter Eighteen


Data Transformations

Using data transformations is one technique researchers use to deal with violations of the assumptions of inferential tests.

It is a legitimate procedure that involves the application of an accepted mathematical procedure to all scores in a data set. Common data transformations include:

– Square root – researcher calculates the square root of each score.

– Logarithmic – researcher calculates the logarithm of each score.

– Reciprocal – researcher calculates the reciprocal (i.e., 1/X) of each score.
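
As an illustration only (a minimal NumPy sketch with hypothetical scores, not data from the textbook; the log and reciprocal transformations assume all scores are positive):

import numpy as np

scores = np.array([4.0, 9.0, 16.0, 25.0, 49.0])   # hypothetical raw scores, all positive

sqrt_scores = np.sqrt(scores)    # square-root transformation
log_scores = np.log10(scores)    # logarithmic transformation (base 10)
recip_scores = 1.0 / scores      # reciprocal transformation (1/X)

print(sqrt_scores)    # [2. 3. 4. 5. 7.]
print(log_scores)     # roughly [0.602, 0.954, 1.204, 1.398, 1.690]
print(recip_scores)   # roughly [0.25, 0.111, 0.0625, 0.04, 0.0204]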

Page 5: Chapter Eighteen


Data Transformations

What Data Transformations Do and Do Not Do:

– Data transformations change certain characteristics of distributions.

– For example, they can have the effect of normalizing distributions or equalizing variances between (among) distributions.

– These transformations are desirable if you are in violation of the assumptions for inferential statistics.

– They do not alter the relative position of the data in the sample.

Page 6: Chapter Eighteen


Data Transformations

Using Data Transformations and a Caveat

– Assuming that your data transformation corrected the problem that prompted you to use it, you can proceed with your planned analysis.

– However, if you perform more than one type of analysis on your data, remember to use the transformed data for all of your analyses.

– Also, you will need to be careful when interpreting the results of your statistical analysis and drawing conclusions.

– Your interpretations and conclusions need to be stated in terms of the transformed data, not in terms of the original data you gathered.

– You really can only compare your data to other research projects that have used the same transformations.

Page 7: Chapter Eighteen


Nonparametric Tests of Significance

Nonparametric Tests of Significance are significance tests that do not attempt to estimate population parameters such as means and variances.

Page 8: Chapter Eighteen


Chi Square Tests

Chi-Square Tests are procedures that compare the fit or mismatch between two or more distributions.

– Chi Square Test for Two Levels of a Single Nominal Variable – When we have one nominal variable, there is an expected frequency distribution and an observed frequency distribution.

The expected frequency distribution is the anticipated distribution of frequencies into categories because of previous research results or a theoretical prediction.

The observed frequency distribution is the actual distribution of frequencies that you obtain when you conduct your research.

The Chi Square Test for Two Levels of a Single Nominal Variable is the simplest form of this test.

Page 9: Chapter Eighteen


Calculation of the Chi Square Test

Calculation of the chi-square test statistic is based on the following formula:

χ² = Σ[(O – E)²/E]

– Step 1. Subtract the appropriate expected value from each corresponding observed value. Square the difference and then divide this squared value by the expected value.

– Step 2. Sum all the products you calculated in Step 1.

To determine significance, you will need to calculate the degrees of freedom and then look up the critical value in the appropriate table. To calculate the degrees of freedom, use the following formula:

df = Number of Categories – 1
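
As a rough sketch of these two steps in Python (the function name and data layout are mine, not the textbook's):

def chi_square_gof(observed, expected):
    """Chi-square goodness of fit: Steps 1 and 2 of the procedure above."""
    chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))  # Steps 1 and 2
    df = len(observed) - 1                                              # number of categories - 1
    return chi_sq, df

print(chi_square_gof(observed=[30, 70], expected=[50, 50]))   # hypothetical frequencies -> (16.0, 1)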

Page 10: Chapter Eighteen


Chi Square Example

An example given in your textbook examines Democratic and Republican candidate preference in the current election.

A newspaper says that the two are absolutely even in terms of voter preferences.

You conduct a voter preference survey of 100 people in your town and find that 64 people prefer the Democratic candidate, whereas 36 prefer the Republican candidate.
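
With N = 100 and the newspaper's claim of an even split, the expected frequencies are 50 and 50. A quick check of the arithmetic with SciPy (standard scipy.stats usage, not code from the textbook):

from scipy.stats import chisquare

observed = [64, 36]   # preferences observed in the survey
expected = [50, 50]   # the "absolutely even" claim

result = chisquare(f_obs=observed, f_exp=expected)
print(result.statistic)   # (64 - 50)^2/50 + (36 - 50)^2/50 = 7.84
print(result.pvalue)      # about .005, so significant at the .05 level with df = 1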

Page 11: Chapter Eighteen


Chi Square Test

Writing up your result in a research report would look something like this:

χ²(df, N) = #.##, p < .05

A significant result with a chi-square test is interpreted differently from a significant result with a parametric inferential test.

When you have a significant chi square, that means that the two distributions you are testing are not similar.

In a two-group design, you would not be able to say anything about the two groups differing from each other.

It is the expected and observed distributions that are differing from each other.

Page 12: Chapter Eighteen


Chi-Square Test for More Than Two Levels of a Single Nominal Variable

Again, you are comparing an observed and an expected distribution when you are comparing more than two levels of a single nominal variable.

Thus, the appropriate formula still is as follows: χ² = Σ[(O – E)²/E]

– knowing that you may have different expected values (E) for each level.

When your nominal data has three or more levels/categories, it is not readily apparent where the discrepancy between the observed and expected frequencies exists.

Significant chi squares with more than two groups require follow-up tests.

– Most researchers use a series of smaller, more focused chi-square tests.
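
A sketch of the same idea with three levels, using entirely hypothetical observed and expected frequencies; the focused follow-up at the end is one common way to locate the discrepancy, with the expected values rescaled to the two categories being compared:

from scipy.stats import chisquare

# Hypothetical data: 90 shoppers choosing among three brands; the expected
# frequencies reflect a (hypothetical) prior market share of 50% / 30% / 20%.
observed = [60, 20, 10]
expected = [45, 27, 18]

overall = chisquare(f_obs=observed, f_exp=expected)
print(overall.statistic, overall.pvalue)   # df = 3 - 1 = 2

# Smaller, more focused follow-up on the first two levels only (n = 80 in those
# categories), with the expected values rescaled so they also sum to 80.
follow_obs = [60, 20]
follow_exp = [45 * 80 / 72, 27 * 80 / 72]   # 50.0 and 30.0
follow_up = chisquare(f_obs=follow_obs, f_exp=follow_exp)
print(follow_up.statistic, follow_up.pvalue)   # df = 2 - 1 = 1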

Page 13: Chapter Eighteen


Chi-Square Test for Two Nominal Variables

This situation is analogous to a multiple IV research design. When you have two nominal variables, you will display your data in a contingency table that shows the distribution of frequencies and totals for two nominal variables.

To find the expected frequency for any of the cells of your contingency table, use the following formula:

Expected Frequency = (row total × column total) / grand total

You still use the chi-square formula to test your contingency table: χ² = Σ[(O – E)²/E]

The calculation for degrees of freedom for a contingency table is as follows:

df = (number of rows – 1)(number of columns – 1)
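
A minimal sketch with a hypothetical 2 × 2 contingency table; scipy.stats.chi2_contingency derives each expected frequency from the row, column, and grand totals exactly as in the formula above (Yates' continuity correction is turned off so the result matches the hand formula):

from scipy.stats import chi2_contingency

# Hypothetical 2 x 2 table: rows = two groups, columns = two response categories.
observed = [[30, 20],
            [15, 35]]

chi2, p, df, expected = chi2_contingency(observed, correction=False)
print(expected)      # first cell: (row total 50 x column total 45) / grand total 100 = 22.5
print(chi2, df, p)   # df = (2 - 1)(2 - 1) = 1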

Page 14: Chapter Eighteen


Chi-Square Test for Two Nominal Variables

The interpretation of a chi square for a contingency table is a bit different from our interpretation for one nominal variable.

The chi square for a contingency table tests to determine if the distributions in the table have similar or different patterns.

If the chi square is not significant, then the distributions are similar (i.e., the distributions are independent).

– Another way of saying this is the pattern for one category is essentially the same as the pattern for the other level(s).

If the chi square is significant, then the distributions are dissimilar, and researchers say that they are dependent.

– Meaning, the pattern for one category is not the same as the pattern for the other level(s). The nature of the pattern on one nominal variable depends on the specific category or level of the other nominal variable.

These types of tests are also referred to as chi-square tests for independence.

Page 15: Chapter Eighteen


Rank Order Tests

Rank-Order Tests are non-parametric tests that are appropriate to use when you have ordinal data.

The underlying rationale for the ordinal-data tests involves ranking all the scores, disregarding specific group membership, and then comparing the ranks for the various groups.

– If the groups were drawn from the same population, then the ranks should not differ noticeably between (among) the groups.

– If the IV was effective, then the ranks will not be evenly distributed between (among) the groups; smaller ranks will be associated with one group and larger ranks will be associated with the other group(s).

Page 16: Chapter Eighteen


Mann-Whitney U Test

Mann-Whitney U Test is used with two independent groups that are relatively small (n = 20 or less).

Step 1. Rank all scores in your data set (disregarding group membership); the lowest score is assigned the rank of 1.

Step 2. Sum the ranks for each group.

Step 3. Compute a U value for each group according to the following formulas:

U1 = (n1)(n2) + n1(n1 + 1)/2 – ΣR1
U2 = (n1)(n2) + n2(n2 + 1)/2 – ΣR2

Where: ΣR1 = sum of ranks for Group 1; ΣR2 = sum of ranks for Group 2; n1 and n2 = number of participants in the appropriate group.

Page 17: Chapter Eighteen


Mann-Whitney U Test Example

You have designed a new method for teaching spelling to second graders.

You conduct an experiment to determine if your new method is superior to the method being used.

You randomly assign 12 second graders to two equal groups. You teach group 1 in the traditional manner, whereas group 2 learns to spell with your new method.

After two months, both groups complete the same 30-word spelling test.

Page 18: Chapter Eighteen


Mann-Whitney U Test

Step 4. Determine which U Value (U1 or U2) to use to test for significance. For a two-tailed test, you will use the smaller U. For a one-tailed test, you will use the U for the group you predict will have the larger sum of ranks.

Step 5. Obtain the critical U value from the appropriate Table.

Step 6. Compare your calculated U value to the critical value.
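
A rough Python sketch of Steps 1-4 using hypothetical spelling-test scores (the textbook's actual scores are not reproduced here); scipy.stats.rankdata assigns tied scores the average of the ranks involved:

from scipy.stats import rankdata

group1 = [18, 20, 22, 23, 25, 26]   # hypothetical scores, traditional method (n1 = 6)
group2 = [21, 24, 27, 28, 29, 30]   # hypothetical scores, new method (n2 = 6)
n1, n2 = len(group1), len(group2)

# Step 1: rank all scores together; the lowest score gets rank 1.
ranks = rankdata(group1 + group2)

# Step 2: sum the ranks for each group.
sum_r1 = ranks[:n1].sum()
sum_r2 = ranks[n1:].sum()

# Step 3: compute a U value for each group.
u1 = n1 * n2 + n1 * (n1 + 1) / 2 - sum_r1
u2 = n1 * n2 + n2 * (n2 + 1) / 2 - sum_r2

# Step 4: for a two-tailed test, the smaller U is compared with the tabled critical value.
print(sum_r1, sum_r2, u1, u2, min(u1, u2))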

Page 19: Chapter Eighteen


Rank Sums Test

Rank Sums Test is used when you have two independent groups and the n in one or both groups is greater than 20.

Step 1. Rank order all the scores in your data set (disregarding group membership). The lowest score is assigned the rank of 1.

Step 2. Select one group and sum the ranks (ΣR).

Step 3. Use the following formula to calculate the expected sum of ranks:

ΣRexp = n(N + 1)/2, where n = the number of participants in the selected group and N = the total number of participants.

Page 20: Chapter Eighteen


Rank Sums Test

Step 4. Use the ΣR and ΣRexp for the selected group to calculate the rank sums z statistic according to the following formula:

z rank sums = (ΣR – ΣRexp) / √[(n1)(n2)(N + 1)/12]

Step 5. Use the appropriate table to determine the critical z value for the .05 level. If you have a directional hypothesis (i.e., one-tailed test), you will find the z that occurs 45% from the mean. For non-directional (two-tailed) tests, you will find the z that occurs 47.5% from the mean. Remember, you split the alpha level equally for a two-tailed test.

Step 6. Disregard the sign of your z statistic and compare it to the critical value.

Step 7. Calculate an effect size: η² = (z rank sums)² / (N – 1)
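
A rough sketch of Steps 1-7 in Python, using randomly generated (hypothetical) scores for two groups of 25; the critical values 1.645 (one-tailed) and 1.96 (two-tailed) are the usual .05-level z values:

import numpy as np
from scipy.stats import rankdata

rng = np.random.default_rng(1)
group1 = rng.normal(50, 10, size=25)   # hypothetical scores, n1 = 25
group2 = rng.normal(55, 10, size=25)   # hypothetical scores, n2 = 25
n1, n2 = len(group1), len(group2)
N = n1 + n2

# Steps 1-2: rank everything together and sum the ranks for the selected group (group 1).
ranks = rankdata(np.concatenate([group1, group2]))
sum_r = ranks[:n1].sum()

# Step 3: expected sum of ranks for the selected group.
sum_r_exp = n1 * (N + 1) / 2

# Step 4: rank sums z statistic.
z = (sum_r - sum_r_exp) / np.sqrt(n1 * n2 * (N + 1) / 12)

# Steps 5-6: disregard the sign and compare with the two-tailed critical value.
significant = abs(z) >= 1.96

# Step 7: effect size.
eta_sq = z ** 2 / (N - 1)
print(round(z, 2), significant, round(eta_sq, 3))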

Page 21: Chapter Eighteen


Wilcoxon T Test

Wilcoxon T Test – is used when you have two related groups and ordinal data.

A researcher believes that viewing a video showing the benefits of recycling will result in more positive recycling attitudes and behaviors. Ten volunteers complete a recycling questionnaire (higher scores = more positive attitudes) before and after viewing the video.

Page 22: Chapter Eighteen


Wilcoxon T Test

Step 1. Find the difference between the scores for each pair. It doesn’t matter which score is subtracted, just be consistent throughout all the pairs.

Step 2. Assign ranks to all nonzero differences. (Disregard the sign of the difference. The smallest difference = rank 1. Tied differences receive the average of the ranks they are tied for.).

Step 3. Determine the ranks that are based on positive differences and the ranks that are based on negative differences.

Step 4. Sum the positive and negative ranks. These sums are the T values you will use to determine significance.

Page 23: Chapter Eighteen


Wilcoxon T Test

Step 5. If you have a non-directional hypothesis (two-tailed test), you will use the smaller of the two sums of ranks. If you have a directional hypothesis (one-tailed) you will have to determine which sum of ranks your hypothesis predicts will be smaller.

Step 6. N for the Wilcoxon T Test equals the number of nonzero differences.

Step 7. Use the appropriate table to check the critical value for your test. To be significant, the calculated value must be equal to or less than the table value.
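
A minimal sketch of Steps 1-6 with hypothetical before/after recycling-attitude scores (not the textbook's data); rankdata again gives tied differences the average of the ranks they occupy:

import numpy as np
from scipy.stats import rankdata

before = np.array([12, 15, 11, 18, 14, 13, 16, 10, 17, 15])   # hypothetical pre-video scores
after = np.array([16, 15, 14, 20, 13, 17, 19, 15, 18, 19])    # hypothetical post-video scores

# Step 1: difference for each pair (consistently after minus before).
diffs = after - before

# Step 2: rank the absolute values of the nonzero differences only.
nonzero = diffs[diffs != 0]
ranks = rankdata(np.abs(nonzero))

# Steps 3-4: sum the ranks attached to positive and to negative differences.
t_plus = ranks[nonzero > 0].sum()
t_minus = ranks[nonzero < 0].sum()

# Step 5: a two-tailed test uses the smaller sum; a one-tailed test predicting
# improvement would use the sum the hypothesis predicts to be smaller (t_minus here).
T = min(t_plus, t_minus)

# Step 6: N = number of nonzero differences.
N = len(nonzero)
print(t_plus, t_minus, T, N)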

Page 24: Chapter Eighteen


Kruskal-Wallis H Test

Kruskal-Wallis H Test is appropriate when you have more than two groups and ordinal data.

For example, a sport psychologist compared methods of teaching putting to beginning golfers. The methods were visual imagery, repetitive practice, and the combination of visual imagery and repetitive practice.

The researcher randomly assigned 21 equally inexperienced golfers to three equal groups.

Following equivalent amounts of training, each golfer attempted to sink 25 putts from the same location on the putting green.

The researcher rank-ordered (smaller rank = lower score) all the final performances.

Page 25: Chapter Eighteen


Kruskal-Wallis H Test

Page 26: Chapter Eighteen


Kruskal-Wallis H Test

Step 1. Rank all scores (1 = lowest score).

Step 2. Sum the ranks for each group/condition.

Step 3. Square the sum of ranks for each group/condition.

Step 4. Use the following formula to calculate the sum of squares between groups:

SSbg = Σ[(ΣR)²/n], where ΣR = the sum of ranks for a group and n = the number of participants in that group.

Page 27: Chapter Eighteen


Kruskal-Wallis H Test

Step 5. Use the following formula to calculate H:

H = [12 / (N(N + 1))] × SSbg – 3(N + 1), where N = the total number of participants.

Page 28: Chapter Eighteen


Kruskal-Wallis H Test

Step 6. Use the chi square table to find the critical value for H. The degrees of freedom for the H test = k – 1, where k is the number of groups/conditions.

Step 7. Calculate an effect size:

Step 8. If appropriate, conduct post hoc tests. The procedure for conducting such tests is similar to conducting follow-up tests when you have nominal data. In short, you devise simpler follow-up analyses.
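
Finally, a rough Python sketch of the whole H-test procedure, using hypothetical putting scores for the three groups of seven golfers; the last line cross-checks the hand formula against scipy.stats.kruskal (the two agree here because there are no tied scores):

import numpy as np
from scipy.stats import rankdata, kruskal, chi2

imagery = [6, 9, 7, 11, 8, 10, 5]        # hypothetical putts sunk out of 25
practice = [12, 14, 13, 16, 15, 18, 17]
combined = [20, 22, 19, 23, 21, 25, 24]

groups = [imagery, practice, combined]
all_scores = np.concatenate(groups)
N = len(all_scores)

# Steps 1-3: rank all scores together, then sum the ranks within each group.
ranks = rankdata(all_scores)
rank_sums, start = [], 0
for g in groups:
    rank_sums.append(ranks[start:start + len(g)].sum())
    start += len(g)

# Step 4: "sum of squares between groups" term, Σ[(ΣR)²/n].
ss_bg = sum(r ** 2 / len(g) for r, g in zip(rank_sums, groups))

# Step 5: H statistic.
H = (12 / (N * (N + 1))) * ss_bg - 3 * (N + 1)

# Step 6: compare H with the chi-square critical value, df = k - 1.
critical = chi2.ppf(0.95, df=len(groups) - 1)
print(round(H, 2), round(critical, 2), H >= critical)

# Cross-check with SciPy's built-in Kruskal-Wallis test.
print(kruskal(imagery, practice, combined).statistic)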