T12 non-parametric tests


Nonparametric Tests

By Rama Krishna Kompella

Key Terms

• Power of a test: the probability of rejecting a false null hypothesis (that is, of detecting a relationship when one exists).

• Power efficiency: the power of the test relative to that of its most powerful alternative. For example, suppose a nonparametric test for a difference of means has a power efficiency of 0.9 at sample size 10. If the data are interval-scaled and the normality assumption holds, so that the more powerful t-test applies, a t-test with a sample size of 9 achieves the same power.
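The arithmetic behind this example can be written out as below, assuming power efficiency is read as the ratio of the sample sizes needed for equal power. The symbols n_t (t-test sample size) and n_np (nonparametric sample size) are introduced here for illustration and do not appear in the original slides.

```latex
\text{power efficiency} = \frac{n_{t}}{n_{np}} = \frac{9}{10} = 0.9
```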

Choice of nonparametric test

• The choice depends on the level of measurement (nominal, ordinal, or interval), the power of the test, whether the samples are related or independent, the number of samples, and the availability of software support (e.g. SPSS).

• Related samples are usually matched-pair samples (using randomization) or before-after samples.

• Other cases are usually treated as independent samples. For instance, a survey using random sampling yields a sub-sample of males and a sub-sample of females. These can be treated as independent samples because all respondents were randomly selected.

Non-parametric Tests

• Sign Test: paired data

• Mann-Whitney U Test: 2 independent samples

• Kruskal-Wallis Test: more than 2 independent samples

Sign Test

• Used for paired data
    – Can be ordinal or continuous

• Very simple and easy to interpret

• Makes no assumptions about the distribution of the data

• Not very powerful

Sign Test: null hypothesis

• The null hypothesis for the sign test is H0: the median difference is zero.

• To evaluate H0 we only need to know the signs of the differences.
    – If about half the differences are positive and half are negative, the data are consistent with a median difference of 0 (H0).
    – If the signs are more unbalanced, that is evidence against H0.

Example: Body image data

Child   Rating before   Rating after   Change   Sign
1       1               5               4       +
2       1               4               3       +
3       3               1              -2       -
4       2               3               1       +
5       4               4               0       0
6       1               4               3       +
7       3               5               2       +
8       1               5               4       +
9       1               4               3       +
10      4               4               0       0
11      1               1               0       0
12      1               4               3       +
13      1               4               3       +
14      2               4               2       +
15      1               4               3       +
16      2               5               3       +
17      1               4               3       +
18      1               5               4       +
19      4               4               0       0
20      3               5               2       +

• The sign test looks at the signs of the differences:
    – 15 children felt better about their teeth (+ difference in ratings)
    – 1 child felt worse (- difference)
    – 4 children felt the same (difference = 0)

• This looks like good evidence against H0.
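To put a number on "good evidence", the counts above can be fed into an exact binomial calculation. The following is a minimal sketch, assuming the common convention of dropping zero differences and using a two-sided p-value; the function name `sign_test` is hypothetical, and the ratings are copied from the body image table.

```python
from math import comb

# Ratings copied from the body image example (children 1 to 20).
before = [1, 1, 3, 2, 4, 1, 3, 1, 1, 4, 1, 1, 1, 2, 1, 2, 1, 1, 4, 3]
after = [5, 4, 1, 3, 4, 4, 5, 5, 4, 4, 1, 4, 4, 4, 4, 5, 4, 5, 4, 5]


def sign_test(before, after):
    """Return (plus, minus, ties, two-sided exact p-value)."""
    diffs = [a - b for b, a in zip(before, after)]
    plus = sum(1 for d in diffs if d > 0)
    minus = sum(1 for d in diffs if d < 0)
    ties = sum(1 for d in diffs if d == 0)
    m = plus + minus          # zero differences are dropped (assumed convention)
    k = min(plus, minus)      # the rarer sign
    # P(X <= k) for X ~ Binomial(m, 0.5), doubled for a two-sided test.
    tail = sum(comb(m, i) for i in range(k + 1)) / 2 ** m
    return plus, minus, ties, min(1.0, 2 * tail)


plus, minus, ties, p_value = sign_test(before, after)
print(plus, minus, ties, p_value)
```

With 15 positive and 1 negative difference out of 16 non-zero pairs, the two-sided p-value is 34/65536, about 0.0005, which supports the informal reading of the table.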

Mann-Whitney U test

• The Mann-Whitney U test, also called the rank sum test, is a non-parametric test that compares two independent (unmatched) groups.

• It is appropriate when the data are at the ordinal level, or at the interval/ratio level but not normally distributed.

• The test statistic is the U statistic. Use this test when the assumptions of the t-test cannot be met.


• It applies when the assumption of normality or of equal variances is not met.

• Like many non-parametric tests, it uses the ranks of the data rather than their raw values to calculate the statistic.

• Because the test makes no distributional assumption, it is less powerful than the t-test when the t-test's assumptions do hold.


The hypotheses for the comparison of two independent groups are:

• H0: The two samples come from identical populations (the rank sums are similar)

• Ha: The two samples come from different populations (the rank sums are different)

Procedure for Mann-Whitney U-Test

1. Choose the Mann-Whitney test.

2. State the null and alternative hypotheses.

3. Assign ranks to all the scores in the experiment.

4. Compute the sum of the ranks for each group.

5. Compute the two versions of the Mann-Whitney U. First compute U1 for Group 1 using the formula:

$$U_1 = n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - \sum R_1$$

Next compute U2 for Group 2 using the formula:

$$U_2 = n_1 n_2 + \frac{n_2 (n_2 + 1)}{2} - \sum R_2$$

6. Determine the obtained Mann-Whitney U (Uobt), the smaller of U1 and U2.

7. Find the critical value (Ucrit).

8. Compare Uobt to Ucrit.

9. Interpret and make a decision.

10. Draw a conclusion.

Solution:

• H0: There is no difference in the test scores for the two types of classes.

• H1: There is a difference in the test scores for the two types of classes (claim).

$$U_1 = n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - \sum R_1 = (18)(13) + \frac{(18)(19)}{2} - 208 = 234 + 171 - 208 = 197$$

$$U_2 = n_2 n_1 + \frac{n_2 (n_2 + 1)}{2} - \sum R_2 = (13)(18) + \frac{(13)(14)}{2} - 288 = 234 + 91 - 288 = 37$$

To check the computation of U: U1 + U2 = n1 n2, and 197 + 37 = (18)(13) = 234. It checks out, and because U is the smaller of U1 and U2, Uobt = 37.

Critical value: using n1 = 18 and n2 = 13 at α = 0.05, the critical value is 67.

Reject the null hypothesis, since Uobt = 37 is less than Ucrit = 67.

There is a difference between the two classes on the algebra readiness test.
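The U computation in this solution can be reproduced in a few lines. This is a minimal sketch rather than a full test: the function names (`average_ranks`, `u_from_rank_sums`, `mann_whitney_u`) are hypothetical, the small raw-score lists are made up for illustration, and only the group sizes and rank sums (18, 13, 208, 288) come from the worked example.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def u_from_rank_sums(n1, n2, r1, r2):
    """Return (U1, U2, U) where U is the smaller of U1 and U2."""
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    return u1, u2, min(u1, u2)


def mann_whitney_u(group1, group2):
    """Rank the pooled scores, sum the ranks per group, then compute U."""
    ranks = average_ranks(list(group1) + list(group2))
    n1, n2 = len(group1), len(group2)
    return u_from_rank_sums(n1, n2, sum(ranks[:n1]), sum(ranks[n1:]))


# Rank sums and group sizes from the worked solution.
slide_result = u_from_rank_sums(18, 13, 208, 288)
# Hypothetical raw scores, only to show the ranking step.
toy_result = mann_whitney_u([3, 5, 8], [1, 2, 4, 9])
print(slide_result, toy_result)
```

In both cases U1 + U2 equals n1 times n2, which is the same check used in the solution. The critical value still has to come from a Mann-Whitney table or a statistics package.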

The Kruskal-Wallis H Test

• The Kruskal-Wallis H Test is a nonparametric procedure that can be used to compare more than two populations in a completely randomized design.

• All n = n1 + n2 + … + nk measurements are jointly ranked (i.e. treated as one large sample).

• We use the sums of the ranks of the k samples to compare the distributions.

• Rank the total measurements in all k samples from 1 to n. Tied observations are assigned the average of the ranks they would have received if not tied.

• Calculate Ti = rank sum for the ith sample, i = 1, 2, …, k.

• Then compute the test statistic

$$H = \frac{12}{n(n+1)} \sum_{i=1}^{k} \frac{T_i^2}{n_i} - 3(n+1)$$


H0: the k distributions are identical versus

Ha: at least one distribution is different

Test statistic: Kruskal-Wallis H

When H0 is true, the test statistic H has an approximate chi-square distribution with df = k-1.

Use a right-tailed rejection region or p-value based on the Chi-square distribution.
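The joint ranking and the H statistic can be sketched as follows. This is an illustrative implementation under stated assumptions: the function names are hypothetical, the three small samples are invented, and no tie correction is applied to H beyond average ranks.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """H = 12 / (n (n + 1)) * sum(T_i^2 / n_i) - 3 (n + 1)."""
    pooled = [v for g in groups for v in g]
    ranks = average_ranks(pooled)      # joint ranking of all n measurements
    n = len(pooled)
    total = 0.0
    start = 0
    for g in groups:
        t_i = sum(ranks[start:start + len(g)])  # rank sum T_i for this sample
        total += t_i ** 2 / len(g)
        start += len(g)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)


# Hypothetical data: three fully separated samples of size 3.
h_stat = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(h_stat)
```

For these invented samples the rank sums are 6, 15 and 24, giving H = 7.2. With k = 3 the reference distribution has df = 2, whose upper 5% point is 5.991, so this toy H would fall in the right-tailed rejection region.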

Spearman’s Rank Correlation

• Spearman's Rank Correlation is a technique used to test the direction and strength of the relationship between two variables.

• In other words, it is a way to show whether one set of numbers is associated with another set of numbers.

• It uses the statistic Rs, which falls between -1 and +1.

Procedure for using Spearman's Rank Correlation

1. State the null hypothesis, i.e. "There is no relationship between the two sets of data."

2. Rank both sets of data from the highest to the lowest.

3. Make sure to check for tied ranks.

4. Subtract the two sets of ranks to get the difference d.

5. Square the values of d.

6. Add the squared values of d to get Σd².

7. Use the formula

$$R_s = 1 - \frac{6 \sum d^2}{n^3 - n}$$

where n is the number of ranked pairs.

8. Interpret the Rs value:
    – -1: a perfect negative correlation
    – between -1 and -0.5: a strong negative correlation
    – between -0.5 and 0: a weak negative correlation
    – 0: no correlation
    – between 0 and 0.5: a weak positive correlation
    – between 0.5 and 1: a strong positive correlation
    – 1: a perfect positive correlation between the 2 sets of data

If the magnitude of Rs is smaller than the critical value for n at the chosen significance level, the null hypothesis is not rejected. Otherwise it is rejected.
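Steps 2 to 7 can be sketched in code as below. The function names and the two short data sets are hypothetical, and the Σd² formula is exact only when there are no ties; with ties it is an approximation.

```python
def ranks_high_to_low(values):
    """Rank 1 = highest value; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rs(x, y):
    """Rs = 1 - 6 * sum(d^2) / (n^3 - n) for paired observations x, y."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of observations")
    n = len(x)
    rx, ry = ranks_high_to_low(x), ranks_high_to_low(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * sum_d2 / (n ** 3 - n)


# Hypothetical data sets for illustration.
rs_reversed = spearman_rs([1, 2, 3, 4, 5], [5, 4, 3, 2, 1])
rs_mixed = spearman_rs([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
print(rs_reversed, rs_mixed)
```

The first pair is perfectly reversed, so Rs is -1. In the second, Σd² = 4 and n = 5, so Rs = 1 - 24/120 = 0.8. Ranking from lowest to highest instead would give the same d² values, as long as both variables are ranked in the same direction.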

Run Test for Randomness

• The run test is used to examine whether a set of observations constitutes a random sample from an infinite population.

• A run is a maximal series of consecutive identical symbols; for numeric data, a series of increasing values or a series of decreasing values.

• For example, the males and females in a line can have patterns such as M F M F M F M F and M M M M F F F F, which have 8 and 2 runs, respectively


• Hypotheses: first set up the null and alternative hypotheses.

• In the run test of randomness, the null hypothesis is that the sequence of observations is random. The alternative hypothesis is that it is not.


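• The second step is the calculation of the mean and variance of the number of runs R. The standard formulas for the runs test are

$$\mu_R = \frac{2 N_1 N_2}{N} + 1, \qquad \sigma_R^2 = \frac{2 N_1 N_2 (2 N_1 N_2 - N)}{N^2 (N - 1)}$$

• where N = total number of observations = N1 + N2, N1 = number of + symbols, N2 = number of – symbols, and R = number of runs.

• If Rc(lower) <= R <= Rc(upper), do not reject H0. Otherwise reject H0.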
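Counting runs and checking R against a lower and upper bound can be sketched as follows. Several things here are assumptions rather than content from the slides: the function names, the use of the standard mean and variance formulas for the number of runs, and the large-sample normal approximation with z = 1.96 to obtain the critical bounds. For samples as small as the two example lines, exact tables are preferable.

```python
from math import sqrt


def count_runs(symbols):
    """Count maximal blocks of consecutive identical symbols."""
    if not symbols:
        return 0
    runs = 1
    for prev, cur in zip(symbols, symbols[1:]):
        if cur != prev:
            runs += 1
    return runs


def runs_mean_variance(n1, n2):
    """Mean and variance of the number of runs under randomness."""
    n = n1 + n2
    mean = 2 * n1 * n2 / n + 1
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    return mean, variance


def runs_test(symbols, first_symbol, z_crit=1.96):
    """Return (R, lower bound, upper bound, True if H0 is not rejected).

    The bounds use a normal approximation (an assumption of this sketch).
    """
    n1 = sum(1 for s in symbols if s == first_symbol)
    n2 = len(symbols) - n1
    r = count_runs(symbols)
    mean, variance = runs_mean_variance(n1, n2)
    sd = sqrt(variance)
    lower, upper = mean - z_crit * sd, mean + z_crit * sd
    return r, lower, upper, lower <= r <= upper


# The two example lines of males and females.
alternating = runs_test("MFMFMFMF", "M")
blocked = runs_test("MMMMFFFF", "M")
print(alternating, blocked)
```

The two sequences have 8 and 2 runs. With N1 = N2 = 4 the expected number of runs is 5, so both the strictly alternating and the fully blocked line sit at the extremes, one with too many runs and one with too few.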

Cox-Stuart Test for Trend

• This test is useful for detecting positively or negatively sloping gradual trends in a sequence of independent measurements on a single random variable.

• One of three alternative hypotheses is possible:
    – An upward or downward trend exists
    – An upward trend exists
    – A downward trend exists

• If the null hypothesis is not rejected, the result is consistent with the measurements within the ordered sequence being identically distributed.
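The slides do not spell out the mechanics, so the following is a sketch of the usual form of the Cox-Stuart test: pair each observation in the first half of the sequence with the corresponding one in the second half, then apply a sign test to the differences. The function name and the example sequence are hypothetical, and the two-sided p-value corresponds to the "upward or downward trend" alternative.

```python
from math import comb


def cox_stuart(values):
    """Return (plus, minus, two-sided exact p-value) for a trend test."""
    n = len(values)
    c = n // 2
    first = values[:c]
    second = values[n - c:]   # the middle observation is skipped when n is odd
    diffs = [b - a for a, b in zip(first, second)]
    plus = sum(1 for d in diffs if d > 0)    # later value larger
    minus = sum(1 for d in diffs if d < 0)   # later value smaller
    m = plus + minus                          # zero differences are dropped
    k = min(plus, minus)
    # P(X <= k) for X ~ Binomial(m, 0.5), doubled for a two-sided test.
    tail = sum(comb(m, i) for i in range(k + 1)) / 2 ** m
    return plus, minus, min(1.0, 2 * tail)


# Hypothetical strictly increasing sequence of 10 measurements.
trend_result = cox_stuart([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
print(trend_result)
```

Ten observations give only five pairs, so even a perfectly increasing sequence yields a two-sided p-value of 2 × (1/2)^5 = 0.0625. The test needs reasonably long sequences to detect a trend.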

Kolmogorov-Smirnov Test

• The K-S test remains appropriate when the two samples have unequal numbers of observations.

• It is used to test whether there is any significant difference between two treatments A and B.

• The test hypotheses are:
    – H0: No difference in the effect of treatments A and B
    – H1: There is some difference in the effect of treatments A and B
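The slides do not give the test statistic. In its usual two-sample form, the K-S statistic D is the largest absolute difference between the two empirical cumulative distribution functions. The sketch below computes only D, under that assumption; the function names and the sample values are hypothetical, and critical values or p-values would come from tables or a statistics package.

```python
def ecdf(sample, x):
    """Empirical CDF: fraction of the sample that is <= x."""
    return sum(1 for v in sample if v <= x) / len(sample)


def ks_two_sample_statistic(a, b):
    """D = max over pooled points of |F_a(x) - F_b(x)|."""
    points = sorted(set(a) | set(b))
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in points)


# Hypothetical responses under treatments A and B (unequal sample sizes).
treatment_a = [1, 3]
treatment_b = [2, 4, 5]
d_stat = ks_two_sample_statistic(treatment_a, treatment_b)
print(d_stat)
```

Because D is built from proportions rather than counts, nothing in the calculation requires the two samples to be the same size.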

Questions?
