B AD 6243: Applied Univariate Statistics Non-Parametric Statistics Professor Laku Chidambaram Price College of Business University of Oklahoma


B AD 6243: Applied Univariate Statistics

Non-Parametric Statistics

Professor Laku Chidambaram

Price College of Business

University of Oklahoma


Using Non-Parametric Statistics

• Non-normal distribution of data
  – Tests referred to as “distribution free” (or sometimes “assumption free”) tests
• Small sample size
• Measurement issues
  – Dependent variables are nominal or ordinal
• Tests are generally less powerful than their parametric counterparts
  – Intent is not to estimate population parameter per se
• Involves testing differences and relationships


A Guide to Testing Differences

| Nature of DV / Sample Type | Nominal | Ordinal | Interval |
|---|---|---|---|
| 2 Independent Samples | Chi-square Test | Mann-Whitney U Test | Independent Samples T-test |
| 2 Related Samples | -- | Wilcoxon Matched Pairs Test | Paired Samples T-test |
| k Independent Samples | Chi-square Test | Kruskal-Wallis Test | One-way ANOVA |
| k x k Independent Samples | Contingency Analysis (Crosstabs) | -- | Factorial ANOVA |


The Chi-Square Distribution

• The chi-square distribution refers to a family of distributions (derived from the normal distribution) with one parameter, k, the degrees of freedom
• The distribution is positively skewed but becomes increasingly symmetric as k increases
• The mean and variance of the chi-square distribution also increase as k increases
• The mean = k and variance = 2k
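The mean and variance properties can be checked numerically. The sketch below relies on the standard result that a chi-square variable with k degrees of freedom is the sum of k squared independent standard normal draws. The function name `chi_square_draws`, the seed, and the number of draws are illustrative choices, not part of the course material.

```python
import random
import statistics


def chi_square_draws(k, n_draws=20000, seed=0):
    """Simulate chi-square(k) values as sums of k squared standard normals."""
    rng = random.Random(seed)
    return [
        sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))
        for _ in range(n_draws)
    ]


for k in (2, 5, 10):
    draws = chi_square_draws(k)
    # The sample mean should be close to k and the sample variance close to 2k.
    print(k, round(statistics.fmean(draws), 2), round(statistics.variance(draws), 2))
```

Because this is a simulation, the printed values only approximate k and 2k; increasing `n_draws` tightens the agreement.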


The Chi-square Test

• The Chi-square Test is based on the chi-square distribution
• It evaluates the goodness-of-fit of the observed frequencies (O) with the expected frequencies (E) in various categories
• The Chi-square statistic (shown below) helps determine whether differences between the observed and expected frequencies in the sample represent “real” or random differences

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$


An Example

H0: p_Marketing = p_Management = p_Finance = p_MIS

H1: At least one pair is not equal

Is there an equal proportion of majors in the PCB?

| Majors | Observed | Expected | O-E | (O-E)^2 | (O-E)^2/E |
|---|---|---|---|---|---|
| Marketing | 140 | 100 | 40 | 1600 | 16 |
| Management | 120 | 100 | 20 | 400 | 4 |
| Finance | 90 | 100 | -10 | 100 | 1 |
| MIS | 50 | 100 | -50 | 2500 | 25 |
| SUM | 400 | 400 | | | 46 |

Chi-square (calc) = 46, df = 3

Chi-square (crit) = 7.815 (alpha = 0.05), df = 3

Since 46 > 7.815, reject H0.

(Case of the k independent samples)
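The arithmetic in this example can be reproduced with a few lines of Python. This is a minimal sketch: the function name `chi_square_gof` is hypothetical, and the critical value is copied from the slide rather than computed.

```python
def chi_square_gof(observed, expected=None):
    """Return (statistic, df) for a chi-square goodness-of-fit test.

    If expected is None, equal frequencies across categories are assumed.
    """
    total = sum(observed)
    if expected is None:
        expected = [total / len(observed)] * len(observed)
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, len(observed) - 1


# Observed counts from the example: Marketing, Management, Finance, MIS.
observed = [140, 120, 90, 50]
statistic, df = chi_square_gof(observed)

CRITICAL_VALUE = 7.815  # taken from the slide (alpha = 0.05, df = 3)
print(statistic, df, statistic > CRITICAL_VALUE)
```

Passing an explicit `expected` list covers the case where unequal frequencies are hypothesized.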


Notes on the Chi-square Test

• Same approach as before applies when unequal frequencies are expected
• In the case of the chi-square test for two independent samples, the expected frequency in each cell should be at least 5
• In the case of the chi-square test for n independent samples, the expected frequency should not be less than 5 in more than 20% of the cells
• Where the above situation arises, you should consider combining categories
• Observations in all cases should be independent
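The expected-frequency rules of thumb above can be expressed as a small check. This is only a sketch: the function names and the expected counts in the usage lines are hypothetical, chosen to illustrate both rules.

```python
def low_expected_share(expected, threshold=5):
    """Return the share of cells whose expected frequency is below threshold."""
    low = sum(1 for e in expected if e < threshold)
    return low / len(expected)


def expected_counts_ok(expected, n_samples):
    """Apply the rules of thumb to a flat list of expected cell counts."""
    if n_samples == 2:
        # Two independent samples: every cell should have E >= 5.
        return low_expected_share(expected) == 0
    # More than two samples: at most 20% of cells may have E < 5.
    return low_expected_share(expected) <= 0.20


# Hypothetical expected counts, for illustration only.
print(expected_counts_ok([12.0, 8.5, 4.2, 9.3], n_samples=2))
print(expected_counts_ok([12.0, 8.5, 4.2, 9.3, 6.1, 7.7], n_samples=3))
```

When the check fails, the notes suggest combining categories and recomputing the expected frequencies.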


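Contingency Analysis (Crosstabs)

| Day | Male | Female | Total/Day | E: Male/Day | E: Fem/Day | M: (O-E) | F: (O-E) | M: (O-E)^2 | F: (O-E)^2 | M: Chi-sq | F: Chi-sq |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 140 | 120 | 260 | 135.65 | 124.35 | 4.35 | -4.35 | 18.90 | 18.90 | 0.14 | 0.15 |
| 2 | 80 | 85 | 165 | 86.09 | 78.91 | -6.09 | 6.09 | 37.05 | 37.05 | 0.43 | 0.47 |
| 3 | 90 | 100 | 190 | 99.13 | 90.87 | -9.13 | 9.13 | 83.36 | 83.36 | 0.84 | 0.92 |
| 4 | 100 | 105 | 205 | 106.96 | 98.04 | -6.96 | 6.96 | 48.39 | 48.39 | 0.45 | 0.49 |
| 5 | 190 | 140 | 330 | 172.17 | 157.83 | 17.83 | -17.83 | 317.77 | 317.77 | 1.85 | 2.01 |
| Total | 600 | 550 | | | | | | | | 3.71 | 4.05 |

Chi-square (calc) = 7.75, df = (5-1)(2-1) = 4

Chi-square (crit) = 9.49 (alpha = 0.05), df = 4

Since 7.75 < 9.49, the relationship is not significant at alpha = 0.05.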

(Case of the k x k samples)

Is there a relationship between gender and when students are absent from classes?
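The crosstab calculation can be sketched in Python as follows. Each expected count is the row total times the column total divided by the grand total. The function name `chi_square_independence` is hypothetical, and the critical value is copied from the slide rather than computed.

```python
def chi_square_independence(table):
    """Return (statistic, df) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence: row total * column total / N.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, df


# Rows are days; columns are (male, female) absences, as on the slide.
absences = [(140, 120), (80, 85), (90, 100), (100, 105), (190, 140)]
statistic, df = chi_square_independence(absences)

CRITICAL_VALUE = 9.49  # taken from the slide (alpha = 0.05, df = 4)
print(round(statistic, 2), df, statistic > CRITICAL_VALUE)
```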


Mann-Whitney U Test

Descriptive Statistics

| | N | Mean | Std. Deviation | Minimum | Maximum |
|---|---|---|---|---|---|
| ADMITS | 20 | 16.85 | 7.916 | 1 | 30 |
| YEAR | 20 | .50 | .513 | 0 | 1 |

Ranks (ADMITS)

| YEAR | N | Mean Rank | Sum of Ranks |
|---|---|---|---|
| Year 2000 | 10 | 7.35 | 73.50 |
| Year 2001 | 10 | 13.65 | 136.50 |
| Total | 20 | | |

Test Statistics (b)

| | ADMITS |
|---|---|
| Mann-Whitney U | 18.500 |
| Wilcoxon W | 73.500 |
| Z | -2.386 |
| Asymp. Sig. (2-tailed) | .017 |
| Exact Sig. [2*(1-tailed Sig.)] | .015 (a) |

a. Not corrected for ties.
b. Grouping Variable: YEAR

Is there a difference in the average rank of PhD admits who matriculated in 2000 vs. 2001?

(Case of the 2 independent samples)
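The raw admissions data are not shown, but U and an approximate Z can be recovered from the rank sums in the output. The sketch below uses the standard relation U = R1 - n1(n1 + 1)/2 and the normal approximation without a tie correction; the function name is hypothetical. The uncorrected Z comes out near -2.38, so the reported -2.386 presumably reflects a tie correction.

```python
import math


def mann_whitney_from_rank_sum(rank_sum_1, n1, n2):
    """Return (U, z) given the rank sum of group 1 and both group sizes.

    z uses the normal approximation without a correction for ties.
    """
    u1 = rank_sum_1 - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    u = min(u1, u2)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return u, (u - mean_u) / sd_u


# Year 2000 group from the Ranks output: N = 10, sum of ranks = 73.50.
u, z = mann_whitney_from_rank_sum(73.5, n1=10, n2=10)
p_two_tailed = math.erfc(abs(z) / math.sqrt(2))  # two-tailed normal p-value
print(u, round(z, 3), round(p_two_tailed, 3))
```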


Wilcoxon Matched Pairs Test

Where are they now: Is there a difference between the ATP rankings of the top ten seeded tennis players in 2000 and 2003?

(Case of the 2 related samples)

| Rank2000 | Rank2003 |
|---|---|
| 1 | 2 |
| 4 | 9 |
| 5 | 10 |
| 6 | 7 |
| 3 | 1 |
| 2 | 20 |
| 8 | 3 |
| 7 | 18 |
| 9 | 15 |
| 10 | 19 |

Ranks (RANK2003 - RANK2000)

| | N | Mean Rank | Sum of Ranks |
|---|---|---|---|
| Negative Ranks | 2 (a) | 4.00 | 8.00 |
| Positive Ranks | 8 (b) | 5.88 | 47.00 |
| Ties | 0 (c) | | |
| Total | 10 | | |

a. RANK2003 < RANK2000
b. RANK2003 > RANK2000
c. RANK2003 = RANK2000

Test Statistics (b)

| | RANK2003 - RANK2000 |
|---|---|
| Z | -1.994 (a) |
| Asymp. Sig. (2-tailed) | .046 |

a. Based on negative ranks.
b. Wilcoxon Signed Ranks Test
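Since the ten rank pairs are given, the signed-rank sums and Z can be recomputed directly. This is a minimal sketch, assuming midranks for tied absolute differences and the usual tie-corrected variance in the normal approximation; the function and variable names are hypothetical.

```python
import math
from collections import Counter


def wilcoxon_signed_rank(before, after):
    """Return (W_minus, W_plus, z, p) for paired samples.

    Zero differences are dropped; tied absolute differences get midranks,
    and the variance of the normal approximation is corrected for ties.
    """
    diffs = [a - b for b, a in zip(before, after) if a != b]
    n = len(diffs)
    tie_counts = Counter(abs(d) for d in diffs)

    # Assign each distinct absolute difference the average of its rank positions.
    midrank = {}
    position = 1
    for value in sorted(tie_counts):
        count = tie_counts[value]
        midrank[value] = position + (count - 1) / 2
        position += count

    w_minus = sum(midrank[abs(d)] for d in diffs if d < 0)
    w_plus = sum(midrank[abs(d)] for d in diffs if d > 0)

    mean_w = n * (n + 1) / 4
    tie_term = sum(t ** 3 - t for t in tie_counts.values()) / 48
    sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24 - tie_term)
    z = (min(w_minus, w_plus) - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2))  # two-tailed normal p-value
    return w_minus, w_plus, z, p


rank_2000 = [1, 4, 5, 6, 3, 2, 8, 7, 9, 10]
rank_2003 = [2, 9, 10, 7, 1, 20, 3, 18, 15, 19]
w_minus, w_plus, z, p = wilcoxon_signed_rank(rank_2000, rank_2003)
print(w_minus, w_plus, round(z, 3), round(p, 3))
```

The rank sums, Z, and two-tailed significance should agree with the SPSS output for this slide.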


Kruskal-Wallis Test

Descriptive Statistics

| | N | Mean | Std. Deviation | Minimum | Maximum |
|---|---|---|---|---|---|
| Scholar ranks | 100 | 55.5800 | 22.51293 | 12.00 | 99.00 |
| University | 100 | 1.00 | .816 | 0 | 2 |

Ranks (Scholar ranks)

| University | N | Mean Rank |
|---|---|---|
| OU | 33 | 35.91 |
| OSU | 34 | 60.96 |
| Other | 33 | 54.32 |
| Total | 100 | |

Test Statistics (a, b)

| | Scholar ranks |
|---|---|
| Chi-Square | 13.341 |
| df | 2 |
| Asymp. Sig. | .001 |

a. Kruskal Wallis Test
b. Grouping Variable: University

Is there a difference among the average rankings of National Merit Scholars admitted to schools of business in the state?

(Case of the k independent samples)
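The test statistic can be approximated from the group sizes and mean ranks in the output, using H = 12 / (N(N + 1)) * sum of n_i (mean rank_i - (N + 1)/2)^2. The sketch below applies no tie correction and works from rounded mean ranks, so it lands near, not exactly on, the reported 13.341. The function name is hypothetical, and the p-value shortcut exp(-H/2) holds only for df = 2.

```python
import math


def kruskal_wallis_from_mean_ranks(groups):
    """Return (H, df, p) from (n, mean_rank) pairs, without a tie correction.

    p is computed only when df == 2, where the chi-square survival
    function reduces to exp(-H / 2); otherwise p is None.
    """
    total_n = sum(n for n, _ in groups)
    grand_mean_rank = (total_n + 1) / 2
    h = 12 / (total_n * (total_n + 1)) * sum(
        n * (mean_rank - grand_mean_rank) ** 2 for n, mean_rank in groups
    )
    df = len(groups) - 1
    p = math.exp(-h / 2) if df == 2 else None
    return h, df, p


# (N, mean rank) for OU, OSU, and Other, from the Ranks output.
groups = [(33, 35.91), (34, 60.96), (33, 54.32)]
h, df, p = kruskal_wallis_from_mean_ranks(groups)
print(round(h, 2), df, round(p, 3))
```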