PPA 501 – Analytical Methods in Administration Lecture 9 – Bivariate Association

Page 1: PPA 501 – Analytical Methods in Administration Lecture 9 – Bivariate Association

PPA 501 – Analytical Methods in Administration

Lecture 9 – Bivariate Association

Page 2: PPA 501 – Analytical Methods in Administration Lecture 9 – Bivariate Association

Statistical Significance and Theoretical Significance

Tests of significance detect nonrandom relationships.

Measures of association go one step farther and provide information on the strength and direction of relationships.

Page 3: PPA 501 – Analytical Methods in Administration Lecture 9 – Bivariate Association

Statistical Significance and Theoretical Significance

Measures of association allow us to achieve two goals:
  • Trace causal relationships among variables. They cannot, however, prove causality.
  • Predict scores on the dependent variable based on information from the independent variable.

Page 4: PPA 501 – Analytical Methods in Administration Lecture 9 – Bivariate Association

Bivariate Association and Bivariate Tables

Two variables are said to be associated if the distribution of one of them changes under the various categories or scores of the other. For example:
  • Liberals are more likely to vote for Democratic candidates than for Republican candidates.
  • Presidents are more likely to grant disaster assistance when deaths are involved than when there are no deaths.

Page 5: PPA 501 – Analytical Methods in Administration Lecture 9 – Bivariate Association

Bivariate Association and Bivariate Tables

Bivariate tables are devices for displaying the scores of cases on two different variables.
  • The independent (X) variable goes in the columns.
  • The dependent (Y) variable goes in the rows.
  • Each column represents a category of the independent variable.
  • Each row represents a category of the dependent variable.
  • Each cell represents the cases that fall into one combination of categories.

Page 6: PPA 501 – Analytical Methods in Administration Lecture 9 – Bivariate Association

Bivariate Association and Bivariate Tables

Table 1. Home Buying Plans by Race, Jefferson County Housing Authority, 1999

Do you plan to buy a home                              Race
within the next five years?              White     Black     Other     Total
No       Count                              81        37         0       118
         % within Race                   79.4%     50.0%      0.0%     66.7%
Yes      Count                              21        37         1        59
         % within Race                   20.6%     50.0%    100.0%     33.3%
Total    Count                             102        74         1       177
         % within Race                  100.0%    100.0%    100.0%    100.0%

Each column's frequency distribution is called a conditional distribution of Y.
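The column percentages in Table 1 can be reproduced with a short sketch. The dictionary layout and the function name `column_percentages` are illustrative choices, not part of the lecture; only the counts come from Table 1.

```python
# Counts from Table 1 (rows = dependent variable Y, columns = independent variable X).
counts = {
    "No":  {"White": 81, "Black": 37, "Other": 0},
    "Yes": {"White": 21, "Black": 37, "Other": 1},
}

def column_percentages(table):
    """Return the conditional distribution of Y within each category of X."""
    columns = list(next(iter(table.values())).keys())
    # Column totals are the denominators for the conditional distributions.
    col_totals = {c: sum(row[c] for row in table.values()) for c in columns}
    return {
        y: {c: 100.0 * row[c] / col_totals[c] for c in columns}
        for y, row in table.items()
    }

percentages = column_percentages(counts)
for y, row in percentages.items():
    print(y, {c: round(p, 1) for c, p in row.items()})
```

Comparing the printed rows across columns is the same comparison of conditional distributions that the slide describes.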

Page 7: PPA 501 – Analytical Methods in Administration Lecture 9 – Bivariate Association

Bivariate Association and Bivariate Tables

Often you will calculate a chi-square when you generate a table. If chi-square is zero, there is no association.

Usually, however, chi-square is positive to some degree.

Statistical significance is not the same as association. Testing for significance is, however, usually the first step before determining the strength of an association.

Page 8: PPA 501 – Analytical Methods in Administration Lecture 9 – Bivariate Association

Three Characteristics of Bivariate Associations

Does an association exist?
  • Statistical significance is the first step.
  • Calculate column percentages and compare across conditional distributions.
  • If there is an association, the largest cell will change from column to column.
  • If there is no association, the conditional percentages will not change.

Page 9: PPA 501 – Analytical Methods in Administration Lecture 9 – Bivariate Association

Three Characteristics of Bivariate Associations

How strong is the association?
  • Once we establish the existence of an association, we need to determine how strong it is.
  • This is a matter of measuring the changes across conditional distributions.
  • No association: no change in column percentages.
  • Perfect association: each value of the independent variable is associated with one and only one value of the dependent variable.
  • The vast majority of relationships fall in between.

Page 10: PPA 501 – Analytical Methods in Administration Lecture 9 – Bivariate Association

Three Characteristics of Bivariate Associations

How strong is the association (contd.)?
  • Virtually all statistics of association are designed to vary between 0 for no association and +1 for perfect association (±1 for ordinal and interval data).
  • The meaning varies a little from statistic to statistic, but 0 signifies no association and 1 signifies perfect association in all cases.

Page 11: PPA 501 – Analytical Methods in Administration Lecture 9 – Bivariate Association

Three Characteristics of Bivariate Associations

What is the pattern and/or direction of the association?
  • Pattern is determined by examining which categories of X are associated with which categories of Y.
  • Direction only matters for ordinal and interval-ratio data.
  • In a positive association, low values on one variable are associated with low values on the other, and high values with high values.
  • In a negative association, low values on one variable are associated with high values on the other, and vice versa.

Page 12: PPA 501 – Analytical Methods in Administration Lecture 9 – Bivariate Association

Three Characteristics of Bivariate Associations

7-PT SCALE PARTY IDENTIFICATION * LIBERAL-CONSERVATIVE 7PT SCALE Crosstabulation

Columns (liberal-conservative 7-point scale): EL = Extremely liberal; L = Liberal; SL = Slightly liberal; M = Moderate, middle of the road; SC = Slightly conservative; C = Conservative; EC = Extremely conservative. Each category shows the count followed by the column percentage.

Party identification             EL       L      SL       M      SC       C      EC   Total
Strong Democrat                 172     752     548    1095     317     301      85    3270
                              40.0%   38.2%   21.7%   16.1%    8.1%    8.5%   14.2%   16.5%
Weak Democrat                    79     438     723    1592     653     358      58    3901
                              18.4%   22.3%   28.6%   23.4%   16.6%   10.1%    9.7%   19.7%
Independent - Democrat          104     417     536    1003     335     188      32    2615
                              24.2%   21.2%   21.2%   14.7%    8.5%    5.3%    5.4%   13.2%
Independent - Independent        41     144     200     911     363     231      49    1939
                               9.5%    7.3%    7.9%   13.4%    9.2%    6.5%    8.2%    9.8%
Independent - Republican         13      86     206     793     697     555      76    2426
                               3.0%    4.4%    8.2%   11.6%   17.7%   15.7%   12.7%   12.3%
Weak Republican                  13      72     230     990    1003     670      77    3055
                               3.0%    3.7%    9.1%   14.5%   25.5%   19.0%   12.9%   15.4%
Strong Republican                 8      58      81     425     561    1231     220    2584
                               1.9%    2.9%    3.2%    6.2%   14.3%   34.8%   36.9%   13.1%
Total                           430    1967    2524    6809    3929    3534     597   19790
                             100.0%  100.0%  100.0%  100.0%  100.0%  100.0%  100.0%  100.0%

Page 13: PPA 501 – Analytical Methods in Administration Lecture 9 – Bivariate Association

Three Characteristics of Bivariate Associations

President's recommendation * Snow and ice/ Winter storm Crosstabulation

                                                              Snow and ice/ Winter storm
President's recommendation                                        No       Yes     Total
Turndown                     Count                               100        31       131
                             % within Snow and ice/ Winter storm 33.2%     44.3%     35.3%
Emergency Declaration        Count                                49        13        62
                             % within Snow and ice/ Winter storm 16.3%     18.6%     16.7%
Major disaster declaration   Count                               152        26       178
                             % within Snow and ice/ Winter storm 50.5%     37.1%     48.0%
Total                        Count                               301        70       371
                             % within Snow and ice/ Winter storm 100.0%   100.0%    100.0%

Page 14: PPA 501 – Analytical Methods in Administration Lecture 9 – Bivariate Association

Nominal Association – Chi-Square Based

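$$\phi = \sqrt{\frac{\chi^2}{N}} \quad \text{for } 2 \times 2 \text{ tables}$$

$$V = \sqrt{\frac{\chi^2}{N \cdot \min(r - 1,\; c - 1)}} \quad \text{for tables larger than } 2 \times 2$$

The five-step test of significance is carried out using chi-square.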

Page 15: PPA 501 – Analytical Methods in Administration Lecture 9 – Bivariate Association

Nominal Association – Chi-Square Based

Snow and ice/ Winter storm * Presidential administration Crosstabulation

                                                                Presidential administration
Snow and ice/ Winter storm                                  Gerald R. Ford   Jimmy Carter     Total
No       Count                                                       121            181       302
         % within Presidential administration                      94.5%          73.9%     81.0%
Yes      Count                                                         7             64        71
         % within Presidential administration                       5.5%          26.1%     19.0%
Total    Count                                                       128            245       373
         % within Presidential administration                     100.0%         100.0%    100.0%

Page 16: PPA 501 – Analytical Methods in Administration Lecture 9 – Bivariate Association

Nominal Association – Chi-Square-Based

Chi-Square Tests

                                    Value    df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                                   (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square              23.271(b)     1      .000
Continuity Correction(a)        21.950        1      .000
Likelihood Ratio                27.380        1      .000
Fisher's Exact Test                                                 .000         .000
Linear-by-Linear Association    23.209        1      .000
N of Valid Cases                   373

a. Computed only for a 2x2 table.
b. 0 cells (.0%) have expected count less than 5. The minimum expected count is 24.36.

$$V = \sqrt{\frac{\chi^2}{N \cdot \min(r - 1,\; c - 1)}} = \sqrt{\frac{23.271}{373(1)}} = \sqrt{.0624} = .250$$
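As a rough check on the figures above, the following sketch recomputes the Pearson chi-square and Cramér's V from the observed counts in the snow and ice by presidential administration table. The function names `chi_square` and `cramers_v` are illustrative, and the code uses the uncorrected Pearson statistic, not the continuity-corrected value.

```python
import math

# Observed counts: rows = snow and ice / winter storm (No, Yes),
# columns = presidential administration (Ford, Carter).
observed = [
    [121, 181],
    [7, 64],
]

def chi_square(table):
    """Pearson chi-square for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, f_obs in enumerate(row):
            # Expected count under independence: row total * column total / N.
            f_exp = row_totals[i] * col_totals[j] / n
            total += (f_obs - f_exp) ** 2 / f_exp
    return total

def cramers_v(table):
    """Cramer's V; for a 2 x 2 table min(r-1, c-1) is 1, so V equals phi."""
    n = sum(sum(row) for row in table)
    k = min(len(table) - 1, len(table[0]) - 1)
    return math.sqrt(chi_square(table) / (n * k))

chi2 = chi_square(observed)
v = cramers_v(observed)
print(round(chi2, 3), round(v, 3))
```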

Page 17: PPA 501 – Analytical Methods in Administration Lecture 9 – Bivariate Association

Nominal Association – Proportional Reduction in Error

The logic of proportional reduction in error (PRE) involves first attempting to guess or predict the category into which each case will fall on the dependent variable without using information from the independent variable.

The second step involves using the information on the conditional distribution of the dependent variable within categories of the independent variable to reduce errors in prediction.

Page 18: PPA 501 – Analytical Methods in Administration Lecture 9 – Bivariate Association

Nominal Association – Proportional Reduction in Error

If there is a strong association, there will be a substantial reduction in error from knowing the joint distribution of X and Y.

If there is no association, there will be no reduction in error.

Page 19: PPA 501 – Analytical Methods in Administration Lecture 9 – Bivariate Association

Nominal Association – PRE - Lambda

$$\lambda = \frac{E_1 - E_2}{E_1}$$

Prediction rule: predict that all cases fall in the largest category. E1 is the number of prediction errors made without knowing X, and E2 is the number of prediction errors made knowing the distribution on X.

Page 20: PPA 501 – Analytical Methods in Administration Lecture 9 – Bivariate Association

Nominal Association – PRE - Lambda

E1 is calculated by subtracting the cases in the largest category of the row marginals from the total number of cases.

E2 is calculated by subtracting the largest category in each column from the column total and summing across columns of the independent variable.

Page 21: PPA 501 – Analytical Methods in Administration Lecture 9 – Bivariate Association

Nominal Association – PRE - Lambda

Table 2. Home Ownership by Race in Birmingham, 2000

Do you rent, lease or own                                Race (Dichotomous)
your current residence?                                 White   Non-white     Total
Rent or lease   Count                                      21          81       102
                % within Race (Dichotomous)             35.0%       51.3%     46.8%
Own             Count                                      39          77       116
                % within Race (Dichotomous)             65.0%       48.7%     53.2%
Total           Count                                      60         158       218
                % within Race (Dichotomous)            100.0%      100.0%    100.0%

Page 22: PPA 501 – Analytical Methods in Administration Lecture 9 – Bivariate Association

Nominal Association – PRE - Lambda

$$E_1 = N - \text{Maximum category} = 218 - 116 = 102$$

$$E_2 = (N_1 - \text{Max}_1) + (N_2 - \text{Max}_2) = (60 - 39) + (158 - 81) = 21 + 77 = 98$$

$$\lambda = \frac{E_1 - E_2}{E_1} = \frac{102 - 98}{102} = \frac{4}{102} = .039$$

Interpretation: By knowing the race of the respondent, you have reduced your error in predicting whether or not they own their homes by 3.9% over knowing no information about the respondents.
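The lambda calculation for Table 2 can be sketched in a few lines. The function name `lambda_pre` and the list-of-rows layout are illustrative assumptions; the sketch treats the row variable as the dependent variable and does not guard against the degenerate case E1 = 0.

```python
# Counts from Table 2: rows = tenure (dependent, Y), columns = race (independent, X).
table2 = [
    [21, 81],   # Rent or lease: White, Non-white
    [39, 77],   # Own:           White, Non-white
]

def lambda_pre(table):
    """Return (E1, E2, lambda) with the row variable as the dependent variable."""
    n = sum(sum(row) for row in table)
    # E1: errors when the largest row category is predicted for every case.
    e1 = n - max(sum(row) for row in table)
    # E2: errors when the largest cell is predicted within each column.
    e2 = sum(sum(col) - max(col) for col in zip(*table))
    return e1, e2, (e1 - e2) / e1

e1, e2, lam = lambda_pre(table2)
print(e1, e2, round(lam, 3))
```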

Page 23: PPA 501 – Analytical Methods in Administration Lecture 9 – Bivariate Association

Nominal Association – PRE - Lambda

The key problem with lambda occurs when the distribution on one of the variables is lopsided (many cases in one category and few cases in the others). Under those circumstances, lambda can equal zero, even when there is a relationship.
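The snow and ice by presidential administration table from earlier in this lecture appears to illustrate the problem: its Pearson chi-square is reported as 23.271, yet "No" is the largest category in both columns. The sketch below, with illustrative variable names, computes lambda for that table with the winter storm variable as the dependent variable.

```python
# Counts: rows = snow and ice / winter storm (No, Yes),
# columns = presidential administration (Ford, Carter).
storms = [
    [121, 181],
    [7, 64],
]

n = sum(sum(row) for row in storms)
# E1: errors when predicting the largest row category ("No") for every case.
e1 = n - max(sum(row) for row in storms)
# E2: errors when predicting the largest cell within each column.
e2 = sum(sum(col) - max(col) for col in zip(*storms))
lam = (e1 - e2) / e1

# The column percentages for "Yes" still differ between administrations.
col_totals = [sum(col) for col in zip(*storms)]
pct_yes = [100.0 * storms[1][j] / col_totals[j] for j in range(len(col_totals))]
print(e1, e2, lam, [round(p, 1) for p in pct_yes])
```

Because the same category is predicted in every column, knowing X removes no prediction errors, so lambda is zero even though the conditional distributions differ.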

Page 24: PPA 501 – Analytical Methods in Administration Lecture 9 – Bivariate Association

Nominal Association

The five-step test of significance for both the chi-square-based statistics and the PRE statistics is chi-square.

Page 25: PPA 501 – Analytical Methods in Administration Lecture 9 – Bivariate Association

Ordinal Association

Some ordinal variables have many categories and look like interval variables. These can be called continuous ordinal variables. The appropriate ordinal association statistic is Spearman's rho (not covered in this lecture).

Some ordinal variables have only a few categories and can be called collapsed ordinal variables. The appropriate ordinal association statistics are gamma and the other tau-class statistics.

Page 26: PPA 501 – Analytical Methods in Administration Lecture 9 – Bivariate Association

The Computation of Gamma and Other Tau-class Statistics

These measures of association compare each case to every other case. The total number of pairs of cases is equal to N(N-1)/2.

Page 27: PPA 501 – Analytical Methods in Administration Lecture 9 – Bivariate Association

The Computation of Gamma and Other Tau-class Statistics

There are five classes of pairs:
  • C or Ns: pairs where one case is higher than the other on both variables.
  • D or Nd: pairs where one case is higher on one variable and lower on the other.
  • Ty: pairs tied on the dependent variable but not the independent variable.
  • Tx: pairs tied on the independent variable but not the dependent variable.
  • Txy: pairs tied on both variables.

Page 28: PPA 501 – Analytical Methods in Administration Lecture 9 – Bivariate Association

The Computation of Gamma and Other Tau-class Statistics

To calculate C, start in the upper left cell and multiply the frequency in each cell by the total of all frequencies to the right of and below the cell in the table, then sum the products.

To calculate D, start in the upper right cell and multiply the frequency in each cell by the total of all frequencies to the left and below the cell in the table.

Page 29: PPA 501 – Analytical Methods in Administration Lecture 9 – Bivariate Association

The Computation of Gamma and Other Tau-class Statistics

To calculate Tx, start in the upper left cell and multiply the frequency in each cell by the total of all frequencies directly below the cell.

To calculate Ty, start in the upper left cell and multiply the frequency in each cell by the total of all frequencies directly to the right of the cell.

Page 30: PPA 501 – Analytical Methods in Administration Lecture 9 – Bivariate Association

The Computation of Gamma and Other Tau-class Statistics

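To calculate Txy, start in the upper left cell, compute n(n - 1)/2 for each cell, where n is the cell frequency, and sum the results across all cells.

$$\gamma = \frac{C - D}{C + D}$$

$$\text{Somers' } d_{yx} = \frac{C - D}{C + D + T_y}$$

$$\text{Kendall's } \tau_b = \frac{C - D}{\sqrt{(C + D + T_x)(C + D + T_y)}}$$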

Page 31: PPA 501 – Analytical Methods in Administration Lecture 9 – Bivariate Association

Gamma Example – JCHA 2000

How safe do you feel alone at night in your home? * Race Crosstabulation

How safe do you feel alone                          Race
at night in your home?                        White     Black     Total
Very safe         Count                          13         9        22
                  % within Race               48.1%     64.3%     53.7%
Somewhat safe     Count                          11         4        15
                  % within Race               40.7%     28.6%     36.6%
Somewhat unsafe   Count                           2         0         2
                  % within Race                7.4%      0.0%      4.9%
Very unsafe       Count                           1         1         2
                  % within Race                3.7%      7.1%      4.9%
Total             Count                          27        14        41
                  % within Race              100.0%    100.0%    100.0%

Page 32: PPA 501 – Analytical Methods in Administration Lecture 9 – Bivariate Association

Gamma Example – JCHA 2000

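$$C = 13(4 + 0 + 1) + 11(0 + 1) + 2(1) = 65 + 11 + 2 = 78$$

$$D = 9(11 + 2 + 1) + 4(2 + 1) + 0(1) = 126 + 12 + 0 = 138$$

$$T_y = 13(9) + 11(4) + 2(0) + 1(1) = 117 + 44 + 0 + 1 = 162$$

$$T_x = 13(11 + 2 + 1) + 11(2 + 1) + 2(1) + 9(4 + 0 + 1) + 4(0 + 1) + 0(1) = 182 + 33 + 2 + 45 + 4 + 0 = 266$$

$$T_{xy} = \frac{13(12)}{2} + \frac{9(8)}{2} + \frac{11(10)}{2} + \frac{4(3)}{2} + \frac{2(1)}{2} + \frac{0(0)}{2} + \frac{1(0)}{2} + \frac{1(0)}{2} = 78 + 36 + 55 + 6 + 1 + 0 + 0 + 0 = 176$$

$$\text{Total pairs} = C + D + T_y + T_x + T_{xy} = 78 + 138 + 162 + 266 + 176 = 820$$

$$\gamma = \frac{C - D}{C + D} = \frac{78 - 138}{78 + 138} = \frac{-60}{216} = -.278$$

$$\text{Somers' } d_{yx} = \frac{C - D}{C + D + T_y} = \frac{78 - 138}{78 + 138 + 162} = \frac{-60}{378} = -.159$$

$$\text{Kendall's } \tau_b = \frac{C - D}{\sqrt{(C + D + T_x)(C + D + T_y)}} = \frac{-60}{\sqrt{(78 + 138 + 266)(78 + 138 + 162)}} = \frac{-60}{\sqrt{(482)(378)}} = \frac{-60}{426.8442} = -.141$$

Gamma and its associated statistics can have a PRE interpretation.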
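The pair counts for the JCHA 2000 table can be checked with a brute-force sketch over cells. The function name `pair_counts` and the nested-list layout are illustrative assumptions; rows must be ordered from very safe to very unsafe and columns as White, Black, matching the crosstabulation.

```python
import math

# Counts from the JCHA 2000 table: rows = feeling of safety (Y),
# columns = race (X: White, Black).
safety = [
    [13, 9],
    [11, 4],
    [2, 0],
    [1, 1],
]

def pair_counts(table):
    """Count concordant, discordant, and tied pairs in an ordered r x c table."""
    rows, cols = len(table), len(table[0])
    c = d = tx = ty = txy = 0
    for i in range(rows):
        for j in range(cols):
            f = table[i][j]
            # Pairs within the same cell are tied on both variables.
            txy += f * (f - 1) // 2
            # Same column, rows below: tied on X only.
            tx += f * sum(table[k][j] for k in range(i + 1, rows))
            # Same row, columns to the right: tied on Y only.
            ty += f * sum(table[i][l] for l in range(j + 1, cols))
            for k in range(i + 1, rows):
                # Below and to the right: concordant pairs.
                c += f * sum(table[k][l] for l in range(j + 1, cols))
                # Below and to the left: discordant pairs.
                d += f * sum(table[k][l] for l in range(j))
    return c, d, tx, ty, txy

C, D, Tx, Ty, Txy = pair_counts(safety)
n = sum(sum(row) for row in safety)
total_pairs = n * (n - 1) // 2

gamma = (C - D) / (C + D)
somers_d_yx = (C - D) / (C + D + Ty)
tau_b = (C - D) / math.sqrt((C + D + Tx) * (C + D + Ty))
print(C, D, Tx, Ty, Txy, total_pairs)
print(round(gamma, 3), round(somers_d_yx, 3), round(tau_b, 3))
```

The five pair counts should sum to N(N-1)/2, which gives a quick consistency check on any hand calculation.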

Page 33: PPA 501 – Analytical Methods in Administration Lecture 9 – Bivariate Association

Gamma Example – JCHA 2000

Directional Measures

Ordinal by Ordinal: Somers' d                  Value   Asymp. Std. Error(a)   Approx. T(b)   Approx. Sig.
Symmetric                                      -.140          .146               -.952           .341
How safe do you feel alone at night
  in your home? Dependent                      -.159          .166               -.952           .341
Race Dependent                                 -.124          .131               -.952           .341

a. Not assuming the null hypothesis.
b. Using the asymptotic standard error assuming the null hypothesis.

Symmetric Measures

Ordinal by Ordinal                             Value   Asymp. Std. Error(a)   Approx. T(b)   Approx. Sig.
Kendall's tau-b                                -.141          .147               -.952           .341
Gamma                                          -.278          .289               -.952           .341
N of Valid Cases                                  41

a. Not assuming the null hypothesis.
b. Using the asymptotic standard error assuming the null hypothesis.

Page 34: PPA 501 – Analytical Methods in Administration Lecture 9 – Bivariate Association

The Five-Step Test for Gamma

Step 1. Making assumptions.
  • Random sampling.
  • Ordinal measurement.
  • Normal sampling distribution.

Step 2. Stating the null hypothesis.
  • H0: γ = 0.0
  • H1: γ ≠ 0.0

Page 35: PPA 501 – Analytical Methods in Administration Lecture 9 – Bivariate Association

The Five-Step Test for Gamma

Step 3. Selecting the sampling distribution and establishing the critical region.
  • Sampling distribution = Z distribution.
  • Alpha = .05.
  • Z (critical) = ±1.96.

Page 36: PPA 501 – Analytical Methods in Administration Lecture 9 – Bivariate Association

The Five-Step Test for Gamma

Step 4. Computing the test statistic.

$$Z(\text{obtained}) = G\sqrt{\frac{C + D}{N(1 - G^2)}} = -.278\sqrt{\frac{78 + 138}{41(1 - .278^2)}} = -.278\sqrt{\frac{216}{37.8314}} = -.278(2.3895) = -.664$$

Step 5. Making a decision. The absolute value of Z(obtained), .664, is less than Z(critical), 1.96. Fail to reject the null hypothesis that gamma is zero in the population. The JCHA 2000 sample provides no evidence of a relationship between race and the feeling of safety at home.
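Steps 4 and 5 can be sketched as follows, using the pair counts and sample size from the JCHA 2000 example. Variable names are illustrative, and the sketch keeps gamma at full precision, so the result may differ slightly in later decimals from a hand calculation that rounds gamma to .278 first.

```python
import math

# Concordant pairs, discordant pairs, and sample size from the gamma example.
C, D, N = 78, 138, 41

gamma = (C - D) / (C + D)
# Z(obtained) = G * sqrt((C + D) / (N * (1 - G^2)))
z_obtained = gamma * math.sqrt((C + D) / (N * (1 - gamma ** 2)))
z_critical = 1.96  # two-tailed critical value at alpha = .05

# Reject the null hypothesis only if |Z(obtained)| exceeds Z(critical).
reject_null = abs(z_obtained) > z_critical
print(round(z_obtained, 3), reject_null)
```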