Elementary Data Analysis Frequencies and Crosstabulations © Pine Forge Press, an imprint of Sage Publications, 2004



Displaying Univariate Distributions

The first step in data analysis is usually to display the variation in each variable of interest.

Graphs and frequency distributions are the two most popular approaches; both allow the analyst to display the distribution of cases across the categories of a variable.

[Figure: bar chart of RS RELIGIOUS PREFERENCE, one bar per religious preference category; axis: Percent, 0 to 60]

Source: GSS 2000

Respondent's Religious Preference

Value                      Frequency   Valid Percent
PROTESTANT                      1521           54.07
CATHOLIC                         679           24.13
JEWISH                            63            2.23
NONE                             398           14.15
OTHER (SPECIFY)                   42            1.5
BUDDHISM                          17            0.6
HINDUISM                           8            0.28
OTHER EASTERN                      1            0.03
MOSLEM/ISLAM                      12            0.43
ORTHODOX-CHRISTIAN                12            0.43
CHRISTIAN                         39            1.39
NATIVE AMERICAN                    4            0.14
INTER-NONDENOMINATIONAL           17            0.6
Total (valid)                   2813          100

Missing
DK                                 1
NA                                 3
Total (missing)                    4

Total                           2817
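A valid percent is a category's frequency divided by the number of valid (non-missing) cases. The sketch below recomputes that column from the frequencies in the table; the function name `valid_percents` is an illustrative choice, and the recomputed values may differ from the printed table in the second decimal place.

```python
# Frequencies copied from the table above (GSS 2000, valid cases only).
counts = {
    "PROTESTANT": 1521,
    "CATHOLIC": 679,
    "JEWISH": 63,
    "NONE": 398,
    "OTHER (SPECIFY)": 42,
    "BUDDHISM": 17,
    "HINDUISM": 8,
    "OTHER EASTERN": 1,
    "MOSLEM/ISLAM": 12,
    "ORTHODOX-CHRISTIAN": 12,
    "CHRISTIAN": 39,
    "NATIVE AMERICAN": 4,
    "INTER-NONDENOMINATIONAL": 17,
}
missing = {"DK": 1, "NA": 3}  # left out of the valid-percent base


def valid_percents(freqs):
    """Return each category's share of the valid total, as a percentage."""
    valid_total = sum(freqs.values())
    return {value: 100 * n / valid_total for value, n in freqs.items()}


percents = valid_percents(counts)

# Print a simple frequency distribution: value, frequency, valid percent.
for value, n in counts.items():
    print(f"{value:<25}{n:>6}{percents[value]:>8.2f}")
print(f"{'Total (valid)':<25}{sum(counts.values()):>6}")
print(f"{'Total (all cases)':<25}{sum(counts.values()) + sum(missing.values()):>6}")
```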


Displaying Ungrouped Data

Many frequency distributions (and graphs) require grouping of some values after the data are collected. There are two reasons for grouping:

• There are more than 15–20 values to begin with, a number too large to be displayed in an easily readable table.

• The distribution of the variable will be clearer or more meaningful if some of the values are combined.


Ungrouped vs. Grouped Data


Guidelines for grouping values:

• Categories should be logically defensible and preserve the distribution's shape.

• Categories should be mutually exclusive and exhaustive, so that every case is classifiable in one and only one category.
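One way to make grouped categories mutually exclusive and exhaustive is to define each by a lower bound, so that every value falls in exactly one bin. The sketch below assumes an age variable for adult respondents; the cut points, labels, and sample ages are hypothetical and not taken from the slides.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical cut points: each bin runs from its lower bound up to,
# but not including, the next one. Assumes all respondents are 18 or older.
BOUNDS = [30, 45, 65]
LABELS = ["18-29", "30-44", "45-64", "65+"]


def group_age(age):
    """Assign an age in years to exactly one category."""
    return LABELS[bisect_right(BOUNDS, age)]


# Hypothetical ungrouped values.
ages = [18, 22, 29, 30, 37, 44, 45, 58, 64, 65, 81]

grouped = Counter(group_age(a) for a in ages)
for label in LABELS:
    print(f"{label:<8}{grouped[label]:>4}")
```

Because every age maps to a single label, the grouped frequencies always add up to the number of cases.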


CROSSTABULATING VARIABLES

A crosstabulation displays the distribution of one variable for each category of another variable; it can also be termed a bivariate distribution.


Follow these rules when you create and then read a percentage table:

1. Make the independent variable the column variable and the dependent variable the row variable.

2. Percentage the table column by column, on the column totals. The percents should add to 100 (or perhaps 99 or 101, if there has been rounding error) in each column.

3. Compare the distributions of the dependent variable (the row variable) across each column.
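The three rules above can be sketched in a few lines. The example assumes hypothetical cases in which family income is the independent (column) variable and voting is the dependent (row) variable; the data and the function name `column_percentages` are illustrative only.

```python
from collections import Counter

# Hypothetical cases: (independent variable, dependent variable).
cases = [
    ("low", "voted"), ("low", "did not vote"),
    ("low", "did not vote"), ("low", "did not vote"),
    ("high", "voted"), ("high", "voted"),
    ("high", "voted"), ("high", "did not vote"),
]


def column_percentages(pairs):
    """Percentage each cell on its column total (rule 2)."""
    cell_counts = Counter(pairs)
    column_totals = Counter(col for col, _ in pairs)
    return {
        (row, col): 100 * n / column_totals[col]
        for (col, row), n in cell_counts.items()
    }


table = column_percentages(cases)
columns = ["low", "high"]           # independent variable across the top (rule 1)
rows = ["voted", "did not vote"]    # dependent variable down the side

print(f"{'':<14}" + "".join(f"{c:>8}" for c in columns))
for r in rows:
    # Reading along a row compares the columns (rule 3).
    print(f"{r:<14}" + "".join(f"{table.get((r, c), 0.0):>7.0f}%" for c in columns))
```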


Graphing Association

Graphs provide an efficient tool for summarizing relationships among variables.


Another good example of the use of graphs to show relationships is provided by a Bureau of Justice Statistics report on criminal victimization (Rand, Lynch, & Cantor, 1997:1).


Describing Association

A crosstabulation table reveals four aspects of the association between two variables:

• Existence. Do the percentage distributions vary at all between categories of the independent variable?

• Strength. How much do the percentage distributions vary between categories of the independent variable?

• Direction. For quantitative variables, do values on the dependent variable tend to increase or decrease with an increase in value on the independent variable?

• Pattern. For quantitative variables, are changes in the percentage distribution of the dependent variable fairly regular (simply increasing or decreasing), or do they vary (perhaps increasing, then decreasing, or perhaps gradually increasing, then rapidly increasing)?


In this table, an association exists; it is moderately strong (the difference in percentages between the first and last column is about 25 percentage points); and the direction of association between likelihood of voting and family income is positive.
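The same reading can be worked through in code. The percentages below are hypothetical stand-ins chosen to give a 25-point difference; they are not the values from the table discussed here, and `describe_association` is an illustrative helper, not a statistical test.

```python
# Hypothetical percent voting in each family income category, ordered low to high.
pct_voted = [("lowest", 50.0), ("middle", 62.0), ("highest", 75.0)]


def describe_association(ordered):
    """Summarize existence, strength, direction, and pattern descriptively."""
    values = [pct for _, pct in ordered]
    steps = [b - a for a, b in zip(values, values[1:])]
    difference = values[-1] - values[0]  # last column minus first column

    if difference > 0:
        direction = "positive"
    elif difference < 0:
        direction = "negative"
    else:
        direction = "none"

    # "Regular" here means the percentages only rise or only fall.
    regular = all(s >= 0 for s in steps) or all(s <= 0 for s in steps)

    return {
        "exists": any(s != 0 for s in steps),
        "difference": difference,
        "direction": direction,
        "pattern": "regular" if regular else "irregular",
    }


summary = describe_association(pct_voted)
print(summary)
```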


Controlling for a Third Variable

By adding a third variable to an evaluation of a bivariate relationship, the data analyst can specify the conditions under which the bivariate relationship occurs.
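A minimal sketch of this idea: compute the percent voting by income separately within each category of a control variable. The cases below are hypothetical, with education as the assumed control variable, and are constructed so that the income and voting relationship appears in one subtable but not the other.

```python
from collections import defaultdict

# Hypothetical cases: (control variable, independent variable, voted?).
cases = (
    [("no degree", "low", v) for v in (True, False, False, False)]
    + [("no degree", "high", v) for v in (True, True, True, False)]
    + [("degree", "low", v) for v in (True, True, True, False)]
    + [("degree", "high", v) for v in (True, True, True, False)]
)


def percent_voting_by_subtable(records):
    """Percent voting per income category, within each control category."""
    tallies = defaultdict(lambda: [0, 0])  # (control, income) -> [voted, total]
    for control, income, voted in records:
        tally = tallies[(control, income)]
        tally[0] += int(voted)
        tally[1] += 1

    subtables = defaultdict(dict)
    for (control, income), (voted, total) in tallies.items():
        subtables[control][income] = 100 * voted / total
    return dict(subtables)


subtables = percent_voting_by_subtable(cases)
for control, row in subtables.items():
    print(control, row)
```

In these made-up data, income is related to voting only among respondents without a degree, which is the kind of condition a third variable can specify.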


Frequencies and Crosstabulations

Tools of the Data Analyst
