Cross-Tabulation Tables

Tables in R and Computing Chi Square

Kinds of Data
• Nominal or Ordinal (few categories)
• Interval if it is grouped
• Some tests ignore the ordering of the categories (e.g. Chi square)
• In R this means we are working with factors

Kinds of Tables
1. One line per observation, e.g. data on Ernest Witte where each row is a single individual – table() and Rcmdr
2. One line per cell with a column of numbers representing the count for that cell – xtabs()
3. A row for each category of the first variable and a column for each category of the second variable with counts at the intersection of a row and column – Rcmdr (Enter table directly)

Type 1
> EWG2[sample(rownames(EWG2), 6), c("Age", "Goods")]
             Age   Goods
159 Middle Adult  Absent
126        Child Present
075        Child  Absent
156    Old Adult Present
095        Adult  Absent
157    Old Adult  Absent

Type 2
  Age   Goods Freq
Child  Absent   18
Adult  Absent   51
Child Present   19
Adult Present   55

Type 3
      Absent Present
Child     18      19
Adult     51      55
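The three layouts hold the same information, so one can be converted into another. The following is a minimal sketch, assuming only the four counts shown in the Type 2 example; the object names (type1, type2, type3) are illustrative and are not part of the original data set.

```r
# Type 2: one row per cell, using the counts from the slide
type2 <- data.frame(
  Age   = factor(c("Child", "Adult", "Child", "Adult"),
                 levels = c("Child", "Adult")),
  Goods = factor(c("Absent", "Absent", "Present", "Present")),
  Freq  = c(18, 51, 19, 55)
)

# Type 2 -> Type 3: a count column on the left of the formula
# is summed within each Age/Goods combination
type3 <- xtabs(Freq ~ Age + Goods, data = type2)
type3

# Type 2 -> Type 1: repeat each row as many times as its count
type1 <- type2[rep(seq_len(nrow(type2)), type2$Freq), c("Age", "Goods")]
rownames(type1) <- NULL
head(type1)

# Type 1 -> Type 3: table() counts the observations again
type3b <- table(type1$Age, type1$Goods, dnn = c("Age", "Goods"))
type3b
```

Both routes should give the same 2 x 2 table of counts.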

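Factors in R
• Factors use integers to code for categorical data
• Each integer code is associated with a label, e.g. 1 could stand for “Absent” and 2 for “Present”
• Before R 4.0.0, R usually created factors from any character data columns; from 4.0.0 on, data.frame() and read.csv() keep them as character unless stringsAsFactors=TRUE is set or factor() is applied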

Factors
• Regular factors are either equal or not equal (nominal)
• Ordered factors can also be compared with >, ==, and <
• Rcmdr makes it easy to convert a numeric variable to a factor, to change the factor labels, to change the order of the factor levels, and to make the factor ordered
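The same conversions can be done from the command line. This is a small sketch with made-up vectors (goods and age are hypothetical examples, not columns read from ErnestWitte) showing integer codes, labels, and ordered comparison.

```r
# A numeric variable coded 1/2 turned into a labelled factor
goods <- factor(c(1, 2, 1), levels = c(1, 2),
                labels = c("Absent", "Present"))
as.integer(goods)   # the underlying integer codes
levels(goods)       # the labels attached to the codes

# Regular (nominal) factors support == and != only;
# using < on them returns NA with a warning
goods[1] == goods[3]

# An ordered factor: the order of 'levels' defines the ranking
age <- factor(c("Child", "Adult", "Old Adult", "Child"),
              levels = c("Child", "Adult", "Old Adult"),
              ordered = TRUE)
age[1] < age[2]     # Child ranks below Adult
```

Listing the levels explicitly matters here, because the default alphabetical order would put "Adult" before "Child".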

Tables in R
• Tables are basically matrices with labeling
• Transferring between data.frames and tables is possible but can lead to unexpected results
• Rcmdr does not recognize tables.

Key table commands in R
• table() – create one and multi-way tables
• xtabs() – uses formulas (and optionally weights/counts)
• addmargins() – add row and column totals
• prop.table() – create table of proportions

Key commands (cont.)
• ftable() – flatten a multidimensional table – but does not work with xtable()
• print(xtable(), type="html") – print an html version of the table.

# Use Rcmdr to load ErnestWitte and create EWG2
# EWG2 <- subset(ErnestWitte, subset=Group==2)
table(EWG2$Age)
# Re-creating the factor drops any levels that are unused after subsetting
EWG2$Age <- factor(EWG2$Age)
Table1 <- table(EWG2$Age, EWG2$Goods, dnn=c("Age", "Goods"))
Table1
str(Table1)
Table2 <- xtabs(~Age+Goods, data=EWG2)
Table2
str(Table2)
# Convert the table to a data frame with one row per cell (Type 2)
DF1 <- data.frame(Table1)
DF1
names(DF1) <- c("Age", "Goods", "Freq")
DF1

Table3 <- xtabs(Freq~Age+Goods, data=DF1)
Table3
addmargins(Table1)
prop.table(Table1)
prop.table(Table1, 1)
prop.table(addmargins(Table1, 1), 1)

# Included in Rcmdr
rowPercents(Table1)
colPercents(Table1)
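The calls above need the ErnestWitte data. As a self-contained sketch of what addmargins() and prop.table() return, the following uses only the four counts from the Type 3 example; the object names are illustrative.

```r
# Type 3 counts entered directly as a labelled matrix, then made a table
counts <- matrix(c(18, 51, 19, 55), nrow = 2,
                 dimnames = list(Age   = c("Child", "Adult"),
                                 Goods = c("Absent", "Present")))
tab <- as.table(counts)

with_totals <- addmargins(tab)      # adds a "Sum" row and column
overall     <- prop.table(tab)      # each cell / grand total
by_row      <- prop.table(tab, 1)   # each cell / its row total
by_col      <- prop.table(tab, 2)   # each cell / its column total

with_totals
round(100 * by_row, 1)              # row percentages
```

The second argument of prop.table() picks the margin: 1 makes each row sum to 1, 2 makes each column sum to 1, and leaving it out makes the whole table sum to 1.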

Table4 <- xtabs(~Adult+Goods+Pathology, data=EWG2)
Table4
str(Table4)
ftable(Table4, row.vars=c(1, 2), col.vars=3)
ftable(Table4, row.vars=c(3, 2), col.vars=1)
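To try ftable() without the EWG2 data, here is a sketch that swaps in R's built-in Titanic table as a stand-in. That data set is an assumption of this example and does not appear in the original slides; it is first reduced to three variables so that it mirrors the three-way Table4.

```r
# Built-in 4-way table reduced to three variables: Class, Sex, Survived
three_way <- margin.table(Titanic, c(1, 2, 4))
str(three_way)

# Variables 1 and 2 nested down the rows, variable 3 across the columns
flat1 <- ftable(three_way, row.vars = c(1, 2), col.vars = 3)
flat1

# Same counts with a different arrangement of rows and columns
flat2 <- ftable(three_way, row.vars = c(3, 2), col.vars = 1)
flat2
```

Only the layout changes between the two calls; the cell counts and their total stay the same.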

# tohtml() puts html code for the table into the Windows
# clipboard, or into a file named "clipboard" in Mac OS X or Linux
library(xtable)  # provides xtable() and its print method
tohtml <- function(x) print(xtable(x), type="html", file="clipboard")
tohtml(Table1)
# Paste clipboard into Microsoft Excel

Null Hypothesis
• The usual null hypothesis is that the row and column variables are independent of one another – knowing one does not help us predict the other
• If the null hypothesis is false, the cell values will deviate from expected values

E.g. Coin Flipping
• If I flip a coin twice, the chance that the first flip comes up heads is .5
• The chance that the second flip comes up heads is .5 as well
• But what if the chance of getting a head changed depending on the first toss? The probabilities would be conditional
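The contrast between independent and conditional probabilities can be worked through numerically. In this sketch the fair-coin values come from the slide, while the dependent coin (second flip repeats the first with probability 0.8) is a hypothetical chosen purely for illustration.

```r
# Marginal probabilities for each flip of a fair coin
p_first  <- c(H = 0.5, T = 0.5)
p_second <- c(H = 0.5, T = 0.5)

# Independence: each joint probability is the product of the marginals
joint_indep <- outer(p_first, p_second)
joint_indep

# Hypothetical dependence: P(second | first), one row per first-flip outcome
cond <- matrix(c(0.8, 0.2,
                 0.2, 0.8), nrow = 2, byrow = TRUE,
               dimnames = list(first = c("H", "T"), second = c("H", "T")))

# Joint probability = P(first) * P(second | first), applied row by row
joint_dep <- cond * p_first
joint_dep
colSums(joint_dep)   # the second flip is still heads half the time
```

In the dependent case both marginals are still .5, yet the joint probabilities no longer equal the product of the marginals. That kind of departure is what the test of independence looks for.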

Expected Probabilities
• Under the null hypothesis the expected value for a cell is
– (Row sum * Column sum)/Total count
• Deviations of the actual counts from the expected values are measured as
– (Observed – Expected)^2/Expected
• Summing the deviations over all cells gives us a statistic that approximately follows a chi-square distribution with (rows - 1) * (columns - 1) degrees of freedom
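These formulas can be checked by hand. The sketch below applies them to the four counts from the Type 3 example and compares the result with chisq.test(). The continuity correction is switched off with correct=FALSE, because R applies it by default to 2 x 2 tables and it would otherwise change the statistic.

```r
observed <- matrix(c(18, 51, 19, 55), nrow = 2,
                   dimnames = list(Age   = c("Child", "Adult"),
                                   Goods = c("Absent", "Present")))

# Expected count = (row sum * column sum) / total count
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)
expected

# Chi-square statistic = sum over cells of (observed - expected)^2 / expected
chi_sq <- sum((observed - expected)^2 / expected)
df     <- (nrow(observed) - 1) * (ncol(observed) - 1)
p_val  <- pchisq(chi_sq, df = df, lower.tail = FALSE)
c(chi_sq = chi_sq, df = df, p = p_val)

# Same computation done by R, without the continuity correction
check <- chisq.test(observed, correct = FALSE)
check
```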

Chi-Square Test
• Compares observed counts to expected counts based on independence
• Rcmdr constructs the tables and computes the test, BUT deletes the results

Two Options
• chisq.test()
– Saves results in multiple tables
– Performs Chi Square and simulation for p value
• CrossTable() and crosstab() in descr
– SAS, SPSS style output with xtable()
– More formatting options
– Mosaic plot with crosstab()

Results <- chisq.test(xtabs(~Age+Pathology, data=EWG2), simulate.p.value=TRUE)

Pearson's Chi-squared test with simulated p-value (based on 2000 replicates)

data:  xtabs(~Age + Pathology, data = EWG2)
X-squared = 31.2876, df = NA, p-value = 0.0004998

str(Results)
Results$expected
Results$residuals
fisher.test(xtabs(~Sex+Goods, data=EWG2))
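Because chisq.test() returns its results as a list, the pieces can be inspected after the test has run. This sketch uses the four Type 3 counts rather than EWG2 so that it runs on its own; set.seed() is added only to make the simulated p-value repeatable.

```r
observed <- matrix(c(18, 51, 19, 55), nrow = 2,
                   dimnames = list(Age   = c("Child", "Adult"),
                                   Goods = c("Absent", "Present")))

set.seed(1)  # the simulated p-value depends on random draws
res <- chisq.test(observed, simulate.p.value = TRUE, B = 2000)

names(res)       # components stored in the result
res$expected     # expected counts under independence
res$residuals    # Pearson residuals: (observed - expected) / sqrt(expected)
res$parameter    # df is NA when the p-value is simulated
res$p.value

# Fisher's exact test on the same 2 x 2 table
exact <- fisher.test(observed)
exact$p.value
```

The NA for df matches the printed output shown earlier: with simulation, the p-value comes from the resampled tables rather than from a chi-square distribution.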

library(descr)  # CrossTable() and crosstab() come from the descr package
with(EWG2, CrossTable(Age, Pathology))
with(EWG2, CrossTable(Age, Pathology, prop.c=FALSE, prop.t=FALSE))
with(EWG2, crosstab(Age, Pathology))
with(EWG2, crosstab(Age, Pathology, expected=TRUE, resid=TRUE))
with(EWG2, crosstab(Sex, Goods))
