Area Principle
The area occupied by a part of the graph should correspond to the
magnitude of the value it represents.
Slide 2
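Contingency Tables
A contingency table shows how the individuals are distributed along
each variable, contingent on the value of the other variable.
* Marginal distribution: the distribution of one variable alone, read
from the totals in the margins of the table.
* Conditional distribution: the distribution of one variable for only
those individuals who satisfy a condition on the other variable.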
Slide 3
Slide 4
Did the chance of surviving the Titanic sinking depend on
ticket class?
Slide 5
Slide 6
Chapter 3: Displaying and Describing Categorical Data
* Independence
* Contingency Tables
* What Can Go Wrong?
Slide 7
Independence
In a contingency table, when the distribution of one variable is the
same for all categories of another, the variables are INDEPENDENT:
there is no association between the variables.

Just Checking (pg 28):

          Blue  Brown  G/H/O  Total
Males        6     20      6     32
Females      4     16     12     32
Total       10     36     18     64
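One way to check the definition above against the Just Checking table is to compute the conditional distribution of eye color within each row and compare them. The sketch below is only an illustration, not part of the original slides; the names `counts` and `conditional_distribution` are made up for this example.

```python
# Counts from the "Just Checking" table (eye color by sex).
counts = {
    "Males":   {"Blue": 6, "Brown": 20, "G/H/O": 6},
    "Females": {"Blue": 4, "Brown": 16, "G/H/O": 12},
}

def conditional_distribution(row):
    """Return each category's share of the row total, in percent."""
    total = sum(row.values())
    return {category: 100 * n / total for category, n in row.items()}

# One conditional distribution of eye color per sex.
conditionals = {sex: conditional_distribution(row) for sex, row in counts.items()}

for sex, dist in conditionals.items():
    print(sex, {category: round(pct, 1) for category, pct in dist.items()})
```

The two rows come out different (for example, 18.75% of males but 12.5% of females have blue eyes), so in this sample the distribution of eye color is not the same for both sexes.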
Slide 8
Examining Contingency Tables
Medical researchers followed 6272 Swedish men for 30 years to see if
there was any association between the amount of fish in their diet
and prostate cancer.

                     Prostate Cancer
Fish Consumption       No     Yes
Never/Seldom          110      14
Small part           2420     201
Moderate part        2769     209
Large part            507      42
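To examine this table, one natural step is to compute the percentage of men with prostate cancer within each level of fish consumption. The following is a minimal sketch of that calculation, not the textbook's own worked solution; the variable names are chosen for this example only.

```python
# (no cancer, cancer) counts from the slide's table.
table = {
    "Never/Seldom":  (110, 14),
    "Small part":    (2420, 201),
    "Moderate part": (2769, 209),
    "Large part":    (507, 42),
}

# Sanity check: the cells should add up to the 6272 men in the study.
total_men = sum(no + yes for no, yes in table.values())

# Conditional percentage with prostate cancer within each diet group.
cancer_rate = {
    diet: 100 * yes / (no + yes) for diet, (no, yes) in table.items()
}

print("total men:", total_men)
for diet, rate in cancer_rate.items():
    print(f"{diet:<14} {rate:.1f}% with prostate cancer")
```

The rates are roughly 11.3%, 7.7%, 7.0%, and 7.7%. The Never/Seldom group stands out, but it contains only 124 men, so the difference should be read with caution.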
Slide 9
Process
Think: State the problem. Identify the variables and the Ws. Check
any conditions.
Show: Mechanics (crunch numbers and make displays).
Tell: Conclusion. Interpret the patterns in the table and displays in
context. Discuss possible real-world consequences. Be careful not to
overstate what you see.
Slide 10
What Can Go Wrong?
* Do NOT violate the area principle. 3-D graphs and graphs shown at
an angle are fun but not accurate.
* Keep it honest. Pie charts should have a total of 100%.
* Be careful with percentages that sound similar: the percentage of
the passengers who were both in first class and survived vs. the
percentage of first-class passengers who survived.
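The difference between two similar-sounding percentages comes down to which total goes in the denominator. The slides do not list the Titanic counts, so the sketch below swaps in the eye-color table from Slide 7 as stand-in data; the variable names are invented for this example.

```python
# Counts from the Slide 7 eye-color table (used as stand-in data).
males_blue = 6      # individuals who are both male and blue-eyed
males_total = 32    # all males
grand_total = 64    # everyone in the table

# Percentage of ALL individuals who are both male and blue-eyed.
pct_both = 100 * males_blue / grand_total

# Percentage of MALES who are blue-eyed (conditional on being male).
pct_given_male = 100 * males_blue / males_total

print(f"male and blue-eyed, out of everyone: {pct_both:.1f}%")
print(f"blue-eyed, out of males only: {pct_given_male:.1f}%")
```

The same count of 6 gives 9.375% in one case and 18.75% in the other, because the first divides by the grand total and the second divides by the row total.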
Slide 11
Slide 12
What Can Go Wrong?
* When looking at contingency tables or conditional distributions, be
sure to look at the variables individually as well.
* Be sure there are enough individuals for each category. "We found
that 66.7% of the rats improved their performance with training. The
other rat died."
* Don't overstate your case. Independence is an important concept,
but it is rare for two variables to be entirely independent. We
cannot conclude that one variable has no effect whatsoever on
another. Usually all we know is that little effect was observed in
our study.
Slide 13
What Can Go Wrong?
* Don't use unfair or silly averages. Averages can be misleading.
* When averaging across different variables, be sure that the
quantities you're averaging are comparable.
Slide 14
Simpson's Paradox
When averages are taken across different groups, they can appear to
contradict the overall averages.
Moral:
* Be careful when you average across different levels of a second
variable.
* It's always better to compare percentages or other averages within
each level of the other variable.
* The overall averages may be misleading.
Slide 15
It's the last inning of an important game. Your team is a run down
with the bases loaded and two outs. The pitcher is due up, so you'll
be sending in a pinch-hitter. There are 2 batters available on the
bench. Who should you send in to bat?

Player  Overall
A       33 for 103
B       45 for 151
Slide 16
Now who would you choose?

Player  Overall     vs LHP     vs RHP
A       33 for 103  28 for 81   5 for 22
B       45 for 151  12 for 32  33 for 119

Pooling the data together loses important information and leads to
the wrong conclusion. We should always take into account any factors
that might matter.
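The reversal in the pinch-hitter table can be verified by computing each batting average directly. This is a small illustrative sketch using only the numbers on the slide; the names `records`, `average`, and `overall` are hypothetical helpers for this example.

```python
# (hits, at-bats) from the slide, split by pitcher handedness.
records = {
    "A": {"vs LHP": (28, 81), "vs RHP": (5, 22)},
    "B": {"vs LHP": (12, 32), "vs RHP": (33, 119)},
}

def average(hits, at_bats):
    """Batting average: hits divided by at-bats."""
    return hits / at_bats

def overall(splits):
    """Pool hits and at-bats across all splits, then take the average."""
    hits = sum(h for h, _ in splits.values())
    at_bats = sum(ab for _, ab in splits.values())
    return average(hits, at_bats)

for player, splits in records.items():
    line = ", ".join(f"{k}: {average(*v):.3f}" for k, v in splits.items())
    print(f"{player}  overall: {overall(splits):.3f}, {line}")
```

Player A has the higher pooled average, yet Player B has the higher average against both left-handed and right-handed pitchers. The pooled figures differ mainly because the two players faced very different mixes of pitchers.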