Ogive, Stem and Leaf plot & Crosstabulation

Ogive

An ogive is a graph of a cumulative distribution.

The data values are shown on the horizontal axis.

Shown on the vertical axis are the:

• cumulative frequencies, or

• cumulative relative frequencies, or

• cumulative percent frequencies

The cumulative frequency of each class (in one of the three forms above) is plotted as a point at the upper boundary of that class.

The plotted points are connected by straight lines.
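The construction above can be sketched in a few lines of Python. This is only an illustration: the class limits and frequencies are hypothetical values that do not appear on the slides, chosen so that the labeled point (89.5, 76) from the example ogive is reproduced, and `ogive_points` is a helper name made up for this sketch.

```python
from itertools import accumulate

def ogive_points(upper_limits, frequencies, gap=1):
    """Return (x, cumulative percent frequency) pairs, one per class."""
    total = sum(frequencies)
    cumulative = accumulate(frequencies)  # running totals of the class counts
    # Each point sits halfway between a class's upper limit and the next
    # class's lower limit (e.g. 89.5 for the class 80-89).
    return [(u + gap / 2, 100 * c / total)
            for u, c in zip(upper_limits, cumulative)]

# Hypothetical class data (assumption, not taken from the slides).
upper_limits = [59, 69, 79, 89, 99, 109]
frequencies = [2, 13, 16, 7, 7, 5]

points = ogive_points(upper_limits, frequencies)
for x, pct in points:
    print(f"({x}, {pct:.0f})")
```

A plotting tool would then join these points with straight lines, typically starting from 0% at the lower boundary of the first class.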

Example of an Ogive

[Figure: Ogive with Cumulative Percent Frequencies. Horizontal axis: Parts Cost ($), ticks from 50 to 110. Vertical axis: Cumulative Percent Frequency, ticks from 20 to 100. Labeled point: (89.5, 76).]

Stem and Leaf Plots

1. Sort data ***
2. Round data (if necessary)
3. Create TWO new columns (stem and leaf)
4. Put “stem” in one column and “leaves” in another.
5. Format the leaves column to be left-aligned.
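The steps above can be sketched in Python as follows. This is a minimal illustration, assuming non-negative values whose last digit serves as the leaf; the function name `stem_and_leaf` and the sample data are made up for the example and do not come from the slides.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return one text row per stem; the last digit of each value is the leaf."""
    rows = defaultdict(list)
    # Steps 1-2: sort the data, then round each value to a whole number.
    for v in (round(x) for x in sorted(values)):
        # Steps 3-4: split into stem (leading digits) and leaf (last digit).
        stem, leaf = divmod(v, 10)
        rows[stem].append(str(leaf))
    # Step 5: leaves are written left-aligned after the stem.
    return [f"{stem:>3} | {' '.join(leaves)}"
            for stem, leaves in sorted(rows.items())]

# Hypothetical sample data (assumption, not from the slides).
sample = [68, 52, 91, 73, 57, 61, 84, 68, 79, 75]
for row in stem_and_leaf(sample):
    print(row)
```

Stems with no observations are simply skipped in this sketch; a fuller version might print them with an empty leaf list.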

What we have done

Summary of variables

Qualitative:

Numeric: frequency, relative frequency, percentage frequency, cumulative frequency, cumulative relative frequency, cumulative percentage

Graphical: bar (column) chart, pie chart

What we have done II

Quantitative:

Numeric: frequency, relative frequency, percentage frequency, cumulative frequency, cumulative relative frequency, cumulative percentage

Graphical: histogram, stem and leaf, ogive, boxplot

Another thing of interest to statisticians

Relationship between variables

Variables: quantitative, qualitative

Relationship between variables

Qualitative vs. qualitative: Crosstabulation

Qualitative vs. quantitative: ANOVA etc.

Quantitative vs. quantitative: Regression etc.

Example of Crosstab

Sum of count   factor b
factor a       1     2     3     4     5     Grand Total
1              10    20    36    32    51    149
2              69    87    52    32    12    252
3              14    62    32    53    83    244
4              69    91    92    20    25    297
Grand Total    162   260   212   137   171   942
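As a quick check of how the "Grand Total" row and column arise, here is a small Python sketch that recomputes the margins from the cell counts in the table above. The variable names are illustrative only.

```python
# Cell counts from the crosstab: rows are factor a (1-4), columns are factor b (1-5).
cells = [
    [10, 20, 36, 32, 51],
    [69, 87, 52, 32, 12],
    [14, 62, 32, 53, 83],
    [69, 91, 92, 20, 25],
]

row_totals = [sum(row) for row in cells]        # right-hand margin
col_totals = [sum(col) for col in zip(*cells)]  # bottom margin
grand_total = sum(row_totals)

print("Row totals:   ", row_totals)
print("Column totals:", col_totals)
print("Grand total:  ", grand_total)
```

The row totals and the column totals each add up to the same grand total, which is a handy consistency check on any crosstab.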

What does a crosstab tell us?

Cross Tabs: a tabular summary of data for two variables

Marginal Distributions/Probabilities: totals/probabilities in the margins of the cross tabulation.

An example that makes more sense

Sum of Count      Win
Ginobili Played   N     Y     Total
N                 16    22    38
Y                 12    32    44
Total             28    54    82

Marginal Distributions

Ginobili’s game play distribution: Played: 44; Missed: 38

Spurs’ season breakdown: Win: 54; Lose: 28

Marginal Probabilities

Ginobili’s chance of playing: 44/82 ≈ 53.7%

Spurs’ winning percentage: 54/82 ≈ 65.9%

Marginal probability = row (column) total / grand total

Some other Probabilities

Conditional Probability: cell count / row (column) total
E.g. Spurs’ winning percentage when Ginobili played: 32/44

Joint Probability: cell count / grand total
E.g. The percentage of games that Spurs won and Ginobili played: 32/82
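The three kinds of probability can be computed directly from the Spurs table. The following Python sketch uses the cell counts from that table; the dictionary layout and variable names are choices made for this illustration.

```python
# counts[played][win], using the cell counts from the Spurs crosstab.
counts = {
    "N": {"N": 16, "Y": 22},  # games the player missed
    "Y": {"N": 12, "Y": 32},  # games the player played
}

grand_total = sum(sum(row.values()) for row in counts.values())
played_total = sum(counts["Y"].values())             # row total
wins_total = sum(row["Y"] for row in counts.values())  # column total

p_played = played_total / grand_total               # marginal
p_win = wins_total / grand_total                    # marginal
p_win_given_played = counts["Y"]["Y"] / played_total  # conditional
p_win_and_played = counts["Y"]["Y"] / grand_total     # joint

print(f"P(played)        = {p_played:.3f}")
print(f"P(win)           = {p_win:.3f}")
print(f"P(win | played)  = {p_win_given_played:.3f}")
print(f"P(win and played) = {p_win_and_played:.3f}")
```

Note that the conditional and joint probabilities share the same numerator (the cell count, 32) and differ only in the denominator.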

Crosstab

Example cont.

Components of the table

         Column 1         Column 2         Column 3         Total
Row 1    Cell count       Cell count       Cell count       Row 1 total
Row 2    Cell count       Cell count       Cell count       Row 2 total
Row 3    Cell count       Cell count       Cell count       Row 3 total
Total    Column 1 total   Column 2 total   Column 3 total   Grand Total

Probabilities From Crosstab

Marginal, joint and conditional

Marginal probability: row (column) total / grand total

Joint probability: cell count / grand total

Conditional probability: cell count / row (column) total

What is the percentage of all patients who received a CHEAP positive test result? Is this a joint, marginal, or conditional percentage?

Marginal: 37.0%

Out of all the patients given the CHEAP test, what is the percentage of false negatives? Is this a joint, marginal, or conditional percentage?

Joint: 2% (this is where CHEAP is negative, but actual SFI is positive)

What is the percentage of subjects diagnosed as positive by BOTH tests? Is this a joint, marginal, or conditional percentage?

Joint: 30%.

What is the percentage of correct diagnosis?

(30 + 61)/100 = 91%. That counts correct diagnoses of both positive AND negative cases.

If someone gets the test result and it is “positive”, what is the chance that this person really has the disease?

30/37 = 81% (conditional)

That means there is still a 19% chance that this person does not have the disease.
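The answers to the CHEAP test questions can be reproduced with a short Python sketch. The underlying table did not survive on these pages, so the cell values below are an assumption: 30, 2 and 61 are quoted in the answers, and 7 is inferred as 37 − 30 so that the cells sum to 100.

```python
# Percent of all patients, keyed by (CHEAP result, actual SFI status).
# 30, 2 and 61 are quoted in the answers; 7 is inferred as 37 - 30 (assumption).
cells = {
    ("pos", "pos"): 30,  # both positive
    ("pos", "neg"): 7,   # false positive (inferred)
    ("neg", "pos"): 2,   # false negative
    ("neg", "neg"): 61,  # both negative
}

total = sum(cells.values())
cheap_positive = sum(v for (cheap, _), v in cells.items() if cheap == "pos")

marginal_cheap_pos = 100 * cheap_positive / total                            # marginal
joint_false_neg = 100 * cells[("neg", "pos")] / total                        # joint
correct = 100 * (cells[("pos", "pos")] + cells[("neg", "neg")]) / total
actual_pos_given_cheap_pos = 100 * cells[("pos", "pos")] / cheap_positive    # conditional

print(f"CHEAP positive (marginal):          {marginal_cheap_pos:.1f}%")
print(f"False negatives (joint):            {joint_false_neg:.1f}%")
print(f"Correct diagnoses:                  {correct:.1f}%")
print(f"Diseased given positive (conditional): {actual_pos_given_cheap_pos:.1f}%")
```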

Check this one out!

Homicide convictions in the state of Florida between 1976 and 1980. Did the convicted person get a death sentence? Is there a racial bias?

Defendant   YES   NO    Total   (% YES)
White       39    308   347     11.2%
Black       32    345   377     8.5%
Total       71    653   724     9.8%

The other side of the story ii.

Table for those cases involving white victims

Defendant   YES   NO    Total   (% YES)
White       39    279   318     12.3%
Black       29    121   150     19.3%
Total       68    400   468     14.5%

The other side of the story i.

Table for those cases involving black victims

Defendant   YES   NO    Total   (% YES)
White       0     29    29      0%
Black       3     224   227     1.3%
Total       3     253   256     1.2%

This is what we call Simpson’s Paradox in statistics

Simpson’s paradox refers to the reversal in the direction of an X versus Y relationship when controlling for a third variable Z.
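The reversal can be checked numerically with the Florida tables. The Python sketch below uses the counts from those tables, taking X as the defendant's race, Y as the death sentence and Z as the victim's race; the data layout and names are choices made for this illustration.

```python
# (death sentences, convictions) keyed by victim race, then defendant race.
cases = {
    "white victims": {"White": (39, 318), "Black": (29, 150)},
    "black victims": {"White": (0, 29), "Black": (3, 227)},
}

def pct_yes(yes, total):
    """Percentage of convictions that led to a death sentence."""
    return 100 * yes / total

# Within each level of Z (victim race), compare the two defendant groups.
for victims, by_defendant in cases.items():
    for defendant, (yes, total) in by_defendant.items():
        print(f"{victims:14} {defendant}: {pct_yes(yes, total):.1f}%")

# Pool over Z: add up the counts for each defendant group.
pooled = {}
for defendant in ("White", "Black"):
    yes = sum(cases[v][defendant][0] for v in cases)
    total = sum(cases[v][defendant][1] for v in cases)
    pooled[defendant] = (yes, total)
    print(f"pooled         {defendant}: {pct_yes(yes, total):.1f}%")
```

Within each victim group the rate is higher for Black defendants, yet the pooled rate is higher for White defendants, which is the reversal the definition describes.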

Another Example

Numbers of flights on time and delayed for two airlines at five airports in June 1991.

Alaska Airlines                 America West Airlines
On Time   Delayed   Delay %     On Time   Delayed   Delay %
3274      501       13.3%       6438      787       10.9%

Another Example (contd)

                Alaska Airlines                 America West Airlines
                On Time   Delayed   Delay %     On Time   Delayed   Delay %
L.A.            497       62        11.1%       694       117       14.4%
Phoenix         221       12        5.2%        4840      415       7.9%
San Diego       212       20        8.6%        383       65        14.5%
San Francisco   503       102       16.9%       320       129       28.7%
Seattle         1841      305       14.2%       201       61        23.3%
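To see where the reversal in the airline example comes from, the per-airport counts can be pooled and compared. This Python sketch uses the on-time and delayed counts from the table above; the airline labels used as dictionary keys and the helper `delay_pct` are choices made for this illustration.

```python
# (on time, delayed) per airport, copied from the per-airport table.
flights = {
    "Alaska": {
        "L.A.": (497, 62), "Phoenix": (221, 12), "San Diego": (212, 20),
        "San Francisco": (503, 102), "Seattle": (1841, 305),
    },
    "America West": {
        "L.A.": (694, 117), "Phoenix": (4840, 415), "San Diego": (383, 65),
        "San Francisco": (320, 129), "Seattle": (201, 61),
    },
}

def delay_pct(on_time, delayed):
    """Percentage of flights that were delayed."""
    return 100 * delayed / (on_time + delayed)

totals = {}
for airline, airports in flights.items():
    on_time = sum(o for o, _ in airports.values())
    delayed = sum(d for _, d in airports.values())
    totals[airline] = (on_time, delayed)
    # The busiest airport dominates the pooled delay rate.
    busiest = max(airports, key=lambda a: sum(airports[a]))
    share = 100 * sum(airports[busiest]) / (on_time + delayed)
    print(f"{airline}: {delay_pct(on_time, delayed):.1f}% delayed overall; "
          f"{share:.1f}% of its flights are at {busiest}")
```

One airline has the lower delay rate at every single airport but the higher rate overall, because most of its flights are at an airport where delays are common while the other airline flies mostly from an airport where delays are rare.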