
Descriptive Statistics - Lesson 2 - Ryan Safner (ryansafner.com/teaching/ECMG212S2017/2. Descriptive Statistics.pdf)


Descriptive Statistics
Lesson 2

Ryan Safner

Department of Economics, Hood College

ECMG 212 - Statistics for Business and Economics
Spring 2017

Ryan Safner (Hood College) ECMG 212 - Lesson 2 Fall 2016 1 / 95


Lesson Plan

1 Describing Categorical Data

2 Describing Quantitative Data
    Measures of Central Tendency
    Measures of Locating Data
    Measures of Spread


Variables and Distributions

Every variable has a distribution: the different individual values it takes and how often it takes on each of them

We often want to display this distribution in a useful way to search for interesting patterns


Frequency Tables

A frequency table organizes data by recording counts or relative frequencies for categories

Count: the total number of occurrences for a category

Relative frequency: the proportion or percentage of a category occurring relative to all categories

$$RF\,(\%) = \frac{\text{Count of Category}}{\text{Total Count}} \; (\times 100\%)$$


Example

The ads that air during the Super Bowl are very expensive: a 30-second ad during the 2013 Super Bowl cost about $4M. Polls often ask whether respondents are more interested in the game or the commercials. Here are 40 responses from one such poll:


Responses to Survey on Super Bowl

Response                 Count   Percentage
Commercials                  8          20%
Game                        18          45%
Won’t Watch                 12          30%
No Answer/Don’t Know         2           5%
Total                       40         100%
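The percentages in this table follow directly from the relative frequency formula. A minimal sketch in Python, using the counts from the table (the variable names are illustrative, not part of the lesson):

```python
# Counts taken from the Super Bowl survey table.
counts = {
    "Commercials": 8,
    "Game": 18,
    "Won't Watch": 12,
    "No Answer/Don't Know": 2,
}

total = sum(counts.values())

# Relative frequency (%) = count of category / total count * 100
relative_freq = {category: 100 * n / total for category, n in counts.items()}

# Print the frequency table: category, count, percentage.
for category, n in counts.items():
    print(f"{category:<22}{n:>5}{relative_freq[category]:>8.0f}%")
print(f"{'Total':<22}{total:>5}{sum(relative_freq.values()):>8.0f}%")
```

The relative frequencies always sum to 100%, which is a quick check that no category was left out or counted twice.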


Displaying Data

Three rules of data analysis:

1 Make a graph
2 Make a graph
3 Make a graph

The Area principle: the area occupied by a part of the graph should correspond to the magnitude of the value it represents


Pie Graph

A pie graph represents categories as wedges in a circle proportional to the relative frequency of that category

Wedges can be counts or relative frequencies


Bar Graph

A bar graph represents categories as bars with lengths proportional to the relative frequency of that category

Bars can be counts or relative frequencies
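As a rough illustration of bars with lengths proportional to relative frequency, the sketch below draws a text bar graph of the Super Bowl survey counts. The helper name `bar_lengths` and the scale of one mark per 5 percentage points are assumptions chosen for this example.

```python
# Counts taken from the Super Bowl survey table.
counts = {
    "Commercials": 8,
    "Game": 18,
    "Won't Watch": 12,
    "No Answer/Don't Know": 2,
}


def bar_lengths(counts, percent_per_mark=5):
    """Return a bar length for each category, proportional to its relative frequency."""
    total = sum(counts.values())
    return {c: round(100 * n / total / percent_per_mark) for c, n in counts.items()}


lengths = bar_lengths(counts)

# Draw one horizontal bar per category.
for category, length in lengths.items():
    print(f"{category:<22}{'#' * length}")
```

Because every bar uses the same scale, a category with twice the relative frequency gets a bar twice as long, which is what the area principle asks for.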


Categorical Data and Graphs

Pie graphs and bar graphs are only valid for categorical data!

They can only represent counts or frequencies of different categories

Make sure that categories do not overlap: overlapping categories make the graph misleading


Comparing Two Variables: Contingency Tables

We can see how two categorical variables are related with a contingency table

Shows how individuals are distributed along each variable depending on the value of the other variable


Example

                        Sex
Response          Female   Male   Total
Game                 198    277     475
Commercials          154     79     233
Won’t Watch          160    132     292
NA/Don’t Know          4      4       8
Total                516    492    1008

Each cell in a table gives the count for the combination of values of both variables


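The marginal distribution of a variable is the distribution of the total counts of that variable’s values alone

Focuses on the margins of the table (the Total row and Total column, in bold on the slide)

Marginal distribution of Response: the Total column (475, 233, 292, 8)
Marginal distribution of Sex: the Total row (516, 492)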
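The marginal distributions can be recovered by summing the cells of the contingency table across rows or down columns. A minimal sketch, with the cell counts copied from the table and a nested dictionary chosen here only for illustration:

```python
# Cell counts from the contingency table: response -> sex -> count.
table = {
    "Game": {"Female": 198, "Male": 277},
    "Commercials": {"Female": 154, "Male": 79},
    "Won't Watch": {"Female": 160, "Male": 132},
    "NA/Don't Know": {"Female": 4, "Male": 4},
}

# Marginal distribution of Response: sum each row (the Total column).
response_marginal = {resp: sum(by_sex.values()) for resp, by_sex in table.items()}

# Marginal distribution of Sex: sum each column (the Total row).
sex_marginal = {}
for by_sex in table.values():
    for sex, n in by_sex.items():
        sex_marginal[sex] = sex_marginal.get(sex, 0) + n

grand_total = sum(response_marginal.values())

print(response_marginal)
print(sex_marginal)
print(grand_total)
```

Both marginal distributions add up to the same grand total, since each respondent is counted exactly once in either one.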


The conditional distribution of a variable is the distribution of values a variable takes conditional on another variable taking on a specific value

Conditional distribution of responses for females
Conditional distribution of sex for non-watchers
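A conditional distribution restricts attention to one column (or one row) of the table and divides by that column's (or row's) total. The sketch below computes the two conditional distributions named here; the function names are hypothetical helpers written for this example.

```python
# Cell counts from the contingency table: response -> sex -> count.
table = {
    "Game": {"Female": 198, "Male": 277},
    "Commercials": {"Female": 154, "Male": 79},
    "Won't Watch": {"Female": 160, "Male": 132},
    "NA/Don't Know": {"Female": 4, "Male": 4},
}


def responses_given_sex(table, sex):
    """Distribution of responses among respondents of one sex (a column of the table)."""
    column_total = sum(row[sex] for row in table.values())
    return {resp: row[sex] / column_total for resp, row in table.items()}


def sex_given_response(table, response):
    """Distribution of sex among respondents giving one response (a row of the table)."""
    row = table[response]
    row_total = sum(row.values())
    return {sex: n / row_total for sex, n in row.items()}


females = responses_given_sex(table, "Female")
non_watchers = sex_given_response(table, "Won't Watch")

for resp, share in females.items():
    print(f"{resp:<15}{100 * share:6.1f}% of females")
for sex, share in non_watchers.items():
    print(f"{sex:<15}{100 * share:6.1f}% of non-watchers")
```

Each conditional distribution sums to 100% on its own, because the denominator is the size of the group being conditioned on rather than the whole sample.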


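277 men plan to watch the game. What percentage is this?

The answer depends on the denominator: column percent (of all men) vs. row percent (of all who will watch the game) vs. total percent (of all respondents)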
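The same cell count gives three different percentages depending on which total it is divided by. A short sketch using the totals from the table (variable names are illustrative):

```python
cell = 277          # men who plan to watch the game
male_total = 492    # column total: all men
game_total = 475    # row total: everyone who will watch the game
grand_total = 1008  # all respondents

column_percent = 100 * cell / male_total   # share of men who will watch the game
row_percent = 100 * cell / game_total      # share of game-watchers who are men
total_percent = 100 * cell / grand_total   # share of all respondents

print(f"Column percent: {column_percent:.1f}%")
print(f"Row percent:    {row_percent:.1f}%")
print(f"Total percent:  {total_percent:.1f}%")
```

Stating which of the three is meant avoids ambiguity when reporting a percentage from a contingency table.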


What percent of respondents are men who will watch the game?

What percent of women plan to watch for the commercials?

What percent of those who won’t watch are men?


Comparing Two Variables

Is there an association between the response to the survey and the respondent’s sex, or are the two independent?

Find the conditional distribution of responses by sex, and make a graph


A clustered bar chart allows us to compare the two distributions side by side


A segmented bar chart shows the responses by sex


Simpson’s Paradox

Caution

Comparing percentages that pool different groups can lead to misleading results: a comparison that holds within every group can reverse when the groups are combined (Simpson’s Paradox)

Example

Suppose it’s the last inning of a baseball game, your team is down by 1 with the bases loaded and 2 outs. The pitcher is due up, so you’ll be sending in a pinch-hitter. There are 2 batters available on the bench. Whom should you send in to bat?

Player   Overall      vs LHP      vs RHP
A        33 for 103   28 for 81   5 for 22
B        45 for 151   12 for 32   33 for 119
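Working out the batting averages makes the reversal visible. The sketch below uses the hits and at-bats from the table; the data layout and the helper `overall` are assumptions made for this example.

```python
# (hits, at-bats) for each player against left- and right-handed pitchers.
records = {
    "A": {"vs LHP": (28, 81), "vs RHP": (5, 22)},
    "B": {"vs LHP": (12, 32), "vs RHP": (33, 119)},
}


def overall(splits):
    """Pool the splits into total (hits, at-bats)."""
    hits = sum(h for h, _ in splits.values())
    at_bats = sum(ab for _, ab in splits.values())
    return hits, at_bats


for player, splits in records.items():
    hits, at_bats = overall(splits)
    line = f"{player}: overall {hits / at_bats:.3f}"
    for split, (h, ab) in splits.items():
        line += f", {split} {h / ab:.3f}"
    print(line)
```

Player A has the higher overall average, yet Player B has the higher average against both left-handed and right-handed pitchers. The overall figures differ mainly because the two players faced very different mixes of pitchers.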


Example

Two companies have labor and management classifications of employees. Company A’s laborers have a higher average salary than company B’s, as do Company A’s managers. But overall, company B pays a higher average salary. How can that be? And which is the better way to compare earning potential at the two companies?
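One way this can happen is if the two companies have very different proportions of laborers and managers. The sketch below uses entirely hypothetical head counts and salaries, invented only to illustrate the mechanism; none of these numbers come from the lesson.

```python
# HYPOTHETICAL data for illustration: class -> (number of employees, average salary).
company_a = {"labor": (90, 40_000), "management": (10, 100_000)}
company_b = {"labor": (50, 38_000), "management": (50, 95_000)}


def overall_average(company):
    """Average salary across all employees, weighting each class by its head count."""
    payroll = sum(n * salary for n, salary in company.values())
    employees = sum(n for n, _ in company.values())
    return payroll / employees


avg_a = overall_average(company_a)
avg_b = overall_average(company_b)

print(f"Company A overall average: {avg_a:,.0f}")
print(f"Company B overall average: {avg_b:,.0f}")
```

In this made-up case Company A pays more within each classification, but Company B's overall average is higher because half of its employees are managers. Comparing within classifications is usually more informative for someone who knows which kind of job they would hold.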


Cautions

Open Letter to Kansas School Board


Describing Quantitative Data

Suppose instead we have quantitative data

Example

A class of 13 students takes a quiz out of 100 points with the following results: {0, 62, 66, 71, 71, 74, 76, 79, 83, 86, 88, 93, 95}


Stem-and-Leaf Plots

A stem-and-leaf plot is a quick way of organizing and displaying data (best for small datasets)

Divide each observation into a stem and a leaf, with the leaf containing the final significant digit

e.g. For 53, stem 5, leaf 3
e.g. For 413, stem 41, leaf 3


Example

A class of 13 students takes a quiz out of 100 points with the following results: {0, 62, 66, 71, 71, 74, 76, 79, 83, 86, 88, 93, 95}

 0 | 0
 1 |
 2 |
 3 |
 4 |
 5 |
 6 | 2 6
 7 | 1 1 4 6 9
 8 | 3 6 8
 9 | 3 5
10 |
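The plot for the quiz scores can be built mechanically by splitting each score into its stem and its final digit. A minimal sketch, assuming non-negative whole-number data; the function name `stem_and_leaf` is chosen here for illustration:

```python
def stem_and_leaf(values):
    """Map each stem (all digits but the last) to its sorted leaves (the last digit)."""
    plot = {}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        plot.setdefault(stem, []).append(leaf)
    return plot


scores = [0, 62, 66, 71, 71, 74, 76, 79, 83, 86, 88, 93, 95]
plot = stem_and_leaf(scores)

# Print every stem between the smallest and largest, including empty ones,
# so that gaps in the data stay visible.
for stem in range(min(plot), max(plot) + 1):
    leaves = " ".join(str(leaf) for leaf in plot.get(stem, []))
    print(f"{stem:>2} | {leaves}")
```

Keeping the empty stems is what makes the score of 0 stand out as far away from the rest of the class.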


Example

A sample of residents of Frederick report the distances from their home to their local supermarket (in miles):
{0.5, 1.2, 1.4, 1.4, 1.5, 2.2, 3.7, 4.2, 4.4, 4.4, 8.2}
Create a stem-and-leaf plot.


We can quickly compare two distributions with a side-by-side stem-and-leaf plot

Example

The stock prices of Apple over 10 days are: {320, 340, 333, 321, 332, 333, 351, 329, 301, 339}

The stock prices of Microsoft over 10 days are: {290, 292, 302, 310, 303, 299, 301, 319, 319, 307}

   Apple | Stem | Microsoft
         |  29  | 0 2 9
       1 |  30  | 1 2 3 7
         |  31  | 0 9 9
   0 1 9 |  32  |
 2 3 3 9 |  33  |
       0 |  34  |
       1 |  35  |


Histograms

A more visually-appealing way to present this data is a histogram, the quantitative analogue to a bar graph

We divide up the data into bins of a certain size, and count up the number of values falling within those bins, representing these as bars
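The binning step can be sketched in a few lines. The example below uses the quiz scores from earlier in the lesson; the bin width of 10, the left-closed bins, and the function name `bin_counts` are assumptions for illustration, and statistical software may pick different bins by default.

```python
def bin_counts(values, bin_width, start, stop):
    """Count the values falling in each bin [lo, lo + bin_width).

    Assumes every value lies in [start, stop).
    """
    counts = {(lo, lo + bin_width): 0 for lo in range(start, stop, bin_width)}
    for v in values:
        lo = start + ((v - start) // bin_width) * bin_width
        counts[(lo, lo + bin_width)] += 1
    return counts


scores = [0, 62, 66, 71, 71, 74, 76, 79, 83, 86, 88, 93, 95]
counts = bin_counts(scores, bin_width=10, start=0, stop=110)

# Draw each bin as a bar whose length equals its count.
for (lo, hi), n in counts.items():
    print(f"{lo:>3}-{hi:<3} {'#' * n}")
```

Changing `bin_width` changes how the histogram looks, so it is worth trying a few widths before drawing conclusions about the shape of the distribution.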


Example

A class of 13 students takes a quiz out of 100 points with the following results: {0, 62, 66, 71, 71, 74, 76, 79, 83, 86, 88, 93, 95}

[Figure: histogram of Quiz Grades (x-axis, 0 to 100) against No. of Students (y-axis)]


Note: Excel essentially plots a histogram as a bar graph, by first turning the quantitative data into categorical data (bins)


We can also make a relative frequency (percentage) histogram
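A relative frequency histogram uses the same bins but reports each bin's share of all observations instead of its count. A small sketch with the quiz scores, again assuming a bin width of 10 (an illustrative choice, not stated in the lesson):

```python
from collections import Counter

scores = [0, 62, 66, 71, 71, 74, 76, 79, 83, 86, 88, 93, 95]
bin_width = 10  # assumed bin size

# Label each score by the lower edge of its bin, then count scores per bin.
bin_counts = Counter((s // bin_width) * bin_width for s in scores)

# Convert counts to percentages of all observations.
relative = {lo: 100 * n / len(scores) for lo, n in sorted(bin_counts.items())}

for lo, pct in relative.items():
    print(f"{lo:>3}-{lo + bin_width:<3} {pct:5.1f}%")
```

The bars keep the same relative heights as in the count histogram; only the vertical scale changes.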


 0 | 0
 1 |
 2 |
 3 |
 4 |
 5 |
 6 | 2 6
 7 | 1 1 4 6 9
 8 | 3 6 8
 9 | 3 5
10 |

A stem-and-leaf plot is shaped like a sideways histogram


Quantitative Distributions: Shape

For distributions of quantitative data, we are often interested in their shape, particularly:

Modes
Symmetry
Skewness
Center
Spread
Outliers

These have formal definitions based on probability theory; for now, focus on how a histogram “looks”


Mode

The mode of a variable is its most frequent value

A variable can have more than one mode

Example

A class of 13 students takes a quiz out of 100 points with the following results: {0, 62, 66, 71, 71, 74, 76, 79, 83, 86, 88, 93, 95}
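For the quiz scores, the mode can be found by counting how often each value occurs. A brief sketch using `multimode` from Python's standard `statistics` module, which returns every value tied for most frequent (the second data set is a made-up illustration of a variable with two modes):

```python
from statistics import multimode

scores = [0, 62, 66, 71, 71, 74, 76, 79, 83, 86, 88, 93, 95]

# 71 is the only score that appears twice, so it is the single mode.
modes = multimode(scores)
print(modes)

# Hypothetical data with two equally frequent values, i.e. two modes.
two_modes = multimode([1, 1, 2, 2, 3])
print(two_modes)
```

Returning a list rather than a single number reflects the point that a variable can have more than one mode.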


Looking at the distribution (histogram), the modes are the “peaks” of the distribution

May be unimodal, bimodal, trimodal, etc.


A distribution that does not have any clear mode is uniform


Symmetry

A distribution is symmetric if it looks roughly the same on either side of the “center”


Skewness

The thinner ends of a distribution (far left & far right) are called the tails of the distribution

If one tail stretches farther than the other, the distribution is said to be skewed in the direction of the longer tail


Outliers

An extreme value that does not appear to be part of the general pattern of a distribution is an outlier

Note: Excel essentially plots a bar graph by first turning quantitative data into categorical data


Outliers can strongly affect descriptive statistics about a dataset

Outliers can be the most informative part of the data

Outliers could be the result of errors

Outliers should always be discussed in presentations about data


Arithmetic Mean

The natural measure of the center of a population’s distribution is its “average” or arithmetic mean (µ)

$$\mu = \frac{x_1 + x_2 + \dots + x_N}{N} = \frac{1}{N}\sum_{i=1}^{N} x_i$$

For N values of variable x, “mu” is the sum of all individual x values (x_i) from 1 to N, divided by N, the number of values


When we are dealing with a sample, we compute the sample mean (X̄)

$$\bar{X} = \frac{x_1 + x_2 + \dots + x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

For n values of variable x, “x-bar” is the sum of all individual x values (x_i) from 1 to n, divided by n, the number of values


{0, 62, 66, 71, 71, 74, 76, 79, 83, 86, 88, 93, 95}

$$\text{Mean} = \frac{0+62+66+71+71+74+76+79+83+86+88+93+95}{13} = \frac{944}{13} \approx 72.62$$

Note the mean need not be an actual value of the data!


{62, 66, 71, 71, 74, 76, 79, 83, 86, 88, 93, 95}

If we drop the outlier (0):

$$\text{Mean} = \frac{62+66+71+71+74+76+79+83+86+88+93+95}{12} = \frac{944}{12} \approx 78.67$$

The mean is not robust to outliers!
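
The effect of the single outlier on the mean can be checked directly. This is a small sketch using `statistics.mean` from Python's standard library on the quiz scores above.

```python
from statistics import mean

scores = [0, 62, 66, 71, 71, 74, 76, 79, 83, 86, 88, 93, 95]

# Mean of all 13 scores: 944 / 13.
with_outlier = mean(scores)

# Mean after dropping the outlier (the score of 0): 944 / 12.
without_outlier = mean([s for s in scores if s != 0])

print(round(with_outlier, 2), round(without_outlier, 2))
```

Removing one observation out of 13 moves the mean by about six points, which is what "not robust" means here.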


Median

{0, 62, 66, 71, 71, 74, 76, 79, 83, 86, 88, 93, 95}

The median is the midpoint of the distribution

50% to the left of the median, 50% to the right of the median

Arrange values of data in numerical order

For odd n: median is middle observation

For even n: median is average of two middle observations


{0, 62, 66, 71, 71, 74, 76, 79, 83, 86, 88, 93, 95}

The median is robust to outliers! Replacing the outlier 0 with 62 leaves the median unchanged at 76:

{62, 62, 66, 71, 71, 74, 76, 79, 83, 86, 88, 93, 95}
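
The odd/even rule for the median, and its robustness to the outlier, can be sketched as follows. The function is written out by hand to mirror the rule; Python's `statistics.median` applies the same rule.

```python
def median(values):
    """Middle observation for odd n; average of the two middle ones for even n."""
    ordered = sorted(values)          # arrange the data in numerical order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

scores = [0, 62, 66, 71, 71, 74, 76, 79, 83, 86, 88, 93, 95]
replaced = [62] + scores[1:]          # the outlier 0 replaced by 62
dropped = scores[1:]                  # the outlier removed, so n = 12 is even

print(median(scores), median(replaced), median(dropped))
```

With 13 values the median is the 7th observation (76) whether the lowest score is 0 or 62. With the outlier dropped, n is even and the median is the average of 76 and 79.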


Mean, Median, & Skewness

{1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 6, 6, 7}

For a symmetric distribution, mean=median

Mean: $\frac{64}{16} = 4$

Median: 4


{1, 2, 3, 4, 4, 4, 5, 5, 6, 6, 6, 7, 7}

For a distribution skewed to the left, mean<median

Mean: $\frac{60}{13} \approx 4.6$

Median: 5


{1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 5, 6, 7}

For a distribution skewed to the right, mean>median

Mean: $\frac{44}{13} \approx 3.4$

Median: 3
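
The three small data sets above can be compared in one pass. The sketch below uses the mean-versus-median comparison as a rule of thumb for the direction of skew; it is not a formal test, and the function name `skew_hint` is hypothetical.

```python
from statistics import mean, median

def skew_hint(values):
    """Rule of thumb: mean < median suggests left skew, mean > median right skew."""
    m, med = mean(values), median(values)
    if m < med:
        return "left"
    if m > med:
        return "right"
    return "symmetric"

symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 6, 6, 7]
left = [1, 2, 3, 4, 4, 4, 5, 5, 6, 6, 6, 7, 7]
right = [1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 5, 6, 7]

for data in (symmetric, left, right):
    print(round(mean(data), 1), median(data), skew_hint(data))
```

The mean is pulled toward the longer tail, while the median stays near the bulk of the data, which is why the gap between them points in the direction of the skew.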


Example

A sample of the per capita consumption of gasoline (in gallons) for 10 U.S. states in the year 2017 is given below:
{556, 560, 537, 409, 530, 485, 521, 486, 504, 434}

1 Find the mean

2 Find the median

3 Is this distribution symmetric, skewed to the left, or skewed to the right?
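
One way to work through this example is sketched below, using the standard library. The conclusion about skew relies on the mean-versus-median rule of thumb from the previous slides, so treat it as a suggestion about shape and not a proof.

```python
from statistics import mean, median

gasoline = [556, 560, 537, 409, 530, 485, 521, 486, 504, 434]

gas_mean = mean(gasoline)       # sum is 5022, divided by n = 10
gas_median = median(gasoline)   # n is even: average of the 5th and 6th sorted values

# Mean below median suggests the lower tail is the longer one (left skew).
direction = "left" if gas_mean < gas_median else "right or symmetric"

print(gas_mean, gas_median, direction)
```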


Example

A sample of the GDP growth rates for 11 developed countries in the year 2017 is given below:
{0.05, 0.03, 0.02, 0.01, 0.00, 0.09, 0.11, 0.02, 0.03, 0.04, 0.01}

1 Find the mean

2 Find the median

3 Is this distribution symmetric, skewed to the left, or skewed to the right?


Percentiles

We often care about specific values in the distribution and how they relate to the rest of the distribution

A helpful measure of a data value’s location is its percentile, which measures the percentage of all data that is less than (or equal to) that value


To calculate the kth percentile, after ordering the data in numerical order, calculate:

$$i = \frac{k}{100}(n + 1)$$

Where i is the index (rank or position) of the value & n is the total number of observations

If i comes out to a whole number, the answer is the value at that position

If i is not an integer, round it down and up to the two neighboring positions, and take the average of the values at those positions


Example

The following are a sample of 20 SAT Math scores: {570, 575, 580, 590, 620, 635, 640, 645, 650, 650, 650, 670, 675, 675, 680, 710, 720, 745, 770, 780}

1 Find the 20th percentile

2 Find the 84th percentile
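
The position rule i = k/100 × (n + 1) can be sketched as follows and applied to the SAT sample above. The function name `percentile_value` is hypothetical, k is assumed to be a whole number, and positions that fall outside the data are rejected because the rule as stated does not cover them. Other software may use a different percentile convention and return slightly different values.

```python
def percentile_value(data, k):
    """Value at the k-th percentile using the position rule i = k/100 * (n + 1)."""
    ordered = sorted(data)
    n = len(ordered)
    # Work with 100 * i as an integer so whole-number positions are detected exactly.
    position, remainder = divmod(k * (n + 1), 100)
    if position < 1 or position > n or (remainder and position == n):
        raise ValueError("position falls outside the data")
    if remainder == 0:
        return ordered[position - 1]          # positions are 1-based
    # i is not a whole number: average the values at the two neighboring positions.
    return (ordered[position - 1] + ordered[position]) / 2

sat = [570, 575, 580, 590, 620, 635, 640, 645, 650, 650,
       650, 670, 675, 675, 680, 710, 720, 745, 770, 780]

p20 = percentile_value(sat, 20)   # i = 4.2, so average positions 4 and 5
p84 = percentile_value(sat, 84)   # i = 17.64, so average positions 17 and 18
print(p20, p84)
```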


To find the percentile of a particular data value, after ordering the data in numerical order, calculate:

$$\frac{x + 0.5y}{n} \times 100$$

then round to the nearest integer

x is the number of data values counting from the first up to the value right before the chosen value

y is the number of data values equal to the chosen value

n is the total number of data values


Example

The following are a sample of 20 SAT Math scores: {570, 575, 580, 590, 620, 635, 640, 645, 650, 650, 650, 670, 675, 675, 680, 710, 720, 745, 770, 780}

1 What percentile is a score of 645?

2 What percentile is a score of 675?

3 What percentile is a score of 720?
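
The formula (x + 0.5y)/n × 100 can be sketched as follows for the three scores in this example. The function name `percentile_rank` is hypothetical. The lesson does not say how to round an exact tie such as 37.5, so rounding halves up is an assumption here; Python's built-in `round` would instead round halves to the nearest even integer.

```python
import math

def percentile_rank(data, value):
    """Percentile of `value`: 100 * (x + 0.5 * y) / n, rounded half up."""
    x = sum(1 for d in data if d < value)    # values strictly below the chosen value
    y = sum(1 for d in data if d == value)   # values equal to the chosen value
    pct = 100 * (x + 0.5 * y) / len(data)
    return math.floor(pct + 0.5)             # assumed tie rule: round half up

sat = [570, 575, 580, 590, 620, 635, 640, 645, 650, 650,
       650, 670, 675, 675, 680, 710, 720, 745, 770, 780]

ranks = {score: percentile_rank(sat, score) for score in (645, 675, 720)}
print(ranks)
```

For 675, twelve scores lie below it and two equal it, so the result is (12 + 1)/20 × 100 = 65.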


Quartiles

We can divide up a distribution into four equal quartiles, each comprising a quarter (25%) of the data:

Quartile   Cumulative % of data
1          25%
2          50%
3          75%
4          100%

The 2nd quartile (Q2) is the median

The 1st quartile (Q1) is the median of all the data beneath the median

The 3rd quartile (Q3) is the median of all the data above the median


Measures of Spread

The more variation in the data, the less a measure of central tendency tells us

So in addition to measuring the center, we also want to measure the spread

The simplest way is looking at the range, or the difference between the extremes:

$$\text{Range} = \max - \min$$


Range

Example

{0, 62, 66, 71, 71, 74, 76, 79, 83, 86, 88, 93, 95}

Range = 95 − 0 = 95

Example

{62, 66, 71, 71, 74, 76, 79, 83, 86, 88, 93, 95}

Range = 95 − 62 = 33

Note that the range is not robust to outliers
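
The two range calculations above can be reproduced in a couple of lines. The function name `value_range` is hypothetical.

```python
def value_range(values):
    """Range = maximum minus minimum."""
    return max(values) - min(values)

scores = [0, 62, 66, 71, 71, 74, 76, 79, 83, 86, 88, 93, 95]

full_range = value_range(scores)         # includes the outlier 0
trimmed_range = value_range(scores[1:])  # outlier dropped

print(full_range, trimmed_range)
```

Because the range depends only on the two most extreme observations, a single outlier changes it from 33 to 95.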


Interquartile Range

One helpful measure of spread is the interquartile range, the spread of the middle 50% of the data:

$$IQR = Q_3 - Q_1$$

Example

{0, 62, 66, 71, 71, 74, 76, 79, 83, 86, 88, 93, 95}

Median = 76

Q1 = 71

Q3 = 86

IQR = 86 − 71 = 15
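
Software and textbooks use several quartile conventions, so it helps to check which one reproduces the numbers above. In the sketch below, Python's `statistics.quantiles` with `method="inclusive"` returns Q1 = 71 and Q3 = 86 for the quiz scores, matching this example. With `method="exclusive"` it returns 68.5 and 87, which for this data set equals the median of the six values below the median and the median of the six values above it. Which convention the lesson intends is not stated, so treat the comparison as an illustration.

```python
from statistics import quantiles

scores = [0, 62, 66, 71, 71, 74, 76, 79, 83, 86, 88, 93, 95]

# n=4 asks for the three cut points Q1, Q2 (the median), and Q3.
q1_inc, q2_inc, q3_inc = quantiles(scores, n=4, method="inclusive")
q1_exc, q2_exc, q3_exc = quantiles(scores, n=4, method="exclusive")

iqr_inclusive = q3_inc - q1_inc
iqr_exclusive = q3_exc - q1_exc

print(q1_inc, q3_inc, iqr_inclusive)
print(q1_exc, q3_exc, iqr_exclusive)
```

Both conventions agree on the median (76); they differ only in where the first and third quartiles are placed, so the IQR can differ as well.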


Five-Number Summary

Once we know the values of the quartiles, we can construct a five-number summary of a distribution, including:

1 Minimum
2 Q1
3 Median
4 Q3
5 Maximum


Example

{0, 62, 66, 71, 71, 74, 76, 79, 83, 86, 88, 93, 95}

Min Q1 Median Q3 Max

0 71 76 86 95


Boxplots

The graphical way to present the five-number summary is a boxplot (or a “box-and-whisker plot”)

The length of the box is the IQR (Q3 − Q1)

The line within the box is the median

The “whiskers” identify data within 1.5 × IQR of the box, i.e. between Q1 − 1.5 × IQR and Q3 + 1.5 × IQR

Points beyond the whiskers are outliers

[Figure: annotated boxplot labeling Q1, the median, Q3, Q1 − 1.5 × IQR, Q3 + 1.5 × IQR, and an outlier]


Boxplots

Quiz 1: {0, 62, 66, 71, 71, 74, 76, 79, 83, 86, 88, 93, 95}
Quiz 2: {50, 62, 72, 73, 79, 81, 82, 82, 86, 90, 94, 98, 99}

          Min   Q1   Median   Q3   Max
Quiz 1      0   71       76   86    95
Quiz 2     50   73       82   90    99
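The five-number summaries above can be reproduced with a few lines of Python. This is a minimal sketch, not the method used to build the slide: the helper name `five_number_summary` is made up for illustration, and the `method="inclusive"` quartile convention is an assumption chosen because it matches the quartiles shown here. Other quartile conventions can give slightly different Q1 and Q3.

```python
from statistics import quantiles

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    # Assumption: the "inclusive" convention, which matches the slide's quartiles.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    return min(data), q1, median, q3, max(data)

quiz1 = [0, 62, 66, 71, 71, 74, 76, 79, 83, 86, 88, 93, 95]
quiz2 = [50, 62, 72, 73, 79, 81, 82, 82, 86, 90, 94, 98, 99]

print("Quiz 1:", five_number_summary(quiz1))
print("Quiz 2:", five_number_summary(quiz2))
```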


Boxplots

Quiz 1: {0, 62, 66, 71, 71, 74, 76, 79, 83, 86, 88, 93, 95}
Quiz 2: {50, 62, 72, 73, 79, 81, 82, 82, 86, 90, 94, 98, 99}

[Figure: side-by-side boxplots of Quiz 1 and Quiz 2; horizontal axis: Quiz, vertical axis: Scores (0 to 100)]

Boxplots are great for quickly comparing multiple datasets
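To see which quiz scores a boxplot would flag, the 1.5 × IQR rule can be applied directly to both datasets. The sketch below is illustrative only: `outliers_by_iqr` is a hypothetical helper, and the quartile convention (`method="inclusive"`) is an assumption.

```python
from statistics import quantiles

def outliers_by_iqr(data, k=1.5):
    """Return (lower_fence, upper_fence, outliers) using the k * IQR rule."""
    q1, _, q3 = quantiles(data, n=4, method="inclusive")  # assumed quartile convention
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    outliers = [x for x in data if x < lower_fence or x > upper_fence]
    return lower_fence, upper_fence, outliers

quiz1 = [0, 62, 66, 71, 71, 74, 76, 79, 83, 86, 88, 93, 95]
quiz2 = [50, 62, 72, 73, 79, 81, 82, 82, 86, 90, 94, 98, 99]

for name, scores in (("Quiz 1", quiz1), ("Quiz 2", quiz2)):
    print(name, outliers_by_iqr(scores))
```

Under these assumptions the score of 0 on Quiz 1 falls below the lower fence of 48.5 and is the only outlier. Quiz 2 has none.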


Boxplots

Boxplots for daily AIG closing stock price


Boxplots

Alternate way of constructing a boxplot: extend the “whiskers” from Q1 to the minimum and from Q3 to the maximum

But this is a less rigorous way of discovering outliers

Your textbook uses this method, as does MS Excel
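The difference between the two constructions shows up in where the whiskers end. The sketch below compares them on the Quiz 1 scores. `whisker_ends` is a hypothetical helper, and ending the fence-based whiskers at the most extreme observations inside the fences is an assumption about how the plot is drawn.

```python
from statistics import quantiles

def whisker_ends(data, use_fences=True, k=1.5):
    """Return (low, high) whisker endpoints for a boxplot."""
    if not use_fences:
        # Alternate convention: whiskers run all the way to the extremes.
        return min(data), max(data)
    q1, _, q3 = quantiles(data, n=4, method="inclusive")  # assumed quartile convention
    iqr = q3 - q1
    inside = [x for x in data if q1 - k * iqr <= x <= q3 + k * iqr]
    # Assumption: whiskers stop at the most extreme values still inside the fences.
    return min(inside), max(inside)

quiz1 = [0, 62, 66, 71, 71, 74, 76, 79, 83, 86, 88, 93, 95]

print(whisker_ends(quiz1))                    # fence-based whiskers
print(whisker_ends(quiz1, use_fences=False))  # min-to-max whiskers
```

With min-to-max whiskers the score of 0 is absorbed into the lower whisker, so nothing is marked as an outlier.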


Deviations

Each observation deviates from the mean of the data:

\[ \text{deviation} = x_i - \mu \]

There are as many deviations as there are data points (n)

We can measure the average or standard deviation from the mean
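As a small illustration, the deviations can be computed for a five-value dataset (the same values as this lesson's worked example). The variable names are illustrative. Note that the raw deviations always sum to zero, so a plain average of them says nothing about spread.

```python
from statistics import mean

data = [-10, 0, 10, 20, 30]
mu = mean(data)

# One deviation per data point: n values in, n deviations out.
deviations = [x - mu for x in data]

print(deviations)
print(sum(deviations))  # positive and negative deviations cancel out
```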


Variance

The population variance (\(\sigma^2\)) of a population distribution measures the average of the squared deviations from the population mean

\[ \sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} \]

Why do we square deviations?


Standard Deviation

Take the square root of the variance to get the population standard deviation (\(\sigma\)), a measure of the typical deviation from the mean (in \(x\) units)

\[ \sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}} \]
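The population formulas translate almost line for line into code. In this sketch the function name `population_variance` is made up, and the five values are treated as an entire population purely for illustration.

```python
from math import sqrt

def population_variance(values):
    """Average squared deviation from the population mean (divide by N)."""
    big_n = len(values)
    mu = sum(values) / big_n
    return sum((x - mu) ** 2 for x in values) / big_n

# Assumption: these five values are the whole population.
population = [-10, 0, 10, 20, 30]

sigma_sq = population_variance(population)
sigma = sqrt(sigma_sq)  # back in the original units of x

print(sigma_sq, round(sigma, 2))
```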


Variance

The sample variance (\(s^2\)) of a sample distribution measures the average of the squared deviations from the sample mean

\[ s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} \]

Why divide by n − 1?
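One standard answer, which the extracted slide does not spell out, is that dividing by n systematically underestimates the population variance, because deviations are measured from the sample mean rather than from µ. The sketch below checks this by listing every possible sample of size 2 (drawn with replacement) from a small illustrative population. The population and all names are assumptions made for the demonstration.

```python
from itertools import product

population = [-10, 0, 10, 20, 30]  # illustrative population
big_n = len(population)
mu = sum(population) / big_n
sigma_sq = sum((x - mu) ** 2 for x in population) / big_n  # true population variance

def sum_sq_dev(sample):
    """Sum of squared deviations from the sample's own mean."""
    x_bar = sum(sample) / len(sample)
    return sum((x - x_bar) ** 2 for x in sample)

n = 2
# Every equally likely sample of size n, drawn with replacement.
samples = list(product(population, repeat=n))

avg_div_n = sum(sum_sq_dev(s) / n for s in samples) / len(samples)
avg_div_n_minus_1 = sum(sum_sq_dev(s) / (n - 1) for s in samples) / len(samples)

print(sigma_sq, avg_div_n, avg_div_n_minus_1)
```

Averaged over all samples, the n − 1 version recovers the population variance, while the divide-by-n version comes out too small.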


Standard Deviation

Take the square root of the variance to get the sample standard deviation (\(s\)), a measure of the typical deviation from the mean (in \(x\) units)

\[ s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} \]


Descriptive Statistics: Population vs. Sample

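Population Parameters

Population Size: \(N\)

Mean: \(\mu\)

Variance: \(\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2\)

Standard Deviation: \(\sigma = \sqrt{\sigma^2}\)

Sample Statistics

Sample Size: \(n\)

Mean: \(\bar{x}\)

Variance: \(s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2\)

Standard Deviation: \(s = \sqrt{s^2}\)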
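Python's standard `statistics` module keeps the same distinction: `pvariance` and `pstdev` divide by N, while `variance` and `stdev` divide by n − 1. The sketch below applies both pairs to one illustrative dataset. Which pair is appropriate depends on whether the data are a whole population or a sample.

```python
from statistics import mean, pstdev, pvariance, stdev, variance

data = [-10, 0, 10, 20, 30]

# Treating the five values as the whole population: divide by N.
print("mu      =", mean(data))
print("sigma^2 =", pvariance(data))
print("sigma   =", round(pstdev(data), 2))

# Treating the same values as a sample: divide by n - 1.
print("x-bar   =", mean(data))
print("s^2     =", variance(data))
print("s       =", round(stdev(data), 2))
```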


Variance & Standard Deviation

Example

{-10, 0, 10, 20, 30}

1 Find the mean: \(\bar{x} = \frac{-10 + 0 + 10 + 20 + 30}{5} = 10\)

2 Find deviations from mean and square them:

\(x_i\)    \(x_i - \bar{x}\)    \((x_i - \bar{x})^2\)
-10        -20                  400
  0        -10                  100
 10          0                    0
 20         10                  100
 30         20                  400
 ∑           0                 1000

3 Add them all up

400 + 100 + 0 + 100 + 400 = 1000


Variance & Standard Deviation

Example

{-10, 0, 10, 20, 30}

4 Divide by \(n - 1\): \(\frac{1000}{4} = 250\)

5 Square root (for standard deviation): \(\sqrt{250} \approx 15.8\)
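The whole worked example can be traced in code, one line per step. This is a sketch with illustrative variable names. It treats the five values as a sample, as the example does.

```python
from math import sqrt

data = [-10, 0, 10, 20, 30]
n = len(data)

x_bar = sum(data) / n                    # find the mean
deviations = [x - x_bar for x in data]   # deviations from the mean
squared = [d ** 2 for d in deviations]   # square each deviation
total = sum(squared)                     # add them all up
s_squared = total / (n - 1)              # divide by n - 1
s = sqrt(s_squared)                      # square root for the standard deviation

print(x_bar, total, s_squared, round(s, 2))
```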


Variance & Standard Deviation

Example

{8, 9, 10, 11, 12}

1 Find the mean

2 Find the variance

3 Find the standard deviation


Standardizing Variables

Sometimes we want to know how far a value is from its mean

We standardize a variable, or calculate its z-score:

\[ Z = \frac{x - \bar{x}}{s} \]

\(Z\) is the number of standard deviations a value is away from its mean (above: +, below: −)


Standardizing Variables

Example

A real estate analyst finds, from data on 350 recent sales, that the average price was $175,000 with a standard deviation of $55,000. The size of the houses averaged 2,100 sq. ft. with a standard deviation of 650 sq. ft.

Which is more unusual: a house in this town that costs $340,000, or a 5,000 sq. ft. house?
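One way to approach this question is to convert both values to z-scores so they share a common scale. The sketch below does that with the figures given in the example. The function name `z_score` is illustrative.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

z_price = z_score(340_000, mean=175_000, sd=55_000)
z_size = z_score(5_000, mean=2_100, sd=650)

print(f"price z = {z_price:.2f}, size z = {z_size:.2f}")
```

The price is 3 standard deviations above its mean and the size about 4.46, so on this measure the 5,000 sq. ft. house is the more unusual of the two.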


Descriptive Statistics

Most software programs can easily compute descriptive statistics (e.g. mean, median, quartiles, standard deviation) for us

MS Excel: Descriptive statistics in Data Analysis pack

TI-83+ calculators:

1 Enter data in L1: STAT → 1.Edit → input data values in column

2 CLEAR → STAT → CALC → 1.1-Var Stats, ENTER → 2nd L1 ENTER
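General-purpose programming languages offer the same conveniences. The slide does not mention Python, so the sketch below is only an illustration: it uses the standard `statistics` module on the Quiz 2 scores from earlier in the lesson. The quartile convention (`method="inclusive"`) is an assumption.

```python
import statistics as st

quiz2 = [50, 62, 72, 73, 79, 81, 82, 82, 86, 90, 94, 98, 99]

summary = {
    "n": len(quiz2),
    "mean": st.mean(quiz2),
    "median": st.median(quiz2),
    "quartiles": st.quantiles(quiz2, n=4, method="inclusive"),  # assumed convention
    "sample sd": st.stdev(quiz2),  # divides by n - 1
}

for name, value in summary.items():
    print(f"{name}: {value}")
```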
