3.1 - 1 3.1 Measure of Center Calculate the mean for a given data set Find the median, and describe why the median is sometimes preferable to the mean Find the mode of a data set Describe how skewness affects these measures of center

3.1 Measure of Center

Calculate the mean for a given data set

Find the median, and describe why the median is sometimes preferable to the mean

Find the mode of a data set

Describe how skewness affects these measures of center

Measure of Center

Measure of center: the value at the center or middle of a data set

The three common measures of center are the mean, the median, and the mode.

Mean

the measure of center obtained by adding the data values and then dividing the total by the number of values

This is what most people call an average; it is also called the arithmetic mean.

Notation

Σ (the Greek capital letter sigma) is used to denote the sum of a set of values.

x is the variable usually used to represent the data values.

n represents the number of data values in a sample.

Example of summation

• If there are n data values that are denoted as $x_1, x_2, \ldots, x_n$, then:

$\sum x = x_1 + x_2 + \cdots + x_n$

Example of summation

• Data: 21, 25, 32, 48, 53, 62, 62, 64

Then:

$\sum x = 21 + 25 + 32 + 48 + 53 + 62 + 62 + 64 = 367$

Sample Mean

$\bar{x} = \frac{\sum x}{n}$

$\bar{x}$ is pronounced "x-bar" and denotes the mean of a set of sample values

Example of Sample Mean

• Data: 21, 25, 32, 48, 53, 62, 62, 64

Then:

$\bar{x} = \frac{21 + 25 + 32 + 48 + 53 + 62 + 62 + 64}{8} = \frac{367}{8} = 45.875 \approx 45.9$
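The same calculation can be checked in a few lines of Python. This is only an illustrative sketch; the variable names (`data`, `total`, `x_bar`) are chosen here and are not part of the slides.

```python
# Data values from the sample mean example
data = [21, 25, 32, 48, 53, 62, 62, 64]

total = sum(data)      # sum of the data values (the sigma-x in the formula)
n = len(data)          # number of data values in the sample
x_bar = total / n      # sample mean: total divided by n

print(total, n, x_bar)
```

Rounding `x_bar` to one decimal place gives the 45.9 reported on the slide.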

Notation

µ (the Greek letter mu) is used to denote the population mean

N represents the number of data values in a population.

Population Mean

$\mu = \frac{\sum x}{N}$

Note: here x represents the data values in the population

Mean

Advantages

Is relatively reliable: means of samples drawn from the same population don't vary as much as other measures of center

Takes every data value into account

Mean

Disadvantage

Is sensitive to every data value; one extreme value can affect it dramatically, so it is not a resistant measure of center

Example:

21, 25, 32, 48, 53, 62, 62, 64 → $\bar{x} \approx 45.9$

21, 25, 32, 48, 53, 62, 62, 300 → $\bar{x} \approx 75.4$

Median

the measure of center which is the middle value when the original data values are arranged in order of increasing (or decreasing) magnitude

Finding the Median

First sort the values (arrange them in order), then follow one of these two procedures:

1. If the number of data values is odd, the median is the number located in the exact middle of the list. Its position in the list is:

$\left(\frac{n+1}{2}\right)^{\text{th}}$

2. If the number of data values is even, the median is found by computing the mean of the two middle numbers, which are those that lie on either side of the position:

$\left(\frac{n+1}{2}\right)^{\text{th}}$

Example of Median

• 6 data values:

5.40 1.10 0.42 0.73 0.48 1.10

• Sorted data:

0.42 0.48 0.73 1.10 1.10 5.40

(even number of values – no exact middle)

$\text{median} = \frac{0.73 + 1.10}{2} = 0.915$

Example of Median

• 7 data values:

5.40 1.10 0.42 0.73 0.48 1.10 0.66

• Sorted data:

0.42 0.48 0.66 0.73 1.10 1.10 5.40

median = 0.73
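The odd and even cases can be sketched as a small Python function. The function name `median` and the two lists are illustrative choices made here; Python's standard library offers `statistics.median`, which does the same job.

```python
def median(values):
    """Sort the values, then return the middle one (odd count)
    or the mean of the two middle ones (even count)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the value in position (n + 1)/2, counting from 1
        return ordered[mid]
    # Even n: average the two values on either side of position (n + 1)/2
    return (ordered[mid - 1] + ordered[mid]) / 2

six_values = [5.40, 1.10, 0.42, 0.73, 0.48, 1.10]
seven_values = six_values + [0.66]

print(median(six_values))    # mean of 0.73 and 1.10
print(median(seven_values))  # the single middle value
```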

Median

Median is not affected by an extreme value; it is a resistant measure of the center

Example:

21, 25, 32, 48, 53, 62, 62, 64

21, 25, 32, 48, 53, 62, 62, 300

Median is 50.5 for both data sets.
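A short comparison, using Python's standard `statistics` module, makes the contrast between the two measures concrete. The list names are illustrative.

```python
from statistics import mean, median

original = [21, 25, 32, 48, 53, 62, 62, 64]
with_outlier = [21, 25, 32, 48, 53, 62, 62, 300]  # 64 replaced by 300

for label, data in (("original", original), ("with outlier", with_outlier)):
    # The mean shifts when one value becomes extreme; the median does not
    print(label, "mean =", mean(data), "median =", median(data))
```

Only the largest value changes, so the two middle values (48 and 53) stay the same and the median is unchanged, while the mean rises from 45.875 to 75.375.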

Median

From Example 3.3, page 91

Mode

the value that occurs with the greatest frequency

A data set can have one mode, more than one mode, or no mode

Mode

Mode is the only measure of central tendency that can be used with nominal data

Bimodal: two data values occur with the same greatest frequency

Multimodal: more than two data values occur with the same greatest frequency

No Mode: no data value is repeated

Mode - Examples

a) 5.40 1.10 0.42 0.73 0.48 1.10 → Mode is 1.10

b) 27 27 27 55 55 55 88 88 99 → Bimodal: 27 and 55

c) 1 2 3 6 7 8 9 10 → No Mode
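The three cases can be reproduced with a small helper built on `collections.Counter`. The function name `modes` and its convention of returning an empty list for "no mode" are assumptions made for this sketch, not something the slides prescribe.

```python
from collections import Counter

def modes(values):
    """Return every value that occurs with the greatest frequency,
    or an empty list when no value is repeated."""
    if not values:
        return []
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []  # no data value is repeated: no mode
    return sorted(v for v, c in counts.items() if c == top)

print(modes([5.40, 1.10, 0.42, 0.73, 0.48, 1.10]))    # one mode
print(modes([27, 27, 27, 55, 55, 55, 88, 88, 99]))    # bimodal
print(modes([1, 2, 3, 6, 7, 8, 9, 10]))               # no mode
```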

• These data values represent weight gain or loss in kg for a random sample of 18 college freshmen (negative data values indicate weight loss)

11 3 0 -2 3 -2 -2 5 -2 7 2 4 1 8 1 0 -5 2

• Do these values support the legend that college students gain 15 pounds (6.8 kg) during their freshman year? Explain

• Sample Mean

$\bar{x} = \frac{\sum x}{n} = \frac{34}{18} \approx 1.9$ kg

• Median

Sorted data: -5 -2 -2 -2 -2 0 0 1 1 2 2 3 3 4 5 7 8 11

median = (1 + 2)/2 = 1.5 kg

• Mode: -2 kg

CONCLUSION

• All of the measures of center are below 6.8 kg (15 pounds)

• Based on measures of center, these data values do not support the idea that college students gain 15 pounds (6.8 kg) during their freshman year
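As a cross-check of the freshman weight example, the three measures can be computed with Python's standard `statistics` module and compared with 6.8 kg. The names `changes` and `legend_kg` are illustrative.

```python
from statistics import mean, median, mode

# Weight changes in kg for the 18 freshmen (negative = weight loss)
changes = [11, 3, 0, -2, 3, -2, -2, 5, -2, 7, 2, 4, 1, 8, 1, 0, -5, 2]
legend_kg = 6.8  # the 15-pound gain from the legend, expressed in kg

centers = {
    "mean": mean(changes),
    "median": median(changes),
    "mode": mode(changes),
}
for name, value in centers.items():
    # Report each measure and whether it falls below the legendary gain
    print(f"{name}: {value:.1f} kg, below 6.8 kg: {value < legend_kg}")
```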

Mean/Median with Graphing Calculator

• First, enter the list of data values

• Then select 2nd STAT (LIST), arrow right to MATH, and choose option 3:mean( or 4:median(

• Then input the desired list

Example of Computing the Mean Using Calculator

Sorted amounts of Strontium-90 (in millibecquerels) in a simple random sample of baby teeth obtained from Philadelphia residents born after 1979

Note: these data are related to the Three Mile Island nuclear power plant accident in 1979.

$\bar{x} = \frac{\sum x}{n} = 149.2$

Median is 150

Skewed and Symmetric

Symmetric: a distribution of data is symmetric if the left half of its histogram is roughly a mirror image of its right half

Skewed: a distribution of data is skewed if it is not symmetric and extends more to one side than the other

Skewed Left or Right

Skewed to the left (also called negatively skewed): distributions have a longer left tail; mean and median are to the left of the mode

Skewed to the right (also called positively skewed): distributions have a longer right tail; mean and median are to the right of the mode

Distribution Skewed Left

• Mean is smaller than the median.

Symmetric Distribution

• Mean, median, mode approximately equal.

Distribution Skewed Right

• Mean is larger than the median.

Example data set

5 5 5 5 5 10 10 10 10 10 10 15 15 15 15 15

20 20 20 20 25 25 25 30 30 30 35 35 40 45

• Mean: $\bar{x} = \frac{\sum x}{n} = \frac{560}{30} \approx 18.7$

• Median: $\frac{15 + 15}{2} = 15$

The mean is larger than the median, so the distribution is skewed right.
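The comparison of mean and median for this data set can be sketched in Python. Building the list from repeated values is just a convenience chosen here; the data are the 30 values listed in the example.

```python
from statistics import mean, median

# The 30 data values from the example, written as value * count
data = ([5] * 5 + [10] * 6 + [15] * 5 + [20] * 4 +
        [25] * 3 + [30] * 3 + [35] * 2 + [40, 45])

x_bar = mean(data)
med = median(data)
print(len(data), sum(data), x_bar, med)

# A mean noticeably above the median is the pattern described for right skew
print("mean > median:", x_bar > med)
```

Comparing mean and median is a quick indicator only; a histogram of the data remains the more direct way to judge the shape.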

3.2 Measures of Variability

The range

What is a deviation?

The standard deviation and the variance.

Why is it important to understand variation?

• A measure of the center by itself can be misleading

• Example:

Two nations with the same median family income are very different if one has extremes of wealth and poverty and the other has little variation among families (see the following table).

Example of variation

          Data Set A   Data Set B
          50,000       10,000
          60,000       20,000
          70,000       70,000
          80,000       120,000
          90,000       130,000
MEAN      70,000       70,000
MEDIAN    70,000       70,000

Data set B has more variation about the mean

Histograms: example of variation

Data set B has more variation about the mean (Target).

How do we quantify variation?

Definition

The range of a set of data values is the difference between the maximum data value and the minimum data value.

Range = (maximum value) – (minimum value)

Example of range

Data:

27 28 25 6 27 30 26

Range = 30 - 6 = 24

Range (cont.)

Ignoring the outlier of 6 in the previous data set gives the data 27 28 25 27 30 26

Range = 30 - 25 = 5

This shows that the range is very sensitive to extreme values; it is therefore not as useful as other measures of variation.
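The effect of the single outlier on the range can be reproduced with a short Python sketch. The helper name `value_range` is an illustrative choice.

```python
def value_range(values):
    """Range = (maximum value) - (minimum value)."""
    return max(values) - min(values)

data = [27, 28, 25, 6, 27, 30, 26]
without_outlier = [x for x in data if x != 6]  # drop the outlier 6

print(value_range(data))             # range with the outlier
print(value_range(without_outlier))  # range without it
```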

Deviation

The deviation for a given data value is the distance between the data value and the mean, except that the deviation can be negative while a distance is always positive.

Deviation

A deviation for a given data value is the difference between the data value and the mean of the data set. If x is the data value:

1. For a sample, the deviation of x is $x - \bar{x}$

2. For a population, the deviation of x is $x - \mu$

Deviation

The deviation can be positive, negative, or zero.

1. If the data value is larger than the mean, the deviation will be positive.

2. If the data value is smaller than the mean, the deviation will be negative.

3. If the data value equals the mean, the deviation will be zero.

Example:

• Data: 8, 5, 12, 8, 9, 15, 21, 16, 3

Mean:

$\bar{x} = \frac{\sum x}{n} = \frac{8 + 5 + 12 + 8 + 9 + 15 + 21 + 16 + 3}{9} \approx 10.78$

Data Value   Deviation
8            8 - 10.78 = -2.78
5            5 - 10.78 = -5.78
12           12 - 10.78 = 1.22
8            8 - 10.78 = -2.78
9            9 - 10.78 = -1.78
15           15 - 10.78 = 4.22
21           21 - 10.78 = 10.22
16           16 - 10.78 = 5.22
3            3 - 10.78 = -7.78
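The deviation table can be regenerated with a few lines of Python. This sketch uses the unrounded mean, so the printed values agree with the table after rounding to two decimals; the variable names are illustrative.

```python
data = [8, 5, 12, 8, 9, 15, 21, 16, 3]
x_bar = sum(data) / len(data)             # sample mean, about 10.78

# Deviation of each value: data value minus the mean
deviations = [x - x_bar for x in data]

for x, d in zip(data, deviations):
    print(f"{x:>3}  {d:+.2f}")

# Deviations from the mean always sum to zero (up to rounding error)
print(round(sum(deviations), 10))
```

Because the positive and negative deviations cancel, their plain sum cannot measure spread, which is why the variance works with squared deviations.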

Population Variance

$\sigma^2 = \frac{\sum (x - \mu)^2}{N}$

The population variance is the mean of the squared deviations in the population

Population Standard Deviation

$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$

The population standard deviation is the square root of the population variance.

Sample Variance

The sample variance is

$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$

Note that the sample variance is only approximately the mean of the squared deviations in the sample because we use n-1 instead of n.

A statistic is an unbiased estimator of a parameter if its mean value equals the parameter it is trying to estimate.

Using n-1 instead of n makes the sample variance an unbiased estimator of the population variance.

Sample Standard Deviation

$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$

The sample standard deviation is the square root of the sample variance.

Steps to calculate the sample standard deviation

1. Calculate the sample mean $\bar{x}$

2. Find the squared deviations from the sample mean for each sample data value: $(x - \bar{x})^2$

3. Add the squared deviations

4. Divide the sum in step 3 by n-1

5. Take the square root of the quotient in step 4

Example: Standard Deviation

Given the data set:

8, 5, 12, 8, 9, 15, 21, 16, 3

Find the standard deviation

Example: Standard Deviation

• Find the mean

$\bar{x} = \frac{\sum x}{n} = \frac{8 + 5 + 12 + 8 + 9 + 15 + 21 + 16 + 3}{9} \approx 10.78$

Data Value   Squared Deviation From the Mean
8            (8 - 10.78)^2 = 7.73
5            (5 - 10.78)^2 = 33.41
12           (12 - 10.78)^2 = 1.49
8            (8 - 10.78)^2 = 7.73
9            (9 - 10.78)^2 = 3.17
15           (15 - 10.78)^2 = 17.81
21           (21 - 10.78)^2 = 104.45
16           (16 - 10.78)^2 = 27.25
3            (3 - 10.78)^2 = 60.53

Example: Standard Deviation

• Add the squared deviations (last column in the table above)

7.73 + 33.41 + 1.49 + 7.73 + 3.17 + 17.81 + 104.45 + 27.25 + 60.53 = 263.57

Example: Standard Deviation

• Divide the sum by 9-1=8:

$263.57 / 8 \approx 32.95$

• Take the square root:

$\sqrt{32.95} \approx 5.74$

$s \approx 5.7$
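The five steps for the sample standard deviation translate directly into Python. The function name `sample_std` is an illustrative choice; the standard library's `statistics.stdev` computes the same quantity.

```python
import math

def sample_std(values):
    """Sample standard deviation, following the five steps in order."""
    n = len(values)
    x_bar = sum(values) / n                         # step 1: sample mean
    squared = [(x - x_bar) ** 2 for x in values]    # step 2: squared deviations
    total = sum(squared)                            # step 3: add them
    variance = total / (n - 1)                      # step 4: divide by n - 1
    return math.sqrt(variance)                      # step 5: square root

data = [8, 5, 12, 8, 9, 15, 21, 16, 3]
s = sample_std(data)
print(round(s, 2))
```

Working with the unrounded mean gives a sum of squared deviations of about 263.56 rather than 263.57; the small difference comes from rounding the mean to 10.78 in the hand calculation and does not change the result once it is rounded to 5.7.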

Sample Standard Deviation (Computational Formula)

$s = \sqrt{\frac{\sum x^2 - \left(\sum x\right)^2 / n}{n - 1}}$

Example: Standard Deviation

Data:

2 1 1 1 1 1 1 4 1 2 2 1 2 3

3 2 3 1 3 1 3 1 3 2 2

Determine the standard deviation

using the previous formula

Example: Standard Deviation

• We need to find each of the following:

$n$

$\sum x$

$\sum x^2$

Data Table (25 data values)

TOTALS: $\sum x = 47$, $\sum x^2 = 109$

Example: Standard Deviation

• Thus:

$n = 25$

$\sum x = 47$

$\sum x^2 = 109$

Example: Standard Deviation

• And:

$s = \sqrt{\frac{\sum x^2 - \left(\sum x\right)^2 / n}{n - 1}} = \sqrt{\frac{109 - 47^2 / 25}{24}} = \sqrt{0.86} \approx 0.9$
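The computational formula needs only three totals, which a short Python sketch can collect from the 25 data values. The variable names (`sum_x`, `sum_x2`) are illustrative.

```python
import math

# The 25 data values from the example
data = [2, 1, 1, 1, 1, 1, 1, 4, 1, 2, 2, 1, 2, 3,
        3, 2, 3, 1, 3, 1, 3, 1, 3, 2, 2]

n = len(data)                        # number of data values
sum_x = sum(data)                    # sum of the values
sum_x2 = sum(x * x for x in data)    # sum of the squared values

# Computational formula: no individual deviations are needed
s = math.sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))
print(n, sum_x, sum_x2, round(s, 2))
```

This form is algebraically equivalent to the definition based on squared deviations, so both give the same value of s apart from rounding.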

Standard Deviation - Important Properties

The standard deviation is a measure of variation of all values from the mean.

The value of the standard deviation s is never negative and usually not zero.

The value of the standard deviation s can increase dramatically with the inclusion of one or more outliers (data values far away from all others).

Unlike variance, the units of the standard deviation s are the same as the units of the original data values.

Example: page 116

ANSWER

Colony A: range = 134 - 61 = 73

Colony B: range = 158 - 67 = 91

Example: page 116

28(b) Which colony has the greater variability according to the range?

ANSWER: colony B

Use the previous example and calculate the standard deviation for each colony with a calculator.

Data is stored in Lists. Locate and press the STAT button on the calculator. Choose EDIT. The calculator will display the first three of six lists (columns) for entering data. Simply type your data and press ENTER. Use your arrow keys to move between lists.

• Enter STAT then arrow right to CALC to get

• then press ENTER

Calculator Example

When 1-Var Stats appears on the home screen, enter the name of the list containing the data. You can do this by entering List (= 2nd STAT) and choosing which list has the desired data.

• 1-Var Stats

• NOTE: Previous example data will give different values than these.

ANSWER

Colony A: standard deviation = 21.9

Colony B: standard deviation = 26.4

• Compare histograms (SPSS)

3.3 Working with grouped data

• Calculate the weighted mean for a given data set

• Estimate the mean from grouped data

• Estimate the variance and standard deviation from grouped data

Weighted Mean

When data values are assigned different weights, we can compute a weighted mean.

Data values: $x_1, x_2, x_3, \ldots, x_n$

Corresponding weights: $w_1, w_2, w_3, \ldots, w_n$

Computing the weighted mean

• Multiply each data value by its corresponding weight: $w_i x_i$

• Sum these products.

• Divide the result by the sum of the weights.

Weighted Mean

$\bar{x}_w = \frac{\sum w_i x_i}{\sum w_i} = \frac{w_1 x_1 + w_2 x_2 + \cdots + w_n x_n}{w_1 + w_2 + \cdots + w_n}$

Example: Weighted Mean

Suppose the homework/quiz average is weighted 10%, two exams are weighted 60% in total (30% each), and the final exam is weighted 30%.

If a student has a homework/quiz average of 87, exam scores of 80 and 92, and a final exam score of 85, compute the weighted average.

Example: Weighted Mean

ANSWER:

$\bar{x}_w = \frac{0.10(87) + 0.30(80) + 0.30(92) + 0.30(85)}{1.00} = 85.8$
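The weighted mean can be sketched as a small Python function. The function name `weighted_mean` and the list names are illustrative; the weights are the 10%, 30%, 30%, 30% split used in the answer.

```python
def weighted_mean(values, weights):
    """Sum of (weight * value) divided by the sum of the weights."""
    total = sum(w * x for x, w in zip(values, weights))
    return total / sum(weights)

scores = [87, 80, 92, 85]            # homework/quiz, exam 1, exam 2, final
weights = [0.10, 0.30, 0.30, 0.30]   # each exam carries half of the 60%

print(round(weighted_mean(scores, weights), 1))
```

Dividing by the sum of the weights means the weights do not have to add up to 1; raw counts, such as employment figures, work just as well.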

Example: Weighted Mean

Weights (Employment)   Data Values (Hourly Mean Wage, $)   (weight) x (data value)
12,380                 60.32                                746,761.60
18,580                 60.25                                1,119,445.00
9,540                  59.39                                566,580.60
35,550                 57.98                                2,061,189.00
10,130                 55.95                                566,773.50

$\bar{x}_w = \frac{\sum w_i x_i}{\sum w_i} = \frac{5{,}060{,}749.70}{86{,}180} \approx \$58.72$

Estimating the mean from grouped data

• Given a frequency distribution, how do we compute the mean?

Heights of 25 Women:

HEIGHT (inches)   FREQUENCY
59-60             3
61-62             3
63-64             4
65-66             7
67-68             6
69-70             1
71-72             1

• Assume all sample values are at the class midpoints

Class midpoints:

59.5, 61.5, 63.5, 65.5, 67.5, 69.5, 71.5

Estimating the mean from grouped data

• Multiply each class midpoint by its corresponding frequency

• Add the results

• Divide by the sum of the frequencies (total number of data values)

Class midpoints:

59.5, 61.5, 63.5, 65.5, 67.5, 69.5, 71.5

Frequencies:

3, 3, 4, 7, 6, 1, 1

Estimated mean:

$$\frac{3(59.5) + 3(61.5) + 4(63.5) + 7(65.5) + 6(67.5) + 1(69.5) + 1(71.5)}{25} = \frac{1621.5}{25} \approx 64.9 \text{ inches}$$
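The three steps (multiply, add, divide) can be sketched in Python as a quick check of the arithmetic. This is only an illustration; the function name `grouped_mean` is made up for this example and is not part of the course materials.

```python
# Class midpoints and frequencies from the "Heights of 25 Women" table.
midpoints = [59.5, 61.5, 63.5, 65.5, 67.5, 69.5, 71.5]
frequencies = [3, 3, 4, 7, 6, 1, 1]


def grouped_mean(midpoints, frequencies):
    """Estimate the mean by treating every value as its class midpoint."""
    # Multiply each midpoint by its frequency and add the results.
    total = sum(m * f for m, f in zip(midpoints, frequencies))
    # Divide by the total number of data values.
    return total / sum(frequencies)


estimate = grouped_mean(midpoints, frequencies)
print(round(estimate, 1))  # 64.9
```

The result is an estimate because the individual heights inside each class are unknown.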

Estimating the mean from a frequency distribution:

$$\hat{\mu} = \frac{\sum m_i f_i}{\sum f_i} = \frac{m_1 f_1 + m_2 f_2 + \cdots + m_n f_n}{f_1 + f_2 + \cdots + f_n}$$

where $m_i$ = class midpoint and $f_i$ = frequency.

• Note: $\hat{\mu}$ is “mu-hat”, where the hat denotes the fact that mu is not exact, but approximate.

Estimating the variance from a frequency distribution:

$$\hat{\sigma}^2 = \frac{\sum (m_i - \hat{\mu})^2 f_i}{\sum f_i}$$

Estimating the standard deviation from a frequency distribution:

$$\hat{\sigma} = \sqrt{\frac{\sum (m_i - \hat{\mu})^2 f_i}{\sum f_i}}$$

HEIGHT (inches)    Frequency    Class Midpoint    $(m_i - \hat{\mu})^2 f_i$
59-60              3            59.5              86.2
61-62              3            61.5              33.9
63-64              4            63.5              7.4
65-66              7            65.5              2.9
67-68              6            67.5              41.8
69-70              1            69.5              21.5
71-72              1            71.5              44.0

$$\hat{\mu} = 64.9 \text{ inches}$$

$$\hat{\sigma} = \sqrt{\frac{\sum (m_i - \hat{\mu})^2 f_i}{\sum f_i}} = \sqrt{\frac{237.7}{25}} \approx 3.1 \text{ inches}$$
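A short Python sketch can reproduce the table's last column and the resulting standard deviation. It assumes the formula above, which divides by the sum of the frequencies; the function name `grouped_stats` is hypothetical.

```python
import math

# Class midpoints and frequencies from the height table.
midpoints = [59.5, 61.5, 63.5, 65.5, 67.5, 69.5, 71.5]
frequencies = [3, 3, 4, 7, 6, 1, 1]


def grouped_stats(midpoints, frequencies):
    """Return (mu_hat, var_hat, sigma_hat) estimated from grouped data."""
    n = sum(frequencies)
    mu_hat = sum(m * f for m, f in zip(midpoints, frequencies)) / n
    # Each term is one entry of the (m_i - mu_hat)^2 * f_i column.
    squared_terms = [(m - mu_hat) ** 2 * f for m, f in zip(midpoints, frequencies)]
    var_hat = sum(squared_terms) / n
    return mu_hat, var_hat, math.sqrt(var_hat)


mu_hat, var_hat, sigma_hat = grouped_stats(midpoints, frequencies)
print(round(mu_hat, 1), round(sigma_hat, 1))  # 64.9 3.1
```

Small differences from the tabulated column can appear because the sketch uses the unrounded estimate of the mean.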


3.4 Measures of Position

Find percentiles for small and large data sets

Calculate z-scores and explain why we use them

Use z-scores to detect outliers.

Percentile

Let p be any integer between 0 and 100. The pth percentile of a data set is a value for which p percent of the values in the data set are less than or equal to this value.

Steps to find the pth percentile for small data sets

Sort the data from small to large. If you are finding the pth percentile of a sample of size n, calculate:

$$i = \frac{p}{100} \cdot n$$

which is p percent of n

Steps to find the pth percentile for small data sets (cont)

If i is an integer, the pth percentile is the mean of the data values in positions i and i+1. If i is not an integer, round up and use the value in this position as the pth percentile.
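The rule above can be sketched as a small Python function. This is a minimal illustration that assumes p is an integer strictly between 0 and 100; the name `percentile` is chosen for this example only and is unrelated to any library routine, which may use a different interpolation rule.

```python
def percentile(data, p):
    """pth percentile using the small-data-set rule described above."""
    if not 0 < p < 100:
        raise ValueError("this sketch assumes an integer p with 0 < p < 100")
    values = sorted(data)          # sort from small to large
    scaled = p * len(values)       # i = scaled / 100, kept as an integer
    if scaled % 100 == 0:
        i = scaled // 100          # i is an integer: average positions i and i+1
        return (values[i - 1] + values[i]) / 2
    i = scaled // 100 + 1          # i is not an integer: round up
    return values[i - 1]           # positions are 1-based, list indexes 0-based


sample = [36, 37, 38, 39, 44, 44, 47, 50, 53, 57, 65, 69]
print(percentile(sample, 25), percentile(sample, 75))  # 38.5 55.0
```

Keeping p times n as an integer avoids floating-point trouble when deciding whether i is a whole number.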


Example of Finding Percentile

• Find the 25th and 75th percentiles of these 12 data values

36 37 38 39 44 44 47 50 53 57 65 69

Example of Finding Percentile

25% of 12:

$$i = \frac{25}{100} \cdot 12 = 3$$

75% of 12:

$$i = \frac{75}{100} \cdot 12 = 9$$


Example of Finding Percentile

• The data can be grouped as follows:

36 37 38 39 44 44 47 50 53 57 65 69

Because 25% of 12 is the integer 3, the 25th percentile is the mean of the values in the 3rd and 4th positions: 25% of the data is below 38.5 (the mean of 38 and 39).

The 25th percentile is 38.5


Example of Finding Percentile

• The data can be grouped as follows:

36 37 38 39 44 44 47 50 53 57 65 69

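Because 75% of 12 is the integer 9, the 75th percentile is the mean of the values in the 9th and 10th positions: 75% of the data is below 55 (the mean of 53 and 57).

The 75th percentile is 55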


Example of Finding Percentile

• Find the 25th and 75th percentiles of these 7 data values:

36 38 39 44 47 50 65

Example of Finding Percentile

25% of 7:

$$i = \frac{25}{100} \cdot 7 = 1.75$$

75% of 7:

$$i = \frac{75}{100} \cdot 7 = 5.25$$

Example of Finding Percentile

36 38 39 44 47 50 65

1.75 rounds up to position 2, so use the value in the 2nd position.

The 25th percentile is 38

Example of Finding Percentile

36 38 39 44 47 50 65

5.25 rounds up to position 6, so use the value in the 6th position.

The 75th percentile is 50


Example of Finding Percentile

Page 132

Example of Finding Percentile

Data:

2.0 2.1 2.4 2.8 3.1 3.5 3.8 4.2 4.3 4.4 5.2 7.1 7.7 8.8 14.7

Note: 15 data values

Example of Finding Percentile

16(a) To find position, 5% of 15:

$$i = \frac{5}{100} \cdot 15 = 0.75, \text{ rounded up to } 1$$

16(b) To find position, 95% of 15:

$$i = \frac{95}{100} \cdot 15 = 14.25, \text{ rounded up to } 15$$

Example of Finding Percentile

Data:

2.0 2.1 2.4 2.8 3.1 3.5 3.8 4.2 4.3 4.4 5.2 7.1 7.7 8.8 14.7

16(a) 5th percentile is 2.0 million

16(b) 95th percentile is 14.7 million

Z score

z Score (or standardized value): the number of standard deviations that a given value x is above or below the mean

Z score Formulas

Sample:

$$z = \frac{x - \bar{x}}{s}$$

Population:

$$z = \frac{x - \mu}{\sigma}$$


Interpreting Z Scores

1. A z-score has no units.

2. Whenever a value is greater than the mean, its z score is positive

3. Whenever a value is less than the mean, its z score is negative


Example of Finding Z Score

Page 132

Example of Finding Z score

Data:

2.0 2.1 2.4 2.8 3.1 3.5 3.8 4.2 4.3 4.4 5.2 7.1 7.7 8.8 14.7

Using a calculator, we get

$$\bar{x} = 5.1 \text{ and } s = 3.4$$

Example of Finding Z score

2.0 2.1 2.4 2.8 3.1 3.5 3.8 4.2 4.3 4.4 5.2 7.1 7.7 8.8 14.7

18(a) z-score for fish oil (data value is 4.2)

$$z = \frac{4.2 - 5.1}{3.4} \approx -0.26$$

Example of Finding Z score

2.0 2.1 2.4 2.8 3.1 3.5 3.8 4.2 4.3 4.4 5.2 7.1 7.7 8.8 14.7

19(a) z-score for Ginseng (data value is 8.8)

$$z = \frac{8.8 - 5.1}{3.4} \approx 1.11$$

(The value 1.11 comes from the unrounded mean and standard deviation; the rounded values shown give about 1.09.)
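The calculator values and both z-scores can be checked with Python's standard `statistics` module. This is a sketch using the sample formula (dividing by n - 1, which is what `statistics.stdev` does); the helper name `z_score` is made up for this illustration.

```python
import statistics

data = [2.0, 2.1, 2.4, 2.8, 3.1, 3.5, 3.8, 4.2,
        4.3, 4.4, 5.2, 7.1, 7.7, 8.8, 14.7]

x_bar = statistics.mean(data)
s = statistics.stdev(data)  # sample standard deviation


def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


print(round(x_bar, 1), round(s, 1))        # 5.1 3.4
print(round(z_score(4.2, x_bar, s), 2))    # -0.26
print(round(z_score(8.8, x_bar, s), 2))    # 1.11
```

The sketch keeps the unrounded mean and standard deviation, which is why the second z-score comes out as 1.11.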


Outliers

An outlier is an extreme data value.

We will define a data value as “extreme” if it is at least three standard deviations from the mean.

Outliers and z Scores

Data values are not unusual (extreme) if

$$-2 \le z \le 2$$

Data values are unusual or outliers if

$$z \le -3 \text{ or } z \ge 3$$


Interpreting Z Scores

page 131: bell shaped distribution

3.5 Chebyshev's Rule and the Empirical Rule

Calculate percentages using Chebyshev's Rule

Find percentages and data values using the Empirical Rule

Chebyshev's Rule

The proportion (or fraction) of any set of data lying within K standard deviations of the mean is always at least $1 - 1/K^2$, where K is any positive number greater than 1.

For K = 2, at least 3/4 (or 75%) of all values lie within 2 standard deviations of the mean.

For K = 3, at least 8/9 (or 89%) of all values lie within 3 standard deviations of the mean.

Example of Chebyshev's Rule

Page 139, problem 11(a)

A data distribution has a mean of 500 and a standard deviation of 100. Suppose we do not know whether the distribution is bell-shaped. (a) Estimate the proportion of data that falls between 300 and 700.

Example of Chebyshev's Rule

Page 139, problem 11(a) ANSWER: We are given

$$\bar{x} = 500 \text{ and } s = 100$$

and the data values obey

$$300 \le x \le 700$$

First compute k using

$$\bar{x} - ks = 300, \qquad \bar{x} + ks = 700$$

$$500 - 100k = 300, \qquad 500 + 100k = 700 \quad\Rightarrow\quad k = 2$$

Example of Chebyshev's Rule

Page 139, problem 11(a) ANSWER: Since k = 2,

$$1 - \frac{1}{k^2} = 1 - \frac{1}{2^2} = 1 - \frac{1}{4} = \frac{3}{4}$$

and Chebyshev's rule says that at least 75% of the data falls between 300 and 700.
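The two steps of this answer (find k, then evaluate the bound) can be sketched in Python. The function name `chebyshev_lower_bound` is hypothetical, and the sketch assumes the interval is symmetric about the mean, as it is in this problem.

```python
def chebyshev_lower_bound(mean, sd, low, high):
    """Minimum proportion of data in [low, high], an interval symmetric about the mean."""
    k_low = (mean - low) / sd
    k_high = (high - mean) / sd
    if abs(k_low - k_high) > 1e-9:
        raise ValueError("this sketch assumes an interval symmetric about the mean")
    k = k_low
    if k <= 1:
        raise ValueError("Chebyshev's Rule requires k > 1")
    return 1 - 1 / k ** 2


print(chebyshev_lower_bound(500, 100, 300, 700))  # 0.75
```

The bound holds for any distribution, which is why the answer is stated as "at least" 75%.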


The Empirical Rule

For data sets having a distribution that is approximately bell shaped, the following properties apply:

About 68% of all values fall within 1 standard deviation of the mean.

About 95% of all values fall within 2 standard deviations of the mean.

About 99.7% of all values fall within 3 standard deviations of the mean.

The Empirical Rule

For data sets having a distribution that is approximately bell shaped, the following properties apply:

About 68% of all data values obey $-1 \le z \le 1$

About 95% of all data values obey $-2 \le z \le 2$

About 99.7% of all data values obey $-3 \le z \le 3$


The Empirical Rule

Explain these percentages.

page 136: bell shaped distribution

The Empirical Rule

For the green regions:

1. 50% of the data lies to the left of z=0.

2. 34% (half of 68%) of the data lies between z=-1 and z=0. Therefore, 16% (=50%-34%) of the data is to the left of z=-1.

3. 47.5% (half of 95%) of the data lies between z=-2 and z=0. Therefore, 2.5% (=50%-47.5%) of the data is to the left of z=-2.

4. Subtracting areas gives that 13.5% (=16%-2.5%) of the data lies between z=-2 and z=-1.

5. Using symmetry, 13.5% of the data also lies between z=1 and z=2.
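The same subtractions can be written out in a few lines of Python as a check. The variable names are made up for this illustration, and the 68% and 95% figures are the approximate Empirical Rule values, not exact normal areas.

```python
# Approximate Empirical Rule percentages for a bell-shaped distribution.
within_1_sd = 68.0
within_2_sd = 95.0

left_of_0 = 50.0                              # half of the data lies below the mean
between_minus1_and_0 = within_1_sd / 2        # 34.0
left_of_minus1 = left_of_0 - between_minus1_and_0      # 16.0
between_minus2_and_0 = within_2_sd / 2        # 47.5
left_of_minus2 = left_of_0 - between_minus2_and_0      # 2.5
between_minus2_and_minus1 = left_of_minus1 - left_of_minus2  # 13.5

print(left_of_minus1, left_of_minus2, between_minus2_and_minus1)  # 16.0 2.5 13.5
```

By symmetry the last value also applies to the region between z=1 and z=2.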


Example of Empirical Rule

Page 139, problem 12(a) A data distribution has a mean of 500 and a standard deviation of 100. Assume that the distribution is bell-shaped. (a) Estimate the proportion of data that falls between 300 and 700.

Example of Empirical Rule

Page 139, problem 12(a) ANSWER: As in problem 11, we are given $\bar{x} = 500$ and $s = 100$, so that the data values in the interval $300 \le x \le 700$ are within 2 standard deviations of the mean.

Example of Empirical Rule

Page 139, problem 12(a) ANSWER: Here we are also given that the distribution is bell-shaped. Using the empirical rule, approximately 95% of the data lies between 300 and 700.

Empirical Rule vs. Chebyshev's Rule

Note the difference between problems 11 and 12: in problem 11 we are not told that the distribution is bell-shaped, so we can only say that “at least” 75% of the data is between 300 and 700 (using Chebyshev’s Rule). We cannot use the Empirical Rule in problem 11.

Example

Page 141, problem 24(a) In San Francisco, the mean and standard deviation of the wind speed in January are $\mu = 7.2$ mph and $\sigma = 7.2$ mph. (Note: these are population parameters.) Assume that the distribution of the wind speed is bell-shaped. (a) Estimate the proportion of times that the wind speed is between 1.2 mph and 13.2 mph.

Example

Let the variable x represent wind speed so that $1.2 \le x \le 13.2$. Note that the mean 7.2 is the midpoint of this interval. Calculate how many standard deviations from the mean this interval represents. Use the formula:

$$k = \frac{\text{numerical value} - \text{mean}}{\text{standard deviation}}$$

Example

$$k = \frac{1.2 - 7.2}{7.2} \approx -0.83 \qquad k = \frac{13.2 - 7.2}{7.2} \approx 0.83$$

so that the data in the interval are within 0.83 standard deviations of the mean.

ANSWER

The empirical rule implies that less than 68% of the time the wind speed is between 1.2 mph and 13.2 mph.

Note: convince yourself of this by sketching areas below the bell-shaped distribution.


Example

Page 141, problem 24(b)

(b) Estimate the proportion of times that the wind speed is less than 1.2 mph.

Example

Page 141, problem 24(b)

1. Since 0.0 mph is 1 standard deviation below the mean of 7.2 mph, the empirical rule implies that 34% (half of 68%) of the time, the wind speed is between 0.0 mph and 7.2 mph.

2. Subtracting 34% from 50% gives that 16% of the time the wind speed is less than 0.0 mph.

3. Since 1.2 mph is greater than 0.0 mph but less than 7.2 mph, we can say that at least 16% of the time but no more than 50% of the time the wind speed is less than 1.2 mph.

Note: convince yourself of this by sketching areas below the bell-shaped distribution.

3.6 Robust Measures

• Find quartiles and the interquartile range

• Calculate the five number summary of a data set

• Construct a boxplot for a given data set

• Apply robust detection of outliers

Quartiles

Quartiles are measures of location, denoted Q1, Q2, and Q3, which divide a set of data into four groups with about 25% of the values in each group.

Q1 (First Quartile) is the 25th percentile

Q2 (Second Quartile) is the 50th percentile or the median

Q3 (Third Quartile) is the 75th percentile

Example of Quartiles

• Given the 24 data values (sorted):

36 37 37 39 39 41 43 44 44 47 50 53

54 55 56 56 57 59 61 61 65 69 69 75

Find Q1, Q2, Q3

Example of Quartiles

For first quartile (25th percentile), position:

$$i = \frac{25}{100} \cdot 24 = 6$$

therefore the first quartile is the mean of the data values in positions 6 and 7

$$Q_1 = \frac{x_6 + x_7}{2} = \frac{41 + 43}{2} = 42.0$$

Example of Quartiles

For second quartile (50th percentile), position:

$$i = \frac{50}{100} \cdot 24 = 12$$

therefore the second quartile is the mean of the data values in positions 12 and 13

$$Q_2 = \frac{x_{12} + x_{13}}{2} = \frac{53 + 54}{2} = 53.5$$

Example of Quartiles

For third quartile (75th percentile), position:

$$i = \frac{75}{100} \cdot 24 = 18$$

therefore the third quartile is the mean of the data values in positions 18 and 19

$$Q_3 = \frac{x_{18} + x_{19}}{2} = \frac{59 + 61}{2} = 60.0$$

Interquartile Range

The Interquartile Range (IQR) is the difference between the third quartile and the first quartile which measures the spread of the middle 50% of the data:

$$\text{IQR} = Q_3 - Q_1$$

It is considered a “robust” measure of variability because it is not affected by outliers in the data (bottom 25% and top 25% of data are ignored).

Example of IQR

• Given the 24 data values (sorted):

36 37 37 39 39 41 43 44 44 47 50 53

54 55 56 56 57 59 61 61 65 69 69 75

we found that

$$Q_1 = 42.0 \text{ and } Q_3 = 60.0$$

$$\text{IQR} = Q_3 - Q_1 = 60.0 - 42.0 = 18.0$$

Example of IQR

• Introduce outliers into previous data set:

2 5 37 39 39 41 43 44 44 47 50 53

54 55 56 56 57 59 61 61 65 69 100 200

we still have:

$$Q_1 = 42.0 \text{ and } Q_3 = 60.0$$

$$\text{IQR} = Q_3 - Q_1 = 60.0 - 42.0 = 18.0$$
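The robustness of the IQR can be checked numerically with a short Python sketch. It assumes the small-data-set percentile rule from Section 3.4 (average positions i and i+1 when i is an integer, otherwise round up); the helper names `percentile` and `iqr` are made up for this illustration.

```python
def percentile(data, p):
    """pth percentile by the small-data-set rule (integer p, 0 < p < 100)."""
    values = sorted(data)
    scaled = p * len(values)           # i = scaled / 100
    if scaled % 100 == 0:              # i is an integer: average positions i and i+1
        i = scaled // 100
        return (values[i - 1] + values[i]) / 2
    return values[scaled // 100]       # round i up; position i is index i - 1


def iqr(data):
    """Interquartile range Q3 - Q1."""
    return percentile(data, 75) - percentile(data, 25)


original = [36, 37, 37, 39, 39, 41, 43, 44, 44, 47, 50, 53,
            54, 55, 56, 56, 57, 59, 61, 61, 65, 69, 69, 75]
with_outliers = [2, 5, 37, 39, 39, 41, 43, 44, 44, 47, 50, 53,
                 54, 55, 56, 56, 57, 59, 61, 61, 65, 69, 100, 200]

print(iqr(original), iqr(with_outliers))  # 18.0 18.0
```

Only the two smallest and two largest values changed, so the positions used for Q1 and Q3 hold the same numbers in both lists.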

5-Number Summary

For a set of data, the 5-number summary consists of the minimum value; the first quartile, Q1; the median (or second quartile, Q2); the third quartile, Q3; and the maximum value.


Example of Five Number Summary

Given Data (sorted):

128 130 133 137 138 142 142 144 147 149

151 151 151 155 156 161 163 163 166

Example of Five Number Summary

• Minimum data value is 128

• First quartile location

$$i = \frac{25}{100} \cdot 19 = 4.75$$

• round up to get

$$Q_1 = x_5 = 138$$

Example of Five Number Summary

• Second quartile location

$$i = \frac{50}{100} \cdot 19 = 9.5$$

• round up to get

$$Q_2 = x_{10} = 149$$

Example of Five Number Summary

• Third quartile location

$$i = \frac{75}{100} \cdot 19 = 14.25$$

• round up to get

$$Q_3 = x_{15} = 156$$

Example of Five Number Summary

• Max data value is 166

• Five Number Summary

min    Q1     Q2     Q3     max
128    138    149    156    166
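The five values can be reproduced with a short Python sketch built on the small-data-set percentile rule from Section 3.4. The names `percentile` and `five_number_summary` are hypothetical helpers for this illustration; a calculator or statistics package may use a slightly different quartile rule.

```python
def percentile(data, p):
    """pth percentile by the small-data-set rule (integer p, 0 < p < 100)."""
    values = sorted(data)
    scaled = p * len(values)           # i = scaled / 100
    if scaled % 100 == 0:              # i is an integer: average positions i and i+1
        i = scaled // 100
        return (values[i - 1] + values[i]) / 2
    return values[scaled // 100]       # round i up; position i is index i - 1


def five_number_summary(data):
    """Return (min, Q1, Q2, Q3, max)."""
    values = sorted(data)
    return (values[0], percentile(values, 25), percentile(values, 50),
            percentile(values, 75), values[-1])


data = [128, 130, 133, 137, 138, 142, 142, 144, 147, 149,
        151, 151, 151, 155, 156, 161, 163, 163, 166]

print(five_number_summary(data))  # (128, 138, 149, 156, 166)
```

With 19 values, none of the three positions is a whole number, so each quartile is a single data value found by rounding up.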

Robust Detection of Outliers

Using a five number summary, a data value is an outlier if

1. It is located 1.5(IQR) or more below the first quartile

2. It is located 1.5(IQR) or more above the third quartile

Robust Detection of Outliers

• The data set:

2 5 37 39 39 41 43 44 44 47 50 53

54 55 56 56 57 59 61 61 65 69 100 200

has a five number summary:

2 42.0 53.5 60.0 200

$$\text{IQR} = Q_3 - Q_1 = 60.0 - 42.0 = 18.0$$

Robust Detection of Outliers

Calculate: 1.5(IQR) = 27

1.5(IQR) below the first quartile:

42 - 27 = 15

1.5(IQR) above the third quartile:

60 + 27 = 87

Therefore, 2, 5, 100, and 200 are outliers
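The outlier check can be sketched in Python as follows. It assumes the small-data-set percentile rule from Section 3.4 for the quartiles; the names `percentile`, `lower_fence`, and `upper_fence` are made up for this illustration.

```python
def percentile(data, p):
    """pth percentile by the small-data-set rule (integer p, 0 < p < 100)."""
    values = sorted(data)
    scaled = p * len(values)           # i = scaled / 100
    if scaled % 100 == 0:              # i is an integer: average positions i and i+1
        i = scaled // 100
        return (values[i - 1] + values[i]) / 2
    return values[scaled // 100]       # round i up; position i is index i - 1


data = [2, 5, 37, 39, 39, 41, 43, 44, 44, 47, 50, 53,
        54, 55, 56, 56, 57, 59, 61, 61, 65, 69, 100, 200]

q1, q3 = percentile(data, 25), percentile(data, 75)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr   # 1.5(IQR) below the first quartile
upper_fence = q3 + 1.5 * iqr   # 1.5(IQR) above the third quartile

# "1.5(IQR) or more" means values on the fence itself also count.
outliers = [x for x in data if x <= lower_fence or x >= upper_fence]
print(lower_fence, upper_fence, outliers)  # 15.0 87.0 [2, 5, 100, 200]
```

Because the fences are built from quartiles, the extreme values themselves do not move the cutoffs.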

Boxplot

A boxplot (or box-and-whisker diagram) is a graph of a data set that consists of a line extending from the minimum value to the maximum value, and a box with lines drawn at the first quartile, Q1; the median; and the third quartile, Q3.


Example of Boxplot

Sorted amounts of Strontium-90 (in millibecquerels) in a random sample of baby teeth obtained from Philadelphia residents born after 1979. Note: this data is related to the Three Mile Island nuclear power plant accident in 1979.

128 130 133 137 138 142 142 144 147 149

151 151 151 155 156 161 163 163 166 172

Example of Boxplot

• Five Number Summary

128 140 150 158.5 172

• Boxplot?

Next slide: page 148, constructing a boxplot by hand, calculator, or SPSS


Constructing a Boxplot

Page 148 (see example 3.41)


Calculator Five Number Summary

• Enter the data in a list.

• Go to STAT - CALC and choose 1-Var Stats

• On the HOME screen, when 1-Var Stats appears, type the list containing the data.

• Arrow down to the five number summary (last five items in the list)

Calculator Boxplot

• CLEAR out the graphs under y = (or turn them off).

• Enter the data into the calculator lists. (choose STAT, #1 EDIT and type in entries)

• Press 2nd STATPLOT and choose #1 PLOT 1. Be sure the plot is ON, the second box-and-whisker icon is highlighted, and that the list you will be using is indicated next to Xlist.

• To see the box-and-whisker plot, press ZOOM and #9 ZoomStat. Press the TRACE key to see on-screen data about the box-and-whisker plot.


Boxplot - Symmetric Distribution

Normal Distribution: Heights from a Random Sample of Women

NOTE: $Q_3 - Q_2 \approx Q_2 - Q_1$

value)data(min value)data(max 23 QQ


Boxplot - Skewed Distribution

Skewed Distribution: Salaries (in thousands of dollars) of NCAA Football Coaches