Measures of Variation

Measures of Variation. Range The difference between the maximum and the minimum data entries in a data set. Range = max value – min value



Range
The difference between the maximum and the minimum data entries in a data set.

Range = max value – min value

Deviation
The difference between a data entry (x) and the mean (µ).

Deviation of x = x - µ

EX: Find the range of the set and the deviation of each value.

Salary (1000s of dollars)    Deviation
41
37
39
45
47
41
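The exercise above can be checked with a short Python sketch. It assumes the six listed salaries are the whole data set and uses their mean as µ; the variable names are illustrative only.

```python
# Salaries in thousands of dollars, as listed in the exercise.
salaries = [41, 37, 39, 45, 47, 41]

# Range = max value - min value
data_range = max(salaries) - min(salaries)

# Mean of the data set (assumed here to play the role of µ).
mean = sum(salaries) / len(salaries)

# Deviation of x = x - µ, one deviation per entry.
deviations = [x - mean for x in salaries]

print("range:", data_range)
print("mean:", round(mean, 2))
for x, d in zip(salaries, deviations):
    print(x, round(d, 2))
```

The deviations always sum to zero (up to rounding), which is why the next step squares them before averaging.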

Population Variance (σ²)
Square the deviations of the data set, then average them to get the population variance.

$$\sigma^2 = \frac{\sum (x - \mu)^2}{n}$$

Population Standard Deviation (σ)
Take the square root of the population variance.

EX: Find the variance and standard deviation of the data set.

x     x - µ     (x - µ)²
41
37
39
45
47
41
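A minimal Python sketch of the table above, assuming the six values are treated as a complete population (so the sum of squared deviations is divided by the number of entries). Variable names are illustrative.

```python
import math

x_values = [41, 37, 39, 45, 47, 41]
n = len(x_values)

# Population mean µ.
mu = sum(x_values) / n

# One table row per entry: x, x - µ, (x - µ)^2.
rows = [(x, x - mu, (x - mu) ** 2) for x in x_values]

# Population variance: the average of the squared deviations.
variance = sum(sq for _, _, sq in rows) / n

# Population standard deviation: the square root of the variance.
std_dev = math.sqrt(variance)

for x, dev, sq in rows:
    print(x, round(dev, 2), round(sq, 2))
print("variance:", round(variance, 2))
print("standard deviation:", round(std_dev, 2))
```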

Sample variance and standard deviation:

EX: Find the variance and standard deviation of the sample.
The weights (in pounds) of a sample of 10 U.S. presidents:

173 175 200 173 160 185 195 230 190 180
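As a sketch of this exercise in Python: the code below assumes the usual sample definitions, in which the sum of squared deviations from the sample mean is divided by n - 1 rather than n. It reads the run-together entries as ten three-digit weights and cross-checks the hand computation against the standard library's `statistics.variance` and `statistics.stdev`.

```python
import math
import statistics

# Weights in pounds, read as ten three-digit values.
weights = [173, 175, 200, 173, 160, 185, 195, 230, 190, 180]
n = len(weights)

# Sample mean.
x_bar = sum(weights) / n

# Sample variance: sum of squared deviations divided by n - 1.
sample_variance = sum((x - x_bar) ** 2 for x in weights) / (n - 1)

# Sample standard deviation: square root of the sample variance.
sample_std = math.sqrt(sample_variance)

print("mean:", round(x_bar, 1))
print("sample variance:", round(sample_variance, 2))
print("sample standard deviation:", round(sample_std, 2))

# The standard library uses the same n - 1 definition.
print(math.isclose(sample_variance, statistics.variance(weights)))
print(math.isclose(sample_std, statistics.stdev(weights)))
```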

Interpreting Standard Deviation
Standard deviation is a measure of the typical amount an entry deviates from the mean. The more the entries are spread out, the greater the standard deviation.

Empirical Rule
For data with a symmetric (bell-shaped) distribution, the standard deviation has the following characteristics:

1. About 68% of the data lie within 1 standard deviation of the mean.

2. About 95% of the data lie within 2 standard deviations of the mean.

3. About 99.7% of the data lie within 3 standard deviations of the mean.

Ex (from page 96)
Use the Empirical Rule, assuming the data has a bell-shaped distribution:

30. The mean monthly utility bill for a sample of households in a city is $70, with a standard deviation of $8. Between what two values do about 95% of the data lie?
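One way to work this exercise is sketched below in Python. The helper `empirical_interval` and the `COVERAGE` table are illustrative names, not part of the original slides; the percentages are the ones stated in the Empirical Rule.

```python
# Approximate share of bell-shaped data within k standard deviations.
COVERAGE = {1: 0.68, 2: 0.95, 3: 0.997}

def empirical_interval(mean, sd, k):
    """Return (low, high, approximate share) for mean ± k standard deviations."""
    if k not in COVERAGE:
        raise ValueError("the Empirical Rule covers k = 1, 2, or 3 only")
    return mean - k * sd, mean + k * sd, COVERAGE[k]

# Utility bills: mean $70, standard deviation $8, about 95% -> k = 2.
low, high, share = empirical_interval(70, 8, 2)
print(f"About {share:.0%} of bills lie between ${low} and ${high}")
```

For this exercise the interval is 70 ± 2(8), that is, from $54 to $86.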

Chebyshev’s Theorem
This works for ANY data set, symmetric or not.

The portion of any data set lying within k standard deviations of the mean (k > 1) is at least

$$1 - \frac{1}{k^2}$$

Ex: (from page 96)
36. Old Faithful is a famous geyser at Yellowstone National Park. From a sample with n = 32, the mean duration of Old Faithful’s eruptions is 3.32 minutes and the standard deviation is 1.09 minutes. Using Chebyshev’s Theorem, determine at least how many of the eruptions lasted between 1.14 minutes and 5.5 minutes.
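A hedged Python sketch of this exercise follows. The function name `chebyshev_min_fraction` is illustrative. The key observation is that 1.14 and 5.5 are each 2 standard deviations from the mean (3.32 ± 2 × 1.09), so k = 2; the code rounds k to absorb floating-point noise.

```python
import math

def chebyshev_min_fraction(k):
    """Minimum share of any data set within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2

mean, sd, n = 3.32, 1.09, 32
low, high = 1.14, 5.5

# Number of standard deviations from the mean to each endpoint.
k_low = round((mean - low) / sd, 6)
k_high = round((high - mean) / sd, 6)
k = min(k_low, k_high)

min_fraction = chebyshev_min_fraction(k)
min_count = math.ceil(min_fraction * n)

print("k =", k)
print("at least", min_fraction, "of the data, i.e.", min_count, "eruptions")
```

With k = 2 the bound is 1 - 1/4 = 0.75, so at least 24 of the 32 eruptions fall in the interval.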

Standard Deviation for grouped data:

Measures of Position

Quartiles
The data set is divided into 4 sections, separated by 3 QUARTILES.

Q1 – about 25% of the data is below Quartile 1
Q2 – about 50% of the data is below Quartile 2
Q3 – about 75% of the data is below Quartile 3

(Q2 is also the median!)

Ex: Find the Quartiles
The number of vacation days used by a sample of 20 employees in a recent year:

3 9 2 1 7
5 3 2 2 6
4 0 10 0 3
5 7 8 6 5
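The sketch below finds the quartiles in Python using the median-of-halves approach: Q2 is the median of the ordered data, Q1 is the median of the lower half, and Q3 is the median of the upper half. This is one common classroom convention, assumed here; other software may interpolate differently and give slightly different values. The data is read as 20 separate entries, and the function name `quartiles` is illustrative.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    ordered = sorted(data)
    half = len(ordered) // 2
    lower = ordered[:half]                  # entries below the median position
    upper = ordered[len(ordered) - half:]   # entries above the median position
    return median(lower), median(ordered), median(upper)

vacation_days = [3, 9, 2, 1, 7, 5, 3, 2, 2, 6,
                 4, 0, 10, 0, 3, 5, 7, 8, 6, 5]

q1, q2, q3 = quartiles(vacation_days)
print("Q1 =", q1, "Q2 =", q2, "Q3 =", q3)
```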

Interquartile Range (IQR)
The IQR is the measure of variation that gives the range of the middle 50% of the data. It is the difference between the 3rd and 1st quartiles.

IQR = Q3 – Q1

Box-and-Whisker Plot
Find the 3 quartiles of the data set, and the minimum and maximum entries.

Construct a horizontal scale that spans the range.

Draw a box from Q1 to Q3 and draw a vertical line at Q2.

Draw whiskers from the box to the min and max entries.

Construct a box-and-whisker plot
The number of vacation days used by a sample of 20 employees in a recent year:

3 9 2 1 7
5 3 2 2 6
4 0 10 0 3
5 7 8 6 5
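Before drawing the plot, it helps to collect the five values it is built from, plus the IQR. The Python sketch below does that for the vacation-day data, read as 20 separate entries. The function `five_number_summary` is an illustrative name, and the quartiles use the median-of-halves convention, which is an assumption about the method intended here.

```python
from statistics import median

def five_number_summary(data):
    """Return the min, quartiles, max, and IQR needed for a box-and-whisker plot."""
    if len(data) < 2:
        raise ValueError("need at least two entries")
    ordered = sorted(data)
    half = len(ordered) // 2
    q1 = median(ordered[:half])
    q3 = median(ordered[len(ordered) - half:])
    return {
        "min": ordered[0],       # end of the left whisker
        "Q1": q1,                # left edge of the box
        "Q2": median(ordered),   # vertical line inside the box
        "Q3": q3,                # right edge of the box
        "max": ordered[-1],      # end of the right whisker
        "IQR": q3 - q1,          # width of the box
    }

vacation_days = [3, 9, 2, 1, 7, 5, 3, 2, 2, 6,
                 4, 0, 10, 0, 3, 5, 7, 8, 6, 5]

summary = five_number_summary(vacation_days)
for name, value in summary.items():
    print(name, "=", value)
```

The box spans Q1 to Q3, so its width is the IQR, and the whiskers reach out to the minimum and maximum entries.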

Percentiles and Deciles
Similar to quartiles, but the data is divided into 10 parts (deciles) or 100 parts (percentiles) instead of 4.

8th Decile: about 80% of the data falls below the decile.

95th Percentile: about 95% of the data falls below the percentile.
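To make the "percent of the data below" idea concrete, here is a small Python sketch that reports what share of a data set falls below a given value. The helper `percent_below` is a hypothetical name, and the data is the vacation-day sample from the quartile exercise, read as 20 entries. It illustrates the definition only; it is not a full percentile-finding procedure.

```python
def percent_below(data, value):
    """Percentage of entries in data that are strictly less than value."""
    below = sum(1 for x in data if x < value)
    return 100 * below / len(data)

vacation_days = [3, 9, 2, 1, 7, 5, 3, 2, 2, 6,
                 4, 0, 10, 0, 3, 5, 7, 8, 6, 5]

# 17 of the 20 entries are below 8, so 8 sits at about the 85th percentile.
print(percent_below(vacation_days, 8))
```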

Standard Score (z-score)
Represents the number of standard deviations a given value (x) falls from the mean (µ).

$$z = \frac{\text{value} - \text{mean}}{\text{standard deviation}} = \frac{x - \mu}{\sigma}$$

Ex: (from page 112)
48. The life spans of a species of fruit fly have a bell-shaped distribution, with a mean of 33 days and a standard deviation of 4 days.

A. The life spans of three randomly selected fruit flies are 34 days, 30 days, and 42 days. Find the z-score that corresponds to each life span. Determine if any of these life spans are unusual.

B. The life spans of three randomly selected fruit flies are 29 days, 41 days, and 25 days. Using the Empirical Rule, find the percentile that corresponds to each life span.
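Both parts of this exercise can be sketched in Python. Two assumptions are made and labeled in the code: a value is called "unusual" when its z-score is more than 2 standard deviations from the mean (a common convention, not stated in the slides), and the percentiles come from splitting the Empirical Rule percentages symmetrically around the mean. The names `z_score` and `EMPIRICAL_PERCENTILE` are illustrative.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

# Percentiles at whole-number z-scores, derived from the Empirical Rule
# (68% / 95% / 99.7% split symmetrically around the mean).
EMPIRICAL_PERCENTILE = {-3: 0.15, -2: 2.5, -1: 16.0, 0: 50.0,
                        1: 84.0, 2: 97.5, 3: 99.85}

mean, sd = 33, 4

# Part A: z-scores, flagging |z| > 2 as unusual (assumed convention).
part_a = {x: z_score(x, mean, sd) for x in (34, 30, 42)}
for x, z in part_a.items():
    label = "unusual" if abs(z) > 2 else "not unusual"
    print(f"{x} days: z = {z} ({label})")

# Part B: each life span is a whole number of standard deviations from
# the mean, so the Empirical Rule table applies directly.
part_b = {}
for x in (29, 41, 25):
    z = z_score(x, mean, sd)
    part_b[x] = EMPIRICAL_PERCENTILE[int(z)]
    print(f"{x} days: z = {z}, about percentile {part_b[x]}")
```

Under these assumptions only the 42-day life span (z = 2.25) is flagged as unusual in part A.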