Chapter 16 Measures of Dispersion
Contents
3.1 Range and Inter-quartile Range
3.2 Box-and-whisker Diagrams
3.3 Standard Deviation
3.4 Comparing the Dispersions of Different Sets of Data
3.5 Effects on the Dispersion with Change in Data
3 Measures of Dispersion
3.1 Range and Inter-quartile Range
A. Introduction to Dispersion
The mean, the median and the mode are measures of the central tendency of a set of data, but they cannot tell us the dispersion of the data. Dispersion is the statistical name for the spread or variability of data.
Consider the following two sets of data, which show the heights (in cm) of the members of two different teams.
Team A: 160, 161, 162, 162, 163, 164, 165, 167, 171, 175
Team B: 154, 156, 158, 159, 162, 164, 166, 172, 174, 185
Both teams have the same mean height of 165 cm, but the heights of the members of Team B are more spread out.
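As a quick check of the figures above, the following Python sketch computes the mean height of each team together with its smallest and largest values. The variable and function names are illustrative only.

```python
# Heights (in cm) of the two teams, copied from the text above.
team_a = [160, 161, 162, 162, 163, 164, 165, 167, 171, 175]
team_b = [154, 156, 158, 159, 162, 164, 166, 172, 174, 185]

def mean(data):
    """Arithmetic mean: the sum of the data divided by the number of data."""
    return sum(data) / len(data)

for name, heights in (("Team A", team_a), ("Team B", team_b)):
    print(name, "mean =", mean(heights),
          "smallest =", min(heights), "largest =", max(heights))
```

Both means come out as 165, while the smallest and largest heights of Team B lie further apart than those of Team A.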
B. Range
The range is a simple measure of the dispersion of a set of data. For ungrouped data, the range is the difference between the largest value and the smallest value of the set of data.
Range = largest value – smallest value
For a set of grouped data, the range is the difference between the highest class boundary and the lowest class boundary.
Range = highest class boundary – lowest class boundary
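The two definitions of the range can be sketched in Python as follows. The function names are illustrative, and the class boundaries are hypothetical values invented for this example; they do not come from the text.

```python
def range_ungrouped(data):
    """Range = largest value - smallest value."""
    return max(data) - min(data)

def range_grouped(class_boundaries):
    """Range = highest class boundary - lowest class boundary.

    class_boundaries is a list of (lower boundary, upper boundary) pairs.
    """
    lowest = min(lower for lower, _ in class_boundaries)
    highest = max(upper for _, upper in class_boundaries)
    return highest - lowest

# Heights of Team A (in cm), taken from Section A.
team_a = [160, 161, 162, 162, 163, 164, 165, 167, 171, 175]

# Hypothetical class boundaries (in cm), for illustration only.
classes = [(149.5, 159.5), (159.5, 169.5), (169.5, 179.5), (179.5, 189.5)]

print(range_ungrouped(team_a))  # 175 - 160 = 15
print(range_grouped(classes))   # 189.5 - 149.5 = 40.0
```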
C. Inter-quartile Range
The inter-quartile range is defined as the difference between the upper quartile and the lower quartile of a set of data.
Inter-quartile range = Q3 – Q1
When the data are arranged in ascending order of magnitude, the quartiles divide the data into four equal parts. There are three quartiles, usually denoted by Q1 (the lower quartile), Q2 (the median) and Q3 (the upper quartile).
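A minimal Python sketch of the quartiles and the inter-quartile range is given below. Several conventions for locating quartiles are in use, and the text does not say which one it follows. This sketch assumes that Q1 and Q3 are the medians of the lower and upper halves of the ordered data, leaving out the middle value when the number of data is odd. The function names are illustrative.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves convention (an assumption)."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]      # values below the middle
    upper = ordered[n - half:]  # values above the middle
    return median(lower), median(ordered), median(upper)

def inter_quartile_range(data):
    """Inter-quartile range = Q3 - Q1."""
    q1, _, q3 = quartiles(data)
    return q3 - q1

# Heights of Team A (in cm), taken from Section A.
team_a = [160, 161, 162, 162, 163, 164, 165, 167, 171, 175]

print(quartiles(team_a))             # (162, 163.5, 167)
print(inter_quartile_range(team_a))  # 5
```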
3.2 Box-and-whisker Diagrams
A box-and-whisker diagram illustrates the spread of a set of data. It provides a graphical summary of the set of data by showing the quartiles and the extreme values of the data.
Fig. 3.16
In a box-and-whisker diagram, the difference between the two end-points of the line (represented by the highest and lowest marks) is the range, and the length of the box is the inter-quartile range.
From the diagram in Fig. 3.16, we know that the range of the data is 22 and the inter-quartile range is 9.
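The five values that a box-and-whisker diagram displays can be computed as in the sketch below. Because the data behind Fig. 3.16 are not given, the sketch uses the heights of Team B from Section 3.1 instead. It assumes the median-of-halves convention for the quartiles and at least two data; the function name is illustrative.

```python
from statistics import median

def five_number_summary(data):
    """Return (lowest value, Q1, median, Q3, highest value).

    Assumes Q1 and Q3 are the medians of the lower and upper halves
    of the ordered data, and that there are at least two data.
    """
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    q1 = median(ordered[:half])
    q3 = median(ordered[n - half:])
    return ordered[0], q1, median(ordered), q3, ordered[-1]

# Heights of Team B (in cm), taken from Section 3.1.
team_b = [154, 156, 158, 159, 162, 164, 166, 172, 174, 185]

lowest, q1, q2, q3, highest = five_number_summary(team_b)
print("end-points of the line:", lowest, highest)  # 154 185
print("ends of the box:", q1, q3)                  # 158 172
print("range =", highest - lowest)                 # 31
print("inter-quartile range =", q3 - q1)           # 14
```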
3.3 Standard Deviation
A. Standard Deviation for Ungrouped Data
For a set of ungrouped data x1, x2, …, xn,
\[
\text{Standard deviation} = \sqrt{\frac{(x_1-\bar{x})^2+(x_2-\bar{x})^2+\cdots+(x_n-\bar{x})^2}{n}} = \sqrt{\frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n}}
\]
where \(\bar{x}\) is the mean and \(n\) is the total number of data.
Notes:
1. Two sets of data may have the same mean but different standard deviations.
2. The larger the standard deviation, the more spread out the data is.
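The formula for ungrouped data can be sketched directly in Python. The example applies it to the two teams from Section 3.1 to illustrate both notes: the means are equal, but the standard deviations differ. The function name is illustrative.

```python
from math import sqrt

def standard_deviation(data):
    """Square root of the mean of the squared deviations from the mean (divisor n)."""
    n = len(data)
    mean = sum(data) / n
    return sqrt(sum((x - mean) ** 2 for x in data) / n)

# Heights (in cm) of the two teams from Section 3.1; both have mean 165.
team_a = [160, 161, 162, 162, 163, 164, 165, 167, 171, 175]
team_b = [154, 156, 158, 159, 162, 164, 166, 172, 174, 185]

print(round(standard_deviation(team_a), 2))  # 4.52
print(round(standard_deviation(team_b), 2))  # 9.1
```

Team B has the larger standard deviation, which agrees with its heights being more spread out.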
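B. Standard Deviation for Grouped Data
For a set of grouped data, we have to consider the frequency of each datum.
\[
\text{Standard deviation} = \sqrt{\frac{f_1(x_1-\bar{x})^2+f_2(x_2-\bar{x})^2+\cdots+f_n(x_n-\bar{x})^2}{f_1+f_2+\cdots+f_n}} = \sqrt{\frac{\sum_{i=1}^{n}f_i(x_i-\bar{x})^2}{\sum_{i=1}^{n}f_i}}
\]
where \(f_i\) is the frequency of the \(i\)th group of data, \(\bar{x}\) is the mean and \(n\) is the number of groups, so that \(f_1+f_2+\cdots+f_n\) is the total number of data.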
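A minimal Python sketch of the grouped-data formula follows. It assumes that each group is represented by a single value x_i (for example, its class mark). The frequency table is hypothetical and invented for illustration, and the function name is not from the text.

```python
from math import sqrt

def grouped_standard_deviation(values, frequencies):
    """Standard deviation of grouped data.

    values[i] represents the i-th group (assumed here to be its class mark)
    and frequencies[i] is the frequency of that group.
    """
    total = sum(frequencies)  # total number of data
    mean = sum(f * x for x, f in zip(values, frequencies)) / total
    variance = sum(f * (x - mean) ** 2
                   for x, f in zip(values, frequencies)) / total
    return sqrt(variance)

# Hypothetical frequency table (class marks in cm), for illustration only.
class_marks = [155, 165, 175, 185]
frequencies = [3, 4, 2, 1]

# The mean is 166 and the weighted squared deviations sum to 890,
# so the standard deviation is sqrt(890 / 10) = sqrt(89).
print(round(grouped_standard_deviation(class_marks, frequencies), 2))  # 9.43
```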
C. Finding Standard Deviation by a Calculator
Consider the following data:
40, 36, 47, 53, 56.
The following steps demonstrate how to use a calculator to find the mean and the standard deviation of the data.
Step 1: Set the function mode of the calculator to standard deviation ‘SD’ by pressing MODE MODE 1, then clear all the previous data in the ‘SD’ mode by pressing SHIFT CLR 1 EXE.
Step 2: Press the following keys in sequence: 40 DT 36 DT 47 DT 53 DT 56 DT
Step 3: Press SHIFT S-VAR 1 EXE to obtain the mean = 46.4. Press SHIFT S-VAR 2 EXE to obtain the standard deviation = 7.55 (correct to 3 significant figures).
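The calculator results can be cross-checked with Python's standard `statistics` module. This is only a sketch for verification: `pstdev` is used because it divides by n, which matches the formula in Section A.

```python
from statistics import mean, pstdev

data = [40, 36, 47, 53, 56]

# pstdev divides the sum of squared deviations by n (not n - 1).
print(round(mean(data), 1))    # 46.4
print(round(pstdev(data), 2))  # 7.55
```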
3.4 Comparing the Dispersions of Different Sets of Data
Measure of dispersion | Advantage | Disadvantage
1. Range | Only two data are involved, so it is the easiest one to calculate. | Only extreme values are considered, which may give a misleading impression of the dispersion.
2. Inter-quartile range | It focuses only on the middle 50% of the data, thus avoiding the influence of extreme values. | It cannot show the dispersion of the whole group of data.
3. Standard deviation | It takes all the data into account. | It is difficult to compute without using a calculator.
Table 3.34
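The points in the table can be illustrated with a small Python sketch that changes one extreme value and compares the three measures. The replacement value of 195 cm is hypothetical, the quartiles use the median-of-halves convention (an assumption), and the helper names are illustrative.

```python
from statistics import median, pstdev

def iqr(data):
    """Inter-quartile range, with Q1 and Q3 taken as medians of the two halves."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    return median(ordered[n - half:]) - median(ordered[:half])

def measures(data):
    """Return (range, inter-quartile range, standard deviation)."""
    return max(data) - min(data), iqr(data), pstdev(data)

# Heights of Team A (in cm) from Section 3.1.
original = [160, 161, 162, 162, 163, 164, 165, 167, 171, 175]
# Hypothetical variant: the tallest member is 195 cm instead of 175 cm.
with_extreme = original[:-1] + [195]

print(measures(original))      # range 15, inter-quartile range 5
print(measures(with_extreme))  # range 35, inter-quartile range still 5
```

In this example the single extreme value more than doubles the range and raises the standard deviation, while the inter-quartile range is unaffected.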
3.5 Effects on the Dispersion with Change in Data
A. Removal of a Certain Item from the Data
If the greatest or the least value (assuming both are unique) in a data set is removed, then
(1) the range will decrease;
(2) the standard deviation will generally decrease, as the data are less widely spread;
(3) the inter-quartile range may increase or decrease.
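The effect of removing the greatest or the least value can be tried out with the sketch below, which uses the heights of Team B from Section 3.1. The quartiles follow the median-of-halves convention (an assumption), and the helper names are illustrative.

```python
from statistics import median, pstdev

def iqr(data):
    """Inter-quartile range, with Q1 and Q3 taken as medians of the two halves."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    return median(ordered[n - half:]) - median(ordered[:half])

def measures(data):
    """Return (range, inter-quartile range, standard deviation)."""
    return max(data) - min(data), iqr(data), pstdev(data)

team_b = [154, 156, 158, 159, 162, 164, 166, 172, 174, 185]
without_greatest = sorted(team_b)[:-1]  # 185 removed
without_least = sorted(team_b)[1:]      # 154 removed

for label, data in (("original", team_b),
                    ("greatest removed", without_greatest),
                    ("least removed", without_least)):
    print(label, measures(data))
```

For this data set the range falls from 31 to 20 or 29 and the standard deviation falls in both cases, while the inter-quartile range moves from 14 down to 12 when 185 is removed and up to 14.5 when 154 is removed.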
B. Adding a Common Constant to the Whole Set of Data
If a constant k is added to every datum in a set of data, then the following measures of dispersion will not change:
(1) the range,
(2) the inter-quartile range and
(3) the standard deviation.
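A short Python sketch can confirm that adding a constant leaves all three measures unchanged. The constant k = 10 is an arbitrary choice for illustration, the quartiles use the median-of-halves convention (an assumption), and the helper names are illustrative.

```python
from math import isclose
from statistics import median, pstdev

def iqr(data):
    """Inter-quartile range, with Q1 and Q3 taken as medians of the two halves."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    return median(ordered[n - half:]) - median(ordered[:half])

# Heights of Team A (in cm) from Section 3.1.
team_a = [160, 161, 162, 162, 163, 164, 165, 167, 171, 175]
k = 10  # arbitrary constant, for illustration only
shifted = [x + k for x in team_a]

print(max(team_a) - min(team_a), max(shifted) - min(shifted))  # 15 15
print(iqr(team_a), iqr(shifted))                               # 5 5
print(isclose(pstdev(team_a), pstdev(shifted)))                # True
```

Every datum moves by the same amount, so the differences between data, on which all three measures depend, stay the same.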
C. Multiplying the Whole Set of Data by a Constant
If every datum in a set of data is multiplied by a positive constant k, then the range, the inter-quartile range and the standard deviation will each be k times their original values.
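The scaling rule can be checked numerically with the sketch below. The constant k = 3 is an arbitrary positive value chosen for illustration, the quartiles use the median-of-halves convention (an assumption), and the helper names are illustrative.

```python
from math import isclose
from statistics import median, pstdev

def iqr(data):
    """Inter-quartile range, with Q1 and Q3 taken as medians of the two halves."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    return median(ordered[n - half:]) - median(ordered[:half])

# Heights of Team A (in cm) from Section 3.1.
team_a = [160, 161, 162, 162, 163, 164, 165, 167, 171, 175]
k = 3  # arbitrary positive constant, for illustration only
scaled = [k * x for x in team_a]

print(max(scaled) - min(scaled))  # 45, which is 3 x 15
print(iqr(scaled))                # 15, which is 3 x 5
print(isclose(pstdev(scaled), k * pstdev(team_a)))  # True
```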
D. Insertion of Zero in the Data Set
In general, statistical data are non-negative, for example, height, weight, score, etc. Zero is then the smallest value in the set of data, so this case is similar to the one studied in Section A, except that now the smallest value is inserted into the set of data rather than removed.
If a zero value is inserted in a positive data set, then
• the range will increase;
• the standard deviation will generally increase, as the data are more widely spread;
• the inter-quartile range may increase or decrease.
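The sketch below inserts a zero into two positive data sets and compares the measures before and after. The first set is the calculator example from Section 3.3; the second is hypothetical and chosen so that the inter-quartile range decreases. The quartiles use the median-of-halves convention (an assumption), and the helper names are illustrative.

```python
from statistics import median, pstdev

def iqr(data):
    """Inter-quartile range, with Q1 and Q3 taken as medians of the two halves."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    return median(ordered[n - half:]) - median(ordered[:half])

def measures(data):
    """Return (range, inter-quartile range, standard deviation)."""
    return max(data) - min(data), iqr(data), pstdev(data)

set_1 = [40, 36, 47, 53, 56]  # data from Section 3.3
set_2 = [10, 11, 12, 13, 20]  # hypothetical data, for illustration only

for data in (set_1, set_2):
    print("before:", measures(data))
    print("after: ", measures(data + [0]))
```

In both sets the range and the standard deviation increase. The inter-quartile range rises from 16.5 to 17 in the first set and falls from 6 to 3 in the second.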