Measures of Dispersion
Variance and Standard Deviation
Basic Assumptions about Distributions
• We should be able to plot the number of times a specific value occurs on a graph using a line chart or histogram (interval/ratio data)
• Some distributions will be normal or bell-shaped.
• Some distributions will be bi-modal or will have data points distributed irregularly.
• Some distributions will be skewed to the right or skewed to the left.
• Theoretically, samples taken from one population should, over time, approximate a normal distribution.
• We should have a normal distribution if we are to use inferential statistics.
Other reasons to use Measures of Dispersion
• To see if variables taken from two or more samples are similar to one another.
• To see if a variable taken from a sample is similar to the same variable taken from a population, that is, whether our sample is representative of people in the population, at least on that one variable.
Variation in Two Samples
Sample 1     Sample 2
1            2
2            3
3            3
4            5
4            7
5            9
6            9
7            10
Mo = 4       Mo = 3, 9
Mdn = 4      Mdn = 6
Mean = 4     Mean = 6
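The mode, median, and mean listed for the two samples can be reproduced with a short script. This is a minimal sketch using Python's standard `statistics` module; the function name `central_tendency` and the variable names are illustrative choices, not part of the original slides.

```python
from statistics import mean, median, multimode

# The two samples from the slide
sample_1 = [1, 2, 3, 4, 4, 5, 6, 7]
sample_2 = [2, 3, 3, 5, 7, 9, 9, 10]

def central_tendency(scores):
    """Return (mean, median, list of modes) for a list of scores."""
    # multimode returns every value tied for most frequent,
    # so a bi-modal sample yields two modes.
    return mean(scores), median(scores), multimode(scores)

summary_1 = central_tendency(sample_1)
summary_2 = central_tendency(sample_2)

print("Sample 1:", summary_1)
print("Sample 2:", summary_2)
```

Sample 2 reports two modes (3 and 9), which is why it is described as bi-modal even though its mean and median agree.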
[Figure: histogram of Sample 1 (VAR00001). Mean = 4.0, Std. Dev = 2.00, N = 8]
[Figure: histogram of Sample 2 (VAR00001), with Count on the vertical axis]
Normal Distributions are Bell-shaped and have the same
number of measures on either side of the mean.
Note: According to Montcalm & Royse only unimodal distributions can be normal distributions.
Normal Distributions
• 50% of all scores are on either side of the mean.
• The distribution is symmetrical – same number of scores fall above and below the mean.
• The mean is the midpoint of the distribution.
• Mean = median = mode
• The entire area under the bell-shaped curve = 100%.
A standard deviation is:
• The degree to which the scores in a distribution vary from the mean (x – mean).
• Calculated by squaring the deviation of each score from the mean.
• Based on first calculating a statistic called the variance.
Formulas are:
• Variance = the sum of the squared deviations divided by (n – 1), where n is the number of values in the distribution.
• Standard Deviation = the square root of the variance, that is, the square root of (the sum of squared deviations divided by (n – 1)).
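Written symbolically, the two formulas look as follows. The notation is an assumption added for illustration: x_i is each score, x-bar is the sample mean, s² is the variance, and s is the standard deviation.

```latex
s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1},
\qquad
s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}
```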
Using Sample 1 as an example
Score    Deviation (x - mean)    Squared deviation
1        (1 - 4) = -3            9
2        (2 - 4) = -2            4
3        (3 - 4) = -1            1
4        (4 - 4) = 0             0
4        (4 - 4) = 0             0
5        (5 - 4) = 1             1
6        (6 - 4) = 2             4
7        (7 - 4) = 3             9
Total    0                       28

Mean = 4
Variance = 28/(8 - 1) = 4
S.D. = square root of 4 = 2
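The worked example for Sample 1 can be followed step by step in code. This is a minimal sketch of the (n - 1) formulas; the function names `sample_variance` and `sample_sd` are hypothetical, chosen only for this illustration.

```python
import math

def sample_variance(scores):
    """Sum of squared deviations from the mean, divided by (n - 1)."""
    mean = sum(scores) / len(scores)
    squared_deviations = [(x - mean) ** 2 for x in scores]
    return sum(squared_deviations) / (len(scores) - 1)

def sample_sd(scores):
    """Standard deviation: the square root of the sample variance."""
    return math.sqrt(sample_variance(scores))

sample_1 = [1, 2, 3, 4, 4, 5, 6, 7]
variance_1 = sample_variance(sample_1)  # 28 / (8 - 1)
sd_1 = sample_sd(sample_1)              # square root of the variance

print("Variance:", variance_1)
print("Standard deviation:", sd_1)
```

Python's built-in `statistics.variance` and `statistics.stdev` apply the same (n - 1) divisor, so they can be used to cross-check the hand calculation.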
Another variance/SD example
Score    Deviation    Squared deviation
1        -5.00        25.00
2        -4.00        16.00
4        -2.00        4.00
8        2.00         4.00
10       4.00         16.00
11       5.00         25.00
Total    0.00         90.00

Mean = 6
Variance = 90/(6 - 1) = 18.00
SD = square root of 18 = 4.24
Other Important Terms in This Chapter
• Mean squares – the average of squared deviations from the mean in a set of numbers. (Same as variance)
• Interquartile range – the distance between the score at the 25th percentile and the score at the 75th percentile, that is, the span in which the middle 50% of all scores lie (use cumulative percentages).
• Box plot – gives graphic information about minimum, maximum, and quartile scores in a distribution.
Box Plot
[Figure: box plots of Current Salary (vertical axis, 0 to 160,000) by Gender: Male (N = 258) and Female (N = 216)]
Interquartile Range
Test Scores    Frequency    Percent    Cumulative Percent
100            3            25%        100%
90             3            25%        75%
80             3            25%        50%
70             1            8.3%       25%
60             2            16.7%      16.7%
Total          12           100.0%
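The cumulative percentages in the table, and the scores that mark the 25% and 75% points, can be derived from the frequencies alone. The sketch below assumes one simple rule: a quartile is the lowest score whose cumulative percent reaches the target. Other quartile conventions exist and can give slightly different answers. The function names are hypothetical.

```python
# Frequency table from the slide: test score -> number of students
frequencies = {60: 2, 70: 1, 80: 3, 90: 3, 100: 3}

def cumulative_percents(freq):
    """Map each score (ascending) to its cumulative percent."""
    total = sum(freq.values())
    running = 0
    result = {}
    for score in sorted(freq):
        running += freq[score]
        result[score] = 100 * running / total
    return result

def score_at_percent(freq, percent):
    """Lowest score whose cumulative percent reaches `percent`.

    Assumed rule for this illustration; uses integer arithmetic
    to avoid floating-point rounding at the boundary.
    """
    total = sum(freq.values())
    running = 0
    for score in sorted(freq):
        running += freq[score]
        if running * 100 >= percent * total:
            return score
    raise ValueError("percent must be between 0 and 100")

cumulative = cumulative_percents(frequencies)
q1 = score_at_percent(frequencies, 25)
q3 = score_at_percent(frequencies, 75)
iqr = q3 - q1

print(cumulative)
print("25% point:", q1, "75% point:", q3, "Interquartile range:", iqr)
```

Under this rule the 25% point is a score of 70 and the 75% point is a score of 90, so the middle 50% of scores spans 20 points.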
This information is important to our discussion of normal distributions
In a normal distribution (we will discuss the Central Limit Theorem in two weeks):
• 50% of all scores are on either side of the mean.
• About 68.26% of all scores are within one standard deviation of the mean.
• About 95.44% of all scores are within two standard deviations of the mean.
• About 99.74% of all scores are within three standard deviations of the mean.
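These percentages can be checked against the standard normal curve. The sketch below uses `statistics.NormalDist` from Python's standard library (Python 3.8 or later); the helper name `percent_within` is hypothetical. The exact areas are roughly 68.27%, 95.45%, and 99.73%, so the figures quoted on the slide likely differ in the second decimal place because they come from rounded table values.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1
standard_normal = NormalDist(mu=0, sigma=1)

def percent_within(k):
    """Percent of the area under the curve within k SDs of the mean."""
    return 100 * (standard_normal.cdf(k) - standard_normal.cdf(-k))

within = {k: percent_within(k) for k in (1, 2, 3)}

for k, pct in within.items():
    print(f"Within {k} SD of the mean: {pct:.2f}%")
```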
Therefore, we will be able to
• Predict what scores are contained within one, two, or three standard deviations from the mean in a normal distribution.
• Compare the distribution of scores in samples.
• Compare the distribution of scores from populations to samples.
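As an illustration of the first point, the score ranges that fall within one, two, and three standard deviations follow directly from the mean and the standard deviation. The sketch below uses Sample 1's values (mean 4, standard deviation 2); the function name `sd_ranges` is hypothetical, and with a sample this small the normal-curve percentages apply only approximately.

```python
def sd_ranges(mean, sd, max_k=3):
    """Return {k: (lower, upper)} for mean - k*sd to mean + k*sd."""
    return {k: (mean - k * sd, mean + k * sd) for k in range(1, max_k + 1)}

# Sample 1 from the earlier worked example: mean = 4, SD = 2
ranges = sd_ranges(4, 2)

for k, (lower, upper) in ranges.items():
    print(f"Within {k} SD of the mean: scores from {lower} to {upper}")
```

For Sample 1 this gives 2 to 6 for one standard deviation, 0 to 8 for two, and -2 to 10 for three.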
To calculate measures of central tendency and dispersion in SPSS
• Select Analyze, then Descriptive Statistics
• Select Descriptives
• Select your variables
• Select Options (mean, sd, etc.)
SPSS output
Descriptive Statistics
                             N      Range    Minimum    Maximum    Mean     Std. Deviation    Variance
Educational Level (years)    474    13       8          21         13.49    2.885             8.322
Valid N (listwise)           474