bonita-mdoda-armstrong
8/3/2019 2 Mean Median Mode Variance
http://slidepdf.com/reader/full/2-mean-median-mode-variance 1/29
Central Tendency and Dispersion of a Distribution
Review Topics
• Measures of Central Tendency: Mean, Median, Mode
• Quartile
• Measures of Variation: The Range, Variance and Standard Deviation, Coefficient of Variation
• Shape: Symmetric, Skewed
Important Summary Measures
Summary measures for one sample:
• Central Tendency: Mean, Median, Mode, Quartile
• Variation: Range, Variance, Standard Deviation, Coefficient of Variation
Measures of Central Tendency
Central Tendency: Mean, Median, Mode
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
Data: You can access practice sample data on
HMO premiums here.
Measures of Central Location (Tendency)
Usually, we focus our attention on two aspects of measures of central location:
– Measure of the central data point (the average).
– Measure of dispersion of the data about the average.
With one data point, clearly the central location is at the point itself.
With two data points, the central location should fall in the middle between them (in order to reflect the location of both of them).
If the third data point appears exactly in the middle of the current range, the central location should not change (because it is currently residing in the middle).
If the third data point appears on the left-hand side of the midrange, it should “pull” the central location to the left.
The arithmetic mean
– This is the most popular and useful measure of central location.
Mean = (Sum of the measurements) / (Number of measurements)
Sample mean (n is the sample size):
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
Population mean (N is the population size):
$$\mu = \frac{\sum_{i=1}^{N} x_i}{N}$$
• Example 4.1
The mean of the sample of six measurements 7, 3, 9, -2, 4, 6 is
$$\bar{x} = \frac{\sum_{i=1}^{6} x_i}{6} = \frac{x_1+x_2+x_3+x_4+x_5+x_6}{6} = \frac{7+3+9-2+4+6}{6} = 4.5$$
• Example 4.2
Suppose the telephone bills of Example 2.1 represent a population of measurements. The population mean is
$$\mu = \frac{\sum_{i=1}^{200} x_i}{200} = \frac{x_1+x_2+\dots+x_{200}}{200} = \frac{42.19+15.30+\dots+53.21}{200} = 43.59$$
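The calculation in Example 4.1 can be reproduced in a few lines. This is a minimal sketch; the function name `arithmetic_mean` is chosen here for illustration and does not come from the slides.

```python
# Sample of six measurements from Example 4.1
sample = [7, 3, 9, -2, 4, 6]


def arithmetic_mean(values):
    """Return the sum of the measurements divided by their number."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


sample_mean = arithmetic_mean(sample)
print(sample_mean)  # 4.5
```

The same function applies to a population: only the interpretation of the result (µ instead of x̄) changes.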
The median
– The median of a set of measurements is the value that falls in the middle when the measurements are arranged in order of magnitude.
Example 4.4
Seven employee salaries were recorded (in 1000s): 28, 60, 26, 32, 30, 26, 29. Find the median salary.
Odd number of observations: first, sort the salaries: 26, 26, 28, 29, 30, 32, 60. Then, locate the value in the middle: the median is 29.
Suppose one employee's salary of $31,000 was added to the group recorded before. Find the median salary.
Even number of observations: first, sort the salaries: 26, 26, 28, 29, 30, 31, 32, 60. Then, locate the values in the middle. There are two middle values, 29 and 30, so the median is their average, 29.5.
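The sort-then-pick-the-middle procedure of Example 4.4 can be sketched as follows. The function name `median` is an illustrative choice; the salary values are those of the example, in thousands.

```python
def median(values):
    """Return the middle value of the sorted data.

    With an even number of observations there are two middle values,
    and their average is returned.
    """
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


salaries = [28, 60, 26, 32, 30, 26, 29]  # in 1000s
median_odd = median(salaries)            # seven observations
median_even = median(salaries + [31])    # eight observations
print(median_odd, median_even)  # 29 29.5
```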
The mode
– The mode of a set of measurements is the value that occurs most frequently.
– A set of data may have one mode (or modal class), or two or more modes.
The modal class: for large data sets, the modal class is much more relevant than a single-value mode.
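A short sketch of both ideas, using Python's standard library. The data are the salaries from Example 4.4; the second list and the class width of 10 are hypothetical choices made only for this illustration.

```python
from collections import Counter
from statistics import multimode

salaries = [28, 60, 26, 32, 30, 26, 29]  # in 1000s, from Example 4.4

# multimode returns every value tied for the highest frequency.
modes = multimode(salaries)
two_modes = multimode([1, 1, 2, 3, 3])  # hypothetical bimodal data

# Modal class: group values into classes of equal width and
# report the class with the most observations.
class_width = 10  # hypothetical width
class_counts = Counter((x // class_width) * class_width for x in salaries)
modal_class_start = class_counts.most_common(1)[0][0]

print(modes)              # [26]
print(two_modes)          # [1, 3]
print(modal_class_start)  # 20, i.e. the class from 20 up to 30
```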
• Example 4.6
A professor of statistics wants to report the results of a midterm exam, taken by 100 students. The data appear in file XM04-. Find the mean, median, and mode, and describe the information they provide.
Excel Results: Marks
Mean 73.98
Standard Error 2.1502163
Median 81
Mode 84
Standard Deviation 21.502163
Sample Variance 462.34303
Kurtosis 0.3936606
Skewness -1.073098
Range 89
Minimum 11
Maximum 100
Sum 7398
Count 100
The mean provides information about the overall performance level of the class.
The median indicates that half of the class received a grade below 81%, and half of the class received a grade above 81%.
The mode must be used when data is qualitative. If marks are classified by letter grade, the frequency of each grade can be calculated. Then, the mode becomes a logical measure to compute.
Relationship among Mean, Median, and Mode
If a distribution is symmetrical, the mean, median and mode coincide.
If a distribution is non-symmetrical, and skewed to the left or to the right, the three measures differ.
[Figure: a positively skewed distribution (“skewed to the right”), with the positions of the mean, median and mode marked.]
[Figure: a negatively skewed distribution (“skewed to the left”), with the positions of the mean, median and mode marked. Compared with the positively skewed case, the order of the three measures is typically reversed.]
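As a small check of how the three measures separate under skew, the salary data from Example 4.4 can be reused: one large value (60) stretches the right tail. This is an illustrative sketch using the standard `statistics` module, not a calculation from the slides.

```python
from statistics import mean, median, mode

salaries = [28, 60, 26, 32, 30, 26, 29]  # in 1000s, from Example 4.4

salary_mean = mean(salaries)      # pulled toward the large value 60
salary_median = median(salaries)  # middle of the sorted data
salary_mode = mode(salaries)      # most frequent value

print(salary_mode, salary_median, salary_mean)  # 26 29 33

# For this right-skewed sample the typical ordering holds:
# mode < median < mean.
is_right_skew_order = salary_mode < salary_median < salary_mean
```

The ordering mode < median < mean is typical of right-skewed data but is not guaranteed for every data set.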
Measures of Variation
• Range
• Interquartile Range
• Variance: Population Variance, Sample Variance
• Standard Deviation: Population Standard Deviation, Sample Standard Deviation
• Coefficient of Variation:
$$CV = \left(\frac{S}{\bar{X}}\right) \cdot 100\%$$
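The coefficient of variation expresses the sample standard deviation as a percentage of the mean. A minimal sketch follows; the function name is illustrative and the salary data are borrowed from Example 4.4 only to have something to run it on.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the sample standard deviation as a percentage of the mean."""
    average = mean(values)
    if average == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return stdev(values) / average * 100


salaries = [28, 60, 26, 32, 30, 26, 29]  # in 1000s
cv = coefficient_of_variation(salaries)
print(f"CV = {cv:.1f}%")
```

Because the CV is unit-free, it can be used to compare the relative spread of data sets measured on different scales.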
Measures of variability (Looking beyond the average)
Measures of central location fail to tell the whole story about the distribution.
A question of interest still remains unanswered:
How typical is the average value of all the measurements in the data set?
or
How spread out are the measurements about the average value?
Observe two hypothetical data sets
Low variability data set: the average value provides a good representation of the values in the data set.
High variability data set: the same average value does not provide as good a representation of the values in the data set as before.
The range
– The range of a set of measurements is the difference between the largest and smallest measurements.
– Its major advantage is the ease with which it can be computed.
– Its major shortcoming is its failure to provide information on the dispersion of the values between the two end points.
[Figure: the range drawn as the distance from the smallest measurement to the largest measurement, with question marks in between.]
But how do all the measurements spread out? The range cannot assist in answering this question.
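The shortcoming described above can be seen with two data sets that share the same end points. Both lists below are hypothetical values invented for this illustration.

```python
def value_range(values):
    """Return the largest measurement minus the smallest measurement."""
    return max(values) - min(values)


# Hypothetical data sets with identical end points (0 and 10).
clustered = [0, 5, 5, 5, 10]     # interior values sit at the center
spread_out = [0, 0, 5, 10, 10]   # interior values sit at the extremes

range_clustered = value_range(clustered)
range_spread_out = value_range(spread_out)
print(range_clustered, range_spread_out)  # 10 10
```

Both ranges equal 10 even though the interior values are arranged very differently, which is why a measure that uses every observation is needed.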
The variance
– This measure of dispersion reflects the values of all the measurements.
– The variance of a population of N measurements $x_1, x_2, \dots, x_N$ having a mean $\mu$ is defined as
$$\sigma^2 = \frac{\sum_{i=1}^{N}(x_i-\mu)^2}{N}$$
– The variance of a sample of n measurements $x_1, x_2, \dots, x_n$ having a mean $\bar{x}$ is defined as
$$s^2 = \frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n-1}$$
Consider two small populations:
Population A: 8, 9, 10, 11, 12
Population B: 4, 7, 10, 13, 16
The mean of both populations is 10, but the measurements in B are much more dispersed than those in A. Thus, a measure of dispersion is needed that agrees with this observation.
Let us start by calculating the sum of deviations from the mean.
A: 8 - 10 = -2, 9 - 10 = -1, 10 - 10 = 0, 11 - 10 = +1, 12 - 10 = +2; Sum = 0
B: 4 - 10 = -6, 7 - 10 = -3, 10 - 10 = 0, 13 - 10 = +3, 16 - 10 = +6; Sum = 0
The sum of deviations is zero in both cases; therefore, another measure is needed.
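The cancellation of positive and negative deviations can be checked directly. This is a small sketch with an illustrative helper name, `deviations_from_mean`, applied to populations A and B.

```python
population_a = [8, 9, 10, 11, 12]
population_b = [4, 7, 10, 13, 16]


def deviations_from_mean(values):
    """Return each measurement minus the mean of the data set."""
    mean = sum(values) / len(values)
    return [x - mean for x in values]


dev_a = deviations_from_mean(population_a)
dev_b = deviations_from_mean(population_b)

print(dev_a, sum(dev_a))  # [-2.0, -1.0, 0.0, 1.0, 2.0] 0.0
print(dev_b, sum(dev_b))  # [-6.0, -3.0, 0.0, 3.0, 6.0] 0.0
```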
The sum of squared deviations is used in calculating the variance. See the example next.
Let us calculate the variance of the two populations.
$$\sigma_A^2 = \frac{(8-10)^2+(9-10)^2+(10-10)^2+(11-10)^2+(12-10)^2}{5} = 2$$
$$\sigma_B^2 = \frac{(4-10)^2+(7-10)^2+(10-10)^2+(13-10)^2+(16-10)^2}{5} = 18$$
Why is the variance defined as the average squared deviation? Why not use the sum of squared deviations as a measure of dispersion instead? After all, the sum of squared deviations increases in magnitude when the dispersion of a data set increases!
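One possible answer to the question above, offered here as an illustration rather than as the slides' own argument: the sum of squared deviations also grows with the number of measurements, while the average does not. The sketch below repeats population A twice (a hypothetical data set) to show this.

```python
from statistics import pvariance


def sum_of_squared_deviations(values):
    """Return the sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)


population_a = [8, 9, 10, 11, 12]
population_b = [4, 7, 10, 13, 16]
doubled_a = population_a * 2  # hypothetical: same spread, twice the size

ss_a = sum_of_squared_deviations(population_a)     # 10.0
ss_doubled_a = sum_of_squared_deviations(doubled_a)  # 20.0

var_a = pvariance(population_a)        # 2
var_b = pvariance(population_b)        # 18
var_doubled_a = pvariance(doubled_a)   # 2, unchanged by the larger size

print(ss_a, ss_doubled_a, var_a, var_doubled_a, var_b)
```

Dividing by the number of measurements makes the dispersion of data sets of different sizes comparable.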
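– Example 4.8
Find the mean and the variance of the following sample of measurements (in years): 3.4, 2.5, 4.1, 1.2, 2.8, 3.7
– Solution
$$\bar{x} = \frac{\sum_{i=1}^{6} x_i}{6} = \frac{3.4+2.5+4.1+1.2+2.8+3.7}{6} = \frac{17.7}{6} = 2.95$$
A shortcut formula:
$$s^2 = \frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n-1} = \frac{1}{n-1}\left[\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}\right]$$
$$s^2 = \frac{1}{5}\left[3.4^2+2.5^2+\dots+3.7^2 - \frac{(17.7)^2}{6}\right] = 1.075 \text{ (years)}^2$$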
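The definition and the shortcut formula can be compared numerically on the data of Example 4.8. This is a minimal sketch; the variable names are illustrative.

```python
data = [3.4, 2.5, 4.1, 1.2, 2.8, 3.7]  # years, from Example 4.8
n = len(data)

sample_mean = sum(data) / n

# Definition: sum of squared deviations from the mean, divided by n - 1.
variance_by_definition = sum((x - sample_mean) ** 2 for x in data) / (n - 1)

# Shortcut: uses only the sum of the values and the sum of their squares.
sum_of_values = sum(data)
sum_of_squares = sum(x * x for x in data)
variance_by_shortcut = (sum_of_squares - sum_of_values ** 2 / n) / (n - 1)

print(round(sample_mean, 2))             # 2.95
print(round(variance_by_definition, 3))  # 1.075
print(round(variance_by_shortcut, 3))    # 1.075
```

The shortcut avoids computing each deviation, which was convenient for hand calculation. In floating-point arithmetic it can lose precision when the mean is large relative to the spread, so the definition is usually the safer choice in code.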
Sample Standard Deviation
$$s = \sqrt{\frac{\sum (X_i-\bar{X})^2}{n-1}}$$
For the sample: use n - 1 in the denominator.
Data ($X_i$): 10 12 14 15 17 18 18 24
n = 8, Mean = 16
$$s = \sqrt{\frac{(10-16)^2+(12-16)^2+(14-16)^2+(15-16)^2+(17-16)^2+(18-16)^2+(18-16)^2+(24-16)^2}{8-1}} = \sqrt{\frac{130}{7}} \approx 4.3095$$
Interpreting Standard Deviation
The standard deviation can be used to
– compare the variability of several distributions
– make a statement about the general shape of a distribution.
The empirical rule: if a sample of measurements has a mound-shaped distribution, the interval
$(\bar{x}-s,\ \bar{x}+s)$ contains approximately 68% of the measurements
$(\bar{x}-2s,\ \bar{x}+2s)$ contains approximately 95% of the measurements
$(\bar{x}-3s,\ \bar{x}+3s)$ contains virtually all of the measurements
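The percentages in the empirical rule match the normal distribution, which is the usual model for a mound-shaped distribution. The sketch below assumes a standard normal distribution (an assumption made here, not stated on the slide) and computes the share of values within k standard deviations of the mean.

```python
from statistics import NormalDist

# Assumption: the mound-shaped distribution is modeled as standard normal.
standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Return the probability of falling within k standard deviations."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, f"{share_within(k) * 100:.1f}%")
# 1 68.3%
# 2 95.4%
# 3 99.7%
```

Real samples that are only roughly mound-shaped will deviate somewhat from these figures, which is why the rule is stated with "approximately".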
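Comparing Standard Deviations
Data ($X_i$): 10 12 14 15 17 18 18 24
N = 8, Mean = 16
$$s = \sqrt{\frac{\sum (X_i-\bar{X})^2}{n-1}} = \sqrt{\frac{130}{7}} \approx 4.3095$$
$$\sigma = \sqrt{\frac{\sum (X_i-\mu)^2}{N}} = \sqrt{\frac{130}{8}} \approx 4.0311$$
The value of the standard deviation is larger when the data are treated as a sample.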
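Both versions of the standard deviation are available in Python's standard `statistics` module, so the comparison can be checked directly. With all eight values included, the squared deviations from the mean of 16 sum to 130, which gives s ≈ 4.31 and σ ≈ 4.03.

```python
from math import sqrt
from statistics import mean, pstdev, stdev

data = [10, 12, 14, 15, 17, 18, 18, 24]
n = len(data)

data_mean = mean(data)             # 16
sample_sd = stdev(data)            # divides by n - 1
population_sd = pstdev(data)       # divides by N

print(data_mean)
print(round(sample_sd, 4))
print(round(population_sd, 4))

# The two are linked by a fixed factor that depends only on n.
ratio = sqrt(n / (n - 1))
```

Since s = σ · sqrt(n / (n - 1)), the sample value is always the larger of the two, and the gap shrinks as n grows.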
Comparing Standard Deviations
[Figure: three dot plots on a common axis from 11 to 21.]
Data A: Mean = 15.5, s = 3.338
Data B: Mean = 15.5, s = .9258
Data C: Mean = 15.5, s = 4.57
Measures of Association
Two numerical measures are presented for the description of the linear relationship between two variables depicted in the scatter diagram.
– Covariance: is there any pattern to the way two variables move together?
– Correlation coefficient: how strong is the linear relationship between two variables?
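The covariance
$$\text{Population covariance} = COV(X,Y) = \frac{\sum (x_i-\mu_x)(y_i-\mu_y)}{N}$$
$\mu_x$ ($\mu_y$) is the population mean of the variable X (Y). N is the population size.
$$\text{Sample covariance} = cov(X,Y) = \frac{\sum (x_i-\bar{x})(y_i-\bar{y})}{n-1}$$
$\bar{x}$ ($\bar{y}$) is the sample mean of the variable X (Y). n is the sample size.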
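The two covariance formulas differ only in the divisor, which the sketch below makes explicit. The function names and the small data set are hypothetical, chosen only so that the arithmetic is easy to follow.

```python
def _sum_of_cross_deviations(xs, ys):
    """Return the sum of (x_i - mean_x) * (y_i - mean_y)."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))


def population_covariance(xs, ys):
    """Divide the sum of cross deviations by N."""
    return _sum_of_cross_deviations(xs, ys) / len(xs)


def sample_covariance(xs, ys):
    """Divide the sum of cross deviations by n - 1."""
    if len(xs) < 2:
        raise ValueError("at least two observations are required")
    return _sum_of_cross_deviations(xs, ys) / (len(xs) - 1)


# Hypothetical data in which y rises with x.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]

pop_cov = population_covariance(xs, ys)
samp_cov = sample_covariance(xs, ys)
print(pop_cov, samp_cov)  # 4.0 5.0
```

A positive value indicates that the two variables tend to move in the same direction; the magnitude depends on the units, which is why the correlation coefficient is used to judge strength.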
The coefficient of correlation
– This coefficient answers the question: how strong is the association between X and Y?
$$\text{Population coefficient of correlation: } \rho = \frac{COV(X,Y)}{\sigma_x \sigma_y}$$
$$\text{Sample coefficient of correlation: } r = \frac{cov(X,Y)}{s_x s_y}$$
ρ or r = +1: strong positive linear relationship, COV(X,Y) > 0
ρ or r = 0: no linear relationship, COV(X,Y) = 0
ρ or r = -1: strong negative linear relationship, COV(X,Y) < 0
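The three cases can be reproduced with a small implementation of the sample coefficient of correlation, r = cov(X,Y) / (s_x s_y). The function name and the three data sets are hypothetical and were picked to hit the values +1, 0 and -1.

```python
from math import sqrt


def sample_correlation(xs, ys):
    """Return the sample covariance divided by the product of the
    two sample standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long lists with n >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    s_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    if s_x == 0 or s_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov_xy / (s_x * s_y)


xs = [1, 2, 3, 4, 5]
r_positive = sample_correlation(xs, [2, 4, 6, 8, 10])  # y = 2x
r_none = sample_correlation(xs, [4, 1, 0, 1, 4])       # y = (x - 3)^2
r_negative = sample_correlation(xs, [10, 8, 6, 4, 2])  # y = 12 - 2x

print(round(r_positive, 6), round(r_none, 6), round(r_negative, 6))
```

The middle case shows that r = 0 rules out a linear relationship only: y is fully determined by x there, but through a curve rather than a line.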