Central Tendency and Dispersion of a Distribution

2 Mean Median Mode Variance


Page 1: 2 Mean Median Mode Variance

8/3/2019 2 Mean Median Mode Variance

http://slidepdf.com/reader/full/2-mean-median-mode-variance 1/29



Review Topics

• Measures of Central Tendency: Mean, Median, Mode
• Quartile
• Measures of Variation: The Range, Variance and Standard Deviation, Coefficient of Variation
• Shape: Symmetric, Skewed


Important Summary Measures

Summary Measures (one sample)

• Central Tendency: Mean, Median, Mode, Quartile
• Variation: Range, Variance, Standard Deviation, Coefficient of Variation


Measures of Central Tendency

Central Tendency: Mean, Median, Mode

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

Data: You can access practice sample data on HMO premiums here.


Measures of Central Location (Tendency)

Usually, we focus our attention on two aspects of measures of central location:

 –  Measure of the central data point (the average).

 –  Measure of dispersion of the data about the average.

With one data point, clearly the central location is at the point itself.

With two data points, the central location should fall in the middle between them (in order to reflect the location of both of them).

If the third data point appears exactly in the middle of the current range, the central location should not change (because it is currently residing in the middle).

If the third data point appears on the left-hand side of the midrange, it should “pull” the central location to the left.


The arithmetic mean

 –  This is the most popular and useful measure of central location.

Mean = (Sum of the measurements) / (Number of measurements)

Sample mean (n is the sample size):

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

Population mean (N is the population size):

$$\mu = \frac{\sum_{i=1}^{N} x_i}{N}$$


• Example 4.1

The mean of the sample of six measurements 7, 3, 9, -2, 4, 6 is

$$\bar{x} = \frac{\sum_{i=1}^{6} x_i}{6} = \frac{x_1 + x_2 + x_3 + x_4 + x_5 + x_6}{6} = \frac{7 + 3 + 9 - 2 + 4 + 6}{6} = 4.5$$

• Example 4.2

Suppose the telephone bills of Example 2.1 represent the population of measurements. The population mean is

$$\mu = \frac{\sum_{i=1}^{200} x_i}{200} = \frac{x_1 + x_2 + \dots + x_{200}}{200} = \frac{42.19 + 15.30 + \dots + 53.21}{200} = 43.59$$
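The calculation in Example 4.1 can be sketched in a few lines of Python. This is only an illustration: the helper name `arithmetic_mean` is an assumption introduced here, and the cross-check uses `statistics.mean` from the standard library.

```python
from statistics import mean

def arithmetic_mean(values):
    """Sum of the measurements divided by the number of measurements."""
    return sum(values) / len(values)

# Sample of six measurements from Example 4.1
sample = [7, 3, 9, -2, 4, 6]

x_bar = arithmetic_mean(sample)
print(x_bar)  # 4.5

# The standard library gives the same result
assert x_bar == mean(sample)
```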


The median

 –  The median of a set of measurements is the value that falls in the middle when the measurements are arranged in order of magnitude.

Example 4.4

Seven employee salaries were recorded (in 1000s): 28, 60, 26, 32, 30, 26, 29. Find the median salary.

Odd number of observations

First, sort the salaries: 26, 26, 28, 29, 30, 32, 60. Then, locate the value in the middle: the median is 29.

Suppose one employee's salary of $31,000 was added to the group recorded before. Find the median salary.

Even number of observations

First, sort the salaries: 26, 26, 28, 29, 30, 31, 32, 60. Then, locate the values in the middle. There are two middle values, 29 and 30, so the median is their average, 29.5.
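The sort-then-pick-the-middle procedure of Example 4.4 can be sketched as follows. The function name `median_of` is an assumption made for this illustration; the odd and even cases mirror the two cases on the slide.

```python
def median_of(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    ordered = sorted(values)          # first, sort the measurements
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd number of observations
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2  # even number of observations

salaries = [28, 60, 26, 32, 30, 26, 29]   # in 1000s, from Example 4.4

print(median_of(salaries))          # 29
print(median_of(salaries + [31]))   # 29.5
```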


The mode

 –  The mode of a set of measurements is the value that occurs most frequently.

 –  A set of data may have one mode (or modal class), or two or more modes.

The modal class: for large data sets, the modal class is much more relevant than a single-value mode.
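A minimal sketch of finding the mode, or several modes, by counting frequencies. The helper name `modes` and the second data set are assumptions made for illustration; the first data set is the salary sample from Example 4.4.

```python
from collections import Counter

def modes(values):
    """Return every value that occurs with the highest frequency, in sorted order."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)

# Salaries from Example 4.4: 26 occurs twice, every other value once
print(modes([28, 60, 26, 32, 30, 26, 29]))  # [26]

# Hypothetical data with two modes
print(modes([1, 2, 2, 3, 3, 4]))            # [2, 3]
```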


•  Example 4.6

A professor of statistics wants to report the results of a midterm exam, taken by 100 students. The data appear in file XM04-… Find the mean, median, and mode, and describe the information they provide.

Excel Results

Marks
Mean                  73.98
Standard Error        2.1502163
Median                81
Mode                  84
Standard Deviation    21.502163
Sample Variance       462.34303
Kurtosis              0.3936606
Skewness              -1.073098
Range                 89
Minimum               11
Maximum               100
Sum                   7398
Count                 100

The mean provides information about the overall performance level of the class.

The median indicates that half of the class received a grade below 81%, and half of the class received a grade above 81%.

The mode must be used when data is qualitative. If marks are classified by letter grade, the frequency of each grade can be calculated. Then, the mode becomes a logical measure to compute.


Relationship among Mean, Median, and Mode

If a distribution is symmetrical, the mean, median and mode coincide.

If a distribution is nonsymmetrical, and skewed to the left or to the right, the three measures differ.

A positively skewed distribution (“skewed to the right”): typically Mode < Median < Mean.


A negatively skewed distribution (“skewed to the left”): typically Mean < Median < Mode.
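As a small illustration of how skewness separates the three measures, the salary sample from Example 4.4 can be reused: one large salary (60) pulls the mean to the right of the median and the mode. With only seven values this is a sketch of the idea, not a statement about any real salary distribution.

```python
from statistics import median, mode

salaries = [28, 60, 26, 32, 30, 26, 29]   # in 1000s, from Example 4.4

mean_salary = sum(salaries) / len(salaries)
median_salary = median(salaries)
mode_salary = mode(salaries)

print(mode_salary, median_salary, mean_salary)  # 26 29 33.0

# Ordering expected for data skewed to the right
assert mode_salary < median_salary < mean_salary
```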


Measures of Variation

Variation

• Range
• Interquartile Range
• Variance: Population Variance, Sample Variance
• Standard Deviation: Population Standard Deviation, Sample Standard Deviation
• Coefficient of Variation:

$$CV = \left(\frac{S}{\bar{X}}\right) \cdot 100\%$$
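The coefficient of variation expresses the standard deviation as a percentage of the mean, which makes it possible to compare variability across data sets with different scales. The sketch below uses two hypothetical summaries (the numbers are assumptions chosen for illustration, not data from these slides).

```python
def coefficient_of_variation(std_dev, mean):
    """CV = (S / X-bar) * 100%, returned as a percentage."""
    return 100 * std_dev / mean

# Hypothetical summaries: same standard deviation, different means
cv_a = coefficient_of_variation(std_dev=5, mean=50)
cv_b = coefficient_of_variation(std_dev=5, mean=100)

print(cv_a)  # 10.0
print(cv_b)  # 5.0
```

Both data sets have the same standard deviation, but relative to its mean the first is twice as variable as the second.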


Measures of variability (Looking beyond the average)

Measures of central location fail to tell the whole story about the distribution.

A question of interest still remains unanswered:

How typical is the average value of all the measurements in the data set?

or

How spread out are the measurements about the average value?


Observe two hypothetical data sets

Low variability data set: the average value provides a good representation of the values in the data set.

High variability data set: the same average value does not provide as good a representation of the values in the data set as before.


The range

 –  The range of a set of measurements is the difference between the largest and smallest measurements.

 –  Its major advantage is the ease with which it can be computed.

 –  Its major shortcoming is its failure to provide information on the dispersion of the values between the two end points.

[Figure: the range spans from the smallest measurement to the largest measurement.]

But how do all the measurements spread out? The range cannot assist in answering this question.
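The shortcoming described above can be seen in a short sketch. The two data sets are hypothetical (assumed here purely for illustration): they share the same end points, and therefore the same range, although their interior values are spread very differently.

```python
def value_range(values):
    """Difference between the largest and smallest measurements."""
    return max(values) - min(values)

# Hypothetical data sets with identical end points
clustered = [1, 5, 5, 5, 9]   # interior values sit at the center
spread = [1, 1, 5, 9, 9]      # interior values sit at the extremes

print(value_range(clustered))  # 8
print(value_range(spread))     # 8
```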


The variance

 –  This measure of dispersion reflects the values of all the measurements.

 –  The variance of a population of N measurements $x_1, x_2, \dots, x_N$ having a mean $\mu$ is defined as

$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$$

 –  The variance of a sample of n measurements $x_1, x_2, \dots, x_n$ having a mean $\bar{x}$ is defined as

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$


Consider two small populations:

Population A: 8, 9, 10, 11, 12

Population B: 4, 7, 10, 13, 16

The mean of both populations is 10, but the measurements in B are much more dispersed than those in A. Thus, a measure of dispersion is needed that agrees with this observation.

Let us start by calculating the sum of deviations.

A: 8-10 = -2, 9-10 = -1, 10-10 = 0, 11-10 = +1, 12-10 = +2. Sum = 0

B: 4-10 = -6, 7-10 = -3, 10-10 = 0, 13-10 = +3, 16-10 = +6. Sum = 0

The sum of deviations is zero in both cases; therefore, another measure is needed.
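The observation that the deviations always cancel can be checked directly. This is a minimal sketch; the helper name `deviations` is an assumption introduced here.

```python
def deviations(values):
    """Deviation of each measurement from the mean of the data."""
    mu = sum(values) / len(values)
    return [x - mu for x in values]

population_a = [8, 9, 10, 11, 12]
population_b = [4, 7, 10, 13, 16]

print(deviations(population_a))       # [-2.0, -1.0, 0.0, 1.0, 2.0]
print(deviations(population_b))       # [-6.0, -3.0, 0.0, 3.0, 6.0]

# Positive and negative deviations cancel in both populations
print(sum(deviations(population_a)))  # 0.0
print(sum(deviations(population_b)))  # 0.0
```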


The sum of squared deviations is used in calculating the variance. See example next.


Let us calculate the variance of the two populations

$$\sigma_A^2 = \frac{(8-10)^2 + (9-10)^2 + (10-10)^2 + (11-10)^2 + (12-10)^2}{5} = 2$$

$$\sigma_B^2 = \frac{(4-10)^2 + (7-10)^2 + (10-10)^2 + (13-10)^2 + (16-10)^2}{5} = 18$$

Why is the variance defined as the average squared deviation? Why not use the sum of squared deviations as a measure of dispersion instead? After all, the sum of squared deviations increases in magnitude when the dispersion of a data set increases!
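The two population variances can be reproduced with a short sketch that follows the definition term by term. The function names are assumptions made for this illustration.

```python
def sum_of_squared_deviations(values):
    """Sum of (x_i - mu)^2 over all measurements."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values)

def population_variance(values):
    """Average squared deviation from the population mean."""
    return sum_of_squared_deviations(values) / len(values)

population_a = [8, 9, 10, 11, 12]
population_b = [4, 7, 10, 13, 16]

print(sum_of_squared_deviations(population_a), population_variance(population_a))  # 10.0 2.0
print(sum_of_squared_deviations(population_b), population_variance(population_b))  # 90.0 18.0
```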


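 –  Example 4.8

Find the mean and the variance of the following sample of measurements (in years).

3.4, 2.5, 4.1, 1.2, 2.8, 3.7

 –  Solution

$$\bar{x} = \frac{\sum_{i=1}^{6} x_i}{6} = \frac{3.4 + 2.5 + 4.1 + 1.2 + 2.8 + 3.7}{6} = \frac{17.7}{6} = 2.95$$

A shortcut formula

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{1}{n - 1}\left[\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}\right]$$

$$s^2 = \frac{1}{5}\left[3.4^2 + 2.5^2 + \dots + 3.7^2 - \frac{(17.7)^2}{6}\right] = 1.075 \text{ (years)}^2$$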
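As a check on Example 4.8, the sketch below computes the sample variance twice: once from the definition and once from the shortcut formula. Variable names are assumptions made for this illustration; the two results agree up to floating-point rounding.

```python
import math

data = [3.4, 2.5, 4.1, 1.2, 2.8, 3.7]   # measurements in years
n = len(data)

x_bar = sum(data) / n

# Definition: sum of squared deviations divided by n - 1
s2_definition = sum((x - x_bar) ** 2 for x in data) / (n - 1)

# Shortcut: (sum of squares - (sum)^2 / n) divided by n - 1
s2_shortcut = (sum(x * x for x in data) - sum(data) ** 2 / n) / (n - 1)

print(round(x_bar, 2), round(s2_definition, 3), round(s2_shortcut, 3))  # 2.95 1.075 1.075
assert math.isclose(s2_definition, s2_shortcut)
```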


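Sample Standard Deviation

$$s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}}$$

For the sample: use n - 1 in the denominator.

Data $X_i$: 10 12 14 15 17 18 18 24

n = 8, Mean = 16

$$s = \sqrt{\frac{(10-16)^2 + (12-16)^2 + (14-16)^2 + (15-16)^2 + (17-16)^2 + (18-16)^2 + (18-16)^2 + (24-16)^2}{8 - 1}} = \sqrt{\frac{130}{7}} \approx 4.3095$$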


Interpreting Standard Deviation

The standard deviation can be used to

 –  compare the variability of several distributions

 –  make a statement about the general shape of a distribution.

The empirical rule: If a sample of measurements has a mound-shaped distribution, the interval

$(\bar{x} - s, \bar{x} + s)$ contains approximately 68% of the measurements

$(\bar{x} - 2s, \bar{x} + 2s)$ contains approximately 95% of the measurements

$(\bar{x} - 3s, \bar{x} + 3s)$ contains virtually all of the measurements
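The empirical rule can be explored with a small simulation. Everything in this sketch is an assumption made for illustration: the function names, the simulated mound-shaped sample (normal with mean 100 and standard deviation 15), and the fixed random seed. The observed shares should land near 68%, 95% and almost 100%, but they are approximate by nature.

```python
import random
from statistics import mean, stdev

def empirical_rule_intervals(x_bar, s):
    """Return the intervals (x_bar - k*s, x_bar + k*s) for k = 1, 2, 3."""
    return {k: (x_bar - k * s, x_bar + k * s) for k in (1, 2, 3)}

def share_within(data, low, high):
    """Fraction of the measurements that fall inside [low, high]."""
    return sum(low <= x <= high for x in data) / len(data)

# Simulated mound-shaped sample (hypothetical parameters)
random.seed(0)
sample = [random.gauss(100, 15) for _ in range(10_000)]

x_bar, s = mean(sample), stdev(sample)
shares = {
    k: share_within(sample, low, high)
    for k, (low, high) in empirical_rule_intervals(x_bar, s).items()
}

for k, share in shares.items():
    print(f"within {k} standard deviation(s) of the mean: {share:.1%}")
```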


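Comparing Standard Deviations

Data $X_i$: 10 12 14 15 17 18 18 24

N = 8, Mean = 16

$$s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}} = \sqrt{\frac{130}{7}} \approx 4.3095$$

$$\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}} = \sqrt{\frac{130}{8}} \approx 4.0311$$

The value of the standard deviation is larger when the data are considered as a sample.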
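The sample and population standard deviations of the eight data values can be compared with the standard library: `statistics.stdev` divides by n - 1 and `statistics.pstdev` divides by N. This is a sketch for checking the comparison, with all eight values (including both 18s) entered.

```python
import math
from statistics import pstdev, stdev

data = [10, 12, 14, 15, 17, 18, 18, 24]
n = len(data)

s = stdev(data)       # treats the data as a sample: divides by n - 1
sigma = pstdev(data)  # treats the data as a population: divides by N

print(round(s, 4), round(sigma, 4))

# The sample value is always the larger of the two for n > 1,
# because the same sum of squares is divided by a smaller number.
assert s > sigma
assert math.isclose(s, sigma * math.sqrt(n / (n - 1)))
```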


Comparing Standard Deviations

[Figure: three dot plots, Data A, Data B and Data C, on a common axis from 11 to 21.]

Data A: Mean = 15.5, s = 3.338

Data B: Mean = 15.5, s = .9258

Data C: Mean = 15.5, s = 4.57


Measures of Association

Two numerical measures are presented for the description of the linear relationship between two variables depicted in the scatter diagram.

 –  Covariance - is there any pattern to the way two variables move together?

 –  Correlation coefficient - how strong is the linear relationship between two variables?


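The covariance

$$\text{Population covariance} = COV(X,Y) = \frac{\sum (x_i - \mu_x)(y_i - \mu_y)}{N}$$

$\mu_x$ ($\mu_y$) is the population mean of the variable X (Y). N is the population size.

$$\text{Sample covariance} = cov(X,Y) = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$$

$\bar{x}$ ($\bar{y}$) is the sample mean of the variable X (Y). n is the sample size.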
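The two covariance definitions differ only in the divisor, which a short sketch makes concrete. The function names and the paired data are hypothetical, introduced here for illustration only.

```python
def sum_of_cross_products(xs, ys):
    """Sum of (x_i - mean_x) * (y_i - mean_y) over the paired observations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

def population_covariance(xs, ys):
    """Divide by N: the pairs are the whole population."""
    return sum_of_cross_products(xs, ys) / len(xs)

def sample_covariance(xs, ys):
    """Divide by n - 1: the pairs are a sample."""
    return sum_of_cross_products(xs, ys) / (len(xs) - 1)

# Hypothetical paired observations
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

print(population_covariance(x, y))  # 1.2
print(sample_covariance(x, y))      # 1.5
```

A positive result indicates that the two variables tend to move in the same direction.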


The coefficient of correlation

$$\text{Population coefficient of correlation:} \quad \rho = \frac{COV(X,Y)}{\sigma_x \sigma_y}$$

$$\text{Sample coefficient of correlation:} \quad r = \frac{cov(X,Y)}{s_x s_y}$$

 –  This coefficient answers the question: How strong is the association between X and Y?


ρ or r =

+1: Strong positive linear relationship, COV(X,Y) > 0

0: No linear relationship, COV(X,Y) = 0

-1: Strong negative linear relationship, COV(X,Y) < 0
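The three cases can be illustrated with a sketch of the sample coefficient of correlation, computed as the sample covariance divided by the product of the sample standard deviations. The function name and all three data sets are hypothetical. The last data set is V-shaped: Y clearly depends on X, yet r is close to 0, because r only measures the linear part of a relationship.

```python
import math

def sample_correlation(xs, ys):
    """r = cov(X, Y) / (s_x * s_y), using n - 1 in every denominator."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return cov / (s_x * s_y)

x = [1, 2, 3, 4, 5]

r_positive = sample_correlation(x, [2, 4, 6, 8, 10])  # perfect increasing line
r_negative = sample_correlation(x, [10, 8, 6, 4, 2])  # perfect decreasing line
r_none = sample_correlation(x, [2, 1, 0, 1, 2])       # V shape, no linear trend

print(round(r_positive, 6), round(r_negative, 6), abs(round(r_none, 6)))
```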