7/28/2019 Central Tendency (Stats)
MEASURES OF CENTRAL TENDENCY (Location)
Average age of the top 50 powerful persons of 2010 in India decreased from 58 years to 54.
Measures of Location or Central Tendency.
In a distribution, the observations cluster around a central value. This property of concentration of the observations around a central value is called Central Tendency.
The central value around which there is concentration is called a measure of central tendency (measure of location, average).
Ex: Mean marks scored by 1st PGDM is 65%.
Objectives of Averaging:
To get a single value that describes the characteristics of the entire data.
To facilitate comparison.
For computing various other statistical measures such as dispersion, skewness, kurtosis and various other basic characteristics of a mass of data.
Requisites of a Good Average.
1. It should be simple to understand and easy to calculate.
2. It should be based on all the items of the given data.
3. It should be rigidly defined.
4. It should be capable of further mathematical treatment.
5. It should be affected as little as possible by fluctuations of sampling.
6. It should not be affected by extreme observations (values).
Various measures of Central Tendency
1. Arithmetic Mean (A.M)
2. Median (M)
3. Mode (Z)
4. Geometric Mean (G.M)
5. Harmonic Mean (H.M)
1. Arithmetic Mean (A.M)
\[ \text{A.M} = \frac{\text{Sum of observations}}{\text{Number of observations}} \]
Calculation of A.M
1. Ungrouped Data ( Raw Data)
2. Discrete Data
3. Continuous Data
Ungrouped Data (Raw Data):
The weights of a sample of 30 students from a particular class are as follows.
62 58 58 52 48 53 54 63 69 63
57 56 46 48 53 56 57 59 58 53
52 56 57 52 52 53 54 58 61 63
Discrete Data
Number of post graduates (x)    Frequency (f)
0 2
1 2
2 4
3 1
4 1
Continuous Data
Marks    No. of students
20-30    5
30-40    15
40-50    25
Exclusive method (overlapping)
In this method, the upper limit of one class interval is the lower limit of the next class. This method maintains continuity of the data.
A student whose mark is between 20 and 29.9 will be included in the 20-30 class.
Marks    No. of students
20-30    5
30-40    15
40-50    25
Inclusive method (non-overlapping)
A student whose mark is 29 is included in the 20-29 class interval and a student whose mark is 39 is included in the 30-39 class interval.
Marks    No. of students
20-29    5
30-39    15
40-49    25
Ungrouped Data (Raw Data)
X = observations
n = number of observations
\[ \bar{X} = \frac{X_1 + X_2 + X_3 + \dots + X_n}{n} = \frac{\sum X}{n} \]
The following data give the value of equity holdings of 20 of India's billionaires.
Name                           Equity Holdings (Millions of Rs.)
Kiran Mazumdar-Shaw            2717
The Nilekani family            2796
The Punj family                3098
Karsanbhai K. Patel & family   3144
Shashi Ruia                    3527
K.K. Birla                     3534
B. Rama Linga Raju             3862
Habil F. Khorakiwala           4187
The Murthy family              4310
Keshub Mahindra                4506
The Kirloskar family           4745
M.V. Subbiah family            4784
Ajay G. Piramal                4923
Uday Kotak                     5034
S.P. Hinduja                   5071
Subhash Chandra                5424
Adi Godrej                     5561
Vijay Mallya                   6505
V.N. Dhoot                     6707
Naresh Goyal                   6874
\[ \bar{X} = \frac{\sum X}{n} = \frac{2717 + 2796 + \dots + 6874}{20} = \frac{91309}{20} = \text{Rs. } 4565.45 \text{ million} \]
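The raw-data calculation above can be reproduced with a few lines of Python. This is only a minimal sketch: the name `arithmetic_mean` is a hypothetical helper, not something from the slides, and the list simply repeats the 20 equity holdings from the table.

```python
# Equity holdings (millions of Rs.) of the 20 billionaires listed in the table.
holdings = [2717, 2796, 3098, 3144, 3527, 3534, 3862, 4187, 4310, 4506,
            4745, 4784, 4923, 5034, 5071, 5424, 5561, 6505, 6707, 6874]


def arithmetic_mean(values):
    """Sum of observations divided by the number of observations."""
    return sum(values) / len(values)


mean_holdings = arithmetic_mean(holdings)
print(f"Sum = {sum(holdings)}, n = {len(holdings)}, mean = {mean_holdings:.2f}")
```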
Discrete Data
x = observations
f = frequency
\[ \bar{X} = \frac{\sum f x}{\sum f} \]
Problem on Discrete Data
The following is the frequency distribution of the number of telephone calls received in 245 successive one-minute intervals at an exchange. Obtain the mean number of calls per minute.
No. of Calls    0   1   2   3   4   5   6   7
Frequency       14  21  25  43  51  40  39  12
No. of calls (x) Frequency (f) f x
0 14 0
1 21 21
2 25 50
3 43 129
4 51 204
5 40 200
6 39 234
7 12 84
Σf = 245    Σfx = 922
\[ \bar{X} = \frac{\sum f x}{\sum f} = \frac{922}{245} = 3.763 \]
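As a hedged illustration, the same frequency-weighted mean can be computed in Python from the telephone-call table. The function name `frequency_mean` is an assumption made for this sketch.

```python
# Number of calls per minute (x) and how many minutes showed that count (f).
calls = [0, 1, 2, 3, 4, 5, 6, 7]
freq = [14, 21, 25, 43, 51, 40, 39, 12]


def frequency_mean(xs, fs):
    """Mean of a discrete frequency distribution: sum(f * x) / sum(f)."""
    total_f = sum(fs)
    total_fx = sum(f * x for x, f in zip(xs, fs))
    return total_fx / total_f


mean_calls = frequency_mean(calls, freq)
print(f"Mean calls per minute = {mean_calls:.3f}")
```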
Continuous Series
The calculation is illustrated with the data relating to the equity holdings of the group of 20 billionaires given earlier.
Class Interval Frequency
2000-3000 2
3000-4000 5
4000-5000 6
5000-6000 4
6000-7000 3
Class Interval    Frequency (f)    Mid value (x)    fx
2000-3000         2                2500             5000
3000-4000         5                3500             17500
4000-5000         6                4500             27000
5000-6000         4                5500             22000
6000-7000         3                6500             19500
                  Σf = 20                           Σfx = 91000
\[ \bar{X} = \frac{\sum f x}{\sum f} = \frac{91000}{20} = 4550 \]
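For a continuous series the only extra step is replacing each class by its mid value. The sketch below assumes classes are stored as hypothetical `(lower, upper, frequency)` tuples; it reproduces the grouped mean of 4550, which differs slightly from the raw-data mean because grouping discards the exact values.

```python
# (lower limit, upper limit, frequency) for each class of equity holdings.
classes = [
    (2000, 3000, 2),
    (3000, 4000, 5),
    (4000, 5000, 6),
    (5000, 6000, 4),
    (6000, 7000, 3),
]


def grouped_mean(class_table):
    """Mean of a continuous series using class mid values as x."""
    total_f = sum(f for _, _, f in class_table)
    total_fx = sum(f * (lower + upper) / 2 for lower, upper, f in class_table)
    return total_fx / total_f


mean_grouped = grouped_mean(classes)
print(f"Grouped mean = {mean_grouped}")
```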
Properties of Arithmetic Mean
1. The sum of the deviations of all the values x from their arithmetic mean is always zero.
2. The product of the arithmetic mean and the number of items gives the total of all items.
3. If \bar{X}_1 and \bar{X}_2 are the arithmetic means of two samples of sizes n_1 and n_2 respectively, then the arithmetic mean of the distribution combining the two can be calculated as
\[ \bar{X}_{12} = \frac{n_1 \bar{X}_1 + n_2 \bar{X}_2}{n_1 + n_2} \]
Properties of Arithmetic Mean
4. The sum of squared deviations of the items from the mean is a minimum when compared to the sum of squared deviations of the items from any other value.
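The four properties can be checked numerically. The following is a small sketch on two made-up samples (the values in `sample_1` and `sample_2` are hypothetical and chosen only to keep the arithmetic easy); it is a demonstration, not a proof.

```python
sample_1 = [2, 4, 6, 8]   # hypothetical sample, mean 5
sample_2 = [10, 20]       # hypothetical sample, mean 15


def mean(values):
    return sum(values) / len(values)


m1, m2 = mean(sample_1), mean(sample_2)

# Property 1: deviations from the mean sum to zero.
deviation_sum = sum(x - m1 for x in sample_1)

# Property 2: mean times number of items gives the total of all items.
total_from_mean = m1 * len(sample_1)

# Property 3: combined mean from the two sample means and sample sizes.
n1, n2 = len(sample_1), len(sample_2)
combined_mean = (n1 * m1 + n2 * m2) / (n1 + n2)


# Property 4: squared deviations are smallest when taken about the mean.
def sum_sq_dev(values, centre):
    return sum((x - centre) ** 2 for x in values)


ss_about_mean = sum_sq_dev(sample_1, m1)
ss_about_others = [sum_sq_dev(sample_1, c) for c in (3, 4, 6, 7)]

print(deviation_sum, total_from_mean, combined_mean, ss_about_mean, ss_about_others)
```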
Weighted Mean
The weighted mean of a set of numbers X1, X2, ..., Xn, with corresponding weights w1, w2, ..., wn, is computed from the following formula:
\[ \bar{X}_w = \frac{w_1 X_1 + w_2 X_2 + \dots + w_n X_n}{w_1 + w_2 + \dots + w_n} \]
EXAMPLE Weighted Mean
The Carter Construction Company pays its hourly employees $16.50, $19.00, or $25.00 per hour. There are 26 hourly employees, 14 of whom are paid at the $16.50 rate, 10 at the $19.00 rate, and 2 at the $25.00 rate. What is the mean hourly rate paid to the 26 employees?
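One way to work the example is to treat the head counts as weights. The sketch below assumes the standard weighted-mean definition (sum of weight times value, divided by sum of weights); `weighted_mean` is a hypothetical helper name.

```python
rates = [16.50, 19.00, 25.00]   # hourly rates in dollars
employees = [14, 10, 2]         # number of employees at each rate (the weights)


def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


mean_rate = weighted_mean(rates, employees)
print(f"Mean hourly rate = ${mean_rate:.2f}")
```

The numerator is 14(16.50) + 10(19.00) + 2(25.00) = 471, so the mean hourly rate is 471 / 26, roughly $18.12.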
Merits:
1. Mean is based on all the items of the given data.
2. Mean is rigidly defined by a mathematical formula.
3. Mean is capable of further algebraic treatment.
4. Mean has good sampling stability.
Demerits:
1. Mean can be unduly affected by extreme values.
2. Mean cannot be calculated for open-end classes, since mid points cannot be found for such classes.
3. Mean cannot be found graphically like median and mode.
Median (M)
The median is that value of the variable which divides the group into two equal parts, one part comprising all the values greater than the median and the other all the values less than the median.
Calculation of Median
1. Ungrouped Data ( Raw Data)
2. Discrete Data
3. Continuous Data
Raw Data
Steps:
1. Arrange the data in ascending order.
2. Find the value of (n+1)/2.
3. Apply the formula
\[ M = \text{size of the } \left( \frac{n+1}{2} \right)^{\text{th}} \text{ item} \]
Sales    Sorted Sales
9        6
6        9
12       10
10       12
13       13
15       14
16       14
14       15
14       16
16       16
17       16
16       17
24       17
21       18
22       18
18       19
19       20
18       21
20       22
17       24
The median is the middle value of data sorted in order of magnitude.
(20+1)/2 = 10.5
Median = 16 (the 10th and 11th sorted values are both 16)
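The three raw-data steps can be sketched in Python as follows. The treatment of a half position (averaging the two neighbouring items) is the usual convention and is assumed here; `median_by_position` is a hypothetical name.

```python
sales = [9, 6, 12, 10, 13, 15, 16, 14, 14, 16,
         17, 16, 24, 21, 22, 18, 19, 18, 20, 17]


def median_by_position(values):
    """Median as the size of the ((n + 1) / 2)th item of the sorted data."""
    ordered = sorted(values)                 # step 1: ascending order
    position = (len(ordered) + 1) / 2        # step 2: 1-based position
    lower = ordered[int(position) - 1]
    if position.is_integer():
        return lower                         # odd n: a single middle item
    upper = ordered[int(position)]
    return (lower + upper) / 2               # even n: average the two middle items


median_sales = median_by_position(sales)
print(f"Position = {(len(sales) + 1) / 2}, median = {median_sales}")
```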
Discrete Data
Steps:
1. Find the cumulative frequencies (C.F).
2. Find the N/2 value, where N = total frequency.
3. Apply the formula M = size of the (N/2)th item. In other words, locate the cumulative frequency which is just more than the N/2 value. (Note: this is not the median.)
4. Read the corresponding X value. This gives the value of the median.
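As an illustration of these steps, the sketch below applies them to the telephone-call distribution used earlier for the mean. It follows the slide's rule literally (first cumulative frequency strictly greater than N/2); `discrete_median` is a hypothetical name.

```python
from itertools import accumulate

calls = [0, 1, 2, 3, 4, 5, 6, 7]
freq = [14, 21, 25, 43, 51, 40, 39, 12]


def discrete_median(xs, fs):
    """Return the x whose cumulative frequency is just more than N / 2."""
    half = sum(fs) / 2                                   # step 2: N / 2
    for x, cum_f in zip(xs, accumulate(fs)):             # step 1: running C.F
        if cum_f > half:                                 # step 3: just more than N / 2
            return x                                     # step 4: read off x
    raise ValueError("empty distribution")


median_calls = discrete_median(calls, freq)
print(f"N/2 = {sum(freq) / 2}, median = {median_calls}")
```

Here N/2 = 122.5 and the cumulative frequencies run 14, 35, 60, 103, 154, ..., so the median is 4 calls per minute.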
Continuous data
Steps:
1. Find the C.F.
2. Find the N/2 value.
3. Locate the value which is more than the N/2 value in the cumulative frequency column.
4. Read the corresponding class. This is the median class, i.e. the class where the median lies.
5. Apply the formula,
\[ M = l + \frac{\frac{N}{2} - c.f.}{f} \times c \]
M = median
l = lower limit of the median class
N = total frequency
c.f. = cumulative frequency of the pre-median class
f = frequency of the median class
c = width of the median class
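A minimal sketch of the continuous-series median, applied to the grouped equity-holdings table from the arithmetic mean section. Classes are stored as hypothetical `(lower, upper, frequency)` tuples and `grouped_median` is an assumed helper name.

```python
classes = [
    (2000, 3000, 2),
    (3000, 4000, 5),
    (4000, 5000, 6),
    (5000, 6000, 4),
    (6000, 7000, 3),
]


def grouped_median(class_table):
    """M = l + ((N/2 - c.f.) / f) * c for the median class."""
    half = sum(f for _, _, f in class_table) / 2     # N / 2
    cum_before = 0                                   # c.f. of the pre-median class
    for lower, upper, f in class_table:
        if cum_before + f > half:                    # C.F just more than N / 2
            width = upper - lower                    # c
            return lower + (half - cum_before) / f * width
        cum_before += f
    raise ValueError("empty distribution")


median_holdings = grouped_median(classes)
print(f"Median = {median_holdings}")
```

With N/2 = 10, the median class is 4000-5000 (c.f. = 7, f = 6, c = 1000), giving M = 4000 + (3/6)(1000) = 4500.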
Merits:
1. It is easy to understand and easy to calculate for a non-mathematical person.
2. It is not affected by extreme observations.
3. Median can be calculated when dealing with a distribution with open-end classes.
4. Median can be represented graphically.
5. Median is the only average to be used with qualitative data.
Demerits:
1. In the case of an even number of observations in ungrouped data, the median cannot be determined graphically.
2. Median, being a positional average, is not based on each and every item of the distribution.
3. Median is not suitable for further mathematical treatment.
4. Median does not have sampling stability.
Mode (Z)
[Figure: dot plot of the values 6, 9, 10, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 24, with the mode marked.]
Mode is defined as the value which is repeated the maximum number of times in a data set.
Calculation of Mode
1. Ungrouped Data (Raw Data)
2. Discrete Data
3. Continuous Data
Ungrouped data
Here, Mode is calculated by mere
inspection.
Discrete Data
Here, Mode is calculated by mere
inspection.
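"Inspection" amounts to counting repetitions for raw data and picking the largest frequency for discrete data. The sketch below reuses the 20 sales figures and the telephone-call distribution from earlier slides; it assumes a single most frequent value, which holds for both data sets.

```python
from collections import Counter

# Raw data: the mode is the value repeated the maximum number of times.
sales = [9, 6, 12, 10, 13, 15, 16, 14, 14, 16,
         17, 16, 24, 21, 22, 18, 19, 18, 20, 17]
mode_sales, mode_count = Counter(sales).most_common(1)[0]

# Discrete data: the mode is the x carrying the largest frequency.
calls = [0, 1, 2, 3, 4, 5, 6, 7]
freq = [14, 21, 25, 43, 51, 40, 39, 12]
mode_calls = calls[freq.index(max(freq))]

print(f"Raw-data mode = {mode_sales} (appears {mode_count} times)")
print(f"Discrete mode = {mode_calls} calls per minute")
```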
Continuous data
Steps:
1. Locate the maximum frequency.
2. Read the corresponding class. This is the modal class, i.e. the class where the mode lies.
3. Apply the formula,
\[ z = l + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times c \]
z = mode
l = lower limit of the modal class
\Delta_1 = f_1 - f_0
\Delta_2 = f_1 - f_2
f_1 = frequency of the modal class
f_0 = frequency of the pre-modal class
f_2 = frequency of the post-modal class
c = width of the class interval
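The grouped-mode formula can be sketched in Python using the equity-holdings classes from the arithmetic mean section. Treating a missing neighbouring class as frequency 0 is an assumption of this sketch, as is the helper name `grouped_mode`; the function also assumes the modal class is not tied with both neighbours (which would make the denominator zero).

```python
classes = [
    (2000, 3000, 2),
    (3000, 4000, 5),
    (4000, 5000, 6),
    (5000, 6000, 4),
    (6000, 7000, 3),
]


def grouped_mode(class_table):
    """z = l + delta1 / (delta1 + delta2) * c for the modal class."""
    freqs = [f for _, _, f in class_table]
    i = freqs.index(max(freqs))                        # step 1: maximum frequency
    lower, upper, f1 = class_table[i]                  # step 2: modal class
    f0 = freqs[i - 1] if i > 0 else 0                  # pre-modal class (0 if none: assumption)
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0     # post-modal class (0 if none: assumption)
    delta1, delta2 = f1 - f0, f1 - f2
    return lower + delta1 / (delta1 + delta2) * (upper - lower)


mode_holdings = grouped_mode(classes)
print(f"Mode = {mode_holdings:.2f}")
```

For this table the modal class is 4000-5000 with f1 = 6, f0 = 5 and f2 = 4, so z = 4000 + (1/3)(1000), about 4333.33.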
Merits:
1. Its value can be easily ascertained without much calculation.
2. It is an average which is commonly used in day-to-day life.
3. It is not affected by extreme values.
4. The data need not be arranged.
5. Mode can be determined graphically.
6. Mode can be calculated for data with open-end classes.
Demerits:
1. Mode is not based on each and every item of the data.
2. Mode is not capable of further algebraic treatment.
3. Mode is not rigidly defined.
4. Modal value can be misleading.
5. Mode is ill-defined for bimodal or multimodal distributions.
6. Mode doesn't have sampling stability.
Relation between mean, median and mode.
Mode = Mean - 3 [Mean - Median]
Mode = 3 Median - 2 Mean
Median = Mode + (2/3) [Mean - Mode]
Symmetrical Distribution
NEGATIVELY OR LEFT SKEWED
Mean < Median < Mode
POSITIVE OR RIGHT SKEWED
Mean > Median > Mode
Geometric Mean
It is defined as the nth root of the product of n positive values or items.
Calculation of G.M
Ungrouped data (Raw Data)
\[ \text{G.M} = \operatorname{antilog}\left( \frac{\sum \log X}{n} \right) \]
Calculation of G.M
Grouped data (Discrete & Continuous Data)
\[ \text{G.M} = \operatorname{antilog}\left( \frac{\sum f \log X}{N} \right) \]
N = total frequency.
Example:
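Suppose you receive a 5 percent increase in salary this year and a 15 percent increase next year. The average annual percent increase is 9.886, not 10.0. Why is this so? We begin by calculating the geometric mean of the growth factors 1.05 and 1.15: G.M = sqrt(1.05 x 1.15) = sqrt(1.2075) = 1.09886, i.e. an average increase of 9.886 percent per year.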
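The salary example can be verified with a short sketch. It computes the geometric mean of the growth factors 1.05 and 1.15 in two equivalent ways: as the nth root of the product, and as the antilog of the mean logarithm (natural logs are used here as an assumption; any base gives the same result).

```python
import math

growth_factors = [1.05, 1.15]   # a 5% raise followed by a 15% raise
n = len(growth_factors)

# Definition: nth root of the product of n positive values.
gm_root = math.prod(growth_factors) ** (1 / n)

# Log form: antilog of (sum of log X) / n.
gm_log = math.exp(sum(math.log(x) for x in growth_factors) / n)

average_increase_pct = (gm_root - 1) * 100
print(f"G.M = {gm_root:.5f}, average increase = {average_increase_pct:.3f}%")
```

Two raises of 9.886% compound to the same final salary as 5% followed by 15%, which is why the arithmetic mean of 10% overstates the average growth.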
The Geometric Mean
Useful in finding the average change of percentages, ratios, indexes, or growth rates over time.
It has a wide application in business and economics because we are often interested in finding the percentage changes in sales, salaries, or economic figures, such as the GDP, which compound or build on each other.
The geometric mean will always be less than or equal to the arithmetic mean.
Combined Geometric Mean
G = Antilog [(n1 log G1 + n2 log G2)/ (n1 + n2)]
Geometric Mean
Merits:
1. Makes use of full data.
2. Extreme large values have lesser impact.
3. Useful for data relating to ratios and percentages.
4. Useful for rate of change/growth.
Demerits: (G.M)
1. Cannot be calculated if any observation has the value zero or is negative.
2. Difficult to calculate and interpret.
AM, GM, and HM satisfy these inequalities (for positive values):
AM ≥ GM ≥ HM
Equality holds only when all the elements of the given sample are equal.
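The inequality can be illustrated with Python's standard `statistics` module, which provides `mean`, `geometric_mean` (Python 3.8+) and `harmonic_mean`. The two samples below are hypothetical values chosen for the illustration.

```python
import statistics

varied = [2, 4, 8]      # hypothetical sample with unequal values
constant = [5, 5, 5]    # hypothetical sample where every value is equal

am = statistics.mean(varied)
gm = statistics.geometric_mean(varied)
hm = statistics.harmonic_mean(varied)
print(f"AM = {am:.4f}, GM = {gm:.4f}, HM = {hm:.4f}")

# When all values are equal, the three means coincide (up to rounding).
am_c = statistics.mean(constant)
gm_c = statistics.geometric_mean(constant)
hm_c = statistics.harmonic_mean(constant)
print(f"AM = {am_c:.4f}, GM = {gm_c:.4f}, HM = {hm_c:.4f}")
```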
Harmonic Mean
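It is defined as the reciprocal of the mean of the reciprocals of the values.
Calculation of H.M:
Ungrouped Data:
\[ \text{H.M} = \frac{n}{\sum \frac{1}{X}} \]
Grouped Data:
\[ \text{H.M} = \frac{N}{\sum \frac{f}{X}} \]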
H.M- Merits:
1. It is based on all the items of the given data.
2. It gives the best results where time and rates are under study.
3. It is rigidly defined.
4. It can be calculated even if the series contains negative values.
H.M - Demerits:
1. It is difficult for a layman to understand and interpret.
2. It has limited practical application.
3. It cannot be calculated if any of the values is zero.
Sales Sales Executive A Sales Executive B Sales Executive C
March 14 10 6
April 12 10 16
May 6 10 7
June 8 10 15
July 13 10 10
Aug 7 10 6
Total 60 60 60
Average 10 10 10
MEASURES OF DISPERSION
Why Study Dispersion?
A measure of location, such as the mean or the median, only describes the center of the data. It is valuable from that standpoint, but it does not tell us anything about the spread of the data.
For example, if your nature guide told you that the river ahead averaged 3 feet in depth, would you want to wade across on foot without additional information? Probably not. You would want to know something about the variation in the depth.
A second reason for studying the dispersion in a set of data is to compare the spread in two or more distributions.
The scatteredness of values from any measure of central tendency is called Variation or Dispersion.
Characteristics for an Ideal measure of dispersion
1. It should be rigidly defined.
2. It should be based on all the observations.
3. It should be amenable to further mathematical treatment.
4. It should not be affected by extreme observations.
Measures of Dispersion:
1. Range
2. Quartile Deviation
3. Mean Deviation
4. Standard Deviation
Range:
Range is simply the difference between the highest and lowest value in the distribution of values.
Example:
Weekly income of 10 people:
180 220 280 320 280 180 350 280 330 220
Range is maximum income minus minimum income: 350 - 180 = 170.
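A quick check of the range for the ten weekly incomes listed in the example, as a minimal Python sketch. Note that the largest listed income is 350, so the range of the data as printed is 350 - 180 = 170.

```python
weekly_income = [180, 220, 280, 320, 280, 180, 350, 280, 330, 220]

highest = max(weekly_income)
lowest = min(weekly_income)
income_range = highest - lowest   # range = highest value - lowest value

print(f"Max = {highest}, min = {lowest}, range = {income_range}")
```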
Group A: 30, 40, 40, 40, 40, 50, 50
Group B: 30, 30, 30, 40, 50, 50, 50
Group C: 30, 35, 40, 40, 40, 45, 50
Range: 20
Let us take two sets of observations.
Set A contains marks of five students in Mathematics out of 25 marks and Set B contains marks of the same students in English out of 100 marks.
Set A: 10, 15, 18, 20, 20
Set B: 30, 35, 40, 45, 50
The values of range and coefficient of range are calculated as:
         Range           Coefficient of Range
Set A    20 - 10 = 10    (20 - 10)/(20 + 10) = 0.33
Set B    50 - 30 = 20    (50 - 30)/(50 + 30) = 0.25
Coefficient of Range:
It is a relative measure of dispersion and is based on the value of the range. It is also called the range coefficient of dispersion. It is defined as
\[ \text{Coefficient of Range} = \frac{\text{Max} - \text{Min}}{\text{Max} + \text{Min}} \]
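The coefficient of range for the two sets of marks (Set A out of 25, Set B out of 100) can be sketched as below; `coefficient_of_range` is a hypothetical helper name. Because the coefficient is unit-free, it allows the two sets to be compared even though they are marked on different scales.

```python
set_a = [10, 15, 18, 20, 20]   # Mathematics marks out of 25
set_b = [30, 35, 40, 45, 50]   # English marks out of 100


def coefficient_of_range(values):
    """(Max - Min) / (Max + Min), a relative measure of dispersion."""
    return (max(values) - min(values)) / (max(values) + min(values))


coeff_a = coefficient_of_range(set_a)
coeff_b = coefficient_of_range(set_b)
print(f"Set A: range = {max(set_a) - min(set_a)}, coefficient = {coeff_a:.2f}")
print(f"Set B: range = {max(set_b) - min(set_b)}, coefficient = {coeff_b:.2f}")
```

Set B has the larger absolute range (20 against 10), but Set A has the larger relative spread (about 0.33 against 0.25).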
Merits:
1. It is the simplest method of measuring variation.
2. It can be calculated quickly since only two values are taken into consideration.
Demerits:
1. It is not based on each and every item of the given data.
2. It can be unduly affected by extreme values, since only those values are considered.
3. It cannot be calculated for data with open-end classes.
4. Range does not have sampling stability.
Semi Interquartile Range
(Quartile Deviation)
The interquartile range (IQR) is another range measure, but it looks at the data in terms of quarters (percentiles). The ordered data are divided into four equal parts, each containing 25% of the observations.
[Figure: number line from Min to Max marking Q1 (25th percentile), Q2 (median, 50th percentile) and Q3 (75th percentile); the IQR spans Q1 to Q3 and the range spans Min to Max.]
Calculation of Q.D
1. Ungrouped data (Raw data).
2. Discrete Data.
3. Continuous Data.
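Raw Data:
\[ Q_1 = \text{size of the } \left( \frac{n+1}{4} \right)^{\text{th}} \text{ item} \]
\[ Q_3 = \text{size of the } \left( \frac{3(n+1)}{4} \right)^{\text{th}} \text{ item} \]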
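A hedged sketch of the raw-data quartile positions, reusing the 20 sales figures from the median example. Two assumptions are made here that the slides do not state: fractional positions are resolved by linear interpolation between neighbouring items, and the quartile deviation is taken as (Q3 - Q1) / 2, as the name "semi interquartile range" implies.

```python
sales = [9, 6, 12, 10, 13, 15, 16, 14, 14, 16,
         17, 16, 24, 21, 22, 18, 19, 18, 20, 17]


def item_at(ordered, position):
    """Value at a 1-based, possibly fractional position (linear interpolation: assumption)."""
    whole = int(position)
    fraction = position - whole
    value = ordered[whole - 1]
    if fraction > 0:
        value += fraction * (ordered[whole] - ordered[whole - 1])
    return value


ordered_sales = sorted(sales)
n = len(ordered_sales)

q1 = item_at(ordered_sales, (n + 1) / 4)        # size of the ((n + 1) / 4)th item
q3 = item_at(ordered_sales, 3 * (n + 1) / 4)    # size of the (3(n + 1) / 4)th item
iqr = q3 - q1
quartile_deviation = iqr / 2                    # semi interquartile range

print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr}, Q.D = {quartile_deviation}")
```

With n = 20 the positions are 5.25 and 15.75, giving Q1 = 13.25, Q3 = 18.75 and Q.D = 2.75 under these assumptions.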
Discrete Data:
Merits:
1. It is simple to compute and easy to understand.
2. It can be computed for data with open-end classes.
3. It is not affected by extreme values.
Demerits:
1. It doesn't take all the values into consideration. It omits 50% of the items, i.e. the 25% of items below Q1 and the 25% of items above Q3.
2. It is not much capable of further algebraic treatment.
3. It doesn't have sampling stability.
3. Mean Deviation
It is defined as the mean of the absolute deviations of the various items from either the mean, the median or the mode.
Calculation of M.D
1. Raw Data
2. Discrete Data
3. Continuous Data.
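For raw data, the definition of mean deviation translates directly into code. The sketch below uses the Set A marks (10, 15, 18, 20, 20) from the range example and computes the mean deviation about the mean and about the median; `mean_deviation` is a hypothetical helper name.

```python
import statistics

marks = [10, 15, 18, 20, 20]   # Set A marks from the range example


def mean_deviation(values, centre):
    """Mean of the absolute deviations of the items from the chosen centre."""
    return sum(abs(x - centre) for x in values) / len(values)


md_about_mean = mean_deviation(marks, statistics.mean(marks))
md_about_median = mean_deviation(marks, statistics.median(marks))

print(f"M.D about mean = {md_about_mean:.2f}")
print(f"M.D about median = {md_about_median:.2f}")
```

The mean is 16.6, so the absolute deviations are 6.6, 1.6, 1.4, 3.4 and 3.4, giving 16.4 / 5 = 3.28; about the median (18) the result is 15 / 5 = 3.0.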
Merits of M.D:
1. It is based on every item of the series.
2. It is rigidly defined.
3. It is not much affected by extreme values.
Demerits of M.D :
1. It ignores algebraic signs while taking deviations of the items.
2. It is not much used for further algebraic treatment.
3. It cannot be computed for data with open-end classes.
4. Calculation of M.D becomes tedious when the values of mean, median or mode are in decimals.
Variance
Where the mean is a measure of the centre of a group of numbers, the variance is the measure of the spread.
It involves measuring the distance between each of the values and the mean.
To calculate the variance:
1. calculate the mean
2. for each value in the distribution subtract the mean and then square the result (the squared difference)
3. calculate the average of those squared differences.
\[ \text{Variance} = \frac{\text{Sum of (observed value} - \text{mean score)}^2}{\text{Total number of scores} - 1} \]
\[ s^2 = \frac{\sum (X_i - \bar{X})^2}{N - 1} \]
The larger the variance, the further the observed values of the data set are dispersed from the mean.
A variance of zero means all observed values are the same as the mean.
4. Standard Deviation (S.D)
The square root of the variance is known as the standard deviation.
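The variance steps and the standard deviation can be sketched together. The example reuses the Set A marks (10, 15, 18, 20, 20) and divides by N - 1, matching the sample variance formula s^2 on the slide; the result is cross-checked against `statistics.variance` and `statistics.stdev`, which use the same N - 1 divisor.

```python
import math
import statistics

marks = [10, 15, 18, 20, 20]   # Set A marks from the range example

# Step 1: calculate the mean.
mean_marks = sum(marks) / len(marks)

# Step 2: squared difference of each value from the mean.
squared_diffs = [(x - mean_marks) ** 2 for x in marks]

# Step 3: average the squared differences, dividing by N - 1 (sample variance).
variance = sum(squared_diffs) / (len(marks) - 1)

# Standard deviation: the square root of the variance.
std_dev = math.sqrt(variance)

print(f"Mean = {mean_marks}, variance = {variance:.2f}, S.D = {std_dev:.3f}")
print(statistics.variance(marks), statistics.stdev(marks))
```

The squared differences sum to 71.2, so s^2 = 71.2 / 4 = 17.8 and the standard deviation is about 4.219.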