UNIT 5: MEASURES OF DISPERSION
Measures of central tendency indicate the central value of a frequency distribution in the form of an average. An average is a single significant value used to describe a distribution. It tells us something about the general level of magnitude of the distribution, but nothing further. An average does not tell the full story, and it is hardly representative of a mass of data unless we know how the individual items scatter around it. A further description of the series is necessary to gauge the worth of the average.
Several series may have the same average while their items differ widely in magnitude, so the central tendency calculated from such items may not be the most typical representative in many cases.
To see the extent of spread about these averages, or the variation of the items, consider the following examples.
Example 1

          A series    B series
             100          50
             110         150
             120         100
             130         100
             140         200
Total        600         600
Mean         120         120

Example 2 (where $d = x - \bar{x}$)

              A series            B series
              x       d           x       d
             100    -20          300    -20
             110    -10          310    -10
             120      0          320      0
             130     10          330     10
             140     20          340     20
Total        600                1600
Mean         120                 320
In Example 1 the two series are made up of different items, yet their averages are the same. In Example 2 the means are different, yet the deviations of the individual items from the mean $\bar{x}$ are the same. Clearly, one should not hurriedly conclude that series A and B are alike because their means are equal, or that they are unlike because their means differ. Averages by themselves are not sufficient indicators of all the characteristics of a given set of data, so the data must be subjected to further statistical analysis. The measures of dispersion are that further step.
MEANING OF MEASURES OF DISPERSION
Dispersion is an important measure for describing the variability of data. It shows how far, on an average, the individual values fall apart from the representative value. The average is derived from the actual values, whereas dispersion is found by averaging the deviations from the representative value.
Definition: Let us go through some definitions.
“Dispersion is the measure of the variation of the items.” (A. L. Bowley)
“The degree to which numerical data tend to spread about an average value is called the variation or dispersion of the data.” (Spiegel)
The definitions given above focus on variation. To understand the actual amount of variation present in a given set of values, the size of the variation must be measured and expressed in numbers. This is known as a measure of dispersion.
Objectives of measures of dispersion
The following are the main purposes of measuring dispersion.
1. To test the reliability of an average. A measure of variation is the only means of testing how representative an average is. If the scatter is large, the average is less reliable. If the scatter is small, the average is a typical value.
2. To serve as a basis for control of variability. Measures of dispersion are indispensable for determining the nature and finding the causes of variation. When these are known, it is easier to control the variation itself.
3. To compare two or more series with regard to their variability. The degree of uniformity or consistency of data can be found by studying measures of dispersion. When two series are compared as regards the reliability of their averages, due consideration should be given to dispersion, which is a good basis for comparison.
4. To serve as a basis for further statistical analysis. Measures of dispersion are essential for studying other statistical tools.
Requisites of a good measure of dispersion
A good measure of dispersion should have the following properties.
1. It should be simple to understand and rigidly defined.
2. It should be easy to compute.
3. It should be based on all items.
4. It should be free from sampling fluctuations.
5. It should be capable of further algebraic treatment.
6. It should remain unaffected by extreme items.
Methods of Measuring Dispersion
The following are the important methods of studying variation.
1. Range
2. Inter-quartile Range and Quartile Deviation
3. Mean Deviation (not included in your syllabus)
4. Standard Deviation
Range
The range is the simplest measure of dispersion, and it is only a rough one. It depends on the two extreme items and not on all the items. It does not tell us anything about the distribution of values in the series relative to a typical value.
Thus Range = Largest value - Smallest value:
$R = L - S$
$\text{Coefficient of Range} = \dfrac{L - S}{L + S}$
To compare series, this relative measure of dispersion is used.
Computation of Range
Individual series

ILLUSTRATION 1
The net profit of a business concern, in thousands of Rs, is given below.

Year      1996  1997  1998  1999  2000  2001  2002
Profit     100   160   150   220   300   190   200

Find the range and its coefficient.

SOLUTION
Largest item L = 300, smallest item S = 100
$\text{Range} = L - S = 300 - 100 = 200$, i.e. Rs 200 thousand
$\text{Coefficient of Range} = \dfrac{L - S}{L + S} = \dfrac{300 - 100}{300 + 100} = \dfrac{200}{400} = 0.5$ or 50%
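The two formulas for the range can be checked with a few lines of Python. This is only a minimal sketch: the function name `range_and_coefficient` is chosen here for illustration, and the data are the profits from Illustration 1.

```python
def range_and_coefficient(values):
    """Return (range, coefficient of range) for a list of numbers."""
    largest = max(values)    # L
    smallest = min(values)   # S
    value_range = largest - smallest
    # Relative measure: (L - S) / (L + S); assumes L + S is not zero
    coefficient = value_range / (largest + smallest)
    return value_range, coefficient


# Net profit in thousands of Rs, 1996 to 2002 (Illustration 1)
profits = [100, 160, 150, 220, 300, 190, 200]
r, coeff = range_and_coefficient(profits)
print(r, coeff)
```

Only the largest and smallest items enter the calculation, which is why the range is so sensitive to extreme values.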
Discrete Series

ILLUSTRATION 2
Find the range and its coefficient for the following data.

Size         3    5    7    9   11   13
Frequency   11   15   13   19   14    2

Solution
Largest value L = 13, smallest value S = 3
$\text{Range} = L - S = 13 - 3 = 10$
$\text{Coefficient of Range} = \dfrac{L - S}{L + S} = \dfrac{13 - 3}{13 + 3} = \dfrac{10}{16} = 0.625$
Continuous Series

ILLUSTRATION 3
Find the range and its coefficient for the following frequency distribution.

Size        0-10  10-20  20-30  30-40  40-50
Frequency     1      3     12      6      3

Solution
Largest value = upper limit of the highest class interval, L = 50
Smallest value = lower limit of the lowest class interval, S = 0
$\text{Range} = L - S = 50 - 0 = 50$
$\text{Coefficient of Range} = \dfrac{L - S}{L + S} = \dfrac{50 - 0}{50 + 0} = \dfrac{50}{50} = 1$ or 100%

ILLUSTRATION 4
Calculate the range and its coefficient from the following data.

Size        1-10  11-20  21-30  31-40  41-50
Frequency     3      7     20     13      6

Solution
Convert the inclusive class intervals into exclusive form.

Class       0.5-10.5  10.5-20.5  20.5-30.5  30.5-40.5  40.5-50.5
Frequency       3         7         20         13          6

Largest value L = 50.5, smallest value S = 0.5
$\text{Range} = L - S = 50.5 - 0.5 = 50$
$\text{Coefficient of Range} = \dfrac{L - S}{L + S} = \dfrac{50.5 - 0.5}{50.5 + 0.5} = \dfrac{50}{51} = 0.98$

Uses of Range:
1. Range is used in industry for the statistical quality control of manufactured products, through the construction of control charts.
2. Range is useful in studying the variations in the prices of stocks, shares and other commodities that are sensitive to price changes from one period to another.
3. The meteorological department uses the range for weather forecasts.

Merits
1. It is simple to compute and understand.
2. It gives a rough but quick answer.
3. When items are limited, as in the case of sample lots for quality control purposes, the method is quite handy.
4. It is rigidly defined.

Demerits
1. It is not reliable, because it is affected by the extreme items.
2. It cannot be applied to open-end classes.
3. Range is too indefinite to be used as a practical measure of dispersion.
QUARTILE DEVIATION OR SEMI-INTER-QUARTILE RANGE
The semi-inter-quartile range, or quartile deviation, is defined as half the distance between the third and the first quartiles. Symbolically:
$\text{Semi-inter-quartile range or Quartile Deviation (Q.D.)} = \dfrac{Q_3 - Q_1}{2}$
This means that the items below the lower quartile and the items above the upper quartile are not included in the computation at all. We consider only the middle half of the distribution. The range so obtained is divided by two because we are considering only half of the data.
The quartile deviation gives the average amount by which the two quartiles differ from the median; in a symmetrical distribution both quartiles lie exactly Q.D. away from the median. It is a measure of partition rather than a measure of dispersion. The smaller the value of Q.D., the smaller the dispersion of the middle half of the distribution around the median. However, it gives no indication of the degree of dispersion lying beyond the two quartiles.
Quartile deviation is an absolute measure of dispersion. The relative measure of dispersion, known as the coefficient of quartile deviation, is calculated as follows:
$\text{Coefficient of Quartile Deviation} = \dfrac{Q_3 - Q_1}{Q_3 + Q_1}$
Quartile deviation is an improvement over the range, as it is calculated from the quartiles and not from the extreme items.
Computation of Quartile Deviation and its Coefficient
Individual series

ILLUSTRATION 5
15 students of a class obtained the following marks in statistics. Calculate the quartile deviation and its coefficient.
Marks: 15, 20, 20, 21, 22, 22, 24, 25, 28, 28, 29, 30, 32, 33, 35

Solution
Marks arranged in ascending order:

Sl. No.   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
Marks    15  20  20  21  22  22  24  25  28  28  29  30  32  33  35

$Q_1$ = size of the $\dfrac{N + 1}{4}$th item = size of the $\dfrac{15 + 1}{4}$ = 4th item = 21
$Q_3$ = size of the $\dfrac{3(N + 1)}{4}$th item = size of the $\dfrac{3(15 + 1)}{4}$ = 12th item = 30
$\text{Q.D.} = \dfrac{Q_3 - Q_1}{2} = \dfrac{30 - 21}{2} = 4.5$
$\text{Coefficient of Q.D.} = \dfrac{Q_3 - Q_1}{Q_3 + Q_1} = \dfrac{30 - 21}{30 + 21} = \dfrac{9}{51} = 0.176$
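The item-position rule used above can be sketched in Python as follows. The helper names `item_at` and `quartile_deviation` are illustrative only. When the position (N+1)/4 is fractional, the sketch interpolates linearly between the two neighbouring items; that is one common convention and is an assumption here, since the illustration above only needs whole-number positions. The series is assumed to have at least three items.

```python
def item_at(sorted_values, position):
    """Size of the item at a 1-based position in an ordered series.

    A fractional position is resolved by linear interpolation between
    the two neighbouring items (an assumed convention).
    """
    lower = int(position)
    fraction = position - lower
    value = sorted_values[lower - 1]
    if fraction > 0 and lower < len(sorted_values):
        value += fraction * (sorted_values[lower] - value)
    return value


def quartile_deviation(values):
    """Return (Q1, Q3, Q.D., coefficient of Q.D.) for an individual series."""
    data = sorted(values)
    n = len(data)
    q1 = item_at(data, (n + 1) / 4)
    q3 = item_at(data, 3 * (n + 1) / 4)
    qd = (q3 - q1) / 2
    coefficient = (q3 - q1) / (q3 + q1)
    return q1, q3, qd, coefficient


# Marks of the 15 students in Illustration 5
marks = [15, 20, 20, 21, 22, 22, 24, 25, 28, 28, 29, 30, 32, 33, 35]
print(quartile_deviation(marks))
```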
Discrete series

ILLUSTRATION 6
Find the quartile deviation and its coefficient from the following.

Age in years      15  16  17  18  19  20  21
No. of students    4   6  10  15  12   9   4

Solution

Age in years (x)   No. of students (f)    cf
      15                   4               4
      16                   6              10
      17                  10              20   (Q1)
      18                  15              35
      19                  12              47   (Q3)
      20                   9              56
      21                   4              60
                       N = 60

$Q_1$ = size of the $\dfrac{N + 1}{4}$th item = size of the $\dfrac{60 + 1}{4}$ = 15.25th item, so $Q_1 = 17$
$Q_3$ = size of the $\dfrac{3(N + 1)}{4}$th item = size of the $\dfrac{3(60 + 1)}{4}$ = 45.75th item, so $Q_3 = 19$
$\text{Q.D.} = \dfrac{Q_3 - Q_1}{2} = \dfrac{19 - 17}{2} = 1$
$\text{Coefficient of Q.D.} = \dfrac{Q_3 - Q_1}{Q_3 + Q_1} = \dfrac{19 - 17}{19 + 17} = \dfrac{2}{36} = 0.0556$
Continuous Series

ILLUSTRATION 7
Calculate the quartile deviation and its coefficient from the following data.

Wages in Rs      120-130  130-140  140-150  150-160  160-170  170-180  180-190  190-200
No. of workers      10       20       30       40       30       20       15        5

Solution

Wages (x)    f     cf
120-130     10     10
130-140     20     30
140-150     30     60   (q1)
150-160     40    100
160-170     30    130   (q3)
170-180     20    150
180-190     15    165
190-200      5    170
          N = 170

$Q_1 = L_1 + \dfrac{L_2 - L_1}{f}(q_1 - c)$, where $q_1 = \dfrac{N}{4} = \dfrac{170}{4} = 42.5$ and $c$ is the cumulative frequency of the class preceding the quartile class
$Q_1 = 140 + \dfrac{150 - 140}{30}(42.5 - 30) = 140 + \dfrac{10 \times 12.5}{30} = 144.167$
$Q_3 = L_1 + \dfrac{L_2 - L_1}{f}(q_3 - c)$, where $q_3 = \dfrac{3N}{4} = \dfrac{3 \times 170}{4} = 127.5$
$Q_3 = 160 + \dfrac{170 - 160}{30}(127.5 - 100) = 169.167$
$\text{Q.D.} = \dfrac{Q_3 - Q_1}{2} = \dfrac{169.167 - 144.167}{2} = 12.5$
$\text{Coefficient of Q.D.} = \dfrac{Q_3 - Q_1}{Q_3 + Q_1} = \dfrac{169.167 - 144.167}{169.167 + 144.167} = \dfrac{25}{313.334} = 0.080$
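For a continuous series, the interpolation formula can be sketched in Python as below. The function name `grouped_quartile` and its argument layout are assumptions made for this example; the classes are taken to be in exclusive form and listed in ascending order, as in Illustration 7.

```python
def grouped_quartile(classes, freqs, k):
    """k-th quartile (k = 1, 2 or 3) of a continuous frequency distribution.

    classes: list of (lower limit, upper limit) pairs in exclusive form
    freqs:   class frequencies in the same order
    """
    n = sum(freqs)
    target = k * n / 4      # q1 = N/4, q3 = 3N/4
    cumulative = 0          # c: cumulative frequency of the preceding classes
    for (lower, upper), f in zip(classes, freqs):
        if f > 0 and cumulative + f >= target:
            # Q = L1 + (L2 - L1) / f * (q - c)
            return lower + (upper - lower) / f * (target - cumulative)
        cumulative += f
    raise ValueError("quartile class not found")


# Wage distribution of Illustration 7
classes = [(120, 130), (130, 140), (140, 150), (150, 160),
           (160, 170), (170, 180), (180, 190), (190, 200)]
workers = [10, 20, 30, 40, 30, 20, 15, 5]

q1 = grouped_quartile(classes, workers, 1)
q3 = grouped_quartile(classes, workers, 3)
print(q1, q3, (q3 - q1) / 2, (q3 - q1) / (q3 + q1))
```

Inclusive, open-end or cumulative tables would first have to be converted to ordinary exclusive classes, exactly as the later illustrations do by hand.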
ILLUSTRATION 8
Calculate the quartile deviation and its coefficient from the data given below.

Mid value    1   2   3   4   5   6   7   8   9  10
Frequency    2   9  11  14  20  24  20  16   5   3

Solution: Form the lower and upper class limits from the given mid values.

Class (x)     f     cf
0.5-1.5       2      2
1.5-2.5       9     11
2.5-3.5      11     22
3.5-4.5      14     36   (q1)
4.5-5.5      20     56
5.5-6.5      24     80
6.5-7.5      20    100   (q3)
7.5-8.5      16    116
8.5-9.5       5    121
9.5-10.5      3    124
           N = 124

$Q_1 = L_1 + \dfrac{L_2 - L_1}{f}(q_1 - c)$, where $q_1 = \dfrac{N}{4} = \dfrac{124}{4} = 31$
$Q_1 = 3.5 + \dfrac{4.5 - 3.5}{14}(31 - 22) = 3.5 + \dfrac{1 \times 9}{14} = 4.14$
$Q_3 = L_1 + \dfrac{L_2 - L_1}{f}(q_3 - c)$, where $q_3 = \dfrac{3N}{4} = \dfrac{3 \times 124}{4} = 93$
$Q_3 = 6.5 + \dfrac{7.5 - 6.5}{20}(93 - 80) = 6.5 + \dfrac{1 \times 13}{20} = 7.15$
$\text{Q.D.} = \dfrac{Q_3 - Q_1}{2} = \dfrac{7.15 - 4.14}{2} = 1.505$
$\text{Coefficient of Q.D.} = \dfrac{Q_3 - Q_1}{Q_3 + Q_1} = \dfrac{7.15 - 4.14}{7.15 + 4.14} = \dfrac{3.01}{11.29} = 0.267$
ILLUSTRATION 9
Calculate the quartile deviation and its coefficient from the data given below.

Wages in Rs      Below 100  100-109.5  110-119.5  120-129.5  130-139.5  140-149.5  150-159.5  160 & above
No. of workers       12         18         24         16         30         20         15          5

Solution: Convert the given open-end, inclusive series into exclusive form.

Class (x)          f     cf
89.75-99.75       12     12
99.75-109.75      18     30
109.75-119.75     24     54   (q1 = 35)
119.75-129.75     16     70
129.75-139.75     30    100
139.75-149.75     20    120   (q3 = 105)
149.75-159.75     15    135
159.75-169.75      5    140
                N = 140

$Q_1 = L_1 + \dfrac{L_2 - L_1}{f}(q_1 - c)$, where $q_1 = \dfrac{N}{4} = \dfrac{140}{4} = 35$
$Q_1 = 109.75 + \dfrac{119.75 - 109.75}{24}(35 - 30) = 109.75 + \dfrac{10 \times 5}{24} = 111.83$
$Q_3 = L_1 + \dfrac{L_2 - L_1}{f}(q_3 - c)$, where $q_3 = \dfrac{3N}{4} = \dfrac{3 \times 140}{4} = 105$
$Q_3 = 139.75 + \dfrac{149.75 - 139.75}{20}(105 - 100) = 139.75 + \dfrac{10 \times 5}{20} = 142.25$
$\text{Q.D.} = \dfrac{Q_3 - Q_1}{2} = \dfrac{142.25 - 111.83}{2} = 15.21$
$\text{Coefficient of Q.D.} = \dfrac{Q_3 - Q_1}{Q_3 + Q_1} = \dfrac{142.25 - 111.83}{142.25 + 111.83} = \dfrac{30.42}{254.08} = 0.120$
ILLUSTRATION 10
Calculate the quartile deviation and its coefficient from the data given below.

Marks below       50   55   60   65   70   75   80   85   90   95  100
No. of students   15   31   48   70  102  130  148  170  185  190  200

Solution: Convert the given cumulative frequency table into an ordinary table.

Marks (x)     f     cf
45-50        15     15
50-55        16     31
55-60        17     48
60-65        22     70   (q1)
65-70        32    102
70-75        28    130
75-80        18    148
80-85        22    170   (q3)
85-90        15    185
90-95         5    190
95-100       10    200
           N = 200

$Q_1 = L_1 + \dfrac{L_2 - L_1}{f}(q_1 - c)$, where $q_1 = \dfrac{N}{4} = \dfrac{200}{4} = 50$
$Q_1 = 60 + \dfrac{65 - 60}{22}(50 - 48) = 60 + \dfrac{5 \times 2}{22} = 60.45$
$Q_3 = L_1 + \dfrac{L_2 - L_1}{f}(q_3 - c)$, where $q_3 = \dfrac{3N}{4} = \dfrac{3 \times 200}{4} = 150$
$Q_3 = 80 + \dfrac{85 - 80}{22}(150 - 148) = 80 + \dfrac{5 \times 2}{22} = 80.45$
$\text{Q.D.} = \dfrac{Q_3 - Q_1}{2} = \dfrac{80.45 - 60.45}{2} = 10$
$\text{Coefficient of Q.D.} = \dfrac{Q_3 - Q_1}{Q_3 + Q_1} = \dfrac{80.45 - 60.45}{80.45 + 60.45} = \dfrac{20}{140.9} = 0.142$
ILLUSTRATION 11
Calculate the quartile deviation and its coefficient from the data given below.

Wages above (Rs)   1200  1400  1600  1800  2000  2200  2400  2600  2800  3000
Frequency           100    95    85    72    61    54    43    30    16    10

Solution: Convert the given more-than cumulative frequency distribution into an ordinary table.

Wages (x)       f     cf
1200-1400       5      5
1400-1600      10     15
1600-1800      13     28   (q1)
1800-2000      11     39
2000-2200       7     46
2200-2400      11     57
2400-2600      13     70
2600-2800      14     84   (q3)
2800-3000       6     90
3000-3200      10    100
             N = 100

$Q_1 = L_1 + \dfrac{L_2 - L_1}{f}(q_1 - c)$, where $q_1 = \dfrac{N}{4} = \dfrac{100}{4} = 25$
$Q_1 = 1600 + \dfrac{1800 - 1600}{13}(25 - 15) = 1600 + \dfrac{200 \times 10}{13} = 1753.846$
$Q_3 = L_1 + \dfrac{L_2 - L_1}{f}(q_3 - c)$, where $q_3 = \dfrac{3N}{4} = \dfrac{3 \times 100}{4} = 75$
$Q_3 = 2600 + \dfrac{2800 - 2600}{14}(75 - 70) = 2600 + \dfrac{200 \times 5}{14} = 2671.43$
$\text{Q.D.} = \dfrac{Q_3 - Q_1}{2} = \dfrac{2671.43 - 1753.85}{2} = 458.79$
$\text{Coefficient of Q.D.} = \dfrac{Q_3 - Q_1}{Q_3 + Q_1} = \dfrac{2671.43 - 1753.85}{2671.43 + 1753.85} = \dfrac{917.58}{4425.28} = 0.207$
Merits of Quartile Deviation
1. It is easy to calculate and simple to understand.
2. It is rigidly defined.
3. It gives an idea of the variation of the central 50% of the observations.
4. It is not unduly affected by extreme values.

Demerits
1. It is not based on all the observations.
2. It is not capable of further algebraic treatment.
3. It is much affected by sampling fluctuations.
4. The data must first be arranged in ascending order.
TERMINAL QUESTIONS (5, 10 & 15 Marks)
1. Define the term dispersion. State the objectives of measuring dispersion.
2. What is meant by dispersion? Explain the different methods of measuring dispersion.
3. Briefly explain the relative merits and demerits of the various measures of dispersion.
4. What is range? Explain its merits and demerits.
5. What is quartile deviation? State its merits and demerits.
6. Find the quartile deviation and its coefficient from the following data.

Weight in kg   10-12  12-14  14-16  16-18  18-20  20-22  22-24
No. of boxes      6      9     20     25     24     15      5

[Answer: Q.D. = 2.09, coefficient of Q.D. = 0.12]
7. Calculate the quartile deviation and its coefficient from the following data.

Class        1-10  11-20  21-30  31-40  41-50  51-60  61-70
Frequency      6      8      6      9      9      6      6

[Answer: Q.D. = 15.67, coefficient of Q.D. = 0.46]
8. Calculate the quartile deviation and its coefficient from the following data.

Size more than   20  30  40  50  60  70
Frequency        68  63  40  40  18   7

[Answer: Q.D. = 12.845, coefficient = 0.27]
9. Calculate the quartile deviation and its coefficient from the following data.

Marks less than    10   20   30   40   50   60   70   80   90
No. of persons      5   15   98  242  367  405  425  438  439

[Answer: Q.D. = 8.08, coefficient = 0.21]
STANDARD DEVIATION

The concept of standard deviation was introduced by Karl Pearson in 1893. It is the most important measure of dispersion and is widely used in many statistical formulae. Standard deviation is also called root-mean-square deviation, because it is the square root of the mean of the squared deviations from the arithmetic mean. The drawback of mean deviation, namely ignoring the algebraic signs, is overcome here by squaring the deviations, which makes all of them positive.
It is therefore defined as the positive square root of the arithmetic mean of the squares of the deviations of the given observations from their arithmetic mean. The standard deviation is denoted by the Greek letter $\sigma$ (sigma). The square of the standard deviation is known as the variance.
Features of standard deviation
The following are the features of standard deviation.
1. The standard deviation measures the absolute dispersion of a distribution. The greater the dispersion, the greater is the standard deviation.
2. A small standard deviation means a high degree of uniformity and homogeneity of the observations. A large standard deviation means the opposite.
3. If we have two or more comparable series with nearly identical means, the distribution with the smallest standard deviation has the most representative mean.
4. The standard deviation has the same unit as the original variable.
Computation of standard deviation
Individual series: direct method
Steps:
1. Calculate the arithmetic mean $\bar{x}$ of the series.
2. Obtain the deviations from the arithmetic mean, $d_x = x - \bar{x}$.
3. Square these deviations and find their sum, $\sum d_x^2$.
4. Use the formula for standard deviation:
$\sigma = \sqrt{\dfrac{\sum d_x^2}{n}} = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n}}$
where $\bar{x}$ = arithmetic mean, $d_x^2$ = square of the deviation from $\bar{x}$, and $n$ = number of items.
ILLUSTRATION 1
Find the standard deviation of the monthly salaries of 10 persons given below.

Persons          A    B    C    D    E    F    G    H    I    J
Salaries in Rs  120  110  115  122  126  140  125  121  120  131

SOLUTION

Persons   Salaries in Rs (x)   Deviation from mean, dx = x - 123   Square, dx^2
A               120                      -3                              9
B               110                     -13                            169
C               115                      -8                             64
D               122                      -1                              1
E               126                      +3                              9
F               140                     +17                            289
G               125                      +2                              4
H               121                      -2                              4
I               120                      -3                              9
J               131                      +8                             64
Total          1230                       0                            622

$\bar{x} = \dfrac{\sum x}{n} = \dfrac{1230}{10} = 123$
$\sigma = \sqrt{\dfrac{\sum d_x^2}{n}} = \sqrt{\dfrac{622}{10}} = \sqrt{62.2} = 7.89$
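The four steps of the direct method translate almost line for line into Python. The sketch below uses an illustrative function name, `standard_deviation`, and divides by n as the notes do (the population form of the standard deviation, not the n - 1 sample form).

```python
import math


def standard_deviation(values):
    """Return (mean, standard deviation) by the direct method."""
    n = len(values)
    mean = sum(values) / n                          # step 1: arithmetic mean
    deviations = [x - mean for x in values]         # step 2: dx = x - mean
    sum_squares = sum(d * d for d in deviations)    # step 3: sum of dx^2
    return mean, math.sqrt(sum_squares / n)         # step 4: sqrt(sum dx^2 / n)


# Monthly salaries of the 10 persons in Illustration 1
salaries = [120, 110, 115, 122, 126, 140, 125, 121, 120, 131]
mean, sd = standard_deviation(salaries)
print(mean, round(sd, 2))
```

Python's standard library offers the same calculation as `statistics.pstdev`, which can serve as a cross-check.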
ILLUSTRATION 2
Calculate the standard deviation from the following data.

Variable x   10  15  20  12   8   4  15  16  25  35

SOLUTION

  x     Deviation from mean, dx = x - 16     dx^2
 10                  -6                        36
 15                  -1                         1
 20                  +4                        16
 12                  -4                        16
  8                  -8                        64
  4                 -12                       144
 15                  -1                         1
 16                   0                         0
 25                  +9                        81
 35                 +19                       361
160                   0                       720

$\bar{x} = \dfrac{\sum x}{n} = \dfrac{160}{10} = 16$
$\sigma = \sqrt{\dfrac{\sum d_x^2}{n}} = \sqrt{\dfrac{720}{10}} = \sqrt{72} = 8.485$
SHORT-CUT METHOD OR DEVIATIONS TAKEN FROM ASSUMED MEAN
This method is adopted when the arithmetic mean is a fractional value. Taking deviations from a fractional value is difficult and tedious, so to save time and labour we apply the short-cut method.
Steps:
1. Assume any one of the items in the series as the average (A).
2. Find the deviations from the assumed mean, $d_x = \dfrac{x - A}{C}$, where C is the common factor.
3. Find the total of the deviations, $\sum d_x$.
4. Square the deviations and add up the squares, i.e. $\sum d_x^2$.
5. Use the formula
$\sigma = \sqrt{\dfrac{\sum d_x^2}{n} - \left(\dfrac{\sum d_x}{n}\right)^2} \times C$

ILLUSTRATION 3
The table below gives the marks obtained by 10 students in a statistics examination. Calculate the standard deviation.

Sl. No.   1   2   3   4   5   6   7   8   9  10
Marks    43  48  65  57  31  60  37  48  78  59

Solution: A = 31, C = 1, $d_x = \dfrac{x - A}{C}$

Sl. No.   Marks (x)   dx = x - 31    dx^2
   1         43           12          144
   2         48           17          289
   3         65           34         1156
   4         57           26          676
   5         31            0            0
   6         60           29          841
   7         37            6           36
   8         48           17          289
   9         78           47         2209
  10         59           28          784
N = 10                   216         6424

$\sigma = \sqrt{\dfrac{\sum d_x^2}{n} - \left(\dfrac{\sum d_x}{n}\right)^2} \times C = \sqrt{\dfrac{6424}{10} - \left(\dfrac{216}{10}\right)^2} \times 1 = \sqrt{642.4 - 466.56} = \sqrt{175.84} = 13.26$

ILLUSTRATION 4
Calculate the standard deviation from the following data.

Values (x)   58  59  60  54  65  66  52  75  69  52

Solution: A = 65, C = 1, $d_x = \dfrac{x - A}{C}$

Values (x)   dx = x - 65    dx^2
    58           -7           49
    59           -6           36
    60           -5           25
    54          -11          121
    65            0            0
    66            1            1
    52          -13          169
    75           10          100
    69            4           16
    52          -13          169
n = 10          -40          686

$\sigma = \sqrt{\dfrac{\sum d_x^2}{n} - \left(\dfrac{\sum d_x}{n}\right)^2} \times C = \sqrt{\dfrac{686}{10} - \left(\dfrac{-40}{10}\right)^2} \times 1 = \sqrt{68.6 - 16} = \sqrt{52.6} = 7.253$
$\bar{x} = A + \dfrac{\sum d_x}{n} \times C = 65 + \dfrac{-40}{10} \times 1 = 61$

DISCRETE SERIES
Calculation of standard deviation

DIRECT METHOD OR ACTUAL MEAN METHOD
Steps:
1. Calculate the mean $\bar{x}$ of the series.
2. Find the deviations of the various items from the mean, $d_x = x - \bar{x}$.
3. Square the deviations ($d_x^2$) and multiply them by the respective frequencies to get $f d_x^2$.
4. Total the products to get $\sum f d_x^2$ and use the formula
$\sigma = \sqrt{\dfrac{\sum f d_x^2}{N}}$, where N = total frequency.

ILLUSTRATION 5
Calculate the standard deviation from the following data.

Marks             10  20  30  40  50  60
No. of students    8  12  20  10   7   3

Solution

  x      f      fx     dx = x - 30.8     dx^2       f dx^2
 10      8      80        -20.8         432.64     3461.12
 20     12     240        -10.8         116.64     1399.68
 30     20     600         -0.8           0.64       12.80
 40     10     400          9.2          84.64      846.40
 50      7     350         19.2         368.64     2580.48
 60      3     180         29.2         852.64     2557.92
N = 60        1850                                 10858.40

$\bar{x} = \dfrac{\sum fx}{N} = \dfrac{1850}{60} = 30.83$ (taken as 30.8 for the deviations)
$\sigma = \sqrt{\dfrac{\sum f d_x^2}{N}} = \sqrt{\dfrac{10858.40}{60}} = \sqrt{180.97} = 13.45$
ILLUSTRATION 6
Calculate the standard deviation for the following data.

No. of accidents    0   1   2   3   4   5   6   7   8   9  10  11  12
Persons involved   16  16  21  10  16   8   4   2   1   2   2   0   2

Solution

No. of accidents (x)   No. of persons (f)    fx    dx = x - 3    dx^2    f dx^2
        0                    16               0       -3           9      144
        1                    16              16       -2           4       64
        2                    21              42       -1           1       21
        3                    10              30        0           0        0
        4                    16              64        1           1       16
        5                     8              40        2           4       32
        6                     4              24        3           9       36
        7                     2              14        4          16       32
        8                     1               8        5          25       25
        9                     2              18        6          36       72
       10                     2              20        7          49       98
       11                     0               0        8          64        0
       12                     2              24        9          81      162
                          N = 100           300                           702

$\bar{x} = \dfrac{\sum fx}{N} = \dfrac{300}{100} = 3$
$\sigma = \sqrt{\dfrac{\sum f d_x^2}{N}} = \sqrt{\dfrac{702}{100}} = \sqrt{7.02} = 2.649$

SHORT-CUT METHOD OR ASSUMED MEAN METHOD
Steps:
1. Assume any one of the items in the series as the average (A).
2. Find the deviations from the assumed mean, after taking out the common factor C if any: $d_x = \dfrac{x - A}{C}$.
3. Multiply the deviations by the respective frequencies to get $f d_x$.
4. Square the deviations to get $d_x^2$.
5. Multiply the squared deviations ($d_x^2$) by the respective frequencies to get $f d_x^2$.
6. Use the formula
$\sigma = \sqrt{\dfrac{\sum f d_x^2}{N} - \left(\dfrac{\sum f d_x}{N}\right)^2} \times C$

ILLUSTRATION 7
Calculate the standard deviation for the following data.

Marks      5   6   7   8   9  10  11  12  13
Students   5   2   4   3   6   5   7   2   6

Solution: A = 9, C = 1, $d_x = \dfrac{x - A}{C}$

Marks (x)    f    dx = x - 9    f dx    dx^2    f dx^2
    5        5       -4         -20      16       80
    6        2       -3          -6       9       18
    7        4       -2          -8       4       16
    8        3       -1          -3       1        3
    9        6        0           0       0        0
   10        5        1           5       1        5
   11        7        2          14       4       28
   12        2        3           6       9       18
   13        6        4          24      16       96
          N = 40                 12               264

$\sigma = \sqrt{\dfrac{\sum f d_x^2}{N} - \left(\dfrac{\sum f d_x}{N}\right)^2} \times C = \sqrt{\dfrac{264}{40} - \left(\dfrac{12}{40}\right)^2} \times 1 = \sqrt{6.6 - 0.09} = \sqrt{6.51} = 2.551$

ILLUSTRATION 8
Find the standard deviation for the following data.

Values      60  70  80  90  100  110  120
Frequency    3   6   9  13    8    5    4

Solution: A = 90, C = 10, $d_x = \dfrac{x - A}{C} = \dfrac{x - 90}{10}$, and $f d_x^2 = f d_x \times d_x$

  x      f     dx    f dx    f dx^2
 60      3     -3     -9       27
 70      6     -2    -12       24
 80      9     -1     -9        9
 90     13      0      0        0
100      8      1      8        8
110      5      2     10       20
120      4      3     12       36
      N = 48           0      124

$\sigma = \sqrt{\dfrac{\sum f d_x^2}{N} - \left(\dfrac{\sum f d_x}{N}\right)^2} \times C = \sqrt{\dfrac{124}{48} - \left(\dfrac{0}{48}\right)^2} \times 10 = \sqrt{2.583} \times 10 = 1.607 \times 10 = 16.07$

CONTINUOUS SERIES
Calculation of standard deviation
In the case of a grouped series, find the mid values and proceed as in the case of a discrete series.

Direct Method

ILLUSTRATION 9
Find the arithmetic mean and standard deviation for the following data.

Class       0-4  4-8  8-12  12-16  16-20
Frequency    2    3    10     3      2

Solution

Class (x)    f    Mid value (m)    fm    d = m - 10    d^2    f d^2
 0-4         2         2            4       -8          64     128
 4-8         3         6           18       -4          16      48
 8-12       10        10          100        0           0       0
12-16        3        14           42        4          16      48
16-20        2        18           36        8          64     128
          N = 20                  200        0                 352

$\bar{x} = \dfrac{\sum fm}{N} = \dfrac{200}{20} = 10$
$\sigma = \sqrt{\dfrac{\sum f d^2}{N}} = \sqrt{\dfrac{352}{20}} = \sqrt{17.6} = 4.195$
SHORT-CUT METHOD OR ASSUMED MEAN METHOD
In a continuous series the method of calculating standard deviation is almost the same as in a discrete frequency distribution. The one additional step is obtaining the mid values. The step deviation method is widely used.

Formula for calculating the standard deviation:
$\sigma = \sqrt{\dfrac{\sum f d_x^2}{N} - \left(\dfrac{\sum f d_x}{N}\right)^2} \times C$

ILLUSTRATION 10
Calculate the standard deviation from the following data.

Class (x)   0-10  10-20  20-30  30-40  40-50  50-60  60-70
Frequency     8     12     17     14      9      7      4

Solution: A = 35, C = 10, $d_x = \dfrac{x - A}{C} = \dfrac{x - 35}{10}$

Class      f     Mid x    dx    f dx    f dx^2
 0-10      8       5      -3    -24       72
10-20     12      15      -2    -24       48
20-30     17      25      -1    -17       17
30-40     14      35       0      0        0
40-50      9      45       1      9        9
50-60      7      55       2     14       28
60-70      4      65       3     12       36
        N = 71                  -30      210

$\bar{x} = A + \dfrac{\sum f d_x}{N} \times C = 35 + \dfrac{-30}{71} \times 10 = 35 - \dfrac{300}{71} = 35 - 4.225 = 30.775$
$\sigma = \sqrt{\dfrac{\sum f d_x^2}{N} - \left(\dfrac{\sum f d_x}{N}\right)^2} \times C = \sqrt{\dfrac{210}{71} - \left(\dfrac{-30}{71}\right)^2} \times 10 = \sqrt{2.9577 - 0.1785} \times 10 = \sqrt{2.7792} \times 10 = 1.667 \times 10 = 16.67$
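The step deviation method can be sketched in Python as follows. The function name `step_deviation_sd` and its parameters are illustrative assumptions; the caller supplies the class mid values, the frequencies, the assumed mean A and the common factor C.

```python
import math


def step_deviation_sd(mid_values, freqs, assumed_mean, common_factor):
    """Return (mean, standard deviation) by the step deviation method."""
    n = sum(freqs)
    # dx = (x - A) / C for every mid value
    d = [(m - assumed_mean) / common_factor for m in mid_values]
    sum_fd = sum(f * di for f, di in zip(freqs, d))         # sum of f dx
    sum_fd2 = sum(f * di * di for f, di in zip(freqs, d))   # sum of f dx^2
    mean = assumed_mean + sum_fd / n * common_factor
    sd = math.sqrt(sum_fd2 / n - (sum_fd / n) ** 2) * common_factor
    return mean, sd


# Illustration 10: classes 0-10 to 60-70, A = 35, C = 10
mids = [5, 15, 25, 35, 45, 55, 65]
freqs = [8, 12, 17, 14, 9, 7, 4]
mean, sd = step_deviation_sd(mids, freqs, assumed_mean=35, common_factor=10)
print(round(mean, 3), round(sd, 2))
```

The result does not depend on which item is chosen as A. The choice only changes the size of the intermediate numbers, which is the whole point of the short-cut.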
ILLUSTRATION 11
The following data relate to the ages of a group of workers. Calculate the arithmetic mean and standard deviation.

Age in years     20-25  25-30  30-35  35-40  40-45  45-50  50-55
No. of workers    170    110     80     45     40     30     25

Solution: A = 37.5, C = 5, $d_x = \dfrac{x - A}{C} = \dfrac{x - 37.5}{5}$, and $f d_x^2 = f d_x \times d_x$

Age (x)     f      Mid x    dx    f dx    f dx^2
20-25      170     22.5     -3    -510     1530
25-30      110     27.5     -2    -220      440
30-35       80     32.5     -1     -80       80
35-40       45     37.5      0       0        0
40-45       40     42.5      1      40       40
45-50       30     47.5      2      60      120
50-55       25     52.5      3      75      225
        N = 500                   -635     2435

$\bar{x} = A + \dfrac{\sum f d_x}{N} \times C = 37.5 + \dfrac{-635}{500} \times 5 = 37.5 - 6.35 = 31.15$
$\sigma = \sqrt{\dfrac{\sum f d_x^2}{N} - \left(\dfrac{\sum f d_x}{N}\right)^2} \times C = \sqrt{\dfrac{2435}{500} - \left(\dfrac{-635}{500}\right)^2} \times 5 = \sqrt{4.87 - 1.613} \times 5 = 1.805 \times 5 = 9.02$
ILLUSTRATION 12
Calculate the arithmetic mean and standard deviation from the following data.

Wages more than   100  200  300  400  500  600  700  800
No. of workers    660  615  527  381  175   96   44   14

[K U V BBM 2002]

Solution: A = 450, C = 100, $d_x = \dfrac{x - A}{C} = \dfrac{x - 450}{100}$

Wages (x)     f     Mid x    dx    f dx    f dx^2
100-200      45      150     -3    -135      405
200-300      88      250     -2    -176      352
300-400     146      350     -1    -146      146
400-500     206      450      0       0        0
500-600      79      550      1      79       79
600-700      52      650      2     104      208
700-800      30      750      3      90      270
800-900      14      850      4      56      224
          N = 660                  -128     1684

$\bar{x} = A + \dfrac{\sum f d_x}{N} \times C = 450 + \dfrac{-128}{660} \times 100 = 450 - 19.39 = 430.61$
$\sigma = \sqrt{\dfrac{\sum f d_x^2}{N} - \left(\dfrac{\sum f d_x}{N}\right)^2} \times C = \sqrt{\dfrac{1684}{660} - \left(\dfrac{-128}{660}\right)^2} \times 100 = \sqrt{2.5515 - 0.0376} \times 100 = \sqrt{2.5139} \times 100 = 1.5855 \times 100 = 158.55$
ILLUSTRATION 13
Calculate the arithmetic mean and standard deviation.

Age less than (years)   10  20  30  40   50   60   70   80
No. of workers          15  30  53  75  100  110  115  125

Solution: A = 35, C = 10, $d_x = \dfrac{x - A}{C} = \dfrac{x - 35}{10}$

Age (x)     f     Mid x    dx    f dx    f dx^2
 0-10      15       5      -3    -45      135
10-20      15      15      -2    -30       60
20-30      23      25      -1    -23       23
30-40      22      35       0      0        0
40-50      25      45       1     25       25
50-60      10      55       2     20       40
60-70       5      65       3     15       45
70-80      10      75       4     40      160
        N = 125                    2      488

$\bar{x} = A + \dfrac{\sum f d_x}{N} \times C = 35 + \dfrac{2}{125} \times 10 = 35 + \dfrac{20}{125} = 35.16$
$\sigma = \sqrt{\dfrac{\sum f d_x^2}{N} - \left(\dfrac{\sum f d_x}{N}\right)^2} \times C = \sqrt{\dfrac{488}{125} - \left(\dfrac{2}{125}\right)^2} \times 10 = \sqrt{3.904 - 0.000256} \times 10 = 1.976 \times 10 = 19.76$
ILLUSTRATION 14
Calculate the arithmetic mean and standard deviation from the following data.

Income           Under 100  100-104.9  105-109.9  110-114.9  115-119.9  120-124.9  125-129.9
No. of workers       20         40         30         20         50         15          5

Solution: A = 112.45, C = 5, $d_x = \dfrac{x - A}{C}$

Income (x)      f     Mid x     dx    f dx    f dx^2
 95-99.9       20     97.45     -3    -60      180
100-104.9      40    102.45     -2    -80      160
105-109.9      30    107.45     -1    -30       30
110-114.9      20    112.45      0      0        0
115-119.9      50    117.45      1     50       50
120-124.9      15    122.45      2     30       60
125-129.9       5    127.45      3     15       45
            N = 180                   -75      525

$\bar{x} = A + \dfrac{\sum f d_x}{N} \times C = 112.45 + \dfrac{-75}{180} \times 5 = 112.45 - 2.08 = 110.37$
$\sigma = \sqrt{\dfrac{\sum f d_x^2}{N} - \left(\dfrac{\sum f d_x}{N}\right)^2} \times C = \sqrt{\dfrac{525}{180} - \left(\dfrac{-75}{180}\right)^2} \times 5 = \sqrt{2.9167 - 0.1736} \times 5 = \sqrt{2.7431} \times 5 = 1.656 \times 5 = 8.28$
COEFFICIENT OF VARIATION (C.V.)
The standard deviation is an absolute measure of dispersion. It is expressed in the units in which the original figures are collected and stated. The standard deviation of the heights of students cannot be compared with the standard deviation of their weights, as the two are expressed in different units. The standard deviation must therefore be converted into a relative measure of dispersion for the purpose of comparison. This relative measure is known as the coefficient of variation.
Variance: the square of the standard deviation is called the variance. Symbolically,
$\text{Variance} = \sigma^2$, so that $\sigma = \sqrt{\text{variance}}$
$\text{Coefficient of standard deviation} = \dfrac{\sigma}{\bar{x}}$
For better comparison, the coefficient of standard deviation is multiplied by 100, which gives the coefficient of variation:
$\text{Coefficient of variation} = \dfrac{\sigma}{\bar{x}} \times 100$
Prof. Karl Pearson suggested the coefficient of variation as the most commonly used measure of relative variation. The series with the greater C.V. is the more variable, or less uniform. The series with the smaller C.V. is the less variable, or more stable and more consistent.
COMPARISON OF TWO SERIES USING COEFFICIENT OF VARIATION

ILLUSTRATION 15
The index numbers of prices of cotton and coal shares in a year are given below.

Month        Index no. of prices of cotton shares (x)   Index no. of prices of coal shares (y)
January                     188                                        131
February                    178                                        130
March                       173                                        130
April                       164                                        129
May                         172                                        129
June                        183                                        129
July                        185                                        127
August                      184                                        127
September                   211                                        130
October                     217                                        137
November                    232                                        140
December                    240                                        142

Compare the variations in the prices of the two shares using the coefficient of variation.

Solution: Calculation of the coefficient of variation

  x     dx = x - 185    dx^2       y     dy = y - 127    dy^2
 188          3            9      131          4           16
 178         -7           49      130          3            9
 173        -12          144      130          3            9
 164        -21          441      129          2            4
 172        -13          169      129          2            4
 183         -2            4      129          2            4
 185          0            0      127          0            0
 184         -1            1      127          0            0
 211         26          676      130          3            9
 217         32         1024      137         10          100
 232         47         2209      140         13          169
 240         55         3025      142         15          225
N = 12      107         7751                  57          549

Cotton shares (x):
$\bar{x} = A + \dfrac{\sum d_x}{n} \times C = 185 + \dfrac{107}{12} \times 1 = 193.917$
$\sigma_x = \sqrt{\dfrac{\sum d_x^2}{n} - \left(\dfrac{\sum d_x}{n}\right)^2} \times C = \sqrt{\dfrac{7751}{12} - \left(\dfrac{107}{12}\right)^2} = \sqrt{645.92 - 79.51} = \sqrt{566.41} = 23.799$
$\text{C.V.} = \dfrac{\sigma_x}{\bar{x}} \times 100 = \dfrac{23.799}{193.917} \times 100 = 12.27\%$

Coal shares (y):
$\bar{y} = A + \dfrac{\sum d_y}{n} \times C = 127 + \dfrac{57}{12} \times 1 = 127 + 4.75 = 131.75$
$\sigma_y = \sqrt{\dfrac{\sum d_y^2}{n} - \left(\dfrac{\sum d_y}{n}\right)^2} \times C = \sqrt{\dfrac{549}{12} - \left(\dfrac{57}{12}\right)^2} = \sqrt{45.75 - 22.5625} = \sqrt{23.1875} = 4.815$
$\text{C.V.} = \dfrac{\sigma_y}{\bar{y}} \times 100 = \dfrac{4.815}{131.75} \times 100 = 3.65\%$

Hence cotton shares are more variable in price than coal shares.
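The comparison can be reproduced with Python's standard `statistics` module. This is a rough sketch: the function name `coefficient_of_variation` is illustrative, and `statistics.pstdev` is used because it divides by n, matching the formula in these notes.

```python
import statistics


def coefficient_of_variation(values):
    """Coefficient of variation in percent: (sigma / mean) * 100."""
    mean = sum(values) / len(values)
    sigma = statistics.pstdev(values)   # standard deviation with divisor n
    return sigma / mean * 100


# Monthly index numbers from Illustration 15
cotton = [188, 178, 173, 164, 172, 183, 185, 184, 211, 217, 232, 240]
coal = [131, 130, 130, 129, 129, 129, 127, 127, 130, 137, 140, 142]

cv_cotton = coefficient_of_variation(cotton)
cv_coal = coefficient_of_variation(coal)
print(round(cv_cotton, 2), round(cv_coal, 2))
print("More variable:", "cotton" if cv_cotton > cv_coal else "coal")
```

Because the C.V. is a pure number, the same function can compare series measured in different units, such as heights and weights.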
ILLUSTRATION 16
Following are the runs scored by two batsmen A and B. Find
a. Who is the better run getter?
b. Who is the more consistent batsman?

A   101  22   0  36  82  45   7  13  65  14
B    97  12  40  96  13   8  85   8  56  15

Solution: Assumed means $A_x = 82$, $A_y = 13$

A (x)    dx = x - 82    dx^2     B (y)    dy = y - 13    dy^2
 101         19           361      97         84          7056
  22        -60          3600      12         -1             1
   0        -82          6724      40         27           729
  36        -46          2116      96         83          6889
  82          0             0      13          0             0
  45        -37          1369       8         -5            25
   7        -75          5625      85         72          5184
  13        -69          4761       8         -5            25
  65        -17           289      56         43          1849
  14        -68          4624      15          2             4
           -435         29469                300         21762

$\bar{x} = A + \dfrac{\sum d_x}{n} \times C = 82 + \dfrac{-435}{10} \times 1 = 38.5$
$\bar{y} = A + \dfrac{\sum d_y}{n} \times C = 13 + \dfrac{300}{10} \times 1 = 43$

x-series:
$\sigma_x = \sqrt{\dfrac{\sum d_x^2}{n} - \left(\dfrac{\sum d_x}{n}\right)^2} \times C = \sqrt{\dfrac{29469}{10} - (-43.5)^2} = \sqrt{2946.9 - 1892.25} = \sqrt{1054.65} = 32.475$
$\text{C.V.} = \dfrac{\sigma_x}{\bar{x}} \times 100 = \dfrac{32.475}{38.5} \times 100 = 84.35\%$

y-series:
$\sigma_y = \sqrt{\dfrac{\sum d_y^2}{n} - \left(\dfrac{\sum d_y}{n}\right)^2} \times C = \sqrt{\dfrac{21762}{10} - (30)^2} = \sqrt{2176.2 - 900} = \sqrt{1276.2} = 35.724$
$\text{C.V.} = \dfrac{\sigma_y}{\bar{y}} \times 100 = \dfrac{35.724}{43} \times 100 = 83.08\%$

Conclusion:
1. Batsman B is the better run getter, because he scored 430 runs compared with 385 runs by batsman A.
2. Batsman B is more consistent, as his coefficient of variation is lower.
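COMPARISON OF TWO GROUPS OF DATA USING COEFFICIENT OF VARIATION

DISCRETE SERIES

ILLUSTRATION 17
The goals scored by two teams A and B in football matches were as follows.

Goals               0   1   2   3   4
Matches, Team A    27   9   8   4   5
Matches, Team B    17   9   6   5   3

Find which team is more consistent.

Solution: A = 2, C = 1

                     Team A                          Team B
Goals (x)    dx = x - 2    fA    fA dx    fA dx^2     fB    fB dx    fB dx^2
    0           -2         27     -54       108       17     -34       68
    1           -1          9      -9         9        9      -9        9
    2            0          8       0         0        6       0        0
    3            1          4       4         4        5       5        5
    4            2          5      10        20        3       6       12
                       N1 = 53    -49       141   N2 = 40    -32       94

Team A:
$\sigma_A = \sqrt{\dfrac{\sum f_A d_x^2}{N_1} - \left(\dfrac{\sum f_A d_x}{N_1}\right)^2} \times C = \sqrt{\dfrac{141}{53} - \left(\dfrac{-49}{53}\right)^2} \times 1 = \sqrt{2.6604 - 0.8548} = \sqrt{1.8056} = 1.3437$
$\bar{x}_A = A + \dfrac{\sum f_A d_x}{N_1} \times C = 2 + \dfrac{-49}{53} \times 1 = 2 - 0.9245 = 1.0755$
$\text{C.V.} = \dfrac{\sigma_A}{\bar{x}_A} \times 100 = \dfrac{1.3437}{1.0755} \times 100 = 124.94\%$

Team B:
$\sigma_B = \sqrt{\dfrac{\sum f_B d_x^2}{N_2} - \left(\dfrac{\sum f_B d_x}{N_2}\right)^2} \times C = \sqrt{\dfrac{94}{40} - \left(\dfrac{-32}{40}\right)^2} \times 1 = \sqrt{2.35 - 0.64} = \sqrt{1.71} = 1.308$
$\bar{x}_B = A + \dfrac{\sum f_B d_x}{N_2} \times C = 2 + \dfrac{-32}{40} \times 1 = 1.2$
$\text{C.V.} = \dfrac{\sigma_B}{\bar{x}_B} \times 100 = \dfrac{1.308}{1.2} \times 100 = 109\%$

Team B is more consistent, as it has the smaller coefficient of variation.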
CONTINUOUS SERIES

ILLUSTRATION 18
The following data give the life of electric bulbs manufactured by two companies.
A. Which of the two makes has the higher average life?
B. If the prices of both bulbs are the same, which company's bulb would you prefer to buy, and why? Use C.V.

Life in '000 hrs (x)    Company A    Company B
      50-59                 18           15
      60-69                 22           24
      70-79                 26           30
      80-89                 25           18
      90-99                  9           13
      Total                100          100

Solution: Computation of the coefficient of variation for both series, with A = 74.5, C = 10, $d_x = \dfrac{x - 74.5}{10}$

  x       Mid x    dx     fA    fA dx    fA dx^2     fB    fB dx    fB dx^2
50-59     54.5     -2     18     -36       72        15     -30       60
60-69     64.5     -1     22     -22       22        24     -24       24
70-79     74.5      0     26       0        0        30       0        0
80-89     84.5      1     25      25       25        18      18       18
90-99     94.5      2      9      18       36        13      26       52
                     N1 = 100    -15      155   N2 = 100    -10      154

Company A:
$\sigma_A = \sqrt{\dfrac{\sum f_A d_x^2}{N_1} - \left(\dfrac{\sum f_A d_x}{N_1}\right)^2} \times C = \sqrt{\dfrac{155}{100} - \left(\dfrac{-15}{100}\right)^2} \times 10 = \sqrt{1.55 - 0.0225} \times 10 = \sqrt{1.5275} \times 10 = 1.2359 \times 10 = 12.359$
$\bar{x}_A = A + \dfrac{\sum f_A d_x}{N_1} \times C = 74.5 + \dfrac{-15}{100} \times 10 = 74.5 - 1.5 = 73.0$
$\text{C.V.} = \dfrac{\sigma_A}{\bar{x}_A} \times 100 = \dfrac{12.359}{73} \times 100 = 16.93\%$

Company B:
$\sigma_B = \sqrt{\dfrac{\sum f_B d_x^2}{N_2} - \left(\dfrac{\sum f_B d_x}{N_2}\right)^2} \times C = \sqrt{\dfrac{154}{100} - \left(\dfrac{-10}{100}\right)^2} \times 10 = \sqrt{1.54 - 0.01} \times 10 = \sqrt{1.53} \times 10 = 1.2369 \times 10 = 12.369$
$\bar{x}_B = A + \dfrac{\sum f_B d_x}{N_2} \times C = 74.5 + \dfrac{-10}{100} \times 10 = 74.5 - 1 = 73.5$
$\text{C.V.} = \dfrac{\sigma_B}{\bar{x}_B} \times 100 = \dfrac{12.369}{73.5} \times 100 = 16.83\%$

Conclusion
1. Company B's bulbs have the higher average life, i.e. 73.5 thousand hours.
2. Company B's bulbs are more uniform and consistent in the life they give, so a buyer would prefer to buy company B's bulbs.
ILLUSTRATION =19
From the data given below state which of the two series is more variable. Use coefficient of variation. Variable 10-20 20-30 30-40 40-50 50-60 60-70Frequency A 10 18 32 40 22 18 Frequency B 18 22 40 32 18 10
103
SolutionComputation of coefficient of variation .
Frequency A Frequency BVariable
Xmid x
x-3510 dx
fA fAdx fAd2x dx fA fBdx fBd2x
10-2020-3030-4040-5050-6060-70
152535455565
-2-10123
101832402218
-20-180404454
401804088162
-2-10123
182240321810
-36-220323630
72220327290
140 100 348 140 40 288
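Means:
$$\bar{x}_A = A + \frac{\sum f_A d_x}{N_1} \times C = 35 + \frac{100}{140} \times 10 = 42.1429$$
$$\bar{x}_B = A + \frac{\sum f_B d_x}{N_2} \times C = 35 + \frac{40}{140} \times 10 = 37.8571$$

A-Series:
$$\sigma_A = \sqrt{\frac{\sum f_A d_x^2}{N_1} - \left(\frac{\sum f_A d_x}{N_1}\right)^2} \times C = \sqrt{\frac{348}{140} - \left(\frac{100}{140}\right)^2} \times 10 = \sqrt{2.4857 - (0.7143)^2} \times 10 = 14.055$$
$$\text{C.V.}_A = \frac{\sigma_A}{\bar{x}_A} \times 100 = \frac{14.055}{42.1429} \times 100 = 33.35\%$$

B-Series:
$$\sigma_B = \sqrt{\frac{\sum f_B d_x^2}{N_2} - \left(\frac{\sum f_B d_x}{N_2}\right)^2} \times C = \sqrt{\frac{288}{140} - \left(\frac{40}{140}\right)^2} \times 10 = \sqrt{2.0571 - (0.2857)^2} \times 10 = 14.055$$
$$\text{C.V.}_B = \frac{\sigma_B}{\bar{x}_B} \times 100 = \frac{14.055}{37.8571} \times 100 = 37.13\%$$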
Conclusion: Series B is more variable, since its coefficient of variation (37.13%) is higher than that of Series A (33.35%).
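A point worth noticing in this illustration is that the frequencies of Series B are those of Series A written in reverse order. Both series therefore have the same standard deviation and differ only in their means, which is why the C.V. separates them. The sketch below is an illustrative Python check, not part of the original solution; the helper name `grouped_mean_sd` is an assumption. It uses the direct formulas (mean = Σfx/N, variance = Σf(x - mean)²/N) instead of step deviations.

```python
from math import sqrt, isclose

def grouped_mean_sd(mid_points, freqs):
    """Direct method for grouped data: returns (mean, population sd)."""
    n = sum(freqs)
    mean = sum(f * x for f, x in zip(freqs, mid_points)) / n
    variance = sum(f * (x - mean) ** 2 for f, x in zip(freqs, mid_points)) / n
    return mean, sqrt(variance)

mids = [15, 25, 35, 45, 55, 65]
freq_a = [10, 18, 32, 40, 22, 18]
freq_b = freq_a[::-1]  # Series B is Series A's frequencies reversed

mean_a, sd_a = grouped_mean_sd(mids, freq_a)
mean_b, sd_b = grouped_mean_sd(mids, freq_b)
cv_a = sd_a / mean_a * 100
cv_b = sd_b / mean_b * 100

print(f"A: mean = {mean_a:.4f}, SD = {sd_a:.3f}, C.V. = {cv_a:.2f}%")
print(f"B: mean = {mean_b:.4f}, SD = {sd_b:.3f}, C.V. = {cv_b:.2f}%")
print("Same SD for both series:", isclose(sd_a, sd_b))
```

With equal standard deviations, the series with the smaller mean necessarily has the larger C.V.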
ILLUSTRATION = 20
Two brands of tyres are tested for their life and the following results were obtained. State which brand of tyres is more consistent.

Life in months            20-25    25-30    30-35    35-40    40-45
No. of tyres, brand x     1        22       64       10       3
No. of tyres, brand y     3        21       74       1        1

Solution: Computation of coefficient of variation.
Taking assumed mean A = 32.5, class interval C = 5 and dx = (x - 32.5)/5:

                                   Brand x                  Brand y
Life in months    Mid x    dx      f      fdx    fdx²       f      fdx    fdx²
20-25             22.5     -2      1      -2     4          3      -6     12
25-30             27.5     -1      22     -22    22         21     -21    21
30-35             32.5     0       64     0      0          74     0      0
35-40             37.5     1       10     10     10         1      1      1
40-45             42.5     2       3      6      12         1      2      4
Total                              100    -8     48         100    -24    38
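Means:
$$\bar{x} = A + \frac{\sum f d_x}{N_1} \times C = 32.5 + \frac{-8}{100} \times 5 = 32.5 - 0.4 = 32.1$$
$$\bar{y} = A + \frac{\sum f d_x}{N_2} \times C = 32.5 + \frac{-24}{100} \times 5 = 32.5 - 1.2 = 31.3$$

Brand x:
$$\sigma_x = \sqrt{\frac{\sum f d_x^2}{N_1} - \left(\frac{\sum f d_x}{N_1}\right)^2} \times C = \sqrt{\frac{48}{100} - \left(\frac{-8}{100}\right)^2} \times 5 = \sqrt{0.4736} \times 5 = 3.44$$
$$\text{C.V.}_x = \frac{\sigma_x}{\bar{x}} \times 100 = \frac{3.44}{32.1} \times 100 = 10.72\%$$

Brand y:
$$\sigma_y = \sqrt{\frac{\sum f d_x^2}{N_2} - \left(\frac{\sum f d_x}{N_2}\right)^2} \times C = \sqrt{\frac{38}{100} - \left(\frac{-24}{100}\right)^2} \times 5 = \sqrt{0.3224} \times 5 = 2.839$$
$$\text{C.V.}_y = \frac{\sigma_y}{\bar{y}} \times 100 = \frac{2.839}{31.3} \times 100 = 9.07\%$$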
Conclusion: Brand y tyres are more consistent than brand x tyres, since their coefficient of variation (9.07%) is lower than that of brand x (10.72%).
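The tyre figures can also be cross-checked without any step deviations. The sketch below is an illustrative assumption, not part of the original solution: it expands each class into repeated mid-point values (the usual grouped-data assumption that every tyre in a class has the mid-point life) and then uses Python's standard `statistics` module. `pstdev` is the population standard deviation, which matches the divisor N used in the text; the helper name `expand` is hypothetical.

```python
from statistics import fmean, pstdev

def expand(mid_points, freqs):
    """Repeat each class mid-point as many times as its frequency."""
    return [x for x, f in zip(mid_points, freqs) for _ in range(f)]

mids = [22.5, 27.5, 32.5, 37.5, 42.5]   # life in months
brand_x = expand(mids, [1, 22, 64, 10, 3])
brand_y = expand(mids, [3, 21, 74, 1, 1])

mean_x, sd_x = fmean(brand_x), pstdev(brand_x)
mean_y, sd_y = fmean(brand_y), pstdev(brand_y)
cv_x = sd_x / mean_x * 100
cv_y = sd_y / mean_y * 100

print(f"x: mean = {mean_x:.1f}, SD = {sd_x:.3f}, C.V. = {cv_x:.2f}%")
print(f"y: mean = {mean_y:.1f}, SD = {sd_y:.3f}, C.V. = {cv_y:.2f}%")
```

Expanding the data is wasteful for large frequencies, but it makes clear that the step-deviation formula is only a shortcut for the ordinary mean and standard deviation.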
ILLUSTRATION = 21
In two factories A and B in an industrial area, the average weekly wages (in Rs) and the standard deviations are as follows:

Factory    Average    Standard deviation    No. of workers
A          345        5                     476
B          285        4.5                   524

Find:
1) Which factory, A or B, pays out the larger amount as weekly wages?
2) Which factory, A or B, has the greater variability in individual wages?
Solution
1. Calculation of total weekly wage payment
Total wages paid by factory A = Rs 345 x 476 = Rs 164220
Total wages paid by factory B = Rs 285 x 524 = Rs 149340
Therefore, factory A pays out the larger amount as weekly wages.

2. Calculation of coefficient of variation
$$\text{Factory A: C.V.} = \frac{\sigma}{\bar{x}} \times 100 = \frac{5}{345} \times 100 = 1.449\%$$
$$\text{Factory B: C.V.} = \frac{\sigma}{\bar{x}} \times 100 = \frac{4.5}{285} \times 100 = 1.579\%$$
Factory B has the greater variability in individual wages, since the C.V. of factory B is greater than the C.V. of factory A.
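When only summary figures are given (average, standard deviation and number of workers), both answers follow from two one-line formulas: total wages = average x number of workers, and C.V. = SD / average x 100. The following Python sketch illustrates this under those assumptions; the class name `Factory` and its field names are hypothetical and chosen only for this example.

```python
from typing import NamedTuple

class Factory(NamedTuple):
    name: str
    mean_wage: float   # average weekly wage in Rs
    sd_wage: float     # standard deviation of weekly wages in Rs
    workers: int

    def total_wages(self) -> float:
        # Total weekly wage bill = average wage x number of workers
        return self.mean_wage * self.workers

    def cv_percent(self) -> float:
        # Coefficient of variation = SD / mean x 100
        return self.sd_wage / self.mean_wage * 100

factory_a = Factory("A", 345, 5, 476)
factory_b = Factory("B", 285, 4.5, 524)

for f in (factory_a, factory_b):
    print(f"Factory {f.name}: total = Rs {f.total_wages():.0f}, "
          f"C.V. = {f.cv_percent():.3f}%")
```

Note that the factory with the larger wage bill (A) is not the one with the larger relative variability (B); the two questions measure different things.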
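ILLUSTRATION = 22
Particulars regarding the income of two villages are given below.

Particulars       Village A    Village B
Average income    1750         1860
Variance          100          81

State in which village the variation in income is greater.

Solution
Calculation of coefficient of variation

The standard deviation is the square root of the variance:
$$\sigma_A = \sqrt{\text{Variance}} = \sqrt{100} = 10, \qquad \sigma_B = \sqrt{\text{Variance}} = \sqrt{81} = 9$$

$$\text{1. Village A: C.V.} = \frac{\sigma_A}{\bar{x}_A} \times 100 = \frac{10}{1750} \times 100 = 0.57\%$$
$$\text{2. Village B: C.V.} = \frac{\sigma_B}{\bar{x}_B} \times 100 = \frac{9}{1860} \times 100 = 0.484\%$$

Conclusion: Village A has greater variation in income than village B.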
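ILLUSTRATION = 23
The coefficients of variation of two series are 58% and 69%. Their standard deviations are 21.2 and 15.6. What are their arithmetic means?

Solution
x-series:
$$\text{C.V.} = \frac{\sigma_x}{\bar{x}} \times 100 \;\Rightarrow\; 58 = \frac{21.2}{\bar{x}} \times 100 \;\Rightarrow\; \bar{x} = \frac{2120}{58} = 36.55$$
y-series:
$$\text{C.V.} = \frac{\sigma_y}{\bar{y}} \times 100 \;\Rightarrow\; 69 = \frac{15.6}{\bar{y}} \times 100 \;\Rightarrow\; \bar{y} = \frac{1560}{69} = 22.61$$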
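ILLUSTRATION = 24
The coefficients of variation of two series are 75% and 90%, and their arithmetic means are 20 and 20 respectively. Find their standard deviations.

Solution
x-series:
$$\text{C.V.} = \frac{\sigma_x}{\bar{x}} \times 100 \;\Rightarrow\; 75 = \frac{\sigma_x}{20} \times 100 \;\Rightarrow\; \sigma_x = \frac{75 \times 20}{100} = 15$$
y-series:
$$\text{C.V.} = \frac{\sigma_y}{\bar{y}} \times 100 \;\Rightarrow\; 90 = \frac{\sigma_y}{20} \times 100 \;\Rightarrow\; \sigma_y = \frac{90 \times 20}{100} = 18$$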
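Illustrations 23 and 24 both rearrange the same relation, C.V. = SD / mean x 100, to recover whichever quantity is missing. A minimal Python sketch of the two rearrangements follows; the function names `mean_from_cv` and `sd_from_cv` are illustrative assumptions, not names used in the text.

```python
def mean_from_cv(cv_percent, sd):
    """Mean recovered from C.V. (%) and standard deviation: mean = SD x 100 / C.V."""
    return sd * 100 / cv_percent

def sd_from_cv(cv_percent, mean):
    """Standard deviation recovered from C.V. (%) and mean: SD = C.V. x mean / 100."""
    return cv_percent * mean / 100

# Illustration 23: C.V. and SD known, mean required
print(round(mean_from_cv(58, 21.2), 2))
print(round(mean_from_cv(69, 15.6), 2))

# Illustration 24: C.V. and mean known, SD required
print(sd_from_cv(75, 20))
print(sd_from_cv(90, 20))
```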
USES OF STANDARD DEVIATION:
Standard deviation is regarded as the best measure of dispersion. It is a widely used statistic because it possesses most of the characteristics of an ideal measure of dispersion. It is widely used in sampling theory and by biologists. It is also used in computing the coefficient of correlation and in the study of symmetrical frequency distributions.

Merits of standard deviation:
1. It is rigidly defined, its value is always definite, and it is based on all the observations.
2. As it is based on the arithmetic mean, it has all the merits of the arithmetic mean.
3. It is the most important and most widely used measure of dispersion.
4. It is amenable to further algebraic treatment.
5. It is less affected by fluctuations of sampling and is hence stable.
6. It is the basis for measuring the coefficient of correlation, and for sampling and statistical inference.
7. The coefficient of variation, which is considered the most appropriate method for comparing the variability of two or more distributions, is based on the mean and the standard deviation.

Demerits:
1. It is not easy to understand and it is difficult to calculate.
2. It gives more weight to extreme values, because the deviations are squared.
3. It is affected by the value of every item in the series.
4. As it is an absolute measure of variability, it cannot by itself be used to compare series; the coefficient of variation is used for that purpose.
5. It has not found favour with economists and businessmen.

THEORETICAL QUESTIONS (5, 10 & 15 Marks)
1. Why is standard deviation considered to be the most popular measure of dispersion?
2. What is standard deviation? State its merits and demerits.
3. What is coefficient of variation? What purpose does it serve?

PRACTICAL PROBLEMS
1. Calculate the mean and standard deviation from the following data.

Wages in Rs       40-50    50-60    60-70    70-80    80-90    90-100
No. of workers    12       9        8        5        7        9

[Answers: Mean = 67.6, SD = 18.31]
2. Calculate the standard deviation from the following data.

Monthly exp.      78-82  73-77  68-72  63-67  58-62  53-57  48-52  43-47  38-42  33-37  28-32
No. of workers    3      6      7      12     17     13     9      7      4      2      1

[Answer: SD = 11]

3. Find the mean, standard deviation and coefficient of variation from the following data.

Age under (years)    10    20    30    40    50     60     70     80
No. of persons       15    30    53    75    100    110    115    125

[Answers: Mean = 35.16, SD = 19.76, CV = 56.2%]

4. Find the mean, standard deviation and coefficient of variation from the following data.

Marks more than    0      10    20    30    40    50    60    70    80    90
No. of students    100    90    80    65    50    20    15    10    5     2

[Answers: Mean = 38.7, SD = 21.3, CV = 55%]

5. From the prices of shares of company A and company B given below, state which is more stable in value.

Share A price    55     54     52     53     56     58     52     50     51     49
Share B price    108    107    105    105    106    107    104    103    104    101

[Answers: A: Mean = 53, SD = 2.646, CV = 4.99%; B: Mean = 105, SD = 2, CV = 1.90%. Company B's share prices are more stable.]

6. From the following table of marks obtained by 10 students, find the coefficient of variation and determine the marks of which subject are more variable.

Statistics       25    50    45    30    70    42    36    38    34    60
M. Accounting    10    70    50    20    95    55    42    60    48    80

[Answers: CV for Statistics = 30.49%, CV for M. Accounting = 45.9%. Variation is greater in the marks of M. Accounting.]

7. The scores of two batsmen A and B in 20 innings are as under. Which of the two may be regarded as the more consistent batsman?

Scores               53    54    55    56    57    58    59    60    Total
No. of innings, A    2     0     0     4     3     5     3     3     20
No. of innings, B    1     2     3     6     3     3     2     0     20

[Answers: A: Mean = 57.4, SD = 1.96, CV = 3.4%; B: Mean = 56.2, SD = 1.6, CV = 2.86%. Batsman B can be considered more consistent.]

8. Samples of polythene bags from two manufacturers A and B are tested by a prospective buyer for bursting pressure, with the following results.

Bursting pressure in lbs (X)    Number of bags
                                Company A    Company B
Under 100                       50           100
100-104.9                       150          75
105-109.9                       120          125
110-114.9                       80           65
115-119.9                       130          135
120-124.9                       70           140
125-129.9                       150          60
130 & above                     50           100
Total                           800          800

[Answers: A: Mean = 114.64, SD = 10.57, CV = 9.22%; B: Mean = 115.75, SD = 11.095, CV = 9.64%]
9. The life of two types of lamps is given below. Find:
1. Which of the two makes has the higher average life?
2. If prices are the same for both, which type would you prefer to buy, and why? (Use CV.)

Life in hours (X)    Number of lamps manufactured
                     Company A    Company B
Up to 2000           150          100
2000-3999            150          175
4000-5999            120          125
6000-7999            80           65
8000-9999            130          135
10000-11999          170          140
12000 & above        200          60
Total                1000         800

[Answers: A: Mean = 7399.5, SD = 4308.13; B: Mean = 6549.5, SD = 3820.68]

10. The coefficients of variation of two series are 60% and 80% respectively. Their standard deviations are 20 and 16 respectively. What are their means?

[Answers: Mean of first series = 33.3, Mean of second series = 20]
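Practical problems 3 and 4 give cumulative ("under" or "more than") frequencies, which must first be converted into ordinary class frequencies before the mean and standard deviation can be computed. The Python sketch below shows one way to do this for the "under" data of problem 3. It is an illustration only: the function name `less_than_to_class_freqs` is hypothetical, and the code assumes that the first class is 0-10 and that all classes have a width of 10 years.

```python
from math import sqrt

def less_than_to_class_freqs(cumulative):
    """Convert 'less than' cumulative frequencies into class frequencies."""
    previous = [0] + cumulative[:-1]
    return [c - p for p, c in zip(previous, cumulative)]

# Practical problem 3: number of persons with age under each limit
upper_limits = [10, 20, 30, 40, 50, 60, 70, 80]
cumulative = [15, 30, 53, 75, 100, 110, 115, 125]
width = 10  # assumption: equal classes 0-10, 10-20, ..., 70-80

freqs = less_than_to_class_freqs(cumulative)
mids = [u - width / 2 for u in upper_limits]

n = sum(freqs)
mean = sum(f * x for f, x in zip(freqs, mids)) / n
sd = sqrt(sum(f * (x - mean) ** 2 for f, x in zip(freqs, mids)) / n)
cv = sd / mean * 100

print("Class frequencies:", freqs)
print(f"Mean = {mean:.2f}, SD = {sd:.2f}, CV = {cv:.1f}%")
```

For "more than" data, as in problem 4, the same idea applies in the other direction: each class frequency is the cumulative figure minus the next one.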