Module 5

UNIT – 5: MEASURES OF DISPERSION

Measures of central tendency describe a frequency distribution by a single significant value, the average. An average indicates the general level of magnitude of the distribution, but it shows nothing further about it. It does not tell the full story and is hardly representative of a mass of data unless we know how the individual items scatter around it. A further description of the series is necessary to gauge the worth of the average.

Thus, several series may have the same average while their values differ widely in magnitude, so the central tendency calculated from such values may not be a typical representative in many cases.

To see the extent of spread about these averages, that is, the variation of the items, consider the following examples.

Example 1

         A series   B series
            100         50
            110        150
            120        100
            130        100
            140        200
Total       600        600
Mean        120        120

Example 2

         A series                 B series
           x    d = (x - mean)      x    d = (x - mean)
          100       -20            300       -20
          110       -10            310       -10
          120         0            320         0
          130        10            330        10
          140        20            340        20
Total     600                     1600
Mean      120                      320

In Example 1, although the two series are formed differently, their averages are the same. In Example 2, although the means differ, the deviations of the individual items from the mean ($\bar{x}$) are the same. One should therefore not hurriedly conclude that series A and B are alike because their means are equal, or unlike because their means differ. Averages by themselves are not sufficient indicators of all the characteristics of a given set of data; the data must be subjected to further statistical analysis. The measures of dispersion are that further step.

MEANING OF MEASURES OF DISPERSION: Dispersion is an important measure for describing the variability of data. It shows how far, on average, the individual values lie from the representative value. The average is derived from the actual values, whereas dispersion is obtained by averaging the deviations from the representative value.

Definitions:

"Dispersion is the measure of the variation of the items." - A. L. Bowley

"The degree to which numerical data tend to spread about an average value is called the variation or dispersion of the data." - Spiegel


The definitions given above focus on variation. To understand the amount of variation present in a given set of values, the size of the variation must be measured and expressed as a number. This is known as a measure of dispersion.

Objectives of measures of dispersion
The following are the main purposes of measuring dispersion.

1. To test the reliability of an average. A measure of variation is the only means of testing how representative an average is. If the scatter is large, the average is less reliable; if the scatter is small, the average is a typical value.

2. To serve as a basis for control of variability. Measures of dispersion are indispensable for determining the nature of variation and finding its causes. When these are known, it is easier to control the variation itself.

3. To compare two or more series with regard to their variability. The degree of uniformity or consistency of data can be found through measures of dispersion. When comparing the reliability of the averages of two series, due consideration should be given to dispersion, which is a good basis for comparison.

4. To serve as a basis for further statistical analysis. Measures of dispersion are essential for studying many other statistical tools.

Requisites of a good measure of dispersion
A good measure of dispersion should have the following properties.

1. It should be simple to understand and rigidly defined.
2. It should be easy to compute.
3. It should be based on all items.
4. It should be free from sampling fluctuations.
5. It should be capable of further algebraic treatment.
6. It should remain unaffected by extreme items.

Methods of Measuring Dispersion
The following are the important methods of studying variation.

1. Range
2. Inter-quartile Range and Quartile Deviation
3. Mean Deviation (not included in your syllabus)
4. Standard Deviation

Range:
The range is the simplest measure of dispersion. It is a rough measure, because it depends only on the two extreme items and not on all the items. It tells us nothing about the distribution of the values in the series relative to a typical value.

Range = Largest value - Smallest value, i.e. $R = L - S$

Coefficient of Range $= \frac{L - S}{L + S}$

To compare two or more series, this relative measure of dispersion is used.

Computation of Range
Individual series

ILLUSTRATION = 01
The net profit of a business concern, in thousands of Rs, is given below.

Year     1996  1997  1998  1999  2000  2001  2002
Profit    100   160   150   220   300   190   200

Find the range and its coefficient.

SOLUTION
Largest item (L) = 300
Smallest item (S) = 100
Range = L - S = 300 - 100 = Rs 200 thousand

Coefficient of Range $= \frac{L - S}{L + S} = \frac{300 - 100}{300 + 100} = \frac{200}{400} = 0.5$ or 50%
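The computation in Illustration 01 can also be sketched in a few lines of Python. This is only an illustrative sketch: the function name `range_and_coefficient` is an assumption made for this example, and the coefficient is undefined when L + S equals zero.

```python
def range_and_coefficient(values):
    """Return (range, coefficient of range) for a list of numbers."""
    largest = max(values)      # L
    smallest = min(values)     # S
    value_range = largest - smallest
    # Relative measure: (L - S) / (L + S); fails if L + S == 0.
    coefficient = value_range / (largest + smallest)
    return value_range, coefficient


# Net profits (in thousands of Rs) from Illustration 01.
profits = [100, 160, 150, 220, 300, 190, 200]
r, coeff = range_and_coefficient(profits)
print(r, coeff)
```

For a continuous series the same function can be applied to the lower limit of the lowest class and the upper limit of the highest class.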

Discrete Series

ILLUSTRATION = 02
Find the range and its coefficient for the following data.

Size         3   5   7   9  11  13
Frequency   11  15  13  19  14   2

Solution:
Largest value (L) = 13
Smallest value (S) = 3
Range = L - S = 13 - 3 = 10

Coefficient of Range $= \frac{L - S}{L + S} = \frac{13 - 3}{13 + 3} = \frac{10}{16} = 0.625$

Continuous Series

ILLUSTRATION = 03
Find the range and its coefficient for the following frequency distribution.

SBE         0-10  10-20  20-30  30-40  40-50
Frequency     1      3     12      6      3

Solution:
Largest value (L) = upper limit of the highest class interval = 50
Smallest value (S) = lower limit of the lowest class interval = 0
Range = L - S = 50 - 0 = 50

Coefficient of Range $= \frac{L - S}{L + S} = \frac{50 - 0}{50 + 0} = \frac{50}{50} = 1$ or 100%

ILLUSTRATION = 04
Calculate the range and its coefficient from the following data.

Size        1-10  11-20  21-30  31-40  41-50
Frequency     3      7     20     13      6

Solution:
Convert the inclusive class intervals into exclusive form.

Class       0.5-10.5  10.5-20.5  20.5-30.5  30.5-40.5  40.5-50.5
Frequency       3         7         20         13          6

Largest value (L) = 50.5
Smallest value (S) = 0.5
Range = L - S = 50.5 - 0.5 = 50

Coefficient of Range $= \frac{L - S}{L + S} = \frac{50.5 - 0.5}{50.5 + 0.5} = \frac{50}{51} = 0.98$

Uses of Range:

1. Range is used in industry for the statistical quality control of manufactured products, through the construction of control charts.
2. Range is useful in studying the variations in the prices of stocks, shares and other commodities that are sensitive to price changes from one period to another.
3. The meteorological department uses the range for weather forecasts.

Merits
1. It is simple to compute and understand.
2. It gives a rough but quick answer.
3. When the items are few, as in the case of sample lots for quality control purposes, the method is quite handy.
4. It is rigidly defined.

Demerits
1. It is not reliable, because it is affected by the extreme items.
2. It cannot be applied to open-end classes.
3. Range is too indefinite to be used as a practical measure of dispersion.

QUARTILE DEVIATION OR SEMI-INTER-QUARTILE RANGE
The semi-inter-quartile range, or quartile deviation, is defined as half the distance between the third and the first quartiles. Symbolically:

Semi-inter-quartile range or Quartile Deviation (Q.D.) $= \frac{Q_3 - Q_1}{2}$

This means that the items below the lower quartile and the items above the upper quartile are not included in the computation at all. We consider only the middle half of the distribution. The inter-quartile range so obtained is halved to give the quartile deviation.

In a symmetrical distribution the quartile deviation gives the amount by which each of the two quartiles differs from the median. Because it is built from partition values (the quartiles) and not from the deviations of all the items, it is a measure of partition rather than a measure of dispersion in the full sense. The smaller the value of Q.D., the smaller the dispersion of the middle half of the distribution around the median. However, it gives no indication of the degree of dispersion lying beyond the two quartiles.

Quartile deviation is an absolute measure of dispersion. The relative measure of dispersion, known as the coefficient of quartile deviation, is calculated as follows:

Coefficient of Quartile Deviation $= \frac{Q_3 - Q_1}{Q_3 + Q_1}$

Quartile deviation is an improvement over the range, as it is calculated not from the extreme items but from the quartiles.

Computation of Quartile Deviation and its Coefficient
Individual series

ILLUSTRATION = 05
15 students of a class obtained the following marks in statistics. Calculate the quartile deviation and its coefficient.

Marks: 15, 20, 20, 21, 22, 22, 24, 25, 28, 28, 29, 30, 32, 33, 35

Solution
Marks arranged in ascending order:

Sl. No.   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
Marks    15  20  20  21  22  22  24  25  28  28  29  30  32  33  35

$Q_1$ = size of the $\frac{N+1}{4}$th item = size of the $\frac{15+1}{4}$th item = 4th item = 21

$Q_3$ = size of the $\frac{3(N+1)}{4}$th item = size of the $\frac{3(15+1)}{4}$th item = 12th item = 30

Quartile Deviation $= \frac{Q_3 - Q_1}{2} = \frac{30 - 21}{2} = 4.5$

Coefficient of Q.D. $= \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{30 - 21}{30 + 21} = \frac{9}{51} = 0.176$
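The individual-series procedure of Illustration 05 can be sketched in Python as follows. The helper names `item_at` and `quartile_deviation` are assumptions made for this example. The illustration only needs whole-numbered positions (the 4th and 12th items); the linear interpolation used here for fractional positions is an added assumption, not something the text prescribes.

```python
def item_at(sorted_values, position):
    """Value at a 1-based position in an ordered series.

    Fractional positions are linearly interpolated between the two
    neighbouring items (an assumption; the text shows whole positions only).
    """
    n = len(sorted_values)
    position = min(max(position, 1), n)   # keep the position inside the series
    lower = int(position)                 # whole part of the position
    fraction = position - lower
    if fraction == 0 or lower == n:
        return sorted_values[lower - 1]
    below, above = sorted_values[lower - 1], sorted_values[lower]
    return below + fraction * (above - below)


def quartile_deviation(values):
    """Return (Q1, Q3, Q.D., coefficient of Q.D.) for an individual series."""
    ordered = sorted(values)
    n = len(ordered)
    q1 = item_at(ordered, (n + 1) / 4)        # size of the (N+1)/4 th item
    q3 = item_at(ordered, 3 * (n + 1) / 4)    # size of the 3(N+1)/4 th item
    return q1, q3, (q3 - q1) / 2, (q3 - q1) / (q3 + q1)


# Marks from Illustration 05.
marks = [15, 20, 20, 21, 22, 22, 24, 25, 28, 28, 29, 30, 32, 33, 35]
q1, q3, qd, coeff = quartile_deviation(marks)
print(q1, q3, qd, round(coeff, 3))
```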

Discrete series

ILLUSTRATION = 06
Find the quartile deviation and its coefficient from the following.

Age in years      15  16  17  18  19  20  21
No. of students    4   6  10  15  12   9   4

Solution

Age in years (x)   No. of students (f)    cf
      15                   4               4
      16                   6              10
      17                  10              20   <- Q1
      18                  15              35
      19                  12              47   <- Q3
      20                   9              56
      21                   4              60
                        N = 60

$Q_1$ = size of the $\frac{N+1}{4}$th item = size of the $\frac{60+1}{4}$ = 15.25th item, so $Q_1 = 17$

$Q_3$ = size of the $\frac{3(N+1)}{4}$th item = size of the $\frac{3(60+1)}{4}$ = 45.75th item, so $Q_3 = 19$

Quartile Deviation $= \frac{Q_3 - Q_1}{2} = \frac{19 - 17}{2} = \frac{2}{2} = 1$

Coefficient of Quartile Deviation $= \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{19 - 17}{19 + 17} = \frac{2}{36} = 0.0556$

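Continuous Series

ILLUSTRATION = 07
Calculate the quartile deviation and its coefficient from the following data.

Wages in Rs     120-130  130-140  140-150  150-160  160-170  170-180  180-190  190-200
No. of workers     10       20       30       40       30       20       15        5

Solution

Wages (x)    f     cf
120-130     10     10
130-140     20     30
140-150     30     60   <- Q1 class
150-160     40    100
160-170     30    130   <- Q3 class
170-180     20    150
180-190     15    165
190-200      5    170
          N = 170

In the formulas below, $L_1$ and $L_2$ are the lower and upper limits of the quartile class, $f$ is its frequency and $c$ is the cumulative frequency of the class preceding it.

$Q_1 = L_1 + \frac{L_2 - L_1}{f}(q_1 - c)$, where $q_1 = \frac{N}{4} = \frac{170}{4} = 42.5$

$Q_1 = 140 + \frac{150 - 140}{30}(42.5 - 30) = 140 + \frac{10 \times 12.5}{30} = 144.167$

$Q_3 = L_1 + \frac{L_2 - L_1}{f}(q_3 - c)$, where $q_3 = \frac{3N}{4} = \frac{3 \times 170}{4} = 127.5$

$Q_3 = 160 + \frac{170 - 160}{30}(127.5 - 100) = 169.167$

Quartile Deviation $= \frac{Q_3 - Q_1}{2} = \frac{169.167 - 144.167}{2} = 12.5$

Coefficient of Q.D. $= \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{169.167 - 144.167}{169.167 + 144.167} = \frac{25}{313.334} = 0.0798$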
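The interpolation formula used for a continuous series can be sketched in Python as below. The function name `grouped_quartile` and the representation of each class as a `(lower, upper, frequency)` tuple are assumptions made for this example; the arithmetic follows $Q = L_1 + \frac{L_2 - L_1}{f}(q - c)$ as used in Illustration 07.

```python
def grouped_quartile(classes, k):
    """k-th quartile (k = 1, 2 or 3) of a continuous frequency distribution.

    classes: list of (lower_limit, upper_limit, frequency) in ascending order,
    with exclusive class limits.
    """
    total = sum(f for _, _, f in classes)
    target = k * total / 4            # q_k = kN/4
    cumulative = 0                    # c: cumulative frequency before the class
    for lower, upper, f in classes:
        if f > 0 and cumulative + f >= target:
            # L1 + (L2 - L1) / f * (q - c)
            return lower + (upper - lower) / f * (target - cumulative)
        cumulative += f
    raise ValueError("no class contains the requested quartile")


# Wage distribution from Illustration 07.
wages = [(120, 130, 10), (130, 140, 20), (140, 150, 30), (150, 160, 40),
         (160, 170, 30), (170, 180, 20), (180, 190, 15), (190, 200, 5)]
q1 = grouped_quartile(wages, 1)
q3 = grouped_quartile(wages, 3)
qd = (q3 - q1) / 2
coeff = (q3 - q1) / (q3 + q1)
print(round(q1, 3), round(q3, 3), round(qd, 3), round(coeff, 4))
```

Inclusive or open-end classes would first have to be converted to exclusive form, as the later illustrations do by hand.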


ILLUSTRATION = 08
Calculate the quartile deviation and its coefficient from the data given below.

Mid value    1   2   3   4   5   6   7   8   9  10
Frequency    2   9  11  14  20  24  20  16   5   3

Solution: Form the lower and upper class limits from the given mid values.

x            f     cf
0.5-1.5      2      2
1.5-2.5      9     11
2.5-3.5     11     22
3.5-4.5     14     36   <- Q1 class
4.5-5.5     20     56
5.5-6.5     24     80
6.5-7.5     20    100   <- Q3 class
7.5-8.5     16    116
8.5-9.5      5    121
9.5-10.5     3    124
          N = 124

$Q_1 = L_1 + \frac{L_2 - L_1}{f}(q_1 - c)$, where $q_1 = \frac{N}{4} = \frac{124}{4} = 31$

$Q_1 = 3.5 + \frac{4.5 - 3.5}{14}(31 - 22) = 3.5 + \frac{1 \times 9}{14} = 4.14$

$Q_3 = L_1 + \frac{L_2 - L_1}{f}(q_3 - c)$, where $q_3 = \frac{3N}{4} = \frac{3 \times 124}{4} = 93$

$Q_3 = 6.5 + \frac{7.5 - 6.5}{20}(93 - 80) = 6.5 + \frac{1 \times 13}{20} = 7.15$

Q.D. $= \frac{Q_3 - Q_1}{2} = \frac{7.15 - 4.14}{2} = 1.505$

Coefficient of Q.D. $= \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{7.15 - 4.14}{7.15 + 4.14} = \frac{3.01}{11.29} = 0.267$

ILLUSTRATION = 09
Calculate the quartile deviation and its coefficient from the data given below.

Wages in Rs     Below 100  100-109.5  110-119.5  120-129.5  130-139.5  140-149.5  150-159.5  160 & above
No. of workers      12         18         24         16         30         20         15          5

Solution: Convert the given open-end, inclusive series into exclusive form.

x                  f     cf
89.75-99.75       12     12
99.75-109.75      18     30
109.75-119.75     24     54   <- Q1 class (q1 = 35)
119.75-129.75     16     70
129.75-139.75     30    100
139.75-149.75     20    120   <- Q3 class (q3 = 105)
149.75-159.75     15    135
159.75-169.75      5    140
               N = 140

$Q_1 = L_1 + \frac{L_2 - L_1}{f}(q_1 - c)$, where $q_1 = \frac{N}{4} = \frac{140}{4} = 35$

$Q_1 = 109.75 + \frac{119.75 - 109.75}{24}(35 - 30) = 109.75 + \frac{10 \times 5}{24} = 109.75 + \frac{50}{24} = 111.83$

$Q_3 = L_1 + \frac{L_2 - L_1}{f}(q_3 - c)$, where $q_3 = \frac{3N}{4} = \frac{3 \times 140}{4} = 105$

$Q_3 = 139.75 + \frac{149.75 - 139.75}{20}(105 - 100) = 139.75 + \frac{10 \times 5}{20} = 139.75 + \frac{50}{20} = 142.25$

Q.D. $= \frac{Q_3 - Q_1}{2} = \frac{142.25 - 111.83}{2} = 15.21$

Coefficient of Q.D. $= \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{142.25 - 111.83}{142.25 + 111.83} = \frac{30.42}{254.08} = 0.120$

ILLUSTRATION = 10
Calculate the quartile deviation and its coefficient from the data given below.

Marks below        50   55   60   65   70   75   80   85   90   95  100
No. of students    15   31   48   70  102  130  148  170  185  190  200

Solution
Convert the given cumulative frequency table into an ordinary frequency table.

x          f     cf
45-50     15     15
50-55     16     31
55-60     17     48
60-65     22     70   <- Q1 class
65-70     32    102
70-75     28    130
75-80     18    148
80-85     22    170   <- Q3 class
85-90     15    185
90-95      5    190
95-100    10    200
        N = 200

$Q_1 = L_1 + \frac{L_2 - L_1}{f}(q_1 - c)$, where $q_1 = \frac{N}{4} = \frac{200}{4} = 50$

$Q_1 = 60 + \frac{65 - 60}{22}(50 - 48) = 60 + \frac{5 \times 2}{22} = 60 + \frac{10}{22} = 60.45$

$Q_3 = L_1 + \frac{L_2 - L_1}{f}(q_3 - c)$, where $q_3 = \frac{3N}{4} = \frac{3 \times 200}{4} = 150$

$Q_3 = 80 + \frac{85 - 80}{22}(150 - 148) = 80 + \frac{5 \times 2}{22} = 80 + \frac{10}{22} = 80.45$

Q.D. $= \frac{Q_3 - Q_1}{2} = \frac{80.45 - 60.45}{2} = 10$

Coefficient of Q.D. $= \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{80.45 - 60.45}{80.45 + 60.45} = \frac{20}{140.90} = 0.142$

ILLUSTRATION = 11
Calculate the quartile deviation and its coefficient from the data given below.

Wages above (in Rs)     1200  1400  1600  1800  2000  2200  2400  2600  2800  3000
Cumulative frequency     100    95    85    72    61    54    43    30    16    10

Solution: Convert the given "more than" cumulative frequency distribution into an ordinary frequency table.

x              f     cf
1200-1400      5      5
1400-1600     10     15
1600-1800     13     28   <- Q1 class
1800-2000     11     39
2000-2200      7     46
2200-2400     11     57
2400-2600     13     70
2600-2800     14     84   <- Q3 class
2800-3000      6     90
3000-3200     10    100
            N = 100

$Q_1 = L_1 + \frac{L_2 - L_1}{f}(q_1 - c)$, where $q_1 = \frac{N}{4} = \frac{100}{4} = 25$

$Q_1 = 1600 + \frac{1800 - 1600}{13}(25 - 15) = 1600 + \frac{200 \times 10}{13} = 1600 + \frac{2000}{13} = 1753.846$

$Q_3 = L_1 + \frac{L_2 - L_1}{f}(q_3 - c)$, where $q_3 = \frac{3N}{4} = \frac{3 \times 100}{4} = 75$

$Q_3 = 2600 + \frac{2800 - 2600}{14}(75 - 70) = 2600 + \frac{200 \times 5}{14} = 2671.43$

Q.D. $= \frac{Q_3 - Q_1}{2} = \frac{2671.43 - 1753.846}{2} = 458.79$

Coefficient of Q.D. $= \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{2671.43 - 1753.85}{2671.43 + 1753.85} = \frac{917.58}{4425.28} = 0.207$
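The conversion step used in Illustrations 10 and 11, from a cumulative table to an ordinary frequency table, amounts to taking successive differences. A minimal Python sketch is given below; the function names `from_less_than` and `from_more_than` are assumptions made for this example.

```python
def from_less_than(cumulative):
    """Ordinary frequencies from 'less than' cumulative frequencies."""
    previous = [0] + cumulative[:-1]
    # Each class frequency is its cumulative total minus the one before it.
    return [c - p for c, p in zip(cumulative, previous)]


def from_more_than(cumulative):
    """Ordinary frequencies from 'more than' cumulative frequencies."""
    following = cumulative[1:] + [0]
    # Each class frequency is its cumulative total minus the one after it.
    return [c - nxt for c, nxt in zip(cumulative, following)]


less_than = [15, 31, 48, 70, 102, 130, 148, 170, 185, 190, 200]   # Illustration 10
more_than = [100, 95, 85, 72, 61, 54, 43, 30, 16, 10]             # Illustration 11
print(from_less_than(less_than))
print(from_more_than(more_than))
```

The two lists printed match the f columns of the tables in the two illustrations.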

Merits of Quartile Deviation
1. It is easy to calculate and simple to understand.
2. It is rigidly defined.
3. It gives an idea of the variation of the central 50% of the observations.
4. It is not unduly affected by extreme values.

Demerits
1. It is not based on all the observations.
2. It is not capable of further algebraic treatment.
3. It is much affected by sampling fluctuations.
4. It is necessary to arrange the data in ascending order.


TERMINAL QUESTIONS (5, 10 & 15 Marks)
1. Define the term dispersion. State the objectives of measuring dispersion.
2. What is meant by dispersion? Explain the different methods of measuring dispersion.
3. Briefly explain the relative merits and demerits of the various measures of dispersion.
4. What is range? Explain its merits and demerits.
5. What is quartile deviation? State its merits and demerits.
6. Find the quartile deviation and its coefficient from the following data.

Weight in kg    10-12  12-14  14-16  16-18  18-20  20-22  22-24
No. of boxes      6      9     20     25     24     15      5

[Answer: Q.D. = 2.09, coefficient of Q.D. = 0.12]

7. Calculate the quartile deviation and its coefficient from the following data.

Class        1-10  11-20  21-30  31-40  41-50  51-60  61-70
Frequency      6      8      6      9      9      6      6

[Answer: Q.D. = 15.67, coefficient of Q.D. = 0.46]

8. Calculate the quartile deviation and its coefficient from the following data.

Size more than   20  30  40  50  60  70
Frequency        68  63  40  40  18   7

[Answer: Q.D. = 12.845, coefficient = 0.27]

9. Calculate the quartile deviation and its coefficient from the following data.

Marks less than    10   20   30   40   50   60   70   80   90
No. of persons      5   15   98  242  367  405  425  438  439

[Answer: Q.D. = 8.08, coefficient = 0.21]


STANDARD DEVIATION

The concept of standard deviation was introduced by Karl Pearson in 1893. It is the most important measure of dispersion and is widely used in many statistical formulae. Standard deviation is also called root-mean-square deviation, because it is the square root of the mean of the squared deviations from the arithmetic mean. In this method the drawback of ignoring the algebraic signs, as is done in mean deviation, is overcome by squaring the deviations, thereby making all of them positive.

It is therefore defined as the positive square root of the arithmetic mean of the squares of the deviations of the given observations from their arithmetic mean. The standard deviation is denoted by the Greek letter $\sigma$ (sigma). The square of the standard deviation is known as the variance.

Features of standard deviation
The following are the features of standard deviation.
1. The standard deviation measures the absolute dispersion of a distribution. The greater the dispersion, the greater is the standard deviation.
2. A small standard deviation means a high degree of uniformity of the observations as well as homogeneity. A large standard deviation means the opposite.
3. If we have two or more comparable series with nearly identical means, the distribution with the smallest standard deviation has the most representative mean.
4. The standard deviation has the same unit as the original variable.

Computation of standard deviation
Individual series - direct method
Steps:
1. Calculate the arithmetic mean $\bar{x}$ of the series.
2. Obtain the deviations from the arithmetic mean: $d_x = x - \bar{x}$.
3. Square these deviations and find their sum $\sum d_x^2$.
4. Use the formula for standard deviation:

$\sigma = \sqrt{\frac{\sum d_x^2}{n}} = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$

where $\bar{x}$ = arithmetic mean, $\sum d_x^2$ = sum of the squares of the deviations from $\bar{x}$, and $n$ = number of items.

ILLUSTRATION = 01
Find the standard deviation of the monthly salaries of 10 persons given below.

Persons          A    B    C    D    E    F    G    H    I    J
Salaries in Rs  120  110  115  122  126  140  125  121  120  131

SOLUTION

Person   Salary in Rs (x)   d_x = x - 123   d_x^2
A             120                -3            9
B             110               -13          169
C             115                -8           64
D             122                -1            1
E             126                +3            9
F             140               +17          289
G             125                +2            4
H             121                -2            4
I             120                -3            9
J             131                +8           64
Total        1230                 0          622

$\bar{x} = \frac{\sum x}{n} = \frac{1230}{10} = 123$

$\sigma = \sqrt{\frac{\sum d_x^2}{n}} = \sqrt{\frac{622}{10}} = \sqrt{62.2} = 7.89$
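The four steps of the direct method can be sketched in Python as follows. The function name `standard_deviation` is an assumption made for this example. As in the text, the divisor is n (not n - 1), which corresponds to what the Python standard library calls the population standard deviation.

```python
import math


def standard_deviation(values):
    """Return (mean, standard deviation) using the direct method."""
    n = len(values)
    mean = sum(values) / n                           # step 1: arithmetic mean
    deviations = [x - mean for x in values]          # step 2: d_x = x - mean
    sum_of_squares = sum(d * d for d in deviations)  # step 3: sum of d_x^2
    return mean, math.sqrt(sum_of_squares / n)       # step 4: sqrt(sum / n)


# Monthly salaries from Illustration 01.
salaries = [120, 110, 115, 122, 126, 140, 125, 121, 120, 131]
mean, sd = standard_deviation(salaries)
print(mean, round(sd, 2))
```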

ILLUSTRATION = 02
Calculate the standard deviation from the following data.

Variable x:  10  15  20  12  8  4  15  16  25  35

SOLUTION

  x     d_x = x - 16   d_x^2
 10         -6           36
 15         -1            1
 20         +4           16
 12         -4           16
  8         -8           64
  4        -12          144
 15         -1            1
 16          0            0
 25         +9           81
 35        +19          361
Total 160    0          720

Mean $\bar{x} = \frac{\sum x}{n} = \frac{160}{10} = 16$

Standard deviation $\sigma = \sqrt{\frac{\sum d_x^2}{n}} = \sqrt{\frac{720}{10}} = \sqrt{72} = 8.485$

SHORT-CUT METHOD OR DEVIATIONS TAKEN FROM ASSUMED MEAN
This method is adopted when the arithmetic mean is a fractional value. Taking deviations from a fractional value would be a difficult and tedious task. To save time and labour we apply the short-cut method.
Steps:
1. Assume any one of the items in the series as the average (A).
2. Find the deviations from the assumed mean, $d_x = \frac{x - A}{C}$, where C is the common factor, if any.
3. Find the total of the deviations, $\sum d_x$.
4. Square the deviations and add up the squares, i.e. $\sum d_x^2$.
5. Use the formula

$\sigma = \sqrt{\frac{\sum d_x^2}{n} - \left(\frac{\sum d_x}{n}\right)^2} \times C$

ILLUSTRATION = 03

The table below gives the marks obtained by 10 students in a statistics examination. Calculate the standard deviation.

Sl. No.   1   2   3   4   5   6   7   8   9  10
Marks    43  48  65  57  31  60  37  48  78  59

Solution: A = 31, C = 1, $d_x = \frac{x - A}{C}$

Sl. No.   Marks (x)   d_x = x - 31   d_x^2
  1          43           12           144
  2          48           17           289
  3          65           34          1156
  4          57           26           676
  5          31            0             0
  6          60           29           841
  7          37            6            36
  8          48           17           289
  9          78           47          2209
 10          59           28           784
N = 10                   216          6424

$\sigma = \sqrt{\frac{\sum d_x^2}{n} - \left(\frac{\sum d_x}{n}\right)^2} \times C = \sqrt{\frac{6424}{10} - \left(\frac{216}{10}\right)^2} \times 1 = \sqrt{642.4 - (21.6)^2} = \sqrt{642.4 - 466.56} = \sqrt{175.84} = 13.26$

ILLUSTRATION = 04
Calculate the standard deviation from the following data.

Values (x):  58  59  60  54  65  66  52  75  69  52

Solution: A = 65, C = 1, $d_x = \frac{x - A}{C}$

Values (x)   d_x = x - 65   d_x^2
   58            -7            49
   59            -6            36
   60            -5            25
   54           -11           121
   65             0             0
   66             1             1
   52           -13           169
   75            10           100
   69             4            16
   52           -13           169
n = 10          -40           686

$\sigma = \sqrt{\frac{\sum d_x^2}{n} - \left(\frac{\sum d_x}{n}\right)^2} \times C = \sqrt{\frac{686}{10} - \left(\frac{-40}{10}\right)^2} \times 1 = \sqrt{68.6 - (-4)^2} = \sqrt{68.6 - 16} = \sqrt{52.6} = 7.25$

$\bar{x} = A + \frac{\sum d_x}{n} \times C = 65 + \frac{-40}{10} \times 1 = 61$

DISCRETE SERIES
Calculation of standard deviation

DIRECT METHOD OR ACTUAL MEAN METHOD
Steps:
1. Calculate the mean $\bar{x}$ of the series.
2. Find the deviations of the various items from the mean: $d_x = x - \bar{x}$.
3. Square the deviations ($d_x^2$) and multiply them by the respective frequencies to get $f d_x^2$.
4. Total the products to get $\sum f d_x^2$ and use the formula

$\sigma = \sqrt{\frac{\sum f d_x^2}{N}}$, where N = total frequency.

ILLUSTRATION = 05
Calculate the standard deviation from the following data.

Marks             10  20  30  40  50  60
No. of students    8  12  20  10   7   3

Solution

  x     f     fx    d_x = x - 30.8   d_x^2    f d_x^2
 10     8     80       -20.8         432.64   3461.12
 20    12    240       -10.8         116.64   1399.68
 30    20    600        -0.8           0.64     12.80
 40    10    400         9.2          84.64    846.40
 50     7    350        19.2         368.64   2580.48
 60     3    180        29.2         852.64   2557.92
N = 60      1850                              10858.40

$\bar{x} = \frac{\sum fx}{N} = \frac{1850}{60} = 30.83$ (the deviations in the table are taken from 30.8)

$\sigma = \sqrt{\frac{\sum f d_x^2}{N}} = \sqrt{\frac{10858.40}{60}} = \sqrt{180.97} = 13.45$
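The short-cut (assumed mean) formula applied in Illustrations 03 and 04 above can be sketched in Python as below. The function name `sd_assumed_mean` and its parameter names are assumptions made for this example. The sketch also checks a property that the method relies on: the result does not depend on which value is chosen as the assumed mean A.

```python
import math


def sd_assumed_mean(values, assumed_mean, common_factor=1):
    """Return (mean, standard deviation) by the assumed-mean method."""
    n = len(values)
    # d_x = (x - A) / C for every item
    d = [(x - assumed_mean) / common_factor for x in values]
    sum_d = sum(d)
    sum_d2 = sum(v * v for v in d)
    mean = assumed_mean + sum_d / n * common_factor
    sd = math.sqrt(sum_d2 / n - (sum_d / n) ** 2) * common_factor
    return mean, sd


# Marks from Illustration 03, first with A = 31 as in the text,
# then with a different assumed mean to show the answer is unchanged.
marks = [43, 48, 65, 57, 31, 60, 37, 48, 78, 59]
mean_a, sd_a = sd_assumed_mean(marks, 31)
mean_b, sd_b = sd_assumed_mean(marks, 50)
print(round(mean_a, 2), round(sd_a, 2))
print(round(mean_b, 2), round(sd_b, 2))
```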

ILLUSTRATION = 06

No. of accidents    0   1   2   3   4   5   6   7   8   9  10  11  12
Persons involved   16  16  21  10  16   8   4   2   1   2   2   0   2

Solution: deviations are taken from the mean, $d_x = x - 3$.

No. of accidents (x)   No. of persons (f)    fx    d_x   d_x^2   f d_x^2
        0                     16              0     -3      9      144
        1                     16             16     -2      4       64
        2                     21             42     -1      1       21
        3                     10             30      0      0        0
        4                     16             64      1      1       16
        5                      8             40      2      4       32
        6                      4             24      3      9       36
        7                      2             14      4     16       32
        8                      1              8      5     25       25
        9                      2             18      6     36       72
       10                      2             20      7     49       98
       11                      0              0      8     64        0
       12                      2             24      9     81      162
                           N = 100          300                    702

$\bar{x} = \frac{\sum fx}{N} = \frac{300}{100} = 3$

$\sigma = \sqrt{\frac{\sum f d_x^2}{N}} = \sqrt{\frac{702}{100}} = \sqrt{7.02} = 2.649$

SHORT-CUT METHOD OR ASSUMED MEAN METHOD
Steps:
1. Assume any one of the items in the series as the average (A).
2. Find the deviations from the assumed mean, after taking out the common factor C, if any: $d_x = \frac{x - A}{C}$.
3. Multiply the deviations by the respective frequencies and total them to get $\sum f d_x$.
4. Square the deviations to get $d_x^2$.
5. Multiply the squared deviations ($d_x^2$) by the respective frequencies and total them to get $\sum f d_x^2$.
6. Use the formula

$\sigma = \sqrt{\frac{\sum f d_x^2}{N} - \left(\frac{\sum f d_x}{N}\right)^2} \times C$

ILLUSTRATION = 07
Calculate the standard deviation for the following data.

Marks      5  6  7  8  9  10  11  12  13
Students   5  2  4  3  6   5   7   2   6

Solution: A = 9, C = 1, $d_x = \frac{x - A}{C}$

Marks (x)    f    d_x = x - 9   f d_x   d_x^2   f d_x^2
    5        5        -4         -20      16       80
    6        2        -3          -6       9       18
    7        4        -2          -8       4       16
    8        3        -1          -3       1        3
    9        6         0           0       0        0
   10        5         1           5       1        5
   11        7         2          14       4       28
   12        2         3           6       9       18
   13        6         4          24      16       96
         N = 40                   12              264

$\sigma = \sqrt{\frac{\sum f d_x^2}{N} - \left(\frac{\sum f d_x}{N}\right)^2} \times C = \sqrt{\frac{264}{40} - \left(\frac{12}{40}\right)^2} \times 1 = \sqrt{6.6 - 0.09} = \sqrt{6.51} = 2.551$

ILLUSTRATION = 08
Find the standard deviation for the following data.

Values      60  70  80  90  100  110  120
Frequency    3   6   9  13    8    5    4

Solution: A = 90, C = 10, $d_x = \frac{x - 90}{10}$, and $f d_x^2 = f d_x \times d_x$

   x      f    d_x   f d_x   f d_x^2
  60      3    -3     -9       27
  70      6    -2    -12       24
  80      9    -1     -9        9
  90     13     0      0        0
 100      8     1      8        8
 110      5     2     10       20
 120      4     3     12       36
      N = 48           0      124

$\sigma = \sqrt{\frac{\sum f d_x^2}{N} - \left(\frac{\sum f d_x}{N}\right)^2} \times C = \sqrt{\frac{124}{48} - \left(\frac{0}{48}\right)^2} \times 10 = \sqrt{2.583} \times 10 = 1.6072 \times 10 = 16.072$

CONTINUOUS SERIES

Calculation of standard deviation
In the case of a grouped series, find the mid values of the classes and proceed as in the case of a discrete series.

Direct Method

ILLUSTRATION = 09
Find the arithmetic mean and standard deviation for the following data.

Class       0-4  4-8  8-12  12-16  16-20
Frequency    2    3    10     3      2

Solution: deviations are taken from the mean, $d = m - \bar{x}$.

Class     f    Mid value (m)    fm    d = m - 10   d^2    f d^2
 0-4      2         2            4       -8         64     128
 4-8      3         6           18       -4         16      48
 8-12    10        10          100        0          0       0
12-16     3        14           42        4         16      48
16-20     2        18           36        8         64     128
       N = 20                  200        0                352

$\bar{x} = \frac{\sum fm}{N} = \frac{200}{20} = 10$

$\sigma = \sqrt{\frac{\sum f d^2}{N}} = \sqrt{\frac{352}{20}} = \sqrt{17.6} = 4.19$
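The direct method for a frequency distribution, as used in Illustrations 06 and 09 above, can be sketched in Python as follows. The function name `sd_frequency` is an assumption made for this example; for a continuous series the `values` argument holds the class mid values.

```python
import math


def sd_frequency(values, frequencies):
    """Return (mean, standard deviation) of a frequency distribution."""
    total = sum(frequencies)                                       # N
    mean = sum(f * x for x, f in zip(values, frequencies)) / total
    # Sum of f * d_x^2 with deviations taken from the actual mean.
    sum_fd2 = sum(f * (x - mean) ** 2 for x, f in zip(values, frequencies))
    return mean, math.sqrt(sum_fd2 / total)


# Illustration 06: number of accidents and persons involved.
accidents = list(range(13))
persons = [16, 16, 21, 10, 16, 8, 4, 2, 1, 2, 2, 0, 2]
mean_acc, sd_acc = sd_frequency(accidents, persons)

# Illustration 09: class mid values and frequencies.
mid_values = [2, 6, 10, 14, 18]
class_freq = [2, 3, 10, 3, 2]
mean_cls, sd_cls = sd_frequency(mid_values, class_freq)

print(mean_acc, round(sd_acc, 3))
print(mean_cls, round(sd_cls, 3))
```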

SHORT-CUT METHOD OR ASSUMED MEAN METHOD
In a continuous series the method of calculating the standard deviation is almost the same as in a discrete frequency distribution. The one additional step is to obtain the mid value of each class. The step-deviation method is widely used.

Formula for calculating the standard deviation:

$\sigma = \sqrt{\frac{\sum f d_x^2}{N} - \left(\frac{\sum f d_x}{N}\right)^2} \times C$

ILLUSTRATION = 10
Calculate the standard deviation from the following data.

Class x     0-10  10-20  20-30  30-40  40-50  50-60  60-70
Frequency     8     12     17     14      9      7      4

Solution: A = 35, C = 10, $d_x = \frac{x - A}{C} = \frac{x - 35}{10}$

Class     f    Mid x   d_x   f d_x   f d_x^2
 0-10     8      5     -3     -24       72
10-20    12     15     -2     -24       48
20-30    17     25     -1     -17       17
30-40    14     35      0       0        0
40-50     9     45      1       9        9
50-60     7     55      2      14       28
60-70     4     65      3      12       36
      N = 71                  -30      210

$\bar{x} = A + \frac{\sum f d_x}{N} \times C = 35 + \frac{-30}{71} \times 10 = 35 - \frac{300}{71} = 35 - 4.225 = 30.775$

$\sigma = \sqrt{\frac{\sum f d_x^2}{N} - \left(\frac{\sum f d_x}{N}\right)^2} \times C = \sqrt{\frac{210}{71} - \left(\frac{-30}{71}\right)^2} \times 10 = \sqrt{2.9577 - (-0.4225)^2} \times 10 = \sqrt{2.7792} \times 10 = 1.667 \times 10 = 16.67$
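The step-deviation method of Illustration 10 can be sketched in Python as below. The function name `sd_step_deviation` and the `(lower, upper, frequency)` tuple layout are assumptions made for this example.

```python
import math


def sd_step_deviation(classes, assumed_mean, class_width):
    """Return (mean, standard deviation) of a continuous series.

    classes: list of (lower_limit, upper_limit, frequency).
    """
    total = 0       # N
    sum_fd = 0      # sum of f * d_x
    sum_fd2 = 0     # sum of f * d_x^2
    for lower, upper, f in classes:
        mid = (lower + upper) / 2
        d = (mid - assumed_mean) / class_width   # d_x = (x - A) / C
        total += f
        sum_fd += f * d
        sum_fd2 += f * d * d
    mean = assumed_mean + sum_fd / total * class_width
    sd = math.sqrt(sum_fd2 / total - (sum_fd / total) ** 2) * class_width
    return mean, sd


# Frequency distribution from Illustration 10, with A = 35 and C = 10.
classes = [(0, 10, 8), (10, 20, 12), (20, 30, 17), (30, 40, 14),
           (40, 50, 9), (50, 60, 7), (60, 70, 4)]
mean, sd = sd_step_deviation(classes, 35, 10)
print(round(mean, 3), round(sd, 2))
```

Dividing the deviations by the class width keeps the intermediate numbers small, which is the point of the method when working by hand; the final multiplication by C restores the original units.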

ILLUSTRATION = 11
The following data relate to the ages of a group of workers. Calculate the arithmetic mean and standard deviation.

Age in years     20-25  25-30  30-35  35-40  40-45  45-50  50-55
No. of workers    170    110     80     45     40     30     25

Solution: A = 37.5, C = 5, $d_x = \frac{x - A}{C} = \frac{x - 37.5}{5}$, and $f d_x^2 = f d_x \times d_x$

Age        f     Mid x   d_x   f d_x   f d_x^2
20-25     170    22.5    -3    -510     1530
25-30     110    27.5    -2    -220      440
30-35      80    32.5    -1     -80       80
35-40      45    37.5     0       0        0
40-45      40    42.5     1      40       40
45-50      30    47.5     2      60      120
50-55      25    52.5     3      75      225
      N = 500                  -635     2435

$\bar{x} = A + \frac{\sum f d_x}{N} \times C = 37.5 + \frac{-635}{500} \times 5 = 31.15$

$\sigma = \sqrt{\frac{\sum f d_x^2}{N} - \left(\frac{\sum f d_x}{N}\right)^2} \times C = \sqrt{\frac{2435}{500} - \left(\frac{-635}{500}\right)^2} \times 5 = \sqrt{4.87 - 1.613} \times 5 = 1.805 \times 5 = 9.02$

ILLUSTRATION = 12
Calculate the arithmetic mean and standard deviation from the following data.

Wages more than   100  200  300  400  500  600  700  800
No. of workers    660  615  527  381  175   96   44   14

[K U V BBM 2002]

Solution: A = 450, C = 100, $d_x = \frac{x - A}{C} = \frac{x - 450}{100}$

x            f     Mid x   d_x   f d_x   f d_x^2
100-200      45     150    -3    -135      405
200-300      88     250    -2    -176      352
300-400     146     350    -1    -146      146
400-500     206     450     0       0        0
500-600      79     550     1      79       79
600-700      52     650     2     104      208
700-800      30     750     3      90      270
800-900      14     850     4      56      224
        N = 660                  -128     1684

$\bar{x} = A + \frac{\sum f d_x}{N} \times C = 450 + \frac{-128}{660} \times 100 = 450 - 19.39 = 430.61$

$\sigma = \sqrt{\frac{\sum f d_x^2}{N} - \left(\frac{\sum f d_x}{N}\right)^2} \times C = \sqrt{\frac{1684}{660} - \left(\frac{-128}{660}\right)^2} \times 100 = \sqrt{2.5515 - 0.0376} \times 100 = \sqrt{2.5139} \times 100 = 1.5855 \times 100 = 158.55$

ILLUSTRATION = 13
Calculate the arithmetic mean and standard deviation.

Age less than (years)   10  20  30  40   50   60   70   80
No. of workers          15  30  53  75  100  110  115  125

Solution: A = 35, C = 10, $d_x = \frac{x - A}{C} = \frac{x - 35}{10}$

x          f    Mid x   d_x   f d_x   f d_x^2
 0-10     15      5     -3     -45      135
10-20     15     15     -2     -30       60
20-30     23     25     -1     -23       23
30-40     22     35      0       0        0
40-50     25     45      1      25       25
50-60     10     55      2      20       40
60-70      5     65      3      15       45
70-80     10     75      4      40      160
      N = 125                    2      488

$\bar{x} = A + \frac{\sum f d_x}{N} \times C = 35 + \frac{2}{125} \times 10 = 35 + \frac{20}{125} = 35.16$

$\sigma = \sqrt{\frac{\sum f d_x^2}{N} - \left(\frac{\sum f d_x}{N}\right)^2} \times C = \sqrt{\frac{488}{125} - \left(\frac{2}{125}\right)^2} \times 10 = \sqrt{3.904 - (0.016)^2} \times 10 = \sqrt{3.904 - 0.000256} \times 10 = 1.976 \times 10 = 19.76$

ILLUSTRATION = 14
Calculate the arithmetic mean and standard deviation from the following data.

Income           Under 100  100-104.9  105-109.9  110-114.9  115-119.9  120-124.9  125-129.9
No. of workers       20         40         30         20         50         15          5

Solution: A = 112.45, C = 5, $d_x = \frac{x - A}{C}$

x             f     Mid x    d_x   f d_x   f d_x^2
95-99.9       20     97.45   -3     -60      180
100-104.9     40    102.45   -2     -80      160
105-109.9     30    107.45   -1     -30       30
110-114.9     20    112.45    0       0        0
115-119.9     50    117.45    1      50       50
120-124.9     15    122.45    2      30       60
125-129.9      5    127.45    3      15       45
         N = 180                    -75      525

$\bar{x} = A + \frac{\sum f d_x}{N} \times C = 112.45 + \frac{-75}{180} \times 5 = 112.45 - 2.08 = 110.37$

$\sigma = \sqrt{\frac{\sum f d_x^2}{N} - \left(\frac{\sum f d_x}{N}\right)^2} \times C = \sqrt{\frac{525}{180} - \left(\frac{-75}{180}\right)^2} \times 5 = \sqrt{2.9167 - (0.4167)^2} \times 5 = \sqrt{2.9167 - 0.1736} \times 5 = \sqrt{2.7431} \times 5 = 1.656 \times 5 = 8.28$


COEFFICIENT OF VARIATION = C.V.
The standard deviation is an absolute measure of dispersion. It is expressed in the units in which the original figures are collected and stated. The standard deviation of the heights of students cannot be compared with the standard deviation of their weights, as the two are expressed in different units. Therefore, the standard deviation must be converted into a relative measure of dispersion for the purpose of comparison. This relative measure is known as the coefficient of variation.

Variance: The square of the standard deviation is called the variance. Symbolically,

Variance $= \sigma^2$, so that $\sigma = \sqrt{\text{variance}}$

Coefficient of standard deviation $= \frac{\sigma}{\bar{x}}$

For better comparison, the coefficient of standard deviation is multiplied by 100, which gives the coefficient of variation:

Coefficient of variation $= \frac{\sigma}{\bar{x}} \times 100$

Prof. Karl Pearson suggested the coefficient of variation, and it is the most commonly used measure of relative variation. The series with the greater C.V. is the more variable, or less uniform, series. The series with the smaller C.V. is less variable, that is, more stable or more consistent.

COMPARISON OF TWO SERIES USING COEFFICIENT OF VARIATION

ILLUSTRATION = 15
The index numbers of prices of cotton and coal shares in a year are given below.

Month       Index no. of prices     Index no. of prices
            of cotton shares (x)    of coal shares (y)
January           188                     131
February          178                     130
March             173                     130
April             164                     129
May               172                     129
June              183                     129
July              185                     127
August            184                     127
September         211                     130
October           217                     137
November          232                     140
December          240                     142

Compare the variations in the prices of the two shares using the coefficient of variation.

Solution: Calculation of coefficient of variation

  x     d_x = x - 185   d_x^2      y     d_y = y - 127   d_y^2
 188         3             9      131         4            16
 178        -7            49      130         3             9
 173       -12           144      130         3             9
 164       -21           441      129         2             4
 172       -13           169      129         2             4
 183        -2             4      129         2             4
 185         0             0      127         0             0
 184        -1             1      127         0             0
 211        26           676      130         3             9
 217        32          1024      137        10           100
 232        47          2209      140        13           169
 240        55          3025      142        15           225
N = 12     107          7751                  57           549

Cotton shares:

$\bar{x} = A + \frac{\sum d_x}{n} \times C = 185 + \frac{107}{12} \times 1 = 193.916$

$\sigma_x = \sqrt{\frac{\sum d_x^2}{n} - \left(\frac{\sum d_x}{n}\right)^2} \times C = \sqrt{\frac{7751}{12} - \left(\frac{107}{12}\right)^2} \times 1 = \sqrt{645.92 - (8.916)^2} = \sqrt{645.92 - 79.495} = \sqrt{566.425} = 23.799$

C.V. $= \frac{\sigma_x}{\bar{x}} \times 100 = \frac{23.799}{193.91} \times 100 = 12.27\%$

Coal shares:

$\bar{y} = A + \frac{\sum d_y}{n} \times C = 127 + \frac{57}{12} \times 1 = 127 + 4.75 = 131.75$

$\sigma_y = \sqrt{\frac{\sum d_y^2}{n} - \left(\frac{\sum d_y}{n}\right)^2} \times C = \sqrt{\frac{549}{12} - \left(\frac{57}{12}\right)^2} \times 1 = \sqrt{45.75 - (4.75)^2} = \sqrt{45.75 - 22.5625} = \sqrt{23.1875} = 4.815$

C.V. $= \frac{\sigma_y}{\bar{y}} \times 100 = \frac{4.815}{131.75} \times 100 = 3.65\%$

Hence cotton shares are more variable in price than coal shares.
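The comparison in Illustration 15 can be sketched in Python as below. The function name `coefficient_of_variation` is an assumption made for this example, and the sketch computes the standard deviation by the direct method rather than from an assumed mean, which gives the same result.

```python
import math


def coefficient_of_variation(values):
    """Return the coefficient of variation (in per cent) of a series."""
    n = len(values)
    mean = sum(values) / n
    # Standard deviation with divisor n, as in the text.
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    return sd / mean * 100


# Index numbers of share prices from Illustration 15.
cotton = [188, 178, 173, 164, 172, 183, 185, 184, 211, 217, 232, 240]
coal = [131, 130, 130, 129, 129, 129, 127, 127, 130, 137, 140, 142]

cv_cotton = coefficient_of_variation(cotton)
cv_coal = coefficient_of_variation(coal)
print(round(cv_cotton, 2), round(cv_coal, 2))

# The series with the larger C.V. is the more variable one.
more_variable = "cotton" if cv_cotton > cv_coal else "coal"
print(more_variable)
```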

ILLUSTRATION = 16
Following are the runs scored by two batsmen, A and B. Find
a. Who is the better run getter?
b. Who is the more consistent batsman?

A   101  22   0  36  82  45   7  13  65  14
B    97  12  40  96  13   8  85   8  56  15

Solution: assumed means $A_x = 82$ and $A_y = 13$

Batsman A (x)   d_x = x - 82   d_x^2    Batsman B (y)   d_y = y - 13   d_y^2
    101              19          361         97              84         7056
     22             -60         3600         12              -1            1
      0             -82         6724         40              27          729
     36             -46         2116         96              83         6889
     82               0            0         13               0            0
     45             -37         1369          8              -5           25
      7             -75         5625         85              72         5184
     13             -69         4761          8              -5           25
     65             -17          289         56              43         1849
     14             -68         4624         15               2            4
Total              -435        29469                        300        21762

$\bar{x} = A_x + \frac{\sum d_x}{n} \times C = 82 + \frac{-435}{10} \times 1 = 38.5$

$\bar{y} = A_y + \frac{\sum d_y}{n} \times C = 13 + \frac{300}{10} \times 1 = 43$

x-series:

$\sigma_x = \sqrt{\frac{\sum d_x^2}{n} - \left(\frac{\sum d_x}{n}\right)^2} \times C = \sqrt{\frac{29469}{10} - (-43.5)^2} \times 1 = \sqrt{2946.9 - 1892.25} = \sqrt{1054.65} = 32.475$

C.V. $= \frac{\sigma_x}{\bar{x}} \times 100 = \frac{32.475}{38.5} \times 100 = 84.35\%$

y-series:

$\sigma_y = \sqrt{\frac{\sum d_y^2}{n} - \left(\frac{\sum d_y}{n}\right)^2} \times C = \sqrt{\frac{21762}{10} - \left(\frac{300}{10}\right)^2} \times 1 = \sqrt{2176.2 - 900} = \sqrt{1276.2} = 35.723$

C.V. $= \frac{\sigma_y}{\bar{y}} \times 100 = \frac{35.723}{43} \times 100 = 83.08\%$

Conclusion:
1. Batsman B is the better run getter, because he has scored 430 runs compared with batsman A's 385 runs.
2. Batsman B is more consistent, as his coefficient of variation is lower.

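COMPARISON OF TWO GROUPS OF DATA USING COEFFICIENT OF VARIATION

DISCRETE SERIES

ILLUSTRATION = 17
The goals scored by two teams, A and B, in football matches were as follows.

Goals                   0   1   2   3   4
No. of matches, Team A  27   9   8   4   5
No. of matches, Team B  17   9   6   5   3

Find which team is more consistent.

Solution: A = 2, $d_x = x - 2$

             Team A                          Team B
Goals (x)   f_A   d_x   f_A d_x   f_A d_x^2   f_B   f_B d_x   f_B d_x^2
    0        27    -2     -54       108        17     -34        68
    1         9    -1      -9         9         9      -9         9
    2         8     0       0         0         6       0         0
    3         4     1       4         4         5       5         5
    4         5     2      10        20         3       6        12
        N1 = 53           -49       141    N2 = 40    -32        94

Team A:

$\sigma_A = \sqrt{\frac{\sum f_A d_x^2}{N_1} - \left(\frac{\sum f_A d_x}{N_1}\right)^2} \times C = \sqrt{\frac{141}{53} - \left(\frac{-49}{53}\right)^2} \times 1 = \sqrt{2.6604 - (-0.9245)^2} = \sqrt{1.8056} = 1.3437$

$\bar{x}_A = A + \frac{\sum f_A d_x}{N_1} \times C = 2 + \frac{-49}{53} \times 1 = 2 - 0.9245 = 1.0755$

C.V. $= \frac{\sigma_A}{\bar{x}_A} \times 100 = \frac{1.3437}{1.0755} \times 100 = 124.94\%$

Team B:

$\sigma_B = \sqrt{\frac{\sum f_B d_x^2}{N_2} - \left(\frac{\sum f_B d_x}{N_2}\right)^2} \times C = \sqrt{\frac{94}{40} - \left(\frac{-32}{40}\right)^2} \times 1 = \sqrt{2.35 - (-0.8)^2} = \sqrt{1.71} = 1.308$

$\bar{x}_B = A + \frac{\sum f_B d_x}{N_2} \times C = 2 + \frac{-32}{40} \times 1 = 1.2$

C.V. $= \frac{\sigma_B}{\bar{x}_B} \times 100 = \frac{1.308}{1.2} \times 100 = 109\%$

Team B is more consistent, as it has less variation.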


CONTINUOUS SERIES

ILLUSTRATION = 18
The following data give the life of electric bulbs manufactured by two companies.
A. Which of the two makes has the higher average life?
B. If the prices of both bulbs are the same, which company's bulb would you prefer to buy, and why? Use C.V.

Life in '000 hrs (x)   Company A   Company B
      50-59               18          15
      60-69               22          24
      70-79               26          30
      80-89               25          18
      90-99                9          13
                         100         100

Solution
Computation of coefficient of variation for both series, with A = 74.5, C = 10, $d_x = \frac{x - 74.5}{10}$

x        Mid x   d_x   f_A   f_A d_x   f_A d_x^2   f_B   f_B d_x   f_B d_x^2
50-59    54.5    -2     18     -36        72        15     -30        60
60-69    64.5    -1     22     -22        22        24     -24        24
70-79    74.5     0     26       0         0        30       0         0
80-89    84.5     1     25      25        25        18      18        18
90-99    94.5     2      9      18        36        13      26        52
               N1 = 100        -15       155   N2 = 100    -10       154

Company A:

$\sigma_A = \sqrt{\frac{\sum f_A d_x^2}{N_1} - \left(\frac{\sum f_A d_x}{N_1}\right)^2} \times C = \sqrt{\frac{155}{100} - \left(\frac{-15}{100}\right)^2} \times 10 = \sqrt{1.55 - 0.0225} \times 10 = \sqrt{1.5275} \times 10 = 1.2359 \times 10 = 12.359$

$\bar{x}_A = A + \frac{\sum f_A d_x}{N_1} \times C = 74.5 + \frac{-15}{100} \times 10 = 74.5 - 1.5 = 73.0$

C.V. $= \frac{\sigma_A}{\bar{x}_A} \times 100 = \frac{12.359}{73} \times 100 = 16.93\%$

Company B:

$\sigma_B = \sqrt{\frac{\sum f_B d_x^2}{N_2} - \left(\frac{\sum f_B d_x}{N_2}\right)^2} \times C = \sqrt{\frac{154}{100} - \left(\frac{-10}{100}\right)^2} \times 10 = \sqrt{1.54 - 0.01} \times 10 = \sqrt{1.53} \times 10 = 1.2369 \times 10 = 12.369$

$\bar{x}_B = A + \frac{\sum f_B d_x}{N_2} \times C = 74.5 + \frac{-10}{100} \times 10 = 74.5 - 1 = 73.5$

C.V. $= \frac{\sigma_B}{\bar{x}_B} \times 100 = \frac{12.369}{73.5} \times 100 = 16.83\%$

Conclusion
1. Company B's bulbs have the higher average life, i.e. 73.5 thousand hours.
2. Company B's bulbs are more uniform and consistent in the life they give (lower C.V.), hence a buyer would prefer to buy company B's bulbs.

ILLUSTRATION =19

From the data given below state which of the two series is more variable. Use coefficient of variation. Variable 10-20 20-30 30-40 40-50 50-60 60-70Frequency A 10 18 32 40 22 18 Frequency B 18 22 40 32 18 10

103

Solution

Computation of the coefficient of variation (assumed mean A = 35, class interval C = 10, dx = (x - 35)/10).

Variable X    Mid x    dx    fA     fA·dx    fA·dx²    fB     fB·dx    fB·dx²
10-20         15       -2    10     -20       40       18     -36       72
20-30         25       -1    18     -18       18       22     -22       22
30-40         35        0    32       0        0       40       0        0
40-50         45        1    40      40       40       32      32       32
50-60         55        2    22      44       88       18      36       72
60-70         65        3    18      54      162       10      30       90
Total                        140    100      348       140     40      288

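$\bar{x}_A = A + \frac{\sum f_A d_x}{N_1} \times C = 35 + \frac{100}{140} \times 10 = 42.1429$

$\bar{x}_B = A + \frac{\sum f_B d_x}{N_2} \times C = 35 + \frac{40}{140} \times 10 = 37.8571$

A-Series

$\sigma_A = \sqrt{\frac{\sum f_A d_x^2}{N_1} - \left(\frac{\sum f_A d_x}{N_1}\right)^2} \times C = \sqrt{\frac{348}{140} - \left(\frac{100}{140}\right)^2} \times 10 = \sqrt{2.4857 - (0.7143)^2} \times 10 = \sqrt{1.9755} \times 10 = 14.055$

$\text{C.V.}_A = \frac{\sigma_A}{\bar{x}_A} \times 100 = \frac{14.055}{42.1429} \times 100 = 33.35\%$

B-Series

$\sigma_B = \sqrt{\frac{\sum f_B d_x^2}{N_2} - \left(\frac{\sum f_B d_x}{N_2}\right)^2} \times C = \sqrt{\frac{288}{140} - \left(\frac{40}{140}\right)^2} \times 10 = \sqrt{2.0571 - (0.2857)^2} \times 10 = \sqrt{1.9755} \times 10 = 14.055$

$\text{C.V.}_B = \frac{\sigma_B}{\bar{x}_B} \times 100 = \frac{14.055}{37.8571} \times 100 = 37.127\%$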

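Conclusion:- Series B is more variable, since its C.V. (37.127%) is higher than that of series A (33.35%). The two standard deviations are equal, so the difference arises entirely from the lower mean of series B.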

ILLUSTRATION = 20

Two brands of tyres are tested for their life and the following results were obtained. State which brand of tyres is more consistent.

Life in months          20-25    25-30    30-35    35-40    40-45
No. of tyres, brand X   1        22       64       10       3
No. of tyres, brand Y   3        21       74       1        1

Solution: Computation of the coefficient of variation.

Assumed mean A = 32.5, class interval C = 5, dx = (x - 32.5)/5.

                                   Brand X                  Brand Y
Life in months    Mid x    dx     f      f·dx    f·dx²     f      f·dx    f·dx²
20-25             22.5     -2      1      -2       4        3      -6      12
25-30             27.5     -1     22     -22      22       21     -21      21
30-35             32.5      0     64       0       0       74       0       0
35-40             37.5      1     10      10      10        1       1       1
40-45             42.5      2      3       6      12        1       2       4
Total                             100     -8      48       100    -24      38

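$\bar{x} = A + \frac{\sum f d_x}{N_1} \times C = 32.5 + \frac{-8}{100} \times 5 = 32.5 - 0.4 = 32.1$

$\bar{y} = A + \frac{\sum f d_x}{N_2} \times C = 32.5 + \frac{-24}{100} \times 5 = 32.5 - 1.20 = 31.3$

Brand X

$\sigma_x = \sqrt{\frac{\sum f d_x^2}{N_1} - \left(\frac{\sum f d_x}{N_1}\right)^2} \times C = \sqrt{\frac{48}{100} - \left(\frac{-8}{100}\right)^2} \times 5 = \sqrt{0.48 - 0.0064} \times 5 = 3.44$

$\text{C.V.}_x = \frac{\sigma_x}{\bar{x}} \times 100 = \frac{3.44}{32.1} \times 100 = 10.72\%$

Brand Y

$\sigma_y = \sqrt{\frac{\sum f d_x^2}{N_2} - \left(\frac{\sum f d_x}{N_2}\right)^2} \times C = \sqrt{\frac{38}{100} - \left(\frac{-24}{100}\right)^2} \times 5 = \sqrt{0.38 - 0.0576} \times 5 = 2.839$

$\text{C.V.}_y = \frac{\sigma_y}{\bar{y}} \times 100 = \frac{2.839}{31.30} \times 100 = 9.07\%$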

Brand Y tyres are more consistent than brand X tyres, since the C.V. of brand Y (9.07%) is lower than that of brand X (10.72%).
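As a cross-check on the step-deviation method, the same figures can be obtained by the direct method, working from the class mid-points without an assumed mean. The sketch below is illustrative only: `direct_mean_sd` is a name introduced here, and the population standard deviation (division by N) is assumed, as in the worked solution.

```python
from math import sqrt

def direct_mean_sd(midpoints, freqs):
    """Mean and population SD of grouped data computed directly from mid-points."""
    n = sum(freqs)
    mean = sum(f * m for f, m in zip(freqs, midpoints)) / n
    variance = sum(f * (m - mean) ** 2 for f, m in zip(freqs, midpoints)) / n
    return mean, sqrt(variance)

# Data of Illustration 20 (life in months)
midpoints = [22.5, 27.5, 32.5, 37.5, 42.5]
brands = {"X": [1, 22, 64, 10, 3], "Y": [3, 21, 74, 1, 1]}

for name, freqs in brands.items():
    mean, sd = direct_mean_sd(midpoints, freqs)
    cv = sd / mean * 100  # coefficient of variation in percent
    print(f"Brand {name}: mean = {mean:.1f}, SD = {sd:.3f}, CV = {cv:.2f}%")
```

Both methods agree because the step deviation only shifts and rescales the mid-points; the shift does not affect the spread, and the rescaling is undone when the result is multiplied by C.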

ILLUSTRATION = 21

In two factories A and B located in the same industrial area, the average weekly wages (in Rs) and the standard deviations are as follows.

Factory    Average    Standard deviation    No. of workers
A          345        5                     476
B          285        4.5                   524

Find:
1) Which factory, A or B, pays out the larger amount as weekly wages?
2) Which factory, A or B, has greater variability in individual wages?

Solution

1. Calculation of total weekly wage payment
Total wages paid by factory A = Rs 345 × 476 = Rs 164,220
Total wages paid by factory B = Rs 285 × 524 = Rs 149,340
Therefore, factory A pays out the larger amount as weekly wages.

2. Calculation of coefficient of variation
Factory A: $\text{C.V.} = \frac{\sigma}{\bar{x}} \times 100 = \frac{5}{345} \times 100 = 1.449\%$
Factory B: $\text{C.V.} = \frac{\sigma}{\bar{x}} \times 100 = \frac{4.5}{285} \times 100 = 1.579\%$
Factory B has greater variability in individual wages, since the C.V. of factory B is greater than the C.V. of factory A.
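When only summary figures are available, as in Illustration 21, both answers follow from two one-line formulas. The sketch below is a minimal illustration; the function name and the dictionary layout are assumptions made here, not part of the text.

```python
def coefficient_of_variation(mean, sd):
    """C.V. in percent: (standard deviation / mean) * 100."""
    return sd / mean * 100

# Summary data of Illustration 21 (weekly wages in Rs)
factories = {
    "A": {"mean": 345, "sd": 5.0, "workers": 476},
    "B": {"mean": 285, "sd": 4.5, "workers": 524},
}

for name, d in factories.items():
    total_wages = d["mean"] * d["workers"]  # total = average wage * number of workers
    cv = coefficient_of_variation(d["mean"], d["sd"])
    print(f"Factory {name}: total wages = Rs {total_wages}, CV = {cv:.3f}%")
```

Note that factory A has the larger standard deviation in absolute terms, yet factory B is the more variable one once each standard deviation is expressed relative to its own mean.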

ILLUSTRATION = 22

Particulars regarding the income of two villages are given below.

Particulars       Village A    Village B
Average income    1750         1860
Variance          100          81

State in which village the variation in income is greater.

Solution

Calculation of coefficient of variation


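1. Village A: $\sigma = \sqrt{\text{Variance}} = \sqrt{100} = 10$, so $\text{C.V.} = \frac{\sigma}{\bar{x}} \times 100 = \frac{10}{1750} \times 100 = 0.57\%$

2. Village B: $\sigma = \sqrt{\text{Variance}} = \sqrt{81} = 9$, so $\text{C.V.} = \frac{\sigma}{\bar{x}} \times 100 = \frac{9}{1860} \times 100 = 0.483\%$

Conclusion:- Village A has greater variation in income than village B.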

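ILLUSTRATION = 23

The coefficients of variation of two series are 58% and 69%. Their standard deviations are 21.2 and 15.6. What are their arithmetic means?

Solution

x-series

$\text{C.V.} = \frac{\sigma_x}{\bar{x}} \times 100 \;\Rightarrow\; 58 = \frac{21.2}{\bar{x}} \times 100 \;\Rightarrow\; \bar{x} = \frac{2120}{58} = 36.55$

y-series

$\text{C.V.} = \frac{\sigma_y}{\bar{y}} \times 100 \;\Rightarrow\; 69 = \frac{15.6}{\bar{y}} \times 100 \;\Rightarrow\; \bar{y} = \frac{1560}{69} = 22.6$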

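ILLUSTRATION = 24

The coefficients of variation of two series are 75% and 90%, and their arithmetic means are 20 and 20 respectively. Find their standard deviations.

x-series

$\text{C.V.} = \frac{\sigma_x}{\bar{x}} \times 100 \;\Rightarrow\; 75 = \frac{\sigma_x}{20} \times 100 \;\Rightarrow\; \sigma_x = \frac{75 \times 20}{100} = 15$

y-series

$\text{C.V.} = \frac{\sigma_y}{\bar{y}} \times 100 \;\Rightarrow\; 90 = \frac{\sigma_y}{20} \times 100 \;\Rightarrow\; \sigma_y = \frac{90 \times 20}{100} = 18$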
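Illustrations 23 and 24 both rearrange the single relation C.V. = (S.D. / mean) × 100 to solve for the missing quantity. A small sketch of the two rearrangements follows; the function names are introduced here purely for illustration.

```python
def mean_from_cv(sd, cv_percent):
    """Rearranged C.V. formula: mean = SD * 100 / C.V."""
    return sd * 100 / cv_percent

def sd_from_cv(mean, cv_percent):
    """Rearranged C.V. formula: SD = C.V. * mean / 100."""
    return cv_percent * mean / 100

# Illustration 23: S.D. and C.V. known, mean required
print(round(mean_from_cv(21.2, 58), 2))  # x-series
print(round(mean_from_cv(15.6, 69), 2))  # y-series

# Illustration 24: mean and C.V. known, S.D. required
print(sd_from_cv(20, 75))  # x-series
print(sd_from_cv(20, 90))  # y-series
```

In Illustration 24 the two means are equal, so the series with the higher C.V. necessarily has the higher standard deviation.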

USE OF STANDARD DEVIATION:
Standard deviation is the best measure of dispersion. It is a widely used statistic because it possesses most of the characteristics of an ideal measure of dispersion. It is widely used in sampling theory and by biologists. It is also used in computing the coefficient of correlation and in the study of symmetrical frequency distributions.

Merits of standard deviation:
1. It is rigidly defined, its value is always definite, and it is based on all the observations.
2. As it is based on the arithmetic mean, it has all the merits of the arithmetic mean.
3. It is the most important and widely used measure of dispersion.
4. It is capable of further algebraic treatment.
5. It is less affected by the fluctuations of sampling and hence is stable.
6. It is the basis for measuring the coefficient of correlation, sampling and statistical inference.
7. The coefficient of variation, which is based on the mean and the standard deviation, is considered the most appropriate method for comparing the variability of two or more distributions.

Demerits:
1. It is not easy to understand and it is difficult to calculate.
2. It gives more weight to extreme values, because the deviations are squared.
3. It is affected by the value of every item in the series.
4. As it is an absolute measure of variability, it cannot be used directly for the purpose of comparison.
5. It has not found favour with economists and businessmen.

THEORETICAL QUESTIONS (5, 10 & 15 Marks)
1. Why is standard deviation considered to be the most popular measure of dispersion?
2. What is standard deviation? State its merits and demerits.


3. What is the coefficient of variation? What purpose does it serve?

PRACTICAL PROBLEMS

1. Calculate the mean and standard deviation from the following data.

Wages in Rs       40-50    50-60    60-70    70-80    80-90    90-100
No. of workers    12       9        8        5        7        9

[Answers: Mean = 67.6, SD = 18.31]

2. Calculate the standard deviation from the following data.

Monthly exp.      78-82  73-77  68-72  63-67  58-62  53-57  48-52  43-47  38-42  33-37  28-32
No. of workers    3      6      7      12     17     13     9      7      4      2      1

[Answer: SD = 11]

3. Find the mean, standard deviation and coefficient of variation from the following data.

Age under (years)    10    20    30    40    50     60     70     80
No. of persons       15    30    53    75    100    110    115    125

[Answers: Mean = 35.16, SD = 19.76, CV = 56.2%]

4. Find the mean, standard deviation and coefficient of variation from the following data.

Marks more than    0      10    20    30    40    50    60    70    80    90
No. of students    100    90    80    65    50    20    15    10    5     2

[Answers: Mean = 38.7, SD = 21.3, CV = 55%]

5. From the prices of shares of company 'A' and company 'B' given below, state which is more stable in value.

Share A price    55     54     52     53     56     58     52     50     51     49
Share B price    108    107    105    105    106    107    104    103    104    101

[Answers: A: Mean = 53, SD = 2.646, CV = 4.99%; B: Mean = 105, SD = 2, CV = 1.90%]
(Company B's share prices are more stable.)

6. From the following table of marks obtained by 10 students, find the coefficient of variation and determine the marks of which subject are more variable.

Statistics       25    50    45    30    70    42    36    38    34    60
M. Accounting    10    70    50    20    95    55    42    60    48    80

[Answers: CV for Statistics = 30.49%, CV for M. Accounting = 45.9%]
(Variation is greater in the marks of M. Accounting.)

7. The scores of two batsmen A and B for 20 innings are as under. Which of the two may be regarded as the more consistent batsman?

Scores              53    54    55    56    57    58    59    60    Total
No. of innings A    2     0     0     4     3     5     3     3     20
No. of innings B    1     2     3     6     3     3     2     0     20

[Answers: A: Mean = 57.4, SD = 1.96, CV = 3.4%; B: Mean = 56.2, SD = 1.6, CV = 2.86%]
(Batsman B can be considered the more consistent.)

8. Samples of polythene bags from two manufacturers A and B are tested by a prospective buyer for bursting pressure, with the following results.

Bursting pressure in lbs (x)    Number of bags
                                Company A    Company B
Under 100                       50           100
100-104.9                       150          75
105-109.9                       120          125
110-114.9                       80           65
115-119.9                       130          135
120-124.9                       70           140
125-129.9                       150          60
130 & above                     50           100
Total                           800          800

(Answers: A: Mean = 114.64, SD = 10.57, CV = 9.22%; B: Mean = 115.75, SD = 11.095, CV = 9.64%)


9. The life of two types of lamps is given below. Find:
1. Which of the two makes has the higher average life?
2. If prices are the same for both, which type would you prefer to buy, and why? (Use CV.)

Life in hours (X)    Number of lamps manufactured
                     Company A    Company B
Up to 2000           150          100
2000-3999            150          175
4000-5999            120          125
6000-7999            80           65
8000-9999            130          135
10000-11999          170          140
12000 & above        200          60
Total                1000         800

[Answers: A: Mean = 7399.5, SD = 4308.13; B: Mean = 6549.5, SD = 3820.68]

10. The coefficients of variation of two series are 60% and 80% respectively. Their standard deviations are 20 and 16 respectively. What are their means?

[Answers: Mean 1 = 33.3, Mean 2 = 20]

