Dispersion or Variability


  • 7/29/2019 Dispersion or Variability


IBA, JU, Master of Business Administration

Course Instructor: Dr Swapan Kumar Dhar

Measures of Dispersion or Variability

Measures of central tendency do not reveal how closely the individual values of a variate are concentrated around the central value or average. A measure of central tendency, such as the mean or the median, only describes the center of the data. It is valuable from that standpoint, but it tells us nothing about the spread of the data.

For example, suppose you are visiting somewhere and want to cross a river on foot. Your guide tells you that the river ahead averages 3 feet in depth. Would you want to wade across without additional information? Probably not. You would want to know something about the variation in the depth. Is the maximum depth of the river 3.25 feet and the minimum 2.75 feet? If so, you would probably agree to cross. What if you learn that the depth ranges from 0.50 feet to 5.5 feet? You would probably decide not to cross. Before making a decision about crossing the river, you want information on both the typical depth and the dispersion in the depth of the river.

Let us give another example.

The mean of 40, 50 and 60 is 50.
The mean of 10, 50 and 90 is 50.

In the first case the values are close to the average 50, while in the second case the values are widely spread around the average 50.
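The contrast between the two data sets can be checked numerically. The following is a minimal Python sketch; the helper name `mean_and_deviations` is chosen here for illustration and does not come from the notes.

```python
def mean_and_deviations(values):
    """Return the mean and each value's deviation from that mean."""
    mean = sum(values) / len(values)
    deviations = [v - mean for v in values]
    return mean, deviations


close = [40, 50, 60]
wide = [10, 50, 90]

print(mean_and_deviations(close))  # (50.0, [-10.0, 0.0, 10.0])
print(mean_and_deviations(wide))   # (50.0, [-40.0, 0.0, 40.0])
```

Both sets share the mean 50, but the deviations in the second set are four times as large.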

So, it is necessary to know how the values of the variate are dispersed about the central value. Dispersion is the deviation of the values of a variate from its central value. It measures the degree of variability of the values of the variate among themselves. If all the values of a variate are equal, the dispersion is zero; the more widely the values are spread, the larger it is.

Different Measures of Dispersion:

There are many measures of dispersion, but here we consider only four of them:
(i) Range
(ii) Variance
(iii) Standard deviation
(iv) Coefficient of variation

Range

For a simple distribution,
Range = Highest value - Lowest value.
For a grouped frequency distribution,
Range = Upper limit of the highest class - Lower limit of the lowest class.

Alternatively,
Range = Upper boundary of the highest class - Lower boundary of the lowest class.
The range measures the total spread in the data set.

Example 1: The capacities of several plastic containers are 30, 20, 37, 64 and 27 liters respectively. What is the range?
Solution: Arranging in ascending order, we get 20, 27, 30, 37 and 64.
Range = 64 - 20 = 44 liters.

Example 2: Suppose a sample of 40 hourly wages was grouped into this frequency distribution:

Hourly earnings (in $)    Number
 5 up to 10               10
10 up to 15               21
15 up to 20                9

Find the range.
Solution: Range = $20 - $5 = $15.
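Examples 1 and 2 can be reproduced with a few lines of Python. This is only a sketch: the function names `simple_range` and `grouped_range`, and the choice to represent each class as a `(lower_limit, upper_limit)` pair, are assumptions made for illustration.

```python
def simple_range(values):
    """Range of ungrouped data: highest value minus lowest value."""
    return max(values) - min(values)


def grouped_range(classes):
    """Range of grouped data given (lower_limit, upper_limit) pairs."""
    lowest = min(lower for lower, _ in classes)
    highest = max(upper for _, upper in classes)
    return highest - lowest


capacities = [30, 20, 37, 64, 27]              # Example 1, liters
wage_classes = [(5, 10), (10, 15), (15, 20)]   # Example 2, dollars

print(simple_range(capacities))     # 44
print(grouped_range(wage_classes))  # 15
```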

Example 3: Find the range of the distribution:

Class interval    10-19    20-29    30-39    40-49
Frequency            12       18       24       10

Solution: Here we have to find the class boundaries. The formulas for the boundaries are:

\[ \text{Lower class boundary} = \text{Lower class limit} - \tfrac{1}{2}d \]

\[ \text{Upper class boundary} = \text{Upper class limit} + \tfrac{1}{2}d \]

where d = the difference between the upper class limit of a class interval and the lower class limit of the following class.

Here, for the first class,

\[ \text{Lower class boundary} = 10 - \tfrac{1}{2}(20 - 19) = 9.5 \]

\[ \text{Upper class boundary} = 19 + \tfrac{1}{2}(1) = 19.5 \]

Class boundary    Frequency
 9.5 - 19.5       12
19.5 - 29.5       18
29.5 - 39.5       24
39.5 - 49.5       10

Range = Upper boundary of the highest class - Lower boundary of the lowest class = 49.5 - 9.5 = 40.
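The boundary calculation in Example 3 can be sketched in Python as follows. The function name `class_boundaries` is hypothetical, and the sketch assumes the classes are listed in order with the same gap d between every pair of neighbouring classes.

```python
def class_boundaries(classes):
    """Convert (lower_limit, upper_limit) pairs into class boundaries.

    Assumes consecutive classes separated by a common gap d.
    """
    d = classes[1][0] - classes[0][1]  # e.g. 20 - 19 = 1
    half = d / 2
    return [(lower - half, upper + half) for lower, upper in classes]


classes = [(10, 19), (20, 29), (30, 39), (40, 49)]
bounds = class_boundaries(classes)

print(bounds)  # [(9.5, 19.5), (19.5, 29.5), (29.5, 39.5), (39.5, 49.5)]
print(bounds[-1][1] - bounds[0][0])  # 40.0
```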

Example 4: The following are the current year's Return on Equity figures for the 25 companies in an investor's portfolio.

Solution: Here the highest value = 22.1 and the lowest value = -8.1.
So Range = 22.1 - (-8.1) = 30.2.

Variance and Standard Deviation:

Variance is the arithmetic mean of the squared deviations from the mean. Standard deviation (S.D.) is the positive square root of the variance.

Population Variance

For ungrouped data, the population variance, denoted by $\sigma^2$ (sigma squared), is

\[ \sigma^2 = \frac{\sum (X - \mu)^2}{N} \]

where
X = the value of an observation in the population,
$\mu$ = the arithmetic mean of the population, $\mu = \frac{\sum X}{N}$,
N = the total number of observations in the population.

The working formula for the population variance is

\[ \sigma^2 = \frac{\sum X^2}{N} - \left(\frac{\sum X}{N}\right)^2 \]
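The working formula follows from the definition by expanding the square. A short derivation, using only the symbols of this section and the fact that $\sum X = N\mu$:

```latex
\begin{aligned}
\sigma^2 &= \frac{\sum (X - \mu)^2}{N}
          = \frac{\sum X^2 - 2\mu \sum X + N\mu^2}{N} \\
         &= \frac{\sum X^2}{N} - 2\mu^2 + \mu^2
          \qquad \text{since } \sum X = N\mu \\
         &= \frac{\sum X^2}{N} - \left(\frac{\sum X}{N}\right)^2
\end{aligned}
```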

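Example 5: The ages of all patients in the cancer ward of DMC Hospital are 38, 26, 13, 41 and 22 years. What is the population variance?
Solution:

\[ \sigma^2 = \frac{(38)^2 + (26)^2 + (13)^2 + (41)^2 + (22)^2}{5} - \left(\frac{38 + 26 + \dots + 22}{5}\right)^2 = \frac{4454}{5} - (28)^2 = 890.8 - 784 = 106.8 \text{ years}^2 \]

Population Standard Deviation

For ungrouped data,

\[ \sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}} = \sqrt{\frac{\sum X^2}{N} - \left(\frac{\sum X}{N}\right)^2} \]

For the previous problem, $\sigma = \sqrt{106.8} = 10.33$ years.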
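As a check on Example 5, the sketch below evaluates both the defining formula and the working formula for the population variance. The function names are illustrative; `statistics.pstdev` from the Python standard library is used only as a cross-check.

```python
import math
import statistics


def population_variance(values):
    """Definition: mean of the squared deviations from the mean."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


def population_variance_working(values):
    """Working formula: sum(X^2)/N - (sum(X)/N)^2."""
    n = len(values)
    return sum(x * x for x in values) / n - (sum(values) / n) ** 2


ages = [38, 26, 13, 41, 22]
var_def = population_variance(ages)
var_work = population_variance_working(ages)
sd = math.sqrt(var_def)

print(round(var_def, 2), round(var_work, 2))          # 106.8 106.8
print(round(sd, 2), round(statistics.pstdev(ages), 2))  # 10.33 10.33
```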

Data for Example 4 (Return on Equity of the 25 companies):

-8.1    3.2    5.9    8.1   12.3
-5.1    4.1    6.3    9.2   13.3
-3.1    4.6    7.9    9.5   14.0
-1.4    4.8    7.9    9.7   15.0
 1.2    5.7    8.0   10.3   22.1

Sample Variance

For ungrouped data, the formula for the sample variance is

\[ S^2 = \frac{\sum (X - \bar{X})^2}{n - 1} \]

where
X = the value of an observation in the sample,
$\bar{X}$ = the mean of the sample, $\bar{X} = \frac{\sum X}{n}$,
n = the total number of observations in the sample.

Working formula:

\[ S^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1} \]

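Example 6: The hourly wages for a sample of part-time employees of a certain firm are $2, $10, $6, $8 and $9. What is the sample variance?

Solution: Using the working formula

\[ S^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1} \]

we have

Hourly wage (X)    X^2
  2                  4
 10                100
  6                 36
  8                 64
  9                 81
Total 35           285

\[ S^2 = \frac{285 - \frac{(35)^2}{5}}{5 - 1} = \frac{285 - 245}{4} = \frac{40}{4} = 10 \]

The sample variance is 10 (in dollars squared).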
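The working formula for the sample variance can be checked against Example 6 with a short Python sketch. The function name `sample_variance` is illustrative; `statistics.variance` is the standard-library sample variance (divisor n - 1) and serves as a cross-check.

```python
import statistics


def sample_variance(values):
    """Working formula: (sum(X^2) - (sum(X))^2 / n) / (n - 1)."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return (sum_x2 - sum_x ** 2 / n) / (n - 1)


wages = [2, 10, 6, 8, 9]  # hourly wages in dollars

print(sample_variance(wages))                                 # 10.0
print(sample_variance(wages) == statistics.variance(wages))   # True
```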

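Population Variance:

(a) For a simple frequency distribution,

\[ \sigma^2 = \frac{\sum f(X - \mu)^2}{N} = \frac{\sum fX^2}{N} - \left(\frac{\sum fX}{N}\right)^2, \qquad N = \sum f, \quad \mu = \frac{\sum fX}{N} \]

\[ \text{S.D.} = \sigma = \sqrt{\frac{\sum fX^2}{N} - \left(\frac{\sum fX}{N}\right)^2} \]

(b) For a grouped frequency distribution,

\[ \sigma^2 = \frac{\sum fX^2}{N} - \left(\frac{\sum fX}{N}\right)^2 \]

\[ \text{S.D.} = \sigma = \sqrt{\frac{\sum fX^2}{N} - \left(\frac{\sum fX}{N}\right)^2} \]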

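where X = mid value of a class.

Sample Variance

(a) For a simple frequency distribution,

\[ S^2 = \frac{\sum f(X - \bar{X})^2}{n - 1} = \frac{\sum fX^2 - \frac{(\sum fX)^2}{n}}{n - 1}, \qquad n = \sum f \]

(b) For a grouped frequency distribution,

\[ S^2 = \frac{\sum f(X - \bar{X})^2}{n - 1} = \frac{\sum fX^2 - \frac{(\sum fX)^2}{n}}{n - 1} \]

where X = mid value of a class.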

Example 7: Find the standard deviation from the following data:

Daily wages ($)      20-24    25-29    30-34    35-39
Number of workers       16       28       14       12

Solution: Calculation.

Class interval    Frequency (f)    Mid value (x)      fx     fx^2
20-24             16               22                352     7744
25-29             28               27                756    20412
30-34             14               32                448    14336
35-39             12               37                444    16428
Total             N = 70                            2000    58920

\[ \text{Variance} = \frac{\sum fx^2}{N} - \left(\frac{\sum fx}{N}\right)^2 = \frac{58920}{70} - \left(\frac{2000}{70}\right)^2 = 841.714 - 816.327 = 25.388 \]

\[ \text{S.D.} = \sqrt{25.388} = 5.04 \]

So the variance is 25.388 (in dollars squared) and the S.D. is $5.04.

Example 8: Marks obtained by all 100 students in a class in an examination are as follows:

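Marks       Number of students
Below 10     14
Below 20     30
Below 30     50
Below 40     75
Below 50     87
Below 60     95
Below 70    100

Find the mean, variance and S.D.

Solution: The table gives cumulative ("below") counts, so the frequency of each class is the difference between successive counts, for example 30 - 14 = 16 for the class 10-20.

Marks    Number of students (f)    Mid value (x)      fx      fx^2
 0-10    14                         5                 70       350
10-20    16                        15                240      3600
20-30    20                        25                500     12500
30-40    25                        35                875     30625
40-50    12                        45                540     24300
50-60     8                        55                440     24200
60-70     5                        65                325     21125
Total    N = 100                                    2990    116700

\[ \text{Mean} = \frac{\sum fx}{N} = \frac{2990}{100} = 29.90 \]

\[ \text{Variance} = \frac{\sum fx^2}{N} - \left(\frac{\sum fx}{N}\right)^2 = \frac{116700}{100} - \left(\frac{2990}{100}\right)^2 = 1167 - 894.01 = 272.99 \]

\[ \text{S.D.} = \sqrt{272.99} = 16.52 \]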
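The grouped-data calculations in Examples 7 and 8 follow the same pattern, which the Python sketch below makes explicit. The function names `frequencies_from_less_than` and `grouped_mean_variance` are assumptions made for illustration, and the variance is the population form (divisor N) used in these examples.

```python
import math


def frequencies_from_less_than(cumulative):
    """Convert 'below x' cumulative counts into per-class frequencies."""
    previous = [0] + cumulative[:-1]
    return [c - p for p, c in zip(previous, cumulative)]


def grouped_mean_variance(midpoints, freqs):
    """Mean and population variance of a frequency distribution."""
    n = sum(freqs)
    sum_fx = sum(f * x for x, f in zip(midpoints, freqs))
    sum_fx2 = sum(f * x * x for x, f in zip(midpoints, freqs))
    mean = sum_fx / n
    return mean, sum_fx2 / n - mean ** 2


# Example 8: cumulative counts for "below 10", "below 20", ..., "below 70"
cumulative = [14, 30, 50, 75, 87, 95, 100]
freqs = frequencies_from_less_than(cumulative)
midpoints = [5, 15, 25, 35, 45, 55, 65]
mean, var = grouped_mean_variance(midpoints, freqs)

print(freqs)  # [14, 16, 20, 25, 12, 8, 5]
print(round(mean, 2), round(var, 2), round(math.sqrt(var), 2))  # 29.9 272.99 16.52
```

Calling `grouped_mean_variance([22, 27, 32, 37], [16, 28, 14, 12])` reproduces the variance of about 25.388 from Example 7.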


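Example 9: Find the S.D. of the following distribution.

Daily wages ($)    Number of workers
 0 and above       200
10 and above       155
20 and above       127
30 and above        92
40 and above        54
50 and above         0

Solution: The table gives cumulative ("and above") counts, so the frequency of each class is the difference between successive counts, for example 200 - 155 = 45 for the class 0-10.

Calculation of S.D.

Daily wages    Number of workers (f)    Mid value (x)      fx      fx^2
 0-10          45                        5                225      1125
10-20          28                       15                420      6300
20-30          35                       25                875     21875
30-40          38                       35               1330     46550
40-50          54                       45               2430    109350
Total          N = 200                                   5280    185200

\[ \text{S.D.} = \sqrt{\frac{\sum fx^2}{N} - \left(\frac{\sum fx}{N}\right)^2} = \sqrt{\frac{185200}{200} - \left(\frac{5280}{200}\right)^2} = \sqrt{926 - 696.96} = \sqrt{229.04} = 15.13 \]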
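For "and above" tables such as the one in Example 9, the class frequencies come from differencing each count with the next one. A minimal Python sketch of that step and the resulting S.D. follows; the function name `frequencies_from_more_than` is hypothetical.

```python
import math


def frequencies_from_more_than(cumulative):
    """Convert 'x and above' cumulative counts into per-class frequencies."""
    return [a - b for a, b in zip(cumulative, cumulative[1:])]


# Counts for "0 and above", "10 and above", ..., "50 and above"
cumulative = [200, 155, 127, 92, 54, 0]
freqs = frequencies_from_more_than(cumulative)  # classes 0-10, ..., 40-50
midpoints = [5, 15, 25, 35, 45]

n = sum(freqs)
mean = sum(f * x for x, f in zip(midpoints, freqs)) / n
variance = sum(f * x * x for x, f in zip(midpoints, freqs)) / n - mean ** 2
sd = math.sqrt(variance)

print(freqs)         # [45, 28, 35, 38, 54]
print(round(sd, 2))  # 15.13
```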

Coefficient of Variation (C.V.)

The coefficient of variation, denoted by C.V., measures the scatter in the data relative to the mean. It may be computed as follows:

\[ \text{C.V.} = \frac{\text{S.D.}}{\text{Mean}} \times 100 \]

As a relative measure, the C.V. is particularly useful when comparing the variability of two or more data sets that are expressed in different units of measurement.

Example 10: The combined grade point average (CGPA) in different semesters of two students is shown below:

Student    CGPA in semester
           1      2      3      4      5      6      7      8
A          2.5    2.5    3.0    3.5    3.5    4.0    3.5    3.5
B          2.5    3.0    4.0    4.0    4.0    2.0    2.5    4.0

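Which student would you consider better throughout the course of studies?

Solution: For student A,

\[ \mu_1 = \frac{\sum_{i=1}^{N_1} x_i}{N_1} = \frac{26}{8} = 3.25 \]

\[ \sigma_1^2 = \frac{1}{N_1}\sum_{i=1}^{N_1} x_i^2 - \left(\frac{\sum_{i=1}^{N_1} x_i}{N_1}\right)^2 = \frac{1}{8}(86.5) - \left(\frac{26}{8}\right)^2 = 0.25 \]

\[ \sigma_1 = \sqrt{0.25} = 0.5 \]

\[ \text{C.V. for A} = \frac{\sigma_1}{\mu_1} \times 100 = \frac{0.5}{3.25} \times 100 = 15.38\% \]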

    For student B,


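\[ \mu_2 = \frac{\sum_{i=1}^{N_2} x_i}{N_2} = \frac{26}{8} = 3.25 \]

\[ \sigma_2^2 = \frac{1}{N_2}\sum_{i=1}^{N_2} x_i^2 - \left(\frac{\sum_{i=1}^{N_2} x_i}{N_2}\right)^2 = \frac{1}{8}(89.5) - \left(\frac{26}{8}\right)^2 = 0.625 \]

\[ \sigma_2 = \sqrt{0.625} = 0.79 \]

\[ \text{C.V. for B} = \frac{\sigma_2}{\mu_2} \times 100 = \frac{0.79}{3.25} \times 100 = 24.31\% \]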

It is observed that the average CGPA of both students is the same, but the C.V. of A is less than that of B. This implies that student A is better than B throughout the course of studies: the performance of A is more homogeneous across semesters.

Example 11: Find the coefficient of variation from the following frequency distribution giving the weekly wages of 100 workers.

Wages (Taka)    Number of workers
260-269          6
270-279         14
280-289         29
290-299         23
300-309         16
310-319         10
320-329          2

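Solution: Taking x as the mid value of each class (264.5, 274.5, ..., 324.5), we have $\sum fx = 29120$ and $\sum fx^2 = 8499955$, with N = 100.

\[ \text{Mean} = \bar{x} = \frac{\sum fx}{N} = \frac{29120}{100} = 291.2 \]

\[ \text{S.D.} = \sqrt{\frac{\sum fx^2}{N} - \left(\frac{\sum fx}{N}\right)^2} = \sqrt{84999.55 - 84797.44} = \sqrt{202.11} = 14.22 \]

\[ \text{C.V.} = \frac{14.22}{291.2} \times 100 = 4.88\% \]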
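The Example 11 computation can be evaluated directly from the wage table with the Python sketch below. The function name `grouped_cv` is illustrative, class mid values are taken as the average of the class limits, and the S.D. is the population form (divisor N). Evaluated this way, the table gives a mean of 291.2, an S.D. of about 14.22 and a C.V. of about 4.88%.

```python
import math


def grouped_cv(midpoints, freqs):
    """Return (mean, S.D., C.V. in percent) for a frequency distribution."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(midpoints, freqs)) / n
    variance = sum(f * x * x for x, f in zip(midpoints, freqs)) / n - mean ** 2
    sd = math.sqrt(variance)
    return mean, sd, sd / mean * 100


# Mid values of the classes 260-269, 270-279, ..., 320-329
midpoints = [264.5, 274.5, 284.5, 294.5, 304.5, 314.5, 324.5]
freqs = [6, 14, 29, 23, 16, 10, 2]
mean, sd, cv = grouped_cv(midpoints, freqs)

print(round(mean, 1), round(sd, 2), round(cv, 2))  # 291.2 14.22 4.88
```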

Example 12: The scores of two batsmen A and B in ten consecutive innings are as follows:

A:    70    34    46    58    62    39    11    80    20    50
B:    48    52    66    44    32    58    80    42    68    40

Find which batsman is more consistent in scoring.

Solution: For batsman A, $\bar{x} = 47$ and S.D. = 20.57.

\[ \text{C.V.} = \frac{20.57}{47} \times 100 = 43.77\% \]

For batsman B, $\bar{x} = 53$ and S.D. = 14.09.

\[ \text{C.V.} = \frac{14.09}{53} \times 100 = 26.58\% \]

Batsman B is more consistent (less variable) than batsman A because 26.58% < 43.77%.
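The comparison of the two batsmen can be sketched with the Python standard library. The function name `cv_percent` is hypothetical; `statistics.pstdev` is the population standard deviation (divisor N), which matches the S.D. values quoted in the solution. Because the sketch does not round the S.D. before dividing, the second decimal place of the C.V. can differ slightly from a hand calculation.

```python
import statistics


def cv_percent(scores):
    """Coefficient of variation in percent: (S.D. / mean) * 100."""
    return statistics.pstdev(scores) / statistics.mean(scores) * 100


batsman_a = [70, 34, 46, 58, 62, 39, 11, 80, 20, 50]
batsman_b = [48, 52, 66, 44, 32, 58, 80, 42, 68, 40]

cv_a = cv_percent(batsman_a)
cv_b = cv_percent(batsman_b)
more_consistent = "A" if cv_a < cv_b else "B"

print(round(cv_a, 2), round(cv_b, 2))  # 43.77 26.59
print(more_consistent)                 # B
```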
