
PSYCHOLOGICAL STATISTICS

II SEMESTER

Complementary Course

For

B.Sc. Counselling Psychology

(CU-CBCSS)

(2014 Admission onwards)

UNIVERSITY OF CALICUT

SCHOOL OF DISTANCE EDUCATION

Calicut University P.O., Malappuram, Kerala, India 673 635.


UNIVERSITY OF CALICUT

SCHOOL OF DISTANCE EDUCATION

STUDY MATERIAL

COMPLEMENTARY COURSE

For

B.Sc. COUNSELLING PSYCHOLOGY

PSYCHOLOGICAL STATISTICS

II Semester

Prepared by: Ms. Sajila, Research Scholar, University of Calicut

Layout: Computer Section, SDE

© Reserved


CONTENTS

Module - 1

Module - 2

Module - 3


Module 1: Frequency Distribution and Graphs

Horace Secrist defines statistics as, “aggregate of facts, affected to a marked extent by multiplicity of causes, numerically expressed, enumerated or estimated according to a reasonable standard of accuracy, collected in a systematic manner for a predetermined purpose and placed in relation to each other”.

Meaning of Data

The term ‘data’ refers to facts or evidence relating to a group, situation or phenomenon. It may include raw facts such as names, measures of height and weight, and scores on different forms of tests, experiments or surveys.

Measures of Data: Continuous and Discrete

Data may be either in continuous or discrete form. Data relating to psychological and physical traits fall into continuous data. A continuous series can have any degree of subdivision, with each measure, which may be an integer or a fraction, existing anywhere within the range of the scale used, i.e., continuous data are not restricted to defined separate values, but can occupy any value over a continuous range. Between any two continuous data values there may be an infinite number of other values. Examples: measures of height such as 160.5 cm and 159.6 cm, measures of distance such as 25.7 km and 56.5 km, and scores obtained in examinations such as 85.5 and 73.5.

Discrete data can only take particular values. There may potentially be an infinite number of those values, but each is distinct and there is no grey area in between, i.e., measures that fall under a discrete series are separate and distinct. There is a real gap between the measures. Examples: the number of students in a class (50, 45, etc.) and the number of books in a library (1000, 2345, etc.).

Organisation of Data

The different aspects of psychology may be studied by conducting different forms of tests, surveys and experiments, which yield valuable data. Data in its original form, having little meaning to the reader or investigator, is termed raw data. In order to make the raw data meaningful, it has to be organised or arranged systematically. This process of organising or arranging original data in a systematic manner in order to make meaningful interpretations is termed organisation or grouping of data.

There are different methods for the organisation of data. Data may be organised in any of the following forms.

1. Statistical Tables
2. Rank Order
3. Frequency Distribution

Organising data in the form of Statistical Tables: Under this method, data are presented in tabular form, arranged into rows and columns under different headings. The tables may contain the original data or raw scores as well as the percentages, means, standard deviations, etc. Consider the statistical table given below.

Example 1.1

Organising data in the form of Rank Order: Under this method, raw data are arranged in an ascending or descending series, which reveals the order with respect to the rank or merit position of the individual. Consider the following example.

Example 1.2: The following are scores obtained by 40 students in a test. Present the data in a tabular form depicting the rank order. The scores are:

72 54 60 55 80 45 53 76 50 60
55 85 76 39 64 63 46 62 42 63
64 45 52 53 35 55 63 38 53 62
65 60 65 40 52 40 42 78 48 52

Solution: The rank order tabulation of the data

Sl No.  Score   Sl No.  Score   Sl No.  Score   Sl No.  Score   Sl No.  Score
1       35      9       45      17      53      25      60      33      65
2       38      10      46      18      53      26      62      34      65
3       39      11      48      19      54      27      62      35      72
4       40      12      50      20      55      28      63      36      76
5       40      13      52      21      55      29      63      37      76
6       42      14      52      22      55      30      63      38      78
7       42      15      52      23      60      31      64      39      80
8       45      16      53      24      60      32      64      40      85
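The rank order tabulation above can be reproduced with a short script. This is only a minimal sketch in Python; the variable names `scores` and `rank_order` are illustrative assumptions and do not come from the study material.

```python
# Scores of the 40 students from Example 1.2
scores = [72, 54, 60, 55, 80, 45, 53, 76, 50, 60,
          55, 85, 76, 39, 64, 63, 46, 62, 42, 63,
          64, 45, 52, 53, 35, 55, 63, 38, 53, 62,
          65, 60, 65, 40, 52, 40, 42, 78, 48, 52]

# Ascending rank order: serial number 1 goes to the lowest score
rank_order = list(enumerate(sorted(scores), start=1))

for serial, score in rank_order:
    print(serial, score)
```

Sorting in ascending order matches the table; passing `reverse=True` to `sorted` would give the descending series instead.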

Frequency Distribution: Frequency distribution is a method of presenting data showing the frequency, or the number of times a score or group of scores occurs in a given distribution. Under this method, data are organised into groups or classes in which each score is allotted a place in the respective group or class. The number of times a particular score or group of scores occurs in the given distribution is also given. This is known as the frequency of a score or group of scores.

Construction of Frequency Distribution Table

Data are organised into a frequency distribution systematically. The following steps are used to construct a frequency distribution table:

1. Finding the Range: The first step is finding the range of the given series of data. Range is computed by subtracting the lowest score from the highest one in the data series. In Example 1.2, the range of the distribution is Range = Highest score – Lowest score, i.e., 85 – 35 = 50.

2. Determining the Class Interval: Class interval denotes the number and size of the classes or groups used for grouping or organising data. There are two methods for this:

i. Computing the class interval (i) using the formula:

$$i = \frac{\text{Range}}{\text{Number of classes desired}}$$

As a general rule, Tate (1955) has given the following guidance for deciding the number of classes desired.

- For items less than or equal to 50, the number of classes may be 10.
- For 50 to 100 items, 10 to 15 classes are appropriate.
- For more than 100 items, 15 or more classes may be appropriate.
- Ordinarily, not fewer than 10 classes or more than 20 classes are used.

ii. Under the second method, the class interval (i) is decided first and then the number of classes is determined. For this purpose, class intervals of 2, 3, 5 or 10 units in length are usually used.

Thus in Example 1.2, the class interval (i) will be

$$i = \frac{\text{Range}}{\text{Number of classes desired}} = \frac{50}{10} = 5$$

Here, the range is 50. As the number of scores is 40, which is less than 50, it may be sufficient to take 10 classes. Hence the class interval (i) will be 50/10, i.e., 5.

3. Preparing Frequency Distribution Table

After determining the size and class interval, we proceed to preparing the frequency distribution table. This follows two steps:

i. Writing classes of the distribution
ii. Tallying the scores and checking the tallies

The first step is the writing of the classes of the distribution. For this, first the lowest class and then the subsequent higher classes are formed.


In the example, Table 1.1 uses classes 10 units wide: the lowest class is 30-39 and the subsequent higher classes are 40-49, 50-59, 60-69, 70-79 and 80-89.

The second step involves tallying the scores and checking the tallies. For this, the scores given in the distribution are taken one by one and tallied in their proper classes as shown in Table 1.1. The tally marks against each class are then counted and checked to determine the frequency of that class. The total of the frequencies should be equal to the number of individuals whose scores have been tabulated.

Table 1.1: Frequency Distribution Table

Class      Tallies            Frequency
80 – 89    ||                 2
70 – 79    ||||               4
60 – 69    ||||| ||||| ||     12
50 – 59    ||||| ||||| |      11
40 – 49    ||||| |||          8
30 – 39    |||                3
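The steps above (range, class interval, tallying) can be sketched in Python. This is a minimal illustration under stated assumptions: the names `score_range`, `class_interval`, `width` and `frequencies` are hypothetical, and the class width of 10 is taken from the classes shown in Table 1.1.

```python
from collections import Counter

# Scores of the 40 students from Example 1.2
scores = [72, 54, 60, 55, 80, 45, 53, 76, 50, 60,
          55, 85, 76, 39, 64, 63, 46, 62, 42, 63,
          64, 45, 52, 53, 35, 55, 63, 38, 53, 62,
          65, 60, 65, 40, 52, 40, 42, 78, 48, 52]

# Step 1: range = highest score - lowest score
score_range = max(scores) - min(scores)

# Step 2 (first method): class interval = range / number of classes desired
class_interval = score_range / 10

# Step 3: tally each score into its class, using the 10-unit classes of Table 1.1
width = 10
frequencies = Counter((s // width) * width for s in scores)

for lower in sorted(frequencies, reverse=True):
    print(f"{lower} - {lower + width - 1}: {frequencies[lower]}")
```

A `Counter` plays the role of the tally column here: each score adds one mark to the class whose lower limit it falls on or above.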

Cumulative frequency and Cumulative Percentage Frequency Distributions

A frequency distribution table shows how frequencies are distributed over the different class intervals. For determining the number of scores or percentage of scores lying above or below a class interval, another category of tables, called cumulative frequency and cumulative percentage frequency tables, is constructed. The cumulative frequency and cumulative percentage frequency distributions may be obtained directly from the frequency distribution. Consider Table 1.2.

Table 1.2: Cumulative Frequencies and Cumulative Percentage Frequencies

Class      Frequency   Cumulative Frequency   Cumulative Percentage Frequency
80 – 89    2           40                     100
70 – 79    4           38                     95
60 – 69    12          34                     85
50 – 59    11          22                     55
40 – 49    8           11                     27.5
30 – 39    3           3                      7.5

N = 40

In Table 1.2, cumulative frequencies are obtained by successively adding the individual frequencies, starting from the lowest class. These cumulative frequencies are converted to cumulative percentage frequencies by multiplying each cumulative frequency by 100/N, where N is the total number of frequencies.

The cumulative percentage frequencies show the percentage of cases lying above or below a given score or class. In Table 1.2, consider for example the cumulative percentage 55%, which is computed as 22 × 100/40. This shows that, in the class of 40 students, 55% have achievement scores in mathematics of 59 or below, that is, below 59.5, which is the actual or exact upper limit of the class 50-59.

Thus, the cumulative frequencies and cumulative percentage frequencies help us to determine the relative position, rank or merit of an individual with respect to the members of a group.
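The cumulative columns of Table 1.2 can be checked with a few lines of Python. This is a sketch only; the names `frequencies`, `cumulative` and `cumulative_pct` are illustrative assumptions, and the classes are listed from the lowest upwards because accumulation starts from the lowest class.

```python
from itertools import accumulate

# Classes and frequencies of Table 1.2, lowest class first
classes = ["30-39", "40-49", "50-59", "60-69", "70-79", "80-89"]
frequencies = [3, 8, 11, 12, 4, 2]

n = sum(frequencies)  # total number of frequencies, N

# Running total of the frequencies, starting from the lowest class
cumulative = list(accumulate(frequencies))

# Each cumulative frequency multiplied by 100/N
cumulative_pct = [cf * 100 / n for cf in cumulative]

for cls, f, cf, pct in zip(classes, frequencies, cumulative, cumulative_pct):
    print(cls, f, cf, pct)
```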

Diagrams and Graphs

The data obtained from surveys, tests and experiments may be organised in the form of statistical and frequency distribution tables. Such an organisation helps in the better understanding of data and in interpreting them to derive valuable conclusions. The numerical data may be easily analysed if they are represented graphically in the form of pictures and graphs.

Meaning of graphical Data Representation

Graphical representation of data means representing numerical data in visual form using pictures, diagrams and graphs for analysing the data more easily and effectively. It is always considered an effective and economical way of presenting, understanding, analysing and interpreting statistical data.

Advantages

1. Precise and easy to understand.
2. More economical and effective method of representing data.
3. Attractive and appealing.
4. Easy to remember.
5. Easy to make comparisons with other data effectively.
6. Proper estimation, evaluation and interpretation of data is possible.
7. Easy computations of mean, median, mode, etc.
8. Helps in determining the nature of data and forecasting the trends.

Modes of graphical representations

The two types of data, ungrouped (data in raw form) and grouped (data organised into a frequency distribution), use separate methods of graphical representation.

Graphical Representation of Ungrouped Data

Ungrouped data usually use the following graphical representations: (1) Bar Diagrams (2) Pie Diagrams (3) Pictograms (4) Line Graphs

1. Bar Diagrams

Data in different forms like raw scores, total scores or frequencies, computed statistics and summarised figures like percentages and averages can be represented by using bars. This form of graphical data representation is called a bar diagram. It may take two forms: vertical and horizontal bar diagrams.

The lengths of the bars are proportional to the values of the variable (height, weight, intelligence, marks, price, etc.). The widths of the bars are chosen arbitrarily. It is conventional to have the space between the bars about one half of the width of a bar. Consider Example 1.3 for the illustration of a bar diagram.

Example 1.3: The following data relate to the student enrolment in Zenith College in different years. Represent the data using a bar diagram.

Year           Number of Students Enrolled
2010 – 2011    1000
2011 – 2012    1220
2012 – 2013    900
2013 – 2014    1100
2014 – 2015    1400
2015 – 2016    1500

The above data can be represented using bar diagrams in vertical and horizontal forms as given in Figures 1.1 and 1.2.

Figure 1.1: Vertical Bar Diagram – Student enrolment in Zenith College during the years 2010 to 2015. (X-axis: Year; Y-axis: Student enrolment)

Figure 1.2: Horizontal Bar Diagram – Student enrolment at Zenith College during the years 2010 to 2015. (X-axis: Student enrolment; Y-axis: Year)


2. Pie Diagram

In a pie diagram, data are represented as sections or portions of a circle of 360°, in which each part represents the amount of data converted into angles. The total frequency value is equated to 360° and then the angles corresponding to the component parts are calculated. By using these angles, the different sectors are drawn. Consider Example 1.4 for the illustration of preparing a pie diagram.

Example 1.4: The following data relate to subjects offered for study in an institution and the number of students enrolled. Present the data graphically in the form of a pie diagram.

Subjects          : Science   Arts   Commerce
Students enrolled : 100       130    170

The above data can be presented in the form of a pie diagram as given below.

Courses Offered   No. of Students   Angle of the Circle
Science           100               (100/400) × 360 = 90°
Arts              130               (130/400) × 360 = 117°
Commerce          170               (170/400) × 360 = 153°
Total             400               360°
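The angle calculation in the table can be written as a small Python sketch. The dictionary name `enrolment` and the other identifiers are illustrative assumptions; the figures are those of Example 1.4.

```python
# Students enrolled per subject (Example 1.4)
enrolment = {"Science": 100, "Arts": 130, "Commerce": 170}

total = sum(enrolment.values())  # total frequency, equated to 360 degrees

# Angle of each sector = (component / total) x 360
angles = {subject: count * 360 / total for subject, count in enrolment.items()}

for subject, angle in angles.items():
    print(subject, angle)
```

The sector angles always add up to 360°, which is a convenient check on the arithmetic.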

Figure 1.3: Pie Diagram – Subjects offered for study and percentage of students enrolled (sector labels, rounded: Science 25%, Arts 32%, Commerce 43%).

3. Pictograms

In data representation using pictograms, numerical data are represented by means of picture figures appropriately designed in proportion to the numerical data.

Example 1.5: The number of students in classes 1 to 5 is given. Represent the data using a pictogram.

Class    : I    II   III  IV   V
Strength : 70   70   60   50   40


Figure 1.4: Pictogram representation of number of students in classes 1 to 5.

4. Line Graphs

In the line graph form of data representation, data related to one variable are plotted on the horizontal X-axis and the other variable on the vertical Y-axis.

Consider Example 1.3 for drawing a line graph.

Figure 1.5: Line graph – Student enrolment in Zenith College in different years. (X-axis: Year; Y-axis: No. of students enrolled)

Graphical Representation of Grouped Data

The raw data are organised into a frequency distribution to get grouped data. The methods of representing grouped data graphically are given below: (1) Histogram (2) Frequency Polygon (3) Cumulative Frequency Graph (4) Cumulative Frequency Curve or Ogive.

1. Histogram

A histogram is essentially a bar diagram of a frequency distribution in which the ‘actual’ class intervals plotted on the X-axis represent the widths of the bars (rectangles) and the respective frequencies of these classes represent the heights of the bars.


For determining the actual class, 0.5 is subtracted from the lower limit of the class and 0.5 is added to the upper limit. For example, in the class 50-54, the actual class limits are determined by subtracting 0.5 from the lower limit and adding 0.5 to the upper limit. Hence we get the actual class interval 49.5 - 54.5.

The steps in the construction of histograms are given below:

1. Convert the scores into actual class limits, i.e., 20 – 24 as 19.5 – 24.5.
2. Take two extra class intervals, one above and one below the given classes, with zero as frequency.
3. Plot the actual or exact lower limits of classes on the X-axis.
4. Frequencies of distributions are to be plotted on the Y-axis.
5. Represent each class by a separate rectangle in which the base is the width of the class interval (i) and the height is its respective frequency.

Consider the following example for the illustration of representing data in the form of a histogram.

Example 1.6

Score           : 30-39  40-49  50-59  60-69  70-79  80-89
No. of students : 3      8      11     12     4      2

To draw the histogram, take the actual or exact lower limits of the classes of scores as the values to be marked on the X-axis, and the corresponding frequencies of the classes on the Y-axis.

Figure 1.6: Histogram representation of scores and frequencies.
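The conversion to actual class limits used for the histogram can be sketched in Python. The plotting itself is left out; the sketch only computes the rectangles, and the names `classes`, `frequencies` and `bars` are illustrative assumptions.

```python
# Classes and frequencies from Example 1.6
classes = [(30, 39), (40, 49), (50, 59), (60, 69), (70, 79), (80, 89)]
frequencies = [3, 8, 11, 12, 4, 2]

# Actual limits: subtract 0.5 from the lower limit, add 0.5 to the upper limit.
# Each bar is (actual lower limit, actual upper limit, height).
bars = [(lower - 0.5, upper + 0.5, f)
        for (lower, upper), f in zip(classes, frequencies)]

for left, right, height in bars:
    print(f"base {left} to {right} (width {right - left}), height {height}")
```

Because the actual limits of neighbouring classes coincide (39.5, 49.5 and so on), the rectangles touch each other with no gaps.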

2. Frequency Polygon

A frequency polygon is essentially a line graph used for the graphical representation of a frequency distribution. A frequency polygon is drawn from a histogram by connecting the midpoints of the upper bases of the rectangular bars with straight lines. A frequency polygon can also be drawn directly by plotting the midpoints of the classes.

Steps in the construction of a frequency polygon are given below.


1. Take two extra classes, one above and one below the given intervals, with zero frequency.
2. Compute the midpoints of the classes.
3. Mark the midpoints along the X-axis and mark the corresponding frequencies on the Y-axis.
4. Join the points marked on the graph by using straight lines to obtain a frequency polygon.

Example 1.7: Construct a frequency polygon from the data given below.

Score           : 50-59  60-69  70-79  80-89  90-99
No. of students : 5      10     30     40     15

Figure 1.7: Frequency Polygon
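The points of the frequency polygon for Example 1.7 can be computed as follows. This is a minimal sketch, assuming the two extra zero-frequency classes are 40-49 and 100-109; the names `padded_classes` and `points` are hypothetical.

```python
# Classes and frequencies from Example 1.7
classes = [(50, 59), (60, 69), (70, 79), (80, 89), (90, 99)]
frequencies = [5, 10, 30, 40, 15]

# Step 1: one extra class below and one above, each with zero frequency
width = classes[0][1] - classes[0][0] + 1  # 10 score units per class
below = (classes[0][0] - width, classes[0][1] - width)
above = (classes[-1][0] + width, classes[-1][1] + width)
padded_classes = [below] + classes + [above]
padded_frequencies = [0] + frequencies + [0]

# Steps 2 and 3: midpoint of each class paired with its frequency
points = [((lower + upper) / 2, f)
          for (lower, upper), f in zip(padded_classes, padded_frequencies)]

# Step 4 would join these points with straight lines
for midpoint, f in points:
    print(midpoint, f)
```

The zero-frequency points at both ends bring the polygon down to the X-axis, which closes the figure.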

3. The Cumulative Frequency Graph

The data organised in the form of a cumulative frequency distribution may be represented graphically using a cumulative frequency graph. It is essentially a line graph drawn by plotting the actual upper limits of the class intervals on the X-axis and the respective cumulative frequencies of these class intervals on the Y-axis.

Steps in the construction of a cumulative frequency graph are given below.

1. Take one extra class with cumulative frequency as zero to plot the origin of the graph on the X-axis.
2. Compute the actual upper limits of the classes.
3. Compute the cumulative frequencies.
4. Mark the actual upper limits of the classes on the X-axis and mark the corresponding cumulative frequencies on the Y-axis.
5. Join the points plotted on the graph by using straight lines, resulting in a cumulative frequency graph or a cumulative frequency line graph.

Example 1.8: Consider the following for constructing cumulative frequency graph.


Scores          : 30-39  40-49  50-59  60-69  70-79
No. of students : 20     35     25     15     5

Solution:

Actual upper limits of classes : 39.5  49.5  59.5  69.5  79.5
Cumulative Frequencies         : 20    55    80    95    100

Figure 1.8: Cumulative Frequency Graph

4. The Cumulative Percentage Frequency Curve or Ogive

The cumulative percentage frequency curve or ogive represents the cumulative percentage frequency distribution by plotting the exact or actual upper limits of the classes on the X-axis and their respective cumulative percentage frequencies on the Y-axis.

Ogives can be useful in the computation of medians, quartiles, deciles, percentiles, percentile ranks and percentile norms, as well as for the overall comparison of two or more groups or frequency distributions.

Consider the data given in Example 1.8 for the illustration of an ogive.

Solution:

Scores   Actual upper limits (X)   Frequencies (f)   Cumulative Frequencies (CF)   Cumulative Percentage Frequency
30-39    39.5                      20                20                            20
40-49    49.5                      35                55                            55
50-59    59.5                      25                80                            80
60-69    69.5                      15                95                            95
70-79    79.5                      5                 100                           100

N = 100

Here, Cumulative Percentage Frequency $= \frac{CF}{N} \times 100$.
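The points plotted for the ogive can be derived from the frequencies in a few lines of Python. This is a sketch only: the identifiers are illustrative, and the starting point (29.5, 0) assumes one extra class, 20-29, with cumulative frequency zero.

```python
from itertools import accumulate

# Actual upper limits and frequencies of the classes 30-39 ... 70-79
upper_limits = [39.5, 49.5, 59.5, 69.5, 79.5]
frequencies = [20, 35, 25, 15, 5]

n = sum(frequencies)  # N
cumulative = list(accumulate(frequencies))            # CF
cumulative_pct = [cf * 100 / n for cf in cumulative]  # (CF / N) x 100

# Extra class below the lowest one, so the curve starts on the X-axis
ogive_points = [(29.5, 0.0)] + list(zip(upper_limits, cumulative_pct))

for x, pct in ogive_points:
    print(x, pct)
```

Since N happens to be 100 in this example, the cumulative percentage frequencies equal the cumulative frequencies.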


Figure 1.9: Cumulative Percentage Frequency Curve or Ogive


Module 2: Measures of Central Tendency

Meaning

The scores obtained by conducting tests, surveys and experiments are mostly not presented in their entirety, which in many circumstances would also be impossible. It can be seen that only a very few scores are very high or very low, while most of the scores tend to cluster around a central value. This central value reflects the average characteristic of the distribution.

The tendency of scores in a distribution to cluster around a central value is termed central tendency, and the typical score or value lying between the extremes and reflecting the average characteristic is referred to as a measure of central tendency.

The three most common measures of central tendency are given below.

1. Arithmetic Mean or Mean
2. Median
3. Mode

Arithmetic Mean

Arithmetic mean or Mean is the sum of all the values of a given distribution divided by the number of values. In simple words, it is the average of a distribution. It is represented by the symbol M or $\bar{X}$.

$$\text{Mean} = \frac{\text{Sum of all values}}{\text{Number of values}}$$

Characteristics of Arithmetic Mean

1. The value of mean reflects the magnitude of every value in a given distribution.
2. A distribution has only one mean.
3. It is possible to manipulate mean algebraically.
4. Mean may be calculated even if individual values are unknown, provided the sum of values and the size of sample ‘N’ are given.
5. There is no need for ordering or grouping of data for the computation of mean.
6. It is not possible to compute mean of an open ended distribution.

Types of Mean

There are mainly four types of mean. They are:

1. Arithmetic Mean
2. Geometric Mean
3. Harmonic Mean
4. Quadratic Mean


Arithmetic mean is simply the ‘average value’. It is the sum of all scores divided by the number of scores. Geometric mean is computed by multiplying all the N values in a distribution and taking the Nth root of their product. Harmonic mean is the reciprocal of the arithmetic mean of the reciprocals of a set of values. Quadratic mean is the square root of the arithmetic mean of the squares of a set of values.
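The four definitions can be compared on a small data set. The sketch below uses Python's standard `statistics` and `math` modules; the values 2, 4 and 8 are hypothetical illustrative data, not taken from the study material.

```python
import math
import statistics

values = [2, 4, 8]  # hypothetical illustrative data
n = len(values)

# Arithmetic mean: sum of the values divided by their number
arithmetic = sum(values) / n

# Geometric mean: Nth root of the product of the N values
geometric = statistics.geometric_mean(values)

# Harmonic mean: reciprocal of the arithmetic mean of the reciprocals
harmonic = n / sum(1 / v for v in values)

# Quadratic mean: square root of the arithmetic mean of the squares
quadratic = math.sqrt(sum(v * v for v in values) / n)

print(harmonic, geometric, arithmetic, quadratic)
```

For positive values that are not all equal, the four means come out in the order harmonic < geometric < arithmetic < quadratic, as this example shows.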

Advantages

1. It is easy to understand.
2. It is simple to calculate.
3. There is no need to order data in ascending or descending manner.
4. All the scores in a distribution are taken into consideration while computing Mean.
5. It is very useful for comparing values.

Limitations

1. It is difficult to assume Mean from frequencies of values alone.
2. It is not appropriate for qualitative analysis.
3. If the frequency of one value is missing, it would be difficult to calculate Mean.
4. The Mean gives more importance to large frequencies than to smaller ones.
5. The same Mean of different categories may give different meanings.
6. It is not appropriate for computing ratios.

Computation of Mean from Ungrouped Data

Direct Method

If X1, X2, X3, ..... , X10 are the scores obtained by 10 students on a test, the arithmetic mean is computed as:

$$M = \frac{X_1 + X_2 + X_3 + \dots + X_{10}}{10}$$

The formula for calculating mean of ungrouped data is

$$\bar{X} = \frac{\sum X}{N}$$

Where,

$\sum X$ is the sum of scores of the distribution

N is the total number of scores in the distribution.

Example 2.1: Consider the marks obtained by 10 students in an achievement test in Psychology. Marks: 65, 76, 50, 80, 73, 64, 57, 45, 78, 82. Compute the mean marks from the data given.


Marks (X)
---------
65
76
50
80
73
64
57
45
78
82
---------
$\sum X = 670$
=========

$$\text{Mean} = \frac{\sum X}{N} = \frac{670}{10} = 67$$
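The direct method for Example 2.1 amounts to one division, as this short Python sketch shows. The variable names are illustrative assumptions.

```python
# Marks of the 10 students from Example 2.1
marks = [65, 76, 50, 80, 73, 64, 57, 45, 78, 82]

total = sum(marks)  # sum of scores
n = len(marks)      # number of scores, N
mean = total / n    # mean = sum of scores / N

print(total, n, mean)
```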

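Short-cut Method

$$\bar{X} = A + \frac{\sum d}{N}$$

Where,
A is the assumed mean
d is the deviation of a score from the assumed mean, d = (X – A)
N is the number of scores in the distribution

X         d = (X – A)
65        1
76        12
50        -14
80        16
73        9
64 (A)    0
57        -7
45        -19
78        14
82        18

$\sum d = 30$

$$\bar{X} = A + \frac{\sum d}{N} = 64 + \frac{30}{10} = 64 + 3 = 67$$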
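The short-cut method can be verified with the following Python sketch, which uses the same assumed mean of 64. The names `assumed_mean` and `deviations` are illustrative.

```python
# Marks of the 10 students from Example 2.1
marks = [65, 76, 50, 80, 73, 64, 57, 45, 78, 82]

assumed_mean = 64  # A, one of the scores chosen as the assumed mean

# d = X - A for every score
deviations = [x - assumed_mean for x in marks]

# Mean = A + (sum of d) / N
mean = assumed_mean + sum(deviations) / len(marks)

print(deviations, sum(deviations), mean)
```

Any value of A gives the same mean; choosing a value near the centre of the scores simply keeps the deviations small and the arithmetic easy.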


Computation of mean from Grouped Data

Direct Method

In a frequency distribution, where all the frequencies are greater than one, the mean is calculated by the formula given below.

$$M = \frac{\sum fX}{N}$$

Where,
X is the mid-point of the class
f is the frequency
N is the total of all frequencies

Example 2.2: Compute the mean from the data given below.

Scores   Frequency (f)
85-89    1
80-84    1
75-79    3
70-74    1
65-69    2
60-64    10
55-59    3
50-54    8
45-49    4
40-44    4
35-39    3

N = 40

Solution:

Scores   Frequency (f)   Mid-point (X)   fX
85-89    1               87              87
80-84    1               82              82
75-79    3               77              231
70-74    1               72              72
65-69    2               67              134
60-64    10              62              620
55-59    3               57              171
50-54    8               52              416
45-49    4               47              188
40-44    4               42              168
35-39    3               37              111

N = 40, $\sum fX = 2280$

$$M = \frac{\sum fX}{N} = \frac{2280}{40} = 57$$
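The direct method for grouped data can be sketched in Python as below. The sketch takes each mid-point as (lower limit + upper limit) / 2, so the mid-point of 75-79 is 77; the identifiers are illustrative assumptions.

```python
# Classes and frequencies from Example 2.2
classes = [(85, 89), (80, 84), (75, 79), (70, 74), (65, 69), (60, 64),
           (55, 59), (50, 54), (45, 49), (40, 44), (35, 39)]
frequencies = [1, 1, 3, 1, 2, 10, 3, 8, 4, 4, 3]

# Mid-point X of each class = (lower limit + upper limit) / 2
midpoints = [(lower + upper) / 2 for lower, upper in classes]

n = sum(frequencies)                                       # N
sum_fx = sum(f * x for f, x in zip(frequencies, midpoints))  # sum of fX
mean = sum_fx / n                                          # M = sum of fX / N

print(n, sum_fx, mean)
```

Computed this way the mean is 57, which agrees with the value obtained by the shortcut method.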

Shortcut Method

$$M = A + \frac{\sum f x'}{N} \times i$$


Where,
A = assumed mean
i = class interval
f = frequency
N = total frequency

$x' = \frac{X - A}{i}$, where X is the mid-point of the class.

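Consider the data given in Example 2.2. Compute mean by using shortcut method. Here the assumed mean is A = 62 (the mid-point of the class 60-64) and the class interval is i = 5.

Scores    Frequency (f)    Mid-point (X)    x' = (X-A)/i    fx'
85-89       1                87                5               5
80-84       1                82                4               4
75-79       3                77                3               9
70-74       1                72                2               2
65-69       2                67                1               2
60-64      10                62                0               0
55-59       3                57               -1              -3
50-54       8                52               -2             -16
45-49       4                47               -3             -12
40-44       4                42               -4             -16
35-39       3                37               -5             -15
           N = 40                                             Σfx' = -40

$$M = A + \frac{\sum fx'}{N} \times i = 62 + \frac{-40}{40} \times 5 = 62 - 5 = 57$$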
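The direct and shortcut computations for Example 2.2 can be sketched in Python as below. This is only an illustration: the function names are hypothetical, and the mid-point of each class is assumed to be the average of its two limits (so that the class 75-79 has mid-point 77).

```python
# (lower limit, upper limit, frequency) for each class in Example 2.2.
classes = [(85, 89, 1), (80, 84, 1), (75, 79, 3), (70, 74, 1), (65, 69, 2),
           (60, 64, 10), (55, 59, 3), (50, 54, 8), (45, 49, 4), (40, 44, 4),
           (35, 39, 3)]


def midpoint(lower, upper):
    """Mid-point of a class, taken as the average of its two limits."""
    return (lower + upper) / 2


def grouped_mean_direct(rows):
    """Direct method: M = sum(f * X) / N."""
    n = sum(f for _, _, f in rows)
    return sum(f * midpoint(lo, hi) for lo, hi, f in rows) / n


def grouped_mean_shortcut(rows, assumed_mean, interval):
    """Shortcut method: M = A + (sum(f * x') / N) * i, with x' = (X - A) / i."""
    n = sum(f for _, _, f in rows)
    sum_fx = sum(f * (midpoint(lo, hi) - assumed_mean) / interval
                 for lo, hi, f in rows)
    return assumed_mean + (sum_fx / n) * interval


print(grouped_mean_direct(classes))           # 57.0
print(grouped_mean_shortcut(classes, 62, 5))  # 57.0
```

Both methods must agree, which makes the shortcut method a convenient check on the arithmetic of the direct method.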

Median

When the items of a series are arranged in ascending or descending order of magnitude, the value of the central item in the series is called the Median.

The Median is the value that divides the distribution into two equal parts, i.e., half of the values lie above the Median and half below it.

Characteristics of Median
1. It is the value that occupies the middle point of the distribution, such that half the items fall above it and half below it.
2. The value of the median does not reflect all the values in a given distribution.
3. A distribution has only one median.
4. Median cannot be manipulated algebraically.
5. Computation of median requires the proper ordering of values.
6. It is possible to compute the median of an open-ended distribution.

Advantages
1. It is simple to calculate.
2. Easy to understand.
3. It is possible to calculate median in all distributions.
4. Median can be calculated even with extreme values.
5. It is very useful in quantitative analysis where order of score is emphasised (i.e., ordinal).

Limitations
1. It has only limited use.
2. Not appropriate for qualitative phenomenon.
3. Not applicable where items are assigned weights.

Computation of Median for Ungrouped Data

i. When the number of items in a distribution (N) is odd

When N, i.e., the number of items in a distribution, is an odd number, Median is computed using the following formula:

Median (Md) = the measure or value of the [(N + 1)/2]th item.

Example 2.3: The marks obtained by 5 students in a test are 42, 50, 64, 56, 35. Compute the Median mark obtained in the test.

The first step in the calculation of Median is to arrange the scores either in ascending or descending order.

By arranging the marks in ascending order we get 35, 42, 50, 56, 64.

Since N = 5, which is an odd number, we compute Median by using the formula

Median (Md) = the measure of the [(N + 1)/2]th item
= the measure of the [(5 + 1)/2]th item
= the measure of the 3rd item, i.e., 50

ii. When the number of items in a distribution (N) is even

When N, i.e., the number of items in a distribution, is an even number, Median is computed using the following formula:

$$\text{Median } (Md) = \frac{\text{Value of } (N/2)^{\text{th}} \text{ item} + \text{Value of } [(N/2) + 1]^{\text{th}} \text{ item}}{2}$$

Example 2.4: The marks obtained by 8 students in an achievement test are 50, 42, 60, 35, 56, 65, 40, 62. Calculate the Median mark obtained.

Arranging the marks in ascending order we get, 35, 40, 42, 50, 56, 60, 62, 65.

$$\text{Median } (Md) = \frac{\text{Value of } (N/2)^{\text{th}} \text{ item} + \text{Value of } [(N/2) + 1]^{\text{th}} \text{ item}}{2}$$

Where,
N = 8
Value of (N/2)th item = 8/2 = 4th item, i.e., 50
Value of [(N/2) + 1]th item = 4 + 1 = 5th item, i.e., 56
Therefore, Median is (50 + 56)/2, i.e., 53.
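The odd and even cases can be combined in one small Python function. This is a sketch only; the name `median_ungrouped` is an illustrative choice, and the two calls use the marks from Examples 2.3 and 2.4.

```python
def median_ungrouped(scores):
    """Median of raw scores: the middle item when N is odd,
    the average of the two middle items when N is even."""
    ordered = sorted(scores)          # first step: arrange in ascending order
    n = len(ordered)
    if n % 2 == 1:
        # (N + 1)/2-th item; subtract 1 because Python lists start at index 0
        return ordered[(n + 1) // 2 - 1]
    lower_mid = ordered[n // 2 - 1]   # (N/2)-th item
    upper_mid = ordered[n // 2]       # [(N/2) + 1]-th item
    return (lower_mid + upper_mid) / 2


print(median_ungrouped([42, 50, 64, 56, 35]))              # 50
print(median_ungrouped([50, 42, 60, 35, 56, 65, 40, 62]))  # 53.0
```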

Example 2.5: The table gives the salary paid to employees in a firm. There are 50 employees working. Compute the median salary paid to employees in a month.

Salary (in thousands):   4   7   8   10   11   12   13   14   15
Number of employees:     3   4   7    9   12    8    4    2    1

Solution:

Salary (in thousands)    No. of employees (f)    Cumulative Frequency (cf)
 4                         3                        3
 7                         4                        7
 8                         7                       14
10                         9                       23
11                        12                       35
12                         8                       43
13                         4                       47
14                         2                       49
15                         1                       50
                          N = 50

$$\text{Median } (Md) = \text{Measure of } \left(\frac{N + 1}{2}\right)^{\text{th}} \text{ item} = \frac{50 + 1}{2} = \frac{51}{2} = 25.5^{\text{th}} \text{ item}$$

Here, the 25.5th item comes after the cumulative frequency 23. Therefore it is included in the cumulative frequency 35, and hence the Median salary is Rs. 11000.
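The cumulative-frequency lookup used in Example 2.5 can be sketched as follows. The function name `median_discrete` is hypothetical, and the sketch assumes that the values are already listed in ascending order. As a simplification, if the (N + 1)/2-th position were to fall exactly between two different values, this function would return the higher one.

```python
from itertools import accumulate

salaries = [4, 7, 8, 10, 11, 12, 13, 14, 15]   # in thousands of rupees
employees = [3, 4, 7, 9, 12, 8, 4, 2, 1]       # frequency of each salary


def median_discrete(values, freqs):
    """Return the value whose cumulative frequency first reaches (N + 1)/2."""
    cumulative = list(accumulate(freqs))
    position = (cumulative[-1] + 1) / 2        # cumulative[-1] is N
    for value, cf in zip(values, cumulative):
        if cf >= position:
            return value


print(list(accumulate(employees)))           # [3, 7, 14, 23, 35, 43, 47, 49, 50]
print(median_discrete(salaries, employees))  # 11
```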

Computation of Median for Grouped Data

Consider the following example for computation of Median for grouped data or data in continuous series.

Example 2.6: The monthly income of staff members of an institution

Monthly Income    No. of Staff
2000 – 2500          3
1500 – 2000         14
1000 – 1500         27
500 – 1000          34
0 – 500             46

$$Md = l + \frac{i\,(N/2 - F)}{f}$$

Where,
l = Exact or actual lower limit of the median class
F = Total of all frequencies before the median class
f = Frequency of the median class
i = Class interval
N = Total frequencies

Monthly Income    f      F
2000 – 2500        3    124
1500 – 2000       14    121
1000 – 1500       27    107
500 – 1000        34     80
0 – 500           46     46

The median class can be found as follows:

Firstly, find N/2 = 124/2, viz., 62.

Then, find the cumulative frequency in which 62 can be included. Here, 62 can be included in the cumulative frequency (F) 80. Therefore the median class is 500 – 1000.

Now, applying the formula we get,

$$Md = l + \frac{i\,(N/2 - F)}{f} = 499.5 + \frac{500\,(62 - 46)}{34} = 499.5 + \frac{500 \times 16}{34} = 734.79$$
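The steps of Example 2.6 (locating the median class and then applying the formula) can be sketched in Python. The helper names `find_median_class` and `grouped_median` are illustrative. The exact lower limit is passed in explicitly as 499.5, the value taken in the worked solution.

```python
# (lower limit, upper limit, frequency) from Example 2.6, lowest class first.
income = [(0, 500, 46), (500, 1000, 34), (1000, 1500, 27),
          (1500, 2000, 14), (2000, 2500, 3)]


def find_median_class(rows):
    """Return (lower, upper, f, F, N) for the class containing the N/2-th item."""
    n = sum(f for _, _, f in rows)
    cf_before = 0                       # F: total frequency below the class
    for lower, upper, f in rows:
        if cf_before + f >= n / 2:
            return lower, upper, f, cf_before, n
        cf_before += f


def grouped_median(exact_lower, interval, n, cf_before, freq):
    """Md = l + i * (N/2 - F) / f."""
    return exact_lower + interval * (n / 2 - cf_before) / freq


lower, upper, f, cf_before, n = find_median_class(income)
print(lower, upper, f, cf_before, n)    # 500 1000 34 46 124
print(round(grouped_median(499.5, upper - lower, n, cf_before, f), 2))  # 734.79
```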

Mode

Mode is the value or measure that occurs most frequently in a distribution. It is the score or value that corresponds to the maximum frequency of the distribution.

Characteristics of Mode
1. It is the most frequently occurring value in a distribution.
2. A distribution may have two or more modes.
3. Mode does not reflect the other values in a given distribution.
4. It cannot be manipulated algebraically.
5. The computation of mode requires proper ordering of data.
6. It is possible to calculate the mode of an open-ended distribution.


Advantages
1. Mode can be easily computed.
2. It can also be identified from a graph.
3. It is not affected by extreme values.
4. It is very useful for business purposes.

Limitations
1. It is not a stable measure of central tendency.
2. It cannot be put to algebraic treatment.
3. It remains indeterminate when there exist two or more modal values in a series.
4. It is not suitable where the relative importance of items is under consideration.

Computation of Mode from Ungrouped Data

In the case of ungrouped data, mode is the value or score that occurs the maximum number of times in a distribution. That is, it is the value or measure that has the maximum frequency.

Example 2.7: Compute mode from the following distribution: 34, 23, 45, 34, 48, 54, 56, 34, 76, 45.

Here, 34 occurs the most number of times, i.e., three times. Hence, in the example given, the value of mode is 34.
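Counting the frequency of each score, as done in Example 2.7, can be sketched with Python's standard `collections.Counter`. The function name `modes` is an illustrative choice; it returns a list because a distribution may have two or more modes.

```python
from collections import Counter


def modes(scores):
    """Return every value that occurs with the maximum frequency."""
    counts = Counter(scores)            # value -> number of occurrences
    top = max(counts.values())
    return [value for value, count in counts.items() if count == top]


scores = [34, 23, 45, 34, 48, 54, 56, 34, 76, 45]
print(Counter(scores)[34])  # 3
print(modes(scores))        # [34]
```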

Computation of Mode from Grouped Data

In data which is given in the form of a frequency distribution (grouped data or continuous series), Mode is computed using the formula,

Mode (Mo) = 3Md – 2M

Where, Md is the median and M is the Mean of the given distribution. The Mean and Median are first computed and subsequently Mode is computed.

Mode can also be computed directly from the frequency distribution table without calculating mean and median. For this, the following formula is used:

$$M_o = l_1 + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times (l_2 - l_1)$$

Where,
l1 = lower limit of the modal class
l2 = upper limit of the modal class
f1 = frequency of the modal class
f0 = frequency of the class preceding (before) the modal class
f2 = frequency of the class succeeding (after) the modal class
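The two formulas above can be written as small Python functions. This is a sketch only: the function names are hypothetical, and the numbers in the usage lines are made-up values chosen purely to illustrate the arithmetic; they do not come from the study material.

```python
def empirical_mode(median, mean):
    """Mode (Mo) = 3Md - 2M."""
    return 3 * median - 2 * mean


def grouped_mode(l1, l2, f0, f1, f2):
    """Mo = l1 + (f1 - f0) / (2*f1 - f0 - f2) * (l2 - l1)."""
    return l1 + (f1 - f0) / (2 * f1 - f0 - f2) * (l2 - l1)


# Hypothetical modal class 20-30 with frequency 10, preceded by a class
# with frequency 4 and followed by a class with frequency 6.
print(grouped_mode(l1=20, l2=30, f0=4, f1=10, f2=6))  # 26.0

# Hypothetical distribution with median 50 and mean 52.
print(empirical_mode(50, 52))                         # 46
```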

Example 2.8: The following data relates to the different income groups of 45 farmers in a village.

Income groups      No. of farmers
30000 – 35000        2
35000 – 40000        5
40000 – 45000       10
45000 – 50000        8
50000 – 55000        3
55000 – 60000       10
60000 – 65000        7
                    N = 45

Solution:

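$$M_o = l_1 + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times (l_2 - l_1)$$

$$M_o = 45000 + \frac{8 - 10}{2(8) - (10) - (3)} \times 5000$$

$$= 45000 + \frac{-2}{16 - 13} \times 5000$$

$$= 45000 - \frac{10000}{3} = 45000 - 3333 = 41667$$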


Module 3: Measures of Dispersion

Measures of central tendency provide a single value that can be used to represent the characteristic of an entire distribution or group. But they do not show how the individual scores are ‘spread’ or ‘scattered’, which is very important in cases where we have to describe and compare two or more frequency distributions or sets of scores.

There is a tendency for data to be dispersed, scattered or to show variability around the average. The tendency of scores to ‘scatter’ or ‘spread’ or deviate from the average or central value is termed dispersion or variability. It is to be noted that if dispersion is less, the average is more representative of the distribution, and vice versa.

Measures of Dispersion

The measure of dispersion gives the degree of variability or dispersion by a single value, which tells us how the individual scores are scattered or spread throughout the distribution or data. There are four measures of variability or dispersion. They are the following:

1. Range (R)
2. Quartile Deviation (QD)
3. Average Deviation (AD)
4. Standard Deviation (SD)

1. Range (R)

Range is the simplest measure of variability or dispersion. It is computed by subtracting the lowest score in the series from the highest score. The lower the range, the less scattered the scores; the higher the range, the more scattered the scores. However, range is a very crude or rough measure as it takes into account only the extreme values and ignores the variation of individual items.

Range (R) = Largest value – Smallest value

Coefficient of Range

For comparative purposes, an absolute measure has to be converted into a relative measure. This is done by computing a coefficient. Here we are considering the range, and hence we have to compute the coefficient of range:

$$\text{Coefficient of range} = \frac{\text{Largest Value} - \text{Smallest Value}}{\text{Largest Value} + \text{Smallest Value}}$$
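As an illustration of the two formulas, the sketch below applies them to the ten marks used in the first mean example of the previous module (65, 76, 50, ...). The choice of data and the function names are assumptions made only for this illustration.

```python
def value_range(scores):
    """Range (R) = largest value - smallest value."""
    return max(scores) - min(scores)


def coefficient_of_range(scores):
    """(largest - smallest) / (largest + smallest)."""
    largest, smallest = max(scores), min(scores)
    return (largest - smallest) / (largest + smallest)


marks = [65, 76, 50, 80, 73, 64, 57, 45, 78, 82]
print(value_range(marks))                     # 37
print(round(coefficient_of_range(marks), 3))  # 0.291
```

Because the coefficient is a ratio, it has no unit and can be compared across sets of scores measured on different scales.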

Quartile Deviation (QD)

The total distribution is divided into four parts of 25% each by the quartiles Q1, Q2 and Q3. Quartile Deviation (QD) is one half of the difference between the 3rd quartile (Q3) and the 1st quartile (Q1). The formula for Quartile Deviation is given below:

$$QD = \frac{Q_3 - Q_1}{2}$$

Where,

$$Q_3 = l + \frac{i\,(3N/4 - F)}{f}$$

$$Q_1 = l + \frac{i\,(N/4 - F)}{f}$$

Here l, F, f and i have the same meaning as in the formula for the Median, but refer to the class in which the quartile falls.

The value Q3 – Q1 is the difference or range between the 3rd and the 1st quartiles. This value is also called the interquartile range. While computing Quartile Deviation, this interquartile range is divided by 2, and hence Quartile Deviation is also called the semi-interquartile range.

Example 3.1: Compute quartile deviation from the data given below.

Class     f      F
90-99      1    100
80-89      5     99
70-79     12     94
60-69     20     82    Q3
50-59     26     62    Q2
40-49     13     36    Q1
30-39      8     23
20-29      7     15
10-19      4      8
0-9        4      4
          N = 100

For the 3rd quartile, 3N/4 = (3 × 100)/4 = 75, where 75 is included in the cumulative frequency 82. Hence Q3 falls in the class 60 – 69.

For the 2nd quartile, N/2 = 50 is included in the cumulative frequency 62, hence the median class is 50 – 59.

For the 1st quartile, N/4 = 25 is included in the cumulative frequency 36. Hence Q1 falls in the class 40 – 49.

$$QD = \frac{Q_3 - Q_1}{2}$$

$$Q_3 = l + \frac{i\,(3N/4 - F)}{f} = 59.5 + \frac{10\left(\frac{3 \times 100}{4} - 62\right)}{20} = 59.5 + \frac{10\,(75 - 62)}{20} = 59.5 + \frac{10 \times 13}{20} = 66$$

$$Q_1 = l + \frac{i\,(N/4 - F)}{f} = 39.5 + \frac{10\,(100/4 - 23)}{13} = 39.5 + \frac{10\,(25 - 23)}{13} = 39.5 + \frac{10 \times 2}{13} = 41.04$$

$$QD = \frac{Q_3 - Q_1}{2} = \frac{66 - 41.04}{2} = 12.48$$
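The arithmetic of Example 3.1 can be reproduced with a short Python sketch. The function name `quartile_point` is illustrative; its arguments are the exact lower limit (l), class interval (i), the position N/4 or 3N/4, the cumulative frequency below the class (F) and the frequency of the class (f).

```python
def quartile_point(lower_limit, interval, position, cf_before, freq):
    """Q = l + i * (position - F) / f, where position is N/4 or 3N/4."""
    return lower_limit + interval * (position - cf_before) / freq


n = 100
q3 = quartile_point(59.5, 10, 3 * n / 4, 62, 20)   # class 60-69
q1 = quartile_point(39.5, 10, n / 4, 23, 13)       # class 40-49
qd = (q3 - q1) / 2                                 # semi-interquartile range

print(round(q3, 2), round(q1, 2), round(qd, 2))    # 66.0 41.04 12.48
```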

Mean Deviation or Average Deviation

Garrett (1971) defines Average Deviation as the mean of deviations of all the separate scores in the series taken from their mean. This measure of variability takes into account the fluctuation or variation of all the items in a series.

Computation of Mean Deviation from Ungrouped Data

The following formula is used for ungrouped data:

$$MD = \frac{\sum |x|}{N}$$

Where,

x = X – X̄

X is the raw score
X̄ (also written M) is the Mean value
|x| is the absolute value of x, i.e., the value of x ignoring the signs +ve or –ve.

Example 3.2: Find the Mean Deviation of the scores 35, 32, 17, 20, 31.

Solution:

N = 5

Mean = (35 + 32 + 17 + 20 + 31) / 5 = 135 / 5 = 27

X      x = X – X̄     |x|
35       8             8
32       5             5
17     -10            10
20      -7             7
31       4             4
N = 5                 Σ|x| = 34

$$MD = \frac{\sum |x|}{N} = \frac{34}{5} = 6.8$$
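Example 3.2 can be verified with the following sketch; the function name `mean_deviation` is an illustrative choice.

```python
def mean_deviation(scores):
    """MD = sum(|X - mean|) / N for raw (ungrouped) scores."""
    n = len(scores)
    mean = sum(scores) / n
    return sum(abs(x - mean) for x in scores) / n


print(mean_deviation([35, 32, 17, 20, 31]))  # 6.8
```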

Computation of Mean Deviation from Grouped Data

The following formula is used to compute Mean Deviation for grouped data:

$$MD = \frac{\sum |fx|}{N}$$

Example 3.3: Compute mean deviation from the data given below.

Scores    Frequency
50-54       3
45-49       4
40-44       6
35-39      11
30-34      14
25-29      12
20-24       9
15-19       4
10-14       2

Solution:

Scores    f     X     fX     x = X – X̄     fx     |fx|
50-54      3    52    156      20            60      60
45-49      4    47    188      15            60      60
40-44      6    42    252      10            60      60
35-39     11    37    407       5            55      55
30-34     14    32    448       0             0       0
25-29     12    27    324      -5           -60      60
20-24      9    22    198     -10           -90      90
15-19      4    17     68     -15           -60      60
10-14      2    12     24     -20           -40      40
N = 65          ΣfX = 2065                          Σ|fx| = 485


$$\text{Mean or } \bar{X} = \frac{\sum fX}{N} = \frac{2065}{65} = 31.77 \approx 32$$

$$MD = \frac{\sum |fx|}{N} = \frac{485}{65} = 7.46$$
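The grouped computation in Example 3.3 can be sketched as below. The function name and the `round_mean` flag are assumptions made for this illustration: the worked solution rounds the mean 31.77 to 32 before taking deviations, so the sketch does the same by default.

```python
# (mid-point X, frequency f) for each class in Example 3.3.
rows = [(52, 3), (47, 4), (42, 6), (37, 11), (32, 14),
        (27, 12), (22, 9), (17, 4), (12, 2)]


def grouped_mean_deviation(rows, round_mean=True):
    """MD = sum(f * |X - mean|) / N for a frequency distribution."""
    n = sum(f for _, f in rows)
    mean = sum(f * x for x, f in rows) / n
    if round_mean:
        mean = round(mean)   # the worked solution takes 31.77 as 32
    return sum(f * abs(x - mean) for x, f in rows) / n


print(round(grouped_mean_deviation(rows), 2))  # 7.46
```

Calling the function with `round_mean=False` uses the unrounded mean and gives a slightly different value (about 7.50).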

2. Variance and Standard Deviation

Variance is the measure of dispersion which eliminates the sign problem caused by the negative deviations cancelling out the positive deviations. The procedure is to square the deviation scores and divide their sum by the number of scores in the distribution.

$$\text{Variance } (S^2) = \frac{\sum (X - \bar{X})^2}{n}$$

Standard Deviation (SD) is regarded as the most stable measure of variability as the mean is used for its computation. Standard Deviation of a set of scores is defined as the square root of the average of the squares of the deviations of each score from the mean. It will always be a positive number. SD explains how much dispersion there is in the distribution of the given data. Standard Deviation is interpreted as an index of variation. The larger the standard deviation, the greater is the variation or spread of the scores in the distribution. If there is no variation of scores, then the standard deviation is zero.

Standard deviation is often referred to as root mean square deviation and is denoted by the Greek letter sigma (σ). Since the algebraic signs +ve and –ve are not ignored, it is more accurate than Mean Deviation.

Characteristics of Standard Deviation
1. It is the most important measure of dispersion.
2. It measures variability or spread of scores in a distribution.
3. Standard deviation will be a positive number.
4. It is a more accurate and justified measure of dispersion.
5. It is more accurate than mean deviation since + and – signs are not ignored in the calculation.

The formula for computing SD is given below.

$$SD = \sqrt{\frac{\sum (X - \bar{X})^2}{N}} = \sqrt{\frac{\sum x^2}{N}}$$

Where,

X = individual score
X̄ = mean of all scores
N = total number of items
x = deviation of each score from the mean, i.e., X – X̄

Computation of Standard Deviation from Ungrouped Data

Standard deviation of ungrouped data can be computed using the formula given below.

$$SD = \sqrt{\frac{\sum x^2}{N}}$$

Example 3.4: Compute standard deviation of the following distribution.

Score: 68, 62, 58, 64, 52, 58, 50, 68

$$\text{Mean} = \frac{\sum X}{N} = \frac{68 + 62 + 58 + 64 + 52 + 58 + 50 + 68}{8} = \frac{480}{8} = 60$$

Score (X)    x = X – X̄     x²
68             8             64
62             2              4
58            -2              4
64             4             16
52            -8             64
58            -2              4
50           -10            100
68             8             64
                            Σx² = 320

$$SD\,(\sigma) = \sqrt{\frac{\sum x^2}{N}} = \sqrt{\frac{320}{8}} = \sqrt{40} = 6.32$$
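Example 3.4 can be checked with the sketch below. The function name `standard_deviation` is illustrative. It divides by N, as the formula in the text does, which matches `statistics.pstdev` (the population standard deviation) in Python's standard library.

```python
import math
import statistics


def standard_deviation(scores):
    """SD = sqrt(sum((X - mean)^2) / N), dividing by N as in the text."""
    n = len(scores)
    mean = sum(scores) / n
    sum_sq = sum((x - mean) ** 2 for x in scores)   # sum of x squared
    return math.sqrt(sum_sq / n)


scores = [68, 62, 58, 64, 52, 58, 50, 68]
print(round(standard_deviation(scores), 2))    # 6.32
print(round(statistics.pstdev(scores), 2))     # 6.32
```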

Computation of Standard Deviation from Grouped Data

Standard deviation of grouped data can be computed using the formula given below.

$$SD = \sqrt{\frac{\sum fx^2}{N}}$$

Here x = X – X̄ is the deviation of the class mid-point (X) from the mean, f is the frequency of the class and N is the total of all frequencies. The mean is computed as

$$\text{Mean} = \frac{\sum fX}{N}$$

Example 3.5: Compute Standard Deviation for the frequency distribution given below. The mean of the distribution is 115.

Scores      Frequency
127-129       1
124-126       2
121-123       3
118-120       1
115-117       6
112-114       4
109-111       3
106-108       2
103-105       1
100-102       1
             N = 24

Solution:

Scores      Frequency    X      x = X – X̄     x²     fx²
127-129       1          128      13           169     169
124-126       2          125      10           100     200
121-123       3          122       7            49     147
118-120       1          119       4            16      16
115-117       6          116       1             1       6
112-114       4          113      -2             4      16
109-111       3          110      -5            25      75
106-108       2          107      -8            64     128
103-105       1          104     -11           121     121
100-102       1          101     -14           196     196
             N = 24                                    Σfx² = 1074

$$SD = \sqrt{\frac{\sum fx^2}{N}} = \sqrt{\frac{1074}{24}} = \sqrt{44.75} = 6.69$$
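The grouped computation of Example 3.5 can be sketched as follows. The function name `grouped_sd` is illustrative, and the mean of 115 is passed in as given in the example.

```python
import math

# (mid-point X, frequency f) for each class in Example 3.5.
rows = [(128, 1), (125, 2), (122, 3), (119, 1), (116, 6),
        (113, 4), (110, 3), (107, 2), (104, 1), (101, 1)]


def grouped_sd(rows, mean):
    """SD = sqrt(sum(f * (X - mean)^2) / N) for a frequency distribution."""
    n = sum(f for _, f in rows)
    sum_fx2 = sum(f * (x - mean) ** 2 for x, f in rows)
    return math.sqrt(sum_fx2 / n)


print(sum(f * (x - 115) ** 2 for x, f in rows))  # 1074
print(round(grouped_sd(rows, 115), 2))           # 6.69
```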

Coefficient of Variation or Coefficient of Relative Variability

It is often desirable to compare variabilities when means are unequal or when units of measurement from test to test are incommensurable. A statistic useful in making such comparisons is the coefficient of variation or V, sometimes called the coefficient of relative variability. This measure was first suggested by Karl Pearson as the percentage variation in a mean, the standard deviation being treated as the total variation in the mean.

Coefficient of variation stands for the percentage which the value of the standard deviation is of the value of the mean. That is, if the standard deviation is divided by the mean and multiplied by 100, we get the coefficient of variation.


The following formula is used for computing coefficient of variation:

$$\text{Coefficient of Variation } (V) = \frac{\text{Standard Deviation}}{\text{Mean}} \times 100$$

Example 3.6: The mean of a distribution is 50 and SD is 10. Find the coefficient of variation.

Solution:

$$\text{Coefficient of Variation } (V) = \frac{10}{50} \times 100 = 20, \text{ i.e., } 20\%$$
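Example 3.6 amounts to a one-line calculation, sketched below; the function name `coefficient_of_variation` is an illustrative choice.

```python
def coefficient_of_variation(sd, mean):
    """V = (SD / Mean) * 100, expressed as a percentage."""
    # Multiplying before dividing keeps the result exact for whole numbers.
    return sd * 100 / mean


print(coefficient_of_variation(10, 50))  # 20.0
```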

It means that the SD is 20% of the mean. Coefficient of variation (V) is a primary tool in the statistical analysis of data because, being expressed as a percentage, the units of the variables can be ignored. Problems relating to conversion of different units of the variables into a standard unit for the purpose of uniform expression do not arise. Coefficient of variation is simply the SD expressed as a percentage of the mean of a given distribution.

*********************