Correlation Analysis

COMPILED BY

MUTHAMA, JAPHETH MUTINDA

CORRELATION

INTRODUCTION

Objectives of the presentation

After going through this presentation, the listener is expected to be able to:

1. Present the results of analysed research data.

2. Interpret effectively the relationship between research variables.

3. Draw implications or inferences from the variables in the study model.

Definition

Correlation (r) is the statistical measure of how two variables move in relation to each other.

It measures the relative strength of the relationship between two variables.

Correlation is expressed as what is known as the correlation coefficient, which ranges between -1 and +1.

Coefficient of correlation

The coefficient of correlation is a measure of the degree of correlation between two or more variables across the observed values of the study variables.

The correlation, if any, found through this approach is then applied in a statistical method that formulates a mathematical model of the relationship among the variables. That model can be used to predict the values of the dependent variable, given the values of the independent variable.

The sample correlation coefficient (r) measures the degree of linearity in the relationship between X and Y.

-1 ≤ r ≤ +1

r = 0 : Indicates no linear relationship between the research variables

The + and - signs indicate positive linear correlations and negative linear correlations respectively.

[Figure: Coefficient of Correlation Analysis, a scale running from a strong negative relationship to a strong positive relationship]

Interpreting Correlation Coefficient (r)

1) Strong correlation: r > 0.70 or r < -0.70

2) Moderate correlation: r is between 0.30 and 0.70, or r is between -0.30 and -0.70

3) Weak correlation: r is between 0 and 0.30, or r is between 0 and -0.30
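The cut-offs above can be turned into a small helper. This is a minimal sketch, not part of the original presentation: the function name `correlation_strength` is hypothetical, and because the slide does not say which band the boundary values 0.30 and 0.70 belong to, the sketch assumes they count as moderate.

```python
def correlation_strength(r: float) -> str:
    """Label the size of r using the 0.30 and 0.70 cut-offs from the slide.

    Assumption: the boundary values |r| = 0.30 and |r| = 0.70 are
    treated as "moderate", since the slide leaves them unassigned.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)  # the sign only gives the direction, not the strength
    if size > 0.70:
        return "strong"
    if size >= 0.30:
        return "moderate"
    return "weak"


for value in (-0.92, 0.50, 0.10):
    print(value, correlation_strength(value))
```

The sign of r is ignored on purpose: it tells the direction of the relationship, while the absolute value tells its strength.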

Methods of studying Correlation

Correlation can be determined by use of the following methods:

1. Scatter Diagram Method

2. Karl Pearson's Coefficient of Correlation Method

3. Spearman’s Rank Correlation Method

SCATTER DIAGRAMS

This is a graph in which the individual data points are plotted in two dimensions, as presented below:

[Figure: two scatter diagrams, labelled "Very good fit" and "Moderate fit"]

Points clustered closely around a line show a strong correlation. The line is then a good predictor of (a good fit to) the data. The more spread out the points, the weaker the correlation and the poorer the fit.

The line is a REGRESSION line (Y = a + bX).

A strong relationship simply means a good linear fit.

Coefficient of determination and the regression line

NOTE:

1. The coefficient of determination (r²) is a measure of how well the regression line represents the data. It gives the proportion of the total variation in Y that is explained by the line of best fit.

2. If the regression line passed exactly through every point on the scatter plot, it would be able to explain all of the variation.

3. The further the line is away from the points, the less it is able to explain the variation.

Cont…

For example in the case of variables X and Y:

If r = 0.922, then r² = 0.850.

This means that 85% of the total variation in Y can be explained by the linear relationship between X and Y (as described by the regression equation).

The other 15% of the total variation in Y therefore remains unexplained.
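The "explained variation" reading of r² can be checked numerically. The sketch below is an illustration only: it fits the line Y = a + bX by least squares and computes r² as 1 minus the ratio of residual variation to total variation. It borrows the X and Y values from the Karl Pearson illustration in this presentation, and the function names `fit_line` and `r_squared` are hypothetical.

```python
# Data taken from the Karl Pearson illustration in this presentation.
X = [6, 2, 10, 4, 8]
Y = [9, 11, 5, 8, 7]


def fit_line(xs, ys):
    """Return the least-squares intercept a and slope b for Y = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def r_squared(xs, ys):
    """Share of the total variation in Y explained by the fitted line."""
    a, b = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_residual = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return 1 - ss_residual / ss_total


a, b = fit_line(X, Y)
print(f"Y = {a:.2f} + ({b:.2f})X, r^2 = {r_squared(X, Y):.3f}")
```

For this data the explained share comes out at roughly 0.845, so about 85% of the variation in Y is accounted for by the line and about 15% is left unexplained.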

Karl Pearson’s coefficient of correlation (or simple correlation)

This is the most widely used method of measuring the degree of relationship between two variables.

It is defined as the measure of the strength of the linear relationship between two variables, computed as the (sample) covariance of the variables divided by the product of their (sample) standard deviations.

This coefficient assumes the following:

(i) that there is a linear relationship between the two variables;

(ii) that the two variables are causally related, which means that one of the variables is independent and the other one is dependent;

(iii) that a large number of independent causes are operating in both variables so as to produce a normal distribution.

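Karl Pearson's coefficient of correlation can be worked out thus:

\[ r_{xy} = \frac{n\sum XY - \sum X \sum Y}{\sqrt{n\sum X^2 - (\sum X)^2}\,\sqrt{n\sum Y^2 - (\sum Y)^2}} \]

- Shared variability of the X and Y variables: on the top

- Individual variability of the X and Y variables: at the bottom

OR

\[ r = \frac{\operatorname{cov}(x, y)}{\sigma_x \cdot \sigma_y} \]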
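The raw-sums form and the covariance form of r on this slide are the same quantity. A short derivation follows, assuming the covariance and both standard deviations are computed with the divisor n. The slides do not state the divisor, but any common divisor cancels in the same way.

```latex
\begin{align*}
\operatorname{cov}(x, y) &= \frac{1}{n}\sum (X - \bar{X})(Y - \bar{Y})
  = \frac{n\sum XY - \sum X \sum Y}{n^2}, \\
\sigma_x &= \sqrt{\frac{1}{n}\sum (X - \bar{X})^2}
  = \frac{\sqrt{n\sum X^2 - (\sum X)^2}}{n}, \\
\sigma_y &= \sqrt{\frac{1}{n}\sum (Y - \bar{Y})^2}
  = \frac{\sqrt{n\sum Y^2 - (\sum Y)^2}}{n}, \\
r &= \frac{\operatorname{cov}(x, y)}{\sigma_x \, \sigma_y}
  = \frac{n\sum XY - \sum X \sum Y}
         {\sqrt{n\sum X^2 - (\sum X)^2}\,\sqrt{n\sum Y^2 - (\sum Y)^2}}.
\end{align*}
```

The factor of n² in the covariance cancels against the two factors of n in the standard deviations, which is why either form can be used for hand calculation.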

Illustration

From the following data, find the coefficient of correlation by the Karl Pearson method:

X: 6, 2, 10, 4, 8

Y: 9, 11, 5, 8, 7

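Solution (cont.)

\[ \bar{X} = \frac{\sum X}{N} = \frac{30}{5} = 6, \qquad \bar{Y} = \frac{\sum Y}{N} = \frac{40}{5} = 8 \]

\[ r = \frac{\sum xy}{\sqrt{\sum x^2 \cdot \sum y^2}} = \frac{-26}{\sqrt{40 \times 20}} = \frac{-26}{\sqrt{800}} = -0.92 \]

where x = X - X̄ and y = Y - Ȳ are the deviations from the means.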
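The hand calculation for this illustration can be cross-checked in a few lines. The sketch below is an illustration only, with hypothetical function names. It computes r twice, once from raw sums and once as covariance over the product of standard deviations, assuming the divisor N for both the covariance and the standard deviations.

```python
from math import sqrt

# Data from the illustration above.
X = [6, 2, 10, 4, 8]
Y = [9, 11, 5, 8, 7]


def pearson_raw_sums(xs, ys):
    """r from the raw sums of X, Y, XY, X^2 and Y^2."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator


def pearson_covariance(xs, ys):
    """r as cov(x, y) / (sd_x * sd_y), using the divisor N throughout."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (sd_x * sd_y)


print(round(pearson_raw_sums(X, Y), 2))
print(round(pearson_covariance(X, Y), 2))
```

Both routes give about -0.92, a strong negative correlation: as X increases, Y generally decreases.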

Spearman's rank coefficient

This is the technique of determining the degree of correlation between two variables in the case of ordinal data, where ranks are given to the different values of the variables.

The main objective of the coefficient is to determine the extent to which the two sets of rankings are similar or dissimilar.

This method is typically used to determine correlation when the data are not available in numerical form.

Thus, when the values of the two variables are converted to their ranks and the correlation is obtained from those ranks, the correlation is known as rank correlation.

Computation of Rank Correlation

Spearman’s rank correlation coefficient ρ (denoted R in this presentation's formulas) can be calculated when:

• Actual ranks are given

• Ranks are not given, but grades are given and none is repeated

• Ranks are not given, and grades are given with some repeated

\[ R = 1 - \frac{6 \sum D^2}{N(N^2 - 1)} \]

where

D = R_x - R_y

R_x = rank of X

R_y = rank of Y

Illustration

Calculate Spearman’s rank correlation coefficient between advertisement cost and sales from the following data:

Advertisement cost : 39, 65, 62, 90, 82, 75, 25, 98, 36, 78

Sales(Shs): 47, 53, 58, 86, 62, 68, 60, 91, 51, 84

X     Y     R-x   R-y   D     D²
39    47    8     10    -2    4
65    53    6     8     -2    4
62    58    7     7     0     0
90    86    2     2     0     0
82    62    3     5     -2    4
75    68    5     4     1     1
25    60    10    6     4     16
98    91    1     1     0     0
36    51    9     9     0     0
78    84    4     3     1     1

ΣD² = 30

Cont….

\[ R = 1 - \frac{6 \sum D^2}{N^3 - N} = 1 - \frac{6(30)}{10^3 - 10} = 1 - \frac{180}{990} = 0.82 \]
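The ranking and the final figure for this illustration can be reproduced with a short script. This is a minimal sketch with hypothetical function names. It assumes rank 1 goes to the largest value, which matches the ranks used in this illustration, and it assumes there are no tied values, so it does not cover the repeated-grades case.

```python
# Advertisement cost (X) and sales (Y) from the illustration above.
X = [39, 65, 62, 90, 82, 75, 25, 98, 36, 78]
Y = [47, 53, 58, 86, 62, 68, 60, 91, 51, 84]


def ranks_descending(values):
    """Rank each value, with rank 1 for the largest; ties are not handled."""
    if len(set(values)) != len(values):
        raise ValueError("tied values need average ranks, not handled here")
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]


def spearman_r(xs, ys):
    """R = 1 - 6 * sum(D^2) / (N^3 - N), with D the difference in ranks."""
    n = len(xs)
    rank_x = ranks_descending(xs)
    rank_y = ranks_descending(ys)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n ** 3 - n)


print(ranks_descending(X))
print(ranks_descending(Y))
print(round(spearman_r(X, Y), 2))
```

The printed ranks match the R-x and R-y columns of the table, and the coefficient rounds to 0.82, a strong positive rank correlation between advertisement cost and sales.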

Nonlinear Relationships

In correlation analysis, not all relationships are linear.

In cases where there is clear evidence of a nonlinear relationship, DO NOT use Pearson’s Product Moment Correlation (r) to summarize the strength of the relationship between Y and X.
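A small example shows why. The sketch below uses made-up data, not data from the presentation, in which Y is completely determined by X through Y = X², yet Pearson's r reports no linear association. The function name `pearson_r` is hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r from deviations about the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: a perfect but nonlinear (U-shaped) relationship.
X = [-3, -2, -1, 0, 1, 2, 3]
Y = [x ** 2 for x in X]

print(pearson_r(X, Y))
```

Here r comes out as zero even though knowing X tells you Y exactly, so a low r rules out a linear relationship only, not a relationship of any kind.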

[Figure: Nonlinear correlation scatter graph]

Conclusions

Correlation is the linear association between two numeric variables, e.g. variables X and Y.

The correlation (r) ranges from -1 to +1, that is, -1 ≤ r ≤ +1.

If r < 0 then there is a negative correlation between X and Y, i.e. as X increases Y generally decreases.

If r > 0 then there is a positive correlation between X and Y, i.e. as X increases Y generally increases.

The closer r is to 0, the weaker the linear association between X and Y.

A diagram explaining different strengths of correlations

The value of r ranges between (-1) and (+1).

The value of r denotes the strength of the association, as illustrated by the following diagram.

r = -1            perfect correlation (indirect)
-1 to -0.75       strong (indirect)
-0.75 to -0.25    intermediate (indirect)
-0.25 to 0        weak (indirect)
r = 0             no relation
0 to 0.25         weak (direct)
0.25 to 0.75      intermediate (direct)
0.75 to 1         strong (direct)
r = +1            perfect correlation (direct)

Example of graphs and their interpretation

Negative and positive correlations

No Relationship (r = .00)

Information about Explanatory Flexibility tells you nothing about Emotional Insight

[Figure: scatter plot of ASIS - Emotional Insight (vertical axis) against Explanatory Flexibility (horizontal axis)]

THANK YOU