BIOSTATISTICS CORRELATION AND REGRESSION, ANOVA SHRIVARDHAN DHEEMAN GURUKUL KANGRI UNIVERSITY HARIDWAR


Page 1: Correlation and Regression; ANOVA

BIOSTATISTICS: CORRELATION AND REGRESSION, ANOVA

SHRIVARDHAN DHEEMAN

GURUKUL KANGRI UNIVERSITY

HARIDWAR

Page 2: Correlation and Regression; ANOVA

CORRELATION / CORRELATION ANALYSIS

Correlation: the tool we use when we want to find a relationship (if one exists) between the two variables (bivariate data) under study.

Correlation analysis: the methods and techniques used for studying and measuring the extent of the relationship between two variables.

Page 3: Correlation and Regression; ANOVA

FIRST, UNDERSTAND THE TERM "BIVARIATE"

Examples of bivariate distributions make the concept clear:

• In a field of 10 plants: the height of each plant and the number of flowers on it
• In a class of 60 students: the marks obtained in two subjects by each student

S. No.   Height of plant   Flowers on plant
1        4                 12
2        3                 10
3        4                 13
4        5                 15
5        5                 16
6        4                 11
7        6                 18
8        3                 9
9        5                 14
10       4                 12

Page 4: Correlation and Regression; ANOVA

[Figure: chart of the two series, height of plant and flowers on plant, for the 10 plants]

Page 5: Correlation and Regression; ANOVA

TYPES OF CORRELATION

Analytical:
• Positive
• Negative

Graphical:
• Linear
• Non-linear

Page 6: Correlation and Regression; ANOVA

POSITIVE CORRELATION

Both variables change in the same direction: as one increases, the other also increases.

e.g.
• Turbidity in a culture and OD (optical density)
• Concentration of antibiotic and zone of clearance

NEGATIVE CORRELATION

The variables change in opposite directions: as one increases, the other decreases.

e.g.
• Volume and pressure of a gas
• Demand for grain and its price

Page 7: Correlation and Regression; ANOVA

LINEAR CORRELATION

This type of correlation is categorized by its graphical representation: a correlation whose graph is a straight line is called a linear correlation.

A unit change in one variable results in a corresponding constant change in the other variable over the entire range of values, e.g.

X   2   4    6    8    10
Y   7   13   19   25   31

Page 8: Correlation and Regression; ANOVA

• For every unit change in the value of X there is a constant change in the corresponding value of Y, and the above data can be expressed by the relation

Y = 1 + 3X

In general, two variables X and Y are said to be linearly related if there exists a relationship of the form

Y = a + bX

where a and b are real numbers.
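As a minimal sketch, the constant rate of change can be checked directly on the X and Y values from the linear correlation example. The helper name `constant_rate` is an illustrative assumption, not something from the slides.

```python
# X and Y values from the linear-correlation example.
X = [2, 4, 6, 8, 10]
Y = [7, 13, 19, 25, 31]

def constant_rate(xs, ys):
    """Return (change in y) / (change in x) if it is the same for every
    consecutive pair of points, otherwise None."""
    rates = {(y2 - y1) / (x2 - x1)
             for x1, x2, y1, y2 in zip(xs, xs[1:], ys, ys[1:])}
    return rates.pop() if len(rates) == 1 else None

b = constant_rate(X, Y)   # slope: change in Y per unit change in X
a = Y[0] - b * X[0]       # intercept, recovered from the first point
print(f"Y = {a:g} + {b:g}X")
```

A return value of None would mean the rate of change is not constant, so no single straight line passes through every point.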

Page 9: Correlation and Regression; ANOVA

[Figure: Linear Correlation Graph, showing the X and Y series]

Page 10: Correlation and Regression; ANOVA

NON-LINEAR CORRELATION

The relationship between two variables is non-linear if, for a unit change in one variable, the other variable does not change at a constant rate but at a fluctuating rate. The graph is therefore not a straight line.
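A short sketch of what a non-constant rate of change looks like. The data are hypothetical (Y is taken as the square of X) and do not come from the slides.

```python
# Hypothetical data (not from the slides): Y = X squared.
X = [1, 2, 3, 4, 5]
Y = [x ** 2 for x in X]

# Change in Y for each unit step in X.
steps = [y2 - y1 for y1, y2 in zip(Y, Y[1:])]
print(steps)

# The relationship is linear only if every step is the same size.
is_linear = len(set(steps)) == 1
print(is_linear)
```

Because the steps differ from one unit of X to the next, plotting these points would give a curve rather than a straight line.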

Page 11: Correlation and Regression; ANOVA

[Figure: Non-Linear Correlation Graph, showing the X and Y series]

Page 12: Correlation and Regression; ANOVA

COEFFICIENT OF CORRELATION

The measure of the degree of association between two variables is called the coefficient of correlation (r). Its value lies between -1 and +1:

• r > 0: positive correlation (r = +1 is a perfect positive correlation)
• r < 0: negative correlation (r = -1 is a perfect negative correlation)
• r = 0: no correlation
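For reference, one common computational form of the coefficient (Pearson's r) is sketched below. The slides do not print a general formula, so this form is an assumption, although it agrees with the totals used in the solved example. Here n is the number of paired observations (x, y).

```latex
r = \frac{n\sum xy - \sum x \sum y}
         {\sqrt{\left[\,n\sum x^{2} - \left(\sum x\right)^{2}\right]
                \left[\,n\sum y^{2} - \left(\sum y\right)^{2}\right]}}
```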

Page 13: Correlation and Regression; ANOVA

SOLVED EXAMPLE

Problem: Find whether the number of flowers on a plant is correlated with the height of the plant.

S. No.   Height of plant   Flowers on plant
1        4                 12
2        3                 10
3        4                 13
4        5                 15
5        5                 16
6        4                 11
7        6                 18
8        3                 9
9        5                 14
10       4                 12

Page 14: Correlation and Regression; ANOVA

SOLUTION

S. No.   Height of plant (x)   Flowers on plant (y)   x²    y²     xy
1        4                     12                     16    144    48
2        3                     10                     9     100    30
3        4                     13                     16    169    52
4        5                     15                     25    225    75
5        5                     16                     25    256    80
6        4                     11                     16    121    44
7        6                     18                     36    324    108
8        3                     9                      9     81     27
9        5                     14                     25    196    70
10       4                     12                     16    144    48
Total    43                    130                    193   1760   582

Page 15: Correlation and Regression; ANOVA

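$$
\begin{aligned}
r &= \frac{10(582) - (43)(130)}{\sqrt{\left[10(193) - (43)^{2}\right]\left[10(1760) - (130)^{2}\right]}} \\
  &= \frac{5820 - 5590}{\sqrt{(1930 - 1849)(17600 - 16900)}} \\
  &= \frac{230}{\sqrt{(81)(700)}} \\
  &= \frac{230}{\sqrt{56700}} \\
  &= \frac{230}{238.11} \\
  &= 0.9659
\end{aligned}
$$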
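The hand calculation can be cross-checked with a few lines of Python. This is a sketch only: the function name `pearson_r` is an illustrative assumption, and the data are the ten plants from the solved example.

```python
from math import sqrt

# Height (x) and number of flowers (y) for the 10 plants in the solved example.
x = [4, 3, 4, 5, 5, 4, 6, 3, 5, 4]
y = [12, 10, 13, 15, 16, 11, 18, 9, 14, 12]

def pearson_r(xs, ys):
    """Coefficient of correlation from the raw sums used in the solution table."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(v * v for v in xs)
    sum_y2 = sum(v * v for v in ys)
    sum_xy = sum(p * q for p, q in zip(xs, ys))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

r = pearson_r(x, y)
print(round(r, 4))
```

A value this close to +1 indicates a strong positive correlation between plant height and number of flowers.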

Page 16: Correlation and Regression; ANOVA

REGRESSION

If the two variables are significantly correlated, and there is some theoretical basis for doing so, it is possible to predict the value of one variable from the other. This method of analysis is called regression analysis.

Regression is the "estimation or prediction of the unknown value of one variable from the known value of the other variable."

According to M. M. Blair, "regression analysis is a mathematical measure of the average relationship between two or more variables in terms of the original units of the data."

Page 17: Correlation and Regression; ANOVA

REGRESSION EQUATION

Let the size of the sample be n, and let the two sets of measurements be denoted by X and Y.

We can predict the value of Y for any desired value of X, denoted by X’.

The following regression equation is used:

Y = a + bX’

where a and b are the regression coefficients (a is the intercept and b is the slope).
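The slides do not state how a and b are obtained at this point. The least-squares expressions below are a sketch inferred from the "numerator" and "denominator" steps of the worked example that follows, with all sums taken over the n sample pairs (X, Y).

```latex
b = \frac{n\sum XY - \sum X \sum Y}{n\sum X^{2} - \left(\sum X\right)^{2}},
\qquad
a = \frac{\sum Y - b\sum X}{n}
```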

Page 18: Correlation and Regression; ANOVA

EXAMPLE

Problem: The nitrogen produced by 10 treatment plants was measured at mid term and at the final. Develop a regression equation that may be used to predict the final yield from the mid-term value.

Treatment plant   Mid term   Final
1                 98         90
2                 66         74
3                 100        98
4                 96         88
5                 88         80
6                 45         62
7                 76         78
8                 60         74
9                 74         86
10                82         80

Page 19: Correlation and Regression; ANOVA

SOLUTION

Treatment plant   Mid term (x)   Final (y)   x²      xy
1                 98             90          9604    8820
2                 66             74          4356    4884
3                 100            98          10000   9800
4                 96             88          9216    8448
5                 88             80          7744    7040
6                 45             62          2025    2790
7                 76             78          5776    5928
8                 60             74          3600    4440
9                 74             86          5476    6364
10                82             80          6724    6560
Total             785            810         64521   65074

Page 20: Correlation and Regression; ANOVA

Numerator of b = 10 × 65074 − 785 × 810

= 650740 − 635850

= 14890

Denominator of b = 10 × 64521 − (785)²

= 645210 − 616225

= 28985

Therefore b = 14890/28985

= 0.5137

Numerator of a = 810 − 785 × 0.5137

= 810 − 403.2545

= 406.7455

Denominator of a = 10

Page 21: Correlation and Regression; ANOVA

Thus,

Value of a = numerator of a / denominator of a

= 406.7455/10

= 40.6746

Considering the formula of the regression equation:

Y = a + b(X’)

Y = predicted value

a = value obtained above

b = value obtained above

X’ = value of X for which the prediction is desired

Thus, for X’ = 50,

Y = 40.6746 + (0.5137)(50)

= 40.6746 + 25.685

= 66.3596
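The same fit can be reproduced in a few lines of Python as a cross-check. This is a sketch, with `fit_line` as an illustrative helper name. Because the code recomputes every sum from the raw mid-term and final values and does not round intermediate results, its output may differ in the last digits from a hand calculation.

```python
# Mid-term (x) and final (y) values for the 10 treatment plants.
x = [98, 66, 100, 96, 88, 45, 76, 60, 74, 82]
y = [90, 74, 98, 88, 80, 62, 78, 74, 86, 80]

def fit_line(xs, ys):
    """Least-squares intercept a and slope b for Y = a + bX."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(v * v for v in xs)
    sum_xy = sum(p * q for p, q in zip(xs, ys))
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return a, b

a, b = fit_line(x, y)
x_new = 50                 # X', the value used in the worked prediction
y_pred = a + b * x_new     # predicted final value for X' = 50
print(f"a = {a:.4f}, b = {b:.4f}, Y({x_new}) = {y_pred:.4f}")
```

Note that X’ = 50 lies close to the smallest observed mid-term value (45), so the prediction stays within the range of the data used to fit the line.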

Page 22: Correlation and Regression; ANOVA


ANOVA

Page 23: Correlation and Regression; ANOVA

ANOVA

ANALYSIS OF VARIANCE

• A statistical hypothesis-testing method for the analysis of experimental data
• A way of making decisions by using data
• Calculated from the null hypothesis and the sample data

Assuming that the null hypothesis is true, the statistical result is used to decide whether to reject or accept it, and so to draw an inference about the variance in the data. If the null hypothesis is accepted, the variation is not significant, and vice versa.

Page 24: Correlation and Regression; ANOVA

When the result of an ANOVA is represented graphically, the graph is divided into two regions: an acceptance region, where the data support the hypothesis, and a rejection region, where the data do not support the hypothesis.

The null hypothesis is denoted by H0.

Page 25: Correlation and Regression; ANOVA

HISTORY OF ANOVA

In 1827, Laplace addressed ANOVA problems regarding measurements of atmospheric tides.

In 1918, Sir Ronald Fisher introduced the term "variance" in his article published that year under the title "The Correlation Between Relatives on the Supposition of Mendelian Inheritance".

Fisher presented the method of analysis in his 1925 book "Statistical Methods for Research Workers".

Page 26: Correlation and Regression; ANOVA

COMPONENT OF MEASURE OF ANOVA: F-TEST

The F-test is used to compare variances from a mixed population. It is recommended for ANOVA, where two estimates of the variance of the same sample are compared. While the F-test is not generally robust against departures from normality, it has been found to be robust in the special case of ANOVA.

Moore and McCabe (2003) note that analysis of variance uses F statistics, but these are not the same as the F statistic for comparing the standard deviations of two populations.

Page 27: Correlation and Regression; ANOVA


The F-test is used for comparisons of the components of the total deviation. For example, in one-way, or single factor ANOVA, statistical significance is tested for by comparing the F test statistic

Page 28: Correlation and Regression; ANOVA

WHAT IS ANOVA

ANOVA starts from the assumption that all groups are simple random samples from a single population, which implies that every treatment has the same effect.

ANOVA as a statistical design of experiments:

The experimenter adjusts the factors and measures the responses in an attempt to determine an effect.

ANOVA is the synthesis of several ideas and is used for multiple purposes. As a consequence, it is difficult to define concisely and precisely.

Page 29: Correlation and Regression; ANOVA

CHARACTERISTICS & LOGIC

Characteristics:

• Used in the analysis of comparative experiments

• Statistical significance is determined by the ratio of two variances

Logic:

• The calculation of ANOVA can be characterized as computing a number of means and variances, dividing two variances, and comparing the ratio with a critical value to determine statistical significance.

• The effect of any treatment is estimated by taking the difference between the mean of the observations that receive the treatment and the general mean.
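The logic above can be sketched in plain Python for the one-way case. The data are hypothetical and the function name `one_way_f` is an illustrative assumption. The sketch computes the group means, the treatment effects (group mean minus general mean), and the ratio of the between-group variance to the within-group variance.

```python
def one_way_f(groups):
    """Return (F, effects), where F is the between-group mean square
    divided by the within-group mean square."""
    k = len(groups)                           # number of treatments
    n_total = sum(len(g) for g in groups)     # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Treatment effect = group mean minus the general (grand) mean.
    effects = [m - grand_mean for m in means]

    ss_between = sum(len(g) * e ** 2 for g, e in zip(groups, effects))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)

    ms_between = ss_between / (k - 1)         # variance between treatments
    ms_within = ss_within / (n_total - k)     # variance within treatments
    return ms_between / ms_within, effects

# Hypothetical measurements for three treatments (illustrative only).
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
f_stat, effects = one_way_f(groups)
print(f_stat, effects)
```

To decide significance, the resulting F value would then be compared with the critical value of the F distribution with (k − 1, N − k) degrees of freedom at the chosen significance level. That lookup is left out of the sketch.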


Page 33: Correlation and Regression; ANOVA

TYPES OF ANOVA

One-way ANOVA:

This ANOVA tests a single hypothesis on the data obtained.

The hypothesis tested is the null hypothesis.

The single hypothesis concerns the effect of one factor on the variance in the random data of the groups. An F-test then gives the limits of acceptance and rejection, and a graph is plotted of the F value against the value obtained from the ANOVA.

Example:

Problem: Nitrogen produced by plants treated with fertilizer

H0: nitrogen is produced due to the fertilizer vs. by the plant itself

Page 34: Correlation and Regression; ANOVA

TYPES OF ANOVA

Two-way ANOVA:

This ANOVA differs significantly from one-way ANOVA in that it lets us test two hypotheses simultaneously under the null hypothesis.

Of the two hypotheses, one is rejected and the other is accepted for the data.

Example:

Problem: Bacterial growth is observed as CFU on 28 solid-media plates, where temperature and pH are the factors affecting growth. To test these factors, we have to test two hypotheses:

H0: bacterial growth is inhibited due to temperature vs. pH

H0’: bacterial growth is enhanced due to temperature vs. pH
