Regression & correlation

  • Published on
    15-Jul-2015

Transcript

<ul><li><p>Course Title: Business Statistics, BBA (Hons), 2nd Semester</p><p>Course Instructor: Atiq ur Rehman Shah, Lecturer, Federal Urdu University of Arts, Science &amp; Technology, Islamabad; +92-345-5271959; aatresh@gmail.com</p></li>
<li><p>Correlation</p><p>Correlation is a LINEAR association between two random variables.</p><p>Correlation is a statistical technique used to determine the degree to which two variables are related.</p></li>
<li><p>Scatter diagram</p><p>Rectangular coordinates</p><p>Two quantitative variables</p><p>One variable is called independent (X) and the second is called dependent (Y).</p></li>
<li><table>
<tr><th>Wt. (kg)</th><td>67</td><td>69</td><td>85</td><td>83</td><td>74</td><td>81</td><td>97</td><td>92</td><td>114</td><td>85</td></tr>
<tr><th>SBP (mmHg)</th><td>120</td><td>125</td><td>140</td><td>160</td><td>130</td><td>180</td><td>150</td><td>140</td><td>200</td><td>130</td></tr>
</table></li>
<li><p>Scatter diagram of weight and systolic blood pressure</p><p>[Chart: SBP (mmHg) plotted against weight (kg) for the ten observations tabulated on the previous slide.]</p></li>
<li><p>Scatter plots</p><p>The pattern of the data indicates the type of relationship between your two variables: positive relationship, negative relationship, or no
relationship</p></li>
<li><p>Positive relationship</p></li>
<li><p>Negative relationship</p><p>[Chart: reliability plotted against age of car.]</p></li>
<li><p>No relation</p></li>
<li><p>Correlation Coefficient</p><p>The correlation coefficient (r) measures the strength and direction of the relationship between two variables.</p></li>
<li><p>How to interpret the value of r?</p><p>r lies between -1 and 1. Values near 0 mean no (linear) correlation, and values near 1 or -1 mean a very strong correlation.</p><p>A negative sign means that the two variables are inversely related: as one variable increases, the other decreases.</p></li>
<li><p>Pearson's r</p><p>An r of 0.9 is a strong positive association (as one variable rises, so does the other).</p><p>An r of -0.9 is a strong negative association (as one variable rises, the other falls).</p><p>r = correlation coefficient</p></li>
<li><p>Coefficient of Determination Defined</p><p>Pearson's r can be squared, r<sup>2</sup>, to derive a coefficient of determination.</p><p>Coefficient of determination: the proportion of variability in one of the variables that can be accounted for by variability in the second variable.</p></li>
<li><p>Example of depression and CGPA</p><p>Pearson's r shows a negative correlation: r = -0.5, so r<sup>2</sup> = 0.25.</p><p>In this example we can say that 1/4, or 0.25, of the variability in CGPA scores can be accounted for by depression (the remaining 75% of the variability is due to other factors: habits, ability, motivation, courses studied, etc.).</p></li>
<li><p>Coefficient of Determination and Pearson's r</p><p>If r = 0.5, then r<sup>2</sup> = 0.25. If r = 0.7, then r<sup>2</sup> = 0.49.</p><p>Thus, while r = 0.5 versus 0.7 might not look so different in terms of strength, r<sup>2</sup> tells us that r = 0.7 accounts for about twice the variability relative to r = 0.5.</p></li>
<li><p>Example</p><p>Calculate the coefficient of correlation between the values of X and Y given
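below:</p><table>
<tr><th>X</th><td>78</td><td>89</td><td>97</td><td>69</td><td>59</td><td>79</td><td>68</td><td>61</td></tr>
<tr><th>Y</th><td>125</td><td>137</td><td>156</td><td>112</td><td>107</td><td>136</td><td>123</td><td>108</td></tr>
</table></li>
<li><table>
<tr><th>X</th><th>Y</th><th>X<sup>2</sup></th><th>Y<sup>2</sup></th><th>XY</th></tr>
<tr><td>78</td><td>125</td><td>6084</td><td>15625</td><td>9750</td></tr>
<tr><td>89</td><td>137</td><td>7921</td><td>18769</td><td>12193</td></tr>
<tr><td>97</td><td>156</td><td>9409</td><td>24336</td><td>15132</td></tr>
<tr><td>69</td><td>112</td><td>4761</td><td>12544</td><td>7728</td></tr>
<tr><td>59</td><td>107</td><td>3481</td><td>11449</td><td>6313</td></tr>
<tr><td>79</td><td>136</td><td>6241</td><td>18496</td><td>10744</td></tr>
<tr><td>68</td><td>123</td><td>4624</td><td>15129</td><td>8364</td></tr>
<tr><td>61</td><td>108</td><td>3721</td><td>11664</td><td>6588</td></tr>
<tr><td>ΣX = 600</td><td>ΣY = 1004</td><td>ΣX<sup>2</sup> = 46242</td><td>ΣY<sup>2</sup> = 128012</td><td>ΣXY = 76812</td></tr>
</table></li>
<li><p>\( r = \dfrac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^2 - (\sum X)^2\right]\left[n\sum Y^2 - (\sum Y)^2\right]}} = \dfrac{8(76812) - (600)(1004)}{\sqrt{\left[8(46242) - (600)^2\right]\left[8(128012) - (1004)^2\right]}} = \dfrac{12096}{\sqrt{(9936)(16080)}} \approx 0.96 \)</p><p>Hence the correlation coefficient between X and Y is approximately 0.96.</p><p>(What does this value tell us?)</p></li>
<li><p>Regression</p><p>A statistical tool used to investigate the dependence of one variable (the dependent variable) on one or more other variables (the independent variables).</p><p>The dependent variable (Y) is the variable for which we want to make a prediction. The independent variable (X) is the variable on the basis of which we make predictions.</p></li>
<li><p>The linear relationship between two variables can be either positive or negative.</p><p>For instance, an increase in advertising budget will bring more sales (positive), and an increase in temperature will decrease the cooling efficiency of a room AC (negative).</p></li>
<li><p>Simple Linear Regression</p><p>Positive linear relationship: the slope (b) is positive.</p><p>[Chart: regression line with intercept (a).]</p></li>
<li><p>Simple Linear Regression</p><p>Negative linear relationship: the slope (b) is negative.</p><p>[Chart: regression line with intercept (a).]</p></li>
<li><p>Simple Linear Regression</p><p>No relationship: the slope (b) is 0.</p><p>[Chart: regression line with intercept (a).]</p></li>
<li><p>Simple Linear Regression Equation</p><p>Hence the equation for the linear regression line can be written as:</p><p>y = a + bx</p><p>Where:</p><p>y = dependent variable</p><p>x = independent variable</p><p>a = y-intercept (i.e. the value of y when x = 0)</p><p>b = slope</p></li>
<li><p>Least-squares estimates</p><p>For a simple linear regression equation:</p><p>y = a + bx</p><p>We have,</p><p>\( b = \dfrac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2} \) and \( a = \bar{y} - b\bar{x} \)</p><p>Where \( \bar{x} = \dfrac{\sum x}{n} \) and \( \bar{y} = \dfrac{\sum y}{n} \).</p></li>
<li><p>Example</p><p>Compute the least squares regression equation of Y on X for the following data.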
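What is the regression coefficient, and what does it mean?</p><table>
<tr><th>X</th><td>5</td><td>6</td><td>8</td><td>10</td><td>12</td><td>13</td><td>15</td><td>16</td><td>17</td></tr>
<tr><th>Y</th><td>16</td><td>19</td><td>23</td><td>28</td><td>36</td><td>41</td><td>44</td><td>45</td><td>50</td></tr>
</table></li>
<li><table>
<tr><th>X</th><th>Y</th><th>XY</th><th>X<sup>2</sup></th></tr>
<tr><td>5</td><td>16</td><td>80</td><td>25</td></tr>
<tr><td>6</td><td>19</td><td>114</td><td>36</td></tr>
<tr><td>8</td><td>23</td><td>184</td><td>64</td></tr>
<tr><td>10</td><td>28</td><td>280</td><td>100</td></tr>
<tr><td>12</td><td>36</td><td>432</td><td>144</td></tr>
<tr><td>13</td><td>41</td><td>533</td><td>169</td></tr>
<tr><td>15</td><td>44</td><td>660</td><td>225</td></tr>
<tr><td>16</td><td>45</td><td>720</td><td>256</td></tr>
<tr><td>17</td><td>50</td><td>850</td><td>289</td></tr>
<tr><td>ΣX = 102</td><td>ΣY = 302</td><td>ΣXY = 3853</td><td>ΣX<sup>2</sup> = 1308</td></tr>
</table></li>
<li><p>Now \( \bar{x} = 102/9 = 11.33 \) and \( \bar{y} = 302/9 = 33.56 \).</p><p>\( b = \dfrac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2} = \dfrac{9(3853) - (102)(302)}{9(1308) - (102)^2} = \dfrac{3873}{1368} \)</p><p>So b = 2.831.</p></li>
<li><p>And \( a = \bar{y} - b\bar{x} = \dfrac{302}{9} - (2.831)\left(\dfrac{102}{9}\right) \approx 1.47 \)</p><p>Hence the desired estimated regression line of Y on X is</p><p>y = 1.47 + 2.831x</p><p>The estimated regression coefficient is b = 2.831, which means that the value of y increases by 2.831 units for a unit increase in x.</p></li></ul>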
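
The correlation example in the slides can be checked with a short script. The following is a minimal Python sketch, assuming the standard raw-sum formula for Pearson's r; the function name `pearson_r` is illustrative and does not come from the slides.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from raw sums.

    Hypothetical helper for illustration; it assumes the usual formula
    r = (n*Sxy - Sx*Sy) / sqrt((n*Sxx - Sx^2) * (n*Syy - Sy^2)).
    """
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


# Data from the correlation example in the slides.
x = [78, 89, 97, 69, 59, 79, 68, 61]
y = [125, 137, 156, 112, 107, 136, 123, 108]

r = pearson_r(x, y)
print(round(r, 3))       # 0.957
print(round(r ** 2, 3))  # coefficient of determination, 0.916
```

Squaring r gives the coefficient of determination described earlier: roughly 92% of the variability in Y is accounted for by X in this data set.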

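The regression example can be reproduced the same way. This is a sketch under the assumption that the slope and intercept are the usual least-squares estimates, b = (nΣxy - ΣxΣy) / (nΣx² - (Σx)²) and a = ȳ - b·x̄; the function name `least_squares_fit` is hypothetical.

```python
def least_squares_fit(xs, ys):
    """Return (a, b) for the fitted line y = a + b*x.

    Illustrative helper, not from the slides; it uses the raw-sum
    form of the least-squares slope and intercept.
    """
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = sum_y / n - b * (sum_x / n)  # a = mean(y) - b * mean(x)
    return a, b


# Data from the regression example in the slides.
x = [5, 6, 8, 10, 12, 13, 15, 16, 17]
y = [16, 19, 23, 28, 36, 41, 44, 45, 50]

a, b = least_squares_fit(x, y)
print(f"y = {a:.2f} + {b:.3f}x")  # y = 1.47 + 2.831x
```

The slope b is the regression coefficient: the predicted change in y for a one-unit increase in x.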