ROHANA BINTI ABDUL HAMID, INSTITUTE FOR ENGINEERING MATHEMATICS (IMK), UNIVERSITI MALAYSIA PERLIS



  • 5.1 INTRODUCTION
    5.2 SCATTER PLOTS
    5.3 LINEAR REGRESSION MODEL
    5.4 LEAST SQUARES METHOD
    5.5 COEFFICIENT OF DETERMINATION
    5.6 CORRELATION
    5.7 TEST OF SIGNIFICANCE
    5.8 ANALYSIS OF VARIANCE (ANOVA)

  • 5.1 INTRODUCTION TO REGRESSION

    Regression is a statistical procedure for establishing the relationship between two or more variables.

    This is done by fitting a linear equation to the observed data.

    The regression line is then used by the researcher to see the trend and to predict values for the data.

    There are two types of relationship:
    Simple (2 variables)
    Multiple (more than 2 variables)

  • THE SIMPLE LINEAR REGRESSION MODEL

    The simple linear regression model is an equation that describes a dependent variable (Y) in terms of an independent variable (X) plus random error:

    $Y = \beta_0 + \beta_1 X + \varepsilon$

    where
    $\beta_0$ = intercept of the line with the Y-axis
    $\beta_1$ = slope of the line
    $\varepsilon$ = random error

    The random error $\varepsilon$ is the difference between a data point and the deterministic value $\beta_0 + \beta_1 X$.

    This regression line is estimated from the collected data by fitting a straight line to the data set and obtaining the equation of that straight line,

    $\hat{y} = b_0 + b_1 x$
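
The split of each observation into a deterministic part and a random error can be illustrated with a small simulation. This is only a sketch: the parameter values `BETA0`, `BETA1` and `SIGMA` are hypothetical, and drawing the errors from a normal distribution is an assumption that the slides do not state.

```python
import random

# Hypothetical parameter values, chosen only for illustration.
BETA0 = 2.0   # intercept of the line with the Y-axis
BETA1 = 0.5   # slope of the line
SIGMA = 1.0   # standard deviation of the random error (assumed normal)


def simulate(xs, seed=0):
    """Return a (y, deterministic value, random error) triple for each x."""
    rng = random.Random(seed)
    rows = []
    for x in xs:
        mean_y = BETA0 + BETA1 * x     # deterministic value on the line
        eps = rng.gauss(0.0, SIGMA)    # random error
        rows.append((mean_y + eps, mean_y, eps))
    return rows


rows = simulate([1, 2, 3, 4, 5])
for y, mean_y, eps in rows:
    print(f"y = {y:.3f} = {mean_y:.3f} + ({eps:+.3f})")
```

Each printed row shows an observed y as the value on the line plus its error, which is the decomposition the model describes.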

  • Example 5.1:

    1) A nutritionist studying weight loss programs might want to find out whether reducing carbohydrate intake can help a person lose weight.
    a) X is the carbohydrate intake (independent variable).
    b) Y is the weight (dependent variable).

    2) An entrepreneur might want to know whether increasing the cost of packaging his new product will have an effect on the sales volume.
    a) X is the cost (independent variable).
    b) Y is the sales volume (dependent variable).

  • 5.2 SCATTER PLOTS

    A scatter plot is a graph of ordered pairs (x, y).

    The purpose of a scatter plot is to describe visually the nature of the relationship between the independent variable, X, and the dependent variable, Y.

    The independent variable, x, is plotted on the horizontal axis and the dependent variable, y, is plotted on the vertical axis.

  • SCATTER DIAGRAM: Positive Linear Relationship

    [Figure: regression line with intercept b0; slope b1 is positive.]

  • SCATTER DIAGRAM: Negative Linear Relationship

    [Figure: regression line with intercept b0; slope b1 is negative.]

  • SCATTER DIAGRAM: No Relationship

    [Figure: regression line with intercept b0; slope b1 is 0.]

  • 5.3 LINEAR REGRESSION MODEL

    A linear regression line can be developed from a freehand plot of the data.

    Example 5.2: The given table contains values for two variables, X and Y. Plot the given data and draw a freehand estimated regression line.

  • 5.4 LEAST SQUARES METHOD

    The least squares method is commonly used to determine the values of $b_0$ and $b_1$ that give the best fit of the estimated regression line to the sample data points.

    The straight line fitted to the data set is the line:

    $\hat{y} = b_0 + b_1 x$

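  • LEAST SQUARES METHOD

    Theorem 10.1: Given the sample data $\{(x_i, y_i);\ i = 1, 2, \ldots, n\}$, the coefficients of the least squares line $\hat{y} = b_0 + b_1 x$ are:

    i) y-intercept for the estimated regression equation,

    $b_0 = \bar{y} - b_1 \bar{x}$

    where $\bar{x}$ and $\bar{y}$ are the means of x and y respectively.

  • LEAST SQUARES METHOD

    ii) Slope for the estimated regression equation,

    $b_1 = \dfrac{S_{xy}}{S_{xx}}$

    where

    $S_{xy} = \sum_{i=1}^{n} x_i y_i - \dfrac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n}$

    $S_{xx} = \sum_{i=1}^{n} x_i^2 - \dfrac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}$

  • LEAST SQUARES METHOD

    Given any value of $x$, the predicted value of the dependent variable, $\hat{y}$, can be found by substituting $x$ into the equation

    $\hat{y} = b_0 + b_1 x$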
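
A minimal sketch of the least squares computation, assuming the standard closed-form expressions: slope = S_xy / S_xx and intercept = mean(y) - slope * mean(x), where S_xy and S_xx are the corrected sums of cross products and squares. The function names `least_squares` and `predict` are illustrative and not taken from the slides.

```python
def least_squares(xs, ys):
    """Return (b0, b1) for the least squares line y_hat = b0 + b1 * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length with n >= 2")
    sum_x, sum_y = sum(xs), sum(ys)
    # Corrected sum of cross products and corrected sum of squares of x
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    s_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    if s_xx == 0:
        raise ValueError("all x values are equal, so the slope is undefined")
    b1 = s_xy / s_xx                    # slope
    b0 = sum_y / n - b1 * sum_x / n     # y-intercept
    return b0, b1


def predict(b0, b1, x):
    """Substitute x into the estimated regression equation."""
    return b0 + b1 * x


# Toy data lying exactly on the line y = 1 + 2x
b0, b1 = least_squares([0, 1, 2, 3], [1, 3, 5, 7])
print(b0, b1, predict(b0, b1, 10))
```

Because the toy points lie exactly on a line, the fit recovers the intercept 1 and slope 2, which is a quick sanity check on the formulas.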

  • Example 5.3: Students' scores in history

    The data below represent scores obtained by ten primary school students before and after they were taken on a tour of the museum (which is supposed to increase their interest in history).

    a) Fit a linear regression model with the score before as the explanatory variable and the score after as the dependent variable.

    b) Predict the score a student would obtain after the tour if he scored 60 marks before.

    Before, x:  65  63  76  46  68  72  68  57  36  96
    After, y:   68  66  86  48  65  66  71  57  42  87
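
The two tasks in Example 5.3 can be checked numerically. The sketch below uses the ten before/after scores from the example and the deviation-from-the-mean form of the least squares formulas, which is an assumption about the intended method; the variable names are illustrative.

```python
# Scores from Example 5.3 (before = x, after = y)
before = [65, 63, 76, 46, 68, 72, 68, 57, 36, 96]
after = [68, 66, 86, 48, 65, 66, 71, 57, 42, 87]

n = len(before)
x_bar = sum(before) / n
y_bar = sum(after) / n

# Sums of squared deviations and cross products about the means
s_xx = sum((x - x_bar) ** 2 for x in before)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(before, after))

b1 = s_xy / s_xx           # slope
b0 = y_bar - b1 * x_bar    # y-intercept

# Predicted score after the tour for a student who scored 60 before
predicted_60 = b0 + b1 * 60

print(f"y_hat = {b0:.3f} + {b1:.4f} x")
print(f"predicted score for x = 60: {predicted_60:.2f}")
```

With these data the fitted line is roughly y_hat = 12.31 + 0.824x, so a student who scored 60 before the tour is predicted to score about 61.7 after it.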

  • 5.5 COEFFICIENT OF DETERMINATION ($r^2$)

    The coefficient of determination is a measure of the variation of the dependent variable (Y) that is explained by the regression line and the independent variable (X).

    The symbol for the coefficient of determination is $r^2$ or $R^2$.

    If $r$ = 0.90, then $r^2$ = 0.81. This means that 81% of the variation in the dependent variable (Y) is accounted for by the variation in the independent variable (X).

    The rest of the variation, 0.19 or 19%, is unexplained; this proportion is called the coefficient of nondetermination.

    The formula for the coefficient of nondetermination is $1 - r^2$.

  • COEFFICIENT OF DETERMINATION ($r^2$)

    Relationship among SST, SSR and SSE:

    SST = SSR + SSE

    where
    SST = total sum of squares
    SSR = sum of squares due to regression
    SSE = sum of squares due to error

    The coefficient of determination is:

    $r^2 = \dfrac{SSR}{SST}$
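
The identity SST = SSR + SSE can be verified numerically. The sketch below reuses the scores from Example 5.3 and assumes the usual definitions: SST sums squared deviations of y from its mean, SSR sums squared deviations of the fitted values from that mean, and SSE sums the squared residuals.

```python
# Scores from Example 5.3 (before = x, after = y)
before = [65, 63, 76, 46, 68, 72, 68, 57, 36, 96]
after = [68, 66, 86, 48, 65, 66, 71, 57, 42, 87]

n = len(before)
x_bar = sum(before) / n
y_bar = sum(after) / n

# Least squares fit
s_xx = sum((x - x_bar) ** 2 for x in before)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(before, after))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * x for x in before]

sst = sum((y - y_bar) ** 2 for y in after)               # total
ssr = sum((f - y_bar) ** 2 for f in fitted)              # due to regression
sse = sum((y - f) ** 2 for y, f in zip(after, fitted))   # due to error

r_squared = ssr / sst

print(f"SST = {sst:.2f}, SSR = {ssr:.2f}, SSE = {sse:.2f}")
print(f"SSR + SSE = {ssr + sse:.2f}")
print(f"r^2 = {r_squared:.4f}")
```

For these data about 89% of the variation in the scores after the tour is explained by the scores before it.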

  • Example 5.4

    If $r$ = 0.919, find the value of $r^2$ and explain the value.

    Solution: $r^2 = 0.919^2 \approx 0.84$. This means that 84% of the variation in the dependent variable (Y) is explained by the variation in the independent variable (X).

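  • 5.6 CORRELATION (r)

    Correlation measures the strength of the linear relationship between two variables.

    It is also known as Pearson's product-moment coefficient of correlation.

    The symbol for the sample coefficient of correlation is $r$; for the population it is $\rho$.

    Formula:

    $r = \dfrac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$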

  • Properties of $r$:

    Values of $r$ close to 1 imply a strong positive linear relationship between x and y.

    Values of $r$ close to -1 imply a strong negative linear relationship between x and y.

    Values of $r$ close to 0 imply little or no linear relationship between x and y.

  • Refer to Example 5.3: Students' scores in history

    c) Calculate the value of r and interpret its meaning.

    Solution:

    Thus, there is a strong positive linear relationship between the scores obtained before (x) and after (y) the trip.
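
The value of r for the Example 5.3 data can be checked with a short script. This sketch uses the usual computational form of Pearson's r in terms of the raw sums; the function name `pearson_r` is illustrative.

```python
import math

# Scores from Example 5.3 (before = x, after = y)
before = [65, 63, 76, 46, 68, 72, 68, 57, 36, 96]
after = [68, 66, 86, 48, 65, 66, 71, 57, 42, 87]


def pearson_r(xs, ys):
    """Sample coefficient of correlation computed from raw sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


r = pearson_r(before, after)
print(f"r = {r:.4f}")
```

The result is about 0.94, which is close to 1 and agrees with the stated conclusion of a strong positive linear relationship.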

  • 5.7 TEST OF SIGNIFICANCE

    To determine whether X provides information for predicting Y, we proceed with hypothesis testing. Two tests are commonly used:

    i) t test

    ii) F test

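  • 1) t-Test

    1. Determine the hypotheses.
       $H_0: \beta_1 = 0$ (no linear relationship)
       $H_1: \beta_1 \neq 0$ (a linear relationship exists)
    2. Specify the level of significance $\alpha$ and find the critical value $t_{\alpha/2,\, n-2}$.
    3. Compute the test statistic t (or the p-value).

  • 1) t-Test

    4. Determine the rejection rule. Reject $H_0$ if:

    $t < -t_{\alpha/2,\, n-2}$ or $t > t_{\alpha/2,\, n-2}$

    p-value < $\alpha$

    5. Conclusion. If $H_0$ is rejected, there is a significant relationship between variables X and Y.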

  • 2) F-Test

    1. Determine the hypotheses.
       $H_0: \beta_1 = 0$ (no linear relationship)
       $H_1: \beta_1 \neq 0$ (a linear relationship exists)
    2. Specify the level of significance $\alpha$.
    3. Compute the test statistic:
       F = MSR/MSE
       The critical value $F_{\alpha}$ has 1 degree of freedom (df) in the numerator and n - 2 degrees of freedom (df) in the denominator.
    4. Determine the rejection rule. Reject $H_0$ if:
       p-value < $\alpha$
       F > $F_{\alpha,\, 1,\, n-2}$

  • 2) F-Test

    5. Conclusion. If $H_0$ is rejected, there is a significant relationship between variables X and Y.

  • Refer to Example 5.3: Students' scores in history

    d) Test whether the scores before and after the trip are related. Use $\alpha$ = 0.05.

    Solution:

    1. $H_0: \beta_1 = 0$ (no linear relationship)
       $H_1: \beta_1 \neq 0$ (a linear relationship exists)

    2.

    3.

  • 4. Rejection Rule:

    5. Conclusion: Thus, we reject $H_0$. The score before (x) has a linear relationship with the score after (y) the trip.
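
The stated conclusion can be checked with a t test on the slope. The sketch below assumes the standard statistic t = b1 / (s / sqrt(S_xx)), where s^2 = SSE / (n - 2); the slides do not show which formula was used. The critical value 2.306 is the t-table entry for alpha/2 = 0.025 with n - 2 = 8 degrees of freedom.

```python
import math

# Scores from Example 5.3 (before = x, after = y)
before = [65, 63, 76, 46, 68, 72, 68, 57, 36, 96]
after = [68, 66, 86, 48, 65, 66, 71, 57, 42, 87]

n = len(before)
x_bar = sum(before) / n
y_bar = sum(after) / n

s_xx = sum((x - x_bar) ** 2 for x in before)
s_yy = sum((y - y_bar) ** 2 for y in after)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(before, after))

b1 = s_xy / s_xx                  # estimated slope
sse = s_yy - b1 * s_xy            # error sum of squares
s = math.sqrt(sse / (n - 2))      # standard error of the estimate
se_b1 = s / math.sqrt(s_xx)       # standard error of the slope

t_stat = b1 / se_b1

T_CRITICAL = 2.306  # t table: alpha/2 = 0.025, df = n - 2 = 8
reject_h0 = abs(t_stat) > T_CRITICAL

print(f"t = {t_stat:.3f}, critical value = {T_CRITICAL}")
print("reject H0" if reject_h0 else "fail to reject H0")
```

The test statistic is about 7.91, well beyond the critical value, so the null hypothesis of no linear relationship is rejected at the 5% level.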

  • 5.8 ANALYSIS OF VARIANCE (ANOVA)

    The value of the test statistic F for an ANOVA test is calculated as:

    $F = \dfrac{MSR}{MSE}$

    To calculate MSR and MSE, first compute the regression sum of squares (SSR) and the error sum of squares (SSE).

  • ANALYSIS OF VARIANCE (ANOVA)

    General form of ANOVA table:

    Source of Variation | Degrees of Freedom (df) | Sum of Squares | Mean Squares      | Value of the Test Statistic
    Regression          | 1                       | SSR            | MSR = SSR/1       | F = MSR/MSE
    Error               | n - 2                   | SSE            | MSE = SSE/(n - 2) |
    Total               | n - 1                   | SST            |                   |

    ANOVA Test
    1) Hypothesis: $H_0: \beta_1 = 0$ (no linear relationship), $H_1: \beta_1 \neq 0$ (a linear relationship exists)
    2) Select the distribution to use: F-distribution
    3) Calculate the value of the test statistic: F
    4) Determine the rejection and non-rejection regions
    5) Make a decision: reject $H_0$ or fail to reject $H_0$
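
As an illustration of the general form, the ANOVA table can be filled in for the Example 5.3 scores. This is a sketch under the definitions in the table (MSR = SSR/1, MSE = SSE/(n - 2), F = MSR/MSE); the critical value 5.32 is the F-table entry for alpha = 0.05 with (1, 8) degrees of freedom, and the helper `fmt` is illustrative.

```python
# Scores from Example 5.3 (before = x, after = y)
before = [65, 63, 76, 46, 68, 72, 68, 57, 36, 96]
after = [68, 66, 86, 48, 65, 66, 71, 57, 42, 87]

n = len(before)
x_bar = sum(before) / n
y_bar = sum(after) / n

s_xx = sum((x - x_bar) ** 2 for x in before)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(before, after))
b1 = s_xy / s_xx

sst = sum((y - y_bar) ** 2 for y in after)   # total sum of squares
ssr = b1 * s_xy                              # sum of squares due to regression
sse = sst - ssr                              # sum of squares due to error

msr = ssr / 1
mse = sse / (n - 2)
f_stat = msr / mse

F_CRITICAL = 5.32  # F table: alpha = 0.05, df = (1, n - 2) = (1, 8)


def fmt(value):
    """Format a number, leaving empty table cells blank."""
    return "" if value is None else f"{value:.3f}"


rows = [
    ("Regression", 1, ssr, msr, f_stat),
    ("Error", n - 2, sse, mse, None),
    ("Total", n - 1, sst, None, None),
]
print(f"{'Source':<12}{'df':>4}{'SS':>12}{'MS':>12}{'F':>10}")
for source, df, ss, ms, f in rows:
    print(f"{source:<12}{df:>4}{ss:>12.3f}{fmt(ms):>12}{fmt(f):>10}")
print("reject H0" if f_stat > F_CRITICAL else "fail to reject H0")
```

In simple linear regression this F statistic equals the square of the t statistic for the slope, so the two tests lead to the same decision.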

  • Example 5.5

    The manufacturer of Cardio Glide exercise equipment wants to study the relationship between the number of months since the glide was purchased and the length of time the equipment was used last week.

    1) Determine the regression equation.

    2) At $\alpha$ = 0.01, test whether there is a linear relationship between the variables.

  • Solution (1):

    Regression equation:

  • Solution (2):

    1) Hypothesis: $H_0: \beta_1 = 0$ (no linear relationship), $H_1: \beta_1 \neq 0$ (a linear relationship exists)
    2) F-distribution table: critical value = 11.2586
    3) Test statistic: F = MSR/MSE = 17.303, or using the p-value approach: significance value = 0.003
    4) Rejection region: Since F statistic > F table (17.303 > 11.2586), we reject $H_0$; equivalently, since p-value < $\alpha$ (0.003 < 0.01), we reject $H_0$.
    5) Thus, there is a linear relationship between the variables (months X and hours Y).