Using football league tables to demonstrate correlation Duncan Williamson January 2006 www.duncanwil.co.uk

Using football league tables to demonstrate correlation

Duncan Williamson

January 2006

www.duncanwil.co.uk

Introduction

• This page is aimed at demonstrating how to use an example such as football league tables to illustrate the way correlation works. It is not aimed at giving you a fully worked example of correlation and all of its ins and outs.

• Correlation shows the statistical relationship between two or more variables. It is important to appreciate that correlation does not prove cause and effect in relations between the variables.

Correlation on a graph

• The first two of these graphs show the extremes of correlation: perfect positive and perfect negative.

– perfectly positive: the bigger variable X, the bigger variable Y will be

– perfectly negative: the bigger variable X, the smaller variable Y will be

– zero: there is no apparent relationship between variables X and Y

Correlation Values

• Anyone who has studied correlation analysis will know that in addition to the graphs and brief descriptions above, numbers are often used here too:

• perfectly positive: +1

• perfectly negative: -1

• zero: 0
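
The three reference values can be reproduced with a short calculation. The sketch below is in Python rather than Excel, and the function name `pearson_r` and the sample numbers are illustrative assumptions chosen to hit the three cases, not data from the league tables.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient r of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up values for variable X and three made-up versions of variable Y
x = [1, 2, 3, 4, 5]
r_positive = pearson_r(x, [2, 4, 6, 8, 10])   # Y rises in step with X
r_negative = pearson_r(x, [10, 8, 6, 4, 2])   # Y falls in step with X
r_zero = pearson_r(x, [3, 1, 2, 1, 3])        # no linear pattern in Y

print(r_positive, r_negative, r_zero)
```

Any real data set will normally fall somewhere between the two extremes.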

Correlation Extremes

• Since +1 and -1 are the extremes, we can also conclude that anything less than perfectly positive and more than perfectly negative must have a value greater than -1 and less than +1.

• Reasonably good and positive correlation could have a value of +0.7 or +0.8 ...

• Reasonably good and negative correlation could have a value of -0.7 or -0.8 ...

Correlation Good or Bad?

• Sorry, but there are no hard and fast values between the two extremes that we can guarantee are good or bad or even alright! Ultimately, good, bad and acceptable are often a matter of judgement in the context of the example being studied or presented.

The Football Data

• The example we are demonstrating here uses some very basic information: whether there is any relationship between the goal difference and points earned by the various football clubs both in the English Premier League and the English Championship League.

• We present here the data as they appear in newspapers, on the television or on a web site at any time. Then we present the graphs we have prepared from the data. Finally, we will draw our conclusions of whether there is an apparent relationship between goal difference and points earned.
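
The workflow described here (take the table, fit a line through the points, read off the strength of the relationship) can be sketched outside Excel as well. The Python example below is only an illustration: the six goal difference and points pairs are invented for the purpose and are not taken from any real league table, and `linear_fit` is a hypothetical helper that mimics what a spreadsheet trendline reports.

```python
def linear_fit(xs, ys):
    """Least-squares line y = slope * x + intercept, plus r squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared

# INVENTED figures for six imaginary clubs (not real league data)
goal_diff = [30, 18, 5, -2, -15, -36]
points = [50, 44, 33, 30, 22, 12]

slope, intercept, r_squared = linear_fit(goal_diff, points)
print(f"points = {slope:.3f} * goal difference + {intercept:.2f}")
print(f"r squared = {r_squared:.4f}")
```

Replacing the two lists with the goal difference and points columns from a published table gives the figures for a real league.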

League Tables

Goal Difference, Points …

• Goal difference is, very simply, the difference between the number of goals a team has scored and the number of goals it has conceded in the league campaign for a given season.

• Points refers to the number of points that a club has earned according to the League rules: three points for a win, one point for a draw and no points for a defeat.
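
Both definitions are simple arithmetic, as the short Python sketch below shows. The function names and the club record used are hypothetical and serve only to illustrate the rules stated above.

```python
def goal_difference(goals_for, goals_against):
    """Goals scored minus goals conceded over the league campaign."""
    return goals_for - goals_against

def league_points(wins, draws, losses):
    """Three points for a win, one for a draw, none for a defeat."""
    return 3 * wins + 1 * draws + 0 * losses

# Hypothetical record: 12 wins, 5 draws, 4 defeats, 38 goals for, 20 against
example_points = league_points(12, 5, 4)
example_gd = goal_difference(38, 20)
print(example_points, example_gd)
```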

A Graphical View of Correlation 1

A Graphical View of Correlation 2

Analysis 1

• What we can see here is that there is a high degree of positive correlation between goal difference and the number of points earned both in Premiership and Championship football in England.

• Are we surprised by this and can we conclude that we are dealing with cause and effect in this case?

Analysis 2

• We should not be surprised by this because there is a direct relationship between goals and points: on average, the more goals a club scores, the more points it is likely to earn. Of course, there are examples where clubs have scored heavily but haven't necessarily been top of the league, the reason being that they had a leaky defence as well as a strong attacking line-up!

Analysis 3

• Can we conclude that there might be a cause and effect relationship here? We think the answer here is yes. We can conclude that there is cause and effect between the two variables, goal difference and points earned, because any club that scores more goals than its opponent in a match MUST earn the points for a win. Consequently, clubs with positive goal differences will generally have a greater haul of points than clubs with negative goal differences.

A Bit More Advanced: r² and r values

• Please note that we used Microsoft Excel to prepare the graphs and to do the calculations.

• Firstly, the values of correlation revealed by these examples. Look at the graphs and see that the value given by Excel for the Premiership data is r² = 0.9037 and the value for the Championship data is r² = 0.9344. Both of these figures are positive and very close to 1, so there is a high degree of correlation in both cases.

• The values given by Excel relate to what is called r² (the coefficient of determination), whereas the correlation value we talked about initially is the r value (the coefficient of correlation). For the Premiership and Championship examples, then, the r values are 0.9506 and 0.9666 respectively, these being the positive square roots of the r² values.
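
The step from r² to r is a square root, which the following Python sketch checks using the two r² values quoted from the Excel charts. One assumption is made explicit in the code: the positive root is taken because both relationships slope upwards; a square root alone cannot tell you the sign of r.

```python
import math

# r squared values as quoted from the Excel charts
r_squared = {"Premiership": 0.9037, "Championship": 0.9344}

# Assumption: the positive root applies, since points rise with goal difference
r_values = {league: math.sqrt(value) for league, value in r_squared.items()}

for league, r in r_values.items():
    print(f"{league}: r squared = {r_squared[league]:.4f}, r = {r:.4f}")
```

Rounded to four decimal places this gives r = 0.9506 for an r² of 0.9037 and r = 0.9666 for an r² of 0.9344.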

Conclusion

• This page has shown that there are simple and effective examples with which we can demonstrate correlation between two or more variables. For more effect, why not gather the data for Divisions One and Two, the Scottish leagues, the Rugby Union or League tables ... as you wish, wherever the same or equivalent data are available.

Devised and Prepared by

Duncan Williamson

January 2006

www.duncanwil.co.uk

Also highly recommends

www.oxbow.org.uk