Forecasting the FIFA World Cup: Combining goal- and result-based team ability parameters

Pieter Robberechts, Jesse Davis

http://people.cs.kuleuven.be/pieter.robberechts

Introduction

A popular research topic since the 1960s

Two popular approaches:

1. Goal-based models

Model the number of goals scored by both teams

2. Result-based models

Model win-draw-loss outcomes directly

Typical approach:

1. Estimate team abilities based on historical match data

2. Use them to predict future match outcomes

Match outcome prediction

Data → Team ratings → Predictions

Data scraped from:

- Post-WWII international games from http://eloratings.net

- betting odds from http://betexplorer.com/

Two rating systems were explored:

- Elo ratings (result-based)

- ODM ratings (goal-based)

Team      ...
Strength  2320  2237  2220  2207  ...

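The Elo rating system: a result-based rating system

Given:

- $R_H, R_A$: current home and away team ratings
- $E_H$: expected score for the home team
- $S_H$: actual score of the home team
- $R'_H$: updated rating of the home team

$$E_H = \frac{1}{1 + 10^{-(R_H - R_A)/400}}$$

$$S_H = \begin{cases} 1 & \text{if the home team won} \\ 0.5 & \text{in case of a draw} \\ 0 & \text{if the home team lost} \end{cases}$$

$$R'_H = R_H + k(S_H - E_H)$$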
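The update rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, the scale of 400 is the conventional Elo choice, and `k = 20` is an arbitrary value picked for the example.

```python
def expected_home_score(r_home: float, r_away: float, scale: float = 400.0) -> float:
    """Expected score E_H of the home team, given both current ratings."""
    return 1.0 / (1.0 + 10.0 ** (-(r_home - r_away) / scale))


def update_home_rating(r_home: float, r_away: float, s_home: float, k: float) -> float:
    """Return R'_H = R_H + k * (S_H - E_H), where s_home is 1, 0.5 or 0."""
    e_home = expected_home_score(r_home, r_away)
    return r_home + k * (s_home - e_home)


# Two of the example strengths from the ratings slide; k = 20 is an assumption.
new_rating = update_home_rating(2320.0, 2237.0, s_home=0.5, k=20.0)
print(round(new_rating, 2))
```

Because the higher-rated home team was expected to score more than 0.5, a draw lowers its rating slightly.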

The Elo rating system: a result-based rating system

Problem:

- Not all games are handled with the same seriousness
- Most games are played against weak opponents

‣ Competitiveness factor
‣ Margin of victory

$$R'_H = R_H + k(S_H - E_H), \qquad k = k_0 \, w_i \, (1 + \delta)^{\gamma}$$

- $k_0$: recentness factor
- $w_i$: competitiveness factor
- $\delta$: margin of victory
- $\gamma$: margin of victory weight
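As a rough illustration of how the update factor k varies per match, the sketch below evaluates $k = k_0 w_i (1+\delta)^{\gamma}$. It assumes that $\delta$ is the absolute goal difference; the match-type weights and the values of `k0` and `gamma` are invented for the example and are not taken from the slides.

```python
def k_factor(k0: float, w: float, goal_diff: int, gamma: float) -> float:
    """k = k0 * w_i * (1 + delta)^gamma, with delta the absolute goal difference (assumption)."""
    return k0 * w * (1.0 + abs(goal_diff)) ** gamma


# Hypothetical competitiveness weights per match type (illustrative values only).
MATCH_WEIGHTS = {"friendly": 0.5, "qualifier": 0.8, "world_cup": 1.0}

k_friendly = k_factor(k0=20.0, w=MATCH_WEIGHTS["friendly"], goal_diff=1, gamma=0.5)
k_final = k_factor(k0=20.0, w=MATCH_WEIGHTS["world_cup"], goal_diff=3, gamma=0.5)
print(k_friendly, k_final)
```

With these made-up numbers, a narrow friendly win moves the ratings far less than a three-goal win at a World Cup.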

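Offense-Defense ratings: a goal-based rating system

Given:

$$A_{ij} = \begin{cases} \text{score team } j \text{ generated against team } i \\ 0 & \text{otherwise} \end{cases}$$

$$o_j = \sum_{i=1}^{n} \frac{A_{ij}}{d_i} \qquad \text{(offensive rating of team } j\text{)}$$

$$d_i = \sum_{j=1}^{n} \frac{A_{ij}}{o_j} \qquad \text{(defensive rating of team } i\text{)}$$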
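Since the offensive and defensive ratings are defined in terms of each other, they are usually found by alternating the two sums until they stop changing. The sketch below does that for a made-up three-team score matrix. The function name, the starting values of 1.0 and the stopping rule are assumptions, and the toy matrix has a positive score for every pair of distinct teams so that the iteration settles.

```python
def odm_ratings(A, max_iter=1000, tol=1e-12):
    """Alternate o_j = sum_i A[i][j] / d_i and d_i = sum_j A[i][j] / o_j.

    A[i][j] is the score team j generated against team i (0 otherwise).
    Returns (offensive ratings, defensive ratings).
    """
    n = len(A)
    o = [1.0] * n
    d = [1.0] * n
    for _ in range(max_iter):
        # Zero entries are skipped so a team that never conceded cannot cause 0/0.
        new_o = [sum(A[i][j] / d[i] for i in range(n) if A[i][j] > 0) for j in range(n)]
        new_d = [sum(A[i][j] / new_o[j] for j in range(n) if A[i][j] > 0) for i in range(n)]
        change = max(abs(a - b) for a, b in zip(new_o + new_d, o + d))
        o, d = new_o, new_d
        if change < tol:
            break
    return o, d


# Made-up scores: row i = conceding team, column j = scoring team.
A = [
    [0, 1, 1],
    [3, 0, 1],
    [2, 2, 0],
]
offense, defense = odm_ratings(A)
print(offense, defense)
```

A high offensive rating is good, while a low defensive rating is good, because it means little was conceded to strong attacks.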

Offense-Defense ratings: a goal-based rating system

Problem:

- Large disparities between the number of games played and the strength of the opponents
- Teams in different confederations rarely play each other

Solution:

Update ratings sequentially

For each team:

- Pre-game ratings = weighted sum of the team's post-game ratings
- Post-game ratings = ODM procedure with the pre-game ratings as initial ratings
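The slide does not say which weights the weighted sum uses. Purely as an illustration, the sketch below assumes exponentially decaying weights normalised to sum to one, so that recent post-game ratings count more. The function name and the `decay` parameter are hypothetical.

```python
def pre_game_rating(post_game_history, decay=0.9):
    """Weighted sum of a team's past post-game ratings (oldest first, newest last).

    Assumption: weights decay geometrically with age and are normalised to sum to 1.
    """
    if not post_game_history:
        raise ValueError("need at least one past post-game rating")
    ages = range(len(post_game_history) - 1, -1, -1)  # newest rating has age 0
    weights = [decay ** age for age in ages]
    total = sum(weights)
    return sum(w * r for w, r in zip(weights, post_game_history)) / total


# Made-up offensive ratings after a team's last three games.
print(pre_game_rating([1.8, 2.4, 2.1], decay=0.5))
```

The resulting value would then serve as the starting point of the ODM procedure for the team's next game.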

Match outcome prediction: via team rating systems

Two prediction models were explored:

- Ordered logit regression (result-based)

- Bivariate Poisson regression (goal-based)

[Figure: a predictor takes both teams' ratings (Elo, att, def) and outputs outcome probabilities, e.g. [ 0.43 0.33 0.24 ] for "Belgium wins", "It's a tie" and "England wins".]

Home advantage?
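To make the two kinds of predictor concrete, here is a small sketch of each. It makes several assumptions: the ordered logit uses a single rating difference as its only feature, the bivariate Poisson probabilities are built by trivariate reduction (two independent goal counts plus a shared component `lam3`), and all parameter values are made up. How the ratings map to the goal rates is not shown on the slides and is left out here.

```python
import math


def logistic(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def ordered_logit_probs(rating_diff, beta, c_low, c_high):
    """Return (P(home win), P(draw), P(away win)); requires c_low < c_high."""
    eta = beta * rating_diff
    p_away = logistic(c_low - eta)
    p_draw = logistic(c_high - eta) - p_away
    p_home = 1.0 - logistic(c_high - eta)
    return p_home, p_draw, p_away


def poisson_pmf(k: int, lam: float) -> float:
    return math.exp(-lam) * lam ** k / math.factorial(k)


def bivariate_poisson_pmf(x, y, lam1, lam2, lam3):
    """P(home scores x, away scores y) with shared component lam3 >= 0."""
    return sum(
        poisson_pmf(x - k, lam1) * poisson_pmf(y - k, lam2) * poisson_pmf(k, lam3)
        for k in range(min(x, y) + 1)
    )


def goal_model_probs(lam1, lam2, lam3, max_goals=10):
    """Sum score probabilities into (home win, draw, away win), truncated at max_goals."""
    p_home = p_draw = p_away = 0.0
    for x in range(max_goals + 1):
        for y in range(max_goals + 1):
            p = bivariate_poisson_pmf(x, y, lam1, lam2, lam3)
            if x > y:
                p_home += p
            elif x == y:
                p_draw += p
            else:
                p_away += p
    total = p_home + p_draw + p_away  # renormalise the truncated grid
    return p_home / total, p_draw / total, p_away / total


# Made-up parameters, for illustration only.
print(ordered_logit_probs(rating_diff=100.0, beta=0.005, c_low=-0.5, c_high=0.5))
print(goal_model_probs(lam1=1.4, lam2=1.0, lam3=0.1))
```

A home advantage could be modelled in either sketch by adding a constant to the home side's rating or goal rate, which is one possible reading of the question on the slide.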

Tuning the predictive power

How accurate are our predictions?

3 possible interpretations:

1. How many games are predicted correctly? → Accuracy

2. How certain was the model about the true outcome? → Logarithmic loss

3. How certain was the model about the true ordered outcome? → Ranked Probability Score (RPS)

$$\mathrm{RPS} = \frac{1}{r-1} \sum_{k=1}^{r-1} \left( \sum_{l=1}^{k} (\hat{p}_l - y_l) \right)^2$$
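The three measures can be compared on a single forecast. In the sketch below, $r$ is the number of ordered outcomes (3 here), $\hat{p}_l$ is the predicted probability of outcome $l$ and $y_l$ is 1 for the observed outcome and 0 otherwise. The forecast is the example vector from the predictor slide; the observed outcome (index 0, the first team wins) is chosen arbitrarily for illustration, and the function names are hypothetical.

```python
import math


def is_correct(probs, outcome):
    """1 if the most likely predicted outcome is the observed one, else 0."""
    return int(max(range(len(probs)), key=lambda i: probs[i]) == outcome)


def log_loss(probs, outcome):
    """Negative log of the probability assigned to the observed outcome."""
    return -math.log(probs[outcome])


def rps(probs, outcome):
    """Ranked Probability Score for one forecast over ordered outcomes."""
    r = len(probs)
    cum_p = cum_y = total = 0.0
    for k in range(r - 1):
        cum_p += probs[k]
        cum_y += 1.0 if k == outcome else 0.0
        total += (cum_p - cum_y) ** 2
    return total / (r - 1)


forecast = [0.43, 0.33, 0.24]  # win, draw, loss for the first team
print(is_correct(forecast, 0), log_loss(forecast, 0), rps(forecast, 0))
print(is_correct(forecast, 2), log_loss(forecast, 2), rps(forecast, 2))
```

Unlike the logarithmic loss, the RPS penalises this forecast more when the observed result is a loss than when it is a draw, because a loss is further away from the predicted win in the outcome ordering.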

Tuning the predictive power

Dataset → Training set | Validation set | Test set (apply best model)

Minimise RPS with the L-BFGS algorithm:

Until convergence:
    For each game ∈ Training set:
        update_rating(game)
        If game ∈ Validation set:
            make_prediction(game)
        End if
    End for
    Compute average RPS
    Update rating and prediction model parameters
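The tuning loop on this slide could look roughly like the sketch below. It is a simplified stand-in, not the authors' procedure: it tunes only the Elo factor `k`, replaces L-BFGS with a coarse grid search to stay dependency-free, predicts each validation game from the ratings before that game is used for an update, applies a symmetric update to the away team, and runs on a handful of invented games. All names and values are assumptions.

```python
import math


def predict(r_home, r_away, cut=0.5):
    """Hypothetical ordered-logit predictor: [P(home win), P(draw), P(away win)]."""
    eta = (r_home - r_away) * math.log(10.0) / 400.0
    p_away = 1.0 / (1.0 + math.exp(cut + eta))
    p_not_home = 1.0 / (1.0 + math.exp(-(cut - eta)))
    return [1.0 - p_not_home, p_not_home - p_away, p_away]


def rps(probs, outcome):
    cum_p = cum_y = total = 0.0
    for k in range(len(probs) - 1):
        cum_p += probs[k]
        cum_y += 1.0 if k == outcome else 0.0
        total += (cum_p - cum_y) ** 2
    return total / (len(probs) - 1)


def average_rps(k, games, validation_ids, start=1500.0):
    """Walk through the games in order, scoring validation games and updating ratings."""
    ratings, scores = {}, []
    for idx, (home, away, outcome) in enumerate(games):
        r_h, r_a = ratings.get(home, start), ratings.get(away, start)
        if idx in validation_ids:
            scores.append(rps(predict(r_h, r_a), outcome))
        s_home = (1.0, 0.5, 0.0)[outcome]  # 0 = home win, 1 = draw, 2 = away win
        e_home = 1.0 / (1.0 + 10.0 ** (-(r_h - r_a) / 400.0))
        delta = k * (s_home - e_home)
        ratings[home], ratings[away] = r_h + delta, r_a - delta
    return sum(scores) / len(scores)


# Invented results in chronological order: (home, away, outcome).
games = [("A", "B", 0), ("B", "C", 1), ("A", "C", 0), ("C", "A", 2),
         ("B", "A", 2), ("C", "B", 1), ("A", "B", 0), ("B", "C", 0)]
validation_ids = {5, 6, 7}

candidates = [0.0, 10.0, 20.0, 40.0, 80.0]
best_k = min(candidates, key=lambda k: average_rps(k, games, validation_ids))
print(best_k, average_rps(best_k, games, validation_ids))
```

With SciPy available, the grid search could be swapped for `scipy.optimize.minimize(..., method="L-BFGS-B")` over all rating and prediction parameters, which is closer to what the slide describes.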

Challenge I: Match outcome prediction

The models were validated on the 2002, 2006, 2010 and 2014 World Cups.

[Figure: Accuracy, LogLoss and RPS for the 2002, 2006, 2010 and 2014 World Cups and for all of them combined. Models compared: Elo ordered logit, Elo bivariate Poisson, ODM ordered logit, ODM bivariate Poisson, Elo+ODM ordered logit, Elo+ODM bivariate Poisson, random forest, bookmakers.]

Challenge I: Match outcome prediction

And compared with the 2017 Soccer Prediction Challenge submissions.

[Figure: Accuracy and RPS for bookmakers, Elo ordered logit, Elo+ODM ordered logit, Berrar et al., Hubáček et al., Constantinou, and Tsokos et al.]

[Figure: Accuracy, LogLoss and RPS per World Cup: 2014 (Elo, Elo+ODM, FiveThirtyEight), 2010 (Elo, Elo+ODM), 2006 (Elo, Elo+ODM), 2002 (Elo, Elo+ODM).]

Challenge II: Tournament elimination

How accurately can we predict the round of elimination of each team in previous World Cups?

Our predictions

Others' predictions

[Figure: Accuracy, LogLoss and RPS for FiveThirtyEight, Zeileis et al., Groll et al., UBS and our model. Values as extracted: 0.5, 0.563, 0.594, 0.563, 0.201, 0.224, 0.186, 0.185, 0.182, 0.192, 0.132, 0.126, 0.127, 0.124.]

Tournament elimination

Interactive online: https://dtai.cs.kuleuven.be/sports/worldcup18/

Thanks! Any questions?

Interactive at: https://dtai.cs.kuleuven.be/sports/worldcup18/
