Upload
others
View
2
Download
0
Embed Size (px)
Citation preview
The Player KernelLucas Maystre, Victor Kristof, Antonio González Ferrer, Matthias Grossglauser
School of Computer and Communication Sciences, EPFL
MLSA workshop @ ECML-PKDD — September 19th, 2016
ContextOur entry to the EURO 2016 Prediction Competition, Challenge 1
Task: probabilistic prediction of match outcomes
2
ContextOur entry to the EURO 2016 Prediction Competition, Challenge 1
Task: probabilistic prediction of match outcomes
2
predictor
⇥0.37 0.31 0.32
⇤Italy
Belgium
ContextOur entry to the EURO 2016 Prediction Competition, Challenge 1
Task: probabilistic prediction of match outcomes
2
predictor
⇥0.37 0.31 0.32
⇤Italy
Belgium
"Italy wins" "Belgium wins"
"It's a tie"
Starting point
3
M national teams. Team u has "strength" su
Starting point
3
P (u � v) =1
1 + exp[�(su � sv)]
M national teams. Team u has "strength" su
Starting point
3
P (u � v) =1
1 + exp[�(su � sv)]
M national teams. Team u has "strength" su
0
1
Starting point
3
P (u � v) =1
1 + exp[�(su � sv)]
M national teams. Team u has "strength" su
0
1
=
1
1 + exp(�s
>x)
2
64s1...
sM
3
75
2
666666666664
0...1...�1...0
3
777777777775
Starting point
Key challenges with national teams:
1. They play few matches every year: recent data is sparse
2. Their squad change frequently: old data is stale3
P (u � v) =1
1 + exp[�(su � sv)]
M national teams. Team u has "strength" su
0
1
=
1
1 + exp(�s
>x)
2
64s1...
sM
3
75
2
666666666664
0...1...�1...0
3
777777777775
4
Source: http://www.estadao.com.br/infograficos/onde-atuam-os-736-jogadores-da-copa-2010,esportes,280906
Inspiration Many players play against each other in club competitions
Can we transfer information from club matches to international matches?
Embed teams in the space of players.
Main idea
5
su =X
i2Lu
s̃i
Embed teams in the space of players.
Main idea
5
su =X
i2Lu
s̃i
P (u � v) =1
1 + exp[�(su � sv)]
Embed teams in the space of players.
Main idea
5
su =X
i2Lu
s̃i
P (u � v) =1
1 + exp[�(su � sv)]=
1
1 + exp(�˜s>z) 2
66664
s̃1......s̃P
3
77775
2
66666666666666666664
0...111...�1�1�1...0
3
77777777777777777775
Club matches and international matches share the same parameters.
Embed teams in the space of players.
Main idea
5
su =X
i2Lu
s̃i
P (u � v) =1
1 + exp[�(su � sv)]=
1
1 + exp(�˜s>z) 2
66664
s̃1......s̃P
3
77775
2
66666666666666666664
0...111...�1�1�1...0
3
77777777777777777775
Club matches and international matches share the same parameters.
Embed teams in the space of players.
Main idea
5
su =X
i2Lu
s̃i
P (u � v) =1
1 + exp[�(su � sv)]
The number of parameters explodes.Seems like it will lead to statistical and computational issues.
=
1
1 + exp(�˜s>z) 2
66664
s̃1......s̃P
3
77775
2
66666666666666666664
0...111...�1�1�1...0
3
77777777777777777775
Bayesian approach
6
prior distribution (e.g. Gaussian)
likelihoodposterior distribution
p(s̃ | D) / p(D | s̃)⇥ p(s̃)
Keep a distribution over parameters instead of optimizing the likelihood.
Bayesian approach
6
prior distribution (e.g. Gaussian)
likelihoodposterior distribution
p(s̃ | D) / p(D | s̃)⇥ p(s̃)
Keep a distribution over parameters instead of optimizing the likelihood.
Bayesian approach
6
prior distribution (e.g. Gaussian)
likelihoodposterior distribution
p(s̃ | D) / p(D | s̃)⇥ p(s̃)
Statistical issues solved? not clear Computational issues solved? no
Keep a distribution over parameters instead of optimizing the likelihood.
In fine, we are only interested in
Dual viewpoint
7
p(s̃>z | D) "strength" difference
In fine, we are only interested in
Dual viewpoint
7
p(s̃>z | D) "strength" difference
Accurate estimation of all parameters is not necessary
In fine, we are only interested in
Dual viewpoint
7
p(s̃>z | D) "strength" difference
Accurate estimation of all parameters is not necessary
f(z) = s̃>z p(f | D) / p(D | f)⇥ p(f)
In fine, we are only interested in
Dual viewpoint
7
p(s̃>z | D) "strength" difference
Accurate estimation of all parameters is not necessary
f(z) = s̃>z p(f | D) / p(D | f)⇥ p(f)
Cov[f(z), f(z0)] = �2z>z0
In fine, we are only interested in
Dual viewpoint
7
p(s̃>z | D) "strength" difference
Accurate estimation of all parameters is not necessary
Inference can be done in the dual space
f(z) = s̃>z p(f | D) / p(D | f)⇥ p(f)
Cov[f(z), f(z0)] = �2z>z0
In fine, we are only interested in
Dual viewpoint
7
p(s̃>z | D) "strength" difference
Accurate estimation of all parameters is not necessary
Inference can be done in the dual space
f(z) = s̃>z p(f | D) / p(D | f)⇥ p(f)
Cov[f(z), f(z0)] = �2z>z0
f(z) ⇠ GP[0, k(z, z0)]
In fine, we are only interested in
Dual viewpoint
7
p(s̃>z | D) "strength" difference
Accurate estimation of all parameters is not necessary
Inference can be done in the dual space
f(z) = s̃>z p(f | D) / p(D | f)⇥ p(f)
Cov[f(z), f(z0)] = �2z>z0
f(z) ⇠ GP[0, k(z, z0)] The player kernel!
The cubeLinear
RegressionLogistic
Regression
Kernel Regression
Kernel Classification
Bayesian Linear
Regression
Bayesian Logistic
Regression
GP Regression
GP Classification
KernelBayesian
Classification
Credit: Zoubin Ghahramani8
9
Dataset Collected data on 24 887 matches from main football competitions over the last 10 years.
33 157 distinct players appear in the dataset.
10
Results
11
Logarithmic loss against competing approaches in 2008, 2012 and 2016.
2008 2012 2016
Rao and Kupper (1967) proposed the following extension.
Ternary outcomes
13
P (u � v) =1
1 + exp[�(su � sv � ↵)]
P (u ⌘ v) = 1� P (u � v)� P (v � u)
/ P (u � v) · P (v � u)
Rao and Kupper (1967) proposed the following extension.
Ternary outcomes
13
A draw is (essentially) equivalent to one win and one loss.
P (u � v) =1
1 + exp[�(su � sv � ↵)]
P (u ⌘ v) = 1� P (u � v)� P (v � u)
/ P (u � v) · P (v � u)