An Intelligence Approach to Evaluation of Sports Teams
by Edward Kambour, Ph.D.

Agenda

I. College Football
II. Linear Model
III. Generalized Linear Model
IV. Intelligence (Bayesian) Approach
V. Results
VI. Other Sports
VII. Future Work

General Background

Goals
  Forecast winners of future games
    Beat the Bookie!
  Estimate the outcome of unscheduled games
    What’s the probability that Iowa would have beaten Ohio St?
  Generate reasonable rankings

Major College Football

No playoff system
“Computer rankings” are an element of the BCS
114 teams
12 games for each in a season

Linear Model

Rothman (1970’s), Harville (1977), Stefani (1977), …, Kambour (1991), …, Sagarin???
Response, Y, is the net result (point-spread)
Parameter, $\theta$, is the vector of ratings
For a game involving teams i and j,
  $E[Y] = \theta_i - \theta_j$

Linear Model (cont.)

Let X be a row vector with
  $X_k = \begin{cases} 1 & \text{if } k = i \\ -1 & \text{if } k = j \\ 0 & \text{otherwise} \end{cases}$
so that
  $E[Y] = X\theta$
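As a minimal sketch of the design row defined above, each game contributes one row with +1 for team i and -1 for team j. The helper names and the ratings are hypothetical, chosen only for illustration.

```python
def design_row(i, j, n_teams):
    """Row of X for a game between teams i and j: +1 at i, -1 at j, 0 elsewhere."""
    row = [0] * n_teams
    row[i] = 1
    row[j] = -1
    return row


def expected_margin(row, ratings):
    """E[Y] = X theta, computed as a dot product."""
    return sum(x * r for x, r in zip(row, ratings))


# Hypothetical ratings for four teams (illustrative values only).
ratings = [10.0, 3.0, -4.0, -9.0]
row = design_row(0, 2, len(ratings))
margin = expected_margin(row, ratings)  # theta_0 - theta_2

# Every row sums to zero, so adding a constant to all ratings leaves E[Y]
# unchanged. The ratings are therefore only determined up to a shift.
shifted = expected_margin(row, [r + 5.0 for r in ratings])
```

Because only rating differences enter the model, a fit needs some extra constraint (for example, ratings that sum to zero) to pin down the level.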

Regression Model Notes

Least Squares
  Normality, Homogeneity
College Football
  Estimate 100 parameters
  Sample size for a full season is about 600
  Design Matrix is sparse and not full rank

Home-field Advantage

Generic Advantage (Stefani, 1980)
  Force i to be home team and j the visiting team
  Add an intercept term to X
  Adds one more parameter to estimate
  UAB = Alabama
  Rice = Texas A&M
Team Specific Advantage
  Doubles the number of parameters to estimate
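The generic advantage can be sketched by prepending an intercept column to each design row, so that the expected margin becomes the home advantage plus the rating difference. The function names and all parameter values below are hypothetical.

```python
def design_row_with_home(home, visitor, n_teams):
    """Intercept column first (always 1), then +1 for the home team, -1 for the visitor."""
    row = [1] + [0] * n_teams
    row[1 + home] = 1
    row[1 + visitor] = -1
    return row


def expected_margin(row, params):
    """Dot product of one design row with the parameter vector."""
    return sum(x * p for x, p in zip(row, params))


# params[0] is the generic home-field advantage; the rest are team ratings.
# All values are hypothetical.
params = [3.0, 10.0, 3.0, -4.0]
home_game = expected_margin(design_row_with_home(0, 1, 3), params)  # 3 + 10 - 3
away_game = expected_margin(design_row_with_home(1, 0, 3), params)  # 3 + 3 - 10
```

With a single intercept every home team gets the same boost, which is the sense in which UAB is treated like Alabama. A team-specific version would replace the one intercept with one column per team.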

Linear Model Issues

Normality
Homogeneity
Lots of parameters, with relatively small sample size
  Overfitting
  The bookie takes you to the cleaners!

Linear Model Issues (cont.)

Should we model point differential?
  A and B play twice
    A by 34 in first, B by 14 in the second
    A by 10 each time
Running up the score (or lack thereof)
BCS: Thou shalt not use margin of victory in thy ratings!

Logistic Regression

Rothman (1970s)
Linear Model
Use binary variable
  Winning is all that matters
  Avoid margin of victory
Coin Flips
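A minimal sketch of the "coin flip" view: each game is a biased coin whose bias depends only on the rating difference. The standard logistic link is used here as an assumption; the slides do not give Rothman's exact form or scale.

```python
import math


def win_probability(rating_i, rating_j):
    """Logistic link: P(i beats j) depends only on the rating difference."""
    return 1.0 / (1.0 + math.exp(-(rating_i - rating_j)))


# Hypothetical ratings on the logit scale.
even = win_probability(1.2, 1.2)
favorite = win_probability(2.0, 0.5)
underdog = win_probability(0.5, 2.0)
```

The two sides of a matchup always sum to one, and equal ratings give an even coin.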

Logistic Regression Issues

Still have sample size issues
Throw away a lot of information
Undefeated teams

Transformations

Transform the differentials to normality
  Power transformations
Rothman logistic transform
  Transforms points to probabilities for logistic regression
“Diminishing returns” transforms
  Downweights runaway scores

Power Transforms

Transform the point-spread
  $Y = \operatorname{sign}(Z)\,|Z|^a$

  a = 1: straight margin of victory
  a = 0: just win baby
  0 < a < 1: Poisson or Gamma “ish”
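The power transform can be sketched in a few lines. The margins below reuse the 34, 14 and 10 point examples from the earlier slide; the function name is hypothetical.

```python
import math


def power_transform(z, a):
    """Y = sign(Z) * |Z|**a; a = 1 keeps the margin, a = 0 keeps only who won."""
    if z == 0:
        return 0.0  # a tie has no sign
    return math.copysign(abs(z) ** a, z)


margins = [34, -14, 10, 0]
straight = [power_transform(z, 1.0) for z in margins]
win_only = [power_transform(z, 0.0) for z in margins]
square_root = [power_transform(z, 0.5) for z in margins]
```

Intermediate powers shrink large margins more than small ones, so a 34-point win counts for less than 3.4 times a 10-point win.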

Maximum Likelihood Transform

1995-2002 seasons
MLE = 0.98

Power   -2ln(likelihood)
0.1     52487
0.3     41213
0.5     35128
0.67    32597
0.8     31418
1       31193

Predicting the Score

Model point differential
  $Y_1 = S_i - S_j$
Additionally model the sum of the points scored
  $Y_2 = S_i + S_j$
Fit a similar linear model (different parameter estimates)
Forecast home and visitors score
  $H = (Y_1 + Y_2)/2$, $V = (Y_2 - Y_1)/2$
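Recovering the two scores from the forecast difference and sum is a two-line inversion. The forecast values here are made up for illustration.

```python
def scores_from_diff_and_sum(y1, y2):
    """Invert Y1 = H - V and Y2 = H + V to recover the two scores."""
    home = (y1 + y2) / 2
    visitor = (y2 - y1) / 2
    return home, visitor


# Hypothetical forecasts: home team by 14, with 48 total points.
home, visitor = scores_from_diff_and_sum(14, 48)
```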

Another Transformation Idea

Scores (touchdowns or field goals) are arrivals, maybe Poisson
  Final score = 7 times a Poisson + 3 times a Poisson + …
Transform the scores to homogeneity and normality first
The differences (and sums) should follow suit

Square Root Transform

Since the score is “similar” to a linear combination of Poissons, square root should work
Transformation
  $T = \sqrt{S + k}$
Why k?
  For small Poisson arrival rates, get better performance (Anscombe, 1948)
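A short delta-method argument shows why the square root stabilizes the variance. It is written for a single Poisson count $S$ with rate $\lambda$, which is a simplification of the "linear combination of Poissons" description of a football score.

```latex
\[
T = \sqrt{S + k}, \qquad S \sim \mathrm{Poisson}(\lambda)
\]
\[
\mathrm{Var}[T] \approx \left( \left. \frac{dT}{dS} \right|_{S = \lambda} \right)^{2} \mathrm{Var}[S]
= \frac{1}{4(\lambda + k)} \cdot \lambda
= \frac{\lambda}{4(\lambda + k)}
\;\longrightarrow\; \frac{1}{4} \quad \text{as } \lambda \to \infty
\]
```

To first order the variance no longer depends on the scoring rate, which is the homogeneity the linear model needs. For a single Poisson count Anscombe's choice is $k = 3/8$; a fitted $k$ for football scores need not match that value, since a score is not a single Poisson count.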

Likelihood Test

LRT: No transformation vs. square root with fitted k
  Used College Football results from 1995-2002
  k = 21
Transformation was significantly better
  p-value = 0.0023, chi-square = 9.26
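The reported p-value can be checked from the chi-square statistic. This sketch assumes the test has 1 degree of freedom (one extra fitted parameter, k); the slides do not state the degrees of freedom.

```python
import math


def chi_square_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    # P(chi2_1 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)) for standard normal Z.
    return math.erfc(math.sqrt(x / 2.0))


p_value = chi_square_sf_1df(9.26)
```

Under that assumption the result agrees with the reported 0.0023 to the precision shown.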

Predicting the Score with Transform

Model point differential
  $Y_1 = \sqrt{S_i + 21} - \sqrt{S_j + 21}$
Additionally model the sum of the points scored
  $Y_2 = \sqrt{S_i + 21} + \sqrt{S_j + 21}$
Forecast home and visitors score
  $H = ((Y_1 + Y_2)/2)^2 - 21$, $V = ((Y_2 - Y_1)/2)^2 - 21$
Note the point differential is the product
  $S_i - S_j = Y_1 Y_2$
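A round trip through the transform makes the algebra concrete. Squaring (Y1 + Y2)/2 returns the home score plus 21, so the offset is subtracted to get back to points, and the product Y1 * Y2 is the untransformed point differential. The final score used here is made up.

```python
import math

K = 21  # fitted offset reported in the talk


def transform(score):
    """Square-root transform of a raw score."""
    return math.sqrt(score + K)


def untransform(t):
    """Inverse of transform: back to points."""
    return t * t - K


home_score, visitor_score = 31, 17  # hypothetical final score
y1 = transform(home_score) - transform(visitor_score)
y2 = transform(home_score) + transform(visitor_score)

home_back = untransform((y1 + y2) / 2)
visitor_back = untransform((y2 - y1) / 2)
margin = y1 * y2  # difference of squares: (S_i + K) - (S_j + K)
```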

Unresolved Linear Model Issues

Overfitting
History
  Going into the season, we have a good idea as to how teams will do
  The best teams tend to stay the best
  The worst teams tend to stay the worst
Changes happen
  Kansas State

Intelligence Model

Concept
  The ratings and home-ads for year t are similar to those of year t-1.
  There is some drift from one year to the next.

Model
  $\theta_t = \theta_{t-1} + \delta_t$, where $\delta_t \sim N(0, \tau^2)$
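The drift idea can be sketched as a random walk for a single team's rating: each season starts from last season's value and moves by a normal step. The starting rating, drift size and seed are hypothetical, and a scalar rating is used for simplicity.

```python
import random


def simulate_ratings(initial, drift_sd, n_seasons, rng):
    """Random-walk drift: each season's rating is last season's plus normal noise."""
    path = [initial]
    for _ in range(n_seasons - 1):
        path.append(path[-1] + rng.gauss(0.0, drift_sd))
    return path


rng = random.Random(2002)  # fixed seed so the run is repeatable
path = simulate_ratings(initial=5.0, drift_sd=0.5, n_seasons=8, rng=rng)
```

Under this walk the uncertainty about a team grows with the number of seasons since it was last observed, which is why last year's rating is informative but not decisive.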

Intelligence Model (Details)

Notation
  L teams
  M seasons of data
  $N_i$ games in the ith season
  $X_i$: the $N_i$ by 2L “X” matrix for season i
  $Y_i$: the $N_i$ vector of results for season i

  $\theta_i$: the 2L vector of ratings and home-field advantages for season i

Details (cont.)

Data Distribution: For all i = 1, 2, …, M
  $Y_i \mid \theta_i, \sigma^2 \sim N(X_i \theta_i, \sigma^2 I)$ (independent)

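Details (cont.)

Prior Distribution
  $\theta_1 \mid \sigma^2 \sim N\left(0,\ \sigma^2 \begin{pmatrix} I & 0 \\ 0 & 0.05\,I \end{pmatrix}\right)$
  $\theta_i \mid \theta_{i-1}, \sigma^2 \sim N\left(\theta_{i-1},\ \sigma^2 \begin{pmatrix} 0.25\,I & 0 \\ 0 & 0.01\,I \end{pmatrix}\right)$ for $i = 2, \ldots, M$
  $\sigma^{-2} \sim \mathrm{Gamma}(2, 0.5)$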

Details (finally, the end)

The Posterior Distribution of $\theta_M$ and $\sigma^{-2}$ is closed form and can be calculated by an iterative method
The Predictive Distribution for future results (transformed sum or difference) is straight-forward correlated normal (given the variance)
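One way to picture the iterative calculation is a season-by-season update: condition on the games of a season, then widen the variance by the drift before the next season. The sketch below is a heavily simplified assumption, not the talk's algorithm: it tracks one scalar rating and treats the noise variance as known, whereas the model above has a vector of ratings and an unknown variance. All numbers are hypothetical.

```python
def update(prior_mean, prior_var, y, noise_var):
    """Conjugate normal update for one observation y ~ N(theta, noise_var)."""
    post_var = 1.0 / (1.0 / prior_var + 1.0 / noise_var)
    post_mean = post_var * (prior_mean / prior_var + y / noise_var)
    return post_mean, post_var


def drift(mean, var, drift_var):
    """Move to the next season: same mean, variance inflated by the drift."""
    return mean, var + drift_var


mean, var = 0.0, 1.0      # hypothetical prior for one rating
for y in [2.0, 2.0]:      # hypothetical results in season 1
    mean, var = update(mean, var, y, noise_var=1.0)
season2_prior = drift(mean, var, drift_var=0.25)
```

Each result pulls the mean toward the data and shrinks the variance; the drift step then hands a slightly vaguer prior to the following season.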

Forecasts

For Scores
  Simply untransform
  $E[Z^2] = \mathrm{Var}[Z] + E[Z]^2$
For the point-spread
  Product of two normals
  Simulate 10000 results
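Both forecast steps can be sketched with the standard library. The expected score uses the second-moment identity and then removes the offset of 21 from the square-root transform. The point spread is simulated as the product of the transformed difference and sum. Drawing the two normals independently is a simplifying assumption (the talk describes the predictive distribution as correlated normal), and all means and standard deviations are hypothetical.

```python
import random

K = 21  # offset from the square-root transform in the talk


def expected_score(mean_t, var_t):
    """E[T^2] - K, using E[Z^2] = Var[Z] + E[Z]^2 for the transformed score T."""
    return var_t + mean_t ** 2 - K


def simulate_spread(mean_diff, sd_diff, mean_sum, sd_sum, n, rng):
    """Draw the transformed difference and sum, multiply to get point spreads.

    Assumption: the two normals are drawn independently here for simplicity.
    """
    return [rng.gauss(mean_diff, sd_diff) * rng.gauss(mean_sum, sd_sum)
            for _ in range(n)]


rng = random.Random(0)
spreads = simulate_spread(1.0, 0.5, 14.0, 0.5, 10000, rng)  # hypothetical inputs
mean_spread = sum(spreads) / len(spreads)
p_favorite_wins = sum(s > 0 for s in spreads) / len(spreads)
score = expected_score(7.0, 0.25)
```

The same simulated spreads can be compared with a betting line to estimate the chance of covering it.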

Enhanced Model

Fit the prior parameters
  Hierarchical models
  Drifts and initial variances
No closed form for posterior and predictive distributions (at least as far as I know)
The complete conditionals are straight-forward, so Gibbs sampling will work (eventually)
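To illustrate what sampling from complete conditionals looks like, here is a toy Gibbs sampler for a normal mean and precision. It is not the enhanced ratings model: the data are made up, the priors are assumptions, and the gamma prior is parameterized by shape and rate with illustrative defaults.

```python
import random


def gibbs_normal(data, n_iter, rng, m0=0.0, p0=1e-6, a=2.0, b=0.5):
    """Gibbs sampler for y ~ N(mu, 1/tau), mu ~ N(m0, 1/p0), tau ~ Gamma(a, rate=b)."""
    n, total = len(data), sum(data)
    mu, tau = total / n, 1.0
    draws = []
    for _ in range(n_iter):
        # mu | tau, data is normal with precision p0 + n * tau.
        prec = p0 + n * tau
        mu = rng.gauss((p0 * m0 + tau * total) / prec, prec ** -0.5)
        # tau | mu, data is gamma; gammavariate takes (shape, scale).
        rate = b + 0.5 * sum((y - mu) ** 2 for y in data)
        tau = rng.gammavariate(a + n / 2.0, 1.0 / rate)
        draws.append((mu, tau))
    return draws


data = [4.8, 5.1, 5.3, 4.9, 5.2, 5.0, 4.7, 5.4]  # made-up observations
draws = gibbs_normal(data, 3000, random.Random(1))
kept = draws[500:]  # discard burn-in
mu_mean = sum(m for m, _ in kept) / len(kept)
```

Each parameter is drawn in turn from its distribution given the current value of the other, and the retained draws approximate the joint posterior.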

Results (www.geocities.com/kambour/football.html)

2002 Final Rankings

Team            Rating          Home
Miami           72.23 (1.03)    0.21 (0.04)
Kansas St       72.04 (1.04)    0.44 (0.03)
USC             71.95 (1.03)    0.04 (0.03)
Oklahoma        71.85 (1.02)    0.18 (0.03)
Texas           71.57 (1.03)    0.36 (0.03)
Georgia         71.49 (1.03)    0.02 (0.03)
Alabama         71.45 (1.03)    -0.09 (0.03)
Iowa            71.30 (1.03)    0.21 (0.04)
Florida St      71.29 (1.02)    0.43 (0.03)
Virginia Tech   71.25 (1.03)    0.12 (0.03)
Ohio St         71.18 (1.03)    0.27 (0.03)


Bowl Predictions

Ohio St            17
Miami Fl (-13)     31    0.8255    0.5228

Washington St      21
Oklahoma (-6.5)    31    0.7347    0.5797

Iowa               21
USC (-6)           30    0.7174    0.5721

NC State (E)       20
Notre Dame         17    0.5639    0.5639

Florida St (+4)    24
Georgia            27    0.5719    0.5320

2002 Final Record

Picking Winners
  522 – 157    0.769
Against the Vegas lines
  367 – 307 – 5    0.544
Best Bets
  9 – 7    0.563
  In 2001, 11 – 4
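The percentages can be checked from the win, loss and push counts. Counting a push as half a win is an assumption: the slides do not state the convention, but it reproduces the reported 0.544 against the line, whereas dropping pushes would round to 0.545.

```python
def win_pct(wins, losses, pushes=0):
    """Winning percentage with each push counted as half a win."""
    games = wins + losses + pushes
    return (wins + 0.5 * pushes) / games


picking_winners = win_pct(522, 157)
against_the_line = win_pct(367, 307, 5)
```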

ESPN College Pick’em (http://games.espn.go.com/cpickem/leader)

1. Barry Schultz 5830
2. Jim Dobbs 5687
3. Michael Reeves 5651
4. Fup Biz 5594
5. Joe * 5587
6. Rising Cream 5562
7. Intelligence Ratings 5559

Ratings System Comparison (http://tbeck.freeshell.org/fb/awards2002.html)

Todd Beck
  Ph.D. Statistician
  Rush Institute
Intelligence Ratings – Best Predictors

College Football Conclusions

Can forecast the outcome of games
Capture the random nature
  High variability
  Sparse design
Scientists should avoid BCS
  Statistical significance is impossible
  Problem Complexity
  Other issues

NFL

Similar to College Football
  Square root transform is applicable
Drift is a little higher than College Football
Better design matrix
Small sample size
Playoff

NFL Results (www.geocities.com/kambour/NFL.html)

2002 Final Rankings (after the Super Bowl)

Team            Rating    Home
Tampa Bay       70.72     0.29
Oakland         70.57     0.28
Philadelphia    70.55     0.10
New England     70.16     0.12
Atlanta         70.13     0.20
NY Jets         70.10     -0.01
Pittsburgh      69.95     0.28
Green Bay       69.92     0.28
Kansas City     69.90     0.51
Denver          69.89     0.50
Miami           69.89     0.49

2002 Final NFL Record

Picking Winners
  162 – 104 – 1    0.609
Against the Vegas lines
  135 – 128 – 4    0.513
Best Bets
  9 – 8    0.529

NFL Europe

Similar to College and NFL
  Square root transform
Dramatic drift
  Teams change dramatically in mid-season
Few teams
  Better design matrix

College Basketball

Transform?
  Much more normal (Central Limit Theorem)
A lot more games
  Intersectional games
Less emphasis on programs than in College Football
  More drift
NCAA tournament

NCAA Basketball Pre-tournament Ratings

Team           Rating    Home
Arizona        100.06    3.97
Kentucky       99.33     4.32
Kansas         95.89     3.85
Texas          93.42     4.44
Duke           92.90     4.66
Oklahoma       90.19     4.31
Florida        90.65     3.99
Wake Forest    88.70     3.65
Syracuse       88.50     3.49
Xavier         87.89     3.37
Louisville     87.88     4.16

NBA

Similar to College Basketball
  Normal – No transformation
A lot more games – fewer teams
Playoffs are completely different from regular season
  Regular season – very balanced, strong home court
  Post season – less balanced, home court lessened

Hockey

Transform
  Rare events = “Poissonish”
  Square root with k around 1
A lot more games
History matters
Playoffs seem similar to regular season
Balance

Soccer

Similar to hockey
Transform
  Square root with low k
Not a lot of games
Friendlies versus cup play
Home pitch is pronounced
  Varies widely

Soccer Results

Correctly forecasted 2002 World Cup final
  Brazil over Germany
Correctly forecasted US run to quarter-finals
Won the PROS World Cup Soccer Pool

Future Enhancements

Hierarchical Approaches
  Conferences
More complicated drift models
  Correlations
  Individual drifts
  Drift during the season
  Mean correcting drift
More informative priors