Paper Draft - Final


NFL Gambling Paper


  • Washington University in St. Louis

    Olin Business School

    The NFL Gambling Market: Testing efficiency and the late season bias

    Author: Daniel Sear

    Supervisor: Dr. Dirk Nitzsche

    May 21, 2010

  • Abstract

    This paper analyzes the NFL gambling market for inefficiencies. Testing game data from 2006 to 2009, we find that betting on late season home underdogs can be profitable, representing a market inefficiency. Next we find that, unlike other recent research, the weather and climate in which the game is played do not represent a mis-pricing. Finally, we develop both a Binary and an OLS Base Model regression and find that the Binary Base Model regression can be utilized out-of-sample to form a profitable strategy on all games late in the season.


  • Acknowledgements

    I would like to thank Professor Dirk Nitzsche for his guidance through the construction of this paper. Thanks also to Brian Burke of AdvancedNFLStats.com for pointing me in the right direction on some interesting research. Finally, I would like to thank Professor Richard Borghesi for his helpful clarifications on the finer points of his research.


  • Declaration

    I declare that this dissertation is the result of my own work and includes nothing which is the outcome of work done in collaboration. It is not substantially the same as any which I have submitted for a degree, diploma, or other qualification at any other university. Additionally, no part of this dissertation has already been, or is currently being, submitted for any such degree, diploma, or other qualification.

    (Daniel Sear)


  • Contents

    1 Introduction
        1.1 NFL Background Information
        1.2 Betting in the NFL
        1.3 Necessity for New Research

    2 Literature Review
        2.1 OLS Models
        2.2 Binary Models
        2.3 Other variables

    3 Analysis
        3.1 Data
        3.2 Methodology
        3.3 Time variation

    4 Results
        4.1 Statistical analysis
        4.2 Regression Analysis
        4.3 In-sample predictability
        4.4 Out-of-sample predictability

    5 Conclusion

    Appendices

    Bibliography


  • List of Tables

    1.1 Illegal betting by sport in the United States

    3.1 Summary of Data

    4.1 NFL home team summary statistics by week
    4.2 NFL home underdog summary statistics by week
    4.3 Persistence of biases in the NFL
    4.4 Weather effects
    4.5 Nevada football betting
    4.6 Success rates of simple betting rules in the NFL
    4.7 NFL in-sample predictability
    4.8 NFL out-of-sample predictability using the base binary model

    1 Climate of NFL teams


  • Chapter 1

    Introduction

    Gambling has become an integral aspect of modern sports. A motivated person can

    place a bet on virtually anything; take the Super Bowl, for instance. Bets can be

    placed on every aspect of that game from simply the winner or loser to the coin toss

    (heads or tails), even the length of time it takes for the national anthem to be sung

    (an over/under). These bets not only drive interest in the sports on which wagering

    occurs, but they are also a huge business: illegal gambling alone attracts an estimated

    268 to 325 billion dollars each year in the United States (see Table 1.1). Given the

    large amount of money that flows through this market each year, an inefficiency

    could result in large profits for a bettor who can exploit the mis-pricing properly.

    This paper aims to determine if the market for NFL bets displays any biases that

    could be exploited or if it is an efficient market with regard to the variables we test.

    1.1 NFL Background Information

    The National Football League was formed in 1920 and has risen from small beginnings

    to become the best-attended American sport by a wide margin (MacCambridge,

    2005).¹ The league consists of 32 teams organized into two conferences, each with four

    divisions of four teams. Following a four week preseason, teams play 16 regular season

    games over a 17 week season, allowing one bye week per team per season. The bye

    week generally comes between week 4 and week 10 of the regular season. The regular

    season currently begins the Thursday evening after Labor Day (the first Monday

    in September) and ends the last week of December. Games are played primarily

    on Sunday with a weekly primetime game on Monday (the famous Monday Night

    Football); however, as the season progresses, weekly games are added on Thursday

    and Saturday. Six teams from each conference make the playoffs, which consist of

    four rounds: the Wild Card, Divisional, Conference Championship, and Super Bowl.

    ¹ The average attendance for an NFL game is 67,509.

    Sear 7

  • 1.2. BETTING IN THE NFL CHAPTER 1. INTRODUCTION

    Table 1.1: Illegal betting by sport in the United States

    League/Event                                      Total Wagers ($)
    National Football League                          80–100 billion
    Super Bowl                                        6–10 billion
    College Football                                  60–70 billion
    College Basketball                                50 billion
    NCAA Basketball Tournament                        6–12 billion
    National Basketball Association                   35–40 billion
    Major League Baseball                             30–40 billion
    Hockey, Golf, NASCAR, Boxing, and Other Sports    1–3 billion
    Soccer                                            Nominal
    Total                                             268–325 billion

    Notes: This table shows the estimated amount of illegal gambling conducted in the United States each year by league or event. The information comes from a study done by CNBC and does not factor in any legal gambling conducted in casinos or other authorized sports books.

    The Super Bowl falls in the first week of February of the year following the season (so

    the 2009 season's Super Bowl was played in February 2010).

    The NFL has a hard salary cap and revenue sharing, which foster parity

    between teams.² This also allows smaller market teams such as the Green Bay Packers to

    be on level footing with big market teams like the New York Giants. The NFL's main

    demographic is males aged 16 to 49. This is also the main demographic of

    sports bettors which helps to explain the extreme volume of wagering that is centered

    around the NFL.

    1.2 Betting in the NFL

    In standard American football betting (both legal and illegal) a point spread system

    is used. In this system a spread is set for the weekend's games early in the week,

    normally Monday. Bets are taken up until kickoff when the betting ends and the

    ² A hard cap is one that a team cannot exceed. In contrast, the NBA has a soft cap, which means a team must pay a fine for any salary that is over the cap but can still have a payroll exceeding the cap. Revenue sharing means that most revenues the league generates are pooled together and distributed evenly among the 32 teams. For instance, the NFL signs television rights contracts and gives the Dallas Cowboys 1/32nd of the money, the Chicago Bears 1/32nd of the money, and so on.


    closing line is established. An example of a line would be "Chicago minus five at

    Minnesota." This would mean that Chicago is expected to beat Minnesota by five

    points. If a bettor believes one team is undervalued compared to its opponent,

    they bet on that team with a bookmaker ("bookie"). In this case, if Chicago outscores

    Minnesota by more than five points, bets on Chicago win. If Minnesota loses by four

    points or fewer (including scenarios where it wins outright), bets on Minnesota win.

    Finally, if Chicago wins by exactly five points, a "push" is declared and the money is

    simply returned to the bettor.³
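The settlement rule described above can be sketched in a few lines of Python. This is only an illustration: the function name, its arguments, and the sample scores are hypothetical and do not come from the paper.

```python
def settle_spread_bet(favorite_score, underdog_score, spread, bet_on_favorite):
    """Return 'win', 'lose', or 'push' for one point-spread bet.

    spread is the number of points the favorite is expected to win by,
    for example 5 for "Chicago minus five" (hypothetical convention).
    """
    margin = favorite_score - underdog_score  # favorite's actual margin
    if margin == spread:
        return "push"  # the stake is simply returned
    favorite_covers = margin > spread
    return "win" if favorite_covers == bet_on_favorite else "lose"


# "Chicago minus five at Minnesota" with made-up final scores
print(settle_spread_bet(27, 20, 5, bet_on_favorite=True))   # Chicago wins by 7
print(settle_spread_bet(24, 20, 5, bet_on_favorite=False))  # Chicago wins by 4
print(settle_spread_bet(25, 20, 5, bet_on_favorite=True))   # Chicago wins by 5
```

With a half-point line such as 5.5 the push branch can never trigger, which is why a half-point move changes the outcome of a bet when the favorite wins by exactly five.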

    In point spread betting the bookie acts much like a stock exchange specialist.

    The bookie, like the stock exchange specialist, charges a fee for setting up sellers and

    buyers (in this case bettors wagering on both sides of the line). The bookie makes

    money in two ways. First, for each wager a bookie pays out $10 in winnings for

    each bet of $11. This means that if in the previous example a bet was placed on

    Chicago for $11 and Chicago won by seven points the bettor would be given $21 by

    the bookie, not $22; this is known as the vigorish, or "vig."⁴ Second, the bets that

    lose are collected by the bookie and what is not given to other bettors in winnings is

    their profit.
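As a rough illustration of the 11-for-10 arithmetic, the sketch below computes what a bettor receives back and the win rate needed to break even. The function names are hypothetical, and pushes are ignored in the break-even calculation.

```python
from fractions import Fraction

STAKE = 11  # dollars risked on each bet
WIN = 10    # dollars won on a successful bet


def returned_to_bettor(outcome, stake=STAKE, win=WIN):
    """Cash handed back by the bookie for one bet."""
    if outcome == "win":
        return stake + win  # $11 stake back plus $10 in winnings
    if outcome == "push":
        return stake        # stake returned, nothing won
    return 0                # losing stake is kept by the bookie


def break_even_rate(stake=STAKE, win=WIN):
    """Win rate p that solves p * win - (1 - p) * stake = 0."""
    return Fraction(stake, stake + win)


p = break_even_rate()
print(returned_to_bettor("win"))    # 21
print(p, round(float(p) * 100, 2))  # 11/21 52.38
```

The exact break-even rate of 11/21 is consistent with the figure of roughly 52.4 percent used elsewhere in the paper.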

    Like the stock exchange specialist, a bookie would like to avoid ending up in

    a naked position. Therefore, if many bettors are wagering on one side of a bet the

    bookie will adjust the line to encourage betting on the other side. For example, if

    there are many bets placed on Chicago to win by five the bookie might change the

    line from "Chicago minus five at Minnesota" to "Chicago minus five and a half at

    Minnesota." This will encourage more betting on Minnesota as that bet would win

    ³ Sometimes no favorite is declared and the line is zero; this is commonly referred to as a pick-em game.

    ⁴ Other betting systems exist in other sports. In baseball and hockey a possible bet would look like (-175, +190). In this case a bettor could win $1 by betting $1.75 on the favorite or win $1.90 by betting $1 on the underdog. Due to the nature of this system, baseball and hockey bettors are not concerned with point spreads, only with who wins the game. Another system, called pari-mutuel betting, exists in horse racing. As opposed to the odds and point-spread betting systems, where the payoff is locked in at the time of the bet, the winning bettors in a pari-mutuel system divide the total amount bet after commissions are deducted. This means that a bettor who places a $1 bet on a 20-to-1 horse might receive only 15-to-1 odds after all the bets are placed. In this system transaction costs are normally a percentage of the bet amount.


  • 1.3. NECESSITY FOR NEW RESEARCH CHAPTER 1. INTRODUCTION

    if Minnesota loses by five where previously they would have pushed. The bookie will

    continue to adjust the line so all bets offset, reducing his exposure. However, it does

    not matter to a bettor what the line does after their bet is placed. If a bet was placed

    on Chicago when the line was minus five and the line subsequently moved to five and

    a half, that person's bet would still win or lose based on a line of five; the line is

    locked in for each bet when it is made. Therefore, since betting lines move according

    to the dollar amount of wagers on each side of the bet, closing spreads should reflect

    all public and private information as well as any biases of market participants.

    Illegal betting, which we have seen is equally if not more popular than the

    legal variety, can be conducted through the point-spread system or a variety of other

    bets, known as "prop bets." Prop bets exist in legal casinos but are much more

    popular in the illegal gambling market. Prop bets are bets not on the outcome of the

    game, but on any range of ancillary factors pertaining to the contest. For instance,

    we previously mentioned two popular prop bets for the Super Bowl: heads or tails

    for the opening coin flip and the length of time it takes for the National Anthem to

    be sung. In addition to bets like this, illegal prop bets can range from guessing if a

    questionable player will make the starting lineup, to details as inane as what a coach

    will be wearing on the sidelines. The illegal gambling market is generally regarded as

    a place where any bet can be made if a bettor has the money to support the wager.

    Because of this, it is a much less structured market that is difficult to gather hard

    details on; we will be focusing this research on the more defined market of Las Vegas

    NFL gambling.⁵

    1.3 Necessity for New Research

    Despite the existing research on betting market efficiency, this paper is necessary

    because the area is constantly evolving. The

    idea of efficiency in markets is predicated upon the belief that when an inefficiency is

    identified in liquid markets, arbitrageurs will exploit the inefficiency until it disappears

    ⁵ The information for this section has been taken from the CNBC study cited above.


    and all assets are appropriately priced. In this study, we examine an inefficiency

    identified by Borghesi (2007) that led to a mis-pricing of late season NFL games,

    specifically home underdogs. If the idea holds that markets correct themselves back

    to an efficient state after time, this mis-pricing should either be gone or at least

    dissipating. Borghesi's study used data through 2000; we examine

    data from 2006 to 2009.

    We believe that we will find the mis-pricing has disappeared. The mis-pricing

    identified was large enough that a savvy bettor or group of bettors would have come

    into the market with large amounts of capital and exploited the bias until it

    disappeared. Furthermore, we do not think a model can be created that is accurate

    enough to overcome transaction costs such as the vigorish. Such a model would have to be

    accurate around 53% of the time to break even. In short, we believe that the

    NFL gambling market has returned to an efficient state.


  • Chapter 2

    Literature Review

    2.1 OLS Models

    There have been numerous studies on the efficiency of the NFL gambling market over

    the years. Most repeat or build on the first study, done in 1985 by Zuber et al. In that

    study, the authors built on research into the efficiency of racetrack gambling

    to analyze the NFL. They first tested for weak form efficiency in the market. They

    thought of the betting lines on NFL games as forward prices on stocks. The way

    to test for market efficiency in that situation is to use the forward prices to predict

    the future spot prices. Using this analogy they tested for market efficiency in NFL

    betting lines using a simple regression where $DP_{i,t}$, the dependent variable, is the

    actual difference in points between the teams playing in the $i$th game in week $t$ and

    $PS_{i,t}$, the independent variable, is the final point spread. They note that $PS$ is

    not the bookmaker's opinion of the outcome. The first line that is set in the week

    represents the bookmaker's (expert) assessment of the game; however, as bets come

    in he adjusts the line to avoid excessive risk. Therefore, by the end of the week the

    line will reflect all public knowledge about the game.¹ It follows that if their equation

    rejects the null hypothesis that the intercept is 0 and the slope is 1, the market is not efficient.

    They found that based on the 1983 season they could not reject the null for

    13 of the 16 weeks. They decided to test the complete opposite as well: the null

    hypothesis that the intercept and slope are both 0, meaning that the point spread was entirely unrelated to

    the outcome. This also could not be rejected for 15 of the 16 weeks. Based on that

    result they determined that the weak form test was not a sufficient indicator of the

    efficiency of the market.
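A minimal sketch of the weak form test follows, assuming both the actual point difference and the closing spread are measured as home minus visitor. The data are invented for illustration, and the function names are hypothetical rather than taken from Zuber et al. The F statistic would be compared with an F(2, n - 2) critical value.

```python
def ols_fit(ps, dp):
    """Least-squares intercept and slope from regressing dp on ps."""
    n = len(ps)
    mean_ps = sum(ps) / n
    mean_dp = sum(dp) / n
    sxx = sum((x - mean_ps) ** 2 for x in ps)
    sxy = sum((x - mean_ps) * (y - mean_dp) for x, y in zip(ps, dp))
    slope = sxy / sxx
    intercept = mean_dp - slope * mean_ps
    return intercept, slope


def joint_f_stat(ps, dp, intercept0=0.0, slope0=1.0):
    """F statistic for H0: intercept = intercept0 and slope = slope0."""
    n = len(ps)
    a, b = ols_fit(ps, dp)
    rss_unrestricted = sum((y - a - b * x) ** 2 for x, y in zip(ps, dp))
    rss_restricted = sum(
        (y - intercept0 - slope0 * x) ** 2 for x, y in zip(ps, dp)
    )
    # Two restrictions in the numerator, n - 2 degrees of freedom below
    return ((rss_restricted - rss_unrestricted) / 2) / (rss_unrestricted / (n - 2))


# Invented closing spreads and actual point differences (home minus visitor)
point_spreads = [3, -7, 2.5, 6, -3, 10, -1, 4]
point_diffs = [7, -3, -4, 14, -10, 3, 6, 1]

print(ols_fit(point_spreads, point_diffs))
print(joint_f_stat(point_spreads, point_diffs))            # efficiency null
print(joint_f_stat(point_spreads, point_diffs, 0.0, 0.0))  # "unrelated" null
```

With only a handful of games per sample, both statistics tend to be small, which mirrors the difficulty of rejecting either null on weekly samples.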

    ¹ This assumes that there is a high level of activity on the betting lines. While actual total figures on the amount bet on each game are unavailable due to the illegal placement of many bets, it is safe to assume that there is sufficient activity given that Nevada casinos alone take in over $66.6 billion each year in football bets.


  • 2.1. OLS MODELS CHAPTER 2. LITERATURE REVIEW

    Their next approach was to define an efficient market as one in which no

    player could develop a consistently profitable strategy. In this case, that would be

    developing a strategy that involves winning greater than 52.40 percent of the time.²

    In an attempt to determine a strategy that would achieve a win rate greater than 52.40

    percent they developed another simple, one variable regression equation where $DP_{i,t}$

    again represents the actual difference in points between the teams, $B$ is a vector of

    coefficients to be estimated and $X^h_{i,t}$ and $X^v_{i,t}$, the independent portion of the equation,

    are matrices of observable variables for the home and away teams respectively. Each

    of the game and team variables in the matrix $X_{i,t}$ is defined in the same way as

    $DP_{i,t}$, meaning each consists of the difference between the value of the variable for

    the home team and its value for the visiting team. The explanatory variables used

    in this equation are game statistics like yards rushed, proportion of passing plays

    attempted to total offensive plays, and number of rookies for a team. The model was

    found to be significant in explaining the actual difference in points between the two teams.

    This model was used to predict the actual difference in points, $DPP_{i,t}$, and several

    betting methods were determined.³ They would bet on a game if $|DPP_{i,t} - PS_{i,t}|$ exceeded a threshold, where the threshold was 0.5, 1, 2, and 3 in different scenarios. The team to bet on is determined

    by the sign of $(DPP_{i,t} - PS_{i,t})$: if this expression is positive, the gamble is made on the home team; if negative, the gamble is made on the visiting team. The bet is won

    if the team gambled on beats the spread. This requires that the sign of $(DP_{i,t} - PS_{i,t})$ coincide with the sign of $(DPP_{i,t} - PS_{i,t})$.
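The betting rule just described can be sketched as follows. The function names are hypothetical, all three quantities are assumed to be measured as home minus visitor, and the strict inequality against the threshold is an assumption, since the comparison operator is not recoverable from the text.

```python
def pick_side(dpp, ps, threshold):
    """Return 'home', 'visitor', or None when no bet is placed.

    dpp: predicted point difference; ps: closing point spread.
    """
    gap = dpp - ps
    if abs(gap) <= threshold:  # assumption: bet only when the gap exceeds it
        return None
    return "home" if gap > 0 else "visitor"


def bet_won(dp, dpp, ps):
    """The bet wins when (dp - ps) and (dpp - ps) have the same sign."""
    return (dp - ps) * (dpp - ps) > 0


# Invented example: model predicts home by 6, the line says home by 3
print(pick_side(6, 3, threshold=2))  # gap of 3 clears the threshold
print(bet_won(10, 6, 3))             # home wins by 10 and covers
print(bet_won(1, 6, 3))              # home wins by 1 and fails to cover
```

A larger threshold places fewer bets but restricts them to games where the model and the market disagree most.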

    Every threshold value tested produced a winning percentage above the required

    52.40 percent. They concluded that there was a way to develop a winning strategy in

    this market, but reasoned that an inefficiency did not necessarily exist. This is due

    to other factors which must be considered as costs like the time and effort it takes

    to form all of these analyses. They reasoned that this opportunity existed because

    ² Gambling on an NFL game is carried out on the "11 for 10" rule: the gambler must lay out $11 for every $10 he or she wishes to win. The percentage of winning bets ($WP$) necessary to break even, 52.40 percent, is obtained by setting the expected value of the random variable, a gamble $WP(10) + (1 - WP)(-11)$, equal to zero.

    ³ The explanatory variables were produced each week based on statistics from week 1 to week $(t - 1)$.


    no one valued the gain from exploiting it more than they valued the other ways in

    which they could spend their time.

    There are two main criticisms of this initial study. First, the authors chose to

    conduct it only over one season, 1983. This means that the inefficiencies that they

    found potentially were only present in that season and the inclusion of other seasons

    would have eliminated the inefficiency. Second, the authors did not test their

    model on any other seasons. It was shown by Sauer

    et al. that when their strategy was applied to other seasons, games outside of the

    sample, the strategy was not consistently profitable.

    In response to these criticisms, Sauer et al. (1988) repeated Zuber et al.'s

    study, making refinements so that the results would be more accurate. To start, they

    changed the sample size from 14 to 224 to increase the accuracy of the weak form

    test, Zuber et al.'s first equation.⁴,⁵ With this new approach, Sauer et al. arrived

    at the same result as the previous paper, being unable to reject the null when

    testing for weak form efficiency. However, unlike Zuber et al., they were

    able to reject the null on the extreme alternative test in which the intercept and slope are both 0.

    Sauer et al. go on to take issue with Zuber et al.'s second test, which used

    past game statistics to predict the point spread. They argued that the game statistics

    would be widely known by the bookies and the bettors and would be included in the

    point spread.⁶ This was tested using a modified version of Zuber et al.'s second

    equation

    $$DP_{i,t} - PS_{i,t} = B\,(X^h_{i,t} - X^v_{i,t}) + \epsilon_{i,t} \qquad (2.1)$$

    where

    $$X^h_{i,t} = \frac{1}{t-1} \sum_{s=1}^{t-1} X^h_s \qquad (2.2)$$

    ⁴ Zuber et al. broke up the season into 16 samples of 14 games each, making their n = 14, whereas Sauer et al. included all the games of the season, giving them n = 224.

    ⁵ It has been shown that the variance of a least-squares estimator is inversely related to sample size, so the smaller sample size for Zuber et al. increased the likelihood that they would fail to reject the null hypothesis.

    ⁶ They did not contend that Zuber et al.'s strategy would not have made money in certain seasons, but they argued that it does not represent a market inefficiency.


  • 2.2. BINARY MODELS CHAPTER 2. LITERATURE REVIEW

    and similarly defined for $X^v_{i,t}$. This allows the authors to regress the Zuber et al.

    variables against both the actual difference in points and the betting point

    spread, with the stipulation that the coefficient on the betting point spread is equal to

    1.0. Using this model Sauer et al. find that there is virtually no information added

    to the existing betting line by the Zuber et al. variables. Sauer et al. recreate the

    Zuber et al. model for the 1984 season and find that for every threshold the strategy would

    end up losing money.⁷ Sauer et al. conclude by stating they have found no evidence

    to support the previous claim of inefficiency and they believe the market for NFL

    betting is efficient.
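The explanatory variables in Equations 2.1 and 2.2 are averages of each team's statistics over weeks 1 to t - 1, differenced between the home and visiting teams. A minimal sketch, with hypothetical function names and invented rushing-yard figures:

```python
def lagged_averages(weekly_values):
    """Map each week t >= 2 to the average of the values from weeks 1..t-1."""
    averages = {}
    running_total = 0.0
    for week, value in enumerate(weekly_values, start=1):
        if week > 1:
            averages[week] = running_total / (week - 1)
        running_total += value  # this week's value only counts from next week
    return averages


def differenced_regressor(home_values, visitor_values, week):
    """Home-minus-visitor lagged average used as a regressor for that week."""
    return lagged_averages(home_values)[week] - lagged_averages(visitor_values)[week]


# Invented rushing yards for weeks 1 to 3
home_rushing = [100, 140, 90]
visitor_rushing = [80, 60, 130]

print(lagged_averages(home_rushing))                            # {2: 100.0, 3: 120.0}
print(differenced_regressor(home_rushing, visitor_rushing, 3))  # 50.0
```

Because the week t value is excluded from its own average, the regressor uses only information available before kickoff.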

    2.2 Binary Models

    Other authors such as Gray and Gray (1997), Dare and Holland (2004), and Borghesi

    (2007) have continued the work on the subject of NFL gambling market efficiency.

    However, these authors began to introduce the binary (probit) model. Gray and Gray

    looked at long-term biases in the market. They were the first researchers to use a

    probit model in their analysis. They replaced the left side of the model with a binary

    variable to account for the fact that bettors do not care by what magnitude they win

    their bets, only if the bet is won. They examine both in-sample as well as out-of-

    sample performance of multiple betting strategies. As with other studies, Gray and

    Gray's in-sample strategies perform very well. They devise multiple simple strategies

    generating positive returns. For instance, the simple strategy of betting on all home

    underdogs achieves an accuracy of nearly 57%, around 4 percentage points greater

    than the break-even point. This result was statistically significant for their

    sample, but they note it has begun to dissipate over recent years. They noticed that over

    the last 11 seasons of their sample (1984-1994) only three seasons generated accuracy

    greater than the break even point.

    Gray and Gray further use their probit model out-of-sample. In these tests,

    they set up two scenarios: (1) bet on games where the model predicts a team has a

    ⁷ The losses were as much as $2,920 for a threshold of 0.5.


    greater than 50% chance to win the bet and (2) where the model predicts a team has

    a greater than 57.5% chance to win the bet. Under these specifications the authors

    generate out-of-sample returns of 6.93 percent and 16.67 percent, respectively. Other

    probit strategies they developed which generate positive in-sample returns did not

    generate those same positive returns when they were applied outside of the sample.

    They go on to declare that some of the returns generated with various probit samples

    were negative (though not significant) and that not all probit filters generate

    positive returns. They use this fact to give further weight to their claim that the

    inefficiencies they have noted are dissipating over time and will soon be gone.
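A probability filter of this kind can be sketched as below. The predicted probabilities and outcomes are invented, the function name is hypothetical, and the return is computed per dollar risked under the 11-for-10 rule, which may differ from how Gray and Gray define their returns.

```python
def filtered_return(predictions, cutoff, stake=11.0, win=10.0):
    """Bet only when the predicted win probability exceeds the cutoff.

    predictions: list of (predicted_probability, bet_won) pairs.
    Returns (number_of_bets, win_rate, return_per_dollar_risked),
    or None when the filter selects no games.
    """
    placed = [won for prob, won in predictions if prob > cutoff]
    if not placed:
        return None
    wins = sum(placed)
    losses = len(placed) - wins
    profit = wins * win - losses * stake
    return len(placed), wins / len(placed), profit / (len(placed) * stake)


# Invented model output: (predicted probability, whether the bet won)
toy_predictions = [
    (0.52, True),
    (0.58, True),
    (0.61, False),
    (0.49, True),
    (0.60, True),
]

print(filtered_return(toy_predictions, 0.50))
print(filtered_return(toy_predictions, 0.575))
```

Raising the cutoff shrinks the number of bets, so a stricter filter yields returns that rest on fewer games and are harder to distinguish from chance.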

    They came to the conclusion that their models indicated an overreaction by

    bettors to a team's recent performance, ignoring the long-term success of the team.

    They proposed a strategy of identifying teams that were performing well over the

    course of the season but poorly in recent weeks and betting on them or, conversely,

    betting against teams that are on a hot streak but are poor performers overall. This

    is essentially the contrarian strategy found in financial markets. At the end of their

    paper, Gray and Gray suggest that the probit model should be used for all further

    research (instead of the widely used OLS model) and that this model should be tested

    using other variables such as weather and performance variables.

    Dare and Holland (2004) worked to refine and consolidate the work done by

    Sauer et al. and Zuber et al. They took the most mathematically based approach

    of any researcher to date and analyzed the regression models used by Dare and

    McDonald (1996) and Gray and Gray (1997). They point out that characteristics

    used by Gray and Gray, such as whether a team is favored, an underdog, home, or visiting,

    are correlated and show that the methods used by Gray and Gray would likely lead to

    biased estimates and false rejection of the null hypothesis. They note that Dare and

    McDonald noticed this issue as well and accounted for it by imposing restrictions on

    the estimates that account for the correlation. This restricted the home team bias to

    be the opposite of the visiting team bias and the favorite team bias to be the opposite

    of the underdog bias. Dare and McDonald, using this model, found minimal to no


    evidence of inefficiency in the market.

    Dare and Holland criticize the Dare and McDonald specification on the point

    that venue and favorite are deemed to be unrelated in the Dare and McDonald model.

    On the contrary, Dare and Holland find that a home team is almost twice as likely to

    be a betting favorite as a visiting team. They then mathematically work through

    the Dare and McDonald equation so they can compare derivatives of the coefficients.

    They find the Dare and McDonald model to be overly restrictive and therefore unable to

    allow for the proper pricing of games.⁸ They then work through the model backwards,

    beginning with the misspecified portion of the Dare and McDonald model, and add in

    Gray and Gray's binary process to come up with a model that correctly specifies all

    of the biases in the market. This model was subsequently used by Borghesi (2007)

    to examine the persistence of the late season bias.

    Borghesi published multiple papers looking at specific situations that he

    believed might represent an inefficiency. The first of those papers identified an

    inefficiency in the betting market with regard to temperature. He found that game day

    temperature has a great effect on the outcome of the game and is not represented in

    the betting line. In another paper he identifies a bias in the final few weeks of each

    season. Specifically, the home-underdog effect is not properly accounted for and he

    shows that a profitable strategy can be produced from exploiting this bias.⁹ This

    paper also tests for overall market efficiency. However, he uses a different equation

    to model that efficiency. Using Gray and Gray's idea that bettors have no interest

    in the actual difference in points, only in whether they win or lose the bet, Borghesi used the

    equation proposed by Dare and Holland (2004) where $W_i$ is 1 if the favorite covers

    and 0 otherwise, $HF_i = 1$ if the home team is the favorite and $HF_i = 0$ otherwise,

    $VF_i = 1$ if the visiting team is the favorite and $VF_i = 0$ otherwise, and $CL_i$ is the

    absolute value of the closing line. This formula allows for a discrete analysis to take

    ⁸ Specifically, Dare and Holland find the Dare and McDonald model assumes that there are as many home underdogs as home favorites. This is an assumption that is almost never true and thus requires a reworking of the model.

    ⁹ The home-underdog effect is when home underdogs are generally undervalued by the betting market, so it becomes profitable to wager on their side of the spread.


    place. He then modifies this model by changing $W_i$ to $DP_i$, a variable that reflects the

    outcome. This shift establishes an OLS model that takes the magnitude of a bettor's

    win into account. While the OLS model had been discredited by previous studies

    (most notably Gray and Gray (1997)), Borghesi included it along with the binary

    model to back up Gray and Gray's assertion that it is flawed.
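To make the variable definitions concrete, the sketch below builds the four quantities for a single game. The `Game` record, the sign convention for the line, and the decision to drop pick-em games and pushes are all assumptions made for illustration, not details taken from Borghesi or from Dare and Holland.

```python
from dataclasses import dataclass


@dataclass
class Game:
    home_score: int
    visitor_score: int
    home_line: float  # assumed convention: negative when the home team is favored


def base_model_row(game):
    """Return (W, HF, VF, CL) for one game, or None if there is no usable bet."""
    if game.home_line == 0:
        return None  # pick-em game: no favorite to define W, HF, or VF
    hf = 1 if game.home_line < 0 else 0
    vf = 1 - hf
    cl = abs(game.home_line)  # absolute value of the closing line
    home_margin = game.home_score - game.visitor_score
    favorite_margin = home_margin if hf else -home_margin
    if favorite_margin == cl:
        return None  # push: the bet neither wins nor loses
    w = 1 if favorite_margin > cl else 0  # 1 when the favorite covers
    return w, hf, vf, cl


print(base_model_row(Game(30, 20, -7.0)))  # home favorite covers
print(base_model_row(Game(20, 17, 3.5)))   # home underdog wins outright
```

Replacing the binary outcome with the point difference for the same rows gives the inputs for the OLS variant discussed above.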

    He also looks at time variation in the model. He examines three separate

    variables relating to time: (1) the possibility that market participants might need

    more than the one week between games to properly process all the information

    generated from the week of games, (2) whether bettors value different characteristics

    depending on the period of the season,¹⁰ and (3) whether there is a momentum effect

    present in the form of over or under-valuing recent wins and losses, recent offensive

    and defensive performance, or any other performance measure.

    The findings are quite interesting. He finds that home teams strongly

    outperform the betting line during the last four weeks of the season, significant at the

    1 percent level. In the playoffs, the mis-pricing is even stronger with home teams

    favored to win by 5.75 points winning by an average of 8.60 points over the 20 season

    sample. This means late in the season bettors routinely place too many bets on

    away teams. The author was unable to find any evidence of mis-pricing of underdogs

    overall, but he did discover mis-pricing in the smaller subgroup of home underdogs.

    The spread on these games favors the away team by 4.65 points on average, yet the

    home underdogs lose by an average of only 3.13 points, a significant result at the 1

    percent level. In the playoffs this effect is even greater with home underdogs winning

    outright on average. The author also shows that the effect is entirely coming from

    the last four weeks of the season and has been growing over the 20 year sample, not

    slowing as bettors became more aware of the phenomenon.
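The averages quoted above imply that home teams beat the line by 8.60 - 5.75 = 2.85 points in the playoffs and that late season home underdogs beat it by 4.65 - 3.13 = 1.52 points. A rough sketch of how such an average forecast error and a simple t statistic could be computed is shown below; the data are invented, the function name is hypothetical, and this is not necessarily the test Borghesi used.

```python
from math import sqrt
from statistics import mean, stdev


def mean_forecast_error(home_margins, home_spreads):
    """Average of (actual home margin - spread-implied home margin).

    home_spreads holds the points the home team is expected to win by,
    so a home underdog has a negative value (assumed convention).
    Returns the mean error and a one-sample t statistic against zero.
    """
    errors = [m - s for m, s in zip(home_margins, home_spreads)]
    avg = mean(errors)
    t_stat = avg / (stdev(errors) / sqrt(len(errors)))
    return avg, t_stat


# Invented late season games: actual home margins and closing home spreads
margins = [10, 3, -1, 7]
spreads = [6, 1, -4, 3]

print(mean_forecast_error(margins, spreads))
```

A positive mean error indicates that the closing line undervalued home teams in the sample.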

    He reasons that the late season bias arises because of the weather

    for the home team.¹¹ The correlation between weather and outcome is strong in the

    ¹⁰ His financial example is small cap vs. large cap in the 1980s, and the football example is putting less value in performance measures earlier in the season because teams are still trying to figure out how good they are.

    11Weather has been a factor in almost every study done on this topic, however Borghesi was the

    Sear 18

  • 2.3. OTHER VARIABLES CHAPTER 2. LITERATURE REVIEW

    last four weeks (when it is cold in the northern cities) involving home underdogs from

    cold weather cities like Chicago, Buffalo, and New York among others. The author

    reasons that this may also be the result of bettors who are down money for the season

    overall choosing to alter their strategies from rational ones to irrational ones in the

    hope that a change will improve their fortunes. Another alternative reason the author

    presents is that bettors might become more risk tolerant as the season progresses,

    resulting in irrational movement of the spread. The author is unable to test these

    hypotheses with his data, however. Instead, he reasons in support of his claim saying

    that as the NFL season progresses it gets increasing media coverage resulting in

    more casual bettors. This swings the proportion from the rational, informed bettors

    dominating the market to the irrational, casual bettors having a significant influence

    on the market. Borghesi finishes his paper by devising regression models to exploit the

    aforementioned inefficiencies. Using 1-month, 1-year, and 5-year Base Binary Models

    he was able to pinpoint a short-term bias (using the 1-month model specifically) to

    generate a late-season success rate of 53.25 percent.

    2.3 Other variables

    Over the last twenty years there have been many papers written on the topic of NFL

    gambling market efficiency. Most of the recent papers have focused on taking the

    models that Zuber et al., Sauer et al. or Dare and Holland derived and adding new

    information or performance variables. Some of these studies find inefficiencies that

    can be exploited by a bettor as Borghesi does with the late season home underdog,

    however, most find these new variables provide no new information. In 2006 Boulier,

    Stekler and Amundson published a study on the efficiency of the NFL betting market

    modeled on the Zuber et al. study from 1985. In that study, Zuber et al. used the

    weak form efficiency test combined with performance variables like yards of offense to

    create a predictive model for a games outcome. Boulier, Stekler and Amundson used

    the same model and instead of on-field performance measures used off-field variables.

    first to look at it from a time oriented perspective.

    Sear 19

  • 2.3. OTHER VARIABLES CHAPTER 2. LITERATURE REVIEW

    They examined the explanation power of the New York Times NFL Power Rankings,

    whether the home team plays in a complex with a dome, and whether the home team

    has artificial turf or not.12

    The authors looked at a modern set of seasons, 1994-2000. It can be assumed

    that they chose to start in 1994 due to the establishment of the salary cap and

    free agency in 1993 though that is never explicitly stated in the paper.13 They

    found that for these seasons the test for weak form efficiency could not reject the

    null, meaning they could not declare it an inefficient market with that test. Their

    tests using their new variables also showed very little evidence of mis-pricings in the

    market. They found the dummy variable testing for differences in opponents'

    home playing surface to be significant in-sample; however, when tested out-of-sample

    it lost significance. They concluded by stating that these new variables have been

    properly factored into the prices of the market and do not represent any bias that

    can be exploited by a gambler.

    Another study done by Borghesi in 2006 investigated the effect of the weather

    and whether it represents a home team advantage that is not reflected in

    the spread. The type of weather he analyzed in this paper was game day temperature,

    different from the general climate distinctions he uses in his later paper on late season

    biases. Using data from 1981 to 2000, he is able to locate another mis-pricing in

    the NFL gambling market using this analysis. He first shows that forecast errors

    in the NFL point spread betting market are biased and provides a link from that

    error to the game day temperature conditions. He finds that not only absolute

    temperature but also relative temperature acclimatization affects the performance of

    players. He finds that the worst mis-pricing occurs when home teams play in the

    coldest temperatures.14 He then adjusts his model to account for the home underdog

    12They believed that teams with a dome play a unique style that would put them at an advantage against non-dome teams and that teams with artificial turf also played a unique style that would put them at an advantage against natural grass teams.

    13The establishment of the hard cap and free agency significantly altered the way the game was played and managed and is oddly never brought up in any of these studies. Our sample comes well after the effects of these two policies had been realized, an advantage that other samples do not have.

    14He uses this finding again in his paper on late season bias.

    bias, and still finds a significant mis-pricing in these cold temperature games. This

    finding lends credibility to the argument that the NFL gambling market experienced

    a mis-pricing in the later part of the 20th century; however, most of the studies that

    discover this mis-pricing indicate the effect to be dissipating over time. This leads us

    to believe that the mis-pricings have been resolved by the seasons our data is from,

    2006-2009.

    Chapter 3

    Analysis

    3.1 Data

    The analysis presented in this paper is predicated upon the presence of statistical

    anomalies in the outcome of NFL games. To locate these anomalies we compile

    a time-series data set. This data set consists of all non-exhibition NFL contests

    between the start of the 2006 season and the end of the 2009 season.1 The data

    include home/away team, scores, closing lines (point spreads), and statistics (time

    of possession, yards, etc.). Seven games have been removed from the sample as

    they were played at neutral sites so home field cannot be considered an advantage

    for either team. These include three games played in London: New England Patriots

    versus Tampa Bay Buccaneers in week seven, 2009; San Diego Chargers versus New

    Orleans Saints in week eight, 2008; New York Giants versus Miami Dolphins in week

    eight, 2007. Additionally, the Super Bowl, the NFL championship game, is played at

    a neutral site each year, so those four games were left out. During the sample, the Buffalo

    Bills played two games in Toronto. These contests were left in the sample because

    Toronto is close to Buffalo and the Bills have tremendous fan support in the area.

    Because of this they still benefited from any home field advantage they would have

    had in Buffalo, if not greater support given the Toronto fans only got to see the

    team once a year and were more excited and enthusiastic compared to the fans in

    Buffalo. This results in 255 regular season and 10 playoff games examined for all

    seasons except for 2006 in which we examine 256 regular season games and 10 playoff

    games, yielding an entire sample of 1061 games.
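
    As a quick check on the sample size described above, the arithmetic can be sketched in Python. The per-season totals of 256 regular season games and 11 playoff games (including the Super Bowl) are assumptions based on the league schedule format in these seasons; the exclusions follow the text, and the variable names are purely illustrative.

```python
# Assumed schedule format for 2006-2009: 256 regular season games and
# 11 playoff games (including the Super Bowl) per season.
REGULAR_SEASON_GAMES = 256
PLAYOFF_GAMES = 11

# Neutral-site London games removed from each season, as listed in the text.
london_games = {2006: 0, 2007: 1, 2008: 1, 2009: 1}


def games_kept(season):
    """Return (regular season, playoff) games retained for one season."""
    regular = REGULAR_SEASON_GAMES - london_games[season]
    playoffs = PLAYOFF_GAMES - 1  # the Super Bowl is played at a neutral site
    return regular, playoffs


sample = {season: games_kept(season) for season in london_games}
total = sum(regular + playoffs for regular, playoffs in sample.values())
print(sample, total)
```

    The two Buffalo games played in Toronto are not subtracted, since the text keeps them in the sample.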

    Table 3.1 displays a breakdown of the point spreads in our sample. The

    regular season has an average spread of 5.98 points. Interestingly, this is slightly

    1The NFL season starts in early September and ends with the Super Bowl, normally in the first week of February of the next year. Therefore the end of the 2009 season was actually in early 2010. For simplicity we refer to each season by the year in which it started, ignoring the overlap into the early stages of the next year.

    Table 3.1: NFL point-spread summary statistics

                             Closing line
    Category            N    Median    Mean
    Regular season   1021      5.00    5.98
    Playoffs           40      4.25    5.70
    Underdogs        1053      5.00    6.01
    Home underdogs    353      3.50    4.99
    Pick-ems            8      0.00    0.00
    Pushes             25      3.00    4.64
    All Games        1061      5.00    5.97

    Notes: This table contains summary statistics describing the closing spreads of all NFL games played between 2006 and 2009. Regular season excludes games in the pre-season, playoffs, and three regular season games played in London (one each year beginning in 2007). Playoffs include only games in the post-season, excluding the Super Bowl each year as well. Underdogs include only games in which there is a nonzero point-spread. Home underdogs include only games in which the home team is assigned a positive point-spread. Pick-ems include only games in which there is a closing point-spread of zero. Pushes include only games in which the final outcome exactly matches the closing point-spread. Closing spread describes the predicted difference in the number of points scored by the two teams in each category (absolute value of home minus away score).

    greater than the average playoff spread, 5.70. Common thinking is that the playoff

    format puts a weaker team on the road against a higher seeded team which would

    cause the predicted score to heavily favor the stronger, home team. Underdogs are

    expected to lose by 6.01 points on average while home underdogs are expected to

    lose by only 4.99 points.2 Data was gathered from NFL.com (scores and statistics)

    and FootballLocks.com, a betting strategies website (point spreads).

    3.2 Methodology

    We start our examination with an analysis to determine if there is a statistically

    significant difference between the point spread and the actual outcome of the games. In the

    2Not shown in the table are the most common point spreads, which are three points (N=115) and seven points (N=43). Predictably, the most common outcomes of games are also three points (N=94) and seven points (N=58).

    past (Zuber et al., 1985; Sauer et al., 1988) this has been tested using the equation

    $DP_i = \alpha + \beta PS_i + \epsilon_i$    (3.1)

    where $DP_i$ is the actual difference in points between the teams playing in the $i$th game, $PS_i$ is the final point spread, and $\epsilon_i$ is the error term. If expectations of efficiency hold, $\alpha = 0$ and $\beta = 1$. However, using this equation proves problematic for multiple reasons. First, this model only measures biases that are present over an entire sample. For example, if home teams cover by three points during the first half of a sample and away teams cover by three points during the second half of a sample, this test would show $\alpha = 0$ and $\beta = 1$, ignoring the significant trends present within the sample and not identifying the strong biases. Second, the way the data is defined can restrict the power of the model. Since the dependent variable, $DP_i$, is only based on one factor, measurement with respect to the underdog will only identify underdog bias and measurement with respect to the home team will only measure home team bias. A new model is required that takes these facts into account.
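
    The first problem can be illustrated with a small sketch. The spreads below are made-up values chosen only for illustration, the helper name `weak_form_fit` is hypothetical, and both the spread and the outcome are assumed to be measured from the same team's perspective. The pooled regression of equation 3.1 recovers an intercept of zero and a slope of one even though each half of the sample carries a three-point bias.

```python
import numpy as np


def weak_form_fit(ps, dp):
    """OLS fit of dp = alpha + beta * ps; returns (alpha, beta)."""
    X = np.column_stack([np.ones_like(ps), ps])
    (alpha, beta), *_ = np.linalg.lstsq(X, dp, rcond=None)
    return alpha, beta


# Illustrative spreads only (not data from the sample), repeated in both halves.
spreads = np.array([-7.0, -3.0, 1.0, 3.0, 6.5])
ps = np.concatenate([spreads, spreads])

# First half: outcomes beat the spread by 3; second half: fall short by 3.
bias = np.concatenate([np.full(5, 3.0), np.full(5, -3.0)])
dp = ps + bias

alpha_all, beta_all = weak_form_fit(ps, dp)
alpha_first, beta_first = weak_form_fit(ps[:5], dp[:5])
alpha_second, beta_second = weak_form_fit(ps[5:], dp[5:])

print("pooled:     ", alpha_all, beta_all)
print("first half: ", alpha_first, beta_first)
print("second half:", alpha_second, beta_second)
```

    Fitting the two halves separately exposes the offsetting intercepts, which is the motivation for examining subsamples rather than the whole sample at once.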

    Home teams have an inherent advantage over their opponents. Over the four

    sampled seasons 66.73% of favored teams were playing at home. Thus, we need a

    model that will take this into account, something that equation 3.1 does not. To

    differentiate between the two factors (home and underdog effects), isolate them and

    determine their effect, Gray and Gray (1997) propose the model

    $Y_i = \alpha + \beta_1 HOME_i + \beta_2 FAV_i + \epsilon_i$    (3.2)

    This model fails to allow for other, unexpected relationships, however, and it

    needs to contain proper restrictions on estimates. For instance, the model needs to

    be restricted so the home effect is the negative of the away effect. Dare and Holland

    (2004) developed a model that correctly isolates the venue and spread explanatory

    variables using the model

    $D_i = a_{HF} HF_i + a_{VF} VF_i + (\beta - 1) CL_i + \epsilon_i$    (3.3)

    In this case, $D_i$ is the outcome (the difference in points scored between the favorite and the underdog) minus the closing line; $HF_i = 1$ if the home team is the favorite and $HF_i = 0$ otherwise; $VF_i = 1$ if the visiting team is the favorite and $VF_i = 0$ otherwise; and $CL_i$ is the absolute value of the closing line. This is the model we will use and it will be referred to as the Base Model.

    However, this model would seem to indicate that the magnitude by which a

    bet is won is of import to the bettor. We have seen from Gray and Gray (1997) that

    it is unclear how much more information is provided by the magnitude of a win. If

    a bettor only views their bet in terms of success or failure the magnitude of that

    success or failure is irrelevant. This leads to a binary state, win or loss. Equation 3.3

    can be augmented to reflect this binary view.3 The model is the same as equation 3.3; however, the left-hand-side variable is now $W_i$, which is 1 if the favorite covers and 0 otherwise:

    $W_i = a_{HF} HF_i + a_{VF} VF_i + (\beta - 1) CL_i + \epsilon_i$    (3.4)

    If these two equations produce substantially different results it would indicate

    that the magnitude by which a bet wins (loses) is of importance to bettors.

    However, we begin our analysis assuming that bettors are principally concerned with

    the success or failure of a bet. Bettors learn to price the underlying asset (the game

    in this case) based on repeated success and failure against the spread. Therefore, we

    will use a discrete choice regression as the basis for our analysis.
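
    To make the variable definitions concrete, the following is a minimal sketch of how the inputs to equations 3.3 and 3.4 could be built and how the OLS version could be estimated. It is not the estimation code used for this paper. The function names, the sign convention for the inputs (closing line relative to the home team, negative meaning the home team is favored) and the toy games are all assumptions made for illustration. A discrete choice estimator (probit or logit) would be applied to $W_i$ in place of the least squares step shown here.

```python
import numpy as np


def base_model_inputs(home_spread, home_margin):
    """Build D_i, W_i and the regressors [HF_i, VF_i, CL_i].

    home_spread: closing line relative to the home team (negative = home favored).
    home_margin: home points minus away points.
    Pick-em games (spread of zero) have no favorite and are dropped.
    """
    home_spread = np.asarray(home_spread, dtype=float)
    home_margin = np.asarray(home_margin, dtype=float)
    keep = home_spread != 0
    spread, margin = home_spread[keep], home_margin[keep]

    hf = (spread < 0).astype(float)   # 1 if the home team is the favorite
    vf = (spread > 0).astype(float)   # 1 if the visiting team is the favorite
    cl = np.abs(spread)               # absolute value of the closing line
    fav_margin = np.where(spread < 0, margin, -margin)
    d = fav_margin - cl               # favorite's margin minus the closing line
    w = (d > 0).astype(float)         # 1 if the favorite covers (pushes count as 0 here)
    return d, w, np.column_stack([hf, vf, cl])


def fit_ols(y, X):
    """Least squares without an intercept, matching the form of equation 3.3."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


# Toy games for illustration only (not data from the sample).
home_spread = [-3.0, 7.0, -6.5, 3.0, -10.0, 2.5, 0.0]
home_margin = [5.0, -7.0, 8.5, -3.0, 12.0, -2.5, 4.0]

d, w, X = base_model_inputs(home_spread, home_margin)
a_hf, a_vf, beta_minus_1 = fit_ols(d, X)
print(a_hf, a_vf, beta_minus_1)
```

    In the toy data every home favorite beats the spread by two points and every visiting favorite lands exactly on it, so the whole effect loads on the home favorite coefficient.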

    3.3 Time variation

    The last aspect of valuation that we consider for the data is time variation. It is

    very important to account for time variation using subsamples of various lengths for

    3Indeed, Dare and Holland (2004) included this second model in their paper as well.

    at least three reasons. First, it is quite possible that market participants do not

    immediately process all of the information necessary for the formation of a rational

    price. In equities markets, prices are continuously evolving and changing which

    causes problems when judging how long investors need to properly account for new

    information. However, the NFL betting market is much more discrete. In this market

    almost all relevant information is released six days before final prices are set.4 If

    bettors can rationally process all of this information in that time period the closing

    point spread should be an unbiased measure including all available information. On

    the other hand, if bettors cannot react in six days and multiple betting periods pass

    before the new information is included into the point spread, inefficiencies can exist.

    We examine this rate of information processing by observing how many weeks are

    necessary for all prior information to be incorporated into the point spread.

    Second, investors may value certain characteristics more than others depend-

    ing on the time period; for example, small cap vs. large cap in the 1980s. In sports

    betting this phenomenon presents itself largely in bettors placing more emphasis on

    venue (home vs. away) early in the season and factoring in on-field performance

    increasingly as the season progresses. The reason for this is that teams go through many

    changes each off-season like new players, coaches and tactics and the true strength

    of a team is less apparent during the early stages of the season when all of these

    changes are still taking effect. We analyze whether bettors shift their perceptions of

    the value of certain information over time.

    Lastly, as in equities markets, where price momentum and reversals can

    be very important to investors, there has been some suggestion that streaks are important

    to bettors as well (Camerer, 1989). Streaks of this nature involve the mis-valuation of

    recent wins and losses or other factors like recent offensive or defensive performance.

    While it is unclear if perceived streaks do exist, the value they would have to market

    participants depends on the size and length of the streaks. Any shift from rational

    4Other information relating to the outcome such as injuries to players, suspensions, etc. that might happen mid-week is very limited and we will assume for the sake of simplicity that it has no overall effect on the process we are examining here.

    values has to be first identified before it can be exploited by an investor and must

    be a great enough mis-pricing to recoup transaction costs for the investor. We will

    test for a range of time periods because the length of these streaks is unknown before

    they occur.

    Chapter 4

    Results

    4.1 Statistical analysis

    The crucial aspect of our analysis is the difference between point spreads and actual

    outcomes. We assume the distribution of these differences is nonnormal so we use a

    Wilcoxon signed rank test to test their significance. The summary of this statistical

    analysis can be found in Table 4.1. Overall, home teams are predicted to win by an

    average of 2.64 points and actually win by 2.24 points with this difference falling short

    of showing significance. This table is in stark contrast to the findings of Borghesi

    (2007). In that study he found statistically significant differences as the season wore

    on with weeks 15, 16, and 17 significant at the 10%, 5%, and 2% level, respectively.

    Our data shows only two weeks, four and ten, where the data would suggest a mis-

    pricing exists. In the later weeks our more recent data shows mostly a negative

    median difference with a median of -0.25 for weeks 14-17, but it does not come close to significantly differing from zero. In fact, the mean and the median have different

    signs, indicating that the true average of the data is around zero.
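
    The signed rank test applied to the differences between spread and outcome can be sketched as follows. This is a minimal illustration using the large-sample normal approximation without tie or continuity corrections, which is an assumption; the exact variant used to produce the reported p-values may differ. The function name `signed_rank_test` and the example differences are hypothetical.

```python
import math


def signed_rank_test(diffs):
    """Two-sided Wilcoxon signed rank test using the normal approximation.

    Returns (W+, z, p-value). Zero differences are dropped and tied
    absolute differences receive the average of their ranks.
    """
    d = [x for x in diffs if x != 0]
    n = len(d)
    if n == 0:
        raise ValueError("no nonzero differences")

    order = sorted(range(n), key=lambda i: abs(d[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1      # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    w_plus = sum(r for r, x in zip(ranks, d) if x > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return w_plus, z, p_value


# Hypothetical spread-minus-outcome differences for one week of games.
example_diffs = [4.5, -3.0, 10.0, -1.5, 7.0, 0.0, -6.5, 2.0]
print(signed_rank_test(example_diffs))
```

    With 50 to 64 games in a typical week the normal approximation should be reasonable, though an exact test would be preferable for small subgroups such as playoff home underdogs.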

    Our data shows the greatest indication of mis-pricing in the early weeks, the

    opposite of Borghesi's findings. Week 4 has a median difference of 6.50 points, a

    number significant at the 2% level; week 10 has a median difference of -5.00 points,

    a number significant at the 5% level. These two weeks have very little in common

    however, and do not appear to be representative of a large mis-pricing given that the

    differences not only are 6 weeks apart, but also have differing signs. In this data,

    as the season progresses the market seems to get increasingly accurate. This could

    be a symptom of the time variation idea that a team's performance is uncertain at

    the beginning of the season due to the large number of changes that occur in the

    off-season. As the season progresses and a team's true ability is revealed the market

    becomes more accurate at pricing the assets.

    Table 4.1: NFL home team summary statistics by week

    Week         Games   Mean PS   Mean outcome   Median outcome   Mean difference   Median difference   p-value
    1               64     -2.39           0.23            -2.00             -2.63               -2.25     0.133
    2               63     -2.81          -1.90            -3.00             -0.90               -1.00     0.632
    3               62     -2.87          -2.18            -3.00             -0.69                0.00     0.795
    4               55     -1.75          -5.98            -7.00              4.24                6.50     0.013
    5               56     -3.10          -4.86            -3.00              1.76                0.75     0.428
    6               54     -2.80          -4.06            -2.50              1.26                0.25     0.856
    7               53     -0.53           0.00            -3.00             -0.53                0.00     0.833
    8               52     -4.63          -2.65            -4.00             -1.97               -3.50     0.247
    9               55     -2.75          -1.47            -3.00             -1.27               -1.00     0.407
    10              59     -2.95          -0.27             2.00             -2.68               -5.00     0.043
    11              64     -1.91          -0.19            -2.50             -1.73               -1.25     0.333
    12              64     -3.44          -2.81            -3.00             -0.63               -1.00     0.766
    13              64     -1.22           0.27             0.00             -1.48                0.75     0.579
    14              64     -2.48          -5.50            -6.50              3.02                4.50     0.093
    15              64     -2.21          -1.22            -3.00             -0.99               -1.00     0.530
    16              64     -2.82          -0.81            -0.00             -2.01               -2.00     0.259
    17              64     -2.99          -3.86            -3.50              0.87                0.50     0.734
    Weeks 1-13     765     -2.54          -1.92            -3.00             -0.62               -0.50     0.148
    Weeks 14-17    256     -2.63          -2.85            -3.00              0.22               -0.25     0.927
    Playoffs        40     -4.78          -4.43            -3.50             -0.35               -1.00     0.742
    All games     1061     -2.64          -2.24            -3.00             -0.41               -0.50     0.216

    Notes: This table contains summary statistics for all NFL home games played between 2006 and 2009. Games that have no home team (Super Bowls) are omitted. Week refers to regular season games only. Playoff games are summarized separately near the bottom of the table. Mean PS is the average value of the closing line (point spread) relative to the home team (negative indicates that the home team is the favorite). Mean (median) outcome is the mean (median) difference in points scored between the away team and the home team (negative indicates that the home team wins). Mean (median) difference is the mean (median) difference between the closing spread and the actual outcome (negative indicates that the home team covers). p-value indicates the likelihood that median difference is significantly different from zero using a signed rank test.

    The results for the subgroup of home underdogs appear in Table 4.2. Within

    this subgroup, teams are predicted to lose by an average of 5.11 points and actually

    lose by 5.54 points. There are no weeks with a statistically significant difference from

    zero and only two weeks, four and twelve, even approach a 10% significance level.

    Again, this differs greatly from the findings of Borghesi (2007). In his research, he

    finds a large mis-pricing of home underdog games with statistically significant values

    for all games, playoffs, the aggregate late season weeks, and individual weeks 15, 16,

    and 17. In particular, he finds that home underdogs in the playoffs not only lose

    by less than the spread, but win outright by almost nine points.1 That is a very

    striking contrast with our data, and according to this simple analysis the mis-pricing

    Borghesi found has been corrected by the market. We will engage in a more focused

    analysis of this later, but this data does not show that bettors are unable to rationally

    value late season data. It would appear, in fact, that they process information less well towards

    the beginning of the season, when the true skill of a team is still relatively unknown

    (weeks 1-13 have a lower p-value than weeks 14-17 in both cases).

    In their research on the topic Gray and Gray (1997) found a home underdog

    bias present from 1976 to 1994; however, they found that this bias was getting smaller

    and smaller as time progressed. Our data indicates that the bias has hit a plateau

    and leveled off. Looking at Table 4.3, the home underdog is correctly priced in our

    sample with a bet winning around 50% of the time. For late season home underdogs

    however, the results are more interesting. Over the four seasons examined a bet

    on the home underdog won 54.03% of the time. This is high enough to suggest a

    strategy could be implemented to make money from this mis-pricing, though the lack

    of significance seen in table 4.1 and table 4.2 might weaken this ability. Mis-pricing

    within a season can occur for many reasons and can lead to irrational betting.
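
    Whether a 54.03% success rate is high enough to be profitable depends on the terms of the wager. The sketch below assumes the standard arrangement in which a bettor risks 11 units to win 10, an assumption not restated in this section, and ignores pushes. It also recomputes the pooled late season rate from the per-season rows of Table 4.3. The function names are illustrative.

```python
def breakeven_rate(risk=11.0, win=10.0):
    """Win probability at which a bet risking `risk` to win `win` breaks even."""
    return risk / (risk + win)


def expected_return(p_win, risk=11.0, win=10.0):
    """Expected profit per unit risked at win probability p_win, ignoring pushes."""
    return (p_win * win - (1.0 - p_win) * risk) / risk


# Late home underdog rows of Table 4.3: (games, win %) per season.
late_home_underdogs = {2006: (25, 48.00), 2007: (35, 57.14),
                       2008: (31, 58.06), 2009: (33, 51.52)}

games = sum(g for g, _ in late_home_underdogs.values())
pooled_rate = sum(g * pct for g, pct in late_home_underdogs.values()) / games

print("break-even win rate:", breakeven_rate())
print("pooled late season win rate (%):", pooled_rate)
print("expected return per unit risked:", expected_return(pooled_rate / 100))
```

    Under these assumed terms the pooled late season rate clears the break-even threshold, while the rate of roughly 50% for all home underdogs does not. The season-by-season rates vary widely, so the margin should be read with caution.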

    One potential source of persistent seasonal biases is bettors not properly ac-

    counting for factors which affect team performance from one point to another in a

    1This is a striking discovery that Borghesi found in his data. He claims that it is not driven by outliers; however, we have found no evidence of any mis-pricing in the post-season, let alone something this extreme. If it was not driven by outliers we would conclude it was simply an oddity that occurred over the time period he analyzed.

    Table 4.2: NFL home underdog summary statistics by week

    Week         Games   Mean PS   Mean outcome   Median outcome   Mean difference   Median difference   p-value
    1               22      4.11           9.18             8.00             -5.07               -5.25     0.131
    2               19      4.32           4.84             4.00             -0.53               -1.00     1.000
    3               17      4.88           6.71             6.00             -1.82               -2.50     0.421
    4               22      4.80           0.73            -2.50              4.07                6.00     0.108
    5               18      4.69           8.50             3.50             -3.81                0.25     0.586
    6               17      5.53           2.82            -1.00              2.71                5.00     0.344
    7               23      6.24           6.70            10.00             -0.46               -1.50     0.988
    8               10      4.20           9.30             9.50             -5.10               -6.50     0.221
    9               15      5.17           3.20             3.00              1.97                6.00     0.589
    10              16      5.44           5.44             6.00              0.00               -0.50     0.910
    11              23      5.67           9.30             4.00             -3.63               -1.00     0.254
    12              20      4.65           9.75            10.50             -5.10               -5.75     0.107
    13              25      5.50           6.08             3.00             -0.58                3.00     0.788
    14              26      4.67           4.08             6.00              0.60               -2.75     0.914
    15              25      6.08           2.88             3.00              3.20                4.50     0.284
    16              20      5.88           1.90             2.00              3.98                3.00     0.204
    17              21      4.88           5.38             6.00             -0.50               -3.00     0.575
    Weeks 1-13     247      5.06           6.35             4.00             -1.29               -0.50     0.273
    Weeks 14-17     92      5.36           3.58             4.50              1.79                1.00     0.291
    Playoffs         6      3.08           2.17            -2.00              0.92                4.50     0.826
    All games      345      5.11           5.54             4.00             -0.43                0.00     0.751

    Notes: This table summarizes all NFL games from 2006 to 2009 in which the home team is the underdog. Week refers to regular season games only. Playoff games are summarized separately near the bottom of the table. Mean CL is the average value of the closing line relative to the home team (negative indicates that the home team is the favorite). Mean (median) outcome is the mean (median) difference in points scored between the away team and the home team (negative indicates that the home team wins). Mean (median) difference is the mean (median) difference between the closing spread and the actual outcome (negative indicates that the home team covers). p-value indicates the likelihood that median difference is significantly different from zero using a signed rank test.

    Table 4.3: Persistence of biases in the NFL

                     Home underdog      Late home underdog
    Season          Games   Win (%)     Games   Win (%)
    2006               81     58.02        25     48.00
    2007               94     50.00        35     57.14
    2008               87     43.68        31     58.06
    2009               91     48.35        33     51.52
    All Seasons       353     49.86       124     54.03

    Notes: This table shows the success rate (omitting pushes) of betting on all home underdogs during the regular season and post-season. Win is the proportion of home underdogs that cover the spread. Home underdogs include all regular season and playoff games (except Super Bowls) in which the home team is assigned a positive point spread. Late home underdogs exclude games played before week 13.

    season. The most pertinent of these that differs from the beginning to the end of a season is the weather. People around the game (coaches, players, sports writers, etc.) consistently argue that when a team from a mild climate, such as San Diego, has to play late in the season in an open-air stadium in a harsh climate, such as Chicago, the mild-climate team is at a significant disadvantage. Teams based in harsh climates regularly practice and play in this weather, making them more adept at handling the adversity it presents. In an efficient market this climate factor should be fully reflected in the closing point spread. It is possible that late-season spreads do not account for this situation. If the late-season mis-pricing is a result of this, it would indicate that bettors are not properly factoring historical results into their analysis; if they were, they would account for the effect the weather can have late in the season, and the bias would be anticipated and incorporated into the prices.

    The relationship between weather and outcome in games played in week 15 and later is shown in Table 4.4, Panel A. This subsample does not show evidence of consistent mis-pricing in games involving visiting teams from mild climates traveling to play in cold climates.2 The mean closing line indicates that home teams are

    2 A list of teams with their location and climate information can be found in the appendix. Cold climate games are games that were played after week 14.


    Table 4.4: Weather effects

    Panel A: Cold weather advantage by season

    Season             Games  Mean PS  Mean outcome  Median outcome  Mean difference  Median difference  p-value
    2006                  14    -6.54         -4.00           -3.00            -2.54              -4.00    0.551
    2007                  12    -6.75         -7.50           -8.50             0.75               0.00    0.937
    2008                  10    -1.60        -12.70           -9.50            11.10               6.50    0.044
    2009                  14    -6.57         -4.57           -6.00            -2.00               2.00    0.834
    All Seasons           50    -5.61         -6.74           -7.00             1.13               1.25    0.528

    Panel B: Cold weather advantage by month

    Month(s)           Games  Mean PS  Mean outcome  Median outcome  Mean difference  Median difference  p-value
    September             51    -3.14         -4.82           -5.00             1.69               2.50    0.347
    October               48    -5.61         -8.35           -7.00             2.74               2.25    0.350
    November              71    -3.50         -2.07           -3.00            -1.43              -1.00    0.358
    December, January     61    -5.97         -7.70           -7.00             1.74               1.50    0.315

    Notes: Panel A summarizes all NFL games from 2006 to 2009 in which the home team has a cold weather advantage. This advantage is defined to occur when a visiting team is traveling from a mild climate to play in Buffalo, Chicago, Cincinnati, Cleveland, Denver, Green Bay, New England, New York, Philadelphia or Pittsburgh in week 15 or later. Mean PS is the average value of the closing line (point spread) relative to the home team (negative indicates that the home team is the favorite). Mean (median) outcome is the mean (median) difference in points scored between the away team and the home team (negative indicates that the home team wins). Mean (median) difference is the mean (median) difference between the closing spread and the actual outcome (negative indicates that the home team covers). p-value indicates the likelihood that the mean difference is significantly different from zero using a signed rank test.


    predicted to win by 5.61 points, but home teams win by 6.74 points in this sample. The 1.13-point difference is not significantly different from zero (p-value = 0.528), although it is possible that it would reach significance if the same closing line and outcome persisted over a larger sample. Panel B shows that there is no significant difference when the games are broken down over a broader, seasonal time period. The p-values in Panel B do not corroborate the previous findings of Borghesi, who found significance in September, November and December/January.3
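    The signed rank test reported in Table 4.4 can be sketched as follows. This is a minimal illustration, not the procedure actually used for the table: it assumes the large-sample normal approximation without tie or continuity corrections, and the function name and sample differences are hypothetical.

```python
import math

def signed_rank_p_value(differences):
    """Two-sided Wilcoxon signed-rank p-value (normal approximation).

    Assumptions of this sketch: zero differences are dropped, tied
    absolute values share their average rank, and no tie or continuity
    correction is applied to the variance.
    """
    d = [x for x in differences if x != 0]
    n = len(d)
    if n == 0:
        return 1.0
    # Rank the absolute differences, averaging ranks within ties.
    order = sorted(range(n), key=lambda i: abs(d[i]))
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        while end + 1 < n and abs(d[order[end + 1]]) == abs(d[order[start]]):
            end += 1
        avg_rank = (start + end) / 2 + 1  # ranks are 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = avg_rank
        start = end + 1
    # Sum of ranks belonging to positive differences.
    w_plus = sum(r for r, x in zip(ranks, d) if x > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical per-game values of (closing spread - actual outcome).
example_differences = [3.5, -7.0, 10.0, -1.5, 0.5, 6.0, -2.0, 4.0]
example_p = signed_rank_p_value(example_differences)
```

    With samples as small as the 10 to 14 games per season in Panel A, an exact version of the test would likely be preferable to this approximation.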

    Another possible explanation for the high late-season winning percentage of home underdogs is behavioral. In the NFL point-spread betting market, as in almost all betting markets, the bettor has a negative expected value due to the vigorish paid to a bookie. As a season progresses and a bettor begins to lose money, they may abandon the system that caused those losses. Bettors hope that a change in strategy will reverse their misfortunes, which allows them to justify the shift from rational bets to irrational bets. This phenomenon has also been identified in horse racetrack betting (Rachlin, 1990; Ritter, 1994). Rachlin argues that the more bettors lose, the higher their risk tolerance becomes, which results in more bets on long shots. Prior losses cause bettors to end up over-betting on outcomes that are less likely to occur. In other systems of betting a bettor would receive better odds for these bets, but in the point-spread market this behavior results in irrational movements of the spread. This is a theory that we cannot test with our data. Furthermore, it is a behavior that is likely to manifest itself randomly during the sample, based on the disposition of each bettor, and thus would not be directly related to the venue of a game, the variable we are working to explain.

    While we can assume that bettors do act irrationally on occasion, that fact does not explain why late-season home underdogs win at a higher rate. The fact that the winning percentage is around 50% for the entire season indicates

    3 Borghesi found the reverse of the cold weather factor in September. That is, the cold climate teams were at a disadvantage when playing a mild climate team at home early in the season.


    Table 4.5: Nevada football betting

    Month              CFB games  NFL games  Total games  Total bet/game
    September                964        208         1172      $6,976,349
    October                  925        232         1157      $8,745,662
    November                 715        260          975      $9,145,005
    December                 148        256          404     $22,224,760
    January, February         56        104          160    $190,447,832

    Notes: This table shows the number of football games and bet volume by month from September 2006 to February 2010. The number of games is defined as the number of college football (CFB) and National Football League games listed in Las Vegas. Because Nevada pools its CFB and NFL betting data for accounting purposes, the value of football bets cannot be broken down into college vs. professional games.

    that there is a group of informed bettors pushing the betting line to a rational spot. Therefore, there is something about the end of the season that draws out more irrational, uninformed bettors who influence the market in a way that can potentially be exploited.

    As the season nears completion, media interest in the NFL increases. The final weeks of the regular season fall after the conclusion of the baseball World Series and during the early stages of the NBA season, so most media outlets cover the NFL and college football at great length over this time. As Table 4.5 shows, the amount bet on football each month increases consistently toward the end of the season, culminating with the playoffs and Super Bowl in January and February.4 Since the informed bettors have limited wealth, it is reasonable to assume that the new entrants who come in late in the season drown out the effect of the rational bettors, causing arbitrage opportunities such as the home underdog effect we have identified.

    To investigate this further, we develop four simple betting strategies and test their effectiveness over the course of our sample. The results can be seen in Table 4.6. The first two columns show the results of two common betting strategies.5 While neither bet on all home teams nor bet on all home underdogs wins often enough to cover the vigorish, bet on all home underdogs, when used over the last four weeks of the regular season and the playoffs, yields a success rate of 56.52% and

    4 The data in Table 4.5 have been taken from the Nevada Gaming Control Board, which keeps records dating back to 1998. It does not differentiate between college and professional football.

    5 Amoako-Adu et al. (1985) and Vergin and Scriabin (1978) show that simple strategies can be effective.


    Table 4.6: Success rates of simple betting rules in the NFL

    Strategy     Home                Home underdogs      2+ Home underdogs   8+ Home underdogs
    Week            N  Accuracy (%)     N  Accuracy (%)     N  Accuracy (%)     N  Accuracy (%)
    1              63         44.44    22         40.91    22         40.91     1        100.00
    2              60         45.00    20         45.00    16         56.25     2          0.00
    3              59         49.15    17         47.06    17         47.06     0           N/A
    4              55         67.27    22         45.00    20         70.00     6         16.67
    5              54         55.56    18         47.06    16         56.25     3         66.67
    6              53         50.94    17         63.64    16         62.50     3         66.67
    7              51         49.02    24         50.00    22         45.45     7         14.29
    8              52         40.38    11         27.27     9         22.22     1        100.00
    9              55         47.27    16         68.75    15         66.67     2        100.00
    10             57         36.84    16         43.75    14         42.86     4         50.00
    11             62         41.94    23         34.78    21         33.33     7         71.43
    12             62         45.16    23         34.78    20         25.00     3          0.00
    13             64         53.13    26         61.54    20         65.00     4         75.00
    14             63         55.56    26         46.15    22         45.45     6         66.67
    15             62         45.16    25         60.00    25         60.00     7        100.00
    16             62         40.32    20         65.00    20         65.00     4         75.00
    17             62         53.23    21         57.14    20         60.00     4         75.00
    Weeks 1-13    747         48.06   255         49.02   228         49.12    43         46.51
    Weeks 14-17   249         48.59    92         56.52    87         57.47    21         80.95
    Playoffs       40         47.50     6         66.67     6         66.67     0           N/A
    All games    1036         48.17   353         51.27   321         51.71    64         57.81

    Notes: This table shows the success rate of four simple betting rules for NFL games from 2006 to 2009. N is the number of games in which the simple rule criteria are met. Accuracy is the success rate of each strategy. Games resulting in a push are excluded.


  • 4.2. REGRESSION ANALYSIS CHAPTER 4. RESULTS

    66.67%, respectively. This is more than enough to cover the vigorish and any other transaction costs. The final columns of Table 4.6 show that bets on moderate and extreme underdogs are even more accurate.6 While only bet on 8+ home underdogs wins often enough over the whole sample to make up for the vigorish, both bet on 2+ home underdogs and bet on 8+ home underdogs work very well over the late season and into the playoffs. In fact, in the small sample of 21 late-season games, bet on 8+ home underdogs won almost 81% of the time. These findings show that the most profitable betting strategies rely not only on the venue and spread but also on the precise timing of the bet.
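    Whether a success rate covers the vigorish depends on how the bet is priced. The sketch below assumes standard 11-to-10 pricing (risk 11 units to win 10), which this section does not state explicitly, and computes the implied break-even rate together with an exact binomial tail probability. The win counts (52 of 92 and 17 of 21) are inferred from the N and accuracy figures for weeks 14-17 in Table 4.6; the function names are illustrative.

```python
from math import comb

def break_even_rate(risk=11.0, win=10.0):
    """Win probability at which a bet risking `risk` to win `win` breaks even."""
    return risk / (risk + win)

def prob_at_least(wins, bets, p):
    """Exact binomial probability of `wins` or more successes in `bets` trials."""
    return sum(
        comb(bets, k) * p**k * (1 - p) ** (bets - k)
        for k in range(wins, bets + 1)
    )

p0 = break_even_rate()  # 11/21 under the assumed pricing

# Chance of doing at least this well if the true win rate were only break-even.
late_home_dogs = prob_at_least(52, 92, p0)  # all home underdogs, weeks 14-17
late_big_dogs = prob_at_least(17, 21, p0)   # 8+ home underdogs, weeks 14-17
```

    A small tail probability suggests a result that would be hard to produce by break-even betting alone. The sketch treats bets as independent and ignores the fact that several rules were tried on the same sample, so it should be read as a rough check rather than a formal test.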

    4.2 Regression Analysis

    In the previous section we demonstrated that in certain instances bettors systematically misvalue bets. In this section we present the results of a regression-based betting system. These models are designed to take advantage of any mis-pricings that exist in the available bets. Initially, we examine in-sample predictability and compare the predictive accuracy of binary models versus OLS models. Then, we augment these models with further effects, including momentum.

    Since the classification as either favorite or underdog is not independent of home or visitor status, we start by examining the results of the Binary Base Model to clarify the conditional variables. We estimate equation 3.4 using a pooled regression and find the coefficients (p-values) for the home favorite, visiting favorite, and closing line to be 0.4721 (0.0000), 0.5025 (0.0000), and -0.0025 (0.5543), respectively; the HF and VF terms are both significant. Since both coefficients are positive, this says that favorites, both home and away, are more likely to cover the spread. Since the coefficient on VF is greater than the coefficient on HF, this regression indicates that it is more likely for a visiting favorite to cover

    6 Moderate and extreme spreads are defined as two points or greater and eight points or greater, respectively. While bet on all home teams and bet on all home underdogs are two commonly used simple betting strategies, bets on moderate and extreme home underdogs are not. However, Borghesi (2007) discovered that as the spread increases, bets on home underdogs are more likely to cover.


    than a home favorite. A pooled regression of the late-season games qualifies these results. In this case the coefficients are 0.4621 (0.0000), 0.45885 (0.0000), and -0.0018 (0.8136). The highly significant, positive coefficients on both dummy variables again indicate that favorites have a higher chance of covering, regardless of venue. However, there is a switch: the coefficient on HF is now greater than the coefficient on VF. This indicates that late in the season a home favorite is more likely to cover than a visiting favorite.7

    To check for the climate's role in mis-pricing, Borghesi developed the model

    $$ y_i = \alpha_{HFM}\, HF_i M_i + \alpha_{VFM}\, VF_i M_i + (\beta - 1)\, CL_i + \alpha_{HFC}\, HF_i C_i + \alpha_{VFC}\, VF_i C_i \qquad (4.1) $$

    where $y_i$ is the Base Model's dependent variable; $M_i = 1$ if the game is played in a moderate climate, or if the game is played in a cold climate and the visiting team is from a cold climate, and $M_i = 0$ otherwise; and $C_i = 1$ if the game is played in a cold climate and the visiting team is from a moderate climate, and $C_i = 0$ otherwise.8 Results (not shown) indicate that the impact of weather on late-season games is not fully reflected in the closing line. The parameter for $HF_i C_i$ is not significant, while the coefficient for $VF_i C_i$ is negative and marginally significant (coefficient = -0.0663, p-value = 0.0610). This means that weather is not correctly factored into the price when visiting favorites from a mild climate play on the road in a harsh climate. Therefore, home underdogs with a climate advantage are undervalued. While this agrees with Borghesi (2007), the coefficient is very small, almost one-sixth of what he found, and at a 5% level of significance it is not significant at all. By no means does this unequivocally establish that there is a mis-pricing.
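    The climate indicators defined for equation 4.1 can be illustrated with a short sketch. It assumes each team is classified as either cold or moderate (as in the appendix list referred to earlier); the function and key names are hypothetical and chosen only for this example.

```python
def climate_dummies(home_is_cold, visitor_is_cold):
    """Return (M, C) for one game, following the definitions in the text.

    C = 1 only when a cold-climate home team hosts a moderate-climate
    visitor; every other combination falls under M = 1.
    """
    c = 1 if (home_is_cold and not visitor_is_cold) else 0
    m = 1 - c
    return m, c

def climate_regressors(hf, vf, cl, home_is_cold, visitor_is_cold):
    """Build the five regressors of the climate model for one game.

    hf and vf are the favorite dummies and cl is the absolute closing line.
    """
    m, c = climate_dummies(home_is_cold, visitor_is_cold)
    return {
        "HF_M": hf * m,
        "VF_M": vf * m,
        "CL": cl,
        "HF_C": hf * c,
        "VF_C": vf * c,
    }

# A visiting favorite from a moderate climate playing in a cold-climate city.
example = climate_regressors(hf=0, vf=1, cl=3.5, home_is_cold=True, visitor_is_cold=False)
```

    Because M and C are mutually exclusive and exhaustive under this two-way classification, each game activates exactly one of the four favorite-by-climate terms.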

    7 Both of these results contrast with the findings of Borghesi (2007). He found only VF late in the season to be significant. This coefficient was negative, indicating that home underdogs are more likely to cover the spread late in the season.

    8 Though our previous analysis showed few significant p-values for weather influence, it is important to confirm that this is because the effect does not exist and not because of the somewhat small sample size.


  • 4.3. IN-SAMPLE PREDICTABILITY CHAPTER 4. RESULTS

    Table 4.7: NFL in-sample predictability

                                    Base model
    Week            N  OLS accuracy (%)  Binary accuracy (%)
    1              64             50.00                53.13
    2              62             46.77                56.45
    3              62             48.39                50.00
    4              55             54.55                30.91
    5              56             50.00                46.43
    6              53             41.51                52.83
    7              53             49.06                52.83
    8              51             52.94                54.90
    9              54             40.74                70.37
    10             59             37.29                57.63
    11             62             53.23                67.74
    12             62             59.68                59.68
    13             64             45.31                53.13
    14             64             62.50                45.31
    15             64             32.81                59.38
    16             64             28.13                65.63
    17             58             63.79                51.72
    Weeks 1-13    757             48.48                54.43
    Weeks 14-17   250             46.40                55.60
    Playoffs       30             56.67                70.00
    All Games    1037             48.22                55.16

    Notes: This table shows the success rate of two in-sample regression models used to predict outcomes of NFL games from 2006 to 2009. Accuracy is the proportion of outcomes that are correctly predicted in-sample. The Base Model is $y_i = \alpha_{HF} HF_i + \alpha_{VF} VF_i + (\beta - 1) CL_i + \varepsilon_i$, where $HF_i = 1$ if the home team is the favorite and $HF_i = 0$ otherwise, $VF_i = 1$ if the visiting team is the favorite and $VF_i = 0$ otherwise, and $CL_i$ is the absolute value of the closing point spread. In the OLS model $y_i$ is the outcome (the difference in points scored between the favorite and the underdog) minus the closing point spread. In the binary model $y_i = 1$ if the favorite covers the spread and $y_i = 0$ otherwise. The estimators are calculated each season and used to predict outcomes in that same season. Games with a point spread of zero (pick 'em games) have been omitted from the sample.
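    The variable definitions in the notes to Table 4.7 can be made concrete with a small encoding sketch. It assumes the closing line is quoted relative to the home team, with a negative value meaning the home team is favored (the convention used in Table 4.4). The function name and dictionary keys are illustrative, and counting a push as a non-cover in the binary variable is an assumption of this sketch.

```python
def encode_game(closing_line_home, home_points, away_points):
    """Encode one game into Base Model variables.

    Returns None for pick 'em games (spread of zero), which the notes
    say are omitted from the sample.
    """
    if closing_line_home == 0:
        return None
    hf = 1 if closing_line_home < 0 else 0  # home team is the favorite
    vf = 1 - hf                             # visiting team is the favorite
    cl = abs(closing_line_home)             # absolute closing point spread
    if hf:
        favorite_margin = home_points - away_points
    else:
        favorite_margin = away_points - home_points
    return {
        "HF": hf,
        "VF": vf,
        "CL": cl,
        # OLS dependent variable: outcome minus closing point spread.
        "y_ols": favorite_margin - cl,
        # Binary dependent variable: 1 if the favorite covers the spread.
        "y_binary": 1 if favorite_margin > cl else 0,
    }

home_favorite = encode_game(-7, 24, 10)  # home favored by 7, wins by 14
road_favorite = encode_game(3, 20, 21)   # visitor favored by 3, wins by 1
```

    The two dependent variables differ only in how much of the margin they keep: the OLS version retains the size of the cover, while the binary version retains only its sign.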

    4.3 In-sample predictability

    To better quantify the value we can derive from imperfect information processing, we first design a series of models to predict within our sample. Using all of the outcomes in a season, we estimate the parameters of a model that predicts the outcomes within that same season. The success of these models is shown in Table 4.7. Columns one and two show that the overall accuracy of the Base OLS Model and the Base Binary Model is 48.22% and 55.16%, respectively. This means that if a gambler ex ante had perfect information about the upcoming season he could devise an objective strategy to win bets at a


  • 4.4. OUT-OF-SAMPLE PREDICTABILITY CHAPTER 4. RESULTS

    rate of 55.16%.

    It is also worth noting the difference in accuracy between the OLS and Binary models. The Binary model is far more accurate in predicting in-sample outcomes.9 This lends credence to the belief that bettors are unconcerned with the magnitude of a won or lost bet and care only whether they win or lose it. For instance, a bettor does not care whether a favorite they bet on covers by 1 point or 10 points; they win their bet either way. The superiority of the binary model is consistent with this.

    4.4 Out-of-sample predictability

    The idea that a gambler would have perfect information is grossly flawed, however. If a gambler did have perfect information before the season started, he could have a winning percentage of 100%, not just 55.16%. Many studies have found an objective betting method that succeeds against the vigorish in-sample (Zuber et al. (1985), for instance), but few have been able to show that this advantage holds outside of the sample of data the model is based on.

    To isolate and identify the length and severity of biases, we establish two variants of the Base Binary Model first proposed by Borghesi (2007).10 The first is the 1-Month Base Binary Model, in which we regress the previous four weeks of data to predict the next four weeks of results. Estimators for the first few weeks of each season are derived from the final weeks of the previous season, except for the first weeks of our sample, weeks 1-4 in 2006, which are omitted. The second is the 1-Year Base Binary Model, in which we regress the previous season's data to predict the following season's results. The first season in our sample, 2006, is omitted from this model. The first model is used to identify and exploit short-term biases, while the second is designed to identify and exploit long-term biases.
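    The two out-of-sample schemes amount to walk-forward splits in which each block of games is predicted using only the block before it. The sketch below shows one way such splits could be generated. The exact block boundaries used in this paper (for example, how week 17 and the playoffs are grouped) are not specified here, so the fixed four-period blocks and the function names are assumptions made for illustration.

```python
def one_month_splits(periods, block=4):
    """Yield (train, test) pairs of consecutive blocks of `block` periods.

    `periods` is an ordered list of game weeks, e.g. (season, week) tuples.
    The first block is only ever used for estimation, never for testing.
    """
    blocks = [periods[i:i + block] for i in range(0, len(periods), block)]
    for previous, following in zip(blocks, blocks[1:]):
        yield previous, following

def one_year_splits(seasons):
    """Pair each season with the season that follows it."""
    return list(zip(seasons, seasons[1:]))

# Hypothetical calendar: 17 regular-season weeks in each of two seasons.
calendar = [(season, week) for season in (2006, 2007) for week in range(1, 18)]
monthly = list(one_month_splits(calendar))
yearly = one_year_splits([2006, 2007, 2008, 2009])
```

    Under this layout the opening block of the sample has no earlier data to estimate from, which mirrors the omission of weeks 1-4 of 2006 from the 1-month model and of the 2006 season from the 1-year model.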

    Table 4.8 shows the results of these models. Neither model achieves the necessary 52.40% accuracy that has been proposed as the break-even point in previous

    9 Conducting a similar study, Borghesi (2007) found increased accuracy in the binary model as well; however, his OLS model was accurate to 52.98%, a high enough accuracy to beat the vigorish.

    10 We stop using the OLS model because we have shown the binary model to be a much more accurate predictor.


    Table 4.8: NFL out-of-sample predictability using the base binary model

                 Base binary model variant
                 1-month             1-year
    Week            N  Accuracy (%)     N  Accuracy (%)
    1              48         54.17    48         58.33
    2              46         50.00    46         54.35
    3              46         54.35    46         65.22
    4              41         43.90    41         48.78
    5              56         42.86    42         47.62
    6              53         45.28    39         53.85
    7              53         56.00    40         55.00
    8              51         47.06    39         46.15
    9              54         53.70    41         53.66
    10             59         45.76    44         52.27
    11             62         58.06    46         54.35
    12             62         46.77    47         36.17
    13             64         64.06    48         45.83
    14             64         46.88    48         45.83
    15             64         54.69    48         52.08
    16             64         60.94    48         43.75
    17             58         51.72    41         46.34
    Weeks 1-13    695         51.22   567         51.68
    Weeks 14-17   250         53.60   185         47.03
    Playoffs       40         52.50    30         70.00
    All Games     985         51.88   782         50.51

    Notes: This table shows the success rate of each time variant of the Base Binary Model at predicting outcomes of NFL games from 2006 to 2009. Accuracy is the proportion of outcomes that are correctly predicted out-of-sample. The Base Model is $W_i = \alpha_{HF} HF_i + \alpha_{VF} VF_i + (\beta - 1) CL_i + \varepsilon_i$, where $W_i = 1$ if the favorite covers the spread and $W_i = 0$ otherwise, $HF_i = 1$ if the home team is the favorite and $HF_i = 0$ otherwise, $VF_i = 1$ if the visiting team is the favorite and $VF_i = 0$ otherwise, and $CL_i$ is the absolute value of the closing point spread. The 1-month variant estimates parameters in four-week blocks and predicts the following four weeks' outcomes. The 1-year variant estimates parameters over a 1-year period and predicts the following season's outcome