Quantifying the Volatility of Starting Pitchers Bill Petti SABR Analytics Conference March 2014 Phoenix, AZ (Sort of)


Page 1: Quantifying the Volatility of Starting Pitchers

Quantifying the Volatility of Starting Pitchers
(Sort of)

Bill Petti
SABR Analytics Conference
March 2014
Phoenix, AZ

Page 2: Quantifying the Volatility of Starting Pitchers

Let’s Make This a Little Interactive: Presentation Cliché Game

Page 3: Quantifying the Volatility of Starting Pitchers


See How Many You Hear/See Today

“Needed to start somewhere”

“Not sure what to make of the results”

“Take the results with a grain of salt”

“Results directional, but not definitive”

“More questions than answers”

“Lots of work to be done”

Page 4: Quantifying the Volatility of Starting Pitchers


Motivating Questions

Are there differences in how volatile starting pitchers are over the course of a season?

Are certain types of pitchers more volatile than others?

Page 5: Quantifying the Volatility of Starting Pitchers

Why Study Volatility in Baseball?

• It’s my unicorn; what’s a unicorn?
  “Fabled creature? You know, the horse with the horn? Impossible to capture?”*
• My first published baseball research focused on David Wright and whether he was volatile**
• Basically, I haven’t been able to let it go

*Gone in 60 Seconds
**http://www.beyondtheboxscore.com/2011/1/4/1908646/player-volatility-the-case-of-david-wright

Page 6: Quantifying the Volatility of Starting Pitchers

Better Reason

• We know less about Volatility than other subjects, e.g. aging
• There is some evidence that Volatility in run scoring and run prevention matters for teams
  – How teams distribute their runs can impact their expected win percentage
  – Sal Baxamusa* showed that the increase in win probability becomes more marginal as teams score more than 5 runs

*The Hardball Times, 2007, http://www.hardballtimes.com/consistency-is-key/

Page 7: Quantifying the Volatility of Starting Pitchers

Why Study Volatility in Baseball?

• Some evidence that Volatility at the team level helps teams beat their Pythagorean Expectation*
  – Run Scoring (RS) and Runs Allowed (RA) Volatility were both negatively correlated to total wins
  – However, RS and RA Volatility were positively correlated to wins above expectation

*FanGraphs, 2012, http://www.fangraphs.com/blogs/does-consistent-play-help-a-team-win/

Page 8: Quantifying the Volatility of Starting Pitchers

Volatility is not the same as Streakiness

Volatility is about the overall distribution of a player’s daily performance relative to their average (i.e. central tendency)

Streakiness is about how extreme positive and negative performances lump together over the course of a season

[Figure: two charts, each with a marker labeled “Average wOBA”. One has a horizontal axis running from 5 to 150; the other is a percentage distribution (1%, 1%, 2%, 7%, 9%, 30%, 20%, 11%, 8%, 8%, 1%, 1%, 1%, 0%).]

Page 9: Quantifying the Volatility of Starting Pitchers

Volatility and Hitters

• Developed a metric for quantifying the volatility of hitters (VOL) and examined what types of hitters may be more prone to inconsistency*

VOL = STD(daily_wOBA) / Yearly_wOBA^.52, where:

VOL = seasonal volatility
STD(daily_wOBA) = the standard deviation of a player’s daily batting performance, measured by wOBA
Yearly_wOBA^.52 = a player’s seasonal wOBA, raised to the .52 power

Only games where a player had three or more plate appearances are included

*The Hardball Times, 2014, http://www.hardballtimes.com/what-kind-of-hitters-are-volatile/
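The hitter VOL formula on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions: the `(plate_appearances, game_wOBA)` input layout and the names `hitter_vol`, `MIN_PA` and `EXPONENT` are hypothetical, the sample games are made up, and the slide does not say whether a sample or population standard deviation was used (the sample form is assumed here).

```python
from statistics import stdev

MIN_PA = 3        # slide: only games with three or more plate appearances count
EXPONENT = 0.52   # slide: seasonal wOBA is raised to the .52 power


def hitter_vol(games, yearly_woba):
    """Return VOL = STD(daily_wOBA) / Yearly_wOBA ** .52.

    games: iterable of (plate_appearances, game_wOBA) pairs (hypothetical layout).
    """
    daily = [woba for pa, woba in games if pa >= MIN_PA]
    if len(daily) < 2:
        raise ValueError("need at least two qualifying games")
    # Assumption: sample standard deviation; the slide does not say which form.
    return stdev(daily) / yearly_woba ** EXPONENT


# Made-up games: the 2-PA game is dropped before the deviation is taken.
games = [(4, 0.250), (5, 0.450), (2, 0.900), (4, 0.350)]
vol = hitter_vol(games, yearly_woba=0.340)
```

Dividing by the seasonal wOBA raised to a power below 1 scales the deviation by less than a plain coefficient of variation would.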

Page 10: Quantifying the Volatility of Starting Pitchers

Volatility and Hitters (cont.)

• This method ensured that VOL was not biased in favor of inferior players and not simply a function of players with high PA/G
• VOL has a year-to-year correlation of .4 (n=435)
  – Some evidence it’s a repeatable skill, but one that fluctuates much like BABIP or batting average
• High VOL hitters tended to be high strikeout, fly ball slugging hitters, while low VOL hitters tended to be ground ball, high contact, high on-base hitters
• Some evidence that hitters might be “structurally volatile”, but not all performance explained by this
  – Phrase borrowed from Matt Swartz and his work on LHHs and clutch performance

Page 11: Quantifying the Volatility of Starting Pitchers


Volatility and Hitters (cont.)

Page 12: Quantifying the Volatility of Starting Pitchers

What About Pitchers?

• There have been some attempts to quantify consistency in pitchers
  – David Gassko using Quality Starts as a proxy for consistency*
    • Controlling for ERA, pitchers don’t retain their consistency, year-to-year
    • But an inconsistent pitcher is preferable to a consistent pitcher of similar talent
  – Eric Seidman using the Flake statistic at Baseball Prospectus**
  – I briefly looked at a pitcher version of volatility, consistent with Flake***
• Pitching creates unique challenges to this type of metric

*The Hardball Times, 2006, http://www.hardballtimes.com/what-kind-of-hitters-are-volatile/
**2009, http://www.baseballprospectus.com/article.php?articleid=8579
***Beyond the Box Score, 2011, http://www.beyondtheboxscore.com/2011/9/8/2404007/pitcher-volatility-part-i

Page 13: Quantifying the Volatility of Starting Pitchers

What About Pitchers?

• First, hitters generate a larger number of observations for study over the course of a season
  – Roughly 5x as many observations as starting pitchers
• Second, this makes outliers much more of a problem for pitchers
• Third, managers create the biggest problem, since starters don’t control when they will exit a game
  – Tends to accentuate the outlier issue

Page 14: Quantifying the Volatility of Starting Pitchers

Decisions, decisions, decisions…

• Could continue to use a standard deviation-based metric
  – But the distribution of game performance is not quite normal, and outliers can play havoc with individual scores
• Another option is interquartile range (IQR), adjusted for median (i.e. quartile coefficient of dispersion or IQR CoD)
  – Similar to hitter VOL, which uses coefficient of variation
  – Still not perfect, but IQR CoD is a robust measure that handles outliers better

[Figure: Game RA9: 2009-2013. Horizontal axis: game RA9 from 0 to 30; vertical axis: 0% to 20%.]

Page 15: Quantifying the Volatility of Starting Pitchers

Decided to use IQR CoD

• E.g. Buzz Capra, 1974
• 27 starts in 1974 with an ERA- of 59
• 2.80 RA9 in those starts, but a few key outliers
  – 22.5 and 405.0, both outings lasted less than 2 IP
• Using the IQR CoD method does mitigate the impact of outliers

[Figure: VOL relative to League Average (vertical axis 0% to 300%)]
VOL using Standard Deviation: 271%
VOL using Coefficient of Dispersion (IQR CoD): 143%
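The sketch below illustrates why a quartile-based measure reacts less to one disastrous outing than a standard deviation-based one. It is not a reconstruction of the Capra numbers: the game RA9 values are made up and the function names are hypothetical. It uses a plain coefficient of variation as a stand-in for the standard deviation-based VOL, and "inclusive" quartiles as an assumed quartile method. `iqr_cod` follows the slides' definition (half the IQR divided by the median), which differs from the textbook quartile coefficient of dispersion, (Q3 - Q1) / (Q3 + Q1).

```python
from statistics import mean, median, quantiles, stdev


def sd_vol(values):
    """Standard deviation scaled by the mean (coefficient of variation)."""
    return stdev(values) / mean(values)


def iqr_cod(values):
    """Half the interquartile range scaled by the median, as on the slides."""
    # Assumption: "inclusive" quartiles; the slides do not name a quartile method.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return ((q3 - q1) / 2) / median(values)


# Made-up game RA9 values, then the same starts plus one 405.0 outing.
typical = [1.5, 3.0, 0.0, 4.5, 2.25, 3.0, 1.0, 6.0, 2.0, 3.0]
with_blowup = typical + [405.0]

sd_growth = sd_vol(with_blowup) / sd_vol(typical)     # several-fold increase
cod_growth = iqr_cod(with_blowup) / iqr_cod(typical)  # modest increase
```

With these made-up inputs the standard deviation-based figure grows several times over while the quartile-based figure moves only modestly, which is the behavior the slide describes.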

Page 16: Quantifying the Volatility of Starting Pitchers

Volatility for Pitchers

• Data from 2009-2013, only pitchers that started >= 20 games used
• There was no limit placed on the number of innings for a start
  – Tough decision, but had to start somewhere

RA9VOL = (IQR_daily_RA9 / 2) / Median_daily_RA9, where:

IQR_daily_RA9 = interquartile range of a pitcher’s daily RA9
Median_daily_RA9 = median of a pitcher’s daily RA9

FIPVOL = (IQR_daily_FIP / 2) / Median_daily_FIP, where:

IQR_daily_FIP = interquartile range of a pitcher’s daily FIP
Median_daily_FIP = median of a pitcher’s daily FIP
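A minimal sketch of how these two formulas might be applied to one season of game logs, including the 20-start cutoff. The `(pitcher, game_ra9, game_fip)` row layout and the names `vol`, `season_vol` and `MIN_STARTS` are hypothetical, and "inclusive" quartiles are an assumption because the slides do not name a quartile method.

```python
from collections import defaultdict
from statistics import median, quantiles

MIN_STARTS = 20  # slide: only pitchers with >= 20 starts are used


def vol(values):
    """(IQR / 2) / median of a pitcher's per-start values."""
    # Assumption: "inclusive" quartiles; other methods give slightly different results.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return ((q3 - q1) / 2) / median(values)


def season_vol(starts):
    """starts: iterable of (pitcher, game_ra9, game_fip) rows for one season."""
    by_pitcher = defaultdict(list)
    for pitcher, ra9, fip in starts:
        by_pitcher[pitcher].append((ra9, fip))

    result = {}
    for pitcher, rows in by_pitcher.items():
        if len(rows) < MIN_STARTS:
            continue  # too few starts to qualify
        result[pitcher] = {
            "RA9VOL": vol([ra9 for ra9, _ in rows]),
            "FIPVOL": vol([fip for _, fip in rows]),
        }
    return result
```

One caveat of dividing by the median: a pitcher whose median game value is zero would make the ratio undefined, so real data may need a guard for that case.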

Page 17: Quantifying the Volatility of Starting Pitchers

Comparing RA9 and FIP VOL

• At a population level, the volatility of RA9 has a much larger spread than the volatility of FIP
• RA9VOL: Mean .65, Standard Deviation .23
• FIPVOL: Mean .36, Standard Deviation .09

[Figure: distribution of VOL across pitchers. Horizontal axis: VOL from 0.0 to 1.2; vertical axis: % of Pitchers, 0% to 45%; series label: RA9VOL; two markers labeled “Average”.]

Page 18: Quantifying the Volatility of Starting Pitchers

Contrasting RA9 and FIP VOL

• E.g. Dillon Gee, 2013, 32 GS
• RA9 3.89 / FIP 4.00

                              RA9     FIP
Quartile 1 (bottom 25%)       1.3     2.7
Quartile 3 (top 25%)          6.0     4.7
(Q3-Q1)/2                     2.36    1.02
Median                        2.5     3.7
VOL [((Q3-Q1)/2)/Median]      .93     .27
VOL- [VOL/lgVOL]              142     65
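The arithmetic in the Dillon Gee table can be retraced with the rounded quartiles shown on the slide. This is a sketch with hypothetical function names. Because the inputs are rounded, the results differ slightly from the slide (for example .94 rather than .93 for RA9). The slide does not state the league VOL used for VOL-, so the .65 below is the 2009-2013 population mean from the previous slide, used purely as an assumed stand-in.

```python
def vol_from_quartiles(q1, q3, med):
    """VOL = ((Q3 - Q1) / 2) / Median."""
    return ((q3 - q1) / 2) / med


def vol_minus(vol, league_vol):
    """VOL- = 100 * VOL / lgVOL, so 100 is league average."""
    return 100 * vol / league_vol


ra9_vol = vol_from_quartiles(1.3, 6.0, 2.5)  # about 0.94 from the rounded inputs
fip_vol = vol_from_quartiles(2.7, 4.7, 3.7)  # about 0.27

# Assumption: league RA9VOL of 0.65 (not given on this slide).
ra9_vol_minus = vol_minus(0.93, 0.65)        # about 143, near the slide's 142
```

A VOL- above 100 means more volatile than the league and below 100 means less volatile. That is how Gee can be well above average on RA9 and well below on FIP.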

Page 19: Quantifying the Volatility of Starting Pitchers

Contrasting RA9 and FIP VOL: Ryu vs. Strasburg 2013

• Both pitchers posted a 3.00 ERA, 30 GS, and ~6 IP/GS
• Ryu 3.24 FIP, Strasburg 3.21 FIP
• While very similar in their seasonal outcomes, Ryu was the more consistent starter, both in terms of runs allowed and FIP

Hyun-Jin Ryu          RA9    FIP
VOL                   .54    .31
VOL-                  80     79

Stephen Strasburg     RA9    FIP
VOL                   .80    .43
VOL-                  118    107

Page 20: Quantifying the Volatility of Starting Pitchers

Hypotheses on Causes of RA9VOL

• Pitchers with high K%s will have lower VOL
• Pitchers with high LOB% will have lower VOL
• Pitchers with high BB% will have higher VOL
• Pitchers with high HR/FB rates will have higher VOL
• Pitchers with high BABIP will have higher VOL
• Pitchers with low GB/FB rates will have higher VOL

Page 21: Quantifying the Volatility of Starting Pitchers

Testing the Hypotheses

• Four of the six hypothesized variables were statistically significant, but the magnitude of the relationship was small

• None of the statistically significant relationships were in the hypothesized direction

         Correlation with RA9VOL   Statistically Significant?   Correct Hypothesized Direction?
K%       .200**                    Yes                          No
LOB%     .233**                    Yes                          No
BB%      -.008                     No                           No
HR/FB    -.232**                   Yes                          No
BABIP    -.124**                   Yes                          No
GB/FB    .013                      No                           Yes
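A small correlation can still clear a significance test when the sample is large, which is one way to read "significant, but small" in the table above. The sketch below uses the standard t statistic for a Pearson correlation. The sample size `N = 400` is hypothetical (the slide does not report n), and `correlation_t_stat` is an illustrative name.

```python
from math import sqrt


def correlation_t_stat(r, n):
    """t statistic for testing whether a Pearson correlation differs from zero."""
    if n < 3 or abs(r) >= 1:
        raise ValueError("need n >= 3 and |r| < 1")
    return r * sqrt((n - 2) / (1 - r * r))


N = 400  # hypothetical sample size; the slide does not report n

for label, r in [("K%", 0.200), ("BABIP", -0.124), ("GB/FB", 0.013)]:
    t = correlation_t_stat(r, N)
    print(f"{label}: r={r:+.3f}, t={t:+.2f}")
```

Under that assumed n, the two-sided 5% cutoff is roughly |t| > 1.97. The K% and BABIP correlations clear it and the GB/FB correlation does not, which matches the pattern in the table. The actual test and sample used for the slide are not stated.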

Page 22: Quantifying the Volatility of Starting Pitchers

What to Make of This?

• While the relationships were significant, the directionality is hard to explain
• When taken together, it’s hard to decipher
  – High K, high LOB = higher VOL; but
  – High HR/FB, high BABIP = lower VOL
• Often, pitchers with high K rates are also more home run prone (throw more fastballs, attack the zone, fly ball pitchers, etc.)
• Pitchers with lower BABIPs tend to strand more runners, not fewer

Page 23: Quantifying the Volatility of Starting Pitchers

Is VOL a Talent?

• A quick read on whether something is a talent or skill is to see if it is repeatable, year over year
• Previous research suggested that, whatever measure you use, VOL or consistency was not repeatable in consecutive years
  – And that appears to be the case with my measure as well

                                 Year-to-Year Correlation
Standard Deviation-based VOL     .04
RA9 VOL                          -.01
FIP VOL                          .06
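A year-to-year correlation like the ones in this table can be sketched by pairing each pitcher's VOL in one season with the same pitcher's VOL in the next, then correlating the pairs. The `{(pitcher, year): VOL}` layout and the function names are hypothetical. How the pairs were actually built for the slide (for example, any minimum-start filter on both seasons) is not stated.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def consecutive_pairs(vol_by_season):
    """vol_by_season: {(pitcher, year): VOL}. Returns (year, year + 1) VOL pairs."""
    pairs = []
    for (pitcher, year), vol in vol_by_season.items():
        next_vol = vol_by_season.get((pitcher, year + 1))
        if next_vol is not None:
            pairs.append((vol, next_vol))
    return pairs


def year_to_year_r(vol_by_season):
    pairs = consecutive_pairs(vol_by_season)
    return pearson_r([a for a, _ in pairs], [b for _, b in pairs])
```

A pitcher with a gap year contributes no pair across the gap, so only truly consecutive seasons are compared.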

Page 24: Quantifying the Volatility of Starting Pitchers

Is VOL a Talent?

• It’s possible that VOL is simply a descriptive statistic that captures the variance from pitcher to pitcher in how their performances were randomly distributed over 30-ish starts
• It’s also possible that VOL is a statistic that needs more time to stabilize, much like BABIP
  – Need to look at multiple seasons averaged to get a better sense of a pitcher’s consistency
• Finally, because of some of the inherent problems with trying to measure VOL, it may be best used to compare pitchers with similar outcome metrics
  – E.g. compare pitchers with similar ERAs, K%, etc.
  – Provides another data point to consider
• Or, Occam’s Razor: the metric isn’t that great

Page 25: Quantifying the Volatility of Starting Pitchers

Summing Up

• There appear to be measurable differences in how pitchers distribute their runs allowed and FIP over the course of a season
  – And those differences are normally distributed over the course of a season
  – FIP appears to be generally less volatile than RA9 across the league
• However, VOL itself seems quite inconsistent, year to year, at the individual level
  – It does appear to stabilize a bit when looking at multiple seasons (akin to clutch ability)
  – Also, it’s not clear VOL is structurally determined, somewhat like clutch hitting

Page 26: Quantifying the Volatility of Starting Pitchers

So Where Do We Stand?

• I’m not in love with this metric, currently, and recommend that anyone using it take it with a big, fat grain of salt
• It could be that parks impact pitcher VOL more than hitters
  – Need to split the data by home and away starts (hat tip Vince Gennaro)
• Quality of opponent could also play a role (hat tip Sean Forman)
• There is also the possibility that inconsistency in mechanics throughout the year impacts VOL more than other metrics
  – Adjustments to mechanics, or just inability to repeat mechanics, or injury could be what drives VOL (hat tip Jeff Zimmerman)
• It could also be that there is no way around the IP/GS issue
  – Two pitchers that give up 4 runs over 8 innings could arrive there differently; one gives up 4 runs over the last 2 innings, the other over the first 3. High odds the latter doesn’t make it to 8 innings
• Bottom line: more work to be done
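The IP/GS issue raised above can be made concrete with a single-game RA9 calculation. This sketch assumes the standard definition (9 times runs divided by innings, with innings counted as outs / 3). The function name and the "pulled after 3 innings" scenario are illustrative.

```python
def game_ra9(runs, outs):
    """RA9 for a single start: 9 * runs / innings, with innings = outs / 3."""
    if outs <= 0:
        raise ValueError("RA9 is undefined when no outs are recorded")
    return 27 * runs / outs


full_outing = game_ra9(4, 24)  # 4 runs, allowed to finish 8 innings -> 4.5
early_exit = game_ra9(4, 9)    # same 4 runs, pulled after 3 innings -> 12.0
```

The same four runs produce very different game RA9 values depending on when the manager removes the starter, which is why short outings generate the outliers discussed earlier.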

Page 27: Quantifying the Volatility of Starting Pitchers

Thank You

@BillPetti

[email protected]

Page 28: Quantifying the Volatility of Starting Pitchers


Appendix

Page 29: Quantifying the Volatility of Starting Pitchers

RA9VOL leaders: 2013

Pitcher             GS   RA9VOL  FIPVOL  RA9VOL-  FIPVOL-
Jason Hammel        23   0.25    0.17    37%      43%
Wei-Yin Chen        23   0.29    0.32    42%      81%
Jon Niese           24   0.33    0.35    49%      89%
Roberto Hernandez   24   0.36    0.22    53%      56%
Miguel Gonzalez     28   0.36    0.32    54%      79%
John Danks          22   0.36    0.45    54%      113%
Andrew Cashner      26   0.36    0.41    54%      102%
Kevin Correia       31   0.37    0.43    54%      109%
Jose Quintana       33   0.38    0.46    57%      114%
Jacob Turner        20   0.41    0.53    60%      134%
Joe Blanton         20   0.41    0.41    61%      104%
Jason Marquis       20   0.42    0.30    62%      74%
Felix Doubront      27   0.42    0.34    62%      86%
Edwin Jackson       31   0.44    0.40    65%      101%
Zach McAllister     24   0.44    0.39    65%      98%
Rick Porcello       29   0.45    0.37    66%      94%
Tommy Milone        26   0.46    0.33    68%      82%
R.A. Dickey         34   0.47    0.29    69%      72%
Brandon McCarthy    22   0.47    0.24    69%      59%
John Lackey         29   0.47    0.38    69%      95%
Yu Darvish          32   0.47    0.44    69%      110%
Scott Diamond       24   0.47    0.39    69%      98%
A.J. Griffin        32   0.47    0.33    69%      83%
Chris Sale          30   0.47    0.44    69%      111%
Andy Pettitte       30   0.47    0.19    70%      47%
Jeff Samardzija     33   0.47    0.28    70%      71%
C.J. Wilson         33   0.48    0.27    71%      68%
Wade Miley          33   0.48    0.41    72%      103%
James Shields       34   0.49    0.30    73%      77%
Cliff Lee           31   0.50    0.29    74%      72%

Page 30: Quantifying the Volatility of Starting Pitchers

RA9VOL trailers: 2013

Pitcher             GS   RA9VOL  FIPVOL  RA9VOL-  FIPVOL-
Gio Gonzalez        32   0.75    0.32    112%     80%
Homer Bailey        32   0.76    0.45    113%     112%
Wade Davis          24   0.77    0.33    114%     84%
Justin Verlander    34   0.77    0.46    115%     115%
Barry Zito          25   0.79    0.46    117%     117%
Scott Kazmir        29   0.79    0.44    117%     110%
Stephen Strasburg   30   0.80    0.43    118%     107%
Chris Archer        23   0.80    0.52    118%     130%
Clayton Kershaw     33   0.82    0.39    121%     98%
Dan Haren           30   0.83    0.46    124%     115%
Zack Greinke        28   0.83    0.45    124%     113%
Hiroki Kuroda       32   0.84    0.44    125%     110%
Jered Weaver        24   0.85    0.36    126%     91%
Jason Vargas        24   0.85    0.36    127%     90%
Joe Saunders        32   0.86    0.36    128%     91%
Matt Cain           30   0.87    0.37    129%     93%
Esmil Rogers        20   0.87    0.41    129%     104%
Jeff Locke          30   0.89    0.28    132%     70%
Jose Fernandez      28   0.90    0.43    134%     109%
Felix Hernandez     31   0.91    0.43    135%     108%
Matt Harvey         26   0.92    0.44    136%     109%
Patrick Corbin      32   0.94    0.23    139%     57%
Dillon Gee          32   0.96    0.26    142%     65%
Lance Lynn          33   1.00    0.34    149%     85%
Mike Leake          31   1.04    0.39    154%     97%
Justin Masterson    29   1.04    0.38    154%     95%
Chris Capuano       20   1.10    0.64    164%     161%
Lucas Harrell       22   1.45    0.41    215%     104%
Matt Moore          27   1.59    0.31    236%     78%
Francisco Liriano   26   1.62    0.37    240%     94%

Page 31: Quantifying the Volatility of Starting Pitchers

RA9VOL leaders: 2009-2013

Pitcher             GS   RA9VOL  FIPVOL  RA9VOL-  FIPVOL-
Kevin Correia       144  75.1    62.6    52%      43%
Ryan Dempster       128  66.9    42.6    52%      33%
John Lannan         91   48.2    26.5    53%      29%
Chris Volstad       109  59.7    37.5    55%      34%
Bud Norris          87   47.7    39.0    55%      45%
A.J. Burnett        159  87.8    52.7    55%      33%
Ricky Nolasco       121  67.4    45.4    56%      37%
Roberto Hernandez   113  63.0    36.2    56%      32%
Jeremy Guthrie      130  73.0    39.3    56%      30%
Edwin Jackson       95   53.5    43.6    56%      46%
Anibal Sanchez      93   52.5    36.3    56%      39%
Derek Lowe          101  57.2    31.6    57%      31%
James Shields       166  94.7    61.0    57%      37%
David Price         146  85.3    54.9    58%      38%
R.A. Dickey         125  73.7    42.6    59%      34%
Luke Hochevar       88   52.2    40.2    59%      46%
Chad Billingsley    120  72.2    37.4    60%      31%
Joe Blanton         79   48.1    33.5    61%      42%
Mark Buehrle        161  99.4    56.3    62%      35%
Mat Latos           127  78.6    60.6    62%      48%
Jeremy Hellickson   91   56.4    31.5    62%      35%
CC Sabathia         161  99.9    59.2    62%      37%
Randy Wolf          101  62.7    33.7    62%      33%
Ricky Romero        125  78.4    39.5    63%      32%
Scott Feldman       74   46.6    36.0    63%      49%
Jason Hammel        130  82.4    61.6    63%      47%
Bruce Chen          82   52.3    37.2    64%      45%
Josh Johnson        92   58.7    45.6    64%      50%
Mike Pelfrey        126  80.5    48.6    64%      39%
Ervin Santana       151  96.6    69.3    64%      46%

Page 32: Quantifying the Volatility of Starting Pitchers

RA9VOL trailers: 2009-2013

Pitcher             GS   RA9VOL  FIPVOL  RA9VOL-  FIPVOL-
Brandon Morrow      77   143.1   46.3    186%     60%
Johan Santana       75   127.8   38.4    170%     51%
Francisco Liriano   105  132.9   53.4    127%     51%
Felix Hernandez     165  163.4   67.3    99%      41%
Chris Carpenter     97   95.1    28.7    98%      30%
Jair Jurrjens       77   73.8    27.9    96%      36%
Jason Vargas        120  113.5   52.9    95%      44%
Nick Blackburn      85   80.4    37.8    95%      44%
Clayton Kershaw     161  149.1   61.2    93%      38%
Randy Wells         82   75.0    28.4    91%      35%
Homer Bailey        107  97.7    54.7    91%      51%
Chris Capuano       84   76.6    46.4    91%      55%
Justin Masterson    125  112.4   52.5    90%      42%
Wandy Rodriguez     95   85.4    31.1    90%      33%
Jered Weaver        154  136.1   65.0    88%      42%
Zack Greinke        122  106.0   49.7    87%      41%
Carlos Zambrano     92   79.4    40.8    86%      44%
Gavin Floyd         120  103.4   50.0    86%      42%
Bartolo Colon       80   68.6    41.0    86%      51%
Josh Beckett        83   69.9    35.7    84%      43%
Tim Hudson          116  97.3    45.7    84%      39%
Mike Leake          109  91.2    50.0    84%      46%
Jon Niese           110  92.0    51.5    84%      47%
Jaime Garcia        80   66.8    40.1    84%      50%
Scott Baker         83   68.4    49.6    82%      60%
Tim Lincecum        163  134.0   60.5    82%      37%
Kyle Kendrick       86   70.7    34.5    82%      40%
Jhoulys Chacin      83   67.1    27.5    81%      33%
Max Scherzer        158  124.4   58.2    79%      37%
Johnny Cueto        118  92.4    41.4    78%      35%