We are now about two weeks into the 2018-2019 MLB off-season. It’s another off-season where we contemplate how we are going to survive the next 3 months without baseball. Somehow, after 3 months without baseball, the day pitchers and catchers report becomes one of the top 10 best days of the year. While there may not be baseball being played, the hot stove will heat up quickly. A lot is expected of certain free agents, headlined by Bryce Harper and Manny Machado. This piece, however, will focus on the pitchers of this free agent class. Among the pitching headliners are Patrick Corbin and former AL Cy Young winner Dallas Keuchel. As fans, we can only hope that our teams’ general managers don’t sign an albatross that weighs down the organization for years to come. The following will take you through a simple model that predicts the contracts of some of this year’s biggest pitching free agents.
The data is taken from the prior 3 off-seasons’ worth of free agent pitchers who signed major league contracts, a total of 80 pitchers. Only starting pitchers are included, since relievers are valued on slightly different criteria. What we are predicting is the average annual value of the contract. We will take into account K/9, opponent batting average, FIP, innings pitched the previous season, handedness, and age. The first model looks like this:
AvgValue = β0 + β1(K9) + β2(OPPAVG) + β3(FIP) + β4(IP) + β5(Handedness) + β6(Age) + ε
Without going too far into detail, as I do not want to bore the remaining readers who have made it to this point, opponent average and handedness were not significant in this model. The adjusted R-Squared was .585, which, for those who aren’t as familiar with statistics, signifies that this model accounts for about 59% of the variance in average contract value. I now re-run the model, only this time without handedness and opponent average:
AvgValue = β0 + β1(K9) + β2(FIP) + β3(IP) + β4(Age) + ε
In this model all the covariates are significant at the .05 level. The adjusted R-Squared remains nearly the same at .5819, which means that this model explains about 58% of the variance in contract value.
In old-fashioned baseball fan voice: “Oh, what is this FIP junk and all your fancy values? Wins and ERA should make a big impact.”
Alright old man, calm down. Grab a beer and let me add them. I agree ERA should be in there. As for wins, you seriously have to conform to modern baseball. Here is a model with wins and ERA:
AvgValue = β0 + β1(K9) + β2(FIP) + β3(IP) + β4(Age) + β5(Wins) + β6(ERA) + ε
In this model ERA is not significant; however, wins are. Old man may have a point here (no, not really). The R-Squared increased to .6213, which means the model accounts for about 62% of the variability in average contract value. We will run one more model, which will be our final model, this time taking out ERA:
AvgValue = β0 + β1(K9) + β2(FIP) + β3(IP) + β4(Age) + β5(Wins) + ε
In this model every covariate is significant, and the R-Squared is .6224, which is actually a slight increase from the previous model.
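For anyone wondering how dropping a variable can nudge the fit upward: that can happen with adjusted R-Squared, which penalizes every extra covariate. The post does not say whether the last two figures are adjusted, so treat this as a rough sketch under that assumption. It is written in Python rather than the R used for the actual models, and the plain R-Squared inputs are made up for illustration; only the sample size of 80 comes from the data set described above.

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R-squared for a linear model.

    r2: plain (unadjusted) R-squared
    n:  number of observations
    p:  number of covariates, not counting the intercept
    """
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Hypothetical plain R-squared values (NOT taken from the article).
# n = 80 matches the number of pitchers in the data set.
with_era = adjusted_r2(0.650, n=80, p=6)     # K9, FIP, IP, Age, Wins, ERA
without_era = adjusted_r2(0.649, n=80, p=5)  # same model minus ERA

# Plain R-squared barely drops, but the penalty for six covariates
# is larger than for five, so the adjusted value goes up.
print(round(with_era, 4), round(without_era, 4))
```

If ERA adds almost nothing once FIP is already in the model, losing it costs very little plain R-Squared, and the lighter penalty can more than make up the difference.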
Something interesting to notice is that ERA was not significant at all in this model but FIP was. That could be because some pitchers did not accumulate many innings and, in conjunction with some bad luck, posted a much higher ERA than their true talent would suggest. FIP, therefore, is a more accurate measure. That being said, it is surprising that wins were significant, since those same players who did not accumulate a lot of innings also did not get many wins. Our final predictive model with the coefficients looks as follows:
AvgValue = 21222239 + 925436(K/9) – 2002017(FIP) + 38983(IP) – 653206(Age) + 608616(Wins)
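To make the formula easier to play with, here is a small Python sketch that plugs a stat line into it. The coefficients are copied from the equation above; the function name and the example pitcher are hypothetical and do not correspond to any real free agent. The models themselves were fit in R, so this only reproduces the arithmetic.

```python
# Coefficients copied from the final model above (dollars per unit of each stat).
INTERCEPT = 21_222_239
COEFFICIENTS = {
    "k9": 925_436,
    "fip": -2_002_017,
    "ip": 38_983,
    "age": -653_206,
    "wins": 608_616,
}


def predict_avg_value(k9: float, fip: float, ip: float, age: float, wins: float) -> float:
    """Predicted average annual contract value in dollars."""
    stats = {"k9": k9, "fip": fip, "ip": ip, "age": age, "wins": wins}
    return INTERCEPT + sum(COEFFICIENTS[name] * value for name, value in stats.items())


# Hypothetical 30-year-old starter (made-up stat line, not a real player).
example = predict_avg_value(k9=9.0, fip=3.50, ip=180, age=30, wins=12)
print(f"${example:,.0f}")
```

Reading the coefficients directly: each extra year of age costs about $650,000 per season, while each extra win is worth about $610,000.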
The top starting pitching free agents of this off-season are Patrick Corbin, Dallas Keuchel, Charlie Morton, Hyun-Jin Ryu, and J.A. Happ. Nathan Eovaldi could be considered a starter, but following this post-season he appears to be more valuable as a reliever. Based on the model above, the projected average contract values are listed below. This does not include length of contract.
If we went solely off this model, it looks like Keuchel should have accepted the Astros’ qualifying offer of $17.9 million, because he is predicted here to get less than $16 million. Obviously this model has its flaws. As mentioned earlier, it only accounts for about 62% of the variance in the average annual value. In addition, the number of years along with the negotiation of opt-outs will affect the annual value.
Thank you for reading! All models were run using R and the data is taken from FanGraphs. If you would like more information about the calculations or the metadata, feel free to comment below. Be sure to like and follow us on Facebook throughout this off-season!