The biggest golf tournament of the year, the Masters, finished up a couple of weeks ago with Patrick Reed winning his first green jacket.  Jordan Spieth had an incredible final round, posting a -8 on the day to pull within two strokes of the lead.  Every year around Masters time, I make predictions with some friends as to who I think will win.  Last season I successfully picked Sergio Garcia, while this year I came up just short with my pick of Rickie Fowler.  A lot can go into making a prediction like this.  I could pick based on past success at Augusta, recent performance in tournaments, or just pick Bubba every year until he eventually wins again.  That got me wondering whether someone could actually predict the winner of the Masters, or if it is just luck.

To begin, I looked up every player who made the cut at the Masters over the past three years.  I could have collected data on every player in the field, but I preferred to have all four rounds for each player to keep the data consistent, so I left out the players who missed the cut and played only two rounds.  For each player I looked at first-round score, average driving distance for the season, greens in regulation percentage for the season, the score in the last tournament the player played before the Masters, and the percentage of 4- to 8-foot putts made for the season.  The goal is to see whether a player's final score at the Masters can be predicted from these factors.

All of the regression assumptions hold for these models: there are no non-linear patterns, the residuals are normally distributed, the residuals are spread evenly across the range of the predictors, and there are no overly influential cases.

The first model I tested was the following:

Masters_final = B0 + B1*Masters_round1 + B2*Avg_drivingDist + B3*GIR + B4*Tourney_before + B5*Putts

The results are as follows:

The model as a whole is statistically significant in predicting a player's final Masters score at the .05 significance level (F-value = 11.8, p < .001).  The putting variable and the score in the tournament before the Masters are not significant in this model (F-value = 3.066, p = .082 and F-value = .522, p = .471, respectively).  The model explains 29% of the variability in the data.
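For anyone who wants to try a model of this form themselves, here is a minimal sketch of fitting it by ordinary least squares in plain Python.  It is not the software or the data behind the results above: the player rows are randomly generated placeholders, the coefficients used to generate them are made up, and the helper names `solve_linear_system` and `fit_ols` are my own.  Only the variable names come from the model equation.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so each row operation updates both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Return (coefficients, r_squared); coefficients[0] is the intercept B0."""
    X = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    p = len(X[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(p)] for i in range(p)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(p)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, x)) for x in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot


# Placeholder data only: every number below is invented for illustration.
rng = random.Random(0)
rows, final_scores = [], []
for _ in range(100):
    masters_round1 = rng.gauss(72, 3)
    avg_driving_dist = rng.gauss(295, 8)
    gir = rng.gauss(66, 3)
    tourney_before = rng.gauss(0, 5)
    putts = rng.gauss(68, 5)
    rows.append((masters_round1, avg_driving_dist, gir, tourney_before, putts))
    # Made-up relationship plus noise, so the fit has something to find.
    final_scores.append(
        200 + 1.5 * masters_round1 - 0.1 * avg_driving_dist - 0.3 * gir
        + rng.gauss(0, 5)
    )

coefs, r_squared = fit_ols(rows, final_scores)
print("B0..B5:", [round(c, 3) for c in coefs])
print("R-squared:", round(r_squared, 3))
```

Swapping the placeholder rows for real player data would give the B0 through B5 estimates and the R-squared for that data.  A statistics package would also report the per-predictor tests quoted above, which this sketch leaves out.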

In the next model, I took out putting and the prior tournament score but added interaction terms among the first-round Masters score, average driving distance, and greens in regulation.

Here are the results:

This model is not statistically significant (F-value = 6.117, p = .287), and none of the individual predictors is significant in predicting the final score of the Masters.
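To make "interaction terms" concrete, here is a small sketch of how the extra columns can be built: each interaction is just the product of two predictors.  I am assuming pairwise interactions only, since which terms were included is not spelled out above, and the function name `add_interactions` and the example player values are hypothetical.

```python
from itertools import combinations


def add_interactions(row, names):
    """Return (values, labels) with every pairwise product appended."""
    values = list(row)
    labels = list(names)
    for i, j in combinations(range(len(row)), 2):
        values.append(row[i] * row[j])
        labels.append(f"{names[i]}:{names[j]}")
    return values, labels


names = ["Masters_round1", "Avg_drivingDist", "GIR"]
# Hypothetical player: first-round 70, 300-yard average drive, 65% GIR.
values, labels = add_interactions([70.0, 300.0, 65.0], names)
for label, value in zip(labels, values):
    print(label, value)
```

Products of raw predictors tend to be strongly correlated with the predictors themselves, so centering each variable before multiplying is a common precaution when fitting a model like this.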

In the previous models I used the players’ first round score to predict the final score.  In this last model, I take that variable out to see what we can predict before the first tee off.


The results follow:

This model is statistically significant (F-value = 8.892, p < .001).  The score in the tournament before the Masters and 4- to 8-foot putting are still not significant predictors.  The R-squared value is .197, so the model explains about 20% of the variability in the data.
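The F-value and the R-squared of a regression are tied together: for a model with an intercept, F = (R-squared / k) / ((1 - R-squared) / (n - k - 1)), where k is the number of predictors and n is the number of players.  The sketch below plugs in the R-squared of .197 and the four predictors of this model.  The sample size is not stated in this post, so `n_obs=150` is purely an assumed value for illustration, and `f_statistic` is my own helper name.

```python
def f_statistic(r_squared, n_obs, n_predictors):
    """Overall F statistic of an OLS model that includes an intercept."""
    df_model = n_predictors
    df_resid = n_obs - n_predictors - 1
    return (r_squared / df_model) / ((1.0 - r_squared) / df_resid)


# R-squared and predictor count are from the text; n_obs is an assumption.
f_value = f_statistic(0.197, n_obs=150, n_predictors=4)
print(round(f_value, 2))
```

With that assumed n the formula gives roughly 8.9, which is in the neighborhood of the reported F-value, though that does not confirm the actual number of players in the data.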

The first model is the most useful for predicting the final score of the tournament, but most bets are placed before the tournament starts.  In the final model, the R-squared, while not very high, shows that the model explains about 20% of the variability in the data, with only driving distance and greens in regulation being significant predictors.  Adding newer stats such as strokes gained tee to green, which were not accessible for 2016, along with other stats related to driving accuracy and approach shots, could help increase the accuracy of a model.  Given how little of this data is accessible to the public, I only used the past three seasons, although I would prefer to use 10 or more.  The most surprising finding, in my opinion, is that the prior tournament score played an insignificant role in the model.  With golf being such a mental game, I would have thought that recent success would have a positive impact on the players’ scores at the Masters.  The Masters is one of the most entertaining events in all of professional sports, and every player in the field is meant to have a shot at winning.  Maybe someday Tiger. Maybe someday.


