Every fantasy football manager wants to know which rookie will lead the league in scoring. Is the SEC always overhyped? Does a wide receiver with a fast forty at the NFL Combine mean anything?
As you’ll see, being the best in an event doesn’t mean much.
Looking at the NFL Combine Data
I scraped this data from Pro Football Reference; it covers the 2000-2016 seasons. For all of the analysis, I used the DraftKings score from each player’s first season in the NFL.
The most interesting thing I’ve noticed is that being the best at an individual event like the forty yard dash isn’t important. Most people attach high significance to players who ran the fastest forty.
On the scatter plots, the highest-scoring points form cone shapes.
Our takeaway: if you want to pick the highest-scoring rookie wide receiver, you should pay less attention to “winning” an event and more attention to mildly above-average results.
But this doesn’t mean you should always discard good performance at an event. A quick look at the density of forty-yard dash times against fantasy score for wide receivers shows a much better chance of “decent” performance for sub-4.4 forty times.
I don’t know about you but I’d much rather have the best wide receiver than a “good” one.
Running Backs Perform Similarly
The data for running backs shows much the same as for wide receivers.
There’s a little more skew here, with the center around 4.43. But again, the results don’t show that a great forty-yard dash means great fantasy results.
Some Contests are Too Similar
Forcing running backs to train for both the vertical and the broad jump is a waste of time. Look at the correlation matrix (farther from 0 means a stronger relationship).
See that 0.96 correlation? The data looks like something you’d see in a stats class.
Now let’s look at how this affects fantasy results.
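For anyone who wants to see what a single entry in a correlation matrix measures, here is a minimal sketch of the Pearson correlation in plain Python. The jump measurements are made up for illustration and are not rows from the dataset used in this post.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variances, each left unscaled (the 1/n factors cancel).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical measurements (inches) for five made-up running backs.
vertical = [30.0, 32.5, 34.0, 36.5, 39.0]
broad = [110.0, 114.0, 118.0, 121.0, 127.0]

r = pearson(vertical, broad)
print(f"vertical vs. broad jump: r = {r:.2f}")
```

A value near 1 or -1 means one event tells you almost everything about the other, which is the sense in which two tests can be called redundant.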
There might be slightly more significance to vertical results but the effects are pretty balanced.
Draft Position is Highly Important
As a general rule, the closer a pick is to first, the better the player will do. This isn’t always the case (see the WR draft pick plot), but it is still a good rule, and the only good rule of thumb we can find.
Can We Make a Model?
So can we?
I wanted to use machine learning to learn trends that a casual NFL fan might notice and see if it could be better than garbage.
To do this, I used XGBoost and a basic linear regression model. XGBoost learns rules like:
- If position = "WR" and
- forty-yard dash < 4.50 and
- weight < 220 lb,
- then add 20 points to the predicted score.
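Written out by hand, a rule like that is just a conditional. The sketch below is not XGBoost's API and is not taken from the notebook; it only illustrates the kind of rule a tree-based model ends up encoding. The field names (`position`, `forty`, `weight_lb`) and both players are hypothetical.

```python
def rule_adjustment(player):
    """Points that one illustrative rule adds to a player's predicted score."""
    if (
        player["position"] == "WR"
        and player["forty"] < 4.50
        and player["weight_lb"] < 220
    ):
        return 20.0
    return 0.0

# Two made-up wide receivers: one passes every condition, one is too slow.
fast_wr = {"position": "WR", "forty": 4.42, "weight_lb": 205}
slow_wr = {"position": "WR", "forty": 4.61, "weight_lb": 212}

print(rule_adjustment(fast_wr), rule_adjustment(slow_wr))  # 20.0 0.0
```

A boosted model learns many rules of this shape from the data and sums their contributions, rather than relying on one hand-picked threshold.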
And a linear regression tries to compute a weighted sum of variables, which is pretty easy for an Excel junkie to write.
As you can imagine, the results weren’t that great, so if you’re a fantasy football manager, don’t spend too much time focusing on the NFL Combine.
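To make "weighted sum" concrete, here is a small sketch of how a fitted linear model turns a player's measurements into a prediction. The weights, intercept, and feature names are invented for illustration; they are not the coefficients from my regression.

```python
def predict_score(features, weights, intercept):
    """Linear prediction: intercept plus the sum of weight * feature value."""
    return intercept + sum(weights[name] * value for name, value in features.items())

# Hypothetical coefficients: a slower forty and a later pick both lower the score.
weights = {"forty": -10.0, "draft_pick": -0.5}
intercept = 100.0

# A made-up rookie: 4.5-second forty, taken 10th overall.
rookie = {"forty": 4.5, "draft_pick": 10}
prediction = predict_score(rookie, weights, intercept)
print(prediction)  # 50.0
```

Fitting the model means choosing the weights and intercept that minimize the squared prediction error over all players.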
R-squared values for the linear regressions hovered around 0.63.
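If R-squared is unfamiliar, it is the share of the variance in the actual scores that the predictions account for. Here is a minimal sketch of the calculation, using made-up scores rather than anything from the dataset.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - (residual SS / total SS)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical rookie-season fantasy scores and model predictions.
actual = [50.0, 80.0, 120.0, 150.0]
predicted = [60.0, 85.0, 110.0, 140.0]

print(f"R-squared: {r_squared(actual, predicted):.2f}")
```

A model that always predicts the average score gets 0, and a perfect model gets 1.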
Does the NFL Combine Mean Anything?
For the casual fan, you’re better off looking at draft results as a predictor of player performance than anything else. Our takeaways:
- Being the best at an event doesn’t mean much.
- Earlier draft picks generally score more points their rookie year.
- The broad jump and vertical tests are repetitive.
View the Jupyter notebook for this post to see how the analysis was done, and download the data here. I used Sports Data Direct to create the dataset. A large set of scatter plots is also available.
Next time we’ll see if we can create advanced metrics like player density, strength, and weighting by draft pick.