algorithms, Analysis, code, data, Home Page, NBA, sports

Replicating Nate Silver’s NBA Elo Algorithm

Before diving into how exactly Nate Silver’s Elo algorithm works I want to tell you a little about Nate Silver. Nate Silver first became famous from his MLB player projection tool PECOTA and managing Baseball Prospectus, a website devoted to baseball “sabermetrics”. Nate became publicly famous for predicting 49/50 states in the 2008 presidential election and was named to Times top 100 most influential people list. In 2012, he went on to predict all 50 states. Since then he created the blog fivethirtyeight.com and publishes data journalism articles.

Silver’s Elo Algorithm: Why care?

Quick answer.. it can make you money. Nate Silver took the Elo algorithm for chess and applied it to sports. On fivethirtyeight he keeps a page dedicated to data driven power rankings. This guy is already famous for predictions so I wanted to share the actual implementation of the algorithm and analyze it’s performance against the spread.1)The one thing I hate about Nate Silver is that he never goes the extra mile in publishing his methods. He keeps the actual spread performance to a one line comment that it’s not good enough to beat Vegas. By using a combination of strength of schedule, team types, and average stats it should be possible to beat Vegas lines and make some money — or at least beat some Stanford undergrads.

From my tests the past few seasons, I’ve found a simple Strength of Schedule (SOS) rating using only scores and home/away information predicts game winners 65-75% of the time (which is in line with published results on the NBA). Against the spread performance is not statistically good enough to beat Vegas spread lines but when combined with other factors it is.

Silver’s Elo Algorithm: Implementation

I’ve compiled the algorithm by reading about his description on fivethirtyeight.com. He doesn’t quite give the implementation details in a convenient way so I’ve listed them here — for the techie people in the audience, the Jupyter notebook I used for this post is available here.

$$R_{0}=1300$$

$$R_{i+1}=K(S_{team}-E_{team})+R_i$$ where R is elo rating, S=1 if the team wins and S=0 for a loss. E represents the expected win probability in Nate’s formula and is defined as $$E_{\text{team}}=\frac{1}{1+10^{\frac{\text{opp_elo}-\text{team_elo}}{400}}}.$$ In chess K is a fixed constant but Nate changes K to handle margin of victory. Nate Silver’s K is $$\text{K}=20\frac{(\text{MOV}_{winner}+3)^{0.8}}{7.5+0.006(\text{elo_difference}_{winner})}.$$ where $$\text{elo_difference}_{winner}=\text{winning_elo}-\text{losing_elo}.$$ Nate also takes into account home advantage by increasing the rating of the home team by 100.
The only other consideration is how to handle seasons. Nate handles this by reverting each team towards a mean of 1505 as in the following formula $$R_{s=i+1}=(0.75)R_{s=i}+(0.25)1505.$$

Proof this is it

Charts don’t lie. 2)They just commonly mislead — not here though Nate Silver's NBA ELO algorithm compared to my implementation

I computed the R-Squared score of a linear fit and found 0.999. The small blips are due to floating point arithmetic and rounding errors.

Accuracy of Elo Predictions

Predict Away Predict Home
Away Wins 9772 14061
Home Wins 6083 33222

The Elo prediction was right 68% of the time. This is pretty remarkable considering how simple it is. Already this beats published models by Stanford students in CS 229 that use over 200 features! God I hope nobody picks on my thesis…

Conclusion

There you have it. Some data behind Nate Silver’s Elo algorithm, code, and a complete description of the model. Be sure to read the bonus topics and leave a comment.

Bonus Topic: Alternate Betting Methods Exploiting Error Profile

Just because Nate Silver’s Elo algorithm has a similar success rate as a coin flip that doesn’t mean you can’t successfully bet with it. Here is the error profile of using his algorithm against the spread for 2012-2014 regular season games. 3)Data from covers.com

Bet on Away Bet on Home
Away Covers 757 938
Home Covers 669 852

Although the success rate at picking teams against the spread is 50.03%, a strategy to only use Nate’s model to bet on away favored teams would have given you a success rate of 53% over 3 years. So maybe there’s something useful with this algorithm that the good people at fivethirtyeight are hiding from their published articles… Hmmm.

Now that we know that just because an algorithm doesn’t perform well for all games that doesn’t mean we can’t use it special cases. Since this Elo implementation is all about strength of schedule and home advantage, it is probably more accurate for home court advantage and teams that go on long streaks against good or bad opponents — the media can overreact. One way you could build a model that understands Elo’s is by adding moving average features to go along with the Elo score.

If you’re curious on what percentage you actually need to be a profitable bettor, I’ve computed the percentage return you receive by betting a fixed amount of money each time. 4)This assumes your payout is -110 on each game.

 Success Rate Percentage Return
0.50 -50%
0.51 -29%
0.52 -8%
0.53 13%
0.54 34%
0.55 55%
0.56 76%
0.57 97%
0.58 118%
0.59 139%
0.60 160%
0.61 181%
0.62 202%
0.63 223%
0.64 244%
0.65 265%

So at 53% success rate you can “expect” 13% return. If you want to bet with this be my guest. It’s still not good enough for me because a 99% confidence interval shows that your true rate could be 53% and you are only right 48% of the time in a season — too noisy for me — the S&P 500 seems like a better bet than pure Elo . That being said I’m working on a model that is 55% ATS right now that I’m trying to build to 57% by feeding it additional stats. Not sure I’ll share that one 😛

Bonus Topic: Autocorrelation and Markov Models

Auto-correlation (self-correlation) is the tendency of a time series to correlate with a delayed version of itself. For this Elo problem, our goal is to sum up an entire team’s ability with 1 number — granted this isn’t a good assumption but it is what the Elo rating and Nate Silver are trying to do.

I think it’s easiest to show this mathematically. To minimize delayed or “serial” correlations with itself, an Elo rating system should be a markov process.  That means that todays rating should only depend on yesterdays rating and earlier information doesn’t add value as in the following formula

$$R_{i}=R_{i-1}=R_{i-1,i-2,i-3,…,0}.$$

That should make sense because all we’re saying is the rating takes into account all the information we have from the past. Autocorrelation helps our Elo values make sense because a 1550 team is always a 1550 team regardless of what they were a week ago.

Bonus Topic: Limitation of 1 Dimensional Rating System and Elo

Do you remember Pokemon? These guys Pokemon Types

Now imagine you tried to rate them and figure out which one’s the best. I’m sure you did but then you would get owned by Gary because although Charmander is pretty good Squirtle destroys him. Water is super effective against fire. The same can be said about NBA teams.

A team that’s great about scoring fast break points should do exceptionally well against a team with turnover problems. If you want to beat Vegas lines day in and day out, you can’t just use one number, Elo, to do everything for you. You gotta do better than that.

References   [ + ]

1. The one thing I hate about Nate Silver is that he never goes the extra mile in publishing his methods. He keeps the actual spread performance to a one line comment that it’s not good enough to beat Vegas.
2. They just commonly mislead — not here though
3. Data from covers.com
4. This assumes your payout is -110 on each game.

6 thoughts on “Replicating Nate Silver’s NBA Elo Algorithm

  1. Nice write up! I did something similar myself (although a bit hackier) and came to a similar conclusion: the slight edge over Vegas isn’t consistent enough to overcome the noise. With a fictional bankroll of $1,000, this did make a profit in the long run, but it went through a crazy rollercoaster ride along the way.

    Have you tried tracking Vegas returns for Nate Silver’s CARMELO model?

    1. Thanks Cameron! I went live with the ELO algorithm 2 year back and it was definitely a rollercoaster.

      I want to try CARMELO, it seems like that model should be much better from what fivethirtyeight writes about it. I’m still working on getting enough historical NBA stats and will see if I can write a similar piece to this.

      I’m also going to evaluate the Glicko rating system and found a good writeup from a guy who did it for the nhl http://msinilo.pl/blog2/post/p1396/

  2. Great article. This definitely cleared up a few things for me. Thank you.

    One thing stands out to me though. If this model has indeed picked with an accuracy of ~68% over the last three years, what does that say about the effect of key players (i.e. all-star level talent) missing games? One would expect this to be an area for error to be introduced. If I’m correct, Elo models don’t account for game-day rosters…

    1. Thanks Jesse. Appreciate you’re comment. I agree, key players missing games would be a great place to start to improve this Elo strength of schedule model. The performance without taking game day rosters into account is remarkable but could probably be improved with game day rosters.

      I’m hoping to get quality injury data and test what your saying. I’d wager that star running backs in football make less of a difference than star WR’s or QB’s. As for basketball, I think it could be really interesting to see if stars like Russell Westbrook and all of his triple doubles means that OKC will miss him when he’s out. Or if players tendency to be ball hogs and dictate the style of play can actually hurt teams, e.g. Carmelo Anthony having 60+ points but losing the game. The points make him look like an all-star but it could be hurting his team.

      Adding the feature could be tough too. Would a binary variable (key player missing be enough) or would you have to build a model that understands how the missing all-star player would change the style of play. Less offensive rebounds/physical if a key center is out.

  3. I still can’t comprehend how it’s this accurate! Accuracy of 68% is exactly what Vegas has achieved over the same period. The pessimistic side of me questions just how much more precise one get with Vegas being the gold standard and all. I’m inclined to think that the cost-benefit ratio with that endeavour wouldn’t be worth it…

    While the Elo model can generate win probabilities and therefore point spreads, any thoughts on if it can be modified to predict team totals?

    1. I agree. It’s insane. Even for the NFL, Elo is consistently very close to Vegas. https://nbviewer.jupyter.org/github/sportsdatadirect/python_tutorials/blob/master/Power%20Rankings.ipynb — scroll to “Performance each season during test period”.

      To your question, yes. Elo can generate team totals. First you’d need to make some assumptions on the distribution of team totals. The naive assumption of Gaussian is fine, but some statisticians think the Skellam distribution is a better fit. https://en.wikipedia.org/wiki/Skellam_distribution

      All you’d have to do is take the probability of winning and put it in the team total distribution you choose. Could use a team specific distribution or the whole NBA team population. Whatever you choose, you’d probably want to make sure the Elo spread prediction is balanced home_team_total+home_spread=away_team_total.

Leave a Reply

Your email address will not be published.