NBA Playoff Predictor Model Methodology
Introduction
We built a statistical model using multiple linear regression to predict every series in the 2024 NBA playoffs. The model is based on 13 years of advanced team stat data from 2010-2023 and outcomes from every playoff series in that time span. To see the model’s predictions for the 2024 NBA Playoffs, check out this piece. This article walks through the process, methodology, and logic of the model, before acknowledging its limitations and recommended application.
Data Collection and Cleaning
The model’s independent variable data comes from BasketballReference’s Advanced Stats pane from the 2010-11 season to 2022-23. We populated an Excel sheet with each season’s data, which came out to 390 rows (30 teams x 13 seasons). To account for changes over time in NBA play styles, such as offensive rating climbing dramatically over that time span, we rescaled each value within its season. This means that the team with the number one offensive rating in the league in 2010-11 has the same value in the dataset (1) as the best offensive team in 2022-23.
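As a rough illustration of the within-season rescaling, here is a minimal Python sketch. The authors worked in Excel and R, and the article does not say which rescaling was used, so the min-max formula, the function name, and the sample numbers below are all assumptions. A rank-based version would satisfy the same "best team equals 1" property.

```python
from collections import defaultdict

def rescale_within_season(rows, stat):
    """Min-max scale `stat` within each season so that season's top value maps to 1.0.

    Assumption: min-max scaling is one way to make seasons comparable;
    the original spreadsheet may have used ranks or another method.
    """
    by_season = defaultdict(list)
    for row in rows:
        by_season[row["season"]].append(row[stat])

    scaled = []
    for row in rows:
        values = by_season[row["season"]]
        lo, hi = min(values), max(values)
        # Guard against a season where every team has the same value.
        value = 0.0 if hi == lo else (row[stat] - lo) / (hi - lo)
        scaled.append({**row, stat: value})
    return scaled

# Made-up offensive ratings, for illustration only.
rows = [
    {"season": "2010-11", "team": "A", "ortg": 112.0},
    {"season": "2010-11", "team": "B", "ortg": 104.0},
    {"season": "2022-23", "team": "C", "ortg": 119.0},
    {"season": "2022-23", "team": "D", "ortg": 110.0},
]
scaled = rescale_within_season(rows, "ortg")
```

In this sketch the best offense of each season ends up with the same value even though the raw ratings differ by seven points. For a stat where lower is better, such as defensive rating, the scale would need to be flipped.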
To train the model, we listed out the record of every team in each playoff series from 2011-2023. This resulted in another 390 rows (15 series x 13 seasons x 2 teams). In other words, each series appeared twice, with each team taking a turn as “Team One” while the other was “Team Two.” The dependent variable is the win percentage of Team One in the series. For example, the Cavaliers-Warriors series in the 2016 NBA Finals appears twice: once with the Cavaliers as Team One and a 57.14% win rate, and once with the Warriors as Team One and a 42.86% win rate. The goal of the model is to predict series win percentage based on the assortment of advanced stats from BasketballReference.
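The two-rows-per-series layout can be sketched in a few lines of Python. The function and field names are hypothetical, chosen only for this example. The 4-3 result is the 2016 Finals outcome implied by the percentages above.

```python
def series_rows(season, team_a, wins_a, team_b, wins_b):
    """Return two training rows for one series, one from each team's point of view."""
    games = wins_a + wins_b
    return [
        {"season": season, "team_one": team_a, "team_two": team_b, "win_pct": wins_a / games},
        {"season": season, "team_one": team_b, "team_two": team_a, "win_pct": wins_b / games},
    ]

# 2016 NBA Finals: Cavaliers won 4 games, Warriors won 3.
rows = series_rows(2016, "Cavaliers", 4, "Warriors", 3)
for row in rows:
    print(row["team_one"], round(row["win_pct"] * 100, 2))
```

The two win percentages in a pair always sum to 100%, which is why the mirrored rows carry the same information viewed from opposite sides.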
To get a statistical picture for both teams in each series, we measured the difference between Team One and Team Two for each independent variable. We did this by subtracting Team One’s metrics from Team Two’s. For example, if the Hawks (Team One) had an offensive rating of 110.5 and the Wizards (Team Two) had an offensive rating of 112.0, the offensive rating value for the Hawks-Wizards series was 1.5. This also means that the same series, but with the Wizards as Team One, would have an offensive rating value of -1.5. This resulted in an array of values for each series capturing the deltas between the two teams for each independent variable.
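The delta step can be sketched as follows, using the Hawks-Wizards numbers from the example above. The function name and dictionary layout are assumptions made for illustration. The sign convention (Team Two minus Team One) follows the text.

```python
def stat_deltas(team_one, team_two, stats):
    """Team Two's value minus Team One's value for each stat."""
    return {stat: team_two[stat] - team_one[stat] for stat in stats}

hawks = {"ortg": 110.5}
wizards = {"ortg": 112.0}

hawks_as_team_one = stat_deltas(hawks, wizards, ["ortg"])    # {'ortg': 1.5}
wizards_as_team_one = stat_deltas(wizards, hawks, ["ortg"])  # {'ortg': -1.5}
```

Because each series appears once from each side, every delta in the dataset has a mirror image with the opposite sign.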
Data Analysis
With the cleaned dataset, we used R to create a multiple linear regression model. To begin, we included all of BasketballReference’s advanced stats. This included age, wins, losses, win percentage, Pythagorean wins, Pythagorean losses, margin of victory, strength of schedule, simple rating system, offensive rating, defensive rating, net rating, net rating rank, pace, free throw rate, three point attempt rate, true shooting, and the Offense and Defense Four Factors (eFG%, TOV rate, offensive rebound percentage, and free throws per field goal attempt). The dependent variable was the win percentage of “Team One.” The initial model had an R² value of 0.4389 and several variables with large p-values. To reduce noise and bias within the model, we filtered out the variable with the highest p-value and re-ran the model, repeating until there were no variables with p-values greater than 0.10.
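The drop-and-refit loop described above is a form of backward elimination. The sketch below shows only the loop, in Python rather than the R the authors used. It reads the per-variable significance value as the p-value of each coefficient, which is an interpretation. The `fit_p_values` callable is hypothetical and stands in for fitting the regression, and the p-values in the example are made up.

```python
def backward_eliminate(variables, fit_p_values, threshold=0.10):
    """Drop the variable with the largest p-value, refit, and repeat.

    `fit_p_values` is a hypothetical callable: given a list of variable
    names, it fits the regression and returns {variable: p_value}.
    The loop stops once no remaining variable exceeds `threshold`.
    """
    kept = list(variables)
    while kept:
        p_values = fit_p_values(kept)
        worst = max(kept, key=lambda name: p_values[name])
        if p_values[worst] <= threshold:
            break
        kept.remove(worst)
    return kept

# Stand-in "fit" with fixed, invented p-values so the loop runs without a stats library.
# In a real fit the p-values would change each time a variable is removed.
fake_p = {"pace": 0.62, "age": 0.31, "net_rating_rank": 0.01, "true_shooting": 0.04}
selected = backward_eliminate(list(fake_p), lambda kept: {v: fake_p[v] for v in kept})
print(selected)
```

Removing one variable at a time matters because p-values shift after every refit. A variable that looks weak in the full model can become significant once a correlated variable is dropped.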
The resulting model had a shorter list of independent variables: net rating rank, wins, losses, win percentage, margin of victory, simple rating system, offensive rating, free throw rate, three point rate, true shooting, eFG%, turnover rate, opponent eFG%, and opponent turnover rate. Several variables interacted with others in the model. To account for this, we multiplied wins, losses, and win percentage together to capture the inherent interaction between winning and losing games. True shooting and eFG% are also calculated with similar formulas, creating strong interaction between the two, so we multiplied those as well. This finalized model had a slightly improved adjusted R² of 0.4436, meaning that approximately 44% of the variance in series win percentage is explained by the model.
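The interaction terms and the adjusted R² statistic can be sketched as below. This is an illustration in Python under assumptions: the column names are invented, the input values are made-up deltas, and the product terms are one plain reading of "multiplied together" rather than a reproduction of the R formula the authors ran. The adjusted R² formula is the standard one.

```python
def add_interactions(row):
    """Add product terms for the two variable groups the article says were multiplied."""
    out = dict(row)
    out["wins_x_losses_x_win_pct"] = row["wins"] * row["losses"] * row["win_pct"]
    out["ts_x_efg"] = row["true_shooting"] * row["efg"]
    return out

def adjusted_r2(r2, n_rows, n_predictors):
    """Standard adjusted R^2: discounts R^2 for the number of predictors used."""
    return 1 - (1 - r2) * (n_rows - 1) / (n_rows - n_predictors - 1)

# Made-up delta values for one series row.
row = add_interactions(
    {"wins": 0.5, "losses": -0.5, "win_pct": 0.2, "true_shooting": 0.1, "efg": 0.3}
)

# Illustrative inputs only, not the model's actual row or predictor counts.
example = adjusted_r2(0.5, n_rows=101, n_predictors=10)
```

Adjusted R² is always at or below plain R² for the same fit, so it is the fairer number to quote when comparing models with different numbers of variables.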
See a summary of the series model below.
Assumptions and Limitations
The relatively low R² value indicates that there are a myriad of other factors that matter to playoff series outcomes that the model doesn’t capture. Some of these factors can’t be measured accurately, such as luck and momentum. Others, such as playoff experience, depth, and injuries, can be. A future model should look to account for these other measurable factors to get a more complete view of each playoff team.
The model also relies on the differences between teams in their advanced metrics to calculate their predicted playoff success. It’s plausible that the difference in metrics doesn’t tell the entire story. For example, two teams with elite offenses will have a small delta in their offensive rating, just like two teams with poor offenses will. In the latter case, the teams’ other metrics may become more important: winning the rebounding battle or the turnover differential might matter more when both offenses are poor than when both are elite. An ideal model should account for these different situations.
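A small Python sketch makes the point concrete. It also shows one possible remedy, keeping the matchup’s average level next to the delta. That remedy is an illustration only and is not part of the current model. The values are made-up rescaled offensive ratings.

```python
def delta_and_level(team_one_value, team_two_value):
    """Return the delta the model uses plus the matchup's average level, which it discards."""
    delta = team_two_value - team_one_value
    level = (team_one_value + team_two_value) / 2
    return delta, level

# Two elite offenses versus two poor offenses (made-up rescaled values).
elite_delta, elite_level = delta_and_level(0.75, 1.0)
poor_delta, poor_level = delta_and_level(0.0, 0.25)

print(elite_delta == poor_delta)  # the model sees these two matchups as identical
print(elite_level, poor_level)    # the levels are what tell them apart
```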
Another limitation of this model is that it utilizes regular season statistics to predict playoff outcomes. Postseason basketball is an entirely different game from the regular season in a myriad of ways. For one, defensive liabilities are often played off the floor due to matchup hunting in the playoffs. This reduces the effectiveness of guards who can’t defend, and it opens up the floor for mobile bigs and small ball lineups to replace interior defenders such as Rudy Gobert. The playoff game is also much more physical, and thus we see players who can play through contact such as Nikola Jokic, Kawhi Leonard, and LeBron James elevate their game in the postseason. Finally, easy buckets are few and far between in the playoffs. Lackadaisical defense during the regular season allows strong offensive teams to consistently find quality shots. During the playoffs, every point must be earned, so contested shot-makers such as Jimmy Butler and Jamal Murray tend to stand ahead of the pack when the lights are brightest.
Overall, this model should be used more as a thought-provoking experiment than as a serious predictor of playoff success. It’s possible that it can expose inefficiencies in the betting markets by revealing underrated teams, such as the Pacers and Suns in 2024. However, given the number of other factors that play into playoff outcomes, the model should not be used alone to predict individual series. It is just one of many ways to evaluate teams as they prepare for their playoff runs.
Contributors: Jason Taylor and Samuel Rui
Using Statistical Modeling to Predict the 2024 NBA Playoffs
We built a multiple linear regression model to predict every series in the 2024 NBA Playoffs.
The NBA Playoffs are finally here.
While the college basketball bracket-building bonanza that dominates group chat discourse and small talk with your coworkers hasn’t yet permeated the NBA, it is just as intriguing to predict the NBA playoff bracket. But instead of making picks based on mascots, Cinderella potential, and overall vibes, this article will predict each round based on more than a decade of team stats and data from more than 1,000 playoff games. We built a regression model from 13 seasons of advanced stat data to predict playoff series win percentage in every round of this year’s playoffs. If terms like “regression analysis” and “covariance” get you excited like us, check out this short article explaining the methodology and science behind the model. Otherwise, let’s get to the predictions.
Round 1
(1) Oklahoma City Thunder vs. (8) New Orleans Pelicans
Model Prediction: Thunder in 7
After a wild few games to end the regular season, the Thunder managed to capture the top seed in the West over the Nuggets and Timberwolves. They’ll play the New Orleans Pelicans, who knocked out the Kings in the final play-in game after losing Zion Williamson to a hamstring injury in the game prior. This is a big test for the young Thunder, who have very little playoff experience, to show that their regular season success was no fluke. Even without Zion, the Pelicans are no easy opponent, as their roster runs _ and went 7-5 without Williamson available this season. Both teams take care of the ball, force turnovers on the other end, and rely on a balanced scoring attack outside of their primary ball handlers. One advantage for New Orleans is in the rebounding department, as they ranked above average in offensive and defensive rebound rate while OKC ranked in the bottom three in both categories. Still, the model likes the Thunder to take this in a seven-game series.
(2) Denver Nuggets vs. (7) Los Angeles Lakers
Model Prediction: Nuggets in 7
The defending champs are back for another run for the NBA title. In the first round, they’ll match up against the Lakers, the team that they swept in last year’s conference finals. However, don’t underestimate LA, as their chemistry, continuity, and subtle roster improvements make them a force to be reckoned with in 2024. Expect this series to be more competitive than last year, with Gabe Vincent having an impact slowing down Jamal Murray and D’Angelo Russell playing his best ball in the purple and gold. LeBron James’s continued defiance of Father Time has been a joy to watch. If Anthony Davis can limit Nikola Jokic, the Lakers can win this series, but the Nuggets are the favorites for a reason. I agree with the model’s prediction for the Nuggets in seven, but the Nuggets could keep this series shorter if their role players knock down shots.
(3) Minnesota Timberwolves vs. (6) Phoenix Suns
Model Prediction: Suns in 7
Minnesota’s brilliant regular season sets them up for a first round date with the Phoenix Suns, a team that spent a plethora of picks (and dipped heavily into the luxury tax) to build their roster over the last 18 months. While Phoenix underwhelmed during the regular season, the model likes them to win their first round matchup against the relatively inexperienced Timberwolves. The key to this one is Kevin Durant. If the Timberwolves can slow him down (eyes on you, Jaden McDaniels), they absolutely can win this series, but stopping the Slim Reaper is easier said than done. With Devin Booker and Bradley Beal on the wings and Grayson Allen’s marksmanship, the Suns’ potent offense should prove too much for Minnesota’s Gobert-anchored defense. Overall, this is a tough matchup for the Timberwolves, and I agree with the model’s prediction for the Suns in seven.
(4) Los Angeles Clippers vs. (5) Dallas Mavericks
Model Prediction: Clippers in 6
Once again, the Clippers meet the Mavs in the NBA playoffs. Dallas has been playing their best basketball of the season as of late, ranking 7th in net rating in their last 15 games and fielding the league’s best defense over that span. Luka Doncic and Kyrie Irving have built some excellent chemistry, learning how to play off one another and when to allow the other to take over. The team’s acquisitions of Dereck Lively II, Daniel Gafford, PJ Washington, and Dante Exum have been astute work by the front office and have helped Dallas play incredibly well down the season’s home stretch. LA, on the other hand, has struggled since the All-Star break, due in part to the absence of Kawhi Leonard since late March. Regardless of whether Kawhi suits up for this series, I’ll take the red-hot Mavericks to defy the model and win this series in six games.
(1) Boston Celtics vs. (8) Miami Heat
Model Prediction: Celtics in 4
We have another conference finals rematch in the first round, as the Celtics look for revenge against the Jimmy Butler-less Heat. Boston boasts the league’s best record, net rating, and offense, making them heavy favorites in this first round matchup. The additions of Kristaps Porzingis and Jrue Holiday combined with major growth from Derrick White have made the Celtics a true juggernaut. With so many sources of offense, the Heat will have their work cut out for them. Jimmy Butler’s likely absence is a killer, as the Heat have no one else with his blend of toughness and skill. I think Miami gets a game here due to their relentless effort and unconventional zone defense, but the Celtics win in 5.
(2) New York Knicks vs. (7) Philadelphia 76ers
Model Prediction: Knicks in 6
Who would’ve expected before the season that a Knicks-Sixers matchup would have the Sixers coming in as underdogs? This series may be the most unpredictable one in the first round as both teams have major question marks coming in. The Sixers have Joel Embiid back for this series after he missed 43 games during the regular season. His availability is a game-changer, as Philadelphia posted a 31-8 record with Embiid and a 16-27 record without him. They are one of the strongest teams in the league with him, but what remains to be seen is how effective he will be given his knee injury. The other side of the equation is the Knicks. Proponents of the Knicks may point to the Knicks’ 20-3 record with OG Anunoby in the lineup. Critics may point to their 21-28 record against teams over .500. That 21 win figure puts them in a similar profile as the Cavaliers, Warriors, and Heat. This series may largely be dictated by which version of Embiid we see. The model is based on a season where Embiid played roughly half the games, and because of that, it predicts the Knicks to win in six.
(3) Milwaukee Bucks vs. (6) Indiana Pacers
Model Prediction: Pacers in 7
The Bucks’ rollercoaster regular season lands them in a first round matchup with the Indiana Pacers. Milwaukee struggled mightily down the stretch, finishing with a losing record under head coach Doc Rivers after the dismissal of Adrian Griffin in January. The key to this series will be the health of Giannis Antetokounmpo, who injured his calf at the end of the regular season and is unlikely to suit up for Game 1. If Giannis is unable to play or is seriously limited, expect the Pacers to win this series, as their depth and elite offense will wreak havoc on the Bucks’ middling defense. The Pacers have the second-fastest pace in the NBA and will look to turn this series into a track meet. If they can dictate tempo, they’ll win this series in fewer than 7.
(4) Cleveland Cavaliers vs. (5) Orlando Magic
Model Prediction: Cavaliers in 7
The Cavaliers and Magic may be two teams that we consistently see at the top of the Eastern Conference in the near future. Both have talented young cores that may need more time together to develop proper chemistry. The Cavs have one of the best offensive backcourts in the league in Donovan Mitchell and Darius Garland, rounded out by a strong defensive frontcourt with Evan Mobley and Jarrett Allen. The issue with the Cavs is that they are going to consistently have two liabilities on the defensive end in Garland and Mitchell, who will get hunted in ball screens and isolation action. The Magic have a solid defensive team from top to bottom, which has resulted in them claiming the third-best defensive rating in the league. Orlando’s problems arise on offense, where they have much less floor spacing than other teams. Only one player shoots above 40% from 3 (Joe Ingles) and three of their five starters shoot below 35%. The Cavs’ length in Mobley and Allen should be able to limit paint points and force the Magic to make tough shots. The model predicts Cavaliers in 7, and we agree as this appears to be a suboptimal matchup for Orlando.
Round 2
(1) Oklahoma City Thunder vs. (4) Los Angeles Clippers
Model Prediction: Thunder in 7
The young Thunder take on the mature Clippers in the second round. This one would be a tough matchup for OKC, as the Clippers have an abundance of defensive wings to put on Shai Gilgeous-Alexander and Jalen Williams. However, the Thunder had their way offensively in the regular season, averaging 127 points and winning the season series over the Clippers 2-1. Still, playoff experience is vital as teams progress in the playoffs. If the Clippers are healthy, I would expect them or Dallas to take down the Thunder, but the model likes OKC to advance in 7.
(2) Denver Nuggets vs. (6) Phoenix Suns
Model Prediction: Suns in 7
After some hard-fought first round series, the champs will take on the Suns in round 2. Phoenix presents a challenging opponent for Denver, who went 1-2 against the Suns this year, including two losses in March to close out the season series. The Suns' roster is somewhat flawed, but they do possess some difficult matchups for the Nuggets, who struggled this season against teams with big, scoring wings. Aaron Gordon and Kentavious Caldwell-Pope will have their hands full against Kevin Durant, Devin Booker, and Bradley Beal. The model predicts the Suns to beat the Nuggets in 7 to advance to the conference finals.
(1) Boston Celtics vs. (4) Cleveland Cavaliers
Model Prediction: Celtics in 5
The Celtics can switch 1-4 and have an elite paint protector in Porzingis. As touched on in the Cavs and Magic preview, teams will hunt Garland and Mitchell throughout these gritty playoff games. Brown, Tatum, Derrick White, and Jrue Holiday should all be able to take turns going at the Cavs’ backcourt. The model has the Cavs pulling off one win against the Celtics, which we believe may be too generous.
(2) New York Knicks vs. (6) Indiana Pacers
Model Prediction: Pacers in 7
The Knicks and Pacers were fortunate enough to face teams that were struggling with injury woes in the first round. This series appears to be an entertaining one as star guards Haliburton and Brunson will lead their teams, with both boasting top ten offenses (Pacers 2nd, Knicks 7th). It will be interesting to see which team dictates the tempo of these games, as the Pacers play at the second fastest tempo, while the Knicks play at the slowest pace in the league. The model has the Pacers controlling the pace and beating the Knicks in a close series, a result that we agree with given how their regular season series went (Pacers 2-1, +24 point differential).
Conference Finals
(1) Oklahoma City Thunder vs. (6) Phoenix Suns
Model Prediction: Suns in 7
The model predicts a return to Oklahoma City for Kevin Durant as the Suns meet the Thunder in the Western Conference Finals. The key to this series will be the turnover differential. The Thunder thrive at forcing turnovers, ranking second in opponent turnover rate and leading the NBA in percentage of points off turnovers. Phoenix would have to take care of the basketball, though they had the fifth-worst turnover rate in the league this season. The model expects the experienced Suns to upset OKC and head to their second Finals appearance in three years.
(1) Boston Celtics vs. (6) Indiana Pacers
Model Prediction: Celtics in 6
The Celtics come into this matchup having won easily in their prior two series. The same can’t be said for the Pacers, who have gone through two dogfights to make it to the conference finals. The Celtics are a better team from top to bottom, matching the Pacers’ offensive efficiency (1st for Boston, 2nd for Indiana) and crushing them on the defensive side of the ball (2nd for Boston, 24th for Indiana). The Celtics slowed down Haliburton significantly in the regular season, holding him to 16 points and 9 assists per game compared to his 21 and 11 across the season. The model expects more of the same in this series, predicting that the Pacers fall in 6, which might be generous for a team that has gone through back-to-back 7-game series.
NBA Finals
(1) Boston Celtics vs. (6) Phoenix Suns
Model Prediction: Celtics in 7
This seems like a good matchup for the Celtics, as their long list of perimeter defenders can bother the Suns’ stars. The Celtics ranked second in the NBA in defensive rating due to their switchability, as players like Holiday, White, and Tatum can guard 1-4 while Horford and Porzingis are capable of switching on to wings. Still, Kevin Durant can be the best player in this series, and that can be enough to win two or three games on its own. If the Celtics, who swept the season series, can generate rim pressure and continue to knock down threes at an elite clip, this is a very winnable series. The model predicts that they beat the Suns in a gutsy seven games and win the 2024 NBA championship.
Contributors: Jason Taylor and Samuel Rui