
Does Three Point Field Goal Percentage Affect Winning?

I wanted to see whether three point percentage affects winning. Let’s start with a simple linear model:

threePointModel <- lm(GW ~ THREEPP_100, data = stats_2019)
summary(threePointModel)
## 
## Call:
## lm(formula = GW ~ THREEPP_100, data = stats_2019)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -21.7873  -5.8964   0.3398   7.7762  20.0635 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)   
## (Intercept) -110.234     44.371  -2.484  0.01923 * 
## THREEPP_100    4.254      1.247   3.411  0.00198 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 10.29 on 28 degrees of freedom
## Multiple R-squared:  0.2936, Adjusted R-squared:  0.2684 
## F-statistic: 11.64 on 1 and 28 DF,  p-value: 0.001983

Here I try to explain Games Won (GW) by a team’s average three point field goal percentage per game (THREEPP_100), using only data from the most recent NBA regular season, 2018-2019. The fitted slope says that each 1 percentage point increase in three point percentage is associated with about 4.25 more games won (p = 0.002). The model explains only about 29% of the variation in wins (R-squared = 0.29), so three point percentage is far from the whole story.
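To make the slope concrete, here is a small sketch that plugs the reported coefficients into the fitted line. The numbers are copied from the summary output above; the helper `predict_wins` and the example percentages (35 and 36) are my own illustrative choices, not part of the original analysis.

```r
# Coefficients copied from the summary() output above (2018-2019 season).
intercept <- -110.234
slope     <- 4.254
slope_se  <- 1.247
resid_df  <- 28

# Hypothetical helper: predicted games won for a given three point percentage.
predict_wins <- function(pct) intercept + slope * pct

wins_35 <- predict_wins(35)            # team shooting 35% from three
wins_36 <- predict_wins(36)            # team shooting 36% from three
gain_per_point <- wins_36 - wins_35    # one percentage point more; equals the slope

# 95% confidence interval for the slope, from its standard error.
slope_ci <- slope + c(-1, 1) * qt(0.975, resid_df) * slope_se

print(c(wins_35 = wins_35, wins_36 = wins_36, gain = gain_per_point))
print(round(slope_ci, 2))
```

The interval for the slope is wide (roughly 1.7 to 6.8 games per percentage point), which is what you would expect from a single season with only 30 teams.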

The regression model for the year 2019 has a positive slope.

Let’s see the regression lines for the different years.

The slopes are positive in every year except 1997, so in all the other seasons a higher three point percentage goes with more games won.

Let’s run an ANOVA to see whether year is significant.

model_threes <- lm(GW ~ THREEPP_100 * Year, data = statsAllYears)
anova(model_threes)
## Analysis of Variance Table
## 
## Response: GW
##                   Df Sum Sq Mean Sq  F value    Pr(>F)    
## THREEPP_100        1  22148 22148.5 176.5806 < 2.2e-16 ***
## Year              23   5674   246.7   1.9670  0.004638 ** 
## THREEPP_100:Year  23   7907   343.8   2.7409 2.589e-05 ***
## Residuals        663  83160   125.4                       
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

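The p-value for Year is less than .05, so year is significant. The THREEPP_100:Year interaction is also significant (p = 2.589e-05), which means the effect of three point percentage on games won differs from year to year.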
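As a sanity check on how to read the ANOVA table, the F values and p-values can be rebuilt from the Df and Sum Sq columns alone. This is just a sketch using the rounded numbers printed above, so the results match the table only up to rounding; the variable names are my own.

```r
# Values copied from the ANOVA table above.
term_df  <- c("THREEPP_100" = 1, "Year" = 23, "THREEPP_100:Year" = 23)
sum_sq   <- c("THREEPP_100" = 22148, "Year" = 5674, "THREEPP_100:Year" = 7907)
resid_df <- 663
resid_ss <- 83160

# Mean square = sum of squares / degrees of freedom.
mean_sq  <- sum_sq / term_df
resid_ms <- resid_ss / resid_df

# F = term mean square / residual mean square; p-value is the upper tail of F.
f_value <- mean_sq / resid_ms
p_value <- pf(f_value, term_df, resid_df, lower.tail = FALSE)

print(round(f_value, 2))
print(signif(p_value, 3))
```

Each F value compares the variation a term explains per degree of freedom against the leftover noise, which is why the interaction (23 extra slopes) can be significant even though its Sum Sq is much smaller than that of THREEPP_100.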

##  Year THREEPP_100.trend    SE  df lower.CL upper.CL
##  1996              3.59 0.939 663   1.7417     5.43
##  1997             -2.88 0.942 663  -4.7288    -1.03
##  1998              1.87 0.785 663   0.3315     3.41
##  1999              1.01 0.727 663  -0.4153     2.44
##  2000              1.85 0.953 663  -0.0180     3.72
##  2001              2.60 0.890 663   0.8478     4.34
##  2002              2.17 0.950 663   0.3001     4.03
##  2003              2.41 0.869 663   0.7041     4.12
##  2004              2.27 1.174 663  -0.0351     4.58
##  2005              3.90 1.166 663   1.6130     6.19
##  2006              1.55 1.053 663  -0.5218     3.61
##  2007              2.88 1.313 663   0.3043     5.46
##  2008              3.92 1.068 663   1.8202     6.02
##  2009              3.89 1.188 663   1.5566     6.22
##  2010              3.65 1.068 663   1.5506     5.74
##  2011              2.83 1.060 663   0.7474     4.91
##  2012              2.21 0.938 663   0.3707     4.06
##  2013              3.30 1.069 663   1.1990     5.40
##  2014              3.64 1.100 663   1.4839     5.80
##  2015              5.08 1.131 663   2.8581     7.30
##  2016              5.02 1.171 663   2.7207     7.32
##  2017              3.95 1.153 663   1.6840     6.21
##  2018              5.28 1.843 663   1.6626     8.90
##  2019              4.25 1.357 663   1.5888     6.92
## 
## Confidence level used: 0.95
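The table above has the layout of output from `emmeans::emtrends`, so the call was probably something like `emtrends(model_threes, ~ Year, var = "THREEPP_100")`, though that is my assumption since the code is not shown. The sketch below uses base R and made-up toy data (three fake "years" A, B, C, not NBA data) to show what those per-year trends are: the main-effect slope plus each year's interaction coefficient, which matches fitting a separate regression per year.

```r
# Toy data only (NOT the NBA data): three made-up "years" with different slopes.
set.seed(1)
toy <- data.frame(
  Year        = factor(rep(c("A", "B", "C"), each = 30)),
  THREEPP_100 = runif(90, 32, 39)
)
true_slope <- c(A = 2, B = 3, C = 5)
toy$GW <- 41 +
  unname(true_slope[as.character(toy$Year)]) * (toy$THREEPP_100 - 35.5) +
  rnorm(90, sd = 5)

# Same model form as model_threes; Year must be a factor.
fit <- lm(GW ~ THREEPP_100 * Year, data = toy)
b <- coef(fit)

# Reference year: main effect. Other years: main effect + interaction term.
slopes_from_interaction <- c(
  A = unname(b["THREEPP_100"]),
  B = unname(b["THREEPP_100"] + b["THREEPP_100:YearB"]),
  C = unname(b["THREEPP_100"] + b["THREEPP_100:YearC"])
)

# The same slopes from one simple regression per year.
slopes_separate <- sapply(split(toy, toy$Year), function(d) {
  unname(coef(lm(GW ~ THREEPP_100, data = d))["THREEPP_100"])
})

print(round(rbind(slopes_from_interaction, slopes_separate), 3))

# Assumed equivalent with the emmeans package (not run here):
# emmeans::emtrends(fit, ~ Year, var = "THREEPP_100")
```

The point estimates agree between the two approaches; what the single interaction model adds is a pooled residual variance, which is why every row in the table above shares the same 663 degrees of freedom.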

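Looking at the regression slopes (THREEPP_100.trend) for the different years, we can see that 2015-2019 have higher slopes than the previous years: each of those five slopes (3.95 to 5.28) is larger than every earlier year’s slope (at most 3.92). The confidence intervals overlap quite a bit, though, so a formal pairwise comparison would be needed to say which year-to-year differences are statistically significant.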