NBA Salaries Prediction

Background

When it comes to the topic of basketball and the NBA, one of the most commonly heard questions about players is how much money they make. Being able to predict a players NBA salary can never be completed with 100% accuracy, with some factors like previous injuries, how common they are injured, whether they are a star, etc. due to the fact that not all players are consistent and many teams do not always make the most sound decision when changing their roster for the season. Teams will often negotiate to receive a star NBA player for a ton of money, then build a team around them with players who are valued less but coaches believe could work well with the star on the court based on personalities and chemistry. Of course, there are many teams that go for a star when they have a solid roster to begin with, and want to increase their chances of securing a ring. Based on the main statistics that people see on sites like ESPN, and have the highest trendlines with salary, I created a model to predict a players salary. The goal of the project was to predict players salary using a multiple linear regression, and to create a web page that allows users to fill in a player's stats and receive a predicted salary.

Variables for Prediction

PPG (Points per Game)
MPG (Minutes Played per Game)
APG (Assists per Game)
SPG (Steals per Game)
BPG (Blocks per Game)
WS (Win Shares)
Age

Exploratory Analysis

Using Tableau, some analysis was done on the data for the 2017-2018 NBA Season Statistics in an attempt to find relationships within the dataset. Some findings based on the analysis was that the Average Salary for an NBA player in the U.S. is not even close to being at the top of the list when compared to some other countries. This is due to the fact that a large proportion of players in the NBA are born in the U.S, with only few of them actually being viewed as a star when they are drafted. Other analysis that were used before creating the machine learning model was trendlines for columns in the dataset to see which factors played the biggest role in determining players salaries, with each of the variables showing a strong trend, aside from Age. FG also showed a strong trendline for salaries, however it decreased the accuracy of the model so it was dropped from the regression model. The tableau workbook acts as an interactive dashboard for people to view player and team stats in an efficient manner.

Building the Model

After doing some preprocessing on the dataset to drop null rows and convert a few column types in a jupyter notebook, the cleaned data was read into a new jupyter notebook file. The first step was taking a log of the salary column in order to end up with a better predictive model, creating a new column for these values which would become the y target of the multiple linear regression. Next, five new columns were made, being the first 5 variables as they were given (PTS, MP, AST, STL, BLK) and dividing them by the number of games played for each player. At this point, the X targets were set to the 7 variables listed above, and the y target set to Salary_log.

Using sklearn, the data was split into train and test sets and then fitted to a linear regression to receive the salary prediction from the X_ test data.

After fitting the model, matplotlib was used to construct a residual plot to show a visual representation of the model. After the scatter plot was constructed, sklearn was utilized to find the mean squared error and r-squared score of the model.

MSE: 0.6326738743239875, R2: 0.5523769265005738

At this point, the model needed to be referenced in an app.py file to connect the user inputs from a web page to to predict a salary. In order to do this, pickle.dump was used to pass the model into a .pkl file that could be read into app.py.

The app.py file uses flask to set up two routes for the web page; the first being the loading page which renders the index.html on load, which has user input boxes for each of the 7 variables in the model, and the second being a prediction route. The predict route uses request.form.get for each of the user input values after the predict button is pressed, and then passes the values into a numpy array as float values. Once these inputs are passed into an array, model.predict passes them into the regression, and numpy.exp reverses the log for the Salary_log column so the value given to the user outputs in the correct form. Finally, it rounds the value for the predicted salary and then returns the value to the user on the same web page under the predict button.

Deployment

The app is deployed with heroku, as it can be connected directly via github using flask in the app.py. It can be accessed by clicking on nba-salaries-prediction under the environments section of this repository, or by this link: https://nba-salaries-prediction.herokuapp.com/

Tableau Workbook

The workbook can be downloaded from this repository, titled NBA_Stats.twbx

link: https://public.tableau.com/profile/fernando.flores2809#!/vizhome/NBA_Stats_16204195758970/NBA2017-2018SeasonStats

Tools/Programs

Tableau
Pandas, Numpy, os, Matplotlib, Pickle, Sklearn
Python, Flask
HTML/CSS
Heroku

Name		Name	Last commit message	Last commit date
Latest commit History 30 Commits
.vscode		.vscode
Data		Data
static/css		static/css
templates		templates
Flowchart.png		Flowchart.png
NBA_Stats.twbx		NBA_Stats.twbx
NBA_salary_ML.ipynb		NBA_salary_ML.ipynb
NBA_salary_preprocessing.ipynb		NBA_salary_preprocessing.ipynb
Procfile		Procfile
README.md		README.md
Technical Report - NBA Salaries Prediction.docx		Technical Report - NBA Salaries Prediction.docx
app.py		app.py
model.pkl		model.pkl
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

NBA Salaries Prediction

Background

Variables for Prediction

Exploratory Analysis

Building the Model

Deployment

Tableau Workbook

Tools/Programs

Data Sources

About

Releases

Packages

Contributors 2

Languages

jonhicks123/NBA-Salaries-Prediction

Folders and files

Latest commit

History

Repository files navigation

NBA Salaries Prediction

Background

Variables for Prediction

Exploratory Analysis

Building the Model

Deployment

Tableau Workbook

Tools/Programs

Data Sources

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages