Using Deep Stats and TransferMarkt data to build interpretable player valuation models
The data used in this project came from two key sources:
- Market value and age data from TransferMarkt (https://www.transfermarkt.us/)
- Player performance data from Understat (https://understat.com/)
Data was collected for every player who appeared in at least one of the top 5 European leagues during the 2020-2021 season. For reference, these leagues are:
- English Premier League
- Italian Serie A
- German Bundesliga
- Spanish La Liga
- French Ligue 1
The data files for each of these leagues, covering player performance, age, and TransferMarkt value for the 2020-2021 season, are in the Data directory of this repository.
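If you want to work with all leagues at once, the per-league files can be combined into a single table. The snippet below is a minimal sketch and not the code from SoccerAnalysisGit.py: it assumes the files in the Data directory are CSVs with a header row, and the `load_league_files` helper and the `source_file` key are hypothetical names introduced here for illustration.

```python
import csv
from pathlib import Path


def load_league_files(data_dir):
    """Read every CSV in data_dir into one list of row dicts.

    Assumption: each file is a CSV with a header row. Each row gets a
    "source_file" key (the file name without its extension) so rows from
    different leagues stay distinguishable after they are combined.
    """
    rows = []
    # Sort the paths so the combined order is the same on every run.
    for path in sorted(Path(data_dir).glob("*.csv")):
        with path.open(newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                row["source_file"] = path.stem
                rows.append(row)
    return rows
```

Calling `load_league_files("Data")` from the repository root would then return one dictionary per player row, with every value as a string, so numeric columns such as age or market value need converting before any modelling.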
The SoccerAnalysisGit.py file is your one-stop shop for all the code I used to perform the data analysis in this project.
If you would like a fuller description of the analysis and the project as a whole, check out my blog post on Towards Data Science: https://medium.com/@leobdata/using-deep-stats-for-performance-based-soccer-player-valuations-f6ea01c43bf