A modular Python package for analyzing soccer player statistics and identifying talent.
This project provides tools for:
- Loading and processing soccer statistics from different sources
- Analyzing player performance across various metrics
- Identifying promising players based on different attributes
- Creating comprehensive scouting reports
- Storing and tracking player data over time
- Advanced metrics analysis with visualizations
- Enhanced shooting analysis capabilities
fast-soccer-analysis/
│
├── config/                            # Configuration settings
│   ├── settings.py                    # General settings and parameters
│   └── urls.py                        # Data source URLs
│
├── src/                               # Core functionality
│   ├── data/                          # Data handling
│   │   ├── loaders.py                 # Data loading functions
│   │   ├── processors.py              # Data processing functions
│   │   └── shooting_processors.py     # Enhanced shooting data processors
│   │
│   ├── db/                            # Database operations
│   │   └── operations.py              # Database functions
│   │
│   ├── analysis/                      # Analysis algorithms
│   │   ├── basic/                     # Basic analysis functions
│   │   │   ├── playmakers.py          # Playmaker identification
│   │   │   ├── forwards.py            # Forward analysis
│   │   │   └── midfielders.py         # Midfielder analysis
│   │   │
│   │   ├── advanced/                  # Advanced analysis modules
│   │   │   ├── versatility.py         # Versatility score calculations
│   │   │   ├── progression.py         # Progressive action analysis
│   │   │   ├── possession_impact.py   # xPI calculations
│   │   │   └── clustering.py          # Positional clustering analysis
│   │   │
│   │   ├── shooting_analyzer.py       # Advanced shooting analysis
│   │   └── metrics.py                 # Metrics calculation utilities
│   │
│   └── utils/                         # Utility functions
│       ├── normalization.py           # Metric normalization helpers
│       ├── logging_setup.py           # Logging configuration
│       ├── visualization.py           # General visualization utilities
│       └── shooting_visualizations.py # Shooting-specific visualizations
│
├── pipelines/                         # Analysis pipelines
│   ├── full_analysis.py               # Complete analysis pipeline
│   ├── advanced_analysis.py           # Advanced analysis pipeline
│   ├── shooting_pipeline.py           # Shooting analysis pipeline
│   └── daily_update.py                # Daily data update pipeline
│
├── visualizations/                    # Generated visualization outputs
│   └── shooting/                      # Shooting-specific visualizations
│
└── main.py                            # Main application entry point
- Python 3.9 or higher
- Dependencies listed in pyproject.toml
- Clone the repository:

      git clone https://github.com/yourusername/fast-soccer-analysis.git
      cd fast-soccer-analysis

- Create and activate a virtual environment:

      uv venv
      source .venv/bin/activate  # On Windows: .venv\Scripts\activate

- Install the package and dependencies:

      uv pip install -e .
Run a complete analysis with default parameters:
python main.py
Run advanced analysis with player versatility scores, progression metrics, and clustering:
python main.py --analysis-type advanced
Run in-depth shooting analysis with shooting efficiency, profiles, and shot quality metrics:
python main.py --analysis-type shooting
Run all available analysis types together:
python main.py --analysis-type all
The main script supports various options:
python main.py --analysis-type shooting --min-shots 15 --top-n 20 --positions FW --min-90s 10 --max-age 25 --report-file report.md
Options:
- --analysis-type: Type of analysis to run (basic, advanced, shooting, all)
- --min-shots: Minimum shots for forward analysis (default: 20)
- --top-n: Number of top players to return (default: 20)
- --positions: Positions to analyze (default: ["MF", "FW, MF", "MF,DF"])
- --min-90s: Minimum 90-minute periods played (default: 5)
- --max-age: Maximum player age to include (default: 30)
- --force-reload: Force data reload from source
- --no-save: Don't save to database
- --no-visualizations: Skip creating visualizations for advanced analysis
- --report-file: Path to save the report
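As a rough sketch of how the options listed above could be declared with argparse: the flag names and defaults follow the list, but build_parser is a hypothetical name, the default for --analysis-type and the use of nargs="+" for --positions are assumptions, and the actual main.py may be organized differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options (not the project's code)."""
    parser = argparse.ArgumentParser(description="Soccer player analysis")
    parser.add_argument(
        "--analysis-type",
        choices=["basic", "advanced", "shooting", "all"],
        default="basic",  # assumption: the documented list does not state a default
    )
    parser.add_argument("--min-shots", type=int, default=20)
    parser.add_argument("--top-n", type=int, default=20)
    parser.add_argument(
        "--positions",
        nargs="+",  # assumption: one or more position strings
        default=["MF", "FW, MF", "MF,DF"],
    )
    parser.add_argument("--min-90s", type=float, default=5.0)
    parser.add_argument("--max-age", type=int, default=30)
    parser.add_argument("--force-reload", action="store_true")
    parser.add_argument("--no-save", action="store_true")
    parser.add_argument("--no-visualizations", action="store_true")
    parser.add_argument("--report-file", default=None)
    return parser


# Parse the example command line shown earlier in this section.
example_args = build_parser().parse_args(
    ["--analysis-type", "shooting", "--min-shots", "15", "--top-n", "20",
     "--positions", "FW", "--min-90s", "10", "--max-age", "25",
     "--report-file", "report.md"]
)
```

argparse converts dashes in flag names to underscores, so --min-90s is read back as example_args.min_90s.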
from pipelines.full_analysis import run_analysis_pipeline
from pipelines.advanced_analysis import run_advanced_analysis
from pipelines.shooting_pipeline import run_shooting_analysis

# Run basic analysis with custom parameters
basic_results = run_analysis_pipeline(
    min_shots=15,
    top_n=20,
    positions=["MF", "FW"],
    min_90s=10,
    max_age=23,
    save_to_db=True,
    report_file="reports/young_midfielders.md",
)

# Run advanced analysis
advanced_results = run_advanced_analysis(
    min_shots=15,
    top_n=20,
    positions=["MF", "FW"],
    min_90s=10,
    max_age=23,
    save_to_db=True,
    create_visualizations=True,
)

# Run shooting analysis
shooting_results = run_shooting_analysis(
    min_shots=15,
    top_n=20,
    positions=["FW"],
    min_90s=8,
    max_age=25,
    create_visualizations=True,
    output_dir="visualizations/shooting",
)

# Access specific results (each entry is a DataFrame sorted best-first)
playmakers = basic_results["playmakers"]
print(f"Top playmaker: {playmakers.iloc[0]['Player']}")

versatile_players = advanced_results["versatile_players"]
print(f"Most versatile player: {versatile_players.iloc[0]['Player']}")

clinical_forwards = shooting_results["shooting_efficiency"]
print(f"Most efficient shooter: {clinical_forwards.iloc[0]['Player']}")
Run daily data updates to keep your database current:
python -m pipelines.daily_update --output-dir reports/updates
- Playmakers: Creative midfielders based on progressive passing and chance creation
- Clinical Forwards: Efficient forwards based on shooting and conversion metrics
- Progressive Midfielders: Players who excel at moving the ball forward
- Pressing Midfielders: Players who excel in defensive actions and pressing
- Complete Midfielders: Well-rounded midfielders who contribute in multiple areas
- Passing Quality: Players with exceptional passing metrics
- Player Versatility: Players who excel across multiple skill areas (passing, possession, defense, shooting)
- Progressive Actions: Breakdown of how players move the ball forward (carrying, passing, receiving)
- Expected Possession Impact (xPI): Comprehensive metric quantifying a player's contribution to team possession
- Positional Clustering: Groups players by statistical profiles rather than listed positions
- Shooting Efficiency: Comprehensive analysis of shooting effectiveness and conversion quality
- Shooting Profiles: Classification of players into shooting style categories (Volume Shooter, Clinical Finisher, etc.)
- Shot Creation Specialists: Players who excel at both scoring and creating shots
- Finishing Skill: Analysis of players who outperform their expected goals metrics
- Shot Quality: Evaluation of shot selection based on location, accuracy, and expected value
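To make the finishing skill and shooting profile ideas above concrete, here is a minimal sketch. It measures finishing skill as goals minus expected goals and assigns a profile with two simple rules. The ShootingLine type, the thresholds, the "Unclassified" label, and the sample numbers are all assumptions for illustration; the project's shooting_analyzer.py may use different inputs and cutoffs.

```python
from dataclasses import dataclass


@dataclass
class ShootingLine:
    """Hypothetical season totals for one player."""
    player: str
    shots: int
    goals: int
    xg: float
    nineties: float  # 90-minute periods played, assumed > 0 after the min-90s filter


def finishing_skill(line: ShootingLine) -> float:
    """Goals above (positive) or below (negative) expected goals."""
    return line.goals - line.xg


def shooting_profile(
    line: ShootingLine,
    volume_threshold: float = 3.0,       # shots per 90; assumed cutoff
    conversion_threshold: float = 0.15,  # goals per shot; assumed cutoff
) -> str:
    """Rule-based profile: conversion rate is checked first, then shot volume."""
    shots_per_90 = line.shots / line.nineties
    conversion = line.goals / line.shots if line.shots else 0.0
    if conversion >= conversion_threshold:
        return "Clinical Finisher"
    if shots_per_90 >= volume_threshold:
        return "Volume Shooter"
    return "Unclassified"


# Made-up sample data, not real player statistics.
player_a = ShootingLine("Player A", shots=40, goals=10, xg=7.5, nineties=20.0)
player_b = ShootingLine("Player B", shots=80, goals=8, xg=9.0, nineties=20.0)

for line in (player_a, player_b):
    print(line.player, finishing_skill(line), shooting_profile(line))
```

A positive finishing_skill value corresponds to a player outperforming expected goals, which is the sense in which the Finishing Skill item uses the term.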
The analysis automatically generates visualizations including:
- Radar charts comparing player strengths
- Scatter plots showing relationships between metrics
- Bar charts for direct player comparisons
- Heatmaps for comprehensive metric evaluation
- Finishing skill plots comparing goals to expected goals
- Shot quality distributions
- Shot distance histograms
These visualizations are saved in the visualizations/ directory and can be referenced in reports.
Player data and analysis results are stored in a DuckDB database (scouting.db by default). This provides:
- Efficient storage of player statistics
- Tracking player development over time
- Persistent storage of analysis results (basic, advanced, and shooting)
- Fast querying capabilities
Analysis results are stored in separate tables with appropriate prefixes.
- Create a new analysis function in the appropriate module:
  - Basic analysis in src/analysis/basic/
  - Advanced analysis in src/analysis/advanced/
  - Shooting analysis in src/analysis/shooting_analyzer.py
- Update the relevant pipelines to include your new analysis
- Add appropriate weights and parameters in config/settings.py
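The steps above might look roughly like the sketch below: a weighted-score analysis function plus the weights it reads. Everything here is hypothetical, including the rank_players name, the EXAMPLE_WEIGHTS dictionary, and the metric column names. The project's own analysis functions work on DataFrames, whereas this sketch uses plain dictionaries so it runs without extra dependencies.

```python
# Hypothetical weights; in the project these would live in config/settings.py.
EXAMPLE_WEIGHTS = {"prog_passes_per90": 0.6, "key_passes_per90": 0.4}


def rank_players(rows, weights, min_90s=5, top_n=20):
    """Filter by playing time, score each player as a weighted sum, return the top N."""
    eligible = [row for row in rows if row["90s"] >= min_90s]
    scored = [
        {**row, "score": sum(weight * row[metric] for metric, weight in weights.items())}
        for row in eligible
    ]
    return sorted(scored, key=lambda row: row["score"], reverse=True)[:top_n]


# Made-up sample rows, not real player statistics.
sample_rows = [
    {"Player": "Player A", "90s": 10.0, "prog_passes_per90": 8.0, "key_passes_per90": 2.0},
    {"Player": "Player B", "90s": 3.0, "prog_passes_per90": 10.0, "key_passes_per90": 3.0},
    {"Player": "Player C", "90s": 12.0, "prog_passes_per90": 5.0, "key_passes_per90": 3.0},
]

top_players = rank_players(sample_rows, EXAMPLE_WEIGHTS, min_90s=5, top_n=2)
```

Keeping the weights outside the function means a pipeline can pass in different weightings without touching the analysis code.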
- Create a new visualization function in the appropriate module:
  - General visualizations in src/utils/visualization.py
  - Shooting visualizations in src/utils/shooting_visualizations.py
- Update the visualization dashboard to include your new chart
- Reference the visualization in your reports
- Add the URL to config/urls.py
- Create a processor function in the appropriate module:
  - General processing in src/data/processors.py
  - Shooting-specific processing in src/data/shooting_processors.py
- Update the data loader to handle the new source
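As an illustration of the steps above, the sketch below pairs a URL entry with a small processor that renames columns and coerces types. The URL is a placeholder, and NEW_SOURCE_URL, COLUMN_MAP, process_new_source, and the raw column names are hypothetical; the real loaders and processors in src/data/ may have different signatures.

```python
import csv
import io

# Hypothetical entry for config/urls.py; the URL is a placeholder.
NEW_SOURCE_URL = "https://example.com/stats/passing.csv"

# Assumed mapping from the new source's column names to the names used downstream.
COLUMN_MAP = {"player_name": "Player", "minutes": "Min", "age": "Age"}


def process_new_source(raw_rows):
    """Rename columns, convert numeric fields, and derive 90-minute periods."""
    processed = []
    for raw in raw_rows:
        row = {COLUMN_MAP.get(key, key): value for key, value in raw.items()}
        row["Min"] = int(row["Min"])
        row["Age"] = int(row["Age"])
        row["90s"] = round(row["Min"] / 90, 1)
        processed.append(row)
    return processed


# Made-up CSV text standing in for a downloaded file.
sample_csv = "player_name,minutes,age\nPlayer A,900,22\nPlayer B,450,27\n"
rows = process_new_source(csv.DictReader(io.StringIO(sample_csv)))
```

The data loader would then fetch NEW_SOURCE_URL and pass the parsed rows to the processor.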
This project is licensed under the MIT License - see the LICENSE file for details.
- FBref for providing soccer statistics