Breast Cancer Risk Prediction (Data Science) Project || Tech Stack: Python 3.10, Pandas, Numpy, Matplotlib, Seaborn, Scikit-Learn, Machine Learning, Flask, Bootstrap 5, Jinja2, Pytest, Docker, Ruff, Black, Bandit (pre-commit hooks)

AAdewunmi/Breast-Cancer-Risk-Prediction-Project

# 🩺 Breast Cancer Risk Prediction Project

This repository contains **two related projects** exploring breast cancer risk estimation using machine learning:

1. **Flask Web Application** – a fully functional prototype for predicting breast cancer risk using **multi-modal data** (both *image* and *risk-factor* inputs).  
   The system uses an **ensemble model**, which blends predictions from different models (e.g., image and clinical data models) to produce a more robust final risk estimate.  
2. **Jupyter Notebook – `Breast_Cancer_Risk_Prediction.ipynb`** – a companion experiment that focuses on **image-only** risk prediction.

---

## 📊 Overview

### 1. Flask App – Multi-Modal Ensemble
The web app allows users to upload a mammogram-like image and provide lifestyle or genetic factors to compute an **ensemble probability of risk**.

**What is an ensemble model?**  
An ensemble combines multiple independent models (e.g., one trained on mammogram images and another on lifestyle/risk-factor data).  
By averaging or weighting their outputs, it reduces noise and improves stability, yielding more reliable overall predictions than either model alone.
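
As a minimal sketch of the weighting idea described above, the function below blends two model probabilities with a single weight. The name `blend_probabilities` and the default `image_weight` of 0.5 are illustrative assumptions and are not taken from the app's code.

```python
def blend_probabilities(
    image_prob: float,
    risk_factor_prob: float,
    image_weight: float = 0.5,  # assumed default; the app's real weights are not documented here
) -> float:
    """Return a weighted average of two probabilities in [0, 1]."""
    for name, value in (
        ("image_prob", image_prob),
        ("risk_factor_prob", risk_factor_prob),
        ("image_weight", image_weight),
    ):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    # The two weights sum to 1, so the result also stays in [0, 1].
    return image_weight * image_prob + (1.0 - image_weight) * risk_factor_prob


if __name__ == "__main__":
    # Equal weighting of a hypothetical image score and risk-factor score.
    print(blend_probabilities(0.8, 0.4))
```

With `image_weight=0.5` this reduces to a plain average; moving the weight toward 1.0 makes the image model dominate the combined estimate.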

**Key features**
- Multi-input (image + structured risk-factor data)
- Flask-based front-end with clean Bootstrap UI
- Secure form submission with CSRF protection
- Local file uploads, input forms, and visual results dashboard
- Risk interpretation with contextual explanations and color-coded badges
- Built-in tests (`pytest`), linting (`ruff`), and code formatting (`black`)
- Ready for containerization (Dockerfile + docker-compose.yml)

### 2. Image-Only Model Notebook
The notebook `Breast_Cancer_Risk_Prediction.ipynb` focuses purely on **image classification**.  
It trains and evaluates a CNN-based model on mammogram images, generating metrics such as accuracy, loss curves, and confusion matrices.
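
To illustrate how metrics such as accuracy and a confusion matrix relate to each other, here is a small standard-library sketch for binary labels. The function names and the sample labels are made up for illustration; the notebook itself may compute these metrics differently (for example with Scikit-Learn).

```python
from collections import Counter
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count true/false positives and negatives for binary labels (0 or 1)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = Counter(zip(y_true, y_pred))  # keys are (true_label, predicted_label)
    return {
        "tp": pairs[(1, 1)],
        "tn": pairs[(0, 0)],
        "fp": pairs[(0, 1)],
        "fn": pairs[(1, 0)],
    }


def accuracy(counts: Dict[str, int]) -> float:
    """Fraction of correct predictions; 0.0 when there are no samples."""
    total = sum(counts.values())
    return (counts["tp"] + counts["tn"]) / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical labels, not results from the notebook.
    y_true = [1, 0, 1, 1, 0, 0]
    y_pred = [1, 0, 0, 1, 1, 0]
    counts = confusion_counts(y_true, y_pred)
    print(counts)
    print(f"accuracy = {accuracy(counts):.2f}")
```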

---

## 🚀 Getting Started

### Prerequisites
- Python 3.10+ (a virtual environment is recommended)
- pip, plus either virtualenv or venv

### 1️⃣ Set up the Flask app
```bash
git clone https://github.com/AAdewunmi/Breast-Cancer-Risk-Prediction-Project.git
cd Breast-Cancer-Risk-Prediction-Project/Breast-Cancer-Risk-Prediction-Project

# Create virtual environment
python3 -m venv .venv
source .venv/bin/activate  # (Windows: .venv\Scripts\activate)
```

Configure the environment by creating a `.env` file based on `.env.example`:

```bash
FLASK_ENV=development
FLASK_DEBUG=1
SECRET_KEY=devkey
UPLOAD_FOLDER=data/uploads
```

Install dependencies:

```bash
pip install -r requirements.txt
```
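
The `.env` values above end up as environment variables at runtime. The sketch below shows one way such values could be read into a settings object; `AppSettings` and `load_settings` are hypothetical names, and the defaults simply mirror the example values above. It assumes the variables have already been loaded into the process environment (for instance by a loader such as python-dotenv) and does not parse the `.env` file itself.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class AppSettings:
    """Hypothetical container for the settings named in the .env example."""

    secret_key: str
    upload_folder: str
    debug: bool


def load_settings(env: Optional[Mapping[str, str]] = None) -> AppSettings:
    """Read settings from a mapping, defaulting to the process environment."""
    source = os.environ if env is None else env
    return AppSettings(
        # "devkey" is only suitable for local development.
        secret_key=source.get("SECRET_KEY", "devkey"),
        upload_folder=source.get("UPLOAD_FOLDER", "data/uploads"),
        debug=source.get("FLASK_DEBUG", "0") == "1",
    )


if __name__ == "__main__":
    print(load_settings())
```

Passing an explicit mapping instead of relying on `os.environ` keeps the function easy to exercise in tests.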

### 2️⃣ Run the app

```bash
flask run
```

The app will start at http://127.0.0.1:5000.

Upload an image, fill in risk factors, and view the ensemble prediction dashboard.

### 3️⃣ Run tests and code checks

```bash
pytest -q
```

### 4️⃣ Code Quality & Security

```bash
pre-commit run --all-files
```

Checks include:

- Ruff (Linter + Fixer)
- Black (Formatter)
- Bandit (Security)
- YAML/TOML/JSON consistency

## 🧠 Folder Structure

```
Breast-Cancer-Risk-Prediction-Project/
│
├── predictor/                     # Flask app package
│   ├── __init__.py
│   ├── app.py                     # Flask application factory
│   ├── services/                  # Inference logic and model stubs
│   ├── templates/                 # HTML templates (Bootstrap)
│   ├── static/                    # CSS, JS, images
│   └── tests/                     # Unit and integration tests
│
├── Breast_Cancer_Risk_Prediction.ipynb   # Image-only ML model
├── setup.py
├── requirements.txt
├── .pre-commit-config.yaml
├── pytest.ini
└── README.md
```

## 🧰 Tech Stack

- Python 3.10+
- Flask
- Bootstrap 5
- Jinja2
- Pytest
- Docker
- Ruff, Black, Bandit (pre-commit hooks)

## 🧩 Model Explanation

### Ensemble Prediction Components

| Component | Description | Output |
|---|---|---|
| Image Model | Processes uploaded image to estimate visual patterns of risk. | Probability ∈ [0, 1] |
| Risk-Factor Model | Uses user inputs (age, BMI, BRCA, alcohol, etc.). | Probability ∈ [0, 1] |
| Ensemble | Weighted combination of both models. | Combined probability, color-coded interpretation |

Interpretation categories:

| Range | Label | Meaning |
|---|---|---|
| < 10% | Very Low | Minimal model-estimated risk |
| 10–20% | Low | Mild risk, maintain routine screening |
| 20–40% | Moderate | Some elevated indicators, consider clinical discussion |
| 40–70% | High | Elevated probability, follow-up advised |
| > 70% | Very High | Significant model risk; clinical review strongly advised |

⚠️ These outputs are probabilistic and for research or educational use only, not for clinical diagnosis.
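
The interpretation categories can be expressed as a small lookup function. This is a sketch rather than the app's actual code: `risk_label` is a hypothetical name, and because the listed ranges share their endpoints, the boundary handling is an assumption (10%, 20%, and 40% fall into the higher band, while exactly 70% stays in "High" because "Very High" is listed as strictly greater than 70%).

```python
def risk_label(probability: float) -> str:
    """Map an ensemble probability in [0, 1] to an interpretation label.

    Boundary handling is an assumption; the listed ranges do not say
    which band owns a shared endpoint.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError(f"probability must be in [0, 1], got {probability}")
    if probability < 0.10:
        return "Very Low"
    if probability < 0.20:
        return "Low"
    if probability < 0.40:
        return "Moderate"
    if probability <= 0.70:
        return "High"
    return "Very High"


if __name__ == "__main__":
    for p in (0.05, 0.15, 0.30, 0.55, 0.90):
        print(f"{p:.0%} -> {risk_label(p)}")
```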


## 🎬 Quick Demo

- Input Form (screenshot)
- Results Page (screenshot)

## 🧑‍💻 Development Notes

### Testing

- Unit tests cover inference logic, view rendering, and form handling.
- Integration tests simulate uploads and ensure stable Flask routing.

### Linting & Formatting

- `ruff` for linting and auto-fixes.
- `black` for consistent formatting.
- Run checks automatically via pre-commit hooks.
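
As a rough illustration of the style of unit test described above, the sketch below tests a hypothetical upload filename check. `allowed_file` and `ALLOWED_EXTENSIONS` are assumptions made for this example and may not match the helpers in `predictor/`; pytest would collect the `test_*` functions automatically by name.

```python
# Assumed set of accepted image extensions; the real app may differ.
ALLOWED_EXTENSIONS = {"png", "jpg", "jpeg"}


def allowed_file(filename: str) -> bool:
    """Return True if the filename has an extension in ALLOWED_EXTENSIONS."""
    if "." not in filename:
        return False
    extension = filename.rsplit(".", 1)[1].lower()
    return extension in ALLOWED_EXTENSIONS


def test_accepts_known_image_extensions():
    assert allowed_file("scan.png")
    assert allowed_file("scan.JPEG")  # comparison is case-insensitive


def test_rejects_missing_or_unknown_extensions():
    assert not allowed_file("scan")
    assert not allowed_file("scan.exe")
    assert not allowed_file("scan.")  # empty extension
```

Saved in a file such as `test_uploads.py` (an example name), these tests would run with the same `pytest -q` command used for the rest of the suite.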

## 🤖 Acknowledgements

This project was developed with the assistance of OpenAI's ChatGPT, which supported:

- Code refactoring and docstring generation
- Automated test and CI setup
- UI and UX text suggestions
- Explanation synthesis for interpretability

## 🩢 Disclaimer

This project is a prototype for educational and research purposes. It does not provide medical advice or diagnosis. Always consult qualified healthcare professionals for clinical assessment.


## 📄 License

MIT License
