Medication Discontinuation (≤100 days)

This project predicts whether a patient will discontinue medication within 100 days. It uses a compact embedding‑based neural network over categorical features, compares it against Logistic Regression and RandomForest baselines, and explains predictions with permutation importance, Integrated Gradients, and PDP/ICE. A dark‑themed Streamlit dashboard shows metrics, global feature importance, plots, and a live prediction form with local explanations.

Target mapping: MMA_score_cat_new = "0. poor adherence" -> y=1 (discontinue); "1. good adherence" -> y=0 (continue).
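
The mapping above can be sketched in a few lines of Python. This is only an illustration: the helper name encode_target and the choice to raise on unexpected labels are assumptions, not necessarily what src/preprocessing.py does.

```python
# Hypothetical helper illustrating the target mapping described above.
TARGET_MAP = {
    "0. poor adherence": 1,  # discontinue
    "1. good adherence": 0,  # continue
}


def encode_target(labels):
    """Map raw MMA_score_cat_new labels to 0/1; raise on anything else."""
    unexpected = sorted({lab for lab in labels if lab not in TARGET_MAP}, key=str)
    if unexpected:
        raise ValueError(f"Unexpected MMA_score_cat_new values: {unexpected}")
    return [TARGET_MAP[lab] for lab in labels]


y = encode_target(["0. poor adherence", "1. good adherence", "0. poor adherence"])
print(y)  # [1, 0, 1]
```

Failing loudly on unmapped labels is a design choice for the sketch; it keeps silently mislabeled rows out of training.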

Quickstart

From the repo root (hackathon):

python -m venv venv
source venv/bin/activate   # Windows: venv\Scripts\activate
pip install --upgrade pip
pip install -r content/med-discontinue/requirements.txt

# Train + CV (saves models and reports)
python content/med-discontinue/main.py --mode cv --config content/med-discontinue/config/config.yaml

# Generate interpretability artifacts (plots + JSON)
python content/med-discontinue/main.py --mode interpret --config content/med-discontinue/config/config.yaml

# Launch Streamlit dashboard
streamlit run content/med-discontinue/app.py

If running commands from inside content/med-discontinue/:

python main.py --mode cv --config config/config.yaml
python main.py --mode interpret --config config/config.yaml
streamlit run app.py

What’s Inside

Entry point: content/med-discontinue/main.py
- --mode cv: k‑fold CV for the embedding NN; also runs baselines; saves outputs/models/nn_fold*.pt and outputs/reports/nn_cv.json.
- --mode interpret: permutation importance, IG, PDP/ICE; saves under outputs/reports/ and outputs/plots/.
- --mode test: placeholder.
Config: content/med-discontinue/config/config.yaml
- data.stata_path (Stata .dta path), training hyperparams, model architecture, evaluation, and output paths.
Data & preprocessing
- src/data_loading.py reads .dta.
- src/preprocessing.py fills unknowns and label‑encodes each categorical feature.
- src/dataset.py builds a torch Dataset of categorical indices.
- Preprocessing schema is saved to outputs/reports/preproc_artifacts.json during CV.
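
The fill‑unknown and label‑encode steps listed above could look roughly like the following standard‑library sketch. The function names, the {feature: {category: index}} schema layout, and the example feature names are all assumptions for illustration; the actual layout of preproc_artifacts.json may differ.

```python
def _clean(value, unknown="Unknown"):
    """Treat None, NaN, and blank strings as the 'Unknown' category."""
    if value is None or (isinstance(value, float) and value != value):
        return unknown
    text = str(value).strip()
    return text if text else unknown


def fit_label_encoders(rows, features, unknown="Unknown"):
    """Build a hypothetical {feature: {category: index}} schema from dict rows."""
    schema = {}
    for feat in features:
        values = {_clean(row.get(feat), unknown) for row in rows}
        values.add(unknown)  # always reserve an index for missing or unseen values
        schema[feat] = {cat: i for i, cat in enumerate(sorted(values))}
    return schema


def transform(rows, schema, unknown="Unknown"):
    """Encode rows as lists of integer indices, one per feature in schema order."""
    return [
        [schema[f].get(_clean(row.get(f), unknown), schema[f][unknown]) for f in schema]
        for row in rows
    ]


# Hypothetical feature names, for illustration only.
rows = [
    {"sex": "F", "region": "north"},
    {"sex": None, "region": "south"},
    {"sex": "M", "region": ""},
]
schema = fit_label_encoders(rows, ["sex", "region"])
encoded = transform(rows, schema)
print(encoded)  # [[0, 1], [2, 2], [1, 0]]
```

Reserving an explicit Unknown index means a category first seen at prediction time still maps to a valid embedding row.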
Model & training
- src/model.py CategoricalEmbeddingNN (per‑feature embeddings → MLP → logit).
- src/train.py training loop with early stopping, LR scheduling, BCE/focal loss, PR‑based threshold selection.
- src/evaluate.py PR/ROC metrics and threshold selection.
- src/run_cv.py orchestrates StratifiedKFold, saves fold checkpoints and aggregated report.
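
As a rough illustration of PR‑based threshold selection, the sketch below scans candidate cut‑offs and keeps the one with the best F‑beta score. Using F2 (beta=2) is an assumption suggested by the f2_drop field in the permutation report; the criterion actually used in src/evaluate.py may differ, and the function names here are hypothetical.

```python
def fbeta(precision, recall, beta=2.0):
    """F-beta score; returns 0.0 when both precision and recall are zero."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def select_threshold(y_true, y_prob, beta=2.0):
    """Try each distinct predicted probability as a cut-off; keep the best F-beta."""
    best_thr, best_score = 0.5, -1.0
    for thr in sorted(set(y_prob)):
        tp = sum(1 for y, p in zip(y_true, y_prob) if p >= thr and y == 1)
        fp = sum(1 for y, p in zip(y_true, y_prob) if p >= thr and y == 0)
        fn = sum(1 for y, p in zip(y_true, y_prob) if p < thr and y == 1)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        score = fbeta(precision, recall, beta)
        if score > best_score:  # ties keep the lower threshold
            best_thr, best_score = thr, score
    return best_thr, best_score


# Toy validation-fold labels and probabilities, for illustration only.
y_true = [0, 0, 1, 1, 0, 1]
y_prob = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]
thr, score = select_threshold(y_true, y_prob)
print(thr, score)  # 0.35 0.9375
```

A beta above 1 weights recall more heavily than precision, which suits a setting where missing a likely discontinuation costs more than a false alarm.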
Baselines
- src/baselines.py Logistic Regression and RandomForest; saves outputs/reports/baselines_cv.json.
Interpretability
- src/interpretability/permutation.py permutation importance. Output schema can be:
  - {"importances": [{"feature": str, "importance": float}, ...]}
  - or {"columns": [..], "f2_drop": [..]} (F2 drop per feature).
- src/interpretability/captum_ig.py global/local Integrated Gradients (global_ig.json, plots).
- src/interpretability/pdp_ice.py categorical PDP/ICE plots and summary JSON.
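
Because the permutation report can arrive in either of the two schemas listed above, a consumer may want to normalize it first. The following is a minimal sketch; the function name and the example feature names are hypothetical, and only the key names come from the schemas above.

```python
def normalize_permutation_report(report):
    """Return [(feature, importance), ...] sorted by importance, largest first."""
    if "importances" in report:
        pairs = [(item["feature"], float(item["importance"]))
                 for item in report["importances"]]
    elif "columns" in report and "f2_drop" in report:
        if len(report["columns"]) != len(report["f2_drop"]):
            raise ValueError("columns and f2_drop must have the same length")
        pairs = [(col, float(drop))
                 for col, drop in zip(report["columns"], report["f2_drop"])]
    else:
        raise ValueError("Unrecognized permutation report schema")
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# The same (made-up) importances expressed in both schemas.
list_style = {"importances": [{"feature": "age_group", "importance": 0.02},
                              {"feature": "drug_class", "importance": 0.11}]}
column_style = {"columns": ["age_group", "drug_class"], "f2_drop": [0.02, 0.11]}

print(normalize_permutation_report(list_style))
# [('drug_class', 0.11), ('age_group', 0.02)]
```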
Artifacts
- outputs/models/nn_fold*.pt — best model per fold.
- outputs/reports/*.json — CV, baselines, permutation importance, IG, PDP/ICE, preprocessing schema.
- outputs/plots/*.png — permutation, IG, PDP/ICE, local example.

Streamlit App

Path: content/med-discontinue/app.py

Features

Dark theme UI with high contrast and subtle depth.
Dashboard tab: CV metrics (NN + baselines), global feature importance (Altair bars), and plots gallery.
Predict tab: dropdowns for all categorical features using preproc_artifacts.json; loads a checkpoint; shows live probability, decision (mean CV threshold), and permutation‑based local deltas.
Diagnostics expander: shows the resolved artifact paths and whether each file exists; the Reload button clears the cache and reruns the app.
Handles both permutation report schemas automatically (list or columns+f2_drop).
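
One plausible reading of the "local deltas" in the Predict tab is: for a single input, swap each feature to its other categories and measure how the predicted probability moves. The sketch below illustrates that idea only. The predict_proba callable, the toy model, the feature names, and the threshold value are hypothetical stand‑ins and not the app's actual code.

```python
def local_deltas(predict_proba, row, categories):
    """For each feature, base probability minus the mean probability over alternatives."""
    base = predict_proba(row)
    deltas = {}
    for feat, options in categories.items():
        alternatives = [opt for opt in options if opt != row[feat]]
        if not alternatives:
            deltas[feat] = 0.0
            continue
        probs = [predict_proba({**row, feat: alt}) for alt in alternatives]
        deltas[feat] = base - sum(probs) / len(probs)
    return base, deltas


def toy_model(row):
    """Stand-in for a loaded checkpoint: returns a made-up discontinuation probability."""
    return 0.2 + 0.5 * (row["smoker"] == "yes") + 0.1 * (row["region"] == "north")


categories = {"smoker": ["yes", "no"], "region": ["north", "south"]}
row = {"smoker": "yes", "region": "south"}

base, deltas = local_deltas(toy_model, row, categories)
threshold = 0.5  # assumed value; the app uses the mean CV threshold
decision = "discontinue" if base >= threshold else "continue"
print(decision, {feat: round(delta, 3) for feat, delta in deltas.items()})
```

A positive delta means the current value of that feature pushes the probability up relative to its alternatives; a negative delta means it pulls the probability down.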

Tips

Run both cv and interpret before launching the app to populate all sections.
If you change artifacts, click “Reload data” in the sidebar to refresh.
If you run into path issues, run from the repo root or adjust stata_path and the command paths accordingly.
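
Before launching the app, a quick check like the one below can confirm that the cv run produced its outputs. It is a minimal sketch: the file list covers only artifact names mentioned in this README, and the root argument assumes you point it at the med-discontinue directory.

```python
from pathlib import Path

# Artifact patterns named in this README, relative to the project directory.
EXPECTED = [
    "outputs/reports/nn_cv.json",
    "outputs/reports/baselines_cv.json",
    "outputs/reports/preproc_artifacts.json",
    "outputs/models/nn_fold*.pt",
]


def missing_artifacts(root="."):
    """Return the expected patterns that match no file under root."""
    root = Path(root)
    return [pattern for pattern in EXPECTED if not any(root.glob(pattern))]


if __name__ == "__main__":
    missing = missing_artifacts("content/med-discontinue")
    if missing:
        print("Missing artifacts (run --mode cv first):", missing)
    else:
        print("All expected CV artifacts found.")
```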

Project Structure

med-discontinue/
├─ data/                                # Place your .dta file here
├─ config/
│  └─ config.yaml                       # Paths & hyperparameters
├─ src/
│  ├─ data_loading.py                   # Load Stata
│  ├─ preprocessing.py                  # Fill 'Unknown', label-encode categoricals
│  ├─ dataset.py                        # PyTorch Dataset for categorical indices
│  ├─ model.py                          # Embedding NN (categorical)
│  ├─ losses.py                         # BCEWithLogits + optional focal
│  ├─ train.py                          # Train loop + early stopping
│  ├─ evaluate.py                       # Metrics, PR/ROC, threshold selection
│  ├─ calibrate.py                      # Temperature scaling
│  ├─ interpretability/
│  │  ├─ permutation.py                 # Permutation importance
│  │  ├─ captum_ig.py                   # Integrated Gradients
│  │  └─ pdp_ice.py                     # PDP/ICE for categoricals
│  ├─ baselines.py                      # Logistic Regression & RandomForest
│  ├─ utils.py                          # Seed, logging, plotting
│  └─ run_cv.py                         # k-fold orchestration
├─ outputs/
│  ├─ models/
│  ├─ plots/
│  └─ reports/
├─ requirements.txt
├─ app.py
└─ main.py

Troubleshooting

“File not found” for stata_path: ensure the path in config matches your file name exactly (including spaces), or run from repo root.
Dashboard says “Permutation importance report not found”: run interpret mode, then click Reload in the sidebar.
Streamlit caching: the Reload button clears cache and reruns.
