bips-hb/iml_imputation_paper

Imputation uncertainty in interpretable machine learning methods

by Pegah Golchian & Marvin N. Wright

This repository contains the code for the simulation study and real data example in the paper "Imputation uncertainty in interpretable machine learning methods" by P. Golchian and M. N. Wright. It compares how different imputation methods affect the confidence interval coverage probabilities of three interpretable machine learning (IML) methods: permutation feature importance (PFI), partial dependence plots (PDP) and SHAP.
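As a rough illustration of the quantities involved, the sketch below shows one way to summarise confidence intervals over simulation replicates by coverage, average width and bias, the three measures reported by the plotting scripts. It is not taken from ci-experiment.R: the function `ci_summary` and all variable names are hypothetical, and the toy estimator is a plain sample mean with a normal-approximation interval rather than PFI, PDP or SHAP.

```r
# Minimal sketch (hypothetical helper, not part of this repository):
# summarise confidence intervals collected over simulation replicates.
ci_summary <- function(estimate, lower, upper, truth) {
  stopifnot(length(estimate) == length(lower), length(lower) == length(upper))
  # An interval covers the truth if the true value lies inside it
  covered <- lower <= truth & truth <= upper
  c(
    coverage = mean(covered),        # share of intervals containing the truth
    width    = mean(upper - lower),  # average interval width
    bias     = mean(estimate) - truth # average estimate minus true value
  )
}

# Toy stand-in for an IML estimate: the mean of a normal sample
set.seed(1)
truth <- 0.5
n <- 50
reps <- replicate(1000, {
  x <- rnorm(n, mean = truth, sd = 1)
  est <- mean(x)
  se <- sd(x) / sqrt(n)
  z <- qnorm(0.975) # two-sided 95% normal quantile
  c(estimate = est, lower = est - z * se, upper = est + z * se)
})

# reps is a 3 x 1000 matrix with rows "estimate", "lower" and "upper"
res <- ci_summary(reps["estimate", ], reps["lower", ], reps["upper", ], truth)
print(round(res, 3))
```

A coverage close to the nominal level (here 95%) indicates well-calibrated intervals; coverage clearly below it indicates intervals that are too narrow or centred on a biased estimate.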

The repository contains:

  • Code for the confidence interval experiment (ci-experiment.R)
  • Helper functions used in the scripts (folder ./R/)
  • DESCRIPTION file describing the package dependencies
  • Cluster setup (batchtools.conf.R)
  • Data (UCI wine quality dataset)

Scripts for plotting results across IML methods, missingness rates, missingness patterns, sampling strategies and imputation methods:

  • plots_paper file - contains all the plots mentioned in the paper
  • plots_sim_example.R - coverage, width and bias for the xgb model under the MAR missingness pattern with bootstrap sampling
  • real_data.R - PFI, PDP and SHAP of UCI wine dataset example with different imputation methods
  • plots_paper_cov.R - coverage of all IML methods over model refits (set parameter bootstrap or subsampling)
  • plots_paper_bias.R - bias of all IML methods over different missingness rates for 15 refits (set parameter bootstrap or subsampling)
  • plots_paper_width.R - width of all IML methods over different missingness rates for 15 refits (set parameter bootstrap or subsampling)
  • plots-width-refits-adjusted.R - width of all IML methods over model refits > 4
  • plots-performance.R - performance of IML methods with different imputation methods
