This repository contains the pipeline for Calibration from Fay Frost as part of the CVDNet project.
-
Clone the repository:
git clone https://github.com/alan-turing-institute/cvd-net-pipeline cd cvd-net-pipeline
-
Create a virtual environment (optional but recommended):
python -m venv venv source venv/bin/activate # On macOS/Linux
-
Install
You can install the dependencies using the pyproject.toml
file:
pip install .
The pipeline can be run by executing the main.py
script. You must specify a configuration file in JSON format using the --config
argument.
The configuration file should define the steps to run and other parameters such as the number of samples and with outputs used for calibration. Below is an example configuration file:
{
"input_parameters": "parameters_pulmonary_sensitive_summarystats.json",
"output_path": "output_synthetic/",
"steps": ["1", "2", "3", "4", "5"],
"nsamples": 64,
"n_pca_components": 10,
"n_params": 9,
"include_timeseries": 0,
"output_keys": [
"t_max_dpdt", "a_epad", "epad", "s_a_epad", "s_epad",
"min_dpdt", "max_dpdt", "A_p", "P_max", "esp", "sys",
"EF", "dia", "Ees/Ea", "iT", "PC1", "PC2", "PC3"
],
"epsilon_obs_scale": 0.05,
"output_dir_bayesian": "output_synthetic/output_5000_9params/bayesian_calibration_results/18_output_keys/calibration_20250619_135107"
}
input_parameters
: The json file containting parameters and their ranges to sample from.
output_path
: The path you want output to be saved to.
steps
: The steps you want to run.
nsamples
: The number of samples you want to run.
n_pca_components
: The number of pca components to compute.
n_params
: The number of non-fixed parameters sampled.
include_timeseries
: (Not implemented yet) Whether the timeseries waveform should be included in the calibration 0/1 for true/false.
output_keys
: The outputs you want to calibrate on.
epsilon_obs_scale
: (Hard coded for now) Observation error
output_dir_bayesian
: The directory where calibration files are saved if you are only running steps 6 onwards.
To run the pipeline, specify the configuration file as follows:
python main.py --config config/pipeline_config.json
- Simulate Data: Generates input and output data based on parameters.
- Analyze Giessen: Performs analysis on the simulated pressure waveform data.
- Compute PCA: Performs a PCA on the output data.
- Build Emulator: Builds an emulator for the data.
- Calibration: Calibrates the model.
- Simulate Posterior Data: Simulates data from posterior samples.
- Final Resampling: Performs final resampling on posterior simulations.
To run specific steps, modify the steps
field in the configuration file. For example, to run steps 1, 2, and 3, use the following within your configuration:
{
"steps": ["1", "2", "3"],
}
Then execute:
python main.py --config config/pipeline_config.json
main.py
: Entry point for running the pipeline.pipeline/
: Contains the modules for each step of the pipeline.simulate_data.py
: Simulates input and output data.analyse_giessen.py
: Analyzes the data.build_emulator.py
: Builds the emulator.simulate_posterior.py
: Simulates posterior data.calibrate.py
: Calibrates the model.
This project is licensed under the MIT License. See the LICENSE
file for details.