author: Daniel Wong
https://osf.io/xntd6/
Identifier: DOI 10.17605/OSF.IO/XNTD6
torch 1.2.0
torchvision 0.4.0
pandas 0.24.2
cv2 4.1.1.26
numpy 1.16.4
sklearn 0.21.2
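To check an existing Python environment against the versions listed above, something like the following sketch may help. It is not part of the repository; the helper names (installed_version, compare_versions) are hypothetical. The comparison is deliberately loose, on the assumption that a module may report a shorter or suffixed version string than the package version (for example a build suffix on torch, or a three-part version for cv2).

```python
import importlib

# Versions listed above, keyed by import name.
EXPECTED_VERSIONS = {
    "torch": "1.2.0",
    "torchvision": "0.4.0",
    "pandas": "0.24.2",
    "cv2": "4.1.1.26",
    "numpy": "1.16.4",
    "sklearn": "0.21.2",
}


def installed_version(module_name):
    """Return the module's __version__ string, or None if unavailable."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def compare_versions(expected, lookup=installed_version):
    """Map each module name to (expected, found, matches).

    Assumption: a prefix match in either direction counts as a match,
    so "1.2.0+cu92" matches "1.2.0" and "4.1.1" matches "4.1.1.26".
    """
    report = {}
    for name, wanted in expected.items():
        found = lookup(name)
        matches = bool(found) and (
            found.startswith(wanted) or wanted.startswith(found)
        )
        report[name] = (wanted, found, matches)
    return report


def print_report(expected=EXPECTED_VERSIONS):
    """Print one line per package with its expected and installed version."""
    for name, (wanted, found, ok) in compare_versions(expected).items():
        status = "OK" if ok else "MISMATCH"
        print("{}: expected {}, found {} [{}]".format(name, wanted, found, status))
```

Calling print_report() inside the activated environment prints one status line per package; a MISMATCH is only a hint to look closer, not proof that the code will fail.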
We've included an example conda environment file in this repository called transchannel_conda_env.yml. To install the necessary packages, first install conda (https://conda.io/projects/conda/en/latest/user-guide/install/index.html), then run 'conda env create -f transchannel_env.yml -n transchannel_env' to create a new conda environment from the .yml file (use the .yml filename as it appears in the repository). Installation should take just a few minutes.
All deep learning models were trained using NVIDIA GeForce GTX 1080 GPUs on a 64-CPU machine running CentOS Linux (version 7).
transchannel.py contains the majority of the necessary code to reproduce the main results of the paper. See the docstrings for specifics.
transchannel_runner.py This script pertains to the model training, evaluation, and supplemental analyses of the tauopathy sections of the paper. It can be run with the command "python transchannel_runner.py {FOLD NUMBER}". To use a 70/30 split, set FOLD NUMBER to any value that is NOT in {1,2,3}.
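The FOLD NUMBER rule above can be illustrated with a short sketch. This is not the code in transchannel_runner.py; the function names are hypothetical, and it assumes FOLD NUMBER is passed as an integer.

```python
# Fold numbers named in the rule above; any other value selects the 70/30 split.
FOLD_NUMBERS = {1, 2, 3}


def parse_fold_number(argv):
    """Read FOLD NUMBER from an argv-style list (assumed to be an integer)."""
    if len(argv) != 2:
        raise SystemExit("usage: python transchannel_runner.py {FOLD NUMBER}")
    return int(argv[1])


def describe_split(fold_number):
    """Return which split a given FOLD NUMBER selects."""
    if fold_number in FOLD_NUMBERS:
        return "fold {}".format(fold_number)
    return "70/30 split"


print(describe_split(parse_fold_number(["transchannel_runner.py", "2"])))  # fold 2
print(describe_split(parse_fold_number(["transchannel_runner.py", "0"])))  # 70/30 split
```

Under this reading, "python transchannel_runner.py 0" (or any other value outside {1,2,3}) would select the 70/30 split.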
transchannel_tests.py This script contains unit tests and integration tests for key functions in transchannel.py.
osteosarcoma.py This script pertains to the osteosarcoma sections of the paper. It is the runner code for training and evaluation on the functional genomics dataset.
figure_generator.py contains code to reproduce the figures.
bash_script.sh A convenience bash script that runs all of the code, from model training to model evaluation and figure generation.
models:
This folder contains the fully trained models, including:
The tauopathy model applied to the archival HCS
("raw_1_thru_6_full_Unet_mod_continue_training_2.pt")
The 3 models trained with threefold cross-validation on the ablated (95th percentile) osteosarcoma dataset
d0_to_d1_ablation_cyclin_only_dataset_fold1_continue_training.pt
d0_to_d1_ablation_cyclin_only_dataset_fold2_continue_training.pt
d0_to_d1_ablation_cyclin_only_dataset_fold3_continue_training.pt
The 3 models trained with threefold cross-validation on the raw, unablated osteosarcoma dataset
d0_to_d1_cyclin_only_dataset_fold1.pt
d0_to_d1_cyclin_only_dataset_fold2.pt
d0_to_d1_cyclin_only_dataset_fold3.pt
The two single channel models (i.e. one channel input to one channel output) for the supplemental tauopathy analysis
DAPI_only_to_AT8.pt
YFP_only_to_AT8.pt
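Before running the evaluation code, it may be useful to confirm that every model file listed above is present. The sketch below is not part of the repository; the function name missing_model_files is hypothetical, and it assumes the files sit directly inside the models folder.

```python
from pathlib import Path

# File names exactly as listed above.
EXPECTED_MODEL_FILES = [
    "raw_1_thru_6_full_Unet_mod_continue_training_2.pt",
    "d0_to_d1_ablation_cyclin_only_dataset_fold1_continue_training.pt",
    "d0_to_d1_ablation_cyclin_only_dataset_fold2_continue_training.pt",
    "d0_to_d1_ablation_cyclin_only_dataset_fold3_continue_training.pt",
    "d0_to_d1_cyclin_only_dataset_fold1.pt",
    "d0_to_d1_cyclin_only_dataset_fold2.pt",
    "d0_to_d1_cyclin_only_dataset_fold3.pt",
    "DAPI_only_to_AT8.pt",
    "YFP_only_to_AT8.pt",
]


def missing_model_files(models_dir, expected=EXPECTED_MODEL_FILES):
    """Return the expected file names that are not found in models_dir."""
    models_dir = Path(models_dir)
    return [name for name in expected if not (models_dir / name).is_file()]
```

For example, missing_model_files("models") returns an empty list when all nine files are in place, and otherwise the names that still need to be downloaded.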
csvs:
This folder contains the CSVs with matching string pointers for the image datasets
stats:
This folder contains .npy files with stats used for normalization
pickles:
This folder is used for saving different results as pickles
matplotlib_figures:
This folder is where figure_generator.py saves its files
outputs:
This folder is a temporary directory used for saving images
Demo:
To demo the model, you can run all of the tests included in transchannel_tests.py. Calculating the ROC is computationally intensive and can take multiple hours. To shorten the run, reduce the variable 'sample_size' in transchannel_tests.py, which sets the maximum number of images to pull from the test set.
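As a rough illustration of what a cap like 'sample_size' does, the sketch below limits how many test images are used. It is not the logic in transchannel_tests.py; the function name is hypothetical, and taking images from the start of the test set (rather than sampling at random) is an assumption.

```python
def select_test_indices(dataset_length, sample_size):
    """Return indices for at most sample_size images of the test set.

    Assumption: images are taken from the start of the test set.
    """
    if sample_size < 0:
        raise ValueError("sample_size must be non-negative")
    return list(range(min(sample_size, dataset_length)))


print(select_test_indices(dataset_length=5, sample_size=3))  # [0, 1, 2]
print(select_test_indices(dataset_length=2, sample_size=3))  # [0, 1]
```

A smaller cap means fewer images go through the ROC computation, at the cost of a noisier estimate.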