Commit ae90fd3

feat: Initial open source release.
Signed-off-by: Matteo Manica <[email protected]>
0 parents  commit ae90fd3

20 files changed: 1399 additions, 0 deletions

.gitignore

Lines changed: 120 additions & 0 deletions
@@ -0,0 +1,120 @@
+# secrets detector
+.secrets.baseline
+
+# mac files
+.DS_Store
+
+# Byte-compiled / optimized / DLL files
+__pycache__/
+*.py[cod]
+*$py.class
+
+# C extensions
+*.so
+
+# Distribution / packaging
+.Python
+env/
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+
+# PyInstaller
+# Usually these files are written by a python script from a template
+# before PyInstaller builds the exe, so as to inject date/other infos into it.
+*.manifest
+*.spec
+
+# Installer logs
+pip-log.txt
+pip-delete-this-directory.txt
+
+# Unit test / coverage reports
+htmlcov/
+.tox/
+.coverage
+.coverage.*
+.cache
+nosetests.xml
+coverage.xml
+*.cover
+.hypothesis/
+
+# Translations
+*.mo
+*.pot
+
+# Django stuff:
+*.log
+local_settings.py
+
+# Flask stuff:
+instance/
+.webassets-cache
+
+# Scrapy stuff:
+.scrapy
+
+# Sphinx documentation
+docs/_build/
+
+# PyBuilder
+target/
+
+# Jupyter Notebook
+.ipynb_checkpoints
+
+# pyenv
+.python-version
+
+# celery beat schedule file
+celerybeat-schedule
+
+# SageMath parsed files
+*.sage.py
+
+# dotenv
+.env
+
+# virtualenv
+.venv
+venv/
+ENV/
+
+# Spyder project settings
+.spyderproject
+.spyproject
+
+# Rope project settings
+.ropeproject
+
+# mkdocs documentation
+/site
+
+# mypy
+.mypy_cache/
+
+# data files
+.pdf
+.csv
+
+# swap files
+*swp
+
+# shell scripts
+*.sh
+
+# trained models
+/models
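
Most patterns in this `.gitignore` contain no slash, so Git matches them against the file name at any depth. The sketch below approximates that basename matching with Python's `fnmatch` module; it is not a reimplementation of Git's rules, and the example paths and the `*.pdf` comparison pattern are made up for illustration. It suggests that `.pdf` and `.csv` as committed would only match files named exactly `.pdf` or `.csv`, not files with those extensions.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def basename_matches(pattern: str, path: str) -> bool:
    """Approximate a slash-free gitignore pattern: test the last path component only."""
    return fnmatchcase(PurePosixPath(path).name, pattern)


# Patterns from the .gitignore in this commit, plus "*.pdf" (hypothetical, for comparison).
checks = [
    (".pdf", "docs/report.pdf"),
    ("*.pdf", "docs/report.pdf"),
    ("*swp", "src/.main.py.swp"),
    ("*.py[cod]", "pkg/module.pyc"),
    ("*.sh", "scripts/run.sh"),
]

results = {(pattern, path): basename_matches(pattern, path) for pattern, path in checks}
for (pattern, path), matched in results.items():
    print(f"{pattern!r:12} vs {path!r:22} -> {matched}")
```

`git check-ignore -v <path>` reports which pattern, if any, Git itself applies to a given path.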

.travis.yml

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+dist: trusty
+
+services:
+  - docker
+
+before_script:
+  - docker pull drugilsberg/rdkit-ubuntu:latest
+  - docker build -f .travis/Dockerfile -t predictor .
+
+script:
+  - docker run -it predictor python3 -c "import paccmann_predictor"
+  - docker run -it predictor python3 examples/train_paccmann.py -h
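
The two `script` steps act as smoke tests: one checks that the package imports, the other that the example script can print its help text. A minimal sketch of the same two checks in plain Python, without Docker, might look like the following; the helper names `can_import` and `help_exits_cleanly` are hypothetical and not part of the repository.

```python
import importlib
import subprocess
import sys


def can_import(module_name: str) -> bool:
    """Return True if importing the module succeeds (mirrors `python3 -c "import ..."`)."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


def help_exits_cleanly(script_args: list) -> bool:
    """Run `python <script_args> -h` and report whether it exited with status 0."""
    completed = subprocess.run(
        [sys.executable, *script_args, "-h"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return completed.returncode == 0


if __name__ == "__main__":
    # Both checks are expected to pass only inside an environment where the package is installed.
    print("import ok:", can_import("paccmann_predictor"))
    print("help ok:", help_exits_cleanly(["examples/train_paccmann.py"]))
```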

.travis/Dockerfile

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+FROM drugilsberg/rdkit-ubuntu:latest
+RUN apt-get update && apt-get install -y git
+WORKDIR /predictor
+# install requirements
+COPY examples/requirements.txt .
+RUN pip3 install --no-cache-dir -r requirements.txt
+# copy paccmann_predictor
+COPY . .
+RUN pip3 install --no-deps .
+CMD /bin/bash

LICENSE

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,7 @@
+Copyright 2019 Ali Oskooei, Jannis Born, Matteo Manica, Joris Cadow
+
+Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

README.md

Lines changed: 103 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,103 @@
+# paccmann_predictor
+
+PyTorch implementation of PaccMann.
+
+IC50 prediction using drug properties and tissue-specific cell line information (gene expression profiles).
+
+`paccmann_predictor` is a package for drug sensitivity prediction and is the core component of the repo.
+
+*NOTE*: PaccMann is an acronym for "Prediction of AntiCancer Compound sensitivity with Multi-modal Attention-based Neural Networks".
+
+*NOTE*: This repo contains the `pytorch` implementation of our best model architecture (a multiscale convolutional attentive SMILES encoder).
+
+*NOTE*: For details, please see our [paper](https://doi.org/10.1021/acs.molpharmaceut.9b00520) in *Molecular Pharmaceutics*.
+
+## Requirements
+
+- `conda>=3.7`
+
+## Installation
+
+The library itself has few dependencies (see [setup.py](setup.py)) with loose requirements.
+To run the example training script, we provide environment files under `examples/`.
+
+Create a conda environment:
+
+```sh
+conda env create -f examples/conda.yml
+```
+
+Activate the environment:
+
+```sh
+conda activate paccmann_predictor
+```
+
+Install in editable mode for development:
+
+```sh
+pip install -e .
+```
+
+## Example usage
+
+The `examples` directory contains a training script, [train_paccmann.py](./examples/train_paccmann.py), that makes use
+of `paccmann_predictor`.
+
+```console
+(paccmann_predictor) $ python examples/train_paccmann.py -h
+usage: train_paccmann.py [-h]
+                         train_sensitivity_filepath test_sensitivity_filepath
+                         gep_filepath smi_filepath gene_filepath
+                         smiles_language_filepath model_path params_filepath
+                         training_name
+
+positional arguments:
+  train_sensitivity_filepath
+                        Path to the drug sensitivity (IC50) data.
+  test_sensitivity_filepath
+                        Path to the drug sensitivity (IC50) data.
+  gep_filepath          Path to the gene expression profile data.
+  smi_filepath          Path to the SMILES data.
+  gene_filepath         Path to a pickle object containing list of genes.
+  smiles_language_filepath
+                        Path to a pickle object a SMILES language object.
+  model_path            Directory where the model will be stored.
+  params_filepath       Path to the parameter file.
+  training_name         Name for the training.
+
+optional arguments:
+  -h, --help            show this help message and exit
+```
+
+`params_filepath` could point to [examples/example_params.json](examples/example_params.json); examples for the other files can be downloaded from [here](https://ibm.box.com/v/paccmann-pytoda-data).
+
+## References
+
+If you use `paccmann_predictor` in your projects, please cite the following:
+
+```bib
+@article{doi:10.1021/acs.molpharmaceut.9b00520,
+author = {Manica, Matteo and Oskooei, Ali and Born, Jannis and Subramanian, Vigneshwari and Saez-Rodriguez, Julio and Rodriguez Martinez, Maria},
+title = {Toward Explainable Anticancer Compound Sensitivity Prediction via Multimodal Attention-Based Convolutional Encoders},
+journal = {Molecular Pharmaceutics},
+year = {2019},
+doi = {10.1021/acs.molpharmaceut.9b00520},
+note ={PMID: 31618586},
+URL = {
+https://doi.org/10.1021/acs.molpharmaceut.9b00520
+},
+eprint = {
+https://doi.org/10.1021/acs.molpharmaceut.9b00520
+}
+
+}
+@misc{born2019reinforcement,
+title={Reinforcement learning-driven de-novo design of anticancer compounds conditioned on biomolecular profiles},
+author={Jannis Born and Matteo Manica and Ali Oskooei and Maria Rodriguez Martinez},
+year={2019},
+eprint={1909.05114},
+archivePrefix={arXiv},
+primaryClass={q-bio.BM}
+}
+```
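
The help output in the README lists nine positional arguments in a fixed order. As a rough illustration of that interface, the sketch below rebuilds an equivalent `argparse` parser from the help text alone; it is a reconstruction, not the source of `examples/train_paccmann.py`, and every file path in `example_args` except `examples/example_params.json` is a hypothetical placeholder.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Rebuild the positional-argument interface shown in the README help output."""
    parser = argparse.ArgumentParser(prog="train_paccmann.py")
    arguments = [
        ("train_sensitivity_filepath", "Path to the drug sensitivity (IC50) data."),
        ("test_sensitivity_filepath", "Path to the drug sensitivity (IC50) data."),
        ("gep_filepath", "Path to the gene expression profile data."),
        ("smi_filepath", "Path to the SMILES data."),
        ("gene_filepath", "Path to a pickle object containing list of genes."),
        ("smiles_language_filepath", "Path to a pickle object a SMILES language object."),
        ("model_path", "Directory where the model will be stored."),
        ("params_filepath", "Path to the parameter file."),
        ("training_name", "Name for the training."),
    ]
    for name, help_text in arguments:
        parser.add_argument(name, type=str, help=help_text)
    return parser


# Hypothetical file locations; only the parameter file ships with this commit.
example_args = [
    "data/train.csv",
    "data/test.csv",
    "data/gep.csv",
    "data/drugs.smi",
    "data/genes.pkl",
    "data/smiles_language.pkl",
    "trained_models",
    "examples/example_params.json",
    "demo",
]
args = build_parser().parse_args(example_args)
print(args.params_filepath, args.training_name)
```

Because all arguments are positional, the order on the command line matters; swapping two paths would not raise a parsing error.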

examples/conda.yml

Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,13 @@
+name: paccmann_predictor
+channels:
+  - https://conda.anaconda.org/rdkit
+dependencies:
+  - rdkit=2019.03.1
+  - python>=3.6,<3.8
+  - pip>=19.1
+  - pip:
+    - pytoda @ git+https://github.com/PaccMann/[email protected]
+    - numpy>=1.14.3
+    - scipy>=1.3.1
+    - tensorflow>=1.10.0,<2.0
+    - torch==1.0.1
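
The environment file combines lower bounds, upper bounds and one exact pin. The following hand-rolled sketch shows how a spec such as `>=1.10.0,<2.0` constrains a version number; it handles only plain dotted numeric versions and the operators `==`, `>=`, `<=`, `>`, `<`, and it does not cover conda's single-`=` match used for `rdkit`. The helper names are hypothetical; the `packaging` library is the more robust option for real use.

```python
import operator

OPERATORS = {
    "==": operator.eq,
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
}


def parse_version(text: str) -> tuple:
    """Turn '1.10.0' into (1, 10, 0) so that comparison is numeric, not lexical."""
    return tuple(int(part) for part in text.strip().split("."))


def satisfies(version: str, spec: str) -> bool:
    """Check a version against a comma-separated spec such as '>=1.10.0,<2.0'."""
    candidate = parse_version(version)
    for clause in spec.split(","):
        clause = clause.strip()
        # Two-character operators are tried first so '>=' is not read as '>'.
        for symbol in ("==", ">=", "<=", ">", "<"):
            if clause.startswith(symbol):
                bound = parse_version(clause[len(symbol):])
                # Pad with zeros so '2.0' and '2.0.0' compare as equal.
                width = max(len(candidate), len(bound))
                left = candidate + (0,) * (width - len(candidate))
                right = bound + (0,) * (width - len(bound))
                if not OPERATORS[symbol](left, right):
                    return False
                break
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
    return True


print(satisfies("1.15.0", ">=1.10.0,<2.0"))
print(satisfies("2.0.0", ">=1.10.0,<2.0"))
```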

examples/example_params.json

Lines changed: 50 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,50 @@
+{
+  "drug_sensitivity_min_max": true,
+  "augment_smiles": true,
+  "smiles_start_stop_token": true,
+  "number_of_genes": 2128,
+  "smiles_padding_length": 560,
+  "stacked_dense_hidden_sizes": [
+    1024,
+    512
+  ],
+  "activation_fn": "relu",
+  "dropout": 0.5,
+  "batch_norm": true,
+  "filters": [
+    64,
+    64,
+    64
+  ],
+  "multiheads": [
+    4,
+    4,
+    4,
+    4
+  ],
+  "smiles_embedding_size": 16,
+  "kernel_sizes": [
+    [
+      3,
+      16
+    ],
+    [
+      5,
+      16
+    ],
+    [
+      11,
+      16
+    ],
+  ],
+  "smiles_attention_size": 64,
+  "embed_scale_grad": false,
+  "final_activation": true,
+  "gene_to_dense": false,
+  "batch_size": 2048,
+  "lr": 0.01,
+  "optimizer": "adam",
+  "loss_fn": "mse",
+  "epochs": 200,
+  "save_model": 25
+}
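
Several values in this parameter file appear to be related: there are three `filters` and three `kernel_sizes`, one more `multiheads` entry than filters, and every kernel's second dimension equals `smiles_embedding_size`. The sketch below loads a subset of the parameters and checks those relations. The relations themselves are assumptions inferred from the example values, not constraints documented in this commit, and `check_params` is a hypothetical helper.

```python
import json

# Subset of examples/example_params.json, inlined so the sketch is self-contained.
PARAMS_JSON = """
{
  "number_of_genes": 2128,
  "smiles_padding_length": 560,
  "smiles_embedding_size": 16,
  "filters": [64, 64, 64],
  "kernel_sizes": [[3, 16], [5, 16], [11, 16]],
  "multiheads": [4, 4, 4, 4],
  "stacked_dense_hidden_sizes": [1024, 512],
  "dropout": 0.5
}
"""


def check_params(params: dict) -> list:
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    n_filters = len(params["filters"])
    # Assumption: one kernel size per convolutional filter bank.
    if len(params["kernel_sizes"]) != n_filters:
        problems.append("kernel_sizes and filters differ in length")
    # Assumption: one attention head count per filter bank plus one extra entry.
    if len(params["multiheads"]) != n_filters + 1:
        problems.append("multiheads should have len(filters) + 1 entries")
    # Assumption: each kernel spans the full embedding dimension.
    for _, width in params["kernel_sizes"]:
        if width != params["smiles_embedding_size"]:
            problems.append(f"kernel width {width} != smiles_embedding_size")
    if not 0.0 <= params["dropout"] < 1.0:
        problems.append("dropout must be in [0, 1)")
    return problems


params = json.loads(PARAMS_JSON)
problems = check_params(params)
print(problems or "no inconsistencies found")
```

Running a check like this before a 200-epoch training job is a cheap way to catch a mistyped list length early.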

examples/requirements.txt

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,5 @@
+pytoda @ git+https://github.com/PaccMann/[email protected]
+numpy>=1.14.3
+scipy>=1.3.1
+tensorflow>=1.10.0,<2.0
+torch==1.0.1
