update unimol docking v2 instructions (#228)
* fix mean

* update unimol docking v2 instructions

* update main readme

---------

Co-authored-by: zhougm <[email protected]>
ZhouGengmo and zhougm authored Jun 5, 2024
1 parent 97797d5 commit 4df83d8
Showing 7 changed files with 108 additions and 5 deletions.
12 changes: 9 additions & 3 deletions README.md
@@ -51,14 +51,20 @@ Check this [subfolder](./unimol_tools/) for more details.

Documentation of Uni-Mol tools is available at https://unimol.readthedocs.io/en/latest/

Uni-Mol Docking V2: towards realistic and accurate binding pose prediction
Uni-Mol Docking V2: Towards realistic and accurate binding pose prediction
--------------------------------------------------------------------
We update unimol docking to Uni-Mol Docking V2, which demonstrates a remarkable improvement in performance, accurately predicting the binding poses of 77+% of ligands in the PoseBusters benchmark with an RMSD value of less than 2.0 Å, and 75+\% passing all quality checks. This represents a significant increase from the 62% achieved by the previous Uni-Mol Docking model. Notably, our Uni-Mol Docking approach generates chemically accurate predictions, circumventing issues such as chirality inversions and steric
[![arXiv](https://img.shields.io/badge/arXiv-2405.11769-00ff00.svg)](https://arxiv.org/abs/2405.11769) ![Static Badge](https://img.shields.io/badge/Bohrium_Apps-Uni--Mol_Docking_V2-blue?link=https%3A%2F%2Fbohrium.dp.tech%2Fapps%2Funimoldockingv2)


<p align="center"><img src="unimol_docking_v2/figure/bohrium_app.png" width=60%></p>
<p align="center"><b>Uni-Mol Docking V2 Bohrium App Interface</b></p>

We update Uni-Mol Docking to Uni-Mol Docking V2, which demonstrates a remarkable improvement in performance, accurately predicting the binding poses of 77+% of ligands in the PoseBusters benchmark with an RMSD value of less than 2.0 Å, and 75+\% passing all quality checks. This represents a significant increase from the 62% achieved by the previous Uni-Mol Docking model. Notably, our Uni-Mol Docking approach generates chemically accurate predictions, circumventing issues such as chirality inversions and steric
clashes that have plagued previous ML models.

Check this [subfolder](./unimol_docking_v2/) for more details.

The Uni-Mol Docking v2 service is available at https://bohrium.dp.tech/apps/unimoldockingv2
The Uni-Mol Docking V2 service is available at https://bohrium.dp.tech/apps/unimoldockingv2

News
----
2 changes: 1 addition & 1 deletion unimol/unimol/losses/reg_loss.py
@@ -192,7 +192,7 @@ def reduce_metrics(logging_outputs, split="valid") -> None:
.numpy()
.mean(axis=1)
)
agg_mae = np.abs(y_pred - y_true).mean()
agg_mae = np.abs(y_pred - y_true).mean(axis=0).mean(axis=1)
metrics.log_scalar(f"{split}_agg_mae", agg_mae, sample_size, round=4)


86 changes: 86 additions & 0 deletions unimol_docking_v2/README.md
@@ -1,13 +1,31 @@
Uni-Mol Docking V2
===================================================================
[![arXiv](https://img.shields.io/badge/arXiv-2405.11769-00ff00.svg)](https://arxiv.org/abs/2405.11769) ![Static Badge](https://img.shields.io/badge/Bohrium_Apps-Uni--Mol_Docking_V2-blue?link=https%3A%2F%2Fbohrium.dp.tech%2Fapps%2Funimoldockingv2)

<p align="center"><img src="figure/bohrium_app.png" width=60%></p>
<p align="center"><b>Uni-Mol Docking V2 Bohrium App Interface</b></p>

We update Uni-Mol Docking to Uni-Mol Docking V2, which demonstrates a remarkable improvement in performance, accurately predicting the binding poses of 77+% of ligands in the PoseBusters benchmark with an RMSD value of less than 2.0 Å, and 75+\% passing all quality checks. This represents a significant increase from the 62% achieved by the previous Uni-Mol Docking model. Notably, our Uni-Mol Docking approach generates chemically accurate predictions, circumventing issues such as chirality inversions and steric
clashes that have plagued previous ML models.

The Uni-Mol Docking V2 service is available at https://bohrium.dp.tech/apps/unimoldockingv2

Dependencies
------------
- [Uni-Core](https://github.com/dptech-corp/Uni-Core), check its [Installation Documentation](https://github.com/dptech-corp/Uni-Core#installation).
- rdkit==2022.9.3, install via `pip install rdkit-pypi==2022.9.3 -i https://pypi.tuna.tsinghua.edu.cn/simple/ --trusted-host pypi.tuna.tsinghua.edu.cn`
- biopandas==0.4.1, install via `pip install biopandas==0.4.1`

Data
----------------------------------
| Data | File Size | Update Date | Download Link |
|--------------------------|------------| ----------- |---------------------------------------------------------------------------------------------------------------------------|
| Raw training data | 4.95GB | May 14 2024 |https://zenodo.org/records/11191555 |
| Posebusters and Astex | 8.2MB | Nov 16 2023 |https://github.com/dptech-corp/Uni-Mol/files/13352676/eval_sets.zip |


Note that we use `Posebusters V1` (428 data points, released in August 2023). For the latest version, please refer to the [Posebusters repo](https://github.com/maabuu/posebusters).


Model weights
----------------------------------
@@ -28,3 +46,71 @@ Results
| Uni-Mol Docking | 58.9 | 82.35 |
| AlphaFold latest | 73.6 | - |
| **Uni-Mol Docking V2** | **77.6** | **95.29**|

To reproduce the Posebusters results, we provide a notebook `interface/posebuster_demo` that covers the full pipeline, from data processing through model inference to metric calculation.
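
As an illustration of the headline metric (the share of predictions with RMSD below 2.0 Å), here is a minimal sketch. It assumes predicted and reference ligand coordinates share the same atom order, and it applies no alignment or symmetry correction, so it may differ from what the notebook computes. The function names `rmsd` and `success_rate` are hypothetical and not part of this repository.

```python
import numpy as np


def rmsd(pred, ref):
    """RMSD between two (n_atoms, 3) coordinate arrays.

    Assumption: atoms are in the same order in both arrays; no alignment
    or symmetry handling is performed.
    """
    pred = np.asarray(pred, dtype=float)
    ref = np.asarray(ref, dtype=float)
    squared_dist = ((pred - ref) ** 2).sum(axis=1)  # per-atom squared distance
    return float(np.sqrt(squared_dist.mean()))


def success_rate(rmsds, threshold=2.0):
    """Fraction of poses whose RMSD is strictly below the threshold (in Å)."""
    rmsds = np.asarray(rmsds, dtype=float)
    return float((rmsds < threshold).mean())


# Toy data: one near-native pose and one pose that is far off.
ref = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]])
pred_good = ref + np.array([0.3, 0.0, 0.0])  # rigid shift of 0.3 Å
pred_bad = ref + np.array([0.0, 3.0, 0.0])   # rigid shift of 3.0 Å

values = [rmsd(pred_good, ref), rmsd(pred_bad, ref)]
print(values, success_rate(values))
```

In this toy case one of the two poses falls below the 2.0 Å threshold, so the success rate is 0.5.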

Training
----------------------------------

In the training script, `data_path`, `save_dir`, `finetune_mol_model`, and `finetune_pocket_model` need to be specified.

The pretrained molecular and pocket model weights can be obtained from the [Uni-Mol repo](https://github.com/dptech-corp/Uni-Mol). For the molecular model, we use the `no_h` version of the weights.

```
bash train.sh
```

Inference
----------------------------------

We add an interface for model inference in `interface/demo.py`.

About inputs and outputs:

- `--input-protein`: PDB file, absolute or relative path; in `batch_one2one` mode, a list of paths

- `--input-ligand`: SDF file, absolute or relative path; in batch mode, a list of paths

- `--input-docking-grid`: JSON file containing the center coordinates and box size, absolute or relative path; in batch mode, a list of paths

- `--output-ligand-name`: str, the output SDF file name; in batch mode, a list of names

- `--output-ligand-dir`: str, absolute or relative path
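
For the docking grid, the sketch below shows one way to derive a center and box size from known ligand coordinates and serialize them as JSON. The key names (`center_x`, `size_x`, and so on), the 10 Å padding, and the helper name `grid_from_ligand` are assumptions made for illustration; check `example_data/docking_grid.json` for the schema that `demo.py` actually expects.

```python
import json

import numpy as np


def grid_from_ligand(coords, padding=10.0):
    """Build a docking-grid dict from (n_atoms, 3) ligand coordinates.

    ASSUMPTION: the key names and the padding value are illustrative only;
    they are not confirmed by this README.
    """
    coords = np.asarray(coords, dtype=float)
    center = coords.mean(axis=0)  # mean of the atom coordinates
    size = coords.max(axis=0) - coords.min(axis=0) + padding  # extent plus margin
    axes = ("x", "y", "z")
    grid = {f"center_{a}": round(float(c), 3) for a, c in zip(axes, center)}
    grid.update({f"size_{a}": round(float(s), 3) for a, s in zip(axes, size)})
    return grid


# Toy ligand with two atoms.
ligand_coords = [[0.0, 0.0, 0.0], [2.0, 4.0, 6.0]]
grid = grid_from_ligand(ligand_coords)
grid_json = json.dumps(grid, indent=2)
print(grid_json)
```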

In batch mode, you can save `input_protein`, `input_ligand`, `input_docking_grid`, and `output_ligand_name` as columns of a CSV file and pass it with `--input-batch-file`.
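
A batch file with these four columns can be generated with Python's standard `csv` module. This is a minimal sketch: the column names and file names follow `interface/input_batch_one2one.csv`, while the helper `write_batch_csv` is hypothetical and not part of the repository.

```python
import csv
import io

# Column names as used in interface/input_batch_one2one.csv.
COLUMNS = ["input_protein", "input_ligand", "input_docking_grid", "output_ligand_name"]


def write_batch_csv(rows, handle):
    """Write one row per protein-ligand pair, preceded by a header line."""
    writer = csv.DictWriter(handle, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)


rows = [
    {
        "input_protein": f"protein{i}.pdb",
        "input_ligand": f"ligand_prepared{i}.sdf",
        "input_docking_grid": f"docking_grid{i}.json",
        "output_ligand_name": f"ligand_predict{i}",
    }
    for i in range(1, 4)
]

# An in-memory buffer keeps the example self-contained; to write a real file,
# use open("input_batch_one2one.csv", "w", newline="") instead.
buffer = io.StringIO()
write_batch_csv(rows, buffer)
print(buffer.getvalue())
```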

Other parameters used:

- `--steric-clash-fix`: if set, the predicted SDF file is post-processed to correct chemical details and relax steric clashes.

- `--mode`: one of `single`, `batch_one2one`, or `batch_one2many`.
  - `single` represents one protein and one ligand as input.
  - `batch_one2one` represents a batch of proteins and a batch of ligands, where the relationship is one-to-one.
  - `batch_one2many` represents one protein and a batch of ligands, where the relationship is one-to-many.
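
The pairing implied by each mode can be sketched as follows. This is a hypothetical illustration of the semantics described above, not the logic in `interface/demo.py`; the function name `pair_inputs` is an assumption.

```python
def pair_inputs(mode, proteins, ligands):
    """Return the (protein, ligand) pairs implied by each mode.

    Hypothetical helper for illustration; the real interface may validate
    and pair its inputs differently.
    """
    if mode == "single":
        if len(proteins) != 1 or len(ligands) != 1:
            raise ValueError("single mode expects one protein and one ligand")
        return [(proteins[0], ligands[0])]
    if mode == "batch_one2one":
        if len(proteins) != len(ligands):
            raise ValueError("batch_one2one expects equally long lists")
        return list(zip(proteins, ligands))
    if mode == "batch_one2many":
        if len(proteins) != 1:
            raise ValueError("batch_one2many expects exactly one protein")
        return [(proteins[0], ligand) for ligand in ligands]
    raise ValueError(f"unknown mode: {mode}")


one2one = pair_inputs("batch_one2one", ["p1.pdb", "p2.pdb"], ["l1.sdf", "l2.sdf"])
one2many = pair_inputs("batch_one2many", ["p1.pdb"], ["l1.sdf", "l2.sdf"])
print(one2one)
print(one2many)
```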

Demo:

```
cd interface
bash demo.sh # demo_batch_one2one.sh for batch mode
```
Alternatively, refer to the notebook `interface/posebuster_demo`.


Citation
--------

Please kindly cite this paper if you use the data/code/model.
```
@article{alcaide2024uni,
title={Uni-Mol Docking V2: Towards Realistic and Accurate Binding Pose Prediction},
author={Alcaide, Eric and Gao, Zhifeng and Ke, Guolin and Li, Yaqi and Zhang, Linfeng and Zheng, Hang and Zhou, Gengmo},
journal={arXiv preprint arXiv:2405.11769},
year={2024}
}
```

License
-------

This project is licensed under the terms of the MIT license. See [LICENSE](https://github.com/dptech-corp/Uni-Mol/blob/main/LICENSE) for additional details.
Binary file added unimol_docking_v2/figure/bohrium_app.png
4 changes: 3 additions & 1 deletion unimol_docking_v2/interface/demo.sh
@@ -1,6 +1,8 @@
python demo.py --input-protein ../example_data/protein.pdb \
python demo.py --mode single --conf-size 10 --cluster \
--input-protein ../example_data/protein.pdb \
--input-ligand ../example_data/ligand.sdf \
--input-docking-grid ../example_data/docking_grid.json \
--output-ligand-name ligand_predict \
--output-ligand-dir predict_sdf \
--steric-clash-fix \
--model-dir checkpoint_best.pt
5 changes: 5 additions & 0 deletions unimol_docking_v2/interface/demo_batch_one2one.sh
@@ -0,0 +1,5 @@
python demo.py --mode batch_one2one --batch-size 8 --conf-size 10 --cluster \
--input-batch-file input_batch_one2one.csv \
--output-ligand-dir predict_sdf \
--steric-clash-fix \
--model-dir checkpoint_best.pt
4 changes: 4 additions & 0 deletions unimol_docking_v2/interface/input_batch_one2one.csv
@@ -0,0 +1,4 @@
input_protein,input_ligand,input_docking_grid,output_ligand_name
protein1.pdb,ligand_prepared1.sdf,docking_grid1.json,ligand_predict1
protein2.pdb,ligand_prepared2.sdf,docking_grid2.json,ligand_predict2
protein3.pdb,ligand_prepared3.sdf,docking_grid3.json,ligand_predict3
