PLM-PLI

Does protein pretrained language model facilitate the prediction of protein-ligand interaction?
A novel method that quantitatively assesses the contribution of protein pretrained language models (PLMs) to protein-ligand interaction (PLI) prediction.

Directory Structure

├── AttentiveFP/           # GAT model for extracting drug features
├── data/                  # PLI task datasets
├── models/                # PLMs
├── args.yaml              # Drug molecule parameters
├── config.py              # Configuration file for parameter settings
├── data_handler.py        # PLI data processing tool
├── main.py                # Main function
├── ot_metric              # Quantitative transfer metrics based on OT
├── OTFRM                  # OTFRM analysis
├── plotter.py             # Plotting tool
├── README.md              # Readme file
├── requirements.txt       # Environment dependencies
├── train_test.py          # Engine for training and testing the model
└── utils.py               # Collection of utility functions

Requirements

Python 3.10.11, PyTorch 1.11.0

conda create -n PLMPLI python==3.10.11
conda activate PLMPLI
cd PLM-PLI
pip install -r requirements.txt
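
To confirm the environment matches these requirements before training, the following check can be run (a minimal sketch; it only assumes that requirements.txt installs PyTorch 1.11.0):

# Verify the environment: Python version, PyTorch version, CUDA availability.
import sys
import torch

print(f"Python : {sys.version.split()[0]}")    # expected 3.10.11
print(f"PyTorch: {torch.__version__}")         # expected 1.11.0
print(f"CUDA   : {torch.cuda.is_available()}")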

Data Preparation

Place the processed datasets for PDBbind, Kinase, and DUD-E in the data/ directory. An example entry of the processed PDBbind dataset is shown below:

PDB-ID   seq            rdkit_smiles                                        label   set
11gs     PYTVV...GKQ    CC[C@@H](CSC[C@H]...C(=O)c1ccc(OCC(=O)O)c(Cl)c1Cl   5.82    train
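
A minimal loading sketch in Python, assuming the processed split is stored as a tab-separated file at data/pdbbind.tsv (the file name and separator are illustrative, not prescribed by the repository; data_handler.py contains the actual loading logic):

# Load a processed PLI dataset and split it by the `set` column.
# Path and separator are assumptions; adjust them to your preprocessing.
import pandas as pd

df = pd.read_csv("data/pdbbind.tsv", sep="\t")

train_df = df[df["set"] == "train"]
test_df  = df[df["set"] == "test"]   # the "test" split label is assumed

print(f"{len(train_df)} training / {len(test_df)} test complexes")
print(train_df[["PDB-ID", "rdkit_smiles", "label"]].head())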

Fine-tuning on PLI Tasks

Run main.py to fine-tune a pre-trained PLM on a downstream PLI prediction task. The following example fine-tunes on the PDBBind task with ProtTrans as the PLM:

python main.py --model_name=prottrans --task=PDBBind

For more input parameter settings, please refer to config.py.
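
To compare several settings, the same command can be scripted. The sketch below loops over tasks by invoking main.py; only the prottrans / PDBBind combination is confirmed by this README, and the other task names mirror the dataset folders, so they must match the identifiers accepted by config.py:

# Run fine-tuning for several PLI tasks in sequence by calling main.py.
# Task names other than "PDBBind" are assumptions; check config.py for the
# identifiers that are actually accepted.
import subprocess

tasks = ["PDBBind", "Kinase", "DUD-E"]

for task in tasks:
    cmd = ["python", "main.py", "--model_name=prottrans", f"--task={task}"]
    print("Running:", " ".join(cmd))
    subprocess.run(cmd, check=True)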

Acknowledgement

The software is to be used for teaching or not-for-profit research purposes only. Permission is required for any commercial use of the software.

Citations

If you use our method in your research, please cite our paper:

@article{zhang2023protein,
  author={Zhang, Weihong and Hu, Fan and Li, Wang and Yin, Peng},
  title={Does protein pretrained language model facilitate the prediction of protein-ligand interaction?},
  year={2023},
  journal={Methods},
}
