PhoreGen is a pharmacophore-oriented 3D molecular generation framework that generates complete 3D molecules aligned with a given pharmacophore model. It applies asynchronous perturbations and simultaneous updates to both atomic and bond information, coupled with a message-passing mechanism that incorporates prior knowledge of ligand-pharmacophore mapping during the diffusion-denoising process. Through hierarchical learning on a large number of ligand-pharmacophore pairs derived from 3D ligands, complex structures, and docking-produced potential binding modes, PhoreGen can generate chemically and energetically reasonable 3D molecules that are well aligned with the pharmacophore constraints while maintaining structural diversity, drug-likeness, and potentially high binding affinity. Notably, it generates feature-customized molecules, e.g. with covalent groups and metal-binding motifs, at high frequency, which makes it practical even for challenging drug design scenarios.
The code has been tested in the following environment:
Package | Version |
---|---|
Python | 3.9.16 |
PyTorch | 1.12.1 |
CUDA | 12.1 |
PyTorch Geometric | 2.1.0 |
RDKit | 2022.9.5 |
OpenBabel | 3.1.1 |
Pandas | 1.5.3 |
NumPy | 1.25.1 |
conda env create -f phoregen_env.yml
conda activate phoregen
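After activating the environment, it can be useful to compare the installed package versions with the tested ones listed in the table. The following is a minimal sketch, not part of the PhoreGen repository: the import names (for example `torch_geometric` for PyTorch Geometric) and the helper functions `installed_version` and `compare_versions` are assumptions made for illustration. It only prints both version strings side by side, because installed version strings may be formatted differently from the table (build suffixes, zero padding).

```python
import importlib

# Tested versions copied from the table above. The keys are Python import
# names, which are assumptions (e.g. PyTorch Geometric -> torch_geometric).
TESTED_VERSIONS = {
    "torch": "1.12.1",
    "torch_geometric": "2.1.0",
    "rdkit": "2022.9.5",
    "openbabel": "3.1.1",
    "pandas": "1.5.3",
    "numpy": "1.25.1",
}


def installed_version(module_name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    # Some modules do not expose __version__; report that explicitly.
    return getattr(module, "__version__", "unknown")


def compare_versions(tested=TESTED_VERSIONS):
    """Map each module name to a (tested version, installed version) pair."""
    return {name: (version, installed_version(name)) for name, version in tested.items()}


if __name__ == "__main__":
    for name, (tested, found) in compare_versions().items():
        shown = found if found is not None else "not installed"
        print(f"{name:16s} tested={tested:10s} installed={shown}")
```

A mismatch does not necessarily mean the code will fail; the table only records the environment in which the code was tested.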
Please refer to `README.md` in the `data` folder.
You can generate pharmacophore models based on complexes or ligands using the online tool available at AncPhore.
Use the following command to generate molecules based on the given pharmacophore models:
python sample_all.py --num_samples 100 --outdir ./results/test --phore_file_list ./data/phore_for_sampling/file_index.json
Key arguments:
- `num_samples`: Number of molecules to generate for each pharmacophore model.
- `outdir`: Output directory for the generated molecules.
- `phore_file_list`: Path to the JSON file containing the list of pharmacophore models. A test file is provided at `./data/phore_for_sampling/file_index.json`.
Output files include 3D molecular structures in `.sdf` format.
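To check how many molecules a sampling run actually produced, the output directory can be scanned for `.sdf` files. The sketch below is a hypothetical helper, not part of PhoreGen. It relies only on the SDF convention that each record ends with a `$$$$` line, and it assumes the files sit somewhere under the `--outdir` directory, since the exact output layout is not described here.

```python
from pathlib import Path


def count_sdf_records(sdf_path):
    """Count molecule records in an SDF file; each record ends with a '$$$$' line."""
    with open(sdf_path, "r", encoding="utf-8", errors="replace") as handle:
        return sum(1 for line in handle if line.strip() == "$$$$")


def summarize_outdir(outdir):
    """Return {relative file path: record count} for every .sdf file under outdir."""
    root = Path(outdir)
    return {
        str(path.relative_to(root)): count_sdf_records(path)
        for path in sorted(root.rglob("*.sdf"))
    }


if __name__ == "__main__":
    # ./results/test is the --outdir value used in the sampling command.
    for name, count in summarize_outdir("./results/test").items():
        print(f"{name}: {count} molecule(s)")
```

This counts records only; it does not validate chemistry. Parsing the files with RDKit or OpenBabel, both listed in the environment table, would be the next step for checking validity.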
To perform pretraining with the LigPhore dataset:
python train.py --config ./configs/train_lig-phore.yml
To refine the model using CpxPhore and DockPhore datasets:
python train.py --config ./configs/train_dock-cpx-phore.yml
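The two training commands can be chained from a small driver script. This is a sketch under stated assumptions: the helpers `build_command` and `run_stages` are hypothetical, and running the stages back to back is an assumption about the intended workflow. Whether the refinement config picks up the pretrained checkpoint automatically is not stated here, so check the config file before launching a real run.

```python
import subprocess
import sys

# Config paths are taken from the two training commands above. Chaining them
# in this order is an assumption about how the stages are meant to be used.
TRAINING_STAGES = [
    ("pretraining (LigPhore)", "./configs/train_lig-phore.yml"),
    ("refinement (CpxPhore + DockPhore)", "./configs/train_dock-cpx-phore.yml"),
]


def build_command(config_path, python=sys.executable):
    """Build the argument list for one training stage."""
    return [python, "train.py", "--config", config_path]


def run_stages(stages=TRAINING_STAGES, dry_run=True):
    """Print each stage's command in order; execute it only when dry_run is False.

    With dry_run=False, a failing stage raises CalledProcessError and the
    remaining stages are not started.
    """
    commands = []
    for label, config in stages:
        cmd = build_command(config)
        commands.append(cmd)
        print(f"[{label}] {' '.join(cmd)}")
        if not dry_run:
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Dry run by default; pass dry_run=False to launch training for real.
    run_stages(dry_run=True)
```

The dry-run default makes it possible to inspect the exact commands first, which is useful because each stage can run for a long time.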
For questions or feedback, please contact:
- Peng Jian: [email protected]
- Li Guo-Bo: [email protected]
- Visit our Lab Website for more details about PhoreGen and related projects.