Top-DTI: Integrating Topological Deep Learning and Large Language Models for Drug Target Interaction Prediction
We propose Top-DTI, a framework for predicting Drug-Target Interactions (DTIs) that integrates Topological Data Analysis (TDA) with Large Language Models (LLMs). Top-DTI uses Persistent Homology (PH) to extract topological features from protein contact maps and drug molecular images. In parallel, protein and drug LLMs generate embeddings that capture sequential and contextual information from protein sequences and drug SMILES strings. The TDA and LLM embeddings are combined through a learnable fusion mechanism that dynamically balances the contributions of topological and sequence-based features. The fused representations are then fed into a heterogeneous Graph Neural Network (GNN) to learn relational information from the DTI network. Finally, the embeddings learned by the GNN are used to train a multilayer perceptron (MLP) classifier that predicts DTIs.
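The description above does not spell out the form of the fusion mechanism. As a minimal sketch of one common choice, the snippet below blends two embeddings with a scalar sigmoid gate. The function name `gated_fusion`, the scalar `gate_logit`, and the assumption that both embeddings already share one dimension are illustrative, not taken from the Top-DTI code. In training, `gate_logit` would be a learnable parameter updated by gradient descent.

```python
import math


def sigmoid(x: float) -> float:
    """Map a real-valued logit to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(tda_emb, llm_emb, gate_logit):
    """Blend two equal-length embeddings with a scalar gate.

    Hypothetical helper: alpha weights the topological embedding and
    (1 - alpha) weights the sequence-based embedding.
    """
    if len(tda_emb) != len(llm_emb):
        raise ValueError("embeddings must have the same dimension")
    alpha = sigmoid(gate_logit)
    return [alpha * t + (1.0 - alpha) * s for t, s in zip(tda_emb, llm_emb)]


tda = [1.0, 0.0, 2.0]   # toy topological embedding
llm = [0.0, 4.0, 2.0]   # toy sequence-based embedding
fused = gated_fusion(tda, llm, gate_logit=0.0)  # logit 0 gives alpha = 0.5
print(fused)  # [0.5, 2.0, 2.0]
```

A larger `gate_logit` shifts the result toward the topological features and a smaller one toward the sequence-based features, which is one way to read "dynamically balances".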
Generate 2D Representations
Generate two-dimensional representations of drug molecular structures and protein contact maps to capture structural features:
- Drug Images: Generated from SMILES using the RDKit library.
- Protein Contact Maps: Created using a transformer-based contact prediction model.
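As a rough illustration of the drug-image step, the sketch below renders a SMILES string with RDKit. The helper name `smiles_to_image`, the 224-pixel image size, and the aspirin example are assumptions made for illustration; the repository's actual rendering settings may differ.

```python
from rdkit import Chem
from rdkit.Chem import Draw


def smiles_to_image(smiles: str, size: int = 224):
    """Render a SMILES string as a square PIL image of its 2D depiction."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:  # RDKit returns None when the SMILES cannot be parsed
        raise ValueError(f"invalid SMILES: {smiles!r}")
    return Draw.MolToImage(mol, size=(size, size))


image = smiles_to_image("CC(=O)OC1=CC=CC=C1C(=O)O")  # aspirin
print(image.size)
```

The returned object is a PIL image, so it can be converted to a grayscale array before any topological analysis.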
Extract Topological Features
Extract topological features from drug molecular images and protein contact maps using Persistent Homology.
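To give a feel for what Persistent Homology computes on an image, here is a minimal pure-Python sketch. It covers only 0-dimensional features (connected components) under a sublevel-set filtration with 4-neighbour connectivity. This is an illustration under stated assumptions, not the Top-DTI implementation: a real pipeline would normally rely on a dedicated TDA library and may also use higher-dimensional features such as loops. The function name `persistence_0d` and the toy image are hypothetical.

```python
import math


def persistence_0d(image):
    """Return sorted (birth, death) pairs of connected components for a
    sublevel-set filtration of a 2D grid with 4-neighbour connectivity."""
    rows, cols = len(image), len(image[0])
    # Visit pixels from the lowest to the highest intensity.
    order = sorted((image[r][c], r, c) for r in range(rows) for c in range(cols))
    parent = {}  # union-find forest over pixels added so far
    birth = {}   # pixel -> value at which it entered the filtration

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    pairs = []
    for value, r, c in order:
        p = (r, c)
        parent[p] = p
        birth[p] = value
        for q in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if q not in parent:  # outside the grid or not yet added
                continue
            a, b = find(p), find(q)
            if a == b:
                continue
            # Elder rule: the component born later dies at this value.
            young, old = (a, b) if birth[a] > birth[b] else (b, a)
            if birth[young] < value:  # skip zero-persistence pairs
                pairs.append((birth[young], value))
            parent[young] = old
    # The component born at the global minimum never dies.
    pairs.append((order[0][0], math.inf))
    return sorted(pairs)


toy = [
    [1, 5, 2],
    [6, 7, 6],
    [3, 5, 4],
]
print(persistence_0d(toy))  # [(1, inf), (2, 5), (3, 6), (4, 5)]
```

Each pair records the intensity at which a component appears and the intensity at which it merges into an older one. Summary statistics of these pairs are one way to turn an image into a fixed-length topological feature vector.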
Generate Sequence-Based Embeddings
Capture sequence-based features from protein sequences and drug SMILES strings using LLMs.
Top-DTI Evaluation and Results
The embeddings generated in Step 2 (Topological Features) and Step 3 (Sequence-Based Embeddings) are used to evaluate the performance of Top-DTI on benchmark datasets.
The public benchmark datasets are available in the datasets folder for direct access.
- The BioSNAP and Human datasets were obtained from the DrugLAMP repository.
- The BioSNAP Unseen Drug and BioSNAP Unseen Target datasets were sourced from the ConPLex_dev repository.
The environment.yml and requirements.txt files are provided in the main repository. The environment.yml file can be used to create the required Conda environment as follows:
conda env create -f environment.yml
conda activate top_dti