Computational pipeline for optimizing antibody binding affinity through structure-based design and machine learning approaches.
This project implements an 8-step computational workflow to improve antibody variants targeting influenza hemagglutinin. The pipeline combines structural biology analysis with AI-guided mutation design to achieve enhanced binding properties.
The optimized antibody variant YGSTGDRH shows an approximately 5-fold improvement in binding affinity (205.8 nM versus 1038.8 nM for the original sequence), achieved through systematic optimization of the CDR3 region.
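The fold improvement follows directly from the two reported values. Below is a minimal sketch of that arithmetic, assuming both numbers are on the same scale and that a lower nM value means tighter binding. The function name is illustrative and not part of the repository.

```python
def fold_improvement(original_nm: float, variant_nm: float) -> float:
    """Return original / variant; a value above 1 means the variant binds more tightly."""
    if original_nm <= 0 or variant_nm <= 0:
        raise ValueError("affinity values must be positive")
    return original_nm / variant_nm


# Values reported above: original 1038.8 nM, optimized variant 205.8 nM.
ratio = fold_improvement(1038.8, 205.8)
print(f"{ratio:.2f}-fold")  # prints 5.05-fold
```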
The pipeline consists of eight sequential analysis phases:
- Dataset exploration - PDB structure collection and characterization
- Structure preprocessing - Molecular cleaning and interface identification
- CDR mapping - Complementarity-determining region analysis and hotspot identification
- AI mutation design - Machine learning-guided variant generation using ESM-2 and ProteinMPNN
- Binding affinity scoring - Molecular docking and energy calculations
- Filtering and selection - Multi-criteria evaluation and candidate ranking
- Visualization and reporting - Comprehensive results analysis
- Documentation - Repository organization and methodology documentation
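To give a sense of the search space the mutation design phase works in, the sketch below enumerates every single-point variant of an 8-residue segment. This is plain exhaustive enumeration, not the ESM-2 or ProteinMPNN guided design used in the pipeline. The parent sequence `GGSTGDRH` is an assumption inferred from the G1Y notation and the variant YGSTGDRH, and the function name is hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def single_point_variants(parent: str) -> dict:
    """Map mutation names such as 'G1Y' to the corresponding mutated sequence."""
    variants = {}
    for index, wild_type in enumerate(parent):
        for residue in AMINO_ACIDS:
            if residue == wild_type:
                continue  # skip the unchanged residue
            name = f"{wild_type}{index + 1}{residue}"  # positions are 1-based
            variants[name] = parent[:index] + residue + parent[index + 1:]
    return variants


# Assumed parent segment (inferred from the G1Y notation, not stated in the source).
candidates = single_point_variants("GGSTGDRH")
print(len(candidates))  # 8 positions x 19 substitutions
```

A guided approach would score or sample from this space rather than evaluate every candidate.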
biopython>=1.79
pandas>=1.5.0
matplotlib>=3.5.0
numpy>=1.21.0
torch>=1.12.0
transformers>=4.20.0
seaborn>=0.11.0
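Before running the notebooks, it can help to confirm that these packages are present. The sketch below uses only the standard library and reports the installed version of each distribution; it does not compare against the minimum versions listed above. The helper name is illustrative.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names taken from the requirements list above.
REQUIRED = ["biopython", "pandas", "matplotlib", "numpy", "torch", "transformers", "seaborn"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(REQUIRED).items():
        print(f"{name}: {ver or 'NOT INSTALLED'}")
```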
Run the notebooks in order, part1 through part8. Each notebook is fully documented and can also be run on its own, provided the input data it expects is available.
part1_HA_dataset/ # Initial data collection
part2_structure_preprocessing/ # Molecular structure preparation
part3_CDR_mapping/ # Binding site identification
part4_ai_mutation_design/ # Variant generation
part5_binding_affinity/ # Binding strength evaluation
part6_filtering_selection/ # Candidate optimization
part7_visualization_report/ # Results presentation
part8_github_pipeline/ # Documentation
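A quick way to confirm that a checkout matches this layout is to look for each directory. The sketch below assumes the eight directories sit directly under the repository root; the function name is hypothetical.

```python
from pathlib import Path

# Directory names as listed above, in pipeline order.
PART_DIRS = [
    "part1_HA_dataset",
    "part2_structure_preprocessing",
    "part3_CDR_mapping",
    "part4_ai_mutation_design",
    "part5_binding_affinity",
    "part6_filtering_selection",
    "part7_visualization_report",
    "part8_github_pipeline",
]


def missing_parts(repo_root):
    """Return the expected part directories that are absent under repo_root."""
    root = Path(repo_root)
    return [name for name in PART_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_parts(".")
    print("All parts present" if not missing else f"Missing: {missing}")
```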
The G1Y mutation (glycine to tyrosine at position 1, giving YGSTGDRH) was selected as the best variant after systematic evaluation of binding affinity, structural compatibility, and drug-like properties.
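The mutation notation can be checked mechanically. The sketch below parses a label of the form wild-type residue, 1-based position, new residue, and applies it to a sequence. The parent segment `GGSTGDRH` is an assumption inferred from the notation, with position 1 taken as the first residue of the 8-residue segment; the function name is illustrative.

```python
import re


def apply_point_mutation(sequence: str, mutation: str) -> str:
    """Apply a mutation such as 'G1Y' (wild type, 1-based position, new residue)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation label: {mutation!r}")
    wild_type, position, mutant = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[position - 1] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, found {sequence[position - 1]}"
        )
    return sequence[: position - 1] + mutant + sequence[position:]


parent = "GGSTGDRH"  # assumed parent segment, inferred from the G1Y label
print(apply_point_mutation(parent, "G1Y"))  # prints YGSTGDRH
```

The wild-type check guards against applying a label to the wrong sequence or with an off-by-one position.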
Ecenur Karagöl
B.Sc. Molecular Biology and Genetics
Specialization: Computational Biology, Structural Bioinformatics
Contact: karagollece@gmail.com