Spatio-temporal learning from molecular dynamics simulations for protein–ligand binding affinity prediction
· 2025
· Open Access
· DOI: https://doi.org/10.1093/bioinformatics/btaf429
· OA: W4413344468
Motivation
The field of protein–ligand binding affinity prediction continues to face significant challenges. While deep learning (DL) models can leverage 3D structural information of protein–ligand complexes, they perform well only on heavily biased test sets containing information leaked from training sets. This lack of generalization arises from the limited availability of training data and the models’ inability to effectively learn from protein–ligand interactions. Since these interactions are inherently time-dependent, molecular dynamics (MD) simulations offer a potential solution by incorporating conformational sampling and providing interaction-rich information.

Results
We have developed MDbind, a dataset comprising 63 000 simulations of protein–ligand interactions, along with novel neural networks capable of learning from these simulations to predict binding affinity. By utilizing MD as data augmentation, our models achieved state-of-the-art performance on the PDBbind v.2016 core set and on an external test set, the free energy perturbation (FEP) dataset. Additionally, when trained on the full MD simulations, the models demonstrated less biased predictions.

Availability and implementation
The code for the neural networks is available at https://github.com/ICOA-SBC/MD_DL_BA. The models, the results and the training/validation/test sets are available for download at https://zenodo.org/records/10390550. The MDbind trajectories are being transferred to the MDDB: https://mmb-dev.mddbr.eu/#/browse?option=mdbind.
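
The idea of using MD as data augmentation, mentioned in the Results, can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). It assumes, purely for illustration, that a trajectory is a sequence of coordinate frames and that every sampled frame inherits the affinity label of its complex. The names `Frame`, `augment_with_md_frames`, `n_samples` and `seed` are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

# Hypothetical representation: one conformation as a list of (x, y, z) atom coordinates.
Frame = List[Tuple[float, float, float]]


def augment_with_md_frames(
    trajectory: Sequence[Frame],
    affinity: float,
    n_samples: int,
    seed: int = 0,
) -> List[Tuple[Frame, float]]:
    """Sample frames from one trajectory and pair each with the complex's label.

    Assumption: every frame of a simulation shares the affinity value
    of the protein-ligand complex it was started from.
    """
    rng = random.Random(seed)
    # Never request more frames than the trajectory contains.
    k = min(n_samples, len(trajectory))
    # Sample without replacement, then restore temporal order.
    indices = sorted(rng.sample(range(len(trajectory)), k))
    return [(trajectory[i], affinity) for i in indices]


# Toy trajectory: 5 frames of a 2-atom system with made-up coordinates.
toy_trajectory = [[(0.0 + t, 0.0, 0.0), (1.0 + t, 0.0, 0.0)] for t in range(5)]
samples = augment_with_md_frames(toy_trajectory, affinity=6.5, n_samples=3)
```

Each complex then contributes several labeled conformations to the training set instead of a single static structure. Learning from the full simulation, as the abstract also describes, would instead pass the ordered frames to the model as one sequence.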