PLD-Tree: Persistent Laplacian Decision Tree for Protein-Protein Binding Free Energy Prediction
Abstract
Recent advances in topology-based modeling have accelerated progress in physical modeling and molecular studies, including applications to protein-ligand binding affinity. In this work, we introduce the Persistent Laplacian Decision Tree (PLD-Tree), a novel method for the challenging task of predicting protein-protein interaction (PPI) affinities. PLD-Tree focuses on protein chains at binding interfaces and employs the persistent Laplacian to capture topological invariants reflecting critical inter-protein interactions. These topological descriptors, derived from persistent homology, are further enhanced with evolutionary scale modeling (ESM) features from a large language model, which integrate sequence-based information. We validate PLD-Tree on two benchmark datasets, PDBbind V2020 and SKEMPI v2, obtaining a correlation coefficient ($R_p$) of 0.83 under stringent leave-out-protein-out cross-validation. Notably, our approach outperforms all reported state-of-the-art methods on these datasets. These results underscore the power of integrating machine learning techniques with topology-based descriptors for molecular docking and virtual screening, providing a robust and accurate framework for predicting protein-protein binding affinities.
- Publication: arXiv e-prints
- Pub Date: December 2024
- DOI:
- arXiv: arXiv:2412.18541
- Bibcode: 2024arXiv241218541X
- Keywords: Quantitative Biology - Biomolecules
- E-Print: 19 pages, 3 figures, 4 tables