Kinase-substrate prediction using an autoregressive model
2025 · Open Access
· DOI: https://doi.org/10.1016/j.csbj.2025.03.003
Kinase-specific phosphorylation plays a critical role in cellular signaling and in various diseases. However, even in model organisms, the substrates of most kinases remain unidentified, and there is currently no reliable method for predicting kinase-substrate relationships. In this study, we introduce an approach that uses an autoregressive model to predict kinase-substrate pairs. Unlike traditional methods, which focus on predicting site-specific phosphorylation, our approach predicts kinase-specific substrates at the protein level. We reframe this problem as a special type of protein-protein interaction prediction task. Our model uses the protein language model ESM-2 as the encoder and an autoregressive decoder to classify protein-kinase interactions in a binary fashion. We adopted a hard negative strategy, based on distances between kinase embeddings generated by ESM-2, to compel the model to distinguish positive from negative data effectively. We conducted a top-k analysis to assess how well our model prioritizes the most likely kinase candidates. Our method is also capable of zero-shot prediction, meaning it can predict substrates for a kinase that has no known substrates, which site-specific prediction methods cannot do. The model's robust generalization to novel kinases and underrepresented groups showcases its versatility and broad utility. Code and data are available at https://github.com/farz1995/substrate_kinase_prediction.
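The hard negative strategy and the top-k analysis mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that "hard" negatives are the kinases whose embeddings lie closest (by Euclidean distance) to a substrate's true kinase, and that a top-k hit means at least one true kinase appears among the k highest-scoring candidates. The function names, kinase labels, 2-dimensional toy vectors, and scores are all hypothetical stand-ins; real ESM-2 embeddings are much higher-dimensional.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hard_negatives(embeddings, positive_kinase, n, known_kinases=frozenset()):
    """Pick the n kinases nearest to positive_kinase in embedding space.

    Assumption: nearby kinases make the hardest negatives. Kinases listed in
    known_kinases (other true kinases of the same substrate) are skipped so
    that a real positive is never labeled as a negative.
    """
    anchor = embeddings[positive_kinase]
    candidates = [
        name for name in embeddings
        if name != positive_kinase and name not in known_kinases
    ]
    # Sort by distance to the anchor; break ties by name for determinism.
    candidates.sort(key=lambda name: (euclidean(anchor, embeddings[name]), name))
    return candidates[:n]


def top_k_hit(scores, true_kinases, k):
    """Return True if any true kinase is among the k highest-scoring kinases."""
    ranked = sorted(scores, key=lambda name: (-scores[name], name))
    return any(name in true_kinases for name in ranked[:k])


# Hypothetical toy embeddings standing in for ESM-2 kinase embeddings.
toy_embeddings = {
    "KIN_A": [0.0, 0.0],
    "KIN_B": [0.1, 0.0],
    "KIN_C": [1.0, 1.0],
    "KIN_D": [0.0, 0.3],
}
negatives = hard_negatives(toy_embeddings, "KIN_A", n=2)

# Hypothetical model scores for one substrate whose true kinase is KIN_C.
toy_scores = {"KIN_A": 0.2, "KIN_B": 0.9, "KIN_C": 0.5, "KIN_D": 0.1}
hit_at_1 = top_k_hit(toy_scores, {"KIN_C"}, k=1)
hit_at_2 = top_k_hit(toy_scores, {"KIN_C"}, k=2)
```

With these toy values, the two negatives chosen for KIN_A are its nearest neighbors KIN_B and KIN_D, and the true kinase KIN_C is missed at k=1 but recovered at k=2. Averaging such hits over many substrates would give a top-k accuracy under the assumptions stated above.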
- Type: article
- Language: en
- Landing Page: https://doi.org/10.1016/j.csbj.2025.03.003
- OA Status: gold
- References: 38
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4408254815