MELISSA: Semi-Supervised Embedding for Protein Function Prediction Across Multiple Networks

Related Concepts

Embedding, Artificial intelligence, Function (biology), Computer science, Machine learning, Biology, Evolutionary biology

Kaiyi Wu, Di Zhou, Donna K. Slonim, Xiaozhe Hu, Lenore Cowen

2023 · Open Access · DOI: https://doi.org/10.1101/2023.08.09.552672 · OA: W4385794378

Motivation: Several popular methods exist to predict function from multiple protein-protein association networks. For example, both the Mashup algorithm, introduced by Cho, Peng, and Berger, and deepNF, introduced by Gligorijević, Barot, and Bonneau, first analyze the diffusion in each network to characterize the topological context of each node. In Mashup, the high-dimensional topological patterns in individual networks are canonically represented using low-dimensional vectors, one per gene or protein, to yield the multi-network embedding. In deepNF, a multimodal autoencoder is trained to extract common features across networks, which yield a low-dimensional embedding. Neither embedding takes known functional labels into account; the labels are used only by the machine learning methods applied after embedding.

Results: We introduce MELISSA (MultiNetwork Embedding with Label Integrated Semi-Supervised Augmentation), which incorporates functional labels in the embedding stage. The function labels induce sets of “must link” and “cannot link” constraints, which guide a further semi-supervised dimension reduction to yield an embedding that captures both the network topology and the information contained in the annotations. We find that the MELISSA embedding improves on both the Mashup and deepNF embeddings by creating more functionally enriched neighborhoods for predicting Gene Ontology (GO) labels in multiplex association networks in both yeast and humans.

Availability: MELISSA is available at https://github.com/XiaozheHu/melissa
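
To make the "must link" / "cannot link" idea concrete, the following is a minimal Python sketch and not the MELISSA implementation. It assumes that two annotated proteins are must-linked when they share at least one label and cannot-linked when they share none, and it substitutes a generic linear constraint-guided projection (pulling must-link pairs together, pushing cannot-link pairs apart, and otherwise preserving variance) for whatever reduction the authors actually use. The function names, the weights `alpha` and `beta`, and the toy data are all hypothetical.

```python
import numpy as np

def build_constraints(labels):
    """Derive pairwise constraints from known annotations.

    labels: dict mapping a node index to its set of function labels
            (only annotated nodes appear). Assumed rule, not taken from
            the paper: shared label -> must-link, no shared label -> cannot-link.
    """
    nodes = sorted(labels)
    must, cannot = [], []
    for pos, i in enumerate(nodes):
        for j in nodes[pos + 1:]:
            if labels[i] & labels[j]:
                must.append((i, j))
            else:
                cannot.append((i, j))
    return must, cannot

def constrained_projection(X, must, cannot, dim, alpha=1.0, beta=1.0):
    """Linear reduction of an existing embedding X (n x d) to dim columns.

    Maximizes sum_ij S_ij * ||W^T x_i - W^T x_j||^2 over orthonormal W, where
    S rewards spreading all pairs (variance), rewards separating cannot-link
    pairs, and penalizes separating must-link pairs.
    """
    n = X.shape[0]
    S = np.full((n, n), 1.0 / n**2)          # unsupervised variance term
    for i, j in cannot:
        S[i, j] += alpha / len(cannot)
        S[j, i] += alpha / len(cannot)
    for i, j in must:
        S[i, j] -= beta / len(must)
        S[j, i] -= beta / len(must)
    L = np.diag(S.sum(axis=1)) - S           # Laplacian of the weight matrix
    M = X.T @ L @ X
    M = (M + M.T) / 2                        # symmetrize against round-off
    vals, vecs = np.linalg.eigh(M)           # eigenvalues in ascending order
    W = vecs[:, np.argsort(vals)[::-1][:dim]]
    return X @ W

# Toy usage: 5 proteins with a 4-dimensional topology-only embedding,
# three of which carry (hypothetical) function labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))
labels = {0: {"A"}, 1: {"A"}, 2: {"B"}}
must, cannot = build_constraints(labels)
Z = constrained_projection(X, must, cannot, dim=2)
```

Unlabeled proteins (indices 3 and 4 here) contribute only through the variance term, which is what makes the reduction semi-supervised rather than fully supervised.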