Rohit Singh
Transforming Biological Foundation Model Representations for Out-of-Distribution Data Open
Foundation models pre-trained on certain biological data modalities exhibit systematic representational biases when encountering out-of-distribution (OOD) data from new assays. The embedding drift largely arises from instrumentation and pr…
Iterative immunogen optimization to focus immune responses on a conserved, subdominant viral epitope Open
Designing effective vaccination strategies against genetically diverse viruses, such as HIV or influenza, is hindered by the ability of these pathogens to mutate and readily evade immune control. While these viruses contain conserved regio…
Unveiling causal regulatory mechanisms through cell-state parallax Open
Genome-wide association studies (GWAS) identify numerous disease-linked genetic variants at noncoding genomic loci, yet therapeutic progress is hampered by the challenge of deciphering the regulatory roles of these loci in tissue-specific …
Comparative analysis of the syncytiotrophoblast in placenta tissue and trophoblast organoids using snRNA sequencing Open
The syncytiotrophoblast (STB) is a multinucleated cell layer that forms the outer surface of human chorionic villi. Its unusual structure, with billions of nuclei in a single cell, makes it difficult to resolve using conventional single-ce…
View article: Decoding the causal drivers of spatial cellular topology
Decoding the causal drivers of spatial cellular topology Open
Decoding how cells influence and communicate with each other in space is fundamental for understanding tissue organization. However, existing approaches either overlook spatial context entirely or rely solely on local cell-cell adjacency, …
Point-of-Care No-Specimen Diagnostic Platform Using Machine Learning and Raman Spectroscopy: Proof-of-Concept Studies for Both COVID-19 and Blood Glucose Open
Significance: We describe a novel, specimen-free diagnostic platform that can immediately detect both a metabolite (glucose) or an infection (COVID-19) by non-invasively using Raman spectroscopy and machine learning. Aim: Current diagnosti…
Topology-driven discovery of transmembrane protein S-palmitoylation Open
Protein S-palmitoylation is a reversible lipophilic posttranslational modification regulating diverse signaling pathways. Within transmembrane proteins (TMPs), S-palmitoylation is implicated in conditions from inflammatory disorders to res…
Learning a CoNCISE language for small-molecule binding Open
Rapid advances in deep learning have improved in silico methods for drug-target interaction (DTI) prediction. However, current methods do not scale to the massive catalogs that list millions or billions of commercially-available small mole…
Learning the language of antibody hypervariability Open
Protein language models (PLMs) have demonstrated impressive success in modeling proteins. However, general-purpose “foundational” PLMs have limited performance in modeling antibodies due to the latter’s hypervariable regions, which do not …
Aggregating residue-level protein language model embeddings with optimal transport Open
Motivation Protein language models (PLMs) have emerged as powerful approaches for mapping protein sequences into embeddings suitable for various applications. As protein representation schemes, PLMs generate per-token (i.e. per-residue) re…
Harnessing Artificial Intelligence in Dentistry: Enhancing Patient Care and Diagnostic Precision Open
Decoding the Functional Interactome of Non-Model Organisms with PHILHARMONIC Open
Despite the widespread availability of genome sequencing pipelines, many genes remain part of the genome’s “dark matter,” where existing inference tools cannot even begin to guess the biological function of their proteins from sequence alo…
Point-of-Care No-Specimen Diagnostic Platform using Machine Learning and Raman Spectroscopy: Proof-of-Concept Studies for Both COVID-19 and Blood Glucose Open
Significance We describe a novel, specimen-free diagnostic platform that can immediately detect both a metabolite (glucose) or an infection (COVID-19), by non-invasively using Raman spectroscopy and machine learning. Aim Current diagnostic…
Topology-Driven Discovery of Transmembrane Protein S-Palmitoylation Open
Protein S-palmitoylation is a reversible lipophilic posttranslational modification regulating a diverse number of signaling pathways. Within transmembrane proteins (TMPs), S-palmitoylation is implicated in conditions from inflammatory di…
Miniaturizing, Modifying, and Magnifying Nature’s Proteins with Raygun Open
Proteins have evolved over billions of years through extensive and coordinated substitutions, insertions and deletions (indels). Computational protein design cannot yet fully mimic nature’s ability to engineer new proteins from existing te…
Aggregating Residue-Level Protein Language Model Embeddings with Optimal Transport Open
Protein language models (PLMs) have emerged as powerful approaches for mapping protein sequences into embeddings suitable for various applications. As protein representation schemes, PLMs generate per-token (i.e., per-residue) representati…
Local transcriptional covariation produces accurate estimates of cell phenotype Open
The utility of single-cell RNA sequencing (scRNA-seq) is premised on the notion that transcriptional state can faithfully reflect cell phenotype. However, scRNA-seq measurements are noisy and sparse, with individual transcript counts showi…
TT3D: Leveraging precomputed protein 3D sequence models to predict protein–protein interactions Open
Motivation High-quality computational structural models are now precomputed and available for nearly every protein in UniProt. However, the best way to leverage these models to predict which pairs of proteins interact in a high-throughput …
Machine learning and multinomal models: A simulation study Open
In recent times, the use of machine learning in academic research has become a more and more attractive option, an option that is becoming increasingly popular in the field of econometric research. A significant amount of econometric litera…
Contrastive learning in protein language space predicts interactions between drugs and protein targets Open
Sequence-based prediction of drug–target interactions has the potential to accelerate drug discovery by complementing experimental screens. Such computational prediction needs to be generalizable and scalable while remaining sensitive to s…
split-intein Gal4 provides intersectional genetic labeling that is repressible by Gal80 Open
The split-Gal4 system allows for intersectional genetic labeling of highly specific cell types and tissues in Drosophila. However, the existing split-Gal4 system, unlike the standard Gal4 system, cannot be repressed by Gal80, and therefor…
Learning the Language of Antibody Hypervariability Open
Protein language models (PLMs) based on machine learning have demonstrated impressive success in predicting protein structure and function. However, general-purpose (“foundational”) PLMs have limited performance in predicting antibodies d…
split-intein Gal4 provides intersectional genetic labeling that is fully repressible by Gal80 Open
The split-Gal4 system allows for intersectional genetic labeling of highly specific cell-types and tissues in Drosophila. However, the existing split-Gal4 system, unlike the standard Gal4 system, cannot be repressed by Gal80, and therefor…
Transfer of knowledge from model organisms to evolutionarily distant non-model organisms: The coral Pocillopora damicornis membrane signaling receptome Open
With the ease of gene sequencing and the technology available to study and manipulate non-model organisms, the extension of the methodological toolbox required to translate our understanding of model organisms to non-model organisms has be…
Learning the Drug-Target Interaction Lexicon Open
Sequence-based prediction of drug-target interactions has the potential to accelerate drug discovery by complementing experimental screens. Such computational prediction needs to be generalizable and scalable while remaining sensitive to s…
Adapting protein language models for rapid DTI prediction Open
We consider the problem of sequence-based drug-target interaction (DTI) prediction, showing that a straightforward deep learning architecture that leverages pre-trained protein language models (PLMs) for protein embedding outperforms state…
Contrasting drugs from decoys Open
Protein language models (PLMs) have recently been proposed to advance drug-target interaction (DTI) prediction, and have shown state-of-the-art performance on several standard benchmarks. However, a remaining challenge for all DTI predictio…
Causal gene regulatory analysis with RNA velocity reveals an interplay between slow and fast transcription factors Open
Single-cell expression dynamics from differentiation trajectories or RNA velocity have the potential to reveal causal links between transcription factors (TFs) and their target genes in gene regulatory networks (GRNs). However, existing me…
Causally-guided Regularization of Graph Attention Improves Generalizability Open
Graph attention networks estimate the relational importance of node neighbors to aggregate relevant information over local neighborhoods for a prediction task. However, the inferred attentions are vulnerable to spurious correlations and co…