Eric P. Xing
Beyond the Black Box: Identifiable Interpretation and Control in Generative Models via Causal Minimality
Deep generative models, while revolutionizing fields like image and text generation, largely operate as opaque black boxes, hindering human understanding, control, and alignment. While methods like sparse autoencoders (SAEs) show remarkabl…
PertAdapt: Unlocking Single-Cell Foundation Models for Genetic Perturbation Prediction via Condition-Sensitive Adaptation
Single-cell foundation models (FMs) pretrained on massive unlabeled scRNA-seq data show strong potential in predicting transcriptional responses to unseen genetic perturbations. However, existing approaches insufficiently transfer pretrain…
PAN: A World Model for General, Interactable, and Long-Horizon World Simulation
A world model enables an intelligent agent to imagine, predict, and reason about how the world evolves in response to its actions, and accordingly to plan and strategize. While recent video generation models produce realistic visual sequen…
Efficient Long-context Language Model Training by Core Attention Disaggregation
We present core attention disaggregation (CAD), a technique that improves long-context large language model training by decoupling the core attention computation, softmax(QK^T)V, from the rest of the model and executing it on a separate po…
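The abstract above names the exact operator that CAD decouples from the rest of the model: the core attention computation softmax(QK^T)V. As a minimal single-device sketch in plain NumPy (the disaggregated, multi-pool execution itself is not shown, and the scaling and masking details of the actual system are assumptions left out here):

```python
import numpy as np

def core_attention(Q, K, V):
    """Core attention softmax(Q K^T) V -- the computation CAD would
    execute on a separate pool of devices.
    Q, K, V: arrays of shape (seq_len, d); returns (seq_len, d)."""
    scores = Q @ K.T                                   # (seq_len, seq_len)
    scores -= scores.max(axis=-1, keepdims=True)       # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)     # row-wise softmax
    return weights @ V

rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((4, 8)) for _ in range(3))
out = core_attention(Q, K, V)
print(out.shape)  # (4, 8)
```

The point of isolating this operator is that, unlike the rest of the transformer layer, its cost grows quadratically with sequence length, which is what motivates running it on its own resource pool.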
Generative AI for Biosciences: Emerging Threats and Roadmap to Biosecurity
The rapid adoption of generative artificial intelligence (GenAI) in the biosciences is transforming biotechnology, medicine, and synthetic biology. Yet this advancement is intrinsically linked to new vulnerabilities, as GenAI lowers the ba…
PRISM: Enhancing Protein Inverse Folding through Fine-Grained Retrieval on Structure-Sequence Multimodal Representations
Designing protein sequences that fold into a target three-dimensional structure, known as the inverse folding problem, is central to protein engineering but remains challenging due to the vast sequence space and the importance of local str…
Step-Aware Policy Optimization for Reasoning in Diffusion Large Language Models
Diffusion language models (dLLMs) offer a promising, non-autoregressive paradigm for text generation, yet training them for complex reasoning remains a key challenge. Current reinforcement learning approaches often rely on sparse, outcome-…
Response to Promises and Pitfalls of Deep Kernel Learning
This note responds to "Promises and Pitfalls of Deep Kernel Learning" (Ober et al., 2021). The marginal likelihood of a Gaussian process can be compartmentalized into a data fit term and a complexity penalty. Ober et al. (2021) shows that …
Prophylactic VA-ECMO During Complex High-Risk PCI
In this single-center study of patients undergoing elective PCI of complex high-risk coronary lesions, prophylactic VA-ECMO was associated with lower rates of life-threatening complications and a larger reduction in SYNTAX scores. Larger stu…
Vision-G1: Towards General Vision Language Reasoning with Multi-Domain Data Curation
Despite their success, current training pipelines for reasoning VLMs focus on a limited range of tasks, such as mathematical and logical reasoning. As a result, these models face difficulties in generalizing their reasoning capabilities to…
Data Mixing Optimization for Supervised Fine-Tuning of Large Language Models
Optimizing data mixtures for supervised fine-tuning (SFT) of large language models (LLMs) is critical for developing general-purpose models, yet this area remains underexplored. In this paper, we frame data mixing as an optimization proble…
SimuRA: A World-Model-Driven Simulative Reasoning Architecture for General Goal-Oriented Agents
AI agents built on foundation models hold enormous promise. Current practice, however, focuses on a one-task-one-agent approach, which not only falls short of scalability and generality, but also faces practical limitations from black-box …
AIDO.Tissue: Spatial Cell-Guided Pretraining for Scalable Spatial Transcriptomics Foundation Model
Single-cell spatial transcriptomics enables high-resolution insights into tissue organization and cell-cell interactions, yet poses significant computational and modeling challenges due to its scale and complexity. Here we introduce AIDO.T…
Nile-Chat: Egyptian Language Models for Arabic and Latin Scripts
We introduce Nile-Chat-4B, 3x4B-A6B, and 12B, a collection of LLMs for the Egyptian dialect, uniquely designed to understand and generate text written in both Arabic and Latin scripts. Specifically, with Nile-Chat-3x4B-A6B, we introduce a nov…
Rapid and Reproducible Multimodal Biological Foundation Model Development with AIDO.ModelGenerator
Foundation models (FMs) for DNA, RNA, proteins, cells, and tissues have begun to close long-standing performance gaps in biological prediction tasks, yet each modality is usually studied in isolation. Bridging them requires software that c…
Uncertainty-Aware Discrete Diffusion Improves Protein Design
Protein inverse folding involves generating amino acid sequences that adopt a specified 3D structure—a key challenge in structural biology and molecular engineering. While discrete diffusion models have demonstrated strong performance, exi…
Multimodal Benchmarking of Foundation Model Representations for Cellular Perturbation Response Prediction
The decreasing cost of single-cell RNA sequencing (scRNA-seq) has enabled the collection of massive scRNA-seq datasets, which are now being used to train transformer-based cell foundation models (FMs). One of the most promising application…
Global and Local Entailment Learning for Natural World Imagery
Learning the hierarchical structure of data in vision-language models is a significant challenge. Previous works have attempted to address this challenge by employing entailment learning. However, these approaches fail to model the transit…
Reassessing immediate coal phase-out: Dual imperatives of capacity control and renewables expansion in China’s net-zero strategy
Immediate cessation of investments in new coal-fired power plants is widely regarded as a crucial measure for China to achieve net-zero emissions. However, there is a lack of systematic evaluations regarding the stringent removal of coa…
Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective
Reinforcement learning (RL) has emerged as a promising approach to improve large language model (LLM) reasoning, yet most open efforts focus narrowly on math and code, limiting our understanding of its broader applicability to general reas…
Pruning Spurious Subgraphs for Graph Out-of-Distribution Generalization
Graph Neural Networks (GNNs) often encounter significant performance degradation under distribution shifts between training and test data, hindering their applicability in real-world scenarios. Recent studies have proposed various methods …
Log-Linear Attention
The attention mechanism in Transformers is an important primitive for accurate and scalable sequence modeling. However, its quadratic compute and linear memory complexity remain significant bottlenecks. Linear attention and state-space mode…
Esoteric Language Models
Diffusion-based language models offer a compelling alternative to autoregressive (AR) models by enabling parallel and controllable generation. Among this family of models, Masked Diffusion Models (MDMs) achieve the strongest performance bu…
ConText-CIR: Learning from Concepts in Text for Composed Image Retrieval
Composed image retrieval (CIR) is the task of retrieving a target image specified by a query image and a relative text that describes a semantic modification to the query image. Existing methods in CIR struggle to accurately represent the …
QuARI: Query Adaptive Retrieval Improvement
Massive-scale pretraining has made vision-language models increasingly popular for image-to-image and text-to-image retrieval across a broad collection of domains. However, these models do not perform well when used for challenging retriev…
Learning to estimate sample-specific transcriptional networks for 7,000 tumors
Cancers are shaped by somatic mutations, microenvironment, and patient background, each altering gene expression and regulation in complex ways, resulting in heterogeneous cellular states and dynamics. Inferring gene regulatory networks (G…
lmgame-Bench: How Good are LLMs at Playing Games?
Playing video games requires perception, memory, and planning, exactly the faculties modern large language model (LLM) agents are expected to master. We study the major challenges in using popular video games to evaluate modern LLMs and fi…
Decentralized Arena: Towards Democratic and Scalable Automatic Evaluation of Language Models
The recent explosion of large language models (LLMs), each with its own general or specialized strengths, makes scalable, reliable benchmarking more urgent than ever. Standard practices nowadays face fundamental trade-offs: closed-ended qu…
A Large-Scale Foundation Model for RNA Enables Diverse Function and Structure Prediction
Accurately predicting RNA structures and functions from nucleotide sequences, or conversely, designing sequences to meet structural and functional requirements, remains a fundamental challenge in RNA biology, largely due to limited annotat…
Myocardial flow reserve derived from D-SPECT for evaluating non-culprit ischemic lesions in STEMI patients: comparison with quantitative flow ratio