Quanquan Gu
Higher-order Linear Attention
The quadratic cost of scaled dot-product attention is a central obstacle to scaling autoregressive language models to long contexts. Linear-time attention and State Space Models (SSMs) provide scalable alternatives but are typically restri…
Robust Layerwise Scaling Rules by Proper Weight Decay Tuning
Empirical scaling laws prescribe how to allocate parameters, data, and compute, while maximal-update parameterization ($μ$P) enables learning-rate transfer across widths by equalizing early-time update magnitudes. However, in modern scale-…
Light intensity exercise and sleep: Indications of enhanced oxygen diffusion, perfusion, and metabolism
Background Hypoxia underlies or complicates a wide range of chronic conditions. Previous research suggests slower-paced exercises may develop states of relaxation, combined with enhanced respiration, which may trigger increased oxygen perf…
A paradigm shift in light intensity exercise and sleep: Indications of enhanced oxygen diffusion and metabolism and implications for healing and cellular regeneration
Background Hypoxia underlies or complicates a wide range of chronic conditions. Light-intensity exercises may develop states of relaxation, combined with enhanced respiration, which may trigger accelerated diffusion and facilitated oxygen …
Causal Attention with Lookahead Keys
In standard causal attention, each token's query, key, and value (QKV) are static and encode only preceding context. We introduce CAuSal aTtention with Lookahead kEys (CASTLE), an attention mechanism that continually updates each token's k…
Domain Specialization as the Key to Make Large Language Models Disruptive: A Comprehensive Survey
Large language models (LLMs) have significantly advanced the field of natural language processing (NLP), providing a highly useful, task-agnostic foundation for a wide range of applications. However, directly applying LLMs to solve sophist…
The Inaugural Flatiron Institute Cryo-EM Conformational Heterogeneity Challenge
Despite the rise of single particle cryo-electron microscopy (cryo-EM) as a premier method for resolving macromolecular structures at atomic resolution, methods to address molecular heterogeneity in vitrified samples have yet to reach matu…
A Paradigm Shift in Walking, Sleep, and Exercise: Unique Effects on Blood Oxygen Saturation, Oxygen Diffusion, and Cellular Metabolism
Hypoxia underlies or complicates a wide range of chronic conditions, including cancer, arthritis, chronic pain, multiple sclerosis, stroke, chronic kidney disease, diabetes and more. Research is presented supporting indications that slower…
Beyond-Expert Performance with Limited Demonstrations: Efficient Imitation Learning with Double Exploration
Imitation learning is a central problem in reinforcement learning where the goal is to learn a policy that mimics the expert's behavior. In practice, it is often challenging to learn the expert policy from a limited number of demonstration…
SwingArena: Competitive Programming Arena for Long-context GitHub Issue Solving
We present SwingArena, a competitive evaluation framework for Large Language Models (LLMs) that closely mirrors real-world software development workflows. Unlike traditional static benchmarks, SwingArena models the collaborative process of…
On the Design of KL-Regularized Policy Gradient Algorithms for LLM Reasoning
Policy gradient algorithms have been successfully applied to enhance the reasoning capabilities of large language models (LLMs). KL regularization is ubiquitous, yet the design surface, choice of KL direction (forward vs. reverse), normali…
Stimulus-responsive cellulose hydrogels in biomedical applications and challenges
Stimuli-responsive cellulose hydrogels have garnered significant attention in the biomedical field owing to their extensive applications in tissue engineering and controlled drug delivery systems. Derived from cellulose and its derivatives…
Global Convergence and Rich Feature Learning in $L$-Layer Infinite-Width Neural Networks under $μ$P Parametrization
Despite deep neural networks' powerful representation learning capabilities, theoretical understanding of how networks can simultaneously achieve meaningful feature learning and global convergence remains elusive. Existing approaches like …
The Potential Impact of Antibiotic Exposure on the Microbiome and Human Health
Antibiotics are a cornerstone of modern medicine, saving countless lives. However, their widespread use presents two major challenges. First, antibiotic-induced changes in the microbiome can disrupt immune function, increasing the suscepti…
CryoSTAR: Leveraging Structural Prior and Constraints for Cryo-EM Heterogeneous Reconstruction
Resolving conformational heterogeneity in cryo-electron microscopy (cryo-EM) datasets remains a significant challenge in structural biology. Previous methods have often been restricted to working exclusively on volumetric densities, neglec…
Understanding SGD with Exponential Moving Average: A Case Study in Linear Regression
Exponential moving average (EMA) has recently gained significant popularity in training modern deep learning models, especially diffusion-based generative models. However, there have been few theoretical results explaining the effectivenes…
Multi-Step Alignment as Markov Games: An Optimistic Online Gradient Descent Approach with Convergence Guarantees
Reinforcement Learning from Human Feedback (RLHF) has been highly successful in aligning large language models with human preferences. While prevalent methods like DPO have demonstrated strong performance, they frame interactions with the …
Logarithmic Regret for Online KL-Regularized Reinforcement Learning
Recent advances in Reinforcement Learning from Human Feedback (RLHF) have shown that KL-regularization plays a pivotal role in improving the efficiency of RL fine-tuning for large language models (LLMs). Despite its empirical advantage, th…
Towards a Sharp Analysis of Offline Policy Learning for $f$-Divergence-Regularized Contextual Bandits
Although many popular reinforcement learning algorithms are underpinned by $f$-divergence regularization, their sample complexity with respect to the \emph{regularized objective} still lacks a tight characterization. In this paper, we anal…
Seeing through “brain fog”: neuroimaging assessment and imaging biomarkers for cancer-related cognitive impairments
Advances in cancer diagnosis and treatment have substantially improved patient outcomes and survival in recent years. However, up to 75% of cancer patients and survivors, including those with non-central nervous system (non-CNS) cancers, s…
MARS: Unleashing the Power of Variance Reduction for Training Large Models
Training deep neural networks--and, more recently, large models--demands efficient and scalable optimizers. Adaptive gradient algorithms like Adam, AdamW, and their variants have been central to this task. Despite the development of numerous…
ProteinWeaver: A Divide-and-Assembly Approach for Protein Backbone Design
Nature creates diverse proteins through a 'divide and assembly' strategy. Inspired by this idea, we introduce ProteinWeaver, a two-stage framework for protein backbone design. Our method first generates individual protein domains and then …
Sharp Analysis for KL-Regularized Contextual Bandits and RLHF
Reverse Kullback-Leibler (KL) regularization has emerged as a predominant technique used to enhance policy optimization in reinforcement learning (RL) and reinforcement learning from human feedback (RLHF), which forces the learned polic…
CoPS: Empowering LLM Agents with Provable Cross-Task Experience Sharing
Sequential reasoning in agent systems has been significantly advanced by large language models (LLMs), yet existing approaches face limitations. Reflection-driven reasoning relies solely on knowledge in pretrained models, limiting performa…
Unified Convergence Analysis for Score-Based Diffusion Models with Deterministic Samplers
Score-based diffusion models have emerged as powerful techniques for generating samples from high-dimensional data distributions. These models involve a two-phase process: first, injecting noise to transform the data distribution into a kn…
DPLM-2: A Multimodal Diffusion Protein Language Model
Proteins are essential macromolecules defined by their amino acid sequences, which determine their three-dimensional structures and, consequently, their functions in all living organisms. Therefore, generative protein modeling necessitates…
CryoFM: A Flow-based Foundation Model for Cryo-EM Densities
Cryo-electron microscopy (cryo-EM) is a powerful technique in structural biology and drug discovery, enabling the study of biomolecules at high resolution. Significant advancements by structural biologists using cryo-EM have led to the pro…
Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization
Reinforcement Learning (RL) plays a crucial role in aligning large language models (LLMs) with human preferences and improving their ability to perform complex tasks. However, current approaches either require significant computational res…
Accelerated Preference Optimization for Large Language Model Alignment
Reinforcement Learning from Human Feedback (RLHF) has emerged as a pivotal tool for aligning large language models (LLMs) with human preferences. Direct Preference Optimization (DPO), one of the most popular approaches, formulates RLHF as …
Redox-Detecting Deep Learning for Mechanism Discernment in Cyclic Voltammograms of Multiple Redox Events
In electrochemical analysis, mechanism assignment is fundamental to understanding the chemistry of a system. The detection and classification of electrochemical mechanisms in cyclic voltammetry set the foundation for subsequent quantitativ…