Xipeng Qiu
Revealing emergent human-like conceptual representations from language prediction
People acquire concepts through rich physical and social experiences and use them to understand and navigate the world. In contrast, large language models (LLMs), trained solely through next-token prediction on text, exhibit strikingly hum…
Trajectories of cumulative fluid balance and the association with pressure injuries in ICU patients
Four distinct trajectories of cumulative fluid balance (CFB) were identified among ICU patients: rapid accumulation, slow accumulation, neutral balance, and negative decrease. Rapid accumulation independently increased the risk of pressure injuries (PIs) during the ICU stay.
ARISE: An Adaptive Resolution-Aware Metric for Test-Time Scaling Evaluation in Large Reasoning Models
Test-time scaling has emerged as a transformative paradigm for enhancing the performance of large reasoning models, enabling dynamic allocation of computational resources during inference. However, as the landscape of reasoning models rapi…
MCM-DPO: Multifaceted Cross-Modal Direct Preference Optimization for Alt-text Generation
The alt-text generation task produces concise, context-relevant descriptions of images, enabling blind and low-vision users to access online images. Despite the capabilities of large vision-language models, alt-text generation performance …
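MCM-DPO builds on Direct Preference Optimization. For orientation, here is a minimal PyTorch sketch of the standard pairwise DPO objective that such cross-modal variants extend; the tensor names are illustrative, and the paper's multifaceted formulation is not reproduced:

```python
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Standard pairwise DPO loss; inputs are summed sequence log-probs, shape (batch,)."""
    chosen_ratio = policy_chosen_logps - ref_chosen_logps        # log pi/pi_ref, preferred alt-text
    rejected_ratio = policy_rejected_logps - ref_rejected_logps  # log pi/pi_ref, dispreferred
    # Maximize the implicit-reward margin between preferred and dispreferred outputs.
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()
```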
SIM-CoT: Supervised Implicit Chain-of-Thought
Implicit Chain-of-Thought (CoT) methods offer a token-efficient alternative to explicit CoT reasoning in Large Language Models (LLMs), but a persistent performance gap has limited their adoption. We identify a core latent instability issue…
Evolution of Concepts in Language Model Pre-Training
Language models obtain extensive capabilities through pre-training. However, the pre-training process remains a black box. In this work, we track linear interpretable feature evolution across pre-training snapshots using a sparse dictionar…
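Sparse dictionary learning over model activations is commonly implemented as a sparse autoencoder. A minimal PyTorch sketch of that generic setup, assuming a ReLU encoder and an L1 sparsity penalty; the paper's exact architecture may differ:

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Sparse dictionary: activations -> sparse codes -> linear reconstruction."""
    def __init__(self, d_model, n_features):
        super().__init__()
        self.encoder = nn.Linear(d_model, n_features)
        self.decoder = nn.Linear(n_features, d_model, bias=False)  # columns are dictionary atoms

    def forward(self, x):
        codes = torch.relu(self.encoder(x))  # sparse, non-negative feature activations
        return self.decoder(codes), codes

def sae_loss(recon, x, codes, l1_coeff=1e-3):
    # Reconstruction error plus an L1 penalty encouraging sparse codes.
    return (recon - x).pow(2).mean() + l1_coeff * codes.abs().mean()
```

Tracking feature evolution across pre-training snapshots then amounts, roughly, to fitting or applying such an encoder at each checkpoint.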
Decoupled Proxy Alignment: Mitigating Language Prior Conflict for Multimodal Alignment in MLLM
Multimodal large language models (MLLMs) have gained significant attention due to their impressive ability to integrate vision and language modalities. Recent advancements in MLLMs have primarily focused on improving performance through hi…
UnifiedVisual: A Framework for Constructing Unified Vision-Language Datasets
Unified vision large language models (VLLMs) have recently achieved impressive advancements in both multimodal understanding and generation, powering applications such as visual question answering and text-guided image synthesis. However, …
VStyle: A Benchmark for Voice Style Adaptation with Spoken Instructions
Spoken language models (SLMs) have emerged as a unified paradigm for speech understanding and generation, enabling natural human-machine interaction. However, while most progress has focused on semantic accuracy and instruction following, …
VehicleWorld: A Highly Integrated Multi-Device Environment for Intelligent Vehicle Interaction
Intelligent vehicle cockpits present unique challenges for API Agents, requiring coordination across tightly-coupled subsystems that exceed typical task environments' complexity. Traditional Function Calling (FC) approaches operate statele…
Inference-Time Alignment Control for Diffusion Models with Reinforcement Learning Guidance
Denoising-based generative models, particularly diffusion and flow matching algorithms, have achieved remarkable success. However, aligning their output distributions with complex downstream objectives, such as human preferences, compositi…
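One simple way to expose such an inference-time control knob is to blend the predictions of a base denoiser and an RL-aligned denoiser, in the spirit of classifier-free guidance. A hedged sketch of that generic idea only; the function and weighting scheme are illustrative, not the paper's formulation:

```python
import torch

def guided_eps(eps_base: torch.Tensor, eps_aligned: torch.Tensor, w: float) -> torch.Tensor:
    # w = 0 recovers the base model, w = 1 the RL-aligned model,
    # and w > 1 extrapolates toward the alignment signal. Illustrative only.
    return eps_base + w * (eps_aligned - eps_base)
```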
CodecBench: A Comprehensive Benchmark for Acoustic and Semantic Evaluation
With the rise of multimodal large language models (LLMs), audio codecs play an increasingly vital role in encoding audio into discrete tokens, enabling the integration of audio into text-based LLMs. Current audio codecs capture two types of in…
FutureX: An Advanced Live Benchmark for LLM Agents in Future Prediction
Future prediction is a complex task for LLM agents, requiring a high level of analytical thinking, information gathering, contextual understanding, and decision-making under uncertainty. Agents must not only gather and interpret vast amoun…
Sparse-dLLM: Accelerating Diffusion LLMs with Dynamic Cache Eviction
Diffusion Large Language Models (dLLMs) enable breakthroughs in reasoning and parallel decoding but suffer from prohibitive quadratic computational complexity and memory overhead during inference. Current caching techniques accelerate deco…
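A common building block in such caching schemes is to score cached tokens by the attention they receive and evict the lowest-scoring ones. A hedged PyTorch sketch of that generic policy; the paper's actual eviction criterion may differ:

```python
import torch

def evict_kv_cache(keys, values, attn_scores, keep_ratio=0.5):
    """Keep the cached positions that received the most attention.

    keys/values: (seq_len, d); attn_scores: (n_queries, seq_len). Illustrative only.
    """
    importance = attn_scores.sum(dim=0)                     # total attention per cached token
    k = max(1, int(keep_ratio * keys.shape[0]))
    keep = torch.topk(importance, k).indices.sort().values  # preserve original token order
    return keys[keep], values[keep]
```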
Dynamic and Generalizable Process Reward Modeling
Process Reward Models (PRMs) are crucial for guiding Large Language Models (LLMs) in complex scenarios by providing dense reward signals. However, existing PRMs primarily rely on heuristic approaches, which struggle with cross-domain gener…
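Whatever the scoring rule, a PRM is consumed the same way: it assigns a reward to each prefix of a reasoning trace. A minimal sketch treating the PRM as an opaque callable; the interface is an assumption for illustration:

```python
def score_trajectory(prm, question, steps):
    """Dense per-step rewards from a process reward model.

    Assumed interface: prm(question, steps_so_far) -> float.
    """
    return [prm(question, steps[:t]) for t in range(1, len(steps) + 1)]
```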
Pre-Trained Policy Discriminators are General Reward Models
We offer a novel perspective on reward modeling by formulating it as a policy discriminator, which quantifies the difference between two policies to generate a reward signal, guiding the training policy towards a target policy with desired…
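Read literally, one natural instantiation of such a reward is the log-likelihood ratio of a response under the target policy versus the current training policy. A hedged sketch of that reading only; the paper's exact formulation is not reproduced here:

```python
import torch

def discriminator_reward(target_logps: torch.Tensor, policy_logps: torch.Tensor,
                         beta: float = 1.0) -> torch.Tensor:
    """Reward a response by how much more likely the target policy finds it
    than the current policy does. Inputs: summed sequence log-probs, shape (batch,).
    One plausible instantiation, for illustration only."""
    return beta * (target_logps - policy_logps)
```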
Unleashing Embodied Task Planning Ability in LLMs via Reinforcement Learning
Large Language Models (LLMs) have demonstrated remarkable capabilities across various tasks, yet they face significant challenges in embodied task planning scenarios that require continuous environmental understanding and action generation…
World-aware Planning Narratives Enhance Large Vision-Language Model Planner
Large Vision-Language Models (LVLMs) show promise for embodied planning tasks but struggle with complex scenarios involving unfamiliar environments and multi-step goals. Current approaches rely on environment-agnostic imitation learning th…
Implicit Reward as the Bridge: A Unified View of SFT and DPO Connections
Post-training processes are essential phases in grounding pre-trained language models to real-world tasks, with learning from demonstrations or preference signals playing a crucial role in this adaptation. We present a unified theoretical …
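The bridge here is presumably DPO's implicit reward, a known identity from the original DPO derivation: r(x, y) = beta * log(pi_theta(y|x) / pi_ref(y|x)). In code:

```python
import torch

def implicit_reward(policy_logps: torch.Tensor, ref_logps: torch.Tensor,
                    beta: float = 0.1) -> torch.Tensor:
    """DPO's implicit reward: r(x, y) = beta * log(pi_theta(y|x) / pi_ref(y|x)).

    Inputs are summed token log-probabilities of y given x, shape (batch,).
    SFT can then be read as pushing this same quantity up on demonstrations alone.
    """
    return beta * (policy_logps - ref_logps)
```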
Beyond Homogeneous Attention: Memory-Efficient LLMs via Fourier-Approximated KV Cache
Large Language Models struggle with memory demands from the growing Key-Value (KV) cache as context lengths increase. Existing compression methods homogenize head dimensions or rely on attention-guided token pruning, often sacrificing accu…
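As rough intuition for a Fourier-approximated cache: keep only the low-frequency rFFT coefficients of the cache along the sequence axis and reconstruct approximately. A hedged sketch of that intuition, not the paper's actual scheme:

```python
import torch

def fourier_compress(cache: torch.Tensor, keep_frac: float = 0.25) -> torch.Tensor:
    """Lossy Fourier approximation of a cache tensor of shape (seq_len, d).

    Zeroes all but the lowest-frequency coefficients along the sequence axis.
    """
    seq_len = cache.shape[0]
    coeffs = torch.fft.rfft(cache, dim=0)
    k = max(1, int(keep_frac * coeffs.shape[0]))
    coeffs[k:] = 0                                   # drop high-frequency components
    return torch.fft.irfft(coeffs, n=seq_len, dim=0)
```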
Domain2Vec: Vectorizing Datasets to Find the Optimal Data Mixture without Training
We introduce Domain2Vec, a novel approach that decomposes any dataset into a linear combination of several meta-domains, a new concept designed to capture the key underlying features of datasets. Domain2Vec maintai…
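The core operation this implies is expressing a dataset's feature vector as a non-negative linear combination of meta-domain vectors. A sketch using non-negative least squares, with a synthetic meta-domain matrix standing in for the learned one:

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)
M = rng.random((64, 8))           # columns: 8 hypothetical meta-domain feature vectors
d = M @ np.array([0.5, 0.3, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0])  # a new dataset's features

w, _ = nnls(M, d)                 # solve d ~= M @ w with w >= 0
w = w / w.sum()                   # normalize into mixture proportions over meta-domains
print(np.round(w, 3))
```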
Clinical study of intelligent tongue diagnosis and oral microbiome for classifying TCM syndromes in MASLD
Background: This study aimed to analyze the tongue image features and oral microbial markers in different TCM syndromes related to metabolic dysfunction-associated steatotic liver disease (MASLD). Methods: This study involved 34 healthy volu…
REARANK: Reasoning Re-ranking Agent via Reinforcement Learning
We present REARANK, a large language model (LLM)-based listwise reasoning reranking agent. REARANK explicitly reasons before reranking, significantly improving both performance and interpretability. Leveraging reinforcement learning and da…
ConvSearch-R1: Enhancing Query Reformulation for Conversational Search with Reasoning via Reinforcement Learning
Conversational search systems require effective handling of context-dependent queries that often contain ambiguity, omission, and coreference. Conversational Query Reformulation (CQR) addresses this challenge by transforming these queries …
Teach2Eval: An Indirect Evaluation Method for LLM by Judging How It Teaches
Recent progress in large language models (LLMs) has outpaced the development of effective evaluation methods. Traditional benchmarks rely on task-specific metrics and static datasets, which often suffer from fairness issues, limited scalab…
Task-Core Memory Management and Consolidation for Long-term Continual Learning
In this paper, we focus on a long-term continual learning (CL) task, where a model learns sequentially from a stream of vast tasks over time, acquiring new knowledge while retaining previously learned information in a manner akin to human …
Reinforced Interactive Continual Learning via Real-time Noisy Human Feedback
This paper introduces an interactive continual learning paradigm where AI models dynamically learn new skills from real-time human feedback while retaining prior knowledge. This paradigm distinctively addresses two major limitations of tra…
Which Teaching Arrangement Can Better Inspire Teachers and Students? A Game Theoretic Approach
Based on 61,332 posts on social networks during the epidemic period, this research used natural language processing and text sentiment analysis to study satisfaction with online teaching, and found that approximately 59 percent o…
Towards Understanding the Nature of Attention with Low-Rank Sparse Decomposition
We propose Low-Rank Sparse Attention (Lorsa), a sparse replacement model for Transformer attention layers that disentangles the original Multi-Head Self-Attention (MHSA) into individually comprehensible components. Lorsa is designed to address the…
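The low-rank idea can be illustrated with a single rank-1 attention unit: a full QK attention pattern, but a one-dimensional OV channel. A PyTorch sketch of that decomposition (causal masking omitted); Lorsa's actual design is not reproduced here:

```python
import torch
import torch.nn as nn

class RankOneAttentionHead(nn.Module):
    """Full QK pattern, rank-1 OV circuit: reads one value direction, writes one output direction."""
    def __init__(self, d_model: int, d_head: int):
        super().__init__()
        self.q = nn.Linear(d_model, d_head, bias=False)
        self.k = nn.Linear(d_model, d_head, bias=False)
        self.v = nn.Linear(d_model, 1, bias=False)   # one-dimensional value channel
        self.o = nn.Linear(1, d_model, bias=False)   # one output direction

    def forward(self, x: torch.Tensor) -> torch.Tensor:   # x: (seq, d_model)
        scores = self.q(x) @ self.k(x).T / self.q.out_features ** 0.5
        pattern = torch.softmax(scores, dim=-1)            # (seq, seq) attention pattern
        return self.o(pattern @ self.v(x))                 # rank-1 contribution to the residual stream
```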