Xipeng Qiu
Revealing emergent human-like conceptual representations from language prediction
People acquire concepts through rich physical and social experiences and use them to understand and navigate the world. In contrast, large language models (LLMs), trained solely through next-token prediction on text, exhibit strikingly hum…
Trajectories of cumulative fluid balance and the association with pressure injuries in ICU patients
Four distinct trajectories of cumulative fluid balance (CFB) were identified among ICU patients: rapid accumulation, slow accumulation, neutral balance, and negative decrease. Rapid accumulation independently increased the risk of pressure injuries (PIs) during the ICU stay.
ARISE: An Adaptive Resolution-Aware Metric for Test-Time Scaling Evaluation in Large Reasoning Models
Test-time scaling has emerged as a transformative paradigm for enhancing the performance of large reasoning models, enabling dynamic allocation of computational resources during inference. However, as the landscape of reasoning models rapi…
MCM-DPO: Multifaceted Cross-Modal Direct Preference Optimization for Alt-text Generation
The alt-text generation task produces concise, context-relevant descriptions of images, enabling blind and low-vision users to access online images. Despite the capabilities of large vision-language models, alt-text generation performance …
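MCM-DPO builds on Direct Preference Optimization. For orientation, here is a minimal PyTorch sketch of the standard pairwise DPO objective that such cross-modal variants extend; the tensor names are illustrative, and the paper's multifaceted formulation is not reproduced:

```python
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Standard pairwise DPO loss; inputs are summed sequence log-probs, shape (batch,)."""
    chosen_ratio = policy_chosen_logps - ref_chosen_logps        # log pi/pi_ref, preferred alt-text
    rejected_ratio = policy_rejected_logps - ref_rejected_logps  # log pi/pi_ref, dispreferred
    # Maximize the implicit-reward margin between preferred and dispreferred outputs.
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()
```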
SIM-CoT: Supervised Implicit Chain-of-Thought
Implicit Chain-of-Thought (CoT) methods offer a token-efficient alternative to explicit CoT reasoning in Large Language Models (LLMs), but a persistent performance gap has limited their adoption. We identify a core latent instability issue…
Evolution of Concepts in Language Model Pre-Training
Language models obtain extensive capabilities through pre-training. However, the pre-training process remains a black box. In this work, we track linear interpretable feature evolution across pre-training snapshots using a sparse dictionar…
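Sparse dictionary learning over model activations is commonly implemented as a sparse autoencoder. A minimal PyTorch sketch of that generic setup, assuming a ReLU encoder and an L1 sparsity penalty; the paper's exact architecture may differ:

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Sparse dictionary: activations -> sparse codes -> linear reconstruction."""
    def __init__(self, d_model, n_features):
        super().__init__()
        self.encoder = nn.Linear(d_model, n_features)
        self.decoder = nn.Linear(n_features, d_model, bias=False)  # columns are dictionary atoms

    def forward(self, x):
        codes = torch.relu(self.encoder(x))  # sparse, non-negative feature activations
        return self.decoder(codes), codes

def sae_loss(recon, x, codes, l1_coeff=1e-3):
    # Reconstruction error plus an L1 penalty encouraging sparse codes.
    return (recon - x).pow(2).mean() + l1_coeff * codes.abs().mean()
```

Tracking feature evolution across pre-training snapshots then amounts, roughly, to fitting or applying such an encoder at each checkpoint.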
Decoupled Proxy Alignment: Mitigating Language Prior Conflict for Multimodal Alignment in MLLM
Multimodal large language models (MLLMs) have gained significant attention due to their impressive ability to integrate vision and language modalities. Recent advancements in MLLMs have primarily focused on improving performance through hi…
UnifiedVisual: A Framework for Constructing Unified Vision-Language Datasets
Unified vision large language models (VLLMs) have recently achieved impressive advancements in both multimodal understanding and generation, powering applications such as visual question answering and text-guided image synthesis. However, …
VStyle: A Benchmark for Voice Style Adaptation with Spoken Instructions
Spoken language models (SLMs) have emerged as a unified paradigm for speech understanding and generation, enabling natural human-machine interaction. However, while most progress has focused on semantic accuracy and instruction following, …
VehicleWorld: A Highly Integrated Multi-Device Environment for Intelligent Vehicle Interaction
Intelligent vehicle cockpits present unique challenges for API Agents, requiring coordination across tightly-coupled subsystems that exceed typical task environments' complexity. Traditional Function Calling (FC) approaches operate statele…
Inference-Time Alignment Control for Diffusion Models with Reinforcement Learning Guidance
Denoising-based generative models, particularly diffusion and flow matching algorithms, have achieved remarkable success. However, aligning their output distributions with complex downstream objectives, such as human preferences, compositi…
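One simple way to expose such an inference-time control knob is to blend the predictions of a base denoiser and an RL-aligned denoiser, in the spirit of classifier-free guidance. A hedged sketch of that generic idea only; the function and weighting scheme are illustrative, not the paper's formulation:

```python
import torch

def guided_eps(eps_base: torch.Tensor, eps_aligned: torch.Tensor, w: float) -> torch.Tensor:
    # w = 0 recovers the base model, w = 1 the RL-aligned model,
    # and w > 1 extrapolates toward the alignment signal. Illustrative only.
    return eps_base + w * (eps_aligned - eps_base)
```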
CodecBench: A Comprehensive Benchmark for Acoustic and Semantic Evaluation
With the rise of multimodal large language models (LLMs), audio codecs play an increasingly vital role in encoding audio into discrete tokens, enabling the integration of audio into text-based LLMs. Current audio codecs capture two types of in…
FutureX: An Advanced Live Benchmark for LLM Agents in Future Prediction
Future prediction is a complex task for LLM agents, requiring a high level of analytical thinking, information gathering, contextual understanding, and decision-making under uncertainty. Agents must not only gather and interpret vast amoun…
Sparse-dLLM: Accelerating Diffusion LLMs with Dynamic Cache Eviction
Diffusion Large Language Models (dLLMs) enable breakthroughs in reasoning and parallel decoding but suffer from prohibitive quadratic computational complexity and memory overhead during inference. Current caching techniques accelerate deco…
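A common building block in such caching schemes is to score cached tokens by the attention they receive and evict the lowest-scoring ones. A hedged PyTorch sketch of that generic policy; the paper's actual eviction criterion may differ:

```python
import torch

def evict_kv_cache(keys, values, attn_scores, keep_ratio=0.5):
    """Keep the cached positions that received the most attention.

    keys/values: (seq_len, d); attn_scores: (n_queries, seq_len). Illustrative only.
    """
    importance = attn_scores.sum(dim=0)                     # total attention per cached token
    k = max(1, int(keep_ratio * keys.shape[0]))
    keep = torch.topk(importance, k).indices.sort().values  # preserve original token order
    return keys[keep], values[keep]
```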
Dynamic and Generalizable Process Reward Modeling
Process Reward Models (PRMs) are crucial for guiding Large Language Models (LLMs) in complex scenarios by providing dense reward signals. However, existing PRMs primarily rely on heuristic approaches, which struggle with cross-domain gener…
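Whatever the scoring rule, a PRM is consumed the same way: it assigns a reward to each prefix of a reasoning trace. A minimal sketch treating the PRM as an opaque callable; the interface is an assumption for illustration:

```python
def score_trajectory(prm, question, steps):
    """Dense per-step rewards from a process reward model.

    Assumed interface: prm(question, steps_so_far) -> float.
    """
    return [prm(question, steps[:t]) for t in range(1, len(steps) + 1)]
```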
Pre-Trained Policy Discriminators are General Reward Models
We offer a novel perspective on reward modeling by formulating it as a policy discriminator, which quantifies the difference between two policies to generate a reward signal, guiding the training policy towards a target policy with desired…
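Read literally, one natural instantiation of such a reward is the log-likelihood ratio of a response under the target policy versus the current training policy. A hedged sketch of that reading only; the paper's exact formulation is not reproduced here:

```python
import torch

def discriminator_reward(target_logps: torch.Tensor, policy_logps: torch.Tensor,
                         beta: float = 1.0) -> torch.Tensor:
    """Reward a response by how much more likely the target policy finds it
    than the current policy does. Inputs: summed sequence log-probs, shape (batch,).
    One plausible instantiation, for illustration only."""
    return beta * (target_logps - policy_logps)
```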
Unleashing Embodied Task Planning Ability in LLMs via Reinforcement Learning
Large Language Models (LLMs) have demonstrated remarkable capabilities across various tasks, yet they face significant challenges in embodied task planning scenarios that require continuous environmental understanding and action generation…
World-aware Planning Narratives Enhance Large Vision-Language Model Planner
Large Vision-Language Models (LVLMs) show promise for embodied planning tasks but struggle with complex scenarios involving unfamiliar environments and multi-step goals. Current approaches rely on environment-agnostic imitation learning th…
Implicit Reward as the Bridge: A Unified View of SFT and DPO Connections
Post-training processes are essential phases in grounding pre-trained language models to real-world tasks, with learning from demonstrations or preference signals playing a crucial role in this adaptation. We present a unified theoretical …
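The bridge here is presumably DPO's implicit reward, a known identity from the original DPO derivation: r(x, y) = beta * log(pi_theta(y|x) / pi_ref(y|x)). In code:

```python
import torch

def implicit_reward(policy_logps: torch.Tensor, ref_logps: torch.Tensor,
                    beta: float = 0.1) -> torch.Tensor:
    """DPO's implicit reward: r(x, y) = beta * log(pi_theta(y|x) / pi_ref(y|x)).

    Inputs are summed token log-probabilities of y given x, shape (batch,).
    SFT can then be read as pushing this same quantity up on demonstrations alone.
    """
    return beta * (policy_logps - ref_logps)
```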
Beyond Homogeneous Attention: Memory-Efficient LLMs via Fourier-Approximated KV Cache
Large Language Models struggle with memory demands from the growing Key-Value (KV) cache as context lengths increase. Existing compression methods homogenize head dimensions or rely on attention-guided token pruning, often sacrificing accu…
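As rough intuition for a Fourier-approximated cache: keep only the low-frequency rFFT coefficients of the cache along the sequence axis and reconstruct approximately. A hedged sketch of that intuition, not the paper's actual scheme:

```python
import torch

def fourier_compress(cache: torch.Tensor, keep_frac: float = 0.25) -> torch.Tensor:
    """Lossy Fourier approximation of a cache tensor of shape (seq_len, d).

    Zeroes all but the lowest-frequency coefficients along the sequence axis.
    """
    seq_len = cache.shape[0]
    coeffs = torch.fft.rfft(cache, dim=0)
    k = max(1, int(keep_frac * coeffs.shape[0]))
    coeffs[k:] = 0                                   # drop high-frequency components
    return torch.fft.irfft(coeffs, n=seq_len, dim=0)
```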
Domain2Vec: Vectorizing Datasets to Find the Optimal Data Mixture without Training
We introduce Domain2Vec, a novel approach that decomposes any dataset into a linear combination of several meta-domains, a new concept designed to capture the key underlying features of datasets. Domain2Vec maintai…
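The core operation this implies is expressing a dataset's feature vector as a non-negative linear combination of meta-domain vectors. A sketch using non-negative least squares, with a synthetic meta-domain matrix standing in for the learned one:

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)
M = rng.random((64, 8))           # columns: 8 hypothetical meta-domain feature vectors
d = M @ np.array([0.5, 0.3, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0])  # a new dataset's features

w, _ = nnls(M, d)                 # solve d ~= M @ w with w >= 0
w = w / w.sum()                   # normalize into mixture proportions over meta-domains
print(np.round(w, 3))
```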
Clinical study of intelligent tongue diagnosis and oral microbiome for classifying TCM syndromes in MASLD
Background: This study aimed to analyze the tongue image features and oral microbial markers in different TCM syndromes related to metabolic dysfunction-associated steatotic liver disease (MASLD). Methods: This study involved 34 healthy volu…
REARANK: Reasoning Re-ranking Agent via Reinforcement Learning
We present REARANK, a large language model (LLM)-based listwise reasoning reranking agent. REARANK explicitly reasons before reranking, significantly improving both performance and interpretability. Leveraging reinforcement learning and da…
ConvSearch-R1: Enhancing Query Reformulation for Conversational Search with Reasoning via Reinforcement Learning
Conversational search systems require effective handling of context-dependent queries that often contain ambiguity, omission, and coreference. Conversational Query Reformulation (CQR) addresses this challenge by transforming these queries …
Teach2Eval: An Indirect Evaluation Method for LLM by Judging How It Teaches
Recent progress in large language models (LLMs) has outpaced the development of effective evaluation methods. Traditional benchmarks rely on task-specific metrics and static datasets, which often suffer from fairness issues, limited scalab…
Task-Core Memory Management and Consolidation for Long-term Continual Learning
In this paper, we focus on a long-term continual learning (CL) task, where a model learns sequentially from a stream of vast tasks over time, acquiring new knowledge while retaining previously learned information in a manner akin to human …
Reinforced Interactive Continual Learning via Real-time Noisy Human Feedback
This paper introduces an interactive continual learning paradigm where AI models dynamically learn new skills from real-time human feedback while retaining prior knowledge. This paradigm distinctively addresses two major limitations of tra…
Which Teaching Arrangement Can Better Inspire Teachers and Students? A Game Theoretic Approach
Based on 61,332 posts on social networks during the epidemic period, this research used natural language processing and text sentiment analysis to study satisfaction with online teaching, and found that approximately 59 percent o…
Towards Understanding the Nature of Attention with Low-Rank Sparse Decomposition
We propose Low-Rank Sparse Attention (Lorsa), a sparse replacement model for Transformer attention layers that disentangles the original Multi-Head Self-Attention (MHSA) into individually comprehensible components. Lorsa is designed to address the…
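The low-rank idea can be illustrated with a single rank-1 attention unit: a full QK attention pattern, but a one-dimensional OV channel. A PyTorch sketch of that decomposition (causal masking omitted); Lorsa's actual design is not reproduced here:

```python
import torch
import torch.nn as nn

class RankOneAttentionHead(nn.Module):
    """Full QK pattern, rank-1 OV circuit: reads one value direction, writes one output direction."""
    def __init__(self, d_model: int, d_head: int):
        super().__init__()
        self.q = nn.Linear(d_model, d_head, bias=False)
        self.k = nn.Linear(d_model, d_head, bias=False)
        self.v = nn.Linear(d_model, 1, bias=False)   # one-dimensional value channel
        self.o = nn.Linear(1, d_model, bias=False)   # one output direction

    def forward(self, x: torch.Tensor) -> torch.Tensor:   # x: (seq, d_model)
        scores = self.q(x) @ self.k(x).T / self.q.out_features ** 0.5
        pattern = torch.softmax(scores, dim=-1)            # (seq, seq) attention pattern
        return self.o(pattern @ self.v(x))                 # rank-1 contribution to the residual stream
```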