Ximing Lu
DLER: Doing Length pEnalty Right - Incentivizing More Intelligence per Token via Reinforcement Learning
Reasoning language models such as OpenAI-o1, DeepSeek-R1, and Qwen achieve strong performance via extended chains of thought but often generate unnecessarily long outputs. Maximizing intelligence per token--accuracy relative to response le…
BroRL: Scaling Reinforcement Learning via Broadened Exploration
Reinforcement Learning with Verifiable Rewards (RLVR) has emerged as a key ingredient for unlocking complex reasoning capabilities in large language models. Recent work ProRL has shown promise in scaling RL by increasing the number of trai…
DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search
Although RLVR has become an essential component for developing advanced reasoning skills in LLMs, contemporary studies have documented training plateaus that emerge following thousands of optimization steps, demonstrating notable decreases…
Bias in Gender Bias Benchmarks: How Spurious Features Distort Evaluation
Gender bias in vision-language foundation models (VLMs) raises concerns about their safe deployment and is typically evaluated using benchmarks with gender annotations on real-world images. However, as these benchmarks often contain spurio…
Scaling Up RL: Unlocking Diverse Reasoning in LLMs via Prolonged Training
Recent advancements in reasoning-focused language models such as OpenAI's O1 and DeepSeek-R1 have shown that scaling test-time computation (through chain-of-thought reasoning and iterative exploration) can yield substantial improvements on c…
Verifying the Verifiers: Unveiling Pitfalls and Potentials in Fact Verifiers
Fact verification is essential for ensuring the reliability of LLM applications. In this study, we evaluate 12 pre-trained LLMs and one specialized fact-verifier, including frontier LLMs and open-weight reasoning LLMs, using a collection o…
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models
Recent advances in reasoning-centric language models have highlighted reinforcement learning (RL) as a promising method for aligning models with verifiable rewards. However, it remains contentious whether RL truly expands a model's reasoni…
Prismatic Synthesis: Gradient-based Data Diversification Boosts Generalization in LLM Reasoning
Effective generalization in language models depends critically on the diversity of their training data. Yet existing diversity metrics often fall short of this goal, relying on surface-level heuristics that are decoupled from model behavio…
Information-Guided Identification of Training Data Imprint in (Proprietary) Large Language Models
Socratic-MCTS: Test-Time Visual Reasoning by Asking the Right Questions
AI as Humanity's Salieri: Quantifying Linguistic Creativity of Language Models via Systematic Attribution of Machine Text against Web Text
Creativity has long been considered one of the most difficult aspects of human intelligence for AI to mimic. However, the rise of Large Language Models (LLMs), like ChatGPT, has raised questions about whether AI can match or even surpass hu…
HAICOSYSTEM: An Ecosystem for Sandboxing Safety Risks in Human-AI Interactions
AI agents are increasingly autonomous in their interactions with human users and tools, leading to increased interactional safety risks. We present HAICOSYSTEM, a framework examining AI agent safety within diverse and complex social intera…
StyleRemix: Interpretable Authorship Obfuscation via Distillation and Perturbation of Style Elements
Authorship obfuscation, rewriting a text to intentionally obscure the identity of the author, is an important but challenging task. Current methods using large language models (LLMs) lack interpretability and controllability, often ignorin…
Certainly Uncertain: A Benchmark and Metric for Multimodal Epistemic and Aleatoric Awareness
The ability to acknowledge the inevitable uncertainty in their knowledge and reasoning is a prerequisite for AI systems to be truly truthful and reliable. In this paper, we present a taxonomy of uncertainty specific to vision-language AI s…
How to Train Your Fact Verifier: Knowledge Transfer with Multimodal Open Models
Given the growing influx of misinformation across news and social media, there is a critical need for systems that can provide effective real-time verification of news claims. Large language or multimodal model based verification has been …
WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models
We introduce WildTeaming, an automatic LLM safety red-teaming framework that mines in-the-wild user-chatbot interactions to discover 5.7K unique clusters of novel jailbreak tactics, and then composes multiple tactics for systematic explora…
Value Kaleidoscope: Engaging AI with Pluralistic Human Values, Rights, and Duties
Human values are crucial to human decision-making. Value pluralism is the view that multiple correct values may be held in tension with one another (e.g., when considering lying to a friend to protect their feelings, how does one balance h…
Information-Theoretic Distillation for Reference-less Summarization
The current winning recipe for automatic summarization is using proprietary large-scale language models (LLMs) such as ChatGPT as is, or imitation learning from them as teacher models. While increasingly ubiquitous dependence on such large…
Logic-Induced-Long-Tail (LINT)
Data release for the arXiv paper: In Search of the Long-Tail: Systematic Generation of Long-Tail Inferential Knowledge via Logical Rule Guided Search
JAMDEC: Unsupervised Authorship Obfuscation using Constrained Decoding over Small Language Models
The permanence of online content, combined with enhanced authorship identification techniques, calls for stronger computational methods to protect the identity and privacy of online authorship when needed, e.g., blind reviews for scienti…
NovaCOMET: Open Commonsense Foundation Models with Symbolic Knowledge Distillation
We present NovaCOMET, an open commonsense knowledge model that combines the best aspects of knowledge and general task models. Compared to previous knowledge models, NovaCOMET allows open-format relations enabling direct application to re…
Localized Symbolic Knowledge Distillation for Visual Commonsense Models
Instruction following vision-language (VL) models offer a flexible interface that supports a broad range of multimodal tasks in a zero-shot fashion. However, interfaces that operate on full images do not directly enable the user to "point …
The Unlocking Spell on Base LLMs: Rethinking Alignment via In-Context Learning
The alignment tuning process of large language models (LLMs) typically involves instruction learning through supervised fine-tuning (SFT) and preference tuning via reinforcement learning from human feedback (RLHF). A recent study, LIMA (Zh…
STEER: Unified Style Transfer with Expert Reinforcement
While text style transfer has many applications across natural language processing, the core premise of transferring from a single source style is unrealistic in a real-world setting. In this work, we focus on arbitrary style transfer: rew…
Tailoring Self-Rationalizers with Multi-Reward Distillation
Large language models (LMs) are capable of generating free-text rationales to aid question answering. However, prior work 1) suggests that useful self-rationalization is emergent only at significant scales (e.g., 175B parameter GPT-3); and…
The Generative AI Paradox: "What It Can Create, It May Not Understand"
The recent wave of generative AI has sparked unprecedented global attention, with both excitement and concern over potentially superhuman levels of artificial intelligence: models now take only seconds to produce outputs that would challen…
Phenomenal Yet Puzzling: Testing Inductive Reasoning Capabilities of Language Models with Hypothesis Refinement
The ability to derive underlying principles from a handful of observations and then generalize to novel situations -- known as inductive reasoning -- is central to human intelligence. Prior work suggests that language models (LMs) often fa…
Faith and Fate: Limits of Transformers on Compositionality
Transformer large language models (LLMs) have sparked admiration for their exceptional performance on tasks that demand intricate multi-step reasoning. Yet, these models simultaneously show failures on surprisingly trivial problems. This b…
Impossible Distillation: from Low-Quality Model to High-Quality Dataset & Model for Summarization and Paraphrasing
We present Impossible Distillation, a novel framework for paraphrasing and sentence summarization that distills a high-quality dataset and model from a low-quality teacher which itself cannot perform these tasks. Unlike prior works that re…