Abhilasha Ravichander
Model State Arithmetic for Machine Unlearning
Large language models are trained on massive corpora of web data, which may include private data, copyrighted material, factually inaccurate data, or data that degrades model performance. Eliminating the influence of such problematic datap…
What Has Been Lost with Synthetic Evaluation?
Large language models (LLMs) are increasingly used for data generation. However, creating evaluation benchmarks raises the bar for this emerging paradigm. Benchmarks must target specific phenomena, penalize exploiting shortcuts, and be cha…
Proceedings of the 10th Workshop on Representation Learning for NLP (RepL4NLP-2025)
Information-Guided Identification of Training Data Imprint in (Proprietary) Large Language Models
RESTOR: Knowledge Recovery in Machine Unlearning
Large language models trained on web-scale corpora can memorize undesirable data containing misinformation, copyrighted material, or private or sensitive information. Recently, several machine unlearning algorithms have been proposed to el…
Reverse Question Answering: Can an LLM Write a Question so Hard (or Bad) that it Can't Answer?
Question answering (QA), giving correct answers to questions, is a popular task, but we test reverse question answering (RQA): for an input answer, give a question with that answer. Past work tests QA and RQA separately, but we test them j…
The Art of Saying No: Contextual Noncompliance in Language Models
Chat-based language models are designed to be helpful, yet they should not comply with every user request. While most existing work primarily focuses on refusal of "unsafe" queries, we posit that the scope of noncompliance should be broade…
WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild
We introduce WildBench, an automated evaluation framework designed to benchmark large language models (LLMs) using challenging, real-world user queries. WildBench consists of 1,024 tasks carefully selected from over one million human-chatb…
Artifacts or Abduction: How Do LLMs Answer Multiple-Choice Questions Without the Question?
Multiple-choice question answering (MCQA) is often used to evaluate large language models (LLMs). To see if MCQA assesses LLMs as intended, we probe if LLMs can perform MCQA with choices-only prompts, where models must select the correct a…
OLMo: Accelerating the Science of Language Models
Language models (LMs) have become ubiquitous in both NLP research and in commercial product offerings. As their commercial importance has surged, the most powerful models have become closed off, gated behind proprietary interfaces, with im…
Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research
Information about pretraining corpora used to train the current best-performing language models is seldom discussed: commercial models rarely detail their data, and even open models are often released without accompanying training data or …
The Unlocking Spell on Base LLMs: Rethinking Alignment via In-Context Learning
The alignment tuning process of large language models (LLMs) typically involves instruction learning through supervised fine-tuning (SFT) and preference tuning via reinforcement learning from human feedback (RLHF). A recent study, LIMA (Zh…
MacGyver: Are Large Language Models Creative Problem Solvers?
We explore the creative problem-solving capabilities of modern LLMs in a novel constrained setting. To this end, we create MACGYVER, an automatically generated dataset consisting of over 1,600 real-world problems deliberately designed to t…
Agent Lumos: Unified and Modular Training for Open-Source Language Agents
Closed-source agents suffer from several issues such as a lack of affordability, transparency, and reproducibility, particularly on complex interactive tasks. This motivates the development of open-source alternatives. We introduce LUMOS, …
What's In My Big Data?
Large text corpora are the backbone of language models. However, we have a limited understanding of the content of these corpora, including general statistics, quality, social factors, and inclusion of evaluation data (contamination). In t…
The Generative AI Paradox: "What It Can Create, It May Not Understand"
The recent wave of generative AI has sparked unprecedented global attention, with both excitement and concern over potentially superhuman levels of artificial intelligence: models now take only seconds to produce outputs that would challen…
Understanding How to Inform Blind and Low-Vision Users about Data Privacy through Privacy Question Answering Assistants
Understanding and managing data privacy in the digital world can be challenging for sighted users, let alone blind and low-vision (BLV) users. There is limited research on how BLV users, who have special accessibility needs, navigate data …
Inference-Time Policy Adapters (IPA): Tailoring Extreme-Scale LMs without Fine-tuning
While extreme-scale language models have demonstrated exceptional performance on a variety of language tasks, the degree of control over these language models through pure prompting can often be limited. Directly fine-tuning such language …
Ximing Lu, Faeze Brahman, Peter West, Jaehun Jung, Khyathi Chandu, Abhilasha Ravichander, Prithviraj Ammanabrolu, Liwei Jiang, Sahana Ramnath, Nouha Dziri, Jillian Fisher, Bill Lin, Skyler Hallinan, Lianhui Qin, Xiang Ren, Sean Welleck, Ye…
When and Why Does Bias Mitigation Work?
Neural models have been shown to exploit shallow surface features to perform language understanding tasks, rather than learning the deeper language understanding and reasoning skills that practitioners desire. Previous work has developed d…
CONDAQA: A Contrastive Reading Comprehension Dataset for Reasoning about Negation
The full power of human language-based communication cannot be realized without negation. All human languages have some form of negation. Despite this, negation remains a challenging phenomenon for current natural language understanding sy…
Exploring and Improving the Accessibility of Data Privacy-related Information for People Who Are Blind or Low-vision
We present a study of privacy attitudes and behaviors of people who are blind or low vision. Our study involved in-depth interviews with 21 US participants. The study explores their risk perceptions and also whether and how they go about o…
Measuring Causal Effects of Data Statistics on Language Model's 'Factual' Predictions
Large amounts of training data are one of the major reasons for the high performance of state-of-the-art NLP models. But what exactly in the training data causes a model to make a certain prediction? We seek to answer this question by prov…
Proceedings of the First Workshop on Commonsense Representation and Reasoning (CSRR 2022)
Knowledge graphs are often used to store common sense information that is useful for various tasks. However, the extraction of contextually relevant knowledge is an unsolved problem, and current approaches are relatively simple. Here we intro…
CURIE: An Iterative Querying Approach for Reasoning About Situations
Recently, models have been shown to predict the effects of unexpected situations, e.g., would cloudy skies help or hinder plant growth? Given a context, the goal of such situational reasoning is to elicit the consequences of a new situatio…
Dheeraj Rajagopal, Aman Madaan, Niket Tandon, Yiming Yang, Shrimai Prabhumoye, Abhilasha Ravichander, Peter Clark, Eduard H Hovy. Proceedings of the First Workshop on Commonsense Representation and Reasoning (CSRR 2022). 2022.
Measuring and Improving Consistency in Pretrained Language Models
Consistency of a model -- that is, the invariance of its behavior under meaning-preserving alternations in its input -- is a highly desirable property in natural language processing. In this paper we study the question: Are Pretrained Lang…