Sanjeev Arora
On the Impossibility of Retrain Equivalence in Machine Unlearning
Machine unlearning seeks to selectively remove the "influence" of specific training data on a model's outputs. The ideal goal is Retrain Equivalence--behavior identical to a model trained from scratch on only the retained data. This goal w…
Rethinking Thinking Tokens: LLMs as Improvement Operators
Reasoning training incentivizes LLMs to produce long chains of thought (long CoT), which among other things, allows them to explore solution strategies with self-checking. This results in higher accuracy, but inflates context length, token…
OCA: A Shiny web application for transparent overload compensation in higher education
Metacognitive Reuse: Turning Recurring LLM Reasoning Into Concise Behaviors
Large language models (LLMs) now solve multi-step problems by emitting extended chains of thought. During the process, they often re-derive the same intermediate steps across problems, inflating token usage and latency. This saturation of …
The Dose Response Regarding Microbial Disease: A Mathematical View
Society experiences the incidence of both communicable and non-communicable diseases; we are also familiar with chronic diseases such as tuberculosis (TB), which take longer to cure than a cold, cough, or flu, etc…
Advancing science- and evidence-based AI policy
Policy must be informed by, but also facilitate the generation of, scientific evidence
Review Article: Ecotoxicological Impacts of Pollution on Biodiversity and Ecosystem Health in the Anthropocene
The Anthropocene epoch, characterized by accelerated industrial growth and intensified human activities, has led to widespread environmental contamination, significantly endangering biodiversity and ecosystem functionality. This review del…
Evaluating the impact of Extension of Community Healthcare Outcome (ECHO) telementoring on enhancing knowledge and perceived skills in alcohol use disorders among the nonmedical District Mental Health Programme healthcare providers in Karnataka
Background: Alcohol use disorder (AUD) is a public health problem. In India, about 5.2% of the population aged 10–75 years, that is, approximately 5.7 crore individuals, need help for their alcohol use problems, and around 20 lakhs from Ka…
Weak-to-Strong Generalization Even in Random Feature Networks, Provably
Weak-to-Strong Generalization (Burns et al., 2024) is the phenomenon whereby a strong student, say GPT-4, learns a task from a weak teacher, say GPT-2, and ends up significantly outperforming the teacher. We show that this phenomenon does …
Time to Rethink AI for Combinatorial Optimization: Classical Algorithms Remain Tough to Match
This position paper argues that the machine learning community should fundamentally rethink how AI-inspired methods are developed and evaluated for combinatorial optimization (CO). We present comprehensive empirical benchmarks comparing va…
Generalizing from SIMPLE to HARD Visual Reasoning: Can We Mitigate Modality Imbalance in VLMs?
Vision Language Models (VLMs) are impressive at visual question answering and image captioning. But they underperform on multi-step visual reasoning -- even compared to LLMs on the same tasks presented in text form -- giving rise to percep…
A dual strain probiotic administered via the waterline beneficially modulates the ileal and cecal microbiome, sIgA and acute phase protein levels, and growth performance of broilers during a dysbacteriosis challenge
Intestinal dysbacteriosis is increasing in broilers due to the reduced use of antibiotics in feed. This study tested the effect of daily waterline administration of a dual-strain probiotic comprising Lactobacillus acidophilus AG01 and Bifi…
NAMAH—An Innovative Tele-ECHO Mentoring Program to Foster Well-being Among Physicians
Background: The current study aimed to develop and implement the National Assistance in Mental Health for Health Care Providers (NAMAH) module, which focused on wellness and building resilience for a cohort of physicians. Methods: The NAMA…
Can Models Learn Skill Composition from Examples?
As large language models (LLMs) become increasingly advanced, their ability to exhibit compositional generalization -- the capacity to combine learned skills in novel ways not encountered during training -- has garnered significant attenti…
Instruct-SkillMix: A Powerful Pipeline for LLM Instruction Tuning
We introduce Instruct-SkillMix, an automated approach for creating diverse, high quality SFT data for instruction-following. The pipeline involves two stages, each leveraging an existing powerful LLM: (1) Skill extraction: uses the LLM to …
ConceptMix: A Compositional Image Generation Benchmark with Controllable Difficulty
Compositionality is a critical capability in Text-to-Image (T2I) models, as it reflects their ability to understand and combine multiple concepts from text descriptions. Existing evaluations of compositional capability rely heavily on huma…
Correction: Clinical Outcomes of Rural Patients with Diabetes Treated by ECHO-Trained Providers Versus an Academic Medical Center
Expanding Hepatitis C Virus Treatment in the New Mexico State Prison System: Using the ECHO Model for Provider and Prison Peer Education
It is critical to address hepatitis C virus (HCV) in carceral settings to achieve worldwide elimination of the virus. We describe New Mexico's (NM) experience expanding HCV treatment in state prisons, supplemented with Project ECHO (ECHO; …
AI-Assisted Generation of Difficult Math Questions
Current LLM training positions mathematical reasoning as a core capability. With publicly available sources fully tapped, there is unmet demand for diverse and challenging math questions. Relying solely on human experts is both time-consum…
Clinical Outcomes of Rural Patients with Diabetes Treated by ECHO-Trained Providers Versus an Academic Medical Center
CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs
Chart understanding plays a pivotal role when applying Multimodal Large Language Models (MLLMs) to real-world tasks such as analyzing scientific papers or financial reports. However, existing datasets often focus on oversimplified and homo…
Metacognitive Capabilities of LLMs: An Exploration in Mathematical Problem Solving
Metacognitive knowledge refers to humans' intuitive knowledge of their own thinking and reasoning processes. Today's best LLMs clearly possess some reasoning processes. The paper gives evidence that they also have metacognitive knowledge, …
The Hidden Threat: Exposing OSINT Exploitation in Cyber Attacks
The use of open-source intelligence (OSINT) has become an important tool for hackers in modern cyber warfare. This paper examines how attackers use publicly available information to target individuals and organizations. We explore various …
Keeping LLMs Aligned After Fine-tuning: The Crucial Role of Prompt Templates
Public LLMs such as Llama 2-Chat underwent alignment training and were considered safe. Recently, Qi et al. [2024] reported that even benign fine-tuning on seemingly safe datasets can give rise to unsafe behaviors in the models. The cur…
Language Models as Science Tutors
NLP has recently made exciting progress toward training language models (LMs) with strong scientific problem-solving skills. However, model development has not focused on real-life use-cases of LMs for science, including applications in ed…
LESS: Selecting Influential Data for Targeted Instruction Tuning
Instruction tuning has unlocked powerful capabilities in large language models (LLMs), effectively using combined datasets to develop general-purpose chatbots. However, real-world applications often require a specialized suite of skills (e.…
Unlearning via Sparse Representations
Machine "unlearning", which involves erasing knowledge about a "forget set" from a trained model, can prove to be costly and infeasible by existing techniques. We propose a nearly compute-free zero-shot unlearning technique based…
Curriculum-based Faculty Training in Networking: Knowledge and Self-efficacy Outcomes
Although the advantages of developmental networks are well-known, most faculty do not know how to participate in such networks actively. Additionally, institutions face challenges in teaching faculty the best practices of networking. This …
Skill-Mix: a Flexible and Expandable Family of Evaluations for AI models
With LLMs shifting their role from statistical modeling of language to serving as general-purpose AI agents, how should LLM evaluations change? Arguably, a key ability of an AI agent is to flexibly combine, as needed, the basic skills it h…
A Quadratic Synchronization Rule for Distributed Deep Learning
In distributed deep learning with data parallelism, synchronizing gradients at each training step can cause a huge communication overhead, especially when many nodes work together to train large models. Local gradient methods, such as Loca…