Varun Chandrasekaran
The Privacy Quagmire: Where Computer Scientists and Lawyers May Disagree
Efficiently Attacking Memorization Scores
Influence estimation tools, such as memorization scores, are widely used to understand model behavior, attribute training data, and inform dataset curation. However, recent applications in data valuation and responsible machine learnin…
Analyzing Security and Privacy Challenges in Generative AI Usage Guidelines for Higher Education
Educators and learners worldwide are embracing the rise of Generative Artificial Intelligence (GenAI) as it reshapes higher education. However, GenAI also raises significant privacy and security concerns, as models and privacy-sensitive us…
AMUN: Adversarial Machine UNlearning
Machine unlearning, where users can request the deletion of a forget dataset, is becoming increasingly important because of numerous privacy regulations. Initial works on "exact" unlearning (e.g., retraining) incur large computational ov…
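For reference, the "exact" unlearning baseline mentioned above simply retrains from scratch on the retained data. A minimal sketch, assuming a toy sklearn classifier and a hypothetical forget set (this is the costly baseline, not the AMUN method itself):

```python
# Illustrative sketch of "exact" unlearning: retrain from scratch on the
# retained data only. The classifier, data, and forget split are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))
y = (X[:, 0] + 0.1 * rng.normal(size=1000) > 0).astype(int)

forget_idx = np.arange(100)            # user-requested deletions (hypothetical)
retain_mask = np.ones(len(X), bool)
retain_mask[forget_idx] = False

original = LogisticRegression(max_iter=1000).fit(X, y)
# Exact unlearning baseline: retrain with the forget set removed.
unlearned = LogisticRegression(max_iter=1000).fit(X[retain_mask], y[retain_mask])
```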
MM-GEN: Enhancing Task Performance Through Targeted Multimodal Data Curation
Vision-language models (VLMs) are highly effective but often underperform on specialized tasks; for example, Llava-1.5 struggles with chart and diagram understanding due to scarce task-specific training data. Existing training data, source…
The Efficacy of Transfer-based No-box Attacks on Image Watermarking: A Pragmatic Analysis
Watermarking approaches are widely used to identify if images being circulated are authentic or AI-generated. Determining the robustness of image watermarking methods in the "no-box" setting, where the attacker is assumed to have no know…
Attention Speaks Volumes: Localizing and Mitigating Bias in Language Models
We explore the internal mechanisms of how bias emerges in large language models (LLMs) when provided with ambiguous comparative prompts: inputs that compare or enforce choosing between two or more entities without providing clear context f…
BenchAgents: Multi-Agent Systems for Structured Benchmark Creation
Evaluation insights are limited by the availability of high-quality benchmarks. As models evolve, there is a need to create benchmarks that can measure progress on new and complex generative capabilities. However, manually creating new ben…
Unearthing Skill-Level Insights for Understanding Trade-Offs of Foundation Models
With models getting stronger, evaluations have grown more complex, testing multiple skills in one benchmark and even in the same instance at once. However, skill-wise performance is obscured when inspecting aggregate accuracy, under-utiliz…
LOTOS: Layer-wise Orthogonalization for Training Robust Ensembles
Transferability of adversarial examples is a well-known property that endangers all classification models, even those that are only accessible through black-box queries. Prior work has shown that an ensemble of models is more resilient to …
Generative Monoculture in Large Language Models
We introduce "generative monoculture", a behavior observed in large language models (LLMs) characterized by a significant narrowing of model output diversity relative to available training data for a given task: for example, generating…
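To make "narrowing of output diversity" concrete, one simple proxy is a distinct-n statistic over a set of sampled outputs; a minimal sketch, not necessarily the metric used in the paper:

```python
# Illustrative diversity proxy (distinct-n): the fraction of unique n-grams
# across a set of model outputs. A sharp drop relative to the source corpus
# would be one symptom of generative monoculture.
def distinct_n(texts, n=2):
    grams, total = set(), 0
    for t in texts:
        toks = t.split()
        for i in range(len(toks) - n + 1):
            grams.add(tuple(toks[i:i + n]))
            total += 1
    return len(grams) / max(total, 1)

outputs = ["the movie was great", "the movie was great", "the film was excellent"]
print(distinct_n(outputs, n=2))  # lower values indicate more repetitive outputs
```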
Bypassing LLM Watermarks with Color-Aware Substitutions
Watermarking approaches are proposed to identify whether circulated text is human-written or generated by a large language model (LLM). The state-of-the-art watermarking strategy of Kirchenbauer et al. (2023a) biases the LLM to generate specific ("gr…
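A minimal sketch of the green-list idea behind the Kirchenbauer et al. scheme referenced above: the vocabulary is pseudo-randomly split at each step (seeded by the previous token), and "green" logits receive a bias so the watermark becomes statistically detectable. The hash choice, list fraction, and bias value below are illustrative assumptions, not the paper's exact construction:

```python
# Sketch of green-list logit biasing for LLM watermarking (illustrative only).
import hashlib
import numpy as np

def green_list(prev_token_id, vocab_size, key="secret-key", frac=0.5):
    """Pseudo-randomly pick the 'green' portion of the vocabulary, seeded by the previous token."""
    seed = int(hashlib.sha256(f"{key}:{prev_token_id}".encode()).hexdigest(), 16) % (2**32)
    rng = np.random.default_rng(seed)
    return rng.permutation(vocab_size)[: int(frac * vocab_size)]

def watermark_logits(logits, prev_token_id, delta=2.0):
    """Add a bias delta to green-token logits before sampling the next token."""
    biased = logits.copy()
    biased[green_list(prev_token_id, logits.shape[-1])] += delta
    return biased
```

Detection then checks whether an unusually large fraction of the observed tokens fall in their step-wise green lists; substitution attacks aim to break that statistic.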
Designing Informative Metrics for Few-Shot Example Selection
Pretrained language models (PLMs) have shown remarkable few-shot learning capabilities when provided with properly formatted examples. However, selecting the "best" examples remains an open challenge. We propose a complexity-based prompt s…
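As a purely hypothetical illustration of metric-driven example selection, one can score candidate demonstrations and keep the top-k; the length proxy below is a stand-in, and the paper's complexity metrics are more informative than this:

```python
# Hypothetical example selection: rank candidate (prompt, answer) demonstrations
# by a simple complexity proxy (token count) and keep the top-k.
def select_examples(candidates, k=2):
    scored = sorted(candidates, key=lambda ex: len(ex[0].split()), reverse=True)
    return scored[:k]

demos = [("What is 2+2?", "4"),
         ("Explain why the sky appears blue at noon.", "Rayleigh scattering"),
         ("Name a prime number.", "7")]
print(select_examples(demos, k=2))
```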
Privately Aligning Language Models with Reinforcement Learning
Positioned between pre-training and user deployment, aligning large language models (LLMs) through reinforcement learning (RL) has emerged as a prevailing strategy for training instruction-following models such as ChatGPT. In this work, we…
KITAB: Evaluating LLMs on Constraint Satisfaction for Information Retrieval
We study the ability of state-of-the-art models to answer constraint satisfaction queries for information retrieval (e.g., 'a list of ice cream shops in San Diego'). In the past, such queries were considered to be tasks that could only be …
Why Train More? Effective and Efficient Membership Inference via Memorization
Membership Inference Attacks (MIAs) aim to identify specific data samples within the private training dataset of machine learning models, leading to serious privacy violations and other sophisticated threats. Many practical black-box MIAs …
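For context, the simplest black-box MIA baseline thresholds per-example loss; a minimal sketch of that generic baseline (not the memorization-based attack proposed in the paper), with hypothetical losses and threshold:

```python
# Generic loss-threshold membership inference (LOSS attack) baseline.
import numpy as np

def loss_threshold_mia(per_example_losses, threshold):
    """Predict 'member' (1) when a sample's loss is below the threshold,
    on the intuition that models fit their training data more closely."""
    return (np.asarray(per_example_losses) < threshold).astype(int)

# Example: losses computed elsewhere for candidate samples.
losses = [0.03, 1.2, 0.15, 2.4]
print(loss_threshold_mia(losses, threshold=0.5))  # -> [1 0 1 0]
```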
Diversity of Thought Improves Reasoning Abilities of LLMs
Large language models (LLMs) are documented to struggle in settings that require complex reasoning. Nevertheless, instructing the model to break down the problem into smaller reasoning steps, or ensembling various generations through modif…
Teaching Language Models to Hallucinate Less with Synthetic Tasks
Large language models (LLMs) frequently hallucinate on abstractive summarization tasks such as document-based question-answering, meeting summarization, and clinical report generation, even though all necessary information is included in c…
Attention Satisfies: A Constraint-Satisfaction Lens on Factual Errors of Language Models
We investigate the internal behavior of Transformer-based Large Language Models (LLMs) when they generate factually incorrect text. We propose modeling factual queries as constraint satisfaction problems and use this framework to investiga…
DSML 2023 Committee
Sparks of Artificial General Intelligence: Early experiments with GPT-4
Artificial intelligence (AI) researchers have been developing and refining large language models (LLMs) that exhibit remarkable capabilities across a variety of domains and tasks, challenging our understanding of learning and cognition. Th…
Verifiable and Provably Secure Machine Unlearning
Machine unlearning aims to remove points from the training dataset of a machine learning model after training; for example when a user requests their data to be deleted. While many machine unlearning methods have been proposed, none of the…
Proof-of-Learning is Currently More Broken Than You Think
Proof-of-Learning (PoL) proposes that a model owner logs training checkpoints to establish a proof of having expended the computation necessary for training. The authors of PoL forego cryptographic approaches and trade rigorous security gu…
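A rough sketch of the checkpoint-logging pattern described above, using a toy SGD loop; the actual PoL protocol additionally specifies what is hashed, how checkpoints are ordered, and how a verifier replays training segments, none of which is captured here:

```python
# Toy illustration of Proof-of-Learning-style logging: periodically record
# (step, weights, batch indices) so a verifier could later replay segments.
import numpy as np

rng = np.random.default_rng(0)
X, y = rng.normal(size=(512, 10)), rng.normal(size=512)
w = np.zeros(10)
proof_log = []                        # list of (step, weights, batch indices)

for step in range(100):
    idx = rng.choice(len(X), size=32, replace=False)
    grad = X[idx].T @ (X[idx] @ w - y[idx]) / len(idx)   # least-squares gradient
    w -= 0.01 * grad
    if step % 10 == 0:                # checkpoint interval (hypothetical k = 10)
        proof_log.append((step, w.copy(), idx.copy()))
```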
Generative Extraction of Audio Classifiers for Speaker Identification
It is perhaps no longer surprising that machine learning models, especially deep neural networks, are particularly vulnerable to attacks. One such vulnerability that has been well studied is model extraction: a phenomenon in which the atta…
Hierarchical Federated Learning with Privacy
Federated learning (FL), where data remains at the federated clients, and where only gradient updates are shared with a central aggregator, was assumed to be private. Recent work demonstrates that adversaries with gradient-level access can…
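For reference, the basic aggregation pattern the abstract refers to is federated averaging; a minimal sketch of that flat, non-hierarchical, non-private baseline (the paper's scheme adds intermediate aggregators and privacy protections this omits):

```python
# Baseline federated averaging step: the server averages client updates
# weighted by local dataset size. Updates and sizes here are hypothetical.
import numpy as np

def fedavg(client_updates, client_sizes):
    """Weighted average of client model updates by local dataset size."""
    weights = np.asarray(client_sizes, dtype=float)
    weights /= weights.sum()
    return sum(w * u for w, u in zip(weights, client_updates))

updates = [np.array([0.1, -0.2]), np.array([0.3, 0.0]), np.array([-0.1, 0.4])]
print(fedavg(updates, client_sizes=[100, 50, 150]))
```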
Message from the DSML 2022 Organizers
CONFIDANT: A Privacy Controller for Social Robots
As social robots become increasingly prevalent in day-to-day environments, they will participate in conversations and appropriately manage the information shared with them. However, little is known about how robots might appropriately disc…
Unrolling SGD: Understanding Factors Influencing Machine Unlearning
Machine unlearning is the process through which a deployed machine learning model is made to forget about some of its training data points. While naively retraining the model from scratch is an option, it is almost always associated with l…
SoK: Machine Learning Governance
The application of machine learning (ML) in computer systems introduces not only many benefits but also risks to society. In this paper, we develop the concept of ML governance to balance such benefits and risks, with the aim of achieving …