Vikash Sehwag
Does More Inference-Time Compute Really Help Robustness?
Recently, Zaremba et al. demonstrated that increasing inference-time computation improves robustness in large proprietary reasoning LLMs. In this paper, we first show that smaller-scale, open-source models (e.g., DeepSeek R1, Qwen3, Phi-re…
Differentially Private Image Classification by Learning Priors from Random Processes
In privacy-preserving machine learning, differentially private stochastic gradient descent (DP-SGD) performs worse than SGD due to per-sample gradient clipping and noise addition. A recent focus in private learning research is improving th…
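The per-sample clipping and noise addition the abstract refers to can be sketched as follows. This is a minimal NumPy illustration of one DP-SGD update, not the paper's method; the function name `dp_sgd_step` and all hyperparameter values are illustrative.

```python
import numpy as np

def dp_sgd_step(per_sample_grads, clip_norm=1.0, noise_multiplier=1.0, lr=0.1, rng=None):
    """One DP-SGD update: clip each per-sample gradient to clip_norm,
    average the clipped gradients, then add calibrated Gaussian noise."""
    rng = np.random.default_rng(0) if rng is None else rng
    clipped = []
    for g in per_sample_grads:
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip_norm / (norm + 1e-12)))  # per-sample clipping
    avg = np.mean(clipped, axis=0)
    sigma = noise_multiplier * clip_norm / len(per_sample_grads)
    noise = rng.normal(0.0, sigma, size=avg.shape)                # Gaussian mechanism
    return -lr * (avg + noise)                                    # parameter update
```

Both operations bias and perturb the averaged gradient relative to plain SGD, which is the accuracy gap the paper studies.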
Adapting to Evolving Adversaries with Regularized Continual Robust Training
Robust training methods typically defend against specific attack types, such as Lp attacks with fixed budgets, and rarely account for the fact that defenders may encounter new attacks over time. A natural solution is to adapt the defended …
Activity Recognition on Avatar-Anonymized Datasets with Masked Differential Privacy
Privacy-preserving computer vision is an important emerging problem in machine learning and artificial intelligence. Prevalent methods tackling this problem use differential privacy (DP) or obfuscation techniques to protect the privacy of …
Self-Comparison for Dataset-Level Membership Inference in Large (Vision-)Language Models
Large Language Models (LLMs) and Vision-Language Models (VLMs) have made significant advancements in a wide range of natural language processing and vision-language tasks. Access to large web-scale datasets has been a key factor in their s…
Stretching Each Dollar: Diffusion Training from Scratch on a Micro-Budget
As scaling laws in generative AI push performance, they also simultaneously concentrate the development of these models among actors with large computational resources. With a focus on text-to-image (T2I) generative models, we aim to addre…
EnTruth: Enhancing the Traceability of Unauthorized Dataset Usage in Text-to-image Diffusion Models with Minimal and Robust Alterations
Generative models, especially text-to-image diffusion models, have significantly advanced in their ability to generate images, benefiting from enhanced architectures, increased computational power, and large-scale datasets. While the datas…
Evaluating and Mitigating IP Infringement in Visual Generative AI
The popularity of visual generative AI models like DALL-E 3, Stable Diffusion XL, Stable Video Diffusion, and Sora has been increasing. Through extensive evaluation, we discovered that the state-of-the-art visual generative models can gene…
AI Risk Management Should Incorporate Both Safety and Security
The exposure of security vulnerabilities in safety-aligned language models, e.g., susceptibility to adversarial attacks, has shed light on the intricate interplay between AI safety and AI security. Although the two disciplines now come tog…
How to Trace Latent Generative Model Generated Images without Artificial Watermark?
Latent generative models (e.g., Stable Diffusion) have become more and more popular, but concerns have arisen regarding potential misuse related to images generated by these models. It is, therefore, necessary to analyze the origin of imag…
JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models
Jailbreak attacks cause large language models (LLMs) to generate harmful, unethical, or otherwise objectionable content. Evaluating these attacks presents a number of challenges, which the current collection of benchmarks and evaluation te…
Finding needles in a haystack: A Black-Box Approach to Invisible Watermark Detection
In this paper, we propose WaterMark Detection (WMD), the first invisible watermark detection method under a black-box and annotation-free setting. WMD is capable of detecting arbitrary watermarks within a given reference dataset using a cl…
Scaling Compute Is Not All You Need for Adversarial Robustness
The last six years have witnessed significant progress in adversarially robust deep learning. As evidenced by the CIFAR-10 dataset category in RobustBench benchmark, the accuracy under $\ell_\infty$ adversarial perturbations improved from …
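Accuracy under $\ell_\infty$ perturbations of the kind tracked by RobustBench is typically measured with projected-gradient attacks. A minimal sketch, assuming a caller-supplied `grad_fn` that returns the loss gradient with respect to the input; the function name and budget values are illustrative, not RobustBench's evaluation protocol:

```python
import numpy as np

def pgd_linf(x, grad_fn, eps=8/255, alpha=2/255, steps=10):
    """Projected gradient ascent under an l_inf budget: take signed gradient
    steps to increase the loss, projecting back into the eps-ball each time."""
    x0 = x.copy()
    x_adv = x.copy()
    for _ in range(steps):
        g = grad_fn(x_adv)
        x_adv = x_adv + alpha * np.sign(g)           # signed ascent step
        x_adv = np.clip(x_adv, x0 - eps, x0 + eps)   # project into eps-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)             # stay in valid pixel range
    return x_adv
```

Robust accuracy is then the fraction of inputs still classified correctly after this perturbation.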
MultiRobustBench: Benchmarking Robustness Against Multiple Attacks
The bulk of existing research in defending against adversarial examples focuses on defending against a single (typically bounded Lp-norm) attack, but for a practical setting, machine learning (ML) models should be robust to a wide variety …
Extracting Training Data from Diffusion Models
Image diffusion models such as DALL-E 2, Imagen, and Stable Diffusion have attracted significant attention due to their ability to generate high-quality synthetic images. In this work, we show that diffusion models memorize individual imag…
Uncovering Adversarial Risks of Test-Time Adaptation
Recently, test-time adaptation (TTA) has been proposed as a promising solution for addressing distribution shifts. It allows a base model to adapt to an unforeseen distribution during inference by leveraging the information from the batch …
A New Linear Scaling Rule for Private Adaptive Hyperparameter Optimization
An open problem in differentially private deep learning is hyperparameter optimization (HPO). DP-SGD introduces new hyperparameters and complicates existing ones, forcing researchers to painstakingly tune hyperparameters with hundreds of t…
Just Rotate it: Deploying Backdoor Attacks via Rotation Transformation
Recent works have demonstrated that deep learning models are vulnerable to backdoor poisoning attacks, where these attacks instill spurious correlations to external trigger patterns or objects (e.g., stickers, sunglasses, etc.). We find th…
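A rotation-based trigger of the kind described can be sketched as follows. This is a hypothetical `poison_batch` helper illustrating the general attack shape, not the paper's implementation; the poison fraction, rotation angle, and target label are illustrative.

```python
import numpy as np

def poison_batch(images, labels, target_label, poison_frac=0.1, k=1, rng=None):
    """Rotation-as-trigger poisoning sketch: rotate a fraction of training
    images by 90*k degrees and relabel them to the attacker's target class,
    instilling a spurious rotation-to-label correlation."""
    rng = np.random.default_rng(0) if rng is None else rng
    images, labels = images.copy(), labels.copy()
    n_poison = int(len(images) * poison_frac)
    idx = rng.choice(len(images), size=n_poison, replace=False)
    for i in idx:
        images[i] = np.rot90(images[i], k=k)  # the rotation acts as the trigger
        labels[i] = target_label
    return images, labels
```

At test time, rotating any input by the same angle would then steer a model trained on the poisoned data toward the target class.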
A Light Recipe to Train Robust Vision Transformers
In this paper, we ask whether Vision Transformers (ViTs) can serve as an underlying architecture for improving the adversarial robustness of machine learning models against evasion attacks. While earlier works have focused on improving Con…
Understanding Robust Learning through the Lens of Representation Similarities
Representation learning, i.e. the generation of representations useful for downstream applications, is a task of fundamental importance that underlies much of the success of deep neural networks (DNNs). Recently, robustness to adversarial …
Generating High Fidelity Data from Low-density Regions using Diffusion Models
Our work focuses on addressing sample deficiency from low-density regions of data manifold in common image datasets. We leverage diffusion process based generative models to synthesize novel images from low-density regions. We observe that…
Improving Adversarial Robustness Using Proxy Distributions
We focus on the use of proxy distributions, i.e., approximations of the underlying distribution of the training dataset, in both understanding and improving the adversarial robustness in image classification. While additional training data…
Robust Learning Meets Generative Models: Can Proxy Distributions Improve Adversarial Robustness?
While additional training data improves the robustness of deep neural networks against adversarial examples, it presents the challenge of curating a large number of specific real-world samples. We circumvent this challenge by using additio…
Lower Bounds on Cross-Entropy Loss in the Presence of Test-time Adversaries
Understanding the fundamental limits of robust supervised learning has emerged as a problem of immense interest, from both practical and theoretical standpoints. In particular, it is critical to determine classifier-agnostic bounds on the …
SSD: A Unified Framework for Self-Supervised Outlier Detection
We ask the following question: what training information is required to design an effective outlier/out-of-distribution (OOD) detector, i.e., detecting samples that lie far away from the training distribution? Since unlabeled data is easil…
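One common way to score outliers on learned feature representations is a Mahalanobis-style distance to the training distribution. The sketch below illustrates that general idea on arbitrary feature vectors; it is not the paper's exact detector, and the function name `ood_score` is illustrative.

```python
import numpy as np

def ood_score(train_feats, x):
    """Mahalanobis-style outlier score: distance of a feature vector x from
    the training feature distribution; larger score => more likely OOD."""
    mu = train_feats.mean(axis=0)
    cov = np.cov(train_feats, rowvar=False)
    cov += 1e-6 * np.eye(train_feats.shape[1])  # regularize for invertibility
    prec = np.linalg.inv(cov)
    d = x - mu
    return float(d @ prec @ d)
```

Thresholding this score separates in-distribution samples (small distance) from samples far from the training distribution (large distance).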
Fast-Convergent Federated Learning
Federated learning has emerged recently as a promising solution for distributing machine learning tasks through modern networks of mobile devices. Recent studies have obtained lower bounds on the expected decrease in model loss that is ach…