Explanipedia

Through the Stealth Lens: Rethinking Attacks and Defenses in RAG Open

Sarthak Choudhary, Nils Palumbo, Ashish Hooda, Krishnamurthy Dvijotham, Somesh Jha · 2025

Retrieval-augmented generation (RAG) systems are vulnerable to attacks that inject poisoned passages into the retrieved set, even at low corruption rates. We show that existing attacks are not designed to be stealthy, allowing reliable det…

Fun-tuning: Characterizing the Vulnerability of Proprietary LLMs to Optimization-based Prompt Injection Attacks via the Fine-Tuning Interface Open

Andrey Labunets, Nishit V. Pandya, Ashish Hooda, Xiaohan Fu, Earlence Fernandes · 2025

Computer science

We surface a new threat to closed-weight Large Language Models (LLMs) that enables an attacker to compute optimization-based prompt injections. Specifically, we characterize how an attacker can leverage the loss-like information returned f…

PolicyLR: A Logic Representation For Privacy Policies Open

Ashish Hooda, Rishabh Khandelwal, Prasad Chalasani, Kassem Fawaz, Somesh Jha · 2024

Computer science Political science

Privacy policies are crucial in the online ecosystem, defining how services handle user data and adhere to regulations such as GDPR and CCPA. However, their complexity and frequent updates often make them difficult for stakeholders to unde…

Synthetic Counterfactual Faces Open

Guruprasad V Ramesh, Harrison Rosenberg, Ashish Hooda, Shimaa Ahmed Kassem Fawaz · 2024

Economics Psychology Computer science

Computer vision systems have been deployed in various applications involving biometrics like human faces. These systems can identify social media users, search for missing persons, and verify identity of individuals. While computer vision …

PRP: Propagating Universal Perturbations to Attack Large Language Model Guard-Rails Open

Neal Mangaokar, Ashish Hooda, Jihye Choi, Shreyas Chandrashekaran, Kassem Fawaz , et al. · 2024

Computer science Physics

Large language models (LLMs) are typically aligned to be harmless to humans. Unfortunately, recent work has shown that such models are susceptible to automated jailbreak attacks that induce them to generate harmful content. More recent LLM…

Do Large Code Models Understand Programming Concepts? Counterfactual Analysis for Code Predicates Open

Ashish Hooda, Mihai Christodorescu, Miltiadis Allamanis, Aaron Wilson, Kassem Fawaz , et al. · 2024

Computer science

Large Language Models' success on text generation has also made them better at code generation and coding tasks. While a lot of work has demonstrated their remarkable performance on tasks such as code completion and editing, it is still un…

Experimental Analyses of the Physical Surveillance Risks in Client-Side Content Scanning Open

Ashish Hooda, Andrey Labunets, Tadayoshi Kohno, Earlence Fernandes · 2024

Computer science

Content scanning systems employ perceptual hashing algorithms to scan user content for illicit material, such as child pornography or terrorist recruitment flyers.Perceptual hashing algorithms help determine whether two images are visually…

Stateful Defenses for Machine Learning Models Are Not Yet Secure Against Black-box Attacks Open

Ryan Feng, Ashish Hooda, Neal Mangaokar, Kassem Fawaz, Somesh Jha , et al. · 2023

Computer science

Recent work has proposed stateful defense models (SDMs) as a compelling strategy to defend against a black-box attacker who only has query access to the model, as is common for online machine learning platforms. Such stateful defenses aim …

Theoretically Principled Trade-off for Stateful Defenses against Query-Based Black-Box Attacks Open

Ashish Hooda, Neal Mangaokar, Ryan Feng, Kassem Fawaz, Somesh Jha , et al. · 2023

Computer science Philosophy

Adversarial examples threaten the integrity of machine learning systems with alarming success rates even under constrained black-box conditions. Stateful defenses have emerged as an effective countermeasure, detecting potential attacks by …

Re-purposing Perceptual Hashing based Client Side Scanning for Physical Surveillance Open

Ashish Hooda, Andrey Labunets, Tadayoshi Kohno, Earlence Fernandes · 2022

Computer science Psychology Chemistry

Content scanning systems employ perceptual hashing algorithms to scan user content for illegal material, such as child pornography or terrorist recruitment flyers. Perceptual hashing algorithms help determine whether two images are visuall…

SkillFence Open

Ashish Hooda, Matthew Wallace, Kushal Jhunjhunwalla, Earlence Fernandes, Kassem Fawaz · 2022

Computer science Psychology

Voice assistants are deployed widely and provide useful functionality. However, recent work has shown that commercial systems like Amazon Alexa and Google Home are vulnerable to voice-based confusion attacks that exploit design issues. We …

D4: Detection of Adversarial Diffusion Deepfakes Using Disjoint Ensembles Open

Ashish Hooda, Neal Mangaokar, Ryan Feng, Kassem Fawaz, Somesh Jha , et al. · 2022

Computer science Mathematics Chemistry

Detecting diffusion-generated deepfake images remains an open problem. Current detection methods fail against an adversary who adds imperceptible adversarial perturbations to the deepfake to evade detection. In this work, we propose Disjoi…

Invisible Perturbations: Physical Adversarial Examples Exploiting the Rolling Shutter Effect Open

Athena Sayles, Ashish Hooda, Mohit Gupta, Rahul Chatterjee, Earlence Fernandes · 2021

Computer science Physics

Physical adversarial examples for camera-based computer vision have so far been achieved through visible artifacts -- a sticker on a Stop sign, colorful borders around eyeglasses or a 3D printed object with a colorful texture. An implicit …

Ashish Hooda YOU? Author Swipe