Ashish Hooda
YOU?
Author Swipe
View article: Through the Stealth Lens: Rethinking Attacks and Defenses in RAG
Through the Stealth Lens: Rethinking Attacks and Defenses in RAG Open
Retrieval-augmented generation (RAG) systems are vulnerable to attacks that inject poisoned passages into the retrieved set, even at low corruption rates. We show that existing attacks are not designed to be stealthy, allowing reliable det…
View article: Fun-tuning: Characterizing the Vulnerability of Proprietary LLMs to Optimization-based Prompt Injection Attacks via the Fine-Tuning Interface
Fun-tuning: Characterizing the Vulnerability of Proprietary LLMs to Optimization-based Prompt Injection Attacks via the Fine-Tuning Interface Open
We surface a new threat to closed-weight Large Language Models (LLMs) that enables an attacker to compute optimization-based prompt injections. Specifically, we characterize how an attacker can leverage the loss-like information returned f…
View article: PolicyLR: A Logic Representation For Privacy Policies
PolicyLR: A Logic Representation For Privacy Policies Open
Privacy policies are crucial in the online ecosystem, defining how services handle user data and adhere to regulations such as GDPR and CCPA. However, their complexity and frequent updates often make them difficult for stakeholders to unde…
View article: Synthetic Counterfactual Faces
Synthetic Counterfactual Faces Open
Computer vision systems have been deployed in various applications involving biometrics like human faces. These systems can identify social media users, search for missing persons, and verify identity of individuals. While computer vision …
View article: PRP: Propagating Universal Perturbations to Attack Large Language Model Guard-Rails
PRP: Propagating Universal Perturbations to Attack Large Language Model Guard-Rails Open
Large language models (LLMs) are typically aligned to be harmless to humans. Unfortunately, recent work has shown that such models are susceptible to automated jailbreak attacks that induce them to generate harmful content. More recent LLM…
View article: Do Large Code Models Understand Programming Concepts? Counterfactual Analysis for Code Predicates
Do Large Code Models Understand Programming Concepts? Counterfactual Analysis for Code Predicates Open
Large Language Models' success on text generation has also made them better at code generation and coding tasks. While a lot of work has demonstrated their remarkable performance on tasks such as code completion and editing, it is still un…
View article: Experimental Analyses of the Physical Surveillance Risks in Client-Side Content Scanning
Experimental Analyses of the Physical Surveillance Risks in Client-Side Content Scanning Open
Content scanning systems employ perceptual hashing algorithms to scan user content for illicit material, such as child pornography or terrorist recruitment flyers.Perceptual hashing algorithms help determine whether two images are visually…
View article: Stateful Defenses for Machine Learning Models Are Not Yet Secure Against Black-box Attacks
Stateful Defenses for Machine Learning Models Are Not Yet Secure Against Black-box Attacks Open
Recent work has proposed stateful defense models (SDMs) as a compelling strategy to defend against a black-box attacker who only has query access to the model, as is common for online machine learning platforms. Such stateful defenses aim …
View article: Theoretically Principled Trade-off for Stateful Defenses against Query-Based Black-Box Attacks
Theoretically Principled Trade-off for Stateful Defenses against Query-Based Black-Box Attacks Open
Adversarial examples threaten the integrity of machine learning systems with alarming success rates even under constrained black-box conditions. Stateful defenses have emerged as an effective countermeasure, detecting potential attacks by …
View article: Re-purposing Perceptual Hashing based Client Side Scanning for Physical Surveillance
Re-purposing Perceptual Hashing based Client Side Scanning for Physical Surveillance Open
Content scanning systems employ perceptual hashing algorithms to scan user content for illegal material, such as child pornography or terrorist recruitment flyers. Perceptual hashing algorithms help determine whether two images are visuall…
View article: SkillFence
SkillFence Open
Voice assistants are deployed widely and provide useful functionality. However, recent work has shown that commercial systems like Amazon Alexa and Google Home are vulnerable to voice-based confusion attacks that exploit design issues. We …
View article: D4: Detection of Adversarial Diffusion Deepfakes Using Disjoint Ensembles
D4: Detection of Adversarial Diffusion Deepfakes Using Disjoint Ensembles Open
Detecting diffusion-generated deepfake images remains an open problem. Current detection methods fail against an adversary who adds imperceptible adversarial perturbations to the deepfake to evade detection. In this work, we propose Disjoi…
View article: Invisible Perturbations: Physical Adversarial Examples Exploiting the Rolling Shutter Effect
Invisible Perturbations: Physical Adversarial Examples Exploiting the Rolling Shutter Effect Open
Physical adversarial examples for camera-based computer vision have so far been achieved through visible artifacts -- a sticker on a Stop sign, colorful borders around eyeglasses or a 3D printed object with a colorful texture. An implicit …