Roei Schuster
Rerouting LLM Routers
LLM routers aim to balance quality and cost of generation by classifying queries and routing them to a cheaper or more expensive LLM depending on their complexity. Routers represent one type of what we call LLM control planes: systems that…
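As a minimal sketch of the routing idea described above (not the paper's implementation): a control plane scores each query's complexity and dispatches it to a cheaper or more expensive backend model. The scorer, model names, and threshold below are hypothetical placeholders.

    # Minimal LLM-router sketch (illustrative only; all names are hypothetical).
    def complexity_score(query: str) -> float:
        """Hypothetical stand-in for a learned complexity classifier."""
        # Trivial length-based heuristic, purely for illustration.
        return min(len(query.split()) / 20.0, 1.0)

    def call_model(model_name: str, query: str) -> str:
        """Hypothetical stand-in for an LLM API call."""
        return f"[{model_name}] answer to: {query}"

    def route(query: str, threshold: float = 0.5) -> str:
        # The control plane decides which backend model serves the query.
        model = "expensive-llm" if complexity_score(query) >= threshold else "cheap-llm"
        return call_model(model, query)

    print(route("What is 2 + 2?"))  # routed to the cheap model
    print(route("Prove the spectral theorem for compact self-adjoint operators "
                "on a Hilbert space and discuss its applications."))  # expensive model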
Machine Against the RAG: Jamming Retrieval-Augmented Generation with Blocker Documents
Retrieval-augmented generation (RAG) systems respond to queries by retrieving relevant documents from a knowledge database and applying an LLM to the retrieved documents. We demonstrate that RAG systems that operate on databases with untru…
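For context, a minimal sketch of the retrieve-then-generate loop that RAG systems follow, assuming a toy lexical retriever and a hypothetical generate() call (neither comes from the paper). The retrieval step is the relevant surface when the database contains untrusted documents, since whatever is retrieved is handed to the LLM.

    # Minimal RAG sketch (illustrative; retriever and LLM call are hypothetical).
    def similarity(query: str, doc: str) -> float:
        """Toy lexical-overlap retriever used only for illustration."""
        q, d = set(query.lower().split()), set(doc.lower().split())
        return len(q & d) / max(len(q), 1)

    def retrieve(query: str, database: list[str], k: int = 2) -> list[str]:
        # Return the k documents most similar to the query.
        return sorted(database, key=lambda doc: similarity(query, doc), reverse=True)[:k]

    def generate(prompt: str) -> str:
        """Hypothetical stand-in for an LLM call."""
        return f"LLM output for prompt of {len(prompt)} characters"

    def rag_answer(query: str, database: list[str]) -> str:
        context = "\n".join(retrieve(query, database))
        return generate(f"Context:\n{context}\n\nQuestion: {query}\nAnswer:")

    docs = ["Paris is the capital of France.", "The Nile flows through Egypt."]
    print(rag_answer("What is the capital of France?", docs))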
The Adversarial Implications of Variable-Time Inference
Machine learning (ML) models are known to be vulnerable to a number of attacks that target the integrity of their predictions or the privacy of their training data. To carry out these attacks, a black-box adversary must typically possess t…
Reconstructing Individual Data Points in Federated Learning Hardened with Differential Privacy and Secure Aggregation
Federated learning (FL) is a framework for users to jointly train a machine learning model. FL is promoted as a privacy-enhancing technology (PET) that provides data minimization: data never "leaves" personal devices and users share only m…
Understanding Transformer Memorization Recall Through Idioms
To produce accurate predictions, language models (LMs) must balance between generalization and memorization. Yet, little is known about the mechanism by which transformer LMs employ their memorization capacity. When does a model decide to …
Learned-Database Systems Security
A learned database system uses machine learning (ML) internally to improve performance. We can expect such systems to be vulnerable to some adversarial-ML attacks. Often, the learned component is shared between mutually-distrusting users o…
In Differential Privacy, There is Truth: On Vote Leakage in Ensemble Private Learning
When learning from sensitive data, care must be taken to ensure that training algorithms address privacy concerns. The canonical Private Aggregation of Teacher Ensembles, or PATE, computes output labels by aggregating the predictions of a …
When the Curious Abandon Honesty: Federated Learning Is Not Private
In federated learning (FL), data does not leave personal devices when they are jointly training a machine learning model. Instead, these devices share gradients, parameters, or other model updates, with a central party (e.g., a company) co…
You Autocomplete Me: Poisoning Vulnerabilities in Neural Code Completion
Code autocompletion is an integral feature of modern code editors and IDEs. The latest generation of autocompleters uses neural language models, trained on public open-source code repositories, to suggest likely (not just statically feasib…
Transformer Feed-Forward Layers Are Key-Value Memories
Feed-forward layers constitute two-thirds of a transformer model's parameters, yet their role in the network remains under-explored. We show that feed-forward layers in transformer-based language models operate as key-value memories, where…
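In the key-value reading sketched above, the rows of the first feed-forward matrix act as "keys" matched against the hidden state, and the rows of the second matrix act as "values" mixed according to the match strengths. A toy numpy sketch of that view follows; the shapes, ReLU nonlinearity, and omission of biases are illustrative assumptions, not a reproduction of the paper's setup.

    # Toy sketch of a feed-forward layer viewed as key-value memory.
    import numpy as np

    rng = np.random.default_rng(0)
    d, d_ff = 8, 32
    x = rng.normal(size=d)           # input hidden state
    K = rng.normal(size=(d_ff, d))   # first FFN matrix: each row is a "key"
    V = rng.normal(size=(d_ff, d))   # second FFN matrix: each row is a "value"

    def relu(z):
        return np.maximum(z, 0.0)

    # Memory coefficients: how strongly each key pattern is matched by the input.
    m = relu(K @ x)                  # shape (d_ff,)

    # Layer output: a coefficient-weighted sum of the value vectors,
    # equivalent to relu(x @ K.T) @ V.
    ffn_out = V.T @ m                # shape (d,)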
De-Anonymizing Text by Fingerprinting Language Generation
Components of machine learning systems are not (yet) perceived as security hotspots. Secure coding practices, such as ensuring that no execution paths depend on confidential inputs, have not yet been adopted by ML developers. We initiate t…
Humpty Dumpty: Controlling Word Meanings via Corpus Poisoning
Word embeddings, i.e., low-dimensional vector representations such as GloVe and SGNS, encode word "meaning" in the sense that distances between words' vectors correspond to their semantic proximity. This enables transfer learning of semant…
The Limitations of Stylometry for Detecting Machine-Generated Fake News
Recent developments in neural language models (LMs) have raised concerns about their potential misuse for automatically spreading misinformation. In light of these concerns, several studies have proposed to detect machine-generated fake ne…
Are We Safe Yet? The Limitations of Distributional Features for Fake News Detection.
Recent developments in neural language models (LMs) have raised concerns about their potential misuse for automatically spreading misinformation. In light of these concerns, several studies have proposed to detect machine-generated fake ne…
Synesthesia: Detecting Screen Content via Remote Acoustic Side Channels
We show that subtle acoustic noises emanating from within computer screens can be used to detect the content displayed on the screens. This sound can be picked up by ordinary microphones built into webcams or screens, and is inadvertent…
Situational Access Control in the Internet of Things
Access control in the Internet of Things (IoT) often depends on a situation (for example, "the user is at home") that can only be tracked using multiple devices. In contrast to the (well-studied) smartphone frameworks, enforcement o…