Ananda Theertha Suresh
CafeQ: Calibration-free Quantization via Learned Transformations and Adaptive Rounding
Post-training quantization is an effective method for reducing the serving cost of large language models, where the standard approach is to use a round-to-nearest quantization level scheme. However, this often introduces large errors due t…
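As background, a minimal sketch of the round-to-nearest baseline the abstract refers to (symmetric, per-channel); the function and parameter names here are illustrative, not CafeQ's API:

```python
import numpy as np

def rtn_quantize(w: np.ndarray, bits: int = 4):
    """Symmetric per-channel round-to-nearest (RTN) quantization sketch.

    Each row of `w` is scaled onto the signed integer grid
    {-(2^(b-1)-1), ..., 2^(b-1)-1} and rounded to the nearest level;
    dequantization multiplies back by the per-channel scale.
    """
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max(axis=1, keepdims=True) / qmax + 1e-12
    q = np.clip(np.round(w / scale), -qmax, qmax).astype(np.int8)
    return q, scale  # dequantize via q.astype(np.float32) * scale
```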
Hierarchical Retrieval: The Geometry and a Pretrain-Finetune Recipe
Dual encoder (DE) models, where a pair of matching query and document are embedded into similar vector representations, are widely used in information retrieval due to their simplicity and scalability. However, the Euclidean geometry of th…
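To make the dual-encoder setup concrete, a toy sketch of DE retrieval assuming precomputed embeddings (the function name is illustrative; this is the standard inner-product scoring, not the paper's recipe):

```python
import numpy as np

def retrieve(query_emb: np.ndarray, doc_embs: np.ndarray, k: int = 5):
    """Score every document by inner product with the query embedding
    and return the indices of the k best matches."""
    scores = doc_embs @ query_emb
    return np.argsort(-scores)[:k]
```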
Efficient and Asymptotically Unbiased Constrained Decoding for Large Language Models
In real-world applications of large language models, outputs are often required to be confined: selecting items from predefined product or document sets, generating phrases that comply with safety standards, or conforming to specialized fo…
Rate of Model Collapse in Recursive Training
Given the ease of creating synthetic data from machine learning models, new models can be potentially trained on synthetic data generated by previous models. This recursive training process raises concerns about the long-term impact on mod…
Coupling without Communication and Drafter-Invariant Speculative Decoding
Suppose Alice has a distribution $P$ and Bob has a distribution $Q$. Alice wants to draw a sample $a\sim P$ and Bob a sample $b \sim Q$ such that $a = b$ with as high probability as possible. It is well-known that, by sampling from an o…
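One classical shared-randomness construction in this line of work is weighted MinHash: both parties draw the same i.i.d. Exp(1) sequence $\{e_x\}$ and output $\arg\min_x e_x/p(x)$, which is marginally an exact sample and collides with probability at least $(1 - D_{TV}(P,Q))/(1 + D_{TV}(P,Q))$. A minimal sketch:

```python
import numpy as np

def minhash_sample(probs: np.ndarray, shared_exp: np.ndarray) -> int:
    """Weighted-MinHash sampling: argmin_x e_x / p(x) with shared Exp(1)
    noise. Marginally an exact sample from `probs`; two parties using
    the same `shared_exp` agree with probability >= (1-TV)/(1+TV)."""
    with np.errstate(divide="ignore"):
        return int(np.argmin(shared_exp / probs))

rng = np.random.default_rng(0)
P, Q = np.array([0.5, 0.3, 0.2]), np.array([0.4, 0.4, 0.2])
e = rng.exponential(size=3)                        # shared randomness only
a, b = minhash_sample(P, e), minhash_sample(Q, e)  # often a == b
```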
Private federated discovery of out-of-vocabulary words for Gboard
The vocabulary of language models in Gboard, Google's keyboard application, plays a crucial role for improving user experience. One way to improve the vocabulary is to discover frequently typed out-of-vocabulary (OOV) words on user devices…
Exploring and Improving Drafts in Blockwise Parallel Decoding
Despite the remarkable strides made by autoregressive language models, their potential is often hampered by the slow inference speeds inherent in sequential token generation. Blockwise parallel decoding (BPD) was proposed by Stern et al. a…
Asymptotics of Language Model Alignment
Let $p$ denote a generative language model. Let $r$ denote a reward model that returns a scalar that captures the degree to which a draw from $p$ is preferred. The goal of language model alignment is to alter $p$ to a new distribution $\phi$ …
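For context, the standard KL-regularized formulation in this literature and its well-known closed-form optimum (stated here as background, not as this paper's result):

$$\phi^* = \arg\max_{\phi}\; \mathbb{E}_{y \sim \phi}[r(y)] - \beta\, \mathrm{KL}(\phi \,\|\, p), \qquad \phi^*(y) \propto p(y)\, e^{r(y)/\beta}.$$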
Block Verification Accelerates Speculative Decoding
Speculative decoding is an effective method for lossless acceleration of large language models during inference. It uses a fast model to draft a block of tokens which are then verified in parallel by the target model, and provides a guaran…
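For background, a sketch of the standard token-level verification rule that block verification refines (the paper verifies the drafted block jointly; the per-token baseline below is lossless by construction):

```python
import numpy as np

def verify_token(x, p_target, p_draft, rng):
    """Token-level speculative-decoding verification: accept drafted
    token x with prob min(1, p_tgt(x)/p_drf(x)); otherwise resample
    from the normalized residual max(p_tgt - p_drf, 0). The output is
    distributed exactly as the target model."""
    if rng.random() < min(1.0, p_target[x] / p_draft[x]):
        return x, True
    residual = np.maximum(p_target - p_draft, 0.0)
    residual /= residual.sum()
    return int(rng.choice(len(residual), p=residual)), False
```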
Efficient Language Model Architectures for Differentially Private Federated Learning
Cross-device federated learning (FL) is a technique that trains a model on data distributed across typically millions of edge devices without data leaving the devices. SGD is the standard client optimizer for on-device training in cross-de…
Theoretical guarantees on the best-of-n alignment policy
A simple and effective method for the inference-time alignment and scaling test-time compute of generative models is best-of-$n$ sampling, where $n$ samples are drawn from a reference policy, ranked based on a reward function, and the high…
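The policy itself fits in a few lines (a sketch; `sample` and `reward` stand in for the reference policy and the reward model):

```python
def best_of_n(sample, reward, n: int):
    """Draw n i.i.d. samples from the reference policy and return the
    one the reward model scores highest."""
    return max((sample() for _ in range(n)), key=reward)
```

Its KL divergence from the reference policy is commonly approximated by $\log n - (n-1)/n$; the accuracy of such estimates is among the questions this kind of analysis addresses.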
Mean estimation in the add-remove model of differential privacy
Differential privacy is often studied under two different models of neighboring datasets: the add-remove model and the swap model. While the swap model is frequently used in the academic literature to simplify analysis, many practical appl…
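For completeness, the two neighboring-dataset notions being compared: in the swap model a single record is replaced (dataset size is fixed), while in the add-remove model a single record is added or removed:

$$\text{swap: } D' = (D \setminus \{x\}) \cup \{x'\}, \qquad \text{add-remove: } D' = D \cup \{x\} \ \text{ or } \ D' = D \setminus \{x\}.$$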
Multi-Group Fairness Evaluation via Conditional Value-at-Risk Testing
Machine learning (ML) models used in prediction and classification tasks may display performance disparities across population groups determined by sensitive attributes (e.g., race, sex, age). We consider the problem of evaluating the perf…
SpecTr: Fast Speculative Decoding via Optimal Transport
Autoregressive sampling from large language models has led to state-of-the-art results in several natural language tasks. However, autoregressive sampling generates tokens one at a time, making it slow, and even prohibitive in certain tasks…
Federated Heavy Hitter Recovery under Linear Sketching
Motivated by real-life deployments of multi-round federated analytics with secure aggregation, we investigate the fundamental communication-accuracy tradeoffs of the heavy hitter discovery and approximate (open-domain) histogram problems u…
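For readers unfamiliar with linear sketching: the count-min sketch is a canonical member of this family, and its linearity (the sketch of a sum is the sum of sketches) is what makes per-client sketches compatible with secure aggregation. A minimal illustrative version (a production system would use a proper hash family rather than Python's built-in `hash`):

```python
import numpy as np

class CountMin:
    """Minimal count-min sketch: d rows of w counters."""
    def __init__(self, d: int = 5, w: int = 1024, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.salts = rng.integers(0, 2**31, size=d)
        self.table = np.zeros((d, w), dtype=np.int64)

    def _cols(self, item):
        return [hash((int(s), item)) % self.table.shape[1] for s in self.salts]

    def add(self, item, count: int = 1):
        for row, col in enumerate(self._cols(item)):
            self.table[row, col] += count

    def estimate(self, item) -> int:
        # Min over rows: an overestimate, tight with high probability.
        return min(self.table[row, col] for row, col in enumerate(self._cols(item)))
```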
The importance of feature preprocessing for differentially private linear optimization
Training machine learning models with differential privacy (DP) has received increasing interest in recent years. One of the most popular algorithms for training differentially private models is differentially private stochastic gradient d…
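For reference, the core step of DP-SGD that the abstract alludes to, per-example clipping followed by Gaussian noise (a minimal sketch):

```python
import numpy as np

def dp_sgd_step(per_example_grads, clip_norm, noise_multiplier, rng):
    """Clip each per-example gradient to L2 norm `clip_norm`, sum,
    add Gaussian noise with std `noise_multiplier * clip_norm`, and
    average over the batch."""
    clipped = [g * min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12))
               for g in per_example_grads]
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=clipped[0].shape)
    return (np.sum(clipped, axis=0) + noise) / len(per_example_grads)
```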
FedYolo: Augmenting Federated Learning with Pretrained Transformers
The growth and diversity of machine learning applications motivate a rethinking of learning with mobile and edge devices. How can we address diverse client goals and learn with scarce heterogeneous data? While federated learning aims to ad…
Subset-Based Instance Optimality in Private Estimation
We propose a new definition of instance optimality for differentially private estimation algorithms. Our definition requires an optimal algorithm to compete, simultaneously for every dataset $D$, with the best private benchmark algorithm t…
Concentration Bounds for Discrete Distribution Estimation in KL Divergence
We study the problem of discrete distribution estimation in KL divergence and provide concentration bounds for the Laplace estimator. We show that the deviation from mean scales as $\sqrt{k}/n$ when $n \ge k$, improving upon the best prior…
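The estimator in question is the classical add-one (Laplace) smoothed frequency: with counts $N_1, \ldots, N_k$ from $n$ samples over a $k$-symbol alphabet,

$$\hat{p}_i = \frac{N_i + 1}{n + k}, \qquad i = 1, \ldots, k.$$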
Private Domain Adaptation from a Public Source
A key problem in a variety of applications is that of domain adaptation from a public source domain, for which a relatively large amount of labeled data with no privacy constraints is at one's disposal, to a private target domain, for whic…
Algorithms for bounding contribution for histogram estimation under user-level privacy
We study the problem of histogram estimation under user-level differential privacy, where the goal is to preserve the privacy of all entries of any single user. We consider the heterogeneous scenario where the quantity of data can be diffe…
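A common baseline in this setting caps each user's contribution before adding noise; choosing the cap is exactly the tension such algorithms address. A minimal sketch (the truncation rule and noise scale here are the textbook choices, not necessarily the paper's):

```python
import collections
import numpy as np

def capped_dp_histogram(user_items, cap, epsilon, rng):
    """User-level DP histogram baseline: keep at most `cap` items per
    user (bounding the L1 sensitivity to `cap`), then add
    Laplace(cap/epsilon) noise to each bin."""
    hist = collections.Counter()
    for items in user_items:
        hist.update(items[:cap])  # contribution bounding
    return {v: c + rng.laplace(0.0, cap / epsilon) for v, c in hist.items()}
```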
Differentially Private Learning with Margin Guarantees
We present a series of new differentially private (DP) algorithms with dimension-independent margin guarantees. For the family of linear hypotheses, we give a pure DP learning algorithm that benefits from relative deviation margin guarante…
Scaling Language Model Size in Cross-Device Federated Learning
Most studies in cross-device federated learning focus on small models, due to the server-client communication and on-device computation bottlenecks. In this work, we leverage various techniques for mitigating these bottlenecks to train lar…
Correlated quantization for distributed mean estimation and optimization
We study the problem of distributed mean estimation and optimization under communication constraints. We propose a correlated quantization protocol whose leading term in the error guarantee depends on the mean deviation of data points rath…
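As background, the uncorrelated baseline is independent dithered (stochastic) rounding per client; correlating that randomness across clients is the paper's departure point. The baseline, sketched:

```python
import numpy as np

def stochastic_round(x: np.ndarray, rng) -> np.ndarray:
    """Unbiased 1-bit stochastic rounding of values in [0, 1]:
    round up with probability equal to the value."""
    return (rng.random(x.shape) < x).astype(np.float64)

# Each client quantizes independently; the server averages the bits.
rng = np.random.default_rng(0)
clients = rng.random((100, 8))   # 100 clients, 8 coordinates in [0, 1]
estimate = np.mean([stochastic_round(c, rng) for c in clients], axis=0)
```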
The Fundamental Price of Secure Aggregation in Differentially Private Federated Learning
We consider the problem of training a $d$-dimensional model with distributed differential privacy (DP) where secure aggregation (SecAgg) is used to ensure that the server only sees the noisy sum of $n$ model updates in every training round…
Proceedings of the First Workshop on Federated Learning for Natural Language Processing (FL4NLP 2022)
In the context of personalized federated learning (FL), the critical challenge is to balance local model improvement and global model tuning when the personal and global objectives may not be exactly aligned. Inspired by Bayesian hierarchic…
Remember What You Want to Forget: Algorithms for Machine Unlearning
We study the problem of unlearning datapoints from a learnt model. The learner first receives a dataset $S$ drawn i.i.d. from an unknown distribution, and outputs a model $\widehat{w}$ that performs well on unseen samples from the same dis…
On the Rényi Differential Privacy of the Shuffle Model
The central question studied in this paper is Rényi Differential Privacy (RDP) guarantees for general discrete local mechanisms in the shuffle privacy model. In the shuffle model, each of the $n$ clients randomizes its response using a loc…
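To fix ideas, a discrete local randomizer plus shuffler, with $k$-ary randomized response as the canonical local mechanism:

```python
import numpy as np

def k_randomized_response(x: int, k: int, eps: float, rng) -> int:
    """Report the true symbol w.p. e^eps / (e^eps + k - 1); otherwise
    report one of the other k-1 symbols uniformly."""
    if rng.random() < np.exp(eps) / (np.exp(eps) + k - 1):
        return x
    other = int(rng.integers(0, k - 1))
    return other if other < x else other + 1

def shuffle_model(data, k: int, eps: float, rng):
    """Each client randomizes locally; a trusted shuffler then permutes
    the reports, which is what yields the amplified central guarantee."""
    return rng.permutation([k_randomized_response(x, k, eps, rng) for x in data])
```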
Robust Estimation for Random Graphs
We study the problem of robustly estimating the parameter $p$ of an Erdős-Rényi random graph on $n$ nodes, where a $\gamma$ fraction of nodes may be adversarially corrupted. After showing the deficiencies of canonical estimators, we design a co…
HD-cos Networks: Efficient Neural Architectures for Secure Multi-Party Computation
Multi-party computation (MPC) is a branch of cryptography where multiple non-colluding parties execute a well-designed protocol to securely compute a function. With the non-colluding party assumption, MPC has a cryptographic guarantee that…