Kunal Talwar
Instance-Optimality for Private KL Distribution Estimation
We study the fundamental problem of estimating an unknown discrete distribution $p$ over $d$ symbols, given $n$ i.i.d. samples from the distribution. We are interested in minimizing the KL divergence between the true distribution and the a…
Faster Rates for Private Adversarial Bandits
We design new differentially private algorithms for the problems of adversarial bandits and bandits with expert advice. For adversarial bandits, we give a simple and efficient conversion of any non-private bandit algorithm to a private ban…
On Privately Estimating a Single Parameter
We investigate differentially private estimators for individual parameters within larger parametric models. While generic private estimators exist, the estimators we provide rely on new local notions of estimand stability, and these noti…
Privacy-Computation Trade-Offs in Private Repetition and Metaselection
A Private Repetition algorithm takes as input a differentially private algorithm with constant success probability and boosts it to one that succeeds with high probability. These algorithms are closely related to private metaselection algo…
Fingerprinting Codes Meet Geometry: Improved Lower Bounds for Private Query Release and Adaptive Data Analysis
Fingerprinting codes are a crucial tool for proving lower bounds in differential privacy. They have been used to prove tight lower bounds for several fundamental questions, especially in the "low accuracy" regime. Unlike reconstruction/d…
Adaptive Batch Size for Privately Finding Second-Order Stationary Points
There is a gap between finding a first-order stationary point (FOSP) and a second-order stationary point (SOSP) under differential privacy constraints, and it remains unclear whether privately finding an SOSP is more challenging than findi…
Improved Sample Complexity for Private Nonsmooth Nonconvex Optimization
We study differentially private (DP) optimization algorithms for stochastic and empirical objectives which are neither smooth nor convex, and propose methods that return a Goldstein-stationary point with sample complexity bounds that impro…
Instance-Optimal Private Density Estimation in the Wasserstein Distance
Estimating the density of a distribution from samples is a fundamental problem in statistics. In many practical settings, the Wasserstein distance is an appropriate error metric for density estimation. For example, when estimating populati…
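For intuition about the error metric in this entry (this is an illustrative baseline, not the paper's algorithm): in one dimension, the Wasserstein-1 distance between two equal-size empirical distributions reduces to the mean absolute difference of their sorted samples.

```python
import numpy as np

def wasserstein1_empirical(x, y):
    """W1 distance between two empirical distributions on the real line
    with equal sample sizes: sort both samples and average the pointwise
    absolute differences."""
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    assert x.shape == y.shape, "equal sample sizes assumed in this sketch"
    return np.abs(x - y).mean()

print(wasserstein1_empirical([0, 1, 2], [0.5, 1.5, 2.5]))  # → 0.5
```

Shifting every sample by a constant c shifts the empirical distribution by c, and the computed distance is exactly |c|, matching the transport interpretation.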
Scalable Private Search with Wally
This paper presents Wally, a private search system that supports efficient search queries against large databases. When sufficiently many clients are making queries, Wally's performance is significantly better than previous systems while p…
Private Online Learning via Lazy Algorithms
We study the problem of private online learning, specifically, online prediction from experts (OPE) and online convex optimization (OCO). We propose a new transformation that transforms lazy online learning algorithms into private algorith…
Private Vector Mean Estimation in the Shuffle Model: Optimal Rates Require Many Messages
We study the problem of private vector mean estimation in the shuffle model of privacy where $n$ users each have a unit vector $v^{(i)} \in\mathbb{R}^d$. We propose a new multi-message protocol that achieves the optimal error using $\tilde…
Communication Complexity and Discrepancy of Halfplanes
We study the discrepancy of the following communication problem. Alice receives a halfplane, and Bob receives a point in the plane, and their goal is to determine whether Bob’s point belongs to Alice’s halfplane. This communication task co…
PINE: Efficient Norm-Bound Verification for Secret-Shared Vectors
Secure aggregation of high-dimensional vectors is a fundamental primitive in federated statistics and learning. A two-server system such as PRIO allows for scalable aggregation of secret-shared vectors. Adversarial clients might try to man…
Enabling Differentially Private Federated Learning for Speech Recognition: Benchmarks, Adaptive Optimizers and Gradient Clipping
While federated learning (FL) and differential privacy (DP) have been extensively studied, their application to automatic speech recognition (ASR) remains largely unexplored due to the challenges in training large transformer models. Speci…
Mean Estimation with User-level Privacy under Data Heterogeneity
A key challenge in many modern data analysis tasks is that user data are heterogeneous. Different users may possess vastly different numbers of data points. More importantly, it cannot be assumed that all users sample from the same underly…
Samplable Anonymous Aggregation for Private Federated Data Analysis
We revisit the problem of designing scalable protocols for private statistics and private federated learning when each device holds its private data. Locally differentially private algorithms require little trust but are (provably) limited…
Differentially Private Heavy Hitter Detection using Federated Analytics
In this work, we study practical heuristics to improve the performance of prefix-tree based algorithms for differentially private heavy hitter detection. Our model assumes each user has multiple data points and the goal is to learn as many…
Fast Optimal Locally Private Mean Estimation via Random Projections
We study the problem of locally private mean estimation of high-dimensional vectors in the Euclidean ball. Existing algorithms for this problem either incur sub-optimal error or have high communication and/or run-time complexity. We propos…
Near-Optimal Algorithms for Private Online Optimization in the Realizable Regime
We consider online learning problems in the realizable setting, where there is a zero-loss solution, and propose new Differentially Private (DP) algorithms that obtain near-optimal regret bounds. For the problem of online prediction from e…
Concentration of the Langevin Algorithm's Stationary Distribution
A canonical algorithm for log-concave sampling is the Langevin Algorithm, a.k.a. the Langevin Diffusion run with some discretization stepsize $\eta > 0$. This discretization leads the Langevin Algorithm to have a stationary distribution $\pi_\eta$ whi…
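For reference, a minimal NumPy sketch of the (unadjusted) Langevin Algorithm this abstract studies, with an illustrative target and stepsize (the concentration results themselves are in the paper):

```python
import numpy as np

def langevin_algorithm(grad_f, x0, eta, n_steps, rng=None):
    """Run the discretized Langevin Diffusion:
        x_{k+1} = x_k - eta * grad_f(x_k) + sqrt(2 * eta) * N(0, I).
    For small stepsize eta, iterates approximately sample from the
    density proportional to exp(-f)."""
    rng = rng or np.random.default_rng(0)
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x - eta * grad_f(x) + np.sqrt(2.0 * eta) * noise
    return x

# Illustrative target: standard Gaussian, f(x) = ||x||^2 / 2, so grad_f(x) = x.
sample = langevin_algorithm(lambda x: x, np.zeros(5), eta=0.01, n_steps=2000)
```

The stationary distribution of this chain is the $\pi_\eta$ the abstract refers to, which differs from $\exp(-f)$ by a discretization bias that vanishes as $\eta \to 0$.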
Private Federated Statistics in an Interactive Setting
Privately learning statistics of events on devices can enable improved user experience. Differentially private algorithms for such problems can benefit significantly from interactivity. We argue that an aggregation protocol can enable an i…
Private Online Prediction from Experts: Separations and Faster Rates
Online prediction from experts is a fundamental problem in machine learning and several works have studied this problem under privacy constraints. We propose and analyze new algorithms for this problem that improve over the regret bounds o…
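For context, the standard non-private baseline for online prediction from experts is the exponential-weights forecaster; this sketch assumes losses in [0, 1] and a fixed learning rate eta (both illustrative, and none of it is the private algorithm from this entry):

```python
import numpy as np

def exponential_weights(losses, eta):
    """Exponential-weights (multiplicative-weights) forecaster.
    losses: array of shape (n_rounds, n_experts) with entries in [0, 1].
    Returns the forecaster's total expected loss over all rounds."""
    n_rounds, n_experts = losses.shape
    w = np.ones(n_experts)          # one weight per expert
    total_loss = 0.0
    for t in range(n_rounds):
        p = w / w.sum()             # play experts with these probabilities
        total_loss += p @ losses[t]
        w *= np.exp(-eta * losses[t])  # downweight experts that did badly
    return total_loss

rng = np.random.default_rng(0)
losses = rng.uniform(size=(100, 5))
total = exponential_weights(losses, eta=0.1)
```

Private variants must keep the played distribution stable when any single loss vector changes, which is where the regret overhead studied in these papers comes from.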
Subspace Recovery from Heterogeneous Data with Non-isotropic Noise
Recovering linear subspaces from data is a fundamental and important task in statistics and machine learning. Motivated by heterogeneity in Federated Learning settings, we study a basic formulation of this problem: the principal component …
Resolving the Mixing Time of the Langevin Algorithm to its Stationary Distribution for Log-Concave Sampling
Sampling from a high-dimensional distribution is a fundamental task in statistics, engineering, and the sciences. A canonical approach is the Langevin Algorithm, i.e., the Markov chain for the discretized Langevin Diffusion. This is the sa…
Stronger Privacy Amplification by Shuffling for Rényi and Approximate Differential Privacy
The shuffle model of differential privacy has gained significant interest as an intermediate trust model between the standard local and central models [EFMRTT19; CSUZZ19]. A key result in this model is that randomly shuffling locally rando…
FLAIR: Federated Learning Annotated Image Repository
Cross-device federated learning is an emerging machine learning (ML) paradigm where a large population of devices collectively train an ML model while the data remains on the devices. This research field has a unique set of practical chall…
Privacy of Noisy Stochastic Gradient Descent: More Iterations without More Privacy Loss
A central issue in machine learning is how to train models on sensitive user data. Industry has widely adopted a simple algorithm: Stochastic Gradient Descent with noise (a.k.a. Stochastic Gradient Langevin Dynamics). However, foundational…
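As background, a minimal NumPy sketch of the noisy clipped SGD algorithm the abstract describes; the clipping norm, noise multiplier, batch size, and toy least-squares objective are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def noisy_sgd(grad_fn, data, x0, lr, clip_norm, noise_mult,
              n_steps, batch_size, rng=None):
    """Noisy SGD: per-example gradients are clipped to l2 norm clip_norm,
    averaged over the batch, and Gaussian noise with standard deviation
    noise_mult * clip_norm / batch_size is added before the step."""
    rng = rng or np.random.default_rng(0)
    x = np.asarray(x0, dtype=float)
    n = len(data)
    for _ in range(n_steps):
        batch = data[rng.choice(n, size=batch_size, replace=False)]
        grads = np.stack([grad_fn(x, z) for z in batch])
        norms = np.linalg.norm(grads, axis=1, keepdims=True)
        grads = grads * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
        noise = rng.standard_normal(x.shape) * (noise_mult * clip_norm / batch_size)
        x = x - lr * (grads.mean(axis=0) + noise)
    return x

# Toy example: mean estimation via least squares, grad of ||x - z||^2 / 2 is x - z.
data = np.random.default_rng(1).normal(size=(100, 3))
x_hat = noisy_sgd(lambda x, z: x - z, data, np.zeros(3), lr=0.1,
                  clip_norm=1.0, noise_mult=1.0, n_steps=200, batch_size=10)
```

The question the abstract raises is how the privacy loss of releasing only the final iterate `x_hat` behaves as `n_steps` grows.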
Optimal Algorithms for Mean Estimation under Local Differential Privacy
We study the problem of mean estimation of $\ell_2$-bounded vectors under the constraint of local differential privacy. While the literature has a variety of algorithms that achieve the asymptotically optimal rates for this problem, the pe…
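For context, the simplest local randomizer for this task is the Gaussian mechanism applied on-device, with the server averaging the noisy reports; this baseline sketch (with illustrative eps and delta) is not the optimal algorithm from this entry:

```python
import numpy as np

def gaussian_local_randomizer(v, eps, delta, rng=None):
    """Gaussian mechanism as an (eps, delta)-LDP local randomizer for a
    unit vector v: the l2 distance between any two unit vectors is at
    most 2, so sigma = 2 * sqrt(2 * ln(1.25 / delta)) / eps suffices."""
    rng = rng or np.random.default_rng(0)
    sigma = 2.0 * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    return v + rng.standard_normal(np.shape(v)) * sigma

# Server side: average the n noisy reports; the per-coordinate noise
# averages out at rate sigma / sqrt(n).
rng = np.random.default_rng(1)
vs = rng.standard_normal((500, 10))
vs /= np.linalg.norm(vs, axis=1, keepdims=True)   # 500 users, unit vectors
reports = np.stack([gaussian_local_randomizer(v, eps=1.0, delta=1e-6, rng=rng)
                    for v in vs])
estimate = reports.mean(axis=0)
```

Asymptotically optimal protocols match this error rate up to constants; the entry above is concerned with getting the constants themselves right.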