Jonathan Ullman
Lower Bounds for Public-Private Learning under Distribution Shift
The most effective differentially private machine learning algorithms in practice rely on an additional source of purportedly public data. This paradigm is most interesting when the two sources combine to be more than the sum of their part…
Privacy in Metalearning and Multitask Learning: Modeling and Separations
Model personalization allows a set of individuals, each facing a different learning task, to train models that are more accurate for each person than those they could develop individually. The goals of personalization are captured in a var…
TMI! Finetuned Models Leak Private Information from their Pretraining Data
Transfer learning has become an increasingly popular technique in machine learning as a way to leverage a pretrained model trained for one task to assist with building a finetuned model for a related task. This paradigm has been especially…
Program Analysis for Adaptive Data Analysis
Data analyses are usually designed to identify some property of the population from which the data are drawn, generalizing beyond the specific data sample. For this reason, data analyses are often designed in a way that guarantees that the…
Private Geometric Median
In this paper, we study differentially private (DP) algorithms for computing the geometric median (GM) of a dataset: Given $n$ points, $x_1,\dots,x_n$ in $\mathbb{R}^d$, the goal is to find a point $\theta$ that minimizes the sum of the Euclide…
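The objective described in the abstract, minimizing the sum of Euclidean distances to the data points, is classically solved (without privacy) by Weiszfeld's iteratively reweighted averaging. As background, here is a minimal non-private sketch; it is not the paper's DP algorithm:

```python
import numpy as np

def geometric_median(points, iters=100, eps=1e-8):
    """Weiszfeld's algorithm: repeatedly average the points weighted by
    inverse distance to the current estimate. Non-private baseline only."""
    theta = points.mean(axis=0)          # start from the centroid
    for _ in range(iters):
        d = np.linalg.norm(points - theta, axis=1)
        d = np.maximum(d, eps)           # avoid division by zero at a data point
        w = 1.0 / d
        theta = (w[:, None] * points).sum(axis=0) / w.sum()
    return theta
```

Unlike the empirical mean, this minimizer moves only slightly when a single point is changed, which is one reason the geometric median is a natural target for DP estimation.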
Private Mean Estimation with Person-Level Differential Privacy
We study person-level differentially private (DP) mean estimation in the case where each person holds multiple samples. DP here requires the usual notion of distributional stability when $\textit{all}$ of a person's datapoints can be modif…
How to Make the Gradients Small Privately: Improved Rates for Differentially Private Non-Convex Optimization
We provide a simple and flexible framework for designing differentially private algorithms to find approximate stationary points of non-convex loss functions. Our framework is based on using a private approximate risk minimizer to "warm st…
Differentially Private Medians and Interior Points for Non-Pathological Data
We construct sample-efficient differentially private estimators for the approximate-median and interior-point problems, that can be applied to arbitrary input distributions over $\mathbb{R}$ satisfying very mild statistical assumptions. Our results s…
Metalearning with Very Few Samples Per Task
Metalearning and multitask learning are two frameworks for solving a group of related learning tasks more efficiently than we could hope to solve each of the individual tasks on their own. In multitask learning, we are given a fixed set of…
Chameleon: Increasing Label-Only Membership Leakage with Adaptive Poisoning
The integration of machine learning (ML) in numerous critical applications introduces a range of privacy concerns for individuals who provide their datasets for model training. One such privacy risk is Membership Inference (MI), in which a…
Smooth Lower Bounds for Differentially Private Algorithms via Padding-and-Permuting Fingerprinting Codes
Fingerprinting arguments, first introduced by Bun, Ullman, and Vadhan (STOC 2014), are the most widely used method for establishing lower bounds on the sample complexity or error of approximately differentially private (DP) algorithms. Sti…
Investigating the Visual Utility of Differentially Private Scatterplots
Increasingly, visualization practitioners are working with, using, and studying private and sensitive data. There can be many stakeholders interested in the resulting analyses, but widespread sharing of the data can cause harm to individual…
How to Combine Membership-Inference Attacks on Multiple Updated Machine Learning Models
A large body of research has shown that machine learning models are vulnerable to membership inference (MI) attacks that violate the privacy of the participants in the training data. Most MI research focuses on the case of a single standal…
From Robustness to Privacy and Back
We study the relationship between two desiderata of algorithms in statistical inference and machine learning: differential privacy and robustness to adversarial data corruptions. Their conceptual similarity was first observed by Dwork and …
A Bias-Accuracy-Privacy Trilemma for Statistical Estimation
Differential privacy (DP) is a rigorous notion of data privacy, used for private statistics. The canonical algorithm for differentially private mean estimation is to first clip the samples to a bounded range and then add noise to their emp…
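The clip-then-noise recipe named in the abstract can be sketched in a few lines. This is an illustrative sketch, not the paper's construction; the clipping range and the choice of Laplace noise here are assumptions, and the bias that clipping introduces is exactly the phenomenon the paper studies:

```python
import numpy as np

def dp_mean(x, clip_range, epsilon, rng=None):
    """Canonical clip-then-noise DP mean estimator (illustrative sketch).
    Clipping each sample to [lo, hi] bounds the effect of any one sample
    on the empirical mean by (hi - lo) / n, so Laplace noise calibrated
    to that sensitivity yields epsilon-DP."""
    rng = np.random.default_rng() if rng is None else rng
    lo, hi = clip_range
    clipped = np.clip(x, lo, hi)        # biased if data fall outside the range
    sensitivity = (hi - lo) / len(x)    # max change from altering one sample
    noise = rng.laplace(scale=sensitivity / epsilon)
    return clipped.mean() + noise
```

Shrinking `clip_range` reduces the noise needed for privacy but increases the clipping bias, which is the tension behind the trilemma in the title.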
Multitask Learning via Shared Features: Algorithms and Hardness
We investigate the computational efficiency of multitask learning of Boolean functions over the $d$-dimensional hypercube, that are related by means of a feature representation of size $k \ll d$ shared across all tasks. We present a polyno…
SNAP: Efficient Extraction of Private Properties with Poisoning
Property inference attacks allow an adversary to extract global properties of the training dataset from a machine learning model. Such attacks have privacy implications for data owners sharing their datasets to train machine learning model…
A Private and Computationally-Efficient Estimator for Unbounded Gaussians
We give the first polynomial-time, polynomial-sample, differentially private estimator for the mean and covariance of an arbitrary Gaussian distribution $\mathcal{N}(\mu,\Sigma)$ in $\mathbb{R}^d$. All previous estimators are either nonconstructi…
Covariance-Aware Private Mean Estimation Without Private Covariance Estimation
We present two sample-efficient differentially private mean estimators for $d$-dimensional (sub)Gaussian distributions with unknown covariance. Informally, given $n \gtrsim d/\alpha^2$ samples from such a distribution with mean $\mu$ and covarian…
The Limits of Pan Privacy and Shuffle Privacy for Learning and Estimation
There has been a recent wave of interest in intermediate trust models for differential privacy that eliminate the need for a fully trusted central data collector, but overcome the limitations of local differential privacy. This interest ha…
Leveraging Public Data for Practical Private Query Release
In many statistical problems, incorporating priors can significantly improve performance. However, the use of prior knowledge in differentially private query release has remained underexplored, despite such priors commonly being available …
Fair and Optimal Cohort Selection for Linear Utilities
The rise of algorithmic decision-making has created an explosion of research around the fairness of those algorithms. While there are many compelling notions of individual fairness, beginning with the work of Dwork et al., these notions ty…
Manipulation Attacks in Local Differential Privacy
Local differential privacy is a widely studied restriction on distributed algorithms that collect aggregates about sensitive user data, and is now deployed in several large systems. We initiate a systematic study of a fundamental limitatio…
Efficiently Estimating Erdos-Renyi Graphs with Node Differential Privacy
We give a simple, computationally efficient, and node-differentially-private algorithm for estimating the parameter of an Erdos-Renyi graph---that is, estimating p in a G(n,p)---with near-optimal accuracy. Our algorithm nearly matches the …
Algorithmic Stability for Adaptive Data Analysis
Adaptivity is an important feature of data analysis---the choice of questions to ask about a dataset often depends on previous interactions with the same dataset. However, statistical validity is typically studied in a nonadaptive model, w…