Gregory Valiant
Attainability of Two-Point Testing Rates for Finite-Sample Location Estimation
LeCam's two-point testing method yields perhaps the simplest lower bound for estimating the mean of a distribution: roughly, if it is impossible to well-distinguish a distribution centered at $μ$ from the same distribution centered at $μ+Δ…
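For reference, the two-point method mentioned above is usually stated as follows (standard form of Le Cam's lemma, not necessarily the paper's exact formulation): if no test can reliably distinguish the two candidate centers from $n$ samples, then no estimator can localize the mean to within $\Delta/2$.

```latex
\inf_{\hat{\mu}} \; \max_{\theta \in \{\mu,\, \mu+\Delta\}}
  \Pr_{\theta}\!\left[\, |\hat{\mu} - \theta| \ge \Delta/2 \,\right]
  \;\ge\; \frac{1 - d_{\mathrm{TV}}\!\left(P_{\mu}^{\otimes n},\, P_{\mu+\Delta}^{\otimes n}\right)}{2}.
```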
A Generalized Trace Reconstruction Problem: Recovering a String of Probabilities
We introduce the following natural generalization of trace reconstruction, parameterized by a deletion probability $δ\in (0,1)$ and length $n$: There is a length $n$ string of probabilities, $S=p_1,\ldots,p_n,$ and each "trace" is obtained…
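The abstract is truncated, but one natural reading of the model is: bit $i$ of each underlying string is drawn Bernoulli($p_i$), and the string is then passed through a deletion channel with parameter $δ$. A minimal sketch under that assumption (the paper's exact trace model may differ):

```python
import random

def sample_trace(probs, delta, rng=None):
    """Hypothetical trace sampler: draw bit i ~ Bernoulli(probs[i]),
    then independently delete each bit with probability delta."""
    rng = rng or random.Random()
    bits = [1 if rng.random() < p else 0 for p in probs]
    return [b for b in bits if rng.random() >= delta]
```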
Discovering Data Structures: Nearest Neighbor Search and Beyond
We propose a general framework for end-to-end learning of data structures. Our framework adapts to the underlying data distribution and provides fine-grained control over query and space complexity. Crucially, the data structure is learned…
Adaptive and oblivious statistical adversaries are equivalent
We resolve a fundamental question about the ability to perform a statistical task, such as learning, when an adversary corrupts the sample. Such adversaries are specified by the types of corruption they can make and their level of knowledg…
Efficient Convex Optimization Requires Superlinear Memory
We show that any memory-constrained, first-order algorithm which minimizes $d$-dimensional, $1$-Lipschitz convex functions over the unit ball to $1/\mathrm{poly}(d)$ accuracy using at most $d^{1.25 - δ}$ bits of memory must make at least $\tilde{\Omega}…
Near-Optimal Mean Estimation with Unknown, Heteroskedastic Variances
Given data drawn from a collection of Gaussian variables with a common mean but different and unknown variances, what is the best algorithm for estimating their common mean? We present an intuitive and efficient algorithm for this task. As…
Matrix Multiplication in Quadratic Time and Energy? Towards a Fine-Grained Energy-Centric Church-Turing Thesis
We describe two algorithms for multiplying $n \times n$ matrices using time and energy $n^2\,\mathrm{polylog}(n)$ under basic models of classical physics. The first algorithm is for multiplying integer-valued matrices, and the second, quite different algorit…
Testing with Non-identically Distributed Samples
We examine the extent to which sublinear-sample property testing and estimation apply to settings where samples are independently but not identically distributed. Specifically, we consider the following distributional property testing fram…
Efficient Convex Optimization Requires Superlinear Memory (Extended Abstract)
Minimizing a convex function with access to a first order oracle---that returns the function evaluation and (sub)gradient at a query point---is a canonical optimization problem and a fundamental primitive in machine learning. Gradient-base…
One-sided Matrix Completion from Two Observations Per Row
Given only a few observed entries from a low-rank matrix $X$, matrix completion is the problem of imputing the missing entries, and it formalizes a wide range of real-world settings that involve estimating missing data. However, when there…
Lexinvariant Language Models
Token embeddings, a mapping from discrete lexical symbols to continuous vectors, are at the heart of any language model (LM). However, lexical symbol meanings can also be determined and even redefined by their structural role in a long con…
Online Pen Testing
We study a "pen testing" problem, in which we are given n pens with unknown amounts of ink X₁, X₂, …, X_n, and we want to choose a pen with the maximum amount of remaining ink in it. The challenge is that we cannot access each X_i directly…
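A toy model of the access constraint described above, assuming a pen can only be probed by attempting to write for a chosen duration, which consumes that much ink (the paper's formal model may differ):

```python
class Pen:
    """Toy pen with a hidden ink amount X_i. The only way to probe it
    is try_write, which consumes ink; the remaining amount is never
    revealed directly."""

    def __init__(self, ink):
        self._ink = ink        # hidden amount X_i
        self.consumed = 0.0    # total ink spent probing this pen

    def try_write(self, duration):
        """Attempt to write for `duration`; returns True iff the pen
        lasted the full duration. Either way, the ink used is gone."""
        used = min(duration, self._ink)
        self._ink -= used
        self.consumed += used
        return used == duration
```

Any strategy thus trades off learning about a pen against destroying the very ink it hopes to keep.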
What Can Transformers Learn In-Context? A Case Study of Simple Function Classes
In-context learning refers to the ability of a model to condition on a prompt sequence consisting of in-context examples (input-output pairs corresponding to some task) along with a new query input, and generate the corresponding output. C…
From Sand to Flour: The Next Leap in Granular Computing with NanoSort
The granularity of distributed computing is limited by communication time: there is no point in farming out smaller and smaller tasks if the communication overhead dominates the decrease in processing time due to the added parallelism. In …
On the Statistical Complexity of Sample Amplification
The "sample amplification" problem formalizes the following question: Given $n$ i.i.d. samples drawn from an unknown distribution $P$, when is it possible to produce a larger set of $n+m$ samples which cannot be distinguished from $n+m$ …
Big-Step-Little-Step: Efficient Gradient Methods for Objectives with Multiple Scales
We provide new gradient-based methods for efficiently solving a broad class of ill-conditioned optimization problems. We consider the problem of minimizing a function $f : \mathbb{R}^d \rightarrow \mathbb{R}$ which is implicitly decomposab…
ReporterSeq reveals genome-wide dynamic modulators of the heat shock response across diverse stressors
Understanding cellular stress response pathways is challenging because of the complexity of regulatory mechanisms and response dynamics, which can vary with both time and the type of stress. We developed a reverse genetic method called Rep…
Exponential Weights Algorithms for Selective Learning
We study the selective learning problem introduced by Qiao and Valiant (2019), in which the learner observes $n$ labeled data points one at a time. At a time of its choosing, the learner selects a window length $w$ and a model $\hat\ell$ f…
Author response: ReporterSeq reveals genome-wide dynamic modulators of the heat shock response across diverse stressors
Understanding cellular stress response pat…
Sinkhorn Label Allocation: Semi-Supervised Classification via Annealed Self-Training
Self-training is a standard approach to semi-supervised learning where the learner's own predictions on unlabeled data are used as supervision during training. In this paper, we reinterpret this label assignment process as an optimal trans…
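The core computational tool suggested by the title is Sinkhorn normalization. An illustrative sketch (not the paper's exact procedure): alternately rescale a score matrix so that each example distributes one unit of label mass and each class receives a balanced share, yielding soft pseudo-labels.

```python
import numpy as np

def sinkhorn_assign(scores, n_iters=200, temp=1.0):
    """Illustrative Sinkhorn label assignment: scores is (n examples x
    k classes); returns a soft assignment whose rows sum to 1 and whose
    columns are (approximately) balanced at n/k mass each."""
    P = np.exp(scores / temp)                          # positive kernel
    n, k = P.shape
    for _ in range(n_iters):
        P /= P.sum(axis=1, keepdims=True)              # rows sum to 1
        P *= (n / k) / P.sum(axis=0, keepdims=True)    # cols sum to n/k
    return P / P.sum(axis=1, keepdims=True)            # per-example distribution
```

Annealing would correspond to lowering `temp` over training so assignments harden from uniform toward one-hot.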
On Misspecification in Prediction Problems and Robustness via Improper Learning
We study probabilistic prediction games when the underlying model is misspecified, investigating the consequences of predicting using an incorrect parametric model. We show that for a broad class of loss functions and parametric families o…
Beyond Laurel/Yanny: An Autoencoder-Enabled Search for Polyperceivable Audio
Kartik Chandra, Chuma Kabaghe, Gregory Valiant. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers). 20…
Stronger Calibration Lower Bounds via Sidestepping
We consider an online binary prediction setting where a forecaster observes a sequence of $T$ bits one by one. Before each bit is revealed, the forecaster predicts the probability that the bit is $1$. The forecaster is called well-calibrat…
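One common way to quantify calibration in this setting (illustrative; the paper may use a different measure) is the ℓ1 calibration error: for each distinct predicted probability, compare it with the empirical frequency of 1s on the rounds where it was predicted, weighted by how often it was used.

```python
from collections import defaultdict

def l1_calibration_error(predictions, outcomes):
    """l1 calibration error of a sequence of probability forecasts
    against binary outcomes: sum over prediction values p of
    (#rounds predicting p) * |p - empirical frequency of 1s|, over T."""
    buckets = defaultdict(list)
    for p, y in zip(predictions, outcomes):
        buckets[p].append(y)
    T = len(predictions)
    return sum(len(ys) * abs(p - sum(ys) / len(ys))
               for p, ys in buckets.items()) / T
```

A forecaster that predicts 0.5 on a sequence that is half 1s is perfectly calibrated under this measure, even though its predictions are uninformative.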
On the Generalization Effects of Linear Transformations in Data Augmentation
Data augmentation is a powerful technique to improve performance in applications such as image and text classification tasks. Yet, there is little rigorous understanding of why and how various augmentations work. In this work, we consider …
Genome-wide, time-sensitive interrogation of the heat shock response under diverse stressors via ReporterSeq
Interrogating cellular stress response pathways is challenging because of the complexity of regulatory mechanisms and response dynamics, which can vary with both time and the type of stress. We developed a reverse genetic method called Rep…