Martin J. Wainwright
Near-optimal inference in adaptive linear regression
When data is collected in an adaptive manner, even simple methods like ordinary least squares can exhibit non-normal asymptotic behavior. As an undesirable consequence, hypothesis tests and confidence intervals based on asymptotic normalit…
Sharp Results for Hypothesis Testing with Risk-Sensitive Agents
Statistical protocols are often used for decision-making involving multiple parties, each with their own incentives, private information, and ability to influence the distributional properties of the data. We study a game-theoretic version…
Inference under Staggered Adoption: Case Study of the Affordable Care Act
Panel data consists of a collection of $N$ units that are observed over $T$ units of time. A policy or treatment is subject to staggered adoption if different units take on treatment at different times and remain treated (or never at all)…
Prediction Aided by Surrogate Training
We study a class of prediction problems in which relatively few observations have associated responses, but all observations include both standard covariates as well as additional "helper" covariates. While the end goal is to make high-qua…
When is it worthwhile to jackknife? Breaking the quadratic barrier for Z-estimators
Resampling methods are especially well-suited to inference with estimators that provide only "black-box" access. Jackknife is a form of resampling, widely used for bias correction and variance estimation, that is well-understood under cla…
Instrumental variables: A non-asymptotic viewpoint
We provide a non-asymptotic analysis of the linear instrumental variable estimator allowing for the presence of exogenous covariates. In addition, we introduce a novel measure of the strength of an instrument that can be used to derive no…
Exploiting Exogenous Structure for Sample-Efficient Reinforcement Learning
We study Exo-MDPs, a structured class of Markov Decision Processes (MDPs) where the state space is partitioned into exogenous and endogenous components. Exogenous states evolve stochastically, independent of the agent's actions, while endo…
Finite-Sample Guarantees for Learning Dynamics in Zero-Sum Polymatrix Games
We study best-response type learning dynamics for zero-sum polymatrix games under two information settings. The two settings are distinguished by the type of information that each player has about the game and their opponents' strategy. Th…
Optimal and instance-dependent guarantees for Markovian linear stochastic approximation
We study stochastic approximation procedures for approximately solving a $d$-dimensional linear fixed-point equation based on observing a trajectory of length $n$ from an ergodic Markov chain. We first exhibit a non-asymptotic bound of the o…
Entrywise Inference for Missing Panel Data: A Simple and Instance-Optimal Approach
Longitudinal or panel data can be represented as a matrix with rows indexed by units and columns indexed by time. We consider inferential questions associated with the missing data version of panel data induced by staggered adoption. We pr…
Taming "data-hungry" reinforcement learning? Stability in continuous state-action spaces
We introduce a novel framework for analyzing reinforcement learning (RL) in continuous state-action spaces, and use it to prove fast rates of convergence in both off-line and on-line settings. Our analysis highlights two key stability prop…
A decorrelation method for general regression adjustment in randomized experiments
We study regression adjustment with general function class approximations for estimating the average treatment effect in the design-based setting. Standard regression adjustment involves bias due to sample re-use, and this bias leads to be…
Doubly High-Dimensional Contextual Bandits: An Interpretable Model for Joint Assortment-Pricing
Key challenges in running a retail business include how to select products to present to consumers (the assortment problem), and how to price products (the pricing problem) to maximize revenue or profit. Instead of considering these proble…
Challenges of the inconsistency regime: Novel debiasing methods for missing data models
We study semi-parametric estimation of the population mean when data is observed missing at random (MAR) in the $n < p$ "inconsistency regime", in which neither the outcome model nor the propensity/missingness model can be estimated consis…
When is the estimated propensity score better? High-dimensional analysis and bias correction
Anecdotally, using an estimated propensity score is superior to the true propensity score in estimating the average treatment effect based on observational data. However, this claim comes with several qualifications: it holds only if prope…
Noisy recovery from random linear observations: Sharp minimax rates under elliptical constraints
Estimation problems with constrained parameter spaces arise in various settings. In many of these problems, the observations available to the statistician can be modelled as arising from the noisy realization of the image of a random linea…
Semi-parametric inference based on adaptively collected data
Many standard estimators, when applied to adaptively collected data, fail to be asymptotically normal, thereby complicating the construction of confidence intervals. We address this challenge in a semi-parametric context: estimating the pa…
Kernel-based off-policy estimation without overlap: Instance optimality beyond semiparametric efficiency
We study optimal procedures for estimating a linear functional based on observational data. In many problems of this kind, a widely used assumption is strict overlap, i.e., uniform boundedness of the importance ratio, which measures how we…
Policy evaluation from a single path: Multi-step methods, mixing and mis-specification
We study non-parametric estimation of the value function of an infinite-horizon $γ$-discounted Markov reward process (MRP) using observations from a single trajectory. We provide non-asymptotic guarantees for a general family of kernel-bas…
Krylov-Bellman boosting: Super-linear policy evaluation in general state spaces
We present and analyze the Krylov-Bellman Boosting (KBB) algorithm for policy evaluation in general state spaces. It alternates between fitting the Bellman residual using non-parametric regression (as in boosting), and estimating the value…
QuTE: decentralized multiple testing on sensor networks with false discovery rate control
This paper designs methods for decentralized multiple hypothesis testing on graphs that are equipped with provable guarantees on the false discovery rate (FDR). We consider the setting where distinct agents reside on the nodes of an undire…
Off-policy estimation of linear functionals: Non-asymptotic theory for semi-parametric efficiency
The problem of estimating a linear functional based on observational data is canonical in both the causal inference and bandit literatures. We analyze a broad class of two-stage procedures that first estimate the treatment effect function,…
Stabilizing Q-learning with Linear Architectures for Provably Efficient Learning
The $Q$-learning algorithm is a simple and widely-used stochastic approximation scheme for reinforcement learning, but the basic protocol can exhibit instability in conjunction with function approximation. Such instability can be observed …
Optimally tackling covariate shift in RKHS-based nonparametric regression
We study the covariate shift problem in the context of nonparametric regression over a reproducing kernel Hilbert space (RKHS). We focus on two natural families of covariate shift problems defined using the likelihood ratios between the so…
Improved bounds for discretization of Langevin diffusions: Near-optimal rates without convexity
We consider minimizing a nonconvex, smooth function $f$ on a Riemannian manifold $\mathcal{M}$. We show that a perturbed version of Riemannian gradient descent algorithm converges to a second-order stationary point (and hence is able to e…
Bellman Residual Orthogonalization for Offline Reinforcement Learning
We propose and analyze a reinforcement learning principle that approximates the Bellman equations by enforcing their validity only along a user-defined space of test functions. Focusing on applications to model-free offline RL with functi…
A new similarity measure for covariate shift with applications to nonparametric regression
We study covariate shift in the context of nonparametric regression. We introduce a new measure of distribution mismatch between the source and target distributions that is based on the integrated ratio of probabilities of balls at a given…
Optimal variance-reduced stochastic approximation in Banach spaces
We study the problem of estimating the fixed point of a contractive operator defined on a separable Banach space. Focusing on a stochastic query model that provides noisy evaluations of the operator, we analyze a variance-reduced stochasti…