Patrick Rebeschini
Best-of-Both Worlds for linear contextual bandits with paid observations
We study the problem of linear contextual bandits with paid observations, where at each round the learner selects an action in order to minimize its loss in a given context, and can then decide to pay a fixed cost to observe the loss of an…
Implicit Regularisation in Diffusion Models: An Algorithm-Dependent Generalisation Analysis
The success of denoising diffusion models raises important questions regarding their generalisation behaviour, particularly in high-dimensional settings. Notably, it has been shown that when training and sampling are performed perfectly, t…
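For context, the training objective that such generalisation analyses typically concern is the standard denoising (DDPM-style) objective, written here in its usual generic form rather than as this paper's specific setup:

$$\min_\theta \; \mathbb{E}_{t,\, x_0,\, \epsilon \sim \mathcal{N}(0, I)} \Big[ \big\| \epsilon_\theta\big(\sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\, \epsilon,\; t\big) - \epsilon \big\|^2 \Big],$$

where $\bar{\alpha}_t$ is the noise schedule and $\epsilon_\theta$ the denoising network.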
On the necessity of adaptive regularisation: Optimal anytime online learning on $\boldsymbol{\ell_p}$-balls
We study online convex optimization on $\ell_p$-balls in $\mathbb{R}^d$ for $p > 2$. While always sub-linear, the optimal regret exhibits a shift between the high-dimensional setting ($d > T$), when the dimension $d$ is greater than the ti…
Non-stationary Bandit Convex Optimization: A Comprehensive Study
Bandit Convex Optimization is a fundamental class of sequential decision-making problems, where the learner selects actions from a continuous domain and observes a loss (but not its gradient) at only one point per round. We study this prob…
Early-Stopped Mirror Descent for Linear Regression over Convex Bodies
Early-stopped iterative optimization methods are widely used as alternatives to explicit regularization, and direct comparisons between early-stopping and explicit regularization have been established for many optimization geometries. Howe…
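As a rough illustration of the kind of procedure studied here, the following sketch runs mirror descent on a least-squares objective with an $\ell_q$-norm potential and stops when the training residual first reaches the noise level; the potential, step size, and stopping rule are illustrative assumptions, not the paper's exact setup.

import numpy as np

def md_linear_regression(X, y, q=1.5, eta=0.01, noise_level=1.0, max_iter=5000):
    """Mirror descent on least squares with potential psi(w) = (1/q) * sum |w_i|^q,
    stopped early when the training residual drops to the noise level.
    A minimal sketch; q, eta and the stopping rule are illustrative choices."""
    n, d = X.shape
    theta = np.zeros(d)  # dual (mirror) iterate, theta = grad psi(w)
    w = np.zeros(d)      # primal iterate
    for _ in range(max_iter):
        grad = X.T @ (X @ w - y) / n  # gradient of the empirical risk
        theta = theta - eta * grad    # gradient step in the dual space
        # invert grad psi: w_i = sign(theta_i) * |theta_i|^(1/(q-1))
        w = np.sign(theta) * np.abs(theta) ** (1.0 / (q - 1.0))
        if np.linalg.norm(X @ w - y) / np.sqrt(n) <= noise_level:
            break  # early stopping in place of explicit regularization
    return w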
Uniform mean estimation for monotonic processes
We consider the problem of deriving uniform confidence bands for the mean of a monotonic stochastic process, such as the cumulative distribution function (CDF) of a random variable, based on a sequence of i.i.d. observations. Our approach …
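For comparison, the classical Dvoretzky-Kiefer-Wolfowitz inequality gives a uniform band in the CDF case: with probability at least $1 - \delta$,

$$\sup_{x} \big| \widehat{F}_n(x) - F(x) \big| \le \sqrt{\frac{\log(2/\delta)}{2n}},$$

where $\widehat{F}_n$ is the empirical CDF of $n$ i.i.d. observations; the bands described in this abstract target the more general monotonic-process setting.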
Black-Box Uniform Stability for Non-Euclidean Empirical Risk Minimization
We study first-order algorithms that are uniformly stable for empirical risk minimization (ERM) problems that are convex and smooth with respect to $p$-norms, $p \geq 1$. We propose a black-box reduction method that, by employing propertie…
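The notion in play is standard: an algorithm $A$ is $\varepsilon$-uniformly stable if, for all datasets $S, S'$ differing in a single sample and all test points $z$,

$$\big| \ell(A(S), z) - \ell(A(S'), z) \big| \le \varepsilon.$$

This is the textbook definition rather than anything specific to the reduction proposed here.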
Meta-Learning Objectives for Preference Optimization
Evaluating preference optimization (PO) algorithms on LLM alignment is a challenging task that presents prohibitive costs, noise, and several variables like model size and hyper-parameters. In this work, we show that it is possible to gain…
Robust Gradient Descent for Phase Retrieval
Recent progress in robust statistical learning has mainly tackled convex problems, like mean estimation or linear regression, with non-convex challenges receiving less attention. Phase retrieval exemplifies such a non-convex problem, requi…
Differentiable Cost-Parameterized Monge Map Estimators
Within the field of optimal transport (OT), the choice of ground cost is crucial to ensuring that the optimality of a transport map corresponds to usefulness in real-world applications. It is therefore desirable to use known information to…
Learning mirror maps in policy mirror descent
Policy Mirror Descent (PMD) is a popular framework in reinforcement learning, serving as a unifying perspective that encompasses numerous algorithms. These algorithms are derived through the selection of a mirror map and enjoy finite-time …
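To make the framework concrete, here is a minimal tabular sketch of PMD with the negative-entropy mirror map, for which the update reduces to a multiplicative-weights step $\pi_{k+1}(a|s) \propto \pi_k(a|s) \exp(\eta Q_k(s,a))$; the exact-evaluation loop and the step size are illustrative choices, not the learned mirror maps this paper studies.

import numpy as np

def policy_evaluation(P, r, pi, gamma):
    """Exact Q^pi for a tabular MDP: P has shape (S, A, S), r has shape (S, A)."""
    S, A, _ = P.shape
    P_pi = np.einsum('sap,sa->sp', P, pi)   # transition matrix under pi
    r_pi = np.sum(pi * r, axis=1)           # expected reward under pi
    V = np.linalg.solve(np.eye(S) - gamma * P_pi, r_pi)
    return r + gamma * np.einsum('sap,p->sa', P, V)  # Q(s, a)

def pmd_kl(P, r, gamma=0.9, eta=1.0, iters=100):
    """Policy mirror descent with the negative-entropy mirror map.
    A minimal sketch with exact evaluation; eta is an illustrative choice."""
    S, A, _ = P.shape
    pi = np.full((S, A), 1.0 / A)           # start from the uniform policy
    for _ in range(iters):
        Q = policy_evaluation(P, r, pi, gamma)
        pi = pi * np.exp(eta * Q)           # multiplicative-weights update
        pi /= pi.sum(axis=1, keepdims=True) # renormalize per state
    return pi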
Generalization Bounds for Label Noise Stochastic Gradient Descent
We develop generalization error bounds for stochastic gradient descent (SGD) with label noise in non-convex settings under uniform dissipativity and smoothness conditions. Under a suitable choice of semimetric, we establish a contraction i…
Sample-Efficiency in Multi-Batch Reinforcement Learning: The Need for Dimension-Dependent Adaptivity
We theoretically explore the relationship between sample-efficiency and adaptivity in reinforcement learning. An algorithm is sample-efficient if it uses a number of queries $n$ to the environment that is polynomial in the dimension $d$ of…
The Statistical Complexity of Early-Stopped Mirror Descent
Recently there has been a surge of interest in understanding implicit regularization properties of iterative gradient-based optimization algorithms. In this paper, we study the statistical guarantees on the excess risk achieved by early-st…
Optimal Convergence Rate for Exact Policy Mirror Descent in Discounted Markov Decision Processes
Policy Mirror Descent (PMD) is a general family of algorithms that covers a wide range of novel and fundamental methods in reinforcement learning. Motivated by the instability of policy iteration (PI) with inexact policy evaluation, PMD al…
A Novel Framework for Policy Mirror Descent with General Parameterization and Linear Convergence
Modern policy optimization methods in reinforcement learning, such as TRPO and PPO, owe their success to the use of parameterized policies. However, while theoretical guarantees have been established for this class of algorithms, especiall…
Nearly Minimax-Optimal Rates for Noisy Sparse Phase Retrieval via Early-Stopped Mirror Descent
This paper studies early-stopped mirror descent applied to noisy sparse phase retrieval, which is the problem of recovering a $k$-sparse signal $\mathbf{x}^\star \in \mathbb{R}^n$ from a set of quadratic Gaussian measurements corrupted by…
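Written out, the measurement model usually meant by this phrase is

$$y_i = (\mathbf{a}_i^\top \mathbf{x}^\star)^2 + \xi_i, \qquad \mathbf{a}_i \sim \mathcal{N}(0, I_n), \quad i = 1, \dots, m,$$

with $\xi_i$ the noise; the precise noise assumption is the part truncated in the snippet above.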
Linear Convergence for Natural Policy Gradient with Log-linear Policy Parametrization
We analyze the convergence rate of the unregularized natural policy gradient algorithm with log-linear policy parametrizations in infinite-horizon discounted Markov decision processes. In the deterministic case, when the Q-value is known a…
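For reference, with a log-linear policy $\pi_\theta(a \mid s) \propto \exp(\theta^\top \phi(s, a))$, one common form of this update (sometimes called Q-NPG) fits the Q-values with the features and moves along the fitted weights:

$$\theta_{k+1} = \theta_k + \eta\, w_k, \qquad w_k \in \operatorname*{arg\,min}_{w} \; \mathbb{E}_{(s,a) \sim d_k} \Big[ \big( w^\top \phi(s, a) - Q^{\pi_{\theta_k}}(s, a) \big)^2 \Big],$$

where $d_k$ is a state-action distribution induced by the current policy; whether this matches the paper's exact variant is an assumption.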
Exponential Tail Local Rademacher Complexity Risk Bounds Without the Bernstein Condition
The local Rademacher complexity framework is one of the most successful general-purpose toolboxes for establishing sharp excess risk bounds for statistical estimators based on the framework of empirical risk minimization. Applying this too…
Time-independent Generalization Bounds for SGLD in Non-convex Settings
We establish generalization error bounds for stochastic gradient Langevin dynamics (SGLD) with constant learning rate under the assumptions of dissipativity and smoothness, a setting that has received increased attention in the sampling/op…
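The algorithm itself is simple to state; the following sketch shows the constant-step-size SGLD update analysed in this line of work, with the inverse temperature, batching, and gradient oracle as illustrative assumptions.

import numpy as np

def sgld(grad_loss, theta0, data, eta=1e-3, beta=1e3, epochs=10, batch=32, rng=None):
    """Stochastic gradient Langevin dynamics with a constant learning rate:
    theta <- theta - eta * g + sqrt(2 * eta / beta) * N(0, I),
    where g is a minibatch gradient and beta is the inverse temperature.
    A minimal sketch; grad_loss(theta, samples) is a user-supplied minibatch
    gradient and data is an indexable array of samples."""
    rng = rng or np.random.default_rng(0)
    theta = theta0.copy()
    n = len(data)
    for _ in range(epochs):
        for _ in range(n // batch):
            idx = rng.choice(n, size=batch, replace=False)
            g = grad_loss(theta, data[idx])
            # gradient step plus injected Gaussian noise
            theta = theta - eta * g + np.sqrt(2.0 * eta / beta) * rng.standard_normal(theta.shape)
    return theta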
On Optimal Interpolation In Linear Regression
Understanding when and why interpolating methods generalize well has recently been a topic of interest in statistical learning theory. However, systematically connecting interpolating methods to achievable notions of optimality has only re…
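The canonical interpolating method in this setting is the minimum-$\ell_2$-norm interpolator, given in the underdetermined case ($d > n$, $X$ full row rank) by

$$\widehat{w} = \operatorname*{arg\,min}_{w} \big\{ \|w\|_2 : Xw = y \big\} = X^\top (X X^\top)^{-1} y.$$

This particular estimator is only a standard example; which notions of optimality interpolating methods can achieve is the question the paper takes up.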
Dimension-Free Rates for Natural Policy Gradient in Multi-Agent Reinforcement Learning
Cooperative multi-agent reinforcement learning is a decentralized paradigm in sequential decision making where agents distributed over a network iteratively collaborate with neighbors to maximize global (network-wide) notions of rewards. E…
Comparing Classes of Estimators: When does Gradient Descent Beat Ridge Regression in Linear Models?
Methods for learning from data depend on various types of tuning parameters, such as penalization strength or step size. Since performance can depend strongly on these parameters, it is important to compare classes of estimators, by conside…
Implicit Regularization in Matrix Sensing via Mirror Descent
We study discrete-time mirror descent applied to the unregularized empirical risk in matrix sensing. In both the general case of rectangular matrices and the particular case of positive semidefinite matrices, a simple potential-based analy…
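Concretely, the iteration in question is the generic discrete-time mirror descent update applied to the matrix sensing risk,

$$X_{k+1} = \nabla \psi^{*}\big( \nabla \psi(X_k) - \eta\, \nabla L(X_k) \big), \qquad L(X) = \frac{1}{2m} \sum_{i=1}^m \big( \langle A_i, X \rangle - y_i \big)^2,$$

where $\psi$ is a potential over matrices and $\nabla \psi^{*}$ inverts $\nabla \psi$; the specific potential driving the implicit regularisation is the paper's choice and is not reproduced here.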
A Continuous-Time Mirror Descent Approach to Sparse Phase Retrieval
We analyze continuous-time mirror descent applied to sparse phase retrieval, which is the problem of recovering sparse signals from a set of magnitude-only measurements. We apply mirror descent to the unconstrained empirical risk minimizat…
Decentralised Learning with Random Features and Distributed Gradient Descent
We investigate the generalisation performance of Distributed Gradient Descent with Implicit Regularisation and Random Features in the homogeneous setting where a network of agents are given data sampled independently from the same unknown d…
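A minimal sketch of the setup, assuming shared random Fourier features, a doubly stochastic mixing matrix W over the network, and squared loss; all of these specifics (feature map, step size, gossip scheme) are illustrative rather than the paper's exact protocol.

import numpy as np

def distributed_gd_rff(Xs, ys, W, n_features=100, sigma=1.0, eta=0.1, iters=200, rng=None):
    """Distributed gradient descent with random Fourier features.
    Agent i holds data (Xs[i], ys[i]); W is an (m, m) doubly stochastic
    mixing matrix. Each round, agents average parameters with neighbours
    (a gossip step), then take a local gradient step on their squared loss."""
    rng = rng or np.random.default_rng(0)
    d = Xs[0].shape[1]
    # shared random features approximating an RBF kernel of bandwidth sigma
    Omega = rng.standard_normal((d, n_features)) / sigma
    b = rng.uniform(0, 2 * np.pi, n_features)
    phi = lambda X: np.sqrt(2.0 / n_features) * np.cos(X @ Omega + b)
    Phis = [phi(X) for X in Xs]
    m = len(Xs)
    thetas = np.zeros((m, n_features))
    for _ in range(iters):
        mixed = W @ thetas  # gossip: average parameters with neighbours
        grads = np.stack([
            P.T @ (P @ mixed[i] - ys[i]) / len(ys[i]) for i, P in enumerate(Phis)
        ])
        thetas = mixed - eta * grads  # local gradient step
    return thetas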
Hadamard Wirtinger Flow for Sparse Phase Retrieval
We consider the problem of reconstructing an $n$-dimensional $k$-sparse signal from a set of noiseless magnitude-only measurements. Formulating the problem as an unregularized empirical risk minimization task, we study the sample complexit…
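A sketch of the general idea, assuming the Hadamard-type parametrisation $x = u \odot u - v \odot v$ and plain gradient descent from a small initialisation; the parametrisation, initialisation scale, and step size here are illustrative assumptions rather than the paper's exact algorithm.

import numpy as np

def hadamard_flow(A, y, eta=0.1, alpha=1e-6, iters=2000):
    """Gradient descent on the unregularised phase retrieval risk
    f(x) = (1/4m) * sum_i ((a_i' x)^2 - y_i)^2 under the elementwise
    parametrisation x = u*u - v*v; the small initialisation alpha is
    what induces an implicit bias towards sparse solutions."""
    m, n = A.shape
    u = np.full(n, alpha)
    v = np.full(n, alpha)
    for _ in range(iters):
        x = u * u - v * v
        Ax = A @ x
        gx = A.T @ ((Ax ** 2 - y) * Ax) / m  # gradient of f with respect to x
        u -= eta * 2.0 * gx * u              # chain rule: dx/du = 2u
        v += eta * 2.0 * gx * v              # chain rule: dx/dv = -2v
    return u * u - v * v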