Patrick Rebeschini
Best-of-Both Worlds for linear contextual bandits with paid observations
We study the problem of linear contextual bandits with paid observations, where at each round the learner selects an action in order to minimize its loss in a given context, and can then decide to pay a fixed cost to observe the loss of an…
Implicit Regularisation in Diffusion Models: An Algorithm-Dependent Generalisation Analysis
The success of denoising diffusion models raises important questions regarding their generalisation behaviour, particularly in high-dimensional settings. Notably, it has been shown that when training and sampling are performed perfectly, t…
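For context, the training objective that such generalisation analyses typically concern is the standard denoising (DDPM-style) objective, written here in its usual generic form rather than as this paper's specific setup:

$$\min_\theta \; \mathbb{E}_{t,\, x_0,\, \epsilon \sim \mathcal{N}(0, I)} \Big[ \big\| \epsilon_\theta\big(\sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\, \epsilon,\; t\big) - \epsilon \big\|^2 \Big],$$

where $\bar{\alpha}_t$ is the noise schedule and $\epsilon_\theta$ the denoising network.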
On the necessity of adaptive regularisation: Optimal anytime online learning on $\boldsymbol{\ell_p}$-balls
We study online convex optimization on $\ell_p$-balls in $\mathbb{R}^d$ for $p > 2$. While always sub-linear, the optimal regret exhibits a shift between the high-dimensional setting ($d > T$), when the dimension $d$ is greater than the ti…
Non-stationary Bandit Convex Optimization: A Comprehensive Study
Bandit Convex Optimization is a fundamental class of sequential decision-making problems, where the learner selects actions from a continuous domain and observes a loss (but not its gradient) at only one point per round. We study this prob…
Early-Stopped Mirror Descent for Linear Regression over Convex Bodies
Early-stopped iterative optimization methods are widely used as alternatives to explicit regularization, and direct comparisons between early-stopping and explicit regularization have been established for many optimization geometries. Howe…
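As a rough illustration of the kind of procedure studied here, the following sketch runs mirror descent on a least-squares objective with an $\ell_q$-norm potential and stops when the training residual first reaches the noise level; the potential, step size, and stopping rule are illustrative assumptions, not the paper's exact setup.

import numpy as np

def md_linear_regression(X, y, q=1.5, eta=0.01, noise_level=1.0, max_iter=5000):
    """Mirror descent on least squares with potential psi(w) = (1/q) * sum |w_i|^q,
    stopped early when the training residual drops to the noise level.
    A minimal sketch; q, eta and the stopping rule are illustrative choices."""
    n, d = X.shape
    theta = np.zeros(d)  # dual (mirror) iterate, theta = grad psi(w)
    w = np.zeros(d)      # primal iterate
    for _ in range(max_iter):
        grad = X.T @ (X @ w - y) / n  # gradient of the empirical risk
        theta = theta - eta * grad    # gradient step in the dual space
        # invert grad psi: w_i = sign(theta_i) * |theta_i|^(1/(q-1))
        w = np.sign(theta) * np.abs(theta) ** (1.0 / (q - 1.0))
        if np.linalg.norm(X @ w - y) / np.sqrt(n) <= noise_level:
            break  # early stopping in place of explicit regularization
    return w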
Uniform mean estimation for monotonic processes
We consider the problem of deriving uniform confidence bands for the mean of a monotonic stochastic process, such as the cumulative distribution function (CDF) of a random variable, based on a sequence of i.i.d. observations. Our approach …
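For comparison, the classical Dvoretzky-Kiefer-Wolfowitz inequality gives a uniform band in the CDF case: with probability at least $1 - \delta$,

$$\sup_{x} \big| \widehat{F}_n(x) - F(x) \big| \le \sqrt{\frac{\log(2/\delta)}{2n}},$$

where $\widehat{F}_n$ is the empirical CDF of $n$ i.i.d. observations; the bands described in this abstract target the more general monotonic-process setting.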
Black-Box Uniform Stability for Non-Euclidean Empirical Risk Minimization
We study first-order algorithms that are uniformly stable for empirical risk minimization (ERM) problems that are convex and smooth with respect to $p$-norms, $p \geq 1$. We propose a black-box reduction method that, by employing propertie…
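The notion in play is standard: an algorithm $A$ is $\varepsilon$-uniformly stable if, for all datasets $S, S'$ differing in a single sample and all test points $z$,

$$\big| \ell(A(S), z) - \ell(A(S'), z) \big| \le \varepsilon.$$

This is the textbook definition rather than anything specific to the reduction proposed here.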
Meta-Learning Objectives for Preference Optimization
Evaluating preference optimization (PO) algorithms on LLM alignment is a challenging task that presents prohibitive costs, noise, and several variables like model size and hyper-parameters. In this work, we show that it is possible to gain…
Robust Gradient Descent for Phase Retrieval
Recent progress in robust statistical learning has mainly tackled convex problems, like mean estimation or linear regression, with non-convex challenges receiving less attention. Phase retrieval exemplifies such a non-convex problem, requi…
Differentiable Cost-Parameterized Monge Map Estimators
Within the field of optimal transport (OT), the choice of ground cost is crucial to ensuring that the optimality of a transport map corresponds to usefulness in real-world applications. It is therefore desirable to use known information to…
Learning mirror maps in policy mirror descent
Policy Mirror Descent (PMD) is a popular framework in reinforcement learning, serving as a unifying perspective that encompasses numerous algorithms. These algorithms are derived through the selection of a mirror map and enjoy finite-time …
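To make the framework concrete, here is a minimal tabular sketch of PMD with the negative-entropy mirror map, for which the update reduces to a multiplicative-weights step $\pi_{k+1}(a|s) \propto \pi_k(a|s) \exp(\eta Q_k(s,a))$; the exact-evaluation loop and the step size are illustrative choices, not the learned mirror maps this paper studies.

import numpy as np

def policy_evaluation(P, r, pi, gamma):
    """Exact Q^pi for a tabular MDP: P has shape (S, A, S), r has shape (S, A)."""
    S, A, _ = P.shape
    P_pi = np.einsum('sap,sa->sp', P, pi)   # transition matrix under pi
    r_pi = np.sum(pi * r, axis=1)           # expected reward under pi
    V = np.linalg.solve(np.eye(S) - gamma * P_pi, r_pi)
    return r + gamma * np.einsum('sap,p->sa', P, V)  # Q(s, a)

def pmd_kl(P, r, gamma=0.9, eta=1.0, iters=100):
    """Policy mirror descent with the negative-entropy mirror map.
    A minimal sketch with exact evaluation; eta is an illustrative choice."""
    S, A, _ = P.shape
    pi = np.full((S, A), 1.0 / A)           # start from the uniform policy
    for _ in range(iters):
        Q = policy_evaluation(P, r, pi, gamma)
        pi = pi * np.exp(eta * Q)           # multiplicative-weights update
        pi /= pi.sum(axis=1, keepdims=True) # renormalize per state
    return pi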
Generalization Bounds for Label Noise Stochastic Gradient Descent
We develop generalization error bounds for stochastic gradient descent (SGD) with label noise in non-convex settings under uniform dissipativity and smoothness conditions. Under a suitable choice of semimetric, we establish a contraction i…
Sample-Efficiency in Multi-Batch Reinforcement Learning: The Need for Dimension-Dependent Adaptivity
We theoretically explore the relationship between sample-efficiency and adaptivity in reinforcement learning. An algorithm is sample-efficient if it uses a number of queries $n$ to the environment that is polynomial in the dimension $d$ of…
The Statistical Complexity of Early-Stopped Mirror Descent
Recently there has been a surge of interest in understanding implicit regularization properties of iterative gradient-based optimization algorithms. In this paper, we study the statistical guarantees on the excess risk achieved by early-st…
Optimal Convergence Rate for Exact Policy Mirror Descent in Discounted Markov Decision Processes
Policy Mirror Descent (PMD) is a general family of algorithms that covers a wide range of novel and fundamental methods in reinforcement learning. Motivated by the instability of policy iteration (PI) with inexact policy evaluation, PMD al…
A Novel Framework for Policy Mirror Descent with General Parameterization and Linear Convergence
Modern policy optimization methods in reinforcement learning, such as TRPO and PPO, owe their success to the use of parameterized policies. However, while theoretical guarantees have been established for this class of algorithms, especiall…
Nearly Minimax-Optimal Rates for Noisy Sparse Phase Retrieval via Early-Stopped Mirror Descent
This paper studies early-stopped mirror descent applied to noisy sparse phase retrieval, which is the problem of recovering a $k$-sparse signal $\mathbf{x}^\star \in \mathbb{R}^n$ from a set of quadratic Gaussian measurements corrupted by…
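Written out, the measurement model usually meant by this phrase is

$$y_i = (\mathbf{a}_i^\top \mathbf{x}^\star)^2 + \xi_i, \qquad \mathbf{a}_i \sim \mathcal{N}(0, I_n), \quad i = 1, \dots, m,$$

with $\xi_i$ the noise; the precise noise assumption is the part truncated in the snippet above.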
Linear Convergence for Natural Policy Gradient with Log-linear Policy Parametrization
We analyze the convergence rate of the unregularized natural policy gradient algorithm with log-linear policy parametrizations in infinite-horizon discounted Markov decision processes. In the deterministic case, when the Q-value is known a…
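For reference, with a log-linear policy $\pi_\theta(a \mid s) \propto \exp(\theta^\top \phi(s, a))$, one common form of this update (sometimes called Q-NPG) fits the Q-values with the features and moves along the fitted weights:

$$\theta_{k+1} = \theta_k + \eta\, w_k, \qquad w_k \in \operatorname*{arg\,min}_{w} \; \mathbb{E}_{(s,a) \sim d_k} \Big[ \big( w^\top \phi(s, a) - Q^{\pi_{\theta_k}}(s, a) \big)^2 \Big],$$

where $d_k$ is a state-action distribution induced by the current policy; whether this matches the paper's exact variant is an assumption.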
Exponential Tail Local Rademacher Complexity Risk Bounds Without the Bernstein Condition
The local Rademacher complexity framework is one of the most successful general-purpose toolboxes for establishing sharp excess risk bounds for statistical estimators based on the framework of empirical risk minimization. Applying this too…
Time-independent Generalization Bounds for SGLD in Non-convex Settings
We establish generalization error bounds for stochastic gradient Langevin dynamics (SGLD) with constant learning rate under the assumptions of dissipativity and smoothness, a setting that has received increased attention in the sampling/op…
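The algorithm itself is simple to state; the following sketch shows the constant-step-size SGLD update analysed in this line of work, with the inverse temperature, batching, and gradient oracle as illustrative assumptions.

import numpy as np

def sgld(grad_loss, theta0, data, eta=1e-3, beta=1e3, epochs=10, batch=32, rng=None):
    """Stochastic gradient Langevin dynamics with a constant learning rate:
    theta <- theta - eta * g + sqrt(2 * eta / beta) * N(0, I),
    where g is a minibatch gradient and beta is the inverse temperature.
    A minimal sketch; grad_loss(theta, samples) is a user-supplied minibatch
    gradient and data is an indexable array of samples."""
    rng = rng or np.random.default_rng(0)
    theta = theta0.copy()
    n = len(data)
    for _ in range(epochs):
        for _ in range(n // batch):
            idx = rng.choice(n, size=batch, replace=False)
            g = grad_loss(theta, data[idx])
            # gradient step plus injected Gaussian noise
            theta = theta - eta * g + np.sqrt(2.0 * eta / beta) * rng.standard_normal(theta.shape)
    return theta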
On Optimal Interpolation In Linear Regression
Understanding when and why interpolating methods generalize well has recently been a topic of interest in statistical learning theory. However, systematically connecting interpolating methods to achievable notions of optimality has only re…
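The canonical interpolating method in this setting is the minimum-$\ell_2$-norm interpolator, given in the underdetermined case ($d > n$, $X$ full row rank) by

$$\widehat{w} = \operatorname*{arg\,min}_{w} \big\{ \|w\|_2 : Xw = y \big\} = X^\top (X X^\top)^{-1} y.$$

This particular estimator is only a standard example; which notions of optimality interpolating methods can achieve is the question the paper takes up.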
Dimension-Free Rates for Natural Policy Gradient in Multi-Agent Reinforcement Learning
Cooperative multi-agent reinforcement learning is a decentralized paradigm in sequential decision making where agents distributed over a network iteratively collaborate with neighbors to maximize global (network-wide) notions of rewards. E…
Comparing Classes of Estimators: When does Gradient Descent Beat Ridge Regression in Linear Models?
Methods for learning from data depend on various types of tuning parameters, such as penalization strength or step size. Since performance can depend strongly on these parameters, it is important to compare classes of estimators, by conside…
Implicit Regularization in Matrix Sensing via Mirror Descent
We study discrete-time mirror descent applied to the unregularized empirical risk in matrix sensing. In both the general case of rectangular matrices and the particular case of positive semidefinite matrices, a simple potential-based analy…
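Concretely, the iteration in question is the generic discrete-time mirror descent update applied to the matrix sensing risk,

$$X_{k+1} = \nabla \psi^{*}\big( \nabla \psi(X_k) - \eta\, \nabla L(X_k) \big), \qquad L(X) = \frac{1}{2m} \sum_{i=1}^m \big( \langle A_i, X \rangle - y_i \big)^2,$$

where $\psi$ is a potential over matrices and $\nabla \psi^{*}$ inverts $\nabla \psi$; the specific potential driving the implicit regularisation is the paper's choice and is not reproduced here.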
A Continuous-Time Mirror Descent Approach to Sparse Phase Retrieval
We analyze continuous-time mirror descent applied to sparse phase retrieval, which is the problem of recovering sparse signals from a set of magnitude-only measurements. We apply mirror descent to the unconstrained empirical risk minimizat…
Decentralised Learning with Random Features and Distributed Gradient Descent
We investigate the generalisation performance of Distributed Gradient Descent with Implicit Regularisation and Random Features in the homogeneous setting where a network of agents are given data sampled independently from the same unknown d…
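A minimal sketch of the setup, assuming shared random Fourier features, a doubly stochastic mixing matrix W over the network, and squared loss; all of these specifics (feature map, step size, gossip scheme) are illustrative rather than the paper's exact protocol.

import numpy as np

def distributed_gd_rff(Xs, ys, W, n_features=100, sigma=1.0, eta=0.1, iters=200, rng=None):
    """Distributed gradient descent with random Fourier features.
    Agent i holds data (Xs[i], ys[i]); W is an (m, m) doubly stochastic
    mixing matrix. Each round, agents average parameters with neighbours
    (a gossip step), then take a local gradient step on their squared loss."""
    rng = rng or np.random.default_rng(0)
    d = Xs[0].shape[1]
    # shared random features approximating an RBF kernel of bandwidth sigma
    Omega = rng.standard_normal((d, n_features)) / sigma
    b = rng.uniform(0, 2 * np.pi, n_features)
    phi = lambda X: np.sqrt(2.0 / n_features) * np.cos(X @ Omega + b)
    Phis = [phi(X) for X in Xs]
    m = len(Xs)
    thetas = np.zeros((m, n_features))
    for _ in range(iters):
        mixed = W @ thetas  # gossip: average parameters with neighbours
        grads = np.stack([
            P.T @ (P @ mixed[i] - ys[i]) / len(ys[i]) for i, P in enumerate(Phis)
        ])
        thetas = mixed - eta * grads  # local gradient step
    return thetas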
Hadamard Wirtinger Flow for Sparse Phase Retrieval
We consider the problem of reconstructing an $n$-dimensional $k$-sparse signal from a set of noiseless magnitude-only measurements. Formulating the problem as an unregularized empirical risk minimization task, we study the sample complexit…
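A sketch of the general idea, assuming the Hadamard-type parametrisation $x = u \odot u - v \odot v$ and plain gradient descent from a small initialisation; the parametrisation, initialisation scale, and step size here are illustrative assumptions rather than the paper's exact algorithm.

import numpy as np

def hadamard_flow(A, y, eta=0.1, alpha=1e-6, iters=2000):
    """Gradient descent on the unregularised phase retrieval risk
    f(x) = (1/4m) * sum_i ((a_i' x)^2 - y_i)^2 under the elementwise
    parametrisation x = u*u - v*v; the small initialisation alpha is
    what induces an implicit bias towards sparse solutions."""
    m, n = A.shape
    u = np.full(n, alpha)
    v = np.full(n, alpha)
    for _ in range(iters):
        x = u * u - v * v
        Ax = A @ x
        gx = A.T @ ((Ax ** 2 - y) * Ax) / m  # gradient of f with respect to x
        u -= eta * 2.0 * gx * u              # chain rule: dx/du = 2u
        v += eta * 2.0 * gx * v              # chain rule: dx/dv = -2v
    return u * u - v * v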