Dorian Baudry
Fast Asymptotically Optimal Algorithms for Non-Parametric Stochastic Bandits
A General Recipe for the Analysis of Randomized Multi-Armed Bandit Algorithms
In this paper we propose a general methodology to derive regret bounds for randomized multi-armed bandit algorithms. It consists in checking a set of sufficient conditions on the sampling probability of each arm and on the family of distri…
Towards an efficient and risk aware strategy for guiding farmers in identifying best crop management
Identification of best performing fertilizer practices among a set of contrasting practices with field trials is challenging as crop losses are costly for farmers. To identify best management practices, an "intuitive strategy" would be t…
Top Two Algorithms Revisited
Top Two algorithms arose as an adaptation of Thompson sampling to best arm identification in multi-armed bandit models (Russo, 2016), for parametric families of arms. They select the next arm to sample from by randomizing among two candida…
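As an illustration of the leader/challenger mechanism sketched in this abstract, here is a minimal Top Two sampling step, assuming Bernoulli arms with Beta posteriors and a fixed randomization parameter beta; the toy model and all names are illustrative assumptions, not the paper's exact setting.

    import numpy as np

    def top_two_ts_step(successes, failures, beta=0.5, rng=None):
        """One sampling round of a Top Two Thompson Sampling rule (sketch).

        successes / failures: per-arm Beta posterior counts for Bernoulli rewards.
        beta: probability of playing the current leader rather than a challenger.
        """
        rng = rng or np.random.default_rng()
        successes = np.asarray(successes, dtype=float)
        failures = np.asarray(failures, dtype=float)
        # Leader: the arm whose posterior sample currently looks best.
        theta = rng.beta(successes + 1.0, failures + 1.0)
        leader = int(np.argmax(theta))
        if rng.random() < beta:
            return leader
        # Challenger: re-sample the posteriors until a different arm looks best.
        while True:
            theta = rng.beta(successes + 1.0, failures + 1.0)
            challenger = int(np.argmax(theta))
            if challenger != leader:
                return challenger

    # Example: 3 Bernoulli arms after a few observations.
    arm = top_two_ts_step(successes=[5, 7, 2], failures=[5, 3, 8], beta=0.5)
    print("next arm to sample:", arm)

The sketch only covers the sampling rule; in best arm identification the stopping and recommendation rules matter just as much.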
Efficient Algorithms for Extreme Bandits
In this paper, we contribute to the Extreme Bandit problem, a variant of Multi-Armed Bandits in which the learner seeks to collect the largest possible reward. We first study the concentration of the maximum of i.i.d. random variables under…
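For context on the objective (not on the algorithms proposed in the paper): in extreme bandits the learner is judged on the expected maximum reward it collects rather than the sum, so a heavy-tailed arm can dominate an arm with the same mean. A small, purely illustrative simulation:

    import numpy as np

    rng = np.random.default_rng(0)
    T, runs = 1000, 200

    # Two illustrative arms with the same mean (2.0) but very different tails.
    exp_max = [rng.exponential(2.0, T).max() for _ in range(runs)]  # light tail
    par_max = [rng.pareto(1.5, T).max() for _ in range(runs)]       # heavy tail, mean 1/(1.5-1) = 2

    # Extreme-bandit criterion: the expected MAXIMUM reward collected, not the sum.
    print("E[max reward], light-tailed arm:", np.mean(exp_max))
    print("E[max reward], heavy-tailed arm:", np.mean(par_max))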
From Optimality to Robustness: Dirichlet Sampling Strategies in Stochastic Bandits
The stochastic multi-armed bandit problem has been extensively studied under standard assumptions on the arms' distributions (e.g. bounded with known support, exponential family, etc.). These assumptions are suitable for many real-world problem…
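One way to build a distribution-free randomized index in the spirit of Dirichlet sampling, assuming bounded rewards with a known upper bound B, is to draw Dirichlet weights over an arm's observed rewards augmented with B. The sketch below is an illustrative assumption, not the exact variants analyzed in the paper:

    import numpy as np

    def dirichlet_index(rewards, upper_bound, rng=None):
        """Randomized non-parametric index for one arm: a Dirichlet-weighted
        average of the observed rewards, augmented with the known upper bound
        so that under-sampled arms keep a chance of looking optimistic.
        (Illustrative sketch only; the paper studies refined variants.)
        """
        rng = rng or np.random.default_rng()
        support = np.append(np.asarray(rewards, dtype=float), upper_bound)
        weights = rng.dirichlet(np.ones(len(support)))
        return float(weights @ support)

    # One round of an index policy built on this sketch: play the arm with
    # the largest randomized index.
    rng = np.random.default_rng(0)
    histories = [[0.3, 0.5, 0.4], [0.9], [0.1, 0.2]]
    indices = [dirichlet_index(h, upper_bound=1.0, rng=rng) for h in histories]
    print("play arm", int(np.argmax(indices)))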
Optimal Thompson Sampling strategies for support-aware CVaR bandits
In this paper we study a multi-armed bandit problem in which the quality of each arm is measured by the Conditional Value at Risk (CVaR) at some level α of the reward distribution. While existing works in this setting mainly focus on Upper C…
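For reference, a common empirical estimator of the CVaR of a reward distribution at level α averages the worst α-fraction of the observed samples (α = 1 recovers the mean). This estimator choice is an illustrative assumption, not necessarily the one analyzed in the paper:

    import numpy as np

    def empirical_cvar(samples, alpha):
        """Empirical CVaR at level alpha for REWARDS (higher is better):
        the average of the worst ceil(alpha * n) observations, so alpha = 1
        gives the plain mean and small alpha focuses on the lower tail.
        """
        x = np.sort(np.asarray(samples, dtype=float))  # ascending: worst first
        k = int(np.ceil(alpha * len(x)))
        return float(x[:k].mean())

    rewards = np.array([0.9, 0.8, 0.1, 0.7, 0.05, 0.95])
    print(empirical_cvar(rewards, alpha=0.5))  # mean of the 3 worst rewards
    print(empirical_cvar(rewards, alpha=1.0))  # plain empirical mean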
On Limited-Memory Subsampling Strategies for Bandits
There has been a recent surge of interest in nonparametric bandit algorithms based on subsampling. One drawback however of these approaches is the additional complexity required by random subsampling and the storage of the full history of …
Thompson Sampling for CVaR Bandits.
Risk awareness is an important feature in formulating a variety of real-world problems. In this paper we study a multi-armed bandit problem in which the quality of each arm is measured by the Conditional Value at Risk (CVaR) at some level α…
Sub-sampling for Efficient Non-Parametric Bandit Exploration
In this paper we propose the first multi-armed bandit algorithm based on re-sampling that achieves asymptotically optimal regret simultaneously for different families of arms (namely Bernoulli, Gaussian and Poisson distributions). Unlike T…
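A minimal sketch of the sub-sampling "duel" idea underlying such re-sampling algorithms: the leader (here, the most-pulled arm) is compared to each challenger on a random sub-sample of its history of the challenger's size, and every challenger that wins its duel gets pulled. The exact sub-sampling scheme, tie-breaking and forced exploration of the actual algorithms are assumptions left out here:

    import numpy as np

    def subsample_duel(leader_history, challenger_history, rng=None):
        """Return True if the challenger wins its duel against the leader.

        The leader's history is sub-sampled (without replacement) down to the
        challenger's sample size so both arms are compared on equal amounts
        of data; the challenger wins if its mean is at least as large.
        """
        rng = rng or np.random.default_rng()
        n = len(challenger_history)
        sub = rng.choice(leader_history, size=n, replace=False)
        return np.mean(challenger_history) >= np.mean(sub)

    # One round: the leader is the most-pulled arm; every challenger that wins
    # its duel gets pulled, otherwise the leader is pulled again.
    rng = np.random.default_rng(0)
    histories = [list(rng.normal(0.5, 1, 40)),
                 list(rng.normal(0.4, 1, 6)),
                 list(rng.normal(0.6, 1, 5))]
    leader = int(np.argmax([len(h) for h in histories]))
    winners = [a for a, h in enumerate(histories)
               if a != leader and subsample_duel(histories[leader], h, rng)]
    to_pull = winners if winners else [leader]
    print("arms to pull this round:", to_pull)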