Alexandre Galashov
Closed-Form Last Layer Optimization
Neural networks are typically optimized with variants of stochastic gradient descent. Under a squared loss, however, the optimal solution to the linear last layer weights is known in closed-form. We propose to leverage this during optimiza…
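As a rough illustration of the closed-form solution mentioned in the abstract above (a minimal sketch under a squared loss with a ridge term, not the paper's implementation; `Phi`, `Y`, and `ridge` are illustrative names):

```python
import numpy as np

def closed_form_last_layer(Phi, Y, ridge=1e-4):
    """Given features Phi (N x D) produced by the network body and targets Y (N x K),
    the linear last layer minimizing the ridge-regularized squared loss is
    W* = (Phi^T Phi + ridge * I)^{-1} Phi^T Y."""
    D = Phi.shape[1]
    A = Phi.T @ Phi + ridge * np.eye(D)
    return np.linalg.solve(A, Phi.T @ Y)

# Hypothetical usage with random features and targets.
rng = np.random.default_rng(0)
Phi = rng.normal(size=(128, 16))
Y = rng.normal(size=(128, 3))
W = closed_form_last_layer(Phi, Y)   # shape (16, 3)
```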
Learn to Guide Your Diffusion Model
Classifier-free guidance (CFG) is a widely used technique for improving the perceptual quality of samples from conditional diffusion models. It operates by linearly combining conditional and unconditional score estimates using a guidance w…
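The linear combination referred to in the abstract above is commonly written as below; the parameterization of the guidance weight varies across papers, so this is a generic sketch rather than the specific scheme proposed in the article:

```python
def cfg_score(score_cond, score_uncond, w):
    """Generic classifier-free guidance: the unconditional score plus w times the
    (conditional - unconditional) difference. w = 0 recovers the unconditional
    model, w = 1 the conditional one, and w > 1 amplifies the conditioning signal."""
    return score_uncond + w * (score_cond - score_uncond)
```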
Distributional Diffusion Models with Scoring Rules
Diffusion models generate high-quality synthetic data. They operate by defining a continuous-time forward process which gradually adds Gaussian noise to data until fully corrupted. The corresponding reverse process progressively "denoises"…
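For reference, the Gaussian forward process described above can be sampled in closed form at any noise level; this is a standard variance-preserving discretization, not necessarily the exact construction used in the article:

```python
import numpy as np

def forward_noising(x0, alpha_bar_t, rng):
    """Sample x_t ~ N(sqrt(alpha_bar_t) * x0, (1 - alpha_bar_t) * I), the marginal
    of a Gaussian forward process at cumulative signal level alpha_bar_t in (0, 1]."""
    eps = rng.normal(size=x0.shape)
    x_t = np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * eps
    return x_t, eps
```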
Accelerated Diffusion Models via Speculative Sampling
Speculative sampling is a popular technique for accelerating inference in Large Language Models by generating candidate tokens using a fast draft model and accepting or rejecting them based on the target model's distribution. While specula…
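The accept/reject step mentioned above follows the standard speculative sampling rule, sketched here for a single drafted token over a discrete vocabulary (function and variable names are illustrative, not from the article):

```python
import numpy as np

def accept_draft_token(p_target, p_draft, token, rng):
    """Accept the drafted token with probability min(1, p_target/p_draft); on
    rejection, resample from the normalized residual max(p_target - p_draft, 0),
    which preserves the target distribution exactly."""
    accept_prob = min(1.0, p_target[token] / p_draft[token])
    if rng.uniform() < accept_prob:
        return token
    residual = np.maximum(p_target - p_draft, 0.0)
    residual /= residual.sum()
    return rng.choice(len(p_target), p=residual)
```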
Non-Stationary Learning of Neural Networks with Automatic Soft Parameter Reset
Neural networks are traditionally trained under the assumption that data come from a stationary distribution. However, settings which violate this assumption are becoming more popular; examples include supervised learning under distributio…
Deep MMD Gradient Flow without adversarial training
We propose a gradient flow procedure for generative modeling by transporting particles from an initial source distribution to a target distribution, where the gradient field on the particles is given by a noise-adaptive Wasserstein Gradien…
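For intuition about transporting particles along a gradient field toward a target, here is a bare-bones MMD gradient flow step with a fixed Gaussian kernel; the article's noise-adaptive, learned-feature version is considerably more involved, so treat this purely as a sketch:

```python
import numpy as np

def gaussian_kernel_grad(x, y, bandwidth):
    """Gradient w.r.t. x of k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2))."""
    diff = x[:, None, :] - y[None, :, :]                              # (N, M, D)
    k = np.exp(-np.sum(diff ** 2, axis=-1) / (2 * bandwidth ** 2))    # (N, M)
    return -(k[..., None] * diff) / bandwidth ** 2                    # (N, M, D)

def mmd_flow_step(particles, target, step_size=0.1, bandwidth=1.0):
    """One explicit Euler step of an MMD gradient flow: each particle follows the
    negative gradient of the MMD witness function, which pushes it away from the
    other particles and toward the target samples."""
    grad_to_particles = gaussian_kernel_grad(particles, particles, bandwidth).mean(axis=1)
    grad_to_target = gaussian_kernel_grad(particles, target, bandwidth).mean(axis=1)
    witness_grad = grad_to_particles - grad_to_target
    return particles - step_size * witness_grad
```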
Revisiting Dynamic Evaluation: Online Adaptation for Large Language Models
We consider the problem of fine-tuning the parameters of a language model online at test time, also known as dynamic evaluation. While it is generally known that this approach improves the overall predictive performance, especially when co…
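A minimal picture of dynamic evaluation, written as a generic PyTorch-style loop; the model, optimizer, chunking, and update frequency are placeholders rather than the configuration studied in the article:

```python
import torch

def dynamic_evaluation(model, optimizer, token_chunks):
    """For each chunk of test tokens: score it with the current parameters, then
    take a gradient step on that same chunk so later text benefits from the adaptation."""
    total_loss = 0.0
    for chunk in token_chunks:                # chunk: (batch, seq_len) token ids
        inputs, targets = chunk[:, :-1], chunk[:, 1:]
        logits = model(inputs)                # assumed to return (batch, seq_len - 1, vocab)
        loss = torch.nn.functional.cross_entropy(
            logits.reshape(-1, logits.size(-1)), targets.reshape(-1))
        total_loss += loss.item()             # evaluate before adapting
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()                      # online update on the test stream
    return total_loss / len(token_chunks)
```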
Kalman Filter for Online Classification of Non-Stationary Data
In Online Continual Learning (OCL) a learning system receives a stream of data and sequentially performs prediction and training steps. Important challenges in OCL are concerned with automatic adaptation to the particular non-stationary st…
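For intuition only, here is a textbook Kalman filter over a linear readout with a random-walk transition; the article addresses classification and models the non-stationarity explicitly, so this regression-style step is a simplification with illustrative names:

```python
import numpy as np

def kalman_step(mean, cov, phi, y, drift_var=1e-3, obs_var=1.0):
    """One predict/update step for weights w_t ~ N(mean, cov) under a random-walk
    prior w_t = w_{t-1} + noise and a scalar observation y = phi @ w_t + noise."""
    # Predict: the random walk inflates the weight uncertainty over time.
    cov = cov + drift_var * np.eye(len(mean))
    # Update: condition on the new (features, target) pair.
    s = phi @ cov @ phi + obs_var          # innovation variance (scalar)
    k = cov @ phi / s                      # Kalman gain
    mean = mean + k * (y - phi @ mean)
    cov = cov - np.outer(k, phi) @ cov
    return mean, cov
```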
Towards Compute-Optimal Transfer Learning
The field of transfer learning is undergoing a significant shift with the introduction of large pretrained models which have demonstrated strong adaptability to a variety of downstream tasks. However, the high computational and memory requ…
NEVIS'22: A Stream of 100 Tasks Sampled from 30 Years of Computer Vision Research
A shared goal of several machine learning communities like continual learning, meta-learning and transfer learning, is to design algorithms and models that efficiently and robustly adapt to unseen tasks. An even more ambitious goal is to b…
Data augmentation for efficient learning from parametric experts
We present a simple, yet powerful data-augmentation technique to enable data-efficient learning from parametric experts for reinforcement and imitation learning. We focus on what we call the policy cloning setting, in which we use online o…
Game Plan: What AI can do for Football, and What Football can do for AI
The rapid progress in artificial intelligence (AI) and machine learning has opened unprecedented analytics possibilities in various team and individual sports, including baseball, basketball, and tennis. More recently, AI techniques have b…
Behavior Priors for Efficient Reinforcement Learning
As we deploy reinforcement learning agents to solve increasingly challenging problems, methods that allow us to inject prior knowledge about the structure of the world and effective solution strategies become increasingly important. In th…
Learning Dexterous Manipulation from Suboptimal Experts
Learning dexterous manipulation in high-dimensional state-action spaces is an important open challenge with exploration presenting a major bottleneck. Although in many cases the learning process could be guided by demonstrations or other s…
Temporal Difference Uncertainties as a Signal for Exploration
An effective approach to exploration in reinforcement learning is to rely on an agent's uncertainty over the optimal policy, which can yield near-optimal exploration strategies in tabular settings. However, in non-tabular settings that inv…
Importance Weighted Policy Learning and Adaptation
The ability to exploit prior experience to solve novel problems rapidly is a hallmark of biological learning systems and of great practical importance for artificial ones. In the meta reinforcement learning literature much recent work has …
Information Theoretic Meta Learning with Gaussian Processes
We formulate meta learning using information theoretic concepts; namely, mutual information and the information bottleneck. The idea is to learn a stochastic representation or encoding of the task description, given by a training set, that…
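For reference, the generic information bottleneck trade-off that this formulation builds on, written with Z as the stochastic task representation, D the training set defining the task, and Y the test-time prediction target; the article's exact objective may differ from this sketch:

```latex
% Generic information bottleneck objective (a sketch, not the article's exact form):
% learn an encoder q(z \mid D) that keeps Z predictive of the targets Y while
% compressing the task description D.
\max_{q(z \mid D)} \; I(Z; Y) \;-\; \beta \, I(Z; D)
```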
Task Agnostic Continual Learning via Meta Learning
While neural networks are powerful function approximators, they suffer from catastrophic forgetting when the data distribution is not stationary. One particular formalism that studies learning under non-stationary distribution is provided …
Meta reinforcement learning as task inference
Humans achieve efficient learning by relying on prior knowledge about the structure of naturally occurring tasks. There is considerable interest in designing reinforcement learning (RL) algorithms with similar properties. This includes pro…
Information asymmetry in KL-regularized RL
Many real world tasks exhibit rich structure that is repeated across different parts of the state space or in time. In this work we study the possibility of leveraging such repeated structure to speed up and regularize learning. We start f…
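The KL-regularized objective underlying this line of work can be written generically as below; the information asymmetry refers to the default policy π₀ conditioning only on a restricted part of the state, denoted x⁰ here for illustration (notation is a sketch, not copied from the article):

```latex
% KL-regularized RL with a default (prior) policy \pi_0: the agent maximizes reward
% while staying close to \pi_0, which only observes a restricted portion x^0_t of
% the full state x_t.
J(\pi) = \mathbb{E}_{\pi}\!\left[\sum_{t} \gamma^{t}\Big( r(x_t, a_t)
  - \alpha\, \mathrm{KL}\big(\pi(\cdot \mid x_t)\,\|\,\pi_0(\cdot \mid x^0_t)\big)\Big)\right]
```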
Meta-Learning surrogate models for sequential decision making
We introduce a unified probabilistic framework for solving sequential decision making problems ranging from Bayesian optimisation to contextual bandits and reinforcement learning. This is accomplished by a probabilistic model-based approac…
Exploiting Hierarchy for Learning and Transfer in KL-regularized RL
As reinforcement learning agents are tasked with solving more challenging and diverse tasks, the ability to incorporate prior knowledge into the learning system and to exploit reusable structure in solution space is likely to become increa…
Neural probabilistic motor primitives for humanoid control
We focus on the problem of learning a single motor module that can flexibly express a range of behaviors for the control of high-dimensional physically simulated humanoids. To do this, we propose a motor architecture that has the general s…