Alexandre Galashov
Closed-Form Last Layer Optimization
Neural networks are typically optimized with variants of stochastic gradient descent. Under a squared loss, however, the optimal solution to the linear last layer weights is known in closed-form. We propose to leverage this during optimiza…
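As a rough illustration of the closed-form solution mentioned in the abstract above (a minimal sketch under a squared loss with a ridge term, not the paper's implementation; `Phi`, `Y`, and `ridge` are illustrative names):

```python
import numpy as np

def closed_form_last_layer(Phi, Y, ridge=1e-4):
    """Given features Phi (N x D) produced by the network body and targets Y (N x K),
    the linear last layer minimizing the ridge-regularized squared loss is
    W* = (Phi^T Phi + ridge * I)^{-1} Phi^T Y."""
    D = Phi.shape[1]
    A = Phi.T @ Phi + ridge * np.eye(D)
    return np.linalg.solve(A, Phi.T @ Y)

# Hypothetical usage with random features and targets.
rng = np.random.default_rng(0)
Phi = rng.normal(size=(128, 16))
Y = rng.normal(size=(128, 3))
W = closed_form_last_layer(Phi, Y)   # shape (16, 3)
```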
Learn to Guide Your Diffusion Model
Classifier-free guidance (CFG) is a widely used technique for improving the perceptual quality of samples from conditional diffusion models. It operates by linearly combining conditional and unconditional score estimates using a guidance w…
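The linear combination referred to in the abstract above is commonly written as below; the parameterization of the guidance weight varies across papers, so this is a generic sketch rather than the specific scheme proposed in the article:

```python
def cfg_score(score_cond, score_uncond, w):
    """Generic classifier-free guidance: the unconditional score plus w times the
    (conditional - unconditional) difference. w = 0 recovers the unconditional
    model, w = 1 the conditional one, and w > 1 amplifies the conditioning signal."""
    return score_uncond + w * (score_cond - score_uncond)
```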
Distributional Diffusion Models with Scoring Rules
Diffusion models generate high-quality synthetic data. They operate by defining a continuous-time forward process which gradually adds Gaussian noise to data until fully corrupted. The corresponding reverse process progressively "denoises"…
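For reference, the Gaussian forward process described above can be sampled in closed form at any noise level; this is a standard variance-preserving discretization, not necessarily the exact construction used in the article:

```python
import numpy as np

def forward_noising(x0, alpha_bar_t, rng):
    """Sample x_t ~ N(sqrt(alpha_bar_t) * x0, (1 - alpha_bar_t) * I), the marginal
    of a Gaussian forward process at cumulative signal level alpha_bar_t in (0, 1]."""
    eps = rng.normal(size=x0.shape)
    x_t = np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * eps
    return x_t, eps
```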
Accelerated Diffusion Models via Speculative Sampling
Speculative sampling is a popular technique for accelerating inference in Large Language Models by generating candidate tokens using a fast draft model and accepting or rejecting them based on the target model's distribution. While specula…
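The accept/reject step mentioned above follows the standard speculative sampling rule, sketched here for a single drafted token over a discrete vocabulary (function and variable names are illustrative, not from the article):

```python
import numpy as np

def accept_draft_token(p_target, p_draft, token, rng):
    """Accept the drafted token with probability min(1, p_target/p_draft); on
    rejection, resample from the normalized residual max(p_target - p_draft, 0),
    which preserves the target distribution exactly."""
    accept_prob = min(1.0, p_target[token] / p_draft[token])
    if rng.uniform() < accept_prob:
        return token
    residual = np.maximum(p_target - p_draft, 0.0)
    residual /= residual.sum()
    return rng.choice(len(p_target), p=residual)
```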
Non-Stationary Learning of Neural Networks with Automatic Soft Parameter Reset
Neural networks are traditionally trained under the assumption that data come from a stationary distribution. However, settings which violate this assumption are becoming more popular; examples include supervised learning under distributio…
Deep MMD Gradient Flow without adversarial training
We propose a gradient flow procedure for generative modeling by transporting particles from an initial source distribution to a target distribution, where the gradient field on the particles is given by a noise-adaptive Wasserstein Gradien…
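For intuition about transporting particles along a gradient field toward a target, here is a bare-bones MMD gradient flow step with a fixed Gaussian kernel; the article's noise-adaptive, learned-feature version is considerably more involved, so treat this purely as a sketch:

```python
import numpy as np

def gaussian_kernel_grad(x, y, bandwidth):
    """Gradient w.r.t. x of k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2))."""
    diff = x[:, None, :] - y[None, :, :]                              # (N, M, D)
    k = np.exp(-np.sum(diff ** 2, axis=-1) / (2 * bandwidth ** 2))    # (N, M)
    return -(k[..., None] * diff) / bandwidth ** 2                    # (N, M, D)

def mmd_flow_step(particles, target, step_size=0.1, bandwidth=1.0):
    """One explicit Euler step of an MMD gradient flow: each particle follows the
    negative gradient of the MMD witness function, which pushes it away from the
    other particles and toward the target samples."""
    grad_to_particles = gaussian_kernel_grad(particles, particles, bandwidth).mean(axis=1)
    grad_to_target = gaussian_kernel_grad(particles, target, bandwidth).mean(axis=1)
    witness_grad = grad_to_particles - grad_to_target
    return particles - step_size * witness_grad
```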
Revisiting Dynamic Evaluation: Online Adaptation for Large Language Models
We consider the problem of fine-tuning the parameters of a language model online at test time, also known as dynamic evaluation. While it is generally known that this approach improves the overall predictive performance, especially when co…
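A minimal picture of dynamic evaluation, written as a generic PyTorch-style loop; the model, optimizer, chunking, and update frequency are placeholders rather than the configuration studied in the article:

```python
import torch

def dynamic_evaluation(model, optimizer, token_chunks):
    """For each chunk of test tokens: score it with the current parameters, then
    take a gradient step on that same chunk so later text benefits from the adaptation."""
    total_loss = 0.0
    for chunk in token_chunks:                # chunk: (batch, seq_len) token ids
        inputs, targets = chunk[:, :-1], chunk[:, 1:]
        logits = model(inputs)                # assumed to return (batch, seq_len - 1, vocab)
        loss = torch.nn.functional.cross_entropy(
            logits.reshape(-1, logits.size(-1)), targets.reshape(-1))
        total_loss += loss.item()             # evaluate before adapting
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()                      # online update on the test stream
    return total_loss / len(token_chunks)
```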
Kalman Filter for Online Classification of Non-Stationary Data
In Online Continual Learning (OCL) a learning system receives a stream of data and sequentially performs prediction and training steps. Important challenges in OCL are concerned with automatic adaptation to the particular non-stationary st…
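For intuition only, here is a textbook Kalman filter over a linear readout with a random-walk transition; the article addresses classification and models the non-stationarity explicitly, so this regression-style step is a simplification with illustrative names:

```python
import numpy as np

def kalman_step(mean, cov, phi, y, drift_var=1e-3, obs_var=1.0):
    """One predict/update step for weights w_t ~ N(mean, cov) under a random-walk
    prior w_t = w_{t-1} + noise and a scalar observation y = phi @ w_t + noise."""
    # Predict: the random walk inflates the weight uncertainty over time.
    cov = cov + drift_var * np.eye(len(mean))
    # Update: condition on the new (features, target) pair.
    s = phi @ cov @ phi + obs_var          # innovation variance (scalar)
    k = cov @ phi / s                      # Kalman gain
    mean = mean + k * (y - phi @ mean)
    cov = cov - np.outer(k, phi) @ cov
    return mean, cov
```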
Towards Compute-Optimal Transfer Learning
The field of transfer learning is undergoing a significant shift with the introduction of large pretrained models which have demonstrated strong adaptability to a variety of downstream tasks. However, the high computational and memory requ…
NEVIS'22: A Stream of 100 Tasks Sampled from 30 Years of Computer Vision Research
A shared goal of several machine learning communities like continual learning, meta-learning and transfer learning, is to design algorithms and models that efficiently and robustly adapt to unseen tasks. An even more ambitious goal is to b…
Data augmentation for efficient learning from parametric experts
We present a simple, yet powerful data-augmentation technique to enable data-efficient learning from parametric experts for reinforcement and imitation learning. We focus on what we call the policy cloning setting, in which we use online o…
Game Plan: What AI can do for Football, and What Football can do for AI
The rapid progress in artificial intelligence (AI) and machine learning has opened unprecedented analytics possibilities in various team and individual sports, including baseball, basketball, and tennis. More recently, AI techniques have b…
Behavior Priors for Efficient Reinforcement Learning
As we deploy reinforcement learning agents to solve increasingly challenging problems, methods that allow us to inject prior knowledge about the structure of the world and effective solution strategies become increasingly important. In th…
Learning Dexterous Manipulation from Suboptimal Experts
Learning dexterous manipulation in high-dimensional state-action spaces is an important open challenge with exploration presenting a major bottleneck. Although in many cases the learning process could be guided by demonstrations or other s…
Temporal Difference Uncertainties as a Signal for Exploration
An effective approach to exploration in reinforcement learning is to rely on an agent's uncertainty over the optimal policy, which can yield near-optimal exploration strategies in tabular settings. However, in non-tabular settings that inv…
Importance Weighted Policy Learning and Adaptation
The ability to exploit prior experience to solve novel problems rapidly is a hallmark of biological learning systems and of great practical importance for artificial ones. In the meta reinforcement learning literature much recent work has …
Information Theoretic Meta Learning with Gaussian Processes
We formulate meta learning using information theoretic concepts; namely, mutual information and the information bottleneck. The idea is to learn a stochastic representation or encoding of the task description, given by a training set, that…
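For reference, the generic information bottleneck trade-off that this formulation builds on, written with Z as the stochastic task representation, D the training set defining the task, and Y the test-time prediction target; the article's exact objective may differ from this sketch:

```latex
% Generic information bottleneck objective (a sketch, not the article's exact form):
% learn an encoder q(z \mid D) that keeps Z predictive of the targets Y while
% compressing the task description D.
\max_{q(z \mid D)} \; I(Z; Y) \;-\; \beta \, I(Z; D)
```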
Task Agnostic Continual Learning via Meta Learning
While neural networks are powerful function approximators, they suffer from catastrophic forgetting when the data distribution is not stationary. One particular formalism that studies learning under non-stationary distribution is provided …
Meta reinforcement learning as task inference
Humans achieve efficient learning by relying on prior knowledge about the structure of naturally occurring tasks. There is considerable interest in designing reinforcement learning (RL) algorithms with similar properties. This includes pro…
Information asymmetry in KL-regularized RL
Many real world tasks exhibit rich structure that is repeated across different parts of the state space or in time. In this work we study the possibility of leveraging such repeated structure to speed up and regularize learning. We start f…
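The KL-regularized objective underlying this line of work can be written generically as below; the information asymmetry refers to the default policy π₀ conditioning only on a restricted part of the state, denoted x⁰ here for illustration (notation is a sketch, not copied from the article):

```latex
% KL-regularized RL with a default (prior) policy \pi_0: the agent maximizes reward
% while staying close to \pi_0, which only observes a restricted portion x^0_t of
% the full state x_t.
J(\pi) = \mathbb{E}_{\pi}\!\left[\sum_{t} \gamma^{t}\Big( r(x_t, a_t)
  - \alpha\, \mathrm{KL}\big(\pi(\cdot \mid x_t)\,\|\,\pi_0(\cdot \mid x^0_t)\big)\Big)\right]
```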
Meta-Learning surrogate models for sequential decision making
We introduce a unified probabilistic framework for solving sequential decision making problems ranging from Bayesian optimisation to contextual bandits and reinforcement learning. This is accomplished by a probabilistic model-based approac…
Exploiting Hierarchy for Learning and Transfer in KL-regularized RL
As reinforcement learning agents are tasked with solving more challenging and diverse tasks, the ability to incorporate prior knowledge into the learning system and to exploit reusable structure in solution space is likely to become increa…
Neural probabilistic motor primitives for humanoid control
We focus on the problem of learning a single motor module that can flexibly express a range of behaviors for the control of high-dimensional physically simulated humanoids. To do this, we propose a motor architecture that has the general s…