Joel Veness
Partition Tree Weighting for Non-Stationary Stochastic Bandits
This paper considers a generalisation of universal source coding for interaction data, namely data streams that have actions interleaved with observations. Our goal is to construct a coding distribution that is both universal an…
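The coding distribution here builds on Partition Tree Weighting, a Bayesian mixture over binary temporal partitions of a sequence, with each segment modeled independently by a base model. Below is a minimal, non-incremental sketch under stated assumptions: it requires len(x) <= 2**depth, and the Krichevsky-Trofimov estimator stands in for whatever base model the bandit setting actually calls for.

import math

def ptw_log_prob(x, depth, base_log_prob):
    # Partition Tree Weighting: mixes "no change point at this node"
    # against "split the time axis in half and model each part
    # independently", recursively. Assumes len(x) <= 2**depth.
    if depth == 0 or len(x) <= 1:
        return base_log_prob(x)
    half = 2 ** (depth - 1)
    left, right = x[:half], x[half:]
    split = ptw_log_prob(left, depth - 1, base_log_prob)
    if right:
        split += ptw_log_prob(right, depth - 1, base_log_prob)
    stay = base_log_prob(x)
    m = max(stay, split)  # log-sum-exp of the two hypotheses, weight 1/2 each
    return m + math.log(0.5 * math.exp(stay - m) + 0.5 * math.exp(split - m))

def kt_log_prob(bits):
    # Krichevsky-Trofimov estimator: a simple stationary base model
    # for binary data, used here purely as an example.
    lp, ones, zeros = 0.0, 0.5, 0.5
    for b in bits:
        lp += math.log((ones if b else zeros) / (ones + zeros))
        if b: ones += 1
        else: zeros += 1
    return lp

For example, ptw_log_prob([1]*8 + [0]*8, 4, kt_log_prob) assigns high probability to a sequence with an abrupt change point halfway through, which a single stationary KT model would penalise heavily.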
Compression via Pre-trained Transformers: A Study on Byte-Level Multimodal Data
Foundation models are strong data compressors, but when accounting for their parameter size, their compression ratios are inferior to standard compression algorithms. Naively reducing the parameter count does not necessarily help as it det…
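The headline comparison rests on charging a model's parameters against its compressed output. A one-line sketch of that adjusted ratio (the function and argument names are mine, not the paper's):

def adjusted_compression_ratio(compressed_bytes, model_bytes, raw_bytes):
    # Total footprint (compressed output plus parameters) over raw size;
    # smaller is better. A large model can win on compressed_bytes alone
    # yet lose badly once model_bytes is included.
    return (compressed_bytes + model_bytes) / raw_bytes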
Learning Universal Predictors
Meta-learning has emerged as a powerful approach to train neural networks to learn new tasks quickly from limited data. Broad exposure to different tasks leads to versatile representations enabling general problem solving. But, what are th…
Language Modeling Is Compression
It has long been established that predictive models can be transformed into lossless compressors and vice versa. Incidentally, in recent years, the machine learning community has focused on training increasingly large and powerful self-sup…
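The predictor-compressor equivalence the abstract opens with is concrete: an arithmetic coder driven by a model's conditional probabilities spends about -log2 p(x_t | x_<t) bits per symbol, so better prediction is better compression. A sketch of that ideal code length (the predict callable is a stand-in for any sequential model):

import math

def ideal_code_length_bits(seq, predict):
    # Bits an arithmetic coder would use when driven by the model:
    # -sum_t log2 p(x_t | x_<t).
    total = 0.0
    for t, symbol in enumerate(seq):
        p = predict(seq[:t])[symbol]  # conditional distribution over symbols
        total -= math.log2(p)
    return total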
Randomized Positional Encodings Boost Length Generalization of Transformers
Anian Ruoss, Grégoire Delétang, Tim Genewein, Jordi Grau-Moya, Róbert Csordás, Mehdi Bennani, Shane Legg, Joel Veness. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). 2023.
Transformers have impressive generalization capabilities on tasks with a fixed context length. However, they fail to generalize to sequences of arbitrary length, even for seemingly simple tasks such as duplicating a string. Moreover, simpl…
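The mechanism is simple: at training time, replace positions 1..N with an ordered random subset drawn from a much larger position range, so the position magnitudes seen in training already cover those needed on longer test sequences. A sketch, where max_position is an assumed hyperparameter:

import random

def randomized_position_ids(seq_len, max_position=2048, rng=random):
    # Sample seq_len distinct positions from a range far larger than any
    # training sequence and keep them sorted: relative order is preserved
    # while absolute values span the long-sequence regime.
    return sorted(rng.sample(range(max_position), seq_len))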
Memory-Based Meta-Learning on Non-Stationary Distributions
Memory-based meta-learning is a technique for approximating Bayes-optimal predictors. Under fairly general conditions, minimizing sequential prediction error, measured by the log loss, leads to implicit meta-learning. The goal of this work…
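The "implicit meta-learning" claim refers to the sequential log-loss objective: averaged over tasks, its minimizer is the Bayes-optimal mixture predictor. A sketch of that objective, with an assumed model.predict interface returning a distribution over symbols:

import math

def meta_log_loss(model, task_sequences):
    # Average sequential log loss over sequences drawn from different
    # tasks; driving this down forces a memory-based model to infer the
    # task on the fly, i.e. to meta-learn.
    total, count = 0.0, 0
    for seq in task_sequences:
        for t, symbol in enumerate(seq):
            total -= math.log(model.predict(seq[:t])[symbol])
            count += 1
    return total / count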
Beyond Bayes-optimality: meta-learning what you know you don't know
Meta-training agents with memory has been shown to culminate in Bayes-optimal agents, which casts Bayes-optimality as the implicit solution to a numerical optimization problem rather than an explicit modeling assumption. Bayes-optimal agen…
Shaking the foundations: delusions in sequence models for interaction and control
The recent phenomenal success of language models has reinvigorated machine learning research, and large sequence models such as transformers are being applied to a variety of domains. One important problem class that has remained relativel…
Reinforcement Learning with Information-Theoretic Actuation
Reinforcement Learning formalises an embodied agent's interaction with the environment through observations, rewards and actions. But where do the actions come from? Actions are often considered to represent something external, such as the…
Investigating Contingency Awareness Using Atari 2600 Games
Contingency awareness is the recognition that some aspects of a future observation are under an agent's control while others are solely determined by the environment. This paper explores the idea of contingency awareness in reinforcement l…
Gated Linear Networks
This paper presents a new family of backpropagation-free neural architectures, Gated Linear Networks (GLNs). What distinguishes GLNs from contemporary neural networks is the distributed and local nature of their credit assignment mechanism…
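A single GLN neuron makes the "distributed and local credit assignment" concrete: side information gates which weight vector geometrically mixes the incoming probabilities, and each neuron runs its own online logistic-loss update with no backpropagated signal. A minimal sketch (the context function that maps side information to ctx is omitted here):

import numpy as np

def logit(p):
    return np.log(p / (1.0 - p))

class GLNNeuron:
    def __init__(self, n_inputs, n_contexts, lr=0.1):
        self.w = np.zeros((n_contexts, n_inputs))  # one mixer per context
        self.lr = lr

    def predict(self, p_in, ctx):
        # Geometric mixing: sigmoid of a context-selected weighted sum of
        # the input probabilities' logits.
        z = self.w[ctx] @ logit(p_in)
        return 1.0 / (1.0 + np.exp(-z))

    def update(self, p_in, ctx, target):
        # Purely local online gradient step on the log loss; no backprop.
        p = self.predict(p_in, ctx)
        self.w[ctx] -= self.lr * (p - target) * logit(p_in)
        return p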
A rapid and efficient learning rule for biological neural circuits
The dominant view in neuroscience is that changes in synaptic weights underlie learning. It is unclear, however, how the brain is able to determine which synapses should change, and by how much. This uncertainty stands in sharp contrast to…
A Combinatorial Perspective on Transfer Learning
Human intelligence is characterized not only by the capacity to learn complex skills, but the ability to rapidly adapt and acquire new skills within an ever-changing environment. In this work we study how the learning of modular solutions …
Gaussian Gated Linear Networks
We propose the Gaussian Gated Linear Network (G-GLN), an extension to the recently proposed GLN family of deep neural networks. Instead of using backpropagation to learn features, GLNs have a distributed and local credit assignment mechani…
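In the Gaussian case, geometric mixing has a closed form: a weighted product of Gaussian densities is itself Gaussian, with precisions adding. A sketch of that mixing step:

import numpy as np

def gaussian_geometric_mix(mus, variances, weights):
    # Product of Gaussian experts raised to non-negative weights: the
    # result's precision is the weighted sum of the input precisions, and
    # its mean is the precision-weighted average of the input means.
    precision = np.sum(weights / variances)
    mu = np.sum(weights * mus / variances) / precision
    return mu, 1.0 / precision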
Online Learning in Contextual Bandits using Gated Linear Networks
We introduce a new and completely online contextual bandit algorithm called Gated Linear Contextual Bandits (GLCB). This algorithm is based on Gated Linear Networks (GLNs), a recently introduced deep learning architecture with properties w…
Meta-learning of Sequential Strategies
In this report we review memory-based meta-learning as a tool for building sample-efficient strategies that learn from past experience to adapt to any task within a target class. Our goal is to equip the reader with the conceptual foundati…
Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents (Extended Abstract)
The Arcade Learning Environment (ALE) is an evaluation platform that poses the challenge of building AI agents with general competency across dozens of Atari 2600 games. It supports a variety of different problem settings and it has been r…
Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents
The Arcade Learning Environment (ALE) is an evaluation platform that poses the challenge of building AI agents with general competency across dozens of Atari 2600 games. It supports a variety of different problem settings and it has been r…
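For reference, a minimal evaluation episode in the style the paper's protocols address, written against the ale-py bindings; the API calls are from memory and may differ across ALE versions, and the ROM path is a placeholder:

from ale_py import ALEInterface
import random

ale = ALEInterface()
ale.loadROM("pong.bin")              # path to an Atari 2600 ROM (placeholder)
actions = ale.getLegalActionSet()

# One evaluation episode under a uniformly random policy.
total_reward = 0.0
while not ale.game_over():
    total_reward += ale.act(random.choice(actions))
ale.reset_game()
print(total_reward)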
Reply to Huszár: The elastic weight consolidation penalty is empirically valid
In our recent work on elastic weight consolidation (EWC) (1) we show that forgetting in neural networks can be alleviated by using a quadratic penalty whose derivation was inspired by Bayesian evidence accumulation. In his letter (2), Dr. …
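The penalty at issue is the standard EWC term: each weight is anchored to its task-A value with a strength proportional to its diagonal Fisher information. A sketch:

def ewc_penalty(theta, theta_star, fisher, lam):
    # Quadratic penalty anchoring parameters that were important for the
    # old task (high Fisher information) to their previously learned
    # values while a new task is trained.
    return 0.5 * lam * sum(f * (t - ts) ** 2
                           for t, ts, f in zip(theta, theta_star, fisher))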
Online Learning with Gated Linear Networks
This paper describes a family of probabilistic architectures designed for online learning under the logarithmic loss. Rather than relying on non-linear transfer functions, our method gains representational power by the use of data conditio…
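The data conditioning the abstract refers to is gating. In the GLN papers this is done with halfspace contexts: the side information's position relative to a small fixed set of hyperplanes selects the active weight vector. A sketch:

import numpy as np

def halfspace_context(z, hyperplanes):
    # Map side information z to a context id via the sign pattern of a
    # fixed set of hyperplanes; each id selects its own weight vector.
    bits = (hyperplanes @ z > 0).astype(int)
    return int("".join(map(str, bits)), 2)

With, say, hyperplanes = np.random.randn(4, len(z)), this yields 16 contexts, giving the network piecewise-linear representational power without any non-linear transfer function.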
Overcoming catastrophic forgetting in neural networks
Significance: Deep neural networks are currently the most successful machine-learning technique for solving a variety of tasks, including language translation, image classification, and image generation. One weakness of such models is that,…
The ability to learn tasks in a sequential fashion is crucial to the development of artificial intelligence. Neural networks are not, in general, capable of this and it has been widely thought that catastrophic forgetting is an inevitable …
Compress and Control
This paper describes a new information-theoretic policy evaluation technique for reinforcement learning. This technique converts any compression or density model into a corresponding estimate of value. Under appropriate stationarity and er…
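The conversion the abstract describes can be phrased directly: given a learned joint density over (return, state), which a compressor supplies via the usual code-length/probability correspondence, the value is the posterior mean of the return. A sketch over an assumed discretized return set:

import math

def value_estimate(state, return_values, log_joint):
    # E[return | state] from a density model log p(g, s): posterior
    # weights p(g | s) are proportional to p(g, s).
    weights = [math.exp(log_joint(g, state)) for g in return_values]
    z = sum(weights)
    return sum(g * w for g, w in zip(return_values, weights)) / z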