Maxime Chevalier-Boisvert
Evaluating YJIT’s Performance in a Production Context: A Pragmatic Approach
Ruby is a dynamically-typed programming language with a large breadth of features which has grown in popularity with the rise of the modern web, and remains at the core of the implementation of widely-used online platforms such as Shopify,…
Minigrid & Miniworld: Modular & Customizable Reinforcement Learning Environments for Goal-Oriented Tasks
We present the Minigrid and Miniworld libraries which provide a suite of goal-oriented 2D and 3D environments. The libraries were explicitly created with a minimalistic design paradigm to allow users to rapidly develop new environments for…
Proceedings of the 3rd Wordplay: When Language Meets Games Workshop (Wordplay 2022)
Since the dawn of the digital age, interactive virtual environments and electronic games have played a huge role in shaping our lives. Not only are they a source of entertainment but they also teach us important life skills such as strategi…
YJIT: a basic block versioning JIT compiler for CRuby
Ruby is a dynamically typed programming language with a large breadth of features which has grown in popularity with the rise of the modern web, and remains at the core of the implementation of many widely-used websites.
Combating False Negatives in Adversarial Imitation Learning
In adversarial imitation learning, a discriminator is trained to differentiate agent episodes from expert demonstrations representing the desired behavior. However, as the trained policy learns to be more successful, the negative examples …
DeepDrummer: Generating Drum Loops using Deep Learning and a Human in the Loop
DeepDrummer is a drum loop generation tool that uses active learning to learn the preferences (or current artistic intentions) of a human user from a small number of interactions. The principal goal of this tool is to enable an efficient e…
BabyAI 1.1
The BabyAI platform is designed to measure the sample efficiency of training an agent to follow grounded-language instructions. BabyAI 1.0 presents baseline results of an agent trained by deep imitation or reinforcement learning. BabyAI 1.…
Combating False Negatives in Adversarial Imitation Learning (Student Abstract)
We define the False Negatives problem and show that it is a significant limitation in adversarial imitation learning. We propose a method that solves the problem by leveraging the nature of goal-conditioned tasks. The method, dubbed Fake C…
Options of Interest: Temporal Abstraction with Interest Functions
Temporal abstraction refers to the ability of an agent to use behaviours of controllers which act for a limited, variable amount of time. The options framework describes such behaviours as consisting of a subset of states in which they can…
Automated curriculum generation for Policy Gradients from Demonstrations
In this paper, we present a technique that improves the process of training an agent (using RL) for instruction following. We develop a training curriculum that uses a nominal number of expert demonstrations and trains the agent in a manne…
Option-Critic in Cooperative Multi-agent Systems
In this paper, we investigate learning temporal abstractions in cooperative multi-agent systems, using the options framework (Sutton et al., 1999). First, we address the planning problem for the decentralized POMDP represented by the multi-…
Robo-PlaNet: Learning to Poke in a Day
Recently, the Deep Planning Network (PlaNet) approach was introduced as a model-based reinforcement learning method that learns environment dynamics directly from pixel observations. This architecture is useful for learning tasks in which …
BabyAI: First Steps Towards Grounded Language Learning With a Human In the Loop.
Allowing humans to interactively train artificial agents to understand language instructions is desirable for both practical and scientific reasons, but given the poor data efficiency of the current learning methods, this goal may require …
BabyAI: A Platform to Study the Sample Efficiency of Grounded Language Learning
Allowing humans to interactively train artificial agents to understand language instructions is desirable for both practical and scientific reasons, but given the poor data efficiency of the current learning methods, this goal may require …
Interprocedural Type Specialization of JavaScript Programs Without Type Analysis
Previous work proposed lazy basic block versioning, a technique for just-in-time compilation of dynamic languages which we believe represents an interesting point in the design space. Basic block versioning is simple to implement, simple e…