Alistair Letcher
An Optimisation Framework for Unsupervised Environment Design
For reinforcement learning agents to be deployed in high-risk settings, they must achieve a high level of robustness to unfamiliar scenarios. One method for improving robustness is unsupervised environment design (UED), a suite of methods …
Tight and Efficient Gradient Bounds for Parameterized Quantum Circuits
The training of a parameterized model largely depends on the landscape of the underlying loss function. In particular, vanishing gradients are a central bottleneck in the scalability of variational quantum algorithms (VQAs), and are known …
Adversarial Cheap Talk
Adversarial attacks in reinforcement learning (RL) often assume highly-privileged access to the victim's parameters, environment, or data. Instead, this paper proposes a novel adversarial setting called a Cheap Talk MDP in which an Adversa…
Discovered Policy Optimisation
Tremendous progress has been made in reinforcement learning (RL) over the past decade. Most of these advancements came through the continual development of new algorithms, which were designed using a combination of mathematical derivations…
COLA: Consistent Learning with Opponent-Learning Awareness
Learning in general-sum games is unstable and frequently leads to socially undesirable (Pareto-dominated) outcomes. To mitigate this, Learning with Opponent-Learning Awareness (LOLA) introduced opponent shaping to this setting, by accounti…
Polymatrix Competitive Gradient Descent
Many economic games and machine learning approaches can be cast as competitive optimization problems where multiple agents are minimizing their respective objective function, which depends on all agents' actions. While gradient descent is …
Ridge Rider: Finding Diverse Solutions by Following Eigenvectors of the Hessian
Over the last decade, a single algorithm has changed many facets of our lives - Stochastic Gradient Descent (SGD). In the era of ever decreasing loss functions, SGD and its various offspring have become the go-to optimization tool in machi…
On the Impossibility of Global Convergence in Multi-Loss Optimization
Under mild regularity conditions, gradient-based methods converge globally to a critical point in the single-loss setting. This is known to break down for vanilla gradient descent when moving to multi-loss optimization, but can we hope to …
Differentiable Game Mechanics
Deep learning is built on the foundational guarantee that gradient descent on an objective function converges to local minima. Unfortunately, this guarantee fails in settings, such as generative adversarial nets, that exhibit multiple inte…
Stable Opponent Shaping in Differentiable Games
A growing number of learning methods are actually differentiable games whose players optimise multiple, interdependent objectives in parallel -- from GANs and intrinsic curiosity to multi-agent RL. Opponent shaping is a powerful approach t…
Automatic Conflict Detection in Police Body-Worn Audio
Automatic conflict detection has grown in relevance with the advent of body-worn technology, but existing metrics such as turn-taking and overlap are poor indicators of conflict in police-public interactions. Moreover, standard techniques …