Timothy Mann
The impact of a focused behavioral intervention on brain cannabinoid signaling and interoceptive function: Implications for mood and anxiety
The Wim Hof method (WHM) is a behavioral intervention technique that consists of deep breathing exercises, cold exposure and meditation. In light of the crucial role of the cannabinoid system in modulating neurotransmitter release through …
MuZero with Self-competition for Rate Control in VP9 Video Compression
Video streaming usage has seen a significant rise as entertainment, education, and business increasingly rely on online video. Optimizing video compression has the potential to increase access and quality of content to users, and reduce en…
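The self-competition mechanism lends itself to a compact sketch: rather than hand-tuning a reward against a fixed target, the agent is rewarded for beating a running baseline of its own past scores. A minimal sketch; the EMA baseline and the win/loss reward below are illustrative assumptions, not the paper's exact formulation.

    # Hypothetical sketch of a self-competition reward signal.
    # The agent "wins" (+1) when its episode score beats a running
    # baseline of its own historical scores, and "loses" (-1) otherwise.
    class SelfCompetitionReward:
        def __init__(self, decay=0.99):
            self.baseline = None   # EMA of past episode scores
            self.decay = decay

        def __call__(self, episode_score):
            if self.baseline is None:
                self.baseline = episode_score
            reward = 1.0 if episode_score > self.baseline else -1.0
            self.baseline = (self.decay * self.baseline
                             + (1 - self.decay) * episode_score)
            return reward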
Data Augmentation Can Improve Robustness
Adversarial training suffers from robust overfitting, a phenomenon where the robust test accuracy starts to decrease during training. In this paper, we focus on reducing robust overfitting by using common data augmentation schemes. We demo…
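One pairing this line of work emphasizes is combining augmentation with model weight averaging to damp robust overfitting. A minimal PyTorch sketch of an exponential moving average of weights; the decay value is illustrative.

    import copy
    import torch

    def make_ema(model):
        # Frozen copy whose weights will track an exponential moving
        # average of the trained model's weights.
        ema = copy.deepcopy(model)
        for p in ema.parameters():
            p.requires_grad_(False)
        return ema

    @torch.no_grad()
    def update_ema(ema, model, decay=0.995):
        for pe, pm in zip(ema.parameters(), model.parameters()):
            pe.mul_(decay).add_(pm, alpha=1 - decay)

    # After each optimizer step on augmented adversarial batches, call
    # update_ema(ema, model) and evaluate robustness with the EMA weights.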
Improving Robustness using Generated Data
Recent work argues that robust training requires substantially larger datasets than those required for standard classification. On CIFAR-10 and CIFAR-100, this translates into a sizable robust-accuracy gap between models trained solely on …
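The abstract is truncated here, but the gap is closed by supplementing the original training set with samples from a generative model. A hedged sketch of per-batch mixing; the 70% generated fraction is an illustrative assumption.

    import numpy as np

    def mixed_batch(original, generated, batch_size, generated_fraction=0.7):
        # Draw a fixed fraction of each batch from the generated data and
        # the remainder from the original training set.
        n_gen = int(batch_size * generated_fraction)
        idx_gen = np.random.randint(len(generated), size=n_gen)
        idx_org = np.random.randint(len(original), size=batch_size - n_gen)
        return np.concatenate([generated[idx_gen], original[idx_org]])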
Defending Against Image Corruptions Through Adversarial Augmentations
Modern neural networks excel at image classification, yet they remain vulnerable to common image corruptions such as blur, speckle noise or fog. Recent methods that focus on this problem, such as AugMix and DeepAugment, introduce defenses …
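A cheap stand-in for adversarially chosen augmentations is worst-of-K selection over a fixed corruption set. The sketch below illustrates that simpler variant, not the paper's learned augmentation model.

    def worst_of_k_loss(model, x, y, corruptions, loss_fn):
        # Evaluate each candidate corruption and train on the one the
        # model currently finds hardest, a cheap proxy for adversarially
        # chosen augmentations.
        worst = None
        for corrupt in corruptions:
            loss = loss_fn(model(corrupt(x)), y)
            if worst is None or loss > worst:
                worst = loss
        return worst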
Fixing Data Augmentation to Improve Adversarial Robustness
Adversarial training suffers from robust overfitting, a phenomenon where the robust test accuracy starts to decrease during training. In this paper, we focus on both heuristics-driven and data-driven augmentations as a means to reduce robu…
Robust Constrained Reinforcement Learning for Continuous Control with Model Misspecification
Many real-world physical control systems are required to satisfy constraints upon deployment. Furthermore, real-world systems are often subject to effects such as non-stationarity, wear-and-tear, uncalibrated sensors and so on. Such effect…
Balancing Constraints and Rewards with Meta-Gradient D4PG
Deploying Reinforcement Learning (RL) agents to solve real-world applications often requires satisfying complex system constraints. Often the constraint thresholds are incorrectly set due to the complex nature of a system or the inability …
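For reference, the non-meta baseline in this setting is the standard Lagrangian-relaxation update on the constraint multiplier, sketched below; the update rule and learning rate are the textbook version, not the paper's meta-gradient procedure.

    def lagrange_update(lmbda, avg_cost, threshold, lr=1e-3):
        # Standard Lagrangian-relaxation step: raise the penalty when the
        # constraint is violated, relax it otherwise, keeping lmbda >= 0.
        return max(0.0, lmbda + lr * (avg_cost - threshold))

    # The penalized reward the agent maximizes is then
    #   r_penalized = r - lmbda * c
    # for per-step reward r and constraint cost c.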
Uncovering the Limits of Adversarial Training against Norm-Bounded Adversarial Examples
Adversarial training and its variants have become de facto standards for learning robust deep neural networks. In this paper, we explore the landscape around adversarial training in a bid to uncover its limits. We systematically study the …
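The adversarial training being probed is the usual min-max procedure: an inner PGD maximization crafts a norm-bounded perturbation, then the outer step descends on the adversarial loss. A minimal PyTorch sketch (input-range clamping omitted for brevity):

    import torch
    import torch.nn.functional as F

    def adversarial_training_step(model, opt, x, y, eps, alpha, steps):
        # Inner maximization: find an L-inf bounded perturbation by PGD.
        delta = torch.zeros_like(x).uniform_(-eps, eps).requires_grad_(True)
        for _ in range(steps):
            loss = F.cross_entropy(model(x + delta), y)
            grad, = torch.autograd.grad(loss, delta)
            delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
            delta = delta.detach().requires_grad_(True)
        # Outer minimization: descend on the adversarial loss.
        opt.zero_grad()
        F.cross_entropy(model(x + delta.detach()), y).backward()
        opt.step()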
The NodeHopper: Enabling Low Latency Ranking with Constraints via a Fast Dual Solver
Modern recommender systems need to deal with multiple objectives like balancing user engagement with recommending diverse and fresh content. An appealing way to optimally trade these off is by imposing constraints on the ranking according …
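The generic dual relaxation behind constrained ranking is easy to sketch: penalize each item's score by a multiplier times its cost, then search the multiplier until the slate fits the budget. The bisection below illustrates the idea, not the NodeHopper solver itself.

    import numpy as np

    def rank_with_budget(reward, cost, k, budget, iters=50):
        # Bisect on the Lagrange multiplier until the top-k items by
        # penalized score satisfy the total cost budget.
        lo, hi = 0.0, 1e3
        for _ in range(iters):
            lmbda = 0.5 * (lo + hi)
            top = np.argsort(reward - lmbda * cost)[::-1][:k]
            if cost[top].sum() > budget:
                lo = lmbda   # constraint violated: penalize cost more
            else:
                hi = lmbda
        return np.argsort(reward - hi * cost)[::-1][:k]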
Non-Stationary Delayed Bandits with Intermediate Observations
Online recommender systems often face long delays in receiving feedback, especially when optimizing for some long-term metrics. While mitigating the effects of delays in learning is well-understood in stationary environments, the problem b…
Achieving Robustness in the Wild via Adversarial Mixing With Disentangled Representations
Recent research has made the surprising finding that state-of-the-art deep learning models sometimes fail to generalize to small variations of the input. Adversarial training has been shown to be an effective approach to overcome this prob…
Robust Reinforcement Learning for Continuous Control with Model Misspecification
We provide a framework for incorporating robustness -- to perturbations in the transition dynamics which we refer to as model misspecification -- into continuous control Reinforcement Learning (RL) algorithms. We specifically focus on inco…
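In the tabular case, the robust backup is simply a minimum over an uncertainty set of transition models inside the usual Bellman maximization. A minimal numpy sketch over a finite candidate set (the paper's continuous-control machinery is not shown):

    import numpy as np

    def robust_value_iteration(P_set, R, gamma=0.9, iters=200):
        # Worst-case (robust) Bellman backup: back up each action under
        # the least favorable transition model in the uncertainty set.
        # P_set: list of (S, A, S) kernels; R: (S, A) reward table.
        S, A = R.shape
        V = np.zeros(S)
        for _ in range(iters):
            # Q[k, s, a]: value of (s, a) under candidate model k.
            Q = np.stack([R + gamma * (P @ V) for P in P_set])
            V = Q.min(axis=0).max(axis=1)  # min over models, max over actions
        return V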
An Alternative Surrogate Loss for PGD-based Adversarial Testing
Adversarial testing methods based on Projected Gradient Descent (PGD) are widely used for searching norm-bounded perturbations that cause the inputs of neural networks to be misclassified. This paper takes a deeper look at these methods an…
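A common alternative surrogate in this setting is a margin (logit-difference) loss in place of cross-entropy. The sketch below shows an untargeted margin variant; the paper's specific multi-targeted scheme is not reproduced here.

    import torch

    def pgd_attack(model, x, y, eps, alpha, steps):
        # PGD maximizing the margin (best wrong logit minus true logit)
        # instead of cross-entropy. Input-range clamping omitted.
        delta = torch.zeros_like(x).uniform_(-eps, eps).requires_grad_(True)
        for _ in range(steps):
            logits = model(x + delta)
            true = logits.gather(1, y[:, None]).squeeze(1)
            wrong = logits.scatter(1, y[:, None], float('-inf')).max(1).values
            margin = (wrong - true).sum()
            grad, = torch.autograd.grad(margin, delta)
            delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
            delta = delta.detach().requires_grad_(True)
        return (x + delta).detach()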
A Dual Approach to Verify and Train Deep Networks
This paper addresses the problem of formally verifying desirable properties of neural networks, i.e., obtaining provable guarantees that neural networks satisfy specifications relating their inputs and outputs (e.g., robustness to bounded …
Active Roll-outs in MDP with Irreversible Dynamics
In Reinforcement Learning (RL), regret guarantees scaling with the square root of the time horizon have been shown to hold only for communicating Markov decision processes (MDPs) where any two states are connected. This essentially means t…
Adaptive Temporal-Difference Learning for Policy Evaluation with Per-State Uncertainty Estimates
We consider the core reinforcement-learning problem of on-policy value function approximation from a batch of trajectory data, and focus on various issues of Temporal Difference (TD) learning and Monte Carlo (MC) policy evaluation. The two…
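The two estimators being traded off are standard. A tabular sketch of each (the paper's per-state adaptive switching between them is not shown):

    import numpy as np

    def td0_evaluation(trajectories, n_states, alpha=0.1, gamma=0.99):
        # Tabular TD(0): bootstrap each state's value from the next state.
        V = np.zeros(n_states)
        for traj in trajectories:               # traj: [(s, r, s_next), ...]
            for s, r, s_next in traj:
                V[s] += alpha * (r + gamma * V[s_next] - V[s])
        return V

    def mc_evaluation(trajectories, n_states, gamma=0.99):
        # Monte Carlo: average full discounted returns, no bootstrapping.
        returns, counts = np.zeros(n_states), np.zeros(n_states)
        for traj in trajectories:
            G = 0.0
            for s, r, _ in reversed(traj):
                G = r + gamma * G
                returns[s] += G
                counts[s] += 1
        return returns / np.maximum(counts, 1)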
A Bayesian Approach to Robust Reinforcement Learning
Robust Markov Decision Processes (RMDPs) intend to ensure robustness with respect to changing or adversarial system behavior. In this framework, transitions are modeled as arbitrary elements of a known and properly structured uncertainty s…
On the Effectiveness of Interval Bound Propagation for Training Verifiably Robust Models
Recent work has shown that it is possible to train deep neural networks that are provably robust to norm-bounded adversarial perturbations. Most of these methods are based on minimizing an upper bound on the worst-case loss over all possib…
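The bound propagation itself takes only a few lines: carry an elementwise box [l, u] through each layer, using the midpoint/radius form for affine maps and monotonicity for ReLU. A numpy sketch:

    import numpy as np

    def ibp_affine(l, u, W, b):
        # Propagate a box through x -> W @ x + b: the midpoint maps
        # exactly, and |W| bounds how much the radius can grow.
        mid, rad = (u + l) / 2, (u - l) / 2
        mid2 = W @ mid + b
        rad2 = np.abs(W) @ rad
        return mid2 - rad2, mid2 + rad2

    def ibp_relu(l, u):
        # ReLU is monotone, so the bounds map through directly.
        return np.maximum(l, 0), np.maximum(u, 0)

    # Chaining these layer by layer yields sound output bounds; training
    # against the worst-case logits implied by the final box is the IBP
    # training objective.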
Learning from Delayed Outcomes via Proxies with Applications to Recommender Systems
Predicting delayed outcomes is an important problem in recommender systems (e.g., if customers will finish reading an ebook). We formalize the problem as an adversarial, delayed online learning problem and consider how a proxy for the dela…
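The setup admits a simple illustration: act on the proxy signal immediately, and learn a correction once the delayed outcome finally arrives. The scalar correction model below is a deliberately toy assumption, not the paper's algorithm.

    class ProxyOutcomePredictor:
        # Toy sketch: a running estimate of how the delayed true outcome
        # relates to the immediately observable proxy.
        def __init__(self, lr=0.05):
            self.scale = 1.0   # learned proxy -> outcome correction
            self.lr = lr

        def predict(self, proxy):
            return self.scale * proxy

        def update(self, proxy, delayed_outcome):
            # Called only once the true outcome arrives.
            err = delayed_outcome - self.predict(proxy)
            self.scale += self.lr * err * proxy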
Temporal Difference Learning with Neural Networks - Study of the Leakage Propagation Problem
Temporal-Difference learning (TD) [Sutton, 1988] with function approximation can converge to solutions that are worse than those obtained by Monte-Carlo regression, even in the simple case of on-policy evaluation. To increase our understan…
Learning Robust Options
Robust reinforcement learning aims to produce policies that have strong guarantees even in the face of environments/transition models whose parameters have strong uncertainty. Existing work uses value-based methods and the usual primitive …
A Dual Approach to Scalable Verification of Deep Networks
This paper addresses the problem of formally verifying desirable properties of neural networks, i.e., obtaining provable guarantees that neural networks satisfy specifications relating their inputs and outputs (robustness to bounded norm a…
Soft-Robust Actor-Critic Policy-Gradient
Robust Reinforcement Learning aims to derive optimal behavior that accounts for model uncertainty in dynamical systems. However, previous studies have shown that by considering the worst case scenario, robust policies can be overly conserv…
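The contrast with the worst-case criterion fits in a few lines; the weights over candidate models below are illustrative.

    import numpy as np

    def worst_case_value(values):
        # Classic robust criterion: min over candidate models.
        return np.min(values)

    def soft_robust_value(values, weights):
        # Soft-robust criterion: average over a distribution on models,
        # trading worst-case conservatism for expected performance.
        return np.dot(weights, values)

    # e.g. a policy's value under three candidate dynamics models:
    # worst_case_value([5.0, 9.0, 10.0])                    -> 5.0
    # soft_robust_value([5.0, 9.0, 10.0], [0.2, 0.5, 0.3])  -> 8.5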
Optimizing Slate Recommendations via Slate-CVAE
The slate recommendation problem aims to find the ordering of a subset of documents to be presented on a surface that we call a slate. The definition of a slate changes depending on the underlying application, but a typical goal is to maximize user enga…
Beyond Greedy Ranking: Slate Optimization via List-CVAE
The conventional solution to the recommendation problem greedily ranks individual document candidates by prediction scores. However, this method fails to optimize the slate as a whole, and hence, often struggles to capture biases caused by…
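A minimal conditional VAE over a flat slate embedding captures the idea: condition generation on the user response you want, and decode the whole slate at once instead of ranking items greedily. The dimensions and architecture below are illustrative assumptions, not the paper's exact design.

    import torch
    import torch.nn as nn

    class SlateCVAE(nn.Module):
        def __init__(self, slate_dim, cond_dim, latent_dim=16, hidden=64):
            super().__init__()
            self.enc = nn.Sequential(
                nn.Linear(slate_dim + cond_dim, hidden), nn.ReLU())
            self.mu = nn.Linear(hidden, latent_dim)
            self.logvar = nn.Linear(hidden, latent_dim)
            self.dec = nn.Sequential(
                nn.Linear(latent_dim + cond_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, slate_dim))

        def forward(self, slate, cond):
            # Encode the observed slate together with its user response,
            # sample a latent, and decode conditioned on the response.
            h = self.enc(torch.cat([slate, cond], dim=-1))
            mu, logvar = self.mu(h), self.logvar(h)
            z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
            recon = self.dec(torch.cat([z, cond], dim=-1))
            return recon, mu, logvar

    # Train with reconstruction + KL; at serving time, sample z from the
    # prior, condition on the *ideal* response, and decode a whole slate
    # in one shot rather than greedily ranking individual items.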