Explanipedia

Learning Robot Soccer from Egocentric Vision with Deep Reinforcement Learning Open

Dhruva Tirumala, Markus Wulfmeier, Ben Moran, Sandy H. Huang, Jan Humplik, et al. · 2024

We apply multi-agent deep reinforcement learning (RL) to train end-to-end robot soccer policies with fully onboard computation and sensing via egocentric RGB vision. This setting reflects many challenges of real-world robotics, including a…

Replay across Experiments: A Natural Extension of Off-Policy RL Open

Dhruva Tirumala, Thomas Lampe, José Enrique Chen, Tuomas Haarnoja, Sandy H. Huang, et al. · 2023

Replaying data is a principal mechanism underlying the stability and data efficiency of off-policy reinforcement learning (RL). We present an effective yet simple framework to extend the use of replays across multiple experiments, minimall…

Learning Agile Soccer Skills for a Bipedal Robot with Deep Reinforcement Learning Open

Tuomas Haarnoja, Ben Moran, Guy Lever, Sandy H. Huang, Dhruva Tirumala, et al. · 2023

We investigate whether Deep Reinforcement Learning (Deep RL) is able to synthesize sophisticated and safe movement skills for a low-cost, miniature humanoid robot that can be composed into complex behavioral strategies in dynamic environme…

SkillS: Adaptive Skill Sequencing for Efficient Temporally-Extended Exploration Open

Giulia Vezzani, Dhruva Tirumala, Markus Wulfmeier, Dushyant Rao, Abbas Abdolmaleki, et al. · 2022

The ability to effectively reuse prior knowledge is a key requirement when building general and flexible Reinforcement Learning (RL) agents. Skill reuse is one of the most common approaches, but current methods have considerable limitation…

NeRF2Real: Sim2real Transfer of Vision-guided Bipedal Motion Skills using Neural Radiance Fields Open

Arunkumar Byravan, Jan Humplik, Leonard Hasenclever, Arthur Brussee, Francesco Nori, et al. · 2022

We present a system for applying sim2real approaches to "in the wild" scenes with realistic visuals, and to policies which rely on active perception using RGB cameras. Given a short video of a static scene collected using a generic phone, …

Figure Data for the paper "From Motor Control to Team Play in Simulated Humanoid Football" Open

Siqi Liu, Guy Lever, Zhe Wang, Josh Merel, S. M. Ali Eslami, et al. · 2022

Data Release for Article: From Motor Control to Team Play in Simulated Humanoid Football. This package releases a set of Python notebooks, each reproducing a quantitative figure featured in the research article "Fro…

Forgetting and Imbalance in Robot Lifelong Learning with Off-policy Data Open

Wenxuan Zhou, Steven Bohez, Jan Humplik, Abbas Abdolmaleki, Dushyant Rao, et al. · 2022

Robots will experience non-stationary environment dynamics throughout their lifetime: the robot dynamics can change due to wear and tear, or its surroundings may change over time. Eventually, the robots should perform well in all of the en…

Imitate and Repurpose: Learning Reusable Robot Movement Skills From Human and Animal Behaviors Open

Steven Bohez, Saran Tunyasuvunakool, Philémon Brakel, Fereshteh Sadeghi, Leonard Hasenclever, et al. · 2022

We investigate the use of prior knowledge of human and animal movement to learn reusable locomotion skills for real legged robots. Our approach builds upon previous work on imitating human or dog Motion Capture (MoCap) data to learn a move…

From Motor Control to Team Play in Simulated Humanoid Football Open

Siqi Liu, Guy Lever, Zhe Wang, Josh Merel, S. M. Ali Eslami, et al. · 2021

Intelligent behaviour in the physical world exhibits structure at multiple spatial and temporal scales. Although movements are ultimately executed at the level of instantaneous muscle tensions or joint torques, they must be selected to ser…

Dynamical Distance Learning for Semi-Supervised and Unsupervised Skill Discovery Open

Kristian Hartikainen, Xinyang Geng, Tuomas Haarnoja, Sergey Levine · 2019

Reinforcement learning requires manual specification of a reward function to learn a task. While in principle this reward function only needs to specify the task goal, in practice reinforcement learning can be very time-consuming or even i…

Dynamical Distance Learning for Unsupervised and Semi-Supervised Skill Discovery. Open

Kristian Hartikainen, Xinyang Geng, Tuomas Haarnoja, Sergey Levine · 2019

Learning to Walk Via Deep Reinforcement Learning Open

Tuomas Haarnoja, Sehoon Ha, Aurick Zhou, Jie Tan, George Tucker, et al. · 2019

Deep reinforcement learning (deep RL) holds the promise of automating the acquisition of complex controllers that can map sensory inputs directly to low-level actions. In the domain of robotic locomotion, deep RL could enable learning locom…

Learning to Walk via Deep Reinforcement Learning Open

Tuomas Haarnoja, Sehoon Ha, Aurick Zhou, Jie Tan, George Tucker, et al. · 2018

Deep reinforcement learning (deep RL) holds the promise of automating the acquisition of complex controllers that can map sensory inputs directly to low-level actions. In the domain of robotic locomotion, deep RL could enable learning loco…

Soft Actor-Critic Algorithms and Applications Open

Tuomas Haarnoja, Aurick Zhou, Kristian Hartikainen, George Tucker, Sehoon Ha, et al. · 2018

Model-free deep reinforcement learning (RL) algorithms have been successfully applied to a range of challenging sequential decision making and control tasks. However, these methods typically suffer from two major challenges: high sample co…

Composable Deep Reinforcement Learning for Robotic Manipulation Open

Tuomas Haarnoja, Vitchyr H. Pong, Aurick Zhou, Murtaza Dalal, Pieter Abbeel, et al. · 2018

Model-free deep reinforcement learning has been shown to exhibit good performance in domains ranging from video games to simulated robotic manipulation and locomotion. However, model-free methods are known to perform poorly when the intera…

Latent Space Policies for Hierarchical Reinforcement Learning Open

Tuomas Haarnoja, Kristian Hartikainen, Pieter Abbeel, Sergey Levine · 2018

We address the problem of learning hierarchical deep neural network policies for reinforcement learning. In contrast to methods that explicitly restrict or cripple lower layers of a hierarchy to force them to use higher-level modulating si…

Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor Open

Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel, Sergey Levine · 2018

Model-free deep reinforcement learning (RL) algorithms have been demonstrated on a range of challenging decision making and control tasks. However, these methods typically suffer from two major challenges: very high sample complexity and b…

Acquiring Diverse Robot Skills via Maximum Entropy Deep Reinforcement Learning Open

Tuomas Haarnoja · 2018

In this thesis, we study how the maximum entropy framework can provide efficient deep reinforcement learning (deep RL) algorithms that solve tasks consistently and sample efficiently. This framework has several intriguing properties. First, th…

Reinforcement Learning with Deep Energy-Based Policies Open

Tuomas Haarnoja, Haoran Tang, Pieter Abbeel, Sergey Levine · 2017

We propose a method for learning expressive energy-based policies for continuous states and actions, which has been feasible only in tabular domains before. We apply our method to learning maximum entropy policies, resulting in a new alg…

Backprop KF: Learning Discriminative Deterministic State Estimators Open

Tuomas Haarnoja, Anurag Ajay, Sergey Levine, Pieter Abbeel · 2016

Generative state estimators based on probabilistic filters and smoothers are one of the most popular classes of state estimators for robots and autonomous vehicles. However, generative models have limited capacity to handle rich sensory ob…

Tuomas Haarnoja