Tuomas Haarnoja
Learning Robot Soccer from Egocentric Vision with Deep Reinforcement Learning
We apply multi-agent deep reinforcement learning (RL) to train end-to-end robot soccer policies with fully onboard computation and sensing via egocentric RGB vision. This setting reflects many challenges of real-world robotics, including a…
Replay across Experiments: A Natural Extension of Off-Policy RL
Replaying data is a principal mechanism underlying the stability and data efficiency of off-policy reinforcement learning (RL). We present an effective yet simple framework to extend the use of replays across multiple experiments, minimall…
Learning Agile Soccer Skills for a Bipedal Robot with Deep Reinforcement Learning
We investigate whether Deep Reinforcement Learning (Deep RL) is able to synthesize sophisticated and safe movement skills for a low-cost, miniature humanoid robot that can be composed into complex behavioral strategies in dynamic environme…
SkillS: Adaptive Skill Sequencing for Efficient Temporally-Extended Exploration
The ability to effectively reuse prior knowledge is a key requirement when building general and flexible Reinforcement Learning (RL) agents. Skill reuse is one of the most common approaches, but current methods have considerable limitation…
NeRF2Real: Sim2real Transfer of Vision-guided Bipedal Motion Skills using Neural Radiance Fields
We present a system for applying sim2real approaches to "in the wild" scenes with realistic visuals, and to policies which rely on active perception using RGB cameras. Given a short video of a static scene collected using a generic phone, …
Figure Data for the paper "From Motor Control to Team Play in Simulated Humanoid Football"
Data Release for Article: From Motor Control to Team Play in Simulated Humanoid Football. This package releases a set of Python notebooks, each reproducing a quantitative figure featured in the research article "Fro…
Forgetting and Imbalance in Robot Lifelong Learning with Off-policy Data
Robots will experience non-stationary environment dynamics throughout their lifetime: the robot dynamics can change due to wear and tear, or its surroundings may change over time. Eventually, the robots should perform well in all of the en…
Imitate and Repurpose: Learning Reusable Robot Movement Skills From Human and Animal Behaviors
We investigate the use of prior knowledge of human and animal movement to learn reusable locomotion skills for real legged robots. Our approach builds upon previous work on imitating human or dog Motion Capture (MoCap) data to learn a move…
From Motor Control to Team Play in Simulated Humanoid Football
Intelligent behaviour in the physical world exhibits structure at multiple spatial and temporal scales. Although movements are ultimately executed at the level of instantaneous muscle tensions or joint torques, they must be selected to ser…
Dynamical Distance Learning for Semi-Supervised and Unsupervised Skill Discovery
Reinforcement learning requires manual specification of a reward function to learn a task. While in principle this reward function only needs to specify the task goal, in practice reinforcement learning can be very time-consuming or even i…
Dynamical Distance Learning for Unsupervised and Semi-Supervised Skill Discovery.
Learning to Walk via Deep Reinforcement Learning
Deep reinforcement learning (deep RL) holds the promise of automating the acquisition of complex controllers that can map sensory inputs directly to low-level actions. In the domain of robotic locomotion, deep RL could enable learning locom…
Soft Actor-Critic Algorithms and Applications
Model-free deep reinforcement learning (RL) algorithms have been successfully applied to a range of challenging sequential decision making and control tasks. However, these methods typically suffer from two major challenges: high sample co…
Composable Deep Reinforcement Learning for Robotic Manipulation
Model-free deep reinforcement learning has been shown to exhibit good performance in domains ranging from video games to simulated robotic manipulation and locomotion. However, model-free methods are known to perform poorly when the intera…
Latent Space Policies for Hierarchical Reinforcement Learning
We address the problem of learning hierarchical deep neural network policies for reinforcement learning. In contrast to methods that explicitly restrict or cripple lower layers of a hierarchy to force them to use higher-level modulating si…
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
Model-free deep reinforcement learning (RL) algorithms have been demonstrated on a range of challenging decision making and control tasks. However, these methods typically suffer from two major challenges: very high sample complexity and b…
Acquiring Diverse Robot Skills via Maximum Entropy Deep Reinforcement Learning
In this thesis, we study how the maximum entropy framework can provide efficient deep reinforcement learning (deep RL) algorithms that solve tasks consistently and sample efficiently. This framework has several intriguing properties. First, th…
Reinforcement Learning with Deep Energy-Based Policies
We propose a method for learning expressive energy-based policies for continuous states and actions, which has previously been feasible only in tabular domains. We apply our method to learning maximum entropy policies, resulting in a new alg…
Backprop KF: Learning Discriminative Deterministic State Estimators
Generative state estimators based on probabilistic filters and smoothers are one of the most popular classes of state estimators for robots and autonomous vehicles. However, generative models have limited capacity to handle rich sensory ob…