Arunkumar Byravan
Gemini Robotics 1.5: Pushing the Frontier of Generalist Robots with Advanced Embodied Reasoning, Thinking, and Motion Transfer
General-purpose robots need a deep understanding of the physical world, advanced reasoning, and general and dexterous control. This report introduces the latest generation of the Gemini Robotics model family: Gemini Robotics 1.5, a multi-e…
Splatting Physical Scenes: End-to-End Real-to-Sim from Imperfect Robot Data
Creating accurate, physical simulations directly from real-world robot motion holds great value for safe, scalable, and affordable robot learning, yet remains exceptionally challenging. Real robot data suffers from occlusions, noisy camera…
Gemini Robotics: Bringing AI into the Physical World
Recent advancements in large multimodal models have led to the emergence of remarkable generalist capabilities in digital domains, yet their translation to physical agents such as robots remains a significant challenge. This report introdu…
Proc4Gem: Foundation models for physical agency through procedural generation
In robot learning, it is common to either ignore the environment semantics, focusing on tasks like whole-body control which only require reasoning about robot-environment contacts, or conversely to ignore contact dynamics, focusing on grou…
Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning
We introduce Diffusion Augmented Agents (DAAG), a novel framework that leverages large language models, vision language models, and diffusion models to improve sample efficiency and transfer learning in reinforcement learning for embodied …
Learning Robot Soccer from Egocentric Vision with Deep Reinforcement Learning
We apply multi-agent deep reinforcement learning (RL) to train end-to-end robot soccer policies with fully onboard computation and sensing via egocentric RGB vision. This setting reflects many challenges of real-world robotics, including a…
Real-World Fluid Directed Rigid Body Control via Deep Reinforcement Learning
Recent advances in real-world applications of reinforcement learning (RL) have relied on the ability to accurately simulate systems at scale. However, domains such as fluid dynamical systems exhibit complex dynamic phenomena that are hard …
Foundations for Transfer in Reinforcement Learning: A Taxonomy of Knowledge Modalities
Contemporary artificial intelligence systems exhibit rapidly growing abilities accompanied by the growth of required resources, expansive datasets and corresponding investments into computing infrastructure. Although earlier successes pred…
Equivariant Data Augmentation for Generalization in Offline Reinforcement Learning
We present a novel approach to address the challenge of generalization in offline reinforcement learning (RL), where the agent learns from a fixed dataset without any additional interaction with the environment. Specifically, we aim to imp…
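The core idea of symmetry-based data augmentation can be sketched as follows. This is a minimal illustration assuming a simple left-right reflection symmetry of one state/action coordinate; the paper's actual equivariances, dataset format, and function names are not shown here, so `reflect_transitions` and its arguments are illustrative assumptions only:

```python
import numpy as np

def reflect_transitions(states, actions, next_states, axis=0):
    """Augment an offline RL dataset with a mirror symmetry.

    Assumes the task is invariant under negating one coordinate of the
    state and action (e.g. a left-right reflection). Illustrative only.
    """
    s_ref = states.copy()
    a_ref = actions.copy()
    ns_ref = next_states.copy()
    # Apply the reflection to states, actions, and next states alike,
    # so the augmented transitions remain dynamically consistent.
    s_ref[:, axis] *= -1.0
    a_ref[:, axis] *= -1.0
    ns_ref[:, axis] *= -1.0
    # Concatenate originals with reflected copies: the dataset doubles.
    return (np.concatenate([states, s_ref]),
            np.concatenate([actions, a_ref]),
            np.concatenate([next_states, ns_ref]))
```

Under the stated symmetry assumption, training on the augmented dataset exposes the agent to states it never visited, which is one route to better generalization from a fixed dataset.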
Towards A Unified Agent with Foundation Models
Language Models and Vision Language Models have recently demonstrated unprecedented capabilities in terms of understanding human intentions, reasoning, scene understanding, and planning-like behaviour, in text form, among many others. In t…
A Generalist Dynamics Model for Control
We investigate the use of transformer sequence models as dynamics models (TDMs) for control. We find that TDMs exhibit strong generalization capabilities to unseen environments, both in a few-shot setting, where a generalist TDM is fine-tu…
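The interface of a sequence dynamics model used for control can be sketched as follows. This is a toy stand-in: a random linear map plays the role of the transformer so the example runs without a deep-learning framework, and `ToySequenceDynamicsModel` and its methods are illustrative names, not the paper's API:

```python
import numpy as np

class ToySequenceDynamicsModel:
    """Sketch of the *interface* of a sequence dynamics model for control.

    A real TDM is a transformer over (state, action) tokens; here a fixed
    random linear map stands in, conditioning only on the most recent
    (state, action) pair. Illustrative only.
    """

    def __init__(self, state_dim, action_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(scale=0.1, size=(state_dim + action_dim, state_dim))

    def predict_next(self, state, action):
        """Predict the next state from the current state and action."""
        return np.concatenate([state, action]) @ self.W

    def rollout(self, state, action_sequence):
        """Autoregressive rollout: feed each predicted state back in,
        the same way a planner would query the model over a horizon."""
        states = [state]
        for a in action_sequence:
            states.append(self.predict_next(states[-1], a))
        return states
```

A planner can then score candidate action sequences by rolling each one out through the model and evaluating the predicted state trajectory.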
Learning Agile Soccer Skills for a Bipedal Robot with Deep Reinforcement Learning
We investigate whether Deep Reinforcement Learning (Deep RL) is able to synthesize sophisticated and safe movement skills for a low-cost, miniature humanoid robot that can be composed into complex behavioral strategies in dynamic environme…
Leveraging Jumpy Models for Planning and Fast Learning in Robotic Domains
In this paper we study the problem of learning multi-step dynamics prediction models (jumpy models) from unlabeled experience and their utility for fast inference of (high-level) plans in downstream tasks. In particular we propose to learn…
NeRF2Real: Sim2real Transfer of Vision-guided Bipedal Motion Skills using Neural Radiance Fields
We present a system for applying sim2real approaches to "in the wild" scenes with realistic visuals, and to policies which rely on active perception using RGB cameras. Given a short video of a static scene collected using a generic phone, …
Revisiting Gaussian mixture critics in off-policy reinforcement learning: a sample-based approach
Actor-critic algorithms that make use of distributional policy evaluation have frequently been shown to outperform their non-distributional counterparts on many challenging control tasks. Examples of this behavior include the D4PG and DMPO…
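Sample-based evaluation of a Gaussian-mixture critic can be sketched as follows: rather than computing closed-form moments, draw Monte-Carlo return samples from the mixture head and average them to estimate the Q-value. Function and parameter names here are assumptions for illustration, not the paper's implementation:

```python
import numpy as np

def sample_mixture_returns(means, stds, logits, n_samples, rng):
    """Draw return samples from a Gaussian-mixture critic head.

    Pick a mixture component per sample from the categorical defined by
    `logits`, then draw from that component's Gaussian. Illustrative
    sketch of a sample-based distributional critic.
    """
    # Softmax over component logits (numerically stabilized).
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    comps = rng.choice(len(means), size=n_samples, p=probs)
    return rng.normal(means[comps], stds[comps])

rng = np.random.default_rng(0)
# A two-component mixture with equal weights and means 0 and 10.
samples = sample_mixture_returns(np.array([0.0, 10.0]),
                                 np.array([1.0, 1.0]),
                                 np.array([0.0, 0.0]),
                                 n_samples=10000, rng=rng)
q_estimate = samples.mean()  # Monte-Carlo estimate of the Q-value (~5.0)
```

The same samples can feed a sample-based policy-improvement step, which avoids deriving closed-form expressions for each new distributional family.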
The Challenges of Exploration for Offline Reinforcement Learning
Offline Reinforcement Learning (ORL) enables us to separately study the two interlinked processes of reinforcement learning: collecting informative experience and inferring optimal behaviour. The second step has been widely studied in the o…
Beyond Pick-and-Place: Tackling Robotic Stacking of Diverse Shapes
We study the problem of robotic stacking with objects of complex geometry. We propose a challenging and diverse set of such objects that was carefully designed to require strategies beyond a simple "pick-and-place" solution. Our method is …
Evaluating model-based planning and planner amortization for continuous control
There is a widespread intuition that model-based control methods should be able to surpass the data efficiency of model-free approaches. In this paper we attempt to evaluate this intuition on various challenging locomotion tasks. We take a…
Learning Dynamics Models for Model Predictive Agents
Model-Based Reinforcement Learning involves learning a dynamics model from data, and then using this model to optimise behaviour, most often with an online planner. Much of the recent research along these lines presents a…
On Multi-objective Policy Optimization as a Tool for Reinforcement Learning: Case Studies in Offline RL and Finetuning
Many advances that have improved the robustness and efficiency of deep reinforcement learning (RL) algorithms can, in one way or another, be understood as introducing additional objectives or constraints in the policy optimization step. Th…
Representation Matters: Improving Perception and Exploration for Robotics
Projecting high-dimensional environment observations into lower-dimensional structured representations can considerably improve data-efficiency for reinforcement learning in domains with limited data such as robotics. Can a single generall…
Local Search for Policy Iteration in Continuous Control
We present an algorithm for local, regularized, policy improvement in reinforcement learning (RL) that allows us to formulate model-based and model-free variants in a single framework. Our algorithm can be interpreted as a natural extensio…
Motion-Nets: 6D Tracking of Unknown Objects in Unseen Environments using RGB
In this work, we bridge the gap between recent pose estimation and tracking work to develop a powerful method for robots to track objects in their surroundings. Motion-Nets use a segmentation model to segment the scene, and separate transl…
Imagined Value Gradients: Model-Based Policy Optimization with Transferable Latent Dynamics Models
Humans are masters at quickly learning many complex tasks, relying on an approximate understanding of the dynamics of their environments. In much the same way, we would like our learning agents to quickly adapt to new tasks. In this paper,…
Prospection: Interpretable plans from language by predicting the future
High-level human instructions often correspond to behaviors with multiple implicit steps. In order for robots to be useful in the real world, they must be able to reason over both motions and intermediate goals implied by human instruct…
Structured Deep Visual Dynamics Models for Robot Manipulation
SE3-Pose-Nets: Structured Deep Dynamics Models for Visuomotor Planning and Control
In this work, we present an approach to deep visuomotor control using structured deep dynamics models. Our deep dynamics model, a variant of SE3-Nets, learns a low-dimensional pose embedding for visuomotor control via an encoder-decoder st…