Peter Welinder
Caltech-UCSD Birds-200-2011 Dataset
CUB-200-2011 is an extended version of CUB-200 [7], a challenging dataset of 200 bird species. The extended version roughly doubles the number of images per category and adds new part localization annotations. All images are annotated with…
Training language models to follow instructions with human feedback
Making language models bigger does not inherently make them better at following a user's intent. For example, large language models can generate outputs that are untruthful, toxic, or simply not helpful to the user. In other words, these m…
Text and Code Embeddings by Contrastive Pre-Training
Text embeddings are useful features in many applications such as semantic search and computing text similarity. Previous work typically trains models customized for different use cases, varying in dataset choice, training objective and mod…
Evaluating Large Language Models Trained on Code
We introduce Codex, a GPT language model fine-tuned on publicly available code from GitHub, and study its Python code-writing capabilities. A distinct production version of Codex powers GitHub Copilot. On HumanEval, a new evaluation set we…
Sim2Real in Robotics and Automation: Applications and Challenges
To perform reliably and consistently over sustained periods of time, large-scale automation critically relies on computer simulation. Simulation allows us and supervisory AI to effectively design, validate, and continuously improve complex…
Asymmetric self-play for automatic goal discovery in robotic manipulation
We train a single, goal-conditioned policy that can solve many robotic manipulation tasks, including tasks with previously unseen goals and objects. We rely on asymmetric self-play for goal discovery, where two agents, Alice and Bob, play …
Perspectives on Sim2Real Transfer for Robotics: A Summary of the R:SS 2020 Workshop
This report presents the debates, posters, and discussions of the Sim2Real workshop held in conjunction with the 2020 edition of the "Robotics: Science and Systems" conference. Twelve leaders of the field took competing debate positions on …
Learning dexterous in-hand manipulation
We use reinforcement learning (RL) to learn dexterous in-hand manipulation policies that can perform vision-based object reorientation on a physical Shadow Dexterous Hand. The training is performed in a simulated environment in which we ra…
Solving Rubik's Cube with a Robot Hand
We demonstrate that models trained only in simulation can be used to solve a manipulation problem of unprecedented complexity on a real robot. This is made possible by two key components: a novel algorithm, which we call automatic domain r…
ORRB -- OpenAI Remote Rendering Backend
We present the OpenAI Remote Rendering Backend (ORRB), a system that allows fast and customizable rendering of robotics environments. It is based on the Unity3d game engine and interfaces with the MuJoCo physics simulation library. ORRB wa…
Domain Randomization and Generative Models for Robotic Grasping
Deep learning-based robotic grasping has made significant progress thanks to algorithmic improvements and increased data availability. However, state-of-the-art models are often trained on as few as hundreds or thousands of unique object i…
Asymmetric Actor Critic for Image-Based Robot Learning
Deep reinforcement learning (RL) has proven a powerful technique in many sequential decision-making domains. However, robotics poses many challenges for RL; most notably, training on a physical system can be expensive and dangerous, which h…
Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research
The purpose of this technical report is two-fold. First of all, it introduces a suite of challenging continuous control tasks (integrated with OpenAI Gym) based on currently existing robotics hardware. The tasks include pushing, sliding an…
Hindsight Experience Replay
Dealing with sparse rewards is one of the biggest challenges in Reinforcement Learning (RL). We present a novel technique called Hindsight Experience Replay which allows sample-efficient learning from rewards which are sparse and binary an…