Explanipedia

Large-Scale Study of Curiosity-Driven Learning Open

Yuri Burda, Harrison Edwards, Deepak Pathak, Amos Storkey, Trevor Darrell, et al. · 2025

Reinforcement learning algorithms rely on carefully engineering environment rewards that are extrinsic to the agent. However, annotating each environment with hand-designed, dense rewards is not scalable, motivating the need for developing…

Prover-Verifier Games improve legibility of LLM outputs Open

Jan H. Kirchner, Yining Chen, Harri Edwards, Jan Leike, Nat McAleese, et al. · 2024

Computer science Mathematics Business

One way to increase confidence in the outputs of Large Language Models (LLMs) is to support them with reasoning that is clear and easy to check -- a property we call legibility. We study legibility in the context of solving grade-school ma…

Importance weighted autoencoders Open

Yuri Burda, Roger Grosse, Ruslan Salakhutdinov · 2024

Computer science Mathematics Medicine

The variational autoencoder (VAE; Kingma, Welling (2014)) is a recently proposed generative model pairing a top-down generative network with a bottom-up recognition network which approximates posterior inference. It typically makes strong …

Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets Open

Alethea Power, Yuri Burda, Harri Edwards, I. Babuschkin, Vedant Misra · 2022

Computer science Mathematics

In this paper we propose to study generalization of neural networks on small algorithmically generated datasets. In this setting, questions about data efficiency, memorization, generalization, and speed of learning can be studied in great …

Evaluating Large Language Models Trained on Code Open

Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Pondé de Oliveira Pinto, et al. · 2021

Computer science

We introduce Codex, a GPT language model fine-tuned on publicly available code from GitHub, and study its Python code-writing capabilities. A distinct production version of Codex powers GitHub Copilot. On HumanEval, a new evaluation set we…

Exploration by Random Network Distillation Open

Yuri Burda, Harrison Edwards, Amos Storkey, Oleg Klimov · 2018

Computer science Mathematics Chemistry

We introduce an exploration bonus for deep reinforcement learning methods that is easy to implement and adds minimal overhead to the computation performed. The bonus is the error of a neural network predicting features of the observations …

Large-Scale Study of Curiosity-Driven Learning Open

Yuri Burda, Harri Edwards, Deepak Pathak, Amos Storkey, Trevor Darrell, et al. · 2018

Psychology Computer science Geography

Reinforcement learning algorithms rely on carefully engineering environment rewards that are extrinsic to the agent. However, annotating each environment with hand-designed, dense rewards is not scalable, motivating the need for developing…

Continuous Adaptation via Meta-Learning in Nonstationary and Competitive Environments Open

Maruan Al-Shedivat, Trapit Bansal, Yuri Burda, Ilya Sutskever, Igor Mordatch, et al. · 2017

Computer science Engineering Mathematics

Ability to continuously learn and adapt from limited experience in nonstationary environments is an important milestone on the path towards general intelligence. In this paper, we cast the problem of continuous adaptation into the learning…

On the Quantitative Analysis of Decoder-Based Generative Models Open

Yuhuai Wu, Yuri Burda, Ruslan Salakhutdinov, Roger Grosse · 2016

Computer science Mathematics

The past several years have seen remarkable progress in generative models which produce convincing samples of images and other modalities. A shared component of many powerful generative models is a decoder network, a parametric deep neural…

Polynomials Invertible in k-Radicals Open

Yuri Burda, Askold Khovanskiĭ · 2016

Mathematics Physics

A classic result of Ritt describes polynomials invertible in radicals: they are compositions of power polynomials, Chebyshev polynomials and polynomials of degree at most 4. In this paper we prove that a polynomial invertible in radicals a…

Yuri Burda