Yuri Burda
Large-Scale Study of Curiosity-Driven Learning
Reinforcement learning algorithms rely on carefully engineering environment rewards that are extrinsic to the agent. However, annotating each environment with hand-designed, dense rewards is not scalable, motivating the need for developing…
Prover-Verifier Games improve legibility of LLM outputs
One way to increase confidence in the outputs of Large Language Models (LLMs) is to support them with reasoning that is clear and easy to check -- a property we call legibility. We study legibility in the context of solving grade-school ma…
Importance weighted autoencoders
The variational autoencoder (VAE; Kingma, Welling (2014)) is a recently proposed generative model pairing a top-down generative network with a bottom-up recognition network which approximates posterior inference. It typically makes strong …
Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets
In this paper we propose to study generalization of neural networks on small algorithmically generated datasets. In this setting, questions about data efficiency, memorization, generalization, and speed of learning can be studied in great …
Evaluating Large Language Models Trained on Code
We introduce Codex, a GPT language model fine-tuned on publicly available code from GitHub, and study its Python code-writing capabilities. A distinct production version of Codex powers GitHub Copilot. On HumanEval, a new evaluation set we…
Exploration by Random Network Distillation
We introduce an exploration bonus for deep reinforcement learning methods that is easy to implement and adds minimal overhead to the computation performed. The bonus is the error of a neural network predicting features of the observations …
Continuous Adaptation via Meta-Learning in Nonstationary and Competitive Environments
Ability to continuously learn and adapt from limited experience in nonstationary environments is an important milestone on the path towards general intelligence. In this paper, we cast the problem of continuous adaptation into the learning…
On the Quantitative Analysis of Decoder-Based Generative Models
The past several years have seen remarkable progress in generative models which produce convincing samples of images and other modalities. A shared component of many powerful generative models is a decoder network, a parametric deep neural…
Polynomials Invertible in k-Radicals
A classic result of Ritt describes polynomials invertible in radicals: they are compositions of power polynomials, Chebyshev polynomials and polynomials of degree at most 4. In this paper we prove that a polynomial invertible in radicals a…