Jimmy Ba
Neuromodulatory Control Networks (NCNs): A Biologically Inspired Architecture for Dynamic LLM Processing
Large Language Models (LLMs) based on the Transformer architecture have achieved remarkable success, yet their core processing mechanisms remain largely static after training. While powerful, this static nature limits their ability to dyna…
Mastering diverse control tasks through world models
Developing a general algorithm that learns to solve tasks across a wide range of applications has been a fundamental challenge in artificial intelligence. Although current reinforcement-learning algorithms can be readily applied to tasks s…
Report Cards: Qualitative Evaluation of Language Models Using Natural Language Summaries
The rapid development and dynamic nature of large language models (LLMs) make it difficult for conventional quantitative benchmarks to accurately assess their capabilities. We propose report cards, which are human-interpretable, natural la…
The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning
The White House Executive Order on Artificial Intelligence highlights the risks of large language models (LLMs) empowering malicious actors in developing biological, cyber, and chemical weapons. To measure these risks of malicious use, gov…
Using Large Language Models for Hyperparameter Optimization
This paper explores the use of foundational large language models (LLMs) in hyperparameter optimization (HPO). Hyperparameters are critical in determining the effectiveness of machine learning models, yet their optimization often relies on…
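A rough sketch of the idea, as a hedged illustration rather than the paper's exact protocol: show the LLM the search space and past trials, ask it to propose the next configuration, evaluate, and repeat. `query_llm` below is a hypothetical stand-in for any chat-completion client (here a toy random proposer so the snippet runs), and the prompt format is an assumption.

```python
# Illustrative LLM-driven hyperparameter search loop; the prompt format and
# `query_llm` are assumptions, not the authors' exact method.
import json, random

def query_llm(prompt: str) -> str:
    # Hypothetical stand-in for a chat-completion API call. For a runnable
    # demo it proposes a random config; a real LLM would condition on the
    # trial history embedded in `prompt`.
    return json.dumps({"lr": 10 ** random.uniform(-5, -1),
                       "batch_size": random.choice([32, 64, 128])})

def llm_hpo(train_and_eval, search_space: dict, num_rounds: int = 10):
    history = []  # (config, validation score) pairs shown back to the model
    for _ in range(num_rounds):
        prompt = (
            "You are tuning a machine learning model.\n"
            f"Search space: {json.dumps(search_space)}\n"
            f"Past trials (config, val_score): {json.dumps(history)}\n"
            "Propose the next config as a JSON object."
        )
        config = json.loads(query_llm(prompt))
        score = train_and_eval(config)  # user-supplied objective
        history.append((config, score))
    return max(history, key=lambda t: t[1])
```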
OpenWebMath: An Open Dataset of High-Quality Mathematical Web Text
There is growing evidence that pretraining on high-quality, carefully thought-out tokens such as code or mathematics plays an important role in improving the reasoning abilities of large language models. For example, Minerva, a PaLM model …
Identifying the Risks of LM Agents with an LM-Emulated Sandbox
Recent advances in Language Model (LM) agents and tool use, exemplified by applications like ChatGPT Plugins, enable a rich set of capabilities but also amplify potential risks - such as leaking private data or causing financial losses. Id…
STEVE-1: A Generative Model for Text-to-Behavior in Minecraft
Constructing AI models that respond to text instructions is challenging, especially for sequential decision-making tasks. This work introduces a methodology, inspired by unCLIP, for instruction-tuning generative models of behavior without …
Training on Thin Air: Improve Image Classification with Generated Data
Acquiring high-quality data for training discriminative models is a crucial yet challenging aspect of building effective predictive systems. In this paper, we present Diffusion Inversion, a simple yet effective method that leverages the pr…
AlpacaFarm: A Simulation Framework for Methods that Learn from Human Feedback
Large language models (LLMs) such as ChatGPT have seen widespread adoption due to their strong instruction-following abilities. Developing these LLMs involves a complex yet poorly understood workflow requiring training with human feedback.…
Clinical Camel: An Open Expert-Level Medical Language Model with Dialogue-Based Knowledge Encoding
We present Clinical Camel, an open large language model (LLM) explicitly tailored for clinical research. Fine-tuned from LLaMA-2 using QLoRA, Clinical Camel achieves state-of-the-art performance across medical benchmarks among openly avail…
Residual Prompt Tuning: Improving Prompt Tuning with Residual Reparameterization
Prompt tuning is one of the successful approaches for parameter-efficient tuning of pre-trained language models. Despite being arguably the most parameter-efficient (tuned soft prompts constitute <0.1% of total parameters), it typically pe…
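A minimal sketch of the residual reparameterization the title refers to: the trainable soft prompt is passed through a shallow MLP whose output is added back to the raw prompt embeddings. The backbone dimension, MLP width, and initialization below are illustrative assumptions, not the paper's exact settings.

```python
# Residual reparameterization of a soft prompt for a frozen language model.
import torch
import torch.nn as nn

class ResidualPrompt(nn.Module):
    def __init__(self, num_tokens: int = 10, d_model: int = 768, d_hidden: int = 256):
        super().__init__()
        self.prompt = nn.Parameter(torch.randn(num_tokens, d_model) * 0.02)
        # Shallow bottleneck MLP whose output is added back to the raw
        # prompt embeddings (the residual connection).
        self.mlp = nn.Sequential(
            nn.Linear(d_model, d_hidden),
            nn.ReLU(),
            nn.Linear(d_hidden, d_model),
        )

    def forward(self) -> torch.Tensor:
        # Reparameterized prompt: MLP(P) + P. After training, the MLP can be
        # discarded and the final prompt stored directly.
        return self.mlp(self.prompt) + self.prompt

# The returned (num_tokens, d_model) tensor is prepended to the input
# embeddings of the frozen backbone.
```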
TR0N: Translator Networks for 0-Shot Plug-and-Play Conditional Generation
We propose TR0N, a highly general framework to turn pre-trained unconditional generative models, such as GANs and VAEs, into conditional models. The conditioning can be highly arbitrary, and requires only a pre-trained auxiliary model. For…
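A hedged sketch of the translator idea: learn a small network that maps auxiliary-model embeddings (e.g., CLIP features) into the latent space of a frozen generator, trained entirely on pairs the two frozen models produce themselves. Both `generator` and `embedder` below are toy stand-ins, and the dimensions and training loop are illustrative assumptions.

```python
# Train a translator from embedding space to generator latent space using
# self-generated (latent, embedding) pairs; no labeled data required.
import torch
import torch.nn as nn

z_dim, e_dim = 128, 512
generator = nn.Sequential(nn.Linear(z_dim, 1024), nn.Tanh())  # stand-in for a frozen GAN/VAE decoder
embedder = nn.Sequential(nn.Linear(1024, e_dim))              # stand-in for e.g. a CLIP image encoder
translator = nn.Sequential(nn.Linear(e_dim, 256), nn.ReLU(), nn.Linear(256, z_dim))

opt = torch.optim.Adam(translator.parameters(), lr=1e-3)
for step in range(1000):
    z = torch.randn(64, z_dim)                       # sample latents
    with torch.no_grad():
        e = embedder(generator(z))                   # embed the generated samples
    loss = nn.functional.mse_loss(translator(e), z)  # regress embedding -> latent
    opt.zero_grad(); loss.backward(); opt.step()

# At test time, embed a condition (e.g., a caption via the auxiliary model's
# text tower), translate it to a latent, and decode with the frozen generator.
```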
Boosted Prompt Ensembles for Large Language Models
Methods such as chain-of-thought prompting and self-consistency have pushed the frontier of language model reasoning performance with no additional training. To further improve performance, we propose a prompt ensembling method for large l…
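A minimal sketch of what a boosted prompt ensemble could look like: greedily grow a set of few-shot prompts, each chosen to perform well on the examples the current ensemble still gets wrong, then answer by majority vote. `answer_with_prompt` is a hypothetical LLM call (replaced here by a toy deterministic function so the snippet runs); the selection rule is an illustrative assumption.

```python
# Greedy boosting over a pool of candidate prompts.
from collections import Counter

def answer_with_prompt(prompt, question):
    # Hypothetical stand-in for an LLM call; replace with a real client.
    return hash((prompt, question)) % 2  # toy binary "answer"

def ensemble_answer(prompts, question):
    votes = Counter(answer_with_prompt(p, question) for p in prompts)
    return votes.most_common(1)[0][0]

def boost_prompts(candidates, dataset, rounds=5):
    ensemble = []
    for _ in range(rounds):
        # Examples the current ensemble answers incorrectly.
        hard = [(q, y) for q, y in dataset
                if not ensemble or ensemble_answer(ensemble, q) != y]
        if not hard:
            break
        # Add the candidate prompt that fixes the most hard examples.
        best = max(candidates,
                   key=lambda p: sum(answer_with_prompt(p, q) == y for q, y in hard))
        ensemble.append(best)
    return ensemble
```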
PIFiA: Self-supervised Approach for Protein Functional Annotation from Single-Cell Imaging Data
Fluorescence microscopy data describe protein localization patterns at single-cell resolution and have the potential to reveal whole-proteome functional information with remarkable precision. Yet, extracting biologically meaningful represe…
Mastering Diverse Domains through World Models
Developing a general algorithm that learns to solve tasks across a wide range of applications has been a fundamental challenge in artificial intelligence. Although current reinforcement learning algorithms can be readily applied to tasks s…
Multi-Rate VAE: Train Once, Get the Full Rate-Distortion Curve
Variational autoencoders (VAEs) are powerful tools for learning latent representations of data used in a wide range of applications. In practice, VAEs usually require multiple training rounds to choose the amount of information the latent …
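A hedged sketch of the train-once idea: sample the rate weight β afresh each batch and condition the VAE on it, so that a single trained model can be queried at any point on the rate-distortion curve. The feature-wise modulation on log β below is an illustrative conditioning mechanism, not necessarily the paper's; sizes and the sampling range are assumptions.

```python
# A beta-conditioned VAE trained across a range of rate weights at once.
import torch
import torch.nn as nn

class BetaConditionedVAE(nn.Module):
    def __init__(self, d_in=784, d_z=32, d_h=256):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(d_in, d_h), nn.ReLU())
        self.mu, self.logvar = nn.Linear(d_h, d_z), nn.Linear(d_h, d_z)
        self.dec = nn.Sequential(nn.Linear(d_z, d_h), nn.ReLU(), nn.Linear(d_h, d_in))
        self.film = nn.Linear(1, d_h)  # modulates encoder features by log(beta)

    def forward(self, x, log_beta):
        h = self.enc(x) * torch.sigmoid(self.film(log_beta))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
        return self.dec(z), mu, logvar

def loss_fn(model, x):
    log_beta = torch.empty(x.size(0), 1).uniform_(-4, 2)  # sample a rate per example
    recon, mu, logvar = model(x, log_beta)
    rec = ((recon - x) ** 2).sum(dim=1)
    kl = 0.5 * (mu ** 2 + logvar.exp() - 1 - logvar).sum(dim=1)
    return (rec + log_beta.exp().squeeze(1) * kl).mean()  # beta-weighted ELBO
```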
Large Language Models Are Human-Level Prompt Engineers
By conditioning on natural language instructions, large language models (LLMs) have displayed impressive capabilities as general-purpose computers. However, task performance depends significantly on the quality of the prompt used to steer …
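A minimal sketch of searching over LLM-generated instructions: ask the model to infer the instruction behind a few input-output demonstrations, then score each candidate on the demos and keep the best. `llm` is a hypothetical completion function (stubbed here so the snippet runs), and both templates are illustrative assumptions rather than the paper's exact ones.

```python
# Propose-and-score search over candidate instructions.
def llm(prompt: str) -> str:
    # Hypothetical stand-in for a completion API; replace with a real client.
    return "Translate the word to French."

def propose_candidates(demos, n=8):
    examples = "\n".join(f"Input: {x} Output: {y}" for x, y in demos)
    return [llm("I gave a friend an instruction. Based on these examples,\n"
                f"{examples}\nthe instruction was:") for _ in range(n)]

def score(instruction, demos):
    # Execution accuracy of the candidate instruction on the demos.
    return sum(llm(f"{instruction}\nInput: {x}\nOutput:").strip() == y
               for x, y in demos)

def best_instruction(demos):
    return max(propose_candidates(demos), key=lambda c: score(c, demos))
```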
Exploring Low Rank Training of Deep Neural Networks
Training deep neural networks in low rank, i.e. with factorised layers, is of particular interest to the community: it offers efficiency over unfactorised training in terms of both memory consumption and training time. Prior work has focus…
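A minimal sketch of a factorised layer of the kind such work trains: the weight matrix W (d_out × d_in) is replaced by a product U V with inner rank r, cutting the parameter count from d_out·d_in to r·(d_in + d_out) when r is small. The initialization scaling below is an illustrative choice.

```python
# A low-rank (factorised) linear layer for end-to-end low-rank training.
import torch
import torch.nn as nn

class LowRankLinear(nn.Module):
    def __init__(self, d_in: int, d_out: int, rank: int):
        super().__init__()
        self.U = nn.Parameter(torch.randn(d_out, rank) / rank ** 0.5)
        self.V = nn.Parameter(torch.randn(rank, d_in) / d_in ** 0.5)
        self.bias = nn.Parameter(torch.zeros(d_out))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Two skinny matmuls instead of one full-rank one.
        return (x @ self.V.T) @ self.U.T + self.bias

# e.g. a 1024x1024 layer at rank 64 uses ~8x fewer weight parameters.
```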
Dataset Distillation using Neural Feature Regression
Dataset distillation aims to learn a small synthetic dataset that preserves most of the information from the original dataset. Dataset distillation can be formulated as a bi-level meta-learning problem where the outer loop optimizes the me…
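A hedged sketch of the feature-regression approach to this bi-level problem: solve the inner problem in closed form (ridge regression of labels on neural features of the synthetic set), then backpropagate the outer loss on real data into the synthetic examples. The tiny frozen feature net and single-network setup are simplifying assumptions of this sketch, not the paper's full method.

```python
# Bi-level dataset distillation with a closed-form (ridge) inner solve.
import torch
import torch.nn as nn

d, k, n_syn = 64, 10, 100
feat = nn.Sequential(nn.Linear(d, 128), nn.ReLU())  # feature extractor
feat.requires_grad_(False)                          # frozen here for simplicity
x_syn = nn.Parameter(torch.randn(n_syn, d))         # learnable synthetic data
y_syn = torch.eye(k).repeat(n_syn // k, 1)          # fixed balanced one-hot labels
opt = torch.optim.Adam([x_syn], lr=1e-2)

def outer_step(x_real, y_real, lam=1e-3):
    f_syn = feat(x_syn)                              # (n_syn, 128)
    # Inner problem: ridge-regression head fit on synthetic features.
    # torch.linalg.solve is differentiable, so gradients reach x_syn.
    A = f_syn.T @ f_syn + lam * torch.eye(f_syn.size(1))
    w = torch.linalg.solve(A, f_syn.T @ y_syn)       # (128, k)
    # Outer problem: how well that head classifies real data.
    logits = feat(x_real) @ w
    loss = nn.functional.cross_entropy(logits, y_real)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```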
You Can't Count on Luck: Why Decision Transformers and RvS Fail in Stochastic Environments
Recently, methods such as Decision Transformer that reduce reinforcement learning to a prediction task and solve it via supervised learning (RvS) have become popular due to their simplicity, robustness to hyperparameters, and strong overal…
High-dimensional Asymptotics of Feature Learning: How One Gradient Step Improves the Representation
We study the first gradient descent step on the first-layer parameters $\boldsymbol{W}$ in a two-layer neural network: $f(\boldsymbol{x}) = \frac{1}{\sqrt{N}}\boldsymbol{a}^\top \sigma(\boldsymbol{W}^\top\boldsymbol{x})$, where $\boldsymbol{W}\in…
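For concreteness, a worked form of the setting under standard assumptions: one gradient step on the first-layer weights with the second layer fixed. The squared loss and the step size $\eta$ (and its scaling) are assumptions of this sketch, since the truncated abstract does not specify them.

```latex
% Two-layer network with first-layer weights W updated by a single gradient
% step; loss and step-size scaling are illustrative assumptions.
f(\boldsymbol{x}) = \tfrac{1}{\sqrt{N}}\,\boldsymbol{a}^\top \sigma(\boldsymbol{W}^\top \boldsymbol{x}),
\qquad
\boldsymbol{W}^{+} = \boldsymbol{W} - \eta\, \nabla_{\boldsymbol{W}}
\frac{1}{n}\sum_{i=1}^{n} \big(f(\boldsymbol{x}_i) - y_i\big)^2 .
```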
Learning Domain Invariant Representations in Goal-conditioned Block MDPs
Deep Reinforcement Learning (RL) is successful in solving many complex Markov Decision Process (MDP) problems. However, agents often face unanticipated environmental changes after deployment in the real world. These changes are often sp…
INT: An Inequality Benchmark for Evaluating Generalization in Theorem Proving
In learning-assisted theorem proving, one of the most critical challenges is to generalize to theorems unlike those seen at training time. In this paper, we introduce INT, an INequality Theorem proving benchmark designed to test agents’ ge…
View article: Clockwork Variational Autoencoders
Clockwork Variational Autoencoders Open
Deep learning has enabled algorithms to generate realistic images. However, accurately predicting long video sequences requires understanding long-term dependencies and remains an open challenge. While existing video prediction models succ…