Nevan Wichers
Visualizing Neural Network Imagination
In certain situations, neural networks will represent environment states in their hidden activations. Our goal is to visualize what environment states the networks are representing. We experiment with a recurrent neural network (RNN) archi…
Beyond Sparse Rewards: Enhancing Reinforcement Learning with Language Model Critique in Text Generation
Reinforcement learning (RL) can align language models with non-differentiable reward signals, such as human preferences. However, a major challenge arises from the sparsity of these reward signals - typically, there is only a single reward…
Fusion-Eval: Integrating Assistant Evaluators with LLMs
Evaluating natural language systems poses significant challenges, particularly in the realms of natural language understanding and high-level reasoning. In this paper, we introduce 'Fusion-Eval', an innovative approach that leverages Large…
SiRA: Sparse Mixture of Low Rank Adaptation
Parameter Efficient Tuning has been a prominent approach to adapting Large Language Models to downstream tasks. Most previous works consider adding dense trainable parameters, where all parameters are used to adapt to a certain task. We f…
SAFER: Data-Efficient and Safe Reinforcement Learning via Skill Acquisition
Methods that extract policy primitives from offline demonstrations using deep generative models have shown promise at accelerating reinforcement learning (RL) for new tasks. Intuitively, these methods should also help to train safe RL agents b…
ActionBert: Leveraging User Actions for Semantic Understanding of User Interfaces
As mobile devices are becoming ubiquitous, regularly interacting with a variety of user interfaces (UIs) is a common aspect of daily life for many people. To improve the accessibility of these devices and to enable their usage in a variety…
RL agents Implicitly Learning Human Preferences
In the real world, RL agents should be rewarded for fulfilling human preferences. We show that RL agents implicitly learn the preferences of humans in their environment. Training a classifier to predict if a simulated human's preferences a…
Resolving Spurious Correlations in Causal Models of Environments via Interventions
Causal models bring many benefits to decision-making systems (or agents) by making them interpretable, sample-efficient, and robust to changes in the input distribution. However, spurious correlations can lead to wrong causal models and pr…
Resolving Referring Expressions in Images With Labeled Elements
Images may have elements containing text and a bounding box associated with them, for example, text identified via optical character recognition on a computer screen image, or a natural image with labeled objects. We present an end-to-end …
Hierarchical Long-term Video Prediction without Supervision
Much recent research has been devoted to video prediction and generation, yet most previous works have demonstrated only limited success in generating videos on short-term horizons. The hierarchical video prediction method by Vil…