Andrew Jesson
Can Generative AI Solve Your In-Context Learning Problem? A Martingale Perspective
This work is about estimating when a conditional generative model (CGM) can solve an in-context learning (ICL) problem. An in-context learning (ICL) problem comprises a CGM, a dataset, and a prediction task. The CGM could be a multi-modal …
Hypothesis Testing the Circuit Hypothesis in LLMs
Large language models (LLMs) demonstrate surprising capabilities, but we do not understand how they are implemented. One hypothesis suggests that these capabilities are primarily executed by small subnetworks within the LLM, known as circu…
Improving Generalization on the ProcGen Benchmark with Simple Architectural Changes and Scale
We demonstrate that recent advances in reinforcement learning (RL) combined with simple architectural changes significantly improve generalization on the ProcGen benchmark. These changes are frame stacking, replacing 2D convolutional laye…
Estimating the Hallucination Rate of Generative AI
This paper presents a method for estimating the hallucination rate for in-context learning (ICL) with generative AI. In ICL, a conditional generative model (CGM) is prompted with a dataset and a prediction question and asked to generate a …
DiscoBAX: Discovery of Optimal Intervention Sets in Genomic Experiment Design
The discovery of therapeutics to treat genetically-driven pathologies relies on identifying genes involved in the underlying disease mechanisms. Existing approaches search over the billions of potential interventions to maximize the expect…
BatchGFN: Generative Flow Networks for Batch Active Learning
We introduce BatchGFN -- a novel approach for pool-based active learning that uses generative flow networks to sample sets of data points proportional to a batch reward. With an appropriate reward function to quantify the utility of acquir…
ReLU to the Rescue: Improve Your On-Policy Actor-Critic with Positive Advantages
This paper proposes a step toward approximate Bayesian inference in on-policy actor-critic deep reinforcement learning. It is implemented through three changes to the Asynchronous Advantage Actor-Critic (A3C) algorithm: (1) applying a ReLU…
B-Learner: Quasi-Oracle Bounds on Heterogeneous Causal Effects Under Hidden Confounding
Estimating heterogeneous treatment effects from observational data is a crucial task across many fields, helping policy and decision-makers take better actions. There has been recent progress on robust and efficient methods for estimating …
Differentiable Multi-Target Causal Bayesian Experimental Design
We introduce a gradient-based approach for the problem of Bayesian optimal experimental design to learn causal models in a batch setting -- a critical component for causal discovery from finite data where interventions can be costly or ris…
Using uncertainty-aware machine learning models to study aerosol-cloud interactions
Aerosol-cloud interactions (ACI) include various effects that result from aerosols entering a cloud, and affecting cloud properties. In general, an increase in aerosol concentration results in smaller droplet sizes which leads to larger, b…
Scalable Sensitivity and Uncertainty Analysis for Causal-Effect Estimates of Continuous-Valued Interventions
Estimating the effects of continuous-valued interventions from observational data is a critically important task for climate science, healthcare, and economics. Recent work focuses on designing neural network architectures and regularizati…
Interventions, Where and How? Experimental Design for Causal Models at Scale
Causal discovery from observational and interventional data is challenging due to limited data and non-identifiability: factors that introduce uncertainty in estimating the underlying structural causal model (SCM). Selecting experiments (i…
Causal-BALD: Deep Bayesian Active Learning of Outcomes to Infer Treatment-Effects from Observational Data
Estimating personalized treatment effects from high-dimensional observational data is essential in situations where experimental designs are infeasible, unethical, or expensive. Existing approaches rely on fitting deep models on outcomes o…
Using Non-Linear Causal Models to Study Aerosol-Cloud Interactions in the Southeast Pacific
Aerosol-cloud interactions include a myriad of effects that all begin when aerosol enters a cloud and acts as cloud condensation nuclei (CCN). An increase in CCN results in a decrease in the mean cloud droplet size ($r_e$). The smaller d…
GeneDisco: A Benchmark for Experimental Design in Drug Discovery
In vitro cellular experimentation with genetic interventions, using for example CRISPR technologies, is an essential step in early-stage drug discovery and target validation that serves to assess initial hypotheses about causal association…
Stochastic Batch Acquisition: A Simple Baseline for Deep Active Learning
We examine a simple stochastic strategy for adapting well-known single-point acquisition functions to allow batch active learning. Unlike acquiring the top-K points from the pool set, score- or rank-based sampling takes into account that a…
Quantifying Ignorance in Individual-Level Causal-Effect Estimates under Hidden Confounding
We study the problem of learning conditional average treatment effects (CATE) from high-dimensional, observational data with unobserved confounders. Unobserved confounders introduce ignorance -- a level of unidentifiability -- about an ind…
On Feature Collapse and Deep Kernel Learning for Single Forward Pass Uncertainty
Inducing point Gaussian process approximations are often considered a gold standard in uncertainty estimation since they retain many of the properties of the exact GP and scale to large datasets. A major drawback is that they have difficul…
Identifying Causal-Effect Inference Failure with Uncertainty-Aware Models
Recommending the best course of action for an individual is a major application of individual-level causal effect estimation. This application is often needed in safety-critical domains such as healthcare, where estimating and communicatin…
Adversarially Learned Mixture Model
The Adversarially Learned Mixture Model (AMM) is a generative model for unsupervised or semi-supervised data clustering. The AMM is the first adversarially optimized method to model the conditional dependence between inferred continuous an…
On the Importance of Attention in Meta-Learning for Few-Shot Text Classification
Current deep learning based text classification methods are limited by their ability to achieve fast learning and generalization when the data is scarce. We address this problem by integrating a meta-learning procedure that uses the knowle…
Feasibility of monitoring compliance to the My 5 Moments and Entry/Exit hand hygiene methods in US hospitals
We compared the ability to observe hand hygiene opportunities using the World Health Organization My 5 Moments method to the Entry/Exit method. Under covert direct observation, Entry/Exit method opportunities were observed at all times. My…