Chris J. Maddison
LM Agents May Fail to Act on Their Own Risk Knowledge
Language model (LM) agents have demonstrated significant potential for automating real-world tasks, yet they pose a diverse array of potential, severe risks in safety-critical scenarios. In this work, we identify a significant gap between …
Reasoning to Learn from Latent Thoughts
Compute scaling for language model (LM) pretraining has outpaced the growth of human-written texts, leading to concerns that data will become the bottleneck to LM scaling. To continue scaling pretraining in this data-constrained regime, we…
k-Nearest Neighbour Adaptive Sampling (kNN-AS), a Simple Tool to Efficiently Explore Conformational Space
Molecular dynamics (MD) simulations are computationally expensive, a limiting factor when simulating biomolecular systems. Adaptive sampling approaches can accelerate the exploration of conformational space by running repeated short MD sim…
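Although the abstract is truncated, the method named in the title admits a compact sketch. Below is a minimal, hypothetical implementation of the restart-selection step, assuming frames are featurized into collective variables: frames with the largest k-th nearest-neighbor distance are treated as the least-sampled and seed the next round of short simulations. `featurize` and `run_short_md` are stand-ins, not the paper's code.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def select_restart_frames(features, k=10, n_restarts=8):
    """Pick the frames lying in the least-sampled regions of feature space.

    features : (n_frames, n_dims) array of collective variables.
    Returns indices of the n_restarts frames with the largest k-th
    nearest-neighbor distance (a proxy for low sampling density).
    """
    nn = NearestNeighbors(n_neighbors=k + 1).fit(features)
    dists, _ = nn.kneighbors(features)   # dists[:, 0] is the self-distance (0)
    kth_dist = dists[:, -1]              # distance to the k-th true neighbor
    return np.argsort(kth_dist)[-n_restarts:]

# Hypothetical adaptive-sampling loop; featurize / run_short_md are stand-ins:
# for _ in range(n_rounds):
#     seeds = select_restart_frames(featurize(frames))
#     frames = np.concatenate([frames, run_short_md(frames[seeds])])
```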
MixMin: Finding Data Mixtures via Convex Minimization
Modern machine learning pipelines are increasingly combining and mixing data from diverse and disparate sources, e.g., pre-training large language models. Yet, finding the optimal data mixture is a challenging and open problem. We formaliz…
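The abstract cuts off before the formalization, but the title suggests a problem of the following shape: choose mixture weights on the probability simplex that minimize a convex objective of the mixed data. A hedged sketch, assuming a fixed per-source proxy loss plus a small entropic regularizer (both our choices, not necessarily the paper's objective):

```python
import numpy as np
from scipy.optimize import minimize

def best_mixture(per_source_losses):
    """Find simplex weights minimizing a convex mixed-loss objective.

    per_source_losses : (n_sources,) mean proxy-model loss on each source.
    With a purely linear objective the optimum sits at a vertex; the
    negative-entropy term is one illustrative way to keep the mixture
    spread out. This is a stand-in, not the paper's exact formulation.
    """
    n = len(per_source_losses)

    def objective(w):
        return w @ per_source_losses + 1e-2 * np.sum(w * np.log(w + 1e-12))

    res = minimize(
        objective,
        x0=np.full(n, 1.0 / n),
        bounds=[(0.0, 1.0)] * n,
        constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}],
        method="SLSQP",
    )
    return res.x

print(best_mixture(np.array([0.9, 0.4, 0.7])))
```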
On the Efficiency of ERM in Feature Learning
Given a collection of feature maps indexed by a set $\mathcal{T}$, we study the performance of empirical risk minimization (ERM) on regression problems with square loss over the union of the linear classes induced by these feature maps. Th…
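In our notation (the paper's may differ), the procedure under study is

$$\hat{f} \;=\; \operatorname*{arg\,min}_{t \in \mathcal{T},\; w} \; \frac{1}{n} \sum_{i=1}^{n} \bigl( \langle w, \phi_t(x_i) \rangle - y_i \bigr)^2,$$

i.e., empirical risk minimization jointly over the feature-map index $t$ and the linear predictor $w$ on top of $\phi_t$.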
Boosting the Predictive Power of Protein Representations with a Corpus of Text Annotations
Protein language models are trained to predict amino acid sequences from vast protein databases, while learning to represent proteins as feature vectors. These vector representations have enabled impressive applications, from predicting mu…
End-To-End Causal Effect Estimation from Unstructured Natural Language Data
Knowing the effect of an intervention is critical for human decision-making, but current approaches for causal effect estimation rely on manual data collection and structuring, regardless of the causal assumptions. This increases both the …
APPL: A Prompt Programming Language for Harmonious Integration of Programs and Large Language Model Prompts
Large Language Models (LLMs) have become increasingly capable of handling diverse tasks with the aid of well-crafted prompts and integration of external tools, but as task complexity rises, the workflow involving LLMs can be complicated an…
Minimax Linear Regression under the Quantile Risk
We study the problem of designing minimax procedures in linear regression under the quantile risk. We start by considering the realizable setting with independent Gaussian noise, where for any given noise level and distribution of inputs, …
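For readers unfamiliar with the risk in the title: rather than the expected loss, a procedure $\hat{\beta}$ is judged by a quantile of its random loss. One standard formalization, which we assume is close to the paper's:

$$\mathcal{R}_q(\hat{\beta}) \;=\; \inf \bigl\{ r : \Pr\bigl( \ell(\hat{\beta}; D) \le r \bigr) \ge q \bigr\}, \qquad q \in (0, 1),$$

where $\ell(\hat{\beta}; D)$ is the loss incurred on a random dataset $D$; the minimax problem then seeks $\inf_{\hat{\beta}} \sup \mathcal{R}_q(\hat{\beta})$ over the problem class.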
Test-Time Fairness and Robustness in Large Language Models
Frontier Large Language Models (LLMs) can be socially discriminatory or sensitive to spurious features of their inputs. Because only well-resourced corporations can train frontier LLMs, we need robust test-time strategies to control such b…
MixMax: Distributional Robustness in Function Space via Optimal Data Mixtures
Machine learning models are often required to perform well across several pre-defined settings, such as a set of user groups. Worst-case performance is a common metric to capture this requirement, and is the objective of group distribution…
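The connection between the worst-case objective and data mixtures is the standard identity behind the title: because the group-DRO objective is linear in the mixture weights, the worst group equals the worst mixture,

$$\min_{f} \max_{g \in \mathcal{G}} \, \mathbb{E}_{P_g}\bigl[\ell(f)\bigr] \;=\; \min_{f} \max_{\lambda \in \Delta} \, \mathbb{E}_{P_\lambda}\bigl[\ell(f)\bigr], \qquad P_\lambda = \sum_{g \in \mathcal{G}} \lambda_g P_g,$$

so searching for an optimal data mixture is another route to worst-case robustness.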
Observational Scaling Laws and the Predictability of Language Model Performance
Understanding how language model performance varies with scale is critical to benchmark and algorithm development. Scaling laws are one approach to building this understanding, but the requirement of training models across many different s…
Experts Don't Cheat: Learning What You Don't Know By Predicting Pairs
Identifying how much a model $\widehat{p}_\theta(Y|X)$ knows about the stochastic real-world process $p(Y|X)$ it was trained on is important to ensure it avoids producing incorrect or "hallucinated" answers or taking unsafe actions. But this …
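The title's pair-prediction idea rests on a simple identity, stated here in our own notation (the paper's estimator may differ): if two answers are drawn independently from the true process, a model that genuinely knows $p(Y|X)$ gains nothing by peeking at the first answer when predicting the second,

$$p(y_2 \mid x, y_1) \;=\; p(y_2 \mid x) \qquad \text{for } Y_1, Y_2 \overset{\text{iid}}{\sim} p(\cdot \mid x),$$

so the gap between a pair model's conditional $\widehat{p}_\theta(y_2 \mid x, y_1)$ and its marginal $\widehat{p}_\theta(y_2 \mid x)$, its tendency to "cheat", signals what it does not know.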
Identifying the Risks of LM Agents with an LM-Emulated Sandbox
Recent advances in Language Model (LM) agents and tool use, exemplified by applications like ChatGPT Plugins, enable a rich set of capabilities but also amplify potential risks, such as leaking private data or causing financial losses. Id…
Probabilistic Invariant Learning with Randomized Linear Classifiers
Designing models that are both expressive and preserve known invariances of tasks is an increasingly hard problem. Existing solutions tradeoff invariance for computational or memory resources. In this work, we show how to leverage randomne…
The Shaped Transformer: Attention Models in the Infinite Depth-and-Width Limit
In deep learning theory, the covariance matrix of the representations serves as a proxy to examine the network's trainability. Motivated by the success of Transformers, we study the covariance matrix of a modified Softmax-based attention m…
Benchmarking Neural Network Training Algorithms
Training algorithms, broadly construed, are an essential part of every deep learning pipeline. Training algorithm improvements that speed up training across a wide variety of workloads (e.g., better update rules, tuning protocols, learning…
Contrastive Learning Can Find An Optimal Basis For Approximately View-Invariant Functions
Contrastive learning is a powerful framework for learning self-supervised representations that generalize well to downstream supervised tasks. We show that multiple existing contrastive learning methods can be reinterpreted as learning ker…
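One object that several contrastive objectives can be read as approximating, on our understanding of this line of work, is the positive-pair kernel

$$K_{+}(x, x') \;=\; \frac{p_{+}(x, x')}{p(x)\, p(x')},$$

where $p_{+}$ is the distribution of augmented positive pairs and $p$ its marginal; the top eigenfunctions of such a kernel form a natural basis for functions that vary little across views.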
Learning To Cut By Looking Ahead: Cutting Plane Selection via Imitation Learning
Cutting planes are essential for solving mixed-integer linear problems (MILPs), because they facilitate bound improvements on the optimal solution value. For selecting cuts, modern solvers rely on manually designed heuristics that are tune…
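The lookahead expert suggested by the title can be sketched in a few lines. Here `solve_lp_relaxation` and `featurize_cut` are hypothetical stand-ins for a real MILP solver interface, not the paper's code:

```python
def lookahead_label(milp, candidate_cuts, solve_lp_relaxation):
    """Pick the cut whose addition most improves the LP relaxation bound.

    solve_lp_relaxation(milp, extra_cuts) -> LP optimal value (hypothetical
    solver hook). For a minimization MILP the LP value is a lower bound on
    the optimum, so a larger value after adding a cut is a tighter bound.
    """
    base = solve_lp_relaxation(milp, extra_cuts=[])
    gains = [solve_lp_relaxation(milp, extra_cuts=[cut]) - base
             for cut in candidate_cuts]
    return max(range(len(candidate_cuts)), key=gains.__getitem__)

# Imitation learning then fits a scorer on (cut features, expert index)
# pairs so that ranking cuts no longer needs the expensive lookahead:
# X.append([featurize_cut(milp, cut) for cut in cuts])   # hypothetical
# y.append(lookahead_label(milp, cuts, solve_lp_relaxation))
```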
The Machine Learning for Combinatorial Optimization Competition (ML4CO): Results and Insights
Combinatorial optimization is a well-established area in operations research and computer science. Until recently, its methods have focused on solving problem instances in isolation, ignoring that they often stem from related data distribu…
Augment with Care: Contrastive Learning for Combinatorial Problems
Supervised learning can improve the design of state-of-the-art solvers for combinatorial problems, but labelling large numbers of combinatorial instances is often impractical due to exponential worst-case complexity. Inspired by the recent…
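For SAT-style instances, "care" means augmentations that provably preserve the label. Two such maps are sketched below on a DIMACS-style clause list: renaming variables and flipping a variable's polarity everywhere, both of which preserve satisfiability. This is our illustration of the idea, not the paper's exact augmentation set.

```python
import random

def augment_cnf(clauses, n_vars, rng=None):
    """Satisfiability-preserving augmentation of a CNF formula.

    clauses : list of clauses, each a list of nonzero ints in DIMACS
              style (literal v > 0 means x_v, v < 0 means NOT x_v).
    Renaming variables and globally flipping a variable's polarity are
    bijections on assignments, so the SAT/UNSAT label is unchanged.
    """
    rng = rng or random.Random(0)
    perm = list(range(1, n_vars + 1))
    rng.shuffle(perm)                                   # variable renaming
    flip = [rng.random() < 0.5 for _ in range(n_vars)]  # polarity flips

    def map_literal(lit):
        v = abs(lit)
        negated = (lit < 0) ^ flip[v - 1]
        return -perm[v - 1] if negated else perm[v - 1]

    return [[map_literal(lit) for lit in clause] for clause in clauses]

print(augment_cnf([[1, -2], [2, 3], [-1, -3]], n_vars=3))
```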
Bayesian Nonparametrics for Offline Skill Discovery
Skills or low-level policies in reinforcement learning are temporally extended actions that can speed up learning and enable complex behaviours. Recent work in offline reinforcement learning and imitation learning has proposed several tech…
Optimal Representations for Covariate Shift
Machine learning systems often experience a distribution shift between training and testing. In this paper, we introduce a simple variational objective whose optima are exactly the set of all representations on which risk minimizers are gu…
Learning Generalized Gumbel-max Causal Mechanisms
To perform counterfactual reasoning in Structural Causal Models (SCMs), one needs to know the causal mechanisms, which provide factorizations of conditional distributions into noise sources and deterministic functions mapping realizations …
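The classical Gumbel-max mechanism that this work generalizes is easy to state in code: sample a categorical outcome by perturbing log-probabilities with Gumbel noise, then answer counterfactual queries by reusing the same noise under different logits.

```python
import numpy as np

rng = np.random.default_rng(0)

def gumbel_max_sample(log_probs, g):
    """Categorical sample via the Gumbel-max trick: argmax(log p + g)."""
    return int(np.argmax(log_probs + g))

# Factual world: outcome under treatment A.
g = rng.gumbel(size=3)                        # shared exogenous noise
log_p_A = np.log(np.array([0.7, 0.2, 0.1]))
y_factual = gumbel_max_sample(log_p_A, g)

# Counterfactual query: same unit (same g), treatment B instead.
log_p_B = np.log(np.array([0.3, 0.5, 0.2]))
y_counterfactual = gumbel_max_sample(log_p_B, g)

print(y_factual, y_counterfactual)
```

Gumbel-max is only one of many mechanisms consistent with the same interventional distributions, which is what motivates learning generalized alternatives.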
Unbiased Gradient Estimation with Balanced Assignments for Mixtures of Experts
Training large-scale mixture of experts models efficiently on modern hardware requires assigning datapoints in a batch to different experts, each with a limited capacity. Recently proposed assignment procedures lack a probabilistic interpr…
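One probabilistic balanced-assignment routine consistent with the abstract's setup (our sketch, not necessarily the paper's estimator): perturb the gate log-probabilities with Gumbel noise, then solve a matching in which each expert is replicated up to its capacity. The unbiased-gradient analysis is the paper's contribution and is beyond this snippet.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def balanced_assign(gate_logits, capacity, rng):
    """Randomly assign a batch to experts, respecting per-expert capacity.

    gate_logits : (batch, n_experts) unnormalized gate scores.
    Adds Gumbel noise (making the assignment stochastic), replicates each
    expert `capacity` times, and solves a min-cost matching so every
    datapoint gets an expert and no expert exceeds its capacity.
    """
    batch, n_experts = gate_logits.shape
    assert batch <= n_experts * capacity
    noisy = gate_logits + rng.gumbel(size=gate_logits.shape)
    cost = -np.repeat(noisy, capacity, axis=1)   # columns = expert slots
    rows, cols = linear_sum_assignment(cost)
    return cols[np.argsort(rows)] // capacity    # expert index per datapoint

rng = np.random.default_rng(0)
print(balanced_assign(rng.normal(size=(8, 4)), capacity=2, rng=rng))
```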
Lossy Compression for Lossless Prediction
Most data is automatically collected and only ever "seen" by algorithms. Yet, data compressors preserve perceptual fidelity rather than just the information needed by algorithms performing downstream tasks. In this paper, we characterize t…