Jonas Hübotter
Learning on the Job: Test-Time Curricula for Targeted Reinforcement Learning
Humans are good at learning on the job: We learn how to solve the tasks we face as we go along. Can a model do the same? We propose an agent that assembles a task-specific curriculum, called test-time curriculum (TTC-RL), and applies reinf…
Maximizing Prefix-Confidence at Test-Time Efficiently Improves Mathematical Reasoning
Recent work has shown that language models can self-improve by maximizing their own confidence in their predictions, without relying on external verifiers or reward signals. In this work, we study the test-time scaling of language models f…
Test-time Offline Reinforcement Learning on Goal-related Experience
Foundation models compress a large amount of information in a single, large neural network, which can then be queried for individual tasks. There are strong parallels between this widespread framework and offline goal-conditioned reinforce…
DISCOVER: Automated Curricula for Sparse-Reward Reinforcement Learning
Sparse-reward reinforcement learning (RL) can model a wide range of highly complex tasks. Solving sparse-reward tasks is RL's core premise, requiring efficient exploration coupled with long-horizon credit assignment, and overcoming these c…
Probabilistic Artificial Intelligence
Artificial intelligence commonly refers to the science and engineering of artificial systems that can carry out tasks generally associated with requiring aspects of human intelligence, such as playing games, translating languages, and driv…
LITE: Efficiently Estimating Gaussian Probability of Maximality
We consider the problem of computing the probability of maximality (PoM) of a Gaussian random vector, i.e., the probability for each dimension to be maximal. This is a key challenge in applications ranging from Bayesian optimization to rei…
Local Mixtures of Experts: Essentially Free Test-Time Training via Model Merging
Mixture of expert (MoE) models are a promising approach to increasing model capacity without increasing inference cost, and are core components of many state-of-the-art language models. However, current MoE models typically use only few ex…
Active Fine-Tuning of Multi-Task Policies
Pre-trained generalist policies are rapidly gaining relevance in robot learning due to their promise of fast adaptation to novel, in-domain tasks. This adaptation often relies on collecting new demonstrations for a specific task of interes…
Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs
Recent efforts in fine-tuning language models often rely on automatic data selection, commonly using Nearest Neighbors retrieval from large datasets. However, we theoretically show that this approach tends to select redundant data, limitin…
Transductive Active Learning: Theory and Applications
We study a generalization of classical active learning to real-world settings with concrete prediction targets where sampling is restricted to an accessible region of the domain, while prediction targets may lie outside this region. We ana…
Active Few-Shot Fine-Tuning
We study the question: How can we select the right data for fine-tuning to a specific task? We call this data selection problem active fine-tuning and show that it is an instance of transductive active learning, a novel generalization of c…
Efficient Exploration in Continuous-time Model-based Reinforcement Learning
Reinforcement learning algorithms typically consider discrete-time dynamics, even though the underlying systems are often continuous in time. In this paper, we introduce a model-based reinforcement learning algorithm that represents contin…
Tuning Legged Locomotion Controllers via Safe Bayesian Optimization
This paper presents a data-driven strategy to streamline the deployment of model-based controllers in legged robotic hardware platforms. Our approach leverages a model-free safe learning algorithm to automate the tuning of control gains, a…
Implementation of Algorithms for Right-Sizing Data Centers
The energy consumption of data centers assumes a significant fraction of the world's overall energy consumption. Most data centers are statically provisioned, leading to a very low average utilization of servers. In this work, we survey un…