Cédric Archambeau
Hyperparameter Optimization in Machine Learning
Hyperparameters are configuration variables controlling the behavior of machine learning algorithms. They are ubiquitous in machine learning and artificial intelligence and the choice of their values determines the effectiveness of systems…
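To make the notion concrete, here is a minimal, hypothetical illustration of hyperparameter tuning by random search: sample configurations and keep the one with the best cross-validated score. The model and search space below are illustrative, not taken from the article.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=500, random_state=0)

best_score, best_cfg = -np.inf, None
for _ in range(20):
    cfg = {
        "n_estimators": int(rng.integers(10, 200)),  # hyperparameter: ensemble size
        "max_depth": int(rng.integers(2, 16)),       # hyperparameter: tree depth
    }
    score = cross_val_score(RandomForestClassifier(**cfg, random_state=0), X, y, cv=3).mean()
    if score > best_score:
        best_score, best_cfg = score, cfg

print(best_cfg, best_score)
```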
Structural Pruning of Pre-trained Language Models via Neural Architecture Search
Pre-trained language models (PLMs), for example BERT or RoBERTa, mark the state of the art for natural language understanding tasks when fine-tuned on labeled data. However, their large size poses challenges in deploying them for inference i…
Explaining Probabilistic Models with Distributional Values
A large branch of explainable machine learning is grounded in cooperative game theory. However, research indicates that game-theoretic explanations may mislead or be hard to interpret. We argue that often there is a critical mismatch betwe…
A Negative Result on Gradient Matching for Selective Backprop
With increasing scale in model and dataset size, the training of deep neural networks becomes a massive computational burden. One approach to speed up the training process is Selective Backprop. For this approach, we perform a forward pass…
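A minimal sketch of the Selective Backprop idea as described in the abstract: forward pass on the full batch, then backpropagate only through the highest-loss examples. The model, data, and keep fraction below are assumptions for illustration.

```python
import torch
import torch.nn as nn

model = nn.Linear(20, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss(reduction="none")  # keep per-example losses

x, y = torch.randn(64, 20), torch.randint(0, 2, (64,))

losses = loss_fn(model(x), y)        # forward pass on the full batch
k = max(1, int(0.25 * len(losses)))  # keep the hardest 25% (fraction is assumed)
idx = torch.topk(losses, k).indices  # select the highest-loss examples
opt.zero_grad()
losses[idx].mean().backward()        # backpropagate only through the selection
opt.step()
```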
Geographical Erasure in Language Generation
Large language models (LLMs) encode vast amounts of world knowledge. However, since these models are trained on large swaths of internet data, they are at risk of inordinately capturing information about dominant groups. This imbalance can…
Optimizing Hyperparameters with Conformal Quantile Regression
Many state-of-the-art hyperparameter optimization (HPO) algorithms rely on model-based optimizers that learn surrogate models of the target function to guide the search. Gaussian processes are the de facto surrogate model due to their abil…
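For background, a generic conformalized-quantile-regression sketch (not the paper's surrogate model): fit lower and upper quantile regressors, then widen the interval by a conformal correction computed on a held-out calibration split. Data and regressor choice are assumptions.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.random((400, 3))
y = X.sum(axis=1) + 0.1 * rng.standard_normal(400)
X_tr, X_cal, y_tr, y_cal = train_test_split(X, y, test_size=0.5, random_state=0)

alpha = 0.1  # target 90% coverage
lo = GradientBoostingRegressor(loss="quantile", alpha=alpha / 2).fit(X_tr, y_tr)
hi = GradientBoostingRegressor(loss="quantile", alpha=1 - alpha / 2).fit(X_tr, y_tr)

# Conformity scores: how far calibration targets fall outside the raw interval.
scores = np.maximum(lo.predict(X_cal) - y_cal, y_cal - hi.predict(X_cal))
q = np.quantile(scores, (1 - alpha) * (1 + 1 / len(y_cal)))

X_new = rng.random((5, 3))
lower, upper = lo.predict(X_new) - q, hi.predict(X_new) + q  # calibrated interval
```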
Renate: A Library for Real-World Continual Learning
Continual learning enables the incremental training of machine learning models on non-stationary data streams. While academic interest in the topic is high, there is little indication of the use of state-of-the-art continual learning algori…
Fortuna: A Library for Uncertainty Quantification in Deep Learning
We present Fortuna, an open-source library for uncertainty quantification in deep learning. Fortuna supports a range of calibration techniques, such as conformal prediction that can be applied to any trained neural network to generate reli…
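As an illustration of the kind of post-hoc calibration the abstract mentions, here is generic split-conformal prediction for any classifier that outputs probabilities. This is a sketch of the technique, not Fortuna's actual API.

```python
import numpy as np

def conformal_sets(probs_cal, y_cal, probs_test, alpha=0.1):
    # Score: one minus the probability the model assigns to the true class.
    scores = 1.0 - probs_cal[np.arange(len(y_cal)), y_cal]
    n = len(y_cal)
    level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    qhat = np.quantile(scores, level)
    # A prediction set contains every class scoring below the threshold.
    return [np.where(1.0 - p <= qhat)[0] for p in probs_test]
```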
Private Synthetic Data for Multitask Learning and Marginal Queries
We provide a differentially private algorithm for producing synthetic data simultaneously useful for multiple tasks: marginal queries and multitask machine learning (ML). A key innovation in our algorithm is the ability to directly handle …
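For readers unfamiliar with the term, a marginal query is simply a table of counts over a subset of attributes; the tiny, hypothetical example below shows a 2-way marginal.

```python
import pandas as pd

df = pd.DataFrame({"age": [25, 25, 40, 40],
                   "smoker": [1, 0, 1, 1],
                   "city": ["A", "B", "A", "C"]})
# 2-way marginal over (age, smoker): counts for each value combination.
marginal = df.groupby(["age", "smoker"]).size()
print(marginal)
```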
Uncertainty Calibration in Bayesian Neural Networks via Distance-Aware Priors
As we move away from the data, the predictive uncertainty should increase, since a great variety of explanations are consistent with the little available information. We introduce Distance-Aware Prior (DAP) calibration, a method to correct…
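The abstract states the desideratum rather than the method; the toy snippet below only illustrates that desideratum (predictive variance inflated with distance to the nearest training point) and is not the DAP calibration procedure itself.

```python
import numpy as np

def distance_aware_variance(x_test, x_train, base_var=0.1, scale=1.0):
    # Distance from each test point to its nearest training point.
    d = np.min(np.abs(x_test[:, None] - x_train[None, :]), axis=1)
    return base_var + scale * d ** 2  # uncertainty grows away from the data

x_train = np.linspace(-1.0, 1.0, 20)
x_test = np.array([0.0, 2.0, 5.0])
print(distance_aware_variance(x_test, x_train))  # increases with distance
```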
PASHA: Efficient HPO and NAS with Progressive Resource Allocation
Hyperparameter optimization (HPO) and neural architecture search (NAS) are methods of choice to obtain the best-in-class machine learning models, but in practice they can be costly to run. When models are trained on large datasets, tuning …
Continual Learning with Transformers for Image Classification
In many real-world scenarios, data to train machine learning models become available over time. However, neural network models struggle to continually learn new concepts without forgetting what has been learnt in the past. This phenomenon …
Gradient-Matching Coresets for Rehearsal-Based Continual Learning
The goal of continual learning (CL) is to efficiently update a machine learning model with new data without forgetting previously-learned knowledge. Most widely-used CL methods rely on a rehearsal memory of data points to be reused while t…
Diverse Counterfactual Explanations for Anomaly Detection in Time Series
Data-driven methods that detect anomalies in time series data are ubiquitous in practice, but they are in general unable to provide helpful explanations for the predictions they make. In this work we propose a model-agnostic algorithm tha…
Memory Efficient Continual Learning with Transformers
In many real-world scenarios, data to train machine learning models becomes available over time. Unfortunately, these models struggle to continually learn new concepts without forgetting what has been learnt in the past. This phenomenon is…
More Than Words: Towards Better Quality Interpretations of Text Classifiers
The large size and complex decision mechanisms of state-of-the-art text classifiers make it difficult for humans to understand their predictions, leading to a potential lack of trust by the users. These issues have led to the adoption of m…
Gradient-matching coresets for continual learning
We devise a coreset selection method based on the idea of gradient matching: The gradients induced by the coreset should match, as closely as possible, those induced by the original training dataset. We evaluate the method in the context o…
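The abstract states the selection criterion directly, so a greedy sketch is easy to give: repeatedly add the point whose inclusion brings the coreset's mean gradient closest to the full dataset's mean gradient. Per-example gradients are assumed precomputed; the greedy strategy is an illustrative choice, not necessarily the paper's exact algorithm.

```python
import numpy as np

def gradient_matching_coreset(grads, k):
    # grads: (n, d) per-example gradients; target is the full-data mean gradient.
    target = grads.mean(axis=0)
    selected = []
    for _ in range(k):
        best, best_err = None, np.inf
        for i in range(len(grads)):
            if i in selected:
                continue
            cand = grads[selected + [i]].mean(axis=0)  # coreset mean if i is added
            err = np.linalg.norm(cand - target)
            if err < best_err:
                best, best_err = i, err
        selected.append(best)
    return selected
```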
Meta-Forecasting by combining Global Deep Representations with Local Adaptation
While classical time series forecasting considers individual time series in isolation, recent advances based on deep learning showed that jointly learning from a large pool of related time series can boost the forecasting accuracy. However…
Amazon SageMaker Automatic Model Tuning: Scalable Gradient-Free Optimization
Tuning complex machine learning systems is challenging. Machine learning typically requires setting hyperparameters, be they regularization, architecture, or optimization parameters, whose tuning is critical to achieve good predictive perform…
Fair Bayesian Optimization
Given the increasing importance of machine learning (ML) in our lives, several algorithmic fairness techniques have been proposed to mitigate biases in the outcomes of the ML models. However, most of these techniques are specialized to cat…
Multi-objective Asynchronous Successive Halving
Hyperparameter optimization (HPO) is increasingly used to automatically tune the predictive performance (e.g., accuracy) of machine learning models. However, in a plethora of real-world applications, accuracy is only one of the multiple --…
A multi-objective perspective on jointly tuning hardware and hyperparameters
In addition to the best model architecture and hyperparameters, a full AutoML solution requires selecting appropriate hardware automatically. This can be framed as a multi-objective optimization problem: there is not a single best hardware…
Overfitting in Bayesian Optimization: an empirical study and early-stopping solution
Tuning machine learning models with Bayesian optimization (BO) is a successful strategy to find good hyperparameters. BO defines an iterative procedure where a cross-validated metric is evaluated on promising hyperparameters. In practice, …
Automatic Termination for Hyperparameter Optimization
Bayesian optimization (BO) is a widely popular approach for the hyperparameter optimization (HPO) in machine learning. At its core, BO iteratively evaluates promising configurations until a user-defined budget, such as wall-clock time or n…
A resource-efficient method for repeated HPO and NAS problems
In this work we consider the problem of repeated hyperparameter and neural architecture search (HNAS). We propose an extension of Successive Halving that is able to leverage information gained in previous HNAS problems with the goal of sav…
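For context, here is plain Successive Halving, the base algorithm the paper extends (the extension that transfers information across repeated HNAS problems is not shown): train all configurations at a small budget, keep the best fraction, and repeat with a larger budget.

```python
import numpy as np

def successive_halving(configs, evaluate, min_budget=1, eta=2, rounds=4):
    budget = min_budget
    for _ in range(rounds):
        scores = [evaluate(c, budget) for c in configs]  # e.g. validation error
        keep = max(1, len(configs) // eta)               # keep the best 1/eta
        order = np.argsort(scores)                       # lower is better
        configs = [configs[i] for i in order[:keep]]
        budget *= eta                                    # survivors get more budget
    return configs[0]
```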
Hyperparameter Transfer Learning with Adaptive Complexity
Bayesian optimization (BO) is a sample efficient approach to automatically tune the hyperparameters of machine learning models. In practice, one frequently has to solve similar hyperparameter tuning problems sequentially. For example, one …
BORE: Bayesian Optimization by Density-Ratio Estimation
Bayesian optimization (BO) is among the most effective and widely-used blackbox optimization methods. BO proposes solutions according to an explore-exploit trade-off criterion encoded in an acquisition function, many of which are computed …
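A minimal sketch of the density-ratio recipe the title refers to: train a probabilistic classifier to separate the best gamma-fraction of observed points from the rest, and use its predicted probability as the acquisition function. The classifier choice and gamma value here are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def bore_step(X_obs, y_obs, candidates, gamma=0.25):
    tau = np.quantile(y_obs, gamma)      # threshold at the gamma-quantile
    z = (y_obs <= tau).astype(int)       # 1 = among the best observed points (minimization)
    clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_obs, z)
    probs = clf.predict_proba(candidates)[:, 1]   # classifier output as acquisition
    return candidates[np.argmax(probs)]  # most promising candidate to evaluate next
```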
On the Lack of Robust Interpretability of Neural Text Classifiers
With the ever-increasing complexity of neural language models, practitioners have turned to methods for understanding the predictions of these models. One of the most well-adopted approaches for model interpretability is feature-based inte…