Pascal Poupart
Chrysalis: A Unified System for Comparing Active Teaching and Passive Learning with AI Agents in Education
AI-assisted learning has seen a remarkable uptick over the last few years, mainly due to the rise in popularity of Large Language Models (LLMs). Their ability to hold long-form, natural language interactions with users makes them excellent…
Time Is Effort: Estimating Human Post-Editing Time for Grammar Error Correction Tool Evaluation
Text editing can involve several iterations of revision. Incorporating an efficient Grammar Error Correction (GEC) tool in the initial correction round can significantly impact further human editing effort and final text quality. This rais…
Reflect-then-Plan: Offline Model-Based Planning through a Doubly Bayesian Lens
Offline reinforcement learning (RL) is crucial when online exploration is costly or unsafe but often struggles with high epistemic uncertainty due to limited data. Existing methods rely on fixed conservative policies, restricting adaptivit…
Simplifying Bayesian Optimization Via In-Context Direct Optimum Sampling
The optimization of expensive black-box functions is ubiquitous in science and engineering. A common solution to this problem is Bayesian optimization (BO), which is generally comprised of two components: (i) a surrogate model and (ii) an …
A Minimalist Method for Fine-tuning Text-to-Image Diffusion Models
Recent work uses reinforcement learning (RL) to fine-tune text-to-image diffusion models, improving text-image alignment and sample quality. However, existing approaches introduce unnecessary complexity: they cache the full sampling trajec…
Measures of Variability for Risk-averse Policy Gradient
Risk-averse reinforcement learning (RARL) is critical for decision-making under uncertainty, which is especially valuable in high-stake applications. However, most existing works focus on risk measures, e.g., conditional value-at-risk (CVa…
Towards Cost-Effective Reward Guided Text Generation
Reward-guided text generation (RGTG) has emerged as a viable alternative to offline reinforcement learning from human feedback (RLHF). RGTG methods can align baseline language models to human preferences without further training like in st…
Learning Soft Driving Constraints from Vectorized Scene Embeddings while Imitating Expert Trajectories
The primary goal of motion planning is to generate safe and efficient trajectories for vehicles. Traditionally, motion planning models are trained using imitation learning to mimic the behavior of human experts. However, these models often…
A Comprehensive Survey on Inverse Constrained Reinforcement Learning: Definitions, Progress and Challenges
Inverse Constrained Reinforcement Learning (ICRL) is the task of inferring the implicit constraints that expert agents adhere to, based on their demonstration data. As an emerging research topic, ICRL has received considerable attention in…
Subject-driven Text-to-Image Generation via Preference-based Reinforcement Learning
Text-to-image generative models have recently attracted considerable interest, enabling the synthesis of high-quality images from textual prompts. However, these models often lack the capability to generate specific subjects from given ref…
FedLog: Personalized Federated Classification with Less Communication and More Flexibility
Federated representation learning (FRL) aims to learn personalized federated models with effective feature extraction from local data. FRL algorithms that share the majority of the model parameters face significant challenges with huge com…
Uncertainty-Guided Likelihood Tree Search
Tree search is a fundamental tool for planning, as many sequential decision-making problems can be framed as searching over tree-structured spaces. We propose an uncertainty-guided tree search algorithm for settings where the reward functi…
Confidence Aware Inverse Constrained Reinforcement Learning
In coming up with solutions to real-world problems, humans implicitly adhere to constraints that are too numerous and complex to be specified completely. However, reinforcement learning (RL) agents need these constraints to learn the corre…
A Critical Look At Tokenwise Reward-Guided Text Generation
Large language models (LLMs) can be improved by aligning with human preferences through fine-tuning -- the so-called reinforcement learning from human feedback (RLHF). However, the cost of fine-tuning an LLM is prohibitive for many users. …
How Useful is Intermittent, Asynchronous Expert Feedback for Bayesian Optimization?
Bayesian optimization (BO) is an integral part of automated scientific discovery -- the so-called self-driving lab -- where human inputs are ideally minimal or at least non-blocking. However, scientists often have strong intuition, and thu…
Calibrated One Round Federated Learning with Bayesian Inference in the Predictive Space
Federated Learning (FL) involves training a model over a dataset distributed among clients, with the constraint that each client’s dataset is localized and possibly heterogeneous. In FL, small and noisy datasets are common, highlighting th…
A Simple Mixture Policy Parameterization for Improving Sample Efficiency of CVaR Optimization
Reinforcement learning algorithms utilizing policy gradients (PG) to optimize Conditional Value at Risk (CVaR) face significant challenges with sample inefficiency, hindering their practical applications. This inefficiency stems from two m…
Why Online Reinforcement Learning is Causal
Reinforcement learning (RL) and causal modelling naturally complement each other. The goal of causal modelling is to predict the effects of interventions in an environment, while the goal of reinforcement learning is to select intervention…
A Sober Look at LLMs for Material Discovery: Are They Actually Good for Bayesian Optimization Over Molecules?
Automation is one of the cornerstones of contemporary material discovery. Bayesian optimization (BO) is an essential part of such workflows, enabling scientists to leverage prior domain knowledge into efficient exploration of a large molec…
Comparing EM with GD in Mixture Models of Two Components
The expectation-maximization (EM) algorithm has been widely used in minimizing the negative log likelihood (also known as cross entropy) of mixture models. However, little is understood about the goodness of the fixed points it converges t…
MatrixNets: A New Scale and Aspect Ratio Aware Architecture for Object Detection
We present MatrixNets (xNets), a new deep architecture for object detection. xNets map objects with similar sizes and aspect ratios into many specialized layers, allowing xNets to provide a scale and aspect ratio aware architecture. We lev…
Preventing Arbitrarily High Confidence on Far-Away Data in Point-Estimated Discriminative Neural Networks
Discriminatively trained, deterministic neural networks are the de facto choice for classification problems. However, even though they achieve state-of-the-art results on in-domain test sets, they tend to be overconfident on out-of-distrib…
An Alternative to Variance: Gini Deviation for Risk-averse Policy Gradient
Restricting the variance of a policy's return is a popular choice in risk-averse Reinforcement Learning (RL) due to its clear mathematical definition and easy interpretability. Traditional methods directly restrict the total return varianc…
Attribute Controlled Dialogue Prompting
Prompt-tuning has become an increasingly popular parameter-efficient method for adapting large pretrained language models to downstream tasks. However, both discrete prompting and continuous prompting assume fixed prompts for all data samp…
Contrastive Deterministic Autoencoders For Language Modeling
Variational autoencoders (VAEs) are a popular family of generative models with wide applicability. Training VAEs, especially for text, often runs into the issue of posterior collapse, resulting in loss of representation quality. Determinis…
Do we need Label Regularization to Fine-tune Pre-trained Language Models?
Ivan Kobyzev, Aref Jafari, Mehdi Rezagholizadeh, Tianda Li, Alan Do-Omri, Peng Lu, Pascal Poupart, Ali Ghodsi. Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics. 2023.
Continuation KD: Improved Knowledge Distillation through the Lens of Continuation Optimization
Knowledge Distillation (KD) has been extensively used for natural language understanding (NLU) tasks to improve a small model's (a student) generalization by transferring the knowledge from a larger model (a teacher). Although KD methods a…
Label Alignment Regularization for Distribution Shift
Recent work has highlighted the label alignment property (LAP) in supervised learning, where the vector of all labels in the dataset is mostly in the span of the top few singular vectors of the data matrix. Drawing inspiration from this ob…