Explanipedia

Chrysalis: A Unified System for Comparing Active Teaching and Passive Learning with AI Agents in Education

Priti Arun, Vinita Vader, Erya Xu, Brent McCready-Branch, Sarah Seabrook, et al. · 2025

AI-assisted learning has seen a remarkable uptick over the last few years, mainly due to the rise in popularity of Large Language Models (LLMs). Their ability to hold long-form, natural language interactions with users makes them excellent…

Time Is Effort: Estimating Human Post-Editing Time for Grammar Error Correction Tool Evaluation

Ankit Vadehra, Bill Johnson, Pascal Poupart · 2025

Text editing can involve several iterations of revision. Incorporating an efficient Grammar Error Correction (GEC) tool in the initial correction round can significantly impact further human editing effort and final text quality. This rais…

Reflect-then-Plan: Offline Model-Based Planning through a Doubly Bayesian Lens

Jihwan Jeong, Jingmin Wang, Scott Sanner, Pascal Poupart · 2025

Offline reinforcement learning (RL) is crucial when online exploration is costly or unsafe but often struggles with high epistemic uncertainty due to limited data. Existing methods rely on fixed conservative policies, restricting adaptivit…

Simplifying Bayesian Optimization Via In-Context Direct Optimum Sampling

Mohammed Abdulrahman, Hao Wang, Sriram Ganapathi Subramanian, Marc St-Aubin, Sharon O'Sullivan, et al. · 2025

The optimization of expensive black-box functions is ubiquitous in science and engineering. A common solution to this problem is Bayesian optimization (BO), which is generally comprised of two components: (i) a surrogate model and (ii) an …

A Minimalist Method for Fine-tuning Text-to-Image Diffusion Models

William Loh, Pascal Poupart, Suraj Kothawade · 2025

Recent work uses reinforcement learning (RL) to fine-tune text-to-image diffusion models, improving text-image alignment and sample quality. However, existing approaches introduce unnecessary complexity: they cache the full sampling trajec…

Measures of Variability for Risk-averse Policy Gradient

Yudong Luo, Yangchen Pan, Jackson Tan, Pascal Poupart · 2025

Risk-averse reinforcement learning (RARL) is critical for decision-making under uncertainty, which is especially valuable in high-stake applications. However, most existing works focus on risk measures, e.g., conditional value-at-risk (CVa…

Towards Cost-Effective Reward Guided Text Generation

Ahmad Rashid, Ruotian Wu, Raina Fan, Hongliang Li, Agustinus Kristiadi, et al. · 2025

Psychology Computer science

Reward-guided text generation (RGTG) has emerged as a viable alternative to offline reinforcement learning from human feedback (RLHF). RGTG methods can align baseline language models to human preferences without further training like in st…

Learning Soft Driving Constraints from Vectorized Scene Embeddings while Imitating Expert Trajectories

Niloufar Saeidi Mobarakeh, Behzad Khamidehi, Chunlin Li, Hamidreza Mirkhani, Fazel Arasteh, et al. · 2024

Computer science

The primary goal of motion planning is to generate safe and efficient trajectories for vehicles. Traditionally, motion planning models are trained using imitation learning to mimic the behavior of human experts. However, these models often…

A Comprehensive Survey on Inverse Constrained Reinforcement Learning: Definitions, Progress and Challenges

Guiliang Liu, Sheng Xu, Shicheng Liu, Gaurav Ashish, Sriram Ganapathi Subramanian, et al. · 2024

Computer science Psychology Engineering

Inverse Constrained Reinforcement Learning (ICRL) is the task of inferring the implicit constraints that expert agents adhere to, based on their demonstration data. As an emerging research topic, ICRL has received considerable attention in…

Subject-driven Text-to-Image Generation via Preference-based Reinforcement Learning

Yanting Miao, William Loh, Suraj Kothawade, Pascal Poupart, Abdullah Rashwan, et al. · 2024

Computer science Psychology Mathematics

Text-to-image generative models have recently attracted considerable interest, enabling the synthesis of high-quality images from textual prompts. However, these models often lack the capability to generate specific subjects from given ref…

FedLog: Personalized Federated Classification with Less Communication and More Flexibility

Haolin Yu, Guojun Zhang, Pascal Poupart · 2024

Computer science Mathematics

Federated representation learning (FRL) aims to learn personalized federated models with effective feature extraction from local data. FRL algorithms that share the majority of the model parameters face significant challenges with huge com…

Uncertainty-Guided Likelihood Tree Search

Julia Grosse, Ruotian Wu, Ahmad Rashid, Philipp Hennig, Pascal Poupart, et al. · 2024

Computer science

Tree search is a fundamental tool for planning, as many sequential decision-making problems can be framed as searching over tree-structured spaces. We propose an uncertainty-guided tree search algorithm for settings where the reward functi…

Confidence Aware Inverse Constrained Reinforcement Learning

Sriram Ganapathi Subramanian, Guiliang Liu, Mohammed Elmahgiubi, Kasra Rezaee, Pascal Poupart · 2024

Computer science Psychology Mathematics

In coming up with solutions to real-world problems, humans implicitly adhere to constraints that are too numerous and complex to be specified completely. However, reinforcement learning (RL) agents need these constraints to learn the corre…

A Critical Look At Tokenwise Reward-Guided Text Generation

Ahmad Rashid, Ruotian Wu, J Grosse, Agustinus Kristiadi, Pascal Poupart · 2024

Psychology Economics

Large language models (LLMs) can be improved by aligning with human preferences through fine-tuning -- the so-called reinforcement learning from human feedback (RLHF). However, the cost of fine-tuning an LLM is prohibitive for many users. …

How Useful is Intermittent, Asynchronous Expert Feedback for Bayesian Optimization?

Agustinus Kristiadi, Felix Strieth‐Kalthoff, Sriram Ganapathi Subramanian, Vincent Fortuin, Pascal Poupart, et al. · 2024

Computer science

Bayesian optimization (BO) is an integral part of automated scientific discovery -- the so-called self-driving lab -- where human inputs are ideally minimal or at least non-blocking. However, scientists often have strong intuition, and thu…

Calibrated One Round Federated Learning with Bayesian Inference in the Predictive Space

Mohsin Hasan, Guojun Zhang, Kaiyang Guo, Xi Chen, Pascal Poupart · 2024

Computer science

Federated Learning (FL) involves training a model over a dataset distributed among clients, with the constraint that each client’s dataset is localized and possibly heterogeneous. In FL, small and noisy datasets are common, highlighting th…

A Simple Mixture Policy Parameterization for Improving Sample Efficiency of CVaR Optimization

Yudong Luo, Yangchen Pan, Han Wang, Philip Torr, Pascal Poupart · 2024

Computer science Mathematics Economics

Reinforcement learning algorithms utilizing policy gradients (PG) to optimize Conditional Value at Risk (CVaR) face significant challenges with sample inefficiency, hindering their practical applications. This inefficiency stems from two m…

Why Online Reinforcement Learning is Causal

Oliver Schulte, Pascal Poupart · 2024

Computer science Psychology

Reinforcement learning (RL) and causal modelling naturally complement each other. The goal of causal modelling is to predict the effects of interventions in an environment, while the goal of reinforcement learning is to select intervention…

A Sober Look at LLMs for Material Discovery: Are They Actually Good for Bayesian Optimization Over Molecules?

Agustinus Kristiadi, Felix Strieth‐Kalthoff, Marta Skreta, Pascal Poupart, Alán Aspuru‐Guzik, et al. · 2024

Computer science

Automation is one of the cornerstones of contemporary material discovery. Bayesian optimization (BO) is an essential part of such workflows, enabling scientists to leverage prior domain knowledge into efficient exploration of a large molec…

Comparing EM with GD in Mixture Models of Two Components

Guojun Zhang, Pascal Poupart, George Trimponias · 2024

Mathematics Computer science Physics

The expectation-maximization (EM) algorithm has been widely used in minimizing the negative log likelihood (also known as cross entropy) of mixture models. However, little is understood about the goodness of the fixed points it converges t…

MatrixNets: A New Scale and Aspect Ratio Aware Architecture for Object Detection

Abdullah Rashwan, Rishav Agarwal, Agastya Kalra, Pascal Poupart · 2024

Computer science Geography Physics

We present MatrixNets (xNets), a new deep architecture for object detection. xNets map objects with similar sizes and aspect ratios into many specialized layers, allowing xNets to provide a scale and aspect ratio aware architecture. We lev…

Calibrated One Round Federated Learning with Bayesian Inference in the Predictive Space

Mohsin Hasan, Guojun Zhang, Kaiyang Guo, Xi Chen, Pascal Poupart · 2023

Computer science Mathematics

Federated Learning (FL) involves training a model over a dataset distributed among clients, with the constraint that each client's dataset is localized and possibly heterogeneous. In FL, small and noisy datasets are common, highlighting th…

Preventing Arbitrarily High Confidence on Far-Away Data in Point-Estimated Discriminative Neural Networks

Ahmad Rashid, Serena Hacker, Guojun Zhang, Agustinus Kristiadi, Pascal Poupart · 2023

Computer science Mathematics Political science

Discriminatively trained, deterministic neural networks are the de facto choice for classification problems. However, even though they achieve state-of-the-art results on in-domain test sets, they tend to be overconfident on out-of-distrib…

An Alternative to Variance: Gini Deviation for Risk-averse Policy Gradient

Yudong Luo, Guiliang Liu, Pascal Poupart, Yangchen Pan · 2023

Computer science Economics Mathematics

Restricting the variance of a policy's return is a popular choice in risk-averse Reinforcement Learning (RL) due to its clear mathematical definition and easy interpretability. Traditional methods directly restrict the total return varianc…

Attribute Controlled Dialogue Prompting

Runcheng Liu, Ahmad Rashid, Ivan Kobyzev, Mehdi Rezagholizadeh, Pascal Poupart · 2023

Computer science Psychology Mathematics

Prompt-tuning has become an increasingly popular parameter-efficient method for adapting large pretrained language models to downstream tasks. However, both discrete prompting and continuous prompting assume fixed prompts for all data samp…

Contrastive Deterministic Autoencoders For Language Modeling

Amur Ghose, Pascal Poupart · 2023

Computer science Physics Political science

Variational autoencoders (VAEs) are a popular family of generative models with wide applicability. Training VAEs, especially for text, often runs into the issue of posterior collapse, resulting in loss of representation quality. Determinis…

Do we need Label Regularization to Fine-tune Pre-trained Language Models?

Ivan Kobyzev, Aref Jafari, Mehdi Rezagholizadeh, Tianda Li, Alan Do-Omri, et al. · 2023

Computer science

Ivan Kobyzev, Aref Jafari, Mehdi Rezagholizadeh, Tianda Li, Alan Do-Omri, Peng Lu, Pascal Poupart, Ali Ghodsi. Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics. 2023.

Continuation KD: Improved Knowledge Distillation through the Lens of Continuation Optimization

Aref Jafari, Ivan Kobyzev, Mehdi Rezagholizadeh, Pascal Poupart, Ali Ghodsi · 2022

Computer science Mathematics Engineering

Knowledge Distillation (KD) has been extensively used for natural language understanding (NLU) tasks to improve a small model's (a student) generalization by transferring the knowledge from a larger model (a teacher). Although KD methods a…

Label Alignment Regularization for Distribution Shift

Ehsan Imani, Guojun Zhang, Jun Luo, Pascal Poupart, Yangchen Pan · 2022

Computer science Mathematics

Recent work has highlighted the label alignment property (LAP) in supervised learning, where the vector of all labels in the dataset is mostly in the span of the top few singular vectors of the data matrix. Drawing inspiration from this ob…

Pascal Poupart