Explanipedia

Learning a Pessimistic Reward Model in RLHF

Yinglun Xu, Hangoo Kang, Tarun Suresh, Yanan Wan, Gagandeep Singh · 2025

This work proposes 'PET', a novel pessimistic reward fine-tuning method, to learn a pessimistic reward model robust against reward hacking in offline reinforcement learning from human feedback (RLHF). Traditional reward modeling techniques…

Robust Thompson Sampling Algorithms Against Reward Poisoning Attacks

Yinglun Xu, Zhiwei Wang, Gagandeep Singh · 2024

Computer science

Thompson sampling is one of the most popular learning algorithms for online sequential decision-making problems and has rich real-world applications. However, current Thompson sampling algorithms are limited by the assumption that the rewa…

Binary Reward Labeling: Bridging Offline Preference and Reward-Based Reinforcement Learning

Yinglun Xu, David C. Zhu, Rohan Gumaste, Gagandeep Singh · 2024

Psychology Computer science Economics

Offline reinforcement learning has become one of the most practical RL settings. However, most existing works on offline RL focus on the standard setting with scalar reward feedback. It remains unknown how to universally transfer the exist…

Universal Black-Box Reward Poisoning Attack against Offline Reinforcement Learning

Yinglun Xu, Rohan Gumaste, Gagandeep Singh · 2024

Psychology Computer science

We study the problem of universal black-boxed reward poisoning attacks against general offline reinforcement learning with deep neural networks. We consider a black-box threat model where the attacker is entirely oblivious to the learning …

Two-Step Offline Preference-Based Reinforcement Learning with Constrained Actions

Yinglun Xu, Gagandeep Singh · 2023

Computer science Mathematics

Preference-based reinforcement learning (PBRL) in the offline setting has succeeded greatly in industrial applications such as chatbots. A two-step learning framework where one applies a reinforcement learning step after a reward modeling …

On the Robustness of Epoch-Greedy in Multi-Agent Contextual Bandit Mechanisms

Yinglun Xu, Bhuvesh Kumar, Jacob Abernethy · 2023

Computer science Economics Mathematics

Efficient learning in multi-armed bandit mechanisms such as pay-per-click (PPC) auctions typically involves three challenges: 1) inducing truthful bidding behavior (incentives), 2) using personalization in the users (context), and 3) circu…

Black-Box Targeted Reward Poisoning Attack Against Online Deep Reinforcement Learning

Yinglun Xu, Gagandeep Singh · 2023

Computer science

We propose the first black-box targeted attack against online deep reinforcement learning through reward poisoning during training time. Our attack is applicable to general environments with unknown dynamics learned by unknown algorithms a…

Efficient Reward Poisoning Attacks on Online Deep Reinforcement Learning

Yinglun Xu, Qi Zeng, Gagandeep Singh · 2022

Computer science Engineering Chemistry

We study reward poisoning attacks on online deep reinforcement learning (DRL), where the attacker is oblivious to the learning algorithm used by the agent and the dynamics of the environment. We demonstrate the intrinsic vulnerability of s…

Single-molecule optofluidic microsensor with interface whispering gallery modes

Xiao‐Chong Yu, Shui‐Jing Tang, Wenjing Liu, Yinglun Xu, Qihuang Gong, et al. · 2022

Materials science Chemistry Engineering

Significance: Optical microresonators have emerged as promising platforms for label-free detection of molecules. However, approaching optimum sensitivity is hindered due to the weak tail of evanescent fields. Here, we report the implementat…
