Yali Du
A Comparative User Evaluation of XRL Explanations using Goal Identification Open
Debugging is a core application of explainable reinforcement learning (XRL) algorithms; however, limited comparative evaluations have been conducted to understand their relative performance. We propose a novel evaluation methodology to tes…
Self-Verifying Reflection Helps Transformers with CoT Reasoning Open
Advanced large language models (LLMs) frequently reflect in reasoning chain-of-thoughts (CoTs), where they self-verify the correctness of current solutions and explore alternatives. However, given recent findings that LLMs detect limited e…
Towards Communication Efficient Multi-Agent Cooperations: Reinforcement Learning and LLM Open
RAT: Adversarial Attacks on Deep Reinforcement Agents for Targeted Behaviors Open
Evaluating deep reinforcement learning (DRL) agents against targeted behavior attacks is critical for assessing their robustness. These attacks aim to manipulate the victim into specific behaviors that align with the attacker’s objectives,…
VLP: Vision-Language Preference Learning for Embodied Manipulation Open
Reward engineering is one of the key challenges in Reinforcement Learning (RL). Preference-based RL effectively addresses this issue by learning from human feedback. However, it is both time-consuming and expensive to collect human prefere…
Quantifying the Self-Interest Level of Markov Social Dilemmas Open
This paper introduces a novel method for estimating the self-interest level of Markov social dilemmas. We extend the concept of self-interest level from normal-form games to Markov games, providing a quantitative measure of the minimum rew…
Will Systems of LLM Agents Cooperate: An Investigation into a Social Dilemma Open
As autonomous agents become more prevalent, understanding their collective behaviour in strategic interactions is crucial. This study investigates the emergent cooperative tendencies of systems of Large Language Model (LLM) agents in a soc…
Computational Hermeneutics: Evaluating Generative AI as a Cultural Technology Open
NOVER: Incentive Training for Language Models via Verifier-Free Reinforcement Learning Open
Resolving social dilemmas with minimal reward transfer Open
Social dilemmas present a significant challenge in multi-agent cooperation because individuals are incentivised to behave in ways that undermine socially optimal outcomes. Consequently, self-interested agents often avoid collective behavio…
A Review of Safe Reinforcement Learning: Methods, Theories, and Applications Open
Reinforcement Learning (RL) has achieved tremendous success in many complex decision-making tasks. However, safety concerns are raised during deploying RL in real-world applications, leading to a growing demand for safe RL algorithms, such…
Intermediate dimensions of Moran sets and their visualization Open
Intermediate dimensions are a class of new fractal dimensions which provide a spectrum of dimensions interpolating between the Hausdorff and box-counting dimensions. In this paper, we study the intermediate dimensions of Moran sets. Moran …
Efficient and scalable reinforcement learning for large-scale network control Open
The primary challenge in the development of large-scale artificial intelligence (AI) systems lies in achieving scalable decision-making—extending the AI models while maintaining sufficient performance. Existing research indicates that dist…
Multi-Agent Reinforcement Learning for Autonomous Driving: A Survey Open
Reinforcement Learning (RL) is a potent tool for sequential decision-making and has achieved performance surpassing human capabilities across many challenging real-world tasks. As the extension of RL in the multi-agent system domain, multi…
Explaining an Agent's Future Beliefs through Temporally Decomposing Future Reward Estimators Open
Future reward estimation is a core component of reinforcement learning agents; i.e., Q-value and state-value functions, predicting an agent's sum of future rewards. Their scalar output, however, obfuscates when or what individual future re…
Learning to Discuss Strategically: A Case Study on One Night Ultimate Werewolf Open
Communication is a fundamental aspect of human society, facilitating the exchange of information and beliefs among people. Despite the advancements in large language models (LLMs), recent agents built with these often neglect the control o…
Human-Guided Moral Decision Making in Text-Based Games Open
Training reinforcement learning (RL) agents to achieve desired goals while also acting morally is a challenging problem. Transformer-based language models (LMs) have shown some promise in moral awareness, but their use in different context…
STAS: Spatial-Temporal Return Decomposition for Solving Sparse Rewards Problems in Multi-agent Reinforcement Learning Open
Centralized Training with Decentralized Execution (CTDE) has been proven to be an effective paradigm in cooperative multi-agent reinforcement learning (MARL). One of the major challenges is credit assignment, which aims to credit agents by…
TAPE: Leveraging Agent Topology for Cooperative Multi-Agent Policy Gradient Open
Multi-Agent Policy Gradient (MAPG) has made significant progress in recent years. However, centralized critics in state-of-the-art MAPG methods still face the centralized-decentralized mismatch (CDM) issue, which means sub-optimal actions …
All Language Models Large and Small Open
Many leading language models (LMs) use high-intensity computational resources both during training and execution. This poses the challenge of lowering resource costs for deployment and faster execution of decision-making tasks among others…
Aligning Individual and Collective Objectives in Multi-Agent Cooperation Open
Among the research topics in multi-agent learning, mixed-motive cooperation is one of the most prominent challenges, primarily due to the mismatch between individual and collective goals. The cutting-edge research is focused on incorporati…
Natural Language Reinforcement Learning Open
Reinforcement Learning (RL) has shown remarkable abilities in learning policies for decision-making tasks. However, RL is often hindered by issues such as low sample efficiency, lack of interpretability, and sparse supervision signals. To …
Safe Reinforcement Learning with Free-form Natural Language Constraints and Pre-Trained Language Models Open
Safe reinforcement learning (RL) agents accomplish given tasks while adhering to specific constraints. Employing constraints expressed via easily-understandable human language offers considerable potential for real-world applications due t…
Cooperation on the Fly: Exploring Language Agents for Ad Hoc Teamwork in the Avalon Game Open
Multi-agent collaboration with Large Language Models (LLMs) demonstrates proficiency in basic tasks, yet its efficiency in more complex scenarios remains unexplored. In gaming environments, these agents often face situations without establ…
A Review of Cooperation in Multi-agent Learning Open
Cooperation in multi-agent learning (MAL) is a topic at the intersection of numerous disciplines, including game theory, economics, social sciences, and evolutionary biology. Research in this area aims to understand both how agents can coo…
MACCA: Offline Multi-agent Reinforcement Learning with Causal Credit Assignment Open
Offline Multi-agent Reinforcement Learning (MARL) is valuable in scenarios where online interaction is impractical or risky. While independent learning in MARL offers flexibility and scalability, accurately assigning credit to individual a…
A human-centered safe robot reinforcement learning framework with interactive behaviors Open
Deployment of Reinforcement Learning (RL) algorithms for robotics applications in the real world requires ensuring the safety of the robot and its environment. Safe Robot RL (SRRL) is a crucial step toward achieving human-robot coexistence…