Bikramjit Banerjee
Quasimetric Value Functions with Dense Rewards
As a generalization of reinforcement learning (RL) to parametrizable goals, goal conditioned RL (GCRL) has a broad range of applications, particularly in challenging tasks in robotics. Recent work has established that the optimal value fun…
Model AI Assignments 2024
The Model AI Assignments session seeks to gather and disseminate the best assignment designs of the Artificial Intelligence (AI) Education community. Recognizing that assignments form the core of student learning experience, we here …
Reinforcement actor-critic learning as a rehearsal in MicroRTS
Real-time strategy (RTS) games have provided a fertile ground for AI research with notable recent successes based on deep reinforcement learning (RL). However, RL remains a data-hungry approach featuring a high sample complexity. In this p…
Latent Interactive A2C for Improved RL in Open Many-Agent Systems
Many multiagent reinforcement learning (MARL) methods rely on centralized training. However, these methods involve obtaining various types of information from the other agents, which may not be feasible in competitiv…
Informed Initial Policies for Learning in Dec-POMDPs
Decentralized partially observable Markov decision processes (Dec-POMDPs) offer a formal model for planning in cooperative multiagent systems where agents operate with noisy sensors and actuators, and local information. Prevalent Dec-POMDP…
Sample Bounded Distributed Reinforcement Learning for Decentralized POMDPs
Decentralized partially observable Markov decision processes (Dec-POMDPs) offer a powerful modeling technique for realistic multi-agent coordination problems under uncertainty. Prevalent solution techniques are centralized and assume prior…
Many Agent Reinforcement Learning Under Partial Observability
Recent renewed interest in multi-agent reinforcement learning (MARL) has generated an impressive array of techniques that leverage deep reinforcement learning, primarily actor-critic architectures, and can be applied to a limited range of …
Cooperative-Competitive Reinforcement Learning with History-Dependent Rewards
Consider a typical organization whose worker agents seek to collectively cooperate for its general betterment. However, each individual agent simultaneously seeks to act to secure a larger chunk than its co-workers of the annual increment …
Maximum Entropy Multi-Task Inverse RL
Multi-task IRL allows for the possibility that the expert could be switching between multiple ways of solving the same problem, or interleaving demonstrations of multiple tasks. The learner aims to learn the multiple reward functions that …
Model-Free IRL Using Maximum Likelihood Estimation
The problem of learning an expert’s unknown reward function using a limited number of demonstrations recorded from the expert’s behavior is investigated in the area of inverse reinforcement learning (IRL). To gain traction in this challeng…
A Framework and Method for Online Inverse Reinforcement Learning
Inverse reinforcement learning (IRL) is the problem of learning the preferences of an agent from the observations of its behavior on a task. While this problem has been well investigated, the related problem of online IRL, where the…
Concurrent Learning of Control in Multi-agent Sequential Decision Tasks
The overall objective of this project was to develop multi-agent reinforcement learning (MARL) approaches for intelligent agents to autonomously learn distributed control policies in decentralized partially observable Markov decision proce…
Table of contents
perceive, reason, learn, and act intelligently.
Detection of Plan Deviation in Multi-Agent Systems
Plan monitoring in a collaborative multi-agent system requires an agent to not only monitor the execution of its own plan, but also to detect possible deviations or failures in the plan execution of its teammates. In domains featuring part…