Exploring foci of:
Mathematics • Vol 10 • No 15
Noise-Regularized Advantage Value for Multi-Agent Reinforcement Learning
August 2022 • Siying Wang, Wenyu Chen, Jian Hu, Siyue Hu, Liwei Huang
Leveraging global state information to enhance policy optimization is a common approach in multi-agent reinforcement learning (MARL). Even when supplemented with state information, the agents still suffer from insufficient exploration during training. Moreover, training with examples batch-sampled from the replay buffer induces policy overfitting, i.e., multi-agent proximal policy optimization (MAPPO) may not perform as well as independent PPO (IPPO) even with additional information in the ce…
Overfitting
Reinforcement Learning
Computer Science
Artificial Intelligence
Machine Learning
Mathematics