Exploring foci of:
IEEE Access • Vol 8
Policy Return: A New Method for Reducing the Number of Experimental Trials in Deep Reinforcement Learning
January 2020 • Feng Liu, Shuling Dai, Yongjia Zhao
Using the same algorithm and hyperparameter configuration, deep reinforcement learning (DRL) can produce drastically different results across experimental trials, and most of these results are unsatisfactory. Because of this instability, researchers must run many trials to validate an algorithm or a set of hyperparameters in DRL. In this article, we present the policy return method, a new design for reducing the number of trials needed when training a DRL model. This method allows the…
Reinforcement Learning
Computer Science
Artificial Intelligence
Machine Learning
Mathematics
Geometry
Programming Language