Policy Return: A New Method for Reducing the Number of Experimental Trials in Deep Reinforcement Learning

IEEE Access • Vol 8 • January 2020 • Feng Liu, Shuling Dai, Yongjia Zhao

Using the same algorithm and hyperparameter configurations, deep reinforcement learning (DRL) can produce drastically different results across multiple experimental trials, and most of these results are unsatisfactory. Because of this instability, researchers have to perform many trials to validate an algorithm or a set of hyperparameters in DRL. In this article, we present the policy return method, which is a new design for reducing the number of trials when training a DRL model. This method allows the…

Topics: Reinforcement Learning, Computer Science, Artificial Intelligence, Machine Learning, Mathematics, Geometry, Programming Language