A Cubic-regularized Policy Newton Algorithm for Reinforcement Learning

arXiv (Cornell University), April 2023
Mizhaan Prajit Maniyar, Akash Mondal, L. A. Prashanth, Shalabh Bhatnagar

We consider the problem of control in the setting of reinforcement learning (RL), where model information is not available. Policy gradient algorithms are a popular solution approach for this problem and are usually shown to converge to a stationary point of the value function. In this paper, we propose two policy Newton algorithms that incorporate cubic regularization. Both algorithms employ the likelihood ratio method to form estimates of the gradient and Hessian of the value function using sample trajectories. …
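To make the idea of a cubic-regularized Newton update concrete, here is a minimal sketch in one dimension. It is a generic illustration and not the paper's algorithms: it uses exact derivatives of a toy objective rather than likelihood-ratio estimates from sample trajectories, it is written for minimization (maximizing a value function J corresponds to minimizing -J), and the names `cubic_newton_step`, `minimize`, `toy_grad`, `toy_hess` and the regularization constant `M` are assumptions introduced here.

```python
import math


def cubic_newton_step(grad, hess, M):
    """Return the global minimizer s of the 1-D cubic model
        m(s) = grad*s + 0.5*hess*s**2 + (M/6)*abs(s)**3,  with M > 0.
    """
    # The minimizer lies on the side where grad*s <= 0. When grad == 0 the
    # model is symmetric, so the positive direction is picked arbitrarily.
    direction = -1.0 if grad > 0 else 1.0
    # The step length t >= 0 is the non-negative root of
    #   (M/2)*t**2 + hess*t - |grad| = 0.
    t = (-hess + math.sqrt(hess * hess + 2.0 * M * abs(grad))) / M
    return direction * t


def minimize(f_grad, f_hess, x0, M, iters=20):
    """Repeat cubic-regularized Newton steps from x0 (hypothetical helper)."""
    x = x0
    for _ in range(iters):
        x += cubic_newton_step(f_grad(x), f_hess(x), M)
    return x


# Toy objective, assumed for illustration only: f(x) = x**4/4 - x**2/2.
# It has a stationary point at x = 0 (a local maximum) and minima at x = -1, 1.
def toy_grad(x):
    return x**3 - x


def toy_hess(x):
    return 3.0 * x**2 - 1.0
```

Calling `minimize(toy_grad, toy_hess, 0.0, 10.0)` starts exactly at the stationary point x = 0, where a plain gradient step would not move. Because the Hessian there is negative, the cubic model still yields a nonzero step, and the iterates approach the minimizer at x = 1. The value `M = 10.0` is an assumed constant chosen to bound how fast the second derivative changes on the region visited.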

Topics: Reinforcement Learning, Algorithm, Gradient Descent, Bellman Equation, Mathematics, Newton's Method, Computer Science, Artificial Intelligence, Mathematical Analysis
