Exploring foci of:
arXiv (Cornell University)
A Cubic-regularized Policy Newton Algorithm for Reinforcement Learning
April 2023 • Mizhaan Prajit Maniyar, Akash Mondal, L. A. Prashanth, Shalabh Bhatnagar
We consider the problem of control in the setting of reinforcement learning (RL), where model information is not available. Policy gradient algorithms are a popular solution approach for this problem and are usually shown to converge to a stationary point of the value function. In this paper, we propose two policy Newton algorithms that incorporate cubic regularization. Both algorithms employ the likelihood ratio method to form estimates of the gradient and Hessian of the value function using sample trajectories. …
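To make the two ingredients named in the abstract concrete, here is a minimal sketch, not the authors' algorithm. It assumes a one-step, two-armed bandit with a scalar sigmoid policy parameter instead of sample trajectories, and it uses the closed-form minimizer of the cubic model that is available only in one dimension. The names `lr_estimates`, `cubic_newton_step`, `run`, and the constant `M` are hypothetical and chosen for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lr_estimates(theta, rewards, n_samples, rng):
    """Likelihood-ratio estimates of J'(theta) and J''(theta).

    Assumed toy setting: arm 1 is pulled with probability sigmoid(theta),
    and J(theta) is the expected one-step reward.
    """
    p = sigmoid(theta)
    grad_sum = 0.0
    hess_sum = 0.0
    for _ in range(n_samples):
        a = 1 if rng.random() < p else 0
        r = rewards[a]
        score = a - p                  # d/dtheta of log pi(a)
        score_deriv = -p * (1.0 - p)   # d^2/dtheta^2 of log pi(a)
        grad_sum += r * score
        hess_sum += r * (score * score + score_deriv)
    return grad_sum / n_samples, hess_sum / n_samples


def cubic_newton_step(g, h, M):
    """Global minimizer of g*s + 0.5*h*s**2 + (M/6)*abs(s)**3 for scalar s."""
    radius = (-h + math.sqrt(h * h + 2.0 * M * abs(g))) / M
    return -radius if g > 0 else radius


def run(theta=0.0, rewards=(0.0, 1.0), M=1.0, iters=50, n_samples=2000, seed=0):
    rng = random.Random(seed)
    for _ in range(iters):
        grad_J, hess_J = lr_estimates(theta, rewards, n_samples, rng)
        # J is maximized, so the cubic model is built for the cost -J.
        theta += cubic_newton_step(-grad_J, -hess_J, M)
    return theta


if __name__ == "__main__":
    print(run())
```

With the default rewards, arm 1 is the better arm, so `run()` should push `theta` upward from its starting value of 0. In higher dimensions the cubic subproblem generally has no closed form and needs an iterative solver, which this sketch does not attempt.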
Reinforcement Learning
Algorithm
Gradient Descent
Bellman Equation
Mathematics
Newton's Method
Computer Science
Artificial Intelligence
Mathematical Analysis
Physics
Biology
Economics
Economic Growth
Geometry
Quantum Mechanics
Evolutionary Biology