Nithia Vijayan
A policy gradient approach for optimization of smooth risk measures
We propose policy gradient algorithms for solving a risk-sensitive reinforcement learning (RL) problem in on-policy as well as off-policy settings. We consider episodic Markov decision processes, and model the risk using the broad class of…
Smoothed functional-based gradient algorithms for off-policy reinforcement learning: A non-asymptotic viewpoint
Likelihood ratio-based policy gradient methods for distorted risk measures: A non-asymptotic analysis.
We propose policy-gradient algorithms for solving the problem of control in a risk-sensitive reinforcement learning (RL) context. The objective of our algorithm is to maximize the distorted risk measure (DRM) of the cumulative reward in an…
Policy Gradient Methods for Distortion Risk Measures
We propose policy gradient algorithms which learn risk-sensitive policies in a reinforcement learning (RL) framework. Our proposed algorithms maximize the distortion risk measure (DRM) of the cumulative reward in an episodic Markov decisio…
Smoothed functional-based gradient algorithms for off-policy reinforcement learning.
We consider the problem of control in an off-policy reinforcement learning (RL) context. We propose a policy gradient scheme that incorporates a smoothed functional-based gradient estimation scheme. We provide an asymptotic convergence gua…
Smoothed functional-based gradient algorithms for off-policy reinforcement learning: A non-asymptotic viewpoint
We propose two policy gradient algorithms for solving the problem of control in an off-policy reinforcement learning (RL) context. Both algorithms incorporate a smoothed functional (SF) based gradient estimation scheme. The first algori…