A Cubic-regularized Policy Newton Algorithm for Reinforcement Learning
2023 · Open Access · DOI: https://doi.org/10.48550/arxiv.2304.10951
We consider the problem of control in the setting of reinforcement learning (RL), where model information is not available. Policy gradient algorithms are a popular solution approach for this problem and are usually shown to converge to a stationary point of the value function. In this paper, we propose two policy Newton algorithms that incorporate cubic regularization. Both algorithms employ the likelihood ratio method to form estimates of the gradient and Hessian of the value function using sample trajectories. The first algorithm requires an exact solution of the cubic regularized problem in each iteration, while the second algorithm employs an efficient gradient descent-based approximation to the cubic regularized problem. We establish convergence of our proposed algorithms to a second-order stationary point (SOSP) of the value function, which results in the avoidance of traps in the form of saddle points. In particular, the sample complexity of our algorithms to find an $ε$-SOSP is $O(ε^{-3.5})$, which is an improvement over the state-of-the-art sample complexity of $O(ε^{-4.5})$.
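The gradient-descent approximation to the cubic-regularized problem mentioned in the abstract can be illustrated with a minimal sketch. It assumes a minimization convention (for instance a cost $f = -J$, where $J$ is the value function) and takes the gradient estimate `g` and Hessian estimate `H` as given, rather than forming them from sample trajectories. The function names, the step size `lr`, the iteration count, the regularization constant `alpha`, and the starting point `h0` are illustrative assumptions and are not taken from the paper. The model minimized here is $m(h) = g^\top h + \tfrac{1}{2} h^\top H h + \tfrac{\alpha}{6}\|h\|^3$.

```python
import numpy as np


def cubic_model(h, g, H, alpha):
    """Cubic-regularized second-order model m(h) (minimization convention)."""
    return g @ h + 0.5 * h @ H @ h + (alpha / 6.0) * np.linalg.norm(h) ** 3


def solve_cubic_subproblem_gd(g, H, alpha, h0, lr=0.1, iters=500):
    """Approximately minimize m(h) by plain gradient descent.

    Hypothetical helper: lr and iters are illustrative, not the paper's choices.
    """
    h = np.array(h0, dtype=float)
    for _ in range(iters):
        # Gradient of m: g + H h + (alpha / 2) * ||h|| * h
        grad = g + H @ h + 0.5 * alpha * np.linalg.norm(h) * h
        h = h - lr * grad
    return h


# Toy saddle point: zero gradient, one direction of negative curvature.
g = np.zeros(2)
H = np.diag([2.0, -2.0])
alpha = 1.0

# A small nonzero start is needed; h = 0 is itself a stationary point of m here.
h = solve_cubic_subproblem_gd(g, H, alpha, h0=[1e-3, 1e-3])
model_value = cubic_model(h, g, H, alpha)
```

In this toy case the returned step moves along the negative-curvature direction and the model value is negative, which is the mechanism by which a cubic-regularized Newton step can leave a saddle point where a pure gradient step (with `g = 0`) would stall.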
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2304.10951, https://arxiv.org/pdf/2304.10951
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4366835603
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4366835603 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2304.10951 (Digital Object Identifier)
- Title: A Cubic-regularized Policy Newton Algorithm for Reinforcement Learning (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2023 (year of publication)
- Publication date: 2023-04-21 (full publication date if available)
- Authors: Mizhaan Prajit Maniyar, Akash Mondal, L. A. Prashanth, Shalabh Bhatnagar (list of authors in order)
- Landing page: https://arxiv.org/abs/2304.10951 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2304.10951 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2304.10951 (direct OA link when available)
- Concepts: Reinforcement learning, Hessian matrix, Stationary point, Algorithm, Saddle point, Regularization (linguistics), Descent direction, Function (biology), Gradient descent, Convergence (economics), Bellman equation, Mathematics, Newton's method, Computer science, Applied mathematics, Mathematical optimization, Artificial intelligence, Artificial neural network, Mathematical analysis, Physics, Biology, Economics, Economic growth, Geometry, Nonlinear system, Quantum mechanics, Evolutionary biology (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 0 (total citation count in OpenAlex)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4366835603 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2304.10951 |
| ids.doi | https://doi.org/10.48550/arxiv.2304.10951 |
| ids.openalex | https://openalex.org/W4366835603 |
| fwci | |
| type | preprint |
| title | A Cubic-regularized Policy Newton Algorithm for Reinforcement Learning |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10462 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9977999925613403 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Reinforcement Learning in Robotics |
| topics[1].id | https://openalex.org/T12101 |
| topics[1].field.id | https://openalex.org/fields/18 |
| topics[1].field.display_name | Decision Sciences |
| topics[1].score | 0.9868999719619751 |
| topics[1].domain.id | https://openalex.org/domains/2 |
| topics[1].domain.display_name | Social Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1803 |
| topics[1].subfield.display_name | Management Science and Operations Research |
| topics[1].display_name | Advanced Bandit Algorithms Research |
| topics[2].id | https://openalex.org/T11612 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9799000024795532 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1702 |
| topics[2].subfield.display_name | Artificial Intelligence |
| topics[2].display_name | Stochastic Gradient Optimization Techniques |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C97541855 |
| concepts[0].level | 2 |
| concepts[0].score | 0.7833912372589111 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q830687 |
| concepts[0].display_name | Reinforcement learning |
| concepts[1].id | https://openalex.org/C203616005 |
| concepts[1].level | 2 |
| concepts[1].score | 0.7230967879295349 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q620495 |
| concepts[1].display_name | Hessian matrix |
| concepts[2].id | https://openalex.org/C189237950 |
| concepts[2].level | 2 |
| concepts[2].score | 0.7096744775772095 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q2500758 |
| concepts[2].display_name | Stationary point |
| concepts[3].id | https://openalex.org/C11413529 |
| concepts[3].level | 1 |
| concepts[3].score | 0.6199856996536255 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q8366 |
| concepts[3].display_name | Algorithm |
| concepts[4].id | https://openalex.org/C2681867 |
| concepts[4].level | 2 |
| concepts[4].score | 0.6144540905952454 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q690935 |
| concepts[4].display_name | Saddle point |
| concepts[5].id | https://openalex.org/C2776135515 |
| concepts[5].level | 2 |
| concepts[5].score | 0.5122396349906921 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q17143721 |
| concepts[5].display_name | Regularization (linguistics) |
| concepts[6].id | https://openalex.org/C116149140 |
| concepts[6].level | 4 |
| concepts[6].score | 0.4986302852630615 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q2070951 |
| concepts[6].display_name | Descent direction |
| concepts[7].id | https://openalex.org/C14036430 |
| concepts[7].level | 2 |
| concepts[7].score | 0.49213331937789917 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q3736076 |
| concepts[7].display_name | Function (biology) |
| concepts[8].id | https://openalex.org/C153258448 |
| concepts[8].level | 3 |
| concepts[8].score | 0.4637508988380432 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q1199743 |
| concepts[8].display_name | Gradient descent |
| concepts[9].id | https://openalex.org/C2777303404 |
| concepts[9].level | 2 |
| concepts[9].score | 0.4514929950237274 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q759757 |
| concepts[9].display_name | Convergence (economics) |
| concepts[10].id | https://openalex.org/C14646407 |
| concepts[10].level | 2 |
| concepts[10].score | 0.4462524354457855 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q1430750 |
| concepts[10].display_name | Bellman equation |
| concepts[11].id | https://openalex.org/C33923547 |
| concepts[11].level | 0 |
| concepts[11].score | 0.4418799877166748 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q395 |
| concepts[11].display_name | Mathematics |
| concepts[12].id | https://openalex.org/C85189116 |
| concepts[12].level | 3 |
| concepts[12].score | 0.4268903434276581 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q374195 |
| concepts[12].display_name | Newton's method |
| concepts[13].id | https://openalex.org/C41008148 |
| concepts[13].level | 0 |
| concepts[13].score | 0.4170117974281311 |
| concepts[13].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[13].display_name | Computer science |
| concepts[14].id | https://openalex.org/C28826006 |
| concepts[14].level | 1 |
| concepts[14].score | 0.41415104269981384 |
| concepts[14].wikidata | https://www.wikidata.org/wiki/Q33521 |
| concepts[14].display_name | Applied mathematics |
| concepts[15].id | https://openalex.org/C126255220 |
| concepts[15].level | 1 |
| concepts[15].score | 0.41033047437667847 |
| concepts[15].wikidata | https://www.wikidata.org/wiki/Q141495 |
| concepts[15].display_name | Mathematical optimization |
| concepts[16].id | https://openalex.org/C154945302 |
| concepts[16].level | 1 |
| concepts[16].score | 0.15115976333618164 |
| concepts[16].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[16].display_name | Artificial intelligence |
| concepts[17].id | https://openalex.org/C50644808 |
| concepts[17].level | 2 |
| concepts[17].score | 0.13215014338493347 |
| concepts[17].wikidata | https://www.wikidata.org/wiki/Q192776 |
| concepts[17].display_name | Artificial neural network |
| concepts[18].id | https://openalex.org/C134306372 |
| concepts[18].level | 1 |
| concepts[18].score | 0.1316802203655243 |
| concepts[18].wikidata | https://www.wikidata.org/wiki/Q7754 |
| concepts[18].display_name | Mathematical analysis |
| concepts[19].id | https://openalex.org/C121332964 |
| concepts[19].level | 0 |
| concepts[19].score | 0.07163217663764954 |
| concepts[19].wikidata | https://www.wikidata.org/wiki/Q413 |
| concepts[19].display_name | Physics |
| concepts[20].id | https://openalex.org/C86803240 |
| concepts[20].level | 0 |
| concepts[20].score | 0.0 |
| concepts[20].wikidata | https://www.wikidata.org/wiki/Q420 |
| concepts[20].display_name | Biology |
| concepts[21].id | https://openalex.org/C162324750 |
| concepts[21].level | 0 |
| concepts[21].score | 0.0 |
| concepts[21].wikidata | https://www.wikidata.org/wiki/Q8134 |
| concepts[21].display_name | Economics |
| concepts[22].id | https://openalex.org/C50522688 |
| concepts[22].level | 1 |
| concepts[22].score | 0.0 |
| concepts[22].wikidata | https://www.wikidata.org/wiki/Q189833 |
| concepts[22].display_name | Economic growth |
| concepts[23].id | https://openalex.org/C2524010 |
| concepts[23].level | 1 |
| concepts[23].score | 0.0 |
| concepts[23].wikidata | https://www.wikidata.org/wiki/Q8087 |
| concepts[23].display_name | Geometry |
| concepts[24].id | https://openalex.org/C158622935 |
| concepts[24].level | 2 |
| concepts[24].score | 0.0 |
| concepts[24].wikidata | https://www.wikidata.org/wiki/Q660848 |
| concepts[24].display_name | Nonlinear system |
| concepts[25].id | https://openalex.org/C62520636 |
| concepts[25].level | 1 |
| concepts[25].score | 0.0 |
| concepts[25].wikidata | https://www.wikidata.org/wiki/Q944 |
| concepts[25].display_name | Quantum mechanics |
| concepts[26].id | https://openalex.org/C78458016 |
| concepts[26].level | 1 |
| concepts[26].score | 0.0 |
| concepts[26].wikidata | https://www.wikidata.org/wiki/Q840400 |
| concepts[26].display_name | Evolutionary biology |
| keywords[0].id | https://openalex.org/keywords/reinforcement-learning |
| keywords[0].score | 0.7833912372589111 |
| keywords[0].display_name | Reinforcement learning |
| keywords[1].id | https://openalex.org/keywords/hessian-matrix |
| keywords[1].score | 0.7230967879295349 |
| keywords[1].display_name | Hessian matrix |
| keywords[2].id | https://openalex.org/keywords/stationary-point |
| keywords[2].score | 0.7096744775772095 |
| keywords[2].display_name | Stationary point |
| keywords[3].id | https://openalex.org/keywords/algorithm |
| keywords[3].score | 0.6199856996536255 |
| keywords[3].display_name | Algorithm |
| keywords[4].id | https://openalex.org/keywords/saddle-point |
| keywords[4].score | 0.6144540905952454 |
| keywords[4].display_name | Saddle point |
| keywords[5].id | https://openalex.org/keywords/regularization |
| keywords[5].score | 0.5122396349906921 |
| keywords[5].display_name | Regularization (linguistics) |
| keywords[6].id | https://openalex.org/keywords/descent-direction |
| keywords[6].score | 0.4986302852630615 |
| keywords[6].display_name | Descent direction |
| keywords[7].id | https://openalex.org/keywords/function |
| keywords[7].score | 0.49213331937789917 |
| keywords[7].display_name | Function (biology) |
| keywords[8].id | https://openalex.org/keywords/gradient-descent |
| keywords[8].score | 0.4637508988380432 |
| keywords[8].display_name | Gradient descent |
| keywords[9].id | https://openalex.org/keywords/convergence |
| keywords[9].score | 0.4514929950237274 |
| keywords[9].display_name | Convergence (economics) |
| keywords[10].id | https://openalex.org/keywords/bellman-equation |
| keywords[10].score | 0.4462524354457855 |
| keywords[10].display_name | Bellman equation |
| keywords[11].id | https://openalex.org/keywords/mathematics |
| keywords[11].score | 0.4418799877166748 |
| keywords[11].display_name | Mathematics |
| keywords[12].id | https://openalex.org/keywords/newtons-method |
| keywords[12].score | 0.4268903434276581 |
| keywords[12].display_name | Newton's method |
| keywords[13].id | https://openalex.org/keywords/computer-science |
| keywords[13].score | 0.4170117974281311 |
| keywords[13].display_name | Computer science |
| keywords[14].id | https://openalex.org/keywords/applied-mathematics |
| keywords[14].score | 0.41415104269981384 |
| keywords[14].display_name | Applied mathematics |
| keywords[15].id | https://openalex.org/keywords/mathematical-optimization |
| keywords[15].score | 0.41033047437667847 |
| keywords[15].display_name | Mathematical optimization |
| keywords[16].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[16].score | 0.15115976333618164 |
| keywords[16].display_name | Artificial intelligence |
| keywords[17].id | https://openalex.org/keywords/artificial-neural-network |
| keywords[17].score | 0.13215014338493347 |
| keywords[17].display_name | Artificial neural network |
| keywords[18].id | https://openalex.org/keywords/mathematical-analysis |
| keywords[18].score | 0.1316802203655243 |
| keywords[18].display_name | Mathematical analysis |
| keywords[19].id | https://openalex.org/keywords/physics |
| keywords[19].score | 0.07163217663764954 |
| keywords[19].display_name | Physics |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2304.10951 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2304.10951 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2304.10951 |
| locations[1].id | doi:10.48550/arxiv.2304.10951 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2304.10951 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5000476681 |
| authorships[0].author.orcid | |
| authorships[0].author.display_name | Mizhaan Prajit Maniyar |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Maniyar, Mizhaan Prajit |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5001127459 |
| authorships[1].author.orcid | https://orcid.org/0000-0003-3438-2898 |
| authorships[1].author.display_name | Akash Mondal |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Mondal, Akash |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5068379567 |
| authorships[2].author.orcid | |
| authorships[2].author.display_name | L. A. Prashanth |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | A., Prashanth L. |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5038163398 |
| authorships[3].author.orcid | https://orcid.org/0000-0001-7644-3914 |
| authorships[3].author.display_name | Shalabh Bhatnagar |
| authorships[3].author_position | last |
| authorships[3].raw_author_name | Bhatnagar, Shalabh |
| authorships[3].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2304.10951 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | A Cubic-regularized Policy Newton Algorithm for Reinforcement Learning |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10462 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9977999925613403 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Reinforcement Learning in Robotics |
| related_works | https://openalex.org/W1978576933, https://openalex.org/W3125203780, https://openalex.org/W2983297293, https://openalex.org/W2153649672, https://openalex.org/W2032898330, https://openalex.org/W2016763547, https://openalex.org/W1987483891, https://openalex.org/W2996136316, https://openalex.org/W2970432723, https://openalex.org/W4388964614 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2304.10951 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2304.10951 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2304.10951 |
| primary_location.id | pmh:oai:arXiv.org:2304.10951 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2304.10951 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2304.10951 |
| publication_date | 2023-04-21 |
| publication_year | 2023 |
| referenced_works_count | 0 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 4 |
| sustainable_development_goals[0].id | https://metadata.un.org/sdg/16 |
| sustainable_development_goals[0].score | 0.4699999988079071 |
| sustainable_development_goals[0].display_name | Peace, Justice and strong institutions |
| citation_normalized_percentile | |