Policy Return: A New Method for Reducing the Number of Experimental Trials in Deep Reinforcement Learning
2020 · Open Access · DOI: https://doi.org/10.1109/access.2020.3045835
Using the same algorithm and hyperparameter configurations, deep reinforcement learning (DRL) will derive drastically different results from multiple experimental trials, and most of these results are unsatisfactory. Because of the instability of the results, researchers have to perform many trials to confirm an algorithm or a set of hyperparameters in DRL. In this article, we present the policy return method, which is a new design for reducing the number of trials when training a DRL model. This method allows the learned policy to return to a previous state when it becomes divergent or stagnant at any stage of training. When returning, a certain percentage of stochastic data is added to the weights of the neural networks to prevent a repeated decline. Extensive experiments on challenging tasks and various target scores demonstrate that the policy return method can bring about a 10% to 40% reduction in the required number of trials compared with that of the corresponding original algorithm, and a 10% to 30% reduction compared with the state-of-the-art algorithms.
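The return step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the class name `PolicyReturn`, the parameters `patience`, `drop_ratio`, and `noise_fraction`, the flat list-of-floats weight representation, and the specific divergence and stagnation tests are all assumptions made for this example. In particular, "a certain percentage of stochastic data" is interpreted here as Gaussian noise scaled to a fraction of the mean absolute weight, which is only one possible reading.

```python
import random


class PolicyReturn:
    """Hypothetical checkpoint-and-return helper (illustrative only)."""

    def __init__(self, patience=3, drop_ratio=0.5, noise_fraction=0.05, seed=0):
        self.patience = patience              # evaluations without improvement => "stagnant"
        self.drop_ratio = drop_ratio          # relative drop from best score => "divergent"
        self.noise_fraction = noise_fraction  # noise scale relative to mean |weight|
        self.rng = random.Random(seed)
        self.best_score = float("-inf")
        self.best_weights = None
        self.stale_evals = 0

    def step(self, weights, score):
        """Return (weights_to_continue_with, returned_flag) after one evaluation."""
        if score > self.best_score:
            # New best: remember this state as the one to return to.
            self.best_score = score
            self.best_weights = list(weights)
            self.stale_evals = 0
            return list(weights), False

        self.stale_evals += 1
        diverged = score < self.best_score - self.drop_ratio * abs(self.best_score)
        stagnant = self.stale_evals >= self.patience
        if not (diverged or stagnant):
            return list(weights), False

        # Return to the saved state and perturb it so that training
        # does not simply retrace the same decline.
        self.stale_evals = 0
        mean_abs = sum(abs(w) for w in self.best_weights) / len(self.best_weights)
        scale = self.noise_fraction * (mean_abs + 1e-8)
        restored = [w + self.rng.gauss(0.0, scale) for w in self.best_weights]
        return restored, True


if __name__ == "__main__":
    helper = PolicyReturn(patience=2, drop_ratio=0.5, noise_fraction=0.1)
    weights, _ = helper.step([1.0, -1.0, 1.0], score=10.0)      # saved as best
    weights, returned = helper.step([100.0] * 3, score=1.0)     # large drop => return
    print(returned, weights)
```

In a real training loop, `step` would be called after each evaluation phase with the network parameters and the evaluation score, and the returned weights would be loaded back into the network whenever the flag is `True`.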
- Type: article
- Language: en
- Landing Page: https://doi.org/10.1109/access.2020.3045835, https://ieeexplore.ieee.org/ielx7/6287639/6514899/09298771.pdf
- OA Status: gold
- Cited By: 1
- References: 53
- Related Works: 10
- OpenAlex ID: https://openalex.org/W3115077346
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W3115077346 (Canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.1109/access.2020.3045835 (Digital Object Identifier)
- Title: Policy Return: A New Method for Reducing the Number of Experimental Trials in Deep Reinforcement Learning (Work title)
- Type: article (OpenAlex work type)
- Language: en (Primary language)
- Publication year: 2020 (Year of publication)
- Publication date: 2020-01-01 (Full publication date if available)
- Authors: Feng Liu, Shuling Dai, Yongjia Zhao (List of authors in order)
- Landing page: https://doi.org/10.1109/access.2020.3045835 (Publisher landing page)
- PDF URL: https://ieeexplore.ieee.org/ielx7/6287639/6514899/09298771.pdf (Direct link to full text PDF)
- Open access: Yes (Whether a free full text is available)
- OA status: gold (Open access status per OpenAlex)
- OA URL: https://ieeexplore.ieee.org/ielx7/6287639/6514899/09298771.pdf (Direct OA link when available)
- Concepts: Hyperparameter, Reinforcement learning, Computer science, Artificial intelligence, Reduction (mathematics), Machine learning, Set (abstract data type), Artificial neural network, Stability (learning theory), Mathematical optimization, Mathematics, Geometry, Programming language (Top concepts (fields/topics) attached by OpenAlex)
- Cited by: 1 (Total citation count in OpenAlex)
- Citations by year (recent): 2023: 1 (Per-year citation counts (last 5 years))
- References (count): 53 (Number of works referenced by this work)
- Related works (count): 10 (Other works algorithmically related by OpenAlex)
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W3115077346 |
| doi | https://doi.org/10.1109/access.2020.3045835 |
| ids.doi | https://doi.org/10.1109/access.2020.3045835 |
| ids.mag | 3115077346 |
| ids.openalex | https://openalex.org/W3115077346 |
| fwci | 0.14685955 |
| type | article |
| title | Policy Return: A New Method for Reducing the Number of Experimental Trials in Deep Reinforcement Learning |
| biblio.issue | |
| biblio.volume | 8 |
| biblio.last_page | 228107 |
| biblio.first_page | 228099 |
| topics[0].id | https://openalex.org/T10462 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9998999834060669 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Reinforcement Learning in Robotics |
| topics[1].id | https://openalex.org/T11975 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9934999942779541 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Evolutionary Algorithms and Applications |
| topics[2].id | https://openalex.org/T10848 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9883999824523926 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1703 |
| topics[2].subfield.display_name | Computational Theory and Mathematics |
| topics[2].display_name | Advanced Multi-Objective Optimization Algorithms |
| is_xpac | False |
| apc_list.value | 1850 |
| apc_list.currency | USD |
| apc_list.value_usd | 1850 |
| apc_paid.value | 1850 |
| apc_paid.currency | USD |
| apc_paid.value_usd | 1850 |
| concepts[0].id | https://openalex.org/C8642999 |
| concepts[0].level | 2 |
| concepts[0].score | 0.8586024045944214 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q4171168 |
| concepts[0].display_name | Hyperparameter |
| concepts[1].id | https://openalex.org/C97541855 |
| concepts[1].level | 2 |
| concepts[1].score | 0.8186781406402588 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q830687 |
| concepts[1].display_name | Reinforcement learning |
| concepts[2].id | https://openalex.org/C41008148 |
| concepts[2].level | 0 |
| concepts[2].score | 0.6882305145263672 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[2].display_name | Computer science |
| concepts[3].id | https://openalex.org/C154945302 |
| concepts[3].level | 1 |
| concepts[3].score | 0.5724294185638428 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[3].display_name | Artificial intelligence |
| concepts[4].id | https://openalex.org/C111335779 |
| concepts[4].level | 2 |
| concepts[4].score | 0.5391542911529541 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q3454686 |
| concepts[4].display_name | Reduction (mathematics) |
| concepts[5].id | https://openalex.org/C119857082 |
| concepts[5].level | 1 |
| concepts[5].score | 0.5184952020645142 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q2539 |
| concepts[5].display_name | Machine learning |
| concepts[6].id | https://openalex.org/C177264268 |
| concepts[6].level | 2 |
| concepts[6].score | 0.4787415862083435 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q1514741 |
| concepts[6].display_name | Set (abstract data type) |
| concepts[7].id | https://openalex.org/C50644808 |
| concepts[7].level | 2 |
| concepts[7].score | 0.4391811788082123 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q192776 |
| concepts[7].display_name | Artificial neural network |
| concepts[8].id | https://openalex.org/C112972136 |
| concepts[8].level | 2 |
| concepts[8].score | 0.41667819023132324 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q7595718 |
| concepts[8].display_name | Stability (learning theory) |
| concepts[9].id | https://openalex.org/C126255220 |
| concepts[9].level | 1 |
| concepts[9].score | 0.32939931750297546 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q141495 |
| concepts[9].display_name | Mathematical optimization |
| concepts[10].id | https://openalex.org/C33923547 |
| concepts[10].level | 0 |
| concepts[10].score | 0.18197974562644958 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q395 |
| concepts[10].display_name | Mathematics |
| concepts[11].id | https://openalex.org/C2524010 |
| concepts[11].level | 1 |
| concepts[11].score | 0.0 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q8087 |
| concepts[11].display_name | Geometry |
| concepts[12].id | https://openalex.org/C199360897 |
| concepts[12].level | 1 |
| concepts[12].score | 0.0 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q9143 |
| concepts[12].display_name | Programming language |
| keywords[0].id | https://openalex.org/keywords/hyperparameter |
| keywords[0].score | 0.8586024045944214 |
| keywords[0].display_name | Hyperparameter |
| keywords[1].id | https://openalex.org/keywords/reinforcement-learning |
| keywords[1].score | 0.8186781406402588 |
| keywords[1].display_name | Reinforcement learning |
| keywords[2].id | https://openalex.org/keywords/computer-science |
| keywords[2].score | 0.6882305145263672 |
| keywords[2].display_name | Computer science |
| keywords[3].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[3].score | 0.5724294185638428 |
| keywords[3].display_name | Artificial intelligence |
| keywords[4].id | https://openalex.org/keywords/reduction |
| keywords[4].score | 0.5391542911529541 |
| keywords[4].display_name | Reduction (mathematics) |
| keywords[5].id | https://openalex.org/keywords/machine-learning |
| keywords[5].score | 0.5184952020645142 |
| keywords[5].display_name | Machine learning |
| keywords[6].id | https://openalex.org/keywords/set |
| keywords[6].score | 0.4787415862083435 |
| keywords[6].display_name | Set (abstract data type) |
| keywords[7].id | https://openalex.org/keywords/artificial-neural-network |
| keywords[7].score | 0.4391811788082123 |
| keywords[7].display_name | Artificial neural network |
| keywords[8].id | https://openalex.org/keywords/stability |
| keywords[8].score | 0.41667819023132324 |
| keywords[8].display_name | Stability (learning theory) |
| keywords[9].id | https://openalex.org/keywords/mathematical-optimization |
| keywords[9].score | 0.32939931750297546 |
| keywords[9].display_name | Mathematical optimization |
| keywords[10].id | https://openalex.org/keywords/mathematics |
| keywords[10].score | 0.18197974562644958 |
| keywords[10].display_name | Mathematics |
| language | en |
| locations[0].id | doi:10.1109/access.2020.3045835 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S2485537415 |
| locations[0].source.issn | 2169-3536 |
| locations[0].source.type | journal |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | 2169-3536 |
| locations[0].source.is_core | True |
| locations[0].source.is_in_doaj | True |
| locations[0].source.display_name | IEEE Access |
| locations[0].source.host_organization | https://openalex.org/P4310319808 |
| locations[0].source.host_organization_name | Institute of Electrical and Electronics Engineers |
| locations[0].source.host_organization_lineage | https://openalex.org/P4310319808 |
| locations[0].source.host_organization_lineage_names | Institute of Electrical and Electronics Engineers |
| locations[0].license | cc-by |
| locations[0].pdf_url | https://ieeexplore.ieee.org/ielx7/6287639/6514899/09298771.pdf |
| locations[0].version | publishedVersion |
| locations[0].raw_type | journal-article |
| locations[0].license_id | https://openalex.org/licenses/cc-by |
| locations[0].is_accepted | True |
| locations[0].is_published | True |
| locations[0].raw_source_name | IEEE Access |
| locations[0].landing_page_url | https://doi.org/10.1109/access.2020.3045835 |
| locations[1].id | pmh:oai:doaj.org/article:ce74b71c5e014fbbac96e1e91fa156c3 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306401280 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | False |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | DOAJ (DOAJ: Directory of Open Access Journals) |
| locations[1].source.host_organization | |
| locations[1].source.host_organization_name | |
| locations[1].license | cc-by-sa |
| locations[1].pdf_url | |
| locations[1].version | submittedVersion |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by-sa |
| locations[1].is_accepted | False |
| locations[1].is_published | False |
| locations[1].raw_source_name | IEEE Access, Vol 8, Pp 228099-228107 (2020) |
| locations[1].landing_page_url | https://doaj.org/article/ce74b71c5e014fbbac96e1e91fa156c3 |
| indexed_in | crossref, doaj |
| authorships[0].author.id | https://openalex.org/A5100415312 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-9006-4520 |
| authorships[0].author.display_name | Feng Liu |
| authorships[0].countries | CN |
| authorships[0].affiliations[0].institution_ids | https://openalex.org/I82880672 |
| authorships[0].affiliations[0].raw_affiliation_string | State Key Laboratory of VR Technology & Systems, Beihang University (BUAA), Beijing, China |
| authorships[0].institutions[0].id | https://openalex.org/I82880672 |
| authorships[0].institutions[0].ror | https://ror.org/00wk2mp56 |
| authorships[0].institutions[0].type | education |
| authorships[0].institutions[0].lineage | https://openalex.org/I82880672 |
| authorships[0].institutions[0].country_code | CN |
| authorships[0].institutions[0].display_name | Beihang University |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Feng Liu |
| authorships[0].is_corresponding | False |
| authorships[0].raw_affiliation_strings | State Key Laboratory of VR Technology & Systems, Beihang University (BUAA), Beijing, China |
| authorships[1].author.id | https://openalex.org/A5102866500 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-2934-9033 |
| authorships[1].author.display_name | Shuling Dai |
| authorships[1].countries | CN |
| authorships[1].affiliations[0].institution_ids | https://openalex.org/I82880672 |
| authorships[1].affiliations[0].raw_affiliation_string | State Key Laboratory of VR Technology & Systems, Beihang University (BUAA), Beijing, China |
| authorships[1].affiliations[1].institution_ids | https://openalex.org/I82880672 |
| authorships[1].affiliations[1].raw_affiliation_string | Jiangxi Research Institute, Beihang University (BUAA), Beijing, China |
| authorships[1].institutions[0].id | https://openalex.org/I82880672 |
| authorships[1].institutions[0].ror | https://ror.org/00wk2mp56 |
| authorships[1].institutions[0].type | education |
| authorships[1].institutions[0].lineage | https://openalex.org/I82880672 |
| authorships[1].institutions[0].country_code | CN |
| authorships[1].institutions[0].display_name | Beihang University |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Shuling Dai |
| authorships[1].is_corresponding | False |
| authorships[1].raw_affiliation_strings | Jiangxi Research Institute, Beihang University (BUAA), Beijing, China, State Key Laboratory of VR Technology & Systems, Beihang University (BUAA), Beijing, China |
| authorships[2].author.id | https://openalex.org/A5082426267 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-4557-9066 |
| authorships[2].author.display_name | Yongjia Zhao |
| authorships[2].countries | CN |
| authorships[2].affiliations[0].institution_ids | https://openalex.org/I82880672 |
| authorships[2].affiliations[0].raw_affiliation_string | State Key Laboratory of VR Technology & Systems, Beihang University (BUAA), Beijing, China |
| authorships[2].institutions[0].id | https://openalex.org/I82880672 |
| authorships[2].institutions[0].ror | https://ror.org/00wk2mp56 |
| authorships[2].institutions[0].type | education |
| authorships[2].institutions[0].lineage | https://openalex.org/I82880672 |
| authorships[2].institutions[0].country_code | CN |
| authorships[2].institutions[0].display_name | Beihang University |
| authorships[2].author_position | last |
| authorships[2].raw_author_name | Yongjia Zhao |
| authorships[2].is_corresponding | False |
| authorships[2].raw_affiliation_strings | State Key Laboratory of VR Technology & Systems, Beihang University (BUAA), Beijing, China |
| has_content.pdf | True |
| has_content.grobid_xml | True |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://ieeexplore.ieee.org/ielx7/6287639/6514899/09298771.pdf |
| open_access.oa_status | gold |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Policy Return: A New Method for Reducing the Number of Experimental Trials in Deep Reinforcement Learning |
| has_fulltext | True |
| is_retracted | False |
| updated_date | 2025-11-06T03:46:38.306776 |
| primary_topic.id | https://openalex.org/T10462 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9998999834060669 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Reinforcement Learning in Robotics |
| related_works | https://openalex.org/W2140186469, https://openalex.org/W4390421286, https://openalex.org/W4280563792, https://openalex.org/W4389724018, https://openalex.org/W4318719684, https://openalex.org/W4318559728, https://openalex.org/W3183136280, https://openalex.org/W2775233965, https://openalex.org/W3114716045, https://openalex.org/W4281847915 |
| cited_by_count | 1 |
| counts_by_year[0].year | 2023 |
| counts_by_year[0].cited_by_count | 1 |
| locations_count | 2 |
| best_oa_location.id | doi:10.1109/access.2020.3045835 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S2485537415 |
| best_oa_location.source.issn | 2169-3536 |
| best_oa_location.source.type | journal |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | 2169-3536 |
| best_oa_location.source.is_core | True |
| best_oa_location.source.is_in_doaj | True |
| best_oa_location.source.display_name | IEEE Access |
| best_oa_location.source.host_organization | https://openalex.org/P4310319808 |
| best_oa_location.source.host_organization_name | Institute of Electrical and Electronics Engineers |
| best_oa_location.source.host_organization_lineage | https://openalex.org/P4310319808 |
| best_oa_location.source.host_organization_lineage_names | Institute of Electrical and Electronics Engineers |
| best_oa_location.license | cc-by |
| best_oa_location.pdf_url | https://ieeexplore.ieee.org/ielx7/6287639/6514899/09298771.pdf |
| best_oa_location.version | publishedVersion |
| best_oa_location.raw_type | journal-article |
| best_oa_location.license_id | https://openalex.org/licenses/cc-by |
| best_oa_location.is_accepted | True |
| best_oa_location.is_published | True |
| best_oa_location.raw_source_name | IEEE Access |
| best_oa_location.landing_page_url | https://doi.org/10.1109/access.2020.3045835 |
| primary_location.id | doi:10.1109/access.2020.3045835 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S2485537415 |
| primary_location.source.issn | 2169-3536 |
| primary_location.source.type | journal |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | 2169-3536 |
| primary_location.source.is_core | True |
| primary_location.source.is_in_doaj | True |
| primary_location.source.display_name | IEEE Access |
| primary_location.source.host_organization | https://openalex.org/P4310319808 |
| primary_location.source.host_organization_name | Institute of Electrical and Electronics Engineers |
| primary_location.source.host_organization_lineage | https://openalex.org/P4310319808 |
| primary_location.source.host_organization_lineage_names | Institute of Electrical and Electronics Engineers |
| primary_location.license | cc-by |
| primary_location.pdf_url | https://ieeexplore.ieee.org/ielx7/6287639/6514899/09298771.pdf |
| primary_location.version | publishedVersion |
| primary_location.raw_type | journal-article |
| primary_location.license_id | https://openalex.org/licenses/cc-by |
| primary_location.is_accepted | True |
| primary_location.is_published | True |
| primary_location.raw_source_name | IEEE Access |
| primary_location.landing_page_url | https://doi.org/10.1109/access.2020.3045835 |
| publication_date | 2020-01-01 |
| publication_year | 2020 |
| referenced_works | https://openalex.org/W6687681856, https://openalex.org/W6631190155, https://openalex.org/W2960855687, https://openalex.org/W3006186780, https://openalex.org/W6744123322, https://openalex.org/W2902098903, https://openalex.org/W2145339207, https://openalex.org/W6684921986, https://openalex.org/W6692846177, https://openalex.org/W6740222838, https://openalex.org/W6752603789, https://openalex.org/W6741002519, https://openalex.org/W2766447205, https://openalex.org/W6683195989, https://openalex.org/W2257979135, https://openalex.org/W2583993537, https://openalex.org/W6638018090, https://openalex.org/W6682849425, https://openalex.org/W2963428623, https://openalex.org/W2575705757, https://openalex.org/W3100789280, https://openalex.org/W2546571074, https://openalex.org/W6637967152, https://openalex.org/W6740092555, https://openalex.org/W6683300800, https://openalex.org/W6744838376, https://openalex.org/W4250743307, https://openalex.org/W32403112, https://openalex.org/W2142441581, https://openalex.org/W2724169821, https://openalex.org/W4298876402, https://openalex.org/W2156737235, https://openalex.org/W2964161785, https://openalex.org/W2754517384, https://openalex.org/W1757796397, https://openalex.org/W2963423916, https://openalex.org/W2121863487, https://openalex.org/W2809256243, https://openalex.org/W2964043796, https://openalex.org/W2964121744, https://openalex.org/W2736601468, https://openalex.org/W1515851193, https://openalex.org/W2761873684, https://openalex.org/W2726187156, https://openalex.org/W2173248099, https://openalex.org/W3100944043, https://openalex.org/W2155007355, https://openalex.org/W1771410628, https://openalex.org/W4298857966, https://openalex.org/W1522301498, https://openalex.org/W2201581102, https://openalex.org/W2963864421, https://openalex.org/W2964174623 |
| referenced_works_count | 53 |
| abstract_inverted_index.a | 45, 62, 73, 85, 101, 118, 139, 159 |
| abstract_inverted_index.In | 51 |
| abstract_inverted_index.an | 42 |
| abstract_inverted_index.at | 94 |
| abstract_inverted_index.in | 49, 144 |
| abstract_inverted_index.is | 61, 107 |
| abstract_inverted_index.it | 89 |
| abstract_inverted_index.of | 22, 28, 31, 47, 69, 97, 104, 112, 148, 153 |
| abstract_inverted_index.on | 123 |
| abstract_inverted_index.or | 44, 92 |
| abstract_inverted_index.to | 36, 40, 82, 84, 109, 116, 141, 161 |
| abstract_inverted_index.we | 54 |
| abstract_inverted_index.DRL | 74 |
| abstract_inverted_index.and | 4, 20, 126, 158 |
| abstract_inverted_index.any | 95 |
| abstract_inverted_index.are | 25 |
| abstract_inverted_index.can | 136 |
| abstract_inverted_index.for | 65 |
| abstract_inverted_index.new | 63 |
| abstract_inverted_index.set | 46 |
| abstract_inverted_index.the | 1, 29, 32, 56, 67, 79, 110, 113, 132, 145, 154, 166 |
| abstract_inverted_index.DRL. | 50 |
| abstract_inverted_index.This | 76 |
| abstract_inverted_index.When | 99 |
| abstract_inverted_index.data | 106 |
| abstract_inverted_index.deep | 7 |
| abstract_inverted_index.from | 16 |
| abstract_inverted_index.have | 35 |
| abstract_inverted_index.many | 38 |
| abstract_inverted_index.most | 21 |
| abstract_inverted_index.same | 2 |
| abstract_inverted_index.that | 131, 152 |
| abstract_inverted_index.this | 52 |
| abstract_inverted_index.when | 71, 88 |
| abstract_inverted_index.will | 11 |
| abstract_inverted_index.with | 151, 165 |
| abstract_inverted_index.(DRL) | 10 |
| abstract_inverted_index.Using | 0 |
| abstract_inverted_index.about | 138 |
| abstract_inverted_index.added | 108 |
| abstract_inverted_index.bring | 137 |
| abstract_inverted_index.stage | 96 |
| abstract_inverted_index.state | 87 |
| abstract_inverted_index.tasks | 125 |
| abstract_inverted_index.these | 23 |
| abstract_inverted_index.which | 60 |
| abstract_inverted_index.allows | 78 |
| abstract_inverted_index.derive | 12 |
| abstract_inverted_index.design | 64 |
| abstract_inverted_index.method | 77, 135 |
| abstract_inverted_index.model. | 75 |
| abstract_inverted_index.neural | 114 |
| abstract_inverted_index.number | 68, 147 |
| abstract_inverted_index.policy | 57, 81, 133 |
| abstract_inverted_index.return | 58, 83, 134 |
| abstract_inverted_index.scores | 129 |
| abstract_inverted_index.target | 128 |
| abstract_inverted_index.trials | 39, 70, 149 |
| abstract_inverted_index.Because | 27 |
| abstract_inverted_index.becomes | 90 |
| abstract_inverted_index.certain | 102 |
| abstract_inverted_index.confirm | 41 |
| abstract_inverted_index.learned | 80 |
| abstract_inverted_index.method, | 59 |
| abstract_inverted_index.perform | 37 |
| abstract_inverted_index.present | 55 |
| abstract_inverted_index.prevent | 117 |
| abstract_inverted_index.results | 15, 24 |
| abstract_inverted_index.trials, | 19 |
| abstract_inverted_index.various | 127 |
| abstract_inverted_index.weights | 111 |
| abstract_inverted_index.article, | 53 |
| abstract_inverted_index.compared | 150, 164 |
| abstract_inverted_index.decline. | 120 |
| abstract_inverted_index.learning | 9 |
| abstract_inverted_index.multiple | 17 |
| abstract_inverted_index.networks | 115 |
| abstract_inverted_index.original | 156 |
| abstract_inverted_index.previous | 86 |
| abstract_inverted_index.reducing | 66 |
| abstract_inverted_index.repeated | 119 |
| abstract_inverted_index.required | 146 |
| abstract_inverted_index.results, | 33 |
| abstract_inverted_index.stagnant | 93 |
| abstract_inverted_index.training | 72 |
| abstract_inverted_index.Extensive | 121 |
| abstract_inverted_index.algorithm | 3, 43 |
| abstract_inverted_index.different | 14 |
| abstract_inverted_index.divergent | 91 |
| abstract_inverted_index.reduction | 143, 163 |
| abstract_inverted_index.training. | 98 |
| abstract_inverted_index.10% | 140, 160 |
| abstract_inverted_index.30% | 162 |
| abstract_inverted_index.40% | 142 |
| abstract_inverted_index.algorithm, | 157 |
| abstract_inverted_index.percentage | 103 |
| abstract_inverted_index.returning, | 100 |
| abstract_inverted_index.stochastic | 105 |
| abstract_inverted_index.algorithms. | 168 |
| abstract_inverted_index.challenging | 124 |
| abstract_inverted_index.demonstrate | 130 |
| abstract_inverted_index.drastically | 13 |
| abstract_inverted_index.experiments | 122 |
| abstract_inverted_index.instability | 30 |
| abstract_inverted_index.researchers | 34 |
| abstract_inverted_index.experimental | 18 |
| abstract_inverted_index.corresponding | 155 |
| abstract_inverted_index.reinforcement | 8 |
| abstract_inverted_index.hyperparameter | 5 |
| abstract_inverted_index.configurations, | 6 |
| abstract_inverted_index.hyperparameters | 48 |
| abstract_inverted_index.unsatisfactory. | 26 |
| abstract_inverted_index.state-of-the-art | 167 |
| cited_by_percentile_year.max | 94 |
| cited_by_percentile_year.min | 89 |
| countries_distinct_count | 1 |
| institutions_distinct_count | 3 |
| sustainable_development_goals[0].id | https://metadata.un.org/sdg/16 |
| sustainable_development_goals[0].score | 0.49000000953674316 |
| sustainable_development_goals[0].display_name | Peace, Justice and strong institutions |
| citation_normalized_percentile.value | 0.57907438 |
| citation_normalized_percentile.is_in_top_1_percent | False |
| citation_normalized_percentile.is_in_top_10_percent | False |
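
The `abstract_inverted_index.*` rows in the payload above map each token of the abstract to the zero-based word positions at which it occurs. A minimal sketch of turning such a mapping back into running text is shown below; the function name `rebuild_abstract` is an assumption for illustration, and the sample dictionary contains only the first seven positions copied from the table, not the full index.

```python
def rebuild_abstract(inverted_index):
    """Rebuild text from a {token: [positions]} mapping."""
    by_position = {}
    for token, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = token
    # Join the tokens in ascending position order.
    return " ".join(by_position[pos] for pos in sorted(by_position))


# Excerpt of the index above (positions 0 to 6 only).
sample = {
    "Using": [0],
    "the": [1],
    "same": [2],
    "algorithm": [3],
    "and": [4],
    "hyperparameter": [5],
    "configurations,": [6],
}

print(rebuild_abstract(sample))
# prints: Using the same algorithm and hyperparameter configurations,
```

Tokens keep their attached punctuation and capitalization, which is why `DRL` and `DRL.` appear as separate keys in the table.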