Multiplicative Controller Fusion: Leveraging Algorithmic Priors for Sample-efficient Reinforcement Learning and Safe Sim-To-Real Transfer
2020 · Open Access · DOI: https://doi.org/10.48550/arxiv.2003.05117
Learning-based approaches often outperform hand-coded algorithmic solutions for many problems in robotics. However, learning long-horizon tasks on real robot hardware can be intractable, and transferring a learned policy from simulation to reality is still extremely challenging. We present a novel approach to model-free reinforcement learning that can leverage existing sub-optimal solutions as an algorithmic prior during training and deployment. During training, our gated fusion approach enables the prior to guide the initial stages of exploration, increasing sample-efficiency and enabling learning from sparse long-horizon reward signals. Importantly, the policy can learn to improve beyond the performance of the sub-optimal prior since the prior's influence is annealed gradually. During deployment, the policy's uncertainty provides a reliable strategy for transferring a simulation-trained policy to the real world by falling back to the prior controller in uncertain states. We show the efficacy of our Multiplicative Controller Fusion approach on the task of robot navigation and demonstrate safe transfer from simulation to the real world without any fine-tuning. The code for this project is made publicly available at https://sites.google.com/view/mcf-nav/home
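
The abstract does not spell out the fusion rule, so the following is only a minimal sketch under stated assumptions: that the learned policy and the prior controller each output a one-dimensional Gaussian over actions, and that "multiplicative fusion" means taking the normalized product of the two densities. The function name `fuse_gaussians` and all numbers are hypothetical illustrations, not taken from the paper or its code release.

```python
import math


def fuse_gaussians(mu_policy, sigma_policy, mu_prior, sigma_prior):
    """Normalized product of two 1-D Gaussians.

    The product of two Gaussian densities is again Gaussian; its mean is
    the precision-weighted average of the two input means.
    (Assumed fusion rule; hypothetical helper for illustration only.)
    """
    prec_policy = 1.0 / sigma_policy ** 2
    prec_prior = 1.0 / sigma_prior ** 2
    var = 1.0 / (prec_policy + prec_prior)
    mu = var * (prec_policy * mu_policy + prec_prior * mu_prior)
    return mu, math.sqrt(var)


# Confident policy (small sigma): the fused action stays near the policy's mean.
mu_conf, sigma_conf = fuse_gaussians(1.0, 0.1, 0.0, 1.0)

# Uncertain policy (large sigma): the fused action falls back toward the prior's mean.
mu_unc, sigma_unc = fuse_gaussians(1.0, 10.0, 0.0, 1.0)

print(f"confident policy -> fused mean {mu_conf:.3f}")
print(f"uncertain policy -> fused mean {mu_unc:.3f}")
```

Under these assumptions, the fallback behaviour described in the abstract needs no explicit switch: a policy that reports high uncertainty contributes little precision, so the fused action is dominated by the prior controller.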
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2003.05117, https://arxiv.org/pdf/2003.05117
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4287827453
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4287827453 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2003.05117 (Digital Object Identifier)
- Title: Multiplicative Controller Fusion: Leveraging Algorithmic Priors for Sample-efficient Reinforcement Learning and Safe Sim-To-Real Transfer (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2020 (year of publication)
- Publication date: 2020-03-11 (full publication date if available)
- Authors: Krishan Rana, Vibhavari Dasagi, Ben Talbot, Michael Milford, Niko Sünderhauf (list of authors in order)
- Landing page: https://arxiv.org/abs/2003.05117 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2003.05117 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2003.05117 (direct OA link when available)
- Concepts: Reinforcement learning, Leverage (statistics), Computer science, Software deployment, Artificial intelligence, Robotics, Robot, Controller (irrigation), Transfer of learning, Sample (material), Prior probability, Machine learning, Task (project management), Multiplicative function, Bayesian probability, Engineering, Operating system, Chromatography, Agronomy, Mathematics, Chemistry, Systems engineering, Mathematical analysis, Biology (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 0 (total citation count in OpenAlex)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4287827453 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2003.05117 |
| ids.openalex | https://openalex.org/W4287827453 |
| fwci | 0.0 |
| type | preprint |
| title | Multiplicative Controller Fusion: Leveraging Algorithmic Priors for Sample-efficient Reinforcement Learning and Safe Sim-To-Real Transfer |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10462 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9925000071525574 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Reinforcement Learning in Robotics |
| topics[1].id | https://openalex.org/T11689 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9785000085830688 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Adversarial Robustness in Machine Learning |
| topics[2].id | https://openalex.org/T12814 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9628000259399414 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1702 |
| topics[2].subfield.display_name | Artificial Intelligence |
| topics[2].display_name | Gaussian Processes and Bayesian Inference |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C97541855 |
| concepts[0].level | 2 |
| concepts[0].score | 0.8168928623199463 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q830687 |
| concepts[0].display_name | Reinforcement learning |
| concepts[1].id | https://openalex.org/C153083717 |
| concepts[1].level | 2 |
| concepts[1].score | 0.7460945248603821 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q6535263 |
| concepts[1].display_name | Leverage (statistics) |
| concepts[2].id | https://openalex.org/C41008148 |
| concepts[2].level | 0 |
| concepts[2].score | 0.7272384762763977 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[2].display_name | Computer science |
| concepts[3].id | https://openalex.org/C105339364 |
| concepts[3].level | 2 |
| concepts[3].score | 0.6316240429878235 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q2297740 |
| concepts[3].display_name | Software deployment |
| concepts[4].id | https://openalex.org/C154945302 |
| concepts[4].level | 1 |
| concepts[4].score | 0.6192331910133362 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[4].display_name | Artificial intelligence |
| concepts[5].id | https://openalex.org/C34413123 |
| concepts[5].level | 3 |
| concepts[5].score | 0.5419933795928955 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q170978 |
| concepts[5].display_name | Robotics |
| concepts[6].id | https://openalex.org/C90509273 |
| concepts[6].level | 2 |
| concepts[6].score | 0.5385355353355408 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q11012 |
| concepts[6].display_name | Robot |
| concepts[7].id | https://openalex.org/C203479927 |
| concepts[7].level | 2 |
| concepts[7].score | 0.4826543927192688 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q5165939 |
| concepts[7].display_name | Controller (irrigation) |
| concepts[8].id | https://openalex.org/C150899416 |
| concepts[8].level | 2 |
| concepts[8].score | 0.47021105885505676 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q1820378 |
| concepts[8].display_name | Transfer of learning |
| concepts[9].id | https://openalex.org/C198531522 |
| concepts[9].level | 2 |
| concepts[9].score | 0.46649500727653503 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q485146 |
| concepts[9].display_name | Sample (material) |
| concepts[10].id | https://openalex.org/C177769412 |
| concepts[10].level | 3 |
| concepts[10].score | 0.464738130569458 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q278090 |
| concepts[10].display_name | Prior probability |
| concepts[11].id | https://openalex.org/C119857082 |
| concepts[11].level | 1 |
| concepts[11].score | 0.4620271325111389 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q2539 |
| concepts[11].display_name | Machine learning |
| concepts[12].id | https://openalex.org/C2780451532 |
| concepts[12].level | 2 |
| concepts[12].score | 0.4461824595928192 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q759676 |
| concepts[12].display_name | Task (project management) |
| concepts[13].id | https://openalex.org/C42747912 |
| concepts[13].level | 2 |
| concepts[13].score | 0.4160654544830322 |
| concepts[13].wikidata | https://www.wikidata.org/wiki/Q1048447 |
| concepts[13].display_name | Multiplicative function |
| concepts[14].id | https://openalex.org/C107673813 |
| concepts[14].level | 2 |
| concepts[14].score | 0.3114786148071289 |
| concepts[14].wikidata | https://www.wikidata.org/wiki/Q812534 |
| concepts[14].display_name | Bayesian probability |
| concepts[15].id | https://openalex.org/C127413603 |
| concepts[15].level | 0 |
| concepts[15].score | 0.11840629577636719 |
| concepts[15].wikidata | https://www.wikidata.org/wiki/Q11023 |
| concepts[15].display_name | Engineering |
| concepts[16].id | https://openalex.org/C111919701 |
| concepts[16].level | 1 |
| concepts[16].score | 0.0 |
| concepts[16].wikidata | https://www.wikidata.org/wiki/Q9135 |
| concepts[16].display_name | Operating system |
| concepts[17].id | https://openalex.org/C43617362 |
| concepts[17].level | 1 |
| concepts[17].score | 0.0 |
| concepts[17].wikidata | https://www.wikidata.org/wiki/Q170050 |
| concepts[17].display_name | Chromatography |
| concepts[18].id | https://openalex.org/C6557445 |
| concepts[18].level | 1 |
| concepts[18].score | 0.0 |
| concepts[18].wikidata | https://www.wikidata.org/wiki/Q173113 |
| concepts[18].display_name | Agronomy |
| concepts[19].id | https://openalex.org/C33923547 |
| concepts[19].level | 0 |
| concepts[19].score | 0.0 |
| concepts[19].wikidata | https://www.wikidata.org/wiki/Q395 |
| concepts[19].display_name | Mathematics |
| concepts[20].id | https://openalex.org/C185592680 |
| concepts[20].level | 0 |
| concepts[20].score | 0.0 |
| concepts[20].wikidata | https://www.wikidata.org/wiki/Q2329 |
| concepts[20].display_name | Chemistry |
| concepts[21].id | https://openalex.org/C201995342 |
| concepts[21].level | 1 |
| concepts[21].score | 0.0 |
| concepts[21].wikidata | https://www.wikidata.org/wiki/Q682496 |
| concepts[21].display_name | Systems engineering |
| concepts[22].id | https://openalex.org/C134306372 |
| concepts[22].level | 1 |
| concepts[22].score | 0.0 |
| concepts[22].wikidata | https://www.wikidata.org/wiki/Q7754 |
| concepts[22].display_name | Mathematical analysis |
| concepts[23].id | https://openalex.org/C86803240 |
| concepts[23].level | 0 |
| concepts[23].score | 0.0 |
| concepts[23].wikidata | https://www.wikidata.org/wiki/Q420 |
| concepts[23].display_name | Biology |
| keywords[0].id | https://openalex.org/keywords/reinforcement-learning |
| keywords[0].score | 0.8168928623199463 |
| keywords[0].display_name | Reinforcement learning |
| keywords[1].id | https://openalex.org/keywords/leverage |
| keywords[1].score | 0.7460945248603821 |
| keywords[1].display_name | Leverage (statistics) |
| keywords[2].id | https://openalex.org/keywords/computer-science |
| keywords[2].score | 0.7272384762763977 |
| keywords[2].display_name | Computer science |
| keywords[3].id | https://openalex.org/keywords/software-deployment |
| keywords[3].score | 0.6316240429878235 |
| keywords[3].display_name | Software deployment |
| keywords[4].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[4].score | 0.6192331910133362 |
| keywords[4].display_name | Artificial intelligence |
| keywords[5].id | https://openalex.org/keywords/robotics |
| keywords[5].score | 0.5419933795928955 |
| keywords[5].display_name | Robotics |
| keywords[6].id | https://openalex.org/keywords/robot |
| keywords[6].score | 0.5385355353355408 |
| keywords[6].display_name | Robot |
| keywords[7].id | https://openalex.org/keywords/controller |
| keywords[7].score | 0.4826543927192688 |
| keywords[7].display_name | Controller (irrigation) |
| keywords[8].id | https://openalex.org/keywords/transfer-of-learning |
| keywords[8].score | 0.47021105885505676 |
| keywords[8].display_name | Transfer of learning |
| keywords[9].id | https://openalex.org/keywords/sample |
| keywords[9].score | 0.46649500727653503 |
| keywords[9].display_name | Sample (material) |
| keywords[10].id | https://openalex.org/keywords/prior-probability |
| keywords[10].score | 0.464738130569458 |
| keywords[10].display_name | Prior probability |
| keywords[11].id | https://openalex.org/keywords/machine-learning |
| keywords[11].score | 0.4620271325111389 |
| keywords[11].display_name | Machine learning |
| keywords[12].id | https://openalex.org/keywords/task |
| keywords[12].score | 0.4461824595928192 |
| keywords[12].display_name | Task (project management) |
| keywords[13].id | https://openalex.org/keywords/multiplicative-function |
| keywords[13].score | 0.4160654544830322 |
| keywords[13].display_name | Multiplicative function |
| keywords[14].id | https://openalex.org/keywords/bayesian-probability |
| keywords[14].score | 0.3114786148071289 |
| keywords[14].display_name | Bayesian probability |
| keywords[15].id | https://openalex.org/keywords/engineering |
| keywords[15].score | 0.11840629577636719 |
| keywords[15].display_name | Engineering |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2003.05117 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2003.05117 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2003.05117 |
| indexed_in | arxiv |
| authorships[0].author.id | https://openalex.org/A5001506562 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-9028-9295 |
| authorships[0].author.display_name | Krishan Rana |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Rana, Krishan |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5012909623 |
| authorships[1].author.orcid | |
| authorships[1].author.display_name | Vibhavari Dasagi |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Dasagi, Vibhavari |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5055734658 |
| authorships[2].author.orcid | https://orcid.org/0000-0001-7029-7813 |
| authorships[2].author.display_name | Ben Talbot |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Talbot, Ben |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5078340555 |
| authorships[3].author.orcid | https://orcid.org/0000-0002-5162-1793 |
| authorships[3].author.display_name | Michael Milford |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Milford, Michael |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5034957065 |
| authorships[4].author.orcid | https://orcid.org/0000-0001-5286-3789 |
| authorships[4].author.display_name | Niko Sünderhauf |
| authorships[4].author_position | last |
| authorships[4].raw_author_name | Sünderhauf, Niko |
| authorships[4].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2003.05117 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2022-07-26T00:00:00 |
| display_name | Multiplicative Controller Fusion: Leveraging Algorithmic Priors for Sample-efficient Reinforcement Learning and Safe Sim-To-Real Transfer |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T03:46:38.306776 |
| primary_topic.id | https://openalex.org/T10462 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9925000071525574 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Reinforcement Learning in Robotics |
| related_works | https://openalex.org/W2580650124, https://openalex.org/W4386190339, https://openalex.org/W2968424575, https://openalex.org/W3142333283, https://openalex.org/W3122088529, https://openalex.org/W3041320102, https://openalex.org/W2111669074, https://openalex.org/W2085259108, https://openalex.org/W3123087812, https://openalex.org/W2063076820 |
| cited_by_count | 0 |
| locations_count | 1 |
| best_oa_location.id | pmh:oai:arXiv.org:2003.05117 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2003.05117 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2003.05117 |
| primary_location.id | pmh:oai:arXiv.org:2003.05117 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2003.05117 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2003.05117 |
| publication_date | 2020-03-11 |
| publication_year | 2020 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 23, 35, 101, 106 |
| abstract_inverted_index.We | 33, 121 |
| abstract_inverted_index.an | 47 |
| abstract_inverted_index.as | 46 |
| abstract_inverted_index.at | 156 |
| abstract_inverted_index.be | 19 |
| abstract_inverted_index.by | 112 |
| abstract_inverted_index.in | 9, 119 |
| abstract_inverted_index.is | 29, 93, 153 |
| abstract_inverted_index.of | 66, 86, 125, 133 |
| abstract_inverted_index.on | 15 |
| abstract_inverted_index.to | 27, 37, 62, 81, 108, 115 |
| abstract_inverted_index.The | 148 |
| abstract_inverted_index.and | 21, 52, 70, 136 |
| abstract_inverted_index.any | 146 |
| abstract_inverted_index.can | 18, 42 |
| abstract_inverted_index.for | 104, 150 |
| abstract_inverted_index.our | 55, 126 |
| abstract_inverted_index.the | 60, 78, 84, 87, 98, 109, 116, 123, 131, 142 |
| abstract_inverted_index.back | 114 |
| abstract_inverted_index.code | 149 |
| abstract_inverted_index.from | 72, 140 |
| abstract_inverted_index.many | 7 |
| abstract_inverted_index.real | 110, 143 |
| abstract_inverted_index.safe | 138 |
| abstract_inverted_index.show | 122 |
| abstract_inverted_index.task | 132 |
| abstract_inverted_index.that | 41 |
| abstract_inverted_index.this | 151 |
| abstract_inverted_index.gated | 56 |
| abstract_inverted_index.guide | 63 |
| abstract_inverted_index.often | 2 |
| abstract_inverted_index.prior | 49, 61, 89, 117 |
| abstract_inverted_index.robot | 134 |
| abstract_inverted_index.since | 90 |
| abstract_inverted_index.still | 30 |
| abstract_inverted_index.tasks | 14 |
| abstract_inverted_index.world | 111, 144 |
| abstract_inverted_index.During | 96 |
| abstract_inverted_index.Fusion | 129 |
| abstract_inverted_index.beyond | 83 |
| abstract_inverted_index.during | 50 |
| abstract_inverted_index.fusion | 57 |
| abstract_inverted_index.policy | 25, 79 |
| abstract_inverted_index.reward | 75 |
| abstract_inverted_index.sparse | 73 |
| abstract_inverted_index.stages | 65 |
| abstract_inverted_index.enables | 59 |
| abstract_inverted_index.falling | 113 |
| abstract_inverted_index.improve | 82 |
| abstract_inverted_index.learned | 24 |
| abstract_inverted_index.present | 34 |
| abstract_inverted_index.project | 152 |
| abstract_inverted_index.reality | 28 |
| abstract_inverted_index.without | 145 |
| abstract_inverted_index.However, | 11 |
| abstract_inverted_index.annealed | 94 |
| abstract_inverted_index.approach | 58 |
| abstract_inverted_index.efficacy | 124 |
| abstract_inverted_index.hardware | 17 |
| abstract_inverted_index.learning | 12, 40 |
| abstract_inverted_index.leverage | 43 |
| abstract_inverted_index.problems | 8 |
| abstract_inverted_index.provides | 100 |
| abstract_inverted_index.reliable | 102 |
| abstract_inverted_index.signals. | 76 |
| abstract_inverted_index.strategy | 103 |
| abstract_inverted_index.training | 51 |
| abstract_inverted_index.transfer | 139 |
| abstract_inverted_index.available | 155 |
| abstract_inverted_index.extremely | 31 |
| abstract_inverted_index.influence | 92 |
| abstract_inverted_index.robotics. | 10 |
| abstract_inverted_index.solutions | 45 |
| abstract_inverted_index.training, | 54 |
| abstract_inverted_index.Controller | 128 |
| abstract_inverted_index.approaches | 1 |
| abstract_inverted_index.can\nlearn | 80 |
| abstract_inverted_index.controller | 118 |
| abstract_inverted_index.gradually. | 95 |
| abstract_inverted_index.hand-coded | 4 |
| abstract_inverted_index.increasing | 68 |
| abstract_inverted_index.model-free | 38 |
| abstract_inverted_index.navigation | 135 |
| abstract_inverted_index.outperform | 3 |
| abstract_inverted_index.algorithmic | 5, 48 |
| abstract_inverted_index.demonstrate | 137 |
| abstract_inverted_index.deployment, | 97 |
| abstract_inverted_index.performance | 85 |
| abstract_inverted_index.real\nrobot | 16 |
| abstract_inverted_index.sub-optimal | 88 |
| abstract_inverted_index.Importantly, | 77 |
| abstract_inverted_index.approach\non | 130 |
| abstract_inverted_index.challenging. | 32 |
| abstract_inverted_index.exploration, | 67 |
| abstract_inverted_index.fine-tuning. | 147 |
| abstract_inverted_index.intractable, | 20 |
| abstract_inverted_index.long-horizon | 13, 74 |
| abstract_inverted_index.the\ninitial | 64 |
| abstract_inverted_index.the\nprior's | 91 |
| abstract_inverted_index.transferring | 22, 105 |
| abstract_inverted_index.reinforcement | 39 |
| abstract_inverted_index.Learning-based | 0 |
| abstract_inverted_index.Multiplicative | 127 |
| abstract_inverted_index.made\npublicly | 154 |
| abstract_inverted_index.simulation\nto | 141 |
| abstract_inverted_index.solutions\nfor | 6 |
| abstract_inverted_index.novel\napproach | 36 |
| abstract_inverted_index.from\nsimulation | 26 |
| abstract_inverted_index.sample-efficiency | 69 |
| abstract_inverted_index.enabling\nlearning | 71 |
| abstract_inverted_index.uncertain\nstates. | 120 |
| abstract_inverted_index.deployment.\nDuring | 53 |
| abstract_inverted_index.existing\nsub-optimal | 44 |
| abstract_inverted_index.policy's\nuncertainty | 99 |
| abstract_inverted_index.simulation-trained\npolicy | 107 |
| abstract_inverted_index.https://sites.google.com/view/mcf-nav/home\n | 157 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 5 |
| sustainable_development_goals[0].id | https://metadata.un.org/sdg/17 |
| sustainable_development_goals[0].score | 0.4300000071525574 |
| sustainable_development_goals[0].display_name | Partnerships for the goals |
| citation_normalized_percentile.value | 0.29899756 |
| citation_normalized_percentile.is_in_top_1_percent | False |
| citation_normalized_percentile.is_in_top_10_percent | False |
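
The `abstract_inverted_index.*` rows in the table above store the abstract as a map from each token to the word positions where it occurs. As a minimal sketch of how such an index can be turned back into running text, the snippet below places every token at its positions and joins them in order. The function name `rebuild_abstract` is a hypothetical helper, and `sample` is a hand-copied excerpt covering only positions 0 to 6 of the index, so it is not the full payload.

```python
def rebuild_abstract(inverted_index):
    """Rebuild text from a token -> [positions] mapping.

    Hypothetical helper for illustration; it assumes every position
    is claimed by exactly one token.
    """
    positions = {}
    for token, indexes in inverted_index.items():
        for i in indexes:
            positions[i] = token
    ordered = " ".join(positions[i] for i in sorted(positions))
    # Some tokens carry an embedded line break (e.g. "solutions\nfor");
    # splitting on whitespace and re-joining normalizes those.
    return " ".join(ordered.split())


# Excerpt of the index above, truncated to positions 0..6.
sample = {
    "Learning-based": [0],
    "approaches": [1],
    "often": [2],
    "outperform": [3],
    "hand-coded": [4],
    "algorithmic": [5],
    "solutions\nfor": [6],
}

text = rebuild_abstract(sample)
print(text)
# prints: Learning-based approaches often outperform hand-coded algorithmic solutions for
```

Applied to the complete index, the same loop should reproduce the abstract shown at the top of this page, with the embedded line breaks collapsed to single spaces.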