Guiding Reinforcement Learning Exploration Using Natural Language
Brent Harrison, Upol Ehsan, Mark Riedl
2017 · Open Access · DOI: https://doi.org/10.48550/arxiv.1707.08616
In this work we present a technique to use natural language to help reinforcement learning generalize to unseen environments. This technique uses neural machine translation, specifically the use of encoder-decoder networks, to learn associations between natural language behavior descriptions and state-action information. We then use this learned model to guide agent exploration using a modified version of policy shaping to make it more effective at learning in unseen environments. We evaluate this technique using the popular arcade game, Frogger, under ideal and non-ideal conditions. This evaluation shows that our modified policy shaping algorithm improves over a Q-learning agent as well as a baseline version of policy shaping.
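The abstract does not spell out the modified policy shaping algorithm, so the following is only a minimal sketch of the general idea behind policy shaping as it is commonly described: the agent's own action distribution (here a Boltzmann distribution over Q-values) is multiplied by an external guidance distribution and renormalized. The names `shaped_policy`, `guidance_probs`, and `q_update` are hypothetical, and the guidance vector stands in for whatever the learned language model would supply; none of this is taken from the paper itself.

```python
import math


def softmax(values, temperature=1.0):
    """Boltzmann distribution over a list of Q-values."""
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp((v - m) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def shaped_policy(q_values, guidance_probs, temperature=1.0):
    """Combine the agent's distribution with guidance by multiplying and renormalizing.

    guidance_probs is an assumed per-action probability vector, e.g. derived
    from a model that maps language descriptions to state-action information.
    """
    agent_probs = softmax(q_values, temperature)
    combined = [a * g for a, g in zip(agent_probs, guidance_probs)]
    total = sum(combined)
    if total == 0.0:
        # Guidance rules out every action: fall back to the agent's own policy.
        return agent_probs
    return [c / total for c in combined]


def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update, applied in place."""
    best_next = max(q_table[next_state])
    td_error = reward + gamma * best_next - q_table[state][action]
    q_table[state][action] += alpha * td_error


# Usage: with uninformative Q-values, the shaped policy follows the guidance.
probs = shaped_policy([0.0, 0.0, 0.0, 0.0], [0.7, 0.1, 0.1, 0.1])
```

In this sketch the guidance only biases which actions are explored; the Q-values are still learned from environment reward, which is one plausible reading of "guiding exploration" rather than a statement of the authors' design.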
Related Topics
Concepts
Reinforcement learning
Computer science
Artificial intelligence
Natural language
Ideal (ethics)
Natural (archaeology)
Machine translation
Baseline (sea)
State (computer science)
Action (physics)
Natural language understanding
Machine learning
Philosophy
Physics
History
Epistemology
Quantum mechanics
Geology
Archaeology
Algorithm
Oceanography
Metadata
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/1707.08616, https://arxiv.org/pdf/1707.08616
- OA Status: green
- Cited By: 3
- Related Works: 20
- OpenAlex ID: https://openalex.org/W2739490129
All OpenAlex metadata
- OpenAlex ID: https://openalex.org/W2739490129 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.1707.08616 (Digital Object Identifier)
- Title: Guiding Reinforcement Learning Exploration Using Natural Language (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2017 (year of publication)
- Publication date: 2017-07-26 (full publication date if available)
- Authors: Brent Harrison, Upol Ehsan, Mark Riedl (list of authors in order)
- Landing page: https://arxiv.org/abs/1707.08616 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/1707.08616 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/1707.08616 (direct OA link when available)
- Concepts: Reinforcement learning, Computer science, Artificial intelligence, Natural language, Ideal (ethics), Natural (archaeology), Machine translation, Baseline (sea), State (computer science), Action (physics), Natural language understanding, Machine learning, Philosophy, Physics, History, Epistemology, Quantum mechanics, Geology, Archaeology, Algorithm, Oceanography (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 3 (total citation count in OpenAlex)
- Citations by year (recent): 2024: 1, 2023: 1, 2021: 1 (per-year citation counts, last 5 years)
- Related works (count): 20 (other works algorithmically related by OpenAlex)
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W2739490129 |
| doi | https://doi.org/10.48550/arxiv.1707.08616 |
| ids.doi | https://doi.org/10.48550/arxiv.1707.08616 |
| ids.mag | 2739490129 |
| ids.openalex | https://openalex.org/W2739490129 |
| fwci | |
| type | preprint |
| title | Guiding Reinforcement Learning Exploration Using Natural Language |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | 1958 |
| biblio.first_page | 1956 |
| topics[0].id | https://openalex.org/T10462 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9735999703407288 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Reinforcement Learning in Robotics |
| topics[1].id | https://openalex.org/T11975 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.932200014591217 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Evolutionary Algorithms and Applications |
| topics[2].id | https://openalex.org/T11574 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9203000068664551 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1702 |
| topics[2].subfield.display_name | Artificial Intelligence |
| topics[2].display_name | Artificial Intelligence in Games |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C97541855 |
| concepts[0].level | 2 |
| concepts[0].score | 0.8919489979743958 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q830687 |
| concepts[0].display_name | Reinforcement learning |
| concepts[1].id | https://openalex.org/C41008148 |
| concepts[1].level | 0 |
| concepts[1].score | 0.7664059996604919 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[1].display_name | Computer science |
| concepts[2].id | https://openalex.org/C154945302 |
| concepts[2].level | 1 |
| concepts[2].score | 0.6387267708778381 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[2].display_name | Artificial intelligence |
| concepts[3].id | https://openalex.org/C195324797 |
| concepts[3].level | 2 |
| concepts[3].score | 0.5892857313156128 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q33742 |
| concepts[3].display_name | Natural language |
| concepts[4].id | https://openalex.org/C2776639384 |
| concepts[4].level | 2 |
| concepts[4].score | 0.5797101855278015 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q840396 |
| concepts[4].display_name | Ideal (ethics) |
| concepts[5].id | https://openalex.org/C2776608160 |
| concepts[5].level | 2 |
| concepts[5].score | 0.4884618818759918 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q4785462 |
| concepts[5].display_name | Natural (archaeology) |
| concepts[6].id | https://openalex.org/C203005215 |
| concepts[6].level | 2 |
| concepts[6].score | 0.47669073939323425 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q79798 |
| concepts[6].display_name | Machine translation |
| concepts[7].id | https://openalex.org/C12725497 |
| concepts[7].level | 2 |
| concepts[7].score | 0.4495127499103546 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q810247 |
| concepts[7].display_name | Baseline (sea) |
| concepts[8].id | https://openalex.org/C48103436 |
| concepts[8].level | 2 |
| concepts[8].score | 0.43249616026878357 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q599031 |
| concepts[8].display_name | State (computer science) |
| concepts[9].id | https://openalex.org/C2780791683 |
| concepts[9].level | 2 |
| concepts[9].score | 0.4249236583709717 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q846785 |
| concepts[9].display_name | Action (physics) |
| concepts[10].id | https://openalex.org/C2779439875 |
| concepts[10].level | 3 |
| concepts[10].score | 0.4222482442855835 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q1078276 |
| concepts[10].display_name | Natural language understanding |
| concepts[11].id | https://openalex.org/C119857082 |
| concepts[11].level | 1 |
| concepts[11].score | 0.41349172592163086 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q2539 |
| concepts[11].display_name | Machine learning |
| concepts[12].id | https://openalex.org/C138885662 |
| concepts[12].level | 0 |
| concepts[12].score | 0.0 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q5891 |
| concepts[12].display_name | Philosophy |
| concepts[13].id | https://openalex.org/C121332964 |
| concepts[13].level | 0 |
| concepts[13].score | 0.0 |
| concepts[13].wikidata | https://www.wikidata.org/wiki/Q413 |
| concepts[13].display_name | Physics |
| concepts[14].id | https://openalex.org/C95457728 |
| concepts[14].level | 0 |
| concepts[14].score | 0.0 |
| concepts[14].wikidata | https://www.wikidata.org/wiki/Q309 |
| concepts[14].display_name | History |
| concepts[15].id | https://openalex.org/C111472728 |
| concepts[15].level | 1 |
| concepts[15].score | 0.0 |
| concepts[15].wikidata | https://www.wikidata.org/wiki/Q9471 |
| concepts[15].display_name | Epistemology |
| concepts[16].id | https://openalex.org/C62520636 |
| concepts[16].level | 1 |
| concepts[16].score | 0.0 |
| concepts[16].wikidata | https://www.wikidata.org/wiki/Q944 |
| concepts[16].display_name | Quantum mechanics |
| concepts[17].id | https://openalex.org/C127313418 |
| concepts[17].level | 0 |
| concepts[17].score | 0.0 |
| concepts[17].wikidata | https://www.wikidata.org/wiki/Q1069 |
| concepts[17].display_name | Geology |
| concepts[18].id | https://openalex.org/C166957645 |
| concepts[18].level | 1 |
| concepts[18].score | 0.0 |
| concepts[18].wikidata | https://www.wikidata.org/wiki/Q23498 |
| concepts[18].display_name | Archaeology |
| concepts[19].id | https://openalex.org/C11413529 |
| concepts[19].level | 1 |
| concepts[19].score | 0.0 |
| concepts[19].wikidata | https://www.wikidata.org/wiki/Q8366 |
| concepts[19].display_name | Algorithm |
| concepts[20].id | https://openalex.org/C111368507 |
| concepts[20].level | 1 |
| concepts[20].score | 0.0 |
| concepts[20].wikidata | https://www.wikidata.org/wiki/Q43518 |
| concepts[20].display_name | Oceanography |
| keywords[0].id | https://openalex.org/keywords/reinforcement-learning |
| keywords[0].score | 0.8919489979743958 |
| keywords[0].display_name | Reinforcement learning |
| keywords[1].id | https://openalex.org/keywords/computer-science |
| keywords[1].score | 0.7664059996604919 |
| keywords[1].display_name | Computer science |
| keywords[2].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[2].score | 0.6387267708778381 |
| keywords[2].display_name | Artificial intelligence |
| keywords[3].id | https://openalex.org/keywords/natural-language |
| keywords[3].score | 0.5892857313156128 |
| keywords[3].display_name | Natural language |
| keywords[4].id | https://openalex.org/keywords/ideal |
| keywords[4].score | 0.5797101855278015 |
| keywords[4].display_name | Ideal (ethics) |
| keywords[5].id | https://openalex.org/keywords/natural |
| keywords[5].score | 0.4884618818759918 |
| keywords[5].display_name | Natural (archaeology) |
| keywords[6].id | https://openalex.org/keywords/machine-translation |
| keywords[6].score | 0.47669073939323425 |
| keywords[6].display_name | Machine translation |
| keywords[7].id | https://openalex.org/keywords/baseline |
| keywords[7].score | 0.4495127499103546 |
| keywords[7].display_name | Baseline (sea) |
| keywords[8].id | https://openalex.org/keywords/state |
| keywords[8].score | 0.43249616026878357 |
| keywords[8].display_name | State (computer science) |
| keywords[9].id | https://openalex.org/keywords/action |
| keywords[9].score | 0.4249236583709717 |
| keywords[9].display_name | Action (physics) |
| keywords[10].id | https://openalex.org/keywords/natural-language-understanding |
| keywords[10].score | 0.4222482442855835 |
| keywords[10].display_name | Natural language understanding |
| keywords[11].id | https://openalex.org/keywords/machine-learning |
| keywords[11].score | 0.41349172592163086 |
| keywords[11].display_name | Machine learning |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:1707.08616 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/1707.08616 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/1707.08616 |
| locations[1].id | mag:2739490129 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | submittedVersion |
| locations[1].raw_type | |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | False |
| locations[1].raw_source_name | arXiv (Cornell University) |
| locations[1].landing_page_url | https://arxiv.org/pdf/1707.08616.pdf |
| locations[2].id | doi:10.48550/arxiv.1707.08616 |
| locations[2].is_oa | True |
| locations[2].source.id | https://openalex.org/S4306400194 |
| locations[2].source.issn | |
| locations[2].source.type | repository |
| locations[2].source.is_oa | True |
| locations[2].source.issn_l | |
| locations[2].source.is_core | False |
| locations[2].source.is_in_doaj | False |
| locations[2].source.display_name | arXiv (Cornell University) |
| locations[2].source.host_organization | https://openalex.org/I205783295 |
| locations[2].source.host_organization_name | Cornell University |
| locations[2].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[2].license | |
| locations[2].pdf_url | |
| locations[2].version | |
| locations[2].raw_type | article |
| locations[2].license_id | |
| locations[2].is_accepted | False |
| locations[2].is_published | |
| locations[2].raw_source_name | |
| locations[2].landing_page_url | https://doi.org/10.48550/arxiv.1707.08616 |
| locations[3].id | mag:2963279277 |
| locations[3].is_oa | True |
| locations[3].source.id | https://openalex.org/S4306400194 |
| locations[3].source.issn | |
| locations[3].source.type | repository |
| locations[3].source.is_oa | True |
| locations[3].source.issn_l | |
| locations[3].source.is_core | False |
| locations[3].source.is_in_doaj | False |
| locations[3].source.display_name | arXiv (Cornell University) |
| locations[3].source.host_organization | https://openalex.org/I205783295 |
| locations[3].source.host_organization_name | Cornell University |
| locations[3].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[3].license | |
| locations[3].pdf_url | |
| locations[3].version | |
| locations[3].raw_type | |
| locations[3].license_id | |
| locations[3].is_accepted | False |
| locations[3].is_published | |
| locations[3].raw_source_name | arXiv (Cornell University) |
| locations[3].landing_page_url | https://arxiv.org/abs/1707.08616 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5047163199 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-1301-5928 |
| authorships[0].author.display_name | Brent Harrison |
| authorships[0].countries | US |
| authorships[0].affiliations[0].institution_ids | https://openalex.org/I143302722 |
| authorships[0].affiliations[0].raw_affiliation_string | UNIVERSITY OF KENTUCKY, LEXINGTON, KY, USA |
| authorships[0].institutions[0].id | https://openalex.org/I143302722 |
| authorships[0].institutions[0].ror | https://ror.org/02k3smh20 |
| authorships[0].institutions[0].type | education |
| authorships[0].institutions[0].lineage | https://openalex.org/I143302722 |
| authorships[0].institutions[0].country_code | US |
| authorships[0].institutions[0].display_name | University of Kentucky |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Brent Harrison |
| authorships[0].is_corresponding | False |
| authorships[0].raw_affiliation_strings | UNIVERSITY OF KENTUCKY, LEXINGTON, KY, USA |
| authorships[1].author.id | https://openalex.org/A5010875544 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-4911-0409 |
| authorships[1].author.display_name | Upol Ehsan |
| authorships[1].countries | US |
| authorships[1].affiliations[0].institution_ids | https://openalex.org/I130701444 |
| authorships[1].affiliations[0].raw_affiliation_string | Georgia Institute of Technology, Atlanta, GA (USA) |
| authorships[1].institutions[0].id | https://openalex.org/I130701444 |
| authorships[1].institutions[0].ror | https://ror.org/01zkghx44 |
| authorships[1].institutions[0].type | education |
| authorships[1].institutions[0].lineage | https://openalex.org/I130701444 |
| authorships[1].institutions[0].country_code | US |
| authorships[1].institutions[0].display_name | Georgia Institute of Technology |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Upol Ehsan |
| authorships[1].is_corresponding | False |
| authorships[1].raw_affiliation_strings | Georgia Institute of Technology, Atlanta, GA (USA) |
| authorships[2].author.id | https://openalex.org/A5061883150 |
| authorships[2].author.orcid | https://orcid.org/0000-0001-5283-6588 |
| authorships[2].author.display_name | Mark Riedl |
| authorships[2].countries | US |
| authorships[2].affiliations[0].institution_ids | https://openalex.org/I130701444 |
| authorships[2].affiliations[0].raw_affiliation_string | Georgia Institute of Technology, Atlanta, GA (USA) |
| authorships[2].institutions[0].id | https://openalex.org/I130701444 |
| authorships[2].institutions[0].ror | https://ror.org/01zkghx44 |
| authorships[2].institutions[0].type | education |
| authorships[2].institutions[0].lineage | https://openalex.org/I130701444 |
| authorships[2].institutions[0].country_code | US |
| authorships[2].institutions[0].display_name | Georgia Institute of Technology |
| authorships[2].author_position | last |
| authorships[2].raw_author_name | Mark O. Riedl |
| authorships[2].is_corresponding | False |
| authorships[2].raw_affiliation_strings | Georgia Institute of Technology, Atlanta, GA (USA) |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/1707.08616 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Guiding Reinforcement Learning Exploration Using Natural Language |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10462 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9735999703407288 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Reinforcement Learning in Robotics |
| related_works | https://openalex.org/W2963279277, https://openalex.org/W2095779990, https://openalex.org/W2785948534, https://openalex.org/W2494202692, https://openalex.org/W2132073301, https://openalex.org/W2896921068, https://openalex.org/W3101177506, https://openalex.org/W1604959332, https://openalex.org/W2889681745, https://openalex.org/W3094234915, https://openalex.org/W2510924756, https://openalex.org/W3105781833, https://openalex.org/W2020573190, https://openalex.org/W3116007007, https://openalex.org/W2372307837, https://openalex.org/W3132615417, https://openalex.org/W2807193538, https://openalex.org/W3042933240, https://openalex.org/W2946234491, https://openalex.org/W3208231292 |
| cited_by_count | 3 |
| counts_by_year[0].year | 2024 |
| counts_by_year[0].cited_by_count | 1 |
| counts_by_year[1].year | 2023 |
| counts_by_year[1].cited_by_count | 1 |
| counts_by_year[2].year | 2021 |
| counts_by_year[2].cited_by_count | 1 |
| locations_count | 4 |
| best_oa_location.id | pmh:oai:arXiv.org:1707.08616 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/1707.08616 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/1707.08616 |
| primary_location.id | pmh:oai:arXiv.org:1707.08616 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/1707.08616 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/1707.08616 |
| publication_date | 2017-07-26 |
| publication_year | 2017 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 5, 53, 95, 101 |
| abstract_inverted_index.In | 0 |
| abstract_inverted_index.We | 42, 69 |
| abstract_inverted_index.as | 98, 100 |
| abstract_inverted_index.at | 64 |
| abstract_inverted_index.in | 66 |
| abstract_inverted_index.it | 61 |
| abstract_inverted_index.of | 28, 56, 104 |
| abstract_inverted_index.to | 7, 11, 16, 31, 48, 59 |
| abstract_inverted_index.we | 3 |
| abstract_inverted_index.and | 39, 81 |
| abstract_inverted_index.our | 88 |
| abstract_inverted_index.the | 26, 74 |
| abstract_inverted_index.use | 8, 27, 44 |
| abstract_inverted_index.This | 19, 84 |
| abstract_inverted_index.help | 12 |
| abstract_inverted_index.make | 60 |
| abstract_inverted_index.more | 62 |
| abstract_inverted_index.over | 94 |
| abstract_inverted_index.that | 87 |
| abstract_inverted_index.then | 43 |
| abstract_inverted_index.this | 1, 45, 71 |
| abstract_inverted_index.uses | 21 |
| abstract_inverted_index.well | 99 |
| abstract_inverted_index.work | 2 |
| abstract_inverted_index.agent | 50, 97 |
| abstract_inverted_index.game, | 77 |
| abstract_inverted_index.guide | 49 |
| abstract_inverted_index.ideal | 80 |
| abstract_inverted_index.learn | 32 |
| abstract_inverted_index.model | 47 |
| abstract_inverted_index.shows | 86 |
| abstract_inverted_index.under | 79 |
| abstract_inverted_index.using | 52, 73 |
| abstract_inverted_index.arcade | 76 |
| abstract_inverted_index.neural | 22 |
| abstract_inverted_index.policy | 57, 90, 105 |
| abstract_inverted_index.unseen | 17, 67 |
| abstract_inverted_index.between | 34 |
| abstract_inverted_index.learned | 46 |
| abstract_inverted_index.machine | 23 |
| abstract_inverted_index.natural | 9, 35 |
| abstract_inverted_index.popular | 75 |
| abstract_inverted_index.present | 4 |
| abstract_inverted_index.shaping | 58, 91 |
| abstract_inverted_index.version | 55, 103 |
| abstract_inverted_index.Frogger, | 78 |
| abstract_inverted_index.baseline | 102 |
| abstract_inverted_index.behavior | 37 |
| abstract_inverted_index.evaluate | 70 |
| abstract_inverted_index.improves | 93 |
| abstract_inverted_index.language | 10, 36 |
| abstract_inverted_index.learning | 14, 65 |
| abstract_inverted_index.modified | 54, 89 |
| abstract_inverted_index.shaping. | 106 |
| abstract_inverted_index.algorithm | 92 |
| abstract_inverted_index.effective | 63 |
| abstract_inverted_index.networks, | 30 |
| abstract_inverted_index.non-ideal | 82 |
| abstract_inverted_index.technique | 6, 20, 72 |
| abstract_inverted_index.Q-learning | 96 |
| abstract_inverted_index.evaluation | 85 |
| abstract_inverted_index.generalize | 15 |
| abstract_inverted_index.conditions. | 83 |
| abstract_inverted_index.exploration | 51 |
| abstract_inverted_index.associations | 33 |
| abstract_inverted_index.descriptions | 38 |
| abstract_inverted_index.information. | 41 |
| abstract_inverted_index.specifically | 25 |
| abstract_inverted_index.state-action | 40 |
| abstract_inverted_index.translation, | 24 |
| abstract_inverted_index.environments. | 18, 68 |
| abstract_inverted_index.reinforcement | 13 |
| abstract_inverted_index.encoder-decoder | 29 |
| cited_by_percentile_year | |
| countries_distinct_count | 1 |
| institutions_distinct_count | 3 |
| sustainable_development_goals[0].id | https://metadata.un.org/sdg/4 |
| sustainable_development_goals[0].score | 0.5699999928474426 |
| sustainable_development_goals[0].display_name | Quality Education |
| citation_normalized_percentile | |
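
The `abstract_inverted_index` rows above store the abstract as a map from each word to the positions where it occurs. A minimal sketch of how such an index could be turned back into running text is shown below; the function name `rebuild_abstract` is hypothetical, and `sample` is a hand-copied excerpt covering only positions 0 to 6 of the table, not the full payload.

```python
def rebuild_abstract(inverted_index):
    """Rebuild text from a word -> list-of-positions mapping."""
    positions = {}
    for word, indexes in inverted_index.items():
        for i in indexes:
            positions[i] = word
    # Emit words in position order; gaps in the index are simply skipped.
    return " ".join(positions[i] for i in sorted(positions))


# Excerpt of the index above, restricted to the first seven positions.
sample = {
    "In": [0],
    "this": [1],
    "work": [2],
    "we": [3],
    "present": [4],
    "a": [5],
    "technique": [6],
}

print(rebuild_abstract(sample))  # prints "In this work we present a technique"
```

Because each word carries every position at which it appears, the same function handles repeated words such as "a" or "technique" without special cases.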