Partner Approximating Learners (PAL): Simulation-Accelerated Learning with Explicit Partner Modeling in Multi-Agent Domains
2019 · Open Access · DOI: https://doi.org/10.48550/arxiv.1909.03868
Mixed cooperative-competitive control scenarios such as human-machine interaction with individual goals of the interacting partners are very challenging for reinforcement learning agents. In order to contribute towards intuitive human-machine collaboration, we focus on problems in the continuous state and control domain where no explicit communication is considered and the agents do not know the others' goals or control laws but only sense their control inputs retrospectively. Our proposed framework combines a learned partner model based on online data with a reinforcement learning agent that is trained in a simulated environment including the partner model. Thus, we overcome drawbacks of independent learners and, in addition, benefit from a reduced amount of real world data required for reinforcement learning which is vital in the human-machine context. We finally analyze an example that demonstrates the merits of our proposed framework which learns fast due to the simulated environment and adapts to the continuously changing partner due to the partner approximation.
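
The two ingredients named in the abstract (a partner model fitted from observed control inputs, and an agent tuned in a simulation that includes that model) can be illustrated with a toy sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the scalar linear system, the linear partner law `u2 = -k2 * x`, the quadratic cost, the parameter values, and the function names. In particular, a plain search over candidate gains stands in for the reinforcement learning agent.

```python
def estimate_partner_gain(states, partner_inputs):
    """Least-squares fit of the assumed partner law u2 = -k2 * x
    from retrospectively observed (state, partner input) pairs."""
    num = sum(x * u for x, u in zip(states, partner_inputs))
    den = sum(x * x for x in states)
    return -num / den


def simulated_cost(k1, k2_hat, a=1.0, b1=0.5, b2=0.5, q=1.0, r=0.1,
                   x0=1.0, steps=50):
    """Roll out the assumed system x' = a*x + b1*u1 + b2*u2 in simulation,
    using the partner model, and return the agent's own quadratic cost."""
    x, cost = x0, 0.0
    for _ in range(steps):
        u1 = -k1 * x        # agent's candidate control law
        u2 = -k2_hat * x    # partner's input predicted by the learned model
        cost += q * x * x + r * u1 * u1
        x = a * x + b1 * u1 + b2 * u2
    return cost


def best_own_gain(k2_hat, candidates):
    """Pick the candidate gain with the lowest simulated cost
    (a stand-in for the RL agent, not the paper's algorithm)."""
    return min(candidates, key=lambda k1: simulated_cost(k1, k2_hat))


# Hypothetical online data: the partner secretly uses k2 = 0.4.
true_k2 = 0.4
states = [1.0, 0.8, 0.5, -0.3]
partner_inputs = [-true_k2 * x for x in states]

k2_hat = estimate_partner_gain(states, partner_inputs)
candidates = [i / 10 for i in range(0, 31)]
k1 = best_own_gain(k2_hat, candidates)
```

Because the search runs entirely against the simulated closed loop, only the few observed `(x, u2)` pairs come from the real interaction. Refitting `k2_hat` as new observations arrive is how such a sketch would track a changing partner.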
Related Topics
- Type
- preprint
- Language
- en
- Landing Page
- http://arxiv.org/abs/1909.03868
- https://arxiv.org/pdf/1909.03868
- OA Status
- green
- Related Works
- 10
- OpenAlex ID
- https://openalex.org/W4288112366
Raw OpenAlex JSON
- OpenAlex ID
- https://openalex.org/W4288112366 (canonical identifier for this work in OpenAlex)
- DOI
- https://doi.org/10.48550/arxiv.1909.03868 (Digital Object Identifier)
- Title
- Partner Approximating Learners (PAL): Simulation-Accelerated Learning with Explicit Partner Modeling in Multi-Agent Domains (work title)
- Type
- preprint (OpenAlex work type)
- Language
- en (primary language)
- Publication year
- 2019 (year of publication)
- Publication date
- 2019-09-09 (full publication date if available)
- Authors
- Florian Köpf, Alexander Nitsch, Michael Flad, Sören Hohmann (list of authors in order)
- Landing page
- https://arxiv.org/abs/1909.03868 (publisher landing page)
- PDF URL
- https://arxiv.org/pdf/1909.03868 (direct link to full text PDF)
- Open access
- Yes (whether a free full text is available)
- OA status
- green (open access status per OpenAlex)
- OA URL
- https://arxiv.org/pdf/1909.03868 (direct OA link when available)
- Concepts
- Reinforcement learning, Computer science, Context (archaeology), Control (management), Domain (mathematical analysis), Artificial intelligence, State (computer science), Focus (optics), Error-driven learning, Human–computer interaction, Machine learning, Algorithm, Mathematical analysis, Paleontology, Biology, Physics, Optics, Mathematics (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by
- 0 (total citation count in OpenAlex)
- Related works (count)
- 10 (other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4288112366 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.1909.03868 |
| ids.openalex | https://openalex.org/W4288112366 |
| fwci | 0.0 |
| type | preprint |
| title | Partner Approximating Learners (PAL): Simulation-Accelerated Learning with Explicit Partner Modeling in Multi-Agent Domains |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10462 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9370999932289124 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Reinforcement Learning in Robotics |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C97541855 |
| concepts[0].level | 2 |
| concepts[0].score | 0.8866837024688721 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q830687 |
| concepts[0].display_name | Reinforcement learning |
| concepts[1].id | https://openalex.org/C41008148 |
| concepts[1].level | 0 |
| concepts[1].score | 0.7514389157295227 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[1].display_name | Computer science |
| concepts[2].id | https://openalex.org/C2779343474 |
| concepts[2].level | 2 |
| concepts[2].score | 0.6354308128356934 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q3109175 |
| concepts[2].display_name | Context (archaeology) |
| concepts[3].id | https://openalex.org/C2775924081 |
| concepts[3].level | 2 |
| concepts[3].score | 0.6314114332199097 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q55608371 |
| concepts[3].display_name | Control (management) |
| concepts[4].id | https://openalex.org/C36503486 |
| concepts[4].level | 2 |
| concepts[4].score | 0.5660843849182129 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q11235244 |
| concepts[4].display_name | Domain (mathematical analysis) |
| concepts[5].id | https://openalex.org/C154945302 |
| concepts[5].level | 1 |
| concepts[5].score | 0.4960039556026459 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[5].display_name | Artificial intelligence |
| concepts[6].id | https://openalex.org/C48103436 |
| concepts[6].level | 2 |
| concepts[6].score | 0.469178706407547 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q599031 |
| concepts[6].display_name | State (computer science) |
| concepts[7].id | https://openalex.org/C192209626 |
| concepts[7].level | 2 |
| concepts[7].score | 0.4487084150314331 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q190909 |
| concepts[7].display_name | Focus (optics) |
| concepts[8].id | https://openalex.org/C47932503 |
| concepts[8].level | 3 |
| concepts[8].score | 0.4411829710006714 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q5395689 |
| concepts[8].display_name | Error-driven learning |
| concepts[9].id | https://openalex.org/C107457646 |
| concepts[9].level | 1 |
| concepts[9].score | 0.39197447896003723 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q207434 |
| concepts[9].display_name | Human–computer interaction |
| concepts[10].id | https://openalex.org/C119857082 |
| concepts[10].level | 1 |
| concepts[10].score | 0.39069193601608276 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q2539 |
| concepts[10].display_name | Machine learning |
| concepts[11].id | https://openalex.org/C11413529 |
| concepts[11].level | 1 |
| concepts[11].score | 0.08296561241149902 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q8366 |
| concepts[11].display_name | Algorithm |
| concepts[12].id | https://openalex.org/C134306372 |
| concepts[12].level | 1 |
| concepts[12].score | 0.0 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q7754 |
| concepts[12].display_name | Mathematical analysis |
| concepts[13].id | https://openalex.org/C151730666 |
| concepts[13].level | 1 |
| concepts[13].score | 0.0 |
| concepts[13].wikidata | https://www.wikidata.org/wiki/Q7205 |
| concepts[13].display_name | Paleontology |
| concepts[14].id | https://openalex.org/C86803240 |
| concepts[14].level | 0 |
| concepts[14].score | 0.0 |
| concepts[14].wikidata | https://www.wikidata.org/wiki/Q420 |
| concepts[14].display_name | Biology |
| concepts[15].id | https://openalex.org/C121332964 |
| concepts[15].level | 0 |
| concepts[15].score | 0.0 |
| concepts[15].wikidata | https://www.wikidata.org/wiki/Q413 |
| concepts[15].display_name | Physics |
| concepts[16].id | https://openalex.org/C120665830 |
| concepts[16].level | 1 |
| concepts[16].score | 0.0 |
| concepts[16].wikidata | https://www.wikidata.org/wiki/Q14620 |
| concepts[16].display_name | Optics |
| concepts[17].id | https://openalex.org/C33923547 |
| concepts[17].level | 0 |
| concepts[17].score | 0.0 |
| concepts[17].wikidata | https://www.wikidata.org/wiki/Q395 |
| concepts[17].display_name | Mathematics |
| keywords[0].id | https://openalex.org/keywords/reinforcement-learning |
| keywords[0].score | 0.8866837024688721 |
| keywords[0].display_name | Reinforcement learning |
| keywords[1].id | https://openalex.org/keywords/computer-science |
| keywords[1].score | 0.7514389157295227 |
| keywords[1].display_name | Computer science |
| keywords[2].id | https://openalex.org/keywords/context |
| keywords[2].score | 0.6354308128356934 |
| keywords[2].display_name | Context (archaeology) |
| keywords[3].id | https://openalex.org/keywords/control |
| keywords[3].score | 0.6314114332199097 |
| keywords[3].display_name | Control (management) |
| keywords[4].id | https://openalex.org/keywords/domain |
| keywords[4].score | 0.5660843849182129 |
| keywords[4].display_name | Domain (mathematical analysis) |
| keywords[5].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[5].score | 0.4960039556026459 |
| keywords[5].display_name | Artificial intelligence |
| keywords[6].id | https://openalex.org/keywords/state |
| keywords[6].score | 0.469178706407547 |
| keywords[6].display_name | State (computer science) |
| keywords[7].id | https://openalex.org/keywords/focus |
| keywords[7].score | 0.4487084150314331 |
| keywords[7].display_name | Focus (optics) |
| keywords[8].id | https://openalex.org/keywords/error-driven-learning |
| keywords[8].score | 0.4411829710006714 |
| keywords[8].display_name | Error-driven learning |
| keywords[9].id | https://openalex.org/keywords/human–computer-interaction |
| keywords[9].score | 0.39197447896003723 |
| keywords[9].display_name | Human–computer interaction |
| keywords[10].id | https://openalex.org/keywords/machine-learning |
| keywords[10].score | 0.39069193601608276 |
| keywords[10].display_name | Machine learning |
| keywords[11].id | https://openalex.org/keywords/algorithm |
| keywords[11].score | 0.08296561241149902 |
| keywords[11].display_name | Algorithm |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:1909.03868 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/1909.03868 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/1909.03868 |
| indexed_in | arxiv |
| authorships[0].author.id | https://openalex.org/A5082736433 |
| authorships[0].author.orcid | https://orcid.org/0000-0003-2536-3409 |
| authorships[0].author.display_name | Florian Köpf |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Köpf, Florian |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5000673280 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-5740-9451 |
| authorships[1].author.display_name | Alexander Nitsch |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Nitsch, Alexander |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5081378290 |
| authorships[2].author.orcid | https://orcid.org/0000-0003-3252-6632 |
| authorships[2].author.display_name | Michael Flad |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Flad, Michael |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5040502908 |
| authorships[3].author.orcid | https://orcid.org/0000-0002-4170-1431 |
| authorships[3].author.display_name | Sören Hohmann |
| authorships[3].author_position | last |
| authorships[3].raw_author_name | Hohmann, Sören |
| authorships[3].is_corresponding | False |
| has_content.pdf | True |
| has_content.grobid_xml | True |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/1909.03868 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2022-07-28T00:00:00 |
| display_name | Partner Approximating Learners (PAL): Simulation-Accelerated Learning with Explicit Partner Modeling in Multi-Agent Domains |
| has_fulltext | True |
| is_retracted | False |
| updated_date | 2025-11-06T03:46:38.306776 |
| primary_topic.id | https://openalex.org/T10462 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9370999932289124 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Reinforcement Learning in Robotics |
| related_works | https://openalex.org/W2371091044, https://openalex.org/W2171010636, https://openalex.org/W87513465, https://openalex.org/W2391666574, https://openalex.org/W2786230833, https://openalex.org/W3203256658, https://openalex.org/W2352650970, https://openalex.org/W1544514152, https://openalex.org/W1493952344, https://openalex.org/W4312372616 |
| cited_by_count | 0 |
| locations_count | 1 |
| best_oa_location.id | pmh:oai:arXiv.org:1909.03868 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/1909.03868 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/1909.03868 |
| primary_location.id | pmh:oai:arXiv.org:1909.03868 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/1909.03868 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/1909.03868 |
| publication_date | 2019-09-09 |
| publication_year | 2019 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 64, 72, 79 |
| abstract_inverted_index.In | 20 |
| abstract_inverted_index.We | 113 |
| abstract_inverted_index.an | 116 |
| abstract_inverted_index.as | 5 |
| abstract_inverted_index.do | 45 |
| abstract_inverted_index.in | 31, 78, 93, 109 |
| abstract_inverted_index.is | 41 |
| abstract_inverted_index.no | 38 |
| abstract_inverted_index.of | 10, 89, 99, 121 |
| abstract_inverted_index.on | 29, 68 |
| abstract_inverted_index.or | 51 |
| abstract_inverted_index.to | 22, 129, 134 |
| abstract_inverted_index.we | 27 |
| abstract_inverted_index.Our | 60 |
| abstract_inverted_index.and | 34, 43, 132 |
| abstract_inverted_index.are | 14 |
| abstract_inverted_index.but | 54 |
| abstract_inverted_index.due | 128, 139 |
| abstract_inverted_index.for | 16, 104 |
| abstract_inverted_index.not | 46 |
| abstract_inverted_index.our | 122 |
| abstract_inverted_index.the | 11, 32, 48, 83, 110, 119, 135 |
| abstract_inverted_index.and, | 92 |
| abstract_inverted_index.data | 70, 102 |
| abstract_inverted_index.fast | 127 |
| abstract_inverted_index.from | 96 |
| abstract_inverted_index.know | 47 |
| abstract_inverted_index.laws | 53 |
| abstract_inverted_index.only | 55 |
| abstract_inverted_index.real | 100 |
| abstract_inverted_index.such | 4 |
| abstract_inverted_index.that | 76 |
| abstract_inverted_index.with | 7, 71 |
| abstract_inverted_index.Mixed | 0 |
| abstract_inverted_index.Thus, | 86 |
| abstract_inverted_index.agent | 75 |
| abstract_inverted_index.based | 67 |
| abstract_inverted_index.focus | 28 |
| abstract_inverted_index.goals | 9, 50 |
| abstract_inverted_index.model | 66 |
| abstract_inverted_index.order | 21 |
| abstract_inverted_index.sense | 56 |
| abstract_inverted_index.where | 37 |
| abstract_inverted_index.which | 107, 125 |
| abstract_inverted_index.world | 101 |
| abstract_inverted_index.adapts | 133 |
| abstract_inverted_index.amount | 98 |
| abstract_inverted_index.domain | 36 |
| abstract_inverted_index.inputs | 58 |
| abstract_inverted_index.learns | 126 |
| abstract_inverted_index.merits | 120 |
| abstract_inverted_index.model. | 85 |
| abstract_inverted_index.online | 69 |
| abstract_inverted_index.agents. | 19 |
| abstract_inverted_index.analyze | 115 |
| abstract_inverted_index.benefit | 95 |
| abstract_inverted_index.control | 2, 35, 52 |
| abstract_inverted_index.example | 117 |
| abstract_inverted_index.finally | 114 |
| abstract_inverted_index.others' | 49 |
| abstract_inverted_index.partner | 84, 138, 141 |
| abstract_inverted_index.to\nthe | 140 |
| abstract_inverted_index.changing | 137 |
| abstract_inverted_index.combines | 63 |
| abstract_inverted_index.context. | 112 |
| abstract_inverted_index.explicit | 39 |
| abstract_inverted_index.learners | 91 |
| abstract_inverted_index.learning | 18, 74, 106 |
| abstract_inverted_index.partners | 13 |
| abstract_inverted_index.problems | 30 |
| abstract_inverted_index.proposed | 61, 123 |
| abstract_inverted_index.required | 103 |
| abstract_inverted_index.addition, | 94 |
| abstract_inverted_index.drawbacks | 88 |
| abstract_inverted_index.framework | 62, 124 |
| abstract_inverted_index.including | 82 |
| abstract_inverted_index.is\nvital | 108 |
| abstract_inverted_index.scenarios | 3 |
| abstract_inverted_index.simulated | 80 |
| abstract_inverted_index.a\nreduced | 97 |
| abstract_inverted_index.considered | 42 |
| abstract_inverted_index.contribute | 23 |
| abstract_inverted_index.individual | 8 |
| abstract_inverted_index.environment | 81, 131 |
| abstract_inverted_index.independent | 90 |
| abstract_inverted_index.interacting | 12 |
| abstract_inverted_index.is\ntrained | 77 |
| abstract_inverted_index.the\nagents | 44 |
| abstract_inverted_index.continuously | 136 |
| abstract_inverted_index.we\novercome | 87 |
| abstract_inverted_index.communication | 40 |
| abstract_inverted_index.human-machine | 25, 111 |
| abstract_inverted_index.reinforcement | 17, 73, 105 |
| abstract_inverted_index.collaboration, | 26 |
| abstract_inverted_index.the\nsimulated | 130 |
| abstract_inverted_index.their\ncontrol | 57 |
| abstract_inverted_index.approximation.\n | 142 |
| abstract_inverted_index.learned\npartner | 65 |
| abstract_inverted_index.retrospectively. | 59 |
| abstract_inverted_index.continuous\nstate | 33 |
| abstract_inverted_index.very\nchallenging | 15 |
| abstract_inverted_index.that\ndemonstrates | 118 |
| abstract_inverted_index.towards\nintuitive | 24 |
| abstract_inverted_index.cooperative-competitive | 1 |
| abstract_inverted_index.human-machine\ninteraction | 6 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 4 |
| sustainable_development_goals[0].id | https://metadata.un.org/sdg/17 |
| sustainable_development_goals[0].score | 0.5 |
| sustainable_development_goals[0].display_name | Partnerships for the goals |
| citation_normalized_percentile.value | 0.32253985 |
| citation_normalized_percentile.is_in_top_1_percent | False |
| citation_normalized_percentile.is_in_top_10_percent | False |
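
The `abstract_inverted_index.*` rows in the payload store the abstract as a map from each token to the word positions where it occurs. Placing every token back at its positions and reading them in order recovers the plain text. The sketch below assumes the index is available as a Python dict of token to position list. The function name `rebuild_abstract` is hypothetical, and the sample dict is an excerpt limited to positions 0 through 6. Some tokens carry an embedded line break from the original text, so whitespace is normalized at the end.

```python
def rebuild_abstract(inverted_index):
    """Rebuild plain text from a token -> [positions] mapping."""
    by_position = {}
    for token, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = token
    text = " ".join(by_position[pos] for pos in sorted(by_position))
    # Collapse embedded line breaks and repeated spaces into single spaces.
    return " ".join(text.split())


# Excerpt of the index above, restricted to positions 0-6.
sample_index = {
    "Mixed": [0],
    "cooperative-competitive": [1],
    "control": [2],
    "scenarios": [3],
    "such": [4],
    "as": [5],
    "human-machine\ninteraction": [6],
}

abstract_start = rebuild_abstract(sample_index)
```

With the full index, tokens such as `the` simply appear at several positions and are placed once per position.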