Learning to Play No-Press Diplomacy with Best Response Policy Iteration
2020 · Open Access · DOI: https://doi.org/10.48550/arxiv.2006.04635
Recent advances in deep reinforcement learning (RL) have led to considerable progress in many 2-player zero-sum games, such as Go, Poker and StarCraft. The purely adversarial nature of such games allows for conceptually simple and principled application of RL methods. However, real-world settings are many-agent, and agent interactions are complex mixtures of common-interest and competitive aspects. We consider Diplomacy, a 7-player board game designed to accentuate dilemmas resulting from many-agent interactions. It also features a large combinatorial action space and simultaneous moves, which are challenging for RL algorithms. We propose a simple yet effective approximate best response operator, designed to handle large combinatorial action spaces and simultaneous moves. We also introduce a family of policy iteration methods that approximate fictitious play. With these methods, we successfully apply RL to Diplomacy: we show that our agents convincingly outperform the previous state-of-the-art, and game-theoretic equilibrium analysis shows that the new process yields consistent improvements.
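For readers unfamiliar with fictitious play, the following is a minimal sketch of the classical tabular version on a toy two-player zero-sum matrix game. It is not the paper's method: the abstract only says that its policy iteration methods approximate fictitious play, and it gives no implementation details. The game (rock-paper-scissors), the function names `best_response` and `fictitious_play`, and the initial action choice are all assumptions made for illustration.

```python
from typing import List, Tuple

# Row player's payoffs for rock-paper-scissors (illustrative toy game).
# The game is zero-sum, so the column player's payoff is the negation.
RPS = [
    [0, -1, 1],   # rock     vs rock, paper, scissors
    [1, 0, -1],   # paper
    [-1, 1, 0],   # scissors
]


def best_response(payoff: List[List[float]], opponent_counts: List[int]) -> int:
    """Return the action with the highest total payoff against the
    opponent's empirical action counts (ties go to the lowest index)."""
    values = [sum(p * c for p, c in zip(row, opponent_counts)) for row in payoff]
    return max(range(len(values)), key=values.__getitem__)


def fictitious_play(
    payoff: List[List[float]], iterations: int
) -> Tuple[List[float], List[float]]:
    """Run classical fictitious play and return both players'
    empirical action frequencies."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    # Column player's payoffs, indexed [own action][row player's action].
    col_payoff = [[-payoff[i][j] for i in range(n_rows)] for j in range(n_cols)]
    # Hypothetical initialisation: each player has played action 0 once.
    row_counts = [1] + [0] * (n_rows - 1)
    col_counts = [1] + [0] * (n_cols - 1)
    for _ in range(iterations):
        # Each player best-responds to the other's history so far.
        r = best_response(payoff, col_counts)
        c = best_response(col_payoff, row_counts)
        row_counts[r] += 1
        col_counts[c] += 1
    row_total, col_total = sum(row_counts), sum(col_counts)
    return (
        [x / row_total for x in row_counts],
        [x / col_total for x in col_counts],
    )


if __name__ == "__main__":
    row_freq, col_freq = fictitious_play(RPS, 20000)
    print("row player frequencies:", row_freq)
    print("column player frequencies:", col_freq)
```

In two-player zero-sum games the empirical frequencies of fictitious play are known to converge to a Nash equilibrium, which for rock-paper-scissors is the uniform mixture. An exact best response enumerates every action, which is not feasible in a large combinatorial action space such as Diplomacy's; this is the motivation the abstract gives for an approximate best response operator.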
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2006.04635, https://arxiv.org/pdf/2006.04635
- OA Status: green
- Cited By: 18
- References: 108
- Related Works: 20
- OpenAlex ID: https://openalex.org/W3033370016
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W3033370016 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2006.04635 (Digital Object Identifier)
- Title: Learning to Play No-Press Diplomacy with Best Response Policy Iteration (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2020 (year of publication)
- Publication date: 2020-06-08 (full publication date if available)
- Authors: Thomas Anthony, Tom Eccles, Andrea Tacchetti, János Kramár, Ian Gemp, Thomas C. Hudson, Nicolas Porcel, Marc Lanctot, Julien Pérolat, Richard Everett, Roman Werpachowski, Satinder Singh, Thore Graepel, Yoram Bachrach (list of authors in order)
- Landing page: https://arxiv.org/abs/2006.04635 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2006.04635 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2006.04635 (direct OA link when available)
- Concepts: Reinforcement learning, Simple (philosophy), Computer science, Fictitious play, Action (physics), Adversarial system, Process (computing), Diplomacy, Artificial intelligence, Game theory, Operator (biology), Mathematical economics, Mathematical optimization, Theoretical computer science, Mathematics, Political science, Epistemology, Law, Quantum mechanics, Operating system, Politics, Gene, Repressor, Biochemistry, Chemistry, Physics, Transcription factor, Philosophy (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 18 (total citation count in OpenAlex)
- Citations by year (recent): 2025: 1, 2024: 2, 2023: 2, 2022: 4, 2021: 5 (per-year citation counts, last 5 years)
- References (count): 108 (number of works referenced by this work)
- Related works (count): 20 (other works algorithmically related by OpenAlex)
Full payload
| field | value |
|---|---|
| id | https://openalex.org/W3033370016 |
| doi | https://doi.org/10.48550/arxiv.2006.04635 |
| ids.doi | https://doi.org/10.48550/arxiv.2006.04635 |
| ids.mag | 3033370016 |
| ids.openalex | https://openalex.org/W3033370016 |
| fwci | |
| type | preprint |
| title | Learning to Play No-Press Diplomacy with Best Response Policy Iteration |
| biblio.issue | |
| biblio.volume | 33 |
| biblio.last_page | 18003 |
| biblio.first_page | 17987 |
| topics[0].id | https://openalex.org/T11574 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9993000030517578 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Artificial Intelligence in Games |
| topics[1].id | https://openalex.org/T10462 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9984999895095825 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Reinforcement Learning in Robotics |
| topics[2].id | https://openalex.org/T11674 |
| topics[2].field.id | https://openalex.org/fields/20 |
| topics[2].field.display_name | Economics, Econometrics and Finance |
| topics[2].score | 0.9822999835014343 |
| topics[2].domain.id | https://openalex.org/domains/2 |
| topics[2].domain.display_name | Social Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/2002 |
| topics[2].subfield.display_name | Economics and Econometrics |
| topics[2].display_name | Sports Analytics and Performance |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C97541855 |
| concepts[0].level | 2 |
| concepts[0].score | 0.8158525228500366 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q830687 |
| concepts[0].display_name | Reinforcement learning |
| concepts[1].id | https://openalex.org/C2780586882 |
| concepts[1].level | 2 |
| concepts[1].score | 0.7165284156799316 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q7520643 |
| concepts[1].display_name | Simple (philosophy) |
| concepts[2].id | https://openalex.org/C41008148 |
| concepts[2].level | 0 |
| concepts[2].score | 0.6945250630378723 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[2].display_name | Computer science |
| concepts[3].id | https://openalex.org/C145071142 |
| concepts[3].level | 3 |
| concepts[3].score | 0.6405930519104004 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q1411116 |
| concepts[3].display_name | Fictitious play |
| concepts[4].id | https://openalex.org/C2780791683 |
| concepts[4].level | 2 |
| concepts[4].score | 0.5929345488548279 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q846785 |
| concepts[4].display_name | Action (physics) |
| concepts[5].id | https://openalex.org/C37736160 |
| concepts[5].level | 2 |
| concepts[5].score | 0.5650641322135925 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q1801315 |
| concepts[5].display_name | Adversarial system |
| concepts[6].id | https://openalex.org/C98045186 |
| concepts[6].level | 2 |
| concepts[6].score | 0.4639069437980652 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q205663 |
| concepts[6].display_name | Process (computing) |
| concepts[7].id | https://openalex.org/C557252395 |
| concepts[7].level | 3 |
| concepts[7].score | 0.4581768214702606 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q1889 |
| concepts[7].display_name | Diplomacy |
| concepts[8].id | https://openalex.org/C154945302 |
| concepts[8].level | 1 |
| concepts[8].score | 0.45785602927207947 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[8].display_name | Artificial intelligence |
| concepts[9].id | https://openalex.org/C177142836 |
| concepts[9].level | 2 |
| concepts[9].score | 0.4302177131175995 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q44455 |
| concepts[9].display_name | Game theory |
| concepts[10].id | https://openalex.org/C17020691 |
| concepts[10].level | 5 |
| concepts[10].score | 0.41424059867858887 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q139677 |
| concepts[10].display_name | Operator (biology) |
| concepts[11].id | https://openalex.org/C144237770 |
| concepts[11].level | 1 |
| concepts[11].score | 0.3720826506614685 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q747534 |
| concepts[11].display_name | Mathematical economics |
| concepts[12].id | https://openalex.org/C126255220 |
| concepts[12].level | 1 |
| concepts[12].score | 0.3525891900062561 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q141495 |
| concepts[12].display_name | Mathematical optimization |
| concepts[13].id | https://openalex.org/C80444323 |
| concepts[13].level | 1 |
| concepts[13].score | 0.35218459367752075 |
| concepts[13].wikidata | https://www.wikidata.org/wiki/Q2878974 |
| concepts[13].display_name | Theoretical computer science |
| concepts[14].id | https://openalex.org/C33923547 |
| concepts[14].level | 0 |
| concepts[14].score | 0.23144948482513428 |
| concepts[14].wikidata | https://www.wikidata.org/wiki/Q395 |
| concepts[14].display_name | Mathematics |
| concepts[15].id | https://openalex.org/C17744445 |
| concepts[15].level | 0 |
| concepts[15].score | 0.11432835459709167 |
| concepts[15].wikidata | https://www.wikidata.org/wiki/Q36442 |
| concepts[15].display_name | Political science |
| concepts[16].id | https://openalex.org/C111472728 |
| concepts[16].level | 1 |
| concepts[16].score | 0.10322350263595581 |
| concepts[16].wikidata | https://www.wikidata.org/wiki/Q9471 |
| concepts[16].display_name | Epistemology |
| concepts[17].id | https://openalex.org/C199539241 |
| concepts[17].level | 1 |
| concepts[17].score | 0.08277419209480286 |
| concepts[17].wikidata | https://www.wikidata.org/wiki/Q7748 |
| concepts[17].display_name | Law |
| concepts[18].id | https://openalex.org/C62520636 |
| concepts[18].level | 1 |
| concepts[18].score | 0.0 |
| concepts[18].wikidata | https://www.wikidata.org/wiki/Q944 |
| concepts[18].display_name | Quantum mechanics |
| concepts[19].id | https://openalex.org/C111919701 |
| concepts[19].level | 1 |
| concepts[19].score | 0.0 |
| concepts[19].wikidata | https://www.wikidata.org/wiki/Q9135 |
| concepts[19].display_name | Operating system |
| concepts[20].id | https://openalex.org/C94625758 |
| concepts[20].level | 2 |
| concepts[20].score | 0.0 |
| concepts[20].wikidata | https://www.wikidata.org/wiki/Q7163 |
| concepts[20].display_name | Politics |
| concepts[21].id | https://openalex.org/C104317684 |
| concepts[21].level | 2 |
| concepts[21].score | 0.0 |
| concepts[21].wikidata | https://www.wikidata.org/wiki/Q7187 |
| concepts[21].display_name | Gene |
| concepts[22].id | https://openalex.org/C158448853 |
| concepts[22].level | 4 |
| concepts[22].score | 0.0 |
| concepts[22].wikidata | https://www.wikidata.org/wiki/Q425218 |
| concepts[22].display_name | Repressor |
| concepts[23].id | https://openalex.org/C55493867 |
| concepts[23].level | 1 |
| concepts[23].score | 0.0 |
| concepts[23].wikidata | https://www.wikidata.org/wiki/Q7094 |
| concepts[23].display_name | Biochemistry |
| concepts[24].id | https://openalex.org/C185592680 |
| concepts[24].level | 0 |
| concepts[24].score | 0.0 |
| concepts[24].wikidata | https://www.wikidata.org/wiki/Q2329 |
| concepts[24].display_name | Chemistry |
| concepts[25].id | https://openalex.org/C121332964 |
| concepts[25].level | 0 |
| concepts[25].score | 0.0 |
| concepts[25].wikidata | https://www.wikidata.org/wiki/Q413 |
| concepts[25].display_name | Physics |
| concepts[26].id | https://openalex.org/C86339819 |
| concepts[26].level | 3 |
| concepts[26].score | 0.0 |
| concepts[26].wikidata | https://www.wikidata.org/wiki/Q407384 |
| concepts[26].display_name | Transcription factor |
| concepts[27].id | https://openalex.org/C138885662 |
| concepts[27].level | 0 |
| concepts[27].score | 0.0 |
| concepts[27].wikidata | https://www.wikidata.org/wiki/Q5891 |
| concepts[27].display_name | Philosophy |
| keywords[0].id | https://openalex.org/keywords/reinforcement-learning |
| keywords[0].score | 0.8158525228500366 |
| keywords[0].display_name | Reinforcement learning |
| keywords[1].id | https://openalex.org/keywords/simple |
| keywords[1].score | 0.7165284156799316 |
| keywords[1].display_name | Simple (philosophy) |
| keywords[2].id | https://openalex.org/keywords/computer-science |
| keywords[2].score | 0.6945250630378723 |
| keywords[2].display_name | Computer science |
| keywords[3].id | https://openalex.org/keywords/fictitious-play |
| keywords[3].score | 0.6405930519104004 |
| keywords[3].display_name | Fictitious play |
| keywords[4].id | https://openalex.org/keywords/action |
| keywords[4].score | 0.5929345488548279 |
| keywords[4].display_name | Action (physics) |
| keywords[5].id | https://openalex.org/keywords/adversarial-system |
| keywords[5].score | 0.5650641322135925 |
| keywords[5].display_name | Adversarial system |
| keywords[6].id | https://openalex.org/keywords/process |
| keywords[6].score | 0.4639069437980652 |
| keywords[6].display_name | Process (computing) |
| keywords[7].id | https://openalex.org/keywords/diplomacy |
| keywords[7].score | 0.4581768214702606 |
| keywords[7].display_name | Diplomacy |
| keywords[8].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[8].score | 0.45785602927207947 |
| keywords[8].display_name | Artificial intelligence |
| keywords[9].id | https://openalex.org/keywords/game-theory |
| keywords[9].score | 0.4302177131175995 |
| keywords[9].display_name | Game theory |
| keywords[10].id | https://openalex.org/keywords/operator |
| keywords[10].score | 0.41424059867858887 |
| keywords[10].display_name | Operator (biology) |
| keywords[11].id | https://openalex.org/keywords/mathematical-economics |
| keywords[11].score | 0.3720826506614685 |
| keywords[11].display_name | Mathematical economics |
| keywords[12].id | https://openalex.org/keywords/mathematical-optimization |
| keywords[12].score | 0.3525891900062561 |
| keywords[12].display_name | Mathematical optimization |
| keywords[13].id | https://openalex.org/keywords/theoretical-computer-science |
| keywords[13].score | 0.35218459367752075 |
| keywords[13].display_name | Theoretical computer science |
| keywords[14].id | https://openalex.org/keywords/mathematics |
| keywords[14].score | 0.23144948482513428 |
| keywords[14].display_name | Mathematics |
| keywords[15].id | https://openalex.org/keywords/political-science |
| keywords[15].score | 0.11432835459709167 |
| keywords[15].display_name | Political science |
| keywords[16].id | https://openalex.org/keywords/epistemology |
| keywords[16].score | 0.10322350263595581 |
| keywords[16].display_name | Epistemology |
| keywords[17].id | https://openalex.org/keywords/law |
| keywords[17].score | 0.08277419209480286 |
| keywords[17].display_name | Law |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2006.04635 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2006.04635 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2006.04635 |
| locations[1].id | mag:3033370016 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | submittedVersion |
| locations[1].raw_type | |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | False |
| locations[1].raw_source_name | arXiv (Cornell University) |
| locations[1].landing_page_url | https://arxiv.org/pdf/2006.04635.pdf |
| locations[2].id | pmh:oai:eprints.ucl.ac.uk.OAI2:10109592 |
| locations[2].is_oa | False |
| locations[2].source.id | https://openalex.org/S4306400024 |
| locations[2].source.issn | |
| locations[2].source.type | repository |
| locations[2].source.is_oa | False |
| locations[2].source.issn_l | |
| locations[2].source.is_core | False |
| locations[2].source.is_in_doaj | False |
| locations[2].source.display_name | UCL Discovery (University College London) |
| locations[2].source.host_organization | https://openalex.org/I45129253 |
| locations[2].source.host_organization_name | University College London |
| locations[2].source.host_organization_lineage | https://openalex.org/I45129253 |
| locations[2].license | |
| locations[2].pdf_url | |
| locations[2].version | submittedVersion |
| locations[2].raw_type | Proceedings paper |
| locations[2].license_id | |
| locations[2].is_accepted | False |
| locations[2].is_published | False |
| locations[2].raw_source_name | In: Advances in Neural Information Processing Systems 33 pre-proceedings (NeurIPS 2020). NeurIPS (2020) (In press). |
| locations[2].landing_page_url | https://discovery.ucl.ac.uk/id/eprint/10109592/ |
| locations[3].id | doi:10.48550/arxiv.2006.04635 |
| locations[3].is_oa | True |
| locations[3].source.id | https://openalex.org/S4306400194 |
| locations[3].source.issn | |
| locations[3].source.type | repository |
| locations[3].source.is_oa | True |
| locations[3].source.issn_l | |
| locations[3].source.is_core | False |
| locations[3].source.is_in_doaj | False |
| locations[3].source.display_name | arXiv (Cornell University) |
| locations[3].source.host_organization | https://openalex.org/I205783295 |
| locations[3].source.host_organization_name | Cornell University |
| locations[3].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[3].license | |
| locations[3].pdf_url | |
| locations[3].version | |
| locations[3].raw_type | article |
| locations[3].license_id | |
| locations[3].is_accepted | False |
| locations[3].is_published | |
| locations[3].raw_source_name | |
| locations[3].landing_page_url | https://doi.org/10.48550/arxiv.2006.04635 |
| locations[4].id | mag:3099253297 |
| locations[4].is_oa | False |
| locations[4].source.id | https://openalex.org/S4306420609 |
| locations[4].source.issn | |
| locations[4].source.type | conference |
| locations[4].source.is_oa | False |
| locations[4].source.issn_l | |
| locations[4].source.is_core | False |
| locations[4].source.is_in_doaj | False |
| locations[4].source.display_name | Neural Information Processing Systems |
| locations[4].source.host_organization | |
| locations[4].source.host_organization_name | |
| locations[4].license | |
| locations[4].pdf_url | |
| locations[4].version | |
| locations[4].raw_type | |
| locations[4].license_id | |
| locations[4].is_accepted | False |
| locations[4].is_published | |
| locations[4].raw_source_name | Neural Information Processing Systems |
| locations[4].landing_page_url | https://papers.nips.cc/paper/2020/file/d1419302db9c022ab1d48681b13d5f8b-Paper.pdf |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5000081835 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-4443-5466 |
| authorships[0].author.display_name | Thomas Anthony |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Thomas Anthony |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5019669511 |
| authorships[1].author.orcid | https://orcid.org/0000-0001-6706-017X |
| authorships[1].author.display_name | Tom Eccles |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Tom Eccles |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5071433151 |
| authorships[2].author.orcid | https://orcid.org/0000-0001-9311-9171 |
| authorships[2].author.display_name | Andrea Tacchetti |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Andrea Tacchetti |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5027979882 |
| authorships[3].author.orcid | |
| authorships[3].author.display_name | János Kramár |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | János Kramár |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5112947561 |
| authorships[4].author.orcid | https://orcid.org/0000-0002-7774-3246 |
| authorships[4].author.display_name | Ian Gemp |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Ian Gemp |
| authorships[4].is_corresponding | False |
| authorships[5].author.id | https://openalex.org/A5112451503 |
| authorships[5].author.orcid | |
| authorships[5].author.display_name | Thomas C. Hudson |
| authorships[5].author_position | middle |
| authorships[5].raw_author_name | Thomas C. Hudson |
| authorships[5].is_corresponding | False |
| authorships[6].author.id | https://openalex.org/A5036050246 |
| authorships[6].author.orcid | |
| authorships[6].author.display_name | Nicolas Porcel |
| authorships[6].author_position | middle |
| authorships[6].raw_author_name | Nicolas Porcel |
| authorships[6].is_corresponding | False |
| authorships[7].author.id | https://openalex.org/A5049659586 |
| authorships[7].author.orcid | |
| authorships[7].author.display_name | Marc Lanctot |
| authorships[7].author_position | middle |
| authorships[7].raw_author_name | Marc Lanctot |
| authorships[7].is_corresponding | False |
| authorships[8].author.id | https://openalex.org/A5056707583 |
| authorships[8].author.orcid | https://orcid.org/0000-0002-8176-1666 |
| authorships[8].author.display_name | Julien Pérolat |
| authorships[8].author_position | middle |
| authorships[8].raw_author_name | Julien Pérolat |
| authorships[8].is_corresponding | False |
| authorships[9].author.id | https://openalex.org/A5007090604 |
| authorships[9].author.orcid | https://orcid.org/0000-0002-9404-6338 |
| authorships[9].author.display_name | Richard Everett |
| authorships[9].author_position | middle |
| authorships[9].raw_author_name | Richard Everett |
| authorships[9].is_corresponding | False |
| authorships[10].author.id | https://openalex.org/A5079557398 |
| authorships[10].author.orcid | https://orcid.org/0000-0002-1252-0316 |
| authorships[10].author.display_name | Roman Werpachowski |
| authorships[10].author_position | middle |
| authorships[10].raw_author_name | Roman Werpachowski |
| authorships[10].is_corresponding | False |
| authorships[11].author.id | https://openalex.org/A5046052093 |
| authorships[11].author.orcid | |
| authorships[11].author.display_name | Satinder Singh |
| authorships[11].author_position | middle |
| authorships[11].raw_author_name | Satinder Singh |
| authorships[11].is_corresponding | False |
| authorships[12].author.id | https://openalex.org/A5051619646 |
| authorships[12].author.orcid | https://orcid.org/0000-0003-3957-0310 |
| authorships[12].author.display_name | Thore Graepel |
| authorships[12].author_position | middle |
| authorships[12].raw_author_name | Thore Graepel |
| authorships[12].is_corresponding | False |
| authorships[13].author.id | https://openalex.org/A5062949033 |
| authorships[13].author.orcid | https://orcid.org/0000-0002-4382-7636 |
| authorships[13].author.display_name | Yoram Bachrach |
| authorships[13].author_position | last |
| authorships[13].raw_author_name | Yoram Bachrach |
| authorships[13].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2006.04635 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Learning to Play No-Press Diplomacy with Best Response Policy Iteration |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T11574 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9993000030517578 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Artificial Intelligence in Games |
| related_works | https://openalex.org/W2996037775, https://openalex.org/W2982316857, https://openalex.org/W2964043796, https://openalex.org/W2902907165, https://openalex.org/W2145339207, https://openalex.org/W3085581910, https://openalex.org/W3000818637, https://openalex.org/W2031098375, https://openalex.org/W2750605955, https://openalex.org/W3172715706, https://openalex.org/W2993335844, https://openalex.org/W2913345290, https://openalex.org/W2780635050, https://openalex.org/W3173547034, https://openalex.org/W2618299181, https://openalex.org/W3044200169, https://openalex.org/W3200319033, https://openalex.org/W2963390138, https://openalex.org/W1996251656, https://openalex.org/W2951859429 |
| cited_by_count | 18 |
| counts_by_year[0].year | 2025 |
| counts_by_year[0].cited_by_count | 1 |
| counts_by_year[1].year | 2024 |
| counts_by_year[1].cited_by_count | 2 |
| counts_by_year[2].year | 2023 |
| counts_by_year[2].cited_by_count | 2 |
| counts_by_year[3].year | 2022 |
| counts_by_year[3].cited_by_count | 4 |
| counts_by_year[4].year | 2021 |
| counts_by_year[4].cited_by_count | 5 |
| counts_by_year[5].year | 2020 |
| counts_by_year[5].cited_by_count | 4 |
| locations_count | 5 |
| best_oa_location.id | pmh:oai:arXiv.org:2006.04635 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2006.04635 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2006.04635 |
| primary_location.id | pmh:oai:arXiv.org:2006.04635 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2006.04635 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2006.04635 |
| publication_date | 2020-06-08 |
| publication_year | 2020 |
| referenced_works | https://openalex.org/W119501476, https://openalex.org/W2922444559, https://openalex.org/W2097532427, https://openalex.org/W2153975459, https://openalex.org/W2970213829, https://openalex.org/W2762117857, https://openalex.org/W2946175686, https://openalex.org/W2036083037, https://openalex.org/W1889629917, https://openalex.org/W2155847328, https://openalex.org/W2041367235, https://openalex.org/W1960351623, https://openalex.org/W2931953508, https://openalex.org/W160318044, https://openalex.org/W2082551242, https://openalex.org/W2970104882, https://openalex.org/W2970130737, https://openalex.org/W2962957031, https://openalex.org/W2010612033, https://openalex.org/W2135071748, https://openalex.org/W2073826257, https://openalex.org/W2998299793, https://openalex.org/W2174009875, https://openalex.org/W2026535904, https://openalex.org/W1981774220, https://openalex.org/W2786036274, https://openalex.org/W2257979135, https://openalex.org/W2121863487, https://openalex.org/W3008369822, https://openalex.org/W2211498769, https://openalex.org/W2169359757, https://openalex.org/W2810602713, https://openalex.org/W1543121739, https://openalex.org/W2155183030, https://openalex.org/W2898895787, https://openalex.org/W2963937357, https://openalex.org/W2096913736, https://openalex.org/W2996706791, https://openalex.org/W2067050450, https://openalex.org/W2141744833, https://openalex.org/W2561139292, https://openalex.org/W2028798910, https://openalex.org/W2963836708, https://openalex.org/W3000558234, https://openalex.org/W2170677670, https://openalex.org/W2941355526, https://openalex.org/W2802170217, https://openalex.org/W2970725926, https://openalex.org/W2594138064, https://openalex.org/W2110630796, https://openalex.org/W3008383470, https://openalex.org/W2805516822, https://openalex.org/W2099587183, https://openalex.org/W2970894611, https://openalex.org/W2730328371, https://openalex.org/W2982316857, https://openalex.org/W2902907165, https://openalex.org/W2035999883, https://openalex.org/W2167957526, https://openalex.org/W2149107760, https://openalex.org/W2164464108, https://openalex.org/W2964019952, https://openalex.org/W2964043796, https://openalex.org/W2070214254, https://openalex.org/W2081476860, https://openalex.org/W2062663664, https://openalex.org/W42043901, https://openalex.org/W2952095743, https://openalex.org/W2075567596, https://openalex.org/W2963407617, https://openalex.org/W2098418738, https://openalex.org/W2969287672, https://openalex.org/W2963627051, https://openalex.org/W2973525135, https://openalex.org/W2157803532, https://openalex.org/W2188908787, https://openalex.org/W2006037036, https://openalex.org/W1415072653, https://openalex.org/W2150974097, https://openalex.org/W2604175534, https://openalex.org/W2963871073, https://openalex.org/W2264897026, https://openalex.org/W2149254401, https://openalex.org/W2110130062, https://openalex.org/W2954700257, https://openalex.org/W2570979475, https://openalex.org/W2945582565, https://openalex.org/W2020018978, https://openalex.org/W3198350258, https://openalex.org/W2913781869, https://openalex.org/W2038794597, https://openalex.org/W1991899349, https://openalex.org/W1757796397, https://openalex.org/W2077440620, https://openalex.org/W2766447205, https://openalex.org/W2963642149, https://openalex.org/W2099550549, https://openalex.org/W2574978968, https://openalex.org/W2291986326, https://openalex.org/W2144274908, https://openalex.org/W2159920598, https://openalex.org/W1988716172, https://openalex.org/W2963104507, https://openalex.org/W2049387611, https://openalex.org/W1497027873, https://openalex.org/W2132299980, https://openalex.org/W2060284899, https://openalex.org/W2949899112 |
| referenced_works_count | 108 |
| abstract_inverted_index.a | 59, 74, 90, 111 |
| abstract_inverted_index.It | 71 |
| abstract_inverted_index.RL | 38, 86, 127 |
| abstract_inverted_index.We | 56, 88, 108 |
| abstract_inverted_index.as | 18 |
| abstract_inverted_index.in | 2, 12 |
| abstract_inverted_index.of | 27, 37, 51, 113 |
| abstract_inverted_index.to | 9, 64, 99, 128 |
| abstract_inverted_index.we | 124, 130 |
| abstract_inverted_index.Go, | 19 |
| abstract_inverted_index.The | 23 |
| abstract_inverted_index.and | 21, 34, 45, 53, 79, 105, 140 |
| abstract_inverted_index.are | 43, 48, 83 |
| abstract_inverted_index.for | 31, 85 |
| abstract_inverted_index.led | 8 |
| abstract_inverted_index.new | 148 |
| abstract_inverted_index.our | 133 |
| abstract_inverted_index.the | 137, 147 |
| abstract_inverted_index.yet | 92 |
| abstract_inverted_index.(RL) | 6 |
| abstract_inverted_index.With | 121 |
| abstract_inverted_index.also | 72, 109 |
| abstract_inverted_index.best | 95 |
| abstract_inverted_index.deep | 3 |
| abstract_inverted_index.from | 68 |
| abstract_inverted_index.game | 62, 141 |
| abstract_inverted_index.have | 7 |
| abstract_inverted_index.many | 13 |
| abstract_inverted_index.show | 131 |
| abstract_inverted_index.such | 17, 28 |
| abstract_inverted_index.that | 117, 132, 146 |
| abstract_inverted_index.Poker | 20 |
| abstract_inverted_index.agent | 46 |
| abstract_inverted_index.apply | 126 |
| abstract_inverted_index.board | 61 |
| abstract_inverted_index.games | 29 |
| abstract_inverted_index.large | 75, 101 |
| abstract_inverted_index.play. | 120 |
| abstract_inverted_index.shows | 145 |
| abstract_inverted_index.space | 78 |
| abstract_inverted_index.these | 122 |
| abstract_inverted_index.which | 82 |
| abstract_inverted_index.Recent | 0 |
| abstract_inverted_index.action | 77, 103 |
| abstract_inverted_index.agents | 134 |
| abstract_inverted_index.allows | 30 |
| abstract_inverted_index.family | 112 |
| abstract_inverted_index.games, | 16 |
| abstract_inverted_index.handle | 100 |
| abstract_inverted_index.moves, | 81 |
| abstract_inverted_index.moves. | 107 |
| abstract_inverted_index.nature | 26 |
| abstract_inverted_index.policy | 114 |
| abstract_inverted_index.purely | 24 |
| abstract_inverted_index.simple | 33, 91 |
| abstract_inverted_index.spaces | 104 |
| abstract_inverted_index.yields | 150 |
| abstract_inverted_index.However | 40 |
| abstract_inverted_index.complex | 49 |
| abstract_inverted_index.methods | 116 |
| abstract_inverted_index.process | 149 |
| abstract_inverted_index.propose | 89 |
| abstract_inverted_index.2-player | 14 |
| abstract_inverted_index.7-player | 60 |
| abstract_inverted_index.advances | 1 |
| abstract_inverted_index.analysis | 144 |
| abstract_inverted_index.aspects. | 55 |
| abstract_inverted_index.consider | 57 |
| abstract_inverted_index.designed | 63, 98 |
| abstract_inverted_index.dilemmas | 66 |
| abstract_inverted_index.features | 73 |
| abstract_inverted_index.learning | 5 |
| abstract_inverted_index.methods, | 123 |
| abstract_inverted_index.methods. | 39 |
| abstract_inverted_index.mixtures | 50 |
| abstract_inverted_index.previous | 138 |
| abstract_inverted_index.progress | 11 |
| abstract_inverted_index.response | 96 |
| abstract_inverted_index.settings | 42 |
| abstract_inverted_index.zero-sum | 15 |
| abstract_inverted_index.effective | 93 |
| abstract_inverted_index.introduce | 110 |
| abstract_inverted_index.iteration | 115 |
| abstract_inverted_index.operator, | 97 |
| abstract_inverted_index.resulting | 67 |
| abstract_inverted_index.theoretic | 142 |
| abstract_inverted_index.Diplomacy, | 58 |
| abstract_inverted_index.Diplomacy: | 129 |
| abstract_inverted_index.Starcraft. | 22 |
| abstract_inverted_index.accentuate | 65 |
| abstract_inverted_index.consistent | 151 |
| abstract_inverted_index.fictitious | 119 |
| abstract_inverted_index.many-agent | 69 |
| abstract_inverted_index.outperform | 136 |
| abstract_inverted_index.principled | 35 |
| abstract_inverted_index.real-world | 41 |
| abstract_inverted_index.adversarial | 25 |
| abstract_inverted_index.algorithms. | 87 |
| abstract_inverted_index.application | 36 |
| abstract_inverted_index.approximate | 94, 118 |
| abstract_inverted_index.challenging | 84 |
| abstract_inverted_index.competitive | 54 |
| abstract_inverted_index.equilibrium | 143 |
| abstract_inverted_index.many-agent, | 44 |
| abstract_inverted_index.conceptually | 32 |
| abstract_inverted_index.considerable | 10 |
| abstract_inverted_index.convincingly | 135 |
| abstract_inverted_index.interactions | 47 |
| abstract_inverted_index.simultaneous | 80, 106 |
| abstract_inverted_index.successfully | 125 |
| abstract_inverted_index.combinatorial | 76, 102 |
| abstract_inverted_index.improvements. | 152 |
| abstract_inverted_index.interactions. | 70 |
| abstract_inverted_index.reinforcement | 4 |
| abstract_inverted_index.common-interest | 52 |
| abstract_inverted_index.state-of-the-art, | 139 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 14 |
| citation_normalized_percentile |