Noise-Regularized Advantage Value for Multi-Agent Reinforcement Learning
2022 · Open Access · DOI: https://doi.org/10.3390/math10152728
Leveraging global state information to enhance policy optimization is a common approach in multi-agent reinforcement learning (MARL). Even with this additional state information, agents still suffer from insufficient exploration during training. Moreover, training on batches sampled from the replay buffer can induce policy overfitting: multi-agent proximal policy optimization (MAPPO) may not perform as well as independent PPO (IPPO), even with the additional information in its centralized critic. In this paper, we propose a novel noise-injection method to regularize the agents’ policies and mitigate the overfitting issue. We analyze the cause of policy overfitting in actor–critic MARL and design two specific patterns of noise injection that apply random Gaussian noise to the advantage function to stabilize training and enhance performance. Experimental results on the Matrix Game and StarCraft II show higher training efficiency and superior performance for our method, and ablation studies indicate that our method maintains higher entropy in the agents’ policies during training, which leads to more exploration.
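The core idea in the abstract, perturbing advantage values with random Gaussian noise, can be illustrated with a minimal sketch. This record does not describe the paper's two specific injection patterns, so the code below shows only plain additive zero-mean noise. The function name `add_gaussian_noise`, the parameter `sigma`, and the sample advantage values are all hypothetical and are not taken from the paper.

```python
import random
from typing import List, Optional


def add_gaussian_noise(
    advantages: List[List[float]],
    sigma: float,
    seed: Optional[int] = None,
) -> List[List[float]]:
    """Return a copy of per-agent advantages with N(0, sigma^2) noise added.

    Assumption: noise is drawn independently for every agent and timestep.
    The paper's actual injection patterns may differ.
    """
    rng = random.Random(seed)
    return [
        [a + rng.gauss(0.0, sigma) for a in agent_adv]
        for agent_adv in advantages
    ]


# Hypothetical advantage estimates for 2 agents over 3 timesteps.
adv = [[0.5, -0.2, 1.0], [0.1, 0.0, -0.7]]

noisy = add_gaussian_noise(adv, sigma=0.1, seed=0)
unchanged = add_gaussian_noise(adv, sigma=0.0, seed=0)  # sigma=0 disables the noise
```

In a PPO-style update, the perturbed values would take the place of the raw advantage estimates in the policy loss. Setting `sigma` to zero recovers the unregularized baseline, which makes the effect of the noise easy to isolate in an ablation.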
Related Topics
- Type: article
- Language: en
- Landing Page: https://doi.org/10.3390/math10152728
- PDF: https://www.mdpi.com/2227-7390/10/15/2728/pdf?version=1660032885
- OA Status: gold
- Cited By: 7
- References: 29
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4289524420
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4289524420 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.3390/math10152728 (Digital Object Identifier)
- Title: Noise-Regularized Advantage Value for Multi-Agent Reinforcement Learning (work title)
- Type: article (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2022 (year of publication)
- Publication date: 2022-08-02 (full publication date if available)
- Authors: Siying Wang, Wenyu Chen, Jian Hu, Siyue Hu, Liwei Huang (list of authors in order)
- Landing page: https://doi.org/10.3390/math10152728 (publisher landing page)
- PDF URL: https://www.mdpi.com/2227-7390/10/15/2728/pdf?version=1660032885 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: gold (open access status per OpenAlex)
- OA URL: https://www.mdpi.com/2227-7390/10/15/2728/pdf?version=1660032885 (direct OA link when available)
- Concepts: Overfitting, Reinforcement learning, Computer science, Noise (video), Artificial intelligence, Mathematical optimization, Machine learning, Artificial neural network, Mathematics, Image (mathematics) (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 7 (total citation count in OpenAlex)
- Citations by year (recent): 2025: 6, 2024: 1 (per-year citation counts, last 5 years)
- References (count): 29 (number of works referenced by this work)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
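
The full payload table lists the record under dotted, indexed keys such as `topics[0].field.display_name`. The sketch below shows one way a nested JSON object could be flattened into keys of that shape. The nested layout of `record` is an assumption inferred from those keys, and the rule of joining lists of scalars with a comma is inferred from rows such as `indexed_in`; the helper `flatten` is hypothetical and is not part of any OpenAlex tooling.

```python
from typing import Any, Dict


def flatten(obj: Any, prefix: str = "") -> Dict[str, Any]:
    """Flatten nested dicts/lists into 'a.b[0].c'-style keys."""
    flat: Dict[str, Any] = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            child = f"{prefix}.{key}" if prefix else key
            flat.update(flatten(value, child))
    elif isinstance(obj, list):
        if all(not isinstance(v, (dict, list)) for v in obj):
            # Assumed convention: lists of scalars become one comma-joined cell.
            flat[prefix] = ", ".join(str(v) for v in obj)
        else:
            for i, value in enumerate(obj):
                flat.update(flatten(value, f"{prefix}[{i}]"))
    else:
        flat[prefix] = obj
    return flat


# Fragment of the record; the nesting is assumed, the values come from this page.
record = {
    "id": "https://openalex.org/W4289524420",
    "biblio": {"volume": "10", "issue": "15"},
    "topics": [
        {
            "id": "https://openalex.org/T10462",
            "field": {"display_name": "Computer Science"},
        }
    ],
    "counts_by_year": [
        {"year": 2025, "cited_by_count": 6},
        {"year": 2024, "cited_by_count": 1},
    ],
    "indexed_in": ["crossref", "doaj"],
}

flat = flatten(record)
for key, value in flat.items():
    print(f"| {key} | {value} |")
```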
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W4289524420 |
| doi | https://doi.org/10.3390/math10152728 |
| ids.doi | https://doi.org/10.3390/math10152728 |
| ids.openalex | https://openalex.org/W4289524420 |
| fwci | 1.37059174 |
| type | article |
| title | Noise-Regularized Advantage Value for Multi-Agent Reinforcement Learning |
| biblio.issue | 15 |
| biblio.volume | 10 |
| biblio.last_page | 2728 |
| biblio.first_page | 2728 |
| topics[0].id | https://openalex.org/T10462 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9994999766349792 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Reinforcement Learning in Robotics |
| topics[1].id | https://openalex.org/T12794 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9977999925613403 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1703 |
| topics[1].subfield.display_name | Computational Theory and Mathematics |
| topics[1].display_name | Adaptive Dynamic Programming Control |
| topics[2].id | https://openalex.org/T10249 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9763000011444092 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1705 |
| topics[2].subfield.display_name | Computer Networks and Communications |
| topics[2].display_name | Distributed Control Multi-Agent Systems |
| is_xpac | False |
| apc_list.value | 1800 |
| apc_list.currency | CHF |
| apc_list.value_usd | 1949 |
| apc_paid.value | 1800 |
| apc_paid.currency | CHF |
| apc_paid.value_usd | 1949 |
| concepts[0].id | https://openalex.org/C22019652 |
| concepts[0].level | 3 |
| concepts[0].score | 0.881364643573761 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q331309 |
| concepts[0].display_name | Overfitting |
| concepts[1].id | https://openalex.org/C97541855 |
| concepts[1].level | 2 |
| concepts[1].score | 0.8429630398750305 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q830687 |
| concepts[1].display_name | Reinforcement learning |
| concepts[2].id | https://openalex.org/C41008148 |
| concepts[2].level | 0 |
| concepts[2].score | 0.7185949683189392 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[2].display_name | Computer science |
| concepts[3].id | https://openalex.org/C99498987 |
| concepts[3].level | 3 |
| concepts[3].score | 0.5599098205566406 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q2210247 |
| concepts[3].display_name | Noise (video) |
| concepts[4].id | https://openalex.org/C154945302 |
| concepts[4].level | 1 |
| concepts[4].score | 0.5235055088996887 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[4].display_name | Artificial intelligence |
| concepts[5].id | https://openalex.org/C126255220 |
| concepts[5].level | 1 |
| concepts[5].score | 0.4459887444972992 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q141495 |
| concepts[5].display_name | Mathematical optimization |
| concepts[6].id | https://openalex.org/C119857082 |
| concepts[6].level | 1 |
| concepts[6].score | 0.4224834740161896 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q2539 |
| concepts[6].display_name | Machine learning |
| concepts[7].id | https://openalex.org/C50644808 |
| concepts[7].level | 2 |
| concepts[7].score | 0.36116671562194824 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q192776 |
| concepts[7].display_name | Artificial neural network |
| concepts[8].id | https://openalex.org/C33923547 |
| concepts[8].level | 0 |
| concepts[8].score | 0.1376085877418518 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q395 |
| concepts[8].display_name | Mathematics |
| concepts[9].id | https://openalex.org/C115961682 |
| concepts[9].level | 2 |
| concepts[9].score | 0.0 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q860623 |
| concepts[9].display_name | Image (mathematics) |
| keywords[0].id | https://openalex.org/keywords/overfitting |
| keywords[0].score | 0.881364643573761 |
| keywords[0].display_name | Overfitting |
| keywords[1].id | https://openalex.org/keywords/reinforcement-learning |
| keywords[1].score | 0.8429630398750305 |
| keywords[1].display_name | Reinforcement learning |
| keywords[2].id | https://openalex.org/keywords/computer-science |
| keywords[2].score | 0.7185949683189392 |
| keywords[2].display_name | Computer science |
| keywords[3].id | https://openalex.org/keywords/noise |
| keywords[3].score | 0.5599098205566406 |
| keywords[3].display_name | Noise (video) |
| keywords[4].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[4].score | 0.5235055088996887 |
| keywords[4].display_name | Artificial intelligence |
| keywords[5].id | https://openalex.org/keywords/mathematical-optimization |
| keywords[5].score | 0.4459887444972992 |
| keywords[5].display_name | Mathematical optimization |
| keywords[6].id | https://openalex.org/keywords/machine-learning |
| keywords[6].score | 0.4224834740161896 |
| keywords[6].display_name | Machine learning |
| keywords[7].id | https://openalex.org/keywords/artificial-neural-network |
| keywords[7].score | 0.36116671562194824 |
| keywords[7].display_name | Artificial neural network |
| keywords[8].id | https://openalex.org/keywords/mathematics |
| keywords[8].score | 0.1376085877418518 |
| keywords[8].display_name | Mathematics |
| language | en |
| locations[0].id | doi:10.3390/math10152728 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4210192031 |
| locations[0].source.issn | 2227-7390 |
| locations[0].source.type | journal |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | 2227-7390 |
| locations[0].source.is_core | True |
| locations[0].source.is_in_doaj | True |
| locations[0].source.display_name | Mathematics |
| locations[0].source.host_organization | https://openalex.org/P4310310987 |
| locations[0].source.host_organization_name | Multidisciplinary Digital Publishing Institute |
| locations[0].source.host_organization_lineage | https://openalex.org/P4310310987 |
| locations[0].source.host_organization_lineage_names | Multidisciplinary Digital Publishing Institute |
| locations[0].license | cc-by |
| locations[0].pdf_url | https://www.mdpi.com/2227-7390/10/15/2728/pdf?version=1660032885 |
| locations[0].version | publishedVersion |
| locations[0].raw_type | journal-article |
| locations[0].license_id | https://openalex.org/licenses/cc-by |
| locations[0].is_accepted | True |
| locations[0].is_published | True |
| locations[0].raw_source_name | Mathematics |
| locations[0].landing_page_url | https://doi.org/10.3390/math10152728 |
| locations[1].id | pmh:oai:doaj.org/article:fe844a242e30412495249e2e240bfa7f |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306401280 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | False |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | DOAJ (DOAJ: Directory of Open Access Journals) |
| locations[1].source.host_organization | |
| locations[1].source.host_organization_name | |
| locations[1].license | cc-by-sa |
| locations[1].pdf_url | |
| locations[1].version | submittedVersion |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by-sa |
| locations[1].is_accepted | False |
| locations[1].is_published | False |
| locations[1].raw_source_name | Mathematics, Vol 10, Iss 15, p 2728 (2022) |
| locations[1].landing_page_url | https://doaj.org/article/fe844a242e30412495249e2e240bfa7f |
| locations[2].id | pmh:oai:mdpi.com:/2227-7390/10/15/2728/ |
| locations[2].is_oa | True |
| locations[2].source.id | https://openalex.org/S4306400947 |
| locations[2].source.issn | |
| locations[2].source.type | repository |
| locations[2].source.is_oa | True |
| locations[2].source.issn_l | |
| locations[2].source.is_core | False |
| locations[2].source.is_in_doaj | False |
| locations[2].source.display_name | MDPI (MDPI AG) |
| locations[2].source.host_organization | https://openalex.org/I4210097602 |
| locations[2].source.host_organization_name | Multidisciplinary Digital Publishing Institute (Switzerland) |
| locations[2].source.host_organization_lineage | https://openalex.org/I4210097602 |
| locations[2].license | cc-by |
| locations[2].pdf_url | |
| locations[2].version | submittedVersion |
| locations[2].raw_type | Text |
| locations[2].license_id | https://openalex.org/licenses/cc-by |
| locations[2].is_accepted | False |
| locations[2].is_published | False |
| locations[2].raw_source_name | Mathematics |
| locations[2].landing_page_url | https://dx.doi.org/10.3390/math10152728 |
| indexed_in | crossref, doaj |
| authorships[0].author.id | https://openalex.org/A5002463311 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-3180-0815 |
| authorships[0].author.display_name | Siying Wang |
| authorships[0].countries | CN |
| authorships[0].affiliations[0].institution_ids | https://openalex.org/I150229711 |
| authorships[0].affiliations[0].raw_affiliation_string | School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China |
| authorships[0].institutions[0].id | https://openalex.org/I150229711 |
| authorships[0].institutions[0].ror | https://ror.org/04qr3zq92 |
| authorships[0].institutions[0].type | education |
| authorships[0].institutions[0].lineage | https://openalex.org/I150229711 |
| authorships[0].institutions[0].country_code | CN |
| authorships[0].institutions[0].display_name | University of Electronic Science and Technology of China |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Siying Wang |
| authorships[0].is_corresponding | False |
| authorships[0].raw_affiliation_strings | School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China |
| authorships[1].author.id | https://openalex.org/A5100687323 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-9933-8014 |
| authorships[1].author.display_name | Wenyu Chen |
| authorships[1].countries | CN |
| authorships[1].affiliations[0].institution_ids | https://openalex.org/I150229711 |
| authorships[1].affiliations[0].raw_affiliation_string | School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China |
| authorships[1].institutions[0].id | https://openalex.org/I150229711 |
| authorships[1].institutions[0].ror | https://ror.org/04qr3zq92 |
| authorships[1].institutions[0].type | education |
| authorships[1].institutions[0].lineage | https://openalex.org/I150229711 |
| authorships[1].institutions[0].country_code | CN |
| authorships[1].institutions[0].display_name | University of Electronic Science and Technology of China |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Wenyu Chen |
| authorships[1].is_corresponding | False |
| authorships[1].raw_affiliation_strings | School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China |
| authorships[2].author.id | https://openalex.org/A5008387750 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-3437-2625 |
| authorships[2].author.display_name | Jian Hu |
| authorships[2].countries | TW |
| authorships[2].affiliations[0].institution_ids | https://openalex.org/I16733864 |
| authorships[2].affiliations[0].raw_affiliation_string | Graduate Institute of Networking and Multimedia, National Taiwan University, Taipei 106, Taiwan |
| authorships[2].institutions[0].id | https://openalex.org/I16733864 |
| authorships[2].institutions[0].ror | https://ror.org/05bqach95 |
| authorships[2].institutions[0].type | education |
| authorships[2].institutions[0].lineage | https://openalex.org/I16733864 |
| authorships[2].institutions[0].country_code | TW |
| authorships[2].institutions[0].display_name | National Taiwan University |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Jian Hu |
| authorships[2].is_corresponding | True |
| authorships[2].raw_affiliation_strings | Graduate Institute of Networking and Multimedia, National Taiwan University, Taipei 106, Taiwan |
| authorships[3].author.id | https://openalex.org/A5016146635 |
| authorships[3].author.orcid | |
| authorships[3].author.display_name | Siyue Hu |
| authorships[3].countries | TW |
| authorships[3].affiliations[0].institution_ids | https://openalex.org/I16733864 |
| authorships[3].affiliations[0].raw_affiliation_string | Department of Computer Science & Information Engineering, National Taiwan University, Taipei 106, Taiwan |
| authorships[3].institutions[0].id | https://openalex.org/I16733864 |
| authorships[3].institutions[0].ror | https://ror.org/05bqach95 |
| authorships[3].institutions[0].type | education |
| authorships[3].institutions[0].lineage | https://openalex.org/I16733864 |
| authorships[3].institutions[0].country_code | TW |
| authorships[3].institutions[0].display_name | National Taiwan University |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Siyue Hu |
| authorships[3].is_corresponding | False |
| authorships[3].raw_affiliation_strings | Department of Computer Science & Information Engineering, National Taiwan University, Taipei 106, Taiwan |
| authorships[4].author.id | https://openalex.org/A5038091112 |
| authorships[4].author.orcid | https://orcid.org/0000-0002-7590-144X |
| authorships[4].author.display_name | Liwei Huang |
| authorships[4].countries | CN, MO |
| authorships[4].affiliations[0].institution_ids | https://openalex.org/I150229711 |
| authorships[4].affiliations[0].raw_affiliation_string | School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China |
| authorships[4].affiliations[1].institution_ids | https://openalex.org/I204512498 |
| authorships[4].affiliations[1].raw_affiliation_string | The State Key Laboratory of IoTSC, University of Macau, Taipa, Macau 999078, China |
| authorships[4].institutions[0].id | https://openalex.org/I150229711 |
| authorships[4].institutions[0].ror | https://ror.org/04qr3zq92 |
| authorships[4].institutions[0].type | education |
| authorships[4].institutions[0].lineage | https://openalex.org/I150229711 |
| authorships[4].institutions[0].country_code | CN |
| authorships[4].institutions[0].display_name | University of Electronic Science and Technology of China |
| authorships[4].institutions[1].id | https://openalex.org/I204512498 |
| authorships[4].institutions[1].ror | https://ror.org/01r4q9n85 |
| authorships[4].institutions[1].type | education |
| authorships[4].institutions[1].lineage | https://openalex.org/I204512498 |
| authorships[4].institutions[1].country_code | MO |
| authorships[4].institutions[1].display_name | University of Macau |
| authorships[4].author_position | last |
| authorships[4].raw_author_name | Liwei Huang |
| authorships[4].is_corresponding | False |
| authorships[4].raw_affiliation_strings | School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China, The State Key Laboratory of IoTSC, University of Macau, Taipa, Macau 999078, China |
| has_content.pdf | True |
| has_content.grobid_xml | True |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://www.mdpi.com/2227-7390/10/15/2728/pdf?version=1660032885 |
| open_access.oa_status | gold |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Noise-Regularized Advantage Value for Multi-Agent Reinforcement Learning |
| has_fulltext | True |
| is_retracted | False |
| updated_date | 2025-11-06T03:46:38.306776 |
| primary_topic.id | https://openalex.org/T10462 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9994999766349792 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Reinforcement Learning in Robotics |
| related_works | https://openalex.org/W4362597605, https://openalex.org/W1574414179, https://openalex.org/W4297676672, https://openalex.org/W3009056573, https://openalex.org/W2922073769, https://openalex.org/W4281702477, https://openalex.org/W2490526372, https://openalex.org/W2989932438, https://openalex.org/W4387297750, https://openalex.org/W2186333919 |
| cited_by_count | 7 |
| counts_by_year[0].year | 2025 |
| counts_by_year[0].cited_by_count | 6 |
| counts_by_year[1].year | 2024 |
| counts_by_year[1].cited_by_count | 1 |
| locations_count | 3 |
| best_oa_location.id | doi:10.3390/math10152728 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4210192031 |
| best_oa_location.source.issn | 2227-7390 |
| best_oa_location.source.type | journal |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | 2227-7390 |
| best_oa_location.source.is_core | True |
| best_oa_location.source.is_in_doaj | True |
| best_oa_location.source.display_name | Mathematics |
| best_oa_location.source.host_organization | https://openalex.org/P4310310987 |
| best_oa_location.source.host_organization_name | Multidisciplinary Digital Publishing Institute |
| best_oa_location.source.host_organization_lineage | https://openalex.org/P4310310987 |
| best_oa_location.source.host_organization_lineage_names | Multidisciplinary Digital Publishing Institute |
| best_oa_location.license | cc-by |
| best_oa_location.pdf_url | https://www.mdpi.com/2227-7390/10/15/2728/pdf?version=1660032885 |
| best_oa_location.version | publishedVersion |
| best_oa_location.raw_type | journal-article |
| best_oa_location.license_id | https://openalex.org/licenses/cc-by |
| best_oa_location.is_accepted | True |
| best_oa_location.is_published | True |
| best_oa_location.raw_source_name | Mathematics |
| best_oa_location.landing_page_url | https://doi.org/10.3390/math10152728 |
| primary_location.id | doi:10.3390/math10152728 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4210192031 |
| primary_location.source.issn | 2227-7390 |
| primary_location.source.type | journal |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | 2227-7390 |
| primary_location.source.is_core | True |
| primary_location.source.is_in_doaj | True |
| primary_location.source.display_name | Mathematics |
| primary_location.source.host_organization | https://openalex.org/P4310310987 |
| primary_location.source.host_organization_name | Multidisciplinary Digital Publishing Institute |
| primary_location.source.host_organization_lineage | https://openalex.org/P4310310987 |
| primary_location.source.host_organization_lineage_names | Multidisciplinary Digital Publishing Institute |
| primary_location.license | cc-by |
| primary_location.pdf_url | https://www.mdpi.com/2227-7390/10/15/2728/pdf?version=1660032885 |
| primary_location.version | publishedVersion |
| primary_location.raw_type | journal-article |
| primary_location.license_id | https://openalex.org/licenses/cc-by |
| primary_location.is_accepted | True |
| primary_location.is_published | True |
| primary_location.raw_source_name | Mathematics |
| primary_location.landing_page_url | https://doi.org/10.3390/math10152728 |
| publication_date | 2022-08-02 |
| publication_year | 2022 |
| referenced_works | https://openalex.org/W3217162497, https://openalex.org/W2012812921, https://openalex.org/W2735548234, https://openalex.org/W2108383324, https://openalex.org/W2299109138, https://openalex.org/W2145339207, https://openalex.org/W1641379095, https://openalex.org/W2166533447, https://openalex.org/W6738796088, https://openalex.org/W2894976951, https://openalex.org/W2794643322, https://openalex.org/W6740092555, https://openalex.org/W6739193204, https://openalex.org/W2981038142, https://openalex.org/W2807741983, https://openalex.org/W2946606218, https://openalex.org/W2173248099, https://openalex.org/W6739901393, https://openalex.org/W2617547828, https://openalex.org/W3200561352, https://openalex.org/W6767327128, https://openalex.org/W3012445938, https://openalex.org/W6795826604, https://openalex.org/W3170511602, https://openalex.org/W6697407801, https://openalex.org/W2121092017, https://openalex.org/W3168274109, https://openalex.org/W4299802797, https://openalex.org/W4385245566 |
| referenced_works_count | 29 |
| abstract_inverted_index.a | 9, 78 |
| abstract_inverted_index.II | 137 |
| abstract_inverted_index.In | 73 |
| abstract_inverted_index.We | 93 |
| abstract_inverted_index.as | 59, 61 |
| abstract_inverted_index.in | 12, 31, 69, 100 |
| abstract_inverted_index.is | 8 |
| abstract_inverted_index.of | 21, 86, 97, 108, 146, 160 |
| abstract_inverted_index.on | 131 |
| abstract_inverted_index.to | 4, 82, 112, 120, 167 |
| abstract_inverted_index.we | 76 |
| abstract_inverted_index.PPO | 63 |
| abstract_inverted_index.The | 128 |
| abstract_inverted_index.and | 88, 103, 124, 135, 143, 149 |
| abstract_inverted_index.may | 56 |
| abstract_inverted_index.not | 57 |
| abstract_inverted_index.our | 147, 154 |
| abstract_inverted_index.the | 19, 24, 32, 41, 46, 70, 84, 90, 95, 113, 122, 126, 132, 139, 150 |
| abstract_inverted_index.two | 105 |
| abstract_inverted_index.Even | 17 |
| abstract_inverted_index.Game | 134 |
| abstract_inverted_index.even | 65 |
| abstract_inverted_index.from | 28, 40 |
| abstract_inverted_index.good | 60 |
| abstract_inverted_index.keep | 157 |
| abstract_inverted_index.more | 168 |
| abstract_inverted_index.show | 138 |
| abstract_inverted_index.this | 74 |
| abstract_inverted_index.will | 44, 156 |
| abstract_inverted_index.with | 18, 37, 66, 116 |
| abstract_inverted_index.MARL, | 102 |
| abstract_inverted_index.cause | 96 |
| abstract_inverted_index.i.e., | 50 |
| abstract_inverted_index.leads | 166 |
| abstract_inverted_index.noise | 109, 119 |
| abstract_inverted_index.novel | 79 |
| abstract_inverted_index.state | 2, 22 |
| abstract_inverted_index.still | 26 |
| abstract_inverted_index.which | 165 |
| abstract_inverted_index.(IPPO) | 64 |
| abstract_inverted_index.Matrix | 133 |
| abstract_inverted_index.agents | 25, 87 |
| abstract_inverted_index.buffer | 43 |
| abstract_inverted_index.common | 10 |
| abstract_inverted_index.design | 104 |
| abstract_inverted_index.during | 163 |
| abstract_inverted_index.global | 1 |
| abstract_inverted_index.higher | 140, 158 |
| abstract_inverted_index.induce | 45 |
| abstract_inverted_index.issue. | 92 |
| abstract_inverted_index.method | 81, 155 |
| abstract_inverted_index.paper, | 75 |
| abstract_inverted_index.policy | 6, 47, 53, 98 |
| abstract_inverted_index.random | 117 |
| abstract_inverted_index.replay | 42 |
| abstract_inverted_index.stage. | 34 |
| abstract_inverted_index.suffer | 27 |
| abstract_inverted_index.(MAPPO) | 55 |
| abstract_inverted_index.(MARL). | 16 |
| abstract_inverted_index.analyze | 94 |
| abstract_inverted_index.applied | 111 |
| abstract_inverted_index.critic. | 72 |
| abstract_inverted_index.enhance | 5, 125 |
| abstract_inverted_index.entropy | 159 |
| abstract_inverted_index.method, | 148 |
| abstract_inverted_index.perform | 58 |
| abstract_inverted_index.propose | 77 |
| abstract_inverted_index.results | 130 |
| abstract_inverted_index.studies | 152 |
| abstract_inverted_index.Gaussian | 118 |
| abstract_inverted_index.ablation | 151 |
| abstract_inverted_index.approach | 11 |
| abstract_inverted_index.examples | 39 |
| abstract_inverted_index.function | 115 |
| abstract_inverted_index.indicate | 153 |
| abstract_inverted_index.learning | 15 |
| abstract_inverted_index.mitigate | 89 |
| abstract_inverted_index.patterns | 107 |
| abstract_inverted_index.policies | 85, 162 |
| abstract_inverted_index.problem, | 49 |
| abstract_inverted_index.proximal | 52 |
| abstract_inverted_index.specific | 106 |
| abstract_inverted_index.superior | 144 |
| abstract_inverted_index.training | 33, 36, 123, 141 |
| abstract_inverted_index.Moreover, | 35 |
| abstract_inverted_index.StarCraft | 136 |
| abstract_inverted_index.advantage | 114 |
| abstract_inverted_index.agents’ | 161 |
| abstract_inverted_index.injection | 110 |
| abstract_inverted_index.stabilize | 121 |
| abstract_inverted_index.training, | 164 |
| abstract_inverted_index.Leveraging | 0 |
| abstract_inverted_index.additional | 67 |
| abstract_inverted_index.efficiency | 142 |
| abstract_inverted_index.regularize | 83 |
| abstract_inverted_index.supplement | 20 |
| abstract_inverted_index.centralized | 71 |
| abstract_inverted_index.exploration | 30 |
| abstract_inverted_index.independent | 62 |
| abstract_inverted_index.information | 3, 68 |
| abstract_inverted_index.multi-agent | 13, 51 |
| abstract_inverted_index.overfitting | 48, 91, 99 |
| abstract_inverted_index.performance | 145 |
| abstract_inverted_index.experimental | 129 |
| abstract_inverted_index.exploration. | 169 |
| abstract_inverted_index.information, | 23 |
| abstract_inverted_index.insufficient | 29 |
| abstract_inverted_index.optimization | 7, 54 |
| abstract_inverted_index.performance. | 127 |
| abstract_inverted_index.batch-sampled | 38 |
| abstract_inverted_index.reinforcement | 14 |
| abstract_inverted_index.actor–critic | 101 |
| abstract_inverted_index.noise-injection | 80 |
| cited_by_percentile_year.max | 99 |
| cited_by_percentile_year.min | 90 |
| corresponding_author_ids | https://openalex.org/A5008387750 |
| countries_distinct_count | 3 |
| institutions_distinct_count | 5 |
| corresponding_institution_ids | https://openalex.org/I16733864 |
| citation_normalized_percentile.value | 0.80120372 |
| citation_normalized_percentile.is_in_top_1_percent | False |
| citation_normalized_percentile.is_in_top_10_percent | False |
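
The `abstract_inverted_index.*` rows above pair each abstract token with the zero-based positions at which it occurs. A minimal sketch of turning such a mapping back into running text follows. The function name `reconstruct_abstract` is hypothetical, and the sample dictionary is a small subset of the table limited to positions 0 through 7, so it rebuilds only the opening words of the abstract.

```python
from typing import Dict, List


def reconstruct_abstract(inverted_index: Dict[str, List[int]]) -> str:
    """Rebuild text from a token -> positions mapping."""
    positioned = [
        (pos, word)
        for word, positions in inverted_index.items()
        for pos in positions
    ]
    positioned.sort()  # order tokens by their position in the abstract
    return " ".join(word for _, word in positioned)


# Entries copied from the table, restricted to positions 0-7.
sample = {
    "policy": [6],
    "to": [4],
    "Leveraging": [0],
    "optimization": [7],
    "state": [2],
    "global": [1],
    "enhance": [5],
    "information": [3],
}

opening = reconstruct_abstract(sample)
print(opening)
```

Applied to the complete index, the same function would yield the full abstract, because tokens keep their punctuation and every position from 0 to 169 appears exactly once.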