Dynamic Sight Range Selection in Multi-Agent Reinforcement Learning
· 2025
· Open Access
· DOI: https://doi.org/10.48550/arxiv.2505.12811
Multi-agent reinforcement learning (MARL) is often challenged by the sight range dilemma, where agents receive either insufficient or excessive information from their environment. In this paper, we propose a novel method, called Dynamic Sight Range Selection (DSR), to address this issue. DSR uses an Upper Confidence Bound (UCB) algorithm to dynamically adjust the sight range during training. Experimental results show several advantages of DSR. First, DSR achieves better performance in three common MARL environments: Level-Based Foraging (LBF), Multi-Robot Warehouse (RWARE), and StarCraft Multi-Agent Challenge (SMAC). Second, DSR consistently improves performance across multiple MARL algorithms, including QMIX and MAPPO. Third, DSR selects suitable sight ranges for different training steps, thereby accelerating the training process. Finally, DSR provides additional interpretability by indicating the optimal sight range used during training. Unlike existing methods that rely on global information or communication mechanisms, our approach operates solely on the individual sight ranges of agents. This approach offers a practical and efficient solution to the sight range dilemma, making it broadly applicable to complex real-world environments.
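The abstract does not spell out how the UCB algorithm is wired into training, so the following is only a minimal sketch of the general idea, assuming a standard UCB1 bandit in which each candidate sight range is an arm and the reward is an episode return. The class name `UCBSightRangeSelector`, the `exploration` constant, the candidate ranges, and the `toy_return` function are all hypothetical stand-ins and are not taken from the paper.

```python
import math
import random


class UCBSightRangeSelector:
    """UCB1 bandit over a discrete set of candidate sight ranges (illustrative only)."""

    def __init__(self, sight_ranges, exploration=2.0):
        self.sight_ranges = list(sight_ranges)
        self.exploration = exploration  # hypothetical exploration constant
        self.counts = [0] * len(self.sight_ranges)   # times each range was used
        self.means = [0.0] * len(self.sight_ranges)  # running mean return per range
        self.total = 0                                # total selections so far

    def select(self):
        """Return the index of the sight range to use next."""
        # Use every candidate once before applying the UCB formula.
        for i, n in enumerate(self.counts):
            if n == 0:
                return i
        scores = [
            mean + math.sqrt(self.exploration * math.log(self.total) / n)
            for mean, n in zip(self.means, self.counts)
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, index, episode_return):
        """Fold one observed return into the running mean of the chosen range."""
        self.counts[index] += 1
        self.total += 1
        self.means[index] += (episode_return - self.means[index]) / self.counts[index]


def toy_return(sight_range, rng):
    """Made-up stand-in for a training episode; returns peak at sight range 3."""
    return 1.0 - 0.2 * abs(sight_range - 3) + rng.gauss(0.0, 0.05)


rng = random.Random(0)
selector = UCBSightRangeSelector(sight_ranges=[1, 2, 3, 4, 5])
for _ in range(2000):
    arm = selector.select()
    selector.update(arm, toy_return(selector.sight_ranges[arm], rng))

best = max(range(len(selector.counts)), key=selector.counts.__getitem__)
print("most selected sight range:", selector.sight_ranges[best])
```

In this toy setting the selection counts double as a simple interpretability signal: the range chosen most often is the one the bandit currently estimates to be best. A real MARL setup would replace `toy_return` with returns measured from actual training episodes, which drift as the policy improves.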
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2505.12811, https://arxiv.org/pdf/2505.12811
- OA Status: green
- OpenAlex ID: https://openalex.org/W4417302210
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4417302210 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2505.12811 (Digital Object Identifier)
- Title: Dynamic Sight Range Selection in Multi-Agent Reinforcement Learning (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2025 (year of publication)
- Publication date: 2025-05-19 (full publication date if available)
- Authors: Weichen Liao, Ti-Rong Wu, I‐Chen Wu (list of authors in order)
- Landing page: https://arxiv.org/abs/2505.12811 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2505.12811 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2505.12811 (direct OA link when available)
- Cited by: 0 (total citation count in OpenAlex)
Full payload
| id | https://openalex.org/W4417302210 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2505.12811 |
| ids.doi | https://doi.org/10.48550/arxiv.2505.12811 |
| ids.openalex | https://openalex.org/W4417302210 |
| fwci | |
| type | preprint |
| title | Dynamic Sight Range Selection in Multi-Agent Reinforcement Learning |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2505.12811 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | cc-by |
| locations[0].pdf_url | https://arxiv.org/pdf/2505.12811 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | https://openalex.org/licenses/cc-by |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2505.12811 |
| locations[1].id | doi:10.48550/arxiv.2505.12811 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | cc-by |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2505.12811 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5080364763 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-4470-1699 |
| authorships[0].author.display_name | Weichen Liao |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Liao, Wei-Chen |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5027984982 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-7532-3176 |
| authorships[1].author.display_name | Ti-Rong Wu |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Wu, Ti-Rong |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5016730899 |
| authorships[2].author.orcid | https://orcid.org/0000-0003-2535-0587 |
| authorships[2].author.display_name | I‐Chen Wu |
| authorships[2].author_position | last |
| authorships[2].raw_author_name | Wu, I-Chen |
| authorships[2].is_corresponding | False |
| has_content.pdf | True |
| has_content.grobid_xml | True |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2505.12811 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Dynamic Sight Range Selection in Multi-Agent Reinforcement Learning |
| has_fulltext | True |
| is_retracted | False |
| updated_date | 2025-12-13T22:11:55.860423 |
| primary_topic | |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2505.12811 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | cc-by |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2505.12811 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | https://openalex.org/licenses/cc-by |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2505.12811 |
| primary_location.id | pmh:oai:arXiv.org:2505.12811 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | cc-by |
| primary_location.pdf_url | https://arxiv.org/pdf/2505.12811 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | https://openalex.org/licenses/cc-by |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2505.12811 |
| publication_date | 2025-05-19 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 28, 162 |
| abstract_inverted_index.In | 23 |
| abstract_inverted_index.an | 43 |
| abstract_inverted_index.by | 7, 127 |
| abstract_inverted_index.in | 73 |
| abstract_inverted_index.is | 4 |
| abstract_inverted_index.it | 173 |
| abstract_inverted_index.of | 62, 157 |
| abstract_inverted_index.on | 141, 152 |
| abstract_inverted_index.or | 17, 144 |
| abstract_inverted_index.to | 37, 167, 176 |
| abstract_inverted_index.we | 26, 66 |
| abstract_inverted_index.DSR | 41, 69, 95, 108, 123 |
| abstract_inverted_index.and | 49, 85, 105, 164 |
| abstract_inverted_index.for | 113 |
| abstract_inverted_index.our | 91, 147 |
| abstract_inverted_index.the | 8, 52, 119, 129, 153, 168 |
| abstract_inverted_index.DSR. | 64 |
| abstract_inverted_index.MARL | 76, 101 |
| abstract_inverted_index.QMIX | 104 |
| abstract_inverted_index.This | 159 |
| abstract_inverted_index.from | 20 |
| abstract_inverted_index.rely | 140 |
| abstract_inverted_index.show | 59, 93 |
| abstract_inverted_index.that | 94, 139 |
| abstract_inverted_index.this | 24, 39 |
| abstract_inverted_index.used | 133 |
| abstract_inverted_index.(UCB) | 47 |
| abstract_inverted_index.Bound | 46 |
| abstract_inverted_index.Range | 34 |
| abstract_inverted_index.Sight | 33 |
| abstract_inverted_index.Upper | 44 |
| abstract_inverted_index.based | 151 |
| abstract_inverted_index.novel | 29 |
| abstract_inverted_index.often | 5 |
| abstract_inverted_index.range | 10, 54, 132, 170 |
| abstract_inverted_index.sight | 9, 53, 111, 131, 155, 169 |
| abstract_inverted_index.their | 21 |
| abstract_inverted_index.three | 74 |
| abstract_inverted_index.using | 63, 68 |
| abstract_inverted_index.where | 12 |
| abstract_inverted_index.(DSR), | 36 |
| abstract_inverted_index.(LBF), | 81 |
| abstract_inverted_index.(MARL) | 3 |
| abstract_inverted_index.First, | 65 |
| abstract_inverted_index.MAPPO. | 106 |
| abstract_inverted_index.Third, | 107 |
| abstract_inverted_index.Unlike | 136 |
| abstract_inverted_index.across | 99 |
| abstract_inverted_index.agents | 13 |
| abstract_inverted_index.better | 71 |
| abstract_inverted_index.called | 31 |
| abstract_inverted_index.common | 75 |
| abstract_inverted_index.during | 55, 134 |
| abstract_inverted_index.either | 14 |
| abstract_inverted_index.global | 142 |
| abstract_inverted_index.issue. | 40 |
| abstract_inverted_index.making | 172 |
| abstract_inverted_index.offers | 109, 161 |
| abstract_inverted_index.paper, | 25 |
| abstract_inverted_index.ranges | 112, 156 |
| abstract_inverted_index.solely | 150 |
| abstract_inverted_index.steps, | 116 |
| abstract_inverted_index.(SMAC). | 89 |
| abstract_inverted_index.Dynamic | 32 |
| abstract_inverted_index.Second, | 90 |
| abstract_inverted_index.address | 38 |
| abstract_inverted_index.adjusts | 51 |
| abstract_inverted_index.agents. | 158 |
| abstract_inverted_index.broadly | 174 |
| abstract_inverted_index.complex | 178 |
| abstract_inverted_index.method, | 30 |
| abstract_inverted_index.methods | 138 |
| abstract_inverted_index.optimal | 130 |
| abstract_inverted_index.propose | 27 |
| abstract_inverted_index.receive | 15 |
| abstract_inverted_index.results | 58, 92 |
| abstract_inverted_index.several | 60 |
| abstract_inverted_index.thereby | 117 |
| abstract_inverted_index.(RWARE), | 84 |
| abstract_inverted_index.Finally, | 122 |
| abstract_inverted_index.Foraging | 80 |
| abstract_inverted_index.Learning | 2 |
| abstract_inverted_index.achieves | 70 |
| abstract_inverted_index.approach | 148, 160 |
| abstract_inverted_index.dilemma, | 11, 171 |
| abstract_inverted_index.existing | 137 |
| abstract_inverted_index.improves | 97 |
| abstract_inverted_index.multiple | 100 |
| abstract_inverted_index.operates | 149 |
| abstract_inverted_index.process. | 121 |
| abstract_inverted_index.provides | 124 |
| abstract_inverted_index.solution | 166 |
| abstract_inverted_index.suitable | 110 |
| abstract_inverted_index.training | 115, 120 |
| abstract_inverted_index.utilizes | 42 |
| abstract_inverted_index.Challenge | 88 |
| abstract_inverted_index.Selection | 35 |
| abstract_inverted_index.StarCraft | 86 |
| abstract_inverted_index.Warehouse | 83 |
| abstract_inverted_index.algorithm | 48 |
| abstract_inverted_index.different | 114 |
| abstract_inverted_index.efficient | 165 |
| abstract_inverted_index.excessive | 18 |
| abstract_inverted_index.including | 78, 103 |
| abstract_inverted_index.practical | 163 |
| abstract_inverted_index.training. | 56, 135 |
| abstract_inverted_index.Confidence | 45 |
| abstract_inverted_index.Experiment | 57 |
| abstract_inverted_index.additional | 125 |
| abstract_inverted_index.advantages | 61 |
| abstract_inverted_index.applicable | 175 |
| abstract_inverted_index.challenged | 6 |
| abstract_inverted_index.indicating | 128 |
| abstract_inverted_index.individual | 154 |
| abstract_inverted_index.real-world | 177 |
| abstract_inverted_index.Level-Based | 79 |
| abstract_inverted_index.Multi-Agent | 87 |
| abstract_inverted_index.Multi-Robot | 82 |
| abstract_inverted_index.Multi-agent | 0 |
| abstract_inverted_index.algorithms, | 102 |
| abstract_inverted_index.demonstrate | 67 |
| abstract_inverted_index.dynamically | 50 |
| abstract_inverted_index.information | 19, 143 |
| abstract_inverted_index.mechanisms, | 146 |
| abstract_inverted_index.performance | 72, 98 |
| abstract_inverted_index.accelerating | 118 |
| abstract_inverted_index.consistently | 96 |
| abstract_inverted_index.environment. | 22 |
| abstract_inverted_index.insufficient | 16 |
| abstract_inverted_index.communication | 145 |
| abstract_inverted_index.environments, | 77 |
| abstract_inverted_index.environments. | 179 |
| abstract_inverted_index.reinforcement | 1 |
| abstract_inverted_index.interpretability | 126 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 3 |
| citation_normalized_percentile | |
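The `abstract_inverted_index.*` rows in the table above store the abstract as a mapping from each word to the zero-based positions where it occurs. As a minimal sketch of how such a mapping can be turned back into running text, the snippet below sorts words by position and joins them with spaces. The function name `rebuild_abstract` is hypothetical, and `sample` is a hand-copied subset limited to positions 0 through 11 of this record, so later positions of words such as "by" and "the" are deliberately left out.

```python
def rebuild_abstract(inverted_index):
    """Rebuild text from a {word: [positions]} mapping by sorting on position."""
    position_to_word = {}
    for word, positions in inverted_index.items():
        for position in positions:
            position_to_word[position] = word
    return " ".join(position_to_word[p] for p in sorted(position_to_word))


# Subset of this record's index, restricted to positions 0..11.
sample = {
    "Multi-agent": [0],
    "reinforcement": [1],
    "Learning": [2],
    "(MARL)": [3],
    "is": [4],
    "often": [5],
    "challenged": [6],
    "by": [7],
    "the": [8],
    "sight": [9],
    "range": [10],
    "dilemma,": [11],
}

opening = rebuild_abstract(sample)
print(opening)
# prints: Multi-agent reinforcement Learning (MARL) is often challenged by the sight range dilemma,
```

Applied to the complete index (positions 0 through 179 in this record), the same function should yield the full abstract, with punctuation preserved because it is attached to the indexed tokens.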