Self-Interest and Systemic Benefits: Emergence of Collective Rationality in Mixed Autonomy Traffic Through Deep Reinforcement Learning
2025 · Open Access
· DOI: https://doi.org/10.48550/arxiv.2511.04883
Autonomous vehicles (AVs) are expected to be commercially available in the near future, leading to mixed autonomy traffic consisting of both AVs and human-driven vehicles (HVs). Although numerous studies have shown that AVs can be deployed to improve overall traffic system performance by incorporating system-level goals into their decision making, it remains unclear whether these benefits persist when agents act out of self-interest -- a trait common to all driving agents, both human and autonomous. This study aims to understand whether self-interested AVs can benefit all driving agents in mixed autonomy traffic systems. The research centers on the concept of collective rationality (CR). Originating in game theory and behavioral economics, CR means that driving agents may cooperate collectively even while pursuing individual interests. Our recent research has proven the existence of CR in an analytical game-theoretic model and demonstrated it empirically in mixed human-driven traffic. In this paper, we show that CR can be attained among driving agents trained using deep reinforcement learning (DRL) with a simple reward design. We examine the extent to which self-interested traffic agents can achieve CR without directly incorporating system-level objectives. Results show that CR consistently emerges across a variety of scenarios, indicating the robustness of this property. We also postulate a mechanism to explain the emergence of CR in this microscopic, dynamic environment and verify it with simulation evidence. This research suggests the possibility of leveraging advanced learning methods (such as federated learning) to achieve collective cooperation among self-interested driving agents in mixed-autonomy systems.
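The distinction the abstract draws between self-interested and system-level reward designs can be illustrated with a hypothetical per-agent reward. The sketch below is purely illustrative (the function name, terms, and weight `w_effort` are my assumptions, not the paper's actual reward): each agent is rewarded only for its own speed tracking and control effort, with no term for average network speed or any other system-level objective.

```python
def self_interested_reward(own_speed, desired_speed, accel, w_effort=0.1):
    """Hypothetical self-interested reward: depends only on the agent's
    own state (speed tracking and control effort), with no system-level
    term such as average network speed."""
    speed_term = 1.0 - abs(desired_speed - own_speed) / desired_speed
    effort_penalty = w_effort * accel ** 2
    return speed_term - effort_penalty

# An agent cruising at its desired speed with zero acceleration gets 1.0.
print(self_interested_reward(30.0, 30.0, 0.0))  # -> 1.0
```

Under a reward of this shape, any cooperation that emerges (the paper's CR) would come from learned interaction dynamics, not from an explicit system-level objective.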
- Type
- preprint
- Landing Page
- https://doi.org/10.48550/arxiv.2511.04883
- OA Status
- green
- OpenAlex ID
- https://openalex.org/W7104655214
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W7104655214 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2511.04883 (Digital Object Identifier)
- Title: Self-Interest and Systemic Benefits: Emergence of Collective Rationality in Mixed Autonomy Traffic Through Deep Reinforcement Learning
- Type: preprint (OpenAlex work type)
- Publication year: 2025
- Publication date: 2025-11-07
- Authors: Di Chen, Jia Li, Michael Zhang (in listed order)
- Landing page: https://doi.org/10.48550/arxiv.2511.04883 (publisher landing page)
- Open access: Yes (a free full text is available)
- OA status: green (per OpenAlex)
- OA URL: https://doi.org/10.48550/arxiv.2511.04883 (direct OA link)
- Concepts: Reinforcement learning, Rationality, Autonomy, Robustness (evolution), Computer science, Collective behavior, Agent-based model, Trait, Game theory, Mechanism (biology), Reinforcement, Microeconomics, Simple (philosophy), Evolutionary game theory, Cognitive psychology, Artificial intelligence, Psychology, Social psychology, Self-determination theory, Risk analysis (engineering), Economics, Nash equilibrium, Management science (top OpenAlex concepts)
- Cited by: 0 (total citation count in OpenAlex)
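The record summarized above can be fetched live from the public OpenAlex works endpoint (`https://api.openalex.org/works/{id}`, no API key required); a minimal sketch:

```python
import json
import urllib.request

OPENALEX_API = "https://api.openalex.org/works/"

def work_url(openalex_id):
    """Endpoint URL for a single OpenAlex work, e.g. 'W7104655214'."""
    return OPENALEX_API + openalex_id

def fetch_work(openalex_id):
    """Download the full JSON record for a work (as dumped in the table below)."""
    with urllib.request.urlopen(work_url(openalex_id)) as resp:
        return json.load(resp)

print(work_url("W7104655214"))  # -> https://api.openalex.org/works/W7104655214
```

Per this page, `fetch_work("W7104655214")["open_access"]["oa_status"]` should come back as `"green"`.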
Full payload
| id | https://openalex.org/W7104655214 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2511.04883 |
| ids.doi | https://doi.org/10.48550/arxiv.2511.04883 |
| ids.openalex | https://openalex.org/W7104655214 |
| fwci | 0.0 |
| type | preprint |
| title | Self-Interest and Systemic Benefits: Emergence of Collective Rationality in Mixed Autonomy Traffic Through Deep Reinforcement Learning |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C97541855 |
| concepts[0].level | 2 |
| concepts[0].score | 0.6970070004463196 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q830687 |
| concepts[0].display_name | Reinforcement learning |
| concepts[1].id | https://openalex.org/C201717286 |
| concepts[1].level | 2 |
| concepts[1].score | 0.6798497438430786 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q938185 |
| concepts[1].display_name | Rationality |
| concepts[2].id | https://openalex.org/C65414064 |
| concepts[2].level | 2 |
| concepts[2].score | 0.6009548902511597 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q484105 |
| concepts[2].display_name | Autonomy |
| concepts[3].id | https://openalex.org/C63479239 |
| concepts[3].level | 3 |
| concepts[3].score | 0.5474610924720764 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q7353546 |
| concepts[3].display_name | Robustness (evolution) |
| concepts[4].id | https://openalex.org/C41008148 |
| concepts[4].level | 0 |
| concepts[4].score | 0.4714452624320984 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[4].display_name | Computer science |
| concepts[5].id | https://openalex.org/C100339178 |
| concepts[5].level | 2 |
| concepts[5].score | 0.44356346130371094 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q2548752 |
| concepts[5].display_name | Collective behavior |
| concepts[6].id | https://openalex.org/C2780873155 |
| concepts[6].level | 2 |
| concepts[6].score | 0.43134692311286926 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q392811 |
| concepts[6].display_name | Agent-based model |
| concepts[7].id | https://openalex.org/C106934330 |
| concepts[7].level | 2 |
| concepts[7].score | 0.3868710994720459 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q1971873 |
| concepts[7].display_name | Trait |
| concepts[8].id | https://openalex.org/C177142836 |
| concepts[8].level | 2 |
| concepts[8].score | 0.38387492299079895 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q44455 |
| concepts[8].display_name | Game theory |
| concepts[9].id | https://openalex.org/C89611455 |
| concepts[9].level | 2 |
| concepts[9].score | 0.381965309381485 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q6804646 |
| concepts[9].display_name | Mechanism (biology) |
| concepts[10].id | https://openalex.org/C67203356 |
| concepts[10].level | 2 |
| concepts[10].score | 0.3327351212501526 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q1321905 |
| concepts[10].display_name | Reinforcement |
| concepts[11].id | https://openalex.org/C175444787 |
| concepts[11].level | 1 |
| concepts[11].score | 0.33255481719970703 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q39072 |
| concepts[11].display_name | Microeconomics |
| concepts[12].id | https://openalex.org/C2780586882 |
| concepts[12].level | 2 |
| concepts[12].score | 0.3106893002986908 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q7520643 |
| concepts[12].display_name | Simple (philosophy) |
| concepts[13].id | https://openalex.org/C20249471 |
| concepts[13].level | 3 |
| concepts[13].score | 0.29849785566329956 |
| concepts[13].wikidata | https://www.wikidata.org/wiki/Q2298789 |
| concepts[13].display_name | Evolutionary game theory |
| concepts[14].id | https://openalex.org/C180747234 |
| concepts[14].level | 1 |
| concepts[14].score | 0.29657143354415894 |
| concepts[14].wikidata | https://www.wikidata.org/wiki/Q23373 |
| concepts[14].display_name | Cognitive psychology |
| concepts[15].id | https://openalex.org/C154945302 |
| concepts[15].level | 1 |
| concepts[15].score | 0.28880080580711365 |
| concepts[15].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[15].display_name | Artificial intelligence |
| concepts[16].id | https://openalex.org/C15744967 |
| concepts[16].level | 0 |
| concepts[16].score | 0.27344661951065063 |
| concepts[16].wikidata | https://www.wikidata.org/wiki/Q9418 |
| concepts[16].display_name | Psychology |
| concepts[17].id | https://openalex.org/C77805123 |
| concepts[17].level | 1 |
| concepts[17].score | 0.27150195837020874 |
| concepts[17].wikidata | https://www.wikidata.org/wiki/Q161272 |
| concepts[17].display_name | Social psychology |
| concepts[18].id | https://openalex.org/C146854351 |
| concepts[18].level | 3 |
| concepts[18].score | 0.2619902193546295 |
| concepts[18].wikidata | https://www.wikidata.org/wiki/Q1433910 |
| concepts[18].display_name | Self-determination theory |
| concepts[19].id | https://openalex.org/C112930515 |
| concepts[19].level | 1 |
| concepts[19].score | 0.26088783144950867 |
| concepts[19].wikidata | https://www.wikidata.org/wiki/Q4389547 |
| concepts[19].display_name | Risk analysis (engineering) |
| concepts[20].id | https://openalex.org/C162324750 |
| concepts[20].level | 0 |
| concepts[20].score | 0.2573305070400238 |
| concepts[20].wikidata | https://www.wikidata.org/wiki/Q8134 |
| concepts[20].display_name | Economics |
| concepts[21].id | https://openalex.org/C46814582 |
| concepts[21].level | 2 |
| concepts[21].score | 0.2568768858909607 |
| concepts[21].wikidata | https://www.wikidata.org/wiki/Q23389 |
| concepts[21].display_name | Nash equilibrium |
| concepts[22].id | https://openalex.org/C539667460 |
| concepts[22].level | 1 |
| concepts[22].score | 0.255063533782959 |
| concepts[22].wikidata | https://www.wikidata.org/wiki/Q2414942 |
| concepts[22].display_name | Management science |
| keywords[0].id | https://openalex.org/keywords/reinforcement-learning |
| keywords[0].score | 0.6970070004463196 |
| keywords[0].display_name | Reinforcement learning |
| keywords[1].id | https://openalex.org/keywords/rationality |
| keywords[1].score | 0.6798497438430786 |
| keywords[1].display_name | Rationality |
| keywords[2].id | https://openalex.org/keywords/autonomy |
| keywords[2].score | 0.6009548902511597 |
| keywords[2].display_name | Autonomy |
| keywords[3].id | https://openalex.org/keywords/robustness |
| keywords[3].score | 0.5474610924720764 |
| keywords[3].display_name | Robustness (evolution) |
| keywords[4].id | https://openalex.org/keywords/collective-behavior |
| keywords[4].score | 0.44356346130371094 |
| keywords[4].display_name | Collective behavior |
| keywords[5].id | https://openalex.org/keywords/agent-based-model |
| keywords[5].score | 0.43134692311286926 |
| keywords[5].display_name | Agent-based model |
| keywords[6].id | https://openalex.org/keywords/trait |
| keywords[6].score | 0.3868710994720459 |
| keywords[6].display_name | Trait |
| language | |
| locations[0].id | doi:10.48550/arxiv.2511.04883 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | cc-by |
| locations[0].pdf_url | |
| locations[0].version | |
| locations[0].raw_type | article |
| locations[0].license_id | https://openalex.org/licenses/cc-by |
| locations[0].is_accepted | False |
| locations[0].is_published | |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | https://doi.org/10.48550/arxiv.2511.04883 |
| indexed_in | datacite |
| authorships[0].author.id | https://openalex.org/A2105472525 |
| authorships[0].author.orcid | |
| authorships[0].author.display_name | Chen Di |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Chen, Di |
| authorships[0].is_corresponding | True |
| authorships[1].author.id | https://openalex.org/A1937087924 |
| authorships[1].author.orcid | https://orcid.org/0000-0001-5191-9971 |
| authorships[1].author.display_name | Li Jia |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Li, Jia |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A3176550034 |
| authorships[2].author.orcid | |
| authorships[2].author.display_name | Zhang, Michael |
| authorships[2].author_position | last |
| authorships[2].raw_author_name | Zhang, Michael |
| authorships[2].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://doi.org/10.48550/arxiv.2511.04883 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-11-11T00:00:00 |
| display_name | Self-Interest and Systemic Benefits: Emergence of Collective Rationality in Mixed Autonomy Traffic Through Deep Reinforcement Learning |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-11T23:23:10.385787 |
| primary_topic | |
| cited_by_count | 0 |
| locations_count | 1 |
| best_oa_location | (duplicates locations[0] above; is_published = False) |
| primary_location | (duplicates locations[0] above; is_published = False) |
| publication_date | 2025-11-07 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index | (token-to-position map for the abstract; omitted here, since it reconstructs exactly to the abstract shown above) |
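OpenAlex ships abstracts in the inverted form used by the `abstract_inverted_index` field above: each token maps to its zero-based word positions, and the plain text is recovered by sorting tokens by position. A minimal sketch:

```python
def reconstruct_abstract(inverted_index):
    """Rebuild plain text from an OpenAlex abstract_inverted_index
    (a mapping of token -> list of zero-based word positions)."""
    positions = [(pos, word)
                 for word, idxs in inverted_index.items()
                 for pos in idxs]
    return " ".join(word for _, word in sorted(positions))

# The first five tokens of this work's index, per the record above.
sample = {"Autonomous": [0], "vehicles": [1], "(AVs)": [2],
          "are": [3], "expected": [4]}
print(reconstruct_abstract(sample))  # -> Autonomous vehicles (AVs) are expected
```

Applied to the full index, this yields the abstract shown at the top of the page.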
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 3 |
| citation_normalized_percentile |