The Blessing of Heterogeneity in Federated Q-Learning: Linear Speedup and Beyond
2023 · Open Access · DOI: https://doi.org/10.48550/arxiv.2305.10697
When the data used for reinforcement learning (RL) are collected by multiple agents in a distributed manner, federated versions of RL algorithms allow collaborative learning without the need for agents to share their local data. In this paper, we consider federated Q-learning, which aims to learn an optimal Q-function by periodically aggregating local Q-estimates trained on local data alone. Focusing on infinite-horizon tabular Markov decision processes, we provide sample complexity guarantees for both the synchronous and asynchronous variants of federated Q-learning. In both cases, our bounds exhibit a linear speedup with respect to the number of agents and near-optimal dependencies on other salient problem parameters. In the asynchronous setting, existing analyses of federated Q-learning, which adopt an equally weighted averaging of local Q-estimates, require that every agent cover the entire state-action space. In contrast, our improved sample complexity scales inversely with the minimum entry of the average stationary state-action occupancy distribution of all agents, and thus only requires the agents to collectively cover the entire state-action space. This unveils the blessing of heterogeneity: collaborative learning relaxes the coverage requirement of the single-agent case. However, the sample complexity under equally weighted averaging still suffers when the local trajectories are highly heterogeneous. In response, we propose a novel federated Q-learning algorithm with importance averaging, which gives larger weights to more frequently visited state-action pairs and achieves a robust linear speedup as if all trajectories were centrally processed, regardless of the heterogeneity of local behavior policies.
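The aggregation step described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy two-state MDP, the fixed behavior policies, the function names, and in particular the use of local visit counts as importance weights. The paper's exact weighting rule and step-size schedule may differ.

```python
from typing import List

Table = List[List[float]]

# Toy 2-state, 2-action MDP (hypothetical): every action flips the state.
N_STATES, N_ACTIONS = 2, 2
REWARD = [[0.0, 1.0], [0.5, 0.0]]  # REWARD[s][a]
# Heterogeneous behavior policies: agent m always plays action m, so no single
# agent covers every (state, action) pair, but the two agents together do.
POLICIES = [[0, 0], [1, 1]]  # POLICIES[m][s]


def aggregate(local_qs: List[Table], local_counts: List[List[List[int]]],
              importance: bool) -> Table:
    """Average local Q-tables entry by entry.

    importance=False: equal weights for all agents.
    importance=True: weights proportional to each agent's visit count of the
    entry in the last period (an illustrative choice, not the paper's rule).
    """
    n_agents = len(local_qs)
    merged = [[0.0] * N_ACTIONS for _ in range(len(local_qs[0]))]
    for s in range(len(merged)):
        for a in range(N_ACTIONS):
            visits = [counts[s][a] for counts in local_counts]
            total = sum(visits)
            if importance and total > 0:
                weights = [v / total for v in visits]
            else:
                weights = [1.0 / n_agents] * n_agents
            merged[s][a] = sum(w * q[s][a] for w, q in zip(weights, local_qs))
    return merged


def run_federated(importance: bool, rounds: int = 600, period: int = 4,
                  gamma: float = 0.9, lr: float = 0.5) -> Table:
    """Asynchronous federated Q-learning with periodic aggregation."""
    n_agents = len(POLICIES)
    global_q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    states = [0] * n_agents
    for _round in range(rounds):
        local_qs, local_counts = [], []
        for m in range(n_agents):
            q = [row[:] for row in global_q]  # start from the shared estimate
            counts = [[0] * N_ACTIONS for _ in range(N_STATES)]
            s = states[m]
            for _step in range(period):
                a = POLICIES[m][s]
                s_next = 1 - s  # toy dynamics: the state always flips
                target = REWARD[s][a] + gamma * max(q[s_next])
                q[s][a] = (1 - lr) * q[s][a] + lr * target
                counts[s][a] += 1
                s = s_next
            states[m] = s
            local_qs.append(q)
            local_counts.append(counts)
        global_q = aggregate(local_qs, local_counts, importance)
    return global_q


if __name__ == "__main__":
    print("equal averaging:     ", run_federated(importance=False))
    print("importance averaging:", run_federated(importance=True))
```

In this toy setting each state-action pair is visited by exactly one agent per period. Equal averaging therefore mixes every fresh update with a stale copy from the agent that never visited the pair, while count-based weights pass the fresh update through unchanged.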
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2305.10697 (PDF: https://arxiv.org/pdf/2305.10697)
- OA Status: green
- Cited By: 4
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4377130806
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4377130806 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2305.10697 (Digital Object Identifier)
- Title: The Blessing of Heterogeneity in Federated Q-Learning: Linear Speedup and Beyond (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2023 (year of publication)
- Publication date: 2023-05-18 (full publication date if available)
- Authors: Jiin Woo, Gauri Joshi, Yuejie Chi (list of authors in order)
- Landing page: https://arxiv.org/abs/2305.10697 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2305.10697 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2305.10697 (direct OA link when available)
- Concepts: Speedup, Computer science, Asynchronous communication, Reinforcement learning, Markov decision process, Sample (material), State space, Action (physics), State (computer science), Artificial intelligence, Algorithm, Markov process, Mathematics, Parallel computing, Quantum mechanics, Computer network, Chemistry, Chromatography, Statistics, Physics (top concepts attached by OpenAlex)
- Cited by: 4 (total citation count in OpenAlex)
- Citations by year (recent): 2025: 2, 2024: 2 (per-year citation counts, last 5 years)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4377130806 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2305.10697 |
| ids.doi | https://doi.org/10.48550/arxiv.2305.10697 |
| ids.openalex | https://openalex.org/W4377130806 |
| fwci | |
| type | preprint |
| title | The Blessing of Heterogeneity in Federated Q-Learning: Linear Speedup and Beyond |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10462 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9850000143051147 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Reinforcement Learning in Robotics |
| topics[1].id | https://openalex.org/T13553 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9616000056266785 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1705 |
| topics[1].subfield.display_name | Computer Networks and Communications |
| topics[1].display_name | Age of Information Optimization |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C68339613 |
| concepts[0].level | 2 |
| concepts[0].score | 0.8418018817901611 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q1549489 |
| concepts[0].display_name | Speedup |
| concepts[1].id | https://openalex.org/C41008148 |
| concepts[1].level | 0 |
| concepts[1].score | 0.7268593907356262 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[1].display_name | Computer science |
| concepts[2].id | https://openalex.org/C151319957 |
| concepts[2].level | 2 |
| concepts[2].score | 0.6709046363830566 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q752739 |
| concepts[2].display_name | Asynchronous communication |
| concepts[3].id | https://openalex.org/C97541855 |
| concepts[3].level | 2 |
| concepts[3].score | 0.6544727683067322 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q830687 |
| concepts[3].display_name | Reinforcement learning |
| concepts[4].id | https://openalex.org/C106189395 |
| concepts[4].level | 3 |
| concepts[4].score | 0.5519530177116394 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q176789 |
| concepts[4].display_name | Markov decision process |
| concepts[5].id | https://openalex.org/C198531522 |
| concepts[5].level | 2 |
| concepts[5].score | 0.490372896194458 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q485146 |
| concepts[5].display_name | Sample (material) |
| concepts[6].id | https://openalex.org/C72434380 |
| concepts[6].level | 2 |
| concepts[6].score | 0.4857054650783539 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q230930 |
| concepts[6].display_name | State space |
| concepts[7].id | https://openalex.org/C2780791683 |
| concepts[7].level | 2 |
| concepts[7].score | 0.4248407185077667 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q846785 |
| concepts[7].display_name | Action (physics) |
| concepts[8].id | https://openalex.org/C48103436 |
| concepts[8].level | 2 |
| concepts[8].score | 0.4198514223098755 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q599031 |
| concepts[8].display_name | State (computer science) |
| concepts[9].id | https://openalex.org/C154945302 |
| concepts[9].level | 1 |
| concepts[9].score | 0.3559004068374634 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[9].display_name | Artificial intelligence |
| concepts[10].id | https://openalex.org/C11413529 |
| concepts[10].level | 1 |
| concepts[10].score | 0.2650968134403229 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q8366 |
| concepts[10].display_name | Algorithm |
| concepts[11].id | https://openalex.org/C159886148 |
| concepts[11].level | 2 |
| concepts[11].score | 0.2562650442123413 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q176645 |
| concepts[11].display_name | Markov process |
| concepts[12].id | https://openalex.org/C33923547 |
| concepts[12].level | 0 |
| concepts[12].score | 0.1998785436153412 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q395 |
| concepts[12].display_name | Mathematics |
| concepts[13].id | https://openalex.org/C173608175 |
| concepts[13].level | 1 |
| concepts[13].score | 0.08236810564994812 |
| concepts[13].wikidata | https://www.wikidata.org/wiki/Q232661 |
| concepts[13].display_name | Parallel computing |
| concepts[14].id | https://openalex.org/C62520636 |
| concepts[14].level | 1 |
| concepts[14].score | 0.0 |
| concepts[14].wikidata | https://www.wikidata.org/wiki/Q944 |
| concepts[14].display_name | Quantum mechanics |
| concepts[15].id | https://openalex.org/C31258907 |
| concepts[15].level | 1 |
| concepts[15].score | 0.0 |
| concepts[15].wikidata | https://www.wikidata.org/wiki/Q1301371 |
| concepts[15].display_name | Computer network |
| concepts[16].id | https://openalex.org/C185592680 |
| concepts[16].level | 0 |
| concepts[16].score | 0.0 |
| concepts[16].wikidata | https://www.wikidata.org/wiki/Q2329 |
| concepts[16].display_name | Chemistry |
| concepts[17].id | https://openalex.org/C43617362 |
| concepts[17].level | 1 |
| concepts[17].score | 0.0 |
| concepts[17].wikidata | https://www.wikidata.org/wiki/Q170050 |
| concepts[17].display_name | Chromatography |
| concepts[18].id | https://openalex.org/C105795698 |
| concepts[18].level | 1 |
| concepts[18].score | 0.0 |
| concepts[18].wikidata | https://www.wikidata.org/wiki/Q12483 |
| concepts[18].display_name | Statistics |
| concepts[19].id | https://openalex.org/C121332964 |
| concepts[19].level | 0 |
| concepts[19].score | 0.0 |
| concepts[19].wikidata | https://www.wikidata.org/wiki/Q413 |
| concepts[19].display_name | Physics |
| keywords[0].id | https://openalex.org/keywords/speedup |
| keywords[0].score | 0.8418018817901611 |
| keywords[0].display_name | Speedup |
| keywords[1].id | https://openalex.org/keywords/computer-science |
| keywords[1].score | 0.7268593907356262 |
| keywords[1].display_name | Computer science |
| keywords[2].id | https://openalex.org/keywords/asynchronous-communication |
| keywords[2].score | 0.6709046363830566 |
| keywords[2].display_name | Asynchronous communication |
| keywords[3].id | https://openalex.org/keywords/reinforcement-learning |
| keywords[3].score | 0.6544727683067322 |
| keywords[3].display_name | Reinforcement learning |
| keywords[4].id | https://openalex.org/keywords/markov-decision-process |
| keywords[4].score | 0.5519530177116394 |
| keywords[4].display_name | Markov decision process |
| keywords[5].id | https://openalex.org/keywords/sample |
| keywords[5].score | 0.490372896194458 |
| keywords[5].display_name | Sample (material) |
| keywords[6].id | https://openalex.org/keywords/state-space |
| keywords[6].score | 0.4857054650783539 |
| keywords[6].display_name | State space |
| keywords[7].id | https://openalex.org/keywords/action |
| keywords[7].score | 0.4248407185077667 |
| keywords[7].display_name | Action (physics) |
| keywords[8].id | https://openalex.org/keywords/state |
| keywords[8].score | 0.4198514223098755 |
| keywords[8].display_name | State (computer science) |
| keywords[9].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[9].score | 0.3559004068374634 |
| keywords[9].display_name | Artificial intelligence |
| keywords[10].id | https://openalex.org/keywords/algorithm |
| keywords[10].score | 0.2650968134403229 |
| keywords[10].display_name | Algorithm |
| keywords[11].id | https://openalex.org/keywords/markov-process |
| keywords[11].score | 0.2562650442123413 |
| keywords[11].display_name | Markov process |
| keywords[12].id | https://openalex.org/keywords/mathematics |
| keywords[12].score | 0.1998785436153412 |
| keywords[12].display_name | Mathematics |
| keywords[13].id | https://openalex.org/keywords/parallel-computing |
| keywords[13].score | 0.08236810564994812 |
| keywords[13].display_name | Parallel computing |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2305.10697 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2305.10697 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2305.10697 |
| locations[1].id | doi:10.48550/arxiv.2305.10697 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2305.10697 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5102514656 |
| authorships[0].author.orcid | |
| authorships[0].author.display_name | Jiin Woo |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Woo, Jiin |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5067441201 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-6372-9697 |
| authorships[1].author.display_name | Gauri Joshi |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Joshi, Gauri |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5053809095 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-6766-5459 |
| authorships[2].author.display_name | Yuejie Chi |
| authorships[2].author_position | last |
| authorships[2].raw_author_name | Chi, Yuejie |
| authorships[2].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2305.10697 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | The Blessing of Heterogeneity in Federated Q-Learning: Linear Speedup and Beyond |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10462 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9850000143051147 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Reinforcement Learning in Robotics |
| related_works | https://openalex.org/W2168758875, https://openalex.org/W4246549241, https://openalex.org/W2410733619, https://openalex.org/W2963483475, https://openalex.org/W4327568679, https://openalex.org/W3096874164, https://openalex.org/W1985560493, https://openalex.org/W2937181779, https://openalex.org/W2386410636, https://openalex.org/W2357975469 |
| cited_by_count | 4 |
| counts_by_year[0].year | 2025 |
| counts_by_year[0].cited_by_count | 2 |
| counts_by_year[1].year | 2024 |
| counts_by_year[1].cited_by_count | 2 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2305.10697 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2305.10697 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2305.10697 |
| primary_location.id | pmh:oai:arXiv.org:2305.10697 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2305.10697 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2305.10697 |
| publication_date | 2023-05-18 |
| publication_year | 2023 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 14, 87, 202, 221 |
| abstract_inverted_index.In | 35, 81, 105, 132, 198 |
| abstract_inverted_index.RL | 20 |
| abstract_inverted_index.an | 46, 116 |
| abstract_inverted_index.as | 225 |
| abstract_inverted_index.by | 10, 49, 176 |
| abstract_inverted_index.if | 226 |
| abstract_inverted_index.in | 13, 172 |
| abstract_inverted_index.of | 19, 78, 95, 111, 120, 145, 152, 170, 181, 233, 236 |
| abstract_inverted_index.on | 55, 60, 100 |
| abstract_inverted_index.to | 30, 44, 92, 141, 160, 213 |
| abstract_inverted_index.we | 38, 66, 200 |
| abstract_inverted_index.all | 153, 227 |
| abstract_inverted_index.and | 75, 97 |
| abstract_inverted_index.are | 8, 195, 229 |
| abstract_inverted_index.for | 4, 28, 71 |
| abstract_inverted_index.its | 186 |
| abstract_inverted_index.our | 84, 134 |
| abstract_inverted_index.the | 1, 26, 73, 93, 106, 128, 142, 146, 158, 163, 168, 178, 182, 192, 234 |
| abstract_inverted_index.(RL) | 7 |
| abstract_inverted_index.When | 0 |
| abstract_inverted_index.aims | 43 |
| abstract_inverted_index.both | 72, 82 |
| abstract_inverted_index.data | 2, 57 |
| abstract_inverted_index.more | 214 |
| abstract_inverted_index.need | 27 |
| abstract_inverted_index.only | 156 |
| abstract_inverted_index.that | 124 |
| abstract_inverted_index.this | 36 |
| abstract_inverted_index.thus | 155 |
| abstract_inverted_index.used | 3 |
| abstract_inverted_index.when | 191 |
| abstract_inverted_index.with | 90, 207 |
| abstract_inverted_index.adopt | 115 |
| abstract_inverted_index.agent | 126 |
| abstract_inverted_index.allow | 22 |
| abstract_inverted_index.case. | 184 |
| abstract_inverted_index.cover | 162 |
| abstract_inverted_index.data. | 34 |
| abstract_inverted_index.entry | 144 |
| abstract_inverted_index.every | 125 |
| abstract_inverted_index.learn | 45 |
| abstract_inverted_index.local | 33, 52, 56, 121, 193, 237 |
| abstract_inverted_index.novel | 203 |
| abstract_inverted_index.other | 101 |
| abstract_inverted_index.share | 31 |
| abstract_inverted_index.still | 189 |
| abstract_inverted_index.their | 32 |
| abstract_inverted_index.which | 42, 114, 219 |
| abstract_inverted_index.Markov | 63 |
| abstract_inverted_index.agents | 12, 29, 96, 159 |
| abstract_inverted_index.alone. | 58 |
| abstract_inverted_index.bounds | 85 |
| abstract_inverted_index.cases, | 83 |
| abstract_inverted_index.covers | 127 |
| abstract_inverted_index.entire | 129, 164 |
| abstract_inverted_index.giving | 210 |
| abstract_inverted_index.highly | 196 |
| abstract_inverted_index.larger | 211 |
| abstract_inverted_index.linear | 88, 223 |
| abstract_inverted_index.number | 94 |
| abstract_inverted_index.pairs, | 218 |
| abstract_inverted_index.paper, | 37 |
| abstract_inverted_index.robust | 222 |
| abstract_inverted_index.sample | 68, 136, 187 |
| abstract_inverted_index.scales | 138 |
| abstract_inverted_index.space, | 166 |
| abstract_inverted_index.space. | 131 |
| abstract_inverted_index.agents, | 154 |
| abstract_inverted_index.average | 147 |
| abstract_inverted_index.equally | 117 |
| abstract_inverted_index.exhibit | 86 |
| abstract_inverted_index.inverse | 139 |
| abstract_inverted_index.manner, | 16 |
| abstract_inverted_index.minimum | 143 |
| abstract_inverted_index.optimal | 47 |
| abstract_inverted_index.problem | 103 |
| abstract_inverted_index.propose | 201 |
| abstract_inverted_index.provide | 67 |
| abstract_inverted_index.require | 123 |
| abstract_inverted_index.respect | 91 |
| abstract_inverted_index.salient | 102 |
| abstract_inverted_index.speedup | 89, 224 |
| abstract_inverted_index.suffers | 190 |
| abstract_inverted_index.tabular | 62 |
| abstract_inverted_index.trained | 54 |
| abstract_inverted_index.visited | 216 |
| abstract_inverted_index.weights | 212 |
| abstract_inverted_index.without | 25 |
| abstract_inverted_index.Focusing | 59 |
| abstract_inverted_index.However, | 185 |
| abstract_inverted_index.achieves | 220 |
| abstract_inverted_index.analyses | 110 |
| abstract_inverted_index.behavior | 238 |
| abstract_inverted_index.blessing | 169 |
| abstract_inverted_index.consider | 39 |
| abstract_inverted_index.coverage | 179 |
| abstract_inverted_index.decision | 64 |
| abstract_inverted_index.enabling | 173 |
| abstract_inverted_index.existing | 109 |
| abstract_inverted_index.improved | 135 |
| abstract_inverted_index.learning | 6, 24, 175 |
| abstract_inverted_index.multiple | 11 |
| abstract_inverted_index.relaxing | 177 |
| abstract_inverted_index.setting, | 108 |
| abstract_inverted_index.variants | 77 |
| abstract_inverted_index.versions | 18 |
| abstract_inverted_index.weighted | 118 |
| abstract_inverted_index.algorithm | 206 |
| abstract_inverted_index.averaging | 119 |
| abstract_inverted_index.centrally | 230 |
| abstract_inverted_index.collected | 9 |
| abstract_inverted_index.contrast, | 133 |
| abstract_inverted_index.federated | 17, 40, 79, 112, 204 |
| abstract_inverted_index.occupancy | 150 |
| abstract_inverted_index.policies. | 239 |
| abstract_inverted_index.requiring | 157 |
| abstract_inverted_index.response, | 199 |
| abstract_inverted_index.unveiling | 167 |
| abstract_inverted_index.Q-function | 48 |
| abstract_inverted_index.Q-learning | 205 |
| abstract_inverted_index.algorithms | 21 |
| abstract_inverted_index.averaging, | 209 |
| abstract_inverted_index.complexity | 69, 137, 188 |
| abstract_inverted_index.frequently | 215 |
| abstract_inverted_index.guarantees | 70 |
| abstract_inverted_index.importance | 208 |
| abstract_inverted_index.processed, | 231 |
| abstract_inverted_index.processes, | 65 |
| abstract_inverted_index.regardless | 232 |
| abstract_inverted_index.stationary | 148 |
| abstract_inverted_index.Q-estimates | 53 |
| abstract_inverted_index.Q-learning, | 41, 113 |
| abstract_inverted_index.Q-learning. | 80 |
| abstract_inverted_index.aggregating | 51 |
| abstract_inverted_index.distributed | 15 |
| abstract_inverted_index.parameters. | 104 |
| abstract_inverted_index.requirement | 180 |
| abstract_inverted_index.synchronous | 74 |
| abstract_inverted_index.Q-estimates, | 122 |
| abstract_inverted_index.asynchronous | 76, 107 |
| abstract_inverted_index.collectively | 161 |
| abstract_inverted_index.dependencies | 99 |
| abstract_inverted_index.distribution | 151 |
| abstract_inverted_index.near-optimal | 98 |
| abstract_inverted_index.periodically | 50 |
| abstract_inverted_index.single-agent | 183 |
| abstract_inverted_index.state-action | 130, 149, 165, 217 |
| abstract_inverted_index.trajectories | 194, 228 |
| abstract_inverted_index.collaborative | 23, 174 |
| abstract_inverted_index.heterogeneity | 171, 235 |
| abstract_inverted_index.reinforcement | 5 |
| abstract_inverted_index.heterogeneous. | 197 |
| abstract_inverted_index.proportionally | 140 |
| abstract_inverted_index.infinite-horizon | 61 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 3 |
| sustainable_development_goals[0].id | https://metadata.un.org/sdg/16 |
| sustainable_development_goals[0].score | 0.7699999809265137 |
| sustainable_development_goals[0].display_name | Peace, Justice and strong institutions |
| citation_normalized_percentile |
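
The `abstract_inverted_index.*` rows in the payload map each word of the abstract to the positions where it occurs. A minimal sketch of turning such a mapping back into running text follows. It assumes positions are zero-based word offsets, as the rows suggest (`When` at 0), and the function name `rebuild_abstract` is hypothetical. The sample dictionary uses only the first eight positions from the table.

```python
def rebuild_abstract(inverted_index: dict[str, list[int]]) -> str:
    """Rebuild text from a word -> positions mapping."""
    by_position: dict[int, str] = {}
    for word, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = word
    # Join the words in position order; gaps in the positions are skipped.
    return " ".join(by_position[pos] for pos in sorted(by_position))


# Subset of the payload rows, restricted to positions 0 through 7.
sample = {
    "When": [0], "the": [1], "data": [2], "used": [3],
    "for": [4], "reinforcement": [5], "learning": [6], "(RL)": [7],
}

if __name__ == "__main__":
    print(rebuild_abstract(sample))
```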