Reinforcement Learning for Infinite-Dimensional Systems
2024 · Open Access · DOI: https://doi.org/10.48550/arxiv.2409.15737
Interest in reinforcement learning (RL) for large-scale systems, comprising extensive populations of intelligent agents interacting with heterogeneous environments, has surged significantly across diverse scientific domains in recent years. However, the large-scale nature of these systems often leads to high computational costs or reduced performance for most state-of-the-art RL techniques. To address these challenges, we propose a novel RL architecture and derive effective algorithms to learn optimal policies for arbitrarily large systems of agents. In our formulation, we model such systems as parameterized control systems defined on an infinite-dimensional function space. We then develop a moment kernel transform that maps the parameterized system and the value function into a reproducing kernel Hilbert space. This transformation generates a sequence of finite-dimensional moment representations for the RL problem, organized into a filtrated structure. Leveraging this RL filtration, we develop a hierarchical algorithm for learning optimal policies for the infinite-dimensional parameterized system. To enhance the algorithm's efficiency, we exploit early stopping at each hierarchy, demonstrating the fast convergence property of the algorithm through the construction of a convergent spectral sequence. The performance and efficiency of the proposed algorithm are validated using practical examples in engineering and quantum systems.
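
The abstract describes mapping a parameterized system to a sequence of finite-dimensional moment representations. This record does not give the transform itself, so the sketch below is only an illustration under stated assumptions: it uses Legendre polynomials on a hypothetical parameter interval [-1, 1], a made-up ensemble state x(β) = exp(β), and the illustrative helper names `legendre_moments` and `reconstruct`. It shows that truncating at a higher moment order gives a closer finite-dimensional approximation, which is the general idea behind a hierarchy of truncations, not the authors' algorithm.

```python
import numpy as np
from numpy.polynomial import legendre as L


def legendre_moments(x_of_beta, order, n_quad=64):
    """Moments m_k = integral over [-1, 1] of P_k(beta) * x(beta), k = 0..order.

    Assumption: Legendre polynomials stand in for the kernel basis;
    the paper's actual moment kernel transform may differ.
    """
    nodes, weights = L.leggauss(n_quad)      # Gauss-Legendre quadrature rule
    basis = L.legvander(nodes, order)        # shape (n_quad, order + 1)
    return basis.T @ (weights * x_of_beta(nodes))


def reconstruct(moments, beta):
    """Rebuild x(beta) from its truncated Legendre moments."""
    k = np.arange(len(moments))
    coeffs = (2 * k + 1) / 2.0 * moments     # from the norm of P_k on [-1, 1]
    return L.legval(beta, coeffs)


# Hypothetical ensemble state: one value per parameter beta.
state = np.exp
beta_grid = np.linspace(-1.0, 1.0, 201)

errors = {}
for order in (2, 4, 8):
    m = legendre_moments(state, order)
    approx = reconstruct(m, beta_grid)
    errors[order] = float(np.max(np.abs(approx - state(beta_grid))))

for order, err in errors.items():
    print(f"order {order}: max reconstruction error {err:.2e}")
```

Each truncation order yields a finite vector of moments, so a learning routine could in principle run on the low-order vector first and move to higher orders only if needed.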
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2409.15737, https://arxiv.org/pdf/2409.15737
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4403786626
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4403786626 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2409.15737 (Digital Object Identifier)
- Title: Reinforcement Learning for Infinite-Dimensional Systems (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2024 (year of publication)
- Publication date: 2024-09-24 (full publication date if available)
- Authors: Wei Zhang, Jr-Shin Li (list of authors in order)
- Landing page: https://arxiv.org/abs/2409.15737 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2409.15737 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2409.15737 (direct OA link when available)
- Concepts: Reinforcement, Reinforcement learning, Computer science, Artificial intelligence, Engineering, Structural engineering (top concepts (fields/topics) attached by OpenAlex)
- Cited by: 0 (total citation count in OpenAlex)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
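
The keys in the Full payload table (for example `topics[0].field.id` or `open_access.is_oa`) look like a nested JSON record flattened into dotted and bracketed paths. The page does not document that convention, so the following is a minimal sketch of one flattening rule that reproduces the key shapes seen in the table. The function name `flatten` and the small `record` dictionary are illustrative assumptions, with values copied from this record.

```python
def flatten(value, prefix=""):
    """Flatten nested dicts and lists into {path: string} rows.

    Assumed convention, inferred from the table's keys:
    - dict keys are joined with "."
    - lists of objects are indexed as [0], [1], ...
    - lists of scalars are joined with ", "
    - None becomes an empty cell
    """
    rows = {}
    if isinstance(value, dict):
        for key, child in value.items():
            child_prefix = f"{prefix}.{key}" if prefix else key
            rows.update(flatten(child, child_prefix))
    elif isinstance(value, list) and value and all(isinstance(v, dict) for v in value):
        for i, child in enumerate(value):
            rows.update(flatten(child, f"{prefix}[{i}]"))
    elif isinstance(value, list):
        rows[prefix] = ", ".join(str(v) for v in value)
    else:
        rows[prefix] = "" if value is None else str(value)
    return rows


# Small excerpt shaped like the record on this page.
record = {
    "id": "https://openalex.org/W4403786626",
    "fwci": None,
    "topics": [{"id": "https://openalex.org/T11245",
                "field": {"display_name": "Engineering"}}],
    "indexed_in": ["arxiv", "datacite"],
    "open_access": {"is_oa": True, "oa_status": "green"},
}

flat = flatten(record)
for key, cell in flat.items():
    print(f"| {key} | {cell} |")
```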
Full payload
| id | https://openalex.org/W4403786626 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2409.15737 |
| ids.doi | https://doi.org/10.48550/arxiv.2409.15737 |
| ids.openalex | https://openalex.org/W4403786626 |
| fwci | |
| type | preprint |
| title | Reinforcement Learning for Infinite-Dimensional Systems |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T11245 |
| topics[0].field.id | https://openalex.org/fields/22 |
| topics[0].field.display_name | Engineering |
| topics[0].score | 0.4577000141143799 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/2206 |
| topics[0].subfield.display_name | Computational Mechanics |
| topics[0].display_name | Advanced Numerical Analysis Techniques |
| topics[1].id | https://openalex.org/T11159 |
| topics[1].field.id | https://openalex.org/fields/22 |
| topics[1].field.display_name | Engineering |
| topics[1].score | 0.42340001463890076 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/2209 |
| topics[1].subfield.display_name | Industrial and Manufacturing Engineering |
| topics[1].display_name | Manufacturing Process and Optimization |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C67203356 |
| concepts[0].level | 2 |
| concepts[0].score | 0.7658798694610596 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q1321905 |
| concepts[0].display_name | Reinforcement |
| concepts[1].id | https://openalex.org/C97541855 |
| concepts[1].level | 2 |
| concepts[1].score | 0.5553407669067383 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q830687 |
| concepts[1].display_name | Reinforcement learning |
| concepts[2].id | https://openalex.org/C41008148 |
| concepts[2].level | 0 |
| concepts[2].score | 0.4286413788795471 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[2].display_name | Computer science |
| concepts[3].id | https://openalex.org/C154945302 |
| concepts[3].level | 1 |
| concepts[3].score | 0.24540790915489197 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[3].display_name | Artificial intelligence |
| concepts[4].id | https://openalex.org/C127413603 |
| concepts[4].level | 0 |
| concepts[4].score | 0.16391149163246155 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q11023 |
| concepts[4].display_name | Engineering |
| concepts[5].id | https://openalex.org/C66938386 |
| concepts[5].level | 1 |
| concepts[5].score | 0.14890232682228088 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q633538 |
| concepts[5].display_name | Structural engineering |
| keywords[0].id | https://openalex.org/keywords/reinforcement |
| keywords[0].score | 0.7658798694610596 |
| keywords[0].display_name | Reinforcement |
| keywords[1].id | https://openalex.org/keywords/reinforcement-learning |
| keywords[1].score | 0.5553407669067383 |
| keywords[1].display_name | Reinforcement learning |
| keywords[2].id | https://openalex.org/keywords/computer-science |
| keywords[2].score | 0.4286413788795471 |
| keywords[2].display_name | Computer science |
| keywords[3].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[3].score | 0.24540790915489197 |
| keywords[3].display_name | Artificial intelligence |
| keywords[4].id | https://openalex.org/keywords/engineering |
| keywords[4].score | 0.16391149163246155 |
| keywords[4].display_name | Engineering |
| keywords[5].id | https://openalex.org/keywords/structural-engineering |
| keywords[5].score | 0.14890232682228088 |
| keywords[5].display_name | Structural engineering |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2409.15737 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2409.15737 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2409.15737 |
| locations[1].id | doi:10.48550/arxiv.2409.15737 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2409.15737 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5100441694 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-7511-2870 |
| authorships[0].author.display_name | Wei Zhang |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Zhang, Wei |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5079314465 |
| authorships[1].author.orcid | https://orcid.org/0000-0001-6693-3979 |
| authorships[1].author.display_name | Jr-Shin Li |
| authorships[1].author_position | last |
| authorships[1].raw_author_name | Li, Jr-Shin |
| authorships[1].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2409.15737 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2024-10-26T00:00:00 |
| display_name | Reinforcement Learning for Infinite-Dimensional Systems |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T11245 |
| primary_topic.field.id | https://openalex.org/fields/22 |
| primary_topic.field.display_name | Engineering |
| primary_topic.score | 0.4577000141143799 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/2206 |
| primary_topic.subfield.display_name | Computational Mechanics |
| primary_topic.display_name | Advanced Numerical Analysis Techniques |
| related_works | https://openalex.org/W4391375266, https://openalex.org/W2899084033, https://openalex.org/W2748952813, https://openalex.org/W4310083477, https://openalex.org/W2328553770, https://openalex.org/W2920061524, https://openalex.org/W1977959518, https://openalex.org/W2038908348, https://openalex.org/W2107890255, https://openalex.org/W2106552856 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2409.15737 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2409.15737 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2409.15737 |
| primary_location.id | pmh:oai:arXiv.org:2409.15737 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2409.15737 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2409.15737 |
| publication_date | 2024-09-24 |
| publication_year | 2024 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 55, 93, 107, 115, 127, 136, 172 |
| abstract_inverted_index.In | 73 |
| abstract_inverted_index.RL | 47, 57, 123, 132 |
| abstract_inverted_index.To | 49, 148 |
| abstract_inverted_index.We | 90 |
| abstract_inverted_index.an | 86 |
| abstract_inverted_index.as | 80 |
| abstract_inverted_index.at | 157 |
| abstract_inverted_index.in | 1, 25, 189 |
| abstract_inverted_index.of | 11, 32, 71, 117, 165, 171, 180 |
| abstract_inverted_index.on | 85 |
| abstract_inverted_index.or | 41 |
| abstract_inverted_index.to | 37, 63 |
| abstract_inverted_index.we | 53, 76, 134, 153 |
| abstract_inverted_index.The | 176 |
| abstract_inverted_index.and | 59, 102, 178, 191 |
| abstract_inverted_index.are | 184 |
| abstract_inverted_index.for | 5, 44, 67, 121, 139, 143 |
| abstract_inverted_index.has | 18 |
| abstract_inverted_index.our | 74 |
| abstract_inverted_index.the | 29, 99, 103, 122, 144, 150, 161, 166, 169, 181 |
| abstract_inverted_index.(RL) | 4 |
| abstract_inverted_index.This | 112 |
| abstract_inverted_index.each | 158 |
| abstract_inverted_index.fast | 162 |
| abstract_inverted_index.high | 38 |
| abstract_inverted_index.into | 106, 126 |
| abstract_inverted_index.maps | 98 |
| abstract_inverted_index.most | 45 |
| abstract_inverted_index.such | 78 |
| abstract_inverted_index.that | 97 |
| abstract_inverted_index.then | 91 |
| abstract_inverted_index.this | 131 |
| abstract_inverted_index.with | 15 |
| abstract_inverted_index.costs | 40 |
| abstract_inverted_index.early | 155 |
| abstract_inverted_index.large | 69 |
| abstract_inverted_index.leads | 36 |
| abstract_inverted_index.learn | 64 |
| abstract_inverted_index.model | 77 |
| abstract_inverted_index.novel | 56 |
| abstract_inverted_index.often | 35 |
| abstract_inverted_index.these | 33, 51 |
| abstract_inverted_index.using | 186 |
| abstract_inverted_index.value | 104 |
| abstract_inverted_index.across | 21 |
| abstract_inverted_index.agents | 13 |
| abstract_inverted_index.derive | 60 |
| abstract_inverted_index.kernel | 95, 109 |
| abstract_inverted_index.moment | 94, 119 |
| abstract_inverted_index.nature | 31 |
| abstract_inverted_index.recent | 26 |
| abstract_inverted_index.space. | 89, 111 |
| abstract_inverted_index.surged | 19 |
| abstract_inverted_index.system | 101 |
| abstract_inverted_index.years. | 27 |
| abstract_inverted_index.Hilbert | 110 |
| abstract_inverted_index.address | 50 |
| abstract_inverted_index.agents. | 72 |
| abstract_inverted_index.control | 82 |
| abstract_inverted_index.defined | 84 |
| abstract_inverted_index.develop | 92, 135 |
| abstract_inverted_index.diverse | 22 |
| abstract_inverted_index.domains | 24 |
| abstract_inverted_index.enhance | 149 |
| abstract_inverted_index.exploit | 154 |
| abstract_inverted_index.optimal | 65, 141 |
| abstract_inverted_index.propose | 54 |
| abstract_inverted_index.quantum | 192 |
| abstract_inverted_index.reduced | 42 |
| abstract_inverted_index.system. | 147 |
| abstract_inverted_index.systems | 34, 70, 79, 83 |
| abstract_inverted_index.through | 168 |
| abstract_inverted_index.However, | 28 |
| abstract_inverted_index.Interest | 0 |
| abstract_inverted_index.examples | 188 |
| abstract_inverted_index.function | 88, 105 |
| abstract_inverted_index.learning | 3, 140 |
| abstract_inverted_index.policies | 66, 142 |
| abstract_inverted_index.problem, | 124 |
| abstract_inverted_index.property | 164 |
| abstract_inverted_index.proposed | 182 |
| abstract_inverted_index.sequence | 116 |
| abstract_inverted_index.spectral | 174 |
| abstract_inverted_index.stopping | 156 |
| abstract_inverted_index.systems, | 7 |
| abstract_inverted_index.systems. | 193 |
| abstract_inverted_index.algorithm | 138, 167, 183 |
| abstract_inverted_index.effective | 61 |
| abstract_inverted_index.extensive | 9 |
| abstract_inverted_index.filtrated | 128 |
| abstract_inverted_index.generates | 114 |
| abstract_inverted_index.organized | 125 |
| abstract_inverted_index.practical | 187 |
| abstract_inverted_index.sequence. | 175 |
| abstract_inverted_index.transform | 96 |
| abstract_inverted_index.validated | 185 |
| abstract_inverted_index.Leveraging | 130 |
| abstract_inverted_index.algorithms | 62 |
| abstract_inverted_index.comprising | 8 |
| abstract_inverted_index.convergent | 173 |
| abstract_inverted_index.efficiency | 179 |
| abstract_inverted_index.hierarchy, | 159 |
| abstract_inverted_index.scientific | 23 |
| abstract_inverted_index.structure. | 129 |
| abstract_inverted_index.algorithm's | 151 |
| abstract_inverted_index.arbitrarily | 68 |
| abstract_inverted_index.challenges, | 52 |
| abstract_inverted_index.convergence | 163 |
| abstract_inverted_index.efficiency, | 152 |
| abstract_inverted_index.engineering | 190 |
| abstract_inverted_index.filtration, | 133 |
| abstract_inverted_index.intelligent | 12 |
| abstract_inverted_index.interacting | 14 |
| abstract_inverted_index.large-scale | 6, 30 |
| abstract_inverted_index.performance | 43, 177 |
| abstract_inverted_index.populations | 10 |
| abstract_inverted_index.reproducing | 108 |
| abstract_inverted_index.techniques. | 48 |
| abstract_inverted_index.architecture | 58 |
| abstract_inverted_index.construction | 170 |
| abstract_inverted_index.formulation, | 75 |
| abstract_inverted_index.hierarchical | 137 |
| abstract_inverted_index.computational | 39 |
| abstract_inverted_index.demonstrating | 160 |
| abstract_inverted_index.environments, | 17 |
| abstract_inverted_index.heterogeneous | 16 |
| abstract_inverted_index.parameterized | 81, 100, 146 |
| abstract_inverted_index.reinforcement | 2 |
| abstract_inverted_index.significantly | 20 |
| abstract_inverted_index.transformation | 113 |
| abstract_inverted_index.representations | 120 |
| abstract_inverted_index.state-of-the-art | 46 |
| abstract_inverted_index.finite-dimensional | 118 |
| abstract_inverted_index.infinite-dimensional | 87, 145 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 2 |
| citation_normalized_percentile | |
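
The `abstract_inverted_index.*` rows above store the abstract as a map from each word to the positions where it occurs. As a sketch of how the readable text can be recovered, the snippet below inverts that map and joins the words in position order. The helper name `rebuild_abstract` is illustrative, and `sample` holds only the first eight positions taken from the table, not the full index.

```python
def rebuild_abstract(inverted_index):
    """Turn {word: [positions]} back into a space-separated string."""
    words_by_position = {}
    for word, positions in inverted_index.items():
        for position in positions:
            words_by_position[position] = word
    # Emit words in ascending position order.
    return " ".join(words_by_position[p] for p in sorted(words_by_position))


# Excerpt: positions 0 to 7 only, copied from the rows above.
sample = {
    "Interest": [0],
    "in": [1],
    "reinforcement": [2],
    "learning": [3],
    "(RL)": [4],
    "for": [5],
    "large-scale": [6],
    "systems,": [7],
}

print(rebuild_abstract(sample))
```

Applied to the complete index (positions 0 through 193), the same function would yield the abstract text shown at the top of this record.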