Reservoir Computing for Fast, Simplified Reinforcement Learning on Memory Tasks
Tasks in which rewards depend upon past information not available in the current observation set can only be solved by agents that are equipped with short-term memory. Usual choices for memory modules include trainable recurrent hidden layers, often with gated memory. Reservoir computing presents an alternative, in which a recurrent layer is not trained, but rather has a set of fixed, sparse recurrent weights. The weights are scaled to produce stable dynamical behavior such that the reservoir state contains a high-dimensional, nonlinear impulse response function of the inputs. An output decoder network can then be used to map the compressive history represented by the reservoir's state to any outputs, including agent actions or predictions. In this study, we find that reservoir computing greatly simplifies and speeds up reinforcement learning on memory tasks by (1) eliminating the need for backpropagation of gradients through time, (2) presenting all recent history simultaneously to the downstream network, and (3) performing many useful and generic nonlinear computations upstream from the trained modules. In particular, these findings offer significant benefit to meta-learning that depends primarily on efficient and highly general memory systems.
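The mechanism described in the abstract can be illustrated with a minimal echo state network sketch in NumPy. This is not the paper's implementation: the function names (`make_reservoir`, `run_reservoir`, `fit_readout`), the reservoir size, density, spectral radius of 0.9, and the `tanh` update are all assumptions chosen for illustration. The trained part here is a linear ridge-regression readout on a supervised delayed-recall task, swapped in for the decoder network and reinforcement learning setup that the abstract refers to, simply to show that only the readout needs fitting and no gradients pass through time.

```python
import numpy as np


def make_reservoir(n_inputs, n_units, density=0.1, spectral_radius=0.9, seed=0):
    """Build fixed (untrained) input weights and sparse recurrent weights."""
    rng = np.random.default_rng(seed)
    # Sparse recurrent matrix: Gaussian weights, most entries masked to zero.
    w_rec = rng.normal(size=(n_units, n_units))
    w_rec *= rng.random((n_units, n_units)) < density
    # Rescale so the largest eigenvalue magnitude equals spectral_radius,
    # a common heuristic for stable, fading-memory dynamics.
    radius = np.max(np.abs(np.linalg.eigvals(w_rec)))
    w_rec *= spectral_radius / radius
    w_in = rng.uniform(-1.0, 1.0, size=(n_units, n_inputs))
    return w_in, w_rec


def run_reservoir(w_in, w_rec, inputs):
    """Drive the reservoir with an input sequence; return all states."""
    state = np.zeros(w_rec.shape[0])
    states = []
    for u in inputs:
        # Nonlinear recurrent update; the weights are never modified.
        state = np.tanh(w_in @ u + w_rec @ state)
        states.append(state)
    return np.stack(states)


def fit_readout(states, targets, ridge=1e-3):
    """Fit a linear readout by ridge regression (the only trained part)."""
    gram = states.T @ states + ridge * np.eye(states.shape[1])
    return np.linalg.solve(gram, states.T @ targets)


# Hypothetical memory task: recover the input seen `delay` steps earlier
# from the current reservoir state alone.
rng = np.random.default_rng(1)
inputs = rng.uniform(-1.0, 1.0, size=(500, 1))
w_in, w_rec = make_reservoir(n_inputs=1, n_units=100)
states = run_reservoir(w_in, w_rec, inputs)

delay = 3
readout = fit_readout(states[delay:], inputs[:-delay])
recalled = states[delay:] @ readout
```

Because the reservoir state at each step already mixes recent inputs, the readout sees a summary of recent history in a single vector. In a reinforcement learning setting, the same state vector would be passed to a policy or value network in place of the linear readout used here.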
Related Topics
- Type
- preprint
- Language
- en
- Landing Page
- http://arxiv.org/abs/2412.13093
- https://arxiv.org/pdf/2412.13093
- OA Status
- green
- Related Works
- 10
- OpenAlex ID
- https://openalex.org/W4405562708
Raw OpenAlex JSON
- OpenAlex ID
- https://openalex.org/W4405562708
- DOI
- https://doi.org/10.48550/arxiv.2412.13093
- Title
- Reservoir Computing for Fast, Simplified Reinforcement Learning on Memory Tasks
- Type
- preprint
- Language
- en
- Publication year
- 2024
- Publication date
- 2024-12-17
- Authors
- Kevin R. McKee
- Landing page
- https://arxiv.org/abs/2412.13093
- PDF URL
- https://arxiv.org/pdf/2412.13093
- Open access
- Yes
- OA status
- green
- OA URL
- https://arxiv.org/pdf/2412.13093
- Concepts
- Reinforcement learning, Reservoir computing, Computer science, Reinforcement, Parallel computing, Artificial intelligence, Distributed computing, Engineering, Structural engineering, Artificial neural network, Recurrent neural network
- Cited by
- 0
- Related works (count)
- 10
Full payload
| id | https://openalex.org/W4405562708 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2412.13093 |
| ids.doi | https://doi.org/10.48550/arxiv.2412.13093 |
| ids.openalex | https://openalex.org/W4405562708 |
| fwci | |
| type | preprint |
| title | Reservoir Computing for Fast, Simplified Reinforcement Learning on Memory Tasks |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T12611 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9991000294685364 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Neural Networks and Reservoir Computing |
| topics[1].id | https://openalex.org/T10502 |
| topics[1].field.id | https://openalex.org/fields/22 |
| topics[1].field.display_name | Engineering |
| topics[1].score | 0.9972000122070312 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/2208 |
| topics[1].subfield.display_name | Electrical and Electronic Engineering |
| topics[1].display_name | Advanced Memory and Neural Computing |
| topics[2].id | https://openalex.org/T12676 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.992900013923645 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1702 |
| topics[2].subfield.display_name | Artificial Intelligence |
| topics[2].display_name | Machine Learning and ELM |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C97541855 |
| concepts[0].level | 2 |
| concepts[0].score | 0.7910301685333252 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q830687 |
| concepts[0].display_name | Reinforcement learning |
| concepts[1].id | https://openalex.org/C135796866 |
| concepts[1].level | 4 |
| concepts[1].score | 0.6933387517929077 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q7315328 |
| concepts[1].display_name | Reservoir computing |
| concepts[2].id | https://openalex.org/C41008148 |
| concepts[2].level | 0 |
| concepts[2].score | 0.6720191836357117 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[2].display_name | Computer science |
| concepts[3].id | https://openalex.org/C67203356 |
| concepts[3].level | 2 |
| concepts[3].score | 0.47370144724845886 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q1321905 |
| concepts[3].display_name | Reinforcement |
| concepts[4].id | https://openalex.org/C173608175 |
| concepts[4].level | 1 |
| concepts[4].score | 0.39286988973617554 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q232661 |
| concepts[4].display_name | Parallel computing |
| concepts[5].id | https://openalex.org/C154945302 |
| concepts[5].level | 1 |
| concepts[5].score | 0.34061798453330994 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[5].display_name | Artificial intelligence |
| concepts[6].id | https://openalex.org/C120314980 |
| concepts[6].level | 1 |
| concepts[6].score | 0.3388291895389557 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q180634 |
| concepts[6].display_name | Distributed computing |
| concepts[7].id | https://openalex.org/C127413603 |
| concepts[7].level | 0 |
| concepts[7].score | 0.11495661735534668 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q11023 |
| concepts[7].display_name | Engineering |
| concepts[8].id | https://openalex.org/C66938386 |
| concepts[8].level | 1 |
| concepts[8].score | 0.05965617299079895 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q633538 |
| concepts[8].display_name | Structural engineering |
| concepts[9].id | https://openalex.org/C50644808 |
| concepts[9].level | 2 |
| concepts[9].score | 0.056541621685028076 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q192776 |
| concepts[9].display_name | Artificial neural network |
| concepts[10].id | https://openalex.org/C147168706 |
| concepts[10].level | 3 |
| concepts[10].score | 0.0 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q1457734 |
| concepts[10].display_name | Recurrent neural network |
| keywords[0].id | https://openalex.org/keywords/reinforcement-learning |
| keywords[0].score | 0.7910301685333252 |
| keywords[0].display_name | Reinforcement learning |
| keywords[1].id | https://openalex.org/keywords/reservoir-computing |
| keywords[1].score | 0.6933387517929077 |
| keywords[1].display_name | Reservoir computing |
| keywords[2].id | https://openalex.org/keywords/computer-science |
| keywords[2].score | 0.6720191836357117 |
| keywords[2].display_name | Computer science |
| keywords[3].id | https://openalex.org/keywords/reinforcement |
| keywords[3].score | 0.47370144724845886 |
| keywords[3].display_name | Reinforcement |
| keywords[4].id | https://openalex.org/keywords/parallel-computing |
| keywords[4].score | 0.39286988973617554 |
| keywords[4].display_name | Parallel computing |
| keywords[5].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[5].score | 0.34061798453330994 |
| keywords[5].display_name | Artificial intelligence |
| keywords[6].id | https://openalex.org/keywords/distributed-computing |
| keywords[6].score | 0.3388291895389557 |
| keywords[6].display_name | Distributed computing |
| keywords[7].id | https://openalex.org/keywords/engineering |
| keywords[7].score | 0.11495661735534668 |
| keywords[7].display_name | Engineering |
| keywords[8].id | https://openalex.org/keywords/structural-engineering |
| keywords[8].score | 0.05965617299079895 |
| keywords[8].display_name | Structural engineering |
| keywords[9].id | https://openalex.org/keywords/artificial-neural-network |
| keywords[9].score | 0.056541621685028076 |
| keywords[9].display_name | Artificial neural network |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2412.13093 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2412.13093 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2412.13093 |
| locations[1].id | doi:10.48550/arxiv.2412.13093 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2412.13093 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5013841168 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-4412-1686 |
| authorships[0].author.display_name | Kevin R. McKee |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | McKee, Kevin |
| authorships[0].is_corresponding | True |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2412.13093 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Reservoir Computing for Fast, Simplified Reinforcement Learning on Memory Tasks |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T12611 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9991000294685364 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Neural Networks and Reservoir Computing |
| related_works | https://openalex.org/W3192662224, https://openalex.org/W4389072666, https://openalex.org/W2887258823, https://openalex.org/W4300888463, https://openalex.org/W4226454644, https://openalex.org/W2998821156, https://openalex.org/W2949388105, https://openalex.org/W3211266228, https://openalex.org/W4285341284, https://openalex.org/W4220815069 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2412.13093 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2412.13093 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2412.13093 |
| primary_location.id | pmh:oai:arXiv.org:2412.13093 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2412.13093 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2412.13093 |
| publication_date | 2024-12-17 |
| publication_year | 2024 |
| referenced_works_count | 0 |
| cited_by_percentile_year | |
| corresponding_author_ids | https://openalex.org/A5013841168 |
| countries_distinct_count | 0 |
| institutions_distinct_count | 1 |
| citation_normalized_percentile | |