Parallel bandit architecture based on laser chaos for reinforcement learning

Takashi Urushibara, Nicolas Chauvet, Satoshi Kochi, Satoshi Sunada, Kazutaka Kanno, Atsushi Uchida, Ryoichi Horisaki, Makoto Naruse

2022 · Open Access · DOI: https://doi.org/10.48550/arxiv.2205.09543

Accelerating artificial intelligence with photonics is an active field of study that aims to exploit the unique properties of photons. Reinforcement learning is an important branch of machine learning, and photonic decision-making principles have been demonstrated for multi-armed bandit problems. However, reinforcement learning can involve a massive number of states, unlike previously demonstrated bandit problems, which have only one state. Q-learning is a well-known reinforcement learning approach that can deal with many states. The architecture of Q-learning, however, is not well suited to photonic implementations because it separates the update rule from action selection. In this study, we organize a new architecture for multi-state reinforcement learning as a parallel array of bandit problems in order to benefit from photonic decision-makers; we call it the parallel bandit architecture for reinforcement learning, or PBRL for short. Taking a cart-pole balancing problem as an example, we demonstrate that PBRL adapts to the environment in fewer time steps than Q-learning. Furthermore, PBRL adapts faster when operated with a chaotic laser time series than with uniformly distributed pseudorandom numbers; the autocorrelation inherent in laser chaos has a positive effect. We also find that the variety of states that the system visits during the learning phase differs completely between PBRL and Q-learning. The insights obtained in this study are also beneficial for existing computing platforms, not just photonic realizations, in accelerating performance through the PBRL algorithms and correlated random sequences.
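
The idea of treating a multi-state task as a parallel array of bandit problems can be illustrated with a small sketch. This is not the authors' PBRL implementation: the toy environment, the epsilon-greedy rule, the sample-average update, and all names (`EpsilonGreedyBandit`, `run_parallel_bandits`) are assumptions made for illustration, and a standard pseudorandom generator stands in for the laser-chaos time series. Each state simply owns an independent bandit that selects an action and updates its own estimates.

```python
import random


class EpsilonGreedyBandit:
    """k-armed bandit with sample-average value estimates (illustrative only)."""

    def __init__(self, n_arms, epsilon, rng):
        self.counts = [0] * n_arms
        self.values = [0.0] * n_arms
        self.epsilon = epsilon
        self.rng = rng

    def select(self):
        # Explore with probability epsilon, otherwise pick the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.values))
        return max(range(len(self.values)), key=lambda a: self.values[a])

    def update(self, arm, reward):
        # Incremental sample-average update of the chosen arm's estimate.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


def run_parallel_bandits(n_states=4, n_arms=2, steps=4000, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    # One independent bandit per state: the "parallel array".
    bandits = [EpsilonGreedyBandit(n_arms, epsilon, rng) for _ in range(n_states)]
    for _ in range(steps):
        state = rng.randrange(n_states)  # hypothetical environment: random state
        action = bandits[state].select()
        # Hypothetical reward rule: arm (state % n_arms) pays off with p = 0.9,
        # every other arm with p = 0.1.
        p_reward = 0.9 if action == state % n_arms else 0.1
        reward = 1.0 if rng.random() < p_reward else 0.0
        bandits[state].update(action, reward)
    # Greedy action learned by each state's bandit.
    return [max(range(n_arms), key=lambda a: b.values[a]) for b in bandits]


greedy_actions = run_parallel_bandits()
```

In this sketch, action selection and value update live in the same per-state unit, which is the structural contrast with Q-learning that the abstract points to. A cart-pole task would additionally need a discretization of the continuous state, which is omitted here.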

Related Topics

Reinforcement Learning

Computer Science

Photonics

Multi-Armed Bandit

Temporal Difference Learning

Artificial Intelligence

Concepts

Reinforcement learning, Computer science, Photonics, Multi-armed bandit, Temporal difference learning, Artificial intelligence, Chaotic, Exploit, Machine learning, Physics, Computer security, Regret, Optics

Metadata

Type: preprint
Language: en
Landing Page: http://arxiv.org/abs/2205.09543
PDF: https://arxiv.org/pdf/2205.09543
OA Status: green
Related Works: 10
OpenAlex ID: https://openalex.org/W4307900468

All OpenAlex metadata

OpenAlex ID: https://openalex.org/W4307900468
Canonical identifier for this work in OpenAlex

DOI: https://doi.org/10.48550/arxiv.2205.09543
Digital Object Identifier

Title: Parallel bandit architecture based on laser chaos for reinforcement learning
Work title

Type: preprint
OpenAlex work type

Language: en
Primary language

Publication year: 2022
Year of publication

Publication date: 2022-05-19
Full publication date if available

Authors: Takashi Urushibara, Nicolas Chauvet, Satoshi Kochi, Satoshi Sunada, Kazutaka Kanno, Atsushi Uchida, Ryoichi Horisaki, Makoto Naruse
List of authors in order

Landing page: https://arxiv.org/abs/2205.09543
Publisher landing page

PDF URL: https://arxiv.org/pdf/2205.09543
Direct link to full text PDF

Open access: Yes
Whether a free full text is available

OA status: green
Open access status per OpenAlex

OA URL: https://arxiv.org/pdf/2205.09543
Direct OA link when available

Concepts: Reinforcement learning, Computer science, Photonics, Multi-armed bandit, Temporal difference learning, Artificial intelligence, Chaotic, Exploit, Machine learning, Physics, Computer security, Regret, Optics
Top concepts (fields/topics) attached by OpenAlex

Cited by: 0
Total citation count in OpenAlex

Related works (count): 10
Other works algorithmically related by OpenAlex

Full payload

id	https://openalex.org/W4307900468
doi	https://doi.org/10.48550/arxiv.2205.09543
ids.doi	https://doi.org/10.48550/arxiv.2205.09543
ids.openalex	https://openalex.org/W4307900468
fwci
type	preprint
title	Parallel bandit architecture based on laser chaos for reinforcement learning
biblio.issue
biblio.volume
biblio.last_page
biblio.first_page
topics[0].id	https://openalex.org/T12611
topics[0].field.id	https://openalex.org/fields/17
topics[0].field.display_name	Computer Science
topics[0].score	0.9990000128746033
topics[0].domain.id	https://openalex.org/domains/3
topics[0].domain.display_name	Physical Sciences
topics[0].subfield.id	https://openalex.org/subfields/1702
topics[0].subfield.display_name	Artificial Intelligence
topics[0].display_name	Neural Networks and Reservoir Computing
topics[1].id	https://openalex.org/T10423
topics[1].field.id	https://openalex.org/fields/28
topics[1].field.display_name	Neuroscience
topics[1].score	0.9284999966621399
topics[1].domain.id	https://openalex.org/domains/1
topics[1].domain.display_name	Life Sciences
topics[1].subfield.id	https://openalex.org/subfields/2804
topics[1].subfield.display_name	Cellular and Molecular Neuroscience
topics[1].display_name	Neurobiology and Insect Physiology Research
topics[2].id	https://openalex.org/T10232
topics[2].field.id	https://openalex.org/fields/22
topics[2].field.display_name	Engineering
topics[2].score	0.9271000027656555
topics[2].domain.id	https://openalex.org/domains/3
topics[2].domain.display_name	Physical Sciences
topics[2].subfield.id	https://openalex.org/subfields/2208
topics[2].subfield.display_name	Electrical and Electronic Engineering
topics[2].display_name	Optical Network Technologies
is_xpac	False
apc_list
apc_paid
concepts[0].id	https://openalex.org/C97541855
concepts[0].level	2
concepts[0].score	0.9186882376670837
concepts[0].wikidata	https://www.wikidata.org/wiki/Q830687
concepts[0].display_name	Reinforcement learning
concepts[1].id	https://openalex.org/C41008148
concepts[1].level	0
concepts[1].score	0.6675879955291748
concepts[1].wikidata	https://www.wikidata.org/wiki/Q21198
concepts[1].display_name	Computer science
concepts[2].id	https://openalex.org/C20788544
concepts[2].level	2
concepts[2].score	0.6469535231590271
concepts[2].wikidata	https://www.wikidata.org/wiki/Q467054
concepts[2].display_name	Photonics
concepts[3].id	https://openalex.org/C123197309
concepts[3].level	3
concepts[3].score	0.5473476052284241
concepts[3].wikidata	https://www.wikidata.org/wiki/Q2882343
concepts[3].display_name	Multi-armed bandit
concepts[4].id	https://openalex.org/C196340769
concepts[4].level	3
concepts[4].score	0.5069089531898499
concepts[4].wikidata	https://www.wikidata.org/wiki/Q7698910
concepts[4].display_name	Temporal difference learning
concepts[5].id	https://openalex.org/C154945302
concepts[5].level	1
concepts[5].score	0.4663239121437073
concepts[5].wikidata	https://www.wikidata.org/wiki/Q11660
concepts[5].display_name	Artificial intelligence
concepts[6].id	https://openalex.org/C2777052490
concepts[6].level	2
concepts[6].score	0.45985016226768494
concepts[6].wikidata	https://www.wikidata.org/wiki/Q5072826
concepts[6].display_name	Chaotic
concepts[7].id	https://openalex.org/C165696696
concepts[7].level	2
concepts[7].score	0.4251117408275604
concepts[7].wikidata	https://www.wikidata.org/wiki/Q11287
concepts[7].display_name	Exploit
concepts[8].id	https://openalex.org/C119857082
concepts[8].level	1
concepts[8].score	0.2262095808982849
concepts[8].wikidata	https://www.wikidata.org/wiki/Q2539
concepts[8].display_name	Machine learning
concepts[9].id	https://openalex.org/C121332964
concepts[9].level	0
concepts[9].score	0.10833033919334412
concepts[9].wikidata	https://www.wikidata.org/wiki/Q413
concepts[9].display_name	Physics
concepts[10].id	https://openalex.org/C38652104
concepts[10].level	1
concepts[10].score	0.0
concepts[10].wikidata	https://www.wikidata.org/wiki/Q3510521
concepts[10].display_name	Computer security
concepts[11].id	https://openalex.org/C50817715
concepts[11].level	2
concepts[11].score	0.0
concepts[11].wikidata	https://www.wikidata.org/wiki/Q79895177
concepts[11].display_name	Regret
concepts[12].id	https://openalex.org/C120665830
concepts[12].level	1
concepts[12].score	0.0
concepts[12].wikidata	https://www.wikidata.org/wiki/Q14620
concepts[12].display_name	Optics
keywords[0].id	https://openalex.org/keywords/reinforcement-learning
keywords[0].score	0.9186882376670837
keywords[0].display_name	Reinforcement learning
keywords[1].id	https://openalex.org/keywords/computer-science
keywords[1].score	0.6675879955291748
keywords[1].display_name	Computer science
keywords[2].id	https://openalex.org/keywords/photonics
keywords[2].score	0.6469535231590271
keywords[2].display_name	Photonics
keywords[3].id	https://openalex.org/keywords/multi-armed-bandit
keywords[3].score	0.5473476052284241
keywords[3].display_name	Multi-armed bandit
keywords[4].id	https://openalex.org/keywords/temporal-difference-learning
keywords[4].score	0.5069089531898499
keywords[4].display_name	Temporal difference learning
keywords[5].id	https://openalex.org/keywords/artificial-intelligence
keywords[5].score	0.4663239121437073
keywords[5].display_name	Artificial intelligence
keywords[6].id	https://openalex.org/keywords/chaotic
keywords[6].score	0.45985016226768494
keywords[6].display_name	Chaotic
keywords[7].id	https://openalex.org/keywords/exploit
keywords[7].score	0.4251117408275604
keywords[7].display_name	Exploit
keywords[8].id	https://openalex.org/keywords/machine-learning
keywords[8].score	0.2262095808982849
keywords[8].display_name	Machine learning
keywords[9].id	https://openalex.org/keywords/physics
keywords[9].score	0.10833033919334412
keywords[9].display_name	Physics
language	en
locations[0].id	pmh:oai:arXiv.org:2205.09543
locations[0].is_oa	True
locations[0].source.id	https://openalex.org/S4306400194
locations[0].source.issn
locations[0].source.type	repository
locations[0].source.is_oa	True
locations[0].source.issn_l
locations[0].source.is_core	False
locations[0].source.is_in_doaj	False
locations[0].source.display_name	arXiv (Cornell University)
locations[0].source.host_organization	https://openalex.org/I205783295
locations[0].source.host_organization_name	Cornell University
locations[0].source.host_organization_lineage	https://openalex.org/I205783295
locations[0].license
locations[0].pdf_url	https://arxiv.org/pdf/2205.09543
locations[0].version	submittedVersion
locations[0].raw_type	text
locations[0].license_id
locations[0].is_accepted	False
locations[0].is_published	False
locations[0].raw_source_name
locations[0].landing_page_url	http://arxiv.org/abs/2205.09543
locations[1].id	doi:10.48550/arxiv.2205.09543
locations[1].is_oa	True
locations[1].source.id	https://openalex.org/S4306400194
locations[1].source.issn
locations[1].source.type	repository
locations[1].source.is_oa	True
locations[1].source.issn_l
locations[1].source.is_core	False
locations[1].source.is_in_doaj	False
locations[1].source.display_name	arXiv (Cornell University)
locations[1].source.host_organization	https://openalex.org/I205783295
locations[1].source.host_organization_name	Cornell University
locations[1].source.host_organization_lineage	https://openalex.org/I205783295
locations[1].license
locations[1].pdf_url
locations[1].version
locations[1].raw_type	article
locations[1].license_id
locations[1].is_accepted	False
locations[1].is_published
locations[1].raw_source_name
locations[1].landing_page_url	https://doi.org/10.48550/arxiv.2205.09543
indexed_in	arxiv, datacite
authorships[0].author.id	https://openalex.org/A5081541078
authorships[0].author.orcid
authorships[0].author.display_name	Takashi Urushibara
authorships[0].author_position	first
authorships[0].raw_author_name	Urushibara, Takashi
authorships[0].is_corresponding	False
authorships[1].author.id	https://openalex.org/A5062445268
authorships[1].author.orcid	https://orcid.org/0000-0002-6504-1730
authorships[1].author.display_name	Nicolas Chauvet
authorships[1].author_position	middle
authorships[1].raw_author_name	Chauvet, Nicolas
authorships[1].is_corresponding	False
authorships[2].author.id	https://openalex.org/A5084998369
authorships[2].author.orcid
authorships[2].author.display_name	Satoshi Kochi
authorships[2].author_position	middle
authorships[2].raw_author_name	Kochi, Satoshi
authorships[2].is_corresponding	False
authorships[3].author.id	https://openalex.org/A5044971688
authorships[3].author.orcid	https://orcid.org/0000-0003-0466-8529
authorships[3].author.display_name	Satoshi Sunada
authorships[3].author_position	middle
authorships[3].raw_author_name	Sunada, Satoshi
authorships[3].is_corresponding	False
authorships[4].author.id	https://openalex.org/A5005161622
authorships[4].author.orcid	https://orcid.org/0000-0002-2982-4308
authorships[4].author.display_name	Kazutaka Kanno
authorships[4].author_position	middle
authorships[4].raw_author_name	Kanno, Kazutaka
authorships[4].is_corresponding	False
authorships[5].author.id	https://openalex.org/A5004119695
authorships[5].author.orcid	https://orcid.org/0000-0002-4654-8616
authorships[5].author.display_name	Atsushi Uchida
authorships[5].author_position	middle
authorships[5].raw_author_name	Uchida, Atsushi
authorships[5].is_corresponding	False
authorships[6].author.id	https://openalex.org/A5006017403
authorships[6].author.orcid	https://orcid.org/0000-0002-2280-5921
authorships[6].author.display_name	Ryoichi Horisaki
authorships[6].author_position	middle
authorships[6].raw_author_name	Horisaki, Ryoichi
authorships[6].is_corresponding	False
authorships[7].author.id	https://openalex.org/A5002483949
authorships[7].author.orcid	https://orcid.org/0000-0001-8982-9824
authorships[7].author.display_name	Makoto Naruse
authorships[7].author_position	last
authorships[7].raw_author_name	Naruse, Makoto
authorships[7].is_corresponding	False
has_content.pdf	False
has_content.grobid_xml	False
is_paratext	False
open_access.is_oa	True
open_access.oa_url	https://arxiv.org/pdf/2205.09543
open_access.oa_status	green
open_access.any_repository_has_fulltext	False
created_date	2022-11-06T00:00:00
display_name	Parallel bandit architecture based on laser chaos for reinforcement learning
has_fulltext	False
is_retracted	False
updated_date	2025-11-06T06:51:31.235846
primary_topic.id	https://openalex.org/T12611
primary_topic.field.id	https://openalex.org/fields/17
primary_topic.field.display_name	Computer Science
primary_topic.score	0.9990000128746033
primary_topic.domain.id	https://openalex.org/domains/3
primary_topic.domain.display_name	Physical Sciences
primary_topic.subfield.id	https://openalex.org/subfields/1702
primary_topic.subfield.display_name	Artificial Intelligence
primary_topic.display_name	Neural Networks and Reservoir Computing
related_works	https://openalex.org/W2145363145, https://openalex.org/W2341346307, https://openalex.org/W2154399718, https://openalex.org/W4321463377, https://openalex.org/W2768629321, https://openalex.org/W1914583973, https://openalex.org/W2130711276, https://openalex.org/W4308828368, https://openalex.org/W1528400370, https://openalex.org/W2152445738
cited_by_count	0
locations_count	2
best_oa_location.id	pmh:oai:arXiv.org:2205.09543
best_oa_location.is_oa	True
best_oa_location.source.id	https://openalex.org/S4306400194
best_oa_location.source.issn
best_oa_location.source.type	repository
best_oa_location.source.is_oa	True
best_oa_location.source.issn_l
best_oa_location.source.is_core	False
best_oa_location.source.is_in_doaj	False
best_oa_location.source.display_name	arXiv (Cornell University)
best_oa_location.source.host_organization	https://openalex.org/I205783295
best_oa_location.source.host_organization_name	Cornell University
best_oa_location.source.host_organization_lineage	https://openalex.org/I205783295
best_oa_location.license
best_oa_location.pdf_url	https://arxiv.org/pdf/2205.09543
best_oa_location.version	submittedVersion
best_oa_location.raw_type	text
best_oa_location.license_id
best_oa_location.is_accepted	False
best_oa_location.is_published	False
best_oa_location.raw_source_name
best_oa_location.landing_page_url	http://arxiv.org/abs/2205.09543
primary_location.id	pmh:oai:arXiv.org:2205.09543
primary_location.is_oa	True
primary_location.source.id	https://openalex.org/S4306400194
primary_location.source.issn
primary_location.source.type	repository
primary_location.source.is_oa	True
primary_location.source.issn_l
primary_location.source.is_core	False
primary_location.source.is_in_doaj	False
primary_location.source.display_name	arXiv (Cornell University)
primary_location.source.host_organization	https://openalex.org/I205783295
primary_location.source.host_organization_name	Cornell University
primary_location.source.host_organization_lineage	https://openalex.org/I205783295
primary_location.license
primary_location.pdf_url	https://arxiv.org/pdf/2205.09543
primary_location.version	submittedVersion
primary_location.raw_type	text
primary_location.license_id
primary_location.is_accepted	False
primary_location.is_published	False
primary_location.raw_source_name
primary_location.landing_page_url	http://arxiv.org/abs/2205.09543
publication_date	2022-05-19
publication_year	2022
referenced_works_count	0
abstract_inverted_index.a	47, 67, 106, 114, 141, 170, 192
abstract_inverted_index.In	101
abstract_inverted_index.We	195
abstract_inverted_index.an	6, 22, 146
abstract_inverted_index.as	113, 145
abstract_inverted_index.by	3, 240
abstract_inverted_index.in	70, 120, 138, 156, 187, 237
abstract_inverted_index.is	5, 21, 62, 66
abstract_inverted_index.of	9, 17, 25, 50, 60, 81, 94, 117, 201
abstract_inverted_index.or	136
abstract_inverted_index.to	12, 37, 91, 122, 153
abstract_inverted_index.we	104, 128, 148
abstract_inverted_index.The	79, 219
abstract_inverted_index.and	28, 97, 217, 244
abstract_inverted_index.are	226
abstract_inverted_index.can	74
abstract_inverted_index.due	90
abstract_inverted_index.fit	86
abstract_inverted_index.for	109, 133, 229
abstract_inverted_index.its	92
abstract_inverted_index.new	107
abstract_inverted_index.not	85, 233
abstract_inverted_index.the	14, 38, 58, 98, 154, 176, 184, 188, 199, 204, 208, 223, 241
abstract_inverted_index.PBRL	137, 151, 163, 216, 242
abstract_inverted_index.also	196, 227
abstract_inverted_index.been	33
abstract_inverted_index.call	129
abstract_inverted_index.case	177
abstract_inverted_index.deal	75
abstract_inverted_index.does	84
abstract_inverted_index.find	197
abstract_inverted_index.from	124
abstract_inverted_index.have	32
abstract_inverted_index.just	234
abstract_inverted_index.many	77
abstract_inverted_index.one.	64
abstract_inverted_index.only	63
abstract_inverted_index.rule	96
abstract_inverted_index.than	160, 175
abstract_inverted_index.that	73, 150, 198, 203
abstract_inverted_index.this	102
abstract_inverted_index.time	158, 173
abstract_inverted_index.well	87
abstract_inverted_index.when	167
abstract_inverted_index.with	35, 76, 169, 178
abstract_inverted_index.array	116
abstract_inverted_index.chaos	190
abstract_inverted_index.could	45
abstract_inverted_index.fewer	157
abstract_inverted_index.field	8
abstract_inverted_index.laser	172, 189
abstract_inverted_index.order	121
abstract_inverted_index.phase	210
abstract_inverted_index.steps	159
abstract_inverted_index.study	10, 225
abstract_inverted_index.where	57, 183
abstract_inverted_index.which	127
abstract_inverted_index.Taking	140
abstract_inverted_index.action	99
abstract_inverted_index.active	7
abstract_inverted_index.adapts	152
abstract_inverted_index.aiming	11
abstract_inverted_index.bandit	40, 55, 118, 131
abstract_inverted_index.branch	24
abstract_inverted_index.during	207
abstract_inverted_index.faster	165
abstract_inverted_index.number	49, 59
abstract_inverted_index.random	246
abstract_inverted_index.series	174
abstract_inverted_index.short.	139
abstract_inverted_index.states	61, 202
abstract_inverted_index.study,	103
abstract_inverted_index.system	205
abstract_inverted_index.unique	15
abstract_inverted_index.unlike	52
abstract_inverted_index.update	95
abstract_inverted_index.yields	164
abstract_inverted_index.benefit	123
abstract_inverted_index.between	215
abstract_inverted_index.chaotic	171
abstract_inverted_index.effect.	194
abstract_inverted_index.exploit	13
abstract_inverted_index.involve	46
abstract_inverted_index.machine	26
abstract_inverted_index.massive	48
abstract_inverted_index.numbers	182
abstract_inverted_index.present	224
abstract_inverted_index.problem	144
abstract_inverted_index.respect	36
abstract_inverted_index.states,	51
abstract_inverted_index.states.	78
abstract_inverted_index.through	222
abstract_inverted_index.variety	200
abstract_inverted_index.However,	42
abstract_inverted_index.approach	69
abstract_inverted_index.exhibits	211
abstract_inverted_index.existing	230
abstract_inverted_index.however,	83
abstract_inverted_index.inherent	186
abstract_inverted_index.insights	220
abstract_inverted_index.learning	20, 44, 72, 112, 135, 209
abstract_inverted_index.obtained	221
abstract_inverted_index.operated	168
abstract_inverted_index.organize	105
abstract_inverted_index.parallel	115, 130
abstract_inverted_index.photonic	29, 88, 125, 235
abstract_inverted_index.photons.	18
abstract_inverted_index.positive	193
abstract_inverted_index.problems	56, 119
abstract_inverted_index.provides	191
abstract_inverted_index.balancing	143
abstract_inverted_index.cart-pole	142
abstract_inverted_index.computing	231
abstract_inverted_index.different	213
abstract_inverted_index.important	23
abstract_inverted_index.instance,	147
abstract_inverted_index.learning,	27
abstract_inverted_index.photonics	4
abstract_inverted_index.problems.	41
abstract_inverted_index.undergoes	206
abstract_inverted_index.uniformly	179
abstract_inverted_index.Q-learning	65
abstract_inverted_index.adaptation	166
abstract_inverted_index.algorithms	243
abstract_inverted_index.artificial	1
abstract_inverted_index.beneficial	228
abstract_inverted_index.completely	212
abstract_inverted_index.correlated	245
abstract_inverted_index.platforms,	232
abstract_inverted_index.previously	53
abstract_inverted_index.principles	31
abstract_inverted_index.properties	16, 214
abstract_inverted_index.selection.	100
abstract_inverted_index.separation	93
abstract_inverted_index.sequences.	247
abstract_inverted_index.well-known	68
abstract_inverted_index.Q-learning,	82
abstract_inverted_index.Q-learning.	161, 218
abstract_inverted_index.demonstrate	149
abstract_inverted_index.distributed	180
abstract_inverted_index.environment	155
abstract_inverted_index.multi-armed	39
abstract_inverted_index.multi-state	110
abstract_inverted_index.Accelerating	0
abstract_inverted_index.Furthermore,	162
abstract_inverted_index.accelerating	238
abstract_inverted_index.architecture	80, 108, 132
abstract_inverted_index.demonstrated	34, 54
abstract_inverted_index.intelligence	2
abstract_inverted_index.performances	239
abstract_inverted_index.pseudorandom	181
abstract_inverted_index.Reinforcement	19
abstract_inverted_index.realizations,	236
abstract_inverted_index.reinforcement	43, 71, 111, 134
abstract_inverted_index.autocorrelation	185
abstract_inverted_index.decision-making	30
abstract_inverted_index.implementations	89
abstract_inverted_index.decision-makers,	126
cited_by_percentile_year
countries_distinct_count	0
institutions_distinct_count	8
sustainable_development_goals[0].id	https://metadata.un.org/sdg/16
sustainable_development_goals[0].score	0.7900000214576721
sustainable_development_goals[0].display_name	Peace, Justice and strong institutions
citation_normalized_percentile
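
The `abstract_inverted_index.*` rows in the payload above map each word of the abstract to the zero-based positions where it occurs. The sketch below shows one way to turn such rows back into running text. It assumes each row is a key and a value separated by a tab, with positions given as a comma-separated list. The helper names (`parse_rows`, `rebuild_abstract`) are hypothetical, and the sample rows are truncated to the first nine word positions for brevity.

```python
PREFIX = "abstract_inverted_index."


def parse_rows(rows):
    """Build {word: [positions]} from flattened 'key<TAB>positions' rows."""
    index = {}
    for row in rows:
        key, _, value = row.partition("\t")
        if not key.startswith(PREFIX) or not value.strip():
            continue  # skip unrelated or empty payload rows
        word = key[len(PREFIX):]  # words may themselves contain dots
        index[word] = [int(p) for p in value.split(",")]
    return index


def rebuild_abstract(index):
    """Place every word at its positions and join them in order."""
    by_position = {}
    for word, positions in index.items():
        for pos in positions:
            by_position[pos] = word
    return " ".join(by_position[pos] for pos in sorted(by_position))


# Sample rows from the payload, truncated to positions 0-8 (assumption:
# tab-separated key and value, as in the listing above).
sample_rows = [
    "abstract_inverted_index.an\t6",
    "abstract_inverted_index.by\t3",
    "abstract_inverted_index.is\t5",
    "abstract_inverted_index.field\t8",
    "abstract_inverted_index.active\t7",
    "abstract_inverted_index.photonics\t4",
    "abstract_inverted_index.artificial\t1",
    "abstract_inverted_index.Accelerating\t0",
    "abstract_inverted_index.intelligence\t2",
    "language\ten",
]

opening = rebuild_abstract(parse_rows(sample_rows))
```

Feeding all of the inverted-index rows to the same two functions would reproduce the full abstract, provided every position from 0 upward is present in the payload.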