Scalable Photonic Reinforcement Learning By Time-division Multiplexing Of Laser Chaos
2018 · Makoto Naruse, Takatomo Mihana, Hirokazu Hori, et al.
Abstract
Reinforcement learning involves decision making in dynamic and uncertain environments and constitutes a crucial element of artificial intelligence. In our previous work, we experimentally demonstrated that the ultrafast chaotic oscillatory dynamics of lasers can be used to solve the two-armed bandit problem efficiently, which requires decision making concerning a class of difficult trade-offs called the exploration-exploitation dilemma. However, only two selections were employed in that research; thus, the scalability of the laser-chaos-based reinforcement learning should be clarified. In this study, we demonstrated a scalable, pipelined principle of resolving the multi-armed bandit problem by introducing time-division multiplexing of chaotically oscillated ultrafast time-series. The experimental demonstrations in which bandit problems with up to 64 arms were successfully solved are presented in this report. Detailed analyses are also provided that include performance comparisons among
Related papers
- Parallel Bandit Architecture Based On Laser Chaos For Reinforcement Learning (2022)
- Decentralized Multi-agent Reinforcement Learning Algorithm Using A Cluster-synchronized Laser Network (2024)
- Bandit Approach To Conflict-free Multi-agent Q-learning In View Of Photonic Implementation (2022)
- Autocorrelation Effects In A Stochastic-process Model For Decision Making Via Time Series (2026)
- An Optical Control Environment For Benchmarking Reinforcement Learning Algorithms (2022)
- Laser Learning Environment: A New Environment For Coordination-critical Multi-agent Tasks (2024)
- A Survey On Large-population Systems And Scalable Multi-agent Reinforcement Learning (2022)
- Large-scale Traffic Signal Control Using A Novel Multi-agent Reinforcement Learning (2019)