Ray: A Distributed Framework For Emerging AI Applications
2017 · Philipp Moritz, Robert Nishihara, Stephanie Wang, et al.
Abstract
The next generation of AI applications will continuously interact with the environment and learn from these interactions. These applications impose new and demanding systems requirements, both in terms of performance and flexibility. In this paper, we consider these requirements and present Ray, a distributed system to address them. Ray implements a unified interface that can express both task-parallel and actor-based computations, supported by a single dynamic execution engine. To meet the performance requirements, Ray employs a distributed scheduler and a distributed and fault-tolerant store to manage the system's control state. In our experiments, we demonstrate scaling beyond 1.8 million tasks per second and better performance than existing specialized systems for several challenging reinforcement learning applications.
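The abstract contrasts task-parallel and actor-based computation. The sketch below illustrates that distinction using only the Python standard library, on a single machine. It is not Ray's API and says nothing about how Ray is implemented. The names `square`, `CounterActor`, and `run_demo` are hypothetical and chosen for illustration. A task here is a stateless function call that can run anywhere. An actor is an object whose private state is touched only by its own worker, one message at a time.

```python
from concurrent.futures import ThreadPoolExecutor
import queue
import threading


def square(x):
    # A stateless task: the result depends only on the argument.
    return x * x


class CounterActor:
    """A toy actor (hypothetical): private state, one worker thread."""

    def __init__(self):
        self._count = 0
        self._inbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        # Messages are handled one at a time, so _count needs no lock.
        while True:
            amount, reply = self._inbox.get()
            if reply is None:
                break  # stop signal
            self._count += amount
            reply.put(self._count)

    def add(self, amount):
        # Enqueue a message and return a queue that will hold the reply.
        reply = queue.Queue(maxsize=1)
        self._inbox.put((amount, reply))
        return reply

    def stop(self):
        self._inbox.put((0, None))
        self._thread.join()


def run_demo():
    # Task-parallel part: independent calls with no shared state.
    with ThreadPoolExecutor(max_workers=4) as pool:
        squares = list(pool.map(square, range(5)))

    # Actor part: calls are serialized through the actor's inbox,
    # so each reply reflects all earlier messages.
    actor = CounterActor()
    replies = [actor.add(value) for value in squares]
    totals = [reply.get() for reply in replies]
    actor.stop()
    return squares, totals
```

Calling `run_demo()` returns the squared inputs together with the running totals held by the actor. The tasks could be reordered freely without changing the result, whereas the actor's replies depend on the order in which its messages arrive.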
Related papers
- The AI Arena: A Framework For Distributed Multi-agent Reinforcement Learning (2021)
- RLlib Flow: Distributed Reinforcement Learning Is A Dataflow Problem (2020)
- SRL: Scaling Distributed Reinforcement Learning To Over Ten Thousand Cores (2023)
- Acceleration For Deep Reinforcement Learning Using Parallel And Distributed Computing: A Survey (2024)
- Distributed Deep Reinforcement Learning: An Overview (2020)
- Fiber: A Platform For Efficient Development And Distributed Training For Reinforcement Learning And Population-based Methods (2020)
- Jumanji: A Diverse Suite Of Scalable Reinforcement Learning Environments In JAX (2023)
- Cogment: Open Source Framework For Distributed Multi-actor Training, Deployment & Operations (2021)