Overcoming The Sim-to-real Gap: Leveraging Simulation To Learn To Explore For Real-world RL
2024 · Andrew Wagenmaker, Kevin Huang, Liyiming Ke, et al.
Abstract
In order to mitigate the sample complexity of real-world reinforcement learning, common practice is to first train a policy in a simulator where samples are cheap, and then deploy this policy in the real world, with the hope that it generalizes effectively. Such *direct sim2real* transfer is not guaranteed to succeed, however, and in cases where it fails, it is unclear how to best utilize the simulator. In this work, we show that in many regimes, while direct sim2real transfer may fail, we can utilize the simulator to learn a set of *exploratory* policies which enable efficient exploration in the real world. In particular, in the setting of low-rank MDPs, we show that coupling these exploratory policies with simple, practical approaches -- least-squares regression oracles and naive randomized exploration -- yields a polynomial sample complexity in the real world, an exponential improvement over direct sim2real transfer, or learning without access to a simulator. To the best of our know
Related papers
- Post-convergence Sim-to-real Policy Transfer: A Principled Alternative To Cherry-picking (2025)
- Trade-off On Sim2real Learning: Real-world Learning Faster Than Simulations (2020)
- Provable Sim-to-real Transfer In Continuous Domain With Partial Observations (2022)
- Understanding Domain Randomization For Sim-to-real Transfer (2021)
- Influence-augmented Local Simulators: A Scalable Solution For Fast Deep RL In Large Networked Systems (2022)
- When To Trust Your Simulator: Dynamics-aware Hybrid Offline-and-online Reinforcement Learning (2022)
- Human-inspired Framework To Accelerate Reinforcement Learning (2023)
- Policy Learning For Off-dynamics RL With Deficient Support (2024)