Exploration
50 papers tagged Exploration (ordered by heat_score)
Papers
- Evolutionary Reinforcement Learning: A Survey (2023). Hui Bai, Ran Cheng, Yaochu Jin. Heat score: 13.93
- Proximal Policy Optimization Via Enhanced Exploration Efficiency (2020). Junwei Zhang, Zhenghao Zhang, Shuai Han, et al. Heat score: 13.70
- Count-based Exploration With The Successor Representation (2018). Marlos C. Machado, Marc G. Bellemare, Michael Bowling. Heat score: 13.17
- Scalable Photonic Reinforcement Learning By Time-division Multiplexing Of Laser Chaos (2018). Makoto Naruse, Takatomo Mihana, Hirokazu Hori, et al. Heat score: 13.05
- Policy Search In Continuous Action Domains: An Overview (2018). Olivier Sigaud, Freek Stulp. Heat score: 12.74
- Environment Reconstruction With Hidden Confounders For Reinforcement Learning Based Recommendation (2019). Wenjie Shang, Yang Yu, Qingyang Li, et al. Heat score: 11.93
- Provably Efficient Reinforcement Learning With Linear Function Approximation (2019). Chi Jin, Zhuoran Yang, Zhaoran Wang, et al. Heat score: 11.76
- A Policy Search Method For Temporal Logic Specified Reinforcement Learning Tasks (2017). Xiao Li, Yao Ma, Calin Belta. Heat score: 11.58
- Diversity Policy Gradient For Sample Efficient Quality-diversity Optimization (2020). Thomas Pierrot, Valentin Macé, Félix Chalumeau, et al. Heat score: 11.58
- Theoretical Analysis Of Meta Reinforcement Learning: Generalization Bounds And Convergence Guarantees (2024). Cangqing Wang, Mingxiu Sui, Dan Sun, et al. Heat score: 10.35
- Evolutionary Reinforcement Learning Via Cooperative Coevolutionary Negatively Correlated Search (2020). Hu Zhang, Peng Yang, Yanglong Yu, et al. Heat score: 9.92
- Exploration Versus Exploitation In Reinforcement Learning: A Stochastic Control Approach (2018). Haoran Wang, Thaleia Zariphopoulou, Xunyu Zhou. Heat score: 9.76
- Offline Reinforcement Learning For Wireless Network Optimization With Mixture Datasets (2023). Kun Yang, Cong Shen, Jing Yang, et al. Heat score: 9.59
- Intrinsic Fluctuations Of Reinforcement Learning Promote Cooperation (2022). Wolfram Barfuss, Janusz Meylahn. Heat score: 9.23
- Parallel Exploration Via Negatively Correlated Search (2019). Peng Yang, Qi Yang, Ke Tang, et al. Heat score: 8.60
- Online Meta-learning By Parallel Algorithm Competition (2017). Stefan Elfwing, Eiji Uchibe, Kenji Doya. Heat score: 8.35
- Sampling Efficient Deep Reinforcement Learning Through Preference-guided Stochastic Exploration (2022). Wenhui Huang, Cong Zhang, Jingda Wu, et al. Heat score: 8.09
- Multi-agent Deep Reinforcement Learning With Human Strategies (2018). Thanh Nguyen, Ngoc Duy Nguyen, Saeid Nahavandi. Heat score: 8.09
- Exploration And Incentives In Reinforcement Learning (2021). Max Simchowitz, Aleksandrs Slivkins. Heat score: 8.09
- DSAC: Distributional Soft Actor-critic For Risk-sensitive Reinforcement Learning (2020). Xiaoteng Ma, Junyao Chen, Li Xia, et al. Heat score: 7.81
- Autoregressive Policies For Continuous Control Deep Reinforcement Learning (2019). Dmytro Korenkevych, A. Rupam Mahmood, Gautham Vasan, et al. Heat score: 7.50
- Long-term Visitation Value For Deep Exploration In Sparse Reward Reinforcement Learning (2020). Simone Parisi, Davide Tateo, Maximilian Hensel, et al. Heat score: 7.24
- PNS: Population-guided Novelty Search For Reinforcement Learning In Hard Exploration Environments (2018). Qihao Liu, Yujia Wang, Xiaofeng Liu. Heat score: 7.16
- FoX: Formation-aware Exploration In Multi-agent Reinforcement Learning (2023). Yonghyeon Jo, Sunwoo Lee, Junghyuk Yeom, et al. Heat score: 6.77
- Uncertainty Quantification And Exploration For Reinforcement Learning (2019). Yi Zhu, Jing Dong, Henry Lam. Heat score: 6.77
- Learning-driven Exploration For Reinforcement Learning (2019). Muhammad Usama, Dong Eui Chang. Heat score: 6.45
- Collaborative Training Of Heterogeneous Reinforcement Learning Agents In Environments With Sparse Rewards: What And When To Share? (2022). Alain Andres, Esther Villar-Rodriguez, Javier Del Ser. Heat score: 6.34
- An Efficient Off-policy Reinforcement Learning Algorithm For The Continuous-time LQR Problem (2023). Victor G. Lopez, Matthias A. Müller. Heat score: 6.34
- An Intrinsically-motivated Approach For Learning Highly Exploring And Fast Mixing Policies (2019). Mirco Mutti, Marcello Restelli. Heat score: 6.34
- Exploring The Limits Of Hierarchical World Models In Reinforcement Learning (2024). Robin Schiewer, Anand Subramoney, Laurenz Wiskott. Heat score: 6.34
- MENTOR: Guiding Hierarchical Reinforcement Learning With Human Feedback And Dynamic Distance Constraint (2024). Xinglin Zhou, Yifu Yuan, Shaofu Yang, et al. Heat score: 6.34
- A Joint Imitation-reinforcement Learning Framework For Reduced Baseline Regret (2022). Sheelabhadra Dey, Sumedh Pendurkar, Guni Sharon, et al. Heat score: 5.84
- Deterministic Sequencing Of Exploration And Exploitation For Reinforcement Learning (2022). Piyush Gupta, Vaibhav Srivastava. Heat score: 5.84
- A Further Exploration Of Deep Multi-agent Reinforcement Learning With Hybrid Action Space (2022). Hongzhi Hua, Guixuan Wen, Kaigui Wu. Heat score: 5.84
- Policy Optimization With Model-based Explorations (2018). Feiyang Pan, Qingpeng Cai, An-Xiang Zeng, et al. Heat score: 5.84
- Design Space Exploration Of Approximate Computing Techniques With A Reinforcement Learning Approach (2023). Sepide Saeedi, Alessandro Savino, Stefano Di Carlo. Heat score: 5.84
- Deep Reinforcement Learning With Feedback-based Exploration (2019). Jan Scholten, Daan Wout, Carlos Celemin, et al. Heat score: 5.84
- Overcoming The Sim-to-real Gap: Leveraging Simulation To Learn To Explore For Real-world RL (2024). Andrew Wagenmaker, Kevin Huang, Liyiming Ke, et al. Heat score: 5.84
- VASE: Variational Assorted Surprise Exploration For Reinforcement Learning (2019). Haitao Xu, Brendan McCane, Lech Szymanski. Heat score: 5.84
- Boosting Exploration In Actor-critic Algorithms By Incentivizing Plausible Novel States (2022). Chayan Banerjee, Zhiyong Chen, Nasimul Noman. Heat score: 5.24
- Situation-dependent Causal Influence-based Cooperative Multi-agent Reinforcement Learning (2023). Xiao Du, Yutong Ye, Pengyu Zhang, et al. Heat score: 5.24
- On Hard Exploration For Reinforcement Learning: A Case Study In Pommerman (2019). Chao Gao, Bilal Kartal, Pablo Hernandez-Leal, et al. Heat score: 5.24
- POPO: Pessimistic Offline Policy Optimization (2020). Qiang He, Xinwen Hou. Heat score: 5.24
- Model-based Safe Deep Reinforcement Learning Via A Constrained Proximal Policy Optimization Algorithm (2022). Ashish Kumar Jayant, Shalabh Bhatnagar. Heat score: 5.24
- Robbins-Monro Conditions For Persistent Exploration Learning Strategies (2018). Dmitry B. Rokhlin. Heat score: 5.24
- Autotelic Reinforcement Learning: Exploring Intrinsic Motivations For Skill Acquisition In Open-ended Environments (2025). Prakhar Srivastava, Jasmeet Singh. Heat score: 5.24
- Reinforcement Learning By Guided Safe Exploration (2023). Qisong Yang, Thiago D. Simão, Nils Jansen, et al. Heat score: 5.24
- Graph Exploration For Effective Multi-agent Q-learning (2023). Ainur Zhaikhan, Ali H. Sayed. Heat score: 5.24
- AGPO: Adaptive Group Policy Optimization with Dual Statistical Feedback (2026). Miaobo Hu et al. Heat score: 4.54
- APEX: Autonomous Policy Exploration for Self-Evolving LLM Agents (2026). Yibo Li et al. Heat score: 4.54