cluster #3
50 papers in this cluster (ordered by heat_score)
Papers
- Deep TAMER: Interactive Agent Shaping In High-dimensional State Spaces (2017). Garrett Warnell, Nicholas Waytowich, Vernon Lawhern, et al. Heat score: 14.73
- Counterfactual State Explanations For Reinforcement Learning Agents Via Generative Deep Learning (2021). Matthew L. Olson, Roli Khanna, Lawrence Neal, et al. Heat score: 13.23
- Meta-reinforcement Learning For The Tuning Of PI Controllers: An Offline Approach (2022). Daniel G. McClement, Nathan P. Lawrence, Johan U. Backstrom, et al. Heat score: 12.02
- Environment Reconstruction With Hidden Confounders For Reinforcement Learning Based Recommendation (2019). Wenjie Shang, Yang Yu, Qingyang Li, et al. Heat score: 11.93
- Improving Interactive Reinforcement Learning: What Makes A Good Teacher? (2019). Francisco Cruz, Sven Magg, Yukie Nagai, et al. Heat score: 11.19
- Multi-agent Inverse Reinforcement Learning For Certain General-sum Stochastic Games (2018). Xiaomin Lin, Stephen C. Adams, Peter A. Beling. Heat score: 10.97
- A Tutorial On Meta-reinforcement Learning (2023). Jacob Beck, Risto Vuorio, Evan Zheran Liu, et al. Heat score: 10.85
- Meta-learning Within Projective Simulation (2016). Adi Makmal, Alexey A. Melnikov, Vedran Dunjko, et al. Heat score: 10.85
- Theoretical Analysis Of Meta Reinforcement Learning: Generalization Bounds And Convergence Guarantees (2024). Cangqing Wang, Mingxiu Sui, Dan Sun, et al. Heat score: 10.35
- Meta-learning Via Learned Loss (2019). Sarah Bechtle, Artem Molchanov, Yevgen Chebotar, et al. Heat score: 10.07
- Explaining Online Reinforcement Learning Decisions Of Self-adaptive Systems (2022). Felix Feit, Andreas Metzger, Klaus Pohl. Heat score: 9.59
- Facial Feedback For Reinforcement Learning: A Case Study And Offline Analysis Using The TAMER Framework (2020). Guangliang Li, Hamdi Dibeklioğlu, Shimon Whiteson, et al. Heat score: 9.59
- Agent-Pro: Learning To Evolve Via Policy-level Reflection And Optimization (2024). Wenqi Zhang, Ke Tang, Hai Wu, et al. Heat score: 9.59
- Human Engagement Providing Evaluative And Informative Advice For Interactive Reinforcement Learning (2020). Adam Bignold, Francisco Cruz, Richard Dazeley, et al. Heat score: 9.23
- DiffAIL: Diffusion Adversarial Imitation Learning (2023). Bingzheng Wang, Guoqiang Wu, Teng Pang, et al. Heat score: 9.13
- Inverse Reinforcement Learning In Contextual MDPs (2019). Stav Belogolovsky, Philip Korsunsky, Shie Mannor, et al. Heat score: 8.82
- Online Meta-learning By Parallel Algorithm Competition (2017). Stefan Elfwing, Eiji Uchibe, Kenji Doya. Heat score: 8.35
- Lipschitz Lifelong Reinforcement Learning (2020). Erwan Lecarpentier, David Abel, Kavosh Asadi, et al. Heat score: 8.35
- Lipschitzness Is All You Need To Tame Off-policy Generative Adversarial Imitation Learning (2020). Lionel Blondé, Pablo Strasser, Alexandros Kalousis. Heat score: 7.81
- Meta-reinforcement Learning Based On Self-supervised Task Representation Learning (2023). Mingyang Wang, Zhenshan Bing, Xiangtong Yao, et al. Heat score: 7.81
- SwiftRL: Towards Efficient Reinforcement Learning On Real Processing-in-memory Systems (2024). Kailash Gogineni, Sai Santosh Dayapule, Juan Gómez-Luna, et al. Heat score: 7.50
- Inverse-inverse Reinforcement Learning. How To Hide Strategy From An Adversarial Inverse Reinforcement Learner (2022). Kunal Pattanayak, Vikram Krishnamurthy, Christopher Berry. Heat score: 7.50
- Reinforcement Learning And Inverse Reinforcement Learning With System 1 And System 2 (2018). Alexander Peysakhovich. Heat score: 7.16
- Competitive Multi-agent Deep Reinforcement Learning With Counterfactual Thinking (2019). Yue Wang, Yao Wan, Chenwei Zhang, et al. Heat score: 7.16
- Options Of Interest: Temporal Abstraction With Interest Functions (2020). Khimya Khetarpal, Martin Klissarov, Maxime Chevalier-Boisvert, et al. Heat score: 6.77
- HIGhER: Improving Instruction Following With Hindsight Generation For Experience Replay (2019). Geoffrey Cideron, Mathieu Seurin, Florian Strub, et al. Heat score: 6.34
- Hypernetworks For Zero-shot Transfer In Reinforcement Learning (2022). Sahand Rezaei-Shoshtari, Charlotte Morissette, Francois Robert Hogan, et al. Heat score: 6.34
- On The Role Of Weight Sharing During Deep Option Learning (2019). Matthew Riemer, Ignacio Cases, Clemens Rosenbaum, et al. Heat score: 6.34
- Exploring The Limits Of Hierarchical World Models In Reinforcement Learning (2024). Robin Schiewer, Anand Subramoney, Laurenz Wiskott. Heat score: 6.34
- Rating-based Reinforcement Learning (2023). Devin White, Mingkang Wu, Ellen Novoseller, et al. Heat score: 6.34
- Reinforcement Learning In System Identification (2022). Jose Antonio Martin H., Oscar Fernandez Vicente, Sergio Perez, et al. Heat score: 5.84
- Causal Deep Reinforcement Learning Using Observational Data (2022). Wenxuan Zhu, Chao Yu, Qiang Zhang. Heat score: 5.84
- Task Phasing: Automated Curriculum Learning From Demonstrations (2022). Vaibhav Bajaj, Guni Sharon, Peter Stone. Heat score: 5.24
- Efficient Reinforcement Learning In Resource Allocation Problems Through Permutation Invariant Multi-task Learning (2021). Desmond Cai, Shiau Hong Lim, Laura Wynter. Heat score: 5.24
- Unified Algorithms For RL With Decision-estimation Coefficients: PAC, Reward-free, Preference-based Learning, And Beyond (2022). Fan Chen, Song Mei, Yu Bai. Heat score: 5.24
- In-trajectory Inverse Reinforcement Learning: Learn Incrementally Before An Ongoing Trajectory Terminates (2024). Shicheng Liu, Minghui Zhu. Heat score: 5.24
- Misspecification In Inverse Reinforcement Learning (2022). Joar Skalse, Alessandro Abate. Heat score: 5.24
- Deep Interactive Bayesian Reinforcement Learning Via Meta-learning (2021). Luisa Zintgraf, Sam Devlin, Kamil Ciosek, et al. Heat score: 5.24
- Rewarding The Scientific Process: Process-level Reward Modeling For Agentic Data Analysis (2026). Zhisong Qiu, Shuofei Qiao, Kewei Xu, et al. Heat score: 5.07
- Iterative Reward Shaping Using Human Feedback For Correcting Reward Misspecification (2023). Jasmina Gajcin, James McCarthy, Rahul Nair, et al. Heat score: 4.52
- Learning Action Translator For Meta Reinforcement Learning On Sparse-reward Tasks (2022). Yijie Guo, Qiucheng Wu, Honglak Lee. Heat score: 4.52
- A Model-based Approach For Improving Reinforcement Learning Efficiency Leveraging Expert Observations (2024). Erhan Can Ozcan, Vittorio Giammarino, James Queeney, et al. Heat score: 4.52
- Excluding The Irrelevant: Focusing Reinforcement Learning Through Continuous Action Masking (2024). Roland Stolz, Hanna Krasowski, Jakob Thumm, et al. Heat score: 4.52
- Meta-Q-Learning (2019). Rasool Fakoor, Pratik Chaudhari, Stefano Soatto, et al. Heat score: 3.58
- SeRO: Self-supervised Reinforcement Learning For Recovery From Out-of-distribution Situations (2023). Chan Kim, Jaekyung Cho, Christophe Bobda, et al. Heat score: 3.58
- Ask-AC: An Initiative Advisor-in-the-loop Actor-critic Framework (2022). Shunyu Liu, Kaixuan Chen, Na Yu, et al. Heat score: 3.58
- Discriminator Soft Actor Critic Without Extrinsic Rewards (2020). Daichi Nishio, Daiki Kuyoshi, Toi Tsuneda, et al. Heat score: 3.58
- A Bayesian Approach To Policy Recognition And State Representation Learning (2016). Adrian Šošić, Abdelhak M. Zoubir, Heinz Koeppl. Heat score: 3.58
- HarmonyDream: Task Harmonization Inside World Models (2023). Haoyu Ma, Jialong Wu, Ningya Feng, et al. Heat score: 3.46
- Direct Multi-turn Preference Optimization For Language Agents (2024). Wentao Shi, Mengqi Yuan, Junkang Wu, et al. Heat score: 3.45