Awesome Papers

Papers

Active Inference: Demystified And Compared (2019)
Noor Sajid, Philip J. Ball, Thomas Parr, et al.
15.98
Revisiting The Arcade Learning Environment: Evaluation Protocols And Open Problems For General Agents (2017)
Marlos C. Machado, Marc G. Bellemare, Erik Talvitie, et al.
15.67
Sophisticated Inference (2020)
Karl Friston, Lancelot Da Costa, Danijar Hafner, et al.
14.83
Statistical Inference Of The Value Function For Reinforcement Learning In Infinite Horizon Settings (2020)
C. Shi, S. Zhang, W. Lu, et al.
13.14
Human-level Control Through Directly-trained Deep Spiking Q-networks (2021)
Guisong Liu, Wenjie Deng, Xiurui Xie, et al.
12.40
Reinforcement Learning And Its Connections With Neuroscience And Psychology (2020)
Ajay Subramanian, Sharad Chitlangia, Veeky Baths
12.25
Adaptive Trust Region Policy Optimization: Global Convergence And Faster Rates For Regularized MDPs (2019)
Lior Shani, Yonathan Efroni, Shie Mannor
12.10
Learning Offline: Memory Replay In Biological And Artificial Reinforcement Learning (2021)
Emma L. Roscow, Raymond Chua, Rui Ponte Costa, et al.
11.67
Reward Maximisation Through Discrete Active Inference (2020)
Lancelot Da Costa, Noor Sajid, Thomas Parr, et al.
10.74
Variance Reduction In Monte Carlo Counterfactual Regret Minimization (VR-MCCFR) For Extensive Form Games Using Baselines (2018)
Martin Schmid, Neil Burch, Marc Lanctot, et al.
10.48
Efficiently Breaking The Curse Of Horizon In Off-policy Evaluation With Double Reinforcement Learning (2019)
Nathan Kallus, Masatoshi Uehara
10.21
Expanding The Active Inference Landscape: More Intrinsic Motivations In The Perception-action Loop (2018)
Martin Biehl, Christian Guckelsberger, Christoph Salge, et al.
9.92
Deep Active Inference For Partially Observable MDPs (2020)
Otto van der Himst, Pablo Lanillos
9.59
Reinforcement Learning With Low-complexity Liquid State Machines (2019)
Wachirawit Ponghiran, Gopalakrishnan Srinivasan, Kaushik Roy
9.41
The Sufficiency Of Off-policyness And Soft Clipping: PPO Is Still Insufficient According To An Off-policy Measure (2022)
Xing Chen, Dongcui Diao, Hechang Chen, et al.
9.23
Bootstrapping With Models: Confidence Intervals For Off-policy Evaluation (2016)
Josiah P. Hanna, Peter Stone, Scott Niekum
9.23
Compatible Natural Gradient Policy Search (2019)
Joni Pajarinen, Hong Linh Thai, Riad Akrour, et al.
9.23
Online Bootstrap Inference For Policy Evaluation In Reinforcement Learning (2021)
Pratik Ramprasad, Yuantong Li, Zhuoran Yang, et al.
9.23
Towards Applicable Reinforcement Learning: Improving The Generalization And Sample Efficiency With Policy Ensemble (2022)
Zhengyu Yang, Kan Ren, Xufang Luo, et al.
9.23
Experience Replay Using Transition Sequences (2017)
Thommen George Karimpanal, Roland Bouffanais
8.82
Learning First-to-spike Policies For Neuromorphic Control Using Policy Gradients (2018)
Bleema Rosenfeld, Osvaldo Simeone, Bipin Rajendran
8.60
ViZDoom: DRQN With Prioritized Experience Replay, Double-Q Learning, & Snapshot Ensembling (2018)
Christopher Schulze, Marcus Schulze
8.60
Lucid Dreaming For Experience Replay: Refreshing Past States With The Current Policy (2020)
Yunshu Du, Garrett Warnell, Assefaw Gebremedhin, et al.
7.81
Autoregressive Policies For Continuous Control Deep Reinforcement Learning (2019)
Dmytro Korenkevych, A. Rupam Mahmood, Gautham Vasan, et al.
7.50
Reinforcement Learning Framework For Deep Brain Stimulation Study (2020)
Dmitrii Krylov, Remi Tachet, Romain Laroche, et al.
7.50
Off-policy Evaluation In Doubly Inhomogeneous Environments (2023)
Zeyu Bian, Chengchun Shi, Zhengling Qi, et al.
7.16
Adaptively Calibrated Critic Estimates For Deep Reinforcement Learning (2021)
Nicolai Dorka, Tim Welschehold, Joschka Boedecker, et al.
7.16
Conformal Off-policy Evaluation In Markov Decision Processes (2023)
Daniele Foffano, Alessio Russo, Alexandre Proutiere
7.16
A Low Latency Adaptive Coding Spiking Framework For Deep Reinforcement Learning (2022)
Lang Qin, Rui Yan, Huajin Tang
7.16
Branching Time Active Inference: Empirical Study And Complexity Class Analysis (2021)
Théophile Champion, Howard Bowman, Marek Grześ
6.77
Faded-experience Trust Region Policy Optimization For Model-free Power Allocation In Interference Channel (2020)
Mohammad G. Khoshkholgh, Halim Yanikomeroglu
6.77
Proximal Policy Optimization With Relative Pearson Divergence (2020)
Taisuke Kobayashi
6.77
Neural Networks With Motivation (2019)
Sergey A. Shuvaev, Ngoc B. Tran, Marcus Stephenson-Jones, et al.
6.77
Prioritized Sweeping Neural DynaQ With Multiple Predecessors, And Hippocampal Replays (2018)
Lise Aubin, Mehdi Khamassi, Benoît Girard
6.34
Context Meta-reinforcement Learning Via Neuromodulation (2021)
Eseoghene Ben-Iwhiwhu, Jeffery Dick, Nicholas A. Ketz, et al.
6.34
Associative Memory Based Experience Replay For Deep Reinforcement Learning (2022)
Mengyuan Li, Arman Kazemi, Ann Franchesca Laguna, et al.
6.34
A Dual-memory Architecture For Reinforcement Learning On Neuromorphic Platforms (2021)
Wilkie Olin-Ammentorp, Yury Sokolov, Maxim Bazhenov
6.34
An Improved Strategy For Blood Glucose Control Using Multi-step Deep Reinforcement Learning (2024)
Weiwei Gu, Senquan Wang
5.84
An Introduction To Reinforcement Learning For Neuroscience (2023)
Kristopher T. Jensen
5.84
Design Space Exploration Of Approximate Computing Techniques With A Reinforcement Learning Approach (2023)
Sepide Saeedi, Alessandro Savino, Stefano Di Carlo
5.84
Smoothed Functional-based Gradient Algorithms For Off-policy Reinforcement Learning: A Non-asymptotic Viewpoint (2021)
Nithia Vijayan, Prashanth L. A.
5.84
Bootstrapping A DQN Replay Memory With Synthetic Experiences (2020)
Wenzel Baron Pilar von Pilchau, Anthony Stein, Jörg Hähner
5.84
AccMER: Accelerating Multi-agent Experience Replay With Cache Locality-aware Prioritization (2023)
Kailash Gogineni, Yongsheng Mei, Peng Wei, et al.
5.24
Learning Expected Emphatic Traces For Deep RL (2021)
Ray Jiang, Shangtong Zhang, Veronica Chelu, et al.
5.24
Effects Of Spectral Normalization In Multi-agent Reinforcement Learning (2022)
Kinal Mehta, Anuj Mahajan, Pawan Kumar
5.24
Generalized Policy Improvement Algorithms With Theoretically Supported Sample Reuse (2022)
James Queeney, Ioannis Ch. Paschalidis, Christos G. Cassandras
5.24
Augmented Replay Memory In Reinforcement Learning With Continuous Control (2019)
Mirza Ramicic, Andrea Bonarini
5.24
Prioritizing Samples In Reinforcement Learning With Reducible Loss (2022)
Shivakanth Sujit, Somjit Nath, Pedro H. M. Braga, et al.
5.24
More For Less: Safe Policy Improvement With Stronger Performance Guarantees (2023)
Patrick Wienhöft, Marnix Suilen, Thiago D. Simão, et al.
5.24
Gradient Informed Proximal Policy Optimization (2023)
Sanghyun Son, Laura Yu Zheng, Ryan Sullivan, et al.
5.15