cluster #9
50 papers in this cluster (ordered by heat_score)
Papers
- A Tour Of Reinforcement Learning: The View From Continuous Control (2018). Benjamin Recht. heat_score: 19.86
- Reinforcement Learning In Economics And Finance (2020). Arthur Charpentier, Romuald Elie, Carl Remlinger. heat_score: 14.73
- Deep Hierarchical Reinforcement Learning Algorithm In Partially Observable Markov Decision Processes (2018). Le Pham Tuyen, Ngo Anh Vien, Abu Layek, et al. heat_score: 12.87
- Provably Efficient Reinforcement Learning With Linear Function Approximation (2019). Chi Jin, Zhuoran Yang, Zhaoran Wang, et al. heat_score: 11.76
- Sample Complexity Of Asynchronous Q-learning: Sharper Analysis And Variance Reduction (2020). Gen Li, Yuting Wei, Yuejie Chi, et al. heat_score: 11.19
- Robust Reinforcement Learning: A Case Study In Linear Quadratic Regulation (2020). Bo Pang, Zhong-Ping Jiang. heat_score: 11.19
- Soft Policy Gradient Method For Maximum Entropy Deep Reinforcement Learning (2019). Wenjie Shi, Shiji Song, Cheng Wu. heat_score: 10.85
- Direct And Indirect Reinforcement Learning (2019). Yang Guan, Shengbo Eben Li, Jingliang Duan, et al. heat_score: 10.74
- Achieving Zero Constraint Violation For Constrained Reinforcement Learning Via Primal-dual Approach (2021). Qinbo Bai, Amrit Singh Bedi, Mridul Agarwal, et al. heat_score: 9.59
- Modeling The Effects Of Environmental And Perceptual Uncertainty Using Deterministic Reinforcement Learning Dynamics With Partial Observability (2021). Wolfram Barfuss, Richard P. Mann. heat_score: 9.59
- On Overfitting And Asymptotic Bias In Batch Reinforcement Learning With Partial Observability (2017). Vincent Francois-Lavet, Guillaume Rabusseau, Joelle Pineau, et al. heat_score: 9.23
- Convergence Of Policy Gradient Methods For Finite-horizon Exploratory Linear-quadratic Control Problems (2022). Michael Giegrich, Christoph Reisinger, Yufei Zhang. heat_score: 9.23
- Convergence Guarantees Of Policy Optimization Methods For Markovian Jump Linear Systems (2020). Joao Paulo Jansch-Porto, Bin Hu, Geir Dullerud. heat_score: 9.03
- Breaking The Sample Size Barrier In Model-based Reinforcement Learning With A Generative Model (2020). Gen Li, Yuting Wei, Yuejie Chi, et al. heat_score: 9.03
- Regret Bounds For Reinforcement Learning Via Markov Chain Concentration (2018). Ronald Ortner. heat_score: 9.03
- Learning And Information In Stochastic Networks And Queues (2021). Neil Walton, Kuang Xu. heat_score: 9.03
- Rethinking The Discount Factor In Reinforcement Learning: A Decision Theoretic Approach (2019). Silviu Pitis. heat_score: 8.60
- Parameterized MDPs And Reinforcement Learning Problems -- A Maximum Entropy Principle Based Framework (2020). Amber Srivastava, Srinivasa M Salapaka. heat_score: 8.60
- Revisiting LQR Control From The Perspective Of Receding-horizon Policy Gradient (2023). Xiangyuan Zhang, Tamer Başar. heat_score: 8.60
- Q-learning Lagrange Policies For Multi-action Restless Bandits (2021). Jackson A. Killian, Arpita Biswas, Sanket Shah, et al. heat_score: 8.35
- Unified Models Of Human Behavioral Agents In Bandits, Contextual Bandits And RL (2020). Baihan Lin, Guillermo Cecchi, Djallel Bouneffouf, et al. heat_score: 8.35
- Minimax Optimal Q Learning With Nearest Neighbors (2023). Puning Zhao, Lifeng Lai. heat_score: 8.09
- Logarithmic Regret For Episodic Continuous-time Linear-quadratic Reinforcement Learning Over A Finite-time Horizon (2020). Matteo Basei, Xin Guo, Anran Hu, et al. heat_score: 7.81
- DSAC: Distributional Soft Actor-critic For Risk-sensitive Reinforcement Learning (2020). Xiaoteng Ma, Junyao Chen, Li Xia, et al. heat_score: 7.81
- Computably Continuous Reinforcement-learning Objectives Are PAC-learnable (2023). Cambridge Yang, Michael Littman, Michael Carbin. heat_score: 7.81
- Approximating Euclidean By Imprecise Markov Decision Processes (2020). Manfred Jaeger, Giorgio Bacci, Giovanni Bacci, et al. heat_score: 7.50
- Renewal Monte Carlo: Renewal Theory Based Reinforcement Learning (2018). Jayakumar Subramanian, Aditya Mahajan. heat_score: 7.50
- Effective Multi-user Delay-constrained Scheduling With Deep Recurrent Reinforcement Learning (2022). Pihe Hu, Ling Pan, Yu Chen, et al. heat_score: 7.16
- An Online Prediction Algorithm For Reinforcement Learning With Linear Function Approximation Using Cross Entropy Method (2018). Ajin George Joseph, Shalabh Bhatnagar. heat_score: 7.16
- Entropic Regularization Of Markov Decision Processes (2019). Boris Belousov, Jan Peters. heat_score: 6.77
- Reinforcement Learning In POMDPs With Memoryless Options And Option-observation Initiation Sets (2017). Denis Steckelmacher, Diederik M. Roijers, Anna Harutyunyan, et al. heat_score: 6.77
- Efficient Learning In Non-stationary Linear Markov Decision Processes (2020). Ahmed Touati, Pascal Vincent. heat_score: 6.77
- Entropy Regularized Reinforcement Learning Using Large Deviation Theory (2021). Argenis Arriojas, Jacob Adamczyk, Stas Tiomkin, et al. heat_score: 6.34
- Optimality-based Analysis Of XCSF Compaction In Discrete Reinforcement Learning (2020). Jordan T. Bishop, Marcus Gallagher. heat_score: 6.34
- Multi-timescale Ensemble Q-learning For Markov Decision Process Policy Optimization (2024). Talha Bozkus, Urbashi Mitra. heat_score: 6.34
- Linear Convergence Of Entropy-regularized Natural Policy Gradient With Linear Function Approximation (2021). Semih Cayci, Niao He, R. Srikant. heat_score: 6.34
- Learning In Restless Bandits Under Exogenous Global Markov Process (2021). Tomer Gafni, Michal Yemini, Kobi Cohen. heat_score: 6.34
- Softmax Policy Gradient Methods Can Take Exponential Time To Converge (2021). Gen Li, Yuting Wei, Yuejie Chi, et al. heat_score: 6.34
- An Efficient Off-policy Reinforcement Learning Algorithm For The Continuous-time LQR Problem (2023). Victor G. Lopez, Matthias A. Müller. heat_score: 6.34
- Reinforcement Learning Under Partial Observability Guided By Learned Environment Models (2022). Edi Muskardin, Martin Tappler, Bernhard K. Aichernig, et al. heat_score: 6.34
- Anomaly Detection Via Learning-based Sequential Controlled Sensing (2023). Geethu Joseph, Chen Zhong, M. Cenk Gursoy, et al. heat_score: 5.84
- Performance Dynamics And Termination Errors In Reinforcement Learning: A Unifying Perspective (2019). Nikki Lijing Kuang, Clement H. C. Leung. heat_score: 5.84
- Sparse Tree Search Optimality Guarantees In POMDPs With Continuous Observation Spaces (2019). Michael H. Lim, Claire J. Tomlin, Zachary N. Sunberg. heat_score: 5.84
- Active Inference And Reinforcement Learning: A Unified Inference On Continuous State And Action Spaces Under Partial Observability (2022). Parvin Malekzadeh, Konstantinos N. Plataniotis. heat_score: 5.84
- Robust Risk-sensitive Reinforcement Learning With Conditional Value-at-risk (2024). Xinyi Ni, Lifeng Lai. heat_score: 5.84
- Modelling Stock-market Investors As Reinforcement Learning Agents [correction] (2016). Alvin Pastore, Umberto Esposito, Eleni Vasilaki. heat_score: 5.84
- CertRL: Formalizing Convergence Proofs For Value And Policy Iteration In Coq (2020). Koundinya Vajjha, Avraham Shinnar, Vasily Pestun, et al. heat_score: 5.84
- Hidden Markov Model Estimation-based Q-learning For Partially Observable Markov Decision Process (2018). Hyung-Jin Yoon, Donghwan Lee, Naira Hovakimyan. heat_score: 5.84
- Reinforcement Learning With Non-cumulative Objective (2023). Wei Cui, Wei Yu. heat_score: 5.24
- Replicable Reinforcement Learning With Linear Function Approximation (2025). Eric Eaton, Marcel Hussing, Michael Kearns, et al. heat_score: 5.24