


Reinforcement Learning Journal, Volume 1, 2024
- Max Olan Smith, Michael P. Wellman: Co-Learning Empirical Games & World Models. RLJ 1: 1-15 (2024)
- Woojin Jeong, Seungki Min: Improving Thompson Sampling via Information Relaxation for Budgeted Multi-armed Bandits. RLJ 1: 16-28 (2024)
- Shuang Wu, Arash A. Amini: Graph Neural Thompson Sampling. RLJ 1: 29-63 (2024)
- Junxiong Wang, Kaiwen Wang, Yueying Li, Nathan Kallus, Immanuel Trummer, Wen Sun: JoinGym: An Efficient Join Order Selection Environment. RLJ 1: 64-91 (2024)
- Antonin Raffin, Olivier Sigaud, Jens Kober, Alin Albu-Schäffer, João Silvério, Freek Stulp: An Open-Loop Baseline for Reinforcement Learning Locomotion Tasks. RLJ 1: 92-107 (2024)
- Raphaël Avalos, Eugenio Bargiacchi, Ann Nowé, Diederik M. Roijers, Frans A. Oliehoek: Online Planning in POMDPs with State-Requests. RLJ 1: 108-129 (2024)
- Abdulaziz Almuzairee, Nicklas Hansen, Henrik I. Christensen: A Recipe for Unbounded Data Augmentation in Visual Reinforcement Learning. RLJ 1: 130-157 (2024)
- Robert J. Moss, Anthony Corso, Jef Caers, Mykel J. Kochenderfer: BetaZero: Belief-State Planning for Long-Horizon POMDPs using Learned Approximations. RLJ 1: 158-181 (2024)
- Audrey Huang, Mohammad Ghavamzadeh, Nan Jiang, Marek Petrik: Non-adaptive Online Finetuning for Offline Reinforcement Learning. RLJ 1: 182-197 (2024)
- Nicholas E. Corrado, Yuxiao Qu, John U. Balis, Adam Labiosa, Josiah P. Hanna: Guided Data Augmentation for Offline Reinforcement Learning and Imitation Learning. RLJ 1: 198-215 (2024)
- Michael Lu, Matin Aghaei, Anant Raj, Sharan Vaswani: Towards Principled, Practical Policy Gradient for Bandits and Tabular MDPs. RLJ 1: 216-282 (2024)
- Benjamin Freed, Thomas Wei, Roberto Calandra, Jeff Schneider, Howie Choset: Unifying Model-Based and Model-Free Reinforcement Learning with Equivalent Policy Sets. RLJ 1: 283-301 (2024)
- Noah Golowich, Ankur Moitra: The Role of Inherent Bellman Error in Offline Reinforcement Learning with Linear Function Approximation. RLJ 1: 302-341 (2024)
- Max Rudolph, Caleb Chuck, Kevin Black, Misha Lvovsky, Scott Niekum, Amy Zhang: Learning Action-based Representations Using Invariance. RLJ 1: 342-365 (2024)
- Oliver Järnefelt, Mahdi Kallel, Carlo D'Eramo: Cyclicity-Regularized Coordination Graphs. RLJ 1: 366-379 (2024)
- Aditya Kapoor, Benjamin Freed, Jeff Schneider, Howie Choset: Assigning Credit with Partial Reward Decoupling in Multi-Agent Proximal Policy Optimization. RLJ 1: 380-399 (2024)
- Quentin Delfosse, Jannis Blüml, Bjarne Gregori, Sebastian Sztwiertnia, Kristian Kersting: OCAtari: Object-Centric Atari 2600 Reinforcement Learning Environments. RLJ 1: 400-449 (2024)
- Jacob Beck, Matthew Thomas Jackson, Risto Vuorio, Zheng Xiong, Shimon Whiteson: SplAgger: Split Aggregation for Meta-Reinforcement Learning. RLJ 1: 450-469 (2024)
- Nan Jiang, Jinzhao Li, Yexiang Xue: A Tighter Convergence Proof of Reverse Experience Replay. RLJ 1: 470-480 (2024)
