"A Bandit Framework for Optimal Selection of Reinforcement Learning Agents."

Andreas Merentitis et al. (2019)