"Bayesian Policy Gradient and Actor-Critic Algorithms."

Mohammad Ghavamzadeh, Yaakov Engel, Michal Valko (2016)