David Silver
Person information
- affiliation: DeepMind, London, UK
- affiliation (former): University College London, UK
2020 – today
- 2023
- [j29]Daniel J. Mankowitz, Andrea Michi, Anton Zhernov, Marco Gelmi, Marco Selvi, Cosmin Paduraru, Edouard Leurent, Shariq Iqbal, Jean-Baptiste Lespiau, Alex Ahern, Thomas Köppe, Kevin Millikin, Stephen Gaffney, Sophie Elster, Jackson Broshear, Chris Gamble, Kieran Milan, Robert Tung, Minjae Hwang, A. Taylan Cemgil, Mohammadamin Barekatain, Yujia Li, Amol Mandhane, Thomas Hubert, Julian Schrittwieser, Demis Hassabis, Pushmeet Kohli, Martin A. Riedmiller, Oriol Vinyals, David Silver:
Faster sorting algorithms discovered using deep reinforcement learning. Nat. 618(7964): 257-263 (2023) - [i59]Rohan Anil, Sebastian Borgeaud, Yonghui Wu, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Slav Petrov, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy P. Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul Ronald Barham, Tom Hennigan, Benjamin Lee, Fabio Viola, Malcolm Reynolds, Yuanzhong Xu, Ryan Doherty, Eli Collins, Clemens Meyer, Eliza Rutherford, Erica Moreira, Kareem Ayoub, Megha Goel, George Tucker, Enrique Piqueras, Maxim Krikun, Iain Barr, Nikolay Savinov, Ivo Danihelka, Becca Roelofs, Anaïs White, Anders Andreassen, Tamara von Glehn, Lakshman Yagati, Mehran Kazemi, Lucas Gonzalez, Misha Khalman, Jakub Sygnowski, et al.:
Gemini: A Family of Highly Capable Multimodal Models. CoRR abs/2312.11805 (2023)
- 2022
- [j28]Alhussein Fawzi, Matej Balog, Aja Huang, Thomas Hubert, Bernardino Romera-Paredes, Mohammadamin Barekatain, Alexander Novikov, Francisco J. R. Ruiz, Julian Schrittwieser, Grzegorz Swirszcz, David Silver, Demis Hassabis, Pushmeet Kohli:
Discovering faster matrix multiplication algorithms with reinforcement learning. Nat. 610(7930): 47-53 (2022) - [j27]Yutaka Matsuo, Yann LeCun, Maneesh Sahani, Doina Precup, David Silver, Masashi Sugiyama, Eiji Uchibe, Jun Morimoto:
Deep learning, reinforcement learning, and world models. Neural Networks 152: 267-275 (2022) - [c88]Ioannis Antonoglou, Julian Schrittwieser, Sherjil Ozair, Thomas K. Hubert, David Silver:
Planning in Stochastic Environments with a Learned Model. ICLR 2022 - [c87]Ivo Danihelka, Arthur Guez, Julian Schrittwieser, David Silver:
Policy improvement by planning with Gumbel. ICLR 2022 - [c86]Sebastian Flennerhag, Yannick Schroecker, Tom Zahavy, Hado van Hasselt, David Silver, Satinder Singh:
Bootstrapped Meta-Learning. ICLR 2022 - [c85]David Silver, Anirudh Goyal, Ivo Danihelka, Matteo Hessel, Hado van Hasselt:
Learning by Directional Gradient Descent. ICLR 2022 - [d1]Julien Pérolat, Bart De Vylder, Daniel Hennes, Eugene Tarassov, Florian Strub, Vincent de Boer, Paul Muller, Jerome T. Connor, Neil Burch, Thomas Anthony, Stephen McAleer, Romuald Elie, Sarah H. Cen, Zhe Wang, Audrunas Gruslys, Aleksandra Malysheva, Mina Khan, Sherjil Ozair, Finbarr Timbers, Toby Pohlen, Tom Eccles, Mark Rowland, Marc Lanctot, Jean-Baptiste Lespiau, Bilal Piot, Shayegan Omidshafiei, Edward Lockhart, Laurent Sifre, Nathalie Beauguerlange, Rémi Munos, David Silver, Satinder Singh, Demis Hassabis, Karl Tuyls:
Figure Data for the paper "Mastering the Game of Stratego with Model-Free Multiagent Reinforcement Learning". Zenodo, 2022 - [i58]Julien Pérolat, Bart De Vylder, Daniel Hennes, Eugene Tarassov, Florian Strub, Vincent de Boer, Paul Muller, Jerome T. Connor, Neil Burch, Thomas W. Anthony, Stephen McAleer, Romuald Elie, Sarah H. Cen, Zhe Wang, Audrunas Gruslys, Aleksandra Malysheva, Mina Khan, Sherjil Ozair, Finbarr Timbers, Toby Pohlen, Tom Eccles, Mark Rowland, Marc Lanctot, Jean-Baptiste Lespiau, Bilal Piot, Shayegan Omidshafiei, Edward Lockhart, Laurent Sifre, Nathalie Beauguerlange, Rémi Munos, David Silver, Satinder Singh, Demis Hassabis, Karl Tuyls:
Mastering the Game of Stratego with Model-Free Multiagent Reinforcement Learning. CoRR abs/2206.15378 (2022)
- 2021
- [j26]David Silver, Satinder Singh, Doina Precup, Richard S. Sutton:
Reward is enough. Artif. Intell. 299: 103535 (2021) - [c84]Will Dabney, André Barreto, Mark Rowland, Robert Dadashi, John Quan, Marc G. Bellemare, David Silver:
The Value-Improvement Path: Towards Better Representations for Reinforcement Learning. AAAI 2021: 7160-7168 - [c83]Hado van Hasselt, Sephora Madjiheurem, Matteo Hessel, David Silver, André Barreto, Diana Borsa:
Expected Eligibility Traces. AAAI 2021: 9997-10005 - [c82]Matteo Hessel, Ivo Danihelka, Fabio Viola, Arthur Guez, Simon Schmitt, Laurent Sifre, Theophane Weber, David Silver, Hado van Hasselt:
Muesli: Combining Improvements in Policy Optimization. ICML 2021: 4214-4226 - [c81]Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Mohammadamin Barekatain, Simon Schmitt, David Silver:
Learning and Planning in Complex Action Spaces. ICML 2021: 4476-4486 - [c80]Gregory Farquhar, Kate Baumli, Zita Marinho, Angelos Filos, Matteo Hessel, Hado Philip van Hasselt, David Silver:
Self-Consistent Models and Values. NeurIPS 2021: 1111-1125 - [c79]Christopher Grimm, André Barreto, Gregory Farquhar, David Silver, Satinder Singh:
Proper Value Equivalence. NeurIPS 2021: 7773-7786 - [c78]Julian Schrittwieser, Thomas Hubert, Amol Mandhane, Mohammadamin Barekatain, Ioannis Antonoglou, David Silver:
Online and Offline Reinforcement Learning by Planning with a Learned Model. NeurIPS 2021: 27580-27591 - [c77]Vivek Veeriah, Tom Zahavy, Matteo Hessel, Zhongwen Xu, Junhyuk Oh, Iurii Kemaev, Hado van Hasselt, David Silver, Satinder Singh:
Discovery of Options via Meta-Learned Subgoals. NeurIPS 2021: 29861-29873 - [i57]Vivek Veeriah, Tom Zahavy, Matteo Hessel, Zhongwen Xu, Junhyuk Oh, Iurii Kemaev, Hado van Hasselt, David Silver, Satinder Singh:
Discovery of Options via Meta-Learned Subgoals. CoRR abs/2102.06741 (2021) - [i56]Matteo Hessel, Ivo Danihelka, Fabio Viola, Arthur Guez, Simon Schmitt, Laurent Sifre, Theophane Weber, David Silver, Hado van Hasselt:
Muesli: Combining Improvements in Policy Optimization. CoRR abs/2104.06159 (2021) - [i55]Julian Schrittwieser, Thomas Hubert, Amol Mandhane, Mohammadamin Barekatain, Ioannis Antonoglou, David Silver:
Online and Offline Reinforcement Learning by Planning with a Learned Model. CoRR abs/2104.06294 (2021) - [i54]Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Mohammadamin Barekatain, Simon Schmitt, David Silver:
Learning and Planning in Complex Action Spaces. CoRR abs/2104.06303 (2021) - [i53]Christopher Grimm, André Barreto, Gregory Farquhar, David Silver, Satinder Singh:
Proper Value Equivalence. CoRR abs/2106.10316 (2021) - [i52]André Barreto, Diana Borsa, Shaobo Hou, Gheorghe Comanici, Eser Aygün, Philippe Hamel, Daniel Toyama, Jonathan J. Hunt, Shibl Mourad, David Silver, Doina Precup:
The Option Keyboard: Combining Skills in Reinforcement Learning. CoRR abs/2106.13105 (2021) - [i51]Sebastian Flennerhag, Yannick Schroecker, Tom Zahavy, Hado van Hasselt, David Silver, Satinder Singh:
Bootstrapped Meta-Learning. CoRR abs/2109.04504 (2021) - [i50]Gregory Farquhar, Kate Baumli, Zita Marinho, Angelos Filos, Matteo Hessel, Hado van Hasselt, David Silver:
Self-Consistent Models and Values. CoRR abs/2110.12840 (2021)
- 2020
- [j25]Andrew W. Senior, Richard Evans, John Jumper, James Kirkpatrick, Laurent Sifre, Tim Green, Chongli Qin, Augustin Zídek, Alexander W. R. Nelson, Alex Bridgland, Hugo Penedones, Stig Petersen, Karen Simonyan, Steve Crossan, Pushmeet Kohli, David T. Jones, David Silver, Koray Kavukcuoglu, Demis Hassabis:
Improved protein structure prediction using potentials from deep learning. Nat. 577(7792): 706-710 (2020) - [j24]Julian Schrittwieser, Ioannis Antonoglou, Thomas Hubert, Karen Simonyan, Laurent Sifre, Simon Schmitt, Arthur Guez, Edward Lockhart, Demis Hassabis, Thore Graepel, Timothy P. Lillicrap, David Silver:
Mastering Atari, Go, chess and shogi by planning with a learned model. Nat. 588(7839): 604-609 (2020) - [j23]André Barreto, Shaobo Hou, Diana Borsa, David Silver, Doina Precup:
Fast reinforcement learning with generalized policy updates. Proc. Natl. Acad. Sci. USA 117(48): 30079-30087 (2020) - [c76]Ian Osband, Yotam Doron, Matteo Hessel, John Aslanides, Eren Sezener, Andre Saraiva, Katrina McKinney, Tor Lattimore, Csaba Szepesvári, Satinder Singh, Benjamin Van Roy, Richard S. Sutton, David Silver, Hado van Hasselt:
Behaviour Suite for Reinforcement Learning. ICLR 2020 - [c75]Zeyu Zheng, Junhyuk Oh, Matteo Hessel, Zhongwen Xu, Manuel Kroiss, Hado van Hasselt, David Silver, Satinder Singh:
What Can Learned Intrinsic Rewards Capture? ICML 2020: 11436-11446 - [c74]Christopher Grimm, André Barreto, Satinder Singh, David Silver:
The Value Equivalence Principle for Model-Based Reinforcement Learning. NeurIPS 2020 - [c73]Arthur Guez, Fabio Viola, Theophane Weber, Lars Buesing, Steven Kapturowski, Doina Precup, David Silver, Nicolas Heess:
Value-driven Hindsight Modelling. NeurIPS 2020 - [c72]Junhyuk Oh, Matteo Hessel, Wojciech M. Czarnecki, Zhongwen Xu, Hado van Hasselt, Satinder Singh, David Silver:
Discovering Reinforcement Learning Algorithms. NeurIPS 2020 - [c71]Zhongwen Xu, Hado Philip van Hasselt, Matteo Hessel, Junhyuk Oh, Satinder Singh, David Silver:
Meta-Gradient Reinforcement Learning with an Objective Discovered Online. NeurIPS 2020 - [c70]Tom Zahavy, Zhongwen Xu, Vivek Veeriah, Matteo Hessel, Junhyuk Oh, Hado van Hasselt, David Silver, Satinder Singh:
A Self-Tuning Actor-Critic Algorithm. NeurIPS 2020 - [i49]Arthur Guez, Fabio Viola, Théophane Weber, Lars Buesing, Steven Kapturowski, Doina Precup, David Silver, Nicolas Heess:
Value-driven Hindsight Modelling. CoRR abs/2002.08329 (2020) - [i48]Tom Zahavy, Zhongwen Xu, Vivek Veeriah, Matteo Hessel, Junhyuk Oh, Hado van Hasselt, David Silver, Satinder Singh:
Self-Tuning Deep Reinforcement Learning. CoRR abs/2002.12928 (2020) - [i47]Will Dabney, André Barreto, Mark Rowland, Robert Dadashi, John Quan, Marc G. Bellemare, David Silver:
The Value-Improvement Path: Towards Better Representations for Reinforcement Learning. CoRR abs/2006.02243 (2020) - [i46]Hado van Hasselt, Sephora Madjiheurem, Matteo Hessel, David Silver, André Barreto, Diana Borsa:
Expected Eligibility Traces. CoRR abs/2007.01839 (2020) - [i45]Zhongwen Xu, Hado van Hasselt, Matteo Hessel, Junhyuk Oh, Satinder Singh, David Silver:
Meta-Gradient Reinforcement Learning with an Objective Discovered Online. CoRR abs/2007.08433 (2020) - [i44]Junhyuk Oh, Matteo Hessel, Wojciech M. Czarnecki, Zhongwen Xu, Hado van Hasselt, Satinder Singh, David Silver:
Discovering Reinforcement Learning Algorithms. CoRR abs/2007.08794 (2020) - [i43]Christopher Grimm, André Barreto, Satinder Singh, David Silver:
The Value Equivalence Principle for Model-Based Reinforcement Learning. CoRR abs/2011.03506 (2020)
2010 – 2019
- 2019
- [j22]Oriol Vinyals, Igor Babuschkin, Wojciech M. Czarnecki, Michaël Mathieu, Andrew Dudzik, Junyoung Chung, David H. Choi, Richard Powell, Timo Ewalds, Petko Georgiev, Junhyuk Oh, Dan Horgan, Manuel Kroiss, Ivo Danihelka, Aja Huang, Laurent Sifre, Trevor Cai, John P. Agapiou, Max Jaderberg, Alexander Sasha Vezhnevets, Rémi Leblond, Tobias Pohlen, Valentin Dalibard, David Budden, Yury Sulsky, James Molloy, Tom Le Paine, Çaglar Gülçehre, Ziyu Wang, Tobias Pfaff, Yuhuai Wu, Roman Ring, Dani Yogatama, Dario Wünsch, Katrina McKinney, Oliver Smith, Tom Schaul, Timothy P. Lillicrap, Koray Kavukcuoglu, Demis Hassabis, Chris Apps, David Silver:
Grandmaster level in StarCraft II using multi-agent reinforcement learning. Nat. 575(7782): 350-354 (2019) - [c69]Théophane Weber, Nicolas Heess, Lars Buesing, David Silver:
Credit Assignment Techniques in Stochastic Computation Graphs. AISTATS 2019: 2650-2660 - [c68]Diana Borsa, André Barreto, John Quan, Daniel J. Mankowitz, Hado van Hasselt, Rémi Munos, David Silver, Tom Schaul:
Universal Successor Features Approximators. ICLR (Poster) 2019 - [c67]Arthur Guez, Mehdi Mirza, Karol Gregor, Rishabh Kabra, Sébastien Racanière, Theophane Weber, David Raposo, Adam Santoro, Laurent Orseau, Tom Eccles, Greg Wayne, David Silver, Timothy P. Lillicrap:
An Investigation of Model-Free Planning. ICML 2019: 2464-2473 - [c66]Vivek Veeriah, Matteo Hessel, Zhongwen Xu, Janarthanan Rajendran, Richard L. Lewis, Junhyuk Oh, Hado van Hasselt, David Silver, Satinder Singh:
Discovery of Useful Questions as Auxiliary Tasks. NeurIPS 2019: 9306-9317 - [c65]André Barreto, Diana Borsa, Shaobo Hou, Gheorghe Comanici, Eser Aygün, Philippe Hamel, Daniel Toyama, Jonathan J. Hunt, Shibl Mourad, David Silver, Doina Precup:
The Option Keyboard: Combining Skills in Reinforcement Learning. NeurIPS 2019: 13031-13041 - [i42]Théophane Weber, Nicolas Heess, Lars Buesing, David Silver:
Credit Assignment Techniques in Stochastic Computation Graphs. CoRR abs/1901.01761 (2019) - [i41]Arthur Guez, Mehdi Mirza, Karol Gregor, Rishabh Kabra, Sébastien Racanière, Théophane Weber, David Raposo, Adam Santoro, Laurent Orseau, Tom Eccles, Greg Wayne, David Silver, Timothy P. Lillicrap:
An investigation of model-free planning. CoRR abs/1901.03559 (2019) - [i40]André Barreto, Diana Borsa, John Quan, Tom Schaul, David Silver, Matteo Hessel, Daniel J. Mankowitz, Augustin Zídek, Rémi Munos:
Transfer in Deep Reinforcement Learning Using Successor Features and Generalised Policy Improvement. CoRR abs/1901.10964 (2019) - [i39]Matteo Hessel, Hado van Hasselt, Joseph Modayil, David Silver:
On Inductive Biases in Deep Reinforcement Learning. CoRR abs/1907.02908 (2019) - [i38]Ian Osband, Yotam Doron, Matteo Hessel, John Aslanides, Eren Sezener, Andre Saraiva, Katrina McKinney, Tor Lattimore, Csaba Szepesvári, Satinder Singh, Benjamin Van Roy, Richard S. Sutton, David Silver, Hado van Hasselt:
Behaviour Suite for Reinforcement Learning. CoRR abs/1908.03568 (2019) - [i37]Vivek Veeriah, Matteo Hessel, Zhongwen Xu, Richard L. Lewis, Janarthanan Rajendran, Junhyuk Oh, Hado van Hasselt, David Silver, Satinder Singh:
Discovery of Useful Questions as Auxiliary Tasks. CoRR abs/1909.04607 (2019) - [i36]Julian Schrittwieser, Ioannis Antonoglou, Thomas Hubert, Karen Simonyan, Laurent Sifre, Simon Schmitt, Arthur Guez, Edward Lockhart, Demis Hassabis, Thore Graepel, Timothy P. Lillicrap, David Silver:
Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model. CoRR abs/1911.08265 (2019) - [i35]Zeyu Zheng, Junhyuk Oh, Matteo Hessel, Zhongwen Xu, Manuel Kroiss, Hado van Hasselt, David Silver, Satinder Singh:
What Can Learned Intrinsic Rewards Capture? CoRR abs/1912.05500 (2019)
- 2018
- [j21]Ron Sun, David Silver, Gerald Tesauro, Guang-Bin Huang:
Introduction to the special issue on deep reinforcement learning: An editorial. Neural Networks 107: 1-2 (2018) - [c64]Matteo Hessel, Joseph Modayil, Hado van Hasselt, Tom Schaul, Georg Ostrovski, Will Dabney, Dan Horgan, Bilal Piot, Mohammad Gheshlaghi Azar, David Silver:
Rainbow: Combining Improvements in Deep Reinforcement Learning. AAAI 2018: 3215-3222 - [c63]Dan Horgan, John Quan, David Budden, Gabriel Barth-Maron, Matteo Hessel, Hado van Hasselt, David Silver:
Distributed Prioritized Experience Replay. ICLR (Poster) 2018 - [c62]André Barreto, Diana Borsa, John Quan, Tom Schaul, David Silver, Matteo Hessel, Daniel J. Mankowitz, Augustin Zídek, Rémi Munos:
Transfer in Deep Reinforcement Learning Using Successor Features and Generalised Policy Improvement. ICML 2018: 510-519 - [c61]Will Dabney, Georg Ostrovski, David Silver, Rémi Munos:
Implicit Quantile Networks for Distributional Reinforcement Learning. ICML 2018: 1104-1113 - [c60]Arthur Guez, Theophane Weber, Ioannis Antonoglou, Karen Simonyan, Oriol Vinyals, Daan Wierstra, Rémi Munos, David Silver:
Learning to Search with MCTSnets. ICML 2018: 1817-1826 - [c59]Zhongwen Xu, Hado van Hasselt, David Silver:
Meta-Gradient Reinforcement Learning. NeurIPS 2018: 2402-2413 - [i34]Arthur Guez, Théophane Weber, Ioannis Antonoglou, Karen Simonyan, Oriol Vinyals, Daan Wierstra, Rémi Munos, David Silver:
Learning to Search with MCTSnets. CoRR abs/1802.04697 (2018) - [i33]Daniel J. Mankowitz, Augustin Zídek, André Barreto, Dan Horgan, Matteo Hessel, John Quan, Junhyuk Oh, Hado van Hasselt, David Silver, Tom Schaul:
Unicorn: Continual Learning with a Universal, Off-policy Agent. CoRR abs/1802.08294 (2018) - [i32]Dan Horgan, John Quan, David Budden, Gabriel Barth-Maron, Matteo Hessel, Hado van Hasselt, David Silver:
Distributed Prioritized Experience Replay. CoRR abs/1803.00933 (2018) - [i31]Greg Wayne, Chia-Chun Hung, David Amos, Mehdi Mirza, Arun Ahuja, Agnieszka Grabska-Barwinska, Jack W. Rae, Piotr Mirowski, Joel Z. Leibo, Adam Santoro, Mevlana Gemici, Malcolm Reynolds, Tim Harley, Josh Abramson, Shakir Mohamed, Danilo Jimenez Rezende, David Saxton, Adam Cain, Chloe Hillier, David Silver, Koray Kavukcuoglu, Matthew M. Botvinick, Demis Hassabis, Timothy P. Lillicrap:
Unsupervised Predictive Memory in a Goal-Directed Agent. CoRR abs/1803.10760 (2018) - [i30]Zhongwen Xu, Hado van Hasselt, David Silver:
Meta-Gradient Reinforcement Learning. CoRR abs/1805.09801 (2018) - [i29]Will Dabney, Georg Ostrovski, David Silver, Rémi Munos:
Implicit Quantile Networks for Distributional Reinforcement Learning. CoRR abs/1806.06923 (2018) - [i28]Max Jaderberg, Wojciech M. Czarnecki, Iain Dunning, Luke Marris, Guy Lever, Antonio García Castañeda, Charles Beattie, Neil C. Rabinowitz, Ari S. Morcos, Avraham Ruderman, Nicolas Sonnerat, Tim Green, Louise Deason, Joel Z. Leibo, David Silver, Demis Hassabis, Koray Kavukcuoglu, Thore Graepel:
Human-level performance in first-person multiplayer games with population-based deep reinforcement learning. CoRR abs/1807.01281 (2018) - [i27]Yutian Chen, Aja Huang, Ziyu Wang, Ioannis Antonoglou, Julian Schrittwieser, David Silver, Nando de Freitas:
Bayesian Optimization in AlphaGo. CoRR abs/1812.06855 (2018) - [i26]Diana Borsa, André Barreto, John Quan, Daniel J. Mankowitz, Rémi Munos, Hado van Hasselt, David Silver, Tom Schaul:
Universal Successor Features Approximators. CoRR abs/1812.07626 (2018)
- 2017
- [j20]David Silver:
Technical perspective: Solving imperfect information games. Commun. ACM 60(11): 80 (2017) - [j19]David Silver, Julian Schrittwieser, Karen Simonyan, Ioannis Antonoglou, Aja Huang, Arthur Guez, Thomas Hubert, Lucas Baker, Matthew Lai, Adrian Bolton, Yutian Chen, Timothy P. Lillicrap, Fan Hui, Laurent Sifre, George van den Driessche, Thore Graepel, Demis Hassabis:
Mastering the game of Go without human knowledge. Nat. 550(7676): 354-359 (2017) - [c58]Max Jaderberg, Volodymyr Mnih, Wojciech Marian Czarnecki, Tom Schaul, Joel Z. Leibo, David Silver, Koray Kavukcuoglu:
Reinforcement Learning with Unsupervised Auxiliary Tasks. ICLR 2017 - [c57]Max Jaderberg, Wojciech Marian Czarnecki, Simon Osindero, Oriol Vinyals, Alex Graves, David Silver, Koray Kavukcuoglu:
Decoupled Neural Interfaces using Synthetic Gradients. ICML 2017: 1627-1635 - [c56]David Silver, Hado van Hasselt, Matteo Hessel, Tom Schaul, Arthur Guez, Tim Harley, Gabriel Dulac-Arnold, David P. Reichert, Neil C. Rabinowitz, André Barreto, Thomas Degris:
The Predictron: End-To-End Learning and Planning. ICML 2017: 3191-3199 - [c55]Alexander Sasha Vezhnevets, Simon Osindero, Tom Schaul, Nicolas Heess, Max Jaderberg, David Silver, Koray Kavukcuoglu:
FeUdal Networks for Hierarchical Reinforcement Learning. ICML 2017: 3540-3549 - [c54]Zhongwen Xu, Joseph Modayil, Hado van Hasselt, André Barreto, David Silver, Tom Schaul:
Natural Value Approximators: Learning when to Trust Past Estimates. NIPS 2017: 2120-2128 - [c53]André Barreto, Will Dabney, Rémi Munos, Jonathan J. Hunt, Tom Schaul, David Silver, Hado van Hasselt:
Successor Features for Transfer in Reinforcement Learning. NIPS 2017: 4055-4065 - [c52]Marc Lanctot, Vinícius Flores Zambaldi, Audrunas Gruslys, Angeliki Lazaridou, Karl Tuyls, Julien Pérolat, David Silver, Thore Graepel:
A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning. NIPS 2017: 4190-4203 - [c51]Sébastien Racanière, Theophane Weber, David P. Reichert, Lars Buesing, Arthur Guez, Danilo Jimenez Rezende, Adrià Puigdomènech Badia, Oriol Vinyals, Nicolas Heess, Yujia Li, Razvan Pascanu, Peter W. Battaglia, Demis Hassabis, David Silver, Daan Wierstra:
Imagination-Augmented Agents for Deep Reinforcement Learning. NIPS 2017: 5690-5701 - [i25]Alexander Sasha Vezhnevets, Simon Osindero, Tom Schaul, Nicolas Heess, Max Jaderberg, David Silver, Koray Kavukcuoglu:
FeUdal Networks for Hierarchical Reinforcement Learning. CoRR abs/1703.01161 (2017) - [i24]Nicolas Heess, Dhruva TB, Srinivasan Sriram, Jay Lemmon, Josh Merel, Greg Wayne, Yuval Tassa, Tom Erez, Ziyu Wang, S. M. Ali Eslami, Martin A. Riedmiller, David Silver:
Emergence of Locomotion Behaviours in Rich Environments. CoRR abs/1707.02286 (2017) - [i23]Theophane Weber, Sébastien Racanière, David P. Reichert, Lars Buesing, Arthur Guez, Danilo Jimenez Rezende, Adrià Puigdomènech Badia, Oriol Vinyals, Nicolas Heess, Yujia Li, Razvan Pascanu, Peter W. Battaglia, David Silver, Daan Wierstra:
Imagination-Augmented Agents for Deep Reinforcement Learning. CoRR abs/1707.06203 (2017) - [i22]Oriol Vinyals, Timo Ewalds, Sergey Bartunov, Petko Georgiev, Alexander Sasha Vezhnevets, Michelle Yeo, Alireza Makhzani, Heinrich Küttler, John P. Agapiou, Julian Schrittwieser, John Quan, Stephen Gaffney, Stig Petersen, Karen Simonyan, Tom Schaul, Hado van Hasselt, David Silver, Timothy P. Lillicrap, Kevin Calderone, Paul Keet, Anthony Brunasso, David Lawrence, Anders Ekermo, Jacob Repp, Rodney Tsing:
StarCraft II: A New Challenge for Reinforcement Learning. CoRR abs/1708.04782 (2017) - [i21]Matteo Hessel, Joseph Modayil, Hado van Hasselt, Tom Schaul, Georg Ostrovski, Will Dabney, Daniel Horgan, Bilal Piot, Mohammad Gheshlaghi Azar, David Silver:
Rainbow: Combining Improvements in Deep Reinforcement Learning. CoRR abs/1710.02298 (2017) - [i20]Marc Lanctot, Vinícius Flores Zambaldi, Audrunas Gruslys, Angeliki Lazaridou, Karl Tuyls, Julien Pérolat, David Silver, Thore Graepel:
A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning. CoRR abs/1711.00832 (2017) - [i19]David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy P. Lillicrap, Karen Simonyan, Demis Hassabis:
Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm. CoRR abs/1712.01815 (2017)
- 2016
- [j18]David Silver, Aja Huang, Chris J. Maddison, Arthur Guez, Laurent Sifre, George van den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Vedavyas Panneershelvam, Marc Lanctot, Sander Dieleman, Dominik Grewe, John Nham, Nal Kalchbrenner, Ilya Sutskever, Timothy P. Lillicrap, Madeleine Leach, Koray Kavukcuoglu, Thore Graepel, Demis Hassabis:
Mastering the game of Go with deep neural networks and tree search. Nat. 529(7587): 484-489 (2016) - [c50]Hado van Hasselt, Arthur Guez, David Silver:
Deep Reinforcement Learning with Double Q-Learning. AAAI 2016: 2094-2100 - [c49]Volodymyr Mnih, Adrià Puigdomènech Badia, Mehdi Mirza, Alex Graves, Timothy P. Lillicrap, Tim Harley, David Silver, Koray Kavukcuoglu:
Asynchronous Methods for Deep Reinforcement Learning. ICML 2016: 1928-1937 - [c48]Hado van Hasselt, Arthur Guez, Matteo Hessel, Volodymyr Mnih, David Silver:
Learning values across many orders of magnitude. NIPS 2016: 4287-4295 - [c47]Timothy P. Lillicrap, Jonathan J. Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, Daan Wierstra:
Continuous control with deep reinforcement learning. ICLR (Poster) 2016 - [c46]Tom Schaul, John Quan, Ioannis Antonoglou, David Silver:
Prioritized Experience Replay. ICLR (Poster) 2016 - [i18]Volodymyr Mnih, Adrià Puigdomènech Badia, Mehdi Mirza, Alex Graves, Timothy P. Lillicrap, Tim Harley, David Silver, Koray Kavukcuoglu:
Asynchronous Methods for Deep Reinforcement Learning. CoRR abs/1602.01783 (2016) - [i17]Hado van Hasselt, Arthur Guez, Matteo Hessel, David Silver:
Learning functions across many orders of magnitudes. CoRR abs/1602.07714 (2016) - [i16]Johannes Heinrich, David Silver:
Deep Reinforcement Learning from Self-Play in Imperfect-Information Games. CoRR abs/1603.01121 (2016) - [i15]André Barreto, Rémi Munos, Tom Schaul, David Silver:
Successor Features for Transfer in Reinforcement Learning. CoRR abs/1606.05312 (2016) - [i14]Nicolas Heess, Gregory Wayne, Yuval Tassa, Timothy P. Lillicrap, Martin A. Riedmiller, David Silver:
Learning and Transfer of Modulated Locomotor Controllers. CoRR abs/1610.05182 (2016) - [i13]Max Jaderberg, Volodymyr Mnih, Wojciech Marian Czarnecki, Tom Schaul, Joel Z. Leibo, David Silver, Koray Kavukcuoglu:
Reinforcement Learning with Unsupervised Auxiliary Tasks. CoRR abs/1611.05397 (2016) - [i12]David Silver, Hado van Hasselt, Matteo Hessel, Tom Schaul, Arthur Guez, Tim Harley, Gabriel Dulac-Arnold, David P. Reichert, Neil C. Rabinowitz, André Barreto, Thomas Degris:
The Predictron: End-To-End Learning and Planning. CoRR abs/1612.08810 (2016)
- 2015
- [j17]Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin A. Riedmiller, Andreas Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, Demis Hassabis:
Human-level control through deep reinforcement learning. Nat. 518(7540): 529-533 (2015) - [c45]John Vines, Peter C. Wright, David Silver, Maggie Winchcombe, Patrick Olivier:
Authenticity, Relatability and Collaborative Approaches to Sharing Knowledge about Assistive Living Technology. CSCW 2015: 82-94 - [c44]Johannes Heinrich, Marc Lanctot, David Silver:
Fictitious Self-Play in Extensive-Form Games. ICML 2015: 805-813 - [c43]