David Silver

ORCID: 0000-0002-5197-2892

Affiliations:
  • Google DeepMind, London, UK
  • University College London, UK
  • University of Alberta, Edmonton, Canada (PhD 2009)


According to our database, David Silver authored at least 112 papers between 2005 and 2025.

Bibliography

2025
DataRater: Meta-Learned Dataset Curation.
CoRR, May, 2025

2023
Faster sorting algorithms discovered using deep reinforcement learning.
Nat., 2023

Gemini: A Family of Highly Capable Multimodal Models.
CoRR, 2023

2022
Deep learning, reinforcement learning, and world models.
Neural Networks, 2022

Discovering faster matrix multiplication algorithms with reinforcement learning.
Nat., 2022

Mastering the Game of Stratego with Model-Free Multiagent Reinforcement Learning.
CoRR, 2022

Learning by Directional Gradient Descent.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Bootstrapped Meta-Learning.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Policy improvement by planning with Gumbel.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Planning in Stochastic Environments with a Learned Model.
Proceedings of the Tenth International Conference on Learning Representations, 2022

2021
Reward is enough.
Artif. Intell., 2021

Discovery of Options via Meta-Learned Subgoals.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Online and Offline Reinforcement Learning by Planning with a Learned Model.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Proper Value Equivalence.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Self-Consistent Models and Values.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Learning and Planning in Complex Action Spaces.
Proceedings of the 38th International Conference on Machine Learning, 2021

Muesli: Combining Improvements in Policy Optimization.
Proceedings of the 38th International Conference on Machine Learning, 2021

Expected Eligibility Traces.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

The Value-Improvement Path: Towards Better Representations for Reinforcement Learning.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Fast reinforcement learning with generalized policy updates.
Proc. Natl. Acad. Sci. USA, 2020

Improved protein structure prediction using potentials from deep learning.
Nat., 2020

Mastering Atari, Go, chess and shogi by planning with a learned model.
Nat., 2020

Self-Tuning Deep Reinforcement Learning.
CoRR, 2020

A Self-Tuning Actor-Critic Algorithm.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Meta-Gradient Reinforcement Learning with an Objective Discovered Online.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Discovering Reinforcement Learning Algorithms.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Value-driven Hindsight Modelling.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

The Value Equivalence Principle for Model-Based Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

What Can Learned Intrinsic Rewards Capture?
Proceedings of the 37th International Conference on Machine Learning, 2020

Behaviour Suite for Reinforcement Learning.
Proceedings of the 8th International Conference on Learning Representations, 2020

2019
Grandmaster level in StarCraft II using multi-agent reinforcement learning.
Nat., 2019

On Inductive Biases in Deep Reinforcement Learning.
CoRR, 2019

Discovery of Useful Questions as Auxiliary Tasks.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

The Option Keyboard: Combining Skills in Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

An Investigation of Model-Free Planning.
Proceedings of the 36th International Conference on Machine Learning, 2019

Universal Successor Features Approximators.
Proceedings of the 7th International Conference on Learning Representations, 2019

Credit Assignment Techniques in Stochastic Computation Graphs.
Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

2018
Introduction to the special issue on deep reinforcement learning: An editorial.
Neural Networks, 2018

Bayesian Optimization in AlphaGo.
CoRR, 2018

Human-level performance in first-person multiplayer games with population-based deep reinforcement learning.
CoRR, 2018

Unsupervised Predictive Memory in a Goal-Directed Agent.
CoRR, 2018

Unicorn: Continual Learning with a Universal, Off-policy Agent.
CoRR, 2018

Meta-Gradient Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Learning to Search with MCTSnets.
Proceedings of the 35th International Conference on Machine Learning, 2018

Implicit Quantile Networks for Distributional Reinforcement Learning.
Proceedings of the 35th International Conference on Machine Learning, 2018

Transfer in Deep Reinforcement Learning Using Successor Features and Generalised Policy Improvement.
Proceedings of the 35th International Conference on Machine Learning, 2018

Distributed Prioritized Experience Replay.
Proceedings of the 6th International Conference on Learning Representations, 2018

Rainbow: Combining Improvements in Deep Reinforcement Learning.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
Mastering the game of Go without human knowledge.
Nat., 2017

Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm.
CoRR, 2017

StarCraft II: A New Challenge for Reinforcement Learning.
CoRR, 2017

Imagination-Augmented Agents for Deep Reinforcement Learning.
CoRR, 2017

Emergence of Locomotion Behaviours in Rich Environments.
CoRR, 2017

Technical perspective: Solving imperfect information games.
Commun. ACM, 2017

Natural Value Approximators: Learning when to Trust Past Estimates.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Imagination-Augmented Agents for Deep Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Successor Features for Transfer in Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

FeUdal Networks for Hierarchical Reinforcement Learning.
Proceedings of the 34th International Conference on Machine Learning, 2017

The Predictron: End-To-End Learning and Planning.
Proceedings of the 34th International Conference on Machine Learning, 2017

Decoupled Neural Interfaces using Synthetic Gradients.
Proceedings of the 34th International Conference on Machine Learning, 2017

Reinforcement Learning with Unsupervised Auxiliary Tasks.
Proceedings of the 5th International Conference on Learning Representations, 2017

2016
Mastering the game of Go with deep neural networks and tree search.
Nat., 2016

Prioritized Experience Replay.
Proceedings of the 4th International Conference on Learning Representations, 2016

Continuous control with deep reinforcement learning.
Proceedings of the 4th International Conference on Learning Representations, 2016

Deep Reinforcement Learning from Self-Play in Imperfect-Information Games.
CoRR, 2016

Learning and Transfer of Modulated Locomotor Controllers.
CoRR, 2016

Learning functions across many orders of magnitudes.
CoRR, 2016

Successor Features for Transfer in Reinforcement Learning.
CoRR, 2016

Learning values across many orders of magnitude.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Asynchronous Methods for Deep Reinforcement Learning.
Proceedings of the 33rd International Conference on Machine Learning, 2016

Deep Reinforcement Learning with Double Q-Learning.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

2015
Human-level control through deep reinforcement learning.
Nat., 2015

Massively Parallel Methods for Deep Reinforcement Learning.
CoRR, 2015

Move Evaluation in Go Using Deep Convolutional Neural Networks.
Proceedings of the 3rd International Conference on Learning Representations, 2015

Memory-based control with recurrent neural networks.
CoRR, 2015

Value Iteration with Options and State Aggregation.
CoRR, 2015

Learning Continuous Control Policies by Stochastic Value Gradients.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Smooth UCT Search in Computer Poker.
Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, 2015

Universal Value Function Approximators.
Proceedings of the 32nd International Conference on Machine Learning, 2015

Fictitious Self-Play in Extensive-Form Games.
Proceedings of the 32nd International Conference on Machine Learning, 2015

2014
Unit Tests for Stochastic Optimization.
Proceedings of the 2nd International Conference on Learning Representations, 2014

Better Optimism By Bayes: Adaptive Planning with Rich Models.
CoRR, 2014

Bayes-Adaptive Simulation-based Search with Value Function Approximation.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Deterministic Policy Gradient Algorithms.
Proceedings of the 31st International Conference on Machine Learning, 2014

2013
Scalable and Efficient Bayes-Adaptive Reinforcement Learning Based on Monte-Carlo Tree Search.
J. Artif. Intell. Res., 2013

Playing Atari with Deep Reinforcement Learning.
CoRR, 2013

Concurrent Reinforcement Learning from Customer Interactions.
Proceedings of the 30th International Conference on Machine Learning, 2013

2012
Temporal-difference search in computer Go.
Mach. Learn., 2012

Learning to Win by Reading Manuals in a Monte-Carlo Framework.
J. Artif. Intell. Res., 2012

The grand challenge of computer Go: Monte Carlo tree search and extensions.
Commun. ACM, 2012

Efficient Bayes-Adaptive Reinforcement Learning using Sample-Based Search.
Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012, 2012

Compositional Planning Using Optimal Option Models.
Proceedings of the 29th International Conference on Machine Learning, 2012

Gradient Temporal Difference Networks.
Proceedings of the Tenth European Workshop on Reinforcement Learning, 2012

Actor-Critic Reinforcement Learning with Energy-Based Policies.
Proceedings of the Tenth European Workshop on Reinforcement Learning, 2012

2011
A Monte-Carlo AIXI Approximation.
J. Artif. Intell. Res., 2011

Monte-Carlo tree search and rapid action value estimation in computer Go.
Artif. Intell., 2011

Non-Linear Monte-Carlo Search in Civilization II.
Proceedings of the IJCAI 2011, 2011

2010
Monte-Carlo Planning in Large POMDPs.
Proceedings of the Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010. Proceedings of a meeting held 6-9 December 2010, 2010

Reinforcement Learning via AIXI Approximation.
Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence, 2010

2009
A Monte Carlo AIXI Approximation.
CoRR, 2009

Bootstrapping from Game Tree Search.
Proceedings of the Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009, 2009

Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation.
Proceedings of the Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009, 2009

Fast gradient-descent methods for temporal-difference learning with linear function approximation.
Proceedings of the 26th Annual International Conference on Machine Learning, 2009

Monte-Carlo simulation balancing.
Proceedings of the 26th Annual International Conference on Machine Learning, 2009

2008
Sample-based learning and search with permanent and transient memories.
Proceedings of the 25th International Conference on Machine Learning, 2008

Achieving Master Level Play in 9 x 9 Computer Go.
Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence, 2008

2007
Reinforcement Learning of Local Shape in the Game of Go.
Proceedings of the IJCAI 2007, 2007

On the role of tracking in stationary environments.
Proceedings of the 24th International Conference on Machine Learning, 2007

Combining online and offline knowledge in UCT.
Proceedings of the 24th International Conference on Machine Learning, 2007

2005
Cooperative Pathfinding.
Proceedings of the First Artificial Intelligence and Interactive Digital Entertainment Conference, 2005

