Arthur Guez

According to our database, Arthur Guez authored at least 37 papers between 2008 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2025

A Unifying Framework for Action-Conditional Self-Predictive Reinforcement Learning.

Khimya Khetarpal, Zhaohan Daniel Guo, Bernardo Ávila Pires, […]

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2025

2023

Optimism and Adaptivity in Policy Optimization.

[…], Sebastian Flennerhag

CoRR, 2023

2022

Retrieval-Augmented Reinforcement Learning.

[…], Abram L. Friesen, […], Theophane Weber, Nan Rosemary Ke, Adrià Puigdomènech Badia, […], Ksenia Konyushkova, […], Timothy P. Lillicrap, […], Charles Blundell

CoRR, 2022

Large-Scale Retrieval for Reinforcement Learning.

Peter Conway Humphreys, […], Olivier Tieleman, […], Theophane Weber, Timothy P. Lillicrap

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Retrieval-Augmented Reinforcement Learning.

[…], Abram L. Friesen, […], Theophane Weber, Nan Rosemary Ke, Adrià Puigdomènech Badia, […], Peter Conway Humphreys, Ksenia Konyushkova, […], Timothy P. Lillicrap, […], Charles Blundell

Proceedings of the International Conference on Machine Learning, 2022

Policy improvement by planning with Gumbel.

[…], Julian Schrittwieser, […]

Proceedings of the Tenth International Conference on Learning Representations, 2022

COptiDICE: Offline Constrained Reinforcement Learning via Stationary Distribution Correction Estimation.

[…], Cosmin Paduraru, Daniel J. Mankowitz, […]

Proceedings of the Tenth International Conference on Learning Representations, 2022

2021

Counterfactual Credit Assignment in Model-Free Reinforcement Learning.

[…], Theophane Weber, […], Shantanu Thakoor, […], Anna Harutyunyan, […], Thomas S. Stepleton, […]

Proceedings of the 38th International Conference on Machine Learning, 2021

Muesli: Combining Improvements in Policy Optimization.

[…], Theophane Weber, […], Hado van Hasselt

Proceedings of the 38th International Conference on Machine Learning, 2021

On the role of planning in model-based deep reinforcement learning.

Jessica B. Hamrick, Abram L. Friesen, Feryal M. P. Behbahani, […], Sims Witherspoon, […], Lars Holger Buesing, Petar Velickovic, Theophane Weber

Proceedings of the 9th International Conference on Learning Representations, 2021

2020

Mastering Atari, Go, chess and shogi by planning with a learned model.

Julian Schrittwieser, Ioannis Antonoglou, […], Edward Lockhart, […], Timothy P. Lillicrap, […]

Nat., 2020

Counterfactual Credit Assignment in Model-Free Reinforcement Learning.

[…], Théophane Weber, […], Shantanu Thakoor, […], Anna Harutyunyan, […]

CoRR, 2020

Beyond Tabula-Rasa: a Modular Reinforcement Learning Approach for Physically Embedded 3D Sokoban.

[…], Timothy P. Lillicrap, […], Theophane Weber

CoRR, 2020

Physically Embedded Planning Problems: New Challenges for Reinforcement Learning.

[…], Jonathan J. Hunt, […], Saran Tunyasuvunakool, Alistair Muldal, Théophane Weber, […], Sébastien Racanière, […], Timothy P. Lillicrap, […]

CoRR, 2020

Value-driven Hindsight Modelling.

[…], Theophane Weber, […], Steven Kapturowski, […]

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

2019

Augmenting learning using symmetry in a biologically-inspired domain.

[…], Abbas Abdolmaleki, […]

CoRR, 2019

An Investigation of Model-Free Planning.

[…], Sébastien Racanière, Theophane Weber, […], Timothy P. Lillicrap

Proceedings of the 36th International Conference on Machine Learning, 2019

Woulda, Coulda, Shoulda: Counterfactually-Guided Policy Search.

[…], Theophane Weber, […], Sébastien Racanière, […], Jean-Baptiste Lespiau

Proceedings of the 7th International Conference on Learning Representations, 2019

2018

Learning to Search with MCTSnets.

[…], Theophane Weber, Ioannis Antonoglou, […]

Proceedings of the 35th International Conference on Machine Learning, 2018

Adaptive planning in human search.

[…], Maarten Speekenbrink

Proceedings of the 40th Annual Meeting of the Cognitive Science Society, 2018

2017

Mastering the game of Go without human knowledge.

[…], Julian Schrittwieser, […], Ioannis Antonoglou, […], Timothy P. Lillicrap, […], George van den Driessche, […]

Nat., 2017

Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm.

[…], Julian Schrittwieser, Ioannis Antonoglou, […], Dharshan Kumaran, […], Timothy P. Lillicrap, […]

CoRR, 2017

Imagination-Augmented Agents for Deep Reinforcement Learning.

Theophane Weber, Sébastien Racanière, David P. Reichert, […], Danilo Jimenez Rezende, Adrià Puigdomènech Badia, […], Peter W. Battaglia, […]

CoRR, 2017

Imagination-Augmented Agents for Deep Reinforcement Learning.

Sébastien Racanière, Theophane Weber, David P. Reichert, […], Danilo Jimenez Rezende, Adrià Puigdomènech Badia, […], Peter W. Battaglia, […]

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

The Predictron: End-To-End Learning and Planning.

[…], Hado van Hasselt, […], Gabriel Dulac-Arnold, David P. Reichert, Neil C. Rabinowitz, […]

Proceedings of the 34th International Conference on Machine Learning, 2017

2016

Mastering the game of Go with deep neural networks and tree search.

Nat., 2016

Learning functions across many orders of magnitudes.

Hado van Hasselt, […]

CoRR, 2016

Learning values across many orders of magnitude.

Hado van Hasselt, […]

Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Deep Reinforcement Learning with Double Q-Learning.

Hado van Hasselt, […]

Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

Increasing the Action Gap: New Operators for Reinforcement Learning.

Marc G. Bellemare, Georg Ostrovski, […], Philip S. Thomas, […]

Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

2014

Better Optimism By Bayes: Adaptive Planning with Rich Models.

CoRR, 2014

Bayes-Adaptive Simulation-based Search with Value Function Approximation.

Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

2013

Scalable and Efficient Bayes-Adaptive Reinforcement Learning Based on Monte-Carlo Tree Search.

J. Artif. Intell. Res., 2013

2012

Efficient Bayes-Adaptive Reinforcement Learning using Sample-Based Search.

Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012

2010

Multi-tasking SLAM.

Proceedings of the IEEE International Conference on Robotics and Automation, 2010

2009

Treating Epilepsy via Adaptive Neurostimulation: a Reinforcement Learning Approach.

[…], Robert D. Vincent, Gabriella Panuccio, […]

Int. J. Neural Syst., 2009

2008

Adaptive Treatment of Epilepsy via Batch-mode Reinforcement Learning.

[…], Robert D. Vincent, […]

Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence, 2008
