Ian Osband

According to our database¹, Ian Osband authored at least 45 papers between 2013 and 2026.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2026

Delightful Exploration.

CoRR, May, 2026

Delightful Gradients Accelerate Corner Escape.

,

CoRR, May, 2026

Does This Gradient Spark Joy?

CoRR, March, 2026

Delightful Distributed Policy Gradient.

CoRR, March, 2026

Delightful Policy Gradient.

CoRR, March, 2026

2023

Ensembles for Uncertainty Estimation: Benefits of Prior Functions and Bootstrapping.

Vikranth Dwaracherla

,

,

,

,

Seyed Mohammad Asghari

,

Benjamin Van Roy

Trans. Mach. Learn. Res., 2023

Reinforcement Learning, Bit by Bit.

,

Benjamin Van Roy

,

Vikranth Dwaracherla

,

Morteza Ibrahimi

,

,

Found. Trends Mach. Learn., 2023

Approximate Thompson Sampling via Epistemic Neural Networks.

,

,

Seyed Mohammad Asghari

,

Vikranth Dwaracherla

,

Morteza Ibrahimi

,

,

Benjamin Van Roy

Proceedings of the Uncertainty in Artificial Intelligence, 2023

Epistemic Neural Networks.

,

,

Seyed Mohammad Asghari

,

Vikranth Dwaracherla

,

Morteza Ibrahimi

,

,

Benjamin Van Roy

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

2022

Fine-Tuning Language Models via Epistemic Neural Networks.

,

Seyed Mohammad Asghari

,

Benjamin Van Roy

,

,

,

Geoffrey Irving

CoRR, 2022

Robustness of Epinets against Distributional Shifts.

,

,

Seyed Mohammad Asghari

,

,

Vikranth Dwaracherla

,

,

Benjamin Van Roy

CoRR, 2022

Evaluating high-order predictive distributions in deep learning.

,

,

Seyed Mohammad Asghari

,

Vikranth Dwaracherla

,

,

Benjamin Van Roy

Proceedings of the Uncertainty in Artificial Intelligence, 2022

The Neural Testbed: Evaluating Joint Predictions.

,

,

Seyed Mohammad Asghari

,

Vikranth Dwaracherla

,

,

Morteza Ibrahimi

,

Dieterich Lawson

,

,

Brendan O'Donoghue

,

Benjamin Van Roy

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

2021

Evaluating Predictive Distributions: Does Bayesian Deep Learning Work?

,

,

Seyed Mohammad Asghari

,

Vikranth Dwaracherla

,

,

Morteza Ibrahimi

,

Dieterich Lawson

,

,

Brendan O'Donoghue

,

Benjamin Van Roy

CoRR, 2021

Evaluating Probabilistic Inference in Deep Learning: Beyond Marginal Predictions.

,

,

Benjamin Van Roy

,

CoRR, 2021

Epistemic Neural Networks.

,

,

Mohammad Asghari

,

Morteza Ibrahimi

,

,

Benjamin Van Roy

CoRR, 2021

Matrix games with bandit feedback.

Brendan O'Donoghue

,

,

Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence, 2021

2020

Stochastic matrix games with bandit feedback.

Brendan O'Donoghue

,

,

CoRR, 2020

Behaviour Suite for Reinforcement Learning.

,

,

,

,

,

,

Katrina McKinney

,

,

Csaba Szepesvári

,

,

Benjamin Van Roy

,

Richard S. Sutton

,

,

Hado van Hasselt

Proceedings of the 8th International Conference on Learning Representations, 2020

Making Sense of Reinforcement Learning and Probabilistic Inference.

Brendan O'Donoghue

,

,

Catalin Ionescu

Proceedings of the 8th International Conference on Learning Representations, 2020

Hypermodels for Exploration.

Vikranth Dwaracherla

,

,

Morteza Ibrahimi

,

,

,

Benjamin Van Roy

Proceedings of the 8th International Conference on Learning Representations, 2020

2019

Deep Exploration via Randomized Value Functions.

,

Benjamin Van Roy

,

Daniel J. Russo

,

J. Mach. Learn. Res., 2019

Meta-learning of Sequential Strategies.

CoRR, 2019

2018

A Tutorial on Thompson Sampling.

,

Benjamin Van Roy

,

Abbas Kazerouni

,

,

Found. Trends Mach. Learn., 2018

Randomized Prior Functions for Deep Reinforcement Learning.

,

,

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Scalable Coordinated Exploration in Concurrent Reinforcement Learning.

Maria Dimakopoulou

,

,

Benjamin Van Roy

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

The Uncertainty Bellman Equation and Exploration.

Brendan O'Donoghue

,

,

,

Proceedings of the 35th International Conference on Machine Learning, 2018

Noisy Networks For Exploration.

Meire Fortunato

,

Mohammad Gheshlaghi Azar

,

,

,

,

,

,

,

,

,

Olivier Pietquin

,

Charles Blundell

,

Proceedings of the 6th International Conference on Learning Representations, 2018

Deep Q-learning From Demonstrations.

,

,

Olivier Pietquin

,

,

,

,

,

,

Andrew Sendonaris

,

,

Gabriel Dulac-Arnold

,

John P. Agapiou

,

,

Audrunas Gruslys

Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017

On Optimistic versus Randomized Exploration in Reinforcement Learning.

,

Benjamin Van Roy

CoRR, 2017

Gaussian-Dirichlet Posterior Dominance in Sequential Learning.

,

Benjamin Van Roy

CoRR, 2017

Learning from Demonstrations for Real World Reinforcement Learning.

,

,

Olivier Pietquin

,

,

,

,

Andrew Sendonaris

,

Gabriel Dulac-Arnold

,

,

John P. Agapiou

,

,

Audrunas Gruslys

CoRR, 2017

Noisy Networks for Exploration.

Meire Fortunato

,

Mohammad Gheshlaghi Azar

,

,

,

,

,

,

,

,

Olivier Pietquin

,

Charles Blundell

,

CoRR, 2017

A Tutorial on Thompson Sampling.

,

Benjamin Van Roy

,

Abbas Kazerouni

,

CoRR, 2017

Why is Posterior Sampling Better than Optimism for Reinforcement Learning?

,

Benjamin Van Roy

Proceedings of the 34th International Conference on Machine Learning, 2017

Minimax Regret Bounds for Reinforcement Learning.

Mohammad Gheshlaghi Azar

,

,

Proceedings of the 34th International Conference on Machine Learning, 2017

2016

On Lower Bounds for Regret in Reinforcement Learning.

,

Benjamin Van Roy

CoRR, 2016

Posterior Sampling for Reinforcement Learning Without Episodes.

,

Benjamin Van Roy

CoRR, 2016

Deep Exploration via Bootstrapped DQN.

,

Charles Blundell

,

Alexander Pritzel

,

Benjamin Van Roy

Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Generalization and Exploration via Randomized Value Functions.

,

Benjamin Van Roy

,

Proceedings of the 33rd International Conference on Machine Learning, 2016

2015

Bootstrapped Thompson Sampling and Deep Exploration.

,

Benjamin Van Roy

CoRR, 2015

2014

Near-optimal Regret Bounds for Reinforcement Learning in Factored MDPs.

,

Benjamin Van Roy

CoRR, 2014

Model-based Reinforcement Learning and the Eluder Dimension.

,

Benjamin Van Roy

Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Near-optimal Reinforcement Learning in Factored MDPs.

,

Benjamin Van Roy

Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

2013

(More) Efficient Reinforcement Learning via Posterior Sampling.

,

,

Benjamin Van Roy

Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013
