Adam White

CoRR, May, 2026

Gradient Iterated Temporal-Difference Learning.

CoRR, March, 2026

2025

Deep Reinforcement Learning with Gradient Eligibility Traces.

CoRR, July, 2025

Fine-Tuning without Performance Degradation.

Han Wang

CoRR, May, 2025

Position: Lifetime tuning is incompatible with continual reinforcement learning.

Golnaz Mesbahi

Proceedings of the Forty-second International Conference on Machine Learning, 2025

2024

GVFs in the real world: making predictions online for water treatment.

Muhammad Kamran Janjua

Mach. Learn., July, 2024

AGaLiTe: Approximate Gated Linear Transformers for Online Reinforcement Learning.

Trans. Mach. Learn. Res., 2024

Empirical Design in Reinforcement Learning.

J. Mach. Learn. Res., 2024

Goal-Space Planning with Subgoal Models.

Chunlok Lo

Kevin Roice

J. Mach. Learn. Res., 2024

The Cross-environment Hyperparameter Setting Benchmark for Reinforcement Learning.

CoRR, 2024

A New View on Planning in Online Reinforcement Learning.

Kevin Roice

Scott M. Jordan

CoRR, 2024

Tuning for the Unknown: Revisiting Evaluation Strategies for Lifelong RL.

Golnaz Mesbahi

Olya Mastikhina

CoRR, 2024

Investigating the properties of neural network representations in reinforcement learning.

Artif. Intell., 2024

Cross-environment Hyperparameter Tuning for Reinforcement Learning.

RLJ, 2024

Investigating the Interplay of Prioritized Replay and Generalization.

Andrew Patterson

RLJ, 2024

Harnessing Discrete Representations for Continual Reinforcement Learning.

Edan Meyer

Marlos C. Machado

RLJ, 2024

The Cliff of Overcommitment with Policy Gradient Step Sizes.

RLJ, 2024

Real-Time Recurrent Learning using Trace Units in Reinforcement Learning.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

A Method for Evaluating Hyperparameter Sensitivity in Reinforcement Learning.

Jacob Adkins

Michael Bowling

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Position: Benchmarking is Limited in Reinforcement Learning Research.

Scott M. Jordan

Bruno Castro da Silva

Philip S. Thomas

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Reward-Respecting Subtasks for Model-Based Reinforcement Learning (Abstract Reprint).

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

Reward-respecting subtasks for model-based reinforcement learning.

Artif. Intell., November, 2023

From eye-blinks to state construction: Diagnostic benchmarks for online representation learning.

Adapt. Behav., February, 2023

Agent-State Construction with Auxiliary Inputs.

Ruo Yu Tao

Marlos C. Machado

Trans. Mach. Learn. Res., 2023

Investigating Action Encodings in Recurrent Neural Networks in Reinforcement Learning.

Trans. Mach. Learn. Res., 2023

Recurrent Linear Transformers.

CoRR, 2023

The In-Sample Softmax for Offline Reinforcement Learning.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Greedy Actor-Critic: A New Conditional Cross-Entropy Method for Policy Improvement.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Auxiliary task discovery through generate-and-test.

Proceedings of the Conference on Lifelong Learning Agents, 2023

Measuring and Mitigating Interference in Reinforcement Learning.

Proceedings of the Conference on Lifelong Learning Agents, 2023

Loss of Plasticity in Continual Deep Reinforcement Learning.

Proceedings of the Conference on Lifelong Learning Agents, 2023

Entropy as a Measure of Puzzle Difficulty.

Eugene You Chen Chen

Nathan R. Sturtevant

Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, 2023

2022

No More Pesky Hyperparameters: Offline Hyperparameter Tuning for RL.

Trans. Mach. Learn. Res., 2022

A Generalized Projected Bellman Error for Off-policy Value Estimation in Reinforcement Learning.

Andrew Patterson

J. Mach. Learn. Res., 2022

Goal-Space Planning with Subgoal Models.

CoRR, 2022

What makes useful auxiliary tasks in reinforcement learning: investigating the effect of the target policy.

CoRR, 2022

The Frost Hollow Experiments: Pavlovian Signalling as a Path to Coordination and Communication Between Agents.

Patrick M. Pilarski

Andrew Butcher

Elnaz Davoodi

Michael Bradley Johanson

CoRR, 2022

Pavlovian Signalling with General Value Functions in Agent-Agent Temporal Decision Making.

Andrew Butcher

Michael Bradley Johanson

CoRR, 2022

Learning Expected Emphatic Traces for Deep RL.

Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021

General Value Function Networks.

J. Artif. Intell. Res., 2021

Assessing Human Interaction in Virtual Reality With Continually Learning Prediction Agents Based on Reinforcement Learning Algorithms: A Pilot Study.

Dylan J. A. Brenneis

Adam S. R. Parker

Michael Bradley Johanson

CoRR, 2021

A Generalized Projected Bellman Error for Off-policy Value Estimation in Reinforcement Learning.

CoRR, 2021

Continual Auxiliary Task Learning.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Emphatic Algorithms for Deep Reinforcement Learning.

Proceedings of the 38th International Conference on Machine Learning, 2021

2020

Adapting Behavior via Intrinsic Reward: A Survey and Empirical Study.

J. Artif. Intell. Res., 2020

Towards a practical measure of interference for reinforcement learning.

CoRR, 2020

Gradient Temporal-Difference Learning with Regularized Corrections.

Proceedings of the 37th International Conference on Machine Learning, 2020

Training Recurrent Neural Networks Online by Learning Explicit State Variables.

Proceedings of the 8th International Conference on Learning Representations, 2020

Improving Performance in Reinforcement Learning by Breaking Generalization in Neural Networks.

Proceedings of the 19th International Conference on Autonomous Agents and Multiagent Systems, 2020

2019

Adapting Behaviour via Intrinsic Reward: A Survey and Empirical Study.

CoRR, 2019

Planning with Expectation Models.

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Prediction in Intelligence: An Empirical Comparison of Off-policy Algorithms on Robots.

Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, 2019

Meta-Descent for Online, Continual Prediction.

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018

The Barbados 2018 List of Open Issues in Continual Learning.

CoRR, 2018

Online Off-policy Prediction.

CoRR, 2018

General Value Function Networks.

CoRR, 2018

Directly Estimating the Variance of the λ-Return Using Temporal-Difference Methods.

CoRR, 2018

Comparing Direct and Indirect Temporal-Difference Methods for Estimating the Variance of the Return.

Proceedings of the Thirty-Fourth Conference on Uncertainty in Artificial Intelligence, 2018

Context-dependent upper-confidence bounds for directed exploration.

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Organizing Experience: a Deeper Look at Replay Mechanisms for Sample-Based Planning in Continuous State Domains.

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

2017

GQ($λ$) Quick Reference and Implementation Guide.

Richard S. Sutton

CoRR, 2017

Accelerated Gradient Temporal Difference Learning.

Yangchen Pan

Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016

A Greedy Approach to Adapting the Trace Parameter for Temporal Difference Learning.

Proceedings of the 2016 International Conference on Autonomous Agents & Multiagent Systems, 2016

Investigating Practical Linear Temporal Difference Learning.

Proceedings of the 2016 International Conference on Autonomous Agents & Multiagent Systems, 2016

Introspective Agents: Confidence Measures for General Value Functions.

Proceedings of the Artificial General Intelligence - 9th International Conference, 2016

2012

Acquiring a broad range of empirical knowledge in real time by temporal-difference learning.

Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, 2012

Multi-timescale Nexting in a Reinforcement Learning Robot.

Joseph Modayil

Richard S. Sutton

Proceedings of the From Animals to Animats 12, 2012

Scaling life-long off-policy learning.

Joseph Modayil

Richard S. Sutton

Proceedings of the 2012 IEEE International Conference on Development and Learning and Epigenetic Robotics, 2012

2011

Horde: a scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction.

Proceedings of the 10th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2011), 2011

2010

Report on the 2008 Reinforcement Learning Competition.

Shimon Whiteson

Brian Tanner

AI Mag., 2010

Interval Estimation for Reinforcement-Learning Algorithms in Continuous-State Domains.

Proceedings of the Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010. Proceedings of a meeting held 6-9 December 2010, 2010

2009

RL-Glue: Language-Independent Software for Reinforcement-Learning Experiments.

Brian Tanner