Pablo Samuel Castro

Glen Berseth

CoRR, May, 2026

A Mechanistic Analysis of Looped Reasoning Language Models.

Hugh Blayney

Alvaro Arroyo

CoRR, April, 2026

Align and Filter: Improving Performance in Asynchronous On-Policy RL.

Homayoun Honari

CoRR, March, 2026

Stable Deep Reinforcement Learning via Isotropic Gaussian Representations.

Ali Saheb Pasand

Pouya Bashivan

CoRR, February, 2026

Discovering Differences in Strategic Behavior Between Humans and LLMs.

CoRR, February, 2026

2025

A Comedy of Estimators: On KL Regularization in RL Training of LLMs.

Vedant Shah

CoRR, December, 2025

The Formalism-Implementation Gap in Reinforcement Learning Research.

CoRR, October, 2025

ARM-FM: Automated Reward Machines via Foundation Models for Compositional Reinforcement Learning.

CoRR, October, 2025

Simplicial Embeddings Improve Sample Efficiency in Actor-Critic Agents.

CoRR, October, 2025

Asymmetric Proximal Policy Optimization: mini-critics boost LLM reasoning.

Jiashun Liu

CoRR, October, 2025

Stable Gradients for Stable Learning at Scale in Deep Reinforcement Learning.

Gintare Karolina Dziugaite

CoRR, June, 2025

Continual Learning in Vision-Language Models via Aligned Model Merging.

Ghada Sokar

CoRR, June, 2025

Optimistic critics can empower small actors.

Olya Mastikhina

Dhruv Sreenivas

CoRR, June, 2025

Mind the GAP! The Challenges of Scale in Pixel-based Deep Reinforcement Learning.

Ghada Sokar

CoRR, May, 2025

Estimating Policy Functions in Payment Systems Using Reinforcement Learning.

Francisco Rivadeneyra

ACM Trans. Economics and Comput., March, 2025

Multi-Task Reinforcement Learning Enables Parameter Scaling.

Reginald McLean

Evangelos Chatzaroulas

CoRR, March, 2025

A Survey of State Representation Learning for Deep Reinforcement Learning.

Ayoub Echchahed

Trans. Mach. Learn. Res., 2025

NAVIX: Scaling MiniGrid Environments with JAX.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

Meta-World+: An Improved, Standardized, RL Benchmark.

Reginald McLean

Evangelos Chatzaroulas

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

Measure gradients, not activations! Enhancing neuronal activity in deep reinforcement learning.

Jiashun Liu

Zihao Wu

Ling Pan

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

Mitigating Plasticity Loss in Continual Reinforcement Learning by Reducing Churn.

Hongyao Tang

Glen Berseth

Proceedings of the Forty-second International Conference on Machine Learning, 2025

The Impact of On-Policy Parallelized Data Collection on Deep Reinforcement Learning Networks.

Walter Mayor

Proceedings of the Forty-second International Conference on Machine Learning, 2025

The Courage to Stop: Overcoming Sunk Cost Fallacy in Deep Reinforcement Learning.

Jiashun Liu

Ling Pan

Proceedings of the Forty-second International Conference on Machine Learning, 2025

Discovering Symbolic Cognitive Models from Human and Animal Behavior.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

Don't flatten, tokenize! Unlocking the key to SoftMoE's efficacy in deep RL.

Ghada Sokar

Hugo Larochelle

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Studying the Interplay Between the Actor and Critic Representations in Reinforcement Learning.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

2024

A density estimation perspective on learning from pairwise human preferences.

Trans. Mach. Learn. Res., 2024

In deep reinforcement learning, a pruned network is a good network.

CoRR, 2024

Mixture of Experts in a Mixture of RL settings.

Timon Willi

Gintare Karolina Dziugaite

Jakob Nicolaus Foerster

RLJ, 2024

On the consistency of hyper-parameter selection in value-based deep reinforcement learning.

João Guilherme Madeira Araújo

RLJ, 2024

CALE: Continuous Arcade Learning Environment.

Jesse Farebrother

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Adaptive Accompaniment with ReaLchords.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Mixtures of Experts Unlock Parameter Scaling for Deep RL.

Gintare Karolina Dziugaite

Jakob Nicolaus Foerster

Proceedings of the Forty-first International Conference on Machine Learning, 2024

In value-based deep reinforcement learning, a pruned network is a good network.

Maxime Chevalier-Boisvert

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Stop Regressing: Training Value Functions via Classification for Scalable Deep RL.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Jaxpruner: A Concise Library for Sparsity Research.

Proceedings of the Conference on Parsimony and Learning, 2024

2023

A Kernel Perspective on Behavioural Metrics for Markov Decision Processes.

Trans. Mach. Learn. Res., 2023

Learning and Controlling Silicon Dopant Transitions in Graphene using Scanning Transmission Electron Microscopy.

CoRR, 2023

Minigrid & Miniworld: Modular & Customizable Reinforcement Learning Environments for Goal-Oriented Tasks.

CoRR, 2023

Offline Reinforcement Learning with On-Policy Q-Function Regularization.

Proceedings of the Machine Learning and Knowledge Discovery in Databases: Research Track, 2023

Small batch deep reinforcement learning.

Maxime Chevalier-Boisvert

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Minigrid & Miniworld: Modular & Customizable Reinforcement Learning Environments for Goal-Oriented Tasks.

Bolun Dai

Mark Towers

Rodrigo Perez-Vicente

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

The Dormant Neuron Phenomenon in Deep Reinforcement Learning.

Proceedings of the International Conference on Machine Learning, 2023

Bigger, Better, Faster: Human-level Atari with human-level efficiency.

Max Schwarzer

Proceedings of the International Conference on Machine Learning, 2023

The Small Batch Size Anomaly in Multistep Deep Reinforcement Learning.

Proceedings of the First Tiny Papers Track at ICLR 2023, 2023

Proto-Value Networks: Scaling Representation Learning with Auxiliary Tasks.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

2022

Beyond Tabula Rasa: Reincarnating Reinforcement Learning.

CoRR, 2022

Reincarnating Reinforcement Learning: Reusing Prior Computation to Accelerate Progress.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

The State of Sparse Training in Deep Reinforcement Learning.

Proceedings of the International Conference on Machine Learning, 2022

A general class of surrogate functions for stable and efficient reinforcement learning.

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

2021

Losses, Dissonances, and Distortions.

CoRR, 2021

A functional mirror ascent view of policy gradient methods with function approximation.

CoRR, 2021

MICo: Learning improved representations via sampling-based state similarity for Markov decision processes.

CoRR, 2021

The Difficulty of Passive Learning in Deep Reinforcement Learning.

Georg Ostrovski

Will Dabney

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

MICo: Improved representations via sampling-based state similarity for Markov decision processes.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Deep Reinforcement Learning at the Edge of the Statistical Precipice.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Revisiting Rainbow: Promoting more insightful and inclusive deep reinforcement learning research.

Proceedings of the 38th International Conference on Machine Learning, 2021

Contrastive Behavioral Similarity Embeddings for Generalization in Reinforcement Learning.

Proceedings of the 9th International Conference on Learning Representations, 2021

Metrics and Continuity in Reinforcement Learning.

Charline Le Lan

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020

Autonomous navigation of stratospheric balloons using reinforcement learning.

Nat., 2020

GANterpretations.

CoRR, 2020

Rigging the Lottery: Making All Tickets Winners.

Proceedings of the 37th International Conference on Machine Learning, 2020

Shaping the Narrative Arc: Information-Theoretic Collaborative Dialogue.

Kory Wallace Mathewson

Proceedings of the Eleventh International Conference on Computational Creativity, 2020

Scalable Methods for Computing State Similarity in Deterministic Markov Decision Processes.

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019

Inverse Reinforcement Learning with Multiple Ranked Experts.

Shijian Li

Daqing Zhang

CoRR, 2019

Shaping the Narrative Arc: An Information-Theoretic Approach to Collaborative Dialogue.

CoRR, 2019

A Geometric Perspective on Optimal Representations for Reinforcement Learning.

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

An Atari Model Zoo for Analyzing, Visualizing, and Comparing Deep Reinforcement Learning Agents.

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Performing Structured Improvisations with Pre-trained Deep Learning Models.

Proceedings of the Tenth International Conference on Computational Creativity, 2019

Distributional reinforcement learning with linear function approximation.

Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

A Comparative Analysis of Expected and Distributional Reinforcement Learning.

Clare Lyle

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018

An Atari Model Zoo for Analyzing, Visualizing, and Comparing Deep Reinforcement Learning Agents.

CoRR, 2018

Dopamine: A Research Framework for Deep Reinforcement Learning.

CoRR, 2018

Combining Learned Lyrical Structures and Vocabulary for Improved Lyric Generation.

Maria Attarian

CoRR, 2018

2013

iBOAT: Isolation-Based Online Anomalous Trajectory Detection.

IEEE Trans. Intell. Transp. Syst., 2013

Real Time Anomalous Trajectory Detection and Analysis.

Mob. Networks Appl., 2013

From taxi GPS traces to social and community dynamics: A survey.

ACM Comput. Surv., 2013

2012

Urban Traffic Modelling and Prediction Using Large Scale Taxi GPS Traces.

Daqing Zhang

Shijian Li

Proceedings of the Pervasive Computing - 10th International Conference, 2012

2011

Real-Time Detection of Anomalous Taxi Trajectories from GPS Traces.

Proceedings of the Mobile and Ubiquitous Systems: Computing, Networking, and Services, 2011

Automatic Construction of Temporally Extended Actions for MDPs Using Bisimulation Metrics.

Proceedings of the Recent Advances in Reinforcement Learning - 9th European Workshop, 2011

2010

Smarter Sampling in Model-Based Bayesian Reinforcement Learning.

Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2010

Using Bisimulation for Policy Transfer in MDPs.

Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence, 2010

2009

Equivalence Relations in Fully and Partially Observable Markov Decision Processes.

Prakash Panangaden

Proceedings of the IJCAI 2009, 2009

2007

Using Linear Programming for Bayesian Exploration in Markov Decision Processes.
