Pierre-Luc Bacon

According to our database¹, Pierre-Luc Bacon authored at least 64 papers between 2015 and 2026.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of three.

Bibliography

2026

Rotation-Preserving Supervised Fine-Tuning.

[DOI]

,

,

,

Pierre-Luc Bacon

,

Mohammad Hamdaqa

,

CoRR, May, 2026

Layerwise LQR for Geometry-Aware Optimization of Deep Networks.

[DOI]

Simon Dufort-Labbé

,

Pierre-Luc Bacon

,

,

Simon Lacoste-Julien

,

Aristide Baratin

CoRR, May, 2026

Towards Practical World Model-based Reinforcement Learning for Vision-Language-Action Models.

[DOI]

,

,

,

,

,

,

,

Pierre-Luc Bacon

,

CoRR, March, 2026

What Makes Value Learning Efficient in Residual Reinforcement Learning?

[DOI]

,

,

,

,

Pierre-Luc Bacon

,

CoRR, February, 2026

Reward Redistribution for CVaR MDPs using a Bellman Operator on L-infinity.

[DOI]

,

,

,

Pierre-Luc Bacon

,

CoRR, February, 2026

2025

Long-Horizon Model-Based Offline Reinforcement Learning Without Conservatism.

[DOI]

,

,

,

,

Siamak Ravanbakhsh

,

Pierre-Luc Bacon

CoRR, December, 2025

The Three Regimes of Offline-to-Online Reinforcement Learning.

[DOI]

,

,

,

Pierre-Luc Bacon

CoRR, October, 2025

Planning with Unified Multimodal Models.

[DOI]

,

,

,

Pierre-Luc Bacon

CoRR, September, 2025

Discovery of Sustainable Refrigerants through Physics-Informed RL Fine-Tuning of Sequence Models.

[DOI]

Adrien Goldszal

,

Diego Calanzone

,

,

Pierre-Luc Bacon

CoRR, September, 2025

Robust Reinforcement Learning for Discrete Compositional Generation via General Soft Operators.

[DOI]

Marco Jiralerspong

,

,

,

,

,

,

Pierre-Luc Bacon

,

CoRR, June, 2025

Stable Gradients for Stable Learning at Scale in Deep Reinforcement Learning.

[DOI]

Roger Creus Castanyer

,

Johan S. Obando-Ceron

,

,

Pierre-Luc Bacon

,

,

Aaron C. Courville

,

Pablo Samuel Castro

CoRR, June, 2025

State Entropy Regularization for Robust Reinforcement Learning.

[DOI]

,

,

,

,

Pierre-Luc Bacon

,

CoRR, June, 2025

Understanding Behavioral Metric Learning: A Large-Scale Study on Distracting Reinforcement Learning Environments.

[DOI]

,

,

Pierre-Luc Bacon

,

,

CoRR, June, 2025

Mol-MoE: Training Preference-Guided Routers for Molecule Generation.

[DOI]

Diego Calanzone

,

,

Pierre-Luc Bacon

CoRR, February, 2025

Maxwell's Demon at Work: Efficient Pruning by Leveraging Saturation of Neurons.

[DOI]

Simon Dufort-Labbé

,

,

Evgenii Nikishin

,

,

Pierre-Luc Bacon

,

,

Aristide Baratin

Trans. Mach. Learn. Res., 2025

Network Sparsity Unlocks the Scaling Potential of Deep Reinforcement Learning.

[DOI]

,

,

,

,

Pierre-Luc Bacon

,

Proceedings of the Forty-second International Conference on Machine Learning, 2025

Scaling Trends in Language Model Robustness.

[DOI]

Nikolaus H. R. Howe

,

Ian R. McKenzie

,

Oskar John Hollinsworth

,

,

,

Aaron David Tucker

,

Pierre-Luc Bacon

,

Proceedings of the Forty-second International Conference on Machine Learning, 2025

MaestroMotif: Skill Design from Artificial Intelligence Feedback.

[DOI]

Martin Klissarov

,

,

Roberta Raileanu

,

,

,

,

Pierre-Luc Bacon

,

,

Marlos C. Machado

,

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

2024

Exploring Scaling Trends in LLM Robustness.

[DOI]

Nikolaus H. R. Howe

,

,

Ian R. McKenzie

,

Oskar John Hollinsworth

,

,

Pierre-Luc Bacon

,

CoRR, 2024

Generative Active Learning for the Search of Small-molecule Protein Binders.

[DOI]

CoRR, 2024

Maxwell's Demon at Work: Efficient Pruning by Leveraging Saturation of Neurons.

[DOI]

Simon Dufort-Labbé

,

,

Evgenii Nikishin

,

,

Pierre-Luc Bacon

,

Aristide Baratin

CoRR, 2024

Do Transformer World Models Give Better Policy Gradients?

[DOI]

,

,

Clement Gehring

,

,

Pierre-Luc Bacon

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Bridging State and History Representations: Understanding Self-Predictive RL.

[DOI]

,

Benjamin Eysenbach

,

Erfan Seyedsalehi

,

,

Clement Gehring

,

,

Pierre-Luc Bacon

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Decoupling regularization from the action space.

[DOI]

Sobhan Mohammadpour

,

,

Pierre-Luc Bacon

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Motif: Intrinsic Motivation from Artificial Intelligence Feedback.

[DOI]

Martin Klissarov

,

,

,

Roberta Raileanu

,

Pierre-Luc Bacon

,

,

,

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Course Correcting Koopman Representations.

[DOI]

,

Clement Gehring

,

Jonathan Pilault

,

,

Pierre-Luc Bacon

,

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Maximum entropy GFlowNets with soft Q-learning.

[DOI]

Sobhan Mohammadpour

,

Emmanuel Bengio

,

,

Pierre-Luc Bacon

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2024

2023

Block-State Transformer.

[DOI]

,

Jonathan Pilault

,

Pierre-Luc Bacon

,

Christopher Pal

,

,

CoRR, 2023

Goal-conditioned GFlowNets for Controllable Multi-Objective Molecular Design.

[DOI]

,

Pierre-Luc Bacon

,

Christopher Pal

,

Emmanuel Bengio

CoRR, 2023

Policy Optimization in a Noisy Neighborhood: On Return Landscapes in Continuous Control.

[DOI]

,

,

,

Pierre-Luc Bacon

,

Marc G. Bellemare

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Block-State Transformers.

[DOI]

Jonathan Pilault

,

,

,

,

Pierre-Luc Bacon

,

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

When Do Transformers Shine in RL? Decoupling Memory from Credit Assignment.

[DOI]

,

,

Benjamin Eysenbach

,

Pierre-Luc Bacon

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Double Gumbel Q-Learning.

[DOI]

David Yu-Tung Hui

,

Aaron C. Courville

,

Pierre-Luc Bacon

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Sample-Efficient Reinforcement Learning by Breaking the Replay Ratio Barrier.

[DOI]

,

,

Evgenii Nikishin

,

Pierre-Luc Bacon

,

Marc G. Bellemare

,

Aaron C. Courville

Proceedings of the Eleventh International Conference on Learning Representations, 2023

2022

Designing Biological Sequences via Meta-Reinforcement Learning and Bayesian Optimization.

[DOI]

,

,

,

,

Pierre-Luc Bacon

CoRR, 2022

Myriad: a real-world testbed to bridge trajectory optimization and deep learning.

[DOI]

Nikolaus H. R. Howe

,

Simon Dufort-Labbé

,

Nitarshan Rajkumar

,

Pierre-Luc Bacon

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Direct Behavior Specification via Constrained Reinforcement Learning.

[DOI]

,

,

,

Pierre-Luc Bacon

,

Christopher J. Pal

Proceedings of the International Conference on Machine Learning, 2022

The Primacy Bias in Deep Reinforcement Learning.

[DOI]

Evgenii Nikishin

,

,

,

Pierre-Luc Bacon

,

Aaron C. Courville

Proceedings of the International Conference on Machine Learning, 2022

Continuous-Time Meta-Learning with Forward Mode Differentiation.

[DOI]

,

,

,

,

,

Guillaume Lajoie

,

Pierre-Luc Bacon

Proceedings of the Tenth International Conference on Learning Representations, 2022

Control-Oriented Model-Based Reinforcement Learning with Implicit Differentiation.

[DOI]

Evgenii Nikishin

,

,

Rishabh Agarwal

,

Pierre-Luc Bacon

Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021

An Information-Theoretic Perspective on Credit Assignment in Reinforcement Learning.

[DOI]

,

Peter Henderson

,

Pierre-Luc Bacon

CoRR, 2021

Neural Algorithmic Reasoners are Implicit Planners.

[DOI]

,

Petar Velickovic

,

Ognjen Milinkovic

,

Pierre-Luc Bacon

,

,

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

TDprop: Does Adaptive Optimization With Jacobi Preconditioning Help Temporal Difference Learning?

[DOI]

,

Peter Henderson

,

,

Emmanuel Bengio

,

,

Pierre-Luc Bacon

,

Proceedings of the AAMAS '21: 20th International Conference on Autonomous Agents and Multiagent Systems, 2021

2020

XLVIN: eXecuted Latent Value Iteration Nets.

[DOI]

,

Petar Velickovic

,

Ognjen Milinkovic

,

Pierre-Luc Bacon

,

,

CoRR, 2020

Graph neural induction of value iteration.

[DOI]

,

Pierre-Luc Bacon

,

CoRR, 2020

TDprop: Does Jacobi Preconditioning Help Temporal Difference Learning?

[DOI]

,

Peter Henderson

,

,

Emmanuel Bengio

,

,

Pierre-Luc Bacon

,

CoRR, 2020

Policy Evaluation Networks.

[DOI]

,

,

,

Pierre-Luc Bacon

CoRR, 2020

Understanding the Curse of Horizon in Off-Policy Evaluation via Conditional Importance Sampling.

[DOI]

,

Pierre-Luc Bacon

,

Proceedings of the 37th International Conference on Machine Learning, 2020

Options of Interest: Temporal Abstraction with Interest Functions.

[DOI]

Khimya Khetarpal

,

Martin Klissarov

,

Maxime Chevalier-Boisvert

,

Pierre-Luc Bacon

,

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019

Entropy Regularization with Discounted Future State Distribution in Policy Gradient Methods.

[DOI]

,

,

Pierre-Luc Bacon

,

CoRR, 2019

All-Action Policy Gradient Methods: A Numerical Integration Approach.

[DOI]

,

Loren Amdahl-Culleton

,

,

,

Pierre-Luc Bacon

CoRR, 2019

2018

The Barbados 2018 List of Open Issues in Continual Learning.

[DOI]

,

Hado van Hasselt

,

,

,

,

Pierre-Luc Bacon

,

,

,

Marc G. Bellemare

,

CoRR, 2018

Constructing Temporal Abstractions Autonomously in Reinforcement Learning.

[DOI]

Pierre-Luc Bacon

,

AI Mag., 2018

Convergent TREE BACKUP and RETRACE with Function Approximation.

[DOI]

,

Pierre-Luc Bacon

,

,

Proceedings of the 35th International Conference on Machine Learning, 2018

Learning Robust Options.

[DOI]

Daniel J. Mankowitz

,

Timothy A. Mann

,

Pierre-Luc Bacon

,

,

Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Learning With Options That Terminate Off-Policy.

[DOI]

Anna Harutyunyan

,

,

Pierre-Luc Bacon

,

,

Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

When Waiting Is Not an Option: Learning Options With a Deliberation Cost.

[DOI]

,

Pierre-Luc Bacon

,

Martin Klissarov

,

Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

OptionGAN: Learning Joint Reward-Policy Options Using Generative Adversarial Inverse Reinforcement Learning.

[DOI]

Peter Henderson

,

,

Pierre-Luc Bacon

,

,

,

Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017

Learnings Options End-to-End for Continuous Action Tasks.

[DOI]

Martin Klissarov

,

Pierre-Luc Bacon

,

,

CoRR, 2017

The Option-Critic Architecture.

[DOI]

Pierre-Luc Bacon

,

,

Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016

A Matrix Splitting Perspective on Planning with Options.

[DOI]

Pierre-Luc Bacon

,

CoRR, 2016

2015

Conditional Computation in Neural Networks for faster models.

[DOI]

Emmanuel Bengio

,

Pierre-Luc Bacon

,

,

CoRR, 2015

Learning and Planning with Timing Information in Markov Decision Processes.

[DOI]

Pierre-Luc Bacon

,

,

Proceedings of the Thirty-First Conference on Uncertainty in Artificial Intelligence, 2015

Analyzing Open Data from the City of Montreal.

[DOI]

,

Pierre-Luc Bacon

Proceedings of the 2nd International Workshop on Mining Urban Data co-located with 32nd International Conference on Machine Learning (ICML 2015), 2015
