Bruno Scherrer

According to our database, Bruno Scherrer authored at least 58 papers between 2002 and 2021.

Bibliography

2021
Infinite-Horizon Offline Reinforcement Learning with Linear Function Approximation: Curse of Dimensionality and Algorithm.
CoRR, 2021

2020
Leverage the Average: an Analysis of Regularization in RL.
CoRR, 2020

Leverage the Average: an Analysis of KL Regularization in Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Momentum in Reinforcement Learning.
Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

2019
A Theory of Regularized Markov Decision Processes.
Proceedings of the 36th International Conference on Machine Learning, 2019

Stability guarantees for nonlinear discrete-time systems controlled by approximate value iteration.
Proceedings of the 58th IEEE Conference on Decision and Control, 2019

How to Combine Tree-Search Methods in Reinforcement Learning.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Anderson Acceleration for Reinforcement Learning.
CoRR, 2018

Multiple-Step Greedy Policies in Online and Approximate Reinforcement Learning.
CoRR, 2018

Multiple-Step Greedy Policies in Approximate and Online Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Beyond the One-Step Greedy Approach in Reinforcement Learning.
Proceedings of the 35th International Conference on Machine Learning, 2018

2016
Improved and Generalized Upper Bounds on the Complexity of Policy Iteration.
Math. Oper. Res., 2016

Softened Approximate Policy Iteration for Markov Games.
Proceedings of the 33rd International Conference on Machine Learning, 2016

On the Use of Non-Stationary Strategies for Solving Two-Player Zero-Sum Markov Games.
Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, 2016

Contributions algorithmiques au contrôle optimal stochastique à temps discret et horizon infini.
2016

2015
Recherche locale de politique dans un espace convexe.
Rev. d'Intelligence Artif., 2015

Approximate modified policy iteration and its application to the game of Tetris.
J. Mach. Learn. Res., 2015

On the Rate of Convergence and Error Bounds for LSTD(λ).
Proceedings of the 32nd International Conference on Machine Learning, 2015

Approximate Dynamic Programming for Two-Player Zero-Sum Markov Games.
Proceedings of the 32nd International Conference on Machine Learning, 2015

Non-Stationary Approximate Modified Policy Iteration.
Proceedings of the 32nd International Conference on Machine Learning, 2015

2014
Modified policy iteration algorithms are not strongly polynomial for discounted dynamic programming.
Oper. Res. Lett., 2014

Off-policy learning with eligibility traces: a survey.
J. Mach. Learn. Res., 2014

Rate of Convergence and Error Bounds for LSTD(λ).
CoRR, 2014

Local Policy Search in a Convex Space and Conservative Policy Iteration as Boosted Policy Search.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2014

Approximate Policy Iteration Schemes: A Comparison.
Proceedings of the 31st International Conference on Machine Learning, 2014

2013
Performance bounds for λ policy iteration and application to the game of Tetris.
J. Mach. Learn. Res., 2013

Tight Performance Bounds for Approximate Modified Policy Iteration with Non-Stationary Policies
CoRR, 2013

Policy Search: Any Local Optimum Enjoys a Global Performance Guarantee.
CoRR, 2013

On the Performance Bounds of some Policy Search Dynamic Programming Algorithms.
CoRR, 2013

Approximate Dynamic Programming Finally Performs Well in the Game of Tetris.
Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013, 2013

2012
On the Use of Non-Stationary Policies for Infinite-Horizon Discounted Markov Decision Processes
CoRR, 2012

On the Use of Non-Stationary Policies for Stationary Infinite-Horizon Markov Decision Processes.
Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012, 2012

Approximate Modified Policy Iteration.
Proceedings of the 29th International Conference on Machine Learning, 2012

A Dantzig Selector Approach to Temporal Difference Learning.
Proceedings of the 29th International Conference on Machine Learning, 2012

2011
Classification-based Policy Iteration with a Critic.
Proceedings of the 28th International Conference on Machine Learning, 2011

Recursive Least-Squares Learning with Eligibility Traces.
Proceedings of the Recent Advances in Reinforcement Learning - 9th European Workshop, 2011

ℓ1-Penalized Projected Bellman Residual.
Proceedings of the Recent Advances in Reinforcement Learning - 9th European Workshop, 2011

2010
Least-Squares Policy Iteration: Bias-Variance Trade-off in Control Problems.
Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010

Should one compute the Temporal Difference fix point or minimize the Bellman Residual? The unified oblique projection view.
Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010

2009
Construction d'un joueur artificiel pour Tetris.
Rev. d'Intelligence Artif., 2009

Improvements on Learning Tetris with Cross Entropy.
J. Int. Comput. Games Assoc., 2009

Building Controllers for Tetris.
J. Int. Comput. Games Assoc., 2009

2008
Analyse d'un algorithme d'intelligence en essaim pour le fourragement.
Rev. d'Intelligence Artif., 2008

Embedded Harmonic Control for Trajectory Planning in Large Environments.
Proceedings of the ReConFig'08: 2008 International Conference on Reconfigurable Computing and FPGAs, 2008

Biasing Approximate Dynamic Programming with a Lower Discount Factor.
Proceedings of the Advances in Neural Information Processing Systems 21, 2008

2007
Performance Bounds for Lambda Policy Iteration
CoRR, 2007

Optimal control subsumes harmonic control.
Proceedings of the 2007 IEEE International Conference on Robotics and Automation, 2007

Convergence and rate of convergence of a foraging ant model.
Proceedings of the IEEE Congress on Evolutionary Computation, 2007

Convergence and rate of convergence of a simple ant model.
Proceedings of the 6th International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2007), 2007

2006
Modular self-organization
CoRR, 2006

2005
Asynchronous neurocomputing for optimal control and reinforcement learning with large state spaces.
Neurocomputing, 2005

2003
Apprentissage de représentation et auto-organisation modulaire pour un agent autonome.
PhD thesis, 2003

Modular self-organization for a long-living autonomous agent.
Proceedings of IJCAI-03, 2003

Planning Cooperative Homogeneous Multiagent Systems Using Markov Decision Processes.
Proceedings of ICEIS 2003, 2003

Parallel asynchronous distributed computations of optimal control in large state space Markov Decision processes.
Proceedings of the 11th European Symposium on Artificial Neural Networks, 2003

2002
A heuristic approach for solving decentralized-POMDP: assessment on the pursuit problem.
Proceedings of the 2002 ACM Symposium on Applied Computing (SAC), 2002

Cooperative Co-Learning: A Model-Based Approach for Solving Multi Agent Reinforcement Problems.
Proceedings of the 14th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2002), 2002

Coevolutive planning in Markov decision processes.
Proceedings of the First International Joint Conference on Autonomous Agents & Multiagent Systems, 2002
