Bruno Scherrer

According to our database, Bruno Scherrer authored at least 58 papers between 2002 and 2021.

Bibliography

2021
Infinite-Horizon Offline Reinforcement Learning with Linear Function Approximation: Curse of Dimensionality and Algorithm.
CoRR, 2021

2020
Leverage the Average: an Analysis of Regularization in RL.
CoRR, 2020

Leverage the Average: an Analysis of KL Regularization in Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Momentum in Reinforcement Learning.
Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

2019
A Theory of Regularized Markov Decision Processes.
Proceedings of the 36th International Conference on Machine Learning, 2019

Stability guarantees for nonlinear discrete-time systems controlled by approximate value iteration.
Proceedings of the 58th IEEE Conference on Decision and Control, 2019

How to Combine Tree-Search Methods in Reinforcement Learning.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Anderson Acceleration for Reinforcement Learning.
CoRR, 2018

Multiple-Step Greedy Policies in Online and Approximate Reinforcement Learning.
CoRR, 2018

Multiple-Step Greedy Policies in Approximate and Online Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Beyond the One-Step Greedy Approach in Reinforcement Learning.
Proceedings of the 35th International Conference on Machine Learning, 2018

2016
Improved and Generalized Upper Bounds on the Complexity of Policy Iteration.
Math. Oper. Res., 2016

Softened Approximate Policy Iteration for Markov Games.
Proceedings of the 33rd International Conference on Machine Learning, 2016

On the Use of Non-Stationary Strategies for Solving Two-Player Zero-Sum Markov Games.
Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, 2016

Contributions algorithmiques au contrôle optimal stochastique à temps discret et horizon infini.
2016

2015
Recherche locale de politique dans un espace convexe.
Rev. d'Intelligence Artif., 2015

Approximate modified policy iteration and its application to the game of Tetris.
J. Mach. Learn. Res., 2015

On the Rate of Convergence and Error Bounds for LSTD(λ).
Proceedings of the 32nd International Conference on Machine Learning, 2015

Approximate Dynamic Programming for Two-Player Zero-Sum Markov Games.
Proceedings of the 32nd International Conference on Machine Learning, 2015

Non-Stationary Approximate Modified Policy Iteration.
Proceedings of the 32nd International Conference on Machine Learning, 2015

2014
Modified policy iteration algorithms are not strongly polynomial for discounted dynamic programming.
Oper. Res. Lett., 2014

Off-policy learning with eligibility traces: a survey.
J. Mach. Learn. Res., 2014

Rate of Convergence and Error Bounds for LSTD(λ).
CoRR, 2014

Local Policy Search in a Convex Space and Conservative Policy Iteration as Boosted Policy Search.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2014

Approximate Policy Iteration Schemes: A Comparison.
Proceedings of the 31st International Conference on Machine Learning, 2014

2013
Performance bounds for λ policy iteration and application to the game of Tetris.
J. Mach. Learn. Res., 2013

Tight Performance Bounds for Approximate Modified Policy Iteration with Non-Stationary Policies
CoRR, 2013

Policy Search: Any Local Optimum Enjoys a Global Performance Guarantee.
CoRR, 2013

On the Performance Bounds of some Policy Search Dynamic Programming Algorithms.
CoRR, 2013

Approximate Dynamic Programming Finally Performs Well in the Game of Tetris.
Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013, 2013

2012
On the Use of Non-Stationary Policies for Infinite-Horizon Discounted Markov Decision Processes
CoRR, 2012

On the Use of Non-Stationary Policies for Stationary Infinite-Horizon Markov Decision Processes.
Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012, 2012

Approximate Modified Policy Iteration.
Proceedings of the 29th International Conference on Machine Learning, 2012

A Dantzig Selector Approach to Temporal Difference Learning.
Proceedings of the 29th International Conference on Machine Learning, 2012

2011
Classification-based Policy Iteration with a Critic.
Proceedings of the 28th International Conference on Machine Learning, 2011

Recursive Least-Squares Learning with Eligibility Traces.
Proceedings of the Recent Advances in Reinforcement Learning - 9th European Workshop, 2011

ℓ1-Penalized Projected Bellman Residual.
Proceedings of the Recent Advances in Reinforcement Learning - 9th European Workshop, 2011

2010
Least-Squares Policy Iteration: Bias-Variance Trade-off in Control Problems.
Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010

Should one compute the Temporal Difference fix point or minimize the Bellman Residual? The unified oblique projection view.
Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010

2009
Construction d'un joueur artificiel pour Tetris.
Rev. d'Intelligence Artif., 2009

Improvements on Learning Tetris with Cross Entropy.
J. Int. Comput. Games Assoc., 2009

Building Controllers for Tetris.
J. Int. Comput. Games Assoc., 2009

2008
Analyse d'un algorithme d'intelligence en essaim pour le fourragement.
Rev. d'Intelligence Artif., 2008

Embedded Harmonic Control for Trajectory Planning in Large Environments.
Proceedings of the ReConFig'08: 2008 International Conference on Reconfigurable Computing and FPGAs, 2008

Biasing Approximate Dynamic Programming with a Lower Discount Factor.
Proceedings of the Advances in Neural Information Processing Systems 21, 2008

2007
Performance Bounds for Lambda Policy Iteration
CoRR, 2007

Optimal control subsumes harmonic control.
Proceedings of the 2007 IEEE International Conference on Robotics and Automation, 2007

Convergence and rate of convergence of a foraging ant model.
Proceedings of the IEEE Congress on Evolutionary Computation, 2007

Convergence and rate of convergence of a simple ant model.
Proceedings of the 6th International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2007), 2007

2006
Modular self-organization
CoRR, 2006

2005
Asynchronous neurocomputing for optimal control and reinforcement learning with large state spaces.
Neurocomputing, 2005

2003
Apprentissage de représentation et auto-organisation modulaire pour un agent autonome.
PhD thesis, 2003

Modular self-organization for a long-living autonomous agent.
Proceedings of IJCAI-03, 2003

Planning Cooperative Homogeneous Multiagent Systems Using Markov Decision Processes.
Proceedings of ICEIS 2003, 2003

Parallel asynchronous distributed computations of optimal control in large state space Markov Decision processes.
Proceedings of the 11th European Symposium on Artificial Neural Networks, 2003

2002
A heuristic approach for solving decentralized-POMDP: assessment on the pursuit problem.
Proceedings of the 2002 ACM Symposium on Applied Computing (SAC), 2002

Cooperative Co-Learning: A Model-Based Approach for Solving Multi Agent Reinforcement Problems.
Proceedings of the 14th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2002), 2002

Coevolutive planning in Markov decision processes.
Proceedings of the First International Joint Conference on Autonomous Agents & Multiagent Systems, 2002
