Mohammad Ghavamzadeh

Bruno Scherrer

Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

Cost-sensitive Multiclass Classification Risk Bounds.

Bernardo Ávila Pires

Proceedings of the 30th International Conference on Machine Learning, 2013

A Generalized Kernel Approach to Structured Output Learning.

Hachem Kadri

Philippe Preux

Proceedings of the 30th International Conference on Machine Learning, 2013

2012

Finite-sample analysis of least-squares policy iteration.

J. Mach. Learn. Res., 2012

Best Arm Identification: A Unified Approach to Fixed Budget and Fixed Confidence.

Victor Gabillon

Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012

Approximate Modified Policy Iteration.

Proceedings of the 29th International Conference on Machine Learning, 2012

A Dantzig Selector Approach to Temporal Difference Learning.

Proceedings of the 29th International Conference on Machine Learning, 2012

Semi-Supervised Apprenticeship Learning.

Michal Valko

Proceedings of the Tenth European Workshop on Reinforcement Learning, 2012

Conservative and Greedy Approaches to Classification-Based Policy Iteration.

Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, 2012

Bayesian Reinforcement Learning.

Proceedings of the Reinforcement Learning, 2012

Least-Squares Methods for Policy Iteration.

Proceedings of the Reinforcement Learning, 2012

2011

Multi-Bandit Best Arm Identification.

Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011

Speedy Q-Learning.

Mohammad Gheshlaghi Azar

Hilbert J. Kappen

Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011

Finite-Sample Analysis of Lasso-TD.

Proceedings of the 28th International Conference on Machine Learning, 2011

Classification-based Policy Iteration with a Critic.

Proceedings of the 28th International Conference on Machine Learning, 2011

Regularized Least Squares Temporal Difference Learning with Nested ℓ2 and ℓ1 Penalization.

Recent Advances in Reinforcement Learning - 9th European Workshop, 2011

Upper-Confidence-Bound Algorithms for Active Learning in Multi-armed Bandits.

Algorithmic Learning Theory - 22nd International Conference, 2011

2010

Finite-sample Analysis of Bellman Residual Minimization.

Odalric-Ambrym Maillard

Proceedings of the 2nd Asian Conference on Machine Learning, 2010

LSTD with Random Projections.

Odalric-Ambrym Maillard

Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010. Proceedings of a meeting held 6-9 December 2010

Finite-Sample Analysis of LSTD.

Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010

Analysis of a Classification-based Policy Iteration Algorithm.

Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010

Bayesian Multi-Task Reinforcement Learning.

Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010

2009

Natural actor-critic algorithms.

Autom., 2009

Regularized Fitted Q-Iteration for planning in continuous-space Markovian decision problems.

Amir Massoud Farahmand

Shie Mannor

Proceedings of the American Control Conference, 2009

2008

Regularized Policy Iteration.

Amir Massoud Farahmand

Shie Mannor

Advances in Neural Information Processing Systems 21, 2008

Regularized Fitted Q-Iteration: Application to Planning.

Amir Massoud Farahmand

Shie Mannor

Recent Advances in Reinforcement Learning, 8th European Workshop, 2008

2007

Hierarchical Average Reward Reinforcement Learning.

J. Mach. Learn. Res., 2007

Incremental Natural Actor-Critic Algorithms.

Advances in Neural Information Processing Systems 20, 2007

Bayesian actor-critic algorithms.

Yaakov Engel

Proceedings of the Machine Learning, 2007

2006

Bayesian Policy Gradient Algorithms.

Yaakov Engel

Advances in Neural Information Processing Systems 19, 2006

2005

The Workshop Program at the Nineteenth National Conference on Artificial Intelligence.

AI Mag., 2005

2004

Learning to Communicate and Act Using Hierarchical Reinforcement Learning.

Proceedings of the 3rd International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2004), 2004

2003

Hierarchical Policy Gradient Algorithms.

Proceedings of the Machine Learning, 2003

2002

Hierarchically Optimal Average Reward Reinforcement Learning.

Proceedings of the Machine Learning, 2002

A multiagent reinforcement learning algorithm by dynamically merging Markov decision processes.

Proceedings of the First International Joint Conference on Autonomous Agents & Multiagent Systems, 2002

2001

Continuous-Time Hierarchical Reinforcement Learning.

Proceedings of the Eighteenth International Conference on Machine Learning (ICML 2001), Williams College, Williamstown, MA, USA, June 28, 2001

Hierarchical multi-agent reinforcement learning.

Rajbala Makar