We stand with Ukraine

Matteo Pirotta

According to our database¹, Matteo Pirotta authored at least 81 papers between 2011 and 2026.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of three.

Bibliography

2026

Compositional Planning with Jumpy World Models.

Jesse Farebrother

,

,

Andrea Tirinzoni

,

Marc G. Bellemare

,

Alessandro Lazaric

,

CoRR, February, 2026

2025

BFM-Zero: A Promptable Behavioral Foundation Model for Humanoid Control Using Unsupervised Reinforcement Learning.

,

,

,

,

Anssi Kanervisto

,

Andrea Tirinzoni

,

,

,

,

,

Alessandro Lazaric

,

,

CoRR, November, 2025

TD-JEPA: Latent-predictive Representations for Zero-Shot Reinforcement Learning.

Marco Bagatella

,

,

,

Alessandro Lazaric

,

Andrea Tirinzoni

CoRR, October, 2025

Fast Adaptation with Behavioral Foundation Models.

,

Andrea Tirinzoni

,

,

,

Anssi Kanervisto

,

,

,

Alessandro Lazaric

,

CoRR, April, 2025

Temporal Difference Flows.

Jesse Farebrother

,

,

Andrea Tirinzoni

,

,

Alessandro Lazaric

,

Proceedings of the Forty-second International Conference on Machine Learning, 2025

Zero-Shot Whole-Body Humanoid Control via Behavioral Foundation Models.

Andrea Tirinzoni

,

,

Jesse Farebrother

,

,

Anssi Kanervisto

,

,

Alessandro Lazaric

,

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

2024

Simple Ingredients for Offline Reinforcement Learning.

,

Andrea Tirinzoni

,

,

Alessandro Lazaric

,

,

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Fast Imitation via Behavior Foundation Models.

,

Andrea Tirinzoni

,

,

Alessandro Lazaric

,

Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023

Group Fairness in Reinforcement Learning.

,

Alessandro Lazaric

,

,

Trans. Mach. Learn. Res., 2023

Layered State Discovery for Incremental Autonomous Exploration.

,

Andrea Tirinzoni

,

Alessandro Lazaric

,

Proceedings of the International Conference on Machine Learning, 2023

Contextual bandits with concave rewards, and an application to fair ranking.

,

,

,

Alessandro Lazaric

,

Nicolas Usunier

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Reaching Goals is Hard: Settling the Sample Complexity of the Stochastic Shortest Path.

,

Andrea Tirinzoni

,

,

Alessandro Lazaric

Proceedings of the International Conference on Algorithmic Learning Theory, 2023

On the Complexity of Representation Learning in Contextual Linear Bandits.

Andrea Tirinzoni

,

,

Alessandro Lazaric

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023

2022

Smoothing policies and safe policy gradients.

,

,

Marcello Restelli

Mach. Learn., 2022

Improved Adaptive Algorithm for Scalable Active Learning with Weak Labeler.

,

Karthik Abinav Sankararaman

,

Alessandro Lazaric

,

,

Dmytro Karamshuk

,

,

Karishma Mandyam

,

,

CoRR, 2022

Scalable Representation Learning in Linear Contextual Bandits with Constant Regret Guarantees.

Andrea Tirinzoni

,

,

,

Alessandro Lazaric

,

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

A Reduction-Based Framework for Conservative Bandits and Reinforcement Learning.

,

,

,

Evrard Garcelon

,

,

Alessandro Lazaric

,

,

Simon Shaolei Du

Proceedings of the Tenth International Conference on Learning Representations, 2022

Privacy Amplification via Shuffling for Linear Contextual Bandits.

Evrard Garcelon

,

Kamalika Chaudhuri

,

Vianney Perchet

,

Proceedings of the International Conference on Algorithmic Learning Theory, 29 March, 2022

Adaptive Multi-Goal Exploration.

Jean Tarbouriech

,

Omar Darwiche Domingues

,

,

,

,

Alessandro Lazaric

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

Encrypted Linear Contextual Bandit.

Evrard Garcelon

,

,

Vianney Perchet

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

Top K Ranking for Multi-Armed Bandit with Noisy Evaluations.

Evrard Garcelon

,

Vashist Avadhanula

,

Alessandro Lazaric

,

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

2021

Safe Policy Iteration: A Monotonically Improving Approximate Policy Iteration Approach.

Alberto Maria Metelli

,

,

Daniele Calandriello

,

Marcello Restelli

J. Mach. Learn. Res., 2021

Gaussian Approximation for Bias Reduction in Q-Learning.

,

,

Alessandro Nuara

,

,

,

,

Marcello Restelli

J. Mach. Learn. Res., 2021

Differentially Private Exploration in Reinforcement Learning with Linear Representation.

,

Evrard Garcelon

,

Alessandro Lazaric

,

CoRR, 2021

A Fully Problem-Dependent Regret Lower Bound for Finite-Horizon MDPs.

Andrea Tirinzoni

,

,

Alessandro Lazaric

CoRR, 2021

A Unified Framework for Conservative Exploration.

,

,

,

Evrard Garcelon

,

,

Alessandro Lazaric

,

,

CoRR, 2021

Homomorphically Encrypted Linear Contextual Bandit.

Evrard Garcelon

,

Vianney Perchet

,

CoRR, 2021

Stochastic Shortest Path: Minimax, Parameter-Free and Towards Horizon-Free Regret.

Jean Tarbouriech

,

,

,

,

,

Alessandro Lazaric

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

A Provably Efficient Sample Collection Strategy for Reinforcement Learning.

Jean Tarbouriech

,

,

,

Alessandro Lazaric

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Reinforcement Learning in Linear MDPs: Constant Regret and Representation Selection.

,

Andrea Tirinzoni

,

,

Marcello Restelli

,

Alessandro Lazaric

,

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Local Differential Privacy for Regret Minimization in Reinforcement Learning.

Evrard Garcelon

,

Vianney Perchet

,

Ciara Pike-Burke

,

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Leveraging Good Representations in Linear Contextual Bandits.

,

Andrea Tirinzoni

,

Marcello Restelli

,

Alessandro Lazaric

,

Proceedings of the 38th International Conference on Machine Learning, 2021

Kernel-Based Reinforcement Learning: A Finite-Time Analysis.

Omar Darwiche Domingues

,

,

,

Emilie Kaufmann

,

Proceedings of the 38th International Conference on Machine Learning, 2021

Sample Complexity Bounds for Stochastic Shortest Path with a Generative Model.

Jean Tarbouriech

,

,

,

Alessandro Lazaric

Proceedings of the International Conference on Algorithmic Learning Theory, 2021

A Kernel-Based Approach to Non-Stationary Reinforcement Learning in Metric Spaces.

Omar Darwiche Domingues

,

,

,

Emilie Kaufmann

,

Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

2020

On the use of the policy gradient and Hessian in inverse reinforcement learning.

Alberto Maria Metelli

,

,

Marcello Restelli

Intelligenza Artificiale, 2020

Local Differentially Private Regret Minimization in Reinforcement Learning.

Evrard Garcelon

,

Vianney Perchet

,

Ciara Pike-Burke

,

CoRR, 2020

Improved Analysis of UCRL2 with Empirical Bernstein Inequality.

,

,

Alessandro Lazaric

CoRR, 2020

Learning Adaptive Exploration Strategies in Dynamic Environments Through Informed Policy Regularization.

Pierre-Alexandre Kamienny

,

,

Alessandro Lazaric

,

Thibault Lavril

,

Nicolas Usunier

,

Ludovic Denoyer

CoRR, 2020

Regret Bounds for Kernel-Based Reinforcement Learning.

Omar Darwiche Domingues

,

,

,

Emilie Kaufmann

,

CoRR, 2020

Exploration-Exploitation in Constrained MDPs.

Yonathan Efroni

,

,

CoRR, 2020

Concentration Inequalities for Multinoulli Random Variables.

,

,

,

Alessandro Lazaric

CoRR, 2020

Exploiting Language Instructions for Interpretable and Compositional Reinforcement Learning.

Michiel van der Meer

,

,

CoRR, 2020

Active Model Estimation in Markov Decision Processes.

Jean Tarbouriech

,

Shubhanshu Shekhar

,

,

Mohammad Ghavamzadeh

,

Alessandro Lazaric

Proceedings of the Thirty-Sixth Conference on Uncertainty in Artificial Intelligence, 2020

An Asymptotically Optimal Primal-Dual Incremental Algorithm for Contextual Linear Bandits.

Andrea Tirinzoni

,

,

Marcello Restelli

,

Alessandro Lazaric

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Improved Sample Complexity for Incremental Autonomous Exploration in MDPs.

Jean Tarbouriech

,

,

,

Alessandro Lazaric

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Adversarial Attacks on Linear Contextual Bandits.

Evrard Garcelon

,

Baptiste Rozière

,

Laurent Meunier

,

Jean Tarbouriech

,

Olivier Teytaud

,

Alessandro Lazaric

,

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

No-Regret Exploration in Goal-Oriented Reinforcement Learning.

Jean Tarbouriech

,

Evrard Garcelon

,

,

,

Alessandro Lazaric

Proceedings of the 37th International Conference on Machine Learning, 2020

Frequentist Regret Bounds for Randomized Least-Squares Value Iteration.

,

David Brandfonbrener

,

,

,

Alessandro Lazaric

Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

Conservative Exploration in Reinforcement Learning.

Evrard Garcelon

,

Mohammad Ghavamzadeh

,

Alessandro Lazaric

,

Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

Improved Algorithms for Conservative Exploration in Bandits.

Evrard Garcelon

,

Mohammad Ghavamzadeh

,

Alessandro Lazaric

,

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019

Frequentist Regret Bounds for Randomized Least-Squares Value Iteration.

,

David Brandfonbrener

,

,

Alessandro Lazaric

CoRR, 2019

Exploration Bonus for Regret Minimization in Discrete and Continuous Average Reward MDPs.

,

,

,

Alessandro Lazaric

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Regret Bounds for Learning State Representations in Reinforcement Learning.

,

,

Alessandro Lazaric

,

,

Odalric-Ambrym Maillard

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

2018

Exploration Bonus for Regret Minimization in Undiscounted Discrete and Continuous Markov Decision Processes.

,

,

,

Alessandro Lazaric

CoRR, 2018

Near Optimal Exploration-Exploitation in Non-Communicating Markov Decision Processes.

,

,

Alessandro Lazaric

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Does Reinforcement Learning outperform PID in the control of FES-induced elbow flex-extension?

Davide Di Febbo

,

Emilia Ambrosini

,

,

,

Marcello Restelli

,

Alessandra Laura Giulia Pedrocchi

,

Simona Ferrante

Proceedings of the 2018 IEEE International Symposium on Medical Measurements and Applications, 2018

Importance Weighted Transfer of Samples in Reinforcement Learning.

Andrea Tirinzoni

,

,

,

Marcello Restelli

Proceedings of the 35th International Conference on Machine Learning, 2018

Stochastic Variance-Reduced Policy Gradient.

,

Damiano Binaghi

,

Giuseppe Canonaco

,

,

Marcello Restelli

Proceedings of the 35th International Conference on Machine Learning, 2018

Efficient Bias-Span-Constrained Exploration-Exploitation in Reinforcement Learning.

,

,

Alessandro Lazaric

,

Proceedings of the 35th International Conference on Machine Learning, 2018

2017

Manifold-based multi-objective policy search with sample reuse.

,

,

Neurocomputing, 2017

Cost-Sensitive Approach to Batch Size Adaptation for Gradient Descent.

,

Marcello Restelli

CoRR, 2017

Gradient-based minimization for multi-expert Inverse Reinforcement Learning.

,

,

Marcello Restelli

,

Andrea Bonarini

Proceedings of the 2017 IEEE Symposium Series on Computational Intelligence, 2017

Adaptive Batch Size for Safe Policy Gradients.

,

,

Marcello Restelli

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Compatible Reward Inverse Reinforcement Learning.

Alberto Maria Metelli

,

,

Marcello Restelli

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Regret Minimization in MDPs with Options without Prior Knowledge.

,

,

Alessandro Lazaric

,

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Boosted Fitted Q-Iteration.

Samuele Tosatto

,

,

,

Marcello Restelli

Proceedings of the 34th International Conference on Machine Learning, 2017

Estimating the Maximum Expected Value in Continuous Reinforcement Learning Problems.

,

Alessandro Nuara

,

,

Marcello Restelli

Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016

Reinforcement learning: from theory to algorithms.

PhD thesis, 2016

Policy Search for the Optimal Control of Markov Decision Processes: A Novel Particle-Based Iterative Scheme.

Giorgio Manganini

,

,

Marcello Restelli

,

,

IEEE Trans. Cybern., 2016

Multi-objective Reinforcement Learning through Continuous Pareto Manifold Approximation.

,

,

Marcello Restelli

J. Artif. Intell. Res., 2016

Inverse Reinforcement Learning through Policy Gradient Minimization.

,

Marcello Restelli

Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

2015

Policy gradient in Lipschitz Markov Decision Processes.

,

Marcello Restelli

,

Mach. Learn., 2015

Following Newton direction in Policy Gradient with parameter exploration.

Giorgio Manganini

,

,

Marcello Restelli

,

Proceedings of the 2015 International Joint Conference on Neural Networks, 2015

Optimal control to reduce emissions in gasoline engines: an iterative learning control approach for ECU calibration maps improvement.

Danilo Caporale

,

,

,

Alessandro Falsone

,

Riccardo Vignali

,

,

,

Giorgio Manganini

Proceedings of the 14th European Control Conference, 2015

Multi-Objective Reinforcement Learning with Continuous Pareto Frontier Approximation.

,

,

Marcello Restelli

Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015

2014

Policy gradient approaches for multi-objective sequential decision making.

,

,

Nicola Smacchia

,

,

Marcello Restelli

Proceedings of the 2014 International Joint Conference on Neural Networks, 2014

Policy gradient approaches for multi-objective sequential decision making: A comparison.

,

,

Nicola Smacchia

,

,

Marcello Restelli

Proceedings of the 2014 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, 2014

2013

Adaptive Step-Size for Policy Gradient Methods.

,

Marcello Restelli

,

Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

Safe Policy Iteration.

,

Marcello Restelli

,

Alessio Pecorino

,

Daniele Calandriello

Proceedings of the 30th International Conference on Machine Learning, 2013

2011

Fitted policy search.

Martino Migliavacca

,

Alessio Pecorino

,

,

Marcello Restelli

,

Andrea Bonarini

Proceedings of the 2011 IEEE Symposium on Adaptive Dynamic Programming And Reinforcement Learning, 2011
