Matteo Papini

ORCID: 0000-0002-3807-3171

According to our database, Matteo Papini authored at least 22 papers between 2017 and 2024.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2024
Optimistic Information Directed Sampling.
CoRR, 2024

No-Regret Reinforcement Learning in Smooth MDPs.
CoRR, 2024

2023
Importance-Weighted Offline Learning Done Right.
CoRR, 2023

Offline Primal-Dual Reinforcement Learning for Linear MDPs.
CoRR, 2023

Online Learning with Off-Policy Feedback.
Proceedings of the International Conference on Algorithmic Learning Theory, 2023

2022
Smoothing policies and safe policy gradients.
Mach. Learn., 2022

Scalable Representation Learning in Linear Contextual Bandits with Constant Regret Guarantees.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Lifting the Information Ratio: An Information-Theoretic Analysis of Thompson Sampling for Contextual Bandits.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

2021
Safe policy optimization.
PhD thesis, 2021

Reinforcement Learning in Linear MDPs: Constant Regret and Representation Selection.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Leveraging Good Representations in Linear Contextual Bandits.
Proceedings of the 38th International Conference on Machine Learning, 2021

Automated Reasoning for Reinforcement Learning Agents in Structured Environments.
Proceedings of the 3rd Workshop on Artificial Intelligence and Formal Verification, 2021

Policy Optimization as Online Learning with Mediator Feedback.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Importance Sampling Techniques for Policy Optimization.
J. Mach. Learn. Res., 2020

Risk-Averse Trust Region Optimization for Reward-Volatility Reduction.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Balancing Learning Speed and Stability in Policy Gradient via Adaptive Exploration.
Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

Gradient-Aware Model-Based Policy Search.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Feature Selection via Mutual Information: New Theoretical Insights.
Proceedings of the International Joint Conference on Neural Networks, 2019

Optimistic Policy Optimization via Multiple Importance Sampling.
Proceedings of the 36th International Conference on Machine Learning, 2019

2018
Policy Optimization via Importance Sampling.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Stochastic Variance-Reduced Policy Gradient.
Proceedings of the 35th International Conference on Machine Learning, 2018

2017
Adaptive Batch Size for Safe Policy Gradients.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

