Matteo Papini

ORCID: 0000-0002-3807-3171

According to our database, Matteo Papini authored at least 22 papers between 2017 and 2024.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2024
Optimistic Information Directed Sampling.
CoRR, 2024

No-Regret Reinforcement Learning in Smooth MDPs.
CoRR, 2024

2023
Importance-Weighted Offline Learning Done Right.
CoRR, 2023

Offline Primal-Dual Reinforcement Learning for Linear MDPs.
CoRR, 2023

Online Learning with Off-Policy Feedback.
Proceedings of the International Conference on Algorithmic Learning Theory, 2023

2022
Smoothing policies and safe policy gradients.
Mach. Learn., 2022

Scalable Representation Learning in Linear Contextual Bandits with Constant Regret Guarantees.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Lifting the Information Ratio: An Information-Theoretic Analysis of Thompson Sampling for Contextual Bandits.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

2021
Safe policy optimization.
PhD thesis, 2021

Reinforcement Learning in Linear MDPs: Constant Regret and Representation Selection.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Leveraging Good Representations in Linear Contextual Bandits.
Proceedings of the 38th International Conference on Machine Learning, 2021

Automated Reasoning for Reinforcement Learning Agents in Structured Environments.
Proceedings of the 3rd Workshop on Artificial Intelligence and Formal Verification, 2021

Policy Optimization as Online Learning with Mediator Feedback.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Importance Sampling Techniques for Policy Optimization.
J. Mach. Learn. Res., 2020

Risk-Averse Trust Region Optimization for Reward-Volatility Reduction.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Balancing Learning Speed and Stability in Policy Gradient via Adaptive Exploration.
Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

Gradient-Aware Model-Based Policy Search.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Feature Selection via Mutual Information: New Theoretical Insights.
Proceedings of the International Joint Conference on Neural Networks, 2019

Optimistic Policy Optimization via Multiple Importance Sampling.
Proceedings of the 36th International Conference on Machine Learning, 2019

2018
Policy Optimization via Importance Sampling.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Stochastic Variance-Reduced Policy Gradient.
Proceedings of the 35th International Conference on Machine Learning, 2018

2017
Adaptive Batch Size for Safe Policy Gradients.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

