Marco Mussi

CoRR, September, 2025

Generalized Kernelized Bandits: Self-Normalized Bernstein-Like Dimension-Free Inequality and Regret Bounds.

CoRR, August, 2025

Gym4ReaL: A Suite for Benchmarking Real-World Reinforcement Learning.

CoRR, July, 2025

Reusing Trajectories in Policy Gradients Enables Fast Convergence.

Federico Mansutti

CoRR, June, 2025

Learning Deterministic Policies with Policy Gradients in Constrained Markov Decision Processes.

Leonardo Cesani

CoRR, June, 2025

A Refined Analysis of UCBVI.

CoRR, February, 2025

Generalizing the Regret: an Analysis of Lower and Upper Bounds.

J. Artif. Intell. Res., 2025

Factored-reward bandits with intermediate observations: Regret minimization and best arm identification.

Artif. Intell., 2025

Convergence Analysis of Policy Gradient Methods with Dynamic Stochasticity.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

Sleeping Reinforcement Learning.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

Towards Theoretical Understanding of Sequential Decision Making with Preference Feedback.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

2024

Bridging Rested and Restless Bandits with Graph-Triggering: Rising and Rotting.

CoRR, 2024

State and Action Factorization in Power Grids.

Gianvito Losapio

Davide Beretta

CoRR, 2024

Open Problem: Tight Bounds for Kernelized Multi-Armed Bandits with Bernoulli Rewards.

CoRR, 2024

Last-Iterate Global Convergence of Policy Gradients for Constrained Reinforcement Learning.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Best Arm Identification for Stochastic Rising Bandits.

Francesco Trovò

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Factored-Reward Bandits with Intermediate Observations.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Learning Optimal Deterministic Policies with Stochastic Policy Gradients.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Graph-Triggered Rising Bandits.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Autoregressive Bandits.

Francesco Bacchiocchi

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2024

2023

ARLO: A framework for Automated Reinforcement Learning.

Davide Lombarda

Francesco Trovò

Expert Syst. Appl., August, 2023

Dynamical Linear Bandits.
