Andrea Zanette

According to our database, Andrea Zanette authored at least 20 papers between 2018 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL.
CoRR, 2024

Is Offline Decision Making Possible with Only Few Samples? Reliable Decisions in Data-Starved Bandits via Trust Region Enhancement.
CoRR, 2024

2023
Policy Finetuning in Reinforcement Learning via Design of Experiments using Offline Data.
Advances in Neural Information Processing Systems 36 (NeurIPS), 2023

When is Realizability Sufficient for Off-Policy Reinforcement Learning?
Proceedings of the 40th International Conference on Machine Learning, 2023

2022
Stabilizing Q-learning with Linear Architectures for Provably Efficient Learning.
CoRR, 2022

Bellman Residual Orthogonalization for Offline Reinforcement Learning.
Advances in Neural Information Processing Systems 35 (NeurIPS), 2022

Stabilizing Q-learning with Linear Architectures for Provable Efficient Learning.
Proceedings of the 39th International Conference on Machine Learning, 2022

2021
Provable Benefits of Actor-Critic Methods for Offline Reinforcement Learning.
Advances in Neural Information Processing Systems 34 (NeurIPS), 2021

Design of Experiments for Stochastic Contextual Linear Bandits.
Advances in Neural Information Processing Systems 34 (NeurIPS), 2021

Exponential Lower Bounds for Batch Reinforcement Learning: Batch RL can be Exponentially Harder than Online RL.
Proceedings of the 38th International Conference on Machine Learning, 2021

Cautiously Optimistic Policy Optimization and Exploration with Linear Function Approximation.
Proceedings of the Conference on Learning Theory, 2021

2020
Provably Efficient Reward-Agnostic Navigation with Linear Value Iteration.
Advances in Neural Information Processing Systems 33 (NeurIPS), 2020

Learning Near Optimal Policies with Low Inherent Bellman Error.
Proceedings of the 37th International Conference on Machine Learning, 2020

Frequentist Regret Bounds for Randomized Least-Squares Value Iteration.
Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

2019
Frequentist Regret Bounds for Randomized Least-Squares Value Iteration.
CoRR, 2019

Limiting Extrapolation in Linear Approximate Value Iteration.
Advances in Neural Information Processing Systems 32 (NeurIPS), 2019

Almost Horizon-Free Structure-Aware Best Policy Identification with a Generative Model.
Advances in Neural Information Processing Systems 32 (NeurIPS), 2019

Tighter Problem-Dependent Regret Bounds in Reinforcement Learning without Domain Knowledge using Value Function Bounds.
Proceedings of the 36th International Conference on Machine Learning, 2019

2018
Robust Super-Level Set Estimation Using Gaussian Processes.
Machine Learning and Knowledge Discovery in Databases, 2018

Problem Dependent Reinforcement Learning Bounds Which Can Identify Bandit Structure in MDPs.
Proceedings of the 35th International Conference on Machine Learning, 2018
