Julian Zimmert

Róbert Istvan Busa-Fekete

CoRR, February, 2026

TBDFiltering: Sample-Efficient Tree-Based Data Filtering.

CoRR, January, 2026

2025

An Improved Model-Free Decision-Estimation Coefficient with Applications in Adversarial MDPs.

CoRR, October, 2025

Beyond the Rosetta Stone: Unification Forces in Generalization Dynamics.

CoRR, August, 2025

Non-stationary Bandit Convex Optimization: A Comprehensive Study.

CoRR, June, 2025

A Scalable Crawling Algorithm Utilizing Noisy Change-Indicating Signals.

Proceedings of the ACM on Web Conference 2025, 2025

Contextual Dynamic Pricing with Heterogeneous Buyers.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

Decision Making in Hybrid Environments: A Model Aggregation Approach.

Proceedings of the Thirty Eighth Annual Conference on Learning Theory, 2025

2024

Incentive-compatible Bandits: Importance Weighting No More.

Teodor V. Marinov

CoRR, 2024

PRODuctive bandits: Importance Weighting No More.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

A Best-of-both-worlds Algorithm for Bandits with Delayed Feedback with Robustness to Excessive Delays.

Saeed Masoudian

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Beating Adversarial Low-Rank MDPs with Unknown Transition and Bandit Feedback.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Towards Optimal Regret in Adversarial Linear MDPs with Bandit Feedback.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023

An Improved Best-of-both-worlds Algorithm for Bandits with Delayed Feedback.

Saeed Masoudian

CoRR, 2023

Optimal cross-learning for contextual bandits with unknown context distributions.

Jon Schneider

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Bypassing the Simulator: Near-Optimal Adversarial Linear Contextual Bandits.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Best of Both Worlds Policy Optimization.

Proceedings of the International Conference on Machine Learning, 2023

Refined Regret for Adversarial MDPs with Linear Function Approximation.

Proceedings of the International Conference on Machine Learning, 2023

A Blackbox Approach to Best of Both Worlds in Bandits and Beyond.

Proceedings of the Thirty Sixth Annual Conference on Learning Theory, 2023

A Unified Algorithm for Stochastic Path Problems.

Proceedings of the International Conference on Algorithmic Learning Theory, 2023

2022

A Best-of-Both-Worlds Algorithm for Bandits with Delayed Feedback.

Saeed Masoudian

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Stochastic Online Learning with Feedback Graphs: Finite-Time and Asymptotic Optimality.

Mehryar Mohri

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Return of the bias: Almost minimax optimal high probability bounds for adversarial linear bandits.

Tor Lattimore

Proceedings of the Conference on Learning Theory, 2-5 July 2022, London, UK, 2022

Pushing the Efficiency-Regret Pareto Frontier for Online Learning of Portfolios and Quantum States.

Naman Agarwal

Satyen Kale

Proceedings of the Conference on Learning Theory, 2-5 July 2022, London, UK, 2022

A Model Selection Approach for Corruption Robust Reinforcement Learning.

Proceedings of the International Conference on Algorithmic Learning Theory, 29 March, 2022

Efficient Methods for Online Multiclass Logistic Regression.

Naman Agarwal

Satyen Kale

Proceedings of the International Conference on Algorithmic Learning Theory, 29 March, 2022

2021

Tsallis-INF: An Optimal Algorithm for Stochastic and Adversarial Bandits.

J. Mach. Learn. Res., 2021

The Pareto Frontier of model selection for general Contextual Bandits.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

A Provably Efficient Model-Free Posterior Sampling Method for Episodic Reinforcement Learning.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Beyond Value-Function Gaps: Improved Instance-Dependent Regret Bounds for Episodic Reinforcement Learning.

Mehryar Mohri

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

2020

Model Selection in Contextual Stochastic Bandit Problems.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Adapting to Misspecification in Contextual Bandits.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Online Learning for Active Cache Synchronization.

Andrey Kolobov

Sébastien Bubeck

Proceedings of the 37th International Conference on Machine Learning, 2020

An Optimal Algorithm for Adversarial Bandits with Arbitrary Delays.

Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

2019

Connections Between Mirror Descent, Thompson Sampling and the Information Ratio.

Tor Lattimore

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Beating Stochastic and Adversarial Semi-bandits Optimally and Simultaneously.

Haipeng Luo

Proceedings of the 36th International Conference on Machine Learning, 2019

An Optimal Algorithm for Stochastic and Adversarial Bandits.

Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

2018

Factored Bandits.
