Yonathan Efroni

CoRR, September, 2025

Simple Optimizers for Convex Aligned Multi-Objective Optimization.

Ben Kretzu

Karen Ullrich

CoRR, September, 2025

Aligned Multi Objective Optimization.

CoRR, February, 2025

Aligned Multi Objective Optimization.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

Exploiting Structure in Offline Multi-Agent RL: The Benefits of Low Interaction Rank.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Time After Time: Deep-Q Effect Estimation for Interventions on When and What to do.

Yoav Wald

Mark Goldstein

Wouter A. C. van Amsterdam

Rajesh Ranganath

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

2024

Pearl: A Production-Ready Reinforcement Learning Agent.

Zheqing Zhu

Rodrigo de Salvo Braz

J. Mach. Learn. Res., 2024

Generalizing Multi-Step Inverse Models for Representation Learning to Finite-Memory POMDPs.

CoRR, 2024

The Bias of Harmful Label Associations in Vision-Language Models.

CoRR, 2024

RL in Latent MDPs is Tractable: Online Guarantees via Off-Policy Evaluation.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Prospective Side Information for Latent MDPs.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

PcLast: Discovering Plannable Continuous Latent States.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

2023

Guaranteed Discovery of Control-Endogenous Latent States with Multi-Step Inverse Models.

Alex Lamb

Riashat Islam

Aniket Rajiv Didolkar

Trans. Mach. Learn. Res., 2023

Reward-Mixing MDPs with Few Latent Contexts are Learnable.

Proceedings of the International Conference on Machine Learning, 2023

Principled Offline RL in the Presence of Rich Exogenous Information.

Aniket Rajiv Didolkar

Dipendra Misra

Xin Li

Harm van Seijen

Remi Tachet des Combes

John Langford

Proceedings of the International Conference on Machine Learning, 2023

2022

Agent-Controller Representations: Principled Offline RL with Rich Exogenous Information.

Remi Tachet des Combes

John Langford

CoRR, 2022

Guaranteed Discovery of Controllable Latent States with Multi-Step Inverse Models.

CoRR, 2022

Tractable Optimality in Episodic Latent MABs.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Coordinated Attacks against Contextual Bandits: Fundamental Limits and Defense Mechanisms.

Proceedings of the International Conference on Machine Learning, 2022

Sparsity in Partially Controllable Linear Systems.

Proceedings of the International Conference on Machine Learning, 2022

Provable Reinforcement Learning with a Short-Term Memory.

Proceedings of the International Conference on Machine Learning, 2022

Mirror Descent Policy Optimization.

Proceedings of the Tenth International Conference on Learning Representations, 2022

Provably Filtering Exogenous Distractors using Multistep Inverse Dynamics.

Proceedings of the Tenth International Conference on Learning Representations, 2022

Sample-Efficient Reinforcement Learning in the Presence of Exogenous Information.

Proceedings of the Conference on Learning Theory, 2-5 July 2022, London, UK, 2022

2021

Provable RL with Exogenous Distractors via Multistep Inverse Dynamics.

CoRR, 2021

Dare not to Ask: Problem-Dependent Guarantees for Budgeted Bandits.

Nadav Merlis

CoRR, 2021

Bandits with partially observable confounded data.

Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence, 2021

RL for Latent MDPs: Regret Guarantees and a Lower Bound.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Reinforcement Learning in Reward-Mixing MDPs.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Minimax Regret for Stochastic Shortest Path.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Confidence-Budget Matching for Sequential Budgeted Learning.

Proceedings of the 38th International Conference on Machine Learning, 2021

Reinforcement Learning with Trajectory Feedback.

Nadav Merlis

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020

Bandits with Partially Observable Offline Data.

CoRR, 2020

Exploration-Exploitation in Constrained MDPs.

Matteo Pirotta

CoRR, 2020

Online Planning with Lookahead Policies.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Multi-step Greedy Reinforcement Learning Algorithms.

Manan Tomar

Proceedings of the 37th International Conference on Machine Learning, 2020

Optimistic Policy Optimization with Bandit Feedback.

Proceedings of the 37th International Conference on Machine Learning, 2020

Adaptive Trust Region Policy Optimization: Global Convergence and Faster Rates for Regularized MDPs.

Lior Shani

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019

Multi-step Greedy Policies in Model-Free Deep Reinforcement Learning.

Manan Tomar

CoRR, 2019

Multi-Step Greedy and Approximate Real Time Dynamic Programming.

CoRR, 2019

Tight Regret Bounds for Model-Based Reinforcement Learning with Greedy Policies.

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Action Robust Reinforcement Learning and Applications in Continuous Control.

Chen Tessler

Proceedings of the 36th International Conference on Machine Learning, 2019

Exploration Conscious Reinforcement Learning Revisited.

Lior Shani

Proceedings of the 36th International Conference on Machine Learning, 2019

How to Combine Tree-Search Methods in Reinforcement Learning.

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018

Revisiting Exploration-Conscious Reinforcement Learning.

Lior Shani