Nadav Merlis

According to our database, Nadav Merlis authored at least 16 papers between 2018 and 2024.

Bibliography

2024
The Value of Reward Lookahead in Reinforcement Learning.
CoRR, 2024

2023
Ranking with Popularity Bias: User Welfare under Self-Amplification Dynamics.
CoRR, 2023

Reinforcement Learning with History Dependent Dynamic Contexts.
Proceedings of the International Conference on Machine Learning, 2023

On Preemption and Learning in Stochastic Scheduling.
Proceedings of the International Conference on Machine Learning, 2023

Never Worse, Mostly Better: Stable Policy Improvement in Deep Reinforcement Learning.
Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, 2023

2022
Reinforcement Learning with a Terminator.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

2021
Dare not to Ask: Problem-Dependent Guarantees for Budgeted Bandits.
CoRR, 2021

Ensemble Bootstrapping for Q-Learning.
Proceedings of the 38th International Conference on Machine Learning, 2021

Confidence-Budget Matching for Sequential Budgeted Learning.
Proceedings of the 38th International Conference on Machine Learning, 2021

Lenient Regret for Multi-Armed Bandits.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Reinforcement Learning with Trajectory Feedback.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Tight Lower Bounds for Combinatorial Multi-Armed Bandits.
Proceedings of the Conference on Learning Theory, 2020

2019
Stabilizing Off-Policy Reinforcement Learning with Conservative Policy Gradients.
CoRR, 2019

Tight Regret Bounds for Model-Based Reinforcement Learning with Greedy Policies.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Batch-Size Independent Regret Bounds for the Combinatorial Multi-Armed Bandit Problem.
Proceedings of the Conference on Learning Theory, 2019

2018
Learn What Not to Learn: Action Elimination with Deep Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

