Tom Zahavy

ORCID: 0009-0009-2309-922X

According to our database, Tom Zahavy authored at least 49 papers between 2014 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2025

Generating Creative Chess Puzzles.

CoRR, October, 2025

Evaluating In Silico Creativity: An Expert Review of AI Chess Compositions.

CoRR, October, 2025

Mastering Board Games by External and Internal Planning with Language Models.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

2023

POMRL: No-Regret Learning-to-Plan with Increasing Horizons.

Trans. Mach. Learn. Res., 2023

APART: Diverse Skill Discovery using All Pairs with Ascending Reward and DropouT.

Hadar Schreiber Galler

Tom Zahavy

Guillaume Desjardins

Alon Cohen

CoRR, 2023

Diversifying AI: Towards Creative Chess with AlphaZero.

CoRR, 2023

Optimism and Adaptivity in Policy Optimization.

CoRR, 2023

Optimistic Meta-Gradients.

Sebastian Flennerhag

Tom Zahavy

Brendan O'Donoghue

Hado Philip van Hasselt

András György

Satinder Singh

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

ReLOAD: Reinforcement Learning with Optimistic Ascent-Descent for Last-Iterate Convergence in Constrained MDPs.

Proceedings of the International Conference on Machine Learning, 2023

Discovering Policies with DOMiNO: Diversity Optimization Maintaining Near Optimality.

Tom Zahavy

Yannick Schroecker

Feryal M. P. Behbahani

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Discovering Evolution Strategies via Meta-Black-Box Optimization.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Discovering Attention-Based Genetic Algorithms via Meta-Black-Box Optimization.

Proceedings of the Genetic and Evolutionary Computation Conference, 2023

2022

Palm up: Playing in the Latent Manifold for Unsupervised Pretraining.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Bootstrapped Meta-Learning.

Proceedings of the Tenth International Conference on Learning Representations, 2022

Meta-Gradients in Non-Stationary Environments.

Proceedings of the Conference on Lifelong Learning Agents, 2022

Online Apprenticeship Learning.

Lior Shani

Tom Zahavy

Shie Mannor

Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021

Inverse reinforcement learning in contextual MDPs.

Mach. Learn., 2021

Discovering Diverse Nearly Optimal Policies with Successor Features.

CoRR, 2021

Reward is enough for convex MDPs.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Discovery of Options via Meta-Learned Subgoals.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Online Limited Memory Neural-Linear Bandits with Likelihood Matching.

Ofir Nabati

Tom Zahavy

Shie Mannor

Proceedings of the 38th International Conference on Machine Learning, 2021

Emphatic Algorithms for Deep Reinforcement Learning.

Proceedings of the 38th International Conference on Machine Learning, 2021

Discovering a set of policies for the worst case reward.

Proceedings of the 9th International Conference on Learning Representations, 2021

Balancing Constraints and Rewards with Meta-Gradient D4PG.

Proceedings of the 9th International Conference on Learning Representations, 2021

2020

Self-Tuning Deep Reinforcement Learning.

CoRR, 2020

Unknown mixing times in apprenticeship and reinforcement learning.

Proceedings of the Thirty-Sixth Conference on Uncertainty in Artificial Intelligence, 2020

A Self-Tuning Actor-Critic Algorithm.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Learning to Ask Medical Questions using Reinforcement Learning.

Proceedings of the Machine Learning for Healthcare Conference, 2020

Planning in Hierarchical Reinforcement Learning: Guarantees for Using Local Policies.

Proceedings of the Algorithmic Learning Theory, 2020

Apprenticeship Learning via Frank-Wolfe.

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019

Deep learning reconstruction of ultrashort pulses from 2D spatial intensity patterns recorded by an all-in-line system in a single-shot.

CoRR, 2019

Average reward reinforcement learning with unknown mixing times.

CoRR, 2019

Action Assembly: Sparse Imitation Learning for Text Based Games with Combinatorial Action Spaces.

CoRR, 2019

Deep Neural Linear Bandits: Overcoming Catastrophic Forgetting through Likelihood Matching.

Tom Zahavy

Shie Mannor

CoRR, 2019

2018

Deep Learning Reconstruction of Ultra-Short Pulses.

CoRR, 2018

Hierarchical Reinforcement Learning: Approximating Optimal Discounted TSP Using Local Policies.

CoRR, 2018

Train on Validation: Squeezing the Data Lemon.

Guy Tennenholtz

Tom Zahavy

Shie Mannor

CoRR, 2018

Learn What Not to Learn: Action Elimination with Deep Reinforcement Learning.

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Ensemble Robustness and Generalization of Stochastic Deep Learning Algorithms.

Proceedings of the 6th International Conference on Learning Representations, 2018

Learning How Not to Act in Text-based Games.

Proceedings of the 6th International Conference on Learning Representations, 2018

Is a Picture Worth a Thousand Words? A Deep Multi-Modal Architecture for Product Classification in E-Commerce.

Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017

Shallow Updates for Deep Reinforcement Learning.

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

A Deep Hierarchical Approach to Lifelong Learning in Minecraft.

Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016

Visualizing Dynamics: from t-SNE to SEMI-MDPs.

Nir Ben-Zrihem

Tom Zahavy

Shie Mannor

CoRR, 2016

Is a picture worth a thousand words? A Deep Multi-Modal Fusion Architecture for Product Classification in e-commerce.

CoRR, 2016

Ensemble Robustness of Deep Learning Algorithms.

CoRR, 2016

Deep Reinforcement Learning Discovers Internal Models.

Nir Baram

Tom Zahavy

Shie Mannor

CoRR, 2016

Graying the black box: Understanding DQNs.

Tom Zahavy

Nir Ben-Zrihem

Shie Mannor

Proceedings of the 33rd International Conference on Machine Learning, 2016

2014

Sub-Nyquist sampling of OFDM signals for cognitive radios.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2014
