Lior Shani

According to our database, Lior Shani authored at least 19 papers between 2018 and 2026.

Bibliography

2026
Latent Reasoning with Supervised Thinking States.
CoRR, February, 2026

2025
Reinforcement Learning with Discrete Diffusion Policies for Combinatorial Action Spaces.
CoRR, September, 2025

Enhancing Personalized Multi-Turn Dialogue with Curiosity Reward.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

2024
Embedding-Aligned Language Models.
CoRR, 2024

Offline Regularised Reinforcement Learning for Large Language Models Alignment.
CoRR, 2024

Multi-turn Reinforcement Learning from Preference Human Feedback.
CoRR, 2024

Embedding-Aligned Language Models.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Multi-turn Reinforcement Learning with Preference Human Feedback.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Demystifying Embedding Spaces using Large Language Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023
Reinforcement Learning with History Dependent Dynamic Contexts.
Proceedings of the International Conference on Machine Learning, 2023

Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
Reinforcement Learning with a Terminator.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Mirror Descent Policy Optimization.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Online Apprenticeship Learning.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2020
Optimistic Policy Optimization with Bandit Feedback.
Proceedings of the 37th International Conference on Machine Learning, 2020

Adaptive Trust Region Policy Optimization: Global Convergence and Faster Rates for Regularized MDPs.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Exploration Conscious Reinforcement Learning Revisited.
Proceedings of the 36th International Conference on Machine Learning, 2019

2018
Multi Instance Learning For Unbalanced Data.
CoRR, 2018

Revisiting Exploration-Conscious Reinforcement Learning.
CoRR, 2018

