Asaf Yehudai

According to our database, Asaf Yehudai authored at least 30 papers between 2021 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

A Matter of TASTE: Improving Coverage and Difficulty of Agent Benchmarks.

Michal Shmueli-Scheuer

CoRR, May, 2026

Agentic CLEAR: Automating Multi-Level Evaluation of LLM Agents.

Michal Shmueli-Scheuer

CoRR, May, 2026

Time to REFLECT: Can We Trust LLM Judges for Evidence-based Research Agents?

Michal Shmueli-Scheuer

CoRR, May, 2026

Growing Pains: Extensible and Efficient LLM Benchmarking Via Fixed Parameter Calibration.

Michal Shmueli-Scheuer, Gabriel Stanovsky

CoRR, April, 2026

CUBE: A Standard for Unifying Agent Benchmarks.

CoRR, March, 2026

General Agent Evaluation.

Yehoshua Sagron, Natalia Razinkov, Shlomit Shachor Ifergan, Michal Shmueli-Scheuer

CoRR, February, 2026

Will it Merge? On The Causes of Model Mergeability.

Yonatan Belinkov

CoRR, January, 2026

Mediocrity is the key for LLM as a Judge Anchor Selection.

Shachar Don-Yehiya

Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2026

CLEAR: Error Analysis via LLM-as-a-Judge Made Easy.

Michal Shmueli-Scheuer

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025

Guided Query Refinement: Multimodal Hybrid Retrieval with Test-Time Optimization.

CoRR, October, 2025

Survey on Evaluation of LLM-based Agents.

Michal Shmueli-Scheuer

CoRR, March, 2025

WildIFEval: Instruction Following in the Wild.

CoRR, March, 2025

The Mighty ToRR: A Benchmark for Table Reasoning and Robustness.

Shir Ashury-Tahan, Michal Shmueli-Scheuer

CoRR, February, 2025

Selective Self-to-Supervised Fine-Tuning for Generalization in Large Language Models.

Dinesh Khandelwal, Sachindra Joshi

Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025

JuStRank: Benchmarking LLM Judges for System Ranking.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024

Selective Self-Rehearsal: A Fine-Tuning Approach to Improve Generalization in Large Language Models.

Sachindra Joshi

CoRR, 2024

Benchmark Agreement Testing Done Right: A Guide for LLM Benchmark Evaluation.

Michal Shmueli-Scheuer

CoRR, 2024

When LLMs are Unfit Use FastFit: Fast and Effective Text Classification with Many Classes.

CoRR, 2024

Genie: Achieving Human Parity in Content-Grounded Datasets Generation.

Nathaniel Mills

CoRR, 2024

FastFit: Fast and Effective Few-Shot Text Classification with a Multitude of Classes.

Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: System Demonstrations, 2024

Achieving Human Parity in Content-Grounded Datasets Generation.

Nathaniel Mills

Proceedings of the Twelfth International Conference on Learning Representations, 2024

More Bang for your Context: Virtual Documents for Question Answering over Long Documents.

Nathaniel Mills

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Applying Intrinsic Debiasing on Downstream Tasks: Challenges and Considerations for Machine Translation.

Gabriel Stanovsky

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

A Nurse is Blue and Elephant is Rugby: Cross Domain Alignment in Large Language Models Reveal Human-like Patterns.

Gabriel Stanovsky, Ariel Goldstein

Proceedings of the 46th Annual Meeting of the Cognitive Science Society, 2024

A Grounded Preference Model for LLM Alignment.

Sarathkrishna Swaminathan, Subhajit Chaudhury, Ramón Fernandez Astudillo

Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023

QAID: Question Answering Inspired Few-shot Intent Detection.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Evaluating and Improving the Coreference Capabilities of Machine Translation Models.

Gabriel Stanovsky

Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, 2023

2022

Reinforcement Learning with Large Action Spaces for Neural Machine Translation.

Proceedings of the 29th International Conference on Computational Linguistics, 2022

Conversational Search with Mixed-Initiative - Asking Good Clarification Questions backed-up by Passage Retrieval.

David Konopnicki

Proceedings of the Second DialDoc Workshop on Document-grounded Dialogue and Conversational Question Answering, 2022

2021

Filling the Gaps in Ancient Akkadian Texts: A Masked Language Modelling Approach.

Nathan Wasserman, Gabriel Stanovsky

Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021
