Jiahao Ying

Orcid: 0000-0003-4264-3648

According to our database, Jiahao Ying authored at least 23 papers between 2021 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
EffiEval: Efficient and Generalizable Model Evaluation via Capability Coverage Maximization.
CoRR, August, 2025

Disentangling Language and Culture for Evaluating Multilingual Large Language Models.
CoRR, May, 2025

FRABench and GenEval: Scaling Fine-Grained Aspect Evaluation across Tasks, Modalities.
CoRR, May, 2025

Toward Generalizable Evaluation in the LLM Era: A Survey Beyond Benchmarks.
CoRR, April, 2025

Revisiting LLM Evaluation through Mechanism Interpretability: a New Metric and Model Utility Law.
CoRR, April, 2025

Babel: Open Multilingual Large Language Models Serving Over 90% of Global Speakers.
CoRR, March, 2025

SeaExam and SeaBench: Benchmarking LLMs with Local Multilingual Questions in Southeast Asia.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025

Disentangling Language and Culture for Evaluating Multilingual Large Language Models.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

EvoWiki: Evaluating LLMs on Evolving Knowledge.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024
EvoWiki: Evaluating LLMs on Evolving Knowledge.
CoRR, 2024

Diagnosing and Remedying Knowledge Deficiencies in LLMs via Label-free Curricular Meaningful Learning.
CoRR, 2024

LLMs-as-Instructors: Learning from Errors Toward Automating Model Improvement.
CoRR, 2024

A + B: A General Generator-Reader Framework for Optimizing LLMs to Unleash Synergy Potential.
CoRR, 2024

Have Seen Me Before? Automating Dataset Updates Towards Reliable and Timely Evaluation.
CoRR, 2024

Automating Dataset Updates Towards Reliable and Timely Evaluation of Large Language Models.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

LLMs-as-Instructors: Learning from Errors Toward Automating Model Improvement.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

QRMeM: Unleash the Length Limitation through Question then Reflection Memory Mechanism.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Intuitive or Dependent? Investigating LLMs' Behavior Style to Conflicting Prompts.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

A + B: A General Generator-Reader Framework for Optimizing LLMs to Unleash Synergy Potential.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023
Intuitive or Dependent? Investigating LLMs' Robustness to Conflicting Prompts.
CoRR, 2023

Benchmarking Foundation Models with Language-Model-as-an-Examiner.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

2022
Spare simple MKKM with semi-infinite linear program optimization.
Int. J. Intell. Syst., 2022

2021
RoKGDS: A Robust Knowledge Grounded Dialog System.
Proceedings of the Natural Language Processing and Chinese Computing, 2021

