Haonan Li

Orcid: 0000-0001-6623-5089

Affiliations:

Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, UAE
University of Melbourne, Parkville, Australia (former)
Shanghai Jiao Tong University, Department of Computer Science and Engineering, China (former)

According to our database¹, Haonan Li authored at least 55 papers between 2018 and 2026.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2026

Controllable Reasoning Models Are Private Thinkers.

CoRR, February, 2026

SimuScene: Training and Benchmarking Code Generation to Simulate Physical Scenarios.

CoRR, February, 2026

Neural Theorem Proving for Verification Conditions: A Real-World Benchmark.

CoRR, January, 2026

SCALAR: Scientific Citation-based Live Assessment of Long-context Academic Reasoning.

Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics, 2026

Nanda Family: Open-Weights Generative Large Language Models for Hindi.

Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics, 2026

FinChain: A Symbolic Benchmark for Verifiable Chain-of-Thought Financial Reasoning.

Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2026

Multilingual Idioms in Sentences and Conversations Across High-, Medium-, and Low-Resource Languages.

Saeed Almheiri

Bilal Elbouardi

Salsabila Zahirah Pranida

Irina Nikishina

Ashwath Rao

Parameswari Krishnamurthy

Muhammad Cendekia Airlangga

Ahmad Fathan Hidayatullah

Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2026

Control Illusion: The Failure of Instruction Hierarchies in Large Language Models.

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025

Concise Reasoning in the Lens of Lagrangian Optimization.

CoRR, October, 2025

K2-Think: A Parameter-Efficient Reasoning System.

CoRR, September, 2025

BALSAM: A Platform for Benchmarking Arabic Large Language Models.

CoRR, July, 2025

IsaMini: Redesigned Isabelle Proof Language for Machine Learning.

CoRR, July, 2025

AgentFly: Extensible and Scalable Reinforcement Learning for LM Agents.

CoRR, July, 2025

Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective.

CoRR, June, 2025

FinChain: A Symbolic Benchmark for Verifiable Chain-of-Thought Financial Reasoning.

CoRR, June, 2025

Llama-3-Nanda-10B-Chat: An Open Generative Large Language Model for Hindi.

CoRR, April, 2025

RuozhiBench: Evaluating LLMs with Logical Fallacies and Misleading Premises.

CoRR, February, 2025

LLM360 K2: Building a 65B 360-Open-Source Large Language Model from Scratch.

CoRR, January, 2025

Against The Achilles' Heel: A Survey on Red Teaming for Generative Models.

J. Artif. Intell. Res., 2025

Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective.

Jorge (Zhoujun) Cheng

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

NAT: Enhancing Agent Tuning with Negative Samples.

Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

Libra-Leaderboard: Towards Responsible AI through a Balanced Leaderboard of Safety and Capability.

Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

ToolGen: Unified Tool Retrieval and Calling via Generation.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Loki: An Open-Source Tool for Fact Verification.

Proceedings of the 31st International Conference on Computational Linguistics, 2025

2024

EXAMS-V: A Multi-Discipline Multilingual Multimodal Exam Benchmark for Evaluating Vision Language Models.

Rocktim Jyoti Das

Simeon Emilov Hristov

Haonan Li

Dimitar Iliyanov Dimitrov

Ivan Koychev

Preslav Nakov

CoRR, 2024

A Chinese Dataset for Evaluating the Safeguards in Large Language Models.

CoRR, 2024

Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language Models as Agents.

CoRR, 2024

Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Do-Not-Answer: Evaluating Safeguards in LLMs.

Proceedings of the Findings of the Association for Computational Linguistics: EACL 2024, 2024

A Chinese Dataset for Evaluating the Safeguards in Large Language Models.

Proceedings of the Findings of the Association for Computational Linguistics, 2024

Demystifying Instruction Mixing for Fine-tuning Large Language Models.

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop), 2024

ArabicMMLU: Assessing Massive Multitask Language Understanding in Arabic.

Abdelrahman Boda Sadallah

Proceedings of the Findings of the Association for Computational Linguistics, 2024

Fact-Checking the Output of Large Language Models via Token-Level Uncertainty Quantification.

Ekaterina Fadeeva

Aleksandr Rubashevskii

Proceedings of the Findings of the Association for Computational Linguistics, 2024

EXAMS-V: A Multi-Discipline Multilingual Multimodal Exam Benchmark for Evaluating Vision Language Models.

Rocktim Jyoti Das

Simeon Emilov Hristov

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

CMMLU: Measuring massive multitask language understanding in Chinese.

Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023

Understanding the Instruction Mixture for Large Language Model Fine-tuning.

CoRR, 2023

LLM360: Towards Fully Transparent Open-Source LLMs.

CoRR, 2023

Can Large Language Model Comprehend Ancient Chinese? A Preliminary Test on ACLUE.

Yixuan Zhang

Haonan Li

CoRR, 2023

Jais and Jais-chat: Arabic-Centric Foundation and Instruction-Tuned Open Generative Large Language Models.

CoRR, 2023

Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs.

CoRR, 2023

Bactrian-X: A Multilingual Replicable Instruction-Following Model with Low-Rank Adaptation.

CoRR, 2023

Can Large Language Model Comprehend Ancient Chinese? A Preliminary Test on ACLUE.

Yixuan Zhang

Haonan Li

Proceedings of the Ancient Language Processing Workshop, 2023

Location Aware Modular Biencoder for Tourism Question Answering.

Haonan Li

Martin Tomko

Timothy Baldwin

Proceedings of the Findings of the Association for Computational Linguistics: IJCNLP-AACL 2023, 2023

Large Language Models Only Pass Primary School Exams in Indonesia: A Comprehensive Test on IndoMMLU.

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

2022

Neural Character-Level Syntactic Parsing for Chinese.

J. Artif. Intell. Res., 2022

MultiSpanQA: A Dataset for Multi-Span Question Answering.

Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

CULG: Commercial Universal Language Generation.

Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Track, 2022

Sentiment-Aware Word and Sentence Level Pre-training for Sentiment Analysis.
