Jingqi Tong
According to our database1,
Jingqi Tong authored at least 20 papers
between 2024 and 2026.
Collaborative distances:
Collaborative distances:
Timeline
Legend:
Book In proceedings Article PhD thesis Dataset OtherLinks
On csauthors.net:
Bibliography
2026
OMIBench: Benchmarking Olympiad-Level Multi-Image Reasoning in Large Vision-Language Model.
CoRR, April, 2026
CoRR, February, 2026
OpenNovelty: An LLM-powered Agentic System for Verifiable Scholarly Novelty Assessment.
CoRR, January, 2026
LLMEval-Fair: A Large-Scale Longitudinal Study on Robust and Fair Evaluation of Large Language Models.
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2026
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2026
Beyond Scaling: Measuring and Predicting the Upper Bound of Knowledge Retention in Language Model Pre-Training.
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2026
OMIBench: Benchmarking Olympiad-Level Multi-Image Reasoning in Large Vision-Language Models.
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2026
2025
CoRR, November, 2025
CoRR, September, 2025
LLMEval-3: A Large-Scale Longitudinal Study on Robust and Fair Evaluation of Large Language Models.
CoRR, August, 2025
SpeechRole: A Large-Scale Dataset and Benchmark for Evaluating Speech Role-Playing Agents.
CoRR, August, 2025
CoRR, May, 2025
Predicting Large Language Model Capabilities on Closed-Book QA Tasks Using Only Information Available Prior to Training.
CoRR, February, 2025
Understanding Parametric and Contextual Knowledge Reconciliation within Large Language Models.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025
LLMEval-Med: A Real-world Clinical Benchmark for Medical LLMs with Physician Validation.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025
2024
Exploring the Compositional Deficiency of Large Language Models in Mathematical Reasoning.
CoRR, 2024
Exploring the Compositional Deficiency of Large Language Models in Mathematical Reasoning Through Trap Problems.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024