Jingqi Tong

According to our database, Jingqi Tong authored at least 20 papers between 2024 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
OMIBench: Benchmarking Olympiad-Level Multi-Image Reasoning in Large Vision-Language Model.
CoRR, April, 2026

AI Can Learn Scientific Taste.
CoRR, March, 2026

SciAgentGym: Benchmarking Multi-Step Scientific Tool-use in LLM Agents.
CoRR, February, 2026

MOVA: Towards Scalable and Synchronized Video-Audio Generation.
CoRR, February, 2026

OpenNovelty: An LLM-powered Agentic System for Verifiable Scholarly Novelty Assessment.
CoRR, January, 2026

LLMEval-Fair: A Large-Scale Longitudinal Study on Robust and Fair Evaluation of Large Language Models.
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2026

VideoPro: Adaptive Program Reasoning for Long Video Understanding.
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2026

Beyond Scaling: Measuring and Predicting the Upper Bound of Knowledge Retention in Language Model Pre-Training.
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2026

OMIBench: Benchmarking Olympiad-Level Multi-Image Reasoning in Large Vision-Language Models.
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2026

2025
Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm.
CoRR, November, 2025

Adaptive Fast-and-Slow Visual Program Reasoning for Long-Form VideoQA.
CoRR, September, 2025

LLMEval-3: A Large-Scale Longitudinal Study on Robust and Fair Evaluation of Large Language Models.
CoRR, August, 2025

SpeechRole: A Large-Scale Dataset and Benchmark for Evaluating Speech Role-Playing Agents.
CoRR, August, 2025

Code2Logic: Game-Code-Driven Data Synthesis for Enhancing VLMs General Reasoning.
CoRR, May, 2025

Predicting Large Language Model Capabilities on Closed-Book QA Tasks Using Only Information Available Prior to Training.
CoRR, February, 2025

Understanding Parametric and Contextual Knowledge Reconciliation within Large Language Models.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

LLMEval-Med: A Real-world Clinical Benchmark for Medical LLMs with Physician Validation.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

2024
Exploring the Compositional Deficiency of Large Language Models in Mathematical Reasoning.
CoRR, 2024

Exploring the Compositional Deficiency of Large Language Models in Mathematical Reasoning Through Trap Problems.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-Training.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

