Zhoufutu Wen

ORCID: 0009-0000-0894-5824

According to our database, Zhoufutu Wen authored at least 14 papers between 2023 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
FutureX: An Advanced Live Benchmark for LLM Agents in Future Prediction.
CoRR, August, 2025

First Return, Entropy-Eliciting Explore.
CoRR, July, 2025

SciDA: Scientific Dynamic Assessor of LLMs.
CoRR, June, 2025

MARS-Bench: A Multi-turn Athletic Real-world Scenario Benchmark for Dialogue Evaluation.
CoRR, May, 2025

KORGym: A Dynamic Game Platform for LLM Reasoning Evaluation.
CoRR, May, 2025

IV-Bench: A Benchmark for Image-Grounded Video Perception and Reasoning in Multimodal LLMs.
CoRR, April, 2025

SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines.
CoRR, February, 2025

SimpleVQA: Multimodal Factuality Evaluation for Multimodal Large Language Models.
CoRR, February, 2025

CryptoX: Compositional Reasoning Evaluation of Large Language Models.
CoRR, February, 2025

Distillation Quantification for Large Language Models.
CoRR, January, 2025

KOR-Bench: Benchmarking Language Models on Knowledge-Orthogonal Reasoning Tasks.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Quantification of Large Language Model Distillation.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024
KOR-Bench: Benchmarking Language Models on Knowledge-Orthogonal Reasoning Tasks.
CoRR, 2024

2023
Enhancing Dynamic Image Advertising with Vision-Language Pre-training.
Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023

