Zhongying Tu

Orcid: 0009-0009-9184-5896

According to our database, Zhongying Tu authored at least 13 papers between 2024 and 2026.

Bibliography

2026
MinerU2.5-Pro: Pushing the Limits of Data-Centric Document Parsing at Scale.
CoRR, April, 2026

2025
Dripper: Token-Efficient Main HTML Extraction with a Lightweight LM.
CoRR, November, 2025

AICC: Parse HTML Finer, Make Models Better - A 7.3T AI-Ready Corpus Built by a Model-Based HTML Parser.
CoRR, November, 2025

MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing.
CoRR, September, 2025

NovelSeek: When Agent Becomes the Scientist - Building Closed-Loop System from Hypothesis to Verification.
CoRR, May, 2025

WanJuanSiLu: A High-Quality Open-Source Webtext Dataset for Low-Resource Languages.
CoRR, January, 2025

VRBench: A Benchmark for Multi-Step Reasoning in Long Narrative Videos.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions.
CoRR, 2024

Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling.
CoRR, 2024

GRUtopia: Dream General Robots in a City at Scale.
CoRR, 2024

OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text.
CoRR, 2024

WanJuan-CC: A Safe and High-Quality Open-sourced English Webtext Dataset.
CoRR, 2024
