Xiangyu Zeng

ORCID: 0000-0001-6956-5040

Affiliations:

Shandong University, School of Software, Jinan, China

According to our database, Xiangyu Zeng authored at least 17 papers between 2023 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

RIVER: A Real-Time Interaction Benchmark for Video LLMs.

CoRR, March, 2026

Video-o3: Native Interleaved Clue Seeking for Long Video Multi-Hop Reasoning.

CoRR, January, 2026

VideoTG-R1: Boosting Video Temporal Grounding via Curriculum Reinforcement Learning on Reflected Boundary Annotations.

Proceedings of the 2026 International Conference on Multimedia Retrieval, 2026

2025

VKnowU: Evaluating Visual Knowledge Understanding in Multimodal LLMs.

CoRR, November, 2025

VideoTG-R1: Boosting Video Temporal Grounding via Curriculum Reinforcement Learning on Reflected Boundary Annotations.

CoRR, October, 2025

UniFlow: A Unified Pixel Flow Tokenizer for Visual Understanding and Generation.

CoRR, October, 2025

Learning Goal-Oriented Language-Guided Navigation with Self-Improving Demonstrations at Scale.

CoRR, September, 2025

VideoChat-R1: Enhancing Spatio-Temporal Perception via Reinforcement Fine-Tuning.

CoRR, April, 2025

InternVideo2.5: Empowering Video MLLMs with Long and Rich Context Modeling.

CoRR, January, 2025

Online Video Understanding: A Comprehensive Benchmark and Memory-Augmented Method.

CoRR, January, 2025

VideoChat-Flash: Hierarchical Compression for Long-Context Video Modeling.

CoRR, January, 2025

VideoChat-R1.5: Visual Test-Time Scaling to Reinforce Multimodal Reasoning by Iterative Perception.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Make Your Training Flexible: Towards Deployment-Efficient Video Models.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Online Video Understanding: OVBench and VideoChat-Online.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2023

Adaptive Edge-Aware Semantic Interaction Network for Salient Object Detection in Optical Remote Sensing Images.

IEEE Trans. Geosci. Remote. Sens., 2023
