Yufei Zhan
Orcid: 0009-0002-1377-8519
According to our database, Yufei Zhan authored at least 20 papers between 2021 and 2026.
Bibliography
2026
TraceVision: Trajectory-Aware Vision-Language Model for Human-Like Spatial Understanding.
CoRR, February, 2026
Pattern Recognit., 2026
REFORMamba-Unet: Hierarchical gated refocusing convolution and Mamba-Based u-net for PECTPA 3D medical image.
Biomed. Signal Process. Control., 2026
Proceedings of the 23rd IEEE International Symposium on Biomedical Imaging, 2026
GeM-VG: Towards Generalized Multi-image Visual Grounding with Multimodal Large Language Models.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026
2025
CoRR, October, 2025
Seeing is Believing? Mitigating OCR Hallucinations in Multimodal Large Language Models.
CoRR, June, 2025
FOCUS: Unified Vision-Language Modeling for Interactive Editing Driven by Referential Segmentation.
CoRR, June, 2025
VFaith: Do Large Multimodal Models Really Reason on Seen Images Rather than Previous Memories?
CoRR, June, 2025
CoRR, June, 2025
Understand, Think, and Answer: Advancing Visual Reasoning with Large Multimodal Models.
CoRR, May, 2025
Vision-R1: Evolving Human-Free Alignment in Large Vision-Language Models via Vision-Guided Reinforcement Learning.
CoRR, March, 2025
Griffon v2: Advancing Multimodal Perception with High-Resolution Scaling and Visual-Language Co-Referring.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025
UIOrchestra: Generating High-Fidelity Code from UI Designs with a Multi-agent System.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025
2024
Griffon-G: Bridging Vision-Language and Vision-Centric Tasks via Large Multimodal Models.
CoRR, 2024
Griffon v2: Advancing Multimodal Perception with High-Resolution Scaling and Visual-Language Co-Referring.
CoRR, 2024
Griffon: Spelling Out All Object Locations at Any Granularity with Large Language Models.
Proceedings of the Computer Vision - ECCV 2024, 2024
2023
CoRR, 2023
2021