Haiyang Yu

Orcid: 0000-0003-2747-7338

Affiliations:

Fudan University, School of Computer Science, Shanghai Key Laboratory of Intelligent Information Processing, China

According to our database, Haiyang Yu authored at least 32 papers between 2021 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2025

SAIL-Embedding Technical Report: Omni-modal Embedding Foundation Model.

CoRR, October, 2025

SAIL-VL2 Technical Report.

CoRR, September, 2025

Benchmarking Vision-Language Models on Chinese Ancient Documents: From OCR to Knowledge Reasoning.

CoRR, September, 2025

IADGPT: Unified LVLM for Few-Shot Industrial Anomaly Detection, Localization, and Reasoning via In-Context Learning.

CoRR, August, 2025

From Intent to Execution: Multimodal Chain-of-Thought Reinforcement Learning for Precise CAD Code Generation.

CoRR, August, 2025

Interpretable Oracle Bone Script Decipherment through Radical and Pictographic Analysis with LVLMs.

CoRR, August, 2025

MEML-GRPO: Heterogeneous Multi-Expert Mutual Learning for RLVR Advancement.

CoRR, August, 2025

CoLa: Chinese Character Decomposition with Compositional Latent Components.

CoRR, June, 2025

CReFT-CAD: Boosting Orthographic Projection Reasoning for CAD via Reinforcement Fine-Tuning.

CoRR, June, 2025

Zero-Shot Chinese Character Recognition with Hierarchical Multi-Granularity Image-Text Aligning.

CoRR, May, 2025

Prolonged Reasoning Is Not All You Need: Certainty-Based Adaptive Routing for Efficient LLM/MLLM Reasoning.

CoRR, May, 2025

Synthesizing efficient data with diffusion models for person re-identification pre-training.

Mach. Learn., March, 2025

PHT-CAD: Efficient CAD Parametric Primitive Analysis with Progressive Hierarchical Tuning.

CoRR, March, 2025

UMIT: Unifying Medical Imaging Tasks via Vision-Language Models.

CoRR, March, 2025

ChatReID: Open-ended Interactive Person Retrieval via Hierarchical Progressive Tuning for Vision Language Models.

CoRR, February, 2025

Foundation Model Driven Appearance Extraction for Robust Multiple Object Tracking.

Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024

Chinese character recognition with radical-structured stroke trees.

Mach. Learn., June, 2024

A Bounding Box is Worth One Token: Interleaving Layout and Text in a Large Language Model for Document Understanding.

CoRR, 2024

Enhancing Adaptive Deep Networks for Image Classification via Uncertainty-aware Decision Fusion.

Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024

EAFormer: Scene Text Segmentation with Edge-Aware Transformers.

Proceedings of Computer Vision - ECCV 2024

2023

Privacy-Preserving Collaborative Chinese Text Recognition with Federated Learning.

CoRR, 2023

Uncertainty-aware U-Net for Medical Landmark Detection.

Ziyang Ye

Haiyang Yu

Bin Li

CoRR, 2023

Weakly-Supervised Text Instance Segmentation.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

Scene Text Segmentation with Text-Focused Transformers.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

DeNoising-MOT: Towards Multiple Object Tracking with Severe Occlusions.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

Towards Accurate Video Text Spotting with Text-wise Semantic Reasoning.

Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

Orientation-Independent Chinese Text Recognition in Scene Images.

Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

TextFormer: Component-aware Text Segmentation with Transformer.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

Chinese Text Recognition with A Pre-Trained CLIP-Like Model Through Image-IDS Aligning.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

2022

Chinese Character Recognition with Augmented Character Profile Matching.

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Text Gestalt: Stroke-Aware Scene Text Image Super-resolution.

Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021

Benchmarking Chinese Text Recognition: Datasets, Baselines, and an Empirical Study.

CoRR, 2021
