Haobo Yuan

Orcid: 0000-0001-9770-7720

According to our database, Haobo Yuan authored at least 30 papers between 2021 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
DenseWorld-1M: Towards Detailed Dense Grounded Caption in the Real World.
CoRR, June, 2025

On Path to Multimodal Generalist: General-Level and General-Bench.
CoRR, May, 2025

PVUW 2025 Challenge Report: Advances in Pixel-level Understanding of Complex Videos in the Wild.
CoRR, April, 2025

An Empirical Study of GPT-4o Image Generation Capabilities.
CoRR, April, 2025

4th PVUW MeViS 3rd Place Report: Sa2VA.
CoRR, April, 2025

Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos.
CoRR, January, 2025

RMP-SAM: Towards Real-Time Multi-Purpose Segment Anything.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Point Cloud Mamba: Point Cloud Learning via State Space Model.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024
Panoptic-PartFormer++: A Unified and Decoupled View for Panoptic Part Segmentation.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2024

Transformer-Based Visual Segmentation: A Survey.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2024

Multi-Task Learning With Multi-Query Transformer for Dense Prediction.
IEEE Trans. Circuits Syst. Video Technol., February, 2024

Towards Open Vocabulary Learning: A Survey.
IEEE Trans. Pattern Anal. Mach. Intell., 2024

LLAVADI: What Matters For Multimodal Large Language Models Distillation.
CoRR, 2024

Mamba or RWKV: Exploring High-Quality and High-Efficiency Segment Anything Model.
CoRR, 2024

Point Cloud Mamba: Point Cloud Learning via State Space Model.
CoRR, 2024

OMG-Seg: Is One Model Good Enough For All Segmentation?
CoRR, 2024

RAP-SAM: Towards Real-Time All-Purpose Segment Anything.
CoRR, 2024

OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Open-Vocabulary SAM: Segment and Recognize Twenty-Thousand Classes Interactively.
Proceedings of the Computer Vision - ECCV 2024, 2024

OMG-Seg: Is One Model Good Enough for all Segmentation?
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
Monocular Road Planar Parallax Estimation.
IEEE Trans. Image Process., 2023

Neural Collapse Terminus: A Unified Solution for Class Incremental Learning and Its Variants.
CoRR, 2023

Tube-Link: A Flexible Cross Tube Baseline for Universal Video Segmentation.
CoRR, 2023

PanopticPartFormer++: A Unified and Decoupled View for Panoptic Part Segmentation.
CoRR, 2023

Neural Collapse Inspired Feature-Classifier Alignment for Few-Shot Class-Incremental Learning.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Tube-Link: A Flexible Cross Tube Framework for Universal Video Segmentation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

2022
Multi-Task Learning with Multi-query Transformer for Dense Prediction.
CoRR, 2022

Towards Theoretically Inspired Neural Initialization Optimization.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

PolyphonicFormer: Unified Query Learning for Depth-Aware Video Panoptic Segmentation.
Proceedings of the Computer Vision - ECCV 2022, 2022

2021
BOSSA: A Decentralized System for Proofs of Data Retrievability and Replication.
IEEE Trans. Parallel Distributed Syst., 2021

