Shenglong Ye

According to our database¹, Shenglong Ye authored at least 15 papers between 2021 and 2025.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Timeline

Legend:

Book

In proceedings

Article

PhD thesis

Dataset

Other

Links

On csauthors.net:

Bibliography

2025

InternSpatial: A Comprehensive Dataset for Spatial Reasoning in Vision-Language Models.

[BibT_eX]

[DOI]

CoRR, June, 2025

Visual Embodied Brain: Let Multimodal Large Language Models See, Think, and Control in Spaces.

[BibT_eX]

[DOI]

CoRR, June, 2025

Point or Line? Using Line-based Representation for Panoptic Symbol Spotting in CAD Drawings.

[BibT_eX]

[DOI]

CoRR, May, 2025

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.

[BibT_eX]

[DOI]

CoRR, April, 2025

ArchCAD-400K: An Open Large-Scale Architectural CAD Dataset and New Baseline for Panoptic Symbol Spotting.

[BibT_eX]

[DOI]

CoRR, March, 2025

VisualPRM: An Effective Process Reward Model for Multimodal Reasoning.

[BibT_eX]

[DOI]

CoRR, March, 2025

OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text.

[BibT_eX]

[DOI]

et al.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Docopilot: Improving Multimodal Models for Document-Level Understanding.

[BibT_eX]

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024

Mini-InternVL: a flexible-transfer pocket multi-modal model with 5% parameters and 90% performance.

[BibT_eX]

[DOI]

Vis. Intell., 2024

Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling.

[BibT_eX]

[DOI]

CoRR, 2024

Mini-InternVL: A Flexible-Transfer Pocket Multimodal Model with 5% Parameters and 90% Performance.

[BibT_eX]

[DOI]

CoRR, 2024

OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text.

[BibT_eX]

[DOI]

CoRR, 2024

How far are we to GPT-4V? Closing the gap to commercial multimodal models with open-source suites.

[BibT_eX]

[DOI]

Sci. China Inf. Sci., 2024

2023

Synthesizing a Large Scene with Multiple NeRFs.

[BibT_eX]

[DOI]

Shenglong Ye

Feifei Li

Rui Huang

Proceedings of the Image and Graphics - 12th International Conference, 2023

2021

SDAN: Squared Deformable Alignment Network for Learning Misaligned Optical Zoom.

[BibT_eX]

[DOI]

Kangfu Mei

Shenglong Ye

Rui Huang

Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021

Shenglong Ye

Timeline

Legend:

Links

On csauthors.net:

Bibliography

Loading...