Erfei Cui

According to our database, Erfei Cui authored at least 17 papers between 2023 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

ScaleEdit-12M: Scaling Open-Source Image Editing Data Generation via Multi-Agent Framework.

CoRR, March, 2026

InternVL-U: Democratizing Unified Multimodal Models for Understanding, Reasoning, Generation and Editing.

CoRR, March, 2026

2025

MetaCaptioner: Towards Generalist Visual Captioning with Open-source Suites.

CoRR, October, 2025

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency.

CoRR, August, 2025

Visual Embodied Brain: Let Multimodal Large Language Models See, Think, and Control in Spaces.

CoRR, June, 2025

Consensus Entropy: Harnessing Multi-VLM Agreement for Self-Verifying and Self-Improving OCR.

CoRR, April, 2025

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.

CoRR, April, 2025

DriveMLM: aligning multi-modal large language models with behavioral planning states for autonomous driving.

Vis. Intell., 2025

OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

2024

Mini-InternVL: a flexible-transfer pocket multi-modal model with 5% parameters and 90% performance.

Vis. Intell., 2024

Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling.

CoRR, 2024

Mini-InternVL: A Flexible-Transfer Pocket Multimodal Model with 5% Parameters and 90% Performance.

CoRR, 2024

OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text.

CoRR, 2024

Teaching MLP More Graph Information: A Three-stage Multitask Knowledge Distillation Framework.

CoRR, 2024

How far are we to GPT-4V? Closing the gap to commercial multimodal models with open-source suites.

Sci. China Inf. Sci., 2024

ControlLLM: Augment Language Models with Tools by Searching on Graphs.

Proceedings of the Computer Vision - ECCV 2024, 2024

2023

ControlLLM: Augment Language Models with Tools by Searching on Graphs.

CoRR, 2023
