Zeyi Sun

ORCID: 0009-0008-4264-4281

Affiliations:
  • Shanghai Jiao Tong University, Shanghai, China


According to our database, Zeyi Sun authored at least 15 papers between 2023 and 2026.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2026
RAR: Retrieving and Ranking Augmented MLLMs for Visual Recognition.
IEEE Trans. Image Process., 2026

2025
CODA: Coordinating the Cerebrum and Cerebellum for a Dual-Brain Computer Use Agent with Decoupled Reinforcement Learning.
CoRR, August, 2025

SEAgent: Self-Evolving Computer Use Agent with Autonomous Learning from Experience.
CoRR, August, 2025

RelightVid: Temporal-Consistent Diffusion Model for Video Relighting.
CoRR, January, 2025

Bootstrap3D: Improving Multi-View Diffusion Model with Synthetic Data.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

X-Prompt: Generalizable Auto-Regressive Visual Learning with In-Context Prompting.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

Visual-RFT: Visual Reinforcement Fine-Tuning.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

2024
X-Prompt: Towards Universal In-Context Image Generation in Auto-Regressive Vision Language Foundation Models.
CoRR, 2024

V3Det Challenge 2024 on Vast Vocabulary and Open Vocabulary Object Detection: Methods and Results.
CoRR, 2024

Bootstrap3D: Improving 3D Content Creation with Synthetic Data.
CoRR, 2024

Make-it-Real: Unleashing Large Multimodal Model's Ability for Painting 3D Objects with Realistic Materials.
CoRR, 2024

Make-it-Real: Unleashing Large Multimodal Model for Painting 3D Objects with Realistic Materials.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

GPT4Point: A Unified Framework for Point-Language Understanding and Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Alpha-CLIP: A CLIP Model Focusing on Wherever you Want.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
Gemini vs GPT-4V: A Preliminary Comparison and Combination of Vision-Language Models Through Qualitative Cases.
CoRR, 2023

