Yihan Zeng

According to our database, Yihan Zeng authored at least 25 papers between 2021 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
C2-Evo: Co-Evolving Multimodal Data and Model for Self-Improving Reasoning.
CoRR, July, 2025

CoherenDream: Boosting Holistic Text Coherence in 3D Generation via Multimodal Large Language Models Feedback.
CoRR, April, 2025

Can Atomic Step Decomposition Enhance the Self-structured Reasoning of Multimodal Large Models?
CoRR, March, 2025

Corrupted but Not Broken: Rethinking the Impact of Corrupted Data in Visual Instruction Tuning.
CoRR, February, 2025

FramePainter: Endowing Interactive Image Editing with Video Diffusion Priors.
CoRR, January, 2025

UniGS: Unified Language-Image-3D Pretraining with Gaussian Splatting.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

DreamPhysics: Learning Physics-Based 3D Dynamics with Video Diffusion Priors.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024
AtomThink: A Slow Thinking Framework for Multimodal Mathematical Reasoning.
CoRR, 2024

EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions.
CoRR, 2024

DreamPhysics: Learning Physical Properties of Dynamic 3D Gaussians with Video Diffusion Priors.
CoRR, 2024

Collaborative Novel Object Discovery and Box-Guided Cross-Modal Alignment for Open-Vocabulary 3D Object Detection.
CoRR, 2024

OpenOcc: Open Vocabulary 3D Scene Reconstruction via Occupancy Representation.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2024

TextField3D: Towards Enhancing Open-Vocabulary 3D Generation with Noisy Text Fields.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

PanGu-Draw: Advancing Resource-Efficient Text-to-Image Synthesis with Time-Decoupled Training and Reusable Coop-Diffusion.
Proceedings of the Computer Vision - ECCV 2024, 2024

JointDreamer: Ensuring Geometry Consistency and Text Congruence in Text-to-3D Generation via Joint Score Distillation.
Proceedings of the Computer Vision - ECCV 2024, 2024

DreamControl: Control-Based Text-to-3D Generation with 3D Self-Prior.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
CoDA: Collaborative Novel Box Discovery and Cross-modal Alignment for Open-vocabulary 3D Object Detection.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

SUIT: Learning Significance-Guided Information for 3D Temporal Detection.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2023

Towards High-Fidelity Text-Guided 3D Face Generation and Manipulation Using only Images.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

DiffDis: Empowering Generative Diffusion Model with Cross-Modal Discrimination Capability.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

CLIP²: Contrastive Language-Image-Point Pretraining from Real-World Point Cloud Data.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
LIFT: Learning 4D LiDAR Image Fusion Transformer for 3D Object Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
Learning Transferable Features for Point Cloud Detection via 3D Contrastive Co-training.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Cross-Modal 3D Object Detection and Tracking for Auto-Driving.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2021

