Dongzhi Jiang

According to our database1, Dongzhi Jiang authored at least 18 papers between 2023 and 2025.

Collaborative distances:
  • Dijkstra number2 of four.
  • Erdős number3 of four.

Timeline

Legend:

Book 
In proceedings 
Article 
PhD thesis 
Dataset
Other 

Links

On csauthors.net:

Bibliography

2025
Echo-4o: Harnessing the Power of GPT-4o Synthetic Images for Improved Image Generation.
CoRR, August, 2025

MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning.
CoRR, June, 2025

T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT.
CoRR, May, 2025

ADT: Tuning Diffusion Models with Adversarial Supervision.
CoRR, April, 2025

SciVerse: Unveiling the Knowledge Comprehension and Visual Reasoning of LMMs on Multi-modal Scientific Problems.
CoRR, March, 2025

PiSA: A Self-Augmented Data Engine and Training Strategy for 3D Understanding with Large Models.
CoRR, March, 2025

MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency.
CoRR, February, 2025

MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

MMSearch: Unveiling the Potential of Large Models as Multi-modal Search Engines.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

SciVerse: Unveiling the Knowledge Comprehension and Visual Reasoning of LMMs on Multi-modal Scientific Problems.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024
EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM.
CoRR, 2024

MMSearch: Benchmarking the Potential of Large Models as Multi-modal Search Engines.
CoRR, 2024

MAVIS: Mathematical Visual Instruction Tuning.
CoRR, 2024

MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?
CoRR, 2024

MoVA: Adapting Mixture of Vision Experts to Multimodal Context.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

MATHVERSE: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?
Proceedings of the Computer Vision - ECCV 2024, 2024

2023
Temporal Enhanced Training of Multi-view 3D Object Detector via Historical Object Prediction.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023


  Loading...