Dongzhi Jiang

According to our database¹, Dongzhi Jiang authored at least 20 papers between 2023 and 2025.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2025

Are Video Models Ready as Zero-Shot Reasoners? An Empirical Study with the MME-CoF Benchmark.

CoRR, October, 2025

BLINK-Twice: You see, but do you observe? A Reasoning Benchmark on Visual Perception.

CoRR, October, 2025

Echo-4o: Harnessing the Power of GPT-4o Synthetic Images for Improved Image Generation.

CoRR, August, 2025

MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning.

CoRR, June, 2025

T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT.

CoRR, May, 2025

ADT: Tuning Diffusion Models with Adversarial Supervision.

CoRR, April, 2025

SciVerse: Unveiling the Knowledge Comprehension and Visual Reasoning of LMMs on Multi-modal Scientific Problems.

CoRR, March, 2025

PiSA: A Self-Augmented Data Engine and Training Strategy for 3D Understanding with Large Models.

CoRR, March, 2025

EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

MMSearch: Unveiling the Potential of Large Models as Multi-modal Search Engines.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

SciVerse: Unveiling the Knowledge Comprehension and Visual Reasoning of LMMs on Multi-modal Scientific Problems.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024

MMSearch: Benchmarking the Potential of Large Models as Multi-modal Search Engines.

CoRR, 2024

MAVIS: Mathematical Visual Instruction Tuning.

CoRR, 2024

MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?

CoRR, 2024

MoVA: Adapting Mixture of Vision Experts to Multimodal Context.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

MATHVERSE: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?

Proceedings of the Computer Vision - ECCV 2024, 2024

2023

Temporal Enhanced Training of Multi-view 3D Object Detector via Historical Object Prediction.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
