Dongzhan Zhou

ORCID: 0000-0001-6568-5440

According to our database, Dongzhan Zhou authored at least 86 papers between 2019 and 2026.

Bibliography

2026
Meow-Omni 1: A Multimodal Large Language Model for Feline Ethology.
CoRR, May, 2026

LabBuilder: Protocol-Grounded 3D Layout Generation for Interactable and Safe Laboratory.
CoRR, May, 2026

CoEnv: Driving Embodied Multi-Agent Collaboration via Compositional Environment.
CoRR, April, 2026

FD²: A Dedicated Framework for Fine-Grained Dataset Distillation.
CoRR, March, 2026

An SO(3)-equivariant reciprocal-space neural potential for long-range interactions.
CoRR, March, 2026

Ego to World: Collaborative Spatial Reasoning in Embodied Systems via Reinforcement Learning.
CoRR, March, 2026

Discovery of Interpretable Physical Laws in Materials via Language-Model-Guided Symbolic Regression.
CoRR, February, 2026

AnyTouch 2: General Optical Tactile Representation Learning For Dynamic Tactile Perception.
CoRR, February, 2026

P1-VL: Bridging Visual Perception and Scientific Reasoning in Physics Olympiads.
CoRR, February, 2026

SciDataCopilot: An Agentic Data Preparation Framework for AGI-driven Scientific Discovery.
CoRR, February, 2026

Attention Reallocation: Towards Zero-cost and Controllable Hallucination Mitigation of MLLMs.
Int. J. Comput. Vis., January, 2026

Beyond Static Tools: Test-Time Tool Evolution for Scientific Reasoning.
CoRR, January, 2026

SketchThinker-R1: Towards Efficient Sketch-Style Reasoning in Large Multimodal Models.
CoRR, January, 2026

Deep Research Arena: The First Exam of LLMs' Research Abilities via Seminar-Grounded Tasks.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

Mitigating Low-Quality Reasoning in MLLMs: Self-Driven Refined Multimodal CoT with Selective Thinking and Step-wise Visual Enhancement.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025
SCP: Accelerating Discovery with a Global Web of Autonomous Scientific Agents.
CoRR, December, 2025

MiST: Understanding the Role of Mid-Stage Scientific Training in Developing Chemical Reasoning Models.
CoRR, December, 2025

An Agentic Framework for Autonomous Materials Computation.
CoRR, December, 2025

Single-Agent Scaling Fails Multi-Agent Intelligence: Towards Foundation Models with Native Multi-Agent Intelligence.
CoRR, December, 2025

Reasoning via Video: The First Evaluation of Video Models' Reasoning Abilities through Maze-Solving Tasks.
CoRR, November, 2025

P1: Mastering Physics Olympiads with Reinforcement Learning.
CoRR, November, 2025

AtomWorld: A Benchmark for Evaluating Spatial Reasoning in Large Language Models on Crystalline Materials.
CoRR, October, 2025

Object-AVEdit: An Object-level Audio-Visual Editing Model.
CoRR, October, 2025

PhysicsMinions: Winning Gold Medals in the Latest Physics Olympiads with a Coevolutionary Multimodal Multi-Agent System.
CoRR, September, 2025

SciReasoner: Laying the Scientific Reasoning Ground Across Disciplines.
CoRR, September, 2025

ChemBOMAS: Accelerated BO in Chemistry with LLM-Enhanced Multi-Agent System.
CoRR, September, 2025

HiPhO: How Far Are (M)LLMs from Humans in the Latest High School Physics Olympiad Benchmark?
CoRR, September, 2025

DeepResearch Arena: The First Exam of LLMs' Research Abilities via Seminar-Grounded Tasks.
CoRR, September, 2025

A Survey of Scientific Large Language Models: From Data Foundations to Agent Frontiers.
CoRR, August, 2025

CMPhysBench: A Benchmark for Evaluating Large Language Models in Condensed Matter Physics.
CoRR, August, 2025

From AI for Science to Agentic Science: A Survey on Autonomous Scientific Discovery.
CoRR, August, 2025

Chem3DLLM: 3D Multimodal Large Language Models for Chemistry.
CoRR, August, 2025

SynBrain: Enhancing Visual-to-fMRI Synthesis via Probabilistic Representation Learning.
CoRR, August, 2025

Δ-AttnMask: Attention-Guided Masked Hidden States for Efficient Data Selection and Augmentation.
CoRR, August, 2025

SpectrumWorld: Artificial Intelligence Foundation for Spectroscopy.
CoRR, August, 2025

Iterative Pretraining Framework for Interatomic Potentials.
CoRR, July, 2025

Multimodal-Guided Dynamic Dataset Pruning for Robust and Efficient Data-Centric Learning.
CoRR, July, 2025

Position: Intelligent Science Laboratory Requires the Integration of Cognitive and Embodied AI.
CoRR, June, 2025

Scientists' First Exam: Probing Cognitive Abilities of MLLM via Perception, Understanding, and Reasoning.
CoRR, June, 2025

SELT: Self-Evaluation Tree Search for LLMs with Task Decomposition.
CoRR, June, 2025

CheMatAgent: Enhancing LLMs for Chemistry and Materials Science through Tree-Search Based Tool Learning.
CoRR, June, 2025

Control-R: Towards controllable test-time scaling.
CoRR, June, 2025

LabUtopia: High-Fidelity Simulation and Hierarchical Benchmark for Scientific Embodied Agents.
CoRR, May, 2025

MOOSE-Chem3: Toward Experiment-Guided Hypothesis Ranking via Simulated Experimental Feedback.
CoRR, May, 2025

SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward.
CoRR, May, 2025

NovelSeek: When Agent Becomes the Scientist - Building Closed-Loop System from Hypothesis to Verification.
CoRR, May, 2025

ChemMLLM: Chemical Multimodal Large Language Model.
CoRR, May, 2025

When Dynamic Data Selection Meets Data Augmentation.
CoRR, May, 2025

CMT: A Cascade MAR with Topology Predictor for Multimodal Conditional CAD Generation.
CoRR, April, 2025

DONOD: Robust and Generalizable Instruction Fine-Tuning for LLMs via Model-Intrinsic Dataset Pruning.
CoRR, April, 2025

ResearchBench: Benchmarking LLMs in Scientific Discovery via Inspiration-Based Task Decomposition.
CoRR, March, 2025

TokenCarve: Information-Preserving Visual Token Compression in Multimodal Large Language Models.
CoRR, March, 2025

Large multimodal models evaluation: a survey.
Sci. China Inf. Sci., 2025

Scaling Physical Reasoning with the PHYSICS Dataset.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

Can Knowledge-Graph-based Retrieval Augmented Generation Really Retrieve What You Need?
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

MOOSE-Chem2: Exploring LLM Limits in Fine-Grained Scientific Hypothesis Discovery via Hierarchical Search.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

MokA: Multimodal Low-Rank Adaptation for MLLMs.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

LLaMA-Berry: Pairwise Optimization for Olympiad-level Mathematical Reasoning via O1-like Monte Carlo Tree Search.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

CFSSeg: Closed-Form Solution for Class-Incremental Semantic Segmentation of 2D Images and 3D Point Clouds.
Proceedings of the 33rd ACM International Conference on Multimedia, 2025

When Dynamic Data Selection Meets Data Augmentation: Achieving Enhanced Training Acceleration.
Proceedings of the Forty-second International Conference on Machine Learning, 2025

A CLIP-Powered Framework for Robust and Generalizable Data Selection.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

MOOSE-Chem: Large Language Models for Rediscovering Unseen Chemistry Scientific Hypotheses.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

CMT: A Cascade MAR with Topology Predictor for Multimodal Conditional CAD Generation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

Biology-Instructions: A Dataset and Benchmark for Multi-Omics Sequence Understanding Capability of Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

Critic-V: VLM Critics Help Catch VLM Errors in Multimodal Reasoning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

ChemVLM: Exploring the Power of Multimodal Large Language Models in Chemistry Area.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024
TOMG-Bench: Evaluating LLMs on Text-based Open Molecule Generation.
CoRR, 2024

SegACIL: Solving the Stability-Plasticity Dilemma in Class-Incremental Semantic Segmentation.
CoRR, 2024

MolReFlect: Towards In-Context Fine-grained Alignments between Molecules and Texts.
CoRR, 2024

LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning.
CoRR, 2024

Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B.
CoRR, 2024

Physical formula enhanced multi-task learning for pharmacokinetics prediction.
CoRR, 2024

LOCR: Location-Guided Transformer for Optical Character Recognition.
CoRR, 2024

ChemLLM: A Chemical Large Language Model.
CoRR, 2024

LOCR: Location-Guided Transformer for Optical Character Recognition.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes.
Proceedings of the Computer Vision - ECCV 2024, 2024

2023
SeCo: Separating Unknown Musical Visual Sounds with Consistency Guidance.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023

Exploiting Visual Context Semantics for Sound Source Localization.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023

2022
SeCo: Separating Unknown Musical Visual Sounds with Consistency Guidance.
CoRR, 2022

SepFusion: Finding Optimal Fusion Structures for Visual Sound Separation.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
TRUFM: a Transformer-Guided Framework for Fine-Grained Urban Flow Inference.
Proceedings of the Neural Information Processing - 28th International Conference, 2021

Delving Into Localization Errors for Monocular 3D Object Detection.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020
Performance Optimization of Federated Person Re-identification via Benchmark Analysis.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Cheaper Pre-training Lunch: An Efficient Paradigm for Object Detection.
Proceedings of the Computer Vision - ECCV 2020, 2020

EcoNAS: Finding Proxies for Economical Neural Architecture Search.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
WIDER Face and Pedestrian Challenge 2018: Methods and Results.
CoRR, 2019

