Zhenyang Cai

ORCID: 0009-0006-0320-8490

According to our database¹, Zhenyang Cai authored at least 22 papers between 2024 and 2026.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2026

GlobalDentBench: A Multinational Benchmark for Evaluating LLM Clinical Reasoning in Dentistry with Expert Calibration.

CoRR, May, 2026

Agentifying Patient Dynamics within LLMs through Interacting with Clinical World Model.

CoRR, May, 2026

Beyond ViT Tokens: Masked-Diffusion Pretrained Convolutional Pathology Foundation Model for Cell-Level Dense Prediction.

CoRR, May, 2026

MicroVerse: A Preliminary Exploration Toward a Micro-World Simulation.

CoRR, March, 2026

To What Extent Do Token-Level Representations from Pathology Foundation Models Improve Dense Prediction?

CoRR, February, 2026

2025

DentalGPT: Incentivizing Multimodal Complex Reasoning in Dentistry.

CoRR, December, 2025

WaveMind: Towards a Conversational EEG Foundation Model Aligned to Textual and Visual Modalities.

CoRR, October, 2025

Can Multimodal LLMs See Materials Clearly? A Multimodal Benchmark on Materials Characterization.

CoRR, September, 2025

ShizhenGPT: Towards Multimodal LLMs for Traditional Chinese Medicine.

CoRR, August, 2025

MedGen: Unlocking Medical Video Generation by Scaling Granularly-annotated Medical Videos.

CoRR, July, 2025

ShareGPT-4o-Image: Aligning Multimodal Models with GPT-4o-Level Image Generation.

CoRR, June, 2025

UCL-Bench: A Chinese User-Centric Legal Benchmark for Large Language Models.

Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025

LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via a Hybrid Architecture.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

Can Multimodal LLMs See Materials Clearly? A Multimodal Benchmark on Materials Characterization.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

Towards Medical Complex Reasoning with LLMs through Medical Verifiable Problems.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

Exploring Compositional Generalization of Multimodal LLMs for Medical Imaging.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024

Additional Self-Attention Transformer With Adapter for Thick Haze Removal.

IEEE Geosci. Remote. Sens. Lett., 2024

On the Compositional Generalization of Multimodal LLMs for Medical Imaging.

CoRR, 2024

HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs.

CoRR, 2024

HuatuoGPT-Vision, Towards Injecting Medical Visual Knowledge into Multimodal LLMs at Scale.

CoRR, 2024

Alignment at Pre-training! Towards Native Alignment for Arabic LLMs.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Towards Injecting Medical Visual Knowledge into Multimodal LLMs at Scale.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
