Zhenyang Cai

Orcid: 0009-0006-0320-8490

According to our database, Zhenyang Cai authored at least 19 papers between 2024 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
MicroVerse: A Preliminary Exploration Toward a Micro-World Simulation.
CoRR, March, 2026

To What Extent Do Token-Level Representations from Pathology Foundation Models Improve Dense Prediction?
CoRR, February, 2026

2025
DentalGPT: Incentivizing Multimodal Complex Reasoning in Dentistry.
CoRR, December, 2025

WaveMind: Towards a Conversational EEG Foundation Model Aligned to Textual and Visual Modalities.
CoRR, October, 2025

Can Multimodal LLMs See Materials Clearly? A Multimodal Benchmark on Materials Characterization.
CoRR, September, 2025

ShizhenGPT: Towards Multimodal LLMs for Traditional Chinese Medicine.
CoRR, August, 2025

MedGen: Unlocking Medical Video Generation by Scaling Granularly-annotated Medical Videos.
CoRR, July, 2025

ShareGPT-4o-Image: Aligning Multimodal Models with GPT-4o-Level Image Generation.
CoRR, June, 2025

UCL-Bench: A Chinese User-Centric Legal Benchmark for Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025

LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via a Hybrid Architecture.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

Can Multimodal LLMs See Materials Clearly? A Multimodal Benchmark on Materials Characterization.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

Towards Medical Complex Reasoning with LLMs through Medical Verifiable Problems.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

Exploring Compositional Generalization of Multimodal LLMs for Medical Imaging.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024
Additional Self-Attention Transformer With Adapter for Thick Haze Removal.
IEEE Geosci. Remote. Sens. Lett., 2024

On the Compositional Generalization of Multimodal LLMs for Medical Imaging.
CoRR, 2024

HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs.
CoRR, 2024

HuatuoGPT-Vision, Towards Injecting Medical Visual Knowledge into Multimodal LLMs at Scale.
CoRR, 2024

Alignment at Pre-training! Towards Native Alignment for Arabic LLMs.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Towards Injecting Medical Visual Knowledge into Multimodal LLMs at Scale.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

