Shunian Chen

Orcid: 0009-0003-4996-0475

According to our database, Shunian Chen authored at least 21 papers between 2023 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
TalkVid: A Large-Scale Diversified Dataset for Audio-Driven Talking Head Synthesis.
CoRR, August, 2025

MedGen: Unlocking Medical Video Generation by Scaling Granularly-annotated Medical Videos.
CoRR, July, 2025

ShareGPT-4o-Image: Aligning Multimodal Models with GPT-4o-Level Image Generation.
CoRR, June, 2025

FusionAudio-1.2M: Towards Fine-grained Audio Captioning with Multimodal Contextual Fusion.
CoRR, June, 2025

Huatuo-26M, a Large-scale Chinese Medical QA Dataset.
Findings of the Association for Computational Linguistics: NAACL 2025, 2025

MLLM-Bench: Evaluating Multimodal LLMs with Per-sample Criteria.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

Less is More: A Simple yet Effective Token Reduction Method for Efficient Multi-modal LLMs.
Proceedings of the 31st International Conference on Computational Linguistics, 2025

2024
BlenderLLM: Training Large Language Models for Computer-Aided Design with Self-improvement.
CoRR, 2024

Both Text and Images Leaked! A Systematic Analysis of Multimodal LLM Data Contamination.
CoRR, 2024

LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture.
CoRR, 2024

Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications.
CoRR, 2024

HuatuoGPT-Vision, Towards Injecting Medical Visual Knowledge into Multimodal LLMs at Scale.
CoRR, 2024

MileBench: Benchmarking MLLMs in Long Context.
CoRR, 2024

ALLaVA: Harnessing GPT4V-synthesized Data for A Lite Vision-Language Model.
CoRR, 2024

Humans or LLMs as the Judge? A Study on Judgement Biases.
CoRR, 2024

VLFeedback: A Large-Scale AI Feedback Dataset for Large Vision-Language Models Alignment.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Towards Injecting Medical Visual Knowledge into Multimodal LLMs at Scale.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Humans or LLMs as the Judge? A Study on Judgement Bias.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

2023
Silkie: Preference Distillation for Large Visual Language Models.
CoRR, 2023

MLLM-Bench, Evaluating Multi-modal LLMs using GPT-4V.
CoRR, 2023

HuatuoGPT-II, One-stage Training for Medical Adaption of LLMs.
CoRR, 2023
