Shunian Chen

Orcid: 0009-0003-4996-0475

According to our database, Shunian Chen authored at least 26 papers between 2023 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
Do Phone-Use Agents Respect Your Privacy?
CoRR, April, 2026

EvA: An Evidence-First Audio Understanding Paradigm for LALMs.
CoRR, March, 2026

From Lossy to Verified: A Provenance-Aware Tiered Memory for Agents.
CoRR, February, 2026

2025
TalkVid: A Large-Scale Diversified Dataset for Audio-Driven Talking Head Synthesis.
CoRR, August, 2025

MedGen: Unlocking Medical Video Generation by Scaling Granularly-annotated Medical Videos.
CoRR, July, 2025

ShareGPT-4o-Image: Aligning Multimodal Models with GPT-4o-Level Image Generation.
CoRR, June, 2025

FusionAudio-1.2M: Towards Fine-grained Audio Captioning with Multimodal Contextual Fusion.
CoRR, June, 2025

Huatuo-26M, a Large-scale Chinese Medical QA Dataset.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025

MLLM-Bench: Evaluating Multimodal LLMs with Per-sample Criteria.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via a Hybrid Architecture.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

Both Text and Images Leaked! A Systematic Analysis of Data Contamination in Multimodal LLM.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

Less is More: A Simple yet Effective Token Reduction Method for Efficient Multi-modal LLMs.
Proceedings of the 31st International Conference on Computational Linguistics, 2025

2024
BlenderLLM: Training Large Language Models for Computer-Aided Design with Self-improvement.
CoRR, 2024

Both Text and Images Leaked! A Systematic Analysis of Multimodal LLM Data Contamination.
CoRR, 2024

LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture.
CoRR, 2024

Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications.
CoRR, 2024

HuatuoGPT-Vision, Towards Injecting Medical Visual Knowledge into Multimodal LLMs at Scale.
CoRR, 2024

MileBench: Benchmarking MLLMs in Long Context.
CoRR, 2024

ALLaVA: Harnessing GPT4V-synthesized Data for A Lite Vision-Language Model.
CoRR, 2024

Humans or LLMs as the Judge? A Study on Judgement Biases.
CoRR, 2024

VLFeedback: A Large-Scale AI Feedback Dataset for Large Vision-Language Models Alignment.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Towards Injecting Medical Visual Knowledge into Multimodal LLMs at Scale.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Humans or LLMs as the Judge? A Study on Judgement Bias.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

2023
Silkie: Preference Distillation for Large Visual Language Models.
CoRR, 2023

MLLM-Bench, Evaluating Multi-modal LLMs using GPT-4V.
CoRR, 2023

HuatuoGPT-II, One-stage Training for Medical Adaption of LLMs.
CoRR, 2023

