Changhao Pan

Orcid: 0009-0004-6023-1764

According to our database, Changhao Pan authored at least 17 papers between 2024 and 2026.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2026
Diffusion Model as a Generalist Segmentation Learner.
CoRR, April, 2026

ImVideoEdit: Image-learning Video Editing via 2D Spatial Difference Attention Blocks.
CoRR, April, 2026

Modeling and Benchmarking Spoken Dialogue Rewards with Modality and Colloquialness.
CoRR, March, 2026

2025
MRSAudio: A Large-Scale Multimodal Recorded Spatial Audio Dataset with Refined Annotations.
CoRR, October, 2025

STARS: A Unified Framework for Singing Transcription, Alignment, and Refined Style Annotation.
CoRR, July, 2025

Speech Quality Assessment Model Based on Mixture of Experts: System-Level Performance Enhancement and Utterance-Level Challenge Analysis.
CoRR, July, 2025

A Multimodal Evaluation Framework for Spatial Audio Playback Systems: From Localization to Listener Preference.
Proceedings of the 33rd ACM International Conference on Multimedia, 2025

ISDrama: Immersive Spatial Drama Generation through Multimodal Prompting.
Proceedings of the 33rd ACM International Conference on Multimedia, 2025

ASAudio: A Survey of Advanced Spatial Audio Research.
Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, 2025

Synthetic Singers: A Review of Deep-Learning-based Singing Voice Synthesis Approaches.
Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, 2025

Versatile Framework for Song Generation with Prompt-based Control.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

TCSinger 2: Customizable Multilingual Zero-shot Singing Voice Synthesis.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

STARS: A Unified Framework for Singing Transcription, Alignment, and Refined Style Annotation.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

TechSinger: Technique Controllable Multilingual Singing Voice Synthesis via Flow Matching.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024
Interactive Table Synthesis With Natural Language.
IEEE Trans. Vis. Comput. Graph., September, 2024

GTSinger: A Global Multi-Technique Singing Corpus with Realistic Music Scores for All Singing Tasks.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

TCSinger: Zero-Shot Singing Voice Synthesis with Style Transfer and Multi-Level Style Control.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

