We stand with Ukraine

Changhao Pan

Orcid: 0009-0004-6023-1764

According to our database, Changhao Pan authored at least 20 papers between 2024 and 2026.

Collaborative distances:

Dijkstra number of five.
Erdős number of four.

Bibliography

2026

Towards Streaming Synchronized Spatial Audio Generation via Autoregressive Diffusion Transformer.

CoRR, May, 2026

TMD-Bench: A Multi-Level Evaluation Paradigm for Music-Dance Co-Generation.

CoRR, May, 2026

Diffusion Model as a Generalist Segmentation Learner.

CoRR, April, 2026

ImVideoEdit: Image-learning Video Editing via 2D Spatial Difference Attention Blocks.

CoRR, April, 2026

Modeling and Benchmarking Spoken Dialogue Rewards with Modality and Colloquialness.

CoRR, March, 2026

SDiaReward: Modeling and Benchmarking Spoken Dialogue Rewards with Modality and Colloquialness.

Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2026

2025

STARS: A Unified Framework for Singing Transcription, Alignment, and Refined Style Annotation.

CoRR, July, 2025

Speech Quality Assessment Model Based on Mixture of Experts: System-Level Performance Enhancement and Utterance-Level Challenge Analysis.

CoRR, July, 2025

MRSAudio: A Large-Scale Multimodal Recorded Spatial Audio Dataset with Refined Annotations.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

A Multimodal Evaluation Framework for Spatial Audio Playback Systems: From Localization to Listener Preference.

Proceedings of the 33rd ACM International Conference on Multimedia, 2025

ISDrama: Immersive Spatial Drama Generation through Multimodal Prompting.

Proceedings of the 33rd ACM International Conference on Multimedia, 2025

ASAudio: A Survey of Advanced Spatial Audio Research.

Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, 2025

Synthetic Singers: A Review of Deep-Learning-based Singing Voice Synthesis Approaches.

Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, 2025

Versatile Framework for Song Generation with Prompt-based Control.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

TCSinger 2: Customizable Multilingual Zero-shot Singing Voice Synthesis.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

STARS: A Unified Framework for Singing Transcription, Alignment, and Refined Style Annotation.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

TechSinger: Technique Controllable Multilingual Singing Voice Synthesis via Flow Matching.

Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024

Interactive Table Synthesis With Natural Language.

IEEE Trans. Vis. Comput. Graph., September, 2024

GTSinger: A Global Multi-Technique Singing Corpus with Realistic Music Scores for All Singing Tasks.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

TCSinger: Zero-Shot Singing Voice Synthesis with Style Transfer and Multi-Level Style Control.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
