Yu Zhang

Orcid: 0009-0007-4594-0281

Affiliations:

Zhejiang University, Hangzhou, China

According to our database, Yu Zhang authored at least 20 papers between 2024 and 2026.

Collaborative distances:

Dijkstra number of five.
Erdős number of four.

Bibliography

2026

Towards Streaming Synchronized Spatial Audio Generation via Autoregressive Diffusion Transformer.

CoRR, May, 2026

Rectifying the Emotional Flow: Aligning Priors and Dynamic Guidance for High-Arousal Text-to-Speech.

Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2026

2025

STARS: A Unified Framework for Singing Transcription, Alignment, and Refined Style Annotation.

CoRR, July, 2025

Leveraging Pretrained Diffusion Models for Zero-Shot Part Assembly.

CoRR, May, 2025

Sparse Alignment Enhanced Latent Diffusion Transformer for Zero-Shot Speech Synthesis.

CoRR, February, 2025

MRSAudio: A Large-Scale Multimodal Recorded Spatial Audio Dataset with Refined Annotations.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

A Multimodal Evaluation Framework for Spatial Audio Playback Systems: From Localization to Listener Preference.

Proceedings of the 33rd ACM International Conference on Multimedia, 2025

ISDrama: Immersive Spatial Drama Generation through Multimodal Prompting.

Proceedings of the 33rd ACM International Conference on Multimedia, 2025

ASAudio: A Survey of Advanced Spatial Audio Research.

Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, 2025

Synthetic Singers: A Review of Deep-Learning-based Singing Voice Synthesis Approaches.

Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, 2025

Versatile Framework for Song Generation with Prompt-based Control.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

Conan: A Chunkwise Online Network for Zero-Shot Adaptive Voice Conversion.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2025

TCSinger 2: Customizable Multilingual Zero-shot Singing Voice Synthesis.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

STARS: A Unified Framework for Singing Transcription, Alignment, and Refined Style Annotation.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

TechSinger: Technique Controllable Multilingual Singing Voice Synthesis via Flow Matching.

Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024

GTSinger: A Global Multi-Technique Singing Corpus with Realistic Music Scores for All Singing Tasks.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Denoising algorithm for medical ultrasound image with improved threshold neighborhood-mean filtering.

Proceedings of the 2024 5th International Symposium on Artificial Intelligence for Medicine Science, 2024

TCSinger: Zero-Shot Singing Voice Synthesis with Style Transfer and Multi-Level Style Control.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Robust Singing Voice Transcription Serves Synthesis.

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

StyleSinger: Style Transfer for Out-of-Domain Singing Voice Synthesis.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
