Ziyue Jiang
Affiliations: Zhejiang University, Hangzhou, China
According to our database, Ziyue Jiang authored at least 44 papers between 2020 and 2025.
Bibliography
2025
Discl-VC: Disentangled Discrete Tokens and In-Context Learning for Controllable Zero-Shot Voice Conversion.
CoRR, May, 2025
CoRR, March, 2025
Sparse Alignment Enhanced Latent Diffusion Transformer for Zero-Shot Speech Synthesis.
CoRR, February, 2025
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
Enhancing Expressive Voice Conversion with Discrete Pitch-Conditioned Flow Matching Model.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025
Proceedings of the 31st International Conference on Computational Linguistics, 2025
Rhythm Controllable and Efficient Zero-Shot Voice Conversion via Shortcut Flow Matching.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025
Proceedings of the Findings of the Association for Computational Linguistics, 2025
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025
2024
CoRR, 2024
FluentEditor+: Text-based Speech Editing by Modeling Local Hierarchical Acoustic Smoothness and Global Prosody Consistency.
CoRR, 2024
DiffEditor: Enhancing Speech Editing with Semantic Enrichment and Acoustic Consistency.
CoRR, 2024
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling.
CoRR, 2024
ControlSpeech: Towards Simultaneous Zero-shot Speaker Cloning and Zero-shot Language Style Control With Decoupled Codec.
CoRR, 2024
Language-Codec: Reducing the Gaps Between Discrete Codec Representation and Speech Language Models.
CoRR, 2024
GTSinger: A Global Multi-Technique Singing Corpus with Realistic Music Scores for All Singing Tasks.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
VoiceTuner: Self-Supervised Pre-training and Efficient Fine-tuning For Voice Generation.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024
FluentEditor: Text-based Speech Editing by Considering Acoustic and Prosody Consistency.
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Proceedings of the Twelfth International Conference on Learning Representations, 2024
Proceedings of the Twelfth International Conference on Learning Representations, 2024
TCSinger: Zero-Shot Singing Voice Synthesis with Style Transfer and Multi-Level Style Control.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
MobileSpeech: A Fast and High-Fidelity Framework for Mobile Zero-Shot Text-to-Speech.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
Make-A-Voice: Revisiting Voice Large Language Models as Scalable Multilingual and Multitask Learners.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
2023
TextrolSpeech: A Text Style Control Speech Corpus With Codec Language Text-to-Speech Models.
CoRR, 2023
CoRR, 2023
CoRR, 2023
GeneFace++: Generalized and Stable Real-Time Audio-Driven 3D Talking Face Generation.
CoRR, 2023
Proceedings of the Eleventh International Conference on Learning Representations, 2023
CLAPSpeech: Learning Prosody from Text Context with Contrastive Language-Audio Pre-Training.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023
FluentSpeech: Stutter-Oriented Automatic Speech Editing with Context-Aware Diffusion Models.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023
FastDiff 2: Revisiting and Incorporating GANs and Diffusion Models in High-Fidelity Speech Synthesis.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023
2022
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
2021
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021
2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020