Yongmao Zhang

ORCID: 0009-0000-0526-5778

According to our database, Yongmao Zhang authored at least 16 papers between 2022 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
METTS: Multilingual Emotional Text-to-Speech by Cross-Speaker and Cross-Lingual Emotion Transfer.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

2023
Application of wearable devices based on deep learning algorithm in rope skipping data monitoring.
Soft Comput., May, 2023

Accent-VITS: accent transfer for end-to-end TTS.
CoRR, 2023

PromptSpeaker: Speaker Generation Based on Text Descriptions.
CoRR, 2023

PromptStyle: Controllable Style Transfer for Text-to-Speech with Natural Language Descriptions.
CoRR, 2023

The NPU-MSXF Speech-to-Speech Translation System for IWSLT 2023 Speech-to-Speech Translation Task.
Proceedings of the 20th International Conference on Spoken Language Translation, 2023

Multi-Speaker Expressive Speech Synthesis via Multiple Factors Decoupling.
Proceedings of the IEEE International Conference on Acoustics, 2023

DSPGAN: A GAN-Based Universal Vocoder for High-Fidelity TTS by Time-Frequency Domain Supervision from DSP.
Proceedings of the IEEE International Conference on Acoustics, 2023

Promptspeaker: Speaker Generation Based on Text Descriptions.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022
VISinger 2: High-Fidelity End-to-End Singing Voice Synthesis Enhanced by Digital Signal Processing Synthesizer.
CoRR, 2022

AccentSpeech: Learning Accent from Crowd-sourced Data for Target Speaker TTS with Accents.
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

AdaVITS: Tiny VITS for Low Computing Resource Speaker Adaptation.
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

Robust MelGAN: A robust universal neural vocoder for high-fidelity TTS.
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

Learn2Sing 2.0: Diffusion and Mutual Information-Based Target Speaker SVS by Learning from Singing Teacher.
Proceedings of the Interspeech 2022, 2022

Opencpop: A High-Quality Open Source Chinese Popular Song Corpus for Singing Voice Synthesis.
Proceedings of the Interspeech 2022, 2022

VISinger: Variational Inference with Adversarial Learning for End-to-End Singing Voice Synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2022
