Yongmao Zhang

ORCID: 0009-0000-0526-5778

According to our database, Yongmao Zhang authored at least 19 papers between 2022 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

POSIM: A Multi-Agent Simulation Framework for Social Media Public Opinion Evolution and Governance.

CoRR, March, 2026

2025

MiniMax-Speech: Intrinsic Zero-Shot Text-to-Speech with a Learnable Speaker Encoder.

CoRR, May, 2025

2024

SStackGNN: Graph Data Augmentation Simplified Stacking Graph Neural Network for Twitter Bot Detection.

Int. J. Comput. Intell. Syst., December, 2024

METTS: Multilingual Emotional Text-to-Speech by Cross-Speaker and Cross-Lingual Emotion Transfer.

IEEE ACM Trans. Audio Speech Lang. Process., 2024

Text-aware and Context-aware Expressive Audiobook Speech Synthesis.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

2023

Accent-VITS: accent transfer for end-to-end TTS.

CoRR, 2023

The NPU-MSXF Speech-to-Speech Translation System for IWSLT 2023 Speech-to-Speech Translation Task.

Proceedings of the 20th International Conference on Spoken Language Translation, 2023

VISinger2: High-Fidelity End-to-End Singing Voice Synthesis Enhanced by Digital Signal Processing Synthesizer.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

PromptStyle: Controllable Style Transfer for Text-to-Speech with Natural Language Descriptions.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Multi-Speaker Expressive Speech Synthesis via Multiple Factors Decoupling.

Proceedings of the IEEE International Conference on Acoustics, 2023

DSPGAN: A Gan-Based Universal Vocoder for High-Fidelity TTS by Time-Frequency Domain Supervision from DSP.

Proceedings of the IEEE International Conference on Acoustics, 2023

Promptspeaker: Speaker Generation Based on Text Descriptions.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022

VISinger 2: High-Fidelity End-to-End Singing Voice Synthesis Enhanced by Digital Signal Processing Synthesizer.

CoRR, 2022

AccentSpeech: Learning Accent from Crowd-sourced Data for Target Speaker TTS with Accents.

Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

AdaVITS: Tiny VITS for Low Computing Resource Speaker Adaptation.

Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

Robust MelGAN: A robust universal neural vocoder for high-fidelity TTS.

Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

Learn2Sing 2.0: Diffusion and Mutual Information-Based Target Speaker SVS by Learning from Singing Teacher.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Opencpop: A High-Quality Open Source Chinese Popular Song Corpus for Singing Voice Synthesis.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

VISinger: Variational Inference with Adversarial Learning for End-to-End Singing Voice Synthesis.

Proceedings of the IEEE International Conference on Acoustics, 2022
