Chenpeng Du

ORCID: 0000-0001-5329-0847

According to our database, Chenpeng Du authored at least 41 papers between 2019 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

Recent Advances in Discrete Speech Tokens: A Review.

IEEE Trans. Pattern Anal. Mach. Intell., April, 2026

2025

DiSTAR: Diffusion over a Scalable Token Autoregressive Representation for Speech Generation.

CoRR, October, 2025

MagiCodec: Simple Masked Gaussian-Injected Codec for High-Fidelity Reconstruction and Generation.

CoRR, June, 2025

LSCodec: Low-Bitrate and Speaker-Decoupled Discrete Speech Codec.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

DiTAR: Diffusion Transformer Autoregressive Modeling for Speech Generation.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

VALL-T: Decoder-Only Generative Transducer for Robust and Decoding-Controllable Text-to-Speech.

Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

Towards Reliable Large Audio Language Model.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

Language Model Can Listen While Speaking.

Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024

E$^{3}$TTS: End-to-End Text-Based Speech Editing TTS System and Its Applications.

IEEE ACM Trans. Audio Speech Lang. Process., 2024

Why Do Speech Language Models Fail to Generate Semantically Coherent Outputs? A Modality Evolving Perspective.

CoRR, 2024

vec2wav 2.0: Advancing Voice Conversion via Discrete Token Vocoders.

CoRR, 2024

GSTalker: Real-time Audio-Driven Talking Face Generation via Deformable Gaussian Splatting.

CoRR, 2024

The X-LANCE Technical Report for Interspeech 2024 Speech Processing Using Discrete Speech Unit Challenge.

CoRR, 2024

Attention-Constrained Inference For Robust Decoder-Only Text-to-Speech.

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

AniTalker: Animate Vivid and Diverse Talking Faces through Identity-Decoupled Facial Motion Encoding.

Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

The X-Lance Technical Report for Interspeech 2024 Speech Processing using Discrete Speech Unit Challenge.

Proceedings of the 14th IEEE International Symposium on Chinese Spoken Language Processing, 2024

Generation-Based Target Speech Extraction with Speech Discretization and Vocoder.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

Towards Universal Speech Discrete Tokens: A Case Study for ASR and TTS.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

Acoustic BPE for Speech Generation with Discrete Tokens.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

DiffDub: Person-Generic Visual Dubbing Using Inpainting Renderer with Diffusion Auto-Encoder.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

VoiceFlow: Efficient Text-To-Speech with Rectified Flow Matching.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

Speaker Adaptive Text-to-Speech With Timbre-Normalized Vector-Quantized Feature.

IEEE ACM Trans. Audio Speech Lang. Process., 2023

Improving Code-Switching and Named Entity Recognition in ASR with Speech Editing based Data Augmentation.

CoRR, 2023

DAE-Talker: High Fidelity Speech-Driven Talking Face Generation with Diffusion Autoencoder.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

DSE-TTS: Dual Speaker Embedding for Cross-Lingual Text-to-Speech.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Improving Code-Switching and Name Entity Recognition in ASR with Speech Editing based Data Augmentation.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Emodiff: Intensity Controllable Emotional Text-to-Speech with Soft-Label Guidance.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Multi-Speaker Multi-Lingual VQTTS System for LIMMITS 2023 Challenge.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

2022

Phone-Level Prosody Modelling With GMM-Based MDN for Diverse and Controllable Speech Synthesis.

Chenpeng Du

Kai Yu

IEEE ACM Trans. Audio Speech Lang. Process., 2022

Neural Fusion for Voice Cloning.

Bo Chen

Chenpeng Du

Kai Yu

IEEE ACM Trans. Audio Speech Lang. Process., 2022

VQTTS: High-Fidelity Text-to-Speech Synthesis with Self-Supervised VQ Acoustic Feature.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Unsupervised Word-Level Prosody Tagging for Controllable Speech Synthesis.

Yiwei Guo

Chenpeng Du

Kai Yu

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

2021

Diverse and Controllable Speech Synthesis with GMM-Based Phone-Level Prosody Modelling.

Chenpeng Du

Kai Yu

CoRR, 2021

Mixture Density Network for Phone-Level Prosody Modelling in Speech Synthesis.

Chenpeng Du

Kai Yu

CoRR, 2021

Data Augmentation for end-to-end Code-Switching Speech Recognition.

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Rich Prosody Diversity Modelling with Phone-Level Mixture Density Network.

Chenpeng Du

Kai Yu

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Towards Data Selection on TTS Data for Children's Speech Recognition.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

SynAug: Synthesis-Based Data Augmentation for Text-Dependent Speaker Verification.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

2020

Speaker Augmentation for Low Resource Speech Recognition.

Chenpeng Du

Kai Yu

Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

2019

SJTU Entry in Blizzard Challenge 2019.

Proceedings of the Blizzard Challenge 2019, Vienna, Austria, September 23, 2019, 2019
