Hui-Peng Du

Orcid: 0009-0000-9831-6086

According to our database1, Hui-Peng Du authored at least 18 papers between 2024 and 2025.

Collaborative distances:
  • Dijkstra number2 of four.
  • Erdős number3 of four.

Timeline

Legend:

Book 
In proceedings 
Article 
PhD thesis 
Dataset
Other 

Links

On csauthors.net:

Bibliography

2025
Universal Discrete-Domain Speech Enhancement.
CoRR, October, 2025

DAIEN-TTS: Disentangled Audio Infilling for Environment-Aware Text-to-Speech Synthesis.
CoRR, September, 2025

Say More with Less: Variable-Frame-Rate Speech Tokenization via Adaptive Clustering and Implicit Duration Coding.
CoRR, September, 2025

Is GAN Necessary for Mel-Spectrogram-Based Neural Vocoder?
IEEE Signal Process. Lett., 2025

Improving Noise Robustness of LLM-based Zero-shot TTS via Discrete Acoustic Token Denoising.
Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Vision-Integrated High-Quality Neural Speech Coding.
Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Incremental Disentanglement for Environment-Aware Zero-Shot Text-to-Speech Synthesis.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

CASC-XVC: Zero-Shot Cross-Lingual Voice Conversion with Content Accordant and Speaker Contrastive Losses.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

2024
APCodec: A Neural Audio Codec With Parallel Amplitude and Phase Spectrum Encoding and Decoding.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

ESTVocoder: An Excitation-Spectral-Transformed Neural Vocoder Conditioned on Mel Spectrogram.
CoRR, 2024

Towards High-Quality and Efficient Speech Bandwidth Extension with Parallel Amplitude and Phase Prediction.
CoRR, 2024

Pitch-and-Spectrum-Aware Singing Quality Assessment with Bias Correction and Model Fusion.
Proceedings of the IEEE Spoken Language Technology Workshop, 2024

Stage-Wise and Prior-Aware Neural Speech Phase Prediction.
Proceedings of the IEEE Spoken Language Technology Workshop, 2024

MDCTCodec: A Lightweight MDCT-Based Neural Audio Codec Towards High Sampling Rate and Low Bitrate Scenarios.
Proceedings of the IEEE Spoken Language Technology Workshop, 2024

SAMOS: A Neural MOS Prediction Model Leveraging Semantic Representations and Acoustic Features.
Proceedings of the 14th IEEE International Symposium on Chinese Spoken Language Processing, 2024

APCodec+: A Spectrum-Coding-Based High-Fidelity and High-Compression-Rate Neural Audio Codec with Staged Training Paradigm.
Proceedings of the 14th IEEE International Symposium on Chinese Spoken Language Processing, 2024

BiVocoder: A Bidirectional Neural Vocoder Integrating Feature Extraction and Waveform Generation.
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Considering Temporal Connection between Turns for Conversational Speech Synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2024


  Loading...