Ye-Xin Lu

This is a disambiguation page; it contains multiple papers by persons of the same or a similar name.

Bibliography

2025

Universal Discrete-Domain Speech Enhancement.

CoRR, October, 2025

DAIEN-TTS: Disentangled Audio Infilling for Environment-Aware Text-to-Speech Synthesis.

CoRR, September, 2025

Is GAN Necessary for Mel-Spectrogram-Based Neural Vocoder?

IEEE Signal Process. Lett., 2025

Explicit estimation of magnitude and phase spectra in parallel for high-quality speech enhancement.

Neural Networks, 2025

Improving Noise Robustness of LLM-based Zero-shot TTS via Discrete Acoustic Token Denoising.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Incremental Disentanglement for Environment-Aware Zero-Shot Text-to-Speech Synthesis.

Zheng-Yan Sheng

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Can Automated Speech Recognition Errors Provide Valuable Clues for Alzheimer's Disease Detection?

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

2024

APCodec: A Neural Audio Codec With Parallel Amplitude and Phase Spectrum Encoding and Decoding.

Xiao-Hang Jiang

IEEE ACM Trans. Audio Speech Lang. Process., 2024

ESTVocoder: An Excitation-Spectral-Transformed Neural Vocoder Conditioned on Mel Spectrogram.

Xiao-Hang Jiang

CoRR, 2024

Multi-Stage Speech Bandwidth Extension with Flexible Sampling Rate Control.

Zheng-Yan Sheng

CoRR, 2024

Towards High-Quality and Efficient Speech Bandwidth Extension with Parallel Amplitude and Phase Prediction.

CoRR, 2024

Pitch-and-Spectrum-Aware Singing Quality Assessment with Bias Correction and Model Fusion.

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

Stage-Wise and Prior-Aware Neural Speech Phase Prediction.

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

MDCTCodec: A Lightweight MDCT-Based Neural Audio Codec Towards High Sampling Rate and Low Bitrate Scenarios.

Xiao-Hang Jiang

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

SAMOS: A Neural MOS Prediction Model Leveraging Semantic Representations and Acoustic Features.

Proceedings of the 14th IEEE International Symposium on Chinese Spoken Language Processing, 2024

MultiStage Speech Bandwidth Extension with Flexible Sampling Rate Control.

Zheng-Yan Sheng

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

BiVocoder: A Bidirectional Neural Vocoder Integrating Feature Extraction and Waveform Generation.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

A Low-Bitrate Neural Audio Codec Framework with Bandwidth Reduction and Recovery for High-Sampling-Rate Waveforms.

Xiao-Hang Jiang

Zheng-Yan Sheng

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

2023

Long-Frame-Shift Neural Speech Phase Prediction With Spectral Continuity Enhancement and Interpolation Error Compensation.

IEEE Signal Process. Lett., 2023

Source-Filter-Based Generative Adversarial Neural Vocoder for High Fidelity Speech Synthesis.

CoRR, 2023

Nurturing Eco-Consciousness: The Journey of the EcoMorph Guardian in Shaping Tomorrow's Stewards.

Carlos Henrique Araújo de Aguiar

Proceedings of the 2023 Symposium on Learning, Design and Technology, 2023

MP-SENet: A Speech Enhancement Model with Parallel Denoising of Magnitude and Phase Spectra.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

The USTC-NERCSLIP System for the Track 1.2 of Audio Deepfake Detection (ADD 2023) Challenge.

Proceedings of the Workshop on Deepfake Audio Detection and Analysis co-located with 32nd International Joint Conference on Artificial Intelligence (IJCAI 2023), 2023
