Fenglong Xie

Orcid: 0000-0002-1206-3696

According to our database, Fenglong Xie authored at least 29 papers between 2012 and 2025.

Collaborative distances:

Dijkstra number of five.
Erdős number of four.

Bibliography

2025

FireRedChat: A Pluggable, Full-Duplex Voice Interaction System with Cascaded and Semi-Cascaded Implementations.

CoRR, September, 2025

FireRedTTS-2: Towards Long Conversational Speech Generation for Podcast and Chatbot.

CoRR, September, 2025

FireRedTTS-1S: An Upgraded Streamable Foundation Text-to-Speech System.

CoRR, March, 2025

FireRedASR: Open-Source Industrial-Grade Mandarin Speech Recognition Models from Encoder-Decoder to LLM Integration.

CoRR, January, 2025

Speaking from Coarse to Fine: Improving Neural Codec Language Model via Multi-Scale Speech Coding and Generation.

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

PodAgent: A Comprehensive Framework for Podcast Generation.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024

FireRedTTS: A Foundation Text-To-Speech Framework for Industry-Level Generative Speech Applications.

CoRR, 2024

Addressing Index Collapse of Large-Codebook Speech Tokenizer With Dual-Decoding Product-Quantized Variational Auto-Encoder.

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

SoCodec: A Semantic-Ordered Multi-Stream Speech Codec For Efficient Language Model Based Text-to-Speech Synthesis.

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

2023

MSMC-TTS: Multi-Stage Multi-Codebook VQ-VAE Based Neural TTS.

IEEE ACM Trans. Audio Speech Lang. Process., 2023

QS-TTS: Towards Semi-Supervised Text-to-Speech Synthesis via Vector-Quantized Self-Supervised Speech Representation Learning.

CoRR, 2023

FireRedTTS: The Xiaohongshu Speech Synthesis System for Blizzard Challenge 2023.

Kun Xie

Yi-Chen Wu

Feng-Long Xie

Proceedings of the 18th Blizzard Challenge Workshop, Grenoble, France, August 29, 2023, 2023

2022

Towards High-Quality Neural TTS for Low-Resource Languages by Learning Compact Speech Representations.

CoRR, 2022

A Multi-Stage Multi-Codebook VQ-VAE Approach to High-Performance Neural TTS.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

2021

Triple M: A Practical Neural Text-to-speech System With Multi-guidance Attention And Multi-band Multi-time Lpcnet.

CoRR, 2021

Triple M: A Practical Text-to-Speech Synthesis System with Multi-Guidance Attention and Multi-Band Multi-Time LPCNet.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

A New High Quality Trajectory Tiling Based Hybrid TTS In Real Time.

Proceedings of the IEEE International Conference on Acoustics, 2021

Nana-HDR: A Non-attentive Non-autoregressive Hybrid Model for TTS.

Proceedings of the Blizzard Challenge 2021, virtual, October 23, 2021, 2021

2020

Improving End-to-End Speech Synthesis with Local Recurrent Neural Network Enhanced Transformer.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

An Improved Frame-Unit-Selection Based Voice Conversion System Without Parallel Training Data.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019

Voice conversion with SI-DNN and KL divergence based mapping without parallel training data.

Feng-Long Xie

Frank K. Soong

Haifeng Li

Speech Commun., 2019

2018

LP-WaveNet: Linear Prediction-based WaveNet Speech Synthesis.

CoRR, 2018

Frame Selection in SI-DNN Phonetic Space with WaveNet Vocoder for Voice Conversion without Parallel Training Data.

Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018

2016

A KL Divergence and DNN-Based Approach to Voice Conversion without Parallel Training Sentences.

Feng-Long Xie

Frank K. Soong

Haifeng Li

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

A KL divergence and DNN approach to cross-lingual TTS.

Feng-Long Xie

Frank K. Soong

Haifeng Li

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2014

Pitch transformation in neural network based voice conversion.

Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Sequence error (SE) minimization training of neural network for voice conversion.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

TTS synthesis with bidirectional LSTM based recurrent neural networks.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

2012

Cross validation and Minimum Generation Error for improved model clustering in HMM-based TTS.

Feng-Long Xie

Yi-Jian Wu

Frank K. Soong

Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012
