Xie Chen

Orcid: 0000-0001-7423-617X

Affiliations:

Shanghai Jiao Tong University, China
Microsoft, Redmond, WA, USA (former)
University of Cambridge, UK (former)

According to our database, Xie Chen authored at least 74 papers between 2011 and 2024.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2024

Advanced Long-Content Speech Recognition With Factorized Neural Transducer.

IEEE ACM Trans. Audio Speech Lang. Process., 2024

Advanced Long-Content Speech Recognition With Factorized Neural Transducer.

CoRR, 2024

An Embarrassingly Simple Approach for LLM with Strong ASR Capacity.

CoRR, 2024

BAT: Learning to Reason about Spatial Sounds with Large Language Models.

CoRR, 2024

ELLA-V: Stable Neural Codec Language Modeling with Alignment-guided Sequence Reordering.

CoRR, 2024

EAT: Self-Supervised Pre-Training with Efficient Audio Transformer.

CoRR, 2024

2023

Speaker Adaptive Text-to-Speech With Timbre-Normalized Vector-Quantized Feature.

IEEE ACM Trans. Audio Speech Lang. Process., 2023

emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation.

CoRR, 2023

Expressive TTS Driven by Natural Language Prompts Using Few Human Annotations.

CoRR, 2023

Acoustic BPE for Speech Generation with Discrete Tokens.

CoRR, 2023

Leveraging Speech PTM, Text LLM, and Emotional TTS for Speech Emotion Recognition.

CoRR, 2023

Improved Factorized Neural Transducer Model For text-only Domain Adaptation.

Junzhe Liu

Jianwei Yu

Xie Chen

CoRR, 2023

Incorporating Class-based Language Model for Named Entity Recognition in Factorized Neural Transducer.

CoRR, 2023

Towards Universal Speech Discrete Tokens: A Case Study for ASR and TTS.

CoRR, 2023

VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching.

CoRR, 2023

Unsupervised Active Learning: Optimizing Labeling Cost-Effectiveness for Automatic Speech Recognition.

CoRR, 2023

DSE-TTS: Dual Speaker Embedding for Cross-Lingual Text-to-Speech.

CoRR, 2023

Towards Effective and Compact Contextual Representation for Conformer Transducer Speech Recognition Systems.

CoRR, 2023

Pushing the Limits of Unsupervised Unit Discovery for SSL Speech Representation.

CoRR, 2023

Improving Code-Switching and Named Entity Recognition in ASR with Speech Editing based Data Augmentation.

CoRR, 2023

UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding.

CoRR, 2023

Blank-regularized CTC for Frame Skipping in Neural Transducer.

CoRR, 2023

DAE-Talker: High Fidelity Speech-Driven Talking Face Generation with Diffusion Autoencoder.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

EmoDiff: Intensity Controllable Emotional Text-to-Speech with Soft-Label Guidance.

Proceedings of the IEEE International Conference on Acoustics, 2023

Factorized AED: Factorized Attention-Based Encoder-Decoder for Text-Only Domain Adaptive ASR.

Proceedings of the IEEE International Conference on Acoustics, 2023

LongFNT: Long-Form Speech Recognition with Factorized Neural Transducer.

Proceedings of the IEEE International Conference on Acoustics, 2023

Front-End Adapter: Adapting Front-End Input of Speech Based Self-Supervised Learning for Speech Recognition.

Proceedings of the IEEE International Conference on Acoustics, 2023

Improving Few-Shot Learning for Talking Face System with TTS Data Augmentation.

Proceedings of the IEEE International Conference on Acoustics, 2023

Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022

Exploring Effective Fusion Algorithms for Speech Based Self-Supervised Learning Models.

CoRR, 2022

MT4SSL: Boosting Self-Supervised Speech Representation Learning by Integrating Multiple Targets.

CoRR, 2022

Exploring Effective Distillation of Self-Supervised Speech Models for Automatic Speech Recognition.

CoRR, 2022

Internal Language Model Adaptation with Text-Only Data for End-to-End Speech Recognition.

Proceedings of the Interspeech 2022, 2022

VQTTS: High-Fidelity Text-to-Speech Synthesis with Self-Supervised VQ Acoustic Feature.

Proceedings of the Interspeech 2022, 2022

Factorized Neural Transducer for Efficient Language Model Adaptation.

Xie Chen

Zhong Meng

Sarangarajan Parthasarathy

Jinyu Li

Proceedings of the IEEE International Conference on Acoustics, 2022

2021

Internal Language Model Estimation for Domain-Adaptive End-to-End Speech Recognition.

Zhong Meng

Sarangarajan Parthasarathy

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Minimum Word Error Rate Training with Language Model Fusion for End-to-End Speech Recognition.

Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Improving RNN-T for Domain Scaling Using Semi-Supervised Training with Neural TTS.

Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Memory-Efficient Pipeline-Parallel DNN Training.

Proceedings of the 38th International Conference on Machine Learning, 2021

Internal Language Model Training for Domain-Adaptive End-To-End Speech Recognition.

Zhong Meng

Naoyuki Kanda

Yashesh Gaur

Sarangarajan Parthasarathy

Proceedings of the IEEE International Conference on Acoustics, 2021

Developing Real-Time Streaming Transformer Transducer for Speech Recognition on Large-Scale Dataset.

Proceedings of the IEEE International Conference on Acoustics, 2021

2020

LSTM-LM with Long-Term History for First-Pass Decoding in Conversational Speech Recognition.

Xie Chen

Sarangarajan Parthasarathy

William Gale

Shuangyu Chang

Michael Zeng

CoRR, 2020

Low-bit Quantization of Recurrent Neural Network Language Models Using Alternating Direction Methods of Multipliers.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019

Exploiting Future Word Contexts in Neural Network Language Models for Speech Recognition.

Jeremy Heng Meng Wong

Mark J. F. Gales

IEEE ACM Trans. Audio Speech Lang. Process., 2019

Long-span language modeling for speech recognition.

Sarangarajan Parthasarathy

CoRR, 2019

Recurrent Neural Network Language Model Training Using Natural Gradient.

Proceedings of the IEEE International Conference on Acoustics, 2019

Gaussian Process LSTM Recurrent Neural Network Language Models for Speech Recognition.

Proceedings of the IEEE International Conference on Acoustics, 2019

Investigation of Sampling Techniques for Maximum Entropy Language Modeling Training.

Proceedings of the IEEE International Conference on Acoustics, 2019

2018

Active Memory Networks for Language Modeling.

Proceedings of the Interspeech 2018, 2018

Neural Network Language Modeling with Letter-Based Features and Importance Sampling.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Phonetic and Graphemic Systems for Multi-Genre Broadcast Transcription.

Jeremy Heng Meng Wong

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Limited-Memory BFGS Optimization of Recurrent Neural Network Language Models for Speech Recognition.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

The Effect of Adding Authorship Knowledge in Automated Text Scoring.

Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications@NAACL-HLT 2018, 2018

2017

Future Word Contexts in Neural Network Language Models.

CoRR, 2017

Investigating Bidirectional Recurrent Neural Network Language Models for Speech Recognition.

Proceedings of the Interspeech 2017, 2017

Exploiting the Tibetan Radicals in Recurrent Neural Network for Low-Resource Language Models.

Proceedings of the Neural Information Processing - 24th International Conference, 2017

Recurrent neural network language models for keyword search.

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Future word contexts in neural network language models.

Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

2016

Two Efficient Lattice Rescoring Methods Using Recurrent Neural Network Language Models.

IEEE ACM Trans. Audio Speech Lang. Process., 2016

Efficient Training and Evaluation of Recurrent Neural Network Language Models for Automatic Speech Recognition.

IEEE ACM Trans. Audio Speech Lang. Process., 2016

Multi-Language Neural Network Language Models.

Proceedings of the Interspeech 2016, 2016

CUED-RNNLM - An open-source toolkit for efficient training and evaluation of recurrent neural network language models.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015

Recurrent neural network language model adaptation for multi-genre broadcast speech recognition.

Proceedings of the INTERSPEECH 2015, 2015

Paraphrastic recurrent neural network language models.

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Robust excitation-based features for Automatic Speech Recognition.

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Recurrent neural network language model training with noise contrastive estimation for speech recognition.

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Improving the training and evaluation efficiency of recurrent neural network language models.

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Investigation of back-off based interpolation between recurrent neural network and n-gram language models.

Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

2014

Efficient GPU-based training of recurrent neural network language models using spliced sentence bunch.

Proceedings of the INTERSPEECH 2014, 2014

An initial investigation of long-term adaptation for meeting transcription.

Proceedings of the INTERSPEECH 2014, 2014

Impact of single-microphone dereverberation on DNN-based meeting transcription systems.

Takuya Yoshioka

Xie Chen

Mark J. F. Gales

Proceedings of the IEEE International Conference on Acoustics, 2014

Efficient lattice rescoring using recurrent neural network language models.

Proceedings of the IEEE International Conference on Acoustics, 2014

2012

Pipelined Back-Propagation for Context-Dependent Deep Neural Networks.

Proceedings of the INTERSPEECH 2012, 2012

2011

Feature engineering in Context-Dependent Deep Neural Networks for conversational speech transcription.

Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011
