Shaojin Ding

Yanzhang He

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

Efficient Cascaded Streaming ASR System Via Frame Rate Reduction.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022

Accentron: Foreign accent conversion to arbitrary non-native speakers using zero-shot learning.

Comput. Speech Lang., 2022

A Unified Cascaded Encoder ASR Model for Dynamic Model Sizes.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Personal VAD 2.0: Optimizing Personal Voice Activity Detection for On-Device Speech Recognition.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

4-bit Conformer with Native Quantization Aware Training for Speech Recognition.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Audio Lottery: Speech Recognition Made Ultra-Lightweight, Noise-Robust, and Transferable.

Tianlong Chen

Zhangyang Wang

Proceedings of the Tenth International Conference on Learning Representations, 2022

Towards Lifelong Learning of Multilingual Text-to-Speech Synthesis.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

2021

Converting Foreign Accent Speech Without a Reference.

IEEE ACM Trans. Audio Speech Lang. Process., 2021

Textual Echo Cancellation.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020

Learning Structured Sparse Representations for Voice Conversion.

IEEE ACM Trans. Audio Speech Lang. Process., 2020

Personal VAD: Speaker-Conditioned Voice Activity Detection.

Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020

Improving the Speaker Identity of Non-Parallel Many-to-Many Voice Conversion with Adversarial Speaker Recognition.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

AutoSpeech: Neural Architecture Search for Speaker Recognition.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

2019

Golden speaker builder - An interactive tool for pronunciation training.

Evgeny Chukharev-Hudilainen

John Levis

Speech Commun., 2019

Foreign Accent Conversion by Synthesizing Speech from Phonetic Posteriorgrams.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Group Latent Embedding for Vector Quantized Variational Autoencoder in Non-Parallel Voice Conversion.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

ABD-Net: Attentive but Diverse Person Re-Identification.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

2018

Improving Sparse Representations in Exemplar-Based Voice Conversion with a Phoneme-Selective Objective Function.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Learning Structured Dictionaries for Exemplar-based Voice Conversion.
