Di He

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

2024

Turn-Taking and Backchannel Prediction with Acoustic and Large Language Model Fusion.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

2023

Personalized Predictive ASR for Latency Reduction in Voice Assistants.

Andreas Schwarz

Maarten Van Segbroeck

Mohammed Hethnawi

Ariya Rastrow

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Adaptive Endpointing with Deep Contextual Multi-Armed Bandits.

Viet Anh Trinh

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Towards Accurate and Real-Time End-of-Speech Estimation.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Two-Pass Endpoint Detection for Speech Recognition.

Roland Maas

Ariya Rastrow

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022

VADOI: Voice-Activity-Detection Overlapping Inference for End-To-End Long-Form Speech Recognition.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

2021

wav2vec-C: A Self-Supervised Model for Speech Representation Learning.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

2019

The benefits of acoustic perceptual information for speech processing systems

PhD thesis, 2019

When CTC Training Meets Acoustic Landmarks.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

2018

Augmenting Input Method Language Model with User Location Type Information.

CoRR, 2018

Improved ASR for Under-resourced Languages through Multi-task Learning with Acoustic Landmarks.

Boon Pang Lim

Xuesong Yang

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

2017

Acoustic Landmarks Contain More Information About the Phone String than Other Frames.

Boon Pang Lim

Xuesong Yang

CoRR, 2017

Using Approximated Auditory Roughness as a Pre-Filtering Feature for Human Screaming and Affective Speech AED.

Zuofu Cheng