He Huang
Affiliations: NVIDIA, Santa Clara, USA
According to our database, He Huang authored at least 20 papers between 2023 and 2025.
Bibliography
2025
Streaming Sortformer: Speaker Cache-Based Online Speaker Diarization with Arrival-Time Ordering.
CoRR, July, 2025
Recent Trends in Distant Conversational Speech Recognition: A Review of CHiME-7 and 8 DASR Challenges.
CoRR, July, 2025
CoRR, May, 2025
CoRR, May, 2025
VoiceTextBlender: Augmenting Large Language Models with Speech Capabilities via Single-Stage Joint Speech-Text Supervised Fine-Tuning.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025
META-CAT: Speaker-Informed Speech Embeddings via Meta Information Concatenation for Multi-talker ASR.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025
NEST: Self-supervised Fast Conformer as All-purpose Seasoning to Speech Processing Tasks.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025
2024
Sortformer: Seamless Integration of Speaker Diarization and ASR by Bridging Timestamps and Tokens.
CoRR, 2024
Proceedings of the IEEE Spoken Language Technology Workshop, 2024
Bestow: Efficient and Streamable Speech Language Model with The Best of Two Worlds in GPT and T5.
Proceedings of the IEEE Spoken Language Technology Workshop, 2024
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024
SALM: Speech-Augmented Language Model with in-Context Learning for Speech Recognition and Translation.
Proceedings of the IEEE International Conference on Acoustics, 2024
2023
The CHiME-7 Challenge: System Description and Performance of NeMo Team's DASR System.
CoRR, 2023
Property-Aware Multi-Speaker Data Simulation: A Probabilistic Modelling Technique for Synthetic Data Generation.
CoRR, 2023
CoRR, 2023
Leveraging Pretrained ASR Encoders for Effective and Efficient End-to-End Speech Intent Classification and Slot Filling.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the International Conference on Machine Learning, 2023
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023