Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

Word Level Timestamp Generation for Automatic Speech Recognition and Translation.

Ke Hu

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Sortformer: A Novel Approach for Permutation-Resolved Speaker Supervision in Speech-to-Text Systems.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

NEST: Self-supervised Fast Conformer as All-purpose Seasoning to Speech Processing Tasks.

Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

2024

NeKo: Toward Post Recognition Generative Correction Large Language Models with Task-Oriented Experts.

CoRR, 2024

Sortformer: Seamless Integration of Speaker Diarization and ASR by Bridging Timestamps and Tokens.

CoRR, 2024

Resource-Efficient Adaptation of Speech Foundation Models for Multi-Speaker ASR.

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

Bestow: Efficient and Streamable Speech Language Model with The Best of Two Worlds in GPT and T5.

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

Less is More: Accurate Speech Recognition & Translation without Web-Scale Data.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Discrete Audio Representation as an Alternative to Mel-Spectrograms for Speaker and Speech Recognition.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

SALM: Speech-Augmented Language Model with in-Context Learning for Speech Recognition and Translation.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

Multilingual Audio-Visual Speech Recognition with Hybrid CTC/RNN-T Fast Conformer.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

2023

The CHiME-7 Challenge: System Description and Performance of NeMo Team's DASR System.

CoRR, 2023

Conformer-Based Target-Speaker Automatic Speech Recognition For Single-Channel Audio.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Accidental Learners: Spoken Language Identification in Multilingual Self-Supervised Models.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Fast Conformer With Linearly Scalable Attention For Efficient Speech Recognition.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2021

Unsupervised and Semi-Supervised Few-Shot Acoustic Event Classification.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

2020

Few-Shot Acoustic Event Detection Via Meta Learning.

Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

Krishna C. Puvvada
