We stand with Ukraine

Kunal Dhawan

Orcid: 0000-0002-5276-2475

According to our database, Kunal Dhawan authored at least 27 papers between 2018 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2025

Training and Inference Efficiency of Encoder-Decoder Speech Models.

[…], Krishna C. Puvvada, […], Nithin Rao Koluguri, […], Vitaly Lavrukhin, Jagadeesh Balam, […]

CoRR, March, 2025

VoiceTextBlender: Augmenting Large Language Models with Speech Capabilities via Single-Stage Joint Speech-Text Supervised Fine-Tuning.

[…], Krishna C. Puvvada, […], Shinji Watanabe, Jagadeesh Balam, […]

Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

Speaker Targeting via Self-Speaker Adaptation for Multi-talker ASR.

[…], Ivan Medennikov, […], Nithin Rao Koluguri, Jagadeesh Balam, […]

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Streaming Sortformer: Speaker Cache-Based Online Speaker Diarization with Arrival-Time Ordering.

Ivan Medennikov, […], Jagadeesh Balam, […]

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Word Level Timestamp Generation for Automatic Speech Recognition and Translation.

[…], Krishna C. Puvvada, Elena Rastorgueva, […], Jagadeesh Balam, […]

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

SPGISpeech 2.0: Transcribed multi-speaker financial audio for speaker-tagged transcription.

Raymond Grossman, […], Yulia Shchadilova, […], Jagadeesh Balam, […]

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Sortformer: A Novel Approach for Permutation-Resolved Speaker Supervision in Speech-to-Text Systems.

[…], Ivan Medennikov, […], Nithin Rao Koluguri, Krishna C. Puvvada, Jagadeesh Balam, […]

Proceedings of the Forty-second International Conference on Machine Learning, 2025

META-CAT: Speaker-Informed Speech Embeddings via Meta Information Concatenation for Multi-talker ASR.

[…], Ivan Medennikov, […], Nithin Rao Koluguri, Jagadeesh Balam, […]

Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

NEST: Self-supervised Fast Conformer as All-purpose Seasoning to Speech Processing Tasks.

[…], Ivan Medennikov, Krishna C. Puvvada, Nithin Rao Koluguri, […], Jagadeesh Balam, […]

Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

2024

Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks.

[…], Wei-Cheng Tseng, […], Fabian Ritter Gutierrez, […], Chung-Ming Chien, […], Cheng-Hsiu Hsieh, […], Heitor R. Guimarães, […], Kuan-Yu Fang Chiang, […], Shao-Syuan Huang, […], Shih-Yun Shan Kuan, […], Chao-Han Huck Yang, […], Shao-Xiang Yuan, […], Shinji Watanabe, […]

CoRR, 2024

Sortformer: Seamless Integration of Speaker Diarization and ASR by Bridging Timestamps and Tokens.

[…], Ivan Medennikov, […], Nithin Rao Koluguri, Krishna C. Puvvada, Jagadeesh Balam, […]

CoRR, 2024

Large Language Model Based Generative Error Correction: A Challenge and Baselines For Speech Recognition, Speaker Tagging, and Emotion Recognition.

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

Resource-Efficient Adaptation of Speech Foundation Models for Multi-Speaker ASR.

[…], Krishna C. Puvvada, Ivan Medennikov, Somshubra Majumdar, […], Jagadeesh Balam, […]

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

Less is More: Accurate Speech Recognition & Translation without Web-Scale Data.

Krishna C. Puvvada, […], Oleksii Hrinchuk, Nithin Rao Koluguri, […], Somshubra Majumdar, Elena Rastorgueva, […], Vitaly Lavrukhin, Jagadeesh Balam, […]

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Codec-ASR: Training Performant Automatic Speech Recognition Systems with Discrete Speech Representations.

[…], Nithin Rao Koluguri, […], Jagadeesh Balam, […]

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Discrete Audio Representation as an Alternative to Mel-Spectrograms for Speaker and Speech Recognition.

Krishna C. Puvvada, Nithin Rao Koluguri, […], Jagadeesh Balam, […]

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

Enhancing Speaker Diarization with Large Language Models: A Contextual Beam Search Approach.

[…], Nithin Rao Koluguri, Jagadeesh Balam

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

2023

The CHiME-7 Challenge: System Description and Performance of NeMo Team's DASR System.

[…], Krishna C. Puvvada, Nithin Rao Koluguri, […], Aleksandr Laptev, Jagadeesh Balam, […]

CoRR, 2023

Property-Aware Multi-Speaker Data Simulation: A Probabilistic Modelling Technique for Synthetic Data Generation.

[…], Nithin Rao Koluguri, […], Jagadeesh Balam, […]

CoRR, 2023

Towards training Bilingual and Code-Switched Speech Recognition models from Monolingual data sources.

[…]

CoRR, 2023

2021

Phonetic Word Embeddings.

[…], Balakrishna Pailla

CoRR, 2021

2020

Novel textual features for language modeling of intra-sentential code-switching data.

[…]

Comput. Speech Lang., 2020

Joint Language Identification of Code-Switching Speech using Attention-based E2E Network.

[…], Kumar Priyadarshi, […]

Proceedings of the International Conference on Signal Processing and Communications, 2020

Investigating Target Set Reduction for End-to-End Speech Recognition of Hindi-English Code-Switching Data.

[…], Kumar Priyadarshi, […]

Proceedings of the 2020 National Conference on Communications, 2020

2019

IITG-HingCoS corpus: A Hinglish code-switching database for automatic speech recognition.

[…]

Speech Commun., 2019

Towards Adapting NMF Dictionaries Using Total Variability Modeling for Noise-Robust Acoustic Features.

[…], Shrikanth S. Narayanan

CoRR, 2019

2018

Hindi-English Code-Switching Speech Corpus.

[…]

CoRR, 2018
