Kunal Dhawan

Orcid: 0000-0002-5276-2475

According to our database, Kunal Dhawan authored at least 25 papers between 2018 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
Streaming Sortformer: Speaker Cache-Based Online Speaker Diarization with Arrival-Time Ordering.
CoRR, July, 2025

Speaker Targeting via Self-Speaker Adaptation for Multi-talker ASR.
CoRR, June, 2025

Word Level Timestamp Generation for Automatic Speech Recognition and Translation.
CoRR, May, 2025

Training and Inference Efficiency of Encoder-Decoder Speech Models.
CoRR, March, 2025

VoiceTextBlender: Augmenting Large Language Models with Speech Capabilities via Single-Stage Joint Speech-Text Supervised Fine-Tuning.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

META-CAT: Speaker-Informed Speech Embeddings via Meta Information Concatenation for Multi-talker ASR.
Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

NEST: Self-supervised Fast Conformer as All-purpose Seasoning to Speech Processing Tasks.
Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

2024
Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks.
CoRR, 2024

Sortformer: Seamless Integration of Speaker Diarization and ASR by Bridging Timestamps and Tokens.
CoRR, 2024

Large Language Model Based Generative Error Correction: A Challenge and Baselines For Speech Recognition, Speaker Tagging, and Emotion Recognition.
Proceedings of the IEEE Spoken Language Technology Workshop, 2024

Resource-Efficient Adaptation of Speech Foundation Models for Multi-Speaker ASR.
Proceedings of the IEEE Spoken Language Technology Workshop, 2024

Less is More: Accurate Speech Recognition & Translation without Web-Scale Data.
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Codec-ASR: Training Performant Automatic Speech Recognition Systems with Discrete Speech Representations.
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Discrete Audio Representation as an Alternative to Mel-Spectrograms for Speaker and Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

Enhancing Speaker Diarization with Large Language Models: A Contextual Beam Search Approach.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

2023
The CHiME-7 Challenge: System Description and Performance of NeMo Team's DASR System.
CoRR, 2023

Property-Aware Multi-Speaker Data Simulation: A Probabilistic Modelling Technique for Synthetic Data Generation.
CoRR, 2023

Towards training Bilingual and Code-Switched Speech Recognition models from Monolingual data sources.
CoRR, 2023

2021
Phonetic Word Embeddings.
CoRR, 2021

2020
Novel textual features for language modeling of intra-sentential code-switching data.
Comput. Speech Lang., 2020

Joint Language Identification of Code-Switching Speech using Attention-based E2E Network.
Proceedings of the International Conference on Signal Processing and Communications, 2020

Investigating Target Set Reduction for End-to-End Speech Recognition of Hindi-English Code-Switching Data.
Proceedings of the 2020 National Conference on Communications, 2020

2019
IITG-HingCoS corpus: A Hinglish code-switching database for automatic speech recognition.
Speech Commun., 2019

Towards Adapting NMF Dictionaries Using Total Variability Modeling for Noise-Robust Acoustic Features.
CoRR, 2019

2018
Hindi-English Code-Switching Speech Corpus.
CoRR, 2018

