Kunal Dhawan
Orcid: 0000-0002-5276-2475
According to our database, Kunal Dhawan authored at least 25 papers between 2018 and 2025.
Links
Online presence:
- on orcid.org
On csauthors.net:
Bibliography
2025
Streaming Sortformer: Speaker Cache-Based Online Speaker Diarization with Arrival-Time Ordering.
CoRR, July, 2025
CoRR, May, 2025
CoRR, March, 2025
VoiceTextBlender: Augmenting Large Language Models with Speech Capabilities via Single-Stage Joint Speech-Text Supervised Fine-Tuning.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025
META-CAT: Speaker-Informed Speech Embeddings via Meta Information Concatenation for Multi-talker ASR.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025
NEST: Self-supervised Fast Conformer as All-purpose Seasoning to Speech Processing Tasks.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025
2024
Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks.
CoRR, 2024
Sortformer: Seamless Integration of Speaker Diarization and ASR by Bridging Timestamps and Tokens.
CoRR, 2024
Large Language Model Based Generative Error Correction: A Challenge and Baselines For Speech Recognition, Speaker Tagging, and Emotion Recognition.
Proceedings of the IEEE Spoken Language Technology Workshop, 2024
Proceedings of the IEEE Spoken Language Technology Workshop, 2024
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024
Codec-ASR: Training Performant Automatic Speech Recognition Systems with Discrete Speech Representations.
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024
Discrete Audio Representation as an Alternative to Mel-Spectrograms for Speaker and Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2024
Enhancing Speaker Diarization with Large Language Models: A Contextual Beam Search Approach.
Proceedings of the IEEE International Conference on Acoustics, 2024
2023
The CHiME-7 Challenge: System Description and Performance of NeMo Team's DASR System.
CoRR, 2023
Property-Aware Multi-Speaker Data Simulation: A Probabilistic Modelling Technique for Synthetic Data Generation.
CoRR, 2023
Towards training Bilingual and Code-Switched Speech Recognition models from Monolingual data sources.
CoRR, 2023
2021
2020
Novel textual features for language modeling of intra-sentential code-switching data.
Comput. Speech Lang., 2020
Joint Language Identification of Code-Switching Speech using Attention-based E2E Network.
Proceedings of the International Conference on Signal Processing and Communications, 2020
Investigating Target Set Reduction for End-to-End Speech Recognition of Hindi-English Code-Switching Data.
Proceedings of the 2020 National Conference on Communications, 2020
2019
IITG-HingCoS corpus: A Hinglish code-switching database for automatic speech recognition.
Speech Commun., 2019
Towards Adapting NMF Dictionaries Using Total Variability Modeling for Noise-Robust Acoustic Features.
CoRR, 2019
2018