Jagadeesh Balam

According to our database¹, Jagadeesh Balam authored at least 54 papers between 2006 and 2026.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2026

Chunk-wise Attention Transducers for Fast and Accurate Streaming Speech-to-Text.

,

Vladimir Bataev

,

Travis M. Bartley

,

Jagadeesh Balam

CoRR, February, 2026

2025

Canary-1B-v2 & Parakeet-TDT-0.6B-v3: Efficient and High-Performance Models for Multilingual ASR and AST.

,

Nithin Rao Koluguri

,

Nune Tadevosyan

,

,

Travis M. Bartley

,

,

Jagadeesh Balam

,

CoRR, September, 2025

Training and Inference Efficiency of Encoder-Decoder Speech Models.

,

,

,

Krishna C. Puvvada

,

,

Nithin Rao Koluguri

,

,

Vitaly Lavrukhin

,

Jagadeesh Balam

,

CoRR, March, 2025

VoiceTextBlender: Augmenting Large Language Models with Speech Capabilities via Single-Stage Joint Speech-Text Supervised Fine-Tuning.

,

Krishna C. Puvvada

,

,

,

,

,

,

Shinji Watanabe

,

Jagadeesh Balam

,

Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

Anticipating Future with Large Language Model for Simultaneous Machine Translation.

,

Oleksii Hrinchuk

,

,

Vitaly Lavrukhin

,

Jagadeesh Balam

,

,

Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

Speaker Targeting via Self-Speaker Adaptation for Multi-talker ASR.

,

,

Ivan Medennikov

,

,

,

,

Nithin Rao Koluguri

,

Jagadeesh Balam

,

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Streaming Sortformer: Speaker Cache-Based Online Speaker Diarization with Arrival-Time Ordering.

Ivan Medennikov

,

,

,

,

,

,

Jagadeesh Balam

,

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Granary: Speech Recognition and Translation Dataset in 25 European Languages.

Nithin Rao Koluguri

,

,

George Zelenfroynd

,

,

,

Sofia Kostandian

,

,

,

Jagadeesh Balam

,

Vitaly Lavrukhin

,

,

,

,

,

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Word Level Timestamp Generation for Automatic Speech Recognition and Translation.

,

Krishna C. Puvvada

,

Elena Rastorgueva

,

,

,

,

,

,

Jagadeesh Balam

,

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Efficient and Direct Duplex Modeling for Speech-to-Speech Language Model.

,

Ehsan Hosseini-Asl

,

,

Edresson Casanova

,

Subhankar Ghosh

,

,

,

,

Jagadeesh Balam

,

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

SPGISpeech 2.0: Transcribed multi-speaker financial audio for speaker-tagged transcription.

Raymond Grossman

,

,

,

,

,

Yulia Shchadilova

,

,

Jagadeesh Balam

,

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Sortformer: A Novel Approach for Permutation-Resolved Speaker Supervision in Speech-to-Text Systems.

,

Ivan Medennikov

,

,

,

,

Nithin Rao Koluguri

,

Krishna C. Puvvada

,

Jagadeesh Balam

,

Proceedings of the Forty-second International Conference on Machine Learning, 2025

EMMeTT: Efficient Multimodal Machine Translation Training.

,

,

,

,

Oleksii Hrinchuk

,

,

,

Jagadeesh Balam

,

Vitaly Lavrukhin

,

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

META-CAT: Speaker-Informed Speech Embeddings via Meta Information Concatenation for Multi-talker ASR.

,

,

,

,

,

Ivan Medennikov

,

,

Nithin Rao Koluguri

,

Jagadeesh Balam

,

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Developing Instruction-Following Speech Language Model Without Speech Instruction-Tuning Data.

,

,

,

Chao-Han Huck Yang

,

Jagadeesh Balam

,

,

Yu-Chiang Frank Wang

,

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

NEST: Self-supervised Fast Conformer as All-purpose Seasoning to Speech Processing Tasks.

,

,

,

Ivan Medennikov

,

Krishna C. Puvvada

,

Nithin Rao Koluguri

,

,

Jagadeesh Balam

,

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Chain-of-Thought Prompting for Speech Translation.

,

,

Chao-Han Huck Yang

,

,

Oleksii Hrinchuk

,

Vitaly Lavrukhin

,

Jagadeesh Balam

,

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Training and Inference Efficiency of Encoder-Decoder Speech Models.

,

,

,

Krishna C. Puvvada

,

,

Travis M. Bartley

,

Nithin Rao Koluguri

,

,

Vitaly Lavrukhin

,

Jagadeesh Balam

,

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2025

Open Full-duplex Voice Agent with Speech-to-Speech Language Model.

Edresson Casanova

,

,

,

,

Elena Rastorgueva

,

Seelan Lakshmi Narasimhan

,

,

Ehsan Hosseini-Asl

,

,

Valentin Mendelev

,

Subhankar Ghosh

,

,

,

,

Jagadeesh Balam

,

Vitaly Lavrukhin

,

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2025

Genetic Instruct: Scaling up Synthetic Generation of Coding Instructions for Large Language Models.

Somshubra Majumdar

,

,

,

Sean Narenthiran

,

Aleksander Ficek

,

Wasi Uddin Ahmad

,

,

Jagadeesh Balam

,

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 6: Industry Track), 2025

NeKo: Cross-Modality Post-Recognition Error Correction with Tasks-Guided Mixture-of-Experts Language Model.

,

,

,

,

,

,

Krishna C. Puvvada

,

,

,

,

Jagadeesh Balam

,

,

Yu-Chiang Frank Wang

,

Chao-Han Huck Yang

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 6: Industry Track), 2025

2024

NeKo: Toward Post Recognition Generative Correction Large Language Models with Task-Oriented Experts.

,

Chao-Han Huck Yang

,

,

,

,

,

Krishna C. Puvvada

,

,

,

,

Jagadeesh Balam

,

,

Yu-Chiang Frank Wang

CoRR, 2024

Sortformer: Seamless Integration of Speaker Diarization and ASR by Bridging Timestamps and Tokens.

,

Ivan Medennikov

,

,

,

,

Nithin Rao Koluguri

,

Krishna C. Puvvada

,

Jagadeesh Balam

,

CoRR, 2024

Genetic Instruct: Scaling up Synthetic Generation of Coding Instructions for Large Language Models.

Somshubra Majumdar

,

,

Sean Narenthiran

,

Aleksander Ficek

,

Jagadeesh Balam

,

CoRR, 2024

Large Language Model Based Generative Error Correction: A Challenge and Baselines For Speech Recognition, Speaker Tagging, and Emotion Recognition.

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

Resource-Efficient Adaptation of Speech Foundation Models for Multi-Speaker ASR.

,

,

,

Krishna C. Puvvada

,

Ivan Medennikov

,

Somshubra Majumdar

,

,

Jagadeesh Balam

,

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

Longer is (Not Necessarily) Stronger: Punctuated Long-Sequence Training for Enhanced Speech Recognition and Translation.

Nithin Rao Koluguri

,

Travis M. Bartley

,

,

Oleksii Hrinchuk

,

Jagadeesh Balam

,

,

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

Bestow: Efficient and Streamable Speech Language Model with The Best of Two Worlds in GPT and T5.

,

,

Oleksii Hrinchuk

,

Krishna C. Puvvada

,

Nithin Rao Koluguri

,

,

Jagadeesh Balam

,

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

Less is More: Accurate Speech Recognition & Translation without Web-Scale Data.

Krishna C. Puvvada

,

,

,

Oleksii Hrinchuk

,

Nithin Rao Koluguri

,

,

Somshubra Majumdar

,

Elena Rastorgueva

,

,

Vitaly Lavrukhin

,

Jagadeesh Balam

,

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Instruction Data Generation and Unsupervised Adaptation for Speech Language Models.

,

,

Somshubra Majumdar

,

,

Jagadeesh Balam

,

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Schrödinger Bridge for Generative Speech Enhancement.

,

,

Jagadeesh Balam

,

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Codec-ASR: Training Performant Automatic Speech Recognition Systems with Discrete Speech Representations.

,

Nithin Rao Koluguri

,

,

,

Jagadeesh Balam

,

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Discrete Audio Representation as an Alternative to Mel-Spectrograms for Speaker and Speech Recognition.

Krishna C. Puvvada

,

Nithin Rao Koluguri

,

,

Jagadeesh Balam

,

Proceedings of the IEEE International Conference on Acoustics, 2024

Enhancing Speaker Diarization with Large Language Models: A Contextual Beam Search Approach.

,

,

Nithin Rao Koluguri

,

Jagadeesh Balam

Proceedings of the IEEE International Conference on Acoustics, 2024

Stateful Conformer with Cache-Based Inference for Streaming Automatic Speech Recognition.

,

Somshubra Majumdar

,

,

Jagadeesh Balam

,

Proceedings of the IEEE International Conference on Acoustics, 2024

Investigating End-to-End ASR Architectures for Long Form Audio Transcription.

Nithin Rao Koluguri

,

,

Georgy Zelenfroind

,

Somshubra Majumdar

,

,

,

Jagadeesh Balam

,

Proceedings of the IEEE International Conference on Acoustics, 2024

SALM: Speech-Augmented Language Model with in-Context Learning for Speech Recognition and Translation.

,

,

Andrei Andrusenko

,

Oleksii Hrinchuk

,

Krishna C. Puvvada

,

,

Subhankar Ghosh

,

Jagadeesh Balam

,

Proceedings of the IEEE International Conference on Acoustics, 2024

Multilingual Audio-Visual Speech Recognition with Hybrid CTC/RNN-T Fast Conformer.

,

Krishna C. Puvvada

,

Jagadeesh Balam

,

,

Proceedings of the IEEE International Conference on Acoustics, 2024

2023

The CHiME-7 Challenge: System Description and Performance of NeMo Team's DASR System.

,

,

,

,

Krishna C. Puvvada

,

Nithin Rao Koluguri

,

,

Aleksandr Laptev

,

Jagadeesh Balam

,

CoRR, 2023

Property-Aware Multi-Speaker Data Simulation: A Probabilistic Modelling Technique for Synthetic Data Generation.

,

,

,

Nithin Rao Koluguri

,

,

,

Jagadeesh Balam

,

CoRR, 2023

Flexible Multichannel Speech Enhancement for Noise-Robust Frontend.

,

Jagadeesh Balam

,

Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2023

A Compact End-to-End Model with Local and Global Context for Spoken Language Identification.

,

Nithin Rao Koluguri

,

Jagadeesh Balam

,

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Leveraging Pretrained ASR Encoders for Effective and Efficient End-to-End Speech Intent Classification and Slot Filling.

,

Jagadeesh Balam

,

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Fast Conformer With Linearly Scalable Attention For Efficient Speech Recognition.

,

Nithin Rao Koluguri

,

,

Somshubra Majumdar

,

,

,

Oleksii Hrinchuk

,

Krishna C. Puvvada

,

,

Jagadeesh Balam

,

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022

AmberNet: A Compact End-to-End Model for Spoken Language Identification.

,

Nithin Rao Koluguri

,

Jagadeesh Balam

,

CoRR, 2022

NeMo Open Source Speaker Diarization System.

,

Nithin Rao Koluguri

,

,

Jagadeesh Balam

,

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Multi-scale Speaker Diarization with Dynamic Scale Weighting.

,

Nithin Rao Koluguri

,

Jagadeesh Balam

,

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

2021

CarneliNet: Neural Mixture Model for Automatic Speech Recognition.

Aleksei Kalinov

,

Somshubra Majumdar

,

Jagadeesh Balam

,

CoRR, 2021

SPGISpeech: 5,000 Hours of Transcribed Financial Audio for Fully Formatted End-to-End Speech Recognition.

Patrick K. O'Neill

,

Vitaly Lavrukhin

,

Somshubra Majumdar

,

,

,

Oleksii Kuchaiev

,

Jagadeesh Balam

,

Yuliya Dovzhenko

,

Keenan Freyberg

,

Michael D. Shulman

,

,

Shinji Watanabe

,

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Cross-Language Transfer Learning and Domain Adaptation for End-to-End Automatic Speech Recognition.

,

,

,

,

,

,

Patrick K. O'Neill

,

Jagadeesh Balam

,

,

,

,

,

Oleksii Kuchaiev

,

Vitaly Lavrukhin

,

Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021

2008

A Transcoding-Free Multiple Description Coder for Voice over Mobile Ad-Hoc Networks.

Jagadeesh Balam

,

Jerry D. Gibson

Proceedings of WCNC 2008, IEEE Wireless Communications & Networking Conference, March 31, 2008

2007

Multiple Descriptions and Path Diversity for Voice Communications Over Wireless Mesh Networks.

Jagadeesh Balam

,

Jerry D. Gibson

IEEE Trans. Multim., 2007

Two-Hop Two-Path Voice Communications Over a Mobile Ad-Hoc Network.

Jagadeesh Balam

,

Jerry D. Gibson

Proceedings of the Global Communications Conference, 2007

2006

Multiple descriptions and path diversity using the AMR-WB speech codec for voice communication over MANETs.

Jagadeesh Balam

,

Jerry D. Gibson

Proceedings of the International Conference on Wireless Communications and Mobile Computing, 2006
