Sriram Ganapathy

Dataset, January, 2026

A Mixture-of-Experts model for multimodal emotion recognition in conversations.

Smruthi Balaji

Comput. Speech Lang., 2026

2025

Audio-to-Audio Emotion Conversion With Pitch And Duration Style Transfer.

Avni Jain

CoRR, May, 2025

Gradient-Free Post-Hoc Explainability Using Distillation Aided Learnable Approach.

Amir H. Poorjam

Deepak Mittal

IEEE J. Sel. Top. Signal Process., January, 2025

ABHINAYA - A System for Speech Emotion Recognition In Naturalistic Conditions Challenge.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Benchmarking and Confidence Evaluation of LALMs For Temporal Reasoning.

Apoorva Kulkarni

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Spoken Language Understanding on Unseen Tasks With In-Context Learning.

Neeraj Agrawal

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Identifying and Mitigating Mismatched Language Code in Multilingual ASR.

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

LLM supervised Pre-training for Multimodal Emotion Recognition in Conversations.

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

FESTA: Functionally Equivalent Sampling for Trust Assessment of Multimodal LLMs.

Apoorva Kulkarni

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

ULTRAS - Unified Learning of Transformer Representations for Audio and Speech Signals.

P. E. Ameenudeen

Charumathi Narayanan

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2025

2024

Speech Dereverberation With Frequency Domain Autoregressive Modeling.

IEEE ACM Trans. Audio Speech Lang. Process., 2024

Representation Learning With Hidden Unit Clustering for Low Resource Speech Applications.

Varun Krishna

Tarun Sai

IEEE ACM Trans. Audio Speech Lang. Process., 2024

Summary of the DISPLACE challenge 2023-DIarization of SPeaker and LAnguage in Conversational Environments.

Speech Commun., 2024

STAB: Speech Tokenizer Assessment Benchmark.

Chulayuth Asawaroengchai

CoRR, 2024

Overlap-aware End-to-End Supervised Hierarchical Graph Clustering for Speaker Diarization.

CoRR, 2024

LLM Augmented LLMs: Expanding Capabilities through Composition.

CoRR, 2024

Improving Self-supervised Pre-training using Accent-Specific Codebooks.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

The Second DISPLACE Challenge: DIarization of SPeaker and LAnguage in Conversational Environments.

S. R. Mahadeva Prasanna

Deepu Vijayasenan

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

LLM Augmented LLMs: Expanding Capabilities through Composition.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Zero Shot Audio To Audio Emotion Transfer With Speaker Disentanglement.

Proceedings of the IEEE International Conference on Acoustics, 2024

Multimodal Modeling for Spoken Language Identification.

Proceedings of the IEEE International Conference on Acoustics, 2024

2023

Speech enhancement with frequency domain auto-regressive modeling.

CoRR, 2023

Multimodal Modeling For Spoken Language Identification.

CoRR, 2023

MASR: Metadata Aware Speech Representation.

CoRR, 2023

Coswara: A respiratory sounds and symptoms dataset for remote screening of SARS-CoV-2 infection.

CoRR, 2023

HCAM - Hierarchical Cross Attention Model for Multi-modal Emotion Recognition.

CoRR, 2023

DISPLACE Challenge: DIarization of SPeaker and LAnguage in Conversational Environments.

CoRR, 2023

Label Aware Speech Representation Learning For Language Identification.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Enhancing the EEG Speech Match Mismatch Tasks With Word Boundaries.

Akshara Soman

Vidhi Sinha

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

The DISPLACE Challenge 2023 - DIarization of SPeaker and LAnguage in Conversational Environments.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Supervised Hierarchical Clustering Using Graph Neural Networks for Speaker Diarization.

Amrit Kaul

Proceedings of the IEEE International Conference on Acoustics, 2023

Self-Influence Guided Data Reweighting for Language Model Pre-training.

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Accented Speech Recognition With Accent-specific Codebooks.

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

MASR: Multi-Label Aware Speech Representation.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

Pseudo-Label Based Supervised Contrastive Loss for Robust Speech Representations.

Varun Krishna

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022

Towards sound based testing of COVID-19 - Summary of the first Diagnostics of COVID-19 using Acoustics (DiCOVA) Challenge.

Comput. Speech Lang., 2022

PLDA inspired Siamese networks for speaker verification.

Prashant Krishnan V

Comput. Speech Lang., 2022

Dereverberation of autoregressive envelopes for far-field speech recognition.

Comput. Speech Lang., 2022

Interpretable Acoustic Representation Learning on Breathing and Speech Signals for COVID-19 Detection.

CoRR, 2022

Svadhyaya system for the Second Diagnosing COVID-19 using Acoustics Challenge 2021.

Deepak Mittal

Amir H. Poorjam

Zemin Yu

Maneesh Kumar Singh

CoRR, 2022

Transformer Networks for Non-Intrusive Speech Quality Prediction.

M. K. Jayesh

Mukesh Sharma

Praneeth Vonteddu

Mahaboob Ali Basha Shaik

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Acoustic Representation Learning on Breathing and Speech Signals for COVID-19 Detection.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Speaker conditioned acoustic modeling for multi-speaker conversational ASR.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Analyzing the impact of SARS-CoV-2 variants on respiratory sound signals.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Coswara: A website application enabling COVID-19 screening by analysing respiratory sound samples and health symptoms.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Semi-supervised Acoustic and Language Modeling for Hindi ASR.

Tarun Sai Bandarupalli

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Leveraging Symmetrical Convolutional Transformer Networks for Speech to Singing Voice Style Transfer.

Shrutina Agarwal

Naoya Takahashi

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

The Second DiCOVA Challenge: Dataset and Performance Analysis for Diagnosis of COVID-19 Using Acoustics.

Pravin Mote

Proceedings of the IEEE International Conference on Acoustics, 2022

End-To-End Speech Recognition with Joint Dereverberation of Sub-Band Autoregressive Envelopes.

Proceedings of the IEEE International Conference on Acoustics, 2022

Self Supervised Representation Learning with Deep Clustering for Acoustic Unit Discovery from Raw Speech.

Varun Krishna

Proceedings of the IEEE International Conference on Acoustics, 2022

Multimodal Transformer with Learnable Frontend and Self Attention for Emotion Recognition.

Proceedings of the IEEE International Conference on Acoustics, 2022

2021

Self-Supervised Representation Learning With Path Integral Clustering for Speaker Diarization.

IEEE ACM Trans. Audio Speech Lang. Process., 2021

The Second DiCOVA Challenge: Dataset and performance analysis for COVID-19 diagnosis using acoustics.

Pravin Mote

CoRR, 2021

Multi-modal Point-of-Care Diagnostics for COVID-19 Based On Acoustics and Symptoms.

CoRR, 2021

Deep Correlation Analysis for Audio-EEG Decoding.

Jaswanth Reddy Katthi

CoRR, 2021

A Multi-Head Relevance Weighting Framework for Learning Raw Waveform Audio Representations.

Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2021

LEAP Submission for the Third DIHARD Diarization Challenge.

Rajat Varma

Venkat Krishnamohan

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

The Third DIHARD Diarization Challenge.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

SRIB-LEAP Submission to Far-Field Multi-Channel Speech Enhancement Challenge for Video Conferencing.

R. G. Prithvi Raj

M. K. Jayesh

M. Ali Basha Shaik

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

DiCOVA Challenge: Dataset, Task, and Baseline System for COVID-19 Diagnosis Using Acoustics.

Viral Nanda

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Uncovering the Acoustic Cues of COVID-19 Infection.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Investigating Feature Selection and Explainability for COVID-19 Diagnostics from Cough Sounds.

Maneesh Singh

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Deep Multiway Canonical Correlation Analysis For Multi-Subject EEG Normalization.

Jaswanth Reddy Katthi

Proceedings of the IEEE International Conference on Acoustics, 2021

NISP: A Multi-lingual Multi-accent Dataset for Speaker Profiling.

Proceedings of the IEEE International Conference on Acoustics, 2021

End-to-End Lyrics Recognition with Voice to Singing Style Transfer.

Proceedings of the IEEE International Conference on Acoustics, 2021

Representation Learning for Speech Recognition Using Feedback Based Relevance Weighting.

Proceedings of the IEEE International Conference on Acoustics, 2021

Self-Supervised Metric Learning With Graph Clustering For Speaker Diarization.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020

Towards Relevance and Sequence Modeling in Language Recognition.

Bharat Padi

Anand Mohan

IEEE ACM Trans. Audio Speech Lang. Process., 2020

Interpretable Representation Learning for Speech and Audio Signals Based on Relevance Weighting.

IEEE ACM Trans. Audio Speech Lang. Process., 2020

Automatic speaker profiling from short duration speech data.

Shareef Babu Kalluri

Deepu Vijayasenan

Speech Commun., 2020

Supervised I-vector modeling for language and accent recognition.

Comput. Speech Lang., 2020

Deep Learning Based Dereverberation of Temporal Envelopes for Robust Speech Recognition.

CoRR, 2020

Third DIHARD Challenge Evaluation Plan.

CoRR, 2020

LEAP System for SRE19 Challenge - Improvements and Error Analysis.

CoRR, 2020

Pairwise Discriminative Neural PLDA for Speaker Verification.

CoRR, 2020

LEAP System for SRE 2019 CTS Challenge - Improvements and Error Analysis.

Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020

NPLDA: A Deep Neural PLDA Model for Speaker Verification.

Prashant Krishnan V

Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020

IITG-Indigo Submissions for NIST 2018 Speaker Recognition Evaluation and Post-Challenge Improvements.

Proceedings of the 2020 National Conference on Communications, 2020

Deep Self-Supervised Hierarchical Clustering for Speaker Diarization.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Coswara - A Database of Breathing, Cough, and Voice Sounds for COVID-19 Diagnosis.

Nirmala R.

Prasanta Kumar Ghosh

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Neural PLDA Modeling for End-to-End Speaker Verification.

Prashant Krishnan V

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Deep Learning Based Dereverberation of Temporal Envelopes for Robust Speech Recognition.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Audiovisual Correspondence Learning in Humans and Machines.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Context Dependent RNNLM for Automatic Transcription of Conversations.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Robust Raw Waveform Speech Recognition Using Relevance Weighted Representations.

Sudarsanam Parthasaarathy

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Improving Voice Separation by Incorporating End-To-End Speech Recognition.

Naoya Takahashi

Mayank Kumar Singh

Sakya Basak

Yuki Mitsufuji

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

On The Impact of Language Familiarity in Talker Change Detection.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

3-D Acoustic Modeling for Far-Field Multi-Channel Speech Recognition.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Unsupervised Neural Mask Estimator for Generalized Eigen-Value Beamforming Based ASR.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Deep Canonical Correlation Analysis For Decoding The Auditory Brain.

Jaswanth Reddy Katthi

Sandeep Kothinti

Malcolm Slaney

Proceedings of the 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2020

2019

Modulation Filter Learning Using Deep Variational Networks for Robust Speech Recognition.

IEEE J. Sel. Top. Signal Process., 2019

3-D Feature and Acoustic Modeling for Far-Field Speech Recognition.

CoRR, 2019

LEAP Diarization System for the Second DIHARD Challenge.

Harsha Vardhan

Ahilan Kanagasundaram

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

The Second DIHARD Diarization Challenge: Dataset, Task, and Baselines.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Attention Based Hybrid i-Vector BLSTM Model for Language Recognition.

Bharat Padi

Anand Mohan

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Active Learning Methods for Low Resource End-to-End Speech Recognition.

Karan Malhotra

Shubham Bansal

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

A Study of x-Vector Based Speaker Recognition on Short Utterances.

Ahilan Kanagasundaram

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Unsupervised Raw Waveform Representation Learning for ASR.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Analyzing Human Reaction Time for Talker Change Detection.

Proceedings of the IEEE International Conference on Acoustics, 2019

The LEAP Speaker Recognition System for NIST SRE 2018 Challenge.

Proceedings of the IEEE International Conference on Acoustics, 2019

End-to-end Language Recognition Using Attention Based Hierarchical Gated Recurrent Unit Models.

Bharat Padi

Anand Mohan

Proceedings of the IEEE International Conference on Acoustics, 2019

A Deep Neural Network Based End to End Model for Joint Height and Age Estimation from Short Duration Speech.

Shareef Babu Kalluri

Deepu Vijayasenan

Proceedings of the IEEE International Conference on Acoustics, 2019

Deep Variational Filter Learning Models for Speech Recognition.

Proceedings of the IEEE International Conference on Acoustics, 2019

Level-wise Subject adaptation to improve classification of motor and mental EEG tasks.

Proceedings of the 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2019

Second Language Transfer Learning in Humans and Machines Using Image Supervision.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Speaker and Language Aware Training for End-to-End ASR.

Shubham Bansal

Karan Malhotra

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018

The LEAP Language Recognition System for LRE 2017 Challenge - Improvements and Error Analysis.

Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018

Supervised I-vector Modeling - Theory and Applications.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

On Convolutional LSTM Modeling for Joint Wake-Word Detection and Text Dependent Speaker Verification.

Rajath Kumar

Vaishnavi Yeruva

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Far-Field Speech Recognition Using Multivariate Autoregressive Models.

Madhumita Harish

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Speaker and Language Recognition - From Laboratory Technologies to the Wild.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Talker Diarization in the Wild: the Case of Child-centered Daylong Audio-recordings.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Comparison of Unsupervised Modulation Filter Learning Methods for ASR.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Leveraging LSTM Models for Overlap Detection in Multi-Party Meetings.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Enhancement and Analysis of Conversational Speech: JSALT 2017.

Mahesh Krishnamoorthy

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

3-D CNN Models for Far-Field Multi-Channel Speech Recognition.

Vijayaditya Peddinti

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017

Multivariate Autoregressive Spectrogram Modeling for Noisy Speech Recognition.

IEEE Signal Process. Lett., 2017

Increasing the robustness of CNN acoustic models using autoregressive moving average spectrogram features and channel dropout.

Pattern Recognit. Lett., 2017

IITG-Indigo System for NIST 2016 SRE Challenge.

S. R. Mahadeva Prasanna

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Speech Representation Learning Using Unsupervised Data-Driven Modulation Filtering for Robust ASR.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Factor analysis methods for joint speaker verification and spoof detection.

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Leveraging native language speech for accent identification using deep Siamese networks.

Aditya Siddhant

Preethi Jyothi

Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

Unsupervised HMM posteriograms for language independent acoustic modeling in zero resource conditions.

Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

Deep learning methods for unsupervised acoustic modeling - LEAP submission to ZeroSpeech challenge 2017.

Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

2016

The IBM 2016 Speaker Recognition System.

Proceedings of the Odyssey 2016: The Speaker and Language Recognition Workshop, 2016

The IBM Speaker Recognition System: Recent Advances and Error Analysis.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

An Investigation on the Use of i-Vectors for Robust ASR.

Dimitrios Dimitriadis

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Speaker age estimation on conversational telephone speech using senone posterior based i-vectors.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015

Investigating factor analysis features for deep neural networks in noisy speech recognition.

Dimitrios Dimitriadis

Steven J. Rennie

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Nearest neighbor discriminant analysis for language recognition.

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Robust speech processing using ARMA spectrogram models.

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

2014

Robust Feature Extraction Using Modulation Filtering of Autoregressive Models.

IEEE ACM Trans. Audio Speech Lang. Process., 2014

Robust language identification using convolutional neural network features.

Maarten Van Segbroeck

Shrikanth S. Narayanan

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Analyzing convolutional neural networks for speech activity detection in mismatched acoustic conditions.

Proceedings of the IEEE International Conference on Acoustics, 2014

Shift-invariant features for speech activity detection in adverse radio-frequency channel conditions.

Proceedings of the IEEE International Conference on Acoustics, 2014

2013

Enhancing Frequency Shifted Speech Signals in Single Side-Band Communication.

IEEE Signal Process. Lett., 2013

The IBM speech activity detection system for the DARPA RATS program.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Robust speaker recognition using spectro-temporal autoregressive models.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

TRAP language identification system for RATS phase II evaluation.

Shrikanth S. Narayanan

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Unsupervised channel adaptation for language identification using co-training.

Proceedings of the IEEE International Conference on Acoustics, 2013

2012

Noisy channel adaptation in language identification.

Proceedings of the 2012 IEEE Spoken Language Technology Workshop (SLT), 2012

Adaptation transforms of auto-associative neural networks as features for speaker verification.

Proceedings of the Odyssey 2012: The Speaker and Language Recognition Workshop, 2012

Feature extraction using 2-d autoregressive models for speaker recognition.

Proceedings of the Odyssey 2012: The Speaker and Language Recognition Workshop, 2012

Data-driven Posterior Features for Low Resource Speech Recognition Applications.

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Analysis of Temporal Resolution in Frequency Domain Linear Prediction.

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Multilingual MLP features for low-resource LVCSR systems.

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

The UMD-JHU 2011 speaker recognition system.

Daniel Garcia-Romero

Xinhui Zhou

Dmitry N. Zotkin

Balaji Vasan Srinivasan

Yuancheng Luo

Garimella S. V. S. Sivaram

Sridhar Krishna Nemala

Majid Mirbagheri

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Comparison of Different Approaches for Speech Recognition in Hands-free Mode.

Hans-Günter Hirsch

Proceedings of the 10th ITG Conference on Speech Communication, 2012

2011

Multi-layer perceptron based speech activity detection for speaker verification.

Padmanabhan Rajan

Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2011

Modulation Spectrum Analysis for Recognition of Reverberant Speech.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Feature normalization for speaker verification in room reverberation.

Proceedings of the IEEE International Conference on Acoustics, 2011

2010

Autoregressive Models of Amplitude Modulations in Audio Compression.

Petr Motlícek

IEEE Trans. Speech Audio Process., 2010

Wide-Band Audio Coding Based on Frequency-Domain Linear Prediction.

EURASIP J. Audio Speech Music. Process., 2010

A phoneme recognition framework based on auditory spectro-temporal receptive fields.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Cross-lingual and multi-stream posterior features for low resource LVCSR systems.

Garimella S. V. S. Sivaram

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Sparse auto-associative neural networks: theory and application to speech recognition.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Comparison of modulation features for phoneme recognition.

Proceedings of the IEEE International Conference on Acoustics, 2010

Robust spectro-temporal features based on autoregressive models of Hilbert envelopes.

Proceedings of the IEEE International Conference on Acoustics, 2010

2009

Applications of signal analysis using autoregressive models for amplitude modulation.

Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2009

Error Resilient Speech Coding Using Sub-band Hilbert Envelopes.

Petr Motlícek

Proceedings of the Text, Speech and Dialogue, 12th International Conference, 2009

Tandem representations of spectral envelope and modulation frequency features for ASR.

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Arithmetic coding of sub-band residuals in FDLP speech/audio codec.

Petr Motlícek

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Static and dynamic modulation spectrum for speech recognition.

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Phoneme recognition using spectral envelope and modulation frequency features.

Proceedings of the IEEE International Conference on Acoustics, 2009

Temporal envelope subtraction for robust speech recognition using modulation spectrum.

Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009

2008

Recognition of Reverberant Speech Using Frequency Domain Linear Prediction.

IEEE Signal Process. Lett., 2008

Perceptually Motivated Sub-band Decomposition for FDLP Audio Coding.

Proceedings of the Text, Speech and Dialogue, 11th International Conference, 2008

Hilbert Envelope Based Features for Far-Field Speech Recognition.

Proceedings of the Machine Learning for Multimodal Interaction, 5th International Workshop, 2008

Hilbert envelope based spectro-temporal features for phoneme recognition in telephone speech.

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Front-end for far-field speech recognition based on frequency domain linear prediction.

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Spectral noise shaping: improvements in speech/audio codec based on linear prediction in spectral domain.

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Temporal masking for bit-rate reduction in audio codec based on Frequency Domain Linear Prediction.

Proceedings of the IEEE International Conference on Acoustics, 2008

Spectro-temporal features for Automatic Speech Recognition using Linear Prediction in spectral domain.

Proceedings of the 2008 16th European Signal Processing Conference, 2008

2007

Non-uniform Speech/Audio Coding Exploiting Predictability of Temporal Evolution of Spectral Envelopes.

Proceedings of the Text, Speech and Dialogue, 10th International Conference, 2007

Frequency Domain Linear Prediction for QMF Sub-bands and Applications to Audio Coding.

Proceedings of the Machine Learning for Multimodal Interaction, 2007

2003

Agreement strategies for cooperative control of uninhabited autonomous vehicles.
