Anil Kumar Vuppala

Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, 2025

TeluguST-46: A Benchmark Corpus and Comprehensive Evaluation for Telugu-English Speech Translation.

Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, 2025

Enhancing Stutter Detection using Long-Term Average Spectrum Values.

Priyanka Kommagouni

Sridhar Vanga

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

A Multi-modal Approach to Dysarthria Detection and Severity Assessment Using Speech and Text Information.

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Typical vs. Atypical Disfluency Classification: Introducing the IIITH-TISA Corpus and Temporal Context-Based Feature Representations.

Priyanka Kommagouni

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Towards Unified Processing of Perso-Arabic Scripts for ASR.

Srihari Bandarupalli

Bhavana Akkiraju

Sri Charan Devarakonda

Harinie Sivaramasethu

Proceedings of the 31st International Conference on Computational Linguistics, 2025

2024

Epoch extraction in real-world scenario.

Int. J. Speech Technol., September, 2024

Stockwell-Transform based feature representation for detection and assessment of voice disorders.

Int. J. Speech Technol., March, 2024

Attempt Towards Stress Transfer in Speech-to-Speech Machine Translation.

CoRR, 2024

Open Vocabulary Keyword Spotting Through Transfer Learning from Speech Synthesis.

Kesavaraj V

Proceedings of the International Conference on Signal Processing and Communications, 2024

Enhancing Stuttering Detection: A Syllable-Level Stutter Dataset.

Hina Fathima

Proceedings of the International Conference on Signal Processing and Communications, 2024

Attempt Towards Stress Transfer in Speech-to-Speech Machine Translation.

Anindita Mondal

Proceedings of the International Conference on Signal Processing and Communications, 2024

IIIT-Speech Twins 1.0: An English-Hindi Parallel Speech Corpora for Speech-to-Speech Machine Translation and Automatic Dubbing.

Anindita Mondal

Chiranjeevi Yarra

Proceedings of the 27th Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques, 2024

Custom wake word detection.

Kesavaraj V

Charan Devarkonda

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Stress transfer in speech-to-speech machine translation.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

End-to-End User-Defined Keyword Spotting Using Shifted Delta Coefficients.

Kesavaraj V

Anuprabha M

Proceedings of the Pattern Recognition - 27th International Conference, 2024

2023

IIITH-CSTD Corpus: Crowdsourced Strategies for the Collection of a Large-scale Telugu Speech Corpus.

Vamshi Raghu Simha Narasinga

ACM Trans. Asian Low Resour. Lang. Inf. Process., 2023

Enhancing Stutter Detection in Speech Using Zero Time Windowing Cepstral Coefficients and Phase Information.

Proceedings of the Speech and Computer - 25th International Conference, 2023

Enhancing Language Identification in Indian Context Through Exploiting Learned Features with Wav2Vec2.0.

Shivang Gupta

Vamshi Raghu Simha Narasinga

Ravi Kumar

Proceedings of the Speech and Computer - 25th International Conference, 2023

Hardware Accelerator for Transformer based End-to-End Automatic Speech Recognition System.

Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023

Stuttering Detection Application.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

An Investigation of Indian Native Language Phonemic Influences on L2 English Pronunciations.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

2022

Novel feature representation using single frequency filtering and nonlinear energy operator for speech emotion recognition.

Digit. Signal Process., 2022

Study of Indian English Pronunciation Variabilities relative to Received Pronunciation.

CoRR, 2022

Decoding self-automated and motivated finger movements using novel single-frequency filtering method - An EEG study.

Biomed. Signal Process. Control., 2022

Exploring High Spectro-Temporal Resolution for Alzheimer's Dementia Detection.

Proceedings of the IEEE International Conference on Signal Processing and Communications, 2022

How do Phonological Properties Affect Bilingual Automatic Speech Recognition?

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Exploring the Effect of Dialect Mismatched Language Models in Telugu Automatic Speech Recognition.

Aditya Yadavalli

Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Student Research Workshop, 2022

Multi-Task End-to-End Model for Telugu Dialect and Speech Recognition.

Aditya Yadavalli

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Investigation of Subword-Based Bilingual Automatic Speech Recognition for Indian Languages.

Proceedings of the 2022 Fourteenth International Conference on Contemporary Computing, 2022

Towards improving Disfluency Detection from Speech using Shifted Delta Cepstral Coefficients.

Proceedings of the 2022 Fourteenth International Conference on Contemporary Computing, 2022

Shifted Delta Cepstral Coefficients with RNN to Improve the Detection of Parkinson's Disease from the Speech.

Anshul Lahoti

Juan Rafael Orozco-Arroyave

Proceedings of the 2022 Fourteenth International Conference on Contemporary Computing, 2022

Implementation of Zero-Phase Zero Frequency Resonator Algorithm on FPGA.

Proceedings of the 2022 Fourteenth International Conference on Contemporary Computing, 2022

2021

Detection of Fricative Landmarks Using Spectral Weighting: A Temporal Approach.

Circuits Syst. Signal Process., 2021

Toward Improving the Performance of Epoch Extraction from Telephonic Speech.

Mohammad Hashim Javid

Circuits Syst. Signal Process., 2021

Reed: An Approach Towards Quickly Bootstrapping Multilingual Acoustic Models.

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

An Investigation of Hybrid architectures for Low Resource Multilingual Speech Recognition system in Indian context.

Ganesh Mirishkar

Aditya Yadavalli

Proceedings of the 18th International Conference on Natural Language Processing (ICON 2021), National Institute of Technology Silchar, Silchar, India, December 16, 2021

IE-CPS Lexicon: An Automatic Speech Recognition Oriented Indian-English Pronunciation Dictionary.

Proceedings of the 18th International Conference on Natural Language Processing (ICON 2021), National Institute of Technology Silchar, Silchar, India, December 16, 2021

Comparative Study of Different Epoch Extraction Methods for Speech Associated with Voice Disorders.

Proceedings of the IEEE International Conference on Acoustics, 2021

Acoustic Features, Bert Model and their complementary Nature for Alzheimer's Dementia Detection.

Proceedings of the IC3 2021: Thirteenth International Conference on Contemporary Computing, Noida, India, August 5, 2021

Outcomes of Speech to Speech Translation for Broadcast Speeches and Crowd Source Based Speech Data Collection Pilot Projects.

Prakash Yalla

Proceedings of the Big Data Analytics - 9th International Conference, 2021

Detecting Multiple Disfluencies from Speech using Pre-linguistic Automatic Syllabification with Acoustic and Prosody Features.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

CSTD-Telugu Corpus: Crowd-Sourced Approach for Large-Scale Speech data collection.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

Comparative Study of Filter Banks to Improve the Performance of Voice Disorder Assessment Systems using LTAS Features.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

2020

Duration of the rhotic approximant /ɹ/ in spastic dysarthria of different severity levels.

Speech Commun., 2020

Analytic phase features for dysarthric speech detection and intelligibility assessment.

Speech Commun., 2020

Towards Emotion Independent Language Identification System.

Priyam Jain

Proceedings of the International Conference on Signal Processing and Communications, 2020

Study on the Effect of Emotional Speech on Language Identification.

Priyam Jain

Proceedings of the 2020 National Conference on Communications, 2020

Towards Automatic Assessment of Voice Disorders: A Clinical Approach.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Single Frequency Filter Bank Based Long-Term Average Spectra for Hypernasality Detection and Assessment in Cleft Lip and Palate Speech.

Mohammad Hashim Javid

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019

Stable Implementation of Zero Frequency Filtering of Speech Signals for Efficient Epoch Extraction.

IEEE Signal Process. Lett., 2019

Application of Emotion Recognition and Modification for Emotional Telugu Speech Recognition.

Mob. Networks Appl., 2019

Replay spoofing countermeasures using high spectro-temporal resolution features.

Int. J. Speech Technol., 2019

Towards Feature-space Emotional Speech Adaptation for TDNN based Telugu ASR systems.

Proceedings of the 2019 Workshop on Speech, Music and Mind, 2019

Sound Privacy: A Conversational Speech Corpus for Quantifying the Experience of Privacy.

Pablo Pérez Zarazaga

Sneha Das

Tom Bäckström

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

IIIT-H Spoofing Countermeasures for Automatic Speaker Verification Spoofing and Countermeasures Challenge 2019.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Perceptually Enhanced Single Frequency Filtering for Dysarthric Speech Detection and Intelligibility Assessment.

Proceedings of the IEEE International Conference on Acoustics, 2019

Multi-Head Self-Attention Networks for Language Identification.

Proceedings of the 2019 Twelfth International Conference on Contemporary Computing, 2019

Attention based Residual-Time Delay Neural Network for Indian Language Identification.

Tirusha Mandava

Proceedings of the 2019 Twelfth International Conference on Contemporary Computing, 2019

An Investigation of LSTM-CTC based Joint Acoustic Model for Indian Language Identification.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018

Improved vowel region detection from a continuous speech using post processing of vowel onset points and vowel end-points.

Multim. Tools Appl., 2018

Prosody modification for speech recognition in emotionally mismatched conditions.

Int. J. Speech Technol., 2018

Combining evidences from excitation source and vocal tract system features for Indian language identification using deep neural networks.

Mounika Kamsali Veera

Int. J. Speech Technol., 2018

Application of non-negative frequency-weighted energy operator for vowel region detection.

Int. J. Speech Technol., 2018

Curriculum learning based approach for noise robust language identification using DNN with attention.

Expert Syst. Appl., 2018

Automatic Detection of Retroflex Approximants in a Continuous Tamil Speech.

Circuits Syst. Signal Process., 2018

Emotional Speech Classifier Systems: For Sensitive Assistance to support Disabled Individuals.

Priyam Jain

Proceedings of the 2018 Workshop on Speech, Music and Mind, 2018

Improved Language Identification Using Stacked SDC Features and Residual Neural Network.

Proceedings of the 6th Intl. Workshop on Spoken Language Technologies for Under-Resourced Languages, 2018

IIITH-ILSC Speech Database for Indian Language Identification.

Proceedings of the 6th Intl. Workshop on Spoken Language Technologies for Under-Resourced Languages, 2018

Automatic Detection of Palatalized Consonants in Kashmiri.

Proceedings of the 6th Intl. Workshop on Spoken Language Technologies for Under-Resourced Languages, 2018

Incorporating Speaker Normalizing Capabilities to an End-to-End Speech Recognition System.

Proceedings of the 6th Intl. Workshop on Spoken Language Technologies for Under-Resourced Languages, 2018

An Exploration towards Joint Acoustic Modeling for Indian Languages: IIIT-H Submission for Low Resource Speech Recognition Challenge for Indian Languages, INTERSPEECH 2018.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

2017

Investigative study of various activation functions for speech recognition.

Proceedings of the Twenty-third National Conference on Communications, 2017

DNN-HMM Acoustic Modeling for Large Vocabulary Telugu Speech Recognition.

Proceedings of the Mining Intelligence and Knowledge Exploration, 2017

Detection of Replay Attacks Using Single Frequency Filtering Cepstral Coefficients.

Sudarsana Reddy Kadiri

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

SFF Anti-Spoofer: IIIT-H Submission for Automatic Speaker Verification Spoofing and Countermeasures Challenge 2017.

Sudarsana Reddy Kadiri

Brij Mohan Lal Srivastava

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Significance of neural phonotactic models for large-scale spoken language identification.

Manish Shrivastava

Proceedings of the 2017 International Joint Conference on Neural Networks, 2017

Sentiment analysis using relative prosody features.

Harika Abburi

Manish Shrivastava

Proceedings of the Tenth International Conference on Contemporary Computing, 2017

Residual neural networks for speech recognition.

Proceedings of the 25th European Signal Processing Conference, 2017

Importance of non-uniform prosody modification for speech recognition in emotion conditions.

Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

2016

Vowel-Based Non-uniform Prosody Modification for Emotion Conversion.

Sudarsana Reddy Kadiri

Circuits Syst. Signal Process., 2016

Changes in shout features in automatically detected vowel regions.

Vinay Kumar Mittal

Proceedings of the 2016 International Conference on Signal Processing and Communications (SPCOM), 2016

A Study on Vowel Region Detection from a Continuous Speech.

Proceedings of the Mining Intelligence and Knowledge Exploration, 2016

A Study on Text-Independent Speaker Recognition Systems in Emotional Conditions Using Different Pattern Recognition Models.

Rajendra Prasath

Proceedings of the Mining Intelligence and Knowledge Exploration, 2016

Significance of automatic detection of vowel regions for automatic shout detection in continuous speech.

Vinay Kumar Mittal

Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

An Investigation of Deep Neural Network Architectures for Language Recognition in Indian Languages.

Mounika K. V.

Lakshmi H. R.

Brij Mohan Lal Srivastava

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

2015

A language model based approach towards large scale and lightweight language identification systems.

Manish Shrivastava

CoRR, 2015

Significance of Emotionally Significant Regions of Speech for Emotive to Neutral Conversion.

Proceedings of the Mining Intelligence and Knowledge Exploration, 2015

Improved Language Identification in Presence of Speech Coding.

Jiteesh Varma Bhupathiraju

Proceedings of the Mining Intelligence and Knowledge Exploration, 2015

2014

Speech Processing in Mobile Environments.

Springer Briefs in Electrical and Computer Engineering, Springer, ISBN: 978-3-319-03116-3, 2014

Automatic detection of breathy voiced vowels in Gujarati speech.

Peri Bhaskararao

Int. J. Speech Technol., 2014

Application of Zero-Frequency Filtering for Vowel Onset Point Detection.

Proceedings of the Mining Intelligence and Knowledge Exploration, 2014

2013

Non-uniform time scale modification using instants of significant excitation and vowel onset points.

Speech Commun., 2013

Vowel onset point detection for noisy speech using spectral energy at formant frequencies.

Int. J. Speech Technol., 2013

Neutral Speech to Anger Speech Conversion Using Prosody Modification.

Krothapalli Sreenivasa Rao

J. Limmayya

G. Raghavendra

Proceedings of the Mining Intelligence and Knowledge Exploration, 2013

2012

Vowel Onset Point Detection for Low Bit Rate Coded Speech.

IEEE Trans. Speech Audio Process., 2012

Neural network based feature transformation for emotion independent speaker identification.

Jaynath Yadav

Sourjya Sarkar

Shashidhar G. Koolagudi

Int. J. Speech Technol., 2012

Spotting and Recognition of Consonant-Vowel Units from Continuous Speech Using Accurate Detection of Vowel Onset Points.

Saswat Chakrabarti

Circuits Syst. Signal Process., 2012

2011

Recognition of consonant-vowel (CV) units under background noise using combined temporal and spectral preprocessing.

Int. J. Speech Technol., 2011

Effect of Noise on Vowel Onset Point Detection.

Proceedings of the Contemporary Computing - 4th International Conference, 2011

Effect of Noise on Recognition of Consonant-Vowel (CV) Units.

Saswat Chakrabarti

Proceedings of the Contemporary Computing - 4th International Conference, 2011

2010

Effect of Speech Coding on Recognition of Consonant-Vowel (CV) Units.

Saswat Chakrabarti