Dhananjaya Gowda

Affiliations:
  • Aalto University, Espoo, Finland
  • IIIT Hyderabad, India


According to our database, Dhananjaya Gowda authored at least 54 papers between 2004 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Data-driven grapheme-to-phoneme representations for a lexicon-free text-to-speech.
CoRR, 2024

2023
Refining a deep learning-based formant tracker using linear prediction methods.
Comput. Speech Lang., June, 2023

On the compression of shallow non-causal ASR models using knowledge distillation and tied-and-reduced decoder for low-latency on-device speech recognition.
CoRR, 2023

Mitigating the Exposure Bias in Sentence-Level Grapheme-to-Phoneme (G2P) Transduction.
CoRR, 2023

Self-Supervised Accent Learning for Under-Resourced Accents Using Native Language Data.
Proceedings of the IEEE International Conference on Acoustics, 2023

A Transformer-Based E2E SLU Model for Improved Semantic Parsing.
Proceedings of the IEEE International Conference on Acoustics, 2023

2022
Multi-stage Progressive Compression of Conformer Transducer for On-device Speech Recognition.
Proceedings of the Interspeech 2022, 2022

Prototypical speaker-interference loss for target voice separation using non-parallel audio samples.
Proceedings of the Interspeech 2022, 2022

2021
Automatic Assessment for Severe Self-Reported Depressive Symptoms Using Speech Cues.
IEEE Trans. Cogn. Dev. Syst., 2021

Formant Tracking Using Quasi-Closed Phase Forward-Backward Linear Prediction Analysis and Deep Neural Networks.
IEEE Access, 2021

Streaming End-to-End Speech Recognition with Jointly Trained Neural Feature Enhancement.
Proceedings of the IEEE International Conference on Acoustics, 2021

Neural Utterance Confidence Measure for RNN-Transducers and Two Pass Models.
Proceedings of the IEEE International Conference on Acoustics, 2021

Comparative Study of Different Tokenization Strategies for Streaming End-to-End ASR.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

A Comparison of Streaming Models and Data Augmentation Methods for Robust Speech Recognition.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

Semi-Supervised Transfer Learning for Language Expansion of End-to-End Speech Recognition Models to Low-Resource Languages.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

Voice to Action: Spoken Language Understanding for Memory-Constrained Systems.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

HiTNet: Byte-to-BPE Hierarchical Transcription Network for End-to-End Speech Recognition.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

Two-Pass End-to-End ASR Model Compression.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020
Time-Varying Quasi-Closed-Phase Analysis for Accurate Formant Tracking in Speech Signals.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

Utterance Confidence Measure for End-to-End Speech Recognition with Applications to Distributed Speech Recognition Scenarios.
Proceedings of the Interspeech 2020, 2020

Utterance Invariant Training for Hybrid Two-Pass End-to-End Speech Recognition.
Proceedings of the Interspeech 2020, 2020

Streaming On-Device End-to-End ASR System for Privacy-Sensitive Voice-Typing.
Proceedings of the Interspeech 2020, 2020

Hierarchical Multi-Stage Word-to-Grapheme Named Entity Corrector for Automatic Speech Recognition.
Proceedings of the Interspeech 2020, 2020

A Review of On-Device Fully Neural End-to-End Automatic Speech Recognition Algorithms.
Proceedings of the 54th Asilomar Conference on Signals, Systems, and Computers, 2020

2019
Improved Vocal Tract Length Perturbation for a State-of-the-Art End-to-End Speech Recognition System.
Proceedings of the Interspeech 2019, 2019

Multi-Task Multi-Resolution Char-to-BPE Cross-Attention Decoder for End-to-End Speech Recognition.
Proceedings of the Interspeech 2019, 2019

End-to-End Training of a Large Vocabulary End-to-End Speech Recognition System.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Power-Law Nonlinearity with Maximally Uniform Distribution Criterion for Improved Neural Network Training in Automatic Speech Recognition.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Attention Based On-Device Streaming Speech Recognition with Large Speech Corpus.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Improved Multi-Stage Training of Online Attention-Based Encoder-Decoder Models.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018
Speaker recognition from whispered speech: A tutorial survey and an application of time-varying linear prediction.
Speech Commun., 2018

2017
Time-Varying Autoregressions for Speaker Verification in Reverberant Conditions.
Proceedings of the Interspeech 2017, 2017

2016
Whispered Speech Detection Using Fusion of Group-Delay-Based Subband Modulation Spectrum and Correntropy Features.
IEEE Signal Process. Lett., 2016

Time-Varying Quasi-Closed-Phase Weighted Linear Prediction Analysis of Speech for Accurate Formant Detection and Tracking.
Proceedings of the Interspeech 2016, 2016

Quasi closed phase analysis of speech signals using time varying weighted linear prediction for accurate formant tracking.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015
Vowel Enhancement in Early Stage Spanish Esophageal Speech Using Natural Glottal Flow Pulse and Vocal Tract Frequency Warping.
Proceedings of the 6th Workshop on Speech and Language Processing for Assistive Technologies, 2015

AM-FM based filter bank analysis for estimation of spectro-temporal envelopes and its application for speaker recognition in noisy reverberant environments.
Proceedings of the INTERSPEECH 2015, 2015

2014
On the role of missing data imputation and NMF feature enhancement in building synthetic voices using reverberant speech.
Proceedings of the INTERSPEECH 2014, 2014

2013
Spectro-temporal analysis of speech signals using zero-time windowing and group delay function.
Speech Commun., 2013

Analysis of Acoustic Events in Speech Signals Using Bessel Series Expansion.
Circuits Syst. Signal Process., 2013

Robust spectral representation using group delay function and stabilized weighted linear prediction for additive noise degradations.
Proceedings of the 7th Conference on Speech Technology and Human-Computer Dialogue, 2013

Robust formant detection using group delay function and stabilized weighted linear prediction.
Proceedings of the INTERSPEECH 2013, 2013

Analysis of breathy, modal and pressed phonation based on low frequency spectral density.
Proceedings of the INTERSPEECH 2013, 2013

2012
Effect of Tongue Tip Trilling on the Glottal Excitation Source.
Proceedings of the INTERSPEECH 2012, 2012

2011
Exploring Bessel Features for Detection of Glottal Closure Instants.
Proceedings of the INTERSPEECH 2011, 2011

Decomposition of speech signals for analysis of aperiodic components of excitation.
Proceedings of the IEEE International Conference on Acoustics, 2011

Acoustic-phonetic information from excitation source for refining manner hypotheses of a phone recognizer.
Proceedings of the IEEE International Conference on Acoustics, 2011

2010
Voiced/Nonvoiced Detection Based on Robustness of Voiced Epochs.
IEEE Signal Process. Lett., 2010

2008
Speaker change detection in casual conversations using excitation source features.
Speech Commun., 2008

Analysis of glottal stops in speech signals.
Proceedings of the INTERSPEECH 2008, 2008

Features for automatic detection of voice bars in continuous speech.
Proceedings of the INTERSPEECH 2008, 2008

Video Shot Segmentation Using Late Fusion Technique.
Proceedings of the Seventh International Conference on Machine Learning and Applications, 2008

2006
Correlation-Based Similarity Between Signals for Speaker Verification with Limited Amount of Speech Data.
Proceedings of the Multimedia Content Representation, 2006

2004
Speaker Segmentation Based on Subsegmental Features and Neural Network Models.
Proceedings of the Neural Information Processing, 11th International Conference, 2004

