Visar Berisha

Comput. Speech Lang., 2027

2026

V.O.I.C.E (Voice, Ownership, Identity, Control, Expression): Risk Taxonomy of Synthetic Voice Generation From Empirical Data.

CoRR, April, 2026

Multilingual Dysarthric Speech Assessment Using Universal Phone Recognition and Language-Specific Phonemic Contrast Modeling.

CoRR, January, 2026

2025

Advancing Automated Spatio-Semantic Analysis in Picture Description Using Language Models.

CoRR, October, 2025

Why Speech Deepfake Detectors Won't Generalize: The Limits of Detection in an Open World.

Prad Kadambi

Isabella Lenz

CoRR, September, 2025

MAGIC: Multi-task Gaussian process for joint imputation and classification in healthcare time series.

CoRR, September, 2025

Matched-Pair Experimental Design with Active Learning.

CoRR, September, 2025

PRAC3 (Privacy, Reputation, Accountability, Consent, Credit, Compensation): Long Tailed Risks of Voice Actors in AI Data-Economy.

Tanusree Sharma

Yihao Zhou

CoRR, July, 2025

Potential Applications of Artificial Intelligence for Cross-language Intelligibility Assessment of Dysarthric Speech.

CoRR, January, 2025

Unraveling overoptimism and publication bias in ML-driven science.

Pouria Saidi

Patterns, 2025

Statistically Valid Post-Deployment Monitoring Should Be Standard for AI-Based Digital Health.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

Mitigating Overfitting During Speech Foundation Model Fine-tuning: Applications to Dysarthric Speech Detection.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Witness Sensing for Verifying the Human Origin of Digital Media.

Proceedings of the IEEE/CVF International Conference on Computer Vision, ICCV 2025, 2025

The Impact of Decorrelation on Transformer Interpretation Methods: Applications to Clinical Speech AI.

Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

Automated Extraction of Spatio-Semantic Graphs for Identifying Cognitive Impairment.

Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

Cross-lingual Evaluation Of Hypernasality Using Wav2Vec2 Features.

S. R. Mahadeva Prasanna

Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

2024

Active Sequential Two-Sample Testing.

Karthikeyan Natesan Ramamurthy

Prad Kadambi

Pouria Saidi

Trans. Mach. Learn. Res., 2024

Responsible development of clinical speech AI: Bridging the gap between clinical research and technology.

npj Digit. Medicine, 2024

Orthogonality and graph divergence losses promote disentanglement in generative models.

Frontiers Comput. Sci., 2024

A Tutorial on Clinical Speech AI Development: From Data Collection to Model Validation.

CoRR, 2024

Advanced Tutorial: Label-Efficient Two-Sample Tests.

Proceedings of the Winter Simulation Conference, 2024

Improving Speech-Based Dysarthria Detection using Multi-task Learning with Gradient Projection.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Segmental and Suprasegmental Speech Foundation Models for Classifying Cognitive Risk Factors: Evaluating Out-of-the-Box Performance.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

How Does Alignment Error Affect Automated Pronunciation Scoring in Children's Speech?

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Achieving Reproducibility in EEG-Based Machine Learning.

Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency, 2024

2023

Robust Vocal Quality Feature Embeddings for Dysphonic Voice Detection.

IEEE ACM Trans. Audio Speech Lang. Process., 2023

Consonant-Vowel Transition Models Based on Deep Learning for Objective Evaluation of Articulation.

IEEE ACM Trans. Audio Speech Lang. Process., 2023

Learning Repeatable Speech Embeddings Using An Intra-class Correlation Regularizer.

Jianwei Zhang

Suren Jayasuriya

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Aligning Speech Enhancement for Improving Downstream Classification Performance.

Yan Xiong

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Decorrelating Language Model Embeddings for Speech-Based Prediction of Cognitive Impairment.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Requirements For Mass Adoption Of Assistive Listening Technology By The General Public.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Does Human Speech Follow Benford's Law?

Leo Hsu

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Smoothly Giving up: Robustness for Simple Models.

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023

2022

TorchDIVA: An Extensible Computational Model of Speech Production built on an Open-Source Machine Learning Library.

Sean Kinahan

Maria Elena Chavez Echeagaray

CoRR, 2022

Unsupervised EEG channel selection based on nonnegative matrix factorization.

Lingfeng Xu

Biomed. Signal Process. Control., 2022

A label efficient two-sample test.

Karthikeyan Natesan Ramamurthy

Proceedings of the Uncertainty in Artificial Intelligence, 2022

Investigating the Impact of Speech Compression on the Acoustics of Dysarthric Speech.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Are reported accuracies in the clinical speech machine learning literature overoptimistic?

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

2021

A Deep Learning Algorithm for Objective Assessment of Hypernasality in Children With Cleft Palate.

IEEE Trans. Biomed. Eng., 2021

Revisiting the accuracy problem in network analysis using a unique dataset.

Soc. Networks, 2021

Digital medicine and the curse of dimensionality.

npj Digit. Medicine, 2021

Computationally-efficient voice activity detection based on deep neural networks.

Yan Xiong

Proceedings of the IEEE Workshop on Signal Processing Systems, 2021

Restoring Degraded Speech via a Modified Diffusion Model.

Jianwei Zhang

Suren Jayasuriya

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

The Impact of Forced-Alignment Errors on Automatic Pronunciation Evaluation.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

An Attention Model for Hypernasality Prediction in Children with Cleft Palate.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

2020

Robust Estimation of Hypernasality in Dysarthria With Acoustic Model Likelihood Features.

IEEE ACM Trans. Audio Speech Lang. Process., 2020

Early detection and tracking of bulbar changes in ALS via frequent and remote speech analysis.

npj Digit. Medicine, 2020

Author Correction: Early detection and tracking of bulbar changes in ALS via frequent and remote speech analysis.

npj Digit. Medicine, 2020

A Review of Automated Speech and Language Features for Assessment of Cognitive and Thought Disorders.

Rohit Voleti

IEEE J. Sel. Top. Signal Process., 2020

An 8.93 TOPS/W LSTM Recurrent Neural Network Accelerator Featuring Hierarchical Coarse-Grain Sparsity for On-Device Speech Recognition.

IEEE J. Solid State Circuits, 2020

Finding the Homology of Decision Boundaries with Active Learning.

Karthikeyan Natesan Ramamurthy

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

UncommonVoice: A Crowdsourced Dataset of Dysphonic Speech.

Sethuraman Panchanathan

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Compressing LSTM Networks with Hierarchical Coarse-Grain Sparsity.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Deep Learning Based Prediction of Hypernasality for Clinical Applications.

Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

Regularization via Structural Label Smoothing.

Ravi Prakash Ramachandran

Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

2019

Articulation constrained learning with application to speech emotion recognition.

EURASIP J. Audio Speech Music. Process., 2019

Guest Editorial: Algorithms and Architectures for Machine Learning Based Speech Processing.

Tokunbo Ogunfunmi

Roberto Togneri

Brett Y. Smolenski

Circuits Syst. Signal Process., 2019

Robust Estimation of Hypernasality in Dysarthria.

CoRR, 2019

A Review of Language and Speech Features for Cognitive-Linguistic Assessment.

Rohit Voleti

CoRR, 2019

Residual + Capsule Networks (ResCap) for Simultaneous Single-Channel Overlapped Keyword Recognition.

Yan Xiong

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Objective Assessment of Social Skills Using Automated Language Analysis for Identification of Schizophrenia and Bipolar Disorder.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Say What? A Dataset for Exploring the Error Patterns That Two ASR Engines Make.

Sethuraman Panchanathan

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Do Conversational Partners Entrain on Articulatory Precision?

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Investigating the Effects of Word Substitution Errors on Sentence Embeddings.

Rohit Voleti

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

Joint Optimization of Quantization and Structured Sparsity for Compressed Deep Neural Networks.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

Objective Measures of Plosive Nasalization in Hypernasal Speech.

Michael Saxon

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

Objective Assessment of Vocal Tremor.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

A 8.93-TOPS/W LSTM Recurrent Neural Network Accelerator Featuring Hierarchical Coarse-Grain Sparsity With All Parameters Stored On-Chip.

Proceedings of the 45th IEEE European Solid State Circuits Conference, 2019

2018

Direct Estimation of Density Functionals Using a Polynomial Basis.

IEEE Trans. Signal Process., 2018

A Discriminative Acoustic-Prosodic Approach for Measuring Local Entrainment.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Investigating the Role of L1 in Automatic Pronunciation Evaluation of L2 Speech.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Triplet Network with Attention for Speaker Diarization.

Huan Song

Megan M. Willi

Jayaraman J. Thiagarajan

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Direct Ensemble Estimation of Density Functionals.

Kevin R. Moon

Uday Shankar Shanthamallu

Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing, 2018

Towards a Wearable Cough Detector Based on Neural Networks.

Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing, 2018

Simulating Dysarthric Speech for Training Data Augmentation in Clinical Speech Applications.

Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing, 2018

Online Machine Learning Experiments in HTML5.

Abhinav Dixit

Mahesh K. Banavar

Proceedings of the IEEE Frontiers in Education Conference, 2018

2017

Articulation Entropy: An Unsupervised Measure of Articulatory Precision.

IEEE Signal Process. Lett., 2017

Improving efficiency in sparse learning with the feedforward inhibitory motif.

Neurocomputing, 2017

A data-driven basis for direct estimation of functionals of distributions.

CoRR, 2017

Interpretable Objective Assessment of Dysarthric Speech Based on Deep Neural Networks.

Ming Tu

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Float Like a Butterfly Sting Like a Bee: Changes in Speech Preceded Parkinsonism Diagnosis for Muhammad Ali.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Objective assessment of pathological speech using distribution regression.

Ming Tu

Proceedings of the 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, 2017

Interpretable phonological features for clinical applications.

Yishan Jiao

Shreyas K. Venkataramanaiah

Proceedings of the 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, 2017

Minimizing area and energy of deep learning hardware design using collective low precision and structured compression.

Shihui Yin

Gaurav Srivastava

Jae-sun Seo

Proceedings of the 51st Asilomar Conference on Signals, Systems, and Computers, 2017

Improved finite-sample estimate of a nonparametric f-divergence.

Prad Kadambi

Proceedings of the 51st Asilomar Conference on Signals, Systems, and Computers, 2017

2016

Empirically Estimable Classification Bounds Based on a Nonparametric Divergence Measure.

IEEE Trans. Signal Process., 2016

Reducing the Model Order of Deep Neural Networks Using Information Theory.

Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2016

A Convex Model for Linguistic Influence in Group Conversations.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Accent Identification by Combining Deep Neural Networks and Recurrent Neural Networks Trained on Long and Short Term Features.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Empirically-estimable multi-class classification bounds.

Dennis Wei

Karthikeyan Ramamurthy

Proceedings of the 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, 2016

Ranking the parameters of deep neural networks using the fisher information.

Proceedings of the 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, 2016

Online speaking rate estimation using recurrent neural networks.

Proceedings of the 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, 2016

Noise robust dysarthric speech classification using domain adaptation.

Proceedings of the Digital Media Industry & Academic Forum, 2016

Models for objective evaluation of dysarthric speech from data annotated by multiple listeners.

Proceedings of the 50th Asilomar Conference on Signals, Systems and Computers, 2016

2015

Convex Weighting Criteria for Speaking Rate Estimation.

IEEE ACM Trans. Audio Speech Lang. Process., 2015

Empirical Non-Parametric Estimation of the Fisher Information.

Alfred O. Hero III

IEEE Signal Process. Lett., 2015

Active data labeling for improved classifier generalizability.

Douglas Cochran

Signal Process., 2015

Removing data with noisy responses in regression analysis.

Karthikeyan Ramamurthy

Proceedings of the 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, 2015

Estimating speaking rate in spontaneous discourse.

Proceedings of the 49th Asilomar Conference on Signals, Systems and Computers, 2015

2014

Empirically Estimable Classification Bounds Based on a New Divergence Measure.

CoRR, 2014

Domain invariant speech features using a new divergence measure.

Proceedings of the 2014 IEEE Spoken Language Technology Workshop, 2014

Modeling pathological speech perception from data with similarity labels.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2014

2013

Bandwidth Extension of Speech Using Perceptual Criteria

Steven Sandoval

Synthesis Lectures on Algorithms and Software in Engineering, Morgan & Claypool Publishers, ISBN: 978-3-031-01521-2, 2013

Towards a clinical tool for automatic intelligibility assessment.

Rene Utianski

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2013

Selecting disorder-specific features for speech pathology fingerprinting.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2013

2011

Editorial.

Louis L. Scharf

Digit. Signal Process., 2011

Semi-supervised hierarchy learning using multiple-labeled data.

Proceedings of the 2011 IEEE International Workshop on Machine Learning for Signal Processing, 2011

2010

An auditory-domain based speech enhancement algorithm.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2010

2009

Experiments With Sensor Motes and Java-DSP.

IEEE Trans. Educ., 2009

A Frequency/Detector Pruning Approach for Loudness Estimation.

IEEE Signal Process. Lett., 2009

A Sensor Network for Real-time Acoustic Scene Analysis.

Proceedings of the International Symposium on Circuits and Systems (ISCAS 2009), 2009

Energy-constrained discriminant analysis.

Scott Philips

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2009

Low-complexity sinusoidal component selection using loudness patterns.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2009

2008

Gradient projection-based channel equalization under sustained fading.

Venkatraman Atti

Kostas Tsakalis

Constantinos Panayiotou

Leonidas D. Iasemidis

Signal Process., 2008

Real-time sensing and acoustic scene characterization for security applications.

Proceedings of the Third International Symposium on Wireless Pervasive Computing, 2008

A low-complexity loudness estimation algorithm.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2008

2007

Wideband Speech Recovery Using Psychoacoustic Criteria.

EURASIP J. Audio Speech Music. Process., 2007

Dual-Mode Wideband Speech Compression.

Proceedings of the IEEE 9th Workshop on Multimedia Signal Processing, 2007

Sparse Manifold Learning with Applications to SAR Image Classification.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2007

A Scalable Bandwidth Extension Algorithm.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2007

2006

Bandwidth Extension of Audio Based on Partial Loudness Criteria.

Proceedings of the IEEE 8th Workshop on Multimedia Signal Processing, 2006

Real-time acoustic monitoring using wireless sensor motes.

Proceedings of the International Symposium on Circuits and Systems (ISCAS 2006), 2006

Real-Time Collaborative Monitoring in Wireless Sensor Networks.

Proceedings of the 2006 IEEE International Conference on Acoustics, Speech and Signal Processing, 2006

2005

Enhancing the Quality of Coded Audio Using Perceptual Criteria.

Andreas S. Spanias

Proceedings of the IEEE 7th Workshop on Multimedia Signal Processing, 2005

Enhancing vocoder performance for music signals.
