Carol Y. Espy-Wilson

CoRR, January, 2026

2025

A Computational Approach to Analyzing Disrupted Language in Schizophrenia: Integrating Surprisal and Coherence Measures.

CoRR, November, 2025

Quantifying Articulatory Coordination as a Biomarker for Schizophrenia.

CoRR, November, 2025

Articulation-Informed ASR: Integrating Articulatory Features into ASR via Auxiliary Speech Inversion and Cross-Attention Fusion.

Jing Liu

CoRR, October, 2025

RealClass: A Framework for Classroom Speech Simulation with Public Datasets and Game Engines.

Jing Liu

CoRR, October, 2025

Reverse Attention for Lightweight Speech Enhancement on Edge Devices.

Shuubham Ojha

Felix Gervits

CoRR, September, 2025

From Weak Labels to Strong Results: Utilizing 5,000 Hours of Noisy Classroom Transcripts with Minimal Accurate Data.

CoRR, May, 2025

FT-Boosted SV: Towards Noise Robust Speaker Verification for English Speaking Classroom Environments.

Saba Tabatabaee

Jing Liu

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Enhancing Acoustic-to-Articulatory Speech Inversion by Incorporating Nasality.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Multimodal Biomarkers for Schizophrenia: Towards Individual Symptom Severity Estimation.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Analyzing the Impact of Accent on English Speech: Acoustic and Articulatory Perspectives.

Vinith Kugathasan

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Speech Kinematic Analysis from Acoustics: Scientific, Clinical and Practical Applications.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Subtyping Speech Errors in Childhood Speech Sound Disorders with Acoustic-to-Articulatory Speech Inversion.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

From Weak Labels to Strong Results: Utilizing 5,000 Hours of Noisy Classroom Transcripts with Minimal Accurate Data.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Self-supervised Multimodal Speech Representations for the Assessment of Schizophrenia Symptoms.

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Speech-Based Estimation of Schizophrenia Severity Using Feature Fusion.

Proceedings of the IEEE International Conference on Acoustics, 2025

CPT-Boosted Wav2vec2.0: Towards Noise Robust Speech Recognition for Classroom Environments.

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Speaking with Robots in Noisy Environments.

Shuubham Ojha

Felix Gervits

Proceedings of the 20th ACM/IEEE International Conference on Human-Robot Interaction, 2025

Acoustic to Articulatory Speech Inversion for Children with Velopharyngeal Insufficiency.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2025

2024

Continued Pretraining for Domain Adaptation of Wav2vec2.0 in Automatic Speech Recognition for Elementary Math Classroom Settings.

CoRR, 2024

Accent Conversion with Articulatory Representations.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

A Multimodal Framework for the Assessment of the Schizophrenia Spectrum.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Examining Vocal Tract Coordination in Childhood Apraxia of Speech with Acoustic-to-Articulatory Speech Inversion Feature Sets.

Nina R. Benway

Jonathan L. Preston

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Improving Speech Inversion Through Self-Supervised Embeddings and Enhanced Tract Variables.

Proceedings of the 32nd European Signal Processing Conference, 2024

A multi-modal approach for identifying schizophrenia using cross-modal attention.

Philip Resnik

Proceedings of the 46th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2024

Kid-Whisper: Towards Bridging the Performance Gap in Automatic Speech Recognition for Children VS. Adults.

Proceedings of the Seventh AAAI/ACM Conference on AI, Ethics, and Society (AIES-24) - Full Archival Papers, October 21-23, 2024, San Jose, California, USA, 2024

2023

Learning to Compute the Articulatory Representations of Speech with the MIRRORNET.

Shihab A. Shamma

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Speaker-independent Speech Inversion for Estimation of Nasalance.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Acoustic-to-Articulatory Speech Inversion Features for Mispronunciation Detection of /ɹ/ in Child Speech Sound Disorders.

Nina R. Benway

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Enhancing Speech Articulation Analysis Using A Geometric Transformation of the X-ray Microbeam Dataset.

Mark Tiede

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

The Secret Source: Incorporating Source Features to Improve Acoustic-To-Articulatory Speech Inversion.

Proceedings of the IEEE International Conference on Acoustics, 2023

Masked Autoencoders are Articulatory Learners.

Proceedings of the IEEE International Conference on Acoustics, 2023

Audio Data Augmentation for Acoustic-to-Articulatory Speech Inversion.

Proceedings of the 31st European Signal Processing Conference, 2023

2022

Modeling Feature Representations for Affective Speech Using Generative Adversarial Networks.

IEEE Trans. Affect. Comput., 2022

Spoken language interaction with robots: Recommendations for future research.

Comput. Speech Lang., 2022

Acoustic-to-articulatory Speech Inversion with Multi-task Learning.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Multimodal Depression Severity Score Prediction Using Articulatory Coordination Features and Hierarchical Attention Based Text Embeddings.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Acoustic To Articulatory Speech Inversion Using Multi-Resolution Spectro-Temporal Representations Of Speech Signals.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

An Empirical Analysis on the Vulnerabilities of End-to-End Speech Segregation Models.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Multimodal Depression Classification using Articulatory Coordination Features and Hierarchical Attention Based Text Embeddings.

Proceedings of the IEEE International Conference on Acoustics, 2022

Harmonicity Plays a Critical Role in DNN Based Versus in Biologically-Inspired Monaural Speech Segregation Systems.

Proceedings of the IEEE International Conference on Acoustics, 2022

2021

Generalized Dilated CNN Models for Depression Detection Using Inverted Vocal Tract Variables.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Speech Based Depression Severity Level Classification Using a Multi-Stage Dilated CNN-LSTM Model.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Multimodal Approach for Assessing Neuromotor Coordination in Schizophrenia Using Convolutional Neural Networks.

Chris Kitchen

Deanna L. Kelly

Proceedings of the ICMI '21: International Conference on Multimodal Interaction, 2021

2020

Deep Learning Based Generalized Models for Depression Classification.

CoRR, 2020

Spoken Language Interaction with Robots: Research Issues and Recommendations, Report from the NSF Future Directions Workshop.

Matthew Marge

Nigel G. Ward

CoRR, 2020

Extended Study on the Use of Vocal Tract Variables to Quantify Neuromotor Coordination in Depression.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

2019

Multi-Corpus Acoustic-to-Articulatory Speech Inversion.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Multi-Modal Learning for Speech Emotion Recognition: An Analysis and Comparison of ASR Outputs with Ground Truth Transcription.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Assessing Neuromotor Coordination in Depression Using Inverted Vocal Tract Variables.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

2018

Noise Robust Acoustic to Articulatory Speech Inversion.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

On Enhancing Speech Emotion Recognition Using Generative Adversarial Networks.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Smoothing Model Predictions Using Adversarial Training Procedures for Speech Based Emotion Recognition.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Semi-Supervised and Transfer Learning Approaches for Low Resource Sentiment Classification.

Shrikanth S. Narayanan

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017

Hybrid convolutional neural networks for articulatory and acoustic information based speech recognition.

Speech Commun., 2017

SCL-UMD at the Medico Task-MediaEval 2017: Transfer Learning based Classification of Medical Images.

Proceedings of the Working Notes Proceedings of the MediaEval 2017 Workshop co-located with the Conference and Labs of the Evaluation Forum (CLEF 2017), 2017

Analysis of Acoustic-to-Articulatory Speech Inversion Across Different Accents and Languages.

Martijn Wieling

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Adversarial Auto-Encoders for Speech Based Emotion Recognition.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

An Affect Prediction Approach Through Depression Severity Parameter Incorporation in Neural Networks.

Shrikanth S. Narayanan

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Joint modeling of articulatory and acoustic spaces for continuous speech recognition tasks.

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

2016

Vocal Tract Length Normalization for Speaker Independent Acoustic-to-Articulatory Speech Inversion.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Speech Features for Depression Detection.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

2015

Analysis of coarticulated speech using estimated articulatory trajectories.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Diversity of tongue shapes for the American English rhotic liquid.

Proceedings of the 18th International Congress of Phonetic Sciences, 2015

2014

Articulatory features from deep neural networks and their role in speech recognition.

Proceedings of the IEEE International Conference on Acoustics, 2014

2013

A cine MRI-based study of sibilant fricatives production in post-glossectomy speakers.

Proceedings of the IEEE International Conference on Acoustics, 2013

2012

Automatic intelligibility assessment of pathologic speech in head and neck cancer based on auditory-inspired spectro-temporal modulations.

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Multicondition training of Gaussian PLDA models in i-vector space for noise and reverberation robust speaker recognition.

Xinhui Zhou

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

2011

Articulatory Information for Noise Robust Speech Recognition.

IEEE Trans. Speech Audio Process., 2011

A Comparative Acoustic Study on Speech of Glossectomy Patients and Normal Subjects.

Xinhui Zhou

Maureen L. Stone

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Automatic Speech Codec Identification with Applications to Tampering Detection of Speech Recordings.

Jingting Zhou

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Analysis of i-vector Length Normalization in Speaker Recognition Systems.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Speech inversion: Benefits of tract variables over pellet trajectories.

Proceedings of the IEEE International Conference on Acoustics, 2011

Gesture-based Dynamic Bayesian Network for noise robust speech recognition.

Proceedings of the IEEE International Conference on Acoustics, 2011

Linear versus mel frequency cepstral coefficients for speaker recognition.

Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

Robust speech recognition using articulatory gestures in a Dynamic Bayesian Network framework.

Vikramjit Mitra

Hosung Nam

Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

2010

Retrieving Tract Variables From Acoustics: A Comparison of Different Machine Learning Strategies.

IEEE J. Sel. Top. Signal Process., 2010

Joint Factor Analysis for Speaker Recognition Reinterpreted as Signal Coding Using Overcomplete Dictionaries.

Proceedings of the Odyssey 2010: The Speaker and Language Recognition Workshop, Brno, Czech Republic, June 28, 2010

A procedure for estimating gestural scores from natural speech.

Mark Hasegawa-Johnson

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Robust word recognition using articulatory trajectories and gestures.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

An MRI-based articulatory and acoustic study of lateral sound in American English.

Proceedings of the IEEE International Conference on Acoustics, 2010

Automatic acquisition device identification from speech recordings.

Proceedings of the IEEE International Conference on Acoustics, 2010

2009

Noise robustness of tract variables and their application to speech recognition.

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

A noise-type and level-dependent MPO-based speech enhancement architecture with variable frame analysis for noise-robust speech recognition.

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

An algorithm for speech segregation of co-channel speech.

Proceedings of the IEEE International Conference on Acoustics, 2009

From acoustics to Vocal Tract time functions.

Proceedings of the IEEE International Conference on Acoustics, 2009

2008

An algorithm for multi-pitch tracking in co-channel speech.

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Language and genre detection in audio content analysis.

Vikramjit Mitra

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Intersession variability in speaker recognition: a behind the scene analysis.

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Language detection in audio content analysis.

Vikramjit Mitra

Proceedings of the IEEE International Conference on Acoustics, 2008

2007

Report on the NSF-sponsored Human Language Technology Workshop on Industrial Centers.

Proceedings of Machine Translation Summit XI: Papers, 2007

An articulatory and acoustic study of "retroflex" and "bunched" American English rhotic sound based on MRI.

Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Acoustic parameters for the automatic detection of vowel nasalization.

Tarun Pruthi

Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

A semi-automatic approach for speaker mining of tapped telephone conversations.

Sandeep Manocha

Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Landmark-based approach to speech recognition: an alternative to HMMs.

Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

2006

Automatic detection of irregular phonation in continuous speech.

Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

An MRI based study of the acoustic effects of sinus cavities and its application to speaker recognition.

Tarun Pruthi

Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

A new set of features for text-independent speaker identification.

Sandeep Manocha

Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Speech enhancement using modified phase opponency model.

Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Modified phase opponency based solution to the speech separation challenge.

Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

2005

Use of Temporal Information: Detection of Periodicity, Aperiodicity, and Pitch in Speech.

IEEE Trans. Speech Audio Process., 2005

Speech enhancement using auditory phase opponency model.

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Modeling of the Front Cavity and Sublingual Space in American English Rhotic Sounds.

Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

2004

Acoustic parameters for automatic detection of nasal manner.

Tarun Pruthi

Speech Commun., 2004

A novel method for computation of periodicity, aperiodicity and pitch of speech signals.

Jawahar Singh

Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2003

Acoustic modeling of American English lateral approximants.

Zhaoyan Zhang

Mark Tiede

Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Speech segmentation using probabilistic phonetic feature hierarchy and support vector machines.

Amit Juneja

Proceedings of the International Joint Conference on Neural Networks, 2003

A measure of aperiodicity and periodicity in speech.

Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

2002

An event-based acoustic-phonetic approach for speech segmentation and E-set recognition.

Amit Juneja

Proceedings of the IEEE International Conference on Acoustics, 2002

Acoustic-phonetic speech parameters for speaker-independent speech recognition.

Amit Juneja

Proceedings of the IEEE International Conference on Acoustics, 2002

2000

A new strategy of formant tracking based on dynamic programming.

Kun Xia

Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Detection of speech landmarks using temporal cues.

Ariel Salomon

Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

1999

Automatic detection of manner events based on temporal parameters.

Ariel Salomon

Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

Improvement of electrolaryngeal speech by introducing normal excitation information.

Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

1997

Acoustic modelling of American English /r/.

Shrikanth S. Narayanan

Suzanne Boyce

Abeer Alwan

Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

The design of acoustic parameters for speaker-independent speech recognition.

Nabil N. Bitar

Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

1996

Enhancement of alaryngeal speech by adaptive filtering.

Venkatesh R. Chari

Caroline B. Huang

Proceedings of the 4th International Conference on Spoken Language Processing, 1996

Coarticulatory stability in American English /r/.

Suzanne Boyce

Proceedings of the 4th International Conference on Spoken Language Processing, 1996

Knowledge-based parameters for HMM speech recognition.

Nabil N. Bitar

Proceedings of the 1996 IEEE International Conference on Acoustics, 1996

1995

Adaptive enhancement of Fourier spectra.

Venkatesh R. Chari

IEEE Trans. Speech Audio Process., 1995

Speech parameterization based on phonetic features: application to speech recognition.

Nabil N. Bitar

Proceedings of the Fourth European Conference on Speech Communication and Technology, 1995

1986

A phonetically based semivowel recognition system.
