L. Paola García-Perera

Matthew Wiesner

Matthew Maciejewski

Kelley M. Kempski Leadingham

CoRR, 2024

2023

Online Neural Diarization of Unlimited Numbers of Speakers Using Global and Local Attractors.

IEEE ACM Trans. Audio Speech Lang. Process., 2023

Enhancing Code-switching Speech Recognition with Interactive Language Biases.

CoRR, 2023

Unidirectional brain-computer interface: Artificial neural network encoding natural images to fMRI response in the visual cortex.

Joshua Punnoose

Leibny Paola García

Amir Manbachi

CoRR, 2023

The CHiME-7 DASR Challenge: Distant Meeting Transcription with Multiple Devices in Diverse Scenarios.

CoRR, 2023

Bypass Temporal Classification: Weakly Supervised Automatic Speech Recognition with Imperfect Transcripts.

CoRR, 2023

Investigating model performance in language identification: beyond simple error statistics.

Andy W. H. Khong

Justin Dauwels

CoRR, 2023

Genre Classification of Books on Spanish.

Ana Verónica Guerrero-Galván

Carolina Del-Valle-Soto

IEEE Access, 2023

A Quantitative Approach to Understand Self-Supervised Models as Cross-lingual Feature Extracters.

Proceedings of the 6th International Conference on Natural Language and Speech Processing (ICNLSP 2023), 2023

Crosslingual Handwritten Text Generation Using GANs.

Chun Chieh Chang

Proceedings of the Document Analysis and Recognition - ICDAR 2023 Workshops, 2023

A New Approach to Extract Fetal Electrocardiogram Using Affine Combination of Adaptive Filters.

Proceedings of the IEEE International Conference on Acoustics, 2023

Bridging Speech and Textual Pre-Trained Models With Unsupervised ASR.

Proceedings of the IEEE International Conference on Acoustics, 2023

Reducing Language Confusion for Code-Switching Speech Recognition with Token-Level Language Diarization.

Proceedings of the IEEE International Conference on Acoustics, 2023

PQLM - Multilingual Decentralized Portable Quantum Language Model.

Proceedings of the IEEE International Conference on Acoustics, 2023

Building Keyword Search System from End-To-End ASR Systems.

Ruizhe Huang

Matthew Wiesner

Jan Trmal

Proceedings of the IEEE International Conference on Acoustics, 2023

Adapting Self-Supervised Models to Multi-Talker Speech Recognition Using Speaker Embeddings.

Proceedings of the IEEE International Conference on Acoustics, 2023

EURO: ESPnet Unsupervised ASR Open-Source Toolkit.

Proceedings of the IEEE International Conference on Acoustics, 2023

Learning From Flawed Data: Weakly Supervised Automatic Speech Recognition.

Dongji Gao

Hainan Xu

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

Synthetic Data Augmentation for ASR with Domain Filtering.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

2022

Encoder-Decoder Based Attractors for End-to-End Neural Diarization.

IEEE ACM Trans. Audio Speech Lang. Process., 2022

Efficient Self-Supervised Learning Representations for Spoken Language Identification.

IEEE J. Sel. Top. Signal Process., 2022

Joint speaker diarization and speech recognition based on region proposal networks.

Zili Huang

Marc Delcroix

Comput. Speech Lang., 2022

PQLM - Multilingual Decentralized Portable Quantum Language Model for Privacy Protection.

CoRR, 2022

Investigating self-supervised learning for lyrics recognition.

CoRR, 2022

Online Neural Diarization of Unlimited Numbers of Speakers.

CoRR, 2022

Enhance Language Identification using Dual-mode Model with Knowledge Distillation.

CoRR, 2022

On Compressing Sequences for Self-Supervised Speech Models.

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Mutual Learning of Single- and Multi-Channel End-to-End Neural Diarization.

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Advances in Cross-Lingual and Cross-Source Audio-Visual Speaker Recognition: The JHU-MIT System for NIST SRE21.

Pedro A. Torres-Carrasquillo

Proceedings of the Odyssey 2022: The Speaker and Language Recognition Workshop, 28 June, 2022

Enhancing Language Identification Using Dual-Mode Model with Knowledge Distillation.

Proceedings of the Odyssey 2022: The Speaker and Language Recognition Workshop, 28 June, 2022

Updating Only Encoders Prevents Catastrophic Forgetting of End-to-End ASR Models.

Yuki Takashima

Shota Horiguchi

Yohei Kawaguchi

Proceedings of the Interspeech 2022, 2022

PHO-LID: A Unified Model Incorporating Acoustic-Phonetic and Phonotactic Information for Language Identification.

Andy W. H. Khong

Suzy J. Styles

Proceedings of the Interspeech 2022, 2022

Investigating Self-Supervised Learning for Speech Enhancement and Separation.

Proceedings of the IEEE International Conference on Acoustics, 2022

Multi-Channel End-To-End Neural Diarization with Distributed Microphones.

Proceedings of the IEEE International Conference on Acoustics, 2022

2021

Encoder-Decoder Based Attractor Calculation for End-to-End Neural Diarization.

CoRR, 2021

The Hitachi-JHU DIHARD III System: Competitive End-to-End Neural Diarization and X-Vector Clustering Systems Combined by DOVER-Lap.

CoRR, 2021

Online End-to-End Neural Diarization Handling Overlapping Speech and Flexible Numbers of Speakers.

CoRR, 2021

Online End-To-End Neural Diarization with Speaker-Tracing Buffer.

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

End-to-End Speaker Diarization Conditioned on Speech Activity and Overlap Detection.

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

DOVER-Lap: A Method for Combining Overlap-Aware Diarization Outputs.

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Online Streaming End-to-End Neural Diarization Handling Overlapping Speech and Flexible Numbers of Speakers.

Kenji Nagamatsu

Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Training Hybrid Models on Noisy Transliterated Transcripts for Code-Switched Speech Recognition.

Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Semi-Supervised Training with Pseudo-Labeling for End-To-End Neural Diarization.

Kenji Nagamatsu

Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

End-to-End Language Diarization for Bilingual Code-Switching Speech.

Carlos Rodrigo Castillo-Sanchez

Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

End-To-End Speaker Diarization as Post-Processing.

Proceedings of the IEEE International Conference on Acoustics, 2021

The CLIR-CLSP System for the IberSPEECH-RTVE 2020 Speaker Diarization and Identity Assignment Challenge.

Proceedings of the Fifth International Conference, 2021

Towards Neural Diarization for Unlimited Numbers of Speakers Using Global and Local Attractors.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020

State-of-the-art speaker recognition with neural network embeddings in NIST SRE18 and Speakers in the Wild evaluations.

Pedro A. Torres-Carrasquillo

Fred Richardson

Réda Dehak

Carlos Rodrigo Castillo-Sanchez

Comput. Speech Lang., 2020

DNN Speaker Tracking with Embeddings.

Anabel Martín-González

CoRR, 2020

The JHU Multi-Microphone Multi-Speaker ASR System for the CHiME-6 Challenge.

Ashish Arora

Aswin Shanmugam Subramanian

CoRR, 2020

Single Channel Far Field Feature Enhancement For Speaker Verification In The Wild.

Jesús Antonio Villalba López

CoRR, 2020

Advances in Speaker Recognition for Telephone and Audio-Visual Data: the JHU-MIT Submission for NIST SRE19.

Pedro Torres-Carrasquillo

Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020

Speaker Detection in the Wild: Lessons Learned from JSALT 2019.

Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020

End-to-End Domain-Adversarial Voice Activity Detection.

Proceedings of the Interspeech 2020, 2020

Unsupervised Feature Enhancement for Speaker Verification.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Feature Enhancement with Deep Feature Losses for Speaker Verification.

Nanxin Chen

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Speaker Diarization with Region Proposal Network.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Overlap-Aware Diarization: Resegmentation Using Neural End-to-End Overlapped Speech Detection.

Latané Bullock

Hervé Bredin

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019

Analysis of Robustness of Deep Single-Channel Speech Separation Using Corpora Constructed From Multiple Domains.

Matthew Maciejewski

Gregory Sell

Yusuke Fujita

Proceedings of the 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2019

Multi-PLDA Diarization on Children's Speech.

Jiamin Xie

Proceedings of the Interspeech 2019, 2019

Advances in Automatic Speech Recognition for Child Speech Using Factored Time Delay Neural Network.

Fei Wu

Proceedings of the Interspeech 2019, 2019

State-of-the-Art Speaker Recognition for Telephone and Video Speech: The JHU-MIT Submission for NIST SRE18.

Pedro A. Torres-Carrasquillo

Proceedings of the Interspeech 2019, 2019

Optical Character Recognition with Chinese and Korean Character Decomposition.

Chun-Chieh Chang

Ashish Arora

David Etter

Proceedings of the Second International Workshop on Machine Learning, 2019

Using ASR Methods for OCR.

Proceedings of the 2019 International Conference on Document Analysis and Recognition, 2019

2018

Building Corpora for Single-Channel Speech Separation Across Multiple Domains.

Matthew Maciejewski

Gregory Sell

CoRR, 2018

JHU Diarization System Description.

Zili Huang

Proceedings of the Fourth International Conference, 2018

2017

Analysis and Description of ABC Submission to NIST SRE 2016.

Proceedings of the Interspeech 2017, 2017

DNN Bottleneck Features for Speaker Clustering.

Jesús Jorrín

Paola García

Luis Buera

Proceedings of the Interspeech 2017, 2017

2016

Analysis of the Impact of the Audio Database Characteristics in the Accuracy of a Speaker Clustering System.

Jesús Jorrín-Prieto

Carlos Vaquero

Paola García

Proceedings of the Odyssey 2016: The Speaker and Language Recognition Workshop, 2016

2015

Context-Aware Communicator for All.

Proceedings of the Universal Access in Human-Computer Interaction. Access to Today's Technologies, 2015

2013

Ensemble approach in speaker verification.

Bhiksha Raj

Proceedings of the INTERSPEECH 2013, 2013

Optimization of the DET curve in speaker verification under noisy conditions.

Bhiksha Raj

Proceedings of the IEEE International Conference on Acoustics, 2013

2012

Optimization of the DET curve in speaker verification.

Bhiksha Raj

Richard M. Stern

Proceedings of the 2012 IEEE Spoken Language Technology Workshop (SLT), 2012

2011

Speaker Verification in Different Database Scenarios.

Roberto Aceves-Lopez

Computación y Sistemas, 2011

2010

Speech Magnitude-Spectrum Information-Entropy (MSIE) for Automatic Speech Recognition in Noisy Environments.

Roberto A. Aceves L.

Proceedings of the 20th International Conference on Pattern Recognition, 2010

On the Results of the First Mobile Biometry (MOBIO) Face and Speaker Verification Evaluation.

Proceedings of the Recognizing Patterns in Signals, Speech, Images and Videos, 2010

2008

Enhancing acoustic models for robust speaker verification.

Proceedings of the IEEE International Conference on Acoustics, 2008

2007

Robust Automatic Speech Recognition Using PD-MEEMLIN.

Proceedings of the Pattern Recognition and Image Analysis, Third Iberian Conference, 2007

2006

Using PCA to Improve the Generation of Speech Keys.

Brenda Sanchez-Torres

Proceedings of the MICAI 2006: Advances in Artificial Intelligence, 2006

2005

Parameter Optimization in a Text-Dependent Cryptographic-Speech-Key Generation Task.

Proceedings of the Nonlinear Analyses and Algorithms for Speech Processing, 2005

Cryptographic-Speech-Key Generation Architecture Improvements.

Proceedings of the Pattern Recognition and Image Analysis, Second Iberian Conference, 2005

Phoneme Spotting for Speech-Based Crypto-key Generation.

Proceedings of the Progress in Pattern Recognition, 2005

Multi-speaker voice cryptographic key generation.

Proceedings of the 2005 ACS / IEEE International Conference on Computer Systems and Applications (AICCSA 2005), 2005

2004

Cryptographic-Speech-Key Generation Using the SVM Technique over the lp-Cepstral Speech Space.

Proceedings of the Nonlinear Speech Modeling and Applications, 2004

SVM Applied to the Generation of Biometric Speech Key.
