Najim Dehak

Orcid: 0000-0002-4489-5753

Affiliations:
  • MIT, Cambridge, USA


According to our database, Najim Dehak authored at least 169 papers between 2006 and 2024.

Bibliography

2024
Automating the analysis of eye movement for different neurodegenerative disorders.
Comput. Biol. Medicine, March, 2024

Time-Domain Speech Super-Resolution With GAN Based Modeling for Telephony Speaker Verification.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

Unraveling Adversarial Examples against Speaker Identification - Techniques for Attack Detection and Victim Model Classification.
CoRR, 2024

2023
Interpretable speech features vs. DNN embeddings: What to use in the automatic assessment of Parkinson's disease in multi-lingual scenarios.
Comput. Biol. Medicine, November, 2023

Time Scale Network: A Shallow Neural Network For Time Series Data.
CoRR, 2023

DPM-TSE: A Diffusion Probabilistic Model for Target Sound Extraction.
CoRR, 2023

Leveraging Pretrained Image-text Models for Improving Audio-Visual Learning.
CoRR, 2023

Stabilized training of joint energy-based models and their practical applications.
CoRR, 2023

Clustering Unsupervised Representations as Defense Against Poisoning Attacks on Speech Commands Classification System.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

Joint Energy-Based Model for Robust Speech Classification System Against Dirty-Label Backdoor Poisoning Attacks.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

Model-Based Fairness Metric for Speaker Verification.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022
Unsupervised Speech Segmentation and Variable Rate Representation Learning Using Segmental Contrastive Predictive Coding.
IEEE ACM Trans. Audio Speech Lang. Process., 2022

Non-Contrastive Self-Supervised Learning for Utterance-Level Information Extraction From Speech.
IEEE J. Sel. Top. Signal Process., 2022

Discovering phonetic inventories with crosslingual automatic speech recognition.
Comput. Speech Lang., 2022

Defense against Adversarial Attacks on Hybrid Speech Recognition using Joint Adversarial Fine-tuning with Denoiser.
CoRR, 2022

Code-Switching Text Augmentation for Multilingual Speech Processing.
CoRR, 2022

Textual Data Augmentation for Arabic-English Code-Switching Speech Recognition.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

A Multi-Modal Array of Interpretable Features to Evaluate Language and Speech Patterns in Different Neurological Disorders.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Vsameter: Evaluation of a New Open-Source Tool to Measure Vowel Space Area and Related Metrics.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Advances in Cross-Lingual and Cross-Source Audio-Visual Speaker Recognition: The JHU-MIT System for NIST SRE21.
Proceedings of the Odyssey 2022: The Speaker and Language Recognition Workshop, 28 June, 2022

Advances in Speaker Recognition for Multilingual Conversational Telephone Speech: The JHU-MIT System for NIST SRE20 CTS Challenge.
Proceedings of the Odyssey 2022: The Speaker and Language Recognition Workshop, 28 June, 2022

Chunking Defense for Adversarial Attacks on ASR.
Proceedings of the Interspeech 2022, 2022

End-to-End Neural Speaker Diarization with an Iterative Refinement of Non-Autoregressive Attention-based Attractors.
Proceedings of the Interspeech 2022, 2022

Joint domain adaptation and speech bandwidth extension using time-domain GANs for speaker verification.
Proceedings of the Interspeech 2022, 2022

AdvEst: Adversarial Perturbation Estimation to Classify and Detect Adversarial Attacks against Speaker Identification.
Proceedings of the Interspeech 2022, 2022

Defense against Adversarial Attacks on Hybrid Speech Recognition System using Adversarial Fine-tuning with Denoiser.
Proceedings of the Interspeech 2022, 2022

Non-contrastive self-supervised learning of utterance-level speech representations.
Proceedings of the Interspeech 2022, 2022

2021
Study of Pre-Processing Defenses Against Adversarial Attacks on State-of-the-Art Speaker Recognition Systems.
IEEE Trans. Inf. Forensics Secur., 2021

What Helps Transformers Recognize Conversational Structure? Importance of Context, Punctuation, and Labels in Dialog Act Recognition.
Trans. Assoc. Comput. Linguistics, 2021

Non-Autoregressive Transformer for Speech Recognition.
IEEE Signal Process. Lett., 2021

The JHU submission to VoxSRC-21: Track 3.
CoRR, 2021

Adversarial Attacks and Defenses for Speech Recognition Systems.
CoRR, 2021

Adversarial Attacks and Defenses for Speaker Identification Systems.
CoRR, 2021

Advances in Parkinson's Disease detection and assessment using voice and speech: A review of the articulatory and phonatory aspects.
Biomed. Signal Process. Control., 2021

Invariant Representation Learning for Robust Far-Field Speaker Recognition.
Proceedings of the Statistical Language and Speech Processing, 2021

Representation Learning to Classify and Detect Adversarial Attacks Against Speaker and Speech Recognition Systems.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Spine2Net: SpineNet with Res2Net and Time-Squeeze-and-Excitation Blocks for Speaker Recognition.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Automatic Detection and Assessment of Alzheimer Disease Using Speech and Language Technologies in Low-Resource Scenarios.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Deep Feature CycleGANs: Speaker Identity Preserving Non-Parallel Microphone-Telephone Domain Adaptation for Speaker Verification.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Align-Denoise: Single-Pass Non-Autoregressive Speech Recognition.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Segmental Contrastive Predictive Coding for Unsupervised Word Segmentation.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Align or attend? Toward More Efficient and Accurate Spoken Word Discovery Using Speech-to-Image Retrieval.
Proceedings of the IEEE International Conference on Acoustics, 2021

CopyPaste: An Augmentation Method for Speech Emotion Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2021

Perceptual Loss Based Speech Denoising with an Ensemble of Audio Pattern Recognition and Self-Supervised Models.
Proceedings of the IEEE International Conference on Acoustics, 2021

How Phonotactics Affect Multilingual and Zero-Shot ASR Performance.
Proceedings of the IEEE International Conference on Acoustics, 2021

Improving Reconstruction Loss Based Speaker Embedding in Unsupervised and Semi-Supervised Scenarios.
Proceedings of the IEEE International Conference on Acoustics, 2021

Focus on the Present: A Regularization Method for the ASR Source-Target Attention Layer.
Proceedings of the IEEE International Conference on Acoustics, 2021

New tools for the differential evaluation of Parkinson's disease using voice and speech processing.
Proceedings of the Fifth International Conference, 2021

Beyond Isolated Utterances: Conversational Emotion Recognition.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

Joint Prediction of Truecasing and Punctuation for Conversational Speech in Low-Resource Scenarios.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020
Analysis of the Effects of Supraglottal Tract Surgical Procedures in Automatic Speaker Recognition Performance.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

Introduction to the Issue on Automatic Assessment of Health Disorders Based on Voice, Speech, and Language Processing.
IEEE J. Sel. Top. Signal Process., 2020

State-of-the-art speaker recognition with neural network embeddings in NIST SRE18 and Speakers in the Wild evaluations.
Comput. Speech Lang., 2020

rVAD: An unsupervised segment-based robust voice activity detection method.
Comput. Speech Lang., 2020

Single Channel Far Field Feature Enhancement For Speaker Verification In The Wild.
CoRR, 2020

Advances in Speaker Recognition for Telephone and Audio-Visual Data: the JHU-MIT Submission for NIST SRE19.
Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020

Analysis of Deep Feature Loss Based Enhancement for Speaker Verification.
Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020

Black-Box Attacks on Spoofing Countermeasures Using Transferability of Adversarial Examples.
Proceedings of the Interspeech 2020, 2020

That Sounds Familiar: An Analysis of Phonetic Representations Transfer Across Languages.
Proceedings of the Interspeech 2020, 2020

x-Vectors Meet Adversarial Attacks: Benchmarking Adversarial Robustness in Speaker Verification.
Proceedings of the Interspeech 2020, 2020

Using State of the Art Speaker Recognition and Natural Language Processing Technologies to Detect Alzheimer's Disease and Assess its Severity.
Proceedings of the Interspeech 2020, 2020

Learning Speaker Embedding from Text-to-Speech.
Proceedings of the Interspeech 2020, 2020

Self-Expressing Autoencoders for Unsupervised Spoken Term Discovery.
Proceedings of the Interspeech 2020, 2020

Punctuation Prediction in Spontaneous Conversations: Can We Mitigate ASR Errors with Retrofitted Word Embeddings?
Proceedings of the Interspeech 2020, 2020

X-Vectors Meet Emotions: A Study On Dependencies Between Emotion and Speaker Recognition.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Unsupervised Feature Enhancement for Speaker Verification.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Using X-Vectors to Automatically Detect Parkinson's Disease from Speech.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Feature Enhancement with Deep Feature Losses for Speaker Verification.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Listen and Fill in the Missing Letters: Non-Autoregressive Transformer for Speech Recognition.
CoRR, 2019

Speaker Sincerity Detection based on Covariance Feature Vectors and Ensemble Methods.
CoRR, 2019

A forced gaussians based methodology for the differential evaluation of Parkinson's Disease by means of speech processing.
Biomed. Signal Process. Control., 2019

Pretraining by Backtranslation for End-to-End ASR in Low-Resource Settings.
Proceedings of the Interspeech 2019, 2019

State-of-the-Art Speaker Recognition for Telephone and Video Speech: The JHU-MIT Submission for NIST SRE18.
Proceedings of the Interspeech 2019, 2019

The JHU Speaker Recognition System for the VOiCES 2019 Challenge.
Proceedings of the Interspeech 2019, 2019

MCE 2018: The 1st Multi-Target Speaker Detection and Identification Challenge Evaluation.
Proceedings of the Interspeech 2019, 2019

Improving Emotion Identification Using Phone Posteriors in Raw Speech Waveform Based DNN.
Proceedings of the Interspeech 2019, 2019

Study of the Performance of Automatic Speech Recognition Systems in Speakers with Parkinson's Disease.
Proceedings of the Interspeech 2019, 2019

ASSERT: Anti-Spoofing with Squeeze-Excitation and Residual Networks.
Proceedings of the Interspeech 2019, 2019

Tied Mixture of Factor Analyzers Layer to Combine Frame Level Representations in Neural Speaker Embeddings.
Proceedings of the Interspeech 2019, 2019

Unsupervised Acoustic Segmentation and Clustering Using Siamese Network Embeddings.
Proceedings of the Interspeech 2019, 2019

Cycle-GANs for Domain Adaptation of Acoustic Features for Speaker Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2019

Investigation on Neural Bandwidth Extension of Telephone Speech for Improved Speaker Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2019

Attentive Filtering Networks for Audio Replay Attack Detection.
Proceedings of the IEEE International Conference on Acoustics, 2019

Language Model Integration Based on Memory Control for Sequence to Sequence Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2019

LSTM Siamese Network for Parkinson's Disease Detection from Speech.
Proceedings of the 2019 IEEE Global Conference on Signal and Information Processing, 2019

Bottom-Up Unsupervised Word Discovery via Acoustic Units.
Proceedings of the 2019 IEEE Global Conference on Signal and Information Processing, 2019

Hierarchical Transformers for Long Document Classification.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Low-Resource Domain Adaptation for Speaker Recognition Using Cycle-Gans.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018
NeuroSpeech.
SoftwareX, 2018

NeuroSpeech: An open-source software for Parkinson's speech analysis.
Digit. Signal Process., 2018

Low Resource Multi-modal Data Augmentation for End-to-end ASR.
CoRR, 2018

MCE 2018: The 1st Multi-target Speaker Detection and Identification Challenge Evaluation (MCE) Plan, Dataset and Baseline System.
CoRR, 2018

The JHU Speech LOREHLT 2017 System: Cross-Language Transfer for Situation-Frame Detection.
CoRR, 2018

Analysis of speaker recognition methodologies and the influence of kinetic changes to automatically detect Parkinson's Disease.
Appl. Soft Comput., 2018

Age Estimation in Short Speech Utterances Based on LSTM Recurrent Neural Networks.
IEEE Access, 2018

Building an ASR System for Mboshi Using A Cross-Language Definition of Acoustic Units Approach.
Proceedings of the 6th Intl. Workshop on Spoken Language Technologies for Under-Resourced Languages, 2018

Low-Resource Contextual Topic Identification on Speech.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

The MIT Lincoln Laboratory / JHU / EPITA-LSE LRE17 System.
Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018

End-to-End versus Embedding Neural Networks for Language Recognition in Mismatched Conditions.
Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018

Punctuation Prediction Model for Conversational Speech.
Proceedings of the Interspeech 2018, 2018

Automatic Speech Recognition and Topic Identification from Speech for Almost-Zero-Resource Languages.
Proceedings of the Interspeech 2018, 2018

Diarization is Hard: Some Experiences and Lessons Learned for the JHU Team in the Inaugural DIHARD Challenge.
Proceedings of the Interspeech 2018, 2018

Visualizing Phoneme Category Adaptation in Deep Neural Networks.
Proceedings of the Interspeech 2018, 2018

Emotion Identification from Raw Speech Signals Using DNNs.
Proceedings of the Interspeech 2018, 2018

Investigation on Bandwidth Extension for Speaker Recognition.
Proceedings of the Interspeech 2018, 2018

End-to-end Deep Neural Network Age Estimation.
Proceedings of the Interspeech 2018, 2018

Effectiveness of Single-Channel BLSTM Enhancement for Language Identification.
Proceedings of the Interspeech 2018, 2018

Deep Neural Networks for Emotion Recognition Combining Audio and Transcripts.
Proceedings of the Interspeech 2018, 2018

An Investigation of Non-linear i-vectors for Speaker Verification.
Proceedings of the Interspeech 2018, 2018

Joint Verification-Identification in end-to-end Multi-Scale CNN Framework for Topic Identification.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Characterizing Performance of Speaker Diarization Systems on Far-Field Speech Using Standard Methods.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Measuring Uncertainty in Deep Regression Models: The Case of Age Estimation from Speech.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

JHU Diarization System Description.
Proceedings of the Fourth International Conference, 2018

Study of the Automatic Detection of Parkinson's Disease Based on Speaker Recognition Technologies and Allophonic Distillation.
Proceedings of the 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2018

2017
Language Independent Assessment of Motor Impairments of Patients with Parkinson's Disease Using i-Vectors.
Proceedings of the Text, Speech, and Dialogue - 20th International Conference, 2017

Tied Variational Autoencoder Backends for i-Vector Speaker Recognition.
Proceedings of the Interspeech 2017, 2017

Evaluation of the Neurological State of People with Parkinson's Disease Using i-Vectors.
Proceedings of the Interspeech 2017, 2017

Multi-view representation learning via gcca for multimodal analysis of Parkinson's disease.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

An empirical evaluation of zero resource acoustic unit discovery.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Topic identification of spoken documents using unsupervised acoustic unit discovery.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

2016
On the Use of Acoustic Unit Discovery for Language Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2016

The MITLL NIST LRE 2015 Language Recognition System.
Proceedings of the Odyssey 2016: The Speaker and Language Recognition Workshop, 2016

I-Vector Representation Based on GMM and DNN for Audio Classification.
Proceedings of the Odyssey 2016: The Speaker and Language Recognition Workshop, 2016

Native Language Detection Using the I-Vector Framework.
Proceedings of the Interspeech 2016, 2016

Exploiting Hidden-Layer Responses of Deep Neural Networks for Language Recognition.
Proceedings of the Interspeech 2016, 2016

Automatic Dialect Detection in Arabic Broadcast Speech.
Proceedings of the Interspeech 2016, 2016

2015
Deep Neural Network Approaches to Speaker and Language Recognition.
IEEE Signal Process. Lett., 2015

ETS System for AV+EC 2015 Challenge.
Proceedings of the 5th International Workshop on Audio/Visual Emotion Challenge, 2015

A unified deep neural network for speaker and language recognition.
Proceedings of the INTERSPEECH 2015, 2015

Speaker adaptation using the i-vector technique for bottleneck features.
Proceedings of the INTERSPEECH 2015, 2015

2014
Non-Negative Factor Analysis of Gaussian Mixture Model Weight Adaptation for Language and Dialect Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2014

A complete KALDI recipe for building Arabic speech recognition systems.
Proceedings of the 2014 IEEE Spoken Language Technology Workshop, 2014

GMM Weights Adaptation Based on Subspace Approaches for Speaker Verification.
Proceedings of the Odyssey 2014: The Speaker and Language Recognition Workshop, 2014

Limited labels for unlimited data: active learning for speaker recognition.
Proceedings of the INTERSPEECH 2014, 2014

Recent advances in ASR applied to an Arabic transcription system for Al-Jazeera.
Proceedings of the INTERSPEECH 2014, 2014

2013
Unsupervised Methods for Speaker Diarization: An Integrated and Iterative Approach.
IEEE Trans. Speech Audio Process., 2013

New cosine similarity scorings to implement gender-independent speaker verification.
Proceedings of the INTERSPEECH 2013, 2013

Bayesian distance metric learning on i-vector for speaker verification.
Proceedings of the INTERSPEECH 2013, 2013

Developing a speaker identification system for the DARPA RATS project.
Proceedings of the IEEE International Conference on Acoustics, 2013

2012
The MITLL NIST LRE 2011 language recognition system.
Proceedings of the Odyssey 2012: The Speaker and Language Recognition Workshop, 2012

First attempt of boltzmann machines for speaker verification.
Proceedings of the Odyssey 2012: The Speaker and Language Recognition Workshop, 2012

On the Use of Spectral and Iterative Methods for Speaker Diarization.
Proceedings of the INTERSPEECH 2012, 2012

Patrol Team Language Identification System for DARPA RATS P1 Evaluation.
Proceedings of the INTERSPEECH 2012, 2012

2011
Front-End Factor Analysis for Speaker Verification.
IEEE Trans. Speech Audio Process., 2011

Exploiting Intra-Conversation Variability for Speaker Diarization.
Proceedings of the INTERSPEECH 2011, 2011

Language Recognition via i-vectors and Dimensionality Reduction.
Proceedings of the INTERSPEECH 2011, 2011

The MIT LL 2010 speaker recognition evaluation system: Scalable language-independent speaker recognition.
Proceedings of the IEEE International Conference on Acoustics, 2011

Towards reduced false-alarms using cohorts.
Proceedings of the IEEE International Conference on Acoustics, 2011

A channel-blind system for speaker verification.
Proceedings of the IEEE International Conference on Acoustics, 2011

2010
Unsupervised Speaker Adaptation based on the Cosine Similarity for Text-Independent Speaker Verification.
Proceedings of the Odyssey 2010: The Speaker and Language Recognition Workshop, Brno, Czech Republic, June 28, 2010

An i-vector Extractor Suitable for Speaker Recognition with both Microphone and Telephone Speech.
Proceedings of the Odyssey 2010: The Speaker and Language Recognition Workshop, Brno, Czech Republic, June 28, 2010

Cosine Similarity Scoring without Score Normalization Techniques.
Proceedings of the Odyssey 2010: The Speaker and Language Recognition Workshop, Brno, Czech Republic, June 28, 2010

2009
Cepstral and long-term features for emotion recognition.
Proceedings of the INTERSPEECH 2009, 2009

Support vector machines versus fast scoring in the low-dimensional total variability space for speaker verification.
Proceedings of the INTERSPEECH 2009, 2009

Comparison of scoring methods used in speaker recognition with Joint Factor Analysis.
Proceedings of the IEEE International Conference on Acoustics, 2009

Support vector machines and Joint Factor Analysis for speaker verification.
Proceedings of the IEEE International Conference on Acoustics, 2009

2008
A Study of Interspeaker Variability in Speaker Verification.
IEEE Trans. Speech Audio Process., 2008

The role of speaker factors in the NIST extended data task.
Proceedings of the Odyssey 2008: The Speaker and Language Recognition Workshop, 2008

Kernel combination for SVM speaker verification.
Proceedings of the Odyssey 2008: The Speaker and Language Recognition Workshop, 2008

Comparison between factor analysis and GMM support vector machines for speaker verification.
Proceedings of the Odyssey 2008: The Speaker and Language Recognition Workshop, 2008

Development of the primary CRIM system for the NIST 2008 speaker recognition evaluation.
Proceedings of the INTERSPEECH 2008, 2008

2007
Modeling Prosodic Features With Joint Factor Analysis for Speaker Verification.
IEEE Trans. Speech Audio Process., 2007

Continuous prosodic features and formant modeling with joint factor analysis for speaker verification.
Proceedings of the INTERSPEECH 2007, 2007

Linear and non linear kernel GMM supervector machines for speaker verification.
Proceedings of the INTERSPEECH 2007, 2007

2006
Support Vector Gmms for Speaker Verification.
Proceedings of the Odyssey 2006, 2006

GMM-based SVM for face recognition.
Proceedings of the 18th International Conference on Pattern Recognition (ICPR 2006), 2006

