Jahangir Alam

Orcid: 0000-0003-1081-3665

Affiliations:
  • Computer Research Institute of Montreal, CRIM, Quebec, Canada
  • University of Quebec, Institut National de la Recherche Scientifique, QC, Canada (PhD 2014)


According to our database, Jahangir Alam authored at least 135 papers between 2008 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
SSAVSV: Towards Unified Model for Self-Supervised Audio-Visual Speaker Verification.
CoRR, June, 2025

United we stand, Divided we fall: Handling Weak Complementary Relationships for Audio-Visual Emotion Recognition in Valence-Arousal Space.
CoRR, March, 2025

A Novel Hybrid Neural Embedding Extractor for Text Independent Speaker Verification.
Proceedings of the 13th International Workshop on Biometrics and Forensics, 2025

Text-dependent Speaker Verification Challenge 2024: Exploring Shared and User-defined Passphrases.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

LAVViT: Latent Audio-Visual Vision Transformers for Speaker Verification.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

AdaptiveDrop: A Simple Adaptive Label Noise Filtering Scheme for Enhanced Self-supervised Speaker Verification.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

United we stand, Divided we fall: Handling Weak Complementarity for Audio-Visual Emotion Recognition in Valence-Arousal Space.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2025

2024
Incongruity-Aware Cross-Modal Attention for Audio-Visual Fusion in Dimensional Emotion Recognition.
IEEE J. Sel. Top. Signal Process., April, 2024

An analytic study on clustering driven self-supervised speaker verification.
Pattern Recognit. Lett., 2024

Inconsistency-Aware Cross-Attention for Audio-Visual Fusion in Dimensional Emotion Recognition.
CoRR, 2024

Text-dependent Speaker Verification (TdSV) Challenge 2024: Challenge Evaluation Plan.
CoRR, 2024

On the Influence of CNN-Based Feature Learning Modules in Neural Speaker Verification Framework.
Proceedings of the Speech and Computer - 26th International Conference, 2024

Cross-Modal Transformers for Audio-Visual Person Verification.
Proceedings of the Odyssey 2024: The Speaker and Language Recognition Workshop, 2024

An investigative study of the effect of several regularization techniques on label noise robustness of self-supervised speaker verification systems.
Proceedings of the Odyssey 2024: The Speaker and Language Recognition Workshop, 2024

On the impact of several regularization techniques on label noise robustness of self-supervised speaker verification systems.
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Cross-Attention is not always needed: Dynamic Cross-Attention for Audio-Visual Dimensional Emotion Recognition.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

On the influence of regularization techniques on label noise robustness: Self-supervised speaker verification as a use case.
Proceedings of the IEEE International Joint Conference on Biometrics, 2024

Self-Supervised Speaker Verification Employing A Novel Clustering Algorithm.
Proceedings of the IEEE International Conference on Acoustics, 2024

Audio-Visual Person Verification Based on Recursive Fusion of Joint Cross-Attention.
Proceedings of the 18th IEEE International Conference on Automatic Face and Gesture Recognition, 2024

Dynamic Cross Attention for Audio-Visual Person Verification.
Proceedings of the 18th IEEE International Conference on Automatic Face and Gesture Recognition, 2024

Less is Enough: Adapting Pre-trained Vision Transformers for Audio-Visual Speaker Verification.
Proceedings of the NeurIPS Efficient Natural Language and Speech Processing Workshop, 2024

Enhanced label noise robustness through early adaptive filtering for the self-supervised speaker verification task.
Proceedings of the NeurIPS Efficient Natural Language and Speech Processing Workshop, 2024

Recursive Joint Cross-Modal Attention for Multimodal Fusion in Dimensional Emotion Recognition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
Audio-Visual Speaker Verification via Joint Cross-Attention.
Proceedings of the Speech and Computer - 25th International Conference, 2023

Multi-task Learning over Mixup Variants for the Speaker Verification Task.
Proceedings of the Speech and Computer - 25th International Conference, 2023

Self-supervised Speaker Verification Employing Augmentation Mix and Self-augmented Training-Based Clustering.
Proceedings of the Speech and Computer - 25th International Conference, 2023

Audio DeepFake Detection Employing Multiple Parametric Exponential Linear Units.
Proceedings of the Speech and Computer - 25th International Conference, 2023

On the influence of the quality of pseudo-labels on the self-supervised speaker verification task: a thorough analysis.
Proceedings of the 11th International Workshop on Biometrics and Forensics, 2023

On the Use of Cross-module Attention Statistics Pooling for Speaker Verification.
Proceedings of the 11th International Workshop on Biometrics and Forensics, 2023

On the Use of Cross- and Self-Module Attentive Statistics Pooling Techniques for Text-Independent Speaker Verification.
Proceedings of the IEEE International Joint Conference on Biometrics, 2023

Investigation Of The Quality Of Pseudo-Labels For The Self-Supervised Speaker Verification Task.
Proceedings of the IEEE International Conference on Acoustics, 2023

Hybrid Neural Network with Cross- and Self-Module Attention Pooling for Text-Independent Speaker Verification.
Proceedings of the IEEE International Conference on Acoustics, 2023

CAMSAT: Augmentation Mix and Self-Augmented Training Clustering for Self-Supervised Speaker Recognition.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022
A Multimodal Non-Intrusive Stress Monitoring From the Pleasure-Arousal Emotional Dimensions.
IEEE Trans. Affect. Comput., 2022

Multi-level self-attentive TDNN: A general and efficient approach to summarize speech into discriminative utterance-level representations.
Speech Commun., 2022

L-Mix: A Latent-Level Instance Mixup Regularization for Robust Self-Supervised Speaker Representation Learning.
IEEE J. Sel. Top. Signal Process., 2022

Attentive activation function for improving end-to-end spoofing countermeasure systems.
CoRR, 2022

An Analytic Study on Clustering-Based Pseudo-labels for Self-supervised Deep Speaker Verification.
Proceedings of the Speech and Computer - 24th International Conference, 2022

Multiresolution Decomposition Analysis via Wavelet Transforms for Audio Deepfake Detection.
Proceedings of the Speech and Computer - 24th International Conference, 2022

Neural Embedding Extractors for Text-Independent Speaker Verification.
Proceedings of the Speech and Computer - 24th International Conference, 2022

Flow-ER: A Flow-Based Embedding Regularization Strategy for Robust Speech Representation Learning.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Domain Generalized Speaker Embedding Learning via Mutual Information Minimization.
Proceedings of the Odyssey 2022: The Speaker and Language Recognition Workshop, 28 June, 2022

Investigation on Mixup Strategies for End-to-End Voice Spoof Detection System.
Proceedings of the Odyssey 2022: The Speaker and Language Recognition Workshop, 28 June, 2022

Investigation on Deep Speaker Embedding Extraction Methods for Multi-Genre Speaker Verification.
Proceedings of the Odyssey 2022: The Speaker and Language Recognition Workshop, 28 June, 2022

Hybrid Neural Network-Based Deep Embedding Extractors for Text-Independent Speaker Verification.
Proceedings of the Odyssey 2022: The Speaker and Language Recognition Workshop, 28 June, 2022

Development of ABC Systems for the 2021 Edition of NIST Speaker Recognition Evaluation.
Proceedings of the Odyssey 2022: The Speaker and Language Recognition Workshop, 28 June, 2022

Deep learning-based end-to-end spoken language identification system for domain-mismatched scenario.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

End-to-end framework for spoof-aware speaker verification.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Mixup regularization strategies for spoofing countermeasure system.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

MIM-DG: Mutual information minimization-based domain generalization for speaker verification.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Mel-Spectrogram Image-Based End-to-End Audio Deepfake Detection Under Channel-Mismatched Conditions.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

Robust Self-Supervised Speaker Representation Learning Via Instance Mix Regularization.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
On the use of blind channel response estimation and a residual neural network to detect physical access attacks to speaker verification systems.
Comput. Speech Lang., 2021

Robust Speech Representation Learning via Flow-based Embedding Regularization.
CoRR, 2021

An Ensemble Approach for the Diagnosis of COVID-19 from Speech and Cough Sounds.
Proceedings of the Speech and Computer - 23rd International Conference, 2021

End-to-End Voice Spoofing Detection Employing Time Delay Neural Networks and Higher Order Statistics.
Proceedings of the Speech and Computer - 23rd International Conference, 2021

Text-Independent Speaker Verification Employing CNN-LSTM-TDNN Hybrid Networks.
Proceedings of the Speech and Computer - 23rd International Conference, 2021

Hybrid Network with Multi-Level Global-Local Statistics Pooling for Robust Text-Independent Speaker Recognition.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020
On the use of the i-vector speech representation for instrumental quality measurement.
Qual. User Exp., 2020

Generalized end-to-end detection of spoofing attacks to automatic speaker recognizers.
Comput. Speech Lang., 2020

A Multi-condition Training Strategy for Countermeasures Against Spoofing Attacks to Speaker Recognizers.
Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020


On The Performance of Time-Pooling Strategies for End-to-End Spoken Language Identification.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

SdSV Challenge 2020: Large-Scale Evaluation of Short-Duration Speaker Verification.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

An end-to-end approach for the verification problem: learning the right distance.
Proceedings of the 37th International Conference on Machine Learning, 2020

An Ensemble Based Approach for Generalized Detection of Spoofing Attacks to Automatic Speaker Recognizers.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Residual convolutional neural network with attentive feature pooling for end-to-end language identification from short-duration speech.
Comput. Speech Lang., 2019

Short-duration Speaker Verification (SdSV) Challenge 2020: the Challenge Evaluation Plan.
CoRR, 2019

On the Use of Fisher Vector Encoding for Voice Spoofing Detection.
Proceedings of the 13th International Conference on Ubiquitous Computing and Ambient Intelligence, 2019

Intrusive Quality Measurement of Noisy and Enhanced Speech based on i-Vector Similarity.
Proceedings of the 11th International Conference on Quality of Multimedia Experience QoMEX 2019, 2019

End-To-End Detection Of Attacks To Automatic Speaker Recognizers With Time-Attentive Light Convolutional Neural Networks.
Proceedings of the 29th IEEE International Workshop on Machine Learning for Signal Processing, 2019

Combining Speaker Recognition and Metric Learning for Speaker-Dependent Representation Learning.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

CRIM's Speech Transcription and Call Sign Detection System for the ATC Airbus Challenge Task.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Deep Speaker Recognition: Modular or Monolithic?
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Blind Channel Response Estimation for Replay Attack Detection.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Generative Adversarial Speaker Embedding Networks for Domain Robust End-to-end Speaker Verification.
Proceedings of the IEEE International Conference on Acoustics, 2019

Adapting End-to-end Neural Speaker Verification to New Languages and Recording Conditions with Adversarial Training.
Proceedings of the IEEE International Conference on Acoustics, 2019

End-to-End Language Identification Using a Residual Convolutional Neural Network with Attentive Temporal Pooling.
Proceedings of the 27th European Signal Processing Conference, 2019

Development of Voice Spoofing Detection Systems for 2019 Edition of Automatic Speaker Verification and Countermeasures Challenge.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018
Boosting the Performance of Spoofing Detection Systems on Replay Attacks Using q-Logarithm Domain Feature Normalization.
Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018

Speaker Verification in Mismatched Conditions with Frustratingly Easy Domain Adaptation.
Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018

Deeply Fused Speaker Embeddings for Text-Independent Speaker Verification.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Investigating Speech Enhancement and Perceptual Quality for Speech Emotion Recognition.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

2017
Analysis and Description of ABC Submission to NIST SRE 2016.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Deep Speaker Embeddings for Short-Duration Speaker Verification.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Speaker Verification Under Adverse Conditions Using i-Vector Adaptation and Neural Networks.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Spoofing detection employing infinite impulse response - constant Q transform-based feature representations.
Proceedings of the 25th European Signal Processing Conference, 2017

2016
Speaker and Channel Factors in Text-Dependent Speaker Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2016

Text-Dependent Speaker Recognition With Random Digit Strings.
IEEE ACM Trans. Audio Speech Lang. Process., 2016

Modelling speaker and channel variability using deep neural networks for robust speaker verification.
Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016

Compensation for phonetic nuisance variability in speaker recognition using DNNs.
Proceedings of the Odyssey 2016: The Speaker and Language Recognition Workshop, 2016

Uncertainty Modeling Without Subspace Methods For Text-Dependent Speaker Recognition.
Proceedings of the Odyssey 2016: The Speaker and Language Recognition Workshop, 2016

Deep Neural Network based Text-Dependent Speaker Verification : Preliminary Results.
Proceedings of the Odyssey 2016: The Speaker and Language Recognition Workshop, 2016

Spoofing Detection on the ASVspoof2015 Challenge Corpus Employing Deep Neural Networks.
Proceedings of the Odyssey 2016: The Speaker and Language Recognition Workshop, 2016

Tandem Features for Text-Dependent Speaker Verification on the RedDots Corpus.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

2015
Regularized minimum variance distortionless response-based cepstral features for robust continuous speech recognition.
Speech Commun., 2015

Speech recognition in reverberant and noisy environments employing multiple feature extractors and i-vector speaker adaptation.
EURASIP J. Adv. Signal Process., 2015

ETS System for AV+EC 2015 Challenge.
Proceedings of the 5th International Workshop on Audio/Visual Emotion Challenge, 2015

JFA for speaker recognition with random digit strings.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

The reddots data collection for speaker recognition.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

An i-vector backend for speaker verification.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Combining amplitude and phase-based features for speaker verification with short duration utterances.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Development of CRIM system for the automatic speaker verification spoofing and countermeasures challenge 2015.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

JFA modeling with left-to-right structure and a new backend for text-dependent speaker recognition.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

2014
Robust feature extraction based on an asymmetric level-dependent auditory filterbank and a subband spectrum enhancement technique.
Digit. Signal Process., 2014

Deep Neural Networks for extracting Baum-Welch statistics for Speaker Recognition.
Proceedings of the Odyssey 2014: The Speaker and Language Recognition Workshop, 2014

Supervised/Unsupervised Voice Activity Detectors for Text-dependent Speaker Recognition on the RSR2015 Corpus.
Proceedings of the Odyssey 2014: The Speaker and Language Recognition Workshop, 2014

Joint Factor Analysis for Text-Dependent Speaker Verification.
Proceedings of the Odyssey 2014: The Speaker and Language Recognition Workshop, 2014

Automatic Emotion Recognition from Cochlear Implant-Like Spectrally Reduced Speech.
Proceedings of the Ambient Assisted Living and Daily Activities, 2014

In-domain versus out-of-domain training for text-dependent JFA.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Noise spectrum estimation using Gaussian mixture model-based speech presence probability for robust speech recognition.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

JFA-based front ends for speaker recognition.
Proceedings of the IEEE International Conference on Acoustics, 2014

Robust speech recognition using warped DFT-based cepstral features in clean and multistyle training.
Proceedings of the 22nd European Signal Processing Conference, 2014

Robust feature extractors for continuous speech recognition.
Proceedings of the 22nd European Signal Processing Conference, 2014

2013
Multitaper MFCC and PLP features for speaker verification using i-vectors.
Speech Commun., 2013

Low-variance Multitaper Mel-frequency Cepstral Coefficient Features for Speech and Speaker Recognition Systems.
Cogn. Comput., 2013

Smoothed Nonlinear Energy Operator-Based Amplitude Modulation Features for Robust Speech Recognition.
Proceedings of the Advances in Nonlinear Speech Processing - 6th International Conference, 2013

Frequency warping and robust speaker verification: a comparison of alternative mel-scale representations.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Regularized MVDR spectrum estimation-based robust feature extractors for speech recognition.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Amplitude modulation features for emotion recognition from speech.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

PLDA for speaker verification with utterances of arbitrary duration.
Proceedings of the IEEE International Conference on Acoustics, 2013

Multiple windowed spectral features for emotion recognition.
Proceedings of the IEEE International Conference on Acoustics, 2013

Speech recognition using regularized minimum variance distortionless response spectrum estimation-based cepstral features.
Proceedings of the IEEE International Conference on Acoustics, 2013

2012
On the use of asymmetric-shaped tapers for speaker verification using i-vectors.
Proceedings of the Odyssey 2012: The Speaker and Language Recognition Workshop, 2012

Robust Feature Extraction for Speech Recognition by Enhancing Auditory Spectrum.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Robust speech recognition under noisy environments using asymmetric tapers.
Proceedings of the 20th European Signal Processing Conference, 2012

2011
Perceptual improvement of Wiener filtering employing a post-filter.
Digit. Signal Process., 2011

Comparative Evaluation of Feature Normalization Techniques for Speaker Verification.
Proceedings of the Advances in Nonlinear Speech Processing, 2011

A Study of Low-variance Multi-taper Features for Distributed Speech Recognition.
Proceedings of the Advances in Nonlinear Speech Processing, 2011

Full-covariance UBM and heavy-tailed PLDA in i-vector speaker verification.
Proceedings of the IEEE International Conference on Acoustics, 2011

Multi-taper MFCC features for speaker verification using I-vectors.
Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

2009
An improved perceptual speech enhancement technique employing a psychoacoustically motivated weighting factor.
Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009

2008
Speech enhancement using a wiener denoising technique and musical noise reduction.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Speech enhancement based on novel two-step a priori SNR estimators.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Speech enhancement based on a hybrid a priori signal-to-noise ratio (SNR) estimator and a self-adaptive Lagrange multiplier.
Proceedings of the 2008 16th European Signal Processing Conference, 2008

