Kong-Aik Lee

Eng Siong Chng

CoRR, March, 2026

U3-xi: Pushing the Boundaries of Speaker Recognition via Incorporating Uncertainty.

Junjie Li

CoRR, January, 2026

Stream-Voice-Anon: Enhancing Utility of Real-Time Speaker Anonymization via Neural Audio Codec and Language Models.

CoRR, January, 2026

ASVspoof 5: Evaluation of Spoofing, Deepfake, and Adversarial Attack Detection Using Crowdsourced Speech.

CoRR, January, 2026

ASVspoof 5: Design, collection and validation of resources for spoofing, deepfake, and adversarial attack detection using crowdsourced speech.

Comput. Speech Lang., 2026

2025

Speaking Clearly: A Simplified Whisper-Based Codec for Low-Bitrate Speech Coding.

CoRR, October, 2025

EDVD-LLaMA: Explainable Deepfake Video Detection via Multimodal Large Language Model Reasoning.

CoRR, October, 2025

Addressing Gradient Misalignment in Data-Augmented Training for Robust Speech Deepfake Detection.

CoRR, September, 2025

QAMO: Quality-aware Multi-centroid One-class Learning For Speech Deepfake Detection.

CoRR, September, 2025

The First Voice Timbre Attribute Detection Challenge.

CoRR, September, 2025

Xi+: Uncertainty Supervision for Robust Speaker Embedding.

CoRR, September, 2025

MeMo: Attentional Momentum for Real-time Audio-visual Speaker Extraction under Impaired Visual Conditions.

CoRR, July, 2025

Robust Localization of Partially Fake Speech: Metrics, Models, and Out-of-Domain Evaluation.

CoRR, July, 2025

Modeling the One-to-Many Property in Open-Domain Dialogue with LLMs.

CoRR, June, 2025

Introducing voice timbre attribute detection.

CoRR, May, 2025

The Voice Timbre Attribute Detection 2025 Challenge Evaluation Plan.

CoRR, May, 2025

Nes2Net: A Lightweight Nested Architecture for Foundation Model Driven Speech Anti-Spoofing.

IEEE Trans. Inf. Forensics Secur., 2025

Any-to-Any Speaker Attribute Perturbation for Asynchronous Voice Anonymization.

IEEE Trans. Inf. Forensics Secur., 2025

Asynchronous Voice Anonymization by Learning From Speaker-Adversarial Speech.

IEEE Signal Process. Lett., 2025

Pinhole Effect on Linkability and Dispersion in Speaker Anonymization.

IEEE Signal Process. Lett., 2025

Make full use of your data: On copy-based augmentation in speech anti-spoofing.

Neurocomputing, 2025

ConFusionformer: Locality-enhanced Conformer through multi-resolution attention fusion for speaker verification.

Neurocomputing, 2025

Adversarially adaptive temperatures for decoupled knowledge distillation with applications to speaker verification.

Neurocomputing, 2025

Quantifying prediction uncertainties in automatic speaker verification systems.

Comput. Speech Lang., 2025

The Sub-3Sec Problem: From Text-Independent to Text-Dependent Corpus.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Bayesian Learning for Domain-Invariant Speaker Verification and Anti-Spoofing.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

IDIR: Identifying and Distilling Informative Relations for Speaker Verification.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

MoMuSE: Momentum Multi-modal Target Speaker Extraction for Real-time Scenarios with Impaired Visual Cues.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2025

Text-dependent Speaker Verification Challenge 2024: Exploring Shared and User-defined Passphrases.

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

LlamaPartialSpoof: An LLM-Driven Fake Speech Dataset Simulating Disinformation Generation.

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Grouped Knowledge Distillation with Adaptive Logit Softening for Speaker Recognition.

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Investigation of Perception Inconsistency in Speaker Embedding for Asynchronous Voice Anonymization.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2025

A Preliminary Study on Sectional Voice Anonymization and Detection.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2025

Multimodal Large Language Model for Deepfake Video Detection and Description.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2025

Variational Regularization for End-to-End Speech Deepfake Detection.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2025

Exploring Audio-Visual Fusion Methods in Foundation Model-Based Deception Detection.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2025

Robust Localization of Partially Fake Speech: Metrics and Out-of-Domain Evaluation.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2025

Speaker Privacy and Security in the Big Data Era: Protection and Defense Against Deepfake.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2025

2024

ASVspoof 5: Design, Collection and Validation of Resources for Spoofing, Deepfake, and Adversarial Attack Detection Using Crowdsourced Speech.

Dataset, December, 2024

Encoder-Decoder Calibration for Multimodal Machine Translation.

IEEE Trans. Artif. Intell., August, 2024

t-EER: Parameter-Free Tandem Evaluation of Countermeasures and Biometric Comparators.

IEEE Trans. Pattern Anal. Mach. Intell., May, 2024

Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning.

IEEE ACM Trans. Audio Speech Lang. Process., 2024

Generalizing Speaker Verification for Spoof Awareness in the Embedding Space.

IEEE ACM Trans. Audio Speech Lang. Process., 2024

Golden Gemini is All You Need: Finding the Sweet Spots for Speaker Verification.

IEEE ACM Trans. Audio Speech Lang. Process., 2024

Cosine Scoring With Uncertainty for Neural Speaker Embedding.

IEEE Signal Process. Lett., 2024

MoMuSE: Momentum Multi-modal Target Speaker Extraction for Real-time Scenarios with Impaired Visual Cues.

CoRR, 2024

NTU-NPU System for Voice Privacy 2024 Challenge.

CoRR, 2024

Malacopula: adversarial automatic speaker verification attacks using a neural-based generalised Hammerstein model.

CoRR, 2024

ASVspoof 5: Crowdsourced Speech Data, Deepfakes, and Adversarial Attacks at Scale.

CoRR, 2024

Text-dependent Speaker Verification (TdSV) Challenge 2024: Challenge Evaluation Plan.

CoRR, 2024

VoxGenesis: Unsupervised Discovery of Latent Speaker Manifold for Speech Synthesis.

CoRR, 2024

Using Twitter Dataset for Social Listening in Singapore.

IEEE Access, 2024

Room Impulse Responses Help Attackers to Evade Deep Fake Detection.

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

Towards Quantifying and Reducing Language Mismatch Effects in Cross-Lingual Speech Anti-Spoofing.

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

On the Effectiveness of Enrollment Speech Augmentation For Target Speaker Extraction.

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

On The Generation and Removal of Speaker Adversarial Perturbation For Voice-Privacy Protection.

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

Asynchronous Voice Anonymization Using Adversarial Perturbation On Speaker Embedding.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Temporal-Channel Modeling in Multi-head Self-Attention for Synthetic Speech Detection.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

MM-NodeFormer: Node Transformer Multimodal Fusion for Emotion Recognition in Conversation.

Zilong Huang

Man-Wai Mak

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Revisiting and Improving Scoring Fusion for Spoofing-aware Speaker Verification Using Compositional Data Analysis.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Two-stage Semi-supervised Speaker Recognition with Gated Label Learning.

Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

CPAUG: Refining Copy-Paste Augmentation for Speech Anti-Spoofing.

Proceedings of the IEEE International Conference on Acoustics, 2024

Emphasized Non-Target Speaker Knowledge in Knowledge Distillation for Automatic Speaker Verification.

Proceedings of the IEEE International Conference on Acoustics, 2024

Gradient Weighting for Speaker Verification in Extremely Low Signal-to-Noise Ratio.

Proceedings of the IEEE International Conference on Acoustics, 2024

Modeling Pseudo-Speaker Uncertainty in Voice Anonymization.

Proceedings of the IEEE International Conference on Acoustics, 2024

Adversarial Speech for Voice Privacy Protection from Personalized Speech Generation.

Proceedings of the IEEE International Conference on Acoustics, 2024

2023

A Dual Latent Variable Personalized Dialogue Agent.

SN Comput. Sci., March, 2023

Generalized Domain Adaptation Framework for Parametric Back-End in Speaker Recognition.

IEEE Trans. Inf. Forensics Secur., 2023

Meta-Generalization for Domain-Invariant Speaker Verification.

IEEE ACM Trans. Audio Speech Lang. Process., 2023

Self-Supervised Training of Speaker Encoder With Multi-Modal Diverse Positive Pairs.

IEEE ACM Trans. Audio Speech Lang. Process., 2023

ASVspoof 2021: Towards Spoofed and Deepfake Speech Detection in the Wild.

IEEE ACM Trans. Audio Speech Lang. Process., 2023

An Empirical Bayes Framework for Open-Domain Dialogue Generation.

CoRR, 2023

Partially Randomizing Transformer Weights for Dialogue Response Diversity.

Proceedings of the 37th Pacific Asia Conference on Language, 2023

Disentangling Voice and Content with Self-Supervision for Speaker Recognition.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Towards Single Integrated Spoofing-aware Speaker Verification Embeddings.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Speaker-Aware Anti-spoofing.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Incorporating Uncertainty from Speaker Embedding Estimation to Speaker Verification.

Tianchi Liu

Proceedings of the IEEE International Conference on Acoustics, 2023

Speaker Recognition with Two-Step Multi-Modal Deep Cleansing.

Proceedings of the IEEE International Conference on Acoustics, 2023

Noise-Disentanglement Metric Learning for Robust Speaker Verification.

Proceedings of the IEEE International Conference on Acoustics, 2023

Probabilistic Back-ends for Online Speaker Recognition and Clustering.

Proceedings of the IEEE International Conference on Acoustics, 2023

Cross-Modal Audio-Visual Co-Learning for Text-Independent Speaker Verification.

Proceedings of the IEEE International Conference on Acoustics, 2023

Leveraging Positional-Related Local-Global Dependency for Synthetic Speech Detection.

Proceedings of the IEEE International Conference on Acoustics, 2023

Self-Supervised Audio-Visual Speaker Representation with Co-Meta Learning.

Proceedings of the IEEE International Conference on Acoustics, 2023

The Second Multi-Channel Multi-Party Meeting Transcription Challenge (M2MeT 2.0): A Benchmark for Speaker-Attributed ASR.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022

Neural Acoustic-Phonetic Approach for Speaker Verification With Phonetic Attention Mask.

IEEE Signal Process. Lett., 2022

Discriminative speaker embedding with serialized multi-layer multi-head attention.

Hongning Zhu

Speech Commun., 2022

I4U System Description for NIST SRE'20 CTS Challenge.

CoRR, 2022

Noise-Robust Semi-supervised Multi-modal Machine Translation.

Proceedings of the PRICAI 2022: Trends in Artificial Intelligence, 2022

Baseline Systems for the First Spoofing-Aware Speaker Verification Challenge: Score and Embedding Fusion.

Proceedings of the Odyssey 2022: The Speaker and Language Recognition Workshop, 28 June, 2022

Deep Spectro-temporal Artifacts for Detecting Synthesized Speech.

Proceedings of the DDAM@MM 2022: Proceedings of the 1st International Workshop on Deepfake Detection for Audio Multimedia, 2022

The Conversational Short-phrase Speaker Diarization (CSSD) Task: Dataset, Evaluation Metric and Baselines.

Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

Scoring of Large-Margin Embeddings for Speaker Verification: Cosine or PLDA?

Tianchi Liu

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Learning Domain-Invariant Transformation for Speaker Verification.

Proceedings of the IEEE International Conference on Acoustics, 2022

Summary on the ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Grand Challenge.

Proceedings of the IEEE International Conference on Acoustics, 2022

Self-Supervised Speaker Recognition with Loss-Gated Learning.

Proceedings of the IEEE International Conference on Acoustics, 2022

MFA: TDNN with Multi-Scale Frequency-Channel Attention for Text-Independent Speaker Verification with Short Utterances.

Proceedings of the IEEE International Conference on Acoustics, 2022

Improving Contextual Coherence in Variational Personalized and Empathetic Dialogue Agents.

Proceedings of the IEEE International Conference on Acoustics, 2022

DLVGen: A Dual Latent Variable Approach to Personalized Dialogue Generation.

Proceedings of the 14th International Conference on Agents and Artificial Intelligence, 2022

A Randomized Link Transformer for Diverse Open-Domain Dialogue Generation.

Proceedings of the 4th Workshop on NLP for Conversational AI, 2022

2021

ASVspoof 2019: Spoofing Countermeasures for the Detection of Synthesized, Converted and Replayed Speech.

IEEE Trans. Biom. Behav. Identity Sci., 2021

Xi-Vector Embedding for Speaker Recognition.

Takafumi Koshinaka

IEEE Signal Process. Lett., 2021

ASVtorch toolkit: Speaker verification with deep neural networks.

Ville Vestman

SoftwareX, 2021

Replay attack detection using variable-frequency resolution phase and magnitude features.

Comput. Speech Lang., 2021

ASVspoof 2021: accelerating progress in spoofed and deepfake speech detection.

CoRR, 2021

ASVspoof 2021: Automatic Speaker Verification Spoofing and Countermeasures Challenge Evaluation Plan.

CoRR, 2021

Benchmarking and challenges in security and privacy for voice biometrics.

Brij Mohan Lal Srivastava

Massimiliano Todisco

Natalia A. Tomashenko

Emmanuel Vincent

Xin Wang

Junichi Yamagishi

CoRR, 2021

Generating Personalized Dialogue via Multi-Task Meta-Learning.

CoRR, 2021

Exploring Deep Learning for Joint Audio-Visual Lip Biometrics.

CoRR, 2021

Serialized Multi-Layer Multi-Head Attention for Neural Speaker Embedding.

Hongning Zhu

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Multi-Level Transfer Learning from Near-Field to Far-Field Speaker Verification.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Joint Feature Enhancement and Speaker Recognition with Multi-Objective Task-Oriented Network.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Visualizing Classifier Adjacency Relations: A Case Study in Speaker Verification and Voice Anti-Spoofing.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Meta-Learning for Cross-Channel Speaker Verification.

Proceedings of the IEEE International Conference on Acoustics, 2021

Replay-Attack Detection Using Features With Adaptive Spectro-Temporal Resolution.

Proceedings of the IEEE International Conference on Acoustics, 2021

COOPNet: Multi-Modal Cooperative Gender Prediction in Social Media User Profiling.

Proceedings of the IEEE International Conference on Acoustics, 2021

Task-aware Warping Factors in Mask-based Speech Enhancement.

Proceedings of the 29th European Signal Processing Conference, 2021

PL-EESR: Perceptual Loss Based End-to-End Robust Speaker Representation Extraction.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

DeepLip: A Benchmark for Deep Learning-Based Audio-Visual Lip Biometrics.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020

Maximal Figure-of-Merit Framework to Detect Multi-Label Phonetic Features for Spoken Language Recognition.

Ivan Kukanov

Trung Ngo Trong

Sabato Marco Siniscalchi

Valerio Mario Salerno

IEEE ACM Trans. Audio Speech Lang. Process., 2020

Tandem Assessment of Spoofing Countermeasures and Automatic Speaker Verification: Fundamentals.

IEEE ACM Trans. Audio Speech Lang. Process., 2020

ASVspoof 2019: A large-scale public database of synthesized, converted and replayed speech.

Comput. Speech Lang., 2020

Voice biometrics security: Extrapolating false alarm rate via hierarchical Bayesian modeling of speaker verification scores.

Comput. Speech Lang., 2020

NEC-TT System for Mixed-Bandwidth and Multi-Domain Speaker Recognition.

Comput. Speech Lang., 2020

Two decades into Speaker Recognition Evaluation - are we there yet?

Comput. Speech Lang., 2020

Using Multi-Resolution Feature Maps with Convolutional Neural Networks for Anti-Spoofing in ASV.

Takafumi Koshinaka

Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020

Neural i-vectors.

Ville Vestman

Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020

Speaker Detection in the Wild: Lessons Learned from JSALT 2019.

Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020

On Early-stop Clustering for Speaker Diarization.

Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020

Dynamic Margin Softmax Loss for Speaker Verification.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Adversarial Separation Network for Speaker Recognition.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

SdSV Challenge 2020: Large-Scale Evaluation of Short-Duration Speaker Verification.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Extrapolating False Alarm Rates in Automatic Speaker Verification.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

NEC-TT Speaker Verification System for SRE'19 CTS Challenge.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

POCO: A Voice Spoofing and Liveness Detection Corpus Based on Pop Noise.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Deep Discriminative Embedding with Ranked Weight for Speaker Verification.

Proceedings of the Neural Information Processing - 27th International Conference, 2020

A Generalized Framework for Domain Adaptation of PLDA in Speaker Recognition.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019

Introduction to Voice Presentation Attack Detection and Recent Advances.

Proceedings of the Handbook of Biometric Anti-Spoofing, 2019

Short-duration Speaker Verification (SdSV) Challenge 2020: the Challenge Evaluation Plan.

CoRR, 2019

The ASVspoof 2019 database.

CoRR, 2019

I4U Submission to NIST SRE 2018: Leveraging from a Decade of Shared Experiences.

CoRR, 2019

Introduction to Voice Presentation Attack Detection and Recent Advances.

CoRR, 2019

Speaker Augmentation and Bandwidth Extension for Deep Speaker Embedding.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Unleashing the Unused Potential of i-Vectors Enabled by GPU Acceleration.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

ASVspoof 2019: Future Horizons in Spoofed and Fake Audio Detection.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

The NEC-TT 2018 Speaker Verification System.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

I4U Submission to NIST SRE 2018: Leveraging from a Decade of Shared Experiences.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

The CORAL+ Algorithm for Unsupervised Domain Adaptation of PLDA.

Takafumi Koshinaka

Proceedings of the IEEE International Conference on Acoustics, 2019

2018

Generalizing I-Vector Estimation for Rapid Speaker Recognition.

IEEE ACM Trans. Audio Speech Lang. Process., 2018

Generalized Variability Model for Speaker Verification.

IEEE Signal Process. Lett., 2018

Attention Mechanism in Speaker Recognition: What Does it Learn in Deep Speaker Embedding?

Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

t-DCF: a Detection Cost Function for the Tandem Assessment of Spoofing Countermeasures and Automatic Speaker Verification.

Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018

ASVspoof 2017 Version 2.0: meta-data analysis and baseline enhancements.

Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018

Co-whitening of I-vectors for Short and Long Duration Speaker Verification.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Integrated Presentation Attack Detection and Automatic Speaker Verification: Common Features and Gaussian Back-end Fusion.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

On the Importance of Analytic Phase of Speech Signals in Spoken Language Recognition.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Speaker-Phonetic Vector Estimation for Short Duration Speaker Verification.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Maximal Figure-of-Merit Embedding for Multi-Label Audio Classification.

Ivan Kukanov

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Many-to-Many Voice Conversion based on Bottleneck Features with Variational Autoencoder for Non-parallel Training Data.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

2017

Direct Optimization of the Detection Cost for I-Vector-Based Spoken Language Recognition.

Aleksandr Sizov

IEEE ACM Trans. Audio Speech Lang. Process., 2017

Incorporating Local Acoustic Variability Information into Short Duration Speaker Verification.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Gain Compensation for Fast i-Vector Extraction Over Short Duration.

Dennis Alexander Lehmann Thomsen

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

The I4U Mega Fusion and Collaboration for NIST Speaker Recognition Evaluation 2016.

Achintya Kumar Sarkar

Fahimeh Bahmaninezhad

Sergey Isadskiy

Christian Rathgeb

Christoph Busch

Georgios Tzimiropoulos

Pierre-Michel Bousquet

Dennis Alexander Lehmann Thomsen

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

The ASVspoof 2017 Challenge: Assessing the Limits of Replay Spoofing Attack Detection.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

RedDots replayed: A new replay spoofing attack corpus for text-dependent speaker verification research.

Rosa González Hautamäki

Achintya Kumar Sarkar

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Adaptation of PLDA for multi-source text-independent speaker verification.

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

I2R-NUS submission to oriental language recognition AP16-OL7 challenge.

Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

2016

Exploration of Local Variability in Text-Independent Speaker Verification.

J. Signal Process. Syst., 2016

Total Variability Modeling Using Source-Specific Priors.

IEEE ACM Trans. Audio Speech Lang. Process., 2016

Fantastic 4 system for NIST 2015 Language Recognition Evaluation.

CoRR, 2016

Rapid Computation of I-vector.

Proceedings of the Odyssey 2016: The Speaker and Language Recognition Workshop, 2016

Deep Language: a comprehensive deep learning approach to end-to-end language recognition.

Trung Ngo Trong

Proceedings of the Odyssey 2016: The Speaker and Language Recognition Workshop, 2016

I2R Submission to the 2015 NIST Language Recognition I-vector Challenge.

Proceedings of the Odyssey 2016: The Speaker and Language Recognition Workshop, 2016

Discriminating Languages in a Probabilistic Latent Subspace.

Aleksandr Sizov

Proceedings of the Odyssey 2016: The Speaker and Language Recognition Workshop, 2016

Neural networks based channel compensation for i-vector speaker verification.

Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

Joint Speaker and Lexical Modeling for Short-Term Characterization of Speaker.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Twin Model G-PLDA for Duration Mismatch Compensation in Text-Independent Speaker Verification.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

The 2015 NIST Language Recognition Evaluation: The Shared View of I2R, Fantastic4 and SingaMS.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

An extensible speaker identification sidekit in Python.

Sylvain Meignier

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Content-aware local variability vector for speaker verification with short utterance.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015

Quasi-Factorial Prior for i-vector Extraction.

IEEE Signal Process. Lett., 2015

Relevance factor of maximum a posteriori adaptation for GMM-NAP-SVM in speaker and language recognition.

Comput. Speech Lang., 2015

Sparse coding of total variability matrix.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

The reddots platform for mobile crowd-sourcing of speech data.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

The reddots data collection for speaker recognition.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Phone-centric local variability vector for text-constrained speaker verification.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

A new study of GMM-SVM system for text-dependent speaker recognition.

Hanwu Sun

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Source-specific informative prior for i-vector extraction.

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Normalization of total variability matrix for i-vector/PLDA speaker verification.

Wei Rao

Man-Wai Mak

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Channel adaptation of plda for text-independent speaker verification.

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

2014

Text-dependent speaker verification: Classifiers, databases and RSR2015.

Speech Commun., 2014

PLDA in the I-Supervector Space for Text-Independent Speaker Verification.

Ye Jiang

Longbiao Wang

EURASIP J. Audio Speech Music. Process., 2014

Unifying Probabilistic Linear Discriminant Analysis Variants in Biometric Authentication.

Aleksandr Sizov

Proceedings of the Structural, Syntactic, and Statistical Pattern Recognition, 2014

A Comparison of Categorical Attribute Data Clustering Methods.

Proceedings of the Structural, Syntactic, and Statistical Pattern Recognition, 2014

Text-Dependent Speaker Verification System in VHF Communication Channel.

Proceedings of the Odyssey 2014: The Speaker and Language Recognition Workshop, 2014

Local Variability Modeling for Text-Independent Speaker Verification.

Proceedings of the Odyssey 2014: The Speaker and Language Recognition Workshop, 2014

Single-sided approach to discriminative PLDA training for text-independent speaker verification without using expanded i-vector.

Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Local variability vector for text-independent speaker verification.

Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Extended RSR2015 for text-dependent speaker verification over VHF channel.

Pablo Luis Sordo Martinez

Trung Hieu Nguyen

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Imposture classification for text-dependent speaker verification.

Proceedings of the IEEE International Conference on Acoustics, 2014

Modelling the alternative hypothesis for text-dependent speaker verification.

Proceedings of the IEEE International Conference on Acoustics, 2014

Minimum divergence estimation of speaker prior in multi-session PLDA scoring.

Proceedings of the IEEE International Conference on Acoustics, 2014

2013

Sparse Classifier Fusion for Speaker Verification.

IEEE Trans. Speech Audio Process., 2013

Spoken Language Recognition: From Fundamentals to Practice.

Proc. IEEE, 2013

I4U submission to NIST SRE 2012: a large-scale collaborative effort for noise-robust speaker verification.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Vulnerability evaluation of speaker verification under voice conversion spoofing: the effect of text constraints.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Multi-session PLDA scoring of i-vector for partially open-set speaker detection.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

ALIZE 3.0 - open source toolkit for state-of-the-art speaker recognition.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Automatic regularization of cross-entropy cost for speaker recognition fusion.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

A study on GMM-SVM with adaptive relevance factor and its comparison with i-vector and JFA for speaker recognition.

Proceedings of the IEEE International Conference on Acoustics, 2013

Anti-model KL-SVM-NAP system for NIST SRE 2012 evaluation.

Hanwu Sun

Proceedings of the IEEE International Conference on Acoustics, 2013

Phonetically-constrained PLDA modeling for text-dependent speaker verification with multiple short utterances.

Proceedings of the IEEE International Conference on Acoustics, 2013

2012

Low-Variance Multitaper MFCC Features: A Case Study in Robust Speaker Verification.

Maria Hansson-Sandsten

IEEE Trans. Speech Audio Process., 2012

Bhattacharyya-based GMM-SVM system with adaptive relevance factor for pair language recognition.

Proceedings of the Odyssey 2012: The Speaker and Language Recognition Workshop, 2012

Variational Bayes logistic regression as regularized fusion for NIST SRE 2010.

Proceedings of the Odyssey 2012: The Speaker and Language Recognition Workshop, 2012

Effect of Relevance Factor of Maximum a posteriori Adaptation for GMM-SVM in Speaker and Language Recognition.

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

RSR2015: Database for Text-Dependent Speaker Verification using Multiple Pass-Phrases.

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

PLDA Modeling in I-Vector and Supervector Space for Speaker Verification.

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

I-vectors in the context of phonetically-constrained short utterances for speaker verification.

Pierre-Michel Bousquet

Driss Matrouf

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Vulnerability of speaker verification systems against voice conversion spoofing attacks: The case of telephone speech.

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

2011

Using Discrete Probabilities With Bhattacharyya Measure for SVM-Based Speaker Verification.

IEEE ACM Trans. Audio Speech Lang. Process., 2011

Study on the Relevance Factor of Maximum a Posteriori with GMM for Language Recognition.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Spoken Language Recognition in the Latent Topic Simplex.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Joint Application of Speech and Speaker Recognition for Automation and Security in Smart Home.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Regularized Logistic Regression Fusion for Speaker Verification.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Speech enhancement with masking properties in eigen-domain for colored noise.

Cheung-Chi Leung

Proceedings of the IEEE International Conference on Acoustics, 2011

Factored covariance modeling for text-independent speaker verification.

Proceedings of the IEEE International Conference on Acoustics, 2011

Classifier subset selection and fusion for speaker verification.

Proceedings of the IEEE International Conference on Acoustics, 2011

2010

GMM-SVM Kernel With a Bhattacharyya-Based Distance for Speaker Recognition.

IEEE Trans. Speech Audio Process., 2010

Factor analysis based spatial correlation modeling for speaker verification.

Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

MAP estimation of subspace transform for speaker recognition.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

A hybrid modeling strategy for GMM-SVM speaker recognition with adaptive relevance factor.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

The estimation and kernel metric of spectral correlation for text-independent speaker verification.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Incorporating MAP estimation and covariance transform for SVM based speaker recognition.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Approaching human listener accuracy with modern speaker verification.

Mohaddeseh Nosratighods

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Adaptive score fusion using Weighted Logistic Linear Regression for spoken language recognition.

Khe Chai Sim

Proceedings of the IEEE International Conference on Acoustics, 2010

A GMM-supervector approach to language recognition with adaptive relevance factor.

Proceedings of the 18th European Signal Processing Conference, 2010

Discrete expected likelihood kernel for SVM-based speaker verification.

Proceedings of the 18th European Signal Processing Conference, 2010

2009

An SVM Kernel With GMM-Supervector Based on the Bhattacharyya Distance for Speaker Recognition.

IEEE Signal Process. Lett., 2009

Target-aware language models for spoken language recognition.

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

A GMM supervector Kernel with the Bhattacharyya distance for SVM based speaker recognition.

Proceedings of the IEEE International Conference on Acoustics, 2009

The I4U system in NIST 2008 speaker recognition evaluation.

Proceedings of the IEEE International Conference on Acoustics, 2009

2008

NIST 2007 Language Recognition Evaluation: From the Perspective of IIR.

Proceedings of the 22nd Pacific Asia Conference on Language, Information and Computation, 2008

Dimension reduction of the modulation spectrogram for speaker verification.

Proceedings of the Odyssey 2008: The Speaker and Language Recognition Workshop, 2008

Self-Organized Clustering for Feature Mapping in Language Recognition.

Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008

Characterizing speech utterances for speaker verification with sequence kernel SVM.

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Spoken Language recognition using support vector machines with generative front-end.

Changhuai You

Proceedings of the IEEE International Conference on Acoustics, 2008

2007

A GMM-based probabilistic sequence kernel for speaker verification.

Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

On Delayless Architecture for the Normalized Subband Adaptive Filter.

Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007

2006

Inherent Decorrelating and Least Perturbation Properties of the Normalized Subband Adaptive Filter.

Woon S. Gan

IEEE Trans. Signal Process., 2006

On the Subband Orthogonality of Cosine-Modulated Filter Banks.

IEEE Trans. Circuits Syst. II Express Briefs, 2006

Fusion of Acoustic and Tokenization Features for Speaker Recognition.

Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006

The IIR Submission to CSLP 2006 Speaker Recognition Evaluation.

Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006

2005

Adaptive filtering using constrained subband updates.

Woon S. Gan

Proceedings of the International Symposium on Circuits and Systems (ISCAS 2005), 2005

2004

Improving convergence of the NLMS algorithm using constrained subband updates.

IEEE Signal Process. Lett., 2004

Subband adaptive filtering using a multiple-constraint optimization criterion.
