Themos Stafylakis

Orcid: 0000-0002-9227-3588

According to our database, Themos Stafylakis authored at least 76 papers between 2007 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Do End-to-End Neural Diarization Attractors Need to Encode Speaker Characteristic Information?
CoRR, 2024

2023
KAN-AV dataset for audio-visual face and speech analysis in the wild.
Image Vis. Comput., December, 2023

DiaPer: End-to-End Neural Diarization with Perceiver-Based Attractors.
CoRR, 2023

Parameter-Efficient Transfer Learning of Pre-Trained Transformer Models for Speaker Verification Using Adapters.
Proceedings of the IEEE International Conference on Acoustics, 2023

Speech-Based Emotion Recognition with Self-Supervised Models Using Attentive Channel-Wise Correlations and Label Smoothing.
Proceedings of the IEEE International Conference on Acoustics, 2023

A Simple Baseline for Knowledge-Based Visual Question Answering.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

2022
Extracting Speaker and Emotion Information from Self-Supervised Speech Models via Channel-Wise Correlations.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

An Attention-Based Backend Allowing Efficient Fine-Tuning of Transformer Models for Speaker Verification.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

On the Use of Semantically-Aligned Speech Representations for Spoken Language Understanding.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Analyzing Speaker Verification Embedding Extractors and Back-Ends Under Language and Channel Mismatch.
Proceedings of the Odyssey 2022: The Speaker and Language Recognition Workshop, 28 June, 2022

Development of ABC Systems for the 2021 Edition of NIST Speaker Recognition Evaluation.
Proceedings of the Odyssey 2022: The Speaker and Language Recognition Workshop, 28 June, 2022

Training speaker embedding extractors using multi-speaker audio with unknown speaker boundaries.
Proceedings of the Interspeech 2022, 2022

Probabilistic Spherical Discriminant Analysis: An Alternative to PLDA for length-normalized embeddings.
Proceedings of the Interspeech 2022, 2022

2021
Speaker Embeddings by Modeling Channel-Wise Correlations.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

2020
Probabilistic Embeddings for Speaker Diarization.
Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020

End-to-End Architectures for ASR-Free Spoken Language Understanding.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Seeing wake words: Audio-visual Keyword Spotting.
Proceedings of the 31st British Machine Vision Conference 2020, 2020

2019
Speaker Recognition With Random Digit Strings Using Uncertainty Normalized HMM-Based i-Vectors.
IEEE ACM Trans. Audio Speech Lang. Process., 2019

Detecting Spoofing Attacks Using VGG and SincNet: BUT-Omilia Submission to ASVspoof 2019 Challenge.
Proceedings of the Interspeech 2019, 2019

Self-Supervised Speaker Embeddings.
Proceedings of the Interspeech 2019, 2019

Privacy-Preserving Speaker Recognition with Cohort Score Normalisation.
Proceedings of the Interspeech 2019, 2019

How to Improve Your Speaker Embeddings Extractor in Generic Toolkits.
Proceedings of the IEEE International Conference on Acoustics, 2019

Speaker Verification Using End-to-end Adversarial Language Adaptation.
Proceedings of the IEEE International Conference on Acoustics, 2019

2018
Pushing the boundaries of audiovisual word recognition using Residual Networks and LSTMs.
Comput. Vis. Image Underst., 2018

Audio-Visual Speech Recognition with a Hybrid CTC/Attention Architecture.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

DeepMine Speech Processing Database: Text-Dependent and Independent Speaker Verification and Speech Recognition in Persian and English.
Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018

Gaussian meta-embeddings for efficient scoring of a heavy-tailed PLDA model.
Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018

Deep Word Embeddings for Visual Speech Recognition.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

End-to-End Audiovisual Speech Recognition.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Zero-Shot Keyword Spotting for Visual Speech Recognition In-the-wild.
Proceedings of the Computer Vision - ECCV 2018, 2018

2017
Combining Residual Networks with LSTMs for Lipreading.
Proceedings of the Interspeech 2017, 2017

2016
Speaker and Channel Factors in Text-Dependent Speaker Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2016

Text-Dependent Speaker Recognition With Random Digit Strings.
IEEE ACM Trans. Audio Speech Lang. Process., 2016

Compensation for phonetic nuisance variability in speaker recognition using DNNs.
Proceedings of the Odyssey 2016: The Speaker and Language Recognition Workshop, 2016

Uncertainty Modeling Without Subspace Methods For Text-Dependent Speaker Recognition.
Proceedings of the Odyssey 2016: The Speaker and Language Recognition Workshop, 2016

Deep Neural Network based Text-Dependent Speaker Verification: Preliminary Results.
Proceedings of the Odyssey 2016: The Speaker and Language Recognition Workshop, 2016

Spoofing Detection on the ASVspoof2015 Challenge Corpus Employing Deep Neural Networks.
Proceedings of the Odyssey 2016: The Speaker and Language Recognition Workshop, 2016

Towards PLDA-RBM based speaker recognition in mobile environment: Designing stacked/deep PLDA-RBM systems.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015
JFA for speaker recognition with random digit strings.
Proceedings of the INTERSPEECH 2015, 2015

An i-vector backend for speaker verification.
Proceedings of the INTERSPEECH 2015, 2015

Combining amplitude and phase-based features for speaker verification with short duration utterances.
Proceedings of the INTERSPEECH 2015, 2015

Development of CRIM system for the automatic speaker verification spoofing and countermeasures challenge 2015.
Proceedings of the INTERSPEECH 2015, 2015

JFA modeling with left-to-right structure and a new backend for text-dependent speaker recognition.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

2014
A Study of the Cosine Distance-Based Mean Shift for Telephone Speech Diarization.
IEEE ACM Trans. Audio Speech Lang. Process., 2014

Deep Neural Networks for extracting Baum-Welch statistics for Speaker Recognition.
Proceedings of the Odyssey 2014: The Speaker and Language Recognition Workshop, 2014

Supervised/Unsupervised Voice Activity Detectors for Text-dependent Speaker Recognition on the RSR2015 Corpus.
Proceedings of the Odyssey 2014: The Speaker and Language Recognition Workshop, 2014

Joint Factor Analysis for Text-Dependent Speaker Verification.
Proceedings of the Odyssey 2014: The Speaker and Language Recognition Workshop, 2014

In-domain versus out-of-domain training for text-dependent JFA.
Proceedings of the INTERSPEECH 2014, 2014

JFA-based front ends for speaker recognition.
Proceedings of the IEEE International Conference on Acoustics, 2014

I-vector-based speaker adaptation of deep neural networks for French broadcast audio transcription.
Proceedings of the IEEE International Conference on Acoustics, 2014

Unscented transform for ivector-based noisy speaker recognition.
Proceedings of the IEEE International Conference on Acoustics, 2014

2013
Text-dependent speaker recognition using PLDA with uncertainty propagation.
Proceedings of the INTERSPEECH 2013, 2013

Compensation for inter-frame correlations in speaker diarization and recognition.
Proceedings of the IEEE International Conference on Acoustics, 2013

Efficient iterative mean shift based cosine dissimilarity for multi-recording speaker clustering.
Proceedings of the IEEE International Conference on Acoustics, 2013

PLDA for speaker verification with utterances of arbitrary duration.
Proceedings of the IEEE International Conference on Acoustics, 2013

2012
Preliminary investigation of Boltzmann machine classifiers for speaker recognition.
Proceedings of the Odyssey 2012: The Speaker and Language Recognition Workshop, 2012

Mean shift algorithm for exponential families with applications to speaker clustering.
Proceedings of the Odyssey 2012: The Speaker and Language Recognition Workshop, 2012

A mean shift algorithm for manifolds of exponential families.
Proceedings of the 11th International Conference on Information Science, 2012

PLDA using Gaussian Restricted Boltzmann Machines with application to Speaker Verification.
Proceedings of the INTERSPEECH 2012, 2012

Music tempo estimation and beat tracking by applying source separation and metrical relations.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

2011
Developing a Scoring Algorithm for Automatic Pronunciation Assessment of Modern Greek.
J. Quant. Linguistics, 2011

Enhancing Handwritten Word Segmentation by Employing Local Spatial Features.
Proceedings of the 2011 International Conference on Document Analysis and Recognition, 2011

Closed-form expressions vs. BIC: A comparison for speaker clustering.
Proceedings of the IEEE International Conference on Acoustics, 2011

2010
Handwritten document image segmentation into text lines and words.
Pattern Recognit., 2010

The Segmental Bayesian Information Criterion and Its Applications to Speaker Diarization.
IEEE J. Sel. Top. Signal Process., 2010

Speaker clustering via the mean shift algorithm.
Proceedings of the Odyssey 2010: The Speaker and Language Recognition Workshop, Brno, Czech Republic, June 28, 2010

Improvements to the equal-parameter BIC for speaker diarization.
Proceedings of the INTERSPEECH 2010, 2010

A new penalty term for the BIC with respect to speaker diarization.
Proceedings of the IEEE International Conference on Acoustics, 2010

2009
Redefining the Bayesian information criterion for speaker diarisation.
Proceedings of the INTERSPEECH 2009, 2009

2008
Robust text-line and word segmentation for handwritten documents images.
Proceedings of the IEEE International Conference on Acoustics, 2008

PANOPTIS: A System for Intelligent Monitoring of the Hellenic Broadcast Sector.
Proceedings of the 19th International Workshop on Database and Expert Systems Applications (DEXA 2008), 2008

2007
A Parametric Spectral-Based Method for Verification of Text in Videos.
Proceedings of the 9th International Conference on Document Analysis and Recognition (ICDAR 2007), 2007

Efficient combination of parametric spaces, models and metrics for speaker diarization.
Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

