Ville Hautamäki

CoRR, September, 2025

Refining Neural Activation Patterns for Layer-Level Concept Discovery in Neural Network-Based Receivers.

CoRR, May, 2025

Generalizable speech deepfake detection via meta-learned LoRA.

Janne Laakkonen

CoRR, February, 2025

Improving Numerical Stability of Normalized Mutual Information Estimator on High Dimensions.

Marko Tuononen

IEEE Signal Process. Lett., 2025

Zero-shot World Models via Search in Memory.

Federico Malato

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

Continuous Learning for Children's ASR: Overcoming Catastrophic Forgetting with Elastic Weight Consolidation and Synaptic Intelligence.

Edem Ahadzi

Vishwanath Pratap Singh

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Interpreting Deep Neural Network-Based Receiver Under Varying Signal-To-Noise Ratios.

Marko Tuononen

Dani Korpi

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

On the Importance of Representation in Imitating Human-Like Gameplay.

Ville Tanskanen

Arto Klami

Proceedings of the IEEE Conference on Games, 2025

Mixture of Low-Rank Adapter Experts in Generalizable Audio Deepfake Detection.

Janne Laakkonen

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2025

2024

Natural Language as Policies: Reasoning for Coordinate-Level Embodied Control with LLMs.

CoRR, 2024

Meta-Learning Approaches For Improving Detection of Unseen Speech Deepfakes.

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

ROAR: Reinforcing Original to Augmented Data Ratio Dynamics for Wav2vec2.0 Based ASR.

Vishwanath Pratap Singh

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Zero-Shot Imitation Policy Via Search In Demonstration Dataset.

Proceedings of the IEEE International Conference on Acoustics, 2024

Gradient Weighting for Speaker Verification in Extremely Low Signal-to-Noise Ratio.

Proceedings of the IEEE International Conference on Acoustics, 2024

Online Adaptation for Enhancing Imitation Learning Policies.

Federico Malato

Proceedings of the IEEE Conference on Games, 2024

2023

GAN-Aimbots: Using Machine Learning for Cheating in First Person Shooters.

IEEE Trans. Games, December, 2023

Self-Supervised Training of Speaker Encoder With Multi-Modal Diverse Positive Pairs.

IEEE ACM Trans. Audio Speech Lang. Process., 2023

Behavioral Cloning via Search in Embedded Demonstration Dataset.

CoRR, 2023

Towards Solving Fuzzy Tasks with Human Feedback: A Retrospective of the MineRL BASALT 2022 Competition.

CoRR, 2023

2022

Optimizing Tandem Speaker Verification and Anti-Spoofing Systems.

IEEE ACM Trans. Audio Speech Lang. Process., 2022

Behavioral Cloning via Search in Video PreTraining Latent Space.

CoRR, 2022

The Transitive Information Theory and its Application to Deep Generative Models.

CoRR, 2022

Improving Behavioural Cloning with Human-Driven Dynamic Dataset Augmentation.

Federico Malato

Joona Jehkonen

CoRR, 2022

Self-Supervised Speaker Recognition with Loss-Gated Learning.

Proceedings of the IEEE International Conference on Acoustics, 2022

2021

Towards Solving Fuzzy Tasks with Human Feedback: A Retrospective of the MineRL BASALT 2022 Competition.

Proceedings of the NeurIPS 2022 Competition Track, 2021

Multi-Task Learning With Attention for End-to-End Autonomous Driving.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2021

Distilling Reinforcement Learning Tricks for Video Games.

Proceedings of the 2021 IEEE Conference on Games (CoG), 2021

PL-EESR: Perceptual Loss Based End-to-End Robust Speaker Representation Extraction.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

VoxCeleb Enrichment for Age and Gender Recognition.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020

Maximal Figure-of-Merit Framework to Detect Multi-Label Phonetic Features for Spoken Language Recognition.

Valerio Mario Salerno

Kong Aik Lee

IEEE ACM Trans. Audio Speech Lang. Process., 2020

Semisupervised Generative Autoencoder for Single-Cell Data.

J. Comput. Biol., 2020

Policy Supervectors: General Characterization of Agents by their Behaviour.

CoRR, 2020

Transferring Monolingual Model to Low-Resource Language: The Case of Tigrinya.

Abrhalei Tela

Abraham Woubie

CoRR, 2020

An Initial Investigation on Optimizing Tandem Speaker Verification and Countermeasure Systems Using Reinforcement Learning.

Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020

From Video Game to Real Robot: The Transfer Between Action Spaces.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Action Space Shaping in Deep Reinforcement Learning.

Christian Scheller

Proceedings of the IEEE Conference on Games, 2020

Benchmarking End-to-End Behavioural Cloning on Video Games.

Joonas Pussinen

Proceedings of the IEEE Conference on Games, 2020

Cost Sensitive Optimization of Deepfake Detector.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

2019

Do Autonomous Agents Benefit from Hearing?

CoRR, 2019

I4U Submission to NIST SRE 2018: Leveraging from a Decade of Shared Experiences.

CoRR, 2019

Playing Minecraft with Behavioural Cloning.

Janne Karttunen

Proceedings of the NeurIPS 2019 Competition and Demonstration Track, 2019

Towards Debugging Deep Neural Networks by Generating Speech Utterances.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

I4U Submission to NIST SRE 2018: Leveraging from a Decade of Shared Experiences.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Who Do I Sound Like? Showcasing Speaker Recognition Technology by YouTube Voice Search.

Proceedings of the IEEE International Conference on Acoustics, 2019

ToriLLE: Learning Environment for Hand-to-Hand Combat.

Proceedings of the IEEE Conference on Games, 2019

2018

Staircase Network: structural language identification via hierarchical attentive units.

Kristiina Jokinen

Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018

Perceptual Evaluation of the Effectiveness of Voice Disguise by Age Modification.

Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018

Enabling Spoken Dialogue Systems for Low-Resourced Languages - End-to-End Dialect Recognition for North Sami.

Kristiina Jokinen

Proceedings of the 9th International Workshop on Spoken Dialogue System Technology, 2018

Maximal Figure-of-Merit Embedding for Multi-Label Audio Classification.

Kong-Aik Lee

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017

Acoustical and perceptual study of voice disguise by age modification in speaker verification.

Md. Sahidullah

Dennis Alexander Lehmann Thomsen

Speech Commun., 2017

The I4U Mega Fusion and Collaboration for NIST Speaker Recognition Evaluation 2016.

Achintya Kumar Sarkar

Fahimeh Bahmaninezhad

Sergey Isadskiy

Christian Rathgeb

Christoph Busch

Georgios Tzimiropoulos

Pierre-Michel Bousquet

Jean-François Bonastre

Eliathamby Ambikairajah

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

RedDots replayed: A new replay spoofing attack corpus for text-dependent speaker verification research.

Dennis Alexander Lehmann Thomsen

Achintya Kumar Sarkar

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Effects of gender information in text-independent and text-dependent speaker verification.

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

2016

i-Vector Modeling of Speech Attributes for Automatic Foreign Accent Recognition.

Chin-Hui Lee

IEEE ACM Trans. Audio Speech Lang. Process., 2016

Fantastic 4 system for NIST 2015 Language Recognition Evaluation.

CoRR, 2016

Deep learning with maximal figure-of-merit cost to advance multi-label speech attribute detection.

Kehuang Li

Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016

Deep Language: a comprehensive deep learning approach to end-to-end language recognition.

Kong-Aik Lee

Proceedings of the Odyssey 2016: The Speaker and Language Recognition Workshop, 2016

Incorporating uncertainty as a Quality Measure in I-Vector Based Language Recognition.

Proceedings of the Odyssey 2016: The Speaker and Language Recognition Workshop, 2016

Age-Related Voice Disguise and its Impact on Speaker Verification Accuracy.

Md. Sahidullah

Proceedings of the Odyssey 2016: The Speaker and Language Recognition Workshop, 2016

Out-of-Set i-Vector Selection for Open-set Language Identification.

Proceedings of the Odyssey 2016: The Speaker and Language Recognition Workshop, 2016

Robust Speaker Recognition with Combined Use of Acoustic and Throat Microphone Speech.

Md. Sahidullah

Dennis Alexander Lehmann Thomsen

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

The 2015 NIST Language Recognition Evaluation: The Shared View of I2R, Fantastic4 and SingaMS.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Utterance Verification for Text-Dependent Speaker Recognition: A Comparative Assessment Using the RedDots Corpus.

Achintya Kumar Sarkar

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Variation in Spoken North Sami Language.

Kristiina Jokinen

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

2015

Automatic versus human speaker verification: The case of voice mimicry.

Anne-Maria Laukkanen

Speech Commun., 2015

Factors affecting i-vector based foreign accent recognition: A case study in spoken Finnish.

Speech Commun., 2015

Boosting universal speech attributes classification with deep neural network for foreign accent characterization.

Valerio Mario Salerno

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

2014

From single to multiple enrollment i-vectors: Practical PLDA scoring variants for speaker verification.

Digit. Signal Process., 2014

A Comparison of Categorical Attribute Data Clustering Methods.

Proceedings of the Structural, Syntactic, and Statistical Pattern Recognition, 2014

Comparison of human listeners and speaker verification systems using voice mimicry data.

Anne-Maria Laukkanen

Proceedings of the Odyssey 2014: The Speaker and Language Recognition Workshop, 2014

Dialect levelling in Finnish: a universal speech attribute approach.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

An i-vector based descriptor for alphabetical gesture recognition.

Proceedings of the IEEE International Conference on Acoustics, 2014

Introducing attribute features to foreign accent recognition.

Chin-Hui Lee

Proceedings of the IEEE International Conference on Acoustics, 2014

2013

Sparse Classifier Fusion for Speaker Verification.

IEEE Trans. Speech Audio Process., 2013

I4U submission to NIST SRE 2012: a large-scale collaborative effort for noise-robust speaker verification.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Effect of multicondition training on i-vector PLDA configurations for speaker recognition.

Padmanabhan Rajan

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

A blind segmentation approach to acoustic event detection based on i-vector.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Automatic regularization of cross-entropy cost for speaker recognition fusion.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

I-vectors meet imitators: on vulnerability of speaker verification systems against voice mimicry.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Merging human and automatic system decisions to improve speaker recognition performance.

Padmanabhan Rajan

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Minimax i-vector extractor for short duration speaker verification.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Foreign accent detection from spoken Finnish using i-vectors.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

2012

Review of a concise introduction to data compression by David Salomon.

SIGACT News, 2012

Random swap EM algorithm for Gaussian mixture models.

Pattern Recognit. Lett., 2012

Variational Bayes logistic regression as regularized fusion for NIST SRE 2010.

Proceedings of the Odyssey 2012: The Speaker and Language Recognition Workshop, 2012

2011

Spoken Language Recognition in the Latent Topic Simplex.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Regularized Logistic Regression Fusion for Speaker Verification.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

RSEM: An Accelerated Algorithm on Repeated EM.

Qinpei Zhao

Proceedings of the Sixth International Conference on Image and Graphics, 2011

Classifier subset selection and fusion for speaker verification.

Proceedings of the IEEE International Conference on Acoustics, 2011

2010

Ad-hoc Georeferencing of Web-pages using Street-name Prefix Trees.

Andrei Tabarcea

Proceedings of the WEBIST 2010, 2010

Towards long-range prosodic attribute modeling for language recognition.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Approaching human listener accuracy with modern speaker verification.

Mohaddeseh Nosratighods

Kong-Aik Lee

Bin Ma

Haizhou Li

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Location-based search engine for multimedia phones.

Proceedings of the 2010 IEEE International Conference on Multimedia and Expo, 2010

2009

Comparative evaluation of maximum a posteriori vector quantization and Gaussian mixture models in speaker verification.

Pattern Recognit. Lett., 2009

Random swap EM algorithm for finite mixture models in image segmentation.

Proceedings of the International Conference on Image Processing, 2009

Comparing maximum a posteriori vector quantization and Gaussian mixture models in speaker verification.

Proceedings of the IEEE International Conference on Acoustics, 2009

Developing Speaker Recognition System: From Prototype to Practical Application.

Proceedings of the Forensics in Telecommunications, 2009

2008

Maximum a Posteriori Adaptation of the Centroid Model for Speaker Verification.

IEEE Signal Process. Lett., 2008

Text-independent speaker recognition using graph matching.

Pattern Recognit. Lett., 2008

Time-series clustering by approximate prototypes.

Pekka Nykänen

Proceedings of the 19th International Conference on Pattern Recognition (ICPR 2008), 2008

Probabilistic clustering by random swap algorithm.

Olli Virmajoki

Proceedings of the 19th International Conference on Pattern Recognition (ICPR 2008), 2008

Knee Point Detection in BIC for Detecting the Number of Clusters.

Qinpei Zhao

Proceedings of the Advanced Concepts for Intelligent Vision Systems, 2008

2006

Fast Agglomerative Clustering Using a k-Nearest Neighbor Graph.

Olli Virmajoki

IEEE Trans. Pattern Anal. Mach. Intell., 2006

Speaker, Vocabulary and Context Independent Word Spotting System for Continuous Speech.

Radu Timofte

Proceedings of the 5th International Symposium on Chinese Spoken Language Processing, 2006

On the Use of Long-Term Average Spectrum in Automatic Speaker Recognition.

Proceedings of the 5th International Symposium on Chinese Spoken Language Processing, 2006

2005

Accuracy of MFCC-Based Speaker Recognition in Series 60 Device.

EURASIP J. Adv. Signal Process., 2005

Improving K-Means by Outlier Removal.

Svetlana Cherednichenko

Ismo Kärkkäinen

Proceedings of the Image Analysis, 14th Scandinavian Conference, 2005

2004

Outlier Detection Using k-Nearest Neighbour Graph.

Ismo Kärkkäinen

Proceedings of the 17th International Conference on Pattern Recognition, 2004

2003

On the fusion of dissimilarity-based classifiers for speaker identification.

Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Fast PNN-based Clustering Using K-nearest Neighbor Graph.

Olli Virmajoki