Mickaël Rouvier

CoRR, May, 2026

Evaluation of Automatic Speech Recognition Using Generative Large Language Models.

CoRR, April, 2026

Enhancing Multi-Corpus Training in SSL-Based Anti-Spoofing Models: Domain-Invariant Feature Extraction.

Proceedings of the 14th International Workshop on Biometrics and Forensics, 2026

Assessing the Reliability of Deep Learning-Based Voice Comparison Models for Critical Decisions.

Mickaël Rouvier

Proceedings of the 14th International Workshop on Biometrics and Forensics, 2026

2025

Text-Speech Language Models with Improved Cross-Modal Transfer by Aligning Abstraction Levels.

CoRR, March, 2025

An Empirical Analysis of Discrete Unit Representations in Speech Language Modeling Pre-training.

Mickaël Rouvier

Proceedings of the Text, Speech, and Dialogue - 28th International Conference, 2025

Étude comparative de réponses humaines et de grands modèles de langue à des QCM en pharmacie.

Proceedings of the Actes des 32ème Conférence sur le Traitement Automatique des Langues Naturelles, 2025

Comparative Analysis of Human and Large Language Model Performance in Pharmacology Multiple-Choice Questions.

Proceedings of the 15th International Conference on Recent Advances in Natural Language Processing, 2025

A Benchmark of French ASR Systems Based on Error Severity.

Proceedings of the 31st International Conference on Computational Linguistics, 2025

2024

Open-Source Conversational AI with SpeechBrain 1.0.

J. Mach. Learn. Res., 2024

LeBenchmark 2.0: A standardized, replicable and enhanced framework for self-supervised representations of French speech.

Comput. Speech Lang., 2024

Open-Source Conversational AI with SpeechBrain 1.0.

CoRR, 2024

Asymmetric and trial-dependent modeling: the contribution of LIA to SdSV Challenge Task 2.

CoRR, 2024

Probing the Information Encoded in Neural-based Acoustic Models of Automatic Speech Recognition Systems.

Quentin Raymondaud

CoRR, 2024

A Paradigm for Interpreting Metrics and Measuring Error Severity in Automatic Speech Recognition.

Proceedings of the Text, Speech, and Dialogue - 27th International Conference, 2024

Un paradigme pour l'interprétation des métriques et pour mesurer la gravité des erreurs de reconnaissance automatique de la parole.

Proceedings of the Actes des 35èmes Journées d'Études sur la Parole, 2024

Tâches et systèmes de sélection automatique de réponses à des QCM dans le domaine médical : Présentation de la campagne DEFT 2024.

Proceedings of the Actes du Défi Fouille de Textes@TALN 2024, 2024

MSP-Podcast SER Challenge 2024: L'antenne du Ventoux Multimodal Self-Supervised Learning for Speech Emotion Recognition.

Jarod Duret

Yannick Estève

Proceedings of the Odyssey 2024: The Speaker and Language Recognition Workshop, 2024

Zero-Shot End-To-End Spoken Question Answering In Medical Domain.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Synvox2: Towards A Privacy-Friendly Voxceleb2 Dataset.

Proceedings of the IEEE International Conference on Acoustics, 2024

A Comprehensive Analysis of Tokenization and Self-Supervised Learning in End-to-End Automatic Speech Recognition Applied on French Language.

Proceedings of the 32nd European Signal Processing Conference, 2024

RoboVox: A Single/Multi-channel Far-field Speaker Recognition Benchmark for a Mobile Robot.

Mohammad MohammadAmini

Mickaël Rouvier

Romain Serizel

Théophile Gonos

Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

A Zero-shot and Few-shot Study of Instruction-Finetuned Large Language Models Applied to Clinical and Biomedical Tasks.

Pacôme Constant dit Beaufils

Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

DrBenchmark: A Large Language Understanding Evaluation Benchmark for French Biomedical Domain.

Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

How Important Is Tokenization in French Medical Masked Language Models?

Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

BioMistral: A Collection of Open-Source Pretrained Large Language Models for Medical Domains.

Adrien Bazoge

Emmanuel Morin

Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023

HATS: An Open Data Set Integrating Human Perception Applied to the Evaluation of Automatic Speech Recognition Metrics.

Proceedings of the Text, Speech, and Dialogue - 26th International Conference, 2023

HATS : Un jeu de données intégrant la perception humaine appliquée à l'évaluation des métriques de transcription de la parole.

Proceedings of the Actes de CORIA-TALN 2023. Actes de la 30e Conférence sur le Traitement Automatique des Langues Naturelles, TALN 2023, 2023

MORFITT : Un corpus multi-labels d'articles scientifiques français dans le domaine biomédical.

Proceedings of the Actes de CORIA-TALN 2023. Actes de l'atelier "Analyse et Recherche de Textes Scientifiques", 2023

DrBERT: Un modèle robuste pré-entraîné en français pour les domaines biomédical et clinique.

Proceedings of the Actes de CORIA-TALN 2023. Actes de la 30e Conférence sur le Traitement Automatique des Langues Naturelles, TALN 2023, 2023

Improving training datasets for resource-constrained speaker recognition neural networks.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Jeffreys Divergence-Based Regularization of Neural Network Output Distribution Applied to Speaker Recognition.

Proceedings of the IEEE International Conference on Acoustics, 2023

DrBERT: A Robust Pre-trained Model in French for Biomedical and Clinical domains.

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022

I4U System Description for NIST SRE'20 CTS Challenge.

CoRR, 2022

Mesures linguistiques automatiques pour l'évaluation des systèmes de Reconnaissance Automatique de la Parole (Automated linguistic measures for automatic speech recognition systems' evaluation).

Proceedings of the Actes de la 29e Conférence sur le Traitement Automatique des Langues Naturelles. Volume 1 : conférence principale, 2022

On the Use of Semantically-Aligned Speech Representations for Spoken Language Understanding.

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Far-Field Speaker Recognition Benchmark Derived From The DiPCo Corpus.

Mohammad MohammadAmini

Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

Speech Resources in the Tamasheq Language.

Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

Qualitative Evaluation of Language Model Rescoring in Automatic Speech Recognition.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Reliability criterion based on learning-phase entropy for speaker recognition with neural network.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

FrenchMedMCQA: A French Multiple-Choice Question Answering Dataset for Medical domain.

Emmanuel Morin

Proceedings of the 13th International Workshop on Health Text Mining and Information Analysis, 2022

2021

Influence of Speaker Pre-training on Character Voice Representation.

Proceedings of the Speech and Computer - 23rd International Conference, 2021

Language Adaptation for Speaker Recognition Systems Using Contrastive Learning.

Proceedings of the Speech and Computer - 23rd International Conference, 2021

Study On the Temporal Pooling Used In Deep Neural Networks For Speaker Verification.

Jarod Duret

Proceedings of the 29th European Signal Processing Conference, 2021

Studying Squeeze-and-Excitation Used in CNN for Speaker Verification.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020

Adaptation Strategy and Clustering from Scratch for New Domains of Speaker Recognition.

Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020

Review of different robust x-vector extractors for speaker verification.

Proceedings of the 28th European Signal Processing Conference, 2020

2019

ON-TRAC Consortium End-to-End Speech Translation Systems for the IWSLT 2019 Shared Task.

Ha Nguyen

Natalia A. Tomashenko

CoRR, 2019

I4U Submission to NIST SRE 2018: Leveraging from a Decade of Shared Experiences.

CoRR, 2019

I4U Submission to NIST SRE 2018: Leveraging from a Decade of Shared Experiences.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

On Robustness of Unsupervised Domain Adaptation for Speaker Recognition.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

2018

LIA@CLEF 2018: Mining Events Opinion Argumentation from Raw Unlabeled Twitter Data using Convolutional Neural Network.

Proceedings of the Working Notes of CLEF 2018, 2018

2017

LIA at SemEval-2017 Task 4: An Ensemble of Neural Networks for Sentiment Classification.

Proceedings of the 11th International Workshop on Semantic Evaluation, 2017

The I4U Mega Fusion and Collaboration for NIST Speaker Recognition Evaluation 2016.

Achintya Kumar Sarkar

Fahimeh Bahmaninezhad

Sergey Isadskiy

Christian Rathgeb

Christoph Busch

Georgios Tzimiropoulos

Dennis Alexander Lehmann Thomsen

Eliathamby Ambikairajah

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Acoustic Pairing of Original and Dubbed Voices in the Context of Video Game Localization.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Duration Mismatch Compensation Using Four-Covariance Model and Deep Neural Network for Speaker Verification.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

2016

Building a robust sentiment lexicon with (almost) no resource.

CoRR, 2016

LIA system description for NIST SRE 2016.

Moez Ajili

Waad Ben Kheder

CoRR, 2016

Fusion d'espaces de représentations multimodaux pour la reconnaissance du rôle du locuteur dans des documents télévisuels (Multimodal embedding fusion for robust speaker role recognition in video broadcast).

Proceedings of the Actes de la conférence conjointe JEP-TALN-RECITAL 2016. Volume 1 : JEP, 2016

SENSEI-LIF at SemEval-2016 Task 4: Polarity embedding fusion for robust sentiment analysis.

Proceedings of the 10th International Workshop on Semantic Evaluation, 2016

Investigation of speaker embeddings for cross-show speaker diarization.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015

Audio-Based Video Genre Identification.

IEEE ACM Trans. Audio Speech Lang. Process., 2015

PERCOLATTE : A Multimodal Person Discovery System in TV Broadcast for the Medieval 2015 Evaluation Campaign.

Proceedings of the Working Notes Proceedings of the MediaEval 2015 Workshop, 2015

"speech is silver, but silence is golden": improving speech-to-speech translation performance by slashing users input.

Frédéric Béchet

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Speaker diarization through speaker embeddings.

Proceedings of the 23rd European Signal Processing Conference, 2015

Identification de personnes dans des flux multimédia.

Proceedings of the CORIA 2015 - Conférence en Recherche d'Informations et Applications, 2015

Multimodal embedding fusion for robust speaker role recognition in video broadcast.

Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

2014

Joint decoding of complementary utterances.

Frédéric Béchet

Proceedings of the 2014 IEEE Spoken Language Technology Workshop, 2014

Speaker adaptation of DNN-based ASR with i-vectors: does it actually adapt models to speakers?

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Multimodal understanding for person recognition in video broadcasts.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Reranked aligners for interactive transcript correction.

Frédéric Béchet

Proceedings of the IEEE International Conference on Acoustics, 2014

Scene understanding for identifying persons in TV shows: Beyond face authentication.

Proceedings of the 12th International Workshop on Content-Based Multimedia Indexing, 2014

2013

Searching segments of interest in single story web-videos.

Proceedings of the 14th International Workshop on Image Analysis for Multimedia Interactive Services, 2013

LIUM ASR System for ETAPE French Evaluation Campaign: Experiments on System Combination Using Open-Source Recognizers.

Proceedings of the Text, Speech, and Dialogue - 16th International Conference, 2013

An Investigation of Single-Pass ASR System Combination for Spoken Language Understanding.

Proceedings of the Statistical Language and Speech Processing, 2013

An open-source state-of-the-art toolbox for broadcast news diarization.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Semi-Supervised and Unsupervised Data Extraction Targeting Speakers: From Speaker Roles to Fame?

Proceedings of the First Workshop on Speech, 2013

2012

Nouvelle approche pour le regroupement des locuteurs dans des émissions radiophoniques et télévisuelles (New approach for speaker clustering of broadcast news) [in French].

Sylvain Meignier

Proceedings of the Joint Conference JEP-TALN-RECITAL 2012, 2012

Segmentation et Regroupement en Locuteurs d'une collection de documents audio (Cross-show speaker diarization) [in French].

Proceedings of the Joint Conference JEP-TALN-RECITAL 2012, 2012

Avancées dans le domaine de la transcription automatique par décodage guidé (Improvements on driven decoding system combination) [in French].

Proceedings of the Joint Conference JEP-TALN-RECITAL 2012, 2012

A global optimization framework for speaker diarization.

Sylvain Meignier

Proceedings of the Odyssey 2012: The Speaker and Language Recognition Workshop, 2012

I-vectors and ILP clustering adapted to cross-show speaker diarization.

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Subspace Gaussian Mixture Models Based on Noise Compensation for Speech Recognition.

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Low latency combination of parallelized single-pass LVCSR systems.

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

2011

Structuration de contenus audio-visuel pour le résumé automatique. (Audio-visual content structuring for automatic summarization).

PhD thesis, 2011

Modeling nuisance variabilities with factor analysis for GMM-based audio pattern classification.

Florian Verdet

Comput. Speech Lang., 2011

Qui êtes-vous ? Catégoriser les questions pour déterminer le rôle des locuteurs dans des conversations orales (Who are you? Categorize questions to determine the role of speakers in oral conversations).

Proceedings of the Actes de la 18e conférence sur le Traitement Automatique des Langues Naturelles. Articles longs, 2011

Static and dynamic video summaries.

Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

LIA @ MediaEval 2011: Compact representation of heterogeneous descriptors for video genre classification.

Proceedings of the Working Notes Proceedings of the MediaEval 2011 Workshop, 2011

Speaker Role Recognition Using Question Detection and Characterization.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Factor analysis based session variability compensation for Automatic Speech Recognition.

Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

Subspace Gaussian Mixture Models for vectorial HMM-states representation.

Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

2010

Query-Driven Strategy for On-the-Fly Term Spotting in Spontaneous Speech.

Benjamin Lecouteux

EURASIP J. Audio Speech Music. Process., 2010

Classification du genre vidéo reposant sur des transcriptions automatiques.

Stanislas Oger

Proceedings of the Actes de la 17e conférence sur le Traitement Automatique des Langues Naturelles. Articles longs, 2010

A language-identification inspired method for spontaneous speech detection.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

On-the-fly video genre classification by combination of audio features.

Proceedings of the IEEE International Conference on Acoustics, 2010

Transcription-based video genre classification.

Stanislas Oger

Proceedings of the IEEE International Conference on Acoustics, 2010

2009

Factor analysis for audio-based video genre classification.

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Robust audio-based classification of video genre.

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

2008

On-the-fly term spotting by phonetic filtering and request-driven decoding.
