Julien Pinquier

Lionel Fontan

Proceedings of the Actes des 35èmes Journées d'Études sur la Parole, 2024

Erreurs de prononciation en L2 : comparaison de méthodes pour la détection et le diagnostic guidés par la didactique.

Proceedings of the Actes des 35èmes Journées d'Études sur la Parole, 2024

Étude des liens acoustico-moteurs après cancer oral ou oropharyngé, via la réalisation d'un inventaire phonémique automatique des consonnes.

Proceedings of the Actes des 35èmes Journées d'Études sur la Parole, 2024

Detection of Pharyngolaryngeal Activities in Real-World Settings Using Wearable Sensors.

Proceedings of the 46th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2024

EMVD Dataset: a Dataset of Extreme Vocal Distortion Techniques Used in Heavy Metal.

Proceedings of the 21st International Conference on Content-Based Multimedia Indexing, 2024

2023

Audio-video fusion strategies for active speaker detection in meetings.

Multim. Tools Appl., April, 2023

Killing two birds with one stone: Can an audio captioning system also be used for audio-text retrieval?

Etienne Labbé

CoRR, 2023

Comparing phoneme recognition systems on the detection and diagnosis of reading mistakes for young children's oral reading evaluation.

Proceedings of the 9th Workshop on Speech and Language Technology in Education, 2023

Towards Reducing Patient Effort for the Automatic Prediction of Speech Intelligibility in Head and Neck Cancers.

Proceedings of the IEEE International Conference on Acoustics, 2023

Multitask Learning in Audio Captioning: A Sentence Embedding Regression Loss Acts as a Regularizer.

Etienne Labbé

Proceedings of the 31st European Signal Processing Conference, 2023

Can We Use Speaker Embeddings On Spontaneous Speech Obtained From Medical Conversations To Predict Intelligibility?

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022

Deep neural networks for automatic speech processing: a survey from large corpora to limited data.

Vincent Roger

Jérôme Farinas

EURASIP J. Audio Speech Music. Process., 2022

Automatic Assessment of Speech Intelligibility using Consonant Similarity for Head and Neck Cancer.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Prediction of L2 speech proficiency based on multi-level linguistic features.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Is my Automatic Audio Captioning System so Bad? SPIDEr-max: A Metric to Consider Several Caption Candidates.

Etienne Labbé

Proceedings of the 7th Workshop on Detection and Classification of Acoustic Scenes and Events 2022, 2022

2021

End-to-end acoustic modelling for phone recognition of young readers.

Speech Commun., 2021

C2SI corpus: a database of speech disorder productions to assess intelligibility and quality of life in head and neck cancers.

Lang. Resour. Evaluation, 2021

Improving vehicle re-identification using CNN latent spaces: Metrics comparison and track-to-track extension.

Geoffrey Roman-Jimenez

IET Comput. Vis., 2021

Multimodal Neural Network for Sentiment Analysis in Embedded Systems.

Proceedings of the 16th International Joint Conference on Computer Vision, 2021

Multimodal human interaction analysis in vehicle cockpit.

Proceedings of the 24th IEEE International Intelligent Transportation Systems Conference, 2021

Simulating Reading Mistakes for Child Speech Transformer-Based Phone Recognition.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Automatic macro segmentation into interaction sequence: a silence-based approach for meeting structuring.

Proceedings of the 18th International Conference on Content-Based Multimedia Indexing, 2021

Towards a content-based prediction of personalized musical preferences using transfer learning.

Proceedings of the 18th International Conference on Content-Based Multimedia Indexing, 2021

2020

Étude des facteurs affectant la compréhensibilité de documents multimodaux : une étude expérimentale (Factors affecting the comprehensibility of multimodal documents: an experimental study).

Estelle I. S. Randria

Proceedings of the Actes de la 6e conférence conjointe Journées d'Études sur la Parole (JEP, 2020

Reconnaissance de phones fondée sur du Transfer Learning pour des enfants apprenants lecteurs en environnement de classe (Transfer Learning based phone recognition on children learning to read, with speech recorded in a classroom environment).

Proceedings of the Actes de la 6e conférence conjointe Journées d'Études sur la Parole (JEP, 2020

Analyse de l'effet de la réverbération sur la reconnaissance automatique de la parole (Analyzing how reverberation affects Automatic Speech Recognition).

Proceedings of the Actes de la 6e conférence conjointe Journées d'Études sur la Parole (JEP, 2020

Une nouvelle mesure de la réverbération pour prédire les performances a priori de la transcription de la parole (A new reverberation measure to predict a priori ASR performance).

Proceedings of the Actes de la 6e conférence conjointe Journées d'Études sur la Parole (JEP, 2020

Subjective Evaluation of Comprehensibility in Movie Interactions.

Estelle I. S. Randria

Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Automatic Prediction of Speech Intelligibility Based on X-Vectors in the Context of Head and Neck Cancer.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

2019

Audiovisual Annotation Procedure for Multi-view Field Recordings.

Patrice Guyot

Thierry Malon

Geoffrey Roman-Jimenez

Proceedings of the MultiMedia Modeling - 25th International Conference, 2019

2018

Toulouse campus surveillance dataset: scenarios, soundtracks, synchronized videos with overlapping and disjoint views.

Thierry Malon

Geoffrey Roman-Jimenez

Proceedings of the 9th ACM Multimedia Systems Conference, 2018

Carcinologic Speech Severity Index Project: A Database of Speech Disorder Productions to Assess Quality of Life Related to Speech After Cancer.

Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

Perceptual and Automatic Evaluations of the Intelligibility of Speech Degraded by Noise Induced Hearing Loss Simulation.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Catégorisation libre d'extraits musicaux et analyse automatique.

Proceedings of the COnférence en Recherche d'Informations et Applications, 2018

2017

Unsupervised Speech Unit Discovery Using K-means and Neural Networks.

Céline Manenti

Proceedings of the Statistical Language and Speech Processing, 2017

Music Feature Maps with Convolutional Neural Networks for Music Genre Classification.

Proceedings of the 15th International Workshop on Content-Based Multimedia Indexing, 2017

2016

A multi-modal perception based assistive robotic system for the elderly.

Comput. Vis. Image Underst., 2016

Influence de la quantité de données sur une tâche de segmentation de phones fondée sur les réseaux de neurones (Phone-level speech segmentation with neural networks: influence of the amount of data).

Céline Manenti

Proceedings of the Actes de la conférence conjointe JEP-TALN-RECITAL 2016. Volume 1 : JEP, 2016

CNN-Based Phone Segmentation Experiments in a Less-Represented Language.

Céline Manenti

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Using Phonologically Weighted Levenshtein Distances for the Prediction of Microscopic Intelligibility.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Online Audiovisual Signature Training for Person Re-identification.

François-Xavier Decroix

Isabelle Ferrané

Frédéric Lerasle

Proceedings of the 10th International Conference on Distributed Smart Camera, 2016

A Multi-modal Perception based Architecture for a Non-intrusive Domestic Assistant Robot.

Proceedings of the Eleventh ACM/IEEE International Conference on Human Robot Interaction, 2016

Filterbank coefficients selection for segmentation in singer turns.

Proceedings of the 14th International Workshop on Content-Based Multimedia Indexing, 2016

2015

Automatic intelligibility measures applied to speech signals simulating age-related hearing loss.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Perceiving user's intention-for-interaction: A probabilistic multimodal data fusion scheme.

Proceedings of the 2015 IEEE International Conference on Multimedia and Expo, 2015

Pyc2Sound: a Python tool to convert images into sound.

Vincent Bragard

Proceedings of the Audio Mostly 2015 on Interaction With Sound, 2015

2014

Comparaison de mesures perceptives et automatiques de l'intelligibilité. Application à de la parole simulant la presbyacousie.

Trait. Autom. des Langues, 2014

Hierarchical Hidden Markov Model in detecting activities of daily living in wearable videos for studies of dementia.

Multim. Tools Appl., 2014

Segmentation in singer turns with the Bayesian information criterion.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

A particle swarm optimization inspired tracker applied to visual tracking.

Proceedings of the 2014 IEEE International Conference on Image Processing, 2014

Segmentations sonore et audiovisuelle ?

, 2014

2013

Superposed speech localisation using frequency tracking.

Maxime Le Coz

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Two-step detection of water sound events for the diagnostic and monitoring of dementia.

Proceedings of the 2013 IEEE International Conference on Multimedia and Expo, 2013

Water sound recognition based on physical models.

Patrice Guyot

Proceedings of the IEEE International Conference on Acoustics, 2013

Audio indexing including frequency tracking of simultaneous multiple sources in speech and music.

Proceedings of the 11th International Workshop on Content-Based Multimedia Indexing, 2013

2012

Detecting individual role using features extracted from speaker diarization results.

Multim. Tools Appl., 2012

Strategies for multiple feature fusion with Hierarchical HMM: Application to activity recognition from wearable audiovisual sensors.

Proceedings of the 21st International Conference on Pattern Recognition, 2012

Water flow detection from a wearable device with a new feature, the spectral cover.

Patrice Guyot

Proceedings of the 10th International Workshop on Content-Based Multimedia Indexing, 2012

Feasibility of the detection of choirs for ethnomusicologic music indexing.

Maxime Le Coz

Proceedings of the 10th International Workshop on Content-Based Multimedia Indexing, 2012

2011

Distinguishing Monophonies From Polyphonies Using Weibull Bivariate Distributions.

IEEE Trans. Speech Audio Process., 2011

Activities of daily living indexing by hierarchical HMM for dementia diagnostics.

Proceedings of the 9th International Workshop on Content-Based Multimedia Indexing, 2011

2010

The IMMED project: wearable video monitoring of people with age dementia.

Proceedings of the 18th International Conference on Multimedia 2010, 2010

Speaker role recognition to help spontaneous conversational speech detection.

Proceedings of the 2010 International Workshop on Searching Spontaneous Conversational Speech, 2010

Looking for relevant features for speaker role recognition.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Exploiting speaker segmentations for automatic role detection. An application to broadcast news documents.

Benjamin Bigot

Isabelle Ferrané

Proceedings of the 2010 International Workshop on Content-Based Multimedia Indexing, 2010

2009

Improved speaker diarization system for meetings.

Elie Khoury

Christine Sénac

Proceedings of the IEEE International Conference on Acoustics, 2009

Singing voice detection in monophonic and polyphonic contexts.

Proceedings of the 17th European Signal Processing Conference, 2009

Monophony vs Polyphony: A New Method Based on Weibull Bivariate Models.

Proceedings of the Seventh International Workshop on Content-Based Multimedia Indexing, 2009

2008

Dynamic organization of audiovisual database using a user-defined similarity measure based on low-level features.

Proceedings of the International Conference on Image Processing, 2008

Wearable video monitoring of people with age dementia: Video indexing at the service of healthcare.

Catherine Helmer

Proceedings of the International Workshop on Content-Based Multimedia Indexing, 2008

2007

Singing voice characterization for audio indexing.

Proceedings of the 15th European Signal Processing Conference, 2007

ACADI showcase - automatic character indexing in audiovisual document.

Frédéric Gianni

Ewa Kijak

Proceedings of the 6th ACM International Conference on Image and Video Retrieval, 2007

Fast Hierarchical Multimodal Structuring of Time Slots.

Proceedings of the International Workshop on Content-Based Multimedia Indexing, 2007

Association of Audio and Video Segmentations for Automatic Person Indexing.

Proceedings of the International Workshop on Content-Based Multimedia Indexing, 2007

2006

Audio indexing: primary components retrieval.

Multim. Tools Appl., 2006

Intervenant Classification in an Audiovisual Document.

Jeremy Philippeau

Philippe Joly

Proceedings of the SIGMAP 2006, 2006

2005

Evaluation of classification techniques for audio indexing.

José Anibal Arias

Proceedings of the 13th European Signal Processing Conference, 2005

2004

Indexation sonore : recherche de composantes primaires pour une structuration audiovisuelle. (Audio classification: search of primary components for audiovisual structuring).

PhD thesis, 2004

Jingle detection and identification in audio documents.
