Emmanuel Dupoux

ORCID: 0000-0002-7814-2952

According to our database, Emmanuel Dupoux authored at least 139 papers between 1999 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Language Evolution with Deep Learning.
CoRR, 2024

SpiRit-LM: Interleaved Spoken and Written Language Model.
CoRR, 2024

2023
EmphAssess : a Prosodic Benchmark on Assessing Emphasis Transfer in Speech-to-Speech Models.
CoRR, 2023

WorldSense: A Synthetic Benchmark for Grounded Reasoning in Large Language Models.
CoRR, 2023

Low-Resource Self-Supervised Learning with SSL-Enhanced TTS.
CoRR, 2023

EXPRESSO: A Benchmark and Analysis of Discrete Expressive Speech Resynthesis.
CoRR, 2023

BabySLM: language-acquisition-friendly benchmark of self-supervised spoken language models.
CoRR, 2023

ProsAudit, a prosodic benchmark for self-supervised speech models.
CoRR, 2023

Textually Pretrained Speech Language Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Augmentation Invariant Discrete Representation for Generative Spoken Language Modeling.
Proceedings of the 20th International Conference on Spoken Language Translation, 2023

Neural Agents Struggle to Take Turns in Bidirectional Emergent Communication.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Introducing Topography in Convolutional Neural Networks.
Proceedings of the IEEE International Conference on Acoustics, 2023

Do Coarser Units Benefit Cluster Prediction-Based Speech Pre-Training?
Proceedings of the IEEE International Conference on Acoustics, 2023

XLS-R fine-tuning on noisy word boundaries for unsupervised speech segmentation into words.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Generative Spoken Language Model based on continuous word-sized audio tokens.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Brouhaha: Multi-Task Training for Voice Activity Detection, Speech-to-Noise Ratio, and C50 Room Acoustics Estimation.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022
DP-Parse: Finding Word Boundaries from Raw Speech with an Instance Lexicon.
Trans. Assoc. Comput. Linguistics, 2022

IntPhys 2019: A Benchmark for Visual Intuitive Physics Understanding.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

Are Discrete Units Necessary for Spoken Language Modeling?
IEEE J. Sel. Top. Signal Process., 2022

Self-Supervised Language Learning From Raw Audio: Lessons From the Zero Resource Speech Challenge.
IEEE J. Sel. Top. Signal Process., 2022

Evaluating context-invariance in unsupervised speech representations.
CoRR, 2022

Are word boundaries useful for unsupervised language learning?
CoRR, 2022

On The Robustness of Self-Supervised Representations for Spoken Language Modeling.
CoRR, 2022

STOP: A dataset for Spoken Task Oriented Semantic Parsing.
CoRR, 2022

Is the Language Familiarity Effect gradual? A computational modelling approach.
CoRR, 2022

Generative Spoken Dialogue Language Modeling.
CoRR, 2022

textless-lib: a Library for Textless Spoken Language Processing.
CoRR, 2022

Stop: A Dataset for Spoken Task Oriented Semantic Parsing.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

A comparison study on patient-psychologist voice diarization.
Proceedings of the Ninth Workshop on Speech and Language Processing for Assistive Technologies, 2022

Emergent Communication: Generalization and Overfitting in Lewis Games.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Probing phoneme, language and speaker information in unsupervised speech representations.
Proceedings of the Interspeech 2022, 2022

Speech Sequence Embeddings using Nearest Neighbors Contrastive Learning.
Proceedings of the Interspeech 2022, 2022

On the role of population heterogeneity in emergent communication.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Textless Speech Emotion Conversion using Discrete & Decomposed Representations.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

Text-Free Prosody-Aware Generative Spoken Language Modeling.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

2021
Towards Interactive Language Modeling.
CoRR, 2021

Shennong: a Python toolbox for audio speech features extraction.
CoRR, 2021

Textless Speech Emotion Conversion using Decomposed and Discrete Representations.
CoRR, 2021

ZR-2021VG: Zero-Resource Speech Challenge, Visually-Grounded Language Modelling track, 2021 edition.
CoRR, 2021

The Interspeech Zero Resource Speech Challenge 2021: Spoken language modelling.
CoRR, 2021

Learning spectro-temporal representations of complex sounds with parameterized neural networks.
CoRR, 2021

Generative Spoken Language Modeling from Raw Audio.
CoRR, 2021

Does Infant-Directed Speech Help Phonetic Learning? A Machine Learning Investigation.
Cogn. Sci., 2021

Towards Unsupervised Learning of Speech Features in the Wild.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Data Augmenting Contrastive Learning of Speech Representations in the Time Domain.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Speech Resynthesis from Discrete Disentangled Self-Supervised Representations.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

The Zero Resource Speech Challenge 2021: Spoken Language Modelling.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation Learning, Semi-Supervised Learning and Interpretation.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
Speech Technology for Unwritten Languages.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

The Zero Resource Speech Benchmark 2021: Metrics and baselines for unsupervised spoken language modeling.
CoRR, 2020

Comparison of Speaker Role Recognition and Speaker Enrollment Protocol for conversational Clinical Interviews.
CoRR, 2020

Occlusion resistant learning of intuitive physics from videos.
CoRR, 2020

Seshat: a Tool for Managing and Verifying Annotation Campaigns of Audio Data.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Identification of Primary and Collateral Tracks in Stuttered Speech.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Vocal Markers from Sustained Phonation in Huntington's Disease.
Proceedings of the Interspeech 2020, 2020

An Open-Source Voice Type Classifier for Child-Centered Daylong Recordings.
Proceedings of the Interspeech 2020, 2020

The Zero Resource Speech Challenge 2020: Discovering Discrete Subword and Word Units.
Proceedings of the Interspeech 2020, 2020

Evaluating the Reliability of Acoustic Speech Embeddings.
Proceedings of the Interspeech 2020, 2020

Unsupervised Pretraining Transfers Well Across Languages.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Libri-Light: A Benchmark for ASR with Limited or No Supervision.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

"LazImpa": Lazy and Impatient neural agents learn to communicate efficiently.
Proceedings of the 24th Conference on Computational Natural Language Learning, 2020

Analogies minus analogy test: measuring regularities in word embeddings.
Proceedings of the 24th Conference on Computational Natural Language Learning, 2020

Does bilingual input hurt? A simulation of language discrimination and clustering using i-vectors.
Proceedings of the 42nd Annual Meeting of the Cognitive Science Society, 2020

Modelling Perceptual Effects of Phonology with ASR Systems.
Proceedings of the 42nd Annual Meeting of the Cognitive Science Society, 2020

Compositionality and Generalization In Emergent Languages.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019
Modeling German Verb Argument Structures: LSTMs vs. Humans.
CoRR, 2019

Anti-efficient encoding in emergent communication.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Phoneme learning is influenced by the taxonomic similarity of the semantic referents.
Proceedings of the 41st Annual Meeting of the Cognitive Science Society, 2019

Word-order Biases in Deep-agent Emergent Communication.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

2018
IntPhys: A Framework and Benchmark for Visual Intuitive Physics Reasoning.
CoRR, 2018

Are Words Easier to Learn From Infant- Than Adult-Directed Speech? A Quantitative Corpus-Based Investigation.
Cogn. Sci., 2018

Zero Resource Speech Technology: Past, Present, and Future.
Proceedings of the 6th Intl. Workshop on Spoken Language Technologies for Under-Resourced Languages, 2018

A K-Nearest Neighbours Approach To Unsupervised Spoken Term Discovery.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

BabyCloud, a Technological Platform for Parents and Researchers.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

End-to-End Speech Recognition from the Raw Waveform.
Proceedings of the Interspeech 2018, 2018

Sampling Strategies in Siamese Networks for Unsupervised Speech Representation Learning.
Proceedings of the Interspeech 2018, 2018

Learning Word Embeddings: Unsupervised Methods for Fixed-size Representations of Variable-length Speech Segments.
Proceedings of the Interspeech 2018, 2018

Learning Filterbanks from Raw Speech for Phone Recognition.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Linguistic Unit Discovery from Multi-Modal Inputs in Unwritten Languages: Summary of the "Speaking Rosetta" JSALT 2017 Workshop.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Bayesian Models for Unit Discovery on a Very Low Resource Language.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017
A Quantitative Measure of the Impact of Coarticulation on Phone Discriminability.
Proceedings of the Interspeech 2017, 2017

Relating Unsupervised Word Segmentation to Reported Vocabulary Acquisition.
Proceedings of the Interspeech 2017, 2017

Predicting Epenthetic Vowel Quality from Acoustics.
Proceedings of the Interspeech 2017, 2017

Learning Weakly Supervised Multimodal Phoneme Embeddings.
Proceedings of the Interspeech 2017, 2017

Comparing Character-level Neural Language Models Using a Lexical Decision Task.
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, 2017

ASR Systems as Models of Phonetic Category Perception in Adults.
Proceedings of the 39th Annual Meeting of the Cognitive Science Society, 2017

The zero resource speech challenge 2017.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

Blind Phoneme Segmentation With Temporal Prediction Errors.
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017

The Role of Prosody and Speech Register in Word Segmentation: A Computational Modelling Perspective.
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017

2016
Assessing the Ability of LSTMs to Learn Syntax-Sensitive Dependencies.
Trans. Assoc. Comput. Linguistics, 2016

Improving Phoneme segmentation with Recurrent Neural Networks.
CoRR, 2016

Cognitive Science in the era of Artificial Intelligence: A roadmap for reverse-engineering the infant language-learner.
CoRR, 2016

Quantificational features in distributional word representations.
Proceedings of the Fifth Joint Conference on Lexical and Computational Semantics, 2016

The Zero Resource Speech Challenge 2015: Proposed Approaches and Results.
Proceedings of the SLTU-2016, 2016

A Temporal Coherence Loss Function for Learning Unsupervised Acoustic Embeddings.
Proceedings of the SLTU-2016, 2016

Automatic Syllable Segmentation Using Broad Phonetic Class Information.
Proceedings of the SLTU-2016, 2016

Joint Learning of Speaker and Phonetic Similarities with Siamese Networks.
Proceedings of the Interspeech 2016, 2016

A new efficient measure for accuracy prediction and its application to multistream-based unsupervised adaptation.
Proceedings of the 23rd International Conference on Pattern Recognition, 2016

The "language filter" hypothesis: A feasibility study of language separation in infancy using unsupervised clustering of I-vectors.
Proceedings of the 2016 Joint IEEE International Conference on Development and Learning and Epigenetic Robotics, 2016

A deep scattering spectrum - Deep Siamese network pipeline for unsupervised acoustic modeling.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

The role of word-word co-occurrence in word learning.
Proceedings of the 38th Annual Meeting of the Cognitive Science Society, 2016

Modeling language discrimination in infants using i-vector representations.
Proceedings of the 38th Annual Meeting of the Cognitive Science Society, 2016

Discriminability of sound contrasts in the face of speaker variation quantified.
Proceedings of the 38th Annual Meeting of the Cognitive Science Society, 2016

2015
Weakly Supervised Multi-Embeddings Learning of Acoustic Models.
Proceedings of the 3rd International Conference on Learning Representations, 2015

Exploring multi-language resources for unsupervised spoken term discovery.
Proceedings of the International Conference on Speech Technology and Human-Computer Dialogue, 2015

Rhythm-Based Syllabic Stress Learning Without Labelled Data.
Proceedings of the Statistical Language and Speech Processing, 2015

Prosodic boundary information helps unsupervised word segmentation.
Proceedings of the NAACL HLT 2015, The 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, Colorado, USA, May 31, 2015

Sign constraints on feature weights improve a joint model of word segmentation and phonology.
Proceedings of the NAACL HLT 2015, The 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, Colorado, USA, May 31, 2015

The zero resource speech challenge 2015.
Proceedings of the INTERSPEECH 2015, 2015

A hybrid dynamic time warping-deep neural network architecture for unsupervised acoustic modeling.
Proceedings of the INTERSPEECH 2015, 2015

Salient dimensions in implicit phonotactic learning.
Proceedings of the INTERSPEECH 2015, 2015

A multilingual study on intensity as a cue for marking prosodic boundaries.
Proceedings of the 18th International Congress of Phonetic Sciences, 2015

Quantitative methods for comparing featural representations.
Proceedings of the 18th International Congress of Phonetic Sciences, 2015

Towards machines that know when they do not know: Summary of work done at 2014 Frederick Jelinek Memorial Workshop.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

2014
Towards low-resource prosodic boundary detection.
Proceedings of the 4th Workshop on Spoken Language Technologies for Under-resourced Languages, 2014

Phonetics embedding learning with side information.
Proceedings of the 2014 IEEE Spoken Language Technology Workshop, 2014

Bridging the gap between speech technology and natural language processing: an evaluation toolbox for term discovery systems.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

Evaluating speech features with the minimal-pair ABX task (II): resistance to noise.
Proceedings of the INTERSPEECH 2014, 2014

A Rudimentary Lexicon and Semantics Help Bootstrap Phoneme Acquisition.
Proceedings of the Eighteenth Conference on Computational Natural Language Learning, 2014

Unsupervised Word Segmentation in Context.
Proceedings of the COLING 2014, 2014

Self-Consistency as an Inductive Bias in Early Language Acquisition.
Proceedings of the 36th Annual Meeting of the Cognitive Science Society, 2014

Modelling function words improves unsupervised word segmentation.
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014

Exploring the Relative Role of Bottom-up and Top-down Information in Phoneme Learning.
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014

2013
Learning Phonemes With a Proto-Lexicon.
Cogn. Sci., 2013

Evaluating speech features with the minimal-pair ABX task: analysis of the classical MFC/PLP pipeline.
Proceedings of the INTERSPEECH 2013, 2013

A corpus-based evaluation method for Distributional Semantic Models.
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, 2013

Why is English so easy to segment?
Proceedings of the Fourth Annual Workshop on Cognitive Modeling and Computational Linguistics, 2013

2011
Holographic String Encoding.
Cogn. Sci., 2011

Templatic features for modeling phoneme acquisition.
Proceedings of the 33rd Annual Meeting of the Cognitive Science Society, 2011

Testing the Robustness of Online Word Segmentation: Effects of Linguistic Diversity and Phonetic Variation.
Proceedings of the 2nd Workshop on Cognitive Modeling and Computational Linguistics, 2011

2010
Cerebral bases of subliminal speech priming.
NeuroImage, 2010

Perception of predictable stress: A cross-linguistic investigation.
J. Phonetics, 2010

2008
Unsupervised Learning of Acoustic Sub-word Units.
Proceedings of the ACL 2008, 2008

2006
The Role of the Striatum in Processing Language Rules: Evidence from Word Perception in Huntington's Disease.
J. Cogn. Neurosci., 2006

1999
Perception of stress by French, Spanish, and bilingual subjects.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

Prelexical locus of an illusory vowel effect in Japanese.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

