Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

Aligning Spoken Dialogue Models from User Interactions.

Anne Wu

Laurent Mazaré

Neil Zeghidour

Alexandre Défossez

Proceedings of the Forty-second International Conference on Machine Learning, 2025

High-Fidelity Simultaneous Speech-To-Speech Translation.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

CaReAQA: A Cardiac and Respiratory Audio Question Answering Model for Open-Ended Diagnostic Reasoning.

Proceedings of the Conference on Health, Inference, and Learning, 2025

2024

Moshi: a speech-text foundation model for real-time dialogue.

CoRR, 2024

MusicRL: Aligning Music Generation to Human Preferences.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

2023

AudioLM: A Language Modeling Approach to Audio Generation.

IEEE ACM Trans. Audio Speech Lang. Process., 2023

Speak, Read and Prompt: High-Fidelity Text-to-Speech with Minimal Supervision.

Trans. Assoc. Comput. Linguistics, 2023

AudioPaLM: A Large Language Model That Can Speak and Listen.

CoRR, 2023

SoundStorm: Efficient Parallel Audio Generation.

CoRR, 2023

DNArch: Learning Convolutional Neural Architectures by Backpropagation.

David W. Romero

Neil Zeghidour

CoRR, 2023

SingSong: Generating musical accompaniments from singing.

CoRR, 2023

MusicLM: Generating Music From Text.

Christian Havnø Frank

CoRR, 2023

TokenSplit: Using Discrete Speech Representations for Direct, Refined, and Transcript-Conditioned Speech Separation and Recognition.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Pose-graph SLAM Using Multi-order Ultrasonic Echoes and Beamforming for Long-range Inspection Robots.

Proceedings of the IEEE International Conference on Robotics and Automation, 2023

Speech Intelligibility Classifiers from 550k Disordered Speech Samples.

Subhashini Venugopalan

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Disentangling Speech from Surroundings with Neural Embeddings.

Ahmed Omran

Neil Zeghidour

Zalán Borsos

Félix de Chaumont Quitry

Malcolm Slaney

Marco Tagliasacchi

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

LMCodec: A Low Bitrate Speech Codec with Causal Transformer Models.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

2022

SoundStream: An End-to-End Neural Audio Codec.

IEEE ACM Trans. Audio Speech Lang. Process., 2022

AudioLM: a Language Modeling Approach to Audio Generation.

CoRR, 2022

Disentangling speech from surroundings in a neural audio codec.

Ahmed Omran

Neil Zeghidour

Zalán Borsos

Félix de Chaumont Quitry

Malcolm Slaney

Marco Tagliasacchi

CoRR, 2022

Multi-instrument Music Synthesis with Spectrogram Diffusion.

Proceedings of the 23rd International Society for Music Information Retrieval Conference, 2022

Learning neural audio features without supervision.

Sarthak Yadav

Neil Zeghidour

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Combined Grid and Feature-based Mapping of Metal Structures with Ultrasonic Guided Waves.

Proceedings of the 2022 International Conference on Robotics and Automation, 2022

General-purpose, long-context autoregressive modeling with Perceiver AR.

Jean-Baptiste Alayrac

João Carreira

Jesse H. Engel

Proceedings of the International Conference on Machine Learning, 2022

Learning Strides in Convolutional Neural Networks.

Proceedings of the Tenth International Conference on Learning Representations, 2022

Polygonal Shapes Reconstruction from Acoustic Echoes Using a Mobile Sensor and Beamforming.

Proceedings of the 30th European Signal Processing Conference, 2022

2021

Wavesplit: End-to-End Speech Separation by Speaker Clustering.

Neil Zeghidour

David Grangier

IEEE ACM Trans. Audio Speech Lang. Process., 2021

Self-Supervised Learning of Audio Representations From Permutations With Differentiable Ranking.

IEEE Signal Process. Lett., 2021

LEAF: A Learnable Frontend for Audio Classification.

Neil Zeghidour

Olivier Teboul

Félix de Chaumont Quitry

Marco Tagliasacchi

Proceedings of the 9th International Conference on Learning Representations, 2021

Contrastive Learning of General-Purpose Audio Representations.

Aaqib Saeed

David Grangier

Neil Zeghidour

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

Learning From Heterogeneous EEG Signals with Differentiable Channel Reordering.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

Dive: End-to-End Speech Diarization Via Iterative Speaker Embedding.

Neil Zeghidour

Olivier Teboul

David Grangier

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2019

Learning representations of speech from the raw waveform. (Apprentissage de représentations de la parole à partir du signal brut).

Neil Zeghidour

PhD thesis, 2019

Deep multi-class learning from label proportions.

CoRR, 2019

Learning to Detect Dysarthria from Raw Speech.

Juliette Millet

Neil Zeghidour

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

To Reverse the Gradient or Not: an Empirical Comparison of Adversarial and Multi-task Learning in Speech Recognition.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

2018

Fully Convolutional Speech Recognition.

CoRR, 2018

SING: Symbol-to-Instrument Neural Generator.

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

End-to-End Speech Recognition from the Raw Waveform.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Sampling Strategies in Siamese Networks for Unsupervised Speech Representation Learning.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Learning Filterbanks from Raw Speech for Phone Recognition.

Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing, 2018

2017

Fader Networks: Manipulating Images by Sliding Attributes.

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Learning Weakly Supervised Multimodal Phoneme Embeddings.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

2016

Joint Learning of Speaker and Phonetic Similarities with Siamese Networks.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

A deep scattering spectrum - Deep Siamese network pipeline for unsupervised acoustic modeling.

Proceedings of the 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, 2016

Neil Zeghidour
