Jordi Pons

CoRR, March, 2026

2025

Music and Artificial Intelligence: Artistic Trends.

CoRR, August, 2025

Fast Text-to-Audio Generation with Adversarial Post-Training.

Taylor Berg-Kirkpatrick

Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2025

Scaling Transformers for Low-Bitrate High-Quality Speech Coding.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Stable Audio Open.

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

2024

Long-form music generation with latent diffusion.

CoRR, 2024

Long-Form Music Generation With Latent Diffusion.

Proceedings of the 25th International Society for Music Information Retrieval Conference, 2024

Fast Timing-Conditioned Latent Audio Diffusion.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

GASS: Generalizing Audio Source Separation with Large-Scale Data.

Proceedings of the IEEE International Conference on Acoustics, 2024

2023

Towards Robust Image-in-Audio Deep Steganography.

CoRR, 2023

CLIPSonic: Text-to-Audio Synthesis with Unlabeled Videos and Pretrained Language-Vision Models.

Taylor Berg-Kirkpatrick

Julian J. McAuley

Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2023

Mono-to-Stereo Through Parametric Stereo Generation.

Proceedings of the 24th International Society for Music Information Retrieval Conference, 2023

Adversarial Permutation Invariant Training for Universal Sound Separation.

Proceedings of the IEEE International Conference on Acoustics, 2023

Full-Band General Audio Synthesis with Score-Based Diffusion.

Proceedings of the IEEE International Conference on Acoustics, 2023

Upsampling Layers for Music Source Separation.

Proceedings of the 31st European Signal Processing Conference, 2023

2022

FSD50K: An Open Dataset of Human-Labeled Sound Events.

IEEE/ACM Trans. Audio Speech Lang. Process., 2022

Universal Speech Enhancement with Score-based Diffusion.

CoRR, 2022

PodcastMix: A dataset for separating music and speech in podcasts.

Nicolás Schmidt

Marius Miron

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

On Loss Functions and Evaluation Metrics for Music Source Separation.

Proceedings of the IEEE International Conference on Acoustics, 2022

PixInWav: Residual Steganography for Hiding Pixels in Audio.

Proceedings of the IEEE International Conference on Acoustics, 2022

2021

On tuning consistent annealed sampling for denoising score matching.

CoRR, 2021

Adversarial Auto-Encoding for Packet Loss Concealment.

Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2021

Automatic Multitrack Mixing With A Differentiable Mixing Console Of Neural Audio Effects.

Christian J. Steinmetz

Proceedings of the IEEE International Conference on Acoustics, 2021

SESQA: Semi-Supervised Learning for Speech Quality Assessment.

Proceedings of the IEEE International Conference on Acoustics, 2021

Upsampling Artifacts in Neural Audio Synthesis.

Proceedings of the IEEE International Conference on Acoustics, 2021

On Permutation Invariant Training For Speech Source Separation.

Xiaoyu Liu

Proceedings of the IEEE International Conference on Acoustics, 2021

Multichannel-based Learning for Audio Object Extraction.

Daniel Arteaga

Proceedings of the IEEE International Conference on Acoustics, 2021

2020

FSD50K.

Dataset, October, 2020

An Empirical Study of Conv-TasNet.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

TensorFlow Audio Models in Essentia.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019

FSDKaggle2018.

Dataset, January, 2019

Deep neural networks for music and audio tagging.

PhD thesis, 2019

musicnn: Pre-trained convolutional neural networks for music audio tagging.

CoRR, 2019

End-to-End Music Source Separation: Is it Possible in the Waveform Domain?

Francesc Lluís

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Training Neural Audio Classifiers with Few Data.

Proceedings of the IEEE International Conference on Acoustics, 2019

Randomly Weighted CNNs for (Music) Audio Classification.

Proceedings of the IEEE International Conference on Acoustics, 2019

2018

End-to-end Learning for Music Audio Tagging at Scale.

Proceedings of the 19th International Society for Music Information Retrieval Conference, 2018

A Wavenet for Speech Denoising.

Dario Rethage

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

General-purpose tagging of Freesound audio with AudioSet labels: task description, dataset, and baseline.

Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events, 2018

2017

Score-Informed Syllable Segmentation for A Cappella Singing Voice with Convolutional Neural Networks.

Rong Gong

Proceedings of the 18th International Society for Music Information Retrieval Conference, 2017

Audio to Score Matching by Combining Phonetic and Duration Information.

Rong Gong

Proceedings of the 18th International Society for Music Information Retrieval Conference, 2017

Freesound Datasets: A Platform for the Creation of Open Audio Datasets.

Proceedings of the 18th International Society for Music Information Retrieval Conference, 2017

Designing efficient architectures for modeling temporal features with convolutional neural networks.

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Timbre analysis of music audio signals with convolutional neural networks.

Proceedings of the 25th European Signal Processing Conference, 2017

2016

Experimenting with musically motivated convolutional neural networks.

Thomas Lidy