Emmanouil Benetos

Mathieu Lagrange

CoRR, February, 2026

AVMeme Exam: A Multimodal Multilingual Multicultural Benchmark for LLMs' Contextual and Cultural Knowledge and Thinking.

CoRR, January, 2026

Embryonic Exposure to VPA Influences Chick Vocalisations: A Computational Study.

Antonella M. C. Torrisi

CoRR, January, 2026

Computational hermeneutics: evaluating generative AI as a cultural technology.

Frontiers Artif. Intell., 2026

2025

Singing to speech conversion with generative flow.

EURASIP J. Audio Speech Music. Process., December, 2025

AutoMV: An Automatic Multi-Agent System for Music Video Generation.

CoRR, December, 2025

SAR-LM: Symbolic Audio Reasoning with Large Language Models.

Termeh Taheri

Yinghao Ma

CoRR, November, 2025

OmniVideoBench: Towards Audio-Visual Understanding Evaluation for Omni MLLMs.

CoRR, October, 2025

Scalable Evaluation for Audio Identification via Synthetic Latent Fingerprint Generation.

Aditya Bhattacharjee

Marco Pasini

CoRR, September, 2025

Audio-JEPA: Joint-Embedding Predictive Architecture for Audio Representation Learning.

CoRR, July, 2025

MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix.

CoRR, May, 2025

Towards a Unified Representation Evaluation Framework Beyond Downstream Tasks.

CoRR, May, 2025

Word-Level Lyrics Alignment Annotations for the MUSDB18 Test Set.

Dataset, May, 2025

YuE: Scaling Open Foundation Models for Long-Form Music Generation.

CoRR, March, 2025

Audio-FLAN: A Preliminary Release.

CoRR, February, 2025

Velocity2DMs: A Contextual Modeling Approach to Dynamics Marking Prediction in Piano Performance.

Hyon Kim

Xavier Serra

IEEE Signal Process. Lett., 2025

RUMAA: Repeat-Aware Unified Music Audio Analysis for Score-Performance Alignment, Transcription, and Mistake Detection.

Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2025

From Aesthetics to Human Preferences: Comparative Perspectives of Evaluating Text-to-Music Systems.

Proceedings of the 35th IEEE International Workshop on Machine Learning for Signal Processing, 2025

Perceptual Errors in Music Source Separation: looking beyond SDR averages.

Proceedings of the 26th International Society for Music Information Retrieval Conference, 2025

Universal Music Representations? Evaluating Foundation Models on World Music Corpora.

Charilaos Papaioannou

Alexandros Potamianos

Proceedings of the 26th International Society for Music Information Retrieval Conference, 2025

CMI-Bench: A Comprehensive Benchmark for Evaluating Music Instruction Following.

Proceedings of the 26th International Society for Music Information Retrieval Conference, 2025

Refining music sample identification with a self-supervised graph neural network.

Proceedings of the 26th International Society for Music Information Retrieval Conference, 2025

Enhancing Lyrics Transcription on Music Mixtures with Consistency Loss.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Position Paper: Towards a Unified Representation Evaluation Framework Beyond Downstream Tasks.

Proceedings of the International Joint Conference on Neural Networks, 2025

MuPT: A Generative Symbolic Music Pretrained Transformer.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

LHGNN: Local-Higher Order Graph Neural Networks For Audio Classification and Tagging.

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Learning Music Audio Representations With Limited Data.

Christos Plachouras

Johan Pauwels

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Twenty-Five Years of MIR Research: Achievements, Practices, Evaluations, and Future Challenges.

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Acoustic Identification of Individual Animals with Hierarchical Contrastive Learning.

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

GraFPrint: A GNN-Based Approach for Audio Identification.

Aditya Bhattacharjee

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

2024

MuChoMusic dataset.

Dataset, July, 2024

Few-Shot Class-Incremental Audio Classification Using Dynamically Expanded Classifier With Self-Attention Modified Prototypes.

IEEE Trans. Multim., 2024

ATGNN: Audio Tagging Graph Neural Network.

Christian J. Steinmetz

Dan Stowell

IEEE Signal Process. Lett., 2024

A Data-Driven Analysis of Robust Automatic Piano Transcription.

IEEE Signal Process. Lett., 2024

OmniBench: Towards The Future of Universal Omni-Language Models.

CoRR, 2024

LC-Protonets: Multi-label Few-shot learning for world music audio tagging.

Charilaos Papaioannou

Alexandros Potamianos

CoRR, 2024

Domain-Invariant Representation Learning of Bird Sounds.

CoRR, 2024

Foundation Models for Music: A Survey.

CoRR, 2024

MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series.

CoRR, 2024

MuPT: A Generative Symbolic Music Pretrained Transformer.

CoRR, 2024

ChatMusician: Understanding and Generating Music Intrinsically with LLM.

CoRR, 2024

Explaining Models Relating Objects and Privacy.

Proceedings of the 3rd Explainable AI for Computer Vision (XAI4CV) Workshop, 2024

Classification Of Spontaneous And Scripted Speech For Multilingual Audio.

Shahar Elisha

Andrew McDowell

Mariano Beguerisse-Díaz

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

MusiLingo: Bridging Music and Text with Pre-trained Language Models for Music Captioning and Query Response.

Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2024, 2024

YourMT3+: Multi-Instrument Music Transcription with Enhanced Transformer Architectures and Cross-Dataset STEM Augmentation.

Proceedings of the 34th IEEE International Workshop on Machine Learning for Signal Processing, 2024

Can LLMs "Reason" in Music? an Evaluation of LLMs' Capability of Music Understanding and Generation.

Proceedings of the 25th International Society for Music Information Retrieval Conference, 2024

MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models.

Proceedings of the 25th International Society for Music Information Retrieval Conference, 2024

ST-ITO: Controlling Audio Effects for Style Transfer With Inference-Time Optimization.

Christian J. Steinmetz

Proceedings of the 25th International Society for Music Information Retrieval Conference, 2024

ComposerX: Multi-Agent Symbolic Music Composition With LLMs.

Proceedings of the 25th International Society for Music Information Retrieval Conference, 2024

MERT: Acoustic Music Understanding Model with Large-Scale Self-supervised Training.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Generalized Multi-Source Inference for Text Conditioned Music Diffusion Models.

Proceedings of the IEEE International Conference on Acoustics, 2024

Learning from Taxonomy: Multi-Label Few-Shot Classification for Everyday Sound Recognition.

Jinhua Liang

Proceedings of the IEEE International Conference on Acoustics, 2024

Mertech: Instrument Playing Technique Detection Using Self-Supervised Pretrained Model with Multi-Task Finetuning.

Proceedings of the IEEE International Conference on Acoustics, 2024

Mind the Domain Gap: A Systematic Analysis on Bioacoustic Sound Event Detection.

Proceedings of the 32nd European Signal Processing Conference, 2024

Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model.

Proceedings of the 32nd European Signal Processing Conference, 2024

ChatMusician: Understanding and Generating Music Intrinsically with LLM.

Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023

YourMT3+ restricted datasets (new).

Zamorano-Osorio Javiera

Dataset, October, 2023

YourMT3 dataset (Part 1).

Dataset, October, 2023

PiJAMA: Piano Jazz with Automatic MIDI Annotations.

Drew Edwards

Dataset, September, 2023

MusicNet-16k + EM for YourMT3.

Dataset, April, 2023

Slakh2100-16k for YourMT3.

Dataset, March, 2023

PiJAMA: Piano Jazz with Automatic MIDI Annotations.

Drew Edwards

Trans. Int. Soc. Music. Inf. Retr., January, 2023

The Song Describer Dataset: a Corpus of Audio Captions for Music-and-Language Evaluation.

CoRR, 2023

MERT: Acoustic Music Understanding Model with Large-Scale Self-supervised Training.

CoRR, 2023

Perceptual Musical Similarity Metric Learning with Graph Neural Networks.

Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2023

Leveraging Synthetic Data for Improving Chamber Ensemble Separation.

Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2023

MARBLE: Music Audio Representation Benchmark for Universal Evaluation.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

LyricWhiz: Robust Multilingual Zero-Shot Lyrics Transcription by Whispering to ChatGPT.

Proceedings of the 24th International Society for Music Information Retrieval Conference, 2023

From West to East: Who Can Understand the Music of the Others Better?

Charilaos Papaioannou

Alexandros Potamianos

Proceedings of the 24th International Society for Music Information Retrieval Conference, 2023

On the Effectiveness of Speech Self-Supervised Learning for Music.

Proceedings of the 24th International Society for Music Information Retrieval Conference, 2023

Adapting Language-Audio Models as Few-Shot Audio Learners.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Audio Quality Assessment of Vinyl Music Collections Using Self-Supervised Learning.

Proceedings of the IEEE International Conference on Acoustics, 2023

Learning Music Representations with wav2vec 2.0.

Proceedings of the 31st Irish Conference on Artificial Intelligence and Cognitive Science, 2023

2022

FSD-FS.

Jinhua Liang

Dataset, December, 2022

EnsembleSet.

Mark Sandler

Dataset, May, 2022

Adaptive Scattering Transforms for Playing Technique Recognition.

IEEE ACM Trans. Audio Speech Lang. Process., 2022

Comparison of Feature Extraction Methods for Sound-Based Classification of Honey Bee Activity.

IEEE ACM Trans. Audio Speech Lang. Process., 2022

Humanities and engineering perspectives on music transcription.

Digit. Scholarsh. Humanit., 2022

MAP-Music2Vec: A Simple and Effective Baseline for Self-Supervised Music Audio Representation Learning.

CoRR, 2022

Anomalous behaviour in loss-gradient based interpretability methods.

CoRR, 2022

Deep Conditional Representation Learning for Drum Sample Retrieval by Vocalisation.

CoRR, 2022

EnsembleSet: a new high quality synthesised dataset for chamber ensemble separation.

Mark Sandler

Proceedings of the 23rd International Society for Music Information Retrieval Conference, 2022

Contrastive Audio-Language Learning for Music.

Proceedings of the 23rd International Society for Music Information Retrieval Conference, 2022

Performance MIDI-to-score conversion by neural beat tracking.

Proceedings of the 23rd International Society for Music Information Retrieval Conference, 2022

Exploring Transformer's Potential on Automatic Piano Transcription.

Proceedings of the IEEE International Conference on Acoustics, 2022

Learning Music Audio Representations Via Weak Language Supervision.

Proceedings of the IEEE International Conference on Acoustics, 2022

Improving Lyrics Alignment Through Joint Pitch Detection.

Sebastian Ewert

Proceedings of the IEEE International Conference on Acoustics, 2022

Joint Scattering for Automatic Chick Call Recognition.

Proceedings of the 30th European Signal Processing Conference, 2022

Hypernetworks for Sound event Detection: a Proof-of-Concept.

Proceedings of the 30th European Signal Processing Conference, 2022

Explaining the Decision of Anomalous Sound Detectors.

Proceedings of the 7th Workshop on Detection and Classification of Acoustic Scenes and Events 2022, 2022

Leveraging Label Hierachies for Few-Shot Everyday Sound Recognition.

Jinhua Liang

Proceedings of the 7th Workshop on Detection and Classification of Acoustic Scenes and Events 2022, 2022

2021

ACPAS dataset: Aligned Classical Piano Audio and Score (synthetic subset).

Dataset, October, 2021

ACPAS dataset: Aligned Classical Piano Audio and Score (real recording subset).

Dataset, October, 2021

MuseSyn: A dataset for complete automatic piano music transcription research.

Dataset, February, 2021

Adversarial Unsupervised Domain Adaptation for Harmonic-Percussive Source Separation.

IEEE Signal Process. Lett., 2021

More for Less: Non-Intrusive Speech Quality Assessment with Limited Annotations.

Proceedings of the 13th International Conference on Quality of Multimedia Experience, 2021

Detecting Cover Songs with Pitch Class Key-Invariant Networks.

Ken O'Hanlon

Proceedings of the 2021 IEEE 31st International Workshop on Machine Learning for Signal Processing (MLSP), 2021

Agreement Among Human and Automated Transcriptions of Global Songs.

Proceedings of the 22nd International Society for Music Information Retrieval Conference, 2021

Pitch-Informed Instrument Assignment using a Deep Convolutional Network with Multiple Kernel Shapes.

Proceedings of the 22nd International Society for Music Information Retrieval Conference, 2021

Vocal Harmony Separation Using Time-Domain Neural Networks.

Mark B. Sandler

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

An Evaluation of Data Augmentation Methods for Sound Scene Geotagging.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

MusCaps: Generating Captions for Music Audio.

Proceedings of the International Joint Conference on Neural Networks, 2021

Revisiting the Onsets and Frames Model with Additive Attention.

Proceedings of the International Joint Conference on Neural Networks, 2021

Prototypical Networks for Domain Adaptation in Acoustic Scene Classification.

Proceedings of the IEEE International Conference on Acoustics, 2021

Joint Multi-Pitch Detection and Score Transcription for Polyphonic Piano Music.

Proceedings of the IEEE International Conference on Acoustics, 2021

Violinist identification based on vibrato features.

Proceedings of the 29th European Signal Processing Conference, 2021

2020

Tap & Fiddle: a Dataset with Scandinavian Fiddle Tunes with Accompanying Foot-Tapping.

Dataset, December, 2020

CBFdataset: A Dataset of Chinese Bamboo Flute Performances.

Changhong Wang

Elaine Chew

Dataset, May, 2020

A large joint sound scene and sound event dataset for source separation of foreground sound events.

Daniel Stoller

Dataset, February, 2020

Speech endpoint annotations and artefact details for ASVspoof 2017 version 2.0 dataset.

Dataset, January, 2020

Investigating the Perceptual Validity of Evaluation Metrics for Automatic Piano Music Transcription.

Trans. Int. Soc. Music. Inf. Retr., 2020

Learning and Evaluation Methodologies for Polyphonic Music Sequence Prediction With LSTMs.

IEEE ACM Trans. Audio Speech Lang. Process., 2020

Dataset Artefacts in Anti-Spoofing Systems: A Case Study on the ASVspoof 2017 Benchmark.

IEEE ACM Trans. Audio Speech Lang. Process., 2020

Deep generative variational autoencoding for replay spoof detection in automatic speaker verification.

Tomi Kinnunen

Comput. Speech Lang., 2020

Musical Features for Automatic Music Transcription Evaluation.

CoRR, 2020

Audio Impairment Recognition using a Correlation-Based Feature Representation.

Proceedings of the Twelfth International Conference on Quality of Multimedia Experience, 2020

Subband Modeling for Spoofing Detection in Automatic Speaker Verification.

Tomi Kinnunen

Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020

Development of a Speech Quality Database Under Uncontrolled Conditions.

Marco A. Martínez Ramírez

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Memory Controlled Sequential Self Attention for Sound Recognition.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Reliable Local Explanations for Machine Listening.

Proceedings of the 2020 International Joint Conference on Neural Networks, 2020

The Effect of Spectrogram Reconstruction on Automatic Music Transcription: An Alternative Approach to Improve Transcription Accuracy.

Proceedings of the 25th International Conference on Pattern Recognition, 2020

Playing Technique Recognition by Joint Time-Frequency Scattering.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

A Study on the Transferability of Adversarial Attacks in Sound Event Classification.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Modeling Plate and Spring Reverberation Using A DSP-Informed Deep Neural Network.

Joshua D. Reiss

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

A-CRNN: A Domain Adaptation Model for Sound Event Detection.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019

CBFdataset: A Dataset of Chinese Bamboo Flute Performances.

Changhong Wang

Elaine Chew

Dataset, November, 2019

Audio-Based identification of Beehive states: The dataset.

Dataset, February, 2019

Automatic Music Transcription: An Overview.

IEEE Signal Process. Mag., 2019

Adaptive Noise Reduction for Sound Event Detection Using Subband-Weighted NMF.

Qing Zhou

Zuren Feng

Marco A. Martínez Ramírez

Sensors, 2019

Adversarial Attacks in Sound Event Classification.

CoRR, 2019

A general-purpose deep learning approach to model time-varying audio effects.

Joshua D. Reiss

CoRR, 2019

GAN-based Generation and Automatic Selection of Explanations for Neural Networks.

CoRR, 2019

Polyphonic Sound Event and Sound Activity Detection: A Multi-Task Approach.

Arjun Pankajakshan

Proceedings of the 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2019

Investigating Kernel Shapes and Skip Connections for Deep Learning-Based Harmonic-Percussive Separation.

Proceedings of the 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2019

City Classification from Multiple Real-World Sound Scenes.

Proceedings of the 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2019

Adapting the Quality of Experience Framework for Audio Archive Evaluation.

Proceedings of the 11th International Conference on Quality of Multimedia Experience QoMEX 2019, 2019

A Comparative Study of Neural Models for Polyphonic Music Sequence Transduction.

Daniel Stoller

Proceedings of the 20th International Society for Music Information Retrieval Conference, 2019

Blending Acoustic and Language Model Predictions for Automatic Music Transcription.

Proceedings of the 20th International Society for Music Information Retrieval Conference, 2019

Adaptive Time-Frequency Scattering for Periodic Modulation Recognition in Music Signals.

Proceedings of the 20th International Society for Music Information Retrieval Conference, 2019

Automatic Music Transcription and Ethnomusicology: a User Study.

Proceedings of the 20th International Society for Music Information Retrieval Conference, 2019

Ensemble Models for Spoofing Detection in Automatic Speaker Verification.

Daniel Stoller

Marco A. Martínez Ramírez

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Towards Joint Sound Scene and Polyphonic Sound Event Recognition.

Inês Nolasco

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

SubSpectralNet - Using Sub-spectrogram Based Convolutional Neural Networks for Acoustic Scene Classification.

Sai Samarth R. Phaye

Ye Wang

Proceedings of the IEEE International Conference on Acoustics, 2019

Audio-based Identification of Beehive States.

Proceedings of the IEEE International Conference on Acoustics, 2019

Automatic Transcription of Diatonic Harmonica Recordings.

Proceedings of the IEEE International Conference on Acoustics, 2019

Robustness of Adversarial Attacks in Sound Event Classification.

Vinod Subramanian

Mark B. Sandler

Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events 2019 (DCASE 2019), 2019

Audio Tagging using Linear Noise Modelling Layer.

Arjun Pankajakshan

Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events 2019 (DCASE 2019), 2019

Onsets, Activity, and Events: A Multi-task Approach for Polyphonic Sound Event Modelling.

Arjun Pankajakshan

Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events 2019 (DCASE 2019), 2019

2018

To bee or not to bee: An annotated dataset for beehive sound recognition.

Inês Nolasco

Dataset, July, 2018

Detection and Classification of Acoustic Scenes and Events: Outcome of the DCASE 2016 Challenge.

IEEE ACM Trans. Audio Speech Lang. Process., 2018

Speaker recognition with hybrid features from a deep belief network.

Hazrat Ali

Son Ngoc Tran

Neural Comput. Appl., 2018

A Study On Convolutional Neural Network Based End-To-End Replay Anti-Spoofing.

CoRR, 2018

Optimal Neural Network Feature Selection for Spatial-Temporal Forecasting.

Eurico Covas

CoRR, 2018

Analysing The Predictions Of a CNN-Based Replay Spoofing Detection System.

Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Analysing Replay Spoofing Countermeasure Performance under varied conditions.

Proceedings of the 28th IEEE International Workshop on Machine Learning for Signal Processing, 2018

Polyphonic Music Sequence Transduction with Meter-Constrained LSTM Networks.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Towards Complete Polyphonic Music Transcription: Integrating Multi-Pitch Detection and Rhythm Quantization.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

To bee or not to bee: Investigating machine learning approaches for beehive sound recognition.

Inês Nolasco

Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events, 2018

An extensible cluster-graph taxonomy for open set sound scene analysis.

Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events, 2018

2017

DCASE2016 Challenge Submissions Package.

Dataset, September, 2017

On-Bird Sound Recordings: Automatic Acoustic Recognition of Activities and Contexts.

Dan Stowell

Lisa F. Gill

IEEE ACM Trans. Audio Speech Lang. Process., 2017

Polyphonic Sound Event Tracking Using Linear Dynamical Systems.

IEEE ACM Trans. Audio Speech Lang. Process., 2017

The Digital Music Lab: A Big Data Infrastructure for Digital Musicology.

ACM Journal on Computing and Cultural Heritage, 2017

Sound event detection in synthetic audio: Analysis of the dcase 2016 task results.

Grégoire Lafay

Mathieu Lagrange

Proceedings of the 2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2017

Assessing the Relevance of Onset Information for Note Tracking in Piano Music Transcription.

Jose J. Valero-Mas

José Manuel Iñesta Quereda

Proceedings of the AES International Conference Semantic Audio 2017, 2017

Automatic Transcription of a Cappella recordings from Multiple Singers.

Rodrigo Schramm

Proceedings of the AES International Conference Semantic Audio 2017, 2017

Polyphonic Note and Instrument Tracking Using Linear Dynamical Systems.

Proceedings of the AES International Conference Semantic Audio 2017, 2017

A Study on LSTM Networks for Polyphonic Music Sequence Modelling.

Proceedings of the 18th International Society for Music Information Retrieval Conference, 2017

Multi-Pitch Detection and Voice Assignment for A Cappella Recordings of Multiple Singers.

Proceedings of the 18th International Society for Music Information Retrieval Conference, 2017

On the memory properties of recurrent neural models.

Arthur Jack Russell

Proceedings of the 2017 International Joint Conference on Neural Networks, 2017

2016

An End-to-End Neural Network for Polyphonic Piano Music Transcription.

Siddharth Sigtia

IEEE ACM Trans. Audio Speech Lang. Process., 2016

A Morphological Model for Simulating Acoustic Scenes and Its Application to Sound Event Detection.

IEEE ACM Trans. Audio Speech Lang. Process., 2016

Learning a Feature Space for Similarity in World Music.

Maria Panteli

Proceedings of the 17th International Society for Music Information Retrieval Conference, 2016

The Sousta Corpus: Beat-Informed Automatic Transcription of Traditional Dance Tunes.

Proceedings of the 17th International Society for Music Information Retrieval Conference, 2016

An Attack/Decay Model for Piano Transcription.

Proceedings of the 17th International Society for Music Information Retrieval Conference, 2016

Detection of overlapping acoustic events using a temporally-constrained probabilistic model.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Digital music lab: A framework for analysing big music data.

Proceedings of the 24th European Signal Processing Conference, 2016

2015

Detection and Classification of Acoustic Scenes and Events.

IEEE Trans. Multim., 2015

An End-to-End Neural Network for Polyphonic Music Transcription.

Siddharth Sigtia

CoRR, 2015

An evaluation framework for event detection using a morphological model of acoustic scenes.

CoRR, 2015

An Efficient Temporally-Constrained Probabilistic Model for Multiple-Instrument Music Transcription.

Proceedings of the 16th International Society for Music Information Retrieval Conference, 2015

A hybrid recurrent neural network for music transcription.

Siddharth Sigtia

Nicolas Boulanger-Lewandowski

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Alternate level clustering for drum transcription.

Proceedings of the 23rd European Signal Processing Conference, 2015

2014

Improving Automatic Music Transcription Through Key Detection.

Andreas Jansson

Proceedings of the AES International Conference on Semantic Audio 2014, 2014

Incremental Dataset Definition for Large Scale Musicological Research.

Proceedings of the 1st International Workshop on Digital Libraries for Musicology, 2014

Big Data for Musicology.

Aquiles Alancar-Brayner

Mahendra Mahey

Adam Tovell

Proceedings of the 1st International Workshop on Digital Libraries for Musicology, 2014

An RNN-based Music Language Model for Improving Automatic Music Transcription.

Proceedings of the 15th International Society for Music Information Retrieval Conference, 2014

Template Adaptation for Improving Automatic Music Transcription.

Proceedings of the 15th International Society for Music Information Retrieval Conference, 2014

Learning motion-difference features using Gaussian restricted Boltzmann machines for efficient human action recognition.

Son Ngoc Tran

Proceedings of the 2014 International Joint Conference on Neural Networks, 2014

Improving instrument recognition in polyphonic music through system integration.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2014

Automatic transcription of pitched and unpitched sounds from polyphonic music.

Sebastian Ewert

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2014

2013

Automatic music transcription: challenges and future directions.

J. Intell. Inf. Syst., 2013

Detection and classification of acoustic scenes and events: An IEEE AASP challenge.

Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2013

A Machine Learning Approach to Voice Separation in Lute Tablature.

Reinier de Valk

Proceedings of the 14th International Society for Music Information Retrieval Conference, 2013

Explicit Duration Hidden Markov Models for Multiple-Instrument Polyphonic Music Transcription.

Proceedings of the 14th International Society for Music Information Retrieval Conference, 2013

Automatic Transcription of Turkish Makam Music.

Proceedings of the 14th International Society for Music Information Retrieval Conference, 2013

A database and challenge for acoustic scene classification and event detection.

Proceedings of the 21st European Signal Processing Conference, 2013

2012

A Shift-Invariant Latent Variable Model for Automatic Music Transcription.

Comput. Music. J., 2012

Automatic Music Transcription: Breaking the Glass Ceiling.

Proceedings of the 13th International Society for Music Information Retrieval Conference, 2012

Temporally-Constrained Convolutive Probabilistic Latent Component Analysis for Multi-pitch Detection.

Proceedings of Latent Variable Analysis and Signal Separation, 2012

Score-informed transcription for automatic piano tutoring.

Anssi Klapuri

Proceedings of the 20th European Signal Processing Conference, 2012

2011

Joint Multi-Pitch Detection Using Harmonic Envelope Estimation for Polyphonic Music Transcription.

IEEE J. Sel. Top. Signal Process., 2011

A temporally-constrained convolutive probabilistic model for pitch detection.

Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2011

The Temperament Police: The Truth, the Ground Truth, and Nothing but the Truth.

Dan Tidhar

Proceedings of the 12th International Society for Music Information Retrieval Conference, 2011

Polyphonic music transcription using note onset and offset detection.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2011

2010

Auditory Spectrum-Based Pitched Instrument Onset Detection.

Yannis Stylianou

IEEE Trans. Speech Audio Process., 2010

Non-Negative Tensor Factorization Applied to Music Genre Classification.

IEEE Trans. Speech Audio Process., 2010

Multiple-F0 estimation of piano sounds exploiting spectral structure and temporal evolution.

Proceedings of the ISCA Workshop on Statistical And Perceptual Audition, 2010

2009

Pitched Instrument Onset Detection based on Auditory Spectra.

Yannis Stylianou

Proceedings of the 10th International Society for Music Information Retrieval Conference, 2009

2008

Computationally Efficient and Robust BIC-Based Speaker Segmentation.

Costas Kotropoulos

IEEE Trans. Speech Audio Process., 2008

Music Genre Classification: A Multilinear Approach.

Ioannis Panagakis

Proceedings of ISMIR 2008, 2008

A tensor-based approach for automatic music genre classification.

Proceedings of the 2008 16th European Signal Processing Conference, 2008

Movie Analysis with Emphasis to Dialogue and Action Scene Detection.

Spyridon Siatras

Nikos Nikolaidis

Ioannis Pitas

Proceedings of the Multimodal Processing and Interaction, Audio, Video, Text, 2008

2007

A neural network approach to audio-assisted movie dialogue detection.

Ioannis Pitas

Neurocomputing, 2007

Systematic comparison of BIC-based speaker segmentation systems.

Vassiliki Moschou

Proceedings of the IEEE 9th Workshop on Multimedia Signal Processing, 2007

2006

Automatic speaker change detection with the Bayesian information criterion using MPEG-7 features and a fusion scheme.

Proceedings of the International Symposium on Circuits and Systems (ISCAS 2006), 2006

Musical instrument classification using non-negative matrix factorization algorithms.

Proceedings of the International Symposium on Circuits and Systems (ISCAS 2006), 2006

Automatic Speaker Segmentation using Multiple Features and Distance Measures: A Comparison of Three Approaches.

Proceedings of the 2006 IEEE International Conference on Multimedia and Expo, 2006

Applying Supervised Classifiers Based on Non-negative Matrix Factorization to Musical Instrument Classification.

Proceedings of the 2006 IEEE International Conference on Multimedia and Expo, 2006

Musical Instrument Classification using Non-Negative Matrix Factorization Algorithms and Subset Feature Selection.

Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Testing supervised classifiers based on non-negative matrix factorization to musical instrument classification.
