Mirco Ravanelli

Orcid: 0000-0002-3929-5526

According to our database, Mirco Ravanelli authored at least 74 papers between 2012 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Listenable Maps for Audio Classifiers.
CoRR, 2024

SKILL: Similarity-aware Knowledge distILLation for Speech Self-Supervised Learning.
CoRR, 2024

Focal Modulation Networks for Interpretable Sound Classification.
CoRR, 2024

Bayesian Deep Learning for Remaining Useful Life Estimation via Stein Variational Gradient Descent.
CoRR, 2024

Are LLMs Robust for Spoken Dialogues?
CoRR, 2024

2023
Exploring Self-Attention Mechanisms for Speech Separation.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

TorchAudio 2.1: Advancing speech recognition, self-supervised learning, and audio processing components for PyTorch.
CoRR, 2023

CL-MASR: A Continual Learning Benchmark for Multilingual ASR.
CoRR, 2023

Audio Editing with Non-Rigid Text Prompts.
CoRR, 2023

Towards Foundational Models for Molecular Learning on Large-Scale Multi-Task Datasets.
CoRR, 2023

Speech Self-Supervised Representations Benchmarking: a Case for Larger Probing Heads.
CoRR, 2023

Generalization Limits of Graph Neural Networks in Identity Effects Learning.
CoRR, 2023

Speech Emotion Diarization: Which Emotion Appears When?
CoRR, 2023

Speech Self-Supervised Representation Benchmarking: Are We Doing it Right?
CoRR, 2023

Posthoc Interpretation via Quantization.
CoRR, 2023

Fine-Tuning Strategies for Faster Inference Using Speech Self-Supervised Models: A Comparative Study.
Proceedings of the IEEE International Conference on Acoustics, 2023

Simulated Annealing in Early Layers Leads to Better Generalization.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Speech Emotion Diarization: Which Emotion Appears When?
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

RescueSpeech: A German Corpus for Speech Recognition in Search and Rescue Domain.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

TorchAudio 2.1: Advancing Speech Recognition, Self-Supervised Learning, and Audio Processing Components for Pytorch.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022
Learning Representations for New Sound Classes With Continual Self-Supervised Learning.
IEEE Signal Process. Lett., 2022

Resource-Efficient Separation Transformer.
CoRR, 2022

On Using Transformers for Speech-Separation.
CoRR, 2022

OSSEM: one-shot speaker adaptive speech enhancement using meta learning.
Proceedings of the Interspeech 2022, 2022

SoundChoice: Grapheme-to-Phoneme Models with Semantic Disambiguation.
Proceedings of the Interspeech 2022, 2022

Real-M: Towards Speech Separation on Real Mixtures.
Proceedings of the IEEE International Conference on Acoustics, 2022

MetricGAN-U: Unsupervised Speech Enhancement/ Dereverberation Based Only on Noisy/ Reverberated Speech.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
SpeechBrain: A General-Purpose Speech Toolkit.
CoRR, 2021

Transformers with Competitive Ensembles of Independent Mechanisms.
CoRR, 2021

Timers and Such: A Practical Benchmark for Spoken Language Understanding with Numbers.
Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, 2021

The Energy and Carbon Footprint of Training End-to-End Speech Recognizers.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

MetricGAN+: An Improved Version of MetricGAN for Speech Enhancement.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

ECAPA-TDNN Embeddings for Speaker Diarization.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Attention Is All You Need In Speech Separation.
Proceedings of the IEEE International Conference on Acoustics, 2021

Interpretable SincNet-based Deep Learning for Emotion Recognition from EEG brain activity.
Proceedings of the 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2021

2020
BIRD: Big Impulse Response Dataset.
CoRR, 2020

Towards Unsupervised Learning of Speech Representations.
Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020

Quaternion Neural Networks for Multi-Channel Distant Speech Recognition.
Proceedings of the Interspeech 2020, 2020

Multi-Task Self-Supervised Learning for Robust Speech Recognition.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Using Speech Synthesis to Train End-To-End Spoken Language Understanding Models.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Learning Speaker Representations with Mutual Information.
Proceedings of the Interspeech 2019, 2019

Learning Problem-Agnostic Speech Representations from Multiple Self-Supervised Tasks.
Proceedings of the Interspeech 2019, 2019

Speech Model Pre-Training for End-to-End Spoken Language Understanding.
Proceedings of the Interspeech 2019, 2019

Quaternion Recurrent Neural Networks.
Proceedings of the 7th International Conference on Learning Representations, 2019

The PyTorch-Kaldi Speech Recognition Toolkit.
Proceedings of the IEEE International Conference on Acoustics, 2019

2018
Light Gated Recurrent Units for Speech Recognition.
IEEE Trans. Emerg. Top. Comput. Intell., 2018

Automatic context window composition for distant speech recognition.
Speech Commun., 2018

Speech and Speaker Recognition from Raw Waveform with SincNet.
CoRR, 2018

Interpretable Convolutional Filters with SincNet.
CoRR, 2018

Speech recognition with quaternion neural networks.
CoRR, 2018

Speaker Recognition from Raw Waveform with SincNet.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Twin Regularization for Online Speech Recognition.
Proceedings of the Interspeech 2018, 2018

2017
Deep Learning for Distant Speech Recognition.
PhD thesis, 2017

Deep Learning for Distant Speech Recognition.
CoRR, 2017

The DIRHA-English corpus and related tasks for distant-speech recognition in domestic environments.
CoRR, 2017

Improving Speech Recognition by Revising Gated Recurrent Units.
Proceedings of the Interspeech 2017, 2017

A network of deep neural networks for Distant Speech Recognition.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

2016
Batch-normalized joint training for DNN-based distant speech recognition.
Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016

Realistic Multi-Microphone Data Simulation for Distant Speech Recognition.
Proceedings of the Interspeech 2016, 2016

2015
Insights into Audio-Based Multimedia Event Classification with Neural Networks.
Proceedings of the 2015 Workshop on Community-Organized Multimodal Mining: Opportunities for Novel Solutions, 2015

Contaminated speech training methods for robust DNN-HMM distant speech recognition.
Proceedings of the INTERSPEECH 2015, 2015

A multi-channel corpus for distant-speech interaction in presence of known interferences.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

The DIRHA-ENGLISH corpus and related tasks for distant-speech recognition in domestic environments.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

2014
The DIRHA simulated corpus.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

TANDEM-bottleneck feature combination using hierarchical Deep Neural Networks.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

On the selection of the impulse responses for distant-speech recognition based on contaminated speech training.
Proceedings of the INTERSPEECH 2014, 2014

The DIRHA-GRID corpus: baseline and tools for multi-room distant speech recognition using distributed microphones.
Proceedings of the INTERSPEECH 2014, 2014

Audio-concept features and hidden Markov models for multimedia event detection.
Proceedings of the 2nd International Workshop on Speech, Language and Audio in Multimedia, 2014

A speech event detection and localization task for multiroom environments.
Proceedings of the 4th Joint Workshop on Hands-free Speech Communication and Microphone Arrays, 2014

Audio concept classification with Hierarchical Deep Neural Networks.
Proceedings of the 22nd European Signal Processing Conference, 2014

2013
Embedding speech recognition to control lights.
Proceedings of the INTERSPEECH 2013, 2013

Audio Concept Ranking for Video Event Detection on User-Generated Content.
Proceedings of the First Workshop on Speech, 2013

2012
Impulse response estimation for robust speech recognition in a reverberant environment.
Proceedings of the 20th European Signal Processing Conference, 2012