Jonathan Le Roux

Reinhold Haeb-Umbach

IEEE ACM Trans. Audio Speech Lang. Process., 2024

SMITIN: Self-Monitored Inference-Time INtervention for Generative Music Transformers.

CoRR, 2024

TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement.

Proceedings of the 18th International Workshop on Acoustic Signal Enhancement, 2024

Disentangled Acoustic Fields For Multimodal Physical Scene Understanding.

Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2024

Enhanced Reverberation as Supervision for Unsupervised Speech Separation.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

PARIS: Pseudo-AutoRegressIve Siamese Training for Online Speech Separation.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

ZeroST: Zero-Shot Speech Translation.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Sound Event Bounding Boxes.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Speech dereverberation constrained on room impulse response characteristics.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Improving Audio Captioning Models with Fine-Grained Audio Features, Text Embedding Supervision, and LLM Mix-Up Augmentation.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

Late Audio-Visual Fusion for in-the-Wild Speaker Diarization.

Zexu Pan

François G. Germain

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

NeuroHeed+: Improving Neuro-Steered Speaker Extraction with Joint Auditory Attention Detection.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

NIIRF: Neural IIR Filter Field for HRTF Upsampling and Personalization.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

GLA-GRAD: A Griffin-Lim Extended Waveform Generation Diffusion Model.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

Why Does Music Source Separation Benefit from Cacophony?

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

WI-FI based Indoor Monitoring Enhanced by Multimodal Fusion.

Pu Wang

Mahbub Rahman

Cristian J. Vaca-Rubio

Sameer Khurana

Anoop Cherian

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

Generation or Replication: Auscultating Audio Latent Diffusion Models.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

SpecDiff-GAN: A Spectrally-Shaped Noise Diffusion GAN for Speech and Music Synthesis.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

RILA: Reflective and Imaginative Language Agent for Zero-Shot Semantic Audio-Visual Navigation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023

Librispeech Slakh Unmix (LSX).

Darius Petermann

Dataset, March, 2023

STFT-Domain Neural Speech Enhancement With Very Low Algorithmic Latency.

IEEE ACM Trans. Audio Speech Lang. Process., 2023

Tackling the Cocktail Fork Problem for Separation and Transcription of Real-World Soundtracks.

Darius Petermann

Alexander L. Stempkovskiy

IEEE ACM Trans. Audio Speech Lang. Process., 2023

The Sound Demixing Challenge 2023 - Cinematic Demixing Track.

Tatiana Habruseva

Mikhail Sukhovei

Yuki Mitsufuji

CoRR, 2023

Pac-HuBERT: Self-Supervised Music Source Separation via Primitive Auditory Clustering and Hidden-Unit BERT.

CoRR, 2023

Location as Supervision for Weakly Supervised Multi-Channel Source Separation of Machine Sounds.

Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2023

Hyperbolic Unsupervised Anomalous Sound Detection.

François G. Germain

Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2023

Style-transfer based Speech and Audio-visual Scene understanding for Robot Action Sequence Acquisition from Videos.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Cold Diffusion for Speech Enhancement.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Optimal Condition Training for Target Source Separation.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Hyperbolic Audio Source Separation.

Darius Petermann

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Paᗧ-HuBERT: Self-Supervised Music Source Separation Via Primitive Auditory Clustering And Hidden-Unit BERT.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Latent Iterative Refinement for Modular Source Separation.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Reverberation as Supervision For Speech Separation.

Rohith Aralikatti

Christoph Böddeker

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Scenario-Aware Audio-Visual TF-Gridnet for Target Speech Extraction.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022

Momentum Pseudo-Labeling: Semi-Supervised ASR With Continuously Improving Pseudo-Labels.

IEEE J. Sel. Top. Signal Process., 2022

Towards End-to-end Speaker Diarization in the Wild.

Zexu Pan

François G. Germain

CoRR, 2022

Heterogeneous Target Speech Separation.

Efthymios Tzinis

Paris Smaragdis

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Low-Latency Online Streaming VideoQA Using Audio-Visual Transformers.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Locate This, Not That: Class-Conditioned Sound Event DOA Estimation.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

Audio-Visual Scene-Aware Dialog and Reasoning Using Audio-Visual Transformers with Joint Student-Teacher Learning.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

The Cocktail Fork Problem: Three-Stem Audio Separation for Real-World Soundtracks.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

Sequence Transduction with Graph-Based Supervision.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

Advancing Momentum Pseudo-Labeling with Conformer and Initialization Strategy.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

Extended Graph Temporal Classification for Multi-Speaker End-to-End ASR.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

Improved Domain Generalization via Disentangled Multi-Task Learning in Unsupervised Anomalous Sound Detection.

Satvik Venkatesh

Proceedings of the 7th Workshop on Detection and Classification of Acoustic Scenes and Events, 2022

(2.5+1)D Spatio-Temporal Scene Graphs for Video Question Answering.

Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021

Convolutive Prediction for Monaural Speech Dereverberation and Noisy-Reverberant Speaker Separation.

IEEE ACM Trans. Audio Speech Lang. Process., 2021

On the Compensation Between Magnitude and Phase in Speech Separation.

IEEE Signal Process. Lett., 2021

Leveraging Low-Distortion Target Estimates for Improved Speech Enhancement.

CoRR, 2021

Optimizing Latency for Online Video Captioning Using Audio-Visual Transformers.

CoRR, 2021

Anomalous Sound Detection Using Attentive Neural Processes.

Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2021

Convolutive Prediction for Reverberant Speech Separation.

Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2021

Dual Causal/Non-Causal Self-Attention for Streaming End-to-End Speech Recognition.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30 - September 3, 2021

Advanced Long-Context End-to-End Speech Recognition Using Context-Expanded Transformers.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30 - September 3, 2021

Optimizing Latency for Online Video Captioning Using Audio-Visual Transformers.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30 - September 3, 2021

Momentum Pseudo-Labeling for Semi-Supervised Speech Recognition.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30 - September 3, 2021

Visual Scene Graphs for Audio Source Separation.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Semi-Supervised Speech Recognition Via Graph-Based Temporal Classification.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

Capturing Multi-Resolution Context by Dilated Self-Attention.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

Unsupervised Domain Adaptation for Speech Recognition via Uncertainty Driven Self-Training.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

Transcription Is All You Need: Learning To Separate Musical Mixtures With Score As Supervision.

Yun-Ning Hung

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

Dynamic Graph Representation Learning for Video Dialog via Multi-Modal Shuffled Transformers.

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020

Finding Strength in Weakness: Learning to Separate Sounds With Weak Supervision.

Fatemeh Pishdadian

IEEE ACM Trans. Audio Speech Lang. Process., 2020

Multi-Pass Transformer for Machine Translation.

CoRR, 2020

Spatio-Temporal Scene Graphs for Video Dialog.

CoRR, 2020

Autoclip: Adaptive Gradient Clipping for Source Separation Networks.

Proceedings of the 30th IEEE International Workshop on Machine Learning for Signal Processing, 2020

Hierarchical Musical Instrument Separation.

Ethan Manilow

Proceedings of the 21st International Society for Music Information Retrieval Conference, 2020

All-in-One Transformer: Unifying Speech Recognition, Audio Tagging, and Event Detection.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Detecting Audio Attacks on ASR Systems with Dropout Uncertainty.

Tejas Jayashankar

Pierre Moulin

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Transformer-Based Long-Context End-to-End Speech Recognition.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Unsupervised Speaker Adaptation Using Attention-Based Speaker Memory for End-to-End ASR.

Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

Learning to Separate Sounds from Weakly Labeled Scenes.

Fatemeh Pishdadian

Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

Streaming Automatic Speech Recognition with the Transformer Model.

Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

WHAMR!: Noisy and Reverberant Single-Channel Speech Separation.

Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

End-To-End Multi-Speaker Speech Recognition With Transformer.

Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

2019

Phasebook and Friends: Leveraging Discrete Representations for Source Separation.

IEEE J. Sel. Top. Signal Process., 2019

Bootstrapping deep music separation from primitive auditory grouping principles.

CoRR, 2019

Cutting Music Source Separation Some Slakh: A Dataset to Study the Impact of Training Data Quality and Quantity.

Proceedings of the 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2019

Universal Sound Separation.

Proceedings of the 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2019

WHAM!: Extending Speech Separation to Noisy Environments.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

End-to-End Multilingual Multi-Speaker Speech Recognition.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Vectorized Beam Search for CTC-Attention-Based Speech Recognition.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Unidirectional Neural Network Architectures for End-to-End Automatic Speech Recognition.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Class-conditional Embeddings for Music Source Separation.

Prem Seetharaman

Shrikant Venkataramani

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

Bootstrapping Single-channel Source Separation via Unsupervised Spatial Clustering on Stereo Mixtures.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

The Phasebook: Building Complex Masks via Discrete Representations for Source Separation.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

SDR - Half-baked or Well Done?

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

Triggered Attention for End-to-end Speech Recognition.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

Cycle-consistency Training for End-to-end Speech Recognition.

Ramón Fernandez Astudillo

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

Teacher-student Deep Clustering for Low-delay Single Channel Speech Separation.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

Streaming End-to-End Speech Recognition with Joint CTC-Attention Based Models.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

MIMO-Speech: End-to-End Multi-Channel Multi-Speaker Speech Recognition.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018

Phase Reconstruction with Learned Time-Frequency Representations for Single-Channel Speech Separation.

Proceedings of the 16th International Workshop on Acoustic Signal Enhancement, 2018

End-to-End Speech Separation with Unfolded Iterative Phase Reconstruction.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Alternative Objective Functions for Deep Clustering.

Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing, 2018

Multi-Channel Deep Clustering: Discriminative Spectral and Spatial Embeddings for Speaker-Independent Speech Separation.

Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing, 2018

End-to-End Multi-Speaker Speech Recognition.

Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing, 2018

An End-to-End Language-Tracking Speech Recognizer for Mixed-Language Speech.

Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing, 2018

A Purely End-to-End System for Multi-speaker Speech Recognition.

Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018

2017

Duration-Controlled LSTM for Polyphonic Sound Event Detection.

IEEE ACM Trans. Audio Speech Lang. Process., 2017

Prior-based Binary Masking and Discriminative Methods for Reverberant and Noisy Speech Recognition Using Distant Stereo Microphones.

J. Inf. Process., 2017

Multi-microphone speech recognition integrating beamforming, robust feature extraction, and advanced DNN/RNN backend.

Comput. Speech Lang., 2017

Consistent anisotropic Wiener filtering for audio source separation.

Paul Magron

Tuomas Virtanen

Proceedings of the 2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2017

Coupled Initialization of Multi-Channel Non-Negative Matrix Factorization Based on Spatial and Spectral Information.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Student-teacher network learning with enhanced features.

Proceedings of the 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, 2017

Deep clustering and conventional networks for music separation: Stronger together.

Proceedings of the 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, 2017

BLSTM-HMM hybrid system combined with sound activity detection network for polyphonic Sound Event Detection.

Proceedings of the 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, 2017

Novel Deep Architectures in Speech Processing.

New Era for Robust Speech Recognition, Exploiting Deep Learning, 2017

Deep Recurrent Networks for Separation and Recognition of Single-Channel Speech in Nonstationary Background Audio.

New Era for Robust Speech Recognition, Exploiting Deep Learning, 2017

2016

Dialog state tracking with attention-based sequence-to-sequence learning.

Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016

Full-Capacity Unitary Recurrent Neural Networks.

Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Single-Channel Multi-Speaker Separation Using Deep Clustering.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Improved MVDR Beamforming Using Single-Channel Mask Prediction Networks.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Deep unfolding for multichannel source separation.

Proceedings of the 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, 2016

Deep clustering: Discriminative embeddings for segmentation and separation.

Proceedings of the 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, 2016

Bidirectional LSTM-HMM Hybrid System for Polyphonic Sound Event Detection.

Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events, 2016

2015

Phase Processing for Single-Channel Speech Enhancement: History and recent advances.

Timo Gerkmann

Martin Krawczyk-Becker

IEEE Signal Process. Mag., 2015

Micbots: Collecting large realistic datasets for speech and audio research using mobile robots.

Proceedings of the 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, 2015

Deep NMF for speech separation.

Felix Weninger

Proceedings of the 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, 2015

Phase-sensitive and recognition-boosted speech separation using deep recurrent neural networks.

Proceedings of the 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, 2015

Speech Enhancement with LSTM Recurrent Neural Networks and its Application to Noise-Robust ASR.

Proceedings of the International Conference on Latent Variable Analysis and Signal Separation, 2015

The MERL/SRI system for the 3RD CHiME challenge using beamforming, robust feature extraction, and advanced speech recognition.

Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

2014

Deep Unfolding: Model-Based Inspiration of Novel Deep Architectures.

Felix Weninger

CoRR, 2014

Discriminative NMF and its application to single-channel source separation.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Sequential maximum mutual information linear discriminant analysis for speech recognition.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Black box optimization for automatic speech recognition.

Shinji Watanabe

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2014

Non-negative source-filter dynamical system for speech enhancement.

Umut Simsekli

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2014

Ensemble integration of calibrated speaker localization and statistical speech detection in domestic environments.

Proceedings of the 4th Joint Workshop on Hands-free Speech Communication and Microphone Arrays, 2014

Discriminatively trained recurrent neural networks for single-channel speech separation.

Proceedings of the 2014 IEEE Global Conference on Signal and Information Processing, 2014

Sequence discriminative training for low-rank deep neural networks.

Proceedings of the 2014 IEEE Global Conference on Signal and Information Processing, 2014

2013

Consistent Wiener Filtering for Audio Source Separation.

Emmanuel Vincent

IEEE Signal Process. Lett., 2013

Block Coordinate Descent for Sparse NMF.

Proceedings of the 1st International Conference on Learning Representations, 2013

Hierarchical and coupled non-negative dynamical systems with application to audio modeling.

Umut Simsekli

Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2013

Ensemble learning for speech enhancement.

Shinji Watanabe

Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2013

Statistical Dialogue Management using Intention Dependency Graph.

Proceedings of the Sixth International Joint Conference on Natural Language Processing, 2013

The second 'CHiME' speech separation and recognition challenge: Datasets, tasks and baselines.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2013

Source localization in reverberant environments using sparse optimization.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2013

Non-negative dynamical system with application to speech and audio.

Cédric Févotte

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2013

The second 'CHiME' speech separation and recognition challenge: An overview of challenge systems and outcomes.

Proceedings of the 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, 2013

A generalized discriminative training framework for system combination.

Proceedings of the 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, 2013

2012

Indirect model-based speech enhancement.

Proceedings of the 2012 IEEE International Conference on Acoustics, Speech and Signal Processing, 2012

Factorial Models for Noise Robust Speech Recognition.

Steven J. Rennie

Techniques for Noise Robustness in Automatic Speech Recognition, 2012

2011

Computational auditory induction as a missing-data model-fitting problem with Bregman divergence.

Speech Commun., 2011

Bayesian nonparametric spectrogram modeling based on infinite factorial infinite hidden Markov model.

Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2011

Infinite-state spectrum model for music signal analysis.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2011

2010

Harmonic and Percussive Sound Separation and Its Application to MIR-Related Tasks.

Advances in Music Information Retrieval, 2010

A statistical model of speech F0 contours.

Hirokazu Kameoka

Yasunori Ohishi

Proceedings of the ISCA Workshop on Statistical And Perceptual Audition, 2010

Consistent Wiener Filtering: Generalized Time-Frequency Masking Respecting Spectrogram Consistency.

Proceedings of the International Conference on Latent Variable Analysis and Signal Separation, 2010

Nonnegative Matrix Factorization with Markov-Chained Bases for Modeling Time-Varying Patterns in Music Spectrograms.

Proceedings of the International Conference on Latent Variable Analysis and Signal Separation, 2010

Statistical Model of Speech Signals Based on Composite Autoregressive System with Application to Blind Source Separation.

Proceedings of the International Conference on Latent Variable Analysis and Signal Separation, 2010

2008

Adaptive Template Matching with Shift-Invariant Semi-NMF.

Alain de Cheveigné

Lucas C. Parra

Advances in Neural Information Processing Systems 21, 2008

Explicit consistency constraints for STFT spectrograms and their application to phase reconstruction.

Nobutaka Ono

Shigeki Sagayama

Proceedings of the ISCA Tutorial and Research Workshop on Statistical and Perceptual Audition, 2008

Computational auditory induction by missing-data non-negative matrix factorization.

Proceedings of the ISCA Tutorial and Research Workshop on Statistical and Perceptual Audition, 2008

Modulation analysis of speech through orthogonal FIR filterbank optimization.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2008

Separation of a monaural audio signal into harmonic/percussive components by complementary diffusion on spectrogram.

Proceedings of the 16th European Signal Processing Conference, 2008

2007

Single and Multiple F0 Contour Estimation Through Parametric Spectrogram Modeling of Speech in Noisy Environments.

IEEE Trans. Speech Audio Process., 2007

Discriminative Training for Large-Vocabulary Speech Recognition Using Minimum Classification Error.

IEEE Trans. Speech Audio Process., 2007

Harmonic-Temporal Clustering of Speech for Single and Multiple F0 Contour Estimation in Noisy Environments.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2007

MEG Signal Denoising Based on Time-Shift PCA.

Alain de Cheveigné

Jonathan Z. Simon

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2007

2006

Speech analyzer using a joint estimation model of spectral envelope and fine structure.

Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

2005

Optimization methods for discriminative training.
