Timo Gerkmann

Orcid: 0000-0002-8678-4699

Affiliations:
  • University of Hamburg, Germany


According to our database, Timo Gerkmann authored at least 148 papers between 2006 and 2024.

Bibliography

2024
DriftRec: Adapting Diffusion Models to Blind JPEG Restoration.
IEEE Trans. Image Process., 2024

Multi-Channel Speech Separation Using Spatially Selective Deep Non-Linear Filters.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

Diffusion Models for Audio Restoration.
CoRR, 2024

An Analysis of the Variance of Diffusion-based Speech Enhancement.
CoRR, 2024

2023
A neural network-supported two-stage algorithm for lightweight dereverberation on hearing devices.
EURASIP J. Audio Speech Music. Process., December, 2023

Insights Into Deep Non-Linear Filters for Improved Multi-Channel Speech Enhancement.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

Speech Enhancement and Dereverberation With Diffusion-Based Generative Models.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

StoRM: A Diffusion-Based Stochastic Regeneration Model for Speech Enhancement and Dereverberation.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

Integrating Uncertainty Into Neural Network-Based Speech Enhancement.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

Distilling HuBERT with LSTMs via Decoupled Knowledge Distillation.
CoRR, 2023

Single and Few-step Diffusion for Generative Speech Enhancement.
CoRR, 2023

EMOCONV-DIFF: Diffusion-based Speech Emotion Conversion for Non-parallel and In-the-wild Data.
CoRR, 2023

A Flexible Online Framework for Projection-Based STFT Phase Retrieval.
CoRR, 2023

Wind Noise Reduction with a Diffusion-based Stochastic Regeneration Model.
CoRR, 2023

On the Behavior of Intrusive and Non-intrusive Speech Enhancement Metrics in Predictive and Generative Settings.
CoRR, 2023

In-the-wild Speech Emotion Conversion Using Disentangled Self-Supervised Representations and Neural Vocoder-based Resynthesis.
CoRR, 2023

Audio-Visual Speech Enhancement with Score-Based Generative Models.
CoRR, 2023

Audio-Visual Speech Separation in Noisy Environments with a Lightweight Iterative Model.
CoRR, 2023

Leveraging Semantic Information for Efficient Self-Supervised Emotion Recognition with Audio-Textual Distilled Models.
CoRR, 2023

Extending DNN-based Multiplicative Masking to Deep Subband Filtering for Improved Dereverberation.
CoRR, 2023

Reducing the Prior Mismatch of Stochastic Differential Equations for Diffusion-based Speech Enhancement.
CoRR, 2023

Diffusion Posterior Sampling for Informed Single-Channel Dereverberation.
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2023

Acoustic and Visual Knowledge Distillation for Contrastive Audio-Visual Localization.
Proceedings of the 25th International Conference on Multimodal Interaction, 2023

Spatially Selective Deep Non-Linear Filters For Speaker Extraction.
Proceedings of the IEEE International Conference on Acoustics, 2023

Speech Signal Improvement Using Causal Generative Diffusion Models.
Proceedings of the IEEE International Conference on Acoustics, 2023

DiffPhase: Generative Diffusion-Based STFT Phase Retrieval.
Proceedings of the IEEE International Conference on Acoustics, 2023

Analysing Diffusion-based Generative Approaches Versus Discriminative Approaches for Speech Restoration.
Proceedings of the IEEE International Conference on Acoustics, 2023

Partially Adaptive Multichannel Joint Reduction of Ego-Noise and Environmental Noise.
Proceedings of the IEEE International Conference on Acoustics, 2023

Uncertainty Estimation in Deep Speech Enhancement Using Complex Gaussian Mixture Models.
Proceedings of the IEEE International Conference on Acoustics, 2023

2022
DriftRec: Adapting diffusion models to blind image restoration tasks.
CoRR, 2022

End-to-End Label Uncertainty Modeling in Speech Emotion Recognition using Bayesian Neural Networks and Label Distribution Learning.
CoRR, 2022

End-To-End Optimization of Online Neural Network-supported Two-Stage Dereverberation for Hearing Devices.
CoRR, 2022

Phase-Aware Deep Speech Enhancement: It's All About The Frame Length.
CoRR, 2022

Speech Enhancement Regularized by a Speaker Verification Model.
Proceedings of the 24th IEEE International Workshop on Multimedia Signal Processing, 2022

Beyond Griffin-Lim: Improved Iterative Phase Retrieval for Speech.
Proceedings of the 17th International Workshop on Acoustic Signal Enhancement, 2022

Speech Enhancement with Score-Based Generative Models in the Complex STFT Domain.
Proceedings of the Interspeech 2022, 2022

On the Role of Spatial, Spectral, and Temporal Processing for DNN-based Non-linear Multi-channel Speech Enhancement.
Proceedings of the Interspeech 2022, 2022

End-To-End Label Uncertainty Modeling for Speech-based Arousal Recognition Using Bayesian Neural Networks.
Proceedings of the Interspeech 2022, 2022

Efficient Transformer-based Speech Enhancement Using Long Frames and STFT Magnitudes.
Proceedings of the Interspeech 2022, 2022

Neural Network-augmented Kalman Filtering for Robust Online Speech Dereverberation in Noisy Reverberant Environments.
Proceedings of the Interspeech 2022, 2022

Continuous Phoneme Recognition based on Audio-Visual Modality Fusion.
Proceedings of the International Joint Conference on Neural Networks, 2022

Deep Iterative Phase Retrieval for Ptychography.
Proceedings of the IEEE International Conference on Acoustics, 2022

Customizable End-To-End Optimization Of Online Neural Network-Supported Dereverberation For Hearing Devices.
Proceedings of the IEEE International Conference on Acoustics, 2022

Integrating Statistical Uncertainty into Neural Network-Based Speech Enhancement.
Proceedings of the IEEE International Conference on Acoustics, 2022

Label Uncertainty Modeling and Prediction for Speech Emotion Recognition using t-Distributions.
Proceedings of the 10th International Conference on Affective Computing and Intelligent Interaction, 2022

2021
Efficient Joint Estimation of Tracer Distribution and Background Signals in Magnetic Particle Imaging Using a Dictionary Approach.
IEEE Trans. Medical Imaging, 2021

Nonlinear Spatial Filtering in Multichannel Speech Enhancement.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

SNR-Based Features and Diverse Training Data for Robust DNN-Based Speech Enhancement.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

End-to-end label uncertainty modeling for speech emotion recognition using Bayesian neural networks.
CoRR, 2021

Disentanglement Learning for Variational Autoencoders Applied to Audio-Visual Speech Enhancement.
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2021

Speech Separation Using an Asynchronous Fully Recurrent Convolutional Neural Network.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Dictionary-Based Background Signal Estimation For Magnetic Particle Imaging.
Proceedings of the 18th IEEE International Symposium on Biomedical Imaging, 2021

See the Silence: Improving Visual-Only Voice Activity Detection by Optical Flow and RGB Fusion.
Proceedings of the Computer Vision Systems - 13th International Conference, 2021

Variational Autoencoder for Speech Enhancement with a Noise-Aware Encoder.
Proceedings of the IEEE International Conference on Acoustics, 2021

Guided Variational Autoencoder for Speech Enhancement with a Supervised Classifier.
Proceedings of the IEEE International Conference on Acoustics, 2021

Plosive Enhancement Using Phase Linearization and Smoothing.
Proceedings of the 14th ITG Conference on Speech Communication, online, September 29, 2021

Intelligibility Prediction of Speech Reconstructed From Its Magnitude or Phase.
Proceedings of the 14th ITG Conference on Speech Communication, online, September 29, 2021

An Integrated Deep Clustering-Based System for Speaker Count Agnostic Speech Separation.
Proceedings of the 14th ITG Conference on Speech Communication, online, September 29, 2021

Joint Reduction of Ego-noise and Environmental Noise with a Partially-adaptive Dictionary.
Proceedings of the 14th ITG Conference on Speech Communication, online, September 29, 2021

2020
A Survey on Probabilistic Models in Human Perception and Machines.
Frontiers Robotics AI, 2020

Reinforcement Learning with Time-dependent Goals for Robotic Musicians.
CoRR, 2020

Robust Robotic Pouring using Audition and Haptics.
CoRR, 2020

Robust Robotic Pouring using Audition and Haptics.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2020

Speech Enhancement with Stochastic Temporal Convolutional Networks.
Proceedings of the Interspeech 2020, 2020

Improving mix-and-separate training in audio-visual sound source separation with an object prior.
Proceedings of the 25th International Conference on Pattern Recognition, 2020

Nonlinear Spatial Filtering for Multichannel Speech Enhancement in Inhomogeneous Noise Fields.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

A Multi-Phase Gammatone Filterbank for Speech Separation Via TasNet.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Making Sense of Audio Vibration for Liquid Height Estimation in Robotic Pouring.
CoRR, 2019

Making Sense of Audio Vibration for Liquid Height Estimation in Robotic Pouring.
Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2019

On Nonlinear Spatial Filtering in Multichannel Speech Enhancement.
Proceedings of the Interspeech 2019, 2019

Influence of Speaker-Specific Parameters on Speech Separation Systems.
Proceedings of the Interspeech 2019, 2019

An Analysis of Noise-aware Features in Combination with the Size and Diversity of Training Data for DNN-based Speech Enhancement.
Proceedings of the IEEE International Conference on Acoustics, 2019

2018
On the Importance of Super-Gaussian Speech Priors for Machine-Learning Based Speech Enhancement.
IEEE ACM Trans. Audio Speech Lang. Process., 2018

On Speech Enhancement Under PSD Uncertainty.
IEEE ACM Trans. Audio Speech Lang. Process., 2018

Weighted and Multi-Task Loss for Rare Audio Event Detection.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Nonlinear Speech Enhancement Under Speech PSD Uncertainty.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

A Study on the Benefits of Phase-Aware Speech Enhancement in Challenging Noise Scenarios.
Proceedings of the Latent Variable Analysis and Signal Separation, 2018

Robust DNN-Based Speech Enhancement with Limited Training Data.
Proceedings of the 13th ITG Symposium on Speech Communication, 2018

2017
An Analysis of Adaptive Recursive Smoothing with Applications to Noise PSD Estimation.
IEEE ACM Trans. Audio Speech Lang. Process., 2017

Improving the Generalizability of Deep Neural Network Based Speech Enhancement.
CoRR, 2017

DNN and CNN with Weighted and Multi-task Loss Functions for Audio Event Detection.
CoRR, 2017

On the Importance of Super-Gaussian Speech Priors for Pre-Trained Speech Enhancement.
CoRR, 2017

MixMax Approximation as a Super-Gaussian Log-Spectral Amplitude Estimator for Speech Enhancement.
Proceedings of the Interspeech 2017, 2017

2016
On MMSE-Based Estimation of Amplitude and Complex Speech Spectral Coefficients Under Phase-Uncertainty.
IEEE ACM Trans. Audio Speech Lang. Process., 2016

Fundamental Frequency Informed Speech Enhancement in a Flexible Statistical Framework.
IEEE ACM Trans. Audio Speech Lang. Process., 2016

Constrained multi-channel linear prediction for adaptive speech dereverberation.
Proceedings of the IEEE International Workshop on Acoustic Signal Enhancement, 2016

Bias correction methods for adaptive recursive smoothing with applications in noise PSD estimation.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Single-microphone speech enhancement using MVDR filtering and Wiener post-filtering.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Perceptual and instrumental evaluation of the perceived level of reverberation.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Sparse reconstruction of quantized speech signals.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

A Combination of Pre-Trained Approaches and Generic Methods for an Improved Speech Enhancement.
Proceedings of the 12th ITG Symposium on Speech Communication, 2016

Combined Single-Microphone Wiener and MVDR Filtering based on Speech Interframe Correlations and Speech Presence Probability.
Proceedings of the 12th ITG Symposium on Speech Communication, 2016

2015
Noise Power Spectral Density Estimation Using MaxNSR Blocking Matrix.
IEEE ACM Trans. Audio Speech Lang. Process., 2015

Two-Stage Filter-Bank System for Improved Single-Channel Noise Reduction in Hearing Aids.
IEEE ACM Trans. Audio Speech Lang. Process., 2015

Multi-Channel Linear Prediction-Based Speech Dereverberation With Sparse Priors.
IEEE ACM Trans. Audio Speech Lang. Process., 2015

Phase Processing for Single-Channel Speech Enhancement: History and recent advances.
IEEE Signal Process. Mag., 2015

Front-end technologies for robust ASR in reverberant environments - spectral enhancement-based dereverberation and auditory modulation filterbank features.
EURASIP J. Adv. Signal Process., 2015

Combination of MVDR beamforming and single-channel spectral processing for enhancing noisy and reverberant speech.
EURASIP J. Adv. Signal Process., 2015

On the bias of adaptive first-order recursive smoothing.
Proceedings of the 2015 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2015

MMSE-optimal combination of Wiener filtering and harmonic model based speech enhancement in a general framework.
Proceedings of the 2015 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2015

Group sparsity for MIMO speech dereverberation.
Proceedings of the 2015 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2015

Least squares estimate of the initial phases in STFT based speech enhancement.
Proceedings of the INTERSPEECH 2015, 2015

Cepstral noise subtraction for robust automatic speech recognition.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Multi-channel PSD estimators for speech dereverberation - A theoretical and experimental comparison.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Utilizing spectro-temporal correlations for an improved speech presence probability based noise power estimation.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Multi-channel linear prediction-based speech dereverberation with low-rank power spectrogram approximation.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Late reverberant spectral variance estimation using acoustic channel equalization.
Proceedings of the 23rd European Signal Processing Conference, 2015

2014
Bayesian Estimation of Clean Speech Spectral Coefficients Given a Priori Knowledge of the Phase.
IEEE Trans. Signal Process., 2014

STFT phase reconstruction in voiced speech for an improved single-channel speech enhancement.
IEEE ACM Trans. Audio Speech Lang. Process., 2014

Subjective speech quality and speech intelligibility evaluation of single-channel dereverberation algorithms.
Proceedings of the 14th International Workshop on Acoustic Signal Enhancement, 2014

Generalization of supervised learning for binary mask estimation.
Proceedings of the 14th International Workshop on Acoustic Signal Enhancement, 2014

Single channel noise reduction based on an auditory filterbank.
Proceedings of the 14th International Workshop on Acoustic Signal Enhancement, 2014

Speech dereverberation with convolutive transfer function approximation using MAP and variational deconvolution approaches.
Proceedings of the 14th International Workshop on Acoustic Signal Enhancement, 2014

A study on speech quality and speech intelligibility measures for quality assessment of single-channel dereverberation algorithms.
Proceedings of the 14th International Workshop on Acoustic Signal Enhancement, 2014

A posteriori speech presence probability estimation based on averaged observations and a super-Gaussian speech model.
Proceedings of the 14th International Workshop on Acoustic Signal Enhancement, 2014

A posteriori voiced/unvoiced probability estimation based on a sinusoidal model.
Proceedings of the IEEE International Conference on Acoustics, 2014

Frequency-domain single-channel inverse filtering for speech dereverberation: Theory and practice.
Proceedings of the IEEE International Conference on Acoustics, 2014

MMSE-optimal enhancement of complex speech coefficients with uncertain prior knowledge of the clean speech phase.
Proceedings of the IEEE International Conference on Acoustics, 2014

Speech dereverberation with multi-channel linear prediction and sparse priors for the desired signal.
Proceedings of the 4th Joint Workshop on Hands-free Speech Communication and Microphone Arrays, 2014

A speech presence probability estimator based on fixed priors and a heavy-tailed speech model.
Proceedings of the 22nd European Signal Processing Conference, 2014

Efficient Multi-Channel Acoustic Echo Cancellation Using Constrained Sparse Filter Updates in the Subband Domain.
Proceedings of the 11th ITG Symposium on Speech Communication, 2014

2013
DFT-Domain Based Single-Microphone Noise Reduction for Speech Enhancement.
Synthesis Lectures on Speech and Audio Processing, Morgan & Claypool Publishers, ISBN: 978-3-031-02564-8, 2013

MMSE-Optimal Spectral Amplitude Estimation Given the STFT-Phase.
IEEE Signal Process. Lett., 2013

Privacy-preserving distributed speech enhancement for wireless sensor networks by processing in the encrypted domain.
Proceedings of the IEEE International Conference on Acoustics, 2013

On the relation between speech corruption models in the spectral and the cepstral domain.
Proceedings of the IEEE International Conference on Acoustics, 2013

Phase-sensitive real-time capable speech enhancement under voiced-unvoiced uncertainty.
Proceedings of the 21st European Signal Processing Conference, 2013

Privacy preserving distributed beamforming based on homomorphic encryption.
Proceedings of the 21st European Signal Processing Conference, 2013

2012
Noise Correlation Matrix Estimation for Multi-Microphone Speech Enhancement.
IEEE Trans. Speech Audio Process., 2012

Unbiased MMSE-Based Noise Power Estimation With Low Complexity and Low Tracking Delay.
IEEE Trans. Speech Audio Process., 2012

Noise PSD Estimation Using Blind Source Separation in a Diffuse Noise Field.
Proceedings of the IWAENC 2012 - International Workshop on Acoustic Signal Enhancement, Proceedings, RWTH Aachen University, Germany, September 4th, 2012

STFT Phase Improvement for Single Channel Speech Enhancement.
Proceedings of the IWAENC 2012 - International Workshop on Acoustic Signal Enhancement, Proceedings, RWTH Aachen University, Germany, September 4th, 2012

Improved MMSE-based noise PSD tracking using temporal cepstrum smoothing.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

2011
A new linear MMSE filter for single channel speech enhancement based on Nonnegative Matrix Factorization.
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2011

Noise power estimation based on the probability of speech presence.
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2011

Blind source separation of nondisjoint sources in the time-frequency domain with model-based determination of source contribution.
Proceedings of the 2011 IEEE International Symposium on Signal Processing and Information Technology, 2011

A new approach for speech enhancement based on a constrained Nonnegative Matrix Factorization.
Proceedings of the International Symposium on Intelligent Signal Processing and Communications Systems, 2011

Estimation of the noise correlation matrix.
Proceedings of the IEEE International Conference on Acoustics, 2011

Cepstral weighting for speech dereverberation without musical noise.
Proceedings of the 19th European Signal Processing Conference, 2011

2010
Speech presence probability estimation based on temporal cepstrum smoothing.
Proceedings of the IEEE International Conference on Acoustics, 2010

Musical genre classification based on a highly-resolved cepstral modulation spectrum.
Proceedings of the 18th European Signal Processing Conference, 2010

Cepstral Smoothing with Reduced Computational Complexity.
Proceedings of the 9. ITG-Fachtagung Sprachkommunikation 2010, 2010

2009
On the statistics of spectral amplitudes after variance reduction by temporal cepstrum smoothing and cepstral nulling.
IEEE Trans. Signal Process., 2009

Multi-microphone maximum a posteriori fundamental frequency estimation in the cepstral domain.
Proceedings of the IEEE International Conference on Acoustics, 2009

2008
Improved A Posteriori Speech Presence Probability Estimation Based on a Likelihood Ratio With Fixed Priors.
IEEE Trans. Speech Audio Process., 2008

A novel a priori SNR estimation approach based on selective cepstro-temporal smoothing.
Proceedings of the IEEE International Conference on Acoustics, 2008

2007
Cepstral Smoothing of Spectral Filter Gains for Speech Enhancement Without Musical Noise.
IEEE Signal Process. Lett., 2007

2006
Soft decision combining for dual channel noise reduction.
Proceedings of the INTERSPEECH 2006, 2006

Statistical Inference of Missing Speech Data in the ICA Domain.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

