Niko Moritz

According to our database, Niko Moritz authored at least 49 papers between 2011 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

GSRM: Generative Speech Reward Model for Speech RLHF.

[DOI]

,

Tejas Jayashankar

,

,

,

,

Katerina Zmolíková

,

,

,

,

,

Gregory W. Wornell

,

,

CoRR, February, 2026

2025

Overcoming Latency Bottlenecks in On-Device Speech Translation: A Cascaded Approach with Alignment-Based Streaming MT.

[DOI]

,

,

,

,

,

,

,

Christian Fuegen

CoRR, August, 2025

Non-Monotonic Attention-based Read/Write Policy Learning for Simultaneous Translation.

[DOI]

,

,

,

Rastislav Rabatin

,

,

,

,

,

Christian Fuegen

CoRR, March, 2025

Textless Streaming Speech-to-Speech Translation using Semantic Speech Tokens.

[DOI]

,

,

,

,

,

Katerina Zmolíková

,

,

,

,

Christian Fuegen

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

M-BEST-RQ: A Multi-Channel Speech Foundation Model for Smart Glasses.

[DOI]

,

,

,

,

,

,

,

,

,

,

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Transcribing and Translating, Fast and Slow: Joint Speech Translation and Recognition.

[DOI]

,

,

,

,

,

,

,

Christian Fuegen

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Directional Source Separation for Robust Speech Recognition on Smart Glasses.

[DOI]

,

,

,

,

Kaustubh Kalgaonkar

,

,

,

,

,

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Transducer-Llama: Integrating LLMs into Streamable Transducer-based Speech Recognition.

[DOI]

,

,

,

,

Philip C. Woodland

,

,

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

2024

AGADIR: Towards Array-Geometry Agnostic Directional Speech Recognition.

[DOI]

,

,

,

,

,

Christian Fuegen

,

Proceedings of the IEEE International Conference on Acoustics, 2024

Effective Internal Language Model Training and Fusion for Factorized Transducer Model.

[DOI]

,

,

,

,

,

,

,

Christian Fuegen

,

Proceedings of the IEEE International Conference on Acoustics, 2024

2023

SynthVSR: Scaling Up Visual Speech Recognition With Synthetic Supervision.

[DOI]

,

,

Konstantinos Vougioukas

,

,

,

,

,

,

,

Stavros Petridis

,

,

Christian Fuegen

CoRR, 2023

Directional Speech Recognition for Speaker Disambiguation and Cross-talk Suppression.

[DOI]

,

,

,

Kaustubh Kalgaonkar

,

Christian Fuegen

,

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Streaming Audio-Visual Speech Recognition with Alignment Regularization.

[DOI]

,

,

Stavros Petridis

,

Christian Fuegen

,

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Anchored Speech Recognition with Neural Transducers.

[DOI]

,

,

,

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2023

SynthVSR: Scaling Up Visual Speech Recognition With Synthetic Supervision.

[DOI]

,

,

Konstantinos Vougioukas

,

,

,

,

,

,

,

Stavros Petridis

,

,

Christian Fuegen

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

Momentum Pseudo-Labeling: Semi-Supervised ASR With Continuously Improving Pseudo-Labels.

[DOI]

,

,

Jonathan Le Roux

,

IEEE J. Sel. Top. Signal Process., 2022

An Investigation of Monotonic Transducers for Large-Scale Automatic Speech Recognition.

[DOI]

,

,

,

,

Christian Fuegen

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Sequence Transduction with Graph-Based Supervision.

[DOI]

,

,

Shinji Watanabe

,

Jonathan Le Roux

Proceedings of the IEEE International Conference on Acoustics, 2022

Advancing Momentum Pseudo-Labeling with Conformer and Initialization Strategy.

[DOI]

,

,

Jonathan Le Roux

,

Proceedings of the IEEE International Conference on Acoustics, 2022

Extended Graph Temporal Classification for Multi-Speaker End-to-End ASR.

[DOI]

,

,

,

Shinji Watanabe

,

Jonathan Le Roux

Proceedings of the IEEE International Conference on Acoustics, 2022

2021

Dual Causal/Non-Causal Self-Attention for Streaming End-to-End Speech Recognition.

[DOI]

,

,

Jonathan Le Roux

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Advanced Long-Context End-to-End Speech Recognition Using Context-Expanded Transformers.

[DOI]

,

,

,

Jonathan Le Roux

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Momentum Pseudo-Labeling for Semi-Supervised Speech Recognition.

[DOI]

,

,

Jonathan Le Roux

,

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Semi-Supervised Speech Recognition Via Graph-Based Temporal Classification.

[DOI]

,

,

Jonathan Le Roux

Proceedings of the IEEE International Conference on Acoustics, 2021

Capturing Multi-Resolution Context by Dilated Self-Attention.

[DOI]

,

,

Jonathan Le Roux

Proceedings of the IEEE International Conference on Acoustics, 2021

Unsupervised Domain Adaptation for Speech Recognition via Uncertainty Driven Self-Training.

[DOI]

,

,

,

Jonathan Le Roux

Proceedings of the IEEE International Conference on Acoustics, 2021

2020

All-in-One Transformer: Unifying Speech Recognition, Audio Tagging, and Event Detection.

[DOI]

,

,

,

Jonathan Le Roux

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Transformer-Based Long-Context End-to-End Speech Recognition.

[DOI]

,

,

,

Jonathan Le Roux

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Unsupervised Speaker Adaptation Using Attention-Based Speaker Memory for End-to-End ASR.

[DOI]

,

,

,

Jonathan Le Roux

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Streaming Automatic Speech Recognition with the Transformer Model.

[DOI]

,

,

Jonathan Le Roux

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019

Vectorized Beam Search for CTC-Attention-Based Speech Recognition.

[DOI]

,

,

Shinji Watanabe

,

,

Jonathan Le Roux

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Unidirectional Neural Network Architectures for End-to-End Automatic Speech Recognition.

[DOI]

,

,

Jonathan Le Roux

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Triggered Attention for End-to-end Speech Recognition.

[DOI]

,

,

Jonathan Le Roux

Proceedings of the IEEE International Conference on Acoustics, 2019

Streaming End-to-End Speech Recognition with Joint CTC-Attention Based Models.

[DOI]

,

,

Jonathan Le Roux

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018

Objective Assessment of a Speech Enhancement Scheme with an Automatic Speech Recognition-Based System.

[DOI]

,

,

,

,

Henning F. Schepker

,

Proceedings of the 13th ITG Symposium on Speech Communication, 2018

2017

Classifier Architectures for Acoustic Scenes and Events: Implications for DNNs, TDNNs, and Perceptual Features from DCASE 2016.

[DOI]

,

,

Jörn Anemüller

,

,

Birger Kollmeier

IEEE ACM Trans. Audio Speech Lang. Process., 2017

Multi-Channel Speech Enhancement and Amplitude Modulation Analysis for Noise Robust Automatic Speech Recognition.

[DOI]

,

,

Jörn Anemüller

,

,

Birger Kollmeier

Comput. Speech Lang., 2017

2016

Integration of Optimized Modulation Filter Sets Into Deep Neural Networks for Automatic Speech Recognition.

[DOI]

,

Birger Kollmeier

,

Jörn Anemüller

IEEE ACM Trans. Audio Speech Lang. Process., 2016

Probabilistic Spatial Filter Estimation for Signal Enhancement in Multi-Channel Automatic Speech Recognition.

[DOI]

,

,

Jörn Anemüller

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Acoustic Scene Classification using Time-Delay Neural Networks and Amplitude Modulation Filter Bank Features.

[DOI]

,

,

,

Jörn Anemüller

,

Birger Kollmeier

Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events, 2016

2015

An Auditory Inspired Amplitude Modulation Filter Bank for Robust Feature Extraction in Automatic Speech Recognition.

[DOI]

,

Jörn Anemüller

,

Birger Kollmeier

IEEE ACM Trans. Audio Speech Lang. Process., 2015

Front-end technologies for robust ASR in reverberant environments - spectral enhancement-based dereverberation and auditory modulation filterbank features.

[DOI]

,

,

,

,

Jörn Anemüller

,

,

,

EURASIP J. Adv. Signal Process., 2015

A CHiME-3 challenge system: Long-term acoustic features for noise robust automatic speech recognition.

[DOI]

,

Stephan Gerlach

,

,

Jörn Anemüller

,

Birger Kollmeier

,

Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

2014

Should deep neural nets have ears? the role of auditory features in deep learning approaches.

[DOI]

Angel Mario Castro Martinez

,

,

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

2013

On the use of spectro-temporal features for the IEEE AASP challenge 'detection and classification of acoustic scenes and events'.

[DOI]

,

,

Marc René Schädler

,

Benjamin Cauchi

,

,

Jörn Anemüller

,

,

Birger Kollmeier

,

Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2013

Analysis of Trabecular Bone Microstructure Using Contour Tree Connectivity.

[DOI]

Dogu Baran Aydogan

,

,

,

Jari A. K. Hyttinen

Proceedings of the Medical Image Computing and Computer-Assisted Intervention - MICCAI 2013, 2013

2012

Amplitude Modulation Filters as Feature Sets for Robust ASR: Constant Absolute or Relative Bandwidth?

[DOI]

,

Jörn Anemüller

,

Birger Kollmeier

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Multimodal Human-Machine Interaction for Service Robots in Home-Care Environments.

[DOI]

,

,

,

,

Proceedings of the 1st Workshop on Speech and Multimodal Interaction in Assistive Environments, 2012

2011

Amplitude modulation spectrogram based features for robust speech recognition in noisy and reverberant environments.

[DOI]

,

Jörn Anemüller

,

Birger Kollmeier

Proceedings of the IEEE International Conference on Acoustics, 2011
