Ralf Schlüter

Orcid: 0000-0003-2839-9247

Affiliations:

RWTH Aachen University, Germany

According to our database¹, Ralf Schlüter authored at least 284 papers between 1997 and 2026.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2026

Text-Utilization for Encoder-dominated Speech Recognition Models.

[DOI]

,

,

,

CoRR, April, 2026

Diffusion Language Models for Speech Recognition.

[DOI]

Davyd Naveriani

,

,

,

CoRR, April, 2026

Sequence-Level Unsupervised Training in Speech Recognition: A Theoretical Study.

[DOI]

,

,

,

CoRR, March, 2026

2025

Supplementary Resources and Analysis for Automatic Speech Recognition Systems Trained on the Loquacious Dataset.

[DOI]

Nick Rossenbach

,

,

,

,

Larissa Kleppel

,

CoRR, December, 2025

Reproducing and Dissecting Denoising Language Models for Speech Recognition.

[DOI]

,

,

Nick Rossenbach

,

,

CoRR, December, 2025

Error Analysis in a Modular Meeting Transcription System.

[DOI]

,

,

Thilo von Neumann

,

Christoph Boeddeker

,

,

Reinhold Haeb-Umbach

CoRR, September, 2025

Unified Learnable 2D Convolutional Feature Extraction for ASR.

[DOI]

,

Benedikt Hilmes

,

,

CoRR, September, 2025

A Comparative Analysis on ASR System Combination for Attention, CTC, Factored Hybrid, and Transducer Models.

[DOI]

Noureldin Bayoumi

,

,

,

,

,

CoRR, August, 2025

Exploring Gender Bias in Large Language Models: An In-depth Dive into the German Language.

[DOI]

,

,

Simone Kopeinik

,

CoRR, July, 2025

Medical Spoken Named Entity Recognition.

[DOI]

,

,

Hung-Phong Tran

,

,

Khai-Nguyen Nguyen

,

,

Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

Label-Context-Dependent Internal Language Model Estimation for CTC.

[DOI]

,

Minh-Nghia Phan

,

,

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Regularizing Learnable Feature Extraction for Automatic Speech Recognition.

[DOI]

,

Maximilian Kannen

,

Benedikt Hilmes

,

,

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Running Conventional Automatic Speech Recognition on Memristor Hardware: A Simulated Approach.

[DOI]

Nick Rossenbach

,

Benedikt Hilmes

,

,

,

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Analyzing the Importance of Blank for CTC-Based Knowledge Distillation.

[DOI]

Benedikt Hilmes

,

Nick Rossenbach

,

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Dynamic Acoustic Model Architecture Optimization in Training for ASR.

[DOI]

,

,

,

,

,

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Classification Error Bound for Low Bayes Error Conditions in Machine Learning.

[DOI]

,

,

,

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

The Conformer Encoder May Reverse the Time Dimension.

[DOI]

,

,

Mohammad Zeineldeen

,

,

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Right Label Context in End-to-End Training of Time-Synchronous ASR Models.

[DOI]

,

,

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Efficient Supernet Training with Orthogonal Softmax for Scalable ASR Model Compression.

[DOI]

,

,

,

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Analysis of Domain Shift across ASR Architectures via TTS-Enabled Separation of Target Domain and Acoustic Conditions.

[DOI]

,

Nick Rossenbach

,

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2025

2024

End-to-End Speech Recognition: A Survey.

[DOI]

Rohit Prabhavalkar

,

,

Tara N. Sainath

,

,

Shinji Watanabe

IEEE ACM Trans. Audio Speech Lang. Process., 2024

On the Problem of Text-To-Speech Model Selection for Synthetic Data Generation in Automatic Speech Recognition.

[DOI]

Nick Rossenbach

,

,

CoRR, 2024

On the Effect of Purely Synthetic Training Data for Different Automatic Speech Recognition Architectures.

[DOI]

Nick Rossenbach

,

Benedikt Hilmes

,

CoRR, 2024

Combining TF-GridNet And Mixture Encoder For Continuous Speech Separation For Meeting Transcription.

[DOI]

,

,

Thilo von Neumann

,

Christoph Boeddeker

,

,

Reinhold Haeb-Umbach

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

Refined Statistical Bounds for Classification Error Mismatches with Constrained Bayes Error.

[DOI]

,

,

,

Proceedings of the IEEE Information Theory Workshop, 2024

Investigating the Effect of Label Topology and Training Criterion on ASR Performance and Alignment Quality.

[DOI]

,

Christoph Lüscher

,

,

,

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Dynamic Encoder Size Based on Data-Driven Layer-wise Pruning for Speech Recognition.

[DOI]

,

,

,

,

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Chunked Attention-Based Encoder-Decoder Model for Streaming Speech Recognition.

[DOI]

Mohammad Zeineldeen

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2024

On the Relation Between Internal Language Model and Sequence Discriminative Training for Neural Transducers.

[DOI]

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2024

2023

Mixture Encoder Supporting Continuous Speech Separation for Meeting Recognition.

[DOI]

,

,

Thilo von Neumann

,

Christoph Böddeker

,

,

Reinhold Haeb-Umbach

CoRR, 2023

Comparative Analysis of the wav2vec 2.0 Feature Extractor.

[DOI]

,

,

CoRR, 2023

Improving And Analyzing Neural Speaker Embeddings for ASR.

[DOI]

Christoph Lüscher

,

,

Mohammad Zeineldeen

,

,

CoRR, 2023

Competitive and Resource Efficient Factored Hybrid HMM Systems are Simpler Than You Think.

[DOI]

,

Christoph Lüscher

,

,

,

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Mixture Encoder for Joint Speech Separation and Recognition.

[DOI]

,

,

Christoph Böddeker

,

,

Reinhold Haeb-Umbach

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

RASR2: The RWTH ASR Toolkit for Generic Sequence-to-sequence Speech Recognition.

[DOI]

,

,

,

,

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Enhancing and Adversarial: Improve ASR with Speaker Labels.

[DOI]

,

,

,

Mohammad Zeineldeen

,

Christoph Lüscher

,

,

Proceedings of the IEEE International Conference on Acoustics, 2023

Lattice-Free Sequence Discriminative Training for Phoneme-Based Neural Transducers.

[DOI]

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2023

Efficient Utilization of Large Pre-Trained Models for Low Resource ASR.

[DOI]

,

Christoph Lüscher

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2023

Investigating The Effect of Language Models in Sequence Discriminative Training For Neural Transducers.

[DOI]

,

,

,

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

On the Relevance of Phoneme Duration Variability of Synthesized Training Data for Automatic Speech Recognition.

[DOI]

Nick Rossenbach

,

Benedikt Hilmes

,

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

End-To-End Training of a Neural HMM with Label and Transition Probabilities.

[DOI]

,

,

Wilfried Michel

,

,

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022

Efficient Use of Large Pre-Trained Models for Low Resource ASR.

[DOI]

,

Christoph Lüscher

,

,

,

CoRR, 2022

Development of Hybrid ASR Systems for Low Resource Medical Domain Conversational Telephone Speech.

[DOI]

Christoph Lüscher

,

Mohammad Zeineldeen

,

,

,

,

,

,

CoRR, 2022

Monotonic Segmental Attention for Automatic Speech Recognition.

[DOI]

,

,

,

,

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

HMM vs. CTC for Automatic Speech Recognition: Comparison Based on Full-Sum Training from Scratch.

[DOI]

,

,

,

,

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Discrete Steps towards Approximate Computing.

[DOI]

,

,

,

,

Farhad Merchant

,

,

Mohammad Zeineldeen

,

,

Proceedings of the 23rd International Symposium on Quality Electronic Design, 2022

Efficient Training of Neural Transducer for Speech Recognition.

[DOI]

,

Wilfried Michel

,

,

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Improving the Training Recipe for a Robust Conformer-based Hybrid Model.

[DOI]

Mohammad Zeineldeen

,

,

Christoph Lüscher

,

,

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Self-Normalized Importance Sampling for Neural Language Modeling.

[DOI]

,

,

Alexander Gerstenberger

,

,

,

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Automatic Learning of Subword Dependent Model Scales.

[DOI]

,

Wilfried Michel

,

Mohammad Zeineldeen

,

,

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

On Language Model Integration for RNN Transducer Based Speech Recognition.

[DOI]

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2022

Conformer-Based Hybrid ASR System For Switchboard Dataset.

[DOI]

Mohammad Zeineldeen

,

,

Christoph Lüscher

,

Wilfried Michel

,

Alexander Gerstenberger

,

,

Proceedings of the IEEE International Conference on Acoustics, 2022

Efficient Sequence Training of Attention Models Using Approximative Recombination.

[DOI]

Nils-Philipp Wynands

,

Wilfried Michel

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2022

Improving Factored Hybrid HMM Acoustic Modeling without State Tying.

[DOI]

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2022

2021

Prediction of Listener Perception of Argumentative Speech in a Crowdsourced Dataset Using (Psycho-)Linguistic and Fluency Features.

[DOI]

,

,

Rishab Bhattacharyya

,

Daniel Wiechmann

,

,

,

CoRR, 2021

Why does CTC result in peaky behavior?

[DOI]

,

,

CoRR, 2021

The Impact of ASR on the Automatic Analysis of Linguistic Complexity and Sophistication in Spontaneous L2 Speech.

[DOI]

,

,

,

CoRR, 2021

Feature Replacement and Combination for Hybrid ASR Systems.

[DOI]

,

Christoph Lüscher

,

Wilfried Michel

,

,

CoRR, 2021

Towards Consistent Hybrid HMM Acoustic Modeling.

[DOI]

,

,

,

CoRR, 2021

A study of latent monotonic attention variants.

[DOI]

,

,

CoRR, 2021

Tight Integrated End-to-End Training for Cascaded Speech Translation.

[DOI]

,

Tobias Bieschke

,

,

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Acoustic Data-Driven Subword Modeling for End-to-End Speech Recognition.

[DOI]

,

Mohammad Zeineldeen

,

,

,

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Equivalence of Segmental and Neural Transducer Modeling: A Proof of Concept.

[DOI]

,

,

André Merboldt

,

,

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Librispeech Transducer Model with Internal Language Model Prior Correction.

[DOI]

,

André Merboldt

,

Wilfried Michel

,

,

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Investigating Methods to Improve Language Model Integration for Attention-Based Encoder-Decoder ASR Models.

[DOI]

Mohammad Zeineldeen

,

Aleksandr Glushko

,

Wilfried Michel

,

,

,

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

The Impact of ASR on the Automatic Analysis of Linguistic Complexity and Sophistication in Spontaneous L2 Speech.

[DOI]

,

,

,

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

On Sampling-Based Training Criteria for Neural Language Modeling.

[DOI]

,

,

Alexander Gerstenberger

,

,

,

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Phoneme Based Neural Transducer for Large Vocabulary Speech Recognition.

[DOI]

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2021

On Architectures and Training for Raw Waveform Feature Extraction in ASR.

[DOI]

,

Christoph Lüscher

,

Wilfried Michel

,

,

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

Comparing the Benefit of Synthetic Training Data for Various Automatic Speech Recognition Architectures.

[DOI]

Nick Rossenbach

,

Mohammad Zeineldeen

,

Benedikt Hilmes

,

,

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020

Investigations on Phoneme-Based End-To-End Speech Recognition.

[DOI]

,

,

,

,

CoRR, 2020

Robust Beam Search for Encoder-Decoder Attention Based Speech Recognition Without Length Bias.

[DOI]

,

,

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

A New Training Pipeline for an Improved Neural Transducer.

[DOI]

,

André Merboldt

,

,

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Context-Dependent Acoustic Modeling Without Explicit Phone Clustering.

[DOI]

,

,

,

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Early Stage LM Integration Using Local and Global Log-Linear Combination.

[DOI]

Wilfried Michel

,

,

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Investigation of Large-Margin Softmax in Neural Language Modeling.

[DOI]

,

,

,

,

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

LVCSR with Transformer Language Models.

[DOI]

,

,

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Full-Sum Decoding for Hybrid HMM Based Speech Recognition Using LSTM Language Model.

[DOI]

,

,

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

The RWTH ASR System for TED-LIUM Release 2: Improving Hybrid HMM With SpecAugment.

[DOI]

,

Wilfried Michel

,

,

,

,

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Layer-Normalized LSTM for Hybrid-HMM and End-To-End ASR.

[DOI]

Mohammad Zeineldeen

,

,

,

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Generating Synthetic Audio Data for Attention-Based Speech Recognition Systems.

[DOI]

Nick Rossenbach

,

,

,

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Frame-Level MMI as A Sequence Discriminative Training Criterion for LVCSR.

[DOI]

Wilfried Michel

,

,

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

How Much Self-Attention Do We Need? Trading Attention for Feed-Forward Layers.

[DOI]

,

Alexander Gerstenberger

,

,

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

A Comprehensive Study of Residual CNNs for Acoustic Modeling in ASR.

[DOI]

Vitalii Bozheniuk

,

,

,

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Exploring A Zero-Order Direct HMM Based on Latent Attention for Automatic Speech Recognition.

[DOI]

,

,

,

,

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019

Training of reduced-rank linear transformations for multi-layer polynomial acoustic features for speech recognition.

[DOI]

Muhammad Ali Tahir

,

,

,

,

Speech Commun., 2019

Upper and Lower Tight Error Bounds for Feature Omission with an Extension to Context Reduction.

[DOI]

,

,

IEEE Trans. Pattern Anal. Mach. Intell., 2019

LSTM Language Models for LVCSR in First-Pass Decoding and Lattice-Rescoring.

[DOI]

,

,

,

CoRR, 2019

RWTH ASR Systems for LibriSpeech: Hybrid vs Attention - w/o Data Augmentation.

[DOI]

Christoph Lüscher

,

,

,

,

Wilfried Michel

,

,

,

CoRR, 2019

On Using SpecAugment for End-to-End Speech Translation.

[DOI]

,

,

,

Proceedings of the 16th International Conference on Spoken Language Translation, 2019

Survey Talk: Modeling in Automatic Speech Recognition: Beyond Hidden Markov Models.

[DOI]

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Rescoring Keyword Search Confidence Estimates with Graph-Based Re-Ranking Using Acoustic Word Embeddings.

[DOI]

,

,

,

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Comparison of Lattice-Free and Lattice-Based Sequence Discriminative Training Criteria for LVCSR.

[DOI]

Wilfried Michel

,

,

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

An Analysis of Local Monotonic Attention Variants.

[DOI]

André Merboldt

,

,

,

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Analysis of Deep Clustering as Preprocessing for Automatic Speech Recognition of Sparsely Overlapping Speech.

[DOI]

,

,

,

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

RWTH ASR Systems for LibriSpeech: Hybrid vs Attention.

[DOI]

Christoph Lüscher

,

,

,

,

Wilfried Michel

,

,

,

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Cumulative Adaptation for BLSTM Acoustic Models.

[DOI]

,

,

,

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Language Modeling with Deep Transformers.

[DOI]

,

,

,

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Investigation into Joint Optimization of Single Channel Speech Enhancement and Acoustic Modeling for Robust ASR.

[DOI]

,

,

Proceedings of the IEEE International Conference on Acoustics, 2019

On Using 2D Sequence-to-sequence Models for Speech Recognition.

[DOI]

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2019

A Comparison of Transformer and LSTM Encoder Decoder Models for ASR.

[DOI]

,

,

,

,

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Training Language Models for Long-Span Cross-Sentence Evaluation.

[DOI]

,

,

,

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018

Speaker Adapted Beamforming for Multi-Channel Automatic Speech Recognition.

[DOI]

,

,

Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Improved Training of End-to-end Attention Models for Speech Recognition.

[DOI]

,

,

,

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Investigation on LSTM Recurrent N-gram Language Models for Speech Recognition.

[DOI]

,

,

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Comparison of BLSTM-Layer-Specific Affine Transformations for Speaker Adaptation.

[DOI]

,

,

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Investigation on Estimation of Sentence Probability by Combining Forward, Backward and Bi-directional LSTM-RNNs.

[DOI]

,

,

,

,

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Segmental Encoder-Decoder Models for Large Vocabulary Automatic Speech Recognition.

[DOI]

,

Mirko Hannemann

,

Patrick Dötsch

,

,

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Acoustic Modeling of Speech Waveform Based on Multi-Resolution, Neural Network Signal Processing.

[DOI]

,

,

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Prediction of LSTM-RNN Full Context States as a Subtask for N-Gram Feedforward Language Models.

[DOI]

,

,

,

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Sequence Modeling and Alignment for LVCSR-Systems.

[DOI]

,

,

Patrick Doetsch

,

André Merboldt

,

,

Proceedings of the 13th ITG Symposium on Speech Communication, 2018

2017

Inverted Alignments for End-to-End Automatic Speech Recognition.

[DOI]

Patrick Doetsch

,

Mirko Hannemann

,

,

IEEE J. Sel. Top. Signal Process., 2017

The 2016 RWTH Keyword Search System for Low-Resource Languages.

[DOI]

,

,

,

,

,

Proceedings of the Speech and Computer - 19th International Conference, 2017

CTC in the Context of Generalized Full-Sum HMM Training.

[DOI]

,

,

,

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Parallel Neural Network Features for Improved Tandem Acoustic Modeling.

[DOI]

,

Wilfried Michel

,

,

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Faster sequence training.

[DOI]

,

,

,

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

A comprehensive study of deep bidirectional LSTM RNNs for acoustic modeling in speech recognition.

[DOI]

,

Patrick Doetsch

,

Paul Voigtlaender

,

,

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Noisy objective functions based on the f-divergence.

[DOI]

Markus Nußbaum-Thom

,

,

,

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Investigations on byte-level convolutional neural networks for language modeling in low resource speech recognition.

[DOI]

,

,

,

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

RETURNN: The RWTH extensible training framework for universal recurrent neural networks.

[DOI]

Patrick Doetsch

,

,

Paul Voigtlaender

,

,

,

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

2016

Automatic Speech Recognition Based on Neural Networks.

[DOI]

,

Patrick Doetsch

,

,

,

,

,

,

Proceedings of the Speech and Computer - 18th International Conference, 2016

The RWTH Aachen LVCSR system for IWSLT-2016 German Skype conversation recognition task.

[DOI]

Wilfried Michel

,

,

M. Ali Basha Shaik

,

,

Proceedings of the 13th International Conference on Spoken Language Translation, 2016

Towards Online-Recognition with Deep Bidirectional LSTM Acoustic Models.

[DOI]

,

,

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

LSTM, GRU, Highway and a Bit of Attention: An Empirical Overview for Language Modeling in Speech Recognition.

[DOI]

,

,

,

,

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Investigation on log-linear interpolation of multi-domain neural network language model.

[DOI]

,

,

,

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Robust Online Multi-Channel Speech Recognition.

[DOI]

,

,

,

,

Reinhold Haeb-Umbach

Proceedings of the 12th ITG Symposium on Speech Communication, 2016

2015

From Feedforward to Recurrent LSTM Neural Networks for Language Modeling.

[DOI]

Martin Sundermeyer

,

,

IEEE ACM Trans. Audio Speech Lang. Process., 2015

Improvements in RWTH LVCSR evaluation systems for Polish, Portuguese, English, Urdu, and Arabic.

[DOI]

M. Ali Basha Shaik

,

,

Muhammad Ali Tahir

,

Markus Nußbaum-Thom

,

,

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Bag-of-words input for long history representation in neural network-based language models for speech recognition.

[DOI]

,

,

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Multilingual features based keyword search for very low-resource languages.

[DOI]

,

,

,

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Convolutional neural networks for acoustic modeling of raw time signal in LVCSR.

[DOI]

,

,

,

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Error bounds for context reduction and feature omission.

[DOI]

,

,

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Investigations on sequence training of neural networks.

[DOI]

,

,

,

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Sequence-discriminative training of recurrent neural networks.

[DOI]

Paul Voigtlaender

,

Patrick Doetsch

,

,

,

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Integrating Gaussian mixtures into deep neural networks: Softmax layer with hidden variables.

[DOI]

,

Muhammad Ali Tahir

,

,

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Investigation of mixture splitting concept for training linear bottlenecks of deep neural network acoustic models.

[DOI]

Muhammad Ali Tahir

,

,

,

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Improved strategies for a zero OOV rate LVCSR system.

[DOI]

M. Ali Basha Shaik

,

Amr El-Desoky Mousa

,

,

,

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Unsupervised adaptation of a denoising autoencoder by Bayesian Feature Enhancement for reverberant ASR under mismatch conditions.

[DOI]

,

Reinhold Haeb-Umbach

,

,

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Speaker adaptive joint training of Gaussian mixture models and bottleneck features.

[DOI]

,

,

,

Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

Multilingual representations for low resource speech recognition and keyword search.

[DOI]

,

Brian Kingsbury

,

Bhuvana Ramabhadran

,

,

Kartik Audhkhasi

,

,

,

,

Markus Nußbaum-Thom

,

Michael Picheny

,

,

,

,

,

Mark J. F. Gales

,

,

,

,

Philip C. Woodland

Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

2014

Acoustic modeling with deep neural networks using raw time signal for LVCSR.

[DOI]

,

,

,

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Data augmentation, feature combination, and multilingual neural networks to improve ASR and KWS performance for low-resource languages.

[DOI]

,

,

,

,

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Lattice decoding and rescoring with long-Span neural network language models.

[DOI]

Martin Sundermeyer

,

,

,

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

rwthlm - the RWTH Aachen University neural network language modeling toolkit.

[DOI]

Martin Sundermeyer

,

,

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

RWTH LVCSR systems for Quaero and EU-Bridge: German, Polish, Spanish and Portuguese.

[DOI]

M. Ali Basha Shaik

,

,

Muhammad Ali Tahir

,

Markus Nußbaum-Thom

,

,

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Word pair approximation for more efficient decoding with high-order language models.

[DOI]

,

,

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Open-Lexicon Language Modeling Combining Word and Character Levels.

[DOI]

Michal Kozielski

,

Martin Matysiak

,

Patrick Doetsch

,

,

Proceedings of the 14th International Conference on Frontiers in Handwriting Recognition, 2014

Mean-normalized stochastic gradient for large-scale deep learning.

[DOI]

,

Alexander Richard

,

,

Proceedings of the IEEE International Conference on Acoustics, 2014

RASR/NN: The RWTH neural network toolkit for speech recognition.

[DOI]

,

Alexander Richard

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2014

The RWTH English lecture recognition system.

[DOI]

,

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2014

Multilingual MRASTA features for low-resource keyword search and speech recognition systems.

[DOI]

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2014

A family of discriminative training criteria based on the F-divergence for deep neural networks.

[DOI]

Markus Nußbaum-Thom

,

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2014

2013

Lexical Prefix Tree and WFST: A Comparison of Two Dynamic Search Concepts for LVCSR.

[DOI]

,

,

IEEE Trans. Speech Audio Process., 2013

Investigations on an EM-Style Optimization Algorithm for Discriminative Training of HMMs.

[DOI]

,

,

IEEE ACM Trans. Audio Speech Lang. Process., 2013

The RWTH Aachen German and English LVCSR systems for IWSLT-2013.

[DOI]

M. Ali Basha Shaik

,

,

,

Markus Nußbaum-Thom

,

,

,

Proceedings of the 10th International Workshop on Spoken Language Translation: Evaluation Campaign@IWSLT 2013, 2013

Novel tight classification error bounds under mismatch conditions based on f-Divergence.

[DOI]

,

Markus Nußbaum-Thom

,

,

,

Proceedings of the 2013 IEEE Information Theory Workshop, 2013

Multilingual hierarchical MRASTA features for ASR.

[DOI]

,

,

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Training log-linear acoustic models in higher-order polynomial feature space for speech recognition.

[DOI]

Muhammad Ali Tahir

,

,

,

,

Louis ten Bosch

,

,

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Feature-rich sub-lexical language models using a maximum entropy approach for German LVCSR.

[DOI]

M. Ali Basha Shaik

,

Amr El-Desoky Mousa

,

,

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Relative error bounds for statistical classifiers based on the f-divergence.

[DOI]

Markus Nußbaum-Thom

,

,

,

,

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Morpheme level hierarchical Pitman-Yor class-based language models for LVCSR of morphologically rich languages.

[DOI]

Amr El-Desoky Mousa

,

M. Ali Basha Shaik

,

,

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Improving LVCSR with hidden conditional random fields for grapheme-to-phoneme conversion.

[DOI]

,

,

,

,

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Development of the RWTH transcription system for Slovenian.

[DOI]

,

,

,

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

A critical evaluation of stochastic algorithms for convex optimization.

[DOI]

,

Alexander Richard

,

,

Proceedings of the IEEE International Conference on Acoustics, 2013

Deep hierarchical bottleneck MRASTA features for LVCSR.

[DOI]

,

,

Proceedings of the IEEE International Conference on Acoustics, 2013

Investigation on cross- and multilingual MLP features under matched and mismatched acoustical conditions.

[DOI]

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2013

Comparison of feedforward and recurrent neural network language models.

[DOI]

Martin Sundermeyer

,

,

Jean-Luc Gauvain

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2013

Feature combination and stacking of recurrent and non-recurrent neural networks for LVCSR.

[DOI]

Christian Plahl

,

Michal Kozielski

,

,

Proceedings of the IEEE International Conference on Acoustics, 2013

Advanced search space pruning with acoustic look-ahead for WFST based LVCSR.

[DOI]

,

,

Proceedings of the IEEE International Conference on Acoustics, 2013

System combination and score normalization for spoken term detection.

[DOI]

,

,

,

Mark J. F. Gales

,

Brian Kingsbury

,

,

,

,

Michael Picheny

,

Bhuvana Ramabhadran

,

,

,

Philip C. Woodland

Proceedings of the IEEE International Conference on Acoustics, 2013

Open vocabulary handwriting recognition using combined word-level and character-level language models.

[DOI]

Michal Kozielski

,

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2013

A high-performance Cantonese keyword search system.

[DOI]

Brian Kingsbury

,

,

,

Mark J. F. Gales

,

,

,

,

,

Michael Picheny

,

Bhuvana Ramabhadran

,

,

,

Philip C. Woodland

Proceedings of the IEEE International Conference on Acoustics, 2013

Efficient nearly error-less LVCSR decoding based on incremental forward and backward passes.

[DOI]

,

,

Proceedings of the 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, 2013

2012

WFST Enabled Solutions to ASR Problems: Beyond HMM Decoding.

[DOI]

Björn Hoffmeister

,

,

,

,

IEEE Trans. Speech Audio Process., 2012

Discriminative Training for Automatic Speech Recognition: Modeling, Criteria, Optimization, Implementation, and Performance.

[DOI]

,

,

,

IEEE Signal Process. Mag., 2012

Does the Cost Function Matter in Bayes Decision Rule?

[DOI]

,

Markus Nußbaum-Thom

,

IEEE Trans. Pattern Anal. Mach. Intell., 2012

Phase difference of filter-stable part-tones as acoustic feature.

[DOI]

,

Friedhelm R. Drepper

,

Proceedings of the IEEE Statistical Signal Processing Workshop, 2012

Accelerated Batch Learning of Convex Log-linear Models for LVCSR.

[DOI]

,

,

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Context-Dependent MLPs for LVCSR: TANDEM, Hybrid or Both?

[DOI]

,

,

,

Martin Sundermeyer

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Non-stationary signal processing and its application in speech recognition.

[DOI]

,

Friedhelm R. Drepper

,

Proceedings of the ISCA Workshop on Statistical And Perceptual Audition, 2012

Simultaneous Discriminative Training and Mixture Splitting of HMMs for Speech Recognition.

[DOI]

Muhammad Ali Tahir

,

Markus Nußbaum-Thom

,

,

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

LSTM Neural Networks for Language Modeling.

[DOI]

Martin Sundermeyer

,

,

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Hierarchical hybrid language models for open vocabulary continuous speech recognition using WFST.

[DOI]

M. Ali Basha Shaik

,

,

,

,

Proceedings of the ISCA Workshop on Statistical And Perceptual Audition, 2012

Investigation of Maximum Entropy Hybrid Language Models for Open Vocabulary German and Polish LVCSR.

[DOI]

M. Ali Basha Shaik

,

Amr El-Desoky Mousa

,

,

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Posterior-Scaled MPE: Novel Discriminative Training Criteria.

[DOI]

Markus Nußbaum-Thom

,

,

,

,

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Search Space Pruning Based on Anticipated Path Recombination in LVCSR.

[DOI]

,

,

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Morpheme Level Feature-based Language Models for German LVCSR.

[DOI]

Amr El-Desoky Mousa

,

M. Ali Basha Shaik

,

,

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Comparison and combination of different CRBE based MLP features for LVCSR.

[DOI]

,

,

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Silence is golden: Modeling non-speech events in WFST-based dynamic network decoders.

[DOI]

,

,

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Extended search space pruning in LVCSR.

[DOI]

,

,

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Joining advantages of word-conditioned and token-passing decoding.

[DOI]

,

,

,

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Investigations on the use of morpheme level features in Language Models for Arabic LVCSR.

[DOI]

Amr El-Desoky Mousa

,

,

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Basis vector orthogonalization for an improved kernel gradient matching pursuit method.

[DOI]

,

Shinji Watanabe

,

Atsushi Nakamura

,

,

,

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

2011

On the Relationship Between Bayes Risk and Word Error Rate in ASR.

[DOI]

,

Markus Nußbaum-Thom

,

IEEE Trans. Speech Audio Process., 2011

Equivalence of Generative and Log-Linear Models.

[DOI]

,

,

,

,

IEEE Trans. Speech Audio Process., 2011

Speech recognition for machine translation in Quaero.

[DOI]

Proceedings of the 2011 International Workshop on Spoken Language Translation, 2011

A Study on Speaker Normalized MLP Features in LVCSR.

[DOI]

,

Christian Plahl

,

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Log-Linear Optimization of Second-Order Polynomial Features with Subsequent Dimension Reduction for Speech Recognition.

[DOI]

Muhammad Ali Tahir

,

,

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

On the Estimation of Discount Parameters for Language Model Smoothing.

[DOI]

Martin Sundermeyer

,

,

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Hybrid Language Models Using Mixed Types of Sub-Lexical Units for Open Vocabulary German LVCSR.

[DOI]

M. Ali Basha Shaik

,

Amr El-Desoky Mousa

,

,

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Improved Acoustic Feature Combination for LVCSR by Neural Networks.

[DOI]

Christian Plahl

,

,

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Compound Word Recombination for German LVCSR.

[DOI]

Markus Nußbaum-Thom

,

Amr El-Desoky Mousa

,

,

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Acoustic Look-Ahead for More Efficient Decoding in LVCSR.

[DOI]

,

,

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Morpheme Based Factored Language Models for German LVCSR.

[DOI]

Amr El-Desoky Mousa

,

M. Ali Basha Shaik

,

,

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Feature selection for log-linear acoustic models.

[DOI]

,

Alexander Richard

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2011

Non-stationary feature extraction for automatic speech recognition.

[DOI]

,

,

,

Friedhelm R. Drepper

Proceedings of the IEEE International Conference on Acoustics, 2011

The RWTH 2010 Quaero ASR evaluation system for English, French, and German.

[DOI]

Martin Sundermeyer

,

Markus Nußbaum-Thom

,

,

Christian Plahl

,

Amr El-Desoky Mousa

,

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2011

Using morpheme and syllable based sub-words for Polish LVCSR.

[DOI]

M. Ali Basha Shaik

,

Amr El-Desoky Mousa

,

,

Proceedings of the IEEE International Conference on Acoustics, 2011

A comparative analysis of dynamic network decoding.

[DOI]

,

,

Proceedings of the IEEE International Conference on Acoustics, 2011

Exploiting sparseness of backing-off language models for efficient look-ahead in LVCSR.

[DOI]

,

,

Proceedings of the IEEE International Conference on Acoustics, 2011

Subspace pursuit method for kernel-log-linear models.

[DOI]

,

,

,

,

Shinji Watanabe

,

Atsushi Nakamura

,

Tetsunori Kobayashi

Proceedings of the IEEE International Conference on Acoustics, 2011

A convergence analysis of log-linear training and its application to speech recognition.

[DOI]

,

,

Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

Discriminative splitting of Gaussian/log-linear mixture HMMs for speech recognition.

[DOI]

Muhammad Ali Tahir

,

,

Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

Cross-lingual portability of Chinese and English neural network features for French and German LVCSR.

[DOI]

Christian Plahl

,

,

Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

2010

Margin-Based Discriminative Training for String Recognition.

[DOI]

,

,

,

,

IEEE J. Sel. Top. Signal Process., 2010

Sub-lexical language models for German LVCSR.

[DOI]

Amr El-Desoky Mousa

,

M. Ali Basha Shaik

,

,

Proceedings of the 2010 IEEE Spoken Language Technology Workshop, 2010

Evaluation of automatic transcription systems for the judicial domain.

[DOI]

,

Daniele Falavigna

,

,

,

Roberto Gretter

,

Proceedings of the 2010 IEEE Spoken Language Technology Workshop, 2010

A Hybrid Morphologically Decomposed Factored Language Models for Arabic LVCSR.

[DOI]

,

,

Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, 2010

A discriminative splitting criterion for phonetic decision trees.

[DOI]

,

,

Markus Nußbaum-Thom

,

,

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

On the relation of Bayes risk, word error, and word posteriors in ASR.

[DOI]

,

Markus Nußbaum-Thom

,

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Revisiting VTLN using linear transformation on conventional MFCC.

[DOI]

Doddipatla Rama Sanand

,

,

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Hierarchical bottle neck features for LVCSR.

[DOI]

Christian Plahl

,

,

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Parallel lexical-tree based LVCSR on multi-core processors.

[DOI]

,

,

,

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

The RWTH 2009 Quaero ASR evaluation system for English and German.

[DOI]

Markus Nußbaum-Thom

,

,

Martin Sundermeyer

,

Christian Plahl

,

,

,

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Time conditioned search in automatic speech recognition reconsidered.

[DOI]

,

,

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Discriminative adaptation for log-linear acoustic models.

[DOI]

,

,

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Discriminative HMMs, log-linear models, and CRFs: What is the difference?

[DOI]

,

,

Markus Nußbaum-Thom

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2010

2009

The RWTH Aachen University open source speech recognition system.

[DOI]

,

Christian Gollan

,

,

Björn Hoffmeister

,

,

,

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Development of the GALE 2008 Mandarin LVCSR system.

[DOI]

Christian Plahl

,

Björn Hoffmeister

,

,

,

,

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Parallel fast likelihood computation for LVCSR using mixture decomposition.

[DOI]

,

,

,

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Bayes risk approximations using time overlap with an application to system combination.

[DOI]

Björn Hoffmeister

,

,

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Log-linear model combination with word-dependent scaling factors.

[DOI]

Björn Hoffmeister

,

,

,

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Investigations on convex optimization using log-linear HMMs for digit string recognition.

[DOI]

,

,

,

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Investigating the use of morphological decomposition and diacritization for improving Arabic LVCSR.

[DOI]

,

Christian Gollan

,

,

,

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Automatic Transcription of Courtroom Recordings in the JUMAS project.

[DOI]

Daniele Falavigna

,

,

Roberto Gretter

,

,

Christian Gollan

,

,

Proceedings of the 2nd International Conference on ICT Solutions for Justice, 2009

Audio segmentation for speech recognition using segment features.

[DOI]

,

Christian Gollan

,

,

Proceedings of the IEEE International Conference on Acoustics, 2009

Modified MPE/MMI in a transducer-based framework.

[DOI]

,

,

Proceedings of the IEEE International Conference on Acoustics, 2009

Investigations on features for log-linear acoustic models in continuous speech recognition.

[DOI]

,

Markus Nußbaum-Thom

,

,

,

Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009

Generalized likelihood ratio discriminant analysis.

[DOI]

Muhammad Ali Tahir

,

,

Christian Plahl

,

,

Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009

2008

Development of the SRI/Nightingale Arabic ASR system.

[DOI]

Dimitra Vergyri

,

,

,

Andreas Stolcke

,

,

Martin Graciarena

,

,

Christian Gollan

,

,

Katrin Kirchhoff

,

,

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Recent improvements of the RWTH GALE Mandarin LVCSR system.

[DOI]

Christian Plahl

,

Björn Hoffmeister

,

,

,

,

,

,

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

iCNC and iROVER: the limits of improving system combination with classification?

[DOI]

Björn Hoffmeister

,

,

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

On the equivalence of Gaussian and log-linear HMMs.

[DOI]

,

,

,

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Modified MMI/MPE: a direct evaluation of the margin in speech recognition.

[DOI]

,

Thomas Deselaers

,

,

Proceedings of the Machine Learning, 2008

A GIS-like training algorithm for log-linear models with hidden variables.

[DOI]

,

Thomas Deselaers

,

,

Proceedings of the IEEE International Conference on Acoustics, 2008

2007

Using multiple acoustic feature sets for speech recognition.

[DOI]

,

Daniil Kocharov

,

,

Speech Commun., 2007

iROVER: Improving System Combination with Classification.

[DOI]

,

Björn Hoffmeister

,

,

,

Proceedings of the Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, 2007

Hierarchical neural networks feature extraction for LVCSR system.

[DOI]

,

,

Christian Plahl

,

Christian Gollan

,

Hynek Hermansky

,

Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Efficient estimation of speaker-specific projecting feature transforms.

[DOI]

,

,

Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

The RWTH 2007 TC-STAR evaluation system for European English and Spanish.

[DOI]

,

Christian Gollan

,

,

,

Björn Hoffmeister

,

Christian Plahl

,

,

,

Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

On the equivalence of Gaussian HMM and Gaussian HMM-like hidden conditional random fields.

[DOI]

,

,

Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

An improved method for unsupervised training of LVCSR systems.

[DOI]

Christian Gollan

,

,

,

Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Gammatone Features and Feature Combination for Large Vocabulary Speech Recognition.

[DOI]

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2007

Cross-Site and Intra-Site ASR System Combination: Comparisons on Lattice and 1-Best Methods.

[DOI]

Björn Hoffmeister

,

,

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2007

Advances in Arabic broadcast news transcription at RWTH.

[DOI]

,

,

Christian Gollan

,

,

Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

Development of the 2007 RWTH Mandarin LVCSR system.

[DOI]

Björn Hoffmeister

,

Christian Plahl

,

,

,

,

,

Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

2006

Feature combination using linear discriminant analysis and its pitfalls.

[DOI]

,

,

Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

The 2006 RWTH parliamentary speeches transcription system.

[DOI]

,

Maximilian Bisani

,

Christian Gollan

,

,

Björn Hoffmeister

,

Christian Plahl

,

,

Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Frame based system combination and a comparison with weighted ROVER and CNC.

[DOI]

Björn Hoffmeister

,

,

,

Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

2005

Bayes risk minimization using metric loss functions.

[DOI]

,

T. Scharrenbach

,

Volker Steinbiss

,

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Investigations on error minimizing training criteria for discriminative training in automatic speech recognition.

[DOI]

Wolfgang Macherey

,

,

,

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Articulatory motivated acoustic features for speech recognition.

[DOI]

Daniil Kocharov

,

,

,

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Acoustic Feature Combination for Robust Speech Recognition.

[DOI]

,

,

Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Cross Domain Automatic Transcription on the TC-STAR EPPS Corpus.

[DOI]

Christian Gollan

,

Maximilian Bisani

,

Stephan Kanthak

,

,

Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

2004

Discriminative training with tied covariance matrices.

[DOI]

Wolfgang Macherey

,

,

Proceedings of the 8th International Conference on Spoken Language Processing, 2004

2003

Extraction methods of voicing feature for robust speech recognition.

[DOI]

,

,

Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

2002

Robust speech recognition using a voiced-unvoiced feature.

[DOI]

,

,

Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

2001

Confidence measures for large vocabulary continuous speech recognition.

[DOI]

,

,

,

IEEE Trans. Speech Audio Process., 2001

Model-based MCE bound to the true Bayes' error.

[DOI]

,

IEEE Signal Process. Lett., 2001

Comparison of discriminative training criteria and optimization methods for speech recognition.

[DOI]

,

Wolfgang Macherey

,

,

Speech Commun., 2001

Vocal tract normalization equals linear transformation in cepstral space.

[DOI]

,

,

,

Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Explicit word error minimization using word hypothesis posterior probabilities.

[DOI]

,

,

Proceedings of the IEEE International Conference on Acoustics, 2001

Using phase spectrum information for improved speech recognition performance.

[DOI]

,

Proceedings of the IEEE International Conference on Acoustics, 2001

Computing Mel-frequency cepstral coefficients on the power spectrum.

[DOI]

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2001

2000

Investigations on discriminative training criteria.

[DOI]

PhD thesis, 2000

The RWTH Large Vocabulary Speech Recognition System for Spontaneous Speech.

Stephan Kanthak

,

,

,

,

Proceedings of the KONVENS 2000 / Sprachkommunikation, 2000

Speech recognition using context conditional word posterior probabilities.

[DOI]

,

,

Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Using posterior word probabilities for improved speech recognition.

[DOI]

,

,

Proceedings of the IEEE International Conference on Acoustics, 2000

Recent improvements of the RWTH large vocabulary speech recognition system on spontaneous speech.

[DOI]

,

,

Stephan Kanthak

,

,

Proceedings of the IEEE International Conference on Acoustics, 2000

1999

A combined maximum mutual information and maximum likelihood approach for mixture density splitting.

[DOI]

,

Wolfgang Macherey

,

,

Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

Discriminative Training of Gaussian Mixtures for Image Object Recognition.

[DOI]

,

,

Proceedings of the Mustererkennung 1999, 1999

1998

Using word probabilities as confidence measures.

[DOI]

,

,

Proceedings of the 1998 IEEE International Conference on Acoustics, 1998

Comparison of discriminative training criteria.

[DOI]

,

Wolfgang Macherey

Proceedings of the 1998 IEEE International Conference on Acoustics, 1998

1997

Comparison of optimization methods for discriminative training criteria.

[DOI]

,

Wolfgang Macherey

,

Stephan Kanthak

,

,

Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997
