Martin Karafiát

CoRR, 2020

2019

Analysis of Multilingual Sequence-to-Sequence Speech Recognition Systems.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Promising Accurate Prefix Boosting for Sequence-to-sequence ASR.

Proceedings of the IEEE International Conference on Acoustics, 2019

2018

Residual Memory Networks: Feed-forward approach to learn long temporal dependencies.

CoRR, 2018

Multilingual Sequence-to-Sequence Speech Recognition: Architecture, Transfer Learning, and Language Modeling.

Jaejin Cho

Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

BUT System for Low Resource Indian Language ASR.

Bhargav Pulugundla

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

BUT OpenSAT 2017 Speech Recognition System.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Analysis of Multilingual BLSTM Acoustic Model on Low and High Resource Languages.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017

Team ELISA System for DARPA LORELEI Speech Evaluation 2016.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

2016 BUT Babel System: Multilingual BLSTM Acoustic Model with i-Vector Based Adaptation.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Residual memory networks: Feed-forward approach to learn long-term temporal dependencies.

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Training Data Augmentation and Data Selection.

New Era for Robust Speech Recognition, Exploiting Deep Learning, 2017

2016

Bottle-Neck Feature Extraction Structures for Multilingual Training and Porting.

Proceedings of the SLTU-2016, 2016

Study of Large Data Resources for Multilingual Training and System Porting.

Ekaterina Egorova

Proceedings of the SLTU-2016, 2016

Multilingual BLSTM and speaker-specific vector adaptation in 2016 BUT Babel system.

Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016

Boosting performance on low-resource languages by standard corpora: An analysis.

Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016

BUT Zero-Cost Speech Recognition 2016 System Description.

Working Notes Proceedings of the MediaEval 2016 Workshop, 2016

Data Selection by Sequence Summarizing Neural Network in Mismatch Condition Training.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Sequence summarizing neural network for speaker adaptation.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Multilingual region-dependent transforms.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015

Three ways to adapt a CTS recognizer to unseen reverberated speech in BUT system for the ASpIRE challenge.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Robust speech recognition in unknown reverberant and noisy conditions.

Sri Harish Reddy Mallidi

Hynek Hermansky

Stavros Tsakalidis

Richard M. Schwartz

Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

2014

Adapting multilingual neural network hierarchy to a new language.

Proceedings of the 4th Workshop on Spoken Language Technologies for Under-resourced Languages, 2014

BUT ASR system for BABEL Surprise evaluation 2014.

Proceedings of the 2014 IEEE Spoken Language Technology Workshop, 2014

Further investigation into multilingual training and adaptation of stacked bottle-neck neural network structure.

Ekaterina Egorova

Proceedings of the 2014 IEEE Spoken Language Technology Workshop, 2014

Progress in the BBN keyword search system for the DARPA RATS program.

Sri Harish Reddy Mallidi

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

BUT 2014 Babel system: analysis of adaptation in NN based systems.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Combination of multilingual and semi-supervised training for under-resourced languages.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

BUT neural network features for spontaneous Vietnamese in BABEL.

Proceedings of the IEEE International Conference on Acoustics, 2014

Adaptation of multilingual stacked bottle-neck neural network structure for new language.

Karel Veselý

Proceedings of the IEEE International Conference on Acoustics, 2014

2013

A region-specific feature-space transformation for speaker adaptation and singularity analysis of Jacobian matrix.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

BUT BABEL system for spontaneous Cantonese.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Feature and score level combination of subspace Gaussians in LVCSR task.

Petr Motlícek

Daniel Povey

Proceedings of the IEEE International Conference on Acoustics, 2013

Manual and semi-automatic approaches to building a multilingual phoneme set.

Proceedings of the IEEE International Conference on Acoustics, 2013

Score normalization and system combination for improved keyword spotting.

Proceedings of the 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, 2013

Semi-supervised bootstrapping approach for neural network feature extractor training.

Proceedings of the 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, 2013

2012

Transcribing Meetings With the AMIDA Systems.

IEEE Trans. Speech Audio Process., 2012

Dealing with Numbers in Grapheme-Based Speech Recognition.

Milos Janda

Jan Cernocký

Proceedings of the Text, Speech and Dialogue - 15th International Conference, 2012

The language-independent bottleneck features.

Proceedings of the 2012 IEEE Spoken Language Technology Workshop (SLT), 2012

Speaker vectors from subspace Gaussian mixture model as complementary features for language identification.

Proceedings of the Odyssey 2012: The Speaker and Language Recognition Workshop, 2012

Description and analysis of the Brno276 system for LRE2011.

Proceedings of the Odyssey 2012: The Speaker and Language Recognition Workshop, 2012

A factorized representation of FMLLR transform based on QR-decomposition.

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Generating exact lattices in the WFST framework.

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Improving language models for ASR using translated in-domain data.

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Region dependent linear transforms in multilingual speech recognition.

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Independent component analysis and MLLR transforms for speaker identification.

Sandro Cumani

Oldrich Plchot

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

2011

The subspace Gaussian mixture model - A structured model for speech recognition.

Comput. Speech Lang., 2011

Recurrent Neural Network Based Language Modeling in Meeting Recognition.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Integrating Recent MLP Feature Extraction Techniques into TRAP Architecture.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

A symmetrization of the Subspace Gaussian Mixture Model.

Proceedings of the IEEE International Conference on Acoustics, 2011

Simplification and optimization of i-vector extraction.

Proceedings of the IEEE International Conference on Acoustics, 2011

Variational approximation of long-span language models for LVCSR.

Proceedings of the IEEE International Conference on Acoustics, 2011

Convolutive Bottleneck Network features for LVCSR.

Karel Veselý

Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

iVector-based discriminative adaptation for automatic speech recognition.

Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

Study of probabilistic and Bottle-Neck features in multilingual environment.

Milos Janda

Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

2010

Using Gradient Descent Optimization for Acoustics Training from Heterogeneous Data.

Igor Szöke

Jan Cernocký

Proceedings of the Text, Speech and Dialogue, 13th International Conference, 2010

Data selection and calibration issues in automatic language recognition - investigation with BUT-AGNITIO NIST LRE 2009 system.

Proceedings of the Odyssey 2010: The Speaker and Language Recognition Workshop, Brno, Czech Republic, June 28, 2010

Recurrent neural network based language model.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Similarity scoring for recognizing repeated out-of-vocabulary words.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

The AMIDA 2009 meeting transcription system.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Hierarchical neural net architectures for feature extraction in ASR.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Subword-based spoken term detection in audio course lectures.

Proceedings of the IEEE International Conference on Acoustics, 2010

Subspace Gaussian Mixture Models for speech recognition.

Proceedings of the IEEE International Conference on Acoustics, 2010

Approaches to automatic lexicon learning with limited training examples.

Proceedings of the IEEE International Conference on Acoustics, 2010

A novel estimation of feature-space MLLR for full-covariance models.

Proceedings of the IEEE International Conference on Acoustics, 2010

Multilingual acoustic modeling for speech recognition based on subspace Gaussian Mixture Models.

Proceedings of the IEEE International Conference on Acoustics, 2010

2009

Posterior-based out of vocabulary word detection in telephone speech.

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Investigation into bottle-neck features for meeting speech recognition.

Lukás Burget

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Real-time ASR from meetings.

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

BUT system for NIST 2008 speaker recognition evaluation.

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

2008

Advances in Acoustic Modeling for the Recognition of Czech.

Jirí Kopecký

Ondrej Glembek

Proceedings of the Text, Speech and Dialogue, 11th International Conference, 2008

Discriminative training of narrow band - wide band adapted systems for meeting recognition.

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

2007

Fusion of Heterogeneous Speaker Recognition Systems in the STBU Submission for the NIST Speaker Recognition Evaluation 2006.

IEEE Trans. Speech Audio Process., 2007

Spoken Term Detection System Based on Combination of LVCSR and Phonetic Search.

Proceedings of the Machine Learning for Multimodal Interaction, 2007

Application of CMLLR in narrow band wide band adapted systems.

Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

STBU System for the NIST 2006 Speaker Recognition Evaluation.

Proceedings of the IEEE International Conference on Acoustics, 2007

The AMI System for the Transcription of Speech in Meetings.

Proceedings of the IEEE International Conference on Acoustics, 2007

Probabilistic and Bottle-Neck Features for LVCSR of Meetings.

Proceedings of the IEEE International Conference on Acoustics, 2007

The 2007 AMI(DA) System for Meeting Transcription.

Proceedings of the Multimodal Technologies for Perception of Humans, 2007

2006

Indexing and Search Methods for Spoken Documents.

Proceedings of the Text, Speech and Dialogue, 9th International Conference, 2006

Robust Heteroscedastic Linear Discriminant Analysis and LCRC Posterior Features in Meeting Data Recognition.

Proceedings of the Machine Learning for Multimodal Interaction, 2006

The AMI Meeting Transcription System: Progress and Performance.

Proceedings of the Machine Learning for Multimodal Interaction, 2006

Information Retrieval from Spoken Documents.

Proceedings of the Computational Linguistics and Intelligent Text Processing, 2006

2005

Phoneme Based Acoustics Keyword Spotting in Informal Continuous Speech.

Proceedings of the Text, Speech and Dialogue, 8th International Conference, 2005

The Development of the AMI System for the Transcription of Speech in Meetings.

Proceedings of the Machine Learning for Multimodal Interaction, 2005

The 2005 AMI System for the Transcription of Speech in Meetings.

Proceedings of the Machine Learning for Multimodal Interaction, 2005

Comparison of keyword spotting approaches for informal continuous speech.

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Transcription of conference room meetings: an investigation.

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

2004

TRAP based features for LVCSR of meeting data.
