Josef Psutka

Proceedings of the Text, Speech, and Dialogue - 28th International Conference, 2025

2024

A Comparative Analysis of Bilingual and Trilingual Wav2Vec Models for Automatic Speech Recognition in Multilingual Oral History Archives.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

2023

Transfer Learning of Transformer-Based Speech Recognition Models from Czech to Slovak.

Jan Lehecka

Proceedings of the Text, Speech, and Dialogue - 26th International Conference, 2023

2022

Transformer-Based Automatic Speech Recognition of Formal and Colloquial Czech in MALACH Project.

Jan Lehecka

Proceedings of the Text, Speech, and Dialogue - 25th International Conference, 2022

Exploring Capabilities of Monolingual Audio Transformers using Large Datasets in Automatic Speech Recognition of Czech.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

2021

Live TV Subtitling Through Respeaking.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

2020

Live TV subtitling through respeaking with remote cutting-edge technology.

Multim. Tools Appl., 2020

Increasing the Accuracy of the ASR System by Prolonging Voiceless Phonemes in the Speech of Patients Using the Electrolarynx.

Petr Stanislav

Proceedings of the Speech and Computer - 22nd International Conference, 2020

2019

Sample size for maximum-likelihood estimates of Gaussian model depending on dimensionality of pattern space.

Pattern Recognit., 2019

Tuning of Acoustic Modeling and Adaptation Technique for a Real Speech Recognition Task.

Josef Michálek

Proceedings of the Statistical Language and Speech Processing, 2019

2018

Recurrent DNNs and Its Ensembles on the TIMIT Phone Recognition Task.

Josef Michálek

Proceedings of the Speech and Computer - 20th International Conference, 2018

A Comparison of Adaptation Techniques and Recurrent Neural Network Architectures.

Proceedings of the Statistical Language and Speech Processing, 2018

2017

A GPU-Architecture Optimized Hierarchical Decomposition Algorithm for Support Vector Machine Training.

Josef Michálek

IEEE Trans. Parallel Distributed Syst., 2017

A Comparison of Support Vector Machines Training GPU-Accelerated Open Source Implementations.

Josef Michálek

CoRR, 2017

Recognition of the Electrolaryngeal Speech: Comparison Between Human and Machine.

Petr Stanislav

Proceedings of the Text, Speech, and Dialogue - 20th International Conference, 2017

A Regularization Post Layer: An Additional Way How to Make Deep Neural Networks Robust.

Proceedings of the Statistical Language and Speech Processing, 2017

2015

Sample Size for Maximum Likelihood Estimates of Gaussian Model.

Proceedings of the Computer Analysis of Images and Patterns, 2015

2014

Anti-Models: An Alternative Way to Discriminative Training.

Proceedings of the Text, Speech and Dialogue - 17th International Conference, 2014

Captioning of Live TV Commentaries from the Olympic Games in Sochi: Some Interesting Insights.

Proceedings of the Text, Speech and Dialogue - 17th International Conference, 2014

Audio-Video Speaker Diarization for Unsupervised Speaker and Face Model Creation.

Proceedings of the Text, Speech and Dialogue - 17th International Conference, 2014

2013

Online Speaker Adaptation of an Acoustic Model Using Face Recognition.

Proceedings of the Text, Speech, and Dialogue - 16th International Conference, 2013

Covariance Matrix Enhancement Approach to Train Robust Gaussian Mixture Models of Speech Data.

Proceedings of the Speech and Computer - 15th International Conference, 2013

Towards Live Subtitling of TV Ice-hockey Commentary.

Proceedings of the SIGMAP and WINSYS 2013, 2013

Estimation of Single-Gaussian and Gaussian Mixture Models for Pattern Recognition.

Lukás Machlica

Proceedings of the Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications, 2013

2012

Optimized Acoustic Likelihoods Computation for NVIDIA and ATI/AMD Graphics Processors.

IEEE Trans. Speech Audio Process., 2012

Captioning of Live TV Programs through Speech Recognition and Re-speaking.

Proceedings of the Text, Speech and Dialogue - 15th International Conference, 2012

Full covariance Gaussian mixture models evaluation on GPU.

Proceedings of the IEEE International Symposium on Signal Processing and Information Technology, 2012

Novel Approach to Live Captioning Through Re-speaking: Tailoring Speech Recognition to Re-speaker's Needs.

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

2011

System for fast lexical and phonetic spoken term detection in a Czech cultural heritage archive.

EURASIP J. Audio Speech Music. Process., 2011

Speaker-Clustered Acoustic Models Evaluated on GPU for On-line Subtitling of Parliament Meetings.

Proceedings of the Text, Speech and Dialogue - 14th International Conference, 2011

Four-phase Re-speaker Training System.

Proceedings of the SIGMAP 2011, 2011

Optimization of the Gaussian Mixture Model Evaluation on GPU.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

2010

Online TV Captioning of Czech Parliamentary Sessions.

Proceedings of the Text, Speech and Dialogue, 13th International Conference, 2010

Fast Phonetic/Lexical Searching in the Archives of the Czech Holocaust Testimonies: Advancing Towards the MALACH Project Visions.

Proceedings of the Text, Speech and Dialogue, 13th International Conference, 2010

2009

Using Morphological Information for Robust Language Modeling in Czech ASR System.

IEEE Trans. Speech Audio Process., 2009

Discriminative Training of Gender-Dependent Acoustic Models.

Proceedings of the Text, Speech and Dialogue, 12th International Conference, 2009

Training of Speaker-clustered Acoustic Models for use in Real-time Recognizers.

Proceedings of the SIGMAP 2009, 2009

Czech Senior COMPANION: Wizard of Oz Data Collection and Expressive Speech Corpus Recording and Annotation.

Proceedings of the Human Language Technology. Challenges for Computer Science and Linguistics, 2009

2008

Voice-controlled Data Entry in Dental Electronic Health Record.

Proceedings of the eHealth Beyond the Horizon, 2008

2007

An Intelligent Telephony Interface of Multiagent Decision Support Systems.

IEEE Trans. Syst. Man Cybern. Part C, 2007

Feature space reduction and decorrelation in a large number of speech recognition experiments.

Proceedings of the Signal and Image Processing (SIP 2007), 2007

Live TV Subtitling - Fast 2-pass LVCSR System for Online Subtitling.

Proceedings of the SIGMAP 2007, 2007

2006

Automatic Online Subtitling of the Czech Parliament Meetings.

Proceedings of the Text, Speech and Dialogue, 9th International Conference, 2006

Exploiting Linguistic Knowledge in Language Modeling of Czech Spontaneous Speech.

Jan Hoidekr

Proceedings of the Fifth International Conference on Language Resources and Evaluation, 2006

Benefit of a Class-based Language Model for Real-time Closed-captioning of TV Ice-hockey Commentaries.

Proceedings of the Fifth International Conference on Language Resources and Evaluation, 2006

Adaptive language model in automatic online subtitling.

Proceedings of the Second IASTED International Conference on Computational Intelligence, 2006

Automatic transcription of audio archives for spoken document retrieval.

Proceedings of the Second IASTED International Conference on Computational Intelligence, 2006

2005

Automatic transcription of Czech, Russian, and Slovak spontaneous speech in the MALACH project.

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Czech spontaneous speech corpus with structural metadata.

Jáchym Kolár

Jan Svec

Stephanie M. Strassel

Christopher Walker

Dagmar Kozlíková

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

2004

Automatic recognition of spontaneous speech for access to multilingual oral history archives.

IEEE Trans. Speech Audio Process., 2004

Issues in Annotation of the Czech Spontaneous Speech Corpus in the MALACH project.

Proceedings of the Fourth International Conference on Language Resources and Evaluation, 2004

The development of ASR for Slavic languages in the MALACH project.

Jan Hajic

William Byrne

Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2003

Towards Automatic Transcription of Spontaneous Czech Speech in the MALACH Project.

Proceedings of the Text, Speech and Dialogue, 6th International Conference, 2003

Building LVCSR System for Transcription of Spontaneously Pronounced Russian Testimonies in the MALACH Project: Initial Steps and First Results.

Proceedings of the Text, Speech and Dialogue, 6th International Conference, 2003

Experiments with Automatic Segmentation for Czech Speech Synthesis.

Daniel Tihelka

Proceedings of the Text, Speech and Dialogue, 6th International Conference, 2003

Large vocabulary ASR for spontaneous Czech in the MALACH project.

Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Automatic segmentation for Czech concatenative speech synthesis using statistical approach with boundary-specific correction.

Daniel Tihelka

Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

The Czech speech and prosody database both for ASR and TTS purposes.

Jáchym Kolár

Jan Romportl

Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Fitting class-based language models into weighted finite-state transducer framework.

Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

2002

Automatic Transcription of Czech Language Oral History in the MALACH Project: Resources and Initial Experiments.

Proceedings of the Text, Speech and Dialogue, 5th International Conference, 2002

German and Czech Speech Synthesis Using HMM-Based Speech Segment Database.

Proceedings of the Text, Speech and Dialogue, 5th International Conference, 2002

2001

The Influence of a Filter Shape in Telephone-Based Recognition Module Using PLP Parameterization.

Proceedings of the Text, Speech and Dialogue, 4th International Conference, 2001

Two-Pass Recognition of Czech Speech Using Adaptive Vocabulary.

Proceedings of the Text, Speech and Dialogue, 4th International Conference, 2001

Robust Knowledge Discovery from Parallel Speech and Text Sources.

Proceedings of the First International Conference on Human Language Technology Research, 2001

Corpus-based database of residual excitations used for speech reconstruction from MFCCs.

Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Large broadcast news and read speech corpora of spoken Czech.

Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Comparison of MFCC and PLP parameterizations in the speaker independent continuous speech recognition task.

Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Design of speech corpus for text-to-speech synthesis.

Jiri Kruta

Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

On large vocabulary continuous speech recognition of highly inflectional language - Czech.

Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

2000

Recording and Annotation of the Czech Speech Corpus.

Vlasta Radová

Proceedings of the Text, Speech and Dialogue - Third International Workshop, 2000

Design of Speech Recognition Engine.

Lubos Smídl

Proceedings of the Text, Speech and Dialogue - Third International Workshop, 2000

Morpheme Based Language Models for Speech Recognition of Czech.

Proceedings of the Text, Speech and Dialogue - Third International Workshop, 2000

UWB_S01 corpus - a Czech read-speech corpus.

Vlasta Radová

Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

ARTIC: a new Czech text-to-speech system using statistical approach to speech segment database construction.

Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Pitch synchronous residual excited speech reconstruction on the MFCC.

Proceedings of the 10th European Signal Processing Conference, 2000

1999

Statistical Approach to the Automatic Synthesis of Czech Speech.

Proceedings of the Text, Speech and Dialogue - Second International Workshop, 1999

Large Vocabulary Speech Recognition for Read and Broadcast Czech.

Proceedings of the Text, Speech and Dialogue - Second International Workshop, 1999

Speech production based on the mel-frequency cepstral coefficients.

Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

Using various language model smoothing techniques for the transcription of a weather forecast broadcasted by the Czech radio.

Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

1997

An approach to speaker identification using multiple classifiers.

Vlasta Radová

Proceedings of the 1997 IEEE International Conference on Acoustics, 1997

1996

Modelling man-computer oral dialogue in noisy environment.

Proceedings of the 8th European Signal Processing Conference, 1996

1989

The use of the LPC residual error autocorrelation to pitch period extraction.

Proceedings of the First European Conference on Speech Communication and Technology, 1989

Short History and Present State of the Artificial Intelligence at the Technical University in Pilsen.

Václav Matousek

Proceedings of the Artificial Intelligence in Higher Education, 1989

1987

The use of coherence coefficient to parameters statistical dependence evaluation in acoustic analysis.
