Petr Cerva

2025

Combining multilingual resources to enhance end-to-end speech recognition systems for Scandinavian languages.

Speech Commun., 2025

Efficient Enhancement of Norwegian ASR Model.

Proceedings of the Text, Speech, and Dialogue - 28th International Conference, 2025

2024

A lightweight approach to real-time speaker diarization: from audio toward audio-visual data streams.

EURASIP J. Audio Speech Music. Process., December, 2024

2023

Developing State-of-the-Art End-to-End ASR for Norwegian.

Proceedings of the Text, Speech, and Dialogue - 26th International Conference, 2023

Online Speaker Diarization Using Optimized SE-ResNet Architecture.

Proceedings of the Text, Speech, and Dialogue - 26th International Conference, 2023

Online Punctuation Restoration using ELECTRA Model for streaming ASR Systems.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Combining Multilingual Resources and Models to Develop State-of-the-Art E2E ASR for Swedish.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

2022

Lexicon-based vs. Lexicon-free ASR for Norwegian Parliament Speech Transcription.

Proceedings of the Text, Speech, and Dialogue - 25th International Conference, 2022

Overlapped Speech Detection in Broadcast Streams Using X-vectors.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

2021

Identification of related languages from spoken data: Moving from off-line to on-line scenario.

Comput. Speech Lang., 2021

Identification of Scandinavian Languages from Speech Using Bottleneck Features and X-Vectors.

Proceedings of the Text, Speech, and Dialogue - 24th International Conference, 2021

Using X-Vectors for Speech Activity Detection in Broadcast Streams.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

2020

Very Fast Keyword Spotting System with Real Time Factor Below 0.01.

Proceedings of the Text, Speech, and Dialogue, 2020

Dealing with Newly Emerging OOVs in Broadcast Programs by Daily Updates of the Lexicon and Language Model.

Veronika Volna

Lenka Weingartová

Proceedings of the Speech and Computer - 22nd International Conference, 2020

Optical Character Recognition for Audio-Visual Broadcast Transcription System.

Proceedings of the 11th IEEE International Conference on Cognitive Infocommunications, 2020

2019

An Approach to Online Speaker Change Point Detection Using DNNs and WFSTs.

Lukás Mateju

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

2018

Robust Recognition of Conversational Telephone Speech via Multi-condition Training and Data Augmentation.

Jirí Málek

Proceedings of the Text, Speech, and Dialogue - 21st International Conference, 2018

Using Deep Neural Networks for Identification of Slavic Languages from Acoustic Signal.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Robust Recognition of Speech with Background Music in Acoustically Under-Resourced Scenarios.

Jirí Málek

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017

Speech Activity Detection in online broadcast transcription using Deep Neural Networks and Weighted Finite State Transducers.

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Robust Automatic Recognition of Speech with background music.

Jirí Málek

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

2016

Speech-to-Text Summarization Using Automatic Phrase Extraction from Recognized Text.

Michal Rott

Proceedings of the Text, Speech, and Dialogue - 19th International Conference, 2016

Study on the Use of Deep Neural Networks for Speech Activity Detection in Broadcast Recordings.

Lukás Mateju

Proceedings of the 13th International Joint Conference on e-Business and Telecommunications (ICETE 2016), 2016

Study on the Use and Adaptation of Bottleneck Features for Robust Speech Recognition of Nonlinearly Distorted Speech.

Proceedings of the 13th International Joint Conference on e-Business and Telecommunications (ICETE 2016), 2016

ASR for South Slavic Languages Developed in Almost Automated Way.

Radek Safarík

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Investigation into the Use of WFSTs and DNNs for Speech Activity Detection in Broadcast Data Transcription.

Lukás Mateju

Proceedings of the E-Business and Telecommunications - 13th International Joint Conference, 2016

2015

System for producing subtitles to internet audio-visual documents.

Proceedings of the 38th International Conference on Telecommunications and Signal Processing, 2015

Compensation of nonlinear distortions in speech for automatic recognition.

Proceedings of the 38th International Conference on Telecommunications and Signal Processing, 2015

Cross-Lingual Adaptation of Broadcast Transcription System to Polish Language Using Public Data Sources.

Radek Safarík

Proceedings of the Human Language Technology. Challenges for Computer Science and Linguistics, 2015

2014

A cross-lingual adaptation approach for rapid development of speech recognizers for learning disabled users.

EURASIP J. Audio Speech Music. Process., 2014

Investigation of deep neural networks for robust recognition of nonlinearly distorted speech.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Speech-to-text technology to transcribe and disclose 100,000+ hours of bilingual documents from historical Czech and Czechoslovak radio archive.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Investigation of Latent Semantic Analysis for Clustering of Czech News Articles.

Michal Rott

Proceedings of the 25th International Workshop on Database and Expert Systems Applications, 2014

2013

Speaker-adaptive speech recognition using speaker diarization for improved transcription of large spoken archives.

Speech Commun., 2013

Impact of microphone on computer applications with voice input modality.

Michaela Kucharová

Proceedings of the 36th International Conference on Telecommunications and Signal Processing, 2013

SummEC: A Summarization Engine for Czech.

Michal Rott

Proceedings of the Text, Speech, and Dialogue - 16th International Conference, 2013

Downdating Lexicon and Language Model for Automatic Transcription of Czech Historical Spoken Documents.

Proceedings of the Text, Speech, and Dialogue - 16th International Conference, 2013

Dealing with Bilingualism in Automatic Transcription of Historical Archive of Czech Radio.

Proceedings of the New Trends in Image Analysis and Processing - ICIAP 2013, 2013

Adding controlled amount of noise to improve recognition of compressed and spectrally distorted speech.

Proceedings of the IEEE International Conference on Acoustics, 2013

2012

Making Czech Historical Radio Archive Accessible and Searchable for Wide Public.

J. Multim., 2012

Incorporation of the ASR output in speaker segmentation and clustering within the task of speaker diarization of broadcast streams.

Proceedings of the 14th IEEE International Workshop on Multimedia Signal Processing, 2012

Large-scale processing, indexing and search system for Czech audio-visual cultural heritage archives.

Proceedings of the 14th IEEE International Workshop on Multimedia Signal Processing, 2012

Browsing, indexing and automatic transcription of lectures for distance learning.

Proceedings of the 14th IEEE International Workshop on Multimedia Signal Processing, 2012

Study on Integration of Speaker Diarization with Speaker Adaptive Speech Recognition for Broadcast Transcription.

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Real-Time Lecture Transcription using ASR for Czech Hearing Impaired or Deaf Students.

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

2011

Voice Technology to Enable Sophisticated Access to Historical Audio Archive of the Czech Radio.

Proceedings of the Multimedia for Cultural Heritage - First International Workshop, 2011

PLDA-Based Clustering for Speaker Diarization of Broadcast Streams.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Using Unsupervised Feature-Based Speaker Adaptation for Improved Transcription of Spoken Archives.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Rainbow Bridge - Training Center based on Voice Technology for People with Physical Disabilities.

Josef Chaloupka

Proceedings of the HEALTHINF 2011, 2011

2010

Study on Cross-Lingual Adaptation of a Czech LVCSR System towards Slovak.

Proceedings of the Analysis of Verbal and Nonverbal Communication and Enactment. The Processing Issues, 2010

2009

Cost-Efficient Cross-Lingual Adaptation of a Speech Recognition System.

Proceedings of the Computer Recognition Systems 3, 2009

Very large vocabulary voice dictation for mobile devices.

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Challenges in Speech Processing of Slavic Languages (Case Studies in Speech Recognition of Czech and Slovak).

Proceedings of the Development of Multimodal Interfaces: Active Listening and Synchrony, 2009

2008

Study on Speaker Adaptation Methods in the Broadcast News Transcription Task.

Proceedings of the Text, Speech and Dialogue, 11th International Conference, 2008

Czech-to-Slovak adapted broadcast news transcription system.

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

MLLR Transforms Based Speaker Recognition in Broadcast Streams.

Proceedings of the Cross-Modal Analysis of Speech, Gestures, Gaze and Facial Expressions, 2008

Voice Technology Applied for Building a Prototype Smart Room.

Proceedings of the Multimodal Signals: Cognitive and Algorithmic Issues, 2008

2007

MyVoice goes Spanish. Cross-lingual Adaptation of a Voice Controlled PC Tool for Handicapped People.

Proces. del Leng. Natural, 2007

Design and development of voice controlled aids for motor-handicapped persons.

Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

2006

A System for Information Retrieval from Large Records of Czech Spoken Data.

Proceedings of the Text, Speech and Dialogue, 9th International Conference, 2006

Continual on-line monitoring of Czech spoken broadcast programs.

Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Two-step unsupervised speaker adaptation based on speaker and gender recognition and HMM combination.

Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

2005

Supervised and Unsupervised Speaker Adaptation in Large Vocabulary Continuous Speech Recognition of Czech.
