Petr Motlícek

Orcid: 0000-0001-6467-1119

According to our database, Petr Motlícek authored at least 154 papers between 2000 and 2023.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2023
Node-weighted Graph Convolutional Network for Depression Detection in Transcribed Clinical Interviews.
CoRR, 2023

Implementing contextual biasing in GPU decoder for online ASR.
CoRR, 2023

HyperConformer: Multi-head HyperMixer for Efficient Speech Recognition.
CoRR, 2023

Lessons Learned in ATCO2: 5000 hours of Air Traffic Control Communications for Robust Automatic Speech Recognition and Understanding.
CoRR, 2023

A Virtual Simulation-Pilot Agent for Training of Air Traffic Controllers.
CoRR, 2023

Effectiveness of Text, Acoustic, and Lattice-Based Representations in Spoken Language Understanding Tasks.
Proceedings of the IEEE International Conference on Acoustics, 2023

Parameter-Efficient Tuning with Adaptive Bottlenecks for Automatic Speech Recognition.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

Claim-Dissector: An Interpretable Fact-Checking System with Joint Re-ranking and Veracity Prediction.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

2022
Speech and Natural Language Processing Technologies for Pseudo-Pilot Simulator.
CoRR, 2022

ATCO2 corpus: A Large-Scale Dataset for Research on Automatic Speech Recognition and Natural Language Understanding of Air Traffic Control Communications.
CoRR, 2022

How Does Pre-trained Wav2Vec2.0 Perform on Domain Shifted ASR? An Extensive Benchmark on Air Traffic Control Communications.
CoRR, 2022

Bertraffic: Bert-Based Joint Speaker Role and Speaker Change Detection for Air Traffic Control Communications.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

How Does Pre-Trained Wav2Vec 2.0 Perform on Domain-Shifted ASR? An Extensive Benchmark on Air Traffic Control Communications.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Expanded Lattice Embeddings for Spoken Document Retrieval on Informal Meetings.
Proceedings of the SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, July 11, 2022

Team Innovators at SemEval-2022 for Task 8: Multi-Task Training with Hyperpartisan and Semantic Relation for Multi-Lingual News Article Similarity.
Proceedings of the 16th International Workshop on Semantic Evaluation, SemEval@NAACL 2022, 2022

Bio-Medical Multi-label Scientific Literature Classification using LWAN and Dual-attention module.
Proceedings of the 36th Pacific Asia Conference on Language, Information and Computation, 2022

An End-to-End Multilingual System for Automatic Minuting of Multi-Party Dialogues.
Proceedings of the 36th Pacific Asia Conference on Language, Information and Computation, 2022

IDIAP Submission@LT-EDI-ACL2022: Detecting Signs of Depression from Social Media Text.
Proceedings of the Second Workshop on Language Technology for Equality, 2022

IDIAP Submission@LT-EDI-ACL2022: Homophobia/Transphobia Detection in social media comments.
Proceedings of the Second Workshop on Language Technology for Equality, 2022

IDIAP Submission@LT-EDI-ACL2022 : Hope Speech Detection for Equality, Diversity and Inclusion.
Proceedings of the Second Workshop on Language Technology for Equality, 2022

IDIAP_TIET@LT-EDI-ACL2022 : Hope Speech Detection in Social Media using Contextualized BERT with Attention Mechanism.
Proceedings of the Second Workshop on Language Technology for Equality, 2022

Hierarchical Multi-task learning framework for Isometric-Speech Language Translation.
Proceedings of the 19th International Conference on Spoken Language Translation, 2022

A Two-Step Approach to Leverage Contextual Data: Speech Recognition in Air-Traffic Communications.
Proceedings of the IEEE International Conference on Acoustics, 2022

IDIAPers @ Causal News Corpus 2022: Extracting Cause-Effect-Signal Triplets via Pre-trained Autoregressive Language Model.
Proceedings of the 5th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text, 2022

IDIAPers @ Causal News Corpus 2022: Efficient Causal Relation Identification Through a Prompt-based Few-shot Approach.
Proceedings of the 5th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text, 2022

2021
Neural Network Adaptation and Data Augmentation for Multi-Speaker Direction-of-Arrival Estimation.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

BERTraffic: A Robust BERT-Based Approach for Speaker Change Detection and Role Identification of Air-Traffic Communications.
CoRR, 2021

Grammar Based Identification Of Speaker Role For Improving ATCO And Pilot ASR.
CoRR, 2021

Improving callsign recognition with air-surveillance data in air-traffic communication.
CoRR, 2021

Applying Attention-Based Models for Detecting Cognitive Processes and Mental Health Conditions.
Cogn. Comput., 2021

IEEE SLT 2021 Alpha-Mini Speech Challenge: Open Datasets, Tracks, Rules and Baselines.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Contextual Semi-Supervised Learning: An Approach to Leverage Air-Surveillance and Untranscribed ATC Data in ASR Systems.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Late Fusion of the Available Lexicon and Raw Waveform-Based Acoustic Modeling for Depression and Dementia Recognition.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Speech Activity Detection Based on Multilingual Speech Recognition System.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Robust Command Recognition for Lithuanian Air Traffic Control Tower Utterances.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Multitask Adaptation with Lattice-Free MMI for Multi-Genre Speech Recognition of Low Resource Languages.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Boosting of Contextual Information in ASR for Air-Traffic Call-Sign Recognition.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Multi-Task Neural Network for Robust Multiple Speaker Embedding Extraction.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

ROXANNE Research Platform: Automate Criminal Investigations.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

A Comparison of Methods for OOV-Word Recognition on a New Public Dataset.
Proceedings of the IEEE International Conference on Acoustics, 2021

NLPHut's Participation at WAT2021.
Proceedings of the 8th Workshop on Asian Translation, 2021

2020
Inferring Highly-dense Representations for Clustering Broadcast Media Content.
Prague Bull. Math. Linguistics, 2020

Pkwrap: a PyTorch Package for LF-MMI Training of Acoustic Models.
CoRR, 2020

Quantization of Acoustic Model Parameters in Automatic Speech Recognition Framework.
CoRR, 2020

Improving Speaker Identification using Network Knowledge in Criminal Conversational Data.
CoRR, 2020

Idiap Submission to Swiss-German Language Detection Shared Task.
Proceedings of the 5th Swiss Text Analytics Conference and the 16th Conference on Natural Language Processing, 2020

Idiap and UAM Participation at MEX-A3T Evaluation Campaign.
Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2020) co-located with 36th Conference of the Spanish Society for Natural Language Processing (SEPLN 2020), 2020

The MuMMER Data Set for Robot Perception in Multi-party HRI Scenarios.
Proceedings of the 29th IEEE International Conference on Robot and Human Interactive Communication, 2020

Automatic Speech Recognition Benchmark for Air-Traffic Communications.
Proceedings of the Interspeech 2020, 2020

Supervised Domain Adaptation for Text-Independent Speaker Verification Using Limited Data.
Proceedings of the Interspeech 2020, 2020

Lattice-Free Maximum Mutual Information Training of Multilingual Speech Recognition Systems.
Proceedings of the Interspeech 2020, 2020

Detection of Similar Languages and Dialects Using Deep Supervised Autoencoder.
Proceedings of the 17th International Conference on Natural Language Processing, 2020

BertAA : BERT fine-tuning for Authorship Attribution.
Proceedings of the 17th International Conference on Natural Language Processing, 2020

Incremental Semi-Supervised Learning for Multi-Genre Speech Recognition.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

ODIANLP's Participation in WAT2020.
Proceedings of the 7th Workshop on Asian Translation, 2020

2019
Voice Presentation Attack Detection Using Convolutional Neural Networks.
Proceedings of the Handbook of Biometric Anti-Spoofing, 2019

MuMMER: Socially Intelligent Human-Robot Interaction in Public Spaces.
CoRR, 2019

Spoken Language Identification Using Language Bottleneck Features.
Proceedings of the Text, Speech, and Dialogue - 22nd International Conference, 2019

Idiap Abstract Text Summarization System for German Text Summarization Task.
Proceedings of the 4th Swiss Text Analytics Conference, SwissText 2019, Winterthur, 2019

End-to-End Accented Speech Recognition.
Proceedings of the Interspeech 2019, 2019

Exploiting Semi-Supervised Training Through a Dropout Regularization in End-to-End Speech Recognition.
Proceedings of the Interspeech 2019, 2019

Adaptation of Assistant Based Speech Recognition to New Domains and Its Acceptance by Air Traffic Controllers.
Proceedings of the Intelligent Human Systems Integration 2019, 2019

A Bayesian Approach to Inter-task Fusion for Speaker Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2019

Adaptation of Multiple Sound Source Localization Neural Networks with Weak Supervision and Domain-adversarial Training.
Proceedings of the IEEE International Conference on Acoustics, 2019

Abstract Text Summarization: A Low Resource Challenge.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Cross-lingual Automatic Speech Recognition Exploiting Articulatory Features.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

Idiap NMT System for WAT 2019 Multimodal Translation Task.
Proceedings of the 6th Workshop on Asian Translation, 2019

2018
Iterative Learning of Speech Recognition Models for Air Traffic Control.
Proceedings of the Interspeech 2018, 2018

Analysis of Language Dependent Front-End for Speaker Recognition.
Proceedings of the Interspeech 2018, 2018

Joint Localization and Classification of Multiple Sound Sources Using a Multi-task Neural Network.
Proceedings of the Interspeech 2018, 2018

End-to-end Text-dependent Speaker Verification Using Novel Distance Measures.
Proceedings of the Interspeech 2018, 2018

Deep Neural Networks for Multiple Speaker Detection and Localization.
Proceedings of the 2018 IEEE International Conference on Robotics and Automation, 2018

DNN Based Speaker Embedding Using Content Information for Text-Dependent Speaker Verification.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017
Template-matching for text-dependent speaker verification.
Speech Commun., 2017

Semi-Supervised Learning with Semantic Knowledge Extraction for Improved Speech Recognition in Air Traffic Control.
Proceedings of the Interspeech 2017, 2017

Content Normalization for Text-Dependent Speaker Verification.
Proceedings of the Interspeech 2017, 2017

Intra-class covariance adaptation in PLDA back-ends for speaker verification.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Exploiting sequence information for text-dependent Speaker Verification.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Towards a Breakthrough Speaker Identification Approach for Law Enforcement Agencies: SIIP.
Proceedings of the European Intelligence and Security Informatics Conference, 2017

A context-aware speech recognition and understanding system for air traffic control domain.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

2016
A Large-Scale Open-Source Acoustic Simulator for Speaker Recognition.
IEEE Signal Process. Lett., 2016

Feature mapping using far-field microphones for distant speech recognition.
Speech Commun., 2016

Investigating Cross-lingual Multi-level Adaptive Networks: The Importance of the Correlation of Source and Target Languages.
Proceedings of the 13th International Conference on Spoken Language Translation, 2016

Idlak Tangle: An Open Source Kaldi Based Parametric Speech Synthesiser Based on DNN.
Proceedings of the Interspeech 2016, 2016

Inter-Task System Fusion for Speaker Recognition.
Proceedings of the Interspeech 2016, 2016

System fusion and speaker linking for longitudinal diarization of TV shows.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Information theoretic clustering for unsupervised domain-adaptation.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Deep neural network based posteriors for text-dependent speaker verification.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015
Incremental Syllable-Context Phonetic Vocoding.
IEEE ACM Trans. Audio Speech Lang. Process., 2015

Exploiting foreign resources for DNN-based ASR.
EURASIP J. Audio Speech Music. Process., 2015

Integrating online i-vector extractor with information bottleneck based speaker diarization system.
Proceedings of the INTERSPEECH 2015, 2015

Channel selection in the short-time modulation domain for distant speech recognition.
Proceedings of the INTERSPEECH 2015, 2015

Employment of Subspace Gaussian Mixture Models in speaker recognition.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Combining SGMM speaker vectors and KL-HMM approach for speaker diarization.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Learning feature mapping using deep neural network bottleneck features for distant large vocabulary speech recognition.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Towards utterance-based neural network adaptation in acoustic modeling.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

2014
Using out-of-language data to improve an under-resourced speech recognizer.
Speech Commun., 2014

The DBOX Corpus Collection of Spoken Human-Human and Human-Machine Dialogues.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

Phoneme background model for information bottleneck based speaker diarization.
Proceedings of the INTERSPEECH 2014, 2014

Development of bilingual ASR system for MediaParl corpus.
Proceedings of the INTERSPEECH 2014, 2014

Stress and accent transmission in HMM-based syllable-context very low bit rate speech coding.
Proceedings of the INTERSPEECH 2014, 2014

Multilingual deep neural network based acoustic modeling for rapid language adaptation.
Proceedings of the IEEE International Conference on Acoustics, 2014

Exploiting un-transcribed foreign data for speech recognition in well-resourced languages.
Proceedings of the IEEE International Conference on Acoustics, 2014

2013
A Simple Continuous Pitch Estimation Algorithm.
IEEE Signal Process. Lett., 2013

Real-Time Audio-Visual Analysis for Multiperson Videoconferencing.
Adv. Multim., 2013

Crosslingual tandem-SGMM: exploiting out-of-language data for acoustic model and feature level adaptation.
Proceedings of the INTERSPEECH 2013, 2013

Feature and score level combination of subspace Gaussians in LVCSR task.
Proceedings of the IEEE International Conference on Acoustics, 2013

Accent adaptation using Subspace Gaussian Mixture Models.
Proceedings of the IEEE International Conference on Acoustics, 2013

On the (UN)importance of the contextual factors in HMM-based speech synthesis and coding.
Proceedings of the IEEE International Conference on Acoustics, 2013

Impact of deep MLP architecture on different acoustic modeling techniques for under-resourced speech recognition.
Proceedings of the 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, 2013

2012
Impact du degré de supervision sur l'adaptation à un domaine d'un modèle de langage à partir du Web (Impact of the level of supervision on Web-based language model domain adaptation) [in French].
Proceedings of the Joint Conference JEP-TALN-RECITAL 2012, 2012

Assessing the impact of language style on emergent leadership perception from ubiquitous audio.
Proceedings of the 11th International Conference on Mobile and Ubiquitous Multimedia, 2012

Multimodal Cue Detection Engine for Orchestrated Entertainment.
Proceedings of the Advances in Multimedia Modeling - 18th International Conference, 2012

Annotation and Recognition of Personality Traits in Spoken Conversations from the AMI Meetings Corpus.
Proceedings of the INTERSPEECH 2012, 2012

Conversion of Recurrent Neural Network Language Models to Weighted Finite State Transducers for Automatic Speech Recognition.
Proceedings of the INTERSPEECH 2012, 2012

Supervised and unsupervised Web-based language model domain adaptation.
Proceedings of the INTERSPEECH 2012, 2012

Comparing different acoustic modeling techniques for multilingual boosting.
Proceedings of the INTERSPEECH 2012, 2012

Bi-modal authentication in mobile environments using session variability modelling.
Proceedings of the 21st International Conference on Pattern Recognition, 2012

Generating exact lattices in the WFST framework.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Improving acoustic based keyword spotting using LVCSR lattices.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Detecting and labeling folk literature in spoken cultural heritage archives using structural and prosodic features.
Proceedings of the 10th International Workshop on Content-Based Multimedia Indexing, 2012

2011
Just-in-time multimodal association and fusion from home entertainment.
Proceedings of the 2011 IEEE International Conference on Multimedia and Expo, 2011

Multistream speaker diarization through Information Bottleneck system outputs combination.
Proceedings of the IEEE International Conference on Acoustics, 2011

Speaker diarization of meetings based on speaker role n-gram models.
Proceedings of the IEEE International Conference on Acoustics, 2011

2010
Autoregressive Models of Amplitude Modulations in Audio Compression.
IEEE Trans. Speech Audio Process., 2010

Wide-Band Audio Coding Based on Frequency-Domain Linear Prediction.
EURASIP J. Audio Speech Music. Process., 2010

English spoken term detection in multilingual recordings.
Proceedings of the INTERSPEECH 2010, 2010

Hands free audio analysis from home entertainment.
Proceedings of the INTERSPEECH 2010, 2010

Variational Bayesian speaker diarization of meeting recordings.
Proceedings of the IEEE International Conference on Acoustics, 2010

Application of out-of-language detection to spoken term detection.
Proceedings of the IEEE International Conference on Acoustics, 2010

2009
Applications of signal analysis using autoregressive models for amplitude modulation.
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2009

Error Resilient Speech Coding Using Sub-band Hilbert Envelopes.
Proceedings of the Text, Speech and Dialogue, 12th International Conference, 2009

Arithmetic coding of sub-band residuals in FDLP speech/audio codec.
Proceedings of the INTERSPEECH 2009, 2009

Automatic out-of-language detection based on confidence measures derived from LVCSR word and phone lattices.
Proceedings of the INTERSPEECH 2009, 2009

2008
Exploiting Contextual Information for Speech/Non-Speech Detection.
Proceedings of the Text, Speech and Dialogue, 11th International Conference, 2008

Perceptually Motivated Sub-band Decomposition for FDLP Audio Coding.
Proceedings of the Text, Speech and Dialogue, 11th International Conference, 2008

Spectral noise shaping: improvements in speech/audio codec based on linear prediction in spectral domain.
Proceedings of the INTERSPEECH 2008, 2008

The DIRAC AWEAR audio-visual platform for detection of unexpected and incongruent events.
Proceedings of the 10th International Conference on Multimodal Interfaces, 2008

Temporal masking for bit-rate reduction in audio codec based on Frequency Domain Linear Prediction.
Proceedings of the IEEE International Conference on Acoustics, 2008

2007
Non-uniform Speech/Audio Coding Exploiting Predictability of Temporal Evolution of Spectral Envelopes.
Proceedings of the Text, Speech and Dialogue, 10th International Conference, 2007

Frequency Domain Linear Prediction for QMF Sub-bands and Applications to Audio Coding.
Proceedings of the Machine Learning for Multimodal Interaction, 2007

Wide-Band Perceptual Audio Coding Based on Frequency-Domain Linear Prediction.
Proceedings of the IEEE International Conference on Acoustics, 2007

Unsupervised Speech/Non-Speech Detection for Automatic Speech Recognition in Meeting Rooms.
Proceedings of the IEEE International Conference on Acoustics, 2007

2006
Speech Coding Based on Spectral Dynamics.
Proceedings of the Text, Speech and Dialogue, 9th International Conference, 2006


2005
Non-parametric speaker turn segmentation of meeting data.
Proceedings of the INTERSPEECH 2005, 2005

2004
Multimodal Phoneme Recognition of Meeting Data.
Proceedings of the Text, Speech and Dialogue, 7th International Conference, 2004

2003
All-Pole Modeling for Definition of Speech Features in Aurora3 DSR Task.
Proceedings of the Text, Speech and Dialogue, 6th International Conference, 2003

Autoregressive modeling based feature extraction for Aurora3 DSR task.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Time-domain based temporal processing with application of orthogonal transformations.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

2002
Efficient Noise Estimation and Its Application for Robust Speech Recognition.
Proceedings of the Text, Speech and Dialogue, 5th International Conference, 2002

Noise estimation for efficient speech enhancement and robust speech recognition.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

2001
Minimization of Transition Noise and HNM Synthesis in Very Low Bit Rate Speech Coding.
Proceedings of the Text, Speech and Dialogue, 4th International Conference, 2001

2000
Optimal Pitch Path Tracking for More Reliable Pitch Detection.
Proceedings of the Text, Speech and Dialogue - Third International Workshop, 2000
