Petr Motlícek

CoRR, 2024

Fast Streaming Transducer ASR Prototyping via Knowledge Distillation with Whisper.

CoRR, 2024

TokenVerse: Unifying Speech and NLP Tasks via Transducer-based ASR.

CoRR, 2024

ROXSD: The ROXANNE Multimodal and Simulated Dataset for Advancing Criminal Investigations.

Proceedings of the Odyssey 2024: The Speaker and Language Recognition Workshop, 2024

Normalizing Flows for Speaker and Language Recognition Backend.

Proceedings of the Odyssey 2024: The Speaker and Language Recognition Workshop, 2024

Reliability Estimation of News Media Sources: Birds of a Feather Flock Together.

Dairazalia Sanchez-Cortes

Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

Speech and Language Recognition with Low-rank Adaptation of Pretrained Models.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Detecting Criminal Networks via Non-content Communication Data Analysis Techniques from the TRACY Project.

Pradeep Rangappa

Amanda Muscat

Alejandra Sanchez Lara

Michaela Antonopoulou

Proceedings of the Digital Forensics and Cyber Crime - 15th EAI International Conference, 2024

Probability-Aware Word-Confusion-Network-To-Text Alignment Approach for Intent Classification.

Proceedings of the IEEE International Conference on Acoustics, 2024

Fine-Tuning Self-Supervised Models for Language Identification Using Orthonormal Constraint.

Proceedings of the IEEE International Conference on Acoustics, 2024

Multitask Speech Recognition and Speaker Change Detection for Unknown Number of Speakers.

S. Pavankumar Dubagunta

Proceedings of the IEEE International Conference on Acoustics, 2024

Contextual Biasing Methods for Improving Rare Word Detection in Automatic Speech Recognition.

Mrinmoy Bhattacharjee

Proceedings of the IEEE International Conference on Acoustics, 2024

Fast Streaming Transducer ASR Prototyping via Knowledge Distillation with Whisper.

Iuliia Thorbecke

Juan Pablo Zuluaga-Gomez

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

TokenVerse: Towards Unifying Speech and NLP Tasks via Transducer-based ASR.

Shashi Kumar

Juan Pablo Zuluaga-Gomez

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Dialog2Flow: Pre-training Soft-Contrastive Action-Driven Sentence Embeddings for Automatic Dialog Flow Extraction.

Dairazalia Sanchez-Cortes

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Entity Matching Across Small Networks Using Node Attributes.

Proceedings of the ECAI 2024 - 27th European Conference on Artificial Intelligence, 19-24 October 2024, Santiago de Compostela, Spain, 2024

Mapping the Media Landscape: Predicting Factual Reporting and Political Bias Through Web Interactions.

Proceedings of the Experimental IR Meets Multilinguality, Multimodality, and Interaction, 2024

DAIC-WOZ: On the Validity of Using the Therapist's prompts in Automatic Depression Detection from Clinical Interviews.

Ernesto Reyes-Ramírez

Adrián Pastor López-Monroy

Fernando Sánchez-Vega

Proceedings of the 6th Clinical Natural Language Processing Workshop, 2024

2023

Lessons Learned in ATCO2: 5000 hours of Air Traffic Control Communications for Robust Automatic Speech Recognition and Understanding.

CoRR, 2023

A Virtual Simulation-Pilot Agent for Training of Air Traffic Controllers.

CoRR, 2023

Implementing Contextual Biasing in GPU Decoder for Online ASR.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

HyperConformer: Multi-head HyperMixer for Efficient Speech Recognition.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Node-weighted Graph Convolutional Network for Depression Detection in Transcribed Clinical Interviews.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Effectiveness of Text, Acoustic, and Lattice-Based Representations in Spoken Language Understanding Tasks.

Proceedings of the IEEE International Conference on Acoustics, 2023

Parameter-Efficient Tuning with Adaptive Bottlenecks for Automatic Speech Recognition.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

Claim-Dissector: An Interpretable Fact-Checking System with Joint Re-ranking and Veracity Prediction.

Martin Fajcik

Pavel Smrz

Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

2022

Speech and Natural Language Processing Technologies for Pseudo-Pilot Simulator.

CoRR, 2022

ATCO2 corpus: A Large-Scale Dataset for Research on Automatic Speech Recognition and Natural Language Understanding of Air Traffic Control Communications.

CoRR, 2022

How Does Pre-trained Wav2Vec2.0 Perform on Domain Shifted ASR? An Extensive Benchmark on Air Traffic Control Communications.

CoRR, 2022

BERTraffic: BERT-Based Joint Speaker Role and Speaker Change Detection for Air Traffic Control Communications.

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

How Does Pre-Trained Wav2Vec 2.0 Perform on Domain-Shifted ASR? An Extensive Benchmark on Air Traffic Control Communications.

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Expanded Lattice Embeddings for Spoken Document Retrieval on Informal Meetings.

Alexei V. Ivanov

Proceedings of the SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, July 11, 2022

Team Innovators at SemEval-2022 for Task 8: Multi-Task Training with Hyperpartisan and Semantic Relation for Multi-Lingual News Article Similarity.

Proceedings of the 16th International Workshop on Semantic Evaluation, SemEval@NAACL 2022, 2022

Bio-Medical Multi-label Scientific Literature Classification using LWAN and Dual-attention module.

Proceedings of the 36th Pacific Asia Conference on Language, Information and Computation, 2022

An End-to-End Multilingual System for Automatic Minuting of Multi-Party Dialogues.

Proceedings of the 36th Pacific Asia Conference on Language, Information and Computation, 2022

IDIAP Submission@LT-EDI-ACL2022: Detecting Signs of Depression from Social Media Text.

Proceedings of the Second Workshop on Language Technology for Equality, 2022

IDIAP Submission@LT-EDI-ACL2022: Homophobia/Transphobia Detection in social media comments.

Proceedings of the Second Workshop on Language Technology for Equality, 2022

IDIAP Submission@LT-EDI-ACL2022: Hope Speech Detection for Equality, Diversity and Inclusion.

Proceedings of the Second Workshop on Language Technology for Equality, 2022

IDIAP_TIET@LT-EDI-ACL2022: Hope Speech Detection in Social Media using Contextualized BERT with Attention Mechanism.

Deepanshu Khanna

Proceedings of the Second Workshop on Language Technology for Equality, 2022

Hierarchical Multi-task learning framework for Isometric-Speech Language Translation.

Proceedings of the 19th International Conference on Spoken Language Translation, 2022

A Two-Step Approach to Leverage Contextual Data: Speech Recognition in Air-Traffic Communications.

Proceedings of the IEEE International Conference on Acoustics, 2022

IDIAPers @ Causal News Corpus 2022: Extracting Cause-Effect-Signal Triplets via Pre-trained Autoregressive Language Model.

Proceedings of the 5th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text, 2022

IDIAPers @ Causal News Corpus 2022: Efficient Causal Relation Identification Through a Prompt-based Few-shot Approach.

Proceedings of the 5th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text, 2022

2021

Neural Network Adaptation and Data Augmentation for Multi-Speaker Direction-of-Arrival Estimation.

IEEE ACM Trans. Audio Speech Lang. Process., 2021

BERTraffic: A Robust BERT-Based Approach for Speaker Change Detection and Role Identification of Air-Traffic Communications.

CoRR, 2021

Grammar Based Identification Of Speaker Role For Improving ATCO And Pilot ASR.

CoRR, 2021

Improving callsign recognition with air-surveillance data in air-traffic communication.

CoRR, 2021

Applying Attention-Based Models for Detecting Cognitive Processes and Mental Health Conditions.

Cogn. Comput., 2021

IEEE SLT 2021 Alpha-Mini Speech Challenge: Open Datasets, Tracks, Rules and Baselines.

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Contextual Semi-Supervised Learning: An Approach to Leverage Air-Surveillance and Untranscribed ATC Data in ASR Systems.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Late Fusion of the Available Lexicon and Raw Waveform-Based Acoustic Modeling for Depression and Dementia Recognition.

Gabriela Ramírez-de-la-Rosa

S. Pavankumar Dubagunta

Julian Fritsch

Mathew Magimai-Doss

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Speech Activity Detection Based on Multilingual Speech Recognition System.

Seyyed Saeed Sarfjoo

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Robust Command Recognition for Lithuanian Air Traffic Control Tower Utterances.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Multitask Adaptation with Lattice-Free MMI for Multi-Genre Speech Recognition of Low Resource Languages.

Hervé Bourlard

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Boosting of Contextual Information in ASR for Air-Traffic Call-Sign Recognition.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Multi-Task Neural Network for Robust Multiple Speaker Embedding Extraction.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

ROXANNE Research Platform: Automate Criminal Investigations.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

A Comparison of Methods for OOV-Word Recognition on a New Public Dataset.

Rudolf A. Braun

Proceedings of the IEEE International Conference on Acoustics, 2021

NLPHut's Participation at WAT2021.

Proceedings of the 8th Workshop on Asian Translation, 2021

2020

Inferring Highly-dense Representations for Clustering Broadcast Media Content.

Prague Bull. Math. Linguistics, 2020

Pkwrap: a PyTorch Package for LF-MMI Training of Acoustic Models.

CoRR, 2020

Quantization of Acoustic Model Parameters in Automatic Speech Recognition Framework.

Amrutha Prasad

CoRR, 2020

Improving Speaker Identification using Network Knowledge in Criminal Conversational Data.

CoRR, 2020

Idiap Submission to Swiss-German Language Detection Shared Task.

Proceedings of the 5th Swiss Text Analytics Conference and the 16th Conference on Natural Language Processing, 2020

Idiap and UAM Participation at MEX-A3T Evaluation Campaign.

Gabriela Ramírez-de-la-Rosa

Sajit Kumar

Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2020) co-located with 36th Conference of the Spanish Society for Natural Language Processing (SEPLN 2020), 2020

The MuMMER Data Set for Robot Perception in Multi-party HRI Scenarios.

Proceedings of the 29th IEEE International Conference on Robot and Human Interactive Communication, 2020

Automatic Speech Recognition Benchmark for Air-Traffic Communications.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Supervised Domain Adaptation for Text-Independent Speaker Verification Using Limited Data.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Lattice-Free Maximum Mutual Information Training of Multilingual Speech Recognition Systems.

Banriskhem K. Khonglah

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Detection of Similar Languages and Dialects Using Deep Supervised Autoencoder.

Proceedings of the 17th International Conference on Natural Language Processing, 2020

BertAA: BERT fine-tuning for Authorship Attribution.

Proceedings of the 17th International Conference on Natural Language Processing, 2020

Incremental Semi-Supervised Learning for Multi-Genre Speech Recognition.

Banriskhem K. Khonglah

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

ODIANLP's Participation in WAT2020.

Debasish Kumar Mallick

Satya Prakash Biswal

Priyanka Pattnaik

Biranchi Narayan Nayak

Ondrej Bojar

Proceedings of the 7th Workshop on Asian Translation, 2020

2019

Voice Presentation Attack Detection Using Convolutional Neural Networks.

Proceedings of the Handbook of Biometric Anti-Spoofing, 2019

MuMMER: Socially Intelligent Human-Robot Interaction in Public Spaces.

CoRR, 2019

Spoken Language Identification Using Language Bottleneck Features.

Proceedings of the Text, Speech, and Dialogue - 22nd International Conference, 2019

Idiap Abstract Text Summarization System for German Text Summarization Task.

Proceedings of the 4th Swiss Text Analytics Conference, SwissText 2019, Winterthur, 2019

End-to-End Accented Speech Recognition.

Thibault Viglino

Milos Cernak

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Exploiting Semi-Supervised Training Through a Dropout Regularization in End-to-End Speech Recognition.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Adaptation of Assistant Based Speech Recognition to New Domains and Its Acceptance by Air Traffic Controllers.

Proceedings of the Intelligent Human Systems Integration 2019, 2019

A Bayesian Approach to Inter-task Fusion for Speaker Recognition.

Proceedings of the IEEE International Conference on Acoustics, 2019

Adaptation of Multiple Sound Source Localization Neural Networks with Weak Supervision and Domain-adversarial Training.

Proceedings of the IEEE International Conference on Acoustics, 2019

Abstract Text Summarization: A Low Resource Challenge.

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Cross-lingual Automatic Speech Recognition Exploiting Articulatory Features.

Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

Idiap NMT System for WAT 2019 Multimodal Translation Task.

Ondrej Bojar

Proceedings of the 6th Workshop on Asian Translation, 2019

2018

Iterative Learning of Speech Recognition Models for Air Traffic Control.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Analysis of Language Dependent Front-End for Speaker Recognition.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Joint Localization and Classification of Multiple Sound Sources Using a Multi-task Neural Network.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

End-to-end Text-dependent Speaker Verification Using Novel Distance Measures.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Deep Neural Networks for Multiple Speaker Detection and Localization.

Proceedings of the 2018 IEEE International Conference on Robotics and Automation, 2018

DNN Based Speaker Embedding Using Content Information for Text-Dependent Speaker Verification.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017

Template-matching for text-dependent speaker verification.

Speech Commun., 2017

Semi-Supervised Learning with Semantic Knowledge Extraction for Improved Speech Recognition in Air Traffic Control.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Content Normalization for Text-Dependent Speaker Verification.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Intra-class covariance adaptation in PLDA back-ends for speaker verification.

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Exploiting sequence information for text-dependent Speaker Verification.

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Towards a Breakthrough Speaker Identification Approach for Law Enforcement Agencies: SIIP.

Emmanouil Chatzigavriil

Proceedings of the European Intelligence and Security Informatics Conference, 2017

A context-aware speech recognition and understanding system for air traffic control domain.

Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

2016

A Large-Scale Open-Source Acoustic Simulator for Speaker Recognition.

IEEE Signal Process. Lett., 2016

Feature mapping using far-field microphones for distant speech recognition.

Speech Commun., 2016

Investigating Cross-lingual Multi-level Adaptive Networks: The Importance of the Correlation of Source and Target Languages.

Proceedings of the 13th International Conference on Spoken Language Translation, 2016

Idlak Tangle: An Open Source Kaldi Based Parametric Speech Synthesiser Based on DNN.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Inter-Task System Fusion for Speaker Recognition.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

System fusion and speaker linking for longitudinal diarization of TV shows.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Information theoretic clustering for unsupervised domain-adaptation.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Deep neural network based posteriors for text-dependent speaker verification.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015

Incremental Syllable-Context Phonetic Vocoding.

IEEE ACM Trans. Audio Speech Lang. Process., 2015

Exploiting foreign resources for DNN-based ASR.

EURASIP J. Audio Speech Music. Process., 2015

Integrating online i-vector extractor with information bottleneck based speaker diarization system.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Channel selection in the short-time modulation domain for distant speech recognition.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Employment of Subspace Gaussian Mixture Models in speaker recognition.

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Combining SGMM speaker vectors and KL-HMM approach for speaker diarization.

Hervé Bourlard

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Learning feature mapping using deep neural network bottleneck features for distant large vocabulary speech recognition.

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Towards utterance-based neural network adaptation in acoustic modeling.

Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

2014

Using out-of-language data to improve an under-resourced speech recognizer.

Speech Commun., 2014

The DBOX Corpus Collection of Spoken Human-Human and Human-Machine Dialogues.

Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

Phoneme background model for information bottleneck based speaker diarization.

Sree Harsha Yella

Hervé Bourlard

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Development of bilingual ASR system for MediaParl corpus.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Stress and accent transmission in HMM-based syllable-context very low bit rate speech coding.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Multilingual deep neural network based acoustic modeling for rapid language adaptation.

Proceedings of the IEEE International Conference on Acoustics, 2014

Exploiting un-transcribed foreign data for speech recognition in well-resourced languages.

Proceedings of the IEEE International Conference on Acoustics, 2014

2013

A Simple Continuous Pitch Estimation Algorithm.

Milos Cernak

IEEE Signal Process. Lett., 2013

Real-Time Audio-Visual Analysis for Multiperson Videoconferencing.

Adv. Multim., 2013

Crosslingual tandem-SGMM: exploiting out-of-language data for acoustic model and feature level adaptation.

David Imseng

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Feature and score level combination of subspace Gaussians in LVCSR task.

Daniel Povey

Martin Karafiát

Proceedings of the IEEE International Conference on Acoustics, 2013

Accent adaptation using Subspace Gaussian Mixture Models.

Proceedings of the IEEE International Conference on Acoustics, 2013

On the (UN)importance of the contextual factors in HMM-based speech synthesis and coding.

Milos Cernak

Dairazalia Sanchez-Cortes

Proceedings of the IEEE International Conference on Acoustics, 2013

Impact of deep MLP architecture on different acoustic modeling techniques for under-resourced speech recognition.

Proceedings of the 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, 2013

2012

Impact du degré de supervision sur l'adaptation à un domaine d'un modèle de langage à partir du Web (Impact of the level of supervision on Web-based language model domain adaptation) [in French].

Proceedings of the Joint Conference JEP-TALN-RECITAL 2012, 2012

Assessing the impact of language style on emergent leadership perception from ubiquitous audio.

Daniel Gatica-Perez

Proceedings of the 11th International Conference on Mobile and Ubiquitous Multimedia, 2012

Multimodal Cue Detection Engine for Orchestrated Entertainment.

Proceedings of the Advances in Multimedia Modeling - 18th International Conference, 2012

Annotation and Recognition of Personality Traits in Spoken Conversations from the AMI Meetings Corpus.

Samuel Kim

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Conversion of Recurrent Neural Network Language Models to Weighted Finite State Transducers for Automatic Speech Recognition.

Gwénolé Lecorvé

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Supervised and unsupervised Web-based language model domain adaptation.

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Comparing different acoustic modeling techniques for multilingual boosting.

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Bi-modal authentication in mobile environments using session variability modelling.

Proceedings of the 21st International Conference on Pattern Recognition, 2012

Generating exact lattices in the WFST framework.

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Improving acoustic based keyword spotting using LVCSR lattices.

Igor Szöke

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Detecting and labeling folk literature in spoken cultural heritage archives using structural and prosodic features.

Proceedings of the 10th International Workshop on Content-Based Multimedia Indexing, 2012

2011

Just-in-time multimodal association and fusion from home entertainment.

Proceedings of the 2011 IEEE International Conference on Multimedia and Expo, 2011

Multistream speaker diarization through Information Bottleneck system outputs combination.

Deepu Vijayasenan

Proceedings of the IEEE International Conference on Acoustics, 2011

Speaker diarization of meetings based on speaker role n-gram models.

Deepu Vijayasenan

Proceedings of the IEEE International Conference on Acoustics, 2011

2010

Autoregressive Models of Amplitude Modulations in Audio Compression.

Sriram Ganapathy

IEEE Trans. Speech Audio Process., 2010

Wide-Band Audio Coding Based on Frequency-Domain Linear Prediction.

EURASIP J. Audio Speech Music. Process., 2010

English spoken term detection in multilingual recordings.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Hands free audio analysis from home entertainment.

Danil Korchagin

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Variational Bayesian speaker diarization of meeting recordings.

Deepu Vijayasenan

Proceedings of the IEEE International Conference on Acoustics, 2010

Application of out-of-language detection to spoken term detection.

Proceedings of the IEEE International Conference on Acoustics, 2010

2009

Applications of signal analysis using autoregressive models for amplitude modulation.

Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2009

Error Resilient Speech Coding Using Sub-band Hilbert Envelopes.

Sriram Ganapathy

Proceedings of the Text, Speech and Dialogue, 12th International Conference, 2009

Arithmetic coding of sub-band residuals in FDLP speech/audio codec.

Sriram Ganapathy

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Automatic out-of-language detection based on confidence measures derived from LVCSR word and phone lattices.

Sree Hari Krishnan Parthasarathi

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

2008

Exploiting Contextual Information for Speech/Non-Speech Detection.

Proceedings of the Text, Speech and Dialogue, 11th International Conference, 2008

Perceptually Motivated Sub-band Decomposition for FDLP Audio Coding.

Proceedings of the Text, Speech and Dialogue, 11th International Conference, 2008

Spectral noise shaping: improvements in speech/audio codec based on linear prediction in spectral domain.

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

The DIRAC AWEAR audio-visual platform for detection of unexpected and incongruent events.

Proceedings of the 10th International Conference on Multimodal Interfaces, 2008

Temporal masking for bit-rate reduction in audio codec based on Frequency Domain Linear Prediction.

Proceedings of the IEEE International Conference on Acoustics, 2008

2007

Non-uniform Speech/Audio Coding Exploiting Predictability of Temporal Evolution of Spectral Envelopes.

Proceedings of the Text, Speech and Dialogue, 10th International Conference, 2007

Frequency Domain Linear Prediction for QMF Sub-bands and Applications to Audio Coding.

Proceedings of the Machine Learning for Multimodal Interaction, 2007

Wide-Band Perceptual Audio Coding Based on Frequency-Domain Linear Prediction.

Vijay Ullal

Proceedings of the IEEE International Conference on Acoustics, 2007

Unsupervised Speech/Non-Speech Detection for Automatic Speech Recognition in Meeting Rooms.

Hari Krishna Maganti

Daniel Gatica-Perez

Proceedings of the IEEE International Conference on Acoustics, 2007

2006

Speech Coding Based on Spectral Dynamics.

Harinath Garudadri

Naveen Srinivasamurthy

Proceedings of the Text, Speech and Dialogue, 9th International Conference, 2006

Audio-Visual Processing in Meetings: Seven Questions and Current AMI Answers.

Proceedings of the Machine Learning for Multimodal Interaction, 2006

2005

Non-parametric speaker turn segmentation of meeting data.

Lukás Burget

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

2004

Multimodal Phoneme Recognition of Meeting Data.

Proceedings of the Text, Speech and Dialogue, 7th International Conference, 2004

2003

All-Pole Modeling for Definition of Speech Features in Aurora3 DSR Task.

Proceedings of the Text, Speech and Dialogue, 6th International Conference, 2003

Autoregressive modeling based feature extraction for Aurora3 DSR task.

Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Time-domain based temporal processing with application of orthogonal transformations.

Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

2002

Efficient Noise Estimation and Its Application for Robust Speech Recognition.

Lukás Burget

Proceedings of the Text, Speech and Dialogue, 5th International Conference, 2002

Noise estimation for efficient speech enhancement and robust speech recognition.

Lukás Burget

Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

2001

Minimization of Transition Noise and HNM Synthesis in Very Low Bit Rate Speech Coding.

Proceedings of the Text, Speech and Dialogue, 4th International Conference, 2001

2000

Optimal Pitch Path Tracking for More Reliable Pitch Detection.
