Hervé Bourlard

According to our database, Hervé Bourlard authored at least 307 papers between 1984 and 2022.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Awards

IEEE Fellow

IEEE Fellow 2000, "For contributions to the fields of statistical speech recognition and neural networks."

Bibliography

2022
Autoencoders reloaded.
Biol. Cybern., 2022

From Undercomplete to Sparse Overcomplete Autoencoders to Improve LF-MMI based Speech Recognition.
Proceedings of the Interspeech 2022, 2022

Comparison of 5 methods for the evaluation of intelligibility in mild to moderate French dysarthric speech.
Proceedings of the Interspeech 2022, 2022

2021
Subspace-Based Learning for Automatic Dysarthric Speech Detection.
IEEE Signal Process. Lett., 2021

Comparing CTC and LFMMI for Out-of-Domain Adaptation of wav2vec 2.0 Acoustic Model.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Multitask Adaptation with Lattice-Free MMI for Multi-Genre Speech Recognition of Low Resource Languages.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Lattice-Free MMI Adaptation of Self-Supervised Pretrained Acoustic Models.
Proceedings of the IEEE International Conference on Acoustics, 2021

Automatic and Perceptual Discrimination Between Dysarthria, Apraxia of Speech, and Neurotypical Speech.
Proceedings of the IEEE International Conference on Acoustics, 2021

Automatic Dysarthric Speech Detection Exploiting Pairwise Distance-Based Convolutional Neural Networks.
Proceedings of the IEEE International Conference on Acoustics, 2021

Speech Dereverberation Using Variational Autoencoders.
Proceedings of the IEEE International Conference on Acoustics, 2021

2020
Neural Network Based End-to-End Query by Example Spoken Term Detection.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

Spectro-Temporal Sparsity Characterization for Dysarthric Speech Detection.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

Automatic Pathological Speech Intelligibility Assessment Exploiting Subspace-Based Analyses.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

On quantifying the quality of acoustic models in hybrid DNN-HMM ASR.
Speech Commun., 2020

Pkwrap: a PyTorch Package for LF-MMI Training of Acoustic Models.
CoRR, 2020

Lattice-Free Maximum Mutual Information Training of Multilingual Speech Recognition Systems.
Proceedings of the Interspeech 2020, 2020

Automatic Discrimination of Apraxia of Speech and Dysarthria Using a Minimalistic Set of Handcrafted Features.
Proceedings of the Interspeech 2020, 2020

Incremental Semi-Supervised Learning for Multi-Genre Speech Recognition.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Synthetic Speech References for Automatic Pathological Speech Intelligibility Assessment.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Low-rank and sparse subspace modeling of speech for DNN based acoustic modeling.
Speech Commun., 2019

Unbiased Semi-Supervised LF-MMI Training Using Dropout.
Proceedings of the Interspeech 2019, 2019

Spectral Subspace Analysis for Automatic Assessment of Pathological Speech Intelligibility.
Proceedings of the Interspeech 2019, 2019

Analyzing Uncertainties in Speech Recognition Using Dropout.
Proceedings of the IEEE International Conference on Acoustics, 2019

An Investigation of Multilingual ASR Using End-to-end LF-MMI.
Proceedings of the IEEE International Conference on Acoustics, 2019

An End-to-end Network to Synthesize Intonation Using a Generalized Command Response Model.
Proceedings of the IEEE International Conference on Acoustics, 2019

Super-gaussianity of Speech Spectral Coefficients as a Potential Biomarker for Dysarthric Speech Detection.
Proceedings of the IEEE International Conference on Acoustics, 2019

Pathological Speech Intelligibility Assessment Based on the Short-time Objective Intelligibility Measure.
Proceedings of the IEEE International Conference on Acoustics, 2019

Multilingual Bottleneck Features for Query by Example Spoken Term Detection.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018
Sparse Subspace Modeling for Query by Example Spoken Term Detection.
IEEE ACM Trans. Audio Speech Lang. Process., 2018

Cross-lingual adaptation of a CTC-based multilingual acoustic model.
Speech Commun., 2018

Phonetic subspace features for improved query by example spoken term detection.
Speech Commun., 2018

Far-Field ASR Using Low-Rank and Sparse Soft Targets from Parallel Data.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Fast Language Adaptation Using Phonological Information.
Proceedings of the Interspeech 2018, 2018

CNN Based Query by Example Spoken Term Detection.
Proceedings of the Interspeech 2018, 2018

Single-channel Late Reverberation Power Spectral Density Estimation Using Denoising Autoencoders.
Proceedings of the Interspeech 2018, 2018

Evolution of Neural Network Architectures for Speech Recognition.
Proceedings of the Interspeech 2018, 2018

Phonological Posterior Hashing for Query by Example Spoken Term Detection.
Proceedings of the Interspeech 2018, 2018

Statistical Modeling of Speech Spectral Coefficients in Patients with Parkinson's Disease.
Proceedings of the 13th ITG Symposium on Speech Communication, 2018

2017
Perceptual Information Loss due to Impaired Speech Production.
IEEE ACM Trans. Audio Speech Lang. Process., 2017

Multilingual Training and Cross-lingual Adaptation on CTC-based Acoustic Model.
CoRR, 2017

An Investigation of Deep Neural Networks for Multilingual Speech Recognition Training and Adaptation.
Proceedings of the Interspeech 2017, 2017

Exploiting Eigenposteriors for Semi-Supervised Training of DNN Acoustic Models with Sequence Discrimination.
Proceedings of the Interspeech 2017, 2017

Low-rank and sparse soft targets to learn better DNN acoustic models.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

2016
Binary Sparse Coding of Convolutive Mixtures for Sound Localization and Separation via Spatialization.
IEEE Trans. Signal Process., 2016

Speaker Diarization and Linking of Meeting Data.
IEEE ACM Trans. Audio Speech Lang. Process., 2016

A Large-Scale Open-Source Acoustic Simulator for Speaker Recognition.
IEEE Signal Process. Lett., 2016

Predicting the intrusiveness of noise through sparse coding with auditory kernels.
Speech Commun., 2016

Sparse modeling of neural network posterior probabilities for exemplar-based speech recognition.
Speech Commun., 2016

On structured sparsity of phonological posteriors for linguistic parsing.
Speech Commun., 2016

Computational methods for underdetermined convolutive speech localization and separation via model-based sparse component analysis.
Speech Commun., 2016

Subspace Detection of DNN Posterior Probabilities via Sparse Representation for Query by Example Spoken Term Detection.
Proceedings of the Interspeech 2016, 2016

Low-Rank Representation of Nearest Neighbor Posterior Probabilities to Enhance DNN Based Acoustic Modeling.
Proceedings of the Interspeech 2016, 2016

Inter-Task System Fusion for Speaker Recognition.
Proceedings of the Interspeech 2016, 2016

Sound Pattern Matching for Automatic Prosodic Event Detection.
Proceedings of the Interspeech 2016, 2016

Phonetic and Phonological Posterior Search Space Hashing Exploiting Class-Specific Sparsity Structures.
Proceedings of the Interspeech 2016, 2016

System fusion and speaker linking for longitudinal diarization of TV shows.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Exploiting low-dimensional structures to enhance DNN based acoustic modeling in speech recognition.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015
Automatic Recognition of Emergent Social Roles in Small Group Interactions.
IEEE Trans. Multim., 2015

Ad hoc microphone array calibration: Euclidean distance matrix completion algorithm and theoretical guarantees.
Signal Process., 2015

Spatial Sound Localization via Multipath Euclidean Distance Matrix Recovery.
IEEE J. Sel. Top. Signal Process., 2015

Objective intelligibility assessment of text-to-speech systems through utterance verification.
Proceedings of the INTERSPEECH 2015, 2015

Sparse modeling of posterior exemplars for keyword detection.
Proceedings of the INTERSPEECH 2015, 2015

On compressibility of neural network phonological features for low bit rate speech coding.
Proceedings of the INTERSPEECH 2015, 2015

Novel GCC-PHAT model in diffuse sound field for microphone array pairwise distance based calibration.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Objective speech intelligibility assessment through comparison of phoneme class conditional probability sequences.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Robust microphone placement for source localization from noisy distance measurements.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Combining SGMM speaker vectors and KL-HMM approach for speaker diarization.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

KL-HMM based speaker diarization system for meetings.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

On application of non-negative matrix factorization for ad hoc microphone array calibration from incomplete noisy distances.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

2014
Overlapping speech detection using long-term conversational features for speaker diarization in meeting room conversations.
IEEE ACM Trans. Audio Speech Lang. Process., 2014

Feature mapping of multiple beamformed sources for robust overlapping speech recognition using a microphone array.
IEEE ACM Trans. Audio Speech Lang. Process., 2014

Structured Sparsity Models for Reverberant Speech Separation.
IEEE ACM Trans. Audio Speech Lang. Process., 2014

Using out-of-language data to improve an under-resourced speech recognizer.
Speech Commun., 2014

Enhanced diffuse field model for ad hoc microphone array calibration.
Signal Process., 2014

Modeling Overlapping Speech using Vector Taylor Series.
Proceedings of the Odyssey 2014: The Speaker and Language Recognition Workshop, 2014

Phoneme background model for information bottleneck based speaker diarization.
Proceedings of the INTERSPEECH 2014, 2014

Detecting speaker roles and topic changes in multiparty conversations using latent topic models.
Proceedings of the INTERSPEECH 2014, 2014

Diarizing large corpora using multi-modal speaker linking.
Proceedings of the INTERSPEECH 2014, 2014

Multi-source posteriors for speech activity detection on public talks.
Proceedings of the INTERSPEECH 2014, 2014

Detecting and labeling speakers on overlapping speech using vector Taylor series.
Proceedings of the INTERSPEECH 2014, 2014

Posterior-based sparse representation for automatic speech recognition.
Proceedings of the INTERSPEECH 2014, 2014

ROCKIT: Roadmap for Conversational Interaction Technologies.
Proceedings of the 2014 Workshop on Roadmapping the Future of Multimodal Interaction Research including Business Opportunities and Challenges, 2014

Information bottleneck based speaker diarization of meetings using non-speech as side information.
Proceedings of the IEEE International Conference on Acoustics, 2014

Multilingual deep neural network based acoustic modeling for rapid language adaptation.
Proceedings of the IEEE International Conference on Acoustics, 2014

Improving speaker diarization using social role information.
Proceedings of the IEEE International Conference on Acoustics, 2014

Filterbank slope based features for speaker diarization.
Proceedings of the IEEE International Conference on Acoustics, 2014

Exploiting un-transcribed foreign data for speech recognition in well-resourced languages.
Proceedings of the IEEE International Conference on Acoustics, 2014

Model-based sparse component analysis for reverberant speech localization.
Proceedings of the IEEE International Conference on Acoustics, 2014

Ad-hoc microphone array calibration from partial distance measurements.
Proceedings of the 4th Joint Workshop on Hands-free Speech Communication and Microphone Arrays, 2014

2013
Wordless Sounds: Robust Speaker Diarization Using Privacy-Preserving Audio Representations.
IEEE Trans. Speech Audio Process., 2013

Robust Log-Energy Estimation and its Dynamic Change Enhancement for In-car Speech Recognition.
IEEE Trans. Speech Audio Process., 2013

Applying Multi- and Cross-Lingual Stochastic Phone Space Transformations to Non-Native Speech Recognition.
IEEE Trans. Speech Audio Process., 2013

Real-Time Audio-Visual Analysis for Multiperson Videoconferencing.
Adv. Multim., 2013

Automatic social role recognition in professional meetings using conditional random fields.
Proceedings of the INTERSPEECH 2013, 2013

Processing and Linking Audio Events in Large Multimedia Archives: The EU inEvent Project.
Proceedings of the First Workshop on Speech, 2013

Euclidean distance matrix completion for ad-hoc microphone array calibration.
Proceedings of the 18th International Conference on Digital Signal Processing, 2013

Improved overlap speech diarization of meeting recordings using long-term conversational features.
Proceedings of the IEEE International Conference on Acoustics, 2013

Speaker adaptive Kullback-Leibler divergence based hidden Markov models.
Proceedings of the IEEE International Conference on Acoustics, 2013

MLP-based factor analysis for tandem speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2013

Impact of deep MLP architecture on different acoustic modeling techniques for under-resourced speech recognition.
Proceedings of the 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, 2013

Investigating the Impact of Language Style and Vocal Expression on Social Roles of Participants in Professional Meetings.
Proceedings of the 2013 Humaine Association Conference on Affective Computing and Intelligent Interaction, 2013

2012
A Technical Revolution: Social Learning and Networking [From the Guest Editors].
IEEE Signal Process. Mag., 2012

Multistream speaker diarization of meetings recordings beyond MFCC and TDOA features.
Speech Commun., 2012

Finding Information in Multimedia Meeting Records.
IEEE Multim., 2012

Structured Sparsity Models for Multiparty Speech Recovery from Reverberant Recordings.
CoRR, 2012

Boosting under-resourced speech recognizers by exploiting out-of-language data - case study on Afrikaans.
Proceedings of the Third Workshop on Spoken Language Technologies for Under-resourced Languages, 2012

MediaParl: Bilingual mixed language accented speech database.
Proceedings of the 2012 IEEE Spoken Language Technology Workshop (SLT), 2012

Template-based ASR using posterior features and synthetic references: comparing different TTS systems.
Proceedings of the ISCA Workshop on Statistical And Perceptual Audition, 2012

Synthetic References for Template-based ASR using posterior features.
Proceedings of the INTERSPEECH 2012, 2012

Sub-band based Log-energy and Its Dynamic Range Stretching for Robust In-car Speech Recognition.
Proceedings of the INTERSPEECH 2012, 2012

Comparing different acoustic modeling techniques for multilingual boosting.
Proceedings of the INTERSPEECH 2012, 2012

Robust triphone mapping for acoustic modeling.
Proceedings of the INTERSPEECH 2012, 2012

Structured sparse coding for microphone array location calibration.
Proceedings of the ISCA Workshop on Statistical And Perceptual Audition, 2012

Microphone array beampattern characterization for hands-free speech applications.
Proceedings of the IEEE 7th Sensor Array and Multichannel Signal Processing Workshop, 2012

Using KL-divergence and multilingual information to improve ASR for under-resourced languages.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Computational methods for structured sparse component analysis of convolutive speech mixtures.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

2011
An Information Theoretic Combination of MFCC and TDOA Features for Speaker Diarization.
IEEE Trans. Speech Audio Process., 2011

Analysis of MLP-Based Hierarchical Phoneme Posterior Probability Estimator.
IEEE Trans. Speech Audio Process., 2011

Privacy-Sensitive Audio Features for Speech/Nonspeech Detection.
IEEE ACM Trans. Audio Speech Lang. Process., 2011

Media Search in Mobile Devices [From the Guest Editors].
IEEE Signal Process. Mag., 2011

Hierarchical Tandem Features for ASR in Mandarin.
Proceedings of the INTERSPEECH 2011, 2011

LP Residual Features for Robust, Privacy-Sensitive Speaker Diarization.
Proceedings of the INTERSPEECH 2011, 2011

Grapheme-Based Automatic Speech Recognition Using KL-HMM.
Proceedings of the INTERSPEECH 2011, 2011

Improving Non-Native ASR Through Stochastic Multilingual Phoneme Space Transformations.
Proceedings of the INTERSPEECH 2011, 2011

Multi-Party Speech Recovery Exploiting Structured Sparsity Models.
Proceedings of the INTERSPEECH 2011, 2011

Just-in-time multimodal association and fusion from home entertainment.
Proceedings of the 2011 IEEE International Conference on Multimedia and Expo, 2011

Posterior features for template-based ASR.
Proceedings of the IEEE International Conference on Acoustics, 2011

Language dependent universal phoneme posterior estimation for mixed language speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2011

Model-based compressive sensing for multi-party distant speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2011

A BSS-based approach for localization of simultaneous speakers in reverberant conditions.
Proceedings of the 19th European Signal Processing Conference, 2011

2010
Enhanced Phone Posteriors for Improving Speech Recognition Systems.
IEEE Trans. Speech Audio Process., 2010

Introduction to the Special Issue on Signal and Information Processing for Social Networks.
IEEE J. Sel. Top. Signal Process., 2010

Mobile social signal processing: vision and research issues.
Proceedings of the 12th Conference on Human-Computer Interaction with Mobile Devices and Services, 2010

Advances in fast multistream diarization based on the information bottleneck framework.
Proceedings of the INTERSPEECH 2010, 2010

Hierarchical multilayer perceptron based language identification.
Proceedings of the INTERSPEECH 2010, 2010

Towards mixed language speech recognition systems.
Proceedings of the INTERSPEECH 2010, 2010

Audio-visual synchronisation for speaker diarisation.
Proceedings of the INTERSPEECH 2010, 2010

Floor holder detection and end of speaker turn prediction in meetings.
Proceedings of the INTERSPEECH 2010, 2010

Sparse component analysis for speech recognition in multi-speaker environment.
Proceedings of the INTERSPEECH 2010, 2010

Multistream speaker diarization beyond two acoustic feature streams.
Proceedings of the IEEE International Conference on Acoustics, 2010

Evaluating the robustness of privacy-sensitive audio features for speech detection in personal audio log scenarios.
Proceedings of the IEEE International Conference on Acoustics, 2010

Using audio and visual cues for speaker diarisation initialisation.
Proceedings of the IEEE International Conference on Acoustics, 2010

Analysis of phone posterior feature space exploiting class-specific sparsity and MLP-based similarity measure.
Proceedings of the IEEE International Conference on Acoustics, 2010

2009
Intelligent Multi-modal Interfaces for Mobile Applications in Hostile Environment (IM-HOST).
Proceedings of the Human Machine Interaction, Research Results of the MMI Program, 2009

An Information Theoretic Approach to Speaker Diarization of Meeting Data.
IEEE Trans. Speech Audio Process., 2009

Social signal processing: Survey of an emerging domain.
Image Vis. Comput., 2009

Investigating the use of visual focus of attention for audio-visual speaker diarisation.
Proceedings of the 17th International Conference on Multimedia 2009, 2009

KL realignment for speaker diarization with multiple feature streams.
Proceedings of the INTERSPEECH 2009, 2009

Investigating privacy-sensitive features for speech detection in multiparty conversations.
Proceedings of the INTERSPEECH 2009, 2009

Speaker change detection with privacy-preserving audio cues.
Proceedings of the 11th International Conference on Multimodal Interfaces, 2009

Mutual information based channel selection for speaker diarization of meetings data.
Proceedings of the IEEE International Conference on Acoustics, 2009

Non-linear mapping for multi-channel speech separation and robust overlapping speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2009

Posterior features applied to speech recognition tasks with user-defined vocabulary.
Proceedings of the IEEE International Conference on Acoustics, 2009

MLP based hierarchical system for task adaptation in ASR.
Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009

2008
Social signal processing: state-of-the-art and future perspectives of an emerging domain.
Proceedings of the 16th International Conference on Multimedia 2008, 2008

A Neural Network Based Regression Approach for Recognizing Simultaneous Speech.
Proceedings of the Machine Learning for Multimodal Interaction, 5th International Workshop, 2008

Integration of TDOA features in information bottleneck framework for fast speaker diarization.
Proceedings of the INTERSPEECH 2008, 2008

Neural network based regression for robust overlapping speech recognition using microphone arrays.
Proceedings of the INTERSPEECH 2008, 2008

Using KL-based acoustic models in a large vocabulary recognition task.
Proceedings of the INTERSPEECH 2008, 2008

Social signals, their function, and automatic analysis: a survey.
Proceedings of the 10th International Conference on Multimodal Interfaces, 2008

Combination of agglomerative and sequential clustering for speaker diarization.
Proceedings of the IEEE International Conference on Acoustics, 2008

Hierarchical integration of phonetic and lexical knowledge in phone posterior estimation.
Proceedings of the IEEE International Conference on Acoustics, 2008

MLP-based log spectral energy mapping for robust overlapping speech recognition.
Proceedings of the 2008 16th European Signal Processing Conference, 2008

2007
Posterior-Based Features and Distances in Template Matching for Speech Recognition.
Proceedings of the Machine Learning for Multimodal Interaction, 2007

Non-linear spectral contrast stretching for in-car speech recognition.
Proceedings of the INTERSPEECH 2007, 2007

In-context phone posteriors as complementary features for tandem ASR.
Proceedings of the INTERSPEECH 2007, 2007

An Acoustic Model Based on Kullback-Leibler Divergence for Posterior Features.
Proceedings of the IEEE International Conference on Acoustics, 2007

Agglomerative information bottleneck for speaker diarization of meetings data.
Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

Recognition and understanding of meetings: the AMI and AMIDA projects.
Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

2006
On variable-scale piecewise stationary spectral analysis of speech signals for ASR.
Speech Commun., 2006

User-customized password speaker verification using multiple reference and background models.
Speech Commun., 2006

Understanding and Modeling Communication Scenes.
Proceedings of the 2006 IEEE ACL Spoken Language Technology Workshop, 2006

Multi-stream ASR: an oracle perspective.
Proceedings of the INTERSPEECH 2006, 2006

Posterior based keyword spotting with a priori thresholds.
Proceedings of the INTERSPEECH 2006, 2006

Using posterior-based features in template matching for speech recognition.
Proceedings of the INTERSPEECH 2006, 2006

Threshold Selection for Unsupervised Detection, With an Application to Microphone Arrays.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Using More Informative Posterior Probabilities for Speech Recognition.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Using Pitch as Prior Knowledge in Template-Based Speech Recognition.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

2005
Comparison and combination of features in a hybrid HMM/MLP and a HMM/GMM speech recognition system.
IEEE Trans. Speech Audio Process., 2005

Pushing the envelope - aside [speech recognition].
IEEE Signal Process. Mag., 2005

A Variable-Scale Piecewise Stationary Spectral Analysis Technique Applied to ASR.
Proceedings of the Machine Learning for Multimodal Interaction, 2005

Hierarchical Multi-stream Posterior Based Speech Recognition System.
Proceedings of the Machine Learning for Multimodal Interaction, 2005

Spectral entropy feature in full-combination multi-stream for robust ASR.
Proceedings of the INTERSPEECH 2005, 2005

Developing and enhancing posterior based speech recognition systems.
Proceedings of the INTERSPEECH 2005, 2005

Improving speech recognition using a data-driven approach.
Proceedings of the INTERSPEECH 2005, 2005

Multi-resolution Spectral Entropy Feature for Robust ASR.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

HMM/ANN Based Spectral Peak Location Estimation for Noise Robust Speech Recognition.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

2004
Speech recognition with auxiliary information.
IEEE Trans. Speech Audio Process., 2004

Robust speaker change detection.
IEEE Signal Process. Lett., 2004

Text detection and recognition in images and video frames.
Pattern Recognit., 2004

Editorial.
EURASIP J. Adv. Signal Process., 2004

On the Adequacy of Baseform Pronunciations and Pronunciation Variants.
Proceedings of the Machine Learning for Multimodal Interaction, 2004

Modeling auxiliary features in tandem systems.
Proceedings of the INTERSPEECH 2004, 2004

Entropy based combination of tandem representations for noise robust ASR.
Proceedings of the INTERSPEECH 2004, 2004

Spectro-temporal activity pattern (STAP) features for noise robust ASR.
Proceedings of the INTERSPEECH 2004, 2004

Posteriori probabilities and likelihoods combination for speech and speaker recognition.
Proceedings of the INTERSPEECH 2004, 2004

An online audio indexing system.
Proceedings of the INTERSPEECH 2004, 2004

Spectral entropy based feature for robust ASR.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Joint decoding for phoneme-grapheme continuous speech recognition.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Phase autocorrelation (PAC) features in entropy based multi-stream for robust speech recognition.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Confidence measures in multiple pronunciations modeling for speaker verification.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Multi Channel Sequence Processing.
Proceedings of the Deterministic and Statistical Methods in Machine Learning, 2004

2003
Microphone array post-filter based on noise field coherence.
IEEE Trans. Speech Audio Process., 2003

Speech/music segmentation using entropy and dynamism features in a HMM classification framework.
Speech Commun., 2003

Robust speech recognition and feature extraction using HMM2.
Comput. Speech Lang., 2003

On factorizing spectral dynamics for robust speech recognition.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Using pitch frequency information in speech recognition.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

On the combination of speech and speaker recognition.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Speech & face based biometric authentication at IDIAP.
Proceedings of the 2003 IEEE International Conference on Multimedia and Expo, 2003

On automatic annotation of meeting databases.
Proceedings of the 2003 International Conference on Image Processing, 2003

Speech recognition of spontaneous, noisy speech using auxiliary information in Bayesian networks.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

New entropy based combination rules in HMM/ANN multi-stream ASR.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Modeling human interaction in meetings.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Phase autocorrelation (PAC) derived robust speech features.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Hybrid HMM/ANN and GMM combination for user-customized password speaker verification.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Towards Computer Understanding of Human Interactions.
Proceedings of the Ambient Intelligence, First European Symposium, 2003

2002
Analytic assessment of telephone transmission impact on ASR performance using a simulation model.
Speech Commun., 2002

Dynamic Bayesian network based speech recognition with pitch and energy as auxiliary variables.
Proceedings of the 12th IEEE Workshop on Neural Networks for Signal Processing, 2002

Speaker normalization using HMM2.
Proceedings of the 12th IEEE Workshop on Neural Networks for Signal Processing, 2002

Evaluation of formant-like features for ASR.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Auxiliary variables in conditional Gaussian mixtures for automatic speech recognition.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Low cost duration modelling for noise robust speech recognition.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Improving speech recognition performance of small microphone arrays using missing data techniques.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Comparison and combination of RASTA-PLP and FF features in a hybrid HMM/MLP speech recognition system.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

User-customized password speaker verification based on HMM/ANN and GMM models.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Unknown-multiple speaker clustering using HMM.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Mixed Bayesian Networks with Auxiliary Variables for Automatic Speech Recognition.
Proceedings of the 16th International Conference on Pattern Recognition, 2002

Text Segmentation and Recognition in Complex Background Based on Markov Random Field.
Proceedings of the 16th International Conference on Pattern Recognition, 2002

Some Recent Advances in Speech Recognition with Potential Applications in Other Statistical Pattern Recognition Areas.
Proceedings of the 16th International Conference on Pattern Recognition, 2002

Increasing speech recognition robustness with HMM2.
Proceedings of the IEEE International Conference on Acoustics, 2002

Microphone array post-filter for diffuse noise field.
Proceedings of the IEEE International Conference on Acoustics, 2002

Robust HMM-based speech/music segmentation.
Proceedings of the IEEE International Conference on Acoustics, 2002

2001
Multi-stream adaptive evidence combination for noise robust ASR.
Speech Commun., 2001

HMM2- extraction of formant structures and their use for robust ASR.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Modeling auxiliary information in Bayesian network based ASR.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

MAP combination of multi-stream HMM or HMM/ANN experts.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Error correcting posterior combination for robust multi-band speech recognition.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Text Enhancement with Asymmetric Filter for Video OCR.
Proceedings of the 11th International Conference on Image Analysis and Processing (ICIAP 2001), 2001

Adaptive ML-weighting in multi-band recombination of Gaussian mixture ASR.
Proceedings of the IEEE International Conference on Acoustics, 2001

Text Identification in Complex Background Using SVM.
Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2001), 2001

2000
New Approaches Towards Robust, Adaptive Speech Recognition (invited paper).
Proceedings of the Advances in Neural Information Processing Systems 13, 2000

Development of Acoustic and Linguistic Resources for Research and Evaluation in Interactive Vocal Information Servers.
Proceedings of the Second International Conference on Language Resources and Evaluation, 2000

HMM2- a novel approach to HMM emission probability estimation.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Automatic speech recognition using dynamic bayesian networks with both acoustic and articulatory variables.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

A neural network for classification with incomplete data: application to robust ASR.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Real-time telephone transmission simulation for speech recognizer and dialogue system evaluation and improvement.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Using multiple time scales in the framework of multi-stream speech recognition.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

A new keyword spotting approach based on iterative dynamic programming.
Proceedings of the IEEE International Conference on Acoustics, 2000

Will the spoken words be back to libraries? (invited talk - Abstract not available).
Proceedings of the First DELOS Network of Excellence Workshop on Information Seeking, 2000

1999
The full combination sub-bands approach to noise robust HMM/ANN based ASR.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

1998
Interfacing of CASA and partial recognition based on a multistream technique.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

Improving posterior based confidence measures in hybrid HMM/ANN speech recognition systems.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

1997
Hybrid HMM/ANN Systems for Speech Recognition: Overview and New Research Directions.
Proceedings of the Adaptive Processing of Sequences and Data Structures, 1997

Estimation of global posteriors and forward-backward training of hybrid HMM/ANN systems.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

Using multiple time scales in a multi-stream speech recognition system.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

Speaker-dependent speech recognition based on phone-like units models-application to voice dialling.
Proceedings of the 1997 IEEE International Conference on Acoustics, 1997

Hybrid HMM/ANN systems for training independent tasks: experiments on Phonebook and related improvements.
Proceedings of the 1997 IEEE International Conference on Acoustics, 1997

Subband-based speech recognition.
Proceedings of the 1997 IEEE International Conference on Acoustics, 1997

State-of-the-Art and Recent Progress in Hybrid HMM/ANN Speech Recognition.
Proceedings of the Artificial Neural Networks, 1997

1996
A training algorithm for statistical sequence recognition with applications to transition-based speech recognition.
IEEE Signal Process. Lett., 1996

Towards increasing speech recognition error rates.
Speech Commun., 1996

A new ASR approach based on independent processing and recombination of partial frequency bands.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

Stochastic perceptual speech models with durational dependence.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

REMAP-experiments with speech recognition.
Proceedings of the 1996 IEEE International Conference on Acoustics, 1996

A new training algorithm for hybrid HMM/ANN speech recognition systems.
Proceedings of the 8th European Signal Processing Conference, 1996

Towards subband-based speech recognition.
Proceedings of the 8th European Signal Processing Conference, 1996

1995
Continuous speech recognition.
IEEE Signal Process. Mag., 1995

Comparison of hidden Markov model techniques for automatic speaker verification in real-world conditions.
Speech Commun., 1995

Neural networks for statistical recognition of continuous speech.
Proc. IEEE, 1995

REMAP: Recursive Estimation and Maximization of A Posteriori Probabilities - Application to Transition-Based Connectionist Speech Recognition.
Proceedings of the Advances in Neural Information Processing Systems 8, 1995

Digit recognition with stochastic perceptual speech models.
Proceedings of the Fourth European Conference on Speech Communication and Technology, 1995

REMAP: recursive estimation and maximization of a posteriori probabilities in connectionist speech recognition.
Proceedings of the Fourth European Conference on Speech Communication and Technology, 1995

Towards increasing speech recognition error rates.
Proceedings of the Fourth European Conference on Speech Communication and Technology, 1995

Stochastic perceptual models of speech.
Proceedings of the 1995 International Conference on Acoustics, 1995

1994
Connectionist probability estimators in HMM speech recognition.
IEEE Trans. Speech Audio Process., 1994

Stochastic perceptual auditory-event-based models for speech recognition.
Proceedings of the 3rd International Conference on Spoken Language Processing, 1994

Comparison of acoustic features and robustness tests of a real-time recogniser using a hardware telephone line simulator.
Proceedings of the 3rd International Conference on Spoken Language Processing, 1994

Optimizing recognition and rejection performance in wordspotting systems.
Proceedings of ICASSP '94: IEEE International Conference on Acoustics, 1994

Task independent and dependent training: performance comparison of HMM and hybrid HMM/MLP approaches.
Proceedings of ICASSP '94: IEEE International Conference on Acoustics, 1994

1993
Continuous speech recognition by connectionist statistical methods.
IEEE Trans. Neural Networks, 1993

Hybrid Neural Network/Hidden Markov Model Systems for Continuous Speech Recognition.
Int. J. Pattern Recognit. Artif. Intell., 1993

Speaker verification over telephone channels based on concatenated phonemic hidden Markov models.
Proceedings of the Third European Conference on Speech Communication and Technology, 1993

Real-time, neural network-based, French alphabet recognition with telephone speech.
Proceedings of the Third European Conference on Speech Communication and Technology, 1993

A neural network based, speaker independent, large vocabulary, continuous speech recognition system: the WERNICKE project.
Proceedings of the Third European Conference on Speech Communication and Technology, 1993

Performance comparison of hidden Markov models and neural networks for task dependent and independent isolated word recognition.
Proceedings of the Third European Conference on Speech Communication and Technology, 1993

A new approach towards keyword spotting.
Proceedings of the Third European Conference on Speech Communication and Technology, 1993

Linear and nonlinear prediction for speech recognition with hidden Markov models.
Proceedings of the Third European Conference on Speech Communication and Technology, 1993

Limited parameter hidden Markov models for connected digit speaker verification over telephone channels.
Proceedings of the IEEE International Conference on Acoustics, 1993

1992
Neural nets and hidden Markov models: Review and generalizations.
Speech Commun., 1992

Factoring Networks by a Statistical Method.
Neural Comput., 1992

CDNN: a context dependent neural network for continuous speech recognition.
Proceedings of the 1992 IEEE International Conference on Acoustics, 1992

1991
Connectionist Optimisation of Tied Mixture Hidden Markov Models.
Proceedings of the Advances in Neural Information Processing Systems 4, 1991

Phonetic context in hybrid HMM/MLP continuous speech recognition.
Proceedings of the Second European Conference on Speech Communication and Technology, 1991

Neural nets and hidden Markov models: review and generalizations.
Proceedings of the Second European Conference on Speech Communication and Technology, 1991

Continuous speech recognition using PLP analysis with multilayer perceptrons.
Proceedings of the 1991 International Conference on Acoustics, 1991

1990
Links Between Markov Models and Multilayer Perceptrons.
IEEE Trans. Pattern Anal. Mach. Intell., 1990

Connectionist Approaches to the Use of Markov Models for Speech Recognition.
Proceedings of the Advances in Neural Information Processing Systems 3, 1990

Continuous speech recognition on the resource management database using connectionist probability estimation.
Proceedings of the First International Conference on Spoken Language Processing, 1990

Continuous speech recognition using multilayer perceptrons with hidden Markov models.
Proceedings of the 1990 International Conference on Acoustics, 1990

1989
Generalization and Parameter Estimation in Feedforward Nets: Some Experiments.
Proceedings of the Advances in Neural Information Processing Systems 2, 1989

A Continuous Speech Recognition System Embedding MLP into HMM.
Proceedings of the Advances in Neural Information Processing Systems 2, 1989

Statistical Inference in Multilayer Perceptrons and Hidden Markov Models with Applications in Continuous Speech Recognition.
Proceedings of the Neurocomputing - Algorithms, Architectures and Applications, Proceedings of the NATO Advanced Research Workshop on Neurocomputing Algorithms, Architectures and Applications, Les Arcs, France, February 27, 1989

Speech dynamics and recurrent neural networks.
Proceedings of the IEEE International Conference on Acoustics, 1989

1985
Speaker dependent connected speech recognition via phonetic Markov models.
Proceedings of the IEEE International Conference on Acoustics, 1985

1984
Connected digit recognition using vector quantization.
Proceedings of the IEEE International Conference on Acoustics, 1984

