Olivier Siohan

According to our database, Olivier Siohan authored at least 83 papers between 1992 and 2023.

Bibliography

2023
Audio-visual fine-tuning of audio-only ASR models.
CoRR, 2023

Cascaded encoders for fine-tuning ASR models on overlapped speech.
CoRR, 2023

Conformers are All You Need for Visual Speech Recognition.
CoRR, 2023

Revisiting the Entropy Semiring for Neural Speech Recognition.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

2022
On Robustness to Missing Video for Audiovisual Speech Recognition.
Trans. Mach. Learn. Res., 2022

Transformer-Based Video Front-Ends for Audio-Visual Speech Recognition.
CoRR, 2022

Transformer-Based Video Front-Ends for Audio-Visual Speech Recognition for Single and Multi-Person Video.
Proceedings of the Interspeech 2022, 2022

End-to-End multi-talker audio-visual ASR using an active speaker attention module.
Proceedings of the Interspeech 2022, 2022

Best of Both Worlds: Multi-Task Audio-Visual Automatic Speech Recognition and Active Speaker Detection.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
Audio-Visual Speech Recognition is Worth 32×32×8 Voxels.
CoRR, 2021

Bridging the gap between streaming and non-streaming ASR systems by distilling ensembles of CTC and RNN-T models.
CoRR, 2021

End-to-End Audio-Visual Speech Recognition for Overlapping Speech.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Bridging the Gap Between Streaming and Non-Streaming ASR Systems by Distilling Ensembles of CTC and RNN-T Models.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

A Closer Look at Audio-Visual Multi-Person Speech Recognition and Active Speaker Selection.
Proceedings of the IEEE International Conference on Acoustics, 2021

Audio-Visual Speech Recognition is Worth 32×32×8 Voxels.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

Action Item Detection in Meetings Using Pretrained Transformers.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020
End-to-End Multi-Person Audio/Visual Automatic Speech Recognition.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Recurrent Neural Network Transducer for Audio-Visual Speech Recognition.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2017
CTC Training of Multi-Phone Acoustic Models for Speech Recognition.
Proceedings of the Interspeech 2017, 2017

Annealed f-Smoothing as a Mechanism to Speed up Neural Network Training.
Proceedings of the Interspeech 2017, 2017


2016
Automatic optimization of data perturbation distributions for multi-style training in speech recognition.
Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016

Selection and combination of hypotheses for dialectal speech recognition.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Sequence training of multi-task acoustic models using meta-state labels.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015
Large vocabulary automatic speech recognition for children.
Proceedings of the INTERSPEECH 2015, 2015

Exemplar-based large vocabulary speech recognition using k-nearest neighbors.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Multitask learning and system combination for automatic speech recognition.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

2014
A big data approach to acoustic model training corpus selection.
Proceedings of the INTERSPEECH 2014, 2014

Training data selection based on context-dependent state matching.
Proceedings of the IEEE International Conference on Acoustics, 2014

2013
ivector-based acoustic data selection.
Proceedings of the INTERSPEECH 2013, 2013

2010
Decision tree state clustering with word and syllable features.
Proceedings of the INTERSPEECH 2010, 2010

2009
An audio indexing system for election video material.
Proceedings of the IEEE International Conference on Acoustics, 2009

2007
Comments on Vocal Tract Length Normalization Equals Linear Transformation in Cepstral Space.
IEEE Trans. Speech Audio Process., 2007

Vocabulary independent spoken term detection.
Proceedings of the SIGIR 2007: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2007

Gaussian Mixture Language Models for Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2007

The IBM 2007 speech transcription system for European parliamentary speeches.
Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

2006
Automated Quality Monitoring for Call Centers using Speech and NLP Technologies.
Proceedings of the Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, 2006

The IBM Rich Transcription Spring 2006 Speech-to-Text System for Lecture Meetings.
Proceedings of the Machine Learning for Multimodal Interaction, 2006

The IBM 2006 speech transcription system for European parliamentary speeches.
Proceedings of the INTERSPEECH 2006, 2006

Automated Quality Monitoring in the Call Center with ASR and Maximum Entropy.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

2005
A new verification-based fast-match for large vocabulary continuous speech recognition.
IEEE Trans. Speech Audio Process., 2005

Fast vocabulary-independent audio search using path-based graph indexing.
Proceedings of the INTERSPEECH 2005, 2005

Constructing Ensembles of ASR Systems Using Randomized Decision Trees.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

2004
Sequential estimation with optimal forgetting for robust speech recognition.
IEEE Trans. Speech Audio Process., 2004

Speech recognition error analysis on the English MALACH corpus.
Proceedings of the INTERSPEECH 2004, 2004

Use of metadata to improve recognition of spontaneous speech and named entities.
Proceedings of the INTERSPEECH 2004, 2004

2003
Advances in natural language call routing.
Bell Labs Tech. J., 2003

Hierarchical class n-gram language models: towards better estimation of unseen events in speech recognition.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Combining neighboring filter channels to improve quantile based histogram equalization.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

2002
Upper and lower bounds on the mean of noisy speech: application to minimax classification.
IEEE Trans. Speech Audio Process., 2002

Structural maximum a posteriori linear regression for fast HMM adaptation.
Comput. Speech Lang., 2002

Backoff hierarchical class n-gram language modelling for automatic speech recognition systems.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Bell Labs approach to Aurora evaluation on connected digit recognition.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Towards knowledge-based features for HMM based large vocabulary automatic speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2002

A dynamic in-search discriminative training approach for large vocabulary speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2002

A discriminative training criterion and an associated EM learning algorithm.
Proceedings of the IEEE International Conference on Acoustics, 2002

2001
Joint maximum a posteriori adaptation of transformation and HMM parameters.
IEEE Trans. Speech Audio Process., 2001

A real-time Japanese broadcast news closed-captioning system.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

A new verification-based fast match approach to large vocabulary speech recognition.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

An auditory system-based feature for robust speech recognition.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Minimax classification with parametric neighborhoods for noisy speech recognition.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Evaluating the Aurora connected digit recognition task - a Bell Labs approach.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Sequential noise estimation with optimal forgetting for robust speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2001

2000
Small group speaker identification with common password phrases.
Speech Commun., 2000

Structural maximum a-posteriori linear regression for unsupervised speaker adaptation.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

A high-performance auditory feature for robust speech recognition.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Extended maximum a posterior linear regression (EMAPLR) model adaptation for speech recognition.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Constrained maximum likelihood linear regression for speaker adaptation.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Joint maximum a posteriori estimation of transformation and hidden Markov model parameters.
Proceedings of the IEEE International Conference on Acoustics, 2000

Multiple classifiers by constrained minimization.
Proceedings of the IEEE International Conference on Acoustics, 2000

1999
Maximum a posteriori linear regression for hidden Markov model adaptation.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

Background model design for flexible and portable speaker verification systems.
Proceedings of the 1999 IEEE International Conference on Acoustics, 1999

1998
Speaker identification using minimum classification error training.
Proceedings of the 1998 IEEE International Conference on Acoustics, 1998

Speaker verification using minimum verification error training.
Proceedings of the 1998 IEEE International Conference on Acoustics, 1998

1997
Iterative noise and channel estimation under the stochastic matching algorithm framework.
IEEE Signal Process. Lett., 1997

1996
Comparative experiments of several adaptation approaches to noisy speech recognition using stochastic trajectory models.
Speech Commun., 1996

A semi-continuous stochastic trajectory model for phoneme-based continuous speech recognition.
Proceedings of the 1996 IEEE International Conference on Acoustics, 1996

1995
Reconnaissance automatique de la parole continue en environnement bruité : application à des modèles stochastiques de trajectoires. (Continuous speech recognition in a noisy environment : application to stochastic trajectory models).
PhD thesis, 1995

Noise adaptation using linear regression for continuous noisy speech recognition.
Proceedings of the Fourth European Conference on Speech Communication and Technology, 1995

On the robustness of linear discriminant analysis as a preprocessing step for noisy speech recognition.
Proceedings of the 1995 International Conference on Acoustics, 1995

1994
A comparison of three noisy speech recognition approaches.
Proceedings of the 3rd International Conference on Spoken Language Processing, 1994

1993
A Bayesian approach to phone duration adaptation for Lombard speech recognition.
Proceedings of the Third European Conference on Speech Communication and Technology, 1993

1992
Minimization of speech alignment error by iterative transformation for speaker adaptation.
Proceedings of the Second International Conference on Spoken Language Processing, 1992

