Michael Picheny

According to our database, Michael Picheny authored at least 150 papers between 1983 and 2023.

Awards

IEEE Fellow

IEEE Fellow 2001, "For contributions to speech recognition systems and products."

Bibliography

2023
Twenty-Five Years of Evolution in Speech and Language Processing.
IEEE Signal Process. Mag., July, 2023

Improving Joint Speech-Text Representations Without Alignment.
CoRR, 2023

A Comparison of Semi-Supervised Learning Techniques for Streaming ASR at Scale.
Proceedings of the IEEE International Conference on Acoustics, 2023

2022
Dual Learning for Large Vocabulary On-Device ASR.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Towards Disentangled Speech Representations.
Proceedings of the Interspeech 2022, 2022

Towards Measuring Fairness in Speech Recognition: Casual Conversations Dataset Transcriptions.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
Accented Speech Recognition Inspired by Human Perception.
CoRR, 2021

Diarization of Legal Proceedings. Identifying and Transcribing Judicial Speech from Recorded Court Audio.
CoRR, 2021

AVLnet: Learning Audio-Visual Language Representations from Instructional Videos.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Cascaded Multilingual Audio-Visual Learning from Videos.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Speak or Chat with Me: End-to-End Spoken Language Understanding System with Flexible Inputs.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

2020
Distributed Training of Deep Neural Network Acoustic Models for Automatic Speech Recognition: A comparison of current training strategies.
IEEE Signal Process. Mag., 2020

AVLnet: Learning Audio-Visual Language Representations from Instructional Videos.
CoRR, 2020

Distributed Training of Deep Neural Network Acoustic Models for Automatic Speech Recognition.
CoRR, 2020

Improving Efficiency in Large-Scale Decentralized Distributed Training.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Leveraging Unpaired Text Data for Training End-To-End Speech-to-Intent Systems.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Kernel Approximation Methods for Speech Recognition.
J. Mach. Learn. Res., 2019

A Highly Efficient Distributed Deep Learning System for Automatic Speech Recognition.
Proceedings of the Interspeech 2019, 2019

Detection and Recovery of OOVs for Improved English Broadcast News Captioning.
Proceedings of the Interspeech 2019, 2019

Challenging the Boundaries of Speech Recognition: The MALACH Corpus.
Proceedings of the Interspeech 2019, 2019

Large-Scale Mixed-Bandwidth Deep Neural Network Acoustic Modeling for Automatic Speech Recognition.
Proceedings of the Interspeech 2019, 2019

Acoustic Model Optimization Based on Evolutionary Stochastic Gradient Descent with Anchors for Automatic Speech Recognition.
Proceedings of the Interspeech 2019, 2019

Forget a Bit to Learn Better: Soft Forgetting for CTC-Based Automatic Speech Recognition.
Proceedings of the Interspeech 2019, 2019

Identifying Mood Episodes Using Dialogue Features from Clinical Interviews.
Proceedings of the Interspeech 2019, 2019

Distributed Deep Learning Strategies for Automatic Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2019

English Broadcast News Speech Recognition by Humans and Machines.
Proceedings of the IEEE International Conference on Acoustics, 2019

Acoustically Grounded Word Embeddings for Improved Acoustics-to-word Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2019

Pre-training of Speaker Embeddings for Low-latency Speaker Change Detection in Broadcast News.
Proceedings of the IEEE International Conference on Acoustics, 2019

Grounding Spoken Words in Unlabeled Video.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019

Simplified LSTMs for Speech Recognition.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Semi-Supervised Training and Data Augmentation for Adaptation of Automatic Broadcast News Captioning Systems.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018
Evolutionary Stochastic Gradient Descent for Optimization of Deep Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Building Competitive Direct Acoustics-to-Word Models for English Conversational Speech Recognition.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017
Parallel Deep Neural Network Training for Big Data on Blue Gene/Q.
IEEE Trans. Parallel Distributed Syst., 2017

Recent advances in conversational speech recognition using convolutional and recurrent neural networks.
IBM J. Res. Dev., 2017

Recent progress in deep end-to-end models for spoken language processing.
IBM J. Res. Dev., 2017

English Conversational Telephone Speech Recognition by Humans and Machines.
Proceedings of the Interspeech 2017, 2017

Direct Acoustics-to-Word Models for English Conversational Speech Recognition.
Proceedings of the Interspeech 2017, 2017

End-to-end speech recognition and keyword search on low-resource languages.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Training variance and performance evaluation of neural networks in speech.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

2016
A comparison between deep neural nets and kernel acoustic models for speech recognition.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

On the importance of event detection for ASR.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015
The IBM 2015 English conversational telephone speech recognition system.
Proceedings of the INTERSPEECH 2015, 2015

Order-free spoken term detection.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Multilingual representations for low resource speech recognition and keyword search.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

2014
How to Scale Up Kernel Methods to Be As Good As Deep Neural Nets.
CoRR, 2014

Unfolded recurrent neural networks for speech recognition.
Proceedings of the INTERSPEECH 2014, 2014

Parallel deep neural network training for LVCSR tasks using blue gene/Q.
Proceedings of the INTERSPEECH 2014, 2014

Efficient spoken term detection using confusion networks.
Proceedings of the IEEE International Conference on Acoustics, 2014

2013
System combination and score normalization for spoken term detection.
Proceedings of the IEEE International Conference on Acoustics, 2013

A high-performance Cantonese keyword search system.
Proceedings of the IEEE International Conference on Acoustics, 2013

Developing speech recognition systems for corpus indexing under the IARPA Babel program.
Proceedings of the IEEE International Conference on Acoustics, 2013

Speaker adaptation of neural network acoustic models using i-vectors.
Proceedings of the 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, 2013

2012
Matching Criteria for Vocabulary-Independent Search.
IEEE Trans. Speech Audio Process., 2012

2011
Exemplar-Based Sparse Representation Features: From TIMIT to LVCSR.
IEEE ACM Trans. Audio Speech Lang. Process., 2011

Trends and advances in speech recognition.
IBM J. Res. Dev., 2011

Deep Belief Networks using discriminative features for phone recognition.
Proceedings of the IEEE International Conference on Acoustics, 2011

2010
Effects of automated transcription quality on non-native speakers' comprehension in real-time computer-mediated communication.
Proceedings of the 28th International Conference on Human Factors in Computing Systems, 2010

2009
Cultural voice markers in speech-to-speech machine translation systems.
Proceedings of the 2009 international workshop on Intercultural collaboration, 2009

Effects of real-time transcription on non-native speaker's comprehension in computer-mediated communications.
Proceedings of the 27th International Conference on Human Factors in Computing Systems, 2009

An exploration of large vocabulary tools for small vocabulary phonetic recognition.
Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009

Improved vocabulary independent search with approximate match based on Conditional Random Fields.
Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009

Articulatory feature detection with Support Vector Machines for integration into ASR and phone recognition.
Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009

2007
Voice-Melody Transcription Under a Speech Recognition Framework.
Proceedings of the IEEE International Conference on Acoustics, 2007

Lattice-based Viterbi decoding techniques for speech translation.
Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

Improvements in phone based audio search via constrained match with high order confusion estimates.
Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

2006
The IBM expressive text-to-speech synthesis system for American English.
IEEE Trans. Speech Audio Process., 2006

Concept-based speech-to-speech translation using maximum entropy models for statistical natural concept generation.
IEEE Trans. Speech Audio Process., 2006

Towards Pooled-Speaker Concatenative Text-to-Speech.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

2005
Semantic confidence measurement for spoken dialog systems.
IEEE Trans. Speech Audio Process., 2005

Using semantic analysis to improve speech recognition performance.
Comput. Speech Lang., 2005

Toward multiple-language TTS: experiments in English and Mandarin.
Proceedings of the INTERSPEECH 2005, 2005

2004
Automatic recognition of spontaneous speech for access to multilingual oral history archives.
IEEE Trans. Speech Audio Process., 2004

Applications of Language Modeling in Speech-To-Speech Translation.
Int. J. Speech Technol., 2004

Advances in Large Vocabulary Continuous Speech Recognition.
Adv. Comput., 2004

A corpus-based approach to expressive speech synthesis.
Proceedings of the Fifth ISCA ITRW on Speech Synthesis, 2004

A Comparison of Rule-Based and Statistical Methods for Semantic Language Modeling and Confidence Measurement.
Proceedings of HLT-NAACL 2004: Short Papers, Boston, Massachusetts, USA, May 2-7, 2004, 2004

The IBM expressive speech synthesis system.
Proceedings of the INTERSPEECH 2004, 2004

2003
Noise robustness in speech to speech translation.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Toward domain-independent conversational speech recognition.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Improving statistical natural concept generation in interlingua-based speech-to-speech translation.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Automated transcription and topic segmentation of large spoken archives.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Word level confidence measurement using semantic features.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Towards automatic transcription of large spoken archives - English ASR for the MALACH project.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Use of statistical N-gram models in natural language generation for machine translation.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Recent improvements to the IBM trainable speech synthesis system.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

2002
MARS: A Statistical Semantic Parsing and Generation-Based Multilingual Automatic tRanslation System.
Mach. Transl., 2002

Large-Vocabulary Speech Recognition Algorithms.
Computer, 2002

Cross-Language Access to Recorded Speech in the MALACH Project.
Proceedings of the Text, Speech and Dialogue, 5th International Conference, 2002

Supporting access to large digital oral history archives.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2002

Statistical natural language generation for speech-to-speech machine translation systems.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Semantic structured language models.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Turn-Based Language Modeling for spoken dialog systems.
Proceedings of the IEEE International Conference on Acoustics, 2002

2001
Current status of the IBM Trainable Speech Synthesis System.
Proceedings of the 4th ITRW on Speech Synthesis, Perthshire, Scotland, UK, August 29, 2001

Recent advances in speech recognition system for IBM DARPA communicator.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Innovative approaches for large vocabulary name recognition.
Proceedings of the IEEE International Conference on Acoustics, 2001

Rapid adaptation using penalized-likelihood methods.
Proceedings of the IEEE International Conference on Acoustics, 2001


2000
Impact of bucketing on performance of linearly interpolated language models.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Dynamic selection of feature spaces for robust speech recognition.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Heredity and environment in speech recognition: the role of a priori information vs. data.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Speed improvement of the tree-based time asynchronous search.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Maximal rank likelihood as an optimization function for speech recognition.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Rapid likelihood calculation of subspace clustered Gaussian components.
Proceedings of the IEEE International Conference on Acoustics, 2000

1999
Enhanced likelihood computation using regression.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

Speed improvement of the time-asynchronous acoustic fast match.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

HMM training based on quality measurement.
Proceedings of the 1999 IEEE International Conference on Acoustics, 1999

1998
Speaker clustering and transformation for speaker adaptation in speech recognition systems.
IEEE Trans. Speech Audio Process., 1998

On variable sampling frequencies in speech recognition.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

A new confidence measure based on rank-ordering subphone scores.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

Telephone band LVCSR for hearing-impaired users.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

Improvements in children's speech recognition performance.
Proceedings of the 1998 IEEE International Conference on Acoustics, 1998

1997
Key-phrase spotting using an integrated language model of n-grams and finite-state grammar.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

Speaker adaptation based on pre-clustering training speakers.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

New methods in continuous Mandarin speech recognition.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

1996
Speaker clustering and transformation for speaker adaptation in large-vocabulary speech recognition systems.
Proceedings of the 1996 IEEE International Conference on Acoustics, 1996

Speech recognition on Mandarin Call Home: a large-vocabulary, conversational, and telephone speech corpus.
Proceedings of the 1996 IEEE International Conference on Acoustics, 1996

1995
Context dependent phonetic duration models for decoding conversational speech.
Proceedings of the 1995 International Conference on Acoustics, 1995

Experiments using data augmentation for speaker adaptation.
Proceedings of the 1995 International Conference on Acoustics, 1995

Performance of the IBM large vocabulary continuous speech recognition system on the ARPA Wall Street Journal task.
Proceedings of the 1995 International Conference on Acoustics, 1995

1994
Speaker adaptation via VQ prototype modification.
IEEE Trans. Speech Audio Process., 1994

The metamorphic algorithm: a speaker mapping approach to data augmentation.
IEEE Trans. Speech Audio Process., 1994

A channel-bank-based phone detection strategy.
Proceedings of ICASSP '94: IEEE International Conference on Acoustics, 1994

Adaptation techniques for ambience and microphone compensation in the IBM Tangora speech recognition system.
Proceedings of ICASSP '94: IEEE International Conference on Acoustics, 1994

Robust methods for using context-dependent features and models in a continuous speech recognizer.
Proceedings of ICASSP '94: IEEE International Conference on Acoustics, 1994

1993
A method for the construction of acoustic Markov models for words.
IEEE Trans. Speech Audio Process., 1993

Multonic Markov word models for large vocabulary continuous speech recognition.
IEEE Trans. Speech Audio Process., 1993

Word lookahead scheme for cross-word right context models in a stack decoder.
Proceedings of the Third European Conference on Speech Communication and Technology, 1993

Influence of background noise and microphone on the performance of the IBM Tangora speech recognition system.
Proceedings of the IEEE International Conference on Acoustics, 1993

A supervised approach to the construction of context-sensitive acoustic prototypes.
Proceedings of the IEEE International Conference on Acoustics, 1993

Context dependent vector quantization for continuous speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 1993

1992
Robust speaker adaptation using a piecewise linear acoustic mapping.
Proceedings of the 1992 IEEE International Conference on Acoustics, 1992

Adaptation of large vocabulary recognition system parameters.
Proceedings of the 1992 IEEE International Conference on Acoustics, 1992

A fast match for continuous speech recognition using allophonic models.
Proceedings of the 1992 IEEE International Conference on Acoustics, 1992

1991
Context Dependent Modeling of Phones in Continuous Speech Using Decision Trees.
Proceedings of the Speech and Natural Language, 1991

An iterative 'flip-flop' approximation of the most informative split in the construction of decision trees.
Proceedings of the 1991 International Conference on Acoustics, 1991

Decision trees for phonological rules in continuous speech.
Proceedings of the 1991 International Conference on Acoustics, 1991

Automatic phonetic baseform determination.
Proceedings of the 1991 International Conference on Acoustics, 1991

A new class of fenonic Markov word models for large vocabulary continuous speech recognition.
Proceedings of the 1991 International Conference on Acoustics, 1991

1989
Speech recognition using noise-adaptive prototypes.
IEEE Trans. Acoust. Speech Signal Process., 1989

Large vocabulary natural language continuous speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 1989

1988
On a model-robust training method for speech recognition.
IEEE Trans. Acoust. Speech Signal Process., 1988

Adaptive labeling: normalization of speech by adaptive transformations based on vector quantization.
Proceedings of the IEEE International Conference on Acoustics, 1988

Decoder selection based on cross-entropies.
Proceedings of the IEEE International Conference on Acoustics, 1988

Acoustic Markov models used in the Tangora speech recognition system.
Proceedings of the IEEE International Conference on Acoustics, 1988

1987

1986

1984
Some experiments with large-vocabulary isolated-word sentence recognition.
Proceedings of the IEEE International Conference on Acoustics, 1984

1983
Recognition of isolated-word sentences from a 5000-word vocabulary office correspondence task.
Proceedings of the IEEE International Conference on Acoustics, 1983
