Michael L. Seltzer

Orcid: 0000-0003-3474-2451

According to our database, Michael L. Seltzer authored at least 102 papers between 2000 and 2023.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2023
End-to-End Speech Recognition Contextualization with Large Language Models.
CoRR, 2023

Augmenting text for spoken language understanding with Large Language Models.
CoRR, 2023

Modality Confidence Aware Training for Robust End-to-End Spoken Language Understanding.
CoRR, 2023

Massively Multilingual ASR on 70 Languages: Tokenization, Architecture, and Generalization Capabilities.
Proceedings of the IEEE International Conference on Acoustics, 2023

Improving fast-slow Encoder based Transducer with Streaming Deliberation.
Proceedings of the IEEE International Conference on Acoustics, 2023

Factorized Blank Thresholding for Improved Runtime Efficiency of Neural Transducers.
Proceedings of the IEEE International Conference on Acoustics, 2023

2022
Streaming parallel transducer beam search with fast slow cascaded encoders.
Proceedings of the Interspeech 2022, 2022

Deliberation Model for On-Device Spoken Language Understanding.
Proceedings of the Interspeech 2022, 2022

Evaluating User Perception of Speech Recognition System Quality with Semantic Distance Metric.
Proceedings of the Interspeech 2022, 2022

Neural-FST Class Language Model for End-to-End Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
Flexi-Transducer: Optimizing Latency, Accuracy and Compute for Multi-Domain On-Device Scenarios.
CoRR, 2021

Streaming Attention-Based Models with Augmented Memory for End-To-End Speech Recognition.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Alignment Restricted Streaming Recurrent Neural Network Transducer.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Deep Shallow Fusion for RNN-T Personalization.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Dynamic Encoder Transducer: A Flexible Solution for Trading Off Accuracy for Latency.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Dissecting User-Perceived Latency of On-Device E2E Speech Recognition.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Collaborative Training of Acoustic Encoders for Speech Recognition.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Flexi-Transducer: Optimizing Latency, Accuracy and Compute for Multi-Domain On-Device Scenarios.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Contextualized Streaming End-to-End Speech Recognition with Trie-Based Deep Biasing and Shallow Fusion.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Semantic Distance: A New Metric for ASR Performance Analysis Towards Spoken Language Understanding.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Memory-Efficient Speech Recognition on Smart Devices.
Proceedings of the IEEE International Conference on Acoustics, 2021

Improved Neural Language Model Fusion for Streaming Recurrent Neural Network Transducer.
Proceedings of the IEEE International Conference on Acoustics, 2021

2020
Emformer: Efficient Memory Transformer Based Acoustic Model For Low Latency Streaming Speech Recognition.
CoRR, 2020

Weak-Attention Suppression for Transformer Based Speech Recognition.
Proceedings of the Interspeech 2020, 2020

Transformer-Based Acoustic Modeling for Hybrid Speech Recognition.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

G2G: TTS-Driven Pronunciation Learning for Graphemic Hybrid ASR.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Aipnet: Generative Adversarial Pre-Training of Accent-Invariant Networks for End-To-End Speech Recognition.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Speech Processing for Digital Home Assistants: Combining signal processing with deep-learning techniques.
IEEE Signal Process. Mag., 2019

Introduction to the Issue on Far-Field Speech Processing in the Era of Deep Learning: Speech Enhancement, Separation, and Recognition.
IEEE J. Sel. Top. Signal Process., 2019

RNN-T For Latency Controlled ASR With Improved Beam Search.
CoRR, 2019

Transformer-Transducer: End-to-End Speech Recognition with Self-Attention.
CoRR, 2019

Joint Grapheme and Phoneme Embeddings for Contextual End-to-End ASR.
Proceedings of the Interspeech 2019, 2019

End-to-end Contextual Speech Recognition Using Class Language Models and a Token Passing Decoder.
Proceedings of the IEEE International Conference on Acoustics, 2019

From Senones to Chenones: Tied Context-Dependent Graphemes for Hybrid Speech Recognition.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018
Improved Training for Online End-to-end Speech Recognition Systems.
Proceedings of the Interspeech 2018, 2018

Towards Language-Universal End-to-End Speech Recognition.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Efficient Integration of Fixed Beamformers and Speech Separation Networks for Multi-Channel Far-Field Speech Separation.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017
Toward Human Parity in Conversational Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2017

Large-Scale Domain Adaptation via Teacher-Student Learning.
Proceedings of the Interspeech 2017, 2017

A study on data augmentation of reverberant speech for robust speech recognition.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

May I take your order? A Neural Model for Extracting Structured Information from Conversations.
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, 2017

Discriminative Beamforming with Phase-Aware Neural Networks for Speech Enhancement and Recognition.
Proceedings of the New Era for Robust Speech Recognition, Exploiting Deep Learning, 2017

2016
On the Role of Nonlinear Transformations in Deep Neural Network Acoustic Models.
Proceedings of the Interspeech 2016, 2016

Deep beamforming networks for multi-channel speech recognition.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Linearly augmented deep neural network.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015
Deep Neural Networks for Single-Channel Multi-Talker Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2015

Exploring how deep neural networks form phonemic categories.
Proceedings of the INTERSPEECH 2015, 2015

Speech recognition with prediction-adaptation-correction recurrent neural networks.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Improving speech recognition in reverberation using a room-aware deep neural network and multi-task learning.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

2014
An introduction to computational networks and the computational network toolkit (invited talk).
Proceedings of the INTERSPEECH 2014, 2014

The influence of pitch and noise on the discriminability of filterbank features.
Proceedings of the INTERSPEECH 2014, 2014

Towards better performance with heterogeneous training data in acoustic modeling using deep neural networks.
Proceedings of the INTERSPEECH 2014, 2014

Single-channel mixed speech recognition using deep neural networks.
Proceedings of the IEEE International Conference on Acoustics, 2014

Factored adaptation of speaker and environment using orthogonal subspace transforms.
Proceedings of the IEEE International Conference on Acoustics, 2014

2013
Feature Learning in Deep Neural Networks - A Study on Speech Recognition Tasks.
Proceedings of the 1st International Conference on Learning Representations, 2013

Deep neural network features and semi-supervised training for low resource speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2013

An investigation of deep neural networks for noise robust speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2013

Multi-task learning in deep neural networks for improved phoneme recognition.
Proceedings of the IEEE International Conference on Acoustics, 2013

Recent advances in deep learning for speech research at Microsoft.
Proceedings of the IEEE International Conference on Acoustics, 2013

2012
Factored adaptation using a combination of feature-space and model-space transforms.
Proceedings of the INTERSPEECH 2012, 2012

Efficient VTS Adaptation Using Jacobian Approximation.
Proceedings of the INTERSPEECH 2012, 2012

Improvements to VTS feature enhancement.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Acoustic Model Training for Robust Speech Recognition.
Proceedings of the Techniques for Noise Robustness in Automatic Speech Recognition, 2012

2011
In-Car Media Search.
IEEE Signal Process. Mag., 2011

Improved Bottleneck Features Using Pretrained Deep Neural Networks.
Proceedings of the INTERSPEECH 2011, 2011

Separating Speaker and Environmental Variability Using Factored Transforms.
Proceedings of the INTERSPEECH 2011, 2011

CROWDMOS: An approach for crowdsourcing mean opinion score studies.
Proceedings of the IEEE International Conference on Acoustics, 2011

Joint encoding of the waveform and speech recognition features using a transform codec.
Proceedings of the IEEE International Conference on Acoustics, 2011

Factored adaptation for separable compensation of speaker and environmental variability.
Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

2010
Noise Adaptive Training for Robust Automatic Speech Recognition.
IEEE Trans. Speech Audio Process., 2010

HMM adaptation using linear spline interpolation with integrated spline parameter training for robust speech recognition.
Proceedings of the INTERSPEECH 2010, 2010

Binary coding of speech spectrograms using a deep auto-encoder.
Proceedings of the INTERSPEECH 2010, 2010

Acoustic model adaptation via Linear Spline Interpolation for robust speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2010

2009
Improving perceived accuracy for in-car media search.
Proceedings of the INTERSPEECH 2009, 2009

Voice search of structured media data.
Proceedings of the IEEE International Conference on Acoustics, 2009

The data deluge: Challenges and opportunities of unlimited data in statistical signal processing.
Proceedings of the IEEE International Conference on Acoustics, 2009

Noise adaptive training using a vector taylor series approach for noise robust automatic speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2009

Noise robust model adaptation using linear spline interpolation.
Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009

2008
Towards a non-parametric acoustic model: an acoustic decision tree for observation probability calculation.
Proceedings of the INTERSPEECH 2008, 2008

Maximum a posteriori ICA: Applying prior knowledge to the separation of acoustic sources.
Proceedings of the IEEE International Conference on Acoustics, 2008

Robust design of wideband loudspeaker arrays.
Proceedings of the IEEE International Conference on Acoustics, 2008

2007
Training Wideband Acoustic Models Using Mixed-Bandwidth Training Data for Speech Recognition.
IEEE Trans. Speech Audio Process., 2007

Automatic Removal of Typed Keystrokes From Speech Signals.
IEEE Signal Process. Lett., 2007

Commute UX: Telephone Dialog System for Location-based Services.
Proceedings of the 8th SIGdial Workshop on Discourse and Dialogue, 2007

Robust location understanding in spoken dialog systems using intersections.
Proceedings of the INTERSPEECH 2007, 2007

Microphone Array Post-Filter using Incremental Bayes Learning to Track the Spatial Distributions of Speech and Noise.
Proceedings of the IEEE International Conference on Acoustics, 2007

2006
Subband Likelihood-Maximizing Beamforming for Speech Recognition in Reverberant Environments.
IEEE Trans. Speech Audio Process., 2006

2005
Robust bandwidth extension of noise-corrupted narrowband speech.
Proceedings of the INTERSPEECH 2005, 2005

Training Wideband Acoustic Models using Mixed-Bandwidth Training Data via Feature Bandwidth Extension.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Speech Recognizer Based Maximum Likelihood Beamforming.
Proceedings of the Speech Separation by Humans and Machines, 2005

2004
Likelihood-maximizing beamforming for robust hands-free speech recognition.
IEEE Trans. Speech Audio Process., 2004

A Bayesian classifier for spectrographic mask estimation for missing feature speech recognition.
Speech Commun., 2004

Reconstruction of missing features for robust speech recognition.
Speech Commun., 2004

Parameter sharing in subband likelihood-maximizing beamforming for speech recognition using microphone arrays.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2003
Speech-recognizer-based filter optimization for microphone array processing.
IEEE Signal Process. Lett., 2003

A harmonic-model-based front end for robust speech recognition.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Subband parameter optimization of microphone arrays for speech recognition in reverberant environments.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

2002
Speech recognizer-based microphone array processing for robust hands-free speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2002

2001
Calibration of microphone arrays for improved speech recognition.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Speech in Noisy Environments: robust automatic segmentation, feature extraction, and hypothesis combination.
Proceedings of the IEEE International Conference on Acoustics, 2001

2000
Classifier-based mask estimation for missing feature methods of robust speech recognition.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Reconstruction of damaged spectrographic features for robust speech recognition.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

