Jasha Droppo

Orcid: 0000-0001-6097-0090

Affiliations:

Microsoft Research

According to our database¹, Jasha Droppo authored at least 101 papers between 2001 and 2024.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of three.

Links

Online presence:

on orcid.org
on research.microsoft.com

Bibliography

2024

LightLT: A Lightweight Representation Quantization Framework for Long-Tail Data.

[DOI]

,

,

,

,

,

Monica Xiao Cheng

,

,

,

,

Proceedings of the 40th IEEE International Conference on Data Engineering, 2024

2023

Federated Representation Learning for Automatic Speech Recognition.

[DOI]

Guruprasad V. Ramesh

,

Gopinath Chennupati

,

,

Anit Kumar Sahu

,

,

CoRR, 2023

Diffusion-based accent modelling in speech synthesis.

[DOI]

,

,

Marta Czarnowska

,

,

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Federated Self-Learning with Weak Supervision for Speech Recognition.

[DOI]

,

Gopinath Chennupati

,

,

Anit Kumar Sahu

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2023

2022

Stutter-TTS: Controlled Synthesis and Improved Recognition of Stuttered Speech.

[DOI]

,

Iván Vallés-Pérez

,

Andreas Stolcke

,

,

,

Olabanji Shonibare

,

Roberto Barra-Chicote

,

Venkatesh Ravichandran

CoRR, 2022

Guided Contrastive Self-Supervised Pre-Training for Automatic Speech Recognition.

[DOI]

,

,

Saurabhchand Bhati

,

,

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

ILASR: Privacy-Preserving Incremental Learning for Automatic Speech Recognition at Production Scale.

[DOI]

Gopinath Chennupati

,

,

Gurpreet Chadha

,

,

,

,

Anit Kumar Sahu

,

,

,

,

Buddha Nandanoor

,

Prahalad Venkataramanan

,

,

Proceedings of the KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 14, 2022

Do You Listen with one or two Microphones? A Unified ASR Model for Single and Multi-Channel Audio.

[DOI]

,

,

Brian John King

,

Sri Harish Mallidi

,

,

,

,

Proceedings of the 17th International Workshop on Acoustic Signal Enhancement, 2022

Reducing Geographic Disparities in Automatic Speech Recognition via Elastic Weight Consolidation.

[DOI]

,

Pegah Ghahremani

,

Brian John King

,

,

Andreas Stolcke

,

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Adversarial Reweighting for Speaker Verification Fairness.

[DOI]

,

,

,

,

,

Andreas Stolcke

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Improved Representation Learning For Acoustic Event Classification Using Tree-Structured Ontology.

[DOI]

Arman Zharmagambetov

,

,

,

,

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2022

Improving Fairness in Speaker Verification via Group-Adapted Fusion Network.

[DOI]

,

,

,

,

,

,

Andreas Stolcke

Proceedings of the IEEE International Conference on Acoustics, 2022

2021

Investigation of Training Label Error Impact on RNN-T.

[DOI]

,

,

CoRR, 2021

Attention-based Neural Beamforming Layers for Multi-channel Speech Recognition.

[DOI]

Bhargav Pulugundla

,

,

Brian John King

,

,

Sri Harish Mallidi

,

,

,

CoRR, 2021

Improving Multi-Speaker TTS Prosody Variance with a Residual Encoder and Normalizing Flows.

[DOI]

Iván Vallés-Pérez

,

,

Grzegorz Beringer

,

Roberto Barra-Chicote

,

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

CoDERT: Distilling Encoder Representations with Co-Learning for Transducer-Based Speech Recognition.

[DOI]

Rupak Vignesh Swaminathan

,

Brian John King

,

Grant P. Strimel

,

,

Athanasios Mouchtaris

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Evaluating the Vulnerability of End-to-End Automatic Speech Recognition Models to Membership Inference Attacks.

[DOI]

Muhammad A. Shah

,

,

,

Athanasios Mouchtaris

,

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

wav2vec-C: A Self-Supervised Model for Speech Representation Learning.

[DOI]

,

,

,

Sri Harish Mallidi

,

,

,

Andreas Stolcke

,

,

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Listen with Intent: Improving Speech Recognition with Audio-to-Intent Front-End.

[DOI]

Swayambhu Nath Ray

,

,

,

Pegah Ghahremani

,

Raghavendra Bilgi

,

,

Harish Arsikere

,

,

Andreas Stolcke

,

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Scaling Effect of Self-Supervised Speech Models.

[DOI]

,

,

,

,

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Detection of Lexical Stress Errors in Non-Native (L2) English with Data Augmentation and Attention.

[DOI]

Daniel Korzekwa

,

Roberto Barra-Chicote

,

Szymon Zaporowski

,

Grzegorz Beringer

,

Jaime Lorenzo-Trueba

,

Alicja Serafinowicz

,

,

,

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

SynthASR: Unlocking Synthetic Data for Speech Recognition.

[DOI]

,

,

,

Roberto Barra-Chicote

,

,

,

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Scaling Laws for Acoustic Models.

[DOI]

,

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Exploring the application of synthetic audio in training keyword spotters.

[DOI]

Andrew Werchniak

,

Roberto Barra-Chicote

,

Yuriy Mishchenko

,

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2021

Do as I Mean, Not as I Say: Sequence Loss Training for Spoken Language Understanding.

[DOI]

,

,

,

,

,

,

Andreas Stolcke

Proceedings of the IEEE International Conference on Acoustics, 2021

Joint ASR and Language Identification Using RNN-T: An Efficient Approach to Dynamic Language Switching.

[DOI]

Surabhi Punjabi

,

Harish Arsikere

,

,

Chander Chandak

,

,

,

,

,

,

Andreas Stolcke

,

,

,

,

,

Athanasios Mouchtaris

,

Siegfried Kunzmann

Proceedings of the IEEE International Conference on Acoustics, 2021

Top-Down Attention in End-to-End Spoken Language Understanding.

[DOI]

,

,

Alejandro Mottini

,

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2021

2020

Efficient Minimum Word Error Rate Training of RNN-Transducer for End-to-End Speech Recognition.

[DOI]

,

,

,

Maarten Van Segbroeck

,

,

Andreas Stolcke

,

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

2019

Single-channel Speech Extraction Using Speaker Inventory and Attention Network.

[DOI]

,

,

Takuya Yoshioka

,

,

,

Dimitrios Dimitriadis

,

,

Proceedings of the IEEE International Conference on Acoustics, 2019

2018

Progressive Joint Modeling in Unsupervised Single-Channel Overlapped Speech Recognition.

[DOI]

,

,

,

IEEE ACM Trans. Audio Speech Lang. Process., 2018

The Microsoft 2017 Conversational Speech Recognition System.

[DOI]

,

,

,

,

,

Andreas Stolcke

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Sequence Modeling in Unsupervised Single-Channel Overlapped Speech Recognition.

[DOI]

,

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017

Toward Human Parity in Conversational Speech Recognition.

[DOI]

,

,

,

,

Michael L. Seltzer

,

Andreas Stolcke

,

,

IEEE ACM Trans. Audio Speech Lang. Process., 2017

Comparing Human and Machine Errors in Conversational Speech Transcription.

[DOI]

Andreas Stolcke

,

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Advances in all-neural speech recognition.

[DOI]

,

,

,

Andreas Stolcke

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

The Microsoft 2016 conversational speech recognition system.

[DOI]

,

,

,

,

,

Andreas Stolcke

,

,

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Acoustic-to-word model without OOV.

[DOI]

,

,

,

,

Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

2016

Achieving Human Parity in Conversational Speech Recognition.

[DOI]

,

,

,

,

,

Andreas Stolcke

,

,

CoRR, 2016

On training bi-directional neural network language model with noise contrastive estimation.

[DOI]

,

,

,

Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

Deep Convolutional Neural Networks with Layer-Wise Context Expansion and Attention.

[DOI]

,

,

,

Andreas Stolcke

,

,

,

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Parallelizing WFST speech decoders.

[DOI]

,

,

,

Madanlal Musuvathi

,

,

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Exploiting LSTM structure in deep neural networks for speech recognition.

[DOI]

,

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Linearly augmented deep neural network.

[DOI]

Pegah Ghahremani

,

,

Michael L. Seltzer

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Self-stabilized deep neural network.

[DOI]

Pegah Ghahremani

,

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015

Deep Neural Networks for Single-Channel Multi-Talker Speech Recognition.

[DOI]

,

,

Michael L. Seltzer

,

IEEE ACM Trans. Audio Speech Lang. Process., 2015

Speech recognition with prediction-adaptation-correction recurrent neural networks.

[DOI]

,

,

Michael L. Seltzer

,

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Improving speech recognition in reverberation using a room-aware deep neural network and multi-task learning.

[DOI]

,

Michael L. Seltzer

,

,

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Deep bi-directional recurrent networks over spectral windows.

[DOI]

Abdel-rahman Mohamed

,

,

,

,

Andreas Stolcke

,

,

Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

2014

An introduction to computational networks and the computational network toolkit (invited talk).

[DOI]

,

,

Michael L. Seltzer

,

,

,

Oleksii Kuchaiev

,

,

,

,

,

,

Christopher J. Rossbach

,

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

1-bit stochastic gradient descent and its application to data-parallel distributed training of speech DNNs.

[DOI]

,

,

,

,

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Single-channel mixed speech recognition using deep neural networks.

[DOI]

,

,

Michael L. Seltzer

,

Proceedings of the IEEE International Conference on Acoustics, 2014

On parallelizability of stochastic gradient descent for speech DNNs.

[DOI]

,

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2014

Phone sequence modeling with recurrent neural networks.

[DOI]

Nicolas Boulanger-Lewandowski

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2014

2013

Multi-task learning in deep neural networks for improved phoneme recognition.

[DOI]

Michael L. Seltzer

,

Proceedings of the IEEE International Conference on Acoustics, 2013

2012

A chunk-based phonetic score for mobile voice search.

[DOI]

Rohit Prabhavalkar

,

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Feature Compensation.

[DOI]

Proceedings of the Techniques for Noise Robustness in Automatic Speech Recognition, 2012

2011

Automatically Optimizing Utterance Classification Performance without Human in the Loop.

[DOI]

,

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Learning non-parametric models of pronunciation.

[DOI]

Brian Hutchinson

,

Proceedings of the IEEE International Conference on Acoustics, 2011

Joint encoding of the waveform and speech recognition features using a transform codec.

[DOI]

,

Michael L. Seltzer

,

,

Henrique S. Malvar

,

Proceedings of the IEEE International Conference on Acoustics, 2011

2010

Noise Adaptive Training for Robust Automatic Speech Recognition.

[DOI]

,

Michael L. Seltzer

,

,

IEEE Trans. Speech Audio Process., 2010

Spontaneous Mandarin speech understanding using Utterance Classification: A case study.

[DOI]

,

Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

Continuous speech recognition with a TF-IDF acoustic model.

[DOI]

,

,

,

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Information retrieval methods for automatic speech recognition.

[DOI]

,

,

Proceedings of the IEEE International Conference on Acoustics, 2010

Context dependent phonetic string edit distance for automatic speech recognition.

[DOI]

,

Proceedings of the IEEE International Conference on Acoustics, 2010

2009

Experimenting with a global decision tree for state clustering in automatic speech recognition systems.

[DOI]

,

Proceedings of the IEEE International Conference on Acoustics, 2009

2008

Robust Speech Recognition Using a Cepstral Minimum-Mean-Square-Error-Motivated Noise Suppressor.

[DOI]

,

,

,

,

,

IEEE Trans. Speech Audio Process., 2008

Towards a non-parametric acoustic model: an acoustic decision tree for observation probability calculation.

[DOI]

,

Michael L. Seltzer

,

,

Yu-Hsiang Bosco Chiu

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

A minimum-mean-square-error noise reduction algorithm on Mel-frequency cepstra for robust speech recognition.

[DOI]

,

,

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2008

Robust design of wideband loudspeaker arrays.

[DOI]

,

,

Michael L. Seltzer

,

Proceedings of the IEEE International Conference on Acoustics, 2008

Speech enhancement using a pitch predictive model.

[DOI]

,

,

Proceedings of the IEEE International Conference on Acoustics, 2008

2007

A fine pitch model for speech.

[DOI]

,

Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Maximum Entropy Confidence Estimation for Speech Recognition.

[DOI]

Christopher White

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2007

2006

Joint Discriminative Front End and Back End Training for Improved Speech Recognition Accuracy.

[DOI]

,

Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

2005

Dynamic compensation of HMM variances using the feature enhancement uncertainty computed from a parametric model of speech distortion.

[DOI]

,

,

IEEE Trans. Speech Audio Process., 2005

Analysis and comparison of two speech feature extraction/compensation algorithms.

[DOI]

,

,

,

IEEE Signal Process. Lett., 2005

A graphical model for multi-sensory speech processing in air-and-bone conductive microphones.

[DOI]

Amarnag Subramanya

,

,

,

,

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Robust bandwidth extension of noise-corrupted narrowband speech.

[DOI]

Michael L. Seltzer

,

,

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Maximum mutual information SPLICE transform for seen and unseen conditions.

[DOI]

,

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Leakage Model and Teeth Clack Removal for Air- and Bone-Conductive Integrated Microphones.

[DOI]

,

Amar Subramanya

,

,

,

Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

2004

Speech and Language Processing for Multimodal Human-Computer Interaction.

[DOI]

,

,

,

,

,

,

Constantinos Boulis

,

,

J. VLSI Signal Process., 2004

Estimating cepstrum of speech under the presence of noise using a joint prior of static and dynamic features.

[DOI]

,

,

IEEE Trans. Speech Audio Process., 2004

Enhancement of log Mel power spectra of speech using a phase-sensitive model of the acoustic environment and sequential estimation of the corrupting noise.

[DOI]

,

,

IEEE Trans. Speech Audio Process., 2004

Direct filtering for air- and bone-conductive microphones.

[DOI]

,

,

Alejandro Acero

,

,

Proceedings of the IEEE 6th Workshop on Multimedia Signal Processing, 2004

Multi-sensory microphones for robust speech detection, enhancement and recognition.

[DOI]

,

,

,

,

,

,

,

Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Noise robust speech recognition with a switching linear dynamic model.

[DOI]

,

Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2003

Recursive estimation of nonstationary noise using iterative stochastic approximation for robust speech recognition.

[DOI]

,

,

IEEE Trans. Speech Audio Process., 2003

A harmonic-model-based front end for robust speech recognition.

[DOI]

Michael L. Seltzer

,

,

Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

A comparison of three non-linear observation models for noisy speech features.

[DOI]

,

,

Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Incremental Bayes learning with prior evolution for tracking nonstationary noise statistics from noisy speech data.

[DOI]

,

,

Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

2002

Distributed speech processing in miPad's multimodal user interface.

[DOI]

,

,

,

,

,

Constantinos Boulis

,

,

,

,

,

IEEE Trans. Speech Audio Process., 2002

Evaluation of SPLICE on the Aurora 2 and 3 tasks.

[DOI]

,

,

Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Noise from corrupted speech log mel-spectral energies.

[DOI]

,

,

Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Exploiting variances in robust feature extraction based on a parametric model of speech distortion.

[DOI]

,

,

Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Sequential MAP noise estimation and a phase-sensitive model of the acoustic environment.

[DOI]

,

,

Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Uncertainty decoding with SPLICE for noise robust speech recognition.

[DOI]

,

,

Proceedings of the IEEE International Conference on Acoustics, 2002

A Bayesian approach to speech feature enhancement using the dynamic cepstral prior.

[DOI]

,

,

Proceedings of the IEEE International Conference on Acoustics, 2002

A speech-centric perspective for human-computer interface.

[DOI]

,

,

,

,

,

,

,

Proceedings of the IEEE 5th Workshop on Multimedia Signal Processing, 2002

2001

Evaluation of the SPLICE algorithm on the Aurora2 database.

[DOI]

,

,

Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

MiPad: a multimodal interaction prototype.

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2001

Efficient on-line acoustic environment estimation for FCDCN in a continuous speech recognition system.

[DOI]

,

,

Proceedings of the IEEE International Conference on Acoustics, 2001

High-performance robust speech recognition using stereo training data.

[DOI]

,

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2001
