Richard M. Stern

Orcid: 0000-0003-0557-7282

Affiliations:
  • Carnegie Mellon University, Electrical and Computer Engineering, Pittsburgh, PA, USA


According to our database, Richard M. Stern authored at least 162 papers between 1983 and 2024.

Bibliography

2024
Modeling Analog Dynamic Range Compressors using Deep Learning and State-space Models.
CoRR, 2024

2023
Automatic Detection of Dyspnea in Real Human-Robot Interaction Scenarios.
Sensors, September, 2023

Online Active Learning For Sound Event Detection.
CoRR, 2023

Unsupervised Voice Type Discrimination Score Adaptation Using X-Vector Clusters.
Proceedings of the IEEE International Conference on Acoustics, 2023

Reducing the Cost of Spoof Detection Labeling using Mixed-Strategy Active Learning and Pretrained Models.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022
Learnable Front Ends Based on Temporal Modulation for Music Tagging.
CoRR, 2022

Investigating the Important Temporal Modulations for Deep-Learning-Based Speech Activity Detection.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Improved Modulation-Domain Loss for Neural-Network-based Speech Enhancement.
Proceedings of the Interspeech 2022, 2022

2021
Temporal Context in Speech Emotion Recognition.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

The Application of Learnable STRF Kernels to the 2021 Fearless Steps Phase-03 SAD Challenge.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

A Modulation-Domain Loss for Neural-Network-Based Real-Time Speech Enhancement.
Proceedings of the IEEE International Conference on Acoustics, 2021

2020
Non causal deep learning based dereverberation.
CoRR, 2020

Learnable Spectro-Temporal Receptive Fields for Robust Voice Type Discrimination.
Proceedings of the Interspeech 2020, 2020

2019
On combining features for single-channel robust speech recognition in reverberant environments.
CoRR, 2019

Weighted delay-and-sum beamforming guided by visual tracking for human-robot interaction.
CoRR, 2019

Robust Recognition of Reverberant and Noisy Speech Using Coherence-based Processing.
Proceedings of the IEEE International Conference on Acoustics, 2019

2018
An improved DNN-based spectral feature mapping that removes noise and reverberation for robust automatic speech recognition.
CoRR, 2018

Exploring the robustness of features and enhancement on speech recognition systems in highly-reverberant real environments.
CoRR, 2018

Highly-Reverberant Real Environment database: HRRE.
CoRR, 2018

A Comparative Study of Spatial Speech Separation Techniques to Improve Speech Recognition.
Proceedings of the Advances in Neural Networks - ISNN 2018, 2018

A Priori SNR Estimation Based on a Recurrent Neural Network for Robust Speech Enhancement.
Proceedings of the Interspeech 2018, 2018

Sound Source Separation Using Phase Difference and Reliable Mask Selection.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017
Synchrony-Based Feature Extraction for Robust Automatic Speech Recognition.
IEEE Signal Process. Lett., 2017

Locally Normalized Filter Banks Applied to Deep Neural-Network-Based Robust Speech Recognition.
IEEE Signal Process. Lett., 2017

Robustness Over Time-Varying Channels in DNN-HMM ASR Based Human-Robot Interaction.
Proceedings of the Interspeech 2017, 2017

Robust Speech Recognition Based on Binaural Auditory Processing.
Proceedings of the Interspeech 2017, 2017

Binaural processing for robust recognition of degraded speech.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

Robust Features in Deep-Learning-Based Speech Recognition.
Proceedings of the New Era for Robust Speech Recognition, Exploiting Deep Learning, 2017

2016
Power-Normalized Cepstral Coefficients (PNCC) for Robust Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2016

A Subband-Based Stationary-Component Suppression Method Using Harmonics and Power Ratio for Reverberant Speech Recognition.
IEEE Signal Process. Lett., 2016

The Use of Locally Normalized Cepstral Coefficients (LNCC) to Improve Speaker Recognition Accuracy in Highly Reverberant Rooms.
Proceedings of the Interspeech 2016, 2016

Fusion Strategies for Robust Speech Recognition and Keyword Spotting for Channel- and Noise-Degraded Speech.
Proceedings of the Interspeech 2016, 2016

2015
Efficient Real Spherical Harmonic Representation of Head-Related Transfer Functions.
IEEE J. Sel. Top. Signal Process., 2015

A perceptually-motivated low-complexity instantaneous linear channel normalization technique applied to speaker verification.
Comput. Speech Lang., 2015


Robust parameter estimation for audio declipping in noise.
Proceedings of the INTERSPEECH 2015, 2015

Robustness to additive noise of locally-normalized cepstral coefficients in speaker verification.
Proceedings of the INTERSPEECH 2015, 2015

Towards machines that know when they do not know: Summary of work done at 2014 Frederick Jelinek Memorial Workshop.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Efficient audio declipping using regularized least squares.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

2014
Optimization of the parameters characterizing sigmoidal rate-level functions based on acoustic features.
Speech Commun., 2014


Robust speech recognition in reverberant environments using subband-based steady-state monaural and binaural suppression.
Proceedings of the INTERSPEECH 2014, 2014

Post-masking: a hybrid approach to array processing for speech recognition.
Proceedings of the INTERSPEECH 2014, 2014

Robust speech recognition using temporal masking and thresholding algorithm.
Proceedings of the INTERSPEECH 2014, 2014

Least squares signal declipping for robust speech recognition.
Proceedings of the INTERSPEECH 2014, 2014

An analysis of binaural spectro-temporal masking as nonlinear beamforming.
Proceedings of the IEEE International Conference on Acoustics, 2014

2013
Perceptual Properties of Current Speech Recognition Technology.
Proc. IEEE, 2013


Optimization of sigmoidal rate-level function based on acoustic features.
Proceedings of the INTERSPEECH 2013, 2013

2012
Learning-Based Auditory Encoding for Robust Speech Recognition.
IEEE Trans. Speech Audio Process., 2012

Hearing Is Believing: Biologically Inspired Methods for Robust Automatic Speech Recognition.
IEEE Signal Process. Mag., 2012


Optimization of the DET curve in speaker verification.
Proceedings of the 2012 IEEE Spoken Language Technology Workshop (SLT), 2012

Two-microphone source separation algorithm based on statistical modeling of angle distributions.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Histogram-based subband powerwarping and spectral averaging for robust speech recognition under matched and multistyle training.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Features Based on Auditory Physiology and Perception.
Proceedings of the Techniques for Noise Robustness in Automatic Speech Recognition, 2012

2011
Mask classification for missing-feature reconstruction for robust speech recognition in unknown background noise.
Speech Commun., 2011

Gammatone sub-band magnitude-domain dereverberation for ASR.
Proceedings of the IEEE International Conference on Acoustics, 2011

An iterative least-squares technique for dereverberation.
Proceedings of the IEEE International Conference on Acoustics, 2011

Delta-spectral cepstral coefficients for robust speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2011

Binaural sound source separation motivated by auditory processing.
Proceedings of the IEEE International Conference on Acoustics, 2011

2010
Automatic selection of thresholds for signal separation algorithms based on interaural delay.
Proceedings of the INTERSPEECH 2010, 2010

Nonlinear enhancement of onset for robust speech recognition.
Proceedings of the INTERSPEECH 2010, 2010

Maximum-likelihood-based cepstral inverse filtering for blind speech dereverberation.
Proceedings of the IEEE International Conference on Acoustics, 2010

Feature extraction for robust speech recognition based on maximizing the sharpness of the power distribution and on power flooring.
Proceedings of the IEEE International Conference on Acoustics, 2010

A hybrid physical and statistical dynamic articulatory framework incorporating analysis-by-synthesis for improved phone classification.
Proceedings of the IEEE International Conference on Acoustics, 2010

2009
Spatial separation of speech signals using amplitude estimation based on interaural comparisons of zero-crossings.
Speech Commun., 2009

Feature extraction for robust speech recognition using a power-law nonlinearity and power-bias subtraction.
Proceedings of the INTERSPEECH 2009, 2009

Signal separation for robust speech recognition based on phase difference information obtained in the frequency domain.
Proceedings of the INTERSPEECH 2009, 2009

Speaker segmentation and clustering for simultaneously presented speech.
Proceedings of the INTERSPEECH 2009, 2009

Towards fusion of feature extraction and acoustic model training: a top down process for robust speech recognition.
Proceedings of the INTERSPEECH 2009, 2009

Unsupervised training scheme with non-stereo data for empirical feature vector compensation.
Proceedings of the INTERSPEECH 2009, 2009

Deriving vocal tract shapes from electromagnetic articulograph data via geometric adaptation and matching.
Proceedings of the INTERSPEECH 2009, 2009

Minimum variance modulation filter for robust speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2009

Power function-based power distribution normalization algorithm for robust speech recognition.
Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009

Robust speech recognition using a Small Power Boosting algorithm.
Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009

2008
Robust signal-to-noise ratio estimation based on waveform amplitude distribution analysis.
Proceedings of the INTERSPEECH 2008, 2008

Analysis of physiologically-motivated signal processing for robust speech recognition.
Proceedings of the INTERSPEECH 2008, 2008

Environment-invariant compensation for reverberation using linear post-filtering for minimum distortion.
Proceedings of the IEEE International Conference on Acoustics, 2008

Single-channel speech separation based on modulation frequency.
Proceedings of the IEEE International Conference on Acoustics, 2008

Analysis-by-synthesis features for speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2008

2007
"polyaural" array processing for automatic speech recognition in degraded environments.
Proceedings of the INTERSPEECH 2007, 2007

Missing Feature Speech Recognition using Dereverberation and Echo Suppression in Reverberant Environments.
Proceedings of the IEEE International Conference on Acoustics, 2007

Profile View Lip Reading.
Proceedings of the IEEE International Conference on Acoustics, 2007

2006
Subband Likelihood-Maximizing Beamforming for Speech Recognition in Reverberant Environments.
IEEE Trans. Speech Audio Process., 2006

Voting for two speaker segmentation.
Proceedings of the INTERSPEECH 2006, 2006

Physiologically-motivated synchrony-based processing for robust automatic speech recognition.
Proceedings of the INTERSPEECH 2006, 2006

An integrated approach to improve speech recognition rate for non-native speakers.
Proceedings of the INTERSPEECH 2006, 2006

Spatial Separation of Speech Signals Using Continuously-Variable Masks Estimated From Comparisons of Zero Crossings.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Band-Independent Mask Estimation for Missing-Feature Reconstruction in the Presence of Unknown Background Noise.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

2005
Missing-feature approaches in speech recognition.
IEEE Signal Process. Mag., 2005

Feature compensation based on switching linear dynamic model.
IEEE Signal Process. Lett., 2005

Voice driven applications in non-stationary and chaotic environment.
Proceedings of the IEEE International Conference on Robotics and Biomimetics, 2005

Environment-independent mask estimation for missing-feature reconstruction.
Proceedings of the INTERSPEECH 2005, 2005

Signal Separation Motivated by Human Auditory Perception: Applications to Automatic Speech Recognition.
Proceedings of the Speech Separation by Humans and Machines, 2005

2004
Likelihood-maximizing beamforming for robust hands-free speech recognition.
IEEE Trans. Speech Audio Process., 2004

A Bayesian classifier for spectrographic mask estimation for missing feature speech recognition.
Speech Commun., 2004

Reconstruction of missing features for robust speech recognition.
Speech Commun., 2004

Normalization of Time-Derivative Parameters for Robust Speech Recognition in Small Devices.
IEICE Trans. Inf. Syst., 2004

N-Best List Rescoring Using Syntactic Trigrams.
Proceedings of the MICAI 2004: Advances in Artificial Intelligence, 2004

Parallel feature generation based on maximizing normalized acoustic likelihood.
Proceedings of the INTERSPEECH 2004, 2004

Parameter sharing in subband likelihood-maximizing beamforming for speech recognition using microphone arrays.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

On tracking noise with linear dynamical system models.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Feature generation based on maximum normalized acoustic likelihood for improved speech recognition.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2003
Normalization of time-derivative parameters using histogram equalization.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Duration normalization and hypothesis combination for improved spontaneous speech recognition.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Feature generation based on maximum classification probability for improved speech recognition.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Subband parameter optimization of microphone arrays for speech recognition in reverberant environments.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Training of stream weights for the decoding of speech using parallel feature streams.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

2002
Automatic generation of subword units for speech recognition systems.
IEEE Trans. Speech Audio Process., 2002

Combining search spaces of heterogeneous recognizers for improved speech recognition.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Speech recognizer-based microphone array processing for robust hands-free speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2002

2001
Distortion-class modeling for robust speech recognition under GSM RPE-LTP coding.
Speech Commun., 2001

Speech in Noisy Environments: robust automatic segmentation, feature extraction, and hypothesis combination.
Proceedings of the IEEE International Conference on Acoustics, 2001

Duration normalization for improved recognition of spontaneous and read speech via missing feature methods.
Proceedings of the IEEE International Conference on Acoustics, 2001

2000
Structured redefinition of sound units by merging and splitting for improved speech recognition.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Classifier-based mask estimation for missing feature methods of robust speech recognition.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Reconstruction of damaged spectrographic features for robust speech recognition.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Automatic subword unit refinement for spontaneous speech recognition via phone splitting.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Phone transition acoustic modeling: application to speaker independent and spontaneous speech systems.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Instantaneous-distortion based weighted acoustic modeling for robust recognition of coded speech.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Using class weighting in inter-class MLLR.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Automatic generation of phone sets and lexical transcriptions.
Proceedings of the IEEE International Conference on Acoustics, 2000

Inter-class MLLR for speaker adaptation.
Proceedings of the IEEE International Conference on Acoustics, 2000

1999
Domain adduced state tying for cross-domain acoustic modelling.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

Automatic clustering and generation of contextual questions for tied states in hidden Markov models.
Proceedings of the 1999 IEEE International Conference on Acoustics, 1999

1998
Data-driven environmental compensation for speech recognition: A unified approach.
Speech Commun., 1998

Inference of missing spectrographic features for robust speech recognition.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

Speech recognition from GSM codec parameters.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

1997
Compensation for environmental and speaker variability by normalization of pole locations.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

Speaker normalization through formant-based warping of the frequency scale.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

The effects of background music on speech recognition accuracy.
Proceedings of the 1997 IEEE International Conference on Acoustics, 1997

1996
Cepstral compensation by polynomial approximation for environment-independent speech recognition.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

A vector Taylor series approach for environment-independent speech recognition.
Proceedings of the 1996 IEEE International Conference on Acoustics, 1996

1995
A unified approach for robust speech recognition.
Proceedings of the Fourth European Conference on Speech Communication and Technology, 1995

On the effects of speech rate in large vocabulary speech recognition systems.
Proceedings of the 1995 International Conference on Acoustics, 1995

Multivariate-Gaussian-based cepstral normalization for robust speech recognition.
Proceedings of the 1995 International Conference on Acoustics, 1995

1994
Signal processing for robust speech recognition.
Proceedings of the 3rd International Conference on Spoken Language Processing, 1994

Environmental robustness in automatic speech recognition using physiologically-motivated signal processing.
Proceedings of the 3rd International Conference on Spoken Language Processing, 1994

Robust speech recognition in the automobile.
Proceedings of the 3rd International Conference on Spoken Language Processing, 1994

Sources of degradation of speech recognition in the telephone network.
Proceedings of ICASSP '94: IEEE International Conference on Acoustics, 1994

Environment normalization for robust speech recognition using direct cepstral comparison.
Proceedings of ICASSP '94: IEEE International Conference on Acoustics, 1994

1993
Efficient Cepstral Normalization For Robust Speech Recognition.
Proceedings of the Human Language Technology: Proceedings of a Workshop Held at Plainsboro, 1993

Multi-microphone correlation-based processing for robust speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 1993

1992
Speech Understanding in Open Tasks.
Proceedings of the Speech and Natural Language: Proceedings of a Workshop Held at Harriman, 1992

Multiple approaches to robust speech recognition.
Proceedings of the Second International Conference on Spoken Language Processing, 1992

Efficient joint compensation of speech for the effects of additive noise and linear filtering.
Proceedings of the 1992 IEEE International Conference on Acoustics, 1992

1991
Speaker adaptation in continuous speech recognition via estimation of correlated mean vectors.
Proceedings of the 1991 International Conference on Acoustics, 1991

Robust speech recognition by normalization of the acoustic space.
Proceedings of the 1991 International Conference on Acoustics, 1991

1990
Overview of the Third DARPA Speech and Natural Language Workshop.
Proceedings of the Speech and Natural Language: Proceedings of a Workshop Held at Hidden Valley, 1990

Towards Environment-Independent Spoken Language Systems.
Proceedings of the Speech and Natural Language: Proceedings of a Workshop Held at Hidden Valley, 1990

Acoustical pre-processing for robust spoken language systems.
Proceedings of the First International Conference on Spoken Language Processing, 1990

Environmental robustness in automatic speech recognition.
Proceedings of the 1990 International Conference on Acoustics, 1990

1989
Acoustical Pre-Processing for Robust Speech Recognition.
Proceedings of the Speech and Natural Language: Proceedings of a Workshop Held at Cape Cod, 1989

1988
Parsing spoken phrases despite missing words.
Proceedings of the IEEE International Conference on Acoustics, 1988

1987
Dynamic speaker adaptation for feature-based isolated word recognition.
IEEE Trans. Acoust. Speech Signal Process., 1987

Sentence parsing with weak grammatical constraints.
Proceedings of the IEEE International Conference on Acoustics, 1987

1984
A Posteriori Estimation of Correlated Jointly Gaussian Mean Vectors.
IEEE Trans. Pattern Anal. Mach. Intell., 1984

Fast Computation of the Difference of Low-Pass Transform.
IEEE Trans. Pattern Anal. Mach. Intell., 1984

Unsupervised adaptation to new speakers in feature-based letter recognition.
Proceedings of the IEEE International Conference on Acoustics, 1984

1983
Dynamic speaker adaptation for isolated letter recognition using MAP estimation.
Proceedings of the IEEE International Conference on Acoustics, 1983

Feature-based speaker-independent recognition of isolated english letters.
Proceedings of the IEEE International Conference on Acoustics, 1983

