Li Deng

ORCID: 0000-0002-1014-0790

Affiliations:
  • Artificial Intelligence, Citadel, USA
  • Microsoft Research, Redmond, WA, USA
  • University of Waterloo, Department of Electrical and Computer Engineering, ON, Canada (1989 - 1999)
  • University of Wisconsin-Madison, WI, USA (PhD 1986)


According to our database, Li Deng authored at least 322 papers between 1991 and 2023.

Bibliography

2023
Trusting Language Models in Education.
CoRR, 2023

2020
Multimodal Intelligence: Representation Learning, Information Fusion, and Applications.
IEEE J. Sel. Top. Signal Process., 2020

Introduction to the Special Issue on Deep Learning for Multi-Modal Intelligence Across Speech, Language, Vision, and Heterogeneous Signals.
IEEE J. Sel. Top. Signal Process., 2020

Prediction model of the response to neoadjuvant chemotherapy in breast cancers by a Naive Bayes algorithm.
Comput. Methods Programs Biomed., 2020

2019
From Caesar Cipher to Unsupervised Learning: A New Method for Classifier Parameter Estimation.
CoRR, 2019

Attentive Tensor Product Learning.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Artificial Intelligence in the Rising Wave of Deep Learning: The Historical Path and Future Outlook [Perspectives].
IEEE Signal Process. Mag., 2018

Attentive Tensor Product Learning for Language Generation and Grammar Parsing.
CoRR, 2018

Tensor Product Generation Networks for Deep NLP Modeling.
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018

Towards Neural Phrase-based Machine Translation.
Proceedings of the 6th International Conference on Learning Representations, 2018

Question-Answering with Grammatically-Interpretable Representations.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

BBQ-Networks: Efficient Exploration in Deep Reinforcement Learning for Task-Oriented Dialogue Systems.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Perspectives on predictive power of multimodal deep learning: surprises and future directions.
Proceedings of the Handbook of Multimodal-Multisensor Interfaces: Foundations, User Modeling, and Common Modality Combinations, 2018

2017
Deep Learning for Image-to-Text Generation: A Technical Overview.
IEEE Signal Process. Mag., 2017

Challenges and Open Problems in Signal Processing: Panel Discussion Summary from ICASSP 2017 [Panel and Forum].
IEEE Signal Process. Mag., 2017

Convolutional Deep Stacking Networks for distributed compressive sensing.
Signal Process., 2017

A Neural-Symbolic Approach to Natural Language Tasks.
CoRR, 2017

Tensor Product Generation Networks.
CoRR, 2017

Deep Learning of Grammatically-Interpretable Representations Through Question-Answering.
CoRR, 2017

An Unsupervised Learning Method Exploiting Sequential Output Statistics.
CoRR, 2017

Neural Phrase-based Machine Translation.
CoRR, 2017

Scaffolding Networks for Teaching and Learning to Comprehend.
CoRR, 2017

Unsupervised Sequence Classification using Sequential Output Statistics.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Q-LDA: Uncovering Latent Patterns in Text-based Sequential Decision Processes.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Sequence Modeling via Segmentations.
Proceedings of the 34th International Conference on Machine Learning, 2017

End-to-end joint learning of natural language understanding and dialogue manager.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Character-level deep conflation for business data analytics.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Two-Stage Synthesis Networks for Transfer Learning in Machine Comprehension.
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017

Semantic Compositional Networks for Visual Captioning.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

StyleNet: Generating Attractive Visual Captions with Styles.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Towards End-to-End Reinforcement Learning of Dialogue Agents for Information Access.
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017

2016
Distributed Compressive Sensing: A Deep Learning Approach.
IEEE Trans. Signal Process., 2016

Deep Sentence Embedding Using Long Short-Term Memory Networks: Analysis and Application to Information Retrieval.
IEEE ACM Trans. Audio Speech Lang. Process., 2016

Basic Reasoning with Tensor Product Representations.
CoRR, 2016

Efficient Exploration for Dialog Policy Learning with Deep BBQ Networks & Replay Buffer Spiking.
CoRR, 2016

Combating Reinforcement Learning's Sisyphean Curse with Intrinsic Fear.
CoRR, 2016

Reasoning in Vector Space: An Exploratory Study of Question Answering.
Proceedings of the 4th International Conference on Learning Representations, 2016

Deep Reinforcement Learning with a Combinatorial Action Space for Predicting and Tracking Popular Discussion Threads.
CoRR, 2016

End-to-End Reinforcement Learning of Dialogue Agents for Information Access.
CoRR, 2016

Knowledge as a Teacher: Knowledge-Guided Structural Attention Networks.
CoRR, 2016

Unsupervised Learning of Predictors from Unpaired Input-Output Samples.
CoRR, 2016

Syntax or semantics? knowledge-guided joint semantic frame parsing.
Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016

MS MARCO: A Human Generated MAchine Reading COmprehension Dataset.
Proceedings of the Workshop on Cognitive Computation: Integrating neural and symbolic approaches 2016 co-located with the 30th Annual Conference on Neural Information Processing Systems (NIPS 2016), 2016

The 2015 NIST Language Recognition Evaluation: The Shared View of I2R, Fantastic4 and SingaMS.
Proceedings of the Interspeech 2016, 2016

Multi-Domain Joint Semantic Frame Parsing Using Bi-Directional RNN-LSTM.
Proceedings of the Interspeech 2016, 2016

End-to-End Memory Networks with Knowledge Carryover for Multi-Turn Spoken Language Understanding.
Proceedings of the Interspeech 2016, 2016

Exploiting correlations among channels in distributed compressive sensing with convolutional deep stacking networks.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Interpreting the prediction process of a deep network constructed from supervised topic models.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Reconstruction of sparse vectors in compressive sensing with multiple measurement vectors using bidirectional long short-term memory.
Proceedings of the 2016 IEEE Global Conference on Signal and Information Processing, 2016

Deep Reinforcement Learning with a Combinatorial Action Space for Predicting Popular Reddit Threads.
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016

Bi-directional Attention with Agreement for Dependency Parsing.
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016

Stacked Attention Networks for Image Question Answering.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Deep Reinforcement Learning with a Natural Language Action Space.
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016

2015
Using Recurrent Neural Networks for Slot Filling in Spoken Language Understanding.
IEEE ACM Trans. Audio Speech Lang. Process., 2015

Deep Learning for Acoustic Modeling in Parametric Speech Generation: A systematic review of existing techniques and future trends.
IEEE Signal Process. Mag., 2015

Embedding Entities and Relations for Learning and Inference in Knowledge Bases.
Proceedings of the 3rd International Conference on Learning Representations, 2015

Deep Sentence Embedding Using the Long Short Term Memory Network: Analysis and Application to Information Retrieval.
CoRR, 2015

Recurrent Reinforcement Learning: A Hybrid Approach.
CoRR, 2015

Deep Reinforcement Learning with an Unbounded Action Space.
CoRR, 2015

End-to-end Learning of Latent Dirichlet Allocation by Mirror-Descent Back Propagation.
CoRR, 2015

End-to-end Learning of LDA by Mirror-Descent Back Propagation over a Deep Architecture.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Representation Learning Using Multi-Task Deep Neural Networks for Semantic Classification and Information Retrieval.
Proceedings of the NAACL HLT 2015, The 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, Colorado, USA, May 31, 2015

A Deep Embedding Model for Co-occurrence Learning.
Proceedings of the IEEE International Conference on Data Mining Workshop, 2015

From captions to visual concepts and back.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Language Models for Image Captioning: The Quirks and What Works.
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, 2015

2014
An Overview of Noise-Robust Automatic Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2014

Editorial: Expanding the Technical Reach of our Transactions.
IEEE ACM Trans. Audio Speech Lang. Process., 2014

Farewell editorial: keeping up the momentum of innovations.
IEEE ACM Trans. Audio Speech Lang. Process., 2014

Convolutional Neural Networks for Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2014

A fast maximum likelihood nonlinear feature transformation method for GMM-HMM speaker adaptation.
Neurocomputing, 2014

Deep Learning: Methods and Applications.
Found. Trends Signal Process., 2014

Learning Multi-Relational Semantics Using Neural-Embedding Models.
CoRR, 2014

Semantic Modelling with Long-Short-Term Memory for Information Retrieval.
CoRR, 2014

A New Method for Learning Deep Recurrent Neural Networks.
Proceedings of the 2nd International Conference on Learning Representations, 2014

Learning semantic representations using convolutional neural networks for web search.
Proceedings of the 23rd International World Wide Web Conference, 2014

Ensemble deep learning for speech recognition.
Proceedings of the INTERSPEECH 2014, 2014

Achievements and challenges of deep learning - from speech analysis and recognition to language and multimodal processing.
Proceedings of the INTERSPEECH 2014, 2014

Sequence classification using the high-level features extracted from deep neural networks.
Proceedings of the IEEE International Conference on Acoustics, 2014

Modeling Interestingness with Deep Neural Networks.
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, 2014

A Latent Semantic Model with Convolutional-Pooling Structure for Information Retrieval.
Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management, 2014

Recurrent Deep-Stacking Networks for sequence classification.
Proceedings of the IEEE China Summit & International Conference on Signal and Information Processing, 2014

Learning Continuous Phrase Representations for Translation Modeling.
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014

2013
The Deep Tensor Neural Network With Applications to Large Vocabulary Speech Recognition.
IEEE Trans. Speech Audio Process., 2013

Optimization Algorithms and Applications for Speech and Language Processing.
IEEE Trans. Speech Audio Process., 2013

Modeling Spectral Envelopes Using Restricted Boltzmann Machines and Deep Belief Networks for Statistical Parametric Speech Synthesis.
IEEE Trans. Speech Audio Process., 2013

Speech Recognition Using Long-Span Temporal Patterns in a Deep Network Model.
IEEE Signal Process. Lett., 2013

Speech Information Processing: Theory and Applications [Scanning the Issue].
Proc. IEEE, 2013

Speech-Centric Information Processing: An Optimization-Oriented Approach.
Proc. IEEE, 2013

Tensor Deep Stacking Networks.
IEEE Trans. Pattern Anal. Mach. Intell., 2013

Guest Editors' Introduction: Special Section on Learning Deep Architectures.
IEEE Trans. Pattern Anal. Mach. Intell., 2013

Exploiting deep neural networks for detection-based speech recognition.
Neurocomputing, 2013

Learning Input and Recurrent Weight Matrices in Echo State Networks.
CoRR, 2013

Learning Semantic Representations for the Phrase Translation Model.
CoRR, 2013

Investigation of recurrent-neural-network architectures and learning methods for spoken language understanding.
Proceedings of the INTERSPEECH 2013, 2013

Deep segmental neural networks for speech recognition.
Proceedings of the INTERSPEECH 2013, 2013

Exploring convolutional neural network structures and optimization techniques for speech recognition.
Proceedings of the INTERSPEECH 2013, 2013

Using deep stacking network to improve structured compressed sensing with Multiple Measurement Vectors.
Proceedings of the IEEE International Conference on Acoustics, 2013

Modeling spectral envelopes using restricted Boltzmann machines for statistical parametric speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2013

Cross-language knowledge transfer using multilingual deep neural network with shared hidden layers.
Proceedings of the IEEE International Conference on Acoustics, 2013

Predicting speech recognition confidence using deep learning with word identity and score features.
Proceedings of the IEEE International Conference on Acoustics, 2013

Random features for Kernel Deep Convex Network.
Proceedings of the IEEE International Conference on Acoustics, 2013

Multi-style adaptive training for robust cross-lingual spoken language understanding.
Proceedings of the IEEE International Conference on Acoustics, 2013

End-to-end learning of parsing models for information retrieval.
Proceedings of the IEEE International Conference on Acoustics, 2013

Recent advances in deep learning for speech research at Microsoft.
Proceedings of the IEEE International Conference on Acoustics, 2013

New types of deep neural network learning for speech recognition and related applications: an overview.
Proceedings of the IEEE International Conference on Acoustics, 2013

Deep stacking networks for information retrieval.
Proceedings of the IEEE International Conference on Acoustics, 2013

A deep convolutional neural network using heterogeneous pooling for trading acoustic invariance with phonetic confusion.
Proceedings of the IEEE International Conference on Acoustics, 2013

Large-scale malware classification using random projections and neural networks.
Proceedings of the IEEE International Conference on Acoustics, 2013

Learning deep structured semantic models for web search using clickthrough data.
Proceedings of the 22nd ACM International Conference on Information and Knowledge Management, 2013

2012
Inaugural Editorial: Riding the Tidal Wave of Human-Centric Information Processing - Innovate, Outreach, Collaborate, Connect, Expand, and Win.
IEEE Trans. Speech Audio Process., 2012

Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition.
IEEE Trans. Speech Audio Process., 2012

The MNIST Database of Handwritten Digit Images for Machine Learning Research [Best of the Web].
IEEE Signal Process. Mag., 2012

Efficient and effective algorithms for training single-hidden-layer neural networks.
Pattern Recognit. Lett., 2012

Adaptation of context-dependent deep neural networks for automatic speech recognition.
Proceedings of the 2012 IEEE Spoken Language Technology Workshop (SLT), 2012

Use of kernel deep convex networks and end-to-end learning for spoken language understanding.
Proceedings of the 2012 IEEE Spoken Language Technology Workshop (SLT), 2012

Being deep and being dynamic - new-generation models and methodology for advancing speech technology.
Proceedings of the Odyssey 2012: The Speaker and Language Recognition Workshop, 2012

Learning with Recursive Perceptual Representations.
Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012

Learning deep architectures using kernel modules.
Proceedings of the 2012 Symposium on Machine Learning in Speech and Language Processing, 2012

Large Vocabulary Speech Recognition Using Deep Tensor Neural Networks.
Proceedings of the INTERSPEECH 2012, 2012

Are Sparse Representations Rich Enough for Acoustic Modeling?
Proceedings of the INTERSPEECH 2012, 2012

Parallel Training for Deep Stacking Networks.
Proceedings of the INTERSPEECH 2012, 2012

Exploiting sparseness in deep neural networks for large vocabulary speech recognition.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Boosting attribute and phone estimation accuracies with deep neural networks for detection-based speech recognition.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Towards deeper understanding: Deep convex networks for semantic utterance classification.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

A deep architecture with bilinear modeling of hidden representations: Applications to phonetic recognition.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Optimization in speech-centric information processing: Criteria and techniques.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Scalable stacking and learning for building deep architectures.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

New methods and evaluation experiments on translating TED talks in the IWSLT benchmark.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Maximum Expected BLEU Training of Phrase and Lexicon Translation Models.
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference, July 8-14, 2012, Jeju Island, Korea, 2012

2011
Deep Learning and Its Applications to Signal and Information Processing [Exploratory DSP].
IEEE Signal Process. Mag., 2011

Signal Processing Trends in Media, Mobility, and Search [From the Editors].
IEEE Signal Process. Mag., 2011

Speech Recognition, Machine Translation, and Speech Translation - A Unified Discriminative Learning Paradigm [Lecture Notes].
IEEE Signal Process. Mag., 2011

Shining Bright: The Golden Era of Signal Processing [From the Editor].
IEEE Signal Process. Mag., 2011

New Honor, New Initiatives, and New Impact to Come [From the Editor].
IEEE Signal Process. Mag., 2011

The MSR System for IWSLT 2011 evaluation.
Proceedings of the 2011 International Workshop on Spoken Language Translation, 2011

Deep Convex Net: A Scalable Architecture for Speech Pattern Classification.
Proceedings of the INTERSPEECH 2011, 2011

Accelerated Parallelizable Neural Network Learning Algorithm for Speech Recognition.
Proceedings of the INTERSPEECH 2011, 2011

Robust Speech Translation by Domain Adaptation.
Proceedings of the INTERSPEECH 2011, 2011

A novel decision function and the associated decision-feedback learning for speech translation.
Proceedings of the IEEE International Conference on Acoustics, 2011

Why word error rate is not a good metric for speech recognizer training for the speech translation task?
Proceedings of the IEEE International Conference on Acoustics, 2011

Large vocabulary continuous speech recognition with context-dependent DBN-HMMs.
Proceedings of the IEEE International Conference on Acoustics, 2011

Front-End, Back-End, and Hybrid Techniques for Noise-Robust Speech Recognition.
Proceedings of the Robust Speech Recognition of Uncertain or Missing Data, 2011

2010
An Overview of Modern Speech Recognition.
Proceedings of the Handbook of Natural Language Processing, Second Edition., 2010

A Geometric Perspective of Large-Margin Training of Gaussian Models [Lecture Notes].
IEEE Signal Process. Mag., 2010

Sequential Labeling Using Deep-Structured Conditional Random Fields.
IEEE J. Sel. Top. Signal Process., 2010

Introduction to the Issue on Statistical Learning Methods for Speech and Language Processing.
IEEE J. Sel. Top. Signal Process., 2010

Active learning and semi-supervised learning for speech recognition: A unified framework using the global entropy reduction maximization criterion.
Comput. Speech Lang., 2010

The MSRA machine translation system for IWSLT 2010.
Proceedings of the 2010 International Workshop on Spoken Language Translation, 2010

Deep-structured hidden conditional random fields for phonetic recognition.
Proceedings of the INTERSPEECH 2010, 2010

Investigation of full-sequence training of deep belief networks for speech recognition.
Proceedings of the INTERSPEECH 2010, 2010

Unscented transform with online distortion estimation for HMM adaptation.
Proceedings of the INTERSPEECH 2010, 2010

Binary coding of speech spectrograms using a deep auto-encoder.
Proceedings of the INTERSPEECH 2010, 2010

Word confidence calibration using a maximum entropy model with constraints on confidence and word distributions.
Proceedings of the IEEE International Conference on Acoustics, 2010

Language recognition using deep-structured conditional random fields.
Proceedings of the IEEE International Conference on Acoustics, 2010

Semantic confidence calibration for spoken dialog applications.
Proceedings of the IEEE International Conference on Acoustics, 2010

2009
A Novel Framework and Training Algorithm for Variable-Parameter Hidden Markov Models.
IEEE Trans. Speech Audio Process., 2009

Solving nonlinear estimation problems using splines [Lecture Notes].
IEEE Signal Process. Mag., 2009

Updated MINDS report on speech recognition and understanding, Part 2 [DSP Education].
IEEE Signal Process. Mag., 2009

Developments and directions in speech recognition and understanding, Part 1 [DSP Education].
IEEE Signal Process. Mag., 2009

Using continuous features in the maximum entropy model.
Pattern Recognit. Lett., 2009

A unified framework of HMM adaptation with joint compensation of additive and convolutive distortions.
Comput. Speech Lang., 2009

Hidden conditional random field with distribution constraints for phone classification.
Proceedings of the INTERSPEECH 2009, 2009

Rethinking of computation for future-generation, knowledge-rich speech recognition and understanding.
Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, 2009

Cross-lingual speech recognition under runtime resource constraints.
Proceedings of the IEEE International Conference on Acoustics, 2009

Discriminative pronounciation learning using phonetic decoder and minimum-classification-error criterion.
Proceedings of the IEEE International Conference on Acoustics, 2009

Maximizing global entropy reduction for active learning in speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2009

Using collective information in semi-supervised learning for speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2009

A study on multilingual acoustic modeling for large vocabulary ASR.
Proceedings of the IEEE International Conference on Acoustics, 2009

2008
Discriminative Learning for Speech Recognition: Theory and Practice
Synthesis Lectures on Speech and Audio Processing, Morgan & Claypool Publishers, ISBN: 978-3-031-02557-0, 2008

Robust Speech Recognition Using a Cepstral Minimum-Mean-Square-Error-Motivated Noise Suppressor.
IEEE Trans. Speech Audio Process., 2008

An Integrative and Discriminative Technique for Spoken Utterance Classification.
IEEE Trans. Speech Audio Process., 2008

Discriminative learning in sequential pattern recognition.
IEEE Signal Process. Mag., 2008

Large-margin minimum classification error training: A theoretical risk minimization perspective.
Comput. Speech Lang., 2008

Improvements on Mel-Frequency Cepstrum Minimum-Mean-Square-Error Noise Suppressor for Robust Speech Recognition.
Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008

Parameter clustering and sharing in variable-parameter HMMs for noise robust speech recognition.
Proceedings of the INTERSPEECH 2008, 2008

Discriminative training of variable-parameter HMMs for noise robust speech recognition.
Proceedings of the INTERSPEECH 2008, 2008

Automatic children's reading tutor on hand-held devices.
Proceedings of the INTERSPEECH 2008, 2008

A minimum-mean-square-error noise reduction algorithm on Mel-frequency cepstra for robust speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2008

Adaptation of compressed HMM parameters for resource-constrained speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2008

HMM adaptation using a phase-sensitive acoustic distortion model for environment-robust speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2008

2007
Guest Editors' Introduction: Special Section on Emergent Systems, Algorithms and Architectures for Speech-Based Human-Machine Interaction.
IEEE Trans. Computers, 2007

Adaptive Kalman Filtering and Smoothing for Tracking Vocal Tract Resonances Using a Continuous-Valued Hidden Dynamic Model.
IEEE Trans. Speech Audio Process., 2007

A new look at discriminative training for hidden Markov models.
Pattern Recognit. Lett., 2007

Speaker-adaptive learning of resonance targets in a hidden trajectory model of speech coarticulation.
Comput. Speech Lang., 2007

Large-Margin Discriminative Training of Hidden Markov Models for Speech Recognition.
Proceedings of the First IEEE International Conference on Semantic Computing (ICSC 2007), 2007

Handling phonetic context and speaker variation in a structure-based speech recognizer.
Proceedings of the INTERSPEECH 2007, 2007

A structured speech model parameterized by recursive dynamics and neural networks.
Proceedings of the INTERSPEECH 2007, 2007

Phone-discriminating minimum classification error (p-MCE) training for phonetic recognition.
Proceedings of the INTERSPEECH 2007, 2007

Structure-based and template-based automatic speech recognition - comparing parametric and non-parametric approaches.
Proceedings of the INTERSPEECH 2007, 2007

Large-Margin Minimum Classification Error Training for Large-Scale Speech Recognition Tasks.
Proceedings of the IEEE International Conference on Acoustics, 2007

A Discriminative Training Framework using N-Best Speech Recognition Transcriptions and Scores for Spoken Utterance Classification.
Proceedings of the IEEE International Conference on Acoustics, 2007

Efficient and Robust Language Modeling in an Automatic Children's Reading Tutor System.
Proceedings of the IEEE International Conference on Acoustics, 2007

Use of Differential Cepstra as Acoustic Features in Hidden Trajectory Modeling for Phonetic Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2007

High-performance HMM adaptation with joint compensation of additive and convolutive distortions via Vector Taylor Series.
Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

Roles of high-fidelity acoustic modeling in robust speech recognition.
Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

2006
Dynamic Speech Models: Theory, Algorithms, and Applications
Synthesis Lectures on Speech and Audio Processing, Morgan & Claypool Publishers, ISBN: 978-3-031-02555-6, 2006

Structured speech modeling.
IEEE Trans. Speech Audio Process., 2006

A bidirectional target-filtering model of speech coarticulation and reduction: two-stage implementation for phonetic recognition.
IEEE Trans. Speech Audio Process., 2006

Tracking vocal tract resonances using a quantized nonlinear function embedded in a temporal constraint.
IEEE Trans. Speech Audio Process., 2006

A lattice search technique for a long-contextual-span hidden trajectory model of speech.
Speech Commun., 2006

A state-space model with neural-network prediction for recovering vocal tract resonances in fluent speech from Mel-cepstral coefficients.
Speech Commun., 2006

A Novel Learning Method for Hidden Markov Models in Speech and Audio Processing.
Proceedings of the IEEE 8th Workshop on Multimedia Signal Processing, 2006

Use of incrementally regulated discriminative margins in MCE training for speech recognition.
Proceedings of the INTERSPEECH 2006, 2006

A time-synchronous phonetic decoder for a long-contextual-Span hidden trajectory model.
Proceedings of the INTERSPEECH 2006, 2006

2005
A Speech-Centric Perspective for Human-Computer Interface: A Case Study.
J. VLSI Signal Process., 2005

Dynamic compensation of HMM variances using the feature enhancement uncertainty computed from a parametric model of speech distortion.
IEEE Trans. Speech Audio Process., 2005

Spoken language understanding.
IEEE Signal Process. Mag., 2005

Speech technology and systems in human-machine communication [from the Guest Editors].
IEEE Signal Process. Mag., 2005

Analysis and comparison of two speech feature extraction/compensation algorithms.
IEEE Signal Process. Lett., 2005

Evaluation of a long-contextual-span hidden trajectory model and phonetic recognizer using A* lattice search.
Proceedings of the INTERSPEECH 2005, 2005

Learning statistically characterized resonance targets in a hidden trajectory model of speech coarticulation and reduction.
Proceedings of the INTERSPEECH 2005, 2005

Multi-sensory speech processing: incorporating automatically extracted hidden dynamic information.
Proceedings of the 2005 IEEE International Conference on Multimedia and Expo, 2005

A Hidden Trajectory Model with Bi-directional Target-Filtering: Cascaded vs. Integrated Implementation for Phonetic Recognition.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

2004
Speech and Language Processing for Multimodal Human-Computer Interaction.
J. VLSI Signal Process., 2004

Target-directed mixture dynamic models for spontaneous speech recognition.
IEEE Trans. Speech Audio Process., 2004

Estimating cepstrum of speech under the presence of noise using a joint prior of static and dynamic features.
IEEE Trans. Speech Audio Process., 2004

Enhancement of log Mel power spectra of speech using a phase-sensitive model of the acoustic environment and sequential estimation of the corrupting noise.
IEEE Trans. Speech Audio Process., 2004

A mixed-level switching dynamic system for continuous speech recognition.
Comput. Speech Lang., 2004

Challenges in adopting speech recognition.
Commun. ACM, 2004

Nonlinear information fusion in multi-sensor processing - extracting and exploiting hidden dynamics of speech captured by a bone-conductive microphone.
Proceedings of the IEEE 6th Workshop on Multimedia Signal Processing, 2004

Unsupervised learning from users' error correction in speech dictation.
Proceedings of the INTERSPEECH 2004, 2004

Use of neural network mapping and extended Kalman filter to recover vocal tract resonances from the MFCC parameters of speech.
Proceedings of the INTERSPEECH 2004, 2004

A quantitative model for formant dynamics and contextually assimilated reduction in fluent speech.
Proceedings of the INTERSPEECH 2004, 2004

Multi-sensory microphones for robust speech detection, enhancement and recognition.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

A multimodal variational approach to learning and inference in switching state space models [speech processing application].
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

A structured speech model with continuous hidden dynamics and prediction-residual training for tracking vocal tract resonances.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2003
Joint state and parameter estimation for a target-directed nonlinear dynamic system model.
IEEE Trans. Signal Process., 2003

Efficient decoding strategies for conversational speech recognition using a constrained nonlinear state-space model.
IEEE Trans. Speech Audio Process., 2003

Recursive estimation of nonstationary noise using iterative stochastic approximation for robust speech recognition.
IEEE Trans. Speech Audio Process., 2003

A comparison of three non-linear observation models for noisy speech features.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Tracking vocal tract resonances using an analytical nonlinear predictor and a target-guided temporal constraint.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Coarticulation modeling by embedding a target-directed hidden trajectory model into HMM - model and training.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Coarticulation modeling by embedding a target-directed hidden trajectory model into HMM - MAP decoding and evaluation.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Variational inference and learning for segmental switching state space models of hidden speech dynamics.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Incremental Bayes learning with prior evolution for tracking nonstationary noise statistics from noisy speech data.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

An expectation maximization approach for formant tracking using a parameter-free non-linear predictor.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

2002
A robust compensation strategy for extraneous acoustic variations in spontaneous speech recognition.
IEEE Trans. Speech Audio Process., 2002

Distributed speech processing in miPad's multimodal user interface.
IEEE Trans. Speech Audio Process., 2002

Nonstationary-state hidden Markov model representation of speech signals for speech enhancement.
Signal Process., 2002

Evaluation of SPLICE on the Aurora 2 and 3 tasks.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Noise from corrupted speech log mel-spectral energies.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Exploiting variances in robust feature extraction based on a parametric model of speech distortion.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Sequential MAP noise estimation and a phase-sensitive model of the acoustic environment.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

A new approach to speech enhancement by a microphone array using EM and mixture models.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

A mixture linear model with target-directed dynamics for spontaneous speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2002

Uncertainty decoding with SPLICE for noise robust speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2002

A Bayesian approach to speech feature enhancement using the dynamic cepstral prior.
Proceedings of the IEEE International Conference on Acoustics, 2002

A speech-centric perspective for human-computer interface.
Proceedings of the IEEE 5th Workshop on Multimedia Signal Processing, 2002

2001
A Bayesian approach to the verification problem: applications to speaker verification.
IEEE Trans. Speech Audio Process., 2001

A maximum a posteriori approach to speaker adaptation using the trended hidden Markov model.
IEEE Trans. Speech Audio Process., 2001

Parameter estimation of a target-directed dynamic system model with switching states.
Signal Process., 2001

ALGONQUIN - Learning Dynamic Noise Models From Noisy Speech for Robust Speech Recognition.
Proceedings of the Advances in Neural Information Processing Systems 14, 2001

Efficient decoding strategy for conversational speech recognition using state-space models for vocal-tract-resonance dynamics.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

ALGONQUIN: iterating Laplace's method to remove multiple types of acoustic distortion for robust speech recognition.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Evaluation of the SPLICE algorithm on the Aurora2 database.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

A new method for speech denoising and robust speech recognition using probabilistic models for clean speech and for noise.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

An EKF-based algorithm for learning statistical hidden dynamic model parameters for phonetic recognition.
Proceedings of the IEEE International Conference on Acoustics, 2001

A functional articulatory dynamic model for speech production.
Proceedings of the IEEE International Conference on Acoustics, 2001

Towards non-stationary model-based noise adaptation for large vocabulary speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2001


Efficient on-line acoustic environment estimation for FCDCN in a continuous speech recognition system.
Proceedings of the IEEE International Conference on Acoustics, 2001

High-performance robust speech recognition using stereo training data.
Proceedings of the IEEE International Conference on Acoustics, 2001

2000
A path-stack algorithm for optimizing dynamic regimes in a statistical hidden dynamic model of speech.
Comput. Speech Lang., 2000

Speech Denoising and Dereverberation Using Probabilistic Models.
Proceedings of the Advances in Neural Information Processing Systems 13, 2000

A robust speech understanding system using conceptual relational grammar.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Data-driven model construction for continuous speech recognition using overlapping articulatory features.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

A robust training strategy against extraneous acoustic variations for spontaneous speech recognition.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

MiPad: a next generation PDA prototype.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Large-vocabulary speech recognition under adverse acoustic environments.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

HMM adaptation using vector Taylor series for noisy speech recognition.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

1999
A dynamic system approach to speech enhancement using the H∞ filtering algorithm.
IEEE Trans. Speech Audio Process., 1999

A layered neural network interfaced with a cochlear model for the study of speech encoding in the auditory system.
Comput. Speech Lang., 1999

Optimization of dynamic regimes in a statistical hidden dynamic model for conversational speech recognition.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

A statistical coarticulatory model for the hidden vocal-tract-resonance dynamics.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

Speech enhancement using voice source models.
Proceedings of the 1999 IEEE International Conference on Acoustics, 1999

1998
Speech analysis and recognition using interval statistics generated from a composite auditory model.
IEEE Trans. Speech Audio Process., 1998

HMM-based strategies for enhancement of speech signals embedded in nonstationary noise.
IEEE Trans. Speech Audio Process., 1998

Speech trajectory discrimination using the minimum classification error learning.
IEEE Trans. Speech Audio Process., 1998

A dynamic, feature-based approach to the interface between phonology and phonetics for speech modeling and recognition.
Speech Commun., 1998

Use of high-level linguistic constraints for constructing feature-based phonological model in speech recognition.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

1997
Game theory approach to discrete H∞ filter design.
IEEE Trans. Signal Process., 1997

HMM-based speech recognition using state-dependent, discriminatively derived transforms on mel-warped DFT features.
IEEE Trans. Speech Audio Process., 1997

Use of generalized dynamic feature parameters for speech recognition.
IEEE Trans. Speech Audio Process., 1997

Production models as a structural basis for automatic speech recognition.
Speech Commun., 1997

Speech recognition using autosegmental representation of phonological units with interface to the trended HMM.
Speech Commun., 1997

Maximum likelihood in statistical estimation of dynamic systems: Decomposition algorithm and simulation results.
Signal Process., 1997

Speaker adaptation experiments using nonstationary-state hidden Markov models: a MAP approach.
Proceedings of the 1997 IEEE International Conference on Acoustics, 1997

Integrated-multilingual speech recognition using universal phonological features in a functional speech production model.
Proceedings of the 1997 IEEE International Conference on Acoustics, 1997

1996
Transitional speech units and their representation by regressive Markov states: applications to speech recognition.
IEEE Trans. Speech Audio Process., 1996

Decomposition solution of H∞ filter gain in singularly perturbed systems.
Signal Process., 1996

Construction of state-dependent dynamic parameters using the maximum likelihood approach: Applications to speech recognition.
Signal Process., 1996

Transiems as dynamically defined, sub-phonemic units of speech: A computational model.
Signal Process., 1996

Modeling context-dependent phonetic units in a continuous speech recognition system for Mandarin Chinese.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

H-infinity filtering for speech enhancement.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

The trended HMM with discriminative training for phonetic classification.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

Optimal filtering and smoothing for speech recognition using a stochastic target model.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

Interaction of speech disorders with speech coders: effects on speech intelligibility.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

Hierarchical partition of the articulatory state space for overlapping-feature based speech recognition.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

Simulation of disordered speech using a frequency-domain vocal tract model.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

HMM-based speech recognition using state-dependent, linear transforms on Mel-warped DFT features.
Proceedings of the 1996 IEEE International Conference on Acoustics, 1996

1995
Tracking nonstationary targets using a dynamical system with Markov-modulated parameters.
IEEE Signal Process. Lett., 1995

A Markov model containing state-conditioned second-order non-stationarity: application to speech recognition.
Comput. Speech Lang., 1995

Maximum-likelihood estimation for articulatory speech recognition using a stochastic target model.
Proceedings of the Fourth European Conference on Speech Communication and Technology, 1995

Analysis of acoustic-phonetic variations in fluent speech using TIMIT.
Proceedings of the 1995 International Conference on Acoustics, 1995

Improved speech modeling and recognition using multi-dimensional articulatory states as primitive speech units.
Proceedings of the 1995 International Conference on Acoustics, 1995

Use of generalized dynamic feature parameters for speech recognition: maximum likelihood and minimum classification error approaches.
Proceedings of the 1995 International Conference on Acoustics, 1995

1994
Waveform-based speech recognition using hidden filter models: parameter selection and sensitivity to power normalization.
IEEE Trans. Speech Audio Process., 1994

A statistical model for formant-transition microsegments of speech incorporating locus equations.
Signal Process., 1994

Pipelined architecture for neural-network-based speech recognition.
Neural Parallel Sci. Comput., 1994

Analysis of the correlation structure for a neural predictive model with application to speech recognition.
Neural Networks, 1994

Nonstationary-state hidden Markov model with state-dependent time warping: application to speech recognition.
Proceedings of the 3rd International Conference on Spoken Language Processing, 1994

Automatic speech recognition using dynamically defined speech units.
Proceedings of the 3rd International Conference on Spoken Language Processing, 1994

Comparative performance of spectral subtraction and HMM-based speech enhancement strategies with application to hearing aid design.
Proceedings of ICASSP '94: IEEE International Conference on Acoustics, 1994

Vowel classification using a neural predictive HMM: a discriminative training approach.
Proceedings of ICASSP '94: IEEE International Conference on Acoustics, 1994

Phonetic classification and recognition using HMM representation of overlapping articulatory features for all classes of English sounds.
Proceedings of ICASSP '94: IEEE International Conference on Acoustics, 1994

1993
A stochastic model of speech incorporating hierarchical nonstationarity.
IEEE Trans. Speech Audio Process., 1993

Hidden Markov model representation of quantized articulatory features for speech recognition.
Comput. Speech Lang., 1993

Speech recognition using the atomic speech units constructed from overlapping articulatory features.
Proceedings of the Third European Conference on Speech Communication and Technology, 1993

1992
A generalized hidden Markov model with state-conditioned trend functions of time for the speech signal.
Signal Process., 1992

Processing of acoustic signals in a cochlear model incorporating laterally coupled suppressive elements.
Neural Networks, 1992

HMM representation of quantized articulatory features for recognition of highly confusable words.
Proceedings of the 1992 IEEE International Conference on Acoustics, 1992

1991
Microstructural speech units and their HMM representation for discrete utterance speech recognition.
Proceedings of the 1991 International Conference on Acoustics, 1991

