Chin-Hui Lee

Orcid: 0000-0002-1892-2551

Affiliations:
  • Georgia Institute of Technology, School of Electrical and Computer Engineering, USA
  • Bell Laboratories, Dialogue Systems Research Department, Murray Hill, New Jersey, NY, USA (1981-2001)


According to our database1, Chin-Hui Lee authored at least 199 papers between 2006 and 2024.

Collaborative distances:
  • Dijkstra number2 of four.
  • Erdős number3 of four.

Timeline

Legend:

Book 
In proceedings 
Article 
PhD thesis 
Dataset
Other 

Links

Online presence:

On csauthors.net:

Bibliography

2024
Bayesian adaptive learning to latent variables via Variational Bayes and Maximum a Posteriori.
CoRR, 2024

2023
Space-and-speaker-aware acoustic modeling with effective data augmentation for recognition of multi-array conversational speech.
Speech Commun., September, 2023

Using iterative adaptation and dynamic mask for child speech extraction under real-world multilingual conditions.
Speech Commun., July, 2023

A Four-Stage Data Augmentation Approach to ResNet-Conformer Based Acoustic Modeling for Sound Event Localization and Detection.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

QDM-SSD: Quality-Aware Dynamic Masking for Separation-Based Speaker Diarization.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

ANSD-MA-MSE: Adaptive Neural Speaker Diarization Using Memory-Aware Multi-Speaker Embedding.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

Neural Speaker Diarization Using Memory-Aware Multi-Speaker Embedding with Sequence-to-Sequence Architecture.
CoRR, 2023

Boosting End-to-End Multilingual Phoneme Recognition through Exploiting Universal Speech Attributes Constraints.
CoRR, 2023

The Multimodal Information Based Speech Processing (MISP) 2023 Challenge: Audio-Visual Target Speaker Extraction.
CoRR, 2023

The USTC-NERCSLIP Systems for the CHiME-7 DASR Challenge.
CoRR, 2023

Variance-Preserving-Based Interpolation Diffusion Models for Speech Enhancement.
CoRR, 2023

A Multi-dimensional Deep Structured State Space Approach to Speech Enhancement Using Small-footprint Models.
CoRR, 2023

Improving Audio-Visual Speech Recognition by Lip-Subword Correlation Based Visual Pre-training and Cross-Modal Fusion Encoder.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

Incorporating Visual Information Reconstruction into Progressive Learning for Optimizing audio-visual Speech Enhancement.
Proceedings of the IEEE International Conference on Acoustics, 2023

A Quantum Kernel Learning Approach to Acoustic Modeling for Spoken Command Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2023

The Multimodal Information Based Speech Processing (Misp) 2022 Challenge: Audio-Visual Diarization And Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2023

Loss Function Design for DNN-Based Sound Event Localization and Detection on Low-Resource Realistic Data.
Proceedings of the IEEE International Conference on Acoustics, 2023

An Experimental Study on Sound Event Localization and Detection Under Realistic Testing Conditions.
Proceedings of the IEEE International Conference on Acoustics, 2023

Incorporating Lip Features into Audio-Visual Multi-Speaker DOA Estimation by Gated Fusion.
Proceedings of the IEEE International Conference on Acoustics, 2023

Summary on the Multimodal Information Based Speech Processing (MISP) 2022 Challenge.
Proceedings of the IEEE International Conference on Acoustics, 2023

Semi-Supervised Multi-Channel Speaker Diarization With Cross-Channel Attention.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

Enhancing Privacy Preservation with Quantum Computing for Word-Level Audio-Visual Speech Recognition.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

Improving Sound Event Localization and Detection with Class-Dependent Sound Separation for Real-World Scenarios.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

Correlated Multi-Level Speech Enhancement for Robust Real-World ASR Applications Using Mask-Waveform-Feature Optimization.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

2022
An Experimental Study on Private Aggregation of Teacher Ensemble Learning for End-to-End Speech Recognition.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

An Ensemble Teacher-Student Learning Approach with Poisson Sub-sampling to Differential Privacy Preserving Speech Recognition.
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

A Study on Joint Modeling and Data Augmentation of Multi-Modalities for Audio-Visual Scene Classification.
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

Deep Learning Based Audio-Visual Multi-Speaker DOA Estimation Using Permutation-Free Loss Function.
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

Audio-Visual Wake Word Spotting in MISP2021 Challenge: Dataset Release and Deep Analysis.
Proceedings of the Interspeech 2022, 2022

Deep Segment Model for Acoustic Scene Classification.
Proceedings of the Interspeech 2022, 2022

End-to-End Audio-Visual Neural Speaker Diarization.
Proceedings of the Interspeech 2022, 2022

Audio-Visual Speech Recognition in MISP2021 Challenge: Dataset Release and Deep Analysis.
Proceedings of the Interspeech 2022, 2022

A Study of Designing Compact Audio-Visual Wake Word Spotting System Based on Iterative Fine-Tuning in Neural Network Pruning.
Proceedings of the IEEE International Conference on Acoustics, 2022

Improving Separation-Based Speaker Diarization Via Iterative Model Refinement And Speaker Embedding Based Post-Processing.
Proceedings of the IEEE International Conference on Acoustics, 2022

A Variational Bayesian Approach to Learning Latent Variables for Acoustic Knowledge Transfer.
Proceedings of the IEEE International Conference on Acoustics, 2022

The USTC-Ximalaya System for the ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription (M2met) Challenge.
Proceedings of the IEEE International Conference on Acoustics, 2022

The First Multimodal Information Based Speech Processing (Misp) Challenge: Data, Tasks, Baselines And Results.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
Information Fusion in Attention Networks Using Adaptive and Multi-Level Factorized Bilinear Pooling for Audio-Visual Emotion Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

A Cross-Entropy-Guided Measure (CEGM) for Assessing Speech Recognition Performance and Optimizing DNN-Based Speech Enhancement.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Correlating subword articulation with lip shapes for embedding aware audio-visual speech enhancement.
Neural Networks, 2021

Separation Guided Speaker Diarization in Realistic Mismatched Conditions.
CoRR, 2021

A Lottery Ticket Hypothesis Framework for Low-Complexity Device-Robust Neural Acoustic Scene Classification.
CoRR, 2021

USTC-NELSLIP System Description for DIHARD-III Challenge.
CoRR, 2021

Acoustic Modeling for Multi-Array Conversational Speech Recognition in the Chime-6 Challenge.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Speech Emotion Recognition Based on Acoustic Segment Model.
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021

A Model Ensemble Approach for Sound Event Localization and Detection.
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021

Audio-Visual Information Fusion Using Cross-Modal Teacher-Student Learning for Voice Activity Detection in Realistic Environments.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

A Maximum Likelihood Approach to SNR-Progressive Learning Using Generalized Gaussian Distribution for LSTM-Based Speech Enhancement.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

PATE-AAE: Incorporating Adversarial Autoencoder into Private Aggregation of Teacher Ensembles for Spoken Command Classification.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Scenario-Dependent Speaker Diarization for DIHARD-III Challenge.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Automatic Lip-Reading with Hierarchical Pyramidal Convolution and Self-Attention for Image Sequences with No Word Boundaries.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Decentralizing Feature Extraction with Quantum Convolutional Neural Network for Automatic Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2021

Speech Enhancement Autoencoder with Hierarchical Latent Structure.
Proceedings of the IEEE International Conference on Acoustics, 2021

A Progressive Learning Approach to Adaptive Noise and Speech Estimation for Speech Enhancement and Noisy Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2021

A Two-Stage Approach to Device-Robust Acoustic Scene Classification.
Proceedings of the IEEE International Conference on Acoustics, 2021

2020
Analyzing Upper Bounds on Mean Absolute Errors for Deep Neural Network-Based Vector-to-Vector Regression.
IEEE Trans. Signal Process., 2020

A Multi-Target SNR-Progressive Learning Approach to Regression Based Speech Enhancement.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

On Mean Absolute Error for Deep Neural Network Based Vector-to-Vector Regression.
IEEE Signal Process. Lett., 2020

Lip-reading with Hierarchical Pyramidal Convolution and Self-Attention.
CoRR, 2020

Device-Robust Acoustic Scene Classification Based on Two-Stage Categorization and Data Augmentation.
CoRR, 2020

Using Speech Enhancement Preprocessing for Speech Emotion Recognition in Realistic Noisy Conditions.
Proceedings of the Interspeech 2020, 2020

A Noise-Aware Memory-Attention Network Architecture for Regression-Based Speech Enhancement.
Proceedings of the Interspeech 2020, 2020

A Space-and-Speaker-Aware Iterative Mask Estimation Approach to Multi-Channel Speech Recognition in the CHiME-6 Challenge.
Proceedings of the Interspeech 2020, 2020

Exploring Deep Hybrid Tensor-to-Vector Network Architectures for Regression Based Speech Enhancement.
Proceedings of the Interspeech 2020, 2020

Relational Teacher Student Learning with Neural Label Embedding for Device Adaptation in Acoustic Scene Classification.
Proceedings of the Interspeech 2020, 2020

An Acoustic Segment Model Based Segment Unit Selection Approach to Acoustic Scene Classification with Partial Utterances.
Proceedings of the Interspeech 2020, 2020

Enhanced Adversarial Strategically-Timed Attacks Against Deep Reinforcement Learning.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Characterizing Speech Adversarial Examples Using Self-Attention U-Net Enhancement.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

A Cross-Task Transfer Learning Approach to Adapting Deep Speech Enhancement Models to Unseen Background Noise Using Paired Senone Classifiers.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

A Study of Child Speech Extraction Using Joint Speech Enhancement and Separation in Realistic Conditions.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2D-to-2D Mask Estimation for Speech Enhancement Based on Fully Convolutional Neural Network.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Geometry Constrained Progressive Learning for Lstm-Based Speech Enhancement.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Progressive Multi-Target Network Based Speech Enhancement with Snr-Preselection for Robust Speaker Diarization.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Tensor-To-Vector Regression for Multi-Channel Speech Enhancement Based on Tensor-Train Network.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

A Maximum Likelihood Approach to Multi-Objective Learning Using Generalized Gaussian Distributions for Dnn-Based Speech Enhancement.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

High-Resolution Attention Network with Acoustic Segment Model for Acoustic Scene Classification.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Performance Analysis for Tensor-Train Decomposition to Deep Neural Network Based Vector-to-Vector Regression.
Proceedings of the 54th Annual Conference on Information Sciences and Systems, 2020

2019
Speech Enhancement Based on Teacher-Student Deep Learning Using Improved Speech Presence Probability for Noise-Robust Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2019

A Theory on Deep Neural Network Based Vector-to-Vector Regression With an Illustration of Its Expressive Power in Speech Enhancement.
IEEE ACM Trans. Audio Speech Lang. Process., 2019

Improving Mispronunciation Detection of Mandarin Tones for Non-Native Learners With Soft-Target Tone Labels and BLSTM-Based Deep Tone Models.
IEEE ACM Trans. Audio Speech Lang. Process., 2019

Using Generalized Gaussian Distributions to Improve Regression Error Modeling for Deep Learning-Based Speech Enhancement.
IEEE ACM Trans. Audio Speech Lang. Process., 2019

An iterative mask estimation approach to deep learning based multi-channel speech recognition.
Speech Commun., 2019

A Speaker-Dependent Approach to Separation of Far-Field Multi-Talker Microphone Array Speech for Front-End Processing in the CHiME-5 Challenge.
IEEE J. Sel. Top. Signal Process., 2019

Acoustic Model Ensembling Using Effective Data Augmentation for CHiME-5 Challenge.
Proceedings of the Interspeech 2019, 2019

A Hybrid Approach to Acoustic Scene Classification Based on Universal Acoustic Models.
Proceedings of the Interspeech 2019, 2019

A Cross-Entropy-Guided (CEG) Measure for Speech Enhancement Front-End Assessing Performances of Back-End Automatic Speech Recognition.
Proceedings of the Interspeech 2019, 2019

KL-Divergence Regularized Deep Neural Network Adaptation for Low-Resource Speaker-Dependent Speech Enhancement.
Proceedings of the Interspeech 2019, 2019

DNN Training Based on Classic Gain Function for Single-channel Speech Enhancement and Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2019

A Two-stage Single-channel Speaker-dependent Speech Separation Approach for Chime-5 Challenge.
Proceedings of the IEEE International Conference on Acoustics, 2019

Improving Audio-visual Speech Recognition Performance with Cross-modal Student-teacher Training.
Proceedings of the IEEE International Conference on Acoustics, 2019

A Speech Enhancement Neural Network Architecture with SNR-Progressive Multi-Target Learning for Robust Speech Recognition.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

A LSTM-Based Joint Progressive Learning Framework for Simultaneous Speech Dereverberation and Denoising.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

2018
Improving Deep Neural Network Based Speech Synthesis through Contextual Feature Parametrization and Multi-Task Learning.
J. Signal Process. Syst., 2018

A Speaker-Dependent Approach to Single-Channel Joint Speech Separation and Acoustic Modeling Based on Deep Neural Networks for Robust Recognition of Multi-Talker Speech.
J. Signal Process. Syst., 2018

Improving Mandarin Tone Recognition Based on DNN by Combining Acoustic and Articulatory Features Using Extended Recognition Networks.
J. Signal Process. Syst., 2018

A Multiobjective Learning and Ensembling Approach to High-Performance Speech Enhancement With Compact Neural Network Architectures.
IEEE ACM Trans. Audio Speech Lang. Process., 2018

Acoustics-guided evaluation (AGE): a new measure for estimating performance of speech enhancement algorithms for robust ASR.
CoRR, 2018

A Progressive Deep Learning Approach to Child Speech Separation.
Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018

A Maximum Likelihood Approach to Masking-based Speech Enhancement Using Deep Neural Network.
Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018

Speaker Diarization with Enhancing Speech for the First DIHARD Challenge.
Proceedings of the Interspeech 2018, 2018

Error Modeling via Asymmetric Laplace Distribution for Deep Neural Network Based Single-Channel Speech Enhancement.
Proceedings of the Interspeech 2018, 2018

A Novel LSTM-Based Speech Preprocessor for Speaker Diarization in Realistic Mismatch Conditions.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Improving Mandarin Tone Mispronunciation Detection for Non-Native Learners with Soft-Target Tone Labels and BLSTM-Based Deep Models.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Densely Connected Progressive Learning for LSTM-Based Speech Enhancement.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Online LSTM-based Iterative Mask Estimation for Multi-Channel Speech Enhancement and ASR.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

2017
A Gender Mixture Detection Approach to Unsupervised Single-Channel Speech Separation Based on Deep Neural Networks.
IEEE ACM Trans. Audio Speech Lang. Process., 2017

Bayesian Unsupervised Batch and Online Speaker Adaptation of Activation Function Parameters in Deep Models for Automatic Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2017

A unified DNN approach to speaker-dependent simultaneous speech enhancement and speech separation in low SNR environments.
Speech Commun., 2017

Hierarchical Bayesian combination of plug-in maximum a posteriori decoders in deep neural networks-based speech recognition and speaker adaptation.
Pattern Recognit. Lett., 2017

An End-to-End Deep Learning Approach to Simultaneous Speech Dereverberation and Acoustic Modeling for Robust Speech Recognition.
IEEE J. Sel. Top. Signal Process., 2017

A reverberation-time-aware DNN approach leveraging spatial information for microphone array dereverberation.
EURASIP J. Adv. Signal Process., 2017

An information fusion framework with multi-channel feature concatenation and multi-perspective system combination for the deep-learning-based robust recognition of microphone array speech.
Comput. Speech Lang., 2017

On generating mixing noise signals with basis functions for simulating noisy speech and learning dnn-based speech enhancement models.
Proceedings of the 27th IEEE International Workshop on Machine Learning for Signal Processing, 2017

A Maximum Likelihood Approach to Deep Neural Network Based Nonlinear Spectral Mapping for Single-Channel Speech Separation.
Proceedings of the Interspeech 2017, 2017

On Design of Robust Deep Models for CHiME-4 Multi-Channel Speech Recognition with Multiple Configurations of Array Microphones.
Proceedings of the Interspeech 2017, 2017

Improving Mispronunciation Detection for Non-Native Learners with Multisource Information and LSTM-Based Deep Models.
Proceedings of the Interspeech 2017, 2017

Joint Training of Multi-Channel-Condition Dereverberation and Acoustic Modeling of Microphone Array Speech for Robust Distant Speech Recognition.
Proceedings of the Interspeech 2017, 2017

A transfer learning and progressive stacking approach to reducing deep model sizes with an application to speech enhancement.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

A unified deep modeling approach to simultaneous speech dereverberation and recognition for the reverb challenge.
Proceedings of the Hands-free Speech Communications and Microphone Arrays, 2017

Joint noise and mask aware training for DNN-based speech enhancement with SUB-band features.
Proceedings of the Hands-free Speech Communications and Microphone Arrays, 2017

Multiple-target deep learning for LSTM-RNN based speech enhancement.
Proceedings of the Hands-free Speech Communications and Microphone Arrays, 2017

LSTM-based iterative mask estimation and post-processing for multi-channel speech enhancement.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

2016
A Regression Approach to Single-Channel Speech Separation Via High-Resolution Deep Neural Networks.
IEEE ACM Trans. Audio Speech Lang. Process., 2016

i-Vector Modeling of Speech Attributes for Automatic Foreign Accent Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2016

A unified approach to transfer learning of deep neural networks with applications to speaker adaptation in automatic speech recognition.
Neurocomputing, 2016

Joint training of DNNs by incorporating an explicit dereverberation structure for distant speech recognition.
EURASIP J. Adv. Signal Process., 2016

Learning auxiliary categorical information for speech synthesis based on deep and recurrent neural networks.
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

A speaker-dependent deep learning approach to joint speech separation and acoustic modeling for multi-talker automatic speech recognition.
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

Detecting Mispronunciations of L2 Learners and Providing Corrective Feedback Using Knowledge-Guided and Data-Driven Decision Trees.
Proceedings of the Interspeech 2016, 2016

SNR-Based Progressive Learning of Deep Neural Network for Speech Enhancement.
Proceedings of the Interspeech 2016, 2016

An experimental study on joint modeling of mixed-bandwidth data via deep neural networks for robust speech recognition.
Proceedings of the 2016 International Joint Conference on Neural Networks, 2016

Improving non-native mispronunciation detection and enriching diagnostic feedback with DNN-based speech attribute modeling.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Unsupervised single-channel speech separation via deep neural network for different gender mixtures.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016

Using tone-based extended recognition network to detect non-native Mandarin tone mispronunciations.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016

Towards a direct Bayesian adaptation framework for deep models.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016

2015
A Regression Approach to Speech Enhancement Based on Deep Neural Networks.
IEEE ACM Trans. Audio Speech Lang. Process., 2015

Maximum a Posteriori Adaptation of Network Parameters in Deep Models.
CoRR, 2015

Multi-objective learning and mask-based post-processing for deep neural network based speech enhancement.
Proceedings of the INTERSPEECH 2015, 2015

High-resolution acoustic modeling and compact language modeling of language-universal speech attributes for spoken language identification.
Proceedings of the INTERSPEECH 2015, 2015

A universal VAD based on jointly trained deep neural networks.
Proceedings of the INTERSPEECH 2015, 2015

DNN-based speech bandwidth expansion and its application to adding high-frequency missing features for automatic speech recognition of narrowband speech.
Proceedings of the INTERSPEECH 2015, 2015

Maximum a posteriori adaptation of network parameters in deep models.
Proceedings of the INTERSPEECH 2015, 2015

Rapid adaptation for deep neural networks through multi-task learning.
Proceedings of the INTERSPEECH 2015, 2015

Speech Separation based on signal-noise-dependent deep neural networks for robust speech recognition.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Joint training of front-end and back-end deep neural networks for robust speech recognition.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Improving Deep Neural Network Based Speech Enhancement in Low SNR Environments.
Proceedings of the Latent Variable Analysis and Signal Separation, 2015

A unified speaker-dependent speech separation and enhancement system based on deep neural networks.
Proceedings of the IEEE China Summit and International Conference on Signal and Information Processing, 2015

An information fusion approach to recognizing microphone array speech in the CHiME-3 challenge based on a deep learning framework.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

2014
An Experimental Study on Speech Enhancement Based on Deep Neural Networks.
IEEE Signal Process. Lett., 2014

An artificial neural network approach to automatic speech processing.
Neurocomputing, 2014

Cross-language transfer learning for deep neural network based speech enhancement.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

A fusion approach to spoken language identification based on combining multiple phone recognizers and speech attribute detectors.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Speech separation based on improved deep neural networks with dual outputs of speech features for both target and interfering speakers.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Dynamic noise aware training for speech enhancement based on deep neural networks.
Proceedings of the INTERSPEECH 2014, 2014

Beyond cross-entropy: towards better frame-level objective functions for deep neural network training in automatic speech recognition.
Proceedings of the INTERSPEECH 2014, 2014

Feature space maximum a posteriori linear regression for adaptation of deep neural networks.
Proceedings of the INTERSPEECH 2014, 2014

Robust speech recognition with speech enhanced deep neural networks.
Proceedings of the INTERSPEECH 2014, 2014

Dialect levelling in Finnish: a universal speech attribute approach.
Proceedings of the INTERSPEECH 2014, 2014

A maximal figure-of-merit learning approach to maximizing mean average precision with deep neural network based classifiers.
Proceedings of the IEEE International Conference on Acoustics, 2014

Deep learning vector quantization for acoustic information retrieval.
Proceedings of the IEEE International Conference on Acoustics, 2014

An i-vector based descriptor for alphabetical gesture recognition.
Proceedings of the IEEE International Conference on Acoustics, 2014

Attribute based lattice rescoring in spontaneous speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2014

Introducing attribute features to foreign accent recognition.
Proceedings of the IEEE International Conference on Acoustics, 2014

Global variance equalization for improving deep neural network based speech enhancement.
Proceedings of the IEEE China Summit & International Conference on Signal and Information Processing, 2014

2013
A Bottom-Up Modular Search Approach to Large Vocabulary Continuous Speech Recognition.
IEEE Trans. Speech Audio Process., 2013

Hermitian Polynomial for Speaker Adaptation of Connectionist Speech Recognition Systems.
IEEE Trans. Speech Audio Process., 2013

Speech Recognition Using Long-Span Temporal Patterns in a Deep Network Model.
IEEE Signal Process. Lett., 2013

An Information-Extraction Approach to Speech Processing: Analysis, Detection, Verification, and Recognition.
Proc. IEEE, 2013

Exploiting deep neural networks for detection-based speech recognition.
Neurocomputing, 2013

Model-based margin estimation for hidden Markov model learning and generalisation.
IET Signal Process., 2013

Universal attribute characterization of spoken languages for automatic spoken language recognition.
Comput. Speech Lang., 2013


A blind segmentation approach to acoustic event detection based on i-vector.
Proceedings of the INTERSPEECH 2013, 2013

Knowledge integration for improving performance in LVCSR.
Proceedings of the INTERSPEECH 2013, 2013

An experimental study on structural-MAP approaches to implementing very large vocabulary speech recognition systems for real-world tasks.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013

2012
Experiments on Cross-Language Attribute Detection and Phone Recognition With Minimal Target-Specific Training Data.
IEEE Trans. Speech Audio Process., 2012


A new confidence measure combining Hidden Markov Models and Artificial Neural Networks of phonemes for effective keyword spotting.
Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012

A study on cross-language knowledge integration in Mandarin LVCSR.
Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012

Hermitian based Hidden Activation Functions for Adaptation of Hybrid HMM/ANN Models.
Proceedings of the INTERSPEECH 2012, 2012

Consumer-level multimedia event detection through unsupervised audio signal modeling.
Proceedings of the INTERSPEECH 2012, 2012

Boosting attribute and phone estimation accuracies with deep neural networks for detection-based speech recognition.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

2011
A Bottom-Up Stepwise Knowledge-Integration Approach to Large Vocabulary Continuous Speech Recognition Using Weighted Finite State Machines.
Proceedings of the INTERSPEECH 2011, 2011

2010
A survey on recent progress in the ASAT/SIRKUS paradigm.
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

Exploiting context-dependency and acoustic resolution of universal speech attribute models in spoken language recognition.
Proceedings of the INTERSPEECH 2010, 2010

Experimental studies on continuous speech recognition using neural architectures with "adaptive" hidden activation functions.
Proceedings of the IEEE International Conference on Acoustics, 2010

2009
A study on integrating acoustic-phonetic information into lattice rescoring for automatic speech recognition.
Speech Commun., 2009

Minimum Classification Error Training to Improve Isolated Chord Recognition.
Proceedings of the 10th International Society for Music Information Retrieval Conference, 2009

Exploring universal attribute characterization of spoken languages for spoken language recognition.
Proceedings of the INTERSPEECH 2009, 2009

A phonetic feature based lattice rescoring approach to LVCSR.
Proceedings of the IEEE International Conference on Acoustics, 2009

2008
A penalized logistic regression approach to detection based phone classification.
Proceedings of the INTERSPEECH 2008, 2008

Continuous phone recognition without target language training data.
Proceedings of the INTERSPEECH 2008, 2008

Toward a detector-based universal phone recognizer.
Proceedings of the IEEE International Conference on Acoustics, 2008

2007
Information fusion techniques for automatic image annotation.
Proceedings of the VISAPP 2007: Proceedings of the Second International Conference on Computer Vision Theory and Applications, Barcelona, Spain, March 8-11, 2007, 2007

Boosting of Maximal Figure of Merit Classifiers for Automatic Image Annotation.
Proceedings of the International Conference on Image Processing, 2007

High-Accuracy Phone Recognition By Combining High-Performance Lattice Generation and Knowledge Based Rescoring.
Proceedings of the IEEE International Conference on Acoustics, 2007

Approximate Test Risk Minimization Through Soft Margin Estimation.
Proceedings of the IEEE International Conference on Acoustics, 2007

Towards bottom-up continuous phone recognition.
Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

2006
A study on lattice rescoring with knowledge scores for automatic speech recognition.
Proceedings of the INTERSPEECH 2006, 2006


  Loading...