Nam Soo Kim

Orcid: 0000-0002-0568-4902

According to our database, Nam Soo Kim authored at least 210 papers between 1990 and 2024.

Bibliography

2024
Transfer Learning for Low-Resource, Multi-Lingual, and Zero-Shot Multi-Speaker Text-to-Speech.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction.
CoRR, 2024

Efficient Parallel Audio Generation using Group Masked Language Modeling.
CoRR, 2024

2023
Oracle Teacher: Leveraging Target Information for Better Knowledge Distillation of CTC Models.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

Text Implicates Prosodic Ambiguity: A Corpus for Intention Identification of the Korean Spoken Language.
ACM Trans. Asian Low Resour. Lang. Inf. Process., 2023

EEND-DEMUX: End-to-End Neural Speaker Diarization via Demultiplexed Speaker Embeddings.
CoRR, 2023

Towards single integrated spoofing-aware speaker verification embeddings.
CoRR, 2023

When Crowd Meets Persona: Creating a Large-Scale Open-Domain Persona Dialogue Corpus.
CoRR, 2023

EM-Network: Oracle Guided Self-distillation for Sequence Learning.
Proceedings of the International Conference on Machine Learning, 2023

Multi-Resolution Sequence Aggregation and Model-Agnostic Framework for Time-Series Forecasting.
Proceedings of the IEEE International Conference on Acoustics, 2023

Improving Learning Objectives for Speaker Verification from the Perspective of Score Comparison.
Proceedings of the IEEE International Conference on Acoustics, 2023

Transduce and Speak: Neural Transducer for Text-To-Speech with Semantic Token Prediction.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022
Neurally Optimized Decoder for Low Bitrate Speech Codec.
IEEE Signal Process. Lett., 2022

SNAC: Speaker-Normalized Affine Coupling Layer in Flow-Based Architecture for Zero-Shot Multi-Speaker Text-to-Speech.
IEEE Signal Process. Lett., 2022

A Controllable Multi-Lingual Multi-Speaker Multi-Style Text-to-Speech Synthesis With Multivariate Information Minimization.
IEEE Signal Process. Lett., 2022

Adversarial Speaker-Consistency Learning Using Untranscribed Speech Data for Zero-Shot Multi-Speaker Text-to-Speech.
CoRR, 2022

Disentangled Speaker Representation Learning via Mutual Information Minimization.
CoRR, 2022

HuBERT-EE: Early Exiting HuBERT for Efficient Speech Recognition.
CoRR, 2022

Selective Kernel Attention for Robust Speaker Verification.
CoRR, 2022

Inter-KD: Intermediate Knowledge Distillation for CTC-Based Automatic Speech Recognition.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Frequency and Multi-Scale Selective Kernel Attention for Speaker Verification.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Fully Unsupervised Training of Few-Shot Keyword Spotting.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

OpenKorPOS: Democratizing Korean Tokenization with Voting-Based Open Corpus Annotation.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

StyleKQC: A Style-Variant Paraphrase Corpus for Korean Questions and Commands.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

Transfer Learning Framework for Low-Resource Text-to-Speech using a Large-Scale Unlabeled Speech Corpus.
Proceedings of the Interspeech 2022, 2022

2021
TutorNet: Towards Flexible Knowledge Distillation for End-to-End Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Gated Recurrent Context: Softmax-Free Attention for Online Encoder-Decoder Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Oracle Teacher: Towards Better Knowledge Distillation.
CoRR, 2021

Bootstrap Equilibrium and Probabilistic Speaker Representation Learning for Self-Supervised Speaker Verification.
IEEE Access, 2021

Expressive Text-to-Speech Using Style Tag.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Team02 Text-Independent Speaker Verification System for SdSV Challenge 2021.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Diff-TTS: A Denoising Diffusion Model for Text-to-Speech.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

kosp2e: Korean Speech to English Translation Corpus.
Proceedings of the Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August, 2021

Speech Separation Based on DPTNet with Sparse Attention.
Proceedings of the 7th IEEE International Conference on Network Intelligence and Digital Content, 2021

Towards Cross-Lingual Generalization of Translation Gender Bias.
Proceedings of the FAccT '21: 2021 ACM Conference on Fairness, 2021

Giving Space to Your Message: Assistive Word Segmentation for the Electronic Typing of Digital Minorities.
Proceedings of the DIS '21: Designing Interactive Systems Conference 2021, 2021

2020
Memory Attention: Robust Alignment Using Gating Mechanism for End-to-End Speech Synthesis.
IEEE Signal Process. Lett., 2020

Unsupervised Representation Learning for Speaker Recognition via Contrastive Equilibrium Learning.
CoRR, 2020

WaveNODE: A Continuous Normalizing Flow for Speech Synthesis.
CoRR, 2020

Disentangled Speaker and Nuisance Attribute Embedding for Robust Speaker Verification.
IEEE Access, 2020

Pay Attention to Categories: Syntax-Based Sentence Modeling with Metadata Projection Matrix.
Proceedings of the 34th Pacific Asia Conference on Language, Information and Computation, 2020

Information Preservation Pooling for Speaker Embedding.
Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020

SoftFlow: Probabilistic Framework for Normalizing Flow on Manifolds.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Discourse Component to Sentence (DC2S): An Efficient Human-Aided Construction of Paraphrase and Sentence Similarity Dataset.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Robust Text-Dependent Speaker Verification via Character-Level Information Preservation for the SdSV Challenge 2020.
Proceedings of the Interspeech 2020, 2020

Reformer-TTS: Neural Speech Synthesis with Reformer Network.
Proceedings of the Interspeech 2020, 2020

Speech to Text Adaptation: Towards an Efficient Cross-Modal Distillation.
Proceedings of the Interspeech 2020, 2020

Robust Front-End for Multi-Channel ASR using Flow-Based Density Estimation.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Adaptive Knowledge Distillation Based on Entropy.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Machines Getting with the Program: Understanding Intent Arguments of Non-Canonical Directives.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

Text Matters but Speech Influences: A Computational Analysis of Syntactic Ambiguity Resolution.
Proceedings of the 42nd Annual Meeting of the Cognitive Science Society, 2020

Towards an Efficient Code-Mixed Grapheme-to-Phoneme Conversion in an Agglutinative Language: A Case Study on To-Korean Transliteration.
Proceedings of the 4th Workshop on Computational Approaches to Code Switching, 2020

2019
Adversarially Learned Total Variability Embedding for Speaker Recognition with Random Digit Strings.
Sensors, 2019

Disambiguating Speech Intention via Audio-Text Co-attention Framework: A Case of Prosody-semantics Interface.
CoRR, 2019

Investigating an Effective Character-level Embedding in Korean Sentence Classification.
CoRR, 2019

On Measuring Gender Bias in Translation of Gender-neutral Pronouns.
CoRR, 2019

End-to-End Multi-Channel Speech Enhancement Using Inter-Channel Time-Restricted Attention on Raw Waveform.
Proceedings of the Interspeech 2019, 2019

2018
DNN-based monaural speech enhancement with temporal and spectral variations equalization.
Digit. Signal Process., 2018

Speech Intention Understanding in a Head-final Language: A Disambiguation Utilizing Intonation-dependency.
CoRR, 2018

Real-time Automatic Word Segmentation for User-generated Text.
CoRR, 2018

Structured Argument Extraction of Korean Question and Command.
CoRR, 2018

HashCount at SemEval-2018 Task 3: Concatenative Featurization of Tweet and Hashtags for Irony Detection.
Proceedings of The 12th International Workshop on Semantic Evaluation, 2018

Acoustic Modeling Using Adversarially Trained Variational Recurrent Neural Network for Speech Synthesis.
Proceedings of the Interspeech 2018, 2018

Stochastic DNN-HMM Training for Robust ASR.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

2017
Robust Time-Delay Estimation for Acoustic Indoor Localization in Reverberant Environments.
IEEE Signal Process. Lett., 2017

Detecting oxymoron in a single statement.
Proceedings of the 20th Conference of the Oriental Chapter of the International Coordinating Committee on Speech Databases and Speech I/O Systems and Assessment, 2017

Audio Classification Using Class-Specific Learned Descriptors.
Proceedings of the Interspeech 2017, 2017

Integrated DNN-based model adaptation technique for noise-robust speech recognition.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Weakly labeled acoustic event detection using local detector and global classifier.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

Overlapping acoustic event classification based on joint training with source separation.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

2016
DNN-Based Voice Activity Detection with Multi-Task Learning.
IEICE Trans. Inf. Syst., 2016

DNN-Based Feature Enhancement Using Joint Training Framework for Robust Multichannel Speech Recognition.
Proceedings of the Interspeech 2016, 2016

Multi-microphone approach for reliable acoustic data transmission.
Proceedings of the IEEE International Conference on Consumer Electronics, 2016

Two-stage noise aware training using asymmetric deep denoising autoencoder.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

NMF-based source separation utilizing prior knowledge on encoding vector.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

DNN-Based Sound Event Detection with Exemplar-Based Approach for Noise Reduction.
Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events, 2016

Acoustic Scene Classification Using Parallel Combination of LSTM and CNN.
Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events, 2016

Incremental approach to NMF basis estimation for audio source separation.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016

DNN-based voice activity detection with local feature shift technique.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016

2015
NMF-Based Speech Enhancement Using Bases Update.
IEEE Signal Process. Lett., 2015

NMF-based Target Source Separation Using Deep Neural Network.
IEEE Signal Process. Lett., 2015

Tampering Detection Scheme for Speech Signals using Formant Enhancement based Watermarking.
J. Inf. Hiding Multim. Signal Process., 2015

Target Source Separation Based on Discriminative Nonnegative Matrix Factorization Incorporating Cross-Reconstruction Error.
IEICE Trans. Inf. Syst., 2015

Supervised Denoising Pre-Training for Robust ASR with DNN-HMM.
IEICE Trans. Inf. Syst., 2015

An acoustic data transmission system based on audio data hiding: method and performance evaluation.
EURASIP J. Audio Speech Music. Process., 2015

DNN-based residual echo suppression.
Proceedings of the INTERSPEECH 2015, 2015

Discriminative nonnegative matrix factorization using cross-reconstruction error for source separation.
Proceedings of the INTERSPEECH 2015, 2015

Speaker adaptation using relevance vector regression for HMM-based expressive TTS.
Proceedings of the INTERSPEECH 2015, 2015

Reverberation-robust acoustic indoor localization.
Proceedings of the INTERSPEECH 2015, 2015

Acoustic modeling and parameter generation using relevance vector machines for speech synthesis.
Proceedings of the 23rd European Signal Processing Conference, 2015

2014
Stereophonic Acoustic Echo Suppression Incorporating Spectro-Temporal Correlations.
IEEE Signal Process. Lett., 2014

Spectro-Temporal Filtering for Multichannel Speech Enhancement in Short-Time Fourier Transform Domain.
IEEE Signal Process. Lett., 2014

Factored Maximum Penalized Likelihood Kernel Regression for HMM-Based Style-Adaptive Speech Synthesis.
IEEE J. Sel. Top. Signal Process., 2014

Formant enhancement based speech watermarking for tampering detection.
Proceedings of the INTERSPEECH 2014, 2014

A data-driven approach to speech enhancement using Gaussian process.
Proceedings of the INTERSPEECH 2014, 2014

NMF-based speech enhancement incorporating deep neural network.
Proceedings of the INTERSPEECH 2014, 2014

Speaker Adaptation Using Nonlinear Regression Techniques for HMM-Based Speech Synthesis.
Proceedings of the 2014 Tenth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2014

Crossband filtering for stereophonic acoustic echo suppression.
Proceedings of the IEEE International Conference on Acoustics, 2014

Speech enhancement combining statistical models and NMF with update of speech and noise bases.
Proceedings of the IEEE International Conference on Acoustics, 2014

Reverberation and noise robust feature enhancement using multiple inputs.
Proceedings of the IEEE International Conference on Acoustics, 2014

Parametric multichannel noise reduction algorithm utilizing temporal correlations in reverberant environment.
Proceedings of the IEEE International Conference on Acoustics, 2014

2013
Reverberation and Noise Robust Feature Compensation Based on IMM.
IEEE Trans. Speech Audio Process., 2013

Statistical Approaches to Excitation Modeling in HMM-Based Speech Synthesis.
IEICE Trans. Inf. Syst., 2013

Factored maximum likelihood kernelized regression for HMM-based singing voice synthesis.
Proceedings of the INTERSPEECH 2013, 2013

Robust Audio Data Hiding Method Based on Phase of Modulated Complex Lapped Transform.
Proceedings of the Ninth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2013

Blind method of estimating speech transmission index from reverberant speech signals.
Proceedings of the 21st European Signal Processing Conference, 2013

Blind method of estimating speech transmission index in room acoustics based on concept of modulation transfer function.
Proceedings of the 2013 IEEE China Summit and International Conference on Signal and Information Processing, 2013

IMM-based feature compensation robust to slowly time-varying noise and reverberation.
Proceedings of the 2013 IEEE China Summit and International Conference on Signal and Information Processing, 2013

2012
Speech Feature Mapping Based on Switching Linear Dynamic System.
IEEE Trans. Speech Audio Process., 2012

Spectral Magnitude Adjustment for MCLT-Based Acoustic Data Transmission.
IEICE Trans. Inf. Syst., 2012

Outlier Detection and Removal for HMM-Based Speech Synthesis with an Insufficient Speech Database.
IEICE Trans. Inf. Syst., 2012

Factored MLLR Adaptation Algorithm for HMM-based Expressive TTS.
Proceedings of the INTERSPEECH 2012, 2012

Quality Enhancement of Audio Watermarking for Data Transmission in Aerial Space Based on Segmental SNR Adjustment.
Proceedings of the Eighth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2012

Artificial stereo data generation for speech feature mapping.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Feature enhancement error compensation for noise robust speech recognition.
Proceedings of the International Multi-Conference on Systems, Signals & Devices, 2012

2011
Factored MLLR Adaptation.
IEEE Signal Process. Lett., 2011

Speech Enhancement Based on Data-Driven Residual Gain Estimation.
IEICE Trans. Inf. Syst., 2011

Factored MLLR Adaptation for Singing Voice Generation.
Proceedings of the INTERSPEECH 2011, 2011

Decision Tree-Based Clustering with Outlier Detection for HMM-Based Speech Synthesis.
Proceedings of the INTERSPEECH 2011, 2011

A data-driven residual gain approach for two-stage speech enhancement.
Proceedings of the IEEE International Conference on Acoustics, 2011

Switching linear dynamic transducer for stereo data based speech feature mapping.
Proceedings of the IEEE International Conference on Acoustics, 2011

2010
Acoustic Data Transmission Based on Modulated Complex Lapped Transform.
IEEE Signal Process. Lett., 2010

Frequency-Domain Double-Talk Detection Based on the Gaussian Mixture Model.
IEEE Signal Process. Lett., 2010

Robust Data Hiding for MCLT Based Acoustic Data Transmission.
IEEE Signal Process. Lett., 2010

Study of Prominence Detection Based on Various Phone-Specific Features.
IEICE Trans. Inf. Syst., 2010

On Detecting Target Acoustic Signals Based on Non-negative Matrix Factorization.
IEICE Trans. Inf. Syst., 2010

Estimation of Phone Mismatch Penalty Matrices for Two-Stage Keyword Spotting.
IEICE Trans. Inf. Syst., 2010

Implementation of HMM-Based Human Activity Recognition Using Single Triaxial Accelerometer.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2010

Voice activity detection based on statistical models and machine learning approaches.
Comput. Speech Lang., 2010

Excitation modeling based on waveform interpolation for HMM-based speech synthesis.
Proceedings of the INTERSPEECH 2010, 2010

Phone mismatch penalty matrices for two-stage keyword spotting via multi-pass phone recognizer.
Proceedings of the INTERSPEECH 2010, 2010

Multichannel noise reduction using low order RTF estimate.
Proceedings of the INTERSPEECH 2010, 2010

2009
Audio Fingerprinting Based on Multiple Hashing in DCT Domain.
IEEE Signal Process. Lett., 2009

Global Soft Decision Employing Support Vector Machine For Speech Enhancement.
IEEE Signal Process. Lett., 2009

Computationally Efficient Cepstral Domain Feature Compensation.
IEICE Trans. Inf. Syst., 2009

Speech reinforcement based on partial masking effect.
Proceedings of the IEEE International Conference on Acoustics, 2009

DCT based multiple hashing technique for robust audio fingerprinting.
Proceedings of the IEEE International Conference on Acoustics, 2009

2008
Analysis and Improvement of Speech/Music Classification for 3GPP2 SMV Based on GMM.
IEEE Signal Process. Lett., 2008

Voice Activity Detection Based on Conditional MAP Criterion.
IEEE Signal Process. Lett., 2008

Frame Splitting Scheme for Error-Robust Audio Streaming over Packet-Switching Networks.
IEICE Trans. Commun., 2008

Improved Frame Mode Selection for AMR-WB+ Based on Decision Tree.
IEICE Trans. Inf. Syst., 2008

Decision tree based frame mode selection for AMR-WB+.
Proceedings of the INTERSPEECH 2008, 2008

Cepstral domain feature compensation based on diagonal approximation.
Proceedings of the IEEE International Conference on Acoustics, 2008

2007
On Using Multiple Models for Automatic Speech Segmentation.
IEEE Trans. Speech Audio Process., 2007

Perceptual Reinforcement of Speech Signal Based on Partial Specific Loudness.
IEEE Signal Process. Lett., 2007

Feature Compensation Incorporating Modeling Error Statistics.
IEEE Signal Process. Lett., 2007

A Statistical Model-Based Residual Echo Suppression.
IEEE Signal Process. Lett., 2007

Voice activity detection based on a family of parametric distributions.
Pattern Recognit. Lett., 2007

Multiple statistical models for soft decision in noisy speech enhancement.
Pattern Recognit., 2007

Speech Enhancement Based on Perceptually Comfortable Residual Noise.
IEICE Trans. Commun., 2007

Feature Compensation with Model-Based Estimation for Noise Masking.
IEICE Trans. Inf. Syst., 2007

Improved Global Soft Decision Using Smoothed Global Likelihood Ratio for Speech Enhancement.
IEICE Trans. Commun., 2007

Speech reinforcement based on partial specific loudness.
Proceedings of the INTERSPEECH 2007, 2007

A multiple-model based framework for automatic speech segmentation.
Proceedings of the INTERSPEECH 2007, 2007

A statistical model based post-filtering algorithm for residual echo suppression.
Proceedings of the INTERSPEECH 2007, 2007

Feature Compensation using More Accurate Statistics of Modeling Error.
Proceedings of the IEEE International Conference on Acoustics, 2007

2006
Voice activity detection based on multiple statistical models.
IEEE Trans. Signal Process., 2006

A new structural approach in system identification with generalized analysis-by-synthesis for robust speech coding.
IEEE Trans. Speech Audio Process., 2006

Signal modification for ADPCM based on analysis-by-synthesis framework.
IEEE Signal Process. Lett., 2006

Automatic Speech Segmentation Based on Boundary-Type Candidate Selection.
IEEE Signal Process. Lett., 2006

Speech enhancement based on residual noise shaping.
Proceedings of the INTERSPEECH 2006, 2006

Automatic speech segmentation with multiple statistical models.
Proceedings of the INTERSPEECH 2006, 2006

Clean speech feature estimation based on soft spectral masking.
Proceedings of the INTERSPEECH 2006, 2006

Signal modification incorporating perceptual weighting filter.
Proceedings of the INTERSPEECH 2006, 2006

2005
Rapid online adaptation based on transformation space model evolution.
IEEE Trans. Speech Audio Process., 2005

Statistical modeling of speech signals based on generalized gamma distribution.
IEEE Signal Process. Lett., 2005

An approach to robust unsupervised speaker adaptation.
IEEE Signal Process. Lett., 2005

Feature compensation based on switching linear dynamic model.
IEEE Signal Process. Lett., 2005

Image probability distribution based on generalized gamma function.
IEEE Signal Process. Lett., 2005

A new double-talk detector using echo path estimation.
Speech Commun., 2005

Pitch estimation of speech signal based on adaptive lattice notch filter.
Signal Process., 2005

Feature compensation based on switching linear dynamic model and soft decision.
Proceedings of the INTERSPEECH 2005, 2005

A new structural preprocessor for low-bit rate speech coding.
Proceedings of the INTERSPEECH 2005, 2005

Voice Activity Detection based on Generalized Gamma Distribution.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

2004
Signal modification for robust speech coding.
IEEE Trans. Speech Audio Process., 2004

Discriminative training for concatenative speech synthesis.
IEEE Signal Process. Lett., 2004

Feature compensation based on soft decision.
IEEE Signal Process. Lett., 2004

Rapid online adaptation using speaker space model evolution.
Speech Commun., 2004

Maximum a posteriori adaptation of HMM parameters based on speaker space projection.
Speech Commun., 2004

A Statistical Model-Based V/UV Decision under Background Noise Environments.
IEICE Trans. Inf. Syst., 2004

Distorted Speech Rejection for Automatic Speech Recognition in Wireless Communication.
IEICE Trans. Inf. Syst., 2004

Speech probability distribution based on generalized gamma distribution.
Proceedings of the INTERSPEECH 2004, 2004

Inner product based-multiband vector quantization for wideband speech coding at 16 kbps.
Proceedings of the INTERSPEECH 2004, 2004

2003
Discriminative weight training for unit-selection based speech synthesis.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Feature compensation technique for robust speech recognition in noisy environments.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Likelihood ratio test with complex laplacian model for voice activity detection.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Online adaptation using transformation space model evolution.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

2002
A preprocessor for low-bit-rate speech coding.
IEEE Signal Process. Lett., 2002

Feature domain compensation of nonstationary noise for robust speech recognition.
Speech Commun., 2002

Markov models based on speaker space model evolution.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Generalized analysis-by-synthesis based on system identification.
Proceedings of the IEEE International Conference on Acoustics, 2002

2001
Rapid speaker adaptation using probabilistic principal component analysis.
IEEE Signal Process. Lett., 2001

Robust correlation estimation for EMAP-based speaker adaptation.
IEEE Signal Process. Lett., 2001

EMAP-based speaker adaptation with robust correlation estimation.
Proceedings of the IEEE International Conference on Acoustics, 2001

2000
Filtering on hidden Markov models.
IEEE Signal Process. Lett., 2000

Spectral enhancement based on global soft decision.
IEEE Signal Process. Lett., 2000

Bayesian speaker adaptation based on probabilistic principal component analysis.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Speech enhancement: new approaches to soft decision.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

1999
A statistical model-based voice activity detection.
IEEE Signal Process. Lett., 1999

Time-varying noise compensation using multiple Kalman filters.
Proceedings of the 1999 IEEE International Conference on Acoustics, 1999

1998
Deleted strategy for MMI-based HMM training.
IEEE Trans. Speech Audio Process., 1998

IMM-based estimation for slowly evolving environments.
IEEE Signal Process. Lett., 1998

Nonstationary environment compensation based on sequential estimation.
IEEE Signal Process. Lett., 1998

Statistical linear approximation for environment compensation.
IEEE Signal Process. Lett., 1998

Speech recognition in noisy environments using first-order vector Taylor series.
Speech Commun., 1998

1997
Statistically reliable deleted interpolation.
IEEE Trans. Speech Audio Process., 1997

Frame-correlated hidden Markov model based on extended logarithmic pool.
IEEE Trans. Speech Audio Process., 1997

Model-based approach for robust speech recognition in noisy environments with multiple noise sources.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

1995
On estimating robust probability distribution in HMM-based speech recognition.
IEEE Trans. Speech Audio Process., 1995

1990
Generalized training of hidden Markov model parameters for speech recognition.
Proceedings of the First International Conference on Spoken Language Processing, 1990

