Frank K. Soong

Lei Xie

IEEE ACM Trans. Audio Speech Lang. Process., 2022

Ordinal Regression via Binary Preference vs Simple Regression: Statistical and Experimental Perspectives.

CoRR, 2022

A Multi-Stage Multi-Codebook VQ-VAE Approach to High-Performance Neural TTS.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Neural Lexicon Reader: Reduce Pronunciation Errors in End-to-end TTS by Leveraging External Textual Knowledge.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

An Approach to Mispronunciation Detection and Diagnosis with Acoustic, Phonetic and Linguistic (APL) Embeddings.

Proceedings of the IEEE International Conference on Acoustics, 2022

Improving Fastspeech TTS with Efficient Self-Attention and Compact Feed-Forward Network.

Proceedings of the IEEE International Conference on Acoustics, 2022

A Universal Ordinal Regression for Assessing Phoneme-Level Pronunciation.

Proceedings of the IEEE International Conference on Acoustics, 2022

2021

Cycle consistent network for end-to-end style transfer TTS training.

Neural Networks, 2021

Effective and direct control of neural TTS prosody by removing interactions between different attributes.

Neural Networks, 2021

A Survey on Neural Speech Synthesis.

CoRR, 2021

Multilingual Byte2Speech Models for Scalable Low-resource Speech Synthesis.

CoRR, 2021

Conversational End-to-End TTS for Voice Agents.

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Improving Performance of Seen and Unseen Speech Style Transfer in End-to-End Neural TTS.

Xiaochun An

Lei Xie

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

A New High Quality Trajectory Tiling Based Hybrid TTS In Real Time.

Proceedings of the IEEE International Conference on Acoustics, 2021

Improving Pronunciation Assessment Via Ordinal Regression with Anchored Reference Samples.

Proceedings of the IEEE International Conference on Acoustics, 2021

MBNET: MOS Prediction for Synthesized Speech with Mean-Bias Network.

Proceedings of the IEEE International Conference on Acoustics, 2021

Speech Bert Embedding for Improving Prosody in Neural TTS.

Proceedings of the IEEE International Conference on Acoustics, 2021

2020

Spoken Language Understanding of Human-Machine Conversations for Language Learning Applications.

J. Signal Process. Syst., 2020

s-Transformer: Segment-Transformer for Robust Neural Speech Synthesis.

CoRR, 2020

Conversational End-to-End TTS for Voice Agent.

CoRR, 2020

On Early-stop Clustering for Speaker Diarization.

Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020

Transfer Learning for Improving Singing-Voice Detection in Polyphonic Instrumental Music.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

An Efficient Subband Linear Prediction for LPCNet-Based Neural Synthesis.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

An Improved Frame-Unit-Selection Based Voice Conversion System Without Parallel Training Data.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Improving Prosody with Linguistic and Bert Derived Features in Multi-Speaker Based Mandarin Chinese Neural TTS.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Improving LPCNET-Based Text-to-Speech with Linear Prediction-Structured Mixture Density Network.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

LP-WaveNet: Linear Prediction-based WaveNet Speech Synthesis.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

2019

Voice conversion with SI-DNN and KL divergence based mapping without parallel training data.

Haifeng Li

Speech Commun., 2019

Feature reinforcement with word embedding and parsing information in neural TTS.

CoRR, 2019

Forward-Backward Decoding for Regularizing End-to-End TTS.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Exploiting Syntactic Features in a Parsed Tree to Improve End-to-End TTS.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

A New GAN-Based End-to-End TTS Training Algorithm.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

A Pitch-aware Approach to Single-channel Speech Separation.

Ke Wang

Lei Xie

Proceedings of the IEEE International Conference on Acoustics, 2019

NN-based Ordinal Regression for Assessing Fluency of ESL Speech.

Proceedings of the IEEE International Conference on Acoustics, 2019

Domain Adversarial Training for Improving Keyword Spotting Performance of ESL Speech.

Proceedings of the IEEE International Conference on Acoustics, 2019

2018

Modeling Multi-speaker Latent Space to Improve Neural TTS: Quick Enrolling New Speaker and Enhancing Premium Voice.

Yan Deng

Lei He

CoRR, 2018

LP-WaveNet: Linear Prediction-based WaveNet Speech Synthesis.

CoRR, 2018

Frame Selection in SI-DNN Phonetic Space with WaveNet Vocoder for Voice Conversion without Parallel Training Data.

Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018

From Speech Signals to Semantics - Tagging Performance at Acoustic, Phonetic and Word Levels.

Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018

A Refined Query-by-Example Approach to Spoken-Term-Detection on ESL learners' Speech.

Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018

Paired Phone-Posteriors Approach to ESL Pronunciation Quality Assessment.

Yujia Xiao

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

A New Glottal Neural Vocoder for Speech Synthesis.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Exploring Sequential Characteristics in Speaker Bottleneck Feature for Text-Dependent Speaker Verification.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017

Effective Spectral and Excitation Modeling Techniques for LSTM-RNN-Based Speech Synthesis Systems.

Eunwoo Song

IEEE ACM Trans. Audio Speech Lang. Process., 2017

DNN i-Vector Speaker Verification with Short, Text-Constrained Test Utterances.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Proficiency Assessment of ESL Learner's Sentence Prosody with TTS Synthesized Voice as Reference.

Yujia Xiao

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Improving Sub-Phone Modeling for Better Native Language Identification with Non-Native English Speech.

Keelan Evanini

Xinhao Wang

David Suendermann-Oeft

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Perceptual quality and modeling accuracy of excitation parameters in DLSTM-based speech synthesis systems.

Eunwoo Song

Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

Improving native language (L1) identification with better VAD and TDNN trained separately on native and non-native English corpora.

Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

2016

A Two-Pass Framework of Mispronunciation Detection and Diagnosis for Computer-Aided Pronunciation Training.

IEEE ACM Trans. Audio Speech Lang. Process., 2016

Modeling F0 trajectories in hierarchically structured deep neural networks.

Speech Commun., 2016

Improving speaker verification performance against long-term speaker variability.

Speech Commun., 2016

A deep bidirectional LSTM approach for video-realistic talking head.

Multim. Tools Appl., 2016

Learning Distributed Word Representations For Bidirectional LSTM Recurrent Neural Network.

Proceedings of the NAACL HLT 2016, 2016

A KL Divergence and DNN-Based Approach to Voice Conversion without Parallel Training Sentences.

Haifeng Li

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Improved Time-Frequency Trajectory Excitation Vocoder for DNN-Based Speech Synthesis.

Eunwoo Song

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

A KL divergence and DNN approach to cross-lingual TTS.

Haifeng Li

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Speaker and language factorization in DNN-based TTS synthesis.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Unsupervised speaker adaptation for DNN-based TTS synthesis.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

KL-divergence based mispronunciation detection via DNN and decision tree in the phonetic space.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016

2015

Improved mispronunciation detection with deep neural network trained acoustic models and transfer learning based logistic regression classifiers.

Speech Commun., 2015

HMM trajectory-guided sample selection for photo-realistic talking head.

Multim. Tools Appl., 2015

A Unified Tagging Solution: Bidirectional LSTM Recurrent Neural Network with Word Embedding.

CoRR, 2015

Part-of-Speech Tagging with Bidirectional Long Short-Term Memory Recurrent Neural Network.

CoRR, 2015

An improved DNN-based approach to mispronunciation detection and diagnosis of L2 learners' speech.

Proceedings of the ISCA International Workshop on Speech and Language Technology in Education, 2015

Sequence generation error (SGE) minimization based deep neural networks training for text-to-speech synthesis.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

A spectral space warping approach to cross-lingual voice transformation in HMM-based TTS.

Hao Wang

Helen Meng

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Word embedding for recurrent neural network based TTS synthesis.

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Photo-real talking head with deep bidirectional LSTM.

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Multi-speaker modeling and speaker adaptation for DNN-based TTS synthesis.

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

From text-to-speech (TTS) to talking head - a machine learning approach to A/V speech modeling and rendering.

Proceedings of the Auditory-Visual Speech Processing, 2015

A two-pass framework of mispronunciation detection & diagnosis for computer-aided pronunciation training.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

2014

Pitch transformation in neural network based voice conversion.

Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

A new Neural Network based logistic regression classifier for improving mispronunciation detection of L2 language learners.

Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Modeling DCT parameterized F0 trajectory at intonation phrase level with DNN or decision tree.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Sequence error (SE) minimization training of neural network for voice conversion.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

TTS synthesis with bidirectional LSTM based recurrent neural networks.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

A maximum a posteriori-based reconstruction approach to speech bandwidth expansion in noise.

Hyunson Seo

Proceedings of the IEEE International Conference on Acoustics, 2014

On the training aspects of Deep Neural Network (DNN) for parametric TTS synthesis.

Proceedings of the IEEE International Conference on Acoustics, 2014

A DNN-based acoustic modeling of tonal language and its application to Mandarin pronunciation training.

Proceedings of the IEEE International Conference on Acoustics, 2014

Discriminative scoring for speaker recognition based on I-vectors.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

2013

A Unified Trajectory Tiling Approach to High Quality Speech Rendering.

IEEE Trans. Speech Audio Process., 2013

A new language independent, photo-realistic talking head driven by voice only.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Binocular photometric stereo acquisition and reconstruction for 3D talking head applications.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

A source-filter based adaptive harmonic model and its application to speech prosody modification.

JeeSok Lee

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

A new DNN-based high quality pronunciation evaluation for computer-aided language learning (CALL).

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

A fast table lookup based, statistical model driven non-uniform unit selection TTS.

Proceedings of the IEEE International Conference on Acoustics, 2013

2012

Computer-Assisted Audiovisual Language Learning.

Computer, 2012

Tip tap tones: mobile microtraining of Mandarin sounds.

Proceedings of the Mobile HCI '12, 2012

Break index labeling of Mandarin text via syntactic-to-prosodic tree mapping.

Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012

Cross validation and Minimum Generation Error for improved model clustering in HMM-based TTS.

Yi-Jian Wu

Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012

A unified trajectory tiling approach to high quality TTS and cross-lingual voice transformation.

Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012

Pitch accent detection and prediction with DCT features and CRF model.

Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012

Constrained Multichannel Speech Dereverberation.

Meng Yu

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Objective Intelligibility Assessment of Text-to-Speech System using Template Constrained Generalized Posterior Probability.

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

The Use of DBN-HMMs for Mispronunciation Detection and Diagnosis in L2 English to Support Computer-Aided Pronunciation Training.

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Turning a Monolingual Speaker into Multilingual for a Mixed-language TTS.

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Noise estimation using a constrained sequential HMM in log-spectral domain.

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Modeling pitch trajectory by hierarchical HMM with minimum generation error training.

Yi-Jian Wu

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

High quality lip-sync animation for 3D photo-realistic talking head.

Wei Han

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Improved minimum converted trajectory error training for real-time speech-to-lips conversion.

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

High quality lips animation with speech and captured facial action unit as A/V input.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

2011

Voice Activity Detection Based on an Unsupervised Learning Framework.

IEEE ACM Trans. Audio Speech Lang. Process., 2011

Improved Prosody Generation by Maximizing Joint Probability of State and Longer Units.

IEEE Trans. Speech Audio Process., 2011

Text Driven 3D Photo-Realistic Talking Head.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

On Mispronunciation Lexicon Generation Using Joint-Sequence Multigrams in Computer-Aided Pronunciation Training (CAPT).

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

A New Phonetic Candidate Generator for Improving Search Query Efficiency.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Improvements in Speaker Characterization Using Spectral Subband Energy Based on Harmonic plus Noise Model.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

A Sparse and Low-rank approach to efficient face alignment for photo-real talking head synthesis.

Proceedings of the IEEE International Conference on Acoustics, 2011

Synthesizing visual speech trajectory with minimum generation error.

Proceedings of the IEEE International Conference on Acoustics, 2011

A frame mapping based HMM approach to cross-lingual voice transformation.

Ji Xu

Proceedings of the IEEE International Conference on Acoustics, 2011

Speaker characterization using spectral subband energy ratio based on Harmonic plus Noise Model.

Proceedings of the IEEE International Conference on Acoustics, 2011

Improved F0 modeling and generation in voice conversion.

Proceedings of the IEEE International Conference on Acoustics, 2011

2010

Photo-real lips synthesis with trajectory-guided sample selection.

Proceedings of the Seventh ISCA Tutorial and Research Workshop on Speech Synthesis, 2010

Rendering a personalized photo-real talking head from short video footage.

Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

Automatic prosody prediction and detection with Conditional Random Field (CRF) models.

Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

Capturing L2 segmental mispronunciations with joint-sequence models in Computer-Aided Pronunciation Training (CAPT).

Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

A minimum converted trajectory error (MCTE) approach to high quality speech-to-lips conversion.

Xiaodan Zhuang

Mark Hasegawa-Johnson

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Formant-based frequency warping for improving speaker adaptation in HMM TTS.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Synthesizing photo-real talking head via trajectory-guided sample selection.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

An HMM trajectory tiling (HTT) approach to high quality TTS.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Discriminative acoustic model for improving mispronunciation detection and diagnosis in computer-aided pronunciation training (CAPT).

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

A hierarchical F0 modeling method for HMM-based speech synthesis.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

A perceptual study of acceleration parameters in HMM-based TTS.

Yining Chen

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Cross-validation based decision tree clustering for HMM-based TTS.

Yu Zhang

Proceedings of the IEEE International Conference on Acoustics, 2010

Improved modeling for F0 generation and V/U decision in HMM-based TTS.

Proceedings of the IEEE International Conference on Acoustics, 2010

Rich-context Unit Selection (RUS) approach to high quality TTS.

Proceedings of the IEEE International Conference on Acoustics, 2010

An HMM Trajectory Tiling (HTT) Approach to High Quality TTS - Microsoft Entry to Blizzard Challenge 2010.

Proceedings of the Blizzard Challenge 2010, Kansai Science City, Japan, September 25, 2010

2009

A Cross-Language State Sharing and Mapping Approach to Bilingual (Mandarin-English) TTS.

Hui Liang

IEEE Trans. Speech Audio Process., 2009

Graph-Based Partial Hypothesis Fusion for Pen-Aided Speech Input.

IEEE Trans. Speech Audio Process., 2009

A Quadratic Optimization Approach to Discriminative Training of CDHMMs.

IEEE Signal Process. Lett., 2009

A Multi-Space Distribution (MSD) and two-stream tone modeling approach to Mandarin speech recognition.

Speech Commun., 2009

Rich context modeling for high quality HMM-based TTS.

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Auto-checking speech transcriptions by multiple template constrained posterior.

Shenghao Qin

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

A minimum v/u error approach to F0 generation in HMM-based TTS.

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Model-based speech separation: identifying transcription using orthogonality.

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

An evidence framework for Bayesian learning of continuous-density hidden Markov models.

Proceedings of the IEEE International Conference on Acoustics, 2009

Improved prosody generation by maximizing joint likelihood of state and longer units.

Zhizheng Wu

Proceedings of the IEEE International Conference on Acoustics, 2009

State mapping for cross-language speaker adaptation in TTS.

Proceedings of the IEEE International Conference on Acoustics, 2009

Improving mispronunciation detection using machine learning.

Yuqiang Chen

Chao Huang

Proceedings of the IEEE International Conference on Acoustics, 2009

HMM-based motion trajectory generation for speech animation synthesis.

Proceedings of the Auditory-Visual Speech Processing, 2009

2008

Identifying Language Origin of Named Entity With Multiple Information Sources.

IEEE Trans. Speech Audio Process., 2008

A Constrained Line Search Optimization Method for Discriminative Training of HMMs.

IEEE Trans. Speech Audio Process., 2008

Modeling and Generating Tone Contour with Phrase Intonation for Mandarin Chinese Speech.

Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008

HMM-Based Mixed-Language (Mandarin-English) Speech Synthesis.

Houwei Cao

Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008

Pitch Tracking for Model-Based Speech Separation.

Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008

Improving Automatic Evaluation of Mandarin Pronunciation with Speaker Adaptive Training (SAT) and MLLR Speaker Adaption.

Chao Huang

Feng Zhang

Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008

Prosody for Mandarin speech recognition: a comparative study of read and spontaneous speech.

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

A real-time text to audio-visual speech synthesis system.

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Efficient handwriting correction of speech recognition errors with template constrained posterior (TCP).

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

GPU-accelerated Gaussian clustering for fMPE discriminative training.

Frank Seide

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Generating natural F0 trajectory with additive trees.

Hui Liang

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

An ellipsoid constrained quadratic programming perspective to discriminative training of HMMs.

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Mispronunciation detection for Mandarin Chinese.

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Duration refinement by jointly optimizing state and longer unit likelihood.

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

A symbol graph based handwritten math expression recognition.

Proceedings of the 19th International Conference on Pattern Recognition (ICPR 2008), 2008

Radical based fine trajectory HMMs of online handwritten characters.

Lei Ma

Proceedings of the 19th International Conference on Pattern Recognition (ICPR 2008), 2008

Automatic mispronunciation detection for Mandarin.

Proceedings of the IEEE International Conference on Acoustics, 2008

Improving letter-to-sound conversion performance with automatically generated new words.

Proceedings of the IEEE International Conference on Acoustics, 2008

Template constrained posterior for verifying phone transcriptions.

Tao Hu

Proceedings of the IEEE International Conference on Acoustics, 2008

Symbol graph based discriminative training and rescoring for improved math symbol recognition.

Zhen Xuan Luo

Proceedings of the IEEE International Conference on Acoustics, 2008

Prefix tree based auto-completion for convenient bi-modal Chinese character input.

Lei Ma

Proceedings of the IEEE International Conference on Acoustics, 2008

A cross-language state mapping approach to bilingual (Mandarin-English) TTS.

Proceedings of the IEEE International Conference on Acoustics, 2008

Discriminative training for improving letter-to-sound conversion performance.

Proceedings of the IEEE International Conference on Acoustics, 2008

2007

A Cohort-Based Speaker Model Synthesis for Mismatched Channels in Speaker Verification.

IEEE Trans. Speech Audio Process., 2007

A Syllable Lattice Approach to Speaker Verification.

Minho Jin

Chang Dong Yoo

IEEE Trans. Speech Audio Process., 2007

Performance of Discriminative HMM Training in Noise.

Int. J. Comput. Linguistics Chin. Lang. Process., 2007

Measuring attribute dissimilarity with HMM KL-divergence for speech synthesis.

Proceedings of the Sixth ISCA Workshop on Speech Synthesis, 2007

Perceptual annotation of expressive speech.

Proceedings of the Sixth ISCA Workshop on Speech Synthesis, 2007

An HMM-based bilingual (Mandarin-English) TTS.

Hui Liang

Proceedings of the Sixth ISCA Workshop on Speech Synthesis, 2007

Context constrained-generalized posterior probability for verifying phone transcriptions.

Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Robust F0 modeling for Mandarin speech recognition in noise.

Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

An unsupervised approach to automatic prosodic annotation.

Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Iterative unit selection with unnatural prosody detection.

Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Model-based speech separation with single-microphone input.

Pak-Chung Ching

Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Minimum Error Discriminative Training for Radical-Based Online Chinese Handwriting Recognition.

Yu Zhang

Proceedings of the 9th International Conference on Document Analysis and Recognition (ICDAR 2007), 2007

A Unified Framework for Symbol Segmentation and Recognition of Handwritten Mathematical Expressions.

HaiYang Li

Proceedings of the 9th International Conference on Document Analysis and Recognition (ICDAR 2007), 2007

A MSD-HMM Approach to Pen Trajectory Modeling for Online Handwriting Recognition.

Proceedings of the 9th International Conference on Document Analysis and Recognition (ICDAR 2007), 2007

Generalized Segment Posterior Probability for Automatic Mandarin Pronunciation Evaluation.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2007

Word Graph Based Feature Enhancement for Noisy Speech Recognition.

[BibT_eX]

[DOI]

Ren-Hua Wang

Proceedings of the IEEE International Conference on Acoustics, 2007

A Segmentation Posterior Based Endpointing Algorithm.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2007

Full HMM Training for Minimizing Generation Error in Synthesis.

[BibT_eX]

[DOI]

Yi-Jian Wu

Ren-Hua Wang

Proceedings of the IEEE International Conference on Acoustics, 2007

Agreement Learning for Automatic Accent Annotation.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2007

A Constrained Line Search Optimization for Discriminative Training in Speech Recognition.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2007

A New Minimum Divergence Approach to Discriminative Training.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2007

Divergence-Based Similarity Measure for Spoken Document Retrieval.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2007

Enrich Web Applications with Voice Internet Persona Text-to-Speech for Anyone, Anywhere.

[BibT_eX]

[DOI]

Proceedings of the Human-Computer Interaction. HCI Intelligent Multimodal Interaction Environments, 2007

A constrained line search approach to general discriminative HMM training.

[BibT_eX]

[DOI]

Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

2006

A tree-based kernel selection approach to efficient Gaussian mixture model-universal background model based speaker identification.

[BibT_eX]

[DOI]

Speech Commun., 2006

Modeling Cantonese Pronunciation Variations for Large-Vocabulary Continuous Speech Recognition.

[BibT_eX]

[DOI]

Patgi Kam

Int. J. Comput. Linguistics Chin. Lang. Process., 2006

Context-Dependent Boundary Model for Refining Boundaries Segmentation of TTS Units.

[BibT_eX]

[DOI]

IEICE Trans. Inf. Syst., 2006

Automatic Detection of Tone Mispronunciation in Mandarin.

[BibT_eX]

[DOI]

Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006

A Robust Voice Activity Detection Based on Noise Eigenspace Projection.

[BibT_eX]

[DOI]

Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006

Signal Trajectory Based Noise Compensation for Robust Speech Recognition.

[BibT_eX]

[DOI]

Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006

Training Discriminative HMM by Optimal Allocation of Gaussian Kernels.

[BibT_eX]

[DOI]

Proceedings of the 5th International Symposium on Chinese Spoken Language Processing, 2006

Improved Mandarin Speech Recognition by Lattice Rescoring with Enhanced Tone Models.

[BibT_eX]

[DOI]

Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006

An HMM-Based Mandarin Chinese Text-To-Speech System.

[BibT_eX]

[DOI]

Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006

Non-uniform Kernel Allocation Based Parsimonious HMM.

[BibT_eX]

[DOI]

Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006

Noisy Speech Recognition Performance of Discriminative HMMs.

[BibT_eX]

[DOI]

Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006

The Paradigm for Creating Multi-lingual Text-To-Speech Voice Databases.

[BibT_eX]

[DOI]

Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006

Integrating Hypotheses of Multiple Recognizers for Improving Mandarin LVCSR Performance.

[BibT_eX]

[DOI]

Proceedings of the 5th International Symposium on Chinese Spoken Language Processing, 2006

A multi-space distribution (MSD) approach to speech recognition of tonal languages.

[BibT_eX]

[DOI]

Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Auto-segmentation based VAD for robust ASR.

[BibT_eX]

[DOI]

Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Generalization of the minimum classification error (MCE) training based on maximizing generalized posterior probability (GPP).

[BibT_eX]

[DOI]

Qiang Fu

Antonio Moreno-Daniel

Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Minimum divergence based discriminative training.

[BibT_eX]

[DOI]

Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Word graph based speech recognition error correction by handwriting input.

[BibT_eX]

[DOI]

Proceedings of the 8th International Conference on Multimodal Interfaces, 2006

Improved Chinese Character Input by Merging Speech and Handwriting Recognition Hypotheses.

[BibT_eX]

[DOI]

Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

A Comparative Study of Discriminative Methods for Reranking LVCSR N-Best Hypotheses in Domain Adaptation and Generalization.

[BibT_eX]

[DOI]

Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Auto-Segmentation Based Partitioning and Clustering Approach to Robust Endpointing.

[BibT_eX]

[DOI]

Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Tone-Enhanced Generalized Character Posterior Probability (GCPP) for Cantonese LVCSR.

[BibT_eX]

[DOI]

Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

An Iterative Trajectory Regeneration Algorithm for Separating Mixed Speech Sources.

[BibT_eX]

[DOI]

Pak-Chung Ching

Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Syllable Lattice Based Re-Scoring For Speaker Verification.

[BibT_eX]

[DOI]

Minho Jin

Chang D. Yoo

Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Weighted Likelihood Ratio (WLR) Hidden Markov Model for Noisy Speech Recognition.

[BibT_eX]

[DOI]

Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

A Study on How Human Annotations Benefit the TTS Voice.

[BibT_eX]

[DOI]

Proceedings of the Blizzard Challenge 2006, Pittsburgh, PA, USA, September 16, 2006

2005

A Dynamic In-Search Data Selection Method With Its Applications to Acoustic Modeling and Utterance Verification.

[BibT_eX]

[DOI]

Hui Jiang

IEEE Trans. Speech Audio Process., 2005

Verification of Multi-Class Recognition Decision: A Classification Approach.

[BibT_eX]

[DOI]

Tomoko Matsui

IEICE Trans. Inf. Syst., 2005

Refining phoneme segmentations using speaker-adaptive context dependent boundary models.

[BibT_eX]

[DOI]

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Phonetic transcription verification with generalized posterior probability.

[BibT_eX]

[DOI]

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Background model based posterior probability for measuring confidence.

[BibT_eX]

[DOI]

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Harmonic filtering for joint estimation of pitch and voiced source with single-microphone input.

[BibT_eX]

[DOI]

Pak-Chung Ching

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Static and Dynamic Spectral Features: Their Noise Robustness and Optimal Weights for ASR.

[BibT_eX]

[DOI]

Chen Yang

Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Generalized Posterior Probability for Minimum Error Verification of Recognized Sentences.

[BibT_eX]

[DOI]

Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Optimal Clustering and Non-Uniform Allocation of Gaussian Kernels in Scalar Dimension for HMM Compression.

[BibT_eX]

[DOI]

Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

2004

On noise robustness of dynamic and static features for continuous Cantonese digit recognition.

[BibT_eX]

[DOI]

Chen Yang

Proceedings of the 2004 International Symposium on Chinese Spoken Language Processing, 2004

Generalized posterior probability for minimizing verification errors at subword, word and sentence levels.

[BibT_eX]

[DOI]

Satoshi Nakamura

Proceedings of the 2004 International Symposium on Chinese Spoken Language Processing, 2004

Improved spoken language translation using n-best speech recognition hypotheses.

[BibT_eX]

[DOI]

Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Optimal acoustic and language model weights for minimizing word verification errors.

[BibT_eX]

[DOI]

Satoshi Nakamura

Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Tone information as a confidence measure for improving Cantonese LVCSR.

[BibT_eX]

[DOI]

Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Robust verification of recognized words in noise.

[BibT_eX]

[DOI]

Satoshi Nakamura

Proceedings of the 8th International Conference on Spoken Language Processing, 2004

A Unified Approach in Speech-to-Speech Translation: Integrating Features of Speech Recognition and Machine Translation.

[BibT_eX]

[DOI]

Proceedings of the COLING 2004, 2004

2003

On divergence based clustering of normal distributions and its application to HMM adaptation.

[BibT_eX]

[DOI]

Tor André Myrvoll

Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Modeling Cantonese pronunciation variation by acoustic model refinement.

[BibT_eX]

[DOI]

Patgi Kam

Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Optimal clustering of multivariate normal distributions using divergence and its application to HMM adaptation.

[BibT_eX]

[DOI]

Tor André Myrvoll

Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Combining neighboring filter channels to improve quantile based histogram equalization.

[BibT_eX]

[DOI]

Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

2002

Recognition of noisy speech using normalized moments.

[BibT_eX]

[DOI]

Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Bell Labs approach to Aurora evaluation on connected digit recognition.

[BibT_eX]

[DOI]

Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Classifier design for verification of multi-class recognition decision.

[BibT_eX]

[DOI]

Tomoko Matsui

Proceedings of the IEEE International Conference on Acoustics, 2002

A dynamic in-search discriminative training approach for large vocabulary speech recognition.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2002

2001

A real-time Japanese broadcast news closed-captioning system.

[BibT_eX]

[DOI]

Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

An auditory system-based feature for robust speech recognition.

[BibT_eX]

[DOI]

Qi Li

Olivier Siohan

Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

A data selection strategy for utterance verification in continuous speech recognition.

[BibT_eX]

[DOI]

Hui Jiang

Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Evaluating the Aurora connected digit recognition task - a Bell Labs approach.

[BibT_eX]

[DOI]

Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Hierarchical stochastic feature matching for robust speech recognition.

[BibT_eX]

[DOI]

Hui Jiang

Proceedings of the IEEE International Conference on Acoustics, 2001

2000

Hands-free human-machine dialogue - corpora, technology and evaluation.

[BibT_eX]

[DOI]

Eric A. Woudenberg

Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

A high-performance auditory feature for robust speech recognition.

[BibT_eX]

[DOI]

Qi Li

Olivier Siohan

Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

1999

Recent advancements in automatic speaker authentication.

[BibT_eX]

[DOI]

IEEE Robotics Autom. Mag., 1999

A block least squares approach to acoustic echo cancellation.

[BibT_eX]

[DOI]

Eric A. Woudenberg

Proceedings of the 1999 IEEE International Conference on Acoustics, 1999

Hidden Markov models with divergence based vector quantized variances.

[BibT_eX]

[DOI]

Jae H. Kim

Raziel Haimi-Cohen

Proceedings of the 1999 IEEE International Conference on Acoustics, 1999

1998

Improved utterance rejection using length dependent thresholds.

[BibT_eX]

[DOI]

Sunil K. Gupta

Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

1997

Generalized mixture of HMMs for continuous speech recognition.

[BibT_eX]

[DOI]

Filipp Korkmazskiy

Proceedings of the 1997 IEEE International Conference on Acoustics, 1997

1996

Quantizing mixture-weights in a tied-mixture HMM.

[BibT_eX]

[DOI]

Sunil K. Gupta

Raziel Haimi-Cohen

Proceedings of the 4th International Conference on Spoken Language Processing, 1996

High-accuracy connected digit recognition for mobile applications.

[BibT_eX]

[DOI]

Sunil K. Gupta

Raziel Haimi-Cohen

Proceedings of the 1996 IEEE International Conference on Acoustics, 1996

1995

Optimizing baseforms for HMM-based speech recognition.

[BibT_eX]

[DOI]

Torbjørn Svendsen

Heiko Purnhagen

Proceedings of the Fourth European Conference on Speech Communication and Technology, 1995

Large vocabulary, word-based Mandarin dictation system.

[BibT_eX]

[DOI]

Lin-Shan Lee

Proceedings of the Fourth European Conference on Speech Communication and Technology, 1995

An orthogonal polynomial representation of speech signals and its probabilistic model for text independent speaker verification.

[BibT_eX]

[DOI]

Proceedings of the 1995 International Conference on Acoustics, 1995

1994

A fast algorithm for large vocabulary keyword spotting application.

[BibT_eX]

[DOI]

Hsiao-Chuan Wang

IEEE Trans. Speech Audio Process., 1994

An N-best candidates-based discriminative training for speech recognition applications.

[BibT_eX]

[DOI]

IEEE Trans. Speech Audio Process., 1994

A Minimum Error Rate Pattern Recognition Approach to Speech Recognition.

[BibT_eX]

[DOI]

Int. J. Pattern Recognit. Artif. Intell., 1994

The use of tree-trellis search for large-vocabulary Mandarin polysyllabic word speech recognition.

[BibT_eX]

[DOI]

Hsiao-Chuan Wang

Comput. Speech Lang., 1994

Cepstral channel normalization techniques for HMM-based speaker verification.

[BibT_eX]

[DOI]

Proceedings of the 3rd International Conference on Spoken Language Processing, 1994

Large vocabulary word recognition based on tree-trellis search.

[BibT_eX]

[DOI]

Lin-Shan Lee

Proceedings of ICASSP '94: IEEE International Conference on Acoustics, 1994

Discriminative training of high performance speech recognizer using N best candidates.

[BibT_eX]

[DOI]

Proceedings of ICASSP '94: IEEE International Conference on Acoustics, 1994

1993

Optimal quantization of LSP parameters.

[BibT_eX]

[DOI]

IEEE Trans. Speech Audio Process., 1993

1992

Continuous mixture HMM-LR using the A* algorithm for continuous speech recognition.

[BibT_eX]

[DOI]

Proceedings of the Second International Conference on Spoken Language Processing, 1992

The use of cohort normalized scores for speaker verification.

[BibT_eX]

[DOI]

Proceedings of the Second International Conference on Spoken Language Processing, 1992

Continuous probabilistic acoustic map for speaker recognition.

[BibT_eX]

[DOI]

Belle L. Tseng

Proceedings of the 1992 IEEE International Conference on Acoustics, 1992

1991

A tree-trellis based fast search for finding the N-best sentence hypotheses in continuous speech recognition.

[BibT_eX]

[DOI]

Proceedings of the 1991 International Conference on Acoustics, 1991

1990

A Tree-Trellis Based Fast Search for Finding the N Best Sentence Hypotheses in Continuous Speech Recognition.

[BibT_eX]

[DOI]

Proceedings of the Speech and Natural Language: Proceedings of a Workshop Held at Hidden Valley, 1990

Experiments in automatic talker verification using sub-word unit hidden Markov models.

[BibT_eX]

[DOI]

Proceedings of the First International Conference on Spoken Language Processing, 1990

Optimal quantization of LSP parameters using delayed decisions.

[BibT_eX]

[DOI]

Proceedings of the 1990 International Conference on Acoustics, 1990

Sub-word unit talker verification using hidden Markov models.

[BibT_eX]

[DOI]

Proceedings of the 1990 International Conference on Acoustics, 1990

Speaker recognition based on source coding approaches.

[BibT_eX]

[DOI]

Proceedings of the 1990 International Conference on Acoustics, 1990

A probabilistic acoustic map based discriminative HMM training.

[BibT_eX]

[DOI]

Proceedings of the 1990 International Conference on Acoustics, 1990

Statistical segmentation and word modeling techniques in isolated word recognition.

[BibT_eX]

[DOI]

Proceedings of the 1990 International Conference on Acoustics, 1990

1989

A phonetically labeled acoustic segment (PLAS) approach to speech analysis-synthesis.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 1989

Word recognition using whole word and subword models.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 1989

1988

Optimal quantization of LSP parameters [speech coding].

[BibT_eX]

[DOI]

Biing-Hwang Juang

Proceedings of the IEEE International Conference on Acoustics, 1988

High performance connected digit recognition, using hidden Markov models.

[BibT_eX]

[DOI]

Jay G. Wilpon

Proceedings of the IEEE International Conference on Acoustics, 1988

A segment model based approach to speech recognition.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 1988

1987

On the automatic segmentation of speech signals.

[BibT_eX]

[DOI]

Torbjørn Svendsen

Proceedings of the IEEE International Conference on Acoustics, 1987

A frequency-weighted Itakura spectral distortion measure and its application to speech recognition in noise.

[BibT_eX]

[DOI]

M. Mohan Sondhi

Proceedings of the IEEE International Conference on Acoustics, 1987

A training procedure for a segment-based-network approach to isolated word recognition.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 1987

1986

On the use of instantaneous and transitional spectral information in speaker recognition.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 1986

A high quality subband speech coder with backward adaptive predictor and optimal time-frequency bit assignment.

[BibT_eX]

[DOI]

Richard V. Cox

Nikil S. Jayant

Proceedings of the IEEE International Conference on Acoustics, 1986

Evaluation of a vector quantization talker recognition system in text independent and text dependent modes.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 1986

1985

A vector-quantization-based preprocessor for speaker-independent isolated word recognition.

[BibT_eX]

[DOI]

Kuk-Chin Pan

IEEE Trans. Acoust. Speech Signal Process., 1985

Single-frame vowel recognition using vector quantization with several distance measures.

[BibT_eX]

[DOI]

AT&T Tech. J., 1985

Incorporation of temporal structure into a vector-quantization-based preprocessor for speaker-independent, isolated-word recognition.

[BibT_eX]

[DOI]

A. F. Bergh

AT&T Tech. J., 1985

A vector quantization approach to speaker recognition.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 1985

Subband coding of speech using backward adaptive prediction and bit allocation.

[BibT_eX]

[DOI]

Richard V. Cox

Nikil S. Jayant

Proceedings of the IEEE International Conference on Acoustics, 1985

An efficient vector-quantization preprocessor for speaker independent isolated word recognition.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 1985

Comparative study of several distortion measures for speech recognition.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 1985

1984

On the performance of isolated word speech recognizers using vector quantization and temporal energy contours.

[BibT_eX]

[DOI]

Kuk-Chin Pan

AT&T Bell Lab. Tech. J., 1984

Line spectrum pair (LSP) and speech data compression.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 1984

On the use of transient information in speech recognition.

[BibT_eX]

[DOI]

Jean-Sylvain Liénard

Proceedings of the IEEE International Conference on Acoustics, 1984

1982

Fast least-squares (LS) in the voice echo cancellation application.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 1982

On the high resolution and unbiased frequency estimates of sinusoids in white noise-A new adaptive approach.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 1982

1981

On the asymptotic behavior of a complex adaptive line enhancer (CALE).

[BibT_eX]

[DOI]

S. Shankar Narayan

Proceedings of the IEEE International Conference on Acoustics, 1981

1980

Fast spectral estimation of speech signal in analytic form.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 1980

1978

Frequency estimation by linear prediction.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 1978

Observations on linear estimation.

[BibT_eX]

[DOI]

Leland B. Jackson