Lin-Shan Lee

According to our database, Lin-Shan Lee authored at least 294 papers between 1981 and 2019.

Awards

IEEE Fellow 1993, "For contributions to computer voice input/output techniques for Mandarin Chinese and to engineering education."

Bibliography

2019
Sequence-to-sequence Automatic Speech Recognition with Word Embedding Regularization and Fused Decoding.
CoRR, 2019

Towards Unsupervised Speech Recognition and Synthesis with Quantized Speech Representation Learning.
CoRR, 2019

Interrupted and cascaded permutation invariant training for speech separation.
CoRR, 2019

Improved Speech Separation with Time-and-Frequency Cross-domain Joint Embedding and Clustering.
CoRR, 2019

From Semi-supervised to Almost-unsupervised Speech Recognition with Very-low Resource by Jointly Learning Phonetic Structures from Audio and Text Embeddings.
CoRR, 2019

Completely Unsupervised Phoneme Recognition By A Generative Adversarial Network Harmonized With Iteratively Refined Hidden Markov Models.
CoRR, 2019

Towards End-to-end Speech-to-text Translation with Two-pass Decoding.
Proceedings of the IEEE International Conference on Acoustics, 2019

Adversarial Training of End-to-end Speech Recognition Using a Criticizing Language Model.
Proceedings of the IEEE International Conference on Acoustics, 2019

2018
Unsupervised Discovery of Structured Acoustic Tokens With Applications to Spoken Term Detection.
IEEE/ACM Trans. Audio, Speech & Language Processing, 2018

Improved Audio Embeddings by Adjacency-Based Clustering with Applications in Spoken Term Detection.
CoRR, 2018

Almost-unsupervised Speech Recognition with Close-to-zero Resource Based on Phonetic Structures Learned from Very Small Unpaired Speech and Text Data.
CoRR, 2018

Rhythm-Flexible Voice Conversion Without Parallel Data Using Cycle-GAN Over Phoneme Posteriorgram Sequences.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Phonetic-and-Semantic Embedding of Spoken words with Applications in Spoken Content Retrieval.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Completely Unsupervised Phoneme Recognition by Adversarially Learning Mapping Relationships from Audio Embeddings.
Proceedings of the Interspeech 2018, 2018

Multi-target Voice Conversion without Parallel Data by Adversarially Learning Disentangled Audio Representations.
Proceedings of the Interspeech 2018, 2018

Segmental Audio Word2Vec: Representing Utterances as Sequences of Vectors with Applications in Spoken Term Detection.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Transcribing Lyrics from Commercial Song Audio: the First Step Towards Singing Content Processing.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Domain Independent Key Term Extraction from Spoken Content Based on Context and Term Location Information in the Utterances.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Scalable Sentiment for Sequence-to-Sequence Chatbot Response with Performance Analysis.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017
Unsupervised Iterative Deep Learning of Speech Features and Acoustic Tokens with Applications to Spoken Term Detection.
IEEE/ACM Trans. Audio, Speech & Language Processing, 2017

Order-Preserving Abstractive Summarization for Spoken Content Based on Connectionist Temporal Classification.
Proceedings of the Interspeech 2017, 2017

Personalized acoustic modeling by weakly supervised multi-task deep learning using acoustic tokens discovered from unlabeled data.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Personalized word representations carrying personalized semantics learned from social network posts.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

2016
Abstractive headline generation for spoken content by attentive recurrent neural networks with ASR error modeling.
Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016

Hierarchical attention model for improved machine comprehension of spoken content.
Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016

Interactive Spoken Content Retrieval by Deep Reinforcement Learning.
Proceedings of the Interspeech 2016, 2016

Towards Machine Comprehension of Spoken Content: Initial TOEFL Listening Comprehension Test by Machine.
Proceedings of the Interspeech 2016, 2016

Audio Word2Vec: Unsupervised Learning of Audio Segment Representations Using Sequence-to-Sequence Autoencoder.
Proceedings of the Interspeech 2016, 2016

2015
An Improved Framework for Recognizing Highly Imbalanced Bilingual Code-Switched Lectures with Cross-Language Acoustic Modeling and Frame-Level Language Identification.
IEEE/ACM Trans. Audio, Speech & Language Processing, 2015

Supervised Detection and Unsupervised Discovery of Pronunciation Error Patterns for Computer-Assisted Language Learning.
IEEE/ACM Trans. Audio, Speech & Language Processing, 2015

A Recursive Dialogue Game for Personalized Computer-Aided Pronunciation Training.
IEEE/ACM Trans. Audio, Speech & Language Processing, 2015

Spoken Content Retrieval - Beyond Cascading Speech Recognition with Text Retrieval.
IEEE/ACM Trans. Audio, Speech & Language Processing, 2015

Finding Complex Features for Guest Language Fragment Recovery in Resource-Limited Code-Mixed Speech Recognition.
IEEE/ACM Trans. Audio, Speech & Language Processing, 2015

Personalizing a Universal Recurrent Neural Network Language Model with User Characteristic Features by Crowdsouring over Social Networks.
CoRR, 2015

A Multi-layered Acoustic Tokenizing Deep Neural Network (MAT-DNN) for Unsupervised Discovery of Linguistic Units and Generation of High Quality Features.
CoRR, 2015

Personalized speech recognizer with keyword-based personalized lexicon and language model using word vector representations.
Proceedings of the INTERSPEECH 2015, 2015

Structuring lectures in massive open online courses (MOOCs) for efficient learning by linking similar sections and predicting prerequisites.
Proceedings of the INTERSPEECH 2015, 2015

Semantic retrieval of personal photos using a deep autoencoder fusing visual features with speech annotations represented as word/paragraph vectors.
Proceedings of the INTERSPEECH 2015, 2015

Enhancing sparse voice annotation for semantic retrieval of personal photos by continuous space word representations.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Enhancing automatically discovered multi-level acoustic patterns considering context consistency with applications in spoken term detection.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Personalizing universal recurrent neural network language model with user characteristic features by social network crowdsourcing.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

Towards structured deep neural network for automatic speech recognition.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

An iterative deep learning framework for unsupervised discovery of speech features and linguistic units with applications on spoken term detection.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

2014
Spoken Knowledge Organization by Semantic Structuring and a Prototype Course Lecture System for Personalized Learning.
IEEE/ACM Trans. Audio, Speech & Language Processing, 2014

Improved Semantic Retrieval of Spoken Content by Document/Query Expansion with Random Walk Over Acoustic Similarity Graphs.
IEEE/ACM Trans. Audio, Speech & Language Processing, 2014

Variable Selection Linear Regression for Robust Speech Recognition.
IEICE Transactions, 2014

Improved open-vocabulary spoken content retrieval with word and subword lattices using acoustic feature similarity.
Comput. Speech Lang., 2014

Personalized video summarization based on Multi-Layered Probabilistic Latent Semantic Analysis with shared topics.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Spoken question answering using tree-structured conditional random fields and two-layer random walk.
Proceedings of the INTERSPEECH 2014, 2014

Alignment of spoken utterances with slide content for easier learning with recorded lectures using structured support vector machine (SVM).
Proceedings of the INTERSPEECH 2014, 2014

Semantic retrieval of personal photos using matrix factorization and two-layer random walk fusing sparse speech annotations with visual features.
Proceedings of the INTERSPEECH 2014, 2014

Transcribing code-switched bilingual lectures using deep neural networks with unit merging in acoustic modeling.
Proceedings of the IEEE International Conference on Acoustics, 2014

Unsupervised spoken term detection with spoken queries by multi-level acoustic patterns with varying model granularity.
Proceedings of the IEEE International Conference on Acoustics, 2014

Towards personalized video summarization using synchronized comments and Probabilistic Latent Semantic Analysis.
Proceedings of the IEEE 3rd Global Conference on Consumer Electronics, 2014

2013
An Experimental Analysis on Integrating Multi-Stream Spectro-Temporal, Cepstral and Pitch Information for Mandarin Speech Recognition.
IEEE Trans. Audio, Speech & Language Processing, 2013

Enhanced Spoken Term Detection Using Support Vector Machines and Weighted Pseudo Examples.
IEEE Trans. Audio, Speech & Language Processing, 2013

Model-Based Unsupervised Spoken Term Detection with Spoken Queries.
IEEE Trans. Audio, Speech & Language Processing, 2013

NTU Chinese 2.0: a personalized recursive dialogue game for computer-assisted learning of Mandarin Chinese.
Proceedings of the ISCA International Workshop on Speech and Language Technology in Education, 2013

A cloud-based personalized recursive dialogue game system for computer-assisted language learning.
Proceedings of the ISCA International Workshop on Speech and Language Technology in Education, 2013

Speaking rate normalization with lattice-based context-dependent phoneme duration modeling for personalized speech recognizers on mobile devices.
Proceedings of the INTERSPEECH 2013, 2013

Recurrent neural network based language model personalization by social network crowdsourcing.
Proceedings of the INTERSPEECH 2013, 2013

A recursive dialogue game framework with optimal Policy offering personalized computer-assisted language learning.
Proceedings of the INTERSPEECH 2013, 2013

Supervised spoken document summarization based on structured support vector machine with utterance clusters as hidden variables.
Proceedings of the INTERSPEECH 2013, 2013

Interactive spoken content retrieval by extended query model and continuous state space Markov Decision Process.
Proceedings of the IEEE International Conference on Acoustics, 2013

Toward unsupervised discovery of pronunciation error patterns using universal phoneme posteriorgram for computer-assisted language learning.
Proceedings of the IEEE International Conference on Acoustics, 2013

A dialogue game framework with personalized training using reinforcement learning for computer-assisted language learning.
Proceedings of the IEEE International Conference on Acoustics, 2013

Enhancing query expansion for semantic retrieval of spoken content with automatically discovered acoustic patterns.
Proceedings of the IEEE International Conference on Acoustics, 2013

Unsupervised domain adaptation for spoken document summarization with structured support vector machine.
Proceedings of the IEEE International Conference on Acoustics, 2013

Unsupervised discovery of linguistic structure including two-level acoustic patterns using three cascaded stages of iterative optimization.
Proceedings of the IEEE International Conference on Acoustics, 2013

Toward unsupervised model-based spoken term detection with spoken queries without annotated data.
Proceedings of the IEEE International Conference on Acoustics, 2013

Towards unsupervised semantic retrieval of spoken content with query expansion based on automatically discovered acoustic patterns.
Proceedings of the 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, 2013

2012
Modulation Spectrum Equalization for Improved Robust Speech Recognition.
IEEE Trans. Audio, Speech & Language Processing, 2012

Interactive Spoken Document Retrieval With Suggested Key Terms Ranked by a Markov Decision Process.
IEEE Trans. Audio, Speech & Language Processing, 2012

Integrating Recognition and Retrieval With Relevance Feedback for Spoken Term Detection.
IEEE Trans. Audio, Speech & Language Processing, 2012

Personalized language modeling by crowd sourcing with social network data for voice access of cloud applications.
Proceedings of the 2012 IEEE Spoken Language Technology Workshop (SLT), 2012

Improved semantic retrieval of spoken content by language models enhanced with acoustic similarity graph.
Proceedings of the 2012 IEEE Spoken Language Technology Workshop (SLT), 2012

Minimum Phone Error model training on merged acoustic units for transcribing bilingual code-switched speech.
Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012

Interactive Spoken Content Retrieval with Different Types of Actions Optimized By a Markov Decision Process.
Proceedings of the INTERSPEECH 2012, 2012

Error Pattern Detection Integrating Generative and Discriminative Learning for Computer-Aided Pronunciation Training.
Proceedings of the INTERSPEECH 2012, 2012

Supervised Spoken Document Summarization jointly Considering Utterance Importance and Redundancy by Structured Support Vector Machine.
Proceedings of the INTERSPEECH 2012, 2012

Open-Vocabulary Retrieval of Spoken Content with Shorter/Longer Queries Considering Word/Subword-based Acoustic Feature Similarity.
Proceedings of the INTERSPEECH 2012, 2012

Discriminative Fuzzy Clustering Maximum a Posterior Linear Regression for Speaker Adaptation.
Proceedings of the INTERSPEECH 2012, 2012

Recognition of highly imbalanced code-mixed bilingual speech with frame-level language detection based on blurred posteriorgram.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Improved approaches of modeling and detecting Error Patterns with empirical analysis for Computer-Aided Pronunciation Training.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Semantic query expansion and context-based discriminative term modeling for spoken document retrieval.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Utterance-level latent topic transition modeling for spoken documents and its application in automatic summarization.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Unsupervised two-stage keyword extraction from spoken documents by topic coherence and support vector machine.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Two-dimensional frame-and-feature weighted Viterbi decoding for robust speech recognition.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

2011
Semantic Analysis and Organization of Spoken Documents Based on Parameters Derived From Latent Topics.
IEEE Trans. Audio, Speech & Language Processing, 2011

Bilingual Acoustic Model Adaptation by Unit Merging on Different Levels and Cross-Level Integration.
Proceedings of the INTERSPEECH 2011, 2011

Improved Tonal Language Speech Recognition by Integrating Spectro-Temporal Evidence and Pitch Information with Properly Chosen Tonal Acoustic Units.
Proceedings of the INTERSPEECH 2011, 2011

Spoken Lecture Summarization by Random Walk over a Graph Constructed with Automatically Extracted Key Terms.
Proceedings of the INTERSPEECH 2011, 2011

Unsupervised Hidden Markov Modeling of Spoken Queries for Spoken Term Detection without Speech Recognition.
Proceedings of the INTERSPEECH 2011, 2011

Bilingual acoustic modeling with state mapping and three-stage adaptation for transcribing unbalanced code-mixed lectures.
Proceedings of the IEEE International Conference on Acoustics, 2011

Multi-stream spectro-temporal and cepstral features based on data-driven hierarchical phoneme clusters.
Proceedings of the IEEE International Conference on Acoustics, 2011

Improved spoken term detection using support vector machines based on lattice context consistency.
Proceedings of the IEEE International Conference on Acoustics, 2011

Improved spoken term detection with graph-based re-ranking in feature space.
Proceedings of the IEEE International Conference on Acoustics, 2011

Integrating frame-based and segment-based dynamic time warping for unsupervised spoken term detection with spoken queries.
Proceedings of the IEEE International Conference on Acoustics, 2011

Improved spoken term detection using support vector machines with acoustic and context features from pseudo-relevance feedback.
Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

2010
Performance Analysis for Lattice-Based Speech Indexing Approaches Using Words and Subword Units.
IEEE Trans. Audio, Speech & Language Processing, 2010

A framework integrating different relevance feedback scenarios and approaches for spoken term detection.
Proceedings of the 2010 IEEE Spoken Language Technology Workshop, 2010

Automatic key term extraction from spoken course lectures using branching entropy and prosodic/semantic features.
Proceedings of the 2010 IEEE Spoken Language Technology Workshop, 2010

An integrated framework for transcribing Mandarin-English code-mixed lectures with improved acoustic and language modeling.
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

Mandarin tone recognition using affine-invariant prosodic features and tone posteriorgram.
Proceedings of the INTERSPEECH 2010, 2010

Improved phoneme recognition by integrating evidence from spectro-temporal and cepstral features.
Proceedings of the INTERSPEECH 2010, 2010

Improved spoken term detection by discriminative training of acoustic models based on user relevance feedback.
Proceedings of the INTERSPEECH 2010, 2010

Improved spoken term detection by feature space pseudo-relevance feedback.
Proceedings of the INTERSPEECH 2010, 2010

Unsupervised spoken-term detection with spoken queries using segment-based dynamic time warping.
Proceedings of the INTERSPEECH 2010, 2010

An initial attempt for phoneme recognition using Structured Support Vector Machine (SVM).
Proceedings of the IEEE International Conference on Acoustics, 2010

Integrating recognition and retrieval with user feedback: A new framework for spoken term detection.
Proceedings of the IEEE International Conference on Acoustics, 2010

An initial attempt to improve spoken term detection by learning optimal weights for different indexing features.
Proceedings of the IEEE International Conference on Acoustics, 2010

2009
Improved Features and Models for Detecting Edit Disfluencies in Transcribing Spontaneous Mandarin Speech.
IEEE Trans. Audio, Speech & Language Processing, 2009

Higher Order Cepstral Moment Normalization for Improved Robust Speech Recognition.
IEEE Trans. Audio, Speech & Language Processing, 2009

Virtual Chinese tutor (VCT) - a Chinese language pronunciation learning software.
Proceedings of the ISCA International Workshop on Speech and Language Technology in Education, 2009

Mandarin spontaneous narrative planning - prosodic evidence from National Taiwan University lecture corpus.
Proceedings of the INTERSPEECH 2009, 2009

Improved lattice-based spoken document retrieval by directly learning from the evaluation measures.
Proceedings of the IEEE International Conference on Acoustics, 2009

Learning on demand - course lecture distillation by information extraction and semantic structuring for spoken documents.
Proceedings of the IEEE International Conference on Acoustics, 2009

Latent semantic retrieval of personal photos with sparse user annotation by fused image/speech/text features.
Proceedings of the IEEE International Conference on Acoustics, 2009

Improved clustered hierarchical tandem system with bottom-up processing.
Proceedings of the IEEE International Conference on Acoustics, 2009

Spoken term detection from bilingual spontaneous speech using code-switched lattice-based structures for words and subword units.
Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009

Voice-based information retrieval - how far are we from the text-based information retrieval?
Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009

Discriminative Lexicon Adaptation for Improved Character Accuracy - A New Direction in Chinese Language Modeling.
Proceedings of the ACL 2009, 2009

2008
Histogram-Based Quantization for Robust and/or Distributed Speech Recognition.
IEEE Trans. Audio, Speech & Language Processing, 2008

Robustness analysis on lattice-based speech indexing approaches with respect to varying recognition accuracies by refined simulations.
Proceedings of the 2008 IEEE Spoken Language Technology Workshop, 2008

Automatic title generation for Chinese spoken documents with a delicate scored Viterbi algorithm.
Proceedings of the 2008 IEEE Spoken Language Technology Workshop, 2008

Latent semantic retrieval of spoken documents over position specific posterior lattices.
Proceedings of the 2008 IEEE Spoken Language Technology Workshop, 2008

Evaluation and Analysis of Minimum Phone Error Training and its Modified Versions for Large Vocabulary Mandarin Speech Recognition.
Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008

Evaluation of modulation spectrum equalization techniques for large vocabulary robust speech recognition.
Proceedings of the INTERSPEECH 2008, 2008

Improved large vocabulary Mandarin speech recognition by selectively using tone information with a two-stage prosodic model.
Proceedings of the INTERSPEECH 2008, 2008

Confusion-based entropy-weighted decoding for robust speech recognition.
Proceedings of the INTERSPEECH 2008, 2008

Data-driven clustered hierarchical tandem system for LVCSR.
Proceedings of the INTERSPEECH 2008, 2008

Context dependent quantization for distributed and/or robust speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2008

2007
A Perceptually Constrained GSVD-Based Approach for Enhancing Speech Corrupted by Colored Noise.
IEEE Trans. Audio, Speech & Language Processing, 2007

Lexicon adaptation with reduced character error (LARCE) - a new direction in Chinese language modeling.
Proceedings of the INTERSPEECH 2007, 2007

Subword-based position specific posterior lattices (s-PSPL) for indexing speech information.
Proceedings of the INTERSPEECH 2007, 2007

Extended powered cepstral normalization (p-CN) with range equalization for robust features in speech recognition.
Proceedings of the INTERSPEECH 2007, 2007

Language model adaptation using latent Dirichlet allocation and an efficient topic inference algorithm.
Proceedings of the INTERSPEECH 2007, 2007

Virtual Conduction System with Multi-Resolution Wall Display.
Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007

Three-Stage Error Concealment for Distributed Speech Recognition (DSR) with Histogram-Based Quantization (HQ) Under Noisy Environment.
Proceedings of the IEEE International Conference on Acoustics, 2007

Pronunciation Modeling for Spontaneous Speech Recognition using Latent Pronunciation Analysis (LPA) and Prior Knowledge.
Proceedings of the IEEE International Conference on Acoustics, 2007

Modulation spectrum equalization for robust speech recognition.
Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

Type-II dialogue systems for information access from unstructured knowledge sources.
Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

Analytical comparison between position specific posterior lattices and confusion networks based on words and subword units for spoken document indexing.
Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

Robust topic inference for latent semantic language model adaptation.
Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

Robust speech recognition by properly utilizing reliable frames and segments in corrupted signals.
Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

2006
Optimization of temporal filters for constructing robust features in speech recognition.
IEEE Trans. Audio, Speech & Language Processing, 2006

Simulation Analysis for Interactive Retrieval of spoken Documents with Key Terms Ranked by Reinforcement Learning.
Proceedings of the 2006 IEEE ACL Spoken Language Technology Workshop, 2006

Improved Summarization of Chinese spoken Documents by Probabilistic Latent Semantic Analysis (PLSA) with Further Analysis and Integrated Scoring.
Proceedings of the 2006 IEEE ACL Spoken Language Technology Workshop, 2006

A Multi-layered Summarization System for Multi-media Archives by Understanding and Structuring of Chinese Spoken Documents.
Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006

Improved Large Vocabulary Continuous Chinese Speech Recognition by Character-Based Consensus Networks.
Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006

Minimum Phone Error (MPE) Model and Feature Training on Mandarin Broadcast News Task.
Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006

Efficient interactive retrieval of spoken documents with key terms ranked by reinforcement learning.
Proceedings of the INTERSPEECH 2006, 2006

Latent prosodic modeling (LPM) for speech with applications in recognizing spontaneous Mandarin speech with disfluencies.
Proceedings of the INTERSPEECH 2006, 2006

Multi-layered summarization of spoken document archives by information extraction and semantic structuring.
Proceedings of the INTERSPEECH 2006, 2006

Feature analysis for emotion recognition from Mandarin speech considering the special characteristics of Chinese language.
Proceedings of the INTERSPEECH 2006, 2006

Prosodic modeling in large vocabulary Mandarin speech recognition.
Proceedings of the INTERSPEECH 2006, 2006

Powered cepstral normalization (p-CN) for robust features in speech recognition.
Proceedings of the INTERSPEECH 2006, 2006

Extension and further analysis of higher order cepstral moment normalization (HOCMN) for robust features in speech recognition.
Proceedings of the INTERSPEECH 2006, 2006

A new framework for system combination based on integrated hypothesis space.
Proceedings of the INTERSPEECH 2006, 2006

Joint Uncertainty Decoding (JUD) with Histogram-Based Quantization (HQ) for Robust and/or Distributed Speech Recognition.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Improved Spoken Document Summarization Using Probabilistic Latent Semantic Analysis (PLSA).
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Improved Spoken Document Retrieval With Dynamic Key Term Lexicon and Probabilistic Latent Semantic Analysis (PLSA).
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Entropy-Based Feature Parameter Weighting for Robust Speech Recognition.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

2005
Segmental eigenvoice with delicate eigenspace for improved speaker adaptation.
IEEE Trans. Speech and Audio Processing, 2005

Histogram-based quantization (HQ) for robust and scalable distributed speech recognition.
Proceedings of the INTERSPEECH 2005, 2005

Improved spontaneous Mandarin speech recognition by disfluency interruption point (IP) detection using prosodic features.
Proceedings of the INTERSPEECH 2005, 2005

Hierarchical topic organization and visual presentation of spoken documents using probabilistic latent semantic analysis (PLSA) for efficient retrieval/browsing applications.
Proceedings of the INTERSPEECH 2005, 2005

Energy-based frame selection for reliable feature normalization and transformation in robust speech recognition.
Proceedings of the INTERSPEECH 2005, 2005

Important and new features with analysis for disfluency interruption point (IP) detection in spontaneous Mandarin speech.
Proceedings of the ISCA Tutorial and Research Workshop (ITRW) on Disfluency in Spontaneous Speech, 2005

2004
A discriminative HMM/N-gram-based retrieval approach for Mandarin spoken documents.
ACM Trans. Asian Lang. Inf. Process., 2004

Large vocabulary continuous Mandarin speech recognition using finite state machine.
Proceedings of the 2004 International Symposium on Chinese Spoken Language Processing, 2004

An initial prototype system for Chinese spoken document understanding and organization for indexing/browsing and retrieval applications.
Proceedings of the 2004 International Symposium on Chinese Spoken Language Processing, 2004

Robust features for speech recognition using minimum variance distortionless response (MVDR) spectrum estimation and feature normalization techniques.
Proceedings of the 2004 International Symposium on Chinese Spoken Language Processing, 2004

A new feature extraction front-end for robust speech recognition using progressive histogram equalization and multi-eigenvector temporal filtering.
Proceedings of the INTERSPEECH 2004, 2004

Improved speech enhancement by applying time-shift property of DFT on Hankel matrices for signal subspace decomposition.
Proceedings of the INTERSPEECH 2004, 2004

Higher order cepstral moment normalization (HOCMN) for robust speech recognition.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Efficient and robust distributed speech recognition (DSR) over wireless fading channels: 2D-DCT compression, iterative bit allocation, short BCH code and interleaving.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2003
Why is the special structure of the language important for Chinese spoken language processing? - examples on spoken document retrieval, segmentation and summarization.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Cross domain Chinese speech understanding and answering based on named-entity extraction.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Automatic title generation for Chinese spoken documents considering the special structure of the language.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Speech enhancement and improved recognition accuracy by integrating wavelet transform and spectral subtraction algorithm.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Perceptually-constrained generalized singular value decomposition-based approach for enhancing speech corrupted by colored noise.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Automatic title generation for Chinese spoken documents using an adaptive k nearest-neighbor approach.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Improved Chinese broadcast news transcription by language modeling with temporally consistent training corpora and iterative phrase extraction.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Data-driven temporal filters based on multi-eigenvectors for robust features in speech recognition.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

2002
A set of corpus-based text-to-speech synthesis technologies for Mandarin Chinese.
IEEE Trans. Speech and Audio Processing, 2002

Discriminating capabilities of syllable-based features and approaches of utilizing them for voice retrieval of speech information in Mandarin Chinese.
IEEE Trans. Speech and Audio Processing, 2002

A hierarchical tag-graph search scheme with layered grammar rules for spontaneous speech understanding.
Pattern Recognit. Lett., 2002

Improved Chinese spoken document retrieval with hybrid modeling and data-driven indexing features.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Distributed Chinese keyword spotting and verification for spoken dialogues under wireless environment.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Speech enhancement based on generalized singular value decomposition approach.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Data-driven temporal filters obtained via different optimization criteria evaluated on Aurora2 database.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Data-driven temporal filters for robust features in speech recognition obtained via Minimum Classification Error (MCE).
Proceedings of the IEEE International Conference on Acoustics, 2002

2001
Computer-aided analysis and design for spoken dialogue systems based on quantitative simulations.
IEEE Trans. Speech and Audio Processing, 2001

New approaches for domain transformation and parameter combination for improved accuracy in parallel model combination (PMC) techniques.
IEEE Trans. Speech and Audio Processing, 2001

Eigen-MLLR coefficients as new feature parameters for speaker identification.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Segmental eigenvoice for rapid speaker adaptation.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Pronunciation variation analysis with respect to various linguistic levels and contextual conditions for Mandarin Chinese.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Comparative analysis for data-driven temporal filters obtained via principal component analysis (PCA) and linear discriminant analysis (LDA) in speech recognition.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Credibility proof for speech content and speaker verification by fragile watermarking with consecutive frame-based processing.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

An HMM/n-gram-based linguistic processing approach for Mandarin spoken document retrieval.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Improved spoken document retrieval by exploring extra acoustic and linguistic cues.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Rapid speaker adaptation using a priori knowledge by eigenspace analysis of MLLR parameters.
Proceedings of the IEEE International Conference on Acoustics, 2001

2000
Live Lexicons and Dynamic Corpora Adapted to the Network Resources for Chinese Spoken Language Processing Applications in an Internet Era.
Proceedings of the Second International Conference on Language Resources and Evaluation, 2000

MAT-2000 - design, collection, and validation of a Mandarin 2000-speaker telephone speech database.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Automatic metric-based speech segmentation for broadcast news via principal component analysis.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Retrieval of Mandarin broadcast news using spoken queries.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Fast speaker adaptation using eigenspace-based maximum likelihood linear regression.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Fundamental performance analysis for spoken dialogue systems based on a quantitative simulation approach.
Proceedings of the IEEE International Conference on Acoustics, 2000

Retrieval of broadcast news speech in Mandarin Chinese collected in Taiwan using syllable-level statistical characteristics.
Proceedings of the IEEE International Conference on Acoustics, 2000

1999
Telephony Based Speaker-Independent Large Vocabulary Continuous Mandarin Speech Recognition.
IJCLCLP, 1999

Automatic selection of phonetically distributed sentence sets for speaker adaptation with application to large vocabulary Mandarin speech recognition.
Comput. Speech Lang., 1999

Consistent dialogue across concurrent topics based on an expert system model.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

Phonetic state tied-mixture tone modeling for large vocabulary continuous Mandarin speech recognition.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

Selection of waveform units for corpus-based Mandarin speech synthesis based on decision trees and prosodic modification costs.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

A framework of performance evaluation and error analysis methodology for speech understanding systems.
Proceedings of the 1999 IEEE International Conference on Acoustics, 1999

Improved parallel model combination techniques with split Gaussian mixtures for speech recognition under noisy conditions.
Proceedings of the 1999 IEEE International Conference on Acoustics, 1999

1998
Isolated Mandarin base-syllable recognition based upon the segmental probability model.
IEEE Trans. Speech and Audio Processing, 1998

Speaker-Independent Continuous Mandarin Speech Recognition Under Telephone Environments.
Proceedings of the 11th Research on Computational Linguistics Conference, 1998

CPAT-Tree-Based Language Models with an Application for Text Verification in Chinese.
Proceedings of the 11th Research on Computational Linguistics Conference, 1998

A syllable-based Chinese spoken dialogue system for telephone directory services primarily trained with a corpus.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

Robust entropy-based endpoint detection for speech recognition in noisy environments.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

Improved robust speech recognition considering signal correlation approximated by Taylor series.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

Hierarchical tag-graph search for spontaneous speech understanding in spoken dialog systems.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

Improved parallel model combination based on better domain transformation for speech recognition under noisy environments.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

Automatic segmental and prosodic labeling of Mandarin speech database.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

A*-admissible key-phrase spotting with sub-syllable level utterance verification.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

Statistics-based segment pattern lexicon-a new direction for Chinese language modeling.
Proceedings of the 1998 IEEE International Conference on Acoustics, 1998

Improved robustness for speech recognition under noisy conditions using correlated parallel model combination.
Proceedings of the 1998 IEEE International Conference on Acoustics, 1998

Improved search strategy for large vocabulary continuous Mandarin speech recognition.
Proceedings of the 1998 IEEE International Conference on Acoustics, 1998

1997
Complete recognition of continuous Mandarin speech for Chinese language with very large vocabulary using limited training data.
IEEE Trans. Speech and Audio Processing, 1997

Isolated Mandarin syllable recognition with limited training data specially considering the effect of tones.
IEEE Trans. Speech and Audio Processing, 1997

Truncation on Combined Word-Based and Class-Based Language Model Using Kullback-Leibler Distance Criterion.
Proceedings of the 10th Research on Computational Linguistics International Conference, 1997

Integrating Long-Distance Language Modeling to Phoneme-to-Text Conversion.
Proceedings of the 10th Research on Computational Linguistics International Conference, 1997

Chinese language model adaptation based on document classification and multiple domain-specific language models.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

Intelligent retrieval of very large Chinese dictionaries with speech queries.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

A Modified Code Tracking Loop for Direct-Sequence Spread-Spectrum Systems on Frequency-Selective Fading Channels.
Proceedings of the 1997 IEEE International Conference on Communications: Towards the Knowledge Millennium, 1997

Syllable-based relevance feedback techniques for Mandarin voice record retrieval using speech queries.
Proceedings of the 1997 IEEE International Conference on Acoustics, 1997

A Chinese text-to-speech system based on part-of-speech analysis, prosodic modeling and non-uniform units.
Proceedings of the 1997 IEEE International Conference on Acoustics, 1997

Internet Chinese information retrieval using unconstrained Mandarin speech queries based on a client-server architecture and a PAT-tree-based language model.
Proceedings of the 1997 IEEE International Conference on Acoustics, 1997

A multi-phase approach for fast spotting of large vocabulary Chinese keywords from Mandarin speech using prosodic information.
Proceedings of the 1997 IEEE International Conference on Acoustics, 1997

1996
國語語音辨認中多領域語言模型之訓練、偵測與調適 (Training, Detection and Adaptation of Multi-Domain Language Models for Mandarin Speech Recognition) [In Chinese].
Proceedings of 9th Computational Linguistics Conference, 1996

Speaker intention modeling for large vocabulary Mandarin spoken dialogues.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

Robust speech recognition features based on temporal trajectory filtering of frequency band spectrum.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

Use of prosodic information to integrate acoustic and linguistic knowledge in continuous Mandarin speech recognition with very large vocabulary.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

Automatic generation of prosodic structure for high quality Mandarin speech synthesis.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

Very-large-vocabulary Mandarin voice message file retrieval using speech queries.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

Fast and accurate recognition of very-large-vocabulary continuous Mandarin speech for Chinese language with improved segmental probability modeling.
Proceedings of the 1996 IEEE International Conference on Acoustics, 1996

An efficient voice retrieval system for very-large-vocabulary Chinese textual databases with a clustered language model.
Proceedings of the 1996 IEEE International Conference on Acoustics, 1996

1995
Practically realizable digital transmission significantly below the Nyquist bandwidth.
IEEE Trans. Communications, 1995

A Chernoff distance based segmental probability model (CD-SPM) approach for Mandarin syllable recognition.
Proceedings of the Fourth European Conference on Speech Communication and Technology, 1995

Unconstrained speech retrieval for Chinese document databases with very large vocabulary and unlimited domains.
Proceedings of the Fourth European Conference on Speech Communication and Technology, 1995

A syllable-based very-large-vocabulary voice retrieval system for Chinese databases with textual attributes.
Proceedings of the Fourth European Conference on Speech Communication and Technology, 1995

Fast and accurate continuous speech recognition for Chinese language with very large vocabulary.
Proceedings of the Fourth European Conference on Speech Communication and Technology, 1995

Large vocabulary, word-based Mandarin dictation system.
Proceedings of the Fourth European Conference on Speech Communication and Technology, 1995

Complete recognition of continuous Mandarin speech for Chinese language with very large vocabulary but limited training data.
Proceedings of the 1995 International Conference on Acoustics, 1995

Golden Mandarin (III)-a user-adaptive prosodic-segment-based Mandarin dictation machine for Chinese language with very large vocabulary.
Proceedings of the 1995 International Conference on Acoustics, 1995

1994
An exact performance analysis of the clipped diversity combining receiver for FH/MFSK systems against a band multitone jammer.
IEEE Trans. Communications, 1994

國語語音辨認中詞群語言模型之分群方法與應用 (Methodology Implementation and Application of Word-Class Based Language Model in Mandarin Speech Recognition) [In Chinese].
Proceedings of Rocling Computational Linguistics Conference VII, 1994

An intelligent and efficient word-class-based Chinese language model for Mandarin speech recognition with very large vocabulary.
Proceedings of the 3rd International Conference on Spoken Language Processing, 1994

Incremental speaker adaptation using phonetically balanced training sentences for Mandarin syllable recognition based on segmental probability models.
Proceedings of the 3rd International Conference on Spoken Language Processing, 1994

An initial study on a segmental probability model approach to large-vocabulary continuous Mandarin speech recognition.
Proceedings of ICASSP '94: IEEE International Conference on Acoustics, 1994

Large vocabulary word recognition based on tree-trellis search.
Proceedings of ICASSP '94: IEEE International Conference on Acoustics, 1994

1993
A direct-concatenation approach to train hidden Markov models to recognize the highly confusing Mandarin syllables with very limited training data.
IEEE Trans. Speech and Audio Processing, 1993

Improved tone concatenation rules in a formant-based Chinese text-to-speech system.
IEEE Trans. Speech and Audio Processing, 1993

Golden Mandarin (I)-A real-time Mandarin speech dictation machine for Chinese language with very large vocabulary.
IEEE Trans. Speech and Audio Processing, 1993

A best-first language processing model integrating the unification grammar and Markov language model for speech recognition applications.
IEEE Trans. Speech and Audio Processing, 1993

A novel scaling scheme for fast Hartley transform.
Signal Process., 1993

Continuous hidden Markov models integrating transitional and instantaneous features for Mandarin syllable recognition.
Comput. Speech Lang., 1993

從中文語料庫中自動選取連續國語語音特性平衡句的方法 (Automatic Selection of Phonetically Rich Sentences from A Chinese Text Corpus) [In Chinese].
Proceedings of Rocling Computational Linguistics Conference VI, 1993

國語語音辨認中詞群雙連語言模型的解碼方法 (A Word-Class Bigram Approach to Linguistic Decoding in Mandarin Speech Recognition) [In Chinese].
Proceedings of Rocling Computational Linguistics Conference VI, 1993

A new framework for recognition of Mandarin syllables with tones using sub-syllabic units.
Proceedings of the IEEE International Conference on Acoustics, 1993

Golden Mandarin (II)-an improved single-chip real-time Mandarin dictation machine for Chinese language with very large vocabulary.
Proceedings of the IEEE International Conference on Acoustics, 1993

1991
Isolated-utterance speech recognition using hidden Markov models with bounded state durations.
IEEE Trans. Signal Processing, 1991

An augmented chart data structure with efficient word lattice parsing scheme in speech recognition applications.
Speech Commun., 1991

An Efficient Natural Language Processing System Specially Designed for the Chinese Language.
Computational Linguistics, 1991

A Preference-first Language Processor Integrating the Unification Grammar and Markov Language Model for Speech Recognition Applications.
Proceedings of the 29th Annual Meeting of the Association for Computational Linguistics, 1991

1990
A Mandarin Dictation Machine Based Upon a Hierarchical Recognition Approach and Chinese Natural Language Analysis.
IEEE Trans. Pattern Anal. Mach. Intell., 1990

An Efficient Speech Recognition System for the Initials of Mandarin Syllables.
IJPRAI, 1990

1989
The synthesis rules in a Chinese text-to-speech system.
IEEE Trans. Acoustics, Speech, and Signal Processing, 1989

Multi-H phase-coded modulations with asymmetric modulation indexes.
IEEE Journal on Selected Areas in Communications, 1989

1988
An improved sequential estimation scheme for PN acquisition.
IEEE Trans. Communications, 1988

Efficient Speech Recognition Techniques for the Finals of Mandarin Syllables.
IJPRAI, 1988

1987
The Minimum Likelihood-A New Concept for Bit Synchronization.
IEEE Trans. Communications, 1987

The Preliminary Results of a Mandarin Dictation Machine Based Upon Chinese Natural Language Analysis.
Proceedings of the 10th International Joint Conference on Artificial Intelligence. Milan, 1987

1986
A General Theory for Asynchronous Speech Encryption Techniques.
IEEE Journal on Selected Areas in Communications, 1986

Closed-Form Statistical Analysis for Square Law PN Acquisition Detector Performance in Spread Spectrum Systems.
Proceedings of the IEEE International Conference on Communications: Integrating the World Through Communications, 1986

A Chinese Natural Language Processing System Based Upon the Theory of Empty Categories.
Proceedings of the 5th National Conference on Artificial Intelligence. Philadelphia, 1986

1984
A Differential Technique for Detections of Circularly Polarized Satellite Signal Parameters.
IEEE Trans. Communications, 1984

An A, B-Polar Approach for Multimode Polarization Analysis in Satellite Communications.
IEEE Trans. Communications, 1984

A New Frequency Domain Speech Scrambling System Which Does Not Require Frame Synchronization.
IEEE Trans. Communications, 1984

A New Formulation of Spectrum-Orbit Utilization Efficiency for Satellite Communications in Interference-Limited Situations.
IEEE Trans. Communications, 1984

A New Time Domain Speech Scrambling System Which Does Not Require Frame Synchronization.
IEEE Journal on Selected Areas in Communications, 1984

1981
Results on Sampling-based Scrambling for Secure Speech Communication.
Proceedings of the Advances in Cryptology: A Report on CRYPTO 81, 1981

