Hsin-Min Wang

According to our database, Hsin-Min Wang authored at least 236 papers between 1993 and 2019.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2019
Reinforcement Learning Based Speech Enhancement for Robust Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2019

2018
Coherent Deep-Net Fusion To Classify Shots In Concert Videos.
IEEE Trans. Multimedia, 2018

Audio-Visual Speech Enhancement Using Multimodal Deep Convolutional Neural Networks.
IEEE Trans. Emerging Topics in Comput. Intellig., 2018

An Information Distillation Framework for Extractive Summarization.
IEEE/ACM Trans. Audio, Speech & Language Processing, 2018

Locally Linear Embedding Based Post-Filtering for Speech Enhancement.
J. Inf. Sci. Eng., 2018

Voice Conversion Based on Locally Linear Embedding.
J. Inf. Sci. Eng., 2018

WaveNet 聲碼器及其於語音轉換之應用 (WaveNet Vocoder and its Applications in Voice Conversion) [In Chinese].
Proceedings of the 30th Conference on Computational Linguistics and Speech Processing, 2018

Automatic Detection of Speech Under Cold Using Discriminative Autoencoders and Strength Modeling with Multiple Sub-Dictionary Generation.
Proceedings of the 16th International Workshop on Acoustic Signal Enhancement, 2018

Voice Conversion Based on Cross-Domain Features Using Variational Auto Encoders.
Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018

Exemplar-Based Spectral Detail Compensation for Voice Conversion.
Proceedings of the Interspeech 2018, 2018

Quality-Net: An End-to-End Non-intrusive Speech Quality Assessment Model Based on BLSTM.
Proceedings of the Interspeech 2018, 2018

Seethevoice: Learning from Music to Visual Storytelling of Shots.
Proceedings of the 2018 IEEE International Conference on Multimedia and Expo, 2018

Essence Vector-Based Query Modeling for Spoken Document Retrieval.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Deep Denoising Autoencoder Based Post Filtering for Speech Enhancement.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

2017
Affective Music Information Retrieval.
Proceedings of the Emotions and Personality in Personalized Services, 2017

A Position-Aware Language Modeling Framework for Extractive Broadcast News Speech Summarization.
ACM Trans. Asian & Low-Resource Lang. Inf. Process., 2017

A Replay Spoofing Detection System Based on Discriminative Autoencoders.
IJCLCLP, 2017

On the Use of Neural Network Modeling Techniques for Spoken Document Retrieval.
IJCLCLP, 2017

An Empirical Comparison of Contemporary Unsupervised Approaches for Extractive Speech Summarization.
IJCLCLP, 2017

基於鑑別式自編碼解碼器之錄音回放攻擊偵測系統 (A Replay Spoofing Detection System Based on Discriminative Autoencoders) [In Chinese].
Proceedings of the 29th Conference on Computational Linguistics and Speech Processing, 2017

使用查詢意向探索與類神經網路於語音文件檢索之研究 (Exploring Query Intent and Neural Network modeling Techniques for Spoken Document Retrieval) [In Chinese].
Proceedings of the 29th Conference on Computational Linguistics and Speech Processing, 2017

基於i-vector與PLDA並使用GMM-HMM強制對位之自動語者分段標記系統 (Speaker Diarization based on I-vector PLDA Scoring and using GMM-HMM Forced Alignment) [In Chinese].
Proceedings of the 29th Conference on Computational Linguistics and Speech Processing, 2017

Automatic Music Video Generation Based on Simultaneous Soundtrack Recommendation and Video Editing.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

Discriminative Autoencoders for Acoustic Modeling.
Proceedings of the Interspeech 2017, 2017

A Post-Filtering Approach Based on Locally Linear Embedding Difference Compensation for Speech Enhancement.
Proceedings of the Interspeech 2017, 2017

Wavelet Speech Enhancement Based on Robust Principal Component Analysis.
Proceedings of the Interspeech 2017, 2017

Voice Conversion from Unaligned Corpora Using Variational Autoencoding Wasserstein Generative Adversarial Networks.
Proceedings of the Interspeech 2017, 2017

Exploring the Use of Significant Words Language Modeling for Spoken Document Retrieval.
Proceedings of the Interspeech 2017, 2017

A locally linear embedding based postfiltering approach for speech enhancement.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Deep-net fusion to classify shots in concert videos.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Speech emotion recognition with skew-robust neural networks.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Leveraging manifold learning for extractive broadcast news summarization.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Discriminative autoencoders for speaker verification.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

A locality-preserving essence vector modeling framework for spoken document retrieval.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Neural relevance-aware query modeling for spoken document retrieval.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

Personality trait perception from speech signals using multiresolution analysis and convolutional neural networks.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

Fast locally linear embedding algorithm for exemplar-based voice conversion.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

2016
Alignment of Lyrics With Accompanied Singing Audio Based on Acoustic-Phonetic Vowel Likelihood Modeling.
IEEE/ACM Trans. Audio, Speech & Language Processing, 2016

Exploring the use of unsupervised query modeling techniques for speech recognition and summarization.
Speech Communication, 2016

運用序列到序列生成架構於重寫式自動摘要 (Exploiting Sequence-to-Sequence Generation Framework for Automatic Abstractive Summarization) [In Chinese].
Proceedings of the 28th Conference on Computational Linguistics and Speech Processing, 2016

Automatic Music Video Generation Based on Emotion-Oriented Pseudo Song Prediction and Matching.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Novel Word Embedding and Translation-based Language Modeling for Extractive Speech Summarization.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Dictionary update for NMF-based voice conversion using an encoder-decoder network.
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

Locally Linear Embedding for Exemplar-Based Spectral Conversion.
Proceedings of the Interspeech 2016, 2016

Exploring Word Mover's Distance and Semantic-Aware Embedding Techniques for Extractive Broadcast News Summarization.
Proceedings of the Interspeech 2016, 2016

Minimization of Regression and Ranking Losses with Shallow Neural Networks on Automatic Sincerity Evaluation.
Proceedings of the Interspeech 2016, 2016

DEMV-matchmaker: Emotional temporal course representation and deep similarity matching for automatic music video generation.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Improved spoken document summarization with coverage modeling techniques.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Learning to Distill: The Essence Vector Modeling Framework.
Proceedings of the COLING 2016, 2016

Exploiting graph regularized nonnegative matrix factorization for extractive speech summarization.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016

Voice conversion from non-parallel corpora using variational auto-encoder.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016

Audio-visual speech enhancement using deep neural networks.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016

A novel paragraph embedding method for spoken document summarization.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016

2015
Combining Relevance Language Modeling and Clarity Measure for Extractive Speech Summarization.
IEEE/ACM Trans. Audio, Speech & Language Processing, 2015

An Acoustic-Phonetic Model of F0 Likelihood for Vocal Melody Extraction.
IEEE/ACM Trans. Audio, Speech & Language Processing, 2015

Extractive Broadcast News Summarization Leveraging Recurrent Neural Network Language Modeling Techniques.
IEEE/ACM Trans. Audio, Speech & Language Processing, 2015

A Probabilistic Framework for Chinese Spelling Check.
ACM Trans. Asian & Low-Resource Lang. Inf. Process., 2015

Modeling the Affective Content of Music with a Gaussian Mixture Model.
IEEE Trans. Affective Computing, 2015

Extractive Spoken Document Summarization with Representation Learning Techniques.
IJCLCLP, 2015

Investigating Modulation Spectrum Factorization Techniques for Robust Speech Recognition.
IJCLCLP, 2015

表示法學習技術於節錄式語音文件摘要之研究 (A Study on Representation Learning Techniques for Extractive Spoken Document Summarization) [In Chinese].
Proceedings of the 27th Conference on Computational Linguistics and Speech Processing, 2015

調變頻譜分解之改良於強健性語音辨識 (Several Refinements of Modulation Spectrum Factorization for Robust Speech Recognition) [In Chinese].
Proceedings of the 27th Conference on Computational Linguistics and Speech Processing, 2015

EMV-matchmaker: Emotional Temporal Course Modeling and Matching for Automatic Music Video Generation.
Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

Positional language modeling for extractive broadcast news speech summarization.
Proceedings of the INTERSPEECH 2015, 2015

Leveraging word embeddings for spoken document summarization.
Proceedings of the INTERSPEECH 2015, 2015

A histogram density modeling approach to music emotion recognition.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

I-vector based language modeling for query representation.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Incorporating paragraph embeddings and density peaks clustering for spoken document summarization.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

Improving denoising auto-encoder based speech enhancement with the speech parameter generation algorithm.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

Incorporating proximity information in relevance language modeling for extractive speech summarization.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

A probabilistic interpretation for artificial neural network-based voice conversion.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

2014
Generalized k-Labelsets Ensemble for Multi-Label and Cost-Sensitive Classification.
IEEE Trans. Knowl. Data Eng., 2014

Enhancing Query Formulation for Spoken Document Retrieval.
J. Inf. Sci. Eng., 2014

探究新穎語句模型化技術於節錄式語音摘要 (Investigating Novel Sentence Modeling Techniques for Extractive Speech Summarization) [In Chinese].
Proceedings of the 26th Conference on Computational Linguistics and Speech Processing, 2014

Automatic Set List Identification and Song Segmentation for Full-Length Concert Videos.
Proceedings of the 15th International Society for Music Information Retrieval Conference, 2014

Enhanced language modeling for extractive speech summarization with sentence relatedness information.
Proceedings of the INTERSPEECH 2014, 2014

Clustering-based i-vector formulation for speaker recognition.
Proceedings of the INTERSPEECH 2014, 2014

Ensemble of machine learning algorithms for cognitive and physical speaker load detection.
Proceedings of the INTERSPEECH 2014, 2014

Towards time-varying music auto-tagging based on CAL500 expansion.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2014

A recurrent neural network language modeling framework for extractive speech summarization.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2014

Improving music auto-tagging by intra-song instance bagging.
Proceedings of the IEEE International Conference on Acoustics, 2014

Effective pseudo-relevance feedback for language modeling in extractive speech summarization.
Proceedings of the IEEE International Conference on Acoustics, 2014

Speaker verification using kernel-based binary classifiers with binary operation derived features.
Proceedings of the IEEE International Conference on Acoustics, 2014

I-vector based language modeling for spoken document retrieval.
Proceedings of the IEEE International Conference on Acoustics, 2014

Leveraging Effective Query Modeling Techniques for Speech Recognition and Summarization.
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, 2014

A margin-based discriminative modeling approach for extractive speech summarization.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

Emotion recognition of conversational affective speech using temporal course modeling-based error weighted cross-correlation model.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

2013
改良語句模型技術於節錄式語音摘要之研究 (Improved Sentence Modeling Techniques for Extractive Speech Summarization) [In Chinese].
Proceedings of the 25th Conference on Computational Linguistics and Speech Processing, 2013

Query-Document Relevance Topic Models.
Proceedings of the Advances in Knowledge Discovery and Data Mining, 2013

Non-reference audio quality assessment for online live music recordings.
Proceedings of the ACM Multimedia Conference, 2013

Alleviating the over-smoothing problem in GMM-based voice conversion with discriminative training.
Proceedings of the INTERSPEECH 2013, 2013

Semantic Naïve Bayes Classifier for Document Classification.
Proceedings of the Sixth International Joint Conference on Natural Language Processing, 2013

Subspace-based phonotactic language recognition using multivariate dynamic linear models.
Proceedings of the IEEE International Conference on Acoustics, 2013

Weighted matrix factorization for spoken document retrieval.
Proceedings of the IEEE International Conference on Acoustics, 2013

Effective pseudo-relevance feedback for spoken document retrieval.
Proceedings of the IEEE International Conference on Acoustics, 2013

Incorporating global variance in the training phase of GMM-based voice conversion.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013

A Study of Language Modeling for Chinese Spelling Check.
Proceedings of the Seventh SIGHAN Workshop on Chinese Language Processing, 2013

2012
Spoken Document Retrieval Leveraging Unsupervised and Supervised Topic Modeling Techniques.
IEICE Transactions, 2012

A Term Association Translation Model for Naive Bayes Text Classification.
Proceedings of the Advances in Knowledge Discovery and Data Mining, 2012

The acoustic emotion gaussians model for emotion-based music annotation and retrieval.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

The acousticvisual emotion guassians model for automatic generation of music video.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Exploring the relationship between categorical and dimensional emotion semantics of music.
Proceedings of the second international ACM workshop on Music information retrieval with user-centered and multimodal strategies, 2012

Exploring mutual information for GMM-based spectral conversion.
Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012

Subspace-Based Feature Representation and Learning for Language Recognition.
Proceedings of the INTERSPEECH 2012, 2012

A Study of Mutual Information for GMM-Based Spectral Conversion.
Proceedings of the INTERSPEECH 2012, 2012

Word Relevance Modeling for Speech Recognition.
Proceedings of the INTERSPEECH 2012, 2012

Term relevance dependency model for text classification.
Proceedings of the 21st International Conference on Pattern Recognition, 2012

Playing with tagging: A real-time tagging music player.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Generalized k-labelset ensemble for multi-label classification.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Personalized music emotion recognition via model adaptation.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

2011
Cost-Sensitive Multi-Label Learning for Audio Tag Annotation and Retrieval.
IEEE Trans. Multimedia, 2011

Audio Tag Annotation and Retrieval Using Tag Count Information.
Proceedings of the Advances in Multimedia Modeling, 2011

Colorizing tags in tag cloud: a novel query-by-tag music search system.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Learning the Similarity of Audio Music in Bag-of-frames Representation from Tagged Music Data.
Proceedings of the 12th International Society for Music Information Retrieval Conference, 2011

An Acoustic-Phonetic Approach to Vocal Melody Extraction.
Proceedings of the 12th International Society for Music Information Retrieval Conference, 2011

Query by multi-tags with multi-level preferences for content-based music retrieval.
Proceedings of the 2011 IEEE International Conference on Multimedia and Expo, 2011

Automatic annotation of Web videos.
Proceedings of the 2011 IEEE International Conference on Multimedia and Expo, 2011

Cost-sensitive stacking for audio tag annotation and retrieval.
Proceedings of the IEEE International Conference on Acoustics, 2011

2010
Fast min-hashing indexing and robust spatio-temporal matching for detecting video copies.
TOMCCAP, 2010

Time-Series Linear Search for Video Copies Based on Compact Signature Manipulation and Containment Relation Modeling.
IEEE Trans. Circuits Syst. Video Techn., 2010

BIC-Based Speaker Segmentation Using Divide-and-Conquer Strategies With Application to Speaker Diarization.
IEEE Trans. Audio, Speech & Language Processing, 2010

Exploiting semantic associative information in topic modeling.
Proceedings of the 2010 IEEE Spoken Language Technology Workshop, 2010

Phone boundary refinement using ranking methods.
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

Speaker verification using support vector machine with LLR-based sequence kernels.
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

Bayesian speaker recognition using Gaussian mixture model and laplace approximation.
Proceedings of the INTERSPEECH 2010, 2010

Phonetic subspace mixture model for speaker diarization.
Proceedings of the INTERSPEECH 2010, 2010

A Discriminative and Heteroscedastic Linear Feature Transformation for Multiclass Classification.
Proceedings of the 20th International Conference on Pattern Recognition, 2010

Homogeneous segmentation and classifier ensemble for audio tag annotation and retrieval.
Proceedings of the 2010 IEEE International Conference on Multimedia and Expo, 2010

Detecting pitching frames in baseball game video using Markov random walk.
Proceedings of the International Conference on Image Processing, 2010

Background music identification through content filtering and min-hash matching.
Proceedings of the IEEE International Conference on Acoustics, 2010

2009
Model-Based Clustering by Probabilistic Self-Organizing Maps.
IEEE Trans. Neural Networks, 2009

A Probabilistic Generative Framework for Extractive Broadcast News Speech Summarization.
IEEE Trans. Audio, Speech & Language Processing, 2009

A Comparative Study of Probabilistic Ranking Models for Chinese Spoken Document Summarization.
ACM Trans. Asian Lang. Inf. Process., 2009

Improving the characterization of the alternative hypothesis via minimum verification error training with applications to speaker verification.
Pattern Recognition, 2009

Raman-Based 10.66 Gb/s Bidirectional TDM over Long-Reach WDM Hybrid PON.
IEICE Transactions, 2009

Evolutionary minimization of the Rand index for speaker clustering.
Computer Speech & Language, 2009

Improving GMM-UBM speaker verification using discriminative feedback adaptation.
Computer Speech & Language, 2009

Virtual Chinese tutor (VCT) - a Chinese language pronunciation learning software.
Proceedings of the ISCA International Workshop on Speech and Language Technology in Education, 2009

Speaker diarization using divide-and-conquer.
Proceedings of the INTERSPEECH 2009, 2009

Articulatory feature asynchrony analysis and compensation in detection-based ASR.
Proceedings of the INTERSPEECH 2009, 2009

Learning to rank from Bayesian decision inference.
Proceedings of the 18th ACM Conference on Information and Knowledge Management, 2009

2008
A Query-by-Singing System for Retrieving Karaoke Music.
IEEE Trans. Multimedia, 2008

Using Kernel Discriminant Analysis to Improve the Characterization of the Alternative Hypothesis for Speaker Verification.
IEEE Trans. Audio, Speech & Language Processing, 2008

Using the Similarity of Main Melodies to Identify Cover Versions of Popular Songs for Music Document Retrieval.
J. Inf. Sci. Eng., 2008

An Investigation of Phonological Feature Systems Used in Detection-Based ASR.
Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008

Discriminative Feedback Adaptation for GMM-UBM Speaker Verification.
Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008

A comparative study of probabilistic ranking models for spoken document summarization.
Proceedings of the IEEE International Conference on Acoustics, 2008

BIC-based audio segmentation by divide-and-conquer.
Proceedings of the IEEE International Conference on Acoustics, 2008

2007
Automatic Speaker Clustering Using a Voice Characteristic Reference Space and Maximum Purity Estimation.
IEEE Trans. Audio, Speech & Language Processing, 2007

Integrating coding techniques into LP-based Mandarin text-to-speech synthesis.
I. J. Speech Technology, 2007

A Novel Characterization of the Alternative Hypothesis Using Kernel Discriminant Analysis for LLR-Based Speaker Verification.
IJCLCLP, 2007

Improved HMM/SVM methods for automatic phoneme segmentation.
Proceedings of the INTERSPEECH 2007, 2007

A unified probabilistic generative framework for extractive spoken document summarization.
Proceedings of the INTERSPEECH 2007, 2007

Evolutionary minimum verification error learning of the alternative hypothesis model for LLR-based speaker verification.
Proceedings of the INTERSPEECH 2007, 2007

Cascading Multimodal Verification using Face, Voice and Iris Information.
Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007

Speaker Clustering Based on Minimum Rand Index.
Proceedings of the IEEE International Conference on Acoustics, 2007

Phonetic Boundary Refinement using Support Vector Machine.
Proceedings of the IEEE International Conference on Acoustics, 2007

Improved Methods for Characterizing the Alternative Hypothesis using Minimum Verification Error Training for LLR-Based Speaker Verification.
Proceedings of the IEEE International Conference on Acoustics, 2007

Spoken document summarization using relevant information.
Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

2006
Automatic singer recognition of popular music recordings via estimation and modeling of solo vocal signals.
IEEE Trans. Audio, Speech & Language Processing, 2006

An Empirical Study of Word Error Minimization Approaches for Mandarin Large Vocabulary Continuous Speech Recognition.
IJCLCLP, 2006

A Maximum Entropy Approach for Semantic Language Modeling.
IJCLCLP, 2006

A Minimum Boundary Error Framework for Automatic Phonetic Segmentation.
Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006

Automatic Construction of Regression Class Tree for MLLR Via Model-Based Hierarchical Clustering.
Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006

Extractive Chinese Spoken Document Summarization Using Probabilistic Ranking Models.
Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006

On Using Entropy Information to Improve Posterior Probability-Based Confidence Measures.
Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006

A Novel Alternative Hypothesis Characterization Using Kernel Classifiers for LLR-Based Speaker Verification.
Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006

Minimum boundary error training for automatic phonetic segmentation.
Proceedings of the INTERSPEECH 2006, 2006

Improving the characterization of the alternative hypothesis via kernel discriminant analysis for likelihood ratio-based speaker verification.
Proceedings of the INTERSPEECH 2006, 2006

A Prototypes-Embedded Genetic K-means Algorithm.
Proceedings of the 18th International Conference on Pattern Recognition (ICPR 2006), 2006

A Kernel-based Discrimination Framework for Solving Hypothesis Testing Problems with Application to Speaker Verification.
Proceedings of the 18th International Conference on Pattern Recognition (ICPR 2006), 2006

On Maximizing the Within-Cluster Homogeneity of Speaker Voice Characteristics For Speech Utterance Clustering.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

A Music Retrieval System Based on Query-by-Singing for Karaoke Jukebox.
Proceedings of the Information Retrieval Technology, 2006

2005
Fluent speech prosody: Framework and modeling.
Speech Communication, 2005

MATBN: A Mandarin Chinese Broadcast News Corpus.
IJCLCLP, 2005

On the extraction of vocal-related information to facilitate the management of popular music collections.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2005

Query-By-Example Technique for Retrieving Cover Versions of Popular Songs with Similar Melodies.
Proceedings of the ISMIR 2005, 2005

Speaker clustering of unknown utterances based on maximum purity estimation.
Proceedings of the INTERSPEECH 2005, 2005

An Efficient Approach to Multimodal Person Identity Verification by Fusing Face and Voice Information.
Proceedings of the 2005 IEEE International Conference on Multimedia and Expo, 2005

Clustering Speech Utterances by Speaker Using Eigenvoice-Motivated Vector Space Models.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

GMM-Based Bhattacharyya Kernel Fisher Discriminant Analysis For Speaker Recognition.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

A Query-by-Singing Technique for Retrieving Polyphonic Objects of Popular Music.
Proceedings of the Information Retrieval Technology, 2005

2004
A discriminative HMM/N-gram-based retrieval approach for mandarin spoken documents.
ACM Trans. Asian Lang. Inf. Process., 2004

The SoVideo Mandarin Chinese Broadcast News Retrieval System.
I. J. Speech Technology, 2004

A Model-Selection-Based Self-Splitting Gaussian Mixture Learning with Application to Speaker Identification.
EURASIP J. Adv. Sig. Proc., 2004

Mandarin-English Information (MEI): investigating translingual speech retrieval.
Computer Speech & Language, 2004

Blind Clustering of Popular Music Recordings Based on Singer Voice Characteristics.
Computer Music Journal, 2004

藍芽無線環境下中文語音辨識效能之評估與分析 (Performance Evaluation and Analysis of Mandarin Speech Recognition over Bluetooth Communication Environments) [In Chinese].
Proceedings of the 16th Conference on Computational Linguistics and Speech Processing, 2004

Towards Automatic Identification Of Singing Language In Popular Music Recordings.
Proceedings of the ISMIR 2004, 2004

A Mandarin TTS system with an integrated prosodic model.
Proceedings of the 2004 International Symposium on Chinese Spoken Language Processing, 2004

A new eigenvoice approach to speaker adaptation.
Proceedings of the 2004 International Symposium on Chinese Spoken Language Processing, 2004

A maximum entropy approach for integrating semantic information in statistical language models.
Proceedings of the 2004 International Symposium on Chinese Spoken Language Processing, 2004

METRIC-SEQDAC: a hybrid approach for audio segmentation.
Proceedings of the INTERSPEECH 2004, 2004

Speaker clustering of speech utterances using a voice characteristic reference space.
Proceedings of the INTERSPEECH 2004, 2004

Statistical Chinese spoken document retrieval using latent topical information.
Proceedings of the INTERSPEECH 2004, 2004

A query-by-example framework to retrieve music documents by singer.
Proceedings of the 2004 IEEE International Conference on Multimedia and Expo, 2004

Automatic detection and tracking of target singer in multi-singer music recordings.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2003
Blind clustering of popular music recordings based on singer voice characteristics.
Proceedings of the ISMIR 2003, 2003

Automatic singer identification of popular music recordings via estimation and modeling of solo vocal signal.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Multi-scale document expansion in English-Mandarin cross-language spoken document retrieval.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

A sequential metric-based audio segmentation method via the Bayesian information criterion.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

2002
Discriminating capabilities of syllable-based features and approaches of utilizing them for voice retrieval of speech information in Mandarin Chinese.
IEEE Trans. Speech and Audio Processing, 2002

A hierarchical tag-graph search scheme with layered grammar rules for spontaneous speech understanding.
Pattern Recognition Letters, 2002

2001
Comparison of Word and Subword Indexing Techniques for Mandarin Chinese Spoken Document Retrieval.
Proceedings of the Advances in Multimedia Information Processing, 2001

Mandarin-English Information: Investigating Translingual Speech Retrieval.
Proceedings of the First International Conference on Human Language Technology Research, 2001

Comparative analysis for data-driven temporal filters obtained via principal component analysis (PCA) and linear discriminant analysis (LDA) in speech recognition.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

An HMM/n-gram-based linguistic processing approach for Mandarin spoken document retrieval.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Improved spoken document retrieval by exploring extra acoustic and linguistic cues.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Multi-scale-audio indexing for translingual spoken document retrieval.
Proceedings of the IEEE International Conference on Acoustics, 2001

Eigenspace-based maximum a posteriori linear regression for rapid speaker adaptation.
Proceedings of the IEEE International Conference on Acoustics, 2001

2000
Experiments in syllable-based retrieval of broadcast news speech in Mandarin Chinese.
Speech Communication, 2000

Mandarin spoken document retrieval based on syllable lattice matching.
Pattern Recognition Letters, 2000

A spoken-access approach for Chinese text and speech information retrieval.
JASIS, 2000

Syllable-Based Chinese Text/Spoken Document Retrieval Using Text/Speech Queries.
IJPRAI, 2000

Browsing the Chinese Web Pages Using Mandarin Speech.
Int. J. Comput. Proc. Oriental Lang., 2000

Content-based language models for spoken document retrieval.
Proceedings of the Fifth International Workshop on Information Retrieval with Asian Languages, 2000, Hong Kong, China, September 30, 2000

Automatic metric-based speech segmentation for broadcast news via principal component analysis.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Retrieval of Mandarin broadcast news using spoken queries.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Fast speaker adaptation using eigenspace-based maximum likelihood linear regression.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Retrieval of broadcast news speech in Mandarin Chinese collected in Taiwan using syllable-level statistical characteristics.
Proceedings of the IEEE International Conference on Acoustics, 2000

1999
Automatic selection of phonetically distributed sentence sets for speaker adaptation with application to large vocabulary Mandarin speech recognition.
Computer Speech & Language, 1999

A New Syllable-based Approach for Retrieving Mandarin Spoken Documents Using Short Speech Queries.
Proceedings of the 12th Research on Computational Linguistics Conference, 1999

Consistent dialogue across concurrent topics based on an expert system model.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

1998
Statistical Analysis of Mandarin Acoustic Units and Automatic Extraction of Phonetically Rich Sentences Based Upon a very Large Chinese Text Corpus.
IJCLCLP, 1998

Towards a Mandarin voice memo system.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

Hierarchical tag-graph search for spontaneous speech understanding in spoken dialog systems.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

A*-admissible key-phrase spotting with sub-syllable level utterance verification.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

1997
Complete recognition of continuous Mandarin speech for Chinese language with very large vocabulary using limited training data.
IEEE Trans. Speech and Audio Processing, 1997

Internet Chinese information retrieval using unconstrained Mandarin speech queries based on a client-server architecture and a PAT-tree-based language model.
Proceedings of the 1997 IEEE International Conference on Acoustics, 1997

1996
Frameworks for recognition of Mandarin syllables with tones using sub-syllabic units.
Speech Communication, 1996

1995
Fast and accurate continuous speech recognition for Chinese language with very large vocabulary.
Proceedings of the Fourth European Conference on Speech Communication and Technology, 1995

Complete recognition of continuous Mandarin speech for Chinese language with very large vocabulary but limited training data.
Proceedings of the 1995 International Conference on Acoustics, 1995

1994
Incremental speaker adaptation using phonetically balanced training sentences for Mandarin syllable recognition based on segmental probability models.
Proceedings of the 3rd International Conference on Spoken Language Processing, 1994

An initial study on a segmental probability model approach to large-vocabulary continuous Mandarin speech recognition.
Proceedings of ICASSP '94: IEEE International Conference on Acoustics, 1994

1993
從中文語料庫中自動選取連續國語語音特性平衡句的方法 (Automatic Selection of Phonetically Rich Sentences from A Chinese Text Corpus) [In Chinese].
Proceedings of Rocling Computational Linguistics Conference VI, 1993

Golden Mandarin (II)-an improved single-chip real-time Mandarin dictation machine for Chinese language with very large vocabulary.
Proceedings of the IEEE International Conference on Acoustics, 1993

