Chia-Ping Chen

J. Inf. Sci. Eng., March, 2024

Training Speech Recognition Model with Speech Synthesis and Text Discriminator.

Hou-An Lin

J. Inf. Sci. Eng., March, 2024

Improving Speech Synthesis by Automatic Speech Recognition and Speech Discriminator.

Li-Yu Huang

J. Inf. Sci. Eng., January, 2024

Multilingual Meta-Transfer Learning for Low-Resource Speech Recognition.

IEEE Access, 2024

Residual Modules Combined with Squeeze-and-Excitation Attention Mechanism for Improving Few-Shot Road Alert Detection Model.

Proceedings of the 36th Conference on Computational Linguistics and Speech Processing, 2024

ReF-LDM: A Latent Diffusion Model for Reference-based Face Image Restoration.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Bilingual and Code-switching TTS Enhanced with Denoising Diffusion Model and GAN.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Enhancing ECAPA-TDNN with Feature Processing Module and Attention Mechanism for Speaker Verification.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Evaluation of Environmental Sound Classification using Vision Transformer.

Proceedings of the 2024 16th International Conference on Machine Learning and Computing, 2024

Long Audio File Speaker Diarization with Feasible End-to-End Models.

Kai-Wei Huang

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2024

Enhancing Branchformer with Dynamic Branch Merging Module for Code-Switching Speech Recognition.

Hong-Jie Hu

Yu-Chiao Lai

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2024

2023

Sound Event Detection System Based on VGGSKCCT Model Architecture with Knowledge Distillation.

Sung-Jen Huang

Chia-Chuan Liu

Appl. Artif. Intell., 2023

Personalized Lightweight Text-to-Speech: Voice Cloning with Adaptive Structured Pruning.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Regression-based Sound Event Detection with Semi-supervised Learning.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

A Lightweight Speaker Verification Model For Edge Device.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

2022

Audio Time-Scale Modification with Temporal Compressing Networks.

Ernie Chu

Ju-Ting Chen

CoRR, 2022

On the Efficiency of Integrating Self-supervised Learning and Meta-learning for User-defined Few-shot Keyword Spotting.

CoRR, 2022

On the Efficiency of Integrating Self-Supervised Learning and Meta-Learning for User-Defined Few-Shot Keyword Spotting.

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Lightweight Sound Event Detection Model with RepVGG Architecture.

Proceedings of the 34th Conference on Computational Linguistics and Speech Processing, 2022

Development of Mandarin-English code-switching speech synthesis system.

Hsin-Jou Lien

Li-Yu Huang

Proceedings of the 34th Conference on Computational Linguistics and Speech Processing, 2022

Mandarin-English Code-Switching Speech Recognition System for Specific Domain.

Chung-Pu Chiou

Hou-An Lin

Proceedings of the 34th Conference on Computational Linguistics and Speech Processing, 2022

Investigation of feature processing modules and attention mechanisms in speaker verification system.

Proceedings of the 34th Conference on Computational Linguistics and Speech Processing, 2022

Denoising Likelihood Score Matching for Conditional Score-based Data Generation.

Proceedings of the Tenth International Conference on Learning Representations, 2022

2021

Discussion on domain generalization in the cross-device speaker verification system.

Proceedings of the 33rd Conference on Computational Linguistics and Speech Processing, 2021

Exploiting Low-Resource Code-Switching Data to Mandarin-English Speech Recognition Systems.

Hou-An Lin

Proceedings of the 33rd Conference on Computational Linguistics and Speech Processing, 2021

RCRNN-based Sound Event Detection System with Specific Speech Resolution.

Proceedings of the 33rd Conference on Computational Linguistics and Speech Processing, 2021

Improving Time Delay Neural Network Based Speaker Recognition with Convolutional Block and Feature Aggregation Methods.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Systems for Low-Resource Speech Recognition Tasks in Open Automatic Speech Recognition and Formosa Speech Recognition Challenges.

Hung-Pang Lin

Yu-Jia Zhang

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Bridging Unsupervised and Supervised Depth from Focus via All-in-Focus Supervision.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Semi-supervised Sound Event Detection Using Multiscale Channel Attention and Multiple Consistency Training.

Proceedings of the 6th Workshop on Detection and Classification of Acoustic Scenes and Events 2021 (DCASE 2021), 2021

CLCC: Contrastive Learning for Color Constancy.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Semi-Supervised Sound Event Detection Using Self-Attention and Multiple Techniques of Consistency Training.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

2020

NSYSU+CHT Speaker Verification System for Far-Field Speaker Verification Challenge 2020.

Yu-Jia Zhang

Proceedings of the 32nd Conference on Computational Linguistics and Speech Processing, 2020

Real-Time Single-Speaker Taiwanese-Accented Mandarin Speech Synthesis System.

Yih-Wen Wang

Proceedings of the 32nd Conference on Computational Linguistics and Speech Processing, 2020

Improving Embedding-based Neural-Network Speaker Recognition.

Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020

Explorable Tone Mapping Operators.

Proceedings of the 25th International Conference on Pattern Recognition, 2020

Learning Camera-Aware Noise Models.

Proceedings of the Computer Vision - ECCV 2020, 2020

2019

以三元組損失微調時延神經網路語者嵌入函數之語者辨識系統 (Time Delay Neural Network-based Speaker Embedding Function Fine-tuned with Triplet Loss for Distance-based Speaker Recognition).

Proceedings of the 31st Conference on Computational Linguistics and Speech Processing, 2019

即時中文語音合成系統 (Real-Time Mandarin Speech Synthesis System).

An-Chieh Cheng

Proceedings of the 31st Conference on Computational Linguistics and Speech Processing, 2019

AI Deep Learning with Multiple Labels for Sentiment Classification of Tweets.

Zi Yuan Gao

Proceedings of the IEEE International Symposium on Circuits and Systems, 2019

Transfer-Representation Learning for Detecting Spoofing Attacks with Converted and Synthesized Speech in Automatic Speaker Verification System.

Su-Yu Chang

Kai-Cheng Wu

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Speaker Characterization Using TDNN-LSTM Based Speaker Embedding.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

Image Haze Removal By Adaptive CycleGAN.

Yi-Fan Chen

Amey Kiran Patel

Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

2018

Combining De-noising Auto-encoder and Recurrent Neural Networks in End-to-End Automatic Speech Recognition for Noise Robustness.

Tzu-Hsuan Ting

Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

deepSA2018 at SemEval-2018 Task 1: Multi-task Learning of Different Label for Affect in Tweets.

Zi Yuan Gao

Proceedings of The 12th International Workshop on Semantic Evaluation, 2018

結合卷積神經網路與遞迴神經網路於推文極性分類 (Combining Convolutional Neural Network and Recurrent Neural Network for Tweet Polarity Classification) [In Chinese].

Chih-Ting Yeh

Proceedings of the 30th Conference on Computational Linguistics and Speech Processing, 2018

Effective Attention Mechanism in Dynamic Models for Speech Emotion Recognition.

Po-Wei Hsiao

Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing, 2018

2017

deepSA at SemEval-2017 Task 4: Interpolated Deep Neural Networks for Sentiment Analysis in Twitter.

Tzu-Hsuan Yang

Tzu-Hsuan Tseng

Proceedings of the 11th International Workshop on Semantic Evaluation, 2017

Using Teacher-Student Model For Emotional Speech Recognition [In Chinese].

Po-Wei Hsiao

Po-Chen Hsieh

Proceedings of the 29th Conference on Computational Linguistics and Speech Processing, 2017

Speech emotion recognition with ensemble learning methods.

Po-Yuan Shih

Chung-Hsien Wu

Proceedings of the 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, 2017

Speech emotion recognition with skew-robust neural networks.

Po-Yuan Shih

Hsin-Min Wang

Proceedings of the 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, 2017

2016

標記對於類神經語音情緒辨識系統辨識效果之影響 (Effects of Label in Neural Speech Emotion Recognition System) [In Chinese].

Tung-Han Wu

Proceedings of the 28th Conference on Computational Linguistics and Speech Processing, 2016

Support Super-Vector Machines in Automatic Speech Emotion Recognition.

Chia-Ying Chen

Proceedings of the 28th Conference on Computational Linguistics and Speech Processing, 2016

以多層感知器辨識情緒於國台客語料庫 (Use Multilayer Perceptron To Recognize Emotion in Mandarin, Taiwanese and Hakka Database) [In Chinese].

Chia-Hsien Chan

Proceedings of the 28th Conference on Computational Linguistics and Speech Processing, 2016

Integration of orthogonal feature detectors in parameter learning of artificial neural networks to improve robustness and the evaluation on hand-written digit recognition tasks.

Po-Yuan Shih

Wei-Bin Liang

Proceedings of the 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, 2016

Recurrent neural network-based language models with variation in net topology, language, and granularity.

Tzu-Hsuan Yang

Tzu-Hsuan Tseng

Proceedings of the 2016 International Conference on Asian Language Processing, 2016

Verifying the long-range dependency of RNN language models.

Tzu-Hsuan Tseng

Tzu-Hsuan Yang

Proceedings of the 2016 International Conference on Asian Language Processing, 2016

2014

Polyglot Speech Synthesis Based on Cross-Lingual Frame Selection Using Auditory and Articulatory Features.

IEEE ACM Trans. Audio Speech Lang. Process., 2014

台灣情緒語料庫建置與辨識 (An Emotional Speech Database in Taiwan: Collection and Recognition) [In Chinese].

Bo-Chang Chiou

Proceedings of the 26th Conference on Computational Linguistics and Speech Processing, 2014

Speech emotion recognition with cross-lingual databases.

Bo-Chang Chiou

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Natural speech synthesis based on hybrid approach with candidate expansion and verification.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2014

2013

基於Sphinx 可快速個人化行動數字語音辨識系統 (Quickly Personalizable Mobile Digit Speech Recognition System Based on Sphinx) [In Chinese].

Tsung-Peng Yen

Proceedings of the 25th Conference on Computational Linguistics and Speech Processing, 2013

基於時域上基週同步疊加法之歌聲合成系統 (Singing Voice Synthesis System Based on Time Domain-Pitch Synchronized Overlap and Add) [In Chinese].

Ming-Kuan Wu

Proceedings of the 25th Conference on Computational Linguistics and Speech Processing, 2013

Query-Document Relevance Topic Models.

Meng-Sung Wu

Hsin-Min Wang

Proceedings of the Advances in Knowledge Discovery and Data Mining, 2013

Yet another Gaussian mixture model-based feature compensation method for robust noisy-digit recognition.

Bing-Feng Yeh

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2013

Feature space dimension reduction in speech emotion recognition using support vector machine.

Bo-Chang Chiou

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013

2012

Intrinsic Illumination Subspace for Lighting Insensitive Face Recognition.

IEEE Trans. Syst. Man Cybern. Part B, 2012

Integrating Recognition and Retrieval With Relevance Feedback for Spoken Term Detection.

Hung-yi Lee

Lin-Shan Lee

IEEE Trans. Speech Audio Process., 2012

Speaker-Dependent Model Interpolation for Statistical Emotional Speech Synthesis.

Chih-Yu Hsu

EURASIP J. Audio Speech Music. Process., 2012

Robust dialogue act detection based on partial sentence tree, derivation rule, and spectral clustering algorithm.

Chung-Hsien Wu

Wei-Bin Liang

EURASIP J. Audio Speech Music. Process., 2012

應用串接方法於連續變化轉速之四行程引擎聲音合成 (Concatenation-based Method for the Synthesis of Engine Noise with Continuously Varying Speed) [In Chinese].

Ming-Kuan Wu

Proceedings of the 24th Conference on Computational Linguistics and Speech Processing, 2012

Cross-lingual frame selection method for polyglot speech synthesis.

Proceedings of the 2012 IEEE International Conference on Acoustics, Speech and Signal Processing, 2012

Data-driven rescaled Teager energy cepstral coefficients for noise-robust speech recognition.

Miau-Luan Hsu

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

2011

Noise-robust speech feature processing with empirical mode decomposition.

Kuo-Hau Wu

Bing-Feng Yeh

EURASIP J. Audio Speech Music. Process., 2011

Real-time hand tracking on depth images.

Proceedings of the 2011 IEEE Visual Communications and Image Processing, 2011

Improved spoken term detection using support vector machines based on lattice context consistency.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2011

Improved spoken term detection with graph-based re-ranking in feature space.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2011

Semantic Information and Derivation Rules for Robust Dialogue Act Detection in a Spoken Dialogue System.

Wei-Bin Liang

Chung-Hsien Wu

Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference, 19-24 June, 2011, Portland, Oregon, USA, 2011

2010

A hidden Markov model-based approach for emotional speech synthesis.

Chih-Yung Yang

Proceedings of the Seventh ISCA Tutorial and Research Workshop on Speech Synthesis, 2010

A framework integrating different relevance feedback scenarios and approaches for spoken term detection.

Proceedings of the 2010 IEEE Spoken Language Technology Workshop, 2010

Transformational Breathing between Present and Past: Virtual Exhibition System of the Mao-Kung Ting.

Proceedings of the Advances in Multimedia Modeling, 2010

Auditory front-ends for noise-robust automatic speech recognition.

Ja-Zang Yeh

Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

Empirical mode decomposition for noise-robust automatic speech recognition.

Kuo-Hao Wu

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Improved spoken term detection by discriminative training of acoustic models based on user relevance feedback.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Improved spoken term detection by feature space pseudo-relevance feedback.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Turning Rust into Gold: An ancient artifact as an interactive artwork.

Proceedings of the 2010 IEEE International Conference on Multimedia and Expo, 2010

MOMI-Cosegmentation: Simultaneous Segmentation of Multiple Objects among Multiple Images.

Wen-Sheng Chu

Proceedings of the Computer Vision - ACCV 2010, 2010

2009

Noise-Robust Speech Features Based on Cepstral Time Coefficients.

Ja-Zang Yeh

Proceedings of the 21st Conference on Computational Linguistics and Speech Processing, 2009

A Framework for Machine Translation Output Combination.

Yi-Chang Chen

Proceedings of the 21st Conference on Computational Linguistics and Speech Processing, 2009

Noise-robust feature extraction based on forward masking.

Sheng-Chiuan Chiou

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Speaker diarization using divide-and-conquer.

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Pixel-based correspondence and shape reconstruction for moving objects.

Yi-Ping Hung

Proceedings of the 12th IEEE International Conference on Computer Vision Workshops, 2009

2007

MVA Processing of Speech Features.

IEEE Trans. Speech Audio Process., 2007

2006

An Approach to Using the Web as a Live Corpus for Spoken Transliteration Name Access.

Ming-Shun Lin

Hsin-Hsi Chen

Int. J. Comput. Linguistics Chin. Lang. Process., 2006

Automatic Learning of Context-Free Grammar.

Tai-Hung Chen

Chun-Han Tseng

Proceedings of the 18th Conference on Computational Linguistics and Speech Processing, 2006

Chinese input method based on reduced Mandarin phonetic alphabet.

Chun-Han Tseng

Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

The 4-Source Photometric Stereo Under General Unknown Lighting.

Proceedings of the Computer Vision, 2006

2005

An Approach of Using the Web as a Live Corpus for Spoken Transliteration Name Access.

Ming-Shun Lin

Hsin-Hsi Chen

Proceedings of the 17th Conference on Computational Linguistics and Speech Processing, 2005

Focused word segmentation for ASR.

Amarnag Subramanya

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Lighting Normalization with Generic Intrinsic Illumination Subspace for Face Recognition.

Proceedings of the 10th IEEE International Conference on Computer Vision (ICCV 2005), 2005

Speech Feature Smoothing for Robust ASR.

Daniel P. W. Ellis

Proceedings of the 2005 IEEE International Conference on Acoustics, Speech and Signal Processing, 2005

2004

Image set compression through minimal-cost prediction structures.

Proceedings of the 2004 International Conference on Image Processing, 2004

2002

Frontend post-processing and backend model enhancement on the Aurora 2.0/3.0 databases.

Karim Filali

Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Low-resource noise-robust feature post-processing on Aurora 2.0.
