Sung-Feng Huang

Orcid: 0000-0002-9720-811X

According to our database¹, Sung-Feng Huang authored at least 32 papers between 2018 and 2026.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2026

MoVE: Translating Laughter and Tears via Mixture of Vocalization Experts in Speech-to-Speech Translation.

[DOI]

,

,

,

Sung-Feng Huang

,

CoRR, April, 2026

VIBE: Voice-Induced open-ended Bias Evaluation for Large Audio-Language Models via Real-World Speech.

[DOI]

,

,

Sung-Feng Huang

,

CoRR, April, 2026

Joint Fullband-Subband Modeling for High-Resolution SingFake Detection.

[DOI]

,

,

Sung-Feng Huang

,

,

,

Jyh-Shing Roger Jang

CoRR, April, 2026

How Auditory Knowledge in LLM Backbones Shapes Audio Language Models: A Holistic Evaluation.

[DOI]

,

,

Chao-Han Huck Yang

,

,

Sung-Feng Huang

,

,

,

,

,

,

,

,

Cheng-Han Chiang

,

,

Yu-Chiang Frank Wang

,

CoRR, March, 2026

Rethinking Training Targets, Architectures and Data Quality for Universal Speech Enhancement.

[DOI]

,

,

,

Sung-Feng Huang

,

Ryandhimas E. Zezario

,

Rauf Nasretdinov

,

,

,

Yu-Chiang Frank Wang

CoRR, March, 2026

2025

SAKE: Towards Editing Auditory Attribute Knowledge of Large Audio-Language Models.

[DOI]

,

,

,

,

,

,

Sung-Feng Huang

,

Chao-Han Huck Yang

,

Yu-Chiang Frank Wang

,

,

CoRR, October, 2025

Investigating Safety Vulnerabilities of Large Audio-Language Models Under Speaker Emotional Variations.

[DOI]

,

,

Yu-Hsuan Li Liang

,

,

,

,

,

Sung-Feng Huang

,

Chao-Han Huck Yang

,

Yu-Chiang Frank Wang

,

,

CoRR, October, 2025

How Does Instrumental Music Help SingFake Detection?

[DOI]

,

,

,

,

,

,

Sung-Feng Huang

,

,

,

,

Jyh-Shing Roger Jang

CoRR, September, 2025

Leveraging Mamba with Full-Face Vision for Audio-Visual Speech Enhancement.

[DOI]

,

,

,

,

Sung-Feng Huang

,

,

Wen-Huang Cheng

,

CoRR, August, 2025

DeSTA2.5-Audio: Toward General-Purpose Large Audio Language Model with Self-Generated Cross-Modal Alignment.

[DOI]

CoRR, July, 2025

VoiceNoNG: Robust High-Quality Speech Editing Model without Hallucinations.

[DOI]

Sung-Feng Huang

,

,

,

,

,

,

,

,

Yu-Chiang Frank Wang

,

,

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Generative Speech Foundation Model Pretraining for High-Quality Speech Extraction and Restoration.

[DOI]

,

Alexander H. Liu

,

,

Sung-Feng Huang

,

,

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

HighRateMOS: Sampling-Rate Aware Modeling for Speech Quality Assessment.

[DOI]

,

,

,

Ryandhimas E. Zezario

,

,

Sung-Feng Huang

,

,

,

,

,

,

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2025

2024

Detecting the Undetectable: Assessing the Efficacy of Current Spoof Detection Methods Against Seamless Speech Edits.

[DOI]

Sung-Feng Huang

,

,

,

,

Chao-Han Huck Yang

,

,

Yu-Chiang Frank Wang

,

,

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

2023

Personalized Lightweight Text-to-Speech: Voice Cloning with Adaptive Structured Pruning.

[DOI]

Sung-Feng Huang

,

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2023

Maximizing Data Efficiency for Cross-Lingual TTS Adaptation by Self-Supervised Representation Mixing and Embedding Initialization.

[DOI]

,

Sung-Feng Huang

,

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022

Learning Phone Recognition From Unpaired Audio and Phone Sequences Based on Generative Adversarial Network.

[DOI]

,

,

,

Sung-Feng Huang

,

,

,

IEEE ACM Trans. Audio Speech Lang. Process., 2022

Meta-TTS: Meta-Learning for Few-Shot Speaker Adaptive Text-to-Speech.

[DOI]

Sung-Feng Huang

,

,

,

,

IEEE ACM Trans. Audio Speech Lang. Process., 2022

Few Shot Cross-Lingual TTS Using Transferable Phoneme Embedding.

[DOI]

,

,

Sung-Feng Huang

,

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

2021

Meta-TTS: Meta-Learning for Few-Shot Speaker Adaptive Text-to-Speech.

[DOI]

Sung-Feng Huang

,

,

CoRR, 2021

SpeechNet: A Universal Modularized Model for Speech Processing Tasks.

[DOI]

,

,

,

,

,

Sung-Feng Huang

,

,

,

Cheng-Kuang Lee

,

CoRR, 2021

Non-autoregressive Mandarin-English Code-switching Speech Recognition with Pinyin Mask-CTC and Word Embedding Regularization.

[DOI]

,

,

Sung-Feng Huang

,

CoRR, 2021

Stabilizing Label Assignment for Speech Separation by Self-Supervised Pre-Training.

[DOI]

Sung-Feng Huang

,

,

,

,

,

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Non-Autoregressive Mandarin-English Code-Switching Speech Recognition.

[DOI]

,

,

Sung-Feng Huang

,

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020

Self-supervised Pre-training Reduces Label Permutation Instability of Speech Separation.

[DOI]

Sung-Feng Huang

,

,

,

,

,

CoRR, 2020

Pretrained Language Model Embryology: The Birth of ALBERT.

[DOI]

David Cheng-Han Chiang

,

Sung-Feng Huang

,

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

2019

Audio Word2vec: Sequence-to-Sequence Autoencoding for Unsupervised Learning of Audio Segmentation and Representation.

[DOI]

,

Sung-Feng Huang

,

,

,

IEEE ACM Trans. Audio Speech Lang. Process., 2019

From Semi-supervised to Almost-unsupervised Speech Recognition with Very-low Resource by Jointly Learning Phonetic Structures from Audio and Text Embeddings.

[DOI]

,

Sung-Feng Huang

,

,

CoRR, 2019

2018

Improved Audio Embeddings by Adjacency-Based Clustering with Applications in Spoken Term Detection.

[DOI]

Sung-Feng Huang

,

,

,

CoRR, 2018

Almost-unsupervised Speech Recognition with Close-to-zero Resource Based on Phonetic Structures Learned from Very Small Unpaired Speech and Text Data.

[DOI]

,

,

Sung-Feng Huang

,

,

CoRR, 2018

Towards Unsupervised Automatic Speech Recognition Trained by Unaligned Speech and Text only.

[DOI]

,

,

Sung-Feng Huang

,

CoRR, 2018

Phonetic-and-Semantic Embedding of Spoken words with Applications in Spoken Content Retrieval.

[DOI]

,

Sung-Feng Huang

,

,

,

Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018
