Seung-Bin Kim
ORCID: 0000-0002-2287-9111
According to our database, Seung-Bin Kim authored at least 18 papers between 2022 and 2026.
Bibliography
2026
ImmersiveTTS: Environment-Aware Text-to-Speech with Multimodal Diffusion Transformer and Domain-Specific Representation Alignment.
CoRR, May, 2026
Affectron: Emotional Speech Synthesis with Affective and Contextually Aligned Nonverbal Vocalizations.
CoRR, March, 2026
Comprehensive Validation of Bridge Module and EBM Loss for One-Class Audio Deepfake Detection.
IEEE Access, 2026
2025
HierSpeech++: Bridging the Gap Between Semantic and Acoustic Representation of Speech by Hierarchical Variational Inference for Zero-Shot Speech Synthesis.
IEEE Trans. Neural Networks Learn. Syst., October, 2025
EmoSphere++: Emotion-Controllable Zero-Shot Text-to-Speech Via Emotion-Adaptive Spherical Vector.
IEEE Trans. Affect. Comput., 2025
Spotlight-TTS: Spotlighting the Style via Voiced-Aware Style Extraction and Style Direction Adjustment for Expressive Text-to-Speech.
Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025
EmoSphere-SER: Enhancing Speech Emotion Recognition Through Spherical Representation with Auxiliary Classification.
Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025
DiEmo-TTS: Disentangled Emotion Representations via Self-Supervised Distillation for Cross-Speaker Emotion Transfer in Text-to-Speech.
Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025
FLowHigh: Towards Efficient and High-Quality Audio Super-Resolution with Single-Step Flow Matching.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025
JELLY: Joint Emotion Recognition and Context Reasoning with LLMs for Conversational Speech Synthesis.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025
FillerSpeech: Towards Human-Like Text-to-Speech Synthesis with Filler Insertion and Filler Style Control.
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025
2024
Audio Super-Resolution With Robust Speech Representation Learning of Masked Autoencoder.
IEEE ACM Trans. Audio Speech Lang. Process., 2024
PromotiCon: Prompt-based Emotion Controllable Text-to-Speech via Prompt Generation and Matching.
Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, 2024
EmoSphere-TTS: Emotional Style and Intensity Modeling via Spherical Emotion Vector for Controllable Emotional Text-to-Speech.
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024
TranSentence: speech-to-speech Translation via Language-Agnostic Sentence-Level Speech Encoding without Language-Parallel Data.
Proceedings of the IEEE International Conference on Acoustics, 2024
2022
HierSpeech: Bridging the Gap between Text and Speech by Hierarchical Variational Inference using Self-supervised Representations for Speech Synthesis.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
EMOQ-TTS: Emotion Intensity Quantization for Fine-Grained Controllable Emotional Text-to-Speech.
Proceedings of the IEEE International Conference on Acoustics, 2022