Seung-Bin Kim

Orcid: 0000-0002-2287-9111

According to our database, Seung-Bin Kim authored at least 18 papers between 2022 and 2026.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2026
ImmersiveTTS: Environment-Aware Text-to-Speech with Multimodal Diffusion Transformer and Domain-Specific Representation Alignment.
CoRR, May, 2026

Affectron: Emotional Speech Synthesis with Affective and Contextually Aligned Nonverbal Vocalizations.
CoRR, March, 2026

Toward Complex-Valued Neural Networks for Waveform Generation.
CoRR, March, 2026

Comprehensive Validation of Bridge Module and EBM Loss for One-Class Audio Deepfake Detection.
IEEE Access, 2026

2025
HierSpeech++: Bridging the Gap Between Semantic and Acoustic Representation of Speech by Hierarchical Variational Inference for Zero-Shot Speech Synthesis.
IEEE Trans. Neural Networks Learn. Syst., October, 2025

EmoSphere++: Emotion-Controllable Zero-Shot Text-to-Speech Via Emotion-Adaptive Spherical Vector.
IEEE Trans. Affect. Comput., 2025

Spotlight-TTS: Spotlighting the Style via Voiced-Aware Style Extraction and Style Direction Adjustment for Expressive Text-to-Speech.
Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

EmoSphere-SER: Enhancing Speech Emotion Recognition Through Spherical Representation with Auxiliary Classification.
Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

DiEmo-TTS: Disentangled Emotion Representations via Self-Supervised Distillation for Cross-Speaker Emotion Transfer in Text-to-Speech.
Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

FLowHigh: Towards Efficient and High-Quality Audio Super-Resolution with Single-Step Flow Matching.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

JELLY: Joint Emotion Recognition and Context Reasoning with LLMs for Conversational Speech Synthesis.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

FillerSpeech: Towards Human-Like Text-to-Speech Synthesis with Filler Insertion and Filler Style Control.
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

2024
Audio Super-Resolution With Robust Speech Representation Learning of Masked Autoencoder.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

PromotiCon: Prompt-based Emotion Controllable Text-to-Speech via Prompt Generation and Matching.
Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, 2024

EmoSphere-TTS: Emotional Style and Intensity Modeling via Spherical Emotion Vector for Controllable Emotional Text-to-Speech.
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

TranSentence: speech-to-speech Translation via Language-Agnostic Sentence-Level Speech Encoding without Language-Parallel Data.
Proceedings of the IEEE International Conference on Acoustics, 2024

2022
HierSpeech: Bridging the Gap between Text and Speech by Hierarchical Variational Inference using Self-supervised Representations for Speech Synthesis.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

EMOQ-TTS: Emotion Intensity Quantization for Fine-Grained Controllable Emotional Text-to-Speech.
Proceedings of the IEEE International Conference on Acoustics, 2022

