Jeongsoo Choi
Orcid: 0009-0005-6817-604X
According to our database, Jeongsoo Choi authored at least 28 papers between 2022 and 2026.
Bibliography
2026
CoRR, April, 2026
DiFlowDubber: Discrete Flow Matching for Automated Video Dubbing via Cross-Modal Alignment and Synchronization.
CoRR, March, 2026
2025
Accelerating Diffusion-based Text-to-Speech Model Training with Dual Modality Alignment.
CoRR, May, 2025
CoRR, March, 2025
AlignDiT: Multimodal Aligned Diffusion Transformer for Synchronized Speech Generation.
Proceedings of the 33rd ACM International Conference on Multimedia, 2025
Accelerating Diffusion-based Text-to-Speech Model Training with Dual Modality Alignment.
Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025
ARLON: Boosting Diffusion Transformers with Autoregressive Models for Long Video Generation.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025
MAVFlow: Preserving Paralinguistic Elements with Conditional Flow Matching for Zero-Shot AV2AV Multilingual Translation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025
Accelerating Codec-based Speech Synthesis with Multi-Token Prediction and Speculative Decoding.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025
From Faces to Voices: Learning Hierarchical Representations for High-quality Video-to-Speech.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
2024
AKVSR: Audio Knowledge Empowered Visual Speech Recognition by Compressing Audio Knowledge of a Pretrained Model.
IEEE Trans. Multim., 2024
Textless Unit-to-Unit Training for Many-to-Many Multilingual Speech-to-Speech Translation.
IEEE ACM Trans. Audio Speech Lang. Process., 2024
Multilingual Visual Speech Recognition with a Single Model by Learning with Discrete Visual Speech Units.
CoRR, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Towards Practical and Efficient Image-to-Speech Captioning with Vision-Language Pre-Training and Multi-Modal Tokens.
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
AV2AV: Direct Audio-Visual Speech to Audio-Visual Speech Translation with Unified Audio-Visual Speech Representation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
2023
Many-to-Many Spoken Language Translation via Unified Speech and Text Representation Learning with Unit-to-Unit Translation.
CoRR, 2023
CoRR, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Lip Reading for Low-resource Languages by Learning and Combining General Speech Knowledge and Language-specific Knowledge.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
DiffV2S: Diffusion-based Video-to-Speech Synthesis with Vision-guided Speaker Embedding.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling and Reliability Scoring.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
2022
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022