Youngjoon Jang

Orcid: 0009-0002-0500-6025

Affiliations:
  • Korea Advanced Institute of Science and Technology (KAIST), Multimodal AI Lab, Daejeon, South Korea


According to our database, Youngjoon Jang authored at least 18 papers between 2022 and 2025.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2025
EDNet: A Distortion-Agnostic Speech Enhancement Framework with Gating Mamba Mechanism and Phase Shift-Invariant Training.
CoRR, June, 2025

Fork-Merge Decoding: Enhancing Multimodal Understanding in Audio-Visual Large Language Models.
CoRR, May, 2025

AVCD: Mitigating Hallucinations in Audio-Visual Large Language Models through Contrastive Decoding.
CoRR, May, 2025

Test-Time Augmentation for Pose-invariant Face Recognition.
CoRR, May, 2025

Deep Understanding of Sign Language for Sign to Subtitle Alignment.
CoRR, March, 2025

VoiceDiT: Dual-Condition Diffusion Transformer for Environment-Aware Speech Synthesis.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Lost in Translation, Found in Context: Sign Language Translation with Contextual Cues.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024
Let Me Finish My Sentence: Video Temporal Grounding with Holistic Text Understanding.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Fregrad: Lightweight and Fast Frequency-Aware Diffusion Vocoder.
Proceedings of the IEEE International Conference on Acoustics, 2024

Seeing Through The Conversation: Audio-Visual Speech Separation Based on Diffusion Model.
Proceedings of the IEEE International Conference on Acoustics, 2024

VoxMM: Rich Transcription of Conversations in the Wild.
Proceedings of the IEEE International Conference on Acoustics, 2024

TalkNCE: Improving Active Speaker Detection with Talk-Aware Contrastive Learning.
Proceedings of the IEEE International Conference on Acoustics, 2024

Slowfast Network for Continuous Sign Language Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2024

Faces that Speak: Jointly Synthesising Talking Face and Speech from Text.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
That's What I Said: Fully-Controllable Talking Face Generation.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Metric Learning for User-Defined Keyword Spotting.
Proceedings of the IEEE International Conference on Acoustics, 2023

Self-Sufficient Framework for Continuous Sign Language Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2023

2022
Signing Outside the Studio: Benchmarking Background Robustness for Continuous Sign Language Recognition.
Proceedings of the 33rd British Machine Vision Conference 2022, 2022

