Youngmoon Jung

ORCID: 0000-0002-4321-379X

According to our database, Youngmoon Jung authored at least 26 papers between 2017 and 2026.

Collaborative distances:

Dijkstra number of five.
Erdős number of four.

Bibliography

2026

MATE: Matryoshka Audio-Text Embeddings for Open-Vocabulary Keyword Spotting.

Joon-Young Yang

CoRR, January, 2026

DAME: Duration-Aware Matryoshka Embedding for Duration-Robust Speaker Verification.

Joon-Young Yang

CoRR, January, 2026

2025

Adversarial Deep Metric Learning for Cross-Modal Audio-Text Alignment in Open-Vocabulary Keyword Spotting.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Text-Aware Adapter for Few-Shot Keyword Spotting.

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

2024

Relational Proxy Loss for Audio-Text based Keyword Spotting.

Joon-Young Yang

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

CTC-aligned Audio-Text Embedding for Streaming Open-vocabulary Keyword Spotting.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

2022

FitHuBERT: Going Thinner and Deeper for Knowledge Distillation of Speech Self-Supervised Learning.

CoRR, 2022

Learning to Maximize Speech Quality Directly Using MOS Prediction for Neural Text-to-Speech.

IEEE Access, 2022

FitHuBERT: Going Thinner and Deeper for Knowledge Distillation of Speech Self-Supervised Models.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

2021

Neural MOS Prediction for Synthesized Speech Using Multi-Task Learning with Spoofing Detection and Spoofing Type Classification.

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

2020

Perceptually Guided End-to-End Text-to-Speech.

CoRR, 2020

Multi-Scale Aggregation Using Feature Pyramid Module for Text-Independent Speaker Verification.

CoRR, 2020

A Unified Deep Learning Framework for Short-Duration Speaker Verification in Adverse Environments.

IEEE Access, 2020

Dual Attention in Time and Frequency Domain for Voice Activity Detection.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Meta-Learning for Short Utterance Speaker Recognition with Imbalance Length Pairs.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Improving Multi-Scale Aggregation Using Feature Pyramid Module for Robust Speaker Verification of Variable-Duration Utterances.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Multi-Task Network for Noise-Robust Keyword Spotting and Speaker Verification Using CTC-Based Soft VAD and Global Query Attention.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Deep MOS Predictor for Synthetic Speech Using Cluster-Based Modeling.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Dynamic Noise Embedding: Noise Aware Training and Adaptation for Speech Enhancement.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

2019

Spatial Pyramid Encoding with Convex Length Normalization for Text-Independent Speaker Verification.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Additional Shared Decoder on Siamese Multi-View Encoders for Learning Acoustic Word Embeddings.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Self-Adaptive Soft Voice Activity Detection Using Deep Neural Networks for Robust Speaker Verification.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018

Learning acoustic word embeddings with phonetically associated triplet network.

CoRR, 2018

Joint Learning Using Denoising Variational Autoencoders for Voice Activity Detection.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

2017

Development of distant multi-channel speech and noise databases for speech recognition by in-door conversational robots.

Proceedings of the 20th Conference of the Oriental Chapter of the International Coordinating Committee on Speech Databases and Speech I/O Systems and Assessment, 2017

Linear-scale filterbank for deep neural network-based voice activity detection.

Proceedings of the 20th Conference of the Oriental Chapter of the International Coordinating Committee on Speech Databases and Speech I/O Systems and Assessment, 2017
