We stand with Ukraine

Chanwoo Kim

Orcid: 0000-0003-4085-2470

Affiliations:

Samsung Research, Seoul, South Korea

According to our database¹, Chanwoo Kim authored at least 75 papers between 2005 and 2025.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Links

Online presence:

on orcid.org
on chanwcom.github.io

Bibliography

2025

OleSpeech-IV: A Large-Scale Multispeaker and Multilingual Conversational Speech Dataset with Diverse Topics.

[DOI]

,

,

,

,

Xavier Menéndez-Pidal

,

,

,

,

,

CoRR, September, 2025

2024

Physics Informed Distillation for Diffusion Models.

[DOI]

Joshua Tian Jin Tee

,

,

,

Dhananjaya N. Gowda

,

,

Trans. Mach. Learn. Res., 2024

Wave-U-Mamba: An End-To-End Framework For High-Quality And Efficient Speech Super Resolution.

[DOI]

,

CoRR, 2024

Data Driven Grapheme-to-Phoneme Representations for a Lexicon-Free Text-to-Speech.

[DOI]

,

,

,

,

Dhananjaya Gowda

Proceedings of the IEEE International Conference on Acoustics, 2024

AdaMER-CTC: Connectionist Temporal Classification with Adaptive Maximum Entropy Regularization for Automatic Speech Recognition.

[DOI]

,

,

,

,

Mark Hasegawa-Johnson

,

Proceedings of the IEEE International Conference on Acoustics, 2024

MELS-TTS: Multi-Emotion Multi-Lingual Multi-Speaker Text-To-Speech System Via Disentangled Style Tokens.

[DOI]

,

,

,

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2024

Latent Filling: Latent Space Data Augmentation for Zero-Shot Speech Synthesis.

[DOI]

,

,

,

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2024

2023

On the compression of shallow non-causal ASR models using knowledge distillation and tied-and-reduced decoder for low-latency on-device speech recognition.

[DOI]

,

,

Chintigari Shiva Kumar

,

Shatrughan Singh

,

,

,

Dhananjaya Gowda

CoRR, 2023

Mitigating the Exposure Bias in Sentence-Level Grapheme-to-Phoneme (G2P) Transduction.

[DOI]

,

,

Dhananjaya Gowda

,

,

,

John B. Harvill

,

,

Mark Hasegawa-Johnson

,

,

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Hierarchical Timbre-Cadence Speaker Encoder for Zero-shot Speech Synthesis.

[DOI]

,

,

,

,

,

,

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Counterfactual Two-Stage Debiasing For Video Corpus Moment Retrieval.

[DOI]

,

,

,

,

,

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2023

Self-Supervised Accent Learning for Under-Resourced Accents Using Native Language Data.

[DOI]

,

,

Dhananjaya Gowda

,

,

Proceedings of the IEEE International Conference on Acoustics, 2023

2022

An Empirical Study on L2 Accents of Cross-lingual Text-to-Speech Systems via Vowel Space.

[DOI]

,

,

,

,

,

,

CoRR, 2022

Into-TTS: Intonation Template based Prosody Control System.

[DOI]

,

,

,

,

,

CoRR, 2022

Conformer-Based on-Device Streaming Speech Recognition with KD Compression and Two-Pass Architecture.

[DOI]

,

,

,

,

Dhairya Sandhyana

,

,

,

,

,

,

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Macro-Block Dropout for Improved Regularization in Training End-to-End Speech Recognition Models.

[DOI]

,

Sathish Indurti

,

,

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Cross-Modal Decision Regularization for Simultaneous Speech Translation.

[DOI]

Mohd Abbas Zaidi

,

,

,

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Prototypical speaker-interference loss for target voice separation using non-parallel audio samples.

[DOI]

,

Dhananjaya Gowda

,

,

,

,

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

2021

Decision Attentive Regularization to Improve Simultaneous Speech Translation Systems.

[DOI]

Mohd Abbas Zaidi

,

,

Nikhil Kumar Lakumarapu

,

,

CoRR, 2021

Convolution-Based Attention Model With Positional Encoding For Streaming Speech Recognition On Embedded Devices.

[DOI]

,

,

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Streaming End-to-End Speech Recognition with Jointly Trained Neural Feature Enhancement.

[DOI]

,

,

Dhananjaya Gowda

,

,

Proceedings of the IEEE International Conference on Acoustics, 2021

Task Aware Multi-Task Learning for Speech to Text Tasks.

[DOI]

Sathish Reddy Indurthi

,

Mohd Abbas Zaidi

,

Nikhil Kumar Lakumarapu

,

,

,

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2021

Neural Utterance Confidence Measure for RNN-Transducers and Two Pass Models.

[DOI]

,

,

Dhananjaya Gowda

,

,

,

Shatrughan Singh

,

Proceedings of the IEEE International Conference on Acoustics, 2021

Comparative Study of Different Tokenization Strategies for Streaming End-to-End ASR.

[DOI]

,

,

,

Dhananjaya Gowda

,

Shatrughan Singh

,

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

A Comparison of Streaming Models and Data Augmentation Methods for Robust Speech Recognition.

[DOI]

,

,

Dhananjaya Gowda

,

,

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

Semi-Supervised Transfer Learning for Language Expansion of End-to-End Speech Recognition Models to Low-Resource Languages.

[DOI]

,

,

Dhananjaya Gowda

,

,

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

Voice to Action: Spoken Language Understanding for Memory-Constrained Systems.

[DOI]

,

Aditya Jayasimha

,

,

Shatrughan Singh

,

Dhananjaya Gowda

,

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

HiTNet: Byte-to-BPE Hierarchical Transcription Network for End-to-End Speech Recognition.

[DOI]

Dhananjaya Gowda

,

,

,

,

,

,

,

Nauman Dawalatabad

,

,

Shatrughan Singh

,

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

Two-Pass End-to-End ASR Model Compression.

[DOI]

Nauman Dawalatabad

,

,

,

,

Shatrughan Singh

,

Dhananjaya Gowda

,

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020

Faster Re-translation Using Non-Autoregressive Model For Simultaneous Neural Machine Translation.

[DOI]

,

Sathish Reddy Indurthi

,

Mohd Abbas Zaidi

,

Nikhil Kumar Lakumarapu

,

,

,

,

CoRR, 2020

Utterance Confidence Measure for End-to-End Speech Recognition with Applications to Distributed Speech Recognition Scenarios.

[DOI]

,

,

Dhananjaya Gowda

,

,

Shatrughan Singh

,

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Utterance Invariant Training for Hybrid Two-Pass End-to-End Speech Recognition.

[DOI]

Dhananjaya Gowda

,

,

,

,

,

,

,

,

,

Shatrughan Singh

,

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Streaming On-Device End-to-End ASR System for Privacy-Sensitive Voice-Typing.

[DOI]

,

Gowtham P. Vadisetti

,

Dhananjaya Gowda

,

,

Aditya Jayasimha

,

,

,

,

,

,

,

,

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Hierarchical Multi-Stage Word-to-Grapheme Named Entity Corrector for Automatic Speech Recognition.

[DOI]

,

,

Dhananjaya Gowda

,

Shatrughan Singh

,

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Small Energy Masking for Improved Neural Network Training for End-To-End Speech Recognition.

[DOI]

,

,

Sathish Reddy Indurthi

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

End-end Speech-to-Text Translation with Modality Agnostic Meta-Learning.

[DOI]

Sathish Reddy Indurthi

,

,

Nikhil Kumar Lakumarapu

,

,

,

,

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

A Review of On-Device Fully Neural End-to-End Automatic Speech Recognition Algorithms.

[DOI]

,

Dhananjaya Gowda

,

,

,

,

,

,

Proceedings of the 54th Asilomar Conference on Signals, Systems, and Computers, 2020

2019

Data Efficient Direct Speech-to-Text Translation with Modality Agnostic Meta-Learning.

[DOI]

Sathish Reddy Indurthi

,

,

Nikhil Kumar Lakumarapu

,

,

,

,

CoRR, 2019

Improved Vocal Tract Length Perturbation for a State-of-the-Art End-to-End Speech Recognition System.

[DOI]

,

,

,

Dhananjaya Gowda

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Multi-Task Multi-Resolution Char-to-BPE Cross-Attention Decoder for End-to-End Speech Recognition.

[DOI]

Dhananjaya Gowda

,

,

,

,

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Robust Recognition of Reverberant and Noisy Speech Using Coherence-based Processing.

[DOI]

,

,

Richard M. Stern

Proceedings of the IEEE International Conference on Acoustics, 2019

End-to-End Training of a Large Vocabulary End-to-End Speech Recognition System.

[DOI]

,

,

Shatrughan Singh

,

,

Dhananjaya Gowda

,

,

,

,

,

,

,

,

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Power-Law Nonlinearity with Maximally Uniform Distribution Criterion for Improved Neural Network Training in Automatic Speech Recognition.

[DOI]

,

,

,

Dhananjaya Gowda

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Attention Based On-Device Streaming Speech Recognition with Large Speech Corpus.

[DOI]

,

,

,

,

,

,

Dhananjaya Gowda

,

,

,

,

,

,

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Improved Multi-Stage Training of Online Attention-Based Encoder-Decoder Models.

[DOI]

,

Dhananjaya Gowda

,

,

,

,

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018

A Comparative Study of Spatial Speech Separation Techniques to Improve Speech Recognition.

[DOI]

,

,

,

,

,

Richard M. Stern

Proceedings of the Advances in Neural Networks - ISNN 2018, 2018

Efficient Implementation of the Room Simulator for Training Deep Neural Network Acoustic Models.

[DOI]

,

,

,

Michiel Bacchiani

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Spectral Distortion Model for Training Phase-Sensitive Deep-Neural Networks for Far-Field Speech Recognition.

[DOI]

,

Tara N. Sainath

,

,

,

Rajeev C. Nongpiur

,

Michiel Bacchiani

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Sound Source Separation Using Phase Difference and Reliable Mask Selection.

[DOI]

,

,

Michiel Bacchiani

,

Richard M. Stern

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017

Multichannel Signal Processing With Deep Neural Networks for Automatic Speech Recognition.

[DOI]

Tara N. Sainath

,

,

Kevin W. Wilson

,

,

,

,

Michiel Bacchiani

,

,

Andrew W. Senior

,

,

,

IEEE ACM Trans. Audio Speech Lang. Process., 2017

Robust Speech Recognition Based on Binaural Auditory Processing.

[DOI]

,

,

Richard M. Stern

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Acoustic Modeling for Google Home.

[DOI]

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Generation of Large-Scale Simulated Utterances in Virtual Rooms to Train Deep-Neural Networks for Far-Field Speech Recognition in Google Home.

[DOI]

,

,

,

,

,

Tara N. Sainath

,

Michiel Bacchiani

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Binaural processing for robust recognition of degraded speech.

[DOI]

,

,

,

Richard M. Stern

Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

Raw Multichannel Processing Using Deep Neural Networks.

[DOI]

Tara N. Sainath

,

,

Kevin W. Wilson

,

,

Michiel Bacchiani

,

,

,

,

Andrew W. Senior

,

,

,

Proceedings of the New Era for Robust Speech Recognition, Exploiting Deep Learning., 2017

2016

A Subband-Based Stationary-Component Suppression Method Using Harmonics and Power Ratio for Reverberant Speech Recognition.

[DOI]

,

,

,

,

Richard M. Stern

,

IEEE Signal Process. Lett., 2016

2015

Sound source separation algorithm using phase difference and angle distribution modeling near the target.

[DOI]

,

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

2014

Robust speech recognition in reverberant environments using subband-based steady-state monaural and binaural suppression.

[DOI]

,

Matthew Maciejewski

,

,

Richard M. Stern

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Robust speech recognition using temporal masking and thresholding algorithm.

[DOI]

,

,

Michiel Bacchiani

,

Richard M. Stern

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

2012

Power-Normalized Cepstral Coefficients (PNCC) for robust speech recognition.

[DOI]

,

Richard M. Stern

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Two-microphone source separation algorithm based on statistical modeling of angle distributions.

[DOI]

,

Charbel El Khawand

,

Richard M. Stern

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

2011

Delta-spectral cepstral coefficients for robust speech recognition.

[DOI]

,

,

Richard M. Stern

Proceedings of the IEEE International Conference on Acoustics, 2011

Binaural sound source separation motivated by auditory processing.

[DOI]

,

,

Richard M. Stern

Proceedings of the IEEE International Conference on Acoustics, 2011

2010

Automatic selection of thresholds for signal separation algorithms based on interaural delay.

[DOI]

,

Richard M. Stern

,

,

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Nonlinear enhancement of onset for robust speech recognition.

[DOI]

,

Richard M. Stern

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Feature extraction for robust speech recognition based on maximizing the sharpness of the power distribution and on power flooring.

[DOI]

,

Richard M. Stern

Proceedings of the IEEE International Conference on Acoustics, 2010

2009

Feature extraction for robust speech recognition using a power-law nonlinearity and power-bias subtraction.

[DOI]

,

Richard M. Stern

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Signal separation for robust speech recognition based on phase difference information obtained in the frequency domain.

[DOI]

,

,

,

Richard M. Stern

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Power function-based power distribution normalization algorithm for robust speech recognition.

[DOI]

,

Richard M. Stern

Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009

Robust speech recognition using a Small Power Boosting algorithm.

[DOI]

,

,

Richard M. Stern

Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009

2008

Robust signal-to-noise ratio estimation based on waveform amplitude distribution analysis.

[DOI]

,

Richard M. Stern

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

2006

Efficient Media Synchronization Method for Video Telephony System.

[DOI]

,

,

IEICE Trans. Inf. Syst., 2006

A Robust Formant Extraction Algorithm Combining Spectral Peak Picking and Root Polishing.

[DOI]

,

,

EURASIP J. Adv. Signal Process., 2006

Physiologically-motivated synchrony-based processing for robust automatic speech recognition.

[DOI]

,

Yu-Hsiang Bosco Chiu

,

Richard M. Stern

Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

2005

Robust DTW-based recognition algorithm for hand-held consumer devices.

[DOI]

,

IEEE Trans. Consumer Electron., 2005
