Cong Han

ORCID: 0000-0003-2121-000X

Affiliations:

Columbia University, Department of Electrical Engineering, New York, NY, USA

According to our database¹, Cong Han authored at least 32 papers between 2019 and 2026.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2026

AVMeme Exam: A Multimodal Multilingual Multicultural Benchmark for LLMs' Contextual and Cultural Knowledge and Thinking.

[DOI]

CoRR, January, 2026

2025

Listen, Chat, and Remix: Text-Guided Soundscape Remixing for Enhanced Auditory Experience.

[DOI]

,

,

Yinghao Aaron Li

,

IEEE J. Sel. Top. Signal Process., May, 2025

StyleTTS: A Style-Based Generative Model for Natural and Diverse Text-to-Speech Synthesis.

[DOI]

Yinghao Aaron Li

,

,

IEEE J. Sel. Top. Signal Process., January, 2025

StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion.

[DOI]

Yinghao Aaron Li

,

,

,

Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

Speech Slytherin: Examining the Performance and Efficiency of Mamba for Speech Separation, Recognition, and Synthesis.

[DOI]

,

Yinghao Aaron Li

,

Adrian Nicolas Florea

,

,

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Dual-path Mamba: Short and Long-term Bidirectional Selective Structured State Space Models for Speech Separation.

[DOI]

,

,

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

2024

Listen, Chat, and Edit: Text-Guided Soundscape Modification for Enhanced Auditory Experience.

[DOI]

,

,

Yinghao Aaron Li

,

CoRR, 2024

Exploring Self-supervised Contrastive Learning of Spatial Sound Event Representation.

[DOI]

,

,

Yinghao Aaron Li

,

Proceedings of the IEEE International Conference on Acoustics, 2024

Unsupervised Multi-Channel Separation And Adaptation.

[DOI]

,

Kevin W. Wilson

,

,

John R. Hershey

Proceedings of the IEEE International Conference on Acoustics, 2024

2023

HiFTNet: A Fast High-Quality Neural Vocoder with Harmonic-plus-Noise Filter and Inverse Short Time Fourier Transform.

[DOI]

Yinghao Aaron Li

,

,

,

CoRR, 2023

SLMGAN: Exploiting Speech Language Model Representations for Unsupervised Zero-Shot Voice Conversion in GANs.

[DOI]

Yinghao Aaron Li

,

,

Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2023

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models.

[DOI]

Yinghao Aaron Li

,

,

Vinay S. Raghavan

,

,

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Phoneme-Level BERT for Enhanced Prosody of Text-To-Speech with Grapheme Predictions.

[DOI]

Yinghao Aaron Li

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2023

Online Binaural Speech Separation Of Moving Speakers With A Wavesplit Network.

[DOI]

,

Proceedings of the IEEE International Conference on Acoustics, 2023

Improved Decoding of Attentional Selection in Multi-Talker Environments with Self-Supervised Learned Speech Representation.

[DOI]

,

Vishal Choudhari

,

Yinghao Aaron Li

,

Proceedings of the 45th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2023

2022

StyleTTS-VC: One-Shot Voice Conversion by Knowledge Transfer From Style-Based TTS Models.

[DOI]

Yinghao Aaron Li

,

,

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Improving Conversational Recommendation Systems' Quality with Context-Aware Item Meta-Information.

[DOI]

,

,

,

,

Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2022, 2022

Multi-Channel Speech Denoising for Machine Ears.

[DOI]

,

Emine Merve Kaya

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2022

2021

Group Communication With Context Codec for Lightweight Source Separation.

[DOI]

,

,

IEEE ACM Trans. Audio Speech Lang. Process., 2021

Dual-Path RNN for Long Recording Speech Separation.

[DOI]

,

,

,

,

Takuya Yoshioka

,

,

,

Keisuke Kinoshita

,

Christoph Böddeker

,

,

Shinji Watanabe

,

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Distortion-Controlled Training for end-to-end Reverberant Speech Separation with Auxiliary Autoencoding Loss.

[DOI]

,

,

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Empirical Analysis of Generalized Iterative Speech Separation Networks.

[DOI]

,

,

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Binaural Speech Separation of Moving Speakers With Preserved Spatial Cues.

[DOI]

,

,

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Continuous Speech Separation Using Speaker Inventory for Long Recording.

[DOI]

,

,

,

,

Keisuke Kinoshita

,

Shinji Watanabe

,

,

,

John R. Hershey

,

,

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Ultra-Lightweight Speech Separation Via Group Communication.

[DOI]

,

,

Proceedings of the IEEE International Conference on Acoustics, 2021

Rethinking The Separation Layers In Speech Separation Networks.

[DOI]

,

,

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2021

Dual-Path Modeling for Long Recording Speech Separation in Meetings.

[DOI]

,

,

,

,

,

Keisuke Kinoshita

,

,

Shinji Watanabe

,

Proceedings of the IEEE International Conference on Acoustics, 2021

2020

Continuous Speech Separation Using Speaker Inventory for Long Multi-talker Recording.

[DOI]

,

,

,

,

Keisuke Kinoshita

,

Shinji Watanabe

,

,

,

John R. Hershey

,

,

CoRR, 2020

Group Communication with Context Codec for Ultra-Lightweight Source Separation.

[DOI]

,

,

CoRR, 2020

Real-Time Binaural Speech Separation with Preserved Spatial Cues.

[DOI]

,

,

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019

Online Deep Attractor Network for Real-time Single-channel Speech Separation.

[DOI]

,

,

Proceedings of the IEEE International Conference on Acoustics, 2019

FaSNet: Low-Latency Adaptive Beamforming for Multi-Microphone Audio Processing.

[DOI]

,

,

,

,

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019