Cong Han

Orcid: 0000-0003-2121-000X

Affiliations:
  • Columbia University, Department of Electrical Engineering, New York, NY, USA


According to our database, Cong Han authored at least 32 papers between 2019 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
AVMeme Exam: A Multimodal Multilingual Multicultural Benchmark for LLMs' Contextual and Cultural Knowledge and Thinking.
CoRR, January, 2026

2025
Listen, Chat, and Remix: Text-Guided Soundscape Remixing for Enhanced Auditory Experience.
IEEE J. Sel. Top. Signal Process., May, 2025

StyleTTS: A Style-Based Generative Model for Natural and Diverse Text-to-Speech Synthesis.
IEEE J. Sel. Top. Signal Process., January, 2025

StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

Speech Slytherin: Examining the Performance and Efficiency of Mamba for Speech Separation, Recognition, and Synthesis.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Dual-path Mamba: Short and Long-term Bidirectional Selective Structured State Space Models for Speech Separation.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

2024
Listen, Chat, and Edit: Text-Guided Soundscape Modification for Enhanced Auditory Experience.
CoRR, 2024

Exploring Self-supervised Contrastive Learning of Spatial Sound Event Representation.
Proceedings of the IEEE International Conference on Acoustics, 2024

Unsupervised Multi-Channel Separation And Adaptation.
Proceedings of the IEEE International Conference on Acoustics, 2024

2023
HiFTNet: A Fast High-Quality Neural Vocoder with Harmonic-plus-Noise Filter and Inverse Short Time Fourier Transform.
CoRR, 2023

SLMGAN: Exploiting Speech Language Model Representations for Unsupervised Zero-Shot Voice Conversion in GANs.
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2023

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Phoneme-Level BERT for Enhanced Prosody of Text-To-Speech with Grapheme Predictions.
Proceedings of the IEEE International Conference on Acoustics, 2023

Online Binaural Speech Separation Of Moving Speakers With A Wavesplit Network.
Proceedings of the IEEE International Conference on Acoustics, 2023

Improved Decoding of Attentional Selection in Multi-Talker Environments with Self-Supervised Learned Speech Representation.
Proceedings of the 45th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2023

2022
StyleTTS-VC: One-Shot Voice Conversion by Knowledge Transfer From Style-Based TTS Models.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Improving Conversational Recommendation Systems' Quality with Context-Aware Item Meta-Information.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2022, 2022

Multi-Channel Speech Denoising for Machine Ears.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
Group Communication With Context Codec for Lightweight Source Separation.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Dual-Path RNN for Long Recording Speech Separation.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Distortion-Controlled Training for end-to-end Reverberant Speech Separation with Auxiliary Autoencoding Loss.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Empirical Analysis of Generalized Iterative Speech Separation Networks.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Binaural Speech Separation of Moving Speakers With Preserved Spatial Cues.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Continuous Speech Separation Using Speaker Inventory for Long Recording.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Ultra-Lightweight Speech Separation Via Group Communication.
Proceedings of the IEEE International Conference on Acoustics, 2021

Rethinking The Separation Layers In Speech Separation Networks.
Proceedings of the IEEE International Conference on Acoustics, 2021

Dual-Path Modeling for Long Recording Speech Separation in Meetings.
Proceedings of the IEEE International Conference on Acoustics, 2021

2020
Continuous Speech Separation Using Speaker Inventory for Long Multi-talker Recording.
CoRR, 2020

Group Communication with Context Codec for Ultra-Lightweight Source Separation.
CoRR, 2020

Real-Time Binaural Speech Separation with Preserved Spatial Cues.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Online Deep Attractor Network for Real-time Single-channel Speech Separation.
Proceedings of the IEEE International Conference on Acoustics, 2019

FaSNet: Low-Latency Adaptive Beamforming for Multi-Microphone Audio Processing.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

