Hangting Chen

Orcid: 0000-0002-4085-4364

According to our database, Hangting Chen authored at least 53 papers between 2018 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

OpenSearch-VL: An Open Recipe for Frontier Multimodal Search Agents.

CoRR, May, 2026

Unify-Agent: A Unified Multimodal Agent for World-Grounded Image Synthesis.

CoRR, March, 2026

DualSpeechLM: Towards Unified Speech Understanding and Generation via Dual Speech Token Modeling with Large Language Models.

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025

HunyuanImage 3.0 Technical Report.

CoRR, September, 2025

AUV: Teaching Audio Universal Vector Quantization with Single Nested Codebook.

CoRR, September, 2025

SongPrep: A Preprocessing Framework and End-to-end Model for Full-song Structure Parsing and Lyrics Transcription.

CoRR, September, 2025

Towards Hallucination-Free Music: A Reinforcement Learning Preference Optimization Framework for Reliable Song Generation.

CoRR, August, 2025

SongBloom: Coherent Song Generation via Interleaved Autoregressive Sketching and Diffusion Refinement.

CoRR, June, 2025

LeVo: High-Quality Song Generation with Multi-Preference Alignment.

CoRR, June, 2025

Controllable Text-to-Speech Synthesis with Masked-Autoencoded Style-Rich Representation.

CoRR, June, 2025

Layer-wise Investigation of Large-Scale Self-Supervised Music Representation Models.

Yizhi Zhou

Haina Zhu

Hangting Chen

CoRR, May, 2025

MuQ: Self-Supervised Music Representation Learning with Mel Residual Vector Quantization.

CoRR, January, 2025

LeVo: High-Quality Song Generation with Multi-Preference Alignment.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

MuCodec: Ultra Low-Bitrate Music Codec for Music Generation.

Proceedings of the 33rd ACM International Conference on Multimedia, 2025

TVC-MusicGen: Time-Varying Structure Control for Background Music Generation via Self-Supervised Training.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

WAKE: Watermarking Audio with Key Enrichment.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

TSDT-Net: Ultra-Low-Complexity Two-Stage Model Combining Dual-Path-Transformer and Transform-Average-Concatenate Network for Speech Enhancement.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

UniSep: Universal Target Audio Separation with Language Models at Scale.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2025

AudioComposer: Towards Fine-grained Audio Generation with Natural Language Descriptions.

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

SongEditor: Adapting Zero-Shot Song Generation Language Model as a Multi-Task Editor.

Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024

MuCodec: Ultra Low-Bitrate Music Codec.

CoRR, 2024

AudioComposer: Towards Fine-grained Audio Generation with Natural Language Descriptions.

CoRR, 2024

Gull: A Generative Multifunctional Audio Codec.

CoRR, 2024

Continuous Target Speech Extraction: Enhancing Personalized Diarization and Extraction on Complex Recordings.

CoRR, 2024

Continuous Target Speech Extraction: Enhancing Personalized Diarization and Extraction on Complex Recordings.

Proceedings of the International Joint Conference on Neural Networks, 2024

AutoPrep: An Automatic Preprocessing Framework for In-The-Wild Speech Data.

Proceedings of the IEEE International Conference on Acoustics, 2024

Consistent and Relevant: Rethink the Query Embedding in General Sound Separation.

Proceedings of the IEEE International Conference on Acoustics, 2024

Complexity Scaling for Speech Denoising.

Hangting Chen

Jianwei Yu

Chao Weng

Proceedings of the IEEE International Conference on Acoustics, 2024

SECap: Speech Emotion Captioning with Large Language Model.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

How to make embeddings suitable for PLDA.

Comput. Speech Lang., June, 2023

First coarse, fine afterward: A lightweight two-stage complex approach for monaural speech enhancement.

Speech Commun., 2023

High Fidelity Speech Enhancement with Band-split RNN.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Bayes Risk Transducer: Transducer with Controllable Alignment Prediction.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Ultra Dual-Path Compression For Joint Echo Cancellation And Noise Suppression.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

TSpeech-AI System Description to the 5th Deep Noise Suppression (DNS) Challenge.

Proceedings of the IEEE International Conference on Acoustics, 2023

2022

Master-Teacher-Student: A Weakly Labelled Semi-Supervised Framework for Audio Tagging and Sound Event Detection.

IEICE Trans. Inf. Syst., 2022

The HCCL System for the NIST SRE21.

CoRR, 2022

The HCCL System for the NIST SRE21.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Beam-Guided TasNet: An Iterative Speech Separation Framework with Multi-Channel Output.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

DPT-FSNet: Dual-Path Transformer Based Full-Band and Sub-Band Fusion Network for Speech Enhancement.

Feng Dang

Hangting Chen

Pengyuan Zhang

Proceedings of the IEEE International Conference on Acoustics, 2022

2021

A dual-stream deep attractor network with multi-domain learning for speech dereverberation and separation.

Hangting Chen

Pengyuan Zhang

Neural Networks, 2021

Improved Speech Enhancement Using a Complex-Domain GAN with Fused Time-Domain and Time-Frequency Domain Constraints.

Feng Dang

Pengyuan Zhang

Hangting Chen

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Power Pooling: An Adaptive Pooling Function for Weakly Labelled Sound Event Detection.

Proceedings of the International Joint Conference on Neural Networks, 2021

2020

Power pooling: An adaptive pooling function for weakly labelled sound event detection.

CoRR, 2020

Exploring the time-domain deep attractor network with two-stream architectures in a reverberant environment.

Hangting Chen

Pengyuan Zhang

CoRR, 2020

ACGAN-based Data Augmentation Integrated with Long-term Scalogram for Acoustic Scene Classification.

CoRR, 2020

Power Pooling Operators and Confidence Learning for Semi-Supervised Sound Event Detection.

Yuzhuo Liu

Hangting Chen

Pengyuan Zhang

CoRR, 2020

Improved Guided Source Separation Integrated with a Strong Back-End for the CHiME-6 Dinner Party Scenario.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

2019

Integrating the Data Augmentation Scheme with Various Classifiers for Acoustic Scene Modeling.

CoRR, 2019

Speaker-Invariant Feature-Mapping for Distant Speech Recognition via Adversarial Teacher-Student Learning.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Audio Scene Classification with Discriminatively-Trained Segment-Level Features.

Haichuan Bai

Hangting Chen

Yonghong Yan

Proceedings of the IEEE International Conference on Multimedia & Expo Workshops, 2019

An Audio Scene Classification Framework with Embedded Filters and a DCT-based Temporal Module.

Hangting Chen

Pengyuan Zhang

Yonghong Yan

Proceedings of the IEEE International Conference on Acoustics, 2019

2018

Deep Convolutional Neural Network with Scalogram for Audio Scene Modeling.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
