Hangting Chen

ORCID: 0000-0002-4085-4364

According to our database, Hangting Chen authored at least 43 papers between 2018 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
DualSpeechLM: Towards Unified Speech Understanding and Generation via Dual Speech Token Modeling with Large Language Models.
CoRR, August, 2025

Towards Hallucination-Free Music: A Reinforcement Learning Preference Optimization Framework for Reliable Song Generation.
CoRR, August, 2025

SongBloom: Coherent Song Generation via Interleaved Autoregressive Sketching and Diffusion Refinement.
CoRR, June, 2025

LeVo: High-Quality Song Generation with Multi-Preference Alignment.
CoRR, June, 2025

WAKE: Watermarking Audio with Key Enrichment.
CoRR, June, 2025

Controllable Text-to-Speech Synthesis with Masked-Autoencoded Style-Rich Representation.
CoRR, June, 2025

Layer-wise Investigation of Large-Scale Self-Supervised Music Representation Models.
CoRR, May, 2025

UniSep: Universal Target Audio Separation with Language Models at Scale.
CoRR, March, 2025

MuQ: Self-Supervised Music Representation Learning with Mel Residual Vector Quantization.
CoRR, January, 2025

AudioComposer: Towards Fine-grained Audio Generation with Natural Language Descriptions.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

SongEditor: Adapting Zero-Shot Song Generation Language Model as a Multi-Task Editor.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024
MuCodec: Ultra Low-Bitrate Music Codec.
CoRR, 2024

AudioComposer: Towards Fine-grained Audio Generation with Natural Language Descriptions.
CoRR, 2024

Gull: A Generative Multifunctional Audio Codec.
CoRR, 2024

Continuous Target Speech Extraction: Enhancing Personalized Diarization and Extraction on Complex Recordings.
CoRR, 2024

Continuous Target Speech Extraction: Enhancing Personalized Diarization and Extraction on Complex Recordings.
Proceedings of the International Joint Conference on Neural Networks, 2024

AutoPrep: An Automatic Preprocessing Framework for In-The-Wild Speech Data.
Proceedings of the IEEE International Conference on Acoustics, 2024

Consistent and Relevant: Rethink the Query Embedding in General Sound Separation.
Proceedings of the IEEE International Conference on Acoustics, 2024

Complexity Scaling for Speech Denoising.
Proceedings of the IEEE International Conference on Acoustics, 2024

SECap: Speech Emotion Captioning with Large Language Model.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
How to make embeddings suitable for PLDA.
Comput. Speech Lang., June, 2023

First coarse, fine afterward: A lightweight two-stage complex approach for monaural speech enhancement.
Speech Commun., 2023

High Fidelity Speech Enhancement with Band-split RNN.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Bayes Risk Transducer: Transducer with Controllable Alignment Prediction.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Ultra Dual-Path Compression For Joint Echo Cancellation And Noise Suppression.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

TSpeech-AI System Description to the 5th Deep Noise Suppression (DNS) Challenge.
Proceedings of the IEEE International Conference on Acoustics, 2023

2022
Master-Teacher-Student: A Weakly Labelled Semi-Supervised Framework for Audio Tagging and Sound Event Detection.
IEICE Trans. Inf. Syst., 2022

The HCCL System for the NIST SRE21.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Beam-Guided TasNet: An Iterative Speech Separation Framework with Multi-Channel Output.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

DPT-FSNet: Dual-Path Transformer Based Full-Band and Sub-Band Fusion Network for Speech Enhancement.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
A dual-stream deep attractor network with multi-domain learning for speech dereverberation and separation.
Neural Networks, 2021

Improved Speech Enhancement Using a Complex-Domain GAN with Fused Time-Domain and Time-Frequency Domain Constraints.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Power Pooling: An Adaptive Pooling Function for Weakly Labelled Sound Event Detection.
Proceedings of the International Joint Conference on Neural Networks, 2021

2020
Power pooling: An adaptive pooling function for weakly labelled sound event detection.
CoRR, 2020

Exploring the time-domain deep attractor network with two-stream architectures in a reverberant environment.
CoRR, 2020

ACGAN-based Data Augmentation Integrated with Long-term Scalogram for Acoustic Scene Classification.
CoRR, 2020

Power Pooling Operators and Confidence Learning for Semi-Supervised Sound Event Detection.
CoRR, 2020

Improved Guided Source Separation Integrated with a Strong Back-End for the CHiME-6 Dinner Party Scenario.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

2019
Integrating the Data Augmentation Scheme with Various Classifiers for Acoustic Scene Modeling.
CoRR, 2019

Speaker-Invariant Feature-Mapping for Distant Speech Recognition via Adversarial Teacher-Student Learning.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Audio Scene Classification with Discriminatively-Trained Segment-Level Features.
Proceedings of the IEEE International Conference on Multimedia & Expo Workshops, 2019

An Audio Scene Classification Framework with Embedded Filters and a DCT-based Temporal Module.
Proceedings of the IEEE International Conference on Acoustics, 2019

2018
Deep Convolutional Neural Network with Scalogram for Audio Scene Modeling.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

