Chong Zhang

Orcid: 0000-0002-2162-4344

Affiliations:

Alibaba Group, Speech Lab of DAMO Academy, Singapore
National University of Singapore, Department of Electrical and Computer Engineering, Singapore (PhD 2017)

According to our database, Chong Zhang authored at least 38 papers between 2015 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2025

Mismatch Aware Guidance for Robust Emotion Control in Auto-Regressive TTS Models.

CoRR, October, 2025

OmniDRCA: Parallel Speech-Text Foundation Model via Dual-Resolution Speech Representations and Contrastive Alignment.

CoRR, June, 2025

InspireMusic: Integrating Super Resolution and Large Language Model for High-Fidelity Long-Form Music Generation.

CoRR, March, 2025

MinMo: A Multimodal Large Language Model for Seamless Voice Interaction.

CoRR, January, 2025

Plug-and-Play Co-Occurring Face Attention for Robust Audio-Visual Speaker Extraction.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Online Audio-Visual Autoregressive Speaker Extraction.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Conditional Latent Diffusion-Based Speech Enhancement via Dual Context Learning.

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

HiFi-SR: A Unified Generative Transformer-Convolutional Adversarial Network for High-Fidelity Speech Super-Resolution.

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

UniCodec: Unified Audio Codec with Single Domain-Adaptive Codebook.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024

Fine-Tuning Channel-Pruned Deep Model via Knowledge Distillation.

J. Comput. Sci. Technol., November, 2024

Tuning Large Language Model for Speech Recognition With Mixed-Scale Re-Tokenization.

IEEE Signal Process. Lett., 2024

Emotional Dimension Control in Language Model-Based Text-to-Speech: Spanning a Broad Spectrum of Human Emotions.

CoRR, 2024

Skip-Layer Attention: Bridging Abstract and Detailed Dependencies in Transformers.

CoRR, 2024

Phonetic Enhanced Language Modeling for Text-to-Speech Synthesis.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

MossFormer2: Combining Transformer and RNN-Free Recurrent Network for Enhanced Time-Domain Monaural Speech Separation.

Proceedings of the IEEE International Conference on Acoustics, 2024

SPGM: Prioritizing Local Features for Enhanced Speech Separation Performance.

Proceedings of the IEEE International Conference on Acoustics, 2024

Are Soft Prompts Good Zero-Shot Learners for Speech Recognition?

Fabian Ritter Gutierrez

Proceedings of the IEEE International Conference on Acoustics, 2024

Loss Masking Is Not Needed In Decoder-Only Transformer For Discrete-Token-Based ASR.

Proceedings of the IEEE International Conference on Acoustics, 2024

2023

ACA-Net: Towards Lightweight Speaker Verification using Asymmetric Cross Attention.

CoRR, 2023

deHuBERT: Disentangling Noise in a Self-supervised Model for Robust Speech Recognition.

CoRR, 2023

ACA-Net: Towards Lightweight Speaker Verification using Asymmetric Cross Attention.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Dual-Memory Multi-Modal Learning for Continual Spoken Keyword Spotting with Confidence Selection and Diversity Enhancement.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

A Unified Recognition and Correction Model under Noisy and Accent Speech Conditions.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Dual Acoustic Linguistic Self-supervised Representation Learning for Cross-Domain Speech Recognition.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Adapter-tuning with Effective Token-dependent Representation Shift for Automatic Speech Recognition.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Adaptive Knowledge Distillation Between Text and Speech Pre-Trained Models.

Proceedings of the IEEE International Conference on Acoustics, 2023

Contrastive Speech Mixup for Low-Resource Keyword Spotting.

Proceedings of the IEEE International Conference on Acoustics, 2023

De'HuBERT: Disentangling Noise in a Self-Supervised Model for Robust Speech Recognition.

Proceedings of the IEEE International Conference on Acoustics, 2023

Auxiliary Pooling Layer For Spoken Language Understanding.

Proceedings of the IEEE International Conference on Acoustics, 2023

Ditto: A Simple and Efficient Approach to Improve Sentence Embeddings.

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

2022

I2CR: Improving Noise Robustness on Keyword Spotting Using Inter-Intra Contrastive Regularization.

CoRR, 2022

2019

A Cost-Sensitive Deep Belief Network for Imbalanced Classification.

IEEE Trans. Neural Networks Learn. Syst., 2019

2018

A Multi-State Diagnosis and Prognosis Framework with Feature Learning for Tool Condition Monitoring.

CoRR, 2018

Gated Recurrent Units Based Neural Network For Tool Condition Monitoring.

Proceedings of the 2018 International Joint Conference on Neural Networks, 2018

2017

Multiobjective Deep Belief Networks Ensemble for Remaining Useful Life Estimation in Prognostics.

IEEE Trans. Neural Networks Learn. Syst., 2017

A data-driven prognostics framework for tool remaining useful life estimation in tool condition monitoring.

Proceedings of the 22nd IEEE International Conference on Emerging Technologies and Factory Automation, 2017

2016

Training cost-sensitive Deep Belief Networks on imbalance data problems.

Chong Zhang, Kay Chen Tan, Ruoxu Ren

Proceedings of the 2016 International Joint Conference on Neural Networks, 2016

2015

Deep Belief Networks Ensemble with Multi-objective Optimization for Failure Diagnosis.

Chong Zhang, Jia Hui Sun, Kay Chen Tan

Proceedings of the 2015 IEEE International Conference on Systems, 2015
