Proceedings of the 26th Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques, 2023

HyPoradise: An Open Baseline for Generative Speech Recognition with Large Language Models.

Chen Chen

Yuchen Hu

Chao-Han Huck Yang

Sabato Marco Siniscalchi

Pin-Yu Chen

Chng Eng Siong

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

CN-Celeb-AV: A Multi-Genre Audio-Visual Dataset for Person Recognition.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Dual-Path Style Learning for End-to-End Noise-Robust Speech Recognition.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

A Neural State-Space Modeling Approach to Efficient Speech Separation.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Cross-Modal Global Interaction and Local Alignment for Audio-Visual Speech Recognition.

Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

Unifying Speech Enhancement and Separation with Gradient Modulation for End-to-End Noise-Robust Speech Separation.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Gradient Remedy for Multi-Task Learning in End-to-End Noise-Robust Speech Recognition.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

CN-CVS: A Mandarin Audio-Visual Dataset for Large Vocabulary Continuous Visual to Speech Synthesis.

Chen Chen

Dong Wang

Thomas Fang Zheng

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Unsupervised Noise Adaptation Using Data Simulation.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Metric-Oriented Speech Enhancement Using Diffusion Probabilistic Model.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Lifelong Sequence Generation with Dynamic Module Expansion and Adaptation.

Chengwei Qin

Chen Chen

Shafiq Joty

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Study of Generative Adversarial Networks for Noisy Speech Simulation from Clean Speech.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

UniS-MMC: Multimodal Classification via Unimodality-supervised Multimodal Contrastive Learning.

Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

Hearing Lips in Noise: Universal Viseme-Phoneme Mapping and Transfer for Robust Audio-Visual Speech Recognition.

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

MIR-GAN: Refining Frame-Level Modality-Invariant Representations with Adversarial Network for Audio-Visual Speech Recognition.

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Leveraging Modality-Specific Representations for Audio-Visual Speech Recognition via Reinforcement Learning.

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022

Interactive Audio-text Representation for Automated Audio Captioning with Contrastive Learning.

CoRR, 2022

Cycleflow: Purify Information Factors by Cycle Loss.

Proceedings of the Odyssey 2022: The Speaker and Language Recognition Workshop, 28 June, 2022

DENT-DDSP: Data-efficient noisy speech generator using differentiable digital signal processors for explicit distortion modelling and noise-robust speech recognition.

Zixun Guo

Chen Chen

Eng Siong Chng

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Interactive Audio-text Representation for Automated Audio Captioning with Contrastive Learning.

Chen Chen

Nana Hou

Yuchen Hu