We stand with Ukraine

Rongfeng Su

ORCID: 0000-0002-7228-5768

According to our database, Rongfeng Su authored at least 37 papers between 2013 and 2026.

Collaborative distances:

Dijkstra number of five.
Erdős number of four.

Bibliography

2026

Learning to Attend to Depression-Related Patterns: An Adaptive Cross-Modal Gating Network for Depression Detection.

CoRR, April, 2026

Towards unified diffusion model for speech to ultrasound tongue imaging synthesis.

Manwa Lawrence Ng

Inf. Fusion, 2026

2025

UTI-LLM: A Personalized Articulatory-Speech Therapy Assistance System Based on Multimodal Large Language Model.

CoRR, September, 2025

KGMV-net: Knowledge-guided multi-view network for audio-visual dysarthria severity assessment.

Knowl. Based Syst., 2025

Emotion-Guided Graph Attention Networks for Speech-Based Depression Detection under Emotion-Inducting Tasks.

Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025

Chain-of-Thought Distillation for ASR Error Correction with Multimodal Large Language Models.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2025

A Psychological Strategy Annotation Method Using Multiple LLMs with a Chain of Thought Based on Deductive Reasoning.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2025

2024

Investigating Acoustic-Textual Emotional Inconsistency Information for Automatic Depression Detection.

CoRR, 2024

Multi-source-Domain Adaptation for TMS-EEG Based Depression Detection.

Proceedings of the Social Robotics - 16th International Conference, 2024

Feature Extraction Method Based on Contrastive Learning for Dysarthria Detection.

Proceedings of the Social Robotics - 16th International Conference, 2024

A Transformer-Based Depression Detection Network Leveraging Speech Emotional Expression Cues.

Proceedings of the Social Robotics - 16th International Conference, 2024

The Open-Access Mandarin Subacute Stroke Dysarthria Multimodal (MSDM) Database for Intelligent Assessment.

Proceedings of the 14th IEEE International Symposium on Chinese Spoken Language Processing, 2024

Optical Flow Guided Tongue Trajectory Generation for Diffusion-based Acoustic to Articulatory Inversion.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Depression Enhances Internal Inconsistency between Spoken and Semantic Emotion: Evidence from the Analysis of Emotion Expression in Conversation.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

An Audio-Textual Diffusion Model for Converting Speech Signals into Ultrasound Tongue Imaging Data.

Proceedings of the IEEE International Conference on Acoustics, 2024

2023

Audio-video database from subacute stroke patients for dysarthric speech intelligence assessment and preliminary analysis.

Manwa Lawrence Ng

Biomed. Signal Process. Control., 2023

On-the-Fly Feature Based Rapid Speaker Adaptation for Dysarthric and Elderly Speech Recognition.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

2022

On-the-fly Feature Based Speaker Adaptation for Dysarthric and Elderly Speech Recognition.

CoRR, 2022

Respiratory and laryngeal influences on voice in post-stroke dysarthria: a pilot study.

Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

A Phone-Level Speaker Embedding Extraction Framework with Multi-Gate Mixture-of-Experts Based Multi-Task Learning.

Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

A Multi-level Acoustic Feature Extraction Framework for Transformer Based End-to-End Speech Recognition.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

A New Method for Predicting Severity Level of Dysarthric Speech Based on Joint Feature-Sample Selection using Audio-Visual Data.

Proceedings of the International Conference on Asian Language Processing, 2022

An Investigation of Magnitude-Based and Phase-Based Features for Large-Scale Speaker Identification.

Proceedings of the International Conference on Asian Language Processing, 2022

2020

Cross-Domain Deep Visual Feature Generation for Mandarin Audio-Visual Speech Recognition.

IEEE ACM Trans. Audio Speech Lang. Process., 2020

Exploiting Cross-Domain Visual Feature Generation for Disordered Speech Recognition.

Shi-Xiong Zhang

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

2019

Towards the Speech Features of Early-Stage Dementia: Design and Application of the Mandarin Elderly Cognitive Speech Database.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Exploiting Visual Features Using Bayesian Gated Neural Networks for Disordered Speech Recognition.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

2018

Semi-supervised Cross-domain Visual Feature Learning for Audio-Visual Broadcast Speech Transcription.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Gaussian Process Neural Networks for Speech Recognition.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

2017

Multimodal learning using 3D audio-visual data for audio-visual speech recognition.

Proceedings of the 2017 International Conference on Asian Language Processing, 2017

2016

A multi-channel/multi-speaker interactive 3D audio-visual speech corpus in Mandarin.

Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

Convolutional neural network bottleneck features for bi-directional generalized variable parameter HMMs.

Proceedings of the IEEE International Conference on Information and Automation, 2016

2015

Automatic Complexity Control of Generalized Variable Parameter HMMs for Noise Robust Speech Recognition.

IEEE ACM Trans. Audio Speech Lang. Process., 2015

Generalized variable parameter HMMs based acoustic-to-articulatory inversion.

[DOI]

,

,

,

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Efficient use of DNN bottleneck features in generalized variable parameter HMMs for noise robust speech recognition.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

2014

Deep neural network bottleneck features for generalized variable parameter HMMs.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

2013

Automatic model complexity control for generalized variable parameter HMMs.

Proceedings of the 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, 2013
