Haoqin Sun

ORCID: 0000-0002-8554-8969

According to our database, Haoqin Sun authored at least 29 papers between 2022 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
Marco-Voice Technical Report.
CoRR, August, 2025

Learning Personalised Human Internal Cognition from External Expressive Behaviours for Real Personality Recognition.
CoRR, August, 2025

DIFFA: Large Language Diffusion Models Can Listen and Understand.
CoRR, July, 2025

StreamMel: Real-Time Zero-shot Text-to-Speech via Interleaved Continuous Autoregressive Modeling.
CoRR, June, 2025

EmotionTalk: An Interactive Chinese Multimodal Emotion Dataset With Rich Annotations.
CoRR, May, 2025

RA-CLAP: Relation-Augmented Emotional Speaking Style Contrastive Language-Audio Pretraining For Speech Retrieval.
CoRR, May, 2025

Discrete Audio Representations for Automated Audio Captioning.
CoRR, May, 2025

CS-Dialogue: A 104-Hour Dataset of Spontaneous Mandarin-English Code-Switching Dialogues for Speech Recognition.
CoRR, February, 2025

FELLE: Autoregressive Speech Synthesis with Token-Wise Coarse-to-Fine Flow Matching.
CoRR, February, 2025

MusicEval: A Generative Music Corpus with Expert Ratings for Automatic Text-to-Music Evaluation.
CoRR, January, 2025

Improving Zero-Shot Chinese-English Code-Switching ASR with kNN-CTC and Gated Monolingual Datastores.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

M2R-Whisper: Multi-stage and Multi-scale Retrieval Augmentation for Enhancing Whisper.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Enhancing Multimodal Emotion Recognition through Multi-Granularity Cross-Modal Alignment.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Enhancing Emotion Recognition in Incomplete Data: A Novel Cross-Modal Alignment, Reconstruction, and Refinement Framework.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

MusicEval: A Generative Music Dataset with Expert Ratings for Automatic Text-to-Music Evaluation.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Emotion-Preserving Prosody Anonymization Network for Voice Privacy Protection.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

ChildMandarin: A Comprehensive Mandarin Speech Dataset for Young Children Aged 3-5.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024
Feature distribution Adaptation Network for Speech Emotion Recognition.
CoRR, 2024

ChildMandarin: A Comprehensive Mandarin Speech Dataset for Young Children Aged 3-5.
CoRR, 2024

Self-Prompt Tuning: Enable Autonomous Role-Playing in LLMs.
CoRR, 2024

Uncertainty-Aware Mean Opinion Score Prediction.
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Iterative Prototype Refinement for Ambiguous Speech Emotion Recognition.
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Fine-Grained Disentangled Representation Learning For Multimodal Emotion Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2024

2023
Speech Emotion Recognition Using Cascaded Attention Network with Joint Loss for Discrimination of Confusions.
Mach. Intell. Res., August, 2023

A Multitask Learning Approach Based on Cascaded Attention Network and Self-Adaption Loss for Speech Emotion Recognition.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., June, 2023

A Discriminative Feature Representation Method Based on Cascaded Attention Network With Adversarial Strategy for Speech Emotion Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

Multi-Level Knowledge Distillation for Speech Emotion Recognition in Noisy Conditions.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

2022
Multi-modal speech emotion recognition using self-attention mechanism and multi-scale fusion framework.
Speech Commun., 2022

Discriminative Feature Representation Based on Cascaded Attention Network with Adversarial Joint Loss for Speech Emotion Recognition.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

