Hang Chen
ORCID: 0000-0002-0904-8946
Affiliations:
- University of Science and Technology of China, National Engineering Research Center of Speech and Language Information Processing, Hefei, China
  According to our database, Hang Chen authored at least 49 papers between 2019 and 2025.
  
  
Links
Online presence:
- on orcid.org
Bibliography
  2025
MEAN-RIR: Multi-Modal Environment-Aware Network for Robust Room Impulse Response Estimation.
    
  
    CoRR, September, 2025
    
  
Cross-Modal Knowledge Distillation with Multi-Level Data Augmentation for Low-Resource Audio-Visual Sound Event Localization and Detection.
    
  
    CoRR, August, 2025
    
  
Lightweight Audio-Visual Wake Word Spotting With Diverse Acoustic Knowledge Distillation.
    
  
    IEEE Trans. Circuits Syst. Video Technol., July, 2025
    
  
HPCNet: Hybrid Pixel and Contour Network for Audio-Visual Speech Enhancement With Low-Quality Video.
    
  
    IEEE J. Sel. Top. Signal Process., May, 2025
    
  
The Multimodal Information Based Speech Processing (MISP) 2025 Challenge: Audio-Visual Diarization and Recognition.
    
  
    CoRR, May, 2025
    
  
    IEEE Trans. Multim., 2025
    
  
Dual-Branch Codec With Orthogonality Constraint and Knowledge Distillation for Noisy Environment.
    
  
    IEEE Signal Process. Lett., 2025
    
  
Cross-attention among spectrum, waveform and SSL representations with bidirectional knowledge distillation for speech enhancement.
    
  
    Inf. Fusion, 2025
    
  
Projection Valued-based Quantum Machine Learning Adapting to Differential Privacy Algorithm for Word-level Lipreading.
    
  
    Proceedings of the 2025 IEEE International Conference on Acoustics, 2025
    
  
  2024
    IEEE Trans. Multim., 2024
    
  
Optimizing Audio-Visual Speech Enhancement Using Multi-Level Distortion Measures for Audio-Visual Speech Recognition.
    
  
    IEEE ACM Trans. Audio Speech Lang. Process., 2024
    
  
A Study of Dropout-Induced Modality Bias on Robustness to Missing Video Frames for Audio-Visual Speech Recognition.
    
  
    CoRR, 2024
    
  
    Proceedings of the IEEE Spoken Language Technology Workshop, 2024
    
  
Layer-Adaptive Low-Rank Adaptation of Large ASR Model for Low-Resource Multilingual Scenarios.
    
  
    Proceedings of the 14th IEEE International Symposium on Chinese Spoken Language Processing, 2024
    
  
Enhancing Voice Wake-Up for Dysarthria: Mandarin Dysarthria Speech Corpus Release and Customized System Design.
    
  
    Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024
    
  
    Proceedings of the IEEE International Conference on Multimedia and Expo, 2024
    
  
The Multimodal Information Based Speech Processing (MISP) 2023 Challenge: Audio-Visual Target Speaker Extraction.
    
  
    Proceedings of the IEEE International Conference on Acoustics, 2024
    
  
Implicit Enhancement of Target Speaker in Speaker-Adaptive ASR through Efficient Joint Optimization.
    
  
    Proceedings of the IEEE International Conference on Acoustics, 2024
    
  
    Proceedings of the IEEE International Conference on Acoustics, 2024
    
  
    Proceedings of the IEEE International Conference on Acoustics, 2024
    
  
A Study of Dropout-Induced Modality Bias on Robustness to Missing Video Frames for Audio-Visual Speech Recognition.
    
  
    Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
    
  
  2023
Space-and-speaker-aware acoustic modeling with effective data augmentation for recognition of multi-array conversational speech.
    
  
    Speech Commun., September, 2023
    
  
The Multimodal Information Based Speech Processing (MISP) 2023 Challenge: Audio-Visual Target Speaker Extraction.
    
  
    CoRR, 2023
    
  
Hierarchical Audio-Visual Information Fusion with Multi-label Joint Decoding for MER 2023.
    
  
    Proceedings of the 31st ACM International Conference on Multimedia, 2023
    
  
Improving Audio-Visual Speech Recognition by Lip-Subword Correlation Based Visual Pre-training and Cross-Modal Fusion Encoder.
    
  
    Proceedings of the IEEE International Conference on Multimedia and Expo, 2023
    
  
Incorporating Visual Information Reconstruction into Progressive Learning for Optimizing Audio-Visual Speech Enhancement.
    
  
    Proceedings of the IEEE International Conference on Acoustics, 2023
    
  
The Multimodal Information Based Speech Processing (MISP) 2022 Challenge: Audio-Visual Diarization and Recognition.
    
  
    Proceedings of the IEEE International Conference on Acoustics, 2023
    
  
Incorporating Lip Features into Audio-Visual Multi-Speaker DOA Estimation by Gated Fusion.
    
  
    Proceedings of the IEEE International Conference on Acoustics, 2023
    
  
    Proceedings of the IEEE International Conference on Acoustics, 2023
    
  
    Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023
    
  
Enhancing Privacy Preservation with Quantum Computing for Word-Level Audio-Visual Speech Recognition.
    
  
    Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023
    
  
    Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023
    
  
    Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023
    
  
Correlated Multi-Level Speech Enhancement for Robust Real-World ASR Applications Using Mask-Waveform-Feature Optimization.
    
  
    Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023
    
  
  2022
Deep Learning Based Audio-Visual Multi-Speaker DOA Estimation Using Permutation-Free Loss Function.
    
  
    Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022
    
  
    Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022
    
  
    Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
    
  
Audio-Visual Speech Recognition in MISP2021 Challenge: Dataset Release and Deep Analysis.
    
  
    Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
    
  
The First Multimodal Information Based Speech Processing (MISP) Challenge: Data, Tasks, Baselines and Results.
    
  
    Proceedings of the IEEE International Conference on Acoustics, 2022
    
  
  2021
Correlating subword articulation with lip shapes for embedding aware audio-visual speech enhancement.
    
  
    Neural Networks, 2021
    
  
Audio-Visual Information Fusion Using Cross-Modal Teacher-Student Learning for Voice Activity Detection in Realistic Environments.
    
  
    Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
    
  
Automatic Lip-Reading with Hierarchical Pyramidal Convolution and Self-Attention for Image Sequences with No Word Boundaries.
    
  
    Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
    
  
  2020
  2019
    Proceedings of the 4th International Conference on Multimedia Systems and Signal Processing, 2019