Hang Chen

Orcid: 0000-0002-0904-8946

Affiliations:
  • University of Science and Technology of China, National Engineering Research Center of Speech and Language Information Processing, Hefei, China


According to our database, Hang Chen authored at least 46 papers between 2019 and 2025.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2025
Lightweight Audio-Visual Wake Word Spotting With Diverse Acoustic Knowledge Distillation.
IEEE Trans. Circuits Syst. Video Technol., July, 2025

Exploring Speaker Diarization with Mixture of Experts.
CoRR, June, 2025

HPCNet: Hybrid Pixel and Contour Network for Audio-Visual Speech Enhancement With Low-Quality Video.
IEEE J. Sel. Top. Signal Process., May, 2025

The Multimodal Information Based Speech Processing (MISP) 2025 Challenge: Audio-Visual Diarization and Recognition.
CoRR, May, 2025

Dual-Branch Codec With Orthogonality Constraint and Knowledge Distillation for Noisy Environment.
IEEE Signal Process. Lett., 2025

Cross-attention among spectrum, waveform and SSL representations with bidirectional knowledge distillation for speech enhancement.
Inf. Fusion, 2025

Projection Valued-based Quantum Machine Learning Adapting to Differential Privacy Algorithm for Word-level Lipreading.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

2024
Collaborative Viseme Subword and End-to-End Modeling for Word-Level Lip Reading.
IEEE Trans. Multim., 2024

Optimizing Audio-Visual Speech Enhancement Using Multi-Level Distortion Measures for Audio-Visual Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

Deep CLAS: Deep Contextual Listen, Attend and Spell.
CoRR, 2024

A Study of Dropout-Induced Modality Bias on Robustness to Missing Video Frames for Audio-Visual Speech Recognition.
CoRR, 2024

Summary of Low-Resource Dysarthria Wake-Up Word Spotting Challenge.
Proceedings of the IEEE Spoken Language Technology Workshop, 2024

Layer-Adaptive Low-Rank Adaptation of Large ASR Model for Low-Resource Multilingual Scenarios.
Proceedings of the 14th IEEE International Symposium on Chinese Spoken Language Processing, 2024

Enhancing Voice Wake-Up for Dysarthria: Mandarin Dysarthria Speech Corpus Release and Customized System Design.
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Summary on the Chat-Scenario Chinese Lipreading (ChatCLR) Challenge.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

The Multimodal Information Based Speech Processing (MISP) 2023 Challenge: Audio-Visual Target Speaker Extraction.
Proceedings of the IEEE International Conference on Acoustics, 2024

Implicit Enhancement of Target Speaker in Speaker-Adaptive ASR through Efficient Joint Optimization.
Proceedings of the IEEE International Conference on Acoustics, 2024

The USTC System for Cadenza 2024 Challenge.
Proceedings of the IEEE International Conference on Acoustics, 2024

Summary on the Multimodal Information-Based Speech Processing (MISP) 2023 Challenge.
Proceedings of the IEEE International Conference on Acoustics, 2024

A Study of Dropout-Induced Modality Bias on Robustness to Missing Video Frames for Audio-Visual Speech Recognition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
Space-and-speaker-aware acoustic modeling with effective data augmentation for recognition of multi-array conversational speech.
Speech Commun., September, 2023

The Multimodal Information Based Speech Processing (MISP) 2023 Challenge: Audio-Visual Target Speaker Extraction.
CoRR, 2023

The USTC-NERCSLIP Systems for the CHiME-7 DASR Challenge.
CoRR, 2023

Hierarchical Audio-Visual Information Fusion with Multi-label Joint Decoding for MER 2023.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Improving Audio-Visual Speech Recognition by Lip-Subword Correlation Based Visual Pre-training and Cross-Modal Fusion Encoder.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

Incorporating Visual Information Reconstruction into Progressive Learning for Optimizing Audio-Visual Speech Enhancement.
Proceedings of the IEEE International Conference on Acoustics, 2023

The Multimodal Information Based Speech Processing (MISP) 2022 Challenge: Audio-Visual Diarization and Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2023

Incorporating Lip Features into Audio-Visual Multi-Speaker DOA Estimation by Gated Fusion.
Proceedings of the IEEE International Conference on Acoustics, 2023

Summary on the Multimodal Information Based Speech Processing (MISP) 2022 Challenge.
Proceedings of the IEEE International Conference on Acoustics, 2023

Semi-Supervised Multi-Channel Speaker Diarization With Cross-Channel Attention.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

Enhancing Privacy Preservation with Quantum Computing for Word-Level Audio-Visual Speech Recognition.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

Progressive Multi-scale Self-supervised Learning for Speech Recognition.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

Improved Data2vec with Soft Supervised Hidden Unit for Mandarin Speech Recognition.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

Correlated Multi-Level Speech Enhancement for Robust Real-World ASR Applications Using Mask-Waveform-Feature Optimization.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

2022
Improved Speech Pre-Training with Supervision-Enhanced Acoustic Unit.
CoRR, 2022

Progressive Multi-Scale Self-Supervised Learning for Speech Recognition.
CoRR, 2022

Deep Learning Based Audio-Visual Multi-Speaker DOA Estimation Using Permutation-Free Loss Function.
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

Multi-Task Joint Learning for Embedding Aware Audio-Visual Speech Enhancement.
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

Deep Segment Model for Acoustic Scene Classification.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Audio-Visual Speech Recognition in MISP2021 Challenge: Dataset Release and Deep Analysis.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

The First Multimodal Information Based Speech Processing (MISP) Challenge: Data, Tasks, Baselines and Results.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
Correlating subword articulation with lip shapes for embedding aware audio-visual speech enhancement.
Neural Networks, 2021

Audio-Visual Information Fusion Using Cross-Modal Teacher-Student Learning for Voice Activity Detection in Realistic Environments.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Automatic Lip-Reading with Hierarchical Pyramidal Convolution and Self-Attention for Image Sequences with No Word Boundaries.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

2020
Lip-reading with Hierarchical Pyramidal Convolution and Self-Attention.
CoRR, 2020

2019
Deep Neural Network Based Regression Approach for Acoustic Echo Cancellation.
Proceedings of the 4th International Conference on Multimedia Systems and Signal Processing, 2019

