Ming Cheng

Orcid: 0000-0002-4733-3596

Affiliations:
  • Wuhan University, School of Computer Science, China
  • Duke Kunshan University, Suzhou Municipal Key Laboratory of Multimodal Intelligent Systems, China


According to our database, Ming Cheng authored at least 16 papers between 2020 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
VoxBlink2: A 100K+ Speaker Recognition Corpus and the Open-Set Speaker-Identification Benchmark.
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

Efficient Personal Voice Activity Detection with Wake Word Reference Speech.
Proceedings of the IEEE International Conference on Acoustics, 2024

Robust Wake Word Spotting With Frame-Level Cross-Modal Attention Based Audio-Visual Conformer.
Proceedings of the IEEE International Conference on Acoustics, 2024

Joint Inference of Speaker Diarization and ASR with Multi-Stage Information Sharing.
Proceedings of the IEEE International Conference on Acoustics, 2024

VoxBlink: A Large Scale Speaker Verification Dataset on Camera.
Proceedings of the IEEE International Conference on Acoustics, 2024

2023
Computer-Aided Autism Spectrum Disorder Diagnosis With Behavior Signal Processing.
IEEE Trans. Affect. Comput., 2023

VoxBlink: X-Large Speaker Verification Dataset on Camera.
CoRR, 2023

Assessing the Social Skills of Children with Autism Spectrum Disorder via Language-Image Pre-training Models.
Proceedings of the Pattern Recognition and Computer Vision - 6th Chinese Conference, 2023

The DKU Post-Challenge Audio-Visual Wake Word Spotting System for the 2021 MISP Challenge: Deep Analysis.
Proceedings of the IEEE International Conference on Acoustics, 2023

Target-Speaker Voice Activity Detection Via Sequence-to-Sequence Prediction.
Proceedings of the IEEE International Conference on Acoustics, 2023

The WHU-Alibaba Audio-Visual Speaker Diarization System for the MISP 2022 Challenge.
Proceedings of the IEEE International Conference on Acoustics, 2023

2022
The DKU Audio-Visual Wake Word Spotting System for the 2021 MISP Challenge.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
A Multimodal Dynamic Neural Network for Call for Help Recognition in Elevators.
Proceedings of the ICMI '21 Companion: Companion Publication of the 2021 International Conference on Multimodal Interaction, Montreal, QC, Canada, October 18, 2021

Cross-modal Assisted Training for Abnormal Event Recognition in Elevators.
Proceedings of the ICMI '21: International Conference on Multimodal Interaction, 2021

2020
Responsive Social Smile: A Machine Learning based Multimodal Behavior Assessment Framework towards Early Stage Autism Screening.
Proceedings of the 25th International Conference on Pattern Recognition, 2020

RWF-2000: An Open Large Scale Video Database for Violence Detection.
Proceedings of the 25th International Conference on Pattern Recognition, 2020

