Yusheng Dai

Orcid: 0000-0003-3211-3862

According to our database, Yusheng Dai authored at least 21 papers between 2018 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
Omni2Sound: Towards Unified Video-Text-to-Audio Generation.
CoRR, January, 2026

ControlAudio: Tackling Text-Guided, Timing-Indicated and Intelligible Audio Generation via Progressive Diffusion Modeling.
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2026

2025
Latent Swap Joint Diffusion for Long-Form Audio Generation.
CoRR, February, 2025

AudioAtlas: A Comprehensive and Balanced Benchmark Towards Movie-Oriented Text-to-Audio Generation.
Proceedings of the 33rd ACM International Conference on Multimedia, 2025

Latent Swap Joint Diffusion for 2D Long-Form Latent Generation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

Phoneme-Level Contrastive Learning for User-Defined Keyword Spotting with Flexible Enrollment.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Robust-MVTON: Learning Cross-Pose Feature Alignment and Fusion for Robust Multi-View Virtual Try-On.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024
A Study of Dropout-Induced Modality Bias on Robustness to Missing Video Frames for Audio-Visual Speech Recognition.
CoRR, 2024

Meta-Adaptive Stock Movement Prediction with Two-Stage Representation Learning.
Proceedings of the 2024 SIAM International Conference on Data Mining, 2024

The Multimodal Information Based Speech Processing (MISP) 2023 Challenge: Audio-Visual Target Speaker Extraction.
Proceedings of the IEEE International Conference on Acoustics, 2024

Improving Multi-Modal Emotion Recognition Using Entropy-Based Fusion and Pruning-Based Network Architecture Optimization.
Proceedings of the IEEE International Conference on Acoustics, 2024

A Study of Dropout-Induced Modality Bias on Robustness to Missing Video Frames for Audio-Visual Speech Recognition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
The Multimodal Information Based Speech Processing (MISP) 2023 Challenge: Audio-Visual Target Speaker Extraction.
CoRR, 2023

Improving Audio-Visual Speech Recognition by Lip-Subword Correlation Based Visual Pre-training and Cross-Modal Fusion Encoder.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

Summary on the Multimodal Information Based Speech Processing (MISP) 2022 Challenge.
Proceedings of the IEEE International Conference on Acoustics, 2023

2022
Audio-Visual Speech Recognition in MISP2021 Challenge: Dataset Release and Deep Analysis.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

2021
Using IRP and local alignment method to detect distributed malware.
Comput. Secur., 2021

2019
An online log template extraction method based on hierarchical clustering.
EURASIP J. Wirel. Commun. Netw., 2019

SMASH: A Malware Detection Method Based on Multi-Feature Ensemble Learning.
IEEE Access, 2019

M4D: A Malware Detection Method Using Multimodal Features.
Proceedings of the Frontiers in Cyber Security - Second International Conference, 2019

2018
A malware classification method based on memory dump grayscale image.
Digit. Investig., 2018

