Kai Wang

Affiliations:
  • University of Toronto, ON, Canada
  • Concordia University, Montreal, Canada (former)


According to our database, Kai Wang authored at least 15 papers between 2021 and 2026.

Bibliography

2026
OmniSonic: Towards Universal and Holistic Audio Generation from Video and Text.
CoRR, April, 2026

2025
Explainable AI-Generated Image Detection RewardBench.
CoRR, November, 2025

MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Self-Improvement in Multimodal Large Language Models: A Survey.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

2024
Audio-Visual Dataset Distillation.
Trans. Mach. Learn. Res., 2024

HARWE: A multi-modal large-scale dataset for context-aware human activity recognition in smart working environments.
Pattern Recognit. Lett., 2024

AV-DiT: Efficient Audio-Visual Diffusion Transformer for Joint Audio and Video Generation.
CoRR, 2024

MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

MOMA: Mixture-of-Modality-Adaptations for Transferring Knowledge from Image Models Towards Efficient Audio-Visual Action Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2024

VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Towards Efficient Audio-Visual Learners via Empowering Pre-trained Vision Transformers with Cross-Modal Adaptation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
SEformer: Dual-Path Conformer Neural Network is a Good Speech Denoiser.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

2022
CPTNN: Cross-Parallel Transformer Neural Network for Time-Domain Speech Enhancement.
Proceedings of the 17th International Workshop on Acoustic Signal Enhancement, 2022

2021
CAUNet: Context-Aware U-Net for Speech Enhancement in Time Domain.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2021

TSTNN: Two-Stage Transformer Based Neural Network for Speech Enhancement in the Time Domain.
Proceedings of the IEEE International Conference on Acoustics, 2021

