Wei Xue
ORCID: 0000-0002-4942-7748
Affiliations:
- Hong Kong Baptist University, Division of Emerging Interdisciplinary Areas, Hong Kong
- Imperial College London, Department of Electrical and Electronic Engineering, UK
- Chinese Academy of Sciences (CAS), Institute of Automation, Beijing, China (PhD in Pattern Recognition and Intelligent Systems, 2015)
According to our database, Wei Xue authored at least 92 papers between 2016 and 2026.
Bibliography
2026
Inf. Fusion, 2026
2025
CoRR, August, 2025
CoRR, August, 2025
ThinkSound: Chain-of-Thought Reasoning in Multimodal Large Language Models for Audio Generation and Editing.
CoRR, June, 2025
MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix.
CoRR, May, 2025
CoRR, May, 2025
Co³Gesture: Towards Coherent Concurrent Co-speech 3D Gesture Generation with Interactive Diffusion.
CoRR, May, 2025
CoRR, March, 2025
Spark-TTS: An Efficient LLM-Based Text-to-Speech Model with Single-Stream Decoupled Speech Tokens.
CoRR, March, 2025
LLaSE-G1: Incentivizing Generalization Capability for LLaMA-based Speech Enhancement.
CoRR, March, 2025
VFX Creator: Animated Visual Effect Generation with Controllable Diffusion Transformer.
CoRR, February, 2025
Llasa: Scaling Train-Time and Inference-Time Compute for Llama-based Speech Synthesis.
CoRR, February, 2025
Every Angle Is Worth A Second Glance: Mining Kinematic Skeletal Structures from Multi-view Joint Cloud.
CoRR, February, 2025
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
Co3Gesture: Towards Coherent Concurrent Co-speech 3D Gesture Generation with Interactive Diffusion.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
PSHuman: Photorealistic Single-image 3D Human Reconstruction using Cross-Scale Multiview Diffusion and Explicit Remeshing.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025
BayesKD: Bayesian Knowledge Distillation for Compact LLMs in Constrained Fine-tuning Scenarios.
Proceedings of the Findings of the Association for Computational Linguistics, 2025
Boosting Policy and Process Reward Models with Monte Carlo Tree Search in Open-Domain QA.
Proceedings of the Findings of the Association for Computational Linguistics, 2025
Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025
2024
IEEE Trans. Multim., 2024
SINGER: Vivid Audio-driven Singing Video Generation with Multi-scale Spectral Diffusion Model.
CoRR, 2024
Foundation Cures Personalization: Recovering Facial Personalized Models' Prompt Consistency.
CoRR, 2024
CoRR, 2024
PSHuman: Photorealistic Single-view Human Reconstruction using Cross-Scale Diffusion.
CoRR, 2024
CoRR, 2024
AgentMonitor: A Plug-and-Play Framework for Predictive and Secure Multi-Agent Systems.
CoRR, 2024
CoRR, 2024
CoRR, 2024
Proceedings of the Uncertainty in Artificial Intelligence, 2024
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024
Can LLMs "Reason" in Music? an Evaluation of LLMs' Capability of Music Understanding and Generation.
Proceedings of the 25th International Society for Music Information Retrieval Conference, 2024
Proceedings of the 25th International Society for Music Information Retrieval Conference, 2024
Proceedings of the 14th IEEE International Symposium on Chinese Spoken Language Processing, 2024
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Proceedings of the Twelfth International Conference on Learning Representations, 2024
Proceedings of the Twelfth International Conference on Learning Representations, 2024
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024
Proceedings of the Computer Vision - ECCV 2024, 2024
Auto-GAS: Automated Proxy Discovery for Training-Free Generative Architecture Search.
Proceedings of the Computer Vision - ECCV 2024, 2024
Weakly-Supervised Emotion Transition Learning for Diverse 3D Co-Speech Gesture Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Proceedings of the Findings of the Association for Computational Linguistics, 2024
FM-OV3D: Foundation Model-Based Cross-Modal Knowledge Blending for Open-Vocabulary 3D Detection.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
2023
FM-OV3D: Foundation Model-based Cross-modal Knowledge Blending for Open-Vocabulary 3D Detection.
CoRR, 2023
CoRR, 2023
Learn to Sing by Listening: Building Controllable Virtual Singer by Unsupervised Learning from Voice Recordings.
CoRR, 2023
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
Proceedings of the 31st ACM International Conference on Multimedia, 2023
LyricWhiz: Robust Multilingual Zero-Shot Lyrics Transcription by Whispering to ChatGPT.
Proceedings of the 24th International Society for Music Information Retrieval Conference, 2023
NAS-FM: Neural Architecture Search for Tunable and Interpretable Sound Synthesis Based on Frequency Modulation.
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023
GCC-Speaker: Target Speaker Localization with Optimal Speaker-Dependent Weighting in Multi-Speaker Scenarios.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023
MoMusic: A Motion-Driven Human-AI Collaborative Music Composition and Performing System.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023
2022
IEEE Signal Process. Lett., 2022
2021
Speech Enhancement Based on Modulation-Domain Parametric Multichannel Kalman Filtering.
IEEE ACM Trans. Audio Speech Lang. Process., 2021
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021
Causal System Identification based Compensation for Reverberation-Robust DOA Estimation.
Proceedings of the 29th European Signal Processing Conference, 2021
2020
Sound Event Localization and Detection Based on Multiple DOA Beamforming and Multi-Task Learning.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
2019
IEEE ACM Trans. Audio Speech Lang. Process., 2019
Direct-Path Signal Cross-Correlation Estimation for Sound Source Localization in Reverberation.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
2018
IEEE ACM Trans. Audio Speech Lang. Process., 2018
Proceedings of the 16th International Workshop on Acoustic Signal Enhancement, 2018
Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing, 2018
Proceedings of the 26th European Signal Processing Conference, 2018
Proceedings of the 52nd Asilomar Conference on Signals, Systems, and Computers, 2018
2017
Frequency-domain under-modelled blind system identification based on cross power spectrum and sparsity regularization.
Proceedings of the 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, 2017
2016
Under-modelled blind system identification for time delay estimation in reverberant environments.
Proceedings of the IEEE International Workshop on Acoustic Signal Enhancement, 2016
Cross-correlation based under-modelled multichannel blind acoustic system identification with sparsity regularization.
Proceedings of the 24th European Signal Processing Conference, 2016