Xuesong Yang
ORCID: 0009-0001-5996-4344
According to our database, Xuesong Yang authored at least 57 papers between 2011 and 2026.
Bibliography
2026
Cheers: Decoupling Patch Details from Semantic Representations Enables Unified Multimodal Comprehension and Generation.
CoRR, March, 2026
Rethinking Training Targets, Architectures and Data Quality for Universal Speech Enhancement.
CoRR, March, 2026
Knowl. Based Syst., 2026
Hybrid high-dimensional vine copula-Bayesian network framework for flood risk analysis in reservoir-lake systems: Addressing multisource uncertainties.
Environ. Model. Softw., 2026
Spatiotemporal correction of decision variables using XGBoost for multi-objective intelligent scheduling rule extraction model in reservoir-lake flood control systems.
Environ. Model. Softw., 2026
LLaVA-UHD v2: Exploiting Hierarchical Vision Granularity in MLLMs via Inverse Semantic Pyramid.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026
2025
MM-UAVBench: How Well Do Multimodal Large Language Models See, Think, and Plan in Low-Altitude UAV Scenarios?
CoRR, December, 2025
Align2Speak: Improving TTS for Low Resource Languages via ASR-Guided Online Preference Optimization.
CoRR, September, 2025
CoRR, September, 2025
Computer learning career path optimisation utilising multi-modal large models and privacy-preserving collaborative computing.
Int. J. Inf. Commun. Technol., 2025
Evaluation of teaching quality in database courses based on domain-adaptive transfer learning.
Int. J. Inf. Commun. Technol., 2025
Enhancing empathy of medical students in clinical training: a narrative-driven virtual reality experience for understanding undiagnosed chronic pain.
Frontiers Virtual Real., 2025
Unveiling the molecular mechanisms of Haitang-Xiaoyin Mixture in psoriasis treatment based on bioinformatics, network pharmacology, machine learning, and molecular docking verification.
Comput. Biol. Chem., 2025
Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025
Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025
Proceedings of the 26th Annual Conference of the International Speech Communication Association, 2025
Koel-TTS: Enhancing LLM based Speech Generation with Preference Alignment and Classifier Free Guidance.
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025
NeKo: Cross-Modality Post-Recognition Error Correction with Tasks-Guided Mixture-of-Experts Language Model.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 6: Industry Track), 2025
2024
IEEE Trans. Circuits Syst. I Regul. Pap., July, 2024
Inf. Sci., January, 2024
The Smart City Waste Classification Management System: Strategies and Applications Based on Computer Vision.
J. Organ. End User Comput., 2024
LLaVA-UHD v2: an MLLM Integrating High-Resolution Feature Pyramid via Hierarchical Window Transformer.
CoRR, 2024
NeKo: Toward Post Recognition Generative Correction Large Language Models with Task-Oriented Experts.
CoRR, 2024
Detecting the Undetectable: Assessing the Efficacy of Current Spoof Detection Methods Against Seamless Speech Edits.
Proceedings of the IEEE Spoken Language Technology Workshop, 2024
Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation, 2024
2023
IEEE Trans. Pattern Anal. Mach. Intell., December, 2023
Optimizing Electromagnetic Cigarette Heaters Using PSO-NSGA II Algorithm: An Effective Strategy to Improve Temperature Control and Production Rate.
Appl. Artif. Intell., December, 2023
Trans. Mach. Learn. Res., 2023
SLN-RED: Regularization by Simultaneous Local and Nonlocal Denoising for Image Restoration.
IEEE Signal Process. Lett., 2023
2022
CoRR, 2022
2021
Triplet is All You Need with Random Mappings for Unsupervised Visual Representation Learning.
CoRR, 2021
REDAT: Accent-Invariant Representation for End-To-End ASR by Domain Adversarial Training with Relabeling.
Proceedings of the IEEE International Conference on Acoustics, 2021
2019
Proceedings of the 2019 IEEE International Conference on Industrial Engineering and Engineering Management, 2019
Proceedings of the 2019 IEEE SENSORS, Montreal, QC, Canada, October 27-30, 2019, 2019
Proceedings of the 36th International Conference on Machine Learning, 2019
Proceedings of the IEEE International Conference on Acoustics, 2019
2018
Feature extraction using convolutional neural networks for multi-atlas based image segmentation.
Proceedings of the Medical Imaging 2018: Image Processing, 2018
Proceedings of the Medical Imaging 2018: Image Processing, 2018
Improved ASR for Under-resourced Languages through Multi-task Learning with Acoustic Landmarks.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Proceedings of the 2018 IEEE SENSORS, New Delhi, India, October 28-31, 2018, 2018
Proceedings of the 2018 IEEE SENSORS, New Delhi, India, October 28-31, 2018, 2018
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
2017
Neuroinformatics, 2017
Acoustic Landmarks Contain More Information About the Phone String than Other Frames.
CoRR, 2017
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017
A study on landmark detection based on CTC and its application to pronunciation error detection.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017
2016
Proceedings of the 13th IEEE International Symposium on Biomedical Imaging, 2016
2015
2014
Machine learning approaches to improving pronunciation error detection on an imbalanced corpus.
Proceedings of the 2014 IEEE Spoken Language Technology Workshop, 2014
2011
Sound source localization for mobile robot based on time difference feature and space grid matching.
Proceedings of the 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2011
Improvement of Segmental Mispronunciation Detection with Prior Knowledge Extracted from Large L2 Speech Corpus.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011