Wenyi Yu

Orcid: 0000-0002-8693-8168

According to our database, Wenyi Yu authored at least 23 papers between 2016 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2025

End-to-end Listen, Look, Speak and Act.

CoRR, October, 2025

MMAU-Pro: A Challenging and Comprehensive Benchmark for Holistic Evaluation of Audio General Intelligence.

CoRR, August, 2025

SALMONN-omni: A Standalone Speech LLM without Codec Injection for Full-duplex Conversation.

CoRR, May, 2025

Enabling Auditory Large Language Models for Automatic Speech Quality Evaluation.

Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

QualiSpeech: A Speech Quality Assessment Dataset with Natural Language Reasoning and Descriptions.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024

SALMONN-omni: A Codec-free LLM for Full-duplex Speech Understanding and Generation.

CoRR, 2024

Adaptive Conditional Expert Selection Network for Multi-domain Recommendation.

CoRR, 2024

Enabling Auditory Large Language Models for Automatic Speech Quality Evaluation.

CoRR, 2024

Extract and Diffuse: Latent Integration for Improved Diffusion-based Speech and Vocal Enhancement.

CoRR, 2024

HMDN: Hierarchical Multi-Distribution Network for Click-Through Rate Prediction.

CoRR, 2024

M³AV: A Multimodal, Multigenre, and Multipurpose Audio-Visual Academic Lecture Dataset.

CoRR, 2024

An Improved Empirical Fisher Approximation for Natural Gradient Descent.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

An Optimizer for Conformer Based on Conjugate Gradient Method.

Wenyi Yu, Chao Zhang

Proceedings of the 14th IEEE International Symposium on Chinese Spoken Language Processing, 2024

Can Large Language Models Understand Spatial Audio?

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

video-SALMONN: Speech-Enhanced Audio-Visual Large Language Models.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

SALMONN: Towards Generic Hearing Abilities for Large Language Models.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Connecting Speech Encoder and Large Language Model for ASR.

Proceedings of the IEEE International Conference on Acoustics, 2024

Extending Large Language Models for Speech and Audio Captioning.

Proceedings of the IEEE International Conference on Acoustics, 2024

M³AV: A Multimodal, Multigenre, and Multipurpose Audio-Visual Academic Lecture Dataset.

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023

T2T-YAO: A Telomere-to-Telomere Assembled Diploid Reference Genome for Han Chinese.

Genom. Proteom. Bioinform., 2023

Fine-grained Audio-Visual Joint Representations for Multimodal Large Language Models.

CoRR, 2023

2022

A method of band selection of remote sensing image based on clustering and intra-class index.

Yunyi Yan, Wenyi Yu, Lingxia Zhang

Multim. Tools Appl., 2022

2016

Superpixel-Based CFAR Target Detection for High-Resolution SAR Images.

IEEE Geosci. Remote. Sens. Lett., 2016
