Ruibin Yuan

According to our database, Ruibin Yuan authored at least 48 papers between 2020 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
KORGym: A Dynamic Game Platform for LLM Reasoning Evaluation.
CoRR, May, 2025

MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix.
CoRR, May, 2025

Kimi-Audio Technical Report.
CoRR, April, 2025

AudioX: Diffusion Transformer for Anything-to-Audio Generation.
CoRR, March, 2025

YuE: Scaling Open Foundation Models for Long-Form Music Generation.
CoRR, March, 2025

Spark-TTS: An Efficient LLM-Based Text-to-Speech Model with Single-Stream Decoupled Speech Tokens.
CoRR, March, 2025

Audio-FLAN: A Preliminary Release.
CoRR, February, 2025

SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines.
CoRR, February, 2025

CLaMP 3: Universal Music Information Retrieval Across Unaligned Modalities and Unseen Languages.
CoRR, February, 2025

CLaMP 2: Multimodal Music Information Retrieval Across 101 Languages Using Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025

COIG-CQIA: Quality is All You Need for Chinese Instruction Fine-tuning.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025

MuPT: A Generative Symbolic Music Pretrained Transformer.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Editing Music with Melody and Text: Using ControlNet for Diffusion Transformer.
Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

2024
You Know What I'm Saying: Jailbreak Attack via Implicit Reference.
CoRR, 2024

HiddenGuard: Fine-Grained Safe Generation with Specialized Representation Router.
CoRR, 2024

OmniBench: Towards The Future of Universal Omni-Language Models.
CoRR, 2024

SongTrans: An unified song transcription and alignment method for lyrics and notes.
CoRR, 2024

Foundation Models for Music: A Survey.
CoRR, 2024

MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions.
CoRR, 2024

VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling.
CoRR, 2024

LLMs Meet Multimodal Generation and Editing: A Survey.
CoRR, 2024

MuPT: A Generative Symbolic Music Pretrained Transformer.
CoRR, 2024

Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model.
CoRR, 2024

The Fine Line: Navigating Large Language Model Pretraining with Down-streaming Capability Analysis.
CoRR, 2024

RQ-RAG: Learning to Refine Queries for Retrieval Augmented Generation.
CoRR, 2024

COIG-CQIA: Quality is All You Need for Chinese Instruction Fine-tuning.
CoRR, 2024

Modeling Analog Dynamic Range Compressors using Deep Learning and State-space Models.
CoRR, 2024

ChatMusician: Understanding and Generating Music Intrinsically with LLM.
CoRR, 2024

CIF-Bench: A Chinese Instruction-Following Benchmark for Evaluating the Generalizability of Large Language Models.
CoRR, 2024

CMMMU: A Chinese Massive Multi-discipline Multimodal Understanding Benchmark.
CoRR, 2024

Can LLMs "Reason" in Music? an Evaluation of LLMs' Capability of Music Understanding and Generation.
Proceedings of the 25th International Society for Music Information Retrieval Conference, 2024

ComposerX: Multi-Agent Symbolic Music Composition With LLMs.
Proceedings of the 25th International Society for Music Information Retrieval Conference, 2024

MERT: Acoustic Music Understanding Model with Large-Scale Self-supervised Training.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

MMMU: A Massive Multi-Discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Weakly-Supervised Emotion Transition Learning for Diverse 3D Co-Speech Gesture Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024


CIF-Bench: A Chinese Instruction-Following Benchmark for Evaluating the Generalizability of Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023
MERT: Acoustic Music Understanding Model with Large-Scale Self-supervised Training.
CoRR, 2023

Chinese Open Instruction Generalist: A Preliminary Release.
CoRR, 2023

MARBLE: Music Audio Representation Benchmark for Universal Evaluation.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

LyricWhiz: Robust Multilingual Zero-Shot Lyrics Transcription by Whispering to ChatGPT.
Proceedings of the 24th International Society for Music Information Retrieval Conference, 2023

On the Effectiveness of Speech Self-Supervised Learning for Music.
Proceedings of the 24th International Society for Music Information Retrieval Conference, 2023

2022
MAP-Music2Vec: A Simple and Effective Baseline for Self-Supervised Music Audio Representation Learning.
CoRR, 2022

Noisy Label Detection for Speaker Recognition.
CoRR, 2022

DeID-VC: Speaker De-identification via Zero-shot Pseudo Voice Conversion.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

2021
Parallel Adaptive Subspace Pursuit Algorithm for Multiuser Detection of Uplink Grant-Free NOMA.
Proceedings of the IEEE Wireless Communications and Networking Conference, 2021

2020
Diverse Melody Generation from Chinese Lyrics via Mutual Information Maximization.
CoRR, 2020

