Xiquan Li

According to our database, Xiquan Li authored at least 17 papers between 2024 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
TinyMU: A Compact Audio-Language Model for Music Understanding.
CoRR, April, 2026

Resonate: Reinforcing Text-to-Audio Generation via Online Feedback from Large Audio Language Models.
CoRR, March, 2026

Audio ControlNet for Fine-Grained Audio Generation and Editing.
CoRR, February, 2026

SLAM-LLM: A Modular, Open-Source Multimodal Large Language Model Framework and Best Practice for Speech, Language, Audio and Music Processing.
IEEE J. Sel. Top. Signal Process., January, 2026

SemanticAudio: Audio Generation and Editing in Semantic Space.
CoRR, January, 2026

FineLAP: Taming Heterogeneous Supervision for Fine-grained Language-Audio Pretraining.
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2026

MeanAudio: Fast and Faithful Text-to-Audio Generation with Mean Flows.
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2026

SAC: Neural Speech Codec with Semantic-Acoustic Dual-Stream Quantization.
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2026

2025
SAC: Neural Speech Codec with Semantic-Acoustic Dual-Stream Quantization.
CoRR, October, 2025

MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix.
CoRR, May, 2025

URO-Bench: A Comprehensive Benchmark for End-to-End Spoken Dialogue Models.
CoRR, February, 2025

DRCap: Decoding CLAP Latents with Retrieval-Augmented Generation for Zero-shot Audio Captioning.
Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

SLAM-AAC: Enhancing Audio Captioning with Paraphrasing Augmentation and CLAP-Refine through LLMs.
Proceedings of the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing, 2025

URO-Bench: Towards Comprehensive Evaluation for End-to-End Spoken Dialogue Models.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

SLAM-Omni: Timbre-Controllable Voice Interaction System with Single-Stage Training.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

Towards Reliable Large Audio Language Model.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024
EmoBox: Multilingual Multi-corpus Speech Emotion Recognition Toolkit and Benchmark.
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024

