Zeyue Tian

Orcid: 0000-0002-7278-3708

According to our database, Zeyue Tian authored at least 23 papers between 2022 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
Audio-Omni: Extending Multi-modal Understanding to Versatile Audio Generation and Editing.
CoRR, April, 2026

Inference-time Scaling for Diffusion-based Audio Super-resolution.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

VMChill: A Dataset for Fine-Grained Visual-Musical Synergy.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025
VAInpaint: Zero-Shot Video-Audio inpainting framework with LLMs-driven Module.
CoRR, September, 2025

ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data.
CoRR, September, 2025

WeTok: Powerful Discrete Tokenization for High-Fidelity Visual Reconstruction.
CoRR, August, 2025

AudioX: Diffusion Transformer for Anything-to-Audio Generation.
CoRR, March, 2025

YuE: Scaling Open Foundation Models for Long-Form Music Generation.
CoRR, March, 2025

Audio-FLAN: A Preliminary Release.
CoRR, February, 2025

VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024
Foundation Models for Music: A Survey.
CoRR, 2024

MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions.
CoRR, 2024

VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling.
CoRR, 2024

LLMs Meet Multimodal Generation and Editing: A Survey.
CoRR, 2024

ChatMusician: Understanding and Generating Music Intrinsically with LLM.
CoRR, 2024

ComposerX: Multi-Agent Symbolic Music Composition With LLMs.
Proceedings of the 25th International Society for Music Information Retrieval Conference, 2024

Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024


Multitarget Device-Free Localization via Cross-Domain Wi-Fi RSS Training Data and Attentional Prior Fusion.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Deep Cascade Gradient RBF Networks With Output-Relevant Feature Extraction and Adaptation for Nonlinear and Nonstationary Processes.
IEEE Trans. Cybern., 2023

MARBLE: Music Audio Representation Benchmark for Universal Evaluation.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Mixed Neural Voxels for Fast Multi-view Video Synthesis.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

2022
Mixed Neural Voxels for Fast Multi-view Video Synthesis.
CoRR, 2022

