Xin Cheng

ORCID: 0009-0001-7581-8662

Affiliations:
  • Renmin University of China, Gaoling School of Artificial Intelligence, Beijing, China


According to our database, Xin Cheng authored at least 11 papers between 2025 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
Unified Synthesis of Compositional Speech and Sound from Free-Form Text Prompts.
CoRR, May, 2026

SyncDPO: Enhancing Temporal Synchronization in Video-Audio Joint Generation via Preference Learning.
CoRR, May, 2026

2025
VSpeechLM: A Visual Speech Language Model for Visual Text-to-Speech Task.
CoRR, November, 2025

Taming Text-to-Sounding Video Generation via Advanced Modality Condition and Interaction.
CoRR, October, 2025

VSSFlow: Unifying Video-conditioned Sound and Speech Generation via Joint Learning.
CoRR, September, 2025

WildSpoof Challenge Evaluation Plan.
CoRR, August, 2025

A Visual Speech Language Model for Visual Text-to-Speech Task.
Proceedings of the 7th ACM International Conference on Multimedia in Asia, 2025

VAFlow: Video-to-Audio Generation with Cross-Modality Flow Matching.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

LoVA: Long-form Video-to-Audio Generation.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Animate and Sound an Image.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

EyEar: Learning Audio Synchronized Human Gaze Trajectory Based on Physics-Informed Dynamics.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

