Sizhe Shan
According to our database, Sizhe Shan authored at least 7 papers between 2024 and 2026.
Bibliography
2026
CoRR, May, 2026
2025
HunyuanVideo-Foley: Multimodal Diffusion with Representation Alignment for High-Fidelity Foley Audio Generation.
CoRR, August, 2025
Enhance Generation Quality of Flow Matching V2A Model via Multi-Step CoT-Like Guidance and Combined Preference Optimization.
CoRR, March, 2025
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025
ProsodyFlow: High-fidelity Text-to-Speech through Conditional Flow Matching and Prosody Modeling with Large Speech Language Models.
Proceedings of the 31st International Conference on Computational Linguistics, 2025
EFTTS: Zero-Shot Emotional Speech Synthesis via Conditional Flow Matching and Self-Supervised Representations.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2025
2024
YingSound: Video-Guided Sound Effects Generation with Multi-modal Chain-of-Thought Controls.
CoRR, 2024