Chenyang Si

Orcid: 0000-0002-3354-1968

According to our database, Chenyang Si authored at least 42 papers between 2018 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
LongVie: Multimodal-Guided Controllable Ultra-Long Video Generation.
CoRR, August, 2025

FreeMorph: Tuning-Free Generalized Image Morphing with Diffusion Model.
CoRR, July, 2025

Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers.
CoRR, June, 2025

DCM: Dual-Expert Consistency Model for Efficient and High-Quality Video Generation.
CoRR, June, 2025

IVY-FAKE: A Unified Explainable Framework and Benchmark for Image and Video AIGC Detection.
CoRR, June, 2025

LaVie: High-Quality Video Generation with Cascaded Latent Diffusion Models.
Int. J. Comput. Vis., May, 2025

V-STaR: Benchmarking Video-LLMs on Video Spatio-Temporal Reasoning.
CoRR, March, 2025

Lumina-Video: Efficient and Flexible Video Generation with Multi-scale Next-DiT.
CoRR, February, 2025

CoS: Chain-of-Shot Prompting for Long Video Understanding.
CoRR, February, 2025

RepVideo: Rethinking Cross-Layer Representation for Video Generation.
CoRR, January, 2025

Vchitect-2.0: Parallel Transformer for Scaling Up Video Diffusion Models.
CoRR, January, 2025

Diffusion as Shader: 3D-aware Video Diffusion for Versatile Video Generation Control.
CoRR, January, 2025

FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

2024
MetaFormer Baselines for Vision.
IEEE Trans. Pattern Anal. Mach. Intell., February, 2024

Federated zero-shot learning with mid-level semantic knowledge transfer.
Pattern Recognit., 2024

VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models.
CoRR, 2024

Scaling Supervised Local Learning with Augmented Auxiliary Networks.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

FreeInit: Bridging Initialization Gap in Video Diffusion Models.
Proceedings of the Computer Vision - ECCV 2024, 2024

HPFF: Hierarchical Locally Supervised Learning with Patch Feature Fusion.
Proceedings of the Computer Vision - ECCV 2024, 2024

Momentum Auxiliary Network for Supervised Local Learning.
Proceedings of the Computer Vision - ECCV 2024, 2024

Towards Language-Driven Video Inpainting via Multimodal Large Language Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

FreeU: Free Lunch in Diffusion U-Net.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

VBench: Comprehensive Benchmark Suite for Video Generative Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

VideoBooth: Diffusion-based Video Generation with Image Prompts.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
Semantic Prompt for Few-Shot Image Recognition.
CoRR, 2023

Frequency-Enhanced Data Augmentation for Vision-and-Language Navigation.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

FSAR: Federated Skeleton-based Action Recognition with Adaptive Topology Structure and Knowledge Distillation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

2022
Contrast-Reconstruction Representation Learning for Self-Supervised Skeleton-Based Action Recognition.
IEEE Trans. Image Process., 2022

Mugs: A Multi-Granular Self-Supervised Learning Framework.
CoRR, 2022

Inception Transformer.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

MetaFormer is Actually What You Need for Vision.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Generalizable Person Re-identification via Self-Supervised Batch Norm Test-Time Adaption.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Few-Shot Learning with Part Discovery and Augmentation from Unlabeled Images.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

2020
Skeleton-based action recognition with hierarchical spatial reasoning and temporal stack learning network.
Pattern Recognit., 2020

Adversarial Self-supervised Learning for Semi-supervised 3D Action Recognition.
Proceedings of the Computer Vision - ECCV 2020, 2020

Pose-Guided Multi-Granularity Attention Network for Text-Based Person Search.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Progressive Cluster Purification for Transductive Few-shot Learning.
CoRR, 2019

An Attention Enhanced Graph Convolutional LSTM Network for Skeleton-Based Action Recognition.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018
Cascade Attention Network for Person Search: Both Image and Text-Image Similarity Selection.
CoRR, 2018

Pose-Based Two-Stream Relational Networks for Action Recognition in Videos.
CoRR, 2018

Skeleton-Based Action Recognition with Spatial Reasoning and Temporal Stack Learning.
Proceedings of the Computer Vision - ECCV 2018, 2018

Multistage Adversarial Losses for Pose-Based Human Image Synthesis.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

