Xiaoyu Shi

Orcid: 0009-0003-3696-4442

Affiliations:
  • Chinese University of Hong Kong, Multimedia Laboratory, Hong Kong


According to our database, Xiaoyu Shi authored at least 34 papers between 2021 and 2026.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2026
DecMem: Towards Minute-Long Consistent World Generation with Decoupled Memory.
CoRR, May, 2026

ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling.
CoRR, March, 2026

Kling-MotionControl Technical Report.
CoRR, March, 2026

2025
SemanticGen: Video Generation in Semantic Space.
CoRR, December, 2025

KlingAvatar 2.0 Technical Report.
CoRR, December, 2025

MultiShotMaster: A Controllable Multi-Shot Video Generation Framework.
CoRR, December, 2025

RelightMaster: Precise Video Relighting with Multi-plane Light Images.
CoRR, November, 2025

VFXMaster: Unlocking Dynamic Visual Effect Generation via In-Context Learning.
CoRR, October, 2025

FlexDrive: Toward Trajectory Flexibility in Driving Scene Reconstruction and Rendering.
CoRR, February, 2025

GS-DiT: Advancing Video Generation with Pseudo 4D Gaussian Fields through Efficient Dense 3D Point Tracking.
CoRR, January, 2025

CamCloneMaster: Enabling Reference-based Camera Control for Video Generation.
Proceedings of the SIGGRAPH Asia 2025 Conference Papers, 2025

CineMaster: A 3D-Aware and Controllable Framework for Cinematic Text-to-Video Generation.
Proceedings of the Special Interest Group on Computer Graphics and Interactive Techniques Conference, 2025

3DTrajMaster: Mastering 3D Trajectory for Multi-Entity Motion in Video Generation.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

FlexDrive: Toward Trajectory Flexibility in Driving Scene Gaussian Splatting Reconstruction and Rendering.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

GS-DiT: Advancing Video Generation with Dynamic 3D Gaussian Fields through Efficient Dense 3D Point Tracking.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024
AnimateLCM: Accelerating the Animation of Personalized Diffusion Models and Adapters with Decoupled Consistency Learning.
CoRR, 2024

AnimateLCM: Computation-Efficient Personalized Style Video Generation without Personalized Video Data.
Proceedings of the SIGGRAPH Asia 2024 Technical Communications, 2024

Motion-I2V: Consistent and Controllable Image-to-Video Generation with Explicit Motion Modeling.
Proceedings of the ACM SIGGRAPH 2024 Conference Papers, 2024

Three Things We Need to Know About Transferring Stable Diffusion to Visual Dense Prediction Tasks.
Proceedings of the Computer Vision - ECCV 2024, 2024

Be-Your-Outpainter: Mastering Video Outpainting Through Input-Specific Adaptation.
Proceedings of the Computer Vision - ECCV 2024, 2024

BlinkVision: A Benchmark for Optical Flow, Scene Flow and Point Tracking Estimation Using RGB Frames and Events.
Proceedings of the Computer Vision - ECCV 2024, 2024

2023
FlowFormer: A Transformer Architecture and Its Masked Cost Volume Autoencoding for Optical Flow.
CoRR, 2023

Context-TAP: Tracking Any Point Demands Spatial Context Features.
CoRR, 2023

KBNet: Kernel Basis Network for Image Restoration.
CoRR, 2023

A Unified Conditional Framework for Diffusion-based Image Restoration.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Context-PIPs: Persistent Independent Particles Demands Context Features.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

BlinkFlow: A Dataset to Push the Limits of Event-Based Optical Flow Estimation.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2023

VideoFlow: Exploiting Temporal Cues for Multi-frame Optical Flow Estimation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

FlowFormer++: Masked Cost Volume Autoencoding for Pretraining Optical Flow Estimation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

A Simple Baseline for Video Restoration with Grouped Spatial-Temporal Shift.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
No Attention is Needed: Grouped Spatial-temporal Shift for Simple and Efficient Video Restorers.
CoRR, 2022

FlowFormer: A Transformer Architecture for Optical Flow.
Proceedings of the Computer Vision - ECCV 2022, 2022

2021
Decoupled Spatial-Temporal Transformer for Video Inpainting.
CoRR, 2021

FuseFormer: Fusing Fine-Grained Information in Transformers for Video Inpainting.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

