Ruihang Chu

ORCID: 0000-0001-9057-745X

According to our database, Ruihang Chu authored at least 53 papers between 2019 and 2026.

Bibliography

2026
Smaller Models are Natural Explorers for Policy-Level Diversity in GRPO.
CoRR, May, 2026

AVBench: Human-Aligned and Automated Evaluation Benchmark for Audio-Video Generative Models.
CoRR, May, 2026

MSAVBench: Towards Comprehensive and Reliable Evaluation of Multi-Shot Audio-Video Generation.
CoRR, May, 2026

LongLive-2.0: An NVFP4 Parallel Infrastructure for Long Video Generation.
CoRR, May, 2026

DiffusionOPD: A Unified Perspective of On-Policy Distillation in Diffusion Models.
CoRR, May, 2026

Video-Zero: Self-Evolution Video Understanding.
CoRR, May, 2026

Velocity-Space 3D Asset Editing.
CoRR, May, 2026

Mini-Gemini: Mining the Potential of Multi-Modality Vision Language Models.
IEEE Trans. Pattern Anal. Mach. Intell., March, 2026

DreamVideo-Omni: Omni-Motion Controlled Multi-Subject Video Customization with Latent Identity Reinforcement Learning.
CoRR, March, 2026

From Narrow to Panoramic Vision: Attention-Guided Cold-Start Reshapes Multimodal Reasoning.
CoRR, March, 2026

SIN-Bench: Tracing Native Evidence Chains in Long-Context Multimodal Scientific Interleaved Literature.
CoRR, January, 2026

X-Coder: Advancing Competitive Programming with Fully Synthetic Tasks, Solutions, and Tests.
CoRR, January, 2026

O-DisCo-Edit: Object Distortion Control for Unified Realistic Video Editing.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025
DreamOmni3: Scribble-based Editing and Generation.
CoRR, December, 2025

VideoZoomer: Reinforcement-Learned Temporal Focusing for Long Video Reasoning.
CoRR, December, 2025

Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance.
CoRR, December, 2025

Nav-R² Dual-Relation Reasoning for Generalizable Open-Vocabulary Object-Goal Navigation.
CoRR, December, 2025

A Survey of Reasoning with Foundation Models: Concepts, Methodologies, and Outlook.
ACM Comput. Surv., November, 2025

Enhancing Diffusion-based Restoration Models via Difficulty-Adaptive Reinforcement Learning with IQA Reward.
CoRR, November, 2025

Boosting Fidelity for Pre-Trained-Diffusion-Based Low-Light Image Enhancement via Condition Refinement.
CoRR, October, 2025

Generative Universal Verifier as Multimodal Meta-Reasoner.
CoRR, October, 2025

AdaViewPlanner: Adapting Video Diffusion Models for Viewpoint Planning in 4D Scenes.
CoRR, October, 2025

SANA-Video: Efficient Video Generation with Block Linear Diffusion Transformer.
CoRR, September, 2025

LongLive: Real-time Interactive Long Video Generation.
CoRR, September, 2025

A Generative Foundation Model for Chest Radiography.
CoRR, September, 2025

Exploiting Discriminative Codebook Prior for Autoregressive Image Generation.
CoRR, August, 2025

DreamVE: Unified Instruction-based Image and Video Editing.
CoRR, August, 2025

TTS-VAR: A Test-Time Scaling Framework for Visual Auto-Regressive Generation.
CoRR, July, 2025

AnyCap Project: A Unified Framework, Dataset, and Benchmark for Controllable Omni-modal Captioning.
CoRR, July, 2025

Zero-P-to-3: Zero-Shot Partial-View Images to 3D Object.
CoRR, May, 2025

Wan: Open and Advanced Large-Scale Video Generative Models.
CoRR, March, 2025

IterPref: Focal Preference Learning for Code Generation via Iterative Debugging.
CoRR, March, 2025

The Matrix: Infinite-Horizon World Generation with Real-Time Moving Control.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

DialogGen: Multi-modal Interactive Dialogue System with Multi-turn Text-Image Generation.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025

FreeScale: Unleashing the Resolution of Diffusion Models via Tuning-Free Scale Fusion.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

Teaching Your Models to Understand Code via Focal Preference Alignment.
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

InSerter: Speech Instruction Following with Unsupervised Interleaved Pre-training.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024
DriveCoT: Integrating Chain-of-Thought Reasoning with End-to-End Driving.
CoRR, 2024

DialogGen: Multi-modal Interactive Dialogue System for Multi-turn Text-to-Image Generation.
CoRR, 2024

2023
A Survey of Reasoning with Foundation Models.
CoRR, 2023

DiT-3D: Exploring Plain Diffusion Transformers for 3D Shape Generation.
CoRR, 2023

DiT-3D: Exploring Plain Diffusion Transformers for 3D Shape Generation.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

DiffComplete: Diffusion-based Generative 3D Shape Completion.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Mask-Attention-Free Transformer for 3D Instance Segmentation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

TriVol: Point Cloud Rendering via Triple Volumes.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Command-driven Articulated Object Understanding and Manipulation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
ICM-3D: Instantiated Category Modeling for 3D Instance Segmentation.
IEEE Robotics Autom. Lett., 2022

TWIST: Two-Way Inter-label Self-Training for Semi-supervised 3D Instance Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
Simultaneous Semantic and Collision Learning for 6-DoF Grasp Pose Estimation.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2021

Scale-Aware Automatic Augmentation for Object Detection.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020
Co-Actuation: A Method for Achieving High Stiffness and Low Inertia for Haptic Devices.
IEEE Trans. Haptics, 2020

2019
An Intuitive End-to-End Human-UAV Interaction System for Field Exploration.
Frontiers Neurorobotics, 2019

Vehicle Re-Identification With Viewpoint-Aware Metric Learning.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019
