Shoubin Yu

ORCID: 0009-0006-1670-0054

According to our database, Shoubin Yu authored at least 18 papers between 2021 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
Video-RTS: Rethinking Reinforcement Learning and Test-Time Scaling for Efficient and Enhanced Video Reasoning.
CoRR, July, 2025

4D-LRM: Large Space-Time Reconstruction Model From and To Any View at Any Time.
CoRR, June, 2025

MEXA: Towards General Multimodal Reasoning with Dynamic Multi-Expert Aggregation.
CoRR, June, 2025

Movie Facts and Fibs (MF²): A Benchmark for Long Movie Understanding.
CoRR, June, 2025

Training-free Guidance in Text-to-Video Generation via Multimodal Planning and Structured Noise Initialization.
CoRR, April, 2025

VEGGIE: Instructional Editing and Reasoning of Video Concepts with Grounded Generation.
CoRR, March, 2025

CREMA: Generalizable and Efficient Video-Language Reasoning via Multimodal Modular Fusion.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image And Video Generation.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Bootstrapping Language-Guided Navigation Learning with Self-Refining Data Flywheel.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Motion-Grounded Video Reasoning: Understanding and Perceiving Motion at Pixel Level.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024
Regularity Learning via Explicit Distribution Modeling for Skeletal Video Anomaly Detection.
IEEE Trans. Circuits Syst. Video Technol., August, 2024

RACCooN: Remove, Add, and Change Video Content with Auto-Generated Narratives.
CoRR, 2024

CREMA: Multimodal Compositional Video Reasoning via Efficient Modular Adaptation and Fusion.
CoRR, 2024

Zero-Shot Controllable Image-to-Video Animation via Motion Decomposition.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

A Simple LLM Framework for Long-Range Video Question-Answering.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

2023
Self-Chained Image-Language Model for Video Localization and Question Answering.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

2021
STAR: A Benchmark for Situated Reasoning in Real-World Videos.
Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, 2021

