Ruichuan An

ORCID: 0009-0000-3758-4335

According to our database, Ruichuan An authored at least 38 papers between 2023 and 2026.

Bibliography

2026
Rethinking VLM Representation for VLA Initialization.
CoRR, May, 2026

TaskGround: Structured Executable Task Inference for Full-Scene Household Reasoning.
CoRR, May, 2026

VGGT-Edit: Feed-forward Native 3D Scene Editing with Residual Field Prediction.
CoRR, May, 2026

Uni-Synergy: Bridging Understanding and Generation for Personalized Reasoning via Co-operative Reinforcement Learning.
CoRR, May, 2026

OpenWorldLib: A Unified Codebase and Definition of Advanced World Models.
CoRR, April, 2026

DataFlex: A Unified Framework for Data-Centric Dynamic Training of Large Language Models.
CoRR, March, 2026

PEARL: Personalized Streaming Video Understanding Model.
CoRR, March, 2026

MME-CoF-Pro: Evaluating Reasoning Coherence in Video Generative Models with Text and Visual Hints.
CoRR, March, 2026

GENIUS: Generative Fluid Intelligence Evaluation Suite.
CoRR, February, 2026

GEBench: Benchmarking Image Generation Models as GUI Environments.
CoRR, February, 2026

QuantaAlpha: An Evolutionary Framework for LLM-Driven Alpha Mining.
CoRR, February, 2026

How Well Do Models Follow Visual Instructions? VIBE: A Systematic Benchmark for Visual Instruction-Driven Image Editing.
CoRR, February, 2026

Research on World Models Is Not Merely Injecting World Knowledge into Specific Tasks.
CoRR, February, 2026

Thinking in Frames: How Visual Context and Test-Time Scaling Empower Video Reasoning.
CoRR, January, 2026

CoF-T2I: Video Models as Pure Visual Reasoners for Text-to-Image Generation.
CoRR, January, 2026

LoVR: A Benchmark for Long Video Retrieval in Multimodal Contexts.
Proceedings of the ACM Web Conference 2026, 2026

2025
DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI.
CoRR, December, 2025

GRAN-TED: Generating Robust, Aligned, and Nuanced Text Embedding for Diffusion Models.
CoRR, December, 2025

Are Video Models Ready as Zero-Shot Reasoners? An Empirical Study with the MME-CoF Benchmark.
CoRR, October, 2025

Jarvis: Towards Personalized AI Assistant via Personal KV-Cache Retrieval.
CoRR, October, 2025

Robobench: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models as Embodied Brain.
CoRR, October, 2025

MorphoBench: A Benchmark with Difficulty Adaptive to Model Reasoning.
CoRR, October, 2025

CapGeo: A Caption-Assisted Approach to Geometric Reasoning.
CoRR, October, 2025

CodeRankEval: Benchmarking and Analyzing LLM Performance for Code Ranking.
J. Comput. Sci. Technol., September, 2025

WoW: Towards a World omniscient World model Through Embodied Interaction.
CoRR, September, 2025

Perceive Anything: Recognize, Explain, Caption, and Segment Anything in Images and Videos.
CoRR, June, 2025

Adaptive Classifier-Free Guidance via Dynamic Low-Confidence Masking.
CoRR, May, 2025

SpikeGen: Generative Framework for Visual Spike Stream Processing.
CoRR, May, 2025

LoVR: A Benchmark for Long Video Retrieval in Multimodal Contexts.
CoRR, May, 2025

Concept-as-Tree: Synthetic Data is All You Need for VLM Personalization.
CoRR, March, 2025

UniCTokens: Boosting Personalized Understanding and Generation via Unified Concept Tokens.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

MoVE-KD: Knowledge Distillation for VLMs with Mixture of Visual Encoders.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024
MC-LLaVA: Multi-Concept Personalized Vision-Language Model.
CoRR, 2024

Can Modifying Data Address Graph Domain Adaptation?
Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2024

LLM as Dataset Analyst: Subpopulation Structure Discovery with Large Language Model.
Proceedings of the Computer Vision - ECCV 2024, 2024

2023
Split & Merge: Unlocking the Potential of Visual Adapters via Sparse Training.
CoRR, 2023

MoEC: Mixture of Experts Implicit Neural Compression.
CoRR, 2023

