Runsen Xu

According to our database, Runsen Xu authored at least 20 papers between 2021 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
MMSI-Video-Bench: A Holistic Benchmark for Video-Based Spatial Intelligence.
CoRR, December, 2025

G²VLM: Geometry Grounded Vision Language Model with Unified 3D Reconstruction and Spatial Reasoning.
CoRR, November, 2025

ChangingGrounding: 3D Visual Grounding in Changing Scenes.
CoRR, October, 2025

OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding.
CoRR, July, 2025

Learning Video Generation for Robotic Manipulation with Collaborative Trajectory Control.
CoRR, June, 2025

MMSI-Bench: A Benchmark for Multi-Image Spatial Intelligence.
CoRR, May, 2025

Multi-SpatialMLLM: Multi-Frame Spatial Understanding with Multi-Modal Large Language Models.
CoRR, May, 2025

VFlowOpt: A Token Pruning Framework for LMMs with Visual Information Flow-Guided Optimization.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

2024
Grounded 3D-LLM with Referent Tokens.
CoRR, 2024

MMScan: A Multi-Modal 3D Scene Dataset with Hierarchical Grounded Language Annotations.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Chat-Scene: Bridging 3D Scene and Large Language Models with Object Identifiers.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

PointLLM: Empowering Large Language Models to Understand Point Clouds.
Proceedings of the Computer Vision - ECCV 2024, 2024

EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding.
Proceedings of the Conference on Robot Learning, 6-9 November 2024, Munich, Germany, 2024

2023
Fine-Grained Cross-View Geo-Localization Using a Correlation-Aware Homography Estimator.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

CO3: Cooperative Unsupervised 3D Representation Learning for Autonomous Driving.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

MV-JAR: Masked Voxel Jigsaw and Reconstruction for LiDAR-Based Self-Supervised Pre-Training.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
CO^3: Cooperative Unsupervised 3D Representation Learning for Autonomous Driving.
CoRR, 2022

2021
LIFE: Lighting Invariant Flow Estimation.
CoRR, 2021

RNIN-VIO: Robust Neural Inertial Navigation Aided Visual-Inertial Odometry in Challenging Scenes.
Proceedings of the IEEE International Symposium on Mixed and Augmented Reality, 2021

