Songen Gu

ORCID: 0009-0002-6730-6218

According to our database, Songen Gu authored at least 21 papers between 2022 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
VistaBot: View-Robust Robot Manipulation via Spatiotemporal-Aware View Synthesis.
CoRR, April, 2026

PokeVLA: Empowering Pocket-Sized Vision-Language-Action Model with Comprehensive World Knowledge Guidance.
CoRR, April, 2026

OccTENS: 3D Occupancy World Model via Temporal Next-Scale Prediction.
IEEE Robotics Autom. Lett., March, 2026

OmniVTA: Visuo-Tactile World Modeling for Contact-Rich Robotic Manipulation.
CoRR, March, 2026

Mimir: Hierarchical Goal-Driven Diffusion With Uncertainty Propagation for End-to-End Autonomous Driving.
IEEE Robotics Autom. Lett., February, 2026

Say, Dream, and Act: Learning Video World Models for Instruction-Driven Robot Manipulation.
CoRR, February, 2026

2025
World In Your Hands: A Large-Scale and Open-source Ecosystem for Learning Human-centric Manipulation in the Wild.
CoRR, December, 2025

UniArt: Unified 3D Representation for Generating 3D Articulated Objects with Open-Set Articulation.
CoRR, November, 2025

OccTENS: 3D Occupancy World Model via Temporal Next-Scale Prediction.
CoRR, September, 2025

MAG: Multi-Modal Aligned Autoregressive Co-Speech Gesture Generation without Vector Quantization.
CoRR, March, 2025

VRsketch2Gaussian: 3D VR Sketch Guided 3D Object Generation with Gaussian Splatting.
CoRR, March, 2025

MonoBite: Scale-Aware 3D Reconstruction and Volume Estimation from Monocular Multi-food Images.
Proceedings of the Pattern Recognition and Computer Vision - 8th Chinese Conference, 2025

ComDrive: Comfort-Oriented End-to-End Autonomous Driving.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2025

2024
GaussianGrasper: 3D Language Gaussian Splatting for Open-Vocabulary Robotic Grasping.
IEEE Robotics Autom. Lett., September, 2024

DOME: Taming Diffusion Model into High-Fidelity Controllable Occupancy World Model.
CoRR, 2024

HE-Drive: Human-Like End-to-End Driving with Vision Language Models.
CoRR, 2024

Text2Street: Controllable Text-to-image Generation for Street Views.
CoRR, 2024

Text2Street: Controllable Text-to-Image Generation for Street Views.
Proceedings of the Pattern Recognition - 27th International Conference, 2024

Structured-NeRF: Hierarchical Scene Graph with Neural Representation.
Proceedings of the Computer Vision - ECCV 2024, 2024

2023
ASSIST: Interactive Scene Nodes for Scalable and Realistic Indoor Simulation.
CoRR, 2023

2022
A Multi-Granularity Information-Based Method for Learning High-Dimensional Bayesian Network Structures.
Cogn. Comput., 2022

