Songen Gu

Orcid: 0009-0002-6730-6218

According to our database, Songen Gu authored at least 21 papers between 2022 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

VistaBot: View-Robust Robot Manipulation via Spatiotemporal-Aware View Synthesis.

CoRR, April, 2026

PokeVLA: Empowering Pocket-Sized Vision-Language-Action Model with Comprehensive World Knowledge Guidance.

CoRR, April, 2026

OccTENS: 3D Occupancy World Model via Temporal Next-Scale Prediction.

IEEE Robotics Autom. Lett., March, 2026

OmniVTA: Visuo-Tactile World Modeling for Contact-Rich Robotic Manipulation.

CoRR, March, 2026

Mimir: Hierarchical Goal-Driven Diffusion With Uncertainty Propagation for End-to-End Autonomous Driving.

IEEE Robotics Autom. Lett., February, 2026

Say, Dream, and Act: Learning Video World Models for Instruction-Driven Robot Manipulation.

CoRR, February, 2026

2025

World In Your Hands: A Large-Scale and Open-source Ecosystem for Learning Human-centric Manipulation in the Wild.

CoRR, December, 2025

UniArt: Unified 3D Representation for Generating 3D Articulated Objects with Open-Set Articulation.

CoRR, November, 2025

OccTENS: 3D Occupancy World Model via Temporal Next-Scale Prediction.

CoRR, September, 2025

MAG: Multi-Modal Aligned Autoregressive Co-Speech Gesture Generation without Vector Quantization.

CoRR, March, 2025

VRsketch2Gaussian: 3D VR Sketch Guided 3D Object Generation with Gaussian Splatting.

CoRR, March, 2025

MonoBite: Scale-Aware 3D Reconstruction and Volume Estimation from Monocular Multi-food Images.

Proceedings of the Pattern Recognition and Computer Vision - 8th Chinese Conference, 2025

ComDrive: Comfort-Oriented End-to-End Autonomous Driving.

Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2025

2024

GaussianGrasper: 3D Language Gaussian Splatting for Open-Vocabulary Robotic Grasping.

IEEE Robotics Autom. Lett., September, 2024

DOME: Taming Diffusion Model into High-Fidelity Controllable Occupancy World Model.

CoRR, 2024

HE-Drive: Human-Like End-to-End Driving with Vision Language Models.

CoRR, 2024

Text2Street: Controllable Text-to-image Generation for Street Views.

CoRR, 2024

Text2Street: Controllable Text-to-Image Generation for Street Views.

Proceedings of the Pattern Recognition - 27th International Conference, 2024

Structured-NeRF: Hierarchical Scene Graph with Neural Representation.

Proceedings of the Computer Vision - ECCV 2024, 2024

2023

ASSIST: Interactive Scene Nodes for Scalable and Realistic Indoor Simulation.

CoRR, 2023

2022

A Multi-Granularity Information-Based Method for Learning High-Dimensional Bayesian Network Structures.

Cogn. Comput., 2022
