Wenxuan Song

ORCID: 0009-0006-0406-2497

According to our database, Wenxuan Song authored at least 21 papers between 2024 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
Unified Diffusion VLA: Vision-Language-Action Model via Joint Discrete Denoising Diffusion Process.
CoRR, November, 2025

VLA^2: Empowering Vision-Language-Action Models with an Agentic Framework for Unseen Concept Manipulation.
CoRR, October, 2025

Spatial Forcing: Implicit Spatial Representation Alignment for Vision-language-action Model.
CoRR, October, 2025

Towards a Unified Understanding of Robot Manipulation: A Comprehensive Survey.
CoRR, October, 2025

"In my defense, only three hours on Instagram": Designing Toward Digital Self-Awareness and Wellbeing.
CoRR, September, 2025

VLA-Adapter: An Effective Paradigm for Tiny-Scale Vision-Language-Action Model.
CoRR, September, 2025

FlowVLA: Thinking in Motion with a Visual Chain of Thought.
CoRR, August, 2025

ReconVLA: Reconstructive Vision-Language-Action Model as Effective Robot Perceiver.
CoRR, August, 2025

CEED-VLA: Consistency Vision-Language-Action Model with Early-Exit Decoding.
CoRR, June, 2025

RationalVLA: A Rational Vision-Language-Action Model with Dual System.
CoRR, June, 2025

OpenHelix: A Short Survey, Empirical Analysis, and Open-Source Dual-System VLA Model for Robotic Manipulation.
CoRR, May, 2025

Accelerating Vision-Language-Action Model Integrated with Action Chunking via Parallel Decoding.
CoRR, March, 2025

Emergence of Classical Random Walk from Non-Hermitian Effects in Quantum Kicked Rotor.
Entropy, 2025

MoRE: Unlocking Scalability in Reinforcement Learning for Quadruped Vision-Language-Action Models.
Proceedings of the IEEE International Conference on Robotics and Automation, 2025

Seeing Far and Clearly: Mitigating Hallucinations in MLLMs with Attention Causal Decoding.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

WaterSplatting: Fast Underwater 3D Scene Reconstruction Using Gaussian Splatting.
Proceedings of the International Conference on 3D Vision, 2025

2024
A Super-Resolution and 3D Reconstruction Method Based on OmDF Endoscopic Images.
Sensors, August, 2024

MapSAM: Adapting Segment Anything Model for Automated Feature Detection in Historical Maps.
CoRR, 2024

ProFD: Prompt-Guided Feature Disentangling for Occluded Person Re-Identification.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

GeRM: A Generalist Robotic Model with Mixture-of-experts for Quadruped Robot.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2024

QUAR-VLA: Vision-Language-Action Model for Quadruped Robots.
Proceedings of the Computer Vision - ECCV 2024, 2024

