Peng Jia

ORCID: 0000-0002-2273-3892

Affiliations:
  • Simplexity Robotics, Beijing, China
  • Li Auto, Beijing, China
  • University of Maryland, College Park, MD, USA (PhD 2011)


According to our database, Peng Jia authored at least 37 papers between 2024 and 2026.

Bibliography

2026
HarmoWAM: Harmonizing Generalizable and Precise Manipulation via Adaptive World Action Models.
CoRR, May, 2026

LaST-R1: Reinforcing Action via Adaptive Physical Latent Reasoning for VLA Models.
CoRR, April, 2026

Look Before Acting: Enhancing Vision Foundation Representations for Vision-Language-Action Models.
CoRR, March, 2026

TwinRL-VLA: Digital Twin-Driven Reinforcement Learning for Real-World Robotic Manipulation.
CoRR, February, 2026

Other Vehicle Trajectories Are Also Needed: A Driving World Model Unifies Ego-Other Vehicle Trajectories in Video Latent Space.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025
ManualVLA: A Unified VLA Model for Chain-of-Thought Manual Generation and Robotic Manipulation.
CoRR, December, 2025

The Better You Learn, The Smarter You Prune: Towards Efficient Vision-language-action Models via Differentiable Token Pruning.
CoRR, September, 2025

OmniReason: A Temporal-Guided Vision-Language-Action Framework for Autonomous Driving.
CoRR, September, 2025

MagicRoad: Semantic-Aware 3D Road Surface Reconstruction via Obstacle Inpainting.
CoRR, July, 2025

World4Drive: End-to-End Autonomous Driving via Intention-aware Physical Latent World Model.
CoRR, July, 2025

RoboPearls: Editable Video Simulation for Robot Manipulation.
CoRR, June, 2025

DriveAction: A Benchmark for Exploring Human-like Driving Decisions in VLA Models.
CoRR, June, 2025

GeoDrive: 3D Geometry-Informed Driving World Model with Precise Action Control.
CoRR, May, 2025

TransDiffuser: End-to-end Trajectory Generation with Decorrelated Multi-modal Representation for Autonomous Driving.
CoRR, May, 2025

TokenFLEX: Unified VLM Training for Flexible Visual Tokens Inference.
CoRR, April, 2025

StyledStreets: Multi-style Street Simulator with Spatial and Temporal Consistency.
CoRR, March, 2025

Finetuning Generative Trajectory Model with Reinforcement Learning from Human Feedback.
CoRR, March, 2025

Other Vehicle Trajectories Are Also Needed: A Driving World Model Unifies Ego-Other Vehicle Trajectories in Video Latent Space.
CoRR, March, 2025

OmniGen: Unified Multimodal Sensor Generation for Autonomous Driving.
Proceedings of the 33rd ACM International Conference on Multimedia, 2025

PosePilot: Steering Camera Pose for Generative World Models with Self-supervised Depth.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2025

Generalizing Motion Planners with Mixture of Experts for Autonomous Driving.
Proceedings of the IEEE International Conference on Robotics and Automation, 2025

S2-Track: A Simple yet Strong Approach for End-to-End 3D Multi-Object Tracking.
Proceedings of the Forty-second International Conference on Machine Learning, 2025

World4Drive: End-to-End Autonomous Driving via Intention-Aware Physical Latent World Model.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

HiNeuS: High-Fidelity Neural Surface Mitigating Low-Texture and Reflective Ambiguity.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

RoboPearls: Editable Video Simulation for Robot Manipulation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

ReconDreamer: Crafting World Models for Driving Scene Reconstruction via Online Restoration.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

BEV-TSR: Text-Scene Retrieval in BEV Space for Autonomous Driving.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024
UniPLV: Towards Label-Efficient Open-World 3D Scene Understanding by Regional Visual Language Supervision.
CoRR, 2024

GaussianAD: Gaussian-Centric End-to-End Autonomous Driving.
CoRR, 2024

Preliminary Investigation into Data Scaling Laws for Imitation Learning-Based End-to-End Autonomous Driving.
CoRR, 2024

DiVE: DiT-based Video Generation with Enhanced Control.
CoRR, 2024

UA-Track: Uncertainty-Aware End-to-End 3D Multi-Object Tracking.
CoRR, 2024

Unleashing Generalization of End-to-End Autonomous Driving with Controllable Long Video Generation.
CoRR, 2024

DriveVLM: The Convergence of Autonomous Driving and Large Vision-Language Models.
CoRR, 2024

BEV-CLIP: Multi-modal BEV Retrieval Methodology for Complex Scene in Autonomous Driving.
CoRR, 2024

TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes.
Proceedings of Computer Vision - ECCV 2024, 2024

DriveVLM: The Convergence of Autonomous Driving and Large Vision-Language Models.
Proceedings of the Conference on Robot Learning, 6-9 November 2024, Munich, Germany, 2024

