Baoxiong Jia

ORCID: 0000-0002-4968-3290

According to our database, Baoxiong Jia authored at least 51 papers between 2017 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2025

SceneCOT: Eliciting Grounded Chain-of-Thought Reasoning in 3D Scenes.

CoRR, October, 2025

Learning Human-Humanoid Coordination for Collaborative Object Carrying.

CoRR, October, 2025

SceneWeaver: All-in-One 3D Scene Synthesis with an Extensible and Self-Reflective Agent.

CoRR, September, 2025

VideoArtGS: Building Digital Twins of Articulated Objects from Monocular Video.

CoRR, September, 2025

A VR-Based Robotic Teleoperation System With Haptic Feedback and Adaptive Collision Avoidance.

IEEE Trans. Consumer Electron., August, 2025

GWM: Towards Scalable Gaussian World Models for Robotic Manipulation.

CoRR, August, 2025

Spatial-Temporal Multi-Scale Quantization for Flexible Motion Generation.

CoRR, August, 2025

Move to Understand a 3D Scene: Bridging Visual Grounding and Exploration for Efficient and Versatile Embodied Navigation.

CoRR, July, 2025

LEO-VL: Towards 3D Vision-Language Generalists via Data Scaling with Efficient Representation.

CoRR, June, 2025

Learning Unified Force and Position Control for Legged Loco-Manipulation.

CoRR, May, 2025

RoboVerse: Towards a Unified Platform, Dataset and Benchmark for Scalable and Generalizable Robot Learning.

Charlie Tianyue Cheng

CoRR, April, 2025

ARFlow: Human Action-Reaction Flow Matching with Physical Guidance.

CoRR, March, 2025

Closed-Loop Open-Vocabulary Mobile Manipulation with GPT-4V.

Proceedings of the IEEE International Conference on Robotics and Automation, 2025

PhysPart: Physically Plausible Part Completion for Interactable Objects.

Proceedings of the IEEE International Conference on Robotics and Automation, 2025

Building Interactable Replicas of Complex Articulated Objects via Gaussian Splatting.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

METASCENES: Towards Automated Replica Creation for Real-world 3D Scans.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Masked Point-Entity Contrast for Open-Vocabulary 3D Scene Understanding.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

MOVIS: Enhancing Multi-Object Novel View Synthesis for Indoor Scenes.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Unveiling the Mist over 3D Vision-Language Understanding: Object-centric Evaluation with Chain-of-Analysis.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024

Multi-modal Situated Reasoning in 3D Scenes.

CoRR, 2024

Task-oriented Sequential Grounding in 3D Scenes.

CoRR, 2024

Closed-Loop Open-Vocabulary Mobile Manipulation with GPT-4V.

CoRR, 2024

Multi-modal Situated Reasoning in 3D Scenes.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

An Embodied Generalist Agent in 3D World.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Unifying 3D Vision-Language Understanding via Promptable Queries.

Proceedings of the Computer Vision - ECCV 2024, 2024

SlotLifter: Slot-Guided Feature Lifting for Learning Object-Centric Radiance Fields.

Proceedings of the Computer Vision - ECCV 2024, 2024

SceneVerse: Scaling 3D Vision-Language Learning for Grounded Scene Understanding.

Proceedings of the Computer Vision - ECCV 2024, 2024

PhyScene: Physically Interactable 3D Scene Synthesis for Embodied AI.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Move as you Say, Interact as you can: Language-Guided Human Motion Generation with Scene Affordance.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023

ProBio: A Protocol-guided Multimodal Dataset for Molecular Biology Lab.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Learning a Causal Transition Model for Object Cutting.

IROS, 2023

Improving Object-centric Learning with Query Optimization.

Baoxiong Jia

Yu Liu

Siyuan Huang

Proceedings of the Eleventh International Conference on Learning Representations, 2023

ARNOLD: A Benchmark for Language-Grounded Task Learning With Continuous States in Realistic 3D Scenes.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

X-VoE: Measuring eXplanatory Violation of Expectation in Physical Events.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Diffusion-based Generation, Optimization, and Planning in 3D Scenes.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

Perceive, Ground, Reason, and Act: A Benchmark for General-purpose Visual Representation.

CoRR, 2022

Unsupervised Object-Centric Learning with Bi-Level Optimized Query Slot Attention.

Baoxiong Jia

Yu Liu

Siyuan Huang

CoRR, 2022

Latent Diffusion Energy-Based Model for Interpretable Text Modeling.

CoRR, 2022

EgoTaskQA: Understanding Human Tasks in Egocentric Videos.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Latent Diffusion Energy-Based Model for Interpretable Text Modelling.

Proceedings of the International Conference on Machine Learning, 2022

Learning Algebraic Representation for Systematic Generalization in Abstract Reasoning.

Proceedings of the Computer Vision - ECCV 2022, 2022

2021

A Generalized Earley Parser for Human Activity Parsing and Prediction.

IEEE Trans. Pattern Anal. Mach. Intell., 2021

Abstract Spatial-Temporal Reasoning via Probabilistic Abduction and Execution.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

ACRE: Abstract Causal REasoning Beyond Covariation.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020

LEMMA: A Multi-view Dataset for LEarning Multi-agent Multi-task Activities.

Proceedings of the Computer Vision - ECCV 2020, 2020

2019

Human Activity Understanding and Prediction with Stochastic Grammar.

Baoxiong Jia

PhD thesis, 2019

Learning Perceptual Inference by Contrasting.

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

RAVEN: A Dataset for Relational and Analogical Visual REasoNing.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018

Generalized Earley Parser: Bridging Symbolic Grammars and Sequence Data for Future Prediction.

Siyuan Qi

Baoxiong Jia

Song-Chun Zhu

Proceedings of the 35th International Conference on Machine Learning, 2018

Learning Human-Object Interactions by Graph Parsing Neural Networks.

Proceedings of the Computer Vision - ECCV 2018, 2018

2017

Mining User Reviews for Mobile App Comparisons.

Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., 2017
