Sicheng Xu

Orcid: 0000-0002-7903-3934

According to our database, Sicheng Xu authored at least 25 papers between 2019 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
TaskGround: Structured Executable Task Inference for Full-Scene Household Reasoning.
CoRR, May, 2026

Beyond Voxel 3D Editing: Learning from 3D Masks and Self-Constructed Data.
CoRR, April, 2026

HiSpatial: Taming Hierarchical 3D Spatial Understanding in Vision-Language Models.
CoRR, March, 2026

2025
Native and Compact Structured Latents for 3D Generation.
CoRR, December, 2025

Scalable Vision-Language-Action Model Pretraining for Robotic Manipulation with Real-Life Human Activity Videos.
CoRR, October, 2025

Seeing Across Views: Benchmarking Spatial Reasoning of Vision-Language Models in Robotic Scenes.
CoRR, October, 2025

Gaussian Variation Field Diffusion for High-fidelity Video-to-4D Synthesis.
CoRR, July, 2025

Neural Estimation of the Information Bottleneck Based on a Mapping Approach.
CoRR, July, 2025

Estimating Rate-Distortion Functions Using the Energy-Based Model.
CoRR, July, 2025

MoGe-2: Accurate Monocular Geometry with Metric Scale and Sharp Details.
CoRR, July, 2025

VASA-Rig: Audio-Driven 3D Facial Animation with 'Live' Mood Dynamics in Virtual Reality.
IEEE Trans. Vis. Comput. Graph., May, 2025

VASA-3D: Lifelike Audio-Driven Gaussian Head Avatars from a Single Image.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

Estimating Rate-Distortion Functions Using the Energy-Based Model.
Proceedings of the IEEE Information Theory Workshop, 2025

Accelerated Dropout: A Bitmask Approach to Speed Up Model Training.
Proceedings of the International Joint Conference on Neural Networks, 2025

Robust Optical Transceiver Manipulation in Cluttered Cable Environments Using 3D Scene Understanding and Planning.
Proceedings of the IEEE International Conference on Robotics and Automation, 2025

Gaussian Variation Field Diffusion for High-Fidelity Video-to-4D Synthesis.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

Structured 3D Latents for Scalable and Versatile 3D Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

MoGe: Unlocking Accurate Monocular Geometry Estimation for Open-Domain Images with Optimal Training Supervision.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024
CogACT: A Foundational Vision-Language-Action Model for Synergizing Cognition and Action in Robotic Manipulation.
CoRR, 2024

VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Neural Estimation of the Information Bottleneck Based on a Mapping Approach.
Proceedings of the IEEE Information Theory Workshop, 2024

2023
RemoteTouch: Enhancing Immersive 3D Video Communication with Hand Touch.
Proceedings of the IEEE Conference on Virtual Reality and 3D User Interfaces, 2023

AniPortraitGAN: Animatable 3D Portrait Generation from 2D Image Collections.
Proceedings of the SIGGRAPH Asia 2023 Conference Papers, 2023

2020
Deep 3D Portrait From a Single Image.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
Accurate 3D Face Reconstruction With Weakly-Supervised Learning: From Single Image to Image Set.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019

