Siyuan Huang

ORCID: 0000-0003-1524-7148

Affiliations:
  • Beijing Institute for General Artificial Intelligence (BIGAI), China
  • University of California, Los Angeles, CA, USA (PhD 2021)


According to our database, Siyuan Huang authored at least 43 papers between 2017 and 2024.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2024
SceneVerse: Scaling 3D Vision-Language Learning for Grounded Scene Understanding.
CoRR, 2024

2023
An Embodied Generalist Agent in 3D World.
CoRR, 2023

ProBio: A Protocol-guided Multimodal Dataset for Molecular Biology Lab.
CoRR, 2023

Single-view 3D Scene Reconstruction with High-fidelity Shape and Texture.
CoRR, 2023

Grasp Multiple Objects with One Hand.
CoRR, 2023

3D-VisTA: Pre-trained Transformer for 3D Vision and Text Alignment.
CoRR, 2023

ARNOLD: A Benchmark for Language-Grounded Task Learning With Continuous States in Realistic 3D Scenes.
CoRR, 2023

GenDexGrasp: Generalizable Dexterous Grasping.
Proceedings of the IEEE International Conference on Robotics and Automation, 2023

SQA3D: Situated Question Answering in 3D Scenes.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

A Minimalist Dataset for Systematic Generalization of Perception, Syntax, and Semantics.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

3D-VisTA: Pre-trained Transformer for 3D Vision and Text Alignment.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Full-Body Articulated Human-Object Interaction.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Diffusion-based Generation, Optimization, and Planning in 3D Scenes.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
CHAIRS: Towards Full-Body Articulated Human-Object Interaction.
CoRR, 2022

Perceive, Ground, Reason, and Act: A Benchmark for General-purpose Visual Representation.
CoRR, 2022

Neural-Symbolic Recursive Machine for Systematic Generalization.
CoRR, 2022

PartAfford: Part-level Affordance Discovery from 3D Objects.
CoRR, 2022

HUMANISE: Language-conditioned Human Motion Generation in 3D Scenes.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

EgoTaskQA: Understanding Human Tasks in Egocentric Videos.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Learning V1 Simple Cells with Vector Representation of Local Content and Matrix Representation of Local Motion.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Human-like Holistic 3D Scene Understanding.
PhD thesis, 2021

A Generalized Earley Parser for Human Activity Parsing and Prediction.
IEEE Trans. Pattern Anal. Mach. Intell., 2021

A HINT from Arithmetic: On Systematic Generalization of Perception, Syntax, and Semantics.
CoRR, 2021

Spatio-temporal Self-Supervised Representation Learning for 3D Point Clouds.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

VLGrammar: Grounded Grammar Induction of Vision and Language.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

YouRefIt: Embodied Reference Understanding with Language and Gesture.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Learning Neural Representation of Camera Pose with Matrix Representation of Pose Shift via View Synthesis.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

SMART: A Situation Model for Algebra Story Problems via Attributed Grammar.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Learning by Fixing: Solving Math Word Problems with Weak Supervision.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Dark, Beyond Deep: A Paradigm Shift to Cognitive AI with Humanlike Common Sense.
CoRR, 2020

Closed Loop Neural-Symbolic Learning via Integrating Neural Perception, Grammar Parsing, and Symbolic Reasoning.
Proceedings of the 37th International Conference on Machine Learning, 2020

A Competence-Aware Curriculum for Visual Concepts Learning via Question Answering.
Proceedings of the Computer Vision - ECCV 2020, 2020

LEMMA: A Multi-view Dataset for LEarning Multi-agent Multi-task Activities.
Proceedings of the Computer Vision - ECCV 2020, 2020

2019
PerspectiveNet: 3D Object Detection from a Single RGB Image via Perspective Points.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Understanding Human Gaze Communication by Spatio-Temporal Graph Reasoning.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Holistic++ Scene Understanding: Single-View 3D Holistic Scene Parsing and Human Pose Estimation With Human-Object Interaction and Physical Commonsense.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

2018
Configurable 3D Scene Synthesis and 2D Image Rendering with Per-pixel Ground Truth Using Stochastic Grammars.
Int. J. Comput. Vis., 2018

Cooperative Holistic Scene Understanding: Unifying 3D Object, Layout, and Camera Pose Estimation.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Holistic 3D Scene Parsing and Reconstruction from a Single RGB Image.
Proceedings of the Computer Vision - ECCV 2018, 2018

Human-Centric Indoor Scene Synthesis Using Stochastic Grammar.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

2017
Configurable, Photorealistic Image Rendering and Ground Truth Synthesis by Sampling Stochastic Grammars Representing Indoor Scenes.
CoRR, 2017

Predicting Human Activities Using Stochastic Grammar.
Proceedings of the IEEE International Conference on Computer Vision, 2017

