Guan Huang

Affiliations:
  • XForwardAI, Beijing, China
  • PhiGent Robotics
  • Chinese Academy of Sciences (CASIA), Institute of Automation, China (2016)


According to our database, Guan Huang authored at least 58 papers between 2015 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
EmbodieDreamer: Advancing Real2Sim2Real Transfer for Policy Training via Embodied World Modeling.
CoRR, July, 2025

Gait Recognition in the Wild: A Large-Scale Benchmark and NAS-Based Baseline.
IEEE Trans. Pattern Anal. Mach. Intell., June, 2025

WonderFree: Enhancing Novel View Quality and Cross-View Consistency for 3D Scene Exploration.
CoRR, June, 2025

GigaVideo-1: Advancing Video Generation via Automatic Feedback with 4 GPU-Hours Fine-Tuning.
CoRR, June, 2025

Motion-R1: Chain-of-Thought Reasoning and Reinforcement Learning for Human Motion Generation.
CoRR, June, 2025

RoboTransfer: Geometry-Consistent Video Diffusion for Robotic Visual Policy Transfer.
CoRR, May, 2025

HumanDreamer-X: Photorealistic Single-image Human Avatars Reconstruction via Gaussian Restoration.
CoRR, April, 2025

WonderTurbo: Generating Interactive 3D World in 0.72 Seconds.
CoRR, April, 2025

ReconDreamer++: Harmonizing Generative and Reconstructive Models for Driving Scene Representation.
CoRR, March, 2025

DriveDreamer4D: World Models Are Effective Data Machines for 4D Driving Scene Representation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

HumanDreamer: Generating Controllable Human-Motion Videos via Decoupled Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

ReconDreamer: Crafting World Models for Driving Scene Reconstruction via Online Restoration.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

DriveDreamer-2: LLM-Enhanced World Models for Diverse Driving Video Generation.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024
Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond.
CoRR, 2024

WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens.
CoRR, 2024

DriveDreamer: Towards Real-World-Drive World Models for Autonomous Driving.
Proceedings of the Computer Vision - ECCV 2024, 2024

2023
WebFace260M: A Benchmark for Million-Scale Deep Face Recognition.
IEEE Trans. Pattern Anal. Mach. Intell., 2023

DriveDreamer: Towards Real-world-driven World Models for Autonomous Driving.
CoRR, 2023

Detachable Novel Views Synthesis of Dynamic Scenes Using Distribution-Driven Neural Radiance Fields.
CoRR, 2023

HFT: Lifting Perspective Representations via Hybrid Feature Transformation for BEV Perception.
Proceedings of the IEEE International Conference on Robotics and Automation, 2023

Efficient and Hybrid Decoder for Local Map Construction in Bird's-Eye-View.
Proceedings of the IEEE International Conference on Robotics and Automation, 2023

Are We Ready for Vision-Centric Driving Streaming Perception? The ASAP Benchmark.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

CompletionFormer: Depth Completion with Convolutions and Vision Transformers.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

A Simple Baseline for Multi-Camera 3D Object Detection.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

Crafting Monocular Cues and Velocity Guidance for Self-Supervised Multi-Frame Depth Learning.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
BEVPoolv2: A Cutting-edge Implementation of BEVDet Toward Deployment.
CoRR, 2022

BEVerse: Unified Perception and Prediction in Birds-Eye-View for Vision-Centric Autonomous Driving.
CoRR, 2022

Gait Recognition in the Wild: A Benchmark.
CoRR, 2022

HFT: Lifting Perspective Representations via Hybrid Feature Transformation.
CoRR, 2022

BEVDet4D: Exploit Temporal Cues in Multi-camera 3D Object Detection.
CoRR, 2022

MVSTER: Epipolar Transformer for Efficient Multi-view Stereo.
Proceedings of the Computer Vision - ECCV 2022, 2022

Dimension Embeddings for Monocular 3D Object Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

CAFE: Learning to Condense Dataset by Aligning Features.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

SurroundDepth: Entangling Surrounding Views for Self-Supervised Multi-Camera Depth Estimation.
Proceedings of the Conference on Robot Learning, 2022

MonoViT: Self-Supervised Monocular Depth Estimation with a Vision Transformer.
Proceedings of the International Conference on 3D Vision, 2022

2021
BEVDet: High-performance Multi-camera 3D Object Detection in Bird-Eye-View.
CoRR, 2021

Face-NMS: A Core-set Selection Approach for Efficient Face Recognition.
CoRR, 2021

Masked Face Recognition Challenge: The WebFace260M Track Report.
CoRR, 2021

Structure-Aware Face Clustering on a Large-Scale Graph with $\bf{10^{7}}$ Nodes.
CoRR, 2021

WebFace260M: A Benchmark Unveiling the Power of Million-Scale Deep Face Recognition.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Structure-Aware Face Clustering on a Large-Scale Graph With $10^{7}$ Nodes.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

SIMPLE: SIngle-network with Mimicking and Point Learning for Bottom-up Human Pose Estimation.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Improve Person Re-Identification With Part Awareness Learning.
IEEE Trans. Image Process., 2020

How to Train Your Robust Human Pose Estimator: Pay Attention to the Constraint Cue.
CoRR, 2020

The Devil Is in the Details: Delving Into Unbiased Data Processing for Human Pose Estimation.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
Action Machine: Toward Person-Centric Action Recognition in Videos.
IEEE Signal Process. Lett., 2019

Multi-Stage HRNet: Multiple Stage High-Resolution Network for Human Pose Estimation.
CoRR, 2019

High Performance Visual Object Tracking with Unified Convolutional Networks.
CoRR, 2019

FastPose: Towards Real-time Pose Estimation and Tracking via Scale-normalized Multi-task Networks.
CoRR, 2019

Exploiting Offset-guided Network for Pose Estimation and Tracking.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019

State-Aware Re-Identification Feature for Multi-Target Multi-Camera Tracking.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019

Attention-Guided Unified Network for Panoptic Segmentation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018
EANet: Enhancing Alignment for Cross-Domain Person Re-identification.
CoRR, 2018

Action Machine: Rethinking Action Recognition in Trimmed Videos.
CoRR, 2018

2017
UCT: Learning Unified Convolutional Networks for Real-Time Visual Tracking.
Proceedings of the 2017 IEEE International Conference on Computer Vision Workshops, 2017

The Visual Object Tracking VOT2017 Challenge Results.
Proceedings of the 2017 IEEE International Conference on Computer Vision Workshops, 2017

2015
Automatic gear sorting system based on monocular vision.
Digit. Commun. Networks, 2015

