Chi Zhang

Orcid: 0000-0001-6344-2824

Affiliations:
  • Tencent PCG, China


According to our database, Chi Zhang authored at least 24 papers between 2018 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
Distilling Parallel Gradients for Fast ODE Solvers of Diffusion Models.
CoRR, July, 2025

A Multi-Modal Fusion-Based 3D Multi-Object Tracking Framework With Joint Detection.
IEEE Robotics Autom. Lett., January, 2025

Project-Probe-Aggregate: Efficient Fine-Tuning for Group Robustness.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

AppAgent: Multimodal Agents as Smartphone Users.
Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems, 2025

2024
Metric3D v2: A Versatile Monocular Geometric Foundation Model for Zero-Shot Metric Depth and Surface Normal Estimation.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2024

LoRA of Change: Learning to Generate LoRA for the Editing Instruction from A Single Before-After Image Pair.
CoRR, 2024

EMMA: Your Text-to-Image Diffusion Model Can Secretly Accept Multi-Modal Prompts.
CoRR, 2024

SparseAD: Sparse Query-Centric Paradigm for Efficient End-to-End Autonomous Driving.
CoRR, 2024

Lever LM: Configuring In-Context Sequence to Lever Large Vision Language Models.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

MotionChain: Conversational Motion Controllers via Multimodal Prompts.
Proceedings of the Computer Vision - ECCV 2024, 2024

LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

StreamMOTP: Streaming and Unified Framework for Joint 3D Multi-Object Tracking and Trajectory Prediction.
Proceedings of the Computer Vision - ACCV 2024, 2024

IT3D: Improved Text-to-3D Generation with Explicit View Synthesis.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
MacFormer: Map-Agent Coupled Transformer for Real-Time and Robust Trajectory Prediction.
IEEE Robotics Autom. Lett., October, 2023

AppAgent: Multimodal Agents as Smartphone Users.
CoRR, 2023

M3DBench: Let's Instruct Large Models with Multi-modal 3D Prompts.
CoRR, 2023

ICD-LM: Configuring Vision-Language In-Context Demonstrations by Language Modeling.
CoRR, 2023

FaceStudio: Put Your Face Everywhere in Seconds.
CoRR, 2023

ShapeGPT: 3D Shape Generation with A Unified Multi-modal Language Model.
CoRR, 2023

ChartLlama: A Multimodal LLM for Chart Understanding and Generation.
CoRR, 2023

Robust Geometry-Preserving Depth Estimation Using Differentiable Rendering.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

2022
TENET: Transformer Encoding Network for Effective Temporal Flow on Motion Prediction.
CoRR, 2022

2020
Iterative Distance-Aware Similarity Matrix Convolution with Mutual-Supervised Point Elimination for Efficient Point Cloud Registration.
Proceedings of the Computer Vision - ECCV 2020, 2020

2018
Learning Unmanned Aerial Vehicle Control for Autonomous Target Following.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

