Zhi Gao

Orcid: 0000-0002-4424-4352

Affiliations:
  • Beijing Institute of Technology, China


According to our database, Zhi Gao authored at least 40 papers between 2017 and 2025.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2025
Curvature Learning for Generalization of Hyperbolic Neural Networks.
Int. J. Comput. Vis., December, 2025

Modality Alignment across Trees on Heterogeneous Hyperbolic Manifolds.
CoRR, October, 2025

GUI Knowledge Bench: Revealing the Knowledge Gap Behind VLM Failures in GUI Tasks.
CoRR, October, 2025

Multi-Step Reasoning for Embodied Question Answering via Tool Augmentation.
CoRR, October, 2025

KORE: Enhancing Knowledge Injection for Large Multimodal Models via Knowledge-Oriented Augmentations and Constraints.
CoRR, October, 2025

Beyond the Seen: Bounded Distribution Estimation for Open-Vocabulary Learning.
CoRR, October, 2025

Adaptive Model Ensemble for Continual Learning.
CoRR, September, 2025

Geometry-aware Distance Measure for Diverse Hierarchical Structures in Hyperbolic Spaces.
CoRR, June, 2025

A Set-to-Set Distance Measure in Hyperbolic Space.
CoRR, June, 2025

Hyperbolic Dual Feature Augmentation for Open-Environment.
CoRR, June, 2025

When Large Multimodal Models Confront Evolving Knowledge: Challenges and Pathways.
CoRR, May, 2025

Chain-of-Focus: Adaptive Visual Search and Zooming for Multimodal Reasoning via RL.
CoRR, May, 2025

Memory-Centric Embodied Question Answer.
CoRR, May, 2025

Iterative Trajectory Exploration for Multimodal Agents.
CoRR, April, 2025

TongUI: Building Generalized GUI Agents by Learning from Multimodal Web Tutorials.
CoRR, April, 2025

Building LLM Agents by Incorporating Insights from Computer Systems.
CoRR, April, 2025

Large-scale Riemannian meta-optimization via subspace adaptation.
Comput. Vis. Image Underst., 2025

Multi-modal Agent Tuning: Building a VLM-Driven Agent for Efficient Tool Usage.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

MMKE-Bench: A Multimodal Editing Benchmark for Diverse Visual Knowledge.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

2024
VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding.
CoRR, 2024

FIRE: A Dataset for Feedback Integration and Refinement Evaluation of Multimodal Models.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

VideoAgent: A Memory-Augmented Multimodal Agent for Video Understanding.
Proceedings of the Computer Vision - ECCV 2024, 2024

CLOVA: A Closed-LOop Visual Assistant with Tool Usage and Update.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Geometry-adaptive Meta-learning in Riemannian Manifolds.
Proceedings of the ACM Turing Award Celebration Conference 2024, 2024

2023
Learning to Optimize on Riemannian Manifolds.
IEEE Trans. Pattern Anal. Mach. Intell., May, 2023

Curvature-Adaptive Meta-Learning for Fast Adaptation to Manifold Data.
IEEE Trans. Pattern Anal. Mach. Intell., 2023

Exploring Data Geometry for Continual Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
Infinite-dimensional feature aggregation via a factorized bilinear model.
Pattern Recognit., 2022

Hyperbolic Feature Augmentation via Distribution Estimation and Infinite Sampling on Manifolds.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Efficient Riemannian Meta-Optimization by Implicit Differentiation.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Curvature Generation in Curved Spaces for Few-Shot Learning.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

A Hyperbolic-to-Hyperbolic Graph Convolutional Network.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Learning a Gradient-free Riemannian Optimizer on Tangent Spaces.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
A Robust Distance Measure for Similarity-Based Classification on the SPD Manifold.
IEEE Trans. Neural Networks Learn. Syst., 2020

Learning to Optimize on SPD Manifolds.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Revisiting Bilinear Pooling: A Coding Perspective.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Learning a robust representation via a deep network on symmetric positive definite manifolds.
Pattern Recognit., 2019

Deep convolutional network with locality and sparsity constraints for texture classification.
Pattern Recognit., 2019

2018
Set-to-Set Distance Metric Learning on SPD Manifolds.
Proceedings of the Pattern Recognition and Computer Vision - First Chinese Conference, 2018

2017
Learning a Robust Representation via a Deep Network on Symmetric Positive Definite Manifolds.
CoRR, 2017

