Zhangxuan Gu

Orcid: 0000-0002-2102-2693

According to our database, Zhangxuan Gu authored at least 27 papers between 2019 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
VenusBench-Mobile: A Challenging and User-Centric Benchmark for Mobile GUI Agents with Capability Diagnostics.
CoRR, April, 2026

UI-Venus-1.5 Technical Report.
CoRR, February, 2026

GUI-G²: Gaussian Reward Modeling for GUI Grounding.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025
Conditional Prototype Rectification Prompt Learning.
IEEE Trans. Circuits Syst. Video Technol., December, 2025

VenusBench-GD: A Comprehensive Multi-Platform GUI Benchmark for Diverse Grounding Tasks.
CoRR, December, 2025

UI-Venus Technical Report: Building High-performance UI Agents with RFT.
CoRR, August, 2025

GUI-G²: Gaussian Reward Modeling for GUI Grounding.
CoRR, July, 2025

Efficient Transfer Learning for Video-language Foundation Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024
E-ANT: A Large-Scale Dataset for Efficient Automatic GUI NavigaTion.
CoRR, 2024

DeMamba: AI-Generated Video Detection on Million-Scale GenVideo Benchmark.
CoRR, 2024

PC²: Pseudo-Classification Based Pseudo-Captioning for Noisy Correspondence Learning in Cross-Modal Retrieval.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

DiffusionInst: Diffusion Model for Instance Segmentation.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

Segment Anything Model Meets Image Harmonization.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

2023
From Pixel to Patch: Synthesize Context-Aware Features for Zero-Shot Semantic Segmentation.
IEEE Trans. Neural Networks Learn. Syst., October, 2023

Boosting Audio-visual Zero-shot Learning with Large Language Models.
CoRR, 2023

DiffUTE: Universal Text Editing Diffusion Model.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Hierarchical Dynamic Image Harmonization.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Backpropagation Path Search On Adversarial Transferability.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Mobile User Interface Element Detection Via Adaptively Prompt Tuning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
DiffusionInst: Diffusion Model for Instance Segmentation.
CoRR, 2022

STC: Spatio-Temporal Contrastive Learning for Video Instance Segmentation.
Proceedings of the Computer Vision - ECCV 2022 Workshops, 2022

XYLayoutLM: Towards Layout-Aware Multimodal Networks For Visually-Rich Document Understanding.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
Hard Pixel Mining for Depth Privileged Semantic Segmentation.
IEEE Trans. Multim., 2021

2020
Multi-mode neural network for human action recognition.
IET Comput. Vis., 2020

Context-aware Feature Generation For Zero-shot Semantic Segmentation.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

2019
Hard Pixels Mining: Learning Using Privileged Information for Semantic Segmentation.
CoRR, 2019

Clothes Keypoints Localization and Attribute Recognition via Prior Knowledge.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2019

