Jinguo Zhu

ORCID: 0000-0002-3616-4264

According to our database, Jinguo Zhu authored at least 27 papers between 2019 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2025

NaViL: Rethinking Scaling Properties of Native Multimodal Large Language Models under Data Constraints.

CoRR, October, 2025

ZeroGUI: Automating Online GUI Learning at Zero Human Cost.

CoRR, May, 2025

VisuLogic: A Benchmark for Evaluating Visual Reasoning in Multi-modal Large Language Models.

CoRR, April, 2025

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models.

CoRR, April, 2025

VisualPRM: An Effective Process Reward Model for Multimodal Reasoning.

CoRR, March, 2025

SynerGen-VL: Towards Synergistic Image Understanding and Generation with Vision Experts and Token Folding.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

2024

Mini-InternVL: a flexible-transfer pocket multi-modal model with 5% parameters and 90% performance.

Vis. Intell., 2024

V2PE: Improving Multimodal Long-Context Capability of Vision-Language Models with Variable Visual Position Encoding.

CoRR, 2024

Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling.

CoRR, 2024

Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization.

CoRR, 2024

Mini-InternVL: A Flexible-Transfer Pocket Multimodal Model with 5% Parameters and 90% Performance.

CoRR, 2024

SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation.

CoRR, 2024

Vision Model Pre-training on Interleaved Image-Text Data via Latent Compression Learning.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Power-Llava: Large Language and Vision Assistant for Power Transmission Line Inspection.

Proceedings of the IEEE International Conference on Image Processing, 2024

Intent Negotiation Empowers Advanced Operations for the Intent-Driven Autonomous Network.

Proceedings of the 27th Conference on Innovation in Clouds, Internet and Networks, 2024

2023

VL-GPT: A Generative Pre-trained Transformer for Vision and Language Understanding and Generation.

CoRR, 2023

VLATTACK: Multimodal Adversarial Attacks on Vision-Language Tasks via Pre-trained Models.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Uni-Perceiver v2: A Generalist Model for Large-Scale Vision and Vision-Language Tasks.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

Uni-Perceiver-MoE: Learning Sparse Generalist Models with Conditional MoEs.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Uni-Perceiver: Pre-training Unified Architecture for Generic Perception for Zero-shot and Few-shot Tasks.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

Uni-Perceiver: Pre-training Unified Architecture for Generic Perception for Zero-shot and Few-shot Tasks.

CoRR, 2021

Multiple Domain Experts Collaborative Learning: Multi-Source Domain Generalization For Person Re-Identification.

CoRR, 2021

Complementary Relation Contrastive Distillation.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Layerwise Optimization by Gradient Decomposition for Continual Learning.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020

A Deep Learning Method to Detect Foreign Objects for Inspecting Power Transmission Lines.

IEEE Access, 2020

Crowded Human Detection via an Anchor-pair Network.

Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2020

2019

Welding Joints Inspection via Residual Attention Network.

Jinguo Zhu, Zejian Yuan, Tie Liu

Proceedings of the 16th International Conference on Machine Vision Applications, 2019
