Jihao Gu

Orcid: 0009-0009-0141-4807

According to our database, Jihao Gu authored at least 22 papers between 2023 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

DERM-3R: A Resource-Efficient Multimodal Agents Framework for Dermatologic Diagnosis and Treatment in Real-World Clinical Settings.

CoRR, April, 2026

MA-Bench: Towards Fine-grained Micro-Action Understanding.

CoRR, March, 2026

Video SimpleQA: Towards Factuality Evaluation in Large Video Language Models.

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025

Text-guided Fine-Grained Video Anomaly Detection.

CoRR, November, 2025

InquireMobile: Teaching VLM-based Mobile Agent to Request Human Assistance via Reinforcement Fine-Tuning.

CoRR, August, 2025

MM-BrowseComp: A Comprehensive Benchmark for Multimodal Browsing Agents.

[...], Wangchunshu Zhou, Zhaoxiang Zhang, [...]

CoRR, August, 2025

Motion Matters: Motion-guided Modulation Network for Skeleton-based Micro-Action Recognition.

CoRR, July, 2025

Mobile-R1: Towards Interactive Reinforcement Learning for VLM-Based Mobile Agent via Task-Level Rewards.

[...], Ming-Liang Zhang, [...]

CoRR, June, 2025

DREAM: Disentangling Risks to Enhance Safety Alignment in Multimodal Large Language Models.

CoRR, April, 2025

GeoSense: Evaluating Identification and Application of Geometric Principles in Multimodal Reasoning.

[...], Mingliang Zhang, [...]

CoRR, April, 2025

Video SimpleQA: Towards Factuality Evaluation in Large Video Language Models.

CoRR, March, 2025

ChineseSimpleVQA - "See the World, Discover Knowledge": A Chinese Factuality Evaluation for Large Vision Language Models.

CoRR, February, 2025

Performance Analysis of Traditional VQA Models Under Limited Computational Resources.

CoRR, February, 2025

DREAM: Disentangling Risks to Enhance Safety Alignment in Multimodal Large Language Models.

Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

2D-DPO: Scaling Direct Preference Optimization with 2-Dimensional Supervision.

Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025

Motion Matters: Motion-guided Modulation Network for Skeleton-based Micro-Action Recognition.

Proceedings of the 33rd ACM International Conference on Multimedia, 2025

MM-Gesture: Towards Precise Micro-Gesture Recognition through Multimodal Fusion.

Proceedings of IJCAI-2025 Workshop & Challenge on Human Behavior Analysis for Emotion Understanding (MiGA 2025), 2025

Token Preference Optimization with Self-Calibrated Visual-Anchored Rewards for Hallucination Mitigation.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

See the World, Discover Knowledge: A Chinese Factuality Evaluation for Large Vision Language Models.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024

SARA: Singular-Value Based Adaptive Low-Rank Adaption.

CoRR, 2024

From Bottom to Top: Extending the Potential of Parameter Efficient Fine-Tuning.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

2023

Retrieval-Augmented Knowledge-Intensive Dialogue.

Proceedings of the Natural Language Processing and Chinese Computing, 2023
