Jihao Gu

Orcid: 0009-0009-0141-4807

According to our database, Jihao Gu authored at least 22 papers between 2023 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
DERM-3R: A Resource-Efficient Multimodal Agents Framework for Dermatologic Diagnosis and Treatment in Real-World Clinical Settings.
CoRR, April, 2026

MA-Bench: Towards Fine-grained Micro-Action Understanding.
CoRR, March, 2026

Video SimpleQA: Towards Factuality Evaluation in Large Video Language Models.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025
Text-guided Fine-Grained Video Anomaly Detection.
CoRR, November, 2025

InquireMobile: Teaching VLM-based Mobile Agent to Request Human Assistance via Reinforcement Fine-Tuning.
CoRR, August, 2025

MM-BrowseComp: A Comprehensive Benchmark for Multimodal Browsing Agents.
CoRR, August, 2025

Motion Matters: Motion-guided Modulation Network for Skeleton-based Micro-Action Recognition.
CoRR, July, 2025

Mobile-R1: Towards Interactive Reinforcement Learning for VLM-Based Mobile Agent via Task-Level Rewards.
CoRR, June, 2025

DREAM: Disentangling Risks to Enhance Safety Alignment in Multimodal Large Language Models.
CoRR, April, 2025

GeoSense: Evaluating Identification and Application of Geometric Principles in Multimodal Reasoning.
CoRR, April, 2025

Video SimpleQA: Towards Factuality Evaluation in Large Video Language Models.
CoRR, March, 2025

ChineseSimpleVQA - "See the World, Discover Knowledge": A Chinese Factuality Evaluation for Large Vision Language Models.
CoRR, February, 2025

Performance Analysis of Traditional VQA Models Under Limited Computational Resources.
CoRR, February, 2025

DREAM: Disentangling Risks to Enhance Safety Alignment in Multimodal Large Language Models.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

2D-DPO: Scaling Direct Preference Optimization with 2-Dimensional Supervision.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025

Motion Matters: Motion-guided Modulation Network for Skeleton-based Micro-Action Recognition.
Proceedings of the 33rd ACM International Conference on Multimedia, 2025

MM-Gesture: Towards Precise Micro-Gesture Recognition through Multimodal Fusion.
Proceedings of IJCAI-2025 Workshop & Challenge on Human Behavior Analysis for Emotion Understanding (MiGA 2025), 2025

Token Preference Optimization with Self-Calibrated Visual-Anchored Rewards for Hallucination Mitigation.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

See the World, Discover Knowledge: A Chinese Factuality Evaluation for Large Vision Language Models.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024
SARA: Singular-Value Based Adaptive Low-Rank Adaption.
CoRR, 2024

From Bottom to Top: Extending the Potential of Parameter Efficient Fine-Tuning.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

2023
Retrieval-Augmented Knowledge-Intensive Dialogue.
Proceedings of the Natural Language Processing and Chinese Computing, 2023

