Wenqi Zhang

ORCID: 0000-0002-8312-0184

Affiliations:
  • Zhejiang University (ZJU), College of Computer Science and Technology, Hangzhou, China


According to our database, Wenqi Zhang authored at least 36 papers between 2021 and 2025.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2025
GUI-G²: Gaussian Reward Modeling for GUI Grounding.
CoRR, July, 2025

SVGenius: Benchmarking LLMs in SVG Understanding, Editing and Generation.
CoRR, June, 2025

TimeHC-RL: Temporal-aware Hierarchical Cognitive Reinforcement Learning for Enhancing LLMs' Social Intelligence.
CoRR, May, 2025

Let LLMs Break Free from Overthinking via Self-Braking Tuning.
CoRR, May, 2025

Benchmarking Multimodal Mathematical Reasoning with Explicit Visual Dependency.
CoRR, April, 2025

A Survey on (M)LLM-Based GUI Agents.
CoRR, April, 2025

Embodied-Reasoner: Synergizing Visual Search, Reasoning, and Action for Embodied Interactive Tasks.
CoRR, March, 2025

Think Twice, Click Once: Enhancing GUI Grounding via Fast and Slow Systems.
CoRR, March, 2025

DB-Explore: Automated Database Exploration and Instruction Synthesis for Text-to-SQL.
CoRR, March, 2025

AskToAct: Enhancing LLMs Tool Use via Self-Correcting Clarification.
CoRR, March, 2025

STaR-SQL: Self-Taught Reasoner for Text-to-SQL.
CoRR, February, 2025

VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding.
CoRR, January, 2025

2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining.
CoRR, January, 2025

ECBench: Can Multi-modal Foundation Models Understand the Egocentric World? A Holistic Embodied Cognition Benchmark.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Scaling LLMs' Social Reasoning: Sprinkle Cognitive "Aha Moment" into Fundamental Long-thought Logical Capabilities.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024
Specialized Mathematical Solving by a Step-By-Step Expression Chain Generation.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

GaVaMoE: Gaussian-Variational Gated Mixture of Experts for Explainable Recommendation.
CoRR, 2024

Entering Real Social World! Benchmarking the Theory of Mind and Socialization Capabilities of LLMs from a First-person Perspective.
CoRR, 2024

VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs.
CoRR, 2024

TaskBench: Benchmarking Large Language Models for Task Automation.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Model.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Advancing Process Verification for Large Language Models via Tree-Based Preference Learning.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Agent-Pro: Learning to Evolve via Policy-Level Reflection and Optimization.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Self-Contrast: Better Reflection Through Inconsistent Solving Perspectives.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Learning Global Controller in Latent Space for Parameter-Efficient Fine-Tuning.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

TimeToM: Temporal Space is the Key to Unlocking the Door of Large Language Models' Theory-of-Mind.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023
Data-Copilot: Bridging Billions of Data and Humans with Autonomous Workflow.
CoRR, 2023

An Expression Tree Decoding Strategy for Mathematical Equation Generation.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Enhancing Emotion Recognition in Conversation via Multi-view Feature Alignment and Memorization.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

PromptNER: Prompt Locating and Typing for Named Entity Recognition.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
A Closed-Loop Perception, Decision-Making and Reasoning Mechanism for Human-Like Navigation.
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

Multi-View Reasoning: Consistent Contrastive Learning for Math Word Problem.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

Query-based Instance Discrimination Network for Relational Triple Extraction.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

2021
Learning to Navigate in a VUCA Environment: Hierarchical Multi-expert Approach.
CoRR, 2021

Learning to Navigate in a VUCA Environment: Hierarchical Multi-expert Approach.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2021

Deep Reinforcement Learning for Multi-contact Motion Planning of Hexapod Robots.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

