Zhihao Zhang

Orcid: 0009-0008-1526-104X

Affiliations:
  • Fudan University, School of Computer Science, Shanghai, China


According to our database, Zhihao Zhang authored at least 22 papers between 2023 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
Can RL Improve Generalization of LLM Agents? An Empirical Study.
CoRR, March, 2026

Steering LLMs via Scalable Interactive Oversight.
CoRR, February, 2026

Locate, Steer, and Improve: A Practical Survey of Actionable Mechanistic Interpretability in Large Language Models.
CoRR, January, 2026

AgentPRM: Process Reward Models for LLM Agents via Step-Wise Promise and Progress.
Proceedings of the ACM Web Conference 2026, 2026

Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

What Makes a Good Speech Tokenizer for LLM-Centric Speech Generation? A Systematic Study.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025
The Role of Entropy in Visual Grounding: Analysis and Optimization.
CoRR, December, 2025

AgentPRM: Process Reward Models for LLM Agents via Step-Wise Promise and Progress.
CoRR, November, 2025

BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping.
CoRR, October, 2025

LLMEval-3: A Large-Scale Longitudinal Study on Robust and Fair Evaluation of Large Language Models.
CoRR, August, 2025

Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination.
CoRR, July, 2025

Reinforcement Fine-Tuning Enables MLLMs Learning Novel Tasks Stably.
CoRR, June, 2025

Speech-Language Models with Decoupled Tokenizers and Multi-Token Prediction.
CoRR, June, 2025

EliteKV: Scalable KV Cache Compression via RoPE Frequency Selection and Joint Low-Rank Projection.
CoRR, March, 2025

LoRACoE: Improving Large Language Model via Composition-based LoRA Expert.
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

PFDial: A Structured Dialogue Instruction Fine-tuning Method Based on UML Flowcharts.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024
Advancing Translation Preference Modeling with RLHF: A Step Towards Cost-Effective Solution.
CoRR, 2024

LLaMA Beyond English: An Empirical Study on Language Capability Transfer.
CoRR, 2024

Improving Discriminative Capability of Reward Models in RLHF Using Contrastive Learning.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

PDF-to-Tree: Parsing PDF Text Blocks into a Tree.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Unveiling Linguistic Regions in Large Language Models.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023
Unveiling A Core Linguistic Region in Large Language Models.
CoRR, 2023

