Zhihao Zhang

ORCID: 0009-0008-1526-104X

Affiliations:

Fudan University, School of Computer Science, Shanghai, China

According to our database, Zhihao Zhang authored at least 25 papers between 2023 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

Entropy Polarity in Reinforcement Fine-Tuning: Direction, Asymmetry, and Control.

CoRR, May, 2026

Can RL Improve Generalization of LLM Agents? An Empirical Study.

CoRR, March, 2026

Steering LLMs via Scalable Interactive Oversight.

CoRR, February, 2026

Locate, steer, and improve: A practical survey of actionable mechanistic interpretability in large language models.

Comput. Sci. Rev., 2026

AgentPRM: Process Reward Models for LLM Agents via Step-Wise Promise and Progress.

Proceedings of the ACM Web Conference 2026, 2026

LLMEval-Fair: A Large-Scale Longitudinal Study on Robust and Fair Evaluation of Large Language Models.

Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2026

AgentGym2: Benchmarking Large Language Model Agents in De-Idealized Real-World Environments.

Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2026

Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination.

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

What Makes a Good Speech Tokenizer for LLM-Centric Speech Generation? A Systematic Study.

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025

The Role of Entropy in Visual Grounding: Analysis and Optimization.

CoRR, December, 2025

AgentPRM: Process Reward Models for LLM Agents via Step-Wise Promise and Progress.

CoRR, November, 2025

BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping.

CoRR, October, 2025

LLMEval-3: A Large-Scale Longitudinal Study on Robust and Fair Evaluation of Large Language Models.

CoRR, August, 2025

Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination.

CoRR, July, 2025

Reinforcement Fine-Tuning Enables MLLMs Learning Novel Tasks Stably.

CoRR, June, 2025

Speech-Language Models with Decoupled Tokenizers and Multi-Token Prediction.

CoRR, June, 2025

EliteKV: Scalable KV Cache Compression via RoPE Frequency Selection and Joint Low-Rank Projection.

CoRR, March, 2025

LoRACoE: Improving Large Language Model via Composition-based LoRA Expert.

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

PFDial: A Structured Dialogue Instruction Fine-tuning Method Based on UML Flowcharts.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024

Advancing Translation Preference Modeling with RLHF: A Step Towards Cost-Effective Solution.

CoRR, 2024

LLaMA Beyond English: An Empirical Study on Language Capability Transfer.

CoRR, 2024

Improving Discriminative Capability of Reward Models in RLHF Using Contrastive Learning.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

PDF-to-Tree: Parsing PDF Text Blocks into a Tree.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Unveiling Linguistic Regions in Large Language Models.

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023

Unveiling A Core Linguistic Region in Large Language Models.

CoRR, 2023
