Yuhao Zhou

ORCID: 0009-0008-8665-3999

Affiliations:
  • Fudan University, Shanghai, China


According to our database, Yuhao Zhou authored at least 26 papers between 2022 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.


Bibliography

2025
Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination.
CoRR, July, 2025

Reinforcement Fine-Tuning Enables MLLMs Learning Novel Tasks Stably.
CoRR, June, 2025

Speech-Language Models with Decoupled Tokenizers and Multi-Token Prediction.
CoRR, June, 2025

EliteKV: Scalable KV Cache Compression via RoPE Frequency Selection and Joint Low-Rank Projection.
CoRR, March, 2025

The rise and potential of large language model based agents: a survey.
Sci. China Inf. Sci., 2025

2024
CC2Vec: Combining Typed Tokens with Contrastive Learning for Effective Code Clone Detection.
Proc. ACM Softw. Eng., 2024

What's Wrong with Your Code Generated by Large Language Models? An Extensive Study.
CoRR, 2024

StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback.
CoRR, 2024

MouSi: Poly-Visual-Expert Vision-Language Models.
CoRR, 2024

Secrets of RLHF in Large Language Models Part II: Reward Modeling.
CoRR, 2024

Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Improving Generalization of Alignment with Human Preferences through Group Invariant Learning.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Reward Modeling Requires Automatic Adjustment Based on Data Quality.
Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Improving Discriminative Capability of Reward Models in RLHF Using Contrastive Learning.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

ORTicket: Let One Robust BERT Ticket Transfer across Different Tasks.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

Subspace Defense: Discarding Adversarial Perturbations by Learning a Subspace for Clean Signals.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

LoRAMoE: Alleviating World Knowledge Forgetting in Large Language Models via MoE-Style Plugin.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

StepCoder: Improving Code Generation with Reinforcement Learning from Compiler Feedback.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023
LoRAMoE: Revolutionizing Mixture of Experts for Maintaining World Knowledge in Language Model Alignment.
CoRR, 2023

The Rise and Potential of Large Language Model Based Agents: A Survey.
CoRR, 2023

Secrets of RLHF in Large Language Models Part I: PPO.
CoRR, 2023

Self-Polish: Enhance Reasoning in Large Language Models via Problem Refinement.
CoRR, 2023

Self-Polish: Enhance Reasoning in Large Language Models via Problem Refinement.
Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Detecting Adversarial Samples through Sharpness of Loss Landscape.
Findings of the Association for Computational Linguistics: ACL 2023, 2023

2022
Robust Lottery Tickets for Pre-trained Language Models.
CoRR, 2022

Robust Lottery Tickets for Pre-trained Language Models.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

