Yuhao Zhou
ORCID: 0009-0008-8665-3999
Affiliations:
- Fudan University, Shanghai, China
According to our database, Yuhao Zhou authored at least 39 papers between 2022 and 2026.
Bibliography
2026
CoRR, May, 2026
MM-Doc-R1: Training Agents for Long Document Visual Question Answering through Multi-turn Reinforcement Learning.
CoRR, April, 2026
Proceedings of the ACM Web Conference 2026, 2026
Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026
What Makes a Good Speech Tokenizer for LLM-Centric Speech Generation? A Systematic Study.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026
2025
CoRR, November, 2025
CoRR, October, 2025
CoRR, September, 2025
Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination.
CoRR, July, 2025
CoRR, June, 2025
CoRR, June, 2025
Scientists' First Exam: Probing Cognitive Abilities of MLLM via Perception, Understanding, and Reasoning.
CoRR, June, 2025
CoRR, May, 2025
CoRR, May, 2025
EliteKV: Scalable KV Cache Compression via RoPE Frequency Selection and Joint Low-Rank Projection.
CoRR, March, 2025
Sci. China Inf. Sci., 2025
Parrot: A Training Pipeline Enhances Both Program CoT and Natural Language CoT for Reasoning.
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025
2024
CC2Vec: Combining Typed Tokens with Contrastive Learning for Effective Code Clone Detection.
Proc. ACM Softw. Eng., 2024
CoRR, 2024
StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback.
CoRR, 2024
Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning.
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Improving Generalization of Alignment with Human Preferences through Group Invariant Learning.
Proceedings of the Twelfth International Conference on Learning Representations, 2024
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024
Improving Discriminative Capability of Reward Models in RLHF Using Contrastive Learning.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024
Subspace Defense: Discarding Adversarial Perturbations by Learning a Subspace for Clean Signals.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024
LoRAMoE: Alleviating World Knowledge Forgetting in Large Language Models via MoE-Style Plugin.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
StepCoder: Improving Code Generation with Reinforcement Learning from Compiler Feedback.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
2023
LoRAMoE: Revolutionizing Mixture of Experts for Maintaining World Knowledge in Language Model Alignment.
CoRR, 2023
CoRR, 2023
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023
2022
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022