Chenlu Ye

According to our database¹, Chenlu Ye authored at least 18 papers between 2023 and 2025.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2025

Reinforce-Ada: An Adaptive Sampling Framework for Reinforce-Style LLM Training.

CoRR, October, 2025

Beyond Correctness: Harmonizing Process and Outcome Rewards through RL Training.

CoRR, September, 2025

Transformers as Multi-task Learners: Decoupling Features in Hidden Markov Models.

CoRR, June, 2025

Daunce: Data Attribution through Uncertainty Estimation.

CoRR, May, 2025

Self-rewarding correction for mathematical reasoning.

CoRR, February, 2025

Logarithmic Regret for Online KL-Regularized Reinforcement Learning.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

Catoni Contextual Bandits are Robust to Heavy-tailed Rewards.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

Understanding Overadaptation in Supervised Fine-Tuning: The Role of Ensemble Methods.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

2024

Sharp Analysis for KL-Regularized Contextual Bandits and RLHF.

CoRR, 2024

A Theoretical Analysis of Nash Learning from Human Feedback under General KL-Regularized Preference.

CoRR, 2024

Online Iterative Reinforcement Learning from Human Feedback with General Preference Model.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Towards Robust Model-Based Reinforcement Learning Against Adversarial Corruption.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-constraint.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

2023

Gibbs Sampling from Human Feedback: A Provable KL-constrained Framework for RLHF.

CoRR, 2023

Provably Efficient High-Dimensional Bandit Learning with Batched Feedbacks.

CoRR, 2023

Optimal Sample Selection Through Uncertainty Estimation and Its Application in Deep Learning.

CoRR, 2023

Corruption-Robust Offline Reinforcement Learning with General Function Approximation.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Corruption-Robust Algorithms with Uncertainty Weighting for Nonlinear Contextual Bandits and Markov Decision Processes.

Proceedings of the International Conference on Machine Learning, 2023
