Xuerui Su

According to our database¹, Xuerui Su authored at least 3 papers in 2025.

Collaborative distances:

Dijkstra number² of five.
Erdős number³ of four.

Bibliography

2025

DGRO: Enhancing LLM Reasoning via Exploration-Exploitation Control and Reward Variance Management.

CoRR, May, 2025

Trust Region Preference Approximation: A simple and stable reinforcement learning algorithm for LLM reasoning.

CoRR, April, 2025

Reveal the Mystery of DPO: The Connection between DPO and RL Algorithms.

CoRR, February, 2025
