Xuerui Su

According to our database, Xuerui Su authored at least 3 papers in 2025.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2025
DGRO: Enhancing LLM Reasoning via Exploration-Exploitation Control and Reward Variance Management.
CoRR, May, 2025

Trust Region Preference Approximation: A simple and stable reinforcement learning algorithm for LLM reasoning.
CoRR, April, 2025

Reveal the Mystery of DPO: The Connection between DPO and RL Algorithms.
CoRR, February, 2025

