Runzhe Wu

According to our database, Runzhe Wu authored at least 13 papers between 2021 and 2025.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2025
Imbalanced Gradients in RL Post-Training of Multi-Task LLMs.
CoRR, October, 2025

Internalizing Self-Consistency in Language Models: Multi-Agent Consensus Alignment.
CoRR, September, 2025

Computationally Efficient RL under Linear Bellman Completeness for Deterministic Dynamics.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Diffusing States and Matching Scores: A New Framework for Imitation Learning.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

2024
Making RL with Preference-based Feedback Efficient via Randomization.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023
MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning.
J. Mach. Learn. Res., 2023

Contextual Bandits and Imitation Learning via Preference-Based Active Queries.
CoRR, 2023

The Benefits of Being Distributional: Small-Loss Bounds for Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Selective Sampling and Imitation Learning via Online Regression.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Contextual Bandits and Imitation Learning with Preference-Based Active Queries.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Distributional Offline Policy Evaluation with Predictive Error Guarantees.
Proceedings of the International Conference on Machine Learning, 2023

2021
MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning.
CoRR, 2021

Offline Constrained Multi-Objective Reinforcement Learning via Pessimistic Dual Value Iteration.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

