Runzhe Wu

According to our database¹, Runzhe Wu authored at least 13 papers between 2021 and 2025.

Collaborative distances:

Dijkstra number² of five.
Erdős number³ of four.

Bibliography

2025

Imbalanced Gradients in RL Post-Training of Multi-Task LLMs.

CoRR, October, 2025

Internalizing Self-Consistency in Language Models: Multi-Agent Consensus Alignment.

CoRR, September, 2025

Computationally Efficient RL under Linear Bellman Completeness for Deterministic Dynamics.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Diffusing States and Matching Scores: A New Framework for Imitation Learning.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

2024

Making RL with Preference-based Feedback Efficient via Randomization.

Runzhe Wu, Wen Sun

Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023

MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning.

J. Mach. Learn. Res., 2023

Contextual Bandits and Imitation Learning via Preference-Based Active Queries.

CoRR, 2023

The Benefits of Being Distributional: Small-Loss Bounds for Reinforcement Learning.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Selective Sampling and Imitation Learning via Online Regression.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Contextual Bandits and Imitation Learning with Preference-Based Active Queries.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Distributional Offline Policy Evaluation with Predictive Error Guarantees.

Runzhe Wu, Masatoshi Uehara, Wen Sun

Proceedings of the International Conference on Machine Learning, 2023

2021

MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning.

CoRR, 2021

Offline Constrained Multi-Objective Reinforcement Learning via Pessimistic Dual Value Iteration.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
