Yuanhao Wang

Affiliations:

Princeton University, USA
Tsinghua University, Institute for Interdisciplinary Information Sciences, China (former)

According to our database, Yuanhao Wang authored at least 20 papers between 2019 and 2024.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2024

Directional Smoothness and Gradient Methods: Convergence and Adaptivity.

CoRR, 2024

2023

Is RLHF More Difficult than Standard RL?

Yuanhao Wang

Qinghua Liu

Chi Jin

CoRR, 2023

Is RLHF More Difficult than Standard RL? A Theoretical Perspective.

Yuanhao Wang

Qinghua Liu

Chi Jin

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Learning Rationalizable Equilibria in Multiplayer Games.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Breaking the Curse of Multiagency: Provably Efficient Decentralized Multi-Agent RL with Function Approximation.

Proceedings of the Thirty Sixth Annual Conference on Learning Theory, 2023

2022

Learning Markov Games with Adversarial Opponents: Efficient Algorithms and Fundamental Limits.

Qinghua Liu

Yuanhao Wang

Chi Jin

Proceedings of the International Conference on Machine Learning, 2022

Near-optimal Local Convergence of Alternating Gradient Descent-Ascent for Minimax Optimization.

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

2021

V-Learning - A Simple, Efficient, Decentralized Algorithm for Multiagent RL.

CoRR, 2021

An Exponential Lower Bound for Linearly-Realizable MDPs with Constant Suboptimality Gap.

Yuanhao Wang

Ruosong Wang

Sham M. Kakade

CoRR, 2021

Don't Fix What ain't Broke: Near-optimal Local Convergence of Alternating Gradient Descent-Ascent for Minimax Optimization.

CoRR, 2021

An Exponential Lower Bound for Linearly Realizable MDP with Constant Suboptimality Gap.

Yuanhao Wang

Ruosong Wang

Sham M. Kakade

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Online Learning in Unknown Markov Games.

Proceedings of the 38th International Conference on Machine Learning, 2021

On the Suboptimality of Negative Momentum for Minimax Optimization.

Guodong Zhang

Yuanhao Wang

Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

2020

Provably Efficient Online Agnostic Learning in Markov Games.

CoRR, 2020

Refined Analysis of FPL for Adversarial Markov Decision Processes.

Yuanhao Wang

Kefan Dong

CoRR, 2020

Improved Algorithms for Convex-Concave Minimax Optimization.

Yuanhao Wang

Jian Li

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

On Solving Minimax Optimization Locally: A Follow-the-Ridge Approach.

Yuanhao Wang

Guodong Zhang

Jimmy Ba

Proceedings of the 8th International Conference on Learning Representations, 2020

Distributed Bandit Learning: Near-Optimal Regret with Efficient Communication.

Proceedings of the 8th International Conference on Learning Representations, 2020

Q-learning with UCB Exploration is Sample Efficient for Infinite-Horizon MDP.

Proceedings of the 8th International Conference on Learning Representations, 2020

2019

Distributed Bandit Learning: How Much Communication is Needed to Achieve (Near) Optimal Regret.

CoRR, 2019
