Han Zhong

Affiliations:
  • Peking University, Center for Data Science, Beijing, China


According to our database, Han Zhong authored at least 22 papers between 2020 and 2023.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2023
Can Reinforcement Learning Find Stackelberg-Nash Equilibria in General-Sum Markov Games with Myopically Rational Followers?
J. Mach. Learn. Res., 2023

Horizon-Free and Instance-Dependent Regret Bounds for Reinforcement Learning with General Function Approximation.
CoRR, 2023

One Objective to Rule Them All: A Maximization Objective Fusing Estimation and Planning for Exploration.
CoRR, 2023

Provably Efficient Exploration in Quantum Reinforcement Learning with Logarithmic Worst-Case Regret.
CoRR, 2023

A Reduction-based Framework for Sequential Decision Making with Delayed Feedback.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Posterior Sampling for Competitive RL: Function Approximation and Partial Observation.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Tackling Heavy-Tailed Rewards in Reinforcement Learning with Function Approximation: Minimax Optimal and Instance-Dependent Regret Bounds.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

A Theoretical Analysis of Optimistic Proximal Policy Optimization in Linear Markov Decision Processes.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Provable Sim-to-real Transfer in Continuous Domain with Partial Observations.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Nearly Minimax Optimal Offline Reinforcement Learning with Linear Function Approximation: Single-Agent MDP and Markov Game.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

2022
GEC: A Unified Framework for Interactive Decision Making in MDP, POMDP, and Beyond.
CoRR, 2022

Why Robust Generalization in Deep Learning is Difficult: Perspective of Expressive Power.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Pessimistic Minimax Value Iteration: Provably Efficient Equilibrium Learning from Offline Datasets.
Proceedings of the International Conference on Machine Learning, 2022

A Self-Play Posterior Sampling Algorithm for Zero-Sum Markov Games.
Proceedings of the International Conference on Machine Learning, 2022

Nearly Optimal Policy Optimization with Stable at Any Time Guarantee.
Proceedings of the International Conference on Machine Learning, 2022

Human-in-the-loop: Provably Efficient Preference-based Reinforcement Learning with General Function Approximation.
Proceedings of the International Conference on Machine Learning, 2022

A Reduction-Based Framework for Conservative Bandits and Reinforcement Learning.
Proceedings of the Tenth International Conference on Learning Representations, 2022

2021
Can Reinforcement Learning Find Stackelberg-Nash Equilibria in General-Sum Markov Games with Myopic Followers?
CoRR, 2021

Optimistic Policy Optimization is Provably Efficient in Non-stationary MDPs.
CoRR, 2021

A Unified Framework for Conservative Exploration.
CoRR, 2021

Breaking the Moments Condition Barrier: No-Regret Algorithm for Bandits with Super Heavy-Tailed Payoffs.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

2020
Risk-Sensitive Deep RL: Variance-Constrained Actor-Critic Provably Finds Globally Optimal Policy.
CoRR, 2020
