Heyang Zhao

ORCID: 0009-0004-6912-0040

According to our database, Heyang Zhao authored at least 17 papers between 2021 and 2025.

Bibliography

2025
Logarithmic Regret for Online KL-Regularized Reinforcement Learning.
CoRR, February, 2025

Nearly Optimal Sample Complexity of Offline KL-Regularized Contextual Bandits under Single-Policy Concentrability.
CoRR, February, 2025

Beyond-Expert Performance with Limited Demonstrations: Efficient Imitation Learning with Double Exploration.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

A 0.16mm² 450MHz-BW 72dB-SNDR Continuous-time Pipeline ADC with APF+HPF and APF+FIR Hybrid Delay Alignment Techniques.
Proceedings of the IEEE Custom Integrated Circuits Conference, 2025

2024
Sharp Analysis for KL-Regularized Contextual Bandits and RLHF.
CoRR, 2024

A Nearly Optimal and Low-Switching Algorithm for Reinforcement Learning with General Function Approximation.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Feel-Good Thompson Sampling for Contextual Dueling Bandits.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Pessimistic Nonlinear Least-Squares Value Iteration for Offline Reinforcement Learning.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Variance-aware Regret Bounds for Stochastic Contextual Dueling Bandits.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

S-lemma On Three Non-homogeneous Quadratic Functions And Its Applications In Signal Recovery.
Proceedings of the 10th International Conference on Communication and Information Processing, 2024

2023
Optimal Online Generalized Linear Regression with Stochastic Noise and Its Application to Heteroscedastic Bandits.
Proceedings of the International Conference on Machine Learning, 2023

Nearly Minimax Optimal Reinforcement Learning for Linear Markov Decision Processes.
Proceedings of the International Conference on Machine Learning, 2023

Variance-Dependent Regret Bounds for Linear Bandits and Reinforcement Learning: Adaptivity and Computational Efficiency.
Proceedings of the Thirty Sixth Annual Conference on Learning Theory, 2023

2022
Bandit Learning with General Function Classes: Heteroscedastic Noise and Variance-dependent Regret Bounds.
CoRR, 2022

VinciDecoder: Automatically Interpreting Provenance Graphs into Textual Forensic Reports with Application to OpenStack.
Proceedings of the Secure IT Systems, 2022

ProvTalk: Towards Interpretable Multi-level Provenance Analysis in Networking Functions Virtualization (NFV).
Proceedings of the 29th Annual Network and Distributed System Security Symposium, 2022

2021
Linear Contextual Bandits with Adversarial Corruptions.
CoRR, 2021

