Long Yang

ORCID: 0000-0001-7675-2194

Affiliations:
  • Peking University, School of Artificial Intelligence, Institute for AI, Beijing, China
  • Zhejiang University, College of Computer Science and Technology, Hangzhou, China (PhD 2021)


According to our database, Long Yang authored at least 28 papers between 2018 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
A Review of Safe Reinforcement Learning: Methods, Theories, and Applications.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2024

Towards Constraint-aware Learning for Resource Allocation in NFV-enabled Networks.
CoRR, 2024

Off-OAB: Off-Policy Policy Gradient Method with Optimal Action-Dependent Baseline.
CoRR, 2024

Optimizing over Multiple Distributions under Generalized Quasar-Convexity Condition.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

FlagVNE: A Flexible and Generalizable Reinforcement Learning Framework for Network Resource Allocation.
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

Langevin Policy for Safe Reinforcement Learning.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

2023
A Thompson Sampling Algorithm With Logarithmic Regret for Unimodal Gaussian Bandit.
IEEE Trans. Neural Networks Learn. Syst., September, 2023

Safe multi-agent reinforcement learning for multi-robot control.
Artif. Intell., June, 2023

DILI: A Distribution-Driven Learned Index.
Proc. VLDB Endow., 2023

Policy Representation via Diffusion Probability Model for Reinforcement Learning.
CoRR, 2023

VOCE: Variational Optimization with Conservative Estimation for Offline Safe Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Zeroth-order Optimization with Weak Dimension Dependency.
Proceedings of the Thirty Sixth Annual Conference on Learning Theory, 2023

Augmented Proximal Policy Optimization for Safe Reinforcement Learning.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
A Review of Safe Reinforcement Learning: Methods, Theory and Applications.
CoRR, 2022

CUP: A Conservative Update Policy Algorithm for Safe Reinforcement Learning.
CoRR, 2022

Constrained Update Projection Approach to Safe Policy Optimization.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Penalized Proximal Policy Optimization for Safe Reinforcement Learning.
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

Policy Optimization with Stochastic Mirror Descent.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
On Convergence of Gradient Expected Sarsa(λ).
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Sample Complexity of Policy Gradient Finding Second-Order Stationary Points.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Qualitative Measurements of Policy Discrepancy for Return-Based Deep Q-Network.
IEEE Trans. Neural Networks Learn. Syst., 2020

LISA: A Learned Index Structure for Spatial Data.
Proceedings of the 2020 International Conference on Management of Data, 2020

Maximum Entropy Reinforcement Learning with Evolution Strategies.
Proceedings of the 2020 International Joint Conference on Neural Networks, 2020

2019
Gradient Q(σ, λ): A Unified Algorithm with Function Approximation for Reinforcement Learning.
CoRR, 2019

Exploiting Ratings, Reviews and Relationships for Item Recommendations in Topic Based Social Networks.
Proceedings of the World Wide Web Conference, 2019

TBQ(σ): Improving Efficiency of Trace Utilization for Off-Policy Reinforcement Learning.
Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, 2019

2018
Crux - A New Fast, Flexible and Decentralized Consensus Algorithm with High Fault Tolerance Rate.
Proceedings of the Smart Blockchain - First International Conference, 2018

A Unified Approach for Multi-step Temporal-Difference Learning with Eligibility Traces in Reinforcement Learning.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

