Yufei Zhang

ORCID: 0000-0001-9843-1404

Affiliations:
  • Imperial College London, Department of Mathematics, Westminster, UK
  • University of Oxford, Mathematical Institute, UK
  • London School of Economics and Political Science, Department of Statistics, UK (2021-2023)


According to our database, Yufei Zhang authored at least 23 papers between 2019 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Optimal Scheduling of Entropy Regularizer for Continuous-Time Linear-Quadratic Reinforcement Learning.
SIAM J. Control. Optim., February, 2024

Mirror Descent for Stochastic Control Problems with Measure-valued Controls.
CoRR, 2024

2023
Linear Convergence of a Policy Gradient Method for Some Finite Horizon Continuous Time Control Problems.
SIAM J. Control. Optim., December, 2023

Reinforcement Learning for Linear-Convex Models with Jumps via Stability Analysis of Feedback Controls.
SIAM J. Control. Optim., April, 2023

A Fisher-Rao gradient flow for entropy-regularised Markov decision processes in Polish spaces.
CoRR, 2023

Towards An Analytical Framework for Potential Games.
CoRR, 2023

Insurance pricing on price comparison websites via reinforcement learning.
CoRR, 2023

A Neural RDE approach for continuous-time non-Markovian stochastic control problems.
CoRR, 2023

2022
Logarithmic Regret for Episodic Continuous-Time Linear-Quadratic Reinforcement Learning over a Finite-Time Horizon.
J. Mach. Learn. Res., 2022

Convergence of policy gradient methods for finite-horizon stochastic linear-quadratic control problems.
CoRR, 2022

Optimal scheduling of entropy regulariser for continuous-time linear-quadratic reinforcement learning.
CoRR, 2022

Linear convergence of a policy gradient method for finite horizon continuous time stochastic control problems.
CoRR, 2022

2021
Regularity and Stability of Feedback Relaxed Controls.
SIAM J. Control. Optim., 2021

A Neural Network-Based Policy Iteration Algorithm with Global H²-Superlinear Convergence for Stochastic Games on Domains.
Found. Comput. Math., 2021

Exploration-exploitation trade-off for continuous-time episodic reinforcement learning with linear-convex models.
CoRR, 2021

A penalty scheme and policy iteration for nonlocal HJB variational inequalities with monotone nonlinearities.
Comput. Math. Appl., 2021

2020
Error Estimates of Penalty Schemes for Quasi-Variational Inequalities Arising from Impulse Control Problems.
SIAM J. Control. Optim., 2020

Regularity and time discretization of extended mean field control problems: a McKean-Vlasov FBSDE approach.
CoRR, 2020

A posteriori error estimates for fully coupled McKean-Vlasov forward-backward SDEs.
CoRR, 2020

Understanding Deep Architectures with Reasoning Layer.
CoRR, 2020

Understanding Deep Architecture with Reasoning Layer.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

2019
A Penalty Scheme for Monotone Systems with Interconnected Obstacles: Convergence and Error Estimates.
SIAM J. Numer. Anal., 2019

Rectified deep neural networks overcome the curse of dimensionality for nonsmooth value functions in zero-sum games of nonlinear stiff systems.
CoRR, 2019

