Shangtong Zhang

Orcid: 0000-0003-4255-1364

According to our database, Shangtong Zhang authored at least 37 papers between 2015 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Timeline (chart not reproduced; publication types: Book, In proceedings, Article, PhD thesis, Dataset, Other)

Bibliography

2024
The ODE Method for Stochastic Approximation and Reinforcement Learning with Markovian Noise.
CoRR, 2024

2023
IMGA: Efficient In-Memory Graph Convolution Network Aggregation With Data Flow Optimizations.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., December, 2023

AlphaStar Unplugged: Large-Scale Offline Reinforcement Learning.
CoRR, 2023

Direct Gradient Temporal Difference Learning.
CoRR, 2023

Improving Monte Carlo Evaluation with Offline Data.
CoRR, 2023

On the Convergence of SARSA with Linear Function Approximation.
Proceedings of the International Conference on Machine Learning, 2023

A New Challenge in Policy Evaluation.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Truncated Emphatic Temporal Difference Methods for Prediction and Control.
J. Mach. Learn. Res., 2022

Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch.
J. Mach. Learn. Res., 2022

On the Chattering of SARSA with Linear Function Approximation.
CoRR, 2022

A Deeper Look at Discounting Mismatch in Actor-Critic Algorithms.
Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, 2022

Learning Expected Emphatic Traces for Deep RL.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Deep Residual Reinforcement Learning (Extended Abstract).
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

Breaking the Deadly Triad with a Target Network.
Proceedings of the 38th International Conference on Machine Learning, 2021

Average-Reward Off-Policy Policy Evaluation with Function Approximation.
Proceedings of the 38th International Conference on Machine Learning, 2021

Mean-Variance Policy Iteration for Risk-Averse Reinforcement Learning.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Per-Step Reward: A New Perspective for Risk-Averse Reinforcement Learning.
CoRR, 2020

Learning Retrospective Knowledge with Reverse Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Provably Convergent Two-Timescale Off-Policy Actor-Critic with Function Approximation.
Proceedings of the 37th International Conference on Machine Learning, 2020

GradientDICE: Rethinking Generalized Offline Estimation of Stationary Values.
Proceedings of the 37th International Conference on Machine Learning, 2020

Deep Residual Reinforcement Learning.
Proceedings of the 19th International Conference on Autonomous Agents and Multiagent Systems, 2020

Mega-Reward: Achieving Human-Level Play without Extrinsic Rewards.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Provably Convergent Off-Policy Actor-Critic with Function Approximation.
CoRR, 2019

Distributional Reinforcement Learning for Efficient Exploration.
CoRR, 2019

Mega-Reward: Achieving Human-Level Play without Extrinsic Rewards.
CoRR, 2019

DAC: The Double Actor-Critic Architecture for Learning Options.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Generalized Off-Policy Actor-Critic.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Exploration in the Face of Parametric and Intrinsic Uncertainties.
Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, 2019

QUOTA: The Quantile Option Architecture for Reinforcement Learning.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

ACE: An Actor Ensemble Algorithm for Continuous Control with Tree Search.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
mlpack 3: a fast, flexible machine learning library.
J. Open Source Softw., 2018

ACE: An Actor Ensemble Algorithm for Continuous Control with Tree Search.
CoRR, 2018

QUOTA: The Quantile Option Architecture for Reinforcement Learning.
CoRR, 2018

2017
A Deeper Look at Experience Replay.
CoRR, 2017

Comparing Deep Reinforcement Learning and Evolutionary Methods in Continuous Control.
CoRR, 2017

Crossprop: Learning Representations by Stochastic Meta-Gradient Descent in Neural Networks.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2017

2015
A Deep Neural Network for Modeling Music.
Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, 2015

