Yaqi Duan

Yichun Hu

Jiashuo Jiang

CoRR, January, 2026

2025

Don't Waste Mistakes: Leveraging Negative RL-Groups via Confidence Reweighting.

CoRR, October, 2025

On the optimization dynamics of RLVR: Gradient gap and step size thresholds.

Joe Suk

CoRR, October, 2025

Recursive Parameter Estimation of Fractional Order Hammerstein Output Error Autoregressive Model.

Circuits Syst. Signal Process., July, 2025

Domain anchor-guided cluster matching for intelligent fault diagnosis under distribution discrepancy and category shift.

Expert Syst. Appl., 2025

PILAF: Optimal Human Preference Sampling for Reward Modeling.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

2024

Proteomic Stratification of Prognosis and Treatment Options for Small Cell Lung Cancer.

Genom. Proteom. Bioinform., 2024

Localized exploration in contextual dynamic pricing achieves dimension-free regret.

CoRR, 2024

Taming "data-hungry" reinforcement learning? Stability in continuous state-action spaces.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

2023

PU-Flow: A Point Cloud Upsampling Network With Normalizing Flows.

IEEE Trans. Vis. Comput. Graph., December, 2023

Learning Good State and Action Representations for Markov Decision Process via Tensor Decomposition.

J. Mach. Learn. Res., 2023

A finite-sample analysis of multi-step temporal difference estimates.

Proceedings of the Learning for Dynamics and Control Conference, 2023

Invertible Residual Neural Networks with Conditional Injector and Interpolator for Point Cloud Upsampling.

Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

2022

Policy Evaluation in Batch Reinforcement Learning.

PhD thesis, 2022

High-temperature augmented neighborhood metric learning for cross-domain fault diagnosis with imbalanced data.

Knowl. Based Syst., 2022

Policy evaluation from a single path: Multi-step methods, mixing and mis-specification.

CoRR, 2022

Adaptive and Robust Multi-task Learning.

Kaizheng Wang

CoRR, 2022

Near-optimal Offline Reinforcement Learning with Linear Representation: Leveraging Variance Information with Pessimism.

Proceedings of the Tenth International Conference on Learning Representations, 2022

2021

Optimal policy evaluation using kernel-based temporal difference methods.

CoRR, 2021

PU-Flow: a Point Cloud Upsampling Network with Normalizing Flows.

CoRR, 2021

Bootstrapping Statistical Inference for Off-Policy Evaluation.

CoRR, 2021

Learning Good State and Action Representations via Tensor Decomposition.

Proceedings of the IEEE International Symposium on Information Theory, 2021

Bootstrapping Fitted Q-Evaluation for Off-Policy Inference.

Proceedings of the 38th International Conference on Machine Learning, 2021

Sparse Feature Selection Makes Batch Reinforcement Learning More Sample Efficient.

Proceedings of the 38th International Conference on Machine Learning, 2021

Risk Bounds and Rademacher Complexity in Batch Reinforcement Learning.

Chi Jin

Zhiyuan Li

Proceedings of the 38th International Conference on Machine Learning, 2021

2020

Adaptive Low-Nonnegative-Rank Approximation for State Aggregation of Markov Chains.

SIAM J. Matrix Anal. Appl., 2020

Minimax-Optimal Off-Policy Evaluation with Linear Function Approximation.

CoRR, 2020

Minimax-Optimal Off-Policy Evaluation with Linear Function Approximation.

Zeyu Jia

Proceedings of the 37th International Conference on Machine Learning, 2020

2019

Learning low-dimensional state embeddings and metastable clusters from time series data.

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

State Aggregation Learning from Markov Transition Data.

Zheng Tracy Ke