Yingru Li

ORCID: 0000-0002-6434-1387

According to our database, Yingru Li authored at least 28 papers between 2014 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
The Optimal Token Baseline: Variance Reduction for Long-Horizon LLM-RL.
CoRR, February, 2026

Dr. Kernel: Reinforcement Learning Done Right for Triton Kernel Generations.
CoRR, February, 2026

Beyond Precision: Training-Inference Mismatch is an Optimization Problem and Simple LR Scheduling Fixes It.
CoRR, February, 2026

2025
A Note on Hybrid Online Reinforcement and Imitation Learning for LLMs: Formulations and Algorithms.
CoRR, December, 2025

Taming the Tail: Stable LLM Reinforcement Learning via Dynamic Vocabulary Pruning.
CoRR, December, 2025

Trust Region Masking for Long-Horizon LLM Reinforcement Learning.
CoRR, December, 2025

Harnessing Uncertainty: Entropy-Modulated Policy Gradients for Long-Horizon LLM Agents.
CoRR, September, 2025

SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning.
CoRR, September, 2025

Logit Dynamics in Softmax Policy Gradient Methods.
CoRR, June, 2025

OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation.
CoRR, May, 2025

Uncertainty-Aware Search and Value Models: Mitigating Search Scaling Flaws in LLMs.
CoRR, February, 2025

Scaling Flaws of Verifier-Guided Search in Mathematical Reasoning.
CoRR, February, 2025

Divergence-Augmented Policy Optimization.
CoRR, January, 2025

Mapping the Completeness and Positional Accuracy of OpenStreetMap Road Data at the County Level in the Contiguous United States.
Trans. GIS, 2025

Effect of Elevated Temperature on Physical Activity and Falls in Low-Income Older Adults Using Zero-Inflated Poisson and Graphical Models.
Inf., 2025

2024
Adaptive Foundation Models for Online Decisions: HyperAgent with Fast Incremental Uncertainty Estimation.
CoRR, 2024

Probability Tools for Sequential Random Projection.
CoRR, 2024

Simple, unified analysis of Johnson-Lindenstrauss with applications.
CoRR, 2024

HyperAgent: A Simple, Scalable, Efficient and Provable Reinforcement Learning Framework for Complex Environments.
CoRR, 2024

Optimistic Thompson Sampling for No-Regret Learning in Unknown Games.
CoRR, 2024

Radar Anti-Jamming Strategy Learning via Domain-Knowledge Enhanced Online Convex Optimization.
Proceedings of the 13th IEEE Sensor Array and Multichannel Signal Processing Workshop, 2024

Q-Star Meets Scalable Posterior Sampling: Bridging Theory and Practice via HyperAgent.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Prior-dependent analysis of posterior sampling reinforcement learning with function approximation.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2024

2022
HyperDQN: A Randomized Exploration Method for Deep Reinforcement Learning.
Proceedings of the Tenth International Conference on Learning Representations, 2022

2020
A Study on the GIS Professional (GISP) Certification Program in the U.S.
ISPRS Int. J. Geo Inf., 2020

2019
Divergence-Augmented Policy Optimization.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

2018
Hidden community detection in social networks.
Inf. Sci., 2018

2014
Online Flood Information System: REST-Based Web Service.
Int. J. Appl. Geospat. Res., 2014

