Shentao Yang

Orcid: 0009-0009-8058-3149

According to our database, Shentao Yang authored at least 12 papers between 2022 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2025

GCHR: Goal-Conditioned Hindsight Regularization for Sample-Efficient Reinforcement Learning.

CoRR, August, 2025

Bridging Supervised and Temporal Difference Learning with Q-Conditioned Maximization.

CoRR, June, 2025

Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model.

Trans. Mach. Learn. Res., 2025

SWaT: Statistical Modeling of Video Watch Time through User Behavior Analysis.

Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, V.1, 2025

2024

SWaT: Statistical Modeling of Video Watch Time through User Behavior Analysis.

CoRR, 2024

Sequential Decision-Making for Inline Text Autocomplete.

Rohan Chitnis, Shentao Yang, Alborz Geramifard

RLJ, 2024

A Dense Reward View on Aligning Text-to-Image Diffusion with Preference.

Shentao Yang, Tianqi Chen, Mingyuan Zhou

Proceedings of the Forty-first International Conference on Machine Learning, 2024

2023

Preference-grounded Token-level Guidance for Language Model Fine-tuning.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Fantastic Rewards and How to Tame Them: A Case Study on Reward Learning for Task-oriented Dialogue Systems.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

2022

A Regularized Implicit Policy for Offline Reinforcement Learning.

CoRR, 2022

A Unified Framework for Alternating Offline Model Training and Policy Learning.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Regularizing a Model-based Policy Stationary Distribution to Stabilize Offline Reinforcement Learning.

Proceedings of the International Conference on Machine Learning, 2022
