Hao Sun

Affiliations:
  • University of Cambridge, DAMTP, Department of Applied Mathematics and Theoretical Physics, UK


According to our database, Hao Sun authored at least 22 papers between 2022 and 2025.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2025
Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities.
CoRR, July, 2025

OpenReview Should be Protected and Leveraged as a Community Asset for Research in the Era of Large Language Models.
CoRR, May, 2025

Reusing Embeddings: Reproducible Reward Model Research in Large Language Model Alignment without GPUs.
CoRR, February, 2025

Reviving The Classics: Active Reward Modeling in Large Language Model Alignment.
CoRR, February, 2025

Active Reward Modeling: Adaptive Preference Labeling for Large Language Model Alignment.
Proceedings of the Forty-second International Conference on Machine Learning, 2025

The Synergy of LLMs & RL Unlocks Offline Learning of Generalizable Language-Conditioned Policies with Low-fidelity Data.
Proceedings of the Forty-second International Conference on Machine Learning, 2025

Rethinking Reward Modeling in Preference-based Large Language Model Alignment.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Inverse Reinforcement Learning Meets Large Language Model Alignment.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 5: Tutorial Abstracts), 2025

2024
When is Off-Policy Evaluation (Reward Modeling) Useful in Contextual Bandits? A Data-Centric Perspective.
J. Data-centric Mach. Learn. Res., 2024

Few-shot Steerable Alignment: Adapting Rewards and LLM Policies with Neural Processes.
CoRR, 2024

LLMs for Generalizable Language-Conditioned Policy Learning under Minimal Data Requirements.
CoRR, 2024

Rethinking Bradley-Terry Models in Preference-Based Reward Modeling: Foundations, Theory, and Alternatives.
CoRR, 2024

Inverse-RLignment: Inverse Reinforcement Learning from Demonstrations for LLM Alignment.
CoRR, 2024

Retrieval-Augmented Thought Process as Sequential Decision Making.
CoRR, 2024

Dense Reward for Free in Reinforcement Learning from Human Feedback.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Query-Dependent Prompt Evaluation and Optimization with Offline Inverse RL.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023
When is Off-Policy Evaluation Useful? A Data-Centric Perspective.
CoRR, 2023

Accountability in Offline Reinforcement Learning: Explaining Decisions with a Corpus of Examples.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

What is Flagged in Uncertainty Quantification? Latent Density Models for Uncertainty Categorization.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Neural Laplace Control for Continuous-time Delayed Systems.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023

Membership Inference Attacks against Synthetic Data through Overfitting Detection.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023

2022
DAUX: a Density-based Approach for Uncertainty eXplanations.
CoRR, 2022

