Yixiu Mao

Orcid: 0009-0000-7302-5039

According to our database, Yixiu Mao authored at least 20 papers between 2021 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
Listwise Policy Optimization: Group-based RLVR as Target-Projection on the LLM Response Simplex.
CoRR, May, 2026

Dynamics-Predictive Sampling for Active RL Finetuning of Large Reasoning Models.
CoRR, March, 2026

Small Generalizable Prompt Predictive Models Can Steer Efficient RL Post-Training of Large Reasoning Models.
CoRR, February, 2026

Can Prompt Difficulty be Online Predicted for Accelerating RL Finetuning of Reasoning Models?
Proceedings of the 32nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining, V.1, 2026

2025
Adaptive Neighborhood-Constrained Q Learning for Offline Reinforcement Learning.
CoRR, November, 2025

Utility-Diversity Aware Online Batch Selection for LLM Supervised Fine-tuning.
CoRR, October, 2025

A Unified Multi-Task Learning Framework for Generative Auto-Bidding with Validation-Aligned Optimization.
CoRR, October, 2025

Enhancing Generative Auto-bidding with Offline Reward Evaluation and Policy Search.
CoRR, September, 2025

Model Predictive Task Sampling for Efficient and Robust Adaptation.
CoRR, January, 2025

Robust Fast Adaptation from Adversarially Explicit Task Distribution Generation.
Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, V.1, 2025

Fast and Robust: Task Sampling with Posterior and Diversity Synergies for Adaptive Decision-Makers in Randomized Environments.
Proceedings of the Forty-second International Conference on Machine Learning, 2025

Lessons and Winning Solutions in Industrial Object Detection and Pose Estimation from the 2025 Bin-Picking Perception Challenge.
Proceedings of the IEEE/CVF International Conference on Computer Vision, ICCV 2025, 2025

Latent Reward: LLM-Empowered Credit Assignment in Episodic Reinforcement Learning.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024
Choices are More Important than Efforts: LLM Enables Efficient Multi-Agent Exploration.
CoRR, 2024

Offline Reinforcement Learning with OOD State Correction and OOD Action Suppression.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Doubly Mild Generalization for Offline Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

2023
Supported Value Regularization for Offline Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Supported Trust Region Optimization for Offline Reinforcement Learning.
Proceedings of the International Conference on Machine Learning, 2023

In-sample Actor Critic for Offline Reinforcement Learning.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

2021
A Hypergradient Approach to Robust Regression without Correspondence.
Proceedings of the 9th International Conference on Learning Representations, 2021

