Yixiu Mao

ORCID: 0009-0000-7302-5039

According to our database¹, Yixiu Mao authored at least 20 papers between 2021 and 2026.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2026

Listwise Policy Optimization: Group-based RLVR as Target-Projection on the LLM Response Simplex.

CoRR, May, 2026

Dynamics-Predictive Sampling for Active RL Finetuning of Large Reasoning Models.

CoRR, March, 2026

Small Generalizable Prompt Predictive Models Can Steer Efficient RL Post-Training of Large Reasoning Models.

CoRR, February, 2026

Can Prompt Difficulty be Online Predicted for Accelerating RL Finetuning of Reasoning Models?

Proceedings of the 32nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining, V.1, 2026

2025

Adaptive Neighborhood-Constrained Q Learning for Offline Reinforcement Learning.

CoRR, November, 2025

Utility-Diversity Aware Online Batch Selection for LLM Supervised Fine-tuning.

CoRR, October, 2025

A Unified Multi-Task Learning Framework for Generative Auto-Bidding with Validation-Aligned Optimization.

CoRR, October, 2025

Enhancing Generative Auto-bidding with Offline Reward Evaluation and Policy Search.

CoRR, September, 2025

Model Predictive Task Sampling for Efficient and Robust Adaptation.

CoRR, January, 2025

Robust Fast Adaptation from Adversarially Explicit Task Distribution Generation.

Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, V.1, 2025

Fast and Robust: Task Sampling with Posterior and Diversity Synergies for Adaptive Decision-Makers in Randomized Environments.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

Lessons and Winning Solutions in Industrial Object Detection and Pose Estimation from the 2025 Bin-Picking Perception Challenge.

Proceedings of the IEEE/CVF International Conference on Computer Vision, ICCV 2025, 2025

Latent Reward: LLM-Empowered Credit Assignment in Episodic Reinforcement Learning.

Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024

Choices are More Important than Efforts: LLM Enables Efficient Multi-Agent Exploration.

CoRR, 2024

Offline Reinforcement Learning with OOD State Correction and OOD Action Suppression.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Doubly Mild Generalization for Offline Reinforcement Learning.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

2023

Supported Value Regularization for Offline Reinforcement Learning.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Supported Trust Region Optimization for Offline Reinforcement Learning.

Proceedings of the International Conference on Machine Learning, 2023

In-sample Actor Critic for Offline Reinforcement Learning.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

2021

A Hypergradient Approach to Robust Regression without Correspondence.

Proceedings of the 9th International Conference on Learning Representations, 2021
