Yixiu Mao

Orcid: 0009-0000-7302-5039

According to our database, Yixiu Mao authored at least 16 papers between 2021 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2025

Adaptive Neighborhood-Constrained Q Learning for Offline Reinforcement Learning.

CoRR, November, 2025

Utility-Diversity Aware Online Batch Selection for LLM Supervised Fine-tuning.

CoRR, October, 2025

A Unified Multi-Task Learning Framework for Generative Auto-Bidding with Validation-Aligned Optimization.

CoRR, October, 2025

Enhancing Generative Auto-bidding with Offline Reward Evaluation and Policy Search.

CoRR, September, 2025

Can Prompt Difficulty be Online Predicted for Accelerating RL Finetuning of Reasoning Models?

CoRR, July, 2025

Model Predictive Task Sampling for Efficient and Robust Adaptation.

CoRR, January, 2025

Robust Fast Adaptation from Adversarially Explicit Task Distribution Generation.

Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, V.1, 2025

Fast and Robust: Task Sampling with Posterior and Diversity Synergies for Adaptive Decision-Makers in Randomized Environments.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

Latent Reward: LLM-Empowered Credit Assignment in Episodic Reinforcement Learning.

Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024

Choices are More Important than Efforts: LLM Enables Efficient Multi-Agent Exploration.

CoRR, 2024

Offline Reinforcement Learning with OOD State Correction and OOD Action Suppression.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Doubly Mild Generalization for Offline Reinforcement Learning.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

2023

Supported Value Regularization for Offline Reinforcement Learning.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Supported Trust Region Optimization for Offline Reinforcement Learning.

Proceedings of the International Conference on Machine Learning, 2023

In-sample Actor Critic for Offline Reinforcement Learning.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

2021

A Hypergradient Approach to Robust Regression without Correspondence.

Proceedings of the 9th International Conference on Learning Representations, 2021
