Xingzhou Lou

Orcid: 0000-0001-6380-2818

According to our database, Xingzhou Lou authored at least 13 papers between 2022 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
Calibration-Aware Policy Optimization for Reasoning LLMs.
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2026

2025
Unveiling Chain of Step Reasoning for Vision-Language Models with Fine-grained Rewards.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

Sequential Preference Optimization: Multi-Dimensional Preference Alignment with Implicit Reward Modeling.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024
Leveraging Joint-Action Embedding in Multiagent Reinforcement Learning for Cooperative Games.
IEEE Trans. Games, June, 2024

Uncertainty-aware Reward Model: Teaching Reward Models to Know What is Unknown.
CoRR, 2024

Reward-Robust RLHF in LLMs.
CoRR, 2024

SPO: Multi-Dimensional Preference Sequential Alignment With Implicit Reward Modeling.
CoRR, 2024

Position: Foundation Agents as the Paradigm Shift for Decision Making.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Safe Reinforcement Learning with Free-form Natural Language Constraints and Pre-Trained Language Models.
Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, 2024

TAPE: Leveraging Agent Topology for Cooperative Multi-Agent Policy Gradient.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
An Efficient End-to-End Training Approach for Zero-Shot Human-AI Coordination.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

PECAN: Leveraging Policy Ensemble for Context-Aware Zero-Shot Human-AI Coordination.
Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, 2023

2022
Offline reinforcement learning with representations for actions.
Inf. Sci., 2022

