Stephen McAleer

ORCID: 0000-0003-0118-6874

According to our database, Stephen McAleer authored at least 63 papers between 2018 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2026

AI Alignment: A Contemporary Survey.

ACM Comput. Surv., April, 2026

Faster Game Solving via Hyperparameter Schedules.

Naifeng Zhang

Stephen Marcus McAleer

Tuomas Sandholm

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025

Tree Search for Language Model Agents.

Jing Yu Koh

Stephen Marcus McAleer

Daniel Fried

Ruslan Salakhutdinov

Trans. Mach. Learn. Res., 2025

Ensemble Value Functions for Efficient Exploration in Multi-Agent Reinforcement Learning.

Proceedings of the 24th International Conference on Autonomous Agents and Multiagent Systems, 2025

2024

ASP: Learn a Universal Neural Solver!

IEEE Trans. Pattern Anal. Mach. Intell., June, 2024

Sample-Efficient Regret-Minimizing Double Oracle in Extensive-Form Games.

CoRR, 2024

AgentKit: Flow Engineering with Graphs, not Coding.

CoRR, 2024

Steering No-Regret Learners to a Desired Equilibrium.

Proceedings of the 25th ACM Conference on Economics and Computation, 2024

Policy Space Response Oracles: A Survey.

Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

Scalable Mechanism Design for Multi-Agent Path Finding.

Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

AlphaZero-Like Tree-Search can Guide Large Language Model Decoding and Training.

Ziyu Wan

Xidong Feng

Muning Wen

Stephen Marcus McAleer

Ying Wen

Weinan Zhang

Jun Wang

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Confronting Reward Model Overoptimization with Constrained RLHF.

Stephen Marcus McAleer

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Toward Optimal Policy Population Growth in Two-Player Zero-Sum Games.

Stephen Marcus McAleer

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Game-Theoretic Robust Reinforcement Learning Handles Temporally-Coupled Perturbations.

Stephen Marcus McAleer

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Illusory Attacks: Information-theoretic detectability matters in adversarial attacks.

Tim Franzmeyer

Stephen Marcus McAleer

João F. Henriques

Jakob Nicolaus Foerster

Philip Torr

Adel Bibi

Christian Schröder de Witt

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Llemma: An Open Language Model for Mathematics.

Stephen Marcus McAleer

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Grasper: A Generalist Pursuer for Pursuit-Evasion Problems.

Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, 2024

Automated Design of Affine Maximizer Mechanisms in Dynamic Settings.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

JiangJun: Mastering Xiangqi by Tackling Non-Transitivity in Two-Player Zero-Sum Games.

Stephen Marcus McAleer

Trans. Mach. Learn. Res., 2023

AI Alignment: A Comprehensive Survey.

CoRR, 2023

Game-Theoretic Robust Reinforcement Learning Handles Temporally-Coupled Perturbations.

CoRR, 2023

Steering No-Regret Learners to Optimal Equilibria.

Stephen Marcus McAleer

Andreas Alexander Haupt

CoRR, 2023

MANSA: Learning Fast and Slow in Multi-Agent Systems.

CoRR, 2023

Algorithms and Complexity for Computing Nash Equilibria in Adversarial Team Games.

Ioannis Anagnostides

Fivos Kalogiannis

Ioannis Panageas

Emmanouil-Vasileios Vlatakis-Gkaragkounis

Stephen McAleer

Proceedings of the 24th ACM Conference on Economics and Computation, 2023

Computing Optimal Equilibria and Mechanisms via Learning in Zero-Sum Extensive-Form Games.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Policy Space Diversity for Non-Transitive Games.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Team-PSRO for Learning Approximate TMECor in Large Team Games via Cooperative Reinforcement Learning.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Language Models can Solve Computer Tasks.

Geunwoo Kim

Pierre Baldi

Stephen McAleer

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Regret-Minimizing Double Oracle for Extensive-Form Games.

Xiaohang Tang

Le Cong Dinh

Stephen Marcus McAleer

Yaodong Yang

Proceedings of the International Conference on Machine Learning, 2023

A Game-Theoretic Framework for Managing Risk in Multi-Agent Systems.

Oliver Slumbers

David Henry Mguni

Stefano B. Blumberg

Stephen Marcus McAleer

Yaodong Yang

Jun Wang

Proceedings of the International Conference on Machine Learning, 2023

MANSA: Learning Fast and Slow in Multi-Agent Systems.

Stephen Marcus McAleer

Feifei Tong

Jun Wang

Yaodong Yang

Proceedings of the International Conference on Machine Learning, 2023

ESCHER: Eschewing Importance Sampling in Games by Computing a History Value Function to Estimate Regret.

Stephen Marcus McAleer

Gabriele Farina

Marc Lanctot

Tuomas Sandholm

Proceedings of the Eleventh International Conference on Learning Representations, 2023

2022

Figure Data for the paper "Mastering the Game of Stratego with Model-Free Multiagent Reinforcement Learning".

Dataset, October, 2022

Sequential Decision Making in Single-Agent and Multi-Agent Domains

Stephen McAleer

PhD thesis, 2022

Online Double Oracle.

Le Cong Dinh

Stephen Marcus McAleer

Trans. Mach. Learn. Res., 2022

Game Theoretic Rating in N-player general-sum games with Equilibria.

CoRR, 2022

Feasible Adversarial Robust Reinforcement Learning for Underspecified Environments.

CoRR, 2022

Self-Play PSRO: Toward Optimal Populations in Two-Player Zero-Sum Games.

CoRR, 2022

Mastering the Game of Stratego with Model-Free Multiagent Reinforcement Learning.

CoRR, 2022

Towards Human-Level Bimanual Dexterous Manipulation with Reinforcement Learning.

Stephen Marcus McAleer

Hao Dong

Zongqing Lu

Song-Chun Zhu

CoRR, 2022

Learning Risk-Averse Equilibria in Multi-Agent Systems.

CoRR, 2022

Anytime PSRO for Two-Player Zero-Sum Games.

CoRR, 2022

Towards Human-Level Bimanual Dexterous Manipulation with Reinforcement Learning.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Reducing Variance in Temporal-Difference Value Estimation via Ensemble of Deep Networks.

Proceedings of the International Conference on Machine Learning, 2022

Proving Theorems using Incremental Learning and Hindsight Experience Replay.

Stephen Marcus McAleer

Proceedings of the International Conference on Machine Learning, 2022

Independent Natural Policy Gradient always converges in Markov Potential Games.

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

2021

Target Entropy Annealing for Discrete Soft Actor-Critic.

CoRR, 2021

Temporal-Difference Value Estimation via Uncertainty-Guided Soft Updates.

CoRR, 2021

Improving Social Welfare While Preserving Autonomy via a Pareto Mediator.

CoRR, 2021

Discovering Multi-Agent Auto-Curricula in Two-Player Zero-Sum Games.

CoRR, 2021

XDO: A Double Oracle Algorithm for Extensive-Form Games.

CoRR, 2021

A* Search Without Expansions: Learning Heuristic Functions with Deep Q-Networks.

CoRR, 2021

XDO: A Double Oracle Algorithm for Extensive-Form Games.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Neural Auto-Curricula in Two-Player Zero-Sum Games.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

2020

Deep machine learning-assisted multiphoton microscopy to reduce light exposure and expedite imaging.

CoRR, 2020

Pipeline PSRO: A Scalable Approach for Finding Approximate Nash Equilibria in Large Games.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Evolutionary Reinforcement Learning for Sample-Efficient Multiagent Coordination.

Proceedings of the 37th International Conference on Machine Learning, 2020

2019

Highly Accurate Machine Fault Diagnosis Using Deep Transfer Learning.

IEEE Trans. Ind. Informatics, 2019

Solving the Rubik's cube with deep reinforcement learning and search.

Nat. Mach. Intell., 2019

ColosseumRL: A Framework for Multiagent Reinforcement Learning in N-Player Games.

CoRR, 2019

Curiosity-Driven Multi-Criteria Hindsight Experience Replay.

John B. Lanier

Stephen McAleer

Pierre Baldi

CoRR, 2019

Solving the Rubik's Cube with Approximate Policy Iteration.

Proceedings of the 7th International Conference on Learning Representations, 2019

2018

Solving the Rubik's Cube Without Human Knowledge.

CoRR, 2018
