Peter Sunehag

Manfred Diaz

John P. Agapiou

William A. Cunningham

CoRR, March, 2026

2025

Simulation Streams: A Programming Paradigm for Controlling Large Language Models and Building Complex Systems with Generative AI.

CoRR, January, 2025

2024

A theory of appropriateness with applications to generative artificial intelligence.

Manfred Diaz

John P. Agapiou

William A. Cunningham

Julia Haas

Raphael Koster

CoRR, 2024

2023

A Review of Cooperation in Multi-agent Learning.

CoRR, 2023

Diversity Through Exclusion (DTE): Niche Identification for Reinforcement Learning through Value-Decomposition.

Igor Mordatch

Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, 2023

2022

Melting Pot 2.0.

John P. Agapiou

Michael Bradley Johanson

CoRR, 2022

2021

Scalable Evaluation of Multi-Agent Reinforcement Learning with Melting Pot.

Proceedings of the 38th International Conference on Machine Learning, 2021

2020

Learning to Incentivize Other Learning Agents.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

2019

Reinforcement Learning Agents acquire Flocking and Symbiotic Behaviour in Simulated Ecosystems.

Proceedings of the 2019 Conference on Artificial Life, 2019

Malthusian Reinforcement Learning.

Iain Dunning

Thore Graepel

Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, 2019

2018

Value-Decomposition Networks For Cooperative Multi-Agent Learning Based On Team Reward.

Wojciech Marian Czarnecki

Guy Lever

Audrunas Gruslys

Vinícius Flores Zambaldi

Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, 2018

2017

Value-Decomposition Networks For Cooperative Multi-Agent Learning.

Wojciech Marian Czarnecki

Guy Lever

Audrunas Gruslys

Vinícius Flores Zambaldi

CoRR, 2017

2015

Rationality, optimism and guarantees in general reinforcement learning.

J. Mach. Learn. Res., 2015

Deep Reinforcement Learning with Attention for Slate Markov Decision Processes with High-Dimensional States and Actions.

CoRR, 2015

Reinforcement Learning in Large Discrete Action Spaces.

CoRR, 2015

Using Localization and Factorization to Reduce the Complexity of Reinforcement Learning.

Proceedings of the Artificial General Intelligence, 2015

2014

A Dual Process Theory of Optimistic Cognition.

Proceedings of the 36th Annual Meeting of the Cognitive Science Society, 2014

Intelligence as Inference or Forcing Occam on the World.

Proceedings of the Artificial General Intelligence - 7th International Conference, 2014

Reinforcement learning with value advice.

Mayank Daswani

Proceedings of the Sixth Asian Conference on Machine Learning, 2014

2013

On Nicod's Condition, Rules of Induction and the Raven Paradox.

Hadi Mohasel Afshar

CoRR, 2013

The Sample-Complexity of General Reinforcement Learning.

Tor Lattimore

Proceedings of the 30th International Conference on Machine Learning, 2013

Online feature selection for Brain Computer Interfaces.

Gareth Oliver

Tom Gedeon

Proceedings of the 2013 IEEE Symposium on Computational Intelligence, 2013

Concentration and Confidence for Discrete Bayesian Sequence Predictors.

Tor Lattimore

Proceedings of the Algorithmic Learning Theory - 24th International Conference, 2013

Learning Agents with Evolving Hypothesis Classes.

Proceedings of the Artificial General Intelligence - 6th International Conference, 2013

Q-learning for history-based reinforcement learning.

Mayank Daswani

Proceedings of the Asian Conference on Machine Learning, 2013

2012

Feature Reinforcement Learning using Looping Suffix Trees.

Mayank Daswani

Proceedings of the Tenth European Workshop on Reinforcement Learning, 2012

Asynchronous Brain Computer Interface using Hidden Semi-Markov Models.

Gareth Oliver

Tom Gedeon

Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2012

Recursive channel selection techniques for brain computer interfaces.

Gareth Oliver

Tom Gedeon

Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2012

Adaptive Context Tree Weighting.

Proceedings of the 2012 Data Compression Conference, Snowbird, UT, USA, April 10-12, 2012

Coding of Non-Stationary Sources as a Foundation for Detecting Change Points and Outliers in Binary Time-Series.

Wen Shao

Proceedings of the Tenth Australasian Data Mining Conference, AusDM 2012, Sydney, 2012

Optimistic Agents Are Asymptotically Optimal.

Proceedings of the AI 2012: Advances in Artificial Intelligence, 2012

On Ensemble Techniques for AIXI Approximation.

Joel Veness

Proceedings of the Artificial General Intelligence - 5th International Conference, 2012

Optimistic AIXI.

Proceedings of the Artificial General Intelligence - 5th International Conference, 2012

Context Tree Maximizing.

Phuong Nguyen

Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, 2012

2011

Sparse Kernel-SARSA(λ) with an Eligibility Trace.

Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2011

Gradient Based Algorithms with Loss Functions and Kernels for Improved On-Policy Control.

Matthew W. Robards

Proceedings of the Recent Advances in Reinforcement Learning - 9th European Workshop, 2011

Feature Reinforcement Learning in Practice.

Phuong Nguyen

Proceedings of the Recent Advances in Reinforcement Learning - 9th European Workshop, 2011

(Non-)Equivalence of Universal Priors.

Ian Wood

Proceedings of the Algorithmic Probability and Friends. Bayesian Prediction and Artificial Intelligence, 2011

Principles of Solomonoff Induction and AIXI.

Proceedings of the Algorithmic Probability and Friends. Bayesian Prediction and Artificial Intelligence, 2011

Axioms for Rational Reinforcement Learning.

Proceedings of the Algorithmic Learning Theory - 22nd International Conference, 2011

2010

Wearable sensor activity analysis using semi-Markov models with a grammar.

Pervasive Mob. Comput., 2010

Consistency of Feature Markov Processes.

Proceedings of the Algorithmic Learning Theory, 21st International Conference, 2010

2009

Variable Metric Stochastic Approximation Theory.

Jochen Trumpf

S. V. N. Vishwanathan

Nicol N. Schraudolph

Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics, 2009

Semi-Markov kMeans Clustering and Activity Recognition from Body-Worn Sensors.

Matthew W. Robards

Proceedings of the ICDM 2009, 2009

2007

Emerge and spread models and word burstiness.

Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics, 2007

2004

Subcouples of codimension one and interpolation of operators that almost agree.
