Shimon Whiteson

According to our database, Shimon Whiteson authored at least 229 papers between 2003 and 2024.

Bibliography

2024
SplAgger: Split Aggregation for Meta-Reinforcement Learning.
CoRR, 2024

Distilling Morphology-Conditioned Hypernetworks for Efficient Universal Morphology Control.
CoRR, 2024

Discovering Temporally-Aware Reinforcement Learning Algorithms.
CoRR, 2024

2023
JaxMARL: Multi-Agent RL Environments in JAX.
CoRR, 2023

Bayesian Exploration Networks.
CoRR, 2023

Trust-Region-Free Policy Optimization for Stochastic Policies.
CoRR, 2023

A Survey of Meta-Reinforcement Learning.
CoRR, 2023

The Waymo Open Sim Agents Challenge.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Discovering General Reinforcement Learning Algorithms with Adversarial Environment Design.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

SMACv2: An Improved Benchmark for Cooperative Multi-Agent Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Recurrent Hypernetworks are Surprisingly Strong in Meta-RL.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Imitation Is Not Enough: Robustifying Imitation with Reinforcement Learning for Challenging Driving Scenarios.
IROS, 2023

Hierarchical Imitation Learning for Stochastic Environments.
IROS, 2023

Universal Morphology Control via Contextual Modulation.
Proceedings of the International Conference on Machine Learning, 2023

Why Target Networks Stabilise Temporal Difference Methods.
Proceedings of the International Conference on Machine Learning, 2023

Cheap Talk Discovery and Utilization in Multi-Agent Reinforcement Learning.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Trust Region Bounds for Decentralized PPO Under Non-stationarity.
Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, 2023

2022
Truncated Emphatic Temporal Difference Methods for Prediction and Control.
J. Mach. Learn. Res., 2022

SMACv2: An Improved Benchmark for Cooperative Multi-Agent Reinforcement Learning.
CoRR, 2022

An Investigation of the Bias-Variance Tradeoff in Meta-Gradients.
CoRR, 2022

Generalization in Cooperative Multi-Agent Systems.
CoRR, 2022

Monotonic Improvement Guarantees under Non-stationarity for Decentralized PPO.
CoRR, 2022

You May Not Need Ratio Clipping in PPO.
CoRR, 2022

In Defense of the Unitary Scalarization for Deep Multi-Task Learning.
CoRR, 2022

Equivariant Networks for Zero-Shot Coordination.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

In Defense of the Unitary Scalarization for Deep Multi-Task Learning.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Hierarchical Model-Based Imitation Learning for Planning in Autonomous Driving.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2022

Symphony: Learning Realistic and Diverse Agents for Autonomous Driving Simulation.
Proceedings of the 2022 International Conference on Robotics and Automation, 2022

Communicating via Markov Decision Processes.
Proceedings of the International Conference on Machine Learning, 2022

Generalized Beliefs for Cooperative AI.
Proceedings of the International Conference on Machine Learning, 2022

Particle-Based Score Estimation for State Space Model Learning in Autonomous Driving.
Proceedings of the Conference on Robot Learning, 2022

Embedding Synthetic Off-Policy Experience for Autonomous Driving via Zero-Shot Curricula.
Proceedings of the Conference on Robot Learning, 2022

Hypernetworks in Meta-Reinforcement Learning.
Proceedings of the Conference on Robot Learning, 2022

Learning Skills Diverse in Value-Relevant Features.
Proceedings of the Conference on Lifelong Learning Agents, 2022

A Deeper Look at Discounting Mismatch in Actor-Critic Algorithms.
Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, 2022

Deterministic and Discriminative Imitation (D2-Imitation): Revisiting Adversarial Imitation for Sample Efficiency.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Reinforcement learning enhanced quantum-inspired algorithm for combinatorial optimization.
Mach. Learn. Sci. Technol., 2021

VariBAD: Variational Bayes-Adaptive Deep RL via Meta-Learning.
J. Mach. Learn. Res., 2021

On the Practical Consistency of Meta-Reinforcement Learning Algorithms.
CoRR, 2021

Reinforcement Learning in Factored Action Spaces using Tensor Decompositions.
CoRR, 2021

Model based Multi-agent Reinforcement Learning with Tensor Decompositions.
CoRR, 2021

Implicit Communication as Minimum Entropy Coupling.
CoRR, 2021

Bayesian Bellman Operators.
CoRR, 2021

SoftDICE for Imitation Learning: Rethinking Off-policy Distribution Matching.
CoRR, 2021

Semi-On-Policy Training for Sample Efficient Multi-Agent Policy Gradients.
CoRR, 2021

Softmax with Regularization: Better Value Estimation in Multi-Agent Reinforcement Learning.
CoRR, 2021

Analysing factorizations of action-value networks for cooperative multi-agent reinforcement learning.
Auton. Agents Multi Agent Syst., 2021

FACMAC: Factored Multi-Agent Centralised Policy Gradients.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Regularized Softmax Deep Multi-Agent Q-Learning.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Bayesian Bellman Operators.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Snowflake: Scaling GNNs to high-dimensional continuous control via parameter freezing.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Deep Residual Reinforcement Learning (Extended Abstract).
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

Exploration in Approximate Hyper-State Space for Meta Reinforcement Learning.
Proceedings of the 38th International Conference on Machine Learning, 2021

Breaking the Deadly Triad with a Target Network.
Proceedings of the 38th International Conference on Machine Learning, 2021

Average-Reward Off-Policy Policy Evaluation with Function Approximation.
Proceedings of the 38th International Conference on Machine Learning, 2021

Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning.
Proceedings of the 38th International Conference on Machine Learning, 2021

Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning.
Proceedings of the 38th International Conference on Machine Learning, 2021

UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning.
Proceedings of the 38th International Conference on Machine Learning, 2021

My Body is a Cage: the Role of Morphology in Graph-Based Incompatible Control.
Proceedings of the 9th International Conference on Learning Representations, 2021

Transient Non-stationarity and Generalisation in Deep Reinforcement Learning.
Proceedings of the 9th International Conference on Learning Representations, 2021

RODE: Learning Roles to Decompose Multi-Agent Tasks.
Proceedings of the 9th International Conference on Learning Representations, 2021

Deep Interactive Bayesian Reinforcement Learning via Meta-Learning.
Proceedings of the AAMAS '21: 20th International Conference on Autonomous Agents and Multiagent Systems, 2021

Mean-Variance Policy Iteration for Risk-Averse Reinforcement Learning.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning.
J. Mach. Learn. Res., 2020

Robust Reinforcement Learning with Bayesian Optimisation and Quadrature.
J. Mach. Learn. Res., 2020

Expected Policy Gradients for Reinforcement Learning.
J. Mach. Learn. Res., 2020

Is Independent Learning All You Need in the StarCraft Multi-Agent Challenge?
CoRR, 2020

Exploration in Approximate Hyper-State Space for Meta Reinforcement Learning.
CoRR, 2020

WordCraft: An Environment for Benchmarking Commonsense Agents.
CoRR, 2020

Weighted QMIX: Expanding Monotonic Value Function Factorisation.
CoRR, 2020

The Impact of Non-stationarity on Generalisation in Deep Reinforcement Learning.
CoRR, 2020

AI-QMIX: Attention and Imagination for Dynamic Multi-Agent Reinforcement Learning.
CoRR, 2020

Privileged Information Dropout in Reinforcement Learning.
CoRR, 2020

Maximizing Information Gain in Partially Observable Environments via Prediction Reward.
CoRR, 2020

Per-Step Reward: A New Perspective for Risk-Averse Reinforcement Learning.
CoRR, 2020

Deep Multi-Agent Reinforcement Learning for Decentralized Continuous Cooperative Control.
CoRR, 2020

Facial feedback for reinforcement learning: a case study and offline analysis using the TAMER framework.
Auton. Agents Multi Agent Syst., 2020

Multitask Soft Option Learning.
Proceedings of the Thirty-Sixth Conference on Uncertainty in Artificial Intelligence, 2020

Learning Retrospective Knowledge with Reverse Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Weighted QMIX: Expanding Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Can Q-Learning with Graph Networks Learn a Generalizable Branching Heuristic for a SAT Solver?
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Provably Convergent Two-Timescale Off-Policy Actor-Critic with Function Approximation.
Proceedings of the 37th International Conference on Machine Learning, 2020

GradientDICE: Rethinking Generalized Offline Estimation of Stationary Values.
Proceedings of the 37th International Conference on Machine Learning, 2020

Growing Action Spaces.
Proceedings of the 37th International Conference on Machine Learning, 2020

Deep Coordination Graphs.
Proceedings of the 37th International Conference on Machine Learning, 2020

VariBAD: A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning.
Proceedings of the 8th International Conference on Learning Representations, 2020

Optimistic Exploration even with a Pessimistic Initialisation.
Proceedings of the 8th International Conference on Learning Representations, 2020

Deep Residual Reinforcement Learning.
Proceedings of the 19th International Conference on Autonomous Agents and Multiagent Systems, 2020

Maximizing Information Gain in Partially Observable Environments via Prediction Rewards.
Proceedings of the 19th International Conference on Autonomous Agents and Multiagent Systems, 2020

2019
VIABLE: Fast Adaptation via Backpropagating Learned Loss.
CoRR, 2019

Provably Convergent Off-Policy Actor-Critic with Function Approximation.
CoRR, 2019

Improving SAT Solver Heuristics with Graph Networks and Reinforcement Learning.
CoRR, 2019

Loaded DiCE: Trading off Bias and Variance in Any-Order Score Function Estimators for Reinforcement Learning.
CoRR, 2019

Exploration with Unreliable Intrinsic Reward in Multi-Agent Reinforcement Learning.
CoRR, 2019

Multitask Soft Option Learning.
CoRR, 2019

Fast Efficient Hyperparameter Tuning for Policy Gradients.
CoRR, 2019

DAC: The Double Actor-Critic Architecture for Learning Options.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Generalized Off-Policy Actor-Critic.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Multi-Agent Common Knowledge Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Fast Efficient Hyperparameter Tuning for Policy Gradient Methods.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

MAVEN: Multi-Agent Variational Exploration.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

VIREL: A Variational Inference Framework for Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Loaded DiCE: Trading off Bias and Variance in Any-Order Score Function Gradient Estimators for Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

A Survey of Reinforcement Learning Informed by Natural Language.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Learning From Demonstration in the Wild.
Proceedings of the International Conference on Robotics and Automation, 2019

Fast Context Adaptation via Meta-Learning.
Proceedings of the 36th International Conference on Machine Learning, 2019

Fingerprint Policy Optimisation for Robust Reinforcement Learning.
Proceedings of the 36th International Conference on Machine Learning, 2019

A Baseline for Any Order Gradient Estimation in Stochastic Computation Graphs.
Proceedings of the 36th International Conference on Machine Learning, 2019

Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning.
Proceedings of the 36th International Conference on Machine Learning, 2019

Stable Opponent Shaping in Differentiable Games.
Proceedings of the 7th International Conference on Learning Representations, 2019

The StarCraft Multi-Agent Challenge.
Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, 2019

The Representational Capacity of Action-Value Networks for Multi-Agent Reinforcement Learning.
Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, 2019

2018
CAML: Fast Context Adaptation via Meta-Learning.
CoRR, 2018

Contextual Policy Optimisation.
CoRR, 2018

Exploiting submodular value functions for scaling up active perception.
Auton. Robots, 2018

Social interaction for efficient agent learning from human reward.
Auton. Agents Multi Agent Syst., 2018

TACO: Learning Task Decomposition via Temporal Alignment for Control.
Proceedings of the 35th International Conference on Machine Learning, 2018

QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning.
Proceedings of the 35th International Conference on Machine Learning, 2018

Deep Variational Reinforcement Learning for POMDPs.
Proceedings of the 35th International Conference on Machine Learning, 2018

Fourier Policy Gradients.
Proceedings of the 35th International Conference on Machine Learning, 2018

DiCE: The Infinitely Differentiable Monte-Carlo Estimator.
Proceedings of the 6th International Conference on Learning Representations, 2018

TreeQN and ATreeC: Differentiable Tree-Structured Models for Deep Reinforcement Learning.
Proceedings of the 6th International Conference on Learning Representations, 2018

Learning with Opponent-Learning Awareness.
Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, 2018

Alternating Optimisation and Quadrature for Robust Control.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Counterfactual Multi-Agent Policy Gradients.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Expected Policy Gradients.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
Multi-Objective Decision Making
Synthesis Lectures on Artificial Intelligence and Machine Learning, Morgan & Claypool Publishers, ISBN: 978-3-031-01576-2, 2017

TreeQN and ATreeC: Differentiable Tree Planning for Deep Reinforcement Learning.
CoRR, 2017

Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning.
CoRR, 2017

Real-Time Resource Allocation for Tracking Systems.
Proceedings of the Thirty-Third Conference on Uncertainty in Artificial Intelligence, 2017

Dynamic-Depth Context Tree Weighting.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Acquiring social interaction behaviours for telepresence robots via deep learning from demonstration.
Proceedings of the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2017

Rapidly exploring learning trees.
Proceedings of the 2017 IEEE International Conference on Robotics and Automation, 2017

Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning.
Proceedings of the 34th International Conference on Machine Learning, 2017

Intro to Reinforcement Learning.
Proceedings of the British Machine Vision Conference 2017, 2017

OFFER: Off-Environment Reinforcement Learning.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016
Probably Approximately Correct Greedy Maximization.
CoRR, 2016

Alternating Optimisation and Quadrature for Robust Reinforcement Learning.
CoRR, 2016

Multi-Objective Deep Reinforcement Learning.
CoRR, 2016

Learning to Communicate to Solve Riddles with Deep Distributed Recurrent Q-Networks.
CoRR, 2016

LipNet: Sentence-level Lipreading.
CoRR, 2016

Using informative behavior to increase engagement while learning from human reward.
Auton. Agents Multi Agent Syst., 2016

Multileave Gradient Descent for Fast Online Learning to Rank.
Proceedings of the Ninth ACM International Conference on Web Search and Data Mining, 2016

Learning to Communicate with Deep Multi-Agent Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

PAC Greedy Maximization with Efficient Bounds on Information Gain for Sensor Selection.
Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016

Inverse Reinforcement Learning from Failure.
Proceedings of the 2016 International Conference on Autonomous Agents & Multiagent Systems, 2016

Probably Approximately Correct Greedy Maximization: (Extended Abstract).
Proceedings of the 2016 International Conference on Autonomous Agents & Multiagent Systems, 2016

Towards Learning from Implicit Human Reward: (Extended Abstract).
Proceedings of the 2016 International Conference on Autonomous Agents & Multiagent Systems, 2016

2015
Computing Convex Coverage Sets for Faster Multi-objective Coordination.
J. Artif. Intell. Res., 2015

MergeRUCB: A Method for Large-Scale Online Ranker Evaluation.
Proceedings of the Eighth ACM International Conference on Web Search and Data Mining, 2015

Bayesian Ranker Comparison Based on Historical User Interactions.
Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2015

Copeland Dueling Bandits.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Point-Based Planning for Multi-Objective POMDPs.
Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, 2015

Pareto Local Search for MOMDP Planning.
Proceedings of the 23rd European Symposium on Artificial Neural Networks, 2015

A Large-Scale Study of Agents Learning from Human Reward.
Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems, 2015

Exploiting Submodular Value Functions for Faster Dynamic Sensor Selection.
Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015

2014
"Learning to rank for information retrieval from user interactions" by K. Hofmann, S. Whiteson, A. Schuth, and M. de Rijke with Martin Vesely as coordinator.
SIGWEB Newsl., 2014

Efficient Abstraction Selection in Reinforcement Learning.
Comput. Intell., 2014

Learning potential functions and their representations for multi-task reinforcement learning.
Auton. Agents Multi Agent Syst., 2014

Relative confidence sampling for efficient on-line ranker evaluation.
Proceedings of the Seventh ACM International Conference on Web Search and Data Mining, 2014

Queued Pareto Local Search for Multi-Objective Optimization.
Proceedings of the Parallel Problem Solving from Nature - PPSN XIII, 2014

Relative Upper Confidence Bound for the K-Armed Dueling Bandit Problem.
Proceedings of the 31st International Conference on Machine Learning, 2014

Learning from human reward benefits from socio-competitive feedback.
Proceedings of the 4th International Conference on Development and Learning and on Epigenetic Robotics, 2014

Challenge balancing for personalised game spaces.
Proceedings of the 2014 IEEE Games Media Entertainment, 2014

Design criteria for challenge balancing of personalised game spaces.
Proceedings of the 9th International Conference on the Foundations of Digital Games, 2014

Optimizing Base Rankers Using Clicks - A Case Study Using BM25.
Proceedings of the Advances in Information Retrieval, 2014

Multileaved Comparisons for Fast Online Evaluation.
Proceedings of the 23rd ACM International Conference on Information and Knowledge Management, 2014

Linear support for multi-objective coordination graphs.
Proceedings of the International conference on Autonomous Agents and Multi-Agent Systems, 2014

Leveraging social networks to motivate humans to train agents.
Proceedings of the International conference on Autonomous Agents and Multi-Agent Systems, 2014

Bounded Approximations for Linear Multi-Objective Planning Under Uncertainty.
Proceedings of the Twenty-Fourth International Conference on Automated Planning and Scheduling, 2014

Towards Personalised Gaming via Facial Expression Recognition.
Proceedings of the Tenth AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, 2014

2013
Fidelity, Soundness, and Efficiency of Interleaved Comparison Methods.
ACM Trans. Inf. Syst., 2013

A Survey of Multi-Objective Sequential Decision-Making.
J. Artif. Intell. Res., 2013

Incremental Clustering and Expansion for Faster Optimal Planning in Dec-POMDPs.
J. Artif. Intell. Res., 2013

Balancing exploration and exploitation in listwise and pairwise online learning to rank for information retrieval.
Inf. Retr., 2013

Efficient Abstraction Selection in Reinforcement Learning (Extended Abstract).
Proceedings of the Tenth Symposium on Abstraction, Reformulation, and Approximation, 2013

Critical factors in the performance of hyperNEAT.
Proceedings of the Genetic and Evolutionary Computation Conference, 2013

Reusing Historical Interaction Data for Faster Online Learning to Rank for IR.
Proceedings of the 13th Dutch-Belgian Workshop on Information Retrieval, 2013

Lerot: an online learning to rank framework.
Proceedings of the 2013 workshop on Living labs for information retrieval evaluation, 2013

Multi-objective variable elimination for collaborative graphical games.
Proceedings of the International conference on Autonomous Agents and Multi-Agent Systems, 2013

Approximate solutions for factored Dec-POMDPs with many agents.
Proceedings of the International conference on Autonomous Agents and Multi-Agent Systems, 2013

Using informative behavior to increase engagement in the tamer framework.
Proceedings of the International conference on Autonomous Agents and Multi-Agent Systems, 2013

Computing Convex Coverage Sets for Multi-objective Coordination Graphs.
Proceedings of the Algorithmic Decision Theory - Third International Conference, 2013

2012
Exploiting Structure in Cooperative Bayesian Games.
Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence, 2012

Estimating interleaved comparison outcomes from historical click data.
Proceedings of the 21st ACM International Conference on Information and Knowledge Management, 2012

V-MAX: tempered optimism for better PAC reinforcement learning.
Proceedings of the International Conference on Autonomous Agents and Multiagent Systems, 2012

Evolutionary Computation for Reinforcement Learning.
Proceedings of the Reinforcement Learning, 2012

2011
Introduction to the special issue on empirical evaluations in reinforcement learning.
Mach. Learn., 2011

Exploiting Best-Match Equations for Efficient Reinforcement Learning.
J. Mach. Learn. Res., 2011

Neuroevolutionary reinforcement learning for generalized control of simulated helicopters.
Evol. Intell., 2011

Exploiting Agent and Type Independence in Collaborative Graphical Bayesian Games.
CoRR, 2011

Adapting Rankers Online.
Proceedings of the Multidisciplinary Information Retrieval, 2011

Robust central pattern generators for embodied hierarchical reinforcement learning.
Proceedings of the 1st International Conference on Development and Learning and on Epigenetic Robotics, 2011

Critical factors in the performance of novelty search.
Proceedings of the 13th Annual Genetic and Evolutionary Computation Conference, 2011

Multi-Task Reinforcement Learning: Shaping and Feature Selection.
Proceedings of the Recent Advances in Reinforcement Learning - 9th European Workshop, 2011

Balancing Exploration and Exploitation in Learning to Rank Online.
Proceedings of the Advances in Information Retrieval, 2011

A probabilistic method for inferring preferences from clicks.
Proceedings of the 20th ACM Conference on Information and Knowledge Management, 2011

Protecting against evaluation overfitting in empirical reinforcement learning.
Proceedings of the 2011 IEEE Symposium on Adaptive Dynamic Programming And Reinforcement Learning, 2011

2010
Switching between Representations in Reinforcement Learning.
Proceedings of the Interactive Collaborative Information Systems, 2010

Traffic Light Control by Multiagent Reinforcement Learning Systems.
Proceedings of the Interactive Collaborative Information Systems, 2010

Adaptive Representations for Reinforcement Learning
Studies in Computational Intelligence 291, Springer, ISBN: 978-3-642-13931-4, 2010

Report on the 2008 Reinforcement Learning Competition.
AI Mag., 2010

Critical factors in the empirical performance of temporal difference and evolutionary methods for reinforcement learning.
Auton. Agents Multi Agent Syst., 2010

Multi-task evolutionary shaping without pre-specified representations.
Proceedings of the Genetic and Evolutionary Computation Conference, 2010

2009
Machine learning for event selection in high energy physics.
Eng. Appl. Artif. Intell., 2009

Postponed Updates for Temporal-Difference Reinforcement Learning.
Proceedings of the Ninth International Conference on Intelligent Systems Design and Applications, 2009

Automatic Feature Selection for Model-Based Reinforcement Learning in Factored MDPs.
Proceedings of the International Conference on Machine Learning and Applications, 2009

Neuroevolutionary reinforcement learning for generalized helicopter control.
Proceedings of the Genetic and Evolutionary Computation Conference, 2009

Integrating distributed Bayesian inference and reinforcement learning for sensor management.
Proceedings of the 12th International Conference on Information Fusion, 2009

Lossless clustering of histories in decentralized POMDPs.
Proceedings of the 8th International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2009), 2009

A theoretical and empirical analysis of Expected Sarsa.
Proceedings of the IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, 2009

2008
Multiagent Reinforcement Learning for Urban Traffic Control Using Coordination Graphs.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2008

Exploiting locality of interaction in factored Dec-POMDPs.
Proceedings of the 7th International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2008), 2008

2007
Empirical Studies in Action Selection with Reinforcement Learning.
Adapt. Behav., 2007

Transfer via inter-task mappings in policy search reinforcement learning.
Proceedings of the 6th International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2007), 2007

Stochastic Optimization for Collision Selection in High Energy Physics.
Proceedings of the Twenty-Second AAAI Conference on Artificial Intelligence, 2007

Temporal Difference and Policy Search Methods for Reinforcement Learning: An Empirical Comparison.
Proceedings of the Twenty-Second AAAI Conference on Artificial Intelligence, 2007

2006
Evolutionary Function Approximation for Reinforcement Learning.
J. Mach. Learn. Res., 2006

On-line evolutionary computation for reinforcement learning in stochastic domains.
Proceedings of the Genetic and Evolutionary Computation Conference, 2006

Comparing evolutionary and temporal difference methods in a reinforcement learning domain.
Proceedings of the Genetic and Evolutionary Computation Conference, 2006

Sample-Efficient Evolutionary Function Approximation for Reinforcement Learning.
Proceedings of the Proceedings, 2006

2005
Evolving Soccer Keepaway Players Through Task Decomposition.
Mach. Learn., 2005

Automatic feature selection in neuroevolution.
Proceedings of the Genetic and Evolutionary Computation Conference, 2005

Improving Reinforcement Learning Function Approximators via Neuroevolution.
Proceedings of the Proceedings, 2005

2004
Adaptive job routing and scheduling.
Eng. Appl. Artif. Intell., 2004

Towards Autonomic Computing: Adaptive Network Routing and Scheduling.
Proceedings of the 1st International Conference on Autonomic Computing (ICAC 2004), 2004

Towards Autonomic Computing: Adaptive Job Routing and Scheduling.
Proceedings of the Nineteenth National Conference on Artificial Intelligence, 2004

2003
Evolving Keepaway Soccer Players through Task Decomposition.
Proceedings of the Genetic and Evolutionary Computation, 2003

Concurrent layered learning.
Proceedings of the Second International Joint Conference on Autonomous Agents & Multiagent Systems, 2003

