Shane Legg

According to our database, Shane Legg authored at least 61 papers between 2000 and 2023.

Bibliography

2023
Levels of AGI: Operationalizing Progress on the Path to AGI.
CoRR, 2023

The Hydra Effect: Emergent Self-repair in Language Model Computations.
CoRR, 2023

Neural Networks and the Chomsky Hierarchy.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Randomized Positional Encodings Boost Length Generalization of Transformers.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2023

2022
Your Policy Regularizer is Secretly an Adversary.
Trans. Mach. Learn. Res., 2022

Beyond Bayes-optimality: meta-learning what you know you don't know.
CoRR, 2022

Neural Networks and the Chomsky Hierarchy.
CoRR, 2022

Safe Deep RL in 3D Environments using Human Feedback.
CoRR, 2022

2021
Model-Free Risk-Sensitive Reinforcement Learning.
CoRR, 2021

Shaking the foundations: delusions in sequence models for interaction and control.
CoRR, 2021

Causal Analysis of Agent Behavior for AI Safety.
CoRR, 2021

Quantifying Differences in Reward Functions.
Proceedings of the 9th International Conference on Learning Representations, 2021

Agent Incentives: A Causal Perspective.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Special Issue "On Defining Artificial Intelligence" - Commentaries and Author's Response.
J. Artif. Gen. Intell., 2020

Avoiding Tampering Incentives in Deep RL via Decoupled Approval.
CoRR, 2020

REALab: An Embedded Perspective on Tampering.
CoRR, 2020

Algorithms for Causal Reasoning in Probability Trees.
CoRR, 2020

The Incentives that Shape Behaviour.
CoRR, 2020

Meta-trained agents implement Bayes-optimal agents.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Avoiding Side Effects By Considering Future Tasks.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Pitfalls of Learning a Reward Function Online.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Learning Human Objectives by Evaluating Hypothetical Behavior.
Proceedings of the 37th International Conference on Machine Learning, 2020

2019
Meta-learning of Sequential Strategies.
CoRR, 2019

Understanding Agent Incentives using Causal Influence Diagrams. Part I: Single Action Settings.
CoRR, 2019

Penalizing Side Effects using Stepwise Relative Reachability.
Proceedings of the Workshop on Artificial Intelligence Safety 2019 co-located with the 28th International Joint Conference on Artificial Intelligence, 2019

Modeling AGI Safety Frameworks with Causal Influence Diagrams.
Proceedings of the Workshop on Artificial Intelligence Safety 2019 co-located with the 28th International Joint Conference on Artificial Intelligence, 2019

2018
Scaling shared model governance via model splitting.
CoRR, 2018

Scalable agent alignment via reward modeling: a research direction.
CoRR, 2018

Modeling Friends and Foes.
CoRR, 2018

Measuring and avoiding side effects using relative reachability.
CoRR, 2018

Agents and Devices: A Relative Definition of Agency.
CoRR, 2018

Psychlab: A Psychology Laboratory for Deep Reinforcement Learning Agents.
CoRR, 2018

Reward learning from human preferences and demonstrations in Atari.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures.
Proceedings of the 35th International Conference on Machine Learning, 2018

Noisy Networks For Exploration.
Proceedings of the 6th International Conference on Learning Representations, 2018

2017
AI Safety Gridworlds.
CoRR, 2017

Building Machines that Learn and Think for Themselves: Commentary on Lake et al., Behavioral and Brain Sciences, 2017.
CoRR, 2017

Symmetric Decomposition of Asymmetric Games.
CoRR, 2017

Noisy Networks for Exploration.
CoRR, 2017

Reinforcement Learning with a Corrupted Reward Channel.
CoRR, 2017

Deep Reinforcement Learning from Human Preferences.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Reinforcement Learning with a Corrupted Reward Channel.
Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017

Soft-Bayes: Prod for Mixtures of Experts with Log-Loss.
Proceedings of the International Conference on Algorithmic Learning Theory, 2017

2016
DeepMind Lab.
CoRR, 2016

2015
Human-level control through deep reinforcement learning.
Nat., 2015

Massively Parallel Methods for Deep Reinforcement Learning.
CoRR, 2015

Letter to the Editor: Research Priorities for Robust and Beneficial Artificial Intelligence: An Open Letter.
AI Mag., 2015

2014
From academia to industry: The story of Google DeepMind.
Proceedings of the 2014 Imperial College Computing Student Workshop, 2014

2011
An Approximation of the Universal Intelligence Measure.
Proceedings of the Algorithmic Probability and Friends. Bayesian Prediction and Artificial Intelligence, 2011

2007
Algorithmic probability.
Scholarpedia, 2007

Universal Intelligence: A Definition of Machine Intelligence.
Minds Mach., 2007

Temporal Difference Updating without a Learning Rate.
Proceedings of the Advances in Neural Information Processing Systems 20, 2007

2006
Fitness uniform optimization.
IEEE Trans. Evol. Comput., 2006

A Formal Measure of Machine Intelligence.
CoRR, 2006

Is There an Elegant Universal Theory of Prediction?
Proceedings of the Algorithmic Learning Theory, 17th International Conference, 2006

Tests of Machine Intelligence.
Proceedings of the 50 Years of Artificial Intelligence, 2006

A Collection of Definitions of Intelligence.
Proceedings of the Advances in Artificial General Intelligence: Concepts, Architectures and Algorithms, 2006

2005
A Universal Measure of Intelligence for Artificial Agents.
IJCAI-05, Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence, Edinburgh, Scotland, UK, July 30, 2005

Fitness uniform deletion: a simple way to preserve diversity.
Proceedings of the Genetic and Evolutionary Computation Conference, 2005

2004
Tournament versus fitness uniform selection.
Proceedings of the IEEE Congress on Evolutionary Computation, 2004

2000
Solving Problems with Finite Test Sets.
Proceedings of the Finite Versus Infinite, 2000

