Tom Everitt

Orcid: 0000-0003-1210-9866

Affiliations:

Australian National University

According to our database, Tom Everitt authored at least 49 papers between 2014 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2025

General agents need world models.

[DOI]

Jonathan Richens

,

,

,

CoRR, June, 2025

Evaluating the Goal-Directedness of Large Language Models.

[DOI]

,

Cristina Garbacea

,

,

Jonathan Richens

,

Henry Papadatos

,

,

CoRR, April, 2025

An Approach to Technical AGI Safety and Security.

[DOI]

CoRR, April, 2025

Incentives for responsiveness, instrumental control and impact.

[DOI]

,

Eric D. Langlois

,

Chris van Merwijk

,

,

Artif. Intell., 2025

General agents need world models.

[DOI]

Jonathan Richens

,

,

Proceedings of the Forty-second International Conference on Machine Learning, 2025

The Limits of Predicting Agents from Behaviour.

[DOI]

,

Jonathan Richens

,

Proceedings of the Forty-second International Conference on Machine Learning, 2025

2024

A Mechanism-Based Approach to Mitigating Harms from Persuasive Generative AI.

[DOI]

CoRR, 2024

Measuring Goal-Directedness.

[DOI]

Matt MacDermott

,

,

Francesco Belardinelli

,

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Robust agents learn causal world models.

[DOI]

Jonathan Richens

,

Proceedings of the Twelfth International Conference on Learning Representations, 2024

The Reasons that Agents Act: Intention and Instrumental Goals.

[DOI]

Francis Rhys Ward

,

Matt MacDermott

,

Francesco Belardinelli

,

,

Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, 2024

Discovering Agents (Abstract Reprint).

[DOI]

,

,

Sebastian Farquhar

,

Jonathan Richens

,

Matt MacDermott

,

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Reasoning about Causality in Games (Abstract Reprint).

[DOI]

,

,

,

,

Alessandro Abate

,

Michael J. Wooldridge

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

Discovering agents.

[DOI]

,

,

Sebastian Farquhar

,

Jonathan Richens

,

Matt MacDermott

,

Artif. Intell., September, 2023

Reasoning about causality in games.

[DOI]

,

,

,

,

Alessandro Abate

,

Michael J. Wooldridge

Artif. Intell., July, 2023

Characterising Decision Theories with Mechanised Causal Graphs.

[DOI]

Matt MacDermott

,

,

Francesco Belardinelli

CoRR, 2023

Human Control: Definitions and Algorithms.

[DOI]

,

Proceedings of the Uncertainty in Artificial Intelligence, 2023

Honesty Is the Best Policy: Defining and Mitigating AI Deception.

[DOI]

,

,

Francesco Belardinelli

,

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

2022

A Complete Criterion for Value of Information in Soluble Influence Diagrams.

[DOI]

Chris van Merwijk

,

,

Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

Path-Specific Objectives for Safer Agent Incentives.

[DOI]

Sebastian Farquhar

,

,

Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

Why Fair Labels Can Yield Unfair Predictions: Graphical Conditions for Introduced Unfairness.

[DOI]

Carolyn Ashurst

,

,

,

Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021

Reward tampering problems and solutions in reinforcement learning: a causal influence diagram perspective.

[DOI]

,

,

,

Victoria Krakovna

Synth., 2021

Shaking the foundations: delusions in sequence models for interaction and control.

[DOI]

Pedro A. Ortega

,

,

Grégoire Delétang

,

,

Jordi Grau-Moya

,

,

,

,

,

Julien Pérolat

,

,

Corentin Tallec

,

Emilio Parisotto

,

,

,

,

,

Nando de Freitas

,

CoRR, 2021

Alignment of Language Agents.

[DOI]

,

,

Laura Weidinger

,

,

Vladimir Mikulik

,

Geoffrey Irving

CoRR, 2021

PyCID: A Python Library for Causal Influence Diagrams.

[DOI]

,

,

,

Eric D. Langlois

,

Alessandro Abate

,

Michael J. Wooldridge

Proceedings of the 20th Python in Science Conference, 2021

Equilibrium Refinements for Multi-Agent Influence Diagrams: Theory and Practice.

[DOI]

,

,

,

Alessandro Abate

,

Michael J. Wooldridge

Proceedings of the AAMAS '21: 20th International Conference on Autonomous Agents and Multiagent Systems, 2021

How RL Agents Behave When Their Actions Are Modified.

[DOI]

Eric D. Langlois

,

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Agent Incentives: A Causal Perspective.

[DOI]

,

,

Eric D. Langlois

,

Pedro A. Ortega

,

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020

Avoiding Tampering Incentives in Deep RL via Decoupled Approval.

[DOI]

Jonathan Uesato

,

,

Victoria Krakovna

,

,

,

CoRR, 2020

REALab: An Embedded Perspective on Tampering.

[DOI]

,

Jonathan Uesato

,

,

,

Victoria Krakovna

,

CoRR, 2020

The Incentives that Shape Behaviour.

[DOI]

,

Eric D. Langlois

,

,

CoRR, 2020

2019

Reward Tampering Problems and Solutions in Reinforcement Learning: A Causal Influence Diagram Perspective.

[DOI]

,

CoRR, 2019

Understanding Agent Incentives using Causal Influence Diagrams. Part I: Single Action Settings.

[DOI]

,

Pedro A. Ortega

,

Elizabeth Barnes

,

CoRR, 2019

Modeling AGI Safety Frameworks with Causal Influence Diagrams.

[DOI]

,

,

Victoria Krakovna

,

Proceedings of the Workshop on Artificial Intelligence Safety 2019 co-located with the 28th International Joint Conference on Artificial Intelligence, 2019

2018

Scalable agent alignment via reward modeling: a research direction.

[DOI]

,

,

,

,

,

CoRR, 2018

AGI Safety Literature Review.

[DOI]

,

,

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

2017

AI Safety Gridworlds.

[DOI]

,

,

Victoria Krakovna

,

Pedro A. Ortega

,

,

Andrew Lefrancq

,

,

CoRR, 2017

Reinforcement Learning with a Corrupted Reward Channel.

[DOI]

,

Victoria Krakovna

,

,

,

CoRR, 2017

Count-Based Exploration in Feature Space for Reinforcement Learning.

[DOI]

,

Suraj Narayanan Sasikumar

,

,

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017

Reinforcement Learning with a Corrupted Reward Channel.

[DOI]

,

Victoria Krakovna

,

,

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017

A Game-Theoretic Analysis of the Off-Switch Game.

[DOI]

Tobias Wängberg

,

,

,

,

Proceedings of the Artificial General Intelligence - 10th International Conference, 2017

2016

Death and Suicide in Universal Artificial Intelligence.

[DOI]

,

,

Proceedings of the Artificial General Intelligence - 9th International Conference, 2016

Avoiding Wireheading with Value Reinforcement Learning.

[DOI]

,

Proceedings of the Artificial General Intelligence - 9th International Conference, 2016

Self-Modification of Policy and Utility Function in Rational Agents.

[DOI]

,

,

,

Proceedings of the Artificial General Intelligence - 9th International Conference, 2016

2015

A Topological Approach to Meta-heuristics: Analytical Results on the BFS vs. DFS Algorithm Selection Problem.

[DOI]

,

CoRR, 2015

Analytical Results on the BFS vs. DFS Algorithm Selection Problem: Part II: Graph Search.

[DOI]

,

Proceedings of the AI 2015: Advances in Artificial Intelligence, 2015

Analytical Results on the BFS vs. DFS Algorithm Selection Problem. Part I: Tree Search.

[DOI]

,

Proceedings of the AI 2015: Advances in Artificial Intelligence, 2015

Sequential Extensions of Causal and Evidential Decision Theory.

[DOI]

,

,

Proceedings of the Algorithmic Decision Theory - 4th International Conference, 2015

2014

Can we measure the difficulty of an optimization problem?

[DOI]

,

,

Proceedings of the 2014 IEEE Information Theory Workshop, 2014

Free Lunch for optimisation under the universal distribution.

[DOI]

,

,

Proceedings of the IEEE Congress on Evolutionary Computation, 2014
