Thomas McGrath

This is a disambiguation page: it contains papers by several people with the same or a similar name.

Bibliography

2025
Understanding sparse autoencoder scaling in the presence of feature manifolds.
CoRR, September, 2025

Competitive secretary problem.
Int. J. Game Theory, June, 2025

Open Problems in Mechanistic Interpretability.
Trans. Mach. Learn. Res., 2025

2023
Bridging the Human-AI Knowledge Gap: Concept Discovery and Transfer in AlphaZero.
CoRR, 2023

Copy Suppression: Comprehensively Understanding an Attention Head.
CoRR, 2023

The Hydra Effect: Emergent Self-repair in Language Model Computations.
CoRR, 2023

Tracr: Compiled Transformers as a Laboratory for Interpretability.
CoRR, 2023

Tracr: Compiled Transformers as a Laboratory for Interpretability.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

2021
Resonant-Tunnelling Diodes as PUF Building Blocks.
IEEE Trans. Emerg. Top. Comput., 2021

Acquisition of Chess Knowledge in AlphaZero.
CoRR, 2021

Causal Analysis of Agent Behavior for AI Safety.
CoRR, 2021

2020
Algorithms for Causal Reasoning in Probability Trees.
CoRR, 2020

Meta-trained agents implement Bayes-optimal agents.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

2019
Meta-learning of Sequential Strategies.
CoRR, 2019

