Buck Shlegeris
According to our database, Buck Shlegeris authored at least 22 papers between 2018 and 2025.
Bibliography
2025
CoRR, July, 2025
How to evaluate control measures for LLM agents? A trajectory from today to superintelligence.
CoRR, April, 2025
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
2024
Trans. Mach. Learn. Res., 2024
Subversion Strategy Eval: Evaluating AI's stateless strategic capabilities against control protocols.
CoRR, 2024
CoRR, 2024
CoRR, 2024
CoRR, 2024
Proceedings of the Forty-first International Conference on Machine Learning, 2024
2023
Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 Small.
Proceedings of the Eleventh International Conference on Learning Representations, 2023
2022
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
2018