Elizabeth Barnes

According to our database, Elizabeth Barnes authored at least 13 papers between 2009 and 2025.

Bibliography

2025
Chain of Thought Monitorability: A New and Fragile Opportunity for AI Safety.
CoRR, July, 2025

Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity.
CoRR, July, 2025

HCAST: Human-Calibrated Autonomy Software Tasks.
CoRR, March, 2025

Measuring AI Ability to Complete Long Tasks.
CoRR, March, 2025

Measuring AI Ability to Complete Long Software Tasks.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

RE-Bench: Evaluating Frontier AI R&D Capabilities of Language Model Agents against Human Experts.
Proceedings of the Forty-second International Conference on Machine Learning, 2025

2024
RE-Bench: Evaluating frontier AI R&D capabilities of language model agents against human experts.
CoRR, 2024

2023
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models.
Trans. Mach. Learn. Res., 2023

Evaluating Language-Model Agents on Realistic Autonomous Tasks.
CoRR, 2023

2021
Evaluating Large Language Models Trained on Code.
CoRR, 2021

2020
Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims.
CoRR, 2020

2019
Understanding Agent Incentives using Causal Influence Diagrams. Part I: Single Action Settings.
CoRR, 2019

2009
Indeterminacy, identity and counterparts: Evans reconsidered.
Synth., 2009

