Victoria Krakovna

According to our database, Victoria Krakovna authored at least 23 papers between 2010 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2026

Gram: Assessing sabotage propensities via automated alignment auditing.

[...], Victoria Krakovna, Sebastian Farquhar

CoRR, May, 2026

Realistic honeypot evaluations for scheming propensity.

Victoria Krakovna, [...], Sebastian Farquhar, [...]

CoRR, May, 2026

2025

Chain of Thought Monitorability: A New and Fragile Opportunity for AI Safety.

CoRR, July, 2025

Evaluating Frontier Models for Stealth and Situational Awareness.

[...], Roland S. Zimmermann, [...], Victoria Krakovna, [...]

CoRR, May, 2025

An Approach to Technical AGI Safety and Security.

CoRR, April, 2025

2024

The Ethics of Advanced AI Assistants.

[...], Arianna Manzini, [...], Lisa Anne Hendricks, [...], Mikel Rodriguez, Seliem El-Sayed, [...], A. Stevie Bergman, [...], Juan Mateos-Garcia, Laura Weidinger, [...], Victoria Krakovna, John Oliver Siy, Zeb Kurth-Nelson, Amanda McCroskery, [...], Murray Shanahan, [...], Yetunde Ibitoye, [...], Sébastien Krier, Alexander Reese, Sims Witherspoon, [...], Matija Franklin, Josh A. Goldstein, [...], Meredith Ringel Morris, [...], Blaise Agüera y Arcas, [...]

CoRR, 2024

Evaluating Frontier Models for Dangerous Capabilities.

CoRR, 2024

Limitations of Agents Simulated by Predictive Models.

Raymond Douglas, Jacek Karwowski, [...], Victoria Krakovna

CoRR, 2024

Quantifying stability of non-power-seeking in artificial agents.

Evan Ryan Gunter, Yevgeny Liokumovich, Victoria Krakovna

CoRR, 2024

2023

Power-seeking can be probable and predictive for trained agents.

Victoria Krakovna, [...]

CoRR, 2023

2022

Goal Misgeneralization: Why Correct Specifications Aren't Enough For Correct Goals.

[...], Victoria Krakovna, Jonathan Uesato, [...]

CoRR, 2022

2021

Reward tampering problems and solutions in reinforcement learning: a causal influence diagram perspective.

[...], Victoria Krakovna

Synthese, 2021

2020

Avoiding Tampering Incentives in Deep RL via Decoupled Approval.

Jonathan Uesato, [...], Victoria Krakovna, [...]

CoRR, 2020

REALab: An Embedded Perspective on Tampering.

[...], Jonathan Uesato, [...], Victoria Krakovna, [...]

CoRR, 2020

Avoiding Side Effects By Considering Future Tasks.

Victoria Krakovna, [...]

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

2019

Penalizing Side Effects using Stepwise Relative Reachability.

Victoria Krakovna, [...]

Proceedings of the Workshop on Artificial Intelligence Safety 2019 co-located with the 28th International Joint Conference on Artificial Intelligence, 2019

Modeling AGI Safety Frameworks with Causal Influence Diagrams.

[...], Victoria Krakovna, [...]

Proceedings of the Workshop on Artificial Intelligence Safety 2019 co-located with the 28th International Joint Conference on Artificial Intelligence, 2019

2018

Measuring and avoiding side effects using relative reachability.

Victoria Krakovna, [...]

CoRR, 2018

2017

AI Safety Gridworlds.

[...], Victoria Krakovna, Pedro A. Ortega, [...], Andrew Lefrancq, [...]

CoRR, 2017

Reinforcement Learning with a Corrupted Reward Channel.

[...], Victoria Krakovna, [...]

CoRR, 2017

Reinforcement Learning with a Corrupted Reward Channel.

[...], Victoria Krakovna, [...]

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017

2016

Memory-Bounded Left-Corner Unsupervised Grammar Induction on Child-Directed Input.

[...], Victoria Krakovna, Finale Doshi-Velez, [...], William Schuler, [...]

Proceedings of the COLING 2016, 2016

2010

A Generalized-Zero-Preserving Method for Compact Encoding of Concept Lattices.

[...], Victoria Krakovna, [...]

Proceedings of the ACL 2010, 2010
