Victoria Krakovna

According to our database, Victoria Krakovna authored at least 17 papers between 2010 and 2024.

Bibliography

2024
Evaluating Frontier Models for Dangerous Capabilities.
CoRR, 2024

Limitations of Agents Simulated by Predictive Models.
CoRR, 2024

Quantifying stability of non-power-seeking in artificial agents.
CoRR, 2024

2023
Power-seeking can be probable and predictive for trained agents.
CoRR, 2023

2022
Goal Misgeneralization: Why Correct Specifications Aren't Enough For Correct Goals.
CoRR, 2022

2021
Reward tampering problems and solutions in reinforcement learning: a causal influence diagram perspective.
Synthese, 2021

2020
Avoiding Tampering Incentives in Deep RL via Decoupled Approval.
CoRR, 2020

REALab: An Embedded Perspective on Tampering.
CoRR, 2020

Avoiding Side Effects By Considering Future Tasks.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

2019
Penalizing Side Effects using Stepwise Relative Reachability.
Proceedings of the Workshop on Artificial Intelligence Safety 2019 co-located with the 28th International Joint Conference on Artificial Intelligence, 2019

Modeling AGI Safety Frameworks with Causal Influence Diagrams.
Proceedings of the Workshop on Artificial Intelligence Safety 2019 co-located with the 28th International Joint Conference on Artificial Intelligence, 2019

2018
Measuring and avoiding side effects using relative reachability.
CoRR, 2018

2017
AI Safety Gridworlds.
CoRR, 2017

Reinforcement Learning with a Corrupted Reward Channel.
CoRR, 2017

Reinforcement Learning with a Corrupted Reward Channel.
Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017

2016
Memory-Bounded Left-Corner Unsupervised Grammar Induction on Child-Directed Input.
Proceedings of the COLING 2016, 2016

2010
A Generalized-Zero-Preserving Method for Compact Encoding of Concept Lattices.
Proceedings of the ACL 2010, 2010
