Joar Skalse

According to our database, Joar Skalse authored at least 16 papers between 2019 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Quantifying the Sensitivity of Inverse Reinforcement Learning to Misspecification.
CoRR, 2024

2023
On The Expressivity of Objective-Specification Formalisms in Reinforcement Learning.
CoRR, 2023

Goodhart's Law in Reinforcement Learning.
CoRR, 2023

STARC: A General Framework For Quantifying Differences Between Reward Functions.
CoRR, 2023

On the limitations of Markovian rewards to express multi-objective, risk-sensitive, and modal tasks.
Proceedings of the Uncertainty in Artificial Intelligence, 2023

Invariance in Policy Optimisation and Partial Identifiability in Reward Learning.
Proceedings of the International Conference on Machine Learning, 2023

Misspecification in Inverse Reinforcement Learning.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Defining and Characterizing Reward Hacking.
CoRR, 2022

Defining and Characterizing Reward Gaming.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Lexicographic Multi-Objective Reinforcement Learning.
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

2021
Is SGD a Bayesian sampler? Well, almost.
J. Mach. Learn. Res., 2021

A General Counterexample to Any Decision Theory and Some Responses.
CoRR, 2021

Reinforcement Learning in Newcomblike Environments.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Safety Properties of Inductive Logic Programming.
Proceedings of the Workshop on Artificial Intelligence Safety 2021 (SafeAI 2021) co-located with the Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI 2021), 2021

2019
Neural networks are a priori biased towards Boolean functions with low entropy.
CoRR, 2019

Risks from Learned Optimization in Advanced Machine Learning Systems.
CoRR, 2019

