Jan Leike

According to our database, Jan Leike authored at least 47 papers between 2013 and 2023.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2023
Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision.
CoRR, 2023

Let's Verify Step by Step.
CoRR, 2023

2022
Self-critiquing models for assisting human evaluators.
CoRR, 2022

Safe Deep RL in 3D Environments using Human Feedback.
CoRR, 2022

Training language models to follow instructions with human feedback.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

2021
Institutionalizing ethics in AI through broader impact requirements.
Nat. Mach. Intell., 2021

Recursively Summarizing Books with Human Feedback.
CoRR, 2021

Evaluating Large Language Models Trained on Code.
CoRR, 2021

Institutionalising Ethics in AI through Broader Impact Requirements.
CoRR, 2021

Quantifying Differences in Reward Functions.
Proceedings of the 9th International Conference on Learning Representations, 2021

2020
Active Reinforcement Learning: Observing Rewards at a Cost.
CoRR, 2020

Hidden Incentives for Auto-Induced Distributional Shift.
CoRR, 2020

Pitfalls of Learning a Reward Function Online.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Learning Human Objectives by Evaluating Hypothetical Behavior.
Proceedings of the 37th International Conference on Machine Learning, 2020

2019
Learning to Understand Goal Specifications by Modelling Reward.
Proceedings of the 7th International Conference on Learning Representations, 2019

2018
On the computability of Solomonoff induction and AIXI.
Theor. Comput. Sci., 2018

Scaling shared model governance via model splitting.
CoRR, 2018

Scalable agent alignment via reward modeling: a research direction.
CoRR, 2018

Learning to Follow Language Instructions with Adversarial Reward Induction.
CoRR, 2018

Geometric Nontermination Arguments.
Proceedings of the Tools and Algorithms for the Construction and Analysis of Systems, 2018

Reward learning from human preferences and demonstrations in Atari.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Jointly Learning "What" and "How" from Instructions and Goal-States.
Proceedings of the 6th International Conference on Learning Representations, 2018

2017
AI Safety Gridworlds.
CoRR, 2017

Generalised Discount Functions applied to a Monte-Carlo AIμ Implementation.
CoRR, 2017

Deep Reinforcement Learning from Human Preferences.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

On Thompson Sampling and Asymptotic Optimality.
Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017

Universal Reinforcement Learning Algorithms: Survey and Experiments.
Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017

Generalised Discount Functions applied to a Monte-Carlo AIμ Implementation.
Proceedings of the 16th Conference on Autonomous Agents and MultiAgent Systems, 2017

2016
Nonparametric General Reinforcement Learning.
CoRR, 2016

Exploration Potential.
CoRR, 2016

A Formal Solution to the Grain of Truth Problem.
Proceedings of the Thirty-Second Conference on Uncertainty in Artificial Intelligence, 2016

Thompson Sampling is Asymptotically Optimal in General Environments.
Proceedings of the Thirty-Second Conference on Uncertainty in Artificial Intelligence, 2016

Ultimate Automizer with Two-track Proofs - (Competition Contribution).
Proceedings of the Tools and Algorithms for the Construction and Analysis of Systems, 2016

Loss Bounds and Time Complexity for Speed Priors.
Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, 2016

2015
Ranking Templates for Linear Loops.
Log. Methods Comput. Sci., 2015

On the Computability of AIXI.
Proceedings of the Thirty-First Conference on Uncertainty in Artificial Intelligence, 2015

Ultimate Automizer with Array Interpolation - (Competition Contribution).
Proceedings of the Tools and Algorithms for the Construction and Analysis of Systems, 2015

Bad Universal Priors and Notions of Optimality.
Proceedings of The 28th Conference on Learning Theory, 2015

On the Computability of Solomonoff Induction and Knowledge-Seeking.
Proceedings of the Algorithmic Learning Theory - 26th International Conference, 2015

Solomonoff Induction Violates Nicod's Criterion.
Proceedings of the Algorithmic Learning Theory - 26th International Conference, 2015

Sequential Extensions of Causal and Evidential Decision Theory.
Proceedings of the Algorithmic Decision Theory - 4th International Conference, 2015

A Definition of Happiness for Reinforcement Learning Agents.
Proceedings of the Artificial General Intelligence, 2015

2014
Geometric Series as Nontermination Arguments for Linear Lasso Programs.
CoRR, 2014

Ranking Function Synthesis for Linear Lasso Programs.
CoRR, 2014

Synthesis for Polynomial Lasso Programs.
Proceedings of the Verification, Model Checking, and Abstract Interpretation, 2014

Indefinitely Oscillating Martingales.
Proceedings of the Algorithmic Learning Theory - 25th International Conference, 2014

2013
Linear Ranking for Linear Lasso Programs.
Proceedings of the Automated Technology for Verification and Analysis, 2013

