John Schulman

According to our database, John Schulman authored at least 59 papers between 2011 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2026

Chunky Post-Training: Data Driven Failures of Generalization.

CoRR, February, 2026

2025

Detecting Adversarial Fine-tuning with Auditing Agents.

Sarah Egler

John Schulman

Nicholas Carlini

CoRR, October, 2025

Stress-Testing Model Specs Reveals Character Differences among Language Models.

CoRR, October, 2025

Reasoning Models Don't Always Say What They Think.

CoRR, May, 2025

Quantifying Elicitation of Latent Capabilities in Language Models.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

2024

Measuring short-form factuality in large language models.

CoRR, 2024

Rule Based Rewards for Language Model Safety.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Let's Verify Step by Step.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023

Scaling laws for single-agent reinforcement learning.

Jacob Hilton

Jie Tang

John Schulman

CoRR, 2023

Scaling Laws for Reward Model Overoptimization.

Leo Gao

John Schulman

Jacob Hilton

Proceedings of the International Conference on Machine Learning, 2023

2022

Efficient Training of Language Models to Fill in the Middle.

CoRR, 2022

Training language models to follow instructions with human feedback.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Batch size-invariance for policy optimization.

Jacob Hilton

Karl Cobbe

John Schulman

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

2021

WebGPT: Browser-assisted question-answering with human feedback.

CoRR, 2021

Training Verifiers to Solve Math Word Problems.

CoRR, 2021

Unsolved Problems in ML Safety.

CoRR, 2021

The MineRL 2020 Competition on Sample Efficient Reinforcement Learning using Human Priors.

William H. Guss

Mario Ynocente Castro

CoRR, 2021

Phasic Policy Gradient.

Proceedings of the 38th International Conference on Machine Learning, 2021

2020

Teacher-Student Curriculum Learning.

IEEE Trans. Neural Networks Learn. Syst., 2020

Scaling Laws for Autoregressive Generative Modeling.

CoRR, 2020

Measuring Sample Efficiency and Generalization in Reinforcement Learning Benchmarks: NeurIPS 2020 Procgen Benchmark.

Proceedings of the NeurIPS 2020 Competition and Demonstration Track, 2020

Distribution Augmentation for Generative Modeling.

Proceedings of the 37th International Conference on Machine Learning, 2020

Leveraging Procedural Generation to Benchmark Reinforcement Learning.

Proceedings of the 37th International Conference on Machine Learning, 2020

2019

Policy Gradient Search: Online Planning and Expert Iteration without Search Trees.

CoRR, 2019

Semi-Supervised Learning by Label Gradient Alignment.

Jacob Jackson

John Schulman

CoRR, 2019

Quantifying Generalization in Reinforcement Learning.

Proceedings of the 36th International Conference on Machine Learning, 2019

2018

Gotta Learn Fast: A New Benchmark for Generalization in RL.

CoRR, 2018

On First-Order Meta-Learning Algorithms.

Alex Nichol

Joshua Achiam

John Schulman

CoRR, 2018

Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations.

Proceedings of the Robotics: Science and Systems XIV, 2018

Meta Learning Shared Hierarchies.

Proceedings of the 6th International Conference on Learning Representations, 2018

Model-Based Reinforcement Learning via Meta-Policy Optimization.

Proceedings of the 2nd Annual Conference on Robot Learning, 2018

2017

Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations.

CoRR, 2017

Proximal Policy Optimization Algorithms.

CoRR, 2017

Equivalence Between Policy Gradients and Soft Q-Learning.

John Schulman

Pieter Abbeel

Xi Chen

CoRR, 2017

UCB and InfoGain Exploration via $\boldsymbol{Q}$-Ensembles.

CoRR, 2017

#Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning.

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Variational Lossy Autoencoder.

Proceedings of the 5th International Conference on Learning Representations, 2017

2016

Optimizing Expectations: From Deep Reinforcement Learning to Stochastic Computation Graphs.

John Schulman

PhD thesis, 2016

High-Dimensional Continuous Control Using Generalized Advantage Estimation.

Proceedings of the 4th International Conference on Learning Representations, 2016

Curiosity-driven Exploration in Deep Reinforcement Learning via Bayesian Neural Networks.

CoRR, 2016

RL$^2$: Fast Reinforcement Learning via Slow Reinforcement Learning.

CoRR, 2016

OpenAI Gym.

CoRR, 2016

Concrete Problems in AI Safety.

CoRR, 2016

Theano: A Python framework for fast computation of mathematical expressions.

Nicolas Boulanger-Lewandowski

Xavier Bouthillier

Alexandre de Brébisson

Samira Ebrahimi Kahou

Pierre-Antoine Manzagol

Christopher Joseph Pal

S. Ramana Subramanyam

CoRR, 2016

VIME: Variational Information Maximizing Exploration.

Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets.

Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Benchmarking Deep Reinforcement Learning for Continuous Control.

Proceedings of the 33rd International Conference on Machine Learning, 2016

2015

Gradient Estimation Using Stochastic Computation Graphs.

Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Trust Region Policy Optimization.

Proceedings of the 32nd International Conference on Machine Learning, 2015

2014

Motion planning with sequential convex optimization and convex collision checking.

Int. J. Robotics Res., 2014

Scaling up Gaussian Belief Space Planning Through Covariance-Free Trajectory Optimization and Automatic Differentiation.

Proceedings of the Algorithmic Foundations of Robotics XI, 2014

Gaussian belief space planning with discontinuities in sensing domains.

Proceedings of the 2014 IEEE International Conference on Robotics and Automation, 2014

Planning locally optimal, curvature-constrained trajectories in 3D using sequential convex optimization.

Proceedings of the 2014 IEEE International Conference on Robotics and Automation, 2014

2013

Finding Locally Optimal, Collision-Free Trajectories with Sequential Convex Optimization.

Proceedings of the Robotics: Science and Systems IX, Technische Universität Berlin, Berlin, Germany, June 24, 2013

Learning from Demonstrations Through the Use of Non-rigid Registration.

Proceedings of the Robotics Research, 2013

A case study of trajectory transfer through non-rigid registration for a simplified suturing scenario.

John Schulman

Ankush Gupta

Sibi Venkatesan

Mallory Tayson-Frederick

Pieter Abbeel

Proceedings of the 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2013

Sigma hulls for Gaussian belief space planning for imprecise articulated robots amid obstacles.

Proceedings of the 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2013

Tracking deformable objects with point clouds.

Proceedings of the 2013 IEEE International Conference on Robotics and Automation, 2013

2011

Grasping and Fixturing as Submodular Coverage Problems.

John D. Schulman

Ken Goldberg

Pieter Abbeel

Proceedings of the Robotics Research, 2011
