Johannes Heidecke

According to our database, Johannes Heidecke authored at least 15 papers between 2022 and 2025.

Bibliography

2025
gpt-oss-120b & gpt-oss-20b Model Card.
CoRR, August, 2025

AI-based Clinical Decision Support for Primary Care: A Real-World Study.
CoRR, July, 2025

The Singapore Consensus on Global AI Safety Research Priorities.
CoRR, June, 2025

Persona Features Control Emergent Misalignment.
CoRR, June, 2025

HealthBench: Evaluating Large Language Models Towards Improved Human Health.
CoRR, May, 2025

PaperBench: Evaluating AI's Ability to Replicate AI Research.
CoRR, April, 2025

SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering?
CoRR, February, 2025

Trading Inference-Time Compute for Adversarial Robustness.
CoRR, January, 2025

First-Person Fairness in Chatbots.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

2024
Diverse and Effective Red Teaming with Auto-generated Rewards and Multi-step Reinforcement Learning.
CoRR, 2024

Deliberative Alignment: Reasoning Enables Safer Language Models.
CoRR, 2024

First-Person Fairness in Chatbots.
CoRR, 2024

The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions.
CoRR, 2024

Rule Based Rewards for Language Model Safety.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

2022
Text and Code Embeddings by Contrastive Pre-Training.
CoRR, 2022

