Nathan Helm-Burger

According to our database, Nathan Helm-Burger authored at least 5 papers between 2023 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
CoT Red-Handed: Stress Testing Chain-of-Thought Monitoring.
CoRR, May, 2025

2024
Noise Injection Reveals Hidden Capabilities of Sandbagging Language Models.
CoRR, 2024

The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning.
CoRR, 2024


2023
Will releasing the weights of future large language models grant widespread access to pandemic agents?
CoRR, 2023

