Joseph Isaac Bloom

According to our database, Joseph Isaac Bloom authored at least 8 papers between 2024 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
Building Better Deception Probes Using Targeted Instruction Pairs.
CoRR, February, 2026

2025
Auditing Games for Sandbagging.
CoRR, December, 2025

ContextBench: Modifying Contexts for Targeted Latent Activation.
CoRR, June, 2025

SAEBench: A Comprehensive Benchmark for Sparse Autoencoders in Language Model Interpretability.
CoRR, March, 2025

Open Problems in Mechanistic Interpretability.
Trans. Mach. Learn. Res., 2025

SAEBench: A Comprehensive Benchmark for Sparse Autoencoders in Language Model Interpretability.
Proceedings of the Forty-second International Conference on Machine Learning, 2025

Sparse Autoencoders Do Not Find Canonical Units of Analysis.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

2024
Interpreting Attention Layer Outputs with Sparse Autoencoders.
CoRR, 2024

