Phil Blandfort

According to our database, Phil Blandfort authored at least 3 papers between 2025 and 2026.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2026
Moral Preferences of LLMs Under Directed Contextual Influence.
CoRR, February, 2026

2025
Red-teaming Activation Probes using Prompted LLMs.
CoRR, November, 2025

Detecting High-Stakes Interactions with Activation Probes.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

