Alex McKenzie

According to our database, Alex McKenzie authored at least 4 papers between 2025 and 2026.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2026
Moral Preferences of LLMs Under Directed Contextual Influence.
CoRR, February, 2026

Learning Self-Interpretation from Interpretability Artifacts: Training Lightweight Adapters on Vector-Label Pairs.
CoRR, February, 2026

Endogenous Resistance to Activation Steering in Language Models.
CoRR, February, 2026

2025
Detecting High-Stakes Interactions with Activation Probes.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

