Aidan Ewart
According to our database,
Aidan Ewart authored at least 7 papers
between 2024 and 2026.
Collaborative distances:
Bibliography
2026
AuditBench: Evaluating Alignment Auditing Techniques on Models with Hidden Behaviors.
CoRR, February, 2026
2025
Latent Adversarial Training Improves Robustness to Persistent Harmful Behaviors in LLMs.
Trans. Mach. Learn. Res., 2025
Trans. Mach. Learn. Res., 2025
Mechanistic Unlearning: Robust Knowledge Unlearning and Editing via Mechanistic Localization.
Proceedings of the Forty-second International Conference on Machine Learning, 2025
2024
Targeted Latent Adversarial Training Improves Robustness to Persistent Harmful Behaviors in LLMs.
CoRR, 2024
Proceedings of the Twelfth International Conference on Learning Representations, 2024