Harry Mayne

According to our database, Harry Mayne authored at least 10 papers between 2024 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
A Positive Case for Faithfulness: LLM Self-Explanations Help Predict Model Behavior.
CoRR, February, 2026

2025
Measuring what Matters: Construct Validity in Large Language Model Benchmarks.
CoRR, November, 2025

LINGOLY-TOO: Disentangling Memorisation from Reasoning with Linguistic Templatisation and Orthographic Obfuscation.
CoRR, March, 2025

How Does DPO Reduce Toxicity? A Mechanistic Neuron-Level Analysis.
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

LLMs Don't Know Their Own Decision Boundaries: The Unreliability of Self-Generated Counterfactual Explanations.
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

2024
Large language models can help boost food production, but be mindful of their risks.
Frontiers Artif. Intell., 2024

Can sparse autoencoders be used to decompose and interpret steering vectors?
CoRR, 2024

Ablation is Not Enough to Emulate DPO: How Neuron Dynamics Drive Toxicity Reduction.
CoRR, 2024

Unsupervised Learning Approaches for Identifying ICU Patient Subgroups: Do Results Generalise?
CoRR, 2024

LINGOLY: A Benchmark of Olympiad-Level Linguistic Reasoning Puzzles in Low Resource and Extinct Languages.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
