Yuval Ran-Milo

According to our database, Yuval Ran-Milo authored at least 6 papers between 2024 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
A Mechanistic Account of Attention Sinks in GPT-2: One Circuit, Broader Implications for Mitigation.
CoRR, April, 2026

Attention Sinks Are Provably Necessary in Softmax Transformers: Evidence from Trigger-Conditional Tasks.
CoRR, March, 2026

Outcome-Based RL Provably Leads Transformers to Reason, but Only With the Right Data.
CoRR, January, 2026

2025
Do Neural Networks Need Gradient Descent to Generalize? A Theoretical Study.
CoRR, June, 2025

Mamba Knockout for Unraveling Factual Information Flow.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024
Provable Benefits of Complex Parameterizations for Structured State Space Models.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024