Alexey Dontsov

According to our database, Alexey Dontsov authored at least 9 papers between 2024 and 2026.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of five.

Bibliography

2026
Sanity Checks for Sparse Autoencoders: Do SAEs Beat Random Baselines?
CoRR, February, 2026

Feature Drift: How Fine-Tuning Repurposes Representations in LLMs.
Proceedings of the Findings of the Association for Computational Linguistics: EACL 2026, 2026

Out of Distribution, Out of Luck: Process Rewards Misguide Reasoning Models.
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics, 2026

I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025
Lightweight error mitigation strategies for post-training N:M activation sparsity in LLMs.
CoRR, September, 2025

The Rogue Scalpel: Activation Steering Compromises LLM Safety.
CoRR, September, 2025

OrtSAE: Orthogonal Sparse Autoencoders Uncover Atomic Features.
CoRR, September, 2025

CLEAR: Character Unlearning in Textual and Visual Modalities.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024
AIRI NLP Team at EHRSQL 2024 Shared Task: T5 and Logistic Regression to the Rescue.
Proceedings of the 6th Clinical Natural Language Processing Workshop, 2024

