Filip Sondej

According to our database, Filip Sondej authored at least 10 papers between 2019 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
Implementing surrogate goals for safer bargaining in LLM-based agents.
CoRR, April, 2026

2025
Collapse of Irrelevant Representations (CIR) Ensures Robust and Non-Disruptive LLM Unlearning.
CoRR, September, 2025

Robust LLM Unlearning with MUDMAN: Meta-Unlearning with Disruption Masking And Normalization.
CoRR, June, 2025

Individual differences in neurophysiological correlates of post-response adaptation: A model-based approach.
NeuroImage, 2025

How Does DPO Reduce Toxicity? A Mechanistic Neuron-Level Analysis.
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

Are there different types of error monitoring? A microstates analysis of error-related brain activity across three tasks.
Proceedings of the 47th Annual Meeting of the Cognitive Science Society, 2025

Multi-Agent Security Tax: Trading Off Security and Collaboration Capabilities in Multi-Agent Systems.
Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

2024
A Machine Learning Study of Anxiety-related Symptoms and Error-related Brain Activity.
J. Cogn. Neurosci., May, 2024

Ablation is Not Enough to Emulate DPO: How Neuron Dynamics Drive Toxicity Reduction.
CoRR, 2024

2019
On the Role of Trust in Child-Robot Interaction.
Proceedings of the 28th IEEE International Conference on Robot and Human Interactive Communication, 2019

