Sophie Xhonneux

According to our database, Sophie Xhonneux authored at least 8 papers between 2024 and 2025.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2025
Model Collapse Is Not a Bug but a Feature in Machine Unlearning for LLMs.
CoRR, July, 2025

LLM-Safety Evaluations Lack Robustness.
CoRR, March, 2025

A generative approach to LLM harmfulness detection with special red flag tokens.
CoRR, February, 2025

Adversarial Alignment for LLMs Requires Simpler, Reproducible, and More Measurable Objectives.
CoRR, February, 2025

Asynchronous RLHF: Faster and More Efficient Off-Policy RL for Language Models.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

2024
In-Context Learning Can Re-learn Forbidden Tasks.
CoRR, 2024

Efficient Adversarial Training in LLMs with Continuous Attacks.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Soft Prompt Threats: Attacking Safety Alignment and Unlearning in Open-Source LLMs through the Embedding Space.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

