Jan Wehner

ORCID: 0009-0008-8581-819X

According to our database, Jan Wehner authored at least 10 papers between 2012 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
Probe-based Fine-tuning for Reducing Toxicity.
CoRR, October, 2025

Safety is Essential for Responsible Open-Ended Systems.
CoRR, February, 2025

Taxonomy, Opportunities, and Challenges of Representation Engineering for Large Language Models.
Trans. Mach. Learn. Res., 2025

2024
Representation noising effectively prevents harmful fine-tuning on LLMs.
CoRR, 2024

Immunization against harmful fine-tuning attacks.
CoRR, 2024

Representation Noising: A Defence Mechanism Against Harmful Finetuning.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Immunization against harmful fine-tuning attacks.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Explaining Learned Reward Functions with Counterfactual Trajectories.
Proceedings of the First Workshop on Implementing AI Ethics through a Behavioural Lens (AIEB 2024) co-located with 26th European Conference on Artificial Intelligence (ECAI 2024), 2024

2021
On Robust Vs Fast Solving of Qualitative Constraints.
Proceedings of the 33rd IEEE International Conference on Tools with Artificial Intelligence, 2021

2012
T.F.O.: tangible flying objects.
Proceedings of the 6th International Conference on Tangible and Embedded Interaction 2012, 2012

