Felix Hofstätter

According to our database, Felix Hofstätter authored at least 5 papers between 2024 and 2025.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of five.

Bibliography

2025
Probing and Steering Evaluation Awareness of Language Models.
CoRR, July, 2025

The Elicitation Game: Evaluating Capability Elicitation Techniques.
CoRR, February, 2025

AI Sandbagging: Language Models can Strategically Underperform on Evaluations.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

2024
Noise Injection Reveals Hidden Capabilities of Sandbagging Language Models.
CoRR, 2024

AI Sandbagging: Language Models can Strategically Underperform on Evaluations.
CoRR, 2024

