Mikita Balesni

According to our database, Mikita Balesni authored at least 3 papers in 2023.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of five.

Bibliography

2023
Technical Report: Large Language Models can Strategically Deceive their Users when Put Under Pressure.
CoRR, 2023

The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A".
CoRR, 2023

Taken out of context: On measuring situational awareness in LLMs.
CoRR, 2023

