János Kramár

According to our database, János Kramár authored at least 27 papers between 2010 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2026

Building Production-Ready Probes For Gemini.

CoRR, January, 2026

2024

Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2.

Tom Lieberum

Senthooran Rajamanoharan

CoRR, 2024

Jumping Ahead: Improving Reconstruction Fidelity with JumpReLU Sparse Autoencoders.

Senthooran Rajamanoharan

CoRR, 2024

Improving Dictionary Learning with Gated Sparse Autoencoders.

Senthooran Rajamanoharan

CoRR, 2024

AtP*: An efficient and scalable method for localizing LLM behaviour to components.

CoRR, 2024

Improving Sparse Decomposition of Language Model Activations with Gated Sparse Autoencoders.

Senthooran Rajamanoharan

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

On scalable oversight with weak LLMs judging strong LLMs.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

2023

Explaining grokking through circuit efficiency.

CoRR, 2023

The Hydra Effect: Emergent Self-repair in Language Model Computations.

CoRR, 2023

Does Circuit Analysis Interpretability Scale? Evidence from Multiple Choice Capabilities in Chinchilla.

CoRR, 2023

Power-seeking can be probable and predictive for trained agents.

Victoria Krakovna

János Kramár

CoRR, 2023

Tracr: Compiled Transformers as a Laboratory for Interpretability.

CoRR, 2023

Tracr: Compiled Transformers as a Laboratory for Interpretability.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

2022

Sample-based Approximation of Nash in Large Many-Player Games via Gradient Descent.

Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, 2022

2021

A Neural Network Auction For Group Decision Making Over a Continuous Space.

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

2020

Learning to Play No-Press Diplomacy with Best Response Policy Iteration.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Should I Tear down This Wall? Optimizing Social Metrics by Evaluating Novel Actions.

Proceedings of the Coordination, Organizations, Institutions, Norms, and Ethics for Governance of Multi-Agent Systems XIII, 2020

2019

OpenSpiel: A Framework for Reinforcement Learning in Games.

CoRR, 2019

Learning Reciprocity in Complex Sequential Social Dilemmas.

CoRR, 2019

Relational Forward Models for Multi-Agent Learning.

Andrea Tacchetti

H. Francis Song

Pedro A. M. Mediano

Vinícius Flores Zambaldi

Proceedings of the 7th International Conference on Learning Representations, 2019

The Imitation Game: Learned Reciprocity in Markov games.

Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, 2019

2018

Reinforcement and Imitation Learning for Diverse Visuomotor Skills.

Saran Tunyasuvunakool

Proceedings of the Robotics: Science and Systems XIV, 2018

2017

Guidelines for Artificial Intelligence Containment.

James Babcock

János Kramár

Roman V. Yampolskiy

CoRR, 2017

Zoneout: Regularizing RNNs by Randomly Preserving Hidden Activations.

Proceedings of the 5th International Conference on Learning Representations, 2017

2016

Zoneout: Regularizing RNNs by Randomly Preserving Hidden Activations.

CoRR, 2016

The AGI Containment Problem.

James Babcock

János Kramár

Roman Yampolskiy

Proceedings of the Artificial General Intelligence - 9th International Conference, 2016

2010

A Generalized-Zero-Preserving Method for Compact Encoding of Concept Lattices.

Proceedings of the ACL 2010, 2010
