Gonçalo Paulo

According to our database¹, Gonçalo Paulo authored at least 8 papers between 2024 and 2025.

Collaborative distances:

Dijkstra number² of five.
Erdős number³ of four.

Bibliography

2025

Evaluating SAE interpretability without explanations.

Gonçalo Paulo

Nora Belrose

CoRR, July, 2025

When AI Co-Scientists Fail: SPOT-a Benchmark for Automated Verification of Scientific Research.

CoRR, May, 2025

Partially Rewriting a Transformer in Natural Language.

Gonçalo Paulo

Nora Belrose

CoRR, January, 2025

Transcoders Beat Sparse Autoencoders for Interpretability.

Gonçalo Paulo

Stepan Shabalin

Nora Belrose

CoRR, January, 2025

Sparse Autoencoders Trained on the Same Data Learn Different Features.

Gonçalo Paulo

Nora Belrose

CoRR, January, 2025

Automatically Interpreting Millions of Features in Large Language Models.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

Do Transformer Interpretability Methods Transfer to RNNs?

Gonçalo Paulo

Thomas Marshall

Nora Belrose

Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024

Does Transformer Interpretability Transfer to RNNs?

Gonçalo Paulo

Thomas Marshall

Nora Belrose

CoRR, 2024
