Mary Phuong

According to our database, Mary Phuong authored at least 14 papers between 2019 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2025

Spilling the Beans: Teaching LLMs to Self-Report Their Hidden Objectives.

Chloe Li

Mary Phuong

Daniel Tan

CoRR, November, 2025

LLMs Can Covertly Sandbag on Capability Evaluations Against Chain-of-Thought Monitoring.

Chloe Li

Mary Phuong

Noah Y. Siegel

CoRR, August, 2025

Chain of Thought Monitorability: A New and Fragile Opportunity for AI Safety.

CoRR, July, 2025

CoT Red-Handed: Stress Testing Chain-of-Thought Monitoring.

CoRR, May, 2025

Evaluating Frontier Models for Stealth and Situational Awareness.

CoRR, May, 2025

From Stability to Inconsistency: A Study of Moral Preferences in LLMs.

Monika Jotautaite

Mary Phuong

Chatrik Singh Mangat

Maria Angelica Martinez

CoRR, April, 2025

2024

Evaluating Frontier Models for Dangerous Capabilities.

CoRR, 2024

2023

Model evaluation for extreme risks.

CoRR, 2023

2022

Goal Misgeneralization: Why Correct Specifications Aren't Enough For Correct Goals.

CoRR, 2022

Formal Algorithms for Transformers.

Mary Phuong

Marcus Hutter

CoRR, 2022

2021

The inductive bias of ReLU networks on orthogonally separable data.

Mary Phuong

Christoph H. Lampert

Proceedings of the 9th International Conference on Learning Representations, 2021

2020

Functional vs. parametric equivalence of ReLU networks.

Mary Phuong

Christoph H. Lampert

Proceedings of the 8th International Conference on Learning Representations, 2020

2019

Towards Understanding Knowledge Distillation.

Mary Phuong

Christoph Lampert

Proceedings of the 36th International Conference on Machine Learning, 2019

Distillation-Based Training for Multi-Exit Architectures.

Mary Phuong

Christoph Lampert

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019
