Carson Denison

According to our database, Carson Denison authored at least 11 papers between 2023 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2025

Reasoning Models Don't Always Say What They Think.

[...], Ansh Radhakrishnan, Jonathan Uesato, [...], Vladimir Mikulik, Samuel R. Bowman, [...]

CoRR, May, 2025

Auditing language models for hidden objectives.

CoRR, March, 2025

2024

Alignment faking in large language models.

CoRR, 2024

Sycophancy to Subterfuge: Investigating Reward-Tampering in Large Language Models.

[...], Monte MacDiarmid, [...], Nicholas Schiefer, [...], Samuel R. Bowman, [...]

CoRR, 2024

Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training.

CoRR, 2024

Many-shot Jailbreaking.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Gradient-Based Language Model Red Teaming.

Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics, 2024

2023

How to DP-fy ML: A Practical Guide to Machine Learning with Differential Privacy.

Natalia Ponomareva, Hussein Hazimeh, [...], H. Brendan McMahan, Sergei Vassilvitskii, [...], Abhradeep Guha Thakurta

J. Artif. Intell. Res., 2023

Measuring Faithfulness in Chain-of-Thought Reasoning.

CoRR, 2023

Question Decomposition Improves the Faithfulness of Model-Generated Reasoning.

CoRR, 2023

Private Ad Modeling with DP-SGD.

[...], Pasin Manurangsi, Krishna Giri Narra, [...], Avinash V. Varadarajan, [...]

Proceedings of the Workshop on Data Mining for Online Advertising (AdKDD 2023) co-located with the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2023), 2023
