Katherine Lee

Anastasios Nikolas Angelopoulos

Wei-Lin Chiang

Proceedings of the Forty-second International Conference on Machine Learning, 2025

Recite, Reconstruct, Recollect: Memorization in LMs as a Multifaceted Phenomenon.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Scalable Extraction of Training Data from Aligned, Production Language Models.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Privacy Ripple Effects from Adding or Removing Personal Information in Language Model Training.

Jaydeep Borkar

Matthew Jagielski

Niloofar Mireshghallah

David A. Smith

Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024

Machine Unlearning Doesn't Do What You Think: Lessons for Generative AI Policy, Research, and Practice.

CoRR, 2024

LMD3: Language Model Data Density Dependence.

CoRR, 2024

An Abundance of Katherines: The Game Theory of Baby Naming.

CoRR, 2024

A Pretrainer's Guide to Training Data: Measuring the Effects of Data Age, Domain Coverage, Quality, & Toxicity.

Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

Stealing part of a production language model.

Krishnamurthy Dj Dvijotham

Daniel Paleka

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Talkin' 'Bout AI Generation: Copyright and the Generative-AI Supply Chain (The Short Version).

Thanumalayan Sankaranarayana Pillai

A. Feder Cooper

James Grimmelmann

Proceedings of the Symposium on Computer Science and Law, 2024

Arbitrariness and Social Prediction: The Confounding Role of Variance in Fair Classification.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

PaLM: Scaling Language Modeling with Pathways.

Vinodkumar Prabhakaran

Kathy Meier-Hellstern

J. Mach. Learn. Res., 2023

Scalable Extraction of Training Data from (Production) Language Models.

Eric Wallace

CoRR, 2023

Report of the 1st Workshop on Generative AI and Law.

CoRR, 2023

Talkin' 'Bout AI Generation: Copyright and the Generative-AI Supply Chain.

A. Feder Cooper

James Grimmelmann

CoRR, 2023

MADLAD-400: A Multilingual And Document-Level Large Audited Dataset.

CoRR, 2023

Are aligned neural networks adversarially aligned?

CoRR, 2023

Students Parrot Their Teachers: Membership Inference on Model Distillation.

Matthew Jagielski

CoRR, 2023

Counterfactual Memorization in Neural Language Models.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Students Parrot Their Teachers: Membership Inference on Model Distillation.

Matthew Jagielski

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Preventing Generation of Verbatim Memorization in Language Models Gives a False Sense of Privacy.

Proceedings of the 16th International Natural Language Generation Conference, 2023

Reverse-Engineering Decoding Strategies Given Blackbox Access to a Language Generation System.

Proceedings of the 16th International Natural Language Generation Conference, 2023

Measuring Forgetting of Memorized Training Examples.

Abhradeep Guha Thakurta

Nicolas Papernot

Chiyuan Zhang

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Quantifying Memorization Across Neural Language Models.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

2022

Preventing Verbatim Memorization in Language Models Gives a False Sense of Privacy.

CoRR, 2022

What Does it Mean for a Language Model to Preserve Privacy?

Hannah Brown

Fatemehsadat Mireshghallah

Reza Shokri

Proceedings of the FAccT '22: 2022 ACM Conference on Fairness, Accountability, and Transparency, Seoul, Republic of Korea, June 21, 2022

Deduplicating Training Data Makes Language Models Better.

Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

2021

Extracting Training Data from Large Language Models.

Proceedings of the 30th USENIX Security Symposium, 2021

Predictive Modeling of Healthcare Utilization Metrics Identifies Adult Patients at High Risk for Suicide Attempt in the Primary Care Setting.

Colin G. Walsh

Proceedings of the AMIA 2021, American Medical Informatics Association Annual Symposium, San Diego, CA, USA, October 30, 2021, 2021

2020

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer.

J. Mach. Learn. Res., 2020

WT5?! Training Text-to-Text Models to Explain their Predictions.

CoRR, 2020

2015

A computer-aided diagnosis system to identify regions of pathologic change in temporal subtraction images of the chest.

Charles Ho