Florian Tramèr

CoRR, 2024

Blind Baselines Beat Membership Inference Attacks for Foundation Models.

[BibT_eX]

[DOI]

Debeshee Das

Jie Zhang

CoRR, 2024

AgentDojo: A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents.

[BibT_eX]

[DOI]

CoRR, 2024

Adversarial Perturbations Cannot Reliably Protect Artists From Generative AI.

[BibT_eX]

[DOI]

CoRR, 2024

Dataset and Lessons Learned from the 2024 SaTML LLM Capture-the-Flag Competition.

[BibT_eX]

[DOI]

CoRR, 2024

Evaluations of Machine Learning Privacy Defenses are Misleading.

[BibT_eX]

[DOI]

Michael Aerni

Jie Zhang

CoRR, 2024

Competition Report: Finding Universal Jailbreak Backdoors in Aligned LLMs.

[BibT_eX]

[DOI]

Maksym Andriushchenko

Nicolas Flammarion

CoRR, 2024

Foundational Challenges in Assuring Alignment and Safety of Large Language Models.

[BibT_eX]

[DOI]

CoRR, 2024

JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models.

[BibT_eX]

[DOI]

Patrick Chao

Alexander Robey

Maksym Andriushchenko

CoRR, 2024

Query-Based Adversarial Prompt Generation.

[BibT_eX]

[DOI]

CoRR, 2024

Privacy Side Channels in Machine Learning Systems.

[BibT_eX]

[DOI]

Giorgio Severi

Proceedings of the 33rd USENIX Security Symposium, 2024

Evaluating Superhuman Models with Consistency Checks.

[BibT_eX]

[DOI]

Lukas Fluri

Daniel Paleka

Proceedings of the IEEE Conference on Secure and Trustworthy Machine Learning, 2024

Evading Black-box Classifiers Without Breaking Eggs.

[BibT_eX]

[DOI]

Proceedings of the IEEE Conference on Secure and Trustworthy Machine Learning, 2024

Position: Considerations for Differentially Private Learning with Large-Scale Public Pretraining.

[BibT_eX]

[DOI]

Gautam Kamath

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Extracting Training Data From Document-Based VQA Models.

[BibT_eX]

[DOI]

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Privacy Backdoors: Stealing Data with Corrupted Pretrained Models.

[BibT_eX]

[DOI]

Shanglun Feng

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Stealing part of a production language model.

[BibT_eX]

[DOI]

Krishnamurthy Dj Dvijotham

Daniel Paleka

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Universal Jailbreak Backdoors from Poisoned Human Feedback.

[BibT_eX]

[DOI]

Javier Rando

Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023

Scalable Extraction of Training Data from (Production) Language Models.

[BibT_eX]

[DOI]

Eric Wallace

Katherine Lee

CoRR, 2023

Backdoor Attacks for In-Context Learning with Language Models.

[BibT_eX]

[DOI]

CoRR, 2023

Are aligned neural networks adversarially aligned?

[BibT_eX]

[DOI]

CoRR, 2023

Randomness in ML Defenses Helps Persistent Attackers and Hinders Evaluators.

[BibT_eX]

[DOI]

CoRR, 2023

Poisoning Web-Scale Training Datasets is Practical.

[BibT_eX]

[DOI]

Matthew Jagielski

CoRR, 2023

Tight Auditing of Differentially Private Machine Learning.

[BibT_eX]

[DOI]

Proceedings of the 32nd USENIX Security Symposium, 2023

Extracting Training Data from Diffusion Models.

[BibT_eX]

[DOI]

Proceedings of the 32nd USENIX Security Symposium, 2023

SNAP: Efficient Extraction of Private Properties with Poisoning.

[BibT_eX]

[DOI]

Proceedings of the 44th IEEE Symposium on Security and Privacy, 2023

Counterfactual Memorization in Neural Language Models.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Students Parrot Their Teachers: Membership Inference on Model Distillation.

[BibT_eX]

[DOI]

Matthew Jagielski

Katherine Lee

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Are aligned neural networks adversarially aligned?

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Preventing Generation of Verbatim Memorization in Language Models Gives a False Sense of Privacy.

[BibT_eX]

[DOI]

Proceedings of the 16th International Natural Language Generation Conference, 2023

Preprocessors Matter! Realistic Decision-Based Attacks on Machine Learning Systems.

[BibT_eX]

[DOI]

Chawin Sitawarin

Proceedings of the International Conference on Machine Learning, 2023

Measuring Forgetting of Memorized Training Examples.

[BibT_eX]

[DOI]

Abhradeep Guha Thakurta

Nicolas Papernot

Chiyuan Zhang

Proceedings of the Eleventh International Conference on Learning Representations, 2023

(Certified!!) Adversarial Robustness for Free!

[BibT_eX]

[DOI]

Krishnamurthy (Dj) Dvijotham

Leslie Rice

Mingjie Sun

J. Zico Kolter

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Quantifying Memorization Across Neural Language Models.

[BibT_eX]

[DOI]

Proceedings of the Eleventh International Conference on Learning Representations, 2023

AISec '23: 16th ACM Workshop on Artificial Intelligence and Security.

[BibT_eX]

[DOI]

Maura Pintor

Florian Simon Tramèr

Xinyun Chen

Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security, 2023

2022

Considerations for Differentially Private Learning with Large-Scale Public Pretraining.

[BibT_eX]

[DOI]

Gautam Kamath

CoRR, 2022

Preventing Verbatim Memorization in Language Models Gives a False Sense of Privacy.

[BibT_eX]

[DOI]

CoRR, 2022

Red-Teaming the Stable Diffusion Safety Filter.

[BibT_eX]

[DOI]

CoRR, 2022

(Certified!!) Adversarial Robustness for Free!

[BibT_eX]

[DOI]

Krishnamurthy Dvijotham

J. Zico Kolter

CoRR, 2022

Debugging Differential Privacy: A Case Study for Privacy Auditing.

[BibT_eX]

[DOI]

CoRR, 2022

Membership Inference Attacks From First Principles.

[BibT_eX]

[DOI]

Proceedings of the 43rd IEEE Symposium on Security and Privacy, 2022

Increasing Confidence in Adversarial Robustness Evaluations.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

The Privacy Onion Effect: Memorization is Relative.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Detecting Adversarial Examples Is (Nearly) As Hard As Classifying Them.

[BibT_eX]

[DOI]

Fatemehsadat Mireshghallah

Proceedings of the International Conference on Machine Learning, 2022

Data Poisoning Won't Save You From Facial Recognition.

[BibT_eX]

[DOI]

Proceedings of the Tenth International Conference on Learning Representations, 2022

Large Language Models Can Be Strong Differentially Private Learners.

[BibT_eX]

[DOI]

Proceedings of the Tenth International Conference on Learning Representations, 2022

What Does it Mean for a Language Model to Preserve Privacy?

[BibT_eX]

[DOI]

Hannah Brown

Katherine Lee

Reza Shokri

Proceedings of the FAccT '22: 2022 ACM Conference on Fairness, Accountability, and Transparency, Seoul, Republic of Korea, June 21, 2022

Truth Serum: Poisoning Machine Learning Models to Reveal Their Secrets.

[BibT_eX]

[DOI]

Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security, 2022

AISec '22: 15th ACM Workshop on Artificial Intelligence and Security.

[BibT_eX]

[DOI]

Ambra Demontis

Xinyun Chen

Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security, 2022

2021

Measuring and enhancing the security of machine learning.

[BibT_eX]

[DOI]

PhD thesis, 2021

Advances and Open Problems in Federated Learning.

[BibT_eX]

[DOI]

Rafael G. L. D'Oliveira

Ananda Theertha Suresh

Found. Trends Mach. Learn., 2021

NeuraCrypt is not private.

[BibT_eX]

[DOI]

CoRR, 2021

Data Poisoning Won't Save You From Facial Recognition.

[BibT_eX]

[DOI]

Evani Radiya-Dixit

CoRR, 2021

Extracting Training Data from Large Language Models.

[BibT_eX]

[DOI]

Proceedings of the 30th USENIX Security Symposium, 2021

Is Private Learning Possible with Instance Encoding?

[BibT_eX]

[DOI]

Proceedings of the 42nd IEEE Symposium on Security and Privacy, 2021

Antipodes of Label Differential Privacy: PATE and ALIBI.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

SquirRL: Automating Attack Analysis on Blockchain Incentive Mechanisms with Deep Reinforcement Learning.

[BibT_eX]

[DOI]

Proceedings of the 28th Annual Network and Distributed System Security Symposium, 2021

Label-Only Membership Inference Attacks.

[BibT_eX]

[DOI]

Nicolas Papernot

Proceedings of the 38th International Conference on Machine Learning, 2021

Differentially Private Learning Needs Better Features (or Much More Data).

[BibT_eX]

[DOI]

Proceedings of the 9th International Conference on Learning Representations, 2021

Fourth International Workshop on Dependable and Secure Machine Learning - DSML 2021.

[BibT_eX]

[DOI]

Proceedings of the 51st Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops, 2021

2020

Remote Side-Channel Attacks on Anonymous Transactions.

[BibT_eX]

[DOI]

Kenneth G. Paterson

IACR Cryptol. ePrint Arch., 2020

An Attack on InstaHide: Is Private Learning Possible with Instance Encoding?

[BibT_eX]

[DOI]

CoRR, 2020

SentiNet: Detecting Localized Universal Attacks Against Deep Learning Systems.

[BibT_eX]

[DOI]

Edward Chou

Giancarlo Pellegrino

Proceedings of the 2020 IEEE Security and Privacy Workshops, 2020

On Adaptive Attacks to Adversarial Example Defenses.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Fundamental Tradeoffs between Invariance and Sensitivity to Adversarial Perturbations.

[BibT_eX]

[DOI]

Proceedings of the 37th International Conference on Machine Learning, 2020

Third International Workshop on Dependable and Secure Machine Learning - DSML 2020.

[BibT_eX]

[DOI]

Proceedings of the 50th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops, 2020

2019

The Hydra Framework for Principled, Automated Bug Bounties.

[BibT_eX]

[DOI]

IEEE Secur. Priv., 2019

Advances and Open Problems in Federated Learning.

[BibT_eX]

[DOI]

Rafael G. L. D'Oliveira

Ananda Theertha Suresh

CoRR, 2019

SquirRL: Automating Attack Discovery on Blockchain Incentive Mechanisms with Deep Reinforcement Learning.

[BibT_eX]

[DOI]

CoRR, 2019

Exploiting Excessive Invariance caused by Norm-Bounded Adversarial Robustness.

[BibT_eX]

[DOI]

CoRR, 2019

Adversarial Training and Robustness for Multiple Perturbations.

[BibT_eX]

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Slalom: Fast, Verifiable and Private Execution of Neural Networks in Trusted Hardware.

[BibT_eX]

[DOI]

Proceedings of the 7th International Conference on Learning Representations, 2019

AdVersarial: Perceptual Ad Blocking meets Adversarial Machine Learning.

[BibT_eX]

[DOI]

Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security, 2019

2018

SentiNet: Detecting Physical Attacks Against Deep Learning Systems.

[BibT_eX]

[DOI]

CoRR, 2018

Ad-versarial: Defeating Perceptual Ad-Blocking.

[BibT_eX]

[DOI]

CoRR, 2018

Physical Adversarial Examples for Object Detectors.

[BibT_eX]

[DOI]

Proceedings of the 12th USENIX Workshop on Offensive Technologies, 2018

Ensemble Adversarial Training: Attacks and Defenses.

[BibT_eX]

[DOI]

Proceedings of the 6th International Conference on Learning Representations, 2018

2017

PrivateRide: A Privacy-Enhanced Ride-Hailing Service.

[BibT_eX]

[DOI]

Anh Pham

Italo Dacosta

Bastien Jacot-Guillarmod

Proc. Priv. Enhancing Technol., 2017

Addressing Beacon re-identification attacks: quantification and mitigation of privacy risks.

[BibT_eX]

[DOI]

Suyash S. Shringarpure

J. Am. Medical Informatics Assoc., 2017

Enter the Hydra: Towards Principled Bug Bounties and Exploit-Resistant Smart Contracts.

[BibT_eX]

[DOI]

IACR Cryptol. ePrint Arch., 2017

Note on Attacking Object Detectors with Adversarial Stickers.

[BibT_eX]

[DOI]

CoRR, 2017

The Space of Transferable Adversarial Examples.

[BibT_eX]

[DOI]

CoRR, 2017

Ensemble Adversarial Training: Attacks and Defenses.

[BibT_eX]

[DOI]

CoRR, 2017

FairTest: Discovering Unwarranted Associations in Data-Driven Applications.

[BibT_eX]

[DOI]

Proceedings of the 2017 IEEE European Symposium on Security and Privacy, 2017

2016

Sealed-Glass Proofs: Using Transparent Enclaves to Prove and Sell Knowledge.

[BibT_eX]

[DOI]

IACR Cryptol. ePrint Arch., 2016

Formal Abstractions for Attested Execution Secure Processors.

[BibT_eX]

[DOI]

Rafael Pass

Elaine Shi

IACR Cryptol. ePrint Arch., 2016

On solving L P N using B K W and variants - Implementation and analysis.

[BibT_eX]

[DOI]

Sonia Bogos

Serge Vaudenay

Cryptogr. Commun., 2016

Stealing Machine Learning Models via Prediction APIs.

[BibT_eX]

[DOI]

Proceedings of the 25th USENIX Security Symposium, 2016

2015

Better Algorithms for LWE and LWR.

[BibT_eX]

[DOI]

Alexandre Duc

Serge Vaudenay

IACR Cryptol. ePrint Arch., 2015

On Solving Lpn using BKW and Variants.

[BibT_eX]

[DOI]

Sonia Bogos