David Dobre

According to our database, David Dobre authored at least 10 papers between 2022 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
A generative approach to LLM harmfulness detection with special red flag tokens.
CoRR, February, 2025

Learning Diverse Attacks on Large Language Models for Robust Red-Teaming and Safety Tuning.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

2024
In-Context Learning Can Re-learn Forbidden Tasks.
CoRR, 2024

Soft Prompt Threats: Attacking Safety Alignment and Unlearning in Open-Source LLMs through the Embedding Space.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

On the Scalability of Certified Adversarial Robustness with Generated Data.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Sarah Frank-Wolfe: Methods for Constrained Optimization with Best Rates and Practical Features.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

2023
Raising the Bar for Certified Adversarial Robustness with Diffusion Models.
CoRR, 2023

Adversarial Attacks and Defenses in Large Language Models: Old and New Threats.
Proceedings on "I Can't Believe It's Not Better: Failure Modes in the Age of Foundation Models" at NeurIPS 2023 Workshops, 2023

2022
Dissecting adaptive methods in GANs.
CoRR, 2022

Clipped Stochastic Methods for Variational Inequalities with Heavy-Tailed Noise.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

