Jerry Wei
According to our database, Jerry Wei authored at least 10 papers between 2020 and 2026.
Bibliography
2026
Trojan-Speak: Bypassing Constitutional Classifiers with No Jailbreak Tax via Adversarial Finetuning.
CoRR, March, 2026
Constitutional Classifiers++: Efficient Production-Grade Defenses against Universal Jailbreaks.
CoRR, January, 2026
2025
Constitutional Classifiers: Defending against Universal Jailbreaks across Thousands of Hours of Red Teaming.
CoRR, January, 2025
2024
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024
2021
Proceedings of the 30th USENIX Security Symposium, 2021
2020