Alwin Peng
According to our database, Alwin Peng authored at least 4 papers between 2024 and 2026.
Bibliography
2026
Trojan-Speak: Bypassing Constitutional Classifiers with No Jailbreak Tax via Adversarial Finetuning.
CoRR, March, 2026
Constitutional Classifiers++: Efficient Production-Grade Defenses against Universal Jailbreaks.
CoRR, January, 2026
2025
Constitutional Classifiers: Defending against Universal Jailbreaks across Thousands of Hours of Red Teaming.
CoRR, January, 2025
2024