Bochuan Cao

Orcid: 0009-0007-1973-8186

According to our database, Bochuan Cao authored at least 23 papers between 2022 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
SkillGrad: Optimizing Agent Skills Like Gradient Descent.
CoRR, May, 2026

2025
Explore Data Left Behind in Reinforcement Learning for Reasoning Language Models.
CoRR, November, 2025

Your Agent Can Defend Itself against Backdoor Attacks.
CoRR, June, 2025

Watch the Watchers! On the Security Risks of Robustness-Enhancing Diffusion Models.
Proceedings of the 34th USENIX Security Symposium, 2025

WordGame: Efficient & Effective LLM Jailbreak via Simultaneous Obfuscation in Query and Response.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025

On the Convergence of Moral Self-Correction in Large Language Models.
Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, 2025

AdvI2I: Adversarial Image Attack on Image-to-Image Diffusion Models.
Proceedings of the Forty-second International Conference on Machine Learning, 2025

TruthFlow: Truthful LLM Generation via Representation Flow Correction.
Proceedings of the Forty-second International Conference on Machine Learning, 2025

You Can't Steal Nothing: Mitigating Prompt Leakages in LLMs via System Vectors.
Proceedings of the 2025 ACM SIGSAC Conference on Computer and Communications Security, 2025

JoPA: Explaining Large Language Model's Generation via Joint Prompt Attribution.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

Monitoring Decoding: Mitigating Hallucination via Evaluating the Factuality of Partial Response during Generation.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024
Watch the Watcher! Backdoor Attacks on Security-Enhancing Diffusion Models.
CoRR, 2024

On the Intrinsic Self-Correction Capability of LLMs: Uncertainty and Latent Concept.
CoRR, 2024

XPrompt: Explaining Large Language Model's Generation via Joint Prompt Attribution.
CoRR, 2024

On the Difficulty of Defending Contrastive Learning against Backdoor Attacks.
Proceedings of the 33rd USENIX Security Symposium, 2024

Personalized Steering of Large Language Models: Versatile Steering Vectors Through Bi-directional Preference Optimization.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Data Free Backdoor Attacks.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Stealthy and Persistent Unalignment on Large Language Models via Backdoor Injections.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

Jailbreak Open-Sourced Large Language Models via Enforced Decoding.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Defending Against Alignment-Breaking Attacks via Robustly Aligned LLM.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023
On the Safety of Open-Sourced Large Language Models: Does Alignment Really Prevent Them From Being Misused?
CoRR, 2023

IMPRESS: Evaluating the Resilience of Imperceptible Perturbations Against Unauthorized Data Usage in Diffusion-Based Generative AI.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

2022
Wild-Time: A Benchmark of in-the-Wild Distribution Shift over Time.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

