Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, 2025

AdvI2I: Adversarial Image Attack on Image-to-Image Diffusion Models.

Yaopei Zeng

Proceedings of the Forty-second International Conference on Machine Learning, 2025

TruthFlow: Truthful LLM Generation via Representation Flow Correction.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

You Can't Steal Nothing: Mitigating Prompt Leakages in LLMs via System Vectors.

Proceedings of the 2025 ACM SIGSAC Conference on Computer and Communications Security, 2025

JoPA: Explaining Large Language Model's Generation via Joint Prompt Attribution.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

Monitoring Decoding: Mitigating Hallucination via Evaluating the Factuality of Partial Response during Generation.

Yurui Chang

Bochuan Cao

Lu Lin

Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024

Watch the Watcher! Backdoor Attacks on Security-Enhancing Diffusion Models.

CoRR, 2024

On the Intrinsic Self-Correction Capability of LLMs: Uncertainty and Latent Concept.

Kristen Marie Johnson

Jiliang Tang

Rongrong Wang

CoRR, 2024

XPrompt: Explaining Large Language Model's Generation via Joint Prompt Attribution.

CoRR, 2024

On the Difficulty of Defending Contrastive Learning against Backdoor Attacks.

Proceedings of the 33rd USENIX Security Symposium, 2024

Personalized Steering of Large Language Models: Versatile Steering Vectors Through Bi-directional Preference Optimization.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Data Free Backdoor Attacks.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Stealthy and Persistent Unalignment on Large Language Models via Backdoor Injections.

Yuanpu Cao

Bochuan Cao

Jinghui Chen

Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

Jailbreak Open-Sourced Large Language Models via Enforced Decoding.

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024