Zhexin Zhang

Orcid: 0000-0003-1767-8865

According to our database¹, Zhexin Zhang authored at least 46 papers between 2019 and 2026.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2026

The Missing Half: Unveiling Training-time Implicit Safety Risks Beyond Deployment.

CoRR, February, 2026

MotionAdapter: Video Motion Transfer via Content-Aware Attention Customization.

CoRR, January, 2026

How Should We Enhance the Safety of Large Reasoning Models: An Empirical Study.

Victor Shea-Jay Huang

Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2026

LASA: Language-Agnostic Semantic Alignment at the Semantic Bottleneck for LLM Safety.

Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2026

New Terms, New Toxicity: Consensus-based Chinese Neologism Toxicity Detection via Search-Augmented LLMs.

Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2026

When Smiley Turns Hostile: Interpreting How Emojis Trigger LLMs' Toxicity.

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025

"Give a Positive Review Only": An Early Investigation Into In-Paper Prompt Injection Attacks and Defenses for AI Reviewers.

CoRR, November, 2025

Vector sketch animation generation with differentialable motion trajectories.

CoRR, September, 2025

Be Careful When Fine-tuning On Open-Source LLMs: Your Fine-tuning Data Could Be Secretly Stolen!

CoRR, May, 2025

ShieldVLM: Safeguarding the Multimodal Implicit Toxicity via Deliberative Reasoning with LVLMs.

CoRR, May, 2025

BARREL: Boundary-Aware Reasoning for Factual and Reliable LRMs.

CoRR, May, 2025

AISafetyLab: A Comprehensive Framework for AI Safety Evaluation and Improvement.

CoRR, February, 2025

ShieldVLM: Safeguarding the Multimodal Implicit Toxicity via Deliberative Reasoning with LVLMs: ShieldVLM.

Proceedings of the 33rd ACM International Conference on Multimedia, 2025

JPS: Jailbreak Multimodal Large Language Models with Collaborative Visual Perturbation and Textual Steering.

Xuancheng Huang

Victor Shea-Jay Huang

Proceedings of the 33rd ACM International Conference on Multimedia, 2025

Knowledge-to-Jailbreak: Investigating Knowledge-driven Jailbreaking Attacks for Large Language Models.

Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, V.2, 2025

Guiding not Forcing: Enhancing the Transferability of Jailbreaking Attacks on LLMs via Removing Superfluous Constraints.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

LongSafety: Evaluating Long-Context Safety of Large Language Models.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024

Agent-SafetyBench: Evaluating the Safety of LLM Agents.

CoRR, 2024

Seeker: Towards Exception Safety Code Generation with Intermediate Language Agents Framework.

CoRR, 2024

Global Challenge for Safe and Secure LLMs Track 1.

Rick Siow Mong Goh

CoRR, 2024

Safe Unlearning: A Surprisingly Effective and Generalizable Solution to Defend Against Jailbreak Attacks.

CoRR, 2024

Knowledge-to-Jailbreak: One Knowledge Point Worth One Attack.

CoRR, 2024

ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

A Design of Interface for Visual-Impaired People to Access Visual Information from Images Featuring Large Language Models and Visual Language Models.

Proceedings of the Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, 2024

Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization.

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

SafetyBench: Evaluating the Safety of Large Language Models.

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023

Import vertical characteristic of rain streak for single image deraining.

Multim. Syst., 2023

Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization.

CoRR, 2023

SafetyBench: Evaluating the Safety of Large Language Models with Multiple Choice Questions.

CoRR, 2023

Safety Assessment of Chinese Large Language Models.

CoRR, 2023

Recent Advances towards Safe, Responsible, and Moral Dialogue Systems: A Survey.

CoRR, 2023

InstructSafety: A Unified Framework for Building Multidimensional and Explainable Safety Detector through Instruction Tuning.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Unveiling the Implicit Toxicity in Large Language Models.

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

ETHICIST: Targeted Training Data Extraction Through Loss Smoothed Soft Prompting and Calibrated Confidence Estimation.

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Self-Supervised Sentence Polishing by Adding Engaging Modifiers.

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations), 2023

MoralDial: A Framework to Train and Evaluate Moral Dialogue Systems via Moral Discussions.

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022

MoralDial: A Framework to Train and Evaluate Moral Dialogue Systems via Constructing Moral Discussions.

CoRR, 2022

Persona-Guided Planning for Controlling the Protagonist's Persona in Story Generation.

Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

Indoor Auto-Navigate System for Electric Wheelchairs in a Nursing Home.

Proceedings of the Universal Access in Human-Computer Interaction. Novel Design Approaches and Technologies, 2022

Visualizing the Electroencephalography Signal Discrepancy When Maintaining Social Distancing: EEG-Based Interactive Moiré Patterns.

Xanat Vargas Meza

Proceedings of the Design, User Experience, and Usability: Design for Emotion, Well-being and Health, Learning, and Culture, 2022

Automatic Comment Generation for Chinese Student Narrative Essays.

Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

Constructing Highly Inductive Contexts for Dialogue Safety through Controllable Reverse Generation.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

Selecting Stickers in Open-Domain Dialogue through Multitask Learning.

Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

2021

A Customized VR Rendering with Neural-Network Generated Frames for Reducing VR Dizziness.

Proceedings of the HCI International 2021 - Posters - 23rd HCI International Conference, 2021

OpenMEVA: A Benchmark for Evaluating Open-ended Story Generation Metrics.

Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2019

Neural Machine Translation Inspired Binary Code Similarity Comparison beyond Function Pairs.

Proceedings of the 26th Annual Network and Distributed System Security Symposium, 2019