Zhenhong Zhou

Orcid: 0000-0003-4065-1740

According to our database¹, Zhenhong Zhou authored at least 49 papers between 2021 and 2026.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2026

Structure-Guided Visual Perturbation Neutralization for LVLMs.

CoRR, May, 2026

EchoDistill: Alignment Noisy-to-Clean Self-Distillation for Robust Audio LLMs.

CoRR, May, 2026

A Survey of Large Audio Language Models: Generalization, Trustworthiness, and Outlook.

CoRR, May, 2026

Explaining and Breaking the Safety-Helpfulness Ceiling via Preference Dimensional Expansion.

CoRR, May, 2026

How Emotion Shapes the Behavior of LLMs and Agents: A Mechanistic Study.

CoRR, April, 2026

SafeSeek: Universal Attribution of Safety Circuits in Language Models.

CoRR, March, 2026

LARFT: Closing the Cognition-Action Gap for Length Instruction Following in Large Language Models.

CoRR, March, 2026

Resource Consumption Threats in Large Language Models.

CoRR, March, 2026

MCPShield: A Security Cognition Layer for Adaptive Trust Calibration in Model Context Protocol Agents.

CoRR, February, 2026

Omni-Safety under Cross-Modality Conflict: Vulnerabilities, Dynamics Mechanisms and Efficient Alignment.

CoRR, February, 2026

RECUR: Resource Exhaustion Attack via Recursive-Entropy Guided Counterfactual Utilization and Reflection.

CoRR, February, 2026

From Helpfulness to Toxic Proactivity: Diagnosing Behavioral Misalignment in LLM Agents.

CoRR, February, 2026

RSA-Bench: Benchmarking Audio Large Models in Real-World Acoustic Scenarios.

CoRR, January, 2026

SEE: Signal Embedding Energy for Quantifying Noise Interference in Large Audio Language Models.

CoRR, January, 2026

ChronosAudio: A Comprehensive Long-Audio Benchmark for Evaluating Audio-Large Language Models.

CoRR, January, 2026

HearSay Benchmark: Do Audio LLMs Leak What They Hear?

CoRR, January, 2026

CSSBench: Evaluating the Safety of Lightweight LLMs against Chinese-Specific Adversarial Patterns.

CoRR, January, 2026

Hidden in the Noise: Unveiling Backdoors in Audio LLMs Alignment Through Latent Acoustic Pattern Triggers.

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025

MemEvolve: Meta-Evolution of Agent Memory Systems.

CoRR, December, 2025

Memory in the Age of AI Agents.

CoRR, December, 2025

LeechHijack: Covert Computational Resource Exploitation in Intelligent Agent Systems.

CoRR, December, 2025

A Vision for Access Control in LLM-based Agent Systems.

CoRR, October, 2025

Backdoor Collapse: Eliminating Unknown Threats via Known Backdoor Aggregation in Language Models.

CoRR, October, 2025

DiffuGuard: How Intrinsic Safety is Lost and Found in Diffusion Large Language Models.

CoRR, September, 2025

Backdoor Attribution: Elucidating and Controlling Backdoor in Language Models.

CoRR, September, 2025

Jailbreaking Large Language Diffusion Models: Revealing Hidden Safety Flaws in Diffusion-Based Text Generation.

CoRR, July, 2025

RECALLED: An Unbounded Resource Consumption Attack on Large Vision-Language Models.

CoRR, July, 2025

Goal-Aware Identification and Rectification of Misinformation in Multi-Agent Systems.

CoRR, June, 2025

PD³F: A Pluggable and Dynamic DoS-Defense Framework Against Resource Consumption Attacks Targeting Large Language Models.

CoRR, May, 2025

A Vision for Auto Research with LLM Agents.

CoRR, April, 2025

A Comprehensive Survey in LLM(-Agent) Full Stack Safety: Data, Training and Deployment.

CoRR, April, 2025

CORBA: Contagious Recursive Blocking Attacks on Multi-Agent Systems Based on Large Language Models.

CoRR, February, 2025

LIFEBENCH: Evaluating Length Instruction Following in Large Language Models.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

Reinforced Lifelong Editing for Language Models.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

On the Role of Attention Heads in Large Language Model Safety.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

DemonAgent: Dynamically Encrypted Multi-Backdoor Implantation Attack on LLM-based Agent.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

PD³F: A Pluggable and Dynamic DoS-Defense Framework Against Resource Consumption Attacks Targeting Large Language Models.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

Crabs: Consuming Resource via Auto-generation for LLM-DoS Attack under Black-box Settings.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024

Enforcing group fairness in privacy-preserving Federated Learning.

Future Gener. Comput. Syst., 2024

Crabs: Consuming Resource via Auto-generation for LLM-DoS Attack under Black-box Settings.

CoRR, 2024

On the Role of Attention Heads in Large Language Model Safety.

CoRR, 2024

Alignment-Enhanced Decoding: Defending via Token-Level Adaptive Refining of Probability Distributions.

CoRR, 2024

Course-Correction: Safety Alignment Using Synthetic Preferences.

CoRR, 2024

Speak Out of Turn: Safety Vulnerability of Large Language Models in Multi-turn Dialogue.

CoRR, 2024

How Alignment and Jailbreak Work: Explain LLM Safety through Intermediate Hidden States.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Course-Correction: Safety Alignment Using Synthetic Preferences.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: EMNLP 2024, 2024

Alignment-Enhanced Decoding: Defending Jailbreaks via Token-Level Adaptive Refining of Probability Distributions.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Quantifying and Analyzing Entity-Level Memorization in Large Language Models.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2021

Three-Dimensional Reconstruction of Huizhou Landscape Combined with Multimedia Technology and Geographic Information System.

Mob. Inf. Syst., 2021