Zekun Wu

Affiliations:
  • University College London, UK


According to our database, Zekun Wu authored at least 29 papers between 2023 and 2026.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2026
AgentDrift: Unsafe Recommendation Drift Under Tool Corruption Hidden by Ranking Metrics in LLM Agents.
CoRR, March, 2026

Control Reinforcement Learning: Interpretable Token-Level Steering of LLMs via Sparse Autoencoder Features.
CoRR, February, 2026

The Confidence Manifold: Geometric Structure of Correctness Representations in Language Models.
CoRR, February, 2026

AgentGraph: Trace-to-Graph Platform for Interactive Analysis and Robustness Testing in Agentic AI Systems.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

AgentSeer: Visualizing and Evaluating Temporal Actions in Agentic AI Systems.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025
NormCode: A Semi-Formal Language for Auditable AI Planning.
CoRR, December, 2025

Mind the Gap: Comparing Model- vs Agentic-Level Red Teaming with Action-Graph Observability on GPT-OSS-20B.
CoRR, September, 2025

Mind the Gap: Evaluating Model- and Agentic-Level Vulnerabilities in LLMs with Action Graphs.
CoRR, September, 2025

Knowledge Collapse in LLMs: When Fluency Survives but Facts Fail under Recursive Synthetic Training.
CoRR, September, 2025

Personality as a Probe for LLM Evaluation: Method Trade-offs and Downstream Effects.
CoRR, September, 2025

CorrSteer: Steering Improves Task Performance and Safety in LLMs through Correlation-based Sparse Autoencoder Feature Selection.
CoRR, August, 2025

HyPA-RAG: A Hybrid Parameter Adaptive Retrieval-Augmented Generation System for AI Legal and Policy Applications.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

From Text to Emoji: How PEFT-Driven Personality Manipulation Unleashes the Emoji Potential in LLMs.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025

Bias Amplification: Large Language Models as Increasingly Biased Media.
Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, 2025

HEARTS: A Holistic Framework for Explainable, Sustainable and Robust Text Stereotype Detection.
Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, 2025

MPF: Aligning and Debiasing Language Models post Deployment via Multi-Perspective Fusion.
Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, 2025

CauSkelNet: Causal Representation Learning for Human Behaviour Analysis.
Proceedings of the 19th IEEE International Conference on Automatic Face and Gesture Recognition, 2025

SAGED: A Holistic Bias-Benchmarking Pipeline for Language Models with Customisable Fairness Calibration.
Proceedings of the 31st International Conference on Computational Linguistics, 2025

LibVulnWatch: A Deep Assessment Agent System and Leaderboard for Uncovering Hidden Vulnerabilities in Open-Source AI Libraries.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop), 2025

2024
Bias Amplification: Language Models as Increasingly Biased Media.
CoRR, 2024

Assessing Bias in Metric Models for LLM Open-Ended Generation Bias Benchmarks.
CoRR, 2024

THaMES: An End-to-End Tool for Hallucination Mitigation and Evaluation in Large Language Models.
CoRR, 2024

From Text to Emoji: How PEFT-Driven Personality Manipulation Unleashes the Emoji Potential in LLMs.
CoRR, 2024

Auditing Large Language Models for Enhanced Text-Based Stereotype Detection and Probing-Based Bias Evaluation.
CoRR, 2024

Advancing Multimodal Data Fusion in Pain Recognition: A Strategy Leveraging Statistical Correlation and Human-Centered Perspectives.
CoRR, 2024

Eliciting Personality Traits in Large Language Models.
CoRR, 2024

JobFair: A Framework for Benchmarking Gender Hiring Bias in Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Advancing Pain Recognition Through Statistical Correlation-Driven Multimodal Fusion.
Proceedings of the 12th International Conference on Affective Computing and Intelligent Interaction, 2024

2023
Towards Auditing Large Language Models: Improving Text-based Stereotype Detection.
CoRR, 2023

