Zidi Xiong

According to our database, Zidi Xiong authored at least 18 papers between 2022 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
Monitorability as a Free Gift: How RLVR Spontaneously Aligns Reasoning.
CoRR, February, 2026

Proof of Time: A Benchmark for Evaluating Scientific Idea Judgments.
CoRR, January, 2026

How Memory Management Impacts LLM Agents: An Empirical Study of Experience-Following Behavior.
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2026

2025
User-Assistant Bias in LLMs.
CoRR, August, 2025

When Models Reason in Your Language: Controlling Thinking Trace Language Comes at the Cost of Accuracy.
CoRR, May, 2025

How Memory Management Impacts LLM Agents: An Empirical Study of Experience-Following Behavior.
CoRR, May, 2025

MMDT: Decoding the Trustworthiness and Safety of Multimodal Foundation Models.
CoRR, March, 2025

Measuring the Faithfulness of Thinking Drafts in Large Reasoning Models.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2025, 2025

GuardAgent: Safeguard LLM Agents via Knowledge-Enabled Reasoning.
Proceedings of the Forty-second International Conference on Machine Learning, 2025

MMDT: Decoding the Trustworthiness and Safety of Multimodal Foundation Models.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

When Models Reason in Your Language: Controlling Thinking Language Comes at the Cost of Accuracy.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

2024
GuardAgent: Safeguard LLM Agents by a Guard Agent via Knowledge-Enabled Reasoning.
CoRR, 2024

RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023
CBD: A Certified Backdoor Detector Based on Local Dominant Probability.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

UMD: Unsupervised Model Detection for X2X Backdoor Attacks.
Proceedings of the International Conference on Machine Learning, 2023

2022
Label-Smoothed Backdoor Attack.
CoRR, 2022

