Yida Lu

Orcid: 0009-0000-4492-9047

According to our database, Yida Lu authored at least 14 papers between 2023 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
Survive at All Costs: Exploring LLM's Risky Behaviors under Survival Pressure.
CoRR, March, 2026

The Missing Half: Unveiling Training-time Implicit Safety Risks Beyond Deployment.
CoRR, February, 2026

The Side Effects of Being Smart: Safety Risks in MLLMs' Multi-Image Reasoning.
CoRR, January, 2026

2025
ShieldVLM: Safeguarding the Multimodal Implicit Toxicity via Deliberative Reasoning with LVLMs.
CoRR, May, 2025

AISafetyLab: A Comprehensive Framework for AI Safety Evaluation and Improvement.
CoRR, February, 2025

ShieldVLM: Safeguarding the Multimodal Implicit Toxicity via Deliberative Reasoning with LVLMs: ShieldVLM.
Proceedings of the 33rd ACM International Conference on Multimedia, 2025

SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

VPO: Aligning Text-to-Video Generation Models with Prompt Optimization.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2025

LongSafety: Evaluating Long-Context Safety of Large Language Models.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024
Agent-SafetyBench: Evaluating the Safety of LLM Agents.
CoRR, 2024

Global Challenge for Safe and Secure LLMs Track 1.
CoRR, 2024

ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

AutoDetect: Towards a Unified Framework for Automated Weakness Detection in Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

2023
Rethinking Dense Retrieval's Few-Shot Ability.
CoRR, 2023

