Junxiao Yang

According to our database, Junxiao Yang authored at least 10 papers between 2023 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
Be Careful When Fine-tuning On Open-Source LLMs: Your Fine-tuning Data Could Be Secretly Stolen!
CoRR, May, 2025

How Should We Enhance the Safety of Large Reasoning Models: An Empirical Study.
CoRR, May, 2025

BARREL: Boundary-Aware Reasoning for Factual and Reliable LRMs.
CoRR, May, 2025

AISafetyLab: A Comprehensive Framework for AI Safety Evaluation and Improvement.
CoRR, February, 2025

Guiding not Forcing: Enhancing the Transferability of Jailbreaking Attacks on LLMs via Removing Superfluous Constraints.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024
Agent-SafetyBench: Evaluating the Safety of LLM Agents.
CoRR, 2024

Global Challenge for Safe and Secure LLMs Track 1.
CoRR, 2024

Safe Unlearning: A Surprisingly Effective and Generalizable Solution to Defend Against Jailbreak Attacks.
CoRR, 2024

Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023
Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization.
CoRR, 2023
