Junxiao Yang

ORCID: 0009-0004-7287-0918

According to our database, Junxiao Yang authored at least 16 papers between 2023 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
LASA: Language-Agnostic Semantic Alignment at the Semantic Bottleneck for LLM Safety.
CoRR, April, 2026

SMSP: A Plug-and-Play Strategy of Multi-Scale Perception for MLLMs to Perceive Visual Illusions.
CoRR, March, 2026

The Missing Half: Unveiling Training-time Implicit Safety Risks Beyond Deployment.
CoRR, February, 2026

G-GBC: A Gaussian mixture-based granular ball computing method for robust CNN classification under noisy labels.
Knowl. Based Syst., 2026

When Smiley Turns Hostile: Interpreting How Emojis Trigger LLMs' Toxicity.
Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025
Loong: Synthesize Long Chain-of-Thoughts at Scale through Verifiers.
CoRR, September, 2025

Be Careful When Fine-tuning On Open-Source LLMs: Your Fine-tuning Data Could Be Secretly Stolen!
CoRR, May, 2025

How Should We Enhance the Safety of Large Reasoning Models: An Empirical Study.
CoRR, May, 2025

BARREL: Boundary-Aware Reasoning for Factual and Reliable LRMs.
CoRR, May, 2025

AISafetyLab: A Comprehensive Framework for AI Safety Evaluation and Improvement.
CoRR, February, 2025

Guiding not Forcing: Enhancing the Transferability of Jailbreaking Attacks on LLMs via Removing Superfluous Constraints.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024
Agent-SafetyBench: Evaluating the Safety of LLM Agents.
CoRR, 2024

Global Challenge for Safe and Secure LLMs Track 1.
CoRR, 2024

Safe Unlearning: A Surprisingly Effective and Generalizable Solution to Defend Against Jailbreak Attacks.
CoRR, 2024

Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023
Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization.
CoRR, 2023

