Shu Yang

Orcid: 0009-0009-1786-9187

Affiliations:
  • King Abdullah University of Science and Technology, Provable Responsible AI and Data Analytics (PRADA) Lab, Thuwal, Saudi Arabia
  • University of Macau, NLP2CT Lab, Taipa, Macau (former)


According to our database, Shu Yang authored at least 32 papers between 2023 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
Understanding and Mitigating Political Stance Cross-topic Generalization in Large Language Models.
CoRR, August, 2025

When Truth Is Overridden: Uncovering the Internal Origins of Sycophancy in Large Language Models.
CoRR, August, 2025

Is Long-to-Short a Free Lunch? Investigating Inconsistency and Reasoning Efficiency in LRMs.
CoRR, June, 2025

The Compositional Architecture of Regret in Large Language Models.
CoRR, June, 2025

Mitigating Behavioral Hallucination in Multimodal Large Language Models for Sequential Images.
CoRR, June, 2025

Flattery in Motion: Benchmarking and Analyzing Sycophancy in Video-LLMs.
CoRR, June, 2025

Understanding and Mitigating Cross-lingual Privacy Leakage via Language-specific and Universal Privacy Neurons.
CoRR, June, 2025

Understanding How Value Neurons Shape the Generation of Specified Values in LLMs.
CoRR, May, 2025

Is Your LLM-Based Multi-Agent a Reliable Real-World Planner? Exploring Fraud Detection in Travel Planning.
CoRR, May, 2025

A Survey on the Safety and Security Threats of Computer-Using Agents: JARVIS or Ultron?
CoRR, May, 2025

Understanding Aha Moments: from External Observations to Internal Mechanisms.
CoRR, April, 2025

Rethinking Prompt-based Debiasing in Large Language Models.
CoRR, March, 2025

C² ATTACK: Towards Representation Backdoor on CLIP via Concept Confusion.
CoRR, March, 2025

Can Large Language Models Identify Implicit Suicidal Ideation? An Empirical Evaluation.
CoRR, February, 2025

EAP-GP: Mitigating Saturation Effect in Gradient-based Automated Circuit Identification.
CoRR, February, 2025

Evaluating Data Influence in Meta Learning.
CoRR, January, 2025

A Survey on LLM-Generated Text Detection: Necessity, Methods, and Future Directions.
Comput. Linguistics, 2025

Who Wrote This? The Key to Zero-Shot LLM-Generated Text Detection Is GECScore.
Proceedings of the 31st International Conference on Computational Linguistics, 2025

Understanding the Repeat Curse in Large Language Models from a Feature Perspective.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

Fraud-R1: A Multi-Round Benchmark for Assessing the Robustness of LLM Against Augmented Fraud and Phishing Inducements.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

Rethinking Prompt-based Debiasing in Large Language Model.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024
Dissecting Misalignment of Multimodal Large Language Models via Influence Function.
CoRR, 2024

What makes your model a low-empathy or warmth person: Exploring the Origins of Personality in LLMs.
CoRR, 2024

Understanding Reasoning in Chain-of-Thought from the Hopfieldian View.
CoRR, 2024

A Hopfieldian View-based Interpretation for Chain-of-Thought Reasoning.
CoRR, 2024

Leveraging Logical Rules in Knowledge Editing: A Cherry on the Top.
CoRR, 2024

PROMPT-SAW: Leveraging Relation-Aware Graphs for Textual Prompt Compression.
CoRR, 2024

Dialectical Alignment: Resolving the Tension of 3H and Security Threats of LLMs.
CoRR, 2024

Human-AI Interactions in the Communication Era: Autophagy Makes Large Models Achieving Local Optima.
CoRR, 2024

MoRAL: MoE Augmented LoRA for LLMs' Lifelong Learning.
CoRR, 2024

DetectRL: Benchmarking LLM-Generated Text Detection in Real-World Scenarios.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

2023
Is ChatGPT a Highly Fluent Grammatical Error Correction System? A Comprehensive Evaluation.
CoRR, 2023

