Shu Yang

ORCID: 0009-0009-1786-9187

Affiliations:

King Abdullah University of Science and Technology, Provable Responsible AI and Data Analytics (PRADA) Lab, Thuwal, Saudi Arabia
University of Macau, NLP2CT Lab, Taipa, Macau (former)

According to our database¹, Shu Yang authored at least 34 papers between 2023 and 2025.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2025

RepreGuard: Detecting LLM-Generated Text by Revealing Hidden Representation Patterns.

CoRR, August, 2025

Understanding and Mitigating Political Stance Cross-topic Generalization in Large Language Models.

CoRR, August, 2025

When Truth Is Overridden: Uncovering the Internal Origins of Sycophancy in Large Language Models.

CoRR, August, 2025

Is Long-to-Short a Free Lunch? Investigating Inconsistency and Reasoning Efficiency in LRMs.

CoRR, June, 2025

The Compositional Architecture of Regret in Large Language Models.

CoRR, June, 2025

Mitigating Behavioral Hallucination in Multimodal Large Language Models for Sequential Images.

CoRR, June, 2025

Flattery in Motion: Benchmarking and Analyzing Sycophancy in Video-LLMs.

CoRR, June, 2025

Understanding and Mitigating Cross-lingual Privacy Leakage via Language-specific and Universal Privacy Neurons.

CoRR, June, 2025

Understanding How Value Neurons Shape the Generation of Specified Values in LLMs.

CoRR, May, 2025

Is Your LLM-Based Multi-Agent a Reliable Real-World Planner? Exploring Fraud Detection in Travel Planning.

CoRR, May, 2025

A Survey on the Safety and Security Threats of Computer-Using Agents: JARVIS or Ultron?

CoRR, May, 2025

Understanding Aha Moments: from External Observations to Internal Mechanisms.

CoRR, April, 2025

Rethinking Prompt-based Debiasing in Large Language Models.

CoRR, March, 2025

C² ATTACK: Towards Representation Backdoor on CLIP via Concept Confusion.

CoRR, March, 2025

Can Large Language Models Identify Implicit Suicidal Ideation? An Empirical Evaluation.

CoRR, February, 2025

EAP-GP: Mitigating Saturation Effect in Gradient-based Automated Circuit Identification.

CoRR, February, 2025

Evaluating Data Influence in Meta Learning.

CoRR, January, 2025

A Survey on LLM-Generated Text Detection: Necessity, Methods, and Future Directions.

Comput. Linguistics, 2025

Stable Vision Concept Transformers for Medical Diagnosis.

Proceedings of the Machine Learning and Knowledge Discovery in Databases. Research Track, 2025

Who Wrote This? The Key to Zero-Shot LLM-Generated Text Detection Is GECScore.

Proceedings of the 31st International Conference on Computational Linguistics, 2025

Understanding the Repeat Curse in Large Language Models from a Feature Perspective.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

Fraud-R1: A Multi-Round Benchmark for Assessing the Robustness of LLM Against Augmented Fraud and Phishing Inducements.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

Rethinking Prompt-based Debiasing in Large Language Model.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024

Dissecting Misalignment of Multimodal Large Language Models via Influence Function.

CoRR, 2024

What makes your model a low-empathy or warmth person: Exploring the Origins of Personality in LLMs.

CoRR, 2024

Understanding Reasoning in Chain-of-Thought from the Hopfieldian View.

CoRR, 2024

A Hopfieldian View-based Interpretation for Chain-of-Thought Reasoning.

CoRR, 2024

Leveraging Logical Rules in Knowledge Editing: A Cherry on the Top.

CoRR, 2024

PROMPT-SAW: Leveraging Relation-Aware Graphs for Textual Prompt Compression.

CoRR, 2024

Dialectical Alignment: Resolving the Tension of 3H and Security Threats of LLMs.

CoRR, 2024

Human-AI Interactions in the Communication Era: Autophagy Makes Large Models Achieving Local Optima.

CoRR, 2024

MoRAL: MoE Augmented LoRA for LLMs' Lifelong Learning.

CoRR, 2024

DetectRL: Benchmarking LLM-Generated Text Detection in Real-World Scenarios.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

2023

Is ChatGPT a Highly Fluent Grammatical Error Correction System? A Comprehensive Evaluation.

CoRR, 2023
