Qin Liu

ORCID: 0009-0000-8303-8757

Affiliations:
  • University of California, Davis, CA, USA
  • University of Southern California, Los Angeles, CA, USA (former)
  • Fudan University, School of Computer Science, Shanghai, China (former)


According to our database, Qin Liu authored at least 31 papers between 2020 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
RedCoder: Automated Multi-Turn Red Teaming for Code LLMs.
CoRR, July, 2025

Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination.
CoRR, July, 2025

QA-LIGN: Aligning LLMs through Constitutionally Decomposed QA.
CoRR, June, 2025

Exploring Scaling Laws for EHR Foundation Models.
CoRR, May, 2025

MetaScale: Test-Time Scaling with Evolving Meta-Thoughts.
CoRR, March, 2025

A Survey on Mechanistic Interpretability for Multi-Modal Foundation Models.
CoRR, February, 2025

VLM-Guard: Safeguarding Vision-Language Models via Fulfilling Safety Alignment Gap.
CoRR, February, 2025

Test-time Backdoor Mitigation for Black-Box Large Language Models with Defensive Demonstrations.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025

MuirBench: A Comprehensive Benchmark for Robust Multi-image Understanding.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Unraveling and Mitigating Safety Alignment Degradation of Vision-Language Models.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

SudoLM: Learning Access Control of Parametric Knowledge with Authorization Alignment.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024
MetaScientist: A Human-AI Synergistic Framework for Automated Mechanical Metamaterial Design.
CoRR, 2024

Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models.
CoRR, 2024

Familiarity-aware Evidence Compression for Retrieval Augmented Generation.
CoRR, 2024

Securing Multi-turn Conversational Language Models Against Distributed Backdoor Triggers.
CoRR, 2024

LLMs Assist NLP Researchers: Critique Paper (Meta-)Reviewing.
CoRR, 2024

From Shortcuts to Triggers: Backdoor Defense with Denoised PoE.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

Two Heads are Better than One: Nested PoE for Robust Defense Against Multi-Backdoors.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

Securing Multi-turn Conversational Language Models From Distributed Backdoor Attacks.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Monotonic Paraphrasing Improves Generalization of Language Model Prompting.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024


Mitigating Backdoor Threats to Large Language Models: Advancement and Challenges.
Proceedings of the 60th Annual Allerton Conference on Communication, 2024

2023
Test-time Backdoor Mitigation for Black-Box Large Language Models with Defensive Demonstrations.
CoRR, 2023

Secrets of RLHF in Large Language Models Part I: PPO.
CoRR, 2023

Characterizing the Impacts of Instances on Robustness.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

Detecting Adversarial Samples through Sharpness of Loss Landscape.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

2022
PlugAT: A Plug and Play Module to Defend against Textual Adversarial Attack.
Proceedings of the 29th International Conference on Computational Linguistics, 2022

Flooding-X: Improving BERT's Resistance to Adversarial Attacks via Loss-Restricted Fine-Tuning.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

2021
Overview of Argumentative Text Understanding for AI Debater Challenge.
Proceedings of the Natural Language Processing and Chinese Computing, 2021

TextFlint: Unified Multilingual Robustness Evaluation Toolkit for Natural Language Processing.
Proceedings of the Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
Learning to Generate Representations for Novel Words: Mimic the OOV Situation in Training.
Proceedings of the Natural Language Processing and Chinese Computing, 2020

