Pankayaraj Pathmanathan

According to our database¹, Pankayaraj Pathmanathan authored at least 10 papers between 2024 and 2026.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2026

Deliberative Alignment is Deep, but Uncertainty Remains: Inference time safety improvement in reasoning via attribution of unsafe behavior to base model.

Pankayaraj Pathmanathan

Furong Huang

CoRR, April, 2026

Teach a Reward Model to Correct Itself: Reward Guided Adversarial Failure Discovery for Robust Reward Modeling.

Pankayaraj Pathmanathan

Furong Huang

Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2026

AdvBDGen: A Robust Framework for Generating Adaptive and Stealthy Backdoors in LLM Alignment.

Pankayaraj Pathmanathan

Udari Madhushani Sehwag

Michael-Andrei Panaitescu-Liess

Cho-Yu Jason Chiang

Furong Huang

Proceedings of the Fortieth AAAI Conference on Artificial Intelligence, 2026

2025

RAGPart & RAGMask: Retrieval-Stage Defenses Against Corpus Poisoning in Retrieval-Augmented Generation.

Pankayaraj Pathmanathan

Michael-Andrei Panaitescu-Liess

Cho-Yu Jason Chiang

Furong Huang

CoRR, December, 2025

Reward Models Can Improve Themselves: Reward-Guided Adversarial Failure Mode Discovery for Robust Reward Modeling.

Pankayaraj Pathmanathan

Furong Huang

CoRR, July, 2025

PoisonedParrot: Subtle Data Poisoning Attacks to Elicit Copyright-Infringing Content from Large Language Models.

Michael-Andrei Panaitescu-Liess

Pankayaraj Pathmanathan

Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

Is Poisoning a Real Threat to DPO? Maybe More So Than You Think.

Pankayaraj Pathmanathan

Proceedings of the Thirty-Ninth AAAI Conference on Artificial Intelligence, 2025

Can Watermarking Large Language Models Prevent Copyrighted Text Generation and Hide Training Data?
