Manish Nagireddy

Orcid: 0000-0001-9245-2546

According to our database, Manish Nagireddy authored at least 18 papers between 2022 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
The RealHumanEval: Evaluating Large Language Models' Abilities to Support Programmers.
Trans. Mach. Learn. Res., 2025

Granite Guardian: Comprehensive LLM Safeguarding.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

Programming Refusal with Conditional Activation Steering.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Contextual Value Alignment.
Proceedings of the 2025 IEEE International Conference on Acoustics, 2025

Multi-Level Explanations for Generative Language Models.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024
Alignment Studio: Aligning Large Language Models to Particular Contextual Regulations.
IEEE Internet Comput., 2024

Granite Guardian.
CoRR, 2024

When in Doubt, Cascade: Towards Building Efficient and Capable Guardrails.
CoRR, 2024

Contextual Moral Value Alignment Through Context-Based Aggregation.
CoRR, 2024

Detectors for Safe and Reliable LLMs: Implementations, Uses, and Limitations.
CoRR, 2024

DARE to Diversify: DAta Driven and Diverse LLM REd Teaming.
Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2024

ComVas: Contextual Moral Values Alignment System.
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

Value Alignment from Unstructured Text.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: EMNLP 2024, 2024

Language Models in Dialogue: Conversational Maxims for Human-AI Interactions.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

SocialStigmaQA: A Benchmark to Uncover Stigma Amplification in Generative Language Models.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Function Composition in Trustworthy Machine Learning: Implementation Choices, Insights, and Questions.
CoRR, 2023

2022
A Sandbox Tool to Bias(Stress)-Test Fairness Algorithms.
CoRR, 2022

Exploring How Machine Learning Practitioners (Try To) Use Fairness Toolkits.
Proceedings of the FAccT '22: 2022 ACM Conference on Fairness, Accountability, and Transparency, Seoul, Republic of Korea, June 21, 2022

