Himabindu Lakkaraju

Orcid: 0000-0001-7922-6544

According to our database, Himabindu Lakkaraju authored at least 102 papers between 2011 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Interpretability Needs a New Paradigm.
CoRR, 2024

More RLHF, More Trust? On The Impact of Human Preference Alignment On Language Model Trustworthiness.
CoRR, 2024

Manipulating Large Language Models to Increase Product Visibility.
CoRR, 2024

OpenHEXAI: An Open-Source Framework for Human-Centered Evaluation of Explainable Machine Learning.
CoRR, 2024

Towards Safe and Aligned Large Language Models for Medicine.
CoRR, 2024

Follow My Instruction and Spill the Beans: Scalable Data Extraction from Retrieval-Augmented Generation Systems.
CoRR, 2024

Opening the Black Box of Large Language Models: Two Views on Holistic Interpretability.
CoRR, 2024

Interpreting CLIP with Sparse Linear Concept Embeddings (SpLiCE).
CoRR, 2024

Understanding the Effects of Iterative Prompting on Truthfulness.
CoRR, 2024

Faithfulness vs. Plausibility: On the (Un)Reliability of Explanations from Large Language Models.
CoRR, 2024

Quantifying Uncertainty in Natural Language Explanations of Large Language Models.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2024

Fair Machine Unlearning: Data Removal while Mitigating Disparities.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2024

2023
Explaining machine learning models with interactive natural language conversations using TalkToModel.
Nat. Mac. Intell., August, 2023

Is Ignorance Bliss? The Role of Post Hoc Explanation Faithfulness and Alignment in Model Trust in Laypeople and Domain Experts.
CoRR, 2023

A Study on the Calibration of In-context Learning.
CoRR, 2023

Investigating the Fairness of Large Language Models for Predictions on Tabular Data.
CoRR, 2023

In-Context Unlearning: Language Models as Few Shot Unlearners.
CoRR, 2023

Are Large Language Models Post Hoc Explainers?
CoRR, 2023

On the Trade-offs between Adversarial Robustness and Actionable Explanations.
CoRR, 2023

Certifying LLM Safety against Adversarial Prompting.
CoRR, 2023

Accurate, Explainable, and Private Models: Providing Recourse While Minimizing Training Data Leakage.
CoRR, 2023

Verifiable Feature Attributions: A Bridge between Post Hoc Explainability and Inherent Interpretability.
CoRR, 2023

Efficient Estimation of the Local Robustness of Machine Learning Models.
CoRR, 2023

Analyzing Chain-of-Thought Prompting in Large Language Models via Gradient-based Feature Attributions.
CoRR, 2023

Consistent Explanations in the Face of Model Indeterminacy via Ensembling.
CoRR, 2023

Word-Level Explanations for Analyzing Bias in Text-to-Image Models.
CoRR, 2023


On Minimizing the Impact of Dataset Shifts on Actionable Explanations.
Proceedings of the Uncertainty in Artificial Intelligence, 2023

Which Models have Perceptually-Aligned Gradients? An Explanation via Off-Manifold Robustness.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Post Hoc Explanations of Language Models Can Improve Language Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Discriminative Feature Attributions: Bridging Post Hoc Explainability and Inherent Interpretability.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

M⁴: A Unified XAI Benchmark for Faithfulness Evaluation of Feature Attribution Methods across Metrics, Modalities and Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Generative AI meets Responsible AI: Practical Challenges and Opportunities.
Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023

Towards Bridging the Gaps between the Right to Explanation and the Right to be Forgotten.
Proceedings of the International Conference on Machine Learning, 2023

On the Impact of Algorithmic Recourse on Social Segregation.
Proceedings of the International Conference on Machine Learning, 2023

Probabilistically Robust Recourse: Navigating the Trade-offs between Costs and Robustness in Algorithmic Recourse.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

On the Privacy Risks of Algorithmic Recourse.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023

2022
Evaluating Explainability for Graph Neural Networks.
CoRR, 2022

TalkToModel: Understanding Machine Learning Models With Open Ended Dialogues.
CoRR, 2022

Flatten the Curve: Efficiently Training Low-Curvature Neural Networks.
CoRR, 2022

A Human-Centric Take on Model Monitoring.
CoRR, 2022

Rethinking Stability for Attribution-based Explanations.
CoRR, 2022

Algorithmic Recourse in the Face of Noisy Human Responses.
CoRR, 2022

Rethinking Explainability as a Dialogue: A Practitioner's Perspective.
CoRR, 2022

The Disagreement Problem in Explainable Machine Learning: A Practitioner's Perspective.
CoRR, 2022

Data poisoning attacks on off-policy policy evaluation methods.
Proceedings of the Uncertainty in Artificial Intelligence, 2022

Efficient Training of Low-Curvature Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Which Explanation Should I Choose? A Function Approximation Perspective to Characterizing Post Hoc Explanations.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

OpenXAI: Towards a Transparent Evaluation of Model Explanations.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Model Monitoring in Practice: Lessons Learned and Open Challenges.
Proceedings of the KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 14, 2022

A Human-Centric Perspective on Model Monitoring.
Proceedings of the Tenth AAAI Conference on Human Computation and Crowdsourcing, 2022

Exploring Counterfactual Explanations Through the Lens of Adversarial Examples: A Theoretical and Empirical Analysis.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

Probing GNN Explainers: A Rigorous Theoretical and Empirical Analysis of GNN Explanation Methods.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

Towards Robust Off-Policy Evaluation via Human Inputs.
Proceedings of the AIES '22: AAAI/ACM Conference on AI, Ethics, and Society, Oxford, United Kingdom, May 19, 2022

Fairness via Explanation Quality: Evaluating Disparities in the Quality of Post hoc Explanations.
Proceedings of the AIES '22: AAAI/ACM Conference on AI, Ethics, and Society, Oxford, United Kingdom, May 19, 2022

2021
What will it take to generate fairness-preserving explanations?
CoRR, 2021

Feature Attributions and Counterfactual Explanations Can Be Manipulated.
CoRR, 2021

On the Connections between Counterfactual Explanations and Adversarial Examples.
CoRR, 2021

Towards a Rigorous Theoretical Analysis and Evaluation of GNN Explanations.
CoRR, 2021

Counterfactual Explanations Can Be Manipulated.
CoRR, 2021

Learning Under Adversarial and Interventional Shifts.
CoRR, 2021

Towards a unified framework for fair and stable graph representation learning.
Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence, 2021

Towards Robust and Reliable Algorithmic Recourse.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Reliable Post hoc Explanations: Modeling Uncertainty in Explainability.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Counterfactual Explanations Can Be Manipulated.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Learning Models for Actionable Recourse.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Towards the Unification and Robustness of Perturbation and Gradient Based Explanations.
Proceedings of the 38th International Conference on Machine Learning, 2021

Towards Reliable and Practicable Algorithmic Recourse.
Proceedings of the CIKM '21: The 30th ACM International Conference on Information and Knowledge Management, Virtual Event, Queensland, Australia, November 1, 2021

Does Fair Ranking Improve Minority Outcomes? Understanding the Interplay of Human and Algorithmic Biases in Online Hiring.
Proceedings of the AIES '21: AAAI/ACM Conference on AI, Ethics, and Society, 2021

Fair Influence Maximization: a Welfare Optimization Approach.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Can I Still Trust You?: Understanding the Impact of Distribution Shifts on Algorithmic Recourses.
CoRR, 2020

When Does Uncertainty Matter?: Understanding the Impact of Predictive Uncertainty in ML Assisted Decision Making.
CoRR, 2020

Ensuring Actionable Recourse via Adversarial Training.
CoRR, 2020

Interpretable and Interactive Summaries of Actionable Recourses.
CoRR, 2020

How Much Should I Trust You? Modeling Uncertainty of Black Box Explanations.
CoRR, 2020

Fair Influence Maximization: A Welfare Optimization Approach.
CoRR, 2020

Incorporating Interpretable Output Constraints in Bayesian Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Beyond Individualized Recourse: Interpretable and Interactive Summaries of Actionable Recourses.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Robust and Stable Black Box Explanations.
Proceedings of the 37th International Conference on Machine Learning, 2020

Fooling LIME and SHAP: Adversarial Attacks on Post hoc Explanation Methods.
Proceedings of the AIES '20: AAAI/ACM Conference on AI, Ethics, and Society, 2020

"How do I fool you?": Manipulating User Trust via Misleading Black Box Explanations.
Proceedings of the AIES '20: AAAI/ACM Conference on AI, Ethics, and Society, 2020

2019
How can we fool LIME and SHAP? Adversarial Attacks on Post hoc Explanation Methods.
CoRR, 2019

Faithful and Customizable Explanations of Black Box Models.
Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society, 2019

2018
Human-centric machine learning: enabling machine learning for high-stakes decision-making.
PhD thesis, 2018

2017
Interpretable & Explorable Approximations of Black Box Models.
CoRR, 2017

The Selective Labels Problem: Evaluating Algorithmic Predictions in the Presence of Unobservables.
Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, August 13, 2017

Learning Cost-Effective and Interpretable Treatment Regimes.
Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, 2017

Identifying Unknown Unknowns in the Open World: Representations and Policies for Guided Exploration.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016
Psycho-Demographic Analysis of the Facebook Rainbow Campaign.
CoRR, 2016

Learning Cost-Effective Treatment Regimes using Markov Decision Processes.
CoRR, 2016

Discovering Blind Spots of Predictive Models: Representations and Policies for Guided Exploration.
CoRR, 2016

Confusions over Time: An Interpretable Bayesian Model to Characterize Trends in Decision Making.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Interpretable Decision Sets: A Joint Framework for Description and Prediction.
Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016

2015
A Bayesian Framework for Modeling Human Evaluations.
Proceedings of the 2015 SIAM International Conference on Data Mining, Vancouver, BC, Canada, April 30, 2015

Who, when, and why: a machine learning approach to prioritizing students at risk of not graduating high school on time.
Proceedings of the Fifth International Conference on Learning Analytics And Knowledge, 2015

A Machine Learning Framework to Identify Students at Risk of Adverse Academic Outcomes.
Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2015

2013
What's in a Name? Understanding the Interplay between Titles, Content, and Communities in Social Media.
Proceedings of the Seventh International Conference on Weblogs and Social Media, 2013

2012
TEM: a novel perspective to modeling content on microblogs.
Proceedings of the 21st World Wide Web Conference, 2012

Dynamic Multi-relational Chinese Restaurant Process for Analyzing Influences on Users in Social Media.
Proceedings of the 12th IEEE International Conference on Data Mining, 2012

2011
Smart news feeds for social networks using scalable joint latent factor models.
Proceedings of the 20th International Conference on World Wide Web, 2011

Exploiting Coherence for the Simultaneous Discovery of Latent Facets and associated Sentiments.
Proceedings of the Eleventh SIAM International Conference on Data Mining, 2011

Attention prediction on social media brand pages.
Proceedings of the 20th ACM Conference on Information and Knowledge Management, 2011

