Hadas Orgad

According to our database, Hadas Orgad authored at least 18 papers between 2018 and 2026.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2026

Interpretability Can Be Actionable.

[…], Byron C. Wallace, Sarah Wiegreffe, […]

CoRR, May, 2026

Hidden Failures in Robustness: Why Supervised Uncertainty Quantification Needs Better Evaluation.

[…], Benjamin Heinzerling, Nafise Sadat Moosavi

CoRR, April, 2026

Large Language Models Generate Harmful Content Using a Distinct, Unified Mechanism.

[…], Martin Wattenberg, Peter Henderson, Seraphina Goldfarb-Tarrant, Yonatan Belinkov

CoRR, April, 2026

Agents of Chaos.

CoRR, February, 2026

2025

Inside-Out: Hidden Factual Knowledge in LLMs.

[…], Yonatan Belinkov, […], Jonathan Herzig, […]

CoRR, March, 2025

Padding Tone: A Mechanistic Analysis of Padding Tokens in T2I Models.

[…], Yonatan Belinkov

Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

MIB: A Mechanistic Interpretability Benchmark.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations.

[…], Yonatan Belinkov

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Position-aware Automatic Circuit Discovery.

[…], Yonatan Belinkov

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024

Unified Concept Editing in Diffusion Models.

Rohit Gandikota, […], Yonatan Belinkov, Joanna Materzynska, […]

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

ReFACT: Updating Text-to-Image Models by Editing the Text Encoder.

[…], Yonatan Belinkov

Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

Diffusion Lens: Interpreting Text Encoders in Text-to-Image Pipelines.

[…], Yonatan Belinkov

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023

Editing Implicit Assumptions in Text-to-Image Diffusion Models.

[…], Yonatan Belinkov

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

BLIND: Bias Removal With No Demographics.

[…], Yonatan Belinkov

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022

Debiasing NLP Models Without Demographic Information.

[…], Yonatan Belinkov

CoRR, 2022

Choose Your Lenses: Flaws in Gender Bias Evaluation.

[…], Yonatan Belinkov

CoRR, 2022

How Gender Debiasing Affects Internal Model Representations, and Why It Matters.

[…], Seraphina Goldfarb-Tarrant, Yonatan Belinkov

Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

2018

The Spyware Used in Intimate Partner Violence.

Rahul Chatterjee, Periwinkle Doerfler, […], Jackeline Palmer, […], Thomas Ristenpart

Proceedings of the 2018 IEEE Symposium on Security and Privacy, 2018
