Hannah Kirk

Orcid: 0000-0002-7419-5993

According to our database, Hannah Kirk authored at least 26 papers between 2021 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Adversarial Nibbler: An Open Red-Teaming Method for Identifying Diverse Harms in Text-to-Image Generation.
CoRR, 2024

Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models.
CoRR, 2024

2023
SimpleSafetyTests: a Test Suite for Identifying Critical Safety Risks in Large Language Models.
CoRR, 2023

The Empty Signifier Problem: Towards Clearer Paradigms for Operationalising "Alignment" in Large Language Models.
CoRR, 2023

Casteist but Not Racist? Quantifying Disparities in Large Language Model Bias between India and the West.
CoRR, 2023

XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models.
CoRR, 2023

DoDo Learning: DOmain-DemOgraphic Transfer in Language Models for Detecting Abuse Targeted at Public Figures.
CoRR, 2023

Balancing the Picture: Debiasing Vision-Language Datasets with Synthetic Contrast Sets.
CoRR, 2023

Adversarial Nibbler: A Data-Centric Challenge for Improving the Safety of Text-to-Image Models.
CoRR, 2023

Assessing Language Model Deployment with Risk Cards.
CoRR, 2023

Personalisation within bounds: A risk taxonomy and policy framework for the alignment of large language models with personalised feedback.
CoRR, 2023

Auditing large language models: a three-layered approach.
CoRR, 2023

SemEval-2023 Task 10: Explainable Detection of Online Sexism.
Proceedings of the 17th International Workshop on Semantic Evaluation, 2023
VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

The Past, Present and Better Future of Feedback Learning in Large Language Models for Subjective Human Preferences and Values.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

2022
Is More Data Better? Re-thinking the Importance of Efficiency in Abusive Language Detection with Transformers-Based Active Learning.
CoRR, 2022

Looking for a Handsome Carpenter! Debiasing GPT-3 Job Advertisements.
CoRR, 2022

Handling and Presenting Harmful Text.
CoRR, 2022

A Prompt Array Keeps the Bias Away: Debiasing Vision-Language Models with Adversarial Learning.
CoRR, 2022

Hatemoji: A Test Suite and Adversarially-Generated Dataset for Benchmarking and Detecting Emoji-Based Hate.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

A Prompt Array Keeps the Bias Away: Debiasing Vision-Language Models with Adversarial Learning.
Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing, 2022

Handling and Presenting Harmful Text in NLP Research.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

2021
Memes in the Wild: Assessing the Generalizability of the Hateful Memes Challenge Dataset.
CoRR, 2021

How True is GPT-2? An Empirical Analysis of Intersectional Occupational Biases.
CoRR, 2021

Bias Out-of-the-Box: An Empirical Analysis of Intersectional Occupational Biases in Popular Generative Language Models.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021