Adina Williams

Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024

Improving Text-to-Image Consistency via Automatic Prompt Optimization.

Michal Drozdzal

Trans. Mach. Learn. Res., 2024

DIG In: Evaluating Disparities in Image Generations with Indicators for Geographic Diversity.

Trans. Mach. Learn. Res., 2024

Chained Tuning Leads to Biased Forgetting.

Megan Ung

Alicia Sun

Samuel J. Bell

Bhaktipriya Radharapu

Levent Sagun

CoRR, 2024

What makes a good metric? Evaluating automatic metrics for text-to-image consistency.

Candace Ross

Melissa Hall

Krunoslav Lehman Pavasovic

CoRR, 2024

Findings of the Second BabyLM Challenge: Sample-Efficient Pretraining on Developmentally Plausible Corpora.

CoRR, 2024

Transformers Can Navigate Mazes With Multi-Step Prediction.

CoRR, 2024

Sense and Sensitivity: Evaluating the simulation of social dynamics via Large Language Models.

CoRR, 2024

The Root Shapes the Fruit: On the Persistence of Gender-Exclusive Harms in Aligned Language Models.

Anaelia Ovalle

CoRR, 2024

Are Female Carpenters like Blue Bananas? A Corpus Investigation of Occupation Gender Typicality.

Da Ju

Karen Ullrich

CoRR, 2024

Changing Answer Order Can Decrease MMLU Accuracy.

CoRR, 2024

Decomposed evaluations of geographic disparities in text-to-image models.

Abhishek Sureddy

Dishant Padalia

Nandhinee Periyakaruppa

Oindrila Saha

Zacharie Delpierre Coudert

Megan Richards

Polina Kirichenko

Melissa Hall

CoRR, 2024

The PRISM Alignment Project: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models.

CoRR, 2024

Introducing v0.5 of the AI Safety Benchmark from MLCommons.

Borhane Blili-Hamelin

Kurt Bollacker

Rishi Bommasani

Marisa Ferrara Boston

Joseph Marvin Imperial

Dinesh Jinenhally Naganna

Forough Poursabzi-Sangdeh

Alice Schoenauer Sebag

Elizabeth Anne Watkins

CoRR, 2024

[Call for Papers] The 2nd BabyLM Challenge: Sample-efficient pretraining on a developmentally plausible corpus.

CoRR, 2024

The Factorization Curse: Which Tokens You Predict Underlie the Reversal Curse and More.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

The PRISM Alignment Dataset: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models.

Rafael Mosquera Gómez

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Towards Geographic Inclusion in the Evaluation of Text-to-Image Models.

Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency, 2024

EmphAssess: a Prosodic Benchmark on Assessing Emphasis Transfer in Speech-to-Speech Models.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Compositional learning of functions in humans and machines.

Yanli Zhou

Brenden M. Lake

Proceedings of the 46th Annual Meeting of the Cognitive Science Society, 2024

Insights from the first BabyLM Challenge: Training sample-efficient language models on a developmentally plausible corpus.

Proceedings of the 46th Annual Meeting of the Cognitive Science Society, 2024

Are Female Carpenters like Blue Bananas? A Corpus Investigation of Occupation Gender Typicality.

Da Ju

Karen Ullrich

Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023

Grammatical Gender's Influence on Distributional Semantics: A Causal Perspective.

CoRR, 2023

Llama 2: Open Foundation and Fine-Tuned Chat Models.

Cristian Canton-Ferrer

CoRR, 2023

Weisfeiler and Lehman Go Measurement Modeling: Probing the Validity of the WL Test.

CoRR, 2023

Call for Papers - The BabyLM Challenge: Sample-efficient pretraining on a developmentally plausible corpus.

CoRR, 2023

The Gender-GAP Pipeline: A Gender-Aware Polyglot Pipeline for Gender Characterisation in 55 Languages.

Proceedings of the Eighth Conference on Machine Translation, 2023

DataPerf: Benchmarks for Data-Centric AI Development.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Robustness of Named-Entity Replacements for In-Context Learning.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

ROBBIE: Robust Bias Evaluation of Large Generative Language Models.

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

The Validity of Evaluation Results: Assessing Concurrence Across Compositionality Benchmarks.

Kaiser Sun

Dieuwke Hupkes

Proceedings of the 27th Conference on Computational Natural Language Learning, 2023

Language model acceptability judgements are not always robust to context.

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

A Latent-Variable Model for Intrinsic Probing.

Karolina Stanczak

Lucas Torroba Hennigen

Ryan Cotterell

Isabelle Augenstein

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022

Learning Transductions to Test Systematic Compositionality.

CoRR, 2022

DataPerf: Benchmarks for Data-Centric AI Development.

CoRR, 2022

"I'm sorry to hear that": finding bias in language models with a holistic descriptor dataset.

CoRR, 2022

On the Machine Learning of Ethical Judgments from Natural Language.

Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

"I'm sorry to hear that": Finding New Biases in Language Models with a Holistic Descriptor Dataset.

Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

The Curious Case of Absolute Position Embeddings.

Amirhossein Kazemnejad

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

Perturbation Augmentation for Fairer NLP.

Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

Winoground: Probing Vision and Language Models for Visio-Linguistic Compositionality.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Benchmarking Compositionality with Formal Languages.

Proceedings of the 29th International Conference on Computational Linguistics, 2022

Evaluating locality in NMT models.

Proceedings of the 44th Annual Meeting of the Cognitive Science Society, 2022

Analyzing Dynamic Adversarial Training Data in the Limit.

Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

Dynatask: A Framework for Creating Dynamic AI Benchmark Tasks.

William Gaviria Rojas

Peter Mattson

Douwe Kiela

Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, 2022

Investigating Failures of Automatic Translation in the Case of Unambiguous Gender.

Adi Renduchintala

Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

2021

On the Relationships Between the Grammatical Genders of Inanimate Nouns and Their Co-Occurring Adjectives and Verbs.

Trans. Assoc. Comput. Linguistics, 2021

A Word on Machine Ethics: A Response to Jiang et al. (2021).

CoRR, 2021

Hi, my name is Martha: Using names to measure and mitigate bias in generative dialogue models.

Eric Michael Smith

CoRR, 2021

Investigating Failures of Automatic Translation in the Case of Unambiguous Gender.

Adithya Renduchintala

CoRR, 2021

Sometimes We Want Translationese.

Prasanna Parthasarathi

Joelle Pineau

CoRR, 2021

Dynaboard: An Evaluation-As-A-Service Platform for Holistic Next-Generation Benchmarking.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Dynabench: Rethinking Benchmarking in NLP.

Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

Masked Language Modeling and the Distributional Hypothesis: Order Word Matters Pre-training for Little.

Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Sometimes We Want Ungrammatical Translations.

Prasanna Parthasarathi

Joelle Pineau

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

Generalising to German Plural Noun Classes, from the Perspective of a Recurrent Neural Network.

Proceedings of the 25th Conference on Computational Natural Language Learning, 2021

To what extent do human explanations of model behavior align with actual model behavior?

Proceedings of the Fourth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP, 2021

UnNatural Language Inference.

Prasanna Parthasarathi

Joelle Pineau

Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020

ANLIzing the Adversarial Natural Language Inference Dataset.

Tristan Thrush

Douwe Kiela

CoRR, 2020

SIGMORPHON 2020 Shared Task 0: Typologically Diverse Morphological Inflection.

Proceedings of the 17th SIGMORPHON Workshop on Computational Research in Phonetics, 2020

Pareto Probing: Trading Off Accuracy for Complexity.

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Measuring the Similarity of Grammatical Gender Systems by Comparing Partitions.

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Intrinsic Probing through Dimension Selection.

Lucas Torroba Hennigen

Ryan Cotterell

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Multi-Dimensional Gender Bias Classification.

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Queens are Powerful too: Mitigating Gender Bias in Dialogue Generation.

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Predicting Declension Class from Form and Meaning.

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

Information-Theoretic Probing for Linguistic Structure.

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

Adversarial NLI: A New Benchmark for Natural Language Understanding.

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

A Tale of a Probe and a Parser.

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

Are Natural Language Inference Models IMPPRESsive? Learning IMPlicature and PRESupposition.

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019

On the Idiosyncrasies of the Mandarin Chinese Classifier System.

Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

Quantifying the Semantic Core of Gender Systems.

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

2018

Do latent tree learning models identify meaningful structure in sentences?
