Robin Jia

Proceedings of the 2025 ACM Conference on Fairness, Accountability, and Transparency, 2025

Rethinking Backdoor Detection Evaluation for Language Models.

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

Promote, Suppress, Iterate: How Language Models Answer One-to-Many Factual Queries.

Tianyi Lorena Yan

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

TokenSmith: Streamlining Data Editing, Search, and Inspection for Large-Scale Language Model Training and Interpretability.

Mohammad Aflah Khan

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

Why Do Some Inputs Break Low-Bit LLM Quantization?

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

Mechanistic Interpretability of Emotion Inference in Large Language Models.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

Verify with Caution: The Pitfalls of Relying on Imperfect Factuality Metrics.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

Robust Data Watermarking in Language Models by Injecting Fictitious Knowledge.

Xinyue Cui

Swabha Swayamdipta

Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024

Language Models can Infer Action Semantics for Classical Planners from Environment Feedback.

CoRR, 2024

IsoBench: Benchmarking Multimodal Foundation Models on Isomorphic Representations.

CoRR, 2024

Pre-trained Large Language Models Use Fourier Features to Compute Addition.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Transformers Learn to Achieve Second-Order Convergence Rates for In-Context Linear Regression.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Do Localization Methods Actually Localize Memorized Data in LLMs? A Tale of Two Benchmarks.

Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

Efficient End-to-End Visual Document Understanding with Rationale Distillation.

Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

When Parts Are Greater Than Sums: Individual LLM Components Can Outperform Full Models.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Operationalizing Content Moderation "Accuracy" in the Digital Services Act.

Frederike Zufall

Proceedings of the Seventh AAAI/ACM Conference on AI, Ethics, and Society (AIES-24) - Full Archival Papers, October 21-23, 2024, San Jose, California, USA, 2024

Proving membership in LLM pretraining data via data watermarks.

Ryan Yixiang Wang

Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023

Does VLN Pretraining Work with Nonsensical or Irrelevant Instructions?

CoRR, 2023

Do Localization Methods Actually Localize Memorized Data in LLMs?

CoRR, 2023

Transformers Learn Higher-Order Optimization Methods for In-Context Learning: A Study with Linear Models.

CoRR, 2023

How Predictable Are Large Language Model Capabilities? A Case Study on BIG-bench.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Estimating Large Language Model Capabilities without Labeled Test Data.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

SCENE: Self-Labeled Counterfactuals for Extrapolating to Negative Examples.

Deqing Fu

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Chain-of-Questions Training with Latent Answers for Robust Multistep Question Answering.

Wang Zhu

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Benchmarking Long-tail Generalization with Likelihood Splits.

Proceedings of the Findings of the Association for Computational Linguistics: EACL 2023, 2023

Contrastive Novelty-Augmented Learning: Anticipating Outliers with Large Language Models.

Albert Xu

Xiang Ren

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Do Question Answering Modeling Improvements Hold Across Benchmarks?

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Are Sample-Efficient NLP Models More Robust?

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2023

Data Curation Alone Can Stabilize In-context Learning.

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022

Careful Data Curation Stabilizes In-context Learning.

CoRR, 2022

CoNAL: Anticipating Outliers with Large Language Models.

Albert Xu

Xiang Ren

CoRR, 2022

On the Robustness of Reading Comprehension Models to Entity Renaming.

Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

Models in the Loop: Aiding Crowdworkers with Generative Annotation Assistants.

Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

Knowledge Base Question Answering by Case-based Reasoning over Subgraphs.

Proceedings of the International Conference on Machine Learning, 2022

Generalization Differences between End-to-End and Neuro-Symbolic Vision-Language Reasoning Systems.

Wang Zhu

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

Analyzing Dynamic Adversarial Training Data in the Limit.

Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

On Continual Model Refinement in Out-of-Distribution Data Streams.

Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

Question Answering Infused Pre-training of General-Purpose Contextualized Representations.

Mike Lewis

Luke Zettlemoyer

Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

2021

Can Small and Synthetic Benchmarks Drive Modeling Innovation? A Retrospective Study of Question Answering Modeling Approaches.

CoRR, 2021

Dynaboard: An Evaluation-As-A-Service Platform for Holistic Next-Generation Benchmarking.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Swords: A Benchmark for Lexical Substitution with Improved Data Coverage and Quality.

Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

Dynabench: Rethinking Benchmarking in NLP.

Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

Masked Language Modeling and the Distributional Hypothesis: Order Word Matters Pre-training for Little.

Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Robustness and Adversarial Examples in Natural Language Processing.

Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: EMNLP 2021, 2021

Improving Question Answering Model Robustness with Synthetic Adversarial Data Generation.

Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

To what extent do human explanations of model behavior align with actual model behavior?

Proceedings of the Fourth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP, 2021

The statistical advantage of automatic NLG metrics at the system level.

Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

Evaluation Examples are not Equally Informative: How should that change NLP Leaderboards?

Pedro Rodriguez

Joe Barrow

Alexander Miserlis Hoyle

John P. Lalor

Jordan L. Boyd-Graber

Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

Do Explanations Help Users Detect Errors in Open-Domain QA? An Evaluation of Spoken vs. Visual Explanations.

Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021

2020

Building robust natural language processing systems.

PhD thesis, 2020

Human Evaluation of Spoken vs. Visual Explanations for Open-Domain QA.

CoRR, 2020

On the Importance of Adaptive Data Collection for Extremely Imbalanced Pairwise Tasks.

Stephen Mussmann

Percy Liang

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

With Little Power Comes Great Responsibility.

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Selective Question Answering under Domain Shift.

Amita Kamath

Percy Liang

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

Robust Encodings: A Framework for Combating Adversarial Typos.

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019

Document-Level N-ary Relation Extraction with Multiscale Representation Learning.
