Jason Phang

ORCID: 0000-0003-3522-1869

According to our database, Jason Phang authored at least 37 papers between 2018 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Investigating the Effectiveness of HyperTuning via Gisting.
CoRR, 2024

2023
Struc-Bench: Are Large Language Models Really Good at Generating Complex Structured Data?
CoRR, 2023

Two Failures of Self-Consistency in the Multi-Step Reasoning of LLMs.
CoRR, 2023

Tool Learning with Foundation Models.
CoRR, 2023

HyperTuning: Toward Adapting Large Language Models without Back-propagation.
Proceedings of the International Conference on Machine Learning, 2023

Pretraining Language Models with Human Preferences.
Proceedings of the International Conference on Machine Learning, 2023

Investigating Efficiently Extending Transformers for Long Input Summarization.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

What Do NLP Researchers Believe? Results of the NLP Community Metasurvey.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
What Language Model to Train if You Have One Million GPU Hours?
CoRR, 2022

Two-Turn Debate Doesn't Help Humans Answer Hard Reading Comprehension Questions.
CoRR, 2022

EleutherAI: Going Beyond "Open Science" to "Science in the Open".
CoRR, 2022

GPT-NeoX-20B: An Open-Source Autoregressive Language Model.
CoRR, 2022

Single-Turn Debate Does Not Help Humans Answer Hard Reading-Comprehension Questions.
CoRR, 2022

QuALITY: Question Answering with Long Input Texts, Yes!
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

SQuALITY: Building a Long-Document Summarization Dataset the Hard Way.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

What Language Model to Train if You Have One Million GPU Hours?
Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

BBQ: A hand-built bias benchmark for question answering.
Findings of the Association for Computational Linguistics: ACL 2022, 2022

2021
An interpretable classifier for high-resolution breast cancer screening images utilizing weakly supervised localization.
Medical Image Analysis, 2021

Reducing False-Positive Biopsies using Deep Neural Networks that Utilize both Local and Global Image Context of Screening Mammograms.
Journal of Digital Imaging, 2021

Adversarially Constructed Evaluation Sets Are More Challenging, but May Not Be Fair.
CoRR, 2021

The Pile: An 800GB Dataset of Diverse Text for Language Modeling.
CoRR, 2021

Fine-Tuned Transformers Show Clusters of Similar Representations Across Layers.
Proceedings of the Fourth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP, 2021

Comparing Test Sets with Item Response Theory.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
Deep Neural Networks Improve Radiologists' Performance in Breast Cancer Screening.
IEEE Transactions on Medical Imaging, 2020

Investigating and Simplifying Masking-based Saliency Methods for Model Interpretability.
CoRR, 2020

Reducing false-positive biopsies with deep neural networks that utilize local and global information in screening mammograms.
CoRR, 2020

Intermediate-Task Transfer Learning with Pretrained Models for Natural Language Understanding: When and Why Does It Work?
CoRR, 2020

English Intermediate-Task Training Improves Zero-Shot Cross-Lingual Transfer Too.
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing, 2020

jiant: A Software Toolkit for Research on General-Purpose Text Understanding Models.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2020

Intermediate-Task Transfer Learning with Pretrained Language Models: When and Why Does It Work?
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019
Do Attention Heads in BERT Track Syntactic Dependencies?
CoRR, 2019

Improving localization-based approaches for breast cancer screening exam classification.
CoRR, 2019

Screening Mammogram Classification with Prior Exams.
CoRR, 2019

Globally-Aware Multiple Instance Classifier for Breast Cancer Screening.
Proceedings of the Machine Learning in Medical Imaging - 10th International Workshop, 2019

Investigating BERT's Knowledge of Language: Five Analysis Methods with NPIs.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

2018
Sentence Encoders on STILTs: Supplementary Training on Intermediate Labeled-data Tasks.
CoRR, 2018

Unsupervised Sentence Compression using Denoising Auto-Encoders.
Proceedings of the 22nd Conference on Computational Natural Language Learning, 2018

