Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

Language models scale reliably with over-training and on downstream tasks.

Samir Yitzhak Gadre

et al.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

BTS: Harmonizing Specialized Experts into a Generalist LLM.

Jakob Nicolaus Foerster

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

2024

Data-Centric Methods for Decentralizing Large Language Models

Suchin Gururangan

PhD thesis, 2024

Language models scale reliably with over-training and on downstream tasks.

CoRR, 2024

Information Flow Control in Machine Learning through Modular Model Architecture.

Proceedings of the 33rd USENIX Security Symposium, 2024

DataComp-LM: In search of the next generation of training sets for language models.

Khyathi Raghavi Chandu

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

LESS: Selecting Influential Data for Targeted Instruction Tuning.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

SILO Language Models: Isolating Legal Risk In a Nonparametric Datastore.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Breaking the Curse of Multilinguality with Cross-lingual Expert Language Models.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Time is Encoded in the Weights of Finetuned Language Models.

Kai Nylund

Suchin Gururangan

Noah A. Smith

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

AboutMe: Using Self-Descriptions in Webpages to Document the Effects of English Pretraining Data Filters.

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023

lo-fi: distributed fine-tuning without communication.

Trans. Mach. Learn. Res., 2023

SILO Language Models: Isolating Legal Risk In a Nonparametric Datastore.

CoRR, 2023

Scaling Expert Language Models with Unsupervised Domain Discovery.

CoRR, 2023

2022

Editing Models with Task Arithmetic.

CoRR, 2022

Branch-Train-Merge: Embarrassingly Parallel Training of Expert Language Models.

CoRR, 2022

Time Waits for No One! Analysis and Challenges of Temporal Misalignment.

Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

DEMix Layers: Disentangling Domains for Modular Language Modeling.

Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

Nearest Neighbor Zero-Shot Inference.

Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

M2D2: A Massively Multi-Domain Language Modeling Dataset.

Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

Whose Language Counts as High Quality? Measuring Language Ideologies in Text Data Selection.

Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

2021

Detoxifying Language Models Risks Marginalizing Minority Voices.

Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

Expected Validation Performance and Estimation of a Random Variable's Maximum.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

All That's 'Human' Is Not Gold: Evaluating Human Evaluation of Generated Text.

Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020

RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

Don't Stop Pretraining: Adapt Language Models to Domains and Tasks.

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019

Show Your Work: Improved Reporting of Experimental Results.

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Variational Pretraining for Semi-supervised Text Classification.

Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

2018

Annotation Artifacts in Natural Language Inference Data.
