Jesse Dodge

Affiliations:
  • Allen Institute for AI, Seattle, WA, USA


According to our database, Jesse Dodge authored at least 44 papers between 2012 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
OLMo: Accelerating the Science of Language Models.
CoRR, 2024

Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research.
CoRR, 2024

AboutMe: Using Self-Descriptions in Webpages to Document the Effects of English Pretraining Data Filters.
CoRR, 2024

2023
Paloma: A Benchmark for Evaluating Language Model Fit.
CoRR, 2023

Catwalk: A Unified Language Model Evaluation Framework for Many Datasets.
CoRR, 2023

What's In My Big Data?
CoRR, 2023

Language Models Hallucinate, but May Excel at Fact Verification.
CoRR, 2023

The Rise of Open Science: Tracking the Evolution and Perceived Value of Data and Methods Link-Sharing Practices.
CoRR, 2023

Efficiency Pentathlon: A Standardized Arena for Efficiency Evaluation.
CoRR, 2023

Surveying (Dis)Parities and Concerns of Compute Hungry NLP Research.
CoRR, 2023

Evaluating the Social Impact of Generative AI Systems in Systems and Society.
CoRR, 2023

Multimodal C4: An Open, Billion-scale Corpus of Images Interleaved with Text.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

AdapterSoup: Weight Averaging to Improve Generalization of Pretrained Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: EACL 2023, 2023

Stubborn Lexical Bias in Data and Models.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

Reproducibility in NLP: What Have We Learned from the Checklist?
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

Words as Gatekeepers: Measuring Discipline-specific Terms and Meanings in Scholarly Publications.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

2022
Efficient and Equitable Natural Language Processing in the Age of Deep Learning (Dagstuhl Seminar 22232).
Dagstuhl Reports, 2022

Data Governance in the Age of Large-Scale Data-Driven Language Technology.
CoRR, 2022

Findings of the WMT'22 Shared Task on Large-Scale Machine Translation Evaluation for African Languages.
Proceedings of the Seventh Conference on Machine Translation, 2022

Modeling the Machine Learning Multiverse.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Efficient Hierarchical Domain Adaptation for Pretrained Language Models.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

Staged Training for Transformer Language Models.
Proceedings of the International Conference on Machine Learning, 2022

Data Governance in the Age of Large-Scale Data-Driven Language Technology.
Proceedings of the FAccT '22: 2022 ACM Conference on Fairness, Accountability, and Transparency, Seoul, Republic of Korea, June 21, 2022

Measuring the Carbon Intensity of AI in Cloud Instances.
Proceedings of the FAccT '22: 2022 ACM Conference on Fairness, Accountability, and Transparency, Seoul, Republic of Korea, June 21, 2022

Towards Reproducible Machine Learning Research in Natural Language Processing.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, 2022

2021
Documenting the English Colossal Clean Crawled Corpus.
CoRR, 2021

Documenting Large Webtext Corpora: A Case Study on the Colossal Clean Crawled Corpus.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Expected Validation Performance and Estimation of a Random Variable's Maximum.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

Competency Problems: On Finding and Removing Artifacts in Language Data.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

2020
Fine-Tuning Pretrained Language Models: Weight Initializations, Data Orders, and Early Stopping.
CoRR, 2020

Green AI.
Commun. ACM, 2020

The Right Tool for the Job: Matching Model and Instance Complexities.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019
RNN Architecture Learning with Sparse Regularization.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Show Your Work: Improved Reporting of Experimental Results.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

2017
Random Search for Hyperparameters using Determinantal Point Processes.
CoRR, 2017

2016
Large Scale Retrieval and Generation of Image Descriptions.
Int. J. Comput. Vis., 2016

Evaluating Prerequisite Qualities for Learning End-to-End Dialog Systems.
Proceedings of the 4th International Conference on Learning Representations, 2016

Key-Value Memory Networks for Directly Reading Documents.
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016

2015
Retrofitting Word Vectors to Semantic Lexicons.
Proceedings of the NAACL HLT 2015, The 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, Colorado, USA, May 31, 2015

2014
CMU: Arc-Factored, Discriminative Semantic Dependency Parsing.
Proceedings of the 8th International Workshop on Semantic Evaluation, 2014

Context-dependent Semantic Parsing for Time Expressions.
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014

2012
Detecting Visual Text.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, 2012

Midge: Generating Image Descriptions From Computer Vision Detections.
Proceedings of the EACL 2012, 2012

Understanding and predicting importance in images.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

