Mitchell Wortsman

According to our database, Mitchell Wortsman authored at least 35 papers between 2019 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2025

Language models scale reliably with over-training and on downstream tasks.

et al.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

2024

Robust and reliable large-scale transfer learning

Mitchell Wortsman

PhD thesis, 2024

Detecting clinical medication errors with AI enabled wearable cameras.

npj Digit. Medicine, 2024

Language models scale reliably with over-training and on downstream tasks.

CoRR, 2024

Resolving Discrepancies in Compute-Optimal Scaling of Language Models.

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

DataComp-LM: In search of the next generation of training sets for language models.

Khyathi Raghavi Chandu

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Scaling Exponents Across Parameterizations and Optimizers.

Jascha Sohl-Dickstein

Leslie Pack Kaelbling

Jaehoon Lee

Jeffrey Pennington

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Small-scale proxies for large-scale Transformer training instabilities.

Jascha Sohl-Dickstein

Proceedings of the Twelfth International Conference on Learning Representations, 2024

OLMo: Accelerating the Science of Language Models.

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023

lo-fi: distributed fine-tuning without communication.

Trans. Mach. Learn. Res., 2023

Replacing softmax with ReLU in Vision Transformers.

CoRR, 2023

OpenFlamingo: An Open-Source Framework for Training Large Autoregressive Vision-Language Models.

CoRR, 2023

The Role of Pre-training Data in Transfer Learning.

CoRR, 2023

Stable and low-precision training for large-scale vision-language models.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

DataComp: In search of the next generation of multimodal datasets.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Editing models with task arithmetic.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

CoWs on Pasture: Baselines and Benchmarks for Language-Driven Zero-Shot Object Navigation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Reproducible Scaling Laws for Contrastive Language-Image Learning.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

Editing Models with Task Arithmetic.

CoRR, 2022

CLIP on Wheels: Zero-Shot Object Navigation as Object Localization and Exploration.

CoRR, 2022

LAION-5B: An open large-scale dataset for training next generation image-text models.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Quality Not Quantity: On the Interaction between Dataset Design and Robustness of CLIP.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Patching open-vocabulary models by interpolating weights.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time.

Raphael Gontijo Lopes

Proceedings of the International Conference on Machine Learning, 2022

Data Determines Distributional Robustness in Contrastive Language Image Pre-training (CLIP).

Proceedings of the International Conference on Machine Learning, 2022

Exploring The Landscape of Distributional Robustness for Question Answering Models.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

Robust fine-tuning of zero-shot models.

Raphael Gontijo Lopes

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

Robust fine-tuning of zero-shot models.

CoRR, 2021

Learning Neural Network Subspaces.

Proceedings of the 38th International Conference on Machine Learning, 2021

2020

Deconstructing the Structure of Sparse Neural Networks.

Maxwell Van Gelder

Mitchell Wortsman

Kiana Ehsani

CoRR, 2020

Supermasks in Superposition.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Soft Threshold Weight Reparameterization for Learnable Sparsity.

Proceedings of the 37th International Conference on Machine Learning, 2020

What's Hidden in a Randomly Weighted Neural Network?

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019

Discovering Neural Wirings.

Mitchell Wortsman

Ali Farhadi

Mohammad Rastegari

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Learning to Learn How to Learn: Self-Adaptive Visual Navigation Using Meta-Learning.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019
