Stella Biderman

Orcid: 0000-0001-8228-1042

According to our database, Stella Biderman authored at least 48 papers between 2020 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.


Bibliography

2024
Holographic Global Convolutional Networks for Long-Range Prediction Tasks in Malware Detection.
CoRR, 2024

On the Societal Impact of Open Foundation Models.
CoRR, 2024

KMMLU: Measuring Massive Multitask Language Understanding in Korean.
CoRR, 2024

Suppressing Pink Elephants with Direct Principle Feedback.
CoRR, 2024

The Case for Co-Designing Model Architectures with Hardware.
CoRR, 2024

Transformer-Based Models Are Not Yet Perfect At Learning to Emulate Structural Recursion.
CoRR, 2024

2023
Grokking Group Multiplication with Cosets.
CoRR, 2023

Llemma: An Open Language Model For Mathematics.
CoRR, 2023

Stay on topic with Classifier-Free Guidance.
CoRR, 2023

Can Transformers Learn to Solve Problems Recursively?
CoRR, 2023

Eliciting Latent Predictions from Transformers with the Tuned Lens.
CoRR, 2023

The Goldilocks of Pragmatic Understanding: Fine-Tuning Strategy Matters for Implicature Resolution by LLMs.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Emergent and Predictable Memorization in Large Language Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

LEACE: Perfect linear concept erasure in closed form.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling.
Proceedings of the International Conference on Machine Learning, 2023

Recasting Self-Attention with Holographic Reduced Representations.
Proceedings of the International Conference on Machine Learning, 2023

trlX: A Framework for Large Scale Reinforcement Learning from Human Feedback.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

GAIA Search: Hugging Face and Pyserini Interoperability for NLP Training Data Exploration.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2023

Crosslingual Generalization through Multitask Finetuning.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets.
Trans. Assoc. Comput. Linguistics, 2022

MP-NeRF: A massively parallel method for accelerating protein structure reconstruction from internal coordinates.
J. Comput. Chem., 2022

BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting.
CoRR, 2022

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model.
CoRR, 2022

What Language Model to Train if You Have One Million GPU Hours?
CoRR, 2022

Large language models are not zero-shot communicators.
CoRR, 2022

EleutherAI: Going Beyond "Open Science" to "Science in the Open".
CoRR, 2022

BigBIO: A Framework for Data-Centric Biomedical Natural Language Processing.
CoRR, 2022

Data Governance in the Age of Large-Scale Data-Driven Language Technology.
CoRR, 2022

GPT-NeoX-20B: An Open-Source Autoregressive Language Model.
CoRR, 2022

Documenting Geographically and Contextually Diverse Data Sources: The BigScience Catalogue of Language Data and Resources.
CoRR, 2022

Neural Language Models are Effective Plagiarists.
CoRR, 2022

Datasheet for the Pile.
CoRR, 2022

Data Governance in the Age of Large-Scale Data-Driven Language Technology.
Proceedings of the FAccT '22: 2022 ACM Conference on Fairness, Accountability, and Transparency, 2022

What Language Model to Train if You Have One Million GPU Hours?
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

VQGAN-CLIP: Open Domain Image Generation and Editing with Natural Language Guidance.
Proceedings of the Computer Vision - ECCV 2022, 2022

Fooling MOSS Detection with Pretrained Language Models.
Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2022

2021
Multitask Prompted Training Enables Zero-Shot Task Generalization.
CoRR, 2021

Cut the CARP: Fishing for zero-shot story evaluation.
CoRR, 2021

Towards a Formal Model of Narratives.
CoRR, 2021

The Pile: An 800GB Dataset of Diverse Text for Language Modeling.
CoRR, 2021

Magic: The Gathering Is Turing Complete.
Proceedings of the 10th International Conference on Fun with Algorithms, 2021

2020
Magic: the Gathering is as Hard as Arithmetic.
CoRR, 2020

Pitfalls in Machine Learning Research: Reexamining the Development Cycle.
Proceedings of the "I Can't Believe It's Not Better!" at NeurIPS Workshops, 2020

