Stella Biderman

Orcid: 0000-0001-8228-1042

According to our database, Stella Biderman authored at least 48 papers between 2020 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.


Bibliography

2024
Holographic Global Convolutional Networks for Long-Range Prediction Tasks in Malware Detection.
CoRR, 2024

On the Societal Impact of Open Foundation Models.
CoRR, 2024

KMMLU: Measuring Massive Multitask Language Understanding in Korean.
CoRR, 2024

Suppressing Pink Elephants with Direct Principle Feedback.
CoRR, 2024

The Case for Co-Designing Model Architectures with Hardware.
CoRR, 2024

Transformer-Based Models Are Not Yet Perfect At Learning to Emulate Structural Recursion.
CoRR, 2024

2023
Grokking Group Multiplication with Cosets.
CoRR, 2023

Llemma: An Open Language Model For Mathematics.
CoRR, 2023

Stay on topic with Classifier-Free Guidance.
CoRR, 2023

Can Transformers Learn to Solve Problems Recursively?
CoRR, 2023

Eliciting Latent Predictions from Transformers with the Tuned Lens.
CoRR, 2023

The Goldilocks of Pragmatic Understanding: Fine-Tuning Strategy Matters for Implicature Resolution by LLMs.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Emergent and Predictable Memorization in Large Language Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

LEACE: Perfect linear concept erasure in closed form.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling.
Proceedings of the International Conference on Machine Learning, 2023

Recasting Self-Attention with Holographic Reduced Representations.
Proceedings of the International Conference on Machine Learning, 2023

trlX: A Framework for Large Scale Reinforcement Learning from Human Feedback.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

GAIA Search: Hugging Face and Pyserini Interoperability for NLP Training Data Exploration.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2023

Crosslingual Generalization through Multitask Finetuning.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets.
Trans. Assoc. Comput. Linguistics, 2022

MP-NeRF: A massively parallel method for accelerating protein structure reconstruction from internal coordinates.
J. Comput. Chem., 2022

BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting.
CoRR, 2022

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model.
CoRR, 2022

What Language Model to Train if You Have One Million GPU Hours?
CoRR, 2022

Large language models are not zero-shot communicators.
CoRR, 2022

EleutherAI: Going Beyond "Open Science" to "Science in the Open".
CoRR, 2022

BigBIO: A Framework for Data-Centric Biomedical Natural Language Processing.
CoRR, 2022

Data Governance in the Age of Large-Scale Data-Driven Language Technology.
CoRR, 2022

GPT-NeoX-20B: An Open-Source Autoregressive Language Model.
CoRR, 2022

Documenting Geographically and Contextually Diverse Data Sources: The BigScience Catalogue of Language Data and Resources.
CoRR, 2022

Neural Language Models are Effective Plagiarists.
CoRR, 2022

Datasheet for the Pile.
CoRR, 2022

Data Governance in the Age of Large-Scale Data-Driven Language Technology.
Proceedings of the FAccT '22: 2022 ACM Conference on Fairness, Accountability, and Transparency, 2022

What Language Model to Train if You Have One Million GPU Hours?
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

VQGAN-CLIP: Open Domain Image Generation and Editing with Natural Language Guidance.
Proceedings of the Computer Vision - ECCV 2022, 2022

Fooling MOSS Detection with Pretrained Language Models.
Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2022

2021
Multitask Prompted Training Enables Zero-Shot Task Generalization.
CoRR, 2021

Cut the CARP: Fishing for zero-shot story evaluation.
CoRR, 2021

Towards a Formal Model of Narratives.
CoRR, 2021

The Pile: An 800GB Dataset of Diverse Text for Language Modeling.
CoRR, 2021

Magic: The Gathering Is Turing Complete.
Proceedings of the 10th International Conference on Fun with Algorithms, 2021

2020
Magic: the Gathering is as Hard as Arithmetic.
CoRR, 2020

Pitfalls in Machine Learning Research: Reexamining the Development Cycle.
Proceedings of the "I Can't Believe It's Not Better!" at NeurIPS Workshops, 2020

