Bettina Messmer

According to our database, Bettina Messmer authored at least 9 papers between 2023 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
Toward Cross-Lingual Quality Classifiers for Multilingual Pretraining Data Selection.
CoRR, April, 2026

2025
FineWeb2: One Pipeline to Scale Them All - Adapting Pre-Training Data Processing to Every Language.
CoRR, June, 2025

Enhancing Multilingual LLM Pretraining with Model-Based Data Selection.
CoRR, February, 2025

On-Device Collaborative Language Modeling via a Mixture of Generalists and Specialists.
Proceedings of the Forty-second International Conference on Machine Learning, 2025

2024
On-device Collaborative Language Modeling via a Mixture of Generalists and Specialists.
CoRR, 2024

Towards an empirical understanding of MoE design choices.
CoRR, 2024

Analyzing & Reducing the Need for Learning Rate Warmup in GPT Training.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Rotational Equilibrium: How Weight Decay Balances Learning Across Neural Networks.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

2023
Rotational Optimizers: Simple & Robust DNN Training.
CoRR, 2023

