Bettina Messmer

According to our database, Bettina Messmer authored at least 8 papers between 2023 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
Apertus: Democratizing Open and Compliant LLMs for Global Language Environments.
CoRR, September, 2025

FineWeb2: One Pipeline to Scale Them All - Adapting Pre-Training Data Processing to Every Language.
CoRR, June, 2025

Enhancing Multilingual LLM Pretraining with Model-Based Data Selection.
CoRR, February, 2025

2024
On-device Collaborative Language Modeling via a Mixture of Generalists and Specialists.
CoRR, 2024

Towards an empirical understanding of MoE design choices.
CoRR, 2024

Analyzing & Reducing the Need for Learning Rate Warmup in GPT Training.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Rotational Equilibrium: How Weight Decay Balances Learning Across Neural Networks.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

2023
Rotational Optimizers: Simple & Robust DNN Training.
CoRR, 2023

