Nathan Godey
According to our database1,
Nathan Godey
authored at least 10 papers
between 2022 and 2025.
Collaborative distances:
Collaborative distances:
Timeline
Legend:
Book In proceedings Article PhD thesis Dataset OtherLinks
On csauthors.net:
Bibliography
2025
Biomed-Enriched: A Biomedical Dataset Enriched with LLMs for Pretraining and Extracting Rare and Hidden Content.
CoRR, June, 2025
CoRR, March, 2025
2024
Improving Representations for Language Modeling. (Amélioration des représentations pour la modélisation du langage).
PhD thesis, 2024
Why do small language models underperform? Studying Language Model Saturation via the Softmax Bottleneck.
CoRR, 2024
Proceedings of the Twelfth International Conference on Learning Representations, 2024
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics, 2024
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024
2023
2022
MANTa: Efficient Gradient-Based Tokenization for Robust End-to-End Language Modeling.
CoRR, 2022
MANTa: Efficient Gradient-Based Tokenization for End-to-End Robust Language Modeling.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022