Iaroslav Chelombitko
ORCID: 0009-0003-6843-0453
According to our database, Iaroslav Chelombitko authored at least 5 papers between 2024 and 2026.
Bibliography
2026
Subword-Based Comparative Linguistics across 242 Languages Using Wikipedia Glottosets.
CoRR, January, 2026
SampoNLP: A Self-Referential Toolkit for Morphological Analysis of Subword Tokenizers.
CoRR, January, 2026
Compressed code: the hidden effects of quantization and distillation on programming tokens.
CoRR, January, 2026
2025
When repeats drive the vocabulary: a Byte-Pair Encoding analysis of T2T primate genomes.
CoRR, May, 2025
2024
Qtok: A Comprehensive Framework for Evaluating Multilingual Tokenizer Quality in Large Language Models.
CoRR, 2024