Iaroslav Chelombitko

Orcid: 0009-0003-6843-0453

According to our database, Iaroslav Chelombitko authored at least 5 papers between 2024 and 2026.

Collaborative distances:
  • Dijkstra number of six.
  • Erdős number of five.

Bibliography

2026
Subword-Based Comparative Linguistics across 242 Languages Using Wikipedia Glottosets.
CoRR, January, 2026

SampoNLP: A Self-Referential Toolkit for Morphological Analysis of Subword Tokenizers.
CoRR, January, 2026

Compressed code: the hidden effects of quantization and distillation on programming tokens.
CoRR, January, 2026

2025
When repeats drive the vocabulary: a Byte-Pair Encoding analysis of T2T primate genomes.
CoRR, May, 2025

2024
Qtok: A Comprehensive Framework for Evaluating Multilingual Tokenizer Quality in Large Language Models.
CoRR, 2024

