Taido Purason

Orcid: 0009-0001-8018-5695

According to our database, Taido Purason authored at least 15 papers between 2022 and 2026.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2026
EstLLM: Enhancing Estonian Capabilities in Multilingual LLMs via Continued Pretraining and Post-Training.
CoRR, March, 2026

Teaching Old Tokenizers New Words: Efficient Tokenizer Adaptation for Pretrained Models.
Proceedings of the Findings of the Association for Computational Linguistics: EACL 2026, 2026

2025
Teaching Old Tokenizers New Words: Efficient Tokenizer Adaptation for Pre-trained Models.
CoRR, December, 2025

Prune or Retrain: Optimizing the Vocabulary of Multilingual Models for Estonian.
CoRR, January, 2025

TartuNLP at WMT25 LLMs with Limited Resources for Slavic Languages Shared Task.
Proceedings of the Tenth Conference on Machine Translation, 2025

How Well do LLMs know Finno-Ugric Languages? A Systematic Assessment.
Proceedings of the Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies, 2025

LLMs for Extremely Low-Resource Finno-Ugric Languages.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025

2024
Teaching Llama a New Language Through Cross-Lingual Knowledge Transfer.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2024, 2024

To Err Is Human, but Llamas Can Learn It Too.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

SMUGRI-MT - Machine Translation System for Low-Resource Finno-Ugric Languages.
Proceedings of the 25th Annual Conference of the European Association for Machine Translation (Volume 2), 2024

Multilinguality or Back-translation? A Case Study with Estonian.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

2022
Open and Competitive Multilingual Neural Machine Translation in Production.
Balt. J. Mod. Comput., 2022

Teaching Unseen Low-resource Languages to Large Translation Models.
Proceedings of the Seventh Conference on Machine Translation, 2022

Multilingual Neural Machine Translation With the Right Amount of Sharing.
Proceedings of the 23rd Annual Conference of the European Association for Machine Translation, 2022

MTee: Open Machine Translation Platform for Estonian Government.
Proceedings of the 23rd Annual Conference of the European Association for Machine Translation, 2022

