Tomasz Limisiewicz
ORCID: 0000-0003-3809-2580
According to our database, Tomasz Limisiewicz authored at least 13 papers between 2020 and 2024.
Bibliography
2024
MYTE: Morphology-Driven Byte Encoding for Better and Fairer Multilingual Language Modeling.
CoRR, 2024
CoRR, 2024
2023
Exploring the Impact of Training Data Distribution and Subword Tokenization on Gender Bias in Machine Translation.
Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, 2023
Tokenization Impacts Multilingual Language Modeling: Assessing Vocabulary Allocation and Overlap Across Languages.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023
2022
You Can Have Your Data and Balance It Too: Towards Balanced and Efficient Multilingual Models.
CoRR, 2022
Don't Forget About Pronouns: Removing Gender Bias in Language Models Without Losing Factual Gender Information.
CoRR, 2022
A Balanced Data Approach for Evaluating Cross-Lingual Transfer: Mapping the Linguistic Blood Bank.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022
2021
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021
2020
Proceedings of the Fifth Conference on Machine Translation, 2020
Proceedings of the 20th Conference Information Technologies, 2020
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020