Steinþór Steingrímsson

ORCID: 0000-0002-9776-9507

According to our database, Steinþór Steingrímsson authored at least 33 papers between 2015 and 2025.

Bibliography

2025
Preliminary Ranking of WMT25 General Machine Translation Systems.
CoRR, August, 2025

MC-19: A Corpus of 19th Century Icelandic Texts.
Proceedings of the Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies, 2025

Playing by the Rules: A Benchmark Set for Standardized Icelandic Orthography.
Proceedings of the Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies, 2025

2024
Preliminary WMT24 Ranking of General MT Systems and LLMs.
CoRR, 2024

Cogs in a Machine, Doing What They're Meant to Do - the AMI Submission to the WMT24 General Translation Task.
Proceedings of the Ninth Conference on Machine Translation, 2024

Killing Two Flies with One Stone: An Attempt to Break LLMs Using English-Icelandic Idioms and Proper Names.
Proceedings of the Ninth Conference on Machine Translation, 2024

2023
The ParlaMint corpora of parliamentary proceedings.
Lang. Resour. Evaluation, March, 2023

A Sentence Alignment Approach to Document Alignment and Multi-faceted Filtering for Curating Parallel Sentence Pairs from Web-crawled Data.
Proceedings of the Eighth Conference on Machine Translation, 2023

Filtering Matters: Experiments in Filtering Training Sets for Machine Translation.
Proceedings of the 24th Nordic Conference on Computational Linguistics, 2023

Gamli - Icelandic Oral History Corpus: Design, Collection and Evaluation.
Proceedings of the 24th Nordic Conference on Computational Linguistics, 2023

Generating Errors: OCR Post-Processing for Icelandic.
Proceedings of the 24th Nordic Conference on Computational Linguistics, 2023

Evaluating a Universal Dependencies Conversion Pipeline for Icelandic.
Proceedings of the 24th Nordic Conference on Computational Linguistics, 2023

SentAlign: Accurate and Scalable Sentence Alignment.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

2022
IceBATS: An Icelandic Adaptation of the Bigger Analogy Test Set.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

Evolving Large Text Corpora: Four Versions of the Icelandic Gigaword Corpus.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

2021
CombAlign: a Tool for Obtaining High-Quality Word Alignments.
Proceedings of the 23rd Nordic Conference on Computational Linguistics, 2021

The Icelandic Word Web: A language technology-focused redesign of a lexicosemantic database.
Proceedings of the 23rd Nordic Conference on Computational Linguistics, 2021

Effective Bitext Extraction From Comparable Corpora Using a Combination of Three Different Approaches.
Proceedings of the 14th Workshop on Building and Using Comparable Corpora, 2021

2020
Experimenting with Different Machine Translation Models in Medium-Resource Settings.
Proceedings of the 23rd International Conference on Text, Speech, and Dialogue, 2020

Facilitating Corpus Usage: Making Icelandic Corpora More Accessible for Researchers and Language Users.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Language Technology Programme for Icelandic 2019-2023.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Samrómur: Crowd-sourcing Data Collection for Icelandic Speech Recognition.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Constructing Multimodal Language Learner Texts Using LARA: Experiences with Nine Languages.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Effectively Aligning and Filtering Parallel Corpora under Sparse Data Conditions.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop, 2020

2019
Augmenting a BiLSTM Tagger with a Morphological Lexicon and a Lexical Category Identification Step.
Proceedings of the International Conference on Recent Advances in Natural Language Processing, 2019

DIM: The Database of Icelandic Morphology.
Proceedings of the 22nd Nordic Conference on Computational Linguistics, NoDaLiDa 2019, Turku, Finland, September 30, 2019

Compiling and Filtering ParIce: An English-Icelandic Parallel Corpus.
Proceedings of the 22nd Nordic Conference on Computational Linguistics, NoDaLiDa 2019, Turku, Finland, September 30, 2019

2018
Risamálheild: A Very Large Icelandic Text Corpus.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

Digitizing the Icelandic-Danish Blöndal Dictionary.
Proceedings of the Digital Humanities in the Nordic Countries 3rd Conference, 2018

2017
Málrómur: A Manually Verified Corpus of Recorded Icelandic Speech.
Proceedings of the 21st Nordic Conference on Computational Linguistics, 2017

2015
Analysing Inconsistencies and Errors in PoS Tagging in two Icelandic Gold Standards.
Proceedings of the 20th Nordic Conference of Computational Linguistics, 2015