Krister Lindén

Orcid: 0000-0003-2337-303X

According to our database, Krister Lindén authored at least 89 papers between 2004 and 2023.

Bibliography

2023
Lahjoita puhetta: a large-scale corpus of spoken Finnish with some benchmarks.
Lang. Resour. Evaluation, September, 2023

FinnSentiment: a Finnish social media corpus for sentiment polarity annotation.
Lang. Resour. Evaluation, 2023

Ethically Archiving a Hard-to-Access Massive Research Data Set in the Language Bank of Finland: The Finnish Dark Web Marketplace Corpus (FINDarC).
Proceedings of the Conference on Technology Ethics 2023 - Tethics 2023, 2023

Tuning HeLI-OTS for Guarani-Spanish Code Switching Analysis.
Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2023) co-located with the Conference of the Spanish Society for Natural Language Processing (SEPLN 2023), 2023

A Neural Pipeline for POS-tagging and Lemmatizing Cuneiform Languages.
Proceedings of the Ancient Language Processing Workshop, 2023

2022
Language Report Finnish.
Proceedings of the European Language Equality, 2022

Lahjoita puhetta - a large-scale corpus of spoken Finnish with some benchmarks.
CoRR, 2022

Optimizing Naive Bayes for Arabic Dialect Identification.
Proceedings of the Seventh Arabic Natural Language Processing Workshop, 2022

HeLI-OTS, Off-the-shelf Language Identifier for Text.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

Language Identification as part of the Text Corpus Creation Pipeline at the Language Bank of Finland.
Proceedings of the 6th Digital Humanities in the Nordic and Baltic Countries Conference (DHNB 2022), 2022

Lemmatizing and POS-tagging Akkadian with BabyLemmatizer and Dictionary-Based Post-Correction.
Proceedings of the Selected Papers from the CLARIN Annual Conference 2022, 2022

EU Data Governance Act: Outlining a Potential Role for CLARIN.
Proceedings of the Selected Papers from the CLARIN Annual Conference 2022, 2022

The Pipeline for Publishing Resources in the Language Bank of Finland.
Proceedings of the Selected Papers from the CLARIN Annual Conference 2022, 2022

2021
Naive Bayes-based Experiments in Romanian Dialect Identification.
Proceedings of the Eighth Workshop on NLP for Similar Languages, Varieties and Dialects, 2021

Findings of the VarDial Evaluation Campaign 2021.
Proceedings of the Eighth Workshop on NLP for Similar Languages, Varieties and Dialects, 2021

The Interaction of Personal Data, Intellectual Property and Freedom of Expression in the Context of Language Research.
Proceedings of the Selected Papers from the CLARIN Annual Conference 2021, 2021

Legal Issues Related to the Use of Twitter Data in Language Research.
Proceedings of the Selected Papers from the CLARIN Annual Conference 2021, 2021

2020
A Finnish news corpus for named entity recognition.
Lang. Resour. Evaluation, 2020

Optical character recognition with neural networks and post-correction with finite state methods.
Int. J. Document Anal. Recognit., 2020

Uralic Language Identification (ULI) 2020 shared task dataset and the Wanca 2017 corpus.
CoRR, 2020

Uralic Language Identification (ULI) 2020 shared task dataset and the Wanca 2017 corpora.
Proceedings of the 7th Workshop on NLP for Similar Languages, Varieties and Dialects, 2020

Experiments in Language Variety Geolocation and Dialect Identification.
Proceedings of the 7th Workshop on NLP for Similar Languages, Varieties and Dialects, 2020

A Report on the VarDial Evaluation Campaign 2020.
Proceedings of the 7th Workshop on NLP for Similar Languages, Varieties and Dialects, 2020

Akkadian Treebank for early Neo-Assyrian Royal Inscriptions.
Proceedings of the 19th International Workshop on Treebanks and Linguistic Theories, 2020

BabyFST - Towards a Finite-State Based Computational Model of Ancient Babylonian.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Automated Phonological Transcription of Akkadian Cuneiform Text.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020


Improving Word Association Measures in Repetitive Corpora with Context Similarity Weighting.
Proceedings of the 12th International Joint Conference on Knowledge Discovery, 2020

Sharing is Caring: a Legal Perspective on Sharing Language Data Containing Personal Data and the Division of Liability between Researchers and Research Organisations.
Proceedings of the Selected Papers from the CLARIN Annual Conference 2020, 2020

Building Web Corpora for Minority Languages.
Proceedings of the 12th Web as Corpus Workshop, 2020

2019
Language model adaptation for language and dialect identification of text.
Nat. Lang. Eng., 2019

FinnTransFrame: translating frames in the FinnFrameNet project.
Lang. Resour. Evaluation, 2019

Automatic Language Identification in Texts: A Survey.
J. Artif. Intell. Res., 2019

Language and Dialect Identification of Cuneiform Texts.
CoRR, 2019

Improving OCR of historical newspapers and journals published in Finland.
Proceedings of the 3rd International Conference on Digital Access to Textual Cultural Heritage, 2019

A CLARIN Contractual Framework for Sharing Personal Data for Scientific Research.
Proceedings of the Selected Papers from the CLARIN Annual Conference 2019, Leipzig, Germany, September 30, 2019

The Impact of Copyright and Personal Data Laws on the Creation and Use of Models for Language Technologies.
Proceedings of the Selected Papers from the CLARIN Annual Conference 2019, Leipzig, Germany, September 30, 2019

2018
The Dagstuhl Perspectives Workshop on Performance Modeling and Prediction.
SIGIR Forum, 2018

From Evaluating to Forecasting Performance: How to Turn Information Retrieval, Natural Language Processing and Recommender Systems into Predictive Sciences (Dagstuhl Perspectives Workshop 17442).
Dagstuhl Manifestos, 2018

HeLI-based Experiments in Swiss German Dialect Identification.
Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects, 2018

HeLI-based Experiments in Discriminating Between Dutch and Flemish Subtitles.
Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects, 2018

Iterative Language Model Adaptation for Indo-Aryan Language Identification.
Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects, 2018

Rethinking Summarization and Storytelling for Modern Social Multimedia.
Proceedings of the MultiMedia Modeling - 24th International Conference, 2018

Processing personal data without the consent of the data subject for the development and use of language resources.
Proceedings of the Selected papers from the CLARIN Annual Conference 2018, 2018

2017
Evaluating HeLI with Non-Linear Mappings.
Proceedings of the Fourth Workshop on NLP for Similar Languages, Varieties and Dialects, 2017

Evaluation of language identification methods using 285 languages.
Proceedings of the 21st Nordic Conference on Computational Linguistics, 2017

OCR and post-correction of historical Finnish texts.
Proceedings of the 21st Nordic Conference on Computational Linguistics, 2017

Implementation of an Open Science Policy in the context of management of CLARIN language resources: a need for changes?
Proceedings of the Selected papers from the CLARIN Annual Conference 2017, 2017

2016
FinnPos: an open-source morphological tagging and lemmatization toolkit for Finnish.
Lang. Resour. Evaluation, 2016

The strategic impact of META-NET on the regional, national and international level.
Lang. Resour. Evaluation, 2016

HeLI, a Word-Based Backoff Method for Language Identification.
Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects, 2016

In-Document Adaptation for a Human Guided Automatic Transcription Service.
Proceedings of the Speech and Computer - 18th International Conference, 2016

2015
Using HFST - Helsinki Finite-State Technology for Recognizing Semantic Frames.
Proceedings of the Systems and Frameworks for Computational Morphology, 2015

Extracting Semantic Frames using hfst-pmatch.
Proceedings of the 20th Nordic Conference of Computational Linguistics, 2015

Automated Lossless Hyper-Minimization for Morphological Analyzers.
Proceedings of the 12th International Conference on Finite-State Methods and Natural Language Processing, 2015

The Regulatory and Contractual Framework as an Integral Part of the CLARIN Infrastructure.
Proceedings of the Selected Papers from the CLARIN Annual Conference 2015, 2015

Language Set Identification in Noisy Synthetic Multilingual Documents.
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2015

2014
Is it possible to create a very large wordnet in 100 days? An evaluation.
Lang. Resour. Evaluation, 2014

CLARA: A New Generation of Researchers in Common Language Resources and Their Applications.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

HFST-SweNER - A New NER Resource for Swedish.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

Heuristic Hyper-minimization of Finite State Lexicons.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

Accelerated Estimation of Conditional Random Fields using a Pseudo-Likelihood-inspired Perceptron Variant.
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, 2014

State-of-the-Art in Weighted Finite-State Spell-Checking.
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2014

Part-of-Speech Tagging using Conditional Random Fields: Exploiting Sub-Label Dependencies for Improved Accuracy.
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014

2013
Using HFST for Creating Computational Linguistic Applications.
Proceedings of the Computational Linguistics - Applications, 2013

HFST - A System for Creating NLP Tools.
Proceedings of the Systems and Frameworks for Computational Morphology, 2013

Baltic and Nordic Parts of the European Linguistic Infrastructure.
Proceedings of the 19th Nordic Conference of Computational Linguistics, 2013

Nordic and Baltic wordnets aligned and compared through "WordTies".
Proceedings of the 19th Nordic Conference of Computational Linguistics, 2013

2012
Specifying Treebanks, Outsourcing Parsebanks: FinnTreeBank 3.
Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012

Creation of an Open Shared Language Resource Repository in the Nordic and Baltic Countries.
Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012

Representing the Translation Relation in a Bilingual Wordnet.
Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012

Predictive Text Entry for Agglutinative Languages Using Unsupervised Morphological Segmentation.
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2012

Extending and Updating the Finnish Wordnet.
Proceedings of the Shall We Play the Festschrift Game?, 2012

2011
HFST - Framework for Compiling and Applying Morphologies.
Proceedings of the Systems and Frameworks for Computational Morphology, 2011

Combining Statistical Models for POS Tagging using Finite-State Calculus.
Proceedings of the 18th Nordic Conference of Computational Linguistics, 2011

Do wordnets also improve human performance on NLP tasks?
Proceedings of the 18th Nordic Conference of Computational Linguistics, 2011

META-NORD: Towards Sharing of Language Resources in Nordic and Baltic Countries.
Proceedings of the Workshop on Language Resources, 2011

2010
Part-of-Speech Tagging Using Parallel Weighted Finite-State Transducers.
Proceedings of the Advances in Natural Language Processing, 2010

Building and Using Existing Hunspell Dictionaries and TeX Hyphenators as Finite-State Automata.
Proceedings of the International Multiconference on Computer Science and Information Technology, 2010

2009
Corpus-Based Lexeme Ranking for Morphological Guessers.
Proceedings of the State of the Art in Computational Morphology, 2009

HFST Tools for Morphology - An Efficient Open-Source Package for Construction of Morphological Analyzers.
Proceedings of the State of the Art in Computational Morphology, 2009

Conflict Resolution Using Weighted Rules in HFST-TWOLC.
Proceedings of the 17th Nordic Conference of Computational Linguistics, 2009

Corpus-based Paradigm Selection for Morphological Entries.
Proceedings of the 17th Nordic Conference of Computational Linguistics, 2009

Weighted Finite-State Morphological Analysis of Finnish Compounding with HFST-LEXC.
Proceedings of the 17th Nordic Conference of Computational Linguistics, 2009

Guessers for Finite-State Transducer Lexicons.
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2009

2008
A Probabilistic Model for Guessing Base Forms of New Words by Analogy.
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2008

2006
Multilingual modeling of cross-lingual spelling variants.
Inf. Retr., 2006

2004
Evaluation of Linguistic Features for Word Sense Disambiguation with Self-Organized Document Maps.
Comput. Humanit., 2004

Finding Cross-Lingual Spelling Variants.
Proceedings of the String Processing and Information Retrieval, 2004

