Heidi Jauhiainen

Proceedings of the Annual International Conference of the Alliance of Digital Humanities Organizations, 2023

2022

Optimizing Naive Bayes for Arabic Dialect Identification.

Proceedings of the Seventh Arabic Natural Language Processing Workshop, 2022

Italian Language and Dialect Identification and Regional French Variety Detection using Adaptive Naive Bayes.

Proceedings of the Ninth Workshop on NLP for Similar Languages, Varieties and Dialects, 2022

HeLI-OTS, Off-the-shelf Language Identifier for Text.

Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

Encoding Hieroglyphic Texts.

Proceedings of the 6th Digital Humanities in the Nordic and Baltic Countries Conference (DHNB 2022), 2022

2021

Naive Bayes-based Experiments in Romanian Dialect Identification.

Bharathi Raja Chakravarthi

Proceedings of the Eighth Workshop on NLP for Similar Languages, Varieties and Dialects, 2021

Findings of the VarDial Evaluation Campaign 2021.

Proceedings of the Eighth Workshop on NLP for Similar Languages, Varieties and Dialects, 2021

2020

Uralic Language Identification (ULI) 2020 shared task dataset and the Wanca 2017 corpus.

CoRR, 2020

Uralic Language Identification (ULI) 2020 shared task dataset and the Wanca 2017 corpora.

Proceedings of the 7th Workshop on NLP for Similar Languages, Varieties and Dialects, 2020

Experiments in Language Variety Geolocation and Dialect Identification.

Proceedings of the 7th Workshop on NLP for Similar Languages, Varieties and Dialects, 2020

A Report on the VarDial Evaluation Campaign 2020.

Proceedings of the 7th Workshop on NLP for Similar Languages, Varieties and Dialects, 2020

Building Web Corpora for Minority Languages.

Proceedings of the 12th Web as Corpus Workshop, 2020

2019

Language model adaptation for language and dialect identification of text.

Nat. Lang. Eng., 2019

Language and Dialect Identification of Cuneiform Texts.

CoRR, 2019

2018

HeLI-based Experiments in Swiss German Dialect Identification.

Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects, 2018

HeLI-based Experiments in Discriminating Between Dutch and Flemish Subtitles.

Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects, 2018

Iterative Language Model Adaptation for Indo-Aryan Language Identification.

Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects, 2018

2017

Evaluating HeLI with Non-Linear Mappings.

Proceedings of the Fourth Workshop on NLP for Similar Languages, Varieties and Dialects, 2017

Evaluation of language identification methods using 285 languages.

Proceedings of the 21st Nordic Conference on Computational Linguistics, 2017

2016

HeLI, a Word-Based Backoff Method for Language Identification.

Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects, 2016

2015

Language Set Identification in Noisy Synthetic Multilingual Documents.
