Kenneth C. Enevoldsen

CoRR, October, 2025

Continuous sentiment scores for literary and multilingual contexts.

CoRR, August, 2025

Dynaword: From One-shot to Continuously Developed Datasets.

Kristian Nørgaard Jensen

CoRR, August, 2025

Turftopic: Topic Modelling with Contextual Representations from Sentence Transformers.

Ross Deans Kristensen-McLachlan

Jan Kostkan

Roberta Rocca

J. Open Source Softw., July, 2025

Maintaining MTEB: Towards Long Term Usability and Reproducibility of Embedding Benchmarks.

CoRR, June, 2025

MIEB: Massive Image Embedding Benchmark.

Niklas Muennighoff

CoRR, April, 2025

MMTEB: Massive Multilingual Text Embedding Benchmark.

Hippolyte Gisserot-Boukhlef

Lester James V. Miranda

CoRR, February, 2025

Encoder vs Decoder: Comparative Analysis of Encoder and Decoder Language Models on Multilingual NLU Tasks.

Dan Saattrup Nielsen

Peter Schneider-Kamp

Proceedings of the Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies, 2025

topicwizard - a Modern, Model-agnostic Framework for Topic Model Visualization and Interpretation.

Proceedings of the 8th International Conference on Natural Language and Speech Processing, 2025

MMTEB: Massive Multilingual Text Embedding Benchmark.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

S³ - Semantic Signal Separation.

Jan Kostkan

Arnault-Quentin Vermillet

Ross Deans Kristensen-McLachlan

Roberta Rocca

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024

Epistemic consequences of unfair tools.

Ida Marie Schytt Lassen

Mina Almasi

Jonathan Hvithamar Rystrøm

Digit. Scholarsh. Humanit., 2024

Exposing Assumptions in AI Benchmarks through Cognitive Modelling.

CoRR, 2024

S³ - Semantic Signal Separation.

Arnault-Quentin Vermillet

Jan Kostkan

Roberta Rocca

CoRR, 2024

DANSK and DaCy 2.6.0: Domain Generalization of Danish Named Entity Recognition.

Emil Trenckner Jessen

Rebekah Baglini

CoRR, 2024

The Scandinavian Embedding Benchmarks: Comprehensive Assessment of Multilingual and Monolingual Text Embedding.

Niklas Muennighoff

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

2023

TextDescriptives: A Python package for calculating a large variety of metrics from text.

Ludvig Renbo Olsen

J. Open Source Softw., June, 2023

timeseriesflattener: A Python package for summarizing features from (medical) time series.

Martin Bernstorff

Jakob Damgaard

Andreas Danielsen

J. Open Source Softw., March, 2023

Augmenty: A Python Library for Structured Text Augmentation.

CoRR, 2023

Danish Foundation Models.

Martin Carsten Nielsen

Martin Bernstorff

Rasmus Larsen

Peter B. Jørgensen

Malte Højmark-Bertelsen

Per Møldrup-Dalum

CoRR, 2023

Embed-Search-Align: DNA Sequence Alignment using Transformer Models.

Pavan Holur

CoRR, 2023

TextDescriptives: A Python package for calculating a large variety of statistics from text.

CoRR, 2023

DanSumT5: Automatic Abstractive Summarization for Danish.

Sara Kolding

Katrine Nymann

Ida Hansen

Ross Deans Kristensen-McLachlan

Proceedings of the 24th Nordic Conference on Computational Linguistics, 2023

2021

From close listening to distant listening: Developing tools for Speech-Music discrimination of Danish music radio.

Iben Have

Digit. Humanit. Q., 2021

When no news is bad news - Detection of negative events from news media content.

Frida Hæstrup

Rebekah Brita Baglini

Andreas Roepstorff

CoRR, 2021

News Information Decoupling: An Information Signature of Catastrophes in Legacy News Media.

Rebekah Brita Baglini

Anja Bechmann

Andreas Roepstorff

CoRR, 2021

DaCy: A Unified Framework for Danish NLP.
