Jörg Tiedemann

Orcid: 0000-0003-3065-7989

According to our database, Jörg Tiedemann authored at least 157 papers between 1998 and 2024.

Bibliography

2024
Democratizing neural machine translation with OPUS-MT.
Lang. Resour. Evaluation, June, 2024

SemEval-2024 Shared Task 6: SHROOM, a Shared-task on Hallucinations and Related Observable Overgeneration Mistakes.
CoRR, 2024

MAMMOTH: Massively Multilingual Modular Open Translation @ Helsinki.
CoRR, 2024

MaLA-500: Massive Language Adaptation of Large Language Models.
CoRR, 2024

MAMMOTH: Massively Multilingual Modular Open Translation @ Helsinki.
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics, 2024

Can Machine Translation Bridge Multilingual Pretraining and Cross-lingual Transfer Learning?
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

A New Massive Multilingual Dataset for High-Performance Language Technologies.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

2023
Domain-specific Continued Pretraining of Language Models for Capturing Long Context in Mental Health.
CoRR, 2023

Guiding Zero-Shot Paraphrase Generation with Fine-Grained Control Tokens.
Proceedings of the 12th Joint Conference on Lexical and Computational Semantics, 2023

Uncertainty-Aware Natural Language Inference with Stochastic Weight Averaging.
Proceedings of the 24th Nordic Conference on Computational Linguistics, 2023

Dozens of Translation Directions or Millions of Shared Parameters? Comparing Two Types of Multilinguality in Modular Machine Translation.
Proceedings of the 24th Nordic Conference on Computational Linguistics, 2023

Unsupervised Feature Selection for Effective Parallel Corpus Filtering.
Proceedings of the 24th Annual Conference of the European Association for Machine Translation, 2023

HPLT: High Performance Language Technologies.
Proceedings of the 24th Annual Conference of the European Association for Machine Translation, 2023

The OPUS-MT Dashboard - A Toolkit for a Systematic Evaluation of Open Machine Translation Models.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2023

2022
Democratizing Machine Translation with OPUS-MT.
CoRR, 2022

How Does Data Corruption Affect Natural Language Understanding Models? A Study on GLUE datasets.
Proceedings of the 11th Joint Conference on Lexical and Computational Semantics, 2022

Helsinki-NLP at SemEval-2022 Task 2: A Feature-Based Approach to Multilingual Idiomaticity Detection.
Proceedings of the 16th International Workshop on Semantic Evaluation, SemEval@NAACL 2022, 2022

Modeling Noise in Paraphrase Detection.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

Latest Development in the FoTran Project - Scaling Up Language Coverage in Neural Machine Translation Using Distributed Training with Language-Specific Components.
Proceedings of the 23rd Annual Conference of the European Association for Machine Translation, 2022

A Closer Look at Parameter Contributions When Training Neural Language and Translation Models.
Proceedings of the 29th International Conference on Computational Linguistics, 2022

When to Laugh and How Hard? A Multimodal Approach to Detecting Humor and Its Intensity.
Proceedings of the 29th International Conference on Computational Linguistics, 2022

It Is Not Easy To Detect Paraphrases: Analysing Semantic Similarity With Antonyms and Negation Using the New SemAntoNeg Benchmark.
Proceedings of the Fifth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP, 2022

2021
NLI Data Sanity Check: Assessing the Effect of Data Corruption on Model Performance.
Proceedings of the 23rd Nordic Conference on Computational Linguistics, 2021

Boosting Neural Machine Translation from Finnish to Northern Sámi with Rule-Based Backtranslation.
Proceedings of the 23rd Nordic Conference on Computational Linguistics, 2021

An Empirical Investigation of Word Alignment Supervision for Zero-Shot Multilingual Neural Machine Translation.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

On the differences between BERT and MT encoder spaces and how to address them in translation tasks.
Proceedings of the ACL-IJCNLP 2021 Student Research Workshop, 2021

2020
Are Multilingual Neural Machine Translation Models Better at Capturing Linguistic Features?
Prague Bull. Math. Linguistics, 2020

Multimodal machine translation through visuals and speech.
Mach. Transl., 2020

A Systematic Study of Inner-Attention-Based Sentence Representations in Multilingual Neural Machine Translation.
Comput. Linguistics, 2020

The Tatoeba Translation Challenge - Realistic Data Sets for Low Resource and Multilingual MT.
Proceedings of the Fifth Conference on Machine Translation, 2020

The MUCOW word sense disambiguation test suite at WMT 2020.
Proceedings of the Fifth Conference on Machine Translation, 2020

LSDC - A comprehensive dataset for Low Saxon Dialect Classification.
Proceedings of the 7th Workshop on NLP for Similar Languages, Varieties and Dialects, 2020

LT@Helsinki at SemEval-2020 Task 12: Multilingual or Language-specific BERT?
Proceedings of the Fourteenth Workshop on Semantic Evaluation, 2020

The FISKMÖ Project: Resources and Tools for Finnish-Swedish Machine Translation and Cross-Linguistic Research.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

An Evaluation Benchmark for Testing the Word Sense Disambiguation Capabilities of Machine Translation Systems.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

OpusTools and Parallel Corpus Diagnostics.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

The University of Helsinki Submission to the IWSLT2020 Offline Speech Translation Task.
Proceedings of the 17th International Conference on Spoken Language Translation, 2020

Fixed Encoder Self-Attention Patterns in Transformer-Based Machine Translation.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

OPUS-MT - Building open translation services for the World.
Proceedings of the 22nd Annual Conference of the European Association for Machine Translation, 2020

MT for subtitling: User evaluation of post-editing productivity.
Proceedings of the 22nd Annual Conference of the European Association for Machine Translation, 2020

Emotion Preservation in Translation: Evaluating Datasets for Annotation Projection.
Proceedings of the Digital Humanities in the Nordic Countries 5th Conference, 2020

XED: A Multilingual Dataset for Sentiment Analysis and Emotion Detection.
Proceedings of the 28th International Conference on Computational Linguistics, 2020

Controlling the Imprint of Passivization and Negation in Contextualized Representations.
Proceedings of the Third BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP, 2020

MT for Subtitling: Investigating professional translators' user experience and feedback.
Proceedings of 1st Workshop on Post-Editing in Modern-Day Translation, 2020

A Multi-task Learning Approach to Text Simplification.
Proceedings of the Recent Trends in Analysis of Images, Social Networks and Texts, 2020

OpusFilter: A Configurable Parallel Corpus Filtering Toolbox.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2020

2019
Sentence embeddings in NLI with iterative refinement encoders.
Nat. Lang. Eng., 2019

What Do Language Representations Really Represent?
Comput. Linguistics, 2019

The University of Helsinki Submission to the WMT19 Parallel Corpus Filtering Task.
Proceedings of the Fourth Conference on Machine Translation, 2019

The University of Helsinki Submissions to the WMT19 News Translation Task.
Proceedings of the Fourth Conference on Machine Translation, 2019

The MuCoW Test Suite at WMT 2019: Automatically Harvested Multilingual Contrastive Word Sense Disambiguation Test Sets for Machine Translation.
Proceedings of the Fourth Conference on Machine Translation, 2019

Multilingual NMT with a Language-Independent Attention Bridge.
Proceedings of the 4th Workshop on Representation Learning for NLP, 2019

An Evaluation of Language-Agnostic Inner-Attention-Based Representations in Machine Translation.
Proceedings of the 4th Workshop on Representation Learning for NLP, 2019

Predicting Prosodic Prominence from Text with Pre-trained Contextualized Word Representations.
Proceedings of the 22nd Nordic Conference on Computational Linguistics, NoDaLiDa 2019, Turku, Finland, September 30, 2019

The OPUS Resource Repository: An Open Package for Creating Parallel Corpora and Machine Translation Services.
Proceedings of the 22nd Nordic Conference on Computational Linguistics, NoDaLiDa 2019, Turku, Finland, September 30, 2019

Revisiting NMT for Normalization of Early English Letters.
Proceedings of the 3rd Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, 2019

Analysing concatenation approaches to document-level NMT in two different domains.
Proceedings of the Fourth Workshop on Discourse in Machine Translation, 2019

2018
Detecting hospital-acquired infections: A document classification approach using support vector machines and gradient tree boosting.
Health Informatics J., 2018

Natural Language Inference with Hierarchical BiLSTM Max Pooling Architecture.
CoRR, 2018

Translational Grounding: Using Paraphrase Recognition and Generation to Demonstrate Semantic Abstraction Abilities of MultiLingual NMT.
CoRR, 2018

The University of Helsinki submissions to the WMT18 news task.
Proceedings of the Third Conference on Machine Translation: Shared Task Papers, 2018

The MeMAD Submission to the WMT18 Multimodal Translation Task.
Proceedings of the Third Conference on Machine Translation: Shared Task Papers, 2018

Creating a Dataset for Multilingual Fine-grained Emotion-detection Using Gamification-based Annotation.
Proceedings of the 9th Workshop on Computational Approaches to Subjectivity, 2018

Language Identification and Morphosyntactic Tagging: The Second VarDial Evaluation Campaign.
Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects, 2018

OpenSubtitles2018: Statistical Rescoring of Sentence Alignments in Large, Noisy Parallel Corpora.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

Normalizing Early English Letters to Present-day English Spelling.
Proceedings of the Second Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, 2018

The MeMAD Submission to the IWSLT 2018 Speech Translation Task.
Proceedings of the 15th International Conference on Spoken Language Translation, 2018

An Analysis of Encoder Representations in Transformer-Based Machine Translation.
Proceedings of the Workshop: Analyzing and Interpreting Neural Networks for NLP, 2018

Emerging Language Spaces Learned From Massively Multilingual Corpora.
Proceedings of the Digital Humanities in the Nordic Countries 3rd Conference, 2018

2017
Large aligned treebanks for syntax-based machine translation.
Lang. Resour. Evaluation, 2017

Neural machine translation for low-resource languages.
CoRR, 2017

The Helsinki Neural Machine Translation System.
Proceedings of the Second Conference on Machine Translation, 2017

Rule-based Machine translation from English to Finnish.
Proceedings of the Second Conference on Machine Translation, 2017

Findings of the VarDial Evaluation Campaign 2017.
Proceedings of the Fourth Workshop on NLP for Similar Languages, 2017

Cross-lingual dependency parsing for closely related languages - Helsinki's submission to VarDial 2017.
Proceedings of the Fourth Workshop on NLP for Similar Languages, 2017

Character-based Joint Segmentation and POS Tagging for Chinese using Bidirectional RNN-CRF.
Proceedings of the Eighth International Joint Conference on Natural Language Processing, 2017

Continuous multilinguality with language vectors.
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, 2017

Neural Machine Translation with Extended Context.
Proceedings of the Third Workshop on Discourse in Machine Translation, 2017

Findings of the 2017 DiscoMT Shared Task on Cross-lingual Pronoun Prediction.
Proceedings of the Third Workshop on Discourse in Machine Translation, 2017

2016
Efficient Word Alignment with Markov Chain Monte Carlo.
Prague Bull. Math. Linguistics, 2016

Synthetic Treebanking for Cross-Lingual Dependency Parsing.
J. Artif. Intell. Res., 2016

Phrase-Based SMT for Finnish with More Data, Better Models and Alternative Alignment and Translation Tools.
Proceedings of the First Conference on Machine Translation, 2016

A Linear Baseline Classifier for Cross-Lingual Pronoun Prediction.
Proceedings of the First Conference on Machine Translation, 2016

Findings of the 2016 WMT Shared Task on Cross-lingual Pronoun Prediction.
Proceedings of the First Conference on Machine Translation, 2016

Discriminating between Similar Languages and Arabic Dialect Identification: A Report on the Third DSL Shared Task.
Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects, 2016

Tagging Ingush - Language Technology For Low-Resource Languages Using Resources From Linguistic Field Work.
Proceedings of the Workshop on Language Technology Resources and Tools for Digital Humanities, 2016

Finding Alternative Translations in a Large Corpus of Movie Subtitle.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

OpenSubtitles2016: Extracting Large Parallel Corpora from Movie and TV Subtitles.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

OPUS - parallel corpora for everyone.
Proceedings of the 19th Annual Conference of the European Association for Machine Translation: Projects/Products, 2016

Climbing Mont BLEU: The Strange World of Reachable High-BLEU Translations.
Proceedings of the 19th Annual Conference of the European Association for Machine Translation, 2016

The Challenges of Multi-dimensional Sentiment Analysis Across Languages.
Proceedings of the Workshop on Computational Modeling of People's Opinions, 2016

2015
Morphological Segmentation and OPUS for Finnish-English Machine Translation.
Proceedings of the Tenth Workshop on Statistical Machine Translation, 2015

Improving the Cross-Lingual Projection of Syntactic Dependencies.
Proceedings of the 20th Nordic Conference of Computational Linguistics, 2015

Baseline Models for Pronoun Prediction and Pronoun-Aware Translation.
Proceedings of the Second Workshop on Discourse in Machine Translation, 2015

Pronoun-Focused MT and Cross-Lingual Pronoun Prediction: Findings of the 2015 DiscoMT Shared Task on Pronoun Translation.
Proceedings of the Second Workshop on Discourse in Machine Translation, 2015

Part-of-Speech Driven Cross-Lingual Pronoun Prediction with Feed-Forward Neural Networks.
Proceedings of the Second Workshop on Discourse in Machine Translation, 2015

Cross-Lingual Dependency Parsing with Universal Dependencies and Predicted PoS Labels.
Proceedings of the Third International Conference on Dependency Linguistics, 2015

Boosting English-Chinese Machine Transliteration via High Quality Alignment and Multilingual Resources.
Proceedings of the Fifth Named Entity Workshop, 2015

2014
Estimating Word Alignment Quality for SMT Reordering Tasks.
Proceedings of the Ninth Workshop on Statistical Machine Translation, 2014

Anaphora Models and Reordering for Phrase-Based SMT.
Proceedings of the Ninth Workshop on Statistical Machine Translation, 2014

A Report on the DSL Shared Task 2014.
Proceedings of the First Workshop on Applying NLP Tools to Similar Languages, 2014

Word's Vector Representations meet Machine Translation.
Proceedings of SSST@EMNLP 2014, 2014

Billions of Parallel Words for Free: Building and Using the EU Bookshop Corpus.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

ParCor 1.0: A Parallel Pronoun-Coreference Corpus to Support Statistical MT.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

Treebank Translation for Cross-Lingual Parser Induction.
Proceedings of the Eighteenth Conference on Computational Natural Language Learning, 2014

Rediscovering Annotation Projection for Cross-Lingual Parser Induction.
Proceedings of the COLING 2014, 2014

Improved Text Extraction from PDF Documents for Large-Scale Natural Language Processing.
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2014

2013
Parse and Corpus-Based Machine Translation.
Proceedings of the Essential Speech and Language Technology for Dutch, 2013

Markus Dickinson, Chris Brew and Detmar Meurers: Language and Computers - Wiley-Blackwell, 2013, ISBN: 978-1-4051-8305-5, xviii + 232 pp.
Mach. Transl., 2013

Tunable Distortion Limits and Corpus Cleaning for SMT.
Proceedings of the Eighth Workshop on Statistical Machine Translation, 2013

Analyzing the Use of Character-Level Translation with Sparse and Noisy Datasets.
Proceedings of the Recent Advances in Natural Language Processing, 2013

Experiences in Building the Let's MT! Portal on Amazon EC2.
Proceedings of the 19th Nordic Conference of Computational Linguistics, 2013

Statistical Machine Translation with Readability Constraints.
Proceedings of the 19th Nordic Conference of Computational Linguistics, 2013

Latent Anaphora Resolution for Cross-Lingual Pronoun Prediction.
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, 2013

Feature Weight Optimization for Discourse-Level SMT.
Proceedings of the Workshop on Discourse in Machine Translation, 2013

Docent: A Document-Level Decoder for Phrase-Based Statistical Machine Translation.
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, 2013

2012
Tree Kernels for Machine Translation Quality Estimation.
Proceedings of the Seventh Workshop on Statistical Machine Translation, 2012

A Distributed Resource Repository for Cloud-Based Machine Translation.
Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012

Parallel Data, Tools and Interfaces in OPUS.
Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012

Document-Wide Decoding for Phrase-Based Statistical Machine Translation.
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 2012

Character-Based Pivot Translation for Under-Resourced Languages and Domains.
Proceedings of the EACL 2012, 2012

Efficient Discrimination Between Closely Related Languages.
Proceedings of the COLING 2012, 2012

LetsMT!: Cloud-Based Platform for Do-It-Yourself Machine Translation.
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, 2012

Combining Word-Level and Character-Level Models for Machine Translation Between Closely-Related Languages.
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference, July 8-14, 2012, Jeju Island, Korea, 2012

2011
Automatic Extraction of Medical Term Variants from Multilingual Parallel Translations.
Proceedings of the Interactive Multi-modal Question-Answering, 2011

Bitext Alignment
Synthesis Lectures on Human Language Technologies, Morgan & Claypool Publishers, ISBN: 978-3-031-02142-8, 2011

Synonym Acquisition across Domains and Languages.
Proceedings of the Advances in Distributed Agent-Based Retrieval Tools, 2011

The Uppsala-FBK systems at WMT 2011.
Proceedings of the Sixth Workshop on Statistical Machine Translation, 2011

2010
To Cache or Not To Cache? Experiments with Adaptive Models in Statistical Machine Translation.
Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR, 2010

Lingua-Align: An Experimental Toolbox for Automatic Tree-to-Tree Alignment.
Proceedings of the International Conference on Language Resources and Evaluation, 2010

English to Bangla Phrase-Based Machine Translation.
Proceedings of the 14th Annual conference of the European Association for Machine Translation, 2010

2009
Translating Questions for Cross-Lingual QA.
Proceedings of the 13th Annual conference of the European Association for Machine Translation, 2009

Character-Based PSMT for Closely Related Languages.
Proceedings of the 13th Annual conference of the European Association for Machine Translation, 2009

2008
Synchronizing Translated Movie Subtitles.
Proceedings of the International Conference on Language Resources and Evaluation, 2008

Question Answering with Joost at CLEF 2008.
Proceedings of the Working Notes for CLEF 2008 Workshop co-located with the 12th European Conference on Digital Libraries (ECDL 2008), 2008

Pair Hidden Markov Model for Named Entity Matching.
Proceedings of the Innovations and Advances in Computer Sciences and Engineering, 2008

2007
Question Answering with Joost at CLEF 2007.
Proceedings of the Advances in Multilingual and Multimodal Information Retrieval, 2007

A Comparison of Genetic Algorithms for Optimizing Linguistically Informed IR in Question Answering.
Proceedings of the AI*IA 2007: Artificial Intelligence and Human-Oriented Computing, 2007

2006
ISA & ICA - Two Web Interfaces for Interactive Alignment of Bitexts.
Proceedings of the Fifth International Conference on Language Resources and Evaluation, 2006

The University of Groningen at QA@CLEF 2006: Using Syntactic Knowledge for QA.
Proceedings of the Working Notes for CLEF 2006 Workshop co-located with the 10th European Conference on Digital Libraries (ECDL 2006), 2006

Using Syntactic Knowledge for QA.
Proceedings of the Evaluation of Multilingual and Multi-modal Information Retrieval, 2006

Finding Synonyms Using Automatic Word Alignment and Measures of Distributional Similarity.
Proceedings of the ACL 2006, 2006

2005
Optimization of word alignment clues.
Nat. Lang. Eng., 2005

Integrating Linguistic Knowledge in Passage Retrieval for Question Answering.
Proceedings of the HLT/EMNLP 2005, 2005

Improving Passage Retrieval in Question Answering Using NLP.
Proceedings of the Progress in Artificial Intelligence, 2005

Question Answering for Dutch Using Dependency Relations.
Proceedings of the Accessing Multilingual Information Repositories, 2005

2004
MT Goes Farming: Comparing Two Machine Translation Approaches on a New Domain.
Proceedings of the Fourth International Conference on Language Resources and Evaluation, 2004

The OPUS Corpus - Parallel and Free: http://logos.uio.no/opus.
Proceedings of the Fourth International Conference on Language Resources and Evaluation, 2004

Word to word alignment strategies.
Proceedings of the COLING 2004, 2004

2003
Combining Clues for Word Alignment.
Proceedings of the EACL 2003, 2003

2002
MatsLex - a Multilingual Lexical Database for Machine Translation.
Proceedings of the Third International Conference on Language Resources and Evaluation, 2002

Scaling Up an MT Prototype for Industrial Use - Databases and Data Flow.
Proceedings of the Third International Conference on Language Resources and Evaluation, 2002

2001
UplugWeb-Corpus Tools on the Web.
Proceedings of the 13th Nordic Conference of Computational Linguistics, 2001

2000
Evaluation of Word Alignment Systems.
Proceedings of the Second International Conference on Language Resources and Evaluation, 2000

1999
Word Alignment Step by Step.
Proceedings of the 12th Nordic Conference of Computational Linguistics, 1999

Automatic Construction of Weighted String Similarity Measures.
Proceedings of the Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora, 1999

1998
Extraction of Translation Equivalents from Parallel Corpora.
Proceedings of the 11th Nordic Conference of Computational Linguistics, 1998

