Martin Volk

Orcid: 0000-0002-2063-4516

Affiliations:
  • University of Zurich, Switzerland


According to our database, Martin Volk authored at least 92 papers between 1992 and 2023.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2023
Machine vs. Human: Exploring Syntax and Lexicon in German Translations, with a Spotlight on Anglicisms.
Proceedings of the 24th Nordic Conference on Computational Linguistics, 2023

The Adaptability of a Transformer-Based OCR Model for Historical Documents.
Proceedings of the Document Analysis and Recognition - ICDAR 2023 Workshops, 2023

The Bullinger Dataset: A Writer Adaptation Challenge.
Proceedings of the Document Analysis and Recognition - ICDAR 2023, 2023

Bullingers Briefwechsel zugänglich machen: Stand der Handschriftenerkennung.
Proceedings of the 9. Tagung des Verbands Digital Humanities im deutschsprachigen Raum, 2023

2022
Language Report German.
Proceedings of the European Language Equality, 2022

Transformer-based HTR for Historical Documents.
CoRR, 2022

Evaluation of HTR models without Ground Truth Material.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

Nunc profana tractemus. Detecting Code-Switching in a Large Corpus of 16th Century Letters.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

Grenzüberschreitendes Textmining von Historischen Zeitungen - Das impresso-Projekt zwischen Text- und Bildverarbeitung, Design und Geschichtswissenschaft.
Proceedings of the 8. Tagung des Verbands Digital Humanities im deutschsprachigen Raum, 2022

2021
WikiFlash: Generating Flashcards from Wikipedia Articles.
Proceedings of the Neural Information Processing - 28th International Conference, 2021

2020
Ranking Georeferences for Efficient Crowdsourcing of Toponym Annotations in a Historical Corpus of Alpine Texts.
Proceedings of the 5th Swiss Text Analytics Conference and the 16th Conference on Natural Language Processing, 2020

How Much Data Do You Need? About the Creation of a Ground Truth for Black Letter and the Effectiveness of Neural OCR.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Benchmarking Data-driven Automatic Text Simplification for German.
Proceedings of the 1st Workshop on Tools and Resources to Empower People with REAding DIfficulties, 2020

Historical Newspaper Content Mining: Revisiting the impresso Project's Challenges in Text and Image Processing, Design and Historical Scholarship.
Proceedings of the 15th Annual International Conference of the Alliance of Digital Humanities Organizations, 2020

2019
Post-editing Productivity with Neural Machine Translation: An Empirical Assessment of Speed and Quality in the Banking and Finance Domain.
Proceedings of Machine Translation Summit XVII Volume 1: Research Track, 2019

An Empirical Analysis of Linguistic, Typographic, and Structural Features in Simplified German Texts.
Proceedings of the Sixth Italian Conference on Computational Linguistics, 2019

2018
Crowdsourcing the OCR Ground Truth of a German and French Cultural Heritage Corpus.
J. Lang. Technol. Comput. Linguistics, 2018

Cutter - a Universal Multilingual Tokenizer.
Proceedings of the 3rd Swiss Text Analytics Conference, SwissText 2018, Winterthur, 2018

Has Machine Translation Achieved Human Parity? A Case for Document-level Evaluation.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

mtrain: A Convenience Tool for Machine Translation.
Proceedings of the 21st Annual Conference of the European Association for Machine Translation, 2018

2017
Multilingwis2 - Explore Your Parallel Corpus.
Proceedings of the 21st Nordic Conference on Computational Linguistics, 2017

Lessons from a Massive Open Online Course (MOOC) on Natural Language Processing for Digital Humanities.
Proceedings of the Workshop on Teaching NLP for Digital Humanities (Teach4DH) 2017, 2017

2016
Crowdsourcing an OCR Gold Standard for a German and French Heritage Corpus.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

Rule-based Automatic Text Simplification for German.
Proceedings of the 13th Conference on Natural Language Processing, 2016

Bi-particle Adverbs, PoS-Tagging and the Recognition of German Separable Prefix Verbs.
Proceedings of the 13th Conference on Natural Language Processing, 2016

Building a Parallel Corpus on the World's Oldest Banking Magazine.
Proceedings of the 13th Conference on Natural Language Processing, 2016

Building a Corpus of Multi-Lingual and Multi-Format International Investment Agreements.
Proceedings of the Legal Knowledge and Information Systems, 2016

MODERN: modelling discourse entities and relations for coherent machine translation.
Proceedings of the 19th Annual Conference of the European Association for Machine Translation: Projects/Products, 2016

2015
Pre-reordering for Statistical Machine Translation of Non-fictional Subtitles.
Proceedings of the 18th Annual Conference of the European Association for Machine Translation, 2015

Detecting Document-level Context Triggers to Resolve Translation Ambiguity.
Proceedings of the Second Workshop on Discourse in Machine Translation, 2015

Leveraging Compounds to Improve Noun Phrase Translation from Chinese and German.
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, 2015

2014
Innovations in Parallel Corpus Search Tools.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

Machine Translation for Subtitling: A Large-Scale Evaluation.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

Enforcing Consistent Translation of German Compound Coreferences.
Proceedings of the 12th Edition of the Konvens Conference, 2014

Cleaning the Europarl Corpus for Linguistic Applications.
Proceedings of the 12th Edition of the Konvens Conference, 2014

Evaluating the Fully Automatic Multi-language Translation of the Swiss Avalanche Bulletin.
Proceedings of the Controlled Natural Language - 4th International Workshop, 2014

Detecting Code-Switching in a Multilingual Alpine Heritage Corpus.
Proceedings of the First Workshop on Computational Approaches to Code Switching@EMNLP 2014, 2014

2013
Exploiting Synergies Between Open Resources for German Dependency Parsing, POS-tagging, and Morphological Analysis.
Proceedings of the Recent Advances in Natural Language Processing, 2013

Combining Statistical Machine Translation and Translation Memories with Domain Adaptation.
Proceedings of the 19th Nordic Conference of Computational Linguistics, 2013

Statistical Machine Translation for Automobile Marketing Texts.
Proceedings of Machine Translation Summit XIV: Posters, 2013

Assessing post-editing efficiency in a realistic translation environment.
Proceedings of the 2nd Workshop on Post-editing Technology and Practice, 2013

Statistical Machine Translation of Subtitles: From OpenSubtitles to TED.
Proceedings of the Language Processing and Knowledge in the Web, 2013

Reconstructing Complete Lemmas for Incomplete German Compounds.
Proceedings of the Language Processing and Knowledge in the Web, 2013

Building a German/Simple German Parallel Corpus for Automatic Text Simplification.
Proceedings of the Second Workshop on Predicting and Improving Text Readability for Target Reader Populations, 2013

Mining for Domain-specific Parallel Text from Wikipedia.
Proceedings of the Sixth Workshop on Building and Using Comparable Corpora, 2013

2012
SUMAT: Data Collection and Parallel Corpus Compilation for Machine Translation of Subtitles.
Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012

Change of Biomedical Domain Terminology Over Time.
Proceedings of the Human Language Technologies - The Baltic Perspective, 2012

From Subtitles to Parallel Corpora.
Proceedings of the 16th Annual conference of the European Association for Machine Translation, 2012

Term Evolution: Use of Biomedical Terminologies.
Proceedings of the Information Retrieval and Knowledge Discovery in Biomedical Text, 2012

2011
Strategies for Reducing and Correcting OCR Errors.
Proceedings of the Language Technology for Cultural Heritage, 2011

Reducing OCR Errors in Gothic-Script Documents.
ERCIM News, 2011

Le corpus Text+Berg Une ressource parallèle alpin français-allemand (The Text+Berg Corpus An Alpine French-German Parallel Resource).
Proceedings of the Actes de la 18e conférence sur le Traitement Automatique des Langues Naturelles. Articles courts, 2011

Disambiguation of English Contractions for Machine Translation of TV Subtitles.
Proceedings of the 18th Nordic Conference of Computational Linguistics, 2011

Iterative, MT-based Sentence Alignment of Parallel Texts.
Proceedings of the 18th Nordic Conference of Computational Linguistics, 2011

Combining Semantic and Syntactic Generalization in Example-Based Machine Translation.
Proceedings of the 15th Annual conference of the European Association for Machine Translation, 2011

2010
Challenges in Building a Multilingual Alpine Heritage Corpus.
Proceedings of the International Conference on Language Resources and Evaluation, 2010

Towards mapping of alpine route descriptions.
Proceedings of the 6th Workshop on Geographic Information Retrieval, 2010

MT-based Sentence Alignment for OCR-generated Parallel Texts.
Proceedings of the 9th Conference of the Association for Machine Translation in the Americas: Research Papers, 2010

Combining Parallel Treebanks and Geo-Tagging.
Proceedings of the Fourth Linguistic Annotation Workshop, 2010

2009
The Automatic Translation of Film Subtitles. A Machine Translation Success Story?
J. Lang. Technol. Comput. Linguistics, 2009

Classifying Named Entities in an Alpine Heritage Corpus.
Künstliche Intell., 2009

Using Linguistic Annotations in Statistical Machine Translation of Film Subtitles.
Proceedings of the 17th Nordic Conference of Computational Linguistics, 2009

2008
Extending the TIGER query language with universal quantification.
Proceedings of the Text Resources and Lexical Knowledge. Selected Papers from the 9th Conference on Natural Language Processing, 2008

2007
Comparing French PP-attachment to English, German and Swedish.
Proceedings of the 16th Nordic Conference of Computational Linguistics, 2007

Evaluating MT with translations or translators: what is the difference?
Proceedings of Machine Translation Summit XI: Papers, 2007

A Search Tool for Parallel Treebanks.
Proceedings of the Linguistic Annotation Workshop, 2007

2006
XML-based Phrase Alignment in Parallel Treebanks.
Proceedings of the 5th Workshop on NLP and XML: Multi-Dimensional Markup in Natural Language Processing, 2006

2004
Evaluation Resources for Concept-based Cross-Lingual Information Retrieval in the Medical Domain.
Proceedings of the Fourth International Conference on Language Resources and Evaluation, 2004

2003
Ontologies in Cross-Language Information Retrieval.
Proceedings of the WOW2003, 2003

Ontologies in Cross-Language Information Retrieval.
Proceedings of the WM 2003: Professionelles Wissensmanagement, 2003

A Cross Language Document Retrieval System Based on Semantic Annotation.
Proceedings of the EACL 2003, 2003

2002
Semantic annotation for concept-based cross-language medical information retrieval.
Int. J. Medical Informatics, 2002

Combining Unsupervised and Supervised Methods for PP Attachment Disambiguation.
Proceedings of the 19th International Conference on Computational Linguistics, 2002

2001
LUIS - Ein natürlichsprachliches, universitäres Informationssystem.
Proceedings of the Unternehmen Hochschule, 2001

Learn - Filter - Apply - Forget. Mixed Approaches to Named Entity Recognition.
Proceedings of the Applications of Natural Language to Information Systems, 2001

Linguistische und semantische Annotation eines Zeitungskorpus.
Proceedings der GLDV-Frühjahrstagung 2001, 2001

2000
Evaluating Translation Quality as Input to Product Development.
Proceedings of the Second International Conference on Language Resources and Evaluation, 2000

Scaling up. Using the WWW to Resolve PP Attachment Ambiguities.
Proceedings of the KONVENS 2000 / Sprachkommunikation, 2000

1998
CL-Demos im World Wide Web.
LDV Forum, 1998

Quantitative Verfahren zur Zuordnung von Präpositionalphrasen.
LDV Forum, 1998

Comparing a statistical and a rule-based tagger for German.
CoRR, 1998

1997
Experiences with the GTU grammar development environment.
CoRR, 1997

Probing the Lexicon in Evaluating Commercial MT Systems.
Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and 8th Conference of the European Chapter of the Association for Computational Linguistics, 1997

1996
Constraint Logic Programming for Computational Linguistics.
Proceedings of the Logical Aspects of Computational Linguistics, 1996

Parsing with ID/LP and PS Rules.
Proceedings of the Natural Language Processing and Speech Technology, 1996

1995
Einsatz einer Testsatzsammlung im Grammar engineering.
PhD thesis, 1995

1994
Was ist Linguistic Engineering?
Künstliche Intell., 1994

1993
Linguistic Engineering und linguistische Forschung.
Proceedings of the Sprache und Computer, 1993

1992
GTU - eine Grammatik Testumgebung mit Testsatzarchiv.
LDV Forum, 1992

UBS. Eine Unifikationsbasierte Sprache zur Implementation von HPSG.
LDV Forum, 1992

Third European Summer School on Language, Logic, and Information - Saarbrücken 1991.
Künstliche Intell., 1992

The Role of Testing in Grammar Engineering.
Proceedings of the 3rd Applied Natural Language Processing Conference, 1992

