Arantza Díaz de Ilarraza

Orcid: 0000-0003-3317-8561

  • University of the Basque Country, Department of Languages and Computer Systems

According to our database, Arantza Díaz de Ilarraza authored at least 108 papers between 1990 and 2022.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.



Language Report Basque.
Proceedings of the European Language Equality, 2022

How the corpus-based Basque Verb Index lexicon was built.
Lang. Resour. Evaluation, 2020

Interpretable deep learning to map diagnostic texts to ICD-10 codes.
Int. J. Medical Informatics, 2019

Multi-label clinical document classification: Impact of label-density.
Expert Syst. Appl., 2019

The corpus of Basque simplified texts (CBST).
Lang. Resour. Evaluation, 2018

Automatic Misogyny Identification Using Neural Networks.
Proceedings of the Third Workshop on Evaluation of Human Language Technologies for Iberian Languages (IberEval 2018) co-located with 34th Conference of the Spanish Society for Natural Language Processing (SEPLN 2018), 2018

A Hybrid Approach For Automatic Disability Annotation.
Proceedings of the Third Workshop on Evaluation of Human Language Technologies for Iberian Languages (IberEval 2018) co-located with 34th Conference of the Spanish Society for Natural Language Processing (SEPLN 2018), 2018

Konbitzul: an MWE-specific database for Spanish-Basque.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

Annotating Abstract Meaning Representations for Spanish.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

MAMTRA-MED at CLEF eHealth 2018: A Combination of Information Retrieval Techniques and Neural Networks for ICD-10 Coding of Death Certificates.
Proceedings of the Working Notes of CLEF 2018, 2018

ANALHITZA: a tool to extract linguistic information from large corpora in Humanities research.
Proces. del Leng. Natural, 2017

PROcesamiento Semántico textual Avanzado para la detección de diagnósticos, procedimientos, otros conceptos y sus relaciones en informes MEDicos (PROSA-MED).
Proces. del Leng. Natural, 2017

EusHeidelTime: Time Expression Extraction and Normalisation for Basque.
Proces. del Leng. Natural, 2017

Building the Gold Standard for the Surface Syntax of Basque.
Proces. del Leng. Natural, 2017

Improving mention detection for Basque based on a deep error analysis.
Nat. Lang. Eng., 2017

Ebaluatoia: crowd evaluation for English-Basque machine translation.
Lang. Resour. Evaluation, 2017

Rule-Based Translation of Spanish Verb-Noun Combinations into Basque.
Proceedings of the 13th Workshop on Multiword Expressions, 2017

Tectogrammar-based machine translation for English-Spanish and English-Basque.
Proces. del Leng. Natural, 2016

A methodology for the semiautomatic annotation of EPEC-RolSem, a Basque corpus labeled at predicate level following the PropBank-VerbNet model.
Digit. Scholarsh. Humanit., 2016

Coreference Resolution for the Basque Language with BART.
Proceedings of the Workshop on Coreference Resolution Beyond OntoNotes, 2016

Using Linguistic Data for English and Spanish Verb-Noun Combination Identification.
Proceedings of the COLING 2016, 2016

Adapting TimeML to Basque: Event Annotation.
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2016

IXAmed-IE: On-line medical entity identification and ADR event extraction in Spanish.
Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2016

The impact of simple feature engineering in multilingual medical NER.
Proceedings of the Clinical Natural Language Processing Workshop, 2016

A Preliminary Study of Statistically Predictive Syntactic Complexity Features and Manual Simplifications in Basque.
Proceedings of the Workshop on Computational Linguistics for Linguistic Complexity, 2016

Coreference Resolution for Morphologically Rich Languages. Adaptation of the Stanford System to Basque.
Proces. del Leng. Natural, 2015

EXTracción de RElaciones entre Conceptos Médicos en fuentes de información heterogéneas (EXTRECM).
Proces. del Leng. Natural, 2015

Lexical semantics, Basque and Spanish in QTLeap: Quality Translation by Deep Language Engineering Approaches.
Proces. del Leng. Natural, 2015

On the creation of a clinical gold standard corpus in Spanish: Mining adverse drug reactions.
J. Biomed. Informatics, 2015

Computer aided classification of diagnostic terms in Spanish.
Expert Syst. Appl., 2015

Exploiting portability to build an RBMT prototype for a new source language.
Proceedings of the 18th Annual Conference of the European Association for Machine Translation, 2015

Deep-syntax TectoMT for English-Spanish MT.
Proceedings of the 1st Deep Machine Translation Workshop, 2015

Izen+aditz konbinazioen azterketa elebiduna, hizkuntza-aplikazio aurreratuei begira.
Linguamática, 2014

Euskarazko denbora-egiturak. Azterketa eta etiketatze-esperimentua.
Linguamática, 2014

The annotation of the Central Unit in Rhetorical Structure Trees: A Key Step in Annotating Rhetorical Relations.
Proceedings of the COLING 2014, 2014

Simple or Complex? Assessing the readability of Basque Texts.
Proceedings of the COLING 2014, 2014

Comparison of post-editing productivity between professional translators and lay users.
Proceedings of the 11th Conference of the Association for Machine Translation in the Americas, 2014

Transforming Complex Sentences using Dependency Trees for Automatic Text Simplification in Basque.
Proces. del Leng. Natural, 2013

Testuen sinplifikazio automatikoa: arloaren egungo egoera.
Linguamática, 2013

Detecting Apposition for Text Simplification in Basque.
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2013

Mention detection: First steps in the development of a Basque coreference resolution system.
Proceedings of the 11th Conference on Natural Language Processing, 2012

First Approaches on Spanish Medical Record Classification Using Diagnostic Term to Class Transduction.
Proceedings of the 10th International Workshop on Finite State Methods and Natural Language Processing, 2012

Combining Rule-Based and Statistical Syntactic Analyzers.
Proceedings of the Joint Workshop on Statistical Parsing and Semantic Processing of Morphologically Rich Languages, 2012

Unidad discursiva y relaciones retóricas: un estudio acerca de las unidades de discurso en el etiquetado de un corpus en euskera.
Proces. del Leng. Natural, 2011

Biomedical event extraction using Kybots.
Proces. del Leng. Natural, 2011

Syntactic error detection and correction in date expressions using finite-state transducers.
Nat. Lang. Eng., 2011

Matxin, an open-source rule-based machine translation system for Basque.
Mach. Transl., 2011

Teknologia garatzeko estrategiak baliabide urriko hizkuntzetarako: euskararen eta Ixa taldearen adibidea.
Linguamática, 2011

Hybrid Machine Translation Guided by a Rule-Based System.
Proceedings of Machine Translation Summit XIII: Papers, 2011

Using Kybots for Extracting Events in Biomedical Texts.
Proceedings of BioNLP Shared Task 2011 Workshop, Portland, Oregon, USA, June 24, 2011, 2011

First Steps in The Manual and Automatic Annotation of Clinical Notes in Spanish.
Proces. del Leng. Natural, 2010

Recursos en euskera para la herramienta NLTK para enseñanza de procesamiento del lenguaje natural.
Proces. del Leng. Natural, 2010

Determination of Features for a Machine Learning Approach to Pronominal Anaphora Resolution in Basque.
Proces. del Leng. Natural, 2010

Design and Evaluation of an Agreement Error Detection System: Testing the Effect of Ambiguity, Parser and Corpus Type.
Proceedings of the Advances in Natural Language Processing, 2010

Building the Basque PropBank.
Proceedings of the International Conference on Language Resources and Evaluation, 2010

A First Machine Learning Approach to Pronominal Anaphora Resolution in Basque.
Proceedings of the Advances in Artificial Intelligence, 2010

EusPropBank: Integrating Semantic Information in the Basque Dependency Treebank.
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2010

A Combination of Classifiers for the Pronominal Anaphora Resolution in Basque.
Proceedings of the Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications, 2010

Dealing With Complex Linguistic Annotations Within a Language Processing Framework.
IEEE Trans. Speech Audio Process., 2009

Errores en el uso de determinantes en euskera: Análisis y Detección Automática.
Proces. del Leng. Natural, 2009

AnHitz, integración de tecnologías de la lengua dentro de un prototipo de experto virtual en ciencia y tecnología.
Proces. del Leng. Natural, 2009

Evaluación de un sistema de traducción automática basado en reglas o por qué BLEU sólo sirve para lo que sirve.
Proces. del Leng. Natural, 2009

KYOTO Project.
Proces. del Leng. Natural, 2009

Evaluating the Impact of Morphosyntactic Ambiguity in Grammatical Error Detection.
Proceedings of the Recent Advances in Natural Language Processing, 2009

Valuable Language Resources and Applications Supporting the Use of Basque.
Proceedings of the Human Language Technology. Challenges for Computer Science and Linguistics, 2009

Development and evaluation of AnHitz, a prototype of a Basque-speaking virtual 3D expert on science and technology.
Proceedings of the International Multiconference on Computer Science and Information Technology, 2009

Relevance of Different Segmentation Options on Spanish-Basque SMT.
Proceedings of the 13th Annual conference of the European Association for Machine Translation, 2009

Exploring Basque Document Categorization for Educational Purposes using LSI.
Proceedings of the CSEDU 2009 - Proceedings of the First International Conference on Computer Supported Education, Lisboa, Portugal, March 23-26, 2009, 2009

Evaluation of the Syntactic Annotation in EPEC, the Reference Corpus for the Processing of Basque.
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2009

From Dependencies to Constituents in the Reference Corpus for the Processing of Basque (EPEC).
Proces. del Leng. Natural, 2008

Chunk and Clause Identification for Basque by Filtering and Ranking with Perceptrons.
Proces. del Leng. Natural, 2008

AnHitz, Development and Integration of Language, Speech and Visual Technologies for Basque.
Proceedings of the ISUC 2008, 2008

Strategies for sustainable MT for Basque: incremental design, reusability, standardization and open-source.
Proceedings of the Third International Joint Conference on Natural Language Processing, 2008

Detecting Erroneous Uses of Complex Postpositions in an Agglutinative Language.
Proceedings of the COLING 2008, 2008

Spanish-to-Basque MultiEngine Machine Translation for a Restricted Domain.
Proceedings of the 8th Conference of the Association for Machine Translation in the Americas: Research Papers, 2008

Specification of a General Linguistic Annotation Framework and its Use in a Real Context.
Proces. del Leng. Natural, 2007

Spanish-Basque Parallel Corpus Structure: Linguistic Annotations and Translation Units.
Proceedings of the Text, Speech and Dialogue, 10th International Conference, 2007

Transfer-Based MT from Spanish into Basque: Reusability, Standardization and Open Source.
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2007

Uso de la información morfológica en el alineamiento Español-Euskara.
Proces. del Leng. Natural, 2006

Pronominal anaphora in Basque: annotation of a real corpus.
Proces. del Leng. Natural, 2006

Structure, Annotation and Tools in the Basque ZT Corpus.
Proceedings of the Fifth International Conference on Language Resources and Evaluation, 2006

Using Machine Learning Techniques to Build a Comma Checker for Basque.
Proceedings of the ACL 2006, 2006

An FST Grammar for Verb Chain Transfer in a Spanish-Basque MT System.
Proceedings of the Finite-State Methods and Natural Language Processing, 2005

Design and Development of a System for the Detection of Agreement Errors in Basque.
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2005

3LB: Construcción de una base de datos de árboles sintáctico-semánticos para el catalán, euskera y castellano.
Proces. del Leng. Natural, 2004

Abar-Hitz: An Annotation Tool for the Basque Dependency Treebank.
Proceedings of the Fourth International Conference on Language Resources and Evaluation, 2004

From Human to Automatic Summary Evaluation.
Proceedings of the Intelligent Tutoring Systems, 7th International Conference, 2004

A Cascaded Syntactic Analyser for Basque.
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2004

3LB: Construcción de una base de datos de árboles sintáctico semánticos.
Proces. del Leng. Natural, 2003

Construcción de un corpus etiquetado sintácticamente para el euskera.
Proces. del Leng. Natural, 2002

A Class Library for the Integration of NLP Tools: Definition and implementation of an Abstract Data Type Collection for the manipulation of SGML documents in a context of stand-off linguistic annotation.
Proceedings of the Third International Conference on Language Resources and Evaluation, 2002

Semiautomatic Labelling of Semantic Features.
Proceedings of the 19th International Conference on Computational Linguistics, 2002

A Methodology for Building Translator-oriented Dictionary Systems.
Mach. Transl., 2000

Extraction of semantic relations from a Basque monolingual dictionary using Constraint Grammar.
CoRR, 2000

A Proposal for the Integration of NLP Tools using SGML-Tagged Documents.
Proceedings of the Second International Conference on Language Resources and Evaluation, 2000

MLDS: A translator-oriented MultiLingual dictionary system.
Nat. Lang. Eng., 1999

Behavioral Explanations in Intelligent Tutor Systems for Training using Causal Models.
Interact. Learn. Environ., 1998

Intelligent tutoring systems for training of operators for thermal power plants.
Artif. Intell. Eng., 1998

EDBL: a multi-purposed lexical support for treatment of Basque.
Proceedings of the First International Conference on Language Resources and Evaluation, 1998

From Psycholinguistic Modelling of Interlanguage in Second Language Acquisition to a Computational Model.
Proceedings of the 1997 Meeting of the ACL Special Interest Group in Natural Language Learning: Computational Natural Language Learning, 1997

Constructing an intelligent dictionary help system.
Nat. Lang. Eng., 1996

Different Issues in the Design of a Lemmatizer/Tagger for Basque.
CoRR, 1995

Different Issues in the Design of a General-Purpose Lexical Database for Basque.
Proceedings of the First International Workshop on Applications of Natural Language to Data Bases, 1995

Lexical Knowledge Representation In An Intelligent Dictionary Help System.
Proceedings of the 15th International Conference on Computational Linguistics, 1994

A Morphological Analysis Based Method for Spelling Correction.
Proceedings of the Sixth Conference of the European Chapter of the Association for Computational Linguistics, 1993

User Modeling and Architecture in Industrial ITSs.
Proceedings of the Intelligent Tutoring Systems, Second International Conference, 1992

XUXEN: A Spelling Checker/Corrector for Basque Based on Two-Level Morphology.
Proceedings of the 3rd Applied Natural Language Processing Conference, 1992

A Mechanism for ellipsis resolution in dialogued systems.
Proceedings of the 13th International Conference on Computational Linguistics, 1990
