Dan Tufis

Orcid: 0000-0002-8280-9852

Affiliations:
  • Romanian Academy Research Institute for Artificial Intelligence, Bucharest, Romania


According to our database, Dan Tufis authored at least 104 papers between 1984 and 2023.

Bibliography

2023
Evaluation of Language Models on Romanian XQuAD and RoITD datasets.
Int. J. Comput. Commun. Control, February, 2023

Towards Improving the Performance of Pre-Trained Speech Models for Low-Resource Languages Through Lateral Inhibition.
Proceedings of the 46th International Conference on Telecommunications and Signal Processing, 2023

2022
Language Report Romanian.
Proceedings of the European Language Equality, 2022

Romanian Language Technology - a view from an academic perspective.
Int. J. Comput. Commun. Control, 2022

A Lite Romanian BERT: ALR-BERT.
Comput., 2022

Capitalization and punctuation restoration: a survey.
Artif. Intell. Rev., 2022

Introducing the CURLICAT Corpora: Seven-language Domain Specific Annotated Corpora from Curated Sources.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

Distilling the Knowledge of Romanian BERTs Using Multiple Teachers.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

Curated Multilingual Language Resources for CEF AT (CURLICAT): overall view.
Proceedings of the 23rd Annual Conference of the European Association for Machine Translation, 2022

2021
Romanian Speech Recognition Experiments from the ROBIN Project.
CoRR, 2021

More Romanian word embeddings from the RETEROM project.
CoRR, 2021

Establishing a Baseline of Romanian Speech-to-Text Models.
Proceedings of the International Conference on Speech Technology and Human-Computer Dialogue, 2021

PyEuroVoc: A Tool for Multilingual Legal Document Classification with EuroVoc Descriptors.
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021), 2021

A Modular Approach for Romanian-English Speech Translation.
Proceedings of the Natural Language Processing and Information Systems, 2021

2020
Collection and Annotation of the Romanian Legal Corpus.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

MWSA Task at GlobaLex 2020: RACAI's Word Sense Alignment System using a Similarity Measurement of Dictionary Definitions.
Proceedings of the 2020 Globalex Workshop on Linked Lexicography, 2020

A Processing Platform Relating Data and Tools for Romanian Language.
Proceedings of the 1st International Workshop on Language Technology Platforms, 2020

2019
Collecting Comparable Corpora.
Proceedings of the Using Comparable Corpora for Under-Resourced Areas of Machine Translation, 2019

Mapping and Aligning Units from Comparable Corpora.
Proceedings of the Using Comparable Corpora for Under-Resourced Areas of Machine Translation, 2019

Making Pepper Understand and Respond in Romanian.
Proceedings of the 22nd International Conference on Control Systems and Computer Science, 2019

2018
A Bird's-eye View of Language Processing Projects at the Romanian Academy.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

BioRo: The Biomedical Corpus for the Romanian Language.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

The Reference Corpus of the Contemporary Romanian Language (CoRoLa).
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

2017
A data-driven approach to verbal multiword expression detection. PARSEME Shared Task system description paper.
Proceedings of the 13th Workshop on Multiword Expressions, 2017

RACAI's Natural Language Processing pipeline for Universal Dependencies.
Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, 2017

2016
The strategic impact of META-NET on the regional, national and international level.
Lang. Resour. Evaluation, 2016

The IPR-cleared Corpus of Contemporary Written and Spoken Romanian Language.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

2014
News about the Romanian Wordnet.
Proceedings of the Seventh Global Wordnet Conference, 2014

Large SMT data-sets extracted from Wikipedia.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

CoRoLa ― The Reference Corpus of Contemporary Romanian Language.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

2013
CLEF 2013: information access evaluation meets multilinguality, multimodality, and visualization.
SIGIR Forum, 2013

The Romanian wordnet in a nutshell.
Lang. Resour. Evaluation, 2013

Wiki-Translator: Multilingual Experiments for In-Domain Translations.
Comput. Sci. J. Moldova, 2013

The RACAI speech translation system challenges of morphologically rich languages.
Proceedings of the 7th Conference on Speech Technology and Human-Computer Dialogue, 2013

Wikipedia as an SMT Training Corpus.
Proceedings of the Recent Advances in Natural Language Processing, 2013

Introduction to the CLEF2013 Labs and Workshop.
Proceedings of the Working Notes for CLEF 2013 Conference, 2013

Large tagset labeling using Feed Forward Neural Networks. Case study on Romanian Language.
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, 2013

2012
Experiments with a differential semantics annotation for WordNet 3.0.
Decis. Support Syst., 2012

Finding Translation Examples for Under-Resourced Language Pairs or for Narrow Domains; the Case for Machine Translation.
Comput. Sci. J. Moldova, 2012

Collecting and Using Comparable Corpora for Statistical Machine Translation.
Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012

ROMBAC: The Romanian Balanced Annotated Corpus.
Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012

Romanian TimeBank: An Annotated Parallel Corpus for Temporal Information.
Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012

Romanian to English automatic MT experiments at IWSLT12 (system description paper).
Proceedings of the 2012 International Workshop on Spoken Language Translation, 2012

Cascaded Phrase-Based Statistical Machine Translation Systems.
Proceedings of the 16th Annual conference of the European Association for Machine Translation, 2012

2011
Natural Language Question Answering in Open Domains.
Comput. Sci. J. Moldova, 2011

An Osgoodian perspective on WordNet.
Proceedings of the 6th International Conference Speech Technology and Human-Computer Dialogue, 2011

2010
Resource and Service Centres as the Backbone for a Sustainable Service Infrastructure.
Proceedings of the International Conference on Language Resources and Evaluation, 2010

A Differential Semantics Approach to the Annotation of Synsets in WordNet.
Proceedings of the International Conference on Language Resources and Evaluation, 2010

A Collection of Comparable Corpora for Under-resourced Languages.
Proceedings of the Human Language Technologies - The Baltic Perspective, 2010

Monolingual and Multilingual Question Answering on European Legislation.
Proceedings of the CLEF 2010 LABs and Workshops, 2010

2009
Multilingual versus monolingual word sense disambiguation.
Int. J. Speech Technol., 2009

A Trainable Multi-factored QA System.
Proceedings of the Multilingual Information Access Evaluation I. Text Retrieval Experiments, 2009

2008
Unsupervised Lexical Acquisition for Part of Speech Tagging.
Proceedings of the International Conference on Language Resources and Evaluation, 2008

RACAI's Linguistic Web Services.
Proceedings of the International Conference on Language Resources and Evaluation, 2008

DIAC+: a Professional Diacritics Recovering System.
Proceedings of the International Conference on Language Resources and Evaluation, 2008

A Hybrid Approach to Extracting and Classifying Verb+Noun Constructions.
Proceedings of the International Conference on Language Resources and Evaluation, 2008

RACAI's QA System at the Romanian-Romanian Multiple Language Question Answering (QA@CLEF2008) Main Task.
Proceedings of the Working Notes for CLEF 2008 Workshop co-located with the 12th European Conference on Digital Libraries (ECDL 2008), 2008

RACAI's QA System at the Romanian-Romanian QA@CLEF2008 Main Task.
Proceedings of the Evaluating Systems for Multilingual and Multimodal Information Access, 2008

2007
Ontology-Supported Text Classification Based on Cross-Lingual Word Sense Disambiguation.
Proceedings of the Applications of Fuzzy Sets Theory, 2007

RACAI: Meaning Affinity Models.
Proceedings of the 4th International Workshop on Semantic Evaluations, 2007

Exploiting Aligned Parallel Corpora in Multilingual Studies and Applications.
Proceedings of the Intercultural Collaboration, First International Workshop, 2007

RACAI's Question Answering System at QA@CLEF 2007.
Proceedings of the Working Notes for CLEF 2007 Workshop co-located with the 11th European Conference on Digital Libraries (ECDL 2007), 2007

RACAI's Question Answering System at QA@CLEF2007.
Proceedings of the Advances in Multilingual and Multimodal Information Retrieval, 2007

2006
From Word Alignment to Word Senses, via Multilingual Wordnets.
Comput. Sci. J. Moldova, 2006

RoCo-News: A Hand Validated Journalistic Corpus of Romanian.
Proceedings of the Fifth International Conference on Language Resources and Evaluation, 2006

The JRC-Acquis: A Multilingual Aligned Parallel Corpus with 20+ Languages.
Proceedings of the Fifth International Conference on Language Resources and Evaluation, 2006

Aligning Multilingual Thesauri.
Proceedings of the Fifth International Conference on Language Resources and Evaluation, 2006

Tagset Mapping and Statistical Training Data Cleaning-up.
Proceedings of the Fifth International Conference on Language Resources and Evaluation, 2006

Dependency-Based Phrase Alignment.
Proceedings of the Fifth International Conference on Language Resources and Evaluation, 2006

Acquis Communautaire Sentence Alignment using Support Vector Machines.
Proceedings of the Fifth International Conference on Language Resources and Evaluation, 2006

Word Senses: The Stepping Stones in Semantic-Based Natural Language Processing.
Proceedings of the Artificial Intelligence Applications and Innovations, 2006

Improved Lexical Alignment by Combining Multiple Reified Alignments.
Proceedings of the EACL 2006, 2006

Developing a Question Answering System for the Romanian-English Track at CLEF 2006.
Proceedings of the Working Notes for CLEF 2006 Workshop co-located with the 10th European Conference on Digital Libraries (ECDL 2006), 2006

Cross-Lingual Romanian to English Question Answering at CLEF 2006.
Proceedings of the Evaluation of Multilingual and Multi-modal Information Retrieval, 2006

2005
Evaluating the Word Sense Disambiguation Accuracy with Three Different Sense Inventories.
Proceedings of the Natural Language Understanding and Cognitive Science, 2005

Combined Word Alignments.
Proceedings of the Workshop on Building and Using Parallel Texts@ACL 2005, 2005

2004
Extracting Multilingual Lexicons from Parallel Corpora.
Comput. Humanit., 2004

The Semantic Web and Language Technology, Its Potential and Practicalities: EUROLAN-2003.
AI Mag., 2004

An evaluation exercise for Romanian Word Sense Disambiguation.
Proceedings of the Third International Workshop on the Evaluation of Systems for the Semantic Analysis of Text, 2004

Interlingual wordnets validation and word-sense disambiguation.
Proceedings of the Natural Language Understanding and Cognitive Science, 2004

Word Sense Disambiguation as a Wordnets' Validation Method in Balkanet.
Proceedings of the Fourth International Conference on Language Resources and Evaluation, 2004

Tiered Tagging Revisited.
Proceedings of the Fourth International Conference on Language Resources and Evaluation, 2004

A Methodology and Associated Tools for Building Interlingual Wordnets.
Proceedings of the Fourth International Conference on Language Resources and Evaluation, 2004

Term Translations in Parallel Corpora: Discovery and Consistency Check.
Proceedings of the Fourth International Conference on Language Resources and Evaluation, 2004

Fine-Grained Word Sense Disambiguation Based on Parallel Corpora, Word Alignment, Word Clustering and Aligned Wordnets.
Proceedings of the COLING 2004, 2004

2003
TREQ-AL: A word alignment system with limited language resources.
Proceedings of the HLT-NAACL 2003 Workshop on Building and Using Parallel Texts: Data Driven Machine Translation and Beyond, 2003

2002
Empirical Methods for Exploiting Parallel Texts.
Lit. Linguistic Comput., 2002

Revealing Translators' Knowledge: Statistical Methods in Constructing Practical Translation Lexicons for Language and Speech Processing.
Int. J. Speech Technol., 2002

Sense Discrimination with Parallel Corpora.
Proceedings of the ACL Workshop on Word Sense Disambiguation: Recent Successes and Future Directions, 2002

Lexical token alignment: experiments, results and applications.
Proceedings of the Third International Conference on Language Resources and Evaluation, 2002

A Cheap and Fast Way to Build Useful Translation Lexicons.
Proceedings of the 19th International Conference on Computational Linguistics, 2002

2001
Automatic Sense Tagging Using Parallel Corpora.
Proceedings of the Sixth Natural Language Processing Pacific Rim Symposium, 2001

2000
Principled Hidden Tagset Design for Tiered Tagging of Hungarian.
Proceedings of the Second International Conference on Language Resources and Evaluation, 2000

Using a Large Set of EAGLES-compliant Morpho-syntactic Descriptors as a Tagset for Probabilistic Tagging.
Proceedings of the Second International Conference on Language Resources and Evaluation, 2000

Large Tagsets and High Accuracy in Statistical Morpho-Syntactic Disambiguation of Written Texts.
Proceedings of the Recent Topics in Mathematical and Computational Linguistics, 2000

1999
Tiered Tagging and Combined Language Models Classifiers.
Proceedings of the Text, Speech and Dialogue - Second International Workshop, 1999

1998
Tagging Romanian texts: a case study for QTAG, a language independent probabilistic tagger.
Proceedings of the First International Conference on Language Resources and Evaluation, 1998

Standardised specifications, development and assessment of large morpho-lexical resources for six Central and Eastern European languages.
Proceedings of the First International Conference on Language Resources and Evaluation, 1998

Multext-East: Parallel and Comparable Corpora and Lexicons for Six Central and Eastern European Languages.
Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, 1998

1994
Situation Viewpoints for Generation.
Proceedings of the Seventh International Workshop on Natural Language Generation, 1994

1989
It Would Be Much Easier If WENT Were GOED.
Proceedings of the EACL 1989, 1989

1984
IURES: A Human Engineering Approach to Natural Language.
Proceedings of the Artificial Intelligence: Methodology, Systems, Applications, 1984

