Pablo Gamallo

Orcid: 0000-0002-5819-2469

Affiliations:
  • University of Santiago de Compostela, Spain


According to our database, Pablo Gamallo authored at least 120 papers between 2001 and 2024.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2024
An unsupervised perplexity-based method for boilerplate removal.
Nat. Lang. Eng., January, 2024

CorpusNÓS: A massive Galician corpus for training large language models.
Proceedings of the 16th International Conference on Computational Processing of Portuguese, 2024

Exploring the effects of vocabulary size in neural machine translation: Galician as a target language.
Proceedings of the 16th International Conference on Computational Processing of Portuguese, 2024

2023
Desenvolvimento e avaliação de um modelo NER no domínio da análise cultural e do turismo.
Linguamática, December, 2023

Contextualized word senses: from attention to compositionality.
CoRR, 2023

Automatic Authorship Attribution in the Work of Tirso de Molina.
CoRR, 2023

2022
Evaluating Contextualized Vectors from both Large Language Models and Compositional Strategies.
Proces. del Leng. Natural, 2022

Proxecto Nós: Artificial intelligence at the service of the Galician language.
Proceedings of the Annual Conference of the Spanish Association for Natural Language Processing: Projects and Demonstrations (SEPLN-PD 2022) co-located with the Conference of the Spanish Society for Natural Language Processing (SEPLN 2022), 2022

A neural machine translation system for Galician from transliterated Portuguese text.
Proceedings of the Annual Conference of the Spanish Association for Natural Language Processing: Projects and Demonstrations (SEPLN-PD 2022) co-located with the Conference of the Spanish Society for Natural Language Processing (SEPLN 2022), 2022

An exploration of the semantic knowledge in vector models: polysemy, synonymy and idiomaticity.
Proceedings of the Annual Conference of the Spanish Association for Natural Language Processing: Projects and Demonstrations (SEPLN-PD 2022) co-located with the Conference of the Spanish Society for Natural Language Processing (SEPLN 2022), 2022

Revisiting CCNet for Quality Measurements in Galician.
Proceedings of the Computational Processing of the Portuguese Language, 2022

2021
Using Dependency-Based Contextualization for transferring Passive Constructions from English to Spanish.
Proces. del Leng. Natural, 2021

Uso de tecnologias linguísticas para estudar a evolução dos sufixos -ÇOM e -VEL no galego-português medieval a partir de corpora históricos.
Linguamática, 2021

A Methodology to Measure the Diachronic Language Distance between Three Languages Based on Perplexity.
J. Quant. Linguistics, 2021

CiTIUS at the TREC 2021 Health Misinformation Track.
Proceedings of the Thirtieth Text REtrieval Conference, 2021

LeMe-PT: A Medical Package Leaflet Corpus for Portuguese.
Proceedings of the 10th Symposium on Languages, Applications and Technologies, 2021

CiTIUS at FakeDeS 2021: A Hybrid Strategy for Fake News Detection.
Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2021) co-located with the Conference of the Spanish Society for Natural Language Processing (SEPLN 2021), 2021

Comparing Dependency-based Compositional Models with Contextualized Word Embeddings.
Proceedings of the 13th International Conference on Agents and Artificial Intelligence, 2021

Comparing Traditional Machine Learning Methods for COVID-19 Fake News.
Proceedings of the 22nd International Arab Conference on Information Technology, 2021

2020
Measuring diachronic language distance using perplexity: Application to English, Portuguese, and Spanish.
Nat. Lang. Eng., 2020

Evaluating and improving lexical resources for detecting signs of depression in text.
Lang. Resour. Evaluation, 2020

Distância diacrónica automática entre variantes diatópicas do português e do espanhol.
Linguamática, 2020

Measuring Language Distance of Isolated European Languages.
Inf., 2020

CitiusNLP at SemEval-2020 Task 3: Comparing Two Approaches for Word Vector Contextualization.
Proceedings of the Fourteenth Workshop on Semantic Evaluation, 2020

The Impact of Linguistic Knowledge in Different Strategies to Learn Cross-Lingual Distributional Models.
Proceedings of the ECAI 2020 - 24th European Conference on Artificial Intelligence, 29 August-8 September 2020, Santiago de Compostela, Spain, 2020

2019
Cross-lingual Diachronic Distance: Application to Portuguese and Spanish.
Proces. del Leng. Natural, 2019

Using the Outlier Detection Task to Evaluate Distributional Semantic Models.
Mach. Learn. Knowl. Extr., 2019

Uma utilidade para o reconhecimento de topónimos em documentos medievais.
Linguamática, 2019

A dependency-based approach to word contextualization using compositional distributional semantics.
J. Lang. Model., 2019

Editorial for the Special Issue on "Natural Language Processing and Text Mining".
Inf., 2019

Comparing Supervised Machine Learning Strategies and Linguistic Features to Search for Very Negative Opinions.
Inf., 2019

Contextualized Translations of Phrasal Verbs with Distributional Compositional Semantics and Monolingual Corpora.
Comput. Linguistics, 2019

Identifying Causal Relations in Legal Documents with Dependency Syntactic Analysis.
Proceedings of the 8th Symposium on Languages, Applications and Technologies, 2019

NER and Open Information Extraction for Portuguese: Notebook for IberLEF 2019 Portuguese Named Entity Recognition and Relation Extraction Tasks.
Proceedings of the Iberian Languages Evaluation Forum co-located with 35th Conference of the Spanish Society for Natural Language Processing, 2019

CiTIUS-COLE at SemEval-2019 Task 5: Combining Linguistic Features to Identify Hate Speech Against Immigrants and Women on Multilingual Tweets.
Proceedings of the 13th International Workshop on Semantic Evaluation, 2019

Unsupervised Compositional Translation of Multiword Expressions.
Proceedings of the Joint Workshop on Multiword Expressions and WordNet, 2019

Supervised Classifiers to Identify Hate Speech on English and Spanish Tweets.
Proceedings of the Digital Libraries at the Crossroads of Digital Information for the Future, 2019

Naive-Bayesian Classification for Bot Detection in Twitter.
Proceedings of the Working Notes of CLEF 2019, 2019

2018
GeoHbbTV: A framework for the development and evaluation of geographic interactive TV contents.
Multim. Tools Appl., 2018

Estratégias Lexicométricas para Detetar Especificidades Textuais.
Linguamática, 2018

Explorando métodos non-supervisados para calcular a similitude semántica textual.
Linguamática, 2018

Dependency parsing with finite state transducers and compression rules.
Inf. Process. Manag., 2018

Polypus: a Big Data Self-Deployable Architecture for Microblogging Text Extraction and Real-Time Sentiment Analysis.
CoRR, 2018

Distributional semantics for diachronic search.
Comput. Electr. Eng., 2018

Measuring language distance among historical varieties using perplexity. Application to European Portuguese.
Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects, 2018

LinguaKit: A Big Data-Based Multilingual Tool for Linguistic Analysis and Information Extraction.
Proceedings of the Fifth International Conference on Social Networks Analysis, 2018

A Comparative Study of Polarity Lexicons to Identify Extreme Opinions.
Proceedings of the Fifth International Conference on Social Networks Analysis, 2018

Evaluation of Distributional Models with the Outlier Detection Task.
Proceedings of the 7th Symposium on Languages, Applications and Technologies, 2018

CitiusNLP at SemEval-2018 Task 10: The Use of Transparent Distributional Models and Salient Contexts to Discriminate Word Attributes.
Proceedings of The 12th International Workshop on Semantic Evaluation, 2018

Task-Oriented Evaluation of Dependency Parsing with Open Information Extraction.
Proceedings of the Computational Processing of the Portuguese Language, 2018

Linguistic Features to Identify Extreme Opinions: An Empirical Study.
Proceedings of the Intelligent Data Engineering and Automated Learning - IDEAL 2018, 2018

2017
Comparing explicit and predictive distributional semantic models endowed with syntactic contexts.
Lang. Resour. Evaluation, 2017

LinguaKit: uma ferramenta multilingue para a análise linguística e a extração de informação.
Linguamática, 2017

A Perplexity-Based Method for Similar Languages Discrimination.
Proceedings of the Fourth Workshop on NLP for Similar Languages, 2017

Citius at SemEval-2017 Task 2: Cross-Lingual Similarity from Comparable Corpora and Dependency-Based Contexts.
Proceedings of the 11th International Workshop on Semantic Evaluation, 2017

Sense Contextualization in a Dependency-Based Compositional Distributional Model.
Proceedings of the 2nd Workshop on Representation Learning for NLP, 2017

Automatic Construction of Domain-Specific Sentiment Lexicons for Polarity Classification.
Proceedings of the Trends in Cyber-Physical Multi-Agent Systems. The PAAMS Collection, 2017

Searching for the Most Negative Opinions.
Proceedings of the Knowledge Engineering and Semantic Web - 8th International Conference, 2017

A Web Interface for Diachronic Semantic Search in Spanish.
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, 2017

A rule-based system for cross-lingual parsing of Romance languages with Universal Dependencies.
Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, 2017

2016
TweetLID: a benchmark for tweet language identification.
Lang. Resour. Evaluation, 2016

Comparing Two Basic Methods for Discriminating Between Similar Languages and Varieties.
Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects, 2016

Entity Linking with Distributional Semantics.
Proceedings of the Computational Processing of the Portuguese Language, 2016

TweetMT: A Parallel Microblog Corpus.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

2015
Exploring the effectiveness of linguistic knowledge for biographical relation extraction.
Nat. Lang. Eng., 2015

TweetNorm: a benchmark for lexical normalization of Spanish tweets.
Lang. Resour. Evaluation, 2015

Yet Another Suite of Multilingual NLP Tools.
Proceedings of the Languages, Applications and Technologies - 4th International Symposium, 2015

Overview of TweetMT: A Shared Task on Machine Translation of Tweets at SEPLN 2015.
Proceedings of the Tweet Translation Workshop 2015 co-located with 31st Conference of the Spanish Society for Natural Language Processing (SEPLN 2015), 2015

Dependency Parsing with Compression Rules.
Proceedings of the 14th International Conference on Parsing Technologies, 2015

Multilingual Open Information Extraction.
Proceedings of the Progress in Artificial Intelligence, 2015

2014
PoS-tagging the Web in Portuguese. National varieties, text typologies and spelling systems.
Proces. del Leng. Natural, 2014

Entity-Centric Coreference Resolution of Person Entities for Open Information Extraction.
Proces. del Leng. Natural, 2014

Análisis morfosintáctico y clasificación de entidades nombradas en un entorno Big Data.
Proces. del Leng. Natural, 2014

An Overview of Open Information Extraction (Invited talk).
Proceedings of the 3rd Symposium on Languages, Applications and Technologies, 2014

Overview of TweetLID: Tweet Language Identification at SEPLN 2014.
Proceedings of the Tweet Language Identification Workshop co-located with 30th Conference of the Spanish Society for Natural Language Processing, 2014

Comparing Ranking-based and Naive Bayes Approaches to Language Detection on Tweets.
Proceedings of the Tweet Language Identification Workshop co-located with 30th Conference of the Spanish Society for Natural Language Processing, 2014

Citius: A Naive-Bayes Strategy for Sentiment Analysis on English Tweets.
Proceedings of the 8th International Workshop on Semantic Evaluation, 2014

Multilingual corpora with coreferential annotation of person entities.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

TweetNorm_es: an annotated corpus for Spanish microtext normalization.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

An Entity-Centric Coreference Resolution System for Person Entities with Rich Linguistic Information.
Proceedings of the COLING 2014, 2014

Perldoop: Efficient execution of Perl scripts on Hadoop clusters.
Proceedings of the 2014 IEEE International Conference on Big Data (IEEE BigData 2014), 2014

2013
A Method to Lexical Normalisation of Tweets.
Proceedings of the Tweet Normalization Workshop co-located with 29th Conference of the Spanish Society for Natural Language Processing (SEPLN 2013), 2013

Introducción a la Tarea Compartida Tweet-Norm 2013: Normalización Léxica de Tuits en Español.
Proceedings of the Tweet Normalization Workshop co-located with 29th Conference of the Spanish Society for Natural Language Processing (SEPLN 2013), 2013

Analyzing the Sense Distribution of Concordances Obtained by Web as Corpus Approach.
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2013

2012
Extraction of Bilingual Cognates from Wikipedia.
Proceedings of the Computational Processing of the Portuguese Language, 2012

2011
Resolución de Correferencia de Nombres de Persona para Extracción de Información Biográfica.
Proces. del Leng. Natural, 2011

Is singular value decomposition useful for word similarity extraction?
Lang. Resour. Evaluation, 2011

Evaluating Various Linguistic Features on Semantic Relation Extraction.
Proceedings of the Recent Advances in Natural Language Processing, 2011

A Resource-Based Method for Named Entity Extraction and Classification.
Proceedings of the Progress in Artificial Intelligence, 2011

2010
Análise Morfossintáctica para Português Europeu e Galego: Problemas, Soluções e Avaliação.
Linguamática, 2010

Vencendo a escassez de recursos computacionais. Carvalho: Tradutor Automático Estatístico Inglês-Galego a partir do corpus paralelo Europarl Inglês-Português.
Linguamática, 2010

Automatic Generation of Bilingual Dictionaries Using Intermediary Languages and Comparable Corpora.
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2010

2009
Una gramática de dependencias basada en patrones de etiquetas.
Proces. del Leng. Natural, 2009

Carvalho: Un sistema de traducción estadística inglés-galego construído a partir del corpus paralelo inglés-portugués EuroParl.
Proces. del Leng. Natural, 2009

Comparing Different Properties Involved in Word Similarity Extraction.
Proceedings of the Progress in Artificial Intelligence, 2009

2008
Automatic Acquisition of Formal Concepts from Text.
LDV Forum, 2008

Comparing Window and Syntax Based Strategies for Semantic Extraction.
Proceedings of the Computational Processing of the Portuguese Language, 2008

Learning Spanish-Galician Translation Equivalents Using a Comparable Corpus and a Bilingual Dictionary.
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2008

2007
Un Método de Extracción de Equivalentes de Traducción a partir de un Corpus Comparable Castellano-Gallego.
Proces. del Leng. Natural, 2007

El Proyecto Gari-Coter en el Seno del Proyecto RICOTERM2.
Proces. del Leng. Natural, 2007

Inducing Classes of Terms from Text.
Proceedings of the Text, Speech and Dialogue, 10th International Conference, 2007

Learning bilingual lexicons from comparable English and Spanish corpora.
Proceedings of Machine Translation Summit XI: Papers, 2007

2006
Using Natural Alignment to Extract Translation Equivalents.
Proceedings of the Computational Processing of the Portuguese Language, 2006

2005
El tratamiento de la polisemia en la extracción de léxicos bilingües a partir de corpora paralelos.
Proces. del Leng. Natural, 2005

Clustering Syntactic Positions with Similar Semantic Requirements.
Comput. Linguistics, 2005

An Approach to Acquire Word Translations from Non-parallel Texts.
Proceedings of the Progress in Artificial Intelligence, 2005

Extraction of translation equivalents from parallel corpora using sense-sensitive contexts.
Proceedings of the 10th EAMT Conference: Practical applications of machine translation, 2005

2004
Finite Element Methods in Local Active Control of Sound.
SIAM J. Control. Optim., 2004

The role of Optional Co-Composition to Solve lexical and Syntactic Ambiguity.
Proces. del Leng. Natural, 2004

Disambiguation and Optional Co-Composition.
Proceedings of the Actes de la 11ème conférence sur le Traitement Automatique des Langues Naturelles. Posters, 2004

Cluster Analysis of Named Entities.
Proceedings of the Intelligent Information Processing and Web Mining, 2004

A Divide-and-Conquer Approach to Acquire Syntactic Categories.
Proceedings of the Grammatical Inference: Algorithms and Applications, 2004

2003
Finite element analysis of pressure formulation of the elastoacoustic problem.
Numerische Mathematik, 2003

Acquiring Semantic Classes to Elaborate Attachment Heuristics.
Proceedings of the Progress in Artificial Intelligence, 2003

2002
Usando la co-composicionalidad para el aprendizaje de la subcategorización sintáctico-semántica.
Proces. del Leng. Natural, 2002

Assessment of Selection Restrictions Acquisition.
Proceedings of the Advances in Artificial Intelligence, 2002

2001
Syntactic-Based Methods for Measuring Word Similarity.
Proceedings of the Text, Speech and Dialogue, 4th International Conference, 2001

Selection Restrictions Acquisition for Parsing Improvement.
Proceedings of the Web Knowledge Management and Decision Support, 2001

Selection Restrictions Acquisition for Parsing and Information Retrieval Improvement.
Proceedings of the 14th International Conference on Applications of Prolog, 2001

Selection Restrictions Acquisition from Corpora.
Proceedings of the Progress in Artificial Intelligence, 2001

