Arkaitz Zubiaga

ORCID: 0000-0003-4583-3623

Affiliations:
  • Queen Mary University of London, UK
  • University of Warwick, UK (former)


According to our database, Arkaitz Zubiaga authored at least 155 papers between 1997 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Few-Shot Learning for Cross-Target Stance Detection by Aggregating Multimodal Embeddings.
IEEE Trans. Comput. Soc. Syst., April, 2024

Special issue on analysis and mining of social media data.
PeerJ Comput. Sci., 2024

SocialPET: Socially Informed Pattern Exploiting Training for Few-Shot Stance Detection in Social Media.
CoRR, 2024

ID-XCB: Data-independent Debiasing for Fair and Accurate Transformer-based Cyberbullying Detection.
CoRR, 2024

Synergizing Machine Learning & Symbolic Methods: A Survey on Hybrid Approaches to Natural Language Processing.
CoRR, 2024

Claim Detection for Automated Fact-checking: A Survey on Monolingual, Multilingual and Cross-Lingual Research.
CoRR, 2024

Cross-lingual Offensive Language Detection: A Systematic Review of Datasets, Transfer Approaches and Challenges.
CoRR, 2024

Hate Speech Detection and Reclaimed Language: Mitigating False Positives and Compounded Discrimination.
Proceedings of the 16th ACM Web Science Conference, 2024


MAPLE: Micro Analysis of Pairwise Language Evolution for Few-Shot Claim Verification.
Proceedings of the Findings of the Association for Computational Linguistics: EACL 2024, 2024

2023
Session-based cyberbullying detection in social media: A survey.
Online Soc. Networks Media, July, 2023

Natural language processing in the era of large language models.
Frontiers Artif. Intell., February, 2023

Special issue on intelligent systems for tackling online harms.
Pers. Ubiquitous Comput., 2023

Check-worthy claim detection across topics for automated fact-checking.
PeerJ Comput. Sci., 2023

Evaluating the generalisability of neural rumour verification models.
Inf. Process. Manag., 2023

Building for tomorrow: Assessing the temporal persistence of text classifiers.
Inf. Process. Manag., 2023

Generalizing Political Leaning Inference to Multi-Party Systems: Insights from the UK Political Landscape.
CoRR, 2023

Faithful Knowledge Graph Explanations for Commonsense Reasoning.
CoRR, 2023

Some Observations on Fact-Checking Work with Implications for Computational Support.
CoRR, 2023

Cluster-based Deep Ensemble Learning for Emotion Classification in Internet Memes.
CoRR, 2023

Target-Oriented Investigation of Online Abusive Attacks: A Dataset and Analysis.
IEEE Access, 2023

Learning like human annotators: Cyberbullying detection in lengthy social media sessions.
Proceedings of the ACM Web Conference 2023, 2023

AnnoBERT: Effectively Representing Multiple Annotators' Label Choices to Improve Hate Speech Detection.
Proceedings of the Seventeenth International AAAI Conference on Web and Social Media, 2023

SexWEs: Domain-Aware Word Embeddings via Cross-Lingual Semantic Specialisation for Chinese Sexism Detection in Social Media.
Proceedings of the Seventeenth International AAAI Conference on Web and Social Media, 2023


PANACEA: An Automated Misinformation Detection System on COVID-19.
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics. EACL 2023, 2023

Active PETs: Active Data Annotation Prioritisation for Few-Shot Claim Verification with Pattern Exploiting Training.
Proceedings of the Findings of the Association for Computational Linguistics: EACL 2023, 2023

NUAA-QMUL-AIIT at Memotion 3: Multi-modal Fusion with Squeeze-and-Excitation for Internet Meme Emotion Analysis.
Proceedings of De-Factify 2: 2nd Workshop on Multimodal Fact Checking and Hate Speech Detection, 2023

Overview of the CLEF-2023 LongEval Lab on Longitudinal Evaluation of Model Performance.
Proceedings of the Experimental IR Meets Multilinguality, Multimodality, and Interaction, 2023

Extended Overview of the CLEF-2023 LongEval Lab on Longitudinal Evaluation of Model Performance.
Proceedings of the Working Notes of the Conference and Labs of the Evaluation Forum (CLEF 2023), 2023

2022
Analysing the Existence of Organisation Specific Languages on Twitter: The Dataset.
Dataset, May, 2022

Aggregating pairwise semantic differences for few-shot claim verification.
PeerJ Comput. Sci., 2022

Editorial for Special Issue on Detecting, Understanding and Countering Online Harms.
Online Soc. Networks Media, 2022

Hidden behind the obvious: Misleading keywords and implicitly abusive language on social media.
Online Soc. Networks Media, 2022

SWSR: A Chinese dataset and lexicon for online sexism detection.
Online Soc. Networks Media, 2022

Aggregating Pairwise Semantic Differences for Few-Shot Claim Veracity Classification.
CoRR, 2022

HIT&QMUL at SemEval-2022 Task 9: Label-Enclosed Generative Question Answering (LEG-QA).
Proceedings of the 16th International Workshop on Semantic Evaluation, SemEval@NAACL 2022, 2022

Team dina at SemEval-2022 Task 8: Pre-trained Language Models as Baselines for Semantic Similarity.
Proceedings of the 16th International Workshop on Semantic Evaluation, SemEval@NAACL 2022, 2022

Natural Language Inference with Self-Attention for Veracity Assessment of Pandemic Claims.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

Improving Zero-Shot Cross-Lingual Hate Speech Detection with Pseudo-Label Fine-Tuning of Transformer Language Models.
Proceedings of the Sixteenth International AAAI Conference on Web and Social Media, 2022

Cyberbullying Detection across Social Media Platforms via Platform-Aware Adversarial Encoding.
Proceedings of the Sixteenth International AAAI Conference on Web and Social Media, 2022

2021
Towards generalisable hate speech detection: a review on obstacles and solutions.
PeerJ Comput. Sci., 2021

Feature-based detection of automated language models: tackling GPT-2, GPT-3 and Grover.
PeerJ Comput. Sci., 2021

Abusive language detection in youtube comments leveraging replies as conversational context.
PeerJ Comput. Sci., 2021

Automated fact-checking: A survey.
Lang. Linguistics Compass, 2021

Online Multilingual Hate Speech Detection: Experimenting with Hindi and English Social Media.
Inf., 2021

Citizen Participation and Machine Learning for a Better Democracy.
Digit. Gov. Res. Pract., 2021

Sexism Identification in Tweets and Gabs using Deep Neural Networks.
CoRR, 2021

Cross-lingual Hate Speech Detection using Transformer Models.
CoRR, 2021

A Longitudinal Multi-modal Dataset for Dementia Monitoring and Diagnosis.
CoRR, 2021

Capturing Stance Dynamics in Social Media: Open Challenges and Research Directions.
CoRR, 2021

The emojification of sentiment on social media: Collection and analysis of a longitudinal Twitter sentiment dataset.
CoRR, 2021

QMUL-SDS at SCIVER: Step-by-Step Binary Classification for Scientific Claim Verification.
CoRR, 2021

Analyzing the Existence of Organization Specific Languages on Twitter.
IEEE Access, 2021

Threatening Language Detection and Target Identification in Urdu Tweets.
IEEE Access, 2021

QMUL-SDS at EXIST: Leveraging Pre-trained Semantics and Lexical Features for Multilingual Sexism Detection in Social Networks.
Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2021) co-located with the Conference of the Spanish Society for Natural Language Processing (SEPLN 2021), 2021

OHARS: Second Workshop on Online Misinformation- and Harm-Aware Recommender Systems.
Proceedings of the RecSys '21: Fifteenth ACM Conference on Recommender Systems, Amsterdam, The Netherlands, 27 September 2021, 2021

Weakly Supervised Cross-platform Teenager Detection with Adversarial BERT.
Proceedings of the HT '21: 32nd ACM Conference on Hypertext and Social Media, Virtual Event, Ireland, 30 August 2021, 2021

Cross-lingual Capsule Network for Hate Speech Detection in Social Media.
Proceedings of the HT '21: 32nd ACM Conference on Hypertext and Social Media, Virtual Event, Ireland, 30 August 2021, 2021

Opinions are Made to be Changed: Temporally Adaptive Stance Classification.
Proceedings of the OASIS@HT 2021: Proceedings of the 2021 Workshop on Open Challenges in Online Social Networks, 2021

QMUL-SDS at CheckThat! 2021: Enriching Pre-Trained Language Models for the Estimation of Check-Worthiness of Arabic Tweets.
Proceedings of the Working Notes of CLEF 2021 - Conference and Labs of the Evaluation Forum, Bucharest, Romania, September 2021

2020
Early Detection of Social Media Hoaxes at Scale.
ACM Trans. Web, 2020

Birds of a feather check together: Leveraging homophily for sequential rumour detection.
Online Soc. Networks Media, 2020

TF-CR: Weighting Embeddings for Text Classification.
CoRR, 2020

An Online Multilingual Hate speech Recognition System.
CoRR, 2020

QMUL-SDS @ DIACR-ITA Evaluating Unsupervised Diachronic Lexical Semantics Classification in Italian.
CoRR, 2020

QMUL-SDS @ SardiStance: Leveraging Network Interactions to Boost Performance on Stance Detection using Knowledge Graphs.
CoRR, 2020

NUAA-QMUL at SemEval-2020 Task 8: Utilizing BERT and DenseNet for Internet Meme Emotion Analysis.
Proceedings of the Fourteenth Workshop on Semantic Evaluation, 2020

Workshop on Online Misinformation- and Harm-Aware Recommender Systems.
Proceedings of the RecSys 2020: Fourteenth ACM Conference on Recommender Systems, 2020

Stance Classification for Rumour Verification in Social Media Conversations.
Proceedings of the Workshop Proceedings of the 14th International AAAI Conference on Web and Social Media, 2020

QMUL-SDS @ SardiStance: Leveraging Network Interactions to Boost Performance on Stance Detection using Knowledge Graphs (short paper).
Proceedings of the Seventh Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2020), 2020

QMUL-SDS @ DIACR-Ita: Evaluating Unsupervised Diachronic Lexical Semantics Classification in Italian (short paper).
Proceedings of the Seventh Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2020), 2020

Detection and Resolution of Rumors and Misinformation with NLP.
Proceedings of the 28th International Conference on Computational Linguistics, 2020

QMUL-SDS at CheckThat! 2020 Determining COVID-19 Tweet Check-Worthiness Using an Enhanced CT-BERT with Numeric Expressions.
Proceedings of the Working Notes of CLEF 2020, 2020

Exploiting Class Labels to Boost Performance on Embedding-based Text Classification.
Proceedings of the CIKM '20: The 29th ACM International Conference on Information and Knowledge Management, 2020

2019
Gaussian Processes for Rumour Stance Classification in Social Media.
ACM Trans. Inf. Syst., 2019

Leveraging aspect phrase embeddings for cross-domain review rating prediction.
PeerJ Comput. Sci., 2019

Mining social media for newsgathering: A review.
Online Soc. Networks Media, 2019

Social media mining for journalism.
Online Inf. Rev., 2019

Processing social media in real-time.
Inf. Process. Manag., 2019

Political Homophily in Independence Movements: Analyzing and Classifying Social Media Users by National Identity.
IEEE Intell. Syst., 2019

SemEval-2019 Task 7: RumourEval, Determining Rumour Veracity and Support for Rumours.
Proceedings of the 13th International Workshop on Semantic Evaluation, 2019

2018
Microblog Analysis as a Program of Work.
ACM Trans. Soc. Comput., 2018

A longitudinal assessment of the persistence of twitter datasets.
J. Assoc. Inf. Sci. Technol., 2018

Discourse-aware rumour stance classification in social media using sequential classifiers.
Inf. Process. Manag., 2018

Detection and Resolution of Rumours in Social Media: A Survey.
ACM Comput. Surv., 2018

A Longitudinal Analysis of the Public Perception of the Opportunities and Challenges of the Internet of Things.
CoRR, 2018

Towards Automated Factchecking: Developing an Annotation Schema and Benchmark for Consistent Automated Claim Detection.
CoRR, 2018

RumourEval 2019: Determining Rumour Veracity and Support for Rumours.
CoRR, 2018

Mining Social Media for Newsgathering.
CoRR, 2018

Learning Class-specific Word Representations for Early Detection of Hoaxes in Social Media.
CoRR, 2018

All-in-one: Multi-task Learning for Rumour Verification.
Proceedings of the 27th International Conference on Computational Linguistics, 2018

2nd International Workshop on Rumours and Deception in Social Media: Preface.
Proceedings of the CIKM 2018 Workshops co-located with 27th ACM International Conference on Information and Knowledge Management (CIKM 2018), 2018

2017
Towards Real-Time, Country-Level Location Classification of Worldwide Tweets.
IEEE Trans. Knowl. Data Eng., 2017

Using Fuzzy Logic to Leverage HTML Markup for Web Page Representation.
IEEE Trans. Fuzzy Syst., 2017

Stance Classification of Social Media Users in Independence Movements.
CoRR, 2017

Exploiting Context for Rumour Detection in Social Media.
Proceedings of the Social Informatics, 2017

A Hierarchical Topic Modelling Approach for Tweet Clustering.
Proceedings of the Social Informatics, 2017

Stance Classification in Out-of-Domain Rumours: A Case Study Around Mental Health Disorders.
Proceedings of the Social Informatics, 2017

Overview of the M-WePNaD Task: Multilingual Web Person Name Disambiguation at IberEval 2017.
Proceedings of the Second Workshop on Evaluation of Human Language Technologies for Iberian Languages (IberEval 2017) co-located with 33rd Conference of the Spanish Society for Natural Language Processing (SEPLN 2017), 2017

SemEval-2017 Task 8: RumourEval: Determining rumour veracity and support for rumours.
Proceedings of the 11th International Workshop on Semantic Evaluation, 2017

WISC at MediaEval 2017: Multimedia Satellite Task.
Proceedings of the Working Notes Proceedings of the MediaEval 2017 Workshop co-located with the Conference and Labs of the Evaluation Forum (CLEF 2017), 2017

TOTEMSS: Topic-based, Temporal Sentiment Summarisation for Twitter.
Proceedings of the IJCNLP 2017, Taipei, Taiwan, November 27, 2017

TDParse: Multi-target-specific sentiment recognition on Twitter.
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, 2017

Supporting the Use of User Generated Content in Journalistic Practice.
Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, 2017

2016
DeliciousMIL: A Data Set for Multi-Label Multi-Instance Learning with Instance Labels.
Dataset, October, 2016

TweetLID: a benchmark for tweet language identification.
Lang. Resour. Evaluation, 2016

Graphical Perception of Value Distributions: An Evaluation of Non-Expert Viewers' Data Literacy.
J. Community Informatics, 2016

Learning Reporting Dynamics during Breaking News for Rumour Detection in Social Media.
CoRR, 2016

Using Gaussian Processes for Rumour Stance Classification in Social Media.
CoRR, 2016

Reports of the Workshops Held at the 2016 International AAAI Conference on Web and Social Media.
AI Mag., 2016

TweetMT: A Parallel Microblog Corpus.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

SMILES: Twitter Emotion Classification using Domain.
Proceedings of the 4th Workshop on Sentiment Analysis where AI meets Psychology (SAAIP 2016) co-located with 25th International Joint Conference on Artificial Intelligence (IJCAI 2016), 2016

Stance Classification in Rumours as a Sequential Task Exploiting the Tree Structure of Social Media Conversations.
Proceedings of the COLING 2016, 2016

Hawkes Processes for Continuous Time Sequence Classification: an Application to Rumour Stance Classification in Twitter.
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016

2015
Exploiting Geolocation, User and Temporal Information for Natural Hazards Monitoring in Twitter.
Proces. del Leng. Natural, 2015

TweetNorm: a benchmark for lexical normalization of Spanish tweets.
Lang. Resour. Evaluation, 2015

Real-time classification of Twitter trends.
J. Assoc. Inf. Sci. Technol., 2015

Analysing How People Orient to and Spread Rumours in Social Media by Looking at Conversational Threads.
CoRR, 2015

Euskahaldun: Euskararen Aldeko Martxa Baten Sare Sozialetako Islaren Bilketa eta Analisia.
CoRR, 2015

Microblog Analysis as a Programme of Work.
CoRR, 2015

Crowdsourcing the Annotation of Rumourous Conversations in Social Media.
Proceedings of the 24th International Conference on World Wide Web Companion, 2015

Overview of TweetMT: A Shared Task on Machine Translation of Tweets at SEPLN 2015.
Proceedings of the Tweet Translation Workshop 2015 co-located with 31st Conference of the Spanish Society for Natural Language Processing (SEPLN 2015), 2015

WarwickDCS: From Phrase-Based to Target-Specific Sentiment Recognition.
Proceedings of the 9th International Workshop on Semantic Evaluation, 2015

Making the Most of Tweet-Inherent Features for Social Spam Detection on Twitter.
Proceedings of the 5th Workshop on Making Sense of Microposts co-located with the 24th International World Wide Web Conference (WWW 2015), 2015

Towards Detecting Rumours in Social Media.
Proceedings of the Artificial Intelligence for Cities, 2015

2014
Tweet, but verify: epistemic study of information verification on Twitter.
Soc. Netw. Anal. Min., 2014

Overview of TweetLID: Tweet Language Identification at SEPLN 2014.
Proceedings of the Tweet Language Identification Workshop co-located with 30th Conference of the Spanish Society for Natural Language Processing, 2014

TweetNorm_es: an annotated corpus for Spanish microtext normalization.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

Newsworthiness and Network Gatekeeping on Twitter: The Role of Social Deviance.
Proceedings of the Eighth International Conference on Weblogs and Social Media, 2014

2013
Reciprocal Enrichment Between Basque Wikipedia and Machine Translation.
Proceedings of the People's Web Meets NLP, Collaboratively Constructed Language Resources, 2013

Harnessing Folksonomies to Produce a Social Classification of Resources.
IEEE Trans. Knowl. Data Eng., 2013

Stacking from Tags: Clustering Bookmarks around a Theme.
CoRR, 2013

Reports on the Workshops Held at the Sixth International AAAI Conference on Weblogs and Social Media.
AI Mag., 2013

Harnessing web page directories for large-scale classification of tweets.
Proceedings of the 22nd International World Wide Web Conference, 2013

Newspaper editors vs the crowd: on the appropriateness of front page news selection.
Proceedings of the 22nd International World Wide Web Conference, 2013

Introducción a la Tarea Compartida Tweet-Norm 2013: Normalización Léxica de Tuits en Español.
Proceedings of the Tweet Normalization Workshop co-located with 29th Conference of the Spanish Society for Natural Language Processing (SEPLN 2013), 2013

Curating and contextualizing Twitter stories to assist with social newsgathering.
Proceedings of the 18th International Conference on Intelligent User Interfaces, 2013

2012
"Harnessing folksonomies for resource classification" by Arkaitz Zubiaga with Danielle H. Lee as coordinator.
SIGWEB Newsl., 2012

Reorganizing clouds: A study on tag clustering and evaluation.
Expert Syst. Appl., 2012

Harnessing Folksonomies for Resource Classification.
CoRR, 2012

Enhancing Navigation on Wikipedia with Social Tags.
CoRR, 2012

Towards real-time summarization of scheduled events from twitter streams.
Proceedings of the 23rd ACM Conference on Hypertext and Social Media, 2012

Tweet Ranking Based on Heterogeneous Networks.
Proceedings of the COLING 2012, 2012

Analysis and Enhancement of Wikification for Microblogs with Context Expansion.
Proceedings of the COLING 2012, 2012

2011
Augmenting Web Page Classifiers with Social Annotations.
Proces. del Leng. Natural, 2011

Analyzing Tag Distributions in Folksonomies for Resource Classification.
Proceedings of the Knowledge Science, Engineering and Management, 2011

Tags vs shelves: from social tagging to social classification.
Proceedings of the HT'11, 2011

Classifying trending topics: a typology of conversation triggers on Twitter.
Proceedings of the 20th ACM Conference on Information and Knowledge Management, 2011

2009
Clasificación de Páginas Web con Anotaciones Sociales.
Proces. del Leng. Natural, 2009

Comparativa de Aproximaciones a SVM Semisupervisado Multiclase para Clasificación de Páginas Web.
Proces. del Leng. Natural, 2009

QEAVis: Evaluación Cuantitativa de la Visibilidad de los Sitios Web Académicos.
Proces. del Leng. Natural, 2009

Getting the most out of social annotations for web page classification.
Proceedings of the 2009 ACM Symposium on Document Engineering, 2009

Content-Based Clustering for Tag Cloud Visualization.
Proceedings of the 2009 International Conference on Advances in Social Network Analysis and Mining, 2009

1997
Accelerated DP based search for statistical translation.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

