Washington Cunha

CoRR, June, 2025

A thorough benchmark of automatic text classification: From traditional approaches to large language models.

CoRR, April, 2025

A Noise-Oriented and Redundancy-Aware Instance Selection Framework.

Alejandro Moreo Fernández

Andrea Esuli

Fabrizio Sebastiani

ACM Trans. Inf. Syst., March, 2025

Why are you traveling? Inferring trip profiles from online reviews and domain-knowledge.

Lucas G. S. Félix

Giancarlo Oliveira Teixeira

Jussara M. Almeida

Online Soc. Networks Media, 2025

Characterizing YouTube's Role in Online Gambling Promotion: A Case Study of Fortune Tiger in Brazil.

Carlos H. G. Ferreira

Proceedings of the 17th ACM Web Science Conference 2025, 2025

Optimizing Tail-Head Trade-off for Extreme Multi-Label Text Classification (XMTC) with RAG-Labels and a Dynamic Two-Stage Retrieval and Fusion Pipeline.

Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2025

QuantumCLEF 2025 - The Second Edition of the Quantum Computing Lab at CLEF.

Maurizio Ferrari Dacrema

Paolo Cremonesi

Proceedings of the Advances in Information Retrieval, 2025

Overview of QuantumCLEF 2025: The Second Quantum Computing Challenge for Information Retrieval and Recommender Systems at CLEF.

Maurizio Ferrari Dacrema

Paolo Cremonesi

Proceedings of the Experimental IR Meets Multilinguality, Multimodality, and Interaction, 2025

Instance-Selection-Inspired Undersampling Strategies for Bias Reduction in Small and Large Language Models for Binary Text Classification.

Guilherme Fonseca

Gabriel Prenassi

Leonardo Chaves Dutra da Rocha

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024

Pipelining Semantic Expansion and Noise Filtering for Sentiment Analysis of Short Documents - CluSent Method.

Guilherme Fonseca

Ana Machado

J. Interact. Syst., 2024

On Representation Learning-based Methods for Effective, Efficient, and Scalable Code Retrieval.

Celso França

Rennan C. Lima

Pedro O. S. Vaz de Melo

Berthier A. Ribeiro-Neto

Rodrygo L. T. Santos

Adriana S. Pagano

Neurocomputing, 2024

A Strategy to Combine 1stGen Transformers and Open LLMs for Automatic Text Classification.

Juliana Santos Rosa Viegas

CoRR, 2024

PATopics: An automatic framework to extract useful information from pharmaceutical patents documents.

Pablo Cecilio

Antônio Perreira

Fabiana Testa Moura de Carvalho Vicentini

Felipe Viegas

Elisa Tuler

CoRR, 2024

Is it a work or leisure travel? Applying text classification to identify work-related travel on social networks.

Lucas Félix

Jussara M. Almeida

CoRR, 2024

A Novel Two-Step Fine-Tuning Pipeline for Cold-Start Active Learning in Text Classification Tasks.

Fabiano Belém

Celso França

CoRR, 2024

A Quantum Annealing-Based Instance Selection Approach for Transformer Fine-Tuning.

Proceedings of the 14th Italian Information Retrieval Workshop, 2024

A Quantum Annealing Instance Selection Approach for Efficient and Effective Transformer Fine-Tuning.

Cláudio Moisés Valiense de Andrade

Proceedings of the 2024 ACM SIGIR International Conference on Theory of Information Retrieval, 2024

2023

On the class separability of contextual embeddings representations - or "The classifier does not matter when the (text) representation is so good!".

Inf. Process. Manag., 2023

A Comparative Survey of Instance Selection Methods applied to Non-Neural and Transformer-Based Text Classification.

ACM Comput. Surv., 2023

TPDR: A Novel Two-Step Transformer-based Product and Class Description Match and Retrieval Method.

Celso França

Cláudio Moisés Valiense de Andrade

CoRR, 2023

CluSent - Combining Semantic Expansion and De-Noising for Dataset-Oriented Sentiment Analysis of Short Texts.

Proceedings of the 29th Brazilian Symposium on Multimedia and the Web, 2023

An Effective, Efficient, and Scalable Confidence-based Instance Selection Framework for Transformer-Based Text Classification.

Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023

Uma Metodologia para Tratamento do Viés da Maioria em Modelos de Stacking via Identificação de Documentos Difíceis [A Methodology for Treating Majority Bias in Stacking Models via Identification of Hard Documents].

Antônio Pereira De Souza Júnior

Proceedings of the 38th Brazilian Symposium on Databases, 2023

2022

Evaluating Topic Modeling Pre-processing Pipelines for Portuguese Texts.

Pablo Cecilio

Felipe Viegas

Elisa Tuler de Albergaria

Leonardo Chaves Dutra da Rocha

Proceedings of the WebMedia '22: Brazilian Symposium on Multimedia and Web, Curitiba, Brazil, November 7, 2022

2021

On the cost-effectiveness of neural and non-neural approaches and representations for text classification: A comprehensive comparative study.

Wellington Santos Martins

Jussara M. Almeida

Thierson Rosa

Inf. Process. Manag., 2021

2020

Extended pre-processing pipeline for text classification: On the role of meta-feature representations, sparsification and selective sampling.

Inf. Process. Manag., 2020

"Keep it Simple, Lazy" - MetaLazy: A New MetaStrategy for Lazy Text Classification.

Luiz Felipe Mendes

Proceedings of the CIKM '20: The 29th ACM International Conference on Information and Knowledge Management, 2020

CluHTM - Semantic Hierarchical Topic Modeling based on CluWords.

Felipe Viegas

Antônio Pereira De Souza Júnior

Christian Gomes