Thamar Solorio

ORCID: 0000-0002-3541-9405

Affiliations:
  • Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, UAE
  • University of Houston, TX, USA


According to our database, Thamar Solorio authored at least 152 papers between 2002 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
SemEval Task 1: Semantic Textual Relatedness for African and Asian Languages.
CoRR, 2024

Question-Instructed Visual Descriptions for Zero-Shot Video Question Answering.
CoRR, 2024

SemRel2024: A Collection of Semantic Textual Relatedness Datasets for 14 Languages.
CoRR, 2024

2023
Overview of GUA-SPA at IberLEF 2023: Guarani-Spanish Code Switching Analysis.
Proces. del Leng. Natural, 2023

Survey on Aspect Category Detection.
ACM Comput. Surv., 2023

OATS: Opinion Aspect Target Sentiment Quadruple Extraction Dataset for Aspect-Based Sentiment Analysis.
CoRR, 2023

Positive and Risky Message Assessment for Music Products.
CoRR, 2023

Context-aware Adversarial Attack on Named Entity Recognition.
CoRR, 2023

Overview of GUA-SPA at IberLEF 2023: Guarani-Spanish Code Switching Analysis.
CoRR, 2023

SafeWebUH at SemEval-2023 Task 11: Learning Annotator Disagreement in Derogatory Text: Comparison of Direct Training vs Aggregation.
Proceedings of the 17th International Workshop on Semantic Evaluation, 2023

A Review of Datasets for Aspect-based Sentiment Analysis.
Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, 2023

Distillation of encoder-decoder transformers for sequence labelling.
Proceedings of the Findings of the Association for Computational Linguistics: EACL 2023, 2023

The Decades Progress on Code-Switching Research in NLP: A Systematic Survey on Trends and Challenges.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

2022
Hierarchical attention and transformers for automatic movie rating.
Expert Syst. Appl., 2022

Survey of Aspect-based Sentiment Analysis Datasets.
CoRR, 2022

CALCS 2021 Shared Task: Machine Translation for Code-Switched Data.
CoRR, 2022

Cross-lingual Few-Shot Learning on Unseen Languages.
Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing, 2022

Style Transfer as Data Augmentation: A Case Study on Named Entity Recognition.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

2021
A Human-Centered Systematic Literature Review of the Computational Approaches for Online Sexual Risk Detection.
Proc. ACM Hum. Comput. Interact., 2021

White Paper - Objectionable Online Content: What is harmful, to whom, and why.
CoRR, 2021

A Case Study of Deep Learning Based Multi-Modal Methods for Predicting the Age-Suitability Rating of Movie Trailers.
CoRR, 2021

White Paper: Challenges and Considerations for the Creation of a Large Labelled Repository of Online Videos with Questionable Content.
CoRR, 2021

Learning to Emphasize: Dataset and Shared Task Models for Selecting Emphasis in Presentation Slides.
CoRR, 2021

A Case Study of Deep Learning-Based Multi-Modal Methods for Labeling the Presence of Questionable Content in Movie Trailers.
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021), 2021

Exploring Conditional Text Generation for Aspect-Based Sentiment Analysis.
Proceedings of the 35th Pacific Asia Conference on Language, Information and Computation, 2021

Identifying Keyword Predictors in Lecture Video Screen Text.
Proceedings of the IEEE International Symposium on Multimedia, 2021

From None to Severe: Predicting Severity in Movie Scripts.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

Data Augmentation for Cross-Domain Named Entity Recognition.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Char2Subword: Extending the Subword Embedding Space Using Robust Character Compositionality.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

End-to-End Fine-Grained Neural Entity Recognition of Patients, Interventions, Outcomes.
Proceedings of the Experimental IR Meets Multilinguality, Multimodality, and Interaction, 2021

Can images help recognize entities? A study of the role of images for Multimodal NER.
Proceedings of the Seventh Workshop on Noisy User-generated Text, 2021

PSED: A Dataset for Selecting Emphasis in Presentation Slides.
Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021

Mitigating Temporal-Drift: A Simple Approach to Keep NER Models Crisp.
Proceedings of the Ninth International Workshop on Natural Language Processing for Social Media, 2021

Normalization and Back-Transliteration for Code-Switched Data.
Proceedings of the Fifth Workshop on Computational Approaches to Linguistic Code-Switching, 2021

2020
Gated multimodal networks.
Neural Comput. Appl., 2020

Early author profiling on Twitter using profile features with multi-resolution.
Expert Syst. Appl., 2020

Char2Subword: Extending the Subword Embedding Space from Pre-trained Models Using Robust Character Compositionality.
CoRR, 2020

A Caption Is Worth A Thousand Images: Investigating Image Captions for Multimodal Named Entity Recognition.
CoRR, 2020

SemEval-2020 Task 10: Emphasis Selection for Written Text in Visual Media.
Proceedings of the Fourteenth Workshop on Semantic Evaluation, 2020

SemEval-2020 Task 9: Overview of Sentiment Analysis of Code-Mixed Tweets.
Proceedings of the Fourteenth Workshop on Semantic Evaluation, 2020

Age Suitability Rating: Predicting the MPAA Rating Based on Movie Dialogues.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

LinCE: A Centralized Benchmark for Linguistic Code-switching Evaluation.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Automatic Identification of Keywords in Lecture Video Segments.
Proceedings of the IEEE International Symposium on Multimedia, 2020

Multi-view Story Characterization from Movie Plot Synopses and Reviews.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Let Me Choose: From Verbal Context to Font Selection.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

From English to Code-Switching: Transfer Learning with Strong Morphological Clues.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

Aggression and Misogyny Detection using BERT: A Multi-Task Approach.
Proceedings of the Second Workshop on Trolling, Aggression and Cyberbullying, 2020

Detecting Early Signs of Cyberbullying in Social Media.
Proceedings of the Second Workshop on Trolling, Aggression and Cyberbullying, 2020

Attending the Emotions to Detect Online Abusive Language.
Proceedings of the Fourth Workshop on Online Abuse and Harms, 2020

2019
Dependency-Aware Named Entity Recognition with Relative and Global Attentions.
CoRR, 2019

Multi-view Characterization of Stories from Narratives and Reviews using Multi-label Ranking.
CoRR, 2019

Rating for Parents: Predicting Children Suitability Rating for Movies Based on Language of the Movies.
CoRR, 2019

Question Relatedness on Stack Overflow: The Task, Dataset, and Corpus-inspired Models.
CoRR, 2019

Jointly Learning Author and Annotated Character N-gram Embeddings: A Case Study in Literary Text.
Proceedings of the International Conference on Recent Advances in Natural Language Processing, 2019

Exploiting Textual, Visual, and Product Features for Predicting the Likeability of Movies.
Proceedings of the Thirty-Second International Florida Artificial Intelligence Research Society Conference, 2019

Learning Emphasis Selection for Written Text in Visual Media from Crowd-Sourced Label Distributions.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

2018
CUILESS2016: a clinical corpus applying compositional normalization of text mentions.
J. Biomed. Semant., 2018

Letting Emotions Flow: Success Prediction by Modeling the Flow of Emotions in Books.
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018

Early Text Classification Using Multi-Resolution Concept Representations.
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018

Modeling Noisiness to Recognize Named Entities using Multitask Neural Networks on Social Media.
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018

MPST: A Corpus of Movie Plot Synopses with Tags.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

A Genre-Aware Attention Model to Improve the Likability Prediction of Books.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

RiTUAL-UH at TRAC 2018 Shared Task: Aggression Identification.
Proceedings of the First Workshop on Trolling, Aggression and Cyberbullying, 2018

Folksonomication: Predicting Tags for Movies from Plot Synopses using Emotion Flow Encoded Neural Network.
Proceedings of the 27th International Conference on Computational Linguistics, 2018

Language Identification and Analysis of Code-Switched Social Media Text.
Proceedings of the Third Workshop on Computational Approaches to Linguistic Code-Switching@ACL 2018, 2018

Named Entity Recognition on Code-Switched Data: Overview of the CALCS 2018 Shared Task.
Proceedings of the Third Workshop on Computational Approaches to Linguistic Code-Switching@ACL 2018, 2018

Evaluation of Type Inference with Textual Cues.
Proceedings of the Workshops of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
RiTUAL-UH at SemEval-2017 Task 5: Sentiment Analysis on Financial Data Using Neural Networks.
Proceedings of the 11th International Workshop on Semantic Evaluation, 2017

Gated Multimodal Units for Information Fusion.
Proceedings of the 5th International Conference on Learning Representations, 2017

Convolutional Neural Networks for Authorship Attribution of Short Texts.
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, 2017

A Multi-task Approach to Predict Likability of Books.
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, 2017

Convolutional Neural Networks for Author Profiling in PAN 2017.
Proceedings of the Working Notes of CLEF 2017, 2017

Social-Media Users can be profiled by their Similarity with other Users.
Proceedings of the Working Notes of CLEF 2017, 2017

Towards Translating Mixed-Code Comments from Social Media.
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2017

A Multi-task Approach for Named Entity Recognition in Social Media Data.
Proceedings of the 3rd Workshop on Noisy User-generated Text, 2017

Detecting Nastiness in Social Media.
Proceedings of the First Workshop on Abusive Language Online, 2017

2016
UH-PRHLT at SemEval-2016 Task 3: Combining Lexical and Semantic-based Features for Community Question Answering.
Proceedings of the 10th International Workshop on Semantic Evaluation, 2016

Semi-supervised CLPsych 2016 Shared Task System Submission.
Proceedings of the 3rd Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality, 2016

Age and Gender Prediction on Health Forum Data.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

Computational Approaches to Linguistic Code Switching.
Proceedings of the Interspeech 2016, 2016

CogALex-V Shared Task: GHHH - Detecting Semantic Relations via Word Embeddings.
Proceedings of the 5th Workshop on Cognitive Aspects of the Lexicon, 2016

Large Scale Authorship Attribution of Online Reviews.
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2016

Domain Adaptation for Authorship Attribution: Improved Structural Correspondence Learning.
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016

Why Do They Leave: Modeling Participation in Online Depression Forums.
Proceedings of The Fourth International Workshop on Natural Language Processing for Social Media, 2016

Analysis of Anxious Word Usage on Online Health Forums.
Proceedings of the Seventh International Workshop on Health Text Mining and Information Analysis, 2016

Multilingual Code-switching Identification via LSTM Recurrent Neural Networks.
Proceedings of the Second Workshop on Computational Approaches to Code Switching@EMNLP 2016, 2016

Overview for the Second Shared Task on Language Identification in Code-Switched Data.
Proceedings of the Second Workshop on Computational Approaches to Code Switching@EMNLP 2016, 2016

Part of Speech Tagging for Code Switched Data.
Proceedings of the Second Workshop on Computational Approaches to Code Switching@EMNLP 2016, 2016

2015
Security Analytics: Essential Data Analytics Knowledge for Cybersecurity Professionals and Students.
IEEE Secur. Priv., 2015

Not All Character N-grams Are Created Equal: A Study in Authorship Attribution.
Proceedings of the NAACL HLT 2015, The 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, Colorado, USA, May 31, 2015

Panel: Essential Data Analytics Knowledge for Cyber-security Professionals and Students.
Proceedings of the 2015 ACM International Workshop on Security and Privacy Analytics, 2015

Using Wide Range of Features for Author profiling.
Proceedings of the Working Notes of CLEF 2015, 2015

Identification of Original Document by Using Textual Similarities.
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2015

Developing Language-tagged Corpora for Code-switching Tweets.
Proceedings of The 9th Linguistic Annotation Workshop, 2015

Predicting Continued Participation in Online Health Forums.
Proceedings of the Sixth International Workshop on Health Text Mining and Information Analysis, 2015

2014
Exploring high-level features for detecting cyberpedophilia.
Comput. Speech Lang., 2014

Sockpuppet Detection in Wikipedia: A Corpus of Real-World Deceptive Writing for Linking Identities.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014

Using String Information for Malware Family Identification.
Proceedings of the Advances in Artificial Intelligence - IBERAMIA 2014, 2014

A Straightforward Author Profiling Approach in MapReduce.
Proceedings of the Advances in Artificial Intelligence - IBERAMIA 2014, 2014

Cross-Topic Authorship Attribution: Will Out-Of-Topic Data Help?
Proceedings of the COLING 2014, 2014

Machine Translation Evaluation Metric for Text Alignment.
Proceedings of the Working Notes for CLEF 2014 Conference, 2014

A Simple Approach to Author Profiling in MapReduce.
Proceedings of the Working Notes for CLEF 2014 Conference, 2014

Overview for the First Shared Task on Language Identification in Code-Switched Data.
Proceedings of the First Workshop on Computational Approaches to Code Switching@EMNLP 2014, 2014

2013
A document is known by the company it keeps: neighborhood consensus for short text categorization.
Lang. Resour. Evaluation, 2013

Survey on Emerging Research on the Use of Natural Language Processing in Clinical Language Assessment of Children.
Lang. Linguistics Compass, 2013

Using a Variety of n-Grams for the Detection of Different Kinds of Plagiarism Notebook for PAN at CLEF 2013.
Proceedings of the Working Notes for CLEF 2013 Conference, 2013

Author Profiling for English and Spanish Text Notebook for PAN at CLEF 2013.
Proceedings of the Working Notes for CLEF 2013 Conference, 2013

Evaluation of YTEX and MetaMap for Clinical Concept Recognition.
Proceedings of the Working Notes for CLEF 2013 Conference, 2013

The Use of Orthogonal Similarity Relations in the Prediction of Authorship.
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2013

Exploring Word Class N-grams to Measure Language Development in Children.
Proceedings of the 2013 Workshop on Biomedical Natural Language Processing, 2013

Using Latent Dirichlet Allocation for Child Narrative Analysis.
Proceedings of the 2013 Workshop on Biomedical Natural Language Processing, 2013

Native Language Identification: a Simple n-gram Based Approach.
Proceedings of the Eighth Workshop on Innovative Use of NLP for Building Educational Applications, 2013

2012
Coherence in child language narratives: a case study of annotation and automatic prediction of coherence.
Proceedings of the Third Workshop on Child, Computer and Interaction, 2012

On the Impact of Sentiment and Emotion Based Features in Detecting Online Sexual Predators.
Proceedings of the 3rd Workshop in Computational Approaches to Subjectivity and Sentiment Analysis, 2012

UABCoRAL: A Preliminary study for Resolving the Scope of Negation.
Proceedings of the First Joint Conference on Lexical and Computational Semantics, 2012

Evaluating NLP Features for Automatic Prediction of Language Impairment Using Child Speech Transcripts.
Proceedings of the INTERSPEECH 2012, 2012

Sub-Profiling by Linguistic Dimensions to Solve the Authorship Attribution Task.
Proceedings of the CLEF 2012 Evaluation Labs and Workshop, 2012

Grading the Quality of Medical Evidence.
Proceedings of the 2012 Workshop on Biomedical Natural Language Processing, 2012

2011
Analyzing language samples of Spanish-English bilingual children for the automated prediction of language dominance.
Nat. Lang. Eng., 2011

Exploring a corpus-based approach for detecting language impairment in monolingual English-speaking children.
Artif. Intell. Medicine, 2011

A Weighted Profile Intersection Measure for Profile-Based Authorship Attribution.
Proceedings of the Advances in Artificial Intelligence, 2011

Instance Selection in Text Classification Using the Silhouette Coefficient Measure.
Proceedings of the Advances in Artificial Intelligence, 2011

Modality Specific Meta Features for Authorship Attribution in Web Forum Posts.
Proceedings of the Fifth International Joint Conference on Natural Language Processing, 2011

Authorship Identification with Modality Specific Meta Features - Notebook for PAN at CLEF 2011.
Proceedings of the CLEF 2011 Labs and Workshop, 2011

Evaluating a semisupervised approach to phishing url identification in a realistic scenario.
Proceedings of the 8th Annual Collaboration, 2011

Local Histograms of Character N-grams for Authorship Attribution.
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, 2011

2010
Using Information from the Target Language to Improve Crosslingual Text Classification.
Proceedings of the Advances in Natural Language Processing, 2010

A supervised machine learning approach of extracting coexpression relationship among genes from literature.
Proceedings of the IEEE International Conference on Information Reuse and Integration, 2010

Authorship attribution of web forum posts.
Proceedings of the 2010 eCrime Researchers Summit, 2010

Lexical feature based phishing URL detection using online learning.
Proceedings of the 3rd ACM Workshop on Security and Artificial Intelligence, 2010

2009
A Corpus-Based Approach for the Prediction of Language Impairment in Monolingual English and Spanish-English Bilingual Children.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, Proceedings, May 31, 2009

2008
RNAVLab: A virtual laboratory for studying RNA secondary structures based on grid computing technology.
Parallel Comput., 2008

On the Effectiveness of Rebuilding RNA Secondary Structures from Sequence Chunks.
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

Part-of-Speech Tagging for English-Spanish Code-Switched Text.
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, 2008

Learning to Predict Code-Switching Points.
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, 2008

Using Language Models to Identify Language Impairment in Spanish-English Bilingual Children.
Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing, 2008

2007
A Filter-Based Approach to Detect End-of-Utterances from Prosody in Dialog Systems.
Proceedings of the Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, 2007

Baby-Steps Towards Building a Spanglish Language Model.
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2007

2006
Prosodic feature generation for back-channel prediction.
Proceedings of the INTERSPEECH 2006, 2006

An Unsupervised Language Independent Method of Name Discrimination Using Second Order Co-occurrence Features.
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2006

2005
Question Classification in Spanish and Portuguese.
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2005

Learning Named Entity Recognition in Portuguese from Spanish.
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2005

Exploiting Named Entity Taggers in a Second Language.
Proceedings of the ACL 2005, 2005

2004
An Optimization Algorithm Based on Active and Instance-Based Learning.
Proceedings of the MICAI 2004: Advances in Artificial Intelligence, 2004

Question Answering for Spanish Based on Lexical and Context Annotation.
Proceedings of the Advances in Artificial Intelligence, 2004

Analysis of Galactic Spectra Using Active Instance-Based Learning and Domain Knowledge.
Proceedings of the Advances in Artificial Intelligence, 2004

A Language Independent Method for Question Classification.
Proceedings of the COLING 2004, 2004

Question Answering for Spanish Supported by Lexical Context Annotation.
Proceedings of the Multilingual Information Access for Text, 2004

The Use of Lexical Context in Question Answering for Spanish.
Proceedings of the Working Notes for CLEF 2004 Workshop co-located with the 8th European Conference on Digital Libraries (ECDL 2004), 2004

Learning Named Entity Classifiers Using Support Vector Machines.
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2004

Toward a Document Model for Question Answering Systems.
Proceedings of the Advances in Web Intelligence, 2004

2002
Improving Classification Accuracy of Large Test Sets Using the Ordered Classification Algorithm.
Proceedings of the Advances in Artificial Intelligence, 2002

