Peerat Limkonchotiwat

Yifan Mai

William-Chandra Tjhi

Proceedings of the Findings of the Association for Computational Linguistics, 2025

Global MMLU: Understanding and Addressing Cultural and Linguistic Biases in Multilingual Evaluation.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

Towards Better Understanding of Program-of-Thought Reasoning in Cross-Lingual and Multilingual Environments.

Patomporn Payoungkhamdee

Pume Tuchinda

Jinheon Baek

Samuel Cahyawijaya

Proceedings of the Findings of the Association for Computational Linguistics, 2025

Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia.

Samuel Cahyawijaya

Mohammad Rifqi Farhansyah

Joel Ruben Antony Moniz

Tack Hwa Wong

Thant Thiri Maung

Frederikus Hudi

David Anugraha

Muhammad Ravi Shulthan Habibi

Muhammad Reza Qorib

Amit Agarwal

Joseph Marvin Imperial

Hitesh Laxmichand Patel

Vicky Feliren

Bahrul Ilmi Nasution

Manuel Antonio Rufino

Genta Indra Winata

Rian Adam Rajagede

Carlos Rafael Catalan

Mohamed Fazli Mohamed Imam

Priyaranjan Pattnayak

Salsabila Zahirah Pranida

Kevin Pratama

Yeshil Bangera

Adisai Na-Thalang

Patricia Nicole Monderin

Kanyakorn Veerakanjana

Piyalitt Ittichaiwong

Matthew Theodore Roque

Karissa Vincentio

Takdanai Kreangphet

Phakphum Artkaew

Kadek Hendrawan Palgunadi

Hanif Muhammad Zhafran

Fenal Ashokbhai Ilasariya

Haochen Li

John Amadeo Daniswara

Filbert Aurelian Tjiaranata

Eryawan Presma Yulianrifat

Fadil Risdian Ansori

Mahardika Krisna Ihsani

Isaiah Edri W. Flores

Lester James Validad Miranda

Ming Shan Hee

Ikhlasul Akmal Hanif

M. Alif Al Hakim

Muhammad Rizky Sya'ban

Kun Kerdthaisong

Fajri Koto

Tirana Noor Fatyanosa

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024

Global MMLU: Understanding and Addressing Cultural and Linguistic Biases in Multilingual Evaluation.

CoRR, 2024

Can General-Purpose Large Language Models Generalize to English-Thai Machine Translation?

Jirat Chiaranaipanich

Naiyarat Hanmatheekuna

Piyalitt Ittichaiwong

CoRR, 2024

WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines.

Genta Indra Winata

Frederikus Hudi

Patrick Amadeus Irawan

Ubaidillah Ariq Prathama

Haryo Akbarianto Wibowo

Maria Angelica Riera Machin

Jan Wira Gotama Putra

Junho Myung

Lucky Susanto

Marina Zhukova

Michael Anugraha

Muhammad Farid Adilazuarda

Natasha Santosa

Stephanie Yulia Salim

Yi Zhou

Yinxuan Gui

David Ifeoluwa Adelani

CoRR, 2024

SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages.

Muhammad Ravi Shulthan Habibi

Rahmad Mahendra

Salsabil Maulana Akbar

Lester James V. Miranda

Joseph Marvin Imperial

Onno Pepijn Kampman

Joel Ruben Antony Moniz

Patrick Amadeus Irawan

Bin Wang

Muhammad Dehan Al Kautsar

Chenxi Whitehouse

Ivan Halim Parmonangan

Sonny Lazuardi Hermawan

Dan John Velasco

Willy Fitra Hendria

Yasmin Moslem

Noah Flynn

Muhammad Farid Adilazuarda

CoRR, 2024

WangchanLion and WangchanX MRC Eval.

Surapon Nonesung

Patomporn Payoungkhamdee

CoRR, 2024

Efficient Overshadowed Entity Disambiguation by Mitigating Shortcut Learning.

Panuthep Tasawong

Pitchaya Chairuengjitjaras

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

On Creating an English-Thai Code-switched Machine Translation in Medical Domain.

Pasit Supholkhan

Pubordee Aussavavirojekul

Chiraphat Boonnag

Kanyakorn Veerakanjana

Hirunkul Phimsiri

Boonthicha Sae-jia

Nattawach Sataudom

Piyalitt Ittichaiwong

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

An Empirical Study of Multilingual Reasoning Distillation for Question Answering.

Patomporn Payoungkhamdee

Jinheon Baek

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages.

Muhammad Ravi Shulthan Habibi

Rahmad Mahendra

Salsabil Maulana Akbar

Lester James V. Miranda

Joseph Marvin Imperial

Onno Kampman

Joel Ruben Antony Moniz

Patrick Amadeus Irawan

Bin Wang

Muhammad Dehan Al Kautsar

Chenxi Whitehouse

Ivan Halim Parmonangan

Sonny Lazuardi Hermawan

Dan John Velasco

Willy Fitra Hendria

Yasmin Moslem

Noah Flynn

Muhammad Farid Adilazuarda

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

McCrolin: Multi-consistency Cross-lingual Training for Retrieval Question Answering.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Space Decomposition for Sentence Embedding.

Proceedings of the Findings of the Association for Computational Linguistics, 2024

Seed-Free Synthetic Data Generation Framework for Instruction-Tuning LLMs: A Case Study in Thai.

Parinthapat Pengpun

Weerayut Buaphet

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics, 2024

Identifying and Mitigating Annotation Bias in Natural Language Understanding using Causal Mediation Analysis.

Sitiporn Sae Lim

Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023

An Efficient Self-Supervised Cross-View Training For Sentence Embedding.

Trans. Assoc. Comput. Linguistics, 2023

PyThaiNLP: Thai Natural Language Processing in Python.

Korakot Chaovavanich

Charin Polpanumas

Arthit Suriyawongkul

Pattarawat Chormai

Thanathip Suntorntip

CoRR, 2023

Two-stage Thai Misspelling Correction based on Pre-trained Language Models.

Idhibhat Pankam

Proceedings of the 20th IEEE International Joint Conference on Computer Science and Software Engineering, 2023

mReFinED: An Efficient End-to-End Multilingual Entity Linking System.

Christos Christodoulopoulos

Weiwei Cheng

Amir Saffari

Jens Lehmann

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Typo-Robust Representation Learning for Dense Retrieval.

Panuthep Tasawong

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2023

2022

Thai Wav2Vec2.0 with CommonVoice V8.

Chompakorn Chaksangchaichot

CoRR, 2022

CL-ReLKT: Cross-lingual Language Knowledge Transfer for Multilingual Retrieval Question Answering.

Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2022, 2022

ConGen: Unsupervised Control and Generalization Distillation For Sentence Representation.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

Thai Nested Named Entity Recognition Corpus.

Weerayut Buaphet

Attapol Rutherford

Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

2021

AI Builders: Teaching Thai Students to Build End-to-End Machine Learning Projects Online.

Charin Polpanumas

Chompakorn Chaksangchaichot

Benyapa Matupumanon

Sukrit Amornpornwiwat

Witchapong Daroontham

Kowin Kulruchakorn

Titipat Achakulvisut

Proceedings of the 2021 IEEE International Conference on Engineering, 2021

Robust Fragment-Based Framework for Cross-lingual Sentence Retrieval.

Nattapol Trijakwanich

Raheem Sarwar

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

Handling Cross- and Out-of-Domain Samples in Thai Word Segmentation.

Raheem Sarwar

Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021

2020

Domain Adaptation of Thai Word Segmentation Models using Stacked Ensemble.

Raheem Sarwar