Alham Fikri Aji

Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025

Thank You, Stingray: Multilingual Large Language Models Can Not (Yet) Disambiguate Cross-Lingual Word Senses.

Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025

Multilingual Iterative Model Pruning: What Matters?

Lyzander Marciano Andrylie

Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, 2025

Unveiling the Influence of Amplifying Language-Specific Neurons.

Inaya Rahmanisa

Mahardika Krisna Ihsani

Alfan Farizki Wicaksono

Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, 2025

CaMMT: Benchmarking Culturally Aware Multimodal Machine Translation.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

MoMentS: A Comprehensive Multimodal Benchmark for Theory of Mind.

Emilio Villa-Cueva

S. M. Masrur Ahmed

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025

WangchanThaiInstruct: An instruction-following Dataset for Culture-Aware, Multitask, and Multi-domain Evaluation in Thai.

Pume Tuchinda

Lalita Lowphansirikul

Surapon Nonesung

Panuthep Tasawong

Can Udomcharoenchaikit

Sarana Nutanong

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

Balanced Multi-Factor In-Context Learning for Multilingual Large Language Models.

Masahiro Kaneko

Timothy Baldwin

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

LORAXBENCH: A Multitask, Multilingual Benchmark Suite for 20 Indonesian Languages.

Trevor Cohn

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

From Surveys to Narratives: Rethinking Cultural Value Adaptation in LLMs.

Chen Cecilia Liu

Iryna Gurevych

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

Style Over Substance: Evaluation Biases for Large Language Models.

Minghao Wu

Shamsuddeen Hassan Muhammad

Proceedings of the 31st International Conference on Computational Linguistics, 2025

Proceedings of the Second Workshop in South East Asian Language Processing.

Proceedings of the 31st International Conference on Computational Linguistics, 2025

GenAI Content Detection Task 1: English and Multilingual Machine-Generated Text Detection: AI vs. Human.

Proceedings of the 31st International Conference on Computational Linguistics, 2025

NusaDialogue: Dialogue Summarization and Generation for Underrepresented and Extremely Low-Resource Languages.

Proceedings of the 31st International Conference on Computational Linguistics, 2025

From Multiple-Choice to Extractive QA: A Case Study for English and Arabic.

Proceedings of the 31st International Conference on Computational Linguistics, 2025

KazMMLU: Evaluating Language Models on Kazakh, Russian, and Regional Knowledge of Kazakhstan.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

A Multi-Labeled Dataset for Indonesian Discourse: Examining Toxicity, Polarization, and Demographics Information.

Lucky Susanto

Musa Izzanardi Wijanarko

Prasetia Anugrah Pratama

Proceedings of the Findings of the Association for Computational Linguistics, 2025

BRIGHTER: BRIdging the Gap in Human-Annotated Textual Emotion Recognition Datasets for 28 Languages.

Felermino D. M. A. Ali

Rodrigo Tufiño Cardenas

Chiamaka Ijeoma Chukwuneke

Charles Henrique Porto Ferreira

Lilian Diana Awuor Wanzare

Sophie Wu

Florian Valentin Wunderlich

Hanif Muhammad Zhafran

Tianhui Zhang

Yi Zhou

Saif M. Mohammad

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

Data Laundering: Artificially Boosting Benchmark Results through Knowledge Distillation.

Jonibek Mansurov

Akhmed Sakip

Mohammad Rifqi Farhansyah

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

Do Language Models Understand Honorific Systems in Javanese?

Iwan Darmawan

Adryan Kusumawardhana

Derry Tanti Wijaya

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

Statement-Tuning Enables Efficient Cross-lingual Generalization in Encoder-only Models.

Faadil Abdullah Shaikh

Jonibek Mansurov

Jesús-Germán Ortiz-Barajas

Proceedings of the Findings of the Association for Computational Linguistics, 2025

Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia.

Mohammad Rifqi Farhansyah

Joel Ruben Antony Moniz

Tack Hwa Wong

Thant Thiri Maung

Frederikus Hudi

David Anugraha

Muhammad Ravi Shulthan Habibi

Muhammad Reza Qorib

Amit Agarwal

Joseph Marvin Imperial

Hitesh Laxmichand Patel

Vicky Feliren

Bahrul Ilmi Nasution

Manuel Antonio Rufino

Rian Adam Rajagede

Carlos Rafael Catalan

Salsabila Zahirah Pranida

Priyaranjan Pattnayak

Kevin Pratama

Yeshil Bangera

Adisai Na-Thalang

Patricia Nicole Monderin

Kanyakorn Veerakanjana

Piyalitt Ittichaiwong

Matthew Theodore Roque

Karissa Vincentio

Takdanai Kreangphet

Phakphum Artkaew

Kadek Hendrawan Palgunadi

Hanif Muhammad Zhafran

Fenal Ashokbhai Ilasariya

Haochen Li

John Amadeo Daniswara

Filbert Aurelian Tjiaranata

Eryawan Presma Yulianrifat

Can Udomcharoenchaikit

Fadil Risdian Ansori

Mahardika Krisna Ihsani

Isaiah Edri W. Flores

Lester James Validad Miranda

Ming Shan Hee

Ikhlasul Akmal Hanif

M. Alif Al Hakim

Muhammad Rizky Sya'ban

Kun Kerdthaisong

Fajri Koto

Tirana Noor Fatyanosa

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

NusaAksara: A Multimodal and Multilingual Benchmark for Preserving Indonesian Indigenous Scripts.

Musa Izzanardi Wijanarko

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

2024

SemEval-2024 Task 8: Multidomain, Multimodel and Multilingual Black-Box Machine-Generated Text Detection.

Dataset, April, 2024

Maya: An Instruction Finetuned Multilingual Multimodal Model.

Nahid Alam

Karthik Reddy Kanjula

Surya Guthikonda

Timothy Chung

Bala Krishna S. Vegesna

CoRR, 2024

CLIP meets DINO for Tuning Zero-Shot Classifier using Unlabeled Image Collections.

CoRR, 2024

Thank You, Stingray: Multilingual Large Language Models Can Not (Yet) Disambiguate Cross-Lingual Word Sense.

Elisa Gilbert

Hiroki Nomoto

CoRR, 2024

WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines.

Frederikus Hudi

Patrick Amadeus Irawan

Ubaidillah Ariq Prathama

Maria Angelica Riera Machin

Jan Wira Gotama Putra

Junho Myung

Lucky Susanto

Marina Zhukova

Michael Anugraha

Natasha Santosa

Stephanie Yulia Salim

Yi Zhou

Yinxuan Gui

CoRR, 2024

IndoToxic2024: A Demographically-Enriched Dataset of Hate Speech and Toxicity Types for Indonesian Language.

Lucky Susanto

Musa Izzanardi Wijanarko

Prasetia Anugrah Pratama

CoRR, 2024

The Privileged Students: On the Value of Initialization in Multilingual Knowledge Distillation.

Thamar Solorio

CoRR, 2024

SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages.

Rahmad Mahendra

Muhammad Ravi Shulthan Habibi

Lester James V. Miranda

Joseph Marvin Imperial

Onno Pepijn Kampman

Joel Ruben Antony Moniz

Patrick Amadeus Irawan

Bin Wang

Muhammad Dehan Al Kautsar

Sonny Lazuardi Hermawan

Dan John Velasco

Willy Fitra Hendria

Yasmin Moslem

Noah Flynn

CoRR, 2024

CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark.

David Romero

Chenyang Lyu

Henok Biadglign Ademtew

Hernán Maina

Israel Abebe Azime

Jesús-Germán Ortiz-Barajas

Jay P. Gala

Jiahui Geng

Jinheon Baek

Jocelyn Dunstan

Laura Alonso Alemany

Kumaranage Ravindu Yasas Nagasinghe

Luciana Benotti

Luis Fernando D'Haro

Marcelo Viridiano

Marcos Estecha-Garitagoitia

Maria Camila Buitrago Cabrera

Mario Rodríguez-Cantelar

Mélanie Jouitteau

Mihail Mihaylov

Munkhjargal Gochoo

Munkh-Erdene Otgonbold

Tiago Timponi Torrent

Toqeer Ehsan

Vladimir Araujo

Yova Kementchedjhieva

CoRR, 2024

Lessons from the Trenches on Reproducible Evaluation of Language Models.

CoRR, 2024

Can a Multichoice Dataset be Repurposed for Extractive Question Answering?

CoRR, 2024

SemEval-2024 Task 8: Multidomain, Multimodel and Multilingual Machine-Generated Text Detection.

CoRR, 2024

Towards Measuring and Modeling "Culture" in LLMs: A Survey.

CoRR, 2024

Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding Decomposition.

CoRR, 2024

Beyond Probabilities: Unveiling the Misalignment in Evaluating Large Language Models.

Chenyang Lyu

Minghao Wu

CoRR, 2024

M4GT-Bench: Evaluation Benchmark for Black-Box Machine-Generated Text Detection.

CoRR, 2024

SemRel2024: A Collection of Semantic Textual Relatedness Datasets for 14 Languages.

CoRR, 2024

Daisy at WASSA 2024 Empathy and Personality Shared Task: A Quick Exploration on Emotional Pattern of Empathy and Distress.

Shamsuddeen Hassan Muhammad

Proceedings of the 14th Workshop on Computational Approaches to Subjectivity, 2024

SemEval Task 1: Semantic Textual Relatedness for African and Asian Languages.

Nedjma Ousidhoum

Krishnapriya Vishnubhotla

Seid Muhie Yimam

Saif M. Mohammad

Proceedings of the 18th International Workshop on Semantic Evaluation, 2024

Kalahi: A handcrafted, grassroots cultural LLM evaluation suite for Filipino.

Hamsawardhini Rengarajan

William-Chandra Tjhi

Proceedings of the 38th Pacific Asia Conference on Language, Information and Computation, 2024

CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark.

David Romero

Chenyang Lyu

Jesús-Germán Ortiz-Barajas

Santiago Góngora

Aishik Mandal

Sukannya Purkayastha

Munkh-Erdene Otgonbold

Tiago Timponi Torrent

Frederico Belcavello

Marcelo Viridiano

Christian Salamea Palacios

Vladimir Araujo

Yova Kementchedjhieva

Mihail Mihaylov

Israel Abebe Azime

Henok Biadglign Ademtew

Bontu Fufa Balcha

Naome A. Etori

Maria Camila Buitrago Cabrera

Rada Mihalcea

Atnafu Lambebo Tonja

Gisela Vallejo

Marcos Estecha-Garitagoitia

Mario Rodríguez-Cantelar

Toqeer Ehsan

Kumaranage Ravindu Yasas Nagasinghe

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

COPAL-ID: Indonesian Language Reasoning with Local Culture and Nuances.

Erland Hilman Fuadi

Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

Efficient and Interpretable Grammatical Error Correction with Mixture of Experts.

Muhammad Reza Qorib

Hwee Tou Ng

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Cultural Conditioning or Placebo? On the Effectiveness of Socio-Demographic Prompting.

Sagnik Mukherjee

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages.

Rahmad Mahendra

Muhammad Ravi Shulthan Habibi

Lester James V. Miranda

Joseph Marvin Imperial

Onno Kampman

Joel Ruben Antony Moniz

Patrick Amadeus Irawan

Bin Wang

Muhammad Dehan Al Kautsar

Sonny Lazuardi Hermawan

Dan John Velasco

Willy Fitra Hendria

Yasmin Moslem

Noah Flynn

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Re-Evaluating Evaluation for Multilingual Summarization.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Towards Measuring and Modeling "Culture" in LLMs: A Survey.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

LinguAlchemy: Fusing Typological and Geographical Elements for Unseen Language Generalization.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

LLM-DetectAIve: a Tool for Fine-Grained Machine-Generated Text Detection.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: EMNLP 2024, 2024

LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions.

Minghao Wu

Abdul Waheed

Chiyu Zhang

Muhammad Abdul-Mageed

Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics, 2024

M4: Multi-generator, Multi-domain, and Multi-lingual Black-Box Machine-Generated Text Detection.

Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics, 2024

A Paradigm Shift: The Future of Machine Translation Lies with Large Language Models.

Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

M4GT-Bench: Evaluation Benchmark for Black-Box Machine-Generated Text Detection.

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

SemRel2024: A Collection of Semantic Textual Relatedness Datasets for 13 Languages.

Proceedings of the Findings of the Association for Computational Linguistics, 2024

Cendol: Open Instruction-tuned Generative Large Language Models for Indonesian Languages.

Emmanuel Dave

Nuur Shadieq

Muhammad Ihza Mahendra

Dea Annisayanti Putri

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023

QASiNa: Religious Domain Question Answering using Sirah Nabawiyah.

Muhammad Razif Rizqullah

Ayu Purwarianti

CoRR, 2023

Low-Resource Clickbait Spoiling for Indonesian via Question Answering.

Ni Putu Intan Maharani

Ayu Purwarianti

CoRR, 2023

Jais and Jais-chat: Arabic-Centric Foundation and Instruction-Tuned Open Generative Large Language Models.

CoRR, 2023

Knowledge-Augmented Language Model Prompting for Zero-Shot Knowledge Graph Question Answering.

Jinheon Baek

Amir Saffari

CoRR, 2023

Bactrian-X: A Multilingual Replicable Instruction-Following Model with Low-Rank Adaptation.

CoRR, 2023

M4: Multi-generator, Multi-domain, and Multi-lingual Black-Box Machine-Generated Text Detection.

CoRR, 2023

GlobalBench: A Benchmark for Global Progress in Natural Language Processing.

Antonios Anastasopoulos

Graham Neubig

CoRR, 2023

LLM-powered Data Augmentation for Enhanced Crosslingual Performance.

Monojit Choudhury

CoRR, 2023

Multilingual Large Language Models Are Not (Yet) Code-Switchers.

CoRR, 2023

Prompting Multilingual Large Language Models to Generate Code-Mixed Texts: The Case of South East Asian Languages.

Long Phan

Yin Lin Tan

CoRR, 2023

NusaWrites: Constructing High-Quality Corpora for Underrepresented and Extremely Low-Resource Languages.

Jhonson Lee

Nuur Shadieq

Tjeng Wawan Cenggoro

Hanung Wahyuning Linuwih

Bryan Wilie

Galih Pradipta Muridan

Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, 2023

Multilingual Large Language Models Are Not (Yet) Code-Switchers.

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

LLM-powered Data Augmentation for Enhanced Cross-lingual Performance.

Monojit Choudhury

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

GlobalBench: A Benchmark for Global Progress in Natural Language Processing.

Antonios Anastasopoulos

Graham Neubig

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

NusaX: Multilingual Parallel Sentiment Dataset for 10 Indonesian Local Languages.

Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, 2023

BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting.

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

The Decades Progress on Code-Switching Research in NLP: A Systematic Survey on Trends and Challenges.

Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

WebIE: Faithful and Robust Information Extraction on the Web.

Clara Vania

Christos Christodoulopoulos

Andrea Pierleoni

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

On "Scientific Debt" in NLP: A Case for More Rigour in Language Model Pre-Training Research.

Muhammad Satrio Wicaksono

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Crosslingual Generalization through Multitask Finetuning.

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Multi-lingual and Multi-cultural Figurative Language Understanding.

Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

NusaCrowd: Open Source Initiative for Indonesian NLP Resources.

Arie Ardiyanti Suryani

Rifki Afina Putri

Dan Su

Keith Stevens

Ichwanul Muslim Karo Karo

Cuk Tho

Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

Direct Fact Retrieval from Knowledge Graphs without Entity Linking.

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022

NusaCrowd: Open Source Initiative for Indonesian NLP Resources.

Muhammad Satrio Wicaksono

Ika Alfina

Arie Ardiyanti Suryani

Rifki Afina Putri

Dan Su

Keith Stevens

Ichwanul Muslim Karo Karo

Tirana Noor Fatyanosa

CoRR, 2022

BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting.

CoRR, 2022

NusaCrowd: A Call for Open and Reproducible NLP Research in Indonesian Languages.

CoRR, 2022

NusaX: Multilingual Parallel Sentiment Dataset for 10 Indonesian Local Languages.

CoRR, 2022

Nix-TTS: An Incredibly Lightweight End-to-End Text-to-Speech Model via Non End-to-End Distillation.

Pedro Javier Ortiz Suárez

CoRR, 2022

Documenting Geographically and Contextually Diverse Data Sources: The BigScience Catalogue of Language Data and Resources.

Angelina McMillan-Major

Zeerak Talat

Daniel van Strien

Yacine Jernite

CoRR, 2022

Which Student is Best? A Comprehensive Knowledge Distillation Exam for Task-Specific BERT Models.

CoRR, 2022

NIX-TTS: Lightweight and End-to-End Text-to-Speech Via Module-Wise Distillation.

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

A Relation Extraction Dataset for Knowledge Extraction from Web Tables.

Siffi Singh

Christos Christodoulopoulos

Gaurav Singh

Proceedings of the 29th International Conference on Computational Linguistics, 2022

Mintaka: A Complex, Natural, and Multilingual Dataset for End-to-End Question Answering.

Priyanka Sen

Amir Saffari

Proceedings of the 29th International Conference on Computational Linguistics, 2022

Towards better structured and less noisy Web data: Oscar with Register annotations.

Proceedings of the Eighth Workshop on Noisy User-generated Text, 2022

One Country, 700+ Languages: NLP Challenges for Underrepresented Languages and Dialects in Indonesia.

Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

2021

The University of Edinburgh's Bengali-Hindi Submissions to the WMT21 News Translation Task.

Proceedings of the Sixth Conference on Machine Translation, 2021

Efficient Machine Translation with Model Pruning and Quantization.

Svetlana Tchistiakova

Proceedings of the Sixth Conference on Machine Translation, 2021

BERT Goes Brrr: A Venture Towards the Lesser Error in Classifying Medical Self-Reporters on Twitter.

Tirana Fatyanosa

Proceedings of the Sixth Social Media Mining for Health Workshop and Shared Task, 2021

ParaCotta: Synthetic Multilingual Paraphrase Corpora from the Most Diverse Translation Sample Pair.

Tirana Noor Fatyanosa

Proceedings of the 35th Pacific Asia Conference on Language, Information and Computation, 2021

IndoNLI: A Natural Language Inference Dataset for Indonesian.

Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

IndoCollex: A Testbed for Morphological Transformation of Indonesian Word Colloquialism.

Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021

2020

Approximating neural machine translation for efficiency.

PhD thesis, 2020

Exploring Monolingual Data for Neural Machine Translation with Knowledge Distillation.

CoRR, 2020

Synthetic Source Language Augmentation for Colloquial Neural Machine Translation.

CoRR, 2020

No Budget? Don't Flex! Cost Consideration when Planning to Adopt NLP for Your Business.

CoRR, 2020

Semi-Supervised Low-Resource Style Transfer of Indonesian Informal to Formal Language with Iterative Forward-Translation.

CoRR, 2020

Semi-Supervised Low-Resource Style Transfer of Indonesian Informal to Formal Language with Iterative Forward-Translation.

Emmanouil-Ioannis Farsarakis

Proceedings of the International Conference on Asian Language Processing, 2020

Edinburgh's Submissions to the 2020 Machine Translation Efficiency Task.

Mateusz Chudyk

Proceedings of the Fourth Workshop on Neural Generation and Translation, 2020

Compressing Neural Machine Translation Models with 4-bit Precision.

Proceedings of the Fourth Workshop on Neural Generation and Translation, 2020

In Neural Machine Translation, What Does Transfer Learning Transfer?

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

Benchmarking Multidomain English-Indonesian Machine Translation.

Tri Wahyu Guntara

Proceedings of the 13th Workshop on Building and Using Comparable Corpora, 2020

2019

Neural Machine Translation with 4-Bit Precision and Beyond.

CoRR, 2019

From Research to Production and Back: Ludicrously Fast Neural Machine Translation.

Young Jin Kim

Marcin Junczys-Dowmunt

Proceedings of the 3rd Workshop on Neural Generation and Translation@EMNLP-IJCNLP 2019, 2019

Combining Global Sparse Gradients with Local Gradients in Distributed Neural Network Training.

Nikolay Bogoychev

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Making Asynchronous Stochastic Gradient Descent Work for Transformers.

Proceedings of the 3rd Workshop on Neural Generation and Translation@EMNLP-IJCNLP 2019, 2019

2018

Toward a Standardized and More Accurate Indonesian Part-of-Speech Tagging.

Kemal Kurniawan