Alham Fikri Aji

According to our database, Alham Fikri Aji authored at least 72 papers between 2014 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
SemEval Task 1: Semantic Textual Relatedness for African and Asian Languages.
CoRR, 2024

Towards Measuring and Modeling "Culture" in LLMs: A Survey.
CoRR, 2024

Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding Decomposition.
CoRR, 2024

Beyond Probabilities: Unveiling the Misalignment in Evaluating Large Language Models.
CoRR, 2024

M4GT-Bench: Evaluation Benchmark for Black-Box Machine-Generated Text Detection.
CoRR, 2024

SemRel2024: A Collection of Semantic Textual Relatedness Datasets for 14 Languages.
CoRR, 2024

LinguAlchemy: Fusing Typological and Geographical Elements for Unseen Language Generalization.
CoRR, 2024

LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions.
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics, 2024

M4: Multi-generator, Multi-domain, and Multi-lingual Black-Box Machine-Generated Text Detection.
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics, 2024

2023
COPAL-ID: Indonesian Language Reasoning with Local Culture and Nuances.
CoRR, 2023

QASiNa: Religious Domain Question Answering using Sirah Nabawiyah.
CoRR, 2023

Low-Resource Clickbait Spoiling for Indonesian via Question Answering.
CoRR, 2023

Jais and Jais-chat: Arabic-Centric Foundation and Instruction-Tuned Open Generative Large Language Models.
CoRR, 2023

Style Over Substance: Evaluation Biases for Large Language Models.
CoRR, 2023

Knowledge-Augmented Language Model Prompting for Zero-Shot Knowledge Graph Question Answering.
CoRR, 2023

Bactrian-X: A Multilingual Replicable Instruction-Following Model with Low-Rank Adaptation.
CoRR, 2023

M4: Multi-generator, Multi-domain, and Multi-lingual Black-Box Machine-Generated Text Detection.
CoRR, 2023

GlobalBench: A Benchmark for Global Progress in Natural Language Processing.
CoRR, 2023

LLM-powered Data Augmentation for Enhanced Crosslingual Performance.
CoRR, 2023

Multilingual Large Language Models Are Not (Yet) Code-Switchers.
CoRR, 2023

Prompting Multilingual Large Language Models to Generate Code-Mixed Texts: The Case of South East Asian Languages.
CoRR, 2023

NusaWrites: Constructing High-Quality Corpora for Underrepresented and Extremely Low-Resource Languages.
Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, 2023

Multilingual Large Language Models Are Not (Yet) Code-Switchers.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

LLM-powered Data Augmentation for Enhanced Cross-lingual Performance.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

GlobalBench: A Benchmark for Global Progress in Natural Language Processing.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

NusaX: Multilingual Parallel Sentiment Dataset for 10 Indonesian Local Languages.
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, 2023

BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

The Decades Progress on Code-Switching Research in NLP: A Systematic Survey on Trends and Challenges.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

WebIE: Faithful and Robust Information Extraction on the Web.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

On "Scientific Debt" in NLP: A Case for More Rigour in Language Model Pre-Training Research.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Crosslingual Generalization through Multitask Finetuning.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Multi-lingual and Multi-cultural Figurative Language Understanding.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

Direct Fact Retrieval from Knowledge Graphs without Entity Linking.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
NusaCrowd: Open Source Initiative for Indonesian NLP Resources.
CoRR, 2022

BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting.
CoRR, 2022

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model.
CoRR, 2022

NusaCrowd: A Call for Open and Reproducible NLP Research in Indonesian Languages.
CoRR, 2022

NusaX: Multilingual Parallel Sentiment Dataset for 10 Indonesian Local Languages.
CoRR, 2022

Nix-TTS: An Incredibly Lightweight End-to-End Text-to-Speech Model via Non End-to-End Distillation.
CoRR, 2022

Documenting Geographically and Contextually Diverse Data Sources: The BigScience Catalogue of Language Data and Resources.
CoRR, 2022

Which Student is Best? A Comprehensive Knowledge Distillation Exam for Task-Specific BERT Models.
CoRR, 2022

NIX-TTS: Lightweight and End-to-End Text-to-Speech Via Module-Wise Distillation.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

A Relation Extraction Dataset for Knowledge Extraction from Web Tables.
Proceedings of the 29th International Conference on Computational Linguistics, 2022

Mintaka: A Complex, Natural, and Multilingual Dataset for End-to-End Question Answering.
Proceedings of the 29th International Conference on Computational Linguistics, 2022

Towards better structured and less noisy Web data: Oscar with Register annotations.
Proceedings of the Eighth Workshop on Noisy User-generated Text, 2022

One Country, 700+ Languages: NLP Challenges for Underrepresented Languages and Dialects in Indonesia.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

2021
The University of Edinburgh's Bengali-Hindi Submissions to the WMT21 News Translation Task.
Proceedings of the Sixth Conference on Machine Translation, 2021

Efficient Machine Translation with Model Pruning and Quantization.
Proceedings of the Sixth Conference on Machine Translation, 2021

BERT Goes Brrr: A Venture Towards the Lesser Error in Classifying Medical Self-Reporters on Twitter.
Proceedings of the Sixth Social Media Mining for Health Workshop and Shared Task, 2021

ParaCotta: Synthetic Multilingual Paraphrase Corpora from the Most Diverse Translation Sample Pair.
Proceedings of the 35th Pacific Asia Conference on Language, Information and Computation, 2021

IndoNLI: A Natural Language Inference Dataset for Indonesian.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

IndoCollex: A Testbed for Morphological Transformation of Indonesian Word Colloquialism.
Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021

2020
Approximating neural machine translation for efficiency.
PhD thesis, 2020

Exploring Monolingual Data for Neural Machine Translation with Knowledge Distillation.
CoRR, 2020

Synthetic Source Language Augmentation for Colloquial Neural Machine Translation.
CoRR, 2020

No Budget? Don't Flex! Cost Consideration when Planning to Adopt NLP for Your Business.
CoRR, 2020

Semi-Supervised Low-Resource Style Transfer of Indonesian Informal to Formal Language with Iterative Forward-Translation.
CoRR, 2020

Semi-Supervised Low-Resource Style Transfer of Indonesian Informal to Formal Language with Iterative Forward-Translation.
Proceedings of the International Conference on Asian Language Processing, 2020

Edinburgh's Submissions to the 2020 Machine Translation Efficiency Task.
Proceedings of the Fourth Workshop on Neural Generation and Translation, 2020

Compressing Neural Machine Translation Models with 4-bit Precision.
Proceedings of the Fourth Workshop on Neural Generation and Translation, 2020

In Neural Machine Translation, What Does Transfer Learning Transfer?
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

Benchmarking Multidomain English-Indonesian Machine Translation.
Proceedings of the 13th Workshop on Building and Using Comparable Corpora, 2020

2019
Neural Machine Translation with 4-Bit Precision and Beyond.
CoRR, 2019

From Research to Production and Back: Ludicrously Fast Neural Machine Translation.
Proceedings of the 3rd Workshop on Neural Generation and Translation@EMNLP-IJCNLP 2019, 2019

Combining Global Sparse Gradients with Local Gradients in Distributed Neural Network Training.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Making Asynchronous Stochastic Gradient Descent Work for Transformers.
Proceedings of the 3rd Workshop on Neural Generation and Translation@EMNLP-IJCNLP 2019, 2019

2018
Toward a Standardized and More Accurate Indonesian Part-of-Speech Tagging.
Proceedings of the 2018 International Conference on Asian Language Processing, 2018

Accelerating Asynchronous Stochastic Gradient Descent for Neural Machine Translation.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

Marian: Fast Neural Machine Translation in C++.
Proceedings of ACL 2018, Melbourne, Australia, July 15-20, 2018, System Demonstrations, 2018

2017
Sparse Communication for Distributed Gradient Descent.
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017

2014
Can smartphones be used to detect an earthquake? Using a machine learning approach to identify an earthquake event.
Proceedings of the IEEE International Systems Conference, 2014

