Samuel Cahyawijaya

According to our database, Samuel Cahyawijaya authored at least 64 papers between 2018 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
LLMs Are Few-Shot In-Context Low-Resource Language Learners.
CoRR, 2024

Subobject-level Image Tokenization.
CoRR, 2024

LinguAlchemy: Fusing Typological and Geographical Elements for Unseen Language Generalization.
CoRR, 2024

2023
IndoRobusta: Towards Robustness Against Diverse Code-Mixed Indonesian Local Languages.
CoRR, 2023

IndoToD: A Multi-Domain Indonesian Benchmark For End-to-End Task-Oriented Dialogue Systems.
CoRR, 2023

InstructTODS: Large Language Models for End-to-End Task-Oriented Dialogue Systems.
CoRR, 2023

Negative Object Presence Evaluation (NOPE) to Measure Object Hallucination in Vision-Language Models.
CoRR, 2023

Survey of Social Bias in Vision-Language Models.
CoRR, 2023

Cross-Lingual Cross-Age Group Adaptation for Low-Resource Elderly Speech Emotion Recognition.
CoRR, 2023

GlobalBench: A Benchmark for Global Progress in Natural Language Processing.
CoRR, 2023

Multilingual Large Language Models Are Not (Yet) Code-Switchers.
CoRR, 2023

Instruct-Align: Teaching Novel Languages with to LLMs through Alignment-based Cross-Lingual Instruction.
CoRR, 2023

Prompting Multilingual Large Language Models to Generate Code-Mixed Texts: The Case of South East Asian Languages.
CoRR, 2023

Biomedical Image Reconstruction: A Survey.
CoRR, 2023

PICK: Polished & Informed Candidate Scoring for Knowledge-Grounded Dialogue Systems.
Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, 2023

NusaWrites: Constructing High-Quality Corpora for Underrepresented and Extremely Low-Resource Languages.
Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, 2023

A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity.
Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, 2023

The Obscure Limitation of Modular Multilingual Language Models.
Proceedings of the First Tiny Papers Track at ICLR 2023, 2023

Multilingual Large Language Models Are Not (Yet) Code-Switchers.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

GlobalBench: A Benchmark for Global Progress in Natural Language Processing.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

NusaX: Multilingual Parallel Sentiment Dataset for 10 Indonesian Local Languages.
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, 2023

Which One Are You Referring To? Multimodal Object Identification in Situated Dialogue.
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics: EACL 2023, 2023

Multi-lingual and Multi-cultural Figurative Language Understanding.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023


2022
NusaCrowd: Open Source Initiative for Indonesian NLP Resources.
CoRR, 2022

Every picture tells a story: Image-grounded controllable stylistic story generation.
CoRR, 2022

NusaCrowd: A Call for Open and Reproducible NLP Research in Indonesian Languages.
CoRR, 2022

Kaggle Competition: Cantonese Audio-Visual Speech Recognition for In-car Commands.
CoRR, 2022

BigBIO: A Framework for Data-Centric Biomedical Natural Language Processing.
CoRR, 2022

GEMv2: Multilingual NLG Benchmarking in a Single Line of Code.
CoRR, 2022

NusaX: Multilingual Parallel Sentiment Dataset for 10 Indonesian Local Languages.
CoRR, 2022

VScript: Controllable Script Generation with Audio-Visual Presentation.
CoRR, 2022

CI-AVSR: A Cantonese Audio-Visual Speech Dataset for In-car Command Recognition.
CoRR, 2022

Clozer: Adaptable Data Augmentation for Cloze-style Reading Comprehension.
Proceedings of the 7th Workshop on Representation Learning for NLP, 2022


Automatic Speech Recognition Datasets in Cantonese: A Survey and New Dataset.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

ASCEND: A Spontaneous Chinese-English Dataset for Code-switching in Multi-turn Conversation.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

CI-AVSR: A Cantonese Audio-Visual Speech Dataset for In-car Command Recognition.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

VScript: Controllable Script Generation with Visual Presentation.
Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing, 2022

SNP2Vec: Scalable Self-Supervised Pre-Training for Genome-Wide Association Study.
Proceedings of the 21st Workshop on Biomedical Language Processing, 2022

Integrating Question Rewrites in Conversational Question Answering: A Reinforcement Learning Approach.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop, 2022

One Country, 700+ Languages: NLP Challenges for Underrepresented Languages and Dialects in Indonesia.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

How Long Is Enough? Exploring the Optimal Intervals of Long-Range Clinical Note Language Modeling.
Proceedings of the 13th International Workshop on Health Text Mining and Information Analysis, 2022

Can Question Rewriting Help Conversational Question Answering?
Proceedings of the Third Workshop on Insights from Negative Results in NLP, 2022

Retrieval-Free Knowledge-Grounded Dialogue Response Generation with Adapters.
Proceedings of the Second DialDoc Workshop on Document-grounded Dialogue and Conversational Question Answering, 2022

2021
ASCEND: A Spontaneous Chinese-English Dataset for Code-switching in Multi-turn Conversation.
CoRR, 2021

NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation.
CoRR, 2021

Greenformer: Factorization Toolkit for Efficient Deep Neural Networks.
CoRR, 2021

Greenformers: Improving Computation and Memory Efficiency in Transformer Models via Low-Rank Approximation.
CoRR, 2021

Weakly-supervised Multi-task Learning for Multimodal Affect Recognition.
CoRR, 2021

ERICA: An Empathetic Android Companion for Covid-19 Quarantine.
Proceedings of the 22nd Annual Meeting of the Special Interest Group on Discourse and Dialogue, 2021

Multimodal End-to-End Sparse Model for Emotion Recognition.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

IndoNLG: Benchmark and Resources for Evaluating Indonesian Natural Language Generation.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Are Multilingual Models Effective in Code-Switching?
Proceedings of the Fifth Workshop on Computational Approaches to Linguistic Code-Switching, 2021

On the Importance of Word Order Information in Cross-lingual Sequence Labeling.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

CrossNER: Evaluating Cross-Domain Named Entity Recognition.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Model Generalization on COVID-19 Fake News Detection.
Proceedings of the Combating Online Hostile Posts in Regional Languages during Emergency Situation, 2021

2020
XPersona: Evaluating Multilingual Personalized Chatbot.
CoRR, 2020

Learning Fast Adaptation on Cross-Accented Speech Recognition.
Proceedings of the Interspeech 2020, 2020

IndoNLU: Benchmark and Resources for Evaluating Indonesian Natural Language Understanding.
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing, 2020

Lightweight and Efficient End-To-End Speech Recognition Using Low-Rank Transformer.
Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

Learning Knowledge Bases with Parameters for Task-Oriented Dialogue Systems.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

Meta-Transfer Learning for Code-Switched Speech Recognition.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2018
Aspect Detection and Sentiment Classification Using Deep Neural Network for Indonesian Aspect-Based Sentiment Analysis.
Proceedings of the 2018 International Conference on Asian Language Processing, 2018

