Chanjun Park
Orcid: 0000-0002-7200-9632
Affiliations:
- Korea University, Seoul, South Korea
According to our database, Chanjun Park authored at least 101 papers between 2020 and 2026.
Links
Online presence:
- on orcid.org
Bibliography
2026
Omanic: Towards Step-wise Evaluation of Multi-hop Reasoning in Large Language Models.
CoRR, March, 2026
LANGSAE EDITING: Improving Multilingual Information Retrieval via Post-hoc Language Identity Removal.
CoRR, January, 2026
2025
KITE: A Benchmark for Evaluating Korean Instruction-Following Abilities in Large Language Models.
CoRR, October, 2025
Mixture-of-Clustered-Experts: Advancing Expert Specialization and Generalization in Instruction Tuning.
CoRR, September, 2025
HealthGenie: Empowering Users with Healthy Dietary Guidance through Knowledge Graph and Large Language Models.
CoRR, April, 2025
Debate Only When Necessary: Adaptive Multiagent Collaboration for Efficient LLM Reasoning.
CoRR, April, 2025
Like Father, Like Son: Kinship-Aware Preference Mapping (KARMA) for Automatic Alignment in Large Language Models.
CoRR, February, 2025
An analysis on language transfer of pre-trained language model with cross-lingual post-training.
Expert Syst. Appl., 2025
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025
Dataverse: Open-Source ETL (Extract, Transform, Load) Pipeline for Large Language Models.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025
Understanding LLM Development Through Longitudinal Study: Insights from the Open Ko-LLM Leaderboard.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025
Find the Intention of Instruction: Comprehensive Evaluation of Instruction Understanding for Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025
Open Ko-LLM Leaderboard2: Bridging Foundational and Practical Evaluation for Korean LLMs.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025
LCIRC: A Recurrent Compression Approach for Efficient Long-form Context and Query Dependent Modeling in LLMs.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025
MultiDocFusion: Hierarchical and Multimodal Chunking Pipeline for Enhanced RAG on Long Industrial Documents.
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025
HAWK: Highlighting Entity-aware Knowledge for Alleviating Information Sparsity in Long Contexts.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025
Can Code-Switched Texts Activate a Knowledge Switch in LLMs? A Case Study on English-Korean Code-Switching.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025
LP Data Pipeline: Lightweight, Purpose-driven Data Pipeline for Large Language Models.
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025
ZEBRA: Leveraging Model-Behavioral Knowledge for Zero-Annotation Preference Dataset Construction.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025
Mixture-of-Clustered-Experts: Advancing Expert Specialization and Generalization in Instruction Tuning.
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025
Representing the Under-Represented: Cultural and Core Capability Benchmarks for Developing Thai Large Language Models.
Proceedings of the 31st International Conference on Computational Linguistics, 2025
Proceedings of the 31st International Conference on Computational Linguistics, 2025
Proceedings of the 34th ACM International Conference on Information and Knowledge Management, 2025
AGENTiGraph: A Multi-Agent Knowledge Graph Framework for Interactive, Domain-Specific LLM Chatbots.
Proceedings of the 34th ACM International Conference on Information and Knowledge Management, 2025
Rethinking KenLM: Good and Bad Model Ensembles for Efficient Text Quality Filtering in Large Web Corpora.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2025
Enhancing Automatic Term Extraction with Large Language Models via Syntactic Retrieval.
Proceedings of the Findings of the Association for Computational Linguistics, 2025
From Ambiguity to Accuracy: The Transformative Effect of Coreference Resolution on Retrieval-Augmented Generation systems.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop), 2025
2024
Exploring Coding Spot: Understanding Parametric Contributions to LLM Coding Performance.
CoRR, 2024
InstaTrans: An Instruction-Aware Translation Framework for Non-English Instruction Datasets.
CoRR, 2024
1 Trillion Token (1TT) Platform: A Novel Framework for Efficient Data Sharing and Compensation in Large Language Models.
CoRR, 2024
ChatLang-8: An LLM-Based Synthetic Data Generation Framework for Grammatical Error Correction.
CoRR, 2024
Enhancing Consistency and Role-Specific Knowledge Capturing by Rebuilding Fictional Character's Persona.
CoRR, 2024
Model-Based Data-Centric AI: Bridging the Divide Between Academic Ideals and Industrial Pragmatism.
CoRR, 2024
Toward Practical Automatic Speech Recognition and Post-Processing: a Call for Explainable Error Benchmark Guideline.
CoRR, 2024
Exploiting Hanja-Based Resources in Processing Korean Historic Documents Written by Common Literati.
IEEE Access, 2024
Exploring Inherent Biases in LLMs within Korean Social Context: A Comparative Analysis of ChatGPT and GPT-4.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Student Research Workshop, 2024
SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Track, 2024
Explainable CED: A Dataset for Explainable Critical Error Detection in Machine Translation.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Student Research Workshop, 2024
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024
Search if you don't know! Knowledge-Augmented Korean Grammatical Error Correction with Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024
Where am I? Large Language Models Wandering between Semantics and Structures in Long Contexts.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: EMNLP 2024, 2024
SAAS: Solving Ability Amplification Strategy for Enhanced Mathematical Reasoning in Large Language Models.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: EMNLP 2024, 2024
Hyper-BTS Dataset: Scalability and Enhanced Analysis of Back TranScription (BTS) for ASR Post-Processing.
Proceedings of the Findings of the Association for Computational Linguistics: EACL 2024, 2024
Generative Interpretation: Toward Human-Like Evaluation for Educational Question-Answer Pair Generation.
Proceedings of the Findings of the Association for Computational Linguistics: EACL 2024, 2024
Leveraging Pre-existing Resources for Data-Efficient Counter-Narrative Generation in Korean.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024
Detecting Critical Errors Considering Cross-Cultural Factors in English-Korean Translation.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024
KoCommonGEN v2: A Benchmark for Navigating Korean Commonsense Reasoning Challenges in Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics, 2024
Open Ko-LLM Leaderboard: Evaluating Large Language Models in Korean with Ko-H5 Benchmark.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
Length-aware Byte Pair Encoding for Mitigating Over-segmentation in Korean Machine Translation.
Proceedings of the Findings of the Association for Computational Linguistics, 2024
2023
Expert Syst. Appl., December, 2023
Data-Driven Approach for Formality-Sensitive Machine Translation: Language-Specific Handling and Synthetic Data Generation.
CoRR, 2023
Synthetic Alone: Exploring the Dark Side of Synthetic Data for Grammatical Error Correction.
CoRR, 2023
Transcending Traditional Boundaries: Leveraging Inter-Annotator Agreement (IAA) for Enhancing Data Management Operations (DMOps).
CoRR, 2023
Inter-Annotator Agreement in the Wild: Uncovering Its Emerging Roles and Considerations in Real-World Scenarios.
CoRR, 2023
Self-Improving-Leaderboard(SIL): A Call for Real-World Centric Natural Language Processing Leaderboards.
CoRR, 2023
Uncovering the Risks and Drawbacks Associated With the Use of Synthetic Data for Grammatical Error Correction.
IEEE Access, 2023
Improving Formality-Sensitive Machine Translation Using Data-Centric Approaches and Prompt Engineering.
Proceedings of the 20th International Conference on Spoken Language Translation, 2023
Informative Evidence-guided Prompt-based Fine-tuning for English-Korean Critical Error Detection.
Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, 2023
Proceedings of the IEEE International Conference on Data Mining, 2023
CHEF in the Language Kitchen: A Generative Data Augmentation Leveraging Korean Morpheme Ingredients.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations), 2023
2022
PU-GEN: Enhancing generative commonsense reasoning for language models with human-centered knowledge.
Knowl. Based Syst., 2022
Language Chameleon: Transformation analysis between languages using Cross-lingual Post-training based on Pre-trained language models.
CoRR, 2022
Empirical study on BlenderBot 2.0 Errors Analysis in terms of Model, Data and User-Centric Approach.
CoRR, 2022
IEEE Access, 2022
IEEE Access, 2022
Mimicking Infants' Bilingual Language Acquisition for Domain Specialized Neural Machine Translation.
IEEE Access, 2022
IEEE Access, 2022
K-NCT: Korean Neural Grammatical Error Correction Gold-Standard Test Set Using Novel Error Type Classification Criteria.
IEEE Access, 2022
IEEE Access, 2022
IEEE Access, 2022
KU X Upstage's Submission for the WMT22 Quality Estimation: Critical Error Detection Shared Task.
Proceedings of the Seventh Conference on Machine Translation, 2022
A Dog Is Passing Over The Jet? A Text-Generation Dataset for Korean Commonsense Reasoning and Evaluation.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2022, 2022
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022
FreeTalky: Don't Be Afraid! Conversations Made Easier by a Humanoid Robot using Persona-based Dialogue.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022
Empirical Analysis of Noising Scheme based Synthetic Data Generation for Automatic Post-editing.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022
PicTalky: Augmentative and Alternative Communication for Language Developmental Disabilities.
Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing, 2022
QUAK: A Synthetic Quality Estimation Dataset for Korean-English Neural Machine Translation.
Proceedings of the 29th International Conference on Computational Linguistics, 2022
2021
Neural spelling correction: translating incorrect sentences to correct sentences for multimedia.
Multim. Tools Appl., 2021
How should human translation coexist with NMT? Efficient tool for building high quality parallel corpus.
CoRR, 2021
Empirical Analysis of Korean Public AI Hub Parallel Corpora and in-depth Analysis using LIWC.
CoRR, 2021
Who says like a style of Vitamin: Towards Syntax-Aware Dialogue Summarization using Multi-task Learning.
CoRR, 2021
PicTalky: Augmentative and Alternative Communication Software for Language Developmental Disabilities.
CoRR, 2021
IEEE Access, 2021
Who Speaks Like a Style of Vitamin: Towards Syntax-Aware Dialogue Summarization Using Multi-Task Learning.
IEEE Access, 2021
Grounded Vocabulary for Image Retrieval Using a Modified Multi-Generator Generative Adversarial Network.
IEEE Access, 2021
Should we find another model?: Improving Neural Machine Translation Performance with ONE-Piece Tokenization Method without Model Modification.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Papers, 2021
BTS: Back TranScription for Speech-to-Text Post-Processor using Text-to-Speech-to-Text.
Proceedings of the 8th Workshop on Asian Translation, 2021
2020
Comparison of the Evaluation Metrics for Neural Grammatical Error Correction With Overcorrection.
IEEE Access, 2020