Chanjun Park

Orcid: 0000-0002-7200-9632

According to our database, Chanjun Park authored at least 54 papers between 2020 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Dataverse: Open-Source ETL (Extract, Transform, Load) Pipeline for Large Language Models.
CoRR, 2024

sDPO: Don't Use Your Data All at Once.
CoRR, 2024

Model-Based Data-Centric AI: Bridging the Divide Between Academic Ideals and Industrial Pragmatism.
CoRR, 2024

Toward Practical Automatic Speech Recognition and Post-Processing: a Call for Explainable Error Benchmark Guideline.
CoRR, 2024

Hyper-BTS Dataset: Scalability and Enhanced Analysis of Back TranScription (BTS) for ASR Post-Processing.
Proceedings of the Findings of the Association for Computational Linguistics: EACL 2024, 2024

Generative Interpretation: Toward Human-Like Evaluation for Educational Question-Answer Pair Generation.
Proceedings of the Findings of the Association for Computational Linguistics: EACL 2024, 2024

2023
Doubts on the reliability of parallel corpus filtering.
Expert Syst. Appl., December, 2023

SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling.
CoRR, 2023

Data-Driven Approach for Formality-Sensitive Machine Translation: Language-Specific Handling and Synthetic Data Generation.
CoRR, 2023

Knowledge Graph-Augmented Korean Generative Commonsense Reasoning.
CoRR, 2023

Synthetic Alone: Exploring the Dark Side of Synthetic Data for Grammatical Error Correction.
CoRR, 2023

Transcending Traditional Boundaries: Leveraging Inter-Annotator Agreement (IAA) for Enhancing Data Management Operations (DMOps).
CoRR, 2023

Inter-Annotator Agreement in the Wild: Uncovering Its Emerging Roles and Considerations in Real-World Scenarios.
CoRR, 2023

Self-Improving-Leaderboard(SIL): A Call for Real-World Centric Natural Language Processing Leaderboards.
CoRR, 2023

DMOps: Data Management Operation and Recipes.
CoRR, 2023

Uncovering the Risks and Drawbacks Associated With the Use of Synthetic Data for Grammatical Error Correction.
IEEE Access, 2023

Improving Formality-Sensitive Machine Translation Using Data-Centric Approaches and Prompt Engineering.
Proceedings of the 20th International Conference on Spoken Language Translation, 2023

Informative Evidence-guided Prompt-based Fine-tuning for English-Korean Critical Error Detection.
Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, 2023

Alternative Speech: Complementary Method to Counter-Narrative for Better Discourse.
Proceedings of the IEEE International Conference on Data Mining, 2023

CHEF in the Language Kitchen: A Generative Data Augmentation Leveraging Korean Morpheme Ingredients.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

KEBAP: Korean Error Explainable Benchmark Dataset for ASR and Post-processing.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

PEEP-Talk: A Situational Dialogue-based Chatbot for English Education.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2023

2022
PU-GEN: Enhancing generative commonsense reasoning for language models with human-centered knowledge.
Knowl. Based Syst., 2022

Language Chameleon: Transformation analysis between languages using Cross-lingual Post-training based on Pre-trained language models.
CoRR, 2022

Empirical study on BlenderBot 2.0 Errors Analysis in terms of Model, Data and User-Centric Approach.
CoRR, 2022

AI for Patents: A Novel Yet Effective and Efficient Framework for Patent Analysis.
IEEE Access, 2022

Plain Template Insertion: Korean-Prompt-Based Engineering for Few-Shot Learners.
IEEE Access, 2022

Mimicking Infants' Bilingual Language Acquisition for Domain Specialized Neural Machine Translation.
IEEE Access, 2022

An Automatic Post Editing With Efficient and Simple Data Generation Method.
IEEE Access, 2022

K-NCT: Korean Neural Grammatical Error Correction Gold-Standard Test Set Using Novel Error Type Classification Criteria.
IEEE Access, 2022

Utilization Strategy of User Engagements in Korean Fake News Detection.
IEEE Access, 2022

Word-Level Quality Estimation for Korean-English Neural Machine Translation.
IEEE Access, 2022

KU X Upstage's Submission for the WMT22 Quality Estimation: Critical Error Detection Shared Task.
Proceedings of the Seventh Conference on Machine Translation, 2022

A Dog Is Passing Over The Jet? A Text-Generation Dataset for Korean Commonsense Reasoning and Evaluation.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2022, 2022

Priming Ancient Korean Neural Machine Translation.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

FreeTalky: Don't Be Afraid! Conversations Made Easier by a Humanoid Robot using Persona-based Dialogue.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

Empirical Analysis of Noising Scheme based Synthetic Data Generation for Automatic Post-editing.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

PicTalky: Augmentative and Alternative Communication for Language Developmental Disabilities.
Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing, 2022

QUAK: A Synthetic Quality Estimation Dataset for Korean-English Neural Machine Translation.
Proceedings of the 29th International Conference on Computational Linguistics, 2022

2021
Neural spelling correction: translating incorrect sentences to correct sentences for multimedia.
Multim. Tools Appl., 2021

A Self-Supervised Automatic Post-Editing Data Generation Tool.
CoRR, 2021

A New Tool for Efficiently Generating Quality Estimation Datasets.
CoRR, 2021

Automatic Knowledge Augmentation for Generative Commonsense Reasoning.
CoRR, 2021

How should human translation coexist with NMT? Efficient tool for building high quality parallel corpus.
CoRR, 2021

Empirical Analysis of Korean Public AI Hub Parallel Corpora and in-depth Analysis using LIWC.
CoRR, 2021

Who says like a style of Vitamin: Towards Syntax-Aware DialogueSummarization using Multi-task Learning.
CoRR, 2021

PicTalky: Augmentative and Alternative Communication Software for Language Developmental Disabilities.
CoRR, 2021

An Empirical Study on Automatic Post Editing for Neural Machine Translation.
IEEE Access, 2021

Who Speaks Like a Style of Vitamin: Towards Syntax-Aware Dialogue Summarization Using Multi-Task Learning.
IEEE Access, 2021

Grounded Vocabulary for Image Retrieval Using a Modified Multi-Generator Generative Adversarial Network.
IEEE Access, 2021

Should we find another model?: Improving Neural Machine Translation Performance with ONE-Piece Tokenization Method without Model Modification.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Papers, 2021

BTS: Back TranScription for Speech-to-Text Post-Processor using Text-to-Speech-to-Text.
Proceedings of the 8th Workshop on Asian Translation, 2021

2020
Comparison of the Evaluation Metrics for Neural Grammatical Error Correction With Overcorrection.
IEEE Access, 2020

Ancient Korean Neural Machine Translation.
IEEE Access, 2020

