Ján Cegin
ORCID: 0000-0003-2692-9320
According to our database, Ján Cegin authored at least 15 papers between 2020 and 2026.
Bibliography
2026
Revisiting Prompt Sensitivity in Large Language Models for Text Classification: The Role of Prompt Underspecification.
CoRR, February, 2026
Better as Generators Than Classifiers: Leveraging LLMs and Synthetic Data for Low-Resource Multilingual Classification.
Proceedings of the Findings of the Association for Computational Linguistics: EACL 2026, 2026
MultiCW: A Large-Scale Balanced Benchmark Dataset for Training Robust Check-Worthiness Detection Models.
Proceedings of the Findings of the Association for Computational Linguistics: EACL 2026, 2026
RoSE: Round-robin Synthetic Data Evaluation for Selecting LLM Generators without Human Test Sets.
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics, 2026
2025
LLMs vs Established Text Augmentation Techniques for Classification: When do the Benefits Outweight the Costs?
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025
Use Random Selection for Now: Investigation of Few-Shot Selection Strategies in LLM-based Text Augmentation.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2025, 2025
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025
2024
Use Random Selection for Now: Investigation of Few-Shot Selection Strategies in LLM-based Text Augmentation for Classification.
CoRR, 2024
Fighting Randomness with Randomness: Mitigating Optimisation Instability of Fine-Tuning using Delayed Ensemble and Noisy Interpolation.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024
Effects of diversity incentives on sample diversity and downstream model performance in LLM-based text augmentation.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
2023
ChatGPT to Replace Crowdsourcing of Paraphrases for Intent Classification: Higher Diversity and Comparable Model Robustness.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023
2022
Proceedings of the 2nd Workshop on Adverse Impacts and Collateral Effects of Artificial Intelligence Technologies, 2022
2020
Proceedings of the ESEC/FSE '20: 28th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2020
Synthesized dataset for search-based test data generation methods focused on MC/DC criterion.
Proceedings of the 20th IEEE International Conference on Software Quality, 2020
Proceedings of the 13th IEEE International Conference on Software Testing, 2020