Seungone Kim

According to our database, Seungone Kim authored at least 39 papers between 2022 and 2025.

Bibliography

2025
Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning.
CoRR, July, 2025

Datasheets Aren't Enough: DataRubrics for Automated Quality Metrics and Accountability.
CoRR, June, 2025

Measuring Sycophancy of Language Models in Multi-turn Dialogues.
CoRR, May, 2025

Let's Predict Sentence by Sentence.
CoRR, May, 2025

FREESON: Retriever-Free Retrieval-Augmented Reasoning via Corpus-Traversing MCTS.
CoRR, May, 2025

Web-Shepherd: Advancing PRMs for Reinforcing Web Agents.
CoRR, May, 2025

Reasoning Models Better Express Their Confidence.
CoRR, May, 2025

The CoT Encyclopedia: Analyzing, Predicting, and Controlling how a Reasoning Model will Think.
CoRR, May, 2025

M-Prometheus: A Suite of Open Multilingual LLM Judges.
CoRR, April, 2025

Scaling Evaluation-time Compute with Reasoning Models as Process Evaluators.
CoRR, March, 2025

KMMLU: Measuring Massive Multitask Language Understanding in Korean.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Better Instruction-Following Through Minimum Bayes Risk.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Evaluating Language Models as Synthetic Data Generators.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

LLM-as-an-Interviewer: Beyond Static Testing Through Dynamic LLM Evaluation.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024
Bridging the Data Provenance Gap Across Text, Speech and Video.
CoRR, 2024

MM-Eval: A Multilingual Meta-Evaluation Benchmark for LLM-as-a-Judge and Reward Models.
CoRR, 2024

Can Language Models Evaluate Human Written Text? Case Study on Korean Student Writing for Education.
CoRR, 2024

Consent in Crisis: The Rapid Decline of the AI Data Commons.
CoRR, 2024

Self-Explore to Avoid the Pit: Improving the Reasoning Capabilities of Language Models with Fine-grained Rewards.
CoRR, 2024

Language Models as Compilers: Simulating Pseudocode Execution Improves Algorithmic Reasoning in Language Models.
CoRR, 2024


Aligning to Thousands of Preferences via System Message Generalization.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Prometheus: Inducing Fine-Grained Evaluation Capability in Language Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Self-Explore: Enhancing Mathematical Reasoning in Language Models with Fine-grained Rewards.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Language Models as Compilers: Simulating Pseudocode Execution Improves Algorithmic Reasoning in Language Models.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

LangBridge: Multilingual Reasoning Without Multilingual Supervision.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Multi-Task Inference: Can Large Language Models Follow Multiple Instructions at Once?
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Prometheus-Vision: Vision-Language Model as a Judge for Fine-Grained Evaluation.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023
Personalized Soups: Personalized Large Language Model Alignment via Post-hoc Parameter Merging.
CoRR, 2023

Prometheus: Inducing Fine-grained Evaluation Capability in Language Models.
CoRR, 2023

Exploring the Benefits of Training Expert Language Models over Instruction Tuning.
Proceedings of the International Conference on Machine Learning, 2023

The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

CoTEVer: Chain of Thought Prompting Annotation Toolkit for Explanation Verification.
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics. EACL 2023, 2023

2022
Can Language Models perform Abductive Commonsense Reasoning?
CoRR, 2022

Mind the Gap! Injecting Commonsense Knowledge for Abstractive Dialogue Summarization.
Proceedings of the 29th International Conference on Computational Linguistics, 2022

