Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models.

Seungone Kim

Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages.

Sathyanarayanan Ramamoorthy

Graham Neubig

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Better Instruction-Following Through Minimum Bayes Risk.

Sina Khoshfetrat Pakazad

Graham Neubig

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Evaluating Language Models as Synthetic Data Generators.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

LLM-as-an-Interviewer: Beyond Static Testing Through Dynamic LLM Evaluation.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024

Bridging the Data Provenance Gap Across Text, Speech and Video.

CoRR, 2024

MM-Eval: A Multilingual Meta-Evaluation Benchmark for LLM-as-a-Judge and Reward Models.

CoRR, 2024

Can Language Models Evaluate Human Written Text? Case Study on Korean Student Writing for Education.

Seungyoon Kim

Seungone Kim

CoRR, 2024

Consent in Crisis: The Rapid Decline of the AI Data Commons.

CoRR, 2024

Self-Explore to Avoid the Pit: Improving the Reasoning Capabilities of Language Models with Fine-grained Rewards.

CoRR, 2024

Language Models as Compilers: Simulating Pseudocode Execution Improves Algorithmic Reasoning in Language Models.

CoRR, 2024

Consent in Crisis: The Rapid Decline of the AI Data Commons.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Aligning to Thousands of Preferences via System Message Generalization.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Prometheus: Inducing Fine-Grained Evaluation Capability in Language Models.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Self-Explore: Enhancing Mathematical Reasoning in Language Models with Fine-grained Rewards.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Language Models as Compilers: Simulating Pseudocode Execution Improves Algorithmic Reasoning in Language Models.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

LangBridge: Multilingual Reasoning Without Multilingual Supervision.

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Multi-Task Inference: Can Large Language Models Follow Multiple Instructions at Once?

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Prometheus-Vision: Vision-Language Model as a Judge for Fine-Grained Evaluation.

Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023

Personalized Soups: Personalized Large Language Model Alignment via Post-hoc Parameter Merging.

Prithviraj Ammanabrolu

CoRR, 2023

Prometheus: Inducing Fine-grained Evaluation Capability in Language Models.

CoRR, 2023

Exploring the Benefits of Training Expert Language Models over Instruction Tuning.

Proceedings of the International Conference on Machine Learning, 2023

The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning.

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

CoTEVer: Chain of Thought Prompting Annotation Toolkit for Explanation Verification.

Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics. EACL 2023, 2023

2022

Can Language Models perform Abductive Commonsense Reasoning?

Seungone Kim

CoRR, 2022

Mind the Gap! Injecting Commonsense Knowledge for Abstractive Dialogue Summarization.

Proceedings of the 29th International Conference on Computational Linguistics, 2022

Seungone Kim
