Yixin Liu

Affiliations:
  • Yale University, Department of Computer Science, New Haven, CT, USA
  • Carnegie Mellon University, Pittsburgh, PA, USA


According to our database, Yixin Liu authored at least 42 papers between 2021 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
SciArena: An Open Evaluation Platform for Foundation Models in Scientific Literature Tasks.
CoRR, July, 2025

MMVU: Measuring Expert-Level Multi-Discipline Video Understanding.
CoRR, January, 2025

SCIURus: Shared Circuits for Interpretable Uncertainty Representations in Language Models.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

ReIFE: Re-evaluating Instruction-Following Evaluation.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

Understanding Reference Policies in Direct Preference Optimization.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025

Re-evaluating Automatic LLM System Ranking for Alignment with Human Preference.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025

MMVU: Measuring Expert-Level Multi-Discipline Video Understanding.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

Physics: Benchmarking Foundation Models on University-Level Physics Problem Solving.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

AbGen: Evaluating Large Language Models in Ablation Study Design and Evaluation for Scientific Research.
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

Evaluating Mathematical Reasoning Beyond Accuracy.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024
COMAL: A Convergent Meta-Algorithm for Aligning LLMs with General Preferences.
CoRR, 2024

Fair Abstractive Summarization of Diverse Perspectives.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

On the Role of Summary Content Units in Text Summarization Evaluation.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Short Papers, 2024

On Learning to Summarize with Large Language Models as References.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

Benchmarking Generation and Evaluation Capabilities of Large Language Models for Instruction Controllable Summarization.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2024, 2024

M3SciQA: A Multi-Modal Multi-Document Scientific QA Benchmark for Evaluating Foundation Models.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Calibrating Long-form Generations From Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024


Rethinking Efficient Multilingual Text Summarization Meta-Evaluation.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

DocMath-Eval: Evaluating Math Reasoning Capabilities of LLMs in Understanding Financial Documents.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023
DocMath-Eval: Evaluating Numerical Reasoning Capabilities of LLMs in Understanding Long Documents with Tabular Data.
CoRR, 2023

ODSum: New Benchmarks for Open Domain Multi-Document Summarization.
CoRR, 2023

QTSumm: A New Benchmark for Query-Focused Table Summarization.
CoRR, 2023

On Learning to Summarize with Large Language Models as References.
CoRR, 2023

Towards Interpretable and Efficient Automatic Reference-Based Summarization Evaluation.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

QTSumm: Query-Focused Summarization over Tabular Data.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

A Needle in a Haystack: An Analysis of High-Agreement Workers on MTurk for Summarization.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Revisiting the Gold Standard: Grounding Summarization Evaluation with Robust Human Evaluation.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

On Improving Summarization Factual Consistency from Natural Language Feedback.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
Needle in a Haystack: An Analysis of Finding Qualified Workers on MTurk for Summarization.
CoRR, 2022

FOLIO: Natural Language Reasoning with First-Order Logic.
CoRR, 2022

GEMv2: Multilingual NLG Benchmarking in a Single Line of Code.
CoRR, 2022

Surfer100: Generating Surveys From Web Resources, Wikipedia-style.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

R2D2: Robust Data-to-Text with Replacement Detection.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

Leveraging Locality in Abstractive Text Summarization.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

DataLab: A Platform for Data Analysis and Intervention.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, 2022

BRIO: Bringing Order to Abstractive Summarization.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

2021
CLICKER: A Computational LInguistics Classification Scheme for Educational Resources.
CoRR, 2021

On Learning Text Style Transfer with Direct Rewards.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

RefSum: Refactoring Neural Summarization.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

ExplainaBoard: An Explainable Leaderboard for NLP.
Proceedings of the Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

SimCLS: A Simple Framework for Contrastive Learning of Abstractive Summarization.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

