Scarlett Li

ORCID: 0009-0002-8912-4861

According to our database¹, Scarlett Li authored at least 19 papers between 2024 and 2026.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of three.

Bibliography

2026

Demystifying Data Organization for Enhanced LLM Training.

CoRR, May, 2026

Scaling Data Difficulty: Improving Coding Models via Reinforcement Learning on Fresh and Challenging Problems.

CoRR, March, 2026

TestExplora: Benchmarking LLMs for Proactive Bug Discovery via Repository-Level Test Generation.

CoRR, February, 2026

Closing the Loop: Universal Repository Representation with RPG-Encoder.

CoRR, February, 2026

X-Coder: Advancing Competitive Programming with Fully Synthetic Tasks, Solutions, and Tests.

CoRR, January, 2026

2025

RPG: A Repository Planning Graph for Unified and Scalable Codebase Generation.

CoRR, September, 2025

rStar2-Agent: Agentic Reasoning Technical Report.

CoRR, August, 2025

Data Efficacy for Language Model Training.

CoRR, June, 2025

IterPref: Focal Preference Learning for Code Generation via Iterative Debugging.

CoRR, March, 2025

EpiCoder: Encompassing Diversity and Complexity in Code Generation.

Proceedings of the Forty-second International Conference on Machine Learning, 2025

Teaching Your Models to Understand Code via Focal Preference Alignment.

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025

PEACE: Empowering Geologic Map Holistic Understanding with MLLMs.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

ProductMeta: An Interactive System for Metaphorical Product Design Ideation with Multimodal Large Language Models.

Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems, 2025

MMLU-CF: A Contamination-free Multi-task Language Understanding Benchmark.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

FEA-Bench: A Benchmark for Evaluating Repository-Level Code Generation for Feature Implementation.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025

Value Compass Benchmarks: A Comprehensive, Generative and Self-Evolving Platform for LLMs' Value Evaluation.

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations), 2025

2024

MMLU-CF: A Contamination-free Multi-task Language Understanding Benchmark.

CoRR, 2024

RedStone: Curating General, Code, Math, and QA Data for Large Language Models.

CoRR, 2024

Significant ASR Error Detection for Conversational Voice Assistants.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024
