Ying Sheng

Anastasios Nikolas Angelopoulos

Proceedings of the Forty-first International Conference on Machine Learning, 2024

LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Inference-Friendly Models With MixAttention.

Proceedings of the NeurIPS Efficient Natural Language and Speech Processing Workshop, 2024

2023

Combining Stable Infiniteness and (Strong) Politeness.

Andrew Reynolds

Cesare Tinelli

J. Autom. Reason., December, 2023

Reasoning About Vectors: Satisfiability Modulo a Theory of Sequences.

J. Autom. Reason., September, 2023

Efficiently Programming Large Language Models using SGLang.

CoRR, 2023

S-LoRA: Serving Thousands of Concurrent LoRA Adapters.

CoRR, 2023

H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models.

CoRR, 2023

On Optimal Caching and Model Multiplexing for Large Model Inference.

CoRR, 2023

High-throughput Generative Inference of Large Language Models with a Single GPU.

CoRR, 2023

Efficient Memory Management for Large Language Model Serving with PagedAttention.

Proceedings of the 29th Symposium on Operating Systems Principles, 2023

AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving.

Proceedings of the 17th USENIX Symposium on Operating Systems Design and Implementation, 2023

Towards Optimal Caching and Model Selection for Large Model Inference.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU.

Proceedings of the International Conference on Machine Learning, 2023

2022

Read-once refutations in Horn constraint systems: an algorithmic approach.

K. Subramani

Piotr Wojciechowski

J. Log. Comput., 2022

Polite Combination of Algebraic Datatypes.

Jane Lange

Pascal Fontaine

J. Autom. Reason., 2022

cvc5: A Versatile and Industrial-Strength SMT Solver.

Proceedings of the Tools and Algorithms for the Construction and Analysis of Systems, 2022

Reasoning About Vectors Using an SMT Theory of Sequences.

Proceedings of the Automated Reasoning - 11th International Joint Conference, 2022

2021

Politeness for the Theory of Algebraic Datatypes (Extended Abstract).

Jane Lange

Pascal Fontaine

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

Politeness and Stable Infiniteness: Stronger Together.

Andrew Reynolds

Cesare Tinelli

Proceedings of the Automated Deduction - CADE 28, 2021

2020

Politeness for the Theory of Algebraic Datatypes.

Jane Lange

Pascal Fontaine