Ying Sheng

Orcid: 0000-0002-1883-2126

Affiliations:
  • Stanford University, CA, USA


According to our database, Ying Sheng authored at least 24 papers between 2020 and 2024.

Bibliography

2024
Fairness in Serving Large Language Models.
CoRR, 2024

2023
Combining Stable Infiniteness and (Strong) Politeness.
J. Autom. Reason., December, 2023

Reasoning About Vectors: Satisfiability Modulo a Theory of Sequences.
J. Autom. Reason., September, 2023

Efficiently Programming Large Language Models using SGLang.
CoRR, 2023

S-LoRA: Serving Thousands of Concurrent LoRA Adapters.
CoRR, 2023

Clover: Closed-Loop Verifiable Code Generation.
CoRR, 2023

LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset.
CoRR, 2023

H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models.
CoRR, 2023

Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena.
CoRR, 2023

On Optimal Caching and Model Multiplexing for Large Model Inference.
CoRR, 2023

High-throughput Generative Inference of Large Language Models with a Single GPU.
CoRR, 2023

Efficient Memory Management for Large Language Model Serving with PagedAttention.
Proceedings of the 29th Symposium on Operating Systems Principles, 2023

AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving.
Proceedings of the 17th USENIX Symposium on Operating Systems Design and Implementation, 2023

Towards Optimal Caching and Model Selection for Large Model Inference.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU.
Proceedings of the International Conference on Machine Learning, 2023

2022
Read-once refutations in Horn constraint systems: an algorithmic approach.
J. Log. Comput., 2022

Polite Combination of Algebraic Datatypes.
J. Autom. Reason., 2022

cvc5: A Versatile and Industrial-Strength SMT Solver.
Proceedings of the Tools and Algorithms for the Construction and Analysis of Systems, 2022

Reasoning About Vectors Using an SMT Theory of Sequences.
Proceedings of the Automated Reasoning - 11th International Joint Conference, 2022

2021
Politeness for the Theory of Algebraic Datatypes (Extended Abstract).
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

Politeness and Stable Infiniteness: Stronger Together.
Proceedings of the Automated Deduction - CADE 28, 2021

2020
Politeness for the Theory of Algebraic Datatypes.
Proceedings of the Automated Reasoning - 10th International Joint Conference, 2020
