Lianmin Zheng

ORCID: 0000-0002-6611-4612

According to our database, Lianmin Zheng authored at least 32 papers between 2017 and 2024.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2024

Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference.

Wei-Lin Chiang

Lianmin Zheng

Ying Sheng

Anastasios Nikolas Angelopoulos

CoRR, 2024

2023

Efficiently Programming Large Language Models using SGLang.

CoRR, 2023

Rethinking Benchmark and Contamination for Language Models with Rephrased Samples.

CoRR, 2023

S-LoRA: Serving Thousands of Concurrent LoRA Adapters.

CoRR, 2023

LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset.

CoRR, 2023

H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models.

CoRR, 2023

On Optimal Caching and Model Multiplexing for Large Model Inference.

CoRR, 2023

High-throughput Generative Inference of Large Language Models with a Single GPU.

CoRR, 2023

Efficient Memory Management for Large Language Model Serving with PagedAttention.

Proceedings of the 29th Symposium on Operating Systems Principles, 2023

AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving.

Proceedings of the 17th USENIX Symposium on Operating Systems Design and Implementation, 2023

Towards Optimal Caching and Model Selection for Large Model Inference.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU.

Proceedings of the International Conference on Machine Learning, 2023

TensorIR: An Abstraction for Automatic Tensorized Program Optimization.

Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

2022

On Optimizing the Communication of Model Parallelism.

CoRR, 2022

NumS: Scalable Array Programming for the Cloud.

CoRR, 2022

GACT: Activation Compressed Training for General Architectures.

CoRR, 2022

Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning.

CoRR, 2022

Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning.

Proceedings of the 16th USENIX Symposium on Operating Systems Design and Implementation, 2022

GACT: Activation Compressed Training for Generic Network Architectures.

Proceedings of the International Conference on Machine Learning, 2022

2021

TenSet: A Large-scale Program Performance Dataset for Learned Tensor Compilers.

Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, 2021

Simple and Automatic Distributed Machine Learning on Ray.

Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021

ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training.

Proceedings of the 38th International Conference on Machine Learning, 2021

2020

Ansor: Generating High-Performance Tensor Programs for Deep Learning.

Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation, 2020

2019

A Hardware-Software Blueprint for Flexible Deep Learning Specialization.

IEEE Micro, 2019

A Unified Optimization Approach for CNN Model Inference on Integrated GPUs.

Proceedings of the 48th International Conference on Parallel Processing, 2019

2018

Size-to-depth: A New Perspective for Single Image Depth Estimation.

Yiran Wu

Sihao Ying

Lianmin Zheng

CoRR, 2018

TVM: An Automated End-to-End Optimizing Compiler for Deep Learning.

Proceedings of the 13th USENIX Symposium on Operating Systems Design and Implementation, 2018

Learning to Optimize Tensor Programs.

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

MAgent: A Many-Agent Reinforcement Learning Platform for Artificial Collective Intelligence.

Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017

MAgent: A Many-Agent Reinforcement Learning Platform for Artificial Collective Intelligence.

CoRR, 2017
