Ruihang Lai

Orcid: 0000-0001-6400-5079

According to our database, Ruihang Lai authored at least 11 papers between 2022 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
FlashInfer: Efficient and Customizable Attention Engine for LLM Inference Serving.
CoRR, January, 2025

Productively Deploying Emerging Models on Emerging Platforms: A Top-Down Approach for Testing and Debugging.
Proc. ACM Softw. Eng., 2025

MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Relax: Composable Abstractions for End-to-End Dynamic Machine Learning.
Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2025

2024
WebLLM: A High-Performance In-Browser LLM Inference Engine.
CoRR, 2024

A System for Microserving of LLMs.
CoRR, 2024

XGrammar: Flexible and Efficient Structured Generation Engine for Large Language Models.
CoRR, 2024

Emerging Platforms Meet Emerging LLMs: A Year-Long Journey of Top-Down Development.
CoRR, 2024

2023
SparseTIR: Composable Abstractions for Sparse Compilation in Deep Learning.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

TensorIR: An Abstraction for Automatic Tensorized Program Optimization.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

2022
Tensor Program Optimization with Probabilistic Programs.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

