Ruihang Lai

ORCID: 0000-0001-6400-5079

According to our database, Ruihang Lai authored at least 16 papers between 2022 and 2026.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2026
PithTrain: A Compact and Agent-Native MoE Training System.
CoRR, May, 2026

Event Tensor: A Unified Abstraction for Compiling Dynamic Megakernel.
CoRR, April, 2026

Axe: A Simple Unified Layout Abstraction for Machine Learning Compilers.
CoRR, January, 2026

Gecko: An Efficient Neural Architecture Inherently Processing Sequences with Arbitrary Lengths.
CoRR, January, 2026

2025
Mirage Persistent Kernel: A Compiler and Runtime for Mega-Kernelizing Tensor Programs.
CoRR, December, 2025

Productively Deploying Emerging Models on Emerging Platforms: A Top-Down Approach for Testing and Debugging.
Proc. ACM Softw. Eng., 2025

XGrammar: Flexible and Efficient Structured Generation Engine for Large Language Models.
Proceedings of the Eighth Conference on Machine Learning and Systems, 2025

FlashInfer: Efficient and Customizable Attention Engine for LLM Inference Serving.
Proceedings of the Eighth Conference on Machine Learning and Systems, 2025

MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025

Relax: Composable Abstractions for End-to-End Dynamic Machine Learning.
Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2025

2024
WebLLM: A High-Performance In-Browser LLM Inference Engine.
CoRR, 2024

A System for Microserving of LLMs.
CoRR, 2024

Emerging Platforms Meet Emerging LLMs: A Year-Long Journey of Top-Down Development.
CoRR, 2024

2023
SparseTIR: Composable Abstractions for Sparse Compilation in Deep Learning.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

TensorIR: An Abstraction for Automatic Tensorized Program Optimization.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

2022
Tensor Program Optimization with Probabilistic Programs.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

