Xinhao Cheng

Orcid: 0009-0006-4391-041X

According to our database¹, Xinhao Cheng authored at least 13 papers between 2014 and 2026.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2026

Event Tensor: A Unified Abstraction for Compiling Dynamic Megakernel.

CoRR, April, 2026

Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems.

ACM Comput. Surv., January, 2026

Multi-State Reliability Modeling and Analysis for More Electric Aircraft Electrical Power System Considering State Transition Uncertainty.

IEEE Trans. Reliab., 2026

FlexLLM: Token-Level Co-Serving of LLM Inference and Finetuning with SLO Guarantees.

Proceedings of the 23rd USENIX Symposium on Networked Systems Design and Implementation, 2026

AdaServe: Accelerating Multi-SLO LLM Serving with SLO-Customized Speculative Decoding.

Proceedings of the 21st European Conference on Computer Systems, 2026

2025

Mirage Persistent Kernel: A Compiler and Runtime for Mega-Kernelizing Tensor Programs.

CoRR, December, 2025

Mirage: A Multi-Level Superoptimizer for Tensor Programs.

Proceedings of the 19th USENIX Symposium on Operating Systems Design and Implementation, 2025

2024

SpecInfer: Accelerating Generative Large Language Model Serving with Speculative Inference and Token Tree Verification.

Xinhao Cheng

PhD thesis, 2024

A Multi-Level Superoptimizer for Tensor Programs.

CoRR, 2024

FlexLLM: A System for Co-Serving Large Language Model Inference and Parameter-Efficient Finetuning.

CoRR, 2024

SpecInfer: Accelerating Large Language Model Serving with Tree-based Speculative Inference and Verification.

Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

2023

SpecInfer: Accelerating Generative LLM Serving with Speculative Inference and Token Tree Verification.

CoRR, 2023

2014

Green Traffic Compression in Wireless Sensor Networks.

Proceedings of the IEEE 79th Vehicular Technology Conference, 2014
