Xinhao Cheng
ORCID: 0009-0009-3375-497X
According to our database, Xinhao Cheng authored at least 8 papers between 2014 and 2025.
Bibliography
2025
Proceedings of the 19th USENIX Symposium on Operating Systems Design and Implementation, 2025
2024
SpecInfer: Accelerating Generative Large Language Model Serving with Speculative Inference and Token Tree Verification.
PhD thesis, 2024
FlexLLM: A System for Co-Serving Large Language Model Inference and Parameter-Efficient Finetuning.
CoRR, 2024
SpecInfer: Accelerating Large Language Model Serving with Tree-based Speculative Inference and Verification.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
2023
Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems.
CoRR, 2023
SpecInfer: Accelerating Generative LLM Serving with Speculative Inference and Token Tree Verification.
CoRR, 2023
2014
Proceedings of the IEEE 79th Vehicular Technology Conference, 2014