Kan Zhu
Orcid: 0009-0002-3462-3292
According to our database, Kan Zhu authored at least 15 papers between 2016 and 2026.
Bibliography
2026
Proceedings of the 31st ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2026
2025
Accelerating Large-Scale Reasoning Model Inference with Sparse Self-Speculative Decoding.
CoRR, December, 2025
TeleRAG: Efficient Retrieval-Augmented Generation Inference with Lookahead Retrieval.
CoRR, February, 2025
Tactic: Adaptive Sparse Attention with Clustering and Distribution Fitting for Long-Context LLMs.
CoRR, February, 2025
Proceedings of the 19th USENIX Symposium on Operating Systems Design and Implementation, 2025
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
From Optimal to Practical: Efficient Micro-op Cache Replacement Policies for Data Center Applications.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2025
2024
BlendServe: Optimizing Offline Inference for Auto-regressive Large Models with Resource-aware Batching.
CoRR, 2024
CoRR, 2024
Proceedings of the Seventh Annual Conference on Machine Learning and Systems, 2024
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Proceedings of the 16th ACM Workshop on Hot Topics in Storage and File Systems, 2024
2016